Composite undergraduate clinical examinations: how should the components be combined to maximize reliability?
- Authors
- Wass, Val; McGibbon, David
- Year of publication
- 2009
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Background: Clinical examinations increasingly consist of composite tests to assess all aspects of the curriculum recommended by the General Medical Council. Setting: A final undergraduate medical school examination for 214 students. Aim: To estimate the overall reliability of a composite examination, the correlations between the tests, and the effect of differences in test length, number of items and weighting of the results on the reliability. Method: The examination consisted of four written and two clinical tests: a multiple-choice question (MCQ) test, extended matching questions (EMQ), short-answer questions (SAQ), essays, an objective structured clinical examination (OSCE) and history-taking long cases. Multivariate generalizability theory was used to estimate the composite reliability of the examination and the effects of item weighting and test length. Results: The composite reliability of the examination was 0.77 if all tests contributed equally. Correlations between examination components varied, suggesting that different theoretically interpretable parameters of competence were being tested. Weighting tests according to items per test or total test time gave improved reliabilities of 0.93 and 0.81, respectively. Double weighting of the clinical component marginally affected the reliability (0.76). Conclusion: This composite final examination achieved an overall reliability sufficient for high-stakes decisions on student clinical competence. However, examination structure must be carefully planned and results combined with caution. Weighting according to number of items or test length significantly affected reliability. The components testing different aspects of knowledge and clinical skills must be carefully balanced to ensure both content validity and parity between items and test length.
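The weighting effect described in the abstract can be illustrated with a minimal sketch. It does not reproduce the authors' multivariate generalizability analysis. As a stand-in, it uses the simpler classical test theory formula for the reliability of a weighted composite (Mosier's formula, which assumes uncorrelated measurement errors across components) together with the Spearman-Brown formula for a change in test length. The function names and every numeric input below are hypothetical and are not data from the study.

```python
def composite_reliability(weights, sds, reliabilities, corr):
    """Reliability of a weighted composite (Mosier's formula).

    weights       -- weight applied to each component score
    sds           -- observed standard deviation of each component
    reliabilities -- reliability coefficient of each component
    corr          -- observed correlation matrix between components
    Assumes measurement errors are uncorrelated across components.
    """
    n = len(weights)
    scaled = [w * s for w, s in zip(weights, sds)]
    # Variance of the weighted composite score.
    composite_var = sum(
        scaled[i] * scaled[j] * corr[i][j] for i in range(n) for j in range(n)
    )
    # Error variance contributed by each weighted component.
    error_var = sum(scaled[i] ** 2 * (1.0 - reliabilities[i]) for i in range(n))
    return 1.0 - error_var / composite_var


def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor."""
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


# Hypothetical three-component examination (illustrative values only).
sds = [1.0, 1.0, 1.0]
reliabilities = [0.8, 0.6, 0.5]
corr = [
    [1.0, 0.4, 0.4],
    [0.4, 1.0, 0.4],
    [0.4, 0.4, 1.0],
]

equal = composite_reliability([1, 1, 1], sds, reliabilities, corr)
# Double weighting of the least reliable component.
doubled = composite_reliability([1, 1, 2], sds, reliabilities, corr)

print(f"equal weights:          {equal:.3f}")
print(f"third component double: {doubled:.3f}")
print(f"component 3 at 2x length: {spearman_brown(0.5, 2):.3f}")
```

In this toy setting, giving more weight to a less reliable component lowers the composite reliability, while lengthening that component raises its own reliability. That is the kind of trade-off between weighting and test length that the abstract reports.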
Selection of previously published papers that were worked on in the workshops of the Jornadas de Educación Médica
Sociedad de Educación Medica de La Plata
- Subject
- Medical sciences; Education; Medical examinations; Teaching program
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc/2.5/ar/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/8525
Full record metadata
id |
SEDICI_b4d0534ca8a59e45d091ae6bfbaa2ba0 |
oai_identifier_str |
oai:sedici.unlp.edu.ar:10915/8525 |
network_acronym_str |
SEDICI |
repository_id_str |
1329 |
network_name_str |
SEDICI (UNLP) |
dc.title.none.fl_str_mv |
Composite undergraduate clinical examinations: how should the components be combined to maximize reliability? |
dc.creator.none.fl_str_mv |
Wass, Val McGibbon, David |
author |
Wass, Val |
author_facet |
Wass, Val McGibbon, David |
author_role |
author |
author2 |
McGibbon, David |
author2_role |
author |
dc.subject.none.fl_str_mv |
Medical sciences; Education; Medical examinations; Teaching program |
publishDate |
2009 |
dc.date.none.fl_str_mv |
2009 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion Articulo http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://sedici.unlp.edu.ar/handle/10915/8525 |
url |
http://sedici.unlp.edu.ar/handle/10915/8525 |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/url/http://www.semlp.org/wp-content/uploads/2010/01/n2-trabajos-ya-publicados-2.pdf |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by-nc/2.5/ar/ Creative Commons Attribution-NonCommercial 2.5 Argentina (CC BY-NC 2.5) |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
http://creativecommons.org/licenses/by-nc/2.5/ar/ Creative Commons Attribution-NonCommercial 2.5 Argentina (CC BY-NC 2.5) |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:SEDICI (UNLP) instname:Universidad Nacional de La Plata instacron:UNLP |
reponame_str |
SEDICI (UNLP) |
collection |
SEDICI (UNLP) |
instname_str |
Universidad Nacional de La Plata |
instacron_str |
UNLP |
institution |
UNLP |
repository.name.fl_str_mv |
SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv |
alira@sedici.unlp.edu.ar |