Evaluation of Validity of Computer Based Test Items in NOUN

Authors

  • Mrs Charity Akuadi Okonkwo (Ph.D.), National Open University of Nigeria

Keywords:

Difficulty index, discrimination index, distractor efficiency, multiple choice items, non-functional distractor, validity of test scores

Abstract

Multiple Choice Items (MCI) are one of the most commonly used Computer Based Assessment (CBA) instruments for assessing students in educational settings, especially in Open and Distance Learning (ODL) with large class sizes. The MCI making up an assessment instrument need to be examined for quality, which depends on their Difficulty Index (DIF I), Discrimination Index (DI) and Distractor Efficiency (DE), if they are to contribute meaningfully to the validity of students’ examination scores. Such quality characteristics are amenable to examination through item analysis. Hence, the objective of this study was to evaluate the quality of MCI used for CBA as formative assessment measures in the National Open University of Nigeria (NOUN), employing an ex post facto research design. One foundation course in the School of Education of the University was used for the study. The aim was to develop a pool of valid items by assessing the items’ DIF I, DI and DE, and to store, revise or discard items based on the results obtained. In this cross-sectional study, 240 MCI administered in four (4) sets of CBA per semester per course in the 2012–2014 academic years were analysed. The data were entered and analysed in MS Excel 2007. The results indicated items of “good to excellent” DIF I and “good to excellent” DI, efficient distractors (DE) and non-functional distractors (NFD). Items with poor DI were also identified. This study emphasized the selection of quality MCI that truly assess students’ levels of learning and correctly differentiate students of different abilities in NOUN, thereby contributing to improving the validity of the test items.

Résumé: Multiple Choice Items (MCI) are one of the most commonly used Computer Based Assessment (CBA) instruments for assessing students in educational settings, particularly in Open and Distance Learning (ODL) with large class sizes. The MCI that make up the assessment instruments must be examined for quality, which depends on their Difficulty Index (DIF I), Discrimination Index (DI) and Distractor Efficiency (DE), if they are to contribute meaningfully to the validity of students’ examination scores. These quality characteristics can be examined through item analysis. Therefore, the objective of this study was to evaluate the quality of the MCI used for CBA at the National Open University of Nigeria (NOUN) as formative assessment measures, employing an ex post facto research design. One foundation course in the University’s School of Education was used for the study. The aim was to develop a pool of valid items by assessing the items’ DIF I, DI and DE, and to store, revise or discard items according to the results obtained. In this cross-sectional study, 240 MCI taken in four (4) sets of CBA per semester per course in the 2012–2014 academic years were analysed. The data were entered and analysed in MS Excel 2007. The results indicated items of “good to excellent” DIF I and “good to excellent” DI, efficient distractors (DE) and non-functional distractors (NFD). Items with poor DI were also identified. This study emphasized the selection of quality MCI that truly assess students’ levels of learning and correctly differentiate students of different abilities at NOUN, thereby contributing to improving the validity of the test items.
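For readers unfamiliar with the three indices, the sketch below shows how they are conventionally computed under classical test theory. It is an illustration only, not the study’s Excel analysis: the data layout, option labels, the 27% upper/lower grouping and the 5% non-functional-distractor cutoff are common item-analysis conventions assumed here, not details taken from the paper.

# Conventional classical-test-theory item analysis for one multiple-choice item.
# Assumptions (not from the study): responses come as (chosen_option, total_score)
# pairs, options are labelled "A"-"D", the discrimination groups are the top and
# bottom 27% of examinees, and a distractor is non-functional if chosen by <5%.

def item_analysis(responses, key, options=("A", "B", "C", "D"), nfd_cutoff=0.05):
    n = len(responses)

    # Difficulty Index (DIF I): proportion of examinees answering correctly.
    dif_i = sum(1 for opt, _ in responses if opt == key) / n

    # Discrimination Index (DI): proportion correct in the upper group minus
    # proportion correct in the lower group, ranked by total test score.
    ranked = sorted(responses, key=lambda r: r[1], reverse=True)
    k = max(1, round(0.27 * n))
    upper, lower = ranked[:k], ranked[-k:]
    di = (sum(1 for opt, _ in upper if opt == key) / k
          - sum(1 for opt, _ in lower if opt == key) / k)

    # Distractor Efficiency (DE): share of distractors that are functional,
    # i.e. not chosen by fewer than nfd_cutoff of the examinees (NFD).
    distractors = [o for o in options if o != key]
    nfd = sum(1 for d in distractors
              if sum(1 for opt, _ in responses if opt == d) / n < nfd_cutoff)
    de = (len(distractors) - nfd) / len(distractors)

    return {"DIF_I": dif_i, "DI": di, "NFD": nfd, "DE": de}

# Hypothetical item answered by ten candidates; the key is "B".
example = [("B", 78), ("B", 74), ("A", 70), ("B", 66), ("C", 61),
           ("B", 55), ("A", 50), ("C", 47), ("D", 40), ("A", 35)]
print(item_analysis(example, key="B"))

For this hypothetical item the function reports DIF I = 0.40, DI ≈ 0.67, no non-functional distractor and DE = 1.0; common rules of thumb would class such an item as of acceptable difficulty with good discrimination and fully functional distractors.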


Published

2018-01-08

How to Cite

Okonkwo, C. A. (2018). Evaluation of Validity of Computer Based Test Items in NOUN. West African Journal of Open and Flexible Learning, 6(2), 147–171. Retrieved from https://wajofel.org/index.php/wajofel/article/view/18

Issue

Vol. 6, No. 2 (2018)
Section

Research Articles