Resolving the attenuation paradox

"In general, there appears to be no reason that measures of reliability and validity should be monotone increasing functions of each other" (Sitgreaves, 1961)

Up to a point, reliability and validity increase together, but beyond that point any further increase in reliability decreases validity. This is the attenuation paradox (RMT 6(4) p. 257, RMT 7(2) p. 294). The attenuation paradox appears most clearly in the context of item selection and test construction. In practice, the problem is how to select those items that will simultaneously increase both the reliability and validity of the total test scores.
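The paradox can be illustrated with a small simulation. The sketch below is not taken from the cited papers; it assumes equally difficult dichotomous items that all share one discrimination value, a standard normal latent trait, Cronbach's alpha (equivalent to KR-20 for dichotomous items) as the reliability index, and the correlation between total score and the simulated trait as a stand-in for validity. All function and variable names are hypothetical.

```python
import numpy as np

def simulate_responses(discrimination, n_items=20, n_persons=20000, seed=0):
    """Simulate dichotomous responses to equally difficult items (assumed setup)."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n_persons)            # latent trait
    logits = discrimination * theta[:, None]          # every item has difficulty 0
    prob = 1.0 / (1.0 + np.exp(-logits))              # logistic response function
    responses = (rng.random((n_persons, n_items)) < prob).astype(int)
    return theta, responses

def cronbach_alpha(responses):
    """Cronbach's alpha for a persons-by-items response matrix."""
    k = responses.shape[1]
    item_var_sum = responses.var(axis=0, ddof=1).sum()
    total_var = responses.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var_sum / total_var)

results = []
for a in (0.5, 2.0, 50.0):                            # low, moderate, near-deterministic
    theta, responses = simulate_responses(a)
    alpha = cronbach_alpha(responses)
    # "Validity" here is only a proxy: correlation of total score with the trait.
    validity = np.corrcoef(theta, responses.sum(axis=1))[0, 1]
    results.append((a, alpha, validity))
    print(f"discrimination={a:5.1f}  alpha={alpha:.3f}  validity={validity:.3f}")
```

Under these assumptions, alpha keeps rising as discrimination grows, while the correlation with the trait peaks at the moderate value and then falls: with near-deterministic items of equal difficulty, the total score does little more than split persons into two groups.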

From the perspective of Rasch measurement, there is a simple solution to the attenuation paradox. Useful invariant measurement requires items to have similar discrimination and stochasticity, but different difficulties. The elimination of both low and high discriminating items (Andrich, 1988) maximizes validity, while optimizing reliability.

Since classical test theory (CTT) focuses primarily on total scores, there are no unambiguous guidelines for accomplishing this goal. Even making the item, rather than the total score, the unit of analysis does not resolve the attenuation paradox, because the elimination of highly discriminating items goes against "conventional wisdom" for many psychometricians trained in the CTT tradition. "Discarding the most as well as the least discriminating items also goes against one's instincts in test construction" (Cliff, 1989, p. 77). Within CTT, the higher the discrimination index, the better the item (Ebel, 1979). Consequently, a more palatable solution for scientists trained in the CTT tradition is to try to control variation in item discrimination by including another item parameter in the model.

The inclusion of an item-discrimination parameter in Birnbaum's two-parameter model reflects the historical influence of the CTT tradition on modern IRT: despite the attenuation paradox, ideas from CTT still shape measurement practice. The two-parameter IRT model attempts a statistical adjustment of test scores to account for variability in item discrimination, and this adjustment is thought to resolve the paradox. But the price for maintaining a commitment to an antiquated concept of item quality is that the two-parameter model produces ordinal scales rather than interval measures (Cliff, 1989). Nevertheless, many "modern" psychometricians still refuse to accept the implications of the attenuation paradox for modern measurement theory and practice.
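For reference, the two models can be written side by side. The notation is an assumption of this sketch rather than the article's own: person n has ability \theta_n, item i has difficulty b_i and, in the two-parameter model, discrimination a_i; x_{ni} is the scored response to one of L items.

```latex
% Rasch model: one item parameter (difficulty)
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}

% Two-parameter model: adds an item discrimination a_i
P(X_{ni} = 1 \mid \theta_n, a_i, b_i) = \frac{\exp\bigl(a_i(\theta_n - b_i)\bigr)}{1 + \exp\bigl(a_i(\theta_n - b_i)\bigr)}

% Sufficient statistics for \theta_n
r_n = \sum_{i=1}^{L} x_{ni} \quad \text{(Rasch)}, \qquad
r_n^{*} = \sum_{i=1}^{L} a_i \, x_{ni} \quad \text{(two-parameter, } a_i \text{ known)}
```

The weighted score r_n^{*} is the "statistical adjustment of test scores": each response is weighted by its item's discrimination. When all a_i are equal, the two-parameter model reduces to the Rasch model up to a change in the unit of the scale.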

Rasch measurement, on the other hand, sets out clear guidelines for test construction that lead to the elimination of items with extreme discrimination parameters. This resolves the attenuation paradox, and provides the opportunity to obtain interval scales by bringing the data into conformity with the Rasch model. In practice, however, test constructors should not eliminate items without further thought. We should explore why some items are more or less discriminating. Masters (1988) presents a compelling case for viewing item discrimination as a type of item bias that may be the result of individual differences related to opportunity to learn, opportunity to answer, and test-wiseness.
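As a rough illustration of flagging items for review rather than deleting them automatically, the sketch below marks items whose discrimination lies far from that of the other items. It is not a Rasch fit analysis: it uses the item-rest correlation as a crude proxy for discrimination, and the band width of 0.12 around the median is an arbitrary, hypothetical threshold. The simulated data and all names are assumptions made for the example.

```python
import numpy as np

def item_rest_correlations(responses):
    """Correlation of each item with the total score of the remaining items."""
    responses = np.asarray(responses, dtype=float)
    total = responses.sum(axis=1)
    corrs = []
    for j in range(responses.shape[1]):
        rest = total - responses[:, j]                # exclude the item itself
        corrs.append(np.corrcoef(responses[:, j], rest)[0, 1])
    return np.array(corrs)

def flag_extreme_items(responses, band=0.12):
    """Flag items whose item-rest correlation is far below or above the median.

    `band` is a hypothetical tolerance chosen for illustration only.
    """
    r = item_rest_correlations(responses)
    center = np.median(r)
    low = np.where(r < center - band)[0]              # under-discriminating
    high = np.where(r > center + band)[0]             # over-discriminating
    return r, low, high

# Simulated data: item 0 discriminates weakly, item 9 very sharply,
# and items 1-8 share the same discrimination but differ in difficulty.
rng = np.random.default_rng(1)
theta = rng.standard_normal(5000)
a = np.array([0.2] + [1.0] * 8 + [4.0])
b = np.concatenate(([0.0], np.linspace(-1.0, 1.0, 8), [0.0]))
prob = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))
responses = (rng.random(prob.shape) < prob).astype(int)

r, low, high = flag_extreme_items(responses)
print("item-rest correlations:", np.round(r, 2))
print("flag for review (low):", low, " (high):", high)
```

Flagged items are candidates for the kind of follow-up described above, such as asking whether the unusual discrimination reflects opportunity to learn, opportunity to answer, or test-wiseness.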

Professor George Engelhard, Jr.
Emory University
Division of Educational Studies
Atlanta, GA 30322

Andrich, D. (1988, April). A scientific revolution in social measurement. Paper presented at the annual meeting of the American Educational Research Association, New Orleans.

Cliff, N. (1989). Ordinal consistency and ordinal true scores. Psychometrika, 54(1), 75-91.

Ebel, R.L. (1979). Essentials of educational measurement. Englewood Cliffs, NJ: Prentice-Hall.

Masters, G.N. (1988). Item discrimination: when more is worse. Journal of Educational Measurement, 25(1), 15-29.

Sitgreaves, R. (1961). A statistical formulation of the attenuation paradox in test theory. In H. Solomon (Ed.), Studies in item analysis and prediction (pp. 17-28). Stanford, CA: Stanford University Press.

Resolving the attenuation paradox. Engelhard G Jr. … Rasch Measurement Transactions, 1994, 8:3 p.379


The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website, www.rasch.org.
