Rasch Estimation Methods, Statistical Independence and Global Fit

Edited excerpts from the LTEST-L e-mail forum:

John de Jong (language test developer and long-time Rasch practitioner):

No global model-data fit statistics are reported by the Facets Rasch analysis program for judge-awarded ratings, and there can't be, because

1) Facets uses unconditional (UCON) estimation, also known as joint maximum likelihood estimation (JMLE);

2) the data violates the assumption of independence, i.e., the different scores given by different raters are not independent: they are given to the same students on the same tasks.

So, though the Facets program may be nice to tinker with, no generalizations from results can be made.

The problem of consistent estimation when data is missing has been partially solved by Cees Glas and applied in the OPLM computer program (RMT 6:4 p.253). OPLM uses Conditional Maximum Likelihood Estimation (CMLE) to estimate item parameters across large numbers of linked subtests. These item parameter estimates are then used to estimate person measures.

John M. Linacre (author of Facets):

Question: Why doesn't Facets report a global summary statistic for the fit of judge-awarded data to the Rasch measurement model?

Answer: It does when you tell it to! Simply define an extra facet, beyond students, tasks and raters. This facet contains only one element which is specified as a component of all data points. The Infit and Outfit statistics for this element are global mean-square fit statistics (chi-squares divided by degrees of freedom) with z-score-equivalent significance levels. Since this may seem awkward, I'll overcome my revulsion to global fit tests and let Facets report a chi-square test of the null hypothesis that all data fit the model (as the BIGSTEPS Rasch analysis program does). I predict, in advance, that data will never fit - because empirical data never fit a theoretical ideal!
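As an illustration of what such a single-element facet summarizes, here is a minimal Python sketch, not Facets' own code. It assumes dichotomous ratings and already-estimated measures in logits, and it uses the number of observations as the degrees of freedom, ignoring the parameters estimated. The function names and the (x, student, task, rater) tuple layout are hypothetical.

```python
import math

def success_probability(student, task, rater):
    """Dichotomous many-facet Rasch model: logit = student - task - rater."""
    return 1.0 / (1.0 + math.exp(-(student - task - rater)))

def global_mean_squares(observations):
    """Pool every data point into one Infit and one Outfit mean-square.

    observations: iterable of (x, student, task, rater), where x is 0 or 1
    and the other three values are measures in logits (assumed known here).
    """
    squared_residuals = 0.0      # sum of (observed - expected)^2
    model_variances = 0.0        # sum of p * (1 - p)
    squared_standardized = 0.0   # sum of squared standardized residuals
    count = 0
    for x, student, task, rater in observations:
        p = success_probability(student, task, rater)
        variance = p * (1.0 - p)
        squared_residuals += (x - p) ** 2
        model_variances += variance
        squared_standardized += (x - p) ** 2 / variance
        count += 1
    infit = squared_residuals / model_variances   # information-weighted
    outfit = squared_standardized / count         # unweighted average
    return infit, outfit
```

Both values have an expectation near 1.0 when the data accord with the model. Because every observation contributes to the same two numbers, a localized pocket of misfit is diluted by the rest of the data, which is the concern about global fit raised in the next paragraph.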

Why am I repulsed by global fit tests - the neutron bombs of statistical practice? Because misfit is never global and never a statistical event, it is always local and idiosyncratic. Rejecting a dataset or a measurement model on global fit is equivalent to refusing to eat because most things are inedible. In practice, you evaluate what you eat one mouthful at a time, checking as you eat for local fit to your model for food edibility. Otherwise you starve and your species becomes extinct!

There is a crucial difference between measurement models and the descriptive regression models beloved by statisticians:

Descriptive models summarize a particular set of assumed-linear numbers as succinctly as possible. Any model will do. Global fit is a convenient way of choosing the opportunistically "best" model or rejecting this or that "bad" model. Global fit is the only criterion.

Rasch measurement models are different in intent and practice. They extract, from a set of ordered observations, linear measures as generalizable as statistically possible. Only a specific family of models can do this. There is no influence of global (or even local) fit on model choice. The influence of fit is on data choice - its selection, reorganization, reconception.

John objects to fit statistics based on measures obtained with the joint (unconditional) maximum likelihood estimation algorithm (UCON). Like all estimation techniques, UCON has its strengths and weaknesses. Its strengths include efficient and versatile handling of systematic, random or unintentional missing data, and the ability to estimate measures from large, complex many-facet data sets incorporating diverse observation models. Its weaknesses are a minor degree of statistical inconsistency under artificial conditions, and statistical bias with some very small data sets and certain idiosyncratic data configurations.

How troublesome are these weaknesses? Statistical consistency is the property that, when applied to an infinite amount of data, the estimation algorithm will give a "right" answer. In fact, as implemented in Facets, UCON is consistent (Haberman 1977), because persons, items, etc. are all conceptually unlimited!

Bias is the degree to which an estimate based on a finite set of data is misleading. All Rasch estimation methods are biased, i.e., they produce a "wrong" answer (RMT 3:1 p.47). UCON is more biased than conditional methods, but this bias is negligibly small and always less than the standard errors of the estimated measures for real many-facet data sets of any useful size. Further, the bias can be easily corrected or eliminated for many simple situations, e.g., paired comparisons.
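A textbook case of this bias and its correction is the two-item dichotomous test, which has the same structure as a paired comparison. The following Python sketch is an illustration under stated assumptions, not the Facets algorithm. Only persons with one success and one failure are informative. The counts n10 (success on item 1 only) and n01 (success on item 2 only) and all function names are hypothetical. The multiplier (L-1)/L for a test of L items is the commonly cited UCON correction, and in this two-item case it appears to remove the bias exactly.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def cmle_difference(n10, n01):
    """Conditional estimate of d2 - d1, given a raw score of 1 on two items."""
    return math.log(n10 / n01)

def ucon_difference(n10, n01, iterations=50):
    """Joint (UCON) estimate of d2 - d1, found by Newton-Raphson.

    Items are centered (d1 = -d, d2 = +d). By symmetry, every person with a
    raw score of 1 then has an ability estimate of 0, so only d is iterated.
    """
    n = n10 + n01
    d = 0.0
    for _ in range(iterations):
        p = logistic(d)                      # modeled success on item 1
        d += (n10 - n * p) / (n * p * (1.0 - p))
    return 2.0 * d

def corrected_ucon_difference(n10, n01, n_items=2):
    """Apply the (L - 1) / L correction to the UCON estimate."""
    return ucon_difference(n10, n01) * (n_items - 1) / n_items
```

For example, with n10 = 60 and n01 = 40 the conditional estimate is log(1.5), the uncorrected UCON estimate is twice that, and the corrected UCON estimate matches the conditional one.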

The crucial question here is: Are fit statistics based on UCON estimates useful for quality control? We know that all estimated fit statistics reported by any computer program are "wrong". Empirical data and the estimates derived from them are never exactly in accord with the statistical theory underlying any fit statistic. But 25 years of practical experience with the UCON algorithm provide convincing evidence that UCON-based fit statistics are helpful and trustworthy.

Since it produces reasonable and useful estimates, the UCON algorithm is currently employed in Facets. I watch for statistically better, faster converging, more robust and less computationally intensive estimation algorithms. Each year the Facets estimation algorithm improves. Cees Glas's work on two-facet (person-item) estimation is remarkable and I hope he soon turns his attention to helping "many-facet" practitioners.

John also asserts that the lack of statistical independence in many-facet data invalidates any fit statistics. I agree that, when two raters rate the same performance, their ratings are not statistically independent in general. But, for independently produced ratings by skilled and perceptive expert judges, the requirement for successful operation of a Rasch measurement model is not unconditional independence, but conditional or local independence. In practice this means: Do all the ratings awarded by judges to students on tasks have about the same amount of statistical independence from each other across judges, across students and across tasks? If so, their interdependence is used to estimate measures, and their relative independence is summarized by fit statistics. Thousands of analyses have been performed by scores of researchers that confirm the utility of Rasch measures and fit statistics derived from judge-awarded ratings.
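The distinction between unconditional and local independence can be sketched with a small simulation. This is a hypothetical illustration only: it assumes dichotomous pass/fail ratings, two raters with made-up severities, normally distributed student abilities, and residuals computed from the generating parameters rather than from estimates. Raw ratings from the two raters correlate because they share the same students, but the residuals left after conditioning on the measures are close to uncorrelated.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def simulate_two_raters(n_students=20000, severities=(-0.5, 0.5), seed=1):
    """Return (raw rating correlation, residual correlation) for two raters.

    Each rater rates every student once. Given the student's ability, the
    two ratings are generated independently (local independence holds).
    """
    rng = random.Random(seed)
    ratings = ([], [])
    residuals = ([], [])
    for _ in range(n_students):
        ability = rng.gauss(0.0, 1.5)        # assumed ability distribution
        for j, severity in enumerate(severities):
            p = 1.0 / (1.0 + math.exp(-(ability - severity)))
            x = 1 if rng.random() < p else 0
            ratings[j].append(x)
            residuals[j].append(x - p)       # observed minus expected
    return pearson(*ratings), pearson(*residuals)
```

Calling `simulate_two_raters()` returns a clearly positive correlation between the raw ratings and a residual correlation near zero. The first reflects the interdependence used to estimate measures, and the second is the local independence that fit statistics summarize.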

Facets isn't perfect, but it's good enough for practical work and far better than any existing operational alternative. I thank John de Jong for provoking me to thought, and Stuart Luppescu for bringing his remarks to my attention. I welcome feedback.

Haberman S. J. 1977. Maximum likelihood estimates in exponential response models. Annals of Statistics 5: 815-841

Rasch estimation methods, statistical independence and global fit. de Jong J, Linacre JM. … Rasch Measurement Transactions, 1993, 7:2 p.296-7



The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website, www.rasch.org.


The URL of this page is www.rasch.org/rmt/rmt72n.htm
