Dichotomous Infit and Outfit Mean-Square Fit Statistics

Georg Rasch suggests chi-square fit statistics to control the applicability of data to his model (Rasch 1980 p. 25). The chi-squares in common use are known as OUTFIT and INFIT. These are reported as mean-squares, that is, chi-square statistics divided by their degrees of freedom, so that they have a ratio-scale form with expectation 1 and range 0 to +infinity. They are also reported in various interval-scale forms in which their expected value is zero.

OUTFIT is based on the conventional sum of squared standardized residuals. Let X be an observation, E be its expected value based on Rasch parameter estimates and σ² be its modelled variance about its expectation. Then the squared standardized residual is

z² = (X-E)² / σ²

OUTFIT is Sum(z²)/N, where N is the number of observations summed.

INFIT is an information-weighted sum. The statistical information in a Rasch observation is its variance, σ². This is larger for targeted observations, and smaller for extreme observations, e.g., easy items administered to able persons. INFIT is Sum(z²σ²)/Sum(σ²) = Sum((X-E)²)/Sum(σ²), summed over the relevant observations.
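
The two formulas can be sketched in a few lines of Python. This is a minimal illustration, not the computation behind any published values: it assumes the dichotomous Rasch model, a person measure supplied by the caller rather than estimated, and a hypothetical set of 16 item difficulties spaced 0.4 logits apart and centred on 0. The function names are illustrative only.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of success under the dichotomous Rasch model (logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def mean_square_fit(responses, ability, difficulties):
    """Return (OUTFIT, INFIT) mean-squares for one string of 0/1 responses."""
    z2_sum = 0.0      # Sum(z²)
    resid2_sum = 0.0  # Sum((X-E)²)
    var_sum = 0.0     # Sum(σ²)
    for x, d in zip(responses, difficulties):
        e = rasch_probability(ability, d)  # expected value E
        var = e * (1.0 - e)                # modelled variance σ²
        resid2 = (x - e) ** 2              # squared residual (X-E)²
        z2_sum += resid2 / var
        resid2_sum += resid2
        var_sum += var
    outfit = z2_sum / len(responses)       # Sum(z²)/N
    infit = resid2_sum / var_sum           # Sum((X-E)²)/Sum(σ²)
    return outfit, infit

# Assumed layout: 16 items, 0.4 logits apart, centred on 0 logits.
difficulties = [-3.0 + 0.4 * i for i in range(16)]
guttman = [int(c) for c in "1111111100000000"]
miscode = [int(c) for c in "0000000011111111"]

# Person measure fixed at 0 logits (an assumption, not an estimate).
print(mean_square_fit(guttman, 0.0, difficulties))
print(mean_square_fit(miscode, 0.0, difficulties))
```

With these assumptions the Guttman string gives both mean-squares below 1.0 and the reversed string gives both well above 1.0, with OUTFIT the larger. Because the person measure and item difficulties are fixed rather than estimated from data, the numbers need not agree with values reported elsewhere.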

Fit statistics are formulated to test particular hypotheses. OUTFIT is dominated by unexpected outlying, off-target, low information responses and so is outlier-sensitive. INFIT is dominated by unexpected inlying patterns among informative, on-target observations and so is inlier-sensitive.
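
The difference in sensitivity can be seen from the size of a single observation's contribution. The following is a short worked derivation for the dichotomous case; the symbols B (person measure) and D (item difficulty), both in logits, are introduced here for illustration and are not part of the notation above.

```latex
% Dichotomous Rasch model: expectation and modelled variance
E = \frac{e^{B-D}}{1+e^{B-D}}, \qquad \sigma^2 = E(1-E)

% Contribution of one observation to OUTFIT (unweighted)
z^2 = \frac{(X-E)^2}{\sigma^2} =
\begin{cases}
  e^{\,|B-D|}  & \text{if the response is the less probable outcome} \\
  e^{-|B-D|} & \text{if the response is the more probable outcome}
\end{cases}

% Contribution of the same observation to the INFIT numerator (weighted)
z^2 \sigma^2 = (X-E)^2 \le 1
```

For an unexpected response to an item 3 logits off-target, z² = e³ ≈ 20.1, so one such response can dominate OUTFIT, while its information-weighted contribution (X-E)² ≈ 0.91 is bounded and cannot dominate INFIT.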

Person Responses                                        OUTFIT       INFIT        Point-Measure  S.E.
Easy -- Items -- Hard   Diagnosis Pattern               Mean-square  Mean-square  Correlation    Inflator
111¦0110110100¦000      Modelled/Ideal                  1.0          1.1          0.62           1.0
111¦1111100000¦000      Guttman/Deterministic           0.3          0.5          0.87           1.0
000¦0000011111¦111      Miscode                         12.6         4.3          -0.87          3.5
011¦1111110000¦000      Carelessness/Sleeping/Slipping  3.8          1.0          0.65           1.9
111¦1111000000¦001      Lucky Guessing                  3.8          1.0          0.65           1.9
101¦0101010101¦010      Response set/Miskey             4.0          2.3          0.11           2.0
111¦1000011110¦000      Special knowledge               0.9          1.3          0.43           1.1
111¦1010110010¦000      Imputed outliers *              0.6          1.0          0.62           >1.0*
111¦0101010101¦000      Low discrimination              1.5          1.6          0.46           1.3
111¦1110101000¦000      High discrimination             0.5          0.7          0.79           1.0
111¦1111010000¦000      Very high discrimination        0.3          0.5          0.84           1.0
Right¦Transition¦Wrong

Sensitivity across the three zones:
                                                                            OUTFIT Mean-square                 INFIT Mean-square
high - low - high   OUTFIT sensitive to outlying observations               >>1.0 unexpected outliers          >>1.0 disturbed pattern
low - high - low    INFIT sensitive to pattern of inlying observations      <<1.0 overly predictable outliers  <<1.0 Guttman pattern
* As when a tailored test is filled out by imputing all "right" responses to easier items and all "wrong" responses to harder items. Increase the S.E. based on the number of observed responses.
The exact details of these computations have been lost, but the items appear to be uniformly distributed about 0.4 logits apart.

The Table shows typical dichotomous patterns. (For polytomies, see www.rasch.org/rmt/rmt103a.htm.) The S.E. inflator is a multiplier which can be used to increase the imprecision due to modelled observation error to allow for the added uncertainty due to misfit. This inflator is the square-root of the maximum value of INFIT mean-square, OUTFIT mean-square and 1.0. Infit and Outfit mean-squares less than 1.0 do not increase the standard errors, but suggest that the latent variable is locally compressed for the item or person.
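
The inflator rule stated above is simple enough to write down directly. The sketch below assumes nothing beyond that rule; the function names and the idea of passing in a modelled standard error are illustrative choices, not part of any particular program.

```python
import math

def se_inflator(outfit_mnsq, infit_mnsq):
    """Square-root of the largest of OUTFIT mean-square, INFIT mean-square and 1.0."""
    return math.sqrt(max(outfit_mnsq, infit_mnsq, 1.0))

def inflated_se(model_se, outfit_mnsq, infit_mnsq):
    """Modelled standard error multiplied by the misfit inflator."""
    return model_se * se_inflator(outfit_mnsq, infit_mnsq)

# Mean-squares taken from the Miscode and Guttman rows of the Table.
print(round(se_inflator(12.6, 4.3), 1))  # 3.5
print(round(se_inflator(0.3, 0.5), 1))   # 1.0
```

Mean-squares below 1.0 leave the standard error unchanged because 1.0 is always one of the values entering the maximum.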

The "¦" in the tabled response patterns indicates a threshold from the zone in which OUTFIT is more sensitive to the zone in which INFIT is more sensitive. The > indicates the relevant diagnostic mean-square fit value for this range of item difficulties. In the outlying, OUTFIT zones, we expect nearly all successes or nearly all failures. In the transition, INFIT zone, we expect a mixture of success and failure. Departures from these expectations are flagged by the indicated fit statistics. Fit values noticeably above 1.0 indicate excessive unmodelled noise. Fit values noticeably below 1.0 indicate a local deficit in the stochastic variation necessary for useful measurement. What is noticeable depends on the nature of the data. Fit values in well-controlled data, e.g., MCQ responses, are more central than those for free-form responses and clinical observations. What is acceptable depends on what produces useful measurement in context.

Why is a Guttman response pattern, flagged by low INFIT and OUTFIT statistics, problematic? Why isn't it the ideal? A fundamental requirement for useful measurement is that it be test-free and sample-free, so that data sets that "differ materially in some relevant respects" (Rasch 1980 p. 9) produce statistically equivalent results. An obvious relevant difference is that between a hard test and an easy test. But when a Guttman pattern is split in two, it produces an easy test on which the subject performed infinitely well, and a hard test on which the same subject performed infinitely badly. This implicit contradiction exists within every Guttman pattern and so increases our uncertainty in the reported measure. Is the sharp transition really a precise indicator of the subject's measure, or is it caused by a time limit? response style? curriculum effect? scanning error? illness?
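
The split can be illustrated with a small sketch. It uses the log-odds of the raw score only as a rough stand-in for a Rasch estimate (an assumption made for brevity); the point it shows, that all-right and all-wrong strings have no finite measure, also holds for maximum-likelihood Rasch estimates. The function name is illustrative.

```python
import math

def raw_score_logit(responses):
    """Log-odds of the raw score; unbounded for all-right or all-wrong strings."""
    right = sum(responses)
    wrong = len(responses) - right
    if wrong == 0:
        return math.inf    # perfect score: "infinitely well"
    if right == 0:
        return -math.inf   # zero score: "infinitely badly"
    return math.log(right / wrong)

guttman = [int(c) for c in "1111111100000000"]
easy_half, hard_half = guttman[:8], guttman[8:]

print(raw_score_logit(guttman))    # 0.0
print(raw_score_logit(easy_half))  # inf
print(raw_score_logit(hard_half))  # -inf
```

The whole string yields a central value, yet its two halves yield results that could not be further apart, which is the contradiction described above.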

A useful rule of thumb when investigating fit is to start with extreme high OUTFIT and INFIT values, and work down towards more central values, stopping when diagnosis no longer prompts remedial action nor provokes further thought about the nature of the subjects or the test questions. Edit the data as necessary, e.g., put to one side subjects with obvious "response sets" until the final reporting run. Then reestimate and examine extreme low OUTFIT and INFIT values. Elimination of high misfit values will make most low misfit values less extreme. Low fit values provide less motivation for data editing than do high values, unless obvious duplication is found, e.g., a repeated question or a double-scanned response form. Low fit values do not disturb the meaning of a measure. They merely reduce precision.

ni=16
item1=1
name1=1
codes=01
ptbiserial = measure
iafile=*
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
11 11
12 12
13 13
14 14
15 15
16 16
*
uascale = 2.4
&end
END LABELS
1110110110100000 Modelled/Ideal 1.0 1.1 1.0
1111111100000000 Guttman/Deterministic 0.3 0.5 1.8
0000000011111111 Miscode 12.6 4.3 3.5
0111111110000000 Carelessness/Sleeping 3.8 1.0 1.9
1111111000000001 Lucky Guessing 3.8 1.0 1.9
1010101010101010 Response set/Miskey 4.0 2.3 2.0
1111000011110000 Special knowledge 0.9 1.3 1.1
1111010110010000 Imputed outliers * 0.6 1.0 1.3
1110101010101000 low discrimination
1111110101000000 high discrimination
1111111010000000 very high discrimination

(Dichotomous Mean-square) Chi-square fit statistics. Linacre JM, Wright BD. … Rasch Measurement Transactions, 1994, 8:2 p.360






 

The URL of this page is www.rasch.org/rmt/rmt82a.htm
