Schulz et al. (1989) compare the Rasch (RM) and Mantel-Haenszel (MH) procedures for detecting differential item functioning (DIF) (see also RMT 1989, 3:2, 51-53). The RM procedure, following Wright et al. (1976), was implemented with the computer programs MSCALE and LINK (Schulz 1984). The MH procedure (Holland and Thayer 1988) was implemented with the program MHDIP (Raju 1988).
Sensitivity to DIF: With small groups, MH was significantly less sensitive to DIF than RM. MH indicates significance with null-hypothesis chi-squares, but the observed variance of these chi-squares was less than modelled. Male/female DIF that was detected by both RM and MH when groups were N=1000 was lost by MH when the groups were randomly reduced to N=100, but was still detected by RM.
Reliability: Contrary to claims made for MH, empirical results show RM to be more reliable than MH when groups are small (N=100 to 200), and at least as reliable when groups are large (N>300).
Validity: When groups are comparable in achievement, RM and MH detect "the same thing". Since RM and MH DIF indices from the male/female contrast correlate at their statistical maximum, .99, one cannot explain the greater sensitivity and reliability of RM as due to the two methods detecting "something different".
DIF versus Between-Group Achievement Differences: DIF must not be confused with real group differences in achievement. To be acceptable, a DIF procedure must produce "no net DIF" over items. Three of the four MH variants fail this criterion. The MH variants differ in 1) whether the studied item is included or excluded from the total score used for matching, and 2) whether matching is fat or fine (fat: seven or fewer levels of total score; fine: all possible levels of total score). When contrast groups differ significantly in achievement, RM yields "no net DIF". The only MH variant that yields "no net DIF" is the one that includes the studied item in the total score and uses all levels of total score for matching. This is the MH variant most similar to RM. When contrast groups differ in achievement, the other three MH variants yield net DIF across items that is significantly different from zero (p<.001).
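To make the MH computation concrete, the following sketch (Python with illustrative names; it is not the MHDIP program used in the study) implements the continuity-corrected MH chi-square for one studied item, with the two design choices above exposed as options: whether the studied item enters the matching total score, and whether matching is fine (every score level its own stratum) or fat (scores pooled into at most seven strata).

```python
import numpy as np

def mh_dif_chi2(resp_ref, resp_foc, item, include_item=True, fine=True):
    """Continuity-corrected Mantel-Haenszel chi-square for one studied item.

    resp_ref, resp_foc : 0/1 response matrices (persons x items) for the
        reference and focal groups.
    include_item       : include the studied item in the matching total score.
    fine               : match on every total-score level ("fine");
                         otherwise pool scores into at most seven strata ("fat").
    """
    def totals(resp):
        t = resp.sum(axis=1)
        return t if include_item else t - resp[:, item]

    t_ref, t_foc = totals(resp_ref), totals(resp_foc)
    x_ref, x_foc = resp_ref[:, item], resp_foc[:, item]

    levels = np.union1d(t_ref, t_foc)
    strata = ([np.array([v]) for v in levels] if fine
              else np.array_split(levels, 7))

    num, var = 0.0, 0.0          # sum of A_k - E(A_k) and sum of Var(A_k)
    for strat in strata:
        in_ref, in_foc = np.isin(t_ref, strat), np.isin(t_foc, strat)
        n_r, n_f = in_ref.sum(), in_foc.sum()
        a = x_ref[in_ref].sum()               # reference-group correct, A_k
        m1 = a + x_foc[in_foc].sum()          # total correct at this level
        T = n_r + n_f
        if T < 2 or m1 == 0 or m1 == T:
            continue                          # stratum carries no information
        num += a - n_r * m1 / T
        var += n_r * n_f * m1 * (T - m1) / (T ** 2 * (T - 1))

    return (abs(num) - 0.5) ** 2 / var if var > 0 else float("nan")
```

The default options correspond to the variant the study found most similar to RM: the studied item included in the total score and all score levels used for matching.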
DIF versus Item-by-Achievement Interactions: DIF is intended to detect item-by-group interaction exclusively. When groups also differ in achievement, some DIF indices confound item-by-group and item-by-achievement interactions. The correlation between RM and MH DIF indices was at its theoretical maximum of .99 for equal achievement contrasts, but was substantially less (r=.81) than the theoretical maximum (.98) for unequal achievement contrasts. Under unequal achievement contrasts, RM and MH DIF procedures no longer detect "the same thing".
The differences between RM and MH DIF indices estimated from unequal achievement contrasts are systematically related to item-by-achievement interactions of the kind detected by RM item fit statistics. Highly discriminating (low infit) items are biased in favor of high achievers while poorly discriminating (high infit) items are biased in favor of low achievers. Thus RM DIF indices correlate positively with RM infit statistics (r=.32), but, inexplicably, MH DIF indices correlate negatively (r=-.32).
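The infit statistic referred to here is the information-weighted mean-square of Rasch residuals. A minimal sketch, assuming person measures and item difficulties have already been estimated (the function and variable names are illustrative, not MSCALE output):

```python
import numpy as np

def infit_meansquare(x, theta, b):
    """Per-item infit (information-weighted) mean-square.

    x     : 0/1 response matrix (persons x items)
    theta : person measures in logits
    b     : item difficulties in logits

    Values below 1 flag over-discriminating (low-infit) items; values
    above 1 flag under-discriminating (high-infit) items.
    """
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))  # Rasch P(correct)
    w = p * (1.0 - p)                                          # model variance
    return ((x - p) ** 2).sum(axis=0) / w.sum(axis=0)
```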
Recommendations: When contrast groups differ in achievement, construct achievement-matched samples of the largest possible size. When contrast groups are achievement-matched, RM item Z-scores:
Z12 = (b1 - b2) / sqrt(s1^2 + s2^2),
where b1 and b2 are the item's difficulty estimates in the two groups and s1 and s2 are their standard errors, are more sensitive to DIF than MH chi-squares and at least as reliable. A practical advantage of RM is that it measures DIF in the same units as person achievement.
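A minimal sketch of this Z-score computation, assuming item difficulties and standard errors have already been obtained from separate calibrations of the two groups (e.g., from a program such as MSCALE) and placed on a common scale; the |Z| > 2 flagging rule is illustrative, not prescribed by the study:

```python
import numpy as np

def rasch_dif_z(b1, s1, b2, s2, flag=2.0):
    """Rasch DIF Z-scores from two separate item calibrations.

    b1, b2 : item difficulty estimates (logits) for groups 1 and 2,
             already placed on a common scale (e.g., by linking)
    s1, s2 : the corresponding standard errors
    Returns the Z-scores and a boolean array marking |Z| > flag.
    """
    b1, s1, b2, s2 = map(np.asarray, (b1, s1, b2, s2))
    z = (b1 - b2) / np.sqrt(s1 ** 2 + s2 ** 2)
    return z, np.abs(z) > flag
```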
Holland PW & Thayer DT 1988 Differential item performance and the Mantel-Haenszel procedure. In H Wainer & H Braun (Eds.), Test Validity. Hillsdale NJ: Lawrence Erlbaum.
Raju NS 1988 MHDIP [computer program]. Chicago: Psychology Department, Illinois Institute of Technology.
Schulz EM 1984 LINK: A program for comparing paired Rasch estimates and linking tests. Chicago: MESA Press.
Schulz EM, Perlman CP, Rice WK, Wright BD 1989 Empirical comparison of Rasch and Mantel-Haenszel procedures. Paper presented at the annual meeting of the American Educational Research Association (AERA).
Wright BD, Mead RJ, Draba R 1976 Detecting and correcting test item bias with a logistic model. Chicago: MESA.
DIF detection: Rasch versus Mantel-Haenszel. E.M. Schulz. Rasch Measurement Transactions, 1990, 4:2, p. 107