Category Disordering (disordered categories) vs. Threshold Disordering (disordered thresholds)

Rasch rating scale structure parameters are also called Andrich thresholds, step calibrations or Tau's. They relate directly to category probabilities, that is, to the probability of a category being observed, not to the substantive order of achievement of the categories. So when step calibrations, i.e., Tau's, are disordered, they say that one category is less likely to be observed, not that it is easier to perform.

Here is an example that will produce disordered Tau's:

Around 100 people work in a building. Let us count the number of people in the building at 10 minute intervals over several days. The "items" are the times of day. The "people" are the days. Here is the rating scale:

Less than 100: category 1.
Exactly 100: category 2.
More than 100: category 3.

We will observe categories 1 and 3 far more often than category 2. As people arrive in the morning, the count will be in category 1; at peak times, in category 3; in the evening, in category 1 again. During a day we may never observe category 2. Of course, category 2 lies substantively between categories 1 and 3, but it is very difficult to observe. The Tau's will be "disordered".
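The building example can be sketched numerically. The head-count profiles below are hypothetical, and the log-ratio of adjacent category counts is used only as a crude, frequency-only stand-in for the Tau's (an actual Rasch estimation would also account for the person and item measures). Even so, it suggests how a rarely observed middle category pushes the first threshold above the second.

```python
import math
from collections import Counter

CAPACITY = 100  # the "around 100 people" in the example


def categorize(head_count):
    """Map a head count onto the 3-category rating scale."""
    if head_count < CAPACITY:
        return 1
    if head_count == CAPACITY:
        return 2
    return 3


def day_profile(step):
    """Hypothetical head counts at 10-minute intervals for one day.

    People arrive `step` at a time until 105 are present, stay for
    30 intervals, then leave at the same rate.
    """
    n_ramp = -(-105 // step)  # ceiling division
    ramp_up = [min(105, step * i) for i in range(n_ramp + 1)]
    return ramp_up + [105] * 30 + ramp_up[::-1]


# Five hypothetical days; only the days with step 5 ever hit exactly 100.
counts = Counter(
    categorize(n) for step in (5, 7, 7, 5, 7) for n in day_profile(step)
)

# Crude frequency-only proxies for the two thresholds (an assumption,
# not Rasch estimates): log-odds of the lower vs. the upper category.
tau_1 = math.log(counts[1] / counts[2])
tau_2 = math.log(counts[2] / counts[3])

print(dict(counts))
print(f"tau_1 = {tau_1:.2f}, tau_2 = {tau_2:.2f}, disordered = {tau_1 > tau_2}")
```

Category 2 still sits between categories 1 and 3 in meaning; only its rarity drives the proxy thresholds out of order.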

So, how do we detect when the categories are actually substantively incorrectly ordered? We use fit statistics. An illustrative example follows.

Category disordering occurs when the ordinal numbering of categories does not accord with their substantive meaning. Consider the 7 level FIM™ rating scale. Each level is substantively defined to represent a higher level of functioning. The ordinal numbering accords with this. But what would happen if the numbering of two categories was reversed? Then a higher category number could correspond to a lower level of functioning. The categories would be substantively disordered.

| FIM Level | Count | Average Measure | INFIT MNSQ | OUTFIT MNSQ | Step calibration (Rasch-Andrich threshold) |
|---|---|---|---|---|---|
| 1 | 96 | -2.80 | .98 | 1.02 | NONE |
| 2 | 88 | -2.04 | .75 | .80 | -2.22 |
| 3 | 101 | -1.02 | 1.07 | 1.03 | -1.70 |
| 4 | 168 | -.27 | 1.03 | 1.19 | -1.31 |
| 5 | 210 | .85 | 1.01 | .91 | .08 |
| 6 | 146 | 2.34 | .75 | .83 | 2.02 |
| 7 | 101 | 3.32 | .87 | .89 | 3.14 |

Table 1. Satisfactory Category Statistics
Average measures advance, Thresholds advance, MNSQs near 1.0

Here are the category summary statistics in Table 1 for some patient records with correctly coded FIM levels. Note that the "Average Measure" values advance with category. These indicate that, for this sample, higher patient performance corresponds to higher categories. The category mean-square fit statistics also do not markedly exceed their model values of 1.0. Figure 1 shows the modeled category probability curves. They depict the expected succession of "hills".

| FIM Level | Count | Average Measure | INFIT MNSQ | OUTFIT MNSQ | Step calibration (Rasch-Andrich threshold) |
|---|---|---|---|---|---|
| 1 (2) | 88 | -1.97 | 1.47 | 1.41 | NONE |
| 2 (1) | 96 | -2.18 | .54 | .69 | -2.08 |
| 3 | 101 | -.95 | 1.05 | 1.02 | -1.49 |
| 4 | 168 | -.25 | .91 | .99 | -1.24 |
| 5 | 210 | .80 | .97 | .87 | .08 |
| 6 | 146 | 2.14 | .66 | .75 | 1.87 |
| 7 | 101 | 3.02 | .83 | .86 | 2.86 |

Table 2. Category Disordering
Average measures disordered, MNSQs misfit > 1.0, but Thresholds advance

Now, suppose that due to a coding or data entry error, the numbering of levels 1 and 2 was reversed, introducing substantive category disordering. Table 2 shows the resultant category statistics. The observed category counts verify that categories 1 and 2 have been reversed. Now the "average measure" values for categories 1 and 2 are disordered, and category 1 is exhibiting large misfit. Counter-intuitively, the step calibrations are still ordered. The modeled category probability curves, shown in Figure 2, still depict a succession of "hills". This is because the measures, the Rasch model parameters, are always estimated on the basis that the data fit the model.

Substantive disordering of the categories is flagged by disordering in the "average measure" values and mean-square fit statistics much larger than 1.0 (indicating misfit), not disordering in the step calibrations nor in the shape of the probability curves. Of course, these statistics comment on the functioning of the rating scale for this sample. Whether substantive category disordering is due to a misspecification of the rating scale or to idiosyncrasies only found in the sample requires further investigation.
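This diagnostic rule can be sketched as a small check applied to the values in Tables 1 and 2. The function name and the mean-square cutoff of 1.3 are illustrative assumptions (the text only says "much larger than 1.0"), not an established criterion.

```python
def diagnose_categories(avg_measures, infit, outfit, mnsq_cutoff=1.3):
    """Flag substantive category disordering for categories 1..n.

    mnsq_cutoff is a hypothetical screening value, not a published rule.
    """
    # Adjacent category pairs whose average measures fail to advance.
    disordered = [
        (k, k + 1)
        for k in range(1, len(avg_measures))
        if avg_measures[k] <= avg_measures[k - 1]
    ]
    # Categories whose INFIT or OUTFIT mean-square is well above 1.0.
    misfitting = [
        k + 1
        for k, (i, o) in enumerate(zip(infit, outfit))
        if max(i, o) > mnsq_cutoff
    ]
    return disordered, misfitting


# Values copied from Table 1 (correct coding) and Table 2 (levels 1 and 2 swapped).
table_1 = diagnose_categories(
    [-2.80, -2.04, -1.02, -0.27, 0.85, 2.34, 3.32],
    [0.98, 0.75, 1.07, 1.03, 1.01, 0.75, 0.87],
    [1.02, 0.80, 1.03, 1.19, 0.91, 0.83, 0.89],
)
table_2 = diagnose_categories(
    [-1.97, -2.18, -0.95, -0.25, 0.80, 2.14, 3.02],
    [1.47, 0.54, 1.05, 0.91, 0.97, 0.66, 0.83],
    [1.41, 0.69, 1.02, 0.99, 0.87, 0.75, 0.86],
)
print("Table 1:", table_1)
print("Table 2:", table_2)
```

Note that the step calibrations play no part in this check, in keeping with the point that they do not flag substantive disordering.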

Step (Threshold) Disordering

The step calibrations or Rasch/Andrich thresholds correspond to the Rasch model parameters for the rating scale structure. Each step calibration parameterizes the relationship between a pair of adjacent categories. If, for a given item targeted directly at the person's ability level, a step calibration has a positive value, then the lower of the pair of categories is more likely to be observed. If the step calibration has a negative value, then the higher category of the pair is more likely to be observed.
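In symbols, this pairwise relationship can be written as follows. The notation is an assumption here: $B_n$ is the person's ability, $D_i$ the item's difficulty, $F_k$ the step calibration between categories $k-1$ and $k$, and $P_{nik}$ the probability of observing category $k$.

```latex
\ln\!\left(\frac{P_{nik}}{P_{ni(k-1)}}\right) = B_n - D_i - F_k
\qquad\Longrightarrow\qquad
\left.\frac{P_{nik}}{P_{ni(k-1)}}\right|_{B_n = D_i} = e^{-F_k}
```

When the item is targeted at the person ($B_n = D_i$), a positive $F_k$ makes the ratio less than 1, so the lower category is more probable; a negative $F_k$ makes it greater than 1, so the higher category is more probable.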

Rating scale categories, however, are not observed in pairs but in the entire set simultaneously. This complicates their interpretation. If the step calibrations become successively more positive as category number increases (as in the FIM examples), then the plot of the category probability curves depicts a "range of hills". Each category in turn is most probable to be observed, and the intersections of the modal categories correspond to the step calibrations.

If the step calibrations do not increase monotonically with category number, i.e., are disordered, then one or more categories are never modal, and one or more "hill tops" are missing from the range of hills.

| FIM Level | Count | Average Measure | INFIT MNSQ | OUTFIT MNSQ | Step calibration (Rasch-Andrich threshold) |
|---|---|---|---|---|---|
| 1 | 96 | -2.81 | .90 | .96 | NONE |
| 2 | 44 | -1.96 | .88 | .92 | -1.49 |
| 3 | 101 | -1.03 | 1.02 | .98 | -2.33 |
| 4 | 168 | -.30 | 1.07 | 1.22 | -1.29 |
| 5 | 210 | .82 | .96 | .88 | .05 |
| 6 | 146 | 2.30 | .75 | .82 | 1.97 |
| 7 | 101 | 3.27 | .87 | .89 | 3.09 |

Table 3. Low Frequency in Category 2
Thresholds disordered, but average measures advance, MNSQs near 1.0

An Example of Step Disordering

To illustrate this, consider the FIM data presented above, but with every other observation of level 2 made missing. Table 3 shows the resulting category statistics. Compare these with Table 1. The count for level 2 is reduced by 50%. The step calibration from level 2 to 3, -2.33, is now less than that from level 1 to 2, -1.49, and so is disordered. As shown in Figure 3, category 2 is no longer modal. The cross-over between the curves for levels 2 and 3 (i.e., the step calibration) is to the left of that for levels 1 and 2. The crossover points are disordered. All other statistics, however, are almost identical. Step disordering has not introduced category disordering (as diagnosed by average measures) nor category misfit (as diagnosed by fit mean-squares).
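The loss of modality can be reproduced from the step calibrations alone. The sketch below assumes the standard Rasch rating scale model, with the measure expressed as person ability minus item difficulty; the function names and the search grid are illustrative choices.

```python
import math


def category_probabilities(measure, thresholds):
    """Rasch rating scale category probabilities at a given (ability - difficulty)."""
    # The log-numerator of each category adds (measure - threshold) to the previous one.
    log_num = [0.0]
    for tau in thresholds:
        log_num.append(log_num[-1] + measure - tau)
    peak = max(log_num)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]


def modal_categories(thresholds, lo=-6.0, hi=6.0, step=0.01):
    """Categories (numbered from 1) that are most probable somewhere on the grid."""
    modal = set()
    n_points = int(round((hi - lo) / step))
    for i in range(n_points + 1):
        probs = category_probabilities(lo + i * step, thresholds)
        modal.add(probs.index(max(probs)) + 1)
    return sorted(modal)


# Step calibrations copied from Table 1 (ordered) and Table 3 (disordered).
TABLE_1 = [-2.22, -1.70, -1.31, 0.08, 2.02, 3.14]
TABLE_3 = [-1.49, -2.33, -1.29, 0.05, 1.97, 3.09]

print("Table 1 modal categories:", modal_categories(TABLE_1))
print("Table 3 modal categories:", modal_categories(TABLE_3))
```

With the Table 3 values, category 2 would need a measure above -1.49 to beat category 1 and below -2.33 to beat category 3, which cannot both hold, so it never appears in the modal list.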

Step Calibrations and Modality

What is the relationship between step calibrations and modality? Consider a 3 category rating scale. In Figure 4 the steps are ordered. In Figure 5 the steps coincide, and the maximum probability of the central category is .33. In Figure 6 the steps are disordered. For 3 categories, the relationship between the two step calibrations, F1 and F2, and the maximum probability of the central category, as plotted in Figure 7, is given by the ogive:

$$P_{\max} = \frac{1}{1 + 2\exp\left(\frac{F_1 - F_2}{2}\right)}$$
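A minimal numerical sketch of this relationship, assuming the standard Rasch rating scale model with the measure expressed as ability minus difficulty: the central category's probability peaks midway between F1 and F2, where it equals 1 / (1 + 2 exp((F1 - F2) / 2)). The function names are illustrative.

```python
import math


def central_category_probability(measure, f1, f2):
    """P(central category) for a 3-category Rasch rating scale."""
    lower = 1.0
    central = math.exp(measure - f1)
    upper = math.exp(2 * measure - f1 - f2)
    return central / (lower + central + upper)


def max_central_probability(f1, f2):
    """Closed form: the peak lies midway between the two step calibrations."""
    return 1.0 / (1.0 + 2.0 * math.exp((f1 - f2) / 2.0))


# Ordered, coinciding and disordered steps (hypothetical values).
for f1, f2 in [(-1.0, 1.0), (0.0, 0.0), (1.0, -1.0)]:
    grid_max = max(
        central_category_probability(-6.0 + 0.01 * i, f1, f2) for i in range(1201)
    )
    print(f1, f2, round(grid_max, 3), round(max_central_probability(f1, f2), 3))
```

When the steps coincide the closed form gives 1/3, matching the .33 quoted for Figure 5; ordered steps push the peak above 1/3 and disordered steps push it below.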

Step Calibrations and the Latent Variable

From the perspective of cumulative probabilities, i.e., Thurstone thresholds as computed according to the Rasch model (Figure 8), the central category becomes narrower as the step calibrations become more disordered. Step disordering does not indicate that the category definitions are out of sequence, but rather that the category defines a narrow section of the variable. Empirically, disordered step calibrations may indicate that the category definition is too narrow, or that too many category options have been presented to respondents. Consequently, combining the narrow category with an adjacent category may simplify use of the rating scale or assist with communication of conclusions based on the scale.
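The narrowing can be illustrated with a short sketch that locates Thurstone thresholds as the measures at which the probability of being observed in category k or above equals 0.5. It assumes the standard Rasch rating scale model for 3 categories scored 0, 1, 2; the step values and function names are hypothetical.

```python
import math


def category_probabilities(measure, thresholds):
    """Rasch rating scale category probabilities at a given (ability - difficulty)."""
    log_num = [0.0]
    for tau in thresholds:
        log_num.append(log_num[-1] + measure - tau)
    peak = max(log_num)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]


def thurstone_threshold(k, thresholds, lo=-20.0, hi=20.0):
    """Measure at which P(X >= k) = 0.5, found by bisection (categories scored from 0)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if sum(category_probabilities(mid, thresholds)[k:]) < 0.5:
            lo = mid  # cumulative probability too low: move up the variable
        else:
            hi = mid
    return (lo + hi) / 2.0


# Ordered, coinciding and disordered steps (hypothetical values).
widths = []
for steps in ([-1.0, 1.0], [0.0, 0.0], [1.0, -1.0]):
    width = thurstone_threshold(2, steps) - thurstone_threshold(1, steps)
    widths.append(width)
    print(steps, round(width, 3))
```

The interval belonging to the central category shrinks as the steps become more disordered, but it stays positive: the Thurstone thresholds themselves remain in order.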

Step Disordering Increases Item Discrimination

Expected score ogives (the model item characteristic curves shown in Figure 9) are steeper with disordered steps. Step disordering therefore indicates an item that is highly discriminating over a limited region of the variable, but less informative in other regions. "High item discrimination" is thus not synonymous with "better functioning" or "more effective".
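The steepening can be checked with a short sketch that differentiates the expected score numerically. It assumes the standard Rasch rating scale model for 3 categories scored 0, 1, 2; the step values, the finite-difference width and the function names are illustrative.

```python
import math


def category_probabilities(measure, thresholds):
    """Rasch rating scale category probabilities at a given (ability - difficulty)."""
    log_num = [0.0]
    for tau in thresholds:
        log_num.append(log_num[-1] + measure - tau)
    peak = max(log_num)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(measure, thresholds):
    """Model expected score, with categories scored 0, 1, 2, ..."""
    probs = category_probabilities(measure, thresholds)
    return sum(k * p for k, p in enumerate(probs))


def ogive_slope(measure, thresholds, h=1e-5):
    """Central-difference slope of the expected score ogive."""
    upper = expected_score(measure + h, thresholds)
    lower = expected_score(measure - h, thresholds)
    return (upper - lower) / (2 * h)


ORDERED, COINCIDING, DISORDERED = [-1.0, 1.0], [0.0, 0.0], [1.0, -1.0]

# Slope at the center of the item, then slope 3 logits above it.
center_slopes = [ogive_slope(0.0, s) for s in (ORDERED, COINCIDING, DISORDERED)]
far_slopes = [ogive_slope(3.0, s) for s in (ORDERED, DISORDERED)]
print("center:", [round(s, 3) for s in center_slopes])
print("3 logits away:", [round(s, 3) for s in far_slopes])
```

At the center of the item the slope rises as the steps become disordered, while 3 logits away the disordered item has the flatter ogive, which is the trade-off described above.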

John M. Linacre


Category Disordering (disordered categories) vs. Threshold Disordering (disordered thresholds). Linacre, J.M. … Rasch Measurement Transactions, 1999, 13:1 p. 675





 

The URL of this page is www.rasch.org/rmt/rmt131a.htm

Website: www.rasch.org/rmt/contents.htm