Estimating Rasch Measures for Extreme Scores

Extreme scores (zero and perfect scores) imply extreme, but indefinitely located, measures. Indefinite measures are awkward to report and difficult to use in further analyses, such as computing means and standard deviations. What can be done to give these measures definite values? Here are several approaches. They are all based on the Bayesian idea that we would not have administered the test to the person, or included the item on the test, unless we thought that the person or item was relevant. Consequently, an extreme score implies a measure only slightly out of the measurement range of the test, not a measure a considerable distance away.

I. The extreme score is only barely extreme.

Raw scores are observed on an ordinal scale. Fractional raw scores are unobservable. Consequently, any measure that yields an expected raw score closer than 0.5 score points to an extreme score is expected to be observed as producing an extreme score. The most central measure for a zero score is therefore that corresponding to 0.5 score points, and for a perfect score is that corresponding to a perfect score less 0.5 score points. After the measures for non-extreme items and persons have been estimated in the usual way, the measures corresponding to these almost-extreme raw scores can be estimated (RMT 10:2 p. 499). Other commonly used extreme-score adjustments are 1/3 and 1/4 of a score point.
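The adjustment can be sketched in a few lines of Python. This is a minimal illustration, not the WINSTEPS implementation: it assumes dichotomous items whose difficulties are already estimated (in logits), and the function names and the bisection search are choices made here for illustration only.

```python
import math

def expected_score(theta, difficulties):
    """Expected raw score on dichotomous Rasch items at measure theta (logits)."""
    return sum(1.0 / (1.0 + math.exp(d - theta)) for d in difficulties)

def measure_for_score(target, difficulties, lo=-20.0, hi=20.0, tol=1e-10):
    """Bisection search for the measure whose expected raw score equals target.

    The search range of -20 to +20 logits is an assumption of this sketch.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def extreme_measures(difficulties, adjustment=0.5):
    """Measures assigned to the zero score and the perfect score."""
    n_items = len(difficulties)
    zero = measure_for_score(adjustment, difficulties)
    perfect = measure_for_score(n_items - adjustment, difficulties)
    return zero, perfect
```

For ten items that all have difficulty 0 logits, `extreme_measures([0.0] * 10)` gives roughly -2.94 and +2.94 logits, which is ln(0.5/9.5) and its mirror image. Passing `adjustment=1/3` or `adjustment=1/4` gives the other adjustments mentioned above.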

II. The extreme measure is only barely extreme.

From raw score R, a measure $M_R$ and its standard error $\mathrm{SE}_R$ can be estimated. The measure for score R+1 is approximately $M_R + \mathrm{SE}_R^2$ (see Wright & Stone, BTD, 1979, 192-5). Thus the measure for an extreme score can be estimated from the measure for a score 1 point less extreme [see Table]. If S is the perfect score, then $M_S \approx M_{S-1} + \mathrm{SE}_{S-1}^2$.
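As a worked illustration, consider the simplest case, which is an assumption of this sketch rather than part of the text: L dichotomous items that all have difficulty 0 logits. Then the measure for raw score R is ln(R/(L-R)) and its squared standard error is L/(R(L-R)), so the extrapolation step from score L-1 to the perfect score is L/(L-1). The function names are hypothetical.

```python
import math

def measure_and_se(raw_score, n_items):
    """Measure and standard error for a non-extreme raw score.

    Assumes n_items dichotomous items, all with difficulty 0 logits.
    """
    measure = math.log(raw_score / (n_items - raw_score))
    se = math.sqrt(n_items / (raw_score * (n_items - raw_score)))
    return measure, se

def extrapolate_perfect(n_items):
    """Approach II: measure for the perfect score from the score one point lower."""
    measure, se = measure_and_se(n_items - 1, n_items)
    return measure + se ** 2
```

For L = 10 the step is 10/9, about 1.11 logits, which agrees with the "LMD lower bound" row of the Table.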

III. The extreme measure is only barely significantly different.

Only measures statistically significantly more extreme than non-extreme measures would provoke separate consideration. Thus a measure $M_S = M_{S-1} + 1.65 \cdot \mathrm{SE}_{S-1}$ is the most central that would cause the rejection of the hypothesis, at the .05 level, that $M_S$ and $M_{S-1}$ are statistically equivalent.
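A minimal sketch of this rule follows. The function name and the `perfect` flag are hypothetical; the flag simply mirrors the offset for a zero score, which is extrapolated downward from the measure for a score of 1.

```python
def barely_significant_measure(neighbor_measure, neighbor_se, z=1.65, perfect=True):
    """Approach III: offset the neighboring measure by z standard errors.

    neighbor_measure and neighbor_se belong to the score one point less extreme.
    """
    offset = z * neighbor_se
    return neighbor_measure + offset if perfect else neighbor_measure - offset
```

For example, a neighboring measure of 2.0 logits with a standard error of 1.0 gives 3.65 logits for the perfect score.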

IV. The extreme measure aligns smoothly with non-extreme measures.

This can be achieved by curve-fitting. For instance, a quadratic fit of $M_S$ to $M_{S-1}$, $M_{S-2}$ and $M_{S-3}$ yields $M_S = 3 M_{S-1} - 3 M_{S-2} + M_{S-3}$.
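The quadratic formula is the statement that the third difference of four equally spaced points on a parabola is zero. A small sketch, with a hypothetical function name, makes that checkable:

```python
def quadratic_extrapolate(m1, m2, m3):
    """Approach IV: extend a quadratic through three neighboring measures.

    m1, m2, m3 are the measures for scores S-1, S-2 and S-3, nearest first.
    """
    return 3.0 * m1 - 3.0 * m2 + m3
```

Any exact quadratic sequence is reproduced: from the values 9, 4 and 1 (the squares of 3, 2 and 1) the function returns 16.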

V. The extreme response string is only barely modal.

The likelihood of each possible response string for a particular measure can be computed as the product of the probabilities of its responses, $L_R = \prod P_{nix}$, where $\sum x = R$, the raw score corresponding to that response string. If $L_0 > 0.5$ for a measure, then that measure will probably produce a response string with a raw score of zero. If $L_0 < 0.5$, then a non-zero score will probably be observed. The most central measure likely to produce an extreme score is the one for which $L_0 = 0.5$.
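For a zero score on dichotomous items, the criterion can be solved numerically. The sketch below assumes known item difficulties in logits; the function names, the bisection search and its range are illustrative choices, not part of the original description.

```python
import math

def zero_score_likelihood(theta, difficulties):
    """Probability that every dichotomous item is failed at measure theta."""
    likelihood = 1.0
    for d in difficulties:
        likelihood *= 1.0 / (1.0 + math.exp(theta - d))  # P(x = 0) for this item
    return likelihood

def barely_modal_zero_measure(difficulties, lo=-20.0, hi=20.0, tol=1e-10):
    """Approach V: the measure at which the zero-score likelihood equals 0.5."""
    # The likelihood of a zero score falls as theta rises, so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if zero_score_likelihood(mid, difficulties) > 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

With a single item the result is the item's own difficulty, where failure and success are equally likely. The perfect-score case is the mirror image, using the probability of success on every item.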

VI. Data augmentation with non-extreme responses.

The belief in test relevancy can be expressed in terms of additional artificial responses (Jannarone et al., 1990). For instance, two further responses could be added to every person and item response string: a "1" and a "0". Then no response string can be extreme. If the additional responses are arranged to alternate "01" and "10", then the additional artificial persons and items will have close to 50% success rates, and so will have minimal impact on the measurement system. Once the set of measures has been estimated, the measures can be anchored. Then the augmented data can be dropped, allowing standard errors and fit statistics to be computed from the observed data. If the prior belief is twice as strong, then 4 items can be added. If the strength of belief corresponds to a fraction of an item, then the artificial items can be weighted accordingly.
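The augmentation step might look like the following sketch. It assumes a complete, non-empty matrix of dichotomous responses with persons as rows and items as columns; the function name and the exact layout of the artificial responses are illustrative choices. Estimation, anchoring and the later removal of the artificial responses are not shown.

```python
def augment(data):
    """Add two artificial items and two artificial persons to a 0/1 data matrix.

    data is a list of person rows, each a list of 0/1 item responses.
    """
    augmented = []
    for n, row in enumerate(data):
        # Alternate "01" and "10" so each artificial item is passed about half the time.
        extra = [0, 1] if n % 2 == 0 else [1, 0]
        augmented.append(list(row) + extra)
    n_columns = len(augmented[0])
    # Two artificial persons with complementary alternating responses.
    augmented.append([j % 2 for j in range(n_columns)])
    augmented.append([1 - j % 2 for j in range(n_columns)])
    return augmented
```

Every real person now has one success and one failure on the artificial items, and every item has one success and one failure from the artificial persons, so no row or column of the augmented matrix is extreme.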

VII. The underlying distribution is specified.

If the underlying distribution of, say, persons is specified to be normal (or any other distribution), then measures can be imputed for extreme scores that result in the best fit to that distribution. These measures are constrained to be more extreme than the measures estimated from similar non-extreme response strings.

VIII. Posterior distribution = Prior distribution.

The distribution of the measures estimated from the data is intended to coincide with the distribution of the measures that generated it. This can be used to refine the measure estimates for extreme scores.

After extreme measures are estimated using one of the methods above, the means and standard deviations of the item and person measure distributions are computed. Then data are simulated using the entire set of measures (extreme and non-extreme). From these data, a new set of measures is estimated for non-extreme and extreme scores. The means and S.D.s of these new measures are computed and compared to their previous values. The previous "extreme" measures are adjusted, and new means and S.D.s are computed, so as to make the two distributions as similar as possible. Further data are simulated from the revised measures and the distributions are again compared. The extreme measures are again adjusted to make the distributions coincide. This iterative process continues until no more adjustments are necessary or there is no improvement in distribution coincidence.
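Only the simulation step of this cycle is sketched here, assuming dichotomous items; re-estimating the measures and adjusting the extreme ones depend on the estimation software and are not shown. The function name and the `seed` parameter are hypothetical.

```python
import math
import random

def simulate_responses(person_measures, item_measures, seed=None):
    """Simulate dichotomous Rasch responses from person and item measures (logits)."""
    rng = random.Random(seed)
    data = []
    for b in person_measures:
        row = []
        for d in item_measures:
            p_success = 1.0 / (1.0 + math.exp(d - b))
            row.append(1 if rng.random() < p_success else 0)
        data.append(row)
    return data
```

Fixing `seed` makes a simulated data set reproducible, which helps when comparing the distributions from one iteration to the next.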

"Least Measurable Distance" Extrapolations for
Extreme Score Measures in Logits

                                                    Number of dichotomous items
                                                    or polytomous steps, L
Approach                                              10      25      50     100

I. Extreme Score Adjustment
   R=1/2                  (2L-1)/(L-1)               0.75    0.71    0.70    0.70
   R=1/3                  (3L-1)/(L-1)               1.17    1.13    1.11    1.11
   R=1/4                  (4L-1)/(L-1)               1.57    1.51    1.48    1.48

II. Measure Extrapolation
   LMD lower bound        L/(L-1)                    1.11    1.04    1.02    1.01
   Test width 2 logits    C_f^w/L, w=2, f=(L-1)/L    1.16    1.04    1.02    1.01
   Test width 4 logits    C_f^w/L, w=4               1.22    1.08    1.04    1.01
   Test width 6 logits    C_f^w/L, w=6               1.37    1.12    1.07    1.04
   Test width 8 logits    C_f^w/L, w=8               1.44    1.17    1.10    1.04

Which to choose?

Most of the difference between these approaches is hair-splitting [see Table], but questions to be addressed include:

(a) Are the items dichotomous, polytomous or mixed?
(b) Is the test fixed length or adaptive?
(c) Are there missing data?
(d) What is known about the underlying distributions?
(e) What computational resources are available?
(f) Are the computed extreme measures reasonable?

Choose an extrapolation approach that provides consistently reasonable measures for your data and is easy to explain. Approach I has proved robust and flexible for small samples with missing data and is implemented in WINSTEPS.

A Rule of Thumb

Measures corresponding to extreme scores 0 and L should be no closer to their next integer neighbors 1 and L-1 than the least measurable distance, LMD, between integer neighbors estimated at 1 and L-1. According to Best Test Design (Wright & Stone, 1979, pp. 132, 135, 192-198, 214), when R=1 or L-1,

$\mathrm{LMD} = C_f^w / L > L / [R(L-R)] > L / (L-1)$

From the Table, reasonable values are generally in the range

$1.0 \le M_S - M_{S-1} \le 1.2$

A rule of thumb follows:

No extreme score extrapolation can be less than one logit. Extrapolations >1.2 logits require convincing justification.
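The rule of thumb is easy to apply as an automatic check on computed extreme measures. The sketch below uses a hypothetical function name and return labels; the 1.0 and 1.2 logit limits are those stated in the rule.

```python
def check_extrapolation(extreme_measure, neighbor_measure, lower=1.0, upper=1.2):
    """Classify the distance between an extreme measure and its integer neighbor."""
    distance = abs(extreme_measure - neighbor_measure)
    if distance < lower:
        return "too small"
    if distance > upper:
        return "needs justification"
    return "reasonable"
```

The same call serves for both ends of the scale, because only the size of the distance is examined.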

Standard Errors of Extreme Measures

Extreme measures have indefinite standard errors, but the following provide useful values:

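(1) $\mathrm{SE}_S > \mathrm{SE}_{S-1}$

(2) $\mathrm{SE}_S \approx \mathrm{SE}_{S-1} + \mathrm{SE}_{S-1}^2 / 2$

(3) $\mathrm{SE}_S \approx 1 / \sqrt{\text{Variance of raw score } S \mid M_S}$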
[This is implemented in WINSTEPS]
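Options (2) and (3) can be sketched as follows, assuming dichotomous items with known difficulties in logits. Option (3) is computed here as the usual Rasch model standard error, the reciprocal of the square root of the model variance of the raw score at the assigned extreme measure. The function names are hypothetical and this is not the WINSTEPS code.

```python
import math

def se_from_neighbor(neighbor_se):
    """Option (2): standard error extrapolated from the score one point less extreme."""
    return neighbor_se + neighbor_se ** 2 / 2.0

def se_from_variance(extreme_measure, difficulties):
    """Option (3): 1 / sqrt(model variance of the raw score at the extreme measure)."""
    variance = 0.0
    for d in difficulties:
        p = 1.0 / (1.0 + math.exp(d - extreme_measure))
        variance += p * (1.0 - p)
    return 1.0 / math.sqrt(variance)
```

For ten items of difficulty 0 logits and a perfect-score measure of ln(19), about 2.94 logits, the raw-score variance is 10 x 0.95 x 0.05 = 0.475, so option (3) gives about 1.45 logits. This exceeds the standard error of about 1.05 logits for a score of 9, as option (1) requires.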

Benjamin D. Wright

Jannarone R.J., Yu K.F., Laughlin J.E. (1990) Easy Bayes estimation for Rasch-type models. Psychometrika 55, 3, 449-460.

Estimating Rasch measures for extreme scores. Wright B.D. … Rasch Measurement Transactions, 1998, 12:2 p. 632-3.




The URL of this page is www.rasch.org/rmt/rmt122h.htm
