## How to Assign Item Weights: Item Replication or Rating Scales?

Recommendation: If the additional weight is intended to indicate a higher level of performance, then use a rating scale.
If the additional weight is intended to indicate replications of the same level of performance, then use item weighting.
If the additional weight is merely to make the scores look nicer, then use linear rescaling of the measurement units.

Examples: a dichotomous item is scored 0-4 instead of 0-1:
1. Score levels 1,2,3 exist conceptually, but are not observed in these data. Analyze 0-4 as a rating scale or partial credit item. (In Winsteps, STKEEP=Yes, IWEIGHT=1)
2. 0-4 is specified because this item is considered to be 4 times as important as a 0-1 item. Analyze as 0-1 but give the item a weight of 4 or 4 replications in the data. (In Winsteps, STKEEP=No, IWEIGHT=4)
3. 0-4 is specified because there are 25 items and we want the raw score range to be 0-100. Analyze as 0-1 but report the raw scores as 0-4. (In Winsteps, STKEEP=No, IWEIGHT=1)
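
As a rough illustration of cases 2 and 3, the Python sketch below shows the score arithmetic only. It is not Winsteps code and performs no Rasch estimation; the function names and the example responses are hypothetical.

```python
def recode_to_dichotomy(score, top=4):
    """Map a score recorded as 0 or `top` onto 0 or 1 for analysis."""
    if score not in (0, top):
        raise ValueError(f"expected 0 or {top}, got {score}")
    return 1 if score == top else 0


def weighted_raw_score(responses, weights):
    """Raw score in which each 0-1 response counts `weight` times."""
    return sum(w * x for x, w in zip(responses, weights))


# Case 3: 25 items recorded as 0 or 4 so that the raw score range is 0-100.
# Hypothetical examinee with 18 items correct.
recorded = [4] * 18 + [0] * 7
analyzed = [recode_to_dichotomy(s) for s in recorded]  # analyzed as 0-1
reported = 4 * sum(analyzed)                           # reported on 0-100

# Case 2: the last item is considered 4 times as important as the others.
responses = [1, 0, 1, 1]
weights = [1, 1, 1, 4]
case2_score = weighted_raw_score(responses, weights)

print(reported, case2_score)
```

In case 1 no recoding is done: the 0-4 scores are kept as they are and the unobserved levels 1, 2, 3 are retained as categories of the rating scale.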

In general, each observation is expected to be an independent and equal witness to examinee ability. The scientific motivation for this expectation is comparable to the motivations for random sampling and randomization. The introduction of arbitrary emphases, such as item weights, degrades the inferential stability of results and biases conclusions in an unreproducible way.

In the political world of examinations, however, some observations are decreed more important than others. For instance, if a pass-fail decision is to be made on the composite outcome of a 100-item MCQ test and one essay graded from 0 to 10, then the examination board may decide to assign the essay rating a weight 10 times heavier in order to give the essay and the MCQ items supposedly "equal" weight in the final decision.

Should you fall victim to such a decree, there are several ways the weights can be implemented with Rasch computer programs. Since each method has its drawbacks, initial data screening and quality control should proceed as though no weights existed. Once the measurement process has been validated, the following assignment methods may help:

1. The essay ratings and the MCQ items are analyzed separately, yielding two ability measures for each examinee. If there is insufficient overlap among the essay ratings, then additional constraints are required, such as modelling the ratings as binomial trials, and asserting that each grader is equally severe in order for a coherent set of essay measures to be produced. For the pass-fail decision, a weighted sum of the pairs of ability measures is used; the precise formula will be complicated by the different logit ranges of the two variables. The way to see what to do is to plot MCQ vs. Essay measures, and then to draw on this plot the line that best asserts the conjoint judgment of the standard setting committee. This method is the most comprehensible.

2. Each essay rating is entered 10 times (or each essay rating is given a weight of 10), and then the MCQ items and the essay ratings are analyzed together. This diminishes local independence among the observations but avoids the complication of two measurement scales. The replicated data will make the reported standard errors too small. In this example, they should be inflated about 75%. The 10 essay difficulties will be reported at about the same location on the variable as the one original essay difficulty.

3. Use explicit item weights, e.g., using IWEIGHT= in Winsteps, but adjust the item weights to maintain approximately correct standard errors and score range. The original score range is 0-110. The essay is to be upweighted 10 times. Applied directly, this would give a score range of 0-200 (100 MCQ points plus 10 x 10 essay points). So, to keep the meaningful score range, the weights need to be adjusted by 110/200 = .55. Each MCQ item is therefore weighted .55, and the essay item is weighted 5.50. This method is operationally the simplest.

4. Each essay rating is multiplied by 10, and then the rescaled 0-100 essay ratings are analyzed with the MCQ items. Since only every 10th category of the 0-100 essay rating scale is observed, the analysis must allow for structurally present, but empirically absent, categories (Wilson RMT 5:1 p. 128). Again, standard errors will need to be inflated about 75% due to the effect of the fictitious categories. Only one essay difficulty will be reported, but it will not be at the same location on the variable as the 0-10 essay would have been. By convention, the difficulty of a rating scale item is chosen so that the sum of the step difficulties is zero, i.e., at the location on the variable where the highest and lowest possible ratings on the item are equally probable. If the difficulty of the 0-10 essay item is D logits from the center of the person ability distribution, the difficulty of the 0-100 essay item will be much closer to the mean ability, only about D/10 logits away. This makes the construct harder to understand, and can be confusing if the assigned weights are changed.
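
The data handling behind the four methods can be sketched in Python as follows. This is only an illustration of the arithmetic described above, under stated assumptions: it does no Rasch estimation, all function names are hypothetical, the composite weights and cut point in method 1 are made-up values standing in for the committee's line, and the 75% inflation is simply the figure quoted in the text for this example.

```python
def passes_composite(mcq_measure, essay_measure, w_mcq, w_essay, cut):
    """Method 1: pass if the weighted sum of the two logit measures
    reaches the cut (the line drawn on the MCQ vs. Essay plot)."""
    return w_mcq * mcq_measure + w_essay * essay_measure >= cut


def replicate_essay(mcq_responses, essay_rating, copies=10):
    """Method 2: one data record with the essay rating entered `copies` times."""
    return list(mcq_responses) + [essay_rating] * copies


def rescaled_weights(max_scores, nominal_weights):
    """Method 3: shrink the nominal weights so that the weighted score
    range equals the original (unweighted) score range."""
    original_range = sum(max_scores)
    weighted_range = sum(m * w for m, w in zip(max_scores, nominal_weights))
    factor = original_range / weighted_range
    return factor, [w * factor for w in nominal_weights]


def unobservable_categories(top=100, step=10):
    """Method 4: categories of the 0-100 scale that a rating multiplied
    by 10 can never occupy (structurally present, empirically absent)."""
    return [k for k in range(top + 1) if k % step != 0]


def inflate_se(reported_se, inflation=0.75):
    """Methods 2 and 4: enlarge a reported standard error by the quoted 75%."""
    return reported_se * (1 + inflation)


# Method 1, with hypothetical measures, weights and cut point.
decision = passes_composite(1.2, 0.4, w_mcq=0.5, w_essay=0.5, cut=0.6)

# Method 2: 100 MCQ responses plus the essay rating 7 entered 10 times.
record = replicate_essay([1] * 100, 7)

# Method 3: 100 MCQ items (maximum 1 each) and one essay (maximum 10),
# with the essay nominally upweighted 10 times.
max_scores = [1] * 100 + [10]
nominal = [1.0] * 100 + [10.0]
factor, weights = rescaled_weights(max_scores, nominal)
weighted_max = sum(m * w for m, w in zip(max_scores, weights))

# Method 4: an essay rating of 7 becomes 70 on the 0-100 scale.
rescaled_rating = 10 * 7
empty = unobservable_categories()

print(decision, len(record), round(factor, 2), round(weights[-1], 2),
      round(weighted_max, 2), rescaled_rating, len(empty))
```

With these inputs the method 3 factor works out to 110/200, so the MCQ items receive weight .55 and the essay 5.50, and the maximum weighted score stays at 110.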

Assigning item weights: Item Replication or Rating Scales? Linacre JM, Wright BD. … Rasch Measurement Transactions, 1995, 8:4 p.403

The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website, www.rasch.org.
