Indexing vs. Measuring

In RMT 22:1, Stenner, Burdick, and Stone (2008) distinguished between two measurement models: reflective, or latent variable, models and formative, or composite variable, models (Edwards & Bagozzi, 2000). In the former, the causal action flows from the latent variable to the indicators (e.g., temperature), whereas in the latter the causal action flows from the indicators to the composite variable (e.g., socioeconomic status). We believe that the language we use should accentuate these differences. We therefore propose to call reflective models measurement models, what they produce measures, and the process of producing them measuring. In parallel fashion, formative models will be called index models, what they produce indices, and the process of producing them indexing. The notion of an index is well developed in economics and sociology and carries the connotations we desire. What follows is a discussion of how indexing and measuring differ and why it is important to make this distinction in the human sciences.

Indices are the effects of their indicators whereas measures (of latent variables) are the causes of their indicators. So, changes in stature or consumer price behavior are caused by changes in height (or weight) and price changes for market baskets of commodities (computers, milk, gasoline), respectively. Changes in latent variable measures, in contrast, cause a homogeneous (often nonlinear) change in indicator behavior, as when temperature change causes thermometric fluid to expand in the thermometer or a change in reader ability causes a change in count correct on a reading test.

Altering the indicators of an index changes the definition of the variable being indexed, whereas changing the indicators for a measure will not alter the latent variable (although precision of measurement and/or unit size may be affected). So, if midline girth is added to height and weight as indicators of stature, or all electronic commodities are eliminated from the Consumer Price Index (CPI) market basket, the definition of what is being indexed changes.
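The definitional role of index indicators can be illustrated with a small sketch. The three people, their standardized scores, and the unweighted-sum composite below are invented for illustration and are not taken from any real stature index. The point is only that adding midline girth produces a different variable, visible here as a changed ordering of persons.

```python
# Hypothetical standardized indicator scores (made-up numbers).
people = {
    "A": {"height": 1.0, "weight": 0.0, "girth": -1.5},
    "B": {"height": 0.0, "weight": 0.5, "girth": 1.0},
    "C": {"height": -0.5, "weight": 0.2, "girth": 0.2},
}


def index_score(scores, indicators):
    """Unweighted composite: the index is defined by its list of indicators."""
    return sum(scores[name] for name in indicators)


def ranking(indicators):
    """Order persons from highest to lowest on the composite."""
    return sorted(
        people,
        key=lambda person: index_score(people[person], indicators),
        reverse=True,
    )


two_indicator_order = ranking(["height", "weight"])
three_indicator_order = ranking(["height", "weight", "girth"])

print(two_indicator_order)
print(three_indicator_order)
```

With these made-up scores the two orderings differ, so the two composites cannot be treated as interchangeable versions of one variable.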

In contrast, knowledge of expansion coefficients and viscosity differences allows us to swap new thermometric fluids for mercury without changing the construct being measured. Similarly, new reading items with different text and item types can be swapped for previous items without changing the construct being measured.
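A minimal sketch of item swapping under a dichotomous Rasch model, assuming the item difficulties are already known in logits (the analogue of knowing a fluid's expansion coefficient). The item difficulties, the person measure, and the helper names are hypothetical. Expected raw scores stand in for observed responses so that the example is deterministic; real responses would add measurement error.

```python
import math


def p_correct(theta, difficulty):
    """Rasch probability of success for person measure theta on one item."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def expected_score(theta, difficulties):
    """Expected count correct on a set of items."""
    return sum(p_correct(theta, d) for d in difficulties)


def estimate_theta(raw_score, difficulties, lo=-10.0, hi=10.0):
    """Find the measure whose expected score equals raw_score (bisection)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < raw_score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


true_theta = 0.8                          # hypothetical person measure
old_items = [-1.0, -0.5, 0.0, 0.5, 1.0]   # hypothetical calibrations
new_items = [0.2, 0.9, 1.4, 2.0]          # different items, different count

est_old = estimate_theta(expected_score(true_theta, old_items), old_items)
est_new = estimate_theta(expected_score(true_theta, new_items), new_items)

print(round(est_old, 6), round(est_new, 6))
```

Both item sets return the same person measure even though the raw scores differ, which is the sense in which the items are incidental to the construct.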

Another way to express this point is that the indicators for an index are constitutive of that index, whereas indicators for a latent variable are incidental to the construct's definition.

In a generally objective measurement framework (e.g., Lexiles) what is crucial in the definition of the construct is the specification equation that specifies the cause of the variation detected by the instrument. Because the indicators of an index by design track different kinds of variation (height, weight, midline girth), it is difficult to imagine a specification equation that could, somehow, capture what these indicators share independent of the linear (or otherwise) combination that constitutes the index. What, for example, would a parallel form of Sheldon's somatotype rating scale look like? Difficulty in imagining what new indicators would constitute a parallel form is strongly suggestive of the need for an index rather than a measurement model.

Because both index and measurement models are fundamentally associational (i.e., based on correlations among indicators), traditional applications of Rasch model software often cannot distinguish between an index and a latent variable (Stenner, Burdick, & Stone, 2008). Examples of the resulting confusion predominantly take one form: index variables are interpreted as if they were latent variables. Here is an example typical of many in the Rasch literature [and RMT, Ed.]:

The Rasch model has been shown to fit FIM data reasonably well, which indicates that the scale locations describe adequately the relative order in which these functions are lost in the aging population. The items on the top describe difficult activities, such as climbing stairs, whereas items on the bottom describe easier activities that are maintained relatively well. (Embretson, 2006, p. 52)

Contrary to a latent variable interpretation, the FIM (Functional Independence Measure) appears to be an index of motor functioning, with the causal action moving from indicators to index. If the desired medical outcome is "more functional independence," then rehabilitating bladder control, walking, bathing, and so on should promote the intended outcome rather than the other way around. Alternatively, we could teach the patient to drive a motorized wheelchair, but including this as an indicator would alter the definition of functional independence.

Global fit of data to a Rasch model will not sort out the direction of causal flow and thus will not provide unambiguous evidence for a latent variable interpretation of the construct. A substantive theory and associated specification equation capable of explaining variation in indicator difficulties is a big step in support of a latent variable interpretation. The coup de grace is a demonstration of the specification equation's causal status using experimental manipulation of instrument characteristics (radicals) and subsequent observation of the theoretically predicted change in the measurement outcome.
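As a rough sketch of how a specification equation might be used, the block below fits a one-predictor least-squares line relating item difficulty to a single item characteristic and then predicts the difficulty of an item whose characteristic has been deliberately manipulated. The predictor, the calibrations, and the function name are made up for illustration. An actual specification equation would come from substantive theory, and the prediction would be tested against the difficulty observed after the manipulation.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical item characteristic (a "radical") and made-up
# Rasch item calibrations in logits.
radical = [1.0, 2.0, 3.0, 4.0, 5.0]
calibrated_difficulty = [-1.9, -1.1, 0.1, 0.9, 2.0]

intercept, slope = fit_line(radical, calibrated_difficulty)

# Experimental manipulation: build a new item with the radical set to 6.0
# and state, in advance, the difficulty the theory predicts for it.
predicted_difficulty = intercept + slope * 6.0

print(round(intercept, 2), round(slope, 2), round(predicted_difficulty, 2))
```

Agreement between the predicted and the subsequently observed difficulty would count as causal evidence; a good fit of the line to the existing items is, on its own, still only associational.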

It is a property of indices (economic, sociological, or psychological) that the indicator composite may be found to correlate more highly with an unintended criterion than with the intended one. Such a discomforting outcome is yet another reason that a correlational (as opposed to a causal) view of validity is not sustainable.

Latent variable interpretations are most defensible when global fit of data to a Rasch model is accompanied by invariance of the indicator structure throughout the range of the construct. In the language of additive conjoint measurement (Luce & Tukey, 1964), and as realized in the Lexile Framework for Reading (Kingdon, in press), it should be possible to trade off a difference between reader abilities of 200L for a difference in text readability of 200L to hold comprehension rate (count correct/total items) constant (Burdick, Stone, & Stenner, 2006). This trade-off property has been shown to operate throughout the grade range from kindergarten to advanced adult reading (e.g., Supreme Court decisions). It would not be expected to hold for a reading index variable composed of items such as: (1) number of books in the home, (2) daily newspaper subscription, (3) English as a first language, etc.
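The trade-off property can be sketched with a generic Rasch-type success function in which the expected comprehension rate depends only on the difference between reader ability and text difficulty. The values below are in arbitrary logit-like units and are hypothetical; the conversion between Lexiles and such units is not given here, so the equal shift applied to reader and text merely stands in for the 200L example.

```python
import math


def comprehension_rate(reader, text):
    """Expected success rate; a function of (reader - text) only."""
    return 1.0 / (1.0 + math.exp(-(reader - text)))


shift = 1.1  # hypothetical equal change in reader ability and text difficulty

# Hypothetical (reader, text) pairs spread across the range of the scale.
pairs = [(-2.0, -2.5), (0.0, 0.3), (3.0, 1.0)]

rates = []
for reader, text in pairs:
    before = comprehension_rate(reader, text)
    after = comprehension_rate(reader + shift, text + shift)
    rates.append((before, after))
    print(round(before, 4), round(after, 4))
```

Each pair prints the same rate before and after the shift, at the low, middle, and high ends of the scale alike. No such invariance is implied for a composite of heterogeneous indicators, where nothing plays the role of a common difference.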

It may be true that "where there is correlational smoke there is likely to be causational fire" (Holland, 1986, p. 951). Good fit with a Rasch model is correlational smoke, but as we have just seen, it takes an experimental test of a substantive theory to unambiguously distinguish between a latent variable and an index.

Burdick, D. S., Stone, M. H., & Stenner, A. J. (2006). The Combined Gas Law and a Rasch Reading Law. Rasch Measurement Transactions, 20(2), 1059-60, www.rasch.org/rmt/rmt202.pdf

Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological Methods, 5, 155-174.

Embretson, S. E. (2006). The continued search for nonarbitrary metrics in psychology. American Psychologist, 61(1), 50-55.

Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81, 945-960.

Luce, R. D., & Tukey, J. W. (1964). Simultaneous conjoint measurement: A new type of fundamental measurement. Journal of Mathematical Psychology, 1, 1-27.

Stenner, A. J., Burdick, D. S., & Stone, M. H. (2008). Formative and reflective models: Can a Rasch analysis tell the difference? Rasch Measurement Transactions, 22(1), 1152-3, www.rasch.org/rmt/rmt221.pdf

Indexing vs. Measuring … A.J. Stenner, M.H. Stone, and D.S. Burdick, Rasch Measurement Transactions, 2009, 22:4, 1176-7
