Unifying the Language of Measurement

The frequent misplacement of Rasch measurement in the domain of Item Response Theory might be corrected by more persuasively and comprehensively situating it in the context of natural science, following the lead of L.L. Thurstone, G. Rasch, B.D. Wright, D. Andrich and others. The implicit and largely untested claim in this work is that the theory and practice of measurement across the sciences are unified insofar as they demand, model, and capitalize on principles such as invariance, statistical sufficiency, parameter separation, unidimensionality, conjoint additivity, and local independence. Interest in such unification comes from both the social sciences and the natural sciences, as was indicated in a recent talk by Ludwik Finkelstein (2010; also see his 2004), a past editor (1982-2000) of Measurement, and a former Vice-President of the International Measurement Confederation.

There are theoretical and practical reasons for pursuing a unified science of measurement. As Finkelstein says, common logical and philosophical principles share common problems, but differing concepts, vocabularies and methodologies obscure those commonalities and make the solution of their shared problems needlessly difficult. Some problems are likely to have been better addressed to date in the social sciences, and others, in the natural sciences.

For instance, the Rasch focus on invariance will likely be found of particular value within metrology. The obvious places to begin Rasch applications in the context of metrology are with regard to its educational and human resource needs for individualized tests informing differentiated instruction, computerized adaptive certification exams, admission and graduation standards, program improvement and comparison metrics, employee assessments and opinion surveys, etc. But Rasch theory and methods might also play a role in resolving some problems that social scientists assume to be completely under control in the natural sciences, as in the potential for Rasch models to inform genomic and proteomic metrologies (Markward & Fisher, 2004) or clinical laboratory disease severity measures (Fisher & Burton, 2010). Coming at it from the other direction, the metrological focus on the traceability of individual instruments and measures to universally uniform reference standards will likely result in significant advances within the social sciences. The role of these kinds of technical networks in reducing market frictions (Barzel, 1982) and in amplifying individual effects in a kind of choral collective cognition (Magnus, 2007) may have profound implications for the advancement of science (Latour, 1987, 2005), economics, government, and the work place. There is, then, a potential for the theory and practice of invariant measurement advanced in work following from Rasch to piggyback on the principle of networked metrological traceability, while those networks capitalize in new ways on the potentials brought to bear by Rasch's principles of invariance.

The International Measurement Confederation (IMEKO; www.imeko.org) hosts annual and bi-annual meetings of its various technical committees (TCs) at locations globally. Of particular interest to Rasch measurement practitioners are the IMEKO TC1 on metrology education and TC7 on measurement science. The 13th IMEKO TC1-TC7 Joint Symposium took place September 1-3, 2010 at City University in London. For the first time, this Symposium included the IMEKO TC13 - Measurements in Biology and Medicine. The Symposium was organized by Sanowar Khan, Kenneth Grattan, Ludwik Finkelstein, and Panicos Kyriacou of the School of Engineering and Mathematical Sciences at City University London. Sponsors included the Institute of Physics (IOP), UK, City University, the Institute of Measurement and Control, and the Worshipful Company of Scientific Instrument Makers. The conference program can be accessed online at imeko.iopconfs.org.

Three Rasch papers were presented at the conference (Bezruczko & Fatani, 2010; Fisher, 2010; Fisher, Elbaum, & Coulter, 2010). These papers presented variations on the same rationale for presenting research employing Rasch measurement at such a conference, namely, that models requiring linear, invariant comparisons provide an equivalent basis for quantification, no matter the field in which they happen to be employed. There is a need for further elaborations and explorations of Rasch's appropriation of the structure of natural law embodied in the Standard Model used by Maxwell in his analysis of mass, force, and acceleration. In starting from Maxwell's work in this way, Rasch capitalized on the uniformity with which natural laws involve the equivalence of one parameter with the multiplication or division of two other parameters, such that "virtually all the laws of physics can be expressed numerically as multiplications or divisions of measurements" (Ramsay, Bloxom, & Cramer, 1975, p. 258). Models of this kind require well-defined homomorphisms between empirical and numerical relational structures. Referred to by Rasch as isomorphisms, these are usually absent in social science scaling models, which are typically presumed valid and fit to data whether or not the empirical and numerical relational structures match (Krantz, Luce, Suppes, & Tversky, 1971).
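The structural parallel can be sketched in a few lines. The notation below is an assumption made for illustration, not a quotation from the papers cited: A, F, and M stand for acceleration, force, and mass, P_{ni} for the probability that person n succeeds on item i, and B_n and D_i for the person measure and item calibration in logits.

```latex
% A multiplicative law of the kind Maxwell analyzed, and its additive (log) form
A_{vj} = \frac{F_j}{M_v}
\quad\Longleftrightarrow\quad
\ln A_{vj} = \ln F_j - \ln M_v

% The dichotomous Rasch model written in the same two forms
\frac{P_{ni}}{1 - P_{ni}} = \frac{e^{B_n}}{e^{D_i}}
\quad\Longleftrightarrow\quad
\ln\frac{P_{ni}}{1 - P_{ni}} = B_n - D_i

% Comparing persons n and m on any item i: the item parameter cancels
\ln\frac{P_{ni}}{1 - P_{ni}} - \ln\frac{P_{mi}}{1 - P_{mi}}
  = (B_n - D_i) - (B_m - D_i) = B_n - B_m
```

The last line is the sense in which the comparison of two persons is invariant: when the model holds, the difference does not depend on which item mediates the comparison, just as the ratio of two accelerations produced by the same force does not depend on the size of that force.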

Thus, Rasch models are properly situated in the tradition of measurement in the natural sciences because of their formal properties. But these properties are insufficient in themselves to the task of unifying measurement across the sciences. Ramsay, et al. (1975, p. 262) recognize that "Progress in physics would have been impossibly difficult without fundamental measurement," and that "we may have to await fundamental measurement before we will see any real progress in quantitative laws of behavior." But fundamental measurement and rigorously validated quantitative laws of behavior have been available now for decades, with little recognition or acceptance of their value in mainstream social science. Plainly something else besides mathematical proofs, experimental evidence, predictive theories, and persistently invariant instrumentation is needed for social scientists to adopt fundamental measurement and build out the theory and practice of psychosocial laws integrating qualitative and quantitative data and methods.

Latour (1987, 2005) provides convincing arguments and evidence to the effect that social networks are essential to the spread of new ideas and methods in science. There seems to be a popular notion that fundamental measures incorporated in lawfully regular patterns of inter-related phenomena are what one might call "naturally natural," and that these somehow propagate themselves spontaneously into existence as universally uniform and available things or effects. Latour and others reveal the huge resources invested in, and material practices associated with, making constructs that are incredibly rare in nature seem quintessentially natural. Steel, for instance, may well exist in nature, but certainly not in the quantities in which it has been manufactured over the last century and more. Rather than discovering pre-existing phenomena in nature, science and technology combine to isolate useful and meaningful phenomena that are then exported from laboratories. The key to the process resides in the way technical media encapsulate and package an effect so as to keep it always connected with the networks of energy, communications, tools, and technicians that make it seem naturally universal. It may be, then, that to make psychosocial constructs seem "naturally natural," social scientists need to find a way to deploy those constructs via networks of measures metrologically traceable to reference standards.

A question that arose in the course of making the Rasch presentations at the IMEKO meeting led to some insight into the kinds of challenges likely to arise in the course of addressing the need for a unified language and practice of measurement. Though the term "calibration" is commonly employed in Rasch measurement to refer to the process of evaluating the invariance properties and estimating the parameters of an instrument, this process is almost always undertaken in an exploratory fashion, with no reference to a previously existing uniform standard metric (or even to previously calibrated instruments measuring the same construct). In the natural sciences and engineering, however, calibration does not mean anything except confirming traceability to a reference standard. Calibration is always relative to a standard.

This difference in the use of the same term represents a significant way in which barriers to understanding might arise. Because universal uniform metric standards, such as degrees Celsius, kilograms, meters, the second, etc., are nearly nonexistent in the social sciences, calibration is not yet a matter of establishing that kind of correspondence. Conversely, such standards are the norm in the natural sciences. New constructs are either rare or not considered candidates for calibrated instrumentation until standards are developed. There are, accordingly, few, if any, of the theoretical or practical guidelines for evaluating invariance in new constructs, estimating initial parameters, equating instruments, etc. that are taken for granted in Rasch-oriented psychometrics.
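One place where the two senses of "calibration" already come close to each other is common-item equating, in which a newly calibrated instrument is shifted onto the frame of reference of an existing one. The following is a minimal sketch under stated assumptions: the item names and logit values are invented, the function name `link_to_reference` is hypothetical, and the mean shift over common items is only one simple linking method among several.

```python
from statistics import mean

def link_to_reference(new_calibrations, reference_calibrations):
    """Shift newly estimated item calibrations (logits) onto a reference frame.

    Uses the mean difference over items common to both sets. Returns the
    linked calibrations, the shift applied, and the residual for each common
    item (large residuals suggest the item is not functioning invariantly).
    """
    common = sorted(set(new_calibrations) & set(reference_calibrations))
    if not common:
        raise ValueError("no common items to link on")
    shift = mean(reference_calibrations[i] - new_calibrations[i] for i in common)
    linked = {i: d + shift for i, d in new_calibrations.items()}
    residuals = {i: linked[i] - reference_calibrations[i] for i in common}
    return linked, shift, residuals

# Hypothetical values: a reference bank and a fresh, exploratory calibration
reference = {"item_a": -1.0, "item_b": 0.0, "item_c": 1.0}
exploratory = {"item_a": -0.5, "item_b": 0.6, "item_c": 1.4, "item_d": 2.0}

linked, shift, residuals = link_to_reference(exploratory, reference)
print(f"shift applied: {shift:+.2f} logits")
for item, value in sorted(linked.items()):
    print(f"{item}: {value:+.2f}")
```

In this sketch the new item, item_d, acquires a calibration in the reference unit only by way of the items it shares a form with, which is a small-scale version of what traceability to a standard requires.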

In light of this contrast between measurement in the natural and social sciences, other seeming similarities took on new significance. For instance, multiple papers presented at the London conference took up issues involving ordinal and nominal scales, referring to Stevens' fourfold measurement classification system in positive terms. When the question was raised as to why any interest would be invested in ordinal scales in the context of the natural sciences, the reply repeated the previous emphasis on the fact that all measurement, ordinal and nominal as well as interval and ratio, is performed relative to existing standards.

The Mohs Hardness Scale, for instance, provides an ordinal standard of measurement that works because it definitively encompasses virtually the entire range and every instance of possible variation in the construct in a metric that informs almost all applications involving it. Similarly, nominal standards define the shape of geometrical figures in the same way dictionaries define the meaning of words.

Natural scientists at the IMEKO conference, in turn, saw reservations concerning the value and utility of ordinal measures in the social sciences in a new light when the incomparability of scores from two different mathematics tests was raised. The point was amplified by proposing to add a new item to both tests, which would make subsequently gathered scores incomparable with those from each of the already incomparable original tests. The chaos and confusion in that scenario were briefly contrasted with the simplicity and elegance of the comparisons that could be made of measures from exactly the same groups of items if those items had been drawn from a bank of items calibrated to measure in the same unit. Once again, all the difference was made by the presence or absence of a calibrated standard.
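The contrast can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical bank of item calibrations in logits; the values, the form names, and the function names are invented for the example, and the estimator is an ordinary Newton-Raphson maximum-likelihood routine for the dichotomous Rasch model rather than any particular program's method.

```python
import math

def rasch_p(b, d):
    """Probability of success for person measure b on item difficulty d (logits)."""
    return 1.0 / (1.0 + math.exp(-(b - d)))

def measure_from_score(raw_score, difficulties, tol=1e-8, max_iter=100):
    """Maximum-likelihood measure and standard error for a raw score.

    The difficulties are taken as already calibrated in a common unit.
    Zero and perfect scores have no finite estimate and are rejected.
    """
    n = len(difficulties)
    if not 0 < raw_score < n:
        raise ValueError("extreme scores have no finite estimate")
    # Starting value: log-odds of the score, centered on the mean difficulty
    b = math.log(raw_score / (n - raw_score)) + sum(difficulties) / n
    for _ in range(max_iter):
        probs = [rasch_p(b, d) for d in difficulties]
        info = sum(p * (1.0 - p) for p in probs)
        step = (raw_score - sum(probs)) / info
        step = max(-1.0, min(1.0, step))  # damp large steps
        b += step
        if abs(step) < tol:
            break
    info = sum(rasch_p(b, d) * (1.0 - rasch_p(b, d)) for d in difficulties)
    return b, 1.0 / math.sqrt(info)

# Two hypothetical forms drawn from one calibrated bank
easy_form = [-2.0, -1.5, -1.0, -0.5, 0.0]
hard_form = [0.0, 0.5, 1.0, 1.5, 2.0]

b_easy, se_easy = measure_from_score(3, easy_form)
b_hard, se_hard = measure_from_score(3, hard_form)
print(f"score 3/5 on easy form: {b_easy:+.2f} logits (SE {se_easy:.2f})")
print(f"score 3/5 on hard form: {b_hard:+.2f} logits (SE {se_hard:.2f})")
```

The same raw score of 3 corresponds to measures two logits apart, because each item on the hard form is two logits more difficult than its counterpart on the easy form. The raw scores are not comparable across forms, but the measures are expressed in the bank's common unit, each with its own uncertainty.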

Two themes running through virtually all of the papers presented at the conference concern the most exciting and challenging areas for measurement-focused collaboration between social and natural scientists: metrological traceability and measurement uncertainty. The former is the domain of the International Vocabulary of Metrology, or VIM, now in its third edition (BIPM, et al., 2008). The latter is covered in the Guide to the expression of Uncertainty in Measurement, or GUM (BIPM, et al., 1995). These works are significant in being created and adopted as standards by an authoritative international group known as the Joint Committee for Guides in Metrology (JCGM), which includes the International Bureau of Weights and Measures (BIPM), the International Electrotechnical Commission (IEC), the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC), the International Organization for Standardization (ISO), the International Union of Pure and Applied Chemistry (IUPAC), the International Union of Pure and Applied Physics (IUPAP), the International Organization of Legal Metrology (OIML), and the International Laboratory Accreditation Cooperation (ILAC).

Several presentations at the 2010 IMEKO meeting in London, including the closing plenary, updated the IMEKO membership on the GUM and the VIM, especially as regards work towards the next, fourth, edition of the latter, expected to be published in 2018. These presentations were given primarily by Luca Mari (Università Cattaneo, Italy) and Paul de Bièvre (Editor, Accreditation & Quality Assurance), both of whom were involved in the production of the VIM3 and are continuing work toward the VIM4. One of the goals for the VIM4 is to include, so far as possible, concepts and terminology that unify the language of measurement across the sciences. For some background on where work in this direction is starting from, see Mari (2010a, 2010b), Mari and Ugazio (2010), and Mari and Giordani (2010). Paul de Bièvre's (2006, 2010; Price & de Bièvre, 2009) articles and editorials in Accreditation & Quality Assurance are also illuminating. Other IMEKO presentations on probabilistic inferences (Rossi, 2010), uncertainty (Pavese, 2010; Pertile & Debei, 2010; Weißensee, Kühn, & Linß, 2010), multiscale models (Abdulla, Imms, Schleich, & Summers, 2010) and ordinal scales (Benoit, 2010) are also illustrative of current perspectives on problems related to "soft" or "wider" measurement in physics and engineering.

The VIM3 defines the concepts and associated terms employed in identifying units of measurement that are comparable across samples, instruments, operators, labs, time, and space. The realization of comparability requires a prior positive outcome of an experimental test of the hypothesis that an invariant, additive unit exists. The hows and whys of producing this outcome for new, previously unmeasured variables are not obvious or self-evident, but the culture of measurement in the natural sciences is oriented to the implementation of existing standards. The VIM3 says little or nothing concerning the form or content of the hypothesis of an invariant construct, the observational frameworks, experimental designs, and estimation methods used in evaluating it, or the criteria for determining if and when that hypothesis is falsified.

Though these aspects of measurement theory and practice have developed to mature and widely applied forms over the last 80 years, they have not been proposed, debated, or consolidated as standard procedures. It may not, in fact, be appropriate to present them as standards. Instead, perhaps it would be better to provide methodological recommendations, and to focus on (a) specifying the properties of a variable, such as reading or cognitive development, already measured in a unit functioning as a de facto standard, and (b) defining the concepts needed for establishing traceability to it as a recognized de jure standard. It should be expected that these concepts will likely differ significantly from those associated with calibration to existing standards in the natural sciences and engineering, given the intangible, social nature of the constructs and their basis in ordinal observations.

There is a great need for the involvement of Rasch measurement theoreticians and practitioners in the formulation of a unified language of scientific measurement. The challenges are huge, but the returns on the investments, when measured in terms of human value, social cohesion, and environmental quality, stand to be even greater.

Abdulla, T., Imms, R., Schleich, J. M., & Summers, R. (2010). Multiscale information modelling for heart morphogenesis. Journal of Physics: Conference Series, 238: iopscience.iop.org/1742-6596/238/1/012062.

Barzel, Y. (1982). Measurement costs and the organization of markets. Journal of Law and Economics 25: 27-48.

Bezruczko, N. & Fatani, S. S. (2010). Probabilistic measurement of non-physical constructs during early childhood: epistemological implications for advancing psychosocial science. Journal of Physics: Conference Series, 238: iopscience.iop.org/1742-6596/238/1/012053.

BIPM, IEC, IFCC, ILAC, IUPAC, IUPAP, ISO, OIML (2008) International Vocabulary of Metrology: basic and general concepts and associated terms (VIM) 3rd edn. Online available as JCGM 200:2008 at: www.bipm.org/en/publications/guides/vim.

BIPM, IEC, IFCC, ILAC, IUPAC, IUPAP, ISO, OIML (1995) Guide to the expression of Uncertainty in Measurement (GUM). Online available as JCGM 100:2008 at: www.bipm.org/en/publications/guides/gum.

de Bièvre, P. (2006). Counting is measuring: Learning from the banks? Accreditation & Quality Assurance, 11: 1-2.

de Bièvre, P. (2010). A metrological traceability chain prevents circular reasoning in measurement design. Accreditation & Quality Assurance, 15: 491-492.

Finkelstein, L. (2010). Measurement and instrumentation science and technology - the educational challenges. Journal of Physics: Conference Series, 238: iopscience.iop.org/1742-6596/238/1/012001.

Fisher, W. P., Jr. (2010, September 1-3). The standard model in the history of the natural sciences, econometrics, and the social sciences. Journal of Physics: Conference Series, 238(1), iopscience.iop.org/1742-6596/238/1/012016.

Fisher, W. P., Jr., & Burton, E. (2010). Embedding measurement within existing computerized data systems: Scaling clinical laboratory and medical records heart failure data to predict ICU admission. Journal of Applied Measurement, 11, 271-287.

Fisher, W. P., Jr., Elbaum, B., & Coulter, A. (2010). Reliability, precision, and measurement in the context of data from ability tests, surveys, and assessments. Journal of Physics: Conference Series, 238(1), iopscience.iop.org/1742-6596/238/1/012036.

Krantz, D. H., Luce, R. D., Suppes, P., & Tversky, A. (1971). Foundations of measurement. Volume 1: Additive and polynomial representations. New York: Academic Press.

Latour, B. (1987). Science in action: How to follow scientists and engineers through society. New York: Cambridge University Press.

Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. Oxford, England: Oxford University Press.

Magnus, P. D. (2007). Distributed cognition and the task of science. Social Studies of Science, 37(2), 297-310.

Mari, L. (2010a). Properties as measurands: an overview and some critical issues. Journal of Physics: Conference Series, 238: iopscience.iop.org/1742-6596/238/1/012002.

Mari, L. & Ugazio, E. (2010). Preliminary analysis of validation of measurement in soft systems. Journal of Physics: Conference Series, 238: iopscience.iop.org/1742-6596/238/1/012026.

Markward, N. J., & Fisher, W. P., Jr. (2004). Calibrating the genome. Journal of Applied Measurement, 5(2), 129-141.

Pavese, F. (2010). Comparing statistical methods for the correction of the systematic effects and for the related uncertainty assessment. Journal of Physics: Conference Series, 238: iopscience.iop.org/1742-6596/238/1/012041.

Pertile, M. & Debei, S. (2010). Comparison between two modern uncertainty expression and propagation approaches. Journal of Physics: Conference Series, 238: iopscience.iop.org/1742-6596/238/1/012033.

Price, G. & de Bièvre, P. (2009). Simple principles for metrology in chemistry: Identifying and counting. Accreditation & Quality Assurance, 14: 295-305.

Ramsay, J. O., Bloxom, B., & Cramer, E. M. (1975, June). Review of Foundations of Measurement, Vol. 1, by D. H. Krantz et al. Psychometrika, 40(2), pp. 257-262.

Rossi, G. B. (2010). Probabilistic inferences related to the measurement process. Journal of Physics: Conference Series, 238: iopscience.iop.org/1742-6596/238/1/012015.

Weißensee, K., Kühn, O. & Linß, G. (2010). Knowledge-based uncertainty estimation of dimensional measurements using visual sensors. Journal of Physics: Conference Series, 238: iopscience.iop.org/1742-6596/238/1/012023.

Unifying the Language of Measurement, W.P. Fisher, Jr. ... Rasch Measurement Transactions, 2010, 24:2 p. 1278-81
