Faulty Thinking by Educational Researchers

When the twentieth century is viewed from the perspective of mental test technology, the Rasch model stands out as a watershed between earlier forms of empirical investigation and the construction of objective social research. By its elimination of reference groups and its emphasis on objective measurement which statistically models the properties of linearity and additivity, the Rasch model offers researchers the opportunity to undertake quantitative studies of mental growth and development with a precision and clarity that, even now, we only expect in the physical sciences.

Unfortunately, advances in methodology, however significant, are generally resisted by a research community (Cohen, 1985). New ideas and methods, despite their benefits, require reappraisal of prevailing practice and the acquisition of new concepts and skills. In general, the more concepts and techniques that must be given up, the greater the resistance to a new methodology. This resistance to change is necessary to protect the practice of science from frivolity and triviality but, not surprisingly, it also inhibits the dissemination of genuine advances in scientific method and thinking.

Four topics associated with traditional mental testing obscure the advantages that objective measurement has for the study of educational and mental growth: (1) the distinction between qualitative and quantitative observations, (2) empirical analyses based on grade equivalents, (3) the longing for an absolute zero for the measurement of mental characteristics, and (4) a rigid conceptualization of reliability.

Qualitative versus quantitative:
Few issues in contemporary discussions of research methodology, and arguably in the history of social research, are more artificial and have led to more muddled thinking than the futile distinction that some researchers make between qualitative and quantitative observations. Miles & Huberman (1984) argue that social reality consists of a fundamental dichotomy between quantitative and qualitative observations which logically prohibits the application of statistical methods to particular social observations. Their approach attempts to preserve the "qualitative" integrity of social phenomena from a feared debasement by quantitative methods. Many philosophers have analyzed this misbegotten perspective and have concluded that the claimed distinction between qualitative and quantitative observations has no logical basis (Kaplan, 1964; Reichardt & Cook, 1979, 1980; Walker & Evers, 1988). Unfortunately, most measurement specialists in contemporary social research, unable or unwilling to address this conclusion, have abandoned vast areas of empirical study to inadequate methods which cannot even approximate objectivity.

In the 1920s, Thurstone concluded that, while every measure begins as a qualitative experience, all empirical investigations, on close inspection, involve the application of quantitative reasoning. A measure focuses on a single aspect of experience, associated with some quality of particular interest, and describes it numerically in order to accomplish a fundamental scientific goal: the precise description of variation.

Observations do not fall into mutually exclusive quantitative and qualitative classes. All observations are at first qualitative. The methods by which observations are used, however, are almost all quantitative. An important distinction between methods is the degree to which observations are summarized numerically. At one extreme are the "qualitative" methods that employ only non-numerical description, such as personal impression and subjective opinion. Other methods achieving greater generality apply increasingly numerical description, e.g. rank orders. At the other extreme, there are methods that rely exclusively on scientifically modelled linear measures.

Grade equivalents:
In 1972, Angoff noted the severe shortcomings of grade equivalents (GE) when measuring intellectual growth. The definition of GE's enforces an equal amount of growth each year, forcing all growth curves to be straight lines of predetermined slope, thus completely concealing variations in growth rate. Angoff explained how differences in GE could not be interpreted as differences in ability and so urged that GE's be avoided. Twenty years later, scholarly journals, public school systems and government agencies, otherwise committed to clarity and precision, continue to study and report growth in GE's. What a scientific embarrassment!
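The concealment of growth rate can be illustrated with a minimal sketch. It assumes the usual definition of a grade equivalent as the grade at which the norming group's median matches a given score. The median values, the logit scale, and the helper name `grade_equivalent` are all invented for illustration and come from no real test.

```python
# Hypothetical median ability (in logits) of a norming group at each grade.
# The values are invented; they are chosen so that growth slows down with age.
median_by_grade = {1: -2.0, 2: -0.8, 3: 0.0, 4: 0.5, 5: 0.8, 6: 1.0}


def grade_equivalent(measure):
    """Interpolate the grade whose norm-group median equals `measure`."""
    grades = sorted(median_by_grade)
    for lo, hi in zip(grades, grades[1:]):
        m_lo, m_hi = median_by_grade[lo], median_by_grade[hi]
        if m_lo <= measure <= m_hi:
            return lo + (measure - m_lo) / (m_hi - m_lo)
    raise ValueError("measure lies outside the normed range")


grades = sorted(median_by_grade)

# Yearly growth of the median student on the linear (logit) scale.
logit_growth = [median_by_grade[g + 1] - median_by_grade[g] for g in grades[:-1]]

# Yearly growth of the same student expressed in grade equivalents.
ge_growth = [
    grade_equivalent(median_by_grade[g + 1]) - grade_equivalent(median_by_grade[g])
    for g in grades[:-1]
]

print("growth in logits:", logit_growth)
print("growth in GE:    ", ge_growth)
```

Under these assumptions the growth in logits shrinks every year, while the growth in GE is one unit per year by construction. A one-unit GE difference therefore corresponds to a different amount of ability at each grade.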

Absolute zero:
For many researchers and lay-persons alike, measurement in the social sciences will always seem fundamentally flawed because measures of non-physical characteristics, such as mental ability or attitude, do not seem to have the "natural" absolute zeroes so plentiful in physics. Even measurement specialists, knowledgeable about their particular techniques, fail to provide an adequate response to this naive apprehension. In fact, the "no zero" criticism is frequently accepted as an inherent limitation on the application of science to human affairs. This, in turn, perpetuates a myth, associated with Descartes, that the human aspects of experience are not suitable for scientific investigation.

While the role of zero in measurement can be viewed from several perspectives, I offer the reader two from the physical sciences. First is the simple fact that many measurement applications in the physical sciences, such as pitch, hue, loudness and hardness, do well without any absolute zeroes. The Mohs hardness scale is not even a measure, but a physical operation for ranking geological specimens! In fact, the familiar measures of length and time only acquire their zeroes through the context in which they are applied. Neither length nor time has a natural origin or absolute zero. What they have are agreed-upon starting points: the points from which differences are measured. The practical importance of "natural" zeroes is vastly overrated.

Second is a lesson from thermodynamics, where researchers use scales with various zeroes, each of which has its own theoretical significance. For social research devoted to the expansion of scientific knowledge, that theoretical significance is the central concern. At the simplest level, say for the measurement of temperature, the correlation of an observation with the physical expansion of a criterion requires only a convention to establish the numerical values on a scale such as Centigrade or Fahrenheit. The zero is no more than a convenient means of anchoring the numbers on the scale.

At higher levels, speculation on the theoretical constructs that might underlie the interaction of observation and instrument becomes central. The instrument developer puts greater emphasis on assigning numbers to a scale according to a reproducible consistency between numerical order and hypothesized theoretical terms. When temperature scales are based on molecular activity and heat exchange, the concept of zero acquires a theoretical context, the utility of which can be investigated through empirical research. When successful, this approach results in a measure with broad empirical implications, an outcome not possible when measurement is based on no more than a correlation with a criterion. The importance of conceptual insight to the development of the theoretical context for a scale of temperature with a meaningful zero applies equally well to the development of social measures.
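The temperature example can be made concrete with a short sketch using the standard conversions between the Centigrade (Celsius), Fahrenheit and Kelvin scales. The three readings are arbitrary illustrative values. The sketch shows that comparisons of differences survive a change of zero, while ratios of the raw numbers do not.

```python
import math


def c_to_f(c):
    """Centigrade to Fahrenheit: a different unit and a different conventional zero."""
    return c * 9 / 5 + 32


def c_to_k(c):
    """Centigrade to Kelvin: the same unit, with a theoretically motivated zero."""
    return c + 273.15


def interval_ratio(a, b, c):
    """Ratio of the difference (c - b) to the difference (b - a)."""
    return (c - b) / (b - a)


# Three arbitrary readings in degrees Centigrade.
readings_c = [10.0, 20.0, 40.0]
readings_f = [c_to_f(t) for t in readings_c]
readings_k = [c_to_k(t) for t in readings_c]

# Comparisons of differences agree on every scale.
ratio_c = interval_ratio(*readings_c)
ratio_f = interval_ratio(*readings_f)
ratio_k = interval_ratio(*readings_k)

# Ratios of the raw numbers depend on where the zero was placed.
raw_ratio_c = readings_c[2] / readings_c[1]
raw_ratio_f = readings_f[2] / readings_f[1]

print(ratio_c, ratio_f, ratio_k)
print(raw_ratio_c, raw_ratio_f)
```

The interval ratios agree across all three scales. Only statements that depend on the position of the zero change, and the Kelvin zero matters because of the theory behind it, not because the arithmetic requires it.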

Reliability:
Reliability is a term that has taken on a sacred and obscure status in contemporary social research. Few measurement terms have wider application and less meaning. Researchers rely upon reliability to qualify the fundamental adequacy of their research. They use it as the blanket criterion for success or failure. From the perspective of objective measurement, however, the implications of any particular reliability are revealed as ambiguous at best. The reliability of a test is determined by a local and by no means general or necessary mixture of item difficulties and person abilities. A minor, even trivial, change in any part of this mixture will change the value of the reliability coefficient. Indeed, it is not possible to decide from the value of a reliability coefficient alone whether the test in question is useful or useless. This widespread misunderstanding about reliability leads to confusion at best and to entirely erroneous conclusions at worst.
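This dependence on the local mixture of items and persons can be sketched in a few lines. The sketch assumes the dichotomous Rasch model and takes reliability to be "true" variance divided by "true" plus error variance, with error variance computed as the reciprocal of test information. The item difficulties, the two samples, and all function names are hypothetical.

```python
import math
from statistics import mean, pvariance


def rasch_p(theta, b):
    """Rasch probability of success for ability theta on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def error_variance(theta, difficulties):
    """Model error variance of a person measure: 1 / test information at theta."""
    info = sum(rasch_p(theta, b) * (1.0 - rasch_p(theta, b)) for b in difficulties)
    return 1.0 / info


def reliability(abilities, difficulties):
    """'True' variance / ('true' variance + mean error variance) for one sample."""
    true_var = pvariance(abilities)
    err_var = mean(error_variance(t, difficulties) for t in abilities)
    return true_var / (true_var + err_var)


# Hypothetical 20-item test with difficulties spread evenly from -2 to +2 logits.
items = [-2.0 + 4.0 * i / 19 for i in range(20)]

# Two hypothetical samples taking the same test.
wide_sample = [-2.0, -1.0, 0.0, 1.0, 2.0]
narrow_sample = [-0.5, -0.25, 0.0, 0.25, 0.5]

r_wide = reliability(wide_sample, items)
r_narrow = reliability(narrow_sample, items)

print("reliability, wide sample:  ", round(r_wide, 2))
print("reliability, narrow sample:", round(r_narrow, 2))
```

The test and its measurement precision are nearly the same for both samples, yet the coefficient is much lower for the narrow sample simply because those persons differ less from one another.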



Faulty Thinking by Educational Researchers, N Bezruczko … Rasch Measurement Transactions, 1990, 4:3 p. 114-115




The URL of this page is www.rasch.org/rmt/rmt43e.htm
