A DIMTEST diminuendo

Nandakumar and Stout (1993, p.64) write "Stout's [DIMTEST] procedure seems very promising for assessing the dimensionality underlying a set of items. It is an outgrowth of the conceptual definition of essential unidimensionality and was developed to be sensitive to dominant dimensions and insensitive to transient or minor dimensions. The procedure is nonparametric (thus avoiding parametric model-data problems), supported by asymptotic theory, and is computationally simplistic."

DIMTEST is a creative and mathematically ambitious attempt to assess unidimensionality. Nevertheless, despite its award-winning status, DIMTEST is severely flawed.

The indispensable step in DIMTEST is the partitioning of test items into three subtests: AT1, a "unidimensional" subtest; AT2, a subtest with the same difficulty distribution as AT1 (in DIMTEST 2, AT2 is simulated); and PT, a subtest used for partitioning person abilities. Then some simple (for a computer) calculations produce Stout's T statistic, which assesses departure from essential unidimensionality. But hazards and defects abound!

1. All person response strings with missing data must be omitted. All non-dichotomous items must be dropped. This is because items are to be characterized by dichotomous p-values and persons by total test raw scores. Obviously DIMTEST is intended for conventional MCQ tests. In other situations, large proportions of the data will be lost. Adaptive testing and rated performances are excluded.
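The data loss described in point 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DIMTEST's own code: the response matrix, the use of None for a missing response, and the function name dichotomous_complete are all hypothetical.

```python
# Hypothetical response matrix: rows are persons, columns are items.
# None marks a missing response; the last item is rated 0-2, not 0/1.
responses = [
    [1, 0, 1, 2],
    [1, None, 0, 1],
    [0, 1, 1, 0],
    [1, 1, None, 2],
    [0, 0, 1, 1],
]

def dichotomous_complete(data):
    """Drop non-dichotomous items, then drop persons with any missing response."""
    n_items = len(data[0])
    # An item is kept only if every observed response to it is 0 or 1.
    keep_items = [
        j for j in range(n_items)
        if all(row[j] in (0, 1, None) for row in data)
    ]
    reduced = [[row[j] for j in keep_items] for row in data]
    # A person is kept only if no response to the kept items is missing.
    complete = [row for row in reduced if None not in row]
    return keep_items, complete

kept_items, kept_rows = dichotomous_complete(responses)

# Share of the originally observed responses that survives both deletions.
observed = sum(v is not None for row in responses for v in row)
retained = sum(len(row) for row in kept_rows)
print(kept_items, len(kept_rows), retained / observed)
```

In this small made-up data set, one rated item and two incomplete persons are enough to discard half of the observed responses.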

2. An item difficulty continuum must be constructed in order to get started. Then subset AT1 is specified to reasonably cover the difficulty continuum. But how is difficulty to be identified? DIMTEST requires that item characteristic curves be monotone, i.e., that any increase in person ability is always accompanied by an increase in the probability of a correct response on any item. The form of the monotonic function (logistic, normal ogival, with/without guessing, etc.) is not specified. This means that for given data there is no unique hierarchy of item difficulty (see example, RMT 6(1) p. 199 Figure 5). DIMTEST actually uses item p-values to define the continuum. This is equivalent to requiring monotonicity, on average, of person characteristic curves, i.e., that stochastic Guttman ordering prevail. This, in turn, implies that the data must approximate the Rasch model (RMT 6(3) p. 232)!
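Why monotone item characteristic curves alone do not fix a unique difficulty order can be shown with two crossing curves. The sketch below assumes a two-parameter logistic form purely for illustration (DIMTEST does not specify the form); the function icc and the two items are hypothetical.

```python
import math

def icc(theta, slope, location):
    """Monotone increasing logistic item characteristic curve (assumed form)."""
    return 1.0 / (1.0 + math.exp(-slope * (theta - location)))

# Two hypothetical items with the same location but different slopes.
item_flat = (0.5, 0.0)
item_steep = (2.0, 0.0)

for theta in (-1.0, 1.0):
    p_flat = icc(theta, *item_flat)
    p_steep = icc(theta, *item_steep)
    easier = "flat" if p_flat > p_steep else "steep"
    print(f"theta={theta:+.1f}  flat={p_flat:.3f}  steep={p_steep:.3f}  easier: {easier}")
```

Both curves are monotone, yet the flat item is the easier one for low-ability persons and the steep item is the easier one for high-ability persons. Which item has the higher p-value then depends on the ability distribution of the sample.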

3. A thought-to-be "unidimensional" subset, AT1, must be chosen. AT1 is to contain up to ¼ of the items (but at least 4) chosen by "expert opinion" or analytical technique (e.g., factor analysis) to have the same dominant trait, i.e., to be unidimensional. In addition, this "dominant" trait is to be as different as possible from any other traits there may be in the test. This implies that the analyst can identify and contrast all the dimensions in a test a priori - in which case what could be the motive for a test of dimensionality? Finally, AT1 must otherwise contain the same types and amounts of response variance as the other items (another impossible requirement), and also provide reasonable coverage of the difficulty continuum. Since AT1 becomes the criterion for determining unidimensionality, a poor choice of items for AT1 makes DIMTEST results undetectably meaningless! In general, the analyst cannot know from DIMTEST results whether the choice is good or poor.

Rasch analysis is almost always more effective than factor analysis for identifying the dominant trait (Smith & Miao, 1994), and Rasch item maps certainly provide a useful guide to expert opinion.

4. Linear arithmetic is perpetrated on non-linear p-values during the selection of AT2. AT2 is selected after AT1 and must have the same number of items and the same item difficulty distribution as AT1, but contain the same dimensional structure as the PT items. This assumes that the raw score metric is linear enough that the AT1 and AT2 distributions can be compared with the usual summary statistics. AT2 must also be as noisy and multidimensional as the remaining items (of whatever quality they happen to be)?! Here too, a poor selection of items for AT2 makes DIMTEST results undetectably meaningless! [In DIMTEST 2, AT2 is simulated. The claim is "Research has shown this to result in a more powerful hypothesis-testing statistic" (www.assess.com). It would be a considerable accomplishment to construct a simulated AT2 that matches the noise and multidimensionality of PT.]

Since Rasch is more realistic about the metric, and better reports noise distribution, the best way to select AT2 is to use Rasch difficulties and fit mean-squares, not p-values and point-biserials or factor loadings, as the selection basis.
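The non-linearity of p-values raised in point 4 can be made concrete by comparing equal p-value gaps on the log-odds (logit) scale. This is a minimal sketch; the p-values are invented, and the log-odds transformation stands in for a full Rasch estimation of item difficulties.

```python
import math

def logit(p):
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Two hypothetical pairs of item p-values, each 0.10 apart.
middle_pair = (0.50, 0.60)
extreme_pair = (0.85, 0.95)

gap_middle = logit(middle_pair[1]) - logit(middle_pair[0])
gap_extreme = logit(extreme_pair[1]) - logit(extreme_pair[0])

print(f"middle pair:  {gap_middle:.3f} logits")
print(f"extreme pair: {gap_extreme:.3f} logits")
print(f"ratio: {gap_extreme / gap_middle:.2f}")
```

The same 0.10 difference in p-value corresponds to roughly three times the distance in logits near the extreme as near the middle, so means and standard deviations of p-values do not compare like with like when AT1 and AT2 are matched.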

5. Consistency of person performance (i.e., person fit) across items is required. The PT subtest contains all remaining items, exhibiting all types of response variance, including multidimensionality, when present. The persons are stratified by raw score on PT into subgroups. The raw score on items of intended heterogeneity is treated as a good enough (i.e., sufficient) statistic for subgrouping persons of similar ability on the trait. Since the subgroupings on PT are to be carried back to AT1 and AT2, it follows that these persons must perform at the same level on all three subtests. But this performance consistency is not verified. Thus, for example, response effects at the end of a speeded test would skew DIMTEST results depending on how the last few items are allocated to AT1, AT2 and PT. Finally, subgroups with fewer than 20 persons are dropped, so that DIMTEST can even drop complete, non-extreme response vectors.

A better approach is to Rasch-analyze the PT data set and drop inconsistent persons before stratifying by raw scores.
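The stratification step in point 5, including the loss of small subgroups, might look like the following. This is a sketch under stated assumptions, not the DIMTEST implementation: the function name stratify_by_pt_score, the score list, and the treatment of the 20-person threshold as a parameter are hypothetical.

```python
from collections import defaultdict

def stratify_by_pt_score(pt_scores, min_size=20):
    """Group person indices by PT raw score; drop subgroups smaller than min_size."""
    groups = defaultdict(list)
    for person, score in enumerate(pt_scores):
        groups[score].append(person)
    kept = {s: g for s, g in groups.items() if len(g) >= min_size}
    dropped = sorted(
        p for s, g in groups.items() if len(g) < min_size for p in g
    )
    return kept, dropped

# Hypothetical PT raw scores: two well-populated scores and one rare score.
pt_scores = [5] * 25 + [6] * 30 + [9] * 3

kept, dropped = stratify_by_pt_score(pt_scores)
print(sorted(kept), dropped)
```

The three persons with a PT score of 9 have complete, non-extreme response strings, yet they vanish from the analysis only because their raw score is infrequently observed.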

6. Essential unidimensionality is not strictly unidimensional. DIMTEST's model raw score variance for each raw score subgroup is the sum of the binomial variances of the p-values, p_i(1-p_i), regardless of the supposed form of the monotonic ICCs. Essential unidimensionality then requires that the unmodelled noise level across subgroups on AT1, the purportedly "unidimensional" item subset, be statistically the same as on AT2, the purportedly "typical" item subset. This is a much more relaxed requirement than that of local unidimensionality, which requires that all item covariances be statistically zero. Essential unidimensionality is thus in pretension more accommodating to multidimensional data than is the Rasch model specification.
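The variance comparison in point 6 can be illustrated for a single subgroup. The sketch below computes the modelled variance, the sum of p_i(1-p_i), and the observed variance of the subtest raw scores; their difference is twice the sum of the inter-item covariances. It does not compute Stout's T, and the function name and the four response strings are hypothetical.

```python
def subgroup_variances(rows):
    """Return (modelled, observed) raw score variance for one subgroup.

    rows: dichotomous response strings of the persons in the subgroup.
    Variances divide by n, so each item variance equals p_i * (1 - p_i).
    """
    n = len(rows)
    k = len(rows[0])
    p = [sum(r[j] for r in rows) / n for j in range(k)]
    model_var = sum(pj * (1 - pj) for pj in p)
    scores = [sum(r) for r in rows]
    mean = sum(scores) / n
    observed_var = sum((s - mean) ** 2 for s in scores) / n
    return model_var, observed_var

# Hypothetical subgroup: the first two items always agree, the third does not.
subgroup = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]

model_var, observed_var = subgroup_variances(subgroup)
print(model_var, observed_var, observed_var - model_var)
```

Here the excess of observed over modelled variance comes entirely from the covariance between the first two items. A requirement that this excess merely be similar on AT1 and AT2 tolerates such covariance, whereas a requirement of zero covariances does not.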

DIMTEST is easy-going on theory. It is based on a relaxed form of unidimensionality, defined in terms of a criterion subtest (AT1) of researcher-contrived "unidimensionality". DIMTEST deliberately overlooks bad items and misperforming persons. The resulting statistic has no clear meaning. But DIMTEST is demanding on data collection, rejecting person response strings containing missing data and also raw scores infrequently observed. A variable pronounced unidimensional by Rasch will always be essentially unidimensional by DIMTEST. A variable declared essentially unidimensional by DIMTEST may be far from unidimensional by Rasch criteria. Since essential unidimensionality is easier to obtain (and to manipulate by the arbitrary choice of items allowed in AT1 and AT2) than strict unidimensionality, expect Stout's T to become a test constructor's statistic of choice!

Nandakumar R., Stout W. (1993) Refinements of Stout's Procedure for Assessing Latent Trait Unidimensionality.

Smith R.M., Miao C.Y. (1994) Assessing unidimensionality for Rasch measurement. Chapter 18 in M. Wilson (ed.) Objective Measurement: Theory into Practice, Vol. 2. Norwood, NJ: Ablex.

DIMTEST diminuendo. Linacre JM. … Rasch Measurement Transactions, 1994, 8:3 p.384
