Is Content Validity Valid?

"The belief in reality is essentially the conviction that an entity transcends immediate sense data; or, to put the same point more plainly, it is the conviction that what is real but hidden has more content than what is given and obvious." (Bachelard, 1984, p. 31-32)

"It is in the determination of invariants that the mathematization of the real finds its true justification" (p. 36). "The whole problem of scientific knowledge of the real turns on the initial choice of mathematics. When one has fully comprehended... that experimentation is always dependent on some prior intellectual construct, then it is obvious why one should look to the abstract for proof of the coherence of the concrete. Empirical possibilities are in one-to-one correspondence with sets of axioms" (p. 41).

The conventional focus on content validity has misled us about what is important in educational measurement. As Bachelard puts it, the abstract has more content than the given and obvious. But content validity enforces conformity to that selfsame given and obvious.

Perhaps the issue of validity is a political one. Teachers often say that they want not to teach but to engage creatively in learning with their students. Testing practice, of course, subverts such idealism:

"the method of scaling an educational achievement test should not be permitted to determine the content of the test." (E.F. Lindquist, 1953, p.35)

"above all else, a criterion-referenced test must have content validity" (Hambleton & Novick, 1973, p. 168).

"no recourse to response-inferred concepts" can be allowed in assessing test validity. (Osburn, 1968, p. 101)

Education in practice is a rigid hierarchy of authority in which objectives determine curriculum and curriculum determines test content. Since students are allowed no voice in test content, educational achievement is measured via a one-sided monologue, if not harangue, in which the system, represented by the teacher, does all the talking. In effect, then, educational achievement occurs in a public way only to the extent that students conform to criteria that are politically determined outside of the classroom by people not directly involved in the teaching process.

Rasch measurement has the audacity to regard the students' responses (not the statements of the authority figures) as the empirical realization of the abstract. It includes the students as participants in a dialogue in which the object is to explore, elaborate, and, if possible, reveal the hidden and abstract construct to be measured. Of course, that dialogue can remove barriers to human contact, requiring sensitivity, attention, and judgment. Many in education may not want that. When statistical inconsistencies in the data are allowed to suggest that some test questions may be irrelevant to the conversation at hand, poorly phrased, ambiguous, or easily misunderstood, the result could be changes in the curriculum and in the overall objectives. But these changes may address issues that the educational system would rather ignore.

If we are truly concerned to investigate the real (but abstract) rather than the ephemeral (but concrete), then Lindquist must be turned on his head: The content of an educational achievement test must not be permitted to determine the method of scaling.

William P. Fisher, Jr.

Bachelard G. (1984) The New Scientific Spirit. Beacon Press.

Lindquist E. F. (1953) Selecting appropriate score scales for tests (Discussion). Proceedings of the 1952 Invitational Conference on Testing Problems. Princeton NJ: Educational Testing Service.

Hambleton R. K., Novick M. R. (1973) Toward an integration of theory and method for criterion-referenced tests. Journal of Educational Measurement, 10, 159-170.

Osburn H. G. (1968) Item sampling for achievement testing. Educational and Psychological Measurement, 28, 95-104.

Is Content Validity Valid? Fisher W. P. Jr. … Rasch Measurement Transactions, 1997, 11:1 p. 548.

