Lord F.M. & Novick M.R. (1968) Statistical Theories of Mental Test Scores. Reading, Mass: Addison-Wesley.
Lord Frederic M. (1952) A theory of test scores. Psychometric Monographs, No. 7.
Lord F.M. (1957) Do tests of the same length have the same standard error of measurement? Educational and Psychological Measurement, 17, 510-521.
Lord F.M. (1970) Some test theory for tailored testing. In W.H. Holtzman (Ed.), Computer Assisted Instruction, Testing, and Guidance. New York: Harper and Row.
Lord F.M. (1974) Estimation of latent ability and item parameters when there are omitted responses. PSYCHOMETRIKA 39, 247-264.
Lord F.M. (1980) Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.
Lord F.M. (1983) Maximum likelihood estimation of item response parameters when some responses are omitted. PSYCHOMETRIKA 48, 477-482.
Lord F.M. (1983) Small N justifies Rasch model. In D.J. Weiss (Ed.), New horizons in testing: Latent trait test theory and computerized adaptive testing (pp. 51-61) New York, NY: Academic Press, Inc.
Lord F.M. (1984) Maximum likelihood and bayesian parameter estimation in item response theory. Educational Document Reproduction Service ED250365.
Loyd B.H. & Hoover, H.D. (1980) Vertical equating using the Rasch model. Journal of Educational Measurement, 17, 179-193.
Luce R.D. & Tukey J.W. (1964) Simultaneous conjoint measurement. Journal of Mathematical Psychology, 1, 1-27.
Ludlow L.H. & Haley S.M. (1995, December) Rasch model logits: Interpretation, use, and transformation. Educational and Psychological Measurement, 55(6), 967-975.
Ludlow L.H., Haley S. (1996) Effect of context in rating of mobility activities in children with disabilities: an assessment using the pediatric evaluation of disability inventory. Educational and Psychological Measurement, 56, 122-129.
Luecht R.M. (1996) Multidimensional computerized adaptive testing in a certification or licensure context. Applied Psychological Measurement, 20, 389-404.
Lumley T. & McNamara T.F. (1995) Rater characteristics and rater bias: implications for training. Language Testing 12: 54-71.
Lumsden J. (1978) Tests are perfectly reliable. British Journal of Mathematical and Statistical Psychology 31:19-26.
Lunz M.E. & Stahl J.A. (1993, April) The effect of rater severity on person ability measures: A Rasch model analysis. American Journal of Occupational Therapy, 47(4), 311-317.
Lunz M.E. & Stahl, J.A. (1990) Judge consistency and severity across grading periods. Evaluation and the Health Professions, 13, 425-444.
Lunz M.E. & Stahl, J.A. (1993) The impact of examiners on candidate scores: An introduction to the use of multi-facet Rasch model analysis for oral examinations. Teaching and Learning in Medicine, 5, 3.
Lunz M.E. & Wright B.D. (1997) Latent Trait Models for Performance Examinations. In J. Rost & R. Langeheine (Eds.), Applications of latent trait and latent class models in the social sciences. Münster: Waxmann.
Lunz M.E. (2000) Setting standards on performance examinations. In M.R. Wilson & G. Engelhard Jr. (Eds), Objective measurement: Theory into practice (Vol. 5, pp. 181-199) Stamford, Connecticut: Ablex Publishing.
Lunz M.E., Stahl, J.A. & Wright, B.D. (1994) Interjudge reliability and decision reproducibility. Educational and Psychological Measurement, 54(4), 913-925.
Lunz M.E., Stahl, J.A. & Wright, B.D. (1996) The invariance of rater severity calibrations. In G. Engelhard, Jr. & M. Wilson (Eds.), Objective Measurement: Theory into Practice (Vol. 3, pp. 99-112) Norwood, NJ: Ablex.
Lunz M.E., Wright, B.D. & Linacre, J.M. (1990) Measuring the impact of judge severity on examination scores. Applied Measurement in Education, 3(4), 331-345.
Lynch B. & McNamara T.F. (1998) Using g-theory and many-facet Rasch measurement in the development of performance assessments of the ESL speaking skills of immigrants. Language Testing 15: 158-180.
Müller-Schneider Thomas. (1993) Different Scaling Models - Different Findings? A Comparison of the Models According to Rasch and Mokken As Well As the Classical Test Construction (German), Zeitschrift für Soziologie, 22, 371-384
MacKnight C, Rockwood K. (2000) Rasch analysis of the hierarchical assessment of balance and mobility (HABAM) Journal of Clinical Epidemiology 33:1242-1247.
Macmillan/McGraw-Hill. (1993). Reflecting Diversity: Multicultural Guidelines for Educational Publishing Professionals. New York, NY.
Malec J.F., Buffington AL, Moessner AM, Degiorgio L. (2000) A medical/vocational case coordination system for persons with brain injury: an evaluation of employment outcomes. Archives of Physical Medicine & Rehabilitation 81:1007-1013.
Malec J.F., Moessner AM, Kragness M, Lezak MD. (2000) Refining a measure of brain injury sequelae to predict post-acute rehabilitation outcome: rating scale analysis of the Mayo-Portland Adaptability Inventory. Journal of Head Trauma Rehabilitation 13:670-682.
Malec J.F. (2001) Impact of comprehensive day treatment on societal participation for persons with acquired brain injury. Archives of Physical Medicine & Rehabilitation 82 :885-893.
Mallinson T., Mahaffey, L. & Kielhofner, G. (1998) The occupational performance history interview: Evidence for three underlying constructs of occupational adaptation. Canadian Journal of Occupational Therapy, 65(4), 219-228.
Madsen H.S. (1987) Utilizing Rasch analysis to detect cheating on language examinations. ERIC Document Reproduction Service No. ED 287 284.
Mantel N. & Haenszel, W. (1959) Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719-748.
Maris E. (1995) Psychometric latent response models. Psychometrika, 60, 523-547.
Maris E., De Boeck, P., Van Mechelen, I. (1996) Probability matrix decomposition models. Psychometrika, 61, 7-29.
Marston D. (1989) A curriculum based measurement approach to assessing academic performance: What it is and why do it. In M.R. Shinn (Ed.), Curriculum-Based Measurement: Assessing special children (pp. 18-78) New York: Guilford Press.
Marston D.B. & Deno, S.L. (1981) The reliability of simple direct measures of written expression. (Research Report No. 50) Minneapolis: University of Minnesota Institute for Research on Learning Disabilities.
Martinez-Martin P, Grandas F, Linazasoro G, Bravo JL (1999) Conversion to controlled-release levodopa/carbidopa treatment and quality of life as measured by the Nottingham Health Profile. The STAR Study Group. Neurologia 14:338-343.
Masters G.N. et al. (1990) Profiles of Learning: The Basic Skills Testing Program in New South Wales, 1989. Camberwell, Victoria, Australia: ACER.
Masters G.N. & Evans J. (1986) Banking non-dichotomously scored items. Applied Psychological Measurement, 10(4), 355-367.
Masters G.N. & Wright B.D. (1984) The essential process in a family of measurement models. Psychometrika, 49(4) 529-544.
Masters G.N. & Wright B.D. (1997) The partial credit model. In W.J. van der Linden and R.K. Hambleton (Eds.) Handbook of modern item response theory (pp. 101-121) New York: Springer-Verlag.
Masters G.N. (1980) A Rasch model for rating scales. Dissertation Abstracts International, 41, 215A-216A.
Masters G.N. (1982) A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.
Masters G.N. (1984) Constructing an item bank using partial credit scoring. Journal of Educational Measurement, 21, 19-32.
Masters G.N. (1988) Measurement models for ordered response categories. In R. Langeheine & J. Rost (Eds.), Latent trait and latent class models (pp. 11-29) New York: Plenum press.
Masters G.N. (1988) Partial credit model. In J.P.Keeves, (Ed.) Educational research, methodology and measurement: an international handbook. (pp. 292-297) Elmsford N.Y.: Pergamon Press.
Masters G.N. (1988) The analysis of partial credit scoring. Applied Measurement in Education, 1(4) 279-297.
Masters G.N. (1995) Scaling and Aggregation in IEA Studies (Technical report) University of California, Berkeley: Technical Advisory Committee, International Association for the Evaluation of Educational Achievement (IEA)
Masters G.N., Adams R.J. & Lokan J. (1994) Mapping student achievement. International Journal of Educational Research, 21(6), 595-610.
Masters G.N. (1985) Common-Person Equating with the Rasch Model. Applied Psychological Measurement, 9 (1), 73-82.
Matthews M. (1990) Skill taxonomies and problems for the testing of reading. Reading in a foreign language. 7(1): 511-517.
Maraun M.D. & Rossi, N.T. (2001) The extra-factor phenomenon revisited: unidimensional unfolding as quadratic factor analysis. Applied Psychological Measurement, 25, 77-87.
McArthur D.L. (1981) Bias in the writing of prose and its appraisal (CSE Report No. CSE-RR-162) Los Angeles, CA: Center for the Study of Evaluation. (ERIC Document Reproduction Service No. 217073)
McArthur D.L., Cohen, M.J. & Shandler, S.L. (1991) Rasch analysis of functional assessment scales : An example using pain behaviors. Archives of Physical Medicine and Rehabilitation. 72 : 296-304.
McBride J.R., Martin J.T. (1983) Reliability and validity of adaptive ability tests in a military setting. In D.J. Weiss (Ed.), New horizons in testing. New York: Academic Press.
McColl M.A., Davies, D., Carlson, P., Johnston, J. & Minnes, P. (2001) The community integration measure: development and preliminary validation. Archives of Physical Medicine and Rehabilitation . 82(4): 429-34.
McCullagh P., Nelder, J.A. (1989) Generalized Linear Models (2nd Edition) New York: Chapman and Hall.
McDonald R.P. (1967) Nonlinear factor analysis. Psychometric monographs No.15
McDonald R.P. (1994) Testing for approximate dimensionality. In D. Laveault, B. Zumbo, M. Gessaroli & M. Boss (Eds.), Modern theories of measurement: Problems and issues (pp. 63-85) Ottawa, Canada, University of Ottawa Press.
McDonald R.P. (1997) Normal-ogive multidimensional model. In van der Linden W.J. & Hambleton R.K. Handbook of Modern Item Response Theory. New York: Springer.
McDonald R.P. (1999) Test theory: A unified treatment. Mahwah, NJ: Lawrence Erlbaum Associates.
McDonald R.P. (2000) Test Theory: a unified treatment. Lawrence Erlbaum. ISBN: 0-8058-3075-8.
McDowell I. & Newell C. (1996) Measuring health: A guide to rating scales and questionnaires. 2nd edition. Oxford: Oxford University Press.
McGraw-Hill. (1983). Guidelines for Bias-free Publishing. Monterey, CA.
McHorney C.A., Haley, S.M. & Ware, J.E. (1997) Evaluation of the MOS SF-36 Physical Functioning Scale (PF-10) Comparison of relative precision using Likert and Rasch scoring methods. Journal of Clinical Epidemiology. 50(4) : 451-461.
McLeod L.D. & Lewis, C. (1999) Detecting Item Memorization in the CAT Environment. Applied Psychological Measurement, 23, 147-160.
McLeod L.D., Swygert, K.A. & Thissen, D. (2001) Factor analysis for items scored in two categories. In D. Thissen and H. Wainer (Eds.), Test Scoring (pp. 189-216) Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
McNamara T.F. & Lumley T. (1997) The effect of interlocutor and assessment mode variables in overseas assessments of speaking skills in occupational settings. Language Testing 14: 140-156.
McNamara T.F. (1996) Measuring second language performance. New York: Addison Wesley Longman.
Meijer R.R. & Sijtsma, K. (2001) Methodology review: Evaluating person fit. Applied Psychological Measurement, 25, 107-135.
Meijer R.R. (1996) Person-fit research: An introduction. (Guest editor's introduction to the Special Issue: Person-fit research: Theory and applications.) Applied Measurement in Education, 9, 3-8.
Meijer R.R., Molenaar I.W. & Sijtsma K. (1994) Item, test, person and group characteristics and their influence on nonparametric appropriateness measurement. Applied Psychological Measurement, 18, 111-120.
Meijer R.R., Sijtsma, K. & Smid, N.G. (1990) Theoretical and empirical comparison of the Mokken and the Rasch approach to IRT. Applied Psychological Measurement, 14, 283-298.
Meisels S.J. (1992) Doing harm by doing good: Iatrogenic effects of early childhood. Early Childhood Research Quarterly, 7 (2), 155-174.
Mellenbergh G.J. (1995) Conceptual notes on models for discrete polytomous item responses. Applied Psychological Measurement, 19, 91-100.
Mellenbergh Gideon J. & Vijn, Pieter. (1981) The Rasch Model As a Loglinear Model, Applied Psychological Measurement, 5, 369-376.
Meredith W. & Horn, J. (2001) The role of factorial invariance in modeling growth and change. In Collins, L.M. & Sayer, A.G. (Eds.), New methods for the analysis of change. Washington, D.C.: American Psychological Association, pp. 203-XXX.
Meredith W. (1965) Some results based on a general stochastic model for mental tests. Psychometrika, 30, 419-440.
Messick S. (1989) Validity. In R.L. Linn (Ed.), Educational measurement (3rd ed.) New York: American Council on Education/ Macmillan.
Messick S. (1989) Validity. In R.L. Linn (Ed.), Educational measurement (3rd ed., pp. 13-103) New York: Macmillan.
Messick S. (1995) Validity of Psychological Assessment. American Psychologist, 50(9), 741-749.
Messick S. (1995) Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741-749.
Michalewicz Z. (1994) Genetic Algorithms + Data Structures = Evolution Programs. Berlin: Springer-Verlag.
Michell J. (1986) Measurement scales and statistics: A clash of paradigms. Psychological Bulletin, 100, 398-407.
Michell J. (1990) An Introduction to the Logic of Psychological Measurement. Hillsdale, New Jersey: Lawrence Erlbaum Associates.
Michell J. (1997) Quantitative science and the definition of measurement in psychology. British Journal of Psychology, 88, 355-383.
Michell J. (1999) Measurement in psychology: a critical history of a methodological concept. Cambridge, Cambridge University Press.
Miller T., Reckase, R., Spray, J., Luecht, R. & Davey, T. (1996) Multidimensional item response theory. Iowa City IA: ACT Publications.
Miller T.R. & Hirsch, T.M. (1992) Cluster analysis of angular data in applications of multidimensional item response theory. Applied Measurement in Education, 5, 193-211.
Mills C.N. & Stocking M.L. (1996) Practical issues in large-scale computerized adaptive testing. Applied Measurement in Education, 9, 287-304.
Mislevy R.J. & Bock R.D., (1996) BILOG computer program. Scientific Software International.
Mislevy R.J. & Bock, R.D. (1990) BILOG (Version 3.11) Mooresville, IN: Scientific Software, Inc.
Mislevy R.J. & Bock, R.D. (1991) BILOG users' guide. Chicago: Scientific Software.
Mislevy R.J. & Chang H.H. (2000) Does adaptive testing violate local independence? PSYCHOMETRIKA 65, 149-156.
Mislevy R.J. & Wu P.K. (1996) Missing responses and IRT ability estimation: omits, choice, time limits, and adaptive testing. ETS Research Report RR-96-30-ONR. Princeton NJ: Educational Testing Service.
Mislevy R.J. (1985) Estimation of latent group effects. Journal of the American Statistical Association, 80, 993-997.
Mislevy R.J. (1996) Test theory reconceived. Journal of Educational Measurement, 33, 379-416.
Mislevy R.J., Sheehan, K.M. (1989) The role of collateral information about examinees in item parameter estimation. Psychometrika, 54, 661-679.
Mislevy Robert J. (1988) Exploiting Auxiliary Information About Items in the Estimation of Rasch Item Difficulty Parameters, Applied Psychological Measurement, 12, 281-296.
Mislevy, R.J. & Stocking, M.L. (1989) A consumer's guide to LOGIST and BILOG. Applied Psychological Measurement, 13, 57-75.
Mislevy, R.J., Beaton, A.E. & Kaplan, B. (1992) Estimating population characteristics from sparse matrix samples of item responses. Journal of Educational Measurement, 29 (2), 133-161.
Moeller Svend Kreiner. (1976) The Rasch-Weibull Process, Scandinavian Journal of Statistics, 3, 107-115.
Mokken R.J. & Lewis C. (1982) A nonparametric approach to the analysis of dichotomous item responses. Applied Psychological Measurement, 6, 417-430.
Mokken R.J. (1971) A theory and procedure of scale analysis. The Hague: Mouton.
Mokken R.J. (1997) Nonparametric models for dichotomous items. In W.J. Van der Linden, R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 351-368) New York: Springer Verlag.
Molenaar I.W. & Hoijtink H. (1990) The many null distributions of person fit indices. Psychometrika, 55, 75-106.
Molenaar I.W. & Hoijtink H. (1996) Person fit and the Rasch model, with an application of knowledge of logical quantors. Applied Measurement in Education, 9, 27-45.
Molenaar I.W. (1991) A weighted Loevinger H-coefficient extending Mokken scaling to multicategory items. Kwantitatieve Methoden, 37, 97-117.
Molenaar I.W. (1997) Nonparametric methods for polytomous responses. In W.J. Van der Linden, R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 369-380) New York: Springer Verlag.
Molenaar I.W., Sijtsma, K. (1999) MSP for Windows. Groningen: iecProGAMMA.
Molenaar I.W., Stout, W.F. (January 5, 2000) Personal communication.
Molenaar Ivo W. (1983) Some Improved Diagnostics for Failure of the Rasch Model, Psychometrika, 48, 49-72.
Molenaar Ivo W. (1992) Statistical Models for Educational Testing and Attitude Measurement, Statistical Modelling. Papers from the Sixth International Workshop on Statistical Modelling, Elsevier/North-Holland (New York; Amsterdam), 249-262.
Morales L., Reise S. & Hays R.D. (2000) Evaluating the equivalence of health care ratings by whites and Hispanics. Medical Care, 38, 517-527.
Morrow D. & Goertzen, S. (1986) A commentary on gender differences. Manitoba Department of Education, Winnipeg. Planning and Research Branch. (ERIC Document Reproduction Service No. ED 301 469)
Mueller Hans. (1987) A Rasch Model for Continuous Ratings, Psychometrika, 52, 165-181.
Munger G.E. & Loyd, B.H. (1991) Effect of speededness on test performance of handicapped and non-handicapped examinees. Journal of Educational Research, 85(1), 53-57.
Muraki E. & Bock, R.D. (1997) PARSCALE: IRT item analysis and test scoring for rating-scale data. Chicago: Scientific Software International.
Muraki E. (1990) Fitting a polytomous item response model to Likert-type data. Applied Psychological Measurement, 14, 59-71.
Muraki E. (1992) A generalized partial credit model: application of an EM algorithm. Applied Psychological Measurement 16, 159-176.
Muraki E. (1993) Information functions of the generalized partial credit model. Applied Psychological Measurement, 17, 351-363.
Muraki E., Carlson, J.E. (1995) Full-information Factor Analysis for Polytomous Item Responses. Applied Psychological Measurement, 19, 73-90.
Murray B. (1998, August) The latest techno tool: essay-grading computers. APA Monitor, p.43.
Myford C.M. & Mislevy, R.J. (1995) Monitoring and improving a portfolio assessment system (ETS Center for Performance Assessment Report No. MS 94-05) Princeton, NJ: Educational Testing Service.
Myford C.M. & Wolfe, E.W. (2001) Detecting and measuring rater effects using many-facet Rasch measurement: An instructional module. Manuscript submitted for publication.
Myford C.M., Marr, D.B. & Linacre, J.M. (1996) Reader calibration and its potential role in equating for the Test of Written English (ETS Center for Performance Assessment Report No. MS 95-02) Princeton, NJ: Educational Testing Service.
Narahara M. (1998) Kindergarten entrance age and academic achievement. Information Analyses. (ERIC Document Reproduction Service No. ED 421 218)
Narahara M. (1998) The effects of school entry age and gender on reading and math achievement scores of second grade students. Reports - Research. (ERIC Document Reproduction Service No. ED 421 233)
Nedelsky L. (1954) Absolute grading standards for objective tests. Educational and Psychological Measurement, 14, 3-19.
Nering M.L. (1995) The distribution of person fit using true and estimated person parameter. Applied Psychological Measurement, 19, 121-129.
Nering M.L. (1997) The distribution of indexes of person fit within the computerized adaptive testing environment. Applied Psychological Measurement, 21, 115-127.
Neyman J. & Scott E.L. (1948) Consistent estimates based on partially consistent observations. Econometrica, 16, 1-32.
Nicholls J.G. (1989) The competitive ethos and democratic education. Cambridge, Mass.: Harvard University Press.
Nichols P., Sugrue, B. (1999) The lack of fidelity between cognitively complex constructs and conventional test development practice. Educational Measurement: Issues and Practice, 18, 18-29.
Nichols S.F. Chipman, R.L. Brennan (Eds.), Cognitively diagnostic assessment (pp. 103-125) Hillsdale, NJ: Lawrence Erlbaum Associates.
Noel Y. (1999) Recovering unimodal latent patterns of change by unfolding analysis: Applications to smoking cessation. Psychological Methods, 4, 173-191.
Nordenskiold U. (1997) Daily activities in women with rheumatoid arthritis. Aspects of patient education, assistive devices and methods for disability and impairment assessment. Scandinavian Journal of Rehabilitation Medicine. Supplement. 37 : 1-72.
Nordenskiold U., Grimby, G., Hedberg, F.M., Wright, B. & Linacre, J.M. (1996) The structure of an instrument for assessing the effects of assistive devices and altered working methods in women with rheumatoid arthritis. Arthritis Care and Research. 9(5) :358-367.
Norris J.M. & Ortega L. (2000) Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. To appear in Language Learning, 50(3).
Norusis M. (1990) SPSS Introductory Statistics Student Guide, Chicago: SPSS Inc.
Nunnally J.C. & Bernstein I. (1994) Psychometric Theory. 3rd Edition. McGraw-Hill. ISBN: 0-07-047849-X.
Nunnally J.C. (1979) Psychometric Theory. 2nd Edition. McGraw-Hill. ISBN: 0-07-047465-6.
Nunnally J.C., Lemond, L.C. & Wilson, W.H. (1977) Studies of voluntary visual attention: Theory, methods, and psychometric issues. Applied Psychological Measurement, 1(2), 203-218.
Nunnally J.C. & Koplin J.H. (1967) The effects of word-relatedness on learning. Educational Document Reproduction Service No. ED 016214.
O'Brien M.L. (1992) Using Rasch procedures to understand psychometric structure in measures of personality. In M. Wilson (Ed.) Objective measurement: theory into practice. (pp. 61-76) Norwood, NJ: Ablex Publishing Corporation.
O'Connell M.A., Belanger B.A., Haaland P.D. (1993) Calibration and assay development using the four-parameter logistic model. Chemometrics and Intelligent Laboratory Systems, 20, 97-114.
O'Neill T.R. & Lunz, M.E. (2000) A method to study rater severity across several administrations. In M. Wilson & G. Engelhard, Jr. (Eds.), Objective Measurement: Theory into Practice (Vol. 5, pp. 135-146) Stamford, CT: Ablex.
Orlando M. & Thissen, D. (2000) Likelihood-based item-fit indices for dichotomous item response theory models. Applied Psychological Measurement, 24, 50-64.
Orlando M., Sherbourne, C.D. & Thissen, D. (2000) Summed-score linking using item response theory: Application to depression measurement. Psychological Assessment, 12(3), 354-359.
Owen R.J. (1975) A Bayesian sequential procedure for quantal response in the context of adaptive mental testing. Journal of the American Statistical Association, 70, 351-356.
Owens A.M. (2001, March 1) Boys should start kindergarten a year later than girls, report advises. National Post Online. Retrieved March 16, 2001 from the World Wide Web:
Page E.B. & Petersen, N.S. (1995) The computer moves into essay grading: Updating the ancient test. Phi Delta Kappan, 76, 561-565.
Page E.B. (1966) The imminence of grading essays by computer. Phi Delta Kappan, 48, 238-243.
Page E.B. (1968) Analyzing student essays by computer. International Review of Education, 14, 210-225.
Page E.S. (1954) Continuous inspection schemes. Biometrika, 41, 100-115.
Palmer D., Kays, M., Smith, A. & Doig, B. (1994) Stop! Look and Lesson. Camberwell: The Australian Council for Educational Research.
Pastor D.A., Dodd, B.G. & Chang, H.H. (2002) A comparison of item selection techniques and exposure control mechanisms in CATs using the generalized partial credit model. Applied Psychological Measurement, 26 (2), 147-163.
Patz R.J., Junker, B.W. (1999a) A straightforward approach to Markov Chain Monte Carlo methods for item response models. Journal of Educational and Behavioral Statistics, 24, 146-178.
Patz R.J., Junker, B.W. (1999b) Applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses. Journal of Educational and Behavioral Statistics, 24, xxx-xxx.
Patz R.J., Junker, B.W., Lerch, F.J., Huguenard, B.R. (1996) Analyzing small psychological experiments with item response models (CMU Statistics Department technical report #644) [Online]. Available: Accessed 28 April 2000.
Perline R., Wright, B.D. & Wainer, H. (1979) The Rasch model as additive conjoint measurement. Applied Psychological Measurement, 3, 237-255.
Petersen N.S., Kolen, M.J. & Hoover, H.D. (1989) Scaling, norming, and equating. In R.L. Linn (Ed.), Educational Measurement (3rd ed., pp. 221-262) New York: Macmillan.
Peterson S. & Bainbridge, J. (1999) Teachers' gendered expectations and their evaluation of student writing. Reading Research and Instruction, 38(3), 255-271.
Pfanzagl J. (1993) On the Consistency of Conditional Maximum Likelihood Estimators, Annals of the Institute of Statistical Mathematics, 45, 703-719.
Pfanzagl J. (1994) On item parameter estimation in certain latent trait models. In G.H. Fischer & D. Laming (Eds.) Contributions to Mathematical Psychology, Psychometrics and Methodology. New York: Springer Verlag.
Phillips A., Holland P.W. (1986) A new estimator of the variance of the Mantel-Haenszel Log-Odds-Ratio Estimator. Technical report no. 86-67. Princeton NJ: Educational Testing Service.
Phillips Gary W. & Gedeik, Sandra S. (1984) RKAPPA: Reliability of Mastery Tests: An Application of the Rasch Model, Applied Psychological Measurement, 8, 286-286.
Pimentel F.L., Maia-Gonçalves, J.P., Mesquita, N.F., Mateus, P., Alvarez, P., Roman, P., and Melon, J. (1998) Influence of patient clinical characteristics in quality of life measured by Rasch model in cancer patients: A Portuguese experience. Quality of Life Research Vol. 7, 649.
Post W.J. (1992) Nonparametric unfolding models. A latent structure approach. Leiden: DSWO Press, Leiden University, The Netherlands.
Post W.J., Snijders, T.A.B. (1993) Nonparametric unfolding models for dichotomous data. Methodika, 7, 130-156.
Powers D.E., Fowles, M.E. & Welsh, C.K. (1999) Further validation of a writing assessment for graduate admissions. (GRE Board Research Report No. 96-13R and ETS Research Report 99-18) Princeton, NJ: Educational Testing Service.
Priestley H.A. (1997) Introduction to Integration. Clarendon Press, Oxford.
Prieto L., Alonso J., Lamarca R., Wright B.D. (1998) Rasch measurement for reducing the items of the Nottingham Health Profile. Journal of Outcome Measurement, 2(4):285-301.
Prieto L., Alonso, J., Ferrer, M. & Anto, J.M. (1997) Are results of the SF-36 Health Survey and the Nottingham Health Profile similar?: A comparison in COPD patients. Journal of Clinical Epidemiology, 50(4), 463-473.
Pula J.J. & Huot, B.A. (1993) A model of background influences on holistic raters. In M.M. Williamson & B.A. Huot (Eds.), Validating Holistic Scoring for Writing Assessment: Theoretical and Empirical Foundations (pp.237-265) Cresskill, NJ: Hampton Press.
Raczek AE, Ware JE, Bjorner JB, et al. (1998) Comparison of Rasch and summated rating scales constructed from SF-36 physical functioning items in seven countries: results from the IQOLA Project. International Quality of Life Assessment. Journal of Clinical Epidemiology 51:1203-1214.
Raju N.S., van der Linden, W.J. & Fleer, P.F. (1995) IRT-based internal measures of differential functioning of items and tests. Applied Psychological Measurement, 19, 353-368.
Ramsay J.O. (1989) A Comparison of Three Simple Test Theory Models, Psychometrika, 54, 487-499.
Ramsay J.O. (1991) Kernel smoothing approaches to nonparametric item characteristic curve estimation. Psychometrika, 56, 611-630.
Ramsay J.O. (1995) A similarity-based smoothing approach to nondimensional item analysis. Psychometrika, 60, 323-339.
Ramsay J.O. (1996) A geometrical approach to item response theory. Behaviormetrika, 23, 3-17.
Rasch G. (1960, 1980, 1992) Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research. (Reprinted by the University of Chicago Press, 1980)
Rasch G. (1961) On general laws and the meaning of measurement in psychology. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 4, 321-333.
Rasch G. (1966) An individualistic approach to item analysis. In P.F. Lazarsfeld & N.W. Henry (Eds.), Readings in mathematical social science (pp. 89-107) Chicago, IL: Science Research Associates, Inc.
Rasch G. (1977) On specific objectivity: An attempt at formalizing the request for generality and validity of scientific statements. Danish Yearbook of Philosophy, 14, 58-94.
Raymond M.R. & Viswesvaran, C. (1993) Least-squares models to correct for rater effects in performance assessment. Journal of Educational Measurement, 30(3), 253-268.
Raymond M.R. (1986) Missing data in evaluation research. Evaluation and the Health Professions, 9, 395-420.
Raymond M.R., Webb, L.C. & Houston, W.M. (1991) Correcting performance-rating errors in oral examinations. Evaluation and the Health Professions, 14(1), 100-122.
Reckase M.D. & McKinley, R.L. (1991) The discriminating power of items that measure more than one dimension. Applied Psychological Measurement, 15, 361-373.
Reckase M.D. (1974) An interactive computer program for tailored testing based on the one-parameter logistic model. Behavior Research Methods and Instrumentation, 6(2), 208-212.
Reckase M.D. (1979) Unifactor latent trait models applied to multi-factor tests: Results and implications. Journal of Educational Statistics, 4, 207-230.
Reckase M.D. (1985) The difficulty of test items that measure more than one ability. Applied Psychological Measurement, 9, 401-412.
Reckase M.D. (1997) A linear logistic multidimensional model for dichotomous item response data. In van der Linden W.J. & Hambleton R.K. Handbook of Modern Item Response Theory. New York: Springer.
Reise S.P. (1995) Scoring method and the detection of person misfit in a personality assessment context. Applied Psychological Measurement, 19, 213-229.
Reise S.P. & Due A.M. (1991) The influence of test characteristics on the detection of aberrant response patterns. Applied Psychological Measurement, 15, 217-226.
Reise S.P. & Due, A.M. (1991) Test characteristics and their influence on the detection of aberrant response patterns. Applied Psychological Measurement, 15, 217-226.
Reise S.P. & Waller N.G. (1993) Traitedness and the assessment of response pattern scalability. Journal of Personality and Social Psychology, 65, 143-151.
Resnick L.B., Resnick, D.P. (1992) Assessing the thinking curriculum: new tools for educational reform. In B.R. Gifford, M.C. O'Connor (Eds.), Changing assessments: alternative views of aptitude, achievement, and instruction (pp. 37-75) Norwell, MA: Kluwer Academic Publishers.
Revicki D.A. & Cella D.F. (1997, Aug) Health status assessment for the twenty-first century item response theory item banking and computer adaptive testing. Quality of Life Research, 6(6), 595-600.
Revuelta J. & Ponsoda, V. (1998) A comparison of item exposure control methods in computerized adaptive testing. Journal of Educational Measurement, 35, 311-327.
Richardson J. (1994) Cost utility analysis: What should be measured? Social Science & Medicine, 39(1), 7-21.
Rigdon S.E., Tsutakawa, R.K. (1983) Parameter estimation in latent trait models. Psychometrika, 48, 567-574.
Rigdon Steven E. & Tsutakawa, Robert K. (1983) Parameter Estimation in Latent Trait Models, Psychometrika, 48, 567-574.
Rigdon Steven E. & Tsutakawa, Robert K. (1987) Estimation for the Rasch Model When Both Ability and Difficulty Parameters Are Random, Journal of Educational Statistics, 12, 76-86.
Roberts J.S. & Laughlin, J.E. (1996) A unidimensional item response model for unfolding responses from a graded disagree-agree response scale. Applied Psychological Measurement, 20, 231-255.
Roberts J.S. (2001b) GGUM2000: Estimation of parameters in the generalized graded unfolding model. Applied Psychological Measurement, 25, 38.
Roberts J.S., Donoghue, J.R. & Laughlin, J.E. (2000) A general item response theory model for unfolding unidimensional polytomous responses. Applied Psychological Measurement, 24, 3-32.
Roberts J.S., Donoghue, J.R. & Laughlin, J.E. (2002) Characteristics of MMLE/EAP parameter estimates in the generalized graded unfolding model. Applied Psychological Measurement, 26, 192-207.
Roberts J.S., Laughlin, J.E. & Wedell, D.H. (1999) Validity issues in the Likert and Thurstone approaches to attitude measurement. Educational and Psychological Measurement, 59, 211-233.
Roberts J.S., Lin, Y. & Laughlin, J.E. (2001) Computerized adaptive testing with the generalized graded unfolding model. Applied Psychological Measurement, 25, 177-196.
Robertson T., Wright, F.T., Dykstra, R.L. (1988) Order restricted statistical inference. New York: Wiley.
Rojas A.J. (1998) Aplicacion del Modelo de Credito Parcial y Modelo de Escalas de Clasificacion a la medicion de actitudes. Almeria: Servicio de Publicaciones de la Universidad de Almeria. [Edition CD-ROM].
Rosenbaum P.R. (1984) Testing the conditional independence and monotonicity assumptions of item response theory. Psychometrika, 49, 425-435.
Rosenbaum P.R. (1985) Comparing distributions of item responses for two groups. British Journal of Mathematical and Statistical Psychology, 38, 206-215.
Rosenbaum P.R. (1987a) Probability inequalities for latent scales. British Journal of Mathematical and Statistical Psychology, 40, 157-168.
Rosenbaum P.R. (1987b) Comparing item characteristic curves. Psychometrika, 52, 217-233.
Roskam Edward E. & Jansen, Paul G.W. (1989) Conditions for Rasch-dichotomizability of the Unidimensional Polytomous Rasch Model, Psychometrika, 54, 317-332.
Ross J. & Cliff, N. (1964) A generalization of the interpoint distance model. Psychometrika, 29, 167-176.
Ross S. (1992) Accommodative questions in oral proficiency interviews. Language Testing 9: 173-176.
Rost Jürgen. (1985) A Latent Class Model for Rating Data, Psychometrika, 50, 37-49.
Rost Jürgen. (1989) Rasch Models and Latent Class Models for Measuring Change With Ordinal Variables, Multiway Data Analysis, North-Holland/Elsevier (Amsterdam; New York), 473-483
Rost Jürgen. (1990) Rasch Models in Latent Classes: An Integration of Two Approaches to Item Analysis, Applied Psychological Measurement, 14, 271-282
Rost J. (1996) Testtheorie, Testkonstruktion. Göttingen: Verlag Hans Huber.
Roth E.J., Heinemann, A.W., Lovell, L.L., Harvey, R.L., McGuire, J.R. & Diaz, S. (1998) Impairment and disability: Their relation during stroke rehabilitation. Archives of Physical Medicine and Rehabilitation, 79, 329-335.
Roussos L.A., Stout, W. & Marden, J. (1998) Using new proximity measures with hierarchical cluster analysis to detect multidimensionality. Journal of Educational Measurement, 35, 1-30.
Rubin D.B. (1976) Inference and missing data. Biometrika 63, 581-92.
Rubin D.B. (1987) Multiple Imputation for Nonresponse in Surveys. New York: Wiley.
Rudner L.M. (1992) Reducing errors due to the use of judges. ERIC/TM Digest. (Report EDO-TM-92-10) Washington, DC: American Institutes for Research. (ERIC Document Reproduction Service No. ED355254)
Rudner Lawrence M. (1998) An On-line, Interactive, Computer Adaptive Testing Mini-Tutorial,
Ruiz R., Ortiz, R. & Alvarez, P. (2001) Dry bean cultivar characterisation by isoelectric focusing electrophoresis in polyacrylamide gel. Journal of the Science of Food and Agriculture 81, 1126-1131.
Ryan J.T., Williams, J.S. & Doig, B.A. (1998) National tests: Educating teachers about their children's mathematical thinking. In A. Olivier & K. Newstead (Eds) Proceedings of the Twenty-second Conference of the International Group for the Psychology of Mathematics Education. (Vol. IV, pp. 81-88) Stellenbosch, South Africa: University of Stellenbosch.
Saaty T.L. & Vargas, L.G. (1984) Comparison of eigenvalue, logarithmic least squares and least squares methods in estimating ratios. Mathematical Modeling, 5, 309-324.
Saaty T.L. (1990) Eigenvector and logarithmic least squares. European Journal of Operational Research. 48, 156-160.
Saaty T.L. (1996) Multicriteria decision making: The analytic hierarchy process. Pittsburgh, PA: RWS Publications.
Samejima F. (1969) Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, 34(4), 100-114.
Samejima F. (1969) Estimation of latent trait ability using a response pattern of graded scores. Psychometrika, Monograph Supplement No. 17.
Samejima F. (1972) A general model for free-response data. Psychometrika, Monograph Supplement No. 18.
Samejima F. (1995) Acceleration model in the heterogeneous case of the general graded response model. Psychometrika, 60, 549-572.
Samejima F. (1997) Departure from normal assumptions: a promise for future psychometrics with substantive mathematical modeling. Psychometrika, 62, 471-493.
Sands W.A., Waters, B.K. & McBride, J.R. (1997) Computerized Adaptive Testing: From Inquiry to Operation. Washington, DC: American Psychological Association.
Sawilowsky S.S. (2000) Psychometrics versus datametrics: Comment on Vacha-Haase's "reliability generalization" method and some EPM editorial policies. Educational and Psychological Measurement, 60, 157-173.
Schafer J.L. (1997) Analysis of incomplete multivariate data. New York: Chapman and Hall.
Schaubroeck J. & Green, S.G. (1989) Confirmatory factor analytic procedures for assessing change during organizational entry. Journal of Applied Psychology, 74, 892-900.
Scheiblechner H. (1995) Isotonic ordinal probabilistic models (ISOP) Psychometrika, 60, 281-304.
Schmitt N., Cortina J.M. & Whitney D.J. (1993) Appropriateness fit and criterion-related validity. Applied Psychological Measurement, 17, 143-150.
Schoonman W. (1989) An applied study on computerized adaptive testing. Rockland MA: Swets & Zeitlinger.
Schumacker R.E. & Lomax, R.G. (1996) A beginner's guide to structural equation modeling. Mahwah, NJ: Lawrence Erlbaum Associates.
Schwartz A.E. (1998, April 26) Graded by machine. Washington Post.
Scott J. (1999, January 31) Looking for the tidy mind, alas. The New York Times.
Segal M.E., Heinemann, A.W., Schall, R.R. & Wright, B.D. (1997) Rasch analysis of a brief physical ability scale for long-term outcomes of stroke. Physical Medicine and Rehabilitation: State of the Art Reviews, 11(2), 385-396.
Segall D.O. (1996) Multidimensional adaptive testing. Psychometrika, 61, 331-354.
Segall D.O. (2000) Principles of Multidimensional Adaptive Testing. In W.J. van der Linden & C.A.W. Glas (Eds.), Computerized Adaptive Testing: Theory and practice, 53-57. Dordrecht, The Netherlands: Kluwer Academic Publishers.
Shealy R. & Stout, W. (1993) A model-based standardization approach that separates true bias / DIF from group ability differences and detects test bias / DTF as well as item bias / DIF. Psychometrika, 58, 159-194.
Shepard L.A. (1984) Setting performance standards. In R.A. Berk (Ed.) A Guide to criterion-referenced test construction. Baltimore: Johns Hopkins Press.
Shohamy E. (1983) The stability of oral proficiency assessment on the oral interview testing procedures. Language Learning 33: 527-540.
Shohamy E. (1994) The validity of direct versus semi-direct oral tests. Language Testing 11: 99-123.
Shohamy E., Gordon, C.M. & Kraemer, R. (1992) The effect of raters' background and training on the reliability of direct writing tests. The Modern Language Journal, 76, 27-33.
Shohamy E., Reves E. & Bejarano Y. (1986) Introducing a new comprehensive test of oral proficiency. ELT Journal 40: 212-220.
Shute V.J., Psotka, J. (1996) Intelligent tutoring systems: Past, Present and Future. In D. Jonassen (Ed.),
Siegmund D. (1985) Sequential Analysis: Tests and Confidence Intervals. Springer-Verlag, New York.
Sijtsma K. & Hemker B.T. (1998) Nonparametric polytomous IRT models for invariant item ordering, with results for parametric models. Psychometrika, 63, 183-200.
Sijtsma K. & Hemker B.T. (2000) A taxonomy for ordering persons and items using simple sum scores. Journal of Educational and Behavioral Statistics, 25, 391-415.
Sijtsma K. & Junker B.W. (1996) A survey of theory and methods of invariant item ordering. British Journal of Mathematical and Statistical Psychology, 49, 79-105.
Sijtsma K. & Van der Ark L.A. (2001) Progress in NIRT analysis of polytomous item scores: Dilemmas and practical solutions. In A. Boomsma, M.A.J. van Duijn & T.A.B. Snijders (Eds.), Essays on item response theory (pp. 297-318) New York: Springer.
Sijtsma K. & Verweij A.C. (1999) Knowledge of solution strategies and IRT modeling of items for transitive reasoning. Applied Psychological Measurement, 23, 55-68.
Sijtsma K. (1998) Methodology review: Nonparametric IRT approaches to the analysis of dichotomous item scores. Applied Psychological Measurement, 22, 3-31.
Sijtsma K. (1998) Methodology review: Nonparametric IRT approaches to the analysis of dichotomous item scores. Applied Psychological Measurement, 22, 3-32.
Sijtsma K., Junker, B.W. (1997) Invariant item ordering of transitive reasoning tasks. In J. Rost, R. Langeheine (Eds.), Applications of latent trait and latent class models in the social sciences (pp. 97-107) Munster: Waxmann Verlag.
Sijtsma K., Van der Ark, L.A. (this volume) Progress in IRT analysis of polytomous item scores: dilemmas and practical solutions. In A. Boomsma, T. Snijders, M. Van Duijn (Eds.), Essays in Item Response Modeling (pp. xxx-xxx) New York: Springer-Verlag.
Silverstein B., Fisher, W.P., Kilgore, K.M., Harley, J.P. & Harvey, R.F. (1992) Applying psychometric criteria to functional assessment in medical rehabilitation: II. Defining interval measures. Archives of Physical Medicine and Rehabilitation, 73, 507-518.
Smedts Diana M.P. (1987) The Rasch Model: Towards An Alternative Process of Item Selection (German), Tijdschrift Voor Onderwijs Research, 12, 355-364
Smith R.M. & Kramer, G.A. (1992) A comparison of two methods of test equating in the Rasch model. Educational and Psychological Measurement, 52, 835-847.
Smith R.M. (1997) Outcome measurement. First international outcome measurement conference, co-sponsored by Rehabilitation Foundation, Inc., and the MESA Psychometric Laboratory at the University of Chicago. Physical Medicine and Rehabilitation: State of the Art Reviews. June; 11(2): ix-x, 261-424.
Smith R.M. (1997) The relationship between goals and functional status in the Patient Evaluation and Conference System. Physical Medicine and Rehabilitation: State of the Art Reviews. June; 11(2): 333-343.
Smith Richard M. (1985) A Comparison of Rasch Person Analysis and Robust Estimators, Educational and Psychological Measurement, 45, 433-444
Smith Richard M. (1988) The Distributional Properties of Rasch Standardized Residuals, Educational and Psychological Measurement, 48, 657-667
Smith Richard M. (1994) A Comparison of the Power of Rasch Total and Between-item Fit Statistics to Detect Measurement Disturbances, Educational and Psychological Measurement, 54, 42-55
Snijders T. (2000) Asymptotic distribution of person-fit statistics with estimated person parameter. Psychometrika.
Snijders T.A.B. (1997) Estimation and prediction for stochastic blockmodels for graphs with latent block structure. Journal of Classification, 14, 75-100.
Snijders T.A.B. (this volume) Two-level non-parametric scaling for dichotomous data. In A. Boomsma, T. Snijders, M. Van Duijn (Eds.), Essays in Item Response Modeling (pp. xxx-xxx) New York: Springer-Verlag.
Spearman C. (1904) "General intelligence" objectively determined and measured. Amer. J. Psychol., 15, 201-293.
Spector P.E. (1985) Measurement of human service staff satisfaction: Development of the job satisfaction survey. American Journal of Community Psychology, 13(6), 693-713.
Spiel C. & Gluck, J. (1998) Item response models for assessing change in dichotomous items. International Journal of Behavioral Development, 22, 517-536.
Stankov L., Cregan A. (1993) Quantitative and Qualitative properties of an intelligence test: series completion. Learning and Individual Differences, 5, 2, 137-169.
Stansfield C. & Kenyon D. (1992) Research on the comparability of the oral proficiency interview and the simulated oral proficiency interview. System 20: 347-362.
Stegelmann W. (1983) Expanding the Rasch model to a general model having more than one dimension. Psychometrika, 48, 259-267.
Stegelmann Werner. (1983) Expanding the Rasch Model to a General Model Having More Than One Dimension, Psychometrika, 48, 259-267
Stenson Herbert H. (1986) TESTAT: Test Analysis for the PC and VAX, Psychometrika, 51, 615-616.
Stevens S.S. (1939) On the problem of scales for the measurement of psychological magnitudes. J. Unified Sci., 9, 94-99.
Stocking M.L. & Lewis C. (1998) Controlling item exposure conditional on ability in computerized adaptive testing. Journal of Educational and Behavioral Statistics, 23, 57-75.
Stocking M.L. & Lord, F.M. (1983) Developing a common metric in item response theory. Applied Psychological Measurement, 7, 201-210.
Stocking M.L. & Swanson L. (1998) Optimal design of item banks for computerized adaptive tests. Applied Psychological Measurement, 22, 271-279.
Stocking, M.L. & Lord, F.M. (1983) Developing a common metric in item response theory. Applied Psychological Measurement, 7 (2), 201-210.
Stone G.E. A standard vision. Popular Measurement: journal of the Institute for Objective Measurement: 3,40-41.
Stone M. & Wright, B.D. (1988) Separation statistics in Rasch measurement (Research Memorandum No. 51) Chicago: MESA Press.
Stout W. & Roussos, L. (1996) SIBTEST manual. Statistical Laboratory for Educational and Psychological Measurement. University of Illinois at Urbana-Champaign.
Stout W. (1987) A non-parametric approach for assessing latent trait unidimensionality. Psychometrika, 52, 589-617.
Stout W. (1987) A nonparametric approach for assessing latent trait unidimensionality. Psychometrika, 52, 589-617.
Stout W. (1990) A new item response theory modeling approach with applications to unidimensionality assessment and ability estimation. Psychometrika, 55(2), 293-325.
Stout W. (1990) DETECT and DIMTEST manual. Statistical Laboratory for Educational and Psychological Measurement. University of Illinois at Urbana-Champaign.
Stout W.F. (1987) A nonparametric approach for assessing latent trait unidimensionality. Psychometrika, 52, 589-617.
Stout W.F. (1990) A new item response theory modeling approach with applications to unidimensionality assessment and ability estimation. Psychometrika, 55, 293-325.
Stout W.F., Habing, B., Douglas, J., Kim, H.R., Roussos, L., Zhang, J. (1996) Conditional covariance-based nonparametric multidimensionality assessment. Applied Psychological Measurement, 20, 331-354.
Streiner D.L. & Norman G.R. (1995) Health Measurement Scales: A Practical Guide to their Development and Use, 2nd edition. New York: Oxford University Press.
Stucki G., Daltroy, L., Katz, J.N., Johannesson, M. & Liang, M.H. (1996) Interpretation of change scores in ordinal clinical scales and health status measures: The whole may not equal the sum of the parts. Journal of Clinical Epidemiology, 49(7), 711-717.
Suanthong S., Schumacker, R.E. & Beyerlein, M.M. (2000) An investigation of factors affecting test equating in latent trait theory. Journal of Applied Measurement, 1(1), 25-43.
Suen H.K. (1990) Principles of Test Theory. Lawrence Erlbaum. ISBN: 0-8058-0198-7.
Swaminathan H. (1999) Latent trait measurement models. In G.N. Masters & J.P. Keeves (Eds.), Advances in Measurement in Educational Research and Assessment, pp. 43-54. Amsterdam: Pergamon.
Swaminathan Hariharan and Gifford, Janice A. (1982) Bayesian Estimation in the Rasch Model, Journal of Educational Statistics, 7, 175-191.
Swanson D.B., Dillon G.F. & Ross L.P. Setting content-based standards for National Board exams: initial research for the comprehensive Part I Examination. Academic Medicine: 65, S17-18.
Sympson J.B. & Hetter R.D. (1985) Controlling item-exposure rates in computerized adaptive testing. Proceedings of the 27th Annual Meeting of the Military Testing Association (pp. 973-977) San Diego, CA: Navy Personnel Research and Development Center.
Tanaka J.S. & Huba G.J. (1985) A fit index for covariance structure models under arbitrary GLS estimation. British Journal of Mathematical and Statistical Psychology, 38, 197-201.
Tanner M.A. (1996) Tools for statistical inference: methods for the exploration of posterior distributions and likelihood functions. 3rd Edition. New York: Springer-Verlag.
Taris T.W. (2000) A primer in longitudinal data analysis. Thousand Oaks, CA: Sage.
Tatsuoka K.K. (1984) Caution indices based on item response theory. Psychometrika, 49, 95-110.
Tatsuoka K.K. (1985) A probabilistic model for diagnosing misconceptions by the pattern classification approach. Journal of Educational Statistics, 10, 55-73.
Tatsuoka K.K. (1990) Toward an integration of item response theory and cognitive error diagnosis. In N.
Tatsuoka K.K. (1995) Architecture of knowledge structures and cognitive diagnosis: a statistical pattern recognition and classification approach. In P.D. Nichols, S.F. Chipman, R.L. Brennan (Eds.), Cognitively diagnostic assessment (pp. 327-359) Hillsdale, NJ: Lawrence Erlbaum Associates.
Tennant A. & Young, C. (1997) Coma to community: Continuity in measurement. Physical Medicine and Rehabilitation: State of the Art Reviews, 11(2), 375-384.
Tennant A., Geddes, J.M.L. & Chamberlain, M.A. (1996) The Barthel Index: an ordinal score or interval level measure? Clinical Rehabilitation, 10, 301-308.
Tennant A., Hilmann, M., Fear, J., Pickering, A. & Chamberlain, M.A. (1996) Are we making the most of the Stanford Health Assessment Questionnaire? British Journal of Rheumatology, 35, 574-578.
Ter Hofstede F., Steenkamp, J.-B.E.M., Wedel, M. (1999) Identifying spatially contiguous international target markets. Manuscript submitted for publication.
Thissen D. (1982) Marginal Maximum Likelihood Estimation for the One-parameter Logistic Model, Psychometrika, 47, 175-186.
Thissen D. & Steinberg L. (1986) A taxonomy of item response models. Psychometrika, 51, 567-577.
Thissen D. & Wainer H. (1982) Some standard errors in item response theory. Psychometrika, 47, 397-412.
Thissen D., Pommerich, M., Billeaud, K. & Williams, V.S.L. (1995) Item response theory for scores on tests including polytomous items with ordered responses. Applied Psychological Measurement, 19, 39-49.
Thompson B. & Vacha-Haase, T. (2000) Psychometrics is datametrics: The test is not reliable. Educational and Psychological Measurement, 60, 174-195.
Thorndike E.L. (1904) An introduction to the theory of mental and social measurements. New York: Teachers College.
Thorndike R.L. (1971) Concepts of culture-fairness. Journal of Educational Measurement, 8, 63-70.
Thurstone L.L. (1925) A method of scaling psychological and educational tests. Journal of Educational Psychology, 16, 433-451.
Tindal G., Marston, D. & Deno, S.L. (1983) The reliability of direct and repeated measurement (Research Report No. 109) Minneapolis, MN: University of Minnesota Institute for Research on Learning Disabilities.
Tinsley Howard E.A. & Dawis, Rene V. (1975) An Investigation of the Rasch Simple Logistic Model: Sample Free Item and Test Calibration, Educational and Psychological Measurement, 35, 325-340
Tjur Tue. (1982) A Connection Between Rasch's Item Analysis Model and a Multiplicative Poisson Model, Scandinavian Journal of Statistics, 9, 23-30.
Traub R.E. (1983) A priori considerations in choosing an item response model. In R.K. Hambleton (Ed.), Applications of item response theory, pp 57-70,
Tsuji T, Liu M, Sonoda S, Domen K, Chino N. (2000) The stroke impairment assessment set: its internal consistency and predictive validity. Archives of Physical Medicine & Rehabilitation 81:863-868.
Tuerlinckx F. & De Boeck, P. (2001) The effect of ignoring item interactions on the estimated discrimination parameters in Item Response Theory. Psychological Methods, 6(2), 181-195.
Tutz G. (1990) Sequential item response models with an ordered response. British Journal of Mathematical and Statistical Psychology, 43, 39-55.
Tutz G. (1997) Sequential models for ordered responses. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 139-152) New York: Springer.
Tutz Gerhard. (1985) Categorical Response Models and Multiple Regression With Dummy Variables (German), Archiv für Psychologie, 137, 99-114.
Ullman S., Karabatsos G., Koss M. (1999) Alcohol and sexual assault for a national sample of college men. Psychology of Women Quarterly, 23, 673-689.
Ullman S., Karabatsos G., Koss M. (1999) Alcohol and sexual assault for a national sample of college women. Journal of Interpersonal Violence, 14, 6, 603-625.
Upshur J. & Turner C. (1999) Systematic effects in the rating of second-language speaking ability: Test method and learner discourse. Language Testing 16: 82-111.
van de Vijver, Fons J.R. (1988) Systematizing the Item Content in Test Design, Latent Trait and Latent Class Models, Plenum (New York; London), 291-307.
Van den Wollenberg A.L., Wierda F.W. & Janssen P.G.W. (1988) Consistency of Rasch model parameter estimation: a simulation study. Applied Psychological Measurement 12, 307-313.
Van den Wollenberg, A.L. (1982) Two new test statistics for the Rasch model. Psychometrika, 47, 123-140.
van den Wollenberg, Arnold L. (1982) A Simple and Effective Method to Test the Dimensionality Axiom of the Rasch Model, Applied Psychological Measurement, 6, 83-91
van den Wollenberg, Arnold L. (1982) Two New Test Statistics for the Rasch Model, Psychometrika, 47, 123-140
Van Der Flier H. (1982) Deviant response patterns and comparability of test scores. Journal of Cross-Cultural Psychology, 13, 267-298.
van der Linden W.J. & Eggen, J.H.M. (1986) An empirical Bayesian approach to item banking. Applied Psychological Measurement, 10(4), 345-354.
Van der Linden W.J. (1994) Fundamental Measurement and the Fundamentals of Rasch Measurement. In M. Wilson (ed.) Objective Measurement: Theory into Practice Vol.2. Ablex Publishing Corp. ISBN: 0-89381-843-1.
Van der Linden W.J. (1998) Bayesian item selection criteria for adaptive testing. Psychometrika, 63, 201-216.
van der Linden W.J. (1999) Multidimensional Adaptive testing with a minimum error-variance criterion. Journal of Educational and Behavioral Statistics, 24, 398-412.
van der Linden W.J. (2000) Constrained Adaptive Testing with Shadow Tests. In W.J. van der Linden & C.A.W. Glas (Eds.), Computerized Adaptive Testing: Theory and practice, 27-52. Dordrecht, The Netherlands: Kluwer Academic Publishers.
Van der Linden, W.J., Hambleton, R.K. (eds.) (1997) Handbook of modern item response theory. New York: Springer Verlag.
Van der Ven A.H.G.S. & Ellis, J.L. (2000) A Rasch Analysis of Raven's Standard Progressive Matrices. Personality and Individual Differences, 29 (1), 45-64.
Van Krimpen-Stoop E.M.L.A. & Meijer, R.R. (2000) Detection of Person Misfit in Computerized Adaptive Tests with Polytomous Items. Research Report 01, University of Twente, The Netherlands.
Van Lier L. (1989) Reeling, writhing, drawling, stretching, and fainting in coils: Oral proficiency interviews as conversation. TESOL Quarterly 23: 489-508.
van Schuur W.H. & Kiers, H.A.L. (1994) Why factor analysis is often the incorrect model for analyzing bipolar concepts, and what model can be used instead. Applied Psychological Measurement, 18, 97-110.
van Schuur W.H. (1984) Structure in political beliefs: A new model for stochastic unfolding with application to European party activists. Amsterdam: CT Press.
Van Schuur W.H. (1993) Nonparametric unidimensional unfolding for multicategory data. In J.R. Freeman (Ed.), Political analysis (Vol. 4, pp 41-74) Ann Arbor, MI: University of Michigan Press.
VanLehn K., Niu, Z. (????) Bayesian student modeling, user interfaces and feedback: A sensitivity analysis. International Journal of Artificial Intelligence in Education.
VanLehn K., Niu, Z., Siler, S., Gertner, A. (1998) Student modeling from conventional test data: a Bayesian approach without priors. In B.P. Goetl, H.M. Halff, C.L. Redfield, V.J. Shute (Eds.), Proceedings of the Intelligent Tutoring Systems Fourth International Conference, ITS 98 (pp. 434-443) Berlin: Springer-Verlag.
Velozo C.A., Kielhofner G. & Lai J.S. (1999, Jan-Feb) The use of Rasch analysis to produce scale-free measurement of functional ability. American Journal of Occupational Therapy, 53(1), 83-90.
Velozo C.A., Magalhaes L.C., Pan A.-W. & Leiter P. (1995) Functional scale discrimination at admission and discharge: Rasch analysis of the Level of Rehabilitation Scale-III. Archives of Physical Medicine and Rehabilitation, 76(8), 705-712.
Velozo C.A., Magalhaes, L.C., Pan, A. & Leiter, P. (1995) Functional scale discrimination at admission and discharge: Rasch analysis of the Level of Rehabilitation Scale-III. Archives of Physical Medicine and Rehabilitation, 76, 705-712.
Velozo CA, Kielhofner G, Lai JS. (1999) The use of Rasch analysis to produce scale-free measurement of functional ability. American Journal of Occupational Therapy 53:83-90.
Ventana J., Antequera, T., Ruiz, J., Cava, R., and Alvarez, P. (1996) Measuring Sensorial Quality of Iberian Ham by Rasch Model. Journal of Food Quality, 19, 397-412.
Verguts T, De Boeck P. (2002) Some Mantel-Haenszel tests of Rasch model assumptions. British Journal of Mathematical & Statistical Psychology 34:21-37.
Verguts T. & De Boeck, P. (2000) A note on the Martin-Lof test for unidimensionality. MPR-online, 5(1), 77-82; (Internet
Verhelst N. & Molenaar, I.W. (1988) Logit Based Parameter Estimation in the Rasch Model, Statistica Neerlandica, 42, 273-295.
Verhelst N.D. & Glas C.A.W. (1995) The one parameter logistic model. In: Fischer G.H. & Molenaar I.W. (Eds.) Rasch models. (pp. 215-237) New York: Springer.
Verhelst N.D. & Glas, C.A.W. (1993) A Dynamic Generalization of the Rasch Model, Psychometrika, 58, 395-415.
Verhelst N.D., Glas C.A.W. & De Vries H.H. (1997) A steps model to analyze partial credit. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 123-138) New York: Springer.
Verhelst N.D., Glas C.A.W. & van der Sluis A. (1984) Estimation problems in the Rasch model: the basic symmetric functions. Computational Statistics Quarterly, 1, 245-262.
Verhelst N.D., Glas C.A.W. & Verstralen H.H.F.M. (1995) OPLM: One Parameter Logistic Model. Computer program and manual. Arnhem, The Netherlands: CITO.
Verhelst N.D., Verstralen, H.H.F.M. (1993) A stochastic unfolding model derived from the partial credit model. Kwantitatieve Methoden, 42, 73-92.
Verhelst N.D., Verstralen, H.H.F.M. (2001) IRT models for multiple raters. In A. Boomsma, T. Snijders, and M. Van Duijn (Eds.), Essays in Item Response Modeling (pp. 89-108) New York: Springer-Verlag.
Vermetten Y., Lodewijks J. & Vermunt J. (1999) The role of personality traits and goal orientations in strategy use. Manuscript submitted to Contemporary Educational Psychology.
Vermunt J.D. (1998) The regulation of constructive learning processes. British Journal of Educational Psychology, 68, 149-171.
Vorberg Dirk and Schwarz, Wolfgang. (1990) Rasch-representable Reaction Time Distributions, Psychometrika, 55, 617-632
Wainer H. & Eignor, D. (2000) Caveats, pitfalls, and unexpected consequences of implementing large-scale computerized testing. In Wainer, Howard (Ed) Computerized adaptive testing: A primer (2nd ed.) pp. 271-299. Mahwah, NJ: Lawrence Erlbaum Associates.
Wainer H. (1993) Model-based standardized measurement of an item's differential impact. In P.W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 123-135) Hillsdale, NJ: Erlbaum.
Wainer H., Dorans N.J., Flaugher R., Green B.F., Mislevy R.J., Steinberg L. & Thissen D. (2000) Computerized adaptive testing: A Primer. Second Edition. Hillsdale NJ: Lawrence Erlbaum.
Wainer H., Thissen D., Mislevy R.J. (2000) Computerized Adaptive Testing: A Primer. Lawrence Erlbaum Associates, Inc.
Wainer Howard and Morgan, Anne and Gustafsson, Jan-Eric. (1980) A Review of Estimation Procedures for the Rasch Model With An Eye Toward Longish Tests, Journal of Educational Statistics, 5, 35-64.
Wainer Howard and Wright, Benjamin D. (1980) Robust Estimation of Ability in the Rasch Model, Psychometrika, 45, 373-391
Wang W. (1998) Rasch analysis of distractors in multiple choice items. Journal of Outcome Measurement 2(1), 43-65.
Wang W.-C. (1999) Direct Estimation of Correlations Among Latent Traits within IRT Framework. Methods of Psychological Research Online 4(2): 47-70.
Wang W.-C., Adams, R., et al. (1998) Measuring Individual Differences in Change with Multidimensional Rasch Models. Journal of Outcome Measurement 2(3): 240-265.
Wang W.-C., Wilson, M., Adams, R.J. (1997) Rasch models for multidimensionality between items and within items, in M. Wilson, G. Engelhard Jr, K. Draney (Eds.) Objective measurement: Theory into practice, vol.4, 139-155.
Warm T.A. (1989) Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427-450.
Waugh R.F., Hii T.K. & Islam A. (2000) An Approach to Studying scale for students in higher education: a Rasch measurement analysis. Journal of Applied Measurement, 1(1), 44-62.
Way W.D. (1998) Protecting the integrity of computerized testing item pools. Educational Measurement: Issues and Practice, 17(4), 17-27.
Weigle S. (1998) Using FACETS to model rater training effects. Language Testing 15: 263-287.
Weigle S.C. (1999) Investigating rater/prompt interactions in writing assessment: Quantitative and qualitative approaches. Assessing Writing, 6 (2), 145-178.
Weir C.J., Hughes A. & Porter D. (1990) Reading skills: hierarchies, implicational relationships and identifiability. Reading in a foreign language. 7(1): 505-510.
Weiss D.J. (1982) Improving measurement quality and efficiency with adaptive testing. Applied Psychological Measurement, 6, 473-492.
Weiss D.J. (Ed.) (1978) Proceedings of the 1977 computerized adaptive testing conference. Minneapolis, MN: University of Minnesota, Department of Psychology, Psychometric Methods Program.
Weiss D.J. (Ed.) (1983) New horizons in testing: Latent trait test theory and computerized adaptive testing. New York: Academic Press.
Weiss D.J., Kingsbury G.G. (1984) Application of computerized adaptive testing to educational problems. Journal of Educational Measurement 21:4 361-375.
Welch C. & Hoover H.D. (1993) Procedures for extending item bias detection techniques to polytomously scored items. Applied Measurement in Education, 6, 1-19.
White P.O. (1976) A Note on Keats' Generalization of the Rasch Model, Psychometrika, 41, 405-408
Whitely Susan E. (1977) Models, Meanings and Misunderstandings: Some Issues in Applying Rasch's Theory, Journal of Educational Measurement, 14, 227-235
Whiteneck G.G., Charlifue, S.W., Gerhart, K.A., Overholser, J.D. & Richardson, G.N. (1992) Quantifying handicap: A new measure of long-term rehabilitation outcomes. Archives of Physical Medicine and Rehabilitation, 73, 519-526.
Wigglesworth G. (1993) Exploring bias analysis as a tool for improving rater consistency in assessing oral interaction. Language Testing 10: 305-335.
Wigglesworth G. (1994) The investigation of rater and task variability using multi-faceted measurement. Report for the National Centre for English Language Teaching and Research, Macquarie University.
Willingham W.W. & Cole N.S. (1997) Gender and fair assessment. Hillsdale, NJ: Lawrence Erlbaum.
Wilson D.T., Wood R. & Gibbons R. (1991) TESTFACT. Test scoring, Item statistics, and Item Factor Analysis. (Computer Software) Chicago IL: Scientific Software International Inc.
Wilson H.G. (1988) Parameter estimation for peer grading under incomplete design. Educational and Psychological Measurement, 48, 69-81.
Wilson M. & Case, H. (2000) An examination of variation in rater severity over time: A study in rater drift. In M. Wilson & G. Engelhard, Jr. (Eds.), Objective Measurement: Theory into Practice (Vol. 5, pp. 113-133) Stamford, CT: Ablex.
Wimsatt W.C. (1981) Robustness, reliability and overdetermination. In M.B. Brewer & B.E. Collins (Eds.), Scientific inquiry and the social sciences. San Francisco: Jossey-Bass.
Wolcott W., et al. (1988) Discrepancies in essay scoring (Report No. TM013018) Springfield, VA: TM Clearinghouse. (ERIC Document Reproduction Service No. ED306246)
Wolfe E.W. & Chiu, C.W.T. (1999) Measuring change across multiple occasions using the Rasch rating scale model. Journal of Outcome Measurement, 3, 360-381.
Wolfe E.W. & Chiu, C.W.T. (1999) Measuring pretest-posttest change with a Rasch rating scale model. Journal of Outcome Measurement, 3, 134-161.
Wolfe E.W., Engelhard, G., Jr. & Myford, C.M. (2001, May) Monitoring Reader Performance and DRIFT in the AP English Literature and Composition Exam Using Benchmark Essays. A proposal funded by the Advanced Placement Research and Development Committee, Educational Testing Service, Princeton, NJ.
Wood R. & Wilson, D. (1974) Evidence for differential marking discrimination among examiners of English. The Irish Journal of Education, 8(1), 36-48.
Wood R. (1978) Fitting the Rasch model: A heady tale. British Journal of Mathematical and Statistical Psychology 31:27-32.
Woodcock R.W. (1999) What can Rasch-based scores convey about a person's test performance? In S.E. Embretson & S.L. Hershberger (Eds.), The new rules of measurement: What every psychologist and educator should know. Hillsdale, NJ: Lawrence Erlbaum Associates.
Wright B.D. & Masters G.N. (1981) The measurement of knowledge and attitude (Research memorandum no. 30) Chicago: Statistical Laboratory, Department of Education, University of Chicago.
Wright B.D. & Masters G.N. (1982) Rating scale analysis: Rasch measurement. Chicago: MESA Press.
Wright B.D. & Panchapakesan, N. (1969) A procedure for sample-free item analysis. Educational and Psychological Measurement, 29(1), 23-48.
Wright B.D. & Stone M.H. (1979) Best test design: Rasch measurement. Chicago: MESA Press.
Wright B.D. & Stone M.H. (2003 - perhaps) Directing Observations, Inventing Constructs, Crafting Yardsticks and Examining Fit. Chicago: The Phaneron Press.
Wright B.D. (1968) Sample-free test calibration and person measurement. Proceedings 1967: Invitational conference on testing problems. Princeton: Educational Testing Service, 85-101.
Wright B.D. (1977) Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14, 97-166.
Wright B.D. (1980) Afterword. In Rasch G. (1960) Probabilistic models for some intelligence and attainment tests. (pp. ix-xxiii) The University of Chicago Press.
Wright B.D. (1984) Despair and hope for educational measurement. Contemporary Education Review, 3(1), 281-288.
Wright B.D. (1985) Additivity in psychological measurement. In E.E. Roskam (Ed.), Measurement and Personality Assessment. Amsterdam: North-Holland: Elsevier Science Publishers B.V. pp 101-112.
Wright B.D. (1996) Comparing Rasch measurement with factor analysis. Structural Equation Modeling, 3(1), 3-24.
Wright B.D. (1997) Fundamental measurement for outcome evaluation. Physical Medicine and Rehabilitation: State of the Art Reviews, 11(2), 261-288.
Wright B.D. (1999) Fundamental measurement for psychology. In S.E. Embretson & S.L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know. Hillsdale, NJ: Lawrence Erlbaum Associates.
Wright B.D., Linacre J.M. & Heinemann A.W. (1993) Measuring functional status in rehabilitation. Physical Medicine and Rehabilitation Clinics of North America, 4(3), 475-491.
Wright Benjamin D. (1977) Misunderstanding the Rasch Model, Journal of Educational Measurement, 14, 219-225
Wright T.A., Bennett, K.K. & Dun, T. (1999) Life and job satisfaction. Psychological Reports, 84(3, pt.1) 1025-1028.
Wu M.L., Adams R.J., Wilson M.R. (1997) ConQuest: Generalized item response modeling software. ACER.
Wu M.L., Adams R.J., Wilson M.R. (1998) ACER ConQuest: generalized item response modelling software. Melbourne: Australian Council for Educational Research.
Yamamoto K., Gitomer, D.H. (1993) Application of a HYBRID model to a test of cognitive skill representation. In N. Fredriksen, R.J. Mislevy (Eds.), Test theory for a new generation of tests (pp. 275-295) Hillsdale, NJ: Lawrence Erlbaum Associates.
Yamauchi Kana. (1999) Comparing Many-facet Rasch Model and ANOVA model: Analysis of ratings of essays [in Japanese]. Japanese Journal of Educational Psychology. Vol 47(3), Sep., 383-392.
Yen W.M. (1981) Using simulation results to choose a latent trait model. Applied Psychological Measurement, 5, 245-262.
Yen W.M. (1984) Effects of local item dependence on the fit and equating performance of the three parameter logistic model. Applied Psychological Measurement, 8, 125-145.
Yen W.M. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30, 187-213.
Young J.W. (1990) Adjusting the cumulative GPA using item response theory. Journal of Educational Measurement, 27, 175-186.
Yuan A., Clarke, B. (1999) Manifest characterization and testing for two latent traits. Manuscript submitted for publication.
Zhu W. (1996) Should total scores from a rating scale be used directly? Research Quarterly for Exercise and Sport, 67(3), 363-372.
Zhu W., Updyke W.F., Lewandowski, C. (1997) Post Hoc Rasch analysis of optimal categorization of an ordered response scale. Journal of Outcome Measurement, 1(4), 286-304.
Zimowski M.F., Muraki, E., Mislevy, R.J. & Bock, R.D. (1999) BILOG-MG: Multiple-Group IRT Analysis and Test Maintenance for Binary Items. Scientific Software International, Inc. Chicago, IL.
Zimowski M.F., Muraki, E., Mislevy, R.J., Bock, R.D. (1997) BILOG-MG. [Computer program]. Chicago: Scientific Software Inc. Online description available: Accessed 28 April 2000.
Zwick R. (1992) Special issue on the National Assessment of Educational Progress. Journal of Educational Measurement, 17, 93-94.
Zwick R., Donoghue J.R. & Grima A. (1993) Assessment of differential item functioning for performance tasks. Journal of Educational Measurement, 30, 233-251.
Zwinderman A.H. & van den Wollenberg, Arnold L. (1990) Robustness of Marginal Maximum Likelihood Estimation in the Rasch Model, Applied Psychological Measurement, 14, 73-81
Zwinderman A.H. (1991) A Generalized Rasch Model for Manifest Predictors, Psychometrika, 56, 589-600
Zwinderman A.H. (1995) Pairwise parameter estimation in Rasch models. Applied Psychological Measurement, 19(4), 369-375.
Zwinderman A.H. (1997) Response models with manifest predictors. In van der Linden W.J. & Hambleton R.K. Handbook of Modern Item Response Theory. New York: Springer.
Bond, T., & Fox, C. M. (2001). Applying the Rasch Model: Fundamental Measurement in the Human Sciences. Mahwah NJ: Lawrence Erlbaum Assoc.
Chan, D., Sacco, J., Schmitt, N., McFarland, L. A., & Jennings, D. (1998). Appropriateness fit, reactions, motivation, conscientiousness, subgroup differences, and test validity. Paper presented at the annual conference of the Society for Industrial and Organizational Psychology, Dallas, TX.
Cronbach, L. J. (1946). Response sets and test validity. Educational and Psychological Measurement, 6, 475-494.
Cronbach, L. J. (1950). Further evidence on response sets and test design. Educational and Psychological Measurement, 10, 3-31.
Douglas, G., & Wright, B. D. (1987). Response patterns and their probabilities. Rasch Measurement Transactions 3:4, 75-77.
Drasgow, F., Levine, M. V., & Williams, E. A. (1985). Appropriateness measurement with polychotomous item response models and standardized indices. British Journal of Mathematical and Statistical Psychology, 38, 67-86.
Drasgow, F., Levine, M. V., & Zickar, M. J. (1996). Optimal identification of mismeasured individuals. Applied Measurement in Education, 9 (1), 47-64.
Fan, X. (2003). Two approaches for correcting correlation attenuation caused by measurement error: implications for research practice. Educational and Psychological Measurement, 63 (6), 915-930.
Ferrando, P. J., Lorenzo, U., & Molina, G. (2001). An item response theory analysis of response stability in personality measurement. Applied Psychological Measurement, 25 (1), 3-17.
Glaser, R. (1952). The reliability of inconsistency. Educational and Psychological Measurement, 12, 60-64.
Glaser, R. (1949). A methodological analysis of the inconsistency of responses to test items. Educational and Psychological Measurement, 9, 721-739.
Guttman, L. (1944). A basis for scaling qualitative data. American Sociological Review, 9, 139-150.
Guttman, L. (1945). A basis for analysing test-retest reliability. Psychometrika, 10, 255-282.
Harnisch, D. L., & Linn, R. L. (1981). Analysis of item response patterns: questionable test data and dissimilar curriculum practices. Journal of Educational Measurement, 18 (3), 133-146.
Iacobucci, D., & Duhachek, A. (2003). Advancing alpha: measuring reliability with confidence. Journal of Consumer Psychology.
Ingebo, G. (1987). Student personality and test objectivity. Rasch Measurement Transactions 3:4, 86-87.
Karabatsos, G. (2000). A critique of Rasch residual fit statistics. Journal of Applied Measurement, 1 (2), 152-176.
Klauer, K. C. (1991). An exact and optimal standardized person test for assessing consistency with the Rasch model. Psychometrika, 56 (2), 213-228.
Klauer, K. C. (1995). The assessment of person fit. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications (97-110). New York: Academic Press.
Koning, A. J., & Franses, P. H. (2003). Confidence intervals for Cronbach's coefficient alpha values. ERIM Report Series in Management, ERS-2003-041-MKT.
Leplege, A., & Ecosse, E. (2000). Methodological issues in using the Rasch model to select cross culturally equivalent items in order to develop a quality of life index: the analysis of four WHOQOL-100 data sets (Argentina, France, Hong Kong, United Kingdom). Journal of Applied Measurement, 1 (4), 372-392.
Levine, M. V., & Rubin, D. B. (1979). Measuring the appropriateness of multiple-choice test scores. Journal of Educational Statistics, 4 (4), 269-290.
Linacre, J. M. (1994). Many facet Rasch measurement (2nd edition). Chicago: MESA.
Ludlow, L. H., & Mahalik, J. R. (2001). Congruence between a theoretical continuum of masculinity and the Rasch model: examining the conformity to Masculine Norms Inventory. Journal of Applied Measurement, 2 (3), 205-206.
Meijer, R. R., & Nering, M. (1997). Trait level estimation for nonfitting response vectors. Applied Psychological Measurement, 21 (4), 321-336.
Meijer, R. R., & Sijtsma, K. (2001). Methodology review: Evaluating person fit. Applied Psychological Measurement, 25 (2), 107-135.
Miller, M. D. (1986). Time Allocation and Patterns of Item Response. Journal of Educational Measurement, 23 (2), 147-156.
Molenaar, I. W., & Hoijtink, H. (1990). The many null distributions of person fit indices. Psychometrika, 55 (1), 75-106.
Molenaar, I. W., & Hoijtink, H. (1996). Person-fit and the Rasch model, with an application to knowledge of logical quantors. Applied Measurement in Education, 9 (1), 27-45.
Mosier, C. I. (1940). Psychophysics and mental test theory: fundamental postulates and elementary theorems. Psychological Review, 47, 355-366.
Mosier, C. I. (1942). Psychophysics and mental test theory II: The constant process. Psychological Review, 48, 235-249.
Reise, S. P., & Waller, N. G. (1993). Traitedness and the assessment of response pattern scalability. Journal of Personality and Social Psychology, 65 (1), 143-151.
Schmitt, N., Chan, D., Sacco, J. M., McFarland, L. A., & Jennings, D. (1999). Correlates of person fit and effect of person fit on test validity. Applied Psychological Measurement, 23, 41-53.
Schmitt, N., Cortina, J. M., & Whitney, D. J. (1993). Appropriateness fit and criterion-related validity. Applied Psychological Measurement, 17 (2), 143-150.
Schulz, M. (1987). Functional assessment. Rasch Measurement Transactions 3:4, 82-84.
Smith, R. M. (1991). The distributional properties of Rasch item fit statistics. Educational and Psychological Measurement, 51, 541-565.
Smith, R. M. (2000). Fit analysis in Latent Trait measurement models. Journal of Applied Measurement, 1 (2), 199-218.
Tatsuoka, K. K., & Linn, R. L. (1983). Indices for detecting unusual patterns: Links between two general approaches and potential applications. Applied Psychological Measurement, 7 (1), 81-96.
Tatsuoka, K. K., & Tatsuoka, M. M. (1983). Spotting erroneous rules of operation by the individual consistency index. Journal of Educational Measurement, 20 (3), 221-230.
Thurstone, L. L., & Chave, E. J. (1929). Measurement of attitudes. Chicago: University of Chicago Press.
Van Der Flier, H. (1982). Deviant response patterns and comparability of test scores. Journal of Cross-cultural Psychology, 13, 267-298.
Wright, B. D., & Masters, G. N. (1982). Rating Scale Analysis. Chicago: MESA Press.
Wright, B. D. (1993). Data analysis and fit. Rasch Measurement Transactions, 7:4, 324-325.
Wright, B. D., & Mok, M. (2000). Rasch models overview. Journal of Applied Measurement, 1 (1), 83-106.
Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago: MESA Press.
