Applying principles of item response theory to produce efficient ultra-short form questionnaires
If you have a question about this talk, please contact Luning Sun.
Efficient development of ultra-short form questionnaires (3 or 4 items, e.g. for clinical use) is worth pushing to its limits. Success would discourage the non-standardised use of similar questions in unconsidered interview formats, and discourage the fantasy that single-question responses can be taken precisely at face value, while acknowledging that lengthy questionnaires are often impractical. Factorial purity (‘consistency’, as indexed by Cronbach’s alpha) has long been the overriding process goal for item selection and the final indicator of performance. But it deserts us as a guarantor of quality whenever we seek an intrinsically heterogeneous summary measure, or seek a pattern of discrimination that serves not linear prediction of a continuous measure but a non-linear criterion such as an extreme cut-off (a dichotomy). It is not necessary to reach the ideal of Rasch measurement, in which a few items have to be highly selected and each item, once its response curve has been appropriately located on the scale, does most of the discriminatory work in that portion of the scale. It is possible, with appropriate cautions about reliability and validity, to meet the general development objective with (1) an additive linear model using available items of overlapping response and heterogeneous response slope, and (2) an emphasis on metrical validity, adding this to the four classical types of validity (face, construct, criterion, ecological). We see an index of validity of any of the five types as a goal to be sought through all stages of the psychometric process, not just as a final performance indicator that justifies adoption and use according to some conventional cut-off. Combining these two approaches entails pursuing and optimising item scaling through several stages. Where the properties of the final criterion measure are clearly understood, item choice, item scaling and total score formulation can all be optimised for the particular application in view.
These points are illustrated with a piece of applied development work in health status measurement, which met an outside request for a short-form measure using items in a clinical trial database. Two reasonable and related development criteria recognised by the field were adopted, but the distributional and linearity constraints that measures for the two criteria required turned out to be radically different. Proper consideration of metrical validity led us to offer two ultra-short forms, one for each of the criteria adopted; though conceptually related, these criteria were metrically irreconcilable. As often happens in consulting, the user’s requirement was redefined (i.e. as two requirements), partly by problem analysis, but also by the fine structure of the data.
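As a rough illustration of the consistency index mentioned above, the following sketch computes Cronbach's alpha from the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name `cronbach_alpha` and both small data sets are hypothetical and are not taken from the talk; they only show that items which move together give a high alpha, while a deliberately heterogeneous pair of items gives a low one.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration only; not the speaker's method.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer all k items")

    items = list(zip(*rows))                      # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of summed scores
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - item_var_sum / total_var)


# Invented data: three items that rise and fall together across respondents.
homogeneous = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]

# Invented data: two items that are uncorrelated across respondents.
heterogeneous = [[1, 1], [1, 2], [2, 1], [2, 2]]

alpha_homogeneous = cronbach_alpha(homogeneous)
alpha_heterogeneous = cronbach_alpha(heterogeneous)
print(alpha_homogeneous, alpha_heterogeneous)
```

In this toy case the uncorrelated pair scores far lower than the homogeneous set, even though a summary built from such items may be exactly what an intrinsically heterogeneous measure calls for. That is the sense in which alpha alone is an incomplete guide to item selection.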
This talk is part of the Cambridge Psychometrics Centre Seminars series.