Comparative judgement for measuring the hard-to-measure

Dr Ian Jones, Loughborough University

The hard-to-measure

Social scientists are often interested in theoretical constructs that elude efficient and reliable measurement, such as ‘beauty’ or ‘reading comprehension’. Developing instruments for hard-to-measure constructs can be expensive and labour intensive, and even then the outcomes might not be robust when the instruments are adapted to different contexts. Such measurement difficulties can thwart the advance of our disciplines. In response, there is growing interest in an approach to measurement in the social sciences that is based on the long-standing principle of comparative judgement. Here I provide a brief overview of the approach, along with examples from my own discipline of education.

Comparative judgement

Comparative judgement offers an efficient method to position a set of heterogeneous objects on a linear measurement scale according to a high-level criterion. For example, the objects might be photographs and the criterion beauty. The measurement scale is constructed by presenting pairs of objects to participants and asking them to decide, for each pairing, which object has ‘more’ of the given criterion. We collect the binary decisions for many such pairings from a group of participants, and then fit a statistical model to the binary decision data in order to produce a unique score for each object. This set of scores is our linear measurement scale and, like any set of scores, can be used in typical analyses such as hypothesis testing and regression.
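As a rough illustration of the model-fitting step, the sketch below turns a list of pairwise decisions into one score per object. The article does not say which statistical model is used; the Bradley-Terry model, a common choice for paired-comparison data, is assumed here, fitted with a simple iterative (minorisation-maximisation) update. The function name `fit_bradley_terry`, the photograph labels and the judgement data are all hypothetical, and the sketch assumes every object wins at least one comparison and loses at least one, and that no object is compared with itself.

```python
import math
from collections import defaultdict


def fit_bradley_terry(judgements, n_iter=500):
    """Estimate one score per object from (winner, loser) decisions.

    Assumes a Bradley-Terry model: P(i beats j) = s_i / (s_i + s_j).
    Returns scores on a log scale, centred so that they average zero.
    """
    wins = defaultdict(int)         # number of comparisons each object won
    pair_counts = defaultdict(int)  # number of times each pair was compared
    objects = set()
    for winner, loser in judgements:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        objects.update((winner, loser))

    strength = {obj: 1.0 for obj in objects}
    for _ in range(n_iter):
        updated = {}
        for i in objects:
            # Sum over opponents of (comparisons with j) / (s_i + s_j).
            denom = sum(
                pair_counts[frozenset((i, j))] / (strength[i] + strength[j])
                for j in objects
                if j != i
            )
            updated[i] = wins[i] / denom
        # Rescale each round; only ratios of strengths are identified.
        total = sum(updated.values())
        strength = {obj: s * len(objects) / total for obj, s in updated.items()}

    log_scores = {obj: math.log(s) for obj, s in strength.items()}
    mean = sum(log_scores.values()) / len(log_scores)
    return {obj: value - mean for obj, value in log_scores.items()}


# Hypothetical decisions: each tuple is (photo judged more beautiful, other photo).
judgements = (
    [("A", "B")] * 3 + [("B", "A")]
    + [("B", "C")] * 3 + [("C", "B")]
    + [("C", "D")] * 3 + [("D", "C")]
)

scores = fit_bradley_terry(judgements)
for photo in sorted(scores, key=scores.get, reverse=True):
    print(photo, round(scores[photo], 2))
```

The resulting scores sit on a log-odds scale: under the assumed model, the difference between two objects' scores is the log of the odds that one is chosen over the other. In practice a dedicated statistics package would normally be used, not least because it also reports standard errors and reliability estimates.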

Comparative judgement readily and efficiently produces reliable measurement scales across a wide range of hard-to-measure constructs because it harnesses the principle that human beings are consistent at making relative judgements and inconsistent at making absolute judgements. A neat demonstration of this using shades of colour can be seen at the free-to-use online comparative judgement platform. In the context of education, people are inconsistent when marking essays using a rubric, but consistent when comparatively judging the same essays.

Application to the social sciences

In a programme of research at Loughborough University, we have applied comparative judgement to the measurement of a range of nebulous but important educational constructs. In one study, research mathematicians made pairwise judgements of A-level examination scripts from a historic archive, and the resultant scores enabled us to track changes in the ‘difficulty’ of A-level mathematics qualifications over recent decades. In another study, we delivered teaching interventions to two randomly assigned groups of older primary students and used comparative judgement to determine which intervention led to a ‘better’ understanding of algebra.

We have also applied comparative judgement to the measurement of students’ problem-solving skills, conceptual understanding and proof comprehension. Beyond education, we are currently collaborating with colleagues from other disciplines to apply comparative judgement to experimental philosophy, and to empirical literary studies.

Find out more

For those interested in learning more about how comparative judgement might enable the construction of measurement scales in their own research, I am delivering an NCRM online course on 8 and 9 September 2022. You can also visit my online resource for getting started, or get in touch for advice and support on using comparative judgement.

