Methodological reflections on the 2021 Census

NCRM news
David Martin, NCRM Co-Director

21 March 2021 is census day in England, Wales and Northern Ireland.  

There are at least two remarkable aspects to this simple observation – firstly, that the census is going ahead as planned in the middle of a pandemic and secondly, that Scotland is missing.  

Planning a census is a long-term exercise: many aspects of the 2021 census have been developed over the last decade, were tested in 2017 and extensively rehearsed in 2019.  Paper census forms have been printed, a large temporary staff force recruited, contracts placed for supporting services and now the major communications campaign is about to commence.  The decision to go ahead has been guided by extensive adaptations to the practical operation and by the massive investment already made.  However, the difficulty of that decision is reflected in the choice of Scotland (and others, including the Republic of Ireland) to defer until 2022.  This round of censuses will throw up many new methodological challenges, for both the census agencies themselves and for researchers.

Demand for high-quality, detailed data on local populations has never been greater.  Understanding prevalence and managing the response to the COVID-19 pandemic have led to unprecedented interest in population denominators and in local and regional inequalities.  Our current understanding of, for example, basic demographics, ethnic diversity (essential for understanding prevalence), travel to work (presently hugely disrupted), industry of employment and especially the interactions between these is deeply reliant on 2011 census data.  The census provides the only complete picture from which detailed multivariate crosstabulation is possible for small areas, together with key research datasets such as commuting and migration flows, microdata samples and the ONS Longitudinal Study.  Even the ‘middle layer super output areas’ (MSOAs) that have been used for local reporting of COVID-19 prevalence are themselves derived from the census data system, being based on census ‘output areas’.  Postulated changes to migration prompted by Brexit and the unprecedented interruption to international travel are all presently obscured by the discontinuation of the International Passenger Survey; again, census outputs are a key resource for calibrating our understanding of migration.  

In keeping with wider trends in survey methodology, this census has been designed ‘digital first’ (although a choice of paper or online completion will be available to all respondents), with the expectation that around 75% of responses will be made online.  Canada achieved 68.4% and Australia 63% online response in 2016, while New Zealand managed 82% online in 2018, although the latter was set in the context of a disappointingly low overall response.  Online responses are generally of higher quality, due to the ability to guide the user through an adaptive questionnaire and undertake a degree of real-time validation.  However, the inevitable continued use of paper questionnaires by those who do not wish or are unable to respond online has the potential to introduce new mode biases that may need to be factored into subsequent analysis.  A further innovation will be the integration of some linked administrative data in order to produce additional statistics on income and numbers of rooms.  Overall response rates will of course be influenced to some extent by the lockdown restrictions in place at the time, yet a digital census with linked administrative inputs provides a high degree of resilience in the face of uncertain restrictions on personal activity.

The pandemic itself is simultaneously producing changes to long-established societal characteristics and also to the ability of the census to measure them.  Inevitably, collecting and interpreting responses to questions such as usual place of work and term-time addresses of students will be additionally challenging, and while the questions themselves cannot be altered, much effort is going into adapting the guidance in light of current circumstances.  Since 2001, production of the final census estimates has been the result of dual system estimation, relying on a large Census Coverage Survey (CCS), itself the largest household survey in the country, to permit adjustment of the raw census counts.  These important processes will themselves now be subject to modifications to deal with the unique challenges of the census in a time of pandemic.
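
For readers unfamiliar with dual system estimation, the underlying idea can be illustrated with the textbook Lincoln-Petersen (capture-recapture) estimator. The sketch below is only that: the function name and all figures are hypothetical, and the estimation actually used for the census is considerably more elaborate (it is not described in this article). It assumes the census and the coverage survey are independent counts of the same population in the sampled areas, and that records can be matched between the two.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Textbook Lincoln-Petersen estimate of the true population size.

    census_count  -- people counted by the census in the sampled areas
    survey_count  -- people counted by the coverage survey in the same areas
    matched_count -- people found in BOTH sources after record matching

    Illustrative only: assumes the two counts are independent and that
    matching is error-free, which real census processing cannot assume.
    """
    if matched_count <= 0:
        raise ValueError("matched_count must be positive")
    if matched_count > min(census_count, survey_count):
        raise ValueError("matched_count cannot exceed either source count")
    return census_count * survey_count / matched_count


# Hypothetical figures, invented for illustration.
census_count = 900    # counted by the census
survey_count = 800    # counted by the coverage survey
matched_count = 720   # appear in both

estimate = dual_system_estimate(census_count, survey_count, matched_count)
coverage = census_count / estimate  # share of the estimated population the census reached

print(f"Estimated population: {estimate:.0f}")
print(f"Estimated census coverage: {coverage:.1%}")
```

The intuition is that the share of survey respondents who were also found in the census (720 of 800 in this invented example) estimates the census coverage rate, so dividing the raw census count by that rate gives the adjusted population estimate.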

Not only has the design of the 2021 census proved to be a fascinating methodological journey, but it seems inevitable that some innovation will be required in using the data to best effect, possibly involving adjustments to the estimates to take account of administrative benchmarks and with implications for future research users.  It has always been necessary for secondary data users to take pains to understand the circumstances under which data were collected, but that need will be greater than ever when the first results begin to be made available in 2022.

Readers interested in specific aspects of 2021 Census methodology in England and Wales will find an extensive library of papers here. 

David Martin is a Co-Director of NCRM, Deputy Director of the UK Data Service and a member of the UK Statistics Authority’s Census Methodological Assurance Review Panel.  Twitter @GeogDave