Availability and use of ethnicity data for UK health research

Kaisa Puustinen

Article by Rohini Mathur, Emily Grundy and Liam Smeeth (PATHWAYS node, London School of Hygiene and Tropical Medicine and University of Cambridge). This article appears in the Summer 2013 issue of the MethodsNews newsletter.

In the UK, minority ethnic groups experience higher rates of disease, with earlier onset and worse outcomes, than the 'White British' population. Studying ethnic variations in health is therefore very important. As described here and in our recent working paper, improvements in data availability mean that there is now much greater potential for such research than in the past.

The concept of ethnicity is related to, but distinct from, that of race. Ethnicity is now understood to be a much broader self-identification, encompassing a range of socially constructed characteristics. Ethnic self-identity can be fluid over time, responding to political and cultural forces.

Just as ethnic identity can be context and time dependent, so too can the relevance of the ethnic categories used to define population groups of interest. In the UK, and indeed worldwide, a pragmatic approach has been adopted to create ethnic categories for research that are simple to interpret, though their meanings may not remain stable over time. Ethnic groups themselves should not be considered homogeneous, as it is well established that high-level groupings can conceal significant heterogeneity. In both the USA and the UK it has been acknowledged that the ethnic categories used in official statistics are, to some extent, arbitrary and have been selected primarily for pragmatic reasons. Provided that researchers recognise the limitations of these categories and approach them critically, the study of ethnic differences can provide vital information about patterns of health and social indicators, and an essential foundation for tackling inequalities.

Graph: Proportion of UK practices achieving 100% ethnicity recording for all newly registered patients. Graph produced using freely available NHS data from GP Contract (http://gpcontract.co.uk).


Ethnicity data in computerised health records

The computerisation of health records across the NHS has generated enormous potential for population-based research into the relationship between ethnicity and health in the UK. Although ethnicity data have been collected electronically since 1991, the usability of ethnicity data coded in electronic health records was low until recently. Critically, thanks to financial incentives under the Quality and Outcomes Framework, self-reported ethnicity data have been available for over 90% of newly registered patients since 2011.
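To illustrate what completeness of ethnicity recording means in practice, the following minimal Python sketch computes the share of newly registered patients with a usable ethnicity record, by registration year. It is an illustration only: the record layout, the list of codes treated as missing, and the toy data are all assumptions made for this example and do not reflect the structure or content of any real database.

```python
from collections import defaultdict

# Assumption: placeholder values treated as "ethnicity not recorded".
# Real databases use their own code lists.
MISSING_CODES = {"", "unknown", "not stated"}


def recording_completeness(records):
    """Return {registration_year: share of patients with usable ethnicity}.

    `records` is an iterable of (registration_year, ethnicity) pairs,
    where ethnicity is a string or None (hypothetical layout).
    """
    totals = defaultdict(int)
    recorded = defaultdict(int)
    for year, ethnicity in records:
        totals[year] += 1
        if ethnicity is not None and ethnicity.strip().lower() not in MISSING_CODES:
            recorded[year] += 1
    return {year: recorded[year] / totals[year] for year in sorted(totals)}


# Invented toy data, not real patient records.
sample = [
    (2005, None), (2005, "White British"), (2005, "Not stated"), (2005, None),
    (2011, "Indian"), (2011, "White British"), (2011, "Black Caribbean"), (2011, ""),
]

completeness = recording_completeness(sample)
for year, share in completeness.items():
    print(f"{year}: {share:.0%} of new registrations have ethnicity recorded")
```

Treating placeholder codes such as "not stated" as missing is a choice the analyst has to make explicitly, since it directly affects the completeness figure that is reported.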

For researchers who wish to study patterns of health care usage and outcomes across the UK, population-based databases such as the Clinical Practice Research Datalink (CPRD), The Health Improvement Network (THIN) database and the QRESEARCH database provide anonymised routine health records for patients from a representative sample of general practices. For research into hospital care and outcomes, Hospital Episode Statistics (HES) for England supply comparable data for all patients admitted to NHS hospitals, with ethnicity recording reported as being over 90% for all inpatient finished consultant episodes.

Although the quality and completeness of ethnicity data in routine primary and secondary care records have steadily improved over the past decade, these data have not been extensively used for research into ethnic inequalities in health. A review in April 2012 of studies using the four databases described above identified only 15 peer-reviewed articles that described the use of patient-level self-reported ethnicity in exploring differentials in health care usage, disease prevalence and disease risk.

Microdata from the Census

A further source of population-based ethnicity data for health research is the Census. Two census outputs of particular interest when examining health outcomes and ethnicity are the Samples of Anonymised Records (SARs) and the Office for National Statistics (ONS) Longitudinal Study of England and Wales (LS). While the SARs data provide large cross-sectional cuts of the Census return, the ONS LS can be linked over time to examine trends and changes in the ethnic profile of the population and related trends in morbidity and mortality.

One emerging area where routine ethnicity data can be used to great benefit is chronic disease management. Although research into ethnic disparities is ongoing, it has yet to be translated into concrete guidance for managing conditions differentially by ethnic group. A further use of routinely recorded ethnicity data is within pragmatic clinical trials, which use electronic health databases to examine the effectiveness of widely prescribed interventions across a vast number of patients, at a lower cost than traditional clinical trials. Finally, linkage of these datasets to additional health and social data, as is currently ongoing in the CPRD, will allow a fuller exploration of the relationship between ethnicity and the wider determinants of health.