Using survey data to enhance administrative data for policy relevant research

Principal Investigator: Lorraine Dearden (PEPA and ADMIN)
Co-investigators: Harvey Goldstein (LEMMA 3), Jon Johnson (Centre for Longitudinal Studies, IOE), Luke Sibieta (PEPA) and Ellen Greaves (PEPA)

In recent years, survey data has been significantly enhanced through linkage to administrative data sets. In the UK this has included linkage to administrative health records, education records and, more recently, economic records. This has improved the quality and value of survey data and allowed the problems of non-response and attrition that are common in survey data to be identified more accurately. In addition, the problems of using administrative data (which has relatively poor background information) to obtain causal estimates have been documented by comparing estimates that use the restricted set of characteristics available in administrative data with estimates that use the wider set of characteristics available in survey data. Both of these agendas formed an important part of the ADMIN (Phase 2 node) work.

However, surveys increasingly collect information that is relevant beyond the individual or household surveyed, and this information has the potential to enhance administrative data in ways that have not, to our knowledge, been explored before. For example, the British Cohort Studies have a long tradition of undertaking teacher surveys whilst the study child is in school. These teacher surveys typically collect information from the teacher about the child. But they also collect a range of policy-relevant data that is not child specific, for example information about the wider practices and characteristics of the child's school and school cohort, which is not measured in school administrative data. Given that these surveys typically cover a large number of schools, this school-level information, once linked to school administrative data, has the potential to identify the impact of school and school cohort practices on child and school outcomes.

This project investigates the feasibility of enhancing the National Pupil Database (NPD) school administrative data using survey data. Specifically, data collected by the Millennium Cohort Study (MCS) Wave 4 (2008) in its survey of teachers is used to create school and school cohort information that will be linked to the National Pupil Database for all primary school children in state schools in 2008. The response to the survey means that we have this information for just over 2,200 primary schools in England, around 13% of all state primary schools in England. The first stage of the project assesses the value of the approach by developing appropriate ways to deal with the possible non-representativeness of this sample of schools. This arises for two reasons: only a sub-sample of state schools is observed in the MCS, and not all teachers responded to the survey, which may introduce bias into the data.
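The linkage and re-weighting step described above can be illustrated with a minimal sketch. It is not the project's method, which is not specified here; it assumes simple post-stratification, one common way of adjusting a non-representative sample, in which each responding school is weighted by the ratio of all schools to responding schools within its stratum. The school identifier `urn`, the `stratum` grouping, the `uses_setting` flag and all data values are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical administrative list of all schools. "urn" stands in for a
# school identifier and "stratum" for any characteristic observed for every
# school (both names are assumptions, not fields taken from the NPD).
all_schools = [
    {"urn": 1, "stratum": "urban"},
    {"urn": 2, "stratum": "urban"},
    {"urn": 3, "stratum": "urban"},
    {"urn": 4, "stratum": "urban"},
    {"urn": 5, "stratum": "rural"},
    {"urn": 6, "stratum": "rural"},
    {"urn": 7, "stratum": "rural"},
]

# Hypothetical school-level information derived from a teacher survey,
# available only for schools where a teacher responded.
survey_schools = {
    1: {"uses_setting": True},
    2: {"uses_setting": False},
    5: {"uses_setting": True},
}


def poststratification_weights(population, responders):
    """Return, per stratum, (all schools) / (responding schools)."""
    pop_counts = Counter(s["stratum"] for s in population)
    resp_counts = Counter(
        s["stratum"] for s in population if s["urn"] in responders
    )
    return {stratum: pop_counts[stratum] / n for stratum, n in resp_counts.items()}


def link_and_weight(population, responders):
    """Link survey information to the school list and attach a weight."""
    weights = poststratification_weights(population, responders)
    linked = []
    for school in population:
        info = responders.get(school["urn"])
        if info is None:
            continue  # no survey information for this school
        linked.append({**school, **info, "weight": weights[school["stratum"]]})
    return linked


linked = link_and_weight(all_schools, survey_schools)

# Weighted share of schools that use setting, compared with the raw share.
total_weight = sum(r["weight"] for r in linked)
weighted_share = sum(r["weight"] for r in linked if r["uses_setting"]) / total_weight
unweighted_share = sum(r["uses_setting"] for r in linked) / len(linked)
```

In this toy example the rural stratum is under-represented among responders, so its single responding school receives a larger weight and the weighted share differs from the raw share. A real application would need strata (or a response model) rich enough to capture the ways in which responding schools differ from the rest.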

The second stage of the project assesses the potential value of the linkage by examining whether it can provide insight into two policy-relevant questions: 1) What was the impact of school setting policies (streaming, setting in certain subjects, no setting) on child outcomes at KS1 (Year 2) and KS2 (Year 6) in 2008? 2) What was the impact of teacher qualifications, class environment and disruptive pupils in Year 2 on Key Stage 1 outcomes for children born in 2000/2001?

The MCS Wave 4 teacher survey has information on school streaming and setting policies, which is the focus of the first policy-relevant question. It also has information on the qualifications and experience of the MCS child’s teacher, on peer relations in the MCS child’s class, on whether a child has been excluded from the class, and on how conducive the class is to learning. These are the focus of the second policy-relevant question.

This project will help us understand the impact of different school setting and streaming policies on child outcomes, and the impact of teacher qualifications and class environment on child outcomes for different types of students. This research may help inform schools about which practices are best for different types of students, something that is not well known at present for current cohorts. The research should be able to examine this by the socio-economic background of students and see whether particular policies work better for students from poorer socio-economic backgrounds.