Adjustment Methods for Data Quality Problems: Missing Data, Measurement Error and Misclassification
10/01/2024 - 11/01/2024
NCRM, University of Southampton
Professor Jose Pina-Sanchez and Dr Albert Varela Montane
Intermediate (some prior knowledge)
Training and Capacity Building Coordinator, National Centre for Research Methods, University of Southampton
LIDA 11.06, Worsley Building Level 11, University of Leeds, LS2 9JT
Survey and administrative data are frequently affected by data quality issues as a result of problems at the sampling and data collection stages (e.g. non-response, coverage error, social desirability bias or recording errors). Just one of those problems in its most basic form, like random measurement errors, can exert strong biases in multivariate analysis relying on this data. Yet, attempts to adjust for the impact of these types of data quality problems remain rare.
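The biasing effect of purely random measurement error can be illustrated with a small simulation. The sketch below is not course material: the course works in R, and plain Python is used here only for illustration. The sample size, true slope and error variance are all assumed values. It fits a simple regression twice, once with the true predictor and once with a noisy version of it.

```python
import random
import statistics

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(42)   # fixed seed so the run is reproducible
n = 20000                 # assumed sample size
true_slope = 2.0          # assumed true effect of x on y

x_true = [rng.gauss(0, 1) for _ in range(n)]
y = [true_slope * x + rng.gauss(0, 1) for x in x_true]

# Classical (random, additive) measurement error with variance 1,
# i.e. as large as the variance of the true predictor (an assumption).
x_obs = [x + rng.gauss(0, 1) for x in x_true]

slope_true = ols_slope(x_true, y)   # close to the true slope
slope_obs = ols_slope(x_obs, y)     # attenuated towards zero

print("slope using true x:    ", round(slope_true, 3))
print("slope using observed x:", round(slope_obs, 3))
```

Under these assumptions the reliability of the observed predictor is 0.5, so the estimated slope is expected to shrink to roughly half of its true value even though the error is entirely random.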
We believe this is because most approaches suggested in the literature either: i) are designed as ad hoc solutions; ii) require additional forms of data often unavailable to researchers; or iii) are too complex, involving the adoption of different estimation methods and specialised software.
In this course we introduce a combination of relatively simple adjustment methods available in R that can be applied across a wide range of missing data, measurement error and misclassification problems, even in instances when all that we have is an educated guess of the extent of the problem.
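To give a flavour of what working from an educated guess can look like, the sketch below re-expresses one observed regression slope under several guessed reliabilities of the predictor. It is an illustration only and is not taken from the course. It assumes a simple linear regression with classical random error in a single predictor, where the observed slope equals the true slope multiplied by the reliability. The observed slope of 0.40 and the guessed reliabilities are hypothetical, and Python is used here although the course works in R.

```python
def corrected_slope(observed_slope, reliability):
    """Undo attenuation in a simple regression, given a guessed reliability.

    Assumes classical random measurement error in a single predictor,
    so that observed_slope = true_slope * reliability.
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must lie in (0, 1]")
    return observed_slope / reliability

observed_slope = 0.40                         # hypothetical estimate
guessed_reliabilities = [0.6, 0.7, 0.8, 0.9, 1.0]  # hypothetical guesses

# One adjusted estimate per scenario; 1.0 means "no measurement error".
scenarios = {r: corrected_slope(observed_slope, r)
             for r in guessed_reliabilities}

for r, slope in scenarios.items():
    print(f"reliability {r:.1f} -> adjusted slope {slope:.3f}")
```

Reporting the whole range of adjusted estimates, rather than a single corrected value, shows how sensitive a conclusion is to the assumed extent of the measurement error.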
The course covers an introduction to common data quality problems and adjustment methods:
- Missing data: completely at random, at random, and not at random
- Coverage error
- Measurement error: additive, multiplicative, random, systematic, and differential
- Multiple imputation
- Sensitivity analysis
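The three missing data mechanisms listed above can be contrasted with a short simulation. This is a minimal sketch under assumed values (sample size, response probabilities and the relationship between the two variables are all hypothetical), written in plain Python for illustration rather than in R as used in the course. It compares the complete-case mean of an outcome when missingness is unrelated to the data, depends on an observed covariate, or depends on the outcome itself.

```python
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is reproducible
n = 50000                # assumed sample size

x = [rng.gauss(0, 1) for _ in range(n)]    # covariate, always observed
y = [xi + rng.gauss(0, 1) for xi in x]     # outcome, true mean 0

def complete_cases(keep_probs):
    """Return the y values that remain observed under given keep probabilities."""
    return [yi for yi, p in zip(y, keep_probs) if rng.random() < p]

# Missing completely at random: every case has the same chance of being kept.
mcar = complete_cases([0.7] * n)
# Missing at random: the chance of being kept depends only on the observed x.
mar = complete_cases([0.9 if xi < 0 else 0.5 for xi in x])
# Missing not at random: the chance of being kept depends on y itself.
mnar = complete_cases([0.9 if yi < 0 else 0.5 for yi in y])

mean_full = statistics.fmean(y)
mean_mcar = statistics.fmean(mcar)
mean_mar = statistics.fmean(mar)
mean_mnar = statistics.fmean(mnar)

print("full data:", round(mean_full, 3))
print("MCAR:     ", round(mean_mcar, 3))
print("MAR:      ", round(mean_mar, 3))
print("MNAR:     ", round(mean_mnar, 3))
```

In this setup the complete-case mean stays close to the full-data mean only under the first mechanism. Under the second, the bias arises through the observed covariate, which is the information that adjustment methods such as multiple imputation can draw on. Under the third, the missingness depends on values that are never seen, so assumptions about its extent are needed.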
By the end of the course participants will be able to:
- Anticipate contexts in Social Science research where measurement error and missing data can be expected to be particularly prevalent.
- Distinguish between different types of measurement error and missing data problems employing the precise terminology.
- Adjust for, or at least illustrate, the potential biasing effect associated with different types of data quality problems.
This course is aimed at Social Science researchers of all backgrounds and disciplines, who undertake multivariate analysis with data prone to either measurement error, missingness, or both. It is essential that participants possess at least a beginner level of familiarity with R. Some basic understanding of regression modelling is also required.
This two-day course will run from 10:00-17:00 on 10th January and 09:00-14:00 on 11th January 2024.
The following references provide a useful reading list covering the methods that we will see in this course. They are listed in order of relevance:
Van Buuren, S. (2018). Flexible imputation of missing data. CRC press. https://stefvanbuuren.name/fimd/
Blackwell, M., Honaker, J., & King, G. (2017). A unified approach to measurement error and missing data: overview and applications. Sociological Methods & Research, 46(3), 303-341.
Lederer, W., & Küchenhoff, H. (2006). A short introduction to the SIMEX and MCSIMEX. The Newsletter of the R Project, 6, 26.
Gallop, M., & Weschle, S. (2019). Assessing the impact of non-random measurement error on inference: a sensitivity analysis approach. Political Science Research and Methods, 7(2), 367-384.
Pina-Sánchez, J., Buil-Gil, D., Brunton-Smith, I., & Cernat, A. (2022). The Impact of Measurement Error in Regression Models Using Police Recorded Crime Rates. Journal of Quantitative Criminology, 1-28.
The fee per teaching day is:
- £35 per day for students registered at University.
- £70 per day for staff at academic institutions, Research Councils researchers, public sector staff and staff at registered charity organisations and recognised research institutions.
- £250 per day for all other participants.
All fees include event materials, morning and afternoon refreshments, and lunch. Fees do not include travel and accommodation costs.
In the event of cancellation by the delegate, a full refund of the course fee is available up to two weeks prior to the course. NO refunds are available after this date.
If it is no longer possible to run a course due to circumstances beyond its control, NCRM reserves the right to cancel the course at its sole discretion at any time prior to the event. In this event every effort will be made to reschedule the course. If this is not possible or the new date is inconvenient, a full refund of the course fee will be given. NCRM shall not be liable for any costs, losses or expenses that may be incurred as a result of its cancellation of a course, including but not limited to any travel or accommodation costs.
The University of Southampton’s Online Store T&Cs also continue to apply.
Website and registration:
Data Quality and Data Management, Missing data, Sensitivity analysis, R, Selection bias, Measurement error, Misclassification, Multiple imputation, SIMEX