The Data Analysis Workflow

Presenter(s): Vernon Gayle

This resource relates to the practical aspects of organising and undertaking statistically orientated analyses of social science data. Many of the issues that are discussed are relevant to other social research methods.

This resource provides information on how to plan, organise, compute and document data analyses. It provides advice on how to use data analysis software effectively and discusses the critical role of documentation. It also discusses frequently overlooked aspects of the workflow, such as organising directory structures and establishing file naming protocols. The overall aim of the resource is to provide researchers with information that enables them to improve the efficiency and effectiveness of their data analysis workflow.

This resource introduces the concept of the data analysis workflow. It includes a 28-minute video, the associated PowerPoint slides and a list of further reading.

This 28-minute video introduces the concept of the data analysis workflow. The focus of this video is social science research that employs statistical techniques to analyse data. Many of the issues associated with the statistical data analysis workflow also pervade other forms of social science research (e.g. qualitative data analysis), despite the different nature of the data and the analytical techniques that are used.

About the author

My work involves the statistical analysis of large-scale and complex social science datasets. These datasets include both social surveys and administrative data resources. The analysis of longitudinal (i.e. repeated contacts) data is an area in which I specialize. The main substantive focus of my work is social stratification. I have particular interests in the sociology of youth and youth transitions, education and sport. I also have interests in demography, with a focus on migration and, to a lesser extent, fertility. I have also undertaken work in the area of digital social research. My methodological research focuses on a range of challenges, including quasi-variance estimation, missing data and multiple imputation, and the graphical representation of data. I am attempting to promote the 'Public Awareness of Social Statistics'.

