E-books for causal modelling and missing data methods

Principal Investigator: William Browne (LEMMA 3)
Co-investigators: Bianca De Stavola (PATHWAYS), Paul Clarke (LEMMA 3), Chris Charlton (LEMMA 3), Mike Kenward (PATHWAYS), Rhian Daniel (collaborator on PATHWAYS), Richard Parker (University of Bristol)

The aim of this project is to produce several electronic notebooks (e-books) for use with the STAT-JR software package. The e-books will teach social science researchers about causal modelling and missing data, and assist them in performing their statistical analyses. We believe that e-books are likely to play a big role in how people learn and do research in the future.

The team at the LEMMA 3 node have developed the STAT-JR software package, which allows users to fit complex statistical models to their datasets. Unlike previous software produced by the LEMMA team (for example MLwiN), STAT-JR makes the process of estimation much more transparent: the software outputs the algorithms and computer code that it uses. Its template-based system allows other researchers to contribute pieces of code (templates) that fit specific model classes or inter-operate with other software packages.

The STAT-JR software was released in beta to the user community in May 2012 and currently contains three user interfaces: a command-driven interface (cmdtest) run through Python; a web-based interface (webtest) that works in a way familiar to users of other software packages; and an interactive, web-based e-book interface (ebooktest) that combines the processes of reading a book and running (controlled) statistical analyses into one operation. This e-book interface has many potential uses, from interactive training materials to small packages that lead the beginner through a complex statistical procedure, e.g. dealing with missing data via multiple imputation.

The team at the PATHWAYS node are experts in causal modelling and missing data. The materials for the causal modelling courses given by PATHWAYS currently introduce the key concepts in causal inference, namely the language of counterfactuals and causal diagrams, and give details of particular statistical methods used in causal inference, grouped according to the sort of causal question they address and the structure and completeness of the data available to answer it. E-books will be written to replace some of the practicals given in these courses and will be linked to other e-books, such as the missing data e-book being written by the LEMMA team.