LEMMA II finished its work in 2011. The impact report for the LEMMA II node is available on the ESRC website.

Goldstein and Noden introduced a new model-based approach to the measurement of diversity by considering a multilevel model in which the main focus of interest was the modelling of variation. In the case of schooling, eligibility for free school meals has been the focus of much interest. These authors used the proportion of eligible children in a school as their response variable in a 3-level model which explicitly included school and local education authority (LEA) effects. Thus the between-school and between-LEA variances are model parameters. In essence, these parameters capture the diversity among schools and LEAs, and functions of their estimates correspond to different indexes that have been put forward in the literature.
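As an illustration only, a model of this kind could be written as follows. The binomial response and logit link are assumptions made for this sketch (the text above does not state the link function), and all symbols are introduced here: y_{jk} is the number of eligible pupils among the n_{jk} pupils in school j of LEA k.

```latex
y_{jk} \sim \mathrm{Binomial}(n_{jk}, \pi_{jk}), \qquad
\operatorname{logit}(\pi_{jk}) = \beta_0 + v_k + u_{jk}, \qquad
v_k \sim N(0, \sigma_v^2), \quad u_{jk} \sim N(0, \sigma_u^2)
```

Under these assumptions \sigma_u^2 is the between-school variance and \sigma_v^2 the between-LEA variance, so diversity indexes become functions of the two estimated variances.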

This work has been extended with further data sets, and also in the following methodological directions: (1) The modelling has been extended to cross-classified models, so that in the context of schooling neighbourhood effects could be taken into account and the effects of different stages of schooling could be studied. (2) Work has been conducted on the measurement of school catchment areas, which has been used to explore neighbourhood effects. In addition, pupil mobility across schools and neighbourhoods has been explored using multiple membership models.

Researchers: Harvey Goldstein, Kelvyn Jones, Simon Burgess, Rich Harris

In this strand LEMMA II researchers explored three substantive questions: (i) The impact of families on pupil achievement, (ii) School competition and pupils' learning progress, and (iii) Parental selection into schools and neighbourhoods (catchment areas) driving the school competition processes.

(i) The impact of families on pupil achievement.

There has been no multilevel analysis of datasets which contain both multiple children per family and multiple children per school, thereby allowing estimation of separate variance components for children, schools and families. In an analysis of US data, a series of separate sibling, peer, neighbourhood and schoolmate correlations was fitted to adolescent achievement data, and the largest correlation was found to be between siblings (0.5). A full multilevel analysis of such data will properly partition these effects, reducing their size. However, the sibling (family) variance component in a multilevel model of pupil attainment may well remain the most important source of variation in the system. Developmental psychological theory predicts that when families are placed under stress there will be fragmentation and differential failure of children within the family.

LEMMA II researchers explored this hypothesis by allowing the within-family variance to be a function of socio-economic stress.
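One way to express this idea, offered as a sketch rather than the specification actually fitted, is a cross-classified model in which the child-level (within-family) variance depends on a family-level stress measure. The log-linear form of the variance function and all symbols (f_j for family, s_k for school, z_j for the stress measure) are assumptions introduced here.

```latex
\begin{aligned}
y_{i(j,k)} &= \beta_0 + f_j + s_k + e_{i(j,k)}, \\
f_j &\sim N(0, \sigma_f^2), \qquad s_k \sim N(0, \sigma_s^2), \qquad
e_{i(j,k)} \sim N(0, \sigma_{e,j}^2), \\
\log \sigma_{e,j}^2 &= \alpha_0 + \alpha_1 z_j
\end{aligned}
```

In this form, \alpha_1 > 0 would correspond to greater differences between siblings in families under more socio-economic stress.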

(ii) School competition and pupils' learning progress.

Using the Pupil Level Annual School Census (PLASC), researchers at LEMMA II developed modelling frameworks and methodology to assess the evidence for school competition affecting pupils' learning progress after conditioning on the individual and contextual effects of differential school intakes. The assumption was that if there was a residual effect of school competition on pupils' learning progress, the school-level random effects in a multilevel model (pupils within schools) would be correlated. The between-school correlations were modelled initially in terms of a proximity function based on de facto catchment areas. For example, only schools with overlapping catchments could be allowed to covary. The dependence between pairs of overlapping schools was then further explored by elaborating the function for between-school covariance to include, for example, differences in school characteristics.
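The simplest proximity function described above, in which only schools with overlapping catchments covary, can be sketched as a covariance matrix for the school random effects. This is a minimal illustration, not the LEMMA II implementation: the function name, the common variance sigma2 and the single correlation rho are hypothetical, and a matrix built this way is not guaranteed to be positive definite for every choice of rho.

```python
def school_covariance(n_schools, overlapping_pairs, sigma2, rho):
    """Illustrative covariance matrix for school random effects.

    Diagonal entries are the between-school variance sigma2.
    Off-diagonal entries are rho * sigma2 for pairs of schools whose
    catchments overlap, and 0 for all other pairs.
    """
    pairs = {frozenset(p) for p in overlapping_pairs}
    cov = [[0.0] * n_schools for _ in range(n_schools)]
    for j in range(n_schools):
        for k in range(n_schools):
            if j == k:
                cov[j][k] = sigma2
            elif frozenset((j, k)) in pairs:
                cov[j][k] = rho * sigma2
    return cov


# Hypothetical example: three schools, only schools 0 and 1 overlap.
cov = school_covariance(3, [(0, 1)], sigma2=0.2, rho=0.5)
```

Elaborating the covariance function, for example by letting the off-diagonal entries depend on differences in school characteristics, would replace the constant rho with a function of each pair.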

(iii) Parental selection into schools and neighbourhoods (catchment areas) driving the school competition processes.

One hypothesis was that the main mechanism for school competition effects is parental selection into schools and neighbourhoods. To test this hypothesis, LEMMA II researchers extended the school competition model (ii) by including catchment area directly as a random classification in the model. Pupils are therefore multiple members of catchment areas, and catchment areas are cross-classified with schools. The assumption was that if there were effects of parental selection into schools and neighbourhoods on pupils' learning progress, LEMMA II researchers would find a non-zero correlation between school and neighbourhood random effects. LEMMA II modelled the school-neighbourhood correlation in terms of adjacency indicators and attractor mechanisms based on historical characteristics of schools and neighbourhoods. LEMMA II researchers expected that if the driving force for school competition effects is parental selection into schools and neighbourhoods, they would see a weakening of any between-school dependencies established in model (ii) and significant dependency between schools and catchment areas.
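The multiple membership idea can be illustrated with a short sketch: a pupil who belongs to several catchment areas receives a weighted sum of those areas' random effects. The function name, the weights and the effect values below are hypothetical, and the requirement that weights sum to 1 is a common convention assumed here rather than something stated in the text.

```python
def multiple_membership_effect(weights, area_effects):
    """Weighted sum of catchment-area random effects for one pupil.

    weights maps area id -> membership weight (assumed to sum to 1);
    area_effects maps area id -> that area's random effect.
    """
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("membership weights must sum to 1")
    return sum(w * area_effects[area] for area, w in weights.items())


# Hypothetical values: the pupil belongs mostly to area A, partly to B.
area_effects = {"A": 0.4, "B": -0.4, "C": 1.0}
pupil_weights = {"A": 0.75, "B": 0.25}
pupil_effect = multiple_membership_effect(pupil_weights, area_effects)
```

The school effect would enter the pupil's linear predictor as a separate, cross-classified term alongside this catchment-area contribution.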

Researchers: Jon Rasbash, Fiona Steele, Harvey Goldstein, Simon Burgess and Jo-Anne Baird

It is well established that risky behaviours in families cluster together, but the causal relationships between these behaviours are not well understood. This project used data from the Avon Brothers and Sisters Study, which has very detailed longitudinal information on entire families (including non-target siblings) for 200 of the ALSPAC families.

LEMMA II researchers examined the directionality of relationships between parental depression, child behaviour, marital difficulties and family type (single parent, nuclear, step) in a multiprocess model. Family data provide a particularly challenging set of issues for multiprocess models because of the complex structure of families: individuals are multiple members of dyads and dyads are nested within families. The multivariate response variables are of different types (normal and multinomial) and defined at different levels. LEMMA II researchers started their analysis by exploring reciprocal causation between pairs of risk variables, treating the remaining set of risk variables as exogenous to the system, after which they proceeded to explore triplets of endogenous variables, thus building a realistically complex model for the transmission of risks within families. LEMMA II tackled issues of endogeneity caused by selection and by reciprocal and lagged effects.

Researchers: Jon Rasbash, Fiona Steele, Jenny Jenkins, Tom O'Connor, Jonathan Evans, Carol Propper, Frank Windmeijer

Previous LEMMA II work implemented procedures based on methodological extensions that allow multivariate mixtures of normal, ordered or unordered categorical responses that can be defined at any level of a data hierarchy. LEMMA II considered the 2-level model in detail, with a major application to multiple imputation for missing data. The researchers used latent variable ideas to create an underlying set of latent multivariate normal responses: one normal response for each binary or ordered response variable and a set of normal responses for each multicategory response variable. This reduced the analysis to a multivariate normal model that allowed LEMMA II researchers to apply standard algorithmic steps in the estimation.

In multiple imputation there are two models. One is the scientific model of interest (MOI) and the other is the imputation model (IM). The basic idea is that all the variables that are present in the MOI form a set of response variables in the IM which is then fitted, within a multilevel structure, with intercepts in the fixed part of the model. For a set of multivariate Normal responses this is straightforward and in addition, if any responses are missing, they will be randomly imputed within an MCMC analysis. For original non-normal variables, these imputed values are then transformed back to the original (ordered or unordered) scales so that the imputed 'complete' datasets will have all variables on their original scales.
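The back-transformation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the LEMMA II procedure: the function names are hypothetical, ordered responses are assumed to be recovered by cutting a single latent normal value at fixed thresholds, and unordered responses are assumed to follow a maximum-latent-value rule with a reference category fixed at zero.

```python
from bisect import bisect_right


def to_ordered_category(latent_value, thresholds):
    """Map a latent normal value to an ordered category 0..len(thresholds).

    thresholds must be sorted in increasing order; a binary variable
    uses a single threshold.
    """
    return bisect_right(thresholds, latent_value)


def to_unordered_category(latent_values):
    """Map latent normal values to an unordered category.

    Category 0 is the reference (its latent value is fixed at 0);
    categories 1..K correspond to the K entries of latent_values.
    The category with the largest latent value is chosen.
    """
    best = max(range(len(latent_values)), key=lambda i: latent_values[i])
    if latent_values[best] <= 0.0:
        return 0
    return best + 1


# Hypothetical imputed latent draws mapped back to original scales.
binary_value = to_ordered_category(0.7, [0.0])
ordered_value = to_ordered_category(0.2, [-1.0, 0.5, 1.5])
unordered_value = to_unordered_category([0.1, 0.9])
```

Applying such a mapping to every imputed latent draw yields completed datasets in which categorical variables take values on their original scales.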

Researchers: Harvey Goldstein, James Carpenter

The classic random effects multilevel model assumes independence between units of the same classification (e.g. school effects are independent) and independence between units of different classifications (e.g. school and neighbourhood effects are independent). The research questions addressed in the 'Realistic Models for School Effects' strand required both these assumptions to be relaxed. By definition when modelling school competition, school effects cannot be independent, and also when exploring parental selection mechanisms into schools and neighbourhoods, school and neighbourhood effects cannot be independent.

LEMMA II developed a methodology to handle both these cases. Non-independence between units of the same classification, where the classifications are at higher levels, has been handled using conditional autoregressive models. Researchers at LEMMA II proposed an alternative approach that readily extended to the second, more general, case of non-independence between random classifications. In the past, LEMMA II researchers had developed models that allow for correlation between crossed random classifications where the classifications share units. Rasbash et al. fitted a social relations model where bi-directional relationship measures could be decomposed into actor, partner, dyad and family random classifications. Correlations were modelled between the actor and partner random classifications, which were different mappings from the bi-directional relationship scores (level 1 units) to the same set of individuals. LEMMA II extended this work to allow correlations between classifications that do not share units.

Researchers: Harvey Goldstein, William Browne, Paul Clarke, Fiona Steele, Jon Rasbash