Sharing code in secure research data labs: good practice and practicalities

Category: NCRM news
Author(s): Louise Corti, Office for National Statistics

In 2021, I led a session at the Research Methods e-Festival, which had its roots in an earlier joint ONS-UKDS #LoveYourCode2020 workshop. The session focused on the challenges of sharing code in Trusted Research Environments (TREs). In a previous NCRM blog, I reflected on a conversation I had at the festival with research economist Felix Ritchie and engineer Martin O’Reilly from the Turing Institute about their approaches to sharing useful analytical code in secure environments.

I presented possible solutions for ONS at a banking secure labs conference session in Basel in 2022. Since then, our Secure Research Service (SRS) code-sharing team at ONS has made real progress. We spent the last year planning and implementing a shared code repository. It has been a tremendous amount of collaborative work to set up working policies, procedures and infrastructure.

Code can act as building blocks: for example, the code base from an existing project can be used to create new or similar derived variables not already available in a dataset. Code can also be used to replicate analyses, either for new, similar substantive modelling or for reproducing work. We agreed to focus on the community support aspect of sharing code, rather than on ‘gold standard’ checking and auditing (for example, validating reproducibility through an external service such as cascad or CODECHECK).

Pilot work

Perhaps the hardest challenge was finding volunteers willing to have their own working code reviewed by anyone outside their own team.

We piloted our repository and worked with code created by the ADR UK-funded Wage and Employment Dynamics data enhancement and linkage project. The research team was very supportive and we interviewed them about their experiences in a recent blog.

The Code Sharing Repository holds output-checked code, submitted by researchers using the SRS. We have recently published basic guidance on writing and sharing code in a TRE as part of the new ADR Learning Hub. This includes guidance for reproducible code, learning videos and coding templates.

About the author

Louise Corti is Head of Analytical Insights and Impact for the Integrated Data Service at the Office for National Statistics. Her impact team focuses on tracking, measuring and showcasing innovative and excellent use of survey and administrative data in research undertaken in ONS’s secure environment, the Secure Research Service (SRS). She has led a pilot on facilitating and promoting reproducible code.