How Can Generative AI (LLMs) Help with Analysing Survey and International Large-Scale Assessment Data
Date:
04/03/2026
Organised by:
NCRM, University of Southampton
Presenter:
Dr Yin Wang and Dr Laone Maphane
Level:
Entry (no or almost no prior knowledge)
Contact:
Jacqui Thorp
Training and Capacity Building Coordinator, National Centre for Research Methods, University of Southampton
Email: jmh6@soton.ac.uk
Location:
View in Google Maps (SO17 1BJ)
Venue:
Building 54, Room 4001, University of Southampton, Highfield, Hants
Description:
This course is one of a series of four. You may register for any number of sessions individually. If you choose to register for all four, a discount will be applied. Further information about the series can be found at the end of this listing.
Session Three - How Can Generative AI (LLMs) Help with Analysing Survey and International Large-Scale Assessment Data
Would you like to use Generative Artificial Intelligence (AI) in social science research efficiently and responsibly, turning it into a powerful partner for analysis and interpretation?
The session demonstrates how Large Language Models (LLMs) can serve as both front-end and back-end research assistants, bridging conceptual reasoning, code generation, data interpretation, and policy communication. Participants will learn how to use LLMs to translate research questions and hypotheses into executable code workflows, and to automatically generate research outputs such as summaries, figure captions, and interpretive narratives. The course also introduces key aspects of Responsible AI, discussing how to evaluate LLM-generated content in terms of safety (toxicity), fairness and bias (stereotype score), linguistic fluency (perplexity), and semantic fidelity (BLEU/ROUGE).
This session uses International Large-Scale Assessment (ILSA) data as a central example. ILSA results often require interpretation within rich social, cultural, and educational contexts; they offer an ideal testing ground for examining whether LLMs can truly understand and explain rather than merely summarise and predict.
By the end of this session, participants will be able to:
- Explain the potential and limitations of Generative Artificial Intelligence (AI) in analysing ILSA data and supporting educational research.
- Describe how Generative AI can act as front-end and back-end assistants, connecting conceptual reasoning, coding, and interpretive writing.
- Understand key ethical and responsible AI principles, including transparency, fairness, and researcher accountability.
- Evaluate LLM-generated outputs using quantitative indicators such as Toxicity Score, Stereotype Score, Perplexity, and BLEU/ROUGE.
- Use LLMs to translate research questions and hypotheses into executable code and apply LLMs to generate and refine summaries, figure captions, and policy-facing narratives based on analytical outputs.
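As a rough illustration of two of the quantitative indicators named above, the following Python sketch computes a simplified ROUGE-1 score and a perplexity value. It is not taken from the course materials: the function names, example sentences, and log-probabilities are hypothetical, tokens are split on whitespace only, and the perplexity function assumes natural-log token probabilities supplied by some model. Established evaluation packages handle tokenisation and stemming more carefully.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def rouge1(reference: str, candidate: str) -> Dict[str, float]:
    """Simplified ROUGE-1: unigram overlap on lowercased whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity: exp of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Hypothetical texts, invented for illustration only.
reference = "reading scores rose in most participating countries"
candidate = "reading scores rose in many countries"
scores = rouge1(reference, candidate)

# Hypothetical per-token log-probabilities (assumed to come from a language model).
ppl = perplexity([math.log(0.5), math.log(0.25), math.log(0.5)])

print(scores)
print(ppl)
```

Higher ROUGE-1 values indicate more word overlap with the reference text, while lower perplexity indicates that the model found the text more predictable. Neither number on its own shows that an interpretation is correct.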
This course is aimed at researchers, graduate students, data analysts, and education professionals interested in applying traditional and AI methods to the analysis of international large-scale assessment data.
Pre-requisites
This session is divided into two parts:
Morning lecture (2 hours): No prior technical background is required. The session introduces core ideas conceptually, making it accessible to participants from educational and social science backgrounds.
Afternoon workshop (2 hours): Basic familiarity with statistical syntax or programming is recommended to follow the hands-on exercises effectively. Participants with limited experience are still welcome to observe and learn from live demonstrations.
IMPORTANT: Please note that this course includes computer workshops. Before registering, please check that you will be able to access the software noted below. Please bear in mind the minimum system requirements to run the software and any administration restrictions imposed by your institution or employer, which may block the installation of software.
Software: Python and possibly R
Format: Demonstration-based workshop with guided hands-on examples.
Delivery
This course is being delivered in a hybrid format on Wednesday 4th March from 09:00-14:00 (Lecture - 09:00-11:00, Workshop - 12:00-14:00):
In person - Room 54/4001 (limited capacity, offered on a first-come, first-served basis) or Online.
Series details:
Session One - £25 - https://www.ncrm.ac.uk/training/show.php?article=14610
Session Two - £50 - https://www.ncrm.ac.uk/training/show.php?article=14611
Session Four - £25 - https://www.ncrm.ac.uk/training/show.php?article=14613
Special offer: Register for all four sessions for £120
Cost:
The fee for this session is:
• £50 per person for all participants.
In the event of cancellation by the delegate a full refund of the course fee is available up to two weeks prior to the course. NO refunds are available after this date.
If it is no longer possible to run a course due to circumstances beyond NCRM's control, NCRM reserves the right to cancel the course at its sole discretion at any time prior to the event. In this event, every effort will be made to reschedule the course. If this is not possible or the new date is inconvenient, a full refund of the course fee will be given. NCRM shall not be liable for any costs, losses or expenses that may be incurred as a result of its cancellation of a course, including but not limited to any travel or accommodation costs.
The University of Southampton’s Online Store T&Cs also continue to apply.
Website and registration:
Region:
South East
Keywords:
Quantitative Data Handling and Data Analysis, Mixed Methods Data Handling and Data Analysis, AI and machine learning, International large-scale assessment data, Large Language Models, Artificial intelligence methods, Responsible AI, Cross-Sectional Research, Longitudinal Data Analysis, Quantitative Software
