Web-scraping with Python and Introduction to text data with Python


24/04/2024 - 25/04/2024

Organised by:

University of Exeter


Mariam Cook


Intermediate (some prior knowledge)


C2S2 admin team


University of Exeter
Streatham Campus
EX4 4PE


Technological advancements have not only driven the digitisation of society and the emergence of novel socio-political issues, but have also resulted in significant developments in algorithms, computational power, and increasingly large datasets. 

This practical, face-to-face session will be delivered over two days. It will provide you with the technical programming skills and understanding of data science techniques that you will need to research pre-existing and novel socio-political and economic issues, along with the kind of transferable skills that are currently in demand in the job market.

Text data surrounds us in our lives and comes in different shapes and sizes, e.g. newspaper articles, tweets, product reviews and song lyrics. While it might seem at first glance that this information can hardly be summarised and compared, certain computational techniques make it possible to extract meaningful information from text data. This course provides the foundations for you to understand, execute and communicate text data analysis in a widely recognised software platform that was built for data analysis.

Specifically, it will introduce additional skills using the Python programming language, and requires prior introductory experience with Python. 

This training can be taken as a standalone course, given prior Python experience, or as a follow-on from Introduction to Python and Python for Data Analysis on 22nd and 23rd April 2024 (ncrm.ac.uk).

Web scraping with Python

  • Introduction to Google Colab (students need a functioning Gmail/Google account they can log into)
  • Pandas dataframes and uploading external data to Colab
  • How to scrape a web page and extract text with Beautiful Soup 
  • How to analyse and visualise text content using the Seaborn library
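
As a rough illustration of the text-extraction step listed above, here is a minimal sketch. The course uses Beautiful Soup; this sketch deliberately swaps in Python's built-in `html.parser` so that it runs without installing anything, and it works on a made-up HTML string rather than a downloaded page. The names `SAMPLE_HTML`, `TextExtractor` and `extract_text` are hypothetical and not taken from the course materials.

```python
from html.parser import HTMLParser

# Hypothetical page content, invented for illustration.
# A real scraper would first download the HTML from a URL.
SAMPLE_HTML = """
<html><head><style>p { color: red; }</style></head>
<body>
<h1>Course news</h1>
<p>Text data surrounds us.</p>
<script>console.log("ignored");</script>
</body></html>
"""


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string as one line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


print(extract_text(SAMPLE_HTML))
```

With Beautiful Soup installed, the parsing step roughly corresponds to `BeautifulSoup(html, "html.parser").get_text()`, which does the tree building and text collection for you.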

Introduction to Text Data with Python

  • Text preprocessing
  • Bag-of-words modelling and count vectorizer
  • Lexicon-based sentiment analysis using spaCy
  • Comparative visualisation
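
To give a feel for the first three topics above, the sketch below runs text preprocessing, a bag-of-words count matrix and a lexicon-based sentiment score on two invented sentences. It uses only the Python standard library as a stand-in: the "count vectorizer" in the outline presumably refers to scikit-learn's `CountVectorizer`, which builds the same kind of document-term matrix, and the course uses spaCy for sentiment. The corpus, stopword list, lexicon and function names here are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-corpus, stopword list and lexicon, invented for illustration.
REVIEWS = [
    "Great course, great teacher!",
    "The room was cold and the coffee was bad.",
]
STOPWORDS = {"the", "and", "was", "a"}
LEXICON = {"great": 1, "good": 1, "bad": -1, "cold": -1}


def preprocess(text):
    """Lowercase, keep alphabetic tokens only, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def bag_of_words(docs):
    """Return (vocabulary, count matrix) with one row per document."""
    token_lists = [preprocess(d) for d in docs]
    vocab = sorted({t for tokens in token_lists for t in tokens})
    counts = [Counter(tokens) for tokens in token_lists]
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def sentiment_score(text):
    """Sum the lexicon values of the tokens; unknown words count as 0."""
    return sum(LEXICON.get(t, 0) for t in preprocess(text))


vocab, matrix = bag_of_words(REVIEWS)
print(vocab)
for review, row in zip(REVIEWS, matrix):
    print(row, sentiment_score(review))
```

Each row of the matrix counts how often each vocabulary term appears in one document, ignoring word order. The sentiment score is simply the sum of the lexicon values found in the text, so it cannot handle negation or sarcasm.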


The fee is £35 per teaching day for students / £75 per teaching day for staff working for academic institutions, Research Councils and other recognised research institutions, registered charity organisations and the public sector / £250 per teaching day for all other participants. In the event of cancellation by the delegate, a full refund of the course fee is available up to two weeks prior to the course. No refunds are available after this date. If it is no longer possible to run a course due to circumstances beyond its control, NCRM reserves the right to cancel the course at its sole discretion at any time prior to the event. In this event, every effort will be made to reschedule the course. If this is not possible or the new date is inconvenient, a full refund of the course fee will be given. NCRM shall not be liable for any costs, losses or expenses that may be incurred as a result of its cancellation of a course, including but not limited to any travel or accommodation costs. The University of Southampton’s Online Store T&Cs also continue to apply.

Website and registration:


South West


Explanatory Research and Causal analysis, Experimental Research, Quasi-Experimental Research, Evaluation Research, Behavioural Research, Hypothesis testing research, Intervention studies, Qualitative Interviewing, Quantitative Data Handling and Data Analysis, Regression Methods, Quantitative Approaches (other), Stata, Research Ethics, Evidence-Based Policy and Practice, Conference Posters and Presentations, Research Skills, Communication and Dissemination (other)

