Introduction to Machine Learning with Scikit Learn in Python - online
Date:
03/12/2025
Organised by:
University of Southampton
Presenter:
Dr Sam Mangham and Dr Edward Parkinson (subject to change based on availability)
Level:
Intermediate (some prior knowledge)
Contact:
Penny White
NCRM Centre Manager
p.c.white@southampton

Venue: Online
Description:
A one-day introduction to machine learning using Scikit Learn in Python. Learners will be introduced to several machine learning techniques, including regression, clustering, dimensionality reduction, and neural networks. The course also includes a brief overview of the ethics and implications of machine learning.
The course covers:
Introduction to machine learning
Regression
Introducing Scikit Learn
Clustering with Scikit Learn
Dimensionality reduction
Neural networks
Ethics and implications of machine learning
By the end of the course participants will:
- Gain an overview of what machine learning is and the techniques available.
- Understand how machine learning and artificial intelligence differ.
- Be aware of some caveats when using machine learning.
- Apply linear regression with Scikit-Learn to create a model.
- Measure the error between a regression model and input data.
- Analyse and assess the accuracy of a linear model using Scikit-Learn’s metrics library.
- Understand how more complex models can be built with non-linear equations.
- Apply polynomial modelling to non-linear data using Scikit-Learn.
- Use two different supervised methods to classify data.
- Learn about the concept of hyper-parameters.
- Learn to validate and cross-validate models.
- Understand the difference between supervised and unsupervised learning.
- Identify clusters in data using k-means clustering.
- Understand the limitations of k-means when clusters overlap.
- Use spectral clustering to overcome the limitations of k-means.
- Recall that most data is inherently multidimensional.
- Understand that reducing the number of dimensions can simplify modelling and allow classifications to be performed.
- Apply Principal Component Analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE) to reduce the dimensions of data.
- Evaluate the relative performance of PCA and t-SNE in reducing data dimensionality.
- Understand the basic architecture of a perceptron.
- Be able to create a perceptron to encode a simple function.
- Understand that layers of perceptrons allow problems that are not linearly separable to be solved.
- Train a multi-layer perceptron using Scikit-Learn.
- Evaluate the accuracy of a multi-layer perceptron using real input data.
- Understand that cross validation allows the entire data set to be used in the training process.
- Consider the ethical implications of machine learning, in general, and in research.
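As a rough illustration of the perceptron outcomes listed above, the sketch below encodes a simple function (logical AND) with a single perceptron in plain Python. It is not taken from the course materials: the function names, the weights and the bias are assumptions chosen by hand for this example, and nothing here is learned from data.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum of the inputs plus the bias is positive, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


# Hand-picked (not trained) parameters, assumed for illustration only.
# With these values the sum is positive only when both inputs are 1.
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5


def logical_and(a, b):
    """Encode logical AND as a single perceptron."""
    return perceptron([a, b], AND_WEIGHTS, AND_BIAS)


if __name__ == "__main__":
    # Print the truth table produced by the perceptron.
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", logical_and(a, b))
```

A single perceptron like this can only draw one straight decision boundary, which is why functions such as XOR need more than one layer.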
IMPORTANT: Please note that this course includes computer workshops. Before registering, please check that you will be able to access the software noted below. Please bear in mind the minimum system requirements to run the software, and any administration restrictions imposed by your institution or employer which may block the installation of software.
Pre-requisites:
A basic understanding of Python. You will need to know how to write a for loop and an if statement, use functions and libraries, and perform basic arithmetic. The ‘Introduction to Software Development’ course covers sufficient background.
Setup Instructions:
You will need a terminal, Python 3.8+, and the ability to create Python virtual environments.
To install Python, follow the Beginner’s Guide or head straight to the download page.
You will need the Matplotlib, pandas, NumPy, OpenCV and scikit-learn packages.
Create a new directory for the workshop, then launch a terminal in it:
mkdir workshop-ml
cd workshop-ml
Creating a new Virtual Environment
We’ll install the prerequisites in a virtual environment, to prevent them from cluttering up your Python environment and causing conflicts.
To create a new virtual environment (“venv”) called “intro_ml” for the project, open the terminal (Mac/Linux), Git Bash (Windows) or Anaconda Prompt (Windows), and type one of the OS-specific options below:
python3 -m venv intro_ml # mac/linux
python -m venv intro_ml # windows
If you’re on Linux and this doesn’t work, you may need to install venv first. Try running sudo apt-get install python3-venv, then python3 -m venv intro_ml.
Activate environment
To activate the environment, run the following OS-specific commands in Terminal (Mac/Linux) or Git Bash (Windows) or Anaconda Prompt (Windows):
- Windows + Git Bash: source intro_ml/Scripts/activate
- Windows + Anaconda Prompt: intro_ml\Scripts\activate
- Mac/Linux: source intro_ml/bin/activate
Install the prerequisites
pip install numpy pandas matplotlib opencv-python scikit-learn
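Once the install finishes, a short script can confirm that the environment is ready. The sketch below is not part of the official setup instructions; the helper names `missing_packages` and `python_ok` are made up for this example. It uses only the standard library, and checks each package under its import name, which differs from its pip name for OpenCV (`cv2`) and scikit-learn (`sklearn`).

```python
import importlib.util
import sys

# Map each pip package name to the name used in an import statement.
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "opencv-python": "cv2",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages that cannot be found in this environment."""
    return [
        pip_name
        for pip_name, import_name in required.items()
        if importlib.util.find_spec(import_name) is None
    ]


def python_ok(minimum=(3, 8)):
    """Return True if the running interpreter is at least the given version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.8 or newer is required; found", sys.version.split()[0])
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it with the virtual environment activated, so that the check looks at the packages installed in intro_ml rather than in your system Python.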
Cost:
The fee is £60 per teaching day for students registered at a university; £150 per day for staff at academic institutions, Research Councils researchers, public sector staff, and staff at registered charity organisations and recognised research institutions; and £350 per day for all other participants.
In the event of cancellation by the delegate a full refund of the course fee is available up to two weeks prior to the course. NO refunds are available after this date.
If it is no longer possible to run a course due to circumstances beyond its control, NCRM reserves the right to cancel the course at its sole discretion at any time prior to the event. In this event every effort will be made to reschedule the course. If this is not possible or the new date is inconvenient a full refund of the course fee will be given. NCRM shall not be liable for any costs, losses or expenses that may be incurred as a result of its cancellation of a course, including but not limited to any travel or accommodation costs.
The University of Southampton’s Online Store T&Cs also continue to apply.
Website and registration:
Region:
South East
Keywords:
Quantitative Data Handling and Data Analysis, Regression Methods, Machine learning, ICT and Software, Quantitative Software, Python, Technology