Advanced Big Data Analysis and Management Using R - online
Date:
13/10/2025 - 14/10/2025
Organised by:
University of Southampton
Presenter:
Dr Somnath Chaudhuri
Level:
Intermediate (some prior knowledge)
Contact:
Penny White
NCRM Centre Manager
p.c.white@southampton.ac.uk

Venue: Online
Description:
This two-day online course provides advanced training in Big Data Analysis and Management using R, focusing on efficient techniques for processing, managing, and visualizing large datasets. Participants will learn to overcome common challenges in big data workflows, including memory optimization, time series analysis, and geospatial data handling. The course combines theory with hands-on practice, equipping learners with practical skills for real-world data applications.
The course covers:
Introduction to Big Data and R Environment
- Big data concepts
- Challenges of handling Big Data (Memory limits, computational efficiency)
- R/RStudio setup, Package installations
Handling Large Datasets
- Working with large datasets using data.table
- Memory-efficient data wrangling with dplyr
- Working with Out-of-Memory data (disk.frame, ff, feather)
- Best practices for efficient pipelines
Visualization in Big Data Context
- Challenges of visualizing big data
- Exploratory data analysis with ggplot2 and plotly
- Handling large datasets in visualizations (sampling, aggregation, ggforce)
- Overview of Shiny dashboards, with an example
Handling Time Series (Temporal) Data
- Temporal data structures in R (Date, lubridate)
- Time series storage and manipulation (xts, zoo)
- Time series aggregation and decomposition
- Temporal visualization techniques (ggfortify, gghighlight)
- Interactive time exploration with dygraphs
Handling Geospatial Data
- Geospatial vector data structure in R (sf, sp)
- Handling Raster data in R (terra, raster)
- Projections and CRS: understanding EPSG codes and proj4 strings
- Static Maps: ggplot2 with geom_sf, tmap for thematic mapping
- Interactive Maps: leaflet, mapview, and plotly integration
By the end of the course participants will:
- Understand key concepts and challenges of big data analysis in R.
- Efficiently handle and process large datasets using optimized techniques.
- Apply effective visualization methods for exploratory data analysis.
- Manage and analyse time series data.
- Work with geospatial data for mapping and spatial analysis.
- Build streamlined workflows for scalable big data solutions.
Pre-requisites:
Basic programming skills in R.
IMPORTANT: Please note that this course includes computer workshops. Before registering, please check that you will be able to access the software noted below. Please bear in mind the minimum system requirements to run the software and any administration restrictions imposed by your institution or employer, which may block the installation of software.
- Software: Participants will use R and RStudio for hands-on exercises (participants should install R and RStudio before the workshop to ensure smooth participation).
R Version: Latest stable release (R 4.3.x or newer).
RStudio: Recommended version (2023.09 or newer).
- Key R Packages: Pre-install essential packages (list will be provided before the course).
- Internet Access: Required for package installations and data downloads.
Cost:
The fee per teaching day is: £60 per day for registered students / £150 per day for staff at academic institutions, Research Councils researchers, public sector staff, staff at registered charity organisations and recognised research institutions / £350 per day for all other participants.
In the event of cancellation by the delegate a full refund of the course fee is available up to two weeks prior to the course. No refunds are available after this date.
If it is no longer possible to run a course due to circumstances beyond its control, NCRM reserves the right to cancel the course at its sole discretion at any time prior to the event. In this event every effort will be made to reschedule the course. If this is not possible or the new date is inconvenient a full refund of the course fee will be given. NCRM shall not be liable for any costs, losses or expenses that may be incurred as a result of the cancellation of a course.
The University of Southampton’s Online Store T&Cs also continue to apply.
Website and registration:
Region:
South East
Keywords:
Exploratory Research, Secondary Analysis, Interdisciplinary and Multidisciplinary Research, Data Management, Nonresponse, Descriptive Statistics, Statistical Theory and Methods of Inference, Spatial Data Analysis, Time Series Analysis, Quantitative Software, Data Visualisation