Big Data and Data Visualisation using R

Date:

27/06/2017 - 29/06/2017

Organised by:

University of East Anglia

Presenter:

Dr Jibonayan Raychaudhuri & Dr Jack Fosten

Level:

Intermediate (some prior knowledge)

Contact:

To book, please e-mail SSF.AdvancedTraining@uea.ac.uk (deadline for bookings is Friday 28th April 2017). For academic/content enquiries, please e-mail simon.d.watts@uea.ac.uk in the first instance.

Map:

NR4 7TJ

Venue:

University of East Anglia,
Norwich Research Park,
Norwich.

Description:

This is a two-part course aimed at training students in the latest techniques in data analysis. The first part of the course, taught by Dr Jibonayan Raychaudhuri, is designed as an introduction to data visualisation techniques using the R programming language. This is a “self-contained” course in which we will first learn how to import (well-formatted) data into R. Then we will learn how to draw graphs using the base R graphics package. Next we will look at lattice, an R package which improves on the base R graphics package by providing an easy way of displaying multivariate relationships. Then we will learn about the ggplot2 package, a plotting system for R based on the grammar of graphics, which provides a powerful model that makes it easy to produce complex multi-layered graphics. Participants should have access to a computer with administrative rights, as this course is meant to be interactive, and are encouraged to learn by working through exercises. Ideally, participants should already have R and RStudio installed on their machines.

The second part of the course, taught by Dr Jack Fosten, is an introduction to statistical methods used in analysing high-dimensional economic data, or ‘big data’. Economic researchers now often have access to data on hundreds or thousands of economic variables, making it difficult to specify an informative predictive model. This course moves beyond standard regression techniques such as Ordinary Least Squares (OLS), which break down in the face of big data, and will cover topics such as stepwise selection, penalised regression (Ridge and LASSO), and factor models, all of which are sometimes referred to as ‘machine learning’ methods. This will enable econometric models to be built on robust statistical grounds in situations where there may be many more variables than there are sample points. The course will consist of theory sessions in the mornings and computer lab sessions in the afternoons, which will use R. On the first day, the theory and lab sessions will mostly cover microeconomic applications, whereas the second day will be geared towards time series applications.

Cost:

PGR students from Universities of East Anglia; Essex; Kent; Surrey; Sussex; Reading; Royal Holloway; Goldsmiths; Roehampton; & City University; PGR students from all other institutions = £30; Early-career researchers/academics = £60

Website and registration:

Region:

East of England

Keywords:

Interdisciplinary and Multidisciplinary Research, Econometrics, Data Visualisation, Economics, Big data
