Applied Data Science in R

Date:

17/02/2016 - 19/02/2016

Organised by:

Mind Project Limited

Presenter:

Simon Walkowiak MBPsS

Level:

Entry (no or almost no prior knowledge)

Contact:

Mind Project Ltd, 0203 322 3786, info@mindproject.co.uk

Venue:

London Mathematical Society, De Morgan House, 57-58 Russell Square, London, WC1B 4HS.

Description:

1. Course description.

 

This course introduces participants to the basic concepts of data analysis in the R environment. More specifically, participants will learn how to input different types of data; prepare, transform and manage datasets and their variables; export and import data files; create simple graphical representations of the data (bar plots, histograms, box plots etc.); run basic statistical tests (e.g. correlations and t-tests); obtain descriptive statistics from a dataset; and report the results. The course also provides an introduction to regression analyses and ANOVAs. Methods of data visualisation will be presented for each statistical test.

Throughout the course, attendees will cover the following topics:

  • R environment: what is R? Starting the R environment; Basic settings and functions; Introduction to IDEs, e.g. RStudio,
  • Mathematical functions and control flow operators: R-related help and support; Installing and running third-party packages,
  • R data structures: creating scalars, vectors, matrices, arrays, lists and other data objects in R; Creating simple data frames,
  • Data input and export: adding/deleting observations; Sampling; Flagging/identifying specific cases based on conditional search; Sorting cases; Adding/editing value and variable labels; Dealing with missing data; Reshaping data from long/narrow into wide formats,
  • Exploratory Data Analysis: inspecting the structure of data objects; Cross-tabulations and descriptive statistics (measures of central tendency and dispersion); Vertical/horizontal merging of data frames and other R objects; Basic EDA plots: histograms, density plots, scatterplots, box plots, bar plots, line graphs etc.,
  • Tests of differences and correlations: testing normality assumptions (QQ plots, density plots and test-specific normality measures); One-sample, matched-sample and independent t-tests; Correlations and simple regressions; Test-specific visualisation functions/packages; Effect size and power estimation,
  • Data modelling: ANOVA and multiple regressions; Testing for normality assumptions in GLMs; Introduction to logistic and Poisson regressions; Model optimisation techniques in R,
  • Introduction to data visualisations: creating informative data visualisations using R core and third-party packages; Using graphical parameters for adding/editing text, titles, lines, fonts, colours, axes, background and other elements of plots; Introduction to ggplot2 syntax and an rCharts example,
  • Creating a simple data product with R: data cleaning, EDA, data management, data "crunching" and analysis, data visualisation, model optimisation and debugging.

 

2. Programme. 

 

The course will run for three days (Wednesday to Friday) between 9:00am and 5:00pm and will alternate lecture-style presentations with practical tutorials. The example datasets used during tutorial sessions will come from the social sciences, psychology and business; however, the contents may vary depending on the specific interests of participants (based on the Participant's Skills Inventory). There will be two 15-minute coffee/tea breaks and one 1-hour lunch break on each day of the course.

 

3. What is included?

 

In addition to the course itself, Mind Project will provide participants with the following:

 

  • a digital Course Manual (on a USB memory stick) including all presentation slides, the R code used in the course and a list of reference books and online resources,
  • additional home exercises and all data sets available to download,
  • coffee/tea breaks with light refreshments and lunch,
  • Wi-Fi access,
  • a Central London location at the historic London Mathematical Society on Russell Square,
  • networking opportunities,
  • Mind Project course attendance certificate.

 

4. Further instructions.

 

  • Participants are required to have the most recent versions of R and RStudio installed on their personal laptops (any operating system). As R is a free environment, you can download it directly from the www.r-project.org website, and RStudio is available at https://www.rstudio.com/products/rstudio/#Desktop. Please contact us should you have any questions or issues with the installation process. No specific R packages need to be installed before the course (the course tutors will explain package installation during the training).
  • No prior knowledge of R is required from participants enrolling on this course; however, a keen interest in data analysis is assumed.

 

  • Participants are encouraged to complete the online Participant's Skills Inventory available at http://mindproject.co.uk/skillsinventory.html so that Mind Project and our course tutors can tailor the contents of the course to participants' level of knowledge and areas of interest. The data obtained through the Participant's Skills Inventory will be kept fully confidential and will only be used to provide high-quality data analysis training.
  • By purchasing a place on one of our courses you agree to the Terms and Conditions available at http://mindproject.co.uk/trainingterms.html. Please read the Terms and Conditions before making a booking.

 

Should you have any questions, please contact Mind Project Ltd at info@mindproject.co.uk or by phone on 0203 322 3786 or 07581 669 359. Please visit the course website at http://mindproject.co.uk/applied-data-science-london-feb16.html.

Cost:

£475 + VAT (£570) per person for the whole course (normal fee).
£325 + VAT (£390) per person for the whole course for UK registered undergraduate and postgraduate students, and representatives of registered charitable organisations (discounted fee).

Region:

Greater London

Keywords:

Quantitative Data Handling and Data Analysis, Descriptive Statistics, Correlation, Effect size, Levels of measurement, Variance estimation, Statistical Theory and Methods of Inference, Probability theory, Power analysis, Parametric statistics, Non-parametric statistics, Regression Methods, Quantitative Software, R, Data Visualisation
