Text Mining in R

Date:

02/07/2024 - 03/07/2024

Organised by:

Royal Statistical Society

Presenter:

Jumping Rivers Tutor

Level:

Intermediate (some prior knowledge)

Contact:

training@rss.org.uk

Map:

View in Google Maps  (EC1Y 8LX)

Venue:

Online

Description:

Want to learn how to get the most out of text data? Much of the data produced today contains unstructured text, which can be difficult to transform and analyse without the right knowledge and tools. This virtual course runs over two afternoons and will teach you the basics of manipulating and transforming text data, as well as how to extract meaning and sentiment in R using packages such as {stringr} and {tidytext}.

Topics Covered
  • Appreciating the benefits of text data
  • Cleaning and extracting text with {stringr} and regular expressions
  • Transforming and mining text with {tidytext}
  • Analysing the sentiment of text
  • Understanding the content of a text with TF-IDF
Learning Outcomes


By the end of the course, participants will be able to…

  • clean, manipulate, and transform text data with {stringr}
  • use basic regular expressions to extract and remove patterns in text
  • convert unstructured text data into a tidy format suitable for analysis with {tidytext}
  • understand basic text mining concepts, such as tokenization, stop words, n-grams, lemmatization and more
  • create beautiful plots of text data
  • analyse the sentiment of a piece of text and compare sentiment across texts and over time
  • extract representative words of a text to classify its content
Knowledge Assumed

This course assumes basic familiarity with R and the {tidyverse}. We recommend first attending our Introduction to R course if you want to get up to speed for this course!

Cost:

£427.20 to £592.80 (including VAT)

Website and registration:

Region:

Greater London

Keywords:

Quantitative Data Handling and Data Analysis, R, Text Mining

Related publications and presentations:

Quantitative Data Handling and Data Analysis
R
