Text Mining in R
Date:
02/07/2024 - 03/07/2024
Organised by:
Royal Statistical Society
Presenter:
Jumping Rivers Tutor
Level:
Intermediate (some prior knowledge)
Contact:
Description:
Want to learn how to get the most out of text data? Much of the data produced today contains unstructured text, which can be difficult to transform and analyse without the right knowledge and tools. This virtual course runs over two afternoons and will teach you the basics of manipulating and transforming text data, as well as how to extract meaning and sentiment in R, using packages such as {stringr} and {tidytext}.
- Appreciating the benefits of text data
- Cleaning and extracting text with {stringr} and regular expressions
- Transforming and mining text with {tidytext}
- Analysing the sentiment of text
- Understanding the content of a text with TF-IDF
By the end of the course, participants will be able to…
- clean, manipulate, and transform text data with {stringr}
- use basic regular expressions to extract and remove patterns in text
- convert unstructured text data into a tidy format suitable for analysis with {tidytext}
- understand basic text mining concepts, such as tokenisation, stop words, n-grams, lemmatisation and more
- create beautiful plots of text data
- analyse the sentiment of a piece of text and compare sentiment across texts and over time
- extract representative words of a text to classify its content
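As a taste of the outcomes above, here is a minimal sketch rather than actual course material. It assumes the {dplyr}, {stringr} and {tidytext} packages are installed; the `reviews` data frame and its contents are invented purely for illustration.

```r
library(dplyr)
library(stringr)
library(tidytext)

# Hypothetical toy corpus: two short "documents" (invented for illustration)
reviews <- data.frame(
  doc  = c("a", "b"),
  text = c("Session 2 was  brilliant: the tutor explained regex clearly!",
           "The venue was cold and the coffee was cold too."),
  stringsAsFactors = FALSE
)

tidy_words <- reviews %>%
  # Clean with {stringr}: drop digits via a regular expression, squeeze whitespace
  mutate(text = str_squish(str_remove_all(text, "\\d+"))) %>%
  # Tokenise into one lower-cased word per row (the tidy text format)
  unnest_tokens(word, text) %>%
  # Remove common stop words using the lexicon shipped with {tidytext}
  anti_join(stop_words, by = "word")

word_tfidf <- tidy_words %>%
  # Count each word within each document
  count(doc, word, sort = TRUE) %>%
  # Add tf, idf and tf_idf columns
  bind_tf_idf(word, doc, n) %>%
  arrange(desc(tf_idf))

print(word_tfidf)
```

Printing `word_tfidf` lists the words most distinctive to each document first; a word that occurs in every document receives a TF-IDF of zero.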
This course assumes basic familiarity with R and the {tidyverse}. We recommend first attending our Introduction to R course if you want to get up to speed for this course!
Cost:
£427.20 to £592.80 (including VAT)
Website and registration:
Region:
Greater London
Keywords:
Quantitative Data Handling and Data Analysis, R, Text Mining