Name: Text Learning Workshop
Start: 2017-04-24
End: 2017-04-25
Location: London School of Economics, Houghton Street, London

Text Learning Workshop

Date:

24/04/2017 - 25/04/2017

Organised by:

London School of Economics and Political Science

Presenter:

Professor Kenneth Benoit

Level:

Intermediate (some prior knowledge)

Contact:

Esti Sidley, 0207 955 6947, e.sidley@lse.ac.uk

Location:

WC2A 2AE

Venue:

London School of Economics, Houghton Street, London

Description:

Text Analysis Using R
On 24-25 April we will offer two days of training workshops on text analysis using R. The focus is on a mix of the foundations of working with texts in R and the specific use of the quanteda (http://quanteda.io) package, developed by Kenneth Benoit and a team of collaborators at the LSE.

Who May Participate
Anyone may apply to participate, but priority will be given in the following order:

* PhD students
* early career academics
* other academics
* everyone else

Applicants should have some prior experience of programming in R and in text analysis, although the first day is pitched at an introductory level.

The application form for this workshop can be found at https://docs.google.com/forms/d/1-QlxJAPBkFJbVJji5w8lBC1oHPy7tNEGJtQptVvE7XA/edit. Once your application has been approved, we will send you a link to register. We will book travel and accommodation for applicants only once they have registered for the workshop.

The closing date for applications is Wednesday 22nd March and registrations will close on Friday 31st March.

Financial Support

The workshop is free to attend, and we will also cover the cost of travel and accommodation up to £300. If you provide us with the details of your requirements, we will book flights and accommodation directly. Lunch and refreshments will be provided on both days, and there will be a reception on the evening of 24th April. Breakfast will be provided on the morning of 25th April for those who stay overnight on the 24th. We will only cover accommodation for the night of 24th April. If you require additional nights, we can book them for you, but you will be responsible for the additional costs.

Schedule
Day 1: Introduction to Text Analysis Using R (24 April): 1:30pm - 6pm (commencing with lunch from 12:30pm)

We will cover how to format and input source texts, how to structure their metadata, and how to prepare them for analysis. This includes common tasks such as tokenisation (including constructing ngrams and "skip-grams"), removing stopwords, stemming words, and other forms of feature selection. We will show how to get summary statistics from text, search for and analyse keywords and phrases, analyse text for lexical diversity and readability, detect collocations, apply dictionaries, and measure term and document associations using distance measures. Our analysis covers basic text-related data processing in base R, but relies mostly on the quanteda package (https://github.com/kbenoit/quanteda) for the quantitative analysis of textual data.
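
For readers unfamiliar with the preparation steps named above, the following is a minimal, language-neutral sketch of tokenisation, stopword removal, ngrams and skip-grams. It is written in plain Python for illustration only and is not the quanteda API; the function names, the tiny stopword list and the example sentence are all assumptions made for this sketch.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are far longer).
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is"}


def tokenise(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Drop every token that appears in the stopword set."""
    return [t for t in tokens if t not in stopwords]


def ngrams(tokens, n=2):
    """Contiguous n-grams, with the member tokens joined by an underscore."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skip_bigrams(tokens, max_skip=1):
    """Token pairs separated by 0..max_skip intervening tokens."""
    pairs = []
    for i, left in enumerate(tokens):
        for gap in range(max_skip + 1):
            j = i + 1 + gap
            if j < len(tokens):
                pairs.append(f"{left}_{tokens[j]}")
    return pairs


tokens = remove_stopwords(tokenise("The cat sat on the mat"))
print(tokens)                # ['cat', 'sat', 'on', 'mat']
print(ngrams(tokens, n=2))   # ['cat_sat', 'sat_on', 'on_mat']
print(skip_bigrams(tokens))  # ['cat_sat', 'cat_on', 'sat_on', 'sat_mat', 'on_mat']
```

With `max_skip=0` the skip-gram function reduces to ordinary bigrams, which is one way to see skip-grams as a generalisation of ngrams.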

Day 2: Advanced Text Analysis Using R (25 April): 9am - 5pm (with coffee and refreshments at the start)

This day will cover more advanced methods of text analysis using R, including how to pass the structured objects from quanteda into other text analytic packages for topic modelling, latent semantic analysis, regression models, and other forms of machine learning.
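
As a rough illustration of the kind of structured object that such downstream methods consume, the sketch below builds a small document-feature matrix (one row per document, one column per feature, cells holding counts). It uses plain Python rather than quanteda, and the function name `build_dfm` and the two example documents are assumptions made for this sketch.

```python
from collections import Counter


def build_dfm(docs):
    """Return (features, matrix): a sorted feature list and one count row per document."""
    # Whitespace tokenisation keeps the sketch short; real pipelines tokenise more carefully.
    counts = [Counter(doc.lower().split()) for doc in docs]
    features = sorted(set().union(*counts))
    # A Counter returns 0 for features that a document does not contain.
    matrix = [[c[f] for f in features] for c in counts]
    return features, matrix


features, matrix = build_dfm(["text analysis in r", "text analysis with text"])
print(features)  # ['analysis', 'in', 'r', 'text', 'with']
print(matrix)    # [[1, 1, 1, 1, 0], [1, 0, 0, 2, 1]]
```

A matrix of this shape is the usual input to topic models, latent semantic analysis and regression-style classifiers, which is why it serves as a convenient hand-off point between packages.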

An illustrative example of a previous workshop can be viewed at https://github.com/kbenoit/ITAUR.

This workshop is supported by European Research Council grant ERC-2011-StG 283794-QUANTESS and the Social and Economic Data Science Unit at the LSE.

Cost:

FREE

Region:

Greater London

Keywords:

Textual Analysis, R, tokenisation, feature selection, constructing ngrams, removing stopwords, topic modelling, latent semantic analysis, regression models, machine learning
