Introduction to Text Processing and Natural Language Processing for Social Scientists
Date:
27/10/2017
Organised by:
NCRM, University of Southampton
Presenter:
Dr Juan Grigera, UCL Institute of the Americas
Level:
Entry (no or almost no prior knowledge)
Contact:
Dr Juan Grigera
j.grigera@ucl.ac.uk
Description:
This one-day, entry-level workshop is an introduction to basic Text Processing and Natural Language Processing (NLP) techniques. It is aimed at academics and anyone beginning to work on the topic, particularly those coming from the humanities and the social sciences.
A quick survey of Text Processing will present different techniques for dealing with digital text and provide tools and concepts for building corpora. This will include web scraping, OCR and regular expressions.
NLP is a general term for computational methods that process human language (i.e. natural language, as opposed to ‘artificial’ programming languages, which have a strict syntax and semantics). The course will include a conceptual presentation of the tools and aims to showcase both the theoretical issues and the practical possibilities of NLP. It will focus mainly on the parsing and understanding of natural languages and will survey the available tools (both ready-made ones and those available for use with R, Python and Java).
The course covers:
- Basic text processing techniques: web scraping, OCR, regular expressions
- NLP: a brief history of the field and of the basic achievements and techniques of the structuralist phase (including concordance, dispersion plots, bigrams, collocations, frequency distributions, etc.)
- Text Analysis: segmentation and tokenization, regular expressions, chunking, part-of-speech tagging, lemmatization, folding and stemming
- Conceptual problems (word sense disambiguation, pronoun resolution and coreference, textual entailment)
- Named Entity Recognition
- Topic Models
- Automatic classification
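As a rough illustration of a few of the techniques listed above (tokenization with a regular expression, frequency distributions, bigrams and a simple concordance), here is a minimal Python sketch that uses only the standard library. The sample text and the helper name concordance are invented for illustration and are not taken from the course materials; dedicated NLP toolkits offer far more complete versions of each step.

```python
import re
from collections import Counter

# Hypothetical sample text, invented for illustration only.
text = "The cat sat on the mat. The cat saw the dog."

# Tokenization: lowercase the text, then keep runs of letters and apostrophes.
tokens = re.findall(r"[a-z']+", text.lower())

# Frequency distribution: how often each token occurs.
freq = Counter(tokens)

# Bigrams: pairs of adjacent tokens, and how often each pair occurs.
bigrams = list(zip(tokens, tokens[1:]))
bigram_freq = Counter(bigrams)


def concordance(tokens, word, width=2):
    """Return each occurrence of `word` with up to `width` tokens of context per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == word:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append(" ".join(left + [tok] + right))
    return lines


print(freq.most_common(2))
print(bigram_freq.most_common(1))
print(concordance(tokens, "cat"))
```

On this toy text the most frequent token is "the" (four occurrences) and the most frequent bigram is ("the", "cat"). A real corpus would need more careful tokenization, for example for hyphenated words, numbers and non-English characters.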
By the end of the course participants will have learned about:
- Basic Text Processing techniques
- Different approaches to NLP
- A sample of the techniques available
- The possible uses of NLP in Big Data and text analysis
Start: 10:00 End: 15:00
Cost:
Attendance is free of charge, but registration is strictly required. For questions on eligibility or suitability, please refer to the entry on Participants above or contact Dr Juan Grigera at j.grigera@ucl.ac.uk.
Website and registration:
http://www.ucl.ac.uk/americas/people/academic-staff/juan-grigera
Region:
Greater London
Keywords:
ICT and Software, Natural Language Processing, Web Scraping, Digital Text
Related publications and presentations: