Towards a social media science: Tools and methodologies

Principal Investigator: David Weir, University of Sussex

Co-Investigator: Jamie Bartlett (Demos)

Project duration: 1 April 2013 - 30 September 2014

The explosion of social media has created an unprecedented research opportunity for social scientists. Social media present a digital tableau of society-in-motion: of people arguing, condemning, joking, influencing. The growth of these digital spaces has coincided with the emergence of a family of tools, ‘big data analytics’, that can make sense of them. Harnessing social media data as behavioral evidence using these tools could bring about a step change in the social sciences.

However, deriving meaning from these messy, contradictory data sets requires the development of new methodologies across the research cycle. The social sciences are concerned with collecting unbiased, representative samples; with analyzing data in ways that reflect and appreciate social reality; and with constructing more general explanatory theories, all within an ethical frame that protects the subjects of the research. Current methods do little more than offer raw, descriptive enumeration of social media phenomena drawn from samples of convenience, with little additional interpretative enquiry or context. This project is designed to address these problems by creating an overall research system that can garner insights from Twitter that satisfy the standards of evidence demanded by social science. To do this, it proposes to create new methodologies at each point of the research cycle:

Collection: Samples from Twitter are created by storing tweets that contain specified search words. Our sampling method will discover the search terms that statistically co-occur with on-topic tweets, and filter out tweets that are irrelevant. The system ‘cascades’: it is a constantly refreshing, statistically grounded method of sampling that accommodates changes in how topics are discussed.
Analysis & Interpretation: Automated sentiment analysis classifies tweets in great volumes into categories of meaning. We will create categories that are informed both by standing social science theory and by the data itself, in order to draw meaningful conclusions and inferences from the data.
Ethics: Social media science entails a number of moral hazards that are not answered by conventional ethical frameworks. We will construct a new framework of social media science ethics on the basis of current public attitudes and possible harms to research subjects, and propose ways to measure and minimize those harms.
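
As an illustration of the collection step described above, the following is a minimal sketch of how candidate search terms might be scored by their statistical association with on-topic tweets. It is not the project's method: the smoothed log-odds score, the function names (`tokenize`, `term_scores`, `suggest_terms`) and the tiny hand-labelled sample are all assumptions made for this example, and it presumes that a set of tweets has already been marked as on-topic or off-topic.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?:;\"'()"


def tokenize(text):
    """Lowercase a tweet and split it into words, dropping edge punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def term_scores(on_topic, off_topic, smoothing=1.0):
    """Score each term by a smoothed log-odds of appearing in on-topic tweets.

    Illustrative statistic only (an assumption, not the project's measure).
    Each term is counted at most once per tweet.
    """
    on_counts = Counter(t for tweet in on_topic for t in set(tokenize(tweet)))
    off_counts = Counter(t for tweet in off_topic for t in set(tokenize(tweet)))
    n_on, n_off = len(on_topic), len(off_topic)
    scores = {}
    for term in set(on_counts) | set(off_counts):
        p_on = (on_counts[term] + smoothing) / (n_on + 2 * smoothing)
        p_off = (off_counts[term] + smoothing) / (n_off + 2 * smoothing)
        scores[term] = math.log(p_on / p_off)
    return scores


def suggest_terms(on_topic, off_topic, current_terms, k=3):
    """Return up to k new terms most associated with on-topic tweets."""
    scores = term_scores(on_topic, off_topic)
    candidates = [
        (score, term)
        for term, score in scores.items()
        if term not in current_terms and score > 0
    ]
    # Highest score first; ties broken alphabetically for stable output.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in candidates[:k]]


# Hypothetical hand-labelled sample collected with the search term "budget".
ON_TOPIC = [
    "Budget vote tonight in parliament",
    "parliament debates the budget",
    "budget cuts anger parliament",
]
OFF_TOPIC = [
    "my budget phone broke",
    "cat videos tonight",
]

suggested = suggest_terms(ON_TOPIC, OFF_TOPIC, current_terms={"budget"})
print(suggested)
```

In a refreshing or ‘cascading’ setup, the suggested terms would be added to the search, new tweets collected and labelled, and the scores recomputed, so that the term list can follow changes in how a topic is discussed. How the project itself performs each of these steps is not specified here.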

This will open up the value of social media to many different actors who have a stake in understanding people and society. Through publishing software and findings, and a series of training workshops aimed at both producers and consumers of research, the project investigators aim to spread the practice of rigorous, ethical social media science.

News

Simon Wibberley (University of Sussex) presented a paper 'Language Technology for Agile Social Media Science' at LaTeCH 2013: ACL 2013 workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities.
Carl Miller (Demos) delivered a plenary talk 'Towards social media science' at the Joint Annual Conference of the Government Economic Service and the Government Social Research Service, HM Treasury on 20th September 2013.
Carl Miller presented a paper entitled 'Beyond Big Data: New approaches to social media for intelligence analysis' at International Studies Association 2013, San Francisco.
Project researchers have contributed to the report 'Social media and public policy - What is the evidence?' that was produced for the Alliance for Useful Evidence in September 2013.