
4 - Social media data

Published online by Cambridge University Press:  30 April 2022

Jeff Evans
Affiliation:
Middlesex University
Sally Ruane
Affiliation:
De Montfort University, Leicester
Humphrey Southall
Affiliation:
University of Portsmouth

Summary

Introduction

As the ‘participatory’ Web 2.0 model has supplanted ‘publication’ on the World Wide Web, several rapidly evolving sites and applications, such as Twitter, Facebook, Flickr, Wikipedia and YouTube, have promoted the creation and enabled, to varying extents, the retrieval of increasingly large volumes of user-generated content. Some of these human-made digital artefacts, consisting of text, shared web links, audio, image or video files, are publicly posted, allowing widespread, although seldom free, access to potentially huge volumes of material. Social media data are rarely numerical, but many statistical techniques are now deployed to analyse these newfound sources of ‘Big Data’. Chang and colleagues (2014) have suggested that a ‘paradigmatic shift’ has resulted from these technological advances, leading to a new type of computational social science. This development, which relies largely on quantitative and inductive methodologies, has not been universally welcomed (Fuchs, 2017a; Wyly, 2014). This chapter describes the characteristics of social media data and the methods used to collect and analyse them, and argues that social media data, with their several inherent peculiarities, must be embraced, but approached cautiously, by statistically minded researchers.

Social media big data

Characteristics

Social media datasets are widely accessed and used in government, corporate and academic environments. Applications include the surveillance and monitoring of citizens (Fuchs, 2017b), business brand and reputation management (Grabher and Konig, 2017) and wide-ranging investigations in social and information systems research (Kapoor et al, 2018). Many digital records of human societal interaction, typically sourced from the billions of messages created every day by users of popular online social networks such as Facebook and Twitter, are now accessible. Social media data are time-stamped, allowing temporal sequencing, and individual records are often packaged for access, together with metadata, in one of the ‘semi-structured’ interchange formats of the web, such as XML or JSON, which are not always familiar to statisticians. Some social media data, for example Flickr images or Twitter tweets, hold latitude and longitude coordinates, allowing straightforward mapping of ‘geotagged’ phenomena. Key demographic or address information, including age, sex, street, town or postcode, is not, for privacy reasons, available in social media data, although some variables, such as gender, may be imputed with varying levels of success by examining language usage in text. Exceptionally, where users grant ‘read access’ to third-party social media applications, these variables may become visible to ‘app’ developers.
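For readers less familiar with semi-structured formats, the following minimal Python sketch shows how time-stamped, partly geotagged JSON records might be read, ordered in time and filtered for mapping. The records and the field names (`id`, `created_at`, `text`, `geo`, `lat`, `lon`) are hypothetical and chosen for illustration only; they do not reproduce the schema of any particular platform.

```python
import json
from datetime import datetime

# Hypothetical records: field names and values are illustrative assumptions,
# not the real schema or data of any social media platform.
RAW = '''
[
  {"id": "a1", "created_at": "2019-03-02T14:05:00+00:00", "text": "Second post",
   "geo": {"lat": 51.5074, "lon": -0.1278}},
  {"id": "a2", "created_at": "2019-03-01T09:30:00+00:00", "text": "First post",
   "geo": null},
  {"id": "a3", "created_at": "2019-03-03T18:45:00+00:00", "text": "Third post",
   "geo": {"lat": 50.7989, "lon": -1.0912}}
]
'''


def load_records(raw):
    """Parse the JSON text and return the records sorted by timestamp."""
    records = json.loads(raw)
    for record in records:
        # Convert the ISO 8601 string into a datetime so records can be ordered.
        record["created_at"] = datetime.fromisoformat(record["created_at"])
    return sorted(records, key=lambda record: record["created_at"])


def geotagged_points(records):
    """Return (id, latitude, longitude) for records that carry coordinates."""
    return [
        (record["id"], record["geo"]["lat"], record["geo"]["lon"])
        for record in records
        if record.get("geo")  # skip records whose geo field is null or absent
    ]


records = load_records(RAW)
ordered_ids = [record["id"] for record in records]
points = geotagged_points(records)
```

In this sketch only two of the three records carry coordinates, which mirrors the point that just some social media data are geotagged; any mapping or spatial analysis would therefore rest on a subset of the collected records.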

Type: Chapter
Information: Data in Society: Challenging Statistics in an Age of Globalisation, pp. 47-60
Publisher: Bristol University Press
Print publication year: 2019
