Social media platforms are increasingly important channels for public communication, and Twitter in particular is used extensively in various positive capacities in public health. Reference Denecke, Krieck and Otrusina1-Reference Paul and Dredze3 Social media have become important for organizations attempting to spread public health messages, Reference Gesser-Edelsburg and Shir-Raz4 for engaging with the public, Reference Chou, Prestin, Lyons and Wen5 and for understanding the response to those messages. The 2019 Digital Media Report found that 25% of Americans mainly discover news via social media, and over 20% of adults in a selection of Western countries use Twitter each week. 6 Not only are these proportions significant, but the share of the population using and seeking information from newer platforms also seems likely to continue to grow.
The World Health Organization (WHO) is an international organization that provides leadership and builds partnerships in international health, and along with other key functions, helps set standards and norms for the international community. 7 One such partnership building and norm setting program run by the WHO is the Vaccine Safety Network (VSN), “a global network of websites … that provides reliable information on vaccine safety,” with goals that include “communicating vaccine safety information through a diversity of digital channels.” 8 Both the VSN and many of the organizations in the network have Twitter accounts, which were studied. Risk communication is a core need for such organizations, Reference Gesser-Edelsburg, Shir-Raz and Walter9,Reference Dickmann, Abraham and Sarkar10 and this network provides an excellent starting point for looking at such communication on Twitter related to vaccines.
Much previous research on Twitter for public health focuses on the lay public, Reference Bricker11 or experts, rather than organizations. There have been calls to focus on engagement with the public, Reference Linas12 but much of the analysis is on either “following”/“follower” relationships Reference Cavazos-Rehg, Krauss, Grucza and Bierut13 or tweet contents (sentiment analysis, etc), rather than conversation, that is, tweets, replies, and responses. Some studies have considered conversation, at least at smaller scales. Lakon et al. Reference Lakon, Pechmann and Wang14 staged an intervention that looked at conversation structure for small groups of smokers with up to 4 participants.
The current work focuses on engagement and looks at the other half of risk communication – how organizations engage with experts and the lay public. This is particularly important in the polyvocal context faced by health organizations. Reference Gesser-Edelsburg and Shir-Raz4 The discussion of measles, mumps, and rubella (MMR), which the data collected for this analysis focus on, poses particular challenges (noted by Bricker et al. Reference Bricker11 ) that require effective health communication strategies. Reference Vaughan and Tinker15
Communication depends on the medium being used, and conversation on Twitter differs from other types of expert and health education interactions, even compared to other online platforms. Reference Paul and Dredze3,Reference Gesser-Edelsburg and Shir-Raz4 It is a vehicle for experts to communicate with each other and with the public, Reference Gesser-Edelsburg and Shir-Raz4,Reference Lovejoy and Saxton16 not just to broadcast messages. This matters because conversations, rather than the validity or convincingness of the arguments made, are particularly critical for influencing the public. Reference George, Rovniak and Kraschnewski2
Conversation and outreach are also critical given the general decline in trust. Fewer than half of people say they trust even the news institutions that they prefer using. 6 The issue is particularly critical for measles and conversations about vaccines. In the last several years, large outbreaks have occurred in the Netherlands, Sweden, Italy, Japan, and the United States, all in part due to low vaccination rates. Reference Bricker11,17 The recent outbreaks in the Western world stem in large part from vaccine hesitancy. 18 For the MMR vaccine specifically, anti-vaccine sentiment has created gaps in vaccine coverage that led to a spike in cases. Reference Bricker11
Institutions attempting to provide leadership in public health and communication are already attempting to address the lack of trust, and engagement is a key strategy for doing so. Given the above issues of trust in general, and vaccine skepticism specifically, the importance of addressing the public’s concerns and engaging with them, not just broadcasting messages, should be clear. However, changing minds requires interaction, not just exposure or social media engagement as typically measured. Reference Bruns and Stieglitz19-Reference Chretien, Tuck and Simon21 Given the different goals for Twitter usage in medicine and public health, there are different ways of engaging, with different advantages and risks, which can be best accomplished using different types of discourse. Reference Keller, Labrique and Jain20 These will also involve different conversational structures, which this research attempts to analyze.
This paper is the first example in infectious disease research, to our knowledge, that explores the structure of discourse on social media on a large scale by considering the relationship between multiple posts sent by different accounts. The analysis focuses on discussions between public health organizations, experts, and the general public about infectious disease and vaccines, in order to understand conversation structure in historical Twitter data. The analysis considers key risk-communication goals and whether they are being met. In doing so, it applies and validates a methodology for analyzing discourse among users on social media with a novel and widely applicable (if data- and time-intensive) sampling strategy.
Previous work in this vein includes Neiger et al.’s study, which considered local health departments’ use of Twitter. Reference Neiger, Thackeray and Burton22 In addition to sentiment analysis, the authors considered whether the departments were attempting outreach or promotion, albeit by analyzing tweet contents rather than conversational structure itself. More generally, Lovejoy and Saxton Reference Lovejoy and Saxton16 looked at nonprofits. While they considered conversation (or “dialogue”), their assessment was based only on whether organizations used the “reply” feature.
Materials and Methods
Below we provide an outline of the methods used for data analysis. Further details are available in the appendix and the GitHub repository.
Sample
An initial sample was retrieved that included all retrievable tweets sent during the calendar year 2018 by accounts belonging to the 52 organizations in the WHO VSN with Twitter accounts. This included 35 878 tweets, of which 11 454 were retweets of other accounts, and 3869 were non-retweeted interactions with other users.
This sample was used to find tweet threads for the subset that involved interaction (either replies or quote-tweets) by the organization. Any tweets by accounts identified as “Bots” were excluded. This sampling strategy allows the extraction of conversations, rather than typical methods of retrieving tweets that use a hashtag or key term unrelated to the conversational structure. This is conceptually similar to Cogan et al.’s Reference Cogan, Andrews and Bradonjic23 early “complete conversation” approach but is not hobbled by Twitter’s more recent 1% sample limit.
After retrieval, we constructed 2 data sets that allowed us to focus on the question of how organizations were using Twitter, and to compare their usage to a related baseline of conversations. The first data set was conversations involving the VSN Twitter accounts. The second was all conversations where at least 1 tweet in the conversation contained a key word related to measles or vaccinations, described in the appendix. The final analyzed corpus included 3998 threads with a total of 24 192 tweets. This again excludes tweets from accounts identified as bots, discussed further in the appendix.
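The construction of the two data sets can be sketched in a few lines of Python. This is a minimal illustration rather than the code in the project repository: the key word pattern, the `author` and `text` field names, and the function names are all assumptions made for the example, and the actual key word list is the one given in the appendix.

```python
import re

# Hypothetical key word pattern; the real list is defined in the appendix.
KEYWORD_RE = re.compile(r"\b(?:measles|mmr|vaccin\w*)\b", re.IGNORECASE)


def is_relevant(conversation):
    """Return True if any tweet in the conversation contains a key word."""
    return any(KEYWORD_RE.search(tweet["text"]) for tweet in conversation)


def involves_org(conversation, org_accounts):
    """Return True if any tweet was sent by one of the organization accounts."""
    return any(tweet["author"] in org_accounts for tweet in conversation)


def build_datasets(conversations, org_accounts):
    """Split conversations into the two (possibly overlapping) data sets."""
    org_set = [c for c in conversations if involves_org(c, org_accounts)]
    relevant_set = [c for c in conversations if is_relevant(c)]
    return org_set, relevant_set


# Toy input: each conversation is a list of tweet dictionaries.
conversations = [
    [{"author": "org_a", "text": "New report out today"}],
    [{"author": "user1", "text": "Is the MMR vaccine safe?"},
     {"author": "org_a", "text": "Yes, see our fact sheet"}],
    [{"author": "user2", "text": "Nice weather"}],
]
org_set, relevant_set = build_datasets(conversations, {"org_a"})
```

Because a conversation can both involve an organization and contain a key word, the two lists may share conversations, which matches the partial overlap between the data sets described in the text.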
Heuristic Classification Methods
Because of the size of the data set and number of accounts in the data set, it is infeasible to manually classify the accounts. In place of manual classification, a set of heuristics was developed and applied, then manually reviewed to assess accuracy. The 2 heuristics described in the appendix were (1) automated analysis of the websites associated with the account, and (2) key word filters applied to account descriptions. Note that for non-verified accounts, which are the majority of accounts, neither the description nor the websites are verified by Twitter. This means that classification is based on how accounts represent themselves and is what other users see when considering the source, and therefore the authenticity or trustworthiness of a tweet. Unfortunately, Twitter can only provide user data at present, not the data that were extant at the time of the tweet being sent.
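As an illustration of the second heuristic, a key word filter over account descriptions might look like the sketch below. The term lists, the category labels, and the function name are hypothetical and chosen only for the example; the filters actually used are described in the appendix, and the website-based heuristic is not shown.

```python
import re

# Hypothetical term lists; the study's actual filters are in the appendix.
ORG_RE = re.compile(
    r"\b(?:official account|ministry|department|foundation|hospital|institute)\b",
    re.IGNORECASE,
)
EXPERT_RE = re.compile(
    r"\b(?:md|phd|physician|doctor|epidemiologist|nurse|professor|pediatrician)\b",
    re.IGNORECASE,
)


def classify_account(description):
    """Classify an account from its self-reported profile description.

    Organization terms are checked first, so a description matching both
    lists is labeled as an organization. Anything matching neither list
    is treated as a member of the public.
    """
    text = description or ""  # Descriptions may be missing or empty.
    if ORG_RE.search(text):
        return "organization"
    if EXPERT_RE.search(text):
        return "expert"
    return "public"
```

A filter of this kind reflects only how accounts describe themselves, which is why a manual review of accuracy remains necessary.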
Identifying Deceptive Activities and “Bots”
A significant concern in analyzing Twitter conversations is whether the accounts were run by bona fide users or “bots” – a somewhat conceptually unclear category. This is especially true for vaccines, where there have been concerns voiced about astroturfing and evidence that governments attempted to push disinformation. Supporting this concern, research by Broniatowski et al. Reference Broniatowski, Jamison and Qi24 shows that bots have been used, including those involved in government disinformation campaigns. While it is unfortunately impossible to reliably identify all accounts that are being used for nefarious purposes, we used the “Bot or Not” classifier Reference Yang, Varol and Davis25 to assess whether the accounts were in fact bots; we report the scores and removed detected bots from the analysis. Bot detection is an ongoing struggle, and, as it improves, bots adopt new techniques for remaining unidentified.
Theoretical Framework
In this paper, we build on rhetorical structure theory (RST), a formal model introduced by Mann and Thompson Reference Mann and Thompson26 for the systematic analysis of relationships between different text spans, which, it has been suggested, can be applied to Twitter. Reference Sidarenka, Bisping and Stede27 RST provides a topology of conversation and interaction, which is extended here to allow the identification of conversational flow between multiple speakers. This is distinct from the more widely explored topology of follower networks, Reference Kwak, Lee, Park and Moon28 or conversation analyses where conversation is defined by shared hashtag usage. Reference Derczynski, Yang and Jensen29 This novel use of RST allows us to consider how health organizations engage with doctors, experts, and the public, both in general and in the context of recent measles outbreaks and discussions of the measles vaccine. Critically, we can explore whether organizational Twitter use is primarily 1-way communication, broadcasting information without responding to or interacting with other users, or whether there is conversational interaction. In addition, when conversation occurs, we explore differences between intra-expert and expert-organization discussions versus engaging with the broader public either responsively or interactively. Where it is interactive, we further characterize the types of interaction, as explained below and in the methodological appendix.
Note that retweets are not reflected in the conversational structure we consider using RST, since the retweeting account does not itself make a statement. This could be modified by assigning multiple authors to a retweeted statement, but this is misleading given that, as the standard disclaimer notes, retweets are not endorsements.
Research Approach
The purpose of this analysis is to indicate whether the organizations were involved in 1-way (broadcast-type) communication, whether they replied but did not engage, or whether there was the type of more intensive communication that social media allow. In the cases in which there was communication, we wish to understand the form of that conversation.
The analysis performed in this paper focuses on conversation and does not involve content analysis. While traditional content analyses are valuable, such methods struggle to capture context and interaction. Reference Smith and Gallicano30 In contrast, our approach is particularly appropriate for understanding whether and how organizations engaged with the public, especially given large and massive multi-user data sets.
In the first stage, we defined engagement. The term is used differently in public health, in risk communication, and in social media. In public health, engagement can mean participation in an intervention or interacting with health providers. In social media, it is interaction with the content through the user interface – expanding the tweet, liking it, retweeting, or replying. But Smith and Gallicano Reference Smith and Gallicano30 point out the critical difference between interaction with content, mental engagement with the topic, and having an interactive discussion that is critical for risk communication. We are focused on this last form of engagement, which we call conversation. Unfortunately, typical metrics, such as likes and retweets, measure success at broadcasting, and are indications of “reach and agreement” rather than of the engagement relevant for risk communication, that is, conversation.
In the second stage, we applied the definition to build a conceptual framework based on RST, Reference Mann and Thompson26 a descriptive theory from linguistics, for understanding conversation on Twitter to measure engagement. A preliminary analysis considered organizations’ individual tweets to evaluate whether organizations use Twitter for conversation at all, either by responding to or quoting tweets by other users, or whether Twitter was used only for tweeted announcements or dissemination of information from trusted sources by linking or retweeting. Length of conversation is a key factor, but so is structure. Where organizations engaged in any form of interaction, we analyzed these conversations with the adapted RST framework. The analysis categorized conversations into monologues, where the account replied only to itself; reply conversations, where replies were received but no dialogue occurred; dialogues, where 2 or more users replied to one another; and multilogues, where more than 2 users responded within a single discussion.
This analysis is critical because interaction alone is not necessarily sufficient. The typology above therefore differentiates between responsive dialogue, replying to specific tweets, or more intense conversation between multiple users, experts, and/or organizations. Conversation is a critical part of health communication, according to Park et al., Reference Park, Reber and Chon31 and the analysis builds on the approach of Lovejoy and Saxton, Reference Lovejoy and Saxton16 among others.
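The four-way typology described above can be made concrete with a short sketch. This is one possible operationalization, not the adapted RST procedure itself: the field names (`id`, `author`, `reply_to`) are hypothetical, and the boundary between dialogues and multilogues is an assumption of the example, namely that a thread counts as a dialogue when exactly 2 accounts take part and reply to one another, and as a multilogue when such an exchange occurs in a thread with more than 2 accounts.

```python
def classify_thread(tweets):
    """Classify a thread as monologue, reply, dialogue, or multilogue.

    Each tweet is a dict with 'id', 'author', and 'reply_to' (the id of
    the tweet it answers, or None). Retweets are assumed to be excluded.
    """
    author_of = {t["id"]: t["author"] for t in tweets}
    authors = set(author_of.values())
    if len(authors) == 1:
        return "monologue"  # The account replied only to itself.

    # Directed reply edges between distinct accounts: (replier, replied_to).
    edges = set()
    for t in tweets:
        parent = t.get("reply_to")
        if parent in author_of and author_of[parent] != t["author"]:
            edges.add((t["author"], author_of[parent]))

    # A back-and-forth exists if some pair of accounts replied to each other.
    reciprocal = any((b, a) in edges for (a, b) in edges)
    if not reciprocal:
        return "reply"  # Replies were received, but no dialogue occurred.
    return "dialogue" if len(authors) == 2 else "multilogue"
```

Under these assumptions, an organization's tweet that attracts replies it never answers is a reply conversation, while a single answered reply is already enough to make the thread a dialogue.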
The value of conversation for risk communication also depends on participants. Gesser-Edelsburg and Shir-Raz Reference Gesser-Edelsburg and Shir-Raz4 identify several challenges in creating credible discourse on online platforms, which include: “public health experts versus ‘people with agendas’; conflicts of interest; [and] facts/rationality v. emotions/myths.” Reference Gesser-Edelsburg and Shir-Raz4 To overcome these challenges, engagement with the public should be done by clearly identified experts and well-known organizations pointing to verifiable and trustworthy sources. Clear identification can be measured by looking at how experts identify themselves, as discussed below, and reference to trustworthy sources can be considered by looking at inclusion of links in replies, and the sources referenced.
Given this, the third and final stage considered who was participating in conversations. In addition to the 52 organizations of the WHO list, there were a variety of other organizations and individuals that replied, were replied to, or were quoted, and therefore appear in the data set. The organizations include both health care organizations such as hospitals, and non-profits, as well as other organization types, including news, government bodies, and others. The individuals range from experts to the lay public to celebrities and other noteworthy personalities, such as leaders of key organizations and news reporters. Unfortunately, as discussed below and in the appendix, it is often difficult to know which is which, but heuristic methods were used to perform classification. Given this identification, it is possible to see whether conversations were in fact engaging with the public.
To consider this, we present a breakdown of a number of characteristics, including thread type, length, number of participants, and participant types – expert, organizational, or member of the public. User types were based on a heuristic method, details of which are available in the appendix.
Results
The data retrieved were a total of 1 017 176 tweets across the participants in conversations with VSN accounts. This was filtered and partitioned into 2 partially overlapping data sets. The first, the organization data set, was all retrieved conversations directly involving the organizations. This was a total of 1826 conversations involving 6630 tweets.
The second, the relevant conversations data set, was all conversations captured in the tweet set in which a vaccine-relevant key word (defined in the methodology appendix) was used somewhere in the extended conversation, by any user. This was a total of 2427 conversations, with a total of 24 192 tweets; 255 of the conversations, with 1257 tweets, were included in both sets.
Conversational length, in the number of tweets, is a simple measure of engagement, as noted above. A Welch 2-sample, 1-sided t-test shows that the VSN conversations were statistically significantly shorter (t = 7.8984, df = 2433.7, P < 0.0001), with a mean of 3.6 tweets in VSN conversations, compared to a mean of 9.5 tweets for relevant conversations. The distribution of the conversation sizes as a percentage of the total is shown below (Figure 1).
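For readers unfamiliar with the Welch test, the statistic and its degrees of freedom can be computed from two lists of conversation lengths as sketched below. The sketch uses only the Python standard library, the function name is hypothetical, and the input values are made up for illustration; they are not the study's data.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t-test, the two samples are not assumed to share a
    common variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Made-up conversation lengths, for illustration only.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12])
```

Converting the statistic to a P value requires the t distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) performs the whole test in one call.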
Tweet Types
On Twitter, tweets can be replies, retweets, or tweets that are none of these, which we call isolated tweets. They can also be both replies and retweets. For example, a reply tweet can quote a third tweet, or the tweet it is replying to. In the VSN data set, we find that VSN members tweeted or retweeted others a total of 37 587 times. Of these, 7479 were replies, 5053 were quote-tweets, and a further 145 were replies that also quoted a tweet. There was significant variance between accounts, however. For instance, as of the date of retrieval, the total number of tweets sent by the accounts, including those from before the retrieval window, ranged from 24 809 tweets sent, to only 42 (Figure 2).
Note that, while most members rarely replied, some did so heavily. Similarly, only a few accounts ever both replied and quote-tweeted in a single tweet (shown in red), but one did so relatively often.
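The tweet categories used in this section can be expressed as a small classification function. The sketch below assumes each tweet has already been parsed into a dictionary; the field names `is_retweet`, `reply_to`, and `quoted_id` are hypothetical stand-ins rather than the names used by the Twitter API or by the project code.

```python
from collections import Counter


def tweet_type(tweet):
    """Return 'retweet', 'reply+quote', 'reply', 'quote', or 'isolated'."""
    if tweet.get("is_retweet"):
        return "retweet"
    is_reply = tweet.get("reply_to") is not None
    is_quote = tweet.get("quoted_id") is not None
    if is_reply and is_quote:
        return "reply+quote"  # A reply that also quotes a tweet.
    if is_reply:
        return "reply"
    if is_quote:
        return "quote"
    return "isolated"  # Neither a reply, a quote, nor a retweet.


# Toy example: tally the types for one account's tweets.
sample = [
    {"is_retweet": True},
    {"reply_to": 10},
    {"quoted_id": 11},
    {"reply_to": 12, "quoted_id": 13},
    {},
]
counts = Counter(tweet_type(t) for t in sample)
```

Tallying these labels per account gives the kind of breakdown summarized in the figure, including the rare combined reply and quote-tweet category.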
Analysis of Conversation Types
The VSN conversations had a significantly smaller (χ² = 154.93, df = 1, P < 0.0001) proportion of multilogues, 6% compared to 19%. Given the number of relevant conversations and the number of reply conversations, this was not for lack of opportunity. See Table 1.
Notes: The VSN data set contained 6% multilogues, a significantly smaller proportion (χ² = 154.93, df = 1, P < 0.0001) than the relevant conversations data set, which was 19% multilogues.
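A comparison of two proportions of this kind corresponds to a chi-square test on a 2 × 2 table (multilogue or not, by data set). The sketch below shows the calculation using only the Python standard library. The function name and the counts are hypothetical, and the sketch applies no continuity correction; whether the reported statistic used one is not stated here, so the example is not expected to reproduce the reported value.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Rows are the two data sets; columns are multilogue / not multilogue.
    Returns the statistic and the P value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts: 10 of 100 versus 30 of 100 conversations are multilogues.
chi2, p_value = chi2_2x2(10, 90, 30, 70)
```

With these made-up counts the statistic is 12.5, which illustrates how a difference of 20 percentage points in even modest samples yields a small P value.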
The breakdown of these conversations by the user types involved is shown below (Figure 4).
The 2 data sets display a clear difference in the proportion of multilogues. This is consistent with the previous observation: the share of multilogues among VSN conversations was 14 percentage points lower than among relevant conversations.
Question and Answer Frequency
In the data sets, we also looked at questions asked. Table 2 below compares the two corpuses analyzed and shows that VSN members responded to questions in conversations infrequently.
Notes: Questions were found using the method of Li et al., which found that the simple heuristic of detecting question marks in tweets found almost 85% of questions, with a precision of 97%. Applying that method to our data sets, it is possible to identify not only questions, but also the answers, that is, other users’ tweets that replied to the question.
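The question mark heuristic and the matching of answers to questions can be sketched as follows. The field names and the function name are hypothetical, and the step that strips URLs before looking for a question mark is an addition made for this example (query strings in links would otherwise be counted as questions); it is not claimed to be part of the cited method.

```python
import re

URL_RE = re.compile(r"https?://\S+")


def find_questions_and_answers(tweets):
    """Return (questions, answers) for one conversation.

    questions maps tweet id -> tweet for tweets containing a question mark.
    answers maps question id -> list of replies to it by other users.
    """
    questions = {
        t["id"]: t for t in tweets if "?" in URL_RE.sub("", t["text"])
    }
    answers = {qid: [] for qid in questions}
    for t in tweets:
        parent = t.get("reply_to")
        # Only replies from a different account count as answers.
        if parent in questions and t["author"] != questions[parent]["author"]:
            answers[parent].append(t)
    return questions, answers


# Toy conversation: one answered question and one unanswered follow-up.
thread = [
    {"id": 1, "author": "u1", "text": "Is the MMR vaccine safe?", "reply_to": None},
    {"id": 2, "author": "org", "text": "Yes, see https://example.org/info?id=3", "reply_to": 1},
    {"id": 3, "author": "u1", "text": "Thanks", "reply_to": 2},
    {"id": 4, "author": "u1", "text": "Anyone?", "reply_to": 1},
]
questions, answers = find_questions_and_answers(thread)
```

Combining the answer lists with an account classification then shows who supplied the answers, which is the comparison reported in the table.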
On other social media services, there is work showing that questions are common, and answers by experts are sought out. Unfortunately, the organizations best placed to be reliable sources of that information seem not to engage in those conversations; fewer than half of all conversations with questions had answers by VSN members. In the relevant conversations data set, most of the answers, 3450, were from users that were not identifiable organizations or experts. Given the focus on providing information, it seems unfortunate that these organizations are often unresponsive when answers are sought. Perhaps even worse, if it is clear that organizations don’t typically reply to questions, the public will instead seek answers from sources that may not provide authoritative or even correct responses.
Discussion
The organizations whose Twitter use is analyzed in this paper are critical parts of the health education ecosystem, according to the WHO, 8 but their use of Twitter in this context leaves much to be desired.
We looked for conversation and interaction between health organizations, experts, and the public. These did occur but were vastly outnumbered by broadcast-focused use of Twitter. As noted, the share of multilogues among VSN conversations was 14 percentage points lower (12 to 16; P < 0.0001) than among relevant conversations, and VSN members answered fewer than half of the questions asked in conversations that they participated in.
The organizations mostly use Twitter to make announcements. The provision of reliable health information is critical, and these organizations continue to carry out that mission well. Still, many fail to capitalize on social media as a way to interact with the public. As the WHO 32 notes, a key advantage of social media is allowing people to “engage in conversations,” which requires goals beyond “using more visuals, pictures and infographics to simplify information, tell better stories, and create a fast but lasting impact.” 32
The analysis in this paper validates the use of conversation analysis in understanding the usage of Twitter as a social media platform more generally by showing that material differences in usage exist between accounts and between corpuses. Previous work has highlighted the importance of conversation, but large-scale studies used far less direct methods for their analyses based on hashtags and counting replies. Given the existence of this new approach for a more in-depth large scale analysis, the method should be considered as an alternative or complement to content analysis when considering how social media are used and experienced.
The analysis also gives insight into how to better address the recommendations for creating engagement: focus more on replying to those who respond and ask questions, in addition to using hashtags and images or tracking follower counts and retweets. It would be concerning if metrics for replies were to completely crowd out current metrics for social media use in organizations, but a more comprehensive approach that included different metrics could certainly encourage a balance. Reference Manheim33,Reference Sugimoto, Work, Larivière and Haustein34
A secondary point that emerges from this analysis is the way in which health organizations’ lack of coordinated response and engagement with the public has left individual experts with the burden of addressing these issues. While domain experts have performed admirably, and even large parts of the broader public have been helpful in spreading useful health education messages, they are filling a role that health organizations are not.
The method used has 2 key limitations, discussed more completely in the appendix. The first is that retrieval is limited to identified users, which is critical when understanding failure to reply. In the current case, however, the captured replies are sufficient to show the noted lack of engagement.
The second is that the analysis of participants required for conversation analysis relies on effective identification of accounts, and this is challenging to do at scale. However, for the purposes of public understanding, individual experts whose accounts are not clearly identified are not experts. In addition, for organization identification, the set of health education organizations analyzed was specified before data collection began, so the impact was more limited. Still, better methods are required for more complete analyses.
Conclusion
There is a clear difference between the conversations on Twitter that health organizations are involved in and the conversations that discuss vaccines and measles. It seems clear that most organizations are still using Twitter as a 1-way communication channel. This can be addressed by reconsidering the communication strategies used on Twitter and dedicating people and resources to interaction. This goes farther than tweeting more, or even replying and conversing more. Risk communication requires consideration of the context of questions. This will require engagement with the public that extends beyond the health education organizations’ role as dispassionate providers of truth, to greater real engagement with the public.
Despite the great importance that health organizations attach to the use of social media, and despite existing guidelines and accumulated knowledge at the application level, there are still gaps. If organizations want to begin allocating appropriate resources, they can begin to create a continuous conversation with the public. As current trends in misinformation and infodemics make clear, the need is critical and not yet fully addressed. Improving social media risk communication is not a 1-time effort restricted to a single class of misinformation or acute event, but it is part of a continuing need in public health.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/dmp.2020.404
Data Availability Statement
Code is available on GitHub, per the appendix. Data are available from Twitter based on available files on GitHub, or per Twitter terms and conditions, and a full data set is available for academic use on request.
Conflict of Interest
The authors have no conflicts of interest to declare.
Ethical Standards
An institutional review board determined that the data collection and analysis do not constitute human subjects research, and the study was approved.