As social media play an ever-increasing role in contemporary life (Perrin, 2015), understanding their potential contribution to sexual and drug behaviors that pose a risk for HIV is paramount. Prior research has found that the language used in social media communications in a geographical area is predictive of HIV rates in those communities (Ireland, Chen, Schwartz, Ungar, & Albarracín, 2016; Ireland, Schwartz, Chen, Ungar, & Albarracín, 2015; Young, Rivers, & Lewis, 2014). In particular, these articles examined the association between prevalence of HIV and language on Twitter (e.g., tweets about sex and drug use; Young et al., 2014). Online conversations may refer to sex or drug use in an explicit attempt to promote risky behavior or as jokes or memes about sex or drug use that still convey a casual positive attitude towards objectively unsafe practices (Gabarron, Serrano, Wynn, & Lau, 2014). Different message sources (e.g., private Twitter users, commercial organizations, NGOs) may approach the topics in different ways. Models of behavior change have identified several important strategies that apply to the area of HIV-related behaviors (Ajzen, 1991; Albarracín et al., 2005; Fishbein & Cappella, 2006; Fisher & Fisher, 1992), such as changing beliefs and attitudes, establishing knowledge about the steps needed to perform the behavior, and finally initiating the behavior. Correspondingly, some social media messages may try to shape the audience’s behavior through these strategies. This paper thus examined tweets to determine the relative frequency of these messages, including informational strategies to correct myths or misconceptions (changing beliefs), messages to change attitudes in a positive direction (changing attitudes), messages to change attitudes in a negative direction (changing attitudes), messages to teach behavioral skills (changing procedural knowledge), and messages with calls to action (changing behavior).
In addition, the analysis included informal references to risky behavior, and the sources of the messages, drawing on findings that message source is a critical contributor to persuasive messages and behavioral interventions (Albarracín & Glasman, 2016; Albarracín, Kumkale, & Poyner-Del Vento, 2017; Durantini, Albarracín, Mitchell, Earl, & Gillette, 2006; Wilson & Sherrell, 1993). This investigation aimed to answer the question of who is saying what on Twitter. An answer to this question has the potential to inform health professionals who wish to know how HIV is being discussed online or who want to create interventions. This research further suggests what place online health interventions could occupy in the existing social media landscape.
Method
Twitter’s Streaming API was used to collect a random sample of all tweets (140-character short messages) from February 2015 to August 2015, resulting in a total of 234,603,322 tweets (excluding retweets). Twitter was chosen because social groups who are at risk for HIV and other STIs are particularly well-represented on Twitter (26% of people aged 18 to 29, 28% of African-Americans, and 19% of people in urban areas use Twitter; Smith & Brenner, 2012) and because tweets are freely accessible via the API. This research used only data that Twitter users had posted as publicly available tweets and was not considered human subjects research by the Institutional Review Board of our university. Through stratified sampling, tweets in the final subsample were more likely to be related to HIV/AIDS than randomly collected tweets. Specifically, the tweets had to contain one or more keywords from the following nine categories, which are either causal factors in transmission or factors that have been associated with increased prevalence of HIV (e.g., Centers for Disease Control and Prevention, 2017; Ward & Rönn, 2010): sex, drugs/alcohol, HIV/AIDS, other STIs, preventive methods/safe sex, men who have sex with men, full-service sex work, runaway youth, and sexual violence and abuse. The categories were identified by experts in the domain of HIV working on our team, and keywords were generated by looking through existing glossaries (e.g., Aids.gov HIV glossary) as well as slang dictionaries (e.g., urbandictionary.com; 590 keywords total). This keyword-based filter identified 9,639,111 tweets (4% of the random sample of tweets we collected), including 17,554 tweets with HIV/AIDS keywords (0.02%) and 11,956 tweets with other STI keywords (0.01%). The aim was to code about 1,000 tweets, and thus about 112 tweets were coded for each of the nine categories. After excluding duplicates and tweets that were completely or partially non-English, N = 975 tweets remained (examples in Fig 1).
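The keyword filter and the per-category draw described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the keyword lists are hypothetical and heavily abbreviated (the study's 590 keywords are not reproduced here), whole-word case-insensitive matching is an assumption, and the function names are invented for this example.

```python
import random
import re
from collections import defaultdict

# Hypothetical, abbreviated keyword lists (assumption); the study used
# 590 keywords across nine categories.
KEYWORDS = {
    "hiv_aids": ["hiv", "aids"],
    "other_stis": ["chlamydia", "syphilis"],
    "prevention": ["condom", "safe sex"],
}


def compile_patterns(keywords):
    """Build one case-insensitive, whole-word regex per category."""
    return {
        category: re.compile(
            r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
            re.IGNORECASE,
        )
        for category, terms in keywords.items()
    }


def categorize(text, patterns):
    """Return the categories whose keywords occur in the tweet text."""
    return [category for category, pattern in patterns.items() if pattern.search(text)]


def stratified_sample(tweets, patterns, per_category, seed=0):
    """Draw up to `per_category` random tweets from each keyword category."""
    by_category = defaultdict(list)
    for tweet in tweets:
        for category in categorize(tweet, patterns):
            by_category[category].append(tweet)
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {
        category: rng.sample(matches, min(per_category, len(matches)))
        for category, matches in by_category.items()
    }


if __name__ == "__main__":
    patterns = compile_patterns(KEYWORDS)
    tweets = ["HIV testing day", "use a condom", "nothing relevant here"]
    print(stratified_sample(tweets, patterns, per_category=1))
```

With nine categories and a target of about 1,000 coded tweets, `per_category` would be about 112. A tweet that matches several categories can be drawn more than once in this sketch, which is one reason a later duplicate-removal step is needed.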
Four coders were trained to code for a total of 36 variables (see supplemental materials): message sources (e.g., private user), content (e.g., about a person), persuasive strategies (e.g., conveying a negative attitude), and miscellaneous variables (e.g., whether the tweet appeared to be some type of spam, such as an advertisement for a product or for pornography). With the exception of the source variables, all variables were binary and assessed the presence versus absence of a topic or a strategy. Coders were first trained to understand HIV-related terminology and then began reading tweets. The coding scheme was iteratively clarified based on their feedback until intercoder reliability coefficients showed that the coders shared an understanding of the variables. Coders were instructed to follow links, if available, and consider the content of the linked website as additional context information. Intercoder reliability for all variables was adequate: Cronbach’s α > .70 (M = .82) and Fleiss’s κ > .35 (M = .51).
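For readers unfamiliar with the two reliability coefficients, the sketch below computes Cronbach's α (treating the four coders as "items") and Fleiss's multi-rater agreement coefficient (commonly written κ) for one binary variable. It applies the standard textbook formulas; the software the authors used is not stated, and the toy ratings are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha with coders as items.

    `ratings` is a list of tweets, each a list with one numeric code per coder.
    """
    n_coders = len(ratings[0])
    coder_variances = [
        variance([tweet[j] for tweet in ratings]) for j in range(n_coders)
    ]
    total_variance = variance([sum(tweet) for tweet in ratings])
    return n_coders / (n_coders - 1) * (1 - sum(coder_variances) / total_variance)


def fleiss_kappa(ratings):
    """Fleiss's kappa for the same layout (equal number of coders per tweet).

    Undefined (division by zero) if every code falls in a single category.
    """
    n_tweets = len(ratings)
    n_coders = len(ratings[0])
    categories = sorted({code for tweet in ratings for code in tweet})
    counts = [[tweet.count(c) for c in categories] for tweet in ratings]
    # Observed agreement: share of agreeing coder pairs, averaged over tweets.
    observed = sum(
        (sum(n * n for n in row) - n_coders) / (n_coders * (n_coders - 1))
        for row in counts
    ) / n_tweets
    # Chance agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_tweets * n_coders)
        for j in range(len(categories))
    ]
    expected = sum(p * p for p in proportions)
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented toy data: 4 tweets x 4 coders, 1 = strategy present.
    toy = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0]]
    print(cronbach_alpha(toy), fleiss_kappa(toy))
```

On the same toy data the two coefficients differ noticeably (α is about .84, κ about .41), because κ corrects for chance agreement while α increases with the number of coders. This is one reason the two statistics are usually judged against different thresholds.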
Results
All tweets were selected to include words that are relevant to HIV/STIs, but their content was highly diverse. In general, 50% of tweets were about a person, 39% about an attitude or opinion, 29% about a behavior related to a disease or social problem (see Fig 1 for examples), 24% were about a product or service, and 6% were about whether or not one or several people had a disease such as HIV (these categories were not mutually exclusive; see Table 1, column “Overall”). Despite preselecting for terms related to HIV or risk categories, HIV or AIDS was a topic in only 7% of all tweets and the main topic in only 3% of tweets (Table 1, column “Overall”). Correspondingly, terms were often not used with their original meaning. For instance, among the tweets that included names of sexually transmitted infections, 9% used these STI names as an insult, joke, or in another non-medical sense, such as “@Benihana food was decent but you screwed up my rice order. My stomach has aids from the onions I ate” (tweet by user QuiKzZ_COD2), which mirrors prior findings on STIs and jokes on Twitter (Gabarron et al., 2014). Similarly, the risk category vocabulary (especially for sex, sex work, and drugs) was often non-specific because the majority of keywords were used as swear words, jokes, or insults.
Many tweets contained communications that aligned well with effective messages to promote specific behaviors. In terms of coded strategies, about 5% of tweets countered myths and misinformation (e.g., “‘you support gay rights so u must be gay’ i support animal rights do i look like a fucking alpaca to you”, tweet by user TheFunnyTeens), 24% tried to change attitudes in a positive direction (e.g., “Great new HIV home test kit trial by these consultants at Derriford”, user DLapthorne, originally by Plymouth Hospitals @PHNT_NHS; or “The love I have for Kanye West transcends all elements of my middle class white boy persona.”, user Josh Spoelstra @Spollyy), 24% tried to change attitudes in a negative direction (e.g., “This Mother Wants You To See What An HPV Vaccine Injury Looks Like”, user Kelsha LeAnne @KelshaWellness; or “on this boring ass bus”, user Kay @__reddish), and 10% included calls to action (e.g., “National HIV Testing Day is Right around the corner. Come out this Saturday from 2-4PM to get tested at the...”, FVSU PEs OSHCS @PEsFVSU; or “Give ma mixtape a shot.. You will Love it!! 10 Dope tracks, FREE Download!!”, Dom Shatti @DomShatti; Table 1, column “Overall”). Social media landscapes thus involve many persuasion attempts, most of which are attempts to influence attitudes. This result suggests that health campaigns are not out of place in social media, although they might need to prevail in the competition for an audience that has many other messages vying for their attention.
 | Tweet source | | | | | | |
Tweet content | Private user | Expert/researcherᵃ | Commercial company | Organizationᵇ | Overall | Fisher’s exact test p-value | χ²(1) |
Content | | | | | | | |
reference to a person | 43.18% | 0.21% | 3.18% | 3.69% | 50.26% | <.001*** | 24.20*** |
opinion | 37.23% | 0.00% | 0.51% | 1.03% | 38.77% | <.001*** | 101.99*** |
socially/disease-relevant behaviorᶜ | 22.97% | 0.41% | 2.36% | 3.69% | 29.44% | .002** | 0.40 |
product or service | 14.56% | 0.51% | 7.90% | 1.33% | 24.31% | <.001*** | 71.98*** |
organization | 6.87% | 0.21% | 1.44% | 1.33% | 9.85% | .040* | 5.50* |
nonbehavioral reference to disease | 3.59% | 0.31% | 0.41% | 1.23% | 5.54% | <.001*** | 6.63* |
HIV or AIDS | | | | | | | |
as a topic | 4.51% | 0.10% | 0.51% | 1.44% | 6.56% | .002** | 4.16* |
as the main topic | 1.74% | 0.10% | 0.10% | 1.13% | 3.08% | <.001*** | 8.49** |
Strategies | | | | | | | |
attitude change in a positive direction | 20.92% | 0.31% | 1.85% | 1.03% | 24.10% | .005** | 9.60** |
attitude change in a negative direction | 23.08% | 0.00% | 0.10% | 1.13% | 24.31% | <.001*** | 44.59*** |
calls to action | 7.90% | 0.10% | 1.44% | 0.51% | 9.95% | .505 | 0.00 |
countering myths | 4.72% | 0.00% | 0.10% | 0.10% | 4.92% | .036* | 7.25** |
Across all tweets | 79.49% | 0.92% | 11.49% | 8.10% | 100% | | |
Next, message sources were analyzed. Some tweets were advertisements or were delivered by spambots (12%) but the majority (88%) had real users actively involved in the communications (73 tweets were not coded for this variable, N = 902). The majority of tweets (79%) appeared to come from private users (operationalized as accounts that had no indication of being affiliated with an institution; this category included celebrities). Eleven percent of tweets came from companies and 8% from other organizations such as political figures or news outlets. Only 1% of tweets seemed to come from experts or research organizations (such as university press offices; see last row of Table 1).
Finally, message content and employed strategies differed as a function of the source (Table 1). For instance, private users tweeted about people having diseases less than did experts, commercial organizations, and other organizations.
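The text does not describe how the χ²(1) column of Table 1 was computed. One plausible reading, offered here only as an assumption, is a 2×2 comparison of private users against all other sources combined, with Yates's continuity correction. The sketch below reconstructs cell counts from the rounded percentages in Table 1 (assuming N = 975) for the "opinion" row; the function names are illustrative.

```python
import math

N_TWEETS = 975  # coded tweets; assumed to be the base of all Table 1 percentages


def count(percent, n=N_TWEETS):
    """Recover an approximate cell count from a rounded table percentage."""
    return round(percent / 100 * n)


def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2x2 table [[a, b], [c, d]].

    Returns the statistic and its p-value for one degree of freedom.
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For df = 1 the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# "Opinion" row: 37.23% private users with an opinion, 38.77% overall;
# private users account for 79.49% of all tweets.
private_yes = count(37.23)
private_total = count(79.49)
overall_yes = count(38.77)

a = private_yes                                 # private, opinion present
b = private_total - private_yes                 # private, opinion absent
c = overall_yes - private_yes                   # other sources, opinion present
d = (N_TWEETS - private_total) - c              # other sources, opinion absent

chi2, p_value = yates_chi_square(a, b, c, d)
print(f"chi2(1) = {chi2:.2f}, p = {p_value:.3g}")
```

Under these assumptions the statistic rounds to 101.99, which agrees with the tabled value for that row. The same reconstruction yields a statistic of 0 for "calls to action", where the continuity correction exceeds the observed difference.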
Discussion
Through the rise of online forums and social media, disseminating factual information about relevant issues such as health problems is not the prerogative of experts and news outlets anymore, but such information still comes disproportionately from experts and institutions as more trustworthy sources. Experts and commercial or other organizations made fewer attempts at changing attitudes, especially at conveying negative attitudes, than did private users. It is possible that the experts and organizations in our sample saw it as their task to educate their audiences, rather than directly influence their opinions. There were no significant differences for disease- or social-problem-relevant behavior, teaching skills, or calls to action. These differences in strategy use suggest avenues for future research that analyzes whether these strategies are also differentially effective at changing behavior on social media when coming from different sources. For instance, an expert or NGO trying to convey a negative attitude towards a risk behavior may be seen as more patronizing and threatening than a private user trying to do the same. Additionally, the results showed that many references to STIs and risk behaviors were informal, taking the form of jokes or insults, perhaps reflecting stigma or associations of HIV or other STIs with negative attributes. Further research that analyzes the way in which these informal conversations about sex, drug use, and STIs reinforce existing stereotypes and misinformation can elucidate a large part of social media that people are exposed to on a daily basis.
To our knowledge, this is the first study that systematically analyzes which users and which persuasive strategies are involved in social media communications related to HIV and HIV risk factors. Future studies can include user characteristics to assess whether communication styles differ across geographic locations or across social networks within Twitter. Those studies may also add context information to better understand why particular users talk about HIV in particular ways, which the constraints of the current dataset do not allow. More precise knowledge about Twitter users who are posting relevant content can inform the development of online health interventions. Another limitation of the current study is that we analyzed only about 1,000 tweets (N = 975). Analyzing social media in the domain of health is a new field with a variety of possible approaches. Prior work that has used dictionaries to analyze messages has typically used many more messages, such as 150 million tweets (Ireland et al., 2015, 2016). For purposes of this study, however, a more nuanced approach was found to be more suitable, relying on manual coding rather than automated dictionaries. Manual coding imposes natural limits on how much data can be coded, and other studies with this approach also had lower sample sizes (e.g., 694 tweets in Gabarron et al., 2014). In the future, the manual coding could be automated using machine learning so that much larger numbers of tweets can be classified efficiently. Finally, Twitter of course represents only one social media platform of many, and both the way that conversations related to HIV are conducted and the requirements for online health campaigns may differ from platform to platform.
In conclusion, factual conversations regarding issues that are relevant to public health, such as STIs, are uncommon on Twitter, which is to be expected given that it is not a content-specific platform such as a health forum. Correspondingly, reliable expert sources produce only a small portion of tweets. In contrast, casual conversations (e.g., jokes and insults) about sex, drugs, and sex work are plentiful and could be the basis for rich analyses of attitudes towards sex and drug use practices, even if it can be difficult to single out these conversations due to the ubiquity of sex- and drug-related language in memes, swear words, insults, and jokes. Strategies to change behavior are common on Twitter, suggesting that online health interventions may be competing for users’ attention with many other persuasive messages. Most of these strategies, however, attempt to influence behavior indirectly through attitude change. Health campaigns with clear calls to action are thus well-suited for supplementing existing conversations on social media.