Who is saying what on Twitter: An analysis of messages with references to HIV and HIV risk behavior

Lohmann, Sophie; Lourentzou, Ismini; Zhai, Chengxiang; Albarracín, Dolores

doi:10.22201/fpsi.20074719e.2018.1.09


Acta de investigación psicológica

On-line version ISSN 2007-4719; Print version ISSN 2007-4832

Acta de investigación psicol. vol. 8 n. 1, Ciudad de México, Apr. 2018


Articles

Who is saying what on Twitter: An analysis of messages with references to HIV and HIV risk behavior^*

Quién dice qué en Twitter: Mensajes con referencia a VIH y conducta de riesgo de VIH

Sophie Lohmann^a^**

Ismini Lourentzou^b

Chengxiang Zhai^b

Dolores Albarracín^a

^a Department of Psychology, University of Illinois at Urbana-Champaign, Illinois, USA.

^b Department of Computer Science, University of Illinois at Urbana-Champaign, Illinois, USA.

Abstract:

This research aimed to determine the nature of social media discussions about HIV. With the goal of conducting a descriptive analysis, we collected almost 1,000 tweets posted February to September 2015. The sample of tweets included keywords related to HIV or behavioral risk factors (e.g., sex, drug use) and was coded for content (e.g., HIV), behavior change strategies, and message source. Seven percent of tweets concerned HIV/AIDS, and these terms were often used in jokes or insults. The majority of tweets coded as behavior change attempts involved attitude change strategies. The majority of the tweets (80%) came from private users (vs. organizations). Different types of sources employed different types of behavior change strategies: For instance, private users, compared to experts or organizations, included more strategies to decrease detrimental attitudes (29% versus 6%, p < .001), and also more strategies to counter myths and misinformation (6% versus 1%, p = .008). In summary, tweets related to HIV/AIDS and associated risk factors frequently use the terms in jokes and insults, come largely from private users, and entail attitudinal and informational strategies. Online health campaigns with clear calls to action and corrections of misinformation may make important contributions to social media conversations about HIV/AIDS.

Keywords: HIV; Acquired Immunodeficiency Syndrome; Sexually transmitted infections; Social media; Communication; Attitude; Behavior change

Resumen (English translation):

This research aimed to characterize discussions about HIV on social media. With the goal of conducting a descriptive analysis, we collected about one thousand tweets posted between February and September 2015. Tweets were selected if they included keywords related to HIV or to behavioral risk factors such as sex or drug use. Four coders classified the tweets by content (e.g., HIV as a disease, reference to a product or service), behavior change strategy (behavior change, call to action, or correction of myths), and message source (e.g., private users, experts, commercial companies). The majority of tweets (80%) came from private rather than institutional users. Seven percent of tweets referred strictly to HIV or other sexually transmitted infections, frequently using those terms as jokes or insults, such as writing that an unpleasant experience “gave me AIDS”. The majority of behavior change attempts included strategies for reducing negative attitudes. Different types of sources employed different types of behavior change strategies. For example, private users (compared with experts, commercial organizations, and other organizations such as newspapers and NGOs) posted more messages classified as strategies promoting negative attitudes (29% versus 6%, p < .001) and made more corrections of myths (6% versus 1%, p = .008). In summary, tweets that mention HIV or HIV risk factors very frequently use the terms in jokes and insults, come mostly from private users, and include attitude change strategies. Internet campaigns with clear calls to action and corrections of myths can make important contributions to conversations about HIV on social media.

Palabras clave (English translation): HIV; Acquired immunodeficiency syndrome; Sexually transmitted infection; Social media; Communication; Attitude; Behavior change

As social media play an ever-increasing role in contemporary life (^{Perrin, 2015}), understanding their potential contribution to sexual and drug behaviors that pose risk for HIV is paramount. Prior research has found that the language used in social media communications in a geographical area is predictive of HIV rates in those communities (^{Ireland, Chen, Schwartz, Ungar, & Albarracín, 2016}; ^{Ireland, Schwartz, Chen, Ungar, & Albarracín, 2015}; ^{Young, Rivers, & Lewis, 2014}). In particular, these articles examined the association between prevalence of HIV and language on Twitter (e.g., tweets about sex and drug use; Young et al., 2014). Online conversations may refer to sex or drug use in an explicit attempt to promote risky behavior or as jokes or memes about sex or drug use that still convey a casual positive attitude towards objectively unsafe practices (^{Gabarron, Serrano, Wynn, & Lau, 2014}). Different message sources (e.g., private Twitter users, commercial organizations, NGOs) may approach the topics in different ways. Models of behavior change have identified several important strategies that apply to the area of HIV-related behaviors (^{Ajzen, 1991}; ^{Albarracín et al., 2005}; ^{Fishbein & Cappella, 2006}; ^{Fisher & Fisher, 1992}), such as changing beliefs and attitudes, establishing knowledge about the steps needed to perform the behavior, and finally initiating the behavior. Correspondingly, some social media messages may try to shape the audience’s behavior through these strategies. This paper thus examined tweets to determine the relative frequency of these messages, including informational strategies to correct myths or misconceptions (changing beliefs), messages to change attitudes in a positive direction (changing attitudes), messages to change attitudes in a negative direction (changing attitudes), messages to teach behavioral skills (changing procedural knowledge), and messages with calls to action (changing behavior).
In addition, the analysis included informal references to risky behavior, and the sources of the messages, drawing on findings that message source is a critical contributor to persuasive messages and behavioral interventions (^{Albarracín & Glasman, 2016}; ^{Albarracín, Kumkale, & Poyner-Del Vento, 2017}; ^{Durantini, Albarracín, Mitchell, Earl, & Gillette, 2006}; ^{Wilson & Sherrell, 1993}). This investigation aimed to answer the question of who is saying what on Twitter. An answer to this question has the potential to inform health professionals who wish to know how HIV is being discussed online or who want to create interventions. This research further suggests what place online health interventions could take in the existing social media landscape.

Method

Twitter’s Streaming API was used to collect a random sample of all tweets (short messages of up to 140 characters) from February 2015 to August 2015, resulting in a total of 234,603,322 tweets (excluding retweets). Twitter was chosen because social groups who are at risk for HIV and other STIs are particularly well-represented on Twitter (26% of people aged 18 to 29, 28% of African-Americans, and 19% of people in urban areas use Twitter; ^{Smith & Brenner, 2012}) and because tweets are freely accessible via the API. This research used only data that Twitter users had posted as publicly available tweets and was not considered human subject research by the Institutional Review Board of our university. Through stratified sampling, the final subsample of tweets was more likely to be related to HIV/AIDS than a randomly collected tweet. Specifically, the tweets had to contain at least one keyword from the following nine categories, which are either causal factors in transmission or factors that have been associated with increased prevalence of HIV (e.g., ^{Centers for Disease Control and Prevention, 2017}; ^{Ward & Rönn, 2010}): sex, drugs/alcohol, HIV/AIDS, other STIs, preventive methods/safe sex, men who have sex with men, full-service sex work, runaway youth, and sexual violence and abuse. The categories were identified by experts in the domain of HIV working on our team, and keywords were generated by looking through existing glossaries (e.g., Aids.gov HIV glossary) as well as slang dictionaries (e.g., urbandictionary.com; 590 keywords total). This keyword-based filter identified 9,639,111 tweets (4% of the random sample of tweets we collected), including 17,554 tweets with HIV/AIDS keywords (0.02%) and 11,956 tweets with other STI keywords (0.01%). The aim was to code about 1,000 tweets and thus about 112 tweets were coded for each of the nine categories.
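The keyword filter and the per-category sampling described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the three-category `KEYWORDS` dictionary, the function names, and the word-boundary matching rule are all assumptions made for the example (the study used 590 keywords in nine categories, and its exact matching rules are not reported).

```python
import random
import re
from collections import defaultdict

# Hypothetical mini-dictionary for illustration only; the study's actual
# keyword list (590 keywords, nine categories) is not reproduced here.
KEYWORDS = {
    "hiv_aids": ["hiv", "aids"],
    "other_stis": ["hpv", "chlamydia"],
    "prevention": ["condom", "safe sex"],
}

# One case-insensitive, whole-word pattern per category (assumed matching rule).
PATTERNS = {
    category: re.compile(
        r"\b(?:" + "|".join(re.escape(word) for word in words) + r")\b",
        re.IGNORECASE,
    )
    for category, words in KEYWORDS.items()
}


def matched_categories(text):
    """Return the set of categories whose keywords occur in the tweet text."""
    return {cat for cat, pattern in PATTERNS.items() if pattern.search(text)}


def stratified_sample(tweets, per_category, seed=0):
    """Draw up to `per_category` tweets at random from each keyword category."""
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for text in dict.fromkeys(tweets):  # drop exact duplicates, keep order
        for category in matched_categories(text):
            by_category[category].append(text)
    return {
        category: rng.sample(items, min(per_category, len(items)))
        for category, items in by_category.items()
    }


example_tweets = [
    "Get an HIV test today",
    "my stomach has aids from the onions",
    "HPV vaccine news",
    "nothing relevant here",
    "Get an HIV test today",
]
sample = stratified_sample(example_tweets, per_category=1)
```

Sampling a fixed number of tweets per category, rather than sampling from the pooled filter output, is what keeps rare categories such as HIV/AIDS from being swamped by frequent ones such as sex. In this sketch a tweet that matches several categories can be drawn in more than one of them.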
After excluding duplicates and tweets that were completely or partially non-English, N = 975 tweets remained (examples in Fig 1). Four coders were trained to code for a total of 36 variables (see supplemental materials): Message sources (e.g., private user), content (e.g., about a person), persuasive strategies (e.g., conveying a negative attitude), and miscellaneous variables (e.g., whether the tweet appeared to be some type of spam, such as an advertisement for a product or for pornography). With the exception of the source variables, all variables were binary and assessed the presence versus absence of a topic or a strategy. Coders were first trained to understand HIV-related terminology and then began reading tweets. The coding scheme was iteratively clarified based on their feedback until intercoder reliability coefficients showed that the coders shared an understanding of the variables. They were instructed to follow links, if available, and consider the content of the linked website as additional context information. Intercoder reliability for all variables was adequate: Cronbach’s α > .70 (M = .82) and Fleiss’s κ > .35 (M = .51). ^¹
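For readers unfamiliar with the Fleiss agreement coefficient reported above, the following is a minimal sketch of the standard textbook formula for a fixed number of coders per item. The function name and the toy ratings are illustrative assumptions; the study's actual coding data are not reproduced here.

```python
def fleiss_kappa(counts):
    """Fleiss's kappa for a fixed number of raters per item.

    counts[i][j] is the number of coders who assigned item i to category j.
    Undefined (division by zero) if every rating falls in a single category.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")

    # Mean observed agreement: share of agreeing coder pairs per item.
    p_observed = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement from the overall share of ratings in each category.
    n_categories = len(counts[0])
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(share * share for share in shares)

    return (p_observed - p_expected) / (1 - p_expected)


# Toy data (hypothetical): six tweets, four coders, one binary variable.
# Each row is [coders who marked "absent", coders who marked "present"].
toy_ratings = [[4, 0], [3, 1], [0, 4], [4, 0], [2, 2], [0, 4]]
toy_kappa = fleiss_kappa(toy_ratings)
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so coefficients for rare binary codes can be modest even when coders disagree on only a few items.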

Results

All tweets were selected to include words that are relevant to HIV/STIs but their content was highly diverse. In general, 50% of tweets were about a person, 39% about an attitude or opinion, 29% about a behavior related to a disease or social problem (see Fig 1 for examples), 24% were about a product or service, and 6% were about whether or not one or several people had a disease such as HIV (these categories were not mutually exclusive; see Table 1, column “Overall”). Despite preselecting for terms related to HIV or risk categories, HIV or AIDS was a topic in only 7% of all tweets and the main topic in only 3% of tweets (Table 1, column “Overall”). Correspondingly, terms were often not used with their original meaning. For instance, among the tweets that included names of sexually transmitted infections, 9% used these STI names as an insult, joke, or in another non-medical sense, such as “@Benihana food was decent but you screwed up my rice order. My stomach has aids from the onions I ate” (tweet by user QuiKzZ_COD^²). This mirrors prior findings on STIs and jokes on Twitter (^{Gabarron et al., 2014}). Similarly, the risk category vocabulary (especially for sex, sex work, and drugs) was often non-specific because the majority of keywords were used as swear words, jokes, or insults.

Many tweets contained communications that aligned well with effective messages to promote specific behaviors. In terms of coded strategies, about 5% of tweets countered myths and misinformation (e.g., “‘you support gay rights so u must be gay’ i support animal rights do i look like a fucking alpaca to you”, tweet by user TheFunnyTeens), 24% tried to change attitudes in a positive direction (e.g., “Great new HIV home test kit trial by these consultants at Derriford”, user DLapthorne, originally by Plymouth Hospitals @PHNT_NHS; or “The love I have for Kanye West transcends all elements of my middle class white boy persona.”, user Josh Spoelstra @Spollyy), 24% tried to change attitudes in a negative direction (e.g., “This Mother Wants You To See What An HPV Vaccine Injury Looks Like”, user Kelsha LeAnne @KelshaWellness; or “on this boring ass bus”, user Kay @__reddish), and 10% included calls to action (e.g., “National HIV Testing Day is Right around the corner. Come out this Saturday from 2-4PM to get tested at the...”, FVSU PEs OSHCS @PEsFVSU; or “Give ma mixtape a shot.. You will Love it!! 10 Dope tracks, FREE Download!!”, Dom Shatti @DomShatti; Table 1, column “Overall”). Social media landscapes thus involve many persuasion attempts, most of which are attempts to influence attitudes. This result suggests that health campaigns are not out of place in social media, although they might need to prevail in the competition for an audience that has many other messages vying for their attention.

Table 1 Tweet content split by source type

Tweet content	Private user	Expert/research^a	Commercial company	Organization^b	Overall	Fisher’s exact test p-value	χ²(1)
Content
reference to a person	43.18%	0.21%	3.18%	3.69%	50.26%	<.001***	24.20***
opinion	37.23%	0.00%	0.51%	1.03%	38.77%	<.001***	101.99***
socially/disease-relevant behavior^c	22.97%	0.41%	2.36%	3.69%	29.44%	.002**	0.40
product or service	14.56%	0.51%	7.90%	1.33%	24.31%	<.001***	71.98***
organization	6.87%	0.21%	1.44%	1.33%	9.85%	.040*	5.50
nonbehavioral reference to disease	3.59%	0.31%	0.41%	1.23%	5.54%	<.001***	6.63*
HIV or AIDS
as a topic	4.51%	0.10%	0.51%	1.44%	6.56%	.002**	4.16*
as the main topic	1.74%	0.10%	0.10%	1.13%	3.08%	<.001***	8.49**
Strategies
attitude change in a positive direction	20.92%	0.31%	1.85%	1.03%	24.10%	.005**	9.60**
attitude change in a negative direction	23.08%	0.00%	0.10%	1.13%	24.31%	<.001***	44.59***
calls to action	7.90%	0.10%	1.44%	0.51%	9.95%	.505	0.00
countering myths	4.72%	0.00%	0.10%	0.10%	4.92%	.036*	7.25**
Across all tweets	79.49%	0.92%	11.49%	8.10%	100%

Next, message sources were analyzed. Some tweets were advertisements or were delivered by spambots (12%) but the majority (88%) had real users actively involved in the communications (73 tweets were not coded for this variable, N = 902). The majority of tweets (79%) appeared to come from private users (operationalized as accounts that had no indication of being affiliated with an institution; this category included celebrities). Eleven percent of tweets came from companies and 8% from other organizations such as political figures or news outlets. Only 1% of tweets seemed to come from experts or research organizations (such as university press offices; see last row of Table 1).

Finally, message content and employed strategies differed as a function of the source (Table 1). For instance, private users tweeted about people having diseases less than did experts, commercial organizations, and other organizations.
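As a worked illustration of how such source differences can be tested, the sketch below rebuilds 2 × 2 tables (private users versus all other sources) from the rounded percentages in Table 1 and computes a chi-square statistic with Yates's continuity correction. Two things are assumptions rather than statements from the paper: the counts are reconstructed by multiplying the percentages by N = 975 and rounding, and the reading of the χ²(1) column as a continuity-corrected comparison of private users against the other three source types combined is an inference. The helper names are hypothetical.

```python
N_TOTAL = 975  # coded tweets reported in the Method section


def to_count(percent):
    """Reconstruct an approximate tweet count from a rounded percentage."""
    return round(percent / 100 * N_TOTAL)


def yates_chi2(a, b, c, d):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]] with Yates's correction."""
    n = a + b + c + d
    # The correction is capped so it never pushes the difference past zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def private_vs_other(private_pct, other_pcts):
    """Return (a, b, c, d): code present/absent for private users, then others."""
    private_total = to_count(79.49)
    other_total = sum(to_count(p) for p in (0.92, 11.49, 8.10))
    private_yes = to_count(private_pct)
    other_yes = sum(to_count(p) for p in other_pcts)
    return (private_yes, private_total - private_yes,
            other_yes, other_total - other_yes)


# Percentages copied from the "Strategies" rows of Table 1.
negative_attitude = private_vs_other(23.08, (0.00, 0.10, 1.13))
countering_myths = private_vs_other(4.72, (0.00, 0.10, 0.10))
calls_to_action = private_vs_other(7.90, (0.10, 1.44, 0.51))

chi2_negative = yates_chi2(*negative_attitude)
chi2_myths = yates_chi2(*countering_myths)
chi2_calls = yates_chi2(*calls_to_action)

# Share of tweets using the negative-attitude strategy within each group.
share_private = negative_attitude[0] / (negative_attitude[0] + negative_attitude[1])
share_other = negative_attitude[2] / (negative_attitude[2] + negative_attitude[3])
```

Under these assumptions the statistics round to 44.59, 7.25, and 0.00, and the two shares round to 29% and 6%, which agrees with the corresponding values in Table 1 and the abstract. Because the counts are reconstructed from rounded percentages, the agreement is suggestive rather than a confirmation of the exact procedure used.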

Discussion

Through the rise of online forums and social media, disseminating factual information about relevant issues such as health problems is not the prerogative of experts and news outlets anymore, but such information still comes disproportionately from experts and institutions as more trustworthy sources. Experts and commercial or other organizations made fewer attempts at changing attitudes, especially at conveying negative attitudes, than did private users. It is possible that the experts and organizations in our sample saw it as their task to educate their audiences, rather than directly influence their opinions. There were no significant differences for disease- or social-problem-relevant behavior, teaching skills, or calls to action. These differences in strategy use suggest avenues for future research that analyzes whether these strategies are also differentially effective at changing behavior on social media when coming from different sources. For instance, an expert or NGO trying to convey a negative attitude towards a risk behavior may be seen as more patronizing and threatening than a private user trying to do the same. Additionally, the results showed that many references to STIs and risk behaviors were informal, taking the form of jokes or insults, perhaps reflecting stigma or associations of HIV or other STIs with negative attributes. Further research that analyzes the way in which these informal conversations about sex, drug use, and STIs reinforce existing stereotypes and misinformation can elucidate a large part of social media that people are exposed to on a daily basis.

To our knowledge, this is the first study that systematically analyzes which users and which persuasive strategies are involved in social media communications related to HIV and HIV risk factors. Future studies can include user characteristics to assess whether communication styles differ across geographic locations or across social networks within Twitter. Those studies may also add context information to better understand why particular users talk about HIV in particular ways, which the constraints of the current dataset do not allow. More precise knowledge about Twitter users who are posting relevant content can inform the development of online health interventions. Another limitation of the current study is that we analyzed only about 1,000 tweets. Analyzing social media in the domain of health is a new field with a variety of possible approaches. Prior work that has used dictionaries to analyze messages has typically used many more messages, such as 150 million tweets (^{Ireland et al., 2016}, ^{2015}). For purposes of this study, however, a more nuanced approach was found to be more suitable, relying on manual codings rather than automated dictionaries. Manual codings impose natural limits on how much data can be coded, and other studies with this approach also had lower sample sizes (e.g., 694 tweets in ^{Gabarron et al., 2014}). In the future, the manual coding could be automated using machine learning so that much larger numbers of tweets can be classified efficiently. Finally, Twitter of course represents only one social media platform of many, and both the way that conversations related to HIV are conducted and the requirements for online health campaigns may differ from platform to platform.

In conclusion, factual conversations regarding issues that are relevant to public health, such as STIs, are uncommon on Twitter, which is to be expected given that it is not a content-specific platform such as a health forum. Correspondingly, reliable expert sources produce only a small portion of tweets. In contrast, casual conversations (e.g., jokes and insults) about sex, drugs, and sex work are plentiful and could be the basis for rich analyses of attitudes towards sex and drug use practices, even if it can be difficult to single out these conversations due to the ubiquity of sex- and drug-related language in memes, swear words, insults, and jokes. Strategies to change behavior are common on Twitter, suggesting that online health interventions may be competing for users’ attention with many other persuasive messages. Most of these strategies, however, attempt to influence behavior indirectly through attitude change. Health campaigns with clear calls to action are thus well-suited for supplementing existing conversations on social media.

References

Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50(2), 179-211. https://doi.org/10.1016/0749-5978(91)90020-T

Albarracín, D., Gillette, J. C., Earl, A. N., Glasman, L. R., Durantini, M. R., & Ho, M.-H. (2005). A test of major assumptions about behavior change: A comprehensive look at the effects of passive and active HIV-prevention interventions since the beginning of the epidemic. Psychological Bulletin, 131(6), 856-897. https://doi.org/10.1037/0033-2909.131.6.856

Albarracín, D., & Glasman, L. R. (2016). Multidimensional targeting for tailoring: A comment on Ogden (2016). Health Psychology Review, 10(3), 251-255. https://doi.org/10.1080/17437199.2016.1190294

Albarracín, D., Kumkale, G. T., & Poyner-Del Vento, P. (2017). How people can become persuaded by weak messages presented by credible communicators: Not all sleeper effects are created equal. Journal of Experimental Social Psychology, 68, 171-180. https://doi.org/10.1016/j.jesp.2016.06.009

Centers for Disease Control and Prevention. (2017). HIV surveillance report, 2016 (Vol. 28). Retrieved from http://www.cdc.gov/hiv/library/reports/hiv-surveillance.html

Durantini, M. R., Albarracín, D., Mitchell, A. L., Earl, A. N., & Gillette, J. C. (2006). Conceptualizing the influence of social agents of behavior change: A meta-analysis of the effectiveness of HIV-prevention interventionists for different groups. Psychological Bulletin, 132(2), 212-248. https://doi.org/10.1037/0033-2909.132.2.212

Fishbein, M., & Cappella, J. N. (2006). The role of theory in developing effective health communications. Journal of Communication, 56, S1-S17. https://doi.org/10.1111/j.1460-2466.2006.00280.x

Fisher, J. D., & Fisher, W. A. (1992). Changing AIDS-risk behavior. Psychological Bulletin, 111(3), 455-474. https://doi.org/10.1037/0033-2909.111.3.455

Gabarron, E., Serrano, J. A., Wynn, R., & Lau, A. Y. S. (2014). Tweet content related to sexually transmitted diseases: No joking matter. Journal of Medical Internet Research, 16(10), e228. https://doi.org/10.2196/jmir.3259

Ireland, M. E., Chen, Q., Schwartz, H. A., Ungar, L. H., & Albarracín, D. (2016). Action tweets linked to reduced county-level HIV prevalence in the United States: Online messages and structural determinants. AIDS and Behavior, 20(6), 1256-1264. https://doi.org/10.1007/s10461-015-1252-2

Ireland, M. E., Schwartz, H. A., Chen, Q., Ungar, L. H., & Albarracín, D. (2015). Future-oriented tweets predict lower county-level HIV prevalence in the United States. Health Psychology, 34(Suppl), 1252-1260. https://doi.org/10.1037/hea0000279

Perrin, A. (2015). Social media usage: 2005-2015. Retrieved from http://www.pewinternet.org/files/2015/10/PI_2015-10-08_Social-Networking-Usage-2005-2015_FINAL.pdf

Smith, A., & Brenner, J. (2012). Twitter use 2012. Pew Research Center. Retrieved from http://www.pewinternet.org/files/old-media//Files/Reports/2012/PIP_Twitter_Use_2012.pdf

Ward, H., & Rönn, M. (2010). Contribution of sexually transmitted infections to the sexual transmission of HIV. Current Opinion in HIV and AIDS, 5(4), 305-310. https://doi.org/10.1097/COH.0b013e32833a8844

Wilson, E. J., & Sherrell, D. L. (1993). Source effects in communication and persuasion research: A meta-analysis of effect size. Journal of the Academy of Marketing Science, 21(2), 101-112. https://doi.org/10.1007/BF02894421

Young, S. D., Rivers, C., & Lewis, B. (2014). Methods of using real-time social media technologies for detection and remote monitoring of HIV outcomes. Preventive Medicine: An International Journal Devoted to Practice and Theory, 63, 112-115. https://doi.org/10.1016/j.ypmed.2014.01.024

*Acknowledgments and Disclosures: This work was funded by a National Institutes of Health grant (NIH 1 R56 AI114501-01A1). We are grateful to Yisi Liu for his help in data collection and to Nadeem Dabbakeh, Micah Iserman, Xiaomeng Li, Lukas Piatek, Amanda Taylor, and Alexandra Y. Zhang for their involvement in coding and development of the coding scheme. We would further like to thank Sherry L. Emery for providing valuable feedback on an earlier version of the manuscript.

¹The following changes were made to the codings: We noticed that for 8 tweets, other STIs such as HPV were misidentified as HIV. To maintain validity of our variables, the codings were corrected for these 8 tweets. Further, information on the source for 21 tweets, and information on content for six tweets was considered indeterminable by the coders and was later coded by the first author. These additional codings did not change the pattern of results with one exception: The effect of the four source types on countering myths or misinformation spread by others became significant, changing from p = .064 to p = .036.

²In the interest of attributing proper credit of specific tweets that are quoted, we include original usernames, similar to how we would include the author’s name when citing a more traditional blog post or a newspaper article. Twitter’s terms of service inform users that posted content is public by default and this paper quotes only tweets that were public at the time of writing. When the cited tweets contained a link or a @username, these elements were removed to make the quote more concise.

Received: August 29, 2017; Accepted: January 11, 2018

^**Contact: Sophie Lohmann, 603 East Daniel St., Champaign, IL, 61820, lohmann2@illinois.edu

This is an open-access article distributed under the terms of the Creative Commons Attribution License