1 Introduction
Language, in all its forms, is one of the most comprehensive ways to characterize human societies. However, given its social nature, it cannot be defined only in grammatical terms. In this respect, while it is true that grammar regulates language to keep it from being a chaotic system, it is also true that language is dynamic and, accordingly, a living entity. This means that language is not static; rather, it is in constant interaction between the rules of its grammar and its pragmatic use. For instance, the idiom “all of a sudden” has a grammatical structure that is not made intelligible by knowledge of the familiar rules of its grammar alone [15], but by inferring implicit information. This latter process fills in the gap needed to properly interpret the idiom.
The previous example shows how our utterances entail two dimensions for decoding what is intended to be communicated: The explicit dimension, which is mainly characterized by the use of literal language (not means not), and the implicit dimension, in which figurative language is often profiled, for instance, through figurative devices such as irony, sarcasm, and metaphor, among others (not could mean yes, perhaps, possibly, or more).
In simple words, it could be argued that the explicit dimension is what any hearer could understand effortlessly, whereas the implicit dimension is the hidden information that the same hearer must unveil to fully understand what the speaker is communicating.
This latter dimension is clearly the most challenging one to recognize (and to formalize), for both people and computers.
In this context, this article focuses on analyzing textual information, mainly extracted from Twitter, in order to recognize formal elements for setting up a computational framework to differentiate between explicit and implicit language. In particular, the analysis is performed in the scenario of hate speech. To this end, a corpus of hate speech tweets in Spanish was built. It is divided into four classes to better understand how hate speech is verbalized explicitly and implicitly.
The challenge of recognizing whether or not an utterance conveys implicit hate speech content is addressed by analyzing two figurative devices: Irony and sarcasm. According to the specialized literature, one of the most challenging issues regarding hate speech is precisely the presence of such devices [22, 38, 25, 27, 47, 30]. In addition, as mentioned in the previous paragraphs, figurative language is commonly used to communicate information not given literally [33, 34]. This fact can be seen in the following tweet:
“Don’t come here. If you are afraid for your life and you have no place to go, don’t pick this country.”
Literally, this text is communicating an explicit description of the current immigration phenomenon; therefore, it could be classified as a harmless tweet. However, in the implicit dimension, it is also communicating a veiled threat; therefore, it should be classified as a hate speech tweet. Assuming that the second interpretation is correct, one way to unveil the threat is by recognizing that a figurative device, such as irony, underlies the tweet.
Given this distinction, we are interested in analyzing figurative language (irony and sarcasm, specifically) in order to better understand how hate speech is linguistically expressed.
The rest of the article is organized as follows: In Section 2 the theoretical background concerning figurative language will be introduced. The related work on irony, sarcasm, and hate speech will be described in Section 3. The analysis of the data and the discussion of the findings will be detailed in Section 4. Finally, in Section 5, we will conclude with some final remarks and some pointers to future work.
2 Two Dimensions of Language
Modern linguists deem language a continuum of symbolic structures, in which lexicon, morphology, and syntax differ along various parameters and can be divided into separate components only arbitrarily [24].
Language, thus, is viewed as an entity whose components and levels of analysis can be neither independent nor isolated. On the contrary, they are embedded in a global system which depends on cognitive, experiential, and social contexts that go far beyond the linguistic system proper [21].
This vision, according to the foundations of cognitive linguistics, entails a close relation between semantics and conceptualization (cf. [24]), i.e., apart from grammar, the linguistic system depends on cognitive domains, in which both referential knowledge (e.g., lexical semantic information) and inferential knowledge (e.g., contextual and pragmatic information) are fundamental to understanding what is communicated.
Based on this integral vision of language, in which its grammatical substance is as important as its social referents, the explicit (literal) and implicit (figurative) dimensions of language will be described below.
2.1 Literal Language (Explicit Dimension)
The simplest definition of literal language is related to the notion of conventional meaning, i.e., the meaning an expression has independently of any particular context of use. Hence, it is assumed that it must be invariant in all contexts. According to [1], literalness is generated by linguistic knowledge of lexical items, combined with linguistic rules. Therefore, it is determined, explicit, and fully compositional.
2.2 Figurative Language (Implicit Dimension)
In the context of a dichotomous view of language, figurative language could be regarded as the opposite of literal language. Thus, whereas the latter is assumed to communicate a direct and explicit meaning, the former is more related to the notion of conveying veiled or implicit meanings.
Although, at first glance, this distinction seems to be clear and sufficient on its own, figurative language involves basic cognitive processes rather than only deviant usage [29]. Therefore, it is necessary to go deeper into the mechanisms and processes that differentiate both dimensions of language.
In accordance with classical perspectives, the notions of literalness and figurativity are viewed as pertaining directly to language, i.e., words have literal meanings, and can be used figuratively [20].
Consequently, figurative language could be regarded as a type of language that is based on literal meaning, but is disconnected from what people learn about the world [or about the words] based on it [them] [4].
Thus, by breaking this link, literal meaning loses its primary referent and, accordingly, the interpretation process becomes senseless. Let us consider Chomsky’s famous example to explain this issue:
“Colorless green ideas sleep furiously” [8].
Beyond grammatical aspects, the previous example shows how the decoding process is achieved easily enough: Whether phonologically or orthographically, Chomsky’s example is fully understandable in terms of its linguistic constituents.
However, when it comes to interpretation, its literal meaning is completely nonsensical. For instance, the bigrams [colorless green] or [green ideas] are too disconnected from their conventional referents to produce a coherent interpretation.
Thus, in order to make the example understandable, secondary interpretations are necessary. If such interpretations are successfully activated, then figurative meaning is triggered and, accordingly, a more coherent interpretation can be achieved.
Based on this explanation, literal meaning could be deemed denotative, whereas figurative meaning is connotative, i.e., figurative meaning is not given a priori; rather, it must be implicated.
Finally, it is worth stressing that language itself provides specific linguistic devices to intentionally express different types of implicit content: Metaphor, allegory, irony, simile, analogy, sarcasm, and so on.
2.3 Objective
Unlike literal language, figurative language uses linguistic devices such as irony, sarcasm, metaphor, analogy, and so on, in order to communicate implicit content, which is usually not interpretable by simply decoding syntactic or semantic information. Rather, figurative language reflects patterns of thought within a communicative and social framework that makes its linguistic representation, as well as its computational processing, quite challenging.
In this respect, our objective is to develop a linguistics-based framework to recognize implicit hate speech content in web texts. Through the analysis of two specific domains of figurative language, we intend to provide arguments about how people conceptualize hate speech, and how they deliberately verbalize such discourse. In particular, we are interested in developing formal models to recognize ironic and sarcastic texts in which people consciously veil hate speech content, in order to prevent and, hopefully, reduce the impact of such behaviors on society.
3 Related Work on Figurative Language and Hate Speech
In this section, two figurative devices, irony and sarcasm, will be described in terms of their automatic processing. In addition, the related work on hate speech will be reviewed.
3.1 Irony
Like most figurative devices, irony is difficult to pin down in formal terms, and no single definition ever seems entirely satisfactory. According to various experts, irony is essentially a communicative act that expresses an opposite meaning of what was literally said, i.e., irony is a playful use of language in which a speaker implies the opposite of what is literally said [50, 9].
In terms of its automatic processing, there have been various approaches to automatically detect irony in text. For instance, [42] reported one of the first computational attempts to formalize the phenomenon. His model attempted to represent irony by modeling the interaction between speakers and hearers. [43, 44] analyzed the cognitive processes that underlie verbal irony to separate irony from non-irony in figurative comparisons.
In addition, [7] determined some clues for automatically identifying ironic sentences. [34] and [33], in turn, presented a set of linguistic-based features to determine whether a tweet is ironic or not. More recently, [5], as well as [40], have developed ad hoc corpora for the task in languages beyond English. Likewise, some other researchers have addressed the task in a social media scenario, in which it is quite common to find ironic statements about anything.
For instance, [6] focus their approach on product reviews; [49], on personal blogs; and [13, 18, 39], on microblogs such as Twitter. [45], in turn, investigate irony in broader scenarios such as online communities.
3.2 Sarcasm
Although at first glance the terms irony and sarcasm seem perfectly distinguishable from each other, when they are used in real communicative scenarios, such a distinction is rarely maintained. In this respect, [19] states that sarcasm, but not irony, involves the ridicule of a specific person or group of people.
It could be argued, for instance, that irony courts ambiguity and often exhibits great subtlety, whereas sarcasm is delivered with a cutting or withering tone that is rarely ambiguous. However, these differences rely indeed on matters of usage, tone, and obviousness, rather than only on theoretical assumptions.
With respect to sarcasm detection, [41] and [10] directed their research toward finding sarcastic patterns in online product reviews and tweets, respectively. [16] investigated the impact of lexical and pragmatic features on the task. Some other works have addressed the task by analyzing texts from social media platforms, especially Twitter: [3, 28, 2, 32, 39] are examples of this. On the other hand, [31] approached sarcasm from a multilingual point of view. Finally, some research works have provided corpora for detecting sarcasm in different types of documents, for instance, [14, 26, 17].
3.3 Hate Speech
As stated previously, the challenge of detecting implicit content in text is addressed here in the context of hate speech. First of all, the term hate speech tends to be too general.
According to the United Nations, the term refers to “any kind of communication that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are”1. In addition, [27] listed six different definitions of hate speech, highlighting what hate speech means for both specialists and social media platforms such as Twitter or Facebook. Typical targets of hate speech relate to people’s inherent attributes, such as religion, ethnicity, nationality, race, color, descent, or gender. On the other hand, as noted by various researchers, any formal definition of hate speech is far from universal. For instance, there is no clear boundary between hate speech and freedom of speech.
Despite these drawbacks, there is increasing interest in this topic within the academic community due to the social implications of hate speech found all over the internet and mass media. In the context of Natural Language Processing, many approaches to hate speech are closely tied to social media.
For instance, [46, 12, 11] analyze hate speech on websites, in web comments, and on Facebook, respectively. [51, 30] focus on hate speech about the migration phenomenon and immigrants. [23, 36] approach hate speech regarding specific target communities. Likewise, some other investigations have addressed the problem by assessing particular features to detect hate speech automatically in different languages; the works reported by [47, 48, 30, 35] are representative examples of this concern. It is also worth noting the development of lexical resources, language models, and systems to automatically deal with this phenomenon [22, 38, 25].
4 Analysis
The data for the analysis, as well as the experiments performed to assess our preliminary findings, are described below.
4.1 Hate Speech Data
A data set of tweets in Spanish was built in order to analyze how people verbalize explicit and implicit content related to hate speech. The tweets were collected manually by twelve doctoral students in Language Sciences. Because of the manual gathering, no hashtags were used to collect the data. Instead, the students were asked to read as many tweets as they could over a period of three weeks. After reading them, they had to select the ones that they deemed to express hate speech. To this end, they attended lectures on the topic; therefore, each one had a theoretical background on hate speech, as well as exposure to a variety of discussions about the different ways to express it linguistically.
In order to provide the students with a guide to systematize the task, four categories of hate speech were defined a priori: Violence, discrimination, bullying/harassment, and general (this last category is intended to cover tweets that cannot be classified into the previous ones). Each student had to classify his/her tweets into one of these categories by identifying the target of the message. Finally, the students had to annotate their tweets with one of two labels: Explicit hate speech or implicit hate speech.
The total number of tweets collected with these criteria was 10,883. All of them were written in Spanish. Although the data set contains different dialectal variants, the most representative one is Mexican Spanish. General statistics of the data set are provided in Table 1. This data set will be made available for academic purposes in the near future.
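For illustration, each record in such a data set might be structured as in the following sketch; the field names are hypothetical, since the actual schema of the corpus is not specified in this article:

from dataclasses import dataclass

# Hypothetical record structure for the hate speech corpus;
# field names are illustrative, not the corpus's actual schema.
@dataclass
class HateSpeechTweet:
    text: str          # raw tweet text (in Spanish)
    category: str      # "violence" | "discrimination" | "bullying/harassment" | "general"
    label: str         # "explicit" | "implicit"
    annotator_id: int  # student who collected and labeled the tweet

example = HateSpeechTweet(
    text="Esta gente solo merece el rechazo y el desprecio",
    category="discrimination",
    label="explicit",
    annotator_id=3,
)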
4.2 Implicit Hate Speech Agreement
In Section 2.2 it was stated that implicit content is not given straightforwardly. Therefore, to guarantee that the tweets annotated with the implicit label were, in fact, members of this class, a subsequent task was requested of the students: They had to re-read the tweets annotated with the label implicit hate speech to confirm that each tweet indeed belonged to that class.
The total number of tweets annotated with this label was 2,638. Thus, each student assessed about 220 tweets (2,638 / 12 ≈ 220), i.e., every tweet in this class was annotated twice.
The final data set with implicit hate speech content was built by selecting only the tweets assessed by both students as belonging to the implicit class. If one student assessed a tweet as implicit hate speech but the second assessed it as explicit hate speech, or vice versa, the tweet was discarded.
By doing this, the total number of tweets in the implicit hate speech class was reduced to 1,973. This reduction, hypothetically, should ensure a set of fine-grained tweets in which the implicit content can be analyzed with deeper insight. The final distribution per category is depicted in Table 2.
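The filtering rule just described can be captured in a few lines. The following is a minimal sketch, assuming each doubly annotated tweet carries its two labels (variable and function names are illustrative):

# Keep only tweets that both annotators judged to be implicit hate speech;
# any disagreement (implicit vs. explicit) discards the tweet.
def filter_agreed_implicit(annotations):
    """annotations: list of (tweet_text, label_1, label_2) tuples."""
    return [
        text
        for text, label_1, label_2 in annotations
        if label_1 == "implicit" and label_2 == "implicit"
    ]

annotations = [
    ("tweet A ...", "implicit", "implicit"),  # kept
    ("tweet B ...", "implicit", "explicit"),  # discarded (disagreement)
]
implicit_tweets = filter_agreed_implicit(annotations)  # -> ["tweet A ..."]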
4.3 Figurative Language Recognition
In order to examine how often figurative devices appeared in the tweets with implicit hate speech, a classification task was performed. The underlying goal, according to the information given in the previous sections, was to verify whether or not this set of tweets was implicitly communicating hate speech content by means of irony or sarcasm.
In this respect, the remaining 1,973 tweets were classified into three categories: Ironic, sarcastic, or literal. A set of some of the most discriminating features described in the specialized literature was used to represent both figurative devices in the texts (see Sections 3.1 and 3.2). In particular, features such as bag-of-words (BoW), polarity, and aggressiveness, among others, were used to represent the documents.
Finally, a Bayes classifier was used to perform the classification. The results are summarized in Figure 1.
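As a rough illustration of this setup, the following sketch combines a bag-of-words representation with a Naive Bayes classifier using scikit-learn; the polarity and aggressiveness values are hypothetical hand-crafted features, since the exact feature extraction is not detailed here:

import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy training data; the labels follow the three classes used in this section.
tweets = ["tweet one ...", "tweet two ...", "tweet three ..."]
labels = ["ironic", "sarcastic", "literal"]

# Bag-of-words features.
vectorizer = CountVectorizer()
bow = vectorizer.fit_transform(tweets)

# Hypothetical extra features (polarity, aggressiveness), one row per tweet;
# in practice these would come from lexicons or auxiliary classifiers.
extra = csr_matrix(np.array([[0.1, 0.7], [0.9, 0.2], [0.5, 0.5]]))

X = hstack([bow, extra])
classifier = MultinomialNB().fit(X, labels)

new = hstack([vectorizer.transform(["a new tweet ..."]), csr_matrix([[0.3, 0.4]])])
print(classifier.predict(new))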
As noted in the figure, when focusing on both figurative devices, most tweets were classified as ironic in almost all four categories, except for Bullying/Harassment, in which the distribution between the ironic and sarcastic classes was nearly even. However, it is worth noting that several tweets were assigned to the third class, i.e., according to the set of features used in the classification, they are neither ironic nor sarcastic.
This outcome highlights two aspects to consider: (i) As described in the previous sections, figurative language is used to convey hate speech in a more sophisticated way, especially through irony. This means that more complex models for irony or sarcasm detection could improve the performance of current systems that automatically detect implicit hate speech in online communities. (ii) Establishing a formal boundary between explicit and implicit content is quite difficult when analyzing hate speech data. If several tweets were classified as literal hate speech, then their implicit content is being conveyed by means of different communicative strategies, not necessarily related to figurative language.
In the following section, both aspects are approached in linguistic terms in order to set up a framework that allows the processing of implicit hate speech based on the observations noted so far.
4.4 Linguistic Features
One of the most challenging issues when manually reviewing some of the tweets with implicit hate content was the distinction between literal and figurative language. Although there are some works that attempt to explain such a distinction theoretically, when reading what common users post on social media, it is evident that the problem is much more complex than the functional distinction presented in Sections 2.1 and 2.2. In this respect, one element that we observed in the manual review to differentiate between literal and figurative content is the so-called intention.
This extra-linguistic element is useful to explain why figurative language requires much more cognitive effort to interpret correctly. If we look at any of the tweets in this data set (or any other), it is easy to realize that they are only sequences of words with semantic meaning. Such meaning may be totally explicit (literalness), or it may be senseless (figurativity). This difference could be explained in terms of performance and competence, or even as a matter of correctness.
However, in a more comprehensive conception of language, such a difference would be motivated by the need to maximize communicative success [37].
This need could be the element that determines what type of information has to be profiled linguistically. If literal content is profiled, then a certain intention will permeate the statement. This intention will find a linguistic formalization through the selection of particular words or syntactic structures, for instance, to successfully communicate what is intended. In contrast, if figurative content is profiled, then the intention will guide the choice of different linguistic elements to guarantee the right transmission of information. It is possible that such communication fails, but in this case, the failure will rest not on the speaker’s intention, but on the hearer’s ability to interpret what is communicated figuratively. Let us consider the following tweets to clarify this point.
(a) “Esta gente solo merece el rechazo y el desprecio”. (These people deserve only rejection and contempt.)
(b) “Podrán decir lo que quieran de los de Tepito pero son de las pocas personas que se tapan la boca para estornudar o toser, hasta cargan tiner para desinfectarse las manos”. (You could say anything about people from Tepito, but they indeed cover their mouths when sneezing or coughing. Actually, they even use thinner to clean their hands).
Whereas in (a) the intention is to express hate speech against a social group, in (b) the intention is to express hate speech implicitly by means of using encrypted elements.
In each statement, the speaker has a communicative need, which is satisfied by maximizing certain elements. Thus, in the first example, the communicative success is based on making a precise affirmation (note that all the words in this context are very clear in terms of their semantic meaning). In contrast, the second example is based on deliberately selecting elements that entail secondary and non-literal relations: Using thinner to clean one’s hands is a sarcastic way of saying that these people are drug addicts. In addition, by naming the place Tepito, the speaker is implicitly communicating that they are poor and, likely, criminals. In this case, it is not that simple to identify what the intention is.
As noted above, intention is an extra-linguistic element; therefore, it is quite difficult to formalize. However, there is a fact that deserves in-depth analysis in facing this issue: Understanding the intention often involves an interpretive adjustment to individual words, i.e., not all the words in an utterance trigger an implicit intention; rather, the intention is usually triggered by the manipulation of a few individual words.
In addition, we explore some linguistic features to delve deeper into the mechanisms used to convey implicit hate speech (beyond figurative issues). It is worth noting that such features are work in progress; thus, their usefulness is preliminary. Some of them, to be assessed further, are discussed below.
These features are intended to provide elements to analyze implicit content at different levels. For instance, implicit content is supposed to be achieved by processing the linguistic input along secondary paths; thus, by analyzing components such as the incongruity produced by the simplest interpretation, the valences in syntactic chunks within a discussion thread, or even the entropy among n-grams, we consider it linguistically feasible to recognize some patterns with which to approach implicit language. A preliminary sketch of the incongruity component is given below.
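The following sketch contrasts the polarity valences of adjacent chunks using a tiny hypothetical lexicon; a sharp valence flip within one utterance may signal that the simplest (literal) interpretation is incongruous. The lexicon and function names are illustrative only:

# Hypothetical polarity lexicon; a real system would use a full
# Spanish polarity resource instead of this toy mapping.
POLARITY = {"good": 1, "great": 1, "love": 1, "bad": -1, "hate": -1, "awful": -1}

def chunk_valence(chunk):
    """Sum of word polarities in a chunk (0 if no lexicon hits)."""
    return sum(POLARITY.get(word.lower(), 0) for word in chunk.split())

def incongruity(chunks):
    """Count sign flips in valence between adjacent chunks."""
    valences = [chunk_valence(chunk) for chunk in chunks]
    return sum(
        1
        for a, b in zip(valences, valences[1:])
        if a * b < 0  # adjacent chunks with opposite polarity
    )

# A valence flip between the two chunks hints at incongruity.
print(incongruity(["I love how great this is", "what an awful idea"]))  # -> 1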
To illustrate the n-gram entropy feature, let us consider a 4-gram such as “mafia del no poder” (literally, “the mafia of no power”). This is an atypical sequence in a reference corpus; therefore, its entropy could make it evident that something unusual is happening: Processed literally, it is a senseless sequence, but with the right inferences, its violent content is unveiled.
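A minimal sketch of this idea is given below, assuming a simple average-surprisal measure under a unigram model of a reference corpus; the actual measure to be used is still under assessment:

import math
from collections import Counter

def surprisal_score(ngram, reference_tokens):
    """Average surprisal (in bits) of the n-gram's words under a
    unigram model of the reference corpus, with add-one smoothing."""
    counts = Counter(reference_tokens)
    total = len(reference_tokens)
    vocab = len(counts)
    words = ngram.split()
    bits = 0.0
    for word in words:
        p = (counts[word] + 1) / (total + vocab)  # add-one smoothing
        bits += -math.log2(p)
    return bits / len(words)

# Toy reference corpus; a real one would be a large Spanish corpus.
reference = "el poder de la gente es el poder del pueblo".split()
print(surprisal_score("mafia del no poder", reference))  # high: atypical sequence
print(surprisal_score("el poder", reference))            # lower: typical sequence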
5 Conclusions and Further Work
This article has presented an exploratory approach to dealing with implicit language in hate speech tweets. To this end, a data set with hate speech content in Spanish was manually built. The tweets were classified into four categories (violence, discrimination, bullying/harassment, and general), and then they were labeled by human annotators with one of two classes: Explicit hate speech or implicit hate speech.
The approach relied on first analyzing figurative language, especially irony and sarcasm, in the tweets belonging to the implicit hate speech class. Then, a manual review was carried out to investigate in depth what kind of formal information could be recognized for characterizing implicit language (considering both figurative and literal uses of language) in the context of hate speech. In this respect, a core feature (intention) was suggested for differentiating figurative from literal language, and a set of exploratory linguistic features was introduced to approach implicit language, in the near future, beyond the presence of figurative devices.
The initial findings are encouraging, although a more robust set of experiments must be carried out in order to demonstrate how useful such a set of features could be. Future work consists of assessing the exploratory linguistic features by comparing them against data sets used in shared tasks such as HatEval and MeOffendEs, among others.