Emoji Prediction Technical Map
What am I looking at?
Predictive emoji is all about understanding the sentiments of an author through their words. The map below aims to show how this algorithm works and what architecture allows it to do so. The language an author uses is crucial for investigating how they feel about a subject, and therefore how they might react to certain events, products, or political happenings. Sentiment Analysis is used by researchers to understand their audience in many contexts: political, social, business, marketing, or otherwise.
For this project, the primary technical component is the predictive emoji algorithm, which can be traced by following the bolded branch of the diagram above.
The predictive emoji algorithm (which varies in methodology across platforms such as Twitter, Google, Apple, etc.) is a general term for a technology that has grown in recent years. The map aims to show how the algorithm connects language, emotion, and emoji to suggest additions to your message. For instance, when you type a few words into your messaging application, say “I am so excited!”, the algorithm detects the various linguistic structures present (“I” as the pronoun, “am” as the verb, “so” as an adverb, “excited” as an adjective, and “!” as punctuation). Each word is connected in a matrix, a massive array of possible meanings and alternative keywords (just as 😆 is connected to the terms face, laugh, mouth, and smile in the diagram). However, you don’t have to describe the emoji to get a suggestion for it. You may also use emotional keywords, such as “exciting” or “funny”, or pair these words in phrases that may change their meaning completely, such as “not funny” or “can’t wait”. All of these factors and more play into the confidence behind predicting an emoji or word for the user.
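The keyword connections described above can be sketched as a simple lookup table. This is only an illustration under stated assumptions, not how any platform actually implements prediction: the 😆 keywords come from the diagram, while the other table entries, the `NEGATORS` set, and the `suggest` function are hypothetical names invented for this example.

```python
import re
from collections import defaultdict

# Hypothetical keyword table. Only the 😆 keywords (face, laugh, mouth, smile)
# come from the diagram; every other entry is an illustrative assumption.
EMOJI_KEYWORDS = {
    "😆": {"face", "laugh", "mouth", "smile", "funny"},
    "🤩": {"excited", "exciting", "superb"},
}

# Invert the table so that each keyword points to the emojis it can trigger.
KEYWORD_INDEX = defaultdict(set)
for emoji, keywords in EMOJI_KEYWORDS.items():
    for keyword in keywords:
        KEYWORD_INDEX[keyword].add(emoji)

# Assumed list of words that reverse the meaning of the word that follows.
NEGATORS = {"not", "no", "never"}


def suggest(text):
    """Return the emojis whose keywords appear in text, skipping negated keywords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    suggestions = set()
    for i, token in enumerate(tokens):
        negated = i > 0 and tokens[i - 1] in NEGATORS
        if not negated:
            suggestions |= KEYWORD_INDEX.get(token, set())
    return suggestions
```

With this table, `suggest("I am so excited!")` suggests 🤩, while `suggest("that was not funny")` suggests nothing, because “not” cancels the keyword that follows it. A real system would need far richer handling of phrases than this one-word lookback.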
In short, the way a predictive emoji algorithm detects the meaning in language and applies suggested emojis to the words is complex. The factors included in any specific prediction algorithm vary across platforms (e.g. Apple versus Google). However, we can point to a few general factors. These include, but are not limited to:
- the emotion of single words (e.g. “happy”),
- the emotion of paired words and phrases (e.g. “I am so not happy”),
- how frequently an emoji is used compared with less-used emojis (🥰 is used much more frequently than 🤢),
- the use of punctuation (commas, question marks, and exclamation points may completely alter the meaning),
- the use of other emojis (if someone has already used 😏, it is more likely they are being sarcastic), and
- the level of emotion in the text (certain words are more emotionally meaningful than others; compare “interesting” with “superb”).
In the end, the algorithm pulls together all of these factors to give us the most probable and applicable emojis it can, given the context provided.
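To make the idea of pulling these factors together more concrete, here is a minimal sketch of a scoring function. Everything in it is an assumption made for illustration: the word strengths, the usage weights, the punctuation boost, and the names `WORD_EMOTION`, `OPPOSITE`, `USAGE_PRIOR`, and `rank_emojis` are invented, and real platforms almost certainly use far more sophisticated, learned models.

```python
import re

# All values below are made-up illustrative numbers, not measurements.
# Emotion of single words: word -> (emoji, emotional strength).
WORD_EMOTION = {
    "happy": ("🥰", 1.0),
    "interesting": ("🙂", 0.4),
    "superb": ("🤩", 1.5),
    "gross": ("🤢", 1.0),
}
# Assumed emoji to use when a negator flips the meaning of a word.
OPPOSITE = {"🥰": "😞", "🙂": "😐", "🤩": "😞", "🤢": "🙂"}
# Assumed relative usage: frequently used emojis get a higher weight.
USAGE_PRIOR = {"🥰": 0.5, "🙂": 0.3, "🤩": 0.2, "🤢": 0.05,
               "😞": 0.2, "😐": 0.2, "😏": 0.1}
NEGATORS = {"not", "never", "no"}


def rank_emojis(text, top_n=3):
    """Return up to top_n (emoji, score) pairs, highest score first."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # Punctuation: each exclamation point (up to three) raises the intensity.
    boost = 1.0 + 0.25 * min(text.count("!"), 3)
    raw = {}
    for i, token in enumerate(tokens):
        if token not in WORD_EMOTION:
            continue
        emoji, strength = WORD_EMOTION[token]
        # Paired words: a negator directly before the word flips its emotion.
        if i > 0 and tokens[i - 1] in NEGATORS:
            emoji = OPPOSITE[emoji]
        raw[emoji] = raw.get(emoji, 0.0) + strength * boost
    # Other emojis: an existing smirk hints at sarcasm, so suggest it again.
    if "😏" in text:
        raw["😏"] = raw.get("😏", 0.0) + 1.0
    # Weight each candidate by how commonly the emoji is used.
    scored = [(e, s * USAGE_PRIOR.get(e, 0.1)) for e, s in raw.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

Under these assumed values, `rank_emojis("I am so happy")` ranks 🥰 first, while `rank_emojis("I am so not happy")` ranks 😞 first, and adding an exclamation point raises the score of whatever is suggested. The point of the sketch is only that each factor in the list above becomes one term in a combined score.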
Now that predictive emoji has been unpacked, it’s important to ask questions about the implications of this technology. The ability to apply computer science to linguistics and predict patterns of human emotion and reaction via language is quite incredible. Sentiment Analysis allows us to detect social unrest in communities beforehand, to predict elections, to measure how a consumer base feels about our latest product, and to conduct research without needing to leave our desks. It is worth considering, however, how this technology may be used in less healthy ways. As we become better predictors of emotion, it will become much easier for powerful political actors to use this data to manipulate the public subconscious as a form of propaganda. It’s always worth asking:
How is this technology making me think in ways that I wouldn’t if it weren’t present?
Authored by Andrew Peacock (April 2022)
References
- Unicode Consortium, Unicode.org
- Singh, G. V.; Firdaus, M.; Ekbal, A.; Bhattacharyya, P. (2022). Unity in Diversity: Multilabel Emoji Identification in Tweets. IEEE Transactions on Computational Social Systems. https://doi-org.proxy.library.georgetown.edu/10.1109/TCSS.2022.3162865