While doing some Twitter analysis for a philosopher from Milan who was studying conspiracy theories in tweets about an Italian "Big Brother" couple, 🦋🐻, I ended up isolating a few recurring emojis in this material, namely 💞, 🍒, 👅, 🐻, 🦋, 🤦‍♀️, 🌶️, 🍀, 🤪, 😂, 😅. Since they are typical of certain tweets, would they, used without any text, retrieve a series of similar tweets?
Searching with search_tweets for single emojis did not yield anything notable: as a first approximation, the average text cosines stayed below .3, and only the cosines between emojis were above .5, as one would expect.
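To make the cosine figures easier to follow, here is a minimal base R sketch of an average pairwise cosine over tweets. It is one plausible reading of "text cosine" (all features) and "emoji cosine" (emoji features only), not necessarily the exact computation behind the numbers in this post. The toy tweets and the names `cosine`, `mean_pairwise_cosine` and `emoji_cols` are made up for illustration.

```r
# Toy data: three invented tweets, not taken from the real sample.
tweets <- c("🦋 🐻 amore", "🦋 🐻 🐻", "🍒 amore bello")

# Split on whitespace and build a document-by-feature count matrix.
tokens <- strsplit(tweets, "\\s+")
vocab  <- unique(unlist(tokens))
counts <- t(sapply(tokens, function(tk) as.vector(table(factor(tk, levels = vocab)))))
colnames(counts) <- vocab

# Cosine similarity between two count vectors.
cosine <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

# Average cosine over all unordered pairs of documents (rows).
mean_pairwise_cosine <- function(m) {
  pairs <- combn(nrow(m), 2)
  mean(apply(pairs, 2, function(p) cosine(m[p[1], ], m[p[2], ])))
}

# "Text cosine" (assumption): similarity over all features.
text_cosine <- mean_pairwise_cosine(counts)

# "Emoji cosine" (assumption): similarity over the emoji columns only.
emoji_cols   <- c("🦋", "🐻", "🍒")
emoji_cosine <- mean_pairwise_cosine(counts[, emoji_cols])
```

In a quanteda workflow the same idea would normally run on a document-feature matrix rather than on a hand-built one; the hand-built matrix only keeps the sketch free of dependencies.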
But what about pairs of emojis? If they are used like words or short expressions, their combinations could be significant.
I simply built the search queries with a nested loop over the emoji list.
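A nested loop of this kind could look roughly like the sketch below. It assumes that each pair is unordered and that the two emojis are joined by a space in the query; both are my assumptions for the illustration. The list is shortened to five emojis, and the actual search call is left as a comment because it needs API credentials.

```r
# Shortened emoji list for the illustration.
emojis <- c("💞", "🍒", "👅", "🐻", "🦋")

# Collect one query string per unordered pair of distinct emojis.
queries <- character(0)
for (i in seq_along(emojis)) {
  for (j in seq_along(emojis)) {
    if (i < j) {  # skip self-pairs and mirrored duplicates
      queries <- c(queries, paste(emojis[i], emojis[j]))
    }
  }
}

n_queries <- length(queries)  # equals choose(5, 2)

# Hypothetical use with rtweet (requires authentication, so not run here):
# results <- lapply(queries, function(q) rtweet::search_tweets(q, n = 100))
```

With the full list of eleven emojis the same loop yields choose(11, 2) = 55 queries.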
After a first try (14/09/2022), with n=100, some of the combinations could be eliminated because they were rarely used, such as 🍒🌶️ (only twice, probably a question of taste). Most of the others produced a low text cosine, such as 🍒🦋 (n=70, cosine = .23). But what about 🍒🤪, with only 15 tweets and a text cosine of .58?
Seemingly more interesting was the 👅🦋 pair: 78 documents with a text cosine of .52 and a cosine between emojis of .89. Does this mean that certain pairs of emojis produce a coherent text world? Even a social world?
Running a similar search (rtweet: search_tweets) about a month later (09/10/2022), with n=10000, I got 71 tweets with 👅🦋. The similarity between the tweets seems to be lower. Still, the average cosine for all the texts is 0.3721238, and the emoji cosine is 0.8074077.
Among the most frequent features there are no words at all. From here it looks like a closed emoji world.
textstat_frequency(matrixa, n=20)
   feature  frequency  rank  docfreq  group
1  👅       177        1     71       all
2  🦋       115        2     71       all
3  💦       56         3     33       all
4  💋       56         3     33       all
5  ❤        29         5     19       all
6  ♥        28         6     9        all
7  🔥       21         7     10       all
8  🍆       19         8     17       all
What do they mean?
If we follow certain linguistic approaches to semantics, we might say: we will know an emoji by the company it keeps. This could mean the company of other emojis, but also the company of words. A quick topic analysis (LDA) with k=5 results in themes like love ("amore", "bellezza"), but mainly sex, with vulgar words ("figa", "troia", "patatina") and some user names similar to "@milf43". This has to do with the meaning of "butterfly" in Italian. A month earlier it had been used as a personification, now as a sexual metaphor.
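The "company it keeps" idea can be made concrete with a simple co-occurrence count. The base R sketch below is not the LDA mentioned above, just a simpler stand-in: for one target emoji it counts in how many tweets each other feature appears alongside it. The toy tweets and the names `target`, `company` and `company_counts` are invented for the example.

```r
# Toy data: four invented tweets, not taken from the real sample.
tweets <- c("👅 🦋 amore", "👅 🦋 💦", "🦋 bellezza", "🍒 amore")
target <- "🦋"

# Tokenise on whitespace and keep only the tweets containing the target.
tokens     <- strsplit(tweets, "\\s+")
has_target <- vapply(tokens, function(tk) target %in% tk, logical(1))

# For each such tweet, list the other features once (document frequency).
company <- unlist(lapply(tokens[has_target],
                         function(tk) unique(tk[tk != target])))

# Features ranked by the number of tweets they share with the target.
company_counts <- sort(table(company), decreasing = TRUE)
```

Running the same count on two samples taken a month apart would be one way to see the company of 🦋 shift, without relying on a topic model.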
A plot of the underlying (?) communication structure with igraph and actorNetwork:
An entire world, no: two or three of them, are waiting to be explored.