[1] "😍"
[1] 100
feature frequency rank
1 😍_😍 147 1
2 😍_😍_😍 106 2
3 😍_😍_😍_😍 71 3
Looking for the same emoji in German tweets:
tokens_ngrams(n = 2:4)
feature frequency rank docfreq group
1 😍_😍 26 1 10 all
2 😍_😍_😍 16 2 9 all
3 guten_morgen 15 3 15 all
This looks like a cultural difference: the doubled 😍 occurs only 26 times here, against 147 times in the first sample. Either way, the most frequent bigrams are still these 😍 pairs.
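For readers who want to reproduce this kind of table outside R, here is a minimal Python sketch of the counting step. It is not the code used for the tables in this post; the tokenizer regex and the names `tokenize`, `ngrams`, and `ngram_table` are assumptions made for illustration. It builds all n-grams for n = 2 to 4, joins them with "_", and reports the total frequency and the document frequency.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: words stay whole; every other non-space character
    # (emoji, punctuation) becomes a token of its own.
    return re.findall(r"\w+|[^\w\s]", text.lower())

def ngrams(tokens, sizes=range(2, 5)):
    # Yield all n-grams for n = 2, 3, 4, joined with "_".
    for n in sizes:
        for i in range(len(tokens) - n + 1):
            yield "_".join(tokens[i:i + n])

def ngram_table(docs, top=3):
    frequency = Counter()  # total number of occurrences
    docfreq = Counter()    # number of documents containing the n-gram
    for doc in docs:
        grams = list(ngrams(tokenize(doc)))
        frequency.update(grams)
        docfreq.update(set(grams))
    return [(g, f, docfreq[g]) for g, f in frequency.most_common(top)]

# Made-up example documents, not real tweets.
docs = ["guten morgen 😍😍😍", "😍😍 so schön", "guten morgen"]
for feature, freq, dfreq in ngram_table(docs):
    print(feature, freq, dfreq)
```

Note that a run of three identical emojis contributes two bigrams and one trigram, which is why the frequency of a pair can be higher than the number of documents.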
The general rule could be:
When you search tweets for an emoji, the most frequent bigrams you get are made of emojis again. Other examples:
[1] "🤥"
[1] 100
feature frequency rank docfreq group
1 🤥_🤥 165 1 31 all
2 🤥_🤥_🤥 134 2 22 all
3 🤥_🤥_🤥_🤥 112 3 12 all
[1] "😂"
[1] 100
feature frequency rank docfreq
1 😂_😂 113 1 48
2 😂_😂_😂 65 2 32
Laughter ("Rolling ...") comes alone in only 55% of the cases.
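One possible way to compute such a share is sketched below in Python. This is an assumption about what "alone" means, not the calculation behind the figure above: a document counts as "alone" if it contains the emoji but never the emoji directly followed by itself (whitespace in between is ignored). The function name `share_alone` is hypothetical.

```python
import re

def share_alone(docs, emoji):
    # Among the documents that contain the emoji, return the fraction
    # in which it is never immediately repeated.
    doubled = re.compile(re.escape(emoji) + r"\s*" + re.escape(emoji))
    with_emoji = [d for d in docs if emoji in d]
    if not with_emoji:
        return 0.0
    alone = [d for d in with_emoji if not doubled.search(d)]
    return len(alone) / len(with_emoji)

# Made-up example documents, not real tweets.
docs = ["😂", "😂😂 haha", "so 😂 funny 😂", "nothing here"]
print(share_alone(docs, "😂"))
```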
[1] "🤣"
[1] 100
feature frequency rank docfreq group
1 🤣_🤣 143 1 45 all
2 🤣_🤣_🤣 96 2 32 all
3 🤣_🤣_🤣_🤣 62 3 13 all
In 100 documents, there are 246 🤣. It looks like an echo.
Again, in German tweets, the tendency is weaker.
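With 246 occurrences in 100 documents, the average is 2.46 per tweet. To see how the "echo" is distributed, one could count the lengths of uninterrupted runs of the emoji. The following Python sketch does that under simple assumptions (runs are only counted when the emojis are directly adjacent, with no whitespace between them); `run_lengths` is a hypothetical helper name.

```python
import re
from collections import Counter

def run_lengths(docs, emoji):
    # Histogram: run length -> number of runs of that length.
    pattern = re.compile("(?:%s)+" % re.escape(emoji))
    hist = Counter()
    for doc in docs:
        for match in pattern.finditer(doc):
            hist[len(match.group()) // len(emoji)] += 1
    return hist

# Made-up example documents, not real tweets.
docs = ["🤣🤣🤣", "lol 🤣", "🤣🤣 nein 🤣"]
hist = run_lengths(docs, "🤣")
total = sum(length * count for length, count in hist.items())
print(dict(hist), total, total / len(docs))
```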
feature frequency rank docfreq group
1 🤣_🤣 78 1 35 all
2 🤣_🤣_🤣 42 2 28 all
3 ._. 38 3 10 all
12 ?_🤣 5 12 5 all
This circumstance will be explored later.
Anger seems to be contagious as well. Take a look at:
[1] "😡"
[1] 100
feature frequency rank docfreq
1 😡_😡 114 1 49
2 ._. 75 2 20
3 😡_😡_😡 65 3 36
And, uhm:
[1] 100
feature frequency rank docfreq group
1 💩_💩 102 1 35 all
2 ._. 73 2 21 all
3 💩_💩_💩 67 3 27 all
Washing it away:
[1] "💦"
[1] 100
feature frequency rank docfreq
1 💦_💦 137 1 60
2 💦_💦_💦 77 2 39
A rather strange guy:
[1] "👺"
[1] 100
feature frequency rank
1 👺_👺 88 1
2-4 user names
5 👺_👺_👺 37 2
6 :_👺 20 6
Number six combines the tengu (or leprechaun) with a colon. Consider "!_😍 131"!
We know that punctuation marks behave strangely in text messages. They often appear in pairs or triples, and we are getting used to phenomena like "!!!!". Meanwhile, the simple full stop is weakened. What if punctuation marks lost their grammatical meaning and became emojis in their own right?
A second lead for further research is "evoking emojis". They are not doubled, but complemented by other emojis, seemingly according to certain rules.
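To check how often punctuation is doubled or tripled, one could count runs of a repeated sign. Here is a minimal Python sketch; the choice of the three signs "!", "?" and "." and the name `punctuation_runs` are assumptions for illustration.

```python
import re
from collections import Counter

# A sign from the set, followed by at least one repetition of itself.
PUNCT_RUN = re.compile(r"([!?.])\1+")

def punctuation_runs(docs):
    # Count every run of repeated punctuation, e.g. "!!!!" or "...".
    counts = Counter()
    for doc in docs:
        for match in PUNCT_RUN.finditer(doc):
            counts[match.group()] += 1
    return counts

# Made-up example documents, not real tweets.
docs = ["Wow!!!! Really??", "ok.", "hmm... yes!!!!"]
print(punctuation_runs(docs).most_common())
```

A single full stop, as in "ok.", is deliberately not counted, so the result only shows the repeated forms.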
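Such pairs could be isolated by keeping only bigrams in which both tokens are emojis and the two differ. The Python sketch below does this with a rough heuristic that is an assumption of mine: a token is treated as an emoji if it is a single character in the Unicode category "So" (other symbols). That also catches some non-emoji symbols and splits multi-character emoji sequences, so take it as a starting point only. The names `is_emoji` and `mixed_emoji_bigrams` are hypothetical.

```python
import re
import unicodedata
from collections import Counter

def is_emoji(token):
    # Rough heuristic (assumption): one character of category "So".
    return len(token) == 1 and unicodedata.category(token) == "So"

def mixed_emoji_bigrams(docs):
    # Count bigrams made of two different emojis, joined with "_".
    counts = Counter()
    for doc in docs:
        tokens = re.findall(r"\w+|[^\w\s]", doc)
        for first, second in zip(tokens, tokens[1:]):
            if is_emoji(first) and is_emoji(second) and first != second:
                counts[first + "_" + second] += 1
    return counts

# Made-up example documents, not real tweets.
docs = ["💛💙 love", "💛💙", "😍😍"]
print(mixed_emoji_bigrams(docs).most_common())
```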
The general rule, for now, would be:
When you search tweets for an emoji, the most frequent bigrams are made of emojis again, but not always of the emoji you searched for.
[1] "🤍" white heart
[1] 100
feature frequency rank
1 😍_😍 130 1
2 😍_😍_😍 106 2
3 😍_😍_😍_😍 82 3
Weaker:
[1] "💛"
[1] 100
feature frequency rank docfreq
1 💛_❤ 23 1 22
Technically