To find spots where Twitter gets hot, with people posting emphatic statements or opinions, we could start with emojis: new hashtags appear unpredictably, while the number of emojis is relatively limited.
Yet the most common emoji packages for Python and R still list between 1400 and 5000 emojis. That is far too many for me to loop through, making a tweet search request for each one. I decided to begin with my own list of emojis, which will be extended over time.
The first entries are the emojis used in a very lively Italian gossip group ("#jerù"), where anger about the stars is often expressed; that already makes 34. To begin with, I will stay within the realm of Italian tweets. As the case of Polish ๐ teaches, the use of emojis largely depends on cultures defined by languages.
My first Italian list is : ๐ ๐ป ๐ ๐ ๐ ๐ ☕ ๐ค ๐ ๐ฅฐ ♥️ ๐คฆ ❤️ ๐ฒ ๐ ๐ฆ ๐ธ ๐คฃ ⛰️ ๐ ๐ ♀️ ๐ ๐ฅ ๐ ๐ ๐ ๐ ๐ ๐ ๐ญ ๐ ๐บ ✨.
Looking for signs of excitement, I will keep a record of the tweets found with my emoji list only if 100 tweets contain more than 15 exclamation marks. Angry and happy people love doubling and tripling these marks.
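The filter rule just described can be sketched as a small function. This is only a minimal sketch: the names `count_exclamations` and `is_excited` and the sample tweets are invented for illustration, and scaling the threshold to batches that are not exactly 100 tweets is my own assumption.

```python
def count_exclamations(tweets):
    """Total number of '!' characters in a batch of tweet texts."""
    return sum(text.count("!") for text in tweets)


def is_excited(tweets, threshold=15, batch_size=100):
    """True if the batch has more than `threshold` '!' per `batch_size` tweets."""
    if not tweets:
        return False
    # Scale the count so batches of any size are compared per 100 tweets.
    rate = count_exclamations(tweets) * batch_size / len(tweets)
    return rate > threshold


# Invented example tweets, not real search results.
sample = ["Ma dai!!", "Che serata", "Incredibile!", "No comment"]
print(count_exclamations(sample), is_excited(sample))
```

With exactly 100 tweets the scaling has no effect, and the rule reduces to "more than 15 exclamation marks".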
In a tweet search done on November 12, 2022, the most prolific emoji was 🤣.
In the 100 tweets found with this emoji, it appeared 231 times. Exclamation marks: 27 in 9234 characters, with eight double "!" and three "!!!!".
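One detail worth keeping in mind when counting doubled and quadrupled marks: Python's `str.count` counts non-overlapping matches, so a "!!!!" is also counted as two "!!". A possible alternative, sketched here with an invented helper `exclamation_runs` and an invented example string, is to count maximal runs of "!" by their length.

```python
import re
from collections import Counter


def exclamation_runs(text):
    """Map run length -> number of maximal runs of '!' with that length."""
    return Counter(len(run) for run in re.findall(r"!+", text))


# Invented example text, not a real tweet.
text = "Basta! Ma davvero!! Vergogna!!!! Ciao"
runs = exclamation_runs(text)
print(sorted(runs.items()))  # [(1, 1), (2, 1), (4, 1)]
print(text.count("!!"))      # 3, because "!!!!" contains two "!!"
```

Counting by run length keeps single, double and quadruple marks apart, at the cost of no longer matching the simple `count` figures.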
The most frequent hashtags were
"#taleequaleshow" "#merito" "#konpetenza" "#ottoemezzo" "#calenda" "#novax",
i.e. two about political TV shows, two hashtags used by a journalist (book title: “Damned Pacifists”), and two about politics, including the famous hashtag "novax". Maybe this indicates the right way to find angry people. Should we move on by following the hashtags?
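A ranking of hashtags like the one above can be obtained with a regular expression and a counter. The following is a minimal sketch, not the code used for this post; `top_hashtags` and the sample tweets are invented, and lowercasing the tags is an assumption about how they should be merged.

```python
import re
from collections import Counter

# "\w" also matches accented letters, so Italian hashtags stay intact.
HASHTAG_RE = re.compile(r"#\w+")


def top_hashtags(tweets, n=6):
    """Return the n most frequent hashtags, compared case-insensitively."""
    counts = Counter(
        tag.lower() for text in tweets for tag in HASHTAG_RE.findall(text)
    )
    return counts.most_common(n)


# Invented example tweets.
sample = [
    "Che ridere #TaleEQualeShow",
    "#novax ancora? #calenda",
    "Stasera #taleequaleshow!",
]
print(top_hashtags(sample, n=2))
```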
A search for the hashtag "novax" (07/01/23, n=100) yields only 15 different emojis,
๐ฏ ๐ ⚧️ ๐ ๐ฟ ๐ ๐ ๐ง ๐ณ ๐ฉ ๐คฃ ๐คก ๐ณ️ ๐ช๐บ ๐ชณ
with only four exclamation marks. Where has the excitement gone? We see ๐ฉ ten times, after 35 times 🤣.
The same holds for "#calenda" (an Italian politician). A search yields 18 exclamation marks, with three double "!" and one "!!!". But according to R and its emoji package, only five out of 100 tweets contain emojis, 44 in all (26 unique): five ➡️, five 🇮🇹, five 🇪🇺. Is it that Italians use only a few emojis when it comes to politics? This could be due to the higher age of the people interested in politics here.
Let us try other "controversial topics". A "#Salvini" search (a right-wing politician) yields 11 out of 100 tweets with emojis: 22 🤡, 18 ๐, 13 ๐ and five 🇮🇹. Other "hot" topics like #bce (European Central Bank) or "nosbarchi" (no acceptance of refugees) give similar results. People get angry about politics, but they do not use emojis in these fields.
Still, by searching with emojis we can find angry people. The clown 🤡 is used when people find something or somebody laughable. Again, in Italy, the top hashtags are about soccer and the reality show Big Brother VIP.
We see
feature frequency rank
1 🤡 432 1
2 ๐ 72 2
3 🤣 38 3
4 🤮 26 4
5 ๐ 16 5
6 ๐๐ป 12 6
7 ๐ฉ 12 6
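In the ranking above, the two features with frequency 12 share rank 6, so tied counts get the same rank. A table of this shape can be sketched in Python as follows; `frequency_table` and the sample data are invented for illustration and are not the code that produced the figures above.

```python
from collections import Counter


def frequency_table(emojis_found):
    """Return (feature, frequency, rank) rows; tied frequencies share a rank."""
    rows = []
    rank, previous = 0, None
    ordered = Counter(emojis_found).most_common()
    for position, (feature, freq) in enumerate(ordered, start=1):
        if freq != previous:
            # A new frequency value starts a new rank at the current position.
            rank, previous = position, freq
        rows.append((feature, freq, rank))
    return rows


# Invented sample: one dominant emoji and two tied ones.
sample = ["🤡"] * 5 + ["🤣"] * 2 + ["🤮"] * 2 + ["☕"]
for feature, freq, rank in frequency_table(sample):
    print(feature, freq, rank)
```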
People are angry. Searching for anger through the emojis that surround it could be helpful.
The number of unique emojis here is only 23.
๐คฌ ๐คฃ ✅ ๐คฆ ๐ ๐ฑ ๐ก ⚫ ๐ ๐ต ๐ธ ๐คก ๐ ๐คข ๐ฉ ๐คฎ ๐ซ ⏩ ๐ ๐น.
Maybe this should be the basic list of emojis in the search for angry people, i.e. for hot spots on Twitter.
Tweets with the middle finger, for example, are presumably quite aggressive. An API search (n=10) on November 12, 2022, returns antisemitic content, but only two exclamation marks. The number grows again with 🤮, to seven in ten tweets, with two double "!". ๐ก gives eight exclamation marks (two double), in tweets about animal rights. The ๐ฃ brings tweets about conspiracy theories, without any "!".
A ๐ก search gives 42 emojis in 10 tweets, namely
๐คฌ ๐คฃ ๐ฏ ๐ ๐ ๐ช ๐คฆ ➡️ ๐ ๐ข ๐ ♀️ ๐ฅ ๐ค ๐ฟ ๐ฑ ๐ก ๐ค ❣️ ๐ฎ๐น ๐ป ๐ ♂️ ๐ด ๐ ๐ ๐ค ๐ ๐คข ๐ฉ ๐คฎ ๐ ๐ฅฒ ๐คช ๐ฅบ ‼️ ๐ ⏩ ๐คจ ๐ ๐น ☕
The exclamation marks are here, not among the characters. We should get to know emojis by the company they keep.
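Getting to know an emoji by its company amounts to counting which other emojis appear in the same tweets. Here is a minimal sketch under strong assumptions: `EMOJI_RE` is only a rough approximation of "emoji" (it misses flags and splits skin-tone and ZWJ sequences), and `companions` and the sample tweets are invented.

```python
import re
from collections import Counter

# Rough approximation only: a few Unicode blocks that hold most emojis.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def companions(tweets, target):
    """Count the emojis that occur in the same tweet as `target`."""
    counts = Counter()
    for text in tweets:
        found = EMOJI_RE.findall(text)
        if target in found:
            counts.update(e for e in found if e != target)
    return counts


# Invented example tweets.
sample = ["Che pagliacci 🤡🤣", "Basta 🤡🤮🤮", "Buongiorno ☕"]
print(companions(sample, "🤡").most_common())
```

A dedicated emoji library would recognise far more sequences than this regular expression; the sketch only shows the counting idea.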
Emoji diversity
For comparison, a ☕ search yields 62 unique emojis out of 421 emojis in total, a lexical emoji diversity of .147. Lexical diversity among emojis is low in all cases; the maxima are .23 with ๐ฅ, .21 with ๐, .22 with ❤️ and .2 with ๐. This does not depend simply on the number of emojis (for example, 254 emojis correspond to .10, 601 to .15). The variety of accompanying emojis rather seems to be a characteristic of the emoji itself.
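The figure of .147 is consistent with a diversity defined as the number of unique emojis divided by the total number of emojis (62 / 421), analogous to a type-token ratio. Assuming that definition, a minimal sketch looks like this; `emoji_diversity` and the sample list are invented for illustration.

```python
def emoji_diversity(emojis_found):
    """Unique emojis divided by total emojis; 0.0 for an empty list."""
    total = len(emojis_found)
    return len(set(emojis_found)) / total if total else 0.0


# The ratio for the coffee search reported above: 62 unique out of 421.
print(round(62 / 421, 3))  # 0.147

# Invented sample: three different emojis among four.
sample = ["☕", "☕", "❤", "✨"]
print(emoji_diversity(sample))  # 0.75
```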
Technically
(OK, it is not elegant, but I learned programming with Algol W, in 1975.)
import csv

import emojis
import tweepy

# https://stackoverflow.com/questions/49113909/split-and-count-emojis-and-words-in-agiven-string-in-python

# <authentication stuff> (must define auth)
api = tweepy.API(auth)

# Read the emoji list: comma-separated values, possibly over several rows.
with open('emojilistit22.csv', 'r', encoding='utf-8') as mine:
    leser = csv.reader(mine, delimiter=',')
    leserl = [kw for row in leser for kw in row]

for kw in leserl:
    print(kw)
    tweetCount = 100
    results = api.search_tweets(kw, count=tweetCount, lang="it")
    container = [tweet.text for tweet in results]
    row_count = len(container)
    print("number of tweets ", row_count)

    # Archive the raw tweet texts, all in one CSV row.
    filename = "exclamation_basis_" + kw + "2022-11" + ".csv"
    with open(filename, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(container)

    # Count directly on the joined tweet texts.
    zeichenkette = " ".join(container)
    lang = len(zeichenkette)
    print("Number of characters: ", lang)
    if lang == 0:
        continue  # no tweets found; avoid dividing by zero below

    # Full stops versus ellipses
    x = zeichenkette.count(".")
    print("Number of single points (full stops? .)", x)
    prozent = x / lang * 100
    print(prozent, "%")
    y = zeichenkette.count("...")
    print("Number of three points (ellipsis ...)", y)
    prozent = y / lang * 100
    print(prozent, "%")
    if y > 0:
        relation = x / y
        print("Relation: ", relation)

    # Exclamation marks. str.count() counts non-overlapping matches,
    # so a "!!!!" is also counted as two "!!".
    x = zeichenkette.count("!")
    print("Number of exclamation marks", x)
    prozent = x / lang * 100
    print(prozent, "%")
    if x > 15:
        # Report only the emojis that pass the threshold of 15.
        print(kw, "exclamation marks:", x, "on", lang, "characters", prozent, "%")
    y = zeichenkette.count("!!")
    print("Number of two exclamation marks", y)
    prozent = y / lang * 100
    print(prozent, "%")
    if y > 0:
        relation = x / y
        print("Relation: ", relation)
    x = zeichenkette.count("!!!")
    print("Number of three exclamation marks", x)
    y = zeichenkette.count("!!!!")
    print("Number of four exclamation marks", y)

    # just having a look at the emojis, will construct an archive later
    emoji_hier = emojis.get(zeichenkette)
    print(*emoji_hier)
    zahlein = emojis.count(zeichenkette, unique=True)
    zahlall = emojis.count(zeichenkette)
    print(zahlein, zahlall)