We might think that in chats the full stop is needed less than in written texts, since the end of a sentence may coincide with the end of a message, sufficiently marked by the "send" button.
The ellipsis ("..."), by contrast, as a means of being fast and allusive when communicating within a social group, might be more frequent.
First try: looking for tweets with full stops
Searching for tweets with full stops gives a first impression of the distribution of punctuation marks. I submitted tweet searches (22-28 October 2022, n = 1000) in German, in Italian, and in Polish.
German
From the general feature frequency table,
textstat_frequency(matrix, n=20)
feature frequency docfreq
. 783 427
Does this mean that among the 1000 tweets found with keyword ".", only 427 documents actually contain the mark as a token? Something strange is happening here in the punctuation count (see below, "Technically").
At least one result is clear: the full stop is still there; by far not all the "."s are absorbed by "...".
The rest of the table gives an impression of the situation. Nearly a fourth of the "." tweets use the ellipsis sign as well, usually just once per tweet.
, 568 354
: 502 412
rt 350 350
… 227 225
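The difference between frequency and docfreq can be made concrete in a few lines of Python (a toy sketch of the two counts, not quanteda's internals): frequency counts every occurrence of a token, while docfreq counts each document at most once.

```python
def freq_and_docfreq(docs, token):
    """Total occurrences of `token` vs. number of documents containing it."""
    frequency = sum(doc.count(token) for doc in docs)
    docfreq = sum(1 for doc in docs if token in doc)
    return frequency, docfreq

# Toy "tweets", already tokenized:
tweets = [
    ["heute", ".", "regen", "."],   # two full stops in one tweet
    ["morgen", "sonne", "."],
    ["…", "mal", "sehen"],          # ellipsis, but no full stop
]
print(freq_and_docfreq(tweets, "."))  # (3, 2): three occurrences in two documents
```

So a frequency of 783 against a docfreq of 427 simply means that tweets containing "." tend to contain it more than once; it does not by itself explain why 573 of the retrieved tweets contain none at all.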
In Italian, full stops are less frequent, and so is "...", although the ratio "."/"..." is quite similar (783/227 = 3.45 in German against 537/195 = 2.75 in Italian).
. 537 281
: 431 375
, 345 220
… 195 183
! 121 77
In Polish, we have fewer full stops (500) than colons (555), while "…" does not appear among the first twelve features at all.
Obviously, having searched for tweets containing ".", we do not get a view of the real frequency of full stops in tweets overall.
A little surprise, though, shows up when considering skipgrams (n = 4, skip = 2:4).
In German, the most frequent ones are
1 🫶 🫶 🫶 🫶
2 🌹 🌹 🌻 🌻
In Italian, the first one is
! 👏 👏 👏 194
And in Polish, we see
🌱 ✨ 💚 ✨
The scene is dominated by emojis. In the Italian result, it may even seem the exclamation mark has been absorbed by the emojis, functioning as an emoji itself.
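To make it explicit what a skipgram is, here is a minimal Python version of my own (not quanteda's implementation): an n-gram whose tokens need not be adjacent, with the allowed gap sizes given by a skip set.

```python
from itertools import combinations

def skipgrams(tokens, n, skips):
    """All n-token subsequences whose gaps between successive picks
    are in `skips` (gap 0 = adjacent tokens)."""
    grams = []
    for idx in combinations(range(len(tokens)), n):
        gaps = [b - a - 1 for a, b in zip(idx, idx[1:])]
        if all(g in skips for g in gaps):
            grams.append(tuple(tokens[i] for i in idx))
    return grams

toks = ["!", "👏", "x", "👏", "y", "👏"]
print(skipgrams(toks, 3, {0, 1}))  # trigrams allowing a gap of at most one token
```

With wide skips, repeated emojis scattered over a tweet line up into the same skipgram again and again, which helps explain why emojis dominate these lists.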
For further investigation I will switch from R to Python, because I prefer to control directly what is being counted.
Technically
Search command
fund <- search_tweets(".", n=1000, retryonratelimit = TRUE, include_rts=TRUE, lang="de")
Punctuation mark count
# tokenize the tweet texts first (this step was assumed, not shown originally)
fund_toks <- tokens(fund$text)
# keep only the punctuation marks of interest, then count them
keeping <- c(".", "...", ",", "!", "?", ":", "-", ";")
nmatrix <- tokens_select(fund_toks, keeping, selection = "keep")
schau <- dfm(nmatrix)
textstat_frequency(schau)
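Since the next step will be Python anyway, the same count can be sketched there directly (my own sketch; names like `count_punctuation` are mine, not from any library). Note that "..." must come before "." in the alternation, otherwise the regex would split every ellipsis into three single stops:

```python
import re
from collections import Counter

# "..." is listed before "." so the alternation matches it first
KEEPING = ["...", ".", ",", "!", "?", ":", "-", ";"]
PATTERN = re.compile("|".join(re.escape(p) for p in KEEPING))

def count_punctuation(texts):
    """Tally the kept punctuation marks over a list of tweet texts."""
    counts = Counter()
    for text in texts:
        counts.update(PATTERN.findall(text))
    return counts

print(count_punctuation(["Na gut... bis morgen.", "Echt jetzt?!"]))
```

Counting on the raw texts like this sidesteps whatever the tokenizer does to punctuation, which is exactly the control I am after.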
kwic search
kontext1 <- kwic(fund_toks, ".", valuetype = "glob", window = 10)
kontext2 <- kwic(fund_toks, pattern= ":", window=10)
kontext3 <- kwic(fund_toks, pattern = "...", window = 10)
The last command does not return any results, as I have also described on Stack Overflow, without getting a response so far.
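A hedged guess at the missing results: the frequency tables above list the single character "…", not three dots, so the tokenizer may have normalized "..." away before kwic ever sees it. A minimal keyword-in-context of my own in Python (not quanteda's kwic) that treats both forms as the same token:

```python
def normalize(tok):
    """Treat the three-dot ellipsis and the single character "…" as one token."""
    return "…" if tok == "..." else tok

def kwic(tokens, keyword, window=10):
    """Keyword in context: up to `window` tokens left and right of each hit."""
    key = normalize(keyword)
    hits = []
    for i, tok in enumerate(tokens):
        if normalize(tok) == key:
            hits.append((tokens[max(0, i - window):i], tok,
                         tokens[i + 1:i + 1 + window]))
    return hits

toks = ["na", "gut", "…", "bis", "morgen"]
print(kwic(toks, "...", window=2))  # finds the hit despite the "..."/"…" mismatch
```

If this guess is right, the R search should also succeed with pattern "…" instead of "...".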