In 10 languages, happy words beat sad ones

Amid the everyday storm of flaming, bitching, cursing and general bad-mouthing in music and film, and on Twitter and the web, how’s this for bizarre? A new study of billions of words actually used in 10 major languages finds that writers in each language prefer positive, happy words. Whether it’s tweets in English, movie subtitles in Korean, books in Russian or websites in French, this general preference for positivity showed up in every single realm.

The study is a classic example of using big data to address longstanding questions. The authors cast a wide net, looking at English, Spanish, French, German, Brazilian Portuguese, Korean, Chinese (simplified), Russian, Indonesian and Arabic. All told, they drift-netted 24 subtypes of media, including books, news outlets, social media, websites, television and movie subtitles, and music lyrics.
Each medium provided at least 5,000 words, and each language, at least 10,000. “Lists that were used in the past were created by experts,” says Peter Dodds, who collaborated with Chris Danforth on the study. Both men are mathematicians at the University of Vermont’s Computational Story Lab. “We thought, ‘Let’s go find all the words we use most commonly.’”
The researchers then asked 1,900 native speakers of the various languages to rate the words on a scale of 1 (illustrated with a deeply frowning face) to 9 (a broadly smiling face).
The result was 50 ratings for each word, for 5 million total evaluations.
But what about this?
We asked whether words removed from their context could be misleading. Wouldn’t the first word of the love ballad “Killing Me Softly With His Song” register as negative? Yes, Dodds says, “but they wash out when the sample gets large enough. It’s like measuring the temperature of a room. If you look at a few molecules, maybe they are more or less active,” but temperature is an average of all the molecules. In the same way, “We are trying to get at the whole picture” of language, he says.

Neutral words, such as “the” and “for,” were ranked, as expected, around 5, but why not just ignore them? “We wanted the instrument we created to fit the language we would be using it on,” says Dodds. It’s easy to remove these “function words” from evaluations later on, he says, “but we can do this in a principled way. We don’t bring in our biases. We let people tell us what their language is.”
Once the words and the ratings were nailed down, it was a straightforward — if big — computing task to rate the words actually used in different media for the various languages.
And here’s the weird part: The average rating for every language — and every medium — was significantly above 5.
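The scoring step can be pictured as a frequency-weighted average: every time a rated word appears, its rating counts once toward the mean. The Python sketch below is only an illustration under stated assumptions. The ratings are invented for the example and are not from the study's word lists, and the function name `average_rating` is hypothetical.

```python
from collections import Counter

# Hypothetical ratings on the study's 1-to-9 scale. These numbers are
# made up for illustration; they are not the published values.
RATINGS = {
    "happy": 8.3,
    "love": 8.4,
    "the": 5.0,
    "for": 5.0,
    "never": 3.3,
    "killing": 1.7,
}


def average_rating(text, ratings=RATINGS):
    """Return the frequency-weighted mean rating of the rated words in text.

    Words that have no rating are skipped. Returns None if nothing is rated.
    """
    # Lowercase each word and trim common punctuation from its edges.
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    counts = Counter(w for w in words if w in ratings)
    total = sum(counts.values())
    if total == 0:
        return None
    # Each occurrence of a word contributes its rating once.
    return sum(ratings[w] * n for w, n in counts.items()) / total


sample = "Killing me softly, the love for the happy"
score = average_rating(sample)
print(score)
```

With these invented numbers, one strongly negative word ("killing") is outweighed by the neutral and positive words around it, which is the "washing out" Dodds describes, on a very small scale.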
So did the study show that people have, in general, positive emotions? “That’s strong,” Dodds says. “I think it’s proof of a positive bias in language, and language is our code for how we interact; it’s our great social technology, an amazing invention that allows our communication in a powerful way.”
A long debate in linguistics boils down to this: Language shapes us. Or we shape language. The reality, Dodds says, is somewhere in the middle. “I think language encodes our sociality. We are social beings. You can argue that we are selfish or altruistic, but language tells a story about how we behave.”

The sweet tweet
After collecting about 100 billion words in tweets, the researchers were able to track emotional expressions by time and place (if the tweets were geotagged).
And that showed how emotional expression can vary from day to day. On a weekly cycle, Saturdays proved most positive; Tuesdays were most negative. Holidays, especially “merry” Christmas and “happy” Thanksgiving, marked the biggest spikes; deaths, murders and other outrages brought the biggest lows. Among the languages, Spanish, taken from Mexican websites, books and tweets, was the most positive, and Chinese, taken from Google Books, was the most negative.
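One plausible way to surface a weekly cycle like this is to score each day's tweets and then average the daily scores by day of the week. The sketch below assumes the daily scores already exist. The dates and values are invented, `mean_by_weekday` is a hypothetical helper, and this is not necessarily how the researchers did it.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_by_weekday(scored_days):
    """Average daily scores by day of the week.

    scored_days: iterable of (date, score) pairs.
    Returns a dict mapping weekday name to the mean score for that weekday.
    """
    buckets = defaultdict(list)
    for day, score in scored_days:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        buckets[WEEKDAYS[day.weekday()]].append(score)
    return {name: sum(vals) / len(vals) for name, vals in buckets.items()}


# Invented daily averages on the 1-to-9 scale, for illustration only.
daily = [
    (date(2015, 2, 3), 5.9),   # a Tuesday
    (date(2015, 2, 7), 6.1),   # a Saturday
    (date(2015, 2, 10), 5.7),  # a Tuesday
    (date(2015, 2, 14), 6.3),  # a Saturday
]

weekly = mean_by_weekday(daily)
happiest = max(weekly, key=weekly.get)
print(weekly, happiest)
```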
With those consistent trends, the study confirmed the 1969 “Pollyanna hypothesis,” which held that most human communication tends toward the positive.
Among cities, Boulder, Colo., rated highest for positive word usage, while Racine, Wis., was the most negative. In Racine, Dodds says, “There is a lot more swearing, and ‘don’t,’ ‘never,’ ‘no,’ ‘nobody,’ and less ‘haha’ (that’s important on Twitter) and less ‘happy,’ ‘best,’ and ‘awesome.’”
We returned to our attempt to summarize the study, asking, “Are you saying people are happy based on the words they choose?” No, says Dodds. “We are not telling you what people are thinking inside their heads. We are telling you how people react to the words people use.”
– David J. Tenenbaum
Kevin Barrett, project assistant; Terry Devitt, editor; S.V. Medaris, designer/illustrator; David J. Tenenbaum, feature writer
Bibliography
- Human language reveals a universal positivity bias, Peter Sheridan Dodds et al., Proceedings of the National Academy of Sciences, Feb. 9, 2015.
- Eric Idle – “Always Look On The Bright Side Of Life”
- Big data. Big obstacles.
- How Big Data can bring medical benefits, without compromising privacy rights.