Researchers have theorized for some time that our languages are skewed towards happy words, with some more skewed than others. A new study of 10 different languages confirmed this idea, and also found that Spanish is the happiest language, while Chinese is the most balanced.

Image: Dodds et al., PNAS

It all started back in 1969, when two psychologists at the University of Illinois came up with the Pollyanna Hypothesis – that humans universally tend to skew their use of language towards happy words, rather than negative ones. Basically, we want to be happier, so we use happier words.

“Put even more simply,” the pair wrote, “humans tend to look on (and talk about) the bright side of life.” But this was just a hypothesis: they had little evidence to back it up, and it has inspired heated debate ever since.

In this study, researchers from the University of Vermont (UVM) and the MITRE Corporation, a not-for-profit research and development organisation, analyzed words that people actually use, professionally and in their personal lives, both online and offline.

“We collected roughly 100 billion words written in tweets,” team member Chris Danforth, a UVM mathematician, said in a press release.

Arabic movie subtitles, Korean tweets, Russian novels, Chinese websites, English lyrics, and even the war-torn pages of the New York Times – they looked at it all, and the results were consistent.

“We looked at ten languages,” says UVM mathematician Peter Dodds, who co-led the study, “and in every source we looked at, people use more positive words than negative ones.” They analyzed English, Spanish, French, German, Brazilian Portuguese, Korean, Chinese (simplified), Russian, Indonesian and Arabic.

Credit: Dodds et al., PNAS

The team selected the 10,000 most frequently used words in each language, then hired fifty native speakers to rate how happy or sad each word is, on a scale of 1 to 9. For example, “laughter” got an average rating of 8.50, “food” got 7.44, “truck” 5.48, “the” 4.98, “greed” 3.06 and “terrorist” 1.30. In this way, they noted which words are happy, and which ones are sad.
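To make the averaging step concrete, here is a minimal Python sketch of how individual ratings could be turned into one score per word. The function name `average_ratings` and the five ratings per word are made up for illustration (the study used fifty raters), so this is a sketch of the idea rather than the researchers' actual code.

```python
from statistics import mean


def average_ratings(ratings_by_word):
    """Return {word: mean rating}, rounded to two decimals."""
    return {word: round(mean(r), 2) for word, r in ratings_by_word.items()}


# Hypothetical ratings from five raters per word (not data from the study).
sample = {
    "laughter": [9, 8, 9, 8, 9],
    "the": [5, 5, 5, 5, 5],
    "terrorist": [1, 2, 1, 1, 2],
}

scores = average_ratings(sample)

# List the words from happiest to saddest.
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word}: {score:.2f}")
```

A word that raters consistently place near the top of the scale ends up with a high average, while a neutral word such as “the” lands near the middle.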

For the next step of the study, they used Google to see how often these words appear online. Publishing in the Proceedings of the National Academy of Sciences, the team reports that Spanish-language websites had the highest use of happy words.

John Bohannon explains the results, which you can see below, at Science:

“Graphs of the data show that the Pollyanna principle is indeed part of language itself. If there were no bias, then the median emotional values of the words (red lines) would fall in the middle of the emotional scale. But instead, the median emotional resonance of words falls well into positive territory for every corpus from every language tested.”
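The test Bohannon describes can be sketched in a few lines of Python: take the median of the word scores and compare it with the midpoint of the rating scale. The function name `median_offset`, the toy scores, and the scale bounds of 1 and 9 passed in below are all assumptions for illustration, not values taken from the published data set.

```python
from statistics import median


def median_offset(scores, scale_min, scale_max):
    """Return the median score minus the midpoint of the rating scale."""
    midpoint = (scale_min + scale_max) / 2
    return median(scores) - midpoint


# Hypothetical average scores for seven words (not data from the study).
toy_scores = [8.5, 7.4, 6.6, 6.1, 5.0, 3.1, 1.3]

# Scale bounds are parameters; 1 and 9 are assumed here.
offset = median_offset(toy_scores, scale_min=1, scale_max=9)

print("positive bias" if offset > 0 else "no positive bias")
```

An offset above zero means the typical word sits on the happy side of the scale, which is the pattern the quote describes for every corpus tested.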

The study is part of a bigger project to build a kind of happiness meter, a hedonometer, that can track the ‘happiness signals’ in English-language Twitter posts in near real time. For now, the hedonometer only works for English tweets, but the researchers hope to expand it to other languages as well.

You can download the entire data set here.