A new study finds that Facebook posts are better than demographic information when it comes to predicting a number of mental health conditions, as well as diabetes. This suggests that one day, our social media history might play an important role in the doctor’s office.

You can tell a lot about a person from their social media history, but medical information isn’t usually on that list. A new study suggests it could be. In the research, the team analyzed the entire Facebook post history of around 1,000 patients (who had given their consent for this study), building three analysis models: one that looked at post language, one that looked at demographics, and one that combined the two.

They then looked at 21 different medical conditions, assessing whether Facebook history could be used to predict each one. All 21 could be. In fact, 10 of them were predictable from post history alone, without even looking at the demographic information. It’s still early, but the results are impressive.

“This work is early, but our hope is that the insights gleaned from these posts could be used to better inform patients and providers about their health,” said lead author Raina Merchant, MD, MS, the director of Penn Medicine’s Center for Digital Health and an associate professor of Emergency Medicine. “As social media posts are often about someone’s lifestyle choices and experiences or how they’re feeling, this information could provide additional information about disease management and exacerbation.”

The language we use carries many unconscious biases which can be linked to our behaviors and habits. In turn, these behaviors can also be indicative of other underlying problems. Some connections were clear: people who tended to use words like “bottle” or “drink” more often were more likely to suffer from alcohol abuse. Others, however, were much less intuitive.

For instance, the people who used religious language most often (with words such as “God” or “pray”) were 15 times more likely to have diabetes than those who used these terms the least. When fed into the models, this information could be used and extrapolated to predict serious conditions.

“Our digital language captures powerful aspects of our lives that are likely quite different from what is captured through traditional medical data,” said study author Andrew Schwartz, PhD, visiting assistant professor at Penn in Computer and Information Science, and an assistant professor of Computer Science at Stony Brook University. “Many studies have now shown a link between language patterns and specific disease, such as language predictive of depression or language that gives insights into whether someone is living with cancer. However, by looking across many medical conditions, we get a view of how conditions relate to each other, which can enable new applications of AI for medicine.”

Because the content we publish on Facebook is not written in a medical context, it can contain information that usually goes unmentioned in a clinical setting, including potential markers for specific diseases. For depression, words like “pain,” “crying,” or “tears” were good indicators, as were less obvious words such as “stomach,” “head,” or “hurt.”

It’s not the first time this idea has been suggested. Previous research has found that Facebook history can be indicative of mental health conditions such as depression. The fact that this approach can be extended to conditions such as diabetes is even more encouraging.

Now, the team is carrying out a larger trial in which they will ask participants to share their social media history with their doctor, to see how this data can best be used in a practical setting. The one big caveat to this study is its sample: not only was it fairly small, but it was also largely female (76%) and African American (71%), so it is not representative of the entire population.

Journal Reference: Merchant et al. Evaluating the predictability of medical conditions from social media posts. PLOS ONE. DOI:10.1371/journal.pone.0215476