Data Science Techniques Help Evaluate COVID's Impact on Mental Health
In case of another pandemic, authorities might only have a 28-day window to connect vulnerable populations to mental health providers before it’s too late to prevent long-term concerns, according to new research assisted by a data science expert at New Jersey Institute of Technology.
The research employed five types of natural-language processing and statistical methods to evaluate more than 350,000 posts in the Reddit subcommunities r/Anxiety and r/Depression from 2019 to 2022, a period covering the dire worldwide news about COVID and the pandemic's eventual waning. NJIT Associate Professor Hai Phan was a co-principal investigator along with four researchers from Kent State University, located in northeast Ohio.
“Understanding the impact of a pandemic on mental wellness is critical. So for instance, we can prioritize the support for sub-populations … that could not cope with the anxiety level or its distraction level, leading them towards more harmful thoughts. Our society's resources are limited. So we have to use them in the most effective way in order to help the whole society, not leave anyone behind,” Phan noted. Whatever the nature of a future global event, he said, “If policy makers look into our data analysis tools, and derive actionable plans, that could help with saving our resources and ease the burden on vulnerable populations.”
As explained in the abstract to their paper, Investigating COVID-19’s Impact on Mental Health: Trend and Thematic Analysis of Reddit Users’ Discourse, “Topic modeling and Word2Vec embedding models were used to identify key terms associated with the targeted themes within the data set. A range of trend and thematic analysis techniques, including time-to-event analysis, heat map analysis, factor analysis, regression analysis and k-means clustering analysis, were used to analyze the data.”
In addition to the 28-day timeframe, they cited four other conclusions. A theme-trend analysis revealed that posters discussed economic stress, social stress, suicide and substance use. Factor analysis highlighted economic concerns and social factors. Regression analysis showed that economic stress consistently had the strongest association with the suicide theme, whereas the substance theme had a notable association in both data sets. And k-means clustering analysis showed that in r/Depression, the number of posts related to the depression, anxiety and medication cluster decreased after 2020, whereas the social relationships and friendship cluster showed a steady decrease.
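For readers curious what a theme-trend count can look like in practice, here is a minimal Python sketch. It is not the researchers' code: the keyword lists, the sample posts and the function names are all invented for illustration, and it covers only three of the themes named above. The study derived its key terms with topic modeling and Word2Vec, which this sketch replaces with simple keyword matching.

```python
import re
from collections import Counter, defaultdict
from datetime import date

# Hypothetical keyword lists (an assumption for this sketch). The study
# identified its terms with topic modeling and Word2Vec instead.
THEME_KEYWORDS = {
    "economic stress": {"job", "rent", "unemployed", "debt"},
    "social stress": {"lonely", "isolated", "friends"},
    "substance use": {"alcohol", "drinking", "pills"},
}


def themes_in(text):
    """Return the set of themes whose keywords appear in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {theme for theme, kws in THEME_KEYWORDS.items() if words & kws}


def monthly_theme_counts(posts):
    """Count posts per (year, month) for each theme.

    posts: iterable of (date, text) pairs. A post is counted at most
    once per theme, however many keywords it contains.
    """
    counts = defaultdict(Counter)
    for posted_on, text in posts:
        month = (posted_on.year, posted_on.month)
        for theme in themes_in(text):
            counts[month][theme] += 1
    return counts


# Invented example posts, not taken from the Reddit data set.
SAMPLE_POSTS = [
    (date(2020, 3, 14), "Lost my job and cannot pay rent"),
    (date(2020, 3, 20), "Feeling lonely and isolated at home"),
    (date(2020, 4, 2), "I am unemployed and drinking more"),
]

if __name__ == "__main__":
    for month, counter in sorted(monthly_theme_counts(SAMPLE_POSTS).items()):
        print(month, dict(counter))
```

Plotting such monthly counts over several years is one simple way to see whether a theme rises or falls around a given event, though the published analysis used considerably more sophisticated methods.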
Phan said he personally experienced certain effects from working at home during the pandemic, such as boredom from not being able to go out, and concern about lacking in-person interaction with his graduate students. Phan had COVID twice, each bout lasting about a month, while his wife and children also had the disease at different times. He said the idea for the research came from the team's earlier work, which explored drug addiction.
He pointed to another paper, Distinguishing the Effect of Time Spent at Home During the COVID-19 Pandemic on the Mental Health of Urban and Suburban College Students Using Cell Phone Geolocation, by graduate student Pelin Ayranci, Phan himself, NJIT Associate Professor of Entrepreneurship Cesar Bandera and others, which summarized research finding that 51% of students experienced mild or severe depression.
In that paper, which also spanned NJIT and Kent State, “We found that social isolation at home had a negative relation with depression in students living in suburban areas, but a positive relation with depression on students living in urban areas.”