01 June 2023

A neural network can tell whether a person is depressed from their speech

Researchers from the Jinhua Institute for Advanced Study and Harbin University of Science and Technology recently developed a deep learning algorithm that can detect depression from human speech.

The model, presented in an article in Mobile Networks and Applications, was trained to recognize emotions in human speech by analyzing various relevant features.

"A multi-information model of a collaborative decision-making algorithm is created through emotion recognition," Han Tian, Zhang Zhu and Xu Jing wrote in their paper. - The model is used to analyze representative data about subjects and to help diagnose depression in subjects."

Tian and his colleagues trained their model on the DAIC-WOZ dataset, a collection of audio recordings and three-dimensional facial expressions of patients diagnosed with depressive disorder and of people without depression. These recordings were collected during interviews conducted by a virtual agent that asked various questions about the interviewees' mood and life.

"Based on a study of the speech characteristics of people with depressive disorder, this paper conducts an in-depth study of the diagnosis of depression using speech data from the DAIC-WOZ dataset," Tian, Zhu, and Jian wrote in their study. - First, the speech information is pre-processed, including speech signal preemphasis, framing, endpoint detection, noise reduction, etc. Second, OpenSmile is used to extract features of speech signals, and speech features that may reflect features are studied and analyzed in depth."

To extract important features from the voice recordings, the team's model uses OpenSmile (open-source Speech and Music Interpretation by Large-space Extraction), a toolkit widely used by computer scientists to extract features from audio clips and classify them.
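
For readers who want to reproduce this step, OpenSmile ships with an official Python wrapper (the opensmile package). The sketch below extracts one feature vector per recording; the ComParE_2016 feature set and the file name are illustrative assumptions, since the article does not say which configuration the authors used.

```python
import opensmile  # pip install opensmile

# One vector of statistical functionals per audio file.
# ComParE_2016 (~6,373 features) is an assumption; the article does
# not name the OpenSmile configuration used in the study.
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,
    feature_level=opensmile.FeatureLevel.Functionals,
)

features = smile.process_file("interview_clip.wav")  # placeholder path
print(features.shape)  # (1, n_features) pandas DataFrame
```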

The researchers used this tool to extract individual speech features, and combinations of features, that commonly appear in the speech of patients diagnosed with depression. They then applied principal component analysis (PCA) to reduce the dimensionality of the extracted feature set.
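
A minimal sketch of that reduction step, using scikit-learn's PCA on placeholder data; the 95% explained-variance threshold is an assumption, as the article does not report how many components the authors kept.

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder feature matrix: one OpenSmile vector per recording.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6373))

# Keep enough components to explain 95% of the variance (an assumed
# threshold; the paper's actual choice is not given in the article).
pca = PCA(n_components=0.95, svd_solver="full")
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)  # (200, k), with k set by the variance threshold
```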

Tian, Zhu, and Jing evaluated their model in a series of tests, assessing its ability to distinguish depressed from non-depressed people by their voice recordings. Their model produced remarkable results, detecting depression with 87% accuracy in male subjects and 87.5% accuracy in female subjects.
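
The article does not name the final classifier behind these numbers, so the sketch below is purely illustrative: it trains a stand-in logistic regression on synthetic "reduced" features and reports accuracy separately for male and female subjects, mirroring how the per-sex figures above would be computed.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the PCA-reduced speech features and labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))             # reduced feature vectors
y = rng.integers(0, 2, size=200)           # 1 = depressed, 0 = not
sex = rng.choice(["male", "female"], 200)  # subject metadata

X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(
    X, y, sex, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)

# Accuracy per sex, as quoted in the article's results.
for group in ("male", "female"):
    mask = s_te == group
    print(group, accuracy_score(y_te[mask], pred[mask]))
```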

In the future, the deep learning algorithm developed by this research team may become an additional supportive tool for psychiatrists and physicians, along with other well-established diagnostic tools. In addition, this research could inspire the development of similar AI tools for detecting signs of mental disorders from speech.