Clinical state tracking in serious mental illness through computational analysis of speech

Authors: Armen C. Arevian aff001; Daniel Bone aff002; Nikolaos Malandrakis aff002; Victor R. Martinez aff002; Kenneth B. Wells aff001; David J. Miklowitz aff001; Shrikanth Narayanan aff002
Author affiliations: Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, United States of America aff001; Signal Analysis and Interpretation Lab, University of Southern California, Los Angeles, CA, United States of America aff002; RAND Corporation, Santa Monica, CA, United States of America aff003
Published in: PLoS ONE 15(1)
Category: Research Article


Individuals with serious mental illness experience changes in their clinical states over time that are difficult to assess and that result in increased disease burden and care utilization. It is not known whether features derived from speech can serve as a transdiagnostic marker of these clinical states. This study evaluates the feasibility of collecting speech samples from people with serious mental illness and explores their potential utility for tracking changes in clinical state over time. Patients (n = 47) with diagnoses of bipolar disorder, major depressive disorder, schizophrenia or schizoaffective disorder were recruited from a community-based mental health clinic. Patients used an interactive voice response system for at least 4 months to provide speech samples. Clinic providers (n = 13) reviewed responses and provided global assessment ratings. We computed features of speech and used machine learning to create models of outcome measures trained using either population data or an individual’s own data over time. The system was feasible to use, recording 1101 phone calls and 117 hours of speech. Most (92%) of the patients agreed that it was easy to use. The individually trained models demonstrated the highest correlation with provider ratings (rho = 0.78, p < 0.001). Population-level models demonstrated statistically significant correlations with provider global assessment ratings (rho = 0.44, p < 0.001), future provider ratings (rho = 0.33, p < 0.05), the BASIS-24 summary score, depression subscore, and self-harm subscore (rho = 0.25, 0.25, and 0.28, respectively; p < 0.05), and the SF-12 mental health subscore (rho = 0.25, p < 0.05), but not with other BASIS-24 or SF-12 subscores. This study brings together longitudinal collection of objective behavioral markers and a transdiagnostic, personalized approach to tracking mental health clinical state in a community-based clinical setting.
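The results above are reported as rank correlations (rho) between model outputs and provider ratings. As a minimal sketch of how such a coefficient can be computed, assuming rho denotes Spearman's rank correlation, the following Python example ranks both series (averaging ranks for ties) and takes the Pearson correlation of the ranks. The function names and the sample values are hypothetical illustrations; they are not the study's code or data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one series is constant")
    return cov / (sx * sy)


# Hypothetical values for illustration only (not taken from the study).
provider_ratings = [3, 5, 4, 7, 6]
model_predictions = [2.9, 4.1, 4.4, 6.8, 6.0]
rho = spearman_rho(provider_ratings, model_predictions)
```

Because only ranks enter the calculation, the coefficient reflects whether the model orders calls the same way the provider does, not whether the predicted values match the ratings numerically.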

Keywords:

Acoustics – Bipolar disorder – Depression – Emotions – Language – Mental health and psychiatry – Semantics – Speech



Article published in the journal: 2020, Issue 1