Neuromathematical challenges to predicting psychosis onset in high-risk youths

NeuroMat research-team members have been working on quantifying semantics as a means of understanding behavior. In a recent publication, they tested automated speech analysis combined with machine learning to predict psychosis onset in youths at clinical high risk for psychosis. NeuroMat members Guillermo Cecchi, from the IBM T. J. Watson Research Center, Sidarta Ribeiro, from the Federal University of Rio Grande do Norte, Mauro Copelli, from the Federal University of Pernambuco, and colleagues showed that speech features predicted later psychosis development with 100% accuracy. NeuroMat is the São Paulo Research Foundation (FAPESP)’s Research, Innovation and Dissemination Center for Neuromathematics (RIDC NeuroMat) and was launched in 2013 to contribute to creating a mathematical framework for neuroscience.

“In psychiatry, there are no exams, and diagnoses depend overall on the opinion of health professionals and on highly subjective scales, which might lead to unreliable results and a tendency to treat psychiatric illnesses on a trial-and-error basis. What we are indicating with this research is that a simple, fairly cheap method could bring about a substantial, robust improvement to diagnosing mental illness,” said Sidarta Ribeiro, a NeuroMat associate investigator.

Interviews with 34 Anglophone youths at clinical high risk for psychosis were evaluated for semantic and syntactic features. According to the original paper, a combination of semantic coherence and syntactic measures was used to predict transition to psychosis. For the semantic analyses, the research group relied on latent semantic analysis (LSA), “a high-dimensional associative model that rests on the premise that word meaning is a function of the relationship of each word to every other word in the lexicon.” Coherence was then tracked along each interview: “semantic trajectories are represented by similarity among pairs of consecutive phrases, or pairs of phrases separated by an intervening phrase.” For the syntactic analyses, every word was labeled according to its grammatical function (part-of-speech tagging). Interviewees were assessed quarterly for up to two and a half years, from 2007 to 2012, thus generating a vast amount of data. The dataset was not designed specifically for the research on psychosis onset. It consists of non-structured interviews, a kind of conversation that was called “free speech.” Five out of the 34 youths transitioned to psychosis within the follow-up period.
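The idea of a semantic trajectory can be illustrated with a minimal Python sketch. It is not the paper's implementation: the three-dimensional word vectors below are invented toy values standing in for LSA embeddings, representing a phrase as the average of its word vectors is an assumption, and the names `TOY_VECTORS`, `phrase_vector` and `coherence` are hypothetical.

```python
import math

# Toy word vectors: hypothetical stand-ins for LSA embeddings (assumption).
TOY_VECTORS = {
    "dog": [1.0, 0.0, 0.0],
    "barks": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.2, 0.0],
    "meows": [0.7, 0.3, 0.0],
    "taxes": [0.0, 0.0, 1.0],
    "rise": [0.0, 0.1, 0.9],
}


def phrase_vector(words, vectors):
    """Average the vectors of the known words in a phrase (assumed representation)."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coherence(phrases, vectors, gap=1):
    """Similarity between each phrase and the phrase `gap` positions later.

    gap=1 compares consecutive phrases; gap=2 compares phrases
    separated by one intervening phrase.
    """
    vecs = [phrase_vector(p, vectors) for p in phrases]
    return [
        cosine(a, b)
        for a, b in zip(vecs, vecs[gap:])
        if a is not None and b is not None
    ]


phrases = [["dog", "barks"], ["cat", "meows"], ["taxes", "rise"]]
first_order = coherence(phrases, TOY_VECTORS, gap=1)
second_order = coherence(phrases, TOY_VECTORS, gap=2)
min_coherence = min(first_order)  # the weakest link between consecutive phrases
```

In this toy interview, the jump from animals to taxes produces the lowest similarity between consecutive phrases, which is the kind of abrupt semantic break a minimum-coherence measure is meant to capture.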

The challenges that pertain directly to Neuromathematics are at least fourfold. Firstly, the prediction of the onset of psychosis, and of psychiatric illnesses in general, may be sharpened with more efficient tools for identifying the speech patterns that predict these illnesses at earlier stages. Ribeiro said that this depends on designing new experiments that would bring a longitudinal perspective on knowledge acquisition into the analysis and would therefore require the mathematical capacity to take time-evolving dynamics into consideration.

Secondly, in order to assess deviant behavioral or speech patterns, one would need a robust sense of “normal” patterns and significance thresholds. In the paper, the discrimination between individuals who transitioned to psychosis and those who did not relied on three dimensions: “The frequency of use of determiners (‘that’, ‘what’, ‘whatever’, ‘which’, and ‘whichever’) normalized by phrase length; the minimum semantic coherence between two consecutive phrases within the interview; and the maximum phrase length.” According to Ribeiro, one could envision national and international datasets of recorded and coded speech that could potentially be made available from longitudinal standardized tests. “This could be seen as a cheap way of identifying children and youths that might be at risk for psychological and psychiatric illnesses,” said Ribeiro. Yet, according to the NeuroMat associate investigator, there is a fundamental mathematical challenge involved: given the many dimensions that the model takes into consideration, how does one define what the normal results are, and how many individuals need to be assessed in order to reach an expected result? “This is a key question, because if we do not have a valid mechanism to establish what is a normal and a deviant outcome we could be unable to provide efficient treatment.”
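As a rough illustration, the three dimensions quoted from the paper can be computed along the following lines. This is a sketch under stated assumptions, not the authors' code: phrases are taken as already tokenized word lists, “normalized by phrase length” is read here as the per-phrase share of determiners averaged over the interview, the coherence values are supplied from a separate semantic analysis, and the function name `speech_features` is hypothetical.

```python
# Determiners listed in the quoted passage from the paper.
DETERMINERS = {"that", "what", "whatever", "which", "whichever"}


def speech_features(phrases, coherence_values):
    """Compute three interview-level features from tokenized phrases.

    phrases: list of phrases, each a list of words.
    coherence_values: similarities between consecutive phrases,
        computed elsewhere (assumed input).
    """
    non_empty = [p for p in phrases if p]
    if not non_empty or not coherence_values:
        raise ValueError("need at least one phrase and one coherence value")

    # Share of determiners in each phrase, averaged over phrases
    # (one possible reading of "normalized by phrase length").
    rates = [
        sum(1 for w in p if w.lower() in DETERMINERS) / len(p)
        for p in non_empty
    ]
    return {
        "determiner_rate": sum(rates) / len(rates),
        "min_coherence": min(coherence_values),
        "max_phrase_length": max(len(p) for p in non_empty),
    }


# Invented example data, for illustration only.
example_phrases = [
    ["I", "saw", "that", "dog"],
    ["which", "one"],
    ["the", "big", "brown", "dog", "ran", "away"],
]
example_coherence = [0.8, 0.3]
features = speech_features(example_phrases, example_coherence)
```

Each interview is thus reduced to a point in a three-dimensional space, which is where the question of what counts as a “normal” region, and how many individuals are needed to estimate it, arises.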

Thirdly, research could be furthered on the interface between semantics, structure and prosody. According to Ribeiro, a more robust model should especially include prosody, which at this point is still largely absent from what has been done. Prosodic elements include intonation, tone, stress, and rhythm.

Fourthly, the data could also help improve medical protocols. The data analysis in the study on psychosis onset did not rely solely on interviewees’ responses but on the whole interaction between trained health professionals and patients, and such interaction could lead to new directions of research. According to Ribeiro, this could contribute to designing new, efficient protocols on the kind of interactions that make it easier to reach a valid diagnosis. “In the future, we could reach the point in which we know that only four well-designed questions are necessary to estimate the likelihood of a clinical high-risk person to transition to psychosis.”

The former director of the NIMH, Thomas Insel, alluded to Ribeiro and colleagues’ research on predicting psychosis as a breakthrough in Computational Psychiatry. In a post on the NIMH blog, from August 31, 2015, Insel wrote: “The approach, developed by Guillermo Cecchi at IBM, maps semantic coherence and speech complexity as a window into the earliest stages of disorganized thought. While analysis of previous clinical features have yielded, at best, 80 percent prediction, this automated analysis of unstructured speech was reported to be 100 percent accurate for identifying who would convert to psychosis during the follow up period. This is a small study (34 participants with 5 developing psychosis), but it serves as a preview of what we might see as the power of technology is applied to provide objective measures of behavior and cognition.” Dr. Insel recently joined Google Life Sciences, “which seeks to develop technologies for early detection and treatment of health problems,” according to a piece in The New York Times (09/15/2015).

Link to paper

Guillermo A. Cecchi, Sidarta Ribeiro, Mauro Copelli et al. Automated analysis of free speech predicts psychosis onset in high-risk youths, npj Schizophrenia. 2015; 1.

This piece is part of NeuroMat's Newsletter #20. Read more here
