New Trend in Medical AI: Diagnosing Diseases From Speech

But challenges remain before it is ready for widespread adoption

Andrew Brosnan

November 21, 2022

The latest FDA update of approved AI/ML-enabled medical devices showed a 52% increase, bringing the U.S. total to 521. While most of these AI-related approvals are in radiology, doctors and medical researchers across the health care sector are looking to expand the use of AI to screen, diagnose, and monitor patients for a wide range of medical conditions.

One emerging trend is using speech or vocal analysis to detect so-called vocal biomarkers, which can help identify underlying medical conditions in patients. The FDA categorizes AI in this context as Software as a Medical Device (SaMD), which provides a regulatory pathway to market. While the FDA has yet to approve AI in this field of analysis, at least one company, Aural Analytics, has registered a ‘device’ with the FDA.

Using vocal biomarkers (measurable signs of a patient’s medical state found in speech, breathing and coughs) in diagnosis is not new. Doctors have been doing it for centuries. However, leveraging AI to identify and analyze digital vocal biomarkers at a depth and scale beyond human ability is an emerging and promising trend in health care.

Digital vocal biomarkers, those captured and measured by digital devices such as smartphones or smart speakers, offer a low-cost, non-invasive and easier way to screen and monitor patients than more traditional methods. Because voice-enabled devices are already widespread, this approach can reduce the overall cost to health care payors by avoiding the materials and processing costs of more traditional, invasive measures.

Speech-based biomarkers, coupled with AI and ML, also offer the potential to support and further clinical research.

Winterlight Labs of Toronto is one example. Its speech assessment technology, which uses natural language processing (NLP) models, is being used in a dozen clinical trials spanning Alzheimer’s disease, depression, dementia, bipolar disorder, and schizophrenia.

These technologies can identify novel vocal biomarkers and monitor the progression of voice-based signals in clinical trials. As in other clinical applications, AI can detect finer variations in frequency, across a broader spectrum, than the human ear can discern.

Also, AI coupled with digital devices can be automated and scaled more easily than traditional methods while tracking hundreds of voice-based ‘features’. In the context of clinical trials, it can also detect responses to drug candidates.

Meanwhile, many companies are choosing to deploy apps in the consumer health sector as an easier route to market in the short term versus the much longer process of FDA approval. Several companies already offer iOS and Android apps enabling consumers to take a greater degree of responsibility for their own health and wellness.

For example, Sonde Health offers an app that lets consumers record their voice and mood daily to generate a mental health fitness score. Based on that score, the app offers contextual advice that consumers can act on to improve their mental health.

Big potential but challenges, concerns remain

While the technology offers tremendous potential in research and for consumer health and clinics, challenges to widespread adoption in the clinic remain. The current regulatory pathway to market remains unclear. While the FDA has issued guidelines for SaMD, there is currently no FDA-approved SaMD for vocal biomarker analysis.

Also, these are still early days in the development of AI-enabled speech models and associated algorithms, so false positives remain a possibility. In the hands of consumers, the risk of naïve self-diagnosis or misdiagnosis increases, because an app relies on speech alone, whereas doctors tend to weigh multiple inputs when making a diagnosis.

Furthermore, the potential knock-on effects of medical conditions flagged by consumer health apps are unclear. What sort of culpability will medical professionals be exposed to by the flagging of medical conditions by consumer apps? Will they be held responsible for not making a referral based on an app?

There is also the potential for medical services to be overwhelmed by the number of possible conditions flagged, given the abundance of voice-enabled devices and the ease of voice sampling. While there is value in a consumer health context in letting patients take a greater degree of control and responsibility for their wellness, further investigation is required before making the leap to the clinic.

Additionally, consumers and patients alike are becoming increasingly concerned about the use and sharing of their personal data. Voice data represents a particular stumbling block in that it is arguably easier to identify a patient from speech data than from other forms of data, and the potential for organizational overreach exists. Therefore, strict adherence to privacy and security standards is required to allay concerns over the potential misuse of patient data.

Technology needs to mature before widespread adoption

“These are indeed very promising as non-invasive tools for screening or triage, but in reality, the safety and performance of the algorithms still need to be confirmed,” said Guy Fagherazzi, group leader of the Deep Digital Phenotyping research unit at the Luxembourg Institute of Health, in an article published last fall in the prestigious Lancet medical journal.

With the ubiquity of voice-enabled devices, the ability to capture large voice-based datasets already exists, but the rules governing the sharing of identifiable data in a medical context need to be better defined, as does the method for validating the performance of such models.

Generally speaking, larger datasets tend to result in better-performing models and algorithms. But, as with other applications of clinical AI, the training datasets need to be representative of the populations where the models will be deployed. Curating the data and training clinical AI models also requires a deep understanding of the clinical area where they will be used, to ensure the recommendations they provide are valid and relevant.

The application of AI to speech analysis is a promising development and one that is garnering significant interest due to its low-cost, non-invasive diagnostic capability. While challenges and concerns remain, many of the obstacles will be overcome in due course as regulatory, legal, and governance bodies provide guidelines and frameworks for the responsible use of AI and treatment of patient data. Consensus on, and acceptance of, its use in the clinic will grow, provided its accuracy is backed by additional studies.
