Deep Learning Model Uses Speech to Estimate Blood Pressure

TLDR: New research proposes a non-invasive method to predict blood pressure (BP) using speech signals and a BERT-based deep learning model. This approach aims to overcome the limitations of traditional cuff-based methods by analyzing acoustic characteristics of speech. The model achieved high accuracy, with mean absolute errors of 1.36 mmHg for systolic BP and 1.24 mmHg for diastolic BP, demonstrating its potential for convenient, real-time health monitoring in telemedicine and remote care.

Monitoring blood pressure (BP) is crucial for maintaining cardiovascular health and preventing serious conditions like heart attacks and strokes. However, traditional cuff-based methods, while reliable, can be uncomfortable, inconvenient, and sometimes yield inconsistent results due to factors like ‘white-coat hypertension’ (elevated BP in a clinical setting) or ‘masked hypertension’ (normal BP in clinic, high at home).

A recent study by Kainat, supervised by Dr. Rabia Tehseen, introduces a non-invasive method for predicting arterial blood pressure (ABP) using only speech signals. The approach uses a BERT-based regression model to analyze the acoustic characteristics of spoken sentences and correlate them with blood pressure levels. The full research paper is titled “Cuffless Blood Pressure Prediction from Speech Sentences using Deep Learning Methods.”

The core idea is that subtle changes in our voice can reflect underlying physiological states, including heart rate, stress levels, and breathing patterns, which are all linked to blood pressure. By capturing these voice features, the model can provide real-time monitoring without the discomfort and limitations of conventional methods.

How the System Works

The research involved a dataset of speech recordings from 95 participants, aged 20 to 70, who were free from neurological or neuropsychiatric diseases. The recordings included vocal exercises like pronouncing vowels and a standard English sentence, “The weather is good today,” over a three-hour period, interspersed with traditional BP measurements using an Omron M3 device.

The methodology follows several key steps:

Data Pre-processing: Initial and final BP measurements were used to categorize participants as hypertensive or normal based on defined thresholds for systolic blood pressure (SBP) and diastolic blood pressure (DBP).
Audio Processing: All speech recordings underwent normalization to ensure consistent amplitude. Vowels were detected and extracted, and each audio signal was segmented into small 50-millisecond parts. Fast Fourier Transform was applied to these segments to extract frequency components, crucial for vowel detection, and a Gaussian window was used for smoothing.
Feature Extraction: A wide range of features was extracted from the audio signals, including spectral features (Mel-Frequency Cepstral Coefficients or MFCCs, spectral centroids, bandwidth, flatness), temporal features (zero-crossing rate, energy amplitude, time frames), and pitch. These features capture the underlying speech dynamics.
Feature Selection: To optimize the model and reduce computational complexity, a technique called ReliefF was used to identify and remove irrelevant features. MFCCs, skewness, kurtosis, and polygonal area weights were found to be most significant, while maximum and minimum amplitude values were deemed less important.
BERT-Based Regression: The heart of the proposed method is a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) model. BERT is a powerful language model known for its ability to understand the context of words in a sentence. In this study, numerical features from speech were converted into a text-like representation, tokenized, and fed into the BERT encoder. A regression head was then added to the model to predict SBP and DBP values, learning the relationship between the speech-derived features and blood pressure levels.
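
To make the audio steps above more concrete, here is a minimal Python sketch of amplitude normalization, 50-millisecond segmentation, Gaussian windowing, FFT-based feature extraction, and conversion of the numbers into a text-like string. It is an illustration under stated assumptions, not the study's code: the function names, the 16 kHz sample rate, the Gaussian width (sigma_ratio), the choice of three features, and the "name value" text format are all hypothetical, and a synthetic 440 Hz tone stands in for a real recording.

```python
import numpy as np

def normalize(signal):
    """Scale the waveform so its peak absolute amplitude is 1."""
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal

def frame_signal(signal, sample_rate, frame_ms=50):
    """Split the waveform into non-overlapping frames of frame_ms milliseconds."""
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    # Drop any trailing samples that do not fill a whole frame.
    return signal[: n_frames * frame_len].reshape(n_frames, frame_len)

def gaussian_window(length, sigma_ratio=0.25):
    """Gaussian taper; sigma_ratio is an assumed value, not taken from the study."""
    n = np.arange(length) - (length - 1) / 2
    sigma = sigma_ratio * length
    return np.exp(-0.5 * (n / sigma) ** 2)

def frame_features(frames, sample_rate):
    """Compute a few per-frame spectral and temporal features."""
    window = gaussian_window(frames.shape[1])
    # Magnitude spectrum of each windowed frame.
    spectrum = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    # Spectral centroid: magnitude-weighted mean frequency.
    centroid = (spectrum * freqs).sum(axis=1) / (spectrum.sum(axis=1) + 1e-12)
    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    # Energy: mean squared amplitude of the unwindowed frame.
    energy = np.mean(frames ** 2, axis=1)
    return {"centroid_hz": centroid, "zcr": zcr, "energy": energy}

def features_to_text(features):
    """Serialize frame-averaged features as a text-like string (hypothetical format)."""
    return " ".join(f"{name} {values.mean():.2f}" for name, values in features.items())

# Synthetic stand-in for a recording: one second of a 440 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440 * t)

frames = frame_signal(normalize(tone), sample_rate)
features = frame_features(frames, sample_rate)
text = features_to_text(features)
print(frames.shape)
print(text)
```

A string like the one printed here is the kind of input that could then be passed to a BERT tokenizer. The study's actual features (MFCCs, skewness, kurtosis and others) and its exact text encoding are not detailed in this article, so the sketch only shows the general shape of such a pipeline.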

Impressive Results

The BERT-based model performed strongly. It achieved a mean absolute error (MAE) of just 1.36 mmHg for systolic blood pressure (SBP) and 1.24 mmHg for diastolic blood pressure (DBP). The R² scores, which indicate how well the model explains the variance in BP, were 0.99 for SBP and 0.94 for DBP. These metrics point to accurate blood pressure prediction, even though the dataset is smaller than those used in some other studies.
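
For readers unfamiliar with these two metrics, the short Python sketch below shows how MAE and R² are computed from measured and predicted values. The four systolic readings are made-up illustrative numbers, not data from the study, and the function names are chosen here for clarity.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Illustrative values only (mmHg); these are not measurements from the study.
sbp_true = [118.0, 125.0, 140.0, 132.0]
sbp_pred = [119.0, 124.0, 142.0, 131.0]

mae = mean_absolute_error(sbp_true, sbp_pred)
r2 = r2_score(sbp_true, sbp_pred)
print(f"MAE = {mae:.2f} mmHg, R^2 = {r2:.3f}")
```

An MAE of 1.36 mmHg therefore means the predictions were off by 1.36 mmHg on average, while an R² close to 1 means the predictions track most of the variation between readings.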

The training and validation loss analysis showed effective learning and minimal overfitting, indicating that the model generalizes well to new, unseen data. This performance surpasses many traditional and even some deep learning models that rely on other physiological signals like electrocardiograms (ECG) and photoplethysmography (PPG), which often require physical sensors and calibration.

Implications for Healthcare

This research has significant implications for enhancing patient care and the proactive management of cardiovascular health. By providing a user-friendly and accurate method for blood pressure assessment, it paves the way for improved applications in telemedicine and remote health monitoring. Imagine checking your blood pressure simply by speaking into your smartphone or a smart speaker, without any physical contact or cumbersome equipment.

The cuffless approach offers increased patient comfort, accessibility, and the potential for continuous monitoring, which is vital for early detection and management of hypertension. While challenges remain, such as ensuring accuracy across diverse populations and integrating with existing medical systems, this study represents a significant step towards making blood pressure monitoring more convenient, affordable, and integrated into daily life.
