Deep Learning Model Uses Speech to Estimate Blood Pressure

TLDR: New research proposes a non-invasive method for predicting blood pressure (BP) from speech signals using a BERT-based deep learning model. The approach aims to overcome the limitations of traditional cuff-based methods by analyzing the acoustic characteristics of speech. The model reported mean absolute errors of 1.36 mmHg for systolic BP and 1.24 mmHg for diastolic BP, suggesting potential for convenient, real-time health monitoring in telemedicine and remote care.

Monitoring blood pressure (BP) is crucial for maintaining cardiovascular health and preventing serious conditions like heart attacks and strokes. However, traditional cuff-based methods, while reliable, can be uncomfortable, inconvenient, and sometimes yield inconsistent results due to factors like ‘white-coat hypertension’ (elevated BP in a clinical setting) or ‘masked hypertension’ (normal BP in clinic, high at home).

A recent study by Kainat, supervised by Dr. Rabia Tehseen, introduces a non-invasive method for predicting arterial blood pressure (ABP) using only speech signals. The approach uses a BERT-based regression model to analyze the acoustic characteristics of spoken sentences and relate them to blood pressure levels. The full research paper is titled "Cuffless Blood Pressure Prediction from Speech Sentences using Deep Learning Methods."

The core idea is that subtle changes in our voice can reflect underlying physiological states, including heart rate, stress levels, and breathing patterns, which are all linked to blood pressure. By capturing these voice features, the model can provide real-time monitoring without the discomfort and limitations of conventional methods.

How the System Works

The research involved a dataset of speech recordings from 95 participants, aged 20 to 70, who were free from neurological or neuropsychiatric diseases. The recordings included vocal exercises like pronouncing vowels and a standard English sentence, “The weather is good today,” over a three-hour period, interspersed with traditional BP measurements using an Omron M3 device.

The methodology follows several key steps:

  • Data Pre-processing: Initial and final BP measurements were used to categorize participants as hypertensive or normal based on defined thresholds for systolic blood pressure (SBP) and diastolic blood pressure (DBP).
  • Audio Processing: All speech recordings were normalized to ensure consistent amplitude. Vowels were detected and extracted, and each audio signal was split into 50-millisecond segments. A Fast Fourier Transform (FFT) was applied to each segment to extract the frequency components needed for vowel detection, and a Gaussian window was used for smoothing.
  • Feature Extraction: A wide range of features were extracted from the audio signals, including spectral features (Mel-Frequency Cepstral Coefficients or MFCCs, spectral centroids, bandwidth, flatness), temporal features (zero-crossing rate, energy amplitude, time frames), and pitch. These features capture the underlying speech dynamics.
  • Feature Selection: To optimize the model and reduce computational complexity, a technique called ReliefF was used to identify and remove irrelevant features. MFCCs, skewness, kurtosis, and polygonal area weights were found to be most significant, while maximum and minimum amplitude values were deemed less important.
  • BERT-Based Regression: The heart of the proposed method is a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) model. BERT is a language model known for its ability to capture the context of words in a sentence. In this study, numerical features from speech were converted into a text-like representation, tokenized, and fed into the BERT encoder. A regression head was then added to the model to predict SBP and DBP values, learning the relationship between the speech-derived features and blood pressure levels.
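
The audio-processing and feature steps above can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the Gaussian width (`sigma_ratio`), the non-overlapping framing, the three features shown, and the "name value" text format are all assumptions made for illustration, and a synthetic tone stands in for a real speech recording.

```python
import numpy as np

def normalize(signal):
    """Peak-normalize so the largest absolute sample is 1.0."""
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal

def frame_signal(signal, sample_rate, frame_ms=50):
    """Split a signal into non-overlapping frames of frame_ms milliseconds."""
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    return signal[: n_frames * frame_len].reshape(n_frames, frame_len)

def gaussian_window(length, sigma_ratio=0.4):
    """Gaussian taper; sigma_ratio is an illustrative choice, not from the paper."""
    n = np.arange(length) - (length - 1) / 2
    sigma = sigma_ratio * (length - 1) / 2
    return np.exp(-0.5 * (n / sigma) ** 2)

def frame_features(frames, sample_rate):
    """Compute a few per-frame spectral and temporal features."""
    window = gaussian_window(frames.shape[1])
    spectrum = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    # Spectral centroid: magnitude-weighted mean frequency of each frame.
    total = np.maximum(spectrum.sum(axis=1), 1e-12)
    centroid = (spectrum * freqs).sum(axis=1) / total
    # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    # Energy: mean squared amplitude of each (unwindowed) frame.
    energy = np.mean(frames ** 2, axis=1)
    return {"centroid_hz": centroid, "zcr": zcr, "energy": energy}

def features_to_text(features):
    """Average each feature over frames and serialize as 'name value' pairs.

    This text format is hypothetical; the article does not say how the
    numerical features were turned into tokens for BERT.
    """
    return " ".join(f"{name} {np.mean(values):.3f}" for name, values in features.items())

# Synthetic stand-in for a recording: one second of a 200 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = 0.3 * np.sin(2 * np.pi * 200 * t)

frames = frame_signal(normalize(tone), sample_rate)
features = frame_features(frames, sample_rate)
text = features_to_text(features)
print(frames.shape)
print(text)
```

A string like `text` could then be passed to a standard BERT tokenizer, with a regression head producing the SBP and DBP estimates; that stage is left out here because the article does not describe its configuration.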

Impressive Results

The BERT-based model performed strongly. It achieved a mean absolute error (MAE) of 1.36 mmHg for systolic blood pressure (SBP) and 1.24 mmHg for diastolic blood pressure (DBP). The R² scores, which indicate how much of the variance in BP the model explains, were 0.99 for SBP and 0.94 for DBP. These metrics point to accurate predictions, even though the dataset is relatively small compared with those used in some other studies.
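
For readers unfamiliar with these two metrics, the following short Python sketch shows how MAE and R² are conventionally computed. The blood pressure values are made up for illustration and are not data from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Illustrative (invented) systolic readings in mmHg: cuff vs. model output.
sbp_true = [118.0, 126.0, 135.0, 142.0]
sbp_pred = [119.0, 125.0, 137.0, 140.0]

mae = mean_absolute_error(sbp_true, sbp_pred)  # (1 + 1 + 2 + 2) / 4 = 1.5 mmHg
r2 = r2_score(sbp_true, sbp_pred)
print(f"MAE = {mae:.2f} mmHg, R^2 = {r2:.3f}")
```

An MAE near 1 mmHg means predictions are, on average, about 1 mmHg away from the cuff reading, while an R² close to 1 means the predictions track almost all of the variation between readings.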

According to the study, the training and validation loss curves showed effective learning with minimal overfitting, which suggests that the model generalizes to unseen data. The reported errors are lower than those of many traditional models, and of some deep learning models, that rely on other physiological signals such as electrocardiograms (ECG) and photoplethysmography (PPG), which often require physical sensors and calibration.

Implications for Healthcare

This research has significant implications for enhancing patient care and the proactive management of cardiovascular health. By providing a user-friendly and accurate method for blood pressure assessment, it paves the way for improved applications in telemedicine and remote health monitoring. Imagine checking your blood pressure simply by speaking into your smartphone or a smart speaker, without any physical contact or cumbersome equipment.

The cuffless approach offers increased patient comfort, accessibility, and the potential for continuous monitoring, which is vital for early detection and management of hypertension. While challenges remain, such as ensuring accuracy across diverse populations and integrating with existing medical systems, this study represents a significant step towards making blood pressure monitoring more convenient, affordable, and integrated into daily life.

Karthik Mehta
https://blogs.edgentiq.com
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
