Listening for Life: A New Audio Dataset for Drone Search and Rescue

TLDR: Researchers have introduced DRONEAUDIOSET, a new 23.5-hour audio dataset designed to improve drone-based search and rescue by enabling better detection of human sounds amidst loud drone noise. The dataset features diverse recordings across various drone types, microphone setups, and environments, allowing for the development and testing of advanced noise suppression and sound classification technologies. Initial evaluations show that while current methods struggle in extreme noise, the dataset provides crucial insights for designing more effective drone audio systems.

Unmanned Aerial Vehicles, commonly known as drones, have become indispensable tools for search and rescue (SAR) missions, especially in challenging environments like collapsed buildings or disaster zones. Traditionally, these missions rely heavily on visual data, but this approach often fails in conditions with poor visibility, such as smoke, fog, or cluttered spaces. This is where audio perception comes in, offering a complementary way to detect human presence through sounds like speech, screams, cries, or even non-verbal cues like banging and footsteps.

However, mounting microphones on drones presents a significant challenge: the drone’s own intense noise, known as ego-noise, combines with wind noise and can completely mask the faint sounds that indicate human presence. Existing drone audio datasets are often limited in diversity or purely synthetic, so they do not capture the complex acoustic interactions found in real-world conditions.

Introducing DRONEAUDIOSET

To address these critical limitations, a team of researchers from the National University of Singapore has introduced DRONEAUDIOSET, a comprehensive new audio dataset specifically designed for drone-based search and rescue. This dataset is a major step towards enabling the design and deployment of effective drone-audition systems.

DRONEAUDIOSET contains 23.5 hours of carefully annotated recordings. It covers a wide range of signal-to-noise ratios (SNRs), from extremely low (-57.2 dB) to moderately low (-2.5 dB). Because every value in that range is negative, the drone noise is stronger than the target sound throughout, reflecting the challenging conditions faced in real-world scenarios. The dataset incorporates various drone types, different throttle settings, multiple microphone configurations, and diverse indoor environments, providing a rich resource for researchers.

How the Data Was Collected

The researchers employed a systematically controlled experimental setup. The drone was securely mounted on a fixed aluminum frame, mimicking a hovering drone in a static position. This setup allowed for consistent and repeatable conditions while capturing a wide range of audio samples. The data collection varied several key parameters:

Drone Types: Two quadcopters of different sizes, a larger DJI F450 (Dlarge) and a smaller DJI F330 (Dsmall), were used to capture varied ego-noise profiles.
Throttle Settings: Recordings were made at both ‘low’ and ‘high’ throttle speeds to simulate different operational modes.
Microphone Configurations: Seventeen microphones were deployed in total, comprising two 8-channel circular arrays (Mup and Mdown, placed above and below the drone, respectively) and a central standalone microphone (Mcenter). These were positioned at distances of 25 cm and 50 cm from the drone.
Sound Sources: Three categories of sounds relevant to search and rescue were used: human vocal sounds (speech, screams, cries), human non-vocal sounds (door knocks, clapping, footsteps), and ambient non-human sounds (fire crackling, water dripping). These sounds were played through a speaker at different loudness levels (60 dB and 90 dB).
Environments: Data was collected in three different indoor rooms – a small conference room and two large multi-purpose halls – to introduce diversity in reverberation and multi-path effects.

The dataset includes recordings where drone noise and source sounds were captured simultaneously, as well as separate recordings of drone-only noise and source-only sounds. This allows for detailed analysis and computation of signal-to-noise ratios.
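Because the source and the drone noise are also recorded separately, an SNR for a given clip can be estimated directly from the average power of the two signals. The sketch below is a minimal illustration of that idea, not the authors' procedure: the function name snr_db, the 16 kHz sample rate, and the synthetic sine waves standing in for real recordings are all assumptions made for the example.

```python
import numpy as np

def snr_db(source_only: np.ndarray, noise_only: np.ndarray) -> float:
    """Estimate SNR in dB from separately recorded source and noise clips."""
    # Mean power of each clip (float64 avoids overflow on integer PCM data).
    signal_power = np.mean(source_only.astype(np.float64) ** 2)
    noise_power = np.mean(noise_only.astype(np.float64) ** 2)
    return 10.0 * np.log10(signal_power / noise_power)

# Synthetic stand-ins (assumed): a quiet 440 Hz "source" and a loud 200 Hz "drone".
fs = 16_000                                  # assumed sample rate in Hz
t = np.arange(fs) / fs                       # one second of samples
source = 0.01 * np.sin(2 * np.pi * 440 * t)
drone = 1.00 * np.sin(2 * np.pi * 200 * t)

estimated = snr_db(source, drone)
print(f"{estimated:.1f} dB")                 # amplitude ratio of 0.01 -> -40.0 dB
```

By the same formula, the dataset's reported extreme of -57.2 dB corresponds to drone noise carrying more than 500,000 times the power of the target sound.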

Key Findings and Challenges

The research paper also benchmarks state-of-the-art noise suppression and audio classification models using DRONEAUDIOSET. The evaluations revealed significant insights:

Noise Suppression: Neural network-based noise suppression methods generally outperformed traditional techniques, especially in extremely noisy conditions (below -20 dB SNR). However, all methods struggled when the SNR dropped below -30 dB, highlighting the need for more advanced solutions. Human vocal sounds showed the most improvement after noise suppression, while non-vocal human sounds and non-human ambient sounds remained challenging.
Sound Classification: After noise suppression, human vocal sounds were classified with much higher accuracy than non-vocal human sounds and non-human ambient sounds. Non-vocal and non-human sounds were often misclassified as silence, indicating that better noise suppression for these sound types is crucial for reliable detection.
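To make the "traditional techniques" category concrete, here is a minimal sketch of spectral subtraction, a classic non-neural approach that estimates the noise spectrum from a noise-only clip and subtracts it from each frame of the noisy signal. The article does not say which traditional methods were benchmarked, so this is an illustrative stand-in rather than a baseline from the paper. The function names, frame sizes, spectral floor, and white-noise test signals are all assumptions.

```python
import numpy as np

def stft(x, frame_len, hop, window):
    """Windowed short-time spectra, one row per frame."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return np.fft.rfft(x[idx] * window, axis=1)

def spectral_subtract(noisy, noise_only, frame_len=512, hop=256, floor=0.05):
    """Subtract an average noise magnitude spectrum from each noisy frame."""
    window = np.hanning(frame_len)
    # Average magnitude spectrum of the noise-only clip.
    noise_mag = np.abs(stft(noise_only, frame_len, hop, window)).mean(axis=0)

    # Zero-pad so every real sample is covered by two overlapping frames.
    padded = np.pad(noisy.astype(np.float64), frame_len)
    spec = stft(padded, frame_len, hop, window)
    mag = np.abs(spec)
    # Keep a small fraction of the original magnitude as a spectral floor.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    # Reuse the noisy phase; only the magnitude is modified.
    frames = np.fft.irfft(clean_mag * np.exp(1j * np.angle(spec)),
                          n=frame_len, axis=1)

    # Weighted overlap-add, normalized by the summed squared window.
    out = np.zeros(len(padded))
    norm = np.zeros(len(padded))
    for i, frame in enumerate(frames):
        start = i * hop
        out[start:start + frame_len] += frame * window
        norm[start:start + frame_len] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[frame_len:frame_len + len(noisy)]

# Assumed test signals: white noise stands in for drone-only recordings.
rng = np.random.default_rng(0)
drone_noise = rng.standard_normal(16_000)   # used to estimate the noise spectrum
noisy = rng.standard_normal(16_000)         # a separate clip containing only noise
cleaned = spectral_subtract(noisy, drone_noise)
# A ratio below 1 means the residual noise power dropped.
print(np.mean(cleaned ** 2) / np.mean(noisy ** 2))
```

A method like this relies on the noise spectrum being roughly stationary and on the target sound standing out in at least some frequency bins, which is a plausible reason simple approaches struggle at the extreme SNRs described above.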

Designing Better Drone Audio Systems

Based on the empirical analysis, the researchers derived several actionable recommendations for designing more effective drone-audition systems:

Microphone Placement: Microphones placed above the drone generally performed better than those below, as they were less exposed to direct wind noise from the propellers. Increasing the distance between the microphone and the drone also improved performance.
Microphone Arrays: While multi-channel microphone arrays offer advantages like beamforming (which can help focus on sounds from a specific direction), they also demand higher processing power. System designers must balance these trade-offs with mission requirements.
Drone Throttle Adjustments: Operating the drone at lower throttle levels significantly improved acoustic performance. This suggests that drones could incorporate adaptive throttle reduction strategies during critical listening periods.
Drone Size: Smaller drones generated less ego-noise and thus achieved better acoustic performance. However, larger drones can carry more advanced recording equipment, presenting a trade-off between payload capacity and acoustic clarity.
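As a rough illustration of the beamforming mentioned under Microphone Arrays, the sketch below implements frequency-domain delay-and-sum for an 8-microphone circular array. It is a textbook technique shown under simplifying assumptions (far-field plane wave, known source azimuth, free-field propagation), not the method evaluated in the paper. The 10 cm array radius, the 16 kHz sample rate, the 1 kHz test tone, and all function names are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature

def circular_array_positions(n_mics=8, radius=0.1):
    """(x, y) coordinates of mics evenly spaced on a circle (radius assumed)."""
    angles = 2 * np.pi * np.arange(n_mics) / n_mics
    return np.stack([radius * np.cos(angles), radius * np.sin(angles)], axis=1)

def arrival_delays(mic_xy, azimuth_rad):
    """Plane-wave arrival time at each mic relative to the array center."""
    toward_source = np.array([np.cos(azimuth_rad), np.sin(azimuth_rad)])
    # Mics closer to the source hear the wavefront earlier (negative delay).
    return -(mic_xy @ toward_source) / SPEED_OF_SOUND

def delay_and_sum(signals, mic_xy, azimuth_rad, fs):
    """Steer toward azimuth_rad. signals has shape (n_mics, n_samples)."""
    n = signals.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    delays = arrival_delays(mic_xy, azimuth_rad)
    spectra = np.fft.rfft(signals, axis=1)
    # Undo each channel's delay with a linear phase shift, then average.
    aligned = spectra * np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft(aligned.mean(axis=0), n=n)

# Simulate a 1 kHz plane wave arriving from 40 degrees (assumed scenario).
fs, n = 16_000, 1_600
t = np.arange(n) / fs
mic_xy = circular_array_positions()
true_az = np.deg2rad(40.0)
tau = arrival_delays(mic_xy, true_az)
signals = np.sin(2 * np.pi * 1000 * (t[None, :] - tau[:, None]))

on_target = delay_and_sum(signals, mic_xy, true_az, fs)
off_target = delay_and_sum(signals, mic_xy, true_az + np.pi / 2, fs)
# Steering at the true direction preserves the tone; steering away attenuates it.
print(np.mean(on_target ** 2), np.mean(off_target ** 2))
```

Each steering direction requires a transform and phase shift per channel, which hints at the extra processing cost that multi-channel arrays impose compared with a single microphone.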

Looking Ahead

DRONEAUDIOSET opens up new research opportunities for developing next-generation noise suppression algorithms and audio classification models that can operate effectively in extreme low-SNR drone environments. The dataset’s controlled variations can help train models to adapt to different noise profiles and could also be useful for sound localization and speech recovery in mobile robotics.

The positive societal impact of this work is significant, promising improved search and rescue capabilities in disaster scenarios where visual systems are inadequate. However, the researchers also acknowledge potential negative applications, such as unauthorized surveillance, and recommend safeguards like access controls for sensitive data and clear ethical usage guidelines.

While the dataset represents a significant leap, future work aims to add outdoor recordings, capture the micro-dynamics of real hovering drones, and cover other emergency-relevant auditory cues such as fire or structural collapse. For more in-depth information, see the full research paper, DRONEAUDIOSET: An Audio Dataset for Drone-based Search and Rescue.
