TLDR: The witheFlow system is a proof-of-concept AI tool that enhances live music performance by automatically adjusting audio effects based on a musician’s real-time emotional state, derived from biosignals (EEG, ECG) and audio analysis. It aims to create a more expressive and collaborative environment between human performers and AI, allowing musicians to focus on artistic expression while the system handles technical modulation. The system is lightweight, open-source, and prioritizes performer control and ethical data handling.
Music performance is a deeply human endeavor, intrinsically linked to a performer’s ability to convey and express emotion. While machines can produce or synthesize music, they lack the capacity for genuine emotional experience. This distinction makes music performance an ideal area to explore new forms of collaboration between humans and artificial intelligence.
A new system, dubbed witheFlow, is a proof-of-concept designed to enhance real-time music performance. It does so by automatically modulating audio effects based on features extracted from both the performer's biosignals and the audio itself. The system is lightweight enough to run locally on a laptop, is open-source, and requires only a compatible Digital Audio Workstation (DAW) and sensors.
Instead of positioning AI as a replacement for human creativity, witheFlow explores an alternative where AI serves as a sophisticated tool. It handles technical details and complex processing tasks, freeing performers to focus on aesthetic ideas and artistic decisions. This collaborative approach aims to create a more liberated environment for musicians, allowing them to concentrate on self-expression.
How witheFlow Works: A Glimpse into the System
The witheFlow system integrates lightweight machine learning models with traditional rule-based AI to process multiple input streams from performers in real-time. Its architecture comprises three main components:
- A biosignal-based 'emotional state' feature extractor.
- An audio-based emotion regressor, operating in the Valence-Arousal (VA) space.
- A rule-based mixing logic module.
The system is built using Python, with communication between modules facilitated by MIDI (Musical Instrument Digital Interface) protocol messages.
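To make that data flow concrete, the sketch below shows how one module might publish a normalized feature value to another over MIDI. The mido library, the virtual port name, and the CC number are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: sending a normalized feature value (0.0-1.0) from one
# module to another as a MIDI Control Change message. Library, port name, and
# CC number are assumptions for illustration only.
import mido

def send_feature(port: mido.ports.BaseOutput, cc_number: int, value: float) -> None:
    """Scale a 0.0-1.0 feature to 0-127 and emit it as a CC message."""
    cc_value = max(0, min(127, round(value * 127)))
    port.send(mido.Message('control_change', control=cc_number, value=cc_value))

if __name__ == "__main__":
    # A virtual port lets modules running on the same laptop exchange messages.
    with mido.open_output('witheFlow-bus', virtual=True) as out_port:
        send_feature(out_port, cc_number=20, value=0.65)  # e.g., an arousal estimate
```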
For biosignals, witheFlow employs commercial-grade electroencephalography (EEG) and electrocardiography (ECG) sensors. EEG data is used to estimate Attention and Relaxation levels, while ECG data helps compute the Baevsky Stress Index (SI), a well-established metric for quantifying stress based on heart rate variability. These physiological insights provide a real-time understanding of the performer’s internal state.
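The Baevsky Stress Index is derived from the distribution of RR intervals (the times between successive heartbeats). A minimal sketch of that computation, assuming RR intervals in seconds and the conventional 50 ms histogram bin, might look like the following; witheFlow's exact preprocessing is not described here.

```python
# Illustrative computation of the Baevsky Stress Index (SI) from RR intervals:
# SI = AMo / (2 * Mo * MxDMn), with AMo in percent and Mo, MxDMn in seconds.
# Bin width and units follow the common convention; the project's preprocessing may differ.
import numpy as np

def baevsky_stress_index(rr_intervals_s: np.ndarray, bin_width_s: float = 0.05) -> float:
    edges = np.arange(rr_intervals_s.min(), rr_intervals_s.max() + bin_width_s, bin_width_s)
    counts, edges = np.histogram(rr_intervals_s, bins=edges)
    mode_idx = np.argmax(counts)
    mo = (edges[mode_idx] + edges[mode_idx + 1]) / 2        # Mo: mode of RR intervals (s)
    amo = 100.0 * counts[mode_idx] / len(rr_intervals_s)    # AMo: share of intervals in the modal bin (%)
    mxdmn = rr_intervals_s.max() - rr_intervals_s.min()     # MxDMn: variation range (s)
    return amo / (2.0 * mo * mxdmn)
```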
In the audio domain, emotion is represented using Russell's circumplex model of valence and arousal. A trained neural network analyzes the dry audio signal and estimates its position in this space, providing another layer of input to the system.
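The paper's regressor architecture is not reproduced here, but conceptually it maps audio features to two continuous values, valence and arousal. A minimal PyTorch sketch of such a mapping, with layer sizes and input features chosen purely for illustration, could look like this:

```python
# Minimal sketch of an audio-to-emotion regressor in the valence-arousal space.
# The architecture and feature dimensionality are illustrative, not the authors' model.
import torch
import torch.nn as nn

class VARegressor(nn.Module):
    def __init__(self, n_features: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128),
            nn.ReLU(),
            nn.Linear(128, 2),   # outputs: (valence, arousal)
            nn.Tanh(),           # constrain both to [-1, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Example: one frame of (e.g., spectrogram-derived) features -> VA estimate
model = VARegressor()
valence, arousal = model(torch.randn(1, 64)).squeeze(0).tolist()
```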
The core of witheFlow’s responsiveness lies in its mixing logic. This module dynamically adjusts the gains of various audio effect channels based on a combination of the musician’s stress and attention levels, and the valence-arousal characteristics of the dry audio. The mixing logic is highly customizable, utilizing rulesets encoded in YAML files. These rules define conditions (e.g., high stress, low attention) that trigger specific functions to boost or suppress certain audio effects, shaping the overall emotional output of the mix.
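The actual rule schema is defined by the project; the snippet below is a hypothetical example of what such a YAML ruleset and its evaluation could look like, with conditions, effect names, and gain deltas invented for illustration.

```python
# Hypothetical shape of a YAML ruleset and its evaluation. The real schema used
# by witheFlow may differ; all thresholds and effect names here are made up.
import yaml

RULESET = yaml.safe_load("""
rules:
  - when: {stress: high, attention: low}
    actions:
      - {effect: reverb, gain_delta: +0.2}      # soften the mix under stress
      - {effect: distortion, gain_delta: -0.3}
  - when: {arousal: high}
    actions:
      - {effect: delay, gain_delta: +0.1}
""")

def apply_rules(state: dict, gains: dict, strength: float = 1.0) -> dict:
    """Adjust effect-channel gains for every rule whose conditions match `state`."""
    for rule in RULESET["rules"]:
        if all(state.get(key) == value for key, value in rule["when"].items()):
            for action in rule["actions"]:
                gains[action["effect"]] = (
                    gains.get(action["effect"], 0.0)
                    + strength * float(action["gain_delta"])
                )
    return gains

gains = apply_rules({"stress": "high", "attention": "low", "arousal": "low"},
                    {"reverb": 0.5, "distortion": 0.4, "delay": 0.2})
```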
Ensuring Robustness and Performer Control
The system incorporates artifact detection for both EEG and ECG signals, ensuring data reliability. If a persistent artifact is detected, the affected device is deactivated, and the system dynamically updates its rule set to prioritize reliable signal sources. Importantly, performers retain control over the system’s behavior; they can adjust the strength of the rules via MIDI (for example, with a foot pedal) and even reverse effects, ensuring psychological safety and artistic agency.
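As a rough illustration of these two mechanisms, a persistent artifact could be flagged by checking a sliding window for flatlined or out-of-range samples, and a foot-pedal Control Change value could scale the overall rule strength. The thresholds and mapping below are assumptions, not the project's code.

```python
# Illustrative artifact check and pedal-controlled rule strength.
# Thresholds and the CC scaling are assumptions, not taken from the paper.
import numpy as np

def persistent_artifact(window: np.ndarray, flat_tol: float = 1e-6,
                        max_abs: float = 200.0) -> bool:
    """Flag a signal window that is flatlined or clipped / out of range."""
    return bool(np.ptp(window) < flat_tol or np.max(np.abs(window)) > max_abs)

def pedal_to_strength(cc_value: int) -> float:
    """Map a MIDI CC value (0-127) from a foot pedal to a rule strength in [0, 1]."""
    return max(0, min(127, cc_value)) / 127.0
```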
Future Horizons and Ethical Considerations
The witheFlow system, co-developed and tested with musicians, has received positive feedback, particularly during improvisational sessions. Future research directions include developing more comprehensive datasets for solo performances that incorporate biosignals, exploring new meaningful features from various sensors (like video), and advancing towards learnable mixing logic that is both interpretable and controllable.
Ethical considerations are paramount, especially concerning sensitive personal data like biosignals. The project adheres to strict ethical principles, ensuring informed consent, anonymity, and secure data storage. While the current proof-of-concept operates locally to minimize data exposure, future scaling might involve cloud computation, raising further privacy challenges. The system emphasizes that authorship of the musical performance belongs entirely to the performer, with the AI acting solely as an enhancing tool.
The witheFlow system represents a significant step towards integrating biosignals with audio processing to enhance musical performance while preserving human creative agency. It offers a prototype and a conceptual framework for emotion-aware audio processing in live contexts, opening new expressive possibilities for musicians. For more details, refer to the full research paper.


