TLDR: DeformTune is a prototype system that uses a deformable tactile interface to make AI music generation more accessible and understandable for non-musicians. A study with 11 participants revealed challenges like unclear control mappings and a desire for more diverse musical control and feedback. The findings highlight opportunities for designing AI music systems with better explainability through multimodal feedback, layered explanations, and a focus on creative, rather than technical, understanding.
Many of today’s AI music generation tools require users to have musical or technical expertise, relying on complex interfaces, text prompts, or instrument-like controls. This poses a significant barrier for non-musicians who wish to explore AI-assisted music creation.
Addressing this challenge, a new prototype system called DeformTune has been introduced. DeformTune combines a tactile deformable interface with the MeasureVAE model to offer a more intuitive, embodied, and explainable way for non-musicians to interact with AI music. The goal is to make AI music creation accessible and understandable for everyone, regardless of their musical background.
To investigate the user experience, a preliminary study was conducted with 11 adult participants who had no formal musical training. Each session involved a demographic survey, free exploration of the system, a creative task (composing a 10-second ringtone), and questionnaires, followed by a semi-structured interview. The feedback gathered from these sessions was analyzed to understand the challenges and needs of novice users.
The DeformTune system is composed of three main parts: a deformable interface made of conductive fabric; an Arduino-based sensing module equipped with four pressure sensors, which smooths the sensor data and converts pressure readings into stable integer values; and a Python backend that receives these integers from the Arduino. The integer values correspond to specific dimensions of the MeasureVAE model’s latent space, which control musical parameters such as rhythmic complexity, note range, note density, and average interval jump. Higher pressure on a sensor maps to higher values, increasing the complexity of the generated music. The system works by selecting pre-generated MIDI files that correspond to discrete latent vectors and playing them back in real time, a design choice that prioritizes a clear, transparent mapping from user input to musical output.
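The pipeline described above — smoothing raw pressure readings, quantizing them into stable integer levels, and using those levels as a discrete latent vector to look up a pre-generated MIDI file — can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' actual implementation; the function names, the number of discrete levels, and the file-naming scheme are all assumptions.

```python
# Hypothetical sketch of DeformTune's sensing-to-playback pipeline.
# Constants and names below are illustrative assumptions.

NUM_SENSORS = 4   # one sensor per controlled latent dimension
NUM_LEVELS = 3    # assumed number of discrete levels per dimension


def smooth(readings, window=5):
    """Moving-average smoothing of raw pressure readings (0-1023)."""
    out = []
    for i in range(len(readings)):
        chunk = readings[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def quantize(value, levels=NUM_LEVELS, max_raw=1023):
    """Convert a smoothed reading to a stable integer level.

    Higher pressure -> higher level -> more complex output.
    """
    return min(levels - 1, int(value / (max_raw / levels)))


def select_midi(levels):
    """Look up the pre-generated MIDI file for a discrete latent vector.

    The four dimensions are assumed to control rhythmic complexity,
    note range, note density, and average interval jump.
    """
    return "measure_" + "_".join(str(l) for l in levels) + ".mid"


# Example: one snapshot across the four sensors, each pressed
# with a different force (raw ADC values in 0-1023).
raw = [100, 400, 700, 1000]
levels = [quantize(v) for v in raw]
print(levels)               # -> [0, 1, 2, 2]
print(select_midi(levels))  # -> measure_0_1_2_2.mid
```

Pre-rendering the MIDI for each discrete latent vector keeps the input-to-output mapping fixed and inspectable, which matches the paper's stated goal of transparency over expressive range.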
The study’s results indicated that DeformTune was generally enjoyable and expressive, scoring well in Hedonic Quality, Creativity Potential, and Expressivity. However, participants also noted usability challenges, reflected in a lower Pragmatic Quality score. The qualitative analysis of the interviews revealed four key themes regarding user interaction and explainability needs.
Ambiguity in Action–Sound Mapping
This theme highlighted that while participants felt their physical input had an effect, most found it difficult to understand exactly what musical parameters they were controlling. Users described the system as “responsive” but not predictable, making it hard to form a reliable mental model. Suggestions included adding visual feedback like LED lights or on-screen icons, and redesigning the physical shape of the sensors to better align with common gestures like pressing.
The Tension Between Mystery and Control
This theme showed that while the unpredictability of DeformTune was novel and engaging for some, it also led to frustration when users couldn’t reliably recreate desired sounds. Participants expressed a desire for “some rules” to help stabilize interaction without completely removing the sense of mystery, suggesting a need for layered and context-sensitive explanations.
The Need for Multimodal Feedback and Learning Cues
Participants reported that haptic input alone was insufficient for understanding the system. They suggested incorporating visual, auditory, or enhanced tactile feedback, such as pressure indicators or sensor activation cues. The importance of onboarding support, like guided demonstrations or tutorials explaining each sensor’s effect, was also emphasized, especially for first-time users.
Explainability as a Means of Creative Empowerment
Participants were interested in integrating their own musical preferences or training data. Many also desired control over a wider range of musical attributes like rhythm, melody, harmony, timbre, or pitch, and expressed a wish for more diverse interaction modalities beyond pressing, such as squeezing, twisting, sliding, and tapping.
The research paper discusses several opportunities for enhancing systems like DeformTune. These include clarifying action-sound mappings through visual and tactile cues, balancing mystery and control with layered explanations, providing guided and gradual explanations, and shifting from technical transparency to creative explainability. The latter suggests that explanations should focus on perceptually salient musical features (like pitch or loudness) rather than abstract technical parameters.
While DeformTune is still in its early stages, this user study provides valuable insights into the challenges non-musicians face with AI music systems and their specific needs for explainability. These findings offer promising directions for designing future AI music tools that are more transparent, engaging, and usable for novice users. For more details, you can read the full research paper here.


