Name: Semantic Audio Encoders from EQ Parameters Alone: Effects of Training Data Composition on Limited-Data Learning
Start: 2026-05-29T09:00:00+0200
End: 2026-05-29T11:00:00+0200

Semantic Audio Encoders from EQ Parameters Alone: Effects of Training Data Composition on Limited-Data Learning

Friday May 29, 2026 9:00am - 11:00am CEST

Foyer Building 303A

We investigate how training data composition influences
semantic audio encoders that learn perceptual descriptors
such as "warm," "bright," and "muddy" from equalization
(EQ) parameter datasets without labeled audio examples.
Using the SAFE-DB dataset of 1,369 labeled EQ settings, we
train audio encoders via an inverse problem formulation in
which labeled EQ parameters are applied to source audio;
the encoder is trained to recognize the resulting semantic
characteristics. Three training configurations are
compared, varying both class sampling strategy (uniform
versus balanced) and source audio type (pink noise versus
real music). Despite severe class imbalance in SAFE-DB,
where 76 percent of examples are labeled "bright" or
"warm," balanced class sampling combined with mixed-source
training (50 percent pink noise, 50 percent FMA music)
successfully learns physically meaningful semantic-spectral
relationships: "warm" and "muddy" show negative correlation
with spectral centroid (r = -0.56), while "bright" and
"thin" show positive correlation (r = +0.49). However,
prediction confidence decreases substantially (from 0.96 to
0.76 to 0.86), and top-1 predictions remain dominated by
the "bright" class across all evaluated music genres,
reflecting inherent dataset bias rather than training
failure. These results demonstrate that training data
composition significantly affects model calibration but
cannot fully overcome fundamental bias in the underlying
label distribution, highlighting key challenges for
semantic audio understanding systems.
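
Two quantities named in the abstract, balanced class sampling and
spectral centroid, can be illustrated with a minimal NumPy sketch. This
is not the author's implementation: the function names, the toy label
list, and the test tone are assumptions made only for illustration.

```python
import numpy as np


def balanced_sample_weights(labels):
    """Return per-example sampling probabilities that give every class
    the same total probability, regardless of how many examples it has."""
    labels = np.asarray(labels)
    # counts[inverse] maps each example to the size of its own class.
    _, inverse, counts = np.unique(labels, return_inverse=True, return_counts=True)
    weights = 1.0 / counts[inverse]
    return weights / weights.sum()


def spectral_centroid(signal, sample_rate):
    """Return the magnitude-weighted mean frequency (Hz) of a mono signal."""
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float((freqs * magnitudes).sum() / magnitudes.sum())


# Hypothetical, deliberately imbalanced label list (not SAFE-DB data).
toy_labels = ["bright"] * 6 + ["warm"] * 3 + ["muddy"]
probs = balanced_sample_weights(toy_labels)

# Hypothetical test tone: one second of a 1 kHz sine sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
tone_centroid = spectral_centroid(np.sin(2 * np.pi * 1000 * t), sr)
```

With these weights, each class in the toy list is drawn with the same
total probability, and the centroid of the pure tone lies at its own
frequency. Descriptors such as "bright" would be expected to go with
higher centroid values and "warm" with lower ones, which is the
direction of the correlations reported above.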

Authors

Daniel Dutulescu

UCL

Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

AI and Machine Learning in Audio, Poster | Audio Applications and Technologies, Poster

Presentation Type Poster

AES Europe 2026
