Schedule as of May 16, 2022 - subject to change

Default Time Zone is CEST - Central European Summer Time
You can change your view to your time zone (look for "Timezone" on the right)


LIVESTREAMS : A and B


ON DEMAND VIDEOS (previous days)
 
Type: Perception
Thursday, May 28
 

10:00am CEST

Distortion Measurements; Can We Measure What We Hear?
Thursday May 28, 2026 10:00am - 11:00am CEST
There are many types of different distortions that can be
measured from linear to non-linear distortion. Often the
two are convoluted together and the linear distortion
influences the non-linear distortion. Distortion is also
very signal and level dependent and it is hard to compare
one type of distortion measurement to another. There are
many types of non-linear distortion metrics, e.g. THD, THD+N
and IMD being the most classic ones using sine tones as the
test signal. But how can we measure distortion with real
signals such as speech and music or even noise and compare
the results to audibility? This tutorial discusses a wide
range of distortion measurements, discusses what is audible
and what distortion sounds like.
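As a rough illustration of the classic sine-tone metric mentioned above, the following sketch computes THD as the ratio of the RMS sum of the harmonic amplitudes to the fundamental amplitude. The function names, the synthetic test signal, and the choice of five harmonics are assumptions made for this example and are not taken from the tutorial.

```python
import math

def tone_amplitude(samples, sample_rate, freq):
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT)."""
    n = len(samples)
    re = sum(x * math.cos(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(samples))
    im = sum(x * math.sin(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

def thd_percent(samples, sample_rate, fundamental, n_harmonics=5):
    """THD in percent: RMS sum of harmonics 2..n_harmonics+1 over the fundamental."""
    fund = tone_amplitude(samples, sample_rate, fundamental)
    harmonics = [tone_amplitude(samples, sample_rate, k * fundamental)
                 for k in range(2, n_harmonics + 2)]
    return 100.0 * math.sqrt(sum(h * h for h in harmonics)) / fund

# Hypothetical device output: 1 kHz tone with 1% second and 0.5% third harmonic.
sample_rate = 48000
n = 4800  # 100 ms, a whole number of cycles for every analysed frequency
signal = [math.sin(2 * math.pi * 1000 * i / sample_rate)
          + 0.010 * math.sin(2 * math.pi * 2000 * i / sample_rate)
          + 0.005 * math.sin(2 * math.pi * 3000 * i / sample_rate)
          for i in range(n)]
thd = thd_percent(signal, sample_rate, 1000.0)
print(f"THD = {thd:.3f} %")  # about 1.118 %
```

A sine-based figure like this says nothing about speech or music signals, which is the gap the tutorial addresses.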
Speakers
Steve Temme

Listen Inc.
Steve Temme is founder and President of Listen, Inc., manufacturer of the SoundCheck audio test system. Steve founded the company in 1995, and for the past 30 years the company has remained on the cutting edge of research into audio measurement, regularly introducing new measurement...
Aud 49 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

10:30am CEST

Effect of an Active Acoustic Reinforcement System on Musical Performance in a Recording Studio
Thursday May 28, 2026 10:30am - 11:00am CEST
This work presents the results of a perceptual study
investigating the influence on musicians of a virtual
acoustics system installed in the live room of a
professional recording studio. The study focused on
analyzing relationships between a selection of objective
acoustic parameters (T30, STLate, LJ) and the subjective
perceptions of 19 solo musicians performing under 11
different acoustic conditions. The experiment was conducted
using the VAT (Virtual Acoustic Technology) system and the
VAT Suite software developed at the Immersive Media
Laboratory (IMLab) in the Sound Recording Department at
McGill University. Correlations between quantitative and
qualitative analyses show that musicians’ preferences
converge on conditions with T30 ≈ 1 s, and that late and
lateral energy increases the perception of spatiality,
providing a positive balance between clarity and acoustic
support. However, longer reverberation reduces comfort and
executive control.
Authors
Gianluca Grazioli

Montreal, Canada, McGill University
Richard King

McGill University
Montreal
Wieslaw Woszczyk

McGill University
Aud 42 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

11:00am CEST

Kseniya Kawko: 3D Masterclass
Thursday May 28, 2026 11:00am - 12:00pm CEST
Kseniya Kawko, a Munich- and London-based Tonmeister and
recording engineer specializing in classical music and
jazz, shares selections from her recent live and studio
recording and mixing projects, featuring leading orchestras
and jazz ensembles, and provides an introduction to the
artistic and production considerations behind immersive
formats.

This masterclass series, featuring remarkable recording
artists, is a chance to hear 3D audio at its best; as we
discuss qualities that make it truly worth the effort.

In each masterclass, we explore the new spatial
possibilities in recording and production, detailing also
this specific listening room, regarding ITU-R BS.1116
compliance and auditory envelopment (AEV) transparency.
Seats are limited to keep playback variation at bay.
Speakers
Kseniya Kawko

Tonmeister, msm studios
Kseniya Kawko is a producer and recording engineer specialized in classical music and jazz. She holds Master of Music degrees from two world-renowned audio programs: Sound Recording, McGill University (Montréal, Canada) and Musikregie / Tonmeister, Hochschule für Musik Detmold (Germany...
Thomas Lund

Genelec Oy
Denmark
Aud 31 Technical University of Denmark Asmussens Alle, Building 306 DK-2800 Kgs. Lyngby Denmark

11:00am CEST

Measurement tools for immersive audio production
Thursday May 28, 2026 11:00am - 12:00pm CEST
Multichannel audio formats require attention to
inter-channel correlation and sometimes a special approach. In
this workshop, we would like to continue the discussion
started at AES Show 2025 in LA and show how you can use
different measurement tools to avoid certain problems in
the final mix. Examples include the mutual influence between
the upper and main beds in an immersive layout, problems in
the LFE channel, and how to check the mix for
correlation issues outside the sweet spot.
Speakers
Pavel Smokotnin

RTW GmbH & Co. KG
Germany, Köln
Aud 49 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

11:30am CEST

The efficacy of phantom image perception: an active listener perspective.
Thursday May 28, 2026 11:30am - 12:00pm CEST
A “phantom image” is the illusion of an independent sound
source created by two or more loudspeakers. Most often
created by manipulating level differences between
stereophonic channels (aka, “panning”), the effect is used
to create a sense of auditory space between loudspeakers
and is largely taken for granted. In recent years,
surround and immersive audio systems have attempted to
utilize phantom image processing to render audio objects in
desired positions across multiple loudspeaker arrays. This
research examined the efficacy of phantom image perception
horizontally and vertically from an active listener
perspective. After listening to a target loudspeaker,
listeners (n = 442) were asked to move a phantom sound to a
position to match that of the target loudspeaker. The
listener’s phantom placement was then compared to the
target, and subjects were allowed to “correct” their phantom
position. The horizontal experiment was based on a
standard stereophonic 60° loudspeaker array with the target
loudspeaker at 15° off center. The vertical experiment
utilized elevated loudspeakers in a 60° arc with the target
loudspeaker elevated 10° above the horizon (lower
loudspeaker). Results show nearly universal “undershoot” in
horizontal placement error on first attempts with gradual
improvement over trials that coalesced around the projected
target location. However, after repeated tries, final
perceptual image locations were spread over 2/3 of the
sound-field around the target loudspeaker. In the vertical
trials perceptual locations were spread across the entire
sound field in all three trials and failed to show any
patterns of coalescence around the target loudspeaker.
Authors
Song Hui CHON

Associate Professor, Belmont University
Associate Professor of Audio Engineering Technology, interested in the perception and cognition of music and sound, especially timbre and attention. An amateur historical keyboardist. And my first name sounds like "song-he" as in "The song he sang was beautiful."
Wesley Bulla

Belmont University
Aud 42 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

1:30pm CEST

A New Reference Target Curve for Studio Headphones
Thursday May 28, 2026 1:30pm - 2:00pm CEST
Target curves for the sound signature of headphones are a
helpful design target during the development process. While
a lot of attention has been paid to finding target curves
that match the listening preference of consumers,
equivalents for studio headphones date back to the 90s. In
the context of music production, a mutual target or even
standard is essential to make mixing and mastering more
gear-independent. This becomes even more important since
the main tool for sound engineers is shifting from
loudspeakers in professional environments such as
acoustically treated studios to headphones, often
additionally equipped with virtualization algorithms. This
enables them to be more flexible and to rely less on
potentially expensive loudspeaker setups. The diffuse-field
target curve, currently still the only standardized target
curve for studio headphones, is often reported not to match
a real loudspeaker equivalent in studio environments. In
this paper, we aim to find a new standard target curve for
studio headphones emulating the frequency response of a
loudspeaker setup in modern studio environments.
For this, we give an overview of current target curves and
match them to their equivalent loudspeaker setups.
Based on that, we propose a new methodology for a
measurement-based target curve incorporating typical
panning paradigms of music signals based on measurements
inside multiple control rooms. To verify the results, we
conduct listening tests with professionals in multiple
studio environments.
Authors
Jonas Foerster

Signal Processing Engineer, beyerdynamic GmbH & Co. KG
Passionate about Headphones, Signal Processing and their interaction.

Focus on headphone target curves, spatial audio and ANC
Lukas Keppler

beyerdynamic GmbH & Co. KG
Aud 44 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

1:30pm CEST

Binaspect: A Python Library for Binaural Audio Analysis, Visualization & Feature Generation
Thursday May 28, 2026 1:30pm - 3:30pm CEST
We present Binaspect, an open-source Python library for
binaural audio analysis, visualization, and feature
generation. Binaspect generates interpretable “azimuth
maps” by calculating modified interaural time and level
difference spectrograms, and clustering those
time-frequency (TF) bins into stable time-azimuth histogram
representations. This allows multiple active sources to
appear as distinct azimuthal clusters, while degradations
manifest as broadened, diffused, or shifted distributions.
Crucially, Binaspect operates blindly on audio, requiring
no prior knowledge of head models. These visualizations
enable researchers and engineers to observe how binaural
cues are degraded by codec and renderer design choices,
among other downstream processes. We demonstrate the tool
on bitrate ladders, ambisonic rendering, and VBAP source
positioning, where degradations are clearly revealed. In
addition to their diagnostic value, the proposed
representations can be exported as structured features
suitable for training machine learning models in quality
prediction, spatial audio classification, and other
binaural tasks. Binaspect is released under an open-source
license with full reproducibility scripts at: (link removed
for blind review)
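For readers unfamiliar with the two binaural cues named in the abstract, the sketch below estimates a plain broadband interaural level difference (ILD) and interaural time difference (ITD) for one stereo frame. It is not Binaspect's API or its modified per-bin spectrogram method; the function names, the sign conventions, and the synthetic signal are assumptions for illustration only.

```python
import math
import random

def ild_db(left, right):
    """Interaural level difference in dB (positive when the left ear is louder)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10(energy_left / energy_right)

def itd_samples(left, right, max_lag):
    """Lag in samples that maximizes the cross-correlation.

    Positive values mean the right channel lags the left one.
    """
    best_lag, best_corr = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        corr = sum(left[i] * right[i + lag]
                   for i in range(len(left))
                   if 0 <= i + lag < len(right))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag

# Hypothetical source to the listener's left: the right ear receives
# the same noise 10 samples later and at half the amplitude.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
delay = 10
left = source
right = [0.0] * delay + [0.5 * x for x in source[:-delay]]

ild = ild_db(left, right)
itd = itd_samples(left, right, max_lag=20)
print(f"ILD = {ild:.2f} dB, ITD = {itd} samples")
```

Computing such cues per time-frequency bin and histogramming them over azimuth is the general idea behind an azimuth map; the library's actual processing may differ.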
Authors
Alessandro Ragano

University College Dublin
Dan Barry

University College Dublin
Davoud Shariat Panah

University College Dublin
Jan Skoglund

Google
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

1:30pm CEST

Perceptual Evaluation of the MPEG-I Immersive Audio Standard
Thursday May 28, 2026 1:30pm - 3:30pm CEST
The recently finalized ISO international standard (IS) on
MPEG-I immersive audio enables interactive
six-degrees-of-freedom (6DoF) audio rendering for a
multitude of virtual-reality and augmented-reality (VR/AR)
acoustic scenarios and applications with comprehensive
modeling of room acoustics and intricate acoustic
phenomena, including e.g. occlusion, reflection,
transmission and diffraction caused by sound obstacles,
Doppler effect, and dynamic environment changes triggered
by user interactivity. This paper describes concept,
methodology and results of the final verification test of
this standard. In the verification test, the perceptual
quality of the renderer was assessed in an interactive
listening test using different in- and outdoor acoustic
scenes, testing the above-mentioned features of the
standard. More than 50 listeners participated in the test
distributed across six labs using the ITU‑R BS.2132 [1]
multi‑stimulus method on a 100‑point scale for three
conditions (IS, mid- and low anchor) in 10 VR scenes plus
two repetitions. The results of several anchor processing
configurations are presented. The selected mid and low
anchors have demonstrated stable quality across diverse
scenes with progressive timbre and spatial degradations.
The listening test results show a clear separation of the
conditions (IS > mid > low); the low anchor was stable
(around 16 points median value) while the mid anchor varied
by scene (around 47 points). The IS is rated with a median
of 84 points among all labs, which is the “excellent”
region of the scale. The individual scenes are rated
differently. The quartile range for some scenes can exhibit
20 points. The median value for the IS varied between
labs; some were a bit more critical than others.
Authors
Andreas Silzle

Fraunhofer IIS
Germany
Leon Terentiv

Dolby
Germany
Pablo Delgado

Fraunhofer IIS
Erlangen, DE
Sam Jelfs

Philips
Sascha Disch

Fraunhofer IIS
Sascha Disch received his Dipl.-Ing. degree in electrical engineering from the Technical University Hamburg-Harburg (TUHH) in 1999 and joined the Fraunhofer Institute for Integrated Circuits (IIS) the same year. Ever since he has been working in research and development of perceptual...
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

1:30pm CEST

Capturing Immersive Sound in Concert Halls: A Comparative Analysis of PCMA-3D and Decca Cuboid Recording Techniques
Thursday May 28, 2026 1:30pm - 3:30pm CEST
This paper presents a comparative analysis of two immersive
recording techniques for classical music: the PCMA-3D
(Perspective Control Microphone Array) and the Decca
Cuboid. While the Decca Cuboid relies primarily on
time-of-arrival differences to generate spatial
impressions, the PCMA-3D utilises intensity differences and
separates ambience from direct sound. A recording session
was conducted in a concert hall using a classical guitar
soloist and two distinct folk music ensembles to capture
performances simultaneously with both arrays. Subjective
evaluation was performed using a MUSHRA listening test with
18 participants, assessing parameters such as sensation of
space, localisation precision, and sound quality.
Statistical analysis reveals that while both systems
provide high-quality immersive experiences, the PCMA-3D
scored significantly higher in the sensation of space (p
Authors
Zechen Wang

University of York
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

2:00pm CEST

Personalized VR for hearing research with embedded devices
Thursday May 28, 2026 2:00pm - 2:30pm CEST
Deep learning has significantly improved speech enhancement
performance in controlled laboratory conditions, yet these
advances rarely translate into robust real-world benefit
for hearing aid users. Current algorithms are trained and
evaluated in simplified acoustic scenarios, neglecting
multimodal cues, user interaction, environmental dynamics,
and the strict latency and power constraints of embedded
devices. As a result, a persistent gap remains between
algorithmic performance and everyday listening experience.
This position paper reviews recent progress in speech
enhancement, embedded Artificial Intelligence hardware, and
hearing aid systems, and argues for a shift toward
ecologically valid evaluation and hardware-aware design. We
propose virtual reality as a reproducible, multisensory
benchmarking platform enabling joint assessment of human
perception and algorithmic processing. This perspective
outlines a research roadmap toward adaptive, context-aware,
and practically deployable hearing technologies.
Authors
Romain Serizel

LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Stefania Serafin

Department of Engineering Technology and Didactics, Technical University of Denmark
Aud 42 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

2:00pm CEST

The Perception and Measurement of Nonlinear Distortion in Headphones
Thursday May 28, 2026 2:00pm - 2:30pm CEST
Few studies exist on the perception and measurement of
nonlinear distortion in headphones. This paper reports the
detection thresholds and perceived sound quality from real
distortion in headphones. Five different distortion
measurements were made on the headphones to determine how
well they predict audibility and quality. Music samples
were binaurally recorded on six headphones at playback
levels ranging from 85 to +110 dBA at 3 dB increments. The
recordings were reproduced at a normal playback level (83
dBA) through a reference headphone with low distortion. The
headphone recordings were post-processed to remove both
level and frequency response differences so only nonlinear
distortions and residual noise remained. In a second test,
listeners rated the similarity in quality of headphones
relative to an undistorted reference and a hidden version
of it. The results provide evidence that audible distortion
in headphones with music occurs at significantly higher
playback levels (104 to 112 dBA SPL) than what is
considered typical and safe. The percentage of measured THD
in the headphone had the highest correlation with the
detection thresholds while the non-coherent distortion with
music best predicted the similarity ratings. We discuss the
results and the practical implications they might have on
future headphone design, testing and measurement.
Authors
Sean Olive

Audio Consultant, Sean Olive Audio Consulting
United States
Aud 44 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

2:00pm CEST

Perceptual Model Considering Comodulation Masking Release by Postmasking Adaptation
Thursday May 28, 2026 2:00pm - 2:30pm CEST
This work presents a perceptual model based on a complex
IIR filterbank. The filterbank with a frequency resolution
of 4 bands per Bark consists of 104 filters whose slopes
are designed to take spectral masking effects into account.
The filter outputs are used to obtain masking thresholds
with the following post processing. To obtain reasonable
masking thresholds from the spreading outputs, a post
masking stage is required. Here, we propose a comodulation
dependent adaptation of the postmasking decay to model
Comodulation Masking Release (CMR) effects. This approach
explicitly considers the dip-listening effect known from
literature. The final masking thresholds are obtained by
weighting the postmasking outputs by a tonality dependent
gain, controlled using spectral flatness estimation. A
listening test compares the proposed method to an already
known approach using direct CMR based modification of the
masking threshold gains.
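The spectral flatness estimate mentioned above is commonly defined as the ratio of the geometric mean to the arithmetic mean of the power spectrum: values near 1 indicate a noise-like frame, values near 0 a tonal one. The sketch below shows that standard measure only; the naive DFT, the frame length, and the small `eps` regularizer are assumptions for this example, and the paper's mapping from flatness to a tonality-dependent gain is not reproduced here.

```python
import math
import random

def power_spectrum(frame):
    """Naive DFT power for bins 1 .. N/2-1 (DC and Nyquist are skipped)."""
    n = len(frame)
    powers = []
    for k in range(1, n // 2):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        powers.append(re * re + im * im)
    return powers

def spectral_flatness(powers, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum."""
    log_mean = sum(math.log(p + eps) for p in powers) / len(powers)
    arith_mean = sum(p + eps for p in powers) / len(powers)
    return math.exp(log_mean) / arith_mean

n = 256
tone = [math.sin(2 * math.pi * 16 * i / n) for i in range(n)]  # exactly on bin 16
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

tone_flatness = spectral_flatness(power_spectrum(tone))
noise_flatness = spectral_flatness(power_spectrum(noise))
print(f"tone: {tone_flatness:.3g}, noise: {noise_flatness:.3g}")
```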
Authors
Bernd Edler

International Audio Laboratories Erlangen, Germany
Fabian Schaller

Fraunhofer IIS, Erlangen, Germany
Aud 43 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

2:00pm CEST

Florian Camerer: 3D Masterclass
Thursday May 28, 2026 2:00pm - 3:00pm CEST
Florian details the design of his brilliant and durable
Double-Ufix 3D mic array, capable of high resolution
outdoor recording. Attendees are treated to memorable
listening examples from natural and rural environments in
Austria and the Nordics.

This masterclass series, featuring remarkable recording
artists, is a chance to hear 3D audio at its best; as we
discuss qualities that make it truly worth the effort.

In each masterclass, we explore the new spatial
possibilities in recording and production, detailing also
this specific listening room, regarding ITU-R BS.1116
compliance and auditory envelopment (AEV) transparency.
Seats are limited to keep playback variation at bay.
Speakers
Florian Camerer

Senior Sound Engineer, ORF
Thomas Lund

Genelec Oy
Denmark
Aud 31 Technical University of Denmark Asmussens Alle, Building 306 DK-2800 Kgs. Lyngby Denmark

2:30pm CEST

EMORSION – Examining the Impact of Audio Features on Emotional Responses and Immersion in Film.
Thursday May 28, 2026 2:30pm - 3:00pm CEST
EMORSION is an exploratory study examining how film audio
design shapes audience emotion and immersion. It was
conducted using scenes from four films in the horror (2)
and drama (2) genres, with two mainstream and two
independent productions. For each scene, multiple
alternative audio mixes were created by systematically
manipulating three core aspects of audio design: frequency
(pitch), dynamics (loudness), and directionality (spatial
placement). Three audience groups were exposed to the
scenes in a cinema setting, with each group experiencing
either one manipulated audio mix or a control mix.
Audience responses were assessed through a multimodal
framework combining self-reported emotion and immersion via
a questionnaire, and physiological measures, including
heart rate monitoring and video-based motion tracking.
Results show that subtle changes in audio design
significantly affect emotional perception and immersion.
Unconventional mixes produced greater variability in
interpretation, while conventional immersive mixes led to
stronger agreement across audiences. Notably, participants
often reported perceived visual changes despite no
alterations to the visual content.
Authors
Charalampos Saitis

Queen Mary University of London
George Fazekas

Queen Mary University of London
Josh Reiss

Professor, Queen Mary University of London
Josh Reiss is Professor of Audio Engineering with the Centre for Digital Music at Queen Mary University of London. He has published more than 200 scientific papers (including over 50 in premier journals and 6 best paper awards) and co-authored two books. His research has been featured...
Nelly Garcia

PhD Researcher, Queen Mary University of London
I'm Nelly Garcia.
I'm an engineer in communications and electronics with the specialty in acoustics.
Now, I'm a PhD Researcher at the Centre for Digital Music (C4DM) at Queen Mary University of London.
My main interest is sound design, ways to create sounds from scratch, optimize the workflow of a sound designer and innovative ways to label, categorise or access samples...
Ruby Crocker

Queen Mary University of London
Aud 43 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

3:00pm CEST

Deep-Learning-Driven Sensory Profiling of Headphone Target Curves with Adaptive Listening Test Validation
Thursday May 28, 2026 3:00pm - 3:30pm CEST
Identifying robust headphone target curves is challenging
when preference data from untrained listeners are
interpreted without explicit perceptual structure. This
work presents a methodological framework in which
deep-learning-driven sensory-profile analysis serves as the
primary interpretive layer for listening data.
Candidate target curves are generated using an Interactive
Differential Evolution (IDE) listening experiment that
combines paired comparisons with a second-stage
absolute-rating task, enabling continuous exploration of the
perceptually relevant tuning space while reducing cognitive
load. Converged gain sets are analyzed using a Virtual
Listener Panel (VLP), a Deep Learning (DL) model trained on
large-scale expert evaluations to predict perceptual
attributes from rendered musical material. Predicted
attributes are reported as relative scores along key sensory
dimensions, including bass strength, timbral balance, and
brilliance, enabling exploration of sensory clusters,
perceptual trade-offs, and potential families of target
tunings.
Adaptive listening data from three culturally distinct
listener panels (Denmark, Japan, and Colombia; 20
participants per site) support the DL-based interpretation.
Convergence is quantified as a reduction in population
variance, and cross-site analyses assess the similarity of
clustering structures and the consistency of relationships
between preference and sensory attributes. Overall, the framework
provides a scalable, perceptually grounded approach to
interpreting listener-preference data when developing
headphone target curves.
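One plausible reading of "convergence as a reduction in population variance" is sketched below: the variance of each gain band across the individuals of a generation is averaged over bands, and the first and last generations are compared. The data layout (one list of band gains in dB per individual), the function names, and the numbers are hypothetical; the abstract does not specify the exact computation.

```python
from statistics import pvariance

def population_variance(gain_sets):
    """Mean over bands of the variance across individuals in one generation."""
    n_bands = len(gain_sets[0])
    per_band = [pvariance([gains[b] for gains in gain_sets])
                for b in range(n_bands)]
    return sum(per_band) / n_bands

def variance_reduction(first_generation, last_generation):
    """Fractional reduction in population variance (1.0 means full convergence)."""
    v_first = population_variance(first_generation)
    v_last = population_variance(last_generation)
    return 1.0 - v_last / v_first

# Hypothetical two-band gain sets (dB) for three individuals.
first_generation = [[-4.0, 2.0], [4.0, -2.0], [0.0, 0.0]]
last_generation = [[-1.0, 0.5], [1.0, -0.5], [0.0, 0.0]]

reduction = variance_reduction(first_generation, last_generation)
print(f"variance reduced by {100 * reduction:.2f} %")  # 93.75 %
```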
Authors
Gabriele Ravizza

Perceptual Audio Evaluation Specialist, FORCE Technology
▪  Acoustics, psychoacoustics, product development, and digital communication as an Audio Engineer in the consumer electronics industry.
▪ Currently employed as a specialist at FORCE Technology's SenseLab department, contributing to enhancing sound quality in a wide range of consumer electronics products, collaborating with audio companies from across the globe...
Julian Villegas

University of Aizu
Japan
Aud 44 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

3:00pm CEST

Emergence and Spatial Directionality of Sa Quintina in the Sacred Vocal Tradition of Castelsardo, Sardinia, Italy: An Early-Stage Sonological–Acoustical Study
Thursday May 28, 2026 3:00pm - 3:30pm CEST
Sa quintina is a distinctive emergent vocal phenomenon
almost exclusively associated with the sacred polyphonic
singing tradition of Castelsardo, perceived as an
autonomous “fifth voice” arising during collective
performance by four male singers. Although widely
acknowledged in ethnomusicological literature, its
formation mechanisms remain only partially explored within
audio engineering and acoustical research.
This paper presents an early-stage, descriptive sonological
case study proposing new hypotheses on the formation and
spatial reinforcement of sa quintina. The phenomenon is
interpreted as a physically grounded, measurable outcome of
harmonic fusion and spatial interference, observable
through spectral energy distribution and coherence. It is
hypothesized to emerge from a converging set of
conditions—including non-tempered harmonic textures,
differentiated vocal emission techniques, intentional
formant tuning, and circular spatial configuration—none of
which is assumed to be strictly sufficient in isolation.
Building upon previous spectral coherence analyses, the
study introduces a Quintina Directionality Index (QDI) to
quantify the spatial dimension of the phenomenon. QDI is
defined as the ratio between spectral energy in two
frequency bands associated with sa quintina (600–750 Hz and
1200–1400 Hz) and total spectral energy. The index is
evaluated as a function of direction using ambisonic
recordings in an anechoic chamber and as a function of
microphone position in a controlled field setting.
Preliminary observations suggest that sa quintina
corresponds to localized regions of enhanced spectral
coherence and energy reinforcement, supporting its
interpretation as an emergent physical phenomenon that
precedes and enables its perceptual salience, rather than a
purely auditory illusion.
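The QDI definition above (energy in the 600–750 Hz and 1200–1400 Hz bands divided by total spectral energy) can be sketched as follows. The function name, the inclusive band edges, and the example spectra are assumptions for illustration; the authors' windowing, normalization, and ambisonic beamforming steps are not described in the abstract and are not modelled.

```python
def quintina_directionality_index(freqs_hz, power,
                                  bands=((600.0, 750.0), (1200.0, 1400.0))):
    """Ratio of in-band spectral energy to total spectral energy.

    `freqs_hz` and `power` are parallel sequences describing one power
    spectrum. Band edges are treated as inclusive (an assumption).
    """
    total = sum(power)
    if total == 0:
        return 0.0
    in_band = sum(p for f, p in zip(freqs_hz, power)
                  if any(lo <= f <= hi for lo, hi in bands))
    return in_band / total

# Hypothetical power spectra for two look directions.
freqs_hz = [300.0, 650.0, 1000.0, 1300.0, 2000.0]
spectra = {
    "front": [1.0, 2.0, 1.0, 3.0, 3.0],
    "side": [4.0, 1.0, 3.0, 1.0, 1.0],
}

qdi = {direction: quintina_directionality_index(freqs_hz, power)
       for direction, power in spectra.items()}
print(qdi)  # {'front': 0.5, 'side': 0.2}
```

Evaluating the index per direction or per microphone position, as the abstract describes, then amounts to calling the function once for each spectrum.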
Authors
Felicita Brusoni

PhD candidate Musikhögskolan i Malmö, Lund University
Luca Frigo

Conservatorio G. Nicolini Piacenza
Martino Sarolli

Conservatorio Paganini Genova
Riccardo Dapelo

Conservatorio Nicolini Piacenza
Aud 43 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

3:00pm CEST

Jim and Ulrike Anderson: 3D Masterclass
Thursday May 28, 2026 3:00pm - 4:00pm CEST
Jim and Ulrike have been recording in and for immersive
audio for broadcast, film and audiophile staples for
decades. They specialize in turning traditional acoustic
New York Studio recordings into vast spatial experiences.
The audience will experience the breathtaking
virtuosity of the likes of Jane Ira Bloom, the Secret Trio,
Donald Vega and large format ensembles under Franco
Ambrosetti and Jim Pugh.

This masterclass series, featuring remarkable recording
artists, is a chance to hear 3D audio at its best; as we
discuss qualities that make it truly worth the effort.

In each masterclass, we explore the new spatial
possibilities in recording and production, detailing also
this specific listening room, regarding ITU-R BS.1116
compliance and auditory envelopment (AEV) transparency.
Seats are limited to keep playback variation at bay.
Speakers
Jim Anderson

Professor, Anderson Audio New York
Jim has been the President of the AES Educational Foundation since 2020 and is a professor of recorded music with the Clive Davis Institute of Recorded Music in the Tisch School of the Arts at New York University. Jim was the Institute’s Chair from 2004 – 2008. A graduate of the...
Thomas Lund

Genelec Oy
Denmark
Ulrike Anderson

Anderson Audio New York
Aud 31 Technical University of Denmark Asmussens Alle, Building 306 DK-2800 Kgs. Lyngby Denmark

3:30pm CEST

NAVIQUAL: Creating Spatial Audio Quality Maps for Virtual Live Music Environments
Thursday May 28, 2026 3:30pm - 4:00pm CEST
Live music environments can be simulated and evaluated
through spatial audio and augmented reality (AR)
technology. However, conducting perceptual studies on AR
environments can be challenging, as multiple design
considerations and uncontrolled variables come into play.
Hence, we developed Naviqual, a tool to create a spatial
audio quality map for a virtual live music environment. We
generated objective quality contour and polar maps to
predict the quality of experience (QoE) across listener
locations and directions respectively. We found that these
maps strongly aligned with perceptual evaluations by
normal-hearing listeners through listening tests. We also
found that binaural objective metrics and signal-to-noise
ratio both strongly predict QoE across listener
translations, with the former outperforming the latter in
predicting QoE across listener directions. Overall,
Naviqual provides a QoE map for virtual live music
environments robust across various listener locations and
directions, noise locations, music content, and room
acoustics.
Authors
Carl Timothy Tolentino

University College Dublin
Aud 43 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

3:30pm CEST

Audio engineering music for listeners with hearing loss
Thursday May 28, 2026 3:30pm - 4:30pm CEST
Audio engineering often implicitly assumes a uniformity in
hearing across listeners; this is an assumption that does
not reflect real-world diversity. How could technologies
and practices in production, mixing, and reproduction be
adapted to create music that is more inclusive? While the
AES has a conference series on Audio and Music Induced
Hearing Disorders, this has focused on the causes of
hearing loss with little on audio engineering for listeners
who have a hearing loss.

In western countries, about one in three adults are deaf,
have hearing loss or suffer from tinnitus. Hearing loss can
lead to many challenges with music such as: inaudibility of
quieter passages, distortion, degraded pitch perception,
and difficulty in identifying and picking out lyrics and
instruments. The most common intervention for mild to
moderately severe hearing loss is hearing aids. But while
many of these devices have music programs, their efficacy
is mixed, to the point that many opt not to use them. With
the rise of machine learning within Audio Engineering,
there are opportunities to better personalise music, and
therefore address issues listeners face. Consumer devices
are also increasingly gaining audio accessibility
features, but the usefulness of these lacks independent
testing. This workshop will consider opportunities for
making music more accessible.

The workshop will start by exploring how hearing loss harms
the experience of listening to music and how this varies
between people. This will lead to discussion of why no
technology can fully ‘correct’ music to achieve a ‘perfect’
listening experience for those with hearing loss. There is
no technology to recreate a ‘golden-ears’ experience. This
leads to a key research question: what is the best
rendition of a piece of music for someone who has hearing
loss? What do listeners want from music, and how can we get
closest to achieving that?

We will bring in findings from research projects and
listening tests to explore what is known, and also to
highlight that there are significant gaps in knowledge that
require further research. We will then explore the
state of the art in wearables such as hearing aids and
sound reproduction systems. This will include the current
Cadenza project, which has been running a series of machine
learning challenges to improve music for those with hearing
loss.

Throughout, we will encourage questions and engagement from
delegates. We want to hear about lived experience of
hearing difference and how that has changed professional
practice and personal lives. We are also keen to hear
suggestions from delegates on what approaches might be used
to improve music for those with hearing loss.

We aim to raise awareness of the importance of considering
diverse audiences in Audio Engineering practice. Where
possible, the workshop will provide practical guidance for
audio engineers, highlighting techniques and emerging
technologies that can better support listeners with diverse
hearing profiles.

The workshop will be organised by the Cadenza Project Team
(https://cadenzachallenge.org/), a large UK-funded project
about improving music for those with hearing loss.
Speakers
Josh Reiss

Professor, Queen Mary University of London
Josh Reiss is Professor of Audio Engineering with the Centre for Digital Music at Queen Mary University of London. He has published more than 200 scientific papers (including over 50 in premier journals and 6 best paper awards) and co-authored two books. His research has been featured... Read More →
Trevor Cox

University of Salford
Sara Madsen

GN Store Nord
Adam Steed

Contact Theatre, Manchester
Thursday May 28, 2026 3:30pm - 4:30pm CEST
Aud 41 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

4:00pm CEST

Influences of Nonlinear Distortion in Music Playback on Listeners’ Stress Evaluated by PPI and RMSSD of PPG
Thursday May 28, 2026 4:00pm - 4:30pm CEST
The phenomenon in which listeners’ impressions of music are
unintentionally altered even when the same sound source is
played back remains an important issue. Previous research
has shown that the state and combination of audio equipment
affect the characteristics of nonlinear distortion in music
playback. Hence, we conducted a subjective evaluation of
auditory and musical impressions using sound sources with
various nonlinear distortions. However, the subjective
evaluation was unstable and difficult to assess. The reason
was that the sound change was perceived emotionally as a
slight change in sound image and musicality, and the
interpretation of evaluation terms varies widely among
subjects due to the difficulty of verbalizing the
impression. Therefore, we evaluated the change in
listeners’ stress caused by nonlinear distortion in music
playback using photoplethysmography (PPG). In this
study, we conducted a follow-up experiment with improved
accuracy.
In the experiment, 41 subjects listened to sound sources
with even-order harmonic distortion at 2.69% THD, odd-order
harmonic distortion at 2.69% THD, and no distortion. The
musical piece used for the sound sources is an original, to
eliminate familiarity and bias toward existing music.
We evaluated changes in subjects’ stress states using the
mean pulse-pulse interval (PPI) and the root mean square of
successive differences (RMSSD), computed from the PPG
signal, as indicators of stress.
These results reconfirm that nonlinear distortion in music
playback affects listeners’ vital responses, as evidenced
by significant differences in both mean PPI and RMSSD, as
assessed by Cochran's Q test at the 5% significance level.
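
The two indicators named in this abstract can be illustrated with a minimal sketch. It assumes pulse peak times have already been detected from the PPG signal; the function names and the sample peak times are hypothetical and are not taken from the authors' work.

```python
import math

def pulse_intervals(peak_times_s):
    """Pulse-pulse intervals (PPI) in ms from successive peak times in seconds."""
    return [(b - a) * 1000.0 for a, b in zip(peak_times_s, peak_times_s[1:])]

def mean_ppi(ppi_ms):
    """Arithmetic mean of the pulse-pulse intervals."""
    return sum(ppi_ms) / len(ppi_ms)

def rmssd(ppi_ms):
    """Root mean square of successive differences between adjacent intervals."""
    diffs = [b - a for a, b in zip(ppi_ms, ppi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical peak times (seconds), for illustration only.
peaks = [0.0, 0.8, 1.7, 2.5, 3.4]
ppi = pulse_intervals(peaks)
print(mean_ppi(ppi), rmssd(ppi))
```

Mean PPI is the average spacing between pulses, while RMSSD summarises how much that spacing changes from one beat to the next.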
Authors
Kenshin Nakada

Tokyo University of Science
Shun Muramatsu

The University of Tokyo
Takahiro Yoshida

Professor, Tokyo University of Science
Thursday May 28, 2026 4:00pm - 4:30pm CEST
Aud 43 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

4:00pm CEST

Stefan Bock: 3D Masterclass
Thursday May 28, 2026 4:00pm - 5:00pm CEST
Stefan reports from the front lines of recording, mixing,
and live streaming immersive music, highlighting the
technical and creative challenges of delivering
three-dimensional sound in real time. He shares practical
insights into spatial mixing, format compatibility, and the
realities of reliable immersive streaming across diverse
playback environments.

This masterclass series, featuring remarkable recording
artists, is a chance to hear 3D audio at its best, as we
discuss qualities that make it truly worth the effort.

In each masterclass, we explore the new spatial
possibilities in recording and production, and also detail
this specific listening room with regard to ITU-R BS.1116
compliance and auditory envelopment (AEV) transparency.
Seats are limited to keep playback variation at bay.
Speakers
Stefan Bock

Managing Director, msm-studios GmbH
Stefan Bock, born 20.08.1964 in southern Germany, started his career in 1987 as an audio engineer. After freelancing in different facilities in Munich, he co-founded msm-studios in 1991, where he was the Chief Mastering Engineer and General Manager.

He was leading msm-studios t... Read More →
Thomas Lund

Genelec Oy
Denmark
Thursday May 28, 2026 4:00pm - 5:00pm CEST
Aud 31 Technical University of Denmark Asmussens Alle, Building 306 DK-2800 Kgs. Lyngby Denmark

4:30pm CEST

Personalized Timbre Optimization for Stereophonic Sound Reproduction via Earphones: Part 2 – Practical Implementation and Validation on Consumer TWS Devices
Thursday May 28, 2026 4:30pm - 5:00pm CEST
This paper presents Part 2 of our study on personalized
timbre optimization for stereophonic sound reproduction via
earphones, following our previous work presented at the AES
International Conference on Headphone Technology in 2025.
While Part 1 established a novel auditory-model-based
framework for reproducing a listener’s natural timbre
reference and demonstrated its perceptual validity under
controlled conditions, the present study focuses on the
practical implementation and validation of this approach
for real-world use with consumer True Wireless Stereo (TWS)
earphones.

Conventional headphone and earphone personalization
techniques primarily target spatial audio reproduction or
rely on preference-based equalization, often overlooking
the accurate reproduction of natural timbre in stereophonic
content. Our approach explicitly addresses this limitation
by isolating and optimizing perceptually relevant timbral
cues while excluding spatial encoding components, thereby
improving timbral fidelity without degrading stereo imaging.

The proposed method originally consists of four stages:
high-resolution anatomical scanning of the listener’s upper
body, including the pinnae; individualized HRTF computation
using the boundary element method; selective removal of
spatial encoding components to derive a personalized
reference target response curve (PR-TRC); and perceptual
optimization using a listener-specific weighting
coefficient grounded in auditory reference fidelity rather
than preference. In this paper, each stage is simplified
and automated using smartphone-based scanning and
AI-assisted processing, enabling end users to complete the
entire personalization process via a smartphone connected
to a cloud-based server. The resulting personalized target
response curve is implemented within the computational and
memory constraints of the DSP pipeline of commercial
consumer TWS earphones.

A subjective evaluation using the Semantic Differential
Method was conducted to assess the perceptual impact of the
simplified implementation. Twenty-four listeners evaluated
personalized target curves generated by both the original
and simplified methods, as well as two non-personalized
target curves commonly used in commercial TWS earphones.
The results show that both personalized methods
consistently outperform non-personalized conditions in
overall sound quality and listener preference. Importantly,
no statistically significant degradation in perceived
timbral naturalness was observed between the simplified and
original methods.

These findings demonstrate that auditory-model-based
personalized timbre optimization can be effectively
translated into a practical, consumer-ready technology. The
proposed approach represents a foundational contribution to
future audio personalization and has broad applicability
across headphone and earphone systems for stereophonic
sound reproduction.
Authors
Atsushi Hara

final Inc.
Haruto Hirai

final Inc.
Kimio Hamasaki

President, Artsridge LLC
Kimio Hamasaki, an AES Fellow, is a producer and balance engineer for music recordings, a researcher in spatial audio, an educator in audio engineering and acoustics, and a consultant in audio engineering. He has recorded and produced numerous orchestral and operatic works with the Vienna Philharmonic... Read More →
Mitsuru Hosoo

final Inc.
Nao Tojo

final Inc.
Shun Saito

final Inc./post-doc

Thursday May 28, 2026 4:30pm - 5:00pm CEST
Aud 44 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

4:30pm CEST

From Gaze to Gnosis: A Critical Framework for Embodied Audio Production
Thursday May 28, 2026 4:30pm - 5:00pm CEST
Audio engineering standards often present as objective, yet
they frequently rely on a systemic data bias which Perez
characterises as the 'default male bias' [1]. This paper
examines the hegemony of the male ear, a system of norms
that privileges masculine modes of hearing by prioritizing
technical structure and text over affective experience and
timbre [2]. By transitioning from a visual-centric auditory
gaze toward an embodied sonic gnosis, researchers can
recover haptic and physiological ways of knowing sound.
Drawing on the feminist listening praxis of the Female Ear
[3], this work explores the recording studio as an
analytical space where sonic microaggressions [4] enforce
rigid technical standards. The author argues for a new
audio praxis that centers ear pleasures [5], validating
subjective and affective sensory data as legitimate
engineering input. This approach seeks to dismantle the
regulatory fiction [6] of a universal hearing standard,
promoting a pluralistic understanding of musicking [7] that
is inclusive of non-normative perspectives.
Authors
Katie Ambrose

PhD Student, University of York
Katie is a postgraduate researcher at the University of York, working on a th...
Thursday May 28, 2026 4:30pm - 5:00pm CEST
Aud 43 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark
 

