Schedule as of May 16, 2022 - subject to change

Default Time Zone is CEST - Central European Summer Time
Type: Perception
Friday, May 29
 

9:00am CEST

Use of Headphones in Stereo Mastering and 3D Recording
Friday May 29, 2026 9:00am - 10:30am CEST
Loudspeaker monitoring is the reference when audio
professionals evaluate content. Headphones are also
important quality-checking tools; and many consumers enjoy
music using “close-fitting listening devices”, as all
different flavours of headphones are known in recent
standards writing.

We discuss the two reproduction methods from perceptual,
recording and mastering perspectives; especially
differences in timbre, imaging and auditory envelopment
when listening to stereo. Applications of headphones in
recording, when setting up and trimming stereo or 3D
microphone arrays, are also practically detailed.

In the last part of the workshop, attendees are invited to
personally compare the two domains on the qualities and
applications discussed; with guided listening to audio
examples between a pair of precision nearfield monitors,
Genelec 8351B, and a pair of excellent headphones, Audeze
CRBN2.
Speakers
Stefan Bock

Managing Director, msm-studios GmbH
Stefan Bock, born 20.08.1964 in southern Germany, started his career in 1987 as an audio engineer. After freelancing in different facilities in Munich, he co-founded msm-studios in 1991, where he was the Chief Mastering Engineer and General Manager.

He was leading msm-studios t...
Thomas Lund

Genelec Oy
Denmark
Morten Lindberg

Engineer and Producer, 2L (Lindberg Lyd)
Recording Producer and Balance Engineer with 50 GRAMMY nominations, 42 of these in craft categories Best Engineered Album, Best Surround Sound Album, Best Immersive Audio Album and Producer of the Year. Founder and CEO of the record label 2L. Grammy Award-winner 2020 and 2026. Immersive...
Ulrike Anderson

Anderson Audio New York
Chris Berens

Artist and Industry Relations, Audeze
Brand ambassador for Audeze, I love all aspects of audio production and engineering, especially immersive audio!

Friday May 29, 2026 9:00am - 10:30am CEST
Aud 49 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

9:00am CEST

Exploring Perceptual and Physiological Auditory Models for Assessing Speech Intelligibility in Enhanced Signals
Friday May 29, 2026 9:00am - 11:00am CEST
Current deep learning approaches to speech enhancement rely
heavily on objective measures like mean squared error or
scale-invariant signal-to-distortion ratio as both training
objectives and evaluation metrics. While analytically
convenient, these benchmarks often fail to capture the
nuances of human perception or actual intelligibility.
Furthermore, the inconsistent integration of metrics like
Short-Term Objective Intelligibility or Perceptual
Evaluation of Speech Quality into training and evaluation
pipelines leaves a gap between algorithmic performance and
perceptual reality. This paper proposes a transition
towards evaluation methodologies grounded in
psychoacoustics and audiological modeling. Our study
explores two distinct methods to characterise enhanced
signals. On one hand, we employ a perceptual approach based
on the Cambridge loudness model to assess the preservation
of spectral excitation patterns and perceived intensity. On
the other hand, we adopt a biophysical approach by
utilising CoNNear, a convolutional model of the human
auditory periphery. This allows us to simulate
representations of responses at different stages of the
auditory periphery to observe how speech enhancement
processing affects the physiological representation of
speech. We analyse pre-trained speech enhancement models
using automatic speech recognition and Short-Term Objective
Intelligibility as an additional proxy for human
intelligibility. By mapping automatic speech recognition
performance against loudness and peripheral response
patterns, we investigate the extent to which current
enhancement strategies maintain the perceptual and
physiological integrity of the speech signal. This work
aims to identify features predictive of intelligibility,
providing a foundation for speech enhancement systems
optimised for the human listener rather than purely
signal-based objective functions.
Authors
François Effa

Université de Lorraine, CNRS, Inria, Loria, Nancy, France
Romain Serizel

LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Friday May 29, 2026 9:00am - 11:00am CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

9:00am CEST

Objective Quality Models for Decision-Making in Speech Coding
Friday May 29, 2026 9:00am - 11:00am CEST
Objective quality evaluation is widely used in speech
coding, yet objective estimates often show limited
agreement with subjective listening-test results. Rather
than focusing on absolute score accuracy, this paper
evaluates objective speech quality models from a
decision-making perspective, defined as their ability to
support comparative judgments between speech codecs or
codec configurations. A formal ITU-T P.800 Absolute
Category Rating (ACR) listening test was conducted with 30
listeners across 24 conditions, covering conventional and
neural monophonic speech codecs operating under
clear-channel conditions at sampling frequencies from 16 to
48 kHz and bit rates ranging from below 1 kbps to above 16
kbps. The speech material consisted of internally recorded,
clean French-language speech that was not used in the
development or training of any of the evaluated codecs or
objective quality models. Seven objective quality models,
namely PESQ, ViSQOL Speech, ViSQOL Audio, WARP-Q, NISQA,
UTMOS, and DistillMOS, were evaluated on the same material.
Decision-making performance was assessed by comparing
subjective and objective rankings using Kendall’s rank
correlation coefficient and by analyzing pairwise codec
comparisons using t-tests at a 95% confidence level. The
results show that some objective quality models are
effective for comparing bit rate variations within a given
speech coding technology, provided that all other codec
parameters remain unchanged (e.g., sampling frequency).
However, all models exhibit limitations, including
tendencies toward over- or underestimation for certain
technologies, as well as reduced reliability when applied
across different sampling frequencies. Despite its
conventional origins, PESQ remains capable of supporting
decision-making even when applied to neural speech codecs.
Authors
Clémence Lamballe

Université de Sherbrooke
Philippe Gournay

Université de Sherbrooke
Roch Lefebvre

Université de Sherbrooke
Friday May 29, 2026 9:00am - 11:00am CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

9:00am CEST

A perceptual evaluation of various commercial models of music source separation, with a focus on model performance against non-traditional source material
Friday May 29, 2026 9:00am - 11:00am CEST
Music source separation (MSS) systems are commonly used in
production, remixing, and audio analysis work, yet
questions arise regarding the extent to which objective
evaluations of model performance align with human
perceptual evaluations, particularly when the models are
tasked with non-traditional source material (in this case,
heavily processed electronic music). This study seeks to set
a framework for an evaluation of three machine learning
approaches to MSS: a spectrogram-domain model (Spleeter), a
waveform-domain model (Demucs v2), and a hybrid-domain
model (HTDemucs). Subjective evaluations of model
performance were collected via a MUSHRA-style listening
test, while objective performance was assessed using
signal-to-distortion ratio (SDR) and Fréchet Audio Distance
(FAD). Results showed consistent agreement across objective
metrics, with the hybrid-domain model outperforming the
two single-domain models. Perceptual ratings also
favored the hybrid model; notably, listeners occasionally
rated its output as equal to or better in quality than the
original reference. Preliminary analysis indicates some
moderate but statistically non-significant correlations
between the two assessment paths, reinforcing concerns
about relying solely on numerical evaluations when
discussing MSS model performance. Implications for model
design and future evaluation procedures are discussed.
Authors
Sahan Wijewardane

University of Miami
Friday May 29, 2026 9:00am - 11:00am CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

9:30am CEST

Who Controls the Space? Artistic Intent and Sound Diffusion in Immersive Concert Performance
Friday May 29, 2026 9:30am - 10:00am CEST
Recent advances in large-scale multichannel loudspeaker
systems have enabled immersive concert formats that extend
spatial control beyond conventional stereo and small
multichannel configurations. High-density loudspeaker
arrays (HDLAs) allow sound to be distributed across complex
architectural spaces, challenging established distinctions
between composition, performance, and live sound practice.
In live contexts, however, the realization of spatial
attributes is often constrained by system complexity,
limited rehearsal time, and the lack of artist-facing
spatial control interfaces. As a result, spatial
realization and sound diffusion are frequently delegated to
sound engineers, who translate artistic material to the
acoustic and architectural conditions of the venue in real
time.

This paper examines three immersive concerts presented
during Sonic Days 2025 in Denmark, realized on both
large-scale and small-scale multichannel loudspeaker
systems. The concerts represent contrasting production
contexts, including a site-specific spatial composition
conceived explicitly for a high-density loudspeaker array
and performances by artists whose practices are typically
oriented toward stereo or small multichannel formats.
Across these cases, spatialization functioned variously as
compositional material, interpretive layer, and adaptive
live-mixing practice.

The paper analyzes how control over spatial attributes is
negotiated between artists and sound engineers in live
immersive concert settings, and how this negotiation
affects the interpretation of artistic intent and audience
experience. Particular attention is given to the role of
sound engineers as active mediators whose decisions shape
spatial form, listening perspective, and the relationship
between sound and architecture. The findings suggest that
immersive concert formats redistribute creative agency
across artists, technicians, and technological
infrastructures, and point toward the need for revised
conceptual frameworks for authorship, performance, and
listening in large-scale spatial audio environments.
Authors
Kasper Fangel Skov

Assistant Professor, PhD, Sonic College (UC SYD)
Friday May 29, 2026 9:30am - 10:00am CEST
Aud 43 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

10:30am CEST

The cognition of sound in museums: Toward a spectrum of meanings
Friday May 29, 2026 10:30am - 11:00am CEST
This presentation develops a conceptual framework for
understanding how visitors cognize sound in museum
exhibitions. While sound increasingly features in museum
practice, research has focused primarily on measuring
visitor enjoyment and engagement rather than examining the
specific meanings sound generates. This gap reflects the
absence of a framework conceptualizing sound's
meaning-making capacities to guide empirical investigation.
Drawing on scholarship from music studies, semiotics,
phenomenology, and embodied cognition, I propose a
seven-component spectrum identifying distinct yet
interrelated meanings that sound can convey in museums:
aesthetic, representational, emotional, sensorial,
imaginative, social, and political. These meanings can be
apprehended independently or in combination, typically
through emergent, pre-conscious perception rather than
deliberate awareness.
The spectrum builds on the premise that museum sound
meaning-making unfolds through dynamics internalized from
early childhood as we attune to the world sonically. It
draws on the notion of sound as a "sonic aggregate"
(Grimshaw and Garner 2015)—encompassing social, contextual,
temporal, and embodied experiences—rather than reducing
sound to wave phenomena. Visitors actively co-produce
meanings by drawing on their moods, memories, knowledge,
and imagination during exhibition encounters.
Each meaning category is illustrated with exhibition case
studies, demonstrating the spectrum's applicability across
diverse sound-based multimodal museum practices—from
popular music exhibitions to sound art installations. The
spectrum aims to catalyze research through varied
methodological approaches and establish analytical
standards for studying sound in museums, with potential
adoption by international standardization bodies.
Authors
Alcina Cortez

Sound Studies Researcher, INET-md | NOVA University Lisbon
A PhD in ethnomusicology and museum studies and a curator, I am committed to exploring the diverse meaning-making capabilities of sound when exhibited in museums, encompassing the representational, emotional, sensorial, and social, as well as its ability to foster imagination and...
Friday May 29, 2026 10:30am - 11:00am CEST
Aud 43 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

11:00am CEST

Li Dakang: 3D Masterclass
Friday May 29, 2026 11:00am - 12:00pm CEST
Prof. Li Dakang is a preeminent recording engineer and
pioneer of 3D recording of Chinese traditional music,
ancient instruments and spaces. Attendees are treated to a
selection of unique 3D recordings, including a new and
glorious version of China’s National Anthem. Prof. Li
describes the LDK-Cube for capturing the envelopment of an
acoustic space, and questions reliable reproduction of this
important quality.

This masterclass series, featuring remarkable recording
artists, is a chance to hear 3D audio at its best; as we
discuss qualities that make it truly worth the effort.

In each masterclass, we explore the new spatial
possibilities in recording and production, detailing also
this specific listening room, regarding ITU-R BS.1116
compliance and auditory envelopment (AEV) transparency.
Seats are limited to keep playback variation at bay.
Speakers
Hanying Feng

China National Director, Communication University of China
Thomas Lund

Genelec Oy
Denmark
Friday May 29, 2026 11:00am - 12:00pm CEST
Aud 31 Technical University of Denmark Asmussens Alle, Building 306 DK-2800 Kgs. Lyngby Denmark

12:30pm CEST

Perceptual Evaluation of the Open Binaural Renderer
Friday May 29, 2026 12:30pm - 1:00pm CEST
This paper presents the perceptual evaluation of the Open Binaural Renderer (OBR), an open-source library developed for headphone-based rendering of Immersive Audio Model and Formats (IAMF) content. The evaluation followed an iterative framework in which findings from a pilot listening study informed the tuning of rendering profiles, and the resulting renderer was benchmarked against established proprietary solutions. In the pilot study, 19 expert listeners rated the Overall Listening Experience (OLE) of the initial prototype (OBRv1) and five external renderers across diverse audio content. Qualitative feedback was analysed using inductive coding to identify salient perceptual dimensions. The pilot revealed content-dependent performance and showed that a single default profile was inadequate, yielding mixed responses in both the numerical scale and in the qualitative feedback and motivating the development of multiple rendering profiles in OBRv2. The main study evaluated two OBRv2 profiles targeting different reverberation characteristics (Direct and Ambient) alongside three top-performing external renderers. A total of 39 participants, divided into expert and non-expert groups, rated five perceptual attributes: Voice Quality, Envelopment, Externalisation, Overall Listening Experience, and Timbral Balance. Mixed-design ANOVA revealed significant main effects of renderer condition on all attributes. Pairwise comparisons showed that OBRv2 Ambient achieved significantly higher OLE ratings than one proprietary renderer and reached statistical parity with the remaining two, representing a measurable improvement over the prototype. A trade-off between Voice Quality and Externalisation was observed, driven by the level of reverberation in each renderer. The results demonstrate that iterative, perceptually informed tuning can yield competitive binaural rendering quality in an open-source framework.
Authors
Felicia Lim

Google LLC
Gavin Kearney

Professor of Audio Engineering, University of York
Gavin Kearney graduated from Dublin Institute of Technology in 2002 with an Honors degree in Electronic Engineering and has since obtained MSc and PhD degrees in Audio Signal Processing from Trinity College Dublin. He joined the University of York as Lecturer in Sound Design in January...
Jan Skoglund

Google
Jani Huoponen

Google LLC
With 25+ years of media industry product development, Jani Huoponen is a seasoned expert in developing cutting-edge audio and video technologies for consumer devices and streaming systems. Joining Google in 2010, he’s served as a product manager across key multimedia initiatives...
Katarzyna Sochaczewska

Immersive Music Producer, Researcher, University of York
Tomasz Rudzki

University of York
Friday May 29, 2026 12:30pm - 1:00pm CEST
Aud 42 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

12:30pm CEST

Evaluation of Objective Speech Intelligibility Metrics for Hearing-Aid Users in Multi-Talker Spatial Environments
Friday May 29, 2026 12:30pm - 1:00pm CEST
Despite the growing number of hearing-impaired workers
wearing hearing aids in occupational settings,
understanding speech in multi-talker situations remains
challenging. This difficulty is particularly pronounced in
open-plan offices, where simultaneous talkers and room
reverberation are prone to degrade speech intelligibility.
While spatial cues are essential for segregating target
speech from competing sources, hearing-aid signal
processing may alter binaural information that supports
spatial hearing.
Accurate evaluation of hearing-aid performance is
therefore crucial. Objective speech intelligibility metrics
offer an efficient alternative to time-consuming listening
tests; however, their validity in complex spatial scenarios
involving hearing-impaired listeners remains unclear.
Monaural metrics such as HASPI account for individual
hearing loss but neglect spatial information, whereas
binaural metrics such as MBSTOI incorporate spatial cues
but are primarily designed for normal-hearing listeners.
This study evaluates the ability of existing objective
metrics to predict speech intelligibility for hearing-aid
users in multi-talker spatial environments. Listening tests
are conducted with 20 hearing-impaired participants fitted
with binaural hearing aids. Four types of multi-talker
auditory scenes representative of open-plan offices are
reproduced using a loudspeaker array. They involve a target
speech signal combined with diffuse noise and a localized
competing speech source. Objective measurements are
performed using an acoustic mannequin fitted with the
participants’ hearing aids. HASPI and MBSTOI values are
computed from the binaural signals recorded at the eardrums,
incorporating individual hearing losses.
Objective predictions are compared with subjective
intelligibility scores, and an ablation analysis is
conducted to distinguish the effects of hearing loss
modeling from those of binaural processing.
Authors
Jean-Pierre Arz

INRS (Vandoeuvre lès Nancy) - Institut national de recherche et de sécurité (Vandoeuvre lès Nancy)
Joël Ducourneau

LEMTA - Laboratoire d'Energétique et Mécanique Théorique et Appliquée
Louis Delebecque

LEMTA - Laboratoire d'Energétique et Mécanique Théorique et Appliquée
Romain Serizel

LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Friday May 29, 2026 12:30pm - 1:00pm CEST
Aud 43 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark
  Perception, Lecture

12:30pm CEST

Morten Lindberg: 3D Masterclass
Friday May 29, 2026 12:30pm - 1:30pm CEST
Morten describes his excellent recording techniques, and
attendees are treated to a unique selection of
high-resolution 3D music listening examples.

This masterclass series, featuring remarkable recording
artists, is a chance to hear 3D audio at its best; as we
discuss qualities that make it truly worth the effort.

In each masterclass, we explore the new spatial
possibilities in recording and production, detailing also
this specific listening room, regarding ITU-R BS.1116
compliance and auditory envelopment (AEV) transparency.
Speakers
Thomas Lund

Genelec Oy
Denmark
Morten Lindberg

Engineer and Producer, 2L (Lindberg Lyd)
Recording Producer and Balance Engineer with 50 GRAMMY nominations, 42 of these in craft categories Best Engineered Album, Best Surround Sound Album, Best Immersive Audio Album and Producer of the Year. Founder and CEO of the record label 2L. Grammy Award-winner 2020 and 2026. Immersive...
Friday May 29, 2026 12:30pm - 1:30pm CEST
Aud 31 Technical University of Denmark Asmussens Alle, Building 306 DK-2800 Kgs. Lyngby Denmark

12:30pm CEST

Innovative Measurement of Speech Intelligibility – Applications of Listening Effort in Research & Practice
Friday May 29, 2026 12:30pm - 2:00pm CEST
Speech intelligibility is a key factor in successful
communication across various domains, including research,
post-production for film and television, live sound
reinforcement, and audio production. Traditional assessment
methods often lack objectivity or fail to capture the
listener’s experience in real-world scenarios. In this
workshop, we introduce an innovative approach to measuring
speech intelligibility based on the concept of “Listening
Effort.” We will present the underlying technology, share
practical examples from different application areas, and
demonstrate how this method can be integrated into
workflows to optimize intelligibility. Attendees will have
the opportunity to participate in a hands-on demonstration
and discuss potential use cases relevant to their own work.
This session is designed for professionals and researchers
seeking reliable and actionable tools for evaluating and
improving speech intelligibility in diverse environments.
In this workshop, we present a new technology for measuring
speech intelligibility (“Listening Effort”). The method is
used in research, post-production (film/TV), live sound,
and audio production. The session is aimed at professionals
from both academia and industry who are interested in
objectively assessing and optimizing speech intelligibility.

Participants will be able to join a short demo/exercise and
ask questions.

Introduction & Relevance: Overview of the importance of speech intelligibility across different fields
Technology & Methodology: Presentation of the measurement method and underlying concepts
Practical Examples: Case studies from research, post-production (film/TV), live sound, and production
Live Demo / Interactive Exercise: Practical demonstration and opportunity for active participation
Discussion & Outlook: Q&A, exchange of ideas, and future perspectives
Speakers
Hannah Baumgartner

Fraunhofer IDMT
Jan Rennies-Hochmuth

Fraunhofer IDMT
Friday May 29, 2026 12:30pm - 2:00pm CEST
Aud 41 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

1:00pm CEST

Assessing Situational Awareness of Hearing-Impaired People Through their Perception of Non-Speech Sound Events: a Literature Review
Friday May 29, 2026 1:00pm - 1:30pm CEST
Situational awareness is a multisensory ability that
enables individuals to perceive and appropriately take into
account their immediate environment. This perception of the
world through our senses is carried out continuously and
unconsciously throughout the day. When auditory perception
is degraded, an individual may no longer correctly perceive
a doorbell, a water leak, or an alarm signal, which
negatively affects quality of life and may lead to
dangerous situations. Auditory perception can in particular
be degraded by hearing loss, a common and widespread
condition. The most common treatment consists of wearing
hearing aids, which are mainly designed to improve speech
intelligibility, especially in noisy environments. Feedback
from hearing-impaired people and hearing-aid users
indicates that, although auditory situational awareness has
been recognised as an essential component of well-being, it
remains insufficiently studied and requires further
investigation. There is currently no standard method for
assessing the extent to which one's situational awareness is
affected by hearing impairment and the use of hearing aids.
This is a complex process that requires assessing the
perception of relevant sound events within a continuous
stream of multisensorial information, by individuals who
have different subjective preferences. Most existing
methods are limited to evaluating only a subset of the
problem, such as identification and localisation of
non-speech sound events. The rise of new technologies, such
as virtual reality, enables the development of assessment
methods within more realistic yet controlled environments.
This study aims to review existing methods in order to
highlight their limitations in addressing the issue at hand.
Authors
Adil Faiz

Université de Lorraine, CNRS, LEMTA, F-54000 Nancy, France
Balbine Maillou

Université de Lorraine, CNRS, LEMTA, F-54000 Nancy, France
Emma Granier

Université de Lorraine, CNRS, Inria, Loria
Joël Ducourneau

LEMTA - Laboratoire d'Energétique et Mécanique Théorique et Appliquée
Romain Serizel

LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Friday May 29, 2026 1:00pm - 1:30pm CEST
Aud 43 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark
  Perception, Lecture

1:00pm CEST

Real-Time Heart Rate Sonification Using Spectral Filtering of Preferred Music for Running Training
Friday May 29, 2026 1:00pm - 3:00pm CEST
The purpose of this study was to evaluate a sonification
system that maps live heart rate data to real-time spectral
filtering of a runner's preferred music. Assessed using a
within-subjects design (n = 13), the system employs
high-pass and low-pass filters to indicate deviations from
target heart rate zones, providing instantaneous
biofeedback without requiring visual attention.
Quantitative analysis revealed no statistically significant
differences in target zone accuracy or response time
between auditory, visual, and combined conditions. However,
qualitative thematic analysis identified a clear division
in user preference. Participants favouring the auditory
condition demonstrated faster mean response times to audio
biofeedback. Findings suggest that while sonification
promotes environmental focus and "gamifies" training, its
efficacy is highly dependent on individual processing
styles and music familiarity.
Authors
Duncan Williams

Senior Lecturer, Acoustics Research Centre, University of Salford
Jay Steel

Acoustics Research Centre, University of Salford
Nicholas Ripley

School of Health and Society, University of Salford
Friday May 29, 2026 1:00pm - 3:00pm CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

1:00pm CEST

A Psychoacoustic Framework for In-Vehicle Audio-Light Mapping
Friday May 29, 2026 1:00pm - 3:00pm CEST
This paper proposes a psychoacoustic-based audio-visual
mapping framework for intelligent vehicle cabins to enhance
immersion and stabilize spatial auditory perception. By
establishing mappings between auditory descriptors—such as
Direction of Arrival (DOA), spectral centroid, and temporal
envelope—and ambient lighting parameters, the framework
leverages "ambient vision" to augment the perceptual
experience without increasing the driver's cognitive load.
Theoretical analysis based on Stevens’ Power Law indicates
that the proposed mapping strategies effectively
synchronize audio-visual intensities and mitigate
perceptual fatigue, providing a conceptual reference for
future multisensory HMI design.
Authors
Kangwei Wang

Acoustic System Engineer, GoerDynamics Lab2
Friday May 29, 2026 1:00pm - 3:00pm CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

1:30pm CEST

Transient Evoked Otoacoustic Emissions and Self-Reported Sound Exposure
Friday May 29, 2026 1:30pm - 2:00pm CEST
Headphone listening has become an integral part of everyday
life, spanning music consumption, communication, online
media, and, increasingly, computer gaming. These diverse
listening contexts make individual sound exposure highly
variable and difficult to quantify. While music listening
and occupational headphone use have been widely studied,
sound exposure from gaming remains comparatively
undocumented. This study investigated the relationship
between self-reported exposure through headphones and
cochlear function assessed using transient evoked
otoacoustic emissions (TEOAE). Forty-one university
students completed a detailed questionnaire on listening
habits, and TEOAEs were recorded in both ears across five
half-octave frequency bands. Estimated weekly exposure
levels were derived from participants’ reported durations
and contexts of use. TEOAE amplitude, signal-to-noise ratio
(SNR), and reproducibility showed clear frequency-dependent
patterns and small ear asymmetries, consistent with typical
OAE behaviour. Only limited associations were found between
self-reported exposure and TEOAE measures, with significant
effects emerging primarily for SNR and reproducibility in
the highest-exposure group. No consistent differences were
observed between long-term gamers and non-gamers. These
findings suggest that self-reported exposure alone may be
insufficient to detect subtle cochlear changes in young
adults, and underscore the need for more precise
exposure-monitoring methods when evaluating recreational
sound exposure risks.
Authors
Dorte Hammershøi

Professor, Acoustics and Hearing, AI and Sound, Department of Electronic Systems, Aalborg University
Rodrigo Ordoñez

Aalborg University
Friday May 29, 2026 1:30pm - 2:00pm CEST
Aud 43 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

1:30pm CEST

A Perceptual Evaluation Method for Binaural Rendering Algorithms via Minimum Audible Angle Measurements
Friday May 29, 2026 1:30pm - 2:00pm CEST
Binaural rendering is typically assessed via timbre and
localization accuracy, while its intrinsic spatial
resolution remains rarely quantified. This paper proposes a
perceptual evaluation method based on Minimum Audible Angle
(MAA) measurements to estimate the azimuthal
just-noticeable difference (JND) introduced by binaural
rendering algorithms. We systematically compared several
rendering algorithms across eight reference azimuths using
two participant-allocation paradigms. The results show that
spatial resolution is significantly influenced by Ambisonic
order and choice of the rendering algorithm, with MAA
thresholds systematically decreasing as the truncation
order increases. Furthermore, the proposed method
successfully captures physiological spatial characteristics
and identifies resolution limits imposed by reference
angles. While both participant-allocation paradigms yield
consistent qualitative trends, the repeated-measures design
provides superior data stability. These findings
demonstrate that the proposed MAA-based method is an
effective tool for quantifying the spatial resolution of
binaural rendering algorithms.
Authors
Houlin Zhu

Peking University
Tianshu Qu

Peking University
Xihong Wu

Peking University
Yufan Qian

Peking University
Friday May 29, 2026 1:30pm - 2:00pm CEST
Aud 42 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

1:30pm CEST

Richard King: 3D Masterclass
Friday May 29, 2026 1:30pm - 2:30pm CEST
Richard is a multiple Grammy Award–winning recording
engineer and a specialist in acoustic music recording. His
work is focused primarily on classical, jazz, and film
score music. A selection of his immersive recordings will
be presented, accompanied by a discussion of the microphone
configurations and mixing decisions employed in each
example.

This masterclass series, featuring remarkable recording
artists, is a chance to hear 3D audio at its best; as we
discuss qualities that make it truly worth the effort.

In each masterclass, we explore the new spatial
possibilities in recording and production, detailing also
this specific listening room, regarding ITU-R BS.1116
compliance and auditory envelopment (AEV) transparency.
Seats are limited to keep playback variation at bay.
Speakers
Richard King

McGill University
Montreal
Thomas Lund

Genelec Oy
Denmark
Friday May 29, 2026 1:30pm - 2:30pm CEST
Aud 31 Technical University of Denmark Asmussens Alle, Building 306 DK-2800 Kgs. Lyngby Denmark

2:30pm CEST

Exploring Rendering Variability in Next-Generation Audio Reproduction
Friday May 29, 2026 2:30pm - 3:00pm CEST
This study evaluates three Next-Generation Audio (NGA)
rendering systems through listening tests using real-life
audio content. The testing paradigm prioritized subjective
preference over adherence to a ground-truth reference.
Participants assessed perceptual spatial audio attributes
in both 5.1 and 7.1.4 loudspeaker setups. The findings
suggest that strict adherence to the rendering algorithm
used during content creation is not mandatory in terms of
listener preference. While not advocating disregarding
artistic intent without consideration, this study proposes
that such flexibility in reproduction can be an acceptable
compromise.
Authors
Ema Souza-Blanes

Samsung Research America
Toni Hirvonen

Researcher, Samsung Research America
Toni Hirvonen studied acoustics at the Helsinki University of Technology (now Aalto University), where he obtained a PhD in audio signal processing and spatial audio. After a position as a Marie Curie fellow, he has worked internationally in the audio industry since 2010. His projects...
Wonbeen Jo

Samsung Research
Yongmin Kwon

Samsung Research
Friday May 29, 2026 2:30pm - 3:00pm CEST
Aud 42 Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark
 

