Schedule as of May 16, 2022 - subject to change

Default Time Zone is CEST - Central European Summer Time
 
Type: Immersive Audio
Thursday, May 28
 

1:30pm CEST

Binaspect: A Python Library for Binaural Audio Analysis, Visualization & Feature Generation
Thursday May 28, 2026 1:30pm - 3:30pm CEST
We present Binaspect, an open-source Python library for
binaural audio analysis, visualization, and feature
generation. Binaspect generates interpretable “azimuth
maps” by calculating modified interaural time and level
difference spectrograms and clustering those
time-frequency (TF) bins into stable time-azimuth histogram
representations. This allows multiple active sources to
appear as distinct azimuthal clusters, while degradations
manifest as broadened, diffused, or shifted distributions.
Crucially, Binaspect operates blindly on audio, requiring
no prior knowledge of head models. These visualizations
enable researchers and engineers to observe how binaural
cues are degraded by codec and renderer design choices,
among other downstream processes. We demonstrate the tool
on bitrate ladders, ambisonic rendering, and VBAP source
positioning, where degradations are clearly revealed. In
addition to their diagnostic value, the proposed
representations can be exported as structured features
suitable for training machine learning models in quality
prediction, spatial audio classification, and other
binaural tasks. Binaspect is released under an open-source
license with full reproducibility scripts at: (link removed
for blind review)
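
The idea of collecting per-bin interaural cues into a time-versus-direction histogram can be illustrated with a minimal sketch. This is not Binaspect's API or algorithm: the function names `stft` and `ild_histogram`, the window and bin settings, and the use of level differences only (no time differences, no clustering, no mapping from ILD to azimuth) are all assumptions made for illustration.

```python
import numpy as np

def stft(x, n_fft=1024, hop=256):
    """Hann-windowed STFT; returns an array of shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def ild_histogram(left, right, n_bins=41, max_ild_db=20.0, eps=1e-12):
    """Energy-weighted histogram of per-bin ILD (dB) for every STFT frame.

    Positive ILD means the left channel is louder (a convention chosen here).
    Returns (hist, centers) where hist has shape (frames, n_bins).
    """
    L, R = stft(left), stft(right)
    ild = 20.0 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))
    weight = np.abs(L) ** 2 + np.abs(R) ** 2  # louder bins count more
    edges = np.linspace(-max_ild_db, max_ild_db, n_bins + 1)
    hist = np.stack([
        np.histogram(np.clip(ild[t], -max_ild_db, max_ild_db),
                     bins=edges, weights=weight[t])[0]
        for t in range(ild.shape[0])
    ])
    centers = 0.5 * (edges[:-1] + edges[1:])
    return hist, centers
```

In this simplified picture, a single source panned by level alone shows up as one narrow ridge along the time axis, and anything that smears the level cue widens that ridge.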
Authors
Alessandro Ragano
University College Dublin

Dan Barry
University College Dublin

Davoud Shariat Panah
University College Dublin

Jan Skoglund
Google

Thursday May 28, 2026 1:30pm - 3:30pm CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

1:30pm CEST

Lightweight Real-time Spatial Audio Interpolation for Standalone VR using Hand Claps
Thursday May 28, 2026 1:30pm - 3:30pm CEST
Realistic spatial audio consistent with visual information
is essential for providing high immersion in Augmented
Reality (AR) environments. However, conventional
high-precision real-time acoustic simulations require
significant computational power, limiting their
implementation on standalone mobile VR devices such as the
Meta Quest. This study proposes a practical method to
enhance reverb realism using solely a standalone VR HMD,
without the need for additional external equipment. By
measuring impulse responses using a few hand claps in the
physical space, we interpolate room acoustic
parameters, specifically RT60 and early/late energy
ratios, to reflect the environment's unique characteristics.
These extracted parameters are then applied to the VR
engine's built-in reverb effects, enabling dynamic,
location-aware real-time rendering with minimal
computational load. The proposed method demonstrates that a
brief calibration period of 3 to 5 minutes yields
significantly improved realism compared to static reverb
templates, offering an efficient and practical spatial
audio solution for mobile AR environments.
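
The abstract does not say how RT60 and the early/late energy ratio are extracted from a clap recording. One standard way to obtain them from an impulse response is Schroeder backward integration followed by a line fit, sketched below. The function names, the -5 to -25 dB fit range, and the 50 ms early/late boundary are assumptions, and the sketch treats the recording as an ideal impulse response starting at sample 0, which a real hand clap only approximates.

```python
import numpy as np

def rt60_schroeder(ir, fs, fit_db=(-5.0, -25.0)):
    """Estimate RT60 in seconds from an impulse response (T20-style fit)."""
    energy = np.asarray(ir, dtype=float) ** 2
    # Backward integration: energy remaining after each sample.
    edc = np.cumsum(energy[::-1])[::-1]
    edc_db = 10.0 * np.log10(edc / edc[0] + 1e-20)
    t = np.arange(len(energy)) / fs
    hi, lo = fit_db
    mask = (edc_db <= hi) & (edc_db >= lo)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # dB per second, negative
    return -60.0 / slope

def early_late_ratio_db(ir, fs, split_ms=50.0):
    """Ratio of energy before and after split_ms, in dB (C50-like for 50 ms)."""
    energy = np.asarray(ir, dtype=float) ** 2
    split = int(round(split_ms * 1e-3 * fs))
    return 10.0 * np.log10(energy[:split].sum() / energy[split:].sum())
```

Values measured at a few positions could then be interpolated and passed to an engine's reverb parameters, which is the step the abstract describes.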
Authors
Minsu Kim
Seoul National University
Thursday May 28, 2026 1:30pm - 3:30pm CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark

1:30pm CEST

Can the individual winner HRTFs be determined in a shooting task during onboarding for an Audio Only VR?
Thursday May 28, 2026 1:30pm - 3:30pm CEST
The significance of individual versus generic HRTFs in
Virtual Audio can be difficult to ascertain given the
variety of scenarios and tasks related to the spatial
listening experience. Are we working on the most
significant 80% of the success or fine-tuning the last 5%
of the sound quality? When the VR users are blind, it is
fair to assume that the quality of the spatial audio
becomes a critical and more important factor. This is the
challenge as we see it. In the present project, we will
investigate options for powerful game components relying on
spatialized sound, using effects that are natural for the
blind gamer. As a first step, we have implemented a test
platform where different HRTF options are available and
where the onboarding process is meant to reveal the optimal
solution for the given user. The test scenario is inspired
by a “classical” scenario of shooting down sound sources,
where we will vary, e.g., the task definition and success
criteria (hit zone, number of attempts, and elapsed time), as
well as eavesdropping on game-internal parameters of a more
complex nature (e.g. navigation trajectories). The results
will display the variation among normally sighted listeners and
produce normative data for later comparisons with blind
participants. The platform also includes options for simple
mirror-image room models and standardized reverberation,
which will be used in later tests to learn whether the
room acoustics play a stronger role for blind
gamers’ navigation and source identification than for
normally sighted listeners.
Authors
Dorte Hammershøi
Professor, Acoustics and Hearing, AI and Sound, Department of Electronic Systems, Aalborg University

Flemming Christensen
Acoustics and Hearing, AI and Sound, Department of Electronic Systems, Aalborg University

Max Væhrens
PhD Fellow, Acoustics and Hearing, AI and Sound, Department of Electronic Systems, Aalbor...

Stefania Serafin
Department of Engineering Technology and Didactics, Technical University of Denmark
Thursday May 28, 2026 1:30pm - 3:30pm CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark
  Immersive Audio, Poster

1:30pm CEST

Exploiting Source Directivity for Robust Asymmetric Crosstalk Cancellation
Thursday May 28, 2026 1:30pm - 3:30pm CEST
This study investigates the relationship between the
robustness of crosstalk cancellation and the symmetry of
system configuration. Analytical results show that, when
the positions of the sound sources are fixed, increasing
asymmetry caused by deviations in the listener’s head
position or orientation leads to a reduction in system
robustness, whereas optimal performance is consistently
achieved in symmetric layouts. For asymmetric
configurations, we propose a method to optimize the axial
angles of the sound sources. This method leverages source
directivity patterns to adjust level differences along the
acoustic propagation paths, thereby improving system
robustness. Experiments confirm the effectiveness of the
proposed method in asymmetric crosstalk cancellation
systems, demonstrating enhanced robustness and yielding
higher binaural channel separation under slight listener
head movements.
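
For readers unfamiliar with the setup, the sketch below builds a 2x2 free-field plant matrix for two point sources and two ear positions, inverts it with Tikhonov regularisation, and measures channel separation when the listener is not where the filters assumed. It does not reproduce the authors' method: there is no head model and no source directivity, and the names `plant_matrix`, `ctc_filters`, and `separation_db` are hypothetical. The condition number of the plant matrix is used in the usage note as a commonly used proxy for robustness.

```python
import numpy as np

C_SOUND = 343.0  # speed of sound in m/s

def plant_matrix(sources, ears, freq):
    """Free-field transfer matrix H[ear, source] for point sources at one frequency.

    sources, ears: arrays of shape (2, 2) holding (x, y) positions in metres.
    """
    k = 2.0 * np.pi * freq / C_SOUND
    d = np.linalg.norm(ears[:, None, :] - sources[None, :, :], axis=-1)
    return np.exp(-1j * k * d) / d

def ctc_filters(H, beta=0.0):
    """Tikhonov-regularised inverse: C = H^H (H H^H + beta I)^-1."""
    Hh = H.conj().T
    return Hh @ np.linalg.inv(H @ Hh + beta * np.eye(H.shape[0]))

def separation_db(H_actual, C):
    """Left-ear channel separation (dB) when filters C meet the plant H_actual."""
    G = H_actual @ C  # G[ear, binaural input]
    return 20.0 * np.log10(np.abs(G[0, 0]) / np.abs(G[0, 1]))
```

A usage pattern would be to design `C` for a nominal head position, recompute `plant_matrix` for a slightly shifted head, and compare `separation_db` and `np.linalg.cond(H)` across symmetric and asymmetric layouts. Source directivity, which the abstract exploits, would enter this model as per-path gains multiplying the entries of `H`.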
Authors
Jianbin Yang
Dynaudio Lab, Gammel Lundtoftevej 3B, Copenhagen, Denmark

Keyu Pan
Dynaudio Lab, Gammel Lundtoftevej 3B, Copenhagen, Denmark

Ning Cong
Dynaudio Lab, Gammel Lundtoftevej 3B, Copenhagen, Denmark

Xing Tian
Dynaudio Lab, Gammel Lundtoftevej 3B, Copenhagen, Denmark
Thursday May 28, 2026 1:30pm - 3:30pm CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark
  Immersive Audio, Poster

1:30pm CEST

Capturing Immersive Sound in Concert Halls: A Comparative Analysis of PCMA-3D and Decca Cuboid Recording Techniques
Thursday May 28, 2026 1:30pm - 3:30pm CEST
This paper presents a comparative analysis of two immersive
recording techniques for classical music: the PCMA-3D
(Perspective Control Microphone Array) and the Decca
Cuboid. While the Decca Cuboid relies primarily on
time-of-arrival differences to generate spatial
impressions, the PCMA-3D utilises intensity differences and
separates ambience from direct sound. A recording session
was conducted in a concert hall using a classical guitar
soloist and two distinct folk music ensembles to capture
performances simultaneously with both arrays. Subjective
evaluation was performed using a MUSHRA listening test with
18 participants, assessing parameters such as sensation of
space, localisation precision, and sound quality.
Statistical analysis reveals that while both systems
provide high-quality immersive experiences, the PCMA-3D
scored significantly higher in the sensation of space (p
Authors
Zechen Wang
University of York
Thursday May 28, 2026 1:30pm - 3:30pm CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark
 
Friday, May 29
 

1:00pm CEST

A Psychoacoustic Framework for In-Vehicle Audio-Light Mapping
Friday May 29, 2026 1:00pm - 3:00pm CEST
This paper proposes a psychoacoustic-based audio-visual
mapping framework for intelligent vehicle cabins to enhance
immersion and stabilize spatial auditory perception. By
establishing mappings between auditory descriptors—such as
Direction of Arrival (DOA), spectral centroid, and temporal
envelope—and ambient lighting parameters, the framework
leverages "ambient vision" to augment the perceptual
experience without increasing the driver's cognitive load.
Theoretical analysis based on Stevens’ Power Law indicates
that the proposed mapping strategies effectively
synchronize audio-visual intensities and mitigate
perceptual fatigue, providing a conceptual reference for
future multisensory HMI design.
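
Stevens' power law states that perceived magnitude grows as a power of physical intensity, ψ = k·φ^a. A minimal sketch of how this could be used to match a light level to a sound level follows. The exponents 0.67 (loudness versus sound pressure) and 0.33 (brightness) are commonly cited textbook values used here as assumptions; the abstract does not state which exponents or mapping the authors use, and the function names are hypothetical.

```python
def perceived(intensity, exponent, k=1.0):
    """Stevens' power law: perceived magnitude = k * intensity ** exponent."""
    return k * intensity ** exponent

def matched_light(sound_intensity, a_sound=0.67, a_light=0.33):
    """Light intensity whose perceived magnitude equals that of the sound.

    Intensities are relative to a reference level and k = 1 for both senses,
    which is a simplification made for this sketch.
    """
    return perceived(sound_intensity, a_sound) ** (1.0 / a_light)
```

Because the assumed brightness exponent is smaller than the loudness exponent, a given relative increase in sound level calls for a larger relative increase in light level to be perceived as an equal step.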
Authors
Kangwei Wang
Acoustic System Engineer, GoerDynamics Lab2
Friday May 29, 2026 1:00pm - 3:00pm CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark
 
Saturday, May 30
 

9:00am CEST

Zylia ZM-1 vs. Harpex Spcmic: A Case Study of Higher-Order Ambisonic Recording Performance
Saturday May 30, 2026 9:00am - 11:00am CEST
The Zylia ZM-1 (19 MEMS capsules, spherical array, 88 mm
diameter, 3rd-order) and Harpex Spcmic (84 MEMS capsules,
planar array, 230 mm diameter, 5th-order capable) represent
two distinct geometrical approaches to higher-order
Ambisonics capture. Despite widespread adoption in research
and production, systematic comparison of their performance
in real-world recordings remains absent from published
literature. This case study presents a controlled
comparison through simultaneous recordings of piano
recitals in the same concert hall.

Two arrays, the Zylia ZM-1 and the Harpex Spcmic, were mounted on a
single stereo bar (17 cm apart) ensuring acoustically
identical capture positions. Recording sessions occurred in
Aula Politechniki Gdańskiej (370-seat hall, RT60 = 1.97 s)
on two dates: August 15, 2024 (Franck: Prélude, Choral et
Fugue; Prokofiev: Piano Sonata No. 4, 35.6 minutes
total) and April 30, 2024 (Ginastera: Sonata No. 1, Op. 22,
15.4 minutes). Both arrays recorded simultaneously; files
were processed through manufacturer A-to-B conversion
software and peak-normalized to −0.5 dBTP. The Spcmic was
encoded to both native 5th-order and truncated 3rd-order
formats for direct comparison with the ZM-1.

Four metrics were analyzed: (1) W-channel spectral
response, (2) integrated loudness (LUFS-I per ITU-R
BS.1770-5), (3) spatial energy distribution across
Ambisonics orders, and (4) first-order directional
component ratios.

Spectral analysis reveals the ZM-1 exhibits 5–8 dB
elevation at 200–600 Hz relative to the Spcmic. Loudness
measurements show the Spcmic 3rd-order yields 2.3–3.3 dB
higher LUFS-I than the ZM-1 despite identical peak
normalization.

The primary finding concerns spatial energy: the ZM-1
exhibits 27.4 dB attenuation from 0th to 3rd order, while
the Spcmic shows only 8.4 dB—a 19 dB difference despite
both producing "3rd-order Ambisonics" format. Analysis of
both recording sessions confirms consistency across
different repertoire (romantic, 20th-century,
contemporary). Directional analysis shows the Spcmic
exhibits stronger first-order components (X/Y/Z ratios
0.68–0.83) versus the ZM-1 (0.42–0.55).

Results demonstrate that nominal Ambisonics order
inadequately characterizes spatial resolution in real
recordings. The substantial higher-order energy deficit in
compact spherical arrays has implications for reproduction
quality, decoder design, and archival standards. Arrays
with steeper rolloff may require order-dependent gain
compensation to match spatial impression of larger systems.

This case study complements existing anechoic validation by
demonstrating performance differences in authentic
recording conditions. Recordings are part of a publicly
available HOA corpus (Gdańsk University of Technology
repository).
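
The "spatial energy distribution across Ambisonics orders" metric in the abstract above can be illustrated with a short sketch that computes the energy of each order relative to order 0. The abstract does not define the exact computation, so the choices here are assumptions: ACN channel ordering, mean energy per channel within an order, and no correction for the normalisation convention (SN3D versus N3D), which would shift the per-order values. The function name `order_energy_db` is hypothetical.

```python
import numpy as np

def order_energy_db(hoa, max_order):
    """Mean channel energy per Ambisonics order, in dB relative to order 0.

    hoa: array of shape (channels, samples), channels in ACN order.
    """
    hoa = np.asarray(hoa, dtype=float)
    if hoa.shape[0] < (max_order + 1) ** 2:
        raise ValueError("not enough channels for the requested order")
    energies = []
    for n in range(max_order + 1):
        block = hoa[n * n:(n + 1) ** 2]  # the 2n + 1 channels of order n
        energies.append(np.mean(block ** 2))
    energies = np.array(energies)
    return 10.0 * np.log10(energies / energies[0])
```

Under these assumptions, the last element of the returned array for `max_order=3` corresponds to the kind of 0th-to-3rd-order attenuation figure quoted in the abstract.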
Authors
Bartlomiej Mroz
Assistant Professor, Gdańsk University of Technology
PhD, Spatial Audio & Immersive Media Researcher, Recording Engineer, Statistics enthusiast

Szymon Zaporowski
Gdańsk University of Technology
Saturday May 30, 2026 9:00am - 11:00am CEST
Foyer Building 303A Technical University of Denmark Asmussens Alle, Building 303A DK-2800 Kgs. Lyngby Denmark
 

