Recent advances in large-scale multichannel loudspeaker systems have enabled immersive concert formats that extend spatial control beyond conventional stereo and small multichannel configurations. High-density loudspeaker arrays (HDLAs) allow sound to be distributed across complex architectural spaces, challenging established distinctions between composition, performance, and live sound practice. In live contexts, however, the realization of spatial attributes is often constrained by system complexity, limited rehearsal time, and the lack of artist-facing spatial control interfaces. As a result, spatial realization and sound diffusion are frequently delegated to sound engineers, who translate artistic material to the acoustic and architectural conditions of the venue in real time.
This paper examines three immersive concerts presented during Sonic Days 2025 in Denmark, realized on both large-scale and small-scale multichannel loudspeaker systems. The concerts represent contrasting production contexts, including a site-specific spatial composition conceived explicitly for a high-density loudspeaker array and performances by artists whose practices are typically oriented toward stereo or small multichannel formats. Across these cases, spatialization functioned variously as compositional material, interpretive layer, and adaptive live-mixing practice.
The paper analyzes how control over spatial attributes is negotiated between artists and sound engineers in live immersive concert settings, and how this negotiation affects the interpretation of artistic intent and audience experience. Particular attention is given to the role of sound engineers as active mediators whose decisions shape spatial form, listening perspective, and the relationship between sound and architecture. The findings suggest that immersive concert formats redistribute creative agency across artists, technicians, and technological infrastructures, and point toward the need for revised conceptual frameworks for authorship, performance, and listening in large-scale spatial audio environments.
This presentation develops a conceptual framework for understanding how visitors cognize sound in museum exhibitions. While sound increasingly features in museum practice, research has focused primarily on measuring visitor enjoyment and engagement rather than examining the specific meanings sound generates. This gap reflects the absence of a framework conceptualizing sound's meaning-making capacities to guide empirical investigation. Drawing on scholarship from music studies, semiotics, phenomenology, and embodied cognition, I propose a seven-component spectrum identifying distinct yet interrelated meanings that sound can convey in museums: aesthetic, representational, emotional, sensorial, imaginative, social, and political. These meanings can be apprehended independently or in combination, typically through emergent, pre-conscious perception rather than deliberate awareness. The spectrum builds on the premise that museum sound meaning-making unfolds through dynamics internalized from early childhood as we attune to the world sonically. It draws on the notion of sound as a "sonic aggregate" (Grimshaw and Garner 2015) that encompasses social, contextual, temporal, and embodied experiences, rather than reducing sound to wave phenomena. Visitors actively co-produce meanings by drawing on their moods, memories, knowledge, and imagination during exhibition encounters. Each meaning category is illustrated with exhibition case studies, demonstrating the spectrum's applicability across diverse sound-based multimodal museum practices, from popular music exhibitions to sound art installations. The spectrum aims to catalyze research through varied methodological approaches and establish analytical standards for studying sound in museums, with potential adoption by international standardization bodies.
Sound Studies Researcher, INET-md | NOVA University Lisbon
A PhD in ethnomusicology and museum studies and a curator, I am committed to exploring the diverse meaning-making capabilities of sound when exhibited in museums, encompassing the representational, emotional, sensorial, and social, as well as its ability to foster imagination and...
Friday May 29, 2026, 10:30am - 11:00am CEST, Aud 43, Technical University of Denmark, Asmussens Alle, Building 303A, DK-2800 Kgs. Lyngby, Denmark
This paper presents the perceptual evaluation of the Open Binaural Renderer (OBR), an open-source library developed for headphone-based rendering of Immersive Audio Model and Formats (IAMF) content. The evaluation followed an iterative framework in which findings from a pilot listening study informed the tuning of rendering profiles, and the resulting renderer was benchmarked against established proprietary solutions. In the pilot study, 19 expert listeners rated the Overall Listening Experience (OLE) of the initial prototype (OBRv1) and five external renderers across diverse audio content. Qualitative feedback was analysed using inductive coding to identify salient perceptual dimensions. The pilot revealed content-dependent performance and showed that a single default profile was inadequate, yielding mixed responses in both the numerical ratings and the qualitative feedback, and motivating the development of multiple rendering profiles in OBRv2. The main study evaluated two OBRv2 profiles targeting different reverberation characteristics (Direct and Ambient) alongside three top-performing external renderers. A total of 39 participants, divided into expert and non-expert groups, rated five perceptual attributes: Voice Quality, Envelopment, Externalisation, Overall Listening Experience, and Timbral Balance. Mixed-design ANOVA revealed significant main effects of renderer condition on all attributes. Pairwise comparisons showed that the OBRv2 Ambient profile achieved significantly higher OLE ratings than one proprietary renderer and reached statistical parity with the remaining two, representing a measurable improvement over the prototype. A trade-off between Voice Quality and Externalisation was observed, driven by the level of reverberation in each renderer. The results demonstrate that iterative, perceptually informed tuning can yield competitive binaural rendering quality in an open-source framework.
Professor of Audio Engineering, University of York
Gavin Kearney graduated from Dublin Institute of Technology in 2002 with an Honors degree in Electronic Engineering and has since obtained MSc and PhD degrees in Audio Signal Processing from Trinity College Dublin. He joined the University of York as Lecturer in Sound Design in January...
With 25+ years of media industry product development, Jani Huoponen is a seasoned expert in developing cutting-edge audio and video technologies for consumer devices and streaming systems. Joining Google in 2010, he’s served as a product manager across key multimedia initiatives...
Despite the growing number of hearing-impaired workers wearing hearing aids in occupational settings, understanding speech in multi-talker situations remains challenging. This difficulty is particularly pronounced in open-plan offices, where simultaneous talkers and room reverberation tend to degrade speech intelligibility. While spatial cues are essential for segregating target speech from competing sources, hearing-aid signal processing may alter the binaural information that supports spatial hearing. Accurate evaluation of hearing-aid performance is therefore crucial. Objective speech intelligibility metrics offer an efficient alternative to time-consuming listening tests; however, their validity in complex spatial scenarios involving hearing-impaired listeners remains unclear. Monaural metrics such as HASPI account for individual hearing loss but neglect spatial information, whereas binaural metrics such as MBSTOI incorporate spatial cues but are primarily designed for normal-hearing listeners. This study evaluates the ability of existing objective metrics to predict speech intelligibility for hearing-aid users in multi-talker spatial environments. Listening tests are conducted with 20 hearing-impaired participants fitted with binaural hearing aids. Four types of multi-talker auditory scenes representative of open-plan offices are reproduced using a loudspeaker array. Each scene involves target speech combined with diffuse noise and a localized competing speech source. Objective measurements are performed using an acoustic mannequin fitted with the participants’ hearing aids. HASPI and MBSTOI values are computed from the binaural signals recorded at the eardrums and incorporate individual hearing losses. Objective predictions are compared with subjective intelligibility scores, and an ablation analysis is conducted to distinguish the effects of hearing loss modeling from those of binaural processing.
Situational awareness is a multisensory ability that enables individuals to perceive and appropriately take into account their immediate environment. This perception of the world through our senses is carried out continuously and unconsciously throughout the day. When auditory perception is degraded, an individual may no longer correctly perceive a doorbell, a water leak, or an alarm signal, which negatively affects quality of life and may lead to dangerous situations. Auditory perception can in particular be degraded by hearing loss, a widespread condition. The most common treatment consists of wearing hearing aids, which are mainly designed to improve speech intelligibility, especially in noisy environments. Feedback from hearing-impaired people and hearing-aid users indicates that, although auditory situational awareness has been recognised as an essential component of well-being, it remains insufficiently studied and requires further investigation. There is currently no standard method for assessing the extent to which one's situational awareness is affected by hearing impairment and the use of hearing aids. This is a complex process that requires assessing the perception of relevant sound events within a continuous stream of multisensory information, by individuals who have different subjective preferences. Most existing methods are limited to evaluating only a subset of the problem, such as identification and localisation of non-speech sound events. The rise of new technologies, such as virtual reality, enables the development of assessment methods within more realistic yet controlled environments. This study aims to review existing methods in order to highlight their limitations in addressing the issue at hand.
Headphone listening has become an integral part of everyday life, spanning music consumption, communication, online media, and, increasingly, computer gaming. These diverse listening contexts make individual sound exposure highly variable and difficult to quantify. While music listening and occupational headphone use have been widely studied, sound exposure from gaming remains comparatively undocumented. This study investigated the relationship between self-reported exposure through headphones and cochlear function assessed using transient evoked otoacoustic emissions (TEOAE). Forty-one university students completed a detailed questionnaire on listening habits, and TEOAEs were recorded in both ears across five half-octave frequency bands. Estimated weekly exposure levels were derived from participants’ reported durations and contexts of use. TEOAE amplitude, signal-to-noise ratio (SNR), and reproducibility showed clear frequency-dependent patterns and small ear asymmetries, consistent with typical OAE behaviour. Only limited associations were found between self-reported exposure and TEOAE measures, with significant effects emerging primarily for SNR and reproducibility in the highest-exposure group. No consistent differences were observed between long-term gamers and non-gamers. These findings suggest that self-reported exposure alone may be insufficient to detect subtle cochlear changes in young adults, and underscore the need for more precise exposure-monitoring methods when evaluating recreational sound exposure risks.
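The abstract does not state how the weekly exposure estimates were computed. As a minimal sketch, assuming an equal-energy average over the reported listening contexts, normalized to a nominal 40-hour week (the convention commonly used for occupational weekly exposure levels), the calculation could look like the following. The function name, the input format, and the example levels are hypothetical and are not taken from the study.

```python
import math


def weekly_exposure_level(activities, reference_hours=40.0):
    """Equal-energy weekly exposure level in dB.

    activities: iterable of (level_db, hours_per_week) pairs, one per
        listening context (hypothetical input format).
    reference_hours: normalization period; 40 h is an assumption here.
    """
    # Convert each level to relative sound energy and weight it by duration.
    energy = sum(hours * 10 ** (level_db / 10.0)
                 for level_db, hours in activities)
    if energy <= 0:
        raise ValueError("at least one activity with non-zero duration is required")
    # Normalize to the reference week and convert back to decibels.
    return 10.0 * math.log10(energy / reference_hours)


# Illustrative values only: music at 80 dB for 10 h, gaming at 75 dB for 14 h.
example = weekly_exposure_level([(80.0, 10.0), (75.0, 14.0)])
print(f"Estimated weekly exposure level: {example:.1f} dB")
```

Because the average is taken over energy rather than over decibel values, halving the listening time at a given level lowers the result by about 3 dB, and short periods at high levels can dominate the estimate.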
Binaural rendering is typically assessed via timbre and localization accuracy, while its intrinsic spatial resolution remains rarely quantified. This paper proposes a perceptual evaluation method based on Minimum Audible Angle (MAA) measurements to estimate the azimuthal just-noticeable difference (JND) introduced by binaural rendering algorithms. We systematically compared several rendering algorithms across eight reference azimuths using two participant-allocation paradigms. The results show that spatial resolution is significantly influenced by Ambisonic order and choice of the rendering algorithm, with MAA thresholds systematically decreasing as the truncation order increases. Furthermore, the proposed method successfully captures physiological spatial characteristics and identifies resolution limits imposed by reference angles. While both participant-allocation paradigms yield consistent qualitative trends, the repeated-measures design provides superior data stability. These findings demonstrate that the proposed MAA-based method is an effective tool for quantifying the spatial resolution of binaural rendering algorithms.
This study evaluates three Next-Generation Audio (NGA) rendering systems through listening tests using real-life audio content. The testing paradigm prioritized subjective preference over adherence to a ground-truth reference. Participants assessed perceptual spatial audio attributes in both 5.1 and 7.1.4 loudspeaker setups. The findings suggest that strict adherence to the rendering algorithm used during content creation is not mandatory in terms of listener preference. While it does not advocate disregarding artistic intent without consideration, this study proposes that such flexibility in reproduction can be an acceptable compromise.
Toni Hirvonen studied acoustics at the Helsinki University of Technology (now Aalto University), where he obtained a PhD in audio signal processing and spatial audio. After a position as a Marie Curie fellow, he has worked internationally in the audio industry since 2010. His projects...