Acoustic lenses are structures that focus acoustic waves, and they are increasingly used in audio devices such as loudspeakers to concentrate energy toward a listening position. Although they are typically employed at higher frequencies, effective performance in the audible frequency range remains a significant challenge because the long acoustic wavelengths call for substantially larger structures. This paper addresses the design of an acoustic lens dedicated to operation in the audible range. The proposed lens is composed of periodically arranged acoustic unit cells, enabling precise control over both the sound transmission coefficient and the phase delay. A parametric analysis of a single acoustic unit cell was performed, followed by global optimization of the complete lens structure using the Particle Swarm Optimization (PSO) algorithm. The outcome of the study is an acoustic lens design with predefined properties that demonstrates the desired directional characteristics. The findings highlight the potential of this approach for effectively manipulating the acoustic wave field and the directivity of sound sources within the audible frequency range.
The proposed workshop/tutorial serves as a prequel to the presentation on the history of dynamic loudspeakers given at the 158th Convention (Warsaw, 2025). It focuses on the earliest phase of consumer loudspeaker technology in the 1920s, prior to the widespread adoption of dynamic loudspeakers in the mass market.
Loudspeakers had been in use since the mid-1910s for public address applications, and the rapid global expansion of broadcast radio soon brought loudspeakers into domestic use. The 1920s constituted a period of rapid innovation in loudspeaker design, preceding the introduction of the dynamic loudspeaker, which achieved significant commercial impact only in the latter part of the decade.
The workshop/tutorial will examine consumer loudspeaker technologies of the 1920s, the concurrent advancements in audio electronics and signal sources that enabled subsequent developments, and the earliest efforts in systematic loudspeaker theory and measurement.
Two loudspeaker types dominated this period: horn loudspeakers driven by electromagnetic drivers similar to those used in headphones and telephone receivers (with headphones, particularly Baldwin models, also serving as the basis for do-it-yourself loudspeakers), and open-baffle cone loudspeakers, frequently actuated by electromagnetic reed drivers.
Although these transducer technologies were rapidly superseded during the following decade, the electromagnetic loudspeaker era already featured multi-way loudspeakers employing passive crossovers. Early measurements exposed deficiencies in frequency response, leading to the introduction of equalisation techniques, including notch filters, to correct these responses.
Developments in amplification were equally significant. The 1920s saw the introduction of push-pull amplifiers (described at the time as “distortionless”) and, as a key contributor to improved bandwidth and reduced distortion, new audio transformers derived from Bell Labs’ telephone research. Amplifier power limitations nevertheless remained a dominant constraint in loudspeaker design, resulting in the widespread use of strong resonances to achieve high sensitivity. Improvements in signal source quality from the mid-1920s onwards — including advances in radio transmission and the introduction of electrical disc recording and playback — further increased the demand for improved loudspeaker performance, ultimately contributing to the development of dynamic loudspeakers. In contrast, headphone technology appears to have undergone relatively little development during this period.
The tutorial will conclude with a brief overview of the loudspeaker manufacturing landscape of the era, noting that only a small proportion of manufacturers survived the transition to dynamic loudspeaker technology.
In today’s live and electronic music events, some sound reinforcement systems use horn-loaded bass speaker cabinets for the low-end section. Especially for electronic music applications, the PA system is designed around one or more clusters of bass cabinets to provide the required SPL and impact in the low-frequency range. Despite being large and heavy, horn-loaded bass speakers offer advantages such as efficiency and directivity, which make them a strong option for electronic music. Enthusiasts also describe them as projecting sound further than bass-reflex units. When used in clusters, bass horns exhibit mutual coupling due to the larger combined mouth area and the underlying physics. This effect alters the working parameters in a way that benefits sound reproduction and is clearly noticeable at high levels. The mechanism increases the output near the lower edge of the frequency response and changes the directivity pattern. A cluster of four or six double 18” horn-loaded bass bins placed front and centre of a dance area provides strong impact, described as a “punchy” sound, which is highly acclaimed in the electronic music party scene. In this paper I describe an investigation of the mutual coupling between horn cabinets, using electrical and acoustical measurements to reveal the mechanism mentioned above. Electrical impedance measurements, together with SPL and frequency response measurements in coupled and uncoupled scenarios, are used to describe and demystify the mutual coupling phenomenon.
Sound system design and calibration engineer. I am running a company providing professional sound systems and DJ equipment rental. Sound system setup design, numerical simulations and technical support are included in the portfolio. Horn speakers and Vacuum tube amplifiers enthus...
Thursday May 28, 2026 9:30am - 10:00am CEST, Aud 44, Technical University of Denmark, Asmussens Alle, Building 303A, DK-2800 Kgs. Lyngby, Denmark
The development of personal sound zone systems in recent years shows great potential for low-frequency noise control outside of noisy spaces. These approaches have promising applications in managing noise pollution from concerts in large venues or urban festivals. However, most of the literature assumes that the sound zones are created in the same room or acoustic space as the noise source. This premise excludes all setups where the disturbance occurs outside the concert venue (e.g. in neighboring houses). This paper presents a first experimental study of the behavior of sound zone methods for indoor sound zones and outdoor noise sources. These initial results indicate that the methods remain effective in this edge case, opening new use cases for these approaches.
Many types of distortion can be measured, ranging from linear to non-linear. The two are often intertwined, with the linear distortion influencing the non-linear distortion. Distortion is also highly signal- and level-dependent, which makes it hard to compare one type of distortion measurement with another. There are many non-linear distortion metrics; THD, THD+N and IMD are the classic ones, and all use sine tones as the test signal. But how can we measure distortion with real signals such as speech, music or even noise, and relate the results to audibility? This tutorial covers a wide range of distortion measurements and discusses what is audible and what distortion sounds like.
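For readers unfamiliar with the classic sine-tone metric, here is a minimal sketch of a THD calculation, taking THD as the RMS sum of the harmonic amplitudes divided by the fundamental amplitude. The test signal, sample rate and harmonic levels are synthetic assumptions, not data from the tutorial, and the single-bin correlation is only exact because the capture holds a whole number of periods.

```python
import math

def tone_amplitude(signal, freq_hz, sample_rate_hz):
    """Amplitude of one sinusoidal component, estimated by correlation (single-bin DFT)."""
    n = len(signal)
    step = 2 * math.pi * freq_hz / sample_rate_hz
    re = sum(x * math.cos(step * i) for i, x in enumerate(signal))
    im = sum(x * math.sin(step * i) for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

def thd_percent(signal, fundamental_hz, sample_rate_hz, n_harmonics=5):
    """THD in percent: RMS sum of harmonics 2..n_harmonics+1 over the fundamental."""
    fundamental = tone_amplitude(signal, fundamental_hz, sample_rate_hz)
    harmonics = [tone_amplitude(signal, k * fundamental_hz, sample_rate_hz)
                 for k in range(2, n_harmonics + 2)]
    return 100.0 * math.sqrt(sum(a * a for a in harmonics)) / fundamental

# Synthetic "device output": 1 kHz tone with 1 % second and 0.5 % third harmonic.
FS = 48000    # assumed sample rate in Hz
F0 = 1000     # assumed test-tone frequency in Hz
N = 4800      # 0.1 s, exactly 100 periods of the fundamental
signal = [math.sin(2 * math.pi * F0 * i / FS)
          + 0.010 * math.sin(2 * math.pi * 2 * F0 * i / FS)
          + 0.005 * math.sin(2 * math.pi * 3 * F0 * i / FS)
          for i in range(N)]

thd = thd_percent(signal, F0, FS)
print(f"THD = {thd:.3f} %")
```

The same number says nothing about audibility by itself, which is the question the tutorial raises for speech, music and noise signals.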
Steve Temme is founder and President of Listen, Inc., manufacturer of the SoundCheck audio test system. Steve founded the company in 1995, and for the past 30 years the company has remained on the cutting edge of research into audio measurement, regularly introducing new measurement...
Thursday May 28, 2026 10:00am - 11:00am CEST, Aud 49
Before digital signal processing took over electronic keyboard instruments, they were implemented using analogue circuits that used tubes/valves, transistors, and even neon lightbulbs! Yet using these components keyboards were developed that could mimic string and brass ensembles, pianos and harpsichords and many other instruments. How did they do it?
The purpose of this tutorial is to look at both the architecture and the circuitry of these instruments, and to show how amazing results could be achieved using comparatively simple electronic circuitry. It will look at:
1. The basic architecture of these instruments
2. How they generated the right notes
3. How they generated the desired envelopes
4. How they imposed those envelopes on the waveform
5. How they simulated the effect of many instruments playing together
It will also look at how, if it was required, touch sensitivity could be achieved, such as in electronic pianos. Where possible there will be audio examples demonstrating the sounds that could be achieved.
For many people who have only ever experienced the digital world it will be illuminating to see just how much could be achieved by comparatively simple circuits. In those days electronic components were expensive so considerable ingenuity was expended in minimising the total number of components required.
These instruments are part of our musical and audio heritage and the circuit techniques they used are in danger of being forgotten so this tutorial will be a timely reminder of what used to be done. It may also provide useful information to people who are attempting to model these instruments using modern digital methods.
The tutorial will be accessible to everyone; you will not have to be an electronic engineer to understand the principles behind these unique pieces of audio engineering history.
Jamie Angus-Whiteoak is Emeritus Professor of Audio Technology at Salford University and VP for Northern Europe.
Her interest in audio was crystallized at age 11 when she visited the WOR studios, NYC, in 1967 on a school trip. After this she was hooked, and spent much of her free ti...
Thursday May 28, 2026 10:00am - 11:00am CEST, Aud 41
Damping in viscoelastic materials such as rubbers is often desirable, especially in loudspeaker suspensions. Under high strain loads, however, viscoelastic materials can also exhibit a hysteretic stiffness behavior, causing a stiffness decrease with amplitude. In this study, we examine the viscoelastic rubber suspension of a loudspeaker, using the loudspeaker motor system as actuator and sensor. From measurements we observe the hysteretic force-displacement behavior and pronounced odd-order harmonic distortion even at low amplitudes, in accordance with the literature. We further explore a macro-thermodynamic plastic flow model to describe the stiffness of viscoelastic materials. The results show that the plastic flow suspension model explains and replicates the observed nonlinear hysteretic behavior. We also show that a fitted time-domain loudspeaker model including plastic flow matches the measured distortion profile. In contrast, models with polynomial stiffness and viscous damping fail to explain the observed amplitude dependencies, such as odd-order harmonic levels. The experiments demonstrate that viscoelastic hysteresis occurs not only at high but also at low amplitudes, where the elastic stiffness is approximately linear.
My interests are loudspeakers (measurements, modelling, (nonlinear) parameter estimation, nonlinear compensation), active noise control, and indoor and outdoor sound field control.
Mechanical overload remains a primary limitation in high-output loudspeaker operation, particularly at low frequencies where large coil excursions are required. Conventional mechanical protection strategies are typically implemented as signal-domain limiters or filters, which act indirectly on the loudspeaker’s mechanical state and may introduce discontinuities, spectral modification, or unnecessary attenuation.
This paper proposes a methodological framework for mechanical loudspeaker protection based on the virtualization of admissible system behavior. The approach is formulated within a nonlinear wave digital loudspeaker model and realized using a direct–inverse–direct architecture. Mechanical protection is embedded directly into the virtual loudspeaker dynamics by shaping the nonlinear suspension compliance as a function of voice-coil displacement. As the excursion approaches a prescribed admissible limit, the virtual compliance is progressively reduced using a smooth raised-cosine law, resulting in a continuous increase of the virtual mechanical stiffness. Excessive excursion is therefore prevented as a consequence of the system dynamics, without explicit limiting, clipping, or signal-domain intervention.
The proposed framework is evaluated through numerical simulations using steady-state low-frequency sinusoids and low-frequency sine bursts under free-air loading. Results are compared against an unprotected loudspeaker and a fixed high-pass filter configured to meet the same excursion constraint. The simulations verify that the proposed method enforces a soft excursion ceiling without discontinuities, preserves low-frequency output in the near-limit operating region, and exhibits stable and immediate recovery following transient excitation. Distortion behavior is characterized and shown to increase smoothly as a result of the introduced mechanical nonlinearity.
The results demonstrate that mechanical protection can be realized as an emergent property of a virtual loudspeaker model rather than as an external control action. The proposed approach provides a physically interpretable and numerically robust foundation for virtualization-based loudspeaker protection.
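One way to picture the raised-cosine compliance shaping described in this abstract is sketched below. The exact law used in the paper is not given here, so the function shape, the knee displacement where shaping starts, and all numerical values (`c0`, `c_min`, `x_knee`, `x_limit`) are hypothetical.

```python
import math

def virtual_compliance(x_m, c0=1.0e-3, c_min=1.0e-5,
                       x_knee=3.0e-3, x_limit=5.0e-3):
    """Virtual suspension compliance (m/N) versus voice-coil displacement (m).

    Below x_knee the nominal compliance c0 applies. Between x_knee and
    x_limit the compliance falls smoothly to c_min along a raised cosine,
    so the virtual stiffness 1/C rises continuously toward the limit.
    All parameter values are illustrative assumptions.
    """
    ax = abs(x_m)
    if ax <= x_knee:
        return c0
    if ax >= x_limit:
        return c_min
    t = (ax - x_knee) / (x_limit - x_knee)       # 0 at knee, 1 at limit
    weight = 0.5 * (1.0 + math.cos(math.pi * t))  # 1 at knee, 0 at limit
    return c_min + (c0 - c_min) * weight

for x_mm in (0.0, 3.0, 3.5, 4.0, 4.5, 5.0):
    c = virtual_compliance(x_mm * 1e-3)
    print(f"x = {x_mm:.1f} mm  C = {c:.3e} m/N  K = {1.0 / c:.1f} N/m")
```

Because the raised cosine has zero slope at both ends of the transition, the compliance and its first derivative stay continuous, which is consistent with the soft, discontinuity-free excursion ceiling reported in the abstract.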
Target curves for the sound signature of headphones are a helpful design target during the development process. While much attention has been paid to finding target curves that match the listening preference of consumers, the equivalents for studio headphones date back to the 1990s. In the context of music production, a common target or even a standard is essential to make mixing and mastering more gear-independent. This becomes even more important as the main tool for sound engineers shifts from loudspeakers in professional environments such as acoustically treated studios to headphones, often additionally equipped with virtualization algorithms. This enables engineers to be more flexible and to rely less on potentially expensive loudspeaker setups. The diffuse-field target curve, currently still the only standardized target curve for studio headphones, is often reported not to match a real loudspeaker equivalent in studio environments. In this paper, we seek a new standard target curve for studio headphones that emulates the frequency response of a loudspeaker setup in modern studio environments. To this end, we give an overview of current target curves and match them to their equivalent loudspeaker setups. Based on that, we propose a new methodology for a measurement-based target curve that incorporates typical panning paradigms of music signals, based on measurements inside multiple control rooms. To verify the results, we conduct listening tests with professionals in multiple studio environments.
Headphones have become the dominant device for music playback, and their design appears to have reached a certain level of technical maturity. This workshop presents an overview of the current state of the art in headphone design and examines potential directions for future technological development, addressing both acoustic aspects—including transducer design—and signal-processing approaches.
The workshop establishes a common foundation by introducing the fundamentals of headphone acoustics and design principles, together with a brief overview of the historical development of headphones and the main headphone types in use today.
Based on this foundation, the workshop addresses current challenges and future development potential in headphone technology, including: • Transducer and acoustic development potential: materials, design methodologies and simulation techniques, and advances in measurement technology • Characteristics of a high-quality headphone: What differentiates an excellent headphone from a good one? To what extent can headphone performance be characterized using current measurement techniques, and what additional metrics, target criteria, or perceptual considerations may be required? What is the role of mechanical quality? • Signal processing potential: from advanced noise cancellation and augmented hearing to spatial audio processing • Challenges in realistic spatial reproduction: interaction between auditory and visual environments • Emerging wireless technologies: technologies such as UWB and Bluetooth 6 offer not only increased bandwidth and reduced latency but also the capability to localize the playback device. What are the implications for conventional headphone performance and for spatial audio applications? • Changes in studio workflows: professional practice has evolved from loudspeakers as the primary monitoring tools, with headphones mainly used for detailed analysis, toward headphones playing a central role in the early stages of recording and mixing. What are the consequences of this shift for headphone design and signal processing? • Technically feasible but not yet commercialized solutions: advanced headphone concepts that are achievable with current technology but have not yet been adopted due to economic or practical constraints
Few studies exist on the perception and measurement of nonlinear distortion in headphones. This paper reports the detection thresholds and perceived sound quality from real distortion in headphones. Five different distortion measurements were made on the headphones to determine how well they predict audibility and quality. Music samples were binaurally recorded on six headphones at playback levels ranging from 85 to 110 dBA in 3 dB increments. The recordings were reproduced at a normal playback level (83 dBA) through a reference headphone with low distortion. The headphone recordings were post-processed to remove both level and frequency response differences so that only nonlinear distortions and residual noise remained. In a second test, listeners rated the similarity in quality of headphones relative to an undistorted reference and a hidden version of it. The results provide evidence that audible distortion in headphones with music occurs at significantly higher playback levels (104 to 112 dBA SPL) than what is considered typical and safe. The percentage of measured THD in the headphone had the highest correlation with the detection thresholds, while the non-coherent distortion with music best predicted the similarity ratings. We discuss the results and the practical implications they might have for future headphone design, testing and measurement.
There are three architectural approaches to microelectromechanical systems (MEMS) microphones, miniature devices used in a wide range of products. Capacitive MEMS microphones are embedded in billions of consumer electronics devices. Solder-compatible and providing tight part-to-part sensitivity matching, all in a small footprint, capacitive MEMS microphones have demonstrated improved performance in recent years. State-of-the-art digital capacitive MEMS microphones can now achieve up to 72 dB signal-to-noise ratio (SNR), with a 22 dBA noise floor and an overall dynamic range on the order of 106 dB.
However, capacitive MEMS microphone technology has now reached the limits of its architecture, which constrains the key audio performance metrics: SNR and acoustic overload point (AOP).
Piezoelectric MEMS microphones have not demonstrated SNR performance exceeding 65 dB, and require new materials to be developed to increase their performance. Optical MEMS microphones (a new architectural approach that combines a laser optical subsystem, a MEMS, and advanced CMOS circuit design) have exceeded the limits of capacitive technology. With 80 dB SNR supporting a 14 dBA noise floor, 132 dB dynamic range, and a 146 dB AOP, optical MEMS microphones achieve studio-quality performance in a tiny form factor that supports semiconductor-level yields in high-volume manufacturing.
This presentation will explain the architectural advancements of optical MEMS microphones in comparison to capacitive MEMS microphones. It will provide example use cases of high-SNR and high-AOP microphones in high-volume applications.
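The figures quoted in this abstract are linked by simple arithmetic, assuming the usual convention that microphone SNR is referenced to a 94 dB SPL (1 Pa) signal. The short sketch below checks the quoted numbers under that assumption; the function names are illustrative, and the capacitive AOP is an implied value, not one stated in the abstract.

```python
REF_SPL_DB = 94.0  # assumed SNR reference level: 1 Pa, i.e. 94 dB SPL

def noise_floor_dba(snr_db):
    """Equivalent input noise (dBA SPL) implied by an A-weighted SNR."""
    return REF_SPL_DB - snr_db

def dynamic_range_db(aop_db_spl, snr_db):
    """Dynamic range between the acoustic overload point and the noise floor."""
    return aop_db_spl - noise_floor_dba(snr_db)

# Capacitive MEMS figures quoted above: 72 dB SNR, 106 dB dynamic range.
capacitive_floor = noise_floor_dba(72.0)           # 22 dBA, matches the text
capacitive_aop_implied = capacitive_floor + 106.0  # implied AOP, not stated in the text

# Optical MEMS figures quoted above: 80 dB SNR, 146 dB AOP.
optical_floor = noise_floor_dba(80.0)              # 14 dBA, matches the text
optical_range = dynamic_range_db(146.0, 80.0)      # 132 dB, matches the text

print(capacitive_floor, capacitive_aop_implied, optical_floor, optical_range)
```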
This work presents a measurement uncertainty evaluation of the free-field sensitivity of a MEMS microphone using a substitution comparison method. The measurement setup is based on principles used in secondary microphone calibration, with sensitivity determined relative to a calibrated reference microphone. The uncertainty analysis follows the Guide to the Expression of Uncertainty in Measurement (GUM), where Type A and Type B uncertainty evaluations are propagated through a defined measurement model to obtain the final measurement result. The MEMS microphone sensitivity is estimated together with an expanded uncertainty, where the calibration uncertainty of the reference microphone is identified as the dominant contributor. Broadband results show that the measured sensitivity is close to the typical manufacturer sensitivity over a wide frequency range and follows a similar frequency trend. The proposed approach enables reproducible estimation of the free-field sensitivity of MEMS microphones and provides a clear framework for uncertainty evaluation.
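A GUM-style budget for a substitution measurement can be sketched as follows, assuming a simple model in decibels (device sensitivity = reference sensitivity + measured level difference) with uncorrelated inputs and unit sensitivity coefficients. All readings, the reference sensitivity and the uncertainty values are invented for illustration and are not taken from the paper.

```python
import math
import statistics

def type_a(readings):
    """Type A: standard uncertainty of the mean of repeated readings."""
    return statistics.stdev(readings) / math.sqrt(len(readings))

def type_b_rectangular(half_width):
    """Type B: standard uncertainty of a rectangular distribution of given half-width."""
    return half_width / math.sqrt(3.0)

def combined(components):
    """Combined standard uncertainty for uncorrelated inputs (root sum of squares)."""
    return math.sqrt(sum(u * u for u in components))

# Hypothetical repeated readings of 20*log10(V_dut / V_ref) in dB.
level_diff_db = [-12.02, -11.98, -12.01, -11.99, -12.00]
ref_sensitivity_db = -26.0          # assumed reference sensitivity, dB re 1 V/Pa

u_repeat = type_a(level_diff_db)    # Type A: repeatability
u_ref = 0.30 / 2.0                  # Type B: assumed certificate value U = 0.30 dB at k = 2
u_position = type_b_rectangular(0.10)  # Type B: assumed positioning effect, +/-0.10 dB

dut_sensitivity_db = ref_sensitivity_db + statistics.mean(level_diff_db)
u_c = combined([u_repeat, u_ref, u_position])
expanded = 2.0 * u_c                # coverage factor k = 2 (about 95 %)
ref_share = (u_ref / u_c) ** 2      # fraction of the variance due to the reference

print(f"S = {dut_sensitivity_db:.2f} dB re 1 V/Pa, U = {expanded:.2f} dB (k = 2)")
print(f"Reference calibration share of variance: {100 * ref_share:.0f} %")
```

With these invented numbers the reference calibration dominates the budget, which mirrors the qualitative finding stated in the abstract.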
This paper presents an improved method for characterizing integrated microphone arrays for Device‑Related Transfer Function (DRTF) synthesis. A probe‑array extension of the IMPro technique is introduced to measure all device microphones simultaneously, eliminating unknown timing offsets that arise in asynchronous device–probe recordings. A custom four‑element probe array and a modular test jig were developed to evaluate relative inter‑channel propagation delay (RIPD) accuracy across varied microphone‑port geometries. Hybrid free‑field DRTFs were synthesized by combining IMPro data with Boundary Element Method (BEM) acoustic scattering simulations, demonstrating that the probe‑array measurements capture small delay variations essential for precise spatial‑audio modeling. The extended IMPro method offers a practical, scalable alternative to anechoic‑chamber measurements for modern multi‑microphone devices.
The demand for wireless audio expands constantly, while the available RF spectrum over recent decades has shrunk and become more crowded. This session will explore strategies for making wireless audio work cleanly and reliably, essential information for live production, as well as TV and film production.
This paper presents Part 2 of our study on personalized timbre optimization for stereophonic sound reproduction via earphones, following our previous work presented at the AES International Conference on Headphone Technology in 2025. While Part 1 established a novel auditory-model-based framework for reproducing a listener’s natural timbre reference and demonstrated its perceptual validity under controlled conditions, the present study focuses on the practical implementation and validation of this approach for real-world use with consumer True Wireless Stereo (TWS) earphones.
Conventional headphone and earphone personalization techniques primarily target spatial audio reproduction or rely on preference-based equalization, often overlooking the accurate reproduction of natural timbre in stereophonic content. Our approach explicitly addresses this limitation by isolating and optimizing perceptually relevant timbral cues while excluding spatial encoding components, thereby improving timbral fidelity without degrading stereo imaging.
The proposed method originally consists of four stages: high-resolution anatomical scanning of the listener’s upper body, including the pinnae; individualized HRTF computation using the boundary element method; selective removal of spatial encoding components to derive a personalized reference target response curve (PR-TRC); and perceptual optimization using a listener-specific weighting coefficient grounded in auditory reference fidelity rather than preference. In this paper, each stage is simplified and automated using smartphone-based scanning and AI-assisted processing, enabling end users to complete the entire personalization process via a smartphone connected to a cloud-based server. The resulting personalized target response curve is implemented within the computational and memory constraints of the DSP pipeline of commercial consumer TWS earphones.
A subjective evaluation using the Semantic Differential Method was conducted to assess the perceptual impact of the simplified implementation. Twenty-four listeners evaluated personalized target curves generated by both the original and simplified methods, as well as two non-personalized target curves commonly used in commercial TWS earphones. The results show that both personalized methods consistently outperform non-personalized conditions in overall sound quality and listener preference. Importantly, no statistically significant degradation in perceived timbral naturalness was observed between the simplified and original methods.
These findings demonstrate that auditory-model-based personalized timbre optimization can be effectively translated into a practical, consumer-ready technology. The proposed approach represents a foundational contribution to future audio personalization and has broad applicability across headphone and earphone systems for stereophonic sound reproduction.
Kimio Hamasaki, an AES Fellow, is a producer and balance engineer for music recordings, a researcher in spatial audio, an educator in audio engineering and acoustics, and a consultant in audio engineering. He has recorded and produced numerous orchestral and operatic works with the Vienna Philharmonic...
Loudspeaker monitoring is the reference when audio professionals evaluate content. Headphones are also important quality-checking tools, and many consumers enjoy music using “close-fitting listening devices”, as all the different flavours of headphones are known in recent standards writing.
We discuss the two reproduction methods from perceptual, recording and mastering perspectives, especially differences in timbre, imaging and auditory envelopment when listening to stereo. Applications of headphones in recording, when setting up and trimming stereo or 3D microphone arrays, are also practically detailed.
In the last part of the workshop, attendees are invited to personally compare the two domains on the qualities and applications discussed, through guided listening to audio examples on a pair of precision nearfield monitors, Genelec 8351B, and a pair of excellent headphones, Audeze CRBN2.
Stefan Bock, born 20.08.1964 in southern Germany, started his career as an audio engineer in 1987. After freelancing in different facilities in Munich, he co-founded msm-studios in 1991, where he was the Chief Mastering Engineer and General Manager.
Recording Producer and Balance Engineer with 50 GRAMMY-nominations, 42 of these in craft categories Best Engineered Album, Best Surround Sound Album, Best Immersive Audio Album and Producer of the Year. Founder and CEO of the record label 2L. Grammy Award-winner 2020 and 2026. Immersive...
Audio-over-Ethernet (AoE) protocols have become fundamental in modern live sound reinforcement systems, yet their real-world synchronization behavior under diverse stress conditions, in terms of both load and configuration, is not accurately characterized. Microsecond-scale timing mismatches between amplifier outputs can disrupt line-array interference patterns, reducing directivity control and spectral consistency. Ensuring robust timing accuracy across large, mixed-traffic network topologies is therefore critical for predictable system performance. This paper presents a comprehensive, application-oriented evaluation of Dante, AES67 and Milan-AVB. A representative multi-hop architecture typical of touring deployments has been considered. A controlled measurement campaign, combining eight daisy-chained switches, heavy concurrent data traffic approaching link saturation, and sub-sampled latency tracking, assesses each protocol under ideal conditions, typical field situations, and common misconfigurations. Results reveal clear performance distinctions. Dante exhibits substantial timing variations, exceeding 100 µs under load. AES67 provides tighter synchronization but remains vulnerable to configuration errors, which can induce latency drift or even audio packet loss. Milan-AVB consistently maintains sub-microsecond accuracy across all scenarios.
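To give a feel for why such timing mismatches matter acoustically, the sketch below converts a timing offset into a phase error at a given frequency and into the level change when two otherwise identical sources are summed. The two-source model and the example frequencies are simplifying assumptions for illustration and are not part of the paper's measurement method.

```python
import math

def phase_error_deg(freq_hz, timing_offset_s):
    """Phase difference (degrees) produced by a timing offset at one frequency."""
    return 360.0 * freq_hz * timing_offset_s

def summation_loss_db(freq_hz, timing_offset_s):
    """Level of two equal sources summed with a timing offset,
    relative to the perfectly aligned sum: 20*log10(|cos(phi/2)|)."""
    phi = math.radians(phase_error_deg(freq_hz, timing_offset_s))
    magnitude = abs(math.cos(phi / 2.0))
    if magnitude < 1e-12:
        return float("-inf")   # complete cancellation
    return 20.0 * math.log10(magnitude)

# Offsets on the order of those reported above: 100 us versus 1 us.
for offset_s in (100e-6, 1e-6):
    for freq_hz in (1000.0, 10000.0):
        print(f"offset {offset_s * 1e6:5.0f} us, {freq_hz:7.0f} Hz: "
              f"phase {phase_error_deg(freq_hz, offset_s):7.2f} deg, "
              f"sum change {summation_loss_db(freq_hz, offset_s):8.4f} dB")
```

A 100 µs offset already corresponds to 36° of phase at 1 kHz and a full cycle at 10 kHz, whereas a 1 µs offset stays at a few degrees even at 10 kHz.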
The Saul Walker Student Design Competition is a long-running event of the Audio Engineering Society that highlights practical and creative work in audio design. It brings together experienced judges and a wide range of strong student submissions each year.
During this session, students from around the world will present their projects and bring their hardware designs for hands-on inspection by the judges. The format encourages open discussion, giving attendees a chance to hear how ideas are evaluated and improved in a professional setting.
Sponsored by API, the competition includes cash prizes for the winners. More importantly, it offers students valuable feedback and the opportunity to connect with people working in the industry. The session is open to everyone—students and non-students alike—who are interested in seeing what participants have created and learning more about current work in audio design.
Jamie Angus-Whiteoak is Emeritus Professor of Audio Technology at Salford University and VP for Northern Europe.
Director of Music Media Production, AES Education Committee, Ball State University
Christoph Thompson is vice-chair of the AES audio education committee. He is the chair of the AES Student Design Competition and the Matlab Plugin Design Competition. He is the director of the music media production program at Ball State University. His research topics include audio...
Sascha Disch received his Dipl.-Ing. degree in electrical engineering from the Technical University Hamburg-Harburg (TUHH) in 1999 and joined the Fraunhofer Institute for Integrated Circuits (IIS) the same year. Ever since he has been working in research and development of perceptual...
Friday May 29, 2026 12:00pm - 1:30pm CEST, Aud 49
Loudspeaker array beamforming technology has been widely used; however, current frequency-domain and time-domain design methods for calculating FIR filters face challenges, including the need for modeling delay and high computational complexity. To address these issues, this paper proposes a time–frequency integrated framework. This framework supports both pressure matching and amplitude matching methods, enabling not only the realization of traditional superdirective beams but also the design of frequency-invariant beams. For the nonlinear optimization problem in amplitude matching, an efficient solving algorithm based on the Alternating Direction Method of Multipliers (ADMM) is introduced. Experimental results demonstrate that the proposed method combines the advantages of existing frequency-domain and time-domain approaches, directly computing FIR filter coefficients without delay modeling while maintaining high computational efficiency. This provides an effective solution for beam control in loudspeaker arrays.
Headphone listening has become an integral part of everyday life, spanning music consumption, communication, online media, and increasingly, computer gaming. These diverse listening contexts make individual sound exposure highly variable and difficult to quantify. While music listening and occupational headphone use have been widely studied, sound exposure from gaming remains comparatively undocumented. This study investigated the relationship between self‑reported exposure through headphones and cochlear function assessed using transient evoked otoacoustic emissions (TEOAE). Forty‑one university students completed a detailed questionnaire on listening habits, and TEOAEs were recorded in both ears across five half‑octave frequency bands. Estimated weekly exposure levels were derived from participants’ reported durations and contexts of use. TEOAE amplitude, signal‑to‑noise ratio (SNR), and reproducibility showed clear frequency‑dependent patterns and small ear asymmetries, consistent with typical OAE behaviour. Only limited associations were found between self‑reported exposure and TEOAE measures, with significant effects emerging primarily for SNR and reproducibility in the highest‑exposure group. No consistent differences were observed between long‑term gamers and non‑gamers. These findings suggest that self‑reported exposure alone may be insufficient to detect subtle cochlear changes in young adults, and underscore the need for more precise exposure‑monitoring methods when evaluating recreational sound exposure risks.
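The abstract does not say how the weekly exposure levels were estimated. One common approach, shown here purely as an assumption, is the equal-energy principle: each activity's level and weekly duration are combined into a single equivalent level normalized to a 40-hour reference week. The listening habits below are invented example values.

```python
import math

def weekly_exposure_dba(activities, reference_hours=40.0):
    """Equal-energy weekly exposure level normalized to reference_hours.

    activities: iterable of (level_dba, hours_per_week) pairs.
    """
    energy = sum(hours * 10.0 ** (level_dba / 10.0)
                 for level_dba, hours in activities)
    return 10.0 * math.log10(energy / reference_hours)

# Hypothetical questionnaire answers: (estimated level in dBA, hours per week).
habits = [
    (80.0, 10.0),   # music listening
    (75.0, 20.0),   # gaming
]

weekly_level = weekly_exposure_dba(habits)
print(f"Estimated weekly exposure: {weekly_level:.1f} dBA (40 h reference)")
```

Since both the levels and the durations are self-reported, errors in either feed directly into the estimate, which is consistent with the abstract's call for more precise exposure monitoring.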
Join us for a panel discussion about audio design featuring some of the industry’s leading audio designers and educators. This session is meant to inspire upcoming designers and encourage dialogue with established audio designers.
The panelists will give a brief overview of their designs, their roles in the AES, and how and why educators and students should participate in the various design competitions that the AES has to offer. The panel discussion is followed by a Q&A session that allows for questions and exchange with the panelists.
Jamie Angus-Whiteoak is Emeritus Professor of Audio Technology at Salford University and VP for Northern Europe.
Her interest in audio was crystallized at age 11 when she visited the WOR studios, NYC, in 1967 on a school trip. After this she was hooked, and spent much of her free ti...
Director of Music Media Production, AES Education Committee, Ball State University
Christoph Thompson is vice-chair of the AES audio education committee. He is the chair of the AES Student Design Competition and the Matlab Plugin Design Competition. He is the director of the music media production program at Ball State University. His research topics include audio...
Friday May 29, 2026 2:00pm - 3:00pm CEST Aud 41, Technical University of Denmark, Asmussens Alle, Building 303A, DK-2800 Kgs. Lyngby, Denmark
Immersive audio continues to expand beyond traditional studio and terrestrial field-recording environments, yet underwater soundscapes, particularly those involving marine mammals, remain largely documented in mono or stereo formats. This paper presents a practical and low-cost approach for capturing immersive underwater audio using a newly developed wideband hydrophone and a multichannel array optimized for marine environments. The hydrophones, designed by the author, feature a low noise floor, extended frequency response exceeding 100 kHz, and direct compatibility with standard P48 phantom-powered audio recorders, enabling deployment without specialized underwater preamplifiers or power systems.
To translate established immersive recording techniques into the ocean environment, an array architecture was developed based on a compact eight-element cube geometry. Two array variants were constructed to account for the significantly higher speed of sound in water compared to air, allowing the spatial characteristics of underwater sources to be captured with appropriate inter-element spacing. Field recordings were conducted off the coast of Hawaii in January during the peak season for humpback whale song. Recordings were made at multiple depths and positions to explore variations in reverberation, propagation, and ambient biological activity.
Preliminary results indicate that the system captures detailed spatial cues from humpback whale vocalizations while simultaneously preserving the rich ambient marine soundscape. The extended ultrasonic response further allows slowed or pitch-shifted playback to reveal fine temporal structures not typically audible. This work demonstrates a feasible method for immersive underwater recording and provides a foundation for both scientific research and creative content production.
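The remark about inter-element spacing can be illustrated with a rough calculation. The sketch below assumes nominal sound speeds (about 343 m/s in air and about 1500 m/s in seawater, which varies with temperature, salinity, and depth) and a hypothetical 0.20 m in-air spacing; none of these numbers come from the paper. It only shows that, to keep the same maximum inter-element time difference, a spacing would need to grow by the ratio of the two sound speeds.

```python
C_AIR = 343.0     # m/s, approximate value for air at about 20 °C
C_WATER = 1500.0  # m/s, nominal seawater value (assumption; varies with temperature, salinity, depth)


def max_time_difference_s(spacing_m, speed_m_s):
    """Largest arrival-time difference between two elements (wave travelling along their axis)."""
    return spacing_m / speed_m_s


def equivalent_spacing_in_water_m(spacing_air_m):
    """Spacing in water that reproduces the maximum time difference of an in-air pair."""
    return spacing_air_m * C_WATER / C_AIR


air_spacing_m = 0.20  # hypothetical in-air spacing, not taken from the paper
water_spacing_m = equivalent_spacing_in_water_m(air_spacing_m)
print(f"{water_spacing_m:.2f} m in water matches {air_spacing_m:.2f} m in air")
```

With these assumed values the scale factor is roughly 4.4, which suggests why an array dimensioned for air would be too small to produce comparable time-of-arrival cues underwater.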
Jules's career with audio and electronics started early. At 16 he built an analog synthesizer from a PAiA kit. While still in high school, he designed and built a mixing board, then started doing sound for local bands. Jules went to college, studied physics, and then joined the US Navy, where he spent 20 years as a nuclear submariner. In between submarines, he was an instructor at the Naval Nuclear Power School in Orlando, Florida. He taught Reactor Kinetics by day, and spent many a night in local...
Friday May 29, 2026 2:30pm - 3:00pm CEST Aud 31, Technical University of Denmark, Asmussens Alle, Building 306, DK-2800 Kgs. Lyngby, Denmark
The inherent narrowing of directivity at high frequencies in compact tweeters limits the spatial uniformity of sound reproduction in modern audio systems. Conventional passive solutions, such as waveguides and acoustic lenses, partially mitigate this issue but typically rely on bulky geometries and treat the diaphragm as a unitary radiator, neglecting localized vibration behavior. This study proposes a Matrix Wavefront Modulator (MWM), a compact passive device that implements a differentiated wavefront-shaping strategy based on vibration-aware radiation control. Sound radiation from the piston-like diaphragm dome and the breakup-prone surround is processed independently by combining guided wavefront steering with targeted scattering compensation. The geometry of the MWM is optimized to adapt to the radiation characteristics of the tweeter. Numerical simulations show that the optimized MWM reshapes the high-frequency wavefront toward a more spherical distribution and significantly reduces off-axis attenuation above 10 kHz. Experimental measurements confirm significant improvements in high-frequency directivity over wide radiation angles.
Saturday May 30, 2026 9:00am - 11:00am CEST Foyer Building 303A, Technical University of Denmark, Asmussens Alle, Building 303A, DK-2800 Kgs. Lyngby, Denmark
Loudspeaker spider suspensions play a crucial role in defining the compliance and stability of electrodynamic transducers. Due to their woven structure impregnated with thermosetting resins, spiders exhibit a nonlinear and viscoelastic mechanical response, resulting in stiffness dependence on displacement and excitation rate, as well as energy dissipation during operation. However, viscoelastic effects are often simplified during early loudspeaker design stages. This work presents a combined numerical–experimental study aimed at characterizing the mechanical behaviour of loudspeaker spiders and assessing its influence on optimization choices during the pre-design phase. An experimental campaign was conducted on spider samples with fixed geometry and varying materials. Loading–unloading cycle measurements were performed at different displacement rates to capture nonlinear stiffness and hysteresis effects. A finite element modelling framework was developed using a 2D axisymmetric formulation. Viscoelastic material behaviour was first described through time-dependent simulations, with model parameters identified by fitting simulated loading–unloading curves to experimental data. A parametric geometry optimization model based on linear elastic assumptions was then implemented using quasi-static simulations. Finally, the optimized spider geometries were re-evaluated using time-dependent simulations incorporating the identified viscoelastic material properties. Results show that the spider material may influence its mechanical behaviour, in particular the suspension stiffness and hysteresis effects. Viscoelasticity mainly affects the magnitude of the stiffness curve rather than its overall shape, particularly at small displacements. These findings support the use of quasi-static linear elastic simulations for geometry optimization in early loudspeaker design, while highlighting the importance of material characterization for accurate performance prediction.
Chiara joined Faital S.p.A. in 2018, working as a FEM analyst in the R&D Department. Her research activities are focused on thermal phenomena associated with loudspeaker functioning, and on the mechanical behavior of the speaker's moving parts. To this end, she uses FEM and lumped parameter...
Measuring the anechoic response of a loudspeaker system requires space and facilities that are not commonly available. The evolution of measurement instruments has made it possible to visualize the time response of the system under analysis, enabling the identification of reflected signals and their elimination through time-gating (windowing) of the impulse response. However, this comes at the cost of a loss of resolution and characterization of the system's response at lower frequencies. To correctly characterize the system's response at the lowest frequencies, the most widely used technique is the one described by Keele in his AES paper "Low-Frequency Loudspeaker Assessment by Nearfield Sound-Pressure Measurement". To obtain the overall system response, the appropriately windowed far-field response and the near-field response are combined, as described by Struck and Temme in their paper "Simulated Free Field Measurements". This operation is performed in the frequency domain, but what happens when it is applied in the time domain? The goal of this work is to use the near-field impulse response to reconstruct the far-field portion of the impulse response affected by environmental reflections. As already stated, it’s quite easy to identify the first reflection point on a far-field impulse response; this can be used as a merging point to reconstruct the reflection-affected impulse tail using the corresponding part of the near-field impulse measurement. Once the far-field impulse tail is reconstructed, it is possible to obtain the full-range frequency response of the system under test while maintaining maximum measurement resolution. The steps required to achieve a full-range frequency response are fewer than those required for the frequency-domain technique. For example, it is not necessary to add the baffle diffraction step effect, as demonstrated in the paper.
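The trade-off and the merging idea described above can be pictured with a small sketch. It is illustrative only and is not the authors' procedure: the geometry (source and microphone 1 m above the floor, 1 m apart), the function names, and the simple level-matching gain are all assumptions, and the two impulse responses are assumed to be time-aligned at the same sample rate. The first part estimates the reflection-free window and the resulting low-frequency limit of a gated measurement (roughly the reciprocal of the window length); the second splices a scaled near-field tail onto the far-field response at the first reflection.

```python
import math

C_AIR = 343.0  # m/s, approximate speed of sound in air


def reflection_free_window_s(direct_path_m, reflected_path_m, speed_m_s=C_AIR):
    """Time between the direct arrival and the first reflection."""
    return (reflected_path_m - direct_path_m) / speed_m_s


def lowest_resolved_frequency_hz(window_s):
    """Approximate frequency resolution of a response gated to window_s seconds."""
    return 1.0 / window_s


def splice_impulse_responses(far_ir, near_ir, merge_index, near_gain=1.0):
    """Keep far-field samples before merge_index, then continue with the scaled near-field tail."""
    if not 0 <= merge_index <= min(len(far_ir), len(near_ir)):
        raise ValueError("merge_index must lie inside both impulse responses")
    return list(far_ir[:merge_index]) + [near_gain * x for x in near_ir[merge_index:]]


# Hypothetical geometry: floor bounce with source and microphone 1 m high, 1 m apart.
direct_m = 1.0
reflected_m = math.sqrt(1.0 ** 2 + 2.0 ** 2)
window_s = reflection_free_window_s(direct_m, reflected_m)
print(f"window {window_s * 1e3:.2f} ms, resolution about {lowest_resolved_frequency_hz(window_s):.0f} Hz")

# Toy impulse responses (made-up numbers) merged at sample 2.
merged = splice_impulse_responses([1.0, 0.5, 0.9, 0.8], [2.0, 1.0, 0.2, 0.1], 2, near_gain=0.5)
```

In practice the merge index would come from the measured first-reflection time, and the gain and alignment between the two measurements would need far more care than this toy example shows.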
This paper investigates mid-to-high-frequency distortion in traditional electrodynamic loudspeakers arising from current-dependent nonlinearity in the magnetic circuit. Through theoretical analysis, finite-element simulations, and experimental validation, the dominant distortion mechanisms are identified. To mitigate distortion while maintaining a stable frequency response, an improved magnetic circuit is proposed, which introduces longitudinal slits to suppress surface-concentrated eddy currents. Experimental results demonstrate that the modified circuit achieves greater distortion reduction compared with conventional designs. As the improvement relies solely on structural modifications without changing the ferromagnetic materials, the proposed design offers a practical and cost-effective solution for engineering applications.
The Panel-shaped Bending Wave Loudspeaker was proposed recently by Kawahara. In this paper, the authors conducted an objective evaluation of the diffusion characteristics of Bending Wave Loudspeakers (BWL) using the degree of interaural cross-correlation (DICC). Conventional speakers exhibit strong directionality and rely on room reflections to create a spatial impression. In contrast, BWLs are considered less susceptible to room reflections due to complex mode vibrations across the entire diaphragm. To quantify this characteristic, the authors recorded sound in a real-world environment using a head-and-torso simulator (HATS) and compared the BWL's DICC with that of a conventional speaker. The results showed that the BWL exhibited significantly lower DICC values than the conventional loudspeaker at the front position (Center) under both broadband noise and music conditions, confirming its high diffusivity. Furthermore, this difference exceeded the Just Noticeable Difference (JND) for spatial perception, suggesting it is also significant to the human ear. In addition, analysis separating early reflections and late reflections suggested differences in diffusion characteristics between conventional speakers and BWLs.
The Zylia ZM-1 (19 MEMS capsules, spherical array, 88 mm diameter, 3rd-order) and the Harpex Spcmic (84 MEMS capsules, planar array, 230 mm diameter, 5th-order capable) represent two distinct geometrical approaches to higher-order Ambisonics capture. Despite widespread adoption in research and production, systematic comparison of their performance in real-world recordings remains absent from published literature. This case study presents a controlled comparison through simultaneous recordings of piano recitals in the same concert hall.
Two arrays, the Zylia ZM-1 and the Harpex Spcmic, were mounted on a single stereo bar (17 cm apart), ensuring acoustically identical capture positions. Recording sessions occurred in Aula Politechniki Gdańskiej (370-seat hall, RT60 = 1.97 s) on two dates: August 15, 2024 (Franck: Prélude, Choral et Fugue; Prokofiev: Piano Sonata No. 4, 35.6 minutes total) and April 30, 2024 (Ginastera: Sonata No. 1, Op. 22, 15.4 minutes). Both arrays recorded simultaneously; files were processed through manufacturer A-to-B conversion software and peak-normalized to −0.5 dBTP. The Spcmic was encoded to both native 5th-order and truncated 3rd-order formats for direct comparison with the ZM-1.
Four metrics were analyzed: (1) W-channel spectral response, (2) integrated loudness (LUFS-I per ITU-R BS.1770-5), (3) spatial energy distribution across Ambisonics orders, and (4) first-order directional component ratios.
Spectral analysis reveals the ZM-1 exhibits 5–8 dB elevation at 200–600 Hz relative to the Spcmic. Loudness measurements show the Spcmic 3rd-order yields 2.3–3.3 dB higher LUFS-I than the ZM-1 despite identical peak normalization.
The primary finding concerns spatial energy: the ZM-1 exhibits 27.4 dB attenuation from 0th to 3rd order, while the Spcmic shows only 8.4 dB—a 19 dB difference despite both producing "3rd-order Ambisonics" format. Analysis of both recording sessions confirms consistency across different repertoire (romantic, 20th-century, contemporary). Directional analysis shows the Spcmic exhibits stronger first-order components (X/Y/Z ratios 0.68–0.83) versus the ZM-1 (0.42–0.55).
Results demonstrate that nominal Ambisonics order inadequately characterizes spatial resolution in real recordings. The substantial higher-order energy deficit in compact spherical arrays has implications for reproduction quality, decoder design, and archival standards. Arrays with steeper rolloff may require order-dependent gain compensation to match spatial impression of larger systems.
This case study complements existing anechoic validation by demonstrating performance differences in authentic recording conditions. Recordings are part of a publicly available HOA corpus (Gdańsk University of Technology repository).
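A metric such as the spatial energy distribution across Ambisonics orders can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's analysis code: it assumes ACN channel ordering (where order n occupies channels n² to (n+1)² − 1), sums energy over all channels of an order without dividing by the channel count, and ignores the normalization convention (SN3D or N3D), all of which affect the absolute numbers. The function names and the toy signal are hypothetical.

```python
import math


def order_of_channel(acn_index):
    """Ambisonics order of a channel in ACN ordering (0 -> 0, 1..3 -> 1, 4..8 -> 2, ...)."""
    return math.isqrt(acn_index)


def energy_per_order_db(channels):
    """Total signal energy per order, in dB relative to order 0.

    channels: list of per-channel sample lists, in ACN order.
    """
    energies = {}
    for acn_index, samples in enumerate(channels):
        order = order_of_channel(acn_index)
        energies[order] = energies.get(order, 0.0) + sum(x * x for x in samples)
    reference = energies[0]
    if reference <= 0.0:
        raise ValueError("order-0 (W) channel has no energy")
    return {
        order: (10.0 * math.log10(e / reference) if e > 0.0 else float("-inf"))
        for order, e in sorted(energies.items())
    }


# Toy first-order signal: W at amplitude 1.0, three first-order channels at 0.5.
toy = [[1.0] * 4] + [[0.5] * 4 for _ in range(3)]
print(energy_per_order_db(toy))
```

A rolloff figure like the 0th-to-3rd-order attenuation quoted above would correspond to the difference between the order-0 and order-3 entries, subject to the same assumptions about summation and normalization.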
String ageing is a familiar and perceptually important phenomenon for guitarists and players of other stringed instruments. From the moment a new set of strings is installed, the sound they produce when excited begins to change due to a combination of chemical degradation, corrosion, and mechanical wear arising from playing. Musicians commonly report that aged strings sound dull, lack sustain, and feel less responsive compared to new strings. String ageing is a function of both elapsed time and accumulated playing time, with repeated playing accelerating degradation through contamination and repeated mechanical stress.
Previous studies have investigated individual aspects of string ageing by artificially accelerating wear and performing controlled acoustic measurements, identifying effects such as increased damping of higher partials and increased inharmonicity. While these approaches provide valuable physical insight, the tightly constrained experimental conditions differ significantly from real-world playing conditions.
This paper presents a dataset of audio recordings of guitar playing, starting from the point at which new strings are installed. Audio performance data from different sets of electric guitar strings are recorded daily over a four-week period, using strictly fixed musical exercises that are repeated multiple times per session. By collecting many takes of identical material at each stage of string age, the dataset enables statistical analysis of ageing-related changes while accounting for natural performance variability.
The dataset is intended to support exploratory machine learning investigations into string ageing, including questions of how ageing manifests over time and playing duration, whether string age can be predicted from audio alone, and which audio features or learned representations capture perceptually relevant aspects of the ageing process.
Thomas McKenzie is a Lecturer in Acoustics and Architectural Acoustics at the Reid School of Music, Edinburgh College of Art, University of Edinburgh, UK. He completed a B.Sc. in Music, Multimedia, and Electronics at the University of Leeds, UK, in 2013, before completing his M.Sc...
Tape recording of audio programme produces significant noise signals underlying the audio signal. Measurements show that total modulation noise is significant, often around 25 dB down from a sinusoidal audio signal, although historical measurement methods give numbers that may exceed 50 dB. The persistent popularity of tape in the audio industry may indicate a preference for some of the more salient tape characteristics, perhaps even modulation noise. Measurements on a variety of tapes and machines are presented in an attempt to understand the basic principles. A model of modulation noise is developed which provides a broad steepening spectral peak centred on the signal frequency and captures much of the tape noise character. This could be the basis of a plug-in to simulate such noise. A new measurement method is presented, culminating in a single plot which gives a useful, more complete picture of modulation noise.
Most contemporary immersive audio production workflows are centered on discrete channel-based loudspeaker formats such as 7.1.4. These formats are rarely experienced by most consumers and listeners, particularly in music playback. In practice, spatial audio is predominantly delivered via binaural reproduction. Beyond headphones, head-tracked loudspeaker array systems now enable convincing binaural reproduction in a practical, listener-centric manner, unlocking spatial audio over loudspeakers for ordinary listeners. This positions binaural reproduction not as a secondary translation, but as the core delivery format for immersive audio consumption.
Creating primarily for fixed speaker layouts can impose creative and technical constraints often resulting in restrained spatial design when content is later rendered binaurally. This workshop advocates a binaural-centric approach to spatial audio creation, treating binaural as the main deliverable, while preserving compatibility with discrete channel-based systems. Through discussion and practical examples, we will explore how designing with binaural in mind enables more expressive, perceptually robust, and immersive experiences across both headphone and loudspeaker-based binaural playback, without relying on traditional 7.1.4-centric production models.
Most available music content is stereo, which provides inadequate spatial treatment: listeners feel disconnected from the music, which fails to transport them into the intended sonic environment. Insufficient separation between instruments can lead to an unbalanced mix, where certain elements dominate others and disrupt the overall harmony. Instruments may appear flat and confined to a narrow area, reducing the sense of dimensionality in the mix. Stereo audio offers limited spatial information, restricting its adaptability to immersive sound environments. This research presents a novel approach for converting stereo audio into a personalized immersive experience by leveraging object-based audio rendering, the listener's sound stage, and surround speaker capability. The proposed system separates audio signals into individual objects (such as instruments or vocals) and dynamically maps these objects to specific speakers based on personalized preferences and spatial configurations. This method improves audio localization and enhances the listener's engagement by delivering a tailored auditory experience.
I work as a software developer at Samsung Research Institute India - Delhi and am responsible for developing features related to Samsung sound devices.
Imagine that you have just finished designing, and are now managing, your dream immersive audio mix room for a client, with an array of 64 speakers, and it functions beautifully. Then COVID-19 wreaks global havoc. You find yourself suddenly isolated in a new country, forced into retirement with its budgetary restrictions, and your dream studio has become an early victim of the pandemic. What would be your next move?
In this real-life story, follow the adventures of an intrepid audio engineer and his quest to build a personal version of that immersive studio that was lost – all within a fixed-income retiree’s budget.
In this tutorial, an immersive studio design and construction will be described including:
Inspiration from prior work by the author and colleagues
Room design goals
Equipment choices
Custom electronics design
Speaker design considerations
Speaker support and position alignment
Construction steps
VBAP, Ambisonics, and WFS approaches
Test mixes
The bone-conducted occlusion effect (OE) is a major source of acoustic discomfort for users of hearing aids, earbuds, earplugs, and related devices. Conventional objective OE measurements rely on in-ear microphones in human subjects, which are time-consuming, invasive, and difficult to control during product development. The aim of this paper is to present a new artificial ear, specifically designed for objective OE measurements under bone-conducted excitation, coupled with a finite element analysis (FEA) model developed in COMSOL Multiphysics. Both the model and the artificial ear demonstrate good agreement regarding the sound pressure found at the tympanic membrane for a conventional dome at shallow, medium, and deep insertions. The validated FEA model is then used to perform a virtual test of the bone-conducted objective OE for different occluding devices, including plastic and foam earplugs and a conventional closed dome for hearing aids. This is to investigate how the relative contributions and phases of the ear-canal and device surfaces govern the resulting occluded sound pressure. The proposed artificial ear and modeling approach provide a controlled and repeatable platform for studying the OE and for evaluating occluding devices during the development process.
Accurate and efficient measurement of sound pressure levels around the ears of occupants in cars is essential for objective evaluation of basic sound quality and automotive audio features such as personal sound zones and active noise control. In this paper, the uncertainties of sound pressure measurements obtained with 5 commonly used methods are compared: the AES 6-microphone method, the single-microphone method, the two-microphone method with occupants present, the head-and-torso simulator method, and the human binaural method. Measurements were conducted in the front-right seat of a 4-door electric sedan, using either all car body loudspeakers or a pair of headrest loudspeakers driven by two-channel uncorrelated pink noise to generate an average sound pressure level of 70 dBA in the seat. Each method underwent 3 complete install–measure–remove cycles, for a total of 54 recordings, and the standard deviation of the measured average sound pressure levels was adopted to quantify measurement uncertainty. The test results show that all 5 methods have good repeatability and low uncertainty below 200 Hz and above 15 kHz, but have large uncertainty between 200 Hz and 15 kHz. The AES 6-microphone method demonstrates the best repeatability with the lowest uncertainty across most frequency resolutions, and its maximum uncertainty in 1/3 octave bands is less than 2.0 dB for sound pressure measurements in the car. Therefore, the AES 6-microphone method is recommended for use in engineering comparison and reporting.
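The uncertainty measure described above, a standard deviation of average levels across repeated install–measure–remove cycles, can be sketched in a few lines. The details here are assumptions rather than the paper's procedure: microphone levels within a cycle are averaged on an energy basis, the sample (n − 1) standard deviation is used, and the numbers are invented for illustration.

```python
import math
import statistics


def energetic_mean_db(levels_db):
    """Average several levels in dB on an energy (mean-square pressure) basis."""
    mean_power = sum(10.0 ** (level / 10.0) for level in levels_db) / len(levels_db)
    return 10.0 * math.log10(mean_power)


def repeatability_db(cycles):
    """Sample standard deviation of the per-cycle average level for one frequency band.

    cycles: one list of microphone levels (dB) per install-measure-remove cycle.
    """
    per_cycle_average = [energetic_mean_db(levels) for levels in cycles]
    return statistics.stdev(per_cycle_average)


# Hypothetical band levels from three cycles of a two-microphone method.
band_cycles = [[70.0, 70.0], [71.0, 71.0], [72.0, 72.0]]
print(f"uncertainty: {repeatability_db(band_cycles):.2f} dB")
```

Repeating this per frequency band (for example per 1/3-octave band) would give an uncertainty curve of the kind the abstract compares between methods.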
Linear loudspeaker parameters are often estimated via fitting of transfer functions, under the assumption of linearity. This paper investigates the corruption of the measurement caused by nonlinearities in the system and presents a new and improved method for separating the true linear response from the nonlinear components by analyzing a sequence of measurements done at different levels. The method is improved by analyzing the influence of the chosen measurement levels as well as the measurement time at each level, and numerically optimal values are presented for the most typical cases of nonlinear behaviour. While the influence of noise and nonlinear distortion can be eliminated completely in the case of finite orders of nonlinearities in the system, the method is also shown to provide improved accuracy in the more realistic case where all orders are present but only a finite number of them dominate.
My interests are loudspeakers (measurements, modelling, (nonlinear) parameter estimation, nonlinear compensation), active noise control, and indoor and outdoor sound field control.
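The general idea of using several drive levels to separate the linear response from nonlinear contamination can be pictured with a toy model. This is not the paper's algorithm; it is a sketch assuming a memoryless system y = h1·x + h3·x³, for which a sine of amplitude a produces an apparent gain at the fundamental of h1 + 0.75·h3·a². Measuring at two amplitudes then allows the a² term to be cancelled. All names and values are hypothetical.

```python
def apparent_gain(h1, h3, amplitude):
    """Fundamental-frequency gain of y = h1*x + h3*x**3 driven by a sine of the given amplitude."""
    return h1 + 0.75 * h3 * amplitude ** 2


def extrapolate_linear_gain(gain_low, gain_high, amp_low, amp_high):
    """Cancel the amplitude-squared term using gains measured at two drive levels."""
    if amp_low == amp_high:
        raise ValueError("the two measurement levels must differ")
    weight_low, weight_high = amp_high ** 2, amp_low ** 2
    return (weight_low * gain_low - weight_high * gain_high) / (weight_low - weight_high)


# Hypothetical system: true linear gain 2.0 with a small compressive cubic term.
true_h1, true_h3 = 2.0, -0.1
g_low = apparent_gain(true_h1, true_h3, 1.0)   # gain measured at a low drive level
g_high = apparent_gain(true_h1, true_h3, 2.0)  # gain measured at a higher drive level
estimate = extrapolate_linear_gain(g_low, g_high, 1.0, 2.0)
print(g_low, g_high, estimate)
```

In this idealized case both raw measurements underestimate the linear gain, while the two-level combination recovers it. With noise and higher-order terms present, more levels would be needed, and the choice of levels and measurement time per level becomes a trade-off, which is the question the abstract addresses.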
Saturday May 30, 2026 2:00pm - 2:30pm CEST Aud 43, Technical University of Denmark, Asmussens Alle, Building 303A, DK-2800 Kgs. Lyngby, Denmark