Name: A Parametric Dual-Channel Audio Coding via Learned Time-Frequency Masking
Start: 2026-05-28T16:30:00+0200
End: 2026-05-28T17:00:00+0200



A Parametric Dual-Channel Audio Coding via Learned Time-Frequency Masking

Thursday May 28, 2026 4:30pm - 5:00pm CEST

Aud 42

While Neural Audio Codecs (NAC) have revolutionized
monaural audio compression, achieving high-fidelity
dual-channel coding at low bitrates remains a significant
challenge. Existing approaches often rely on naive
independent channel quantization, leading to phase
incoherence, or entangled latent modeling, which sacrifices
spatial precision for spectral energy. This paper proposes
a novel dual-channel coding framework based on
content-spatial disentanglement. Reframing spatial
reconstruction as an informed source separation task, our
architecture synergizes a frozen, pre-trained DAC encoder
for robust mono content preservation with a
parameter-efficient side information encoder that predicts
fine-grained time-frequency masks. To ensure precise
spatial imaging, we introduce explicit physical constraints
into the end-to-end training. Experimental results indicate
that at low bitrates of 9 and 11 kbps, the proposed method
outperforms state-of-the-art dual-mono neural baselines and
industry standards in both objective spatial metrics and
subjective MUSHRA evaluations.
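
To make the masking idea in the abstract concrete, here is a minimal NumPy sketch of rebuilding two channels from a mono downmix with time-frequency masks. It is an illustration under stated assumptions, not the authors' system: the masks are oracle magnitude ratios computed from the original stereo signal rather than predicted by a side information encoder, the mono signal is left uncoded (no DAC encoder or quantization), and the names stft, oracle_masks and apply_masks as well as the frame and hop sizes are hypothetical choices made for this example.

```python
import numpy as np


def stft(x, frame=512, hop=256):
    """Hann-windowed STFT; returns an array of shape (frames, frame // 2 + 1)."""
    window = np.hanning(frame)
    starts = range(0, len(x) - frame + 1, hop)
    return np.array([np.fft.rfft(window * x[s:s + frame]) for s in starts])


def oracle_masks(left, right, eps=1e-8):
    """Return the mono downmix spectrum and one magnitude-ratio mask per channel.

    Assumption: the masks are computed from the true stereo signal. A codec
    would instead have to predict or transmit them as side information.
    """
    spec_l, spec_r = stft(left), stft(right)
    spec_m = 0.5 * (spec_l + spec_r)  # the STFT is linear, so this is the downmix
    mask_l = np.abs(spec_l) / (np.abs(spec_m) + eps)
    mask_r = np.abs(spec_r) / (np.abs(spec_m) + eps)
    return spec_m, mask_l, mask_r


def apply_masks(spec_m, mask_l, mask_r):
    """Scale the mono spectrum per time-frequency bin; the mono phase is reused."""
    return mask_l * spec_m, mask_r * spec_m


# Toy stereo scene: one 440 Hz tone panned towards the left channel.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
source = np.sin(2 * np.pi * 440 * t)
left, right = 0.8 * source, 0.3 * source

spec_m, mask_l, mask_r = oracle_masks(left, right)
est_l, est_r = apply_masks(spec_m, mask_l, mask_r)

# Energy ratio between the rebuilt channels, which should follow the panning gains.
level_ratio = np.sum(np.abs(est_l) ** 2) / np.sum(np.abs(est_r) ** 2)
```

Because these masks are real-valued, both rebuilt channels inherit the phase of the mono downmix. Inter-channel level differences are restored, but inter-channel phase or time differences are not, so a scene that depends on them would need complex-valued masks or additional cues.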

Authors

Qingbo Huang
Tianshu Qu
Yihan Wang
Yufan Qian

AI and Machine Learning in Audio, Lecture | Audio Processing, Lecture

Presentation Type: Lecture

AES Europe 2026
