While Neural Audio Codecs (NACs) have revolutionized monaural audio compression, achieving high-fidelity dual-channel coding at low bitrates remains a significant challenge. Existing approaches often rely either on naive independent channel quantization, which leads to phase incoherence between channels, or on entangled latent modeling, which sacrifices spatial precision for spectral energy. This paper proposes a novel dual-channel coding framework based on content-spatial disentanglement. Reframing spatial reconstruction as an informed source separation task, our architecture combines a frozen, pre-trained DAC encoder, which preserves the mono content robustly, with a parameter-efficient side-information encoder that predicts fine-grained time-frequency masks. To ensure precise spatial imaging, we introduce explicit physical constraints into the end-to-end training. Experimental results indicate that at low bitrates of 9 and 11 kbps, the proposed method outperforms state-of-the-art dual-mono neural baselines and industry standards in both objective spatial metrics and subjective MUSHRA evaluations.
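As a rough illustration of the mask-based view of spatial reconstruction, the sketch below recovers two channel spectrograms from a mono downmix by applying one time-frequency mask per channel. It is a toy under stated assumptions, not the proposed architecture: the averaging downmix, the complex ratio mask form, and the function names `downmix`, `oracle_masks`, and `apply_masks` are all hypothetical, and the masks are computed from the ground-truth channels rather than predicted by a side-information encoder.

```python
import numpy as np

EPS = 1e-12  # assumed regulariser; avoids division by zero in silent bins


def downmix(left, right):
    """Mono content signal: the average of the two channel spectrograms."""
    return 0.5 * (left + right)


def oracle_masks(left, right, mono):
    """Complex ratio masks mapping the mono spectrogram back to each channel.

    Hypothetical stand-in for masks that a side-information encoder
    would have to predict from a compact bitstream.
    """
    denom = np.abs(mono) ** 2 + EPS
    mask_left = left * np.conj(mono) / denom
    mask_right = right * np.conj(mono) / denom
    return mask_left, mask_right


def apply_masks(mono, mask_left, mask_right):
    """Reconstruct both channels by masking the mono spectrogram bin by bin."""
    return mask_left * mono, mask_right * mono
```

With exact masks the two channels are recovered almost perfectly wherever the mono spectrogram is non-zero, so in this simplified view the spatial image depends entirely on how accurately the masks can be conveyed at the available bitrate.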