We construct reduced-order models of aeroacoustic sources for single and twin subsonic jets ($M_j=0.9$, $Re=3600$), with the goal of accurately recovering the far-field sound over a wide band of frequencies $St=[0.07,1.0]$ and directivity angles $\phi = [30^{\circ },120^{\circ }]$ within a subdecibel level accuracy. These models are realized via combining spatio-temporally coherent spectral proper orthogonal decomposition (SPOD) modes extracted directly from Lighthill's stress tensor, itself calculated using large-eddy simulation (LES). We consider two sets of twin subsonic jets of diameter $D$ each, with spacings of $0.1D$ and $1D$, where the jets merge upstream and downstream of breakdown, respectively. The closely spaced twin jet decays the slowest due to reduced turbulent stresses which are, however, more broadband due to early merging. Such jets show strong shielding in the plane of jets, especially at shallow directivity angles where sound levels may drop below that of the single jet. The farther spaced twin jets have dynamics more akin to the constituent single jet with turbulent fluctuations peaking here at $St=0.34$, but showing very little shielding, with their overall sound pressure level (OASPL) mostly linked to the nature of extra flow structures created during merging. Three-dimensional, energy-ranked, coherent structures for twin jets exhibit rather poor low-rank behaviour, especially at the far-field spectral peak $St=0.14$. At $St \gtrsim 0.3$, the SPOD wavepackets show strong visual coherence, resembling Kelvin–Helmholtz instability modes upstream of breakdown, while at the lower frequencies, there is very little spatial coherence with wavepackets peaking downstream of breakdown. Although the leading SPOD modes radiate poorly, reduced-order models using a subset of them, up to $45$ SPOD modes per frequency, show a remarkable match (within $1$ dB) against the LES-predicted sound over $0.1 \lesssim St \lesssim 0.5$, at all angles investigated. At other frequencies, the closely spaced twin jet shows more error, due to its greater hierarchy of spatio-temporal structures, showing slower convergence at the shallower angles.