Real-Time Spherical Array Renderer for binaural reproduction in Python
Implementation of the Real-Time Spherical Array Renderer for binaural reproduction in Python [1][2].
Contents: | Requirements | Setup | Quickstart | Execution parameters | Execution modes | Remote Control | Validation - Setup and Execution | Benchmark - Setup and Execution | References | Change Log | Contributing | Credits | License |
10.14 Mojave
and 10.15 Catalina
) or Linux (tested on 5.9.1-1-rt19-MANJARO
)multiprocessing
implementation)brew install qjackctl
or alternatively prebuilt installers are available)11 Big Sur
and following versions require jack >= 1.9.21
to work correctly!)miniconda
is sufficient; provides an easy way to get Intel MKL or alternatively OpenBLAS optimized numpy
versions which is highly recommended)3.7
to 3.11
; recommended way to get Python is to use Conda as described in the setup section)git clone https://github.com/AppliedAcousticsChalmers/ReTiSAR.git
git pull
cd ReTiSAR/
conda update conda
--force
to overwrite potentially existing outdated environment):conda env create --file environment.yml --force
conda activate ReTiSAR
jackd -d coreaudio -r 48000
[macOS]
jackd -d alsa -r 48000
[Linux]
Remark: Select the correct audio interface that should be used in QjackCtl or terminal with the help of e.g.jackd -d coreaudio -d -l
!python -m ReTiSAR
--help
).JACK initialization — In case you have never started the JACK audio server on your system or want to make sure it initializes with appropriate values. Open the QjackCtl application to set your system specific default settings.
At this point the only relevant JACK audio server setting is the sampling frequency, which has to match the sampling frequency of your rendered audio source file or stream (no resampling will be applied for that specific file).
FFTW optimization — In case the rendering takes very long to start (after the message "initializing FFTW DFT optimization ..."), you might want to endure this long computation time once (per rendering configuration) or lower your [FFTW] planner effort (see --help
).
Rendering performance — Follow these remarks to expect continuous and artifact free rendering:
Always check the command line output or generated log files in case the rendering pipeline does not initialize successfully!
The following parameters are all optional and available in combinations with the named execution modes subsequently:
python -m ReTiSAR -b=4096
[default]python -m ReTiSAR -b=1024
python -m ReTiSAR -b=256
python -m ReTiSAR -SP=TRUE
[default]python -m ReTiSAR -SP=FALSE
python -m ReTiSAR -irt=-60
[default]
python -m ReTiSAR -irt=0
[applied in all scientific evaluations]
python -m ReTiSAR -tt=NONE
[default]
python -m ReTiSAR -tt=AUTO_ROTATE
python -m ReTiSAR -tt=RAZOR_AHRS -t=/dev/tty.usbserial-AH03F9XC
python -m ReTiSAR -tt=POLHEMUS_PATRIOT -t=/dev/tty.UC-232AC
python -m ReTiSAR -tt=POLHEMUS_FASTRACK -t=/dev/tty.UC-232AC
python -m ReTiSAR -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
[default]
python -m ReTiSAR -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir_struct.mat -hrt=HRIR_MIRO
python -m ReTiSAR -hr=res/HRIR/KU100_SADIE2/48k_24bit_256tap_8802dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -hr=res/HRIR/KEMAR_SADIE2/48k_24bit_256tap_8802dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -hr=res/HRIR/FABIAN_TUB/44k_32bit_256tap_11950dir_HATO_0.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -hp=NONE
[default]
python -m ReTiSAR -hp=res/HPCF/KU100_SADIE2/48k_24bit_1024tap_Beyerdynamic_DT990.wav
python -m ReTiSAR -hp=res/HPCF/KEMAR_SADIE2/48k_24bit_1024tap_Beyerdynamic_DT990.wav
python -m ReTiSAR -hp=res/HPCF/FABIAN_TUB/44k_32bit_4096tap_AKG_K701.wav
python -m ReTiSAR -hp=res/HPCF/FABIAN_TUB/44k_32bit_4096tap_Sennheiser_HD800.wav
python -m ReTiSAR -hp=res/HPCF/KEMAR_TUR/44k_24bit_2048tap_Sennheiser_HD600.wav
python -m ReTiSAR -arr=18
[default]
python -m ReTiSAR -sht=SHF
python -m ReTiSAR -sht=SHT+SHF
[default]
python -m ReTiSAR -gt=NONE
[default]
python -m ReTiSAR -gt=NOISE_WHITE -gl=-30 -gm=FALSE
python -m ReTiSAR -gt=NOISE_IIR_PINK -gl=-30 -gm=FALSE
python -m ReTiSAR -gt=NOISE_IIR_EIGENMIKE -gl=-30 -gm=FALSE
This section list all the conceptually different rendering modes of the pipeline. Most of the other beforehand introduced execution parameters can be combined with the mode-specific parameters. In case no manual value for all specific rendering parameters is provided (as in the following examples), their respective default values will be used.
Most execution modes require additional external measurement data, which cannot be republished here. However, all provided examples are based on publicly available research data. Respective files are represented here by provided source reference files (see res/), containing a source URL and potentially further instructions. In case the respective resource data file is not yet available on your system, download instructions will be shown in the command line output and generated log files.
Run as array recording renderer
python -m ReTiSAR -sh=4 -tt=NONE -s=res/record/EM32ch_lab_voice_around.wav -ar=res/ARIR/RT_calib_EM32ch_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
[default]
python -m ReTiSAR -sh=4 -tt=NONE -s=res/record/EM32ch_lab_voice_updown.wav -ar=res/ARIR/RT_calib_EM32ch_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -b=512 -sh=3 -tt=NONE -s=res/record/ZY19_off_around.wav -sl=9 -ar=res/ARIR/RT_calib_ZY19_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -b=2048 -sh=7 -tt=NONE -s=res/record/HOS64_hall_lecture.wav -sp="[(90,0)]" -sl=9 -ar=res/ARIR/RT_calib_HOS64_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
Run as array live-stream renderer with minimum latency (e.g. Eigenmike with the respective channel calibration provided by the manufacturer)
python -m ReTiSAR -b=512 -sh=4 -tt=NONE -s=None -ar=res/ARIR/RT_calib_EM32ch_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -b=512 -sh=4 -tt=NONE -s=None -ar=res/ARIR/RT_calib_EM32frl_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -b=512 -sh=3 -tt=NONE -s=None -ar=res/ARIR/RT_calib_ZY19_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -b=2048 -sh=7 -tt=NONE -s=None -ar=res/ARIR/RT_calib_HOS64_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
Run as array IR renderer, e.g. Eigenmike
python -m ReTiSAR -sh=4 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -ar=res/ARIR/DRIR_sim_EM32_PW_struct.mat -art=ARIR_MIRO -arl=-6 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -sh=4 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -ar=res/ARIR/DRIR_anec_EM32ch_S_struct.mat -art=ARIR_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
Run as array IR renderer, e.g. sequential VSA measurements from [11] at the maximum respective SH order (different room, source positions and array configurations are available in res/ARIR/)
python -m ReTiSAR -sh=5 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -ar=res/ARIR/DRIR_LBS_VSA_50RS_PAC.sofa -art=ARIR_SOFA -arl=-12 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -sh=7 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -ar=res/ARIR/DRIR_SBS_VSA_86RS_PAC.sofa -art=ARIR_SOFA -arl=-12 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -sh=8 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -sp="[(37,0)]" -ar=res/ARIR/DRIR_CR1_VSA_110RS_L.sofa -art=ARIR_SOFA -arl=-12 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -sh=11 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -ar=res/ARIR/DRIR_LBS_VSA_194OSC_PAC.sofa -art=ARIR_SOFA -arl=-12 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
python -m ReTiSAR -sh=12 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -sp="[(37,0)]" -ar=res/ARIR/DRIR_CR7_VSA_1202RS_L.sofa -art=ARIR_SOFA -arl=-12 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
Note that the rendering performance is mostly determined by the chosen combination of the following parameters: number of microphones (ARIR channels), room reverberation time (ARIR length), IR truncation cutoff level and rendering block size.
python -m ReTiSAR -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -art=NONE -hr=res/HRIR/KU100_THK/BRIR_CR1_VSA_110RS_L_SSR_SFA_-37_SOFA_RFI.wav -hrt=BRIR_SSR -hrl=-12
python -m ReTiSAR -tt=AUTO_ROTATE -s=res/source/PinkMartini_Lilly_44.wav -sp="[(30, 0),(-30, 0)]" -art=NONE -hr=res/HRIR/FABIAN_TUB/hrirs_fabian.wav -hrt=HRIR_SSR
(provide respective source file and source positions!)
5005
[default]./generator/volume 0
, /generator/volume -12
(set any client output volume in dBFS),/prerenderer/mute 1
, /prerenderer/mute 0
, /prerenderer/mute -1
, /prerenderer/mute
(set/toggle any client mute state),/hpeq/passthrough true
, /hpeq/passthrough false
, /hpeq/passthrough toggle
(set/toggle any client passthrough state)/renderer/crossfade
, /crossfade/renderer
(set/toggle the crossfade state),/renderer/delay 350.0
(set an additional input delay in ms),/renderer/order 0
, /renderer/order 4
(set the SH rendering order),/tracker/zero
(calibrate the tracker), /tracker/azimuth 45
(set the tracker orientation),/player/stop
, /player/play
, /quit
(quit all rendering components)5006
[default] in the specified exemplary data format (number of values depends on output ports) i.e., arbitrary combinations of the name and parameters:/player/rms 0.0
, /generator/peak 0.0 0.0 0.0 0.0
(the current audio output metrics),/renderer/load 100
(the current client load),/tracker/AzimElevTilt 0.0 0.0 0.0
(the current head orientation),/load 100
(the current JACK system load)brew install ecasound
brew install yoggy/tap/sendosc
./res/research/validation/record_ir.sh
res/research/validation/
against the provided reference IRs:python -m ReTiSAR --VALIDATION_MODE=res/HRIR/KU100_THK/BRIR_CR1_VSA_110RS_L_SSR_SFA_-37_SOFA_RFI.wav
./res/research/validation/record_snr.sh
open ./res/research/validation/calculate_snr.m
conda env update --file environment_dev.yml
[CMD]+[T]
):jackd -d coreaudio
python -m ReTiSAR --BENCHMARK_MODE=PARALLEL_CONVOLVERS
python -m ReTiSAR --BENCHMARK_MODE=PARALLEL_CLIENTS
[1] H. Helmholz, C. Andersson, and J. Ahrens, “Real-Time Implementation of Binaural Rendering of High-Order Spherical Microphone Array Signals,” in Fortschritte der Akustik -- DAGA 2019, 2019, pp. 1462–1465.
[2] H. Helmholz, T. Lübeck, J. Ahrens, S. V. Amengual Garí, D. Lou Alon, and R. Mehra, “Updates on the Real-Time Spherical Array Renderer (ReTiSAR),” in Fortschritte der Akustik -- DAGA 2020, 2020, pp. 1169–1172.
[3] B. Bernschütz, C. Pörschmann, S. Spors, and S. Weinzierl, “SOFiA Sound Field Analysis Toolbox,” in International Conference on Spatial Audio, 2011, pp. 7–15.
[4] C. Hold, H. Gamper, V. Pulkki, N. Raghuvanshi, and I. J. Tashev, “Improving Binaural Ambisonics Decoding by Spherical Harmonics Domain Tapering and Coloration Compensation,” in International Conference on Acoustics, Speech and Signal Processing, 2019, pp. 261–265, doi: 10.1109/ICASSP.2019.8683751.
[5] Z. Ben-Hur, F. Brinkmann, J. Sheaffer, S. Weinzierl, and B. Rafaely, “Spectral Equalization in Binaural Signals Represented by Order-Truncated Spherical Harmonics,” J. Acoust. Soc. Am., vol. 141, no. 6, pp. 4087–4096, 2017, doi: 10.1121/1.4983652.
[6] B. Bernschütz, “A Spherical Far Field HRIR/HRTF Compilation of the Neumann KU 100,” in Fortschritte der Akustik -- AIA/DAGA 2013, 2013, pp. 592–595, doi: 10.5281/zenodo.3928296.
[7] P. Majdak et al., “Spatially Oriented Format for Acoustics: A Data Exchange Format Representing Head-Related Transfer Functions,” in AES Convention 134, 2013, pp. 262–272.
[8] C. Armstrong, L. Thresh, D. Murphy, and G. Kearney, “A Perceptual Evaluation of Individual and Non-Individual HRTFs: A Case Study of the SADIE II Database,” Appl. Sci., vol. 8, no. 11, pp. 1–21, 2018, doi: 10.3390/app8112029.
[9] F. Brinkmann et al., “The FABIAN Head-Related Transfer Function Data Base.” Technische Universität Berlin, Berlin, Germany, 2017, doi: 10.14279/depositonce-5718.5.
[10] H. Helmholz, D. Lou Alon, S. V. Amengual Garí, and J. Ahrens, “Instrumental Evaluation of Sensor Self-Noise in Binaural Rendering of Spherical Microphone Array Signals,” in Forum Acusticum, 2020, pp. 1349–1356, doi: 10.48465/fa.2020.0074.
[11] P. Stade, B. Bernschütz, and M. Rühl, “A Spatial Audio Impulse Response Compilation Captured at the WDR Broadcast Studios,” in 27th Tonmeistertagung -- VDT International Convention, 2012, pp. 551–567.
[12] C. Hohnerlein and J. Ahrens, “Spherical Microphone Array Processing in Python with the sound_field_analysis-py Toolbox,” in Fortschritte der Akustik -- DAGA 2017, 2017, pp. 1033–1036.
[13] H. Helmholz, J. Ahrens, D. Lou Alon, S. V. Amengual Garí, and R. Mehra, “Evaluation of Sensor Self-Noise In Binaural Rendering of Spherical Microphone Array Signals,” in International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 161–165, doi: 10.1109/ICASSP40776.2020.9054434.
[14] H. Helmholz, D. Lou Alon, S. V. Amengual Garí, and J. Ahrens, “Effects of Additive Noise in Binaural Rendering of Spherical Microphone Array Signals,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 29, pp. 3642–3653, 2021, doi: 10.1109/TASLP.2021.3129359.
See CHANGE_LOG for full details.
See CONTRIBUTING for full details.
Written by Hannes Helmholz.
Scientific supervision by Jens Ahrens.
Contributions by Carl Andersson and Tim Lübeck.
This work was funded by Facebook Reality Labs.
This software is licensed under a Non-Commercial Software License (see LICENSE for full details).
Copyright (c) 2018
Division of Applied Acoustics
Chalmers University of Technology
[SOFA]: https://www.sofaconventions.org/mediawiki/index.php/SOFA_(Spatially_Oriented_Format_for_Acoustics [FFTW]: http://www.fftw.org/fftw3_doc/Planner-Flags.html