Audio tuning

Last modified by Simon Morlat on 2023/11/07 22:05

This article describes the audio processing functions available in liblinphone, such as Acoustic Echo Cancellation and Equalization.

Hardware audio processing

On mobile platforms (iOS, Android), these functions are handled by the hardware itself, or more generally the firmware. Applications have no control over audio gains, for example.

On iOS, liblinphone uses the VoiceProcessingIO AudioUnit of the system, which has built-in echo cancellation, noise suppression, automatic gain control and so on.

On Android, starting from liblinphone 4.5, the Android AudioManager is automatically set to MODE_IN_COMMUNICATION, which normally turns on the firmware's voice processing functions if they are available. In earlier versions, the application has to do it itself. In addition, liblinphone attempts to use the hardware Acoustic Echo Canceller if available. If not, it falls back to a software implementation embedded in liblinphone.

Software audio processing

In the absence of hardware audio processing units, three software ones can be activated. The most important one is the acoustic echo canceller.

Acoustic echo canceller

Liblinphone has software echo canceller algorithms. They are enabled by default, and always used in Desktop editions of Linphone/Liblinphone (Mac, Windows, GNU/Linux).

Note that Linphone will never use its software echo canceller algorithm if the underlying platform provides hardware echo cancellation, in order to avoid interactions between the two that can degrade audio quality.

Prerequisites

Keep in mind that good hardware is essential to minimize echo, or at least to keep it clean enough to be processed by signal processing algorithms. In particular, these points are crucial:

mechanical isolation between mic and speaker, so that no vibrations are transmitted directly through the materials, which would immediately saturate the microphone's signal whenever the speaker outputs something.
good amplifier/speaker and microphone quality - typically, a noisy speaker that distorts the sound is unlikely to be compatible with any echo canceller.
sound device and driver quality - it is essential that the audio system has minimal and constant record and play-out delays.

If the above criteria are not met, the echo cancellation will perform poorly.

Enabling and configuring software echo canceller

Choose the algorithm

There are actually three echo cancellation algorithms in liblinphone:

MSSpeexAEC : an echo canceller based on an algorithm present in the speexdsp library. Its cancellation performance is not state-of-the-art (the source code is old). It is the fallback solution if the next two are not compiled in.
MSWebRTCAECm : an echo canceller algorithm taken from webRTC. In the source code, it is within the mswebrtc git submodule. This is the default choice when running on an ARM platform, because this algorithm is CPU-optimized using NEON instructions. However, its cancellation performance is far from perfect, especially when there is double talk: it can easily drop the sound of one of the talkers. In addition, it can operate only at 8 and 16 kHz. The 'm' at the end of "MSWebRTCAECm" stands for "mobile". Its compilation is driven by the -DENABLE_WEBRTC_AECM=ON cmake option, enabled by default.
MSWebRTCAEC : also an echo canceller algorithm taken from webRTC, in the mswebrtc submodule, but this one provides much better quality and operates up to 48000 Hz. As a drawback, it consumes more CPU. Its compilation is driven by the -DENABLE_WEBRTC_AEC=ON cmake option, enabled by default.

I discourage the use of the undocumented "echolimiter" feature, which is old code that we have not used for at least ten years. Its principle is to force half duplex in cases where the mic and speaker are so bad that classic echo cancellation algorithms can't work because of non-linear alterations.

To get the best echo cancellation performance, it is recommended to use the MSWebRTCAEC algorithm (the third one).

Doing a grep on the debug logs helps to verify which one is being used. Here are some expected logs at the beginning of a call:

ms_filter_link: MSEqualizer:0x6000398ac1e0,0-->MSWebRTCAEC:0x6000398a52c0,1
ms_filter_link: MSWebRTCAEC:0x6000398a52c0,1-->MSVolume:0x6000398ac000,0
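
The same check can be scripted. The following is a minimal sketch in Python, assuming the log lines have the ms_filter_link format shown above and that the filter names appear in the logs exactly as they are written in this article (the helper name find_aec_filters is hypothetical, not part of liblinphone):

```python
import re

# Echo canceller filter names as listed in this article (assumed to match the logs).
AEC_FILTERS = ("MSSpeexAEC", "MSWebRTCAECm", "MSWebRTCAEC")

# Matches one endpoint of a link, e.g. "MSWebRTCAEC:0x6000398a52c0,1".
FILTER_RE = re.compile(r"\b(MS\w+):0x[0-9a-fA-F]+,\d+")

def find_aec_filters(log_lines):
    """Return the set of echo canceller filter names seen in ms_filter_link lines."""
    found = set()
    for line in log_lines:
        if "ms_filter_link:" not in line:
            continue
        for name in FILTER_RE.findall(line):
            # Exact comparison, so MSWebRTCAECm is not mistaken for MSWebRTCAEC.
            if name in AEC_FILTERS:
                found.add(name)
    return found

sample_log = [
    "ms_filter_link: MSEqualizer:0x6000398ac1e0,0-->MSWebRTCAEC:0x6000398a52c0,1",
    "ms_filter_link: MSWebRTCAEC:0x6000398a52c0,1-->MSVolume:0x6000398ac000,0",
]
print(find_aec_filters(sample_log))
```

An empty result would suggest that no software echo canceller was inserted in the audio graph, for instance because hardware echo cancellation is in use.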

Note that MSWebRTCAEC has to be explicitly requested in order to be used on ARM platforms (because MSWebRTCAECm is the default choice there, in order to save CPU resources).

The following property has to be set in your linphonerc configuration file or factory configuration file:

[sound]

ec_filter=MSWebRTCAEC

Alternatively, it may be set programmatically at application startup (C++ pseudo code):

core->getConfig()->setString("sound", "ec_filter", "MSWebRTCAEC");

Configuration

The only parameter of these software echo cancellers is the echo delay, i.e. the estimated latency of the I/O audio device. Setting this parameter is essential, because the algorithm does not search for the echo over a very large time window: only 80 ms. The value can be set in the linphonerc configuration file, as follows:

[sound]
echocancellation=1
ec_delay=100

This means that the AEC algorithm will search for an echo in the time window from 100 to 180 ms.
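
As an illustration of this arithmetic, here is a small Python sketch that reads the [sound] section and derives the search window. It assumes that the file can be parsed by Python's configparser (an assumption about the file syntax, not a documented guarantee) and that a missing ec_delay means 0, which is also an assumption. The function name echo_search_window is hypothetical.

```python
import configparser

SEARCH_WINDOW_MS = 80  # window length stated in this article

def echo_search_window(linphonerc_text):
    """Return (start_ms, end_ms) of the window in which the echo is searched."""
    config = configparser.ConfigParser()
    config.read_string(linphonerc_text)
    # Assumption: a missing ec_delay is treated as 0 in this sketch.
    delay_ms = config.getint("sound", "ec_delay", fallback=0)
    return (delay_ms, delay_ms + SEARCH_WINDOW_MS)

example = """
[sound]
echocancellation=1
ec_delay=100
"""
start_ms, end_ms = echo_search_window(example)
print(f"echo searched between {start_ms} and {end_ms} ms")
```

If the real latency of the device falls outside this window, the canceller has no chance of finding the echo, which is why measuring the latency matters.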

echocancellation=1 simply activates the echo canceller, but this can also be done programmatically using linphone_core_enable_echo_cancellation().

Measuring sound device's latency

The ec_delay setting is important for the effectiveness of echo cancellation. It can be difficult to find out what the latency of a given hardware platform is.

Fortunately, liblinphone provides an API and a procedure to help measure this value, called "Echo canceller calibration".

use linphone_core_start_echo_canceller_calibration(). It plays out some beeps, which are correlated with the mic signal so that the delay can be estimated.
receive the notification of the calibration result by setting a callback function with linphone_core_cbs_set_ec_calibrator_result().

The test_ecc.c tool from liblinphone does this in command-line (though it actually uses some deprecated functions).

You can then set that value in our SDK and reload sound devices to apply it:

core.mediastreamerFactory.setDeviceInfo(Build.MANUFACTURER, Build.MODEL, Build.DEVICE, org.linphone.mediastream.Factory.DEVICE_HAS_BUILTIN_AEC_CRAPPY, <delay>, 0)
core.reloadSoundDevices()

Spectral equalizer

This processing unit applies filtering to the sound captured from the mic or sent to the speaker.

The linphonerc configuration file has parameters to activate equalization and configure it with triplets frequency:gain:width, for both speaker and mic.

In the example below, an equalizer is activated on the speaker path (spk). It attenuates frequencies around 1000 Hz, with a width of 50 Hz, by a factor of 0.1, and boosts frequencies around 2500 Hz, with a width of 200 Hz, by a factor of two. The equalizer is deactivated on the mic path, and no frequency configuration is given for it.

The gains are given in linear scale and multiple triplets can be given.

[sound]
spk_eq_active=1
spk_eq_gains=1000:0.1:50 2500:2:200
mic_eq_active=0
mic_eq_gains=
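
To make the triplet format concrete, here is a minimal Python sketch that parses such a value and shows each linear gain in dB. It assumes that triplets are separated by whitespace, as in the example above, and that the linear gain is an amplitude factor (hence 20*log10), which this article does not state explicitly. The helper names are hypothetical.

```python
import math

def parse_eq_gains(value):
    """Parse whitespace-separated 'frequency:gain:width' triplets."""
    bands = []
    for triplet in value.split():
        freq_hz, gain, width_hz = triplet.split(":")
        bands.append((float(freq_hz), float(gain), float(width_hz)))
    return bands

def linear_to_db(gain):
    """Convert a linear gain to dB, assuming it is an amplitude factor."""
    return 20.0 * math.log10(gain)

for freq_hz, gain, width_hz in parse_eq_gains("1000:0.1:50 2500:2:200"):
    print(f"{freq_hz:.0f} Hz, width {width_hz:.0f} Hz: "
          f"x{gain} ({linear_to_db(gain):+.1f} dB)")
```

An empty value, as for mic_eq_gains above, simply yields no bands.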

Audio gains

These are software gains applied to mic and speaker. As software gains, they are far less effective than the analog gains controlled by the low-level audio driver of the system, because they amplify the quantization noise along with the signal. You may use them as a last resort. Gains are given in dB. Here's an example in a linphonerc configuration file:

[sound]
mic_gain_db=3
playback_gain_db=-3
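
For reference, a dB value can be converted to the factor applied to the samples with the usual amplitude convention. This is a minimal Python sketch, assuming that liblinphone interprets these values as amplitude gains (20*log10), which is the common convention but is not stated in this article:

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor (assumed convention)."""
    return 10.0 ** (gain_db / 20.0)

for name, gain_db in (("mic_gain_db", 3), ("playback_gain_db", -3)):
    print(f"{name}={gain_db} -> samples multiplied by {db_to_linear(gain_db):.3f}")
```

Under this assumption, +3 dB multiplies the samples by roughly 1.41 and -3 dB by roughly 0.71.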

Others

Liblinphone has no AGC or noise suppressor, although the source code contains references to an experimental and unmaintained noise gate and AGC.