How does the adaptive bitrate algorithm work?

Last modified by jehan monnier on 2018/06/25 10:34

Adaptive bitrate control

Linphone now has a brand new algorithm that adapts the audio and video codec bitrates to the available bandwidth, and hence optimizes audio and video quality. In most cases, the available network bandwidth is not known in advance by either client; it has to be discovered at run time.

On the video side, the algorithm dynamically controls the bitrate and the framerate (fps) of the encoder. It does NOT control the resolution of the video pictures: it cannot increase or decrease it during a call. The resolution must therefore be chosen consistently with the average bandwidth available where the application is deployed.

Description

The audio and video streams of a Linphone call are transmitted over RTP (Real-time Transport Protocol), which is implemented in our oRTP library. These streams are controlled with RTCP (RTP Control Protocol).

To control the bitrate of a call, a client sends a specific RTCP packet called TMMBR (Temporary Maximum Media Stream Bit Rate Request) to the remote client, carrying the bitrate it requests. When the remote client receives it, it adapts its bitrate to the received value within the limits of its encoding capabilities, and may thus increase or decrease the framerate and output bitrate of its encoder.

In practice, a receiver can send TMMBR requests to the remote sender in the following two situations:
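To make the TMMBR payload more concrete, here is a minimal Python sketch of how one TMMBR entry can be packed and unpacked, following the field layout defined in RFC 5104 (32-bit SSRC, 6-bit exponent, 17-bit mantissa, 9-bit measured overhead, with the bitrate equal to mantissa × 2^exponent). The function names and the default overhead value are illustrative assumptions; this is not the oRTP implementation.

```python
import struct

MANTISSA_BITS = 17
MANTISSA_MAX = (1 << MANTISSA_BITS) - 1  # largest value that fits in 17 bits


def encode_tmmbr_fci(ssrc, bitrate_bps, overhead_bytes=40):
    """Pack one TMMBR entry (hypothetical helper, layout per RFC 5104).

    overhead_bytes=40 is an assumed per-packet overhead, not a Linphone value.
    """
    exp = 0
    mantissa = bitrate_bps
    # Shift right until the mantissa fits in 17 bits. Truncation rounds the
    # requested maximum down, never up.
    while mantissa > MANTISSA_MAX:
        mantissa >>= 1
        exp += 1
    word = (exp << 26) | (mantissa << 9) | (overhead_bytes & 0x1FF)
    return struct.pack("!II", ssrc, word)  # network byte order


def decode_tmmbr_fci(data):
    """Unpack one TMMBR entry into (ssrc, bitrate_bps, overhead_bytes)."""
    ssrc, word = struct.unpack("!II", data)
    exp = word >> 26
    mantissa = (word >> 9) & MANTISSA_MAX
    overhead = word & 0x1FF
    return ssrc, mantissa << exp, overhead


entry = encode_tmmbr_fci(0x12345678, 512000)
ssrc, bitrate, overhead = decode_tmmbr_fci(entry)
```

Because the bitrate is stored as a mantissa and an exponent, values that are not exactly representable are rounded down when encoded, which is the safe direction for a maximum bitrate request.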

Congestion

Congestion happens when a router between the sender and the receiver fails to forward packets in time because of insufficient physical bandwidth. The router may queue packets, which causes delay, and after some time it will drop them because it has no memory left to store packets until they can be forwarded. This typically occurs when a video encoder is configured with a target bitrate that exceeds the available network bandwidth between sender and receiver.

Liblinphone has a network congestion detector that analyzes the correlation between the arrival time of RTP packets and the timestamp carried within each RTP packet. When congestion is detected, which usually takes a couple of seconds, the receiver measures the total bitrate of the media streams it receives and uses it to compute a new, lower target bitrate, which is sent to the remote client in a TMMBR packet. This behavior also applies to audio-only streams that use a multi-rate codec such as Opus.
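The idea of comparing arrival times with RTP timestamps can be illustrated with a simplified Python sketch. It is not Liblinphone's detector: the threshold, the minimum run length, the 0.8 back-off factor and all function names are assumptions chosen for illustration. The sketch treats a sustained growth of the offset between arrival time and media time as a sign that a queue is filling up somewhere on the path.

```python
def queuing_delays(packets, clock_rate=90000):
    """packets: list of (arrival_time_s, rtp_timestamp) tuples.

    Returns, for each packet, its extra delay relative to the fastest
    packet observed in the window.
    """
    offsets = [arrival - ts / clock_rate for arrival, ts in packets]
    base = min(offsets)  # best case seen: assumed to be an empty queue
    return [offset - base for offset in offsets]


def is_congested(packets, clock_rate=90000, threshold_s=0.1, min_run=5):
    """True if at least min_run consecutive packets exceed threshold_s of extra delay.

    threshold_s and min_run are illustrative values, not Liblinphone settings.
    """
    run = 0
    for delay in queuing_delays(packets, clock_rate):
        run = run + 1 if delay > threshold_s else 0
        if run >= min_run:
            return True
    return False


def reduced_target_bitrate(received_bytes, window_s, backoff=0.8):
    """Lower target derived from the bitrate actually received (assumed back-off factor)."""
    measured_bps = received_bytes * 8 / window_s
    return round(measured_bps * backoff)


# 30 fps video: steady arrivals versus arrivals delayed by 20 ms more per frame
steady = [(i / 30, i * 3000) for i in range(30)]
queued = [(i / 30 + 0.02 * i, i * 3000) for i in range(30)]
```

With these inputs, `is_congested(steady)` is false and `is_congested(queued)` is true; the value returned by `reduced_target_bitrate` is what the receiver would place in the TMMBR request.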

Estimation of greater bandwidth available

Liblinphone has a maximum download bandwidth estimator that runs for the whole duration of the call. It operates by measuring the maximum bitrate while receiving video frames composed of multiple RTP packets, which is frequently the case. These measurements are filtered and classified by an algorithm in order to improve their accuracy. When a new estimate is computed, and provided that it is greater than the current downstream bitrate, it is sent to the remote client in a TMMBR packet. The sender can then use this information to increase the video encoder's output bitrate and framerate, which increases video quality. This behavior does not apply to audio streams.

Of course, in a two-party call, each receiver performs these tasks simultaneously, which allows audio and video quality to be optimized in both directions.
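The measurement principle can be sketched in Python as follows. This is a simplified illustration, not Liblinphone's estimator: the packets of one video frame share an RTP timestamp and are assumed to be sent back to back, so their spacing at the receiver reflects the rate the path can sustain. The median is used here as a stand-in for the filtering and classification step, whose details are not described above, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import median


def frame_bitrate_samples(packets):
    """packets: iterable of (arrival_time_s, rtp_timestamp, size_bytes).

    Returns one bitrate sample (bits per second) per multi-packet frame.
    """
    frames = defaultdict(list)
    for arrival, ts, size in packets:
        frames[ts].append((arrival, size))  # same timestamp means same frame

    samples = []
    for pkts in frames.values():
        if len(pkts) < 2:
            continue  # a single packet gives no spacing to measure
        pkts.sort()
        duration = pkts[-1][0] - pkts[0][0]
        if duration <= 0:
            continue
        # The first packet is excluded: it was received before the measured
        # interval started.
        bits = 8 * sum(size for _, size in pkts[1:])
        samples.append(bits / duration)
    return samples


def estimate_to_send(packets, current_bitrate_bps):
    """Return the estimate only if it exceeds the current downstream bitrate."""
    samples = frame_bitrate_samples(packets)
    if not samples:
        return None
    estimate = median(samples)  # assumed filter, for illustration only
    return estimate if estimate > current_bitrate_bps else None


packets = [
    (0.0, 0, 1000), (0.0078125, 0, 1000), (0.015625, 0, 1000),  # 3-packet frame
    (0.04, 3000, 400),                                           # 1-packet frame
]
```

Here the three-packet frame delivers 16000 bits in 15.625 ms, so `estimate_to_send(packets, 500000)` yields 1024000 bit/s, while `estimate_to_send(packets, 2000000)` yields `None` because the estimate does not exceed the current bitrate.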

 

Created by Mickaël Turnel on 2017/10/13 16:46