Adaptive bitrate control

Linphone has a new algorithm that adapts the audio and video codec bitrates to the available bandwidth, and thereby optimizes audio and video quality. In most cases, the available network bandwidth is not known in advance by either client; it has to be discovered at run time.

On the video side, the encoder's bitrate, framerate (fps) and image size are controlled dynamically.

Description

The audio and video streams of a Linphone call are transmitted via RTP (Real-time Transport Protocol), implemented in our oRTP library. These streams are controlled with RTCP (RTP Control Protocol).

To control the bitrate of a call, a client sends a specific RTCP packet called TMMBR (Temporary Maximum Media Stream Bit Rate Request) to the remote client, containing the maximum bitrate it requests. When the remote client receives it, it adapts its bitrate to the received value according to its encoding capabilities: it may increase or decrease the framerate and output bitrate of the encoder and, when needed, the image size.

A receiver sends TMMBR requests to the remote sender in the following two situations:
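To make the request format concrete, here is a minimal sketch of how the bitrate is carried inside one TMMBR entry as laid out in RFC 5104: a 32-bit SSRC followed by a 6-bit exponent, a 17-bit mantissa and a 9-bit measured overhead. The function names and the default overhead of 40 bytes (IPv4 + UDP + RTP headers) are assumptions made for illustration; this is not the oRTP implementation.

```python
import struct

def encode_tmmbr_fci(ssrc: int, bitrate_bps: int, overhead_bytes: int = 40) -> bytes:
    """Pack one TMMBR FCI entry (hypothetical helper, layout per RFC 5104)."""
    # The bitrate is carried as mantissa * 2**exp with a 17-bit mantissa,
    # so large values lose their low-order bits.
    exp = 0
    mantissa = bitrate_bps
    while mantissa >= (1 << 17):
        mantissa >>= 1
        exp += 1
    # 6 bits exponent | 17 bits mantissa | 9 bits measured overhead
    word = (exp << 26) | (mantissa << 9) | (overhead_bytes & 0x1FF)
    return struct.pack("!II", ssrc, word)

def decode_tmmbr_fci(fci: bytes):
    """Return (ssrc, bitrate_bps, overhead_bytes) from one FCI entry."""
    ssrc, word = struct.unpack("!II", fci)
    exp = word >> 26
    mantissa = (word >> 9) & 0x1FFFF
    overhead = word & 0x1FF
    return ssrc, mantissa << exp, overhead
```

For example, a request of 512000 bit/s is encoded with exponent 2 and mantissa 128000, and decodes back to the same value.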

Congestion

Congestion happens when a router between sender and receiver fails to forward packets in time because of insufficient physical bandwidth. It may queue packets, which causes delay, and after some time it will drop them because it has no more memory to store packets until they can be forwarded. This typically occurs when a video encoder is configured with a target bitrate that exceeds the available network bandwidth between sender and receiver.

Liblinphone has a network congestion detector that analyzes the correlation between the arrival time of RTP packets and the timestamp carried within each RTP packet. When congestion is detected, which usually takes a couple of seconds, the receiver measures the total bitrate of the media streams it receives and uses it to compute a new, lower target bitrate that is sent to the remote client in a TMMBR packet. This behavior also applies to audio-only streams with a multi-rate codec such as Opus.
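The idea behind this kind of detection can be sketched as follows: if packets arrive later and later relative to their RTP timestamps, a queue is building up somewhere on the path. The code below is only an illustration under stated assumptions, not liblinphone's detector: the function names, the 100 ms threshold, the comparison of two half-windows and the 0.8 reduction factor are all hypothetical choices.

```python
def relative_delays(packets, clock_rate=90000):
    """packets: list of (arrival_time_s, rtp_timestamp).
    Returns each packet's delay relative to the first packet, in seconds."""
    t0, ts0 = packets[0]
    return [(t - t0) - (ts - ts0) / clock_rate for t, ts in packets]

def is_congested(packets, clock_rate=90000, threshold_s=0.1):
    """Flag congestion when the average relative delay of the second half
    of the window exceeds that of the first half by threshold_s (assumed)."""
    delays = relative_delays(packets, clock_rate)
    half = len(delays) // 2
    if half == 0:
        return False
    early = sum(delays[:half]) / half
    late = sum(delays[half:]) / (len(delays) - half)
    return late - early > threshold_s

def reduced_target_bitrate(bytes_received, window_s, factor=0.8):
    """New target = measured received bitrate scaled down by an assumed factor."""
    measured_bps = bytes_received * 8 / window_s
    return int(measured_bps * factor)
```

With a 90 kHz video clock, a stream whose packets each arrive 20 ms later than expected would be flagged within a 30-packet window, and the resulting target would then be sent in a TMMBR request.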

Estimation of greater available bandwidth

Liblinphone has a maximum download bandwidth estimator that runs for the whole duration of the call. It operates by measuring the maximum bitrate while receiving video frames composed of multiple RTP packets, which is frequently the case. These measurements are filtered and classified by an algorithm in order to improve their accuracy. When a new estimate is computed, and provided that it is greater than the current downstream bitrate, it is sent to the remote client in a TMMBR packet. The sender can then use this information to increase the video encoder's output bitrate, framerate, or picture size, which increases video quality.

Please note that, as of today, the available bandwidth is estimated from the video stream only. If only an audio stream is used during the SIP call, no estimate is computed.

Of course, in a two-party call, each receiver performs these tasks simultaneously, which allows audio and video quality to be optimized in both directions.
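As a rough illustration of the measurement principle, the packets of one video frame share an RTP timestamp and are usually sent back to back, so the rate at which they arrive hints at the path capacity. The sketch below groups packets by timestamp and computes a per-frame burst rate. The text does not describe the actual filtering and classification, so the median used here, the minimum of three packets per frame and all function names are assumptions.

```python
from collections import defaultdict
from statistics import median

def burst_bitrates(packets, min_packets=3):
    """packets: iterable of (arrival_time_s, rtp_timestamp, size_bytes).
    Returns one bitrate sample (bit/s) per multi-packet frame."""
    frames = defaultdict(list)
    for arrival, ts, size in packets:
        frames[ts].append((arrival, size))
    rates = []
    for pkts in frames.values():
        if len(pkts) < min_packets:
            continue
        pkts.sort()
        duration = pkts[-1][0] - pkts[0][0]
        if duration <= 0:
            continue
        # The first packet marks the start of the interval, so only the
        # bytes of the following packets were transferred during it.
        payload = sum(size for _, size in pkts[1:])
        rates.append(payload * 8 / duration)
    return rates

def estimate_bandwidth(packets, current_bps):
    """Return a new estimate only if it exceeds the current downstream bitrate."""
    rates = burst_bitrates(packets)
    if not rates:
        return None
    estimate = median(rates)  # stand-in for the real filtering step
    return int(estimate) if estimate > current_bps else None
```

A caller would send the returned value in a TMMBR request and do nothing when the function returns None.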

Both congestion detection and bandwidth estimation are done by liblinphone on the receiving side. In other words, our rate control algorithm is receiver-driven, which puts very few requirements on the sender. Typically, a non-liblinphone user agent willing to send video to a liblinphone user agent with the rate control capabilities discussed here only has to support RTCP TMMBR from the RTP/AVPF profile in order to benefit from rate control.
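In SDP, support for this feature is normally advertised with an AVPF profile on the media line and an "a=rtcp-fb:... ccm tmmbr" attribute (RFC 4585 and RFC 5104). The following sketch checks an SDP body for both on the video stream. The function name is hypothetical, the parsing is deliberately simplistic, and the text above does not describe how liblinphone itself negotiates this.

```python
def video_supports_tmmbr(sdp: str) -> bool:
    """Return True if a video m-line uses an AVPF profile and
    advertises 'ccm tmmbr' feedback (simplified check)."""
    in_avpf_video = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            fields = line[2:].split()
            in_avpf_video = (
                len(fields) >= 3
                and fields[0] == "video"
                and fields[2] in ("RTP/AVPF", "RTP/SAVPF")
            )
        elif in_avpf_video and line.startswith("a=rtcp-fb:"):
            # a=rtcp-fb:<payload type or *> ccm tmmbr
            params = line[len("a=rtcp-fb:"):].split()
            if params[1:3] == ["ccm", "tmmbr"]:
                return True
    return False
```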

 
