Having covered the background of video streaming and the available adaptive streaming protocols, in this part we will explain how a video player can, and should, use the capabilities of those protocols.
The adaptive streaming protocols in common use today (HLS, SmoothStreaming, MPEG-DASH) require the video player to include logic for selecting the most suitable video quality to play.
There are many different player implementations, some more sophisticated than others, but most of them share the basic logic described in this post.
Initialization
To select the correct video quality, the player must know which video qualities are available and what bandwidth each one requires.
The first thing any adaptive video player has to do is download the video’s manifest and parse the information about each of the available video qualities.
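As a rough illustration of this step, here is a minimal sketch that reads the qualities out of an HLS master playlist. It assumes the manifest text has already been downloaded, and it only looks at the `BANDWIDTH` and `RESOLUTION` attributes of each `#EXT-X-STREAM-INF` tag. The `Quality` class, the `parse_master_playlist` function and the sample playlist are made up for this example; a real player handles many more tags and edge cases.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Quality:
    bandwidth_bps: int                     # bitrate declared in the manifest
    resolution: Optional[Tuple[int, int]]  # (width, height), if declared
    uri: str                               # playlist URI for this quality


# Hypothetical sample manifest, used only for this example.
SAMPLE_MANIFEST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=6000000,RESOLUTION=1920x1080
high/index.m3u8
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=2500000,BANDWIDTH=3000000,RESOLUTION=1280x720
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=800000
low/index.m3u8
"""


def parse_master_playlist(text: str) -> List[Quality]:
    """Return the qualities listed in an HLS master playlist, lowest bitrate first."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    qualities = []
    for i, line in enumerate(lines):
        if not line.startswith("#EXT-X-STREAM-INF:"):
            continue
        attrs = line.split(":", 1)[1]
        # Anchor on the start or a comma so AVERAGE-BANDWIDTH is not matched.
        # Simplification: quoted attribute values are not parsed specially.
        bandwidth = re.search(r"(?:^|,)BANDWIDTH=(\d+)", attrs)
        resolution = re.search(r"(?:^|,)RESOLUTION=(\d+)x(\d+)", attrs)
        if bandwidth is None or i + 1 >= len(lines):
            continue  # malformed entry, skip it
        size = None
        if resolution:
            size = (int(resolution.group(1)), int(resolution.group(2)))
        # The URI of the quality is on the line that follows the tag.
        qualities.append(Quality(int(bandwidth.group(1)), size, lines[i + 1]))
    return sorted(qualities, key=lambda q: q.bandwidth_bps)


if __name__ == "__main__":
    for quality in parse_master_playlist(SAMPLE_MANIFEST):
        print(quality)
```

Sorting by bitrate is just a convenience here, so that later selection logic can walk the list from lowest to highest.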
Once the player knows the available qualities, it has to select the quality at which playback will begin. In this initial state, the player doesn’t yet know the available bandwidth. In most cases, the video publisher sets a default quality at which the video will start. This initial quality is usually a low one, to ensure a fast start-up time and avoid stalling.
Being adaptive
As the player starts to download the video, it should measure the speed at which the video is downloaded. From the measured download speed, the player can estimate the highest video quality it can play, based on the bandwidth specified for each of the available qualities. Most players select the quality with the highest bandwidth that is still lower than the measured bandwidth by a certain margin. For example, if a player measures an available bandwidth of 4 Mbps and uses a margin of 20%, it will select the highest quality whose bandwidth is lower than 3.2 Mbps.
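The selection rule can be sketched in a few lines. This is only an illustration of the idea, not the logic of any specific player: the function name `select_quality`, the 20% default margin and the fallback to the lowest quality when nothing fits are all assumptions made for the example.

```python
from typing import List


def select_quality(available_bps: List[int], measured_bps: float,
                   margin: float = 0.2) -> int:
    """Pick the highest declared bitrate that fits under the measured
    bandwidth after the safety margin is subtracted."""
    budget = measured_bps * (1.0 - margin)
    candidates = [b for b in available_bps if b <= budget]
    # Assumption: if nothing fits, fall back to the lowest quality.
    return max(candidates) if candidates else min(available_bps)


if __name__ == "__main__":
    ladder = [800_000, 3_000_000, 6_000_000]
    # 4 Mbps measured with a 20% margin leaves a 3.2 Mbps budget.
    print(select_quality(ladder, 4_000_000))
```

With a 4 Mbps measurement the budget is 3.2 Mbps, so the 3 Mbps quality is chosen and the 6 Mbps one is rejected.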
While the player is downloading a quality whose bandwidth is lower than the bandwidth it receives from the network, it downloads the video faster than it plays it. If the player is downloading a 3 Mbps quality at a rate of 4 Mbps, it can download 4 seconds of video every 3 seconds. This gap allows the player to fill a buffer, which stores video data that has been downloaded but not yet played. In the example above, for every 3 seconds that pass, the player’s buffer grows by 1 second of video. This buffer lets the player keep playing smoothly when the network bandwidth drops below the video’s bandwidth for short periods of time.
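The arithmetic behind the buffer can be written down as a small helper. It is a simplified model that assumes the player downloads continuously while playing, at a constant rate, and that the buffer has no upper limit; the function name is made up for the example.

```python
def buffer_change_seconds(download_bps: float, video_bps: float,
                          elapsed_s: float) -> float:
    """Seconds of video added to the buffer (negative means drained)
    over elapsed_s seconds of simultaneous download and playback."""
    downloaded_video_s = elapsed_s * download_bps / video_bps
    played_video_s = elapsed_s
    return downloaded_video_s - played_video_s


if __name__ == "__main__":
    # A 3 Mbps quality downloaded at 4 Mbps, observed for 3 seconds.
    print(buffer_change_seconds(4_000_000, 3_000_000, 3.0))
    # The same quality when the network drops to 2 Mbps.
    print(buffer_change_seconds(2_000_000, 3_000_000, 3.0))
```

The first call shows the buffer gaining video, and the second shows it draining once the network delivers less than the video needs.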
In some cases, the network bandwidth available to the video player is constant, as determined by the viewer’s ISP plan. In other cases, it changes over time. When multiple devices share the same internet connection, the player can see its available bandwidth change as the other devices use the connection at different times during playback. The bottleneck can also be elsewhere, such as at the ISP or at the server from which the video is downloaded.
The player must constantly measure the effective bandwidth it receives. If it sees a decrease in the available bandwidth, it should switch to a lower quality quickly enough, before its buffer empties and the video stalls. If it sees an increase in the available bandwidth, it should switch to a higher quality, so that the viewer can watch a higher quality video.
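One simple way to keep a running bandwidth measurement is to take a throughput sample from every downloaded segment and smooth the samples with an exponentially weighted moving average. This is one possible smoothing choice among several, shown here as a sketch; the class name `BandwidthEstimator` and the `alpha` value are assumptions for the example.

```python
from typing import Optional


class BandwidthEstimator:
    """Smooths per-segment throughput samples with an exponentially
    weighted moving average (an illustrative choice, not a standard)."""

    def __init__(self, alpha: float = 0.3):
        self.alpha = alpha  # weight given to the newest sample
        self.estimate_bps: Optional[float] = None

    def add_sample(self, bytes_downloaded: int, duration_s: float) -> float:
        if duration_s <= 0:
            raise ValueError("duration_s must be positive")
        sample_bps = bytes_downloaded * 8 / duration_s
        if self.estimate_bps is None:
            self.estimate_bps = sample_bps  # first sample seeds the estimate
        else:
            self.estimate_bps = (self.alpha * sample_bps
                                 + (1 - self.alpha) * self.estimate_bps)
        return self.estimate_bps


if __name__ == "__main__":
    estimator = BandwidthEstimator(alpha=0.5)
    print(estimator.add_sample(500_000, 1.0))  # 500 KB in 1 s
    print(estimator.add_sample(250_000, 1.0))  # the network slowed down
```

A larger `alpha` makes the estimate react faster to changes, and a smaller one makes it steadier.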
The challenge
The main challenge an adaptive video player faces is the trade-off between being too adaptive and not being adaptive enough. A player that is too sensitive to network changes will switch rapidly between higher and lower qualities. This results in a poor viewing experience, as the viewer sees many changes in video quality. The switches to lower qualities are the ones that catch the eye, as the picture goes from smooth to pixelated. A player that is not adaptive enough can stall when the network bandwidth drops and it does not switch to a lower quality in time, or it can remain at a low quality even though the network bandwidth is high enough for a higher one.
One way a player can overcome this challenge is by considering its available buffer along with the available network bandwidth. If the player has a sufficient buffer, it doesn’t have to switch to a lower quality, even if the measured network bandwidth is not high enough to sustain the quality it is currently downloading. As time passes, if the available bandwidth remains low, the buffer will shrink, as the video is played faster than it is downloaded. When the buffer reaches the low threshold mark, the player can then select a lower quality according to the available network bandwidth. The low threshold mark should be high enough to allow the player to switch to a lower quality without stalling. The player can also use this mark to prevent switches to higher qualities while the buffer is below it, and by doing so avoid unnecessary quality switches and stalls. A player can also use two different marks: a higher mark, above which it avoids switching to lower qualities, and a lower mark, below which it avoids switching to higher qualities. An example of a player that implements these two marks is ExoPlayer for Android.
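The two-mark idea can be sketched as follows. This is an illustration of the logic described above, not ExoPlayer's actual code or API: the function name, the mark values of 10 and 25 seconds and the 20% margin are all assumptions chosen for the example.

```python
from typing import List


def next_quality(current_bps: int, available_bps: List[int],
                 measured_bps: float, buffer_s: float,
                 low_mark_s: float = 10.0, high_mark_s: float = 25.0,
                 margin: float = 0.2) -> int:
    """Choose the next quality from the bandwidth estimate, but let the
    buffer level veto switches that are risky or unnecessary."""
    budget = measured_bps * (1.0 - margin)
    fitting = [b for b in available_bps if b <= budget]
    ideal = max(fitting) if fitting else min(available_bps)

    if ideal > current_bps and buffer_s < low_mark_s:
        return current_bps  # buffer too thin to risk switching up
    if ideal < current_bps and buffer_s >= high_mark_s:
        return current_bps  # enough buffer to ride out the dip
    return ideal


if __name__ == "__main__":
    ladder = [800_000, 3_000_000, 6_000_000]
    # Bandwidth dropped to 1.5 Mbps while playing the 3 Mbps quality.
    print(next_quality(3_000_000, ladder, 1_500_000, buffer_s=30.0))
    print(next_quality(3_000_000, ladder, 1_500_000, buffer_s=5.0))
```

With 30 seconds buffered the player stays on the current quality, while with only 5 seconds buffered it steps down to the quality the network can sustain.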
Additional considerations
Players can also consider the actual display size when selecting a video quality, so that they do not select qualities with a higher resolution than the display. There is no benefit in playing a 4K video on a 1080p display, or a 1080p video on a 720p display. By limiting the video quality to the display size, the viewer benefits twice: first from lower bandwidth use, as lower resolution video requires less bandwidth, and second from avoiding switches or stalls when the network bandwidth is high enough for the resolution that matches the display, but too low for the higher qualities. Known adaptive video players that consider the actual display size are the YouTube player and the iOS player.
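A display-size cap can be applied as a filter on the list of qualities before any bandwidth logic runs. The sketch below compares heights only and keeps the smallest quality if none fits; both are simplifications, and the function name and the `(height, bandwidth)` tuple layout are made up for the example.

```python
from typing import List, Tuple

# Each quality is (height_px, bandwidth_bps), an assumed layout for this sketch.
QualityInfo = Tuple[int, int]


def cap_to_display(qualities: List[QualityInfo],
                   display_height: int) -> List[QualityInfo]:
    """Keep only the qualities whose height does not exceed the display."""
    fitting = [q for q in qualities if q[0] <= display_height]
    if fitting:
        return fitting
    # Assumption: if every quality is taller than the display,
    # keep the smallest one so playback is still possible.
    return [min(qualities, key=lambda q: q[0])]


if __name__ == "__main__":
    ladder = [(360, 800_000), (720, 3_000_000),
              (1080, 6_000_000), (2160, 16_000_000)]
    print(cap_to_display(ladder, display_height=1080))
```

On a 1080p display the 4K entry is removed, so the bandwidth-based selection never considers it.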
Another parameter that players should consider is the number of dropped frames. The higher the video quality, the more computing power is required to decode and display it. When the resolution of the played video is higher than the display size, the video also needs to be down-scaled, which requires additional computing. If the device that plays the video, whether it’s a PC or a mobile device, doesn’t have the computing power required to play the video smoothly, it will skip the decoding or drawing of some of the frames (drop them), and the video will become choppy. By measuring the number of dropped frames per interval, the player can decide to reduce the selected quality when too many frames are dropped, so that playback is as smooth as possible.
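A dropped-frame check can be as simple as comparing the ratio of dropped frames in the last measurement interval against a threshold. How the counters are obtained depends on the platform, so the sketch below just takes them as inputs; the function name and the 10% threshold are assumptions for the example.

```python
def should_step_down(dropped_frames: int, total_frames: int,
                     max_dropped_ratio: float = 0.1) -> bool:
    """Return True when too many frames were dropped in the last interval."""
    if total_frames <= 0:
        return False  # nothing was rendered, so there is nothing to judge
    return dropped_frames / total_frames > max_dropped_ratio


if __name__ == "__main__":
    # A 10 second interval of 30 fps video is 300 frames.
    print(should_step_down(dropped_frames=60, total_frames=300))
    print(should_step_down(dropped_frames=6, total_frames=300))
```

A player would typically combine this signal with the bandwidth and buffer logic, stepping down one quality when the check returns True.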
This is the end of the three-part Adaptive Video Streaming series.
Hope you learned something new and enjoyed it.
Don’t hesitate to leave your questions or comments below.