How “FREE” Is Your Open Source-Based
Live Video Solution?
For decades, live video was the exclusive domain of a limited
number of content providers and the video service providers
that delivered their content to the masses. Today, however, an
ever-increasing number of applications and services offer live
video as one of their key features (e.g., Facebook Live).
Live encoding technology has transitioned from dedicated
hardware appliances, to software on bare-metal commercial
off-the-shelf (COTS) servers, virtual machines (VMs), and Linux
containers, and now to a software component within a
microservice cloud architecture.
Cloud developers who are integrating video encoding into
their software architecture focus on easy integration, scalability,
modularity, and quick time to market. These requirements are
on top of the more “traditional” requirements from encoding
that continue to hold true—bandwidth efficiency, performance,
stability, robustness and total cost of ownership.
Open source code combines the robustness and innovation
of a developer community and R&D organizations with the
advantages of freely obtainable software. The number of open
source projects grows constantly, and video processing projects
are no exception. Open source pipelines such
as FFmpeg or GStreamer offer a simple yet robust framework
that allows programmers to easily define a workflow, plug in
different filters and codecs, and run on different platforms.
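As a concrete illustration of the "plug in filters and codecs" workflow, the sketch below composes an FFmpeg command line for a live stream. The flags shown (`-vf`, `-c:v libx264`, `-preset`, `-b:v`) are standard FFmpeg options; the input and output URLs and the chosen values are hypothetical placeholders, not a recommended configuration.

```python
# Sketch: composing an FFmpeg live-encoding pipeline as an argument list.
# The RTMP endpoints and parameter values below are hypothetical.
ffmpeg_cmd = [
    "ffmpeg",
    "-i", "rtmp://example.com/live/input",   # hypothetical live source
    "-vf", "scale=1280:720",                 # plug in a filter
    "-c:v", "libx264",                       # plug in a codec
    "-preset", "veryfast",                   # quality/performance tradeoff
    "-b:v", "3000k",                         # target bitrate
    "-f", "flv",
    "rtmp://example.com/live/output",        # hypothetical output endpoint
]
# One would hand this list to subprocess.run(ffmpeg_cmd) to start the pipeline.
print(" ".join(ffmpeg_cmd))
```

Swapping the codec or inserting another filter is a one-line change to the list, which is the modularity the text describes.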
x264, the open-source H.264 encoder, has been widely
deployed for a range of applications, from user-generated content
(UGC) to professional broadcasting. x264 offers good compression
efficiency and a wide range of tradeoffs between quality,
performance, and density through its presets.
However, x264 was originally designed for file-based workflows.
Live encoding introduces additional challenges that x264 addresses
only partially and suboptimally:
• Unpredictable CPU load—As video complexity changes over
time, moving between high-motion, high-detail scenes and simple
scenes, the number of CPU cycles x264 needs for encoding
changes with it. The result is an unpredictable CPU load; when
multiple encoding sessions run on the same CPU and it becomes
over-utilized, input frames are dropped uncontrollably. To avoid
this scenario, video engineers end up allocating a “CPU cycle
margin” around x264 to ensure utilization never exceeds 100%
during complex scenes. This leaves resources under-utilized
most of the time.
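The cost of that "CPU cycle margin" can be sketched with simple arithmetic. All numbers below are hypothetical, chosen only to illustrate why worst-case provisioning leaves a server half idle most of the time.

```python
# Toy arithmetic for worst-case CPU provisioning (all numbers hypothetical).
avg_load_per_session = 0.10    # average CPU share one encoding session uses
peak_load_per_session = 0.18   # share during the most complex scenes

# Static provisioning must assume the peak, or frames drop at >100% CPU:
sessions_worst_case = int(1.0 / peak_load_per_session)   # 5 sessions
# An encoder that bounds its own per-frame cost could pack by the average:
sessions_average = int(1.0 / avg_load_per_session)       # 10 sessions

# Typical utilization under worst-case provisioning:
utilization = sessions_worst_case * avg_load_per_session  # 0.5 -> half idle
print(sessions_worst_case, sessions_average, round(utilization, 2))
```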
• High & variable latency—for similar reasons as above, x264
does not guarantee latency for live encoding. Latency fluctuates
as video complexity rises and falls, which changes the frame
encoding duration. This forces video engineers to set buffers
and latency targets to high values, degrading the quality of
experience.
THE IDT H.264 ENCODER
The IDT encoder is a cloud-optimized H.264 encoder designed
for general purpose CPUs or FPGAs. The technology in IDT’s
encoder is the culmination of decades of compression expertise.
Unlike codecs designed for file input/output and static video tool
presets, IDT’s encoders are optimized for real-time live encoding.
The IDT encoder constantly adapts to scene complexity to
guarantee a predictable encoding duration per frame. With the
frame encoding duration bounded, pipeline latency becomes
deterministic and easy to manage.
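IDT's actual adaptation algorithm is not described here; the sketch below is only a generic illustration of the deadline-driven idea, using an invented "encoder" cost model: when a frame runs over its time budget, the effort level (number of video tools) is reduced, and when there is slack, effort is raised again.

```python
import random

random.seed(0)
FRAME_BUDGET_MS = 16.0   # hypothetical per-frame encoding deadline

def encode_frame(effort, complexity):
    """Stand-in for an encoder call: cost grows with effort and complexity."""
    return 2.0 + 1.5 * effort * complexity   # milliseconds, invented model

effort = 8   # current "tool level" (1 = fastest, 10 = best quality)
for frame in range(50):
    complexity = random.uniform(0.5, 2.0)   # scene complexity varies over time
    spent = encode_frame(effort, complexity)
    # Adapt: shed tools when over budget, re-enable them when there is slack.
    if spent > FRAME_BUDGET_MS and effort > 1:
        effort -= 1
    elif spent < 0.7 * FRAME_BUDGET_MS and effort < 10:
        effort += 1
print("final effort level:", effort)
```

The point of the sketch is the control loop, not the numbers: encoding effort tracks complexity so the per-frame duration stays near a fixed budget instead of drifting with the content.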
Another pitfall of static presets is that the user must
provision CPU cycle headroom for the worst-case scene complexity,
sacrificing performance unnecessarily. Because IDT’s encoders
dynamically adjust to scene complexity, no idle headroom is
needed, enabling the user to maximize the density and quality
per server. When more CPU cycles are available per instance, the
IDT encoder applies its full set of video tools for even
higher quality than x264, with bitrate savings of up to 20%.
Due to the nature of live video, video buffers must be tightly
managed to ensure a high quality of experience. High bitrate
variance makes network bandwidth management a challenge,
causing the client player to switch profiles too often or
re-buffer during stalls. IDT’s rate control provides a higher
quality of experience with fewer stalls and profile switches.
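The buffer behavior described above can be sketched with a standard leaky-bucket model (the same idea behind H.264's VBV buffer); this is a generic textbook model, not IDT's rate control, and the frame sizes and bitrates below are hypothetical.

```python
# Toy leaky-bucket model of a client buffer: the network fills it at the
# stream bitrate while the decoder drains one frame per frame interval.
def buffer_level_trace(frame_bits, bitrate_bps, fps, start_bits):
    """Return the buffer level (bits) after each frame is consumed."""
    fill_per_frame = bitrate_bps / fps   # bits arriving per frame interval
    level = start_bits
    trace = []
    for bits in frame_bits:
        level = level + fill_per_frame - bits
        trace.append(level)
    return trace

# A steady stream keeps the buffer level flat; a burst of oversized
# frames drains it below zero, i.e. the player stalls and re-buffers.
steady = buffer_level_trace([50_000] * 8, 1_500_000, 30, 200_000)
bursty = buffer_level_trace([10_000] * 3 + [300_000] * 5, 1_500_000, 30, 200_000)
print(min(steady), min(bursty))
```

Rate control that caps bitrate variance keeps the trace inside bounds, which is why it translates directly into fewer stalls and profile switches at the player.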
The following table compares the channel density of x264 and
the IDT H.264 encoder when encoding three well-known public
content sources on the same infrastructure. These results are
based on a dual Intel Xeon E5-2697 v3 server, but similar 2x
density ratios can also be demonstrated on other hardware
configurations as well as on public cloud instances such as
AWS and Azure. Video quality was verified to be similar using
both the PSNR and VMAF objective metrics, as well as
side-by-side subjective testing.
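Of the two objective metrics mentioned, PSNR has a simple closed form (VMAF requires Netflix's reference library). The sketch below computes PSNR between two tiny hypothetical pixel sequences; real comparisons run over full decoded frames.

```python
import math

def psnr(ref, test, peak=255.0):
    """PSNR in dB between two equal-length 8-bit pixel sequences."""
    assert len(ref) == len(test)
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf   # identical content
    return 10.0 * math.log10(peak * peak / mse)

frame_a = [16, 200, 45, 90]    # hypothetical reference pixels
frame_b = [18, 198, 44, 92]    # hypothetical slightly-degraded pixels
print(round(psnr(frame_a, frame_b), 2))
```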