The temporary solution adopted by the industry
hinged on encoding content at varying bitrates. This is
generally known as the bitrate ladder, with each higher rung being a higher resolution at a progressively
higher data rate.
Playback of each rung on the bitrate ladder is determined by near-real-time feedback from the
end user’s device, which sends back confirmation each
time a portion of the stream (known interchangeably
as a “segment” or “chunk”) is received. If the confirmation takes too long, the media server assumes the cause
is network congestion and steps down one or more
rungs on the bitrate ladder, sending lower-bandwidth
segments for the next portion of the stream.
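The step-down behavior described above can be sketched as a simple heuristic. The ladder rungs, segment duration, and thresholds below are illustrative values, not any particular media server's defaults.

```python
# Illustrative bitrate ladder: (vertical resolution, kbps), lowest rung first.
LADDER = [(360, 800), (480, 1400), (720, 2800), (1080, 5000)]

def next_rung(current: int, ack_delay_s: float, segment_s: float = 4.0) -> int:
    """Choose the ladder index for the next segment.

    If the client's confirmation for the last segment arrives more slowly
    than the segment plays out, assume network congestion and step down;
    if it arrives with plenty of headroom, try stepping up.
    """
    if ack_delay_s > segment_s:           # client can't keep up: congestion
        return max(current - 1, 0)
    if ack_delay_s < 0.5 * segment_s:     # ample headroom: step up a rung
        return min(current + 1, len(LADDER) - 1)
    return current                        # otherwise hold the current rung

# On the 720p rung, a 5 s confirmation for a 4 s segment steps down to 480p.
print(LADDER[next_rung(2, ack_delay_s=5.0)])  # (480, 1400)
```

Real servers and players use more elaborate throughput estimators, but the core decision is this kind of threshold test per segment.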
This approach, though, has led to over-encoding. The number of rungs a bitrate ladder needs is always up for debate. Some content types need more rungs to ensure that the visual difference between segments on two adjacent rungs remains minimal for the end user's viewing experience.
How Does CAE Work?
A natural step in the progression toward better encoding approaches was to identify the overall type of
content of a particular title (be it an episode in a TV
series or a stand-alone movie) and then choose a set
of encoding parameters on a per-title basis.
That’s all well and good, but it’s only a small step on
the journey to encoding nirvana. Why? Because everyone’s aware that even a single content title contains
multiple types of content: action, talking heads, environmental shots, and everything in between.
A few innovative souls in the industry tried a radical
approach a few years ago—using multiple codecs within a single title, with the best codec chosen for each
shot or series of shots, known as a scene.
The problem with scene-based encoding isn’t just that
the end users’ players need to be able to switch seamlessly between codecs—a herculean feat that proved
to be the rather quick undoing of the radical approach
mentioned above—but also that it takes an inordinate
amount of manual labor to choose the best codec for each scene.
CAE, on the other hand, uses machine learning to
compare content against known parameters for a
given device and/or media player type. These parameters, coupled with the anticipated bandwidth and optional data regarding the average bandwidth across an
over-the-top (OTT) operator’s viewing footprint, allow
the CAE approach to take a significant amount of the
guesswork out of recommending the bitrate ladder.
In some instances, there will be fewer rungs on the bitrate ladder (e.g., some ladders may have wider "spaces" between the data-rate rungs, since the content does not vary perceptibly across wide bandwidth ranges), but in other instances there may be less "space" between data-rate rungs, when even a few hundred kilobits per second may reveal perceptual visual differences between rungs.
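The variable rung spacing described above can be sketched as a simple pruning rule. The candidate rates, complexity score, and spacing ratio below are hypothetical, chosen only to illustrate how simple content yields fewer, wider-spaced rungs while complex content keeps rungs closer together.

```python
# Candidate data rates in kbps, and a hypothetical per-title complexity
# score in [0, 1] (0 = near-static content, 1 = high-motion content).
CANDIDATES = [400, 800, 1400, 2800, 5000, 8000]

def recommend_ladder(complexity: float, p95_kbps: int) -> list[int]:
    """Keep a candidate rung only if it sits usefully far above the last
    kept rung; complex content tolerates tighter spacing because even a
    few hundred kbps can change its perceived quality."""
    min_ratio = 2.0 - complexity   # simple titles: ~2x spacing; complex: ~1.1x
    ladder: list[int] = []
    for rate in CANDIDATES:
        if rate > p95_kbps:        # beyond the audience's 95th-percentile bandwidth
            break
        if not ladder or rate / ladder[-1] >= min_ratio:
            ladder.append(rate)
    return ladder

print(recommend_ladder(0.1, 6000))  # simple title: [400, 800, 2800]
print(recommend_ladder(0.9, 6000))  # complex title: [400, 800, 1400, 2800, 5000]
```

A production CAE system would derive the complexity score from per-shot analysis rather than a single number, but the spacing trade-off works the same way.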
As with all machine-learning solutions, including
those with artificial intelligence (AI) algorithms, the
early progress made in bitrate ladder enhancements
will give way to more complex encoding parameters.
There’s still room for improvement, of course, including options to use different encoding parameters
for different scenes in a title, as well as options to determine how many rungs are needed on the bitrate
ladder for a given title. Yet the CAE approach is steadily progressing as a viable means to take the burden of more mundane compression decisions off the shoulders of compressionists and content owners.
What Am I Buying?
When trying to differentiate between various
CAE-enabled encoding solutions, whether hardware
appliances or online encoding services, here are a
few key pointers to consider:
First, as mentioned above, does the CAE solution optimize for both the content type and the delivery context of a set of known OTT operators? This could vary from devices and intended uses to known peering arrangements used to load-balance content delivery at scale.
Second, does the CAE solution you’re considering
lean more toward saving bandwidth or increasing visual quality? These are the two distinct ways that CAE
can be used to create a better viewing experience. One
maintains equivalent quality while simultaneously lowering the overall and average bandwidths by 15–20%,
which means buffering is less likely and content can reach a wider audience when average bandwidths are significantly lowered. The other approach continues to use the same amount of bandwidth
previously assigned to a rung of the bitrate ladder, but
offers an opportunity to dramatically increase the quality of the visual image at that “standard” bandwidth.
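The bandwidth-saving path can be put in rough numbers. This back-of-the-envelope helper uses hypothetical figures to show what a 15% reduction on an average 5,000 Kbps rung means in delivered bytes.

```python
def delivery_saved_gb(hours_streamed: float, kbps_saved: float) -> float:
    """Gigabytes of delivery avoided when the average rung drops by kbps_saved."""
    return hours_streamed * 3600 * kbps_saved * 1000 / 8 / 1e9

# Hypothetical: 1 million viewing hours, 15% shaved off a 5,000 Kbps average rung.
saved = delivery_saved_gb(1_000_000, 5000 * 0.15)
print(round(saved))  # 337500 GB no longer delivered, at equivalent quality
```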
Third, as a byproduct of the points above, does the
CAE solution you’re considering save bandwidth and
storage, but inadvertently increase the transcoding
costs by failing to reduce the number of rungs on the bitrate ladder?
Fourth—and this is key—does the CAE solution
break basic encoding parameters such as variable bitrate (VBR) or Group of Pictures (GOP) settings in achieving
bandwidth savings or bitrate-ladder rung reductions?
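One concrete compliance check implied by this question: for players to switch rungs seamlessly, every rung must place its keyframes (GOP boundaries) at identical positions. A minimal sketch, using hypothetical keyframe lists rather than output from a real demuxer:

```python
def gops_aligned(rungs: dict[str, list[int]]) -> bool:
    """rungs maps a rung name to its keyframe positions in frame numbers.

    Returns True only when every rung starts its GOPs at the same frames,
    so a player can switch rungs at any segment boundary without breakage.
    """
    positions = {tuple(frames) for frames in rungs.values()}
    return len(positions) <= 1

# Aligned ladder: keyframes every 48 frames on both rungs.
print(gops_aligned({"720p": [0, 48, 96], "1080p": [0, 48, 96]}))   # True
# A CAE pass that drifted one rung's GOP structure would fail the check.
print(gops_aligned({"720p": [0, 48, 96], "1080p": [0, 50, 100]}))  # False
```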
Enabling maximum reach also means maintaining
100% compliance with existing media players and devices. The benefit for CAE solutions, whether they be
on-prem or cloud-based processes, is the fact that one
compression expert’s learnings can be fine-tuned and
perhaps even automated in a way that’s beneficial to
thousands of encoding sessions via firmware-upgradeable algorithms in an on-prem encoder or transcoder,