ladder experience for users with the same effective bandwidth. More on this below.
This analysis produced more than 1,200 data
points, making it tough to figure out how to synthesize all this data into usable comparison
points. Since it’s summertime, and since my editor Eric Schumacher-Rasmussen loves baseball,
I decided to use the following schema.
Wins and Losses: For each file, a technology got
a “win” when the per-title technology increased
the data rate and quality without pushing the
PSNR score beyond 45 dB, which wouldn’t be
perceivable to most viewers. A technology also
got a win when it reduced the data rate, but
didn’t push PSNR values below 35 dB, which could
result in visible artifacts. Conversely, a technology got a “loss” when it violated either rule.
Errors: In order to function properly, encoding
ladders need to have fixed distances between
the various rungs. For example, Apple recommends that higher rungs be no more than
200% of the data rate of the next-lower rung. So, each
technology received an “error” when rungs exceeded 2.05x the data rate of the immediately lower rung, starting with rung 3 to exclude
the two lowest bitrate rungs. Capped CRF and
Capella also got an error when VMAF quality
dropped by more than 6 points in one of the top
four rungs, which according to Netflix would
equal a noticeable difference in quality. I excluded Brightcove from this measure because of
the comparison issue described above.
Saves: A technology received a “save” when it
eliminated a rung on the encoding ladder or
an encoding pass, both of which reduced encoding costs.
Home runs: A technology scored a “home run”
when it improved the PSNR value of the output
files for any clip by greater than 1% in four or
more ladder rungs.
Bandwidth saved: I also tracked the net bits
per second in the output files saved by each
technology.
I summarized the scores of the three technologies that I fully analyzed in Table 3.
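For readers who want to apply a similar box score to their own tests, the rules above can be sketched in a few lines of Python. This is only an illustration, not the scoring script used for this article. The function names and argument layout are hypothetical, the win/loss helper treats an unchanged data rate as a win and checks only the two PSNR limits, and the VMAF check assumes the drop is measured against the same rung of the fixed ladder.

```python
# Thresholds taken from the scoring rules described above.
PSNR_CEILING_DB = 45.0   # beyond this, extra quality isn't perceivable to most viewers
PSNR_FLOOR_DB = 35.0     # below this, visible artifacts are possible
MAX_RUNG_RATIO = 2.05    # slightly looser than Apple's 200% recommendation
MAX_VMAF_DROP = 6.0      # points


def win_or_loss(base_kbps, new_kbps, new_psnr_db):
    """Score one file as a "win" or "loss" (hypothetical helper)."""
    if new_kbps > base_kbps and new_psnr_db > PSNR_CEILING_DB:
        return "loss"  # spent extra bits on quality viewers can't perceive
    if new_kbps < base_kbps and new_psnr_db < PSNR_FLOOR_DB:
        return "loss"  # saved bits at the cost of possible visible artifacts
    return "win"       # assumption: an unchanged data rate also counts as a win


def rung_gap_errors(kbps_low_to_high):
    """Count rungs, from rung 3 upward, above 2.05x the rung immediately below."""
    return sum(
        1
        for i in range(2, len(kbps_low_to_high))  # index 2 is rung 3
        if kbps_low_to_high[i] > MAX_RUNG_RATIO * kbps_low_to_high[i - 1]
    )


def vmaf_drop_errors(base_vmaf, new_vmaf):
    """Count top-four rungs where VMAF fell by more than 6 points.

    Assumption: each rung is compared with the same rung of the fixed
    ladder, and both lists run from the lowest rung to the highest.
    """
    pairs = zip(base_vmaf[-4:], new_vmaf[-4:])
    return sum(1 for base, new in pairs if base - new > MAX_VMAF_DROP)


def is_home_run(base_psnr, new_psnr):
    """True when PSNR improved by more than 1% in four or more rungs."""
    improved = sum(1 for base, new in zip(base_psnr, new_psnr) if new > base * 1.01)
    return improved >= 4
```

Each function takes plain lists of per-rung values ordered from the lowest rung to the highest, so the checks can be run against figures exported from any quality-measurement tool.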
Now, let’s dive into the individual technologies.
The DIY Option: Capped CRF
Constant rate factor is an encoding mode
that adjusts the file data rate up or down to
achieve a selected quality level rather than a
specific data rate. CRF values range from 0 to
51, with lower numbers delivering higher quality scores. Multiple codecs support CRF, including x264, x265, and VP9.
On its own, CRF is unusable for adaptive bitrate streaming, where data rates in the ladder
rungs need to be closely adhered to. However, by adding a “cap” to CRF, you limit the data
rate to that figure. An FFmpeg argument implementing capped CRF would look like this:
ffmpeg -i inputfile -crf 23 -maxrate 6750k
This tells FFmpeg to encode at a quality level
of 23, but to cap the data rate at 6750Kbps (this was
the 1080p stream). For low-motion clips, the
CRF value would limit the data rate, as the required quality could be achieved at data rates
lower than the cap. For hard-to-encode clips,
the cap would kick in to control the data rate.
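If you script your encodes, a fuller version of that command might be assembled as in the following Python sketch. Several details here are assumptions rather than the settings used in these tests: the libx264 codec choice, the file names, the helper name, and especially the -bufsize value. FFmpeg's documentation notes that -maxrate requires -bufsize to be set, and the default of twice the cap used below is only a placeholder.

```python
def capped_crf_args(src, dst, crf=23, maxrate_kbps=6750, bufsize_kbps=None):
    """Build an FFmpeg argument list for a capped CRF encode (hypothetical helper)."""
    if bufsize_kbps is None:
        # Placeholder assumption: buffer of twice the cap. Tune for your player.
        bufsize_kbps = 2 * maxrate_kbps
    return [
        "ffmpeg",
        "-i", src,
        "-c:v", "libx264",                # assumed codec; the text also mentions x265 and VP9
        "-crf", str(crf),                 # target quality level
        "-maxrate", f"{maxrate_kbps}k",   # the cap on the data rate
        "-bufsize", f"{bufsize_kbps}k",   # VBV buffer, needed for the cap to apply
        dst,
    ]
```

The returned list can be passed to subprocess.run(args, check=True) on a machine with FFmpeg installed. Raising the crf argument, or lowering maxrate_kbps, is how you would rein in a rung that comes out too large.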
Looking back at Table 1, capped CRF can
adjust the data rate, but not the number of
rungs or their resolution. There’s also no independent bitrate control or post-encode quality check.
Looking at the box score in Table 3, capped
CRF had 14 wins out of 14 completed trials. The
eight errors all related to jumps from the 720p
file to the 1080p file of greater than 2.05x. For the
talking-head clip, for example, the 720p clip had
a data rate of 1.04Mbps, while the 1080p clip was three times
larger at 3.14Mbps. This would strand many viewers
at the 720p clip, reducing overall QoE. If I were
deploying capped CRF, I would try a lower quality value like CRF 25 for the 1080p file to limit