model. However, it compares results with subjective MOS scores trains and fine-tunes the model
so that it improves over time. Plus, the machine
learning itself can be tuned, so one model could
represent animations, another sports, and so
on, allowing organizations to train the metric
for videos most relevant to them.
Netflix’s VMAF is another metric that can be
trained, using what’s called a support vector machine. Since the primary use for VMAF is to help
Netflix produce encoding ladders for its per-title encoding, the Netflix training dataset includes clips ranging in resolution from 384x288
to 1080p at data rates ranging from 375Kbps to
20Mbps. Again, by correlating the mathematical
result with subjective MOS scores, VMAF became much better at making the 540p vs. 720p
decision mentioned above.
As the name suggests, VMAF is a fusion of
three metrics, two that measure image quality
and one that measures temporal quality, making it a true “video” metric. Similarly, Tektronix’s
TekMOS metric includes a temporal decay filter
that helps make the scoring more accurate for
video. TekMOS also has a region of interest filter, which VMAF currently lacks. One huge benefit of VMAF is that Netflix chose to open source
the metric, making it available on multiple platforms, as you’ll learn more about below.
Which Metric Is Best?
No article on metrics would be complete without scatter graphs like those shown in Figure 2,
which were adopted slightly from Netflix’s blog
post on VMAF ( go2sm.com/netflixvmaf). The
scatter graph on the left compares the VMAF
scores (left axis) with
actual MOS scores (
bottom axis). The graph on
the right does the same
for a different metric entitled PSNRHVS.
If the scores corresponded exactly, they
would all fit directly on
the red diagonal line,
though, of course, that
never happens. Still,
the closer to the line,
and the tighter the pattern around the line, the
TekMOS metric and