I just started a consulting project relating to 360° VR video, and have some introductory conclusions. I am only an egg, as Robert Heinlein might say, but I thought I would share them.
First, done right, VR can be an incredibly powerful
medium, capable of a level of immersion
that can’t be matched in the world of “flat video,”
the pejorative designation for the 2D video we
watch on phones, TVs, and computer screens.
The key phrase is “done right.”
Second, VR is a challenging medium with
little margin for error. Where producers discuss
quality of experience for flat video with
terms like abandonment rate, in the VR world
it’s cybersickness or nausea, which presumably
generates a much stronger antipathy toward
your company or brand than does a pre-roll ad.
Third, resolution is critical for effective VR
video, and the numbers work against you. YouTube delivers its top-quality VR video at 4K,
which sounds great until you realize that this
4K represents the entire 360° view. Devices like
the Oculus Rift or HTC Vive have fields of view
of about 110°, or roughly 30% of the 360° frame (110/360), which
means at any one time, you see only about 30% of the
horizontal pixels, or about 1,200 pixels from the
original 4K. The Rift and Vive have a display
resolution of 1080x1200 per eye, so it’s a pretty
good match, but if you deliver 2K video, the player has
to upscale the visible region to roughly double its resolution. That
can cause pixelation and softness.
These are the numbers for mono video, where
both eyes see the same image. If you shoot stereoscopic VR, with a separate video for each eye, you have
to pack both videos into the 4K stream, halving
the resolution of each.
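
As a rough sketch of the arithmetic above, the short Python example below estimates how many horizontal pixels of the full frame land inside the headset's field of view. It assumes that "4K" means a frame 3840 pixels wide and "2K" means 1920, and that stereoscopic video is packed side by side so each eye gets half the frame width; those are illustrative assumptions, and the function names are hypothetical.

```python
def visible_width(frame_width: int, fov_deg: float = 110.0, stereo: bool = False) -> float:
    """Horizontal pixels of a 360-degree frame that fall inside the field of view."""
    # Assumption: side-by-side stereo packing gives each eye half the frame width.
    per_eye_width = frame_width / 2 if stereo else frame_width
    return per_eye_width * fov_deg / 360.0


def upscale_factor(frame_width: int, display_width: int = 1080,
                   fov_deg: float = 110.0, stereo: bool = False) -> float:
    """How much the visible region must be stretched to fill one eye's display."""
    return display_width / visible_width(frame_width, fov_deg, stereo)


if __name__ == "__main__":
    cases = [("4K mono", 3840, False), ("2K mono", 1920, False), ("4K stereo", 3840, True)]
    for label, width, stereo in cases:
        pixels = visible_width(width, stereo=stereo)
        factor = upscale_factor(width, stereo=stereo)
        print(f"{label}: {pixels:.0f} visible pixels, scale factor {factor:.2f}")
```

Under these assumptions, a scale factor near 1 means the source roughly matches the display, while a factor near 2 means the player is stretching the picture, which is where softness creeps in.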
While field of view is your enemy from a resolution perspective, it can also be your friend.
That is, since the viewer only sees what’s in the
field of view at any one time, why not just send
that field of view, with a lower-quality buffer
around the edges to make sure there’s something to watch if the viewer quickly turns his
or her head?
To start, understand that the default frame
for VR video is called an equirectangular layout,
which is what you get when you map a sphere
to a rectangle. As an example, a typical world
map is an equirectangular image. There’s lots of
distortion at the poles since you have to stretch
them horizontally to achieve the same width as
the equator. The problem with the equirectangular
image is that it delivers the same quality for
all parts of the image, even though the viewer can
only see, at most, 30% of the image at a time.
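
To make the sphere-to-rectangle mapping concrete, here is a minimal Python sketch of the standard equirectangular projection, in which longitude maps linearly to x and latitude maps linearly to y. The function names and the 3840x1920 frame size are assumptions chosen for illustration.

```python
import math


def equirect_pixel(yaw_deg: float, pitch_deg: float, width: int, height: int):
    """Map a viewing direction to continuous (x, y) coordinates in the frame.

    yaw_deg:   longitude, 0 at the center of the frame, wrapping at +/-180.
    pitch_deg: latitude, +90 at the top edge (north pole), -90 at the bottom.
    """
    x = ((yaw_deg + 180.0) % 360.0) / 360.0 * width
    y = (90.0 - pitch_deg) / 180.0 * height
    return x, y


def horizontal_stretch(pitch_deg: float) -> float:
    """How much wider a feature is drawn at this latitude than at the equator."""
    # A circle of latitude shrinks by cos(latitude) on the sphere, but every
    # row of the rectangle has the same width, so the row is stretched by
    # the inverse of that factor. It grows without bound toward the poles.
    return 1.0 / math.cos(math.radians(pitch_deg))


if __name__ == "__main__":
    print(equirect_pixel(0, 0, 3840, 1920))   # straight ahead, frame center
    print(round(horizontal_stretch(60), 2))   # stretch at 60 degrees latitude
```

The stretch factor is why rows near the poles spend many pixels on very little of the scene.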
In a blog post called “Next-Generation Video Encoding Techniques for 360° Video and VR”
(go2sm.com/nextgenvr), two Facebook employees discussed pyramid encoding, which divides
each uploaded video into 30 viewports. Each
version contains a full-resolution rendering of
its own viewport, plus much lower-resolution detail for the rest of the original frame.
Using 1-second segments, the player monitors
the field of view and switches versions quickly with head
movement. If the viewer whips her head around,
she’ll see a low-resolution version for a moment,
but quality should quickly improve. The authors
claim that this approach reduced delivered file
size by 80%.
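
The player-side choice described above can be sketched in a few lines of Python: given the current head direction, pick the pre-encoded viewport whose center is closest. The layout of the 30 viewport centers below (three rows of ten) is purely hypothetical, since the summary here doesn't say how they are arranged, and the function names are illustrative.

```python
import math

# Hypothetical layout: 3 pitch rows x 10 yaw steps = 30 viewport centers.
VIEWPORTS = [(yaw, pitch) for pitch in (-45, 0, 45) for yaw in range(0, 360, 36)]


def angular_distance(yaw1, pitch1, yaw2, pitch2) -> float:
    """Great-circle angle in degrees between two viewing directions."""
    y1, p1, y2, p2 = map(math.radians, (yaw1, pitch1, yaw2, pitch2))
    cos_d = (math.sin(p1) * math.sin(p2)
             + math.cos(p1) * math.cos(p2) * math.cos(y1 - y2))
    # Clamp to guard against floating-point drift just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_d))))


def pick_viewport(head_yaw: float, head_pitch: float) -> int:
    """Index of the viewport whose center is nearest the head direction."""
    return min(
        range(len(VIEWPORTS)),
        key=lambda i: angular_distance(head_yaw, head_pitch, *VIEWPORTS[i]),
    )


if __name__ == "__main__":
    index = pick_viewport(40, 5)
    print(index, VIEWPORTS[index])
```

A real player would run a check like this at each segment boundary, so shorter segments mean less time spent looking at the low-resolution fallback after a quick head turn.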
A different approach was described in a paper called “Viewport-Aware Adaptive 360° Video Streaming Using Tiles for Virtual Reality”
(go2sm.com/viewportvr). Here, the authors divided the equirectangular frame into multiple
tiles, each encoded separately at several levels of
quality, like a traditional encoding ladder. The
player retrieves the highest-quality rungs for
tiles in the field of view, with declining quality for
other tiles. So the tiles directly behind the
current field of view, and the poles, might be
the lowest quality, with improving quality delivered closer to the current viewport. This is
all managed via a DASH manifest.
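
A simple way to picture the tiled approach is to assign each tile a rung on the ladder based on how far its center sits from the viewer's current direction. The Python sketch below does that for a hypothetical 8x4 tile grid with three rungs and made-up distance thresholds; none of these numbers come from the paper, and the actual DASH requests are left out.

```python
import math

COLS, ROWS = 8, 4           # hypothetical tile grid over the equirectangular frame
THRESHOLDS = (60.0, 120.0)  # hypothetical cutoffs in degrees: rung 0, rung 1, else rung 2


def angular_distance(yaw1, pitch1, yaw2, pitch2) -> float:
    """Great-circle angle in degrees between two viewing directions."""
    y1, p1, y2, p2 = map(math.radians, (yaw1, pitch1, yaw2, pitch2))
    cos_d = (math.sin(p1) * math.sin(p2)
             + math.cos(p1) * math.cos(p2) * math.cos(y1 - y2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_d))))


def tile_center(row: int, col: int):
    """(yaw, pitch) of a tile's center; row 0 is the top of the frame."""
    yaw = -180.0 + (col + 0.5) * 360.0 / COLS
    pitch = 90.0 - (row + 0.5) * 180.0 / ROWS
    return yaw, pitch


def tile_plan(head_yaw: float, head_pitch: float) -> dict:
    """Map each (row, col) tile to a quality rung; 0 is the highest quality."""
    plan = {}
    for row in range(ROWS):
        for col in range(COLS):
            d = angular_distance(head_yaw, head_pitch, *tile_center(row, col))
            plan[(row, col)] = 0 if d <= THRESHOLDS[0] else 1 if d <= THRESHOLDS[1] else 2
    return plan


if __name__ == "__main__":
    plan = tile_plan(0, 0)
    for row in range(ROWS):
        print(" ".join(str(plan[(row, col)]) for col in range(COLS)))
```

Printed as a grid, the plan shows the top rung clustered around the viewport and the bottom rung directly behind the viewer, which is the pattern described above.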
I’m working out what all of this means from
a compression perspective, but it’s refreshing
to see such creative solutions to bandwidth-related problems.
Three Truths About VR
Jan Ozer (email@example.com) is a streaming
media producer and consultant, a frequent contributor to
industry magazines and websites on streaming-related topics,
and the author of Video Encoding by the Numbers. He blogs
frequently at streaminglearningcenter.com.