WebRTC is designed to use UDP and RTP
by default, although it can be set to use a TCP-based fallback if anomalies are detected. Not
surprisingly, the video codecs for WebRTC
are VP8 and VP9, which Google has continued to advance while also working on the Alliance for Open Media’s AV1 codec. AVC, better known as H.264, can also be used, although
the audio companion to AVC cannot: Audio is
usually an open-source codec called Opus, not
the more widely used AAC.
WebRTC is able to achieve very low latency, but doesn’t normally operate well in a typical streaming environment, meaning one
that is based on real-time messaging protocol
(RTMP), HLS, or AVC with AAC.
Companies such as nanocosmos GmbH, a
Berlin-based company that focuses on solutions
from the media server to the end user, have put
forth concepts to create a scalable WebRTC live
scenario. But the company acknowledges that,
on the delivery side, it is missing CDN and vendor support, including any support by Apple.
Many different approaches have attempted to “repair” TCP, including some that attempt to circumvent its inherent problems by
narrowing the window size (the amount of
data a sender may have in flight before the end-user device must acknowledge receipt; packets
that go unacknowledged are retransmitted). If
the TCP window is too small, though, trouble
ensues as packet transmissions get backed up
waiting for acknowledgments and retransmissions multiply.
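As a rough sketch of what window tuning means in practice, the receive buffer a socket advertises, which caps how much unacknowledged data the sender may have in flight, can be adjusted through standard socket options. The 16 KB figure below is purely illustrative, and the value the OS actually applies is platform-dependent:

```python
import socket

# Minimal sketch: the TCP receive buffer size influences the window a
# receiver advertises to the sender. SO_RCVBUF is a standard socket option.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Shrink the receive buffer to 16 KB (an illustrative value; the OS may
# round or adjust it).
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 16 * 1024)

# Read back the value the OS actually applied (Linux, for example,
# typically doubles it to account for bookkeeping overhead).
applied = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(applied)
sock.close()
```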
Using WebSockets is one approach that attempts to create a persistent state of transmission (essentially a tunnel) as a way to
eliminate the buildup of TCP packet transmission errors inherent to very short TCP
windowing times. Still other solutions work
by lowering the segment size of an HTTP-delivered video (think HLS or MPEG-DASH) to a
level that allows for faster startup times and
lower overall latency.
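The arithmetic behind the segment-size approach is simple: players commonly buffer a few segments before playback begins, so minimum latency scales with segment duration. The three-segment buffer below is an illustrative assumption, not a spec requirement:

```python
# Rough glass-to-glass latency floor for segmented HTTP streaming:
# players typically buffer around 3 segments before starting playback,
# so latency scales with segment duration. Numbers are illustrative.
def min_latency(segment_seconds: float, buffered_segments: int = 3) -> float:
    return segment_seconds * buffered_segments

print(min_latency(6.0))  # classic 6-second HLS segments -> 18.0 seconds
print(min_latency(2.0))  # 2-second segments -> 6.0 seconds
```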
“HLS and DASH both suffer from the restriction to require file-based segments which
are pulled over HTTP requests,” says Oliver
Lietz, CEO of nanocosmos. “Due to the nature of HTTP and internet connections, the
segment size cannot be reduced below 2 seconds easily without sacrificing performance.”
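For illustration, a short-segment HLS media playlist pushed to that 2-second floor might look like the following (filenames and sequence numbers are hypothetical):

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:2
#EXT-X-MEDIA-SEQUENCE:1042
#EXTINF:2.000,
segment1042.ts
#EXTINF:2.000,
segment1043.ts
```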
These segments for HLS have traditionally been MPEG-2 transport stream (M2TS or
just TS), which itself is based on a decades-old
asynchronous transfer mode (ATM) protocol
designed for sending video signals across satellite. Besides the time it takes to segment video into 2- to 10-second segments, the actual
TS packaging has a relatively high number of
header bits as well as interleaved audio.
These header bits were useful for reassembling content sent over a satellite in a direct 1:1
link from an earth station to a satellite to a receiving satellite dish, but they are unnecessary
in a TCP environment where the network transmission protocol handles delivery sequencing.
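The fixed TS packet layout makes that header cost easy to estimate: every 188-byte TS packet spends at least 4 bytes on its header, and more when adaptation fields are present, so the floor on header overhead is a straightforward ratio:

```python
# TS packs media into fixed 188-byte packets, each carrying at least a
# 4-byte header (plus optional adaptation fields), so header overhead
# is at minimum 4/188 of the stream.
TS_PACKET_SIZE = 188
TS_HEADER_SIZE = 4

overhead = TS_HEADER_SIZE / TS_PACKET_SIZE
print(f"{overhead:.1%}")  # roughly 2.1% before adaptation fields
```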
Advances in the packaging of segments
were first addressed in late 2011, when Adobe
and Microsoft made the joint case for fragmented MP4 files, which would allow delivery of multiple permutations of video streams
(e.g., camera angles) and audio streams (e.g., alternate-language or commentary tracks) without the interleaving that slowed down
HLS through its reliance on the M2TS packaging approach.
MPEG-DASH adopted the fragmented MP4
(fMP4) approach, as did Apple in the most recent version of its iOS mobile platform. One
result of this move to fMP4 is an ability to avoid
altogether the restrictions of file-based segments of HLS and DASH.
For instance, nanocosmos uses frame-based
segments from an MP4 live stream, essentially
allowing fMP4 to act as the “segment” to achieve
ultra-low latency, with a fallback to HLS low latency for standard HLS players.
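What makes fMP4 amenable to this frame-based treatment is its box structure: every top-level box starts with a 4-byte big-endian size and a 4-byte type, and each moof+mdat pair is an independently deliverable fragment. A minimal sketch of walking those boxes (the sample bytes are synthetic, not a real stream):

```python
import struct

# Sketch of walking top-level MP4/fMP4 boxes: each box begins with a
# 4-byte big-endian size followed by a 4-byte ASCII type. In fMP4, each
# moof+mdat pair is a self-contained fragment, which is what lets a
# server push frame-level "segments" instead of whole files.
def list_boxes(data: bytes):
    boxes, offset = [], 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8:  # the size==0 and size==1 variants are omitted here
            break
        boxes.append(box_type.decode("ascii"))
        offset += size
    return boxes

# Synthetic stream: an empty 'moof' box followed by an empty 'mdat' box.
sample = struct.pack(">I4s", 8, b"moof") + struct.pack(">I4s", 8, b"mdat")
print(list_boxes(sample))  # -> ['moof', 'mdat']
```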
The sibling of TCP is called UDP, and it isn’t
necessarily designed to play well with others.
As a very simple, low-level internet protocol,
at least when compared to TCP, the UDP approach forgoes a specific handshake between
sender and receiver. This helps with speed of
delivery, but there is no guarantee of delivery
as packets are not confirmed by the receiver.
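That fire-and-forget behavior is visible even in a few lines of socket code: the sender simply emits a datagram at an address and moves on, with no connection setup and no acknowledgment. Here a local receiver happens to be listening, but nothing in UDP requires or confirms that:

```python
import socket

# UDP sketch: no handshake, no delivery guarantee. The sender fires a
# datagram at an address and returns immediately.
recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv.bind(("127.0.0.1", 0))        # let the OS pick a free port
addr = recv.getsockname()

send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send.sendto(b"frame-0001", addr)   # returns at once; no ACK is expected

data, _ = recv.recvfrom(1500)
print(data)
send.close()
recv.close()
```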
“The market was thirsty for an open source,
freely available, low-latency, UDP-like approach
for streaming over the internet,” says Peter Maag, chief marketing officer of Haivision,
which has jointly developed a protocol called
secure reliable transport (SRT) with Wowza to
blend the strengths of both UDP and TCP.
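To be clear, the following is not SRT's actual wire protocol, merely a toy illustration of the idea behind that blend: layering TCP-like reliability (sequence numbers plus receiver-driven retransmission requests) on top of UDP's fast, connectionless delivery:

```python
# Toy sketch (not SRT itself): the receiver tracks sequence numbers on
# incoming UDP-style datagrams and asks the sender, by number, for
# anything that never arrived.
def receive(packets, expected_count):
    """Collect sequenced packets and report which numbers need resending."""
    got = {seq: payload for seq, payload in packets}
    missing = [seq for seq in range(expected_count) if seq not in got]
    return got, missing

# Packet 1 was lost in transit; the receiver requests it by number.
got, missing = receive([(0, b"a"), (2, b"c")], expected_count=3)
print(missing)  # -> [1]
```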