Artificial intelligence (AI), deep learning, and natural language processing will be the next transformative technologies for streaming. Each will have an impact on every stage of production, from content creation to consumption. With the proliferation of AI across many different industries, there's no doubt that it will be heavily utilized for live streaming on a wider scale in the near future.
Some of the companies and technologies making headway in this space include Google Cloud Video Intelligence, Conviva's Video AI Architecture, Nvidia DLA, and IBM's Watson technology. All of these technologies currently deploy AI in varying degrees, especially in the cloud, but we'll soon see AI making inroads into other facets of streaming.
AI can help replace the production workforce behind the camera and perform the mundane, time-consuming tasks involved in labor-intensive content and data management. Currently, AI is being used in viewer metrics, network and technical troubleshooting, and ad serving, but there are other potential uses that remain virtually untapped.
Smart Camera Tracking and Video Frame Composition
Although there are currently several motion-tracking camera systems that allow automated tracking of moving subjects in front of the camera, they all require producers to place transmitters or sensors on the subject. AI will be able to track speakers, athletes, or entertainers without needing any type of additional hardware or sensors. Deep learning algorithms will analyze the video and follow people doing different activities, whether on a stage or in other environments, while simultaneously keeping them perfectly framed within the camera. Even now, this technology enables drones to follow athletes sprinting on a field and to track their targets with unrelenting precision.
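The framing side of sensor-free tracking can be sketched in a few lines. In this hypothetical example, a deep learning detector (not shown) supplies a bounding box for the subject each frame, and the function computes how far off-center the subject sits; a gimbal or drone controller would turn those offsets into pan/tilt commands. All names here are illustrative, not from any real camera SDK.

```python
# Hypothetical sketch: compute pan/tilt correction offsets that keep a
# detected subject centered in the frame. The bounding box would come
# from a deep learning detector running on the video feed.

def framing_offset(frame_w, frame_h, box):
    """Return (dx, dy) in pixels from the frame center to the subject's center.

    box is (x, y, w, h) -- the detector's bounding box for the subject.
    Positive dx means the subject is right of center; positive dy, below it.
    """
    x, y, w, h = box
    subject_cx = x + w / 2
    subject_cy = y + h / 2
    dx = subject_cx - frame_w / 2
    dy = subject_cy - frame_h / 2
    return dx, dy

# Example: a 1920x1080 frame with the subject left of and slightly below center
dx, dy = framing_offset(1920, 1080, (400, 400, 200, 400))  # (-460.0, 60.0)
```

In a real system these offsets would feed a control loop (e.g., a PID controller) rather than being applied raw, so the camera motion stays smooth.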
In addition, there is a direct correlation between creative visual storytelling and mathematics. The key components of video imaging—frame rates, focal lengths, aperture, and
composition—are based on ratios and require
at least a basic understanding of the math behind them to use them effectively.
The Golden Ratio (a proportion, prized for
millennia by artists, architects, and scientists
alike, in which the ratio of two numbers is the
same as the ratio of their sum to the larger of the
two quantities) can be programmed into deep-learning-based visual perception algorithms.
Thus, AI-enabled cameras can be optimized
to capture the most aesthetically pleasing video images for the human eye, a task that has
traditionally been performed by camera operators. AI will eventually replace the need for
a camera operator in most cases. In addition,
AI will be programmed to track subjects using the golden ratio and the principles of visual hierarchy as its foundation.
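As a minimal sketch of how such a rule could be encoded, the snippet below computes the golden ratio and uses it to pick a target point in the frame where a tracked subject might be placed instead of dead center. The function name and the idea of steering toward a single point are illustrative assumptions, not part of any real camera system.

```python
import math

# The golden ratio: a/b = (a+b)/a, which solves to (1 + sqrt(5)) / 2
PHI = (1 + math.sqrt(5)) / 2  # ~1.618

def golden_target(frame_w, frame_h):
    """Return an (x, y) point that divides the frame by the golden ratio.

    A hypothetical tracking algorithm could steer the camera so the
    subject's face lands near this point rather than the geometric center.
    """
    return frame_w / PHI, frame_h / PHI

x, y = golden_target(1920, 1080)  # ~ (1186.6, 667.5)
```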
Real-Time Video Switching
Deep learning algorithms are automating
the editing and video creation process, and
will assist in bringing AI to real-time video
switching as well. Intelligent software will select optimal camera shots or angles based on the content of the stream, using facial, emotion, gesture, clothing, body, and color recognition, along with other imaging data and cues.
The program will determine what is in each
frame of the stream and decide if it is a wide,
medium, or close-up angle, along with choosing what subject matter or person it includes.
The software will analyze the audio, video,
and other aspects of the stream and switch a
full event or show by recognizing faces, speech,
movements, or events, based on many other cues in the stream.