Patent classifications
H04N19/27
Separation of graphics from natural video in streaming video content
Aspects of the subject disclosure may include, for example, a method that includes obtaining, by a processing system including a processor, video frames over a network; the processing system uses a machine learning algorithm to identify in each frame a first region comprising a natural image and a second region comprising a synthetic graphic image. The processing system separates the natural image from the synthetic graphic image to generate a natural video and a graphics video, encodes the natural video, and processes the graphics video to generate instructions for rendering graphic images at a client system. The client system performs a decoding procedure for the encoded video, a rendering procedure for client-side graphics in accordance with the instructions, and a compositing procedure to obtain a presentable video stream including the natural image and a client-side graphic corresponding to the synthetic graphic image. Other embodiments are disclosed.
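The server/client split this abstract describes can be sketched as follows. This is a hypothetical illustration, not the patented implementation: the machine learning classifier is replaced by a stub, and region dicts stand in for pixel data.

```python
# Hypothetical sketch of the pipeline in the abstract. The ML region
# classifier is replaced by a stub; region dicts stand in for pixel data.

def classify_regions(frame):
    """Stand-in for the machine learning classifier: label each region."""
    return [(r, "graphic" if r.get("synthetic") else "natural")
            for r in frame["regions"]]

def separate(frames):
    """Server side: split frames into a natural video and graphics instructions."""
    natural_video, graphics_instructions = [], []
    for frame in frames:
        natural, instructions = [], []
        for region, kind in classify_regions(frame):
            if kind == "graphic":
                # Emit a rendering instruction instead of encoding graphic pixels.
                instructions.append({"op": "draw", "shape": region["shape"],
                                     "pos": region["pos"]})
            else:
                natural.append(region)
        natural_video.append(natural)
        graphics_instructions.append(instructions)
    return natural_video, graphics_instructions

def composite(decoded_natural, graphics_instructions):
    """Client side: render instructions and overlay them on the decoded video."""
    return [frame + [{"client_rendered": i} for i in instructions]
            for frame, instructions in zip(decoded_natural, graphics_instructions)]
```

The key saving is that graphic regions travel as compact draw instructions rather than encoded pixels, and the client's compositing step merges both layers back into one presentable stream.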
METHOD AND DEVICE FOR GENERATING SPEECH MOVING IMAGE
A device for generating a speech moving image according to an embodiment includes: a first encoder that receives a person background image, which is the video part of the speech moving image and in which a portion related to the person's speech is covered with a mask, extracts an image feature vector from the person background image, and compresses the extracted image feature vector; a second encoder that receives a speech audio signal that is the audio part of the speech moving image, extracts a voice feature vector from the speech audio signal, and compresses the extracted voice feature vector; a combination unit that generates a combination vector from the compressed image feature vector and the compressed voice feature vector; and an image reconstruction unit that reconstructs the speech moving image of the person with the combination vector as an input.
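The encode-combine-reconstruct flow can be sketched as below. In the actual device the encoders and the reconstruction unit would be neural networks, so the arithmetic here is purely illustrative and all function names are hypothetical.

```python
# Illustrative stand-ins for the neural encoders and decoder in the abstract.

def first_encoder(masked_background):
    """Extract an 'image feature vector' and compress it (keep half the values)."""
    feature = [float(px) for px in masked_background]
    return feature[: max(1, len(feature) // 2)]

def second_encoder(speech_audio):
    """Extract a 'voice feature vector' and compress it the same way."""
    feature = [float(s) for s in speech_audio]
    return feature[: max(1, len(feature) // 2)]

def combination_unit(image_vec, voice_vec):
    """Generate the combination vector; concatenation is one simple choice."""
    return image_vec + voice_vec

def image_reconstruction_unit(combination_vec):
    """Stub decoder: a real system maps the vector back to video frames."""
    return list(combination_vec)
```

The structural point is that the reconstruction unit sees both modalities at once, so the mouth region masked out of the background image can be synthesized consistently with the audio.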
System, method and computer program product for generating remote views in a virtual mobile device platform using efficient color space conversion and frame encoding
Embodiments disclosed herein provide systems, methods and computer readable media for generating remote views in a virtual mobile device platform. A virtual mobile device platform may be coupled to a physical mobile device over a network and generate frames of data for generating views on the physical device. These frames can be generated using an efficient display encoding pipeline on the virtual mobile device platform. Such efficiencies may include, for example, the synchronization of various processes or operations, the governing of various processing rates, the elimination of duplicative or redundant processing, the application of different encoding schemes, the efficient detection of duplicative or redundant data, or the combination of certain operations.
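One of the listed efficiencies, detecting and eliminating duplicative data, can be sketched by hashing each frame and skipping re-encoding when it matches the previous one. This is a hypothetical illustration, not the platform's actual pipeline.

```python
import hashlib

def encode_frames(frames):
    """Skip frames identical to the previous one; send a 'repeat' marker instead."""
    stream, last_digest = [], None
    for frame in frames:
        digest = hashlib.sha256(frame).hexdigest()
        if digest == last_digest:
            stream.append({"type": "repeat"})  # client reuses its last view
        else:
            stream.append({"type": "frame", "data": frame})
            last_digest = digest
    return stream
```

For a mostly static mobile UI, consecutive frames are often byte-identical, so a cheap digest comparison avoids a full encode pass.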
IMAGE SIGNAL PROCESSOR, METHOD OF OPERATING THE IMAGE SIGNAL PROCESSOR AND IMAGE PROCESSING SYSTEM INCLUDING THE IMAGE SIGNAL PROCESSOR
An image signal processor includes: a first encoder that receives first image data for pixel data in a graphic image and performs compression on the first image data to generate first compressed data; an alpha map scaler that extracts an alpha value α of the graphic image from the first image data and generates an alpha map for the alpha value α; and a second encoder that receives the alpha map and second image data for pixel data in a video image, performs computation on the second image data using the alpha map to generate multiply data, and performs compression on the multiply data to generate second compressed data.
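The alpha-map computation can be illustrated with 8-bit values: multiplying the video pixels by the graphic's alpha before compression pre-blends the two layers. The run-length "compressor" below is a toy stand-in for the real encoders; both functions are simplified, per-pixel assumptions.

```python
def premultiply(video_pixels, alpha_map):
    """Multiply each 8-bit video pixel by the corresponding alpha (0-255)."""
    return [p * a // 255 for p, a in zip(video_pixels, alpha_map)]

def rle_compress(data):
    """Toy compressor (run-length encoding) standing in for the real encoders."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([value, 1])
    return runs
```

Where the graphic is fully opaque (α = 0), the multiply data collapses to zero runs, which is exactly the case a downstream compressor exploits.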
Video Encoder With Motion Compensated Temporal Filtering
Various schemes pertaining to pre-encoding processing of a video stream with motion compensated temporal filtering (MCTF) are described. An apparatus determines a filtering interval for a received raw video stream having pictures in a temporal sequence. The apparatus selects from the pictures a plurality of target pictures based on the filtering interval, as well as a group of reference pictures for each target picture to perform pixel-based MCTF, which generates a corresponding filtered picture for each target picture. The apparatus subsequently transmits the filtered pictures as well as non-target pictures to an encoder for encoding the video stream. Subpictures of natural images and screen content images are separately processed by the apparatus.
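The target-picture selection and filtering steps can be sketched at pixel level as follows. Real MCTF applies motion compensation per block before filtering; this averaging stand-in keeps only the interval-based selection and the temporal-filter skeleton.

```python
def mctf(pictures, interval, radius=1):
    """Filter every `interval`-th picture by averaging it with its neighbours.

    `pictures` is a list of equal-length pixel lists; motion compensation is
    omitted, so this shows only the temporal-filtering structure.
    """
    filtered = [list(p) for p in pictures]
    for target in range(0, len(pictures), interval):
        # Reference group: the target plus up to `radius` pictures on each side.
        refs = pictures[max(0, target - radius):
                        min(len(pictures), target + radius + 1)]
        filtered[target] = [sum(px) / len(refs) for px in zip(*refs)]
    return filtered
```

Non-target pictures pass through unchanged, matching the abstract's note that both filtered and non-target pictures are forwarded to the encoder.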
Framework for video conferencing based on face restoration
A method and apparatus are included, comprising computer code configured to cause a processor or processors to perform: obtaining video data; detecting at least one face from at least one frame of the video data; determining a set of facial landmark features of the at least one face from the at least one frame of the video data; and coding the video data at least partly by a neural network based on the determined set of facial landmark features.
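The coding idea, sending full pixels for a key frame and only facial landmarks thereafter, can be sketched with a pluggable detector. The detector and the reconstruction network are assumptions here; the abstract does not specify them.

```python
def landmark_encode(frames, detect_landmarks):
    """Send the first frame in full; represent later frames by landmarks only."""
    if not frames:
        return []
    stream = [{"type": "key", "frame": frames[0]}]
    stream += [{"type": "landmarks", "points": detect_landmarks(f)}
               for f in frames[1:]]
    return stream
```

In a face-restoration framework the decoder would warp or regenerate the key-frame face from each landmark set, which is why a handful of points can replace a full frame.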
Video quality through compression-aware graphics layout
An embodiment provides a method, including: identifying a first type of media and a second type of media; determining a compression technique to be used to compress a combined media created from the first type of media and the second type of media; and aligning using a processor, based on the compression technique determined, the first type of media and the second type of media to create the combined media. Other aspects are described and claimed.
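One concrete reading of compression-aware alignment is snapping a graphics overlay to the codec's block grid, so sharp graphic edges do not straddle transform blocks of the natural video. The 8x8 block size below is an assumption (typical of DCT-based codecs), not something the abstract specifies.

```python
BLOCK = 8  # assumed block size, as in 8x8 DCT-based codecs

def align_to_block_grid(x, y, block=BLOCK):
    """Snap an overlay's top-left corner to the nearest block boundary."""
    snap = lambda v: round(v / block) * block
    return snap(x), snap(y)
```

Aligning the overlay this way keeps the high-frequency graphic content confined to whole blocks, which reduces ringing artifacts in the surrounding natural-image blocks after compression.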