Patent classifications
H04N19/162
Video encoding technique utilizing user guided information in cloud environment
The present disclosure relates to a computer-implemented method for processing video data. The method comprises receiving a user input corresponding to a first picture of the video data, generating, based on the user input, prediction information of the first picture with respect to a reference picture of the video data, and encoding the first picture using the prediction information.
A Method, An Apparatus and a Computer Program Product for Video Encoding and Video Decoding
The embodiments relate to a method including: generating a bitstream defining a presentation including omnidirectional visual media content; encoding into the bitstream a parameter to indicate viewport-control options for viewing the presentation, wherein the viewport-control options include options controllable by a receiving device and options not controllable by the receiving device; sending the bitstream to the receiving device; receiving one of the indicated viewport-control options from the receiving device as a response; and streaming the presentation to the receiving device. When the response includes an indication of a viewport control controllable by the receiving device, the method also includes receiving information on viewport definitions from the receiving device during streaming of the presentation and adapting the presentation accordingly. When the response includes an indication of a viewport control not controllable by the receiving device, the presentation is streamed to the receiving device according to the viewport control specified in the response.
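The negotiation described above can be sketched as a simple exchange: the sender offers a set of viewport-control options, each flagged as receiver-controllable or not, and the receiver's choice determines whether live viewport updates are expected during streaming. The option names and the dictionary structure below are illustrative assumptions, not the bitstream syntax from the abstract.

```python
def negotiate_viewport_control(offered, requested):
    """Sketch of the viewport-control negotiation: `offered` maps each
    option name to True if it is receiver-controllable; the receiver
    picks one option, and the sender then knows whether to expect
    viewport definitions during streaming (hypothetical structure)."""
    if requested not in offered:
        raise ValueError("receiver requested an option that was not offered")
    return {
        "option": requested,
        # True means the sender must accept viewport updates mid-stream.
        "expect_viewport_updates": offered[requested],
    }
```

For example, choosing a non-controllable "director" option tells the sender to stream a fixed, pre-defined viewport with no mid-stream updates.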
Event/object-of-interest centric timelapse video generation on camera device with the assistance of neural network input
An apparatus including an interface and a processor. The interface may be configured to receive pixel data generated by a capture device. The processor may be configured to generate video frames in response to the pixel data, perform computer vision operations on the video frames to detect objects, perform a classification of the objects detected based on characteristics of the objects, determine whether the classification of the objects corresponds to a user-defined event and generate encoded video frames from the video frames. The encoded video frames may be communicated to a cloud storage service. The encoded video frames may comprise a first sample of the video frames selected at a first rate when the user-defined event is not detected and a second sample of the video frames selected at a second rate while the user-defined event is detected. The second rate may be greater than the first rate.
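The dual-rate selection in this abstract (sparse sampling normally, dense sampling while a user-defined event is detected) can be sketched as below. Rates are expressed here as "keep every Nth frame", so a greater sampling rate corresponds to a smaller step; the function name and parameters are illustrative assumptions.

```python
def select_timelapse_frames(frames, event_flags, base_step=30, event_step=2):
    """Keep every `base_step`-th frame when no event is detected and
    every `event_step`-th frame while an event is detected, so the
    timelapse densifies around events (hypothetical helper)."""
    selected = []
    for i, (frame, event) in enumerate(zip(frames, event_flags)):
        step = event_step if event else base_step
        if i % step == 0:
            selected.append(frame)
    return selected
```

With `base_step=5` and `event_step=2`, frames during the event window are retained at a rate 2.5 times greater than outside it, matching the abstract's requirement that the second rate exceed the first.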
Sliced encoding and decoding for remote rendering
Disclosed herein are related to a device and a method of remotely rendering an image. In one approach, a device divides an image of an artificial reality space into a plurality of slices. In one approach, the device encodes a first slice of the plurality of slices. In one approach, the device encodes a portion of a second slice of the plurality of slices, while the device encodes a portion of the first slice. In one approach, the device transmits the encoded first slice of the plurality of slices to a head wearable display. In one approach, the device transmits the encoded second slice of the plurality of slices to the head wearable display, while the device transmits a portion of the encoded first slice to the head wearable display.
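The overlap described above (encoding slice N while transmitting already-encoded slice N-1) is a classic two-stage pipeline. A minimal timing sketch, assuming one slice per stage per time step, is shown below; the schedule structure is an illustrative model, not the device's actual scheduler.

```python
def pipeline_schedule(num_slices):
    """Return a list of (encoding, transmitting) slice indices per time
    step: at step t, slice t is being encoded while the previously
    encoded slice t-1 is being transmitted (None = stage idle)."""
    schedule = []
    for t in range(num_slices + 1):
        encoding = t if t < num_slices else None
        transmitting = t - 1 if t - 1 >= 0 else None
        schedule.append((encoding, transmitting))
    return schedule
```

For 3 slices the pipeline finishes in 4 steps instead of the 6 a strict encode-then-transmit sequence would need, which is the latency benefit the abstract targets for head-worn displays.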
IMAGE PROCESSING DEVICE AND IMAGE PROCESSING METHOD
The present disclosure relates to an image processing device and an image processing method for instantaneously displaying an image of a user's field of view.
An encoder encodes a celestial sphere image of a cube formed by images of multiple planes generated from omnidirectional images, the encoding being performed plane by plane at a high resolution, to generate a high-resolution encoded stream corresponding to each of the planes. The encoder further encodes, at a low resolution, the celestial sphere image to generate a low-resolution encoded stream. The present disclosure may be applied, for example, to image display systems that generate a celestial sphere image so as to display an image of the user's field of view derived therefrom.
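A client of such a system would typically fetch the high-resolution stream for the cube face covering the user's field of view plus the low-resolution whole-sphere stream as a fallback for sudden view changes. The sketch below illustrates that selection; the face labels and stream identifiers are assumptions for illustration only.

```python
CUBE_FACES = ("px", "nx", "py", "ny", "pz", "nz")  # +-x, +-y, +-z faces

def select_streams(view_face):
    """Pick the encoded streams to fetch for the current field of view:
    the per-face high-resolution stream being looked at, plus the
    low-resolution whole-sphere stream (hypothetical stream names)."""
    if view_face not in CUBE_FACES:
        raise ValueError(f"unknown cube face: {view_face}")
    return {"high": f"face_{view_face}_hi", "low": "sphere_lo"}
```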
ENCODER-SIDE SEARCH RANGES HAVING HORIZONTAL BIAS OR VERTICAL BIAS
Innovations in encoder-side search ranges having horizontal bias or vertical bias are described herein. For example, a video encoder determines a block vector (“BV”) for a current block of a picture, performs intra prediction for the current block using the BV, and encodes the BV. The BV indicates a displacement to a region within the picture. When determining the BV, the encoder checks a constraint that the region is within a BV search range having a horizontal bias or vertical bias. The encoder can select the BV search range from among multiple available BV search ranges, e.g., depending at least in part on BV values of one or more previous blocks, which can be tracked in a histogram data structure.
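The histogram-driven selection among biased search ranges can be sketched as follows: classify each previous block's BV as predominantly horizontal or vertical, tally the classes, and pick a wide-and-short or narrow-and-tall search range accordingly. The thresholds and range dimensions below are illustrative assumptions, not the encoder's actual heuristic.

```python
from collections import Counter

def choose_search_range(bv_history, width=64, height=64):
    """Track previous block vectors' dominant directions in a histogram
    and select a BV search range with horizontal bias (wide, short) or
    vertical bias (narrow, tall); returns the (x, y) extents to the
    left of / above the current block (illustrative sketch)."""
    hist = Counter(
        "horizontal" if abs(dx) >= abs(dy) else "vertical"
        for dx, dy in bv_history
    )
    if hist["horizontal"] >= hist["vertical"]:
        return (4 * width, height)   # horizontal bias: search far left
    return (width, 4 * height)       # vertical bias: search far above
```

Because BVs in screen content tend to cluster directionally (e.g., repeated text rows), biasing the range this way keeps the search area small while still covering the likely matches.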
SIGNALING OF VISUAL CONTENT
Concepts for signaling visual content are described. According to one aspect, a picture representing a projection of a volumetric representation onto a projection plane is provided in a bitstream along with a geometric representation of the projection plane. According to a further aspect, regions of a composed picture signaled in a bitstream are associated with a number of pictures of a number of projections of objects. According to a further aspect, regions of a composed picture signaled in a bitstream are associated with respective information types of a picture.