H04N21/816

Positional zero latency

Based on view tracking data, a viewer's view direction toward a three-dimensional (3D) scene depicted by a first video image is determined. The first video image was streamed in a video stream to a streaming client device before a first time point and rendered to the viewer with the streaming client device at the first time point. Based on the viewer's view direction, a target view portion is identified in a second video image to be streamed in the video stream to the streaming client device and rendered at a second time point subsequent to the first time point. The target view portion is encoded into the video stream at a higher target spatiotemporal resolution than that used to encode the remaining non-target view portions of the second video image.
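As a minimal sketch of the viewport-dependent idea above, the following assigns a higher encoding quality only to the tiles of the second image that fall within the viewer's field of view; the yaw-only view model, tile layout, and quality values are illustrative assumptions, not the patented encoder:

```python
def assign_tile_quality(view_yaw_deg, tile_yaws_deg, fov_deg=90.0,
                        high_q=1.0, low_q=0.25):
    """For each tile (identified by its center yaw), pick a high
    spatiotemporal quality if the tile lies inside the viewer's field of
    view, and a low quality otherwise."""
    qualities = []
    for yaw in tile_yaws_deg:
        # smallest angular difference on the 360-degree circle
        diff = abs((yaw - view_yaw_deg + 180.0) % 360.0 - 180.0)
        qualities.append(high_q if diff <= fov_deg / 2.0 else low_q)
    return qualities
```

With a 90° field of view, only tiles within 45° of the view direction receive the high quality.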

Methods for timed metadata priority rank signaling for point clouds
11704867 · 2023-07-18

Embodiments herein provide techniques for signaling priority information (e.g., priority ranking) and/or quality information in a timed metadata track associated with point cloud content. For example, embodiments include procedures for signaling priority information and/or quality information in a timed metadata track to support viewport-dependent distribution of point cloud content, e.g., based on the MPEG-defined International Organization for Standardization (ISO) Base Media File Format (ISOBMFF). In some embodiments, metadata samples of the timed metadata track may include priority information and/or quality information for a point cloud bounding box of a point cloud media presentation (e.g., for one or more point cloud objects in the point cloud bounding box). Other embodiments may be described and claimed.
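A rough sketch of what one such metadata sample might carry, and how a viewport-dependent client could use it; the field names are illustrative and are not ISOBMFF box syntax:

```python
from dataclasses import dataclass, field

@dataclass
class PriorityMetadataSample:
    """One timed-metadata sample carrying priority/quality per point-cloud
    bounding box (hypothetical structure, not the standardized format)."""
    presentation_time: float
    # bounding_box_id -> (priority_rank, quality); lower rank = more important
    entries: dict = field(default_factory=dict)

    def highest_priority_boxes(self, n):
        """Return the n bounding-box ids with the best (lowest) rank, e.g.
        the ones a viewport-dependent client would fetch first."""
        ranked = sorted(self.entries, key=lambda b: self.entries[b][0])
        return ranked[:n]
```

A client could evaluate the current sample each presentation period and request the highest-priority boxes at full quality.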

Methods and apparatus for re-timing and scaling input video tracks
11706374 · 2023-07-18

The techniques described herein relate to methods, apparatus, and computer-readable media configured to access multimedia data comprising a hierarchical track structure. The structure comprises at least a first track, at a first level of the hierarchy, comprising first media data, wherein the first media data comprises a first sequence of video media units, and a second track, at a second level of the hierarchy different from the first level, comprising metadata specifying a re-timing derivation operation. Output video media units are generated according to the second track by performing the re-timing derivation operation on the first sequence of video media units to modify its timing, removing one or more video media units associated with the re-timing derivation operation and/or shifting timing information of the first sequence of video media units.
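The re-timing derivation can be sketched as a pure function over timestamped media units; the representation of a unit as a (timestamp, payload) tuple is an assumption for brevity:

```python
def retime(units, remove_times, shift=0.0):
    """Re-timing derivation sketch: drop the video media units whose
    timestamps are listed in remove_times, then shift the timing
    information of the remaining units.
    units: list of (timestamp, payload) tuples in presentation order."""
    removed = set(remove_times)
    return [(t + shift, p) for (t, p) in units if t not in removed]
```

In the actual track structure the removal list and shift would come from the derivation metadata in the second track rather than function arguments.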

Systems and methods to insert supplemental content into presentations of two-dimensional video content
11704882 · 2023-07-18

Systems and methods for inserting supplemental content into presentations of two-dimensional video content are disclosed. Exemplary implementations may: obtain two-dimensional video content depicting a three-dimensional space; obtain supplemental content; obtain a model of the three-dimensional space defining one or more visible physical features within that space; determine the camera position of the two-dimensional video content; identify a presentation location within the two-dimensional video content; determine integration information; modify the two-dimensional video content to include the supplemental content at the identified presentation location in accordance with the integration information; and/or perform other operations.
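Mapping a 3D presentation location into the 2D frame, given the determined camera position, reduces to a camera projection. A minimal pinhole sketch, assuming an unrotated camera looking down +Z (the real system would use the full pose from the integration information):

```python
def project_point(point_3d, camera_pos, focal_px, image_center):
    """Project a 3D presentation location (scene coordinates) into pixel
    coordinates for a pinhole camera at camera_pos, axis-aligned along +Z.
    Returns None when the location is behind the camera (not visible)."""
    x = point_3d[0] - camera_pos[0]
    y = point_3d[1] - camera_pos[1]
    z = point_3d[2] - camera_pos[2]
    if z <= 0:
        return None
    u = image_center[0] + focal_px * x / z
    v = image_center[1] + focal_px * y / z
    return (u, v)
```

The supplemental content would then be composited at the returned pixel location, scaled per the integration information.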

Method and program for producing and providing reactive video
11706503 · 2023-07-18

The inventive concept relates to a method for producing a multi-reactive video and providing a multi-reactive video service, and a program using the same. A user's reaction to a video can be grasped by recording the manipulation details for that user's multi-reactive video. For example, a user's object of interest and degree of interest can be grasped from the number of touch manipulations performed on the multi-reactive video, the frames in which touch manipulations were performed, the objects in those frames, and the like.
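The interest signal described above amounts to aggregating the recorded touch log. A small sketch, assuming the log is a list of (frame, object) pairs (a simplification of the recorded manipulation details):

```python
from collections import Counter

def interest_profile(touch_log):
    """touch_log: list of (frame_index, object_id) pairs recorded while the
    user manipulates the multi-reactive video. Returns objects ordered by
    touch count, a rough object-of-interest / degree-of-interest signal."""
    counts = Counter(obj for _, obj in touch_log)
    return counts.most_common()
```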

HEAD-MOUNTABLE DISPLAY SYSTEMS AND METHODS

A system includes one or more sensors to detect one or more properties of a current user wearing a head-mountable display (HMD); control circuitry to select a user profile from a plurality of stored user profiles in dependence upon one or more of the detected properties, where each stored user profile is associated with a respective user and includes data indicative of one or more reference properties for that user, and where the selected user profile includes data indicative of at least one reference property that is substantially the same as at least one of the detected properties; and processing circuitry to generate at least one of audio and video content for output by the HMD in dependence upon the selected user profile and in response to its selection.
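The profile-selection step can be sketched as matching detected properties against each profile's reference properties; the property names, the 5% "substantially the same" tolerance, and the best-match tie-breaking are illustrative assumptions:

```python
def select_profile(detected, profiles, tolerance=0.05):
    """Pick the stored profile whose reference properties best match the
    detected ones. detected: dict of property -> value;
    profiles: dict of user -> dict of reference properties."""
    best_user, best_matches = None, 0
    for user, refs in profiles.items():
        matches = sum(
            1 for k, v in detected.items()
            if k in refs and abs(refs[k] - v) <= tolerance * max(abs(v), 1e-9)
        )
        if matches > best_matches:
            best_user, best_matches = user, matches
    return best_user  # None if no reference property is close enough
```

Content generation would then be configured from the returned profile's stored preferences.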

DIFFERENT ATLAS PACKINGS FOR VOLUMETRIC VIDEO
20230224501 · 2023-07-13

Methods, devices and streams are disclosed to encode and decode a scene (such as a point cloud) in the context of patch-based transmission of volumetric video content. Attributes of the scene's points are projected onto patches. Every point has a geometry attribute; for other attributes, such as transparency or displacement, some points may have no value. According to the present principles, each attribute is encoded in a different atlas with its own layout. This allows pixel rate to be saved in the renderer's memory.
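The per-attribute atlas idea can be sketched by grouping patch data per attribute, so an atlas only contains patches for which that attribute actually has values; the patch representation as a dict is an assumption for illustration:

```python
def build_atlases(patches):
    """Each patch is a dict such as {'id': ..., 'geometry': ..., 'transparency': ...}.
    Geometry exists for every patch; optional attributes may be None.
    One atlas (here just an ordered list, standing in for a packed layout)
    per attribute, so absent optional attributes consume no pixel rate."""
    atlases = {}
    for patch in patches:
        for attr, data in patch.items():
            if attr == 'id' or data is None:
                continue
            atlases.setdefault(attr, []).append((patch['id'], data))
    return atlases
```

A real packer would additionally compute a 2D layout per atlas; the point here is only that the layouts are independent per attribute.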

Method and device for latency reduction of an image processing pipeline

In some implementations, a method includes: determining a complexity value for first image data associated with a physical environment that corresponds to a first time period; determining an estimated composite setup time based on the complexity value for the first image data and the virtual content to be composited with the first image data; and, in accordance with a determination that the estimated composite setup time exceeds a threshold time: forgoing rendering the virtual content from the perspective that corresponds to the camera pose of the device relative to the physical environment during the first time period; and compositing a previous render of the virtual content for a previous time period with the first image data to generate the graphical environment for the first time period.
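The branch above can be sketched as a per-frame decision; the additive setup-time estimate and the function shape are illustrative assumptions:

```python
def composite_frame(complexity, virtual_cost, threshold,
                    render_fn, previous_render, image_data):
    """If the estimated composite setup time (modeled here simply as
    complexity + virtual_cost) exceeds the threshold, forgo rendering for
    this period and reuse the previous render; otherwise render fresh
    virtual content. Returns the composited (image, virtual) pair."""
    estimated_setup = complexity + virtual_cost
    if estimated_setup > threshold:
        virtual = previous_render   # reuse last period's render
    else:
        virtual = render_fn()       # render for the current camera pose
    return (image_data, virtual)
```

Reusing the previous render trades a small amount of pose accuracy for a bounded per-frame latency.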

LIVE VIDEO BROADCAST METHOD, LIVE BROADCAST DEVICE AND STORAGE MEDIUM
20230224511 · 2023-07-13

The present disclosure provides a live video broadcast method performed by a terminal device. The method includes: receiving a live broadcast command to live broadcast a video game in real time, and creating a video buffer for the video game based on the live broadcast command; while playing the video game in real time, extracting video picture frames from the video game and storing them in the video buffer; collecting voice data using a microphone of the terminal device, and synchronously synthesizing the voice data and the video picture frames stored in the video buffer into a video streaming media file; and uploading the video streaming media file to a live broadcast server for live broadcasting on other mobile devices.
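The buffer-and-synthesize steps can be sketched as follows; the class and its timestamp-based pairing of audio with frames are illustrative, not the claimed implementation:

```python
from collections import deque

class LiveBuffer:
    """Minimal sketch: game frames go into a video buffer; microphone
    chunks are then synchronously paired with the buffered frames into
    timestamp-ordered segments ready for upload."""
    def __init__(self):
        self.video = deque()

    def push_frame(self, ts, frame):
        self.video.append((ts, frame))

    def synthesize(self, audio_chunks):
        """audio_chunks: list of (ts, samples). Pair each buffered frame
        with the latest audio chunk at or before its timestamp."""
        segments = []
        while self.video:
            ts, frame = self.video.popleft()
            audio = max((a for a in audio_chunks if a[0] <= ts),
                        key=lambda a: a[0], default=(None, None))
            segments.append((ts, frame, audio[1]))
        return segments
```

In practice the synthesized segments would be muxed into a streaming container before upload to the live broadcast server.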

INFORMATION PROCESSING DEVICE AND METHOD

A scene description file describing a scene of 3D object content is generated. In the scene description file, timed metadata identification information, indicating that the metadata of an associated external file changes in the time direction, is stored in an MPEG_media extension, and timed metadata access information, associating a camera object with the metadata, is stored in the camera object. Furthermore, the timed metadata that changes in the time direction is acquired on the basis of the timed metadata identification information and the timed metadata access information stored in the scene description file, and a display image of the 3D object content is generated on the basis of the acquired timed metadata. The present disclosure is applicable to, for example, an information processing device, an information processing method, or the like.
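A sketch of the access step: resolving, for each camera, the external timed-metadata file it references. The JSON shape below is a glTF-style illustration, and the `timed_metadata_access` key is a hypothetical name for the access information described above:

```python
import json

def find_timed_camera_metadata(scene_description_json):
    """Walk a scene description (glTF-style JSON, structure illustrative)
    and return, per camera name, the URI of the external timed-metadata
    file it is associated with via the MPEG_media-style media list."""
    doc = json.loads(scene_description_json)
    media = doc.get("extensions", {}).get("MPEG_media", {}).get("media", [])
    result = {}
    for cam in doc.get("cameras", []):
        access = cam.get("extensions", {}).get("timed_metadata_access")
        if access is not None and 0 <= access < len(media):
            result[cam["name"]] = media[access]["uri"]
    return result
```

A renderer would then fetch each URI and update the camera from the acquired timed metadata as it changes over time.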