H04N21/23412

Systems and methods to insert supplemental content into presentations of two-dimensional video content
11704882 · 2023-07-18 · ·

Systems and methods for inserting supplemental content into presentations of two-dimensional video content are disclosed. Exemplary implementations may: obtain two-dimensional video content depicting a three-dimensional space; obtain supplemental content; obtain a model of the three-dimensional space defining one or more visible physical features within the three-dimensional space; determine the camera position of the two-dimensional video content; identify a presentation location within the two-dimensional video content; determine integration information; modify the two-dimensional video content to include the supplemental content at the identified presentation location in accordance with the integration information; and/or perform other operations.

Data model generation using generative adversarial networks

Methods for generating data models using a generative adversarial network can begin by receiving a data model generation request by a model optimizer from an interface. The model optimizer can provision computing resources with a data model. As a further step, a synthetic dataset for training the data model can be generated using a generative network of a generative adversarial network, the generative network trained to generate output data differing at least a predetermined amount from a reference dataset according to a similarity metric. The computing resources can train the data model using the synthetic dataset. The model optimizer can evaluate performance criteria of the data model and, based on the evaluation of the performance criteria of the data model, store the data model and metadata of the data model in a model storage. The data model can then be used to process production data.

INFORMATION PROCESSING DEVICE AND METHOD

A scene descriptive file describing a scene of 3D object content is generated. In the scene descriptive file, timed metadata identification information, which indicates that metadata of an associated external file changes in a time direction, is stored in an MPEG_media extension, and timed metadata access information, which associates a camera object with the metadata, is stored in the camera object. Furthermore, timed metadata that changes in the time direction is acquired on the basis of the timed metadata identification information and the timed metadata access information stored in the scene descriptive file, and a display image of the 3D object content is generated on the basis of the acquired timed metadata. The present disclosure is applicable to, for example, an information processing device, an information processing method, or the like.

SYSTEM AND METHOD OF SERVER-SIDE DYNAMIC ADAPTATION FOR SPLIT RENDERING
20230224512 · 2023-07-13 · ·

The techniques described herein relate to methods, apparatus, and computer readable media configured to provide video data for immersive media, implemented by a server in communication with a client device. A request to access a stream of media data associated with immersive content is received from the client device at a point in time when the client is first accessing the stream of media data for the immersive content. In response to the request from the client, the server transmits a response indicating whether it has rendered at least part of the stream of media data. The server may also determine, based on the request from the client, whether to render at least part of the stream of media data for delivery to the client device.

Method and apparatus for processing video

A method and an apparatus for processing a video are provided. The method may include: separating, in response to acquiring a target video stream, a foreground image and a background image from a video frame in the target video stream; adding to-be-displayed content at a target display position in the background image to obtain a processed background image; and combining the foreground image and the processed background image to obtain a target video frame. The present disclosure may directly render the to-be-displayed content in the background, so that the content displayed in the background does not block a body in the foreground, such as a person.
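
The three steps named in this abstract (separate, add content to the background, recombine) can be illustrated with a minimal sketch. It assumes a frame is a small grid of grayscale integers and that a foreground mask is already available; the abstract does not say how the mask is obtained. All function names here (`split_layers`, `add_content`, `composite`, `process_frame`) are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple

Frame = List[List[int]]   # assumption: one grayscale value per pixel
Mask = List[List[bool]]   # True where the pixel belongs to the foreground


def split_layers(frame: Frame, fg_mask: Mask, fill: int = 0) -> Tuple[Frame, Frame]:
    """Separate a frame into a foreground image and a background image."""
    fg = [[p if m else fill for p, m in zip(row, mrow)]
          for row, mrow in zip(frame, fg_mask)]
    bg = [[fill if m else p for p, m in zip(row, mrow)]
          for row, mrow in zip(frame, fg_mask)]
    return fg, bg


def add_content(background: Frame, content: Frame, top: int, left: int) -> Frame:
    """Draw the content into a copy of the background at (top, left)."""
    out = [row[:] for row in background]
    for dy, content_row in enumerate(content):
        for dx, pixel in enumerate(content_row):
            y, x = top + dy, left + dx
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = pixel
    return out


def composite(foreground: Frame, processed_bg: Frame, fg_mask: Mask) -> Frame:
    """Combine layers; foreground pixels always win over the background."""
    return [[f if m else b for f, b, m in zip(frow, brow, mrow)]
            for frow, brow, mrow in zip(foreground, processed_bg, fg_mask)]


def process_frame(frame: Frame, fg_mask: Mask, content: Frame,
                  top: int, left: int) -> Frame:
    fg, bg = split_layers(frame, fg_mask)
    return composite(fg, add_content(bg, content, top, left), fg_mask)
```

Because the content is written only into the background layer, any pixel marked as foreground keeps its original value after compositing, which is the non-occlusion property the abstract describes.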

A METHOD AND APPARATUS FOR ENCODING AND DECODING VOLUMETRIC VIDEO

Methods, devices and a stream are disclosed to encode and decode volumetric content. At the encoding, the space of the volumetric content is divided into distinct sectors according to at least two different sectorizations. One atlas is generated for each sectorization, or a single atlas is generated encoding all the sectorizations. At the decoding, a sectorization is selected according to the current direction and field of view, according to the user's gaze navigation and according to a prediction of the upcoming pose of the virtual camera controlled by the user. Sectors are selected according to the selected sectorization and the current direction and field of view, and only patches encoded in regions of the atlas associated with these sectors are accessed to generate the viewport image representative of the content seen from the current point of view.
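
The decoding-side selection can be sketched as follows, under strong simplifying assumptions: sectors are horizontal angular ranges given as (start, width) in degrees, the viewport is a centre direction plus a field of view, and the sectorization is chosen by a hypothetical rule that prefers the one whose overlapping sectors cover the fewest degrees. The abstract also mentions gaze navigation and pose prediction, which this sketch omits. The names `arcs_overlap`, `select_sectors` and `choose_sectorization` are illustrative only.

```python
from typing import Dict, List, Tuple

Sector = Tuple[float, float]  # assumption: (start_deg, width_deg), width < 360


def arcs_overlap(a_start: float, a_width: float,
                 b_start: float, b_width: float) -> bool:
    """True if two arcs on a circle share any angle (handles wrap-around)."""
    return ((b_start - a_start) % 360 < a_width
            or (a_start - b_start) % 360 < b_width)


def select_sectors(sectors: List[Sector], center_deg: float,
                   fov_deg: float) -> List[int]:
    """Indices of the sectors that intersect the current viewport."""
    view_start = (center_deg - fov_deg / 2) % 360
    return [i for i, (start, width) in enumerate(sectors)
            if arcs_overlap(start, width, view_start, fov_deg)]


def choose_sectorization(sectorizations: Dict[str, List[Sector]],
                         center_deg: float,
                         fov_deg: float) -> Tuple[str, List[int]]:
    """Hypothetical rule: pick the sectorization with the least selected coverage."""
    def cost(name: str) -> float:
        chosen = select_sectors(sectorizations[name], center_deg, fov_deg)
        return sum(sectorizations[name][i][1] for i in chosen)

    best = min(sectorizations, key=cost)
    return best, select_sectors(sectorizations[best], center_deg, fov_deg)
```

With two four-sector layouts offset by 45 degrees, a viewport that straddles a sector boundary in one layout can fall inside a single sector of the other, so fewer atlas regions would need to be accessed.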

Separation of graphics from natural video in streaming video content
11546617 · 2023-01-03 · ·

Aspects of the subject disclosure may include, for example, a method that includes obtaining, by a processing system including a processor, video frames over a network; the processing system uses a machine learning algorithm to identify in each frame a first region comprising a natural image and a second region comprising a synthetic graphic image. The processing system separates the natural image from the synthetic graphic image to generate a natural video and a graphics video, encodes the natural video, and processes the graphics video to generate instructions for rendering graphic images at a client system. The client system performs a decoding procedure for the encoded video, a rendering procedure for client-side graphics in accordance with the instructions, and a compositing procedure to obtain a presentable video stream including the natural image and a client-side graphic corresponding to the synthetic graphic image. Other embodiments are disclosed.

Server for providing a graphical user interface to a client and a client

The invention relates to a server for providing a graphical user interface to a client over a communication network. The graphical user interface comprises a graphical user interface element, the graphical user interface element being formed by an element shape and an element text, the element shape being represented by element shape data, the element text being represented by element text data. The server comprises an encoder configured to encode the element shape data into video data, a detector configured to detect a change associated with the graphical user interface element within the graphical user interface, and a communication interface configured to separately transmit the video data and the element text data over the communication network, the element text data being transmitted upon detection of the change associated with the graphical user interface element for providing the graphical user interface to the client.
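
A minimal sketch of the separate transmission described here, assuming a message is a (channel, payload) pair and that "change detection" means comparing against the last value sent. The `encode_shape` function is a stand-in that only tags the bytes; it is not a real video encoder, and `Element` and `GuiServer` are hypothetical names rather than anything defined by the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

Message = Tuple[str, Union[bytes, str]]  # assumption: (channel, payload)


@dataclass(frozen=True)
class Element:
    element_id: str
    shape: bytes  # element shape data
    text: str     # element text data


def encode_shape(shape: bytes) -> bytes:
    """Placeholder for a video encoder (assumption): just tags the payload."""
    return b"VID:" + shape


class GuiServer:
    def __init__(self) -> None:
        self._sent_shape: Dict[str, bytes] = {}
        self._sent_text: Dict[str, str] = {}

    def update(self, element: Element) -> List[Message]:
        """Return the messages to transmit for this element state."""
        messages: List[Message] = []
        # Shape travels as video data on its own channel.
        if self._sent_shape.get(element.element_id) != element.shape:
            messages.append(("video", encode_shape(element.shape)))
            self._sent_shape[element.element_id] = element.shape
        # Text travels separately, only when a change is detected.
        if self._sent_text.get(element.element_id) != element.text:
            messages.append(("text", element.text))
            self._sent_text[element.element_id] = element.text
        return messages
```

Keeping the text out of the encoded video means that a label change results in a small text message and no re-encoding of the shape in this sketch.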

INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING PROGRAM
20220408153 · 2022-12-22 · ·

Provided is an information processing device configured to determine whether to perform modification to a detail of part of not-yet-viewed-and-listened-to content for a user, on the basis of a reaction of the user to viewed-and-listened-to content.

Apparatus, a method and a computer program for viewing volume signalling for volumetric video

There are disclosed various methods, apparatuses and computer program products for viewing volume signalling of volumetric video. In accordance with an embodiment of a method, information on a viewing volume appropriate for viewing a volumetric video is obtained. The viewing volume is examined to determine what geometrical shapes describe the viewing volume. One or more geometrical shapes determined for describing the viewing volume are selected, and signalling information for the selected one or more geometrical shapes is constructed.