H04N21/8146

DIARISATION AUGMENTED REALITY AIDE

An image of a real-world environment including one or more users is received from an image capture device. A mask status of a first user is determined by a processor based on the image. A stream of audio including speech from the one or more users is captured from one or more audio transceivers. A first user speech is identified from the stream of audio by the processor. The stream of audio is parsed, by the processor and based on the first user speech and based on an audio processing technique, to create a first user speech element. An augmented view that includes the first user speech element is generated, for a wearable computing device, based on the first user speech and based on the mask status.
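The abstract above leaves the rendering decision unspecified; the following sketch illustrates one plausible way a speech element and a mask status could be combined into a caption directive for the augmented view. All names, the `SpeechElement` structure, and the prominence rule are illustrative assumptions, not details from the patent.

```python
# Hypothetical sketch: pairing a diarised speech element with a user's
# detected mask status to decide how a caption is rendered in the
# augmented view. Names and the prominence rule are illustrative only.

from dataclasses import dataclass


@dataclass
class SpeechElement:
    user_id: str
    text: str


def build_augmented_caption(element: SpeechElement, mask_on: bool) -> dict:
    """Return a render directive for the wearable display.

    When the speaker is masked, lip-reading cues are unavailable, so the
    caption is given higher visual prominence in this sketch.
    """
    return {
        "user_id": element.user_id,
        "text": element.text,
        "prominence": "high" if mask_on else "normal",
    }


caption = build_augmented_caption(SpeechElement("user-1", "hello"), mask_on=True)
```

The mask status acts only as a rendering hint here; a real system would presumably also feed it back into the diarisation pipeline.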

Multi-resolution graphics

Provided herein is technology for displaying, repositioning, and/or formatting graphics on a display. The technology includes receiving a graphics stream in a first playout format that includes a first display resolution and a first display layout. The technology also includes determining a second playout format that includes a second display resolution and a second display layout. The technology further determines an area of importance within the first display layout, given the first display resolution, the second display resolution, and the second display layout. A preferred position within the second display layout is determined such that the preferred position is a location in the second display layout that is relatively similar to the location of the area of importance in the first display layout. The first playout format is converted into the second playout format using the area of importance and the preferred position. Finally, the graphics stream is displayed in the second playout format.
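One simple reading of "relatively similar location" is to preserve the normalized center of the area of importance across the two layouts. The sketch below shows that interpretation; it is an assumption for illustration, not the patented conversion.

```python
# Illustrative sketch (not the patented algorithm): keep an "area of
# importance" at a relatively similar location when converting between
# two playout formats by preserving its normalized center.

def preferred_position(area, src_res, dst_res):
    """area: (x, y, w, h) in the first layout; *_res: (width, height)."""
    x, y, w, h = area
    sw, sh = src_res
    dw, dh = dst_res
    # Normalized center of the area of importance in the first layout.
    cx, cy = (x + w / 2) / sw, (y + h / 2) / sh
    # Same relative location in the second layout.
    return round(cx * dw), round(cy * dh)


# A centered area stays centered after conversion from 1920x1080 to 1280x720.
pos = preferred_position((860, 440, 200, 200), (1920, 1080), (1280, 720))
```

A production converter would additionally handle aspect-ratio changes and clamp the position so the graphic remains fully on screen.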

Video file playing method and apparatus, and storage medium

This application discloses a video file playing method and apparatus, and a storage medium. The video file playing method includes playing an animation file frame by frame according to a playback time of a video file, the video file comprising at least one displayed object, and the animation file comprising an animation element generated according to the displayed object; determining click/tap position information of a screen clicking/tapping event in response to the screen clicking/tapping event being detected; determining an animation element display area corresponding to the click/tap position information of the screen clicking/tapping event in the animation file according to the click/tap position information; determining, according to the corresponding animation element display area, an animation element triggered by the screen clicking/tapping event; and determining an interactive operation corresponding to the triggered animation element and performing the interactive operation.
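The step of mapping a tap position to an animation element display area is essentially a hit test. A minimal sketch, assuming rectangular display areas (the patent does not specify the area geometry):

```python
# Minimal hit-test sketch for mapping a tap position to an animation
# element display area; the rectangular area model is an assumption.

def hit_test(tap, elements):
    """tap: (x, y); elements: list of (element_id, (x, y, w, h)).

    Returns the id of the first element whose display area contains the
    tap position, or None if the tap misses every element.
    """
    tx, ty = tap
    for element_id, (x, y, w, h) in elements:
        if x <= tx <= x + w and y <= ty <= y + h:
            return element_id
    return None


elements = [("like_button", (10, 10, 50, 50)), ("avatar", (100, 10, 40, 40))]
hit = hit_test((30, 30), elements)  # falls inside like_button's area
```

The returned element id would then index into a table of interactive operations, the final step the abstract describes.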

METHOD AND APPARATUS FOR DETERMINING OBJECT ADDING MODE, ELECTRONIC DEVICE AND MEDIUM
20220394326 · 2022-12-08 ·

Embodiments of the present application disclose a method and apparatus for determining an object adding mode, an electronic device and a medium. A specific implementation of the method includes: obtaining a target video and a to-be-added object set to be added to the target video; determining time periods corresponding to storyboards of the target video; determining a presentation period of a to-be-added object in the to-be-added object set in the target video based on the time periods corresponding to the storyboards; and generating object adding indication information based on the determined presentation period, where the object adding indication information includes time prompt information used to indicate the presentation period of the to-be-added object in the to-be-added object set in the target video. According to this implementation, automatic selection of the time for adding the to-be-added object to the target video is realized.
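How presentation periods are derived from the storyboard time periods is left open by the abstract; one trivially simple policy is to cycle objects through the detected shots. The sketch below uses that assumed policy purely for illustration.

```python
# Hypothetical illustration: assign each to-be-added object a
# presentation period drawn from the storyboard (shot) time periods,
# cycling through shots when there are more objects than shots.
# The round-robin policy is an assumption, not the patented method.

from itertools import cycle


def assign_presentation_periods(objects, shot_periods):
    """objects: list of object ids; shot_periods: list of (start_s, end_s)."""
    shots = cycle(shot_periods)
    return {obj: next(shots) for obj in objects}


periods = assign_presentation_periods(
    ["sticker_a", "sticker_b", "sticker_c"],
    [(0.0, 4.2), (4.2, 9.8)],
)
```

The resulting mapping is exactly the "time prompt information" role the abstract describes: each object id is paired with the period in which it should appear.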

Extended reality recorder

Implementations of the subject technology provide systems and methods for recording an extended reality experience in a way that allows the experience to be played back at a later time from a different viewpoint or perspective. This allows computer-generated content that was rendered for display to a user during the recording, to be re-rendered during playback at the correct time and location in the recording, but from a different perspective. In order to facilitate this type of viewer-centric playback, the recording includes a computer-generated content track that references resources for re-rendering the computer-generated content at each point in time in the recording.

REAL-TIME TEMPORALLY CONSISTENT OBJECT SEGMENTED STYLE TRANSFER IN MEDIA AND GAMING

One embodiment provides a method comprising, at a runtime library executed by a processor of a data processing system, receiving an input frame having objects to be stylized via a style transfer network associated with the runtime library, wherein the style transfer network is a neural network model trained to apply one or more visual styles to an input frame, performing instance segmentation on the input frame to generate one or more instance masks to identify one or more objects to be stylized, generating one or more stylized frames for each style to transfer to the input frame, and merging, via the one or more instance masks, stylized objects from one or more stylized frames with un-stylized content from the input frame to generate an output frame with per-instance stylization.
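The final merge step, compositing stylized pixels over un-stylized content via instance masks, can be shown in miniature. The sketch models frames as 2-D lists of pixel values; a real pipeline would operate on tensors on the GPU, and the data layout here is an assumption.

```python
# Simplified sketch of the merge step: composite stylized pixels over the
# un-stylized input frame using per-instance binary masks. Frames are
# modeled as 2-D lists of scalar pixel values for illustration only.

def merge_stylized(input_frame, stylized_frames, instance_masks):
    """stylized_frames[i] is the frame rendered with style i;
    instance_masks[i][y][x] is 1 where instance i should take style i."""
    h, w = len(input_frame), len(input_frame[0])
    out = [row[:] for row in input_frame]  # start from un-stylized content
    for styled, mask in zip(stylized_frames, instance_masks):
        for y in range(h):
            for x in range(w):
                if mask[y][x]:
                    out[y][x] = styled[y][x]
    return out


frame = [[0, 0], [0, 0]]
styled_frames = [[[9, 9], [9, 9]]]
masks = [[[1, 0], [0, 0]]]
merged = merge_stylized(frame, styled_frames, masks)  # only the masked pixel changes
```

Temporal consistency, the other claim in the title, would come from how the masks and styles are tracked across frames, which this snippet does not attempt to show.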

Video distribution system, information processing method, and computer program

A video distribution system according to this disclosure is a video distribution system that distributes a video including an animation of a character object generated based on a movement of a distribution user, and comprises one or a plurality of computer processors. The one or plurality of computer processors provide (i) a distribution portion, (ii) a display device, and (iii) a determination portion. If the determination portion determines that a first object is being displayed within a predetermined distance from a second object, the display device changes the first object and the second object to a third object and displays the third object.
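The determination portion's proximity test can be reduced to a distance check. A minimal sketch, with the Euclidean metric and the merge-to-a-third-object behavior assumed for illustration:

```python
# Illustrative check (details assumed): when a first object is displayed
# within a predetermined distance of a second object, replace the pair
# with a third object.

import math


def maybe_merge(pos_a, pos_b, threshold, third_object="combined"):
    """Return the third object if the two positions are close enough,
    otherwise None (no merge)."""
    if math.dist(pos_a, pos_b) <= threshold:
        return third_object
    return None


result = maybe_merge((0, 0), (3, 4), threshold=5)  # distance is exactly 5
```

Whether the threshold is inclusive, and in what coordinate space the distance is measured, are choices the abstract does not pin down.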

APPARATUS, A METHOD AND A COMPUTER PROGRAM FOR OMNIDIRECTIONAL VIDEO

There are disclosed various methods, apparatuses and computer program products for video encoding and decoding. In some embodiments the method for video encoding comprises obtaining compressed volumetric video data representing a three-dimensional scene or object (71); capsulating the compressed volumetric video data into a data structure (72); obtaining data of a two-dimensional projection of at least a part of the three-dimensional scene as seen from a certain viewport (73); and including the data of the two-dimensional projection into the data structure (74).

CONSISTENT GENERATION OF MEDIA ELEMENTS ACROSS MEDIA

An example method performed by a processing system includes retrieving a digital model of a media element from a database storing a plurality of media elements, wherein the media element is to be inserted into a scene of an audiovisual media, rendering the media element in the scene of the audiovisual media, based on the digital model of the media element and on metadata associated with the digital model to produce a rendered media element, wherein the metadata describes a characteristic of the media element and a limit on the characteristic, and inserting the rendered media element into the scene of the audiovisual media.
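The "limit on the characteristic" carried in the metadata suggests a clamping step during rendering. A minimal sketch under that assumption, with a scale characteristic chosen purely as an example:

```python
# Sketch under assumptions: the metadata limit is modeled as a numeric
# upper bound on a rendering characteristic (e.g. scale), and rendering
# clamps the requested value to that limit before inserting the element.

def apply_characteristic(requested, limit):
    """Clamp a requested characteristic value to the metadata-defined limit."""
    return min(requested, limit)


# Metadata says the media element may be scaled at most 1.5x.
scale = apply_characteristic(requested=2.0, limit=1.5)
```

Keeping the limit with the digital model is what makes the element's appearance consistent across every piece of media it is inserted into.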

THREE-DIMENSIONAL CONTENT PROCESSING METHODS AND APPARATUS
20220366611 · 2022-11-17 ·

Methods, systems, and apparatus for processing of three-dimensional visual content are described. One example method of processing three-dimensional content includes parsing a level of detail (LoD) information of a bitstream containing three-dimensional (3D) content that is represented as one geometry sub-bitstream and one or more attribute sub-bitstreams; generating, based on the LoD information, decoded information by decoding at least a portion of the geometry sub-bitstream and the one or more attribute sub-bitstreams corresponding to a desired level of detail; and reconstructing, using the decoded information, a three-dimensional scene corresponding at least to the desired level of detail. The bitstream conforms to a format organized according to multiple levels of details of the 3D content.
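The key property of the described format is that decoding can stop once the desired LoD is reached. The sketch below models a sub-bitstream as a mapping from LoD index to encoded payload; the actual bitstream syntax is defined by the patent's format, so this data layout is an assumption for illustration.

```python
# Conceptual sketch (not the actual bitstream format): a sub-bitstream
# organized by level of detail, where decoding covers only the levels
# needed for the desired LoD of the geometry or attribute data.

def decode_to_lod(sub_bitstream, desired_lod):
    """sub_bitstream: mapping of LoD index -> encoded payload.

    Decodes (here: selects) only the levels needed for the desired
    level of detail, in coarse-to-fine order.
    """
    return [payload for lod, payload in sorted(sub_bitstream.items())
            if lod <= desired_lod]


geometry = {0: "geo-coarse", 1: "geo-mid", 2: "geo-fine"}
decoded = decode_to_lod(geometry, desired_lod=1)  # skips the finest level
```

The same selection would run once for the geometry sub-bitstream and once per attribute sub-bitstream before scene reconstruction.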