Patent classifications
H04N21/816
COMMUNICATION TERMINAL, IMAGE COMMUNICATION SYSTEM, AND METHOD OF DISPLAYING IMAGE
A communication terminal including circuitry to: receive video data including a captured image, from a communication management server that manages the captured image of video data distributed from another communication terminal different from the communication terminal; determine whether any predetermined-area information indicating a predetermined area of the captured image to be displayed during a reproduction time of the video data is stored in a memory; and control a display to display an image representing the predetermined area indicated by the predetermined-area information, based on a determination that the predetermined-area information is stored in the memory.
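The display-control logic of this abstract can be sketched as follows. This is a hedged illustration only: the `PredeterminedArea` class, the time-keyed `memory` dict, and the function name are assumptions, not structures from the patent.

```python
# Hypothetical sketch of the terminal's display control: if predetermined-area
# information is stored for the current reproduction time, display that area.
from dataclasses import dataclass

@dataclass
class PredeterminedArea:
    """Pan/tilt angles and zoom defining a region of a captured (e.g. spherical) image."""
    pan: float
    tilt: float
    zoom: float

def select_display_area(memory: dict, reproduction_time: float) -> PredeterminedArea:
    """Return the stored predetermined area for this playback time, if any."""
    area = memory.get(round(reproduction_time, 1))  # memory keyed by reproduction time
    if area is not None:
        return area                                 # stored predetermined area found
    return PredeterminedArea(0.0, 0.0, 1.0)         # otherwise fall back to a default view

memory = {3.0: PredeterminedArea(pan=45.0, tilt=10.0, zoom=2.0)}
```

A lookup at `reproduction_time=3.0` returns the stored area; any other time falls back to the default view.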
Cloud-based Rendering of Interactive Augmented/Virtual Reality Experiences
Systems and methods for cloud-based rendering of interactive augmented reality (AR) and/or virtual reality (VR) experiences. A client device may initiate execution of a content application on a server and provide information associated with the content application to the server. The client device may initialize, while awaiting a notification from the server, local systems associated with the content application and, upon receipt of the notification, provide, to the server, information associated with the local systems. Further, the client device may receive, from the server, data associated with the content application and render an AR/VR scene based on the received data. The data may be based, at least in part, on the information associated with the local systems. The providing and receiving may be performed periodically, e.g., at a rate to sustain a comfortable viewing environment of the AR/VR scene by a user of the client device.
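The client-side sequence above (start the server app, initialize local systems while waiting, then loop on provide/receive) can be sketched as below. The `FakeServer` class and its message shapes are illustrative stand-ins, not an actual cloud-rendering API.

```python
# Hedged sketch of the client/server handshake and periodic exchange loop.
import threading
import time

class FakeServer:
    """Stand-in for the remote rendering server; not part of the patent."""
    def __init__(self):
        self.ready = threading.Event()
    def start_app(self, app_info):
        threading.Thread(target=self._boot).start()   # server boots asynchronously
    def _boot(self):
        time.sleep(0.01)
        self.ready.set()                              # notification: server-side app is up
    def exchange(self, local_state):
        return {"frame": f"rendered for pose {local_state['pose']}"}

def client_session(server, frames=3):
    server.start_app({"app": "demo-experience"})   # 1. initiate execution on the server
    local = {"pose": 0}                            # 2. initialize local systems meanwhile
    server.ready.wait()                            # 3. await the server's notification
    rendered = []
    for _ in range(frames):                        # 4. periodic provide/receive loop
        data = server.exchange(local)              #    send local-system info, get scene data
        rendered.append(data["frame"])             #    render the AR/VR scene from it
        local["pose"] += 1
    return rendered
```

The periodic loop in step 4 models the "providing and receiving performed periodically" at a rate chosen to keep the viewing experience comfortable.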
Extended reality recorder
Implementations of the subject technology provide systems and methods for recording an extended reality experience in a way that allows the experience to be played back at a later time from a different viewpoint or perspective. This allows computer-generated content that was rendered for display to a user during the recording, to be re-rendered during playback at the correct time and location in the recording, but from a different perspective. In order to facilitate this type of viewer-centric playback, the recording includes a computer-generated content track that references resources for re-rendering the computer-generated content at each point in time in the recording.
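The recording structure described above can be modeled roughly as follows: a captured video plus a computer-generated content track whose timed entries reference the resources needed to re-render the content from any viewpoint at playback. All field and class names are assumptions for illustration.

```python
# Illustrative data model for a viewer-centric XR recording with a CG content track.
from dataclasses import dataclass, field

@dataclass
class CGTrackEntry:
    timestamp: float
    asset_ref: str    # reference to the resource needed to re-render the content
    pose: tuple       # where the content was anchored in the recorded scene

@dataclass
class XRRecording:
    captured_video: str
    cg_track: list = field(default_factory=list)

    def entries_at(self, t: float, window: float = 0.5):
        """Resolve which CG resources must be re-rendered near playback time t."""
        return [e for e in self.cg_track if abs(e.timestamp - t) <= window]

rec = XRRecording("room_capture.mov")
rec.cg_track.append(CGTrackEntry(2.0, "assets/robot.usdz", (1.0, 0.0, -2.0)))
```

At playback, the player re-renders the referenced asset at the recorded time and location, but from whatever perspective the viewer chooses.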
Method and device for latency reduction of an image processing pipeline
In some implementations, a method of reducing latency associated with an image read-out operation is performed at a device including one or more processors, non-transitory memory, an image processing architecture, and an image capture device. The method includes: obtaining first image data corresponding to a physical environment; reading a first slice of the first image data into an input buffer; performing image processing operations on the first slice of the first image data to obtain a first portion of second image data; reading a second slice of the first image data into the input buffer; performing the image processing operations on the second slice of the first image data to obtain a second portion of the second image data; and generating an image frame of the physical environment based at least in part on the first and second portions of the second image data.
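The slice-wise pipeline above can be sketched minimally: rather than waiting for the full frame to be read out, each slice enters the input buffer and is processed as it arrives, overlapping read-out with processing. The "processing operation" here is a placeholder (pixel scaling), not the patent's actual pipeline.

```python
# Minimal sketch of slice-based image read-out and processing.

def read_slices(image, n_slices):
    """Yield horizontal slices of the first image data one at a time."""
    rows = len(image)
    step = rows // n_slices
    for i in range(0, rows, step):
        yield image[i:i + step]          # one slice into the input buffer

def process_slice(slice_rows):
    """Placeholder image processing operation applied per slice."""
    return [[px * 2 for px in row] for row in slice_rows]

def build_frame(image, n_slices=2):
    out = []
    for s in read_slices(image, n_slices):   # processing starts before full read-out
        out.extend(process_slice(s))         # one portion of the second image data
    return out                               # image frame from all processed portions

frame = build_frame([[1, 2], [3, 4], [5, 6], [7, 8]])
```

Because each slice is processed while later slices are still being read, end-to-end latency shrinks compared with a whole-frame pipeline.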
REAL-TIME TEMPORALLY CONSISTENT OBJECT SEGMENTED STYLE TRANSFER IN MEDIA AND GAMING
One embodiment provides a method comprising, at a runtime library executed by a processor of a data processing system, receiving an input frame having objects to be stylized via a style transfer network associated with the runtime library, wherein the style transfer network is a neural network model trained to apply one or more visual styles to the input frame, performing instance segmentation on the input frame to generate one or more instance masks to identify one or more objects to be stylized, generating one or more stylized frames for each style to transfer to the input frame, and merging, via the one or more instance masks, stylized objects from the one or more stylized frames with un-stylized content from the input frame to generate an output frame with per-instance stylization.
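The per-instance merge step can be illustrated in isolation: stylized pixels are taken where an instance mask is set, and un-stylized input pixels are kept everywhere else. The segmentation and style-transfer networks themselves are mocked here; only the merging logic from the abstract is shown.

```python
# Sketch of merging stylized objects with un-stylized content via instance masks.

def merge_stylized(input_frame, stylized_frames, instance_masks):
    """Each mask selects one object's pixels from its corresponding stylized frame."""
    h, w = len(input_frame), len(input_frame[0])
    out = [row[:] for row in input_frame]            # start from un-stylized content
    for mask, styled in zip(instance_masks, stylized_frames):
        for y in range(h):
            for x in range(w):
                if mask[y][x]:                       # pixel belongs to this instance
                    out[y][x] = styled[y][x]         # take the stylized pixel
    return out

frame  = [[0, 0], [0, 0]]
styled = [[[9, 9], [9, 9]]]          # one style applied to the whole frame
masks  = [[[1, 0], [0, 0]]]          # one instance mask (object at top-left)
merged = merge_stylized(frame, styled, masks)
```

With multiple masks and stylized frames, each detected object can receive a different style while the background stays untouched, which is the per-instance stylization the claim describes.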
SELECTING SUPPLEMENTAL AUDIO SEGMENTS BASED ON VIDEO ANALYSIS
Aspects of the present application correspond to generation of supplemental content based on processing information associated with content to be rendered. More specifically, aspects of the present application correspond to the generation of audio track information, such as music tracks, that are created for playback during the presentation of video content. Illustratively, one or more frames of the video content are processed by machine learned algorithm(s) to generate processing results indicative of one or more attributes characterizing individual frames of video content. A selection system can then identify potential music tracks or other audio data in view of the processing results.
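A rough sketch of the selection step follows: per-frame attribute tags (the machine-learned processing results) are matched against a catalog of audio tracks. The attribute names and the overlap-based scoring rule are assumptions for illustration, not the patent's actual algorithm.

```python
# Hedged sketch: select a supplemental audio track from per-frame video attributes.

def select_track(frame_attributes, catalog):
    """Pick the audio track whose tags best overlap the frames' detected attributes."""
    detected = set()
    for attrs in frame_attributes:          # processing results per individual frame
        detected.update(attrs)
    def score(track):
        return len(detected & set(track["tags"]))
    return max(catalog, key=score)

frames = [{"outdoor", "fast-motion"}, {"fast-motion", "crowd"}]
catalog = [
    {"name": "calm_piano", "tags": ["indoor", "slow"]},
    {"name": "upbeat_rock", "tags": ["fast-motion", "crowd"]},
]
chosen = select_track(frames, catalog)
```

A real system would likely weight attributes over time rather than pooling them into a set, but the shape of the pipeline (frame analysis, then catalog matching) is the same.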
Displaying a virtual environment of a session
In various example embodiments, a system and method for facilitating display of virtual content are presented. A session that displays two-dimensional (2D) content of one or more items available for sale is presented on a first device of a user. A second device of the user is detected, the second device being able to display three-dimensional (3D) content of the one or more items available for sale. 3D content of the one or more items available for sale is retrieved. Display of the 3D content on the second device is caused, the 3D content selectable by the user to perform interactions with the 3D content. An indication of the user-performed interactions is received and processed. A result that depicts the user-performed interactions as being processed is displayed on the first device of the user.
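The session flow above can be sketched as a simple state machine: when a 3D-capable second device is detected, 3D content for the same items is retrieved and shown there, and the result of the user's 3D interaction is reflected back on the first device. Class and field names here are hypothetical.

```python
# Illustrative two-device session: 2D listing on the first device, 3D on the second.

class Session:
    def __init__(self, items):
        self.items = items                 # items available for sale (2D on first device)
        self.first_device_log = []
        self.second_device_content = None

    def detect_second_device(self, device):
        if device.get("supports_3d"):
            # retrieve and cause display of 3D content on the second device
            self.second_device_content = [f"3d:{i}" for i in self.items]

    def interact(self, item):
        """User selects 3D content on the second device; result shown on first device."""
        if self.second_device_content and f"3d:{item}" in self.second_device_content:
            self.first_device_log.append(f"processed interaction with {item}")

s = Session(["lamp", "chair"])
s.detect_second_device({"supports_3d": True})
s.interact("lamp")
```

The key design point mirrored from the abstract is that the interaction happens on the 3D-capable device, while confirmation of the processed result appears on the original 2D device.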
Method and device for transmitting information on three-dimensional content including multiple view points
Provided is a method for transmitting metadata for omnidirectional content including a plurality of viewpoints. The method comprises identifying the metadata for the omnidirectional content including the plurality of viewpoints; and transmitting the identified metadata, wherein the metadata includes information about an identifier (ID) of a viewpoint group including at least one viewpoint of the plurality of viewpoints, and wherein the at least one viewpoint in the viewpoint group shares a common reference coordinate system.
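The metadata structure described above can be modeled as follows: viewpoints are grouped under a group ID, and every viewpoint in a group shares one common reference coordinate system. The field names are illustrative, not the actual signaling syntax.

```python
# Sketch of viewpoint-group metadata for omnidirectional content.
from dataclasses import dataclass, field

@dataclass
class Viewpoint:
    viewpoint_id: int
    position: tuple    # (x, y, z) in the group's common reference coordinate system

@dataclass
class ViewpointGroup:
    group_id: int
    reference_coordinate_system: str   # shared by every viewpoint in the group
    viewpoints: list = field(default_factory=list)

def build_metadata(groups):
    """Flatten group metadata for transmission, tagging each viewpoint with its group ID."""
    return [
        {"group_id": g.group_id, "viewpoint_id": v.viewpoint_id, "position": v.position}
        for g in groups
        for v in g.viewpoints
    ]

g = ViewpointGroup(1, "stage-centered", [Viewpoint(10, (0, 0, 0)), Viewpoint(11, (2, 0, 0))])
meta = build_metadata([g])
```

Because positions within a group are expressed in one shared coordinate system, a receiver can switch between viewpoints in the same group without re-aligning coordinates.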
Information processing apparatus, information processing method, and program for point cloud sample processing
The present disclosure relates to an information processing apparatus and an information processing method that enable processing to be performed simply, and a program. By converting a point cloud representing a three-dimensional structure into two dimensions, a geometry image, a texture image, and three-dimensional information metadata required for reconstructing the geometry image and the texture image in three dimensions are obtained. Then, one PC sample, corresponding to the Point Cloud displayed at a specific time, is generated by storing the geometry image, the texture image, and the three-dimensional information metadata in the playback order required when reproducing the geometry image and the texture image in three dimensions on the basis of the three-dimensional information metadata. The present technology can be applied, for example, to a data generation device that generates data for distribution of a Point Cloud.
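The PC-sample packing can be illustrated as below: a geometry image, a texture image, and the 3D-reconstruction metadata are stored together per display time, ordered as playback requires. The structure is an assumption drawn from the abstract, not the actual file-format syntax.

```python
# Illustrative packing of point-cloud samples (geometry + texture + 3D metadata).
from dataclasses import dataclass

@dataclass
class PCSample:
    display_time: float
    geometry_image: bytes    # 2D projection of point positions
    texture_image: bytes     # 2D projection of point colors
    metadata: dict           # info needed to reconstruct the points in three dimensions

def make_samples(frames):
    """One PC sample per display time, ordered as required for playback."""
    return sorted(
        (PCSample(t, geo, tex, md) for t, geo, tex, md in frames),
        key=lambda s: s.display_time,
    )

samples = make_samples([
    (0.2, b"g2", b"t2", {"proj": "ortho"}),
    (0.1, b"g1", b"t1", {"proj": "ortho"}),
])
```

Keeping the three components in one timed sample is what lets a player reconstruct the Point Cloud for a specific display time without scanning separate streams.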
STREAMING-BASED VR MULTI-SPLIT SYSTEM AND METHOD
A streaming-based VR multi-split system and method are provided. The system includes a control system, a manipulation terminal and at least one experience terminal. The control system includes a streaming media server, a streaming coding and decoding interaction module and a VR platform. The manipulation terminal includes a streaming decoding and interaction processing module configured to receive a video stream sent by the streaming coding and decoding interaction module, collect interaction data at the manipulation terminal side in real time, and transmit the interaction data to the streaming coding and decoding interaction module. The streaming coding and decoding interaction module sends a coded video picture of the VR platform to the manipulation terminal, and pushes the video stream corresponding to the interaction data to the streaming media server. A presentation picture corresponding to an operation of the experience terminal is acquired from the streaming media server.