G06T3/0087

Method and apparatus for enabling multiple timeline support for omnidirectional content playback

A method, apparatus and computer program product enable multiple timeline support in playback of omnidirectional media content with overlay. The method, apparatus and computer program product receive a visual overlay configured to be rendered as a multi-layer visual content with an omnidirectional media content file (30). The omnidirectional media content file is associated with a first presentation timeline. The visual overlay is associated with a second presentation timeline. The method, apparatus and computer program product construct an overlay behavior definition file associated with the visual overlay (32). The overlay behavior definition file indicates a behavior of the second presentation timeline with respect to the first presentation timeline in an instance that a pre-defined user interaction switch occurs during a playback of the omnidirectional media content file.
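
As a rough illustration of what an overlay behavior definition might encode, the following Python sketch models an overlay timeline that reacts to a user interaction switch. The behavior names (`CONTINUE`, `PAUSE`, `RESTART`), the `OverlayTimeline` class, and its fields are hypothetical; the abstract does not list the behaviors or the file syntax.

```python
from dataclasses import dataclass
from enum import Enum


class SwitchBehavior(Enum):
    """Hypothetical behaviors of the overlay (second) timeline on a switch."""
    CONTINUE = "continue"  # overlay keeps advancing with the main timeline
    PAUSE = "pause"        # overlay freezes while switched away
    RESTART = "restart"    # overlay resets to zero when switched back


@dataclass
class OverlayTimeline:
    behavior: SwitchBehavior
    position: float = 0.0  # seconds on the overlay timeline
    in_view: bool = True   # False after the user switches away

    def on_switch(self, in_view: bool) -> None:
        """Apply the defined behavior when the user interaction switch occurs."""
        returning = in_view and not self.in_view
        if returning and self.behavior is SwitchBehavior.RESTART:
            self.position = 0.0
        self.in_view = in_view

    def advance(self, dt: float) -> None:
        """Advance by dt seconds of the main (first) presentation timeline."""
        if self.in_view or self.behavior is SwitchBehavior.CONTINUE:
            self.position += dt
```

Under these assumptions, an overlay that plays for 2 s, is switched away for 3 s, and then plays 1 s more ends at 6 s, 3 s, or 1 s for `CONTINUE`, `PAUSE`, and `RESTART` respectively.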

Methods and apparatus for using track derivations to generate new tracks for network based media processing applications
11589032 · 2023-02-21

The techniques described herein relate to methods, apparatus, and computer readable media configured to perform media processing tasks. A media processing entity includes a processor in communication with a memory, wherein the memory stores computer-readable instructions that, when executed by the processor, cause the processor to perform receiving, from a remote computing device, multi-view multimedia data comprising a hierarchical track structure comprising at least a first track comprising first media data at a first level of the hierarchical track structure, and a second track comprising task instruction data at a second level in the hierarchical track structure that is different than the first level of the first track. The instructions further cause the processor to perform processing the first media data of the first track based on the task instruction data of the second track to generate modified media data and an output track that includes the modified media data.

Hybrid cubemap projection for 360-degree video coding
11616981 · 2023-03-28

A system, method, and/or instrumentality may be provided for coding a 360-degree video. A picture of the 360-degree video may be received. The picture may include one or more faces associated with one or more projection formats. A first projection format indication may be received that indicates a first projection format may be associated with a first face. A second projection format indication may be received that indicates a second projection format may be associated with a second face. Based on the first projection format, a first transform function associated with the first face may be determined. Based on the second projection format, a second transform function associated with the second face may be determined. At least one decoding process may be performed on the first face using the first transform function and/or at least one decoding process may be performed on the second face using the second transform function.
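
The per-face selection of a transform function can be sketched in Python as follows. The two formats used here, a conventional cubemap (identity mapping) and an equi-angular cubemap (tangent mapping), are assumptions chosen for illustration; the abstract does not name the projection formats, and the names `CMP`, `EAC`, `TRANSFORMS`, and `face_coords` are hypothetical.

```python
import math

# Hypothetical format labels; the abstract does not name specific formats.
CMP = "cmp"  # conventional cubemap
EAC = "eac"  # equi-angular cubemap


def cube_identity(x: float) -> float:
    """Conventional cubemap: the face coordinate is used as-is."""
    return x


def eac_to_cube(x: float) -> float:
    """Map an equi-angular coordinate in [-1, 1] to a cube-face coordinate."""
    return math.tan(x * math.pi / 4)


# Transform function determined from the signalled projection format.
TRANSFORMS = {CMP: cube_identity, EAC: eac_to_cube}


def face_coords(face_formats: dict, face: int, u: float, v: float):
    """Apply the transform selected by the face's projection format indication."""
    transform = TRANSFORMS[face_formats[face]]
    return transform(u), transform(v)
```

For example, with `face_formats = {0: CMP, 1: EAC}`, coordinates on face 0 pass through unchanged, while coordinates on face 1 go through the tangent mapping before any further decoding step uses them.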

Illumination control system
11490026 · 2022-11-01

Provided is an illumination control system including: an image capturing unit capturing an image in real time; a motion sensing unit sensing motion in the image and generating a notification; a communication unit receiving the image and the notification for transmission to outside, and receiving a control signal; a control unit receiving information on the image and the notification through the communication unit, analyzing the information, and transmitting the control signal for control; a light unit connected to the communication unit so that lights are controlled through the control signal; and a storage unit storing the information.

RANK INFORMATION IN IMMERSIVE MEDIA PROCESSING
20220343457 · 2022-10-27

Methods, apparatus, and systems for providing consistent immersive media viewing experiences to users while reducing bandwidth consumption are disclosed. In one example aspect, a method for processing multimedia content includes determining, for a conversion between a frame of panoramic media content comprising multiple segments and a bitstream representation of the frame of panoramic media content, multiple sets of rank information associated with the frame. Each set of the rank information indicates a priority level for processing a segment of the frame of panoramic media content. The method also includes performing the conversion based on the multiple sets of rank information.
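
One way rank information could drive processing is sketched below in Python: segments are visited in priority order and fetched until a bandwidth budget is exhausted. The `RankInfo` fields, the convention that a smaller rank means higher priority, and the byte budget are all assumptions made for this example and are not taken from the abstract.

```python
from dataclasses import dataclass


@dataclass
class RankInfo:
    segment_id: int
    rank: int        # assumption: smaller value means higher priority
    size_bytes: int  # hypothetical field used for the bandwidth budget


def select_segments(rank_sets, budget_bytes):
    """Pick segments in priority order until the byte budget runs out."""
    chosen = []
    remaining = budget_bytes
    # Ties in rank are broken by segment_id so the result is deterministic.
    for info in sorted(rank_sets, key=lambda r: (r.rank, r.segment_id)):
        if info.size_bytes <= remaining:
            chosen.append(info.segment_id)
            remaining -= info.size_bytes
    return chosen
```

In this sketch, lower-priority segments are simply skipped when the budget is tight; a real system might instead fetch them at reduced quality.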

System and method of providing real-time dynamic imagery of a medical procedure site using multiple modalities

A system and method of providing composite real-time dynamic imagery of a medical procedure site from multiple modalities which continuously and immediately depicts the current state and condition of the medical procedure site synchronously with respect to each modality and without undue latency is disclosed. The composite real-time dynamic imagery may be provided by spatially registering multiple real-time dynamic video streams from the multiple modalities to each other. Spatially registering the multiple real-time dynamic video streams to each other may provide a continuous and immediate depiction of the medical procedure site with an unobstructed and detailed view of a region of interest at the medical procedure site at multiple depths. A user may thereby view a single, accurate, and current composite real-time dynamic imagery of a region of interest at the medical procedure site as the user performs a medical procedure.

Method and apparatus of encoding/decoding image data based on tree structure-based block division

Disclosed are methods and apparatuses for image data encoding/decoding. A method of decoding an image includes receiving a bitstream in which the image is encoded; obtaining index information for specifying a block division type of a current block in the image; and determining the block division type of the current block from a candidate group pre-defined in the decoding apparatus. The candidate group includes a plurality of candidate division types, including at least one of a non-division, a first quad-division, a second quad-division, a binary-division or a triple-division. The method also includes dividing the current block into a plurality of sub-blocks; and decoding each of the sub-blocks with reference to syntax information obtained from the bitstream.
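
The index-driven choice of a division type can be sketched in Python as follows. The candidate group here is a hypothetical subset (no division, one quad division, a vertical binary division, and a vertical 1:2:1 triple division); the abstract does not define the geometry of each type, and the second quad-division is omitted because its shape is not described.

```python
from enum import Enum


class Division(Enum):
    NONE = "non-division"
    QUAD = "quad-division"
    BINARY = "binary-division"
    TRIPLE = "triple-division"


# Hypothetical pre-defined candidate group; the decoded index selects an entry.
CANDIDATE_GROUP = [Division.NONE, Division.QUAD, Division.BINARY, Division.TRIPLE]


def divide(x, y, w, h, index):
    """Return sub-block rectangles (x, y, w, h) for the signalled index."""
    kind = CANDIDATE_GROUP[index]
    if kind is Division.NONE:
        return [(x, y, w, h)]
    if kind is Division.QUAD:
        hw, hh = w // 2, h // 2
        return [
            (x, y, hw, hh),
            (x + hw, y, w - hw, hh),
            (x, y + hh, hw, h - hh),
            (x + hw, y + hh, w - hw, h - hh),
        ]
    if kind is Division.BINARY:
        hw = w // 2  # assumed vertical split into left and right halves
        return [(x, y, hw, h), (x + hw, y, w - hw, h)]
    # TRIPLE: assumed vertical split with a 1:2:1 width ratio
    q = w // 4
    return [(x, y, q, h), (x + q, y, w - 2 * q, h), (x + w - q, y, q, h)]
```

Each returned rectangle would then be decoded as its own sub-block using the syntax information read from the bitstream.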

Face detection in spherical images

A face located along a stitch line in a spherical image is detected by rendering views of regions of the spherical image along the stitch line. The spherical image may be produced by combining first and second images. A first view of a projection of the spherical image is rendered. A scaling factor for rendering a second view of the projection is determined based on characteristics of the first portion of the face. The second view is then rendered according to the scaling factor. The use of the scaling factor to render the second view causes a change in the depiction of the second portion of the face. For example, the scaling factor can indicate to change the resolution or expected size of the second portion of the face when rendering the second view. A face is then detected within the spherical image based on the rendered first and second views.

Display system for capsule endoscopic image and method for generating 3D panoramic view

The present disclosure relates to a display system including a capsule image view, a 3D mini-map, and a 3D panoramic view, and to a method of generating a 3D panoramic view. Specifically, the 3D mini-map makes it possible to infer the shape of an organ and, by visualizing the actual movement path of the capsule endoscope, to identify at the same time whether the capsule endoscope has captured images and what its position and posture were at the primary capture points, thereby improving the accuracy of examination. In addition, because multiple 2D images captured by a single capsule endoscope can be viewed as a single 3D panoramic image without changing the structure of the capsule endoscope, the approach is economical and increases the viewing angle of the image, thereby reducing examination time and examiner fatigue.

Video processing method for remapping sample locations in projection-based frame with projection layout to locations on sphere and associated video processing apparatus
11663690 · 2023-05-30

A video processing method includes: decoding a part of a bitstream to generate a decoded frame, where the decoded frame is a projection-based frame that includes projection faces in a projection layout; and remapping sample locations of the projection-based frame to locations on a sphere, where a sample location within the projection-based frame is converted into a local sample location within a projection face packed in the projection-based frame; in response to adjustment criteria being met, an adjusted local sample location within the projection face is generated by applying adjustment to at least one coordinate value of the local sample location within the projection face, and the adjusted local sample location within the projection face is remapped to a location on the sphere; and in response to the adjustment criteria not being met, the local sample location within the projection face is remapped to a location on the sphere.
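
The remapping chain (frame location, then local face location, then optional adjustment, then sphere location) can be sketched in Python as follows. The 3x2 cube layout, the face orientations in `FACES`, the face size, and the scaling used as the adjustment are all hypothetical; the abstract does not specify the layout, the adjustment criteria, or the form of the adjustment.

```python
import math

FACE_SIZE = 4  # hypothetical face width and height in samples

# Hypothetical 3x2 layout: face index = row * 3 + col.
# Each entry is (face centre direction, u axis, v axis).
FACES = [
    ((1, 0, 0), (0, 0, -1), (0, -1, 0)),   # +X
    ((-1, 0, 0), (0, 0, 1), (0, -1, 0)),   # -X
    ((0, 1, 0), (1, 0, 0), (0, 0, 1)),     # +Y
    ((0, -1, 0), (1, 0, 0), (0, 0, -1)),   # -Y
    ((0, 0, 1), (1, 0, 0), (0, -1, 0)),    # +Z
    ((0, 0, -1), (-1, 0, 0), (0, -1, 0)),  # -Z
]


def frame_to_local(px, py):
    """Convert a frame sample location to (face, u, v) with u, v in [-1, 1]."""
    face = (py // FACE_SIZE) * 3 + (px // FACE_SIZE)
    u = 2 * ((px % FACE_SIZE) + 0.5) / FACE_SIZE - 1
    v = 2 * ((py % FACE_SIZE) + 0.5) / FACE_SIZE - 1
    return face, u, v


def local_to_sphere(face, u, v):
    """Project a local face location onto the unit sphere."""
    c, a, b = FACES[face]
    d = [c[i] + u * a[i] + v * b[i] for i in range(3)]
    n = math.sqrt(sum(t * t for t in d))
    return tuple(t / n for t in d)


def remap(px, py, criteria_met, scale=0.5):
    """Remap a frame sample to the sphere, adjusting first if criteria are met."""
    face, u, v = frame_to_local(px, py)
    if criteria_met(face, u, v):
        # Placeholder adjustment: scale both local coordinates.
        u, v = u * scale, v * scale
    return local_to_sphere(face, u, v)
```

The `criteria_met` callback stands in for whatever test the method uses; passing `lambda face, u, v: False` gives the unadjusted path described in the last clause of the abstract.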