Patent classifications
H04N21/816
Method and apparatus for 3-D auto tagging
A multi-view interactive digital media representation (MVIDMR) of an object can be generated from live images of the object captured from a camera. Selectable tags can be placed at locations on the object in the MVIDMR. When a selectable tag is selected, media content can be output which shows details of the object at the location where the selectable tag is placed. A machine learning algorithm can be used to automatically recognize landmarks on the object in the frames of the MVIDMR, and a structure-from-motion calculation can be used to determine 3-D positions associated with the landmarks. A 3-D skeleton associated with the object can be assembled from the 3-D positions and projected into the frames associated with the MVIDMR. The 3-D skeleton can be used to determine the selectable tag locations in the frames of the MVIDMR of the object.
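As a rough illustration of the projection step described in this abstract, the sketch below maps 3-D skeleton joints into one frame's pixel coordinates. It assumes a simple pinhole camera model with joints already expressed in that frame's camera coordinates; the abstract does not specify the camera model, and the names `project_point`, `project_skeleton`, `focal_px`, `cx`, and `cy` are hypothetical.

```python
from typing import Dict, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_point(point_cam: Point3D, focal_px: float, cx: float, cy: float) -> Point2D:
    """Project one 3-D point (camera coordinates) to pixels with a pinhole model.

    Assumption: no lens distortion, and the point lies in front of the camera.
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def project_skeleton(joints: Dict[str, Point3D], focal_px: float,
                     cx: float, cy: float) -> Dict[str, Point2D]:
    """Project every named joint; the results are candidate tag locations."""
    return {name: project_point(p, focal_px, cx, cy) for name, p in joints.items()}


# Hypothetical landmarks on an object, two metres in front of the camera.
skeleton = {"front_wheel": (1.0, 0.0, 2.0), "roof": (0.0, 0.0, 2.0)}
tags = project_skeleton(skeleton, focal_px=500.0, cx=320.0, cy=240.0)
```

Repeating the projection with each frame's own camera pose would give per-frame tag locations that stay attached to the same physical landmark.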
Video client optimization during pause
A system and method for providing quality control in immersive video during pausing of a video streaming session. In one embodiment, a paused video frame may comprise a plurality of mixed-quality video tiles depending on user gaze vector information. Under pause control, the video quality of all tiles of the paused video frame is equalized to a single value, which may be the video quality of the tiles presented in a viewport of the client device. The paused video frame, now having same-quality tiles throughout, is used as a replacement video frame, which is presented to the client device player for decoding and display instead of the mixed-quality video frame while the streaming session is paused.
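The equalization step might be sketched as follows. This is a minimal illustration rather than the claimed method: the `Tile` type and its integer `quality` field are hypothetical, and taking the highest quality found in the viewport as the target is an assumption, since the abstract only says the common value may be the viewport quality.

```python
from dataclasses import dataclass, replace
from typing import List, Set


@dataclass(frozen=True)
class Tile:
    index: int
    quality: int  # hypothetical quality level; higher means better


def equalize_paused_frame(tiles: List[Tile], viewport_indices: Set[int]) -> List[Tile]:
    """Build a replacement frame whose tiles all share the viewport quality."""
    viewport_qualities = [t.quality for t in tiles if t.index in viewport_indices]
    if not viewport_qualities:
        raise ValueError("viewport contains no tiles")
    # Assumption: use the best quality currently shown in the viewport.
    target = max(viewport_qualities)
    # The original mixed-quality frame is left untouched.
    return [replace(t, quality=target) for t in tiles]


mixed = [Tile(0, 1), Tile(1, 5), Tile(2, 5), Tile(3, 2)]
replacement = equalize_paused_frame(mixed, viewport_indices={1, 2})
```

In a real client the higher-quality tiles would have to be fetched before the replacement frame could be assembled; the sketch only models the quality bookkeeping.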
DYNAMIC ADAPTATION OF VOLUMETRIC CONTENT COMPONENT SUB-BITSTREAMS IN STREAMING SERVICES
A media content processing device may decode visual volumetric content based on one or more messages, which may indicate which attribute sub-bitstream of one or more attribute sub-bitstreams indicated in a parameter set is active. The parameter set may include a visual volumetric video-based parameter set. The message indicating one or more active attribute sub-bitstreams may be received by the decoder. A decoder may perform decoding, such as determining which attribute sub-bitstream to use for decoding visual media content, based on the one or more messages. The one or more messages may be generated and sent to a decoder, for example, to indicate the deactivation of the one or more attribute sub-bitstreams. The decoder may determine an inactive attribute sub-bitstream and skip the inactive attribute sub-bitstream for decoding the visual media content based on the one or more messages.
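The skip logic in this abstract can be illustrated with a small sketch. The message layout used here (a dictionary with `attr_id` and `active` keys) is purely hypothetical and is not the actual bitstream syntax; the sketch also assumes every declared sub-bitstream starts out active.

```python
from typing import Dict, Iterable, List, Set


def active_attributes(declared: List[int], messages: Iterable[Dict]) -> Set[int]:
    """Track which declared attribute sub-bitstreams are currently active."""
    active = set(declared)  # assumption: everything declared starts active
    for msg in messages:
        attr_id = msg["attr_id"]
        if attr_id not in declared:
            continue  # ignore messages about undeclared sub-bitstreams
        if msg["active"]:
            active.add(attr_id)
        else:
            active.discard(attr_id)
    return active


def select_for_decoding(sub_bitstreams: Dict[int, bytes], active: Set[int]) -> Dict[int, bytes]:
    """Keep only active sub-bitstreams; inactive ones are skipped."""
    return {i: data for i, data in sub_bitstreams.items() if i in active}


declared_ids = [0, 1, 2]
msgs = [{"attr_id": 1, "active": False}, {"attr_id": 7, "active": False}]
streams = {0: b"texture", 1: b"normals", 2: b"reflectance"}
to_decode = select_for_decoding(streams, active_attributes(declared_ids, msgs))
```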
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING PROGRAM
A generation unit (111) generates an identification region map indicating whether or not each divided object data is visible from each position in a three-dimensional space by using information regarding orientation of a normal vector (15) dividing an object (10) in the three-dimensional space and information regarding an outline of the object (10).
ANIMATION PRODUCTION SYSTEM
To enable animations to be shot in a virtual space, an animation production method provides a virtual space in which a given object is placed, the method comprising: detecting an operation of a user equipped with a head mounted display; controlling an action of an object based on the detected operation of the user; shooting the action of the object; and storing action data relating to the shot action of the object in a predetermined track.
IMAGE DISPLAY SYSTEM, MOVING IMAGE DISTRIBUTION SERVER, IMAGE PROCESSING APPARATUS, AND MOVING IMAGE DISTRIBUTION METHOD
A server performs a part of a forming process necessary for conversion into formats corresponding to display modes of a head mounted display and a flat-plate display connected to an image processing apparatus, and transmits a processing result to the image processing apparatus. At this time, the server switches the part of the process to transmit any one of a pair of a left-eye image and a right-eye image, an image suited for the flat-plate display, and an image constituted by a left-eye image and a right-eye image to each of which distortion for an ocular lens has been given.
METHOD AND APPARATUS FOR PROCESSING TRACK DATA OF MULTIMEDIA FILE, AND MEDIUM AND DEVICE
Embodiments of this disclosure provide a method and apparatus for processing track data in a multimedia file, a medium, and a device. The processing method includes: receiving a multimedia file, the multimedia file including a plurality of track data and track group information corresponding to the respective track data, where the track group information corresponding to target track data includes identification information of a plurality of track groups, and the identification information of the plurality of track groups is used for indicating that the target track data belongs to the plurality of track groups simultaneously; parsing the track group information to obtain a track group to which the respective track data belongs; and decoding track data belonging to a specified track group to obtain multimedia data corresponding to the specified track group.
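A minimal sketch of the parse-then-select flow follows. It assumes the track group information has already been read from the file into a mapping from track ID to a list of group IDs; the function names and that mapping are hypothetical, and no real container parser is involved.

```python
from collections import defaultdict
from typing import Dict, List


def group_tracks(track_group_info: Dict[str, List[int]]) -> Dict[int, List[str]]:
    """Invert track -> group IDs into group ID -> member tracks.

    A track listing several group IDs appears in every one of those groups.
    """
    groups: Dict[int, List[str]] = defaultdict(list)
    for track_id, group_ids in track_group_info.items():
        for group_id in group_ids:
            groups[group_id].append(track_id)
    return dict(groups)


def tracks_to_decode(track_group_info: Dict[str, List[int]], specified_group: int) -> List[str]:
    """Return the tracks that belong to the specified track group."""
    return group_tracks(track_group_info).get(specified_group, [])


# Hypothetical file: track "t2" belongs to groups 1 and 2 simultaneously.
info = {"t1": [1], "t2": [1, 2], "t3": [2]}
selected = tracks_to_decode(info, specified_group=2)
```

Because `t2` carries both group identifiers, it is selected whichever of the two groups the player asks for, so the shared data does not need to be stored twice.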
METHOD AND DEVICE FOR TRANSMITTING IMAGE CONTENT USING EDGE COMPUTING SERVICE
An example method, performed by an edge data network, of transmitting image content, includes obtaining azimuth information and focal position information from an electronic device connected to the edge data network, and generating a filtered first partial image by performing filtering on a first partial image corresponding to the azimuth information by using one filter determined based on the focal position information.
APPARATUS, SYSTEM AND METHOD OF VIDEO ENCODING
For example, an apparatus may include a video encoder configured to encode video data into a parallel plurality of encoded video streams, the parallel plurality of encoded video streams including the video data encoded according to a respective plurality of different video bitrates; a selector configured to select, based on one or more parameters corresponding to a condition of a wireless communication link, a selected encoded video stream from the parallel plurality of encoded video streams; and a radio to transmit the selected encoded video stream over the wireless communication link.
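The selector could be sketched as below. The abstract only says the choice depends on one or more parameters of the wireless link condition; using an estimated throughput with a safety headroom factor is an assumption, and `select_stream` and `headroom` are hypothetical names.

```python
from typing import Sequence


def select_stream(bitrates_kbps: Sequence[int], link_throughput_kbps: float,
                  headroom: float = 0.8) -> int:
    """Pick the highest encoded bitrate that fits the link budget.

    Falls back to the lowest bitrate when nothing fits.
    """
    if not bitrates_kbps:
        raise ValueError("no encoded streams available")
    budget = link_throughput_kbps * headroom
    candidates = [b for b in bitrates_kbps if b <= budget]
    return max(candidates) if candidates else min(bitrates_kbps)


# Three streams encoded in parallel at different bitrates.
ladder = [1000, 4000, 8000]
chosen = select_stream(ladder, link_throughput_kbps=6000)
```

Since all streams are encoded in parallel, the radio can switch to a different stream as soon as the link estimate changes, without waiting for the encoder to be reconfigured.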
CODING SCHEME FOR IMMERSIVE VIDEO WITH ASYMMETRIC DOWN-SAMPLING AND MACHINE LEARNING
Methods of encoding and decoding immersive video are provided. In an encoding method, source video data comprising a plurality of source views is encoded into a video bitstream. At least one of the source views is down-sampled prior to encoding. A metadata bitstream associated with the video bitstream comprises metadata describing a configuration of the down-sampling, to assist a decoder in decoding the video bitstream. It is believed that the use of down-sampled views may help to reduce coding artifacts, compared with a patch-based encoding approach. Also provided are an encoder and a decoder for immersive video, and an immersive video bitstream.