Patent classifications
H04N13/268
HYBRID MECHANISM FOR EFFICIENT RENDERING OF GRAPHICS IMAGES IN COMPUTING ENVIRONMENTS
A mechanism is described for facilitating hybrid rendering of graphics images in computing environments. A method of embodiments, as described herein, includes detecting a video stream including two-dimensional (2D) images, where the video stream is processed through a graphics pipeline at a computing device. The method may further include performing a hybrid combination of a luma (Y) plane with chrominance (UV) planes to directly generate a YUV texture, wherein the YUV texture is used to generate three-dimensional (3D) images corresponding to the 2D images.
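The plane combination described in the abstract can be illustrated with a minimal sketch. The function below is a hypothetical illustration, assuming NV12-style chroma subsampled 2x in each axis; it upsamples the UV plane and stacks it with the Y plane into one H x W x 3 YUV texture, skipping any intermediate RGB conversion.

```python
import numpy as np

def combine_planes_to_yuv(y_plane, uv_plane):
    """Hypothetical sketch: merge a full-resolution Y plane with a
    half-resolution interleaved UV plane into a single YUV texture."""
    h, w = y_plane.shape
    # Chroma is subsampled 2x in both axes; repeat to full resolution.
    u = np.repeat(np.repeat(uv_plane[..., 0], 2, axis=0), 2, axis=1)[:h, :w]
    v = np.repeat(np.repeat(uv_plane[..., 1], 2, axis=0), 2, axis=1)[:h, :w]
    # Stack the three channels into one H x W x 3 texture.
    return np.stack([y_plane, u, v], axis=-1)

y = np.full((4, 4), 128, dtype=np.uint8)
uv = np.zeros((2, 2, 2), dtype=np.uint8)
tex = combine_planes_to_yuv(y, uv)
print(tex.shape)  # (4, 4, 3)
```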
Method and apparatus for immersive video formatting
Disclosed herein is an immersive video formatting method and apparatus for supporting motion parallax. The immersive video formatting method includes acquiring a basic video at a basic position, acquiring a multiple view video at at least one position different from the basic position, acquiring at least one residual video plus depth (RVD) video using the basic video and the multiple view video, and generating at least one of a packed video plus depth (PVD) video or predetermined metadata using the acquired basic video and the at least one RVD video.
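A residual video keeps only the side-view content that the basic video cannot predict. The sketch below is a hypothetical simplification of that idea (the function name, threshold, and per-pixel comparison are assumptions, not the patent's actual procedure): pixels of a side view that match the basic view within a threshold are dropped, leaving a sparse residual frame that can be packed.

```python
import numpy as np

def residual_video(basic_view, side_view, threshold=8):
    """Hypothetical sketch: zero out side-view pixels that are redundant
    with the basic view, keeping only the residual content."""
    diff = np.abs(side_view.astype(np.int16) - basic_view.astype(np.int16))
    mask = diff > threshold                       # True where content is new
    residual = np.where(mask, side_view, 0).astype(np.uint8)
    return residual, mask

basic = np.array([[10, 10]], dtype=np.uint8)
side = np.array([[10, 30]], dtype=np.uint8)
res, mask = residual_video(basic, side)
print(res)  # [[ 0 30]] -- only the unpredictable pixel survives
```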
Apparatus, a method and a computer program for volumetric video
Video encoding may comprise obtaining a volumetric content containing visual information of three-dimensional objects; generating at least one patch by projecting the visual information of three-dimensional objects of the volumetric content to at least one projection plane. Video decoding may comprise obtaining neighboring pixels of a location on the 2D image based on said geometry information; determining a difference of values of the neighboring pixels on the 2D image; comparing the difference with a value range to determine a number of 3D points to be interpolated; projecting back the 2D image to create the volumetric content; wherein the projection comprises interpolating the number of 3D points on the basis of the values of the neighboring pixels.
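The decoding step that maps a neighbor-value difference to an interpolation count can be sketched as follows. This is a hypothetical illustration, assuming a list of `(max_difference, point_count)` ranges checked in order; the actual value ranges and counts would come from the codec configuration.

```python
def points_to_interpolate(neighbor_values, ranges):
    """Hypothetical sketch: compare the spread of neighboring pixel values
    on the 2D image with value ranges to pick a number of 3D points to
    interpolate during back-projection."""
    diff = max(neighbor_values) - min(neighbor_values)
    for max_diff, count in ranges:
        if diff <= max_diff:
            return count
    return 0  # spread too large: treat as a discontinuity, interpolate nothing

# Assumed example ranges: flat -> 0 points, small step -> 1, larger step -> 3.
ranges = [(1, 0), (4, 1), (16, 3)]
print(points_to_interpolate([10, 11], ranges))  # 0
print(points_to_interpolate([10, 13], ranges))  # 1
print(points_to_interpolate([10, 40], ranges))  # 0
```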
Imaging system including light source, image sensor, and double-band pass filter
An imaging system includes a light source that, in operation, emits an emitted light containing a near-infrared light in a first wavelength region, an image sensor, and a double-band pass filter that transmits a visible light in at least a part of a wavelength region out of a visible region and the near-infrared light in the first wavelength region. The image sensor includes light detection cells, a first filter that selectively transmits the near-infrared light in the first wavelength region, second to fourth filters that selectively transmit lights in second to fourth wavelength regions, respectively, which are contained in the visible light, and an infrared absorption filter. The infrared absorption filter faces the second to fourth filters and absorbs the near-infrared light in the first wavelength region.
Method and apparatus for encoding and decoding three-dimensional scenes in and from a data stream
Methods and devices are provided to encode and decode a data stream carrying data representative of a three-dimensional scene, the data stream comprising color pictures packed in a color image; depth pictures packed in a depth image; and a set of patch data items comprising de-projection data; data for retrieving a color picture in the color image and geometry data. Two types of geometry data are possible. The first type of data describes how to retrieve a depth picture in the depth image. The second type of data comprises an identifier of a 3D mesh. Vertex coordinates and faces of this mesh are used to retrieve the location of points in the de-projected scene.
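The two geometry-data variants in a patch data item can be modeled as a tagged structure. The class below is a hypothetical sketch (field names and types are assumptions): each item carries de-projection data and a color-image rectangle, plus exactly one of a depth-image rectangle (type 1) or a 3D mesh identifier (type 2).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class PatchDataItem:
    """Hypothetical sketch of a patch data item with two geometry variants."""
    deprojection_data: Tuple[float, ...]      # e.g. camera pose parameters
    color_rect: Tuple[int, int, int, int]     # x, y, w, h in the color image
    depth_rect: Optional[Tuple[int, int, int, int]] = None  # geometry type 1
    mesh_id: Optional[int] = None                           # geometry type 2

    def geometry_type(self) -> int:
        # Exactly one of the two variants must be present.
        if (self.depth_rect is None) == (self.mesh_id is None):
            raise ValueError("exactly one geometry variant must be set")
        return 1 if self.depth_rect is not None else 2

a = PatchDataItem((0.0,), (0, 0, 64, 64), depth_rect=(0, 0, 64, 64))
b = PatchDataItem((0.0,), (64, 0, 64, 64), mesh_id=7)
print(a.geometry_type(), b.geometry_type())  # 1 2
```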
Few-shot viewpoint estimation
When an image is projected from 3D, the viewpoint of objects in the image, relative to the camera, must be determined. Since the image itself will not have sufficient information to determine the viewpoint of the various objects in the image, techniques to estimate the viewpoint must be employed. To date, neural networks have been used to infer such viewpoint estimates on an object category basis, but must first be trained with numerous examples that have been manually created. The present disclosure provides a neural network that is trained to learn, from just a few example images, a unique viewpoint estimation network capable of inferring viewpoint estimations for a new object category.
Video reconstruction method, system, device, and computer readable storage medium
A method, a system, a device, and a computer readable storage medium for video reconstruction are disclosed. The method includes: obtaining image combinations of multi-angle free-perspective video frames, parameter data corresponding to the image combinations of the video frames, and position information of a virtual viewpoint based on a user interaction; selecting texture images and depth maps of corresponding groups in the image combinations of the video frames at a time moment of the user interaction according to a preset rule and based on the position information of the virtual viewpoint and the parameter data corresponding to the image combinations of the video frames; and combining and rendering the texture images and the depth maps of the corresponding groups based on the position information of the virtual viewpoint and parameter data corresponding to the depth maps and the texture images of the corresponding groups to obtain a reconstructed image.
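One plausible "preset rule" for choosing corresponding groups is nearest-camera selection. The sketch below is a hypothetical illustration, not the patent's actual rule: it ranks capture cameras by distance to the virtual viewpoint and returns the indices of the closest `k`, whose texture images and depth maps would then be combined and rendered.

```python
import math

def select_groups(camera_positions, virtual_viewpoint, k=2):
    """Hypothetical sketch: pick the k capture cameras closest to the
    virtual viewpoint as the 'corresponding groups' for rendering."""
    ranked = sorted(range(len(camera_positions)),
                    key=lambda i: math.dist(camera_positions[i],
                                            virtual_viewpoint))
    return ranked[:k]

# Four cameras along a line; the viewpoint sits between cameras 1 and 2.
cams = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(select_groups(cams, (1.2, 0.0)))  # [1, 2]
```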