Patent classifications
H04N19/62
Object pose estimation and tracking using machine learning
A method includes receiving a video comprising images representing an object, and determining, using a machine learning model, based on a first image of the images, and for each respective vertex of vertices of a bounding volume for the object, first two-dimensional (2D) coordinates of the respective vertex. The method also includes tracking, from the first image to a second image of the images, a position of each respective vertex along a plane underlying the bounding volume, and determining, for each respective vertex, second 2D coordinates of the respective vertex based on the position of the respective vertex along the plane. The method further includes determining, for each respective vertex, (i) first three-dimensional (3D) coordinates of the respective vertex based on the first 2D coordinates and (ii) second 3D coordinates of the respective vertex based on the second 2D coordinates.
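The abstract's final step, lifting each vertex's 2D coordinates to 3D via the plane underlying the bounding volume, can be sketched as a ray-plane intersection. This is a minimal illustration, not the patented method: the camera intrinsics `K`, the camera height `h`, and the pixel coordinates are all assumed values, and the "tracked" second-frame position is simply given rather than produced by a tracker or ML model.

```python
import numpy as np

def lift_to_plane(uv, K, h=1.5):
    """Lift a 2D image point to 3D by intersecting its camera ray with a
    horizontal ground plane a height h below the camera (modeled as the
    plane y = h in camera coordinates). h = 1.5 m is an assumed value."""
    ray = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])  # ray direction in camera frame
    t = h / ray[1]  # scale so the ray's y-component reaches the plane
    return ray * t

# Assumed pinhole intrinsics for illustration.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# One bounding-volume vertex: its 2D coordinates in the first frame (as a
# hypothetical ML model might output them) and its tracked position in the
# second frame; both are lifted to 3D through the same plane constraint.
first_2d = (400.0, 300.0)
second_2d = (410.0, 305.0)
first_3d = lift_to_plane(first_2d, K)
second_3d = lift_to_plane(second_2d, K)
```

Both 3D points land on the same plane by construction, which is what lets the tracked 2D motion along the plane determine consistent second 3D coordinates.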
Multi-processor support for array imagers
Using the techniques discussed herein, a set of images is captured by one or more array imagers (106). Each array imager includes multiple imagers configured in various manners, and each array imager captures multiple images of substantially the same scene at substantially the same time. The images captured by each array imager are encoded by multiple processors (112, 114). Each processor can encode sets of images captured by a different array imager, or each processor can encode different sets of images captured by the same array imager. The encoding is performed using various image-compression techniques so that the information resulting from the encoding is smaller, in terms of storage size, than the uncompressed images.
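The per-processor assignment described above, where each worker encodes a different imager's set of images, can be sketched with a thread pool. This is an illustrative sketch only: `zlib` stands in for a real image codec, the "images" are synthetic byte buffers, and two workers model the two processors (112, 114).

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def encode_set(images):
    """Encode one array imager's set of images; zlib compression stands
    in for a real image-compression technique."""
    return [zlib.compress(im) for im in images]

# Two hypothetical array imagers, each capturing four images of
# substantially the same scene (repetitive buffers, so they compress well).
imager_a = [bytes([i]) * 4096 for i in range(4)]
imager_b = [bytes([i + 10]) * 4096 for i in range(4)]

# One worker per imager: each worker encodes a different imager's image
# set, mirroring the per-processor assignment in the description.
with ThreadPoolExecutor(max_workers=2) as pool:
    encoded_a, encoded_b = pool.map(encode_set, [imager_a, imager_b])

raw_size = sum(len(im) for s in (imager_a, imager_b) for im in s)
enc_size = sum(len(im) for s in (encoded_a, encoded_b) for im in s)
```

The encoded output is smaller than the raw capture, matching the stated goal of reducing storage size.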
Method for producing video coding and programme-product
According to the invention, sets of contexts are provided that are specifically adapted to encode spectral coefficients of a prediction error matrix on the basis of previously encoded values of level k. Furthermore, the number of nonzero levels is explicitly encoded, and the appropriate contexts are selected on the basis of the number of nonzero spectral coefficients.
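The context-selection idea above, choosing among context sets based on how many nonzero coefficients a block contains, can be sketched as follows. The threshold values and the number of context sets are illustrative assumptions, not values from the patent.

```python
def select_context(num_nonzero, num_context_sets=4):
    """Pick a context set index from the explicitly coded count of
    nonzero (level != 0) coefficients. Thresholds are hypothetical."""
    thresholds = (0, 2, 4, 8)  # assumed cut points between context sets
    ctx = 0
    for i, t in enumerate(thresholds):
        if num_nonzero >= t:
            ctx = i
    return min(ctx, num_context_sets - 1)

# Toy prediction-error coefficients for one block.
coeffs = [0, 3, 0, -1, 2, 0, 0, 1]
num_nonzero = sum(1 for c in coeffs if c != 0)  # count coded explicitly
ctx = select_context(num_nonzero)
```

Blocks with more nonzero coefficients select higher-numbered context sets, so the entropy coder's probability models adapt to the block's coefficient statistics.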
Retinal encoder for machine vision
A method is disclosed including: receiving raw image data corresponding to a series of raw images; processing the raw image data with an encoder to generate encoded data, where the encoder is characterized by an input/output transformation that substantially mimics the input/output transformation of one or more retinal cells of a vertebrate retina; and applying a first machine vision algorithm to data generated based at least in part on the encoded data.
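A common way to approximate a retinal cell's input/output transformation is a linear-nonlinear (LN) model: a spatial receptive-field filter, a temporal filter, then a rectifying nonlinearity producing a firing-rate-like code. The sketch below uses that generic LN structure as a stand-in for the encoder described above; the kernel, temporal weights, and frames are all toy values, not the patented encoder.

```python
import numpy as np

def retinal_encode(frames, kernel, temporal):
    """Linear-nonlinear sketch of a retinal-cell transform: spatial
    filtering of each frame, temporal weighting of recent frames
    (most recent last), then a rectifying output nonlinearity."""
    spatial = np.array([np.sum(f * kernel) for f in frames])
    drive = np.dot(spatial[-len(temporal):], temporal)
    return max(drive, 0.0)  # firing-rate-like, non-negative output

# Toy 3x3 center-surround receptive field and a short temporal filter.
kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]], dtype=float)
temporal = np.array([0.2, 0.3, 0.5])

# Three raw frames: blank, blank, then a diagonal stimulus appears.
frames = [np.zeros((3, 3)), np.zeros((3, 3)), np.eye(3)]
rate = retinal_encode(frames, kernel, temporal)
```

The resulting rate code, rather than the raw pixels, would then be what a downstream machine vision algorithm consumes.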
Method and apparatus for encoding and decoding an omnidirectional video
A method for decoding a large field of view video is disclosed. At least one picture of said large field of view video is represented as a 3D surface projected onto at least one 2D picture using a projection function. The method comprises, for at least one current block of said 2D picture: determining whether an absolute value of at least one component of a motion vector (dV) associated with another block of said 2D picture satisfies a condition; transforming, based on said determining, said motion vector (dV) into a current motion vector (dP) associated with said current block responsive to said projection function; and decoding said current block using said current motion vector (dP).
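The condition-then-transform step above can be sketched with one concrete (assumed) case: in an equirectangular projection the picture wraps horizontally, so a neighbour's motion vector whose horizontal component exceeds half the picture width is rewritten to take the short way around the seam. The wrap rule, threshold, and picture size here are illustrative assumptions, not the patent's projection function.

```python
import math

def transform_mv(dv, width, threshold):
    """If a component of the neighbour's motion vector dV exceeds the
    threshold, transform it through the (assumed equirectangular)
    projection by wrapping horizontally; otherwise reuse dV as dP."""
    dx, dy = dv
    if abs(dx) > threshold:
        dx = dx - math.copysign(width, dx)  # take the short way around
    return (dx, dy)

width = 3840                 # assumed 2D picture width for a 360-degree pan
dv = (3700, 4)               # neighbour block's vector, nearly a full wrap
dp = transform_mv(dv, width=width, threshold=width // 2)
```

A vector of +3700 pixels becomes -140 pixels: the same motion on the 3D surface, expressed without crossing almost the entire 2D picture.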
Image encoding/decoding method and device
An image encoding/decoding method of the present invention constructs a merge candidate list for a current block, derives motion information of the current block on the basis of the merge candidate list and a merge candidate index, and performs inter prediction on the current block on the basis of the derived motion information. The merge candidate list improves encoding/decoding efficiency by adaptively determining a plurality of merge candidates on the basis of the position or size of a merge estimation region (MER) to which the current block belongs.
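One way an MER constrains merge candidates (as in HEVC-style parallel merge estimation) is that neighbours inside the current block's MER are excluded, since their motion data is derived in parallel and not yet available. The sketch below shows that filtering; the MER size, block positions, and candidate limit are illustrative assumptions.

```python
def same_mer(pos_a, pos_b, mer_size):
    """True when two block positions (x, y) fall in the same
    mer_size x mer_size merge estimation region."""
    return (pos_a[0] // mer_size == pos_b[0] // mer_size and
            pos_a[1] // mer_size == pos_b[1] // mer_size)

def build_merge_list(current_pos, neighbours, mer_size, max_cands=5):
    """Keep only neighbour candidates outside the current block's MER;
    the candidate count thus adapts to the block's position in the MER."""
    cands = [n for n in neighbours if not same_mer(current_pos, n["pos"], mer_size)]
    return cands[:max_cands]

current = (70, 70)  # current block, inside MER (1, 1) for 64x64 MERs
neighbours = [
    {"pos": (66, 70), "mv": (1, 0)},  # same MER -> excluded
    {"pos": (70, 60), "mv": (2, 1)},  # MER above -> kept
    {"pos": (60, 70), "mv": (0, 3)},  # MER to the left -> kept
]
merge_list = build_merge_list(current, neighbours, mer_size=64)
```

Only candidates from outside the MER survive, so all blocks within one MER can build their merge lists in parallel.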