H04N19/00

Method and device for processing video signal on basis of inter prediction
11695948 · 2023-07-04

The disclosure describes a method for processing a video signal and an apparatus therefor. Specifically, a method of processing a video signal based on inter prediction comprises: configuring a merge list based on a neighboring block of a current block; adding a history-based merge candidate included in a history-based merge candidate list to the merge list when the number of merge candidates included in the merge list is smaller than a first predetermined number; obtaining a merge index indicating a merge candidate used for inter prediction of the current block within the merge list; and generating a prediction block of the current block based on motion information of the merge candidate indicated by the merge index, wherein the step of adding the history-based merge candidate to the merge list comprises checking whether a second predetermined number of history-based merge candidates within the history-based merge candidate list have the same motion information as the merge candidate included in the merge list.
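
The merge-list step in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the claimed implementation: motion information is reduced to a `(mv_x, mv_y, ref_idx)` tuple, the history list is assumed to hold its newest entry last, and the names `add_history_candidates`, `max_before_history` (the "first predetermined number") and `num_pruned` (the "second predetermined number") are hypothetical.

```python
from typing import List, Tuple

# Simplified motion information: (mv_x, mv_y, ref_idx). Illustrative only.
Motion = Tuple[int, int, int]


def add_history_candidates(
    merge_list: List[Motion],
    history_list: List[Motion],
    max_before_history: int,  # hypothetical name for the "first predetermined number"
    num_pruned: int,          # hypothetical name for the "second predetermined number"
) -> List[Motion]:
    """Append history-based candidates while the merge list is below the limit.

    Only the first `num_pruned` history candidates visited are compared against
    the candidates that were already in the merge list; later ones are appended
    without a redundancy check.
    """
    result = list(merge_list)
    existing = list(merge_list)  # candidates derived from neighboring blocks
    # Assumption: the newest history entry is stored last, so visit in reverse.
    for checked, candidate in enumerate(reversed(history_list)):
        if len(result) >= max_before_history:
            break
        if checked < num_pruned and candidate in existing:
            continue  # same motion information as an existing merge candidate
        result.append(candidate)
    return result


def select_motion(merge_list: List[Motion], merge_index: int) -> Motion:
    """Return the motion information indicated by the merge index."""
    return merge_list[merge_index]


neighbors = [(1, 0, 0), (2, 0, 0)]
history = [(5, 5, 0), (2, 0, 0), (3, 1, 0)]
merged = add_history_candidates(neighbors, history, max_before_history=5, num_pruned=2)
```

Limiting the comparison to a fixed number of history candidates bounds the number of motion-information comparisons per block, which is the trade-off the final clause of the abstract points at.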

Affine linear weighted intra prediction in video coding

A video coder performs a Most-Probable Mode (MPM) derivation process that derives one or more MPMs for a current block that is not coded using affine linear weighted intra prediction (ALWIP). As part of performing the MPM derivation process, the video coder determines whether a neighboring block of the current block is an ALWIP-coded neighboring block. Based on the neighboring block being an ALWIP-coded neighboring block, the video coder determines that a value of an intra prediction mode of the neighboring block is a value indicating a planar mode. The video coder codes the current block based on one of the MPMs for the current block.
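
A minimal sketch of the neighbor-mode substitution described here follows. The mode numbering (`PLANAR = 0`, `DC = 1`), the `Block` type, the treatment of unavailable neighbors, and the very short MPM list construction are all assumptions for illustration; an actual codec's MPM derivation has more rules than this.

```python
from dataclasses import dataclass
from typing import List, Optional

PLANAR = 0  # assumed mode numbering, for illustration
DC = 1


@dataclass
class Block:
    intra_mode: int
    is_alwip: bool = False


def neighbor_mode(neighbor: Optional[Block]) -> int:
    """Mode value contributed by a neighbor to the MPM derivation."""
    if neighbor is None:
        return PLANAR  # assumption: an unavailable neighbor defaults to planar
    if neighbor.is_alwip:
        return PLANAR  # an ALWIP-coded neighbor is treated as planar
    return neighbor.intra_mode


def derive_mpms(left: Optional[Block], above: Optional[Block]) -> List[int]:
    """Build a short, duplicate-free MPM list (simplified construction)."""
    mpms: List[int] = []
    for mode in (neighbor_mode(left), neighbor_mode(above), PLANAR, DC):
        if mode not in mpms:
            mpms.append(mode)
    return mpms


# The left neighbor is ALWIP-coded, so its stored mode value (7) is ignored.
mpms = derive_mpms(Block(intra_mode=7, is_alwip=True), Block(intra_mode=50))
```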

Predictive field-of-view (FOV) and cueing to enforce data capture and transmission compliance in real and near real time video

To prevent the capture and transmission of excluded data, the current pose and motion of a video camera are used to predict a pose and FOV for the video camera over one or more future frames. The predicted pose and predicted FOV are used to generate cues to enforce an alignment condition to an allowed object or to prevent capture of a disallowed object. If the cues fail, an interrupt is generated to prevent capture of disallowed objects in the video signal and possibly to deactivate the video camera. The predicted FOV prevents excluded data from entering the video signal and reaching circuitry or being processed downstream of the video camera. This can be implemented in real or near real time.
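
The prediction-then-cue logic can be sketched in one dimension. This is a toy model under loudly stated assumptions: the pose is reduced to a single heading angle, motion is a constant turn rate, and the rule "interrupt if the disallowed object enters the predicted FOV within `interrupt_within` frames, otherwise cue" is a hypothetical policy standing in for "if the cues fail". None of the function names come from the source.

```python
from typing import List


def predict_headings(heading_deg: float, rate_dps: float, dt_s: float, frames: int) -> List[float]:
    """Extrapolate the camera heading over future frames (constant turn rate assumed)."""
    return [(heading_deg + rate_dps * dt_s * k) % 360.0 for k in range(1, frames + 1)]


def in_fov(heading_deg: float, fov_deg: float, bearing_deg: float) -> bool:
    """True if an object at `bearing_deg` lies inside the FOV centred on the heading."""
    offset = (bearing_deg - heading_deg + 180.0) % 360.0 - 180.0  # signed angle in [-180, 180)
    return abs(offset) <= fov_deg / 2.0


def compliance_action(
    heading_deg: float,
    rate_dps: float,
    dt_s: float,
    frames: int,
    fov_deg: float,
    disallowed_bearing_deg: float,
    interrupt_within: int,  # hypothetical threshold, in frames
) -> str:
    """Return "ok", "cue", or "interrupt" based on the predicted FOV."""
    predicted = predict_headings(heading_deg, rate_dps, dt_s, frames)
    for k, heading in enumerate(predicted, start=1):
        if in_fov(heading, fov_deg, disallowed_bearing_deg):
            # Too close in time for a cue to help: interrupt before capture.
            return "interrupt" if k <= interrupt_within else "cue"
    return "ok"


action = compliance_action(0.0, 10.0, 1.0, 5, 40.0, 50.0, interrupt_within=1)
```

Because the decision is made on the predicted FOV rather than the current one, the cue or interrupt is issued before the disallowed object would appear in a captured frame.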

Method for encoding of a video stream

A temporal sequence of pictures is generated in a method for encoding of a first video stream. To do so, a synchronization signal can be used, which can be derived from a second video stream independently of the first video stream. Alternatively, the encoding of a second video stream independent of the first video stream can be based on the same principle as for the encoding of the first video stream.

Upsampling for signal enhancement coding
11546634 · 2023-01-03

There is disclosed a method of encoding an input signal, the method comprising: receiving a base encoded signal, the base encoded signal being generated by feeding an encoder with a down-sampled version of the input signal; producing a first residual signal by: decoding the base encoded signal to produce a base decoded signal; and using a difference between the base decoded signal and the down-sampled version of the input signal to produce the first residual signal; producing a second residual signal by: correcting the base decoded signal using the first residual signal to create a corrected decoded version; up-sampling the corrected decoded version; and using a difference between the up-sampled corrected decoded version and the input signal to produce the second residual signal; wherein the up-sampling is one of bilinear or bicubic up-sampling. A corresponding decoding method is also disclosed.
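
The two residual layers can be traced with a small pure-Python sketch on a 2-D array of samples. Several parts are assumptions for illustration only: down-sampling is 2x2 averaging, the base encoder and decoder are replaced by a coarse quantiser (`fake_base_codec`), residuals are kept lossless rather than transformed and quantised, and bilinear (not bicubic) up-sampling is used with a half-pixel alignment convention.

```python
from typing import List, Tuple

Image = List[List[float]]


def downsample2(img: Image) -> Image:
    """Halve both dimensions by averaging 2x2 blocks (assumed down-sampler)."""
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(len(img[0]) // 2)] for y in range(len(img) // 2)]


def upsample_bilinear(img: Image, out_h: int, out_w: int) -> Image:
    """Bilinear up-sampling with half-pixel alignment and edge clamping."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for y in range(out_h):
        sy = min(max((y + 0.5) * in_h / out_h - 0.5, 0.0), in_h - 1.0)
        y0 = int(sy)
        y1, fy = min(y0 + 1, in_h - 1), sy - y0
        row = []
        for x in range(out_w):
            sx = min(max((x + 0.5) * in_w / out_w - 0.5, 0.0), in_w - 1.0)
            x0 = int(sx)
            x1, fx = min(x0 + 1, in_w - 1), sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def fake_base_codec(img: Image, step: float = 8.0) -> Image:
    """Stand-in for base encode + decode: coarse quantisation (assumption)."""
    return [[round(v / step) * step for v in row] for row in img]


def sub(a: Image, b: Image) -> Image:
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a: Image, b: Image) -> Image:
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def encode(img: Image) -> Tuple[Image, Image, Image]:
    down = downsample2(img)
    base_decoded = fake_base_codec(down)
    residual1 = sub(down, base_decoded)        # first residual signal
    corrected = add(base_decoded, residual1)   # corrected decoded version
    up = upsample_bilinear(corrected, len(img), len(img[0]))
    residual2 = sub(img, up)                   # second residual signal
    return base_decoded, residual1, residual2


def decode(base_decoded: Image, residual1: Image, residual2: Image) -> Image:
    corrected = add(base_decoded, residual1)
    up = upsample_bilinear(corrected, len(residual2), len(residual2[0]))
    return add(up, residual2)


image = [[10.0, 20.0, 30.0, 40.0], [50.0, 60.0, 70.0, 80.0],
         [90.0, 100.0, 110.0, 120.0], [130.0, 140.0, 150.0, 160.0]]
reconstructed = decode(*encode(image))
```

With lossless residuals the decoder recovers the input up to floating-point rounding; in a practical system the residuals would themselves be compressed, so reconstruction would be approximate.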

Image processing apparatus, image processing method and image processing program

An image processing device, which updates a pixel value of a processing target image to generate a new image, generates a first feature vector based on the processing target image and a first feature map generated with at least one pre-decided filter; updates the processing target image to generate an updated image; generates a second feature vector based on the updated image and a second feature map generated with at least one pre-decided filter; performs quality evaluation of the updated image based on the first and second feature vectors and generates a quality feedback vector, which is a vector based on a result of the quality evaluation; performs an encoding amount evaluation on the updated image and generates an encoding amount feedback vector, which is a vector based on a result of the encoding amount evaluation; and determines an updating amount for updating the updated image based on the quality feedback vector and the encoding amount feedback vector.

Method and apparatus for encoding and transmitting at least a spatial part of a video sequence

When encoding and transmitting video data comprising regions of interest, different usages of the regions of interest imply different kinds of combinations of regions of interest at decoding. By studying how encoding mechanisms that depend on data from other tile sets affect these different kinds of combinations, it is possible to define a plurality of tile set coding dependency levels. Each tile set coding dependency level is linked to a set of constraints on encoding. These sets of constraints have different impacts on the possibilities allowed when combining the different regions of interest. It is therefore possible, based on a desired usage, to select an encoding with minimal restrictions, as defined by a given tile set coding dependency level, compatible with the desired usage. Accordingly, the encoding efficiency is improved, for a given usage, compared to a solution where complete tile independence is used.

Rate-matching scheme for control channels using polar codes

Certain aspects of the present disclosure generally relate to wireless communications and, more particularly, to methods and apparatus for rate-matching control channels using polar codes. An exemplary method generally includes encoding a stream of bits using a polar code, determining a size of a circular buffer for storing the encoded stream of bits based, at least in part, on a minimum supported code rate and a control information size, and performing rate-matching on the stored encoded stream of bits based, at least in part, on a mother code size, N, and a number of coded bits for transmission, E.
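
The buffer-sizing and read-out steps can be sketched as follows. The abstract does not give the sizing rule, so the one used here (hold at most K / R_min coded bits, capped at the mother code size N) is an assumption, as are the function names and the choice to keep the leading bits of the codeword. The polar encoding itself is not shown; `coded_bits` stands for an already-encoded length-N codeword.

```python
import math
from typing import List


def circular_buffer_size(mother_code_size: int, control_info_size: int, min_code_rate: float) -> int:
    """Assumed rule: at most K / R_min coded bits are ever needed, capped at N."""
    return min(mother_code_size, math.ceil(control_info_size / min_code_rate))


def rate_match(coded_bits: List[int], buffer_size: int, num_tx_bits: int) -> List[int]:
    """Read E bits cyclically from a circular buffer holding the first `buffer_size` coded bits.

    If E exceeds the buffer size the read wraps around (repetition); otherwise
    only the first E buffered bits are sent (the remaining bits are dropped).
    """
    buffer = coded_bits[:buffer_size]
    return [buffer[i % buffer_size] for i in range(num_tx_bits)]


# N = 512, K = 32 control bits, minimum supported code rate 1/8.
size = circular_buffer_size(512, 32, 0.125)
```

Sizing the buffer from the minimum supported code rate means coded bits that would never be transmitted at any supported rate do not need to be stored.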

Method for encoding/decoding texture of points of a point cloud

A method and device for point cloud compression are provided. A bitstream is generated that comprises color information data representative of a texture image from which points of the point cloud are colorized when reconstructing the point cloud. The bitstream also comprises an interpolation texture coding mode indicating that texture interpolation has to be done on points of the reconstructed point cloud that are not colorized from the texture image.

Context enabled machine learning
20220400251 · 2022-12-15

Certain aspects of the present disclosure provide techniques for generating context-aware inferences using a machine learning model. The method generally includes receiving a time-series data sequence and a contextual model specifying characteristics of how objects behave in an environment in which the time-series data sequence was captured. A feature data set from the contextual model is extracted using a first machine learning model. Generally, the extracted feature data set comprises a representation of the specified characteristics of how objects behave in the environment. A future state of an object in the environment is predicted using the time-series data sequence and the extracted feature data set representing the specified characteristics of how objects behave in the environment as input into a second machine learning model. One or more actions are taken based on the predicted future state of the object in the environment.