Patent classifications
H04N19/48
Scalable video coding using inter-layer prediction of spatial intra prediction parameters
The coding efficiency of scalable video coding is increased by substituting missing spatial intra prediction parameter candidates in a spatial neighborhood of a current block of the enhancement layer with intra prediction parameters of a co-located block of the base layer signal. This improves the prediction quality of the set of intra prediction parameter candidates of the enhancement layer: appropriate predictors for the intra prediction parameter of an intra-predicted enhancement layer block are more likely to be available, so that parameter can be signaled, on average, with fewer bits.
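The substitution described above can be sketched as follows. This is a minimal illustration, not the patented method: the function name, the integer mode numbers, and the `default_mode` fallback are all assumptions made for the example.

```python
from typing import List, Optional


def build_intra_mode_candidates(
    neighbor_modes: List[Optional[int]],
    base_layer_mode: Optional[int],
    default_mode: int = 0,
) -> List[int]:
    """Fill missing spatial candidates with the co-located base-layer mode.

    neighbor_modes holds one entry per spatial neighbor of the current
    enhancement-layer block; None marks a neighbor with no usable intra
    prediction parameter (for example, not intra coded or outside the picture).
    """
    candidates = []
    for mode in neighbor_modes:
        if mode is not None:
            candidates.append(mode)             # neighbor supplies a candidate
        elif base_layer_mode is not None:
            candidates.append(base_layer_mode)  # substitute from the base layer
        else:
            candidates.append(default_mode)     # hypothetical fallback, not from the source
    return candidates
```

With more candidates populated, the actual mode of the block is more likely to match one of them, which is what allows it to be signaled with a short index.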
Cross-Component Transform Coefficient Level Reconstruction
This disclosure relates to cross-component methods for refining decoded transform coefficients before or after dequantization in video decoding. For example, a method for video decoding is disclosed. The method may include extracting a first transform coefficient of a first color component from a bitstream of a coded video; extracting a second transform coefficient of a second color component from the bitstream of the coded video; deriving an offset value based on a magnitude or sign value of the first transform coefficient; adding the offset value to a magnitude of the second transform coefficient to generate a modified second transform coefficient for the second color component; and reconstructing the coded video based on at least the first transform coefficient of the first color component and the modified second transform coefficient of the second color component.
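A minimal sketch of the refinement step, under stated assumptions: the abstract does not say how the offset is derived, so the sign-based rule in `derive_offset` is purely hypothetical, as is clamping the modified magnitude at zero.

```python
def derive_offset(first_coeff: int) -> int:
    """Hypothetical rule: the offset follows the sign of the first coefficient."""
    if first_coeff > 0:
        return 1
    if first_coeff < 0:
        return -1
    return 0


def refine_second_coefficient(first_coeff: int, second_coeff: int) -> int:
    """Add the derived offset to the magnitude of the second coefficient."""
    offset = derive_offset(first_coeff)
    # Clamp at zero so the magnitude never goes negative (an assumption).
    magnitude = max(abs(second_coeff) + offset, 0)
    sign = -1 if second_coeff < 0 else 1
    return sign * magnitude
```

For example, `refine_second_coefficient(3, -4)` raises the magnitude of the second coefficient from 4 to 5 and keeps its negative sign.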
Inference Processing of Data
A method is disclosed for processing data in a system configured to operate in either of at least a first power mode and a second power mode. The first power mode is associated with a first power level and the second power mode is associated with a second power level, the second power level being higher than the first power level, and the first and second power modes are each configured to prepare a respective model for inference processing. The method comprises acquiring (101) compressed data and determining (102) whether the system operates in the first power mode or in the second power mode. The method further comprises, when the system operates in the first power mode, determining (103) whether the acquired compressed data comprises a self-contained frame, and if so partly decoding (104) the self-contained frame, performing (105) feature extraction of the decoded self-contained frame, preparing (107) the model for inference processing in the first power mode in the system, wherein the model comprises inference parameters for the first power mode, and performing (108) inference processing by a neural network based on the extracted features and the prepared model for inference processing. A corresponding computer program product, apparatus, and system are also disclosed.
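The control flow of steps 102 to 108 in the first power mode might look like the sketch below. Everything concrete here is an assumption: the `Packet` type, the `PowerMode` names, and the three injected callables stand in for the decoder, feature extractor, and model preparation, none of which the abstract specifies. The abstract also does not say what happens in the second power mode or for frames that are not self-contained, so the sketch simply returns `None` in those cases.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Optional


class PowerMode(Enum):
    LOW = 1   # first power mode (lower power level)
    HIGH = 2  # second power mode (higher power level)


@dataclass
class Packet:
    payload: bytes
    is_self_contained: bool  # e.g. a frame decodable without reference frames


def process_low_power(
    packet: Packet,
    mode: PowerMode,
    partly_decode: Callable[[bytes], Any],
    extract_features: Callable[[Any], Any],
    prepare_model: Callable[[PowerMode], Callable[[Any], Any]],
) -> Optional[Any]:
    if mode is not PowerMode.LOW:          # step 102: which power mode?
        return None                        # second-mode path is out of scope here
    if not packet.is_self_contained:       # step 103: self-contained frame?
        return None
    frame = partly_decode(packet.payload)  # step 104: partial decode
    features = extract_features(frame)     # step 105: feature extraction
    model = prepare_model(mode)            # step 107: model with low-power parameters
    return model(features)                 # step 108: inference
```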
Method and apparatus for coding image on basis of transform
An image decoding method according to the present document may comprise the steps of: deriving a first variable indicating whether there is a valid coefficient in a region excluding a DC region from a current block; deriving a second variable indicating whether there is a valid coefficient in a second region excluding a first region formed at the upper left end of the current block; when the first variable indicates that the valid coefficient exists in the region excluding the DC region, and the second variable indicates that the valid coefficient does not exist in the second region, parsing an LFNST index from the bitstream; and applying an LFNST matrix derived on the basis of the LFNST index to transform coefficients in the first region, to derive the modified transform coefficients.
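The parsing condition above can be illustrated with a small sketch. It assumes that a "valid coefficient" means a nonzero coefficient, that the first region is a square of side `first_region_size` at the upper left of the block, and that regions are tested by (x, y) position rather than by scan order; all three are simplifications chosen for the example, and the function names are hypothetical.

```python
from typing import List, Tuple


def derive_lfnst_flags(
    block: List[List[int]], first_region_size: int = 4
) -> Tuple[bool, bool]:
    """Return (coefficient outside DC, coefficient in the second region)."""
    outside_dc = False
    in_second_region = False
    for y, row in enumerate(block):
        for x, coeff in enumerate(row):
            if coeff == 0:
                continue
            if (x, y) != (0, 0):
                outside_dc = True          # first variable
            if x >= first_region_size or y >= first_region_size:
                in_second_region = True    # second variable
    return outside_dc, in_second_region


def should_parse_lfnst_index(block: List[List[int]], first_region_size: int = 4) -> bool:
    """Parse the LFNST index only if a coefficient exists outside the DC
    position and none exists outside the upper-left first region."""
    outside_dc, in_second_region = derive_lfnst_flags(block, first_region_size)
    return outside_dc and not in_second_region
```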
Scalable video coding using derivation of subblock subdivision for prediction from base layer
Scalable video coding is rendered more efficient by deriving/selecting a subblock subdivision to be used for enhancement layer prediction, among a set of possible subblock subdivisions of an enhancement layer block, by evaluating the spatial variation of the base layer coding parameters over the base layer signal. By this measure, less signaling overhead, if any, has to be spent on signaling this subblock subdivision within the enhancement layer data stream. The subblock subdivision thus selected may be used in predictively coding/decoding the enhancement layer signal.
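One way to evaluate the spatial variation is sketched below. The selection rule (pick the coarsest candidate subdivision in which every subblock covers identical base layer parameters) is an assumption for illustration; the abstract does not state the criterion. The square parameter grid and the function names are likewise hypothetical.

```python
from typing import List, Sequence


def is_uniform(params: List[List[int]], y0: int, x0: int, size: int) -> bool:
    """True if all base-layer parameters inside the subblock are identical."""
    first = params[y0][x0]
    return all(
        params[y][x] == first
        for y in range(y0, y0 + size)
        for x in range(x0, x0 + size)
    )


def select_subdivision(params: List[List[int]], candidate_sizes: Sequence[int]) -> int:
    """Return the largest subblock size whose subblocks are all uniform.

    params is a square grid of base-layer coding parameters co-located
    with the enhancement-layer block.
    """
    n = len(params)
    for size in sorted(candidate_sizes, reverse=True):
        if size > n or n % size != 0:
            continue
        if all(
            is_uniform(params, y0, x0, size)
            for y0 in range(0, n, size)
            for x0 in range(0, n, size)
        ):
            return size
    return min(candidate_sizes)  # fall back to the finest subdivision
```

Because encoder and decoder can both run this derivation on the already decoded base layer, the chosen subdivision does not need to be transmitted.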
High-level constraints for transform skip blocks in video coding
An example device includes memory and one or more processors implemented in circuitry and communicatively coupled to the memory. The one or more processors are configured to receive a first slice header syntax element for a slice of the video data and determine a first value for the first slice header syntax element, the first value being indicative of whether dependent quantization is enabled. The one or more processors are configured to receive a second slice header syntax element for the slice of the video data and determine a second value for the second slice header syntax element, the second value being indicative of whether sign data hiding is enabled. The one or more processors are configured to determine whether transform skip residual coding is disabled for the slice based on the first value and the second value and decode the slice based on the determinations.
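The abstract does not spell out how the two values combine, so the sketch below uses one plausible rule as an assumption: the disable flag for transform skip residual coding is read from the bitstream only when both dependent quantization and sign data hiding are off, and is otherwise inferred as not disabled. The function name and the `read_flag` callable are hypothetical.

```python
from typing import Callable


def ts_residual_coding_disabled(
    dep_quant_enabled: bool,
    sign_data_hiding_enabled: bool,
    read_flag: Callable[[], bool],
) -> bool:
    """Decide whether transform skip residual coding is disabled for a slice.

    Assumed rule: if either tool is enabled, the flag is not read and
    transform skip residual coding stays enabled (returns False).
    """
    if dep_quant_enabled or sign_data_hiding_enabled:
        return False
    return bool(read_flag())  # flag is present only when both tools are off
```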
Multimedia Distribution System
A multimedia file and methods of generating, distributing and using the multimedia file are described. Multimedia files in accordance with embodiments of the present invention can contain multiple video tracks, multiple audio tracks, multiple subtitle tracks, a complete index that can be used to locate each data chunk in each of these tracks and an abridged index that can enable the location of a subset of the data chunks in each track, data that can be used to generate a menu interface to access the contents of the file and ‘meta data’ concerning the contents of the file. Multimedia files in accordance with several embodiments of the present invention also include references to video tracks, audio tracks, subtitle tracks and ‘meta data’ external to the file. One embodiment of a multimedia file in accordance with the present invention includes a series of encoded video frames, a first index that includes information indicative of the location within the file and characteristics of each encoded video frame and a separate second index that includes information indicative of the location within the file of a subset of the encoded video frames.
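The two-index layout can be sketched as follows. The entry fields, the choice of every `stride`-th frame for the abridged index, and the function names are assumptions for illustration; the abstract only says that the second index covers a subset of the encoded video frames.

```python
import bisect
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class IndexEntry:
    frame_number: int
    offset: int        # byte position of the frame within the file
    size: int          # a frame characteristic kept only in the complete index
    is_keyframe: bool  # another characteristic kept only in the complete index


def build_abridged_index(full_index: List[IndexEntry], stride: int) -> List[Tuple[int, int]]:
    """Keep only (frame_number, offset) for every stride-th frame."""
    return [(e.frame_number, e.offset) for e in full_index[::stride]]


def seek(abridged: List[Tuple[int, int]], target_frame: int) -> Tuple[int, int]:
    """Find the last abridged entry at or before target_frame."""
    frame_numbers = [frame for frame, _ in abridged]
    i = bisect.bisect_right(frame_numbers, target_frame) - 1
    return abridged[max(i, 0)]
```

A player could load the small abridged index first to start seeking quickly, then consult the complete index when it needs per-frame detail.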
Shot-Change Detection Using Container Level Information
The disclosed computer-implemented method may include, for a current frame of a sequence of video frames, determining a frame type label of the current frame. The method may include, in response to determining that the current frame is labeled as an intra frame (I-frame), decoding the current frame and comparing the decoded frame to historical I-frame data. The method may also include, in response to the comparison satisfying a shot-change threshold, flagging the current frame as a shot-change frame, and in response to flagging the current frame as the shot-change frame, storing the current frame for a subsequent shot-change detection. The method may further include updating, based on flagged shot-change frames, shot boundaries for the sequence of video frames. Various other methods, systems, and computer-readable media are also disclosed.
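The loop described above can be sketched as follows. Several details are assumptions: the `Frame` type, the mean absolute difference used as the comparison, treating the first I-frame as a shot start, and the `pixels` field standing in for actual decoding, which a real system would delegate to a video decoder.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Frame:
    index: int
    frame_type: str        # label taken from the container, e.g. "I", "P", "B"
    pixels: Sequence[int]  # stand-in for the decoded picture


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def detect_shot_changes(frames: List[Frame], threshold: float) -> List[int]:
    """Return indices of frames flagged as shot changes."""
    reference: Optional[Sequence[int]] = None  # historical I-frame data
    boundaries: List[int] = []
    for frame in frames:
        if frame.frame_type != "I":
            continue                   # non-I-frames are never decoded
        decoded = frame.pixels         # decode only I-frames (stand-in)
        if reference is None or mean_abs_diff(decoded, reference) >= threshold:
            boundaries.append(frame.index)  # flag as a shot-change frame
            reference = decoded             # store for subsequent comparisons
    return boundaries
```

Skipping every frame whose container label is not "I" is what keeps the cost low: only a small fraction of the stream is ever decoded.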