Patent classifications
H04N19/40
Processing of motion information in multidimensional signals through motion zones and auxiliary information through auxiliary zones
Computer processor hardware receives zone information specifying multiple elements of a rendition of a signal belonging to a zone. The computer processor hardware also receives motion information associated with the zone. The motion information can be encoded to indicate to which corresponding element in a reference signal each of the multiple elements in the zone pertains. For each respective element in the zone as specified by the zone information, the computer processor hardware utilizes the motion information to derive a corresponding location value in the reference signal; the corresponding location value indicates a location in the reference signal to which the respective element pertains.
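A minimal sketch of how per-zone motion information might be applied is shown below; the coordinate layout, the affine parameters `A` and `t`, and the helper name `derive_reference_locations` are illustrative assumptions, not the encoding actually claimed.

```python
import numpy as np

def derive_reference_locations(zone_coords, motion):
    """For each element (row, col) of the zone, derive the location in the
    reference signal it pertains to. Motion is modeled here as one affine
    transform shared by the whole zone: a 2x2 linear part A and a translation t."""
    A = np.asarray(motion["A"], dtype=float)
    t = np.asarray(motion["t"], dtype=float)
    coords = np.asarray(zone_coords, dtype=float)   # shape (N, 2)
    return coords @ A.T + t                         # fractional locations, shape (N, 2)

# Hypothetical zone of four elements displaced by (+3.5, -1.25) with no warping.
zone = [(10, 10), (10, 11), (11, 10), (11, 11)]
motion = {"A": [[1, 0], [0, 1]], "t": [3.5, -1.25]}
print(derive_reference_locations(zone, motion))
```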
Source color volume information messaging
Methods are described to communicate source color volume information in a coded bitstream using SEI messaging. Such data include at least the minimum, maximum, and average luminance values in the source data plus optional data that may include the color volume x and y chromaticity coordinates for the input color primaries (e.g., red, green, and blue) of the source data, and the color x and y chromaticity coordinates for the color primaries corresponding to the minimum, average, and maximum luminance values in the source data. Messaging data signaling an active region in each picture may also be included.
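As a rough illustration only (not the actual SEI syntax of the described messaging), the payload could be modeled as a small container such as the following; every field name is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class SourceColorVolumeInfo:
    # Required luminance statistics of the source, in cd/m^2.
    min_luminance: float
    avg_luminance: float
    max_luminance: float
    # Optional CIE x,y chromaticities of the input color primaries (R, G, B).
    primaries_xy: Optional[List[Tuple[float, float]]] = None
    # Optional x,y chromaticities of the colors at the min/avg/max luminance.
    min_avg_max_xy: Optional[List[Tuple[float, float]]] = None
    # Optional active region of each picture: (left, top, right, bottom).
    active_region: Optional[Tuple[int, int, int, int]] = None

# Example payload for an HDR source with BT.2020-like primaries.
info = SourceColorVolumeInfo(
    min_luminance=0.005, avg_luminance=120.0, max_luminance=4000.0,
    primaries_xy=[(0.708, 0.292), (0.170, 0.797), (0.131, 0.046)],
    active_region=(0, 140, 3840, 2020),
)
```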
OPTIMAL FORMAT SELECTION FOR VIDEO PLAYERS BASED ON PREDICTED VISUAL QUALITY USING MACHINE LEARNING
A system and methods are disclosed for optimal format selection for video players based on visual quality. The method includes generating a plurality of reference transcoded versions of a reference video, obtaining quality scores for frames of the plurality of reference transcoded versions of the reference video, generating a first training input comprising a set of color attributes, spatial attributes, and temporal attributes of the frames of the reference video, and generating a first target output for the first training input, wherein the first target output comprises the quality scores for the frames of the plurality of reference transcoded versions of the reference video. The method further includes providing the resulting training data (the training input and its target output) to train a machine learning model on (i) a set of training inputs comprising the first training input and (ii) a set of target outputs comprising the first target output.
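A hedged sketch of the training-data construction described above might look like the following, with scikit-learn used as a stand-in learner; the feature extraction, the placeholder quality scores, and the function names are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def frame_features(current, previous):
    """Toy color, spatial, and temporal attributes for one frame
    (grayscale ndarrays for the current and previous frames)."""
    cur = current.astype(float)
    prev = previous.astype(float)
    color = [cur.mean(), cur.std()]                 # intensity statistics
    gy, gx = np.gradient(cur)
    spatial = [np.hypot(gx, gy).mean()]             # spatial activity
    temporal = [np.abs(cur - prev).mean()]          # frame-difference energy
    return color + spatial + temporal

# Hypothetical training set: one feature vector per reference frame and the
# measured quality score of the corresponding frame in a transcoded version.
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, (64, 64)) for _ in range(33)]
X = np.array([frame_features(frames[i], frames[i - 1]) for i in range(1, 33)])
y = rng.uniform(30.0, 50.0, size=len(X))            # placeholder quality scores

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
predicted_quality = model.predict(X[:4])            # predict scores for new frames
```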
METHOD AND APPARATUS FOR DECODING VIDEO, AND METHOD AND APPARATUS FOR ENCODING VIDEO
Provided are a video decoding method and apparatus that, when the merge candidate list of a current block is configured during video encoding and decoding, determine whether the number of merge candidates included in the merge candidate list is greater than 1 and smaller than a predetermined maximum number of merge candidates; when it is, determine an additional merge candidate by using a first merge candidate and a second merge candidate from the merge candidate list of the current block; configure the merge candidate list by adding the determined additional merge candidate to it; and perform prediction on the current block based on the merge candidate list.
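A simplified sketch of this kind of list extension follows; representing each merge candidate as a bare motion vector and deriving the additional candidate as the average of the first two are illustrative assumptions, not the claimed derivation rule.

```python
def extend_merge_candidate_list(candidates, max_candidates):
    """If the list holds more than one candidate but fewer than the allowed
    maximum, derive one additional candidate from the first and second
    candidates (here: the average of their motion vectors) and append it."""
    if 1 < len(candidates) < max_candidates:
        (x0, y0), (x1, y1) = candidates[0], candidates[1]
        candidates = candidates + [((x0 + x1) / 2, (y0 + y1) / 2)]
    return candidates

merge_list = [(4, -2), (6, 0)]     # hypothetical motion vectors of two candidates
print(extend_merge_candidate_list(merge_list, max_candidates=5))
# -> [(4, -2), (6, 0), (5.0, -1.0)]
```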
Constraints On Reference Picture Lists
Methods and apparatus for processing of video are described. The processing may include video encoding, decoding, or transcoding. One example video processing method includes performing a conversion between a video including one or more pictures including one or more subpictures and a bitstream of the video. The bitstream conforms to a format rule that specifies that a subpicture cannot be a random access type of subpicture in response to the subpicture not being a leading subpicture of an intra random access point subpicture. The leading subpicture precedes the intra random access point subpicture in output order.
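In spirit, the stated format rule amounts to a conformance predicate like the one sketched below; the dictionary keys are hypothetical names for the relevant subpicture properties.

```python
def conforms_to_subpicture_rule(subpic):
    """Format rule: a subpicture may be a random-access type only when it is
    a leading subpicture of an intra random access point (IRAP) subpicture,
    i.e. it precedes that IRAP subpicture in output order."""
    if subpic["is_random_access_type"]:
        return subpic["is_leading_subpicture_of_irap"]
    return True

# Hypothetical subpicture descriptors from a decoded bitstream.
print(conforms_to_subpicture_rule(
    {"is_random_access_type": True, "is_leading_subpicture_of_irap": False}))   # False
print(conforms_to_subpicture_rule(
    {"is_random_access_type": False, "is_leading_subpicture_of_irap": False}))  # True
```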
APPARATUS, METHOD, AND COMPUTER READABLE MEDIUM
Provided is an apparatus including: an image acquisition unit configured to acquire a captured image; a compression unit configured to compress the captured image to generate a compressed image; an evaluation acquisition unit configured to acquire, from a user, an evaluation according to the visibility of the compressed image; and a learning processing unit configured to perform, in response to input of a new captured image, learning processing of a model that outputs a compression parameter value to be applied in compressing the captured image, the learning using learning data that includes the evaluation, the captured image corresponding to the compressed image targeted for the evaluation, and the compression parameter value applied in generating that compressed image.
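One possible sketch of the described learning loop is given below; the image features, the acceptability threshold, and the linear model are placeholder assumptions rather than the apparatus's actual learning processing.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

class CompressionParameterLearner:
    """Stores (captured image, applied compression parameter, user evaluation)
    triples and suggests a parameter for a new image from the samples the
    user rated as acceptable."""

    def __init__(self):
        self.samples = []   # list of (features, parameter, evaluation)

    @staticmethod
    def features(image):
        img = np.asarray(image, dtype=float)
        return [img.mean(), img.std()]      # toy image descriptors

    def add(self, image, parameter, evaluation):
        self.samples.append((self.features(image), parameter, evaluation))

    def suggest(self, new_image, min_evaluation=3):
        # Assumes at least one stored sample meets the acceptability threshold.
        good = [(f, p) for f, p, e in self.samples if e >= min_evaluation]
        X = np.array([f for f, _ in good])
        y = np.array([p for _, p in good])
        model = LinearRegression().fit(X, y)
        return float(model.predict([self.features(new_image)])[0])

# Hypothetical ratings on a 1-5 scale for four previously compressed images.
rng = np.random.default_rng(0)
learner = CompressionParameterLearner()
for parameter, rating in [(20, 5), (35, 4), (50, 2), (28, 3)]:
    learner.add(rng.integers(0, 256, (32, 32)), parameter, rating)
print(learner.suggest(rng.integers(0, 256, (32, 32))))
```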
Perceptual luminance nonlinearity-based image data exchange across different display capabilities
A handheld imaging device has a data receiver that is configured to receive reference encoded image data. The data includes reference code values, which are encoded by an external coding system. The reference code values represent reference gray levels, which are selected using a reference grayscale display function based on the perceptual non-linearity of human vision adapted at different light levels to spatial frequencies. The imaging device also has a data converter that is configured to access a code mapping between the reference code values and device-specific code values of the imaging device. The device-specific code values are configured to produce gray levels that are specific to the imaging device. Based on the code mapping, the data converter is configured to transcode the reference encoded image data into device-specific image data, which is encoded with the device-specific code values.
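A sketch of such a code mapping and transcoding step is shown below, assuming for illustration that the reference grayscale display function is the SMPTE ST 2084 (PQ) curve and that the target device is modeled as a simple gamma display with a 500 cd/m^2 peak; the helper names and parameters are hypothetical.

```python
import numpy as np

def pq_eotf(code, bit_depth=10):
    """SMPTE ST 2084 (PQ) EOTF: map a reference code value to absolute
    luminance in cd/m^2 (used here as the reference grayscale display function)."""
    m1, m2 = 2610 / 16384, 2523 / 4096 * 128
    c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
    e = np.clip(code / (2 ** bit_depth - 1), 0.0, 1.0) ** (1 / m2)
    return 10000.0 * (np.maximum(e - c1, 0.0) / (c2 - c3 * e)) ** (1 / m1)

def build_code_mapping(device_peak=500.0, gamma=2.4, bit_depth=10):
    """Lookup table from reference code values to device-specific code values
    for a display modeled as a simple gamma curve with the given peak."""
    ref_codes = np.arange(2 ** bit_depth)
    luminance = np.minimum(pq_eotf(ref_codes, bit_depth), device_peak)
    device_codes = np.round((luminance / device_peak) ** (1 / gamma)
                            * (2 ** bit_depth - 1))
    return device_codes.astype(np.uint16)

def transcode(reference_image, mapping):
    """Transcode reference-encoded image data into device-specific image data."""
    return mapping[reference_image]

mapping = build_code_mapping()
device_image = transcode(np.array([[0, 512, 1023]], dtype=np.uint16), mapping)
```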