H04N19/00

Image filtering apparatus, image decoding apparatus, and image coding apparatus

A filter is applied to input image data in accordance with an image characteristic. A CNN filter includes a neural network configured to receive an input of one or multiple first type input image data and one or multiple second type input image data, and to output one or multiple first type output image data. Each first type input image data has luminance or chrominance pixel values; each second type input image data has pixel values corresponding to a reference parameter for generating a prediction image and a differential image; and each first type output image data has luminance or chrominance pixel values.

Binarization of dQP using separate absolute value and sign (SAVS) in CABAC
11665348 · 2023-05-30

Video coding systems or apparatus utilizing context-based adaptive binary arithmetic coding (CABAC) during encoding and/or decoding are configured according to the invention with an enhanced binarization of non-zero Delta-QP (dQP). During binarization, the absolute value of dQP and its sign are encoded separately using unary coding and then combined into a binary string which also contains the dQP non-zero flag. This invention capitalizes on the statistical symmetry of positive and negative values of dQP, which saves bits and thus yields higher coding efficiency.
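The separate-absolute-value-and-sign idea can be illustrated with a minimal Python sketch. It is not the patented bit layout: the function names `binarize_dqp` and `debinarize_dqp`, the bin order (non-zero flag, then unary magnitude, then one sign bin), the choice to code |dQP| - 1 in unary, and the sign convention (1 means negative) are all assumptions made for illustration.

```python
def binarize_dqp(dqp: int) -> str:
    """Return a bin string for dQP using an assumed SAVS-style layout."""
    if dqp == 0:
        return "0"  # non-zero flag = 0, nothing else follows
    magnitude = abs(dqp)
    # Unary code for (magnitude - 1): that many 1s, then a terminating 0.
    unary = "1" * (magnitude - 1) + "0"
    sign = "1" if dqp < 0 else "0"  # assumed convention: 1 means negative
    return "1" + unary + sign


def debinarize_dqp(bins: str) -> int:
    """Invert binarize_dqp for a single, complete bin string."""
    if bins[0] == "0":
        return 0
    # The terminating 0 of the unary part sits at index == magnitude.
    magnitude = bins.index("0", 1)
    negative = bins[magnitude + 1] == "1"
    return -magnitude if negative else magnitude


if __name__ == "__main__":
    for value in (0, 1, -1, 3, -3):
        print(value, binarize_dqp(value))
```

Under this layout, +n and -n always differ only in the final bin, which is how the sketch reflects the symmetry between positive and negative dQP values mentioned in the abstract.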

Syntax and semantics for buffering information to simplify video splicing

Innovations in syntax and semantics of coded picture buffer removal delay (“CPBRD”) values potentially simplify splicing operations. For example, a video encoder sets a CPBRD value for a current picture that indicates an increment value relative to a nominal coded picture buffer removal time of a preceding picture in decoding order, regardless of whether the preceding picture has a buffering period SEI message. The encoder can signal the CPBRD value according to a single-value approach in which a flag indicates how to interpret the CPBRD value, according to a two-value approach in which another CPBRD value (having a different interpretation) is also signaled, or according to a two-value approach that uses a flag and a delta value. A corresponding video decoder receives and parses the CPBRD value for the current picture. A splicing tool can perform simple concatenation operations to splice bitstreams using the CPBRD value for the current picture.
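As a rough illustration of why increments relative to the preceding picture make concatenation simple, the sketch below accumulates nominal removal times from per-picture increments. The function name `nominal_removal_times`, the integer clock-tick units, and the example values are assumptions; the flag-based and two-value signaling variants are not modeled.

```python
def nominal_removal_times(first_time, increments, clock_tick):
    """Accumulate nominal CPB removal times, in integer clock units.

    Each entry of `increments` is a hypothetical CPBRD value: the number of
    clock ticks between the preceding picture (in decoding order) and the
    current one.
    """
    times = [first_time]
    for increment in increments:
        times.append(times[-1] + increment * clock_tick)
    return times


if __name__ == "__main__":
    stream_a = [1, 1, 1]   # increments for pictures 2..4 of stream A
    stream_b = [2, 1]      # increments for pictures spliced in after A
    # Splicing reduces to concatenating the increment lists.
    print(nominal_removal_times(0, stream_a + stream_b, clock_tick=3600))
```

Because every value depends only on the immediately preceding picture, the spliced timeline needs no rewriting of values inside either source stream in this simplified model.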

Method and apparatus for prediction and transform for small blocks
11665359 · 2023-05-30

A method of video decoding for a video decoder includes determining whether a block size of a chroma block is less than or equal to a block size threshold. The method further includes, in response to a determination that the block size of the chroma block is greater than the block size threshold, selecting an intra prediction mode for the chroma block from a plurality of intra prediction modes. The method further includes, in response to a determination that the block size of the chroma block is less than or equal to the block size threshold, selecting the intra prediction mode for the chroma block from a subset of the plurality of intra prediction modes. The method further includes performing intra prediction for the chroma block based on a chroma sample obtained with the selected intra prediction mode to decode the chroma block.
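The threshold test described above can be sketched in a few lines of Python. The mode names, the contents of the reduced subset, the default threshold of 16, and the choice to measure block size as width times height are hypothetical; the abstract does not specify any of them.

```python
# Hypothetical mode lists, for illustration only.
ALL_CHROMA_MODES = ["PLANAR", "DC", "HORIZONTAL", "VERTICAL", "DM", "LM"]
SMALL_BLOCK_MODES = ["DC", "DM"]  # assumed subset of ALL_CHROMA_MODES


def candidate_modes(width, height, threshold=16):
    """Return the mode list a chroma block of this size may select from."""
    block_size = width * height  # assumption: size measured in samples
    if block_size <= threshold:
        return SMALL_BLOCK_MODES
    return ALL_CHROMA_MODES


def select_mode(width, height, mode_index, threshold=16):
    """Map a signaled index onto the size-dependent candidate list."""
    return candidate_modes(width, height, threshold)[mode_index]


if __name__ == "__main__":
    print(select_mode(2, 2, 1))   # small block: index into the subset
    print(select_mode(8, 8, 1))   # larger block: index into the full list
```

One consequence visible in the sketch is that a small block indexes into a shorter list, so fewer index values need to be distinguished for it.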

Generating multi-pass-compressed-texture images for fast delivery

The present disclosure relates to systems, methods, and non-transitory computer-readable media to enhance texture image delivery and processing at a client device. For example, the disclosed systems can utilize a server-side compression combination that includes, in sequential order, a first compression pass, a decompression pass, and a second compression pass. By applying this compression combination to a texture image at the server-side, the disclosed systems can leverage both GPU-friendly and network-friendly image formats. For example, the disclosed systems can instruct the client device to execute a combination of decompression-compression passes on a GPU-network-friendly image delivered over a network connection to the client device. In so doing, the client device can generate a tri-pass-compressed-texture from a decompressed image comprising texels with color palettes based on previously reduced color palettes from the first compression pass at the server-side, which reduces computational overhead and increases performance speed.

System and method for processing multi-dimensional and time-overlapping imaging data in real time with cloud computing
11663759 · 2023-05-30

The present embodiments include a system and method for processing multi-dimensional images in real time through the use of third-party servers and cloud computing. The system includes a data acquisition processor, a data storage unit, an administrator processor, and a server. The server can be a cloud-based server. The method includes receiving multi-dimensional imaging data, compressing and blending the image data, transmitting the image data to a server, decompressing and deblending the data, generating multi-dimensional images, and transmitting the imaging data back to the administrator processor.

Image decoding method and apparatus using same

An image decoding method according to the present invention includes: receiving information on a set of reference pictures for configuring a set of reference pictures of a current picture, wherein the information on the set of reference pictures includes most significant bit (MSB) information from which the MSB of the picture order count (POC) of a long-term reference picture relative to the current picture may be calculated, and flag information that indicates whether the MSB information is present; and deriving the set of reference pictures by using the received MSB information when the flag information is 1, and marking the reference picture, wherein the flag information may be 1 when a temporal sub-layer identifier is 0 and, in a set of POCs of a previous picture, there is at least one POC value whose remainder after division by MaxPicOrderCntLsb, the maximum value representable by the LSB, is the same as the least significant bit (LSB) of the POC of the long-term reference picture, the set of POCs including POC values related to the previous picture that may not be discarded without affecting whether other pictures of the same temporal layer may be decoded.
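The condition under which the flag may be 1 can be written as a short check. This is a sketch under assumptions: the function name `msb_condition_met` and its parameters are hypothetical, the set of POCs related to the previous picture is taken as given, and the function only reports whether the condition stated in the abstract holds.

```python
def msb_condition_met(lt_poc_lsb, prev_poc_set, max_pic_order_cnt_lsb, temporal_id):
    """Check the condition from the abstract under which the flag may be 1.

    lt_poc_lsb            -- LSB of the long-term reference picture's POC
    prev_poc_set          -- POC values related to the previous picture
    max_pic_order_cnt_lsb -- MaxPicOrderCntLsb
    temporal_id           -- temporal sub-layer identifier of the current picture
    """
    if temporal_id != 0:
        return False
    # An LSB collision: some POC in the set has the same remainder.
    return any(poc % max_pic_order_cnt_lsb == lt_poc_lsb for poc in prev_poc_set)


if __name__ == "__main__":
    # 19 % 16 == 3, so the LSB alone cannot tell POC 3 and POC 19 apart.
    print(msb_condition_met(3, [19, 40], 16, 0))
```

The example shows the ambiguity the MSB information resolves: when two candidate POCs share the same LSB, the LSB alone does not identify the long-term reference picture.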

Method and device for encoding/decoding the geometry of a point cloud

The present embodiments relate to a method and device. The method comprises obtaining at least one first point from at least one point of a point cloud by projecting said point of the point cloud onto a projection plane and obtaining at least one other point of the point cloud determined according to said at least one first point; determining and encoding at least one interpolation coding mode for said at least one first point based on at least one reconstructed point obtained from said at least one first point and at least one interpolation point defined by said at least one interpolation coding mode to approximate said at least one other point of the point cloud; and signaling said at least one interpolation coding mode as values of image data.

Adaptive coding and streaming of multi-directional video

In communication applications, aggregate source image data at a transmitter exceeds the data that is needed to display a rendering of a viewport at a receiver. Improved streaming techniques include estimating a location of a viewport at a future time. According to such techniques, the viewport may represent a portion of an image from a multi-directional video to be displayed at the future time, and tile(s) of the image may be identified in which the viewport is estimated to be located. In these techniques, the image data of tile(s) in which the viewport is estimated to be located may be requested at a first service tier, and the other tile(s) in which the viewport is not estimated to be located may be requested at a second service tier, lower than the first service tier.
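The tile selection step can be sketched as follows, assuming the estimated viewport is already available as a pixel rectangle on a flat tiled frame. The function names, the "high"/"low" tier labels, and the rectangular model without wrap-around at the image edges are assumptions; the viewport prediction itself is not modeled.

```python
def tiles_for_viewport(viewport, tile_w, tile_h):
    """Return the (row, col) tiles overlapped by a viewport rectangle.

    viewport is (x, y, width, height) in pixels; all names are hypothetical.
    """
    x, y, w, h = viewport
    first_col, last_col = x // tile_w, (x + w - 1) // tile_w
    first_row, last_row = y // tile_h, (y + h - 1) // tile_h
    return {
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    }


def assign_tiers(viewport, rows, cols, tile_w, tile_h):
    """Map every tile of a rows x cols grid to a requested service tier."""
    covered = tiles_for_viewport(viewport, tile_w, tile_h)
    return {
        (row, col): "high" if (row, col) in covered else "low"
        for row in range(rows)
        for col in range(cols)
    }


if __name__ == "__main__":
    tiers = assign_tiers((100, 50, 200, 100), rows=2, cols=4, tile_w=128, tile_h=128)
    for tile, tier in sorted(tiers.items()):
        print(tile, tier)
```

A client following this sketch would request the "high" tiles at the first service tier and the remaining tiles at the lower second tier.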