Patent classifications
H04N19/103
IMAGE ENCODING/DECODING METHOD AND DEVICE FOR DETERMINING DIVISION MODE ON BASIS OF COLOR FORMAT, AND METHOD FOR TRANSMITTING BITSTREAM
An image encoding/decoding method and apparatus are provided. An image decoding method performed by an image decoding apparatus may comprise determining prediction mode characteristic information based on a color format of a current block, determining a prediction mode type of a lower-layer block split from the current block based on the prediction mode characteristic information, obtaining the lower-layer block by splitting the current block based on the prediction mode type of the lower-layer block, and decoding the lower-layer block based on the prediction mode type of the lower-layer block. The prediction mode type of the lower-layer block comprises a first prediction mode type specifying that both an intra prediction mode and an inter prediction mode are available, a second prediction mode type specifying that only the intra prediction mode is available, and a third prediction mode type specifying that only the inter prediction mode is available. Based on a first condition for the current block being satisfied, the prediction mode characteristic information has a first value, and the first condition comprises a case where a color format of the current block is a monochrome format or a 4:4:4 format. Based on the first condition for the current block not being satisfied, the prediction mode characteristic information has a second value or a third value based on at least one of the color format, split mode, or size of the current block.
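The first-condition check in this abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed method: the names `ColorFormat` and `prediction_mode_characteristic`, the numeric values 0/1/2, and in particular the area-based rule that separates the second value from the third are all assumptions made here, since the abstract only says that the choice depends on at least one of color format, split mode, or size.

```python
from enum import Enum


class ColorFormat(Enum):
    MONOCHROME = "4:0:0"
    YUV420 = "4:2:0"
    YUV422 = "4:2:2"
    YUV444 = "4:4:4"


# Illustrative encodings of the three values named in the abstract.
FIRST_VALUE, SECOND_VALUE, THIRD_VALUE = 0, 1, 2


def prediction_mode_characteristic(color_format, width, height, small_area=64):
    """Return the prediction mode characteristic value for a block."""
    # First condition (from the abstract): monochrome or 4:4:4 format.
    if color_format in (ColorFormat.MONOCHROME, ColorFormat.YUV444):
        return FIRST_VALUE
    # HYPOTHETICAL rule: the abstract does not give the exact criterion,
    # so a simple area threshold stands in for "split mode or size".
    if width * height <= small_area:
        return SECOND_VALUE
    return THIRD_VALUE


print(prediction_mode_characteristic(ColorFormat.YUV444, 8, 8))
print(prediction_mode_characteristic(ColorFormat.YUV420, 8, 8))
print(prediction_mode_characteristic(ColorFormat.YUV420, 16, 16))
```

Only the first branch follows directly from the abstract; the threshold `small_area` is a placeholder for whatever split-mode and size test an actual codec would apply.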
APPARATUS, SYSTEM, METHOD, STORAGE MEDIUM, AND FILE FORMAT
An apparatus acquires data representing a material appearance of a surface of an object, selects, based on the data, one of a coding method for coding by providing a scalability of a bit plane and a coding method for coding by providing a scalability of resolution, and outputs the data encoded by the selected coding method.
Video encoding mode selection by a hierarchy of machine learning models
Techniques for training and using machine learning models for video encoding mode selection are described. According to some embodiments, a computer-implemented method includes receiving a live video at a content delivery service, extracting one or more features for a plurality of macroblocks of a frame of the live video, determining an encoding mode from a plurality of encoding modes for each of the plurality of macroblocks of the frame with a machine learning model based at least in part on an input of the one or more features, performing a real time encode of the frame of the live video based at least in part on the determined encoding modes to generate an encoded frame by the content delivery service, and transmitting the encoded frame from the content delivery service to a viewer device.
Architecture to adapt cumulative distribution functions for mode decision in video encoding
A mode decision component is configured to determine the costs of different modes for a selected partition of a frame of a video using estimated compression coding data, which is calculated before the corresponding actual compression coding data is calculated based on another partition immediately prior to the selected partition in a partition processing order. The estimated compression coding data is determined from previously calculated compression coding data derived from a completed partition prior to the selected partition in the partition processing order. The mode decision component is configured to use the determined costs to select one of the modes. An encoder component is configured to use the selected mode to encode the selected partition by using the corresponding actual compression coding data calculated based on the other partition immediately prior to the selected partition in the partition processing order.
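A cost-based mode decision of this kind might look like the sketch below. It assumes a conventional rate-distortion cost J = D + lambda * R, with the rate taken from symbol probabilities that were estimated from an earlier, already completed partition. The cost formula, the function names, and the data layout are assumptions for illustration and are not taken from the abstract.

```python
import math


def rate_bits(symbol, probabilities):
    """Ideal code length in bits of a symbol under a probability model."""
    return -math.log2(probabilities[symbol])


def select_mode(candidates, estimated_probs, lam):
    """Pick the mode with the lowest cost D + lam * R.

    candidates maps a mode name to (distortion, symbol). The rate uses
    ESTIMATED probabilities, standing in for coding data that is available
    before the immediately preceding partition has finished.
    """
    def cost(mode):
        distortion, symbol = candidates[mode]
        return distortion + lam * rate_bits(symbol, estimated_probs)

    return min(candidates, key=cost)


# Hypothetical numbers: two modes, each signalled by one symbol.
candidates = {"intra": (100.0, "a"), "inter": (90.0, "b")}
estimated_probs = {"a": 0.5, "b": 0.125}  # 1 bit and 3 bits respectively

print(select_mode(candidates, estimated_probs, lam=10.0))
print(select_mode(candidates, estimated_probs, lam=1.0))
```

The sketch covers only the decision step. In the arrangement the abstract describes, the subsequent encode of the selected partition would use the actual coding data rather than the estimate.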
Method and apparatus for point cloud coding
An apparatus for point cloud decoding includes processing circuitry. The processing circuitry receives, from a coded bitstream for a point cloud, encoded occupancy codes for nodes in an octree structure for the point cloud. The nodes in the octree structure correspond to three dimensional (3D) partitions of a space of the point cloud. Sizes of the nodes are associated with sizes of the corresponding 3D partitions. Further, the processing circuitry decodes, from the encoded occupancy codes, occupancy codes for the nodes. At least a first occupancy code for a child node of a first node is decoded without waiting for a decoding of a second occupancy code for a second node having a same node size as the first node. Then, the processing circuitry reconstructs the octree structure based on the decoded occupancy codes for the nodes, and reconstructs the point cloud based on the octree structure.
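Rebuilding an octree from occupancy codes can be sketched as below. Each code is assumed to be an 8-bit mask in which bit i marks child octant i as occupied, and the codes are assumed to arrive in plain breadth-first order. Both the bit layout and the traversal order are assumptions made for this illustration; the sketch does not model the relaxed decoding order that the abstract describes, where a child's code can be decoded without waiting for other nodes of the same size.

```python
from collections import deque


def decode_octree(occupancy_codes, depth, root_size):
    """Return origins (x, y, z) of occupied leaf cells, breadth-first.

    occupancy_codes: iterable of 8-bit ints, one per internal node.
    depth: number of subdivision levels below the root.
    root_size: edge length of the root cube (a power of two).
    """
    codes = iter(occupancy_codes)
    queue = deque([(0, 0, 0, root_size, 0)])
    leaves = []
    while queue:
        x, y, z, size, level = queue.popleft()
        if level == depth:
            leaves.append((x, y, z))
            continue
        code = next(codes)
        half = size // 2
        for child in range(8):
            if code & (1 << child):
                # Assumed layout: bit 2 selects x, bit 1 selects y, bit 0 selects z.
                dx, dy, dz = (child >> 2) & 1, (child >> 1) & 1, child & 1
                queue.append((x + dx * half, y + dy * half, z + dz * half,
                              half, level + 1))
    return leaves


# One level: children 0 and 7 of a 2x2x2 cube are occupied.
print(decode_octree([0b10000001], depth=1, root_size=2))
```

Each occupied leaf origin corresponds to one reconstructed point position at the finest resolution of the tree.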
Method and system for picture segmentation using columns
Described is picture segmentation through columns and slices in video encoding and decoding. A video picture is divided into a plurality of columns, each column covering only a part of the video picture in a horizontal dimension. All coded tree blocks (“CTBs”) belonging to a slice may belong to one or more columns. The columns may be used to break the same or different prediction or in-loop filtering mechanisms of the video coding, and the CTB scan order used for encoding and/or decoding may be local to a column. Column widths may be indicated in a parameter set and/or may be adjusted at the slice level. At the decoder, column width may be parsed from the bitstream, and slice decoding may occur in one or more columns.
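A CTB scan order that is local to each column can be illustrated with the sketch below. It assumes that columns span the full picture height, that column widths are given in CTB units from left to right, and that CTBs are identified by their raster-scan address in the picture. The function name and these conventions are assumptions made for the example.

```python
def column_local_scan(pic_width_ctbs, pic_height_ctbs, column_widths):
    """Return raster-scan CTB addresses in column-local scan order.

    Within each column, CTBs are visited row by row; columns are visited
    left to right.
    """
    if sum(column_widths) != pic_width_ctbs:
        raise ValueError("column widths must add up to the picture width")
    order = []
    x0 = 0  # leftmost CTB column index of the current column
    for width in column_widths:
        for y in range(pic_height_ctbs):
            for x in range(x0, x0 + width):
                order.append(y * pic_width_ctbs + x)
        x0 += width
    return order


# A picture 4 CTBs wide and 2 CTBs high, split into columns of width 1 and 3.
print(column_local_scan(4, 2, [1, 3]))  # [0, 4, 1, 2, 3, 5, 6, 7]
```

With a single column covering the whole width, the result reduces to the ordinary raster scan.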