Patent classifications
H04N19/55
Multi-person pose recognition method and apparatus, electronic device, and storage medium
In a multi-person pose recognition method, a to-be-recognized image is obtained, and a circuitous pyramid network is constructed. The circuitous pyramid network includes parallel phases, and each phase includes downsampling network layers, upsampling network layers, and a first residual connection layer that connects the downsampling and upsampling network layers. The phases are interconnected by a second residual connection layer. The circuitous pyramid network is traversed by extracting a feature map for each phase, and the feature map of the last phase is determined to be the feature map of the to-be-recognized image. Multi-person pose recognition is then performed on the to-be-recognized image according to the feature map to obtain a pose recognition result for the to-be-recognized image.
Method and apparatus of simplified sub-mode for video coding
A method and apparatus of Inter prediction for video coding are disclosed. According to one method, a sub-block motion vector prediction (MVP) mode is turned off for small size coding units (CUs). In another method, if the neighbouring reference block for a current coding unit (CU) is in a root CU region, the neighbouring reference block is not used to derive a Merge candidate or a modified neighbouring reference block on the shared boundary of the root CU is used to derive the Merge candidate for the current block. In yet another method, a shared sub-block Merge candidate list is derived for sub-CUs within a root CU region or an MER (Merge estimation region). If a neighbouring reference block is within the same MER as a current sub-CU, the neighbouring reference block is not used for deriving a candidate for the shared sub-CU Merge list.
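The MER exclusion rule in the last sentence can be illustrated with a minimal sketch. It assumes square MERs laid out on a grid aligned to the picture origin and positions given as (x, y) sample coordinates; the function names and the `mer_size` parameter are hypothetical and do not come from the patent.

```python
def same_mer(cur_xy, nb_xy, mer_size):
    """Return True when both positions fall in the same MER grid cell.

    Assumption: MERs are square, mer_size samples wide, aligned to (0, 0).
    """
    (cx, cy), (nx, ny) = cur_xy, nb_xy
    return (cx // mer_size == nx // mer_size
            and cy // mer_size == ny // mer_size)


def usable_neighbours(cur_xy, neighbours, mer_size):
    """Keep only neighbouring reference positions outside the current MER."""
    return [nb for nb in neighbours if not same_mer(cur_xy, nb, mer_size)]


# Current sub-CU at (20, 20) with 16x16 MERs: the left neighbour at (19, 20)
# shares the MER and is dropped; (15, 20) and (20, 15) lie in other MERs.
candidates = usable_neighbours((20, 20), [(19, 20), (15, 20), (20, 15)], 16)
```

Dropping same-MER neighbours removes the dependency between sub-CUs in one region, which is what would allow them to share a single candidate list.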
Image encoding/decoding method and device, and recording medium in which bitstream is stored
The present invention relates to an image encoding/decoding method and apparatus. According to the present invention, a method of decoding an image comprises: deriving an initial motion vector of a current block; deriving a refined motion vector by using the initial motion vector; and generating a prediction block of the current block by using the refined motion vector.
Method and apparatus for video coding
Aspects of the disclosure provide methods and apparatuses for video encoding/decoding. In some examples, an apparatus for video decoding includes receiving circuitry and processing circuitry. In some embodiments, the processing circuitry decodes prediction information of a current block in a current coding from a coded video bitstream. The prediction information is indicative of an intra block copy mode. Then, the processing circuitry determines a block vector that points to a reference block in a same picture as the current block. The reference block is restricted within a coding region with reconstructed samples buffered in a reference sample memory. The coding region is one of multiple predefined regions of a coding tree unit (CTU). Then, the processing circuitry reconstructs at least a sample of the current block based on the reconstructed samples of the reference block that are retrieved from the reference sample memory.
Video Compression Using Block Vector Predictor Refinement
Encoding and/or decoding a block of a video frame may be based on a previously decoded reference block in the same frame or in a different frame. The reference block may be indicated by a block vector (BV). The BV may be encoded as a difference between a block vector predictor (BVP) and the BV. The BVP may be adjusted to improve prediction accuracy of the BVP.
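The difference coding described here can be sketched in a few lines. The sign convention (difference = BV minus BVP), the tuple representation, and the function names are assumptions for illustration; the abstract does not say how the BVP is adjusted, so the "refined" predictor below is simply a hypothetical value closer to the true BV.

```python
def encode_bvd(bv, bvp):
    """Block vector difference, assuming the convention BVD = BV - BVP."""
    return (bv[0] - bvp[0], bv[1] - bvp[1])


def decode_bv(bvd, bvp):
    """Recover the block vector from the difference and the same predictor."""
    return (bvd[0] + bvp[0], bvd[1] + bvp[1])


bv = (-24, -8)              # true block vector
bvp = (-16, -8)             # initial predictor
refined_bvp = (-23, -8)     # hypothetical adjusted predictor

bvd_initial = encode_bvd(bv, bvp)          # (-8, 0)
bvd_refined = encode_bvd(bv, refined_bvp)  # (-1, 0)
```

A predictor closer to the true BV yields a difference with smaller components, which is generally cheaper to signal. The decoder must derive the same adjusted predictor for `decode_bv` to return the original vector.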
VIDEO ENCODING OPTIMIZATION FOR MACHINE LEARNING CONTENT CATEGORIZATION
Systems, apparatuses, and methods for performing machine learning content categorization leveraging video encoding pre-processing are disclosed. A system includes at least a motion vector unit and a machine learning (ML) engine. The motion vector unit pre-processes a frame to determine if there is temporal locality with previous frames. If the objects of the scene have not changed by a threshold amount, then the ML engine does not process the frame, saving computational resources that would typically be used. Otherwise, if there is a change of scene or other significant changes, then the ML engine is activated to process the frame. The ML engine can then generate a QP map and/or perform content categorization analysis on this frame and a subset of the other frames of the video sequence.
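The gating decision can be sketched as follows. This assumes the amount of change is measured as the mean motion-vector magnitude of a frame; the abstract does not specify the metric, and `should_run_ml`, its parameters, and the threshold value are hypothetical.

```python
import math


def should_run_ml(motion_vectors, threshold):
    """Decide whether the ML engine should process this frame.

    Assumption: change is the mean magnitude of the frame's motion vectors.
    With no motion vectors available, the frame is processed to be safe.
    """
    if not motion_vectors:
        return True
    mean_magnitude = sum(math.hypot(dx, dy) for dx, dy in motion_vectors) / len(motion_vectors)
    return mean_magnitude > threshold


static_frame = [(0, 0), (1, 0), (0, 1)]   # little motion: ML engine skipped
scene_change = [(3, 4), (6, 8)]           # large motion: ML engine runs

skip = not should_run_ml(static_frame, threshold=2.0)
run = should_run_ml(scene_change, threshold=2.0)
```

Because motion vectors are already produced by the encoder's pre-processing, a check of this kind adds little cost compared with running the ML engine on every frame.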
METHOD, APPARATUS, AND COMPUTER PROGRAM PRODUCT FOR GRADUAL DECODING REFRESH FOR VIDEO ENCODING AND DECODING
A method, apparatus and a non-transitory computer readable medium are provided for receiving an input picture divided into a plurality of coding units (CUs) and comprising a virtual boundary between a dirty area and a clean area of the input picture, each CU located within either the clean area or the dirty area. The virtual boundary is treated as a picture boundary for coding units within the clean area and as a non-boundary for coding units within the dirty area. For a current CU, a history-based motion vector prediction (HMVP) table can be prepared that identifies other CUs as HMVP candidates for inter prediction, the HMVP candidates being adjacent to the current CU. The HMVP candidate CUs are limited to CUs previously coded in the clean area. The current CU can be intra coded based at least upon the HMVP candidates from the HMVP table.
INTER-PREDICTION METHOD AND IMAGE DECODING DEVICE
Disclosed are an inter-prediction method and a video decoding device. One embodiment of the present invention provides an inter-prediction method executed in a video decoding device, including deriving a motion vector of a current block based on motion information decoded from a bitstream; acquiring reference samples of a first reference block by using the motion vector, wherein reference samples of an external region of the first reference block that lies outside a reference picture are acquired from a region within the reference picture that corresponds to the external region; and predicting the current block based on the acquired reference samples.
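One way to picture fetching out-of-picture reference samples from a corresponding in-picture region is sketched below. The mapping is an assumption, since the abstract does not define it: horizontal positions wrap around the picture width and vertical positions are clamped to the nearest row. The function name and the list-of-rows picture layout are hypothetical.

```python
def fetch_reference_block(ref, x0, y0, w, h):
    """Read a w x h block at (x0, y0) from ref, a list of sample rows.

    Assumed mapping for samples outside the picture:
    columns wrap around horizontally, rows are clamped vertically.
    """
    height, width = len(ref), len(ref[0])
    block = []
    for dy in range(h):
        y = min(max(y0 + dy, 0), height - 1)   # clamp the row index
        block.append([ref[y][(x0 + dx) % width] for dx in range(w)])
    return block


ref_picture = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]

# The block starts one row above the picture and extends past its right edge.
block = fetch_reference_block(ref_picture, 3, -1, 2, 2)
```

Under this assumed mapping, `block` takes its columns from the right and left edges of the top row, so prediction can proceed even when the motion vector points partly outside the reference picture.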