G06V10/449

VIDEO CODING BASED ON FEATURE EXTRACTION AND PICTURE SYNTHESIS

Computer-implemented methods, computer-readable media and devices for encoding video data using picture synthesis from features are provided. A computer-implemented method of encoding video data includes extracting features from a picture in a video; obtaining a predicted value of one or more regions in the picture by applying generative picture synthesis to the features; obtaining a residual value of the one or more regions in the picture based on an original value of the one or more regions in the picture and the predicted value; and encoding the residual value and the extracted features. Also disclosed herein are computer-implemented methods, computer-readable media and devices for decoding video data using picture synthesis from features.
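
The residual step in this abstract can be illustrated with a minimal sketch. It assumes the residual is a sample-wise difference and that a decoder adds it back to the same prediction; neither detail is stated in the abstract. The sample values and function names are hypothetical, and the prediction is a hard-coded stand-in for the output of generative picture synthesis.

```python
from typing import List


def encode_region(original: List[int], predicted: List[int]) -> List[int]:
    """Residual of a region, assumed here to be original minus predicted."""
    if len(original) != len(predicted):
        raise ValueError("region sizes differ")
    return [o - p for o, p in zip(original, predicted)]


def decode_region(predicted: List[int], residual: List[int]) -> List[int]:
    """Assumed decoder step: add the residual back to the prediction."""
    return [p + r for p, r in zip(predicted, residual)]


# Hypothetical sample values for one small region.
original = [120, 125, 130, 128]
predicted = [118, 126, 129, 131]  # stand-in for the synthesized prediction

residual = encode_region(original, predicted)
reconstructed = decode_region(predicted, residual)
```

Under these assumptions the reconstruction is exact as long as the encoder and decoder derive the same prediction from the same features.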

Image analysis method, apparatus, non-transitory computer readable medium, and deep learning algorithm generation method

Disclosed is an image analysis method including inputting analysis data, including information regarding an analysis target cell to a deep learning algorithm having a neural network structure, and analyzing an image by calculating, by use of the deep learning algorithm, a probability that the analysis target cell belongs to each of morphology classifications of a plurality of cells belonging to a predetermined cell group.
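
As a rough illustration of computing a probability for each morphology classification, the sketch below turns per-class network outputs into probabilities with a softmax. The abstract does not say how the probabilities are calculated, so the softmax, the class labels and the score values are all assumptions.

```python
import math


def softmax(scores):
    """Convert raw per-class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class labels and network outputs for one target cell.
labels = ["class_a", "class_b", "class_c"]
scores = [2.0, 0.5, -1.0]

probs = softmax(scores)
best = labels[max(range(len(probs)), key=probs.__getitem__)]
```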

Deep learning based adaptive arithmetic coding and codelength regularization
11423310 · 2022-08-23

A deep learning based compression (DLBC) system applies trained models to compress binary code of an input image to a target codelength. For a set of binary codes representing the quantized coefficients of an input image, the DLBC system applies a first model that is trained to predict feature probabilities based on the context of each bit of the binary codes. The DLBC system compresses the binary code via adaptive arithmetic coding based on the determined probability of each bit. The compressed binary code represents a balance between a reconstruction quality of a reconstruction of the input image and a target compression ratio of the compressed binary code.
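
The link between per-bit probabilities and codelength can be sketched as follows. An arithmetic coder spends roughly -log2(p) bits on a symbol it predicted with probability p, so the total is a good estimate of the compressed size. The sketch swaps the trained model of the abstract for a simple count-based adaptive model over the preceding bits; that substitution, the function name and the context size are assumptions made only for illustration.

```python
import math
from collections import defaultdict


def ideal_codelength(bits, context_size=2):
    """Estimate the arithmetic-coded size of a bit sequence, in bits.

    A count-based adaptive model (a stand-in for a trained predictor)
    estimates the probability of each bit from the bits before it.
    """
    # Per-context counts of zeros and ones, starting at 1 each (Laplace smoothing).
    counts = defaultdict(lambda: [1, 1])
    total = 0.0
    for i, bit in enumerate(bits):
        context = tuple(bits[max(0, i - context_size):i])
        zeros, ones = counts[context]
        p = (ones if bit else zeros) / (zeros + ones)
        total += -math.log2(p)        # ideal cost of coding this bit
        counts[context][bit] += 1     # adapt the model after coding
    return total


# A highly predictable sequence costs far fewer than one bit per symbol.
skewed_cost = ideal_codelength([0] * 64)
single_cost = ideal_codelength([1])
```

The better the model predicts each bit from its context, the shorter the resulting code, which is why the quality of the probability model drives the compression ratio.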

METHOD FOR RESTORING VIDEO DATA OF DRAINAGE PIPE BASED ON COMPUTER VISION
20220292645 · 2022-09-15

A method for restoring video data of a pipe based on computer vision is provided, including: performing gray stretching on pipe image/video collected by a pipe robot; processing noise interference by smoothing filtering; extracting an iron chain from the center of a video image as a template for location; performing target recognition on the center of video data by a SIFT corner detection algorithm; detecting ropes on left and right sides of a target by Hough transform; performing gray covering on the iron chain at the center of the video image and the ropes on two sides; and restoring data by an FMM image restoration algorithm.
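
The first two steps of this method, gray stretching and smoothing filtering, can be sketched in plain Python. The abstract does not specify the stretch mapping or the filter kernel, so a linear stretch to the range 0 to 255 and a 3x3 mean filter are assumed here; the tiny frame is hypothetical.

```python
def gray_stretch(image, out_min=0, out_max=255):
    """Linearly map the darkest pixel to out_min and the brightest to out_max."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [[out_min for _ in row] for row in image]
    span = out_max - out_min
    return [[round(out_min + (v - lo) * span / (hi - lo)) for v in row]
            for row in image]


def mean_filter_3x3(image):
    """Replace each pixel by the mean of its neighbourhood (clipped at borders)."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


# Hypothetical low-contrast 2x2 frame.
frame = [[100, 110], [120, 150]]
stretched = gray_stretch(frame)
smoothed = mean_filter_3x3(stretched)
```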

Classification and 3D modelling of 3D dento-maxillofacial structures using deep learning methods

A computer-implemented method for processing 3D image data of a dento-maxillofacial structure is described wherein the method may comprise the steps of: receiving 3D image data defining a volume of voxels, a voxel being associated with a radiodensity value and a position in the volume and the voxels providing a 3D representation of a dento-maxillofacial structure; using the voxels of the 3D image data to determine one or more 3D positional features for input to a first deep neural network, a 3D positional feature defining information aggregated from the entire received 3D data set; and, the first deep neural network receiving the 3D image data and the one or more positional features at its input and using the one or more 3D positional features to classify at least part of the voxels of the 3D image data into jaw, teeth and/or nerve voxels.

ONLINE LEARNING SYSTEM BASED ON CLOUD-CLIENT INTEGRATION MULTIMODAL ANALYSIS

An online learning system based on cloud-client integration multimodal analysis includes: an online learning module used for providing an online learning interface for a user and collecting image data, physiological data, posture data and interaction log data during an online learning process of the user; a multimodal data integration decision module used for preprocessing the image data, physiological data and posture data, extracting corresponding features, and making a comprehensive decision in combination with the interaction log data to obtain a current learning state of the user; a cloud-client integration system architecture module used for coordinating use of computing resources of a cloud server and a local client according to usage conditions of the cloud server and the local client, and visually displaying a progress of computing tasks; and a system interaction adjustment module used for adjusting the online learning module according to the current learning state of the user.

Methods and systems for annotation and truncation of media assets

Methods and systems are described for improving the interactivity of media content. The methods and systems are particularly applicable to the e-learning space, which features unique problems in engaging with users, maintaining that engagement, and allowing users to alter media assets to their specific needs. To address these issues, as well as improving interactivity of media assets generally, the methods and systems described herein provide for annotation and truncation of media assets. More particularly, the methods and systems described herein provide features such as annotation guidance and video condensation.

METHOD FOR DETECTING DEFECTS IN IMAGES, APPARATUS APPLYING METHOD, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM APPLYING METHOD
20220230291 · 2022-07-21

A method for detecting product defects from images of the products inputs images of flaw-free products into an autoencoder for model training to obtain reconstructed images. The images are further processed to obtain target images. A group of testing errors is obtained by comparing the reconstructed images and the target images. An error threshold is selected from the group of testing errors according to a specified rule. A to-be-analyzed image is then input to obtain a to-be-analyzed reconstructed image, a to-be-analyzed target image, and a to-be-analyzed error between the two. Whether the to-be-analyzed image contains a defect is determined from the to-be-analyzed error and the error threshold. A defect detection apparatus, an electronic device, and a non-transitory computer-readable storage medium applying the method are also disclosed.
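
The thresholding logic can be sketched without the autoencoder itself. The abstract leaves both the error measure and the "specified rule" open, so mean squared error and a quantile rule are assumed here; the function names and all numbers are hypothetical.

```python
def reconstruction_error(reconstructed, target):
    """Mean squared error between two equally sized flat pixel lists (assumed measure)."""
    return sum((r - t) ** 2 for r, t in zip(reconstructed, target)) / len(target)


def select_threshold(testing_errors, quantile=0.95):
    """Assumed rule: take the error at the given quantile of the flaw-free set."""
    ordered = sorted(testing_errors)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def has_defect(error, threshold):
    """Flag the image when its error exceeds the selected threshold."""
    return error > threshold


# Hypothetical testing errors measured on flaw-free images.
testing_errors = [0.010, 0.012, 0.009, 0.015, 0.011]
threshold = select_threshold(testing_errors)

# Hypothetical image whose reconstruction differs strongly from its target.
error = reconstruction_error([0.1, 0.9, 0.5], [0.1, 0.2, 0.5])
defective = has_defect(error, threshold)
```

Because the autoencoder is trained only on flaw-free images, it tends to reconstruct defects poorly, which is what makes a large error a usable defect signal.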

IMAGE PYRAMID GENERATION FOR IMAGE KEYPOINT DETECTION AND DESCRIPTOR GENERATION
20220301110 · 2022-09-22

Embodiments relate to generating an image pyramid for feature extraction. A pyramid image generator circuit includes a first image buffer that stores image data at a first octave, a first blur filter circuit, a first spatial filter circuit, and a first decimator circuit. The first blur filter circuit generates a first pyramid image for a first scale of the first octave by applying a first amount of smoothing to the first image data stored in the first image buffer. The first spatial filter circuit and the first decimator generate second image data of a second octave that is higher than the first octave by applying a smoothing and a decimation to the first image data stored in the first image buffer. The first spatial filter circuit begins processing the first image data before the first blur filter circuit begins to process the first image data.
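
A software analogue of the data flow described here can be sketched as follows. Each octave's buffer feeds two paths: one blur that yields the pyramid image for that octave, and a separate smoothing plus decimation that yields the input of the next octave. A box blur stands in for whatever kernels the circuits use, and the radii, function names and test image are assumptions. The hardware timing relationship between the two filter circuits is not modelled.

```python
def box_blur(image, radius):
    """Mean filter with the given radius, clipped at the image borders."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def decimate(image, factor=2):
    """Keep every factor-th pixel in both directions."""
    return [row[::factor] for row in image[::factor]]


def build_pyramid(image, octaves=2, scale_radius=2, antialias_radius=1):
    """Return one blurred pyramid image per octave."""
    pyramid = []
    buffer = image  # plays the role of the image buffer for the current octave
    for _ in range(octaves):
        # Path 1: blur the buffered data to get this octave's pyramid image.
        pyramid.append(box_blur(buffer, scale_radius))
        # Path 2: smooth and decimate the same buffered data for the next octave.
        buffer = decimate(box_blur(buffer, antialias_radius))
    return pyramid


# Hypothetical constant 4x4 image.
image = [[8] * 4 for _ in range(4)]
pyramid = build_pyramid(image)
```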

Intelligent Vehicle Trajectory Measurement Method Based on Binocular Stereo Vision System
20220092797 · 2022-03-24

The invention provides a method for intelligently measuring vehicle trajectory based on a binocular stereo vision system, including the following steps: inputting a dataset into an SSD (Single Shot Multibox Detector) neural network to train a license plate recognition model; calibrating the binocular stereo vision system, and recording videos of moving target vehicles; detecting the license plates in the video frames with the license plate recognition model; performing stereo matching on the license plates in the subsequent frames of the same camera and in the corresponding left-view and right-view video frames by a feature-based matching algorithm; retaining correct matching point pairs after filtering with a homography matrix; screening the retained matching point pairs, and keeping the one closest to the license plate center as the position of the target vehicle in the current frame; performing stereo measurement on the screened matching point pairs to get the spatial position coordinates of the vehicle in the video frames; and generating the moving trajectory of the vehicle in time sequence. The system is easy to install and adjust, and can simultaneously measure multiple target vehicles in multiple directions and on multiple lanes.
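
The stereo measurement step can be illustrated with the standard triangulation formulas for a rectified, parallel stereo pair, where depth is focal length times baseline divided by disparity. The abstract does not state the camera geometry, so the rectified setup is an assumption, and the focal length, baseline, principal point and pixel coordinates below are hypothetical.

```python
def triangulate(x_left, y_left, x_right, focal_px, baseline_m, cx, cy):
    """Recover (X, Y, Z) in metres from one matched point pair.

    Assumes rectified cameras with parallel optical axes, so the match
    differs only in its horizontal pixel coordinate.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = focal_px * baseline_m / disparity
    x = (x_left - cx) * z / focal_px
    y = (y_left - cy) * z / focal_px
    return (x, y, z)


# Hypothetical matches (x_left, y_left, x_right) from consecutive frames.
matches = [(700.0, 400.0, 660.0), (690.0, 400.0, 655.0)]

# Positions ordered in time form the trajectory.
trajectory = [
    triangulate(xl, yl, xr, focal_px=1000.0, baseline_m=0.5, cx=640.0, cy=360.0)
    for xl, yl, xr in matches
]
```

In this example the disparity shrinks from 40 to 35 pixels between frames, so the computed depth grows, which corresponds to a vehicle moving away from the cameras.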