Patent classifications
G06V10/806
POINT OF VIEW VIDEO PROCESSING AND CURATION PLATFORM
Embodiments of the present disclosure may provide methods and systems configured to perform the following stages: receiving a plurality of content streams; retrieving metadata associated with each of the plurality of content streams; processing the metadata to detect at least one target annotation within at least one target content stream; retrieving telemetry data associated with the at least one target content stream; processing the telemetry data and the metadata associated with a plurality of frames in the at least one target content stream to ascertain vector motion data; and mapping a spatial relationship between at least one capturing device and at least one target content source.
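The staged pipeline above could be sketched as follows. This is a minimal illustration only: the stream layout, metadata field names, and 2-D telemetry format are assumptions for demonstration, not the patent's actual data model.

```python
# Toy sketch: detect a target annotation in stream metadata, derive
# frame-to-frame motion vectors from telemetry, and map a spatial offset
# between two capturing devices. All field names are illustrative.
import numpy as np

def find_target_streams(streams, target_tag):
    """Return streams whose metadata contains the target annotation."""
    return [s for s in streams if target_tag in s["metadata"]["annotations"]]

def motion_vectors(telemetry):
    """Frame-to-frame displacement vectors from (x, y) telemetry samples."""
    pts = np.asarray(telemetry, dtype=float)
    return np.diff(pts, axis=0)          # shape: (n_frames - 1, 2)

def relative_offset(device_a_pos, device_b_pos):
    """Spatial relationship between two capturing devices."""
    return np.asarray(device_b_pos, float) - np.asarray(device_a_pos, float)

streams = [
    {"metadata": {"annotations": {"goal"}}, "telemetry": [(0, 0), (1, 0), (3, 1)]},
    {"metadata": {"annotations": {"crowd"}}, "telemetry": [(5, 5), (5, 6)]},
]
targets = find_target_streams(streams, "goal")
vecs = motion_vectors(targets[0]["telemetry"])   # [[1, 0], [2, 1]]
offset = relative_offset((0, 0), (5, 5))         # [5, 5]
```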
IMAGE PROCESSING METHOD AND APPARATUS, DEVICE, VIDEO PROCESSING METHOD AND STORAGE MEDIUM
An image processing method and apparatus, a device, a video processing method and a storage medium are provided. The image processing method includes: receiving an input image; and processing the input image by using a convolutional neural network to obtain an output image, where the definition (clarity) of the output image is higher than that of the input image. Processing the input image by using the convolutional neural network to obtain the output image includes: performing feature extraction on the input image to obtain a plurality of first images; concatenating the input image and the plurality of first images to obtain a first image group; performing the feature extraction on the first image group to obtain a plurality of second images; fusing the plurality of second images and the plurality of first images to obtain a plurality of third images; concatenating the input image and the plurality of third images to obtain a second image group; and performing the feature extraction on the second image group to obtain the output image.
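The data flow of the abstract can be traced with a minimal numpy sketch. This is not the patent's actual network: "feature extraction" is stood in for by a fixed 1x1 convolution (a channel-mixing matmul), "fusion" by elementwise addition, and "concatenation" by channel stacking.

```python
# Toy data-flow sketch: extract -> concat -> extract -> fuse -> concat -> extract.
import numpy as np

rng = np.random.default_rng(0)

def feature_extract(x, out_channels):
    """Stand-in feature extractor: a 1x1 conv as a channel-mixing matmul."""
    w = rng.standard_normal((x.shape[-1], out_channels))
    return np.tanh(x @ w)

h, w, c = 8, 8, 3
input_image = rng.standard_normal((h, w, c))

first_images = feature_extract(input_image, 16)                # plurality of first images
first_group = np.concatenate([input_image, first_images], -1)  # first image group
second_images = feature_extract(first_group, 16)               # plurality of second images
third_images = second_images + first_images                    # fuse second and first
second_group = np.concatenate([input_image, third_images], -1) # second image group
output_image = feature_extract(second_group, c)                # same channels as input
```

Note how the raw input image is re-injected at both concatenation points, which is what lets the final extraction stage produce an output of the same channel count as the input.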
OBJECT DETECTION METHOD, OBJECT DETECTION APPARATUS, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM STORING COMPUTER PROGRAM
An object detection method includes inputting an input image to a trained machine learning model, generating a similarity image from the output of at least one specific layer, and generating a discriminant image, to which at least an unknown label is assigned, by comparing the similarity of each pixel in the similarity image to a predetermined threshold value.
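The thresholding step can be sketched in a few lines. The similarity map and predicted class map below are synthetic assumptions; only the compare-and-relabel logic reflects the described method.

```python
# Pixels whose similarity falls below a threshold are marked "unknown";
# the rest keep the model's predicted class.
import numpy as np

UNKNOWN = -1

def discriminant_image(similarity, predicted_classes, threshold=0.5):
    """Assign the unknown label where per-pixel similarity is below threshold."""
    out = predicted_classes.copy()
    out[similarity < threshold] = UNKNOWN
    return out

similarity = np.array([[0.9, 0.2], [0.6, 0.4]])
classes = np.array([[1, 2], [3, 4]])
disc = discriminant_image(similarity, classes)   # [[1, -1], [3, -1]]
```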
FUNDUS IMAGE QUALITY EVALUATION METHOD AND DEVICE BASED ON MULTI-SOURCE AND MULTI-SCALE FEATURE FUSION
Disclosed is a fundus image quality evaluation method based on multi-source and multi-scale feature fusion, comprising the following steps: S1, acquiring multi-source fundus images, labeling them along four evaluation dimensions (brightness, blur, contrast, and overall image quality), and forming training samples from the fundus images and their labels; S2, constructing a fundus image quality evaluation network comprising a feature extraction module, a fusion module, an attention module and an evaluation module; S3, training the fundus image quality evaluation network with the training samples to obtain a fundus image quality evaluation model; and S4, inputting fundus images to be evaluated into the fundus image quality evaluation model and outputting quality evaluation results through calculation. Also provided is a fundus image quality evaluation device based on the above method.
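The fusion/attention/evaluation chain could look like the following sketch. The feature sizes, the mean-based attention score, and the random linear evaluation head are assumptions chosen only to show how multi-scale features can be attention-weighted into one vector that yields four quality scores.

```python
# Attention-weighted fusion of multi-scale features, then a 4-way quality head
# (brightness, blur, contrast, overall). All parameters are synthetic.
import numpy as np

rng = np.random.default_rng(1)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_fuse(features):
    """Weight each scale's feature vector by a (toy) attention score."""
    scores = np.array([f.mean() for f in features])
    weights = softmax(scores)
    return sum(w * f for w, f in zip(weights, features)), weights

# Two "scales" of features extracted from one fundus image (synthetic).
scales = [rng.standard_normal(32), rng.standard_normal(32)]
fused, weights = attention_fuse(scales)

W_eval = rng.standard_normal((32, 4))   # evaluation head: 4 quality dimensions
scores = fused @ W_eval                 # brightness, blur, contrast, overall
```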
Computer aided diagnosis system for detecting tissue lesion on microscopy images based on multi-resolution feature fusion
Embodiments of the present disclosure include a method, device and computer-readable medium involving receiving image data to detect tissue lesions, passing the image data through at least one first convolutional neural network, segmenting the image data, fusing the segmented image data, and detecting tissue lesions.
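A minimal picture of multi-resolution fusion, assuming two segmentation probability maps at full and half resolution: upsample the coarse map, average the two, and threshold to detect lesion pixels. The maps and threshold are synthetic stand-ins, not the disclosed networks.

```python
# Fuse full- and half-resolution lesion probability maps, then threshold.
import numpy as np

def upsample2x(p):
    """Nearest-neighbour 2x upsampling of a coarse probability map."""
    return np.repeat(np.repeat(p, 2, axis=0), 2, axis=1)

def fuse_and_detect(p_full, p_half, threshold=0.5):
    """Average the full-resolution and upsampled half-resolution maps."""
    fused = (p_full + upsample2x(p_half)) / 2.0
    return fused, fused > threshold

p_full = np.array([[0.9, 0.1], [0.2, 0.8]])   # full-resolution lesion probs
p_half = np.array([[0.6]])                    # half-resolution lesion probs
fused, lesions = fuse_and_detect(p_full, p_half)
# fused -> [[0.75, 0.35], [0.4, 0.7]]; lesions -> [[True, False], [False, True]]
```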
METHOD OF GENERATING CLASSIFIER BY USING SMALL NUMBER OF LABELED IMAGES
A method of generating a classifier by using a small number of labeled images includes: pre-training a wide residual network by using a set of labeled data with a data amount meeting requirements, and using the pre-trained wide residual network, excluding its fully connected layer, as a feature extractor for images; randomly selecting, for an N-class classifier to be generated, N classes from a training set for each of a plurality of times; and, for the N classes selected each time: randomly selecting one or more images from each of the N classes as training samples; extracting a feature vector from the training samples of each class by using the feature extractor; inputting the N feature vectors so extracted into a classifier generator; and sequentially performing a class information fusion and a parameter prediction for the N-class classifier by using the classifier generator.
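One episode of this procedure could be sketched as below. A frozen random projection stands in for the pre-trained wide residual network, class-information fusion is a per-class mean, and parameter prediction is L2 normalization into cosine-classifier weights; these are common few-shot choices assumed for illustration, not the patent's generator.

```python
# One few-shot episode: extract features, fuse per class, predict weights.
import numpy as np

rng = np.random.default_rng(2)
PROJ = rng.standard_normal((64, 16))   # frozen stand-in "feature extractor"

def extract(images):
    return np.atleast_2d(images) @ PROJ        # (k, 16) feature vectors

def generate_classifier(support_sets):
    """Class-information fusion (mean) then parameter prediction (L2 norm)."""
    protos = np.stack([extract(s).mean(axis=0) for s in support_sets])
    return protos / np.linalg.norm(protos, axis=1, keepdims=True)

def classify(weights, image):
    f = extract(image)[0]
    return int(np.argmax(weights @ (f / np.linalg.norm(f))))

n_classes, k_shot = 3, 5
means = 5.0 * rng.standard_normal((n_classes, 64))
support = [means[c] + rng.standard_normal((k_shot, 64)) for c in range(n_classes)]
W = generate_classifier(support)            # (3, 16) cosine classifier
pred = classify(W, support[1][0])           # query drawn from class 1
```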
APPARATUS AND METHOD WITH VIDEO PROCESSING
A processor-implemented method with video processing includes: determining a first image feature of a first image of video data and a second image feature of a second image that is previous to the first image; determining a time-domain information fusion processing result by performing time-domain information fusion processing on the first image feature and the second image feature; and determining a panoptic segmentation result of the first image based on the time-domain information fusion processing result.
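The time-domain fusion step can be illustrated as below. The elementwise gate and the argmax "decoder" are simple assumptions standing in for the described processing; only the blend-past-with-present structure reflects the abstract.

```python
# Gate-blend current and previous frame features, then decode per-pixel classes.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def time_domain_fuse(curr, prev):
    """Per-element gate decides how much past information to keep."""
    gate = sigmoid(curr - prev)          # toy gating signal
    return gate * curr + (1 - gate) * prev

h, w, n_classes = 4, 4, 3
rng = np.random.default_rng(3)
prev_feat = rng.standard_normal((h, w, n_classes))   # second (previous) image
curr_feat = rng.standard_normal((h, w, n_classes))   # first (current) image

fused = time_domain_fuse(curr_feat, prev_feat)
segmentation = fused.argmax(axis=-1)     # toy panoptic-style per-pixel class id
```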
INFORMATION PROCESSING APPARATUS AND METHOD OF INFERRING
A non-transitory computer-readable recording medium stores a program for causing a computer to execute a process including: for each of plural pieces of first-type training data, each including image information, first semantic information, and a first class of a relevant first object, generating a first hyperdimensional vector (HV) from the image information and the first semantic information, and storing the first HV in a storage unit in correlation with the first class; and, for each of plural pieces of second-type training data, each including second semantic information and a second class of a relevant second object, obtaining from the storage unit a predetermined number of HVs exhibiting a higher degree of matching with an HV generated from the second semantic information, generating a second HV of the second-type training data based on the predetermined number of HVs, and storing the second HV in the storage unit in correlation with the second class.
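The two passes could be sketched with bipolar hyperdimensional vectors. The encoding choices below (elementwise binding, majority-vote bundling, dot-product matching) are common hyperdimensional-computing practice assumed for illustration, not necessarily the patent's scheme.

```python
# First pass: bind image + semantic HVs per class. Second pass: retrieve the
# top-matching stored HVs for semantic-only data and bundle them into a new HV.
import numpy as np

D = 4096
rng = np.random.default_rng(4)

def random_hv():
    return rng.choice([-1, 1], size=D)

def bundle(hvs):
    """Majority-vote superposition of HVs (ties broken toward +1)."""
    return np.sign(np.sum(hvs, axis=0) + 0.1)

def similarity(a, b):
    return (a @ b) / D

store = {}   # class -> HV

# First-type training data: image + semantic info -> bound HV, keyed by class.
for cls in ["cat", "dog", "car"]:
    image_hv, semantic_hv = random_hv(), random_hv()
    store[cls] = image_hv * semantic_hv      # binding via elementwise product

# Second-type training data: semantic-only probe (a noisy "cat" HV here),
# bundle its top-2 matches into the second HV and store it.
query_hv = store["cat"] * np.where(rng.random(D) < 0.1, -1, 1)
ranked = sorted(store, key=lambda c: similarity(query_hv, store[c]), reverse=True)
second_hv = bundle([store[c] for c in ranked[:2]])
store["new_class"] = second_hv
```

The high dimensionality is what makes this work: unrelated random HVs have similarity near 0, while the noisy probe keeps similarity near 0.8 with its source, so top-k retrieval is reliable.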
SYSTEMS AND METHODS FOR IMAGE PROCESSING USING NATURAL LANGUAGE
Embodiments of the disclosure provide a machine learning model for generating a predicted executable command for an image. The machine learning model includes an interface configured to obtain an utterance indicating a request associated with the image, an utterance sub-model, a visual sub-model, an attention network, and a selection gate. The machine learning model generates a segment of the predicted executable command from weighted probabilities of each candidate token in a predetermined vocabulary, the probabilities being determined based on the visual features, the concept features, the current command features, and the utterance features extracted from the utterance or the image.
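The selection-gate idea can be reduced to a toy example: per-source token probabilities are mixed by a gate into the weighted probabilities used to pick the next command token. The vocabulary, logits, and gate value are all illustrative assumptions.

```python
# Gate-mix two probability distributions over a tiny command vocabulary.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

vocab = ["select", "brighten", "crop", "undo"]

# Per-source logits (stand-ins for utterance- and visual-feature scores).
utterance_logits = np.array([2.0, 0.5, 0.1, 0.0])
visual_logits = np.array([0.2, 1.5, 0.3, 0.1])

gate = 0.7   # toy selection gate: how much to trust the utterance source
weighted = gate * softmax(utterance_logits) + (1 - gate) * softmax(visual_logits)
next_token = vocab[int(np.argmax(weighted))]   # "select"
```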
Media processing method, related apparatus, and storage medium
Provided is a video processing method, including: obtaining a to-be-processed video, the to-be-processed video including an object with a to-be-recognized identity, and generating a first gait energy diagram from it; obtaining a second gait energy diagram, generated based on a video including an object with a known identity; inputting the first gait energy diagram and the second gait energy diagram into a deep neural network; extracting the respective identity information of the two gait energy diagrams, the identity information of each including gait feature vectors, and determining a fused gait feature vector from the gait feature vectors of the first and second gait energy diagrams; and calculating a similarity based on at least the fused gait feature vector.
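A compact sketch of the pipeline, assuming a gait energy image (GEI) is the pixelwise mean of binary silhouette frames over a gait cycle: flattened GEIs stand in for the deep network's gait feature vectors, fusion is concatenation (one common choice), and the similarity is cosine. All frames here are synthetic.

```python
# GEI construction, feature fusion, and similarity scoring (toy version).
import numpy as np

def gait_energy_image(silhouettes):
    """Average a stack of binary silhouette frames into one GEI."""
    return np.mean(np.asarray(silhouettes, dtype=float), axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(5)
frames_a = rng.integers(0, 2, size=(10, 16, 16))   # unknown-identity video
frames_b = rng.integers(0, 2, size=(10, 16, 16))   # known-identity video

gei_a = gait_energy_image(frames_a)
gei_b = gait_energy_image(frames_b)

feat_a, feat_b = gei_a.ravel(), gei_b.ravel()   # stand-in "deep" gait features
fused = np.concatenate([feat_a, feat_b])        # fused gait feature vector
score = cosine(feat_a, feat_b)                  # similarity score
```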