G06V10/776

PROCESSING SYSTEM, IMAGE PROCESSING METHOD, LEARNING METHOD, AND PROCESSING DEVICE
20230005247 · 2023-01-05

A processing system includes a processor with hardware. The processor is configured to perform processing of acquiring a detection target image captured by an endoscope apparatus, controlling the endoscope apparatus based on control information, detecting a region of interest included in the detection target image based on the detection target image for calculating estimated probability information representing a probability of the detected region of interest, identifying the control information for improving the estimated probability information related to the region of interest within the detection target image based on the detection target image, and controlling the endoscope apparatus based on the identified control information.

METHOD FOR VIDEO RECOGNITION AND RELATED PRODUCTS
20230005264 · 2023-01-05

A method for video recognition and related products are provided. The method includes the following. An original set of clip descriptors is obtained by providing multiple clips of a video as an input of a 3D CNN of a neural network, where the neural network includes the 3D CNN and at least one first fully connected layer, and each of the multiple clips includes at least one frame. An attention vector corresponding to the original set of clip descriptors is determined. An enhanced set of clip descriptors is obtained based on the original set of clip descriptors and the attention vector. The enhanced set of clip descriptors is input into the at least one first fully connected layer and video recognition is performed based on an output of the at least one first fully connected layer.
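The flow in this abstract can be illustrated with a minimal sketch. The abstract does not say how the attention vector is computed or how the enhanced descriptors are combined before the fully connected layer, so the softmax over per-clip scores, the sum pooling, and every function name below are assumptions made for illustration only. The clip descriptors stand in for the output of the 3D CNN.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of per-clip scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def enhance_clip_descriptors(descriptors, scores):
    """Scale each clip descriptor by its attention weight.

    `scores` is a hypothetical per-clip relevance score; the abstract only
    states that an attention vector is determined for the descriptor set.
    """
    attention = softmax(scores)
    enhanced = [[w * x for x in d] for w, d in zip(attention, descriptors)]
    return attention, enhanced


def fully_connected(vector, weights, bias):
    """Plain dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def recognize(descriptors, scores, weights, bias):
    """Return the index of the highest-scoring class."""
    _, enhanced = enhance_clip_descriptors(descriptors, scores)
    # Assumption: enhanced descriptors are pooled by summing across clips.
    pooled = [sum(column) for column in zip(*enhanced)]
    logits = fully_connected(pooled, weights, bias)
    return max(range(len(logits)), key=logits.__getitem__)
```

With equal scores every clip contributes equally, so the sketch reduces to average pooling followed by the dense layer.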

ANALYSIS DEVICE AND COMPUTER-READABLE RECORDING MEDIUM STORING ANALYSIS PROGRAM

An analysis device includes a processor configured to: execute a first learning process on a generative model for images such that the images that bring a recognition result of an image recognition process into a preassigned state are generated; execute a second learning process on the generative model on which the first learning process has been executed, while gradually changing recognition accuracy of the images generated by the generative model on which the first learning process has been executed, to desired recognition accuracy; acquire each piece of information on back-error propagation calculated by executing the image recognition process, for the images with each level of the recognition accuracy generated through a course of the second learning process; and generate evaluation information indicating each of image parts that cause erroneous recognition at each level of the recognition accuracy, based on the acquired each piece of the information on the back-error propagation.

Closed loop automatic dataset creation systems and methods

Various techniques are provided for training a neural network to classify images. A convolutional neural network (CNN) is trained using a training dataset comprising a plurality of synthetic images. The CNN training process tracks image-related metrics and other informative metrics as the training dataset is processed. The trained inference CNN may then be tested using a validation dataset of real images to generate performance results (e.g., whether a training image was properly or improperly labeled by the trained inference CNN). In one or more embodiments, a training dataset and analysis engine extracts and analyzes the informative metrics and performance results, generates parameters for a modified training dataset to improve CNN performance, and generates corresponding instructions to a synthetic image generator to generate a new training dataset. The process repeats in an iterative fashion to build a final training dataset for use in training an inference CNN.

Training a neural network based on temporal changes in answers to factoid questions

A method trains a neural network to identify an event based on discrepancies in answers to factoid questions at different times. One or more processors identify answers to a series of factoid questions. The processor(s) compare the answers from the series of factoid questions in order to determine discrepancies in the answers at different times, and then train a neural network to identify an event based on the discrepancies in the answers at the different times.

Parameter estimation for metrology of features in an image
11569056 · 2023-01-31

Methods and apparatuses are disclosed herein for parameter estimation for metrology. An example method at least includes optimizing, using a parameter estimation network, a parameter set to fit a feature in an image based on one or more models of the feature, the parameter set defining the one or more models, and providing metrology data of the feature in the image based on the optimized parameter set.

METHOD AND APPARATUS FOR DETECTING FACE, COMPUTER DEVICE AND COMPUTER-READABLE STORAGE MEDIUM
20230023271 · 2023-01-26

A method for training a neural network, including: determining a neural network; training the neural network at a first learning rate according to a first optimization mode, where the first learning rate is updated each time the neural network is trained; mapping the first learning rate of the first optimization mode to a second learning rate of a second optimization mode in the same vector space; determining that the second learning rate satisfies a preset update condition; and continuing to train the neural network at the second learning rate according to the second optimization mode.
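One way to picture the mapping and the update condition is the sketch below. The abstract specifies neither, so both are assumptions: the mapping projects the update step of the first optimizer onto the gradient direction to obtain a scalar rate for a plain gradient-descent step, and the condition fires once a bias-corrected running average of that rate has stabilized. The names `map_to_second_learning_rate` and `RateSwitcher` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def map_to_second_learning_rate(step, grad):
    """Scalar rate gamma for which the projection of -gamma * grad onto
    `step` equals `step` (assumed mapping, not taken from the abstract)."""
    denominator = -dot(step, grad)
    if denominator <= 0.0:
        return None  # step does not point downhill; no usable estimate
    return dot(step, step) / denominator


class RateSwitcher:
    """Tracks the mapped rate and reports when it has settled."""

    def __init__(self, beta=0.9, tolerance=1e-3):
        self.beta = beta
        self.tolerance = tolerance
        self.average = 0.0
        self.steps = 0

    def update(self, step, grad):
        """Return the second learning rate once the assumed update
        condition holds, otherwise None."""
        gamma = map_to_second_learning_rate(step, grad)
        if gamma is None:
            return None
        self.steps += 1
        self.average = self.beta * self.average + (1.0 - self.beta) * gamma
        corrected = self.average / (1.0 - self.beta ** self.steps)
        # Assumed condition: the corrected average agrees with the newest
        # estimate, checked only after at least two estimates exist.
        if self.steps > 1 and abs(corrected - gamma) < self.tolerance:
            return corrected
        return None
```

A training loop would call `update` after each step of the first optimizer and, on the first non-`None` return, continue with the second optimizer at the returned rate.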

FLAT FINE-GRAINED IMAGE CLASSIFICATION WITH PROGRESSIVE PRECISION
20230237786 · 2023-07-27

Progressive precision image classifier and method of training include storing a dataset of labeled images, training a neural network to generate a classification vector comprising a plurality of confidence values, each confidence value corresponding to a classification, validating the trained neural network, calculating fine-grained confidence thresholds for each classification, wherein each classification represents a leaf-level classification in a hierarchical classification structure, and calculating coarse-level confidence thresholds for at least one parent class in the hierarchical classification structure, wherein each parent class defines a group of at least one leaf-level classification. Each label in the training data identifies a leaf-level classification in the hierarchical classification structure, and the classification vector includes a 1xN vector of confidence values, where N represents a number of leaf-level classifications output by the trained neural network. The neural network may be implemented as a convolutional neural network with a single output head.
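A minimal sketch of how such thresholds might be applied at inference time follows. The abstract covers how the thresholds are calculated, not how a prediction is chosen, so the decision rule here is an assumption: accept the top leaf class if it clears its fine-grained threshold, otherwise sum the leaf confidences under each parent and accept the strongest parent that clears its coarse-level threshold. The function name, the `"unknown"` fallback, and the example class names are hypothetical.

```python
def classify_progressive(confidences, leaf_names, leaf_thresholds,
                         parents, parent_thresholds):
    """Return the most precise label that the confidences support.

    confidences       -- 1xN list of leaf-level confidence values
    leaf_names        -- N leaf class names, aligned with `confidences`
    leaf_thresholds   -- N fine-grained thresholds, aligned with the leaves
    parents           -- dict: parent name -> list of member leaf names
    parent_thresholds -- dict: parent name -> coarse-level threshold
    """
    best = max(range(len(confidences)), key=confidences.__getitem__)
    if confidences[best] >= leaf_thresholds[best]:
        return leaf_names[best]

    # Assumption: a parent's confidence is the sum over its member leaves.
    index = {name: i for i, name in enumerate(leaf_names)}
    best_parent, best_score = None, 0.0
    for parent, members in parents.items():
        score = sum(confidences[index[m]] for m in members)
        if score >= parent_thresholds[parent] and score > best_score:
            best_parent, best_score = parent, score
    return best_parent if best_parent is not None else "unknown"
```

For example, with leaves `husky`, `malamute`, and `tabby`, a vector such as `[0.45, 0.45, 0.10]` is too uncertain for any single breed but still supports the parent class `dog` under the assumed rule.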