Patent classifications
G06V10/80
RECOGNITION APPARATUS, RECOGNITION METHOD, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM
According to one embodiment, a recognition apparatus includes processing circuitry. The processing circuitry generates, from sensor data, a first feature quantity exhibiting a feature of the sensor data; converts the first feature quantity into a second feature quantity exhibiting a feature that contributes to identifying a class of the sensor data; generates, based on a cross-correlation between the first feature quantity and the second feature quantity, a significant feature quantity exhibiting a feature that is significant in identifying the class; generates, from the second feature quantity and the significant feature quantity, an integrated feature quantity that reflects features of both the first and second feature quantities; and identifies the class based on the integrated feature quantity.
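The feature pipeline described above could be sketched as follows with toy linear layers; all dimensions, weight matrices, and the specific cross-correlation and integration operations are illustrative assumptions, not the patent's actual network:

```python
import numpy as np

rng = np.random.default_rng(0)

def first_feature(sensor_data, W1):
    """Generate a first feature quantity from raw sensor data."""
    return np.tanh(W1 @ sensor_data)

def second_feature(f1, W2):
    """Convert the first feature into a class-discriminative second feature."""
    return np.tanh(W2 @ f1)

def significant_feature(f1, f2):
    """Derive a significant feature from the cross-correlation of f1 and f2."""
    corr = np.outer(f2, f1)          # cross-correlation matrix between features
    return corr @ f1                 # re-weight f1 by its correlation with f2

def integrate_and_classify(f2, sig, Wc):
    """Fuse the second and significant features, then score the classes."""
    fused = f2 + sig                 # integrated feature quantity
    logits = Wc @ fused
    return int(np.argmax(logits))

sensor = rng.normal(size=16)         # toy sensor reading
W1 = rng.normal(size=(32, 16))
W2 = rng.normal(size=(32, 32))
Wc = rng.normal(size=(5, 32))

f1 = first_feature(sensor, W1)
f2 = second_feature(f1, W2)
sig = significant_feature(f1, f2)
predicted_class = integrate_and_classify(f2, sig, Wc)
```

Here additive fusion stands in for whatever integration the claims cover; the point is only the data flow from first feature through cross-correlation to class identification.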
Neural architecture search for fusing multiple networks into one
One or more embodiments of the present disclosure include systems and methods that use neural architecture fusion to learn how to combine multiple separate pre-trained networks by fusing their architectures into a single network for better computational efficiency and higher accuracy. For example, a computer-implemented method of the disclosure includes obtaining multiple trained networks, each associated with a respective task and having a respective architecture. The method further includes generating a directed acyclic graph that represents at least a partial union of the architectures of the trained networks. The method additionally includes defining a joint objective for the directed acyclic graph that combines a performance term and a distillation term. The method also includes optimizing the joint objective over the directed acyclic graph.
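The joint objective of a performance term plus a distillation term can be sketched as cross-entropy against the label combined with a KL divergence from a teacher's soft predictions; the weighting `lam` and the exact loss forms are assumptions for illustration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def joint_objective(student_logits, teacher_logits, label, lam=0.5):
    """Performance term (cross-entropy vs. the label) plus a distillation
    term (KL divergence from the teacher's soft predictions)."""
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    performance = -np.log(p_s[label] + 1e-12)
    distillation = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)))
    return performance + lam * distillation

# Toy logits for one sample from the fused network and one teacher network.
loss = joint_objective(np.array([2.0, 0.5, -1.0]),
                       np.array([1.5, 1.0, -0.5]),
                       label=0)
```

In the disclosure this objective is optimized over the fused directed acyclic graph rather than over a single fixed network, but the per-sample loss shape is the same.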
Machine learning based models for object recognition
Machine learning based models recognize objects in images. Specific features of the object are extracted from the image using machine learning based models. The specific features extracted from the image assist deep learning based models in identifying subtypes of a type of object. The system recognizes the objects and collections of objects and determines whether the arrangement of objects violates any predetermined policies. For example, a policy may specify relative positions of different types of objects, height above ground at which certain types of objects are placed, or an expected number of certain types of objects in a collection.
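The policy-checking step described above (relative positions, mounting height, expected counts) can be sketched over a list of detected objects; the object types, field names, and thresholds here are illustrative assumptions:

```python
# Toy detections as produced by an upstream recognition model.
detections = [
    {"type": "extinguisher", "x": 2.0, "height_m": 1.2},
    {"type": "extinguisher", "x": 9.0, "height_m": 0.4},
    {"type": "exit_sign",    "x": 5.0, "height_m": 2.1},
]

def check_policies(objs):
    """Return a list of human-readable policy violations."""
    violations = []
    # Policy 1: certain object types must be mounted within a height band.
    for o in objs:
        if o["type"] == "extinguisher" and not (0.9 <= o["height_m"] <= 1.5):
            violations.append(f"{o['type']} at x={o['x']} mounted at bad height")
    # Policy 2: an expected number of a certain type in the collection.
    if sum(o["type"] == "exit_sign" for o in objs) < 1:
        violations.append("no exit sign present")
    return violations

issues = check_policies(detections)
```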
VIDEO CLIP POSITIONING METHOD AND APPARATUS, COMPUTER DEVICE, AND STORAGE MEDIUM
This application discloses a video clip positioning method performed by a computer device. In this application, clip features of video clips in a video are determined from the unit features of the video units within each clip, so that the acquired clip features integrate both the features of the video units and the time-sequence correlation between them; the clip features of the video clips are then fused with a text feature of a target text. Because the clip-level features and the time-sequence correlation between the video clips are fully used during feature fusion, more accurate attention weights can be acquired from the fused features. The attention weights represent the matching degrees between the video clips and the target text, so a target video clip matching the target text can be positioned more accurately.
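A minimal sketch of this unit-to-clip aggregation, text fusion, and attention-based positioning, with mean pooling and elementwise fusion standing in for the method's actual temporal modelling (all shapes and operations are assumptions):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def clip_features(unit_feats, clip_spans):
    """Aggregate unit features into clip features (mean pooling stands in
    for the temporal modelling described in the abstract)."""
    return np.stack([unit_feats[a:b].mean(axis=0) for a, b in clip_spans])

def position_clip(unit_feats, clip_spans, text_feat):
    clips = clip_features(unit_feats, clip_spans)
    fused = clips * text_feat                 # toy clip/text feature fusion
    weights = softmax(fused.sum(axis=1))      # attention = matching degree
    return int(np.argmax(weights)), weights   # best-matching clip index

rng = np.random.default_rng(1)
units = rng.normal(size=(10, 8))              # 10 video units, 8-dim features
spans = [(0, 4), (3, 7), (6, 10)]             # overlapping candidate clips
text = rng.normal(size=8)                     # encoded target text

best_clip, attn = position_clip(units, spans, text)
```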
OBJECT DETECTION VISION SYSTEM
An object detection vision system and methods are disclosed. A method for detecting objects in a vision system of an industrial machine includes receiving image data from one or more vision cameras and receiving detection data from one or more detection devices, the detection data including one or more detected objects. The method includes combining the detection data and the image data and transforming the detection data in the image data based on one or more objects in the image data. The method also includes displaying an indication of the detected one or more objects in the image data based on the transformed detection data.
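One common way to combine detection data with image data, sketched here as an assumption about what the transformation step might involve, is projecting a detection device's 3-D points into the camera's pixel coordinates with a pinhole model (the intrinsic matrix `K` is illustrative):

```python
import numpy as np

def project_to_image(points_world, K):
    """Transform 3-D detections (camera-frame metres) into pixel
    coordinates with a pinhole camera model."""
    pts = np.asarray(points_world, dtype=float)   # (N, 3)
    uvw = (K @ pts.T).T                           # homogeneous pixel coords
    return uvw[:, :2] / uvw[:, 2:3]               # perspective divide

# Assumed camera intrinsics: 800 px focal length, 640x480 principal point.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Detections from a non-camera detection device, in camera-frame metres.
detections = [(0.5, -0.2, 4.0), (-1.0, 0.1, 8.0)]
pixels = project_to_image(detections, K)
```

The projected pixel positions are where an indication of each detected object could be overlaid on the displayed image data.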
AUTOMATIC IMAGE CLASSIFICATION AND PROCESSING METHOD BASED ON CONTINUOUS PROCESSING STRUCTURE OF MULTIPLE ARTIFICIAL INTELLIGENCE MODEL, AND COMPUTER PROGRAM STORED IN COMPUTER-READABLE RECORDING MEDIUM TO EXECUTE THE SAME
Disclosed is an automatic image classification and processing method based on the continuous processing structure of multiple artificial intelligence models. An automatic image classification and processing method based on a continuous processing structure of multiple artificial intelligence models includes receiving image data, generating a first feature extraction value by inputting the image data into a first feature extraction model among feature extraction models, generating a second feature extraction value by inputting the image data into a second feature extraction model among the feature extraction models, and determining a classification value of the image data by inputting the first and second feature extraction values into a classification model.
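The continuous processing structure above — two feature extraction models feeding one classification model — can be sketched as follows; the ReLU extractors, concatenation, and all dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def feature_extractor(image, W):
    """Stand-in for one pre-trained feature extraction model."""
    return np.maximum(W @ image, 0.0)        # linear map + ReLU

def classify(image, W_a, W_b, W_cls):
    f1 = feature_extractor(image, W_a)       # first feature extraction value
    f2 = feature_extractor(image, W_b)       # second feature extraction value
    joint = np.concatenate([f1, f2])         # both values enter the classifier
    return int(np.argmax(W_cls @ joint))     # classification value

image = rng.normal(size=64)                  # flattened toy image
W_a = rng.normal(size=(16, 64))
W_b = rng.normal(size=(16, 64))
W_cls = rng.normal(size=(3, 32))

label = classify(image, W_a, W_b, W_cls)
```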
Dynamic imaging system
A dynamic imaging system is disclosed. The dynamic imaging system may comprise one or more imagers, one or more input devices, a controller, and/or a display. Each imager may be operable to capture a video stream having a field of view. In some embodiments, the controller may articulate the imager or crop the field of view to change the field of view in response to signals from the one or more input devices. For example, the signal may relate to a vehicle's speed. In other embodiments, the controller may apply a warp to the field of view. The warp may be applied in response to signals from the one or more input devices. In yet other embodiments, video streams from one or more imagers may be stitched together by the controller. Further, the controller may likewise move the stitch line in response to signals from the one or more input devices.
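A speed-responsive crop like the one described could be sketched as below; the linear narrowing rule, speed range, and minimum scale are purely illustrative assumptions:

```python
def crop_for_speed(frame_w, frame_h, speed_kmh,
                   min_scale=0.5, max_speed=120.0):
    """Narrow the displayed field of view as speed rises: at standstill the
    full frame is shown; at max_speed only the central min_scale fraction.
    Returns a centred crop rectangle (x, y, width, height)."""
    t = min(max(speed_kmh, 0.0), max_speed) / max_speed
    scale = 1.0 - (1.0 - min_scale) * t
    w, h = int(frame_w * scale), int(frame_h * scale)
    x0, y0 = (frame_w - w) // 2, (frame_h - h) // 2
    return x0, y0, w, h

full = crop_for_speed(1280, 720, 0.0)        # whole frame at standstill
fast = crop_for_speed(1280, 720, 120.0)      # tight central crop at speed
```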
LEARNING FEATURE IMPORTANCE FOR IMPROVED VISUAL EXPLANATION
Systems, methods and computer readable media provide technology to perform image classification and produce visualization using a machine learning architecture. The disclosed image classification and visualization technology includes a feature extraction network to generate a feature map, a feature importance network to generate a feature importance vector, an attention map generated based on a weighted sum of the feature importance vector and the feature map, a classification output determined based on a combination of the attention map and the feature map, and a feature visualization image generated by overlaying the attention map onto an input image. Each of the feature extraction network and the feature importance network can include a neural network.
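The described flow — feature map, feature importance vector, importance-weighted attention map, classification from the combination, and a map that can be overlaid for visualization — can be sketched on random tensors; the channel-wise weighting and global average pool are assumptions about how the pieces combine:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def visual_explanation(feature_map, importance, W_cls):
    """feature_map: (C, H, W); importance: (C,). The attention map is the
    importance-weighted sum over channels; classification uses the
    attention-modulated features; the map itself can be overlaid on the
    input image for visualization."""
    attention = np.einsum("c,chw->hw", importance, feature_map)
    attended = feature_map * attention[None]      # combine map and features
    pooled = attended.mean(axis=(1, 2))           # global average pool
    logits = W_cls @ pooled
    return attention, softmax(logits)

rng = np.random.default_rng(3)
fmap = rng.normal(size=(4, 6, 6))                 # feature extraction output
imp = softmax(rng.normal(size=4))                 # feature importance vector
W = rng.normal(size=(2, 4))

attn_map, class_probs = visual_explanation(fmap, imp, W)
```

In the disclosure both the feature extraction network and the feature importance network are learned; here they are replaced by fixed random tensors to keep the sketch self-contained.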