G06V10/806

AUTOMATIC IMAGE ANNOTATIONS
20220327312 · 2022-10-13 ·

A computer-implemented method for annotating images is disclosed. The computer-implemented method includes: generating a saliency map corresponding to an input image, wherein the input image is an image that requires annotation; generating a behavior saliency map, wherein the behavior saliency map is a saliency map formed from an average of a plurality of objects contained within respective bounding boxes of a plurality of sample images; generating a historical saliency map, wherein the historical saliency map is a saliency map formed from an average of a plurality of tagged objects in the plurality of sample images; fusing the saliency map corresponding to the input image, the behavior saliency map, and the historical saliency map to form a fused saliency map; and generating, based on the fused saliency map, a bounding box around an object in the input image.
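The fusion and box-generation steps might look roughly as follows; equal fusion weights and a fixed saliency threshold are assumptions here, since the abstract does not name the fusion operator or the box-extraction rule:

```python
import numpy as np

def fuse_saliency_maps(image_sal, behavior_sal, historical_sal,
                       weights=(1 / 3, 1 / 3, 1 / 3)):
    # Weighted average of the three saliency maps; equal weights are an
    # assumption, the abstract only says the maps are "fused".
    fused = (weights[0] * image_sal
             + weights[1] * behavior_sal
             + weights[2] * historical_sal)
    return fused / fused.max()

def bounding_box(fused, threshold=0.5):
    # Axis-aligned box around all pixels whose fused saliency passes the
    # (hypothetical) threshold; returns (x_min, y_min, x_max, y_max).
    ys, xs = np.nonzero(fused >= threshold)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

A single salient blob in all three maps then yields one tight box around that blob.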

Method for recognizing distribution network equipment based on raspberry pi multi-scale feature fusion

Disclosed is a method for recognizing distribution network equipment based on Raspberry Pi multi-scale feature fusion. The method includes obtaining an initial sample data set; constructing an object detection network composed of an EfficientNet-B0 backbone network, a multi-scale feature fusion module, and a regression classification prediction head; training the object detection network with the initial sample data set as training samples; and finally, detecting inspection pictures using the trained object detection network. The light-weight EfficientNet-B0 backbone feature extraction obtains more object features. Meanwhile, the introduction of multi-scale feature fusion better adapts the network to small object detection, and the light-weight y_pred regression classification detection head can be effectively deployed and realized on Raspberry Pi embedded equipment with tight resources and limited computing power.
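A minimal numpy sketch of the multi-scale fusion idea: bring a coarse feature map up to a finer map's resolution and merge them. Nearest-neighbour upsampling and element-wise addition are assumptions; the patent does not describe its actual fusion module:

```python
import numpy as np

def upsample_nearest(feat, factor=2):
    # feat: (C, H, W) feature map; repeat rows and columns for a
    # nearest-neighbour upsampling.
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_scales(coarse, fine):
    # FPN-style fusion sketch: the semantically stronger coarse map is
    # upsampled to the fine map's resolution and added element-wise, so
    # small objects keep fine spatial detail plus coarse semantics.
    return fine + upsample_nearest(coarse)
```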

Image processing method, apparatus, and device, and storage medium

An image processing method, apparatus, and device, and a storage medium are provided. The method is performed by a computing device, and includes: determining a first image feature of a first size of an input image, the first image feature having at least two channels; performing weight adjustment on each channel in the first image feature by using a first weight adjustment parameter, to obtain an adjusted first image feature, the first weight adjustment parameter including at least two parameter components, and each parameter component being used for adjusting a pixel of a channel corresponding to each parameter component; downsampling the adjusted first image feature to obtain a second image feature having a second size; combining the first image feature and the second image feature to obtain a combined image feature; and determining an image processing result according to the combined image feature.
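The per-channel weight adjustment, downsampling, and combination steps could be sketched as below; 2x average pooling and channel concatenation are assumptions, since the abstract leaves both operators open:

```python
import numpy as np

def adjust_channels(feature, weight):
    # feature: (C, H, W); weight: (C,), one parameter component per channel,
    # scaling every pixel of its corresponding channel.
    return feature * weight[:, None, None]

def downsample(feature):
    # 2x2 average pooling, halving the spatial size (the abstract does not
    # name the downsampling operator).
    c, h, w = feature.shape
    return feature.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def combine(first, second):
    # Upsample the second feature back to the first's size and concatenate
    # along the channel axis (the combination rule is an assumption).
    up = second.repeat(2, axis=1).repeat(2, axis=2)
    return np.concatenate([first, up], axis=0)
```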

Systems and methods for image based perception

Systems and methods for image-based perception. The methods comprise: obtaining, by a computing device, images captured by a plurality of cameras with overlapping fields of view; generating, by the computing device, spatial feature maps indicating locations of features in the images; defining, by the computing device, predicted cuboids at each location of an object in the images based on the spatial feature maps; and assigning, by the computing device, at least two of the predicted cuboids to a given object when predictions from images captured by separate cameras of the plurality of cameras should be associated with the same detected object.
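One hypothetical association rule for the assignment step is distance between predicted 3D cuboid centers; the abstract does not state the criterion actually used:

```python
import math

def associate_cuboids(cuboids_a, cuboids_b, max_center_dist=1.0):
    # cuboids_a / cuboids_b: lists of (x, y, z) cuboid centers predicted from
    # two cameras with overlapping fields of view. Pairs whose centers lie
    # within max_center_dist (a hypothetical threshold) are assigned to the
    # same detected object.
    pairs = []
    for i, ca in enumerate(cuboids_a):
        for j, cb in enumerate(cuboids_b):
            if math.dist(ca, cb) <= max_center_dist:
                pairs.append((i, j))
    return pairs
```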

IMAGE SAMPLE GENERATING METHOD AND SYSTEM, AND TARGET DETECTION METHOD
20230162342 · 2023-05-25 ·

Provided are a target detection method and an image sample generating method and system for deep learning. The image sample generating method includes performing a scenario composition analysis on an item to be detected in a security check place; obtaining a real-shot security check image of a target scenario having a corresponding composition ratio according to the scenario composition analysis; obtaining a target security check image having a label, where the target security check image is captured by a security check device; processing a pixel gray value of an i-th feature layer in the real-shot security check image and a pixel gray value of an i-th feature layer in the target security check image separately; determining images to be fused; normalizing sizes of the images to be fused; fusing the size-normalized images to be fused to form a new sample; and repeating the step of determining the images to be fused.
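A toy sketch of the size-normalization and fusion steps; nearest-neighbour resizing and a fixed blend ratio are assumptions, as the abstract specifies neither:

```python
import numpy as np

def normalize_size(img, out_h, out_w):
    # Nearest-neighbour resize of a grayscale image to a common size
    # (the interpolation method is not named in the patent).
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def fuse_samples(real_img, target_img, alpha=0.5):
    # Blend the size-normalized images into a new training sample; the
    # blend ratio alpha is a hypothetical parameter.
    return alpha * real_img + (1 - alpha) * target_img
```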

METHOD OF PROCESSING IMAGE, METHOD OF TRAINING MODEL, AND ELECTRONIC DEVICE
20230162474 · 2023-05-25 ·

A method of processing an image, a method of training a multi-task processing model, and an electronic device, which relate to a field of an automatic driving technology, in particular to a field of high-definition map technology. The method of processing an image includes: processing a to-be-processed image to obtain a feature point of the to-be-processed image, a feature point descriptor map of the to-be-processed image, and a dense descriptor map of the to-be-processed image; determining a pair of matched feature points between the to-be-processed image and a reference image based on the feature point and the feature point descriptor map; and determining a pair of matched pixels between the to-be-processed image and the reference image based on the dense descriptor map.
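The feature-point pairing step can be illustrated with standard nearest-neighbour descriptor matching plus a ratio test; this is a common sketch, not the patent's stated matching rule:

```python
import numpy as np

def match_feature_points(desc_a, desc_b, ratio=0.8):
    # desc_a: (Na, D) descriptors from the to-be-processed image;
    # desc_b: (Nb, D) descriptors from the reference image.
    # A pair (i, j) is kept when j is i's nearest neighbour and is clearly
    # closer than the second-nearest candidate (Lowe-style ratio test).
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        if len(order) > 1 and dists[order[0]] < ratio * dists[order[1]]:
            matches.append((i, int(order[0])))
    return matches
```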

INTERPRETATION OF RESONANT SENSOR DATA USING MACHINE LEARNING

Examples are disclosed relating to the tracking of facial expressions as computing device inputs. One example provides a computing system comprising a logic system, and a storage system comprising instructions executable by the logic system to obtain facial tracking sensor data from one or more resonant inductive-capacitive (LC) sensors, determine a facial expression by inputting the facial tracking sensor data into a trained machine learning function, and output the determined facial expression.

VEHICLE LIGHT CLASSIFICATION SYSTEM
20230162508 · 2023-05-25 ·

The described aspects and implementations enable vehicle light classification in autonomous vehicle (AV) applications. In one implementation, disclosed is a method and a system to perform the method that includes, obtaining, by a processing device, first image data characterizing a driving environment of an autonomous vehicle (AV). The processing device may identify, based on the image data, a vehicle within the driving environment. The processing device may process the image data using one or more trained machine-learning models (MLMs) to determine a state of one or more lights of the vehicle and cause an update to a driving path of the AV based on the determined state of the lights.

SYSTEMS AND METHODS FOR VISION-LANGUAGE DISTRIBUTION ALIGNMENT
20230162490 · 2023-05-25 ·

Embodiments described herein provide a CROss-Modal Distribution Alignment (CROMDA) model for vision-language pretraining, which can be used for retrieval downstream tasks. In the CROMDA model, global cross-modal representations are aligned on each unimodality. Specifically, a uni-modal global similarity between an image/text and the image/text feature queue is computed. A softmax-normalized distribution is then generated based on the computed similarity. The distribution thus takes advantage of the global structure of the queue. CROMDA then aligns the two distributions and learns a modal-invariant global representation. In this way, CROMDA obtains an invariant property in each modality, where images with similar text representations should themselves be similar, and vice versa.
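As a sketch, the two softmax-normalized similarity distributions could be aligned with a KL-divergence loss; the temperature and the choice of KL are assumptions, since the abstract only states that the distributions are aligned:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def alignment_loss(img_feat, txt_feat, img_queue, txt_queue, temp=0.07):
    # Uni-modal global similarities: an image against the image-feature queue,
    # its paired text against the text-feature queue. temp is a hypothetical
    # temperature hyperparameter.
    p_img = softmax(img_queue @ img_feat / temp)
    p_txt = softmax(txt_queue @ txt_feat / temp)
    eps = 1e-12
    # KL(p_img || p_txt): zero when the two distributions already agree.
    return float(np.sum(p_img * (np.log(p_img + eps) - np.log(p_txt + eps))))
```

When the image and text sides induce the same distribution over the queue, the loss vanishes; mismatched distributions give a positive penalty.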

Advanced gaming and virtual reality control using radar
11656336 · 2023-05-23 ·

Techniques are described herein that enable advanced gaming and virtual reality control using radar. These techniques enable small motions and displacements to be tracked, even in the millimeter or submillimeter scale, for user control actions even when those actions are optically occluded or obscured.