Patent classifications
G06V10/806
DEPTH IMAGE GENERATION METHOD, APPARATUS, AND STORAGE MEDIUM AND ELECTRONIC DEVICE
A depth image generation method, apparatus, storage medium, and electronic device. The method includes: acquiring a plurality of target images; performing multi-stage convolution processing on the target images through a plurality of convolutional layers in a convolution model to obtain feature map sets respectively output by the convolutional layers; performing view aggregation on the feature maps in each feature map set to obtain an aggregated feature corresponding to that set; and performing fusion processing on the obtained aggregated features to obtain a depth image. Because the target images are captured by photographing the target object from different views, they carry information from different angles, which enriches the information content available for depth estimation.
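The pipeline above (multi-stage convolution per view, per-stage view aggregation, then fusion of the aggregated features) can be sketched as follows. The averaging kernel, mean-based view aggregation, and crop-and-average fusion are illustrative stand-ins, not the patent's actual operators:

```python
import numpy as np

def conv_stage(feature_maps, kernel):
    """One convolution stage (stride-1 'valid' 2D conv) applied to each view's map."""
    out = []
    for fm in feature_maps:
        h, w = fm.shape
        kh, kw = kernel.shape
        res = np.zeros((h - kh + 1, w - kw + 1))
        for i in range(res.shape[0]):
            for j in range(res.shape[1]):
                res[i, j] = np.sum(fm[i:i + kh, j:j + kw] * kernel)
        out.append(res)
    return out

def aggregate_views(feature_maps):
    """View aggregation: here simply the element-wise mean across views."""
    return np.mean(np.stack(feature_maps), axis=0)

def fuse(aggregated):
    """Fuse per-stage aggregated features: crop to the smallest map and average."""
    target = aggregated[-1].shape  # deepest (smallest) map
    resized = [a[:target[0], :target[1]] for a in aggregated]  # naive crop
    return np.mean(np.stack(resized), axis=0)

views = [np.random.rand(8, 8) for _ in range(3)]  # the object seen from 3 views
kernel = np.ones((3, 3)) / 9.0

stage1 = conv_stage(views, kernel)    # first convolutional layer's feature map set
stage2 = conv_stage(stage1, kernel)   # second convolutional layer's feature map set
agg = [aggregate_views(stage1), aggregate_views(stage2)]
depth = fuse(agg)                     # stand-in for the output depth image
```

Each stage shrinks the maps (8x8 to 6x6 to 4x4 here), so the fused output takes the deepest map's size.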
ARTIFICIAL INTELLIGENCE-BASED OBJECT DETECTION METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
This application discloses an artificial intelligence-based object detection method and apparatus. The method includes: inputting a target image containing an object to an object detection model; obtaining feature images of different scales from the target image using the object detection model; determining image location information of the object and a first confidence level that the object belongs to each category; acquiring a target region in which the object is located; inputting the target region to an object retrieval model, which compares the target region with sample images of a plurality of categories to obtain a second confidence level that the object belongs to each category; and determining a target category of the object based on the first and second confidence levels, the target category being the category whose sum of first and second confidence levels is the largest among the plurality of categories.
METHOD FOR SELECTING IMAGE SAMPLES AND RELATED EQUIPMENT
The present disclosure relates to the field of artificial intelligence and provides a method for selecting image samples and related equipment. The method trains an instance segmentation model with first image samples and trains a score prediction model with third image samples. An information quantity score for each second image sample is calculated by the score prediction model, and feature vectors of the second image samples are extracted. The second image samples are clustered according to their feature vectors to obtain sample clusters. Target image samples are then selected from the second image samples according to their information quantity scores and the sample clusters, and are selected for labelling, improving the accuracy of sample selection.
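The selection stage — clustering sample feature vectors and then picking the highest-scoring samples per cluster so the chosen set is both informative and diverse — might look like this toy sketch. The features and scores are assumed inputs standing in for the trained models' outputs, and the tiny k-means is only for illustration:

```python
import numpy as np

def select_samples(features, scores, n_clusters, per_cluster=1, seed=0):
    """Cluster the samples (toy k-means), then pick the top-scoring sample(s)
    from each cluster. Returns sorted indices of the selected samples."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), n_clusters, replace=False)].astype(float)
    for _ in range(10):  # a few k-means iterations suffice for a sketch
        labels = np.argmin(((features[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = features[labels == k].mean(axis=0)
    selected = []
    for k in range(n_clusters):
        idx = np.where(labels == k)[0]
        if len(idx):
            best_first = idx[np.argsort(scores[idx])[::-1]]
            selected.extend(best_first[:per_cluster].tolist())
    return sorted(selected)

feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1]])
scores = np.array([0.9, 0.1, 0.2, 0.8])
chosen = select_samples(feats, scores, n_clusters=2)  # one per cluster
```

Picking per cluster rather than globally prevents the selection from collecting many near-duplicate high-scoring samples.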
RECOGNITION AND POSITIONING DEVICE AND INFORMATION CONVERSION DEVICE
A recognition and positioning device includes: a CV video obtaining unit that generates a CV video by adding a CV value to an object video; an object designation unit that designates a recognition target object captured across a plurality of frames of the CV video; a consecutive-frame machine learning unit that repeatedly executes a recognition process through machine learning; a three-dimensional coordinate computation and object coordinate assigning unit that associates recognized objects across the frames of the CV video, identifies the objects whose captured coordinates match in corresponding frames, and assigns three-dimensional position coordinates to those objects; and a coordinate assigning and recognition output unit that, by repeating the recognition of objects and the assignment of three-dimensional position coordinates, assigns and outputs the three-dimensional coordinates of objects for which a predetermined recognition certainty degree and three-dimensional coordinate accuracy are obtained.
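The cross-frame association step — linking recognized objects whose computed coordinates agree across frames and assigning a 3D position only once the agreement is consistent — can be sketched roughly as follows. The per-frame dict format, tolerance, and two-frame minimum are all assumptions:

```python
def assign_coordinates(frames, tol=0.5, min_frames=2):
    """Associate per-frame recognitions by label and assign a 3D position to
    objects whose coordinate estimates agree within `tol` across enough frames.

    `frames` is a list of dicts mapping an object label to an (x, y, z)
    estimate for that frame -- a toy stand-in for CV-video recognition output.
    """
    tracks = {}
    for detections in frames:
        for label, coord in detections.items():
            tracks.setdefault(label, []).append(coord)

    positioned = {}
    for label, coords in tracks.items():
        if len(coords) < min_frames:
            continue  # not observed consistently enough
        spreads = [max(axis) - min(axis) for axis in zip(*coords)]
        if all(s <= tol for s in spreads):
            # Assign the mean estimate as the object's 3D position coordinate.
            positioned[label] = tuple(sum(axis) / len(axis) for axis in zip(*coords))
    return positioned

frames = [
    {"sign": (1.0, 2.0, 0.0)},
    {"sign": (1.1, 2.0, 0.0), "ghost": (9.0, 9.0, 9.0)},
]
result = assign_coordinates(frames)  # only "sign" survives the consistency check
```

Objects seen in too few frames, or with scattered coordinate estimates, are filtered out — a crude analogue of the certainty and accuracy thresholds.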
OBSTACLE DETECTION METHOD AND APPARATUS, DEVICE, AND MEDIUM
This application discloses an obstacle detection method, including: obtaining a first image, where the first image is encoded based on an RGB model; reconstructing the first image to obtain a second image, where the second image is a hyperspectral image; and extracting a hyperspectral feature from the hyperspectral image and classifying a candidate object in the hyperspectral image based on the hyperspectral feature to obtain an obstacle detection result. Because different textures correspond to different hyperspectral features, classifying candidate objects based on their hyperspectral features can distinguish objects that have similar colors but different textures.
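The classification step rests on materials that look alike in RGB having distinct spectral signatures. A nearest-signature classifier over made-up band values illustrates the idea (the signatures, band count, and material names are invented for the sketch):

```python
# Hypothetical 5-band reflectance signatures for two materials that look
# alike in RGB (first three bands) but diverge in the extra bands.
signatures = {
    "painted_surface": [0.8, 0.7, 0.6, 0.2, 0.1],
    "vegetation":      [0.8, 0.7, 0.6, 0.7, 0.9],  # strong near-infrared response
}

def classify_spectrum(spectrum):
    """Label a candidate pixel by its closest known hyperspectral signature."""
    def dist(sig):
        return sum((a - b) ** 2 for a, b in zip(spectrum, sig))
    return min(signatures, key=lambda name: dist(signatures[name]))

label = classify_spectrum([0.8, 0.7, 0.6, 0.65, 0.85])
print(label)  # vegetation: the extra bands break the RGB tie
```

An RGB-only classifier would see identical first-three-band values for both materials; the reconstructed bands carry the discriminative texture/material information.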
METHOD, APPARATUS, DEVICE AND STORAGE MEDIUM FOR PROCESSING IMAGE
A method, an apparatus, a device and a storage medium for processing an image are provided. The method includes: acquiring a target video including a target image frame and at least one image frame of a labeled target object; based on the labeled target object in the at least one image frame, determining a search area for the target object in the target image frame; based on the search area, determining center position information of the target object; based on a labeled area in which the target object is located and the center position information, determining a target object area; and based on the target object area, segmenting the target image frame.
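One plausible reading of the search-area step is enlarging the labeled box from a previous frame by a margin and then locating the object's center inside it. The margin value, box format, and peak-picking rule below are assumptions, not the patent's exact procedure:

```python
def search_area(labeled_box, margin=0.5):
    """Expand a labeled (x, y, w, h) box by `margin` of its size on each side
    to obtain the search area for the target object in the target frame."""
    x, y, w, h = labeled_box
    return (x - margin * w, y - margin * h,
            w * (1 + 2 * margin), h * (1 + 2 * margin))

def peak_center(response):
    """Take the peak of a (hypothetical) response map over the search area
    as the target object's center position."""
    best = max(
        (value, (row, col))
        for row, line in enumerate(response)
        for col, value in enumerate(line)
    )
    return best[1]

area = search_area((10.0, 20.0, 40.0, 30.0))   # expanded search region
center = peak_center([[0.1, 0.2], [0.3, 0.9]])  # (row, col) of the peak
```

The center position and the labeled area together then bound the target object area that is fed to segmentation.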
PRODUCT DEFECT DETECTION METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM
A product defect detection method and apparatus, an electronic device, and a storage medium are provided. The method includes: acquiring a multi-channel image of a target product; inputting the multi-channel image to a defect detection model, wherein the defect detection model includes a plurality of convolutional branches, a merging module, and a convolutional head branch; performing feature extraction on each channel of the multi-channel image by using the plurality of convolutional branches to obtain a plurality of pieces of first characteristic information; merging the plurality of pieces of first characteristic information by using the merging module to obtain second characteristic information; performing feature extraction on the second characteristic information by using the convolutional head branch to obtain third characteristic information to be output by the defect detection model; and determining defect information of the target product based on the third characteristic information.
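The branch/merge/head structure can be sketched with deliberately simple stand-ins for each stage — scalar-weight "branches", sum merging, and a mean-subtracting "head" are illustrative only, not the model's real layers:

```python
import numpy as np

def branch(channel, weight):
    """Stand-in for one per-channel convolutional branch: a scalar weighting."""
    return channel * weight

def defect_detect(image, branch_weights):
    """Sketch of the model: per-channel branches, merging module, head branch."""
    firsts = [branch(image[..., c], w) for c, w in enumerate(branch_weights)]
    second = np.sum(np.stack(firsts), axis=0)        # merging module: sum the branches
    third = np.maximum(second - second.mean(), 0.0)  # head: highlight anomalies
    return third

img = np.zeros((4, 4, 3))
img[1, 2, :] = 1.0                                   # a bright "defect" spot
out = defect_detect(img, branch_weights=[0.5, 0.3, 0.2])
defect_pixels = np.argwhere(out > 0)                 # locations flagged as defects
```

Keeping a separate branch per channel lets each channel of a multi-channel capture (e.g. different illuminations) contribute its own features before fusion.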
METHOD AND DEVICE FOR VISUAL QUESTION ANSWERING, COMPUTER APPARATUS AND MEDIUM
The present disclosure provides a method for visual question answering, which relates to a field of computer vision and natural language processing. The method includes: acquiring an input image and an input question; constructing a Visual Graph based on the input image, wherein the Visual Graph comprises a Node Feature and an Edge Feature; updating the Node Feature by using the Node Feature and the Edge Feature to obtain an updated Visual Graph; determining a question feature based on the input question; fusing the updated Visual Graph and the question feature to obtain a fused feature; and generating a predicted answer for the input image and the input question based on the fused feature. The present disclosure further provides an apparatus for visual question answering, a computer device and a non-transitory computer-readable storage medium.
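The graph-update and fusion steps can be illustrated with a minimal message-passing pass. Scalar edge weights, a residual update, and softmax pooling are assumptions standing in for the patent's actual update and fusion rules:

```python
import numpy as np

def update_nodes(node_feat, edge_feat):
    """One update of the Node Feature using the Edge Feature: each node adds
    the edge-weighted sum of its neighbors' features (residual update).

    node_feat: (N, D) node features; edge_feat: (N, N) scalar edge weights.
    """
    return node_feat + edge_feat @ node_feat

def fuse_with_question(nodes, question_feat):
    """Fuse the updated Visual Graph with the question feature: softmax
    attention of the question over nodes, then a weighted sum (fused feature)."""
    scores = nodes @ question_feat
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ nodes

nodes = np.eye(2)                              # two nodes with one-hot features
edges = np.array([[0.0, 1.0], [1.0, 0.0]])     # a single symmetric edge
updated = update_nodes(nodes, edges)
fused = fuse_with_question(updated, np.array([1.0, 0.0]))
```

After the update each node carries its neighbor's feature as well as its own, which is what lets the fused feature reflect relations between image regions rather than isolated region features.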
System and Method for Sensor Fusion System Having Distributed Convolutional Neural Network
An early fusion network is provided that reduces network load and enables easier design of specialized ASIC edge processors by performing a portion of the convolutional neural network layers at distributed edge and data-network processors before transmitting data to a centralized processor for fully connected/deconvolutional neural network processing. Embodiments can provide convolution and downsampling layer processing in association with the digital signal processors associated with edge sensors. Once the raw data is reduced to smaller feature maps through the convolution-downsampling process, this reduced data is transmitted to a central processor for further processing such as regression, classification, and segmentation, along with feature combination of the data from the sensors. In some embodiments, feature combination can be distributed to gateway or switch nodes closer to the edge sensors, further reducing the data transferred to the central node and the amount of computation performed there.
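The data-reduction argument can be made concrete: if each edge node runs a downsampling stage before transmitting, the central node receives much smaller feature maps. The average-pool stand-in for the edge's convolution-downsampling stage and the mean-based feature combination are illustrative only:

```python
import numpy as np

def edge_stage(raw, pool=2):
    """Edge-side stand-in for convolution + downsampling: average pooling.

    Runs near the sensor so only the reduced feature map is transmitted.
    """
    h, w = raw.shape
    trimmed = raw[:h - h % pool, :w - w % pool]
    th, tw = trimmed.shape
    return trimmed.reshape(th // pool, pool, tw // pool, pool).mean(axis=(1, 3))

def central_stage(feature_maps):
    """Central processor: combine the reduced maps from all sensors (mean here)."""
    return np.mean(np.stack(feature_maps), axis=0)

sensors = [np.full((8, 8), v) for v in (1.0, 2.0, 3.0)]
reduced = [edge_stage(s) for s in sensors]   # each sensor sends 16 values, not 64
fused = central_stage(reduced)               # combined map at the central node
```

With a 2x2 pool, each sensor transmits a quarter of its raw data; stacking more edge-side layers compounds the reduction, which is the network-load argument above.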
RECOGNITION OF OBJECTS IN IMAGES WITH EQUIVARIANCE OR INVARIANCE IN RELATION TO THE OBJECT SIZE
A method for recognizing at least one object in at least one input image. In the method, a template image of the object is processed by a first convolutional neural network (CNN) to form at least one template feature map, and the input image is processed by a second CNN to form at least one input feature map. The at least one template feature map is compared to the at least one input feature map, and from the result of the comparison it is evaluated whether, and if so at which position, the object is contained in the input image. The convolutional neural networks each contain multiple convolutional layers, and at least one of the convolutional layers is at least partially formed from at least two filters that are convertible into one another by a scaling operation.
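The scaled-filter idea can be illustrated by matching a template at several scales, where the scaled copies are "convertible into one another by a scaling operation". Nearest-neighbor upscaling via `np.kron` and a raw dot-product comparison are simplifications of the CNNs' feature-map comparison:

```python
import numpy as np

def correlate(image, template):
    """Sliding dot-product of the template over the image ('valid' positions)."""
    th, tw = template.shape
    h, w = image.shape
    out = np.zeros((h - th + 1, w - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + th, j:j + tw] * template)
    return out

def find_object(image, template, scales=(1, 2)):
    """Compare the input against scaled copies of the template filter and
    return the best-matching position over all scales."""
    best_score, best_pos = -np.inf, None
    for s in scales:
        scaled = np.kron(template, np.ones((s, s)))  # the scaling operation
        resp = correlate(image, scaled)
        pos = np.unravel_index(np.argmax(resp), resp.shape)
        if resp[pos] > best_score:
            best_score, best_pos = resp[pos], pos
    return best_pos

img = np.zeros((8, 8))
img[3:5, 4:6] = 1.0                     # a 2x2 object in the input image
pos = find_object(img, np.ones((1, 1))) # the x2 scale matches it best
```

Because the filter bank contains scaled versions of the same filter, the response stays strong when the object appears larger or smaller than the template, which is the size equivariance/invariance the title refers to.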