Patent classifications
G06V10/806
Auxiliary filtering device of electronic device and cellphone
An auxiliary filtering device for face recognition is provided. The auxiliary filtering device excludes an ineligible object from identification according to the relationship between object distance and image size, the variation of the image over time, and/or the feature differences between images captured by different cameras, thereby preventing the face recognition from being defeated by a photo or a video.
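As a rough illustration of the distance-versus-size consistency check described above, the sketch below uses a pinhole camera model: a flat photo held near the camera produces a face whose pixel size is inconsistent with its measured distance. The focal length and real face width are illustrative assumptions, not values from the patent.

```python
import numpy as np

def expected_face_width_px(distance_m, focal_px=800.0, real_face_width_m=0.15):
    """Pinhole model: apparent width in pixels falls off as 1/distance."""
    return focal_px * real_face_width_m / distance_m

def is_plausible_face(measured_width_px, distance_m, tolerance=0.25):
    """Reject candidates whose image size is inconsistent with their distance."""
    expected = expected_face_width_px(distance_m)
    return abs(measured_width_px - expected) / expected <= tolerance
```

A real face 0.5 m away should appear roughly 240 px wide under these assumptions, whereas a small printed photo at the same distance would fail the check.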
System and method for providing an interpretable and unified representation for trajectory prediction
A system and method for providing an interpretable and unified representation for trajectory prediction that includes receiving birds-eye image data associated with travel of at least one agent within a roadway environment. The system and method also include analyzing the birds-eye image data to determine a potential field associated with the roadway environment and analyzing the birds-eye image data to determine a potential field associated with a past trajectory of the at least one agent. The system and method further include predicting a future trajectory of the at least one agent based on analysis of the potential fields.
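The potential-field idea above can be caricatured in a few lines: build an attractive field over a grid around a goal, then roll a trajectory out by greedy descent on the field. This is a toy stand-in for the learned potential fields the abstract describes, not the disclosed method.

```python
import numpy as np

def goal_potential(grid_shape, goal):
    """Quadratic attractive potential centred on the goal cell."""
    ys, xs = np.indices(grid_shape)
    return (ys - goal[0]) ** 2 + (xs - goal[1]) ** 2

def predict_trajectory(potential, start, steps=10):
    """Greedy descent on the potential field as a stand-in for prediction."""
    pos = start
    traj = [pos]
    for _ in range(steps):
        y, x = pos
        best = pos
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < potential.shape[0] and 0 <= nx < potential.shape[1]:
                    if potential[ny, nx] < potential[best]:
                        best = (ny, nx)
        if best == pos:
            break  # local minimum reached
        pos = best
        traj.append(pos)
    return traj
```

In the patent, the field would be derived from birds-eye image data and the agent's past trajectory rather than hand-coded.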
Multi-modal segmentation network for enhanced semantic labeling in mapping
Provided are methods for enhanced semantic labeling in mapping with a semantic labeling system, which can include: receiving, from a LiDAR sensor of a vehicle, LiDAR point cloud information including at least one raw point feature for a point; receiving, from a camera of the vehicle, image data associated with an image captured using the camera; generating at least one rich point feature for the point based on the image data; predicting, using a LiDAR segmentation neural network and based on the at least one raw point feature and the at least one rich point feature, a point-level semantic label for the point; and providing the point-level semantic label to a mapping engine to generate a map based on the point-level semantic label. Systems and computer program products are also provided.
IMAGE PROCESSING METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
The present disclosure relates to an image processing method and apparatus, an electronic device, and a storage medium. The method includes performing feature extraction on an image to be processed to obtain a first feature map of the image, and performing weight prediction on the first feature map to obtain a weight feature map. The weight feature map includes weight values of feature points in the first feature map. The method further includes performing feature value adjustment on the feature points in the first feature map based on the weight feature map to obtain a second feature map, and determining a processing result of the image to be processed according to the second feature map. Embodiments of the present disclosure may improve image processing accuracy.
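A minimal numeric sketch of the reweighting step described above: a stand-in "weight prediction" (here a sigmoid over channel means, not the learned head of the disclosure) produces per-position weights that scale the first feature map into the second.

```python
import numpy as np

def predict_weights(feature_map):
    """Stand-in weight prediction: sigmoid over channel means, shape (1, H, W)."""
    return 1.0 / (1.0 + np.exp(-feature_map.mean(axis=0, keepdims=True)))

def reweight(feature_map):
    """Scale each spatial position of the first feature map by its predicted weight."""
    weights = predict_weights(feature_map)  # weight feature map, (1, H, W)
    return feature_map * weights            # second feature map, (C, H, W)
```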
METHOD FOR TRAINING IMAGE RECOGNITION MODEL BASED ON SEMANTIC ENHANCEMENT
Embodiments of the present disclosure provide a method and apparatus for training an image recognition model based on semantic enhancement, a method and apparatus for recognizing an image, an electronic device, and a computer-readable storage medium. The method for training an image recognition model based on semantic enhancement comprises: extracting, from an input first image that is unannotated and has no textual description, a first feature representation of the first image; calculating a first loss function based on the first feature representation; extracting, from an input second image that is unannotated but has an original textual description, a second feature representation of the second image; calculating a second loss function based on the second feature representation; and training an image recognition model based on a fusion of the first loss function and the second loss function.
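The loss-fusion step might be sketched as follows. Both component losses are placeholders standing in for the self-supervised loss (images without text) and the image-text alignment loss (images with a caption) implied by the abstract; the weighting scheme is an assumption.

```python
import numpy as np

def contrastive_loss(feat):
    """Placeholder self-supervised loss for images with no textual description."""
    return float(np.mean(feat ** 2))

def image_text_loss(feat, text_feat):
    """Placeholder alignment loss for images that do have a caption."""
    return float(np.mean((feat - text_feat) ** 2))

def fused_loss(feat1, feat2, text_feat, alpha=0.5):
    """Fuse the two losses into one training objective, as the abstract describes."""
    return alpha * contrastive_loss(feat1) + (1 - alpha) * image_text_loss(feat2, text_feat)
```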
SYSTEMS AND METHODS FOR ESTIMATING VISIBILITY IN A SCENE
Systems and methods herein provide for improving visibility in a scene. In one embodiment, a system includes a first camera device operable to capture images of a scene at a first band of wavelengths, and a second camera device operable to capture images of the scene at a second band of wavelengths. The first and second bands are different. The system also includes a processor communicatively coupled to the first and second camera devices, the processor being operable to detect an object in the scene based on a first of the images from the first camera device and based on a first of the images from the second camera device that was captured at substantially a same time as the first image from the first camera device, to estimate an obscurant in the scene based on the first images, and to estimate a visibility parameter of the scene based on the object and the estimated obscurant.
LIGHTWEIGHT TRANSFORMER FOR HIGH RESOLUTION IMAGES
Systems and methods for obtaining attention features are described. Some examples may include: receiving, at a projector of a transformer, a plurality of tokens associated with image features of a first dimensional space; generating, at the projector of the transformer, projected features by concatenating the plurality of tokens with a positional map, the projected features having a second dimensional space that is less than the first dimensional space; receiving, at an encoder of the transformer, the projected features and generating encoded representations of the projected features using self-attention; decoding, at a decoder of the transformer, the encoded representations and obtaining a decoded output; and projecting the decoded output to the first dimensional space and adding the image features of the first dimensional space to obtain attention features associated with the image features.
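A toy version of the projector/encoder/decoder path described above: concatenate a positional map, project down to a smaller dimensional space, self-attend there (where attention is cheap), project back up, and add the residual image features. The weight matrices here are random stand-ins for learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def lightweight_attention(tokens, pos_map, w_down, w_up):
    """Project tokens to a small space, self-attend there, project back, add residual."""
    # concatenate the positional map, then linearly project to the small space
    projected = np.concatenate([tokens, pos_map], axis=-1) @ w_down  # (N, d_small)
    attn = softmax(projected @ projected.T / np.sqrt(projected.shape[-1]))
    encoded = attn @ projected                                        # self-attention
    return tokens + encoded @ w_up                                    # back to (N, d_big)
```

Attention cost scales with the square of the token dimension being attended over, which is why working in the smaller space helps at high resolution.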
MULTI-RESOLUTION NEURAL NETWORK ARCHITECTURE SEARCH SPACE FOR DENSE PREDICTION TASKS
Systems and methods for searching a search space are disclosed. Some examples may include using a first parallel module including a first plurality of stacked searching blocks and a second plurality of stacked searching blocks to output first feature maps of a first resolution and to output second feature maps of a second resolution. In some examples, a fusion module may include a plurality of searching blocks, where the fusion module is configured to generate multiscale feature maps by fusing one or more feature maps of the first resolution received from the first parallel module with one or more feature maps of the second resolution received from the first parallel module, and wherein the fusion module is configured to output the multiscale feature maps and output third feature maps of a third resolution.
Method for processing images, electronic device, and storage medium
A method for processing images includes: detecting a plurality of human face key points of a three-dimensional human face in a target image; acquiring a virtual makeup image, wherein the virtual makeup image includes a plurality of reference key points, the reference key points indicating human face key points of a two-dimensional human face; and acquiring a fused target image by fusing the virtual makeup image and the target image, with each reference key point in the virtual makeup image aligned with its corresponding human face key point.
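The alignment step can be approximated by fitting an affine transform from the 2D reference key points to the detected face key points via least squares, then warping the makeup layer before blending. The blending itself is omitted here; the helper names are illustrative, not from the patent.

```python
import numpy as np

def fit_affine(src_pts, dst_pts):
    """Least-squares 2D affine mapping reference key points to face key points."""
    n = len(src_pts)
    A = np.hstack([src_pts, np.ones((n, 1))])  # homogeneous coordinates, (n, 3)
    M, *_ = np.linalg.lstsq(A, dst_pts, rcond=None)
    return M                                   # (3, 2): apply as [x, y, 1] @ M

def warp_points(pts, M):
    """Apply the fitted affine transform to a set of 2D points."""
    return np.hstack([pts, np.ones((len(pts), 1))]) @ M
```

In practice the same transform would be applied to every pixel of the makeup image (e.g. via an image-warping routine) before alpha-blending it onto the target.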
System, client terminal, control method for system, and storage medium
An estimation apparatus includes an acquisition unit configured to acquire data about a domesticated animal identified by identification information transmitted by a client terminal, and an estimation unit configured to perform estimation by inputting the acquired data into a trained model generated by machine learning based on captured image data of domesticated animals and collected data about domesticated animals, and to provide, to the client terminal, an estimation result indicating a result of the estimation. The client terminal includes a presenting unit configured to transmit a request for estimation together with identification information identifying a domesticated animal targeted for estimation, receive the estimation result, and present body weight data about the targeted animal to a user.