Patent classification: G06V10/426
MULTI-SENSOR SEQUENTIAL CALIBRATION SYSTEM
Techniques for performing sensor calibration using sequential data are disclosed. An example method includes receiving, from a first camera located on a vehicle, a first image comprising at least a portion of a road comprising lane markers, where the first image is obtained by the first camera at a first time; obtaining a calculated value of a position of an inertial measurement (IM) device at the first time; obtaining an optimized first extrinsic matrix of the first camera by adjusting a function of a first actual pixel location of a lane marker in the first image and an expected pixel location of the lane marker; and performing autonomous operation of the vehicle using the optimized first extrinsic matrix of the first camera when the vehicle is operated on another road or at another time.
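The "adjusting a function of actual and expected pixel locations" step can be sketched with a deliberately simplified model: here the extrinsic is reduced to a 2-D pixel translation, and gradient descent minimizes the squared error between actual and expected lane-marker pixel locations. The function name, the learning rate, and all coordinates are illustrative assumptions, not the patented method (which optimizes a full extrinsic matrix against IMU-derived pose).

```python
# Simplified sketch: recover a translation-only "extrinsic" by minimizing
# the squared error between actual and expected lane-marker pixel locations.
def optimize_translation(actual_px, expected_px, lr=0.1, steps=200):
    """Gradient descent on the mean squared pixel residual."""
    tx, ty = 0.0, 0.0
    n = len(actual_px)
    for _ in range(steps):
        gx = gy = 0.0
        for (ax, ay), (ex, ey) in zip(actual_px, expected_px):
            # residual = (expected + offset) - actual
            gx += 2 * (ex + tx - ax)
            gy += 2 * (ey + ty - ay)
        tx -= lr * gx / n
        ty -= lr * gy / n
    return tx, ty

# Lane markers whose projections are shifted by a (3, -2) pixel miscalibration
expected = [(100.0, 50.0), (120.0, 55.0), (140.0, 60.0)]
actual = [(ex + 3.0, ey - 2.0) for ex, ey in expected]
tx, ty = optimize_translation(actual, expected)
```

The recovered offset converges to the injected (3, -2) miscalibration; in the real system the same residual would be minimized over rotation and translation jointly.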
Multi-modal document feature extraction
Systems and methods are described for generating a machine learning model for multi-modal feature extraction. The method may include receiving a document in a digital format, where the digital format comprises text information and image information, performing a text extraction function on a first portion of the document to produce a set of text features, performing an image extraction function on a second portion of the document to produce a set of image features, generating a feature tree, wherein a plurality of nodes of the feature tree correspond to the set of text features and the set of image features, and generating an input vector for a machine learning model based on the feature tree. In some cases, the feature tree may be generated synthetically, or modified by a user prior to being converted into the input vector.
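The feature-tree idea above can be illustrated with a toy structure: text and image features become nodes of a tree, and a depth-first traversal flattens the tree into one input vector for a downstream model. The `Node` class, the layout, and the feature values are invented for illustration.

```python
# Hypothetical sketch: a tree whose nodes hold text or image features,
# flattened depth-first into a single model input vector.
class Node:
    def __init__(self, features, children=()):
        self.features = list(features)
        self.children = list(children)

def to_input_vector(root):
    """Depth-first flatten: concatenate every node's features."""
    vec = list(root.features)
    for child in root.children:
        vec.extend(to_input_vector(child))
    return vec

text_node = Node([0.2, 0.7])    # e.g. features from the text-extraction function
image_node = Node([0.9, 0.1])   # e.g. features from the image-extraction function
root = Node([1.0], [text_node, image_node])
vec = to_input_vector(root)
```

A user-modified tree (add, drop, or reorder nodes) would simply change the traversal order before conversion, matching the abstract's note that the tree may be edited prior to vectorization.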
UNSUPERVISED IMAGE SEGMENTATION METHOD AND ELECTRONIC DEVICE
An unsupervised image segmentation method includes: performing a superpixel segmentation on an image containing a target object to acquire a plurality of superpixel sets, each superpixel set corresponding to a respective superpixel node; generating an undirected graph according to superpixel nodes; determining foreground superpixel nodes and background superpixel nodes in the undirected graph according to a first label set corresponding to the plurality of superpixel nodes; generating a minimization objective function according to the foreground superpixel nodes and the background superpixel nodes; segmenting the undirected graph according to the minimization objective function to acquire a foreground part and a background part and to generate a second label set; and performing an image segmentation on the image according to a comparison result of the first label set and the second label set.
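The iterate-and-compare loop implied by the abstract can be sketched as follows: segment the superpixel nodes, compare the new label set with the previous one, and stop once they agree. The `segment` stand-in here is a simple threshold on a scalar node feature, not a real graph cut; all values are invented.

```python
# Sketch of the label-set comparison loop; "segment" is a toy stand-in
# for the minimization-objective graph cut described in the abstract.
def segment(node_features, threshold=0.5):
    """Label each superpixel node foreground (1) or background (0)."""
    return [1 if f > threshold else 0 for f in node_features]

def iterate_labels(node_features, initial_labels, max_iters=10):
    labels = initial_labels                  # first label set
    for _ in range(max_iters):
        new_labels = segment(node_features)  # second label set
        if new_labels == labels:             # comparison result: converged
            break
        labels = new_labels
    return labels

features = [0.9, 0.8, 0.2, 0.1]  # per-superpixel mean intensity (invented)
labels = iterate_labels(features, initial_labels=[0, 0, 0, 0])
```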
Generating scene graphs from digital images using external knowledge and image reconstruction
Methods, systems, and non-transitory computer readable storage media are disclosed for generating semantic scene graphs for digital images using an external knowledgebase for feature refinement. For example, the disclosed system can determine object proposals and subgraph proposals for a digital image to indicate candidate relationships between objects in the digital image. The disclosed system can then extract relationships from an external knowledgebase for refining features of the object proposals and the subgraph proposals. Additionally, the disclosed system can generate a semantic scene graph for the digital image based on the refined features of the object/subgraph proposals. Furthermore, the disclosed system can update/train a semantic scene graph generation network based on the generated semantic scene graph. The disclosed system can also reconstruct the image using object labels based on the refined features to further update/train the semantic scene graph generation network.
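The external-knowledgebase refinement step can be illustrated with a toy lookup: candidate object pairs are matched against a knowledgebase of plausible relationships to produce scene-graph triples. The knowledgebase contents and the proposal format are invented; the disclosed system uses the retrieved relationships to refine learned features rather than to emit triples directly.

```python
# Illustrative sketch only: known (subject, object) pairs from an external
# knowledgebase yield (subject, predicate, object) scene-graph triples.
KNOWLEDGEBASE = {
    ("person", "horse"): "riding",
    ("person", "hat"): "wearing",
}

def build_scene_graph(object_proposals):
    """Return triples for every proposal pair found in the knowledgebase."""
    triples = []
    for subj in object_proposals:
        for obj in object_proposals:
            predicate = KNOWLEDGEBASE.get((subj, obj))
            if predicate:
                triples.append((subj, predicate, obj))
    return triples

graph = build_scene_graph(["person", "horse", "tree"])
```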
Visual Camera Re-Localization using Graph Neural Networks and Relative Pose Supervision
The present disclosure describes approaches to camera re-localization using a graph neural network (GNN). The re-localization model encodes an input image into a feature map. The model retrieves reference images from an image database of a previously scanned environment based on the feature map of the image. The model builds a graph from the image and the reference images, wherein nodes represent the image and the reference images, and edges are defined between the nodes. The model may iteratively refine the graph through auto-regressive edge updating and message passing between nodes. With the graph built, the model predicts a pose of the image based on the edges of the graph. The pose may be a relative pose in relation to the reference images, or an absolute pose.
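A single round of the message passing described above can be sketched as follows: each node (the query image or a retrieved reference image) updates its feature vector by averaging messages from its neighbors. The encoder, edge updates, and pose regression head are omitted; node names and feature values are illustrative assumptions.

```python
# Toy message-passing step over a query/reference image graph.
def message_passing(node_feats, edges, rounds=1):
    feats = {n: list(f) for n, f in node_feats.items()}
    for _ in range(rounds):
        updated = {}
        for node, f in feats.items():
            # messages arrive along outgoing edges (node -> neighbor)
            msgs = [feats[v] for u, v in edges if u == node]
            if msgs:
                # new feature = element-wise mean of own feature and messages
                stacked = [f] + msgs
                updated[node] = [sum(col) / len(stacked) for col in zip(*stacked)]
            else:
                updated[node] = f
        feats = updated
    return feats

nodes = {"query": [0.0, 0.0], "ref1": [1.0, 0.0], "ref2": [0.0, 1.0]}
edges = [("query", "ref1"), ("query", "ref2")]
out = message_passing(nodes, edges)
```

In the described model the refined edge features, not the node features, carry the relative-pose signal, but the aggregation pattern is the same.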
Visual relationship detection method and system based on adaptive clustering learning
The present disclosure describes a visual relationship detection method based on adaptive clustering learning, including: detecting visual objects from an input image and recognizing the visual objects to obtain context representations; embedding the context representations of pair-wise visual objects into a low-dimensional joint subspace to obtain a visual relationship sharing representation; embedding the context representations into a plurality of low-dimensional clustering subspaces, respectively, to obtain a plurality of preliminary visual relationship enhancing representations, which are then regularized by a clustering-driven attention mechanism; and fusing the visual relationship sharing representations and the regularized visual relationship enhancing representations with a prior distribution over the category labels of visual relationship predicates to predict visual relationship predicates through synthetic relational reasoning. The method can recognize fine-grained visual relationships of different subclasses by mining the latent relationships between them, which improves the accuracy of visual relationship detection.
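The final fusion step can be sketched as combining a shared score and a cluster-enhanced score with a log-prior over predicate labels before a softmax. The two-term fusion, the score values, and the predicate set are invented for illustration; the actual method fuses full representations, not scalar scores.

```python
import math

# Rough sketch of fusing shared/enhanced predicate scores with a label prior.
def fuse_and_predict(shared_scores, enhanced_scores, prior):
    logits = [s + e + math.log(p)
              for s, e, p in zip(shared_scores, enhanced_scores, prior)]
    m = max(logits)                       # stabilize the softmax
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [x / total for x in exps]

# Three candidate predicates, e.g. "on", "riding", "holding" (invented)
probs = fuse_and_predict([1.0, 2.0, 0.5], [0.5, 1.0, 0.0], [0.5, 0.3, 0.2])
best = probs.index(max(probs))            # index of the predicted predicate
```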
METHODS AND SYSTEMS FOR GROUND SEGMENTATION USING GRAPH-CUTS
Systems and methods for segmenting scan data are disclosed. The methods include receiving scan data representing a plurality of points in an environment associated with a ground surface and one or more objects, and creating a graph from the scan data. The graph includes a plurality of vertices corresponding to the plurality of points. The method further includes assigning a unary potential to each of the plurality of vertices that is a cost of assigning that vertex to a ground label or a non-ground label, and assigning a pairwise potential to each pair of neighboring vertices in the graph that is the cost of assigning different labels to neighboring vertices. The methods include using the unary potentials and the pairwise potentials to identify labels for each of the plurality of points, and segmenting the scan data to identify points associated with the ground based on the identified labels.
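The unary/pairwise formulation above is a classic graph-cut energy: unary potentials become edges to a source ("ground") and sink ("non-ground") terminal, pairwise potentials become edges between neighboring vertices, and a minimum cut yields the labeling. Below is a textbook Edmonds-Karp max-flow solver on a tiny hand-built graph; the potential values are invented toy numbers, not the method's learned costs.

```python
from collections import deque

# Minimal graph-cut sketch: min-cut via Edmonds-Karp max-flow.
def max_flow_min_cut(capacity, source, sink):
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]
    while True:
        # BFS for an augmenting path in the residual graph
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and capacity[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break
        # Augment along the found path
        bottleneck = float("inf")
        v = sink
        while v != source:
            u = parent[v]
            bottleneck = min(bottleneck, capacity[u][v] - flow[u][v])
            v = u
        v = sink
        while v != source:
            u = parent[v]
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck
            v = u
    # Vertices reachable from source in the residual graph = "ground" side
    reachable = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if v not in reachable and capacity[u][v] - flow[u][v] > 0:
                reachable.add(v)
                queue.append(v)
    return reachable

# Nodes: 0 = source (ground terminal), 3 = sink (non-ground terminal);
# 1 and 2 are scan points. capacity[0][p] = unary cost of labeling p
# non-ground, capacity[p][3] = unary cost of labeling p ground, and
# capacity[1][2] = pairwise cost of giving the neighbors different labels.
cap = [[0] * 4 for _ in range(4)]
cap[0][1] = 9   # point 1 strongly prefers the ground label
cap[1][3] = 1
cap[0][2] = 2   # point 2 weakly prefers non-ground
cap[2][3] = 8
cap[1][2] = cap[2][1] = 1
ground_side = max_flow_min_cut(cap, source=0, sink=3)
```

Here the cut keeps point 1 on the ground side despite the pairwise smoothness term, because its unary preference dominates.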
Computer architecture for identifying data clusters using correlithm objects and machine learning in a correlithm object processing system
A device that includes a model training engine implemented by a processor. The model training engine is configured to obtain a set of data values associated with a feature vector. The model training engine is further configured to transform a first data value and a second data value from the set of data values into sub-string correlithm objects. The model training engine is further configured to compute a Hamming distance between the first sub-string correlithm object and the second sub-string correlithm object and to identify a boundary in response to determining that the Hamming distance exceeds a bit difference threshold value. The model training engine is further configured to determine a number of identified boundaries, to determine a number of clusters based on the number of identified boundaries, and to train a machine learning model to associate the determined number of clusters with the feature vector.
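The boundary-detection idea can be sketched directly: consecutive values are encoded as bit strings, a boundary is declared wherever the Hamming distance between neighbors exceeds a threshold, and the cluster count is boundaries plus one. The bit encodings and threshold below are toy choices, not the patented correlithm-object encoding.

```python
# Sketch: count clusters from Hamming-distance boundaries between neighbors.
def hamming(a, b):
    """Number of differing bit positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def count_clusters(bitstrings, bit_difference_threshold):
    boundaries = sum(
        1 for a, b in zip(bitstrings, bitstrings[1:])
        if hamming(a, b) > bit_difference_threshold
    )
    return boundaries + 1

# Two tight groups separated by one large jump in bit patterns
values = ["0000", "0001", "1110", "1111"]
n_clusters = count_clusters(values, bit_difference_threshold=2)
```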
IMAGE RECOGNITION METHOD AND APPARATUS, COMPUTER-READABLE STORAGE MEDIUM, AND ELECTRONIC DEVICE
This application provides an image recognition method and apparatus, an electronic device, and a computer-readable storage medium, and relates to the field of artificial intelligence technologies. The method includes obtaining feature information corresponding to a target object in an image to be recognized, the feature information comprising blur degree information, local feature information, and global feature information; determining a category of the target object based on the feature information, and determining a confidence level corresponding to the target object; and obtaining target information corresponding to the image to be recognized according to the category of the target object and the confidence level.
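The final decision step described above can be sketched as picking the highest-scoring category and gating the result on its confidence level. The score dictionary, the category names, and the 0.6 threshold are illustrative assumptions; the actual method derives the confidence from blur, local, and global feature information.

```python
# Sketch: pick the top category and gate the output on its confidence.
def recognize(feature_scores, confidence_threshold=0.6):
    category = max(feature_scores, key=feature_scores.get)
    confidence = feature_scores[category]
    if confidence < confidence_threshold:
        # low-confidence target: withhold the category label
        return {"category": None, "confidence": confidence}
    return {"category": category, "confidence": confidence}

result = recognize({"cat": 0.15, "dog": 0.80, "fox": 0.05})
```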