Patent classifications
G06V30/1918
Systems and methods for synchronizing an image sensor
Systems and methods for synchronization are provided. In some aspects, a method for synchronizing an image sensor is provided. The method includes receiving image data captured using an image sensor that is moving along a pathway, and assembling an image sensor trajectory using the image data. The method also includes receiving position data acquired along the pathway using a position sensor, wherein timestamps for the image data and position data are asynchronous, and assembling a position sensor trajectory using the position data. The method further includes generating a spatial transformation that aligns the image sensor trajectory and position sensor trajectory, and synchronizing the image sensor based on the spatial transformation.
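The core of this method is a spatial transformation that aligns two independently assembled trajectories. As an illustrative sketch only (not the patent's actual algorithm), a rigid alignment between the trajectories can be estimated with the Kabsch method; the `align_trajectories` helper and its signature are assumptions for this example:

```python
import numpy as np

def align_trajectories(image_traj, position_traj):
    """Estimate the rigid transform (R, t) mapping the image-sensor
    trajectory onto the position-sensor trajectory (Kabsch method)."""
    A = np.asarray(image_traj, dtype=float)
    B = np.asarray(position_traj, dtype=float)
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)               # cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against a reflection
    D = np.diag([1.0] * (A.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    t = cb - R @ ca
    return R, t
```

Once `R` and `t` are known, the residual misalignment between transformed image-trajectory samples and position-trajectory samples could drive the timestamp correction.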
System and method for data extraction and searching
Systems and methods are provided for quickly and efficiently searching and receiving results for real estate-related information with minimal or no human processing of real estate-related documents. Optical character recognition is performed on a plurality of scanned document images to obtain a plurality of textual data representations of the real estate-related documents. Data is extracted from the textual data representations and subsequently contextualized according to a real estate-related context. Aspects of the extracted data, as well as the textual data representations, are provided as search results based on one or more searches for real estate-related information.
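The extraction step can be illustrated with simple pattern matching over the OCR text; the field names and regular expressions below are hypothetical examples, not the patent's actual extraction rules:

```python
import re

def extract_fields(text):
    """Pull illustrative real-estate fields out of OCR'd document text."""
    patterns = {
        "parcel_id": r"Parcel(?: ID| No\.?)[:\s]+([\w-]+)",
        "sale_price": r"\$([\d,]+(?:\.\d{2})?)",
        "recorded": r"Recorded[:\s]+(\d{2}/\d{2}/\d{4})",
    }
    # Keep only the fields whose pattern actually matched
    return {field: m.group(1)
            for field, pat in patterns.items()
            if (m := re.search(pat, text, re.IGNORECASE))}
```

Contextualization would then map these raw strings onto typed, real estate-specific records before indexing them for search.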
Personal information display system and associated method
A system for identification and/or authentication of a user in a travel terminal, the system comprising: a multiuser interactive screen having one or more interaction zones with which a user can interact; an image generation system for generating information on the interaction zone to provide information with which the user can interact; an image capture system which generates a captured image of any item in contact with the interaction zone; a recognition system for recognizing one or more features from the captured image to enable identification and/or authentication of the user; and a communication system for communicating with the user by means of the image generation system to confirm identification or request additional information.
Top-down view object detection and tracking
Tracking a current and/or previous position, velocity, acceleration, and/or heading of an object using sensor data may comprise determining whether to associate a current object detection generated from recently received (e.g., current) sensor data with a previous object detection generated from formerly received sensor data. In other words, a track may identify that an object detected in former sensor data is the same object detected in current sensor data. However, multiple types of sensor data may be used to detect objects and some objects may not be detected by different sensor types or may be detected differently, which may confound attempts to track an object. An ML model may be trained to receive outputs associated with different sensor types and/or a track associated with an object, and determine a data structure comprising a region of interest, object classification, and/or a pose associated with the object.
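The track-association step described above can be sketched as a greedy matching of previous tracks to current detections by intersection-over-union of top-down bounding boxes; the `associate` helper and its threshold are illustrative assumptions, and the patent's ML model would replace this hand-written rule:

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def associate(tracks, detections, threshold=0.3):
    """Greedily match each previous track to its best unclaimed detection."""
    matches, used = {}, set()
    for ti, tbox in enumerate(tracks):
        best, best_iou = None, threshold
        for di, dbox in enumerate(detections):
            if di in used:
                continue
            s = iou(tbox, dbox)
            if s > best_iou:
                best, best_iou = di, s
        if best is not None:
            matches[ti] = best
            used.add(best)
    return matches
```

Unmatched detections would spawn new tracks, and unmatched tracks would age out after some number of missed frames.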
Systems and methods for using image analysis to automatically determine vehicle information
The present disclosure is directed to systems and methods for analyzing digital images to determine alphanumeric strings depicted in the digital images. An electronic device may generate a set of filtered images using a received digital image. The electronic device may also perform an optical character recognition (OCR) technique on the set of filtered images, and may filter out any of the set of filtered images according to a set of rules. The electronic device may further identify a set of common elements representative of the alphanumeric string depicted in the digital image, and determine a machine-encoded alphanumeric string based on the set of common elements.
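Identifying a set of common elements across the OCR readings of the filtered images can be sketched as a per-position majority vote; `consensus_string` is a hypothetical helper, not the patent's actual procedure:

```python
from collections import Counter

def consensus_string(candidates):
    """Derive a machine-encoded string from several OCR readings by
    voting per character position among readings of the modal length."""
    lengths = Counter(len(c) for c in candidates)
    n = lengths.most_common(1)[0][0]          # most common reading length
    pool = [c for c in candidates if len(c) == n]
    return "".join(Counter(chars).most_common(1)[0][0]
                   for chars in zip(*pool))
```

The rule-based filtering described in the abstract would discard implausible readings (for example, wrong length or disallowed characters) before the vote.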
Multi-modal electronic document classification
A method comprising operating at least one hardware processor for: receiving, as input, a plurality of electronic documents; training a machine learning classifier based, at least in part, on a training set comprising (i) labels associated with the electronic documents, (ii) raw text from each of said plurality of electronic documents, and (iii) a rasterized version of each of said plurality of electronic documents; and applying said machine learning classifier to classify one or more new electronic documents.
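The multi-modal training set pairs raw text with a rasterized version of each document. A minimal sketch of fusing the two modalities into one feature vector (the hashed bag-of-words and coarse-grid featurizers below are assumptions for illustration, not the patent's method):

```python
import numpy as np

def text_features(raw_text, dim=64):
    """Hashed bag-of-words over the document's raw text (hashing trick)."""
    v = np.zeros(dim)
    for token in raw_text.lower().split():
        v[hash(token) % dim] += 1.0
    return v

def raster_features(raster, grid=8):
    """Coarse grid of mean intensities over the rasterized page."""
    raster = np.asarray(raster, dtype=float)
    h, w = raster.shape
    cells = raster[: h - h % grid, : w - w % grid]
    return cells.reshape(grid, h // grid, grid, w // grid).mean(axis=(1, 3)).ravel()

def fuse(raw_text, raster):
    """Concatenate text and layout features for a downstream classifier."""
    return np.concatenate([text_features(raw_text), raster_features(raster)])
```

Any standard classifier trained on these fused vectors plus the document labels would then classify new documents from both their wording and their visual layout.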
Using multiple cameras to perform optical character recognition
The subject matter of this specification can be implemented in, among other things, a method that includes receiving a first image from a first camera depicting a first view of a physical item, where the physical item displays a plurality of characters. The method includes receiving a second image from a second camera depicting a second view of the physical item. The method includes performing optical character recognition on the first image to identify first characters and a first layout in the first image and on the second image to identify second characters and a second layout in the second image. The method includes combining the first characters with the second characters by comparing the first characters with the second characters and the first layout with the second layout. The method includes storing the combined first and second characters.
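Combining the characters from the two views by comparing their layouts can be sketched as matching lines whose positions agree and keeping the more completely recognized reading; reducing a layout to a single vertical position per line is a simplification made for this example:

```python
def merge_views(lines_a, lines_b, tol=10):
    """Merge OCR lines from two camera views. Each line is
    (y_position, text); lines whose positions agree within `tol`
    are treated as the same physical line, and the longer (more
    completely recognized) reading is kept."""
    merged = list(lines_a)
    for yb, tb in lines_b:
        for i, (ya, ta) in enumerate(merged):
            if abs(ya - yb) <= tol:
                if len(tb) > len(ta):
                    merged[i] = (ya, tb)
                break
        else:                      # no matching line in the first view
            merged.append((yb, tb))
    return sorted(merged)
```

A fuller implementation would also reconcile character-level disagreements within matched lines, for instance using per-character OCR confidences.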
Method for recognizing text, and apparatus
A method and apparatus for recognizing text are provided. A specific embodiment of the method comprises: obtaining feature maps produced by performing text-instance segmentation on an image containing the text to be recognized; constructing a relationship graph from the feature maps, wherein each node represents a pixel in a feature map, each edge indicates that the similarity measure of the spatial semantic features of its two connected nodes exceeds a target threshold, and the spatial semantic feature of a node comprises a type feature and a position feature of the pixel represented by the node; processing the relationship graph with a pre-trained graph convolutional network to obtain a first text feature corresponding to the image; and generating a text recognition result for the image according to the first text feature.
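The relationship-graph construction can be sketched as thresholding cosine similarity between the concatenated type and position features of pixel nodes; the feature layout, the threshold value, and the similarity choice are illustrative assumptions:

```python
import numpy as np

def build_relationship_graph(type_feats, pos_feats, threshold=0.9):
    """Return the edge list of the relationship graph: nodes are pixels,
    and an edge joins two nodes whose spatial-semantic features (type
    feature concatenated with position feature) have cosine similarity
    above the target threshold."""
    feats = np.concatenate([np.asarray(type_feats, float),
                            np.asarray(pos_feats, float)], axis=1)
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    unit = feats / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T                      # pairwise cosine similarities
    n = len(feats)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if sim[i, j] > threshold]
```

The resulting edge list, with the per-node features, is exactly the input form a graph convolutional network layer would consume.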
Methods of processing data from multiple image sources to provide normalized confidence levels for use in improving performance of a recognition processor
A method comprises receiving from a first data source first recognition results which are associated with the first data source, and receiving from a second data source second recognition results which are associated with the second data source. The method further comprises, processing a first set of confidence levels associated with the first recognition results to provide a first set of normalized confidence levels associated with the first data source, and processing a second set of confidence levels associated with the second recognition results to provide a second set of normalized confidence levels associated with the second data source. The method also comprises storing the first set of normalized confidence levels associated with the first data source in a first table of normalized confidence levels and the second set of normalized confidence levels associated with the second data source in a second table of normalized confidence levels.
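The per-source normalization and table storage can be sketched with a simple min-max rescaling; the abstract does not specify the normalization function, so this choice (and the dict-of-lists table layout) is an assumption:

```python
def normalize_confidences(results):
    """Min-max normalize confidence levels within a single data source,
    so scores from different recognizers become directly comparable."""
    lo, hi = min(results), max(results)
    span = hi - lo or 1.0          # avoid dividing by zero if all equal
    return [(c - lo) / span for c in results]

tables = {}  # one table of normalized confidence levels per data source

def store(source_id, confidences):
    tables[source_id] = normalize_confidences(confidences)
```

With one table per source, the recognition processor can weigh results from both sources on a common scale rather than trusting each recognizer's raw score.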
Document information extraction
An embodiment of a method for extracting information from documents using knowledge graphs and prompt-based learning. The embodiment may receive a document and perform optical character recognition (OCR) to obtain OCR text lines and associated bounding boxes. The embodiment may encode each of the obtained OCR text lines into semantic vectors and each of the associated bounding boxes into position vectors, and generate a knowledge graph using fusion vectors derived therefrom. The embodiment may receive a query including a key value. The embodiment may identify a series of candidate nodes comprising the most similar nearby nodes positioned near a first node associated with the key value. The embodiment may generate a prompt template to determine the closeness of the candidate nodes to the key value and calculate associated confidence levels. The embodiment may output extraction information associated with the candidate node having the highest calculated confidence level.
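Identifying the most similar nearby nodes around the key node can be sketched as ranking candidates by the distance between bounding-box centers; the node structure and the `nearest_candidates` helper are hypothetical names for this example:

```python
import math

def nearest_candidates(key_box, nodes, k=3):
    """Rank candidate value nodes by distance from the key node's
    bounding-box center; the k closest become prompt candidates."""
    def center(b):
        x1, y1, x2, y2 = b
        return ((x1 + x2) / 2, (y1 + y2) / 2)
    kx, ky = center(key_box)
    scored = sorted(nodes, key=lambda n: math.dist((kx, ky), center(n["box"])))
    return scored[:k]
```

Each returned candidate would then be scored through the prompt template, and the extraction output would be the candidate with the highest resulting confidence.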