Patent classifications
G06V30/186
Object Pose Neural Network System
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting object pose. In one aspect, a method includes receiving an image of an object having one or more feature points; providing the image as an input to a neural network subsystem trained to receive images of objects and to generate an output including a heat map for each feature point; applying a differentiable transformation on each heat map to generate respective one or more feature coordinates for each feature point; providing the feature coordinates for each feature point as input to an object pose solver configured to compute a predicted object pose for the object, wherein the predicted object pose for the object specifies a position and an orientation of the object; and receiving, at the output of the object pose solver, a predicted object pose for the object in the image.
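The abstract does not say which differentiable transformation is applied to the heat maps. One common choice is a soft-argmax, which takes the expected pixel coordinate under a softmax over the heat map. The sketch below is a minimal illustration under that assumption; the function name `soft_argmax`, the `temperature` parameter, and the toy heat map are hypothetical, and the object pose solver itself is not covered.

```python
import math


def soft_argmax(heat_map, temperature=1.0):
    """Return the expected (x, y) coordinate under a softmax over the heat map.

    Assumption: soft-argmax stands in for the unspecified differentiable
    transformation; heat_map is a list of rows of raw scores.
    """
    # Subtract the maximum score before exponentiating for numerical stability.
    peak = max(max(row) for row in heat_map)
    weights = [[math.exp((v - peak) / temperature) for v in row] for row in heat_map]
    total = sum(sum(row) for row in weights)
    # Weighted average of column indices (x) and row indices (y).
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


# Toy 3x3 heat map with a strong response at column 2, row 1.
heat_map = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 9.0],
    [0.0, 0.0, 0.0],
]
x, y = soft_argmax(heat_map)
```

Unlike a hard argmax, this coordinate varies smoothly with the heat-map scores, so a loss computed on the coordinates can be backpropagated into the network that produced the heat map.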
CONNECTING VISION AND LANGUAGE USING FOURIER TRANSFORM
A method for text-image integration is provided. The method may include receiving a question related to pairable data comprising text data and image data. Embeddings are generated from the text tokens and image encodings. The embeddings include text embeddings and image embeddings. A spectral conversion of the text embeddings and the image embeddings is performed to generate spectral data. The spectral data is processed to extract text-image features. The text-image features are processed to generate inferred answers to the question.
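The spectral conversion step can be illustrated with a discrete Fourier transform applied along the joint text-image token axis. This is only a sketch under stated assumptions: the abstract does not specify the axis of the transform or what is done with the complex output, so concatenating the two embedding lists and keeping the real part (as some Fourier-mixing layers do) are choices made here for illustration. The names `dft` and `spectral_mix` and the toy embeddings are hypothetical.

```python
import cmath


def dft(values):
    """Naive discrete Fourier transform of a 1-D sequence of numbers."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def spectral_mix(text_embeddings, image_embeddings):
    """Apply a DFT along the joint token axis, one feature channel at a time."""
    # Assumption: text and image embeddings share the same feature width.
    sequence = text_embeddings + image_embeddings
    channels = list(zip(*sequence))  # transpose to one list per feature channel
    spectra = [dft(channel) for channel in channels]
    # Transpose back to one row per frequency and keep only the real part.
    return [[spectrum[k].real for spectrum in spectra] for k in range(len(sequence))]


text = [[1.0, 0.0], [0.0, 1.0]]   # two text tokens, two features each
image = [[1.0, 1.0], [0.0, 0.0]]  # two image patches, two features each
mixed = spectral_mix(text, image)
```

Each output row depends on every text token and every image patch, which is what lets a transform of this kind mix the two modalities without learned attention weights.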
Document fraud detection
Systems and methods provide for a document fraud detection system for identifying fraudulent documents. The document fraud detection system can include up to three steps of fraud detection, where if the document fails any of the three steps, the document can be flagged for further review. In another embodiment, the document fraud detection system can score each of the three tests, where the scores represent the likelihood that the document is fraudulent. If the combined score satisfies a predetermined criterion, the document can be flagged as potentially fraudulent. The first test can analyze a scanned image of the document and compare it to other similar documents to determine whether there have been any alterations. The second test can compare indents to previous documents, and the third test can analyze chemical and biometric factors that may indicate whether the document has been altered.
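The two flagging embodiments can be sketched as follows. The abstract does not define how scores are combined or what the predetermined criterion is, so the plain sum, the `threshold` value, and the function names below are assumptions for illustration only.

```python
def flag_by_any_failure(test_results):
    """First embodiment: flag the document if any of the tests fails.

    test_results holds one boolean per test, True meaning the test passed.
    """
    return not all(test_results)


def flag_by_combined_score(scores, threshold=1.5):
    """Second embodiment: combine per-test fraud scores and apply a criterion.

    Assumptions: each score lies in [0, 1] with higher meaning more likely
    fraudulent, scores are combined by summation, and the criterion is a
    simple threshold.
    """
    combined = sum(scores)
    return combined, combined >= threshold


# Image test and indent test pass, chemical/biometric test fails.
needs_review = flag_by_any_failure([True, True, False])

# Per-test fraud scores for the same three tests.
combined, flagged = flag_by_combined_score([0.75, 0.5, 0.5])
```

The scored variant lets one strongly suspicious test outweigh two clean ones, or several mildly suspicious tests add up, which the pass/fail variant cannot express.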
ENTITY EXTRACTION VIA DOCUMENT IMAGE PROCESSING
A document processing system processes a document image to identify document image regions including floating images, structured data units, and unstructured floating text. A first masked image is generated by deleting any floating images from the document image and a second masked image is generated by deleting any structured data units from the first masked image. The structured data units and the unstructured floating text are thus identified serially one after another. Textual data is extracted from the structured data units and the unstructured floating text by processing the corresponding document image regions via optical character recognition (OCR). Entities are extracted from the textual data using natural language processing (NLP) techniques.
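The serial masking described above can be sketched on a toy grayscale image. Region detection is assumed to have already produced bounding boxes; the `(top, left, bottom, right)` box format, the white fill value, and the name `mask_regions` are hypothetical choices, and OCR and entity extraction are not shown.

```python
def mask_regions(image, boxes, fill=255):
    """Return a copy of the image with each (top, left, bottom, right) box filled."""
    masked = [row[:] for row in image]  # copy so the input image is left untouched
    for top, left, bottom, right in boxes:
        for r in range(top, bottom):
            for c in range(left, right):
                masked[r][c] = fill
    return masked


# Toy 4x6 "document image" of dark pixels.
image = [[0] * 6 for _ in range(4)]
floating_images = [(0, 0, 2, 2)]   # assumed output of a floating-image detector
structured_units = [(2, 3, 4, 6)]  # assumed output of a table/form detector

# Step 1: delete floating images from the document image.
first_masked = mask_regions(image, floating_images)
# Step 2: delete structured data units from the first masked image.
second_masked = mask_regions(first_masked, structured_units)
```

After both steps, whatever remains unmasked in `second_masked` is the candidate unstructured floating text, which is why the regions can be identified one after another.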
DIGITAL-IMAGE SHAPE RECOGNITION USING TANGENTS AND CHANGE IN TANGENTS
In one aspect, a method of optical character recognition of digital character objects in digital images includes the step of obtaining a digital image. The digital image includes a rendering of a first object. The first object comprises a set of sub-objects and a set of relationships between the sub-objects. The method includes the step of generating a definition of the first object by defining an object outline for the first object as a set of sub-objects; defining a sub-object outline for each sub-object as a set of lines and curves; and defining each relationship between each set of connected sub-objects in terms of one or more intersections or one or more corners.
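The abstract does not spell out how tangents are used, although the title names them. As a rough illustration only, the sketch below computes the tangent direction of each segment of a polyline outline and marks a corner wherever the change in tangent exceeds a threshold. The polyline representation, the 45-degree threshold, and the function names are assumptions, not the patented method.

```python
import math


def tangent_angles(points):
    """Tangent direction in radians of each segment along a polyline outline."""
    return [
        math.atan2(y1 - y0, x1 - x0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]


def find_corners(points, threshold=math.pi / 4):
    """Indices of interior points where the tangent turns by more than threshold."""
    angles = tangent_angles(points)
    corners = []
    for i in range(1, len(angles)):
        # Wrap the change in tangent into [-pi, pi) before comparing.
        turn = (angles[i] - angles[i - 1] + math.pi) % (2 * math.pi) - math.pi
        if abs(turn) > threshold:
            corners.append(i)
    return corners


# An "L"-shaped stroke: a horizontal run followed by a vertical run.
outline = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
corners = find_corners(outline)
```

A run of points with near-zero change in tangent would correspond to a line, a run with small but steady change to a curve, and an abrupt change to a corner joining two sub-objects.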
SYSTEM AND METHOD FOR ROBUST ESTIMATION OF STATE PARAMETERS FROM INFERRED READINGS IN A SEQUENCE OF IMAGES
A system and method for robust estimation of state parameters from inferred readings in a sequence of images are provided. Various techniques can be implemented to address observation noise and/or underlying process noise to stabilize the readings.
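The abstract names no specific technique. One standard way to account for both observation noise and process noise in a sequence of readings is a scalar Kalman filter, sketched here as an assumption rather than as the method the patent claims. The constant-state model, the variance values, and the sample readings are hypothetical.

```python
def kalman_smooth(readings, process_var=1e-3, observation_var=0.25):
    """Smooth a sequence of scalar readings with a one-dimensional Kalman filter.

    Assumptions: the true state is roughly constant between frames
    (process_var models its drift) and each per-frame reading carries
    independent noise with variance observation_var.
    """
    estimate = readings[0]
    variance = 1.0  # initial uncertainty about the first reading
    smoothed = [estimate]
    for z in readings[1:]:
        # Predict: the state is carried over and uncertainty grows.
        variance += process_var
        # Update: blend the prediction with the new reading.
        gain = variance / (variance + observation_var)
        estimate += gain * (z - estimate)
        variance *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


# Per-frame readings inferred from images, with one outlier at index 4.
readings = [10.0, 10.4, 9.7, 10.1, 14.0, 10.2, 9.9]
smoothed = kalman_smooth(readings)
```

A larger `observation_var` makes the filter trust individual frames less and damp spikes more, while a larger `process_var` lets the estimate follow genuine changes in the state more quickly.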
METHOD AND SYSTEM OF EXTRACTING NON-SEMANTIC ENTITIES
A method and system of extracting one or more non-semantic entities in a document image including data entities is disclosed. The method includes extraction, by a processor, of row entities and corresponding row locations from the document image based on a text extraction technique. The row entities are split into split-row entities based on a splitting rule. Semantic entities are determined from alphabetic entities using a semantic recognition technique. The non-semantic entities are determined as the split-row entities other than the semantic entities. Feature values of each feature type are determined for each of the non-semantic entities. The processor further determines a first probability output for the non-semantic entities and a second probability output for the semantic entities surrounding the non-semantic entities. The system further labels each of the non-semantic entities based on determination of a highest probability value from a sum of the first probability output and the second probability output.
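The final labelling step can be sketched as follows, assuming each probability output is a mapping from candidate label to probability. That representation, the label names, and the function `label_entity` are hypothetical; the abstract only states that the label follows from the highest value of the summed outputs.

```python
def label_entity(first_probs, second_probs):
    """Pick the label whose summed probability is highest.

    first_probs: per-label probabilities from the non-semantic entity's own features.
    second_probs: per-label probabilities from the surrounding semantic entities.
    """
    labels = sorted(set(first_probs) | set(second_probs))
    summed = {
        label: first_probs.get(label, 0.0) + second_probs.get(label, 0.0)
        for label in labels
    }
    best = max(labels, key=lambda label: summed[label])
    return best, summed


# A token such as "12/03/2021": its own features are ambiguous, but a
# neighbouring semantic entity such as "Date:" favours the "date" label.
first = {"invoice_number": 0.5, "date": 0.25, "amount": 0.25}
second = {"invoice_number": 0.125, "date": 0.75, "amount": 0.125}
label, summed = label_entity(first, second)
```

Summing the two outputs lets the surrounding context override an entity's own features when those features are ambiguous, as in this example.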