G06V30/1914

Unsupervised anchor handling for machine vision system
11308634 · 2022-04-19

A device includes an image sensor and a processor to: receive training images that each include multiple visual features and an ROI; receive indications of locations of ROIs within each training image; perform at least one transform on the multiple visual features and ROI of at least one training image to align the multiple visual features and ROIs among all of the training images to identify a common set of visual features present within all of the training images; derive a converged ROI from at least a portion of the ROI of at least one training image; and generate an anchor model based on the converged ROI and the common set of visual features, wherein the common set of visual features defines the anchor and are each specified relative to the converged ROI, and the anchor model is used to derive a location of a candidate ROI in an image.
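The training flow in this abstract can be illustrated with a small sketch. This is a minimal illustration under strong assumptions, not the patented method: features are named points, the only transform is a translation that matches the centroids of the common features, and the converged ROI is taken as the intersection of the aligned ROIs. The names `build_anchor_model` and `locate_roi` and the dictionary layout are hypothetical.

```python
from statistics import mean

def build_anchor_model(images):
    """images: list of {"features": {name: (x, y)}, "roi": (x, y, w, h)}."""
    # Common set: feature names present in every training image.
    names = sorted(set.intersection(*(set(img["features"]) for img in images)))

    def centroid(img):
        return (mean(img["features"][n][0] for n in names),
                mean(img["features"][n][1] for n in names))

    ref_cx, ref_cy = centroid(images[0])
    aligned = []
    for img in images:
        cx, cy = centroid(img)
        dx, dy = ref_cx - cx, ref_cy - cy  # assumed translation-only transform
        feats = {n: (img["features"][n][0] + dx, img["features"][n][1] + dy)
                 for n in names}
        x, y, w, h = img["roi"]
        aligned.append((feats, (x + dx, y + dy, w, h)))

    # Converged ROI (assumption): intersection of the aligned ROIs.
    # This sketch does not handle ROIs that fail to overlap.
    left = max(r[0] for _, r in aligned)
    top = max(r[1] for _, r in aligned)
    right = min(r[0] + r[2] for _, r in aligned)
    bottom = min(r[1] + r[3] for _, r in aligned)

    # Each common feature is stored relative to the converged ROI origin.
    offsets = {n: (mean(f[n][0] for f, _ in aligned) - left,
                   mean(f[n][1] for f, _ in aligned) - top) for n in names}
    return {"offsets": offsets, "roi_size": (right - left, bottom - top)}

def locate_roi(model, detected):
    """Estimate a candidate ROI from anchor features detected in a new image."""
    shared = [n for n in model["offsets"] if n in detected]
    x = mean(detected[n][0] - model["offsets"][n][0] for n in shared)
    y = mean(detected[n][1] - model["offsets"][n][1] for n in shared)
    return (x, y, *model["roi_size"])
```

A real system would likely need rotation and scale in the transform and a more robust feature matcher; the sketch only shows how features specified relative to a converged ROI let the ROI be recovered in a new image.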

Predictive data analysis using image representations of categorical and scalar feature data

There is a need for more effective and efficient predictive data analysis solutions and/or more effective and efficient solutions for generating image representations of categorical/scalar data. Various embodiments of the present invention address one or more of the noted technical challenges. In one example, a method comprises receiving the one or more categorical input features; generating an image representation of the one or more categorical input features, wherein the image representation comprises image region values each associated with a categorical input feature, and further wherein each image region value of the one or more image region values is determined based at least in part on the corresponding categorical input feature associated with the image region value; and processing the image representation using an image-based machine learning model to generate the image-based predictions.
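As a rough illustration of the image-representation step only, the sketch below gives each categorical feature its own square region and fills it with a value derived from the feature's category. The layout (one row of square regions), the 1 to 255 value scaling, and the name `categorical_to_image` are assumptions made for the example; the abstract does not specify them. The image-based machine learning model is omitted.

```python
def categorical_to_image(features, vocab, region=4):
    """Return a grayscale image (list of rows) with one square region per feature.

    features: {feature_name: category}
    vocab:    {feature_name: ordered list of allowed categories}
    """
    names = sorted(vocab)
    image = [[0] * (region * len(names)) for _ in range(region)]
    for i, name in enumerate(names):
        categories = vocab[name]
        index = categories.index(features[name])
        # Assumed encoding: spread category indices over 1..255 so that
        # 0 stays reserved for "no value".
        value = round(255 * (index + 1) / len(categories))
        for row in range(region):
            for col in range(i * region, (i + 1) * region):
                image[row][col] = value
    return image

vocab = {"color": ["red", "green", "blue"], "size": ["S", "L"]}
print(categorical_to_image({"color": "green", "size": "L"}, vocab, region=2))
# [[170, 170, 255, 255], [170, 170, 255, 255]]
```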

Mapper component for a neuro-linguistic behavior recognition system

Techniques are disclosed for generating a sequence of symbols based on input data for a neuro-linguistic model. The model may be used by a behavior recognition system to analyze the input data. A mapper component of a neuro-linguistic module in the behavior recognition system receives one or more normalized vectors generated from the input data. The mapper component generates one or more clusters based on a statistical distribution of the normalized vectors. The mapper component evaluates statistics and identifies statistically relevant clusters. The mapper component assigns a distinct symbol to each of the identified clusters.
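The mapper's cluster-then-symbolize flow might be sketched as follows. The clustering rule (join the nearest cluster within a fixed radius, otherwise start a new one), the relevance test (a minimum share of all observations), and the function name `map_to_symbols` are assumptions chosen for brevity; the abstract does not say which statistics the mapper evaluates.

```python
import math
from string import ascii_uppercase

def map_to_symbols(vectors, radius=0.1, min_share=0.2):
    """Cluster normalized vectors and give each relevant cluster a symbol."""
    clusters = []  # each cluster: {"mean": [...], "count": n}
    for v in vectors:
        best = None
        for c in clusters:
            d = math.dist(v, c["mean"])
            if d <= radius and (best is None or d < best[0]):
                best = (d, c)
        if best is None:
            clusters.append({"mean": list(v), "count": 1})
        else:
            c = best[1]
            c["count"] += 1
            # Update the running mean of the cluster.
            c["mean"] = [m + (x - m) / c["count"] for m, x in zip(c["mean"], v)]

    # Assumed relevance test: the cluster holds at least min_share of the data.
    total = len(vectors)
    relevant = [c for c in clusters if c["count"] / total >= min_share]
    # Only 26 symbols are available here; zip() stops at the shorter input.
    for symbol, c in zip(ascii_uppercase, relevant):
        c["symbol"] = symbol
    return relevant
```

With the inputs `[(0.1,), (0.12,), (0.11,), (0.9,), (0.88,), (0.5,)]` and the default thresholds, two clusters pass the relevance test and receive the symbols `A` and `B`, while the lone observation at 0.5 is left without a symbol.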

SYSTEM FOR MULTIPLE ALGORITHM PROCESSING OF BIOMETRIC DATA

A system performs processing of biometric information to create multiple templates. This allows biometric systems to be flexible and interact with a plurality of vendors' technologies. Specifically, a biometric sample is captured from a sensor and transmitted to a processing component. The biometric sample is then processed by a first algorithm to yield a biometric template and the template is stored and associated with a record identifier. The biometric sample is also processed by a second algorithm to yield a second template. The second template is stored and associated with the record identifier.
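The storage pattern described here, one sample processed by several algorithms with every resulting template tied to the same record identifier, can be sketched as below. The two "algorithms" are hash-based placeholders standing in for vendor-specific template extractors, and `TemplateStore` is a hypothetical name; neither comes from the source.

```python
import hashlib
from collections import defaultdict

def algorithm_a(sample: bytes) -> str:
    # Placeholder for a first vendor's extractor (not a real biometric algorithm).
    return hashlib.sha256(sample).hexdigest()

def algorithm_b(sample: bytes) -> str:
    # Placeholder for a second vendor's extractor.
    return hashlib.blake2b(sample, digest_size=16).hexdigest()

class TemplateStore:
    """Keeps one template per algorithm under a shared record identifier."""

    def __init__(self, algorithms):
        self.algorithms = algorithms          # {name: callable(bytes) -> template}
        self.records = defaultdict(dict)      # {record_id: {name: template}}

    def enroll(self, record_id, sample):
        for name, algorithm in self.algorithms.items():
            self.records[record_id][name] = algorithm(sample)
        return dict(self.records[record_id])

store = TemplateStore({"vendor_a": algorithm_a, "vendor_b": algorithm_b})
templates = store.enroll("record-001", b"raw sensor bytes")
print(sorted(templates))  # ['vendor_a', 'vendor_b']
```

Keeping the templates keyed by algorithm name means a matcher from either vendor can look up its own template through the same record identifier.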

Book scanning using machine-trained model
11145037 · 2021-10-12

This application discloses a technology for flattening a photographed page of a book and straightening texts therein. The technology uses one or more mathematical models to represent a curved shape of the photographed page with certain parameters. The technology also uses one or more photographic image processing techniques to dewarp the photographed page using the parameters of the curved shape. The technology uses one or more additional parameters that represent certain features of the photographed page to dewarp the photographed page.

UNSUPERVISED ANCHOR HANDLING FOR MACHINE VISION SYSTEM
20210241469 · 2021-08-05

A device includes an image sensor and a processor to: receive training images that each include multiple visual features and an ROI; receive indications of locations of ROIs within each training image; perform at least one transform on the multiple visual features and ROI of at least one training image to align the multiple visual features and ROIs among all of the training images to identify a common set of visual features present within all of the training images; derive a converged ROI from at least a portion of the ROI of at least one training image; and generate an anchor model based on the converged ROI and the common set of visual features, wherein the common set of visual features defines the anchor and are each specified relative to the converged ROI, and the anchor model is used to derive a location of a candidate ROI in an image.

System for locating, interpreting and extracting data from documents

Methods, systems and computer-readable media for extracting data from a document. One method includes receiving a document in a text format and assigning a first signature to the document. The method also includes matching the first signature to a second signature of a template and extracting data a user desires to have extracted from the document, wherein instructions for locating the data are stored within the template.
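One way to picture the signature-matching flow is the sketch below. It assumes a signature is simply the ordered list of labels that precede a colon at the start of each line, and that a template stores its extraction instructions as regular expressions. Both choices, along with the names `signature` and `extract`, are illustrative assumptions rather than details from the abstract.

```python
import re

def signature(text):
    # Assumed signature: ordered, lowercased labels that precede a colon.
    return tuple(m.group(1).strip().lower()
                 for m in re.finditer(r"^([^:\n]+):", text, flags=re.MULTILINE))

def extract(text, templates):
    """Match the document's signature to a template and run its instructions."""
    doc_signature = signature(text)
    for template in templates:
        if template["signature"] == doc_signature:
            result = {}
            for field, pattern in template["fields"].items():
                match = re.search(pattern, text, flags=re.MULTILINE)
                result[field] = match.group(1) if match else None
            return result
    return None  # no template matches this document

invoice_template = {
    "signature": ("invoice no", "date", "total"),
    "fields": {
        "invoice_number": r"^Invoice No:\s*(\S+)",
        "total": r"^Total:\s*([\d.]+)",
    },
}
document = "Invoice No: 1234\nDate: 2020-01-05\nTotal: 99.50\n"
print(extract(document, [invoice_template]))
# {'invoice_number': '1234', 'total': '99.50'}
```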

Systems and methods for information extraction from text documents with spatial context

Performing information extraction from an electronic document is disclosed. A method comprises: receiving a semi-structured input document; retrieving an entity model that provides one or more domain variable definitions for one or more domain variables, wherein the entity model and the input document correspond to a common domain; determining that the input document includes an entity that satisfies a first domain variable definition corresponding to a first domain variable; retrieving a relational model that provides, for the first domain variable, one or more relational definitions comprising spatial restrictions for one or more values corresponding to the first domain variable; extracting one or more data elements from the input document that satisfy the one or more relational definitions; and generating an information graph having a structured data format, wherein the one or more data elements extracted from the input document correspond to the first domain variable in the structured data format.

Image quality assessment and improvement for performing optical character recognition

Techniques are disclosed for performing optical character recognition (OCR) by assessing and improving the quality of electronic documents before OCR is performed. For example, a method for identifying information in an electronic document includes obtaining a reference image of the electronic document, distorting the reference image by adjusting different sets of one or more parameters associated with a quality of the reference image to generate a plurality of distorted images, analyzing each distorted image to detect the adjusted set of parameters and corresponding adjusted values, determining an accuracy of detection of the set of parameters and the adjusted values, and training a model based at least on the plurality of distorted images and the accuracy of the detection, wherein the trained model determines at least a first technique for adjusting a set of parameters in a second image to prepare the second image for optical character recognition.

COMPUTER-BASED SYSTEMS AND METHODS FOR CORRECTING DISTORTED TEXT IN FACSIMILE DOCUMENTS

A method includes passing an original text document through distortion filter generators to generate a training dataset that includes distorted text documents. Each distortion filter generator is configured to distort words or letters of words in phrases of text of a facsimile image in a respective unique manner. A neural network model is trained to recognize each respective distortion and match each respective distortion with each respective distortion filter generator based on the training dataset and the original text document. Image data of one facsimile having at least one text distortion is received and inputted to the trained neural network model. The output of the trained neural network model is coupled to an input of an optical character recognition (OCR) engine. The trained neural network model and the OCR engine convert the received image data of the incoming facsimile corrected for the at least one text distortion to machine-encoded text.