Patent classifications
G06V30/19147
SYSTEM TO IDENTIFY AUTHORSHIP OF HANDWRITTEN TEXT BASED ON INDIVIDUAL ALPHABETS
A device, method, and non-transitory computer readable medium are described. The method includes receiving a dataset including handwritten Arabic words and handwritten Arabic alphabets from one or more users. The method further includes removing whitespace around alphabets in the handwritten Arabic words and the handwritten Arabic alphabets in the dataset. The method further includes splitting the dataset into a training set, a validation set, and a test set. The method further includes classifying one or more user datasets from the training set, the validation set, and the test set. The method further includes identifying the target user from the one or more user datasets. The identification of the target user includes a verification accuracy of the handwritten Arabic words being larger than a verification accuracy threshold value.
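The whitespace-removal step in the abstract above could be sketched as a tight bounding-box crop around the ink pixels of one character image. This is only an illustration under stated assumptions: the abstract does not say how whitespace is removed, and the function name `crop_whitespace`, the nested-list image representation, and the `background` value are all hypothetical.

```python
def crop_whitespace(image, background=0):
    """Crop a 2D image (list of equal-length rows) to its non-background pixels.

    Assumption: any pixel different from `background` counts as ink.
    Returns an empty list when the image contains no ink at all.
    """
    # Indices of rows that contain at least one ink pixel.
    rows = [r for r, row in enumerate(image) if any(p != background for p in row)]
    if not rows:
        return []
    # Indices of columns that contain at least one ink pixel.
    cols = [c for c in range(len(image[0]))
            if any(row[c] != background for row in image)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    # Keep only the rectangle spanned by the outermost ink pixels.
    return [row[left:right + 1] for row in image[top:bottom + 1]]


glyph = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
cropped = crop_whitespace(glyph)
```

Cropping every sample this way before the train, validation, and test split would make the classifier see the character shape rather than its position on the page, which is one plausible reason for the step.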
MACHINE LEARNING MODELS FOR AUTOMATED ENTITY FIELD CORRECTION
A computer system includes memory hardware configured to store a machine learning model, a record database, and historical feature vector inputs. Processor hardware is configured to execute instructions which include training the machine learning model to generate an entity field output, and for each of multiple database entities, scanning the database entity to generate a feature vector input, and processing the feature vector input to generate the entity field output. In response to determining that the entity field output includes at least one missing field value, the instructions include accessing the record database to identify a predicted value for the missing field value, analyzing the structured scan data or rescanning the database entity to determine whether the predicted value is present in the database entity, and assigning the database entity to the validated subset of the multiple database entities when the predicted value is present in the database entity.
Cloud detection on remote sensing imagery
A system for detecting clouds and cloud shadows is described. In one approach, clouds and cloud shadows within a remote sensing image are detected through a three-stage process. In the first stage, a high-precision low-recall classifier is used to identify cloud seed pixels within the image. In the second stage, a low-precision high-recall classifier is used to identify potential cloud pixels within the image. Additionally, in the second stage, the cloud seed pixels are grown into the potential cloud pixels to identify clusters of pixels which have a high likelihood of representing clouds. In the third stage, a geometric technique is used to determine pixels which likely represent shadows cast by the clouds identified in the second stage. The clouds identified in the second stage and the shadows identified in the third stage are then exported as a cloud mask and shadow mask of the remote sensing image.
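The second-stage step of growing seed pixels into potential cloud pixels could be sketched as a breadth-first flood fill. This is a minimal illustration, not the patented method: the abstract does not specify the growing rule, so 4-connectivity, the function name `grow_seeds`, and the choice to keep seed pixels even when the second classifier missed them are all assumptions.

```python
from collections import deque


def grow_seeds(seed_mask, candidate_mask):
    """Grow seed pixels through connected candidate pixels (4-connectivity).

    seed_mask: output of the high-precision, low-recall classifier.
    candidate_mask: output of the low-precision, high-recall classifier.
    Both are 2D lists of truthy/falsy values with the same shape.
    """
    height = len(candidate_mask)
    width = len(candidate_mask[0]) if height else 0
    cloud = [[False] * width for _ in range(height)]
    queue = deque()

    # Every seed pixel starts a region (assumption: seeds are always kept).
    for r in range(height):
        for c in range(width):
            if seed_mask[r][c]:
                cloud[r][c] = True
                queue.append((r, c))

    # Expand each region into neighbouring candidate pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and candidate_mask[nr][nc] and not cloud[nr][nc]:
                cloud[nr][nc] = True
                queue.append((nr, nc))
    return cloud


seeds = [[0, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]
candidates = [[1, 1, 0, 1],
              [0, 1, 0, 1],
              [0, 0, 0, 0]]
cloud_mask = grow_seeds(seeds, candidates)
```

In this toy input, the candidate pixels in the rightmost column are never reached from a seed, so they are left out of the cloud mask. That is the point of pairing the two classifiers: candidates only count when they connect to a confident detection.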
AUTOMATED CATEGORIZATION AND PROCESSING OF DOCUMENT IMAGES OF VARYING DEGREES OF QUALITY
An apparatus includes a memory and a processor. The memory stores a dictionary and a machine learning algorithm trained to classify text. The processor receives an image of a page, converts the image into a set of text, and identifies a plurality of tokens within the text. Each token includes one or more contiguous characters that are both preceded and followed by whitespace within the text. The processor identifies invalid tokens by removing tokens of the plurality of tokens that correspond to words of the dictionary. The processor calculates, based on a ratio of a total number of valid tokens to a total number of tokens, a score. In response to determining that the score is greater than a threshold, the processor applies the machine learning algorithm to classify the text into a category and stores the image and/or text in a database according to the category.
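The token-based quality score described above could be sketched as follows. This is a hedged illustration: the abstract only specifies whitespace-delimited tokens, a dictionary lookup, and a valid-to-total ratio compared against a threshold. The names `quality_score` and `should_classify`, the lowercasing and punctuation stripping, and the default threshold of 0.5 are assumptions made for the example.

```python
import string


def quality_score(text, dictionary):
    """Return the fraction of whitespace-delimited tokens found in `dictionary`.

    Assumption: tokens are lowercased and stripped of leading and trailing
    punctuation before the lookup. The source does not describe normalization.
    """
    tokens = text.split()  # contiguous characters bounded by whitespace
    if not tokens:
        return 0.0
    valid = [t for t in tokens
             if t.strip(string.punctuation).lower() in dictionary]
    return len(valid) / len(tokens)


def should_classify(text, dictionary, threshold=0.5):
    """Decide whether the extracted text is clean enough to classify."""
    return quality_score(text, dictionary) > threshold


words = {"the", "invoice", "is", "due"}
ocr_text = "The invoice is d#e xq7"
score = quality_score(ocr_text, words)
```

Here three of the five tokens are dictionary words, so the score is 3/5. A page that scores at or below the threshold would be held back from classification rather than being passed to the text classifier.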
METHOD FOR TRAINING A FONT GENERATION MODEL, METHOD FOR ESTABLISHING A FONT LIBRARY, AND DEVICE
Provided are a method for training a font generation model, a method for establishing a font library, and a device. The method for training a font generation model includes the following steps. A source-domain sample character is input into the font generation model to obtain a first target-domain generated character. The first target-domain generated character is input into a font recognition model to obtain the target adversarial loss of the font generation model. The model parameter of the font generation model is updated according to the target adversarial loss.
TEXT INDEPENDENT WRITER VERIFICATION METHOD AND SYSTEM
A device, method, and non-transitory computer readable medium are described. The method includes receiving a dataset including hand written Arabic words and hand written Arabic alphabets from one or more users. The method further includes removing whitespace around alphabets in the hand written Arabic words and the hand written Arabic alphabets in the dataset. The method further includes splitting the dataset into a training set, a validation set, and a test set. The method further includes classifying one or more user datasets from the training set, the validation set, and the test set. The method further includes identifying the target user from the one or more user datasets. The identification of the target user includes a verification accuracy of the hand written Arabic words being larger than a verification accuracy threshold value.
Visually guided machine-learning language model
Visually guided machine-learning language model and embedding techniques are described that overcome the challenges of conventional techniques in a variety of ways. In one example, a model is trained to support a visually guided machine-learning embedding space that supports visual intuition as to “what” is represented by text. The visually guided language embedding space supported by the model, once trained, may then be used to support visual intuition as part of a variety of functionality. In one such example, the visually guided language embedding space as implemented by the model may be leveraged as part of a multi-modal differential search to support search of digital images and other digital content with real-time focus adaptation which overcomes the challenges of conventional techniques.
Method for optical character recognition in document subject to shadows, and device employing method
A method for optical recognition of characters in an unclear or non-optimal image of an object document, where the image carries shadows or other impediments, inputs the document into a shadow prediction model to obtain a shadow mask. A determination is made as to whether the shadow mask of the document affects optical character recognition (OCR) performance. If the shadow mask is deemed to affect the OCR performance, the method further inputs the document into a shadow removing model to remove the shadows and obtain an intermediate document. OCR can then be performed on the final object document.
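The control flow of this method could be sketched as below, with the three models passed in as plain callables. This is an illustration under loud assumptions: the abstract does not say how the shadow mask is judged to affect OCR, so the shadowed-pixel fraction test and the `max_shadow_fraction` parameter are hypothetical, as are the function name and the callable signatures.

```python
def ocr_with_shadow_handling(document, predict_shadow_mask, remove_shadows,
                             run_ocr, max_shadow_fraction=0.1):
    """Run OCR, removing shadows first only when they are judged harmful.

    predict_shadow_mask(document) -> 2D list of truthy/falsy values
    remove_shadows(document, mask) -> cleaned document
    run_ocr(document) -> recognized text
    Assumption: shadows "affect OCR" when the shadowed fraction of the mask
    exceeds `max_shadow_fraction`. The source gives no criterion.
    """
    mask = predict_shadow_mask(document)
    total = sum(len(row) for row in mask)
    shadowed = sum(1 for row in mask for pixel in row if pixel)
    affects_ocr = total > 0 and shadowed / total > max_shadow_fraction

    if affects_ocr:
        # Only pay for shadow removal when the mask suggests it matters.
        document = remove_shadows(document, mask)
    return run_ocr(document)


# Stub models, for demonstration only.
calls = []
text = ocr_with_shadow_handling(
    "page",
    predict_shadow_mask=lambda doc: [[1, 1], [0, 0]],
    remove_shadows=lambda doc, mask: calls.append("removed") or doc + "-clean",
    run_ocr=lambda doc: "text from " + doc,
)
```

Passing the models as arguments keeps the sketch independent of any particular shadow prediction, shadow removal, or OCR library.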
SYSTEMS, APPARATUS, ARTICLES OF MANUFACTURE, AND METHODS TO GENERATE DIGITIZED HANDWRITING WITH USER STYLE ADAPTATIONS
Systems, apparatus, articles of manufacture, and methods to generate digitized handwriting with user style adaptations are disclosed. An example apparatus includes at least one memory, and processor circuitry to train a machine learning model to generate a first digitized handwriting sequence based on a stored handwriting sample. To train the machine learning model, the processor circuitry is to cause a parameterization of a first portion of the machine learning model and cause a reparameterization of a second portion of the machine learning model. The processor circuitry is also to re-train the trained machine learning model to generate a second digitized handwriting sequence based on a user handwriting sample.
Machine learning for automatic extraction and workflow assignment of action items
Systems, methods, and other embodiments associated with automatic smart extraction and workflow assignment of action items are described. In one embodiment, a method includes extracting a set of candidate action items from text of a construction project manual; applying static rules to the candidate action items to distinguish valid and invalid candidate action items; evaluating each valid candidate action item with a first machine learning model to label the valid candidate action item either (i) a true action item or (ii) not a true action item; evaluating each true action item with a second machine learning model to allocate each of the true action items to a construction workflow class; and transmitting the set of true action items to a submittal exchange system to populate one or more workflows.