Patent classifications
G06V30/18057
SYSTEMS AND METHODS FOR IMAGE MODIFICATION AND IMAGE BASED CONTENT CAPTURE AND EXTRACTION IN NEURAL NETWORKS
Systems and methods for image modification to increase contrast between text and non-text pixels within the image. In one embodiment, an original document image is scaled to a predetermined size for processing by a convolutional neural network. The convolutional neural network identifies a probability that each pixel in the scaled image is text and generates a heat map of these probabilities. The heat map is then scaled back to the size of the original document image, and the probabilities in the heat map are used to adjust the intensities of the text and non-text pixels. For positive text, intensities of text pixels are reduced and intensities of non-text pixels are increased in order to increase the contrast of the text against the background of the image. Optical character recognition may then be performed on the contrast-adjusted image.
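The heat-map step described above can be illustrated with a minimal sketch. It assumes a grayscale image stored as nested lists, dark ("positive") text on a light background, and a heat map already produced by some network. The function names, the nearest-neighbour resizing, and the `strength` parameter are hypothetical choices for illustration, not details taken from the abstract.

```python
def resize_nearest(grid, height, width):
    """Scale a 2D grid to (height, width) with nearest-neighbour sampling."""
    src_h, src_w = len(grid), len(grid[0])
    return [
        [grid[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def adjust_contrast(image, heat_map, strength=0.5):
    """Darken likely-text pixels and lighten likely-background pixels.

    image:    2D list of intensities in [0, 255].
    heat_map: 2D list of text probabilities in [0, 1], same shape as image.
    strength: hypothetical knob in [0, 1] controlling how far pixels move.
    """
    adjusted = []
    for image_row, prob_row in zip(image, heat_map):
        row = []
        for value, prob in zip(image_row, prob_row):
            # Text (prob >= 0.5) moves toward black, background toward white.
            target = 0 if prob >= 0.5 else 255
            # Confident predictions move further; prob == 0.5 leaves the pixel alone.
            weight = strength * abs(2.0 * prob - 1.0)
            row.append(int(round(value + weight * (target - value))))
        adjusted.append(row)
    return adjusted


# Heat map computed on a smaller, scaled copy of the document image.
small_heat_map = [[1.0, 0.0]]
image = [[100, 100, 200, 200],
         [100, 100, 200, 200]]
full_heat_map = resize_nearest(small_heat_map, len(image), len(image[0]))
contrast_image = adjust_contrast(image, full_heat_map)
```

In this sketch the pixels marked as text become darker and the remaining pixels become lighter, which widens the gap between the two groups before OCR.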
System and Method for Processing Insurance Cards
A system and method processes images of insurance cards to extract information. The images of the insurance cards are processed using OCR to identify characters on the insurance cards. Combinations of characters on each insurance card are identified as tokens, and their relative spatial orientation is determined. Deep learning architectures are utilized to generate a fully connected neural network with a node for each token on each card. The neural network is utilized to extract entities from each insurance card, such as a valid member ID.
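One way to picture the token and spatial-orientation step is a fully connected graph in which every OCR token is a node and every ordered pair of tokens carries its relative offset. The sketch below is an assumption about how such features could be laid out. The token format `(text, x, y)` and the function name are hypothetical, and the network that consumes the edges is not shown.

```python
import math
from itertools import permutations


def pairwise_edges(tokens):
    """Return relative offsets between every ordered pair of tokens.

    tokens: list of (text, x, y), where (x, y) is the token centre on the card.
    Result: dict mapping (i, j) to (dx, dy, distance) from token i to token j.
    """
    edges = {}
    for (i, a), (j, b) in permutations(enumerate(tokens), 2):
        dx, dy = b[1] - a[1], b[2] - a[2]
        edges[(i, j)] = (dx, dy, math.hypot(dx, dy))
    return edges


tokens = [("Member", 0, 0), ("ID", 3, 4), ("A12345", 6, 8)]
edges = pairwise_edges(tokens)
```

With three tokens the graph has six directed edges, one for each ordered pair.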
METHOD AND SYSTEM FOR TRAINING NEURAL NETWORK FOR ENTITY DETECTION
Disclosed is a system and method for training a neural network to be implemented for detecting at least one entity in a document to derive relevant inferences therefrom. The method comprises obtaining at least one document; processing the at least one document via a detection module to detect a widget entity, wherein the detected widget entity is classified as active or inactive based on a detected state of the widget entity; modifying the classified widget entity into a corresponding machine-readable widget entity based on the detected state; processing the at least one document via an extraction module to detect a text entity in the near vicinity of the classified widget entity; generating a training pair comprising the machine-readable widget entity and the corresponding text entity; and training the neural network using the generated training pair.
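The pairing of a classified widget with its nearby text can be sketched as follows. This assumes the detection and extraction modules have already produced positions for widgets and text entities. The dictionary fields, the `<kind:state>` token format, and the use of Euclidean distance for "near vicinity" are all hypothetical.

```python
import math


def build_training_pairs(widgets, texts):
    """Pair each widget with its nearest text entity.

    widgets: list of dicts with keys "kind", "checked", "x", "y".
    texts:   list of dicts with keys "text", "x", "y".
    Returns a list of (machine_readable_widget, text) training pairs.
    """
    pairs = []
    for widget in widgets:
        state = "active" if widget["checked"] else "inactive"
        # Hypothetical machine-readable form of the widget and its state.
        token = f"<{widget['kind']}:{state}>"
        nearest = min(
            texts,
            key=lambda t: math.hypot(t["x"] - widget["x"], t["y"] - widget["y"]),
        )
        pairs.append((token, nearest["text"]))
    return pairs


widgets = [
    {"kind": "checkbox", "checked": True, "x": 10, "y": 20},
    {"kind": "checkbox", "checked": False, "x": 10, "y": 40},
]
texts = [
    {"text": "Smoker", "x": 25, "y": 20},
    {"text": "Non-smoker", "x": 25, "y": 40},
]
pairs = build_training_pairs(widgets, texts)
```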
Parsing an ink document using object-level and stroke-level processing
Technology is described herein for parsing an ink document having a plurality of ink strokes. The technology performs stroke-level processing on the plurality of ink strokes to produce stroke-level information, the stroke-level information identifying at least one characteristic associated with each ink stroke. The technology also performs object-level processing on individual objects within the ink document to produce object-level information, the object-level information identifying one or more groupings of ink strokes in the ink document. The technology then parses the ink document into constituent parts based on the stroke-level information and the object-level information. In some implementations, the technology converts the ink stroke data into an ink image. The stroke-level processing and/or the object-level processing may operate on the ink image using one or more neural networks. More specifically, the stroke-level processing can classify pixels in the input image, while the object-level processing can identify bounding boxes containing possible objects.
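The final parsing step, which combines stroke-level and object-level information, might look like the sketch below. It assumes per-stroke class labels and object bounding boxes are already available from the two processing stages. Assigning a stroke to the first box that contains its centroid is a hypothetical rule chosen for illustration.

```python
def parse_ink(strokes, stroke_labels, boxes):
    """Group labelled strokes into the bounding boxes that contain them.

    strokes:       list of strokes, each a list of (x, y) points.
    stroke_labels: one class label per stroke (stroke-level information).
    boxes:         list of (x0, y0, x1, y1) boxes (object-level information).
    Returns (parts, unassigned), where parts holds stroke indices per box.
    """
    parts = [{"box": box, "strokes": [], "labels": []} for box in boxes]
    unassigned = []
    for index, (points, label) in enumerate(zip(strokes, stroke_labels)):
        cx = sum(p[0] for p in points) / len(points)
        cy = sum(p[1] for p in points) / len(points)
        for part in parts:
            x0, y0, x1, y1 = part["box"]
            if x0 <= cx <= x1 and y0 <= cy <= y1:
                part["strokes"].append(index)
                part["labels"].append(label)
                break
        else:
            # No box contains this stroke's centroid.
            unassigned.append(index)
    return parts, unassigned


strokes = [[(0, 0), (2, 2)], [(10, 10), (12, 12)], [(50, 50)]]
labels = ["text", "drawing", "text"]
boxes = [(0, 0, 5, 5), (8, 8, 15, 15)]
parts, unassigned = parse_ink(strokes, labels, boxes)
```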
TYRE SIDEWALL IMAGING METHOD
A computer implemented method for generating a region of interest on a digital image of a sidewall of a tyre, the sidewall having one or more embossed and/or engraved markings, is provided. The method comprises generating a histogram of oriented gradients feature map of the digital image, inputting the histogram of oriented gradients feature map into a trained convolutional neural network, wherein said trained convolutional neural network is configured to output a first probability based on the input histogram of oriented gradients feature map that a region of pixels of the digital image contains the embossed and/or engraved markings, and if the first probability is at or above a first predetermined threshold, accepting said region of pixels as said region of interest.
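To make the histogram of oriented gradients input and the threshold test concrete, here is a deliberately simplified sketch. It computes an unsigned orientation histogram for a single cell, without the vote interpolation and block normalisation that full HOG implementations normally apply. The trained convolutional neural network is not implemented: its output probability is simply passed in. The bin count and the default threshold are assumptions.

```python
import math


def hog_cell(cell, bins=9):
    """Unsigned gradient-orientation histogram for one cell of intensities."""
    histogram = [0.0] * bins
    height, width = len(cell), len(cell[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            histogram[int(angle / (180.0 / bins)) % bins] += magnitude
    return histogram


def accept_region(probability, threshold=0.5):
    """Accept the region as a region of interest at or above the threshold."""
    return probability >= threshold


# A cell containing a vertical edge, as an embossed stroke might produce.
cell = [[0, 0, 10, 10] for _ in range(4)]
features = hog_cell(cell)
```

For this cell all gradient energy falls into the first orientation bin, because the intensity changes only along the horizontal axis.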
Multi-dimensional language style transfer
In some embodiments, a style transfer computing system generates a set of discriminator models corresponding to a set of styles based on a set of training datasets labeled for respective styles. The style transfer computing system further generates a style transfer language model for a target style combination that includes multiple target styles from the set of styles. The style transfer language model includes a cascaded language model and multiple discriminator models selected from the set of discriminator models. The style transfer computing system trains the style transfer language model to minimize a loss function containing a loss term for the cascaded language model and multiple loss terms for the multiple discriminator models. For a source sentence and a given target style combination, the style transfer computing system applies the style transfer language model on the source sentence to generate a target sentence in the given target style combination.
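The loss described above, one term for the cascaded language model plus one term per selected discriminator, can be written schematically as follows. The weights \lambda_k and the symbol names are assumptions for illustration; the abstract does not say whether or how the terms are weighted.

```latex
\mathcal{L}(\theta)
  = \mathcal{L}_{\mathrm{LM}}(\theta)
  + \sum_{k=1}^{K} \lambda_k \, \mathcal{L}_{D_k}(\theta)
```

Here K is the number of target styles in the chosen combination, \mathcal{L}_{\mathrm{LM}} is the loss term for the cascaded language model, and \mathcal{L}_{D_k} is the loss term for the discriminator of the k-th target style.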
Data compression for machine learning tasks
A machine learning (ML) task system trains a neural network model that learns a compressed representation of acquired data and performs an ML task using the compressed representation. The neural network model is trained to generate a compressed representation that balances the objectives of achieving a target codelength and achieving a high accuracy of the output of the performed ML task. During deployment, an encoder portion and a task portion of the neural network model are separately deployed. A first system acquires data, applies the encoder portion to generate a compressed representation, performs an encoding process to generate compressed codes, and transmits the compressed codes. A second system regenerates the compressed representation from the compressed codes and applies the task model to determine the output of an ML task.
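One common way to express a balance between codelength and task accuracy is a weighted sum of a task loss and a codelength penalty. The form below is only an illustrative assumption: the abstract does not give the objective, and the penalty shape, the weight \lambda, and all symbol names are hypothetical.

```latex
\mathcal{L}(\theta, \phi)
  = \mathcal{L}_{\mathrm{task}}\bigl(g_\phi(z),\, y\bigr)
  + \lambda \,\bigl|\, R(z) - R_{\mathrm{target}} \,\bigr|,
  \qquad z = f_\theta(x)
```

Here f_\theta is the encoder portion, g_\phi is the task portion, R(z) is the estimated codelength of the compressed representation z, and R_{\mathrm{target}} is the target codelength.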
Tyre sidewall imaging method
A computer implemented method is proposed for classifying one or more embossed and/or engraved markings on a sidewall of a tyre into one or more classes comprising digital image data of the sidewall of the tyre. The method comprises generating a first image channel from a first portion of the digital image data relating to a corresponding first portion of the sidewall of the tyre. Generating the first image channel comprises performing histogram equalisation on the first portion of the digital image data to generate the first image channel. The method further comprises generating a first feature map using the first image channel and applying a first classifier to the first feature map to classify said embossed and/or engraved markings into one or more first classes.
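Histogram equalisation, the step used above to build the first image channel, can be sketched in plain Python. This is the textbook cumulative-distribution mapping for an 8-bit grayscale image, shown as an illustration rather than as the specific variant used in the method. The function name and the nested-list image format are assumptions.

```python
def equalise_histogram(image, levels=256):
    """Spread the intensities of a grayscale image over the full range.

    image: 2D list of integer intensities in [0, levels - 1].
    """
    flat = [value for row in image for value in row]
    histogram = [0] * levels
    for value in flat:
        histogram[value] += 1

    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return [row[:] for row in image]

    scale = (levels - 1) / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lookup[value] for value in row] for row in image]


low_contrast = [[50, 50], [100, 150]]
channel = equalise_histogram(low_contrast)
```

The darkest input level maps to 0 and the brightest to 255, so a narrow band of intensities on a dark sidewall is stretched across the whole range.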
CROSS-MODALITY NEURAL NETWORK TRANSFORM FOR SEMI-AUTOMATIC MEDICAL IMAGE ANNOTATION
A cross-modality neural network transform for semi-automatic medical image annotation is provided. In various embodiments, an input medical image is mapped to a first vector in a text vector space. The first vector corresponds to the features of the medical image. A set of predetermined vectors is searched for a closest one of the predetermined vectors to the first vector. From the closest one of the predetermined vectors, one or more keywords are determined describing the input medical image.
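The search for the closest predetermined vector can be sketched as a nearest-neighbour lookup. The example assumes the image has already been mapped into the text vector space. Cosine similarity is an assumed notion of "closest", and the entry format and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_keywords(image_vector, entries):
    """Return the keywords attached to the closest predetermined vector.

    entries: list of (vector, keywords) pairs in the text vector space.
    """
    _, keywords = max(
        entries, key=lambda entry: cosine_similarity(image_vector, entry[0])
    )
    return keywords


entries = [
    ([1.0, 0.0], ["fracture"]),
    ([0.0, 1.0], ["effusion"]),
]
suggested = nearest_keywords([0.9, 0.1], entries)
```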
Two-dimensional document processing
Disclosed herein are system, method, and computer program product embodiments for processing a document. In an embodiment, a document processing system may receive a document. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters. The document processing system may generate a down-sampled two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid to obtain semantic meaning for the document. The convolutional neural network may produce a segmentation mask and bounding boxes to correspond to the document.
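The down-sampled character grid can be sketched as follows: each recognised character paints an integer index into the grid cells that its bounding box covers. The input format, the index assignment, and the use of 0 for background are assumptions, and the convolutional network that consumes the grid is not shown.

```python
def build_char_grid(chars, page_w, page_h, grid_w, grid_h):
    """Rasterise OCR characters into a down-sampled grid of character indices.

    chars: list of (char, x0, y0, x1, y1) boxes in page coordinates.
    Returns (grid, vocab); grid cells hold 0 for background or a character index.
    """
    vocab = {}
    grid = [[0] * grid_w for _ in range(grid_h)]
    for char, x0, y0, x1, y1 in chars:
        index = vocab.setdefault(char, len(vocab) + 1)
        gx0 = int(x0 * grid_w / page_w)
        gy0 = int(y0 * grid_h / page_h)
        # Each character covers at least one cell, even when very small.
        gx1 = max(gx0 + 1, int(x1 * grid_w / page_w))
        gy1 = max(gy0 + 1, int(y1 * grid_h / page_h))
        for gy in range(gy0, min(gy1, grid_h)):
            for gx in range(gx0, min(gx1, grid_w)):
                grid[gy][gx] = index
    return grid, vocab


chars = [("A", 0, 0, 10, 10), ("B", 10, 0, 20, 10)]
grid, vocab = build_char_grid(chars, page_w=20, page_h=10, grid_w=4, grid_h=2)
```

The resulting grid keeps the two-dimensional layout of the page while replacing pixels with character identities.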