G06V30/19147

Deep learning based on image encoding and decoding
11593632 · 2023-02-28

A deep learning based compression (DLBC) system trains multiple models that, when deployed, generate a compressed binary encoding of an input image that achieves a reconstruction quality and a target compression ratio. The applied models effectively identify structures of an input image, quantize the input image to a target bit precision, and compress the binary code of the input image via adaptive arithmetic coding to a target codelength. During training, the DLBC system reconstructs the input image from the compressed binary encoding and determines the loss in quality from the encoding process. Thus, the models can be continually trained so that, when applied to an input image, they minimize the loss in reconstruction quality that arises from the encoding process while also achieving the target compression ratio.
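
The quantization and codelength ideas in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function names, the uniform quantizer, and the use of the empirical Shannon bound as a stand-in for the codelength an arithmetic coder would approach are all assumptions made for illustration.

```python
import math
from collections import Counter

def quantize(values, bits):
    """Map floats in [0, 1] to integer levels 0 .. 2**bits - 1 (uniform quantizer, assumed)."""
    levels = (1 << bits) - 1
    return [min(levels, max(0, round(v * levels))) for v in values]

def dequantize(codes, bits):
    """Map integer levels back to floats in [0, 1]."""
    levels = (1 << bits) - 1
    return [c / levels for c in codes]

def ideal_codelength_bits(codes):
    """Sum of -log2 p(c) under empirical symbol frequencies.

    An arithmetic coder driven by the same probabilities approaches this bound.
    """
    counts = Counter(codes)
    n = len(codes)
    return sum(-math.log2(counts[c] / n) for c in codes)

def mse(a, b):
    """Mean squared error, used here as the reconstruction loss."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Toy "image": eight pixel intensities in [0, 1].
pixels = [0.0, 0.1, 0.12, 0.4, 0.6, 0.9, 1.0, 1.0]
codes = quantize(pixels, bits=2)
recon = dequantize(codes, bits=2)
loss = mse(pixels, recon)
codelength = ideal_codelength_bits(codes)
```

Lowering `bits` shortens the code but raises `loss`; a training loop of the kind the abstract describes would trade these two quantities off against the target compression ratio.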

Object detection and image cropping using a multi-detector approach
11593585 · 2023-02-28

Computer-implemented methods for detecting objects within digital image data based on color transitions include: receiving or capturing a digital image depicting an object; sampling color information from a first plurality of pixels of the digital image, wherein each of the first plurality of pixels is located in a background region of the digital image; optionally sampling color information from a second plurality of pixels of the digital image, wherein each of the second plurality of pixels is located in a foreground region of the digital image; assigning each pixel a label of either foreground or background using an adaptive label learning process; binarizing the digital image based on the labels assigned to each pixel; detecting contour(s) within the binarized digital image; and defining edge(s) of the object based on the detected contour(s). Corresponding systems and computer program products configured to perform the inventive methods are also described.
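
The sequence of steps listed above (sample background color, label pixels, binarize, locate the object's edges) can be sketched as follows. This is a simplified illustration, not the claimed method: a fixed color-distance threshold stands in for the adaptive label learning process, and a bounding box stands in for contour detection. All function names are hypothetical.

```python
def mean_color(image, coords):
    """Average RGB color over the sampled pixel coordinates."""
    n = len(coords)
    return tuple(sum(image[r][c][k] for r, c in coords) / n for k in range(3))

def binarize(image, background, threshold):
    """Label each pixel 1 (foreground) or 0 (background).

    Assumed rule: a pixel is foreground when its Euclidean RGB distance
    from the sampled background color exceeds the threshold.
    """
    t2 = threshold ** 2
    return [
        [1 if sum((p[k] - background[k]) ** 2 for k in range(3)) > t2 else 0
         for p in row]
        for row in image
    ]

def bounding_edges(mask):
    """Return (top, left, bottom, right) of the foreground region, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]

# Toy 5x6 image: grey background with a dark 3x3 object.
BG, FG = (200, 200, 200), (30, 60, 90)
image = [[BG] * 6 for _ in range(5)]
for r in range(1, 4):
    for c in range(2, 5):
        image[r][c] = FG

corners = [(0, 0), (0, 5), (4, 0), (4, 5)]  # pixels assumed to be background
background = mean_color(image, corners)
mask = binarize(image, background, threshold=50)
edges = bounding_edges(mask)
```

Sampling the image corners assumes the object does not touch them; the optional foreground sampling mentioned in the abstract would refine the labels when that assumption is weak.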

SYSTEM AND METHOD FOR RECOGNIZING ONLINE HANDWRITING
20220366713 · 2022-11-17

A method for recognizing online handwriting, comprising: acquiring, by a handwriting instrument comprising a module comprising at least one motion sensor, motion data on the handwriting of a user when the user is writing a sequence of characters with the handwriting instrument, the handwriting instrument further including a body extending longitudinally between a first end and a second end, the first end having a writing tip which is able to write on a support; and analyzing the motion data with a machine learning model trained in a multi-task way, the machine learning model being configured to deliver as an output the sequence of characters which was written by the user with the handwriting instrument.

SEMANTIC SIMILARITY FOR SKU VERIFICATION

A semantic similarity fingerprint is generated for an image by inferring a plurality of SKUs, each at an associated first weight, based upon analysis of the image using machine learning models. Together, the associated first weights for each of the classifications from each machine learning model form the semantic similarity fingerprint. The semantic similarity fingerprint may be compared to previously generated semantic similarity fingerprints. If a match is found with a semantic similarity fingerprint that has previously been identified as a particular.
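
A fingerprint built from per-model SKU weights, and its comparison against stored fingerprints, might look like the sketch below. The abstract does not say how fingerprints are compared; cosine similarity, the `min_similarity` threshold, and all names and weights here are assumptions for illustration.

```python
import math

def fingerprint(model_outputs, sku_order):
    """Concatenate each model's SKU weights in a fixed order into one vector."""
    return [out.get(sku, 0.0) for out in model_outputs for sku in sku_order]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def best_match(query, known, min_similarity=0.95):
    """Return the label of the closest stored fingerprint, or None if none is close enough."""
    best = max(known.items(), key=lambda kv: cosine(query, kv[1]), default=None)
    if best is None or cosine(query, best[1]) < min_similarity:
        return None
    return best[0]

skus = ["cola", "diet_cola", "water"]  # hypothetical SKU classes

# Previously generated fingerprints, each from two hypothetical models.
known = {
    "cola_12oz": fingerprint(
        [{"cola": 0.70, "diet_cola": 0.25, "water": 0.05},
         {"cola": 0.60, "diet_cola": 0.30, "water": 0.10}], skus),
    "water_1l": fingerprint(
        [{"cola": 0.05, "diet_cola": 0.05, "water": 0.90},
         {"cola": 0.10, "diet_cola": 0.10, "water": 0.80}], skus),
}

query = fingerprint(
    [{"cola": 0.68, "diet_cola": 0.27, "water": 0.05},
     {"cola": 0.62, "diet_cola": 0.28, "water": 0.10}], skus)
match = best_match(query, known)
```

Because the fingerprint keeps the weights of the runner-up SKUs rather than only the top prediction, two images of visually similar products can still be told apart by how their confusion is distributed.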

SYSTEMS AND METHODS FOR KNOWLEDGE BASE QUESTION ANSWERING USING GENERATION AUGMENTED RANKING
20230059870 · 2023-02-23

Embodiments described herein provide a question answering approach that answers a question by generating an executable logical form. First, a ranking model is used to select a set of good logical forms from a pool of logical forms obtained by searching over a knowledge graph. The selected logical forms are good in the sense that they are close to (or exactly match, in some cases) the intents in the question and final desired logical form. Next, a generation model is adopted conditioned on the question as well as the selected logical forms to generate the target logical form and execute it to obtain the final answer. For example, at the inference stage, when a question is received, a matching logical form is identified from the question, and the final answer is generated from the node that is associated with the matching logical form in the knowledge base.

Methods and systems for accurately recognizing vehicle license plates

Systems can be configured for detecting license plates and recognizing characters in license plates. In an example, a system can receive an image and identify one or more regions in the image that include a license plate. Character recognition can be performed in the one or more regions to determine contents of a candidate license plate. Location-specific information about a license plate format can be used together with the determined contents of the candidate license plate to determine if the recognized characters are valid.
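
The final validation step, checking recognized characters against a location-specific format, can be sketched with regular expressions. The region names and plate formats below are hypothetical and chosen only for illustration; real jurisdictions use different and more varied formats.

```python
import re

# Hypothetical plate formats keyed by region (not real jurisdictions).
PLATE_FORMATS = {
    "region_a": re.compile(r"[A-Z]{3}[0-9]{4}"),       # e.g. ABC1234
    "region_b": re.compile(r"[0-9][A-Z]{3}[0-9]{3}"),  # e.g. 7XYZ123
}

def normalize(text):
    """Upper-case the OCR output and drop separators such as spaces and dashes."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())

def is_valid_plate(recognized, region):
    """Return True if the recognized characters fit the region's plate format."""
    pattern = PLATE_FORMATS.get(region)
    if pattern is None:
        return False  # unknown region: cannot confirm validity
    return pattern.fullmatch(normalize(recognized)) is not None

print(is_valid_plate("abc-1234", "region_a"))
print(is_valid_plate("ABC1234", "region_b"))
```

A candidate that fails the check for the expected region can be rejected or sent back for another recognition pass, which is one way the format information can reduce false reads.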

Cross modality training of machine learning models

There is provided a method, comprising: providing a training dataset including medical images and corresponding text based reports, and concurrently training a natural language processing (NLP) machine learning (ML) model for generating an NLP category for a target text based report and a visual ML model for generating a visual finding for a target image, by: training the NLP ML model using the text based reports of the training dataset and a ground truth comprising the visual finding generated by the visual ML model in response to an input of the images corresponding to the text based reports of the training dataset, and training the visual ML model using the images of the training dataset and a ground truth comprising the NLP category generated by the NLP ML model in response to an input of the text based reports corresponding to the images of the training dataset.

MACHINE LEARNING-BASED TEXT RECOGNITION SYSTEM WITH FINE-TUNING MODEL

A non-transitory processor-readable medium stores instructions to be executed by a processor. The instructions cause the processor to receive a first trained machine learning model that generates a transcription based on a document. The instructions cause the processor to execute the first trained machine learning model and a second trained machine learning model to generate a refined transcription based on the transcription. The instructions cause the processor to execute a quality assurance program to generate a transcription score based on the document and the transcription. The instructions cause the processor to execute the quality assurance program to generate a refined transcription score based on the refined transcription and at least one of the document or the transcription. The refined transcription score indicates a better automation performance than the transcription score.

Systems and Methods for Generating Document Numerical Representations

Described embodiments relate to a method comprising: determining a candidate document comprising image data and character data and extracting the image data and the character data from the candidate document. The method comprises providing, to an image-based numerical representation generation model, the image data, and generating, by the image-based numerical representation generation model, an image-based numerical representation of the image data. The method comprises providing, to a character-based numerical representation generation model, the character data; and generating, by the character-based numerical representation generation model, a character-based numerical representation of the character data. The method comprises providing, to a consolidated image-character based numerical representation generation model, the image-based numerical representation and the character-based numerical representation; and generating, by the consolidated image-character based numerical representation generation model, a combined image-character based numerical representation of the candidate document.
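
The three-model data flow described above (image representation, character representation, consolidated representation) can be sketched with simple stand-ins. None of these functions reflects the actual models: an intensity histogram, character-code buckets, and concatenation with L2 normalization are assumptions used only to show how the pieces connect.

```python
import math

def image_representation(pixels, bins=4):
    """Stand-in for the image model: normalized histogram of 0-255 intensities."""
    hist = [0.0] * bins
    for p in pixels:
        hist[min(bins - 1, p * bins // 256)] += 1
    total = len(pixels) or 1
    return [h / total for h in hist]

def character_representation(text, dims=4):
    """Stand-in for the character model: character codes bucketed into dims slots."""
    vec = [0.0] * dims
    for ch in text:
        vec[ord(ch) % dims] += 1
    total = len(text) or 1
    return [v / total for v in vec]

def combined_representation(image_vec, char_vec):
    """Stand-in for the consolidated model: concatenate, then L2-normalize."""
    joint = image_vec + char_vec
    norm = math.sqrt(sum(x * x for x in joint)) or 1.0
    return [x / norm for x in joint]

# Toy candidate document: four pixel intensities plus extracted text.
image_data = [0, 10, 128, 255]
character_data = "INV-42"

image_vec = image_representation(image_data)
char_vec = character_representation(character_data)
doc_vec = combined_representation(image_vec, char_vec)
```

In a learned system each stand-in would be a trained model, but the interface stays the same: two modality-specific vectors go in and one document-level vector comes out.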

CREATING APPLICATIONS AND TEMPLATES BASED ON DIFFERENT TYPES OF INPUT CONTENT

The disclosure herein describes generating an application from input content. Input content of a content type is obtained, such as an image file, digital document file, or the like. A content data extractor is selected from a set of content data extractors based on the content type. A set of content entities, such as text labels, text boxes, buttons, or the like, is extracted from the obtained input content using the selected content data extractor. The set of content entities is normalized according to a standard interface schema, and an application template is generated using the normalized set of content entities, whereby an application can be developed using the generated application template. The disclosure enables application interfaces to be designed using a variety of methods and for those different types of designs to be efficiently converted to a functional application.
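
The select, extract, normalize, and generate flow described above can be sketched as a small dispatch table. The extractors, the input formats they accept, the schema fields (`kind`, `text`), and the template shape are all hypothetical; the disclosure does not specify them.

```python
def extract_from_document(data):
    """Hypothetical document extractor: each line is '<kind>: <text>'."""
    entities = []
    for line in data.splitlines():
        kind, _, text = line.partition(":")
        if text:
            entities.append({"type": kind.strip(), "value": text.strip()})
    return entities

def extract_from_image(data):
    """Hypothetical image extractor: takes already-detected (kind, text) regions."""
    return [{"Kind": kind, "Text": text} for kind, text in data]

# One extractor per content type; selection is a dictionary lookup.
EXTRACTORS = {"document": extract_from_document, "image": extract_from_image}

# Assumed standard schema: canonical names for entity kinds.
KIND_ALIASES = {"label": "text_label", "textbox": "text_box", "button": "button"}

def normalize(entity):
    """Map an extractor-specific entity onto the assumed {'kind', 'text'} schema."""
    lowered = {k.lower(): v for k, v in entity.items()}
    kind = lowered.get("kind", lowered.get("type", "")).lower()
    text = lowered.get("text", lowered.get("value", ""))
    return {"kind": KIND_ALIASES.get(kind, kind), "text": text}

def build_template(content_type, data):
    """Select an extractor by content type and emit a normalized template."""
    extractor = EXTRACTORS[content_type]
    return {"controls": [normalize(e) for e in extractor(data)]}

doc_template = build_template("document", "label: Name\nbutton: Submit")
img_template = build_template("image", [("Textbox", "Email")])
```

Normalizing every extractor's output to one schema is what lets the template generator stay unaware of whether the design started as an image or a document.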