G06V30/19133

Computer system and method for detecting, extracting, weighing, benchmarking, scoring, reporting and capitalizing on complex risks found in buy/sell transactional agreements, financing agreements and research documents
12106382 · 2024-10-01

Computer-implemented systems and methods enhance a user's sophistication as she/he reviews complex information sources using specialized detective tools provided by a user interface of the computer system. The specialized investigative inquiries are stored in a database and are particularly tailored a priori by a subject-matter content designer for the type of documents being reviewed for risk and opportunity. The investigative scripts are organized into a path of risk-related subjects or topics, and within each path of subjects/topics the investigative scripts are organized into a specialized inquiry or flow chart.

GUIDING SELECTION OF DIGITAL CONTENT BASED ON EDITORIAL CONTENT

A digital magazine server generates a digital magazine for a user based on a received request for the digital magazine identifying one or more topics. The digital magazine server applies one or more machine-trained models to obtained content items to select content items for the topics. A hierarchy of the topics included in the received request may be determined by the digital magazine server and used by the trained models to select content items. When generating the digital magazine, the digital magazine server also includes one or more editorial content items that are manually selected. The digital magazine server may reposition one or more content items selected by the trained models to include an editorial content item.
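
The repositioning step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `merge_editorial`, the slot-index convention, and the sample item names are hypothetical and do not come from the abstract, which does not say how positions are chosen.

```python
def merge_editorial(ranked_items, editorial_items, slots):
    """Insert manually selected editorial items into a model-ranked list.

    ranked_items: content items in the order chosen by the trained models.
    editorial_items: manually selected items (hypothetical input).
    slots: target index for each editorial item (hypothetical convention).
    Model-selected items at or after a slot are shifted down by one.
    """
    result = list(ranked_items)
    # Insert in ascending slot order so earlier inserts do not disturb later slots.
    for position, item in sorted(zip(slots, editorial_items)):
        result.insert(min(position, len(result)), item)
    return result


layout = merge_editorial(["a", "b", "c"], ["E1", "E2"], [0, 2])
```

Here the model-selected items keep their relative order, and only their positions change to make room for the editorial items.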

Photo management

A method for image processing includes determining features of multiple stored images from a pre-trained deep convolutional network. The method also includes clustering each image of the multiple stored images based on the determined features.
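
The clustering step can be illustrated with a minimal sketch. It assumes the feature vectors have already been produced by some pre-trained network (that part is not shown), and it swaps in plain k-means as the clustering technique; the abstract does not name a specific algorithm. The function name, the deterministic initialisation, and the sample vectors are all hypothetical.

```python
from math import dist


def kmeans(features, k, iterations=20):
    """Cluster feature vectors with plain k-means; returns one cluster index per vector."""
    # Assumption: use the first k vectors as initial centroids (deterministic, for illustration).
    centroids = [list(f) for f in features[:k]]
    assignments = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(f, centroids[c])) for f in features
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [f for f, a in zip(features, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


# Hypothetical 2-D stand-ins for network features of four stored photos.
photo_features = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9)]
labels = kmeans(photo_features, k=2)
```

In practice the vectors would have hundreds or thousands of dimensions, but the assignment and update steps are the same.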

Training data to increase pixel labeling accuracy

Techniques are described to generate improved training data for pixel labeling. To generate training data, objects are displayed in a user interface by a computing device, e.g., iteratively. The objects are taken from a structured object representation associated with a respective one of a plurality of images. The structured object representation defines a hierarchical relationship of the objects within the respective image. Inputs are then received that are originated through user interaction with the user interface. The inputs label respective ones of the iteratively displayed objects, e.g., as text, a graphical element, background, foreground, and so forth. A model is trained by the computing device using machine learning.

System and method for optical character recognition

This disclosure relates to a system and method for optical character recognition. In one embodiment, the method comprises providing image data to a plurality of customized machine learning algorithms or various customized neural networks, configured to recognize a set of pre-defined characters. The method comprises presenting one or more suggestions for the character to the user in response to negative character recognition, and training a customized machine learning algorithm corresponding to the character if one of the suggestions is identified by the user. If the suggestions are rejected by the user, the method comprises prompting the user to identify the character and determining presence of the character in the set of pre-defined characters. The method further comprises training a customized machine learning algorithm corresponding to the character if the character is present, or dynamically creating a customized machine learning algorithm corresponding to the character if the character is not present.
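
The branching logic after a failed recognition can be sketched as below. This is a hedged illustration only: the class `CharacterRecognizer`, its method names, and the returned action strings are hypothetical, and each per-character "model" is reduced to a list of collected training samples rather than an actual learning algorithm.

```python
class CharacterRecognizer:
    """Toy stand-in: one sample list per character plays the role of a per-character model."""

    def __init__(self, known_characters):
        self.models = {ch: [] for ch in known_characters}

    def train(self, character, sample):
        # Placeholder for training the model that corresponds to `character`.
        self.models[character].append(sample)

    def handle_unrecognized(self, sample, suggestions, user_pick, user_label=None):
        """Apply the three branches from the abstract and return the action taken."""
        # Branch 1: the user accepted one of the presented suggestions.
        if user_pick in suggestions:
            self.train(user_pick, sample)
            return "trained_suggested"
        # Suggestions rejected: the user must identify the character.
        if user_label is None:
            raise ValueError("user must identify the character")
        # Branch 2: the character is already in the pre-defined set.
        if user_label in self.models:
            self.train(user_label, sample)
            return "trained_existing"
        # Branch 3: create a new per-character model on the fly.
        self.models[user_label] = [sample]
        return "created_new"


recognizer = CharacterRecognizer("AB")
action = recognizer.handle_unrecognized("img-1", ["A"], user_pick="A")
```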

System and method for generating best potential rectified data based on past recordings of data
12183100 · 2024-12-31

Various methods, apparatuses/systems, and media for data processing are disclosed. A processor receives a digital document; applies an optical character recognition (OCR) algorithm on said received digital document by utilizing an OCR tool; identifies defective data extracted by the OCR tool resulting from relatively inferior image quality of the received digital document; implements an auto rectification algorithm on the identified defective data; automatically generates, in response to implementing the auto rectification algorithm, corresponding auto-rectified data for each identified defective data; records the defective data and corresponding auto-rectified data at a field level; receives user input data on said recorded auto-rectified data; determines whether the auto-rectified data is correct or not; and populates, based on determining that the auto-rectified data is correct, a machine learning model with said received user input data to be utilized for a subsequently received digital document.
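
A minimal sketch of the field-level record-and-confirm loop follows. Everything specific here is an assumption made for illustration: the defect test (a numeric field that contains non-digits), the character-confusion table, the `FieldRecord` type, and the plain list standing in for the data handed to a machine learning model. The abstract does not specify any of these.

```python
from dataclasses import dataclass

# Hypothetical table of common OCR confusions in numeric fields.
CONFUSIONS = str.maketrans({"O": "0", "l": "1", "S": "5"})


@dataclass
class FieldRecord:
    field: str
    defective: str
    rectified: str
    confirmed: bool = False


def rectify(fields):
    """Record defective values and their auto-rectified counterparts per field."""
    records = []
    for name, value in fields.items():
        # Assumption: every field is expected to be numeric, so non-digits mark a defect.
        if not value.isdigit():
            records.append(FieldRecord(name, value, value.translate(CONFUSIONS)))
    return records


def apply_feedback(records, feedback, training_set):
    """Keep only rectifications the user confirmed as correct for later training."""
    for rec in records:
        if feedback.get(rec.field):
            rec.confirmed = True
            training_set.append((rec.defective, rec.rectified))


records = rectify({"invoice_no": "1O45", "zip": "90210"})
training_set = []
apply_feedback(records, {"invoice_no": True}, training_set)
```

Rejected rectifications are left unconfirmed and never reach the training set, which mirrors the "based on determining that the auto-rectified data is correct" condition.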

Online, incremental real-time learning for tagging and labeling data streams for deep neural networks and neural network applications

Today, artificial neural networks are trained on large sets of manually tagged images. Generally, for better training, the training data should be as large as possible. Unfortunately, manually tagging images is time consuming and susceptible to error, making it difficult to produce the large sets of tagged data used to train artificial neural networks. To address this problem, the inventors have developed a smart tagging utility that uses a feature extraction unit and a fast-learning classifier to learn tags and tag images automatically, reducing the time to tag large sets of data. The feature extraction unit and fast-learning classifiers can be implemented as artificial neural networks that associate a label with features extracted from an image and tag similar features from the image or other images with the same label. Moreover, the smart tagging system can learn from user adjustment to its proposed tagging. This reduces tagging time and errors.
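
The idea of a fast-learning classifier that improves from user adjustments can be sketched with a nearest-prototype tagger. This is a swapped-in technique chosen for brevity, not the inventors' classifier, and it assumes feature vectors have already been extracted from the images. The class name and the sample vectors are hypothetical.

```python
from math import dist


class PrototypeTagger:
    """Keeps one running-mean feature vector per label and tags by nearest mean."""

    def __init__(self):
        self.prototypes = {}  # label -> (mean vector, sample count)

    def learn(self, features, label):
        """Incrementally fold one labeled example (or user correction) into its prototype."""
        mean, n = self.prototypes.get(label, ([0.0] * len(features), 0))
        n += 1
        mean = [m + (x - m) / n for m, x in zip(mean, features)]
        self.prototypes[label] = (mean, n)

    def propose(self, features):
        """Return the label whose prototype is closest, or None if nothing is learned yet."""
        if not self.prototypes:
            return None
        return min(
            self.prototypes, key=lambda lbl: dist(features, self.prototypes[lbl][0])
        )


tagger = PrototypeTagger()
tagger.learn([0.0, 0.0], "cat")
tagger.learn([10.0, 10.0], "dog")
before = tagger.propose([4.0, 4.0])   # closer to the "cat" prototype
tagger.learn([4.0, 4.0], "dog")       # user corrects the proposed tag
after = tagger.propose([4.0, 4.0])    # the "dog" prototype has moved closer
```

Each correction is a single constant-time update, which is what makes this kind of classifier suitable for interactive tagging.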

Dynamically adjusting instructions in an augmented-reality experience

Systems and methods for augmented-reality tutoring can utilize optical character recognition, natural language processing, and/or augmented-reality rendering for providing real-time notifications for completing a determined task. The systems and methods can include utilizing one or more machine-learned models trained for quantitative reasoning and can include providing a plurality of different user interface elements at different times.

AUTOMATIC DOCUMENT TEMPLATE INFERENCE, GENERATION, AND REFINEMENT

Various embodiments offer improved functionality for generating and/or refining templates that can be used for automatically extracting information from within an invoice or other document, based on geometric characteristics of the document. An initial template may be automatically generated, and such initial template may then be refined over time based on user feedback, so as to improve reliability and accuracy in information extraction.
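
One way to picture a geometry-based template and its refinement from feedback is the sketch below. It assumes a template is a mapping from field name to a bounding box `(x0, y0, x1, y1)` and that OCR output arrives as `(text, x, y)` tuples; both representations, the function names, and the blending rate are hypothetical choices, not details from the abstract.

```python
def extract(template, words):
    """Return, per field, the text of all words whose anchor point lies inside the field's box."""
    result = {}
    for field, (x0, y0, x1, y1) in template.items():
        inside = [t for t, x, y in words if x0 <= x <= x1 and y0 <= y <= y1]
        result[field] = " ".join(inside)
    return result


def refine(template, field, corrected_box, rate=0.5):
    """Move a field's box part of the way toward a user-corrected box."""
    old = template.get(field, corrected_box)
    template[field] = tuple(o + rate * (c - o) for o, c in zip(old, corrected_box))
    return template


template = {"total": (100, 200, 150, 220)}
words = [("42.00", 120, 210), ("Invoice", 10, 10)]
fields = extract(template, words)
refine(template, "total", (110, 200, 160, 220))
```

Blending toward the correction rather than overwriting the box is one possible way to let the template settle over repeated feedback.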

SIMULATION OF LABEL DATA TO OPTIMIZE THE VISUAL DOCUMENT UNDERSTANDING BY USING PDFS ANNOTATION AWARE METHODOLOGY
20250087005 · 2025-03-13

One example method includes obtaining a text-based document, identifying annotations in the text-based document, and retaining those annotations, converting the text-based document to an image, processing the image, creating simulated label data by integrating the processed image with the annotations, and using the processed simulated label data to train a machine learning model of an OCR (optical character recognition) system. The processed image is a lower quality version of the text-based document that was used to create the image.
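
The pairing of a degraded image with the retained annotations can be sketched as follows. The sketch assumes the document has already been rendered to a binary pixel grid and uses random pixel flips as a stand-in for the unspecified degradation; the function names, the annotation format, and the noise model are all hypothetical.

```python
import random


def degrade(image, noise=0.1, seed=0):
    """Flip a fraction of pixels in a binary image to imitate a lower-quality scan."""
    rng = random.Random(seed)  # seeded so the simulated data is reproducible
    return [[1 - p if rng.random() < noise else p for p in row] for row in image]


def make_sample(image, annotations, noise=0.1, seed=0):
    """Combine the degraded image with the original annotations as one training sample."""
    return {"image": degrade(image, noise, seed), "labels": list(annotations)}


# Hypothetical 4x4 rendering of a page and one annotation carried over from the source document.
clean = [[0, 1, 1, 0] for _ in range(4)]
annotations = [{"text": "Total", "box": (1, 0, 3, 4)}]
sample = make_sample(clean, annotations)
```

Because the labels come from the source document rather than from the degraded image, they stay exact no matter how much noise is applied.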