Patent classifications
G06V30/19013
VISUAL MODE IMAGE COMPARISON
A method, a computer program product, and a computer system compare images for content consistency. The method includes receiving a first image including a first document and a second image including a second document. The method includes performing a visual classification analysis on the first image and the second image. The visual classification analysis generates an overlap of the first image with the second image. The method includes determining whether a region of the overlap is indicative of a content inconsistency. As a result of the region of the overlap being indicative of a content inconsistency, the method includes performing a character recognition analysis on a first area of the first image and a second area of the second image corresponding to the region of the overlap to verify the content inconsistency.
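The two-stage screening this abstract describes — a cheap pixel-level overlap pass first, with character recognition run only on suspect regions — can be sketched as follows. The helper names and the list-of-lists image representation are assumptions for illustration, not the patent's implementation:

```python
def diff_region(img_a, img_b, threshold=0):
    """Overlay two equal-size grayscale images (2D lists of ints) and
    return the bounding box (top, left, bottom, right) of the pixels whose
    difference exceeds `threshold`, or None if no inconsistency is found."""
    coords = [(r, c)
              for r in range(len(img_a))
              for c in range(len(img_a[0]))
              if abs(img_a[r][c] - img_b[r][c]) > threshold]
    if not coords:
        return None  # visual pass found the documents consistent
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))

def verify_inconsistency(img_a, img_b, box, recognize):
    """Run character recognition only on the suspect region of each image;
    confirm the inconsistency only if the recognized content differs."""
    top, left, bottom, right = box
    crop = lambda img: [row[left:right + 1] for row in img[top:bottom + 1]]
    return recognize(crop(img_a)) != recognize(crop(img_b))
```

Restricting recognition to the flagged box keeps the expensive OCR step off regions the visual pass already judged consistent.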
APPARATUS AND METHOD FOR RECOMMENDING LEARNING USING OPTICAL CHARACTER RECOGNITION
A learning recommendation apparatus and method are provided for detecting a problem in an image through character recognition and recommending at least one sub-topic lesson among a plurality of sub-topic lessons related to the detected problem. The apparatus recommends a plurality of learning topics covering the concept of a formula read from the image through character recognition, assigns each topic a priority based on the concept distance between the topic and the user's learning history, and presents the topics so that topics with a higher priority appear higher in the list.
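The priority ordering by concept distance can be sketched as below; the Euclidean distance over toy concept embeddings is a stand-in assumption, since the abstract does not define the distance measure:

```python
def concept_distance(topic_vec, history_vec):
    """Euclidean distance between a topic's concept embedding and the
    learner's history embedding (a toy stand-in for the patent's measure)."""
    return sum((t - h) ** 2 for t, h in zip(topic_vec, history_vec)) ** 0.5

def recommend(topic_vecs, history_vec):
    """Order topic names so the one closest in concept to the learning
    history (highest priority) comes first."""
    return sorted(topic_vecs,
                  key=lambda name: concept_distance(topic_vecs[name], history_vec))
```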
Apparatus, systems, and methods for detecting and indexing clinical images of patient encounters
Technologies and techniques for managing clinical images in electronic healthcare record systems are disclosed. Disclosed methods and apparatus provide indexing of captured clinical images with minimal or no user input. This is accomplished through providing the ability to set a predetermined type of patient encounter and then retrieving a clinical image of a patient from either an image capture device such as a camera in a mobile device or from a memory device. The methods and systems may also include image recognition of the clinical image to determine the type of the clinical image. Additionally, the indexing is automated, where the automated indexing is based on at least one of the predetermined type of patient encounter and the determined type of clinical image.
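The automated indexing step — keying each captured image by the predetermined encounter type and the recognized image type — can be sketched minimally; the `classify` callback and its labels are illustrative stand-ins for the image-recognition component:

```python
def index_clinical_image(index, encounter_type, image, classify):
    """Index a captured image under (encounter type, recognized image type)
    so later retrieval needs no manual tagging."""
    image_type = classify(image)  # e.g. "wound", "rash" (illustrative labels)
    index.setdefault((encounter_type, image_type), []).append(image)
    return image_type
```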
Text detection, caret tracking, and active element detection
Detection of typed and/or pasted text, caret tracking, and active element detection for a computing system are disclosed. The on-screen location where a user has been typing or pasting text, potentially including hot keys or other keys that do not produce visible characters, can be identified, and the physical position on the screen where the typing or pasting occurred can be provided, at the current screen resolution, based on where one or more characters appeared, where the cursor was blinking, or both. This is done by identifying locations on the screen where changes occurred and performing text recognition and/or caret detection at those locations. The physical position of the typing or pasting activity allows determination of the active or focused element in an application displayed on the screen.
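The core idea — diff consecutive screen captures, then run recognition or caret detection only where something changed — can be sketched as follows. The row-level diff and the blink heuristic are illustrative assumptions, not the disclosed algorithm:

```python
def changed_rows(prev_frame, cur_frame):
    """Compare two consecutive screen captures (2D lists of pixel values)
    and return the row indices that changed -- the only places where text
    recognition or caret detection then needs to run."""
    return [r for r, (p, c) in enumerate(zip(prev_frame, cur_frame)) if p != c]

def caret_candidates(frames, row):
    """Columns in `row` that toggle between consecutive frames, i.e. a
    blinking-cursor signature (an illustrative heuristic)."""
    cols = set()
    for a, b in zip(frames, frames[1:]):
        cols.update(c for c, (x, y) in enumerate(zip(a[row], b[row])) if x != y)
    return sorted(cols)
```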
TEXT EXTRACTION METHOD, TEXT EXTRACTION MODEL TRAINING METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM
A text extraction method and a text extraction model training method are provided. The present disclosure relates to the technical field of artificial intelligence, in particular to the technical field of computer vision. An implementation of the method comprises: obtaining a visual encoding feature of a to-be-detected image; extracting a plurality of sets of multimodal features from the to-be-detected image, wherein each set of multimodal features includes position information of one detection frame extracted from the to-be-detected image, a detection feature in the detection frame and first text information in the detection frame; and obtaining second text information matched with a to-be-extracted attribute based on the visual encoding feature, the to-be-extracted attribute and the plurality of sets of multimodal features, wherein the to-be-extracted attribute is an attribute of text information needing to be extracted.
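The matching step — selecting, among the per-detection-frame multimodal feature sets, the text that best fits the to-be-extracted attribute — can be sketched with a toy similarity in place of the disclosed fusion model; the tuple layout and cosine scoring are assumptions:

```python
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5))

def extract_attribute(attribute_vec, multimodal_sets):
    """multimodal_sets: list of (box, feature_vec, first_text) tuples, one
    per detection frame. Return the second text information: the text whose
    feature best matches the to-be-extracted attribute."""
    return max(multimodal_sets, key=lambda m: cosine(attribute_vec, m[1]))[2]
```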
SYSTEMS AND METHODS FOR EXTRACTING, DIGITIZING, AND USING ENGINEERING DRAWING DATA
Reuse of an object, or a part of an object, is highly important in the manufacturing industry, as it can drastically reduce the cost and time spent on manufacturing. However, a lack of proper information about the availability of similar parts leads to redesigning of similar parts. Existing databases for engineering drawings do not store categorized information, so feature-based search is not possible. The present application provides systems and methods for extracting, digitizing, and using engineering drawing data. The system receives an engineering drawing document and extracts the text data present in each cell of a table provided in the document. Once the table data is extracted, the isometric views and the views other than isometric views present in the document are identified by the system using a pretrained machine learning based model. The system further extracts view labels and coordinate information from the identified views. The information extracted from the document is then stored by the system as the engineering drawing data for the document.
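The view-identification step can be sketched as a simple partition driven by a classifier callback, which here stands in for the pretrained machine learning model mentioned in the abstract:

```python
def split_views(views, classify):
    """Separate detected drawing views into isometric and non-isometric
    groups. `classify` is an illustrative stand-in for the pretrained model."""
    isometric, other = [], []
    for view in views:
        (isometric if classify(view) == "isometric" else other).append(view)
    return isometric, other
```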
Mobile supplementation, extraction, and analysis of health records
A system, method, and mobile device application are configured to capture, with a mobile device, a document such as a next generation sequencing (NGS) report that includes NGS medical information about a genetically sequenced patient. The method includes receiving, from a mobile device, an image of a medical document comprising NGS medical information of the patient, extracting a first region from the image, extracting NGS medical information of the patient from the first region into a structured dataset, the extracted NGS medical information including at least one RNA expression, correlating a portion of the extracted NGS medical information that includes the at least one RNA expression with summarized medical information from a cohort of patients similar to the patient, and generating, for display on the mobile device, a clinical decision support report comprising the summarized medical information.
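The step of extracting RNA-expression values from a report region into a structured dataset can be sketched with a regular expression; the line format below is hypothetical, since the patent does not specify one:

```python
import re

# Hypothetical report-line format, e.g. "EGFR RNA expression: 12.5".
EXPRESSION_LINE = re.compile(r"(?P<gene>[A-Z0-9]+)\s+RNA expression:\s+(?P<tpm>[\d.]+)")

def extract_expressions(region_text):
    """Parse RNA-expression rows out of an extracted report region into a
    structured dataset (a list of records)."""
    return [{"gene": m["gene"], "tpm": float(m["tpm"])}
            for m in EXPRESSION_LINE.finditer(region_text)]
```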
Continuous machine learning method and system for information extraction
Methods and systems for artificial intelligence (AI)-assisted document annotation and training of machine learning-based models for document data extraction are described. The methods and systems described herein take advantage of a continuous machine learning approach to create document processing pipelines that provide accurate and efficient data extraction from documents that include structured text, semi-structured text, unstructured text, or any combination thereof.
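The continuous-learning loop behind such a pipeline — extract, let an annotator correct, retrain on corrections — can be sketched as below; all four callbacks are illustrative stand-ins, not the described system's components:

```python
def continuous_extraction(extract, documents, review, train):
    """Human-in-the-loop loop: run extraction, collect annotator
    corrections, and retrain on each correction so accuracy improves as
    documents flow through the pipeline."""
    for doc in documents:
        predicted = extract(doc)
        corrected = review(doc, predicted)  # AI-assisted annotation step
        if corrected != predicted:
            extract = train(extract, doc, corrected)
    return extract
```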
METHOD FOR EXTRACTING CHARACTERS FROM VEHICLE LICENSE PLATE, AND LICENSE PLATE CHARACTER EXTRACTION DEVICE FOR PERFORMING METHOD
There is provided a method of extracting characters from a license plate of a vehicle performed by a license plate character extraction device. The method comprises: converting an input image obtained by capturing the license plate of the vehicle into a grayscale image; generating a converted image based on a result of comparing a value of at least one pixel included in the grayscale image with a first average of values of pixels adjacent to the at least one pixel; generating a refined image based on a result of comparing the converted image with a binarized image obtained by binarizing the converted image; and extracting characters included in the refined image.
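One plausible reading of this pipeline (the abstract does not state the exact comparison rules, so the local-mean rule and the intersection-based refinement below are assumptions) can be sketched in pure Python:

```python
def to_gray(rgb_image):
    """Luma conversion of an RGB image (2D list of (r, g, b) tuples)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]

def local_mean_convert(gray, radius=1):
    """'Converted image' step: a pixel darker than the mean of its
    neighbours becomes foreground (1), others background (0)."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            neigh = [gray[rr][cc]
                     for rr in range(max(0, r - radius), min(h, r + radius + 1))
                     for cc in range(max(0, c - radius), min(w, c + radius + 1))
                     if (rr, cc) != (r, c)]
            out[r][c] = 1 if gray[r][c] < sum(neigh) / len(neigh) else 0
    return out

def refine(converted, binarized):
    """'Refined image' step: keep only pixels on which the local
    (converted) and global (binarized) images agree, suppressing noise
    before character extraction."""
    return [[a & b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(converted, binarized)]
```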
METHOD AND ELECTRONIC DEVICE FOR RECOGNIZING TEXT IN IMAGE
A method and an electronic device for recognizing text are provided. The method includes detecting positions of pieces of text included in the text in the image, generating cropped images by cropping areas corresponding to the pieces of text in the image, recognizing characters of the pieces of text based on the cropped images, generating a sentence by inputting the positions of the pieces of text and the characters of the pieces of text to a multimodal language model, wherein the multimodal language model is an artificial intelligence (AI) model for inferring an original sentence of the text, and displaying the sentence.
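The detect-crop-recognize-infer pipeline this abstract describes can be sketched with callbacks standing in for the detector, the character recognizer, and the multimodal language model (all three are illustrative assumptions):

```python
def crop(image, box):
    """Cut the area corresponding to one piece of text out of the image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]

def recognize_image_text(image, detect, recognize, language_model):
    """Detect text positions, crop each piece, recognize its characters,
    then let a language model infer the original sentence from the
    positions plus the recognized characters."""
    boxes = detect(image)
    pieces = [recognize(crop(image, box)) for box in boxes]
    return language_model(boxes, pieces)
```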