Patent classifications
G06V30/268
SUPERVISED OCR TRAINING FOR CUSTOM FORMS
The disclosed technology is generally directed to optical character recognition for forms. In one example of the technology, optical character recognition is performed on a plurality of forms. The forms of the plurality of forms include at least one type of form. Anchors are determined for the forms, including corresponding anchors for each type of form of the plurality of forms. Feature rules are determined, including corresponding feature rules for each type of form of the plurality of forms. Features and labels are determined for each form of the plurality of forms. A training model is generated based on a ground truth that includes a plurality of key-value pairs corresponding to the plurality of forms, and further based on the determined features and labels for the plurality of forms.
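The construction of training examples described above can be sketched roughly as follows: anchors locate stable text on each form type, feature rules derive features relative to those anchors, and ground-truth key-value pairs supply labels. All function names, feature rules, and data structures here are illustrative assumptions, not the patent's actual implementation.

```python
def build_training_examples(ocr_words, anchors, ground_truth):
    """ocr_words: list of (text, x, y) tuples from OCR.
    anchors: dict mapping anchor name -> (x, y) position for this form type.
    ground_truth: dict of key -> expected value for this form instance."""
    examples = []
    # invert the ground truth so an OCR'd value can be looked up by text
    truth_values = {v: k for k, v in ground_truth.items()}
    for text, x, y in ocr_words:
        # feature rule (assumed): offset of the word from each known anchor
        features = {}
        for name, (ax, ay) in anchors.items():
            features[f"dx_{name}"] = x - ax
            features[f"dy_{name}"] = y - ay
        # label the word with the ground-truth key it matches, if any
        label = truth_values.get(text, "other")
        examples.append((features, label))
    return examples
```

The (features, label) pairs produced this way would then feed whatever model generation step an implementation chooses.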
Labeling Training Set Data
A computer readable storage medium comprising instructions which, when executed, cause a processor to: generate a machine learning model based on a limited set of labeled training data and a larger set of unlabeled training data, the labeled and unlabeled training data having a common subject matter, by: identifying an inclusion list and an exclusion list of terms; taking a subset of unlabeled documents which contain any term from the inclusion list and excluding any document that contains a term from the exclusion list; identifying terms that are similar within a set standard to a term from the inclusion list or exclusion list and adding those identified terms to the inclusion list or exclusion list, respectively; repeating until no new similar terms are identified; and generating training data for the machine learning model comprising a final subset of documents for each category from the unlabeled training data.
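The term-list bootstrapping loop in the claim can be sketched as below, with string similarity from `difflib` standing in for whatever similarity measure ("similar within a set standard") an implementation would actually use. The function name and threshold are illustrative assumptions.

```python
import difflib

def expand_lists(documents, include, exclude, threshold=0.8):
    include, exclude = set(include), set(exclude)
    vocab = {w for doc in documents for w in doc.split()}
    changed = True
    while changed:  # repeat until no new similar terms are identified
        changed = False
        for word in vocab:
            for terms in (include, exclude):
                if word not in terms and any(
                    difflib.SequenceMatcher(None, word, t).ratio() >= threshold
                    for t in terms
                ):
                    terms.add(word)
                    changed = True
    # keep documents containing any included term and no excluded term
    subset = [
        doc for doc in documents
        if any(t in doc.split() for t in include)
        and not any(t in doc.split() for t in exclude)
    ]
    return subset, include, exclude
```

The final subset of documents per category would then serve as the generated training data.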
Information extraction and annotation systems and methods for documents
Information extraction and annotation systems and methods for use in annotating and determining annotation instances are provided herein. Exemplary methods include receiving training documents having annotated words, identifying a predetermined number of characters preceding and following each annotated word for each of the training documents to determine a context for each of the annotated words, performing an alignment of an annotated word and its context with characters in the target document, identifying common sequences, and assigning annotations to words in the target document when common sequences are found.
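A minimal sketch of the context-transfer idea above: capture a fixed number of characters around each annotated word, align that context against the target document, and assign the annotation where a sufficiently long common sequence is found. `difflib` stands in for the alignment step; the names, window size, and 0.8 threshold are assumptions.

```python
import difflib

N = 10  # characters of context on each side (assumed window size)

def transfer_annotations(training_text, annotations, target_text):
    """annotations: list of (start, end, label) character spans in training_text."""
    results = []
    for start, end, label in annotations:
        context = training_text[max(0, start - N):end + N]
        matcher = difflib.SequenceMatcher(None, context, target_text)
        match = matcher.find_longest_match(0, len(context), 0, len(target_text))
        # require most of the context to align before assigning the annotation
        if match.size >= int(0.8 * len(context)):
            word = training_text[start:end]
            pos = target_text.find(word, match.b)
            if pos != -1:
                results.append((pos, pos + len(word), label))
    return results
```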
CORRECTION OF MISSPELLINGS IN QA SYSTEM
Embodiments provide a computer implemented method for identifying and correcting a misspelling in a question answering (QA) system, wherein the QA system is coupled to a document corpus, and the document corpus includes a plurality of documents related to a particular domain. The method includes the following steps: receiving an input question and a plurality of passages, wherein the plurality of passages are extracted from the document corpus by the QA system; providing at least one alternate form for each token extracted from the input question and the plurality of passages; identifying at least one misspelled token; and scoring at least one alternate form of each identified misspelled token.
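The pipeline in this abstract can be caricatured as follows: tokens absent from the domain vocabulary are flagged as misspelled, alternate forms are generated, and each alternate is scored by its frequency in the retrieved passages. The one-character-deletion generator and the frequency scorer are toy stand-ins, not the patented method.

```python
from collections import Counter

def alternates(token):
    # toy generator of alternate forms: all one-character deletions
    return {token[:i] + token[i + 1:] for i in range(len(token))}

def correct(question_tokens, passages, vocabulary):
    passage_counts = Counter(w for p in passages for w in p.split())
    corrections = {}
    for tok in question_tokens:
        if tok in vocabulary:
            continue  # token is known, not misspelled
        # score each in-vocabulary alternate by its passage frequency
        scored = [(passage_counts[a], a) for a in alternates(tok)
                  if a in vocabulary]
        if scored:
            corrections[tok] = max(scored)[1]
    return corrections
```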
Document editing and feedback
Techniques and systems for collaborative document editing and generating feedback on draft documents are described. A draft document is shared with multiple readers in a file format that is the same or similar to the file format in which the document will be published. The readers provide comments on the draft document. The comments can be stored in the same file as the document. Feedback may be solicited from a reader based on reading activity while interacting with the document.
METHOD AND APPARATUS FOR RECOGNIZING CHARACTERS
A method and an apparatus for recognizing characters using an image are provided. A camera is activated according to a character recognition request, and a preview mode is set for displaying images photographed through the camera in real time. An auto focus of the camera is controlled, and an image having a predetermined level of clarity is obtained for character recognition from the images obtained in the preview mode. The image for character recognition is processed for character recognition so as to extract recognition result data. A final recognition character row is drawn that excludes non-character data from the recognition result data. A first word is formed comprising at least one character of the final recognition character row, up to a predetermined maximum number of characters. A dictionary database that stores dictionary information on various languages is searched using the first word, so as to provide the user with the corresponding word.
SOUND PLAYBACK INTERVAL CONTROL METHOD, SOUND PLAYBACK INTERVAL CONTROL PROGRAM, AND INFORMATION PROCESSING APPARATUS
A sound playback interval control method performed by a computer is provided for a speech recognition system. The method includes: arranging and displaying a word block subjected to correction and confirmation in a central portion of a first area on a display screen, the first area being an area in which a plurality of word blocks generated by using morphological analysis from a character string obtained by speech recognition are displayed, and performing playback control on sound of the word block subjected to correction and confirmation displayed in the first area.
DATA EXTRACTION AND DUPLICATE DETECTION
A system provides an end-to-end solution for invoice processing which includes reading invoices (both PDFs and images), extracting key relevant information from the face of each invoice, organizing the relevant information in a structured template as key-value pairs, and comparing invoices based on the similarities between different invoice fields to identify potential duplicate invoices.
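The duplicate check could look roughly like this: each invoice is reduced to a key-value template, and pairs of invoices are scored by averaged field-wise similarity. The field names, the `difflib` similarity, and the threshold are assumptions for illustration.

```python
import difflib

FIELDS = ("vendor", "invoice_number", "date", "amount")

def field_similarity(a, b):
    return difflib.SequenceMatcher(None, str(a), str(b)).ratio()

def duplicate_score(inv_a, inv_b):
    # average similarity across the structured template's fields
    scores = [field_similarity(inv_a.get(f, ""), inv_b.get(f, ""))
              for f in FIELDS]
    return sum(scores) / len(scores)

def find_duplicates(invoices, threshold=0.9):
    pairs = []
    for i in range(len(invoices)):
        for j in range(i + 1, len(invoices)):
            if duplicate_score(invoices[i], invoices[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```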
SYSTEM AND METHOD FOR LEARNING SCENE EMBEDDINGS VIA VISUAL SEMANTICS AND APPLICATION THEREOF
The present teaching relates to method, system, and programming for responding to an image related query. Information related to each of a plurality of images is received, wherein the information represents concepts co-existing in the image. Visual semantics for each of the plurality of images are created based on the information related thereto. Representations of scenes of the plurality of images are obtained via machine learning, based on the visual semantics of the plurality of images, wherein the representations capture concepts associated with the scenes.
CHINESE ENTITY IDENTIFICATION
Methods, systems, and computer program products are provided for language entity identification. In one embodiment, a computer-implemented method is disclosed. In the method, respective pinyin codes may be determined for respective Chinese characters comprised in a string that is to be processed. Then, respective pinyin features may be generated from the respective pinyin codes. Next, a candidate language entity may be identified from the string based on the respective pinyin features and a mapping describing an association between pinyin features and language entities. In other embodiments, a computer-implemented system and a computer program product for security management are disclosed.
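A toy illustration of the pinyin-feature idea: each character is mapped to a pinyin code, and candidate entities are identified by matching pinyin-feature sequences against a known mapping. The tiny hardcoded pinyin table, the entity mapping, and the fixed entity length are purely illustrative assumptions.

```python
# minimal character -> pinyin table (illustrative only)
PINYIN = {"北": "bei", "京": "jing", "上": "shang", "海": "hai"}

# mapping from pinyin-feature sequences to a language-entity type
ENTITY_MAP = {("bei", "jing"): "CITY", ("shang", "hai"): "CITY"}

def identify_entities(text):
    codes = [PINYIN.get(ch) for ch in text]
    entities = []
    for length in (2,):  # entity lengths considered (assumed fixed at 2)
        for i in range(len(text) - length + 1):
            feats = tuple(codes[i:i + length])
            if feats in ENTITY_MAP:
                entities.append((text[i:i + length], ENTITY_MAP[feats]))
    return entities
```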