Patent classifications
G06V30/19093
SELECTING FILES FOR INTENSIVE TEXT EXTRACTION
A feature of a subject file is identified. The feature is compared to a historical feature of a historical file. A similarity between the subject file and the historical file is calculated based on the comparing. A historical success metric for the historical file is identified. An intensive text extraction success value for the subject file is calculated based on the similarity and the historical success metric. Whether an intensive text extraction method should be performed on the subject file is then determined based on the intensive text extraction success value.
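The flow in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: features are modeled as sets of tags, similarity is Jaccard overlap, the success value is a similarity-weighted average of historical success metrics, and the decision is a fixed threshold. None of these choices, and none of the function names, come from the patent.

```python
def jaccard(a, b):
    """Assumed similarity measure: overlap of two feature sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def extraction_success_value(subject_features, history):
    """Similarity-weighted average of historical success metrics.

    `history` is a list of (historical_features, success_metric) pairs.
    The weighting scheme is a hypothetical choice for this sketch.
    """
    weighted = 0.0
    total = 0.0
    for hist_features, success in history:
        sim = jaccard(subject_features, hist_features)
        weighted += sim * success
        total += sim
    return weighted / total if total else 0.0


def should_run_intensive_extraction(subject_features, history, threshold=0.5):
    """Decide by comparing the success value to an assumed threshold."""
    return extraction_success_value(subject_features, history) >= threshold


history = [({"pdf", "scanned"}, 0.9), ({"docx", "text"}, 0.1)]
print(should_run_intensive_extraction({"pdf", "scanned"}, history))
```

A file that resembles historical files on which intensive extraction worked well scores high and is selected; one that resembles files where it rarely helped is skipped.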
PRINT INSPECTION DEVICE, PRINT INSPECTION METHOD, AND PROGRAM
A print inspection device includes a camera that captures a character printed on an inspection target, a shape matching processor, a deformation pattern generator, and an inspection processor. The shape matching processor checks a shape of a captured character pattern included in a captured image obtained by the camera against a shape of a preset reference character pattern while changing a deformation degree of the shape of the reference character pattern and searches for a matched character. The deformation pattern generator generates a deformed character pattern obtained by deforming the reference character pattern at the deformation degree to which the character is matched in the shape matching process. The inspection processor inspects whether a printed state of the character is satisfactory from a result of the comparison between the deformed character pattern and the captured character pattern.
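A toy version of the search over deformation degrees might look like the sketch below. It assumes binary pixel grids, uses horizontal stroke thickening as a stand-in for the "deformation degree", and scores a match by the fraction of agreeing pixels. The function names, the candidate degrees, and the 0.9 acceptance threshold are all hypothetical.

```python
def dilate(grid, degree):
    """Thicken strokes: a pixel is set if any pixel within `degree` columns is set."""
    height, width = len(grid), len(grid[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            lo, hi = max(0, x - degree), min(width, x + degree + 1)
            if any(grid[y][lo:hi]):
                out[y][x] = 1
    return out


def match_score(a, b):
    """Fraction of pixels on which two equally sized grids agree."""
    agree = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return agree / (len(a) * len(a[0]))


def best_deformation(reference, captured, degrees=(0, 1, 2)):
    """Shape matching: pick the degree whose deformed reference fits best."""
    return max(degrees, key=lambda d: match_score(dilate(reference, d), captured))


def inspect(reference, captured, min_score=0.9):
    """Compare the deformed reference with the captured pattern."""
    deformed = dilate(reference, best_deformation(reference, captured))
    return match_score(deformed, captured) >= min_score


reference = [[0, 0, 1, 0, 0] for _ in range(5)]   # thin vertical bar
captured = [[0, 1, 1, 1, 0] for _ in range(5)]    # same bar, printed thicker
print(best_deformation(reference, captured), inspect(reference, captured))
```

Matching first and inspecting second means a character that is merely printed thicker is not rejected, while a character with a missing stroke still fails the comparison against the deformed pattern.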
METHOD, COMPUTER DEVICE, AND STORAGE MEDIUM FOR GENERATING VIDEO COVER
A method for generating a video cover is performed by a computer device and includes acquiring a video title and a candidate video cover of a target video; determining highlighted characters of the video title; determining typesetting parameters of the highlighted characters based on the highlighted characters and a cover parameter of the candidate video cover; and generating a target video cover of the target video by rendering the highlighted characters to the candidate video cover based on the typesetting parameters.
METHOD FOR IDENTIFYING ENTITY DATA IN A DATA SET
A data processing system receives a plurality of electronic documents in image format, and extracts text data using an optical character recognition processor. The system determines a plurality of candidate entity data and candidate context data based on the extracted text data using a trained natural language processing closed-domain question answering model. The system accesses n-gram words stored in a knowledge base, and determines similarity scores between each candidate context data and each of the n-gram words. The system determines a weighted average of the similarity scores, and selects an optimum entity data from the plurality of candidate entity data based on the weighted average of the similarity scores.
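The selection step at the end of the abstract above can be illustrated with a short sketch. It assumes that similarity is the character-level ratio from Python's `difflib.SequenceMatcher` and that each knowledge-base n-gram carries a numeric weight. The OCR and question-answering stages are not modeled, and the function names and sample data are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Assumed similarity: case-insensitive character-level match ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def weighted_score(context, ngrams):
    """Weighted average of similarities between a context and each n-gram.

    `ngrams` maps each knowledge-base n-gram to a weight (an assumption).
    """
    total_weight = sum(ngrams.values())
    if total_weight == 0:
        return 0.0
    return sum(w * similarity(context, g) for g, w in ngrams.items()) / total_weight


def select_entity(candidates, ngrams):
    """Pick the candidate entity whose context scores highest.

    `candidates` is a list of (entity, context) pairs.
    """
    return max(candidates, key=lambda c: weighted_score(c[1], ngrams))[0]


ngrams = {"invoice number": 2.0, "invoice no": 1.0}
candidates = [("INV-001", "Invoice Number"), ("2021-01-01", "Date of issue")]
print(select_entity(candidates, ngrams))
```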
Information Extraction Method and Apparatus for Text With Layout
An information extraction method includes: determining that a text block belonging to a target category is to be extracted from text with layout; recognizing, based on feature information at a text block granularity, the text block that belongs to the target category; and outputting an identifier of the recognized text block.
UNIDIRECTIONAL TEXT COMPARISON
A method, a structure, and a computer system for unidirectional text comparison are disclosed. The exemplary embodiments may include determining a first similarity score between a first text string and a second text string, and computing an error term between the first text string and the second text string, wherein the error term incorporates a directionality of the first text string and the second text string. The exemplary embodiments may further include determining a second similarity score based on the first similarity score and the error term.
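One way to picture a direction-aware score is sketched below. It assumes the first score is a symmetric token-level Jaccard similarity, the error term is the fraction of tokens in the first string that are missing from the second, and the second score is the product of the first score and one minus the error. These formulas are illustrative choices and are not taken from the patent.

```python
def tokens(text):
    """Lowercase whitespace tokenization (an assumption for this sketch)."""
    return text.lower().split()


def first_similarity(a, b):
    """Symmetric score: Jaccard overlap of the two token sets."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def directional_error(a, b):
    """Directional term: share of tokens in `a` that do not appear in `b`."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa:
        return 0.0
    return len(sa - sb) / len(sa)


def second_similarity(a, b):
    """Combine the symmetric score with the directional error term."""
    return first_similarity(a, b) * (1.0 - directional_error(a, b))


short, long_ = "red apple", "red apple pie"
print(second_similarity(short, long_), second_similarity(long_, short))
```

Because the error term depends on argument order, comparing the short string against the long one scores higher than the reverse, even though the symmetric first score is identical in both directions.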
Answer correction method and device
The disclosure provides an answer correction method and device. The method includes: determining the target test paper that matches the test paper to be corrected; marking the area of each answer in the test paper to be corrected as the first answer set, and marking the area of each answer in the target test paper as the second answer set; matching each answer area in the first answer set and the second answer set, and adjusting the position of the answer area in the first answer set on the test paper to be corrected; and, for each answer area in the second answer set, determining the target answer area in the first answer set according to the position information of the answer area on the target test paper, and correcting the answer in the determined target answer area according to the answer in the answer area. The disclosure can solve the problem in the related art that the accurate position of the answer filled in by the student cannot be identified, which affects the correction of the answer.
TEXT GENERATION APPARATUS AND MACHINE LEARNING METHOD
A text generation apparatus receives a first text. The text generation apparatus specifies a first position in the first text of a word that is identical to a first word whose use in a second text to be generated based on the first text has been determined. The text generation apparatus selects a second word from a plurality of words included in the first text based on positional relationships between each of the plurality of words and the first position. The text generation apparatus generates the second text including the second word.
Confidence calibration using pseudo-accuracy
Systems and methods for training machine learning models are disclosed. An example method includes: receiving a plurality of first outputs and a ground truth value for each first output, each first output including an extracted string and a raw confidence score; determining, for each first output, an accuracy metric based at least in part on the extracted string and its corresponding ground truth value; for each extracted string, determining a similarity metric between the respective extracted string and each other extracted string of the plurality of first outputs, and determining a pseudo-accuracy based at least in part on the determined similarity metrics and the determined accuracy metrics; generating training data based at least in part on the determined pseudo-accuracies and the plurality of first outputs; and training the machine learning model, based on the training data, to predict pseudo-accuracies associated with subsequent outputs from a document extraction model.
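The pseudo-accuracy computation can be sketched as below. This sketch assumes exact-match accuracy, similarity from `difflib.SequenceMatcher`, and a pseudo-accuracy defined as the similarity-weighted average of accuracies over all outputs, with each string counted against itself at similarity 1. The abstract does not specify any of these formulas, and the model training step is not shown.

```python
from difflib import SequenceMatcher


def accuracy(extracted, truth):
    """Assumed accuracy metric: 1.0 on exact match, else 0.0."""
    return 1.0 if extracted == truth else 0.0


def similarity(a, b):
    """Assumed similarity metric between two extracted strings."""
    return SequenceMatcher(None, a, b).ratio()


def pseudo_accuracies(outputs, truths):
    """Similarity-weighted average of accuracies for each extracted string.

    `outputs` is a list of (extracted_string, raw_confidence) pairs.
    Each string is also weighted against itself (a choice of this sketch).
    """
    strings = [s for s, _ in outputs]
    accs = [accuracy(s, t) for s, t in zip(strings, truths)]
    result = []
    for s in strings:
        weights = [similarity(s, other) for other in strings]
        # The self-similarity of 1.0 keeps the denominator positive.
        result.append(sum(w * a for w, a in zip(weights, accs)) / sum(weights))
    return result


def training_rows(outputs, truths):
    """Pair each output with its pseudo-accuracy as a training target."""
    targets = pseudo_accuracies(outputs, truths)
    return [(s, conf, p) for (s, conf), p in zip(outputs, targets)]


outputs = [("abc", 0.9), ("abc", 0.8), ("xyz", 0.7)]
truths = ["abc", "abc", "abc"]
print(training_rows(outputs, truths))
```

Smoothing accuracy over similar strings gives a continuous training target in place of a hard 0 or 1 label.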
Method of generating font database, and method of training neural network model
A method of generating a font database and a method of training a neural network model are provided, which relate to the field of artificial intelligence, in particular to computer vision and deep learning technology. The method of generating the font database includes: determining, by using a trained similarity comparison model, a basic font database most similar to handwriting font data of a target user in a plurality of basic font databases as a candidate font database; and adjusting, by using a trained basic font database model for generating the candidate font database, the handwriting font data of the target user, so as to obtain a target font database for the target user.