G06V30/268

Method and apparatus for evaluating matching degree based on artificial intelligence, device and storage medium

The present disclosure provides a method and apparatus for evaluating a matching degree based on artificial intelligence, a device and a storage medium, wherein the method comprises: respectively obtaining word expressions of words in a query and word expressions of words in a title; respectively obtaining context-based word expressions of words in the query and context-based word expressions of words in the title according to the word expressions; generating matching features according to obtained information; determining a matching degree score between the query and the title according to the matching features. The solution of the present disclosure may be applied to improve the accuracy of the evaluation result.

Method and apparatus for detecting text regions in image, device, and medium

A method and apparatus for detecting text regions in an image, a device, and a medium are provided. The method may include: detecting, based on a feature representation of an image, a first text region in the image, where the first text region covers a text in the image, a region occupied by the text being of a certain shape; determining, based on a feature block of the first text region, text geometry information associated with the text, where the text geometry information includes a text centerline of the text and distance information of the centerline from the upper and lower borders of the text; and adjusting, based on the text geometry information associated with the text, the first text region to a second text region, where the second text region also covers the text and is smaller than the first text region.
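The adjustment step can be illustrated with a minimal sketch. It assumes the text geometry is available as sampled centerline points with a per-point distance to the upper and lower text borders, and that those distances are measured vertically in image coordinates (y grows downward). The abstract fixes neither detail, and the function names `tighten_region` and `polygon_area` are hypothetical.

```python
def tighten_region(centerline, dist_up, dist_down):
    """Build a polygon that hugs the text from its centerline geometry.

    centerline: list of (x, y) points ordered left to right.
    dist_up / dist_down: distance from each centerline point to the
    upper / lower text border (assumed vertical, in pixels).
    """
    upper = [(x, y - up) for (x, y), up in zip(centerline, dist_up)]
    lower = [(x, y + down) for (x, y), down in zip(centerline, dist_down)]
    # Walk the upper border left to right, then the lower border right to left.
    return upper + lower[::-1]


def polygon_area(points):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Illustrative values: a gently curved line of text inside a 20 x 12 box.
first_region_area = 20 * 12
second_region = tighten_region(
    centerline=[(0, 10), (10, 12), (20, 10)],
    dist_up=[3, 3, 3],
    dist_down=[3, 3, 3],
)
print(polygon_area(second_region), "<", first_region_area)
```

The resulting polygon follows the curve of the text, so it still covers the text while enclosing less area than the rectangular first region.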

System and method for learning scene embeddings via visual semantics and application thereof
11481575 · 2022-10-25

The present teaching relates to a method, system, and programming for responding to an image-related query. Information related to each of a plurality of images is received, wherein the information represents concepts co-existing in the image. Visual semantics for each of the plurality of images are created based on the information related thereto. Representations of scenes of the plurality of images are obtained via machine learning, based on the visual semantics of the plurality of images, wherein the representations capture concepts associated with the scenes.

OCR error correction

Implementations of the disclosure are directed to OCR error correction systems and methods. In some implementations, a method comprises: obtaining, at a computing device, optical character recognition (OCR) text extracted from a document image, the text comprising a token; searching, at the computing device, based on a token bigram determined from the token and a mapping between words in a corpus and a corpus bigram set comprised of unique bigrams from the beginning or ending of the words in the corpus, the corpus for a best word to replace the token; and replacing, at the computing device, the token with the best word.
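A minimal sketch of the bigram-indexed lookup follows. The abstract does not say how the "best word" is ranked, so this example assumes the candidate with the smallest Levenshtein distance to the token wins. The function names and the sample corpus are hypothetical.

```python
from collections import defaultdict


def edge_bigrams(word):
    """Return the bigrams at the beginning and at the end of a word."""
    if len(word) < 2:
        return set()
    return {("start", word[:2]), ("end", word[-2:])}


def build_index(corpus_words):
    """Map each unique beginning/ending bigram to the corpus words that have it."""
    index = defaultdict(set)
    for word in corpus_words:
        for key in edge_bigrams(word):
            index[key].add(word)
    return index


def levenshtein(a, b):
    """Classic edit distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def correct_token(token, index):
    """Replace an OCR token with the closest corpus word sharing an edge bigram."""
    candidates = set()
    for key in edge_bigrams(token):
        candidates |= index.get(key, set())
    if not candidates:
        return token  # nothing to compare against; leave the token unchanged
    # Assumption: rank by edit distance, breaking ties alphabetically.
    return min(sorted(candidates), key=lambda w: levenshtein(token, w))


index = build_index(["invoice", "payment", "balance", "receipt"])
print(correct_token("invo1ce", index))
```

Restricting the search to words that share a beginning or ending bigram keeps the candidate set small, so the more expensive distance computation runs on only a few words.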

Query by image

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing queries made up of images. In one aspect, a method includes indexing images by image descriptors. The method further includes associating descriptive n-grams with the images. In another aspect, a method includes receiving a query, identifying text describing the query, and performing a search according to the text identified for the query.

Method of correcting strings

Determining a set of edit operations to perform on a string, such as one generated by optical character recognition, to satisfy a string template by determining a minimum cost of performing edit operations on the string to satisfy the string template and then determining the set of edit operations corresponding to the minimum cost. Transforming a string to satisfy one or more string templates by determining a minimum cost of performing edit operations on the string to satisfy one or more string templates, selecting one or more minimum costs, determining a set of edit operations corresponding to the minimum costs, and then performing the set of edit operations on the string. Determining a minimum cost of performing edit operations on a string to satisfy a string template by determining set costs of performing sets of edit operations using costs associated with edit operations of the set and determining the minimum cost using the set costs.
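The minimum-cost search can be sketched with dynamic programming. This example assumes a string template is a sequence of character classes (`L` for an uppercase letter, `D` for a digit) and that substitution, insertion, and deletion have the fixed costs shown. The class alphabet, the confusion table, the costs, and the function names are illustrative assumptions and are not taken from the abstract.

```python
import string

# Hypothetical template alphabet: "L" = uppercase letter, "D" = digit.
CLASSES = {"L": set(string.ascii_uppercase), "D": set(string.digits)}
# Hypothetical OCR confusion pairs that are cheaper to swap than arbitrary characters.
CONFUSIONS = {"O": "0", "0": "O", "I": "1", "1": "I", "S": "5", "5": "S", "B": "8", "8": "B"}


def substitute(ch, cls):
    """Return (replacement, cost) for making ch satisfy template class cls."""
    allowed = CLASSES[cls]
    if ch in allowed:
        return ch, 0.0
    swap = CONFUSIONS.get(ch)
    if swap in allowed:
        return swap, 0.5
    return min(allowed), 1.0  # arbitrary placeholder character from the class


def correct(text, template, indel_cost=1.0):
    """Return (minimum cost, corrected string, edit operations)."""
    n, m = len(text), len(template)
    # cost[i][j]: minimum cost of making text[:i] satisfy template[:j].
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0], back[i][0] = i * indel_cost, "delete"
    for j in range(1, m + 1):
        cost[0][j], back[0][j] = j * indel_cost, "insert"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            _, sub_cost = substitute(text[i - 1], template[j - 1])
            options = [
                (cost[i - 1][j - 1] + sub_cost, "align"),
                (cost[i - 1][j] + indel_cost, "delete"),
                (cost[i][j - 1] + indel_cost, "insert"),
            ]
            cost[i][j], back[i][j] = min(options, key=lambda o: o[0])

    # Backtrace to recover the edit operations behind the minimum cost.
    out, ops = [], []
    i, j = n, m
    while i > 0 or j > 0:
        op = back[i][j]
        if op == "align":
            new, sub_cost = substitute(text[i - 1], template[j - 1])
            if sub_cost > 0:
                ops.append(("substitute", i - 1, new))
            out.append(new)
            i, j = i - 1, j - 1
        elif op == "delete":
            ops.append(("delete", i - 1, text[i - 1]))
            i -= 1
        else:
            filler = min(CLASSES[template[j - 1]])
            ops.append(("insert", i, filler))
            out.append(filler)
            j -= 1
    return cost[n][m], "".join(reversed(out)), ops[::-1]


print(correct("AB1O5", "LLDDD"))
```

The table is filled first to obtain the minimum cost, and the set of edit operations is then read off by backtracking, which mirrors the two-step order described in the abstract. Choosing among several templates would amount to running `correct` once per template and keeping the cheapest result.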

AUTOMATIC FACT EXTRACTION
20210383249 · 2021-12-09

Automatic fact extraction that involves tokenizing text in unstructured information to generate a token list. Parent entity rules defined for a selected domain are applied to the token list to identify a parent entity. Related entity rules that are defined for a related entity linked to the parent entity are applied to the token list to identify the related entity. The related entity is added as an extracted fact of the parent entity to a fact list. The extracted fact is transmitted as structured information to a repository.
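The rule-driven flow can be sketched as follows. The example assumes parent entity rules are simple vocabularies, related entity rules are regular expressions, and a related entity attaches to the most recently seen parent. None of these choices comes from the abstract, and the "medication" domain, rule tables, and function names are hypothetical. Transmission to a repository is represented only by serializing the fact list to JSON.

```python
import json
import re

# Hypothetical rules for an illustrative "medication" domain.
PARENT_RULES = {"medication": {"aspirin", "ibuprofen"}}
# Hypothetical related-entity rules linked to the parent entity.
RELATED_RULES = {"dosage": re.compile(r"^\d+mg$")}


def tokenize(text):
    """Split unstructured text into a lowercase token list."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_facts(text):
    """Apply parent rules, then related rules, and collect extracted facts."""
    facts = []
    parent = None
    for token in tokenize(text):
        for entity_type, vocabulary in PARENT_RULES.items():
            if token in vocabulary:
                parent = {"type": entity_type, "value": token}
        if parent is None:
            continue  # related entities only count once a parent is known
        for relation, pattern in RELATED_RULES.items():
            if pattern.match(token):
                facts.append(
                    {"parent": parent["value"], "relation": relation, "value": token}
                )
    return facts


facts = extract_facts("Patient takes Aspirin 100mg daily.")
# Structured payload that could be sent to a repository.
print(json.dumps(facts))
```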

Content-aware selection

An image editing program can include a content-aware selection system. The content-aware selection system can enable a user to select an area of an image using a label or a tag that identifies an object in the image, rather than having to make a selection area based on coordinates and/or pixel values. The program can receive a digital image and metadata that describes an object in the image. The program can further receive a label, and can determine from the metadata that the label is associated with the object. The program can then select a bounding box for the object and identify, within the bounding box, the pixels that represent the object. The program can then output a selection area that surrounds the pixels.
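A toy sketch of label-driven selection follows. It assumes the metadata is a list of records, each holding a set of labels and a bounding box, and that object pixels inside the box are those differing from a known background value. The abstract does not specify how object pixels are identified, so this foreground test is a stand-in, and all names are hypothetical.

```python
def select_by_label(image, metadata, label, background=0):
    """Return the set of (row, col) pixels selected for the labelled object.

    image: 2D list of pixel values.
    metadata: list of {"labels": set, "box": (top, left, bottom, right)}
              with bottom/right exclusive (an assumed layout).
    """
    entry = next((m for m in metadata if label in m["labels"]), None)
    if entry is None:
        return set()  # the label is not associated with any object
    top, left, bottom, right = entry["box"]
    # Assumption: any non-background pixel in the box belongs to the object.
    return {
        (r, c)
        for r in range(top, bottom)
        for c in range(left, right)
        if image[r][c] != background
    }


image = [
    [0, 0, 0, 0],
    [0, 5, 5, 0],
    [0, 0, 5, 0],
    [0, 0, 0, 0],
]
metadata = [{"labels": {"cat"}, "box": (1, 1, 3, 3)}]
print(sorted(select_by_label(image, metadata, "cat")))
```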

IMAGE PROCESSING DEVICE AND IMAGE FORMING APPARATUS CAPABLE OF DETECTING AND CORRECTING MIS-CONVERTED CHARACTER IN TEXT EXTRACTED FROM DOCUMENT IMAGE
20220141349 · 2022-05-05

An image processing device includes a storage device that previously stores a document image, a plurality of registered words, and a plurality of font characters, and a control device that functions as: a character region identifier that identifies a character region in the document image; an image acquirer that acquires an image of the character region; a text extractor that extracts a text from the image of the character region; a word identifier that identifies each of words in the text; a word determiner that determines whether each of the words is matched with one of the registered words; and a generator that generates a corrected text by replacing a target character of a non-matching word in the text with, among the font characters, a font character having a first degree of matching not lower than a first rate with the target character and a highest first degree of matching.

Intelligent detection of sensitive data within a communication platform

Methods, systems, and apparatus, including computer programs encoded on computer storage media, provide for the intelligent detection of sensitive information within a communication platform. The system displays a communication interface including a first input section for receiving an input message associated with a sending user account, and a display section for displaying message information received by the sending user account from other user accounts. The system determines or retrieves a sensitive messaging profile for the sending user account, then receives an input message associated with the sending user account. The system detects that the input message comprises sensitive information, and transmits a sensitive message to one or more receiving user accounts within a sensitive container component, with the sensitive message including at least a subset of the input message.