IPIQ

G06V30/19187

LEXICON-FREE, MATCHING-BASED WORD-IMAGE RECOGNITION

20170011273 · 2017-01-12 ·

Methods and systems recognize alphanumeric characters in an image by computing individual representations of every character of an alphabet at every character position within a certain word transcription length. These methods and systems embed the individual representations of each alphabet character in a common vectorial subspace (using a matrix) and embed a received image of an alphanumeric word into the common vectorial subspace (using the matrix). Such methods and systems compute the utility value of the embedded alphabet characters at every one of the character positions with respect to the embedded alphanumeric character image; and compute the best transcription alphabet character of every one of the image characters based on the utility value of each embedded alphabet character at each character position. Such methods and systems then assign the best transcription alphabet character for each of the character positions to produce a recognized alphanumeric word within the received image.
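The per-position selection described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the character representations and the word image have already been embedded into the common subspace, and it assumes the utility value is a plain dot product. The names `transcribe`, `char_embeddings`, and `image_embedding` are hypothetical, and the toy vectors are made up.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as the (assumed) utility value."""
    return sum(x * y for x, y in zip(a, b))


def transcribe(image_embedding: Vector,
               char_embeddings: List[Dict[str, Vector]]) -> str:
    """Pick the highest-utility character independently at each position.

    char_embeddings[p][c] is the embedded representation of alphabet
    character c at character position p (hypothetical layout).
    """
    word = []
    for per_position in char_embeddings:
        best_char = max(
            per_position,
            key=lambda c: dot(per_position[c], image_embedding),
        )
        word.append(best_char)
    return "".join(word)


# Toy data: a two-letter alphabet and a transcription length of two.
char_embeddings = [
    {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0]},   # position 0
    {"a": [0.0, 0.0, 1.0], "b": [0.0, -1.0, 0.0]},  # position 1
]
image_embedding = [0.2, 0.9, 0.5]

print(transcribe(image_embedding, char_embeddings))  # prints "ba"
```

Because each position is scored independently against the same embedded image, no lexicon of candidate words is needed.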

Method for recognizing receipt, electronic device and storage medium

12307797 · 2025-05-20 ·

BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Provided are a method for recognizing a receipt, an electronic device and a storage medium, which relate to the fields of deep learning and pattern recognition. The method may include: a target receipt to be recognized is acquired; two-dimensional position information of multiple text blocks on the target receipt respectively is encoded, to obtain multiple encoding results; graph convolution is performed on the multiple encoding results respectively, to obtain multiple convolution results; and each of the multiple convolution results is recognized based on a first conditional random field model, to obtain a first prediction result at text block-level of the target receipt, wherein the first conditional random field model and a second conditional random field model are co-trained, so as to obtain a second prediction result at token-level of the target receipt.
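The graph convolution step over the encoded text-block positions might look like the sketch below. The abstract does not specify the layer, so everything here is an assumption: mean aggregation over neighbors with a self-loop, a linear projection, and a ReLU. The function `graph_convolution`, the adjacency matrix, the weights, and the toy features are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def graph_convolution(features: Matrix, adjacency: Matrix,
                      weights: Matrix) -> Matrix:
    """One graph-convolution layer: mean over neighbors, project, ReLU.

    features[i]  : encoding result for text block i (assumed input)
    adjacency    : adjacency[i][j] is nonzero if blocks i and j are linked
    weights      : projection matrix (learned in practice, fixed here)
    """
    n = len(features)
    in_dim = len(features[0])
    out_dim = len(weights[0])
    output = []
    for i in range(n):
        # Neighborhood of node i, including a self-loop.
        neighbors = [j for j in range(n) if j == i or adjacency[i][j]]
        mean = [
            sum(features[j][d] for j in neighbors) / len(neighbors)
            for d in range(in_dim)
        ]
        projected = [
            sum(mean[d] * weights[d][k] for d in range(in_dim))
            for k in range(out_dim)
        ]
        output.append([max(0.0, v) for v in projected])  # ReLU
    return output


# Toy example: three text blocks encoded by normalized (x, y) positions,
# linked in a chain 0 - 1 - 2, with an identity projection.
features = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
weights = [[1.0, 0.0], [0.0, 1.0]]

convolved = graph_convolution(features, adjacency, weights)
print(convolved[0])  # [0.5, 0.0]
print(convolved[2])  # [1.0, 0.5]
```

In the described method, outputs of this kind would then be passed to the conditional random field model for block-level prediction; that stage is not sketched here.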

Methods, systems, articles of manufacture, and apparatus to determine related content in a document

12322195 · 2025-06-03 ·

Nielsen Consumer LLC

Methods, apparatus, systems, and articles of manufacture are disclosed that determine related content. An example apparatus includes processor circuitry to generate a segment-level graph by sampling segment-level edges among segment nodes representing text segments, the segment-level graph including segment node embeddings representing features of the segment nodes; cluster the text segments to form entities by applying a first GAN based model to the segment-level graph to update the segment node embeddings; generate a multi-level graph by (a) generating an entity-level graph including hypernodes representing the entities and sampled entity edges connecting ones of the hypernodes, and (b) connecting the segment nodes to respective ones of the hypernodes using relation edges; generate hypernode embeddings by propagating the updated segment node embeddings using a relation graph; and cluster the entities by product by applying a second GAN based model to the multi-level graph, the multi-level graph to generate updated hypernode embeddings.

METHODS, SYSTEMS, ARTICLES OF MANUFACTURE, AND APPARATUS TO DETERMINE RELATED CONTENT IN A DOCUMENT

20250259468 · 2025-08-14 ·

Recognition method and electronic device

12423524 · 2025-09-23 ·

A recognition method includes the following steps. A text is analyzed by a language recognition network to generate an entity feature, a relation feature and an overall feature. An input image is analyzed by an object detection network to generate candidate regions. Node features, aggregated edge features and compound features are generated by an enhanced cross-modal graph attention network according to the entity feature, the relation feature, the candidate regions and the overall feature. The entity feature and the relation feature are matched to the node features and the aggregated edge features to generate first scores. The overall feature is matched to the compound features to generate second scores. Final scores corresponding to the candidate regions are generated according to the first scores and the second scores.

Method, device, computer equipment and storage medium for identifying illegal commodity

12450933 · 2025-10-21 ·

ZHEJIANG LAB

A method, a device, computer equipment and a storage medium for identifying an illegal commodity. The method comprises: first, constructing a multi-modal knowledge graph according to a multi-modal knowledge graph data set, and extracting visual features of all visual modality entities and text features of all text modality entities in the knowledge graph; then obtaining a commodity image and a commodity text from a database; then generating a commodity visual feature according to the commodity image, and generating a commodity text feature according to the commodity text; next, according to the visual features and text features, as well as the commodity visual feature and the commodity text feature, linking the commodity image and the commodity text to the knowledge graph by using an entity linking method; and finally, obtaining the correlation between the commodity image and the commodity text according to the linked knowledge graph to determine the illegality of the commodity.

Method and device for constructing legal knowledge graph based on joint entity and relation extraction

12530597 · 2026-01-20 ·

Xi'an Jiaotong University

A method and device for constructing a legal knowledge graph based on joint entity and relation extraction. The construction method comprises the following steps: constructing a triple data set; designing a model architecture and training a model, wherein the model architecture comprises an encoding layer, a head entity extraction layer and a relation-tail entity extraction layer; determining the relation between the sentences of the text; and combining triples and visualizing the graph. The design of the model framework of the present disclosure adopts a Chinese BERT pre-trained model as an encoder. In the entity extraction part, two BiLSTM binary classifiers are used to identify the start position and end position of an entity. The head entity is first extracted, and then the tail entity corresponding to the entity relation is extracted from the extracted head entity.
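The start/end tagging in the entity extraction part can be illustrated by the decoding step that turns two per-token probability sequences into entity spans. This is a sketch under stated assumptions, not the disclosed model: the BiLSTM classifiers are replaced by made-up probability lists, and the pairing rule (each start is matched to the nearest end at or after it, with a 0.5 threshold) is a common heuristic chosen for illustration. The name `decode_spans` is hypothetical.

```python
from typing import List, Tuple


def decode_spans(start_probs: List[float], end_probs: List[float],
                 threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Pair each predicted start with the nearest predicted end at or after it.

    start_probs[i] / end_probs[i] stand in for the outputs of the two
    binary classifiers for token i (assumed inputs).
    Returns inclusive (start, end) token index pairs.
    """
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        following = [e for e in ends if e >= s]
        if following:
            spans.append((s, following[0]))
    return spans


# Toy probabilities for a five-token sentence.
start_probs = [0.9, 0.1, 0.2, 0.8, 0.1]
end_probs = [0.1, 0.7, 0.1, 0.2, 0.9]

print(decode_spans(start_probs, end_probs))  # [(0, 1), (3, 4)]
```

In a head-then-tail scheme such as the one described, a decoder like this would run once for head entities and again, conditioned on each extracted head, for the tail entities of each relation.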
