G06V30/19187

AUTOMATIC DATA EXTRACTION FROM A DIGITAL IMAGE

The invention relates to a computer-implemented method for automatically extracting data from a digital image comprising a graphical representation of quantitative data. Basic graphical objects are detected, and structural primitives are determined, which comprises grouping the basic graphical objects based on geometric relations. A semantic label is assigned to each of the structural primitives. A spatial data region of the graphical representation is determined using the semantic labels of the structural primitives. Quantitative data values are extracted from those structural primitives within the data region whose first semantic labels identify them as representing quantitative data. The extracted quantitative data values are provided in units of pixels according to an image coordinate system. They are then transformed from the image coordinate system to a coordinate system of physical units of the quantitative data represented by the graphical representation.
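
The final transformation step, from pixels to physical units, can be illustrated with a minimal sketch. It assumes linear axes and two calibration ticks per axis whose pixel positions and physical values are already known. The function names and the sample tick values are hypothetical and are not taken from the abstract.

```python
def make_axis_map(pixel_a, value_a, pixel_b, value_b):
    """Return a function mapping a pixel coordinate to a physical value.

    Assumption: the axis is linear and calibrated by two known tick marks.
    """
    if pixel_a == pixel_b:
        raise ValueError("calibration ticks must have distinct pixel positions")
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return lambda pixel: value_a + (pixel - pixel_a) * scale


# Hypothetical calibration: x ticks at pixels 100 and 500 stand for 0 and 10;
# y ticks at pixels 400 and 80 stand for 0 and 80 (image y grows downward).
x_map = make_axis_map(100, 0.0, 500, 10.0)
y_map = make_axis_map(400, 0.0, 80, 80.0)


def to_physical(points_px):
    """Convert (x, y) pixel pairs to (x, y) pairs in physical units."""
    return [(x_map(x), y_map(y)) for x, y in points_px]
```

Because the y calibration ticks are given in image order, the downward-growing pixel axis is inverted by the sign of the scale factor and needs no special handling. A logarithmic axis would need a different mapping.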

SYSTEM AND METHOD FOR ROBUST ESTIMATION OF STATE PARAMETERS FROM INFERRED READINGS IN A SEQUENCE OF IMAGES

A system and method for robust estimation of state parameters from inferred readings in a sequence of images are provided. Various techniques can be implemented to address observation noise and/or underlying process noise to stabilize the readings.
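
The abstract does not name the techniques used. As one possible illustration, a scalar Kalman filter with a constant-state model separates observation noise from process noise. The function name and the variance defaults below are assumptions chosen for the sketch.

```python
def kalman_1d(readings, process_var=1e-3, obs_var=1.0):
    """Smooth a sequence of scalar readings with a constant-state Kalman filter.

    process_var models drift of the underlying state between frames;
    obs_var models noise in each inferred reading. Both are assumed values.
    """
    if not readings:
        return []
    estimate = float(readings[0])
    error_var = obs_var
    smoothed = [estimate]
    for reading in readings[1:]:
        error_var += process_var                    # predict: uncertainty grows
        gain = error_var / (error_var + obs_var)    # weight given to the new reading
        estimate += gain * (reading - estimate)     # update toward the reading
        error_var *= 1.0 - gain                     # uncertainty shrinks after update
        smoothed.append(estimate)
    return smoothed
```

A larger `obs_var` makes the filter trust individual readings less, so a single misread frame moves the estimate only slightly.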

Arabic optical character recognition method using hidden markov models and decision trees

Disclosed is an Arabic optical character recognition method using Hidden Markov Models (HMMs) and decision trees, comprising: receiving an input image containing Arabic text; removing all diacritics from the input image by detecting a bounding box of each diacritic and comparing its coordinates to those of a bounding box of a text body; segmenting the input image into four layers and conducting feature extraction on the four segmented layers; inputting results of the feature extraction into a Hidden Markov Model, thereby generating HMMs representing each Arabic character; conducting iterative training of the HMMs until an overall likelihood criterion is satisfied; and inputting results of the iterative training into a decision tree, thereby predicting the locations and classes of the diacritics and producing final recognition results. The invention facilitates simple recognition of Arabic by utilizing its writing features, while achieving comparatively high recognition precision.
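
The diacritic-removal step compares bounding-box coordinates, but the abstract does not state the comparison rule. The sketch below assumes one simple rule: the largest component is the text body, and any component lying entirely above or below the body's box is treated as a diacritic. The `Box` type and function names are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int  # image coordinates: y grows downward


def area(box):
    return (box.right - box.left) * (box.bottom - box.top)


def split_diacritics(components):
    """Separate connected-component boxes into kept strokes and diacritics.

    Assumed rule: the largest box is the text body; a box entirely above
    or entirely below the body box counts as a diacritic.
    """
    body = max(components, key=area)
    kept, diacritics = [], []
    for box in components:
        outside = box.bottom <= body.top or box.top >= body.bottom
        (diacritics if outside else kept).append(box)
    return body, kept, diacritics
```

Dots that sit inside the vertical extent of the body would not be caught by this rule, so a practical system would likely add a size or position test.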

Pseudo labelling for key-value extraction from documents

A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model-labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph-labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross-referencing labels from the model-labeled graphs and the graph-labeled graphs. A model can be trained to perform key-value extraction using the updated graphs until the change in labels falls below a threshold.
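
The iterate-until-stable loop described above can be sketched as follows. The agreement rule in `cross_reference`, the callables `predict` and `match`, and the default threshold are all assumptions made for illustration; retraining is represented only by passing the current labels back into `predict`.

```python
def cross_reference(model_labels, graph_labels):
    """Keep a pseudo-label only where both sources agree (assumed rule)."""
    return {
        node: label if graph_labels.get(node) == label else None
        for node, label in model_labels.items()
    }


def label_change(previous, current):
    """Fraction of nodes whose label differs from the previous round."""
    changed = sum(1 for node in current if previous.get(node) != current[node])
    return changed / max(len(current), 1)


def train_until_stable(predict, match, nodes, threshold=0.05, max_rounds=10):
    """Repeat pseudo-labelling until the labels change by less than threshold.

    predict(nodes, labels) stands in for a model retrained on `labels`;
    match(nodes) stands in for graph-to-graph node matching.
    """
    previous = {}
    for round_index in range(max_rounds):
        updated = cross_reference(predict(nodes, previous), match(nodes))
        if label_change(previous, updated) < threshold:
            return updated, round_index
        previous = updated
    return previous, max_rounds
```

Nodes on which the two labelling sources disagree are left unlabeled (`None`) here, which is one conservative choice among several.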

METHOD, DEVICE, COMPUTER EQUIPMENT AND STORAGE MEDIUM FOR IDENTIFYING ILLEGAL COMMODITY
20240331425 · 2024-10-03

A method, a device, computer equipment and a storage medium for identifying an illegal commodity are provided. The method comprises: constructing a multi-modal knowledge graph according to a multi-modal knowledge graph data set, and extracting visual features of all visual modality entities and text features of all text modality entities in the knowledge graph; obtaining a commodity image and a commodity text from a database; generating a commodity visual feature according to the commodity image; generating a commodity text feature according to the commodity text; linking the commodity image and the commodity text to the knowledge graph by an entity linking method, according to the visual features and text features as well as the commodity visual feature and the commodity text feature; and finally, obtaining the correlation between the commodity image and the commodity text according to the linked knowledge graph to determine the illegality of the commodity.

Search Query Generation Based Upon Received Text

In an example, a first set of text may be received from a client device. A set of content items may be selected from among content items based upon the first set of text and a plurality of sets of content item text associated with the content items. A set of terms may be determined based upon the first set of text and the set of content items. A similarity profile associated with the set of terms may be generated. The similarity profile is indicative of similarity scores associated with similarities between terms of the set of terms. Relevance scores associated with the set of terms may be determined based upon the similarity profile. One or more search terms may be selected from among the set of terms based upon the relevance scores. A search may be performed based upon the one or more search terms.
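
The term-selection steps can be sketched as below. The abstract does not define how similarity or relevance is computed, so this sketch assumes that a pairwise similarity function is supplied by the caller and that a term's relevance is its mean similarity to the other terms. All names are hypothetical.

```python
from itertools import combinations


def similarity_profile(terms, similarity):
    """Pairwise similarity scores for a list of unique terms."""
    return {frozenset((a, b)): similarity(a, b) for a, b in combinations(terms, 2)}


def relevance_scores(terms, profile):
    """Assumed rule: relevance is the mean similarity to every other term."""
    scores = {}
    for term in terms:
        others = [profile[frozenset((term, other))] for other in terms if other != term]
        scores[term] = sum(others) / len(others) if others else 0.0
    return scores


def select_search_terms(terms, similarity, count=2):
    """Pick the `count` terms with the highest relevance scores."""
    profile = similarity_profile(terms, similarity)
    scores = relevance_scores(terms, profile)
    return sorted(terms, key=lambda term: scores[term], reverse=True)[:count]
```

Under this rule, a term unrelated to the rest of the set scores low and is unlikely to be used in the search.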

Lexicon-free, matching-based word-image recognition

Methods and systems recognize alphanumeric characters in an image by computing individual representations of every character of an alphabet at every character position within a certain word transcription length. These methods and systems embed the individual representations of each alphabet character in a common vectorial subspace (using a matrix) and embed a received image of an alphanumeric word into the common vectorial subspace (using the matrix). Such methods and systems compute the utility value of the embedded alphabet characters at every one of the character positions with respect to the embedded alphanumeric character image; and compute the best transcription alphabet character of every one of the image characters based on the utility value of each embedded alphabet character at each character position. Such methods and systems then assign the best transcription alphabet character for each of the character positions to produce a recognized alphanumeric word within the received image.
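
The per-position matching can be illustrated with a toy sketch. It assumes one-hot representations of each character at each position, a known word length, the dot product as the utility value, and an image embedding that is supplied already projected into the common subspace. The identity matrix and helper names are stand-ins, not the learned matrix of the disclosed method.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def embed(matrix, vector):
    """Project a vector into the common subspace (matrix-vector product)."""
    return [dot(row, vector) for row in matrix]


def char_at_position(char, position, alphabet, length):
    """One-hot representation of `char` occurring at `position`."""
    vector = [0.0] * (len(alphabet) * length)
    vector[position * len(alphabet) + alphabet.index(char)] = 1.0
    return vector


def transcribe(image_embedding, matrix, alphabet, length):
    """Pick, for each position, the character with the highest utility."""
    word = []
    for position in range(length):
        def utility(char):
            rep = char_at_position(char, position, alphabet, length)
            return dot(embed(matrix, rep), image_embedding)
        word.append(max(alphabet, key=utility))
    return "".join(word)


def toy_image_embedding(word, matrix, alphabet, length):
    """Stand-in for an image encoder: sums the embedded character vectors."""
    total = [0.0] * len(matrix)
    for position, char in enumerate(word):
        rep = embed(matrix, char_at_position(char, position, alphabet, length))
        total = [t + r for t, r in zip(total, rep)]
    return total


ALPHABET = "abc"
LENGTH = 3
IDENTITY = [[1.0 if i == j else 0.0 for j in range(9)] for i in range(9)]
```

No lexicon is consulted at any point: each position is decided independently from the utilities.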

TECHNIQUES OF INFORMATION EXTRACTION FOR SELECTION MARKS

A method may include receiving a primary document including one or more selection boxes, one or more text lines, and one or more annotations. The method may include determining a class based on the annotations. The method may include identifying the one or more selection boxes and one or more text lines of the primary document. The method may include generating a graph representing the one or more selection boxes and the one or more text lines. The method may include mapping each of the one or more selection boxes to a respective text line of the one or more text lines of the graph based at least in part on one or more characteristics associated with the selection boxes. The method may include generating a key-value pair associated with each of the one or more text lines and generating a document model of the primary document.
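
The mapping of selection boxes to text lines can be sketched under one assumption: that the characteristic used is spatial proximity, measured between bounding-box centers. The dictionary layout (`bbox`, `text`, `checked`) and the function names are hypothetical.

```python
import math


def center(bbox):
    """Center point of a (left, top, right, bottom) bounding box."""
    left, top, right, bottom = bbox
    return ((left + right) / 2, (top + bottom) / 2)


def map_boxes_to_lines(selection_boxes, text_lines):
    """Map each selection box to its nearest text line (assumed criterion).

    Returns key-value pairs of the form {line text: checked state}.
    """
    pairs = {}
    for selection in selection_boxes:
        origin = center(selection["bbox"])
        nearest = min(
            text_lines,
            key=lambda line: math.dist(origin, center(line["bbox"])),
        )
        pairs[nearest["text"]] = selection["checked"]
    return pairs
```

For a form with a checked box beside "Yes" and an unchecked box beside "No", the result would pair "Yes" with `True` and "No" with `False`.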

ARABIC OPTICAL CHARACTER RECOGNITION METHOD USING HIDDEN MARKOV MODELS AND DECISION TREES
20170017854 · 2017-01-19

Disclosed is an Arabic optical character recognition method using Hidden Markov Models (HMMs) and decision trees, comprising: receiving an input image containing Arabic text; removing all diacritics from the input image by detecting a bounding box of each diacritic and comparing its coordinates to those of a bounding box of a text body; segmenting the input image into four layers and conducting feature extraction on the four segmented layers; inputting results of the feature extraction into a Hidden Markov Model, thereby generating HMMs representing each Arabic character; conducting iterative training of the HMMs until an overall likelihood criterion is satisfied; and inputting results of the iterative training into a decision tree, thereby predicting the locations and classes of the diacritics and producing final recognition results. The invention facilitates simple recognition of Arabic by utilizing its writing features, while achieving comparatively high recognition precision.