G06V30/43

PARALLEL PREDICTION OF MULTIPLE IMAGE ASPECTS

Example embodiments that analyze images to characterize aspects of the images rely on the same neural network to characterize multiple aspects in parallel. Because additional neural networks are not required for additional aspects, such an approach scales as the number of aspects increases.

CONSTRUCTING A PATH FOR CHARACTER GLYPHS

Techniques described herein take character glyphs as input and generate a text-on-a-path text object that includes the character glyphs arranged in a determined order along a path. For instance, a method described herein includes accessing character glyphs in input data. The method further includes determining an order for the character glyphs based on relative positions and orientations of the character glyphs in the input data. The method further includes generating a path for the character glyphs, based on the order, and associating the path with the character glyphs. Further, the method includes generating a text object that includes the set of character glyphs arranged in the order along the path.
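The ordering and path steps above can be sketched as follows. This is a minimal illustration, not the claimed method: the `(char, x, y)` glyph tuples, the greedy nearest-neighbour ordering, and the function names `order_glyphs` and `build_text_on_path` are all assumptions made for the example, and glyph orientation, which the abstract also uses, is ignored here.

```python
from math import hypot


def order_glyphs(glyphs):
    """Order glyphs by walking from the leftmost glyph to its nearest neighbour.

    `glyphs` is a list of (char, x, y) tuples (a hypothetical input format).
    Orientation is not modelled in this sketch.
    """
    remaining = list(glyphs)
    if not remaining:
        return []
    # Assumption: reading starts at the leftmost glyph.
    current = min(remaining, key=lambda g: (g[1], g[2]))
    remaining.remove(current)
    ordered = [current]
    while remaining:
        # Pick the unvisited glyph closest to the one just placed.
        nxt = min(
            remaining,
            key=lambda g: hypot(g[1] - current[1], g[2] - current[2]),
        )
        remaining.remove(nxt)
        ordered.append(nxt)
        current = nxt
    return ordered


def build_text_on_path(glyphs):
    """Return a simple text object: the ordered text plus a polyline path."""
    ordered = order_glyphs(glyphs)
    return {
        "text": "".join(g[0] for g in ordered),
        "path": [(g[1], g[2]) for g in ordered],
    }


glyphs = [("C", 2.0, 1.0), ("A", 0.0, 0.0), ("B", 1.0, 0.8), ("D", 3.0, 0.5)]
text_object = build_text_on_path(glyphs)
print(text_object["text"])
```

A polyline through glyph positions is the simplest possible path; a production system would more plausibly fit a smooth curve, but the abstract does not say which.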

AUGMENTED REALITY (AR)-ASSISTED SMART CARD FOR SECURE AND ACCURATE REVISION AND/OR SUBMISSION OF SENSITIVE DOCUMENTS
20210383338 · 2021-12-09 ·

Systems and methods for an augmented reality (AR)-assisted smart card for secure and accurate revision and/or submission of sensitive documents are provided. The methods may be executed via computer-executable instructions running on a microprocessor embedded in the smart card. A method may include capturing an image of a document, processing the image of the document, and computing, for one or more of the fields of the document, a recommended input. The method may further include comparing, for the one or more fields, the recommended input with an actual input, and, when the recommended input is more than a threshold difference apart from the actual input, generating a recommended revision. The method may also include displaying an AR image of the document on a display screen that is embedded in the smart card, said AR image comprising the image of the document augmented with the recommended revisions.
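The comparison step above can be illustrated with a small sketch. It assumes numeric field values held in plain dictionaries and a single absolute-difference threshold; the function name `recommend_revisions` and the field names are hypothetical, and the abstract does not specify how the difference is measured.

```python
def recommend_revisions(recommended, actual, threshold):
    """Return recommended values for fields that differ too much from the actual input.

    `recommended` and `actual` map field names to numbers (an assumed format).
    """
    revisions = {}
    for field, rec_value in recommended.items():
        if field not in actual:
            continue  # nothing was entered for this field, so nothing to compare
        if abs(rec_value - actual[field]) > threshold:
            revisions[field] = rec_value
    return revisions


recommended = {"total": 120.0, "tax": 9.6}
actual = {"total": 102.0, "tax": 9.5}
revisions = recommend_revisions(recommended, actual, threshold=1.0)
print(revisions)
```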

DEEP DOCUMENT PROCESSING WITH SELF-SUPERVISED LEARNING

A document processing system processes documents including typewritten and/or handwritten data by converting them to document images for entity extraction. A received document is initially processed to generate a deep document data structure and is classified as either a structured or an unstructured document. If the document is classified as a structured document, it is processed for entity extraction based on a matching template and image alignment of the document image with the matching template. If the document is classified as an unstructured document, entities are extracted by obtaining nodes and providing the nodes to a self-supervised masked visual language model.

SCALABLE STRUCTURE LEARNING VIA CONTEXT-FREE RECURSIVE DOCUMENT DECOMPOSITION

An approach is provided that aggregates a set of pixel values from a bitmap image into a set of row sum values and a set of column sum values. The bitmap image is a pixelated representation of a document. The approach applies a localized Fourier transform to the set of row sum values and the set of column sum values to generate frequency representations of the set of row sum values and the set of column sum values. The approach decomposes the bitmap image into a set of image portions based on at least one separation location identified in the set of frequency representations, and sends the set of image portions to a text recognition system.
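The aggregation step can be sketched in a few lines. This example only shows how row and column sums are formed and how they can reveal a separation location. It does not implement the localized Fourier transform from the abstract: a plain "row sum is zero" test stands in for the frequency analysis. The function names and the list-of-lists bitmap format (1 for ink, 0 for background) are assumptions.

```python
def row_and_column_sums(bitmap):
    """Aggregate pixel values into one sum per row and one sum per column."""
    row_sums = [sum(row) for row in bitmap]
    col_sums = [sum(col) for col in zip(*bitmap)]
    return row_sums, col_sums


def split_rows_at_blank_bands(bitmap):
    """Split the bitmap into horizontal portions wherever a row has no ink.

    Stand-in for the frequency-based separation in the abstract.
    """
    row_sums, _ = row_and_column_sums(bitmap)
    portions, current = [], []
    for row, total in zip(bitmap, row_sums):
        if total == 0:
            # A blank row closes the portion collected so far.
            if current:
                portions.append(current)
                current = []
        else:
            current.append(row)
    if current:
        portions.append(current)
    return portions


bitmap = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],  # blank row: a separation location
    [1, 0, 1],
]
row_sums, col_sums = row_and_column_sums(bitmap)
portions = split_rows_at_blank_bands(bitmap)
print(row_sums, col_sums, len(portions))
```

Applying the same split to the columns of each portion, and repeating, would give a recursive decomposition in the spirit of the title.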

PROBABILISTIC TEXT INDEX FOR SEMI-STRUCTURED DATA IN COLUMNAR ANALYTICS STORAGE FORMATS

Herein is a probabilistic indexing technique for searching semi-structured text documents in columnar storage formats such as Parquet, using columnar input/output (I/O) avoidance, and needing minimal storage overhead. In an embodiment, a computer associates columns with text strings that occur in semi-structured documents. Text words that occur in the text strings are detected. Respectively for each text word, a bitmap, of a plurality of bitmaps, that contains a respective bit for each column is generated. Based on at least one of the bitmaps, some of the columns or some of the semi-structured documents are accessed.
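The per-word bitmap can be illustrated with Python integers used as bit sets. This is a simplified sketch: it keeps one exact bitmap per word, so it does not model whatever makes the described index probabilistic, and it does not read Parquet files. The input format (a list whose index is the column number) and the function names are assumptions.

```python
def build_word_bitmaps(column_texts):
    """Map each word to an integer bitmap with one bit per column.

    `column_texts[i]` holds the text strings associated with column i
    (an assumed in-memory format).
    """
    bitmaps = {}
    for col_index, strings in enumerate(column_texts):
        for text in strings:
            for word in text.lower().split():
                # Set bit `col_index` to record that the word occurs in this column.
                bitmaps[word] = bitmaps.get(word, 0) | (1 << col_index)
    return bitmaps


def columns_to_read(bitmaps, word, num_columns):
    """Return the indices of columns whose bit is set for `word`."""
    bits = bitmaps.get(word.lower(), 0)
    return [i for i in range(num_columns) if (bits >> i) & 1]


column_texts = [["error disk full"], ["user login ok"], ["disk ok"]]
bitmaps = build_word_bitmaps(column_texts)
print(columns_to_read(bitmaps, "disk", 3))
```

Columns whose bit is clear never need to be read for that word, which is the I/O avoidance the abstract refers to.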

METHOD AND DEVICE FOR BEHAVIOR CONTROL OF VIRTUAL IMAGE BASED ON TEXT, AND MEDIUM
20220004825 · 2022-01-06 ·

A method and device for behavior control of a virtual image based on a text, and a medium are disclosed. The method includes inserting a symbol in a text, and generating a plurality of input vectors corresponding to the symbol and elements in the text; inputting the plurality of input vectors to a first encoder network, and determining a behavior trigger position in the text based on an attention vector of a network node corresponding to the symbol; determining behavior content based on a first encoded vector that is outputted from the first encoder network and corresponds to the symbol; and playing an audio corresponding to the text, and controlling the virtual image to present the behavior content when the audio is played to the behavior trigger position.

METHOD AND APPARATUS FOR DIGITIZING PAPER DATA, ELECTRONIC DEVICE AND STORAGE MEDIUM
20220004752 · 2022-01-06 ·

The present application discloses a method and apparatus for digitizing paper data, an electronic device and a storage medium, relating to the fields of image processing and cloud computing, and in particular to image recognition technologies. According to the solution provided by the present application, graphic handwriting information included in an image to be processed can be recognized, and the handwriting information can be combined with a reference coordinate system of the image to be processed to obtain digitized data. In this way, paper data can still be converted into digitized data even when the paper data includes graphic data.