Patent classifications
G06V30/18181
CHARACTER DETECTION METHOD AND APPARATUS
Disclosed embodiments relate to a character detection method and apparatus. In some embodiments, the method includes: using an image including an annotated word as an input to a machine learning model; selecting, based on a predicted result for characters inside the annotation region of the annotated word and on annotation information of the annotated word, characters for training the machine learning model from the predicted characters inside the annotation region; and training the machine learning model based on features of the selected characters. This implementation enables full training of a machine learning model using existing word-level annotated images, yielding a model capable of detecting characters in images and thereby reducing the cost of training such a model.
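The selection step above can be sketched as follows. This is an illustrative reading, not the patent's actual criterion: predicted character boxes are kept when their centers fall inside the word's annotation box, and the set is accepted for training only when its size matches the annotated word length (function and parameter names are invented for illustration):

```python
def select_training_characters(pred_chars, word_box, word_text):
    """Keep predicted character boxes (x0, y0, x1, y1) whose center
    falls inside the word-level annotation box; accept the selection
    only if the number of kept boxes matches the annotated word length
    (an illustrative consistency check, not the patent's rule)."""
    x0, y0, x1, y1 = word_box
    inside = []
    for (cx0, cy0, cx1, cy1) in pred_chars:
        cx, cy = (cx0 + cx1) / 2, (cy0 + cy1) / 2
        if x0 <= cx <= x1 and y0 <= cy <= y1:
            inside.append((cx0, cy0, cx1, cy1))
    # Use these characters as training targets only when the count
    # agrees with the word-level annotation.
    return inside if len(inside) == len(word_text) else []
```

Under this sketch, word-level boxes supervise a character-level model without any character-level labels ever being drawn by hand.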
SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR ELECTRONIC DOCUMENT DISPLAY
A method, system, and computer program product include receiving a first input at a first element among a plurality of elements associated with at least one electronic document; determining, from the plurality of elements and based on predetermined relations among the plurality of elements, a second element associated with the first element; and causing a view that includes at least the second element to be displayed together with an electronic document that includes the first element.
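A minimal sketch of this lookup, assuming the predetermined relations are stored as a simple mapping from element identifiers to related element identifiers (all names and the return shape are illustrative):

```python
def related_view(elements, relations, first_id):
    """Given a selected element, find its related element via a
    predetermined relations mapping and return the content for a
    side-by-side view (illustrative data shapes, not the patent's)."""
    second_id = relations.get(first_id)
    if second_id is None:
        return None  # no predetermined relation for this element
    return {"view": elements[second_id], "with_document": elements[first_id]}
```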
Robust method for tracing lines of table
A method for image processing includes obtaining a mask of a stroke from an image and identifying a plurality of cross edges for the stroke based on the mask and a reference line. The plurality of cross edges includes a group of adjacent cross edges that intersect the reference line. The method further includes (a) calculating a first vector based on positions of at least two of the cross edges in the group, (b) expanding the group, based on the first vector, to include cross edges adjacent to the group that do not intersect the reference line, (c) calculating a second vector based on positions of at least two of the cross edges in the expanded group, and (d) expanding the expanded group, based on the second vector, to include a second group of adjacent cross edges near the expanded group that do not intersect the reference line.
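Steps (a) and (b) can be sketched in one dimension of expansion. This is a simplified reading under stated assumptions: cross edges are reduced to (x, y) centers ordered along the stroke, the initial group is the set intersecting the reference line, and edges that continue in the estimated direction are absorbed (tolerances and data shapes are illustrative):

```python
def trace_line(cross_edges, ref_y, tol=1.5):
    """Sketch of group expansion: cross_edges is a list of (x, y)
    cross-edge centers ordered along the stroke.  Start from edges
    intersecting the reference line, estimate a direction vector,
    then absorb adjacent edges that follow that vector within tol."""
    # (initial group) edges that intersect the reference line
    group = [i for i, (_, y) in enumerate(cross_edges) if abs(y - ref_y) < 0.5]
    if len(group) < 2:
        return group
    # (a) first vector from two cross edges in the group
    (x0, y0), (x1, y1) = cross_edges[group[0]], cross_edges[group[-1]]
    n = group[-1] - group[0]
    vec = ((x1 - x0) / n, (y1 - y0) / n)
    # (b) expand rightward: absorb adjacent edges continuing the vector,
    # even though they no longer intersect the reference line
    i = group[-1]
    while i + 1 < len(cross_edges):
        px, py = cross_edges[i]
        nx, ny = cross_edges[i + 1]
        if abs(nx - (px + vec[0])) <= tol and abs(ny - (py + vec[1])) <= tol:
            group.append(i + 1)
            i += 1
        else:
            break
    return group
```

Steps (c) and (d) would repeat the same estimate-then-absorb cycle on the expanded group, which is what lets the trace follow a table line that bends away from the straight reference line.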
SYMBOL MONITORING METHODS AND SYSTEMS
Vehicle systems and methods are provided for monitoring symbology in a video stream associated with a software application. A method involves a programmable device identifying, within a first portion of a frame of a video stream, metadata identifying one or more characteristics of a symbol to be analyzed and corresponding indicia of an expected location of the symbol within a second portion of the frame, extracting a subset of pixels from the second portion of the frame encompassing the expected location, and providing the extracted subset of pixels from the second portion of the frame and the metadata from the first portion of the frame to a hardware symbol detector configurable to determine one or more metrics associated with the symbol based on the extracted subset of pixels and the metadata and provide the one or more metrics to the software application.
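The frame split described above can be sketched with a toy layout, where the first portion of the frame carries the metadata and the second portion carries the pixels. The encoding here (a leading metadata record followed by pixel rows, with the expected location as an `(x, y, w, h)` rectangle) is purely illustrative, not the patent's format:

```python
def extract_symbol_pixels(frame):
    """Sketch: frame[0] stands in for the first portion carrying
    metadata (symbol characteristics and expected location), and the
    remaining rows are pixels.  Returns the cropped pixel subset plus
    the metadata, as would be handed to a hardware symbol detector."""
    metadata = frame[0]                      # first portion: metadata
    x, y, w, h = metadata["expected_location"]
    pixels = frame[1:]                       # second portion: pixel rows
    subset = [row[x:x + w] for row in pixels[y:y + h]]
    return subset, metadata
```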
METHODS AND SYSTEMS FOR GRAPH-INFERENCE-BASED TEXT EXTRACTION FROM UNSTRUCTURED DOCUMENTS
According to one aspect, the subject matter described herein includes a method for extracting text from unstructured documents. The method includes receiving a page of an unstructured document; extracting, from the page, a glyph identifier and a glyph position for each glyph on the page; and generating an adjacency graph based on the glyph positions for each glyph on the page, each node in the graph corresponding to a glyph and comprising glyph information that includes at least the glyph identifier and the glyph position for the respective glyph. The method further includes processing the adjacency graph by a machine learning model to classify edges and nodes in the adjacency graph, then grouping the glyphs according to their edge and node classifications to produce text output.
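The adjacency-graph construction can be sketched as follows, assuming each glyph is reduced to an identifier plus a 2-D position and adjacency is taken as k-nearest spatial neighbors (the neighbor rule and all names are illustrative, not the patent's definition):

```python
import math

def build_adjacency_graph(glyphs, k=2):
    """Build one node per glyph (id + position) and an undirected edge
    from each glyph to its k nearest neighbours by Euclidean distance.
    `glyphs` is a list of (glyph_id, x, y) tuples; names illustrative."""
    nodes = [{"id": gid, "pos": (x, y)} for gid, x, y in glyphs]
    edges = set()
    for i, (_, xi, yi) in enumerate(glyphs):
        dists = sorted(
            (math.hypot(xj - xi, yj - yi), j)
            for j, (_, xj, yj) in enumerate(glyphs) if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))  # store undirected edges once
    return nodes, sorted(edges)
```

The resulting node and edge sets are what the machine learning model would then classify before glyphs are grouped into text output.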
Navigation using portable reading machine
Navigation techniques, including map-based and object-recognition-based techniques especially adapted for use in a portable reading machine, are described.
METHOD, DEVICE, AND APPARATUS WITH THREE-DIMENSIONAL IMAGE ORIENTATION
A processor-implemented method includes detecting pieces of text from an image, determining vanishing points of the image, and estimating an orientation of the image based on the determined vanishing points, the detected pieces of text, and a key text graph representing connections between nodes corresponding to pieces of key text.
Content extraction based on graph modeling
Methods and systems are presented for extracting categorizable information from an image using a graph that models data within the image. Upon receiving an image, a data extraction system identifies characters in the image. The data extraction system then generates bounding boxes that enclose adjacent characters that are related to each other in the image. The data extraction system also creates connections between the bounding boxes based on locations of the bounding boxes. A graph is generated based on the bounding boxes and the connections such that the graph can accurately represent the data in the image. The graph is provided to a graph neural network that is configured to analyze the graph and produce an output. The data extraction system may categorize the data in the image based on the output.
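The bounding-box step can be sketched with a simple adjacency rule: merge horizontally close character boxes into one enclosing box, as a stand-in for "adjacent characters that are related to each other" (the gap threshold and merge rule are illustrative, not the patent's criterion):

```python
def group_characters(char_boxes, max_gap=5):
    """Merge horizontally adjacent character boxes (x0, y0, x1, y1)
    whose horizontal gap is at most max_gap into one enclosing box.
    Boxes are assumed to lie on one text line (illustrative)."""
    boxes = sorted(char_boxes)
    merged = []
    cur = list(boxes[0])
    for x0, y0, x1, y1 in boxes[1:]:
        if x0 - cur[2] <= max_gap:          # close enough: extend the box
            cur[2] = max(cur[2], x1)
            cur[1] = min(cur[1], y0)
            cur[3] = max(cur[3], y1)
        else:                               # gap too large: start a new box
            merged.append(tuple(cur))
            cur = [x0, y0, x1, y1]
    merged.append(tuple(cur))
    return merged
```

The merged boxes would then become graph nodes, with connections drawn between nearby boxes before the graph is handed to the graph neural network.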
Content extraction based on hop distance within a graph model
A method of categorizing text entries on a document can include determining, for each of a plurality of text bounding boxes in the document, respective text, respective coordinates, and respective input embeddings. The method may further include defining a graph of the plurality of bounding boxes, the graph comprising a plurality of connections among the plurality of bounding boxes, each connection comprising a first and a second bounding box and zero or more respective intermediate bounding boxes. The method may further include determining a respective attention value for each connection according to the quantity of intermediate bounding boxes in the connection; determining, based on the respective attention values and a transformer-based machine learning model applied to the respective input embeddings and respective coordinates, output embeddings for each bounding box; and generating, based on the respective output embeddings, a bounding box label for each bounding box.
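One illustrative reading of the hop-distance attention values: each connection carries a count of intermediate bounding boxes, and the attention value decays with that count, so a transformer attends more strongly to directly adjacent boxes. The exponential decay and all names here are assumptions, not the patent's formula:

```python
import math

def hop_attention_weights(connections, scale=1.0):
    """Each connection is (i, j, n_intermediate); return an attention
    value per connection that decays with the number of intermediate
    bounding boxes (illustrative decay, not the patent's definition)."""
    weights = {}
    for i, j, n_inter in connections:
        weights[(i, j)] = math.exp(-scale * n_inter)
    return weights
```

Such values could be added as a bias to a transformer's attention logits, letting spatial hop distance shape which boxes inform each output embedding.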
Online handwriting document layout analysis system
An online handwriting document layout analysis system includes a preprocessing unit that receives a document composed of a plurality of strokes and generates an undirected graph including a plurality of nodes and a plurality of edges representing relations between different strokes. A bidirectional recursive neural network unit initializes a feature vector for each of the nodes and for each of the edges. A graph neural network unit updates the feature vectors of the nodes and the edges to obtain updated feature vectors. A fully connected neural network unit performs coarse-grained and fine-grained object classification for each of the nodes and the edges based on the updated feature vectors. A document restoration unit restores a tree structure of the document.
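The preprocessing unit's graph construction can be sketched with a simple proximity rule: each stroke becomes a node, and an undirected edge links strokes whose centroids are close. The centroid-distance rule is an assumption for illustration, not the patent's relation criterion:

```python
def stroke_graph(strokes, max_dist=20):
    """Each stroke is a list of (x, y) points; nodes are stroke indices
    and an undirected edge links strokes whose centroids lie within
    max_dist of each other (illustrative proximity relation)."""
    def centroid(pts):
        return (sum(x for x, _ in pts) / len(pts),
                sum(y for _, y in pts) / len(pts))
    centers = [centroid(s) for s in strokes]
    nodes = list(range(len(strokes)))
    edges = []
    for i in range(len(strokes)):
        for j in range(i + 1, len(strokes)):
            (xi, yi), (xj, yj) = centers[i], centers[j]
            if ((xi - xj) ** 2 + (yi - yj) ** 2) ** 0.5 <= max_dist:
                edges.append((i, j))
    return nodes, edges
```

The node and edge feature vectors initialized on this graph are what the downstream recursive, graph, and fully connected network units would then refine and classify.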