Patent classifications
G06V30/18181
CHARACTER RECOGNITION METHOD AND APPARATUS, ELECTRONIC DEVICE AND COMPUTER READABLE STORAGE MEDIUM
A character recognition method, a character recognition apparatus, an electronic device and a computer readable storage medium are disclosed. The character recognition method includes: determining semantic information and first position information of each individual character recognized from an image; constructing a graph network according to the semantic information and the first position information of each individual character; and determining a character recognition result of the image according to a feature of each individual character calculated by the graph network.
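The graph construction and per-character feature calculation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the distance-based edge rule, the `max_dist` parameter, the toy feature vectors, and the single mean-aggregation step are all hypothetical stand-ins, since the abstract does not specify how edges are formed or how the graph network computes features.

```python
from math import hypot

def build_char_graph(chars, max_dist):
    """Connect characters whose positions are within max_dist of each other.

    chars: list of (character, (x, y)) pairs.
    Returns an adjacency dict: node index -> list of neighbor indices.
    The distance threshold is an assumed edge rule, not taken from the source.
    """
    adj = {i: [] for i in range(len(chars))}
    for i, (_, (xi, yi)) in enumerate(chars):
        for j, (_, (xj, yj)) in enumerate(chars):
            if i != j and hypot(xi - xj, yi - yj) <= max_dist:
                adj[i].append(j)
    return adj

def aggregate(features, adj):
    """One mean-aggregation step: each node's new feature is the
    element-wise mean of its own feature and its neighbors' features."""
    out = []
    for i, f in enumerate(features):
        group = [f] + [features[j] for j in adj[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out

# Toy input: three characters on one text line and one far away.
chars = [("c", (0, 0)), ("a", (1, 0)), ("t", (2, 0)), ("x", (10, 10))]
adj = build_char_graph(chars, max_dist=1.5)

# Hypothetical 2-dimensional "semantic" features, one per character.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]]
updated = aggregate(features, adj)
```

In this sketch the isolated character keeps its own feature, while adjacent characters blend theirs, which is the basic effect a graph network relies on when combining semantic and position information.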
System, method and computer program product for electronic document display
A method, system, and computer program product include receiving a first input at a first element among a plurality of elements associated with at least one electronic document, determining a second element associated with the first element from the plurality of elements based on predetermined relations of the plurality of elements, and causing a view to be displayed together with an electronic document including the first element, the view at least including the second element.
Character detection method and apparatus
Disclosed embodiments relate to a character detection method and apparatus. In some embodiments, the method includes: using an image including an annotated word as an input to a machine learning model; selecting, based on a predicted result for characters inside an annotation region of the annotated word and on annotation information of the annotated word, characters for training the machine learning model from the predicted characters inside the annotation region; and training the machine learning model based on features of the selected characters. This implementation fully trains a machine learning model using existing word-level annotated images, yielding a model capable of detecting characters in images and thereby reducing the cost of training such a model.
DEEP LEARNING METHOD
Provided is a deep learning method including: a step in which each of at least two deep learning machines learns web traffic by using hexadecimal; a step in which the deep learning machines learn the web traffic by incremental learning using a weight; a step in which, when the web traffic is received, each of the deep learning machines encodes a character string of the web traffic in UTF-8 hexadecimal; and a step in which each of the deep learning machines converts the character string into an image and performs deep learning on the image.
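The encoding and image-conversion steps can be sketched as below. This is a minimal illustration, assuming one grayscale pixel per UTF-8 byte, a fixed row width, and zero padding; the abstract does not state the image layout, so `hex_to_image`, `width`, and the sample request string are hypothetical.

```python
def to_hex(text):
    """Encode a character string as UTF-8 and return its hexadecimal form."""
    return text.encode("utf-8").hex()

def hex_to_image(hex_str, width):
    """Turn a hex string into a 2D grid of byte values (0-255).

    Each byte becomes one pixel; the last row is zero-padded to `width`.
    This layout is an assumption made for illustration.
    """
    pixels = list(bytes.fromhex(hex_str))
    pad = (-len(pixels)) % width
    pixels += [0] * pad
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]

# Hypothetical web request containing one non-ASCII character (e with acute).
request = "GET /?q=\u00e9"
hex_str = to_hex(request)
image = hex_to_image(hex_str, width=4)

for row in image:
    print(row)
```

The resulting grid could then be fed to an image model; the choice of width only affects how the byte sequence wraps into rows.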
TEXT BLOCK SEGMENTATION
A computer-implemented method for text block segmentation includes determining a first text block segmentation pattern utilized to generate a segmented text block based, at least in part, on a comparison of semantic information associated with the segmented text block and a plurality of predefined types of text block segmentation patterns indicated by a graph; calculating a first degree of confidence in a size of the segmented text block based, at least in part, on comparing semantic entities associated with the segmented text block with semantic entities indicated by leaf nodes stemming from a first non-leaf node included in the graph and representative of the first type of text block segmentation pattern; and determining that the size of the segmented text block is non-optimal based on the calculated degree of confidence in the size of the segmented text block being below a predetermined threshold.
Conversion of tabular format data to machine readable text for QA operations
A system and method for table conversion including converting a table containing text in tabular form to an image, labeling each text area of the image with a bounding box, determining, for each bounding box, position information, semantic information, and image information, reconstructing the image into a graph form having a plurality of nodes, wherein each node represents the bounding box of one of the text areas of the image, inputting at least two nodes into a trained neural network to determine a relative relationship between the at least two nodes, building a knowledge graph using the relative relationship of the at least two nodes, and translating the knowledge graph into machine readable natural language.
SYSTEM AND METHOD FOR PROGRAM SYNTHESIS FOR WEAKLY-SUPERVISED MULTIMODAL QUESTION ANSWERING USING FILTERED ITERATIVE BACK-TRANSLATION
This disclosure relates generally to program synthesis for weakly-supervised multimodal question answering using filtered iterative back-translation (FIBT). Existing approaches for chart question answering mainly address structural, visual, relational, or simple data retrieval queries with fixed-vocabulary answers. The present disclosure implements a two-stage approach where, in a first stage, a computer vision pipeline is employed to extract data from chart images and store it in a generic schema. In a second stage, SQL programs for Natural Language (NL) queries in the dataset are generated by using FIBT. To adapt the forward and backward models to the required NL queries, a Probabilistic Context-Free Grammar is defined, whose probabilities are set to be inversely proportional to SQL programs in the training data, and programs are sampled from it. A compositional similarity-based filtration strategy, employed on the NL queries generated for these SQL programs, enables synthesizing, filtering, and appending NL query-SQL program pairs to the training data, iteratively moving towards the required NL query distribution.
Computer Vision Systems and Methods for Information Extraction from Floorplan Images
Computer vision systems and methods for information extraction from floorplan images are provided. The system generates a multi-attributed graph representing an architectural floorplan image having nodes representing rooms of the floorplan image and connecting edges therebetween representing connectivity between the rooms. Each node of the multi-attributed graph can have multiple attributes including a type of the room, a room size, and the number of the floor on which the room lies. Each edge can have attributes to denote a type of connectivity, such as door-based, wall-based, wall-with-window-based, and vertical connectivity where one room is located beneath another room on a separate floor of the floorplan image.
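A multi-attributed graph of this kind could be represented along the following lines. This is a sketch only: the class names `Room` and `FloorplanGraph`, the field names, and the connectivity labels `"door"` and `"vertical"` are hypothetical choices for illustration, not the data model of the disclosed system.

```python
from dataclasses import dataclass, field

@dataclass
class Room:
    """Node attributes: room type, size, and floor number."""
    room_type: str
    size_sqm: float
    floor: int

@dataclass
class FloorplanGraph:
    rooms: dict = field(default_factory=dict)  # room name -> Room
    edges: list = field(default_factory=list)  # (name_a, name_b, connectivity)

    def add_room(self, name, room):
        self.rooms[name] = room

    def connect(self, a, b, connectivity):
        """Add an undirected edge labeled with its connectivity type."""
        for name in (a, b):
            if name not in self.rooms:
                raise KeyError(f"unknown room: {name}")
        self.edges.append((a, b, connectivity))

    def neighbors(self, name, connectivity=None):
        """Rooms connected to `name`, optionally filtered by connectivity type."""
        found = []
        for a, b, kind in self.edges:
            if connectivity is not None and kind != connectivity:
                continue
            if a == name:
                found.append(b)
            elif b == name:
                found.append(a)
        return found

plan = FloorplanGraph()
plan.add_room("kitchen", Room("kitchen", 12.0, 1))
plan.add_room("living", Room("living room", 20.0, 1))
plan.add_room("bedroom", Room("bedroom", 14.0, 2))
plan.connect("kitchen", "living", "door")
plan.connect("living", "bedroom", "vertical")  # living room lies beneath the bedroom

door_neighbors = plan.neighbors("living", connectivity="door")
all_neighbors = plan.neighbors("living")
```

Keeping the connectivity type on the edge rather than on the node makes it possible to query, for example, only door-based adjacency without touching the room attributes.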
Method, apparatus, device, storage medium and program product of performing text matching
A method, an apparatus, a device, a storage medium and a program product for performing text matching are provided, which relate to the field of computer technology, and in particular to natural language processing and deep learning technologies. The method includes: determining a word set and a plurality of semantic units from a text set, wherein the word set is associated with a first predetermined attribute, and the text set contains a plurality of first texts indicating object information and a plurality of second texts indicating object demand information; generating a graph; and generating a final feature representation associated with the text set and the word set based on the graph and a graph convolution model, so as to perform the text matching.
Techniques for graph data structure augmentation
A computing device may receive a set of user documents. Data may be extracted from the documents to generate a first graph data structure with one or more initial graphs containing key-value pairs. A model may be trained on the first graph data structure to classify the pairs. Until a set of evaluation metrics for the model exceeds a set of deployment thresholds, the following may be repeated: a set of evaluation metrics may be generated for the model, and the set of evaluation metrics may be compared to the set of deployment thresholds. In response to a determination that the set of evaluation metrics is below the set of deployment thresholds, one or more new graphs may be generated from the one or more initial graphs in the first graph data structure to produce a second graph data structure. The first and second graph data structures can be used to train the model.
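The evaluate-compare-augment loop might look roughly like the sketch below. Everything here is an assumption made for illustration: `train`, `evaluate`, and `augment` are hypothetical callables supplied by the caller, the `max_rounds` cap is added so the loop always terminates, and the toy stand-ins at the bottom only mimic a model whose score grows with the amount of training data.

```python
def train_until_deployable(initial_graphs, train, evaluate, augment,
                           thresholds, max_rounds=10):
    """Retrain on initial plus augmented graphs until every metric
    exceeds its deployment threshold (or max_rounds is exhausted)."""
    graphs = list(initial_graphs)
    for _ in range(max_rounds + 1):
        model = train(graphs)
        metrics = evaluate(model)
        if all(metrics[name] > limit for name, limit in thresholds.items()):
            return model, metrics, graphs
        # Metrics are below threshold: derive new graphs from the initial
        # ones and train on the combined data in the next round.
        graphs = graphs + augment(initial_graphs)
    raise RuntimeError("deployment thresholds not reached")

# Toy stand-ins (hypothetical): the "model" is just the training-set size.
def toy_train(graphs):
    return len(graphs)

def toy_evaluate(model):
    return {"accuracy": min(1.0, model / 10)}

def toy_augment(graphs):
    return [dict(g) for g in graphs]  # placeholder: copies of the initial graphs

initial = [{"name": "Ada"}, {"total": "42"}]
model, metrics, used_graphs = train_until_deployable(
    initial, toy_train, toy_evaluate, toy_augment, {"accuracy": 0.5}
)
```

With these stand-ins the loop stops on the third round, once six graphs are in the training set; a real augmentation step would perturb or recombine key-value pairs rather than copy them.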