Patent classifications
G06V30/18181
Digital forensic apparatus for searching recovery target area for large-capacity video evidence using time map and method of operating the same
The present disclosure relates to technology for automatically searching and recovering the recovery area of frames corresponding to a desired time for large-capacity video evidence using a time map generated through an optical character recognition (OCR) function. A digital forensic apparatus for searching and recovering a recovery target area for large-capacity video evidence using a time map according to an embodiment of the present disclosure may include a division recovery device for collecting video evidence from a storage device, dividing the collected video evidence into a plurality of spaces in consideration of the physical space of the storage device, and recovering a representative frame in each of the divided spaces; a time information recognizer for recognizing time information from the recovered representative frame using an optical character recognition (OCR) function; a time map generator for generating a time map in which the divided spaces are arranged according to a time criterion based on the recognized time information; and a selective recovery device for searching a recovery target area by matching specific time information input by a user with the generated time map and recovering the searched recovery target area.
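The time map idea in this abstract can be illustrated with a minimal sketch. It assumes each divided space is represented by a hypothetical tuple of (start offset, end offset, OCR-recognized timestamp), and that a query is matched to the space whose representative time is the latest one not after the requested time. The function names and the matching rule are illustrative assumptions, not details taken from the disclosure.

```python
from bisect import bisect_right
from datetime import datetime


def build_time_map(regions):
    """Order divided spaces by their OCR-recognized time.

    regions: list of (start_offset, end_offset, recognized_time or None).
    Spaces whose representative frame yielded no time are skipped (assumption).
    """
    usable = [r for r in regions if r[2] is not None]
    return sorted(usable, key=lambda r: r[2])


def find_recovery_target(time_map, query_time):
    """Return the space whose representative time is the latest one <= query_time."""
    times = [r[2] for r in time_map]
    idx = bisect_right(times, query_time) - 1
    return time_map[idx] if idx >= 0 else None


# Hypothetical spaces laid out in physical order, not in time order.
regions = [
    (0, 1000, datetime(2023, 5, 1, 12, 0)),
    (1000, 2000, datetime(2023, 5, 1, 8, 0)),
    (2000, 3000, None),  # OCR found no timestamp in this representative frame
    (3000, 4000, datetime(2023, 5, 1, 10, 0)),
]
time_map = build_time_map(regions)
target = find_recovery_target(time_map, datetime(2023, 5, 1, 10, 30))
```

Here `target` is the space starting at offset 3000, so only that area would need to be recovered in full.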
Text block segmentation
A computer-implemented method for text block segmentation includes determining a first text block segmentation pattern utilized to generate a segmented text block based, at least in part, on a comparison of semantic information associated with the segmented text block and a plurality of predefined types of text block segmentation patterns indicated by a graph; calculating a first degree of confidence in a size of the segmented text block based, at least in part, on comparing semantic entities associated with the segmented text block with semantic entities indicated by leaf nodes stemming from a first non-leaf node included in the graph and representative of the first type of text block segmentation pattern; and determining that the size of the segmented text block is non-optimal based on the calculated degree of confidence in the size of the segmented text block being below a predetermined threshold.
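The abstract does not say how the degree of confidence is calculated. As one possible reading, the sketch below scores a segmented text block by the fraction of expected semantic entities (those on the leaf nodes under the pattern's non-leaf node) that the block actually contains. The overlap formula, function names, and threshold values are all assumptions made for illustration.

```python
def size_confidence(block_entities, leaf_entities):
    """Fraction of the pattern's expected entities found in the block (assumed metric)."""
    expected = set(leaf_entities)
    if not expected:
        return 0.0
    return len(set(block_entities) & expected) / len(expected)


def is_size_non_optimal(block_entities, leaf_entities, threshold=0.5):
    """Flag the block size as non-optimal when confidence falls below the threshold."""
    return size_confidence(block_entities, leaf_entities) < threshold


# Hypothetical leaf entities for one segmentation pattern.
leaves = ["title", "author", "date", "abstract"]
block = ["title", "author"]
confidence = size_confidence(block, leaves)
flagged = is_size_non_optimal(block, leaves, threshold=0.6)
```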
Generating templates using structure-based matching
In implementations of systems for generating templates using structure-based matching, a computing device implements a template system to receive input data describing a set of digital design elements. The template system represents the input data as a sentence in a design structure language that describes structural relationships between design elements included in the set of digital design elements. An input template embedding is generated based on the sentence in the design structure language. The template system generates a digital template that includes the set of digital design elements for display in a user interface based on the input template embedding.
Systems and methods for machine learning key-value extraction on documents
A machine-learning-based key-value extraction model extracts fields/entities from documents. The input images are processed through OCR, and a list of words (uni-grams) with their coordinates is extracted from the original images. Following word cleaning and manipulation, n-gram creation (multi-words), and feature engineering, the transformed data is fed into a classification algorithm to predict whether a uni-gram or n-gram is one of the target entities or a non-entity. Following the first step that includes unique feature engineering, a second step improves extraction accuracy among the fields/entities.
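The n-gram creation step can be sketched as follows, assuming the OCR output is a list of hypothetical `(text, x0, y0, x1, y1)` tuples in reading order on a single line. Each n-gram joins consecutive words and takes the union of their boxes. The tuple layout and the `max_n` parameter are illustrative assumptions.

```python
def make_ngrams(words, max_n=3):
    """Build uni-grams and multi-word n-grams with merged bounding boxes.

    words: list of (text, x0, y0, x1, y1) in reading order (assumed layout).
    """
    ngrams = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            span = words[i:i + n]
            text = " ".join(w[0] for w in span)
            # The merged box is the smallest rectangle covering every word in the span.
            box = (
                min(w[1] for w in span),
                min(w[2] for w in span),
                max(w[3] for w in span),
                max(w[4] for w in span),
            )
            ngrams.append((text, box))
    return ngrams


words = [
    ("Invoice", 10, 10, 60, 20),
    ("Number", 65, 10, 110, 20),
    ("12345", 120, 10, 160, 20),
]
ngrams = make_ngrams(words, max_n=2)
```

Each resulting n-gram and its box could then be turned into features for the classifier that decides between target entities and non-entities.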
SYSTEM AND METHOD FOR VISION-ASSISTED APPROACH FOR GRAPH STRUCTURE EXTRACTION IN VARIOUS TYPES OF DOCUMENTS
Various methods and processes, apparatuses or systems, and media for deterministically deriving underlying graph structure and associated text information in a document are disclosed. A processor implements a vision-based algorithm and a network-based algorithm that may extract and structure a diagram from an image obtained from the document. The processor deterministically derives the underlying graph structure and associated text information in the document by applying the vision-based algorithm and the network-based algorithm, thereby allowing encoding of graph content and reasoning into downstream applications including LLM inputs, graphical question-answering, and information extraction tasks. The processor also implements an OCR algorithm for text fields, then isolates which piece of text belongs to which node by comparing the spatial coordinates of the text against the bounding box of the node, and executes cross-page resolution.
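The text-to-node step can be sketched as a point-in-box test. The example assumes axis-aligned boxes given as `(x0, y0, x1, y1)` and assigns each OCR text field to the first node whose box contains the field's center. The center-point rule and all names are assumptions; the abstract only states that spatial coordinates are compared against the node's bounding box.

```python
def assign_text_to_nodes(nodes, texts):
    """Map each OCR text field to the node whose bounding box contains its center.

    nodes: {node_name: (x0, y0, x1, y1)}
    texts: list of (string, (x0, y0, x1, y1))
    Text that falls inside no node box is left unassigned.
    """
    assignment = {name: [] for name in nodes}
    for string, (tx0, ty0, tx1, ty1) in texts:
        cx, cy = (tx0 + tx1) / 2, (ty0 + ty1) / 2
        for name, (x0, y0, x1, y1) in nodes.items():
            if x0 <= cx <= x1 and y0 <= cy <= y1:
                assignment[name].append(string)
                break
    return assignment


nodes = {"A": (0, 0, 100, 50), "B": (200, 0, 300, 50)}
texts = [
    ("Start", (10, 10, 60, 30)),
    ("End", (210, 10, 250, 30)),
    ("stray label", (120, 100, 150, 110)),  # outside every node box
]
assignment = assign_text_to_nodes(nodes, texts)
```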
System and method for program synthesis for weakly-supervised multimodal question answering using filtered iterative back-translation
This disclosure relates generally to program synthesis for weakly-supervised multimodal question answering using filtered iterative back-translation (FIBT). Existing approaches for chart question answering mainly address structural, visual, relational, or simple data retrieval queries with fixed-vocabulary answers. The present disclosure implements a two-stage approach. In the first stage, a computer vision pipeline extracts data from chart images and stores it in a generic schema. In the second stage, SQL programs for the Natural Language (NL) queries in the dataset are generated using FIBT. To adapt the forward and backward models to the required NL queries, a Probabilistic Context-Free Grammar is defined whose probabilities are set to be inversely proportional to SQL programs in the training data, and programs are sampled from it. A compositional similarity-based filtration strategy applied to the NL queries generated for these SQL programs enables synthesizing, filtering, and appending NL query-SQL program pairs to the training data, iteratively moving towards the required NL query distribution.
Computer systems and methods for identifying location entities and generating a location entity data taxonomy
An example computing platform is configured to: obtain a two-dimensional drawing of a portion of a construction project; perform an image processing analysis of the two-dimensional drawing to identify one or more location entities within the two-dimensional drawing; derive embeddings for each location entity in the two-dimensional drawing; based on the derived embeddings, determine relationships between the one or more location entities; and based on the determined relationships between the one or more location entities, generate a location entity data taxonomy that includes each identified location entity as a respective node that is related to at least one other location entity.
Graph machine learning for case similarity
Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.
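One of the options named above, an arithmetic aggregation of vertex vectors followed by a similarity search among known vectors, can be sketched as below. The use of the mean as the aggregation, cosine similarity as the comparison, and the names `graph_vector` and `most_similar` are assumptions for illustration; the abstract does not fix these choices.

```python
import math


def graph_vector(vertex_vectors):
    """Aggregate vertex vectors into one graph vector by element-wise mean (assumed)."""
    n = len(vertex_vectors)
    dim = len(vertex_vectors[0])
    return [sum(v[d] for v in vertex_vectors) / n for d in range(dim)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(new_vec, known, k=2):
    """Return the names of the k known graphs whose vectors are closest to new_vec."""
    ranked = sorted(known, key=lambda name: cosine(new_vec, known[name]), reverse=True)
    return ranked[:k]


# Hypothetical known graph vectors keyed by case name.
known = {"case_a": [1.0, 0.0], "case_b": [0.9, 0.1], "case_c": [0.0, 1.0]}
new_vec = graph_vector([[1.0, 0.0], [0.8, 0.2]])
neighbours = most_similar(new_vec, known, k=2)
```

The new graph would then be characterized from what is known about the returned neighbours, here `case_b` and `case_a`.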
Online handwriting document layout analysis system
An online handwriting document layout analysis system includes a preprocess unit that receives a document composed of a plurality of strokes and generates an undirected graph including a plurality of nodes and a plurality of edges representing relations between different strokes. A bidirectional recursive neural network unit initializes a feature vector of each of the nodes and a feature vector of each of the edges. A graphic neural network unit updates the feature vectors of the nodes and the edges to obtain updated feature vectors. A fully connected neural network unit performs coarse-grained object classification and fine-grained object classification for each of the nodes and the edges based on the updated feature vectors. A document restoration unit restores a tree structure of the document.
DETECTION OF TECHNICAL DATA IN AN IMAGE OF A TECHNICAL DRAWING
A method for detecting technical data in an image of a technical drawing is disclosed. The technical drawing includes a view of a technical object and a technical annotation. The method includes identifying one or more views in the technical drawing, identifying one or more technical annotations in each view, and identifying characters in each technical annotation. The method includes determining a graph representation of each view. The graph representation includes nodes, each corresponding to a classification of pixels in the view into a semantic class, and edges, each connecting two nodes if the two nodes represent neighboring pixels or if they represent pixels whose distance from each other is below a threshold. The method includes, for each identified view, using the graph topology and the identified characters to associate nodes corresponding to the dimension-related symbol or dimension classes with nodes corresponding to the geometry class.
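The edge rule in this abstract can be sketched under simplifying assumptions: one node per classified pixel, "neighboring" read as 8-connectivity, and distance measured as Euclidean distance in pixel units. The function name, class labels, and threshold value below are hypothetical.

```python
import math
from itertools import combinations


def build_edges(nodes, threshold):
    """Connect two nodes if their pixels are neighbors or closer than the threshold.

    nodes: list of (row, col, semantic_class), one node per pixel (assumption).
    Returns a list of (i, j) index pairs with i < j.
    """
    edges = []
    for (i, a), (j, b) in combinations(enumerate(nodes), 2):
        dr, dc = abs(a[0] - b[0]), abs(a[1] - b[1])
        neighbours = max(dr, dc) == 1          # 8-connected adjacency
        close = math.hypot(dr, dc) < threshold  # Euclidean distance test
        if neighbours or close:
            edges.append((i, j))
    return edges


nodes = [
    (0, 0, "geometry"),
    (0, 1, "geometry"),
    (0, 3, "dimension"),
    (5, 5, "dimension-related symbol"),
]
edges = build_edges(nodes, threshold=2.5)
```

With these inputs the first two pixels are linked as neighbors, and the second and third are linked because they lie 2 pixels apart, which is below the threshold.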