G06V10/7625

GENERATING MACHINE RENDERABLE REPRESENTATIONS OF FORMS USING MACHINE LEARNING
20220035996 · 2022-02-03

A method may include clustering form elements into line objects and columns of a table of a structured representation by applying a trained multi-dimensional clustering model to spatial coordinates of the form elements. The method may further include assigning a table header line type to a table header line object of the line objects based on a spatial coordinate of the table header line object relative to a spatial coordinate of a topmost table data line object of the line objects, and a determination that a number of columns of the table header line object is within a threshold of a number of columns of the topmost table data line object. The topmost table data line object may be assigned a table data line type. The method may further include presenting the structured representation to a user.
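The line-grouping and header-assignment steps might be sketched as follows. This is only an illustration: each form element is reduced to an (x, y) coordinate, a simple sweep with a `y_tol` tolerance stands in for the trained multi-dimensional clustering model, and the `col_threshold` value is a hypothetical choice.

```python
def group_into_lines(elements, y_tol=5):
    """Cluster form elements into line objects by y-coordinate proximity.

    elements: list of (x, y) tuples, a stand-in for element coordinates.
    """
    lines = []
    for x, y in sorted(elements, key=lambda e: e[1]):
        if lines and abs(lines[-1]["y"] - y) <= y_tol:
            lines[-1]["cells"].append(x)   # same line: collect as a column
        else:
            lines.append({"y": y, "cells": [x]})
    return lines

def assign_header(lines, col_threshold=1):
    """Assign a table-header type to the first line if it sits above the
    topmost data line and their column counts are within a threshold."""
    if len(lines) < 2:
        return None
    header, top_data = lines[0], lines[1]
    above = header["y"] < top_data["y"]
    cols_close = abs(len(header["cells"]) - len(top_data["cells"])) <= col_threshold
    if above and cols_close:
        header["type"] = "table_header"
        top_data["type"] = "table_data"
        return header
    return None
```

For example, six elements on two rows of three columns would group into two line objects, with the upper one assigned the table header line type.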

COMPUTERIZED CORRESPONDENCE ESTIMATION USING DISTINCTIVELY MATCHED PATCHES

Correspondences in content items may be determined using a trained decision tree to detect distinctive matches between portions of content items. The techniques described include determining a first group of patches associated with a first content item and processing a first patch based at least partly on causing the first patch to move through a decision tree, and determining a second group of patches associated with a second content item and processing a second patch based at least partly on causing the second patch to move through the decision tree. The techniques described include determining that the first patch and the second patch are associated with a same leaf node of the decision tree and determining that the first patch and the second patch are corresponding patches based at least partly on determining that the first patch and the second patch are associated with the same leaf node.
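The leaf-matching idea could be illustrated with a tiny hand-built tree in place of a trained one; the node structure, feature indices, thresholds, and leaf names below are all hypothetical, chosen only to show how two patches routed to the same leaf are declared corresponding.

```python
# A hand-built decision tree: each internal node tests one patch feature
# against a threshold; strings are leaf identifiers.
TREE = {
    "feature": 0, "threshold": 0.5,
    "left":  {"feature": 1, "threshold": 0.3, "left": "leaf_a", "right": "leaf_b"},
    "right": {"feature": 1, "threshold": 0.7, "left": "leaf_c", "right": "leaf_d"},
}

def route(tree, patch):
    """Cause a patch (feature vector) to move through the tree to a leaf."""
    node = tree
    while isinstance(node, dict):
        node = node["left"] if patch[node["feature"]] <= node["threshold"] else node["right"]
    return node

def corresponding(patch_a, patch_b, tree=TREE):
    """Patches correspond when they are associated with the same leaf node."""
    return route(tree, patch_a) == route(tree, patch_b)
```

Two patches whose features fall on the same side of every split land in the same leaf and are treated as a distinctive match.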

Method for determining dimensions in an indoor scene from a single depth image

A method determines dimensions in a scene by first acquiring, with a sensor, a depth image of the scene, and extracting planes from the depth image. Topological relationships of the planes are determined. The dimensions are determined based on the planes and the topological relationships. A quality of the dimensions is evaluated using a scene type; if the quality is sufficient, the dimensions are output, and otherwise guidance to reposition the sensor is output.
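The plane-extraction and dimension steps might be sketched as below, under simplifying assumptions: depth pixels are already back-projected to 3-D points, a least-squares fit stands in for plane extraction, and a room dimension is taken as the distance between two (near-)parallel wall planes. The function names are illustrative.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane fit: returns a unit normal n and offset d
    such that n . p + d ~= 0 for points p on the plane."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid)
    n = vt[-1]              # direction of least variance = plane normal
    d = -n @ centroid
    return n, d

def wall_to_wall_distance(plane_a, plane_b):
    """Distance between two (near-)parallel planes, e.g. opposite walls."""
    (na, da), (nb, db) = plane_a, plane_b
    if na @ nb < 0:         # flip one plane so the normals agree
        nb, db = -nb, -db
    return abs(da - db)
```

Fitting planes to points on the floor (z = 0) and the ceiling (z = 3) and taking the offset difference would recover a ceiling height of 3.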

SYSTEMS AND METHODS FOR SLIDE IMAGE ALIGNMENT

Systems and methods for slide image alignment are described herein. An example method includes receiving a plurality of slide images, detecting a plurality of features contained in the slide images, and comparing a plurality of pairs of the slide images. The comparison uses the detected features. The method also includes creating a distance matrix that reflects a respective difference between each of the pairs of the slide images, creating a graph by connecting each of the slide images to its most similar slide image, and detecting a plurality of graph components. Each of the graph components includes one or more of the slide images. The method further includes aligning the slide images within each of the graph components, and aligning the graph components to form a composite image.
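The graph-building and component-detection steps might be sketched with a union-find over a precomputed distance matrix; this is an illustrative reduction, not the described system, and assumes the distance matrix is already available.

```python
def nearest_neighbor_components(dist):
    """Connect each slide image to its most similar image (smallest
    distance) and return the connected components as sets of indices."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    for i in range(n):
        # most similar image to i, excluding i itself
        j = min((k for k in range(n) if k != i), key=lambda k: dist[i][k])
        union(i, j)

    comps = {}
    for i in range(n):
        comps.setdefault(find(i), set()).add(i)
    return sorted(comps.values(), key=min)
```

Alignment would then proceed within each component before the components themselves are aligned into the composite image.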

METHOD FOR GENERATING VIDEO FILE FORMAT-BASED SHAPE RECOGNITION LIST

A three-dimensional (3D) video reconstruction method, an encoder and a decoder are provided, comprising obtaining a list of video content screens or video content frames of an object from the 3D video; obtaining a list of depth screens of the 3D video; adding a shape screen to each video frame of the 3D video; superimposing each of the video content screens or video content frames with the depth screen and the shape screen to form a shape identification library; and storing the shape identification library at a header of a compressed file for unmasking of the object. The shape recognition list format may significantly reduce the storage size and increase the compression ratio by replacing the original shapes with identifiers, and may help improve the rendering quality.

INTERACTIVE VALIDATION OF ANNOTATED DOCUMENTS

A computer-implemented method includes: receiving, by a computer device, an electronic document having labels; predicting, by the computer device, that a user will reject the labels; determining, by the computer device and in response to predicting that the user will reject the labels, that a subset of the labels violates association rules; marking, by the computer device, the subset of labels that violate the association rules for validation; prioritizing, by the computer device, the subset of labels that violate the association rules; and rendering, by the computer device, the subset of labels that violate the association rules in view of priority.
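The rule-checking and prioritization steps might be sketched as follows; the rule form (one label requiring a co-occurring label) and the priority map are hypothetical stand-ins for whatever association rules the system learns or is configured with.

```python
def validate_labels(labels, rules, priority):
    """Mark labels that violate association rules and order them by priority.

    rules: {antecedent_label: required_co_label} -- a hypothetical rule form
    saying the antecedent is only valid when its co-label is also present.
    """
    present = set(labels)
    violations = [a for a, b in rules.items() if a in present and b not in present]
    # highest-priority violations are rendered for validation first
    return sorted(violations, key=lambda l: priority.get(l, 0), reverse=True)
```

A document labeled with a date but no signature would, under the illustrative rule `{"date": "signature"}`, have "date" marked for validation.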

High-dimensional image feature matching method and device

A high-dimensional image feature matching method and device relating to the field of image retrieval. The method includes extracting a high-dimensional image feature of an image to be retrieved; dividing the high-dimensional image feature into a plurality of low-dimensional image features; comparing each of the low-dimensional image features of the image to be retrieved with clustering centers at each layer of the low-dimensional image features of the images in a database; and determining a similarity of the low-dimensional image features between the image to be retrieved and each of at least some images in the database according to a comparison result, so that at least one feature matching the high-dimensional image feature of the image to be retrieved is retrieved from the database.
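The divide-and-compare idea resembles sub-vector quantization and might be sketched as below; the single-layer codebooks, center values, and function names are illustrative assumptions, not the patented scheme.

```python
import math

def split(vec, parts):
    """Divide a high-dimensional feature into equal low-dimensional chunks."""
    step = len(vec) // parts
    return [vec[i * step:(i + 1) * step] for i in range(parts)]

def nearest_center(sub, centers):
    """Index of the closest clustering center for one low-dimensional chunk."""
    return min(range(len(centers)), key=lambda i: math.dist(sub, centers[i]))

def code(vec, codebooks):
    """Comparison result: one center index per low-dimensional feature."""
    return tuple(nearest_center(s, cb)
                 for s, cb in zip(split(vec, len(codebooks)), codebooks))

def retrieve(query, database, codebooks):
    """Return indices of database vectors whose code matches the query's."""
    q = code(query, codebooks)
    return [i for i, v in enumerate(database) if code(v, codebooks) == q]
```

Two vectors whose every chunk quantizes to the same center are treated as matching, which avoids comparing full high-dimensional features directly.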

SYSTEMS AND METHODS FOR MAPPING BASED ON MULTI-JOURNEY DATA

A method performed by an apparatus is described. The method includes receiving map data that is based on first image data, second image data, and a similarity metric. The first image data can be received from a first vehicle and represent an object. The second image data can be received from a second vehicle and represent the object. The similarity metric can be associated with the object represented in the first image data and the object represented in the second image data. The method can also include storing, by a vehicle, the received map data and localizing the vehicle based on the stored map data.

Monotone speech detection

Examples of the present disclosure describe systems and methods for detecting monotone speech. In aspects, audio data provided by a user may be received by a device. Pitch values may be calculated and/or extracted from the audio data. The non-zero pitch values may be divided into clusters. For each cluster, a Pitch Variation Quotient (PVQ) value may be calculated. The weighted average of PVQ values across the clusters may be calculated and compared to a threshold for determining monotone speech. Based on the comparison, the audio data may be classified as monotone or non-monotone and an indication of the classification may be provided to the user in real-time via a user interface. Upon the completion of the audio session in which the audio data is received, feedback for the audio data may be provided to the user via the user interface.
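The PVQ computation and thresholding might be sketched as follows, assuming PVQ is taken as the standard deviation of a cluster's pitch values divided by their mean, the average is weighted by cluster size, and the threshold value is a hypothetical choice.

```python
import statistics

def pvq(pitches):
    """Pitch Variation Quotient of one cluster: std-dev divided by mean."""
    return statistics.pstdev(pitches) / statistics.mean(pitches)

def is_monotone(clusters, threshold=0.15):
    """Classify audio as monotone when the size-weighted average PVQ
    across pitch clusters falls below a threshold (illustrative value)."""
    total = sum(len(c) for c in clusters)
    avg = sum(pvq(c) * len(c) for c in clusters) / total
    return avg < threshold
```

Nearly flat pitch clusters yield a tiny PVQ and a monotone classification; widely varying pitch does not.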

ENTITY RECOGNITION FROM AN IMAGE
20210374386 · 2021-12-02

Aspects of the current disclosure include systems and methods for identifying an entity in a query image by comparing the query image with digital images in a database. In one or more embodiments, a query feature may be extracted from the query image and a set of candidate features may be extracted from a set of images in the database. In one or more embodiments, the distances between the query feature and the candidate features are calculated. A feature, which includes a set of shortest distances among the calculated distances and a distribution of the set of shortest distances, may be generated. In one or more embodiments, the feature is input to a trained model to determine whether the entity in the query image is the same entity associated with one of the set of shortest distances.
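The shortest-distance feature and match decision might be sketched as below; the simple separation rule stands in for the trained model, and the `k` and `margin` values are illustrative assumptions.

```python
import math

def shortest_distance_feature(query, candidates, k=3):
    """Feature built from the k shortest query-to-candidate distances
    plus their spread (a crude stand-in for the distance distribution)."""
    dists = sorted(math.dist(query, c) for c in candidates)
    top = dists[:k]
    spread = max(top) - min(top)
    return top + [spread]

def same_entity(feature, margin=1.0):
    """Stand-in for the trained model: declare a match when the best
    distance is clearly separated from the runner-up (hypothetical rule)."""
    best, second = feature[0], feature[1]
    return (second - best) >= margin
```

A query feature lying very close to one candidate and far from the rest produces a decisive gap and a positive identification; near-ties produce a negative one.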