Patent classifications
G06V10/86
IMAGE PROCESSING APPARATUS, METHOD, AND PROGRAM
A processor extracts, from an image including a plurality of structures that are spatially continuous and whose corresponding labels have a hierarchy, respective key points of the plurality of structures in association with labels of a first layer; uses the key points as nodes to derive a graph structure in which the labels of the first layer are associated with the nodes; and associates the nodes with labels of a second layer, lower than the first layer, by analyzing the graph structure.
End-to-end signalized intersection transition state estimator with scene graphs over semantic keypoints
Systems, methods, computer-readable media, techniques, and methodologies are disclosed for performing end-to-end, learning-based keypoint detection and association. A scene graph of a signalized intersection is constructed from an input image of the intersection. The scene graph includes detected keypoints and linkages identified between the keypoints. The scene graph can be used, along with a vehicle's localization information, to identify which traffic-signal keypoint is associated with the vehicle's current travel lane. An appropriate vehicle action may then be determined based on the transition state of that traffic-signal keypoint and trajectory information for the vehicle. A control signal indicative of this vehicle action may then be output to cause an autonomous vehicle, for example, to implement the appropriate vehicle action.
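A toy version of the lane-to-signal lookup and action selection might look as follows. The dictionary scene graph, the state names, and the action policy table are all assumptions; the patent's learned detection and association are replaced by hard-coded data.

```python
# Hypothetical sketch: find the traffic-signal keypoint linked to the
# vehicle's current lane in the scene graph, then map that signal's
# transition state to a vehicle action via an assumed policy table.

scene_graph = {
    "keypoints": {
        "signal_A": {"type": "traffic_signal", "state": "green_to_yellow"},
        "signal_B": {"type": "traffic_signal", "state": "red"},
        "lane_1": {"type": "lane"},
        "lane_2": {"type": "lane"},
    },
    "links": [("signal_A", "lane_1"), ("signal_B", "lane_2")],
}

ACTION_BY_STATE = {            # assumed policy, not from the patent
    "green": "proceed",
    "green_to_yellow": "prepare_to_stop",
    "red": "stop",
}

def signal_for_lane(graph, lane):
    """Follow linkages to the traffic-signal keypoint tied to this lane."""
    for a, b in graph["links"]:
        if b == lane and graph["keypoints"][a]["type"] == "traffic_signal":
            return a
    return None

def choose_action(graph, current_lane):
    sig = signal_for_lane(graph, current_lane)
    state = graph["keypoints"][sig]["state"]
    return ACTION_BY_STATE[state]

print(choose_action(scene_graph, "lane_1"))   # prepare_to_stop
```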
RELATIONSHIP MODELING AND KEY FEATURE DETECTION BASED ON VIDEO DATA
A method includes acquiring digital video data that portrays an interacting event, extracting image data, audio data, and semantic text data from the video data, analyzing the extracted data to identify a plurality of video features, and analyzing the plurality of video features to create a relationship graph. The interacting event comprises a plurality of interactions between a plurality of individuals, and the relationship graph comprises a plurality of nodes and a plurality of edges. Each node of the plurality of nodes represents an individual of the plurality of individuals, each edge of the plurality of edges extends between two nodes of the plurality of nodes, and the plurality of edges represents the plurality of interactions. The method further comprises determining whether a first key feature is present in the relationship graph, wherein presence of the first key feature is predictive of a positive outcome of the interacting event.
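The graph construction and key-feature test can be sketched minimally. The choice of "mutual interaction between two individuals" as the key feature is an assumption for illustration; the patent leaves the key feature abstract.

```python
# Hypothetical sketch: build a relationship graph from detected
# interactions (individuals as nodes, interactions as directed edges)
# and test for an assumed "key feature": a mutual interaction.

import itertools

interactions = [               # (initiator, recipient) pairs from video features
    ("alice", "bob"),
    ("bob", "alice"),
    ("carol", "bob"),
]

def build_relationship_graph(interactions):
    """Nodes are individuals; edges are the interactions between them."""
    nodes = set(itertools.chain.from_iterable(interactions))
    edges = set(interactions)
    return nodes, edges

def has_mutual_edge(edges, a, b):
    """Key feature (assumed): both directions of interaction are present."""
    return (a, b) in edges and (b, a) in edges

nodes, edges = build_relationship_graph(interactions)
print(has_mutual_edge(edges, "alice", "bob"))   # True
```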
AUTO-ENCODING USING NEURAL NETWORK ARCHITECTURES BASED ON SYNAPTIC CONNECTIVITY GRAPHS
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting a neural network architecture for performing a prediction task for data elements of a specified data type. In one aspect, a method comprises: obtaining data defining a synaptic connectivity graph representing synaptic connectivity between neurons in a brain of a biological organism; generating a plurality of candidate graphs based on the synaptic connectivity graph; for each candidate graph of the plurality of candidate graphs: determining an auto-encoding neural network architecture based on the candidate graph; training an auto-encoding neural network having the auto-encoding neural network architecture to perform an auto-encoding task for data elements of the specified data type; and determining a performance measure characterizing a performance of the auto-encoding neural network in performing the auto-encoding task; and selecting the neural network architecture based on the performance measures.
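The candidate-generation and selection loop can be sketched as below. The edge-dropping perturbation and the scoring stub are assumptions; actually instantiating and training an auto-encoder from each candidate graph is the part this sketch elides.

```python
# Hypothetical sketch of the selection loop: perturb a base synaptic
# connectivity graph into candidate sub-graphs, score an auto-encoder
# built from each (stubbed here), and keep the best-scoring candidate.

import random

def candidate_graphs(base_edges, n, seed=0):
    """Yield candidates by randomly dropping edges from the base graph."""
    rng = random.Random(seed)  # seeded for reproducibility
    for _ in range(n):
        yield [e for e in base_edges if rng.random() > 0.2]

def score_autoencoder(edges):
    # Stand-in for: build an architecture mirroring the graph, train it
    # on the auto-encoding task, and measure reconstruction performance.
    # A toy proxy (edge count) keeps the sketch runnable.
    return len(edges)

def select_architecture(base_edges, n_candidates=5):
    """Pick the candidate graph whose auto-encoder scores highest."""
    return max(candidate_graphs(base_edges, n_candidates),
               key=score_autoencoder)

base = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")]
best = select_architecture(base)
print(f"kept {len(best)} of {len(base)} edges")
```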
COMPUTER VISION FRAMEWORK FOR REAL ESTATE
Image processing apparatuses and systems implementing deep learning architectures that can learn high-quality representations of images (e.g., of real estate images of properties) are described. The described techniques may be implemented to generate high-quality image representations that may be used for various downstream applications, including improved image captioning, image labeling, and image search applications. For instance, image representations generated according to one or more aspects of the described techniques may be used for image (e.g., real estate/property) classification, automatic property listing generation based on one or more images, property or listing recommendations based on searched images, etc. Moreover, the image processing systems described herein may be interpretable, which may be useful for designing or improving applications such as real estate appraisal, real estate interior design, real estate renovation, and real estate insurance, among other examples.
Dynamic image re-timing
Techniques for the modification of at least part of a target image (e.g., scene objects within the target image), e.g., to make the target image appear that it was captured at a different time (e.g., a different time of day, different time of year) are disclosed. This “dynamic re-timing” of the target image may be achieved by finding one or more source images including the same (or similar) scene depicted in the target image (but, e.g., captured at different times), extracting stylistic elements from the source image(s), and then modifying at least part of the target image in a realistic fashion (e.g., not altering the geometry of objects in the target image), based on one or more extracted stylistic elements from the source image(s). Three-dimensional modeling of scene objects may allow a more realistic-looking transfer of the extracted stylistic elements onto scene objects in the target image to be achieved.
AUTOMATIC DELINEATION AND EXTRACTION OF TABULAR DATA IN PORTABLE DOCUMENT FORMAT USING GRAPH NEURAL NETWORKS
Aspects of the present invention disclose a method for automatic delineation and extraction of tabular data in portable document format (PDF). The method includes one or more processors extracting metadata corresponding to tabular data in a text-based portable document format (PDF), wherein the metadata is associated with characters and border lines of the tabular data. The method further includes generating a graph structure corresponding to the tabular data in the text-based PDF based at least in part on the metadata. The method further includes generating a vector representation of the graph structure. The method further includes constructing a tree structure corresponding to the tabular data based at least in part on the vector representation.
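The pipeline's graph and tree stages can be sketched as follows. The (row, col) cell metadata format is an assumption, and the GNN vector-representation step is elided; only the cell-adjacency graph and the resulting row tree are shown.

```python
# Hypothetical sketch of the claimed pipeline: cell metadata recovered
# from a text-based PDF -> adjacency graph of cells -> row/column tree.
# (The graph-neural-network embedding step between graph and tree is
# stubbed out; this shows the constructions on either side of it.)

def build_cell_graph(cells):
    """Nodes are (row, col) cells; edges link horizontal/vertical neighbors."""
    nodes = list(cells)
    edges = [((r, c), (r, c + 1)) for r, c in nodes if (r, c + 1) in cells]
    edges += [((r, c), (r + 1, c)) for r, c in nodes if (r + 1, c) in cells]
    return nodes, edges

def graph_to_tree(cells):
    """Tree structure: table -> rows -> ordered cell texts."""
    tree = {}
    for (r, c), text in sorted(cells.items()):
        tree.setdefault(r, []).append(text)
    return tree

cells = {(0, 0): "Name", (0, 1): "Qty", (1, 0): "Bolt", (1, 1): "4"}
nodes, edges = build_cell_graph(cells)
print(graph_to_tree(cells))   # {0: ['Name', 'Qty'], 1: ['Bolt', '4']}
```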
DIGITAL IMAGE ANNOTATION AND RETRIEVAL SYSTEMS AND METHODS
In a digital image annotation and retrieval system, a machine learning model identifies an image feature in an image and generates a plurality of question prompts for the feature. For a particular feature, a feature annotation is generated, which can include capturing a narrative, determining a plurality of narrative units, and mapping a particular narrative unit to the identified image feature. An enriched image is generated using the generated feature annotation. The enriched image includes searchable metadata comprising the feature annotation and the plurality of question prompts.
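The enrichment step can be sketched as below. The prompt templates and the narrative-unit format are assumptions; the patent's machine learning model for feature identification and prompt generation is replaced by stubs.

```python
# Hypothetical sketch: map a captured narrative unit to an identified
# image feature and bundle it, with generated question prompts, into
# searchable metadata attached to the image.

def generate_prompts(feature):
    # Assumed templates standing in for model-generated question prompts.
    return [f"What is notable about the {feature}?",
            f"When was the {feature} added?"]

def enrich_image(image_id, feature, narrative_units):
    """Build an enriched image record with searchable metadata."""
    # Pick the narrative unit that maps onto the identified feature.
    annotation = next((u["text"] for u in narrative_units
                       if u["feature"] == feature), "")
    return {
        "image": image_id,
        "metadata": {
            "feature_annotation": annotation,
            "question_prompts": generate_prompts(feature),
        },
    }

units = [{"feature": "fireplace", "text": "Grandpa built it in 1962."}]
enriched = enrich_image("img_001", "fireplace", units)
print(enriched["metadata"]["feature_annotation"])
```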