G06V10/464

VIRTUAL USER INPUT CONTROLS IN A MIXED REALITY ENVIRONMENT
20230315250 · 2023-10-05 ·

A wearable display system includes a mixed reality display for presenting a virtual image to a user, an outward-facing imaging system configured to image an environment of the user, and a hardware processor operably coupled to the mixed reality display and to the imaging system. The hardware processor is programmed to generate a virtual remote that is associated with a parent device and has a virtual control element, render the virtual remote and the virtual control element on the mixed reality display, determine when the user of the wearable system interacts with the virtual control element of the virtual remote, and perform certain functions in response to user interaction with the virtual control element. These functions may include causing the virtual control element to move on the mixed reality display and, when movement of the virtual control element surpasses a threshold condition, generating a focus indicator for the virtual control element.
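The threshold test described above can be sketched minimally: track the control element's displacement from where the interaction started, and trigger the focus indicator once it surpasses a threshold. The function name, coordinate convention, and threshold value are illustrative assumptions, not the patent's actual implementation.

```python
import math

def focus_indicator_needed(start, current, threshold=0.05):
    """Return True when a virtual control element's movement surpasses
    the threshold condition, so a focus indicator should be generated.

    start, current: (x, y, z) positions of the virtual control element
    in arbitrary display units; threshold: displacement that triggers
    the focus indicator (an assumed parameter).
    """
    displacement = math.dist(start, current)
    return displacement > threshold
```

A small drag below the threshold leaves the indicator off; a larger one turns it on.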

EXPLAINABLE ARTIFICIAL INTELLIGENCE (AI) BASED IMAGE ANALYTIC, AUTOMATIC DAMAGE DETECTION AND ESTIMATION SYSTEM

An Artificial Intelligence (AI) based automatic damage detection and estimation system receives images of a damaged object. The images are converted into monochrome versions if needed and analyzed by an ensemble machine learning (ML) cause prediction model that includes a plurality of sub-models, each trained to identify a cause of damage to a corresponding portion of the damaged object from a plurality of causes. In addition, an explanation for the selection of the cause from the plurality of causes is also provided. The explanation includes the image portions and pixels that enabled the cause prediction model to select the cause of damage. An ML parts identification model is also employed to identify and label parts of the damaged object that are repairable and parts that are damaged and need replacement. A cost estimate for the repair and restoration of the damaged object can also be generated.
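The ensemble step can be sketched as follows: each sub-model scores the candidate causes for its portion of the object, and the ensemble aggregates those scores to select one cause. The cause names, the score dictionaries, and the sum-based aggregation are all illustrative assumptions.

```python
# Example cause vocabulary (illustrative, not from the patent).
CAUSES = ["collision", "hail", "fire", "flood"]

def ensemble_predict(sub_model_scores):
    """Aggregate per-portion sub-model scores into one predicted cause.

    sub_model_scores: list of {cause: score} dicts, one dict per
    sub-model (i.e., per portion of the damaged object).
    Returns the cause with the highest summed score across sub-models.
    """
    totals = {cause: 0.0 for cause in CAUSES}
    for scores in sub_model_scores:
        for cause, score in scores.items():
            totals[cause] += score
    return max(totals, key=totals.get)
```

In the described system, the explanation component would additionally report which image regions drove each sub-model's score.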

Mitigating people distractors in images

Systems, methods, and software are described herein for removing people distractors from images. A distractor mitigation solution implemented in one or more computing devices detects people in an image and identifies salient regions in the image. The solution then determines a saliency cue for each person and classifies each person as wanted or as an unwanted distractor based at least on the saliency cue. An unwanted person is then removed from the image or otherwise de-emphasized to reduce the distraction.
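One plausible form of the saliency cue is the mean saliency over a person's detected region, with low-saliency people flagged as distractors. This is a hedged sketch: the bounding-box cue, the 2D-list saliency map, and the threshold are assumptions, not the described system's actual classifier.

```python
def saliency_cue(person_box, saliency_map):
    """Mean saliency over a person's bounding box.

    person_box: (x0, y0, x1, y1) pixel bounds; saliency_map: 2D list
    of per-pixel saliency values in [0, 1].
    """
    x0, y0, x1, y1 = person_box
    values = [saliency_map[y][x]
              for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values)

def classify_people(boxes, saliency_map, threshold=0.5):
    """Label each detected person 'wanted' or 'distractor' by cue."""
    return ["wanted" if saliency_cue(b, saliency_map) >= threshold
            else "distractor" for b in boxes]
```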

SEMI-SUPERVISED KEYPOINT BASED MODELS
20230281966 · 2023-09-07 ·

A method for training a neural network to predict keypoints of unseen objects using a training data set including labeled and unlabeled training data is described. The method comprises: receiving the training data set comprising a plurality of training samples, each training sample comprising a set of synchronized images of one or more objects from a respective scene, wherein each image in the set is synchronously taken by a respective camera from a different point of view, and wherein a subset of the set of synchronized images is labeled with ground-truth keypoints and the remaining images in the set are unlabeled; and for each of one or more training samples of the plurality of training samples: training the neural network on the training sample by updating current values of parameters of the neural network to minimize a loss function which is a combination of a supervised loss function and an unsupervised loss function.
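The combined objective can be sketched as a weighted sum: a supervised keypoint loss on the labeled views plus an unsupervised consistency loss across the unlabeled views. The MSE supervised term, variance-based consistency term, and weighting factor are illustrative assumptions about how such a combination might look, not the patent's exact losses.

```python
def supervised_loss(pred, gt):
    """Mean squared error between predicted and ground-truth keypoints."""
    return sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt)

def unsupervised_loss(preds_per_view):
    """Variance of keypoint predictions across synchronized views:
    a multi-view consistency penalty needing no labels."""
    n = len(preds_per_view)
    dim = len(preds_per_view[0])
    mean = [sum(view[i] for view in preds_per_view) / n for i in range(dim)]
    return sum((view[i] - mean[i]) ** 2
               for view in preds_per_view for i in range(dim)) / n

def total_loss(pred_labeled, gt, preds_unlabeled, weight=0.1):
    """Combined loss minimized per training sample (weight is assumed)."""
    return (supervised_loss(pred_labeled, gt)
            + weight * unsupervised_loss(preds_unlabeled))
```

When predictions match the labels and agree across views, the combined loss is zero.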

Content extraction based on graph modeling
11657629 · 2023-05-23 ·

Methods and systems are presented for extracting categorizable information from an image using a graph that models data within the image. Upon receiving an image, a data extraction system identifies characters in the image. The data extraction system then generates bounding boxes that enclose adjacent characters that are related to each other in the image. The data extraction system also creates connections between the bounding boxes based on locations of the bounding boxes. A graph is generated based on the bounding boxes and the connections such that the graph can accurately represent the data in the image. The graph is provided to a graph neural network that is configured to analyze the graph and produce an output. The data extraction system may categorize the data in the image based on the output.
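The graph construction step can be sketched by treating each bounding box as a node and connecting boxes whose centers fall within a distance cutoff, as a stand-in for the location-based connections described above. The center-distance rule and cutoff value are assumptions.

```python
import math

def box_center(box):
    """Center point of a bounding box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)

def build_graph(boxes, cutoff=50.0):
    """Model the image's data as a graph for a graph neural network.

    Returns (nodes, edges): nodes are box indices; edges are (i, j)
    pairs for boxes whose centers lie closer than the cutoff.
    """
    nodes = list(range(len(boxes)))
    edges = []
    for i in nodes:
        for j in nodes[i + 1:]:
            if math.dist(box_center(boxes[i]), box_center(boxes[j])) < cutoff:
                edges.append((i, j))
    return nodes, edges
```

The resulting node/edge structure is what a graph neural network would consume to categorize the extracted data.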

Automated categorization and assembly of low-quality images into electronic documents

An apparatus includes a memory and a processor. The memory stores document categories, text generated from an image of a physical document page, and a machine learning algorithm. The text includes errors associated with noise in the image. The machine learning algorithm is configured to extract, from the text, a first plurality of features associated with natural language processing and a second plurality of features associated with the errors. The machine learning algorithm is also configured to generate a feature vector that includes the first and second pluralities of features, and to generate, based on the feature vector, a set of probabilities, each of which is associated with a document category and indicates a probability that the physical document from which the text was generated belongs to that document category. The processor applies the machine learning algorithm to the text to generate the set of probabilities, identifies the largest probability, and assigns the image to the associated document category.
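The final assignment step can be sketched as a softmax over per-category scores followed by an argmax. The category names and the softmax normalization are illustrative assumptions; the patent only specifies that the largest probability wins.

```python
import math

# Example document categories (illustrative).
CATEGORIES = ["invoice", "contract", "tax_form"]

def assign_category(scores):
    """Turn raw per-category scores into probabilities and assign the
    image to the category with the largest probability.

    scores: one raw score per entry in CATEGORIES.
    Returns (category, probability) for the argmax after softmax.
    """
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs[best]
```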

ANONYMOUS FINGERPRINTING OF MEDICAL IMAGES
20230368386 · 2023-11-16 ·

Disclosed herein is a medical system comprising a memory storing machine executable instructions and at least one trained neural network. Each of the at least one trained neural network is configured to receive a medical image as input and has been modified to provide hidden layer output. Execution of the machine executable instructions causes the system to: receive the medical image; receive the hidden layer output in response to inputting the medical image into each of the at least one trained neural network; provide an anonymized image fingerprint comprising the hidden layer output from each of the at least one trained neural network; and receive an image assessment of the medical image in response to querying a historical image database using the anonymized image fingerprint.
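The fingerprinting idea can be sketched as follows: each network contributes its hidden-layer activations for the image, and their concatenation serves as the anonymized fingerprint used to query the historical database, so the image itself is never exposed. The tiny "hidden layer" below is a stand-in weighted sum, not a real trained network.

```python
def hidden_layer(image, weights):
    """Stand-in hidden layer: one weighted sum of the flattened image's
    pixel intensities per row of the weight matrix."""
    return [sum(w * p for w, p in zip(row, image)) for row in weights]

def fingerprint(image, networks):
    """Concatenate hidden-layer outputs from each trained network into
    a single anonymized image fingerprint.

    image: flattened pixel list; networks: one weight matrix per
    network (stand-ins for the modified trained networks).
    """
    fp = []
    for weights in networks:
        fp.extend(hidden_layer(image, weights))
    return fp
```

The database query would then compare fingerprints (e.g., by vector distance) rather than raw images.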

TRANSMISSION LINE DEFECT IDENTIFICATION METHOD BASED ON SALIENCY MAP AND SEMANTIC-EMBEDDED FEATURE PYRAMID
20230360390 · 2023-11-09 ·

The present disclosure provides a transmission line defect identification method based on a saliency map and a semantic-embedded feature pyramid, including the following steps: step 1: cleaning and classifying a dataset; step 2: generating a super-resolution image for a small target of a transmission line by using an Electric Line-Enhanced Super-Resolution Generative Adversarial Network (EL-ESRGAN) model; step 3: performing image saliency detection on the dataset by constructing a U²-Net; step 4: performing data augmentation on the dataset by using GridMask and random cutout algorithms based on a saliency map, and generating a classified dataset; and step 5: performing image classification on a normal set and a defect set by using a ResNet34 classification algorithm and a deep semantic embedding (DSE)-based feature pyramid classification network.
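Step 4 can be sketched as GridMask-style augmentation guided by the saliency map: grid cells are zeroed only when they contain no salient pixels, so defect regions survive the augmentation. The checkerboard cell pattern, cell size, and 2D-list image layout are assumptions for illustration.

```python
def saliency_gridmask(image, saliency, cell=2):
    """Saliency-aware GridMask-style augmentation (illustrative).

    image, saliency: equal-size 2D lists; every other grid cell in a
    checkerboard pattern is zeroed out unless the saliency map marks
    any pixel inside that cell as salient.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for cy in range(0, h, cell):
        for cx in range(0, w, cell):
            if ((cy // cell) + (cx // cell)) % 2:  # checkerboard cell
                salient = any(saliency[y][x] > 0
                              for y in range(cy, min(cy + cell, h))
                              for x in range(cx, min(cx + cell, w)))
                if not salient:  # only mask non-salient cells
                    for y in range(cy, min(cy + cell, h)):
                        for x in range(cx, min(cx + cell, w)):
                            out[y][x] = 0
    return out
```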

Kinect-based auxiliary training system for basic badminton movements
11823495 · 2023-11-21 ·

A Kinect-based auxiliary training system for basic badminton movements includes a data collection module, a movement feature extraction and recognition module, and a movement standard degree analysis and guidance module. The data collection module is provided with a Kinect v2 somatosensory device for monitoring athletes in real time and collecting 3D coordinate data of 25 joint points of the athlete's whole body. The movement feature extraction and recognition module is provided for establishing a standard template and obtaining a similarity between the collected movement data and the standard template. The movement standard degree analysis and guidance module is provided for determining a category of the current movement of the user to be tested according to the similarity, and for further analyzing whether the current movement meets a standard according to a threshold range of bone included angles set by a technique evaluation rule.
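The standard-degree check can be sketched numerically: compute the included angle at a joint (e.g., the elbow) from three 3D joint points, then test it against the threshold range from the evaluation rule. The 80°–120° range below is an example parameter, not the system's actual rule.

```python
import math

def included_angle(a, b, c):
    """Included angle at joint b (in degrees) formed by 3D points
    a-b-c, e.g., shoulder-elbow-wrist from Kinect joint data."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    return math.degrees(math.acos(dot / (n1 * n2)))

def meets_standard(a, b, c, low=80.0, high=120.0):
    """True when the joint's included angle falls inside the threshold
    range set by the evaluation rule (range values are assumed)."""
    return low <= included_angle(a, b, c) <= high
```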

SEGMENTATION IN MULTI-ENERGY CT DATA

Segmentation of multi-energy CT data, including data in three or more energy bands. A user is enabled to input one or more region indicators in displayed CT data. At least some data is labelled based on the region indicators. Feature vectors are created for at least some data elements, which are then classified based on the labelled data elements and feature vectors. Feature vectors may be constructed using a Bag of Features or similar process. Classification may be performed using a Support Vector Machine classifier or other machine learning classifier.
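The classification step can be sketched with a 1-nearest-neighbor rule over the labelled feature vectors, as a simple stand-in for the Support Vector Machine mentioned above; in the described system the feature vectors would come from a Bag of Features or similar process, and the labels from the user's region indicators.

```python
import math

def classify(feature, labelled):
    """Assign a data element's feature vector the label of the nearest
    labelled vector (a stand-in for an SVM or other ML classifier).

    feature: feature vector of the element to classify;
    labelled: list of (feature_vector, label) pairs derived from the
    user's region indicators.
    """
    best_label, best_dist = None, float("inf")
    for vec, label in labelled:
        d = math.dist(feature, vec)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label
```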