INDOOR NAVIGATION METHOD, INDOOR NAVIGATION EQUIPMENT, AND STORAGE MEDIUM
An indoor navigation method is provided, including: receiving an instruction for navigation, and collecting an environment image; extracting an instruction room feature and an instruction object feature carried in the instruction, and determining a visual room feature, a visual object feature, and a view angle feature based on the environment image; fusing the instruction object feature and the visual object feature with a first knowledge graph representing an indoor object association relationship to obtain an object feature, and determining a room feature based on the visual room feature and the instruction room feature; and determining a navigation decision based on the view angle feature, the room feature, and the object feature.
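As a rough illustration of the fusion step described above, the sketch below propagates combined object features one hop over a small association graph and then scores candidate view angles. All function names, dimensions, and the weighted-dot-product decision rule are illustrative assumptions, not the patent's actual networks.

```python
import numpy as np

def fuse_object_features(instr_obj, vis_obj, adjacency):
    """Average instruction and visual object features, then smooth them
    one hop over the object-association graph (row-normalized adjacency)."""
    combined = (instr_obj + vis_obj) / 2.0               # (n_objects, d)
    norm = adjacency / adjacency.sum(axis=1, keepdims=True)
    return norm @ combined                               # graph-smoothed features

def navigation_scores(view_feat, room_feat, obj_feat, weights):
    """Score candidate view angles against a context vector built from
    the room feature and the pooled object features (a stand-in for the
    learned decision network)."""
    context = np.concatenate([room_feat, obj_feat.mean(axis=0)])
    return view_feat @ weights @ context                 # (n_views,)
```

The navigation decision would then be the argmax over the returned view-angle scores.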
DYNAMIC ARTIFICIAL INTELLIGENCE CAMERA MODEL UPDATE
A system may be configured to dynamically update deployed machine learning models. In some aspects, the system may receive sampled video information, generate first object detection information based on a cloud model and the sampled video information, and generate second object detection information based on a first edge model and the sampled video information. Further, the system may select, based on the first object detection information and the second object detection information, a plurality of training images from the sampled video information, detect motion information corresponding to motion of one or more detected objects within the plurality of training images, generate a plurality of annotated images based at least in part on the first object detection information and the motion information, and generate a second edge model based upon training the first edge model using the plurality of annotated images.
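The image-selection step above (keeping frames where the cloud and edge models disagree) can be sketched as follows; the IoU-based disagreement criterion and the corner-format boxes are illustrative assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def select_training_frames(cloud_dets, edge_dets, iou_thresh=0.5):
    """Pick indices of frames where the cloud model detected objects that
    no edge-model detection matches above the IoU threshold."""
    selected = []
    for i, (cloud, edge) in enumerate(zip(cloud_dets, edge_dets)):
        missed = [b for b in cloud
                  if not any(iou(b, e) >= iou_thresh for e in edge)]
        if missed:
            selected.append(i)
    return selected
```

The selected frames would then be annotated (here, using the cloud detections plus motion cues) and used to fine-tune the edge model.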
SYSTEMS, METHODS, AND APPARATUSES FOR IMPLEMENTING ANNOTATION-EFFICIENT DEEP LEARNING MODELS UTILIZING SPARSELY-ANNOTATED OR ANNOTATION-FREE TRAINING
Described herein are means for implementing annotation-efficient deep learning models utilizing sparsely-annotated or annotation-free training, in which trained models are then utilized for the processing of medical imaging. An exemplary system includes at least a processor and a memory to execute instructions for learning anatomical embeddings by forcing embeddings learned from multiple modalities; initiating a training sequence of an AI model by learning dense anatomical embeddings from unlabeled data, then deriving application-specific models to diagnose diseases with a small number of examples; executing collaborative learning to generate pretrained multimodal models; training the AI model using zero-shot or few-shot learning; embedding physiological and anatomical knowledge; embedding known physical principles to refine the AI model; and outputting a trained AI model for use in diagnosing diseases and abnormal conditions in medical imaging. Other related embodiments are disclosed.
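One common way to derive an application-specific model from pretrained embeddings using only a handful of labelled examples is nearest-centroid classification; the sketch below illustrates that general few-shot idea, not the patent's specific training sequence.

```python
import numpy as np

def nearest_centroid_fewshot(support_embs, support_labels, query_embs):
    """Classify query embeddings by nearest class centroid computed from
    a few labelled support embeddings (assumed to come from a frozen,
    pretrained anatomical-embedding model)."""
    classes = sorted(set(support_labels))
    centroids = np.stack([
        np.mean([e for e, l in zip(support_embs, support_labels) if l == c],
                axis=0)
        for c in classes])
    # Euclidean distance from each query to each centroid
    dists = np.linalg.norm(query_embs[:, None, :] - centroids[None, :, :],
                           axis=-1)
    return [classes[i] for i in dists.argmin(axis=1)]
```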
Image processing system
An image processing system comprises a template matching engine (TME). The TME reads an image from the memory; and as each pixel of the image is being read, calculates a respective feature value of a plurality of feature maps as a function of the pixel value. A pre-filter is responsive to a current pixel location comprising a node within a limited detector cascade to be applied to a window within the image to: compare a feature value from a selected one of the plurality of feature maps corresponding to the pixel location to a threshold value; and responsive to pixels for all nodes within a limited detector cascade to be applied to the window having been read, determine a score for the window. A classifier, responsive to the pre-filter indicating that a score for a window is below a window threshold, does not apply a longer detector cascade to the window before indicating that the window does not comprise an object to be detected.
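The pre-filter's early-rejection logic can be sketched as below; the node layout (feature-map index, pixel coordinates, threshold, weight) is an assumed representation of the limited detector cascade.

```python
def prefilter_score(feature_maps, nodes):
    """Accumulate a score over a limited cascade of
    (map_index, x, y, threshold, weight) nodes: each node contributes
    its weight when the selected feature value exceeds its threshold."""
    score = 0.0
    for map_idx, x, y, thresh, weight in nodes:
        if feature_maps[map_idx][y][x] > thresh:
            score += weight
    return score

def should_run_full_cascade(feature_maps, nodes, window_threshold):
    """Only windows scoring at or above the window threshold proceed to
    the longer detector cascade; the rest are rejected immediately."""
    return prefilter_score(feature_maps, nodes) >= window_threshold
```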
ACTION RECOGNITION LEARNING DEVICE, ACTION RECOGNITION LEARNING METHOD, ACTION RECOGNITION DEVICE, AND PROGRAM
The present invention enables training of an action recognizer that recognizes actions with high accuracy from a small quantity of learning data. An input unit 101 receives input of a learning video and an action label indicating an action of an object; a detection unit 102 detects a plurality of objects included in each frame image of the learning video; a direction calculation unit 103 calculates a direction of a reference object, which is an object to be used as a reference among the plurality of detected objects; a normalization unit 104 normalizes the learning video so that a positional relationship between the reference object and another object becomes a predetermined relationship; and an optimization unit 106 optimizes parameters of the action recognizer based on the action estimated by inputting the normalized learning video to the action recognizer and the action indicated by the action label.
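The normalization step can be illustrated with 2-D object positions: translate so the reference object sits at the origin, then rotate so its direction lies along the +x axis. This concrete reading of "predetermined relationship" is an assumption for illustration.

```python
import math

def normalize_points(points, ref_pos, ref_angle):
    """Translate all object positions so the reference object is at the
    origin, then rotate by -ref_angle so the reference direction aligns
    with the +x axis."""
    c, s = math.cos(-ref_angle), math.sin(-ref_angle)
    out = []
    for x, y in points:
        tx, ty = x - ref_pos[0], y - ref_pos[1]
        out.append((tx * c - ty * s, tx * s + ty * c))
    return out
```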
INFORMATION PROCESSING DEVICE, LEARNING METHOD, AND RECORDING MEDIUM
The information processing device performs distillation learning of a student model using unknown data which a teacher model has not learned. The label distribution determination unit outputs an arbitrary label for the unknown data. The data generation unit outputs new generated data using the arbitrary label and the unknown data as inputs. The distillation learning unit performs distillation learning of the student model using the teacher model, with the generated data as input.
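Distillation itself is typically the cross-entropy of the student's predictions against the teacher's soft labels; the sketch below shows that standard objective, abstracting away the patent's label-conditioned data generator.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits):
    """Cross-entropy of the student's distribution against the teacher's
    soft labels, computed on (here, generated) input data."""
    t = softmax(teacher_logits)
    s = softmax(student_logits)
    return float(-np.sum(t * np.log(s + 1e-12)))
```

Minimizing this over the generated data pulls the student's outputs toward the teacher's without needing the teacher's original training set.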
IMAGE RECOGNITION SUPPORT APPARATUS, IMAGE RECOGNITION SUPPORT METHOD, AND IMAGE RECOGNITION SUPPORT PROGRAM
The invention supports creation of models for recognizing attributes in an image with high accuracy. An image recognition support apparatus includes an image input unit configured to acquire an image, a pseudo label generation unit configured to recognize the acquired image based on a plurality of types of image recognition models and output recognition information, and generate pseudo labels indicating attributes of the acquired image based on the output recognition information, and a new label generation unit configured to generate new labels based on the generated pseudo labels.
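Pseudo-label generation from several recognition models is often a consensus vote; the sketch below uses a simple majority with an assumed agreement threshold, as one plausible realization of combining the models' recognition information.

```python
from collections import Counter

def pseudo_label(model_outputs, min_agreement=2):
    """Return the majority label across the recognition models, but only
    when at least min_agreement models agree; otherwise return None so
    the image is left unlabeled."""
    label, count = Counter(model_outputs).most_common(1)[0]
    return label if count >= min_agreement else None
```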
METHOD AND APPARATUS FOR TRANSFER LEARNING
A method for transfer learning includes: obtaining a pre-trained model, and generating a model to be transferred based on the pre-trained model, in which the model to be transferred includes N Transformer layers, and N is a positive integer; obtaining a mini-batch by performing random sampling on a target training set; and training the model to be transferred based on the mini-batch, in which a loss value for each Transformer layer is generated based on an empirical loss value and a noise stability loss value.
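A "noise stability" term is commonly measured as the change in a layer's output under small input perturbations; the sketch below shows one such formulation. The exact per-Transformer-layer form, and its combination with the empirical loss, are assumptions.

```python
import numpy as np

def noise_stability_loss(layer_fn, x, sigma=0.01, n_samples=4, seed=0):
    """Mean squared change in a layer's output when Gaussian noise of
    scale sigma is added to its input, averaged over n_samples draws."""
    rng = np.random.default_rng(seed)
    base = layer_fn(x)
    total = 0.0
    for _ in range(n_samples):
        noisy = layer_fn(x + sigma * rng.standard_normal(x.shape))
        total += float(np.mean((noisy - base) ** 2))
    return total / n_samples
```

Per the abstract, each layer's total loss would be this stability term plus the empirical (task) loss.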
TRAINING A NEURAL NETWORK FOR ACTION RECOGNITION
A system for training a neural network for action recognition based on unlabeled action sequences includes a first neural network (NN1) and a second neural network (NN2). A first updating module is arranged to update parameters of NN1 to minimize a difference between representation data generated by NN1 and representation data generated by NN2. A second updating module is arranged to update parameters of NN2 as a function of the parameters of NN1. An augmentation module includes first and second sub-modules and is configured to include augmented versions of incoming action sequences in first and second input data. The first and second sub-modules are configured to apply at least partly different augmentation to the incoming action sequences. After NN1 and NN2 have been operated on one or more instances of the first and second input data, NN1 comprises a parameter definition of a pre-trained neural network.
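Updating NN2 "as a function of the parameters of NN1" is, in BYOL-style self-supervised trainers, an exponential moving average; the sketch below assumes that form, with an illustrative momentum value.

```python
def ema_update(nn2_params, nn1_params, tau=0.99):
    """Update NN2's parameters as an exponential moving average of NN1's:
    new_p2 = tau * p2 + (1 - tau) * p1. Larger tau means NN2 changes
    more slowly, stabilizing the target representations."""
    return [tau * p2 + (1 - tau) * p1
            for p1, p2 in zip(nn1_params, nn2_params)]
```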
METHODS AND SYSTEMS FOR LOW LIGHT MEDIA ENHANCEMENT
A method for enhancing media includes: receiving, by an electronic device, a media stream; performing, by the electronic device, an alignment of a plurality of frames of the media stream; correcting, by the electronic device, a brightness of the plurality of frames; selecting, by the electronic device, one of a first neural network, a second neural network, or a third neural network, by analyzing parameters of the plurality of frames having the corrected brightness, wherein the parameters include at least one of shot boundary detection and artificial light flickering; and generating, by the electronic device, an output media stream by processing the plurality of frames of the media stream using the selected one of the first neural network, the second neural network, or the third neural network.
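The network-selection step can be sketched as a simple routing rule on the analysed parameters; the particular mapping below is an illustrative assumption, not the patent's actual rule.

```python
def select_network(has_shot_boundary, has_light_flicker):
    """Route the brightness-corrected frames to one of three enhancement
    networks based on the analysed parameters (shot boundary detection
    and artificial light flickering)."""
    if has_shot_boundary:
        return "network_1"
    if has_light_flicker:
        return "network_2"
    return "network_3"
```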