Patent classifications
G06V10/803
SYSTEMS AND METHODS FOR UNIFIED VISION-LANGUAGE UNDERSTANDING AND GENERATION
Embodiments described herein provide bootstrapping language-image pre-training for unified vision-language understanding and generation (BLIP), a unified VLP framework which transfers flexibly to both vision-language understanding and generation tasks. BLIP enables a wider range of downstream tasks, improving on both shortcomings of existing models.
Method of predicting road attributes, data processing system and computer executable code
A method of predicting one or more road segment attributes corresponding to a road segment in a geographical area, the method including: providing trajectory data and a satellite image of the geographical area; calculating one or more image channels based on the trajectory data; and, using at least one processor, classifying the road segment based on the one or more image channels and the satellite image using a trained classifier into prediction probabilities of the road attributes. A data processing system including one or more processors configured to carry out the method of predicting road attributes. A computer executable code including instructions for predicting one or more road segment attributes according to the method.
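As an illustration of the "calculating one or more image channels based on the trajectory data" step in the abstract above, the following is a minimal sketch that rasterizes trajectory points into a point-count grid aligned with a satellite tile. The abstract does not say which channels are computed or how; the function name `rasterize_trajectories`, the bounding-box convention, and the choice of a simple count channel are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def rasterize_trajectories(
    points: Sequence[Tuple[float, float]],
    bounds: Tuple[float, float, float, float],
    size: Tuple[int, int],
) -> List[List[int]]:
    """Count trajectory points per grid cell (hypothetical density channel).

    points: (lat, lon) samples from the trajectory data.
    bounds: (min_lat, min_lon, max_lat, max_lon) of the satellite tile.
    size:   (rows, cols) of the output grid; row 0 is the northern edge.
    """
    min_lat, min_lon, max_lat, max_lon = bounds
    rows, cols = size
    grid = [[0] * cols for _ in range(rows)]
    for lat, lon in points:
        # Ignore samples that fall outside the tile.
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            continue
        # Map the coordinate to a cell; clamp the far edge into the last cell.
        r = min(int((max_lat - lat) / (max_lat - min_lat) * rows), rows - 1)
        c = min(int((lon - min_lon) / (max_lon - min_lon) * cols), cols - 1)
        grid[r][c] += 1
    return grid


# Three samples inside a unit tile and one outside it.
samples = [(0.9, 0.1), (0.9, 0.15), (0.1, 0.9), (2.0, 2.0)]
channel = rasterize_trajectories(samples, (0.0, 0.0, 1.0, 1.0), (2, 2))
print(channel)
```

A grid like this could be stacked with the satellite image's color channels before being passed to a classifier; other channels (for example, mean speed or heading per cell) could be built the same way under the same assumptions.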
TASK PERFORMANCE ADJUSTMENT BASED ON VIDEO ANALYSIS
A method of adjusting task performance includes extracting image data and audio data from video data and extracting semantic text data from the audio data. The method further includes analyzing at least one of the image data, the audio data, and the semantic text data to identify a first set of features, generating an adjustment recommendation based on the first set of features and a relational feature model, and outputting the adjustment recommendation. The video data portrays a first individual performing a first iteration of a task. The at least one of the image data, audio data, and semantic text data is analyzed by a first computer-implemented machine learning model. The adjustment recommendation is generated by a second computer-implemented machine learning model and comprises instructions that can be performed by the first individual to adjust task performance. The relational feature model relates features and task performance.
ADJUSTING MENTAL STATE TO IMPROVE TASK PERFORMANCE
A method of adjusting mental state includes acquiring video data of an individual, extracting image data and audio data from the video data, extracting semantic text data from the audio data, identifying a first set of features, predicting a baseline mental state, identifying a target mental state, and simulating a predicted path from the baseline mental state to the target mental state. The baseline mental state is predicted based on the first set of features. The predicted path is simulated using a multidimensional mental state model, a plurality of actions, and a first computer-implemented machine learning model. The predicted path comprises one or more actions of the plurality of actions and corresponding changes to at least one of first and second dimensions of the multidimensional mental state model. An indication of the one or more actions of the predicted path is output to the individual.
ADJUSTING MENTAL STATE TO IMPROVE TASK PERFORMANCE AND COACHING IMPROVEMENT
A method of adjusting mental state includes acquiring video data of a first individual, identifying a first set of features based on image, audio, and text data extracted from the video data, predicting a baseline mental state for the first individual based on the first set of features, identifying a target mental state, and simulating a predicted path from the baseline mental state to the target mental state. The predicted path is simulated using a multidimensional mental state model, a plurality of actions, and a computer-implemented machine learning model. The predicted path comprises one or more actions of the plurality of actions and corresponding changes to at least one of first and second dimensions of the multidimensional mental state model. The one or more actions of the predicted path are output to a second individual and are performable to adjust the baseline mental state of the first individual.
Image processing device, image processing method, and storage medium
An image processing device according to one aspect of the present disclosure includes: at least one memory storing a set of instructions; and at least one processor configured to execute the set of instructions to: receive a visible image of a face; receive a near-infrared image of the face; adjust brightness of the visible image based on a frequency distribution of pixel values of the visible image and a frequency distribution of pixel values of the near-infrared image; specify a relative position at which the visible image is related to the near-infrared image; invert adjusted brightness of the visible image; detect a region of a pupil from a synthetic image obtained by adding up the visible image the brightness of which is inverted and the near-infrared image based on the relative position; and output information on the detected pupil.
Image processing device, image processing method, and storage medium
An image processing device according to one aspect of the present disclosure includes: at least one memory storing a set of instructions; and at least one processor configured to execute the set of instructions to: receive a visible image of a face; receive a near-infrared image of the face; adjust brightness of the visible image based on a frequency distribution of pixel values of the visible image and a frequency distribution of pixel values of the near-infrared image; specify a relative position at which the visible image is related to the near-infrared image; invert adjusted brightness of the visible image; detect a region of a pupil from a synthetic image obtained by adding up the visible image the brightness of which is inverted and the near-infrared image based on the relative position; and output information on the detected pupil.
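The image-combination steps listed in the abstract above (brightness adjustment from the two frequency distributions, inversion, and addition at a relative position) can be sketched as follows. This is only one possible reading: histogram matching is assumed as the brightness adjustment, images are assumed to be 8-bit grayscale nested lists, and the names `match_histogram`, `invert`, and `synthesize` are hypothetical. Pupil detection on the resulting synthetic image is not shown, since the abstract does not describe it.

```python
from typing import List, Tuple

Image = List[List[int]]  # 8-bit grayscale, row-major


def _cdf(img: Image, levels: int = 256) -> List[float]:
    """Cumulative frequency distribution of pixel values."""
    hist = [0] * levels
    total = 0
    for row in img:
        for v in row:
            hist[v] += 1
            total += 1
    out, acc = [], 0
    for h in hist:
        acc += h
        out.append(acc / total)
    return out


def match_histogram(src: Image, ref: Image, levels: int = 256) -> Image:
    """Remap src so its value distribution follows ref (assumed adjustment)."""
    src_cdf, ref_cdf = _cdf(src, levels), _cdf(ref, levels)
    mapping, j = [], 0
    for i in range(levels):
        # Both CDFs are non-decreasing, so j only ever moves forward.
        while j < levels - 1 and ref_cdf[j] < src_cdf[i]:
            j += 1
        mapping.append(j)
    return [[mapping[v] for v in row] for row in src]


def invert(img: Image, max_value: int = 255) -> Image:
    """Invert brightness."""
    return [[max_value - v for v in row] for row in img]


def synthesize(visible_inv: Image, nir: Image, offset: Tuple[int, int]) -> Image:
    """Add the NIR image to the inverted visible image.

    offset = (dy, dx): visible pixel (y, x) corresponds to NIR pixel
    (y + dy, x + dx). Pixels with no NIR counterpart keep the visible value.
    """
    dy, dx = offset
    out = []
    for y, row in enumerate(visible_inv):
        new_row = []
        for x, v in enumerate(row):
            ny, nx = y + dy, x + dx
            inside = 0 <= ny < len(nir) and 0 <= nx < len(nir[ny])
            new_row.append(v + (nir[ny][nx] if inside else 0))
        out.append(new_row)
    return out


visible = [[0, 10], [10, 20]]
near_ir = [[100, 150], [150, 200]]
adjusted = match_histogram(visible, near_ir)
synthetic = synthesize(invert(adjusted), near_ir, (0, 1))
print(adjusted, synthetic)
```

The sum is left unclamped here (values range up to 510), which keeps the sketch simple; a real pipeline might rescale or clamp before searching for the pupil region.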
SURGICAL DEVICES, SYSTEMS, AND METHODS USING FIDUCIAL IDENTIFICATION AND TRACKING
In general, devices, systems, and methods for fiducial identification and tracking are provided.
SURGICAL METHODS USING FIDUCIAL IDENTIFICATION AND TRACKING
In general, devices, systems, and methods for fiducial identification and tracking are provided.
METHOD FOR GENERATING DEPTH IN IMAGES, ELECTRONIC DEVICE, AND NON-TRANSITORY STORAGE MEDIUM
A method and system for generating depth in monocular images acquires multiple sets of binocular images to build a dataset containing instance segmentation labels as to content; trains an autoencoder network using the dataset with instance segmentation labels to obtain a trained autoencoder network; acquires a monocular image; inputs the monocular image into the trained autoencoder network to obtain a first disparity map; and converts the first disparity map to obtain a depth image corresponding to the monocular image. The method combines binocular images with instance segmentation images as training data for training an autoencoder network, so that monocular images can simply be input into the autoencoder network to output the disparity map. Depth estimation for monocular images is achieved by converting the disparity map to a depth image corresponding to the monocular image. An electronic device and a non-transitory storage medium are also disclosed.
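The final step of the abstract above, converting a disparity map to a depth image, is commonly done with the standard stereo relation depth = focal length × baseline / disparity. The abstract does not state which conversion is used, so the sketch below is an assumption; the function name `disparity_to_depth` and the parameters `focal_px` and `baseline_m` are hypothetical.

```python
import math
from typing import List


def disparity_to_depth(
    disparity: List[List[float]],
    focal_px: float,
    baseline_m: float,
) -> List[List[float]]:
    """Convert a disparity map (pixels) to a depth map (meters).

    Uses depth = focal_px * baseline_m / disparity, where focal_px is the
    focal length in pixels and baseline_m is the distance between the two
    cameras of the binocular rig used for training (both assumed known).
    Non-positive disparities are mapped to infinity (no valid depth).
    """
    scale = focal_px * baseline_m
    return [
        [scale / d if d > 0 else math.inf for d in row]
        for row in disparity
    ]


depth = disparity_to_depth([[50.0, 100.0], [0.0, 25.0]], focal_px=800.0, baseline_m=0.5)
print(depth)
```

Larger disparities correspond to nearer points, so the 100-pixel disparity above yields half the depth of the 50-pixel one.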