Patent classifications
G06F18/256
Generating modified digital images utilizing a dispersed multimodal selection model
The present disclosure relates to systems, methods, and non-transitory computer readable media for generating modified digital images based on verbal and/or gesture input by utilizing a natural language processing neural network and one or more computer vision neural networks. The disclosed systems can receive verbal input together with gesture input. The disclosed systems can further utilize a natural language processing neural network to generate a verbal command based on verbal input. The disclosed systems can select a particular computer vision neural network based on the verbal input and/or the gesture input. The disclosed systems can apply the selected computer vision neural network to identify pixels within a digital image that correspond to an object indicated by the verbal input and/or gesture input. Utilizing the identified pixels, the disclosed systems can generate a modified digital image by performing one or more editing actions indicated by the verbal input and/or gesture input.
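The routing step described above can be sketched as follows. This is a toy illustration only: `parse_command`, the model names, and the routing rules are invented placeholders, not the patented implementation.

```python
# Hypothetical sketch of the multimodal selection flow; all names and
# routing rules here are illustrative assumptions.

def parse_command(verbal_input):
    """Toy stand-in for the natural language processing neural network."""
    tokens = verbal_input.lower().split()
    action = "remove" if "remove" in tokens else "select"
    target = tokens[-1]  # naive assumption: the object is the last word
    return {"action": action, "target": target}

def select_vision_model(command, gesture):
    """Pick one of several computer vision networks from the inputs."""
    if gesture is not None:
        return "salient_object_network"       # gesture points at a region
    if command["target"] in {"sky", "ground"}:
        return "semantic_segmentation_network"
    return "object_detection_network"

command = parse_command("please remove the dog")
model = select_vision_model(command, gesture=None)
```

The selected network would then identify the object's pixels, after which the editing action (here, "remove") is applied to those pixels.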
Systems and methods of product recognition through multi-model image processing
In some embodiments, systems and methods are provided to recognize retail products, comprising: a model training system configured to: identify a customer; access an associated customer profile; access and apply a set of filtering rules to a product database based on customer data; generate a listing of products specific to the customer; access and apply a model training set of rules to train a machine learning model based on the listing of products and corresponding image data for each of the products in the listing of products; and communicate the trained model to a portable user device associated with the customer.
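The filtering step can be sketched as below. The toy product database and the purchase-history-based filtering rule are illustrative assumptions; the patent does not specify the rule set.

```python
# Minimal sketch of filtering a product database by customer data;
# the database schema and filtering rule are invented for illustration.

PRODUCT_DB = [
    {"sku": "A1", "category": "dairy"},
    {"sku": "B2", "category": "produce"},
    {"sku": "C3", "category": "bakery"},
]

def filter_products(db, customer_profile):
    """Apply a filtering rule: keep categories the customer has bought."""
    wanted = set(customer_profile["purchase_categories"])
    return [p for p in db if p["category"] in wanted]

listing = filter_products(PRODUCT_DB, {"purchase_categories": ["dairy", "bakery"]})
# An image-recognition model trained on `listing` would then be
# communicated to the customer's portable device.
```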
Gating model for video analysis
Implementations described herein relate to methods, devices, and computer-readable media to perform gating for video analysis. In some implementations, a computer-implemented method includes obtaining a video comprising a plurality of frames and corresponding audio. The method further includes performing sampling to select a subset of the plurality of frames based on a target frame rate and extracting a respective audio spectrogram for each frame in the subset of the plurality of frames. The method further includes reducing resolution of the subset of the plurality of frames. The method further includes applying a machine-learning based gating model to the subset of the plurality of frames and corresponding audio spectrograms and obtaining, as output of the gating model, an indication of whether to analyze the video to add one or more video annotations.
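The sampling and resolution-reduction steps can be sketched as follows. The frame rates, stride-based downsampling, and the gating decision itself are assumptions for illustration; the actual gating model and spectrogram parameters are not specified here.

```python
import numpy as np

# Rough sketch of frame sampling and resolution reduction; parameters
# and the naive striding approach are illustrative assumptions.

def sample_frames(frames, source_fps, target_fps):
    """Select a subset of frames approximating the target frame rate."""
    step = max(1, int(round(source_fps / target_fps)))
    return frames[::step]

def reduce_resolution(frame, factor=2):
    """Naive downsampling by striding over pixels."""
    return frame[::factor, ::factor]

frames = [np.zeros((8, 8)) for _ in range(30)]   # 1 s of 30 fps video
subset = sample_frames(frames, source_fps=30, target_fps=5)
low_res = [reduce_resolution(f) for f in subset]
# A gating model would consume `low_res` plus a per-frame audio
# spectrogram and output whether to run the full annotation pipeline.
```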
Sensor transformation attention network (STAN) model
A sensor transformation attention network (STAN) model including sensors, attention modules, a merge module and a task-specific module is provided. The attention modules calculate attention scores of feature vectors corresponding to input signals collected by the sensors. The merge module calculates attention values from the attention scores, and generates a merged transformation vector based on the attention values and the feature vectors. The task-specific module classifies the merged transformation vector.
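The merge step can be sketched as an attention-weighted blend of per-sensor features. The softmax normalization and the two-sensor, two-dimensional example are illustrative assumptions.

```python
import numpy as np

# Sketch of the attention-weighted merge: scalar attention scores are
# normalized into attention values that blend per-sensor feature vectors.

def merge(feature_vectors, attention_scores):
    """Softmax the scores into attention values, then blend the features."""
    scores = np.asarray(attention_scores, dtype=float)
    values = np.exp(scores) / np.exp(scores).sum()   # attention values
    feats = np.stack(feature_vectors)                # (num_sensors, dim)
    return values @ feats                            # merged transformation vector

audio_feat = np.array([1.0, 0.0])
video_feat = np.array([0.0, 1.0])
merged = merge([audio_feat, video_feat], attention_scores=[0.0, 0.0])
```

With equal scores, each sensor receives attention value 0.5 and the merged vector is the mean of the two feature vectors; a task-specific classifier would then consume `merged`.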
System and method for acquiring multimodal biometric information
Methods, systems, and programming for user identification are presented. In one example, a system for acquiring biometric information is disclosed. The system comprises a housing including a surface for a person to place a finger thereon. The system also comprises a sensor, a first image acquisition portion, and a second image acquisition portion. The sensor is configured for sensing presence of the finger when the person places the finger on the surface. The first image acquisition portion is configured for acquiring a fingerprint image of the finger placed on the surface. The second image acquisition portion is configured for acquiring a finger vein image of the finger placed on the surface. The first and second image acquisition portions acquire their respective images at different times.
Patient-adaptive nuclear imaging
Systems and methods include control of a nuclear imaging scanner to acquire nuclear imaging scan data of a body, control of a computed tomography scanner to acquire computed tomography scan data of the body, determination of a scanning speed, of the nuclear imaging scanner, associated with each of a plurality of scanning coordinates based on locations of one or more internal volumes associated with radioactivity greater than a threshold level, a classification determined for each of the one or more internal volumes indicating a degree of clinical interest based at least in part on the radioactivity associated with the internal volume, and an attenuation coefficient map based on the computed tomography scan data, and control of the nuclear imaging scanner to scan the body over each of the scanning coordinates at the associated scanning speed.
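The per-coordinate speed assignment can be sketched as below, assuming a simplified one-dimensional coordinate axis; the speeds, radius, and the slow-near-hot-volume rule are invented for illustration.

```python
# Illustrative sketch: slow the scan over coordinates near volumes whose
# radioactivity exceeds the threshold. All parameters are assumptions.

def scanning_speeds(coords, hot_centers, slow=1.0, fast=4.0, radius=1.5):
    """Assign a slow speed near high-activity volumes, fast elsewhere."""
    speeds = []
    for z in coords:
        near_hot = any(abs(z - c) <= radius for c in hot_centers)
        speeds.append(slow if near_hot else fast)
    return speeds

# One high-activity volume centered at coordinate 2.0.
speeds = scanning_speeds(coords=[0, 1, 2, 3, 4], hot_centers=[2.0])
```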
Multi-modal reconstruction network
A system and method include training of an artificial neural network to generate an output data set, the training based on a plurality of sets of emission data acquired using a first imaging modality and respective data sets acquired using a second imaging modality.
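Schematically, the training pairs each emission data set with its second-modality counterpart. In the sketch below, a least-squares fit stands in for neural-network training, and the linear relationship between the modalities is an invented assumption.

```python
import numpy as np

# Toy paired-modality training: fit a mapping from emission data
# (modality 1) to targets from a second modality. Least squares is a
# stand-in for the neural-network training described in the abstract.

rng = np.random.default_rng(0)
emission = rng.normal(size=(100, 4))                  # sets of emission data
true_map = np.array([1.0, 2.0, 0.0, -1.0])
target = emission @ true_map                          # second-modality data

weights, *_ = np.linalg.lstsq(emission, target, rcond=None)
```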
SYSTEMS AND METHODS FOR IMPROVED INTERRUPTION AND RESUMPTION MANAGEMENT
Interruptions of user activity can cost time and money for organizations. Improved technology is needed for interruption and resumption management. A communication engine can monitor the task state of a user, receive an incoming matter, analyze the incoming matter with a classification model, compare the analyzed incoming matter with the priority of the task state of the user, and interrupt the user by providing the incoming matter to the user if the priority of the analyzed incoming matter is greater than the priority of the task state of the user. When the user would like to resume their interrupted task, the communication engine can provide an output refocus communication and automatically return the device software status to the task state of the user when the interruption occurred.
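The interrupt decision reduces to a priority comparison. In this sketch the numeric priorities and the stubbed classification model are illustrative assumptions.

```python
# Sketch of the interrupt decision: larger value means more urgent.
# The classification model is stubbed with a fixed lookup table.

def classify_priority(matter):
    """Stand-in for the classification model analyzing an incoming matter."""
    return {"outage": 3, "meeting": 2, "newsletter": 1}.get(matter, 0)

def should_interrupt(incoming_matter, task_priority):
    """Interrupt only if the incoming matter outranks the current task."""
    return classify_priority(incoming_matter) > task_priority

interrupt_now = should_interrupt("outage", task_priority=2)
defer = should_interrupt("newsletter", task_priority=2)
```

On resumption, the engine would additionally restore the saved task state and emit a refocus communication, which this sketch omits.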
ROAD MODEL GENERATION METHOD AND DEVICE
The disclosure relates to a road model generation method, the method including: receiving lane line information from a plurality of sources; determining a reference lane line; and performing arbitration and then selecting, from the lane line information from the plurality of sources, a first lane line corresponding to the reference lane line, where the first lane line and the reference lane line together form left and right lane lines of a present lane. The disclosure further relates to a road model generation device, a computer storage medium, an autonomous driving system, and a vehicle.
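The arbitration step can be sketched as choosing, among candidate lane lines from several sources, the one closest to where the reference line implies the opposite boundary should be. The lateral-offset representation and nearest-candidate rule are assumptions, not the claimed method.

```python
# Sketch of lane line arbitration: each source reports a candidate lane
# line as a lateral offset in meters; pick the candidate nearest the
# expected position. Representation and rule are illustrative.

def arbitrate(expected_offset, candidates):
    """candidates maps source name -> lateral offset of its lane line."""
    best = min(candidates, key=lambda s: abs(candidates[s] - expected_offset))
    return best, candidates[best]

# Reference lane line at -1.8 m implies the other boundary near +1.7 m.
source, first_lane_line = arbitrate(
    1.7, {"camera": 1.75, "map": 1.2, "lidar": 2.4}
)
```

The selected `first_lane_line` and the reference line together would form the left and right boundaries of the present lane.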
System and methods for predicting communication performance in networked systems
A system for processing performance prediction decisions includes one or more processors configured to execute one or more program modules. The modules are configured to receive, via the one or more processors, a prediction for an account at a prediction timestamp. The modules are also configured to identify, via the one or more processors, a prediction rule using attributes from the prediction. Responsive to the prediction rule having a network trigger associated therewith, the modules are configured to retrieve, via the one or more processors, a network trigger time associated with the network trigger, compare, via the one or more processors, the prediction timestamp to the network trigger time, and apply, via the one or more processors, a prediction decision based on the comparison of the prediction timestamp and the network trigger time. Applying the prediction decision includes determining a confidence level that a communication associated with the prediction will occur by a given time.
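The timestamp comparison can be sketched as below. The specific confidence levels and the "later prediction is trusted more" rule are invented for illustration; the patent does not disclose the decision thresholds.

```python
from datetime import datetime, timedelta

# Illustrative decision rule: a prediction made at or after the network
# trigger time yields a higher confidence level. The 0.9/0.5 values are
# assumptions.

def prediction_confidence(prediction_ts, trigger_ts):
    """Compare the prediction timestamp to the network trigger time."""
    return 0.9 if prediction_ts >= trigger_ts else 0.5

trigger = datetime(2024, 1, 1, 12, 0)
conf = prediction_confidence(trigger + timedelta(minutes=5), trigger)
```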