G06F18/2193

CONFIDENCE-BASED ASSISTED LEARNING
20220414466 · 2022-12-29

Techniques are disclosed for assisted learning with enhanced privacy. A method comprises: sending first statistical information from a first agent to a second agent in an architecture having at least two agents, wherein a first set of sample weights corresponds to training a first machine learning model, and wherein the first statistical information comprises a second set of sample weights determined from a first model weight; receiving, from the second agent, second statistical information comprising a second model weight and an updated first set of sample weights or, from a third agent of the architecture, third statistical information comprising a third model weight and a next iteration of the first set of sample weights; and updating the first machine learning model using the second statistical information or the third statistical information.

Evaluating text classification anomalies predicted by a text classification model

In response to running at least one testing phrase on a previously trained text classifier and identifying a separate predicted classification label based on a score calculated for each respective at least one testing phrase, a text classifier decomposes extracted features summed in the score into word-level scores for each word in the at least one testing phrase. The text classifier assigns a separate heatmap value to each of the word-level scores, each respective separate heatmap value reflecting a weight of each word-level score. The text classifier outputs the separate predicted classification label and each separate heatmap value reflecting the weight of each word-level score for defining a heatmap identifying the contribution of each word in the at least one testing phrase to the separate predicted classification label for facilitating client evaluation of text classification anomalies.
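
The decomposition described above can be sketched as follows. This is a minimal illustration, assuming a linear bag-of-words classifier whose score for a label is the sum of per-word weights; the `word_heatmap` function and the `BILLING_WEIGHTS` table are hypothetical and are not taken from the patent.

```python
def word_heatmap(phrase, label_weights):
    """Split a phrase score into per-word scores and scale them to [0, 1]."""
    words = phrase.lower().split()
    # Word-level score: the weight each word contributes to the label score.
    scores = [label_weights.get(w, 0.0) for w in words]
    # Heatmap value: each word's magnitude relative to the largest magnitude.
    peak = max((abs(s) for s in scores), default=0.0)
    heat = [abs(s) / peak if peak else 0.0 for s in scores]
    return list(zip(words, scores, heat))


# Hypothetical per-word weights for a "billing" label.
BILLING_WEIGHTS = {"refund": 2.0, "invoice": 1.5, "my": 0.1, "hello": -0.5}

rows = word_heatmap("Hello refund my invoice", BILLING_WEIGHTS)
total_score = sum(score for _, score, _ in rows)
```

Each row pairs a word with its score and its heatmap value, so a reviewer can see which words drove the predicted label and spot a word that carries unexpected weight.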

SYSTEM AND METHOD FOR IMPROVED FEATURE DEFINITION USING SUBSEQUENCE CLASSIFICATION
20220405475 · 2022-12-22

A feature set for performing classification of datasets such as speech transcripts by a machine learning classifier model is constructed using identification of features of interest through classification of subsequences of the dataset. An anchor comprising a class-differentiating token is identified, and subsequences of different lengths comprising the anchor and surrounding tokens are generated. The subsequence length producing a best performing classifier is selected. A feature set is then generated using transcript-level aggregates of token-level features for tokens in the dataset within that subsequence length. The feature set may be added to a previously defined feature set for the dataset.
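
A rough sketch of the subsequence step, assuming whitespace-tokenized transcripts and a symmetric window around each anchor occurrence. The function names and the `score_fn` callback (standing in for training and evaluating a classifier at a given window size) are hypothetical.

```python
def anchor_subsequences(tokens, anchor, half_width):
    """Return one token window per occurrence of `anchor` in `tokens`."""
    windows = []
    for i, tok in enumerate(tokens):
        if tok == anchor:
            start = max(0, i - half_width)  # clip at the transcript start
            windows.append(tokens[start:i + half_width + 1])
    return windows


def pick_best_width(widths, score_fn):
    """Select the window size whose classifier scores highest.

    `score_fn` is a placeholder for training and evaluating a classifier
    on the subsequences produced with that width.
    """
    return max(widths, key=score_fn)


tokens = "i want to cancel my account today".split()
narrow = anchor_subsequences(tokens, "cancel", 1)
wide = anchor_subsequences(tokens, "cancel", 2)
```

The selected width then bounds which tokens contribute token-level features to the transcript-level aggregates.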

SYSTEM AND METHOD FOR DE-NOISING AN ULTRASONIC SCAN IMAGE USING A CONVOLUTIONAL NEURAL NETWORK

A system and method apply an input noisy ultrasonic test (UT) scan image to an input layer of a convolutional neural network, generate a feature map using a convolutional layer, pool the feature map using a pooling layer, apply the pooled feature map to a fully connected layer, generate a de-noised UT scan image, and output the de-noised UT scan image from an output layer.

METHOD AND SYSTEM FOR CREATING AN ENSEMBLE OF MACHINE LEARNING MODELS TO DEFEND AGAINST ADVERSARIAL EXAMPLES

One embodiment provides a system which facilitates construction of an ensemble of machine learning models. During operation, the system determines a training set of data objects, wherein each data object is associated with one of a plurality of classes. The system divides the training set of data objects into a number of partitions. The system generates a respective machine learning model for each respective partition using a universal kernel function, which processes the data objects divided into a respective partition to obtain the ensemble of machine learning models. The system trains the machine learning models based on the data objects of the training set. The system predicts an outcome for a testing data object based on the ensemble of machine learning models and an ensemble decision rule.
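
The partition-then-vote flow can be illustrated with a toy example. This sketch substitutes a one-dimensional nearest-centroid model for the kernel-based models of the abstract and assumes majority voting as the ensemble decision rule; both choices, and all names below, are illustrative assumptions.

```python
from collections import Counter


def partition(data, k):
    """Split data into k partitions by round-robin assignment."""
    return [data[i::k] for i in range(k)]


def fit_centroids(rows):
    """Toy stand-in model: the mean of a 1-D feature per class label."""
    sums, counts = {}, Counter()
    for x, label in rows:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_one(model, x):
    """Predict the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


def predict_ensemble(models, x):
    """Assumed ensemble decision rule: simple majority vote."""
    votes = Counter(predict_one(m, x) for m in models)
    return votes.most_common(1)[0][0]


train = [(0.1, "a"), (0.2, "a"), (0.3, "a"), (0.9, "b"), (1.0, "b"), (1.1, "b")]
models = [fit_centroids(part) for part in partition(train, 3)]
```

Because each model sees a different partition, an input crafted to fool one model does not necessarily fool the majority.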

Method and system of performing data imbalance detection and correction in training a machine-learning model

A method and system for performing semi-automatic or fully automatic data imbalance detection and correction in training a machine-learning (ML) model includes receiving a request to train the ML model, receiving access to a dataset for use in training the ML model, identifying a feature of the dataset for which data imbalance detection is to be performed, examining the dataset to determine a distribution of the feature across the dataset, determining if the distribution of the feature across the dataset indicates data imbalance, upon determining that the distribution of the feature across the dataset indicates data imbalance, identifying a desired distribution for the identified feature, selecting a subset of the dataset that corresponds with the identified feature and the desired distribution, and using the subset to train the ML model.
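
The detection and correction steps might look like the sketch below. It assumes the dataset is a list of dictionaries, that imbalance means the largest share exceeds the smallest by more than a chosen ratio, and that the desired distribution is uniform, achieved by downsampling. The `max_ratio` cutoff and all function names are hypothetical.

```python
import random
from collections import Counter


def feature_distribution(rows, feature):
    """Share of rows taking each value of `feature`."""
    counts = Counter(row[feature] for row in rows)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def is_imbalanced(distribution, max_ratio=1.5):
    """Flag imbalance when the largest share dwarfs the smallest."""
    shares = distribution.values()
    return max(shares) / min(shares) > max_ratio


def balanced_subset(rows, feature, seed=0):
    """Downsample every feature value to the size of the rarest one."""
    rng = random.Random(seed)
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row)
    target = min(len(group) for group in groups.values())
    subset = []
    for group in groups.values():
        subset.extend(rng.sample(group, target))
    return subset


rows = [{"region": "north"}] * 8 + [{"region": "south"}] * 2
before = feature_distribution(rows, "region")
subset = balanced_subset(rows, "region")
after = feature_distribution(subset, "region")
```

Downsampling discards data, so a real pipeline might prefer reweighting or oversampling when the minority group is very small.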

Configuring machine learning model thresholds in models using imbalanced data sets
11526606 · 2022-12-13

Certain aspects of the present disclosure provide techniques for efficiently configuring a machine learning model. An example method generally includes generating a randomly sampled data set from a data set including a larger first set of samples associated with a first classification and a smaller second set of samples associated with a second classification. An analysis plot for the machine learning model is generated based on the randomly sampled data set. A point associated with an accuracy metric for the machine learning model is identified on the analysis plot based on a slope of a line tangential to the identified point and a value identifying a relative importance of precision to recall in the machine learning model. The machine learning model is configured with a threshold value between the first classification and the second classification based at least in part on the identified point on the analysis plot.
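
One possible reading of the threshold selection, offered as a sketch and not as the patented method: treat the analysis plot as a precision-recall curve and pick the point that maximizes `precision_weight * precision + recall`. On a smooth concave curve with recall on the horizontal axis, that maximum sits where the tangent slope equals `-1 / precision_weight`. The function names and the sample data are hypothetical.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum((not p) and y for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pick_threshold(scores, labels, precision_weight):
    """Return the candidate threshold maximizing weight * precision + recall."""
    best_value, best_threshold = None, None
    for t in sorted(set(scores)):
        p, r = precision_recall(scores, labels, t)
        value = precision_weight * p + r
        if best_value is None or value > best_value:
            best_value, best_threshold = value, t
    return best_threshold


# Hypothetical sampled data: six negatives, three positives.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [False, False, False, False, True, False, False, True, True]

recall_leaning = pick_threshold(scores, labels, precision_weight=0.5)
precision_leaning = pick_threshold(scores, labels, precision_weight=2.0)
```

Raising `precision_weight` moves the chosen threshold toward the high-precision end of the curve; lowering it favors recall.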

Semantic state based sensor tracking and updating

Provided are methods, systems, and devices for updating a sensor based on sensor data and the semantic state associated with an area. Sensor data can be received by a computing system. The sensor data can be based on sensor outputs from sensors. The sensor data can include information associated with states of areas detected by the sensors. An estimated semantic state of one of the areas from a target sensor that can detect the states of the areas can be generated. Based on a comparison of the estimated semantic state to semantic states of the area from the sensors, an uncertainty level associated with an accuracy of the estimated semantic state can be determined. In response to the uncertainty level satisfying one or more update criteria, an updated version of the sensor data from the target sensor can be obtained.
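
A simplified sketch of the comparison step, assuming semantic states are plain labels and that the uncertainty level is the share of other sensors that disagree with the target sensor's estimate. The `max_uncertainty` cutoff, standing in for the update criteria, and the function names are hypothetical.

```python
def uncertainty(estimated_state, other_states):
    """Share of other sensors whose reported state disagrees with the estimate."""
    if not other_states:
        return 1.0  # nothing corroborates the estimate
    disagreements = sum(state != estimated_state for state in other_states)
    return disagreements / len(other_states)


def needs_update(estimated_state, other_states, max_uncertainty=0.5):
    """Assumed update criterion: uncertainty above a fixed cutoff."""
    return uncertainty(estimated_state, other_states) > max_uncertainty


stable = needs_update("forest", ["forest", "forest", "cleared", "forest"])
stale = needs_update("forest", ["cleared", "cleared", "forest"])
```

When `needs_update` returns true, the system would request fresh data from the target sensor.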

Dictionary generation apparatus, evaluation apparatus, dictionary generation method, evaluation method, and storage medium for selecting data and generating a dictionary using the data

Embodiments of the present invention are directed to learning of an appropriate dictionary which has a high expression ability of minority data while preventing reduction of an expression ability of majority data. A dictionary generation apparatus which generates a dictionary used for discriminating whether data to be discriminated belongs to a specific category includes a generation unit configured to generate a first dictionary based on learning data belonging to the specific category and a selection unit configured to estimate a degree of matching of the learning data at each portion with the first dictionary using the generated first dictionary and select a portion of the learning data based on the estimated degree of matching, wherein the generation unit generates a second dictionary based on the selected portion of the learning data.

Evaluation of modeling algorithms with continuous outputs

Certain aspects involve evaluating modeling algorithms whose outputs can impact machine-implemented operating environments. For instance, a computing system generates, from a comparison of a set of estimated attribute values of an attribute to a set of validation attribute values of the attribute, a discretized evaluation dataset with data values in multiple categories. The computing system computes, for a modeling algorithm used to generate the estimated attribute values, an evaluation metric. The computing system provides a host computing system with access to the evaluation metric, one or more modeling outputs generated with the modeling algorithm, or both. Providing one or more of these outputs to the host computing system can facilitate modifying one or more machine-implemented operations.
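
As an illustration of discretizing a continuous comparison, the sketch below buckets each estimate by its relative error against the validation value and reports the share that falls within tolerance. The three category names, the 10% tolerance, and the `within_rate` metric are assumptions chosen for the example.

```python
from collections import Counter


def discretize(estimated, validation, tolerance=0.1):
    """Label each estimate as under, within, or over its validation value.

    Assumes every validation value is nonzero.
    """
    categories = []
    for est, actual in zip(estimated, validation):
        error = (est - actual) / actual
        if error < -tolerance:
            categories.append("under")
        elif error > tolerance:
            categories.append("over")
        else:
            categories.append("within")
    return categories


def within_rate(categories):
    """Evaluation metric: fraction of estimates within tolerance."""
    return Counter(categories)["within"] / len(categories)


estimated = [100, 80, 130, 95]
validation = [100, 100, 100, 100]
categories = discretize(estimated, validation)
metric = within_rate(categories)
```

Discretizing first lets a continuous-output model be judged with classification-style metrics, which can then be shared with a host system.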