G06F18/15

SYSTEM AND METHOD FOR LINE MURA DETECTION WITH PREPROCESSING
20190258890 · 2019-08-22

A system and method for identifying line Mura defects on a display. The system is configured to generate a filtered image by preprocessing an input image of a display using at least one filter. The system then identifies line Mura candidates by converting the filtered image to a binary image, counting line components along a slope in the binary image, and marking a potential candidate location when the line components along the slope exceed a line threshold. Image patches are then generated with the candidate locations at the center of each image patch. The image patches are then classified using a machine learning classifier.
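The candidate-identification steps (filter, binarize, count line components along a slope, mark candidates, crop patches) can be sketched as below. The mean filter, the threshold values, and the function name are illustrative assumptions, not the patent's parameters; the final machine-learning classification of the patches is left out.

```python
import numpy as np

def detect_line_mura_candidates(image, slope=1.0, line_threshold=8,
                                binarize_threshold=0.25, patch_size=9):
    """Sketch of the preprocessing + candidate-marking pipeline.
    All parameter names and defaults are illustrative."""
    # Preprocessing: a simple 3x3 mean filter stands in for "at least one filter".
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")
    filtered = sum(padded[dy:dy + h, dx:dx + w]
                   for dy in range(3) for dx in range(3)) / 9.0

    # Convert the filtered image to a binary image.
    binary = filtered > binarize_threshold

    # Count line components along the given slope from each starting pixel;
    # mark a candidate when the run length exceeds the line threshold.
    candidates = []
    for y0 in range(h):
        for x0 in range(w):
            count, y, x = 0, float(y0), float(x0)
            while (0 <= int(round(y)) < h and 0 <= int(round(x)) < w
                   and binary[int(round(y)), int(round(x))]):
                count += 1
                y += slope
                x += 1.0
            if count > line_threshold:
                candidates.append((y0, x0))

    # Generate image patches centered at each candidate location
    # (clamped so patches stay inside the image).
    half = patch_size // 2
    patches = []
    for (cy, cx) in candidates:
        yc = min(max(cy, half), h - half - 1)
        xc = min(max(cx, half), w - half - 1)
        patches.append(image[yc - half:yc + half + 1, xc - half:xc + half + 1])
    return candidates, patches
```

In the patent's pipeline, the returned patches would then be fed to the machine learning classifier.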

MINING PATTERNS IN A HIGH-DIMENSIONAL SPARSE FEATURE SPACE

Disclosed are systems and methods for data mining a plurality of records to identify one or more patterns. A list of frequent items is generated using the records of a certain subpopulation in a dataset of the records. By scanning through the dataset, a prefix tree is generated based on the list of frequent items. Each node in the prefix tree includes an accumulator which maintains separate counts of records from the subpopulation matching the respective node and of records from the plurality of records matching the respective node. One or more population-normalized frequent patterns associated with the plurality of records are extracted based on a traversal of the prefix tree.
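A minimal sketch of the dual-accumulator prefix tree follows: frequent items come from the subpopulation, one scan of the full dataset inserts each record's frequent items in frequency order, and every node keeps separate subpopulation and population counts. The specific normalization (subpopulation count divided by population count) and all names are illustrative assumptions.

```python
from collections import defaultdict

class Node:
    def __init__(self):
        self.children = {}
        self.sub_count = 0   # records from the subpopulation matching this node
        self.pop_count = 0   # records from the full population matching this node

def mine_normalized_patterns(records, in_subpop, min_sub_support=2):
    """Sketch of prefix-tree mining with population-normalized scoring."""
    # 1. Frequent-item list from the records of the subpopulation.
    freq = defaultdict(int)
    for rec, flag in zip(records, in_subpop):
        if flag:
            for item in set(rec):
                freq[item] += 1
    items = [i for i, c in sorted(freq.items(), key=lambda kv: -kv[1])
             if c >= min_sub_support]
    order = {it: k for k, it in enumerate(items)}

    # 2. Scan the dataset once; each record's frequent items, sorted by
    #    frequency, trace a path in the prefix tree, and both accumulators
    #    on the path are updated.
    root = Node()
    for rec, flag in zip(records, in_subpop):
        node = root
        for item in sorted((i for i in set(rec) if i in order), key=order.get):
            node = node.children.setdefault(item, Node())
            node.pop_count += 1
            if flag:
                node.sub_count += 1

    # 3. Traverse the tree and emit population-normalized frequent patterns.
    out = []
    def walk(node, prefix):
        for item, child in node.children.items():
            pat = prefix + (item,)
            if child.sub_count >= min_sub_support:
                out.append((pat, child.sub_count / child.pop_count))
            walk(child, pat)
    walk(root, ())
    return out
```

Keeping both counts on each node is what lets the traversal score a pattern's frequency in the subpopulation relative to the whole dataset in a single pass.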

BAYESIAN HIERARCHICAL MODELING FOR LOW SIGNAL DATASETS

Methods and systems are described herein for generating a trained Bayesian Hierarchical model from low-signal datasets. The disclosed approach uses data from alternative segments as a baseline to train the Bayesian Hierarchical model. In some embodiments, the disclosed approach may supplement segment-specific features from another dataset. In some embodiments, inputs for prior distributions may be received from an expert and modified based on the model specification. In one example, the disclosed approach may be used to model the probability of default for companies in a low-default segment such as an Energy portfolio. In this example, data from other commercial and industrial segments is used to form a baseline in the Bayesian Hierarchical model. Further, a dataset containing segment-specific features for Energy supplements the training dataset.
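The baseline idea can be illustrated with a conjugate Beta-Binomial sketch: the low-default segment's probability of default is shrunk toward a rate estimated from the other segments, with a prior strength playing the role of an expert-supplied input. This is a deliberately simplified stand-in for the full Bayesian Hierarchical model; all names and defaults are illustrative.

```python
def pooled_default_rate(segment_defaults, segment_n,
                        baseline_defaults, baseline_n,
                        prior_strength=50.0):
    """Beta-Binomial sketch of partial pooling toward a baseline segment."""
    baseline_rate = baseline_defaults / baseline_n
    # Prior centered on the baseline rate; prior_strength stands in for an
    # expert-chosen input controlling how informative the prior is.
    alpha0 = baseline_rate * prior_strength
    beta0 = (1.0 - baseline_rate) * prior_strength
    # Conjugate update with the (sparse) low-default segment's data.
    alpha = alpha0 + segment_defaults
    beta = beta0 + segment_n - segment_defaults
    return alpha / (alpha + beta)
```

With zero observed defaults in the segment, the estimate stays strictly positive and below the baseline rate, which is exactly the behavior a naive segment-only estimate cannot provide.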

DATA GAP MITIGATION

Disclosed embodiments provide techniques for estimating imputation algorithm performance. Multiple imputer algorithms are selected, and an evaluation of how well each imputer algorithm can estimate the missing data is performed. Disclosed embodiments obtain an imputer candidate dataset (ICD). The ICD is compared to the incomplete data range, and a similarity metric between the incomplete data range and the ICD is determined. When the similarity metric exceeds a predetermined threshold, an imputer evaluation dataset (IED) is created from the ICD by removing one or more data points from the ICD. Each imputer algorithm is evaluated by applying it to the IED and computing an imputer evaluation metric based on its performance. The multiple imputer algorithms are ranked based on the imputer evaluation metric, and the best-ranked imputer algorithm can then be selected for use on the incomplete data range within the measurement dataset.
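The evaluation loop can be sketched as below. The patent does not fix a particular similarity metric or evaluation metric, so the inverse-distance similarity over summary statistics and the RMSE score here are illustrative choices, as are all names and thresholds.

```python
import numpy as np

def rank_imputers(incomplete_range, candidate_dataset, imputers,
                  similarity_threshold=0.5, n_holdout=5, seed=0):
    """Sketch: rank imputer algorithms by how well they recover points
    deliberately hidden from a similar, complete candidate dataset."""
    observed = incomplete_range[~np.isnan(incomplete_range)]
    # Similarity between the incomplete data range and the ICD
    # (illustrative: inverse distance between mean and std).
    sim = 1.0 / (1.0 + abs(observed.mean() - candidate_dataset.mean())
                 + abs(observed.std() - candidate_dataset.std()))
    if sim <= similarity_threshold:
        return None  # ICD too dissimilar to be a useful proxy

    # Create the IED from the ICD by removing (hiding) known data points.
    rng = np.random.default_rng(seed)
    holdout = rng.choice(len(candidate_dataset), size=n_holdout, replace=False)
    ied = candidate_dataset.astype(float).copy()
    truth = ied[holdout].copy()
    ied[holdout] = np.nan

    # Evaluate each imputer on the IED; lower RMSE is better.
    scores = []
    for name, impute in imputers.items():
        filled = impute(ied.copy())
        rmse = float(np.sqrt(np.mean((filled[holdout] - truth) ** 2)))
        scores.append((name, rmse))
    return sorted(scores, key=lambda s: s[1])
```

Because the held-out values are known, the ranking needs no ground truth for the actual gap in the measurement dataset.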

Geometric aging data reduction for machine learning applications

Techniques for geometric aging data reduction for machine learning applications are disclosed. In some embodiments, an artificial-intelligence powered system receives a first time-series dataset that tracks at least one metric value over time. The system then generates a second time-series dataset that includes a reduced version of a first portion of the time-series dataset and a non-reduced version of a second portion of the time-series dataset. The second portion of the time-series dataset may include metric values that are more recent than the first portion of the time-series dataset. The system further trains a machine learning model using the second time-series dataset that includes the reduced version of the first portion of the time-series dataset and the non-reduced version of the second portion of the time-series dataset. The trained model may be applied to reduced and/or non-reduced data to detect multivariate anomalies and/or provide other analytic insights.
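One way to realize the reduced/non-reduced split is to keep the most recent points at full resolution and summarize older points in buckets whose size grows geometrically with age. The window size, growth factor, and use of a bucket mean are illustrative assumptions, not the patent's specifics.

```python
import numpy as np

def geometric_aging_reduce(series, recent_window=64, factor=2):
    """Sketch: non-reduced recent portion + geometrically aged older portion."""
    recent = list(series[-recent_window:])           # non-reduced second portion
    older = series[:-recent_window] if len(series) > recent_window else series[:0]

    reduced = []
    bucket = 1
    i = len(older)
    # Walk backward through history; each step back multiplies the bucket
    # size by `factor`, so older data is represented ever more coarsely.
    while i > 0:
        lo = max(0, i - bucket)
        reduced.append(float(np.mean(older[lo:i])))  # one summary value per bucket
        i = lo
        bucket *= factor
    reduced.reverse()
    return np.array(reduced + recent)
```

The output preserves every recent metric value while collapsing the older first portion to a handful of summary points, which is the dataset a downstream anomaly-detection model would be trained on.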

REDUCING BIAS IN VISUAL SPEECH RECOGNITION

Systems, methods, and computer-readable media for reducing bias in visual speech recognition (VSR). In the present embodiments, a comprehensive analysis of the bias (e.g., determining the type and severity of the bias) can be performed for each sample in the training data across attributes such as age, gender, and ethnicity. Further, synthetic training data can be generated for under-represented groups using various techniques, such as generative adversarial networks (GANs). Additionally, synthetic video generation can be performed using different modes (e.g., six modes) to ensure sufficient quantity and diversity in the synthetic samples. A combination of the real data and the generated synthetic training data can be used to train a VSR model.
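The bias-analysis step can be sketched as counting samples per demographic group and turning each group's deficit into a severity score and a synthetic-sample quota. The attribute names and the "match the largest group" target are illustrative assumptions; the synthesis itself (e.g., GAN-based video generation) is not shown.

```python
from collections import Counter

def synthetic_quota(samples, attrs=("age_band", "gender", "ethnicity")):
    """Sketch: severity of under-representation and synthetic-data quota
    per demographic group, targeting the size of the largest group."""
    counts = Counter(tuple(s[a] for a in attrs) for s in samples)
    target = max(counts.values())
    report = {}
    for group, n in counts.items():
        report[group] = {
            "severity": 1.0 - n / target,   # relative deficit in [0, 1)
            "quota": target - n,            # synthetic samples to generate
        }
    return report
```

The quotas would then drive how many synthetic videos each generation mode produces before real and synthetic data are combined for VSR training.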

SYSTEMS, APPARATUSES, METHODS, AND NON-TRANSITORY COMPUTER-READABLE STORAGE DEVICES FOR TRAINING ARTIFICIAL-INTELLIGENCE MODELS USING ADAPTIVE DATA-SAMPLING

A method has the steps of: calculating importance metrics of a plurality of data samples based on predictions of an artificial-intelligence (AI) model obtained from the plurality of data samples in a plurality of previous training epochs, without using labels of the plurality of data samples and without using a learning rate of the AI model; calculating sampling probabilities of the plurality of data samples based on the importance metrics thereof; selecting a subset of the plurality of data samples based on the sampling probabilities of the plurality of data samples; and training the AI model using the selected subset of the plurality of data samples for one or more epochs.
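The label-free, learning-rate-free importance calculation can be sketched by measuring how much a sample's predicted distribution still changes across previous epochs. The variance-based instability metric below is an illustrative choice, not the patent's; the probability floor and all names are likewise assumptions.

```python
import numpy as np

def adaptive_sample(prediction_history, subset_size, seed=0):
    """Sketch: importance metrics from prediction disagreement across
    previous epochs, then probability-proportional subset selection.
    `prediction_history` has shape (epochs, samples, classes)."""
    # Importance: mean per-class variance of the predicted distribution
    # across epochs -- uses neither labels nor the learning rate.
    importance = prediction_history.var(axis=0).mean(axis=1)

    # Sampling probabilities proportional to importance, with a small
    # floor so stable samples are never entirely excluded.
    probs = importance + 1e-8
    probs = probs / probs.sum()

    # Select the training subset for the next epoch(s).
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(probs), size=subset_size, replace=False, p=probs)
    return idx, probs
```

Samples whose predictions have already stabilized receive near-zero probability, so each epoch's compute is concentrated on the samples the model is still uncertain about.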
