Patent classifications
G06F17/18
UNSUPERVISED STATISTICAL METHOD FOR MULTIVARIATE IDENTIFICATION OF ATYPICAL SENSORS
A method for identifying atypical sensors measuring characteristics of individuals. Curves of the characteristics of the individuals are collected, the curves being measured by each sensor. For a given sensor, a reference curve is processed to calculate a dissimilarity index between the reference curve and each of the other curves of the sensor, and this dissimilarity processing is repeated iteratively for each curve from the same sensor to obtain a dissimilarity index for each curve. The dissimilarity processing is repeated for the other sensors to obtain a table of dissimilarity indices. An atypicality index is calculated for each individual from a multivariate statistical processing of the table. Atypical individuals and atypical sensors are identified.
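The per-sensor dissimilarity step and the table it produces can be illustrated with a minimal sketch. The abstract does not name a distance metric or the multivariate statistic, so the Euclidean distance, the mean-distance index, and the median-scaled sum used here are assumptions made only for illustration; all function names are hypothetical. The sketch scores individuals only and does not attempt the identification of atypical sensors.

```python
import math
from statistics import mean, median


def curve_distance(a, b):
    """Euclidean distance between two equally sampled curves (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dissimilarity_indices(curves):
    """Take each curve in turn as the reference and average its distance
    to every other curve measured by the same sensor."""
    n = len(curves)
    return [
        mean(curve_distance(curves[i], curves[j]) for j in range(n) if j != i)
        for i in range(n)
    ]


def dissimilarity_table(curves_by_sensor):
    """Repeat the processing for every sensor: {sensor: [index per individual]}."""
    return {s: dissimilarity_indices(c) for s, c in curves_by_sensor.items()}


def atypicality(table):
    """Assumed multivariate combination: scale each sensor column by its
    median, then sum across sensors to get one score per individual."""
    sensors = list(table)
    scores = [0.0] * len(table[sensors[0]])
    for s in sensors:
        scale = median(table[s]) or 1.0  # guard against a zero median
        for i, value in enumerate(table[s]):
            scores[i] += value / scale
    return scores


# Hypothetical data: four individuals, two sensors, one curve each.
curves_by_sensor = {
    "s1": [[0, 1, 2], [0, 1, 2.1], [0.1, 1, 2], [5, 6, 7]],
    "s2": [[1, 1, 1], [1, 1.1, 1], [1, 1, 0.9], [4, 4, 4]],
}
scores = atypicality(dissimilarity_table(curves_by_sensor))
```

In this toy data the last individual differs from the others on both sensors, so its combined score is the largest.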
SYSTEM AND METHOD FOR PREDICTING LOSS OF FUNCTION CAUSED BY GENETIC VARIANT
Disclosed herein is a system for predicting a loss of the function of genetic variants. The system includes a loss of function (LoF) prediction unit for calculating a probability that a target genetic variant will cause a loss of function (LoF) in a target gene through logistic regression with respect to a first probability that the target gene will be intolerant of the loss of function and a second probability that the target genetic variant contained in the target gene will be intolerant.
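A minimal sketch of how a logistic regression could combine the two probabilities into a single LoF probability. The coefficients below are hypothetical placeholders, not values from the disclosure; in practice they would be fitted on labeled variants. The function name `lof_probability` is also an assumption.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def lof_probability(p_gene, p_variant, coef=(-4.0, 3.0, 5.0)):
    """Combine the gene-level and variant-level intolerance probabilities.

    coef = (intercept, weight for p_gene, weight for p_variant).
    These numbers are illustrative assumptions only.
    """
    b0, b_gene, b_variant = coef
    return sigmoid(b0 + b_gene * p_gene + b_variant * p_variant)


high = lof_probability(0.9, 0.9)  # both probabilities high
low = lof_probability(0.1, 0.1)   # both probabilities low
```

With positive weights, as assumed here, the output rises monotonically with either input probability.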
ANOMALY DETECTION PERFORMANCE ENHANCEMENT USING GRADIENT-BASED FEATURE IMPORTANCE
Herein are machine learning techniques that adjust reconstruction loss of a reconstructive model, such as a principal component analysis (PCA), based on importances of features. In an embodiment having a reconstructive model that more or less accurately reconstructs its input, a computer measures, for each feature, a respective importance that is based on the reconstructive model. For example, importance may be based on grading samples that the reconstructive model correctly or incorrectly inferenced. For each feature during production inferencing, a respective original loss from the reconstructive model measures a difference between a value of the feature in an input and a reconstructed value of the feature generated by the reconstructive model. For each feature, the respective importance of the feature is applied to the respective original loss to generate a respective weighted loss, which compensates for concept drift. The weighted losses of the features of the input are collectively detected as anomalous or non-anomalous.
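The weighting of per-feature reconstruction loss can be sketched as follows. This is an illustration under stated assumptions: the reconstructive model is reduced to a projection onto a single, already fitted unit direction (a one-component PCA), the per-feature loss is a squared error, the importances are taken as given rather than computed, and the anomaly decision is a simple threshold on the summed weighted loss. All names are hypothetical.

```python
def reconstruct(x, mean, direction):
    """One-component PCA reconstruction: project the centered input onto a
    unit-length direction and map it back to the original space."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    score = sum(c * d for c, d in zip(centered, direction))
    return [mi + score * d for mi, d in zip(mean, direction)]


def weighted_losses(x, x_hat, importances):
    """Per-feature squared reconstruction error scaled by feature importance."""
    return [w * (xi - ri) ** 2 for xi, ri, w in zip(x, x_hat, importances)]


def is_anomalous(x, mean, direction, importances, threshold):
    """Assumed decision rule: flag the input when the summed weighted
    loss exceeds the threshold."""
    x_hat = reconstruct(x, mean, direction)
    return sum(weighted_losses(x, x_hat, importances)) > threshold


# Hypothetical fitted model: data centered at the origin, varying along feature 0.
mean = [0.0, 0.0]
direction = [1.0, 0.0]
flag_equal = is_anomalous([0.0, 2.0], mean, direction, [1.0, 0.5], threshold=1.0)
flag_down = is_anomalous([0.0, 2.0], mean, direction, [1.0, 0.1], threshold=1.0)
```

The same input can cross or stay under the threshold depending on how much weight its poorly reconstructed feature carries, which is the effect the weighting is meant to provide.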
Intelligent framework updater to incorporate framework changes into data analysis models
A computer system adapts a model analyzing data. Information sources are analyzed to determine one or more changes for a computerized model employed for analyzing data. One or more current projects each using an implementation of the computerized model with at least one of the determined changes are identified. The implementations are compared to the employed computerized model to determine differences. One or more adaptations for the employed computerized model are determined in response to the determined differences satisfying a threshold, wherein the one or more adaptations for the employed computerized model are based on the determined changes in the corresponding implementation of the computerized model. At least one adaptation is installed into a platform hosting the employed model for modification of the employed model. Embodiments of the present invention further include a method and program product for adapting a model analyzing data in substantially the same manner described above.
Attribute diversity for frequent pattern analysis
A data processing server may receive a set of data objects for frequent pattern (FP) analysis. The set of data objects may be analyzed using an attribute diversity technique. For the set of data attributes of the set of data objects, the server may arrange the attributes in one or more dimensions. The server may initialize a set of centroids on data points and identify mean values of nearby data points. Based on an iteration of the mean value calculation, the server may identify a set of attributes corresponding to final mean values as being groups of similarly frequent attributes. These groups of similarly frequent attributes may be analyzed using an FP analysis procedure to identify frequent patterns of data attributes.
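The centroid and mean-value iteration reads like a k-means or mean-shift style procedure, although the abstract does not name one. The sketch below assumes a one-dimensional, k-means style update over attribute frequencies, with centroids initialized on existing data points; the function names and the sample frequencies are hypothetical.

```python
def assign(freqs, centroids):
    """Put each attribute into the group of its nearest centroid."""
    groups = [[] for _ in centroids]
    for attr, f in freqs.items():
        nearest = min(range(len(centroids)), key=lambda k: abs(f - centroids[k]))
        groups[nearest].append(attr)
    return groups


def group_by_frequency(freqs, initial_centroids, iterations=10):
    """Iterate the mean-value calculation, then return the final groups of
    similarly frequent attributes together with the final mean values."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        groups = assign(freqs, centroids)
        for k, members in enumerate(groups):
            if members:  # leave a centroid in place if no attribute is near it
                centroids[k] = sum(freqs[a] for a in members) / len(members)
    return assign(freqs, centroids), centroids


# Hypothetical attribute frequencies (fraction of data objects having the attribute).
freqs = {"a": 0.9, "b": 0.85, "c": 0.1, "d": 0.12, "e": 0.8}
groups, centroids = group_by_frequency(freqs, initial_centroids=[0.9, 0.1])
```

Each resulting group could then be passed separately to an FP analysis procedure, so that rare attributes are not drowned out by very common ones.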