Patent classifications
G06F18/23211
LEARNING METHOD, LEARNING SYSTEM, DEVICE, LEARNING APPARATUS AND PROGRAM
A learning method includes a step in which a device acquires a plurality of samples, a step in which the device divides the plurality of samples into a plurality of clusters, a step in which the device extracts samples from each of the plurality of clusters according to an effectiveness of each cluster received from a learning apparatus, a step in which the device transmits the extracted samples to the learning apparatus, a step in which the learning apparatus learns the extracted samples, a step in which the learning apparatus calculates, for each cluster, an effectiveness in learning the samples belonging to a cluster from learning results, and a step in which the learning apparatus transmits the effectiveness of each cluster to the device.
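The device-side extraction step can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract does not say how effectiveness maps to the number of extracted samples, so a proportional split of a fixed budget is assumed here. The names `extract_samples`, `budget`, and `seed` are hypothetical.

```python
import random

def extract_samples(clusters, effectiveness, budget, seed=0):
    """Pick samples per cluster in proportion to each cluster's effectiveness.

    clusters:      dict mapping cluster id -> list of samples held by the device
    effectiveness: dict mapping cluster id -> non-negative score from the learning apparatus
    budget:        total number of samples the device intends to transmit (assumption)
    """
    rng = random.Random(seed)
    total = sum(effectiveness[cid] for cid in clusters)
    extracted = {}
    for cid, members in clusters.items():
        # Fall back to an equal split when no effectiveness has been received yet.
        share = effectiveness[cid] / total if total > 0 else 1 / len(clusters)
        # Never request more samples than the cluster actually holds.
        k = min(len(members), round(budget * share))
        extracted[cid] = rng.sample(members, k)
    return extracted

clusters = {"a": list(range(10)), "b": list(range(10, 20))}
picked = extract_samples(clusters, {"a": 3.0, "b": 1.0}, budget=8)
```

With these inputs, cluster "a" receives three quarters of the budget and cluster "b" the remainder, so the more effective cluster contributes more samples to the next learning round.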
DETERMINING SIMILAR BEHAVIORAL PATTERN BETWEEN TIME SERIES DATA OBTAINED FROM MULTIPLE SENSORS AND CLUSTERING THEREOF
Industries deploy a plethora of sensors that are attached to a system or a human being. In multi-sensor environments, there is a need to detect which sensors behave similarly within a time span. Sensor values often differ in range yet show similar time series characteristics, and sometimes have a phase difference in operation, making it impossible to detect such sensor similarity in a large system where the number of input parameters/sensor observations is large. Systems and methods of the present disclosure determine similar behavioral patterns between time series data obtained from multiple sensors and cluster the sensors. The system implements a pattern recognition-based approach to find the similarity and then applies a Dynamic Programming-based approach to detect similarity in at least two time series data and cluster the sensors and corresponding time series data into specific cluster(s).
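One common dynamic-programming technique for comparing time series that differ in range and phase is dynamic time warping (DTW) applied after z-normalization. The abstract does not name its algorithm, so the sketch below is an assumed stand-in, and the function names are hypothetical.

```python
import math

def z_normalize(series):
    """Rescale to zero mean and unit variance so differing value ranges become comparable."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        return [0.0] * n  # a constant series carries no shape information
    return [(x - mean) / std for x in series]

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with an absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def sensor_distance(a, b):
    """Distance between two sensor traces that ignores scale and tolerates phase shift."""
    return dtw_distance(z_normalize(a), z_normalize(b))

s1 = [0, 0, 1, 2, 1, 0, 0, 0]
s2 = [0, 0, 0, 10, 20, 10, 0, 0]   # same pulse, 10x the range, one step later
s3 = [2, 0, 2, 0, 2, 0, 2, 0]      # unrelated oscillation
```

Here `sensor_distance(s1, s2)` is close to zero while `sensor_distance(s1, s3)` is not, so a threshold on this distance could serve as one way to group sensors into clusters.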
Methods, systems, and apparatuses for quantitative analysis of heterogeneous biomarker distribution
Methods, systems, and apparatuses for detecting and describing heterogeneity in a cell sample are disclosed herein. A plurality of fields of view (FOV) are generated for one or more areas of interest (AOI) within an image of the cell sample. Hyperspectral or multispectral data from each FOV is organized into an image stack containing one or more z-layers, with each z-layer containing intensity data for a single marker at each pixel in the FOV. A cluster analysis is applied to the image stacks, wherein the clustering algorithm groups pixels having a similar ratio of detectable marker intensity across layers of the z-axis, thereby generating a plurality of clusters having similar expression patterns.
METHOD OF UPDATING DATA CLUSTER
A method of updating a data cluster, adapted to a computing device, includes receiving update data, calculating a first distance between the update data and an existing representative of an existing cluster, determining whether the first distance is smaller than a threshold distance, updating the existing cluster with the update data to generate an updated cluster when the first distance is smaller than the threshold distance, and performing a representative updating procedure on the updated cluster to generate an updated representative.
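The update steps above can be sketched as follows. The abstract specifies neither the distance measure nor the representative updating procedure, so Euclidean distance and the member mean are assumed here, and `update_cluster` is a hypothetical name.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def update_cluster(members, representative, update, threshold):
    """Absorb `update` into the cluster if it lies within `threshold` of the representative.

    Returns (members, representative, accepted). The inputs are not mutated.
    """
    if euclidean(update, representative) >= threshold:
        # Too far away: leave the existing cluster and representative untouched.
        return members, representative, False
    updated = members + [update]
    dims = len(representative)
    # Assumed representative update: recompute the mean of all members.
    new_rep = tuple(sum(p[k] for p in updated) / len(updated) for k in range(dims))
    return updated, new_rep, True

members = [(0.0, 0.0), (2.0, 0.0)]
rep = (1.0, 0.0)
near = update_cluster(members, rep, (1.0, 3.0), threshold=4.0)
far = update_cluster(members, rep, (10.0, 10.0), threshold=4.0)
```

The first call accepts the point and moves the representative to (1.0, 1.0); the second call rejects the point, which in a fuller system might instead seed a new cluster.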
INTERACTIVE SYSTEM TO ASSIST A USER IN BUILDING A MACHINE LEARNING MODEL
A method that includes (a) receiving a training dataset, a testing dataset, a number of iterations, and a parameter space of possible parameter values that define a base model, (b) for the number of iterations, performing a parametric search process that produces a report that includes information concerning a plurality of machine learning models, where the parametric search process includes (i) generating a Bayesian optimized parameter space with an option to validate through stratified k-fold cross-validation, where an optimized parameter set includes training data from the training dataset and testing data from the testing dataset, (ii) running the base model with the final optimized parameter set, thus yielding model results for the plurality of machine learning models, (iii) calculating Kolmogorov-Smirnov (KS) statistics for the model results, and (iv) saving the model results and the KS statistics to the report, and (c) sending the report to a user device.
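Step (iii) can be illustrated with a two-sample KS statistic. The abstract does not say which distributions are compared; the sketch below assumes the common model-evaluation usage, where the statistic is the largest gap between the cumulative score distributions of the positive and negative classes. The name `ks_statistic` is hypothetical.

```python
def ks_statistic(scores, labels):
    """Largest gap between the empirical score CDFs of class 1 and class 0."""
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")
    best = 0.0
    i = j = 0
    # Sweep every distinct score as a threshold, advancing both CDFs together.
    for t in sorted(set(scores)):
        while i < len(pos) and pos[i] <= t:
            i += 1
        while j < len(neg) and neg[j] <= t:
            j += 1
        best = max(best, abs(i / len(pos) - j / len(neg)))
    return best

separated = ks_statistic([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
overlapping = ks_statistic([0.1, 0.1, 0.5, 0.5], [0, 1, 0, 1])
```

A value near 1 indicates that the model's scores separate the two classes almost completely, while a value near 0 indicates that the two score distributions are nearly indistinguishable.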
Apparatus and Methods for Improved Subsurface Data Processing Systems
A method and apparatus for subsurface data processing includes determining a set of clusters based at least in part on measurement vectors associated with different depths or times in subsurface data, defining clusters in the subsurface data by classes associated with a state model, reducing a quantity of the subsurface data based at least in part on the classes, and storing the reduced quantity of the subsurface data and classes with the state model in a training database for a machine learning process.
Optimizing training data for image classification
A method for machine learning-based classification may include training a machine learning model with a full training data set, the full training data set comprising a plurality of data points, to generate a first model state of the machine learning model, generating respective embeddings for the data points in the full training data set with the first model state of the machine learning model, applying a clustering algorithm to the respective embeddings to generate one or more clusters of the embeddings, identifying outlier embeddings from the one or more clusters of the embeddings, generating a reduced training data set comprising the full training data set less the data points associated with the outlier embeddings, training the machine learning model with the reduced training data set to a second model state, and applying the second model state to one or more data sets to classify the one or more data sets.
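The outlier-removal step between the two training passes can be sketched as follows. The abstract does not define how outlier embeddings are identified, so this sketch assumes that cluster assignments already exist and flags any point whose distance to its cluster centroid is more than `z_cutoff` standard deviations above that cluster's mean distance. The names `keep_inliers` and `z_cutoff` are hypothetical.

```python
import math
from statistics import mean, pstdev

def keep_inliers(embeddings, cluster_ids, z_cutoff=2.0):
    """Return sorted indices of data points that are not outliers within their cluster."""
    groups = {}
    for idx, cid in enumerate(cluster_ids):
        groups.setdefault(cid, []).append(idx)
    keep = []
    for idxs in groups.values():
        dims = len(embeddings[idxs[0]])
        centroid = [mean(embeddings[i][k] for i in idxs) for k in range(dims)]
        dists = [
            math.sqrt(sum((x - c) ** 2 for x, c in zip(embeddings[i], centroid)))
            for i in idxs
        ]
        mu, sigma = mean(dists), pstdev(dists)
        for i, d in zip(idxs, dists):
            # A cluster with zero spread has no outliers by this criterion.
            if sigma == 0 or (d - mu) / sigma <= z_cutoff:
                keep.append(i)
    return sorted(keep)

# Nine identical embeddings plus one distant point in cluster 0; two points in cluster 1.
embeddings = [(0.0, 0.0)] * 9 + [(10.0, 0.0)] + [(5.0, 5.0), (5.0, 5.0)]
cluster_ids = [0] * 10 + [1] * 2
kept = keep_inliers(embeddings, cluster_ids)
```

The returned indices would select the reduced training data set; index 9, the distant point, is dropped while every other point is retained.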