G06F18/24323

Poisson distribution based approach for bootstrap aggregation in a random forest

Systems, apparatuses and methods may provide for technology that generates inclusion data in accordance with a Poisson distribution, wherein the inclusion data specifies a number of inclusions for each observation in a set of observations. The technology may also train a first decision tree in a random forest based at least in part on the inclusion data.
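
As an illustration only, the inclusion data described above can be sketched as one independent Poisson draw per observation. The rate of 1.0, the function names, and the use of Knuth's sampling method are assumptions for this sketch and are not taken from the abstract.

```python
import math
import random


def poisson_draw(rng, lam=1.0):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def inclusion_counts(n_observations, rng, lam=1.0):
    """Return, for each observation, how many times it is included (assumed lam=1.0)."""
    return [poisson_draw(rng, lam) for _ in range(n_observations)]


rng = random.Random(0)
counts = inclusion_counts(1000, rng)

# Observations with a count of zero are left out of this tree's training sample.
in_bag = [i for i, c in enumerate(counts) if c > 0]
out_of_bag = [i for i, c in enumerate(counts) if c == 0]
```

Because each count is drawn independently, the counts can be generated per observation without first fixing the total sample size, and they could be passed to a tree learner as per-observation weights.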

SYSTEM AND METHOD FOR ASSESSING A CANCER STATUS OF BIOLOGICAL TISSUE

A method for assessing a cancer status of biological tissue includes the steps of: obtaining a Raman spectrum indicating a Raman spectroscopy response of the biological tissue, the Raman spectrum captured using a fiber-optic probe of a fiber-optic Raman spectroscopy system; inputting the Raman spectrum into a boosted tree classification algorithm of a computer program, and using the boosted tree classification algorithm for comparing, in real-time, the captured Raman spectrum to reference data and assessing the cancer status of the biological tissue based on said comparison, the reference data being previously determined based on a set of reference Raman spectra indicating Raman spectroscopy responses of reference biological tissues, wherein each of the reference biological tissues is associated with a known cancer status; and generating a real-time output indicating the assessed cancer status of the biological tissue.

Computer network troubleshooting

A system for troubleshooting network problems is disclosed. A model can use demographic information, network usage information, and network membership information to determine an importance of a problem. The importance of the problem for the user who reported the problem, a number of other users affected by the problem, and the importance of the problem to the other users can be used to determine a priority for resolving the problem. Before and after a work order is executed to resolve the problem, network metrics can be gathered, including aggregate network metrics, and automatically presented in various user interfaces. The analysis of the metrics can be used to update a database of which work orders are assigned in response to which problems.

Fast and accurate rule selection for interpretable decision sets
11704591 · 2023-07-18

An IDS generator determines multiple classes for electronic data items. The IDS generator determines, for each class, a class-specific candidate ruleset. The IDS generator performs a differential analysis of each class-specific candidate ruleset. The differential analysis is based on differences between result values of a scoring objective function. In some cases, the differential analysis determines at least one of the differences based on additional data structures, such as an augmented frequent-pattern tree. A probability function based on the differences is compared to a threshold probability. At least one testing ruleset is modified based on the comparison. The IDS generator determines, for each class, a class-specific optimized ruleset based on the differential analysis of each class-specific candidate ruleset. The IDS generator creates an optimized interpretable decision set based on combined class-specific optimized rulesets for the multiple classes.

Autonomous application of security measures to IoT devices
11706236 · 2023-07-18

Methods and systems for classifying a device on a network. The systems and methods may receive network activity data associated with an unknown device. A classifier executing one or more machine learning models may then classify the device as an internet of things (IoT) device or a non-IoT device.

Analysis of deep-level cause of fault of storage management
11704186 · 2023-07-18

Storage management is performed. For example, a computing device may determine that a fault belongs to one of a plurality of predefined fault categories based on description information of the fault of a storage system. Then, the computing device may determine at least one fault cause associated with the fault category at a first level of a hierarchical structure of predetermined fault causes. Further, the computing device may determine a first fault cause that causes the fault among the at least one fault cause. After that, the computing device may determine a target fault cause at the deepest level that causes the fault based on the first fault cause. As a result, the root cause of a fault of a storage system may be accurately and efficiently determined, thereby providing the possibility of fundamentally eliminating the fault.
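
The descent through a hierarchy of fault causes might look roughly like the sketch below. Everything in it is hypothetical: the `CauseNode` type, the keyword-based `categorize` function, the evidence dictionary, and the example causes are invented for illustration, and the abstract does not say how a category or a cause is actually selected.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class CauseNode:
    """One fault cause in a hypothetical hierarchy of predetermined causes."""
    name: str
    matches: Callable[[dict], bool]
    children: List["CauseNode"] = field(default_factory=list)


def categorize(description: str) -> Optional[str]:
    """Map a fault description to a category by keyword (assumed rule)."""
    keywords = {"timeout": "io_error", "checksum": "data_corruption"}
    for word, category in keywords.items():
        if word in description.lower():
            return category
    return None


def deepest_cause(node: CauseNode, evidence: dict) -> CauseNode:
    """Follow matching children until no child matches the evidence."""
    current = node
    while True:
        nxt = next((c for c in current.children if c.matches(evidence)), None)
        if nxt is None:
            return current
        current = nxt


def find_root_cause(description: str, hierarchy: Dict[str, List[CauseNode]],
                    evidence: dict) -> Optional[CauseNode]:
    """Pick the first-level cause that matches, then descend to the deepest level."""
    category = categorize(description)
    for first in hierarchy.get(category, []):
        if first.matches(evidence):
            return deepest_cause(first, evidence)
    return None


firmware = CauseNode("outdated drive firmware", lambda e: e.get("firmware") == "old")
disk = CauseNode("disk failure", lambda e: e.get("smart_errors", 0) > 0, [firmware])
link = CauseNode("network path down", lambda e: e.get("link_up") is False)
hierarchy = {"io_error": [link, disk]}

root = find_root_cause("I/O timeout on volume 3", hierarchy,
                       {"smart_errors": 4, "firmware": "old", "link_up": True})
```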

Identifying ground types from interpolated covariates

A system and method for identifying ground types from one or more interpolated covariates. The method proceeds by accessing soil composition information for plots of land, in which the soil composition information includes measured soil sample results, environmental results, soil conductivity results or any combination thereof. The method continues by identifying covariates from the soil composition information. Subsequently, the method interpolates covariates associated with different locations with an interpolation training model. Voxels are generated that are each associated with interpolated covariates having a corresponding geographical location. The method trains a random forest training model with the interpolated covariates. The voxels traverse the trained random forest model to identify clusters of voxels that are co-associated. The method identifies a ground type by combining the co-associated clusters. Each ground type is associated with a crop zone, a soil fertility, or a farm management recommendation.

SYSTEMS AND METHODS FOR AUTOMATICALLY DERIVING DATA TRANSFORMATION CRITERIA

Systems, apparatuses, methods, and computer program products are disclosed for automatically deriving data transformation criteria. An example method includes receiving, by communications circuitry, a source dataset and a target dataset and identifying, by a model generator, a target variable. The example method further includes training, by the model generator, a decision tree for the target variable using the source dataset and the target dataset such that the trained decision tree can predict a value for the target variable from new source data. The example method further includes deriving, by a derivation engine, a set of parameters and pseudocode for producing the target variable from the source dataset.
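
To make the idea concrete, the sketch below fits a one-split decision tree (a stump) that reproduces a target column from a single numeric source column, then emits the learned parameters as pseudocode. It is a simplified stand-in under stated assumptions: the column names `amount` and `tier`, the example data, and both function names are hypothetical, and a real system would presumably handle many columns and deeper trees.

```python
def fit_stump(source_values, target_labels):
    """Find the threshold on one source column that best reproduces the target."""
    pairs = sorted(zip(source_values, target_labels))
    labels = sorted(set(target_labels))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no split point between equal source values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [t for s, t in pairs if s <= threshold]
        right = [t for s, t in pairs if s > threshold]
        # Predict the majority label on each side of the split.
        left_label = max(labels, key=left.count)
        right_label = max(labels, key=right.count)
        correct = left.count(left_label) + right.count(right_label)
        if best is None or correct > best[0]:
            best = (correct, threshold, left_label, right_label)
    if best is None:
        raise ValueError("source column has no usable split point")
    _, threshold, left_label, right_label = best
    return {"threshold": threshold, "if_le": left_label, "if_gt": right_label}


def to_pseudocode(params, source_name, target_name):
    """Render the learned split as a one-line transformation rule."""
    return (f"IF {source_name} <= {params['threshold']} "
            f"THEN {target_name} = {params['if_le']!r} "
            f"ELSE {target_name} = {params['if_gt']!r}")


# Hypothetical source column and the target column it should produce.
amounts = [10, 20, 30, 200, 300, 400]
tiers = ["low", "low", "low", "high", "high", "high"]

params = fit_stump(amounts, tiers)
rule = to_pseudocode(params, "amount", "tier")
```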

Techniques for determining artificial neural network topologies

Various embodiments are generally directed to techniques for determining artificial neural network topologies, such as by utilizing probabilistic graphical models. Some embodiments are particularly related to determining neural network topologies by bootstrapping a graph, such as a probabilistic graphical model, into a multi-graphical model, or graphical model tree. Various embodiments may include logic to determine a collection of sample sets from a dataset. In various such embodiments, each sample set may be drawn randomly from the dataset with replacement between drawings. In some embodiments, logic may partition a graph into multiple subgraph sets based on each of the sample sets. In several embodiments, the multiple subgraph sets may be scored, such as with Bayesian statistics, and selected amongst as part of determining a topology for a neural network.
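
The sampling step alone, drawing each sample set from the dataset with replacement, can be sketched as follows. The function name, the choice to make each sample set the same size as the dataset, and the toy data are assumptions for this sketch; the graph partitioning and scoring steps are not shown.

```python
import random


def bootstrap_sample_sets(dataset, n_sets, rng):
    """Draw n_sets sample sets, each sampled with replacement from dataset."""
    size = len(dataset)  # assumed: each sample set is as large as the dataset
    return [rng.choices(dataset, k=size) for _ in range(n_sets)]


rng = random.Random(42)
dataset = list(range(20))
sample_sets = bootstrap_sample_sets(dataset, n_sets=5, rng=rng)
```

Because draws are made with replacement, a sample set will usually repeat some records and omit others, which is what makes the sample sets differ from one another.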

Method and process for predicting and analyzing patient cohort response, progression, and survival

A system and method for analyzing a data store of de-identified patient data to generate one or more dynamic user interfaces usable to predict an expected response of a particular patient population or cohort when provided with a certain treatment. The automated analysis of patterns occurring in patient clinical, molecular, phenotypic, and response data, as facilitated by the various user interfaces, provides an efficient, intuitive way for clinicians to evaluate large data sets to aid in the potential discovery of insights of therapeutic significance.