Patent classifications
G06F16/2237
Message Object Traversal In High-Performance Network Messaging Architecture
A communications system implements instructions including maintaining a message object that includes an array of entries. Each entry of the array includes a field identifier, a data type, and a next entry pointer. The next entry pointers and a head pointer establish a linked list of entries. The instructions include, in response to a request to add a new entry to the message object, calculating an index based on a field identifier of the new entry and determining whether the entry at the calculated index within the array of entries is active. The instructions include, if the entry is inactive, writing a data type, field identifier, and data value of the new entry to the calculated index, and inserting the new entry into the linked list. The instructions include, if the entry is already active, selectively expanding the size of the array and repeating the calculating and determining.
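The add-entry flow described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the claimed implementation: the index is assumed to be `field_id % capacity`, expansion is assumed to double the array and re-place existing entries, new entries are assumed to be appended at the tail of the linked list, and re-adding an existing field identifier is assumed to overwrite it. The names `Entry`, `MessageObject`, `add`, and `fields` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Iterator, List, Optional, Tuple


@dataclass
class Entry:
    field_id: int = 0
    data_type: str = ""
    value: Any = None
    next_index: Optional[int] = None  # "next entry pointer", stored as an array index
    active: bool = False


class MessageObject:
    def __init__(self, capacity: int = 4) -> None:
        self.entries: List[Entry] = [Entry() for _ in range(capacity)]
        self.head: Optional[int] = None  # head pointer of the linked list
        self.tail: Optional[int] = None  # kept so that appends are O(1)

    @staticmethod
    def _index(field_id: int, capacity: int) -> int:
        # Assumed index calculation; the abstract does not specify one.
        return field_id % capacity

    def fields(self) -> Iterator[Tuple[int, str, Any]]:
        """Traverse active entries by following the linked list from the head."""
        i = self.head
        while i is not None:
            e = self.entries[i]
            yield e.field_id, e.data_type, e.value
            i = e.next_index

    def _expand(self) -> None:
        # Double the array until every existing entry maps to its own slot,
        # then rebuild the array and the linked list in the original order.
        old = list(self.fields())
        capacity = len(self.entries) * 2
        while len({self._index(f, capacity) for f, _, _ in old}) < len(old):
            capacity *= 2
        self.entries = [Entry() for _ in range(capacity)]
        self.head = self.tail = None
        for f, t, v in old:
            self.add(f, t, v)

    def add(self, field_id: int, data_type: str, value: Any) -> None:
        while True:
            i = self._index(field_id, len(self.entries))
            slot = self.entries[i]
            if not slot.active:
                break  # inactive slot: safe to write here
            if slot.field_id == field_id:
                slot.data_type, slot.value = data_type, value  # assumed overwrite
                return
            self._expand()  # slot taken by another field: grow and retry
        slot.field_id, slot.data_type, slot.value = field_id, data_type, value
        slot.active = True
        if self.head is None:
            self.head = i
        else:
            self.entries[self.tail].next_index = i
        self.tail = i
```

In this sketch a lookup by field identifier is a single index calculation, while `fields()` visits only active entries and skips the empty slots of the array.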
MODELING TECHNIQUES TO CLASSIFY DATA SETS CONTAINING PERSONAL IDENTIFIABLE INFORMATION COMPRISING NUMERICAL IDENTIFIERS
Modeling techniques to classify data sets containing personal identifiable information (PII) comprising identifiers are provided. In one technique, multiple data sets are identified, each data set containing identifiers that were generated by a computer system and that qualify as PII of a known identifier (ID) type. For each of the multiple data sets, a model is generated based on that data set and added to a set of models. A target data set that contains identifiers that were generated by the computer system and that qualify as PII of an unknown ID type is identified. A target model is generated based on the target data set. For at least one model in the set of models, a similarity operation of that model and the target model is performed. Based on the similarity operation, it is determined whether to associate the ID type of that model with the target data set.
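The model-and-compare flow above can be illustrated with a small Python sketch. The abstract does not say what kind of model or similarity operation is used; here each model is assumed to be a frequency profile over identifier length and two-character prefix, and the similarity operation is assumed to be cosine similarity with a fixed threshold. The function names and the threshold value are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

Model = Dict[Tuple[str, object], float]


def build_model(identifiers: Iterable[str]) -> Model:
    """Assumed model: relative frequency of each length and 2-character prefix."""
    ids = list(identifiers)
    counts: Counter = Counter()
    for ident in ids:
        counts[("len", len(ident))] += 1
        counts[("prefix", ident[:2])] += 1
    return {key: n / len(ids) for key, n in counts.items()}


def cosine(a: Model, b: Model) -> float:
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(target_ids: Iterable[str],
             known_models: Dict[str, Model],
             threshold: float = 0.8) -> Optional[str]:
    """Return the ID type of the most similar known model, or None if no
    model is similar enough to the target model."""
    target = build_model(target_ids)
    best_type, best_score = None, 0.0
    for id_type, model in known_models.items():
        score = cosine(model, target)
        if score > best_score:
            best_type, best_score = id_type, score
    return best_type if best_score >= threshold else None


# Illustrative data sets of known ID type (values are made up).
known = {
    "member_id": build_model(["70012345", "70098765", "70055555"]),
    "account_id": build_model(["3000000001", "3000000002"]),
}
```

Calling `classify(["70011111", "70022222"], known)` associates the target data set with the `member_id` type in this sketch, because its length and prefix profile matches that model.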
Reconstructing deduplicated data
A system and method for efficiently storing data in a storage system. A data storage subsystem includes multiple data storage locations on multiple storage devices in addition to at least one mapping table. A data storage controller determines whether data to store in the storage subsystem has one or more patterns of data intermingled with non-pattern data within an allocated block. Rather than store the one or more patterns on the storage devices, the controller stores information in a header on the storage devices. The information includes at least an offset for the first instance of a pattern, a pattern length, and an identification of the pattern. The data may be reconstructed for a corresponding read request from the information stored in the header.
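The header-based scheme above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the pattern table `PATTERNS`, the minimum run length `MIN_RUN`, and the restriction to single-byte repeating patterns are hypothetical choices, and each header record is assumed to be an (offset, length, pattern id) tuple.

```python
from typing import List, Tuple

# Hypothetical pattern table: pattern id -> repeating byte.
PATTERNS = {0: b"\x00", 1: b"\xff"}
MIN_RUN = 4  # assumed: shorter runs are stored as ordinary data

Header = List[Tuple[int, int, int]]  # (offset, length, pattern id)


def encode_block(block: bytes) -> Tuple[Header, bytes]:
    """Split a block into header records for pattern runs plus the
    remaining non-pattern bytes, which are stored as-is."""
    header: Header = []
    literal = bytearray()
    i = 0
    while i < len(block):
        pid = next((p for p, b in PATTERNS.items() if block[i:i + 1] == b), None)
        j = i
        if pid is not None:
            while j < len(block) and block[j] == block[i]:
                j += 1
        if pid is not None and j - i >= MIN_RUN:
            header.append((i, j - i, pid))  # describe the run, do not store it
            i = j
        else:
            literal.append(block[i])
            i += 1
    return header, bytes(literal)


def decode_block(header: Header, literal: bytes) -> bytes:
    """Rebuild the original block for a read request from the header
    records and the stored non-pattern bytes."""
    out = bytearray()
    used = 0
    for offset, length, pid in header:
        gap = offset - len(out)  # non-pattern bytes that precede this run
        out += literal[used:used + gap]
        used += gap
        out += PATTERNS[pid] * length
    out += literal[used:]
    return bytes(out)
```

Only the header records and the non-pattern bytes would be written to the storage devices in this sketch; the pattern runs are regenerated when the block is read.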
Storage of Data Objects with a Common Trait in a Storage Network
A method includes identifying an independent data object of a plurality of independent data objects for retrieval from dispersed storage network (DSN) memory. The method further includes determining a mapping of the plurality of independent data objects into a data matrix, wherein the mapping is in accordance with a dispersed storage error encoding function. The method further includes identifying, based on the mapping, an encoded data slice of a set of encoded data slices corresponding to the independent data object. The method further includes sending a retrieval request to a storage unit of the DSN memory regarding the encoded data slice. When the encoded data slice is received, the method further includes decoding the encoded data slice in accordance with the dispersed storage error encoding function and the mapping to reproduce the independent data object.
USING CONDENSED BITMAP REPRESENTATION FOR FILTERING OF DATASETS
Disclosed are techniques for increasing the speed of pairwise comparison operations in a database system. In an embodiment, a method is disclosed comprising receiving a network request identifying a user; identifying a plurality of segments associated with the user; loading a plurality of bitmaps associated with the plurality of segments, each bitmap in the plurality of bitmaps representing a set of users associated with a segment; comparing pairs of bitmaps from the plurality of bitmaps to generate a set of overlaps; filtering the plurality of segments based on the set of overlaps to generate an anonymized set of segments; and returning aggregated data associated with the anonymized set of segments in response to the network request.
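The pairwise bitmap comparison above can be sketched in Python using plain integers as bitmaps. The abstract does not state the filtering rule; this sketch assumes that a segment is dropped when it takes part in any pair whose overlap is non-empty but smaller than a minimum size, since such a small intersection could single out individual users. The names `to_bitmap`, `anonymized_segments`, and `min_overlap` are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


def to_bitmap(user_ids: Iterable[int]) -> int:
    """Represent a set of users as an integer with one bit per user id."""
    bitmap = 0
    for uid in user_ids:
        bitmap |= 1 << uid
    return bitmap


def popcount(bitmap: int) -> int:
    return bin(bitmap).count("1")


def anonymized_segments(
    segment_bitmaps: Dict[str, int], min_overlap: int
) -> Tuple[List[str], Dict[Tuple[str, str], int]]:
    """Compare every pair of segment bitmaps with a bitwise AND and drop
    segments involved in a small, potentially identifying overlap."""
    names = sorted(segment_bitmaps)
    overlaps = {
        (a, b): popcount(segment_bitmaps[a] & segment_bitmaps[b])
        for a, b in combinations(names, 2)
    }
    risky = set()
    for (a, b), size in overlaps.items():
        if 0 < size < min_overlap:  # assumed filtering rule
            risky.update((a, b))
    return [name for name in names if name not in risky], overlaps


# Illustrative segments for one requesting user (data is made up).
segments = {
    "sports": to_bitmap([1, 2, 3, 4]),
    "news": to_bitmap([2, 3, 4, 5]),
    "rare": to_bitmap([5, 9]),
}
kept, overlaps = anonymized_segments(segments, min_overlap=2)
```

Each comparison is a single AND followed by a bit count, which is what makes the pairwise step cheap relative to comparing explicit user lists.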
MERGING A MATRIX USER STRUCTURE INTO A MULTILINE USER STRUCTURE
Disclosed herein is a system and method that allow any existing Matrix multi-level marketing (MLM) system to be merged into a Multiline MLM system. Further, the existing MLM members have full access to the Multiline MLM commission structure; for example, a member of a Matrix MLM will maintain their existing lines and downlines.
Feature-agnostic behavior profile based anomaly detection
Techniques for user behavior anomaly detection are disclosed. At least one low-variance characteristic is compared to an expected result for the corresponding low-variance characteristic to determine whether the low-variance characteristic is within a first pre-selected range of the expected result. A security response action is taken in response to the low-variance characteristic not being within the first pre-selected range. At least one high-variance characteristic is compared to an expected result for the corresponding high-variance characteristic to determine whether the high-variance characteristic is within a second pre-selected range of the expected result. A security response action is taken in response to the high-variance characteristic not being within the second pre-selected range. Access is provided if the low-variance and the high-variance characteristics are within the respective expected ranges.
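The two-stage range check above can be sketched in Python. This is a minimal illustration, assuming that each characteristic is numeric and that a "pre-selected range" is a symmetric tolerance around the expected value; the profile layout, the characteristic names, and the return strings are hypothetical.

```python
from typing import Dict, Tuple

# Each characteristic maps to (expected value, allowed deviation).
Profile = Dict[str, Dict[str, Tuple[float, float]]]


def within(observed: float, expected: float, tolerance: float) -> bool:
    return abs(observed - expected) <= tolerance


def evaluate(observed: Dict[str, float], profile: Profile) -> str:
    """Check low-variance characteristics first, then high-variance ones.
    Any characteristic outside its range triggers a security response."""
    for group in ("low_variance", "high_variance"):
        for name, (expected, tolerance) in profile[group].items():
            if not within(observed[name], expected, tolerance):
                return "security_response:" + name
    return "access_granted"


# Illustrative behavior profile (values are made up).
profile: Profile = {
    "low_variance": {"login_hour": (9.0, 2.0)},        # tight range
    "high_variance": {"mb_downloaded": (200.0, 500.0)},  # wide range
}
```

The check never inspects what a characteristic means, only its expected value and tolerance, which is one way to read the "feature-agnostic" framing of the title.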
Data model generation using generative adversarial networks
Methods for generating data models using a generative adversarial network can begin by receiving a data model generation request by a model optimizer from an interface. The model optimizer can provision computing resources with a data model. As a further step, a synthetic dataset for training the data model can be generated using a generative network of a generative adversarial network, the generative network trained to generate output data differing at least a predetermined amount from a reference dataset according to a similarity metric. The computing resources can train the data model using the synthetic dataset. The model optimizer can evaluate performance criteria of the data model and, based on the evaluation of the performance criteria of the data model, store the data model and metadata of the data model in a model storage. The data model can then be used to process production data.
Ensuring integrity of records in a not only structured query language database
A method, computer system, and a computer program product for ensuring integrity of records in a NoSQL database including a first table and a second table is provided. The present invention may include the first table having first records representing respective first entities and the second table having second records representing respective second entities. The present invention may include using a hash table associating each second entity of the second table with the respective hash or summary hash values of first records for reading the second records of the second table.
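One possible reading of the hash-table check above is sketched below in Python. The abstract leaves the details open, so this rests on several assumptions: each second record is assumed to reference one first record through a field, the hash table is assumed to map each second-record key to a SHA-256 hash of the referenced first record, and a read of a second record is assumed to fail when that hash no longer matches. The table contents and function names are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict

Table = Dict[str, Dict[str, Any]]


def record_hash(record: Dict[str, Any]) -> str:
    """Stable hash of a record: serialize with sorted keys, then SHA-256."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_hash_table(first: Table, second: Table, ref_field: str) -> Dict[str, str]:
    """Associate each second record with the hash of its referenced first record."""
    return {key: record_hash(first[rec[ref_field]]) for key, rec in second.items()}


def read_second(key: str, first: Table, second: Table,
                hash_table: Dict[str, str], ref_field: str) -> Dict[str, Any]:
    """Return a second record only if its referenced first record is unchanged."""
    record = second[key]
    referenced = first.get(record[ref_field])
    if referenced is None or record_hash(referenced) != hash_table[key]:
        raise ValueError("integrity check failed for " + key)
    return record


# Illustrative tables (data is made up).
customers: Table = {"c1": {"name": "Ada", "tier": "gold"}}
orders: Table = {"o1": {"customer": "c1", "total": 30}}
hashes = build_hash_table(customers, orders, "customer")
```

Because a NoSQL store typically does not enforce referential constraints itself, a check of this kind moves the integrity test to read time.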
METADATA CLASSIFICATION
Systems and methods are disclosed that retrieve data from a data set organized in a plurality of columns. For each column in the plurality of columns, the systems and methods generate one or more candidate semantic categories for the column, where each of the one or more candidate semantic categories has a corresponding probability. The systems and methods create a feature vector for the column from the one or more candidate semantic categories and the corresponding probabilities. The systems and methods determine a semantic category type of the column based on the feature vector. The systems and methods anonymize the data in the column based on the semantic category type, which includes replacing more specific data in the column with less specific data based on a data hierarchy that relates the more specific data to the less specific data.
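The column pipeline above (candidate categories, feature vector, category decision, hierarchy-based anonymization) can be sketched in Python. Everything concrete here is an assumption made for illustration: the category list, the regular-expression detectors, the use of match rates as probabilities, the highest-probability decision rule, and the small city-to-country hierarchy.

```python
import re
from typing import Dict, List, Optional

CATEGORIES = ["email", "zip_code", "city"]  # hypothetical category set
DETECTORS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[a-z]+$"),
    "zip_code": re.compile(r"^\d{5}$"),
}
# Hypothetical data hierarchy: more specific value -> less specific value.
CITY_TO_COUNTRY = {"Paris": "France", "Lyon": "France", "Berlin": "Germany"}


def candidate_categories(values: List[str]) -> Dict[str, float]:
    """Assumed probability: fraction of values that match each category."""
    probs = {
        cat: sum(bool(rx.match(v)) for v in values) / len(values)
        for cat, rx in DETECTORS.items()
    }
    probs["city"] = sum(v in CITY_TO_COUNTRY for v in values) / len(values)
    return probs


def feature_vector(probs: Dict[str, float]) -> List[float]:
    return [probs.get(cat, 0.0) for cat in CATEGORIES]


def classify(vector: List[float], threshold: float = 0.5) -> Optional[str]:
    """Assumed decision rule: highest-probability category above a threshold."""
    best = max(range(len(vector)), key=vector.__getitem__)
    return CATEGORIES[best] if vector[best] >= threshold else None


def anonymize(values: List[str], category: Optional[str]) -> List[str]:
    """Replace specific values with less specific ones for the category."""
    if category == "city":
        return [CITY_TO_COUNTRY.get(v, "unknown") for v in values]
    if category == "zip_code":
        return [v[:3] + "**" for v in values]  # keep only the 3-digit prefix
    if category == "email":
        return ["<email>" for _ in values]
    return list(values)


def process_column(values: List[str]) -> List[str]:
    return anonymize(values, classify(feature_vector(candidate_categories(values))))
```

A production system would likely feed the feature vector to a trained classifier rather than take the maximum; the sketch keeps the same data flow with the simplest possible decision step.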