Patent classifications
G06F2207/025
AUTOMATIC DETECTION OF PERSONAL IDENTIFIABLE INFORMATION
Described herein are example implementations for the automatic detection and handling of personal identifiable information (PII) in electronic records. In some aspects, a system receives one or more computer readable logs of information for one or more computer services, with each log including a string of characters. The system performs one or more string search algorithm based operations on the entirety of the one or more strings of the one or more computer readable logs to identify a range of the one or more strings to be searched for PII that is less than the entirety of the one or more strings. The system also performs one or more regular expression algorithm based operations on the range of the one or more strings to identify one or more instances of PII. The system generates and outputs an indication of the one or more instances of the PII that are identified.
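The two-stage idea in this abstract (a cheap string search over the whole log line to narrow the range, then regular expressions on that range only) might be sketched as follows. The marker keywords, the regular expressions, and the 40-character window are illustrative assumptions, not details taken from the patent.

```python
import re

# Hypothetical markers and patterns, chosen only for illustration.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}
WINDOW = 40  # assumed number of characters scanned after each marker


def find_pii(log_line):
    """Return (kind, matched_text) pairs found near marker keywords."""
    hits = []
    for kind, pattern in PATTERNS.items():
        # Stage 1: plain substring search over the entire line.
        start = log_line.find(kind)
        while start != -1:
            end = min(len(log_line), start + len(kind) + WINDOW)
            # Stage 2: the regex runs only on the narrowed range.
            for match in pattern.finditer(log_line, start, end):
                hits.append((kind, match.group()))
            start = log_line.find(kind, start + 1)
    return hits
```

A line without any marker keyword is never handed to the regex engine, which is where the saving would come from in this sketch.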
Determination device
A determination device 1 includes: a history storage unit 101 configured to accumulate and store use history information in which application identification information, position information indicating a position of a user, and user identification information are correlated with each other; a use region UU number totaling unit 105 configured to total a UU number for each of a plurality of regions on the basis of the use history information for a specific application; a use region number totaling unit 106 configured to total the number of regions on the basis of the use history information for the specific application; a score calculating unit 107 configured to calculate a regionality score for each of the plurality of regions on the basis of the UU number and the number of regions for the specific application; and a determination unit 108 configured to determine whether there is regionality for the specific application on the basis of the regionality score.
Fuzzy string alignment
A method includes computing multiple term distances between pairs of multiple first string terms in a first string and multiple second string terms in a second string, generating a cost matrix based on the term distances, and selecting a set of candidate alignments based on the cost matrix. The method further includes generating multiple alignment scores for the set of candidate alignments, and selecting, from the set of candidate alignments, an alignment between the first string and the second string based on the alignment scores. The method further includes outputting a match identifier based on the alignment.
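As a rough illustration of these steps, the sketch below builds a cost matrix of term distances, enumerates candidate alignments, scores each one, and keeps the best. The distance measure (`difflib.SequenceMatcher`), the brute-force enumeration, and the summed-cost score are assumptions made for the example; the abstract does not specify any of them.

```python
from difflib import SequenceMatcher
from itertools import permutations


def term_distance(a, b):
    """Distance in [0, 1]; 0 means the two terms are identical."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def align(first, second):
    """Return (pairs, score) for the lowest-cost one-to-one term alignment."""
    xs, ys = first.lower().split(), second.lower().split()
    swapped = len(xs) > len(ys)
    if swapped:  # keep xs as the shorter term list
        xs, ys = ys, xs
    # Cost matrix: cost[i][j] is the distance between xs[i] and ys[j].
    cost = [[term_distance(x, y) for y in ys] for x in xs]
    best_score, best_perm = float("inf"), ()
    # Candidate alignments: every assignment of xs terms to distinct ys terms.
    # This is exponential, so it only suits short strings.
    for perm in permutations(range(len(ys)), len(xs)):
        score = sum(cost[i][j] for i, j in enumerate(perm))
        if score < best_score:
            best_score, best_perm = score, perm
    pairs = [(xs[i], ys[j]) for i, j in enumerate(best_perm)]
    if swapped:
        pairs = [(y, x) for x, y in pairs]
    return pairs, best_score
```

For example, `align("Acme Corp Ltd", "ltd acme corporation")` pairs `acme` with `acme`, `corp` with `corporation`, and `ltd` with `ltd`, regardless of term order. A match identifier could then be derived by thresholding the score.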
METHOD FOR AUTOMATICALLY MATCHING CHART NAMES
A method for correlating a first item to a second item is disclosed. The method includes detecting a first identifier associated with the first item and detecting a second identifier associated with the second item. The method further includes simplifying the first identifier and executing a matching procedure that compares the first identifier to the second identifier and generates a match value between them. The method also includes reporting a correlation between the first item and the second item based on the match value. The method may be performed by a system that includes a controller, one or more processors, and memory.
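A minimal sketch of the simplify-then-match flow is shown below. The simplification rule (lowercase, drop everything except letters and digits), the use of `difflib.SequenceMatcher` for the match value, and the choice to simplify both identifiers are assumptions for illustration; the abstract only mentions simplifying the first identifier.

```python
import re
from difflib import SequenceMatcher


def simplify(identifier):
    """Lowercase the identifier and strip everything but letters and digits."""
    return re.sub(r"[^a-z0-9]+", "", identifier.lower())


def match_value(first, second):
    """Similarity in [0, 1] between the simplified identifiers."""
    return SequenceMatcher(None, simplify(first), simplify(second)).ratio()
```

A correlation could then be reported whenever `match_value` exceeds a chosen threshold.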
Text file binary search device, search method, program, and information recording medium
A device searches a file being recorded that includes lines sorted in accordance with keys included in the lines to find a line that matches a pattern. When the device receives a pattern, it initializes upper and lower limits of a search range and calculates a middle position between the limits. It acquires, from the file, a middle line that starts at or before the middle position and ends after it. If the key included in the middle line matches the pattern, it outputs the middle line. Otherwise, it resets the upper or lower limit based on whether the key included in the middle line is greater or less than the pattern and, if there is a distance greater than a length of a newline between the limits, repeats the procedure starting from the calculation of the middle position. If no such distance remains, it outputs a result to the effect that no matching line has been found.
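The search described here can be approximated with a byte-offset binary search over a sorted text file. The sketch below assumes exact key equality rather than general pattern matching, keys that end at the first tab, bytewise sort order, and a file that is not being appended to during the search; none of these details come from the abstract.

```python
import os


def _line_start(f, pos, floor, block=64):
    """Offset of the first byte of the line that contains byte `pos`."""
    while pos > floor:
        begin = max(floor, pos - block)
        f.seek(begin)
        chunk = f.read(pos - begin)
        idx = chunk.rfind(b"\n")
        if idx != -1:
            return begin + idx + 1
        pos = begin
    return floor


def find_line(path, key):
    """Return the line whose key equals `key` (bytes), or None."""
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)  # both always sit on line boundaries
        while lo < hi:
            mid = (lo + hi) // 2
            start = _line_start(f, mid, lo)
            f.seek(start)
            line = f.readline()  # the "middle line"
            line_key = line.rstrip(b"\r\n").split(b"\t", 1)[0]
            if line_key == key:
                return line
            if line_key < key:
                lo = start + len(line)  # continue after the middle line
            else:
                hi = start  # continue before the middle line
    return None
```

Because each step reads only one line plus a short backward scan, the number of reads grows roughly with the logarithm of the file size.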
SEMICONDUCTOR DEVICE
In a semiconductor device, an arithmetic circuit of a chip on a first stage performs a predetermined arithmetic operation on an input N-bit (N = 4) selection signal. Similarly, an arithmetic circuit of each of chips on second and subsequent stages among chips on a total of M stages (M > N ≥ 2, M = 16) performs a predetermined common arithmetic operation on an operation result of the arithmetic circuit of the chip on the preceding stage. A determination circuit provided in each chip performs a predetermined common logic operation on a bit string of the N-bit signal, which is the operation result of the corresponding arithmetic circuit, thereby determining whether it is the chip selected by the selection signal.
Multipattern regular expression search systems and methods therefor
A tool, system, and method for searching input data includes a pattern input module configured to receive regular expression patterns of symbols. An interpreter module may be configured to access individual ones of the symbols of the input data and, upon accessing each symbol, compare at least one thread against the symbol. For each pattern, the at least one thread corresponding to the pattern is compared against the symbol prior to the at least one thread being compared against a subsequent symbol of the input data. An output module may be configured to output an indication of ones of the patterns determined to be contained within the input data based on the comparison of the corresponding at least one thread to the symbols of the input data.
Sensitive Data Evaluation
Evaluating risk of sensitive data associated with a target data set includes a computer system receiving a pattern that defines sensitive data and a selection of a data set as the target data set for evaluating. The system determines portions of the target data set from which to select sample data sets and determines, responsive to a confidence limit and sizes of the respective portions of the target data, a size of a sample data set for each respective target data set portion. The system randomly samples the target data set portions to provide sample data sets of the determined sample data set sizes and determines whether there is an occurrence of the sensitive data in each sample data set by searching for the pattern in the sample data sets. The system determines a proportion of the sample data sets that have the occurrence of the sensitive data.
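One way to picture the sampling step is sketched below. The sample-size rule used here (Cochran's formula with a finite population correction, at worst-case proportion 0.5) and the 5% margin of error are assumptions chosen for the example; the abstract only says the size depends on a confidence limit and the portion size.

```python
import math
import random
import re
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.05):
    """Records to sample from a portion holding `population` records."""
    if population == 0:
        return 0
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = z * z * 0.25 / margin ** 2  # infinite-population sample size
    return min(population, math.ceil(n0 / (1 + (n0 - 1) / population)))


def sensitive_proportion(portions, pattern, seed=0):
    """Fraction of sampled portions in which `pattern` occurs at least once."""
    rng = random.Random(seed)
    regex = re.compile(pattern)
    flagged = 0
    for records in portions:
        sample = rng.sample(records, sample_size(len(records)))
        if any(regex.search(record) for record in sample):
            flagged += 1
    return flagged / len(portions)
```

Under these assumptions a portion of 1000 records yields a sample of 278, while very small portions are sampled in full.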
Network key value indexing design
Provided is a method of indexing in a network key value indexing system. The method includes retrieving a first key name from a storage device of the network key value indexing system, the first key name identifying a first prefix, a first bucket, and a first key, the first prefix indicating the first bucket, parsing the first key name into the first prefix, the first bucket, and the first key, determining the first prefix, the first bucket, and the first key based on a first delimiter, and generating a hash table in a memory cache of the network key value indexing system to associate the first prefix with the first key.
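The parsing and indexing steps might look like the following sketch. The `prefix/bucket/key` layout, the `/` delimiter, and the use of an in-memory `dict` as the hash table are hypothetical; the abstract does not fix a key name format.

```python
from collections import defaultdict


def build_index(key_names, delimiter="/"):
    """Map each prefix to the (bucket, key) pairs parsed from the key names."""
    index = defaultdict(list)
    for name in key_names:
        # Split on the first two delimiters only, so the key itself
        # may still contain the delimiter.
        prefix, bucket, key = name.split(delimiter, 2)
        index[prefix].append((bucket, key))
    return dict(index)
```

Looking up a prefix in the resulting table then returns every key stored under it without rescanning the stored key names.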
Sensitive data evaluation
Evaluating risk of sensitive data associated with a target data set includes a computer system receiving a pattern that defines sensitive data and a selection of a data set as the target data set for evaluating. The system determines portions of the target data set from which to select sample data sets and determines, responsive to a confidence limit and sizes of the respective portions of the target data, a size of a sample data set for each respective target data set portion. The system randomly samples the target data set portions to provide sample data sets of the determined sample data set sizes and determines whether there is an occurrence of the sensitive data in each sample data set by searching for the pattern in the sample data sets. The system determines a proportion of the sample data sets that have the occurrence of the sensitive data.