Patent classifications
G06F7/16
VECTORIZED SORTED-SET INTERSECTION USING CONFLICT-DETECTION SIMD INSTRUCTIONS
Vectorized sorted-set intersection is performed using conflict-detection single instruction, multiple data (SIMD) instructions. A first ordered subset of values of a first ordered set of distinct values and a second ordered subset of values of a second ordered set of distinct values are loaded into a register. A first value in the register that matches another value in the register (i.e., a common value) is identified by performing a SIMD instruction. The first value is then stored in a result set representing a merge-sort result set between the first ordered set of distinct values and the second ordered set of distinct values.
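The block-wise intersection described in this abstract can be sketched in scalar Python. This is only an illustration under stated assumptions: `conflict_mask` emulates in software what a conflict-detection instruction would report per lane, and the names `conflict_mask`, `intersect_sorted`, and the `half` parameter (lanes taken from each set) are hypothetical and do not come from the patent.

```python
def conflict_mask(register):
    """Emulated conflict detection: flag lane i if it equals any earlier lane."""
    return [any(register[j] == register[i] for j in range(i))
            for i in range(len(register))]


def intersect_sorted(a, b, half=4):
    """Intersect two sorted lists of distinct values, one block pair at a time.

    `half` is a hypothetical lane count taken from each input per step.
    """
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        block_a = a[i:i + half]
        block_b = b[j:j + half]
        # "Load" both ordered subsets into one register.
        register = block_a + block_b
        # Each input holds distinct values, so a conflict can only be a
        # value of block_b that also occurs in block_a: a common value.
        flags = conflict_mask(register)
        result.extend(v for v, hit in zip(register, flags) if hit)
        # Advance the block whose largest value is smaller (both on a tie).
        a_max, b_max = block_a[-1], block_b[-1]
        if a_max <= b_max:
            i += len(block_a)
        if b_max <= a_max:
            j += len(block_b)
    return result
```

For example, `intersect_sorted([1, 3, 5, 7, 9, 11, 13], [2, 3, 4, 7, 8, 13, 20, 21, 22])` returns `[3, 7, 13]`. A real implementation would replace `conflict_mask` with a hardware conflict-detection instruction and a compress-store of the flagged lanes.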
Method of and system for clustering documents
There is provided a method and a system for generating clusters of documents using a combined metric parameter. A first document and a second document are received, and for a potential cluster including the first document and the second document, a first metric parameter indicative of a degree of complementariness of document content in the potential cluster is determined, and a second metric parameter indicative of a degree of dilution of the document content in the potential cluster is determined. The combined metric parameter is determined based on the first metric parameter and the second metric parameter. A cluster is generated based on the combined metric parameter, where the cluster includes the first and second documents. Other documents or clusters may be added to the cluster by determining an updated combined metric parameter for a potential cluster and comparing the updated combined metric parameter with the combined metric parameter.
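The control flow of this abstract (score a potential cluster, then admit further documents by comparing an updated score with the current one) can be sketched as follows. The abstract does not define the two metrics, so the formulas in `complementarity`, `dilution`, and `combined` are made-up placeholders, and the acceptance rule (keep a document only if the combined score improves) is likewise an assumption. Documents are modeled as non-empty sets of terms.

```python
from itertools import combinations


def complementarity(cluster):
    """Placeholder metric: how much the union exceeds an average document."""
    union = set().union(*cluster)
    mean_size = sum(len(doc) for doc in cluster) / len(cluster)
    return 1 - mean_size / len(union)


def dilution(cluster):
    """Placeholder metric: one minus the mean pairwise Jaccard similarity."""
    scores = [len(a & b) / len(a | b) for a, b in combinations(cluster, 2)]
    return 1 - sum(scores) / len(scores)


def combined(cluster):
    """Placeholder combination: reward complementarity, penalize dilution."""
    return complementarity(cluster) - dilution(cluster)


def grow_cluster(first, second, candidates):
    """Start from two documents, then try each candidate in turn."""
    cluster = [first, second]
    score = combined(cluster)
    for doc in candidates:
        potential = cluster + [doc]
        updated = combined(potential)
        # Assumed rule: accept the document only if the score improves.
        if updated > score:
            cluster, score = potential, updated
    return cluster, score
```

With `{"simd", "sort", "set"}` and `{"simd", "sort", "merge"}` as the starting pair, this toy scoring rejects an unrelated candidate such as `{"pasta", "sauce"}` and accepts a related one such as `{"simd", "sort", "vector"}`.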
METHOD, DEVICE, COMPUTER APPARATUS, AND STORAGE MEDIUM FOR STORING DATA
A method of storing data, a device, a computer apparatus, and a storage medium relate to the technical field of data processing. The method of storing data comprises: acquiring call data obtained in a unit time period; preprocessing the call data based on a preset processing rule according to information carried by the call data; and storing the preprocessed call data, a start time, and an end time of the unit time period simultaneously.
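The three steps of the method (acquire, preprocess, store together with the period boundaries) might look like the sketch below. Everything concrete here is an assumption for illustration: the record fields `timestamp` and `interface`, the rule of counting calls per interface, and the in-memory list used as the store are hypothetical and are not specified by the abstract.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredBatch:
    start_time: int   # start of the unit time period
    end_time: int     # end of the unit time period (exclusive here)
    summary: dict     # preprocessed call data


def preprocess(calls):
    """Hypothetical preset rule: count calls per interface name."""
    return dict(Counter(call["interface"] for call in calls))


def store_unit_period(store, calls, start_time, end_time):
    """Keep calls inside the period, preprocess them, store one record."""
    in_period = [c for c in calls if start_time <= c["timestamp"] < end_time]
    batch = StoredBatch(start_time, end_time, preprocess(in_period))
    # The summary and both period boundaries are written as a single unit.
    store.append(batch)
    return batch
```

Writing the summary and the two timestamps as one record is one way to read "simultaneously": a reader of the store never sees preprocessed data without the period it belongs to.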
Large range lookups for B^ε-tree
Embodiments herein are directed towards systems and methods for performing range lookups in B^ε-trees. One example method involves receiving a request to return key-value pairs within a range of keys from the B^ε-tree. The B^ε-tree includes a plurality of nodes, each node being associated with a buffer that stores key-value pairs. The method further involves determining a fractional size of the range of keys. The method further involves, for each level of the B^ε-tree, obtaining from within one or more buffers of one or more nodes of the level, a set of key-value pairs within the range of keys up to a size equal to the fractional size and transferring the set of key-value pairs to a result data structure. The method further involves sorting and merging all key-value pairs in the result data structure and returning the result data structure in response to the request.
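One pass of the per-level collection and the final sort-and-merge can be sketched as below. This is a simplified model, not the patented method: the tree is reduced to a list of levels, each level a list of node buffers holding `(key, value)` pairs; the formula used for the fractional size (range width divided by the number of levels, for integer keys) is an assumption; and so is the merge rule that a pair found closer to the root wins when a key appears at several levels. How the method proceeds once a level reaches its cap is not described in the abstract and is not modeled here.

```python
from itertools import islice


def range_lookup(levels, lo, hi):
    """Collect up to `fractional_size` in-range pairs per level, then merge.

    `levels[0]` is the root level. All structure here is hypothetical.
    """
    # Assumed definition of the fractional size, for integer keys.
    fractional_size = max(1, (hi - lo + 1) // len(levels))

    collected = []  # the result data structure: (key, depth, value)
    for depth, buffers in enumerate(levels):
        in_range = ((k, v) for buf in buffers for k, v in buf if lo <= k <= hi)
        for key, value in islice(in_range, fractional_size):
            collected.append((key, depth, value))

    # Sort by key, then by depth, so the shallowest copy of a key comes first.
    collected.sort(key=lambda item: (item[0], item[1]))

    merged = {}
    for key, _depth, value in collected:
        merged.setdefault(key, value)  # assumed rule: shallowest copy wins
    return list(merged.items())
```

For `levels = [[[(5, "new")]], [[(1, "a"), (5, "old")], [(9, "z")]]]`, the call `range_lookup(levels, 1, 8)` returns `[(1, "a"), (5, "new")]`.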
Compiler-level general matrix multiplication configuration optimization
A system and method are provided for optimizing general matrix multiplication (GEMM) on target hardware. The matrices to be multiplied, A (m×k) and B (k×n), are split into tiles, and a tiling configuration search problem is formulated that explores a configuration search space to identify an optimal tiling configuration that minimizes running time on the target hardware. The configuration states are a function of the matrix parameters m, k, and n and of the numbers of nested loops for each of the dimensions m, k, and n. The optimal tiling configuration for the target hardware is obtained by implementing a Greedy Best-First-Search (GBFS) algorithm or a Neighborhood Actor Advantage Critic (N-A2C) algorithm that optimizes the running time for multiplication of the matrices on the target hardware, and the target hardware is configured and computations are run accordingly.
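A greedy best-first search over tile sizes can be sketched as follows. This is a generic GBFS under simplifying assumptions, not the patented procedure: a state is a single tile size per dimension `(tm, tk, tn)` rather than a full nested-loop factorization, neighbors are produced by doubling or halving one tile size, and the cost function is passed in by the caller. In practice the cost would be a measured running time on the target hardware; all names below are hypothetical.

```python
import heapq


def neighbors(state, dims):
    """Yield states reached by doubling or halving one tile size."""
    for axis in range(3):
        for size in (state[axis] * 2, state[axis] // 2):
            # Keep only tile sizes that evenly divide the matrix dimension.
            if 1 <= size <= dims[axis] and dims[axis] % size == 0:
                yield state[:axis] + (size,) + state[axis + 1:]


def gbfs(dims, cost, start=(1, 1, 1), budget=200):
    """Always expand the cheapest frontier state; return the best one seen.

    `budget` caps the number of expansions.
    """
    best, best_cost = start, cost(start)
    frontier = [(best_cost, start)]
    visited = {start}
    while frontier and budget > 0:
        current_cost, state = heapq.heappop(frontier)
        budget -= 1
        if current_cost < best_cost:
            best, best_cost = state, current_cost
        for nxt in neighbors(state, dims):
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(frontier, (cost(nxt), nxt))
    return best, best_cost
```

As a usage example with a made-up cost, `gbfs((128, 128, 128), lambda t: abs(t[0] - 32) + abs(t[1] - 16) + abs(t[2] - 64))` returns `((32, 16, 64), 0)`. The priority queue is what makes the search "best-first": the next state expanded is always the cheapest one discovered so far, not the most recently generated one.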
Multi-cycle key compares for keys and records of variable length
Multi-cycle key compare units are described. A compare unit includes a comparator, additional compare logic, and at least one pair of buffers that provide input to the comparator. The compare unit sorts variable-length records in streaming mode without the need for complex state machines to maintain state relating to the comparing. A record may have a variable-length key and optional variable-length data. The record and/or the key is split into fixed, predefined lengths. The total key and record lengths are unknown to the comparator of the compare unit.
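A software analogy of comparing keys one fixed-width piece per cycle is sketched below. It covers only the chunk-wise comparison: the buffers and additional compare logic of the hardware unit are not modeled, and the names `chunks` and `compare_streaming` are hypothetical. The comparator loop sees one pair of chunks at a time and never needs the total key length.

```python
from itertools import zip_longest


def chunks(key: bytes, width: int):
    """Split a key into fixed-width pieces; the last piece may be shorter."""
    for offset in range(0, len(key), width):
        yield key[offset:offset + width]


def compare_streaming(chunks_a, chunks_b):
    """Compare two chunk streams; return -1, 0, or 1.

    Each loop iteration stands for one compare cycle. The only state carried
    between cycles is that all earlier chunks were equal.
    """
    for chunk_a, chunk_b in zip_longest(chunks_a, chunks_b, fillvalue=b""):
        if chunk_a != chunk_b:
            # The first differing chunk decides; a shorter key that is a
            # prefix of the other sorts first.
            return -1 if chunk_a < chunk_b else 1
    return 0
```

For instance, `compare_streaming(chunks(b"apple", 2), chunks(b"apricot", 2))` returns `-1`, matching the ordinary byte-wise ordering of the two keys.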