Patent classifications
G06F7/32
Joining and dimensional annotation in a streaming pipeline
Disclosed are embodiments for providing batch performance using a stream processor. In one embodiment, a method is disclosed comprising receiving, at a stream processor, an event, the stream processor including a plurality of processing stages; generating, by the stream processor, an augmented event based on the event, the augmented event including at least one additional field not appearing in the event, the additional field generated by an operation selected from the group consisting of a join operation and a dimensional annotation operation; and emitting, by the stream processor, the augmented event to a downstream consumer.
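As a rough illustration of the dimensional annotation step described above, the following Python sketch augments each incoming event with fields looked up in a dimension table. The table USER_DIMENSIONS, the join key user_id, and the function name annotate are hypothetical and chosen only for this example; the abstract does not specify how the lookup is performed.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical dimension table keyed by user_id (an assumption for this sketch).
USER_DIMENSIONS: Dict[str, Dict[str, str]] = {
    "u1": {"country": "DE"},
    "u2": {"country": "US"},
}


def annotate(
    events: Iterable[Dict[str, str]],
    dimensions: Dict[str, Dict[str, str]],
    key: str = "user_id",
) -> Iterator[Dict[str, str]]:
    """Yield one augmented event per input event, in arrival order."""
    for event in events:
        augmented = dict(event)  # leave the original event untouched
        extra = dimensions.get(event.get(key), {})
        for field, value in extra.items():
            # Only add fields that do not already appear in the event.
            augmented.setdefault(field, value)
        yield augmented  # "emit" to whatever consumes the iterator


events = [
    {"user_id": "u1", "action": "click"},
    {"user_id": "u3", "action": "view"},  # no matching dimension row
]
augmented_events = list(annotate(events, USER_DIMENSIONS))
```

An event with no matching dimension row passes through unchanged here; a real pipeline might instead buffer or flag such events, which is a design choice outside the scope of this sketch.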
MERGING DATABASE TABLES BY CLASSIFYING COMPARISON SIGNATURES
The present disclosure relates to merging database tables. Systems and methods may involve performing a comparison between a first set of records and a second set of records and identifying a plurality of record pairs based on the comparison. Each record pair may comprise a record in the first set of records and a record in the second set of records. In addition, a feature signature may be generated for each record pair by comparing field values in each record pair. The feature signature may be classified to identify at least one related record pair. A merged database table may be generated such that it comprises the at least one related record pair and a set of unique records selected from the first set of records and the second set of records.
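The flow described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the field list FIELDS, the exact-match comparison, and the threshold rule in is_related are stand-ins chosen for this example. The abstract does not say which comparison or classifier is used, and a trained classifier could replace the threshold.

```python
FIELDS = ("name", "email", "city")  # hypothetical fields to compare


def signature(a, b, fields=FIELDS):
    """Feature signature: 1 where both records hold the same non-empty value."""
    return tuple(
        int(a.get(f) is not None and a.get(f) == b.get(f)) for f in fields
    )


def is_related(sig, min_matches=2):
    """Stand-in classifier: related if enough fields agree."""
    return sum(sig) >= min_matches


def merge_tables(first, second):
    """Merge related pairs into one row and keep unmatched rows from both sets."""
    merged = []
    matched_second = set()
    for a in first:
        partner = None
        for j, b in enumerate(second):
            if j not in matched_second and is_related(signature(a, b)):
                partner = j
                break
        if partner is None:
            merged.append(dict(a))  # unique record from the first set
        else:
            matched_second.add(partner)
            # Values from the first set win on conflict (an arbitrary choice).
            merged.append({**second[partner], **a})
    # Unique records from the second set.
    merged.extend(dict(b) for j, b in enumerate(second) if j not in matched_second)
    return merged


first = [
    {"name": "Ann Lee", "email": "ann@example.com", "city": "Oslo"},
    {"name": "Bob Ray", "email": "bob@example.com", "city": "Rome"},
]
second = [
    {"name": "Ann Lee", "email": "ann@example.com", "city": "Bergen"},
    {"name": "Cy Young", "email": "cy@example.com", "city": "Rome"},
]
merged = merge_tables(first, second)
```

The nested loop compares every record pair, which is quadratic in the table sizes; a practical system would likely narrow the candidate pairs first.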
K-mer based genomic reference data compression
A computer-implemented method includes receiving genomic data associated with a plurality of genomes and identifying k-mer sets within the genomic data. The method includes constructing a k-mer subset tree according to the following process: performing iterative pairwise comparisons on the k-mer sets, wherein the iterative pairwise comparisons identify fragments with the most shared k-mers, merging the identified fragments into non-leaf nodes of the k-mer subset tree, and placing each remaining k-mer into a leaf node of the k-mer subset tree. The method includes storing the k-mer subset tree. A computer program product for data compression includes a computer readable storage medium having program instructions embodied therewith. The program instructions are executable by a computer to cause the computer to perform the foregoing method. A system includes a processor and logic. The logic is configured to perform the foregoing method.
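One possible reading of the tree construction above is sketched below in Python. It is an interpretation, not the patented procedure: at each step the two nodes sharing the most k-mers are joined under a new parent that stores the shared k-mers once, and each child keeps only what remains. The node layout and the names kmers, build_tree, and reconstruct are assumptions made for this example.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_tree(kmer_sets):
    """Greedily join the pair of nodes with the most shared k-mers.

    kmer_sets must contain at least one genome.
    """
    nodes = [
        {"name": name, "kmers": set(s), "children": []}
        for name, s in kmer_sets.items()
    ]
    while len(nodes) > 1:
        a, b = max(
            combinations(nodes, 2),
            key=lambda pair: len(pair[0]["kmers"] & pair[1]["kmers"]),
        )
        shared = a["kmers"] & b["kmers"]
        a["kmers"] -= shared  # children keep only their remaining k-mers
        b["kmers"] -= shared
        parent = {
            "name": "(" + a["name"] + "," + b["name"] + ")",
            "kmers": shared,  # stored once instead of in both children
            "children": [a, b],
        }
        nodes = [n for n in nodes if n is not a and n is not b] + [parent]
    return nodes[0]


def reconstruct(node, name, inherited=frozenset()):
    """Recover a genome's k-mer set as the union along its root-to-leaf path."""
    current = inherited | node["kmers"]
    if not node["children"]:
        return current if node["name"] == name else None
    for child in node["children"]:
        found = reconstruct(child, name, current)
        if found is not None:
            return found
    return None


genomes = {"g1": "ACGTAC", "g2": "ACGTTC", "g3": "TTTTAC"}
sets = {name: kmers(seq, 3) for name, seq in genomes.items()}
tree = build_tree(sets)
```

Under this layout, the saving comes from storing each shared k-mer once in a non-leaf node rather than in every genome that contains it.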
AUTOMATIC GENERATION OF CONVERGENT DATA MAPPINGS FOR BRANCHES IN AN INTEGRATION WORKFLOW
Disclosed is an approach to improving integration workflows by automatically generating convergent data mappings for branches in an integration workflow using a computer. A branch schema is generated for each branch, wherein the branch schema represents the union of all the individual node output schemas on the branch. A common output schema is generated for a convergence point, wherein the common output schema represents an intersection of all the branch schemas. Branch mappings are generated from each branch node to the common output schema.
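The union and intersection steps above can be illustrated with a short Python sketch. Schemas are modeled here as plain dictionaries from field name to type name, and a mapping records which node on a branch last produced each common field. Both representations, and the function names, are assumptions made for this example rather than details from the abstract.

```python
def branch_schema(node_schemas):
    """Union of the output schemas of every node on one branch."""
    schema = {}
    for node in node_schemas:
        schema.update(node)  # later nodes override earlier ones
    return schema


def common_schema(branch_schemas):
    """Intersection of branch schemas, keeping fields whose types agree."""
    first = branch_schemas[0]
    shared = set(first)
    for schema in branch_schemas[1:]:
        shared &= set(schema)
    return {
        field: first[field]
        for field in sorted(shared)
        if all(schema[field] == first[field] for schema in branch_schemas)
    }


def branch_mapping(node_schemas, common):
    """Map each common field to the index of the last node that outputs it."""
    mapping = {}
    for index, node in enumerate(node_schemas):
        for field in node:
            if field in common:
                mapping[field] = index
    return mapping


# Two hypothetical branches, each a list of node output schemas.
branch_a = [{"id": "string", "amount": "number"}, {"currency": "string"}]
branch_b = [{"id": "string"}, {"amount": "number", "status": "string"}]

schemas = [branch_schema(branch_a), branch_schema(branch_b)]
common = common_schema(schemas)
mapping_b = branch_mapping(branch_b, common)
```

Fields that appear on only one branch, such as currency and status here, are dropped at the convergence point because they cannot be supplied by every branch.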
Information output method and apparatus
Disclosed are an information output method and apparatus. One specific implementation of the method comprises: acquiring order data to be sorted; determining whether items to be sorted matching the order data to be sorted are stored in a shelf set; in response to having determined that items to be sorted matching the order data to be sorted are stored in the shelf set, determining, from the shelf set, a shelf storing an item to be sorted, and adding same to a candidate shelf set; choosing a candidate shelf from the candidate shelf set and adding same to a target shelf set, where the target shelf set stores the items to be sorted matching the order data; and outputting an identifier of a target shelf in the target shelf set.
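The shelf selection steps above might be sketched in Python as follows. The abstract does not state how a candidate shelf is chosen, so this example assumes a greedy rule that repeatedly picks the candidate covering the most still-needed items. The shelf identifiers, item names, and the function select_shelves are hypothetical.

```python
def select_shelves(order_items, shelves):
    """Return identifiers of target shelves that together hold every ordered item.

    shelves maps a shelf identifier to the set of items stored on it.
    Returns an empty list if some ordered item is not stored on any shelf.
    """
    needed = set(order_items)
    stocked = set().union(*shelves.values())
    if not needed <= stocked:
        return []

    # Candidate shelf set: shelves holding at least one needed item.
    candidates = {
        shelf_id: items & needed
        for shelf_id, items in shelves.items()
        if items & needed
    }

    targets = []
    remaining = set(needed)
    while remaining:
        # Greedy choice (an assumption): most remaining items covered,
        # ties broken by the smallest shelf identifier.
        best = max(
            sorted(candidates),
            key=lambda shelf_id: len(candidates[shelf_id] & remaining),
        )
        targets.append(best)
        remaining -= candidates[best]
        del candidates[best]
    return targets


shelves = {
    "S1": {"pen", "ink"},
    "S2": {"paper"},
    "S3": {"pen", "paper", "tape"},
}
target_ids = select_shelves(["pen", "paper", "ink"], shelves)
```

A greedy cover is not guaranteed to use the fewest shelves; it is used here only because it is short and easy to follow.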
LOYALTY EXTRACTION MACHINE
The present invention provides a loyalty extraction machine, wherein “quadratic multiform separation” (QMS) is modified and executed multiplicatively in an even generalized way. In each execution, the characteristic of one single membership is either enhanced or reduced. This process is performed on each membership in turn. Thus, every data sample (or element) receives multiple classification results. The multiple classification results are then collected and analyzed by an “eclectic classifier” to reach a final decision. The combination of the generalized QMS and the eclectic classifier thus forms the loyalty extraction machine. Moreover, a label called the “loyalty type” of the element is introduced to describe the effectiveness of membership recognition with respect to a training set.