G06F16/24568

Processing of sequencing data streams

This disclosure relates to methods and systems for processing of sequencing data streams. The system receives sequences from a sequencer and stores them as data records in a database. Each sequence is associated with a counter indicative of the number of times the associated sequence has been sequenced. The system progressively receives a further sequence as streaming data from the sequencer. While receiving the further sequence, the system matches the streaming data against the stored sequences to determine a matching score. Upon the matching score exceeding a matching threshold for one of the sequences in the database, the system selects that sequence based on the matching score and stores the further sequence in non-volatile memory where the counter value associated with the selected sequence is below a saturation threshold. The system also terminates the receiving where the counter value is above the saturation threshold.
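A minimal Python sketch of the flow described above, assuming prefix matching as the scoring method and in-memory dictionaries as the "database". All names and threshold values are illustrative assumptions, not taken from the patent; the write to non-volatile memory is elided.

```python
MATCHING_THRESHOLD = 0.9      # assumed score threshold
SATURATION_THRESHOLD = 1000   # assumed counter saturation

class SequenceStore:
    def __init__(self):
        # sequence -> counter of how many times it has been sequenced
        self.records = {}

    def add(self, seq):
        self.records[seq] = self.records.get(seq, 0) + 1

def match_score(prefix, stored):
    # Fraction of the received prefix that matches the stored sequence.
    if not prefix:
        return 0.0
    matches = sum(1 for a, b in zip(prefix, stored) if a == b)
    return matches / len(prefix)

def process_stream(store, chunks):
    """Progressively receive a further sequence; once a stored sequence
    exceeds the matching threshold, either terminate receiving (counter
    saturated) or keep receiving and bump the counter."""
    received = ""
    selected = None
    for chunk in chunks:
        received += chunk
        if selected is None:
            # match the partial stream against every stored sequence
            best = max(store.records,
                       key=lambda s: match_score(received, s), default=None)
            if best is not None and match_score(received, best) > MATCHING_THRESHOLD:
                if store.records[best] >= SATURATION_THRESHOLD:
                    return None        # saturated: terminate receiving early
                selected = best
    if selected is not None:
        store.records[selected] += 1   # another read of a known sequence
    else:
        store.add(received)            # new sequence: record with counter 1
    # persisting `received` to non-volatile memory is omitted here
    return received
```

The early return is the key saving: a saturated sequence never has to be received in full.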

Method and system for surveillance system management

An exemplary surveillance system is provided having a plurality of data generating devices, and nodes for handling the data streams generated by the data generating devices. The system may be configured to fragment the data streams and store the fragments among the plurality of nodes of the system. The system may be configured to redundantly transmit data through the nodes so that the fragmented data streams arrive at the desired location for storage. The data transmission may permit redirection or retransmission based on node or data transmission failure.
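A hypothetical sketch of the fragment-and-distribute step: a stream is cut into fixed-size fragments, and each fragment is placed on multiple healthy nodes so a failed node can be bypassed. The data shapes and replica policy here are illustrative assumptions.

```python
import itertools

def fragment(stream, size):
    # Cut the data stream into fixed-size fragments.
    return [stream[i:i + size] for i in range(0, len(stream), size)]

def distribute(fragments, nodes, replicas=2):
    """Assign each fragment to `replicas` nodes round-robin,
    skipping nodes that are marked as failed."""
    placement = {}
    healthy = [n for n in nodes if n["up"]]     # redirection on node failure
    ring = itertools.cycle(healthy)
    for idx, frag in enumerate(fragments):
        targets = [next(ring)["name"] for _ in range(replicas)]
        placement[idx] = {"data": frag, "nodes": targets}
    return placement
```

Storing each fragment on more than one node is what makes retransmission after a single node failure possible.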

COMPUTING SYSTEMS AND METHODS FOR CREATING AND EXECUTING USER-DEFINED ANOMALY DETECTION RULES AND GENERATING NOTIFICATIONS FOR DETECTED ANOMALIES

A computing platform may be installed with software technology for creating and executing user-defined anomaly detection rules that configures the computing platform to: (1) receive, from a client device, data defining a given anomaly detection rule that has been created by a user, wherein the given anomaly detection rule comprises at least one anomaly condition that is to be applied to at least one streaming event queue, (2) store a data representation of the given anomaly detection rule in a data store, (3) convert the data representation of the given anomaly detection rule to a streaming query statement, (4) iteratively apply the streaming query statement to the at least one streaming event queue, and (5) while iteratively applying the streaming query statement, make at least one determination that the at least one anomaly condition is satisfied and then cause at least one anomaly notification to be issued to the user.
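A hedged sketch of steps (1) through (5) above: a user-defined rule stored as data, converted into a callable "streaming query" predicate, and applied iteratively to an event queue with a notification on each hit. The rule format and the `notify` hook are illustrative assumptions.

```python
import queue

def rule_to_predicate(rule):
    """Step (3): convert the stored rule representation into a
    callable streaming-query predicate."""
    field, op, value = rule["field"], rule["op"], rule["value"]
    ops = {">": lambda a, b: a > b,
           "<": lambda a, b: a < b,
           "==": lambda a, b: a == b}
    return lambda event: ops[op](event.get(field), value)

def run_rule(rule, event_queue, notify):
    predicate = rule_to_predicate(rule)
    while not event_queue.empty():       # step (4): iterate over the queue
        event = event_queue.get()
        if predicate(event):             # step (5): anomaly condition satisfied
            notify(rule["name"], event)  # issue anomaly notification
```

A production system would consume the queue continuously rather than draining it once, but the convert-then-apply shape is the same.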

IMPLEMENTATION OF INSTANT CORRUPTION DETECTION AND RECOVERY
20230237046 · 2023-07-27

The present disclosure describes techniques for implementing instant corruption detection and recovery. A plurality of streams may be created in a storage device. Each of the plurality of streams may contain a sequence of metadata nodes of a same type. Each of the plurality of streams may maintain an initial state, a sequence of delta modifications to the initial state, and an actual state for each of the sequence of metadata nodes. A checking and recovery function associated with a particular stream among the plurality of streams may be determined. The checking and recovery function may comprise checking logic configured to detect corruptions by checking modification operations associated with metadata nodes in the particular stream. The checking and recovery function may further comprise recovery logic configured to perform recoveries from the corruptions. The checking and recovery function associated with the particular stream may be implemented in the storage device.
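A minimal sketch, assuming a stream holds an initial state, a sequence of delta modifications, and an actual state per metadata node, with integer states standing in for real metadata. The checking logic detects corruption by replaying the deltas; the recovery logic restores the actual state from the replay. All class and field names are assumptions.

```python
class MetadataStream:
    """One stream: metadata nodes of a same type, each with an initial
    state, a sequence of delta modifications, and an actual state."""
    def __init__(self):
        self.nodes = {}   # node_id -> {"initial", "deltas", "actual"}

    def add_node(self, node_id, initial):
        self.nodes[node_id] = {"initial": initial, "deltas": [], "actual": initial}

    def modify(self, node_id, delta):
        node = self.nodes[node_id]
        node["deltas"].append(delta)   # record the modification operation
        node["actual"] += delta        # apply it to the actual state

def check_and_recover(stream):
    """Checking logic: replay deltas and compare against the actual state.
    Recovery logic: rebuild any corrupted actual state from initial + deltas."""
    corrupted = []
    for node_id, node in stream.nodes.items():
        expected = node["initial"] + sum(node["deltas"])
        if node["actual"] != expected:
            node["actual"] = expected
            corrupted.append(node_id)
    return corrupted
```

Because detection and recovery use only data already kept in the stream, the check can run inside the storage device itself, as the abstract describes.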

Enhanced preparation and integration of data sets

Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for enhanced preparation and integration of data sets. In some implementations, data indicating user input that identifies a first data set that includes streaming data and a second data set that includes non-streaming data is received. The first data set and the second data set are integrated to generate a hybrid data set. The data processing system provides access to the hybrid data set through (i) a non-streaming access channel that provides a periodically refreshed summary of both the streaming data and the non-streaming data and (ii) a streaming access channel that provides a data stream based on combined data of the first data set and the second data set. One or more application programming interfaces are provided, allowing at least one client device to access the non-streaming access channel and the streaming access channel.
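An illustrative sketch of a hybrid data set exposing both channels: a refreshed summary over the combined data (the non-streaming channel) and a single combined stream (the streaming channel). The class and method names are assumptions made for illustration.

```python
class HybridDataSet:
    def __init__(self, non_streaming_rows):
        self.static_rows = list(non_streaming_rows)  # second (non-streaming) data set
        self.stream_rows = []                        # first (streaming) data set

    def ingest(self, row):
        self.stream_rows.append(row)

    def summary(self):
        """Non-streaming channel: a refreshed snapshot summarizing
        both the streaming and the non-streaming data."""
        rows = self.static_rows + self.stream_rows
        return {"count": len(rows), "total": sum(r["value"] for r in rows)}

    def stream(self):
        """Streaming channel: yields the combined data of both sets
        as one data stream."""
        yield from self.static_rows
        yield from self.stream_rows
```

An API layer would simply expose `summary()` and `stream()` as the two endpoints clients call.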

Systems and methods for integration of multiple programming languages within a pipelined search query

According to one embodiment, a method that supports queries deploying operators based on multiple programming languages is described. A sequence of operators associated with a query is identified, where the sequence of operators includes at least two neighboring operators: a first operator based on a first programming language and a second operator based on a second programming language that is different from the first programming language. Thereafter, a schema associated with the first operator and a schema associated with the second operator are determined, along with the compatibility between the schema of the first operator and the schema of the second operator. A query error message is generated in response to incompatibility between the first operator schema and the second operator schema. Compatibility is determined when an output generated by execution of the first operator provides machine data needed as input for execution of the second operator.
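A sketch of that compatibility check: neighboring operators are compatible when the first operator's output schema supplies every field the second operator needs as input, and an error message is produced otherwise. The operator and schema shapes here are illustrative assumptions.

```python
def compatible(first_output_schema, second_input_schema):
    # The first operator's output must cover the second operator's input.
    return set(second_input_schema).issubset(first_output_schema)

def validate_pipeline(operators):
    """Walk each pair of neighboring operators in the query's sequence;
    return a query error message on the first incompatibility, else None."""
    for a, b in zip(operators, operators[1:]):
        if not compatible(a["output"], b["input"]):
            missing = set(b["input"]) - set(a["output"])
            return (f"query error: {a['name']} -> {b['name']} "
                    f"missing fields {sorted(missing)}")
    return None
```

Note the check never looks at which language an operator is written in; only the schemas at each boundary matter, which is what lets operators in different languages coexist in one pipeline.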

Customized digital content generation systems and methods

The invention provides in some aspects a method, executed on a digital data processing system, of mass generation of customized digital content that includes continuously identifying current external events taken by or with respect to a plurality of respective prospective targets and, upon identification of such an event, generating a set of actions, each identifying a digital content piece and a digital delivery mechanism therefor. Each action is generated, according to the method, based on the current identified events for a particular prospective target and on a database of information about prior events taken by or with respect to that target. The sets of actions are queued upon generation and continuously retrieved on a first-in-first-out basis. Upon retrieval, an action for generation of digital content for the respective prospective target is selected for transmittal from the set based on quotas associated with that target and/or with the delivery mechanism identified per the selected action.
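A hypothetical sketch of the queue-and-quota step: generated actions are retrieved first-in-first-out, and an action is selected for transmittal only while both the target's quota and the delivery mechanism's quota permit. The tuple layout and quota dictionaries are illustrative assumptions.

```python
from collections import deque

def drain(actions, target_quota, channel_quota):
    """actions: iterable of (target, channel, content) tuples in
    generation order; quotas cap sends per target and per channel."""
    q = deque(actions)             # queued upon generation, FIFO retrieval
    sent, t_used, c_used = [], {}, {}
    while q:
        target, channel, content = q.popleft()
        if t_used.get(target, 0) >= target_quota.get(target, 0):
            continue               # target quota exhausted: skip this action
        if c_used.get(channel, 0) >= channel_quota.get(channel, 0):
            continue               # delivery-mechanism quota exhausted
        t_used[target] = t_used.get(target, 0) + 1
        c_used[channel] = c_used.get(channel, 0) + 1
        sent.append((target, channel, content))
    return sent
```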

Representing result data streams based on execution of data stream language programs

An instrumentation analysis system processes data streams by executing instructions specified using a data stream language program. The data stream language allows users to specify a search condition using a find block for identifying the set of data streams processed by the data stream language program. The set of identified data streams may change dynamically. The data stream language allows users to group data streams into sets of data streams based on distinct values of one or more metadata attributes associated with the input data streams. The data stream language allows users to specify a threshold block for determining whether data values of input data streams are outside boundaries specified using low/high thresholds. The elements of the set of data streams input to the threshold block can dynamically change. The low/high threshold values can be specified as data streams and can dynamically change.
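A rough sketch of the three blocks described above: a find block selecting the set of data streams, a grouping block keyed on a metadata attribute, and a threshold block flagging values outside low/high boundaries. All function names and data shapes are illustrative, not the data stream language itself.

```python
def find_block(streams, condition):
    """Find block: select the (possibly dynamically changing) set of
    data streams whose metadata matches the search condition."""
    return [s for s in streams if condition(s["metadata"])]

def group_block(streams, attribute):
    """Group streams by the distinct values of one metadata attribute."""
    groups = {}
    for s in streams:
        groups.setdefault(s["metadata"][attribute], []).append(s)
    return groups

def threshold_block(values, low, high):
    """Threshold block: flag data values outside the low/high boundaries.
    In the abstract, low/high can themselves be data streams."""
    return [v for v in values if v < low or v > high]
```

Re-running `find_block` on each evaluation cycle is what lets the selected set of streams change dynamically.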

DATABASE REPLICATION USING ADAPTIVE COMPRESSION

Methods, computer program products, and/or systems are provided that perform the following operations: in a data replication environment, analyzing a database workload to generate a knowledge base of information related to compression; dividing a transfer data stream into different segments based, at least in part, on the knowledge base; obtaining candidate compression types for the transfer data stream based, at least in part, on the knowledge base; assigning respective compression types of the candidate compression types to the different segments; generating compressed segments based, at least in part, on the respective compression types assigned to the different segments; and providing the compressed segments to a replication target.
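A sketch of the segment-then-compress flow, with the "knowledge base" reduced to a precomputed mapping from segment index to compression type. `zlib` and `bz2` stand in for the unspecified candidate compression types; the segmentation boundaries are assumed to come from the workload analysis.

```python
import bz2
import zlib

def split_segments(data, boundaries):
    """Divide the transfer data stream into segments at the boundaries
    chosen by the knowledge base."""
    segments, start = [], 0
    for end in boundaries:
        segments.append(data[start:end])
        start = end
    segments.append(data[start:])
    return segments

def assign_and_compress(segments, assignments):
    """assignments: knowledge-base output mapping segment index to a
    candidate compression type; compress each segment accordingly."""
    codecs = {"zlib": zlib.compress, "bz2": bz2.compress}
    compressed = []
    for i, seg in enumerate(segments):
        ctype = assignments.get(i, "zlib")     # fall back to a default codec
        compressed.append((ctype, codecs[ctype](seg)))
    return compressed
```

The replication target then decompresses each segment with the codec named in its tag.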

Visual data computing platform using a progressive computation engine

The subject matter herein provides a method, apparatus and computer program product that combines, in one intuitive interface, visualization user interfaces (UIs) as used for descriptive analytics, with workflow UIs as used for predictive analytics. These interfaces provide a visual workspace front-end. The workspace is coupled to a back-end that comprises a data processing engine that combines progressive computation, approximate query processing, and sampling, together with a focus on supporting user-defined operations, to drive the front-end efficiently and in real-time. The processing engine achieves rapid responsiveness through progressive sampling, quickly returning an initial answer, typically on a random sample of data, before continuing to refine that answer in the background. In this manner, any operation carried out in the platform immediately provides a visual response, regardless of the underlying complexity of the operation or data size.
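An illustrative sketch of the progressive-sampling idea: return a quick approximate answer from a small random sample, then refine it over growing samples until it is exact. Computing a mean is a stand-in for an arbitrary user-defined operation.

```python
import random

def progressive_mean(data, steps=4, seed=0):
    """Yield successively refined estimates of the mean: the first
    estimate comes from a small random sample (fast initial answer),
    the last is computed over all of the data (exact answer)."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)             # a prefix is then a random sample
    n = len(shuffled)
    for i in range(1, steps + 1):
        k = max(1, n * i // steps)    # growing prefix of the shuffled data
        sample = shuffled[:k]
        yield sum(sample) / len(sample)
```

A front-end can render each yielded estimate immediately, which is how every operation gets an instant visual response regardless of data size.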