Patent classifications
G06F16/2458
Cost-based query optimization for array fields in database systems
A document-oriented database system generates an optimal query execution plan for database queries on an untyped data field included in a collection of documents. The system generates histograms for multiple types of data stored by the untyped data field and uses the histograms to assign costs to operators usable to execute the database query. The system generates the optimal query execution plan by selecting operators based on the assigned costs. In various embodiments, the untyped data field stores scalars, arrays, and objects.
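The histogram-and-cost idea in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the per-document cost constants, and the choice between only two operators (`index_scan` and `full_scan`) are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_histograms(documents, field):
    """Build one value-frequency histogram per data type seen in `field`."""
    histograms = defaultdict(Counter)
    for doc in documents:
        value = doc.get(field)
        if isinstance(value, list):
            # Array: count each distinct element once per document.
            # Assumption: array elements are hashable scalars.
            for element in set(value):
                histograms[type(element).__name__][element] += 1
        elif isinstance(value, dict):
            # Object: summarize by its sorted key names only (an assumption).
            histograms["object"][tuple(sorted(value))] += 1
        elif value is not None:
            histograms[type(value).__name__][value] += 1
    return histograms


def choose_operator(histograms, value, total_docs,
                    index_cost_per_doc=4.0, scan_cost_per_doc=1.0):
    """Pick the cheaper of two hypothetical operators for `field == value`."""
    # Estimated number of matching documents, read from the typed histogram.
    matches = histograms[type(value).__name__][value]
    costs = {
        "index_scan": matches * index_cost_per_doc,
        "full_scan": total_docs * scan_cost_per_doc,
    }
    return min(costs, key=costs.get), costs


docs = [{"a": 1}, {"a": "x"}, {"a": [1, 2]}, {"a": {"k": 1}}, {"b": 3}]
hist = build_histograms(docs, "a")
```

With these made-up constants, a value that matches few documents favors the index scan, while a frequent value makes the full scan cheaper.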
Information processing system, information processing device, and non-transitory computer-readable storage medium
An information processing system includes a first information processing device configured to accept an input of a query to be processed, and a second information processing device configured to execute the query for each of a plurality of tasks in parallel. The first information processing device determines whether or not an external database server contains records targeted by the query, and transmits the query and connection information for accessing the external database server to the second information processing device. The second information processing device connects to the external database server based on the connection information received from the first information processing device, acquires information indicating a storage status of the records targeted by the query among records stored in the external database server, and determines a processing target range for each of the plurality of tasks relevant to the records targeted by the query, based on the acquired information.
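One way to picture the last step, determining a processing target range per task, is the sketch below. It assumes the acquired storage status boils down to the smallest and largest integer key of the targeted records; the abstract does not say what the status contains, so `split_ranges` and its parameters are hypothetical.

```python
def split_ranges(min_key, max_key, num_tasks):
    """Split the inclusive key range into at most `num_tasks` contiguous ranges."""
    total = max_key - min_key + 1
    if total <= 0 or num_tasks <= 0:
        return []
    base, extra = divmod(total, num_tasks)
    ranges = []
    start = min_key
    for task in range(num_tasks):
        # The first `extra` tasks take one additional key each.
        size = base + (1 if task < extra else 0)
        if size == 0:
            break  # more tasks than keys: remaining tasks get no range
        ranges.append((start, start + size - 1))
        start += size
    return ranges
```

Each task would then run the query restricted to its own range, for example with a `BETWEEN` predicate on the key.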
GATHERING AND CONTRIBUTING CONTENT ACROSS DIVERSE SOURCES
A content unification system is described herein that aligns related content from various sources into one unified stream. The system leverages tagging (e.g., hashtags or other added metadata associated with content) to filter and connect related content across sites, formats, and sources. One component of the system is a user-friendly and customizable “dashboard” view of the various topics, called a “tagboard”. Tagboards can be quickly and easily created by users and can be embedded on any website. Users can interact with various content sources such as blogs, forums, or services without leaving the tagboard they are viewing. The content unification system provides users the tools to make the web more efficient, increase user interaction, and increase the signal-to-noise ratio. The system also allows site owners and publishers to monetize their traffic better by directing advertisements to their content in real time.
OPTIMISATION OF NETWORK PARAMETERS FOR ENABLING NETWORK CODING
Methods and devices for propagating transactions in a network of nodes, each node having one or more connections to other nodes. The method includes receiving a plurality of incoming transactions over a time period; combining the plurality of incoming transactions using network coding to generate a composite message; sending the composite message to one or more nodes in the network; and determining an adjusted time period based on an equilibrium constant parameter and a count of transactions in the plurality of incoming transactions received over the time period.
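The two steps named in this abstract, combining transactions with network coding and adjusting the collection period, might look like the sketch below. Both functions are assumptions: bytewise XOR is only the simplest form of network coding, and the abstract gives no formula for the adjusted period, so the rule in `adjust_period` (scale the period so that roughly `equilibrium_constant` transactions arrive per period) is purely illustrative.

```python
def xor_combine(transactions):
    """XOR byte strings together, zero-padding shorter ones on the right."""
    length = max((len(tx) for tx in transactions), default=0)
    combined = bytearray(length)
    for tx in transactions:
        for i, byte in enumerate(tx):
            combined[i] ^= byte
    return bytes(combined)


def adjust_period(period, count, equilibrium_constant, min_period=0.01):
    """Hypothetical update rule; the real rule is not given in the abstract."""
    if count == 0:
        return period * 2  # nothing arrived: wait longer next time
    # More transactions than the target shortens the period, fewer lengthens it.
    return max(min_period, period * equilibrium_constant / count)
```

A receiver that already holds all but one of the combined transactions can recover the missing one by XOR-ing the composite message with the ones it has.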
INFORMATION PROCESSING METHOD AND TERMINAL, AND COMPUTER STORAGE MEDIUM
The present application discloses an information processing method performed at a computing device. The method includes: collecting first information; executing an intent identification task on the first information to obtain an intent identification processing result; executing a slot identification task on the first information according to the intent identification processing result to obtain a slot identification processing result; narrowing a search range according to the slot identification processing result; and performing a database search for the first information within the narrowed search range.
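The pipeline of intent identification, slot identification, range narrowing, and search can be sketched with simple keyword rules. Everything here is hypothetical, including the records, the keyword tables, and the function names; a real system would presumably use trained models for the two identification tasks.

```python
RECORDS = [
    {"category": "music", "artist": "adele", "title": "Hello"},
    {"category": "music", "artist": "queen", "title": "Bohemian Rhapsody"},
    {"category": "weather", "city": "paris", "title": "Paris forecast"},
]
INTENT_KEYWORDS = {"music": ("play", "song"), "weather": ("weather", "forecast")}
# For each intent: the slot to fill and the values it may take.
SLOT_VALUES = {
    "music": ("artist", ("adele", "queen")),
    "weather": ("city", ("paris", "london")),
}


def identify_intent(text):
    """Return the first intent whose keyword appears in the text, else None."""
    words = text.lower().split()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in words for keyword in keywords):
            return intent
    return None


def identify_slots(text, intent):
    """Fill the slot tied to the identified intent, if a known value appears."""
    if intent not in SLOT_VALUES:
        return {}
    slot_name, values = SLOT_VALUES[intent]
    words = text.lower().split()
    for value in values:
        if value in words:
            return {slot_name: value}
    return {}


def search(text):
    intent = identify_intent(text)
    slots = identify_slots(text, intent)
    # Narrow the search range by intent first, then filter by slot values.
    candidates = [r for r in RECORDS if intent is None or r["category"] == intent]
    return [r for r in candidates
            if all(r.get(name) == value for name, value in slots.items())]
```

Because slot identification depends on the intent result, a query about music never has to consult the city vocabulary, which is the narrowing the abstract describes.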
CONFIGURABLE PARSER AND A METHOD FOR PARSING INFORMATION UNITS
A packet processing technique can include receiving a packet, and parsing the packet based on a protocol field to generate a parse result vector. The parse result vector is used to select between forwarding the packet to a virtual machine executing on a host processing integrated circuit, forwarding the packet to a physical media access controller, multicasting the packet to multiple virtual machines executing on the host processing integrated circuit, and sending the packet to a hypervisor.
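A rough sketch of parsing a packet into a result vector and using it to pick one of the four destinations follows. It assumes an Ethernet frame whose EtherType serves as the protocol field; the contents of the vector and the selection rules in `select_action` are illustrative assumptions, not taken from the abstract.

```python
import struct
from enum import Enum


class Action(Enum):
    TO_VM = "forward to one virtual machine"
    TO_MAC = "forward to the physical media access controller"
    MULTICAST_TO_VMS = "multicast to several virtual machines"
    TO_HYPERVISOR = "send to the hypervisor"


KNOWN_ETHERTYPES = {0x0800, 0x86DD}  # IPv4 and IPv6


def parse(packet, vm_macs):
    """Build a small parse result vector from the Ethernet header."""
    if len(packet) < 14:
        raise ValueError("packet shorter than an Ethernet header")
    destination = packet[0:6]
    (ethertype,) = struct.unpack("!H", packet[12:14])
    return {
        "known_protocol": ethertype in KNOWN_ETHERTYPES,
        # The low bit of the first address octet marks group addresses.
        "is_multicast": bool(destination[0] & 0x01),
        "dst_is_vm": destination in vm_macs,
    }


def select_action(vector):
    """Hypothetical selection rules keyed on the parse result vector."""
    if not vector["known_protocol"]:
        return Action.TO_HYPERVISOR
    if vector["is_multicast"]:
        return Action.MULTICAST_TO_VMS
    if vector["dst_is_vm"]:
        return Action.TO_VM
    return Action.TO_MAC
```

Here `vm_macs` is a set of 6-byte MAC addresses assumed to belong to local virtual machines.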
PROCESSING INGESTED DATA TO IDENTIFY ANOMALIES
Systems and methods are described for processing ingested data in an asynchronous manner as the data is being ingested to detect potential anomalies. For example, one or more streaming data processors can convert data as the data is ingested into a comparable data structure, determine whether the comparable data structure should be assigned to an existing data pattern or a new data pattern, and optionally update a characteristic of the data pattern to which the comparable data structure is assigned. The streaming data processor(s) can perform these operations automatically in real-time or in periodic batches. Once one or more comparable data structures have been assigned to one or more data patterns, the streaming data processor(s) can analyze the comparable data structures assigned to a particular data pattern to determine whether any of the comparable data structures appear to be anomalous.
SYSTEMS AND METHODS FOR A DATA SEARCH ENGINE BASED ON DATA PROFILES
Systems and methods for searching data are disclosed. For example, the system may include one or more memory units storing instructions and one or more processors configured to execute the instructions to perform operations. The operations may include receiving a sample dataset and identifying a data schema of the sample dataset. The operations may include generating a sample data vector that includes statistical metrics of the sample dataset and information based on the data schema of the sample dataset. The operations may include searching a data index comprising a plurality of stored data vectors corresponding to a plurality of reference datasets. The stored data vectors may include statistical metrics of the reference datasets and information based on corresponding data schema. The operations may include generating, based on the search and the sample data vector, one or more similarity metrics of the sample dataset to individual ones of the reference datasets.
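The data-profile search can be illustrated with a small sketch. It assumes a dataset is a mapping from column names to lists of numbers, that the only schema information in the vector is the column count, and that similarity is measured by cosine similarity; none of these choices comes from the abstract.

```python
import math
import statistics


def data_vector(columns):
    """Summarize a dataset as [column count, mean, std dev, min, max]."""
    values = [v for column in columns.values() for v in column]
    return [
        float(len(columns)),        # schema-derived information
        statistics.fmean(values),   # statistical metrics over all values
        statistics.pstdev(values),
        float(min(values)),
        float(max(values)),
    ]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(sample, index):
    """Return (name, similarity) pairs for stored vectors, most similar first."""
    sample_vector = data_vector(sample)
    scored = [(name, cosine(sample_vector, vector)) for name, vector in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this sketch `index` maps a reference dataset's name to a vector computed ahead of time with `data_vector`, so only the sample needs to be profiled at query time.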
PROCESSING DATA INPUTS FROM ALTERNATIVE SOURCES TO GENERATE A PREDICTIVE SIGNAL
A computer-implemented method comprises using at least one hardware processor to: receive a plurality of data from a plurality of data sources; standardize the plurality of data; tag the standardized plurality of data with one or more companies; train a prediction model to predict a metric for each of the one or more companies based on the standardized plurality of data tagged with that company and historical measurements for that company; and apply the prediction model to new data to predict the metric for at least one of the one or more companies.
DATA STRUCTURES FOR STORING AND MANIPULATING LONGITUDINAL DATA AND CORRESPONDING NOVEL COMPUTER ENGINES AND METHODS OF USE THEREOF
In some embodiments, the present disclosure provides for an exemplary computer-implemented system that may include a longitudinal data engine, including: a processor and specialized index generation software to generate: an index data structure for a respective event type associated with each respective subject or object; where each respective index data structure is a respective event type-specific data schema, defining how to store events of a particular event type to form longitudinal data of each respective subject or object; an ontology data structure that is configured to describe one or more properties of a respective event of a respective subject or object; and longitudinal data extraction software to extract a respective longitudinal data for a plurality of index data structures and a plurality of ontology data structures associated with a plurality of subjects or objects.