Patent classifications
G06F16/24532
Semantic indexing engine
Embodiments are described for a method of distributing n-tuples over a cluster of triple-store machines, by storing each n-tuple as text in a distributed file system using a key-value store; providing each machine of the cluster with a resident semantic data lake component accessing one or more persistent RDF triplestores for the n-tuple data stored on each machine; and defining one part of each n-tuple as a partition variable to ensure locality of data within each respective n-tuple. A method includes inserting graphs into a key-value store to determine how the key-value store distributes the data across a plurality of servers, by generating textual triple data, and storing the triple data in key-value stores wherein a fourth element of the triple comprises the key, and a value associated with the key comprises all the triples about a subject; and indexing the data in the key-value store in an RDF triplestore using a partition based on the fourth element.
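The keying scheme in this abstract can be illustrated with a minimal sketch. It assumes each tuple is a 4-element Python tuple whose fourth element is the partition key, and it uses a simple byte-sum hash to pick a server. The function name partition_quads and the hashing choice are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def partition_quads(quads, num_servers):
    """Group (s, p, o, key) tuples by their fourth element and place each group."""
    store = defaultdict(list)  # fourth element -> all triples stored under it
    for subject, predicate, obj, key in quads:
        store[key].append((subject, predicate, obj))
    # Assumed placement rule: a deterministic byte-sum hash modulo the server count,
    # so every triple sharing a key lands on the same server.
    placement = {key: sum(key.encode("utf-8")) % num_servers for key in store}
    return dict(store), placement


quads = [
    ("alice", "knows", "bob", "g1"),
    ("alice", "age", "30", "g1"),
    ("bob", "knows", "carol", "g2"),
]
store, placement = partition_quads(quads, num_servers=4)
```

Because the key is the only input to the placement rule, all triples that share a fourth element stay together, which is the locality property the abstract describes.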
IN-MEMORY DATABASE (IMDB) ACCELERATION THROUGH NEAR DATA PROCESSING
An accelerator is disclosed. The accelerator may include an on-chip memory to store data from a database. The on-chip memory may include a first memory bank and a second memory bank. The first memory bank may store the data, which may include a first value and a second value. A computational engine may execute, in parallel, a command on the first value in the data and the command on the second value in the data in the on-chip memory. The on-chip memory may be configured to load second data from the database into the second memory bank in parallel with the computational engine executing the command on the first value in the data and executing the command on the second value in the data.
SYSTEM AND METHOD FOR PROVIDING A RESPONSE TO A PARALLEL SEARCH QUERY
A system and method for providing a response to a parallel search query received at a digital platform. The method encompasses receiving, at the digital platform, a user query. The method thereafter comprises identifying a plurality of entities in the user query. Further, the method encompasses identifying the user query as the parallel search query based on the identification of the plurality of entities in the user query. The method thereafter comprises generating a user interface based on the identification of the parallel search query, wherein the user interface comprises a scrollable segment for each entity from the plurality of entities. Also, the method comprises performing a search for said each entity. The method thereafter comprises generating a response for said each entity based on the search performed. Further, the method comprises providing the response generated for said each entity via the scrollable segment for said each entity.
Re-ordered processing of read requests
A method includes determining, in accordance with a first ordering, a plurality of read requests for a memory device. The plurality of read requests are added to a memory device queue for the memory device in accordance with the first ordering. The plurality of read requests in the memory device queue are processed, in accordance with a second ordering that is different from the first ordering, to determine read data for each of the plurality of read requests. The read data for each of the plurality of read requests is added to one of a set of ordered positions, based on the first ordering, of a ring buffer as each of the plurality of read requests is processed. The read data of a subset of the plurality of read requests is submitted based on adding the read data to a first ordered position of the set of ordered positions of the ring buffer.
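The re-ordering behaviour can be sketched as follows. This is a simplified illustration, not the claimed implementation: it assumes one buffer slot per request and omits wraparound, so the buffer is ring-like only in that a head pointer advances through ordered positions. The names process_reordered and read_fn are hypothetical.

```python
def process_reordered(requests, processing_order, read_fn):
    """Process requests in processing_order, but submit results in request order.

    requests         -- list of requests in the first (submission) ordering
    processing_order -- indices into requests, giving the second ordering
    read_fn          -- callable that produces the read data for one request
    """
    slots = [None] * len(requests)     # ordered positions, one per request
    filled = [False] * len(requests)
    head = 0                           # first ordered position not yet submitted
    batches = []                       # each batch is one in-order submission
    for index in processing_order:
        slots[index] = read_fn(requests[index])
        filled[index] = True
        if index == head:
            # The first pending position was just filled: submit it together
            # with every contiguous completed position that follows it.
            batch = []
            while head < len(requests) and filled[head]:
                batch.append(slots[head])
                head += 1
            batches.append(batch)
    return batches


batches = process_reordered(["a", "b", "c", "d"], [2, 0, 1, 3], str.upper)
```

In this run, request 2 completes first but is held back until positions 0 and 1 are filled, so results still leave the buffer in the first ordering.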
LOW LATENCY INGESTION INTO A DATA SYSTEM
Described herein are techniques for improving transfer of metadata from a metadata database to a database stored in a data system, such as a data warehouse. The metadata may be written into the metadata database with a version stamp, which is a monotonically increasing register value, and a partition identifier, which can be generated using attribute values of the metadata. A plurality of readers can scan the metadata database based on version stamp and partition identifier values to export the metadata to a cloud storage location. From the cloud storage location, the exported data can be auto-ingested into the database, which includes a journal and snapshot table.
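A minimal sketch of the write and scan path follows, assuming an in-memory list stands in for the metadata database, a process-local counter stands in for the version-stamp register, and a CRC32 of the sorted attribute values yields the partition identifier. The class name MetadataDB and these choices are illustrative assumptions, not details from the abstract.

```python
import itertools
import zlib


class MetadataDB:
    """Toy metadata store: rows carry a version stamp and a partition identifier."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = itertools.count(1)  # stand-in for the monotonic register
        self.rows = []

    def write(self, attrs):
        # Assumed rule: derive the partition from the attribute values themselves.
        key = "|".join(f"{name}={attrs[name]}" for name in sorted(attrs))
        partition = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        version = next(self._counter)
        self.rows.append((version, partition, attrs))
        return version, partition

    def scan(self, partition, after_version):
        """Rows one reader would export: its partition, newer than its checkpoint."""
        return [row for row in self.rows
                if row[1] == partition and row[0] > after_version]


db = MetadataDB(num_partitions=2)
v1, p1 = db.write({"table": "t1"})
v2, p2 = db.write({"table": "t1"})
pending = db.scan(p1, after_version=v1)
```

Each reader can own one partition and remember the highest version stamp it has exported, which lets several readers scan in parallel without overlapping.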
Statistics based query transformation
Techniques are described for responding to aggregate queries using optimizer statistics already available in the data dictionary of the database in which the database object targeted by the aggregate query resides, without the user creating any additional objects (e.g. materialized views) and without requiring the objects to be loaded into volatile memory in a columnar fashion. The user query is rewritten to produce a transformed query that targets the dictionary tables to form the aggregate result without scanning the user tables. “Accuracy indicators” may be maintained to indicate whether those statistics are accurate. Only accurate statistics are used to answer queries that require accurate answers. The accuracy check can be made during runtime, allowing the query plan of the transformed query to be used regardless of the accuracy of the statistics. For queries that request approximations, inaccurate statistics may be used so long as the statistics are “accurate enough”.
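The runtime accuracy check can be sketched as below. The sketch assumes the dictionary statistics are a plain Python dict keyed by (table, column) with an "accurate" flag, and that a fallback scan function is supplied by the caller. The names answer_aggregate, scan_table, and the dict layout are hypothetical, and the "accurate enough" test is reduced to a boolean flag.

```python
def answer_aggregate(stats, table, column, func, scan_fn, approximate=False):
    """Answer from dictionary statistics when allowed, otherwise scan the table."""
    entry = stats.get((table, column))
    if entry is not None and func in entry["values"]:
        # Checked at run time: stale statistics serve only approximate queries.
        if entry["accurate"] or approximate:
            return entry["values"][func], "statistics"
    return scan_fn(table, column, func), "scan"


rows = {"orders": [{"amount": 5}, {"amount": 12}, {"amount": 7}]}


def scan_table(table, column, func):
    """Fallback path: compute the aggregate by reading every row."""
    values = [row[column] for row in rows[table]]
    return {"MIN": min, "MAX": max, "COUNT": len}[func](values)


# The stored MAX is stale (10 instead of 12) and flagged as inaccurate.
stats = {("orders", "amount"): {"values": {"MIN": 5, "MAX": 10}, "accurate": False}}

exact = answer_aggregate(stats, "orders", "amount", "MAX", scan_table)
approx = answer_aggregate(stats, "orders", "amount", "MAX", scan_table, approximate=True)
```

Because the decision is made when the query runs, the same plan serves both cases: it reads the statistics if they qualify and falls back to the scan if they do not.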
Multidimensional associative memory and data searching
A method for searching data includes storing probe data and target data expressed in a first orthogonal domain. The target data includes potential probe match data each characterized by the length of the target data. The probe data representation and the target data are transformed into an orthogonal domain. In the orthogonal domain, the target data is encoded with modulation functions to produce a plurality of encoded target data, each of the modulation functions having a position index corresponding to one of the potential probe match data. The plurality of encoded target data is interfered with the probe data in the orthogonal domain and an inverse transform result is obtained. If the inverse transform result exceeds a threshold, information is output indicating a match between the probe data and a corresponding one of the potential probe match data.
Systems and methods for determining peak memory requirements in SQL processing engines with concurrent subtasks
The present invention is generally directed to systems and methods of determining and provisioning peak memory requirements in Structured Query Language (SQL) processing engines. More specifically, methods may include determining or obtaining a query execution plan; gathering statistics associated with each database table; breaking the query execution plan into one or more subtasks; calculating an estimated memory usage for each subtask using the statistics; determining or obtaining a dependency graph of the one or more subtasks; based at least in part on the dependency graph, determining which subtasks can execute concurrently on a single worker node; and totaling the amount of estimated memory for each subtask that can execute concurrently on a single worker node and setting this amount of estimated memory as the estimated peak memory requirement for the specific database query.
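The totaling step can be illustrated with a short sketch. It makes a simplifying assumption that is not in the abstract: subtasks at the same depth of the dependency graph are treated as the ones that run concurrently on a worker node, and the peak is the largest per-depth total. The function name estimate_peak_memory and the sample figures are hypothetical.

```python
def estimate_peak_memory(memory, deps):
    """Estimate peak memory from per-subtask estimates and a dependency graph.

    memory -- subtask name -> estimated memory usage
    deps   -- subtask name -> set of subtasks it must wait for
    """
    depth_of = {}

    def depth(task):
        # Depth 0 for subtasks with no prerequisites, else 1 + deepest prerequisite.
        if task not in depth_of:
            depth_of[task] = 1 + max(
                (depth(parent) for parent in deps.get(task, ())), default=-1
            )
        return depth_of[task]

    totals = {}
    for task, estimate in memory.items():
        # Assumption: subtasks sharing a depth execute concurrently.
        totals[depth(task)] = totals.get(depth(task), 0) + estimate
    return max(totals.values(), default=0)


memory = {"scan_a": 100, "scan_b": 200, "join": 250, "agg": 50}
deps = {"join": {"scan_a", "scan_b"}, "agg": {"join"}}
peak = estimate_peak_memory(memory, deps)
```

Here the two scans can run together and total 300, which exceeds the join (250) and the aggregation (50), so 300 becomes the estimated peak for the query.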
CONFIGURING GRAPH QUERY PARALLELISM FOR HIGH SYSTEM THROUGHPUT
The present disclosure involves systems, software, and computer implemented methods for configuring graph query parallelism for high system throughput. One example method includes receiving a query to be executed against a graph database. System properties are determined of a system in which the query is to be executed. Algorithmic properties are determined of at least one algorithm to be used to execute the query. Graph data statistics are determined for the graph database. Graph traversal estimations are determined for a first iteration of the graph query and an estimated cost model is determined for the first iteration based on the graph traversal estimations. Estimated thread boundaries are determined for performing parallel execution of the first iteration. Work packages of vertices to be processed during the execution of the first iteration are determined based on the first estimated cost model and the work packages are provided to a work package scheduler.
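One way to picture the work-package step is the sketch below. It assumes the estimated cost of a vertex is one plus its out-degree and that packages are cut whenever their accumulated cost reaches a per-thread target. Both rules, along with the name build_work_packages and its parameters, are illustrative assumptions and are not the cost model of the disclosure.

```python
def build_work_packages(frontier, out_degree, num_threads, min_cost_per_package):
    """Split the vertices of one iteration into packages of roughly equal cost."""
    # Assumed cost estimate: visiting a vertex costs 1 plus one unit per outgoing edge.
    costs = {vertex: 1 + out_degree.get(vertex, 0) for vertex in frontier}
    total = sum(costs.values())
    # Aim for one package per thread, but never below the minimum useful size.
    target = max(min_cost_per_package, -(-total // num_threads))
    packages, current, current_cost = [], [], 0
    for vertex in frontier:
        current.append(vertex)
        current_cost += costs[vertex]
        if current_cost >= target:
            packages.append(current)
            current, current_cost = [], 0
    if current:
        packages.append(current)  # leftover vertices form a final, smaller package
    return packages


packages = build_work_packages(
    frontier=[1, 2, 3, 4],
    out_degree={1: 9, 2: 0, 3: 0, 4: 8},
    num_threads=2,
    min_cost_per_package=1,
)
```

The resulting packages would then be handed to a scheduler; raising min_cost_per_package yields fewer, larger packages when the estimated work is too small to justify many threads.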
Building data platform with a distributed digital twin
A method including receiving, by one or more processing circuits, building data, generating, by the one or more processing circuits, a first digital twin based on the building data, wherein a first system stores the first digital twin and a second system stores a second digital twin generated based on the building data, where the first digital twin includes a relationship that forms a connection between the first digital twin and the second digital twin by linking a first entity of the first entities of the first digital twin and a second entity of the second entities of the second digital twin, and performing, by the one or more processing circuits, one or more operations based on at least one of the first digital twin, the second digital twin, or the relationship that forms the connection between the first digital twin and the second digital twin.