G06F16/316

Systems and methods for caching structural elements of electronic documents

Systems and methods are disclosed herein for caching structural elements of electronic documents. A plurality of indices is stored in a data store. The plurality of indices corresponds to locations within an electronic document of portions of a stylized sub-element. A mutation to the electronic document is received and it is determined that the mutation pertains to a style of the stylized sub-element. Based on the mutation and the plurality of indices, the stylized sub-element is updated, and the portions of the updated stylized sub-element are caused to be displayed at a user device using a different style.
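
The caching idea in this abstract can be illustrated with a minimal sketch. Everything named here (`StyleIndexCache`, the `kind`/`element_id`/`changes` mutation fields, the `(start, end)` span representation) is a hypothetical stand-in and not the disclosed implementation; the sketch only shows how stored indices might let a style mutation be applied to every portion of a sub-element without rescanning the document.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StyleIndexCache:
    """Hypothetical cache: sub-element id -> document offsets of its portions."""

    indices: dict[str, list[tuple[int, int]]] = field(default_factory=dict)
    styles: dict[str, dict[str, str]] = field(default_factory=dict)

    def register(self, element_id, spans, style):
        # Store where each portion of the stylized sub-element lives.
        self.indices[element_id] = list(spans)
        self.styles[element_id] = dict(style)

    def apply_mutation(self, mutation):
        """Return (start, end, style) triples to re-render, or [] if not a style change."""
        if mutation.get("kind") != "style":
            return []
        element_id = mutation["element_id"]
        if element_id not in self.indices:
            return []
        # Update the cached style, then reuse the cached indices for every portion.
        self.styles[element_id].update(mutation["changes"])
        style = self.styles[element_id]
        return [(start, end, dict(style)) for start, end in self.indices[element_id]]


cache = StyleIndexCache()
cache.register("heading-1", [(0, 12), (40, 52)], {"font-weight": "bold"})
updates = cache.apply_mutation(
    {"kind": "style", "element_id": "heading-1", "changes": {"color": "red"}}
)
```

Each returned triple identifies one portion of the sub-element and the style it should now be displayed with; a mutation of any other kind leaves the cache untouched.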

MANAGEMENT SYSTEM FOR SOFTWARE INCIDENTS

A method comprises receiving an incident report comprising a textual description of an incident; generating a regularized incident report in which out-of-vocabulary terms in the received incident report are replaced with in-vocabulary terms; determining importance measures for a plurality of incident report terms, wherein each of the incident report terms is in the regularized incident report; generating an incident matrix in which similarity values are defined for combinations of terms in the incident report and terms in a predetermined term set; generating an incident vector based on the incident matrix and the importance measures for the terms in the incident report; applying one or more machine learning (ML) models that identify, based on the incident vector, relevant software support records and/or software modules, wherein the relevant software support records and the software modules are potentially relevant to the incident; and outputting data identifying relevant software support records and/or software modules.
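
The pipeline from regularized report to incident vector can be sketched roughly as follows. The abstract does not say how the importance measures or similarity values are computed, so this sketch assumes relative term frequency for importance and character-trigram Jaccard overlap for similarity; `VOCAB_MAP` and `TERM_SET` are hypothetical example data, and the downstream ML models are omitted.

```python
from collections import Counter

VOCAB_MAP = {"db": "database", "svc": "service"}  # hypothetical OOV -> in-vocabulary map
TERM_SET = ["database", "service", "timeout"]     # hypothetical predetermined term set


def regularize(text):
    """Replace out-of-vocabulary tokens with in-vocabulary terms."""
    return [VOCAB_MAP.get(tok, tok) for tok in text.lower().split()]


def trigrams(term):
    """Character trigrams of a padded term."""
    padded = f"  {term} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Assumed similarity measure: Jaccard overlap of character trigrams."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def incident_vector(text):
    terms = regularize(text)
    counts = Counter(terms)
    total = sum(counts.values())
    # Assumed importance measure: relative term frequency in the report.
    importance = {t: c / total for t, c in counts.items()}
    # Incident matrix: one row per report term, one column per term in TERM_SET.
    matrix = {t: [similarity(t, ref) for ref in TERM_SET] for t in counts}
    # Incident vector: importance-weighted sum of the matrix rows.
    return [
        sum(importance[t] * matrix[t][j] for t in counts)
        for j in range(len(TERM_SET))
    ]


vec = incident_vector("db timeout")
```

The resulting vector has one component per term in the predetermined term set and would serve as the input to whatever model ranks support records or software modules.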

Systems and Methods for Facilitating Semantic Search of Audio Content

The various implementations described herein include methods and devices for facilitating semantic search. In one aspect, a method includes obtaining audio content and extracting vocabulary terms from the audio content. The method further includes generating, using a transformer model, vocabulary embeddings from the vocabulary terms, and generating one or more topic embeddings from the audio content and the vocabulary embeddings. The method also includes generating a topic embedding index for the audio content based on the one or more topic embeddings, and storing the topic embedding index for use with a search engine system.

QUERY PROCESSING AND VISUALIZATION APPARATUSES, METHODS AND SYSTEMS

The QUERY PROCESSING AND VISUALIZATION APPARATUSES, METHODS AND SYSTEMS (QPAV) provides a platform that, in various embodiments, is configurable to receive, evaluate, and respond to queries over collections of structured and unstructured data, such as call records having associated metadata. Implementations provide for the generation of graphical representations of call networks, comprising nodes and links, in response to a received query which may comprise terms spoken in one or more call transcripts. The visual representation of query results may be enhanced by metadata, and may be configurable by the user to highlight particular connections, behaviors, or other insights associated with callers in the network.

Methods and systems for a compliance framework database schema

A compliance framework is generated. The compliance framework facilitates an organization's compliance with multiple authority documents by providing efficient methodologies and refinements to existing technologies, such as hierarchical fidelity to the original authority document; separation of auditable citations from their context (e.g., prepositions and/or informational citations); asset-focused citations; and SNED and Live values, among others.

Document elimination for compact and secure storage and management thereof

Documents, such as those that may or will be the subject of a litigation, may be managed by automatically determining that a document, such as an email or other communication, is privileged or producible such that superfluous documents may be removed to improve data storage and reduce the burden on storage, processing, and communication resources. Additionally, documents such as emails may comprise attached or embedded documents (e.g., attachments), which may be classified similarly to, or independently of, their associated email. After determining privilege, such as via metadata associated with a sender/receiver of an email, similarly categorized documents may be grouped for presentation and/or storage. The documents may be indexed, such as by entries within a production log, to further facilitate accurate production and management of non-privileged documents, as well as the exclusion of privileged documents. Documents not required for production may be indexed and/or purged from storage.

Generation of process models in domains with unstructured data

A computing server is configured to process data of a domain from heterogeneous data sources. A domain may store data and schema, a domain knowledge ontology such as a resource description framework, and unstructured data. The computing server may extract objects, such as named entities and activities, from the unstructured data. The computing server may convert the extracted named entities and activities to word embeddings and input the word embeddings to a machine learning model to generate an activity time sequence. The machine learning model may be a long short-term memory (LSTM) network. A process model may be generated from the time sequence. The computing server may identify outliers in the process model based on metrics defined by the domain. The computing server may convert transactions without outliers into word embeddings and generate signatures of the transactions using cosine similarity. The computing server may augment the results with the domain knowledge ontology.

Method and System for High Performance Integration, Processing and Searching of Structured and Unstructured Data

Disclosed herein are methods and systems for integrating an enterprise's structured and unstructured data to provide users and enterprise applications with efficient and intelligent access to that data. In accordance with exemplary embodiments, the generation of feature vectors about unstructured data can be hardware-accelerated by processing streaming unstructured data through a reconfigurable logic device, a graphics processing unit (GPU), or a chip multi-processor (CMP) to determine features that can aid clustering of similar data objects.

DATA FORWARDER FOR DISTRIBUTED DATA ACQUISITION, INDEXING AND SEARCH SYSTEM
20190146836 · 2019-05-16

A scheduler manages execution of a plurality of data-collection jobs, assigns individual jobs to specific forwarders in a set of forwarders, and generates and transmits tokens (e.g., pairs of data-collection tasks and target sources) to the assigned forwarders. Each forwarder uses the tokens, along with stored information applicable across jobs, to collect data from the target source and forward it to an indexer for processing. For example, the indexer can then break a data stream into discrete events, extract a timestamp from each event, and index (e.g., store) the event based on the timestamp. The scheduler can monitor the forwarders' job performance and use that performance to influence subsequent job assignments. Thus, data-collection jobs can be efficiently assigned to and executed by a group of forwarders, where the group can potentially be diverse and dynamic in size.
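
The indexer step described above (break a stream into events, extract a timestamp from each, index by timestamp) might look roughly like the sketch below. It assumes one event per line with an ISO-like timestamp prefix; the `index_stream` function and the line format are illustrative assumptions, not the system's actual event-breaking rules.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed event format: "<YYYY-MM-DDTHH:MM:SS> <message>", one event per line.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\s+(.*)$")


def index_stream(stream):
    """Split a raw stream into line events and bucket them by parsed timestamp."""
    index = defaultdict(list)
    for line in stream.splitlines():
        match = TIMESTAMP.match(line.strip())
        if match is None:
            continue  # skip lines with no recognizable timestamp
        ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        index[ts].append(match.group(2))
    return dict(index)


stream = (
    "2019-05-16T10:00:00 job started\n"
    "noise without timestamp\n"
    "2019-05-16T10:00:05 job done"
)
events = index_stream(stream)
```

Keying the store by timestamp means time-range queries only need to inspect the matching keys rather than the raw stream.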

SYSTEMS AND METHODS FOR PERFORMING SEARCH AND RETRIEVAL OF ELECTRONIC DOCUMENTS USING A BIG INDEX
20190147000 · 2019-05-16

Methods and systems for providing a search engine capability for large datasets are disclosed. These methods and systems employ a Partition-by-Query index containing key-value pairs, with keys reflecting concept-ordered search phrases and values reflecting ordered lists of document references that are responsive to the concept-ordered search phrase in the corresponding key. A large Partition-by-Query index may be partitioned across multiple servers depending on the size of the index, or the size of the index may be reduced by compressing query-reference pairs into clusters. The methods and systems described herein may provide suggestions and spelling corrections to the user, thereby improving the user's search engine experience while meeting user expectations for search quality and responsiveness.
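
A query-keyed index partitioned across servers can be sketched as below. The abstract does not define how phrases are concept-ordered, so this sketch substitutes a simple stand-in (lowercase, deduplicate, sort terms alphabetically); the `PartitionByQueryIndex` class, the hash-based shard choice, and the in-memory dictionaries standing in for servers are all assumptions made for illustration.

```python
import hashlib


def concept_key(query):
    """Stand-in for concept ordering: lowercase, dedupe, sort terms alphabetically."""
    return " ".join(sorted(set(query.lower().split())))


def shard_for(key, num_shards):
    """Pick a shard from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class PartitionByQueryIndex:
    """Hypothetical index: each shard maps a query key to an ordered list of doc refs."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, query, doc_refs):
        key = concept_key(query)
        self.shards[shard_for(key, len(self.shards))][key] = list(doc_refs)

    def get(self, query):
        key = concept_key(query)
        return self.shards[shard_for(key, len(self.shards))].get(key, [])


pbq = PartitionByQueryIndex(num_shards=4)
pbq.put("red running shoes", ["doc7", "doc2"])
results = pbq.get("shoes red running")
```

Because differently worded queries normalize to the same key, a lookup touches exactly one shard and returns the precomputed, already-ordered result list.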