G06F16/316

Methods and systems for building a search service application

A system for providing a search service is disclosed and includes a processor-based search service application builder component that provides a search model representing a search service application for a first object of a plurality of objects. The search model is based at least on a user-defined end-user input field corresponding to a first attribute of a plurality of attributes associated with the first object and a user-defined search result output field corresponding to a second attribute of the plurality of attributes. The search model is also associated with a backend data store that supports a storage structure configured to store information relating to the first object. The system also includes a processor-based deployment engine that automatically configures a search engine system associated with the backend data store system to generate and/or update search index(es) based on at least one of the first attribute and the second attribute.

Segmenting machine data into events based on source signatures

Methods and apparatus consistent with the invention provide the ability to organize and build understandings of machine data generated by a variety of information-processing environments. Machine data is a product of information-processing systems (e.g., activity logs, configuration files, messages, database records) and represents the evidence of particular events that have taken place and been recorded in raw data format. In one embodiment, machine data is turned into a machine data web by organizing machine data into events and then linking events together.
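The two steps named in this abstract, organizing machine data into events and then linking events together, can be illustrated with a minimal sketch. It is not the patented method: the timestamp pattern used as an event boundary, the hypothetical `id=<token>` field used for linking, and the function names `segment_events` and `link_events` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumption: a new event starts at a line beginning with an ISO-like timestamp.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
# Assumption: events are linked through a hypothetical "id=<token>" field.
ID_FIELD = re.compile(r"\bid=(\w+)")


def segment_events(raw: str) -> list[str]:
    """Group raw lines into events; continuation lines join the previous event."""
    events: list[str] = []
    for line in raw.splitlines():
        if TIMESTAMP.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


def link_events(events: list[str]) -> dict[str, list[int]]:
    """Map each shared id token to the indexes of the events that mention it."""
    links: dict[str, list[int]] = defaultdict(list)
    for index, event in enumerate(events):
        for token in set(ID_FIELD.findall(event)):
            links[token].append(index)
    # Keep only tokens that actually connect two or more events.
    return {token: idxs for token, idxs in links.items() if len(idxs) > 1}
```

Calling `link_events(segment_events(raw))` on a multi-line log returns, for each shared token, the positions of the events it connects, which is one simple way to represent links between events.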

Determining similarity between documents
11636167 · 2023-04-25

A method and system for processing digital works. The method comprises: identifying terms within each digital work of a plurality of digital works, wherein the terms are words and/or phrases; determining a number of times that the identified terms occur within each digital work; generating a fingerprint for each digital work, the fingerprint being based on the identified terms and the number of times that they occur within that digital work; using a neural network to find an encoding function, g, that encodes a higher dimensionality space, x, of each fingerprint into a lower dimensionality space, y; applying the encoding function to each fingerprint to reduce its dimensionality; and determining a similarity between a first fingerprint and one or more dimensionality-reduced fingerprints.
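A minimal sketch of the fingerprint and similarity steps follows. It covers only term counting and comparison: the neural-network encoding function g is omitted, so the fingerprints are compared at full dimensionality. Cosine similarity is an assumed choice of measure (the abstract does not name one), terms are limited to single words, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def fingerprint(text: str) -> Counter:
    """Count how often each word occurs (single words only, no phrases)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors (assumed measure)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty work is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

In a fuller version, each fingerprint would first be mapped to a fixed-length vector over a shared vocabulary and passed through the learned encoder before the similarity is computed.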

AUTOMATIC IDENTIFICATION OF DOCUMENT SECTIONS TO GENERATE A SEARCHABLE DATA STRUCTURE
20230119590 · 2023-04-20

Methods and apparatuses are described for automatically identifying text sections of a document to generate a searchable hierarchical data structure. A computing device receives a document comprising text entities and converts the document from a first format to a second format, including generating metadata associated with text alignment, text position, text spacing, or fonts. The computing device extracts the text blocks, including determining coordinates associated with each text block using the metadata. The computing device determines document sections using the document metadata by identifying strings in the extracted text blocks that indicate a presence of a bullet point in the document, assigns a hierarchical category to each identified document section, and inserts text of each document section into a hierarchical data structure based upon the assigned hierarchical category. The computing device traverses the hierarchical data structure using search request data to identify document sections relating to the search request data.
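The bullet detection, hierarchy assignment, and traversal described above can be sketched as follows. This is an illustration under stated assumptions, not the described apparatus: leading whitespace stands in for the text-block coordinates derived from layout metadata, the bullet pattern is a guess at typical markers, and `Section`, `build_tree`, and `search` are hypothetical names.

```python
import re
from dataclasses import dataclass, field

# Assumption: a bullet is "-", "*", "•", or a number followed by "." or ")".
BULLET = re.compile(r"^(\s*)(?:[-*\u2022]|\d+[.)])\s+(.*)$")


@dataclass
class Section:
    text: str
    level: int  # indentation width, standing in for a layout coordinate
    children: list["Section"] = field(default_factory=list)


def build_tree(lines: list[str]) -> Section:
    """Nest bullet lines under the nearest preceding line with less indentation."""
    root = Section(text="", level=-1)
    stack = [root]
    for line in lines:
        match = BULLET.match(line)
        if not match:
            continue  # non-bullet text is ignored in this sketch
        node = Section(text=match.group(2), level=len(match.group(1)))
        while stack[-1].level >= node.level:
            stack.pop()
        stack[-1].children.append(node)
        stack.append(node)
    return root


def search(node: Section, query: str, path: tuple = ()) -> list[tuple]:
    """Depth-first traversal returning the path to every matching section."""
    hits = []
    for child in node.children:
        child_path = path + (child.text,)
        if query.lower() in child.text.lower():
            hits.append(child_path)
        hits.extend(search(child, query, child_path))
    return hits
```

Returning the full path for each hit keeps the hierarchical category of a matching section visible to the caller.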

Segmenting machine data into events to identify matching events

Methods and apparatus consistent with the invention provide the ability to organize and build understandings of machine data generated by a variety of information-processing environments. Machine data is a product of information-processing systems (e.g., activity logs, configuration files, messages, database records) and represents the evidence of particular events that have taken place and been recorded in raw data format. In one embodiment, machine data is turned into a machine data web by organizing machine data into events and then linking events together.

Methods and systems for generating a virtual assistant in a messaging user interface
11665118 · 2023-05-30

A system for generating a virtual assistant in a messaging user interface, the system including a computing device configured to: initiate a virtual message user interface between a user client device and the computing device; receive a user message entered by a user into the virtual message user interface; retrieve data relating to a user agenda list including a plurality of agenda actions; analyze the user message to identify an agenda action related to the user message; and generate a response to the user message as a function of analyzing the user message, wherein generating the response further comprises generating a user-action learner, wherein the user-action learner uses a previous message and the user message as inputs and outputs a response; identifying a response as a function of generating the user-action learner; and displaying the response within the virtual message user interface.

FULL-TEXT INDEXING METHOD AND SYSTEM BASED ON GRAPH DATABASE
20220335086 · 2022-10-20

The present disclosure relates to a full-text indexing method and system based on a graph database. A full-text indexing engine creates an index template. Data in a graph database whose field type is character string is synchronized to the full-text indexing engine, which creates an index for each piece of character string data according to the index template to obtain a full-text index. The graph database acquires query request information and sends it to the full-text indexing engine, which acquires a first result set of a query statement according to the full-text index. The graph database then performs data scanning on the first result set based on key-value pairs to obtain a second result set. The full-text indexing engine supports conditional filtering of the character string type. Character string data is quickly found in the full-text indexing engine, thereby improving efficiency of data retrieval.
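The two-stage query flow in this abstract can be sketched in a few lines: a full-text index over string properties produces a first result set, and a key-value scan narrows it to a second. This is only an illustration. The in-memory dictionary standing in for the graph database, whitespace tokenization, and the names `FullTextIndex`, `sync`, `query`, and `search` are assumptions, and the index template step is not modeled.

```python
from collections import defaultdict


class FullTextIndex:
    """Toy inverted index over the string-valued properties of each vertex."""

    def __init__(self) -> None:
        self._postings: dict[str, set] = defaultdict(set)

    def sync(self, vertex_id, properties: dict) -> None:
        """Index only properties whose value is a character string."""
        for value in properties.values():
            if isinstance(value, str):
                for token in value.lower().split():
                    self._postings[token].add(vertex_id)

    def query(self, term: str) -> set:
        """Return a copy of the vertex ids whose strings contain the term."""
        return set(self._postings.get(term.lower(), set()))


def search(graph: dict, index: FullTextIndex, term: str, **filters) -> list:
    """First result set from the index, second from a key-value scan."""
    first_result_set = index.query(term)
    return sorted(
        vertex_id
        for vertex_id in first_result_set
        if all(graph[vertex_id].get(key) == value for key, value in filters.items())
    )
```

The point of the split is that the key-value scan only touches vertices already matched by the text index, rather than the whole graph.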

DOCUMENT ELIMINATION FOR COMPACT AND SECURE STORAGE AND MANAGEMENT THEREOF

Documents, such as those that may or will be the subject of litigation, may be managed by automatically determining whether a document, such as an email or other communication, is privileged or producible, so that superfluous documents may be removed to reduce the burden on storage, processing, and communication resources. Additionally, documents such as emails may comprise attached or embedded documents (e.g., attachments), which may be classified similarly to, or independently of, their associated email. After privilege is determined, such as via metadata associated with a sender/receiver of an email, similarly categorized documents may be grouped for presentation and/or storage. The documents may be indexed, such as by entries within a production log, to further facilitate accurate production and management of non-privileged documents, as well as the exclusion of privileged documents. Documents not required for production may be indexed and/or purged from storage.

Data processing systems and methods

Example data processing systems and methods are described. In one implementation, a system accesses a corpus of data and analyzes the data contained in the corpus to identify multiple documents. The system generates vector indexes for the multiple documents such that the vector indexes allow a computing system to quickly access the multiple documents and identify an answer to a question associated with the corpus of data.

Indexing Access Limited Native Applications
20230106266 · 2023-04-06

Methods, systems, and apparatus for determining that a native application limits access to the native application using account credential requirements, the native application generating an application environment for display on a user device within the native application and operating independent of a browser application that can operate on the user device; obtaining a set of account credentials for indexing environment instances of the native application; instantiating the native application with the set of account credentials; and accessing environment instances of the native application, and for each of the environment instances: generating environment instance data describing content of the environment instance, the content described by the environment instance data including text that a user device displays on the environment instance when the user device displays the environment instance; and indexing the environment instance data for the native application in an index that is searchable by a search engine.