G06F16/3332

SEMANTICS BASED DATA AND METADATA MAPPING
20230044287 · 2023-02-09

The present disclosure involves a computer-implemented method, medium, and system for automatically correlating semantically connected data and metadata. One example method includes identifying a document that is to be analyzed using a semantics based mapping (SBM) infrastructure. A matching process is performed for the identified document using the SBM infrastructure, where the matching process identifies a plurality of matching terms within the document, the plurality of matching terms are assigned to a plurality of semantics identifiers (IDs), and each semantics ID corresponds to one or more terms in the plurality of matching terms. Each of the plurality of matching terms is replaced with a respective term ID to generate an updated document. A request to search for a target term in the document is received. The target term is translated to a target term ID based on the SBM infrastructure. The updated document is searched for one or more matching terms.
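
The replace-then-search flow described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `SEMANTICS` table, the `SID_...` identifier format, and the function names are hypothetical, and the sketch assumes that every term in a semantic group shares a single ID.

```python
import re
from typing import List

# Hypothetical SBM table: each semantics ID groups terms with the same meaning.
SEMANTICS = {
    "SID_001": ["customer", "client", "buyer"],
    "SID_002": ["invoice", "bill"],
}

# Reverse lookup from a term to its ID (assumption: one ID per semantic group).
TERM_TO_ID = {term: sid for sid, terms in SEMANTICS.items() for term in terms}

# Match whole words only; longer terms are tried first.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in sorted(TERM_TO_ID, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def replace_terms(document: str) -> str:
    """Return an updated document in which each matching term is replaced by its ID."""
    return _PATTERN.sub(lambda m: TERM_TO_ID[m.group(0).lower()], document)


def search(updated_document: str, target_term: str) -> List[int]:
    """Translate the target term to its ID and return the offsets where that ID occurs."""
    target_id = TERM_TO_ID.get(target_term.lower())
    if target_id is None:
        return []
    return [m.start() for m in re.finditer(re.escape(target_id), updated_document)]


updated = replace_terms("The client paid the bill.")
hits = search(updated, "customer")  # "customer" and "client" share an ID in this table
```

Because the search runs on IDs rather than surface strings, a query for "customer" also finds passages that originally said "client".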

Hybrid structured/unstructured search and query system
11567978 · 2023-01-31

Technologies are described herein for executing queries expressed with reference to a structured query language against unstructured data. A user issues a structured query through a traditional structured data management (“SDM”) application. Upon receiving the structured query, an SDM driver analyzes the structured query and extracts a data structure from the unstructured data, if necessary. The structured query is then converted to an unstructured query based on the extracted data structure. The converted unstructured query may then be executed against the unstructured data. Results from the query are reorganized into structured data utilizing the extracted data structure and are then presented to the user through the SDM application.

Adaptive interpretation and compilation of database queries

A method executes at a computer system to retrieve data from a database. Upon receiving a database query, the computer system translates the query into an intermediate representation, and estimates a compilation time to compile the intermediate representation into machine executable code. The query execution time to retrieve a result set is also estimated. In accordance with a determination that the query execution time and compilation time satisfy an interpretation criterion, the computer system invokes a byte code interpreter to interpret the intermediate representation and retrieve the result set from the database. In accordance with a determination that the query execution and compilation times satisfy one of a plurality of compilation criteria, the computer system compiles the intermediate representation to form machine code and executes the machine code to retrieve the result set from the database. In some cases, the query intermediate representation is optimized prior to compilation.
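
The choice between interpreting and compiling can be sketched as a comparison of estimated total times. The abstract does not state the actual criteria, so the cost model below is an assumption: the speedup factors, the two compilation tiers, and the name `choose_mode` are all hypothetical.

```python
def choose_mode(
    interpreted_exec_seconds: float,
    compile_seconds: float,
    optimized_compile_seconds: float,
    compiled_speedup: float = 3.0,    # assumed speedup of compiled code over the interpreter
    optimized_speedup: float = 6.0,   # assumed speedup after optimizing the IR first
) -> str:
    """Pick the execution mode with the lowest estimated total time."""
    totals = {
        # Interpretation pays no compilation cost.
        "interpret": interpreted_exec_seconds,
        # Compilation pays an up-front cost but executes faster.
        "compile": compile_seconds + interpreted_exec_seconds / compiled_speedup,
        # Optimizing before compiling costs more up front and executes fastest.
        "compile_optimized": optimized_compile_seconds
        + interpreted_exec_seconds / optimized_speedup,
    }
    return min(totals, key=totals.get)


short_query = choose_mode(0.01, 0.05, 0.2)
medium_query = choose_mode(0.3, 0.05, 0.2)
long_query = choose_mode(1.0, 0.05, 0.2)
```

Under these assumed numbers, a very short query is interpreted, a medium one is compiled without optimization, and a long-running one justifies the more expensive optimized compilation.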

Query rewrite for low performing queries based on customer behavior

A method includes receiving a plurality of product query arrays each including a plurality of individual product queries received during a single user search session. The method further includes inputting the plurality of product query arrays into the query rewrite model. Text of each of the plurality of individual product queries in each product query array is treated as a whole token. The method further includes receiving a product query from a user electronic device. The method further includes determining a query rewrite for the product query using the query rewrite model and determining search results for the product query using the query rewrite. The method further includes sending information for presenting the search results on a display of the user electronic device responsive to the product query.
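
Treating each query as a whole token can be illustrated with a simple stand-in for the query rewrite model. The sketch below swaps the model for plain session co-occurrence counts, which is an assumption rather than the technique the abstract claims; the function names and sample sessions are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations
from typing import Dict, List


def train(sessions: List[List[str]]) -> Dict[str, Counter]:
    """Count how often two queries appear in the same search session."""
    co_occurrence: Dict[str, Counter] = defaultdict(Counter)
    for session in sessions:
        # Each full query string is one token; it is never split into words.
        for a, b in permutations(set(session), 2):
            co_occurrence[a][b] += 1
    return co_occurrence


def rewrite(query: str, model: Dict[str, Counter]) -> str:
    """Return the query that most often shares a session with the input, if any."""
    neighbours = model.get(query)
    if not neighbours:
        return query  # unseen query: leave it unchanged
    return neighbours.most_common(1)[0][0]


sessions = [
    ["mens sneekers", "mens sneakers"],
    ["mens sneekers", "mens sneakers", "running shoes"],
    ["running shoes", "trail shoes"],
]
model = train(sessions)
rewritten = rewrite("mens sneekers", model)
```

The search results would then be determined for the rewritten query instead of the original low-performing one.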

Recognizing transliterated words using suffix and/or prefix outputs

A computer-implemented method includes: receiving, by a computing device, an input file defining correct spellings of one or more transliterated words; generating, by the computing device, suffix outputs based on the one or more transliterated words; generating, by the computing device, a dictionary that maps the suffix outputs to the one or more transliterated words; recognizing, by the computing device, an alternatively spelled transliterated word included in a document as one of the one or more correctly spelled transliterated words using the dictionary; and outputting, by the computing device, information corresponding to the recognized transliterated word.
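
A minimal sketch of the suffix-based lookup follows. The abstract does not define "suffix outputs", so the sketch assumes they are the trailing substrings of each correctly spelled word, down to a minimum length; `MIN_SUFFIX` and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional

MIN_SUFFIX = 3  # assumed minimum suffix length, to avoid overly generic matches


def suffixes(word: str) -> List[str]:
    """Return the lowercase suffixes of a word, longest first."""
    w = word.lower()
    return [w[i:] for i in range(len(w) - MIN_SUFFIX + 1)]


def build_dictionary(correct_words: Iterable[str]) -> Dict[str, str]:
    """Map every suffix to the correctly spelled word it came from."""
    table: Dict[str, str] = {}
    for word in correct_words:
        for suffix in suffixes(word):
            # On a collision, the first word in the input file keeps the suffix.
            table.setdefault(suffix, word)
    return table


def recognize(candidate: str, table: Dict[str, str]) -> Optional[str]:
    """Resolve an alternative spelling through its longest known suffix."""
    for suffix in suffixes(candidate):
        if suffix in table:
            return table[suffix]
    return None


table = build_dictionary(["Mohammed", "Tchaikovsky"])
first = recognize("Muhammed", table)
second = recognize("Chaikovsky", table)
```

Here "Muhammed" resolves through the shared suffix "hammed", and "Chaikovsky" resolves because the whole word is a suffix of "Tchaikovsky".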

Transforming a function-step-based graph query to another graph query language

To execute function-step-based graph queries on a graph engine that has its own graph query language, rather than re-implementing an existing infrastructure to support function-step-based graph protocols, function-step-based graph queries are transformed to the graph query language that is understood by the graph engine. The existing infrastructure computes the results of the transformed queries. Result sets are then transformed to function-step-based result sets, which are returned to customers. In this manner, the graph engine supports function-step-based graph query workloads without implementation of the function-step-based graph protocol.

APPARATUS AND METHOD FOR TRANSFORMING UNSTRUCTURED DATA SOURCES INTO BOTH RELATIONAL ENTITIES AND MACHINE LEARNING MODELS THAT SUPPORT STRUCTURED QUERY LANGUAGE QUERIES

A non-transitory computer readable storage medium has instructions executed by a processor to receive from a network connection different sources of unstructured data. An entity is formed by combining one or more sources of the unstructured data, where the entity has relational data attributes. A representation for the entity is created, where the representation includes embeddings that are numeric vectors computed using machine learning embedding models, including trunk models, where a trunk model is a machine learning model trained on data in a self-supervised manner. An enrichment model is created to predict a property of the entity. A query is processed to produce a query result, where the query is applied to one or more of the entity, the embeddings, the machine learning embedding models, and the enrichment model.

SYSTEMS AND METHODS FOR GENERATING SEARCH RESULTS BASED ON OPTICAL CHARACTER RECOGNITION TECHNIQUES AND MACHINE-ENCODED TEXT
20230126412 · 2023-04-27

Disclosed are systems and methods for generating search result data based on machine-encoded text generated by computer vision optical character recognition machine learning techniques performed on digital media. The disclosed systems and methods provide a novel framework for performing machine learning visual search or machine learning text extraction techniques on digital media in order to extract and analyze the data therein and further conduct search queries based on the extracted and analyzed data. The disclosed framework may leverage the aforementioned computer vision machine learning techniques in order to provide a user with relevant search results regarding objects and text detected in digital media captured on a user device.

KNOWLEDGE BASE QUESTION ANSWERING

One or more computer processors parse a received natural language question into an abstract meaning representation (AMR) graph. The one or more computer processors enrich the AMR graph into an extended AMR graph. The one or more computer processors transform the extended AMR graph into a query graph utilizing a path-based approach, wherein the query graph is a directed edge-labeled graph. The one or more computer processors generate one or more answers to the natural language question through one or more queries created utilizing the query graph.

Source code retrieval

A method may include obtaining training code and extracting features from the training code. The extracted features of the training code may be mapped to natural language code vectors by a deep neural network. A natural language search query requesting source-code suggestions may be received, and the natural language search query may be mapped to a natural language search vector by the deep neural network. The method may include mapping the natural language search query to the natural language search vector in the same or a similar method as mapping the extracted features of the training code to natural language code vectors, and the natural language search vector may be compared to the natural language code vectors. Source code responsive to the natural language search query may be suggested based on the comparison between the natural language search vector and the natural language code vectors.
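
The comparison step can be sketched by mapping both the code features and the search query into the same vector space and ranking by similarity. The abstract uses a deep neural network for the mapping; the sketch below substitutes a bag-of-words count vector and cosine similarity so that it stays self-contained. The corpus and function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict


def embed(text: str) -> Counter:
    """Stand-in for the neural encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def suggest(query: str, corpus: Dict[str, str]) -> str:
    """Return the name of the snippet whose feature text is closest to the query."""
    query_vector = embed(query)  # the query is mapped the same way as the code features
    return max(corpus, key=lambda name: cosine(query_vector, embed(corpus[name])))


# Hypothetical features extracted from training code (e.g. identifiers and comments).
corpus = {
    "read_file": "open file and read lines",
    "sort_list": "sort list of numbers ascending",
}
best = suggest("how to sort numbers", corpus)
```

Using one `embed` function for both sides mirrors the requirement that the query and the code features be mapped in the same or a similar way before comparison.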