G06F16/3347

DIALOGUE GENERATION METHOD AND NETWORK TRAINING METHOD AND APPARATUS, STORAGE MEDIUM, AND DEVICE

A dialogue generation method, a network training method and apparatus, a storage medium, and a device are provided. The method includes: predicting, based on a plurality of pieces of candidate knowledge text in a first candidate knowledge set, a preliminary dialogue response of a first dialogue preceding text; processing the first dialogue preceding text based on the preliminary dialogue response to obtain a first dialogue preceding text vector; obtaining a piece of target knowledge text based on a probability value of the piece of target knowledge text being selected for use in generating a final dialogue response, the probability value being obtained based on the first dialogue preceding text vector; and generating the final dialogue response based on the first dialogue preceding text and the piece of target knowledge text.
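The knowledge-selection step can be illustrated with a minimal sketch. It assumes the dialogue preceding text and each candidate knowledge text have already been encoded as vectors; the dot-product scoring, the softmax, and all names and numbers here are hypothetical stand-ins, not the claimed networks.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_knowledge(context_vector, knowledge_vectors):
    """Return (index of most probable candidate, selection probabilities).

    Assumption: relevance is scored by a plain dot product between the
    dialogue preceding text vector and each candidate knowledge vector.
    """
    scores = [
        sum(c * k for c, k in zip(context_vector, kv))
        for kv in knowledge_vectors
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

# Hypothetical toy vectors for one context and three knowledge candidates.
context = [1.0, 0.0]
candidates = [[0.2, 0.9], [0.9, 0.1], [0.5, 0.5]]
best_index, probabilities = select_knowledge(context, candidates)
```

The selected candidate would then be passed, together with the dialogue preceding text, to whatever generator produces the final response.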

DOCUMENT RETRIEVAL SYSTEM
20230026321 · 2023-01-26

A document retrieval system that retrieves documents, with concepts of the documents taken into account, is provided. The document retrieval system (100) includes an input unit (101), a first processing unit (102), a storage unit (105), a second processing unit (103), and an output unit (104). The input unit (101) has a function of inputting a first document (20). The first processing unit (102) has a function of dividing the first document (20) into a plurality of tokens and a function of creating a first graph structure (21) from the first document (20). The storage unit (105) has a function of storing a second graph structure (11). The second processing unit (103) has a function of calculating a similarity between the first graph structure (21) and the second graph structure (11). The output unit (104) has a function of supplying information. A node and an edge of the first graph structure (21) have a label, and the label includes the plurality of tokens.
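As a rough illustration of comparing two labelled graph structures, the sketch below builds a toy graph from each document and scores their overlap. The adjacency-based graph construction and the Jaccard measure are assumptions chosen for brevity; the abstract does not specify how the graphs are built or how similarity is calculated.

```python
import re

def tokenize(text):
    """Split a document into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_graph(text):
    """Toy graph: node labels are tokens, edge labels are adjacent token pairs."""
    tokens = tokenize(text)
    nodes = set(tokens)
    edges = set(zip(tokens, tokens[1:]))
    return nodes, edges

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def graph_similarity(g1, g2):
    """Average of node-label overlap and edge-label overlap."""
    return 0.5 * (jaccard(g1[0], g2[0]) + jaccard(g1[1], g2[1]))

# Hypothetical stored document and query document.
stored_graph = build_graph("the cat sat on the mat")
query_graph = build_graph("the cat sat on the rug")
similarity = graph_similarity(query_graph, stored_graph)
```

A retrieval system along these lines would rank stored graphs by this score and supply the best matches as output.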

Incorporating data into search engines using deep learning mechanisms

Methods, apparatus, and processor-readable storage media for incorporating data into search engines using deep learning mechanisms are provided herein. An example computer-implemented method includes extracting one or more features from a search query by applying one or more machine learning algorithms to the search query; generating one or more word vectors by applying at least one deep learning technique to the one or more extracted features; mapping the one or more generated word vectors to one or more words from a corpus of data by implementing at least one deep similarity network; and outputting one or more results in response to the search query, wherein the one or more results are based at least in part on the one or more words from the corpus to which the one or more generated word vectors were mapped.
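The mapping step can be sketched as a nearest-neighbour lookup. Plain cosine similarity is used here as a stand-in for the deep similarity network named in the abstract, and the three-dimensional vectors and corpus words are invented for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def map_to_corpus(query_vector, corpus_vectors):
    """Return the corpus word whose vector is most similar to the query vector."""
    return max(corpus_vectors, key=lambda w: cosine(query_vector, corpus_vectors[w]))

# Hypothetical corpus embeddings and a vector generated from a search query.
corpus_vectors = {
    "laptop": [0.9, 0.1, 0.0],
    "printer": [0.1, 0.9, 0.0],
    "router": [0.0, 0.1, 0.9],
}
matched_word = map_to_corpus([0.8, 0.2, 0.1], corpus_vectors)
```

Search results would then be retrieved using the matched corpus word rather than the raw query term.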

Iterative query-based analysis of text

Techniques for iterative query-based analysis of text are described. According to various implementations, a neural network architecture is implemented that receives a query for information about text content and iteratively analyzes the content using the query. During the analysis, a state of the query evolves until it reaches a termination state, at which point the state of the query is output as an answer to the initial query.

Log sourcetype inference model training for a data intake and query system
11704490 · 2023-07-18

Systems and methods are described for training an artificial intelligence model to infer a log sourcetype of a log. For example, logs may have different log sourcetypes, and logs having the same log sourcetypes may have different messagetypes. The artificial intelligence model may be a machine learning model, and can be trained using training data that includes logs with known log sourcetypes. Each log can be tokenized, filtered, converted into a vector, and applied to a machine learning model as an input to perform the training. The machine learning model may output an inferred log sourcetype, which can be compared with the known log sourcetype to update model parameters to improve the machine learning model accuracy. The trained machine learning model may be trained to infer a log sourcetype of a log regardless of the messagetype of the log.
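The tokenize, filter, vectorize, and train sequence can be sketched as follows. A multiclass perceptron over bag-of-words counts is assumed here purely for illustration; the abstract does not name a model type, and the sample logs and sourcetype labels are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(log):
    """Split a log line into lowercase word tokens and number tokens."""
    return re.findall(r"[a-z_]+|\d+", log.lower())

def filter_tokens(tokens):
    """Drop purely numeric tokens (timestamps, ids) that vary per message."""
    return [t for t in tokens if not t.isdigit()]

def vectorize(log):
    """Sparse bag-of-words vector: token -> count."""
    vec = defaultdict(int)
    for token in filter_tokens(tokenize(log)):
        vec[token] += 1
    return dict(vec)

class SourcetypePerceptron:
    def __init__(self, labels):
        self.weights = {label: defaultdict(float) for label in labels}

    def score(self, label, vec):
        w = self.weights[label]
        return sum(w.get(t, 0.0) * c for t, c in vec.items())

    def predict(self, vec):
        # Ties resolve to the first label in insertion order.
        return max(self.weights, key=lambda label: self.score(label, vec))

    def train(self, samples, epochs=5):
        for _ in range(epochs):
            for log, known in samples:
                vec = vectorize(log)
                inferred = self.predict(vec)
                if inferred != known:
                    # Compare inferred and known sourcetype, then update.
                    for t, c in vec.items():
                        self.weights[known][t] += c
                        self.weights[inferred][t] -= c

# Hypothetical training logs with known sourcetypes.
samples = [
    ("127.0.0.1 GET /index.html 200", "access_log"),
    ("10.0.0.5 POST /login 302", "access_log"),
    ("kernel: usb 1-1 device connected", "syslog"),
    ("kernel: eth0 link up", "syslog"),
]
model = SourcetypePerceptron(["access_log", "syslog"])
model.train(samples)
inferred = model.predict(vectorize("kernel: usb 2-3 device removed"))
```

Filtering out numeric tokens is one simple way to make the vector less sensitive to the specific message, which loosely mirrors the goal of inferring the sourcetype regardless of messagetype.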

Automated categorization and assembly of low-quality images into electronic documents

An apparatus includes a memory and processor. The memory stores OCR and NLP algorithms. The processor receives an image of a physical document page and executes the OCR algorithm to convert the image into text. The processor identifies errors in the text, which are associated with noise in the image. The processor generates a feature vector that includes features obtained by executing the NLP algorithm on the text, and features associated with the identified errors in the text. The processor uses the feature vector to assign the image to a document category. Documents assigned to the document category share one or more characteristics, and the feature vector is associated with a probability greater than a threshold that the physical document associated with the image includes those characteristics. The processor then stores the image in a database as a page of an electronic document belonging to the assigned document category.

EFFICIENT SEARCH FOR COMBINATIONS OF MATCHING ENTITIES GIVEN CONSTRAINTS

Methods, systems, and computer-readable storage media for: receiving a set of inference results generated by a ML model, the inference results including a set of query entities and a set of target entities, each query entity having one or more target entities matched thereto by the ML model; processing the set of inference results to generate a set of matched sub-sets of target entities by executing a search over target entities in the set of target entities based on constraints; for each problem in a set of problems, providing the problem as a tuple including an index value representative of a target entity in the set of target entities and a value associated with the query entity, the value including a constraint relative to the query entity; and executing at least one task in response to one or more matched sub-sets in the set of matched sub-sets.
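One way to picture the tuple-based search is a subset search in which each subproblem is an (index, remaining) pair. The sketch below assumes the constraint is that the selected target amounts must sum exactly to the query amount, and that all amounts are positive integers; both the constraint and the amounts are hypothetical.

```python
def find_matching_subsets(target_amounts, query_amount):
    """Return every tuple of target indices whose amounts sum to query_amount.

    Each subproblem is (index, remaining, chosen): the next target index to
    consider, the amount still needed, and the indices selected so far.
    Assumes all target amounts are positive.
    """
    results = []
    stack = [(0, query_amount, ())]
    while stack:
        index, remaining, chosen = stack.pop()
        if remaining == 0 and chosen:
            results.append(chosen)  # constraint satisfied
            continue
        if index == len(target_amounts) or remaining < 0:
            continue  # dead end: no targets left or amount overshot
        # Branch 1: skip the target at this index.
        stack.append((index + 1, remaining, chosen))
        # Branch 2: take the target at this index.
        stack.append(
            (index + 1, remaining - target_amounts[index], chosen + (index,))
        )
    return sorted(results)

# Hypothetical target entities (amounts) matched to one query entity.
matches = find_matching_subsets([40, 60, 100, 25, 75], 100)
```

Pruning on a negative remainder keeps the search from exploring combinations that can no longer satisfy the constraint.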

Cross-context natural language model generation

Provided is a method including obtaining a corpus and an associated set of domain indicators. The method includes learning a set of vectors in an embedding space based on n-grams of the corpus. The method includes updating ontology graphs comprising a set of vertices and edges associating the set of vertices with each other. The method also includes determining a vector cluster using hierarchical clustering based on distances of the set of vectors with respect to each other in the embedding space and determining a hierarchy of the ontology graphs based on a set of domain indicators of a respective set of vertices corresponding to vectors of the vector cluster. The method also includes updating an index based on the ontology graphs.

Method and apparatus for generating a competition commentary based on artificial intelligence, and storage medium

A method and apparatus for generating a competition commentary based on artificial intelligence, and a storage medium, are provided. The method comprises: obtaining commentators' word commentaries and structured data of historical competitions; generating a commentating model according to the obtained information; and, during a live broadcast of a competition, determining a corresponding word commentary according to the commentating model for the structured data obtained each time.

RELATIONSHIP ANALYSIS USING VECTOR REPRESENTATIONS OF DATABASE TABLES
20230214375 · 2023-07-06

A computer-implemented method includes representing a plurality of database tables as respective vectors in a multi-dimensional vector space, receiving an indication that a first database table represented by a first vector and a second database table represented by a second vector are related to each other, moving positions of the respective vectors representing the plurality of database tables in the multi-dimensional vector space in response to the indication, and grouping the plurality of database tables into one or more table clusters based on positions of the respective vectors representing the plurality of database tables in the multi-dimensional vector space.
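A minimal sketch of the move-then-cluster idea follows. It assumes two-dimensional vectors, a symmetric pull of the two related tables toward each other, and single-linkage grouping by a distance threshold; the table names, coordinates, rate, and threshold are all hypothetical.

```python
import math

def move_closer(vectors, a, b, rate=0.5):
    """Pull the vectors for tables a and b toward each other.

    Each vector moves by rate / 2 of the gap, so the gap shrinks by `rate`.
    """
    va, vb = vectors[a], vectors[b]
    delta = [(y - x) * rate / 2 for x, y in zip(va, vb)]
    vectors[a] = [x + d for x, d in zip(va, delta)]
    vectors[b] = [y - d for y, d in zip(vb, delta)]

def cluster(vectors, threshold):
    """Group tables whose vectors are linked by distances <= threshold."""
    names = sorted(vectors)
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(vectors[a], vectors[b]) <= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())

# Hypothetical table vectors.
vectors = {"orders": [0.0, 0.0], "customers": [4.0, 0.0], "logs": [10.0, 10.0]}
before = cluster(vectors, threshold=2.0)
move_closer(vectors, "orders", "customers")  # indication: tables are related
after = cluster(vectors, threshold=2.0)
```

In this toy run the two related tables start in separate clusters and end up grouped together once the indication has moved their vectors within the threshold.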