Patent classifications
G06F16/3346
Query rephrasing using encoder neural network and decoder neural network
A method comprises receiving first data representative of a query. A representation of the query is generated using an encoder neural network and the first data. Words for a rephrased version of the query are selected from a set of words comprising a first subset of words that appear in the query and a second subset of words that are absent from the query. Second data representative of the rephrased version of the query is generated.
Descriptor uniqueness for entity clustering
A mechanism is provided in a data processing system to implement a cognitive natural language processing (NLP) system with descriptor uniqueness identification to support named entity mention clustering. The mechanism annotates a set of documents from a corpus of documents for entity types and mentions, collects descriptor usages from all documents in the corpus of documents, analyzes the descriptor usages to classify the descriptors as base terms or modifier terms, generates compatibility scores for the descriptors, and performs entity merging of entity clusters based on the compatibility scores.
Method and apparatus for generating Q and A model by using adversarial learning
A method of generating a question-answer learning model through adversarial learning may include: sampling a latent variable based on constraints in an input passage; generating an answer based on the latent variable; generating a question based on the answer; and machine-learning the question-answer learning model using a dataset of the generated question and answer, wherein the constraints are controlled so that the latent variable is present in a data manifold while increasing a loss of the question-answer learning model.
Electronic device and method for providing conversational service
A method, performed by an electronic device, of providing a conversational service includes: receiving an utterance input; identifying a temporal expression representing a time in a text obtained from the utterance input; determining a time point related to the utterance input based on the temporal expression; selecting a database corresponding to the determined time point from among a plurality of databases storing information about a conversation history of a user using the conversational service; interpreting the text based on information about the conversation history of the user, the conversation history information being acquired from the selected database; generating a response message to the utterance input based on a result of the interpreting; and outputting the generated response message.
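The flow in this abstract (find a temporal expression, resolve it to a time point, pick the matching history database) can be illustrated with a minimal sketch. The lexicon `OFFSETS`, the database names `"recent"` and `"archive"`, and the `recent_days` cutoff are all hypothetical and are not taken from the patent.

```python
from datetime import date, timedelta

# Toy lexicon (assumption): temporal expression -> number of days back.
OFFSETS = {"today": 0, "yesterday": 1, "last week": 7}

def resolve_time_point(text, now):
    """Return the date referred to by the first known temporal expression, or None."""
    lowered = text.lower()
    for expression, days_back in OFFSETS.items():
        if expression in lowered:
            return now - timedelta(days=days_back)
    return None

def select_database(time_point, now, recent_days=3):
    """Pick a hypothetical history store based on how old the time point is."""
    if time_point is None:
        return "recent"
    age_in_days = (now - time_point).days
    return "recent" if age_in_days <= recent_days else "archive"

now = date(2024, 5, 10)
point = resolve_time_point("What did I order last week?", now)
database = select_database(point, now)
```

A real system would use a proper temporal tagger instead of substring lookup; the sketch only shows how the determined time point drives the choice of database.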
Dynamic cross-platform ask interface and natural language processing model
The present disclosure relates to systems, non-transitory computer-readable media, and methods that generate a dynamic cross-platform ask interface and utilize a cross-platform language processing model to provide platform-specific, contextually based responses to natural language digital text queries. In particular, in one or more embodiments, the disclosed systems utilize machine learning models to extract registered intents from digital text queries to identify platform-specific configurations associated with the registered intents. Utilizing the platform-specific configurations, the disclosed systems can generate tailored platform-specific requests for information, as well as customized end-user search results that cause client devices to efficiently, accurately, and flexibly render platform-specific search results.
Systems and methods for generating a system log parser
The present disclosure provides systems and methods for generating parsing scripts or rules for unstructured or semi-structured system log messages, including systems and methods that use machine learning to identify and cluster the same or substantially similar system log messages. Patterns indicative of the same or substantially similar types of system log messages can be generated based on the clustering of the system log messages and on calculated similarities of attributes or distances between common features/fields of the system log messages, with the results of the clustering presented for analysis and for development or adjustment of parsing scripts.
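As a rough illustration of clustering similar log messages and deriving a pattern from each cluster, the sketch below swaps the machine-learning clustering described in the abstract for a simple greedy threshold rule. The similarity measure, the `threshold` value, and the `<*>` placeholder are assumptions made for the example.

```python
def similarity(pattern, tokens):
    """Fraction of positions holding identical tokens; 0.0 when lengths differ."""
    if len(pattern) != len(tokens):
        return 0.0
    if not pattern:
        return 1.0
    return sum(p == t for p, t in zip(pattern, tokens)) / len(pattern)

def merge(pattern, tokens):
    """Keep tokens that agree and replace differing positions with a placeholder."""
    return [p if p == t else "<*>" for p, t in zip(pattern, tokens)]

def cluster_logs(messages, threshold=0.6):
    """Greedily assign each message to the first sufficiently similar cluster."""
    clusters = []  # each entry: {"pattern": [tokens], "members": [messages]}
    for message in messages:
        tokens = message.split()
        for cluster in clusters:
            if similarity(cluster["pattern"], tokens) >= threshold:
                cluster["pattern"] = merge(cluster["pattern"], tokens)
                cluster["members"].append(message)
                break
        else:
            clusters.append({"pattern": tokens, "members": [message]})
    return clusters

logs = [
    "User alice logged in from 10.0.0.1",
    "User bob logged in from 10.0.0.2",
    "Disk usage at 91 percent",
    "Disk usage at 47 percent",
]
clusters = cluster_logs(logs)
patterns = [" ".join(c["pattern"]) for c in clusters]
```

Each resulting pattern could then serve as a starting point for a hand-written parsing rule, with the placeholder positions marking the variable fields.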
Document search device, document search system, document search program, and document search method
The aim is to improve precision while maintaining a balance between accuracy and comprehensiveness in a document search. According to one embodiment of the present invention, a document search device includes an input reception unit, a document search unit, and a search result display unit. The input reception unit receives an input of a keyword for a document search. The document search unit acquires, from a document, a hit character string that matches a character string in which a portion of the characters of the keyword is replaced with a wildcard, together with the character strings before and after the hit character string, and computes a likelihood of the hit character string based on the hit character string and the character strings before and after it. The search result display unit outputs a result of the document search based on the likelihood.
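The wildcard matching step might look roughly like the sketch below, which replaces one character of the keyword at a time with a regular-expression wildcard and records each hit with its surrounding text. The likelihood here is a toy stand-in (the share of characters that agree with the keyword) and ignores the context strings; how the patent actually computes likelihood is not stated in the abstract.

```python
import re

def wildcard_patterns(keyword):
    """Yield regexes in which one character of the keyword becomes a wildcard."""
    for i in range(len(keyword)):
        yield re.escape(keyword[:i]) + "." + re.escape(keyword[i + 1:])

def search(document, keyword, context=10):
    """Return hits with surrounding context, best toy likelihood first."""
    hits = {}
    for pattern in wildcard_patterns(keyword):
        for match in re.finditer(pattern, document):
            start, end = match.span()
            hit = match.group()
            # Toy likelihood (assumption): fraction of characters equal to the keyword.
            agreeing = sum(a == b for a, b in zip(hit, keyword))
            hits[(start, end)] = {
                "hit": hit,
                "before": document[max(0, start - context):start],
                "after": document[end:end + context],
                "likelihood": agreeing / len(keyword),
            }
    return sorted(hits.values(), key=lambda h: h["likelihood"], reverse=True)

text = "The color and the colour differ; a collar is different."
results = search(text, "color")
```

Keying the hits by span removes duplicates, since an exact occurrence of the keyword matches every wildcard pattern.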
Efficient search for combinations of matching entities given constraints
Methods, systems, and computer-readable storage media for receiving a set of inference results generated by a ML model, the inference results including a set of query entities and a set of target entities, each query entity having one or more target entities matched thereto by the ML model, processing the set of inference results to generate a set of matched sub-sets of target entities by executing a search over target entities in the set of target entities based on constraints, for each problem in a set of problems, providing the problem as a tuple including an index value representative of a target entity in the set of target entities and a value associated with the query entity, the value including a constraint relative to the query entity, and executing at least one task in response to one or more matched sub-sets in the set of matched sub-sets.
Service architecture for ontology linking of unstructured text
Techniques for ontology linking of unstructured text as a service are described. A service may receive a request to link unstructured text to a standardized ontology, and the service may segment and tokenize the unstructured text and send the result to multiple services implementing multiple deep machine learning models trained to identify particular entities and one or more relationships between entities. The service may perform a search of the standardized ontology to identify a set of similar candidates from the standardized ontology for the detected entities and the one or more relationships, and then rank the set of similar candidates from the standardized ontology according to their similarity to the detected entities within the unstructured text. The output from the service may include a result identifying a highest ranked candidate of the set of similar candidates from the standardized ontology for the detected entities within the unstructured text.
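The candidate search and ranking step can be sketched with plain string similarity from Python's standard library. The abstract describes deep learning models for entity detection; this example assumes the entity has already been detected, and the label list `ONTOLOGY_LABELS` and the `cutoff` value are hypothetical.

```python
from difflib import get_close_matches

# Hypothetical ontology labels, used for illustration only.
ONTOLOGY_LABELS = ["heart attack", "heart failure", "asthma attack", "hay fever"]

def rank_candidates(entity, labels=ONTOLOGY_LABELS, n=3, cutoff=0.3):
    """Return up to n ontology labels, most similar to the detected entity first."""
    return get_close_matches(entity.lower(), labels, n=n, cutoff=cutoff)

candidates = rank_candidates("Heart attak")
best = candidates[0] if candidates else None
```

A production service would likely compare embeddings or use an ontology-aware index; character-level similarity is used here only because it needs no external dependencies.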
Systems, computer-implemented methods, and computer program products for data sequence validity processing
Embodiments provide for improved data sequence validity processing, for example to determine validity of sentences or other language within a particular language domain. Such improved processing is useful at least for arranging data sequences based on determined validity, and/or making determinations and/or performing actions based on the determined validity. A determined probability (e.g., transformed into the perplexity space) of each token appearing in a data sequence is used in any of a myriad of manners to perform such data sequence validity processing. Example embodiments provide for generating a perplexity value set for each data sequence in a plurality of data sequences, generating a probabilistic ranking set for the plurality of data sequences based on the perplexity value sets and at least one sequence ranking metric, and generating an arrangement of the plurality of data sequences based on the probabilistic ranking set.
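To make the perplexity-based ranking concrete, the sketch below assumes per-token probabilities are already available from some language model and uses standard sequence-level perplexity as the ranking metric. That metric choice and the example data are assumptions; the abstract leaves the metric open.

```python
import math

def perplexity(token_probs):
    """Sequence perplexity: exp of the mean negative log-probability per token."""
    if not token_probs:
        raise ValueError("sequence must contain at least one token")
    mean_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_neg_log)

def rank_by_perplexity(sequences):
    """Return sequence ids ordered from lowest (most plausible) to highest perplexity."""
    scores = {seq_id: perplexity(probs) for seq_id, probs in sequences.items()}
    return sorted(scores, key=scores.get)

# Made-up token probabilities for three candidate sequences.
example = {
    "a": [0.5, 0.5, 0.5],
    "b": [0.1, 0.2, 0.1],
    "c": [0.9, 0.8, 0.9],
}
ranking = rank_by_perplexity(example)
```

Lower perplexity means the model found the sequence less surprising, so the first id in `ranking` is the sequence judged most valid under this metric.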