Patent classifications
G06F40/232
Identifying chat correction pairs for training models to automatically correct chat inputs
A chat input identifier may receive various chat inputs based on voice or text inputs from a user. The chat input identifier may apply different filters to the chat inputs to identify one or more chat correction pairs (e.g., chat input with errors, corrected chat input) from among the plurality of chat inputs. The chat correction pairs are used to train an auto-correction model. The trained auto-correction model receives a given chat input that has one or more errors. The auto-correction model processes the given chat input to generate a corrected version of the given chat input (without the need to obtain a correction from the user). The corrected chat input is then provided to a dialog-driven application.
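As a rough illustration of the filtering step, the sketch below pairs consecutive chat inputs that are similar but not identical, on the assumption that a near-repeat is the user correcting an earlier input. The abstract does not say which filters are applied; the function name `find_correction_pairs`, the similarity measure, and the thresholds are all hypothetical.

```python
from difflib import SequenceMatcher


def find_correction_pairs(chat_inputs, low=0.6, high=1.0):
    """Return (erroneous, corrected) pairs from consecutive chat inputs.

    Assumption: a pair qualifies when the two inputs are similar
    (ratio >= low) but not identical (ratio < high).
    """
    pairs = []
    for first, second in zip(chat_inputs, chat_inputs[1:]):
        ratio = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if low <= ratio < high:
            pairs.append((first, second))
    return pairs


chats = [
    "book a flite to boston",
    "book a flight to boston",
    "what is the weather",
]
pairs = find_correction_pairs(chats)
print(pairs)  # [('book a flite to boston', 'book a flight to boston')]
```

The resulting pairs would then serve as training examples for the auto-correction model, which is not shown here.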
DATA-SHARDING FOR EFFICIENT RECORD SEARCH
Data-sharding systems and methods for cost- and time-efficient record search are described. Embodiments use a name-sharding dimension, optionally in combination with one or more additional dimensions such as record type and year, to reduce latency and search-associated costs. The embodiments use an optimization algorithm to determine a distribution of records related to names. The optimization algorithm may use a three-character prefix of the surnames in records to distribute shards across documents, with specific shards allocated to no-name and multi-name records.
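The shard-key derivation described above could look roughly like the following. This sketch covers only the key construction (three-character surname prefix, optional record type and year, dedicated no-name and multi-name shards); it does not attempt the optimization algorithm. The function name, the shard labels, and the `/` separator are assumptions made for illustration.

```python
NO_NAME_SHARD = "_noname"        # hypothetical label for records without a surname
MULTI_NAME_SHARD = "_multiname"  # hypothetical label for records with several surnames


def shard_key(surnames, record_type=None, year=None):
    """Build a shard key from a record's surnames and optional dimensions."""
    cleaned = [s.strip().lower() for s in surnames if s and s.strip()]
    if not cleaned:
        name_part = NO_NAME_SHARD
    elif len(cleaned) > 1:
        name_part = MULTI_NAME_SHARD
    else:
        name_part = cleaned[0][:3]  # three-character surname prefix

    parts = [name_part]
    if record_type is not None:
        parts.append(str(record_type))
    if year is not None:
        parts.append(str(year))
    return "/".join(parts)


print(shard_key(["Smith"], "census", 1900))  # smi/census/1900
print(shard_key([]))                         # _noname
print(shard_key(["Smith", "Jones"]))         # _multiname
```

A search for a given surname then only needs to touch the shards whose key starts with that surname's prefix, which is where the latency and cost reduction would come from.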
CONVERSATIONAL INTERACTION ENTITY TESTING
One or more computing devices, systems, and/or methods are provided. In an example, a conversation path associated with a revised code segment of a conversational interaction entity is identified by a processor. The conversation path has a predetermined intent. A conversational phrase is generated by the processor for the conversation path. The conversational interaction entity is employed by the processor using the conversation path and the conversational phrase to generate a resultant intent. An issue report is generated by the processor for the conversational interaction entity responsive to the resultant intent not matching the predetermined intent.
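The intent comparison at the heart of this test loop can be sketched as follows. The conversational interaction entity is modeled here as a plain callable mapping a phrase to an intent; `IssueReport`, `check_conversation_path`, and `toy_entity` are hypothetical names, and phrase generation is assumed to have happened already.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class IssueReport:
    path: str
    phrase: str
    expected_intent: str
    actual_intent: str


def check_conversation_path(
    entity: Callable[[str], str],
    path: str,
    phrase: str,
    expected_intent: str,
) -> Optional[IssueReport]:
    """Run the phrase through the entity and report a mismatch, if any."""
    actual_intent = entity(phrase)
    if actual_intent != expected_intent:
        return IssueReport(path, phrase, expected_intent, actual_intent)
    return None


def toy_entity(phrase: str) -> str:
    """Stand-in for a real conversational entity: keyword lookup only."""
    return "book_flight" if "flight" in phrase.lower() else "fallback"


ok = check_conversation_path(toy_entity, "booking", "I need a flight", "book_flight")
bad = check_conversation_path(toy_entity, "booking", "Get me a plane ticket", "book_flight")
```

Here `ok` is `None` because the resultant intent matches the predetermined intent, while `bad` is an `IssueReport` recording the mismatch.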
Content editing using AI-based content modeling
A method of content production (e.g., content editing) uses content modeling to facilitate content production. In one embodiment, an automated process is configured to render content. For a given content portion, and as the given portion is being rendered, the portion is processed to generate a content model. With respect to a concept expressed in or otherwise associated with the content, the system compares the content model with a target content-derived model to generate a relevancy score. The target content-derived model is generated by (a) identifying a set of target content portions in which the concept is expressed, (b) generating from each target content portion an associated target content model, and (c) performing a vector operation on the associated target content models. Preferably, each associated target content model is built using an Artificial Intelligence (AI)-based content analysis. The relevancy score is used to generate a content production recommendation.
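One way to picture steps (a) through (c) and the scoring is shown below. The abstract does not name the vector operation or the comparison metric; this sketch assumes an element-wise mean for the former and cosine similarity for the latter, and the vectors are made-up stand-ins for AI-generated content models.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors (assumed vector operation)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical content models for target portions that express the concept.
target_models = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
target_derived_model = mean_vector(target_models)

# Hypothetical content model for the portion currently being rendered.
portion_model = [2.0, 0.0, 0.0]
relevancy_score = cosine_similarity(portion_model, target_derived_model)
```

A low score for a portion would then drive a recommendation such as revising or cutting it; how the recommendation is derived from the score is not specified in the abstract.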
Feature-based deduplication of metadata for places
The technology disclosed relates to deduplicating metadata about places. A feature generator module is configured to generate features for metadata profiles. The metadata profiles represent a plurality of places. The features are based on geohash strings and word embeddings generated for the metadata profiles. A diff generator module is configured to generate diff vectors that pair-wise encode results of comparison between features of paired metadata profiles. A classification module is configured to generate similarity scores for the paired metadata profiles based on the diff vectors. A particular similarity score indicates whether metadata profiles in a particular pair of metadata profiles represent a same place.
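A diff vector for a pair of metadata profiles might be assembled along these lines. The two features used here (shared geohash prefix length and cosine similarity of embeddings), the profile dictionary layout, and the weighted sum standing in for the classification module are all assumptions; the abstract does not specify them, and a trained classifier would replace `similarity_score` in practice.

```python
import math


def common_prefix_len(a, b):
    """Number of leading characters two strings share."""
    length = 0
    for x, y in zip(a, b):
        if x != y:
            break
        length += 1
    return length


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def diff_vector(p, q):
    """Pair-wise comparison of two profiles' geohash and embedding features."""
    longest = max(len(p["geohash"]), len(q["geohash"]), 1)
    geo = common_prefix_len(p["geohash"], q["geohash"]) / longest
    emb = cosine_similarity(p["embedding"], q["embedding"])
    return [geo, emb]


def similarity_score(diff, weights=(0.5, 0.5)):
    """Placeholder for the classification module: a fixed weighted sum."""
    return sum(w * d for w, d in zip(weights, diff))


a = {"geohash": "9q8yyk8", "embedding": [1.0, 0.0]}
b = {"geohash": "9q8yyk9", "embedding": [1.0, 0.0]}
diff = diff_vector(a, b)
score = similarity_score(diff)
```

Because nearby locations share long geohash prefixes, the first component is high for places that are physically close, while the second captures how alike the textual metadata is.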
METHOD AND SYSTEM FOR HYBRID ENTITY RECOGNITION
A hybrid entity recognition system and accompanying method identify composite entities based on machine learning. An input sentence is received and preprocessed to remove extraneous information, perform spelling correction, and perform grammar correction, generating a cleaned input sentence. A part-of-speech (POS) tagger tags the parts of speech of the cleaned input sentence. A rules-based entity recognizer module identifies first-level entities in the cleaned input sentence. The cleaned input sentence is converted and translated into numeric vectors. Basic and composite entities are extracted from the cleaned input sentence using the numeric vectors.
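The rules-based stage alone can be illustrated with a small regular-expression recognizer. This sketch covers only first-level entity detection on an already cleaned sentence; the preprocessing, POS tagging, and vector-based extraction of composite entities are not shown. The rule set and the labels `NUMBER` and `UNIT` are invented for the example.

```python
import re

# Hypothetical rules: each label maps to a pattern for one first-level entity type.
RULES = [
    ("NUMBER", re.compile(r"\b\d+(?:\.\d+)?\b")),
    ("UNIT", re.compile(r"\b(?:kg|km|usd)\b", re.IGNORECASE)),
]


def first_level_entities(sentence):
    """Return (start, end, label, text) tuples sorted by position."""
    found = []
    for label, pattern in RULES:
        for match in pattern.finditer(sentence):
            found.append((match.start(), match.end(), label, match.group()))
    return sorted(found)


entities = first_level_entities("ship 12 kg to Paris")
print(entities)  # [(5, 7, 'NUMBER', '12'), (8, 10, 'UNIT', 'kg')]
```

In the hybrid design, spans such as these would feed the machine-learning stage, which could for instance combine an adjacent number and unit into a composite quantity.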
SYSTEMS AND METHODS FOR UNSUPERVISED NEOLOGISM NORMALIZATION OF ELECTRONIC CONTENT USING EMBEDDING SPACE MAPPING
Systems and methods are disclosed for utilizing a comment moderation bot for detecting and normalizing neologisms in social media. One method comprises transmitting, by a neologism normalization system, a comment moderation bot for detecting neologisms on an online platform maintained by one or more publisher systems. The comment moderation bot may aggregate data related to user comments and transmit the aggregated data to the neologism normalization system for further processing. The neologism normalization system implements unsupervised machine learning models for detecting neologisms in the aggregated data through tokenization and filtering; and normalizing the neologisms through similarity analysis and lattice decoding.
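The detect-then-normalize flow can be sketched in a simplified form. Note the substitution: the disclosed system uses embedding-space similarity and lattice decoding, whereas this example uses plain string similarity from the standard library as a stand-in, and a tiny hand-written vocabulary in place of a learned one. All function names are hypothetical.

```python
import re
from difflib import get_close_matches


def detect_candidates(comment, vocabulary):
    """Tokenize a comment and keep tokens that are not in the vocabulary."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    return [token for token in tokens if token not in vocabulary]


def normalize(token, vocabulary, cutoff=0.7):
    """Map a token to its closest vocabulary word, or return it unchanged."""
    matches = get_close_matches(token, sorted(vocabulary), n=1, cutoff=cutoff)
    return matches[0] if matches else token


vocabulary = {"that", "movie", "was", "awesome"}
candidates = detect_candidates("That movie was awesomeee", vocabulary)
normalized = [normalize(token, vocabulary) for token in candidates]
print(candidates)  # ['awesomeee']
print(normalized)  # ['awesome']
```

String similarity only handles spelling variants; mapping a genuinely new coinage to a standard word is the case where the embedding-based similarity described in the abstract would be needed.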