Patent classifications
G06V30/19093
MACHINE LEARNING TECHNIQUES FOR DETERMINING PREDICTED SIMILARITY SCORES FOR INPUT SEQUENCES
Various embodiments of the present invention provide methods, apparatuses, systems, computing devices, computing entities, and/or the like for dynamically generating a predicted similarity score for a pair of input sequences. According to one aspect, a predicted similarity score for a pair of input sequences is determined based at least in part on at least one of a token-level similarity probability score for the pair of input sequences, a target region match indication for the pair of input sequences, a fuzzy match score for the pair of input sequences, a character-level match score for the pair of input sequences, one or more similarity ratio occurrence indicators for the pair of input sequences, and a harmonic mean score of the fuzzy match score for the pair of input sequences and the token-level similarity probability score for the pair of input sequences.
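As an illustration of the last feature named in this abstract, the following minimal sketch computes a harmonic mean of a fuzzy match score and a token-level similarity probability. The abstract does not say how either input score is produced, so `difflib.SequenceMatcher` stands in for the fuzzy matcher, the token-level probability is passed in as a plain number, and all function names are hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_match_score(a: str, b: str) -> float:
    """Stand-in fuzzy match score in [0, 1] (assumption: difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def harmonic_mean(x: float, y: float) -> float:
    """Harmonic mean of two non-negative scores; 0.0 if either is zero."""
    if x <= 0.0 or y <= 0.0:
        return 0.0
    return 2.0 * x * y / (x + y)


def harmonic_feature(a: str, b: str, token_similarity_prob: float) -> float:
    """Combine the fuzzy score with a token-level probability supplied by the caller."""
    return harmonic_mean(fuzzy_match_score(a, b), token_similarity_prob)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, this feature is high only when both the fuzzy score and the token-level probability are high.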
SYSTEMS AND METHODS FOR INFORMATION RETRIEVAL AND EXTRACTION
To extract necessary information, documents are received and classified, converted to text, and stored in a database. A request for information is then received, and relevant documents and/or document passages are selected from the stored documents. The needed information is then extracted from the relevant documents. The various processes use one or more artificial intelligence (AI), image processing, and/or natural language processing (NLP) techniques as well as knowledge-based and rule-based techniques.
CLASSIFICATION OF USER SENTIMENT BASED ON MACHINE LEARNING
A system and method for machine learning classification of user sentiment are disclosed. The method includes storing a plurality of category information. The plurality of category information includes a set of domain-specific category information. The method further includes extracting a plurality of aspects from textual data. The method further includes generating a sentiment by a machine learning model. The method further includes receiving the plurality of aspects and the set of domain-specific category information. The method further includes generating a sentiment based on the plurality of aspects and the set of domain-specific category information.
SYSTEM AND METHOD FOR REAL-TIME AUTOMATED PROJECT SPECIFICATIONS ANALYSIS
Various methods, apparatuses/systems, and media for real-time automated analysis of project specifications are disclosed. A processor calls an API to invoke an OCR micro-service with the project specifications data as input data received from a plurality of applications each including a file corresponding to real-time project specifications data; determines whether the file corresponding to the project specification data is an image file; implements, based on determining, a neural network based image processing algorithm to extract data corresponding to the project specifications data from the input data; compares the extracted data corresponding to the project specifications data with predefined expected business results data; generates a similarity score, based on comparing, that identifies how similar the project specifications data is compared to the predefined expected business results data; and automatically generates a real-time analysis report on the project specifications in connection with the plurality of applications based on the similarity score.
Scoring sentiment in documents using machine learning and fuzzy matching
Computer-implemented systems and methods, trained through machine learning, score a sentiment expressed in a document. Individual sentences are scored, and then an overall document sentiment score is computed based on the scores of the individual sentences. Sentence scores can be computed with machine learning models. A digital matrix generator can generate an N×M matrix for each sentence, where the matrix comprises vectors of word embeddings for the individual words of the sentence. A classifier computes a sentence sentiment score for each sentence based on the digital matrix for the sentence. Sentence sentiment scores computed by the classifier can be adjusted based on a fuzzy matching of a phrase(s) in the sentence to key phrases in a lexicon that are labeled with a sentiment relevant to the context.
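The adjustment and aggregation steps in this abstract can be sketched roughly as follows. This is not the patented method: the classifier output is assumed to be a number in [-1, 1] supplied by the caller, `difflib.SequenceMatcher` stands in for the fuzzy matcher, the additive adjustment with a `weight` and the use of a plain mean for the document score are assumptions, and every name is hypothetical.

```python
from difflib import SequenceMatcher
from statistics import mean


def best_lexicon_match(sentence, lexicon, threshold=0.85):
    """Return the label of the best fuzzy-matching key phrase, or 0.0 if none.

    `lexicon` maps key phrases to sentiment labels in [-1, 1] (assumed format).
    """
    words = sentence.lower().split()
    best_label, best_ratio = 0.0, threshold
    for phrase, label in lexicon.items():
        n = len(phrase.split())
        # Slide a window of the phrase's word length across the sentence.
        for i in range(len(words) - n + 1):
            window = " ".join(words[i:i + n])
            ratio = SequenceMatcher(None, window, phrase.lower()).ratio()
            if ratio >= best_ratio:
                best_label, best_ratio = label, ratio
    return best_label


def adjust_score(model_score, sentence, lexicon, weight=0.5):
    """Shift the classifier score toward the matched lexicon label, then clip."""
    shifted = model_score + weight * best_lexicon_match(sentence, lexicon)
    return max(-1.0, min(1.0, shifted))


def document_score(sentences, model_scores, lexicon):
    """Aggregate adjusted sentence scores (assumption: simple mean)."""
    return mean(adjust_score(s, t, lexicon)
                for s, t in zip(model_scores, sentences))
```

With a lexicon such as `{"missed earnings": -1.0}`, a sentence containing "missd earnings" still matches, because the fuzzy ratio tolerates the typo.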
Targeted document information extraction
Disclosed herein are various embodiments for targeted document information extraction. An embodiment operates by receiving a document associated with a particular customer of a plurality of customers. It is determined whether to use a global processor or template processor to analyze the document based on whether one or more customer templates are associated with the particular customer. Which of the one or more templates associated with the particular customer correspond to the document is identified. The document is compared to the identified template associated with the customer. Information is extracted from the document based on the identified template and the identified plurality of variations. The extracted information for the document is output.
ELECTRONIC DEVICE FOR PROVIDING TEXT ASSOCIATED WITH CONTENT, AND OPERATING METHOD THEREFOR
An electronic device includes a display, a memory, and at least one processor configured to: execute an application, obtain a content from the executed application, obtain at least one piece of information associated with the content, obtain at least one first text corresponding to the at least one piece of information, identify at least one second text associated with the at least one first text among a plurality of texts stored in the memory, control the display to display at least one first tag object including the at least one first text and at least one second tag object including the at least one second text, and based on a tag object being selected from among the displayed at least one first tag object and the displayed at least one second tag object, store, in the memory, a text corresponding to the selected tag object to be associated with the content.
SERIAL NUMBER RECOGNITION PARAMETER DETERMINATION APPARATUS, SERIAL NUMBER RECOGNITION PARAMETER DETERMINATION PROGRAM, AND PAPER SHEET HANDLING SYSTEM
A serial number recognition parameter determination apparatus includes: a generation unit, an identification unit, and an evaluation index calculation unit. The generation unit generates a parameter set of a program, the program being used when a paper sheet handling apparatus identifies, from an image of a paper sheet, character present regions in which characters that form a serial number are present. The identification unit identifies, from an image of the paper sheet, the character present regions by using the parameter set that is generated by the generation unit. The evaluation index calculation unit calculates an evaluation index of the parameter set based on the character present regions that are identified by the identification unit.
Leveraging text profiles to select and configure models for use with textual datasets
Text profiles can be leveraged to select and configure models according to some examples described herein. In one example, a system can analyze a reference textual dataset and a target textual dataset using text-mining techniques to generate a first text profile and a second text profile, respectively. The first text profile can contain first metrics characterizing the reference textual dataset and the second text profile can contain second metrics characterizing the target textual dataset. The system can determine a similarity value by comparing the first text profile to the second text profile. The system can also receive a user selection of a model that is to be applied to the target textual dataset. The system can then generate an insight relating to an anticipated accuracy of the model on the target textual dataset based on the similarity value. The system can output the insight to the user.
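A rough sketch of the profile comparison described in this abstract is given below. The abstract does not specify which metrics make up a text profile or how the similarity value is computed, so the three metrics and the "one minus mean relative difference" formula are assumptions chosen for illustration, and the function names are hypothetical.

```python
def text_profile(docs):
    """Build a small profile of a non-empty textual dataset (assumed metrics)."""
    tokens = [t for d in docs for t in d.lower().split()]
    return {
        "avg_doc_len": len(tokens) / len(docs),
        "avg_token_len": sum(len(t) for t in tokens) / len(tokens),
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }


def profile_similarity(p, q):
    """Similarity in [0, 1]: one minus the mean relative metric difference."""
    diffs = []
    for key in p:
        denom = max(abs(p[key]), abs(q[key]))
        diffs.append(0.0 if denom == 0 else abs(p[key] - q[key]) / denom)
    return 1.0 - sum(diffs) / len(diffs)


reference = ["the model was trained on short product reviews",
             "great phone and fast delivery"]
target = ["pursuant to the aforementioned contractual obligations the licensee "
          "shall indemnify the licensor"]
similarity = profile_similarity(text_profile(reference), text_profile(target))
```

A low `similarity` here would suggest that a model tuned on the reference dataset may be less accurate on the target dataset, which is the kind of insight the abstract describes.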
INFORMATION PROCESSING APPARATUS AND INFORMATION PROCESSING METHOD
A server device (10) corresponding to an example of an information processing apparatus includes an acquisition unit (13a) that acquires a text regarding a remark of a user who has refrained from sending the remark, an input text analysis unit (13b) (corresponding to an example of the “first analysis unit”) that analyzes the text regarding the remark acquired by the acquisition unit (13a) by a natural language process, a past information analysis unit (13c) (corresponding to an example of the “second analysis unit”) that analyzes past information about a content of the remark by the natural language process, and a generation unit (13e) that generates a candidate for the remark text sent by the user so that there is no contradiction with the past information based on a comparison between respective analysis results of the input text analysis unit (13b) and the past information analysis unit (13c).