Patent classifications
G06F40/258
DOCUMENTATION RECORD VERIFICATION
A system for verifying transactions includes a computing server configured to identify an unverified transaction among real-time transactions that the computing server processes on behalf of an organization client. The computing server receives a forward of a documentation record from an end user through a communication channel (e.g., receiving an image of a paper receipt of a transaction without prompting the end user to provide the receipt). The computing server parses the documentation record to extract information (e.g., a transaction date) to verify the unverified transaction. If the information from the parsed documentation record matches corresponding fields of the unverified transaction, the computing server may display a user interface indicating that the transaction was verified.
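The matching step described above can be illustrated with a minimal sketch. It assumes the parsed documentation record arrives as a dictionary of field names to values; the `Transaction` fields, the `normalize` helper, and the `verify` function are hypothetical names chosen for illustration and are not taken from the abstract.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    # Hypothetical fields; the abstract only names the transaction date.
    merchant: str
    amount: Decimal
    date: date


def normalize(value):
    """Collapse whitespace and case for strings; leave other types untouched."""
    if isinstance(value, str):
        return " ".join(value.split()).casefold()
    return value


def verify(parsed: dict, txn: Transaction) -> bool:
    """Return True when every parsed field matches the corresponding transaction field.

    An empty record verifies nothing, so it is treated as a non-match.
    """
    if not parsed:
        return False
    return all(
        hasattr(txn, name) and normalize(getattr(txn, name)) == normalize(value)
        for name, value in parsed.items()
    )


txn = Transaction("Corner Cafe", Decimal("12.50"), date(2023, 5, 1))
receipt = {"merchant": " corner  cafe", "date": date(2023, 5, 1)}
print(verify(receipt, txn))
```

Exact equality after normalization is an assumption made here for simplicity; a real system might tolerate small differences in amounts or dates.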
Natural language processing and text analytics for audit testing with documentation prioritization and selection
Disclosed are systems, methods, and computer readable media for natural language processing and text analytics of audit documentation for prioritization and selection. Text extraction and conversion techniques can analyze documents corresponding to an audit request to generate a dataset. A two-layer model can produce word embeddings to reconstruct linguistic contexts of words in the dataset. An embedding layer can map each word, and a classifier layer can generate a similarity score for each word. A three-layer model can determine weights of documents in the dataset. A ranking layer can obtain a document rank value for each document. An initial layer and successive layers can receive feature vectors and document rank values to assign weights to the documents. Based on the document weights and the audit request, the natural language processing and text analytics can determine an audit likelihood for each document to prioritize and select subsets of the documents.
Information processing apparatus for complementing a heading of a table
An embodiment of the present invention provides an information processing apparatus capable of complementarily adding an attribute not included in a table in order to detect tables having a corresponding relationship. An information processing apparatus as an embodiment of the present invention includes a complementer. The complementer complementarily adds an attribute not included in a first table based on a content of at least one of the first table and an electronic document including the first table.
Detecting extraneous topic information using artificial intelligence models
Systems and methods for improving machine learning systems used to model topics on a plurality of calls are described herein. In an embodiment, a server computer receives a plurality of digitally stored call transcripts that have been prepared from digitally recorded voice calls. The server computer uses a topic model of an artificial intelligence machine learning system, the topic model modeling words of a call as a function of one or more word distributions for each topic of a plurality of topics, to generate an output of the topic model which identifies the plurality of topics represented in the plurality of call transcripts. The server computer computes, for a particular topic of the plurality of topics, a first value representing a vocabulary of the particular topic and a second value representing a consistency of the particular topic in two or more call transcripts of the plurality of call transcripts which include the particular topic. Based, at least in part, on one or more of the first value or the second value, the server computer determines that the particular topic meets a particular criterion and, in response, updates the output of the topic model to remove the particular topic or distinguish the particular topic from other topics of the plurality of topics which do not meet the particular criterion.
SYSTEM AND METHOD OF AUTOMATIC TOPIC DETECTION IN TEXT
A method and system for automatic topic detection in text may include receiving a text document of a corpus of documents and extracting one or more phrases from the document, based on one or more syntactic patterns. For each phrase, embodiments of the invention may: apply a word embedding neural network on one or more words of the phrase, to obtain one or more respective word embedding vectors; calculate a weighted phrase embedding vector, and compute a phrase saliency score, based on the weighted phrase embedding vector. Embodiments of the invention may subsequently produce one or more topic labels, representing one or more respective topics in the document, based on the computed phrase saliency scores, and may select one or more topic labels according to their relevance to the business domain of the corpus.
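The phrase-scoring step above can be sketched as follows. The abstract does not say how the word vectors are weighted or how saliency is computed, so this sketch assumes a weighted average of word vectors and uses cosine similarity to a document-level vector as the saliency score. All function names, the weights, and the toy vectors are hypothetical.

```python
import math


def weighted_phrase_vector(word_vectors, weights):
    """Weighted average of a phrase's word embedding vectors."""
    total = sum(weights)
    dim = len(word_vectors[0])
    return [
        sum(w * vec[i] for vec, w in zip(word_vectors, weights)) / total
        for i in range(dim)
    ]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def saliency(phrase_vector, document_vector):
    """Assumed saliency measure: similarity of the phrase to the document."""
    return cosine(phrase_vector, document_vector)


# Toy two-dimensional embeddings for a two-word phrase (illustrative only).
phrase_vec = weighted_phrase_vector([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
print(phrase_vec, saliency(phrase_vec, [1.0, 0.0]))
```

In practice the word vectors would come from the word embedding neural network named in the abstract; they are hard-coded here only to keep the sketch self-contained.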
Table header detection using global machine learning features from orthogonal rows and columns
A method, system, and computer-usable medium for detecting headers in various documents, such as PDF and HTML files. The files are converted to a two-dimensional array or table having orthogonal rows and columns. Either rows or columns are determined to include headers. To determine whether rows include headers, a pairwise comparison is performed, for each row in the array or table, for each cell of each column that is orthogonal to that row. The pairwise comparison scores or values are summed for each column orthogonal to that row, and the sum across all of the orthogonal columns provides a score or value for that row. Row scores are evaluated relative to one another to determine the likelihood of headers in each row. To determine whether columns have headers, a similar calculation is performed between columns and their orthogonal rows.
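The row-scoring procedure above can be sketched in a few lines. The abstract does not define the pairwise comparison itself, so this sketch assumes a simple one: two cells score 1 when their coarse kinds (number, text, empty) differ and 0 otherwise. The names `cell_kind`, `dissimilarity`, and `row_scores` are hypothetical.

```python
def cell_kind(cell: str) -> str:
    """Classify a cell as 'empty', 'number', or 'text' (a coarse assumption)."""
    text = cell.strip()
    if not text:
        return "empty"
    try:
        float(text.replace(",", ""))
        return "number"
    except ValueError:
        return "text"


def dissimilarity(a: str, b: str) -> float:
    """Assumed pairwise comparison: 1.0 if the cell kinds differ, else 0.0."""
    return 0.0 if cell_kind(a) == cell_kind(b) else 1.0


def row_scores(table):
    """Score each row by comparing its cells with the other cells in each column."""
    scores = []
    for r, row in enumerate(table):
        total = 0.0
        for c, cell in enumerate(row):
            # Sum the comparisons down column c, skipping the row itself.
            total += sum(
                dissimilarity(cell, other[c])
                for i, other in enumerate(table)
                if i != r
            )
        scores.append(total)
    return scores


table = [
    ["Item", "Qty", "Price"],
    ["Apple", "3", "1.20"],
    ["Pear", "10", "0.80"],
]
scores = row_scores(table)
print(scores)                     # [4.0, 2.0, 2.0]
print(scores.index(max(scores)))  # 0, the row most unlike the rest
```

Scoring columns works the same way on the transposed table, for example `row_scores([list(col) for col in zip(*table)])`. Picking the highest-scoring row is one possible way to evaluate scores relative to one another; the abstract does not prescribe it.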
SEMANTIC MAP GENERATION FROM NATURAL-LANGUAGE TEXT DOCUMENTS
Techniques include obtaining, with a computer system, a natural-language-text document comprising unstructured text; generating, with the computer system, based on a first set of machine learning model parameters, a neural representation of the unstructured text; identifying, with the computer system, based on the neural representation, a trigger word located within the unstructured text and associated with a first category; determining, with the computer system, based on the trigger word, a region within the unstructured text comprising descriptors associated with the first category; determining, with the computer system, from the region based on a second set of machine learning model parameters, a descriptor describing an action or condition of the first category; generating, with the computer system, a data model object comprising the descriptor defining an action or condition of the first category; and storing, with the computer system, the data model object in memory.
SYSTEMS AND METHODS FOR ASSOCIATING CONTEXT TO SUBTITLES DURING LIVE EVENTS
Systems and methods are provided herein for providing context to users who access video conferences late. This may be accomplished by a system receiving an audio segment of a video conference and generating a subtitle corresponding to the audio segment. The system may determine a summary relating to the audio segment and then display the subtitle, summary, and video conference on a device. The system allows a user, who accesses a video conference late, to quickly and accurately understand the current video conference discussion, improving the user's experience and increasing the productivity of the video conference.
Systems and methods for documentation through gleaning content with an intuitive user experience
Systems and methods for documentation through gleaning content (extracting information from various sources and collecting it gradually, bit by bit) with an enhanced, easy-to-use, and intuitive user interface experience. The system gleans content such as text, images, audio, and video bit by bit from various sources, such as web pages, document viewers, word of mouth, SMS, email, internet messengers, and social media, and tags (labels) it to a document or topic in a shorter amount of time. At any point in time, one or more gleaned content items can be compiled into a single document without the need for an editor. A team of users can use any type of device to collaborate on, review, and publish the document.