G06F16/3346

Systems and methods for identifying and linking events in structured proceedings

The present disclosure relates to systems and methods for analyzing and extracting docket data related to a structured proceeding, for identifying docket entries associated with motions and docket entries associated with orders, and for identifying motions affected by orders. Embodiments provide for receiving docket data associated with a structured proceeding, the docket data including at least one docket entry. Embodiments also include identifying, by an automated analysis, docket entries associated with motions in the structured proceeding, and docket entries associated with orders in the structured proceeding. In embodiments, identifying the docket entries associated with orders includes identifying at least one order that includes a results-affecting decision affecting at least one motion. Embodiments further include linking, by the automated analysis, the affected at least one motion to the affecting order.
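
The identify-and-link flow described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the disclosure: the `DocketEntry` type, the regular expressions, the bracketed `[12]` cross-reference convention, and the sample docket are all hypothetical, and simple pattern matching stands in for whatever automated analysis the embodiments actually use.

```python
import re
from dataclasses import dataclass


@dataclass
class DocketEntry:
    number: int  # hypothetical docket entry number
    text: str    # hypothetical docket entry description


# Illustrative patterns only; a real system would need far more robust analysis.
MOTION_RE = re.compile(r"^\s*motion\b", re.IGNORECASE)
ORDER_RE = re.compile(r"^\s*order\b", re.IGNORECASE)
DECISION_RE = re.compile(r"\b(granting|denying)\b", re.IGNORECASE)
REF_RE = re.compile(r"\[(\d+)\]")  # assumed cross-reference style, e.g. "[12]"


def link_orders_to_motions(entries):
    """Return (motion_number, order_number, decision) for each linked pair."""
    motions = {e.number for e in entries if MOTION_RE.search(e.text)}
    links = []
    for e in entries:
        if not ORDER_RE.search(e.text):
            continue
        decision = DECISION_RE.search(e.text)
        if decision is None:
            continue  # the order does not decide a motion in this sketch
        for ref in REF_RE.findall(e.text):
            if int(ref) in motions:
                links.append((int(ref), e.number, decision.group(1).lower()))
    return links


docket = [
    DocketEntry(12, "MOTION to Dismiss by Defendant"),
    DocketEntry(15, "MOTION for Extension of Time by Plaintiff"),
    DocketEntry(18, "ORDER granting [15] Motion for Extension of Time"),
    DocketEntry(20, "ORDER setting hearing on [12] Motion to Dismiss"),
    DocketEntry(24, "ORDER denying [12] Motion to Dismiss"),
]

for motion, order, decision in link_orders_to_motions(docket):
    print(f"order {order} ({decision}) -> motion {motion}")
```

In this sketch, entry 20 is recognized as an order but is not linked, because it contains no granting or denying language and so carries no decision affecting the motion.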

System and method of combining statistical models, data models, and human-in-the-loop for text normalization
11599723 · 2023-03-07

According to principles described herein, unsupervised statistical models, semi-supervised data models, and human-in-the-loop (HITL) methods are combined to create a text normalization system that is both robust and trainable with a minimum of human intervention. This system can be applied to data from multiple sources to standardize text for insertion into knowledge bases, machine learning model training and evaluation corpora, and analysis tools and databases.

Enhanced search engine techniques utilizing third-party data

Systems and methods are described herein for generating enhanced search results utilizing third-party website content within a search engine provided by an electronic catalog of a service provider. This content may be collected in advance of query processing and analyzed to identify a category indicating some attribute of the content (e.g., terms mentioned, topics discussed, objects depicted in images/videos/3D data of the content, etc.). Items may be matched to the website by comparing the textual and/or visual representation data of the website with textual and/or visual representation data associated with an item offered within the electronic catalog. A query may be subsequently received, and a third-party website may be identified as being relevant to the search query. In response to the query, the third-party website may be included in a search result list along with images and/or text identifying items pertaining to that website.

MACHINE LEARNING FOR SIMILARITY SCORES BETWEEN DIFFERENT DOCUMENT SCHEMAS

A document repository may be searched for documents that are similar to a source document. Multiple queries may be generated based on a type of the source document, and the results may be combined in a unified response. User behavior may then be monitored, and implicit and explicit feedback may be gathered to evaluate the performance of the search. The gathered feedback may indicate how relevant each of the result documents is in comparison to the original source document. This feedback may then be used to adjust search parameters for the source document type, such that the performance of subsequent searches may be improved. A model may also be trained to classify implicit feedback using explicit feedback received from users.

ENFORCING DATA OWNERSHIP AT GATEWAY REGISTRATION USING NATURAL LANGUAGE PROCESSING

Enforcing data ownership may include receiving a request to register an application programming interface (API) endpoint. A plurality of elements of the API endpoint and of a target API endpoint may be preprocessed. A distance may be computed for each element of the API endpoint relative to at least one of the elements of the target API endpoint. A distance score for the API endpoint may be computed based on the computed distances. A term frequency-inverse document frequency (TF-IDF) value may be computed for a plurality of metadata terms of the API endpoint and the target API endpoint. A similarity score between the TF-IDF values of the metadata terms may be computed. An adjusted score may be computed for the API endpoint based on the distance score and the similarity score. The API endpoint may be registered based on the adjusted score being below a permissions threshold.
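
The scoring pipeline in this abstract can be sketched as follows. The abstract does not specify the distance metric, the TF-IDF variant, or how the two scores are combined, so every choice here is an assumption: `difflib.SequenceMatcher` supplies the per-element distance, a smoothed IDF is used, and `adjusted_score` is a hypothetical weighted "closeness" to the target endpoint, so that a value below the (equally hypothetical) `PERMISSIONS_THRESHOLD` means the new endpoint is sufficiently different from the target.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def preprocess(path):
    """Lowercase a path such as '/v1/Customers/{id}' and split it into elements."""
    return [p for p in re.split(r"[/{}_\-]+", path.lower()) if p]


def element_distance(a, b):
    """Assumed metric: 1 minus the SequenceMatcher similarity ratio."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def distance_score(elements, target_elements):
    """Mean, over elements, of the distance to the closest target element."""
    if not elements or not target_elements:
        return 1.0  # nothing to compare: treat as maximally distant
    nearest = [min(element_distance(e, t) for t in target_elements)
               for e in elements]
    return sum(nearest) / len(nearest)


def tfidf_vectors(docs):
    """TF-IDF with a smoothed IDF term, computed over the given token lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def adjusted_score(path, meta, target_path, target_meta, weight=0.5):
    """Hypothetical combination: weighted closeness of paths and metadata."""
    d = distance_score(preprocess(path), preprocess(target_path))
    u, v = tfidf_vectors([meta.lower().split(), target_meta.lower().split()])
    return weight * (1.0 - d) + (1.0 - weight) * cosine(u, v)


PERMISSIONS_THRESHOLD = 0.8  # hypothetical value
score = adjusted_score("/v1/weather/forecast", "daily weather forecast",
                       "/v1/customers/{id}", "customer profile records")
may_register = score < PERMISSIONS_THRESHOLD
```

Under these assumptions, a request that duplicates the target endpoint's path and metadata scores close to 1.0 and is refused, while an unrelated endpoint scores lower and may be registered.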

SYSTEMS AND METHODS FOR DYNAMIC LABELING OF REAL-TIME COMMUNICATION SESSIONS

Systems and methods are described for generating a dynamic label for a real-time communication session. An ongoing communication session is monitored to identify a content characteristic of the communication session. A size of a sliding window is determined based on the content characteristic, where the size of the sliding window defines a segment of the communication session to include in a most recent subset of communications. The most recent subset of communications is analyzed to identify relevant words based on one or more relevancy criteria. A dynamic label associated with the communication session is generated, where the dynamic label includes at least a selected one of the relevant words.
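
A minimal sketch of the sliding-window labeling described above follows. The abstract does not say which content characteristic or relevancy criteria are used, so the choices here are assumptions: the characteristic is the average number of words per message, the window is wider when messages are short, and relevancy is plain word frequency after removing a small hypothetical stopword list.

```python
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"the", "and", "for", "which", "has", "how", "that", "this",
             "with", "are", "was", "anyone"}


def window_size(messages, short_msg_words=5, small=3, large=6):
    """Assumed rule: short messages carry less context, so use a wider window."""
    avg_words = sum(len(m.split()) for m in messages) / len(messages)
    return large if avg_words < short_msg_words else small


def dynamic_label(messages, top_n=2):
    """Label the session with the most frequent words in the recent window."""
    if not messages:
        return ""
    recent = messages[-window_size(messages):]
    counts = Counter(
        word
        for message in recent
        for word in re.findall(r"[a-z']+", message.lower())
        if len(word) > 2 and word not in STOPWORDS
    )
    return " ".join(word for word, _ in counts.most_common(top_n))


session = [
    "hello everyone, how is it going today",
    "anyone watching the game tonight",
    "the playoff game starts at eight",
    "which channel has the playoff game",
    "playoff game is on channel five",
]
label = dynamic_label(session)
```

Calling `dynamic_label` again as new messages arrive lets the label track the current topic, since only the most recent window contributes to the word counts.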

Searching data repositories using pictograms and machine learning
11663256 · 2023-05-30

A pictogram repository is created of pictograms including expressions that are mapped to at least a portion of source code that is stored in a separate source code repository. A score is recorded for each developer of the source code that is stored in the source code repository. A source code search inquiry is conducted using at least one pictogram as search query elements, in which the at least one pictogram is matched to the pictograms in the pictogram repository. For matching source code, the score of its developer is checked against a threshold value. Source code that meets the search query elements and whose developer's score meets the threshold value is retrieved.
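
The two-stage retrieval described above (match pictograms, then filter by developer score) can be sketched as below. The `Snippet` type, the use of plain string names in place of actual pictograms, the score scale, and the sample data are all hypothetical, and exact set containment stands in for the machine-learning matching named in the title.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    path: str              # location in the (separate) source code repository
    developer: str         # developer credited with the code
    pictograms: frozenset  # pictogram names mapped to this code


def search(snippets, developer_scores, query, threshold):
    """Return paths whose pictograms cover the query and whose developer
    score meets the threshold."""
    wanted = frozenset(query)
    return [
        s.path
        for s in snippets
        if wanted <= s.pictograms
        and developer_scores.get(s.developer, 0.0) >= threshold
    ]


snippets = [
    Snippet("auth/login.py", "avery", frozenset({"lock", "key"})),
    Snippet("auth/legacy_login.py", "blake", frozenset({"lock", "key"})),
    Snippet("db/export.py", "avery", frozenset({"database", "arrow"})),
]
developer_scores = {"avery": 0.9, "blake": 0.4}

results = search(snippets, developer_scores, ["lock", "key"], threshold=0.7)
```

Here both login files match the pictogram query, but only the one whose developer score meets the threshold is returned.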

SENTIMENT ANALYSIS FOR ASPECT TERMS EXTRACTED FROM DOCUMENTS HAVING UNSTRUCTURED TEXT DATA

An apparatus comprises at least one processing device configured to receive a query to perform sentiment analysis for a document, to generate, utilizing a first machine learning model, a first set of encodings classifying words of the document as being aspect or non-aspect terms, to generate, utilizing a second machine learning model, a second set of encodings classifying sentiment of the words of the document, and to determine, for a given aspect term, attention weights for a given subset of the words of the document surrounding the given aspect term. The processing device is also configured to generate, utilizing a third machine learning model, a given sentiment classification of the given aspect term based on the attention weights and a given portion of the second set of encodings for the given subset of the words, and to provide a response to the query comprising the given sentiment classification.

SYSTEM AND METHOD FOR MEDICAL LITERATURE MONITORING OF ADVERSE DRUG REACTIONS

A system and method for medical literature monitoring of adverse drug reactions, enabled by screening literature references by applying one or more machine learning models trained using a data labelling protocol and a plurality of data rules prescribed by a plurality of subject matter experts. The data labelling protocol comprises a set of inferences derived from the screening and labelling, by subject matter experts, of a plurality of medical literature references with suspected references to adverse drug reactions. Suspected references to adverse drug reactions include direct references to adverse drug reactions and indirect references to adverse drug reactions. The plurality of data rules is derived from observations of subject matter experts during data labelling. The predictions output by each of the machine learning models are validated with the data rules, and a final list of literature with suspected references to adverse drug reactions is generated.

Recommending online communication groups by matching unstructured text input to conversations
11468133 · 2022-10-11

A computer joining an online chat service can be matched automatically, under computer control and based on unstructured text input, to one of multiple different online chat conversations using a trained transformer-based machine learning model, training techniques, and similarity assessment techniques. Computer analysis in this manner improves the likelihood that the unstructured text input results in assigning the computer to a relevant chat conversation. Additionally, or alternatively, a dense passage retrieval machine learning model having a first encoder for resources and a second encoder for messages can automatically match relevant resources to computers or sessions based on analysis of a series of messages of an online chat conversation. In either approach, continuous re-training is supported based on feedback from a moderator computer and/or user computers.