G06F40/151

ENCODING VARIABLE LENGTH CHARACTERS USING SIMULTANEOUS PROCESSING
20220405460 · 2022-12-22

Embodiments are directed to managing character encoding. A plurality of characters that are each encoded as code units based on a character code may be provided such that the code units for each character represent a code point of a character encoding scheme. An encoding model may be determined based on the character code, one or more processor features, and a target character code. The processor features may be employed to transform the code units into target code units based on the encoding model such that the target code units are based on the target character code and such that the target code units encode the code point for each character. A plurality of target characters may be provided to a target stream such that each target character is encoded as the target code units.
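
As an illustration of the transformation described above, the following minimal sketch converts UTF-8 code units into UTF-16 code units by way of the shared code point. It assumes UTF-8 as the character code and UTF-16 as the target character code, processes one character at a time rather than simultaneously, and does not reject overlong forms or encoded surrogates. The function names are hypothetical and are not taken from the publication.

```python
def utf8_to_code_points(data: bytes) -> list[int]:
    """Decode UTF-8 code units (bytes) into Unicode code points."""
    points = []
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte determines how many code units the character uses.
        if lead < 0x80:
            length, cp = 1, lead
        elif lead >> 5 == 0b110:
            length, cp = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:
            length, cp = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:
            length, cp = 4, lead & 0x07
        else:
            raise ValueError(f"invalid lead byte at offset {i}")
        if i + length > len(data):
            raise ValueError("truncated sequence")
        # Each continuation byte contributes six payload bits.
        for unit in data[i + 1:i + length]:
            if unit >> 6 != 0b10:
                raise ValueError("invalid continuation byte")
            cp = (cp << 6) | (unit & 0x3F)
        points.append(cp)
        i += length
    return points


def code_points_to_utf16(points: list[int]) -> list[int]:
    """Encode code points as 16-bit UTF-16 code units."""
    units = []
    for cp in points:
        if cp < 0x10000:
            units.append(cp)
        else:
            # Code points above the BMP become a surrogate pair.
            offset = cp - 0x10000
            units.append(0xD800 | (offset >> 10))
            units.append(0xDC00 | (offset & 0x3FF))
    return units


def transcode_utf8_to_utf16(data: bytes) -> list[int]:
    """Source code units -> code points -> target code units."""
    return code_points_to_utf16(utf8_to_code_points(data))
```

The code point acts as the invariant: both the source and target code units encode the same value, which is the property the abstract requires of the target stream.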

Title rating and improvement process and system

In accordance with one embodiment, a method can be implemented that comprises receiving as an input a title of a video from a video sharing web site; parsing the title of the video into one or more n-grams; and computing, with a computer, a title-searchability score using the one or more n-grams.
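
The parsing step above can be sketched as follows. The abstract does not say how the score is computed, so the scoring function here is purely an assumption: it averages a hypothetical per-n-gram search-volume table over all n-grams of the title. The names `title_ngrams`, `searchability_score`, and `search_volume` are illustrative only.

```python
def title_ngrams(title: str, max_n: int = 3) -> list[str]:
    """Split a title into word n-grams of length 1..max_n."""
    words = title.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    ]


def searchability_score(title: str,
                        search_volume: dict[str, float],
                        max_n: int = 3) -> float:
    """Hypothetical score: mean search volume over the title's n-grams.

    N-grams missing from the table count as zero.
    """
    grams = title_ngrams(title, max_n)
    if not grams:
        return 0.0
    return sum(search_volume.get(g, 0.0) for g in grams) / len(grams)
```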

Logical, recursive definition of data transformations
11526656 · 2022-12-13

Techniques and solutions are described for defining transformation specifications in a programming-language independent language and converting such specifications to one or more executable formats. The language can provide for defining rules and actions. Rules can refer to (e.g., be based at least in part on) data targets, such as attributes of a schema, whose identifiers are to be read or updated, or to other rules. Rules can be reused, and can recursively refer to one another, such that a large number of complex schema transformations can be accomplished using a series of first order logic statements. Actions can define what, and how, values will be changed when a predicate rule is satisfied. A transformation specification in the language can be parsed and selectively compiled to one or more executable formats, including in programming languages such as the structured query language. Disclosed technologies can facilitate data transformations by non-technical users.
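
One way to picture rules that refer to attributes or to other rules, and actions that compile to SQL, is the sketch below. The data model (`Compare`, `AllOf`, `Action`) and the generated SQL shape are assumptions made for illustration, not the specification language of the patent. The sketch performs no cycle detection and no SQL escaping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compare:
    """Leaf rule: compares a schema attribute with an SQL literal."""
    attribute: str
    op: str
    value: str


@dataclass(frozen=True)
class AllOf:
    """Composite rule: holds when every named rule holds."""
    rules: tuple[str, ...]


@dataclass(frozen=True)
class Action:
    """Sets `attribute` to `value` when the rule named `when` holds."""
    when: str
    attribute: str
    value: str


def rule_to_sql(name: str, rules: dict) -> str:
    """Recursively expand a named rule into an SQL predicate."""
    body = rules[name]
    if isinstance(body, Compare):
        return f"{body.attribute} {body.op} {body.value}"
    parts = [rule_to_sql(ref, rules) for ref in body.rules]
    return "(" + " AND ".join(parts) + ")"


def action_to_sql(action: Action, table: str, rules: dict) -> str:
    """Compile one action into an SQL UPDATE statement."""
    predicate = rule_to_sql(action.when, rules)
    return (f"UPDATE {table} SET {action.attribute} = {action.value} "
            f"WHERE {predicate}")
```

Because composite rules name other rules rather than embedding them, a leaf rule such as a region check can be reused by any number of composite rules and actions.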

Methods and systems for determining relevance of documents

Methods and systems for determining relevance for a new document are described. Existing documents that have a high probability of relevance can be chosen. A vocabulary of words in the existing documents can be built. Each word can be mapped into a vector such that each existing document can be represented by a sequence of vectors and each sentence and/or paragraph in each existing document can be represented by a subsequence of vectors including a subset of the sequence of vectors. Data augmentation can be applied by changing the order of the subsequences to create additional documents represented by the subsequences. A deep neural network can be trained using the subsequences that represent the existing documents and the subsequences that represent the additional documents. The new document can be processed using the trained deep neural network. A relevant document can be output using the trained deep neural network.
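
The vocabulary and augmentation steps can be sketched as below. As a simplifying assumption, each word is mapped to an integer id (which could index an embedding table) instead of a vector, and a document is a list of sentences, each a list of words. The function names and the seeded shuffle are illustrative choices, not details from the abstract.

```python
import random


def build_vocabulary(documents: list[list[list[str]]]) -> dict[str, int]:
    """Assign each distinct word an integer id in order of first appearance."""
    vocab: dict[str, int] = {}
    for doc in documents:
        for sentence in doc:
            for word in sentence:
                vocab.setdefault(word, len(vocab))
    return vocab


def encode(doc: list[list[str]], vocab: dict[str, int]) -> list[list[int]]:
    """Represent a document as one subsequence of ids per sentence."""
    return [[vocab[word] for word in sentence] for sentence in doc]


def augment(encoded_doc: list[list[int]], copies: int,
            seed: int = 0) -> list[list[list[int]]]:
    """Create extra documents by reordering the sentence subsequences.

    Word order inside each subsequence is left untouched.
    """
    rng = random.Random(seed)
    augmented = []
    for _ in range(copies):
        shuffled = list(encoded_doc)
        rng.shuffle(shuffled)
        augmented.append(shuffled)
    return augmented
```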

System for providing dynamic linked panels in user interface

A computer system may be configured to: execute a first query associated with a first panel; display the first panel in a user interface based on first display settings of the first panel, the first panel displaying at least a portion of the result of the first query, the result of the first query associated with a variable; execute a second query associated with a second panel, wherein the second query refers to the variable associated with the first query; display the second panel in the user interface based on second display settings of the second panel, the second panel displaying at least a portion of the result of the second query; and in response to user input changing the displayed result in the first panel: re-execute the second query; and update the display of the second panel in the user interface based on results of the re-executed second query.
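
The linkage between the two panels can be sketched as follows. This is a minimal in-memory model under stated assumptions: a query is a Python callable that receives the current variables, the first row of an exporting panel's result seeds its variable, and display settings are omitted. `Panel`, `Dashboard`, `exports`, and `reads` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Panel:
    name: str
    query: Callable[[dict], list]   # receives the current variables
    exports: Optional[str] = None   # variable set from this panel's result
    reads: tuple = ()               # variables the query refers to
    result: list = field(default_factory=list)
    runs: int = 0                   # number of times the query has executed

    def execute(self, variables: dict) -> None:
        self.result = self.query(variables)
        self.runs += 1


class Dashboard:
    def __init__(self, panels: list):
        self.panels = list(panels)
        self.variables: dict = {}
        # Execute panels in order so later queries can read earlier variables.
        for panel in self.panels:
            panel.execute(self.variables)
            if panel.exports is not None and panel.result:
                self.variables[panel.exports] = panel.result[0]

    def select(self, panel: Panel, value) -> None:
        """Handle user input that changes the displayed result of `panel`."""
        self.variables[panel.exports] = value
        # Re-execute only the panels whose queries refer to the variable.
        for other in self.panels:
            if panel.exports in other.reads:
                other.execute(self.variables)
```

Only dependent panels are re-executed on a selection, so the first panel's query is not run again.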

Automated nonparametric content analysis for information management and retrieval

Embodiments of the invention utilize a feature-extraction approach and/or a matching approach in combination with a nonparametric approach to estimate the proportion of documents in each of multiple labeled categories with high accuracy. The feature-extraction approach automatically generates continuously valued text features optimized for estimating the category proportions, and the matching approach constructs a matched set that closely resembles a data set that is unobserved based on an observed set, thereby improving the degree to which the distributions of the observed and unobserved sets resemble each other.
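
The idea of estimating category proportions directly, without classifying individual documents, can be illustrated with a deliberately simplified two-category sketch. It assumes the mean feature vector of the unlabeled set is a mixture of the per-category means observed in the labeled set, and solves for the mixing weight by least squares. This is not the feature-extraction or matching procedure of the patent, and all names are hypothetical.

```python
def mean_vector(rows: list[list[float]]) -> list[float]:
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def estimate_proportion(labeled_a: list[list[float]],
                        labeled_b: list[list[float]],
                        unlabeled: list[list[float]]) -> float:
    """Estimate the share of category A in the unlabeled set.

    Solves mean(unlabeled) ~= p * mean(A) + (1 - p) * mean(B) for p
    by least squares, then clips p to [0, 1].
    """
    a = mean_vector(labeled_a)
    b = mean_vector(labeled_b)
    s = mean_vector(unlabeled)
    diff = [x - y for x, y in zip(a, b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("category means are identical; p is not identifiable")
    p = sum((si - bi) * d for si, bi, d in zip(s, b, diff)) / denom
    return min(1.0, max(0.0, p))
```

The estimate is only as good as the assumption that features are distributed the same way within each category in both sets, which is the gap the matching approach described above is meant to narrow.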

SYSTEMS AND METHODS FOR SEMANTIC CODE SEARCH

Embodiments described herein provide a contrastive learning framework that leverages hard negative examples, mined globally from the entire training corpus for a given query, to improve the quality of code and natural language representations. Specifically, similar examples from the training corpus are extracted and used as hard negatives in an online manner during training while keeping the minibatch construction random.
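
Global hard-negative mining can be sketched as a nearest-neighbor search over the whole corpus that excludes the query's true match. The sketch assumes embeddings are plain lists of floats and uses brute-force cosine similarity. The function names are hypothetical, and the contrastive loss itself is not shown.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def mine_hard_negatives(query_vec: list[float],
                        corpus_vecs: list[list[float]],
                        positive_index: int,
                        k: int) -> list[int]:
    """Return indices of the k corpus items most similar to the query,
    excluding the query's true positive."""
    scored = [
        (cosine(query_vec, vec), index)
        for index, vec in enumerate(corpus_vecs)
        if index != positive_index
    ]
    scored.sort(reverse=True)
    return [index for _, index in scored[:k]]
```

Items that score high here are close to the query yet incorrect, so they are harder to separate than randomly sampled negatives.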