G06F8/751

FREQUENT SOURCE CODE PATTERN MINING

A data mining technique is used to find large frequently-occurring source code patterns from methods/APIs that can be used in code development. Simplified trees that represent the syntactic structure and type and method usage of a source code fragment, such as a method, are mined to find closed and maximal frequent subtrees which represent the largest frequently-occurring source code patterns or idioms associated with a particular type and method usage. These idioms are then used in an idiom web service and/or a code completion system to assist users in the development of source code programs.
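The mining step described above can be illustrated with a minimal sketch. It assumes each method has already been reduced to a simplified tree of nested tuples, and it only considers complete subtrees (a node plus all of its descendants), which is a simplification of general frequent-subtree mining. The helper names and the node labels (`using`, `decl:StreamReader`, and so on) are hypothetical and do not come from the abstract.

```python
from collections import Counter


def node(label, *children):
    """Build a hashable tree node: (label, (child, child, ...))."""
    return (label, tuple(children))


def subtrees(tree):
    """Yield the tree itself and every complete subtree below it."""
    yield tree
    for child in tree[1]:
        yield from subtrees(child)


def frequent_subtrees(trees, min_support):
    """Count in how many trees each subtree occurs; keep the frequent ones."""
    support = Counter()
    for tree in trees:
        for st in set(subtrees(tree)):  # count each subtree once per tree
            support[st] += 1
    return {st: n for st, n in support.items() if n >= min_support}


def maximal(frequent):
    """Keep frequent subtrees that are not part of a larger frequent subtree."""
    return [
        st for st in frequent
        if not any(st != other and st in set(subtrees(other)) for other in frequent)
    ]


# Hypothetical simplified trees for three methods; two share one idiom.
idiom = node(
    "using",
    node("decl:StreamReader", node("new:StreamReader")),
    node("call:ReadToEnd"),
)
methods = [
    node("method", idiom, node("return")),
    node("method", node("if", idiom)),
    node("method", node("call:WriteLine")),
]
patterns = maximal(frequent_subtrees(methods, min_support=2))
```

With a minimum support of 2, the only maximal pattern is the shared `using` subtree; its smaller pieces are frequent too, but each is contained in that larger pattern.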

Systems and methods for code analysis heat map interfaces

The present application is directed towards systems and methods for providing a heat map interface for analyzing and reporting transformation capabilities of a source installation to a target installation of an application. Characteristics of the source installation are displayed in an easy, intuitive interface, providing improved efficiency in analysis and planning. Furthermore, the interface is interactive, allowing an administrator or user to select and apply transformation dispositions to code objects grouped into regions and sub-regions, providing versatility and accuracy of configuration.

BINARY CODE SIMILARITY DETECTION SYSTEM
20220244953 · 2022-08-04

A binary code similarity detection system that compares a target binary code to a source code by comparing the target binary code to a comparing binary generated by compiling the source code. Rather than using a comparing binary generated using a random or fixed compiling configuration, the system identifies the compiling configuration of the target binary code and compares the target binary code to a comparing binary generated using the same compiling configuration as the target binary code. The compiling configuration of the target binary code may be identified by a neural network (e.g., a graph attention network trained on attributed function call graphs of binary codes with known compiling configurations). The target binary code and the comparing binary may be compared using a graph neural network (e.g., a graph triplet loss network) that compares attributed control flow graphs of the target binary code and the comparing binary.

SYSTEM FOR COMPUTER CODE DEVELOPMENT ENVIRONMENT CLONING AND AUTHENTICATION USING A DISTRIBUTED SERVER NETWORK AND MACHINE LEARNING

A system is provided for computer code development environment cloning and authentication using a distributed server network and machine learning. In particular, the system may use a machine learning algorithm configured to automatically identify and analyze changes in computing code between two or more environments and publish a record of said changes to a private distributed register stored on a plurality of distributed server nodes. Based on the analysis, the system may generate one or more recommended changes to the source code. If the changes are confirmed by one or more authorized users, the system may automatically implement the changes and publish a confirmation record of the implemented changes to the distributed register. In this way, the system may provide an efficient way to ensure synchronization of code across multiple computing environments.

APPLICATION MIGRATION USING COST-AWARE CODE DEPENDENCY GRAPH

Described are techniques for application migration. The techniques include migrating an application to a target cloud infrastructure and generating a cost-aware code dependency graph during execution of the application on the target cloud infrastructure. The techniques further include modifying the application by removing source code corresponding to unused nodes according to the cost-aware code dependency graph and replacing identified source code of a high-cost subgraph of the cost-aware code dependency graph with calls to a generated microservice configured to provide functionality similar to the identified source code. The techniques further include implementing the modified application on one or more virtual machines of the target cloud infrastructure.
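As a rough illustration of how such a graph might drive the two modifications, the sketch below assumes each node records how often its code ran and what it cost while the application executed on the target infrastructure. The `Node` fields, the cost unit, and the example function names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    calls: int = 0       # invocations observed during execution on the target
    cost: float = 0.0    # accumulated cost; the unit is an assumption
    deps: list = field(default_factory=list)  # names of the nodes this one calls


def unused_nodes(graph):
    """Nodes never executed; candidates for source code removal."""
    return sorted(n.name for n in graph.values() if n.calls == 0)


def subgraph_cost(graph, root):
    """Total cost of every node reachable from root (each counted once)."""
    seen, stack, total = set(), [root], 0.0
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        total += graph[name].cost
        stack.extend(graph[name].deps)
    return total


# Hypothetical profile of a small application.
graph = {n.name: n for n in [
    Node("main", 1, 0.1, ["parse", "report", "legacy_export"]),
    Node("parse", 10, 0.4),
    Node("report", 5, 0.5, ["render_pdf"]),
    Node("render_pdf", 5, 6.0, ["compress"]),
    Node("compress", 5, 3.0),
    Node("legacy_export", 0, 0.0),
]}

removable = unused_nodes(graph)
pdf_cost = subgraph_cost(graph, "render_pdf")
```

Here `legacy_export` is never called and could be removed, while the subgraph rooted at `render_pdf` carries most of the cost and would be the natural candidate to replace with a call to a generated microservice.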

METHODS, APPARATUS, AND ARTICLES OF MANUFACTURE TO GENERATE USAGE DEPENDENT CODE EMBEDDINGS
20220107828 · 2022-04-07

Methods, apparatus, systems, and articles of manufacture are disclosed to generate usage dependent code embeddings. An example apparatus includes parsing circuitry to select a usage context of a code snippet including at least one line of code (LOC) before the code snippet or an LOC at which the code snippet is called, the code snippet, and at least one LOC after the code snippet or the LOC. The example apparatus additionally includes embedding circuitry to generate a first list of token embedding vectors for first tokens of a second list of tokens for the code snippet and a third list of token embedding vectors for second tokens of a fourth list of tokens for the usage context. The example apparatus also includes concatenation circuitry to concatenate a transformed token embedding vector of a close token and a fifth list of transformed token embedding vectors for the first list.
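The usage-context selection performed by the parsing circuitry can be sketched as follows. This covers only the context-selection and tokenization steps, not the embedding or concatenation stages, and the function names, the `window` parameter, and the tokenizer are assumptions made for illustration.

```python
import re


def usage_context(lines, call_index, snippet_lines, window=1):
    """Return LOC before the call site, the snippet itself, and LOC after it."""
    before = lines[max(0, call_index - window):call_index]
    after = lines[call_index + 1:call_index + 1 + window]
    return before + snippet_lines + after


def tokenize(code_lines):
    """Split code into identifiers, numbers, and single punctuation marks."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", " ".join(code_lines))


# Hypothetical caller and snippet; the snippet is called on line index 1.
caller = ["x = load()", "y = normalize(x)", "save(y)"]
snippet = ["def normalize(v):", "    return v / max(v)"]

context = usage_context(caller, call_index=1, snippet_lines=snippet)
snippet_tokens = tokenize(snippet)
context_tokens = tokenize(context)
```

The two token lists would then be mapped to token embedding vectors by whatever embedding model is in use; that step is intentionally left out here.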

Machine learning based tracking of derivative code

In an approach for using machine learning to track programming code derivatives of source code, a processor captures the source code to track iterations of the source code. A processor detects a change of the source code. A processor analyzes derivative code from the source code for correlation with the source code based on similarity. A processor determines that one or more functions of the derivative code are related to the change of the source code based on the correlation. A processor highlights the related one or more functions of the derivative code.
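The correlation step can be approximated with a simple sketch. It substitutes a token-level `difflib.SequenceMatcher` ratio for the machine learning similarity model, which is a stand-in rather than the approach's actual model; the 0.6 threshold and the example functions are likewise assumptions.

```python
import difflib
import re


def tokens(code):
    """Split code into word tokens and single punctuation marks."""
    return re.findall(r"\w+|\S", code)


def similarity(a, b):
    """Token-level similarity in [0, 1]; a stand-in for a learned model."""
    return difflib.SequenceMatcher(None, tokens(a), tokens(b)).ratio()


def related_functions(changed_source, derivative_functions, threshold=0.6):
    """Names of derivative functions that correlate with the changed source."""
    return sorted(
        name for name, body in derivative_functions.items()
        if similarity(changed_source, body) >= threshold
    )


# Hypothetical changed source function and two functions from derivative code.
changed = "def area(w, h): return w * h"
derivative = {
    "rect_area": "def rect_area(w, h): return w * h",
    "greet": "def greet(name): print('hello', name)",
}

related = related_functions(changed, derivative)
```

Only `rect_area` is flagged, since it differs from the changed function by a single token; a tool built along these lines would highlight it for review.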

Method for identifying open-source software components at the source-code level

According to some exemplary embodiments of the present disclosure, a method for identifying open source software (OSS) components using a processor of a computing device is disclosed. The method may include: constructing a component database by performing redundancy elimination for each of a plurality of open source software projects; and identifying a component of target software by using the component database.
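A minimal sketch of the two steps follows. The abstract does not say how redundancy elimination works, so the sketch assumes one possible reading: functions shared by more than one OSS project are dropped, leaving each component with only its unique function fingerprints. The fingerprinting scheme, the 0.5 threshold, and the project names are hypothetical.

```python
import hashlib


def fingerprint(function_source):
    """Hash a function after removing all whitespace (a crude normalization)."""
    normalized = "".join(function_source.split())
    return hashlib.sha256(normalized.encode()).hexdigest()


def build_component_db(oss_projects):
    """Map each project to the fingerprints that no other project shares."""
    fps = {name: {fingerprint(f) for f in funcs}
           for name, funcs in oss_projects.items()}
    db = {}
    for name, own in fps.items():
        others = set().union(*(s for n, s in fps.items() if n != name))
        db[name] = own - others  # assumed form of redundancy elimination
    return db


def identify_components(target_functions, db, threshold=0.5):
    """Report components whose unique functions mostly appear in the target."""
    target = {fingerprint(f) for f in target_functions}
    return sorted(
        name for name, unique in db.items()
        if unique and len(unique & target) / len(unique) >= threshold
    )


# Hypothetical OSS projects; both contain the same clamp() function.
oss = {
    "libjson": ["def parse(s): ...", "def dump(o): ...", "def clamp(x): ..."],
    "libimg": ["def resize(i): ...", "def clamp(x): ..."],
}
target = ["def parse(s):  ...", "def dump(o): ...", "def main(): ..."]

db = build_component_db(oss)
found = identify_components(target, db)
```

Because the shared `clamp` function is eliminated from both entries, its presence in a target would not cause a false match against either project.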

Code recommender for resolving a new issue received by an issue tracking system

Training data identifying a plurality of pairs is received. Each pair identifies one or more separate code snippets known to resolve a respective issue of a plurality of issues. For each pair of the plurality of pairs, a respective issue representation of core content of the respective issue and a linear expression of the one or more separate code snippets in a respective code representation are constructed, and a model is trained to correlate the respective code representation as resolving the respective issue representation. The model is queried with a new issue and a selected one of the one or more separate code snippets. The model returns a classification indicating whether the selected one of the one or more separate code snippets is likely to resolve the new issue.
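The train-then-query flow can be sketched with a deliberately simple stand-in. Instead of a trained model, the sketch uses nearest-neighbour matching on Jaccard token overlap between issues; this is a swapped-in technique chosen for brevity, and the function names, the `<sep>` marker, the 0.3 threshold, and the example issues are all assumptions.

```python
import re


def issue_tokens(text):
    """Issue representation: the set of lowercase word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def linearize(snippets):
    """Linear expression of several snippets as one string."""
    return "\n<sep>\n".join(snippets)


def train(pairs):
    """pairs: list of (issue_text, [snippet, ...]) known to resolve the issue."""
    return [(issue_tokens(issue), linearize(snippets)) for issue, snippets in pairs]


def likely_resolves(model, new_issue, snippet, threshold=0.3):
    """Classify by the best Jaccard overlap with issues the snippet resolved."""
    query = issue_tokens(new_issue)
    best = 0.0
    for tokens, code in model:
        if snippet in code and tokens | query:
            best = max(best, len(tokens & query) / len(tokens | query))
    return best >= threshold


# Hypothetical training pairs and a new issue.
null_fix = "if profile is None:\n    return default_profile()"
timeout_fix = "client.upload(path, timeout=300)"
model = train([
    ("NullPointerException when user profile is missing", [null_fix]),
    ("Timeout while uploading large file", [timeout_fix]),
])
new_issue = "Crash with NullPointerException when profile is missing"

null_fix_applies = likely_resolves(model, new_issue, null_fix)
timeout_fix_applies = likely_resolves(model, new_issue, timeout_fix)
```

The new issue shares most of its tokens with the first training issue, so the first snippet is classified as likely to resolve it and the second is not.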