Patent classifications
G06F21/563
Automated detection of malware using trained neural network-based file classifiers and machine learning
Automated malware detection for application file packages using machine learning (e.g., trained neural network-based classifiers) is described. A particular method includes generating, at a first device, a first feature vector based on occurrences of character n-grams corresponding to a first subset of files of multiple files of an application file package. The method includes generating, at the first device, a second feature vector based on occurrences of attributes in a second subset of files of the multiple files. The method includes sending the first feature vector and the second feature vector from the first device to a second device as inputs to a file classifier. The method includes receiving, at the first device from the second device, classification data associated with the application file package based on the first feature vector and the second feature vector. The classification data indicates whether the application file package includes malware.
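The two feature vectors described in this abstract can be illustrated with a minimal sketch. The abstract does not say how n-gram occurrences become a vector, so the hashing trick (CRC32 modulo a fixed dimension) is an assumption here. The names `ngram_feature_vector`, `attribute_feature_vector`, and the example attribute vocabulary are hypothetical.

```python
import zlib
from collections import Counter


def ngram_feature_vector(data: bytes, n: int = 3, dims: int = 64) -> list[int]:
    """Count character n-grams, hashing each one into a fixed-length vector.

    Assumption: hashing into `dims` buckets is one common way to bound the
    vector size; the abstract does not prescribe it.
    """
    vec = [0] * dims
    for i in range(len(data) - n + 1):
        gram = data[i:i + n]
        vec[zlib.crc32(gram) % dims] += 1
    return vec


def attribute_feature_vector(attrs: list[str], vocabulary: list[str]) -> list[int]:
    """Count occurrences of each known attribute (e.g. requested permissions)."""
    counts = Counter(attrs)
    return [counts[a] for a in vocabulary]


# Hypothetical usage: the first device would send both vectors to the classifier.
first_vector = ngram_feature_vector(b"abcdef", n=3, dims=8)
second_vector = attribute_feature_vector(
    ["SEND_SMS"], ["INTERNET", "SEND_SMS", "CAMERA"]
)
```

Only the vectors, not the files themselves, would need to leave the first device in this arrangement.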
Dynamic CFI using line-of-code behavior and relation models
Disclosed herein are techniques for analyzing control-flow integrity based on functional line-of-code behavior and relation models. Techniques include receiving data based on runtime operations of a controller; constructing a line-of-code behavior and relation model representing execution of functions on the controller based on the received data; constructing, based on the line-of-code behavior and relation model, a dynamic control flow integrity model configured for the controller to enforce in real-time; and deploying the dynamic control flow integrity model to the controller.
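As a rough illustration of the idea, a control-flow model can be learned from observed runtime traces and then used to flag transitions that were never seen. This sketch works at function granularity rather than line-of-code granularity and treats the model as a plain set of allowed transitions; both are simplifying assumptions. The function names in the traces are hypothetical.

```python
def build_cfi_model(traces: list[list[str]]) -> set[tuple[str, str]]:
    """Collect every (source, destination) transition observed at runtime."""
    allowed: set[tuple[str, str]] = set()
    for trace in traces:
        allowed.update(zip(trace, trace[1:]))
    return allowed


def first_violation(model: set[tuple[str, str]], trace: list[str]):
    """Return the first transition not present in the model, or None."""
    for edge in zip(trace, trace[1:]):
        if edge not in model:
            return edge
    return None


# Hypothetical traces recorded from a controller.
model = build_cfi_model([
    ["init", "read_sensor", "send_can"],
    ["init", "send_can"],
])
ok = first_violation(model, ["init", "read_sensor", "send_can"])
bad = first_violation(model, ["init", "read_sensor", "flash_write"])
```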
Automatic integrity vulnerability detection in an integrated development environment
Aspects of the invention include receiving, by a processor, source code for a software program written in a first programming language. The received source code is converted into abstracted source code that is in a generic format that is different than a format of the first programming language. The abstracted source code is compared to known source code patterns. Based on determining that at least a subset of the abstracted source code matches a pattern in the known source code patterns, an alert is sent to the user indicating that the received source code matches the pattern.
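One way to picture the abstraction step is to replace identifiers and literals with generic placeholder tokens and then search the result for a known token pattern. The tokenizer, the placeholder names (`ID`, `STR`, `NUM`), and the example pattern below are all assumptions made for illustration, not the format used by the invention.

```python
import re

# Strings, numbers, identifiers, then any other single non-space character.
TOKEN_RE = re.compile(r'"[^"\n]*"|\'[^\'\n]*\'|\d+|[A-Za-z_]\w*|\S')
KEYWORDS = {"if", "else", "for", "while", "return"}


def abstract(source: str) -> list[str]:
    """Convert source text into a language-neutral token sequence."""
    out = []
    for tok in TOKEN_RE.findall(source):
        if tok[0] in "\"'":
            out.append("STR")
        elif tok.isdigit():
            out.append("NUM")
        elif tok[0].isalpha() or tok[0] == "_":
            out.append(tok if tok in KEYWORDS else "ID")
        else:
            out.append(tok)
    return out


def contains_pattern(tokens: list[str], pattern: list[str]) -> bool:
    """Check whether `pattern` occurs as a contiguous run inside `tokens`."""
    n = len(pattern)
    return any(tokens[i:i + n] == pattern for i in range(len(tokens) - n + 1))


# Hypothetical known pattern: a call whose argument concatenates a string
# literal with a variable, as in naive SQL query building.
CONCAT_IN_CALL = ["ID", "(", "STR", "+", "ID", ")"]

risky = abstract('query = execute("SELECT * FROM t WHERE id=" + user_id);')
safe = abstract('query = execute("SELECT 1", user_id);')
```

An alert would be raised for `risky` and not for `safe`, since only the former contains the pattern.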
Systems and methods for triaging software vulnerabilities
Systems and methods are provided for the classification of identified security vulnerabilities in software applications, and their automated triage based on machine learning. The disclosed system may generate a report listing detected potential vulnerability issues, and extract features from the report for each potential vulnerability issue. The system may receive policy data and business rules, and compare the extracted features relative to such data and rules. The system may determine a token based on the source code of a potential vulnerability issue, and a vector based on the extracted features of a potential vulnerability issue and based on the token. The system may select a machine learning modelling method and/or an automated triaging method based on the vector, and determine a vulnerability accuracy score based on the vector using the selected method.
System and method for automatically detecting a security vulnerability in a source code using a machine learning model
A method for automatically detecting a security vulnerability in a source code using a machine learning model, characterized in that the method includes: obtaining the source code from a client codebase, wherein the client codebase is a complete or an incomplete body of the source code for a given software program or an application; and using a machine learning (ML) model to perform an ML-based analysis on an abstract syntax tree (AST) for detecting a first security vulnerability over a static source code, wherein the ML-based analysis comprises (i) flattening the abstract syntax tree (AST) into a sequence of structured tokens, wherein the sequence of structured tokens includes a semantic structure and a syntactic structure of the source code, (ii) implementing a natural language processing technique on the sequence of structured tokens for mapping the sequence of structured tokens to one or more integers, (iii) pre-training the machine learning model using an unlabeled source code as an input to predict a subsequent sub-token in the sequence of structured tokens, and (iv) training the machine learning model on a labeled source code to predict a presence or an absence of the first security vulnerability.
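Steps (i) and (ii) can be sketched with Python's standard `ast` module standing in for whatever parser the method actually uses. The pre-order traversal, the choice to emit node type names alongside identifiers and constants, and the incremental vocabulary are assumptions; pre-training and supervised training (steps iii and iv) are not shown.

```python
import ast


def flatten_ast(source: str) -> list[str]:
    """Flatten an AST into a token sequence via pre-order traversal.

    Node type names carry the syntactic structure; identifiers and
    constants carry part of the semantic content.
    """
    tokens: list[str] = []

    def visit(node: ast.AST) -> None:
        tokens.append(type(node).__name__)
        if isinstance(node, ast.Name):
            tokens.append(node.id)
        elif isinstance(node, ast.Constant):
            tokens.append(repr(node.value))
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


def encode(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Map each token to an integer id, growing the vocabulary as needed."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]


tokens = flatten_ast("x = x + 1")
vocab: dict[str, int] = {}
ids = encode(tokens, vocab)
```

The resulting integer sequence is the kind of input a sequence model would consume during pre-training and fine-tuning.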
Behavioral detection of malicious scripts
A script analysis platform may obtain a script associated with content, wherein the script includes one or more functions that include one or more expressions. The script analysis platform may parse the script to generate a data structure and may traverse the data structure to determine the one or more functions and to determine properties of the one or more expressions, wherein traversing the data structure includes evaluating one or more constant sub-expressions of the one or more expressions. The script analysis platform may analyze the properties of the one or more expressions to determine whether the script exhibits malicious behavior. The script analysis platform may cause an action to be performed concerning the script or the content based on determining whether the script exhibits malicious behavior.
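Evaluating constant sub-expressions matters because obfuscated scripts often assemble sensitive names from pieces. The sketch below uses Python's `ast` module as a stand-in for the platform's parser and folds only string concatenation; the class name `ConstantFolder` and the helper `folded_strings` are hypothetical.

```python
import ast


class ConstantFolder(ast.NodeTransformer):
    """Replace `"a" + "b"` style sub-expressions with their evaluated value."""

    def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
        self.generic_visit(node)  # fold children first so chains collapse
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, str)
                and isinstance(node.right.value, str)):
            folded = ast.Constant(node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node


def folded_strings(script: str) -> list[str]:
    """Return every string constant in the script after folding."""
    tree = ConstantFolder().visit(ast.parse(script))
    return [
        n.value for n in ast.walk(tree)
        if isinstance(n, ast.Constant) and isinstance(n.value, str)
    ]


# The name "eval" never appears literally in this script.
strings = folded_strings('f = getattr(builtins, "e" + "v" + "al")')
suspicious = "eval" in strings
```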
Analysis function imparting device, analysis function imparting method, and analysis function imparting program
An analysis function imparting device (10) includes a virtual machine analyzing unit (121) that analyzes a virtual machine of a script engine, a command set architecture analyzing unit (122) that analyzes a command set architecture that is a command system of the virtual machine, and an analysis function imparting unit (123) that performs hooking for imparting multipath execution functions to the script engine, on the basis of architecture information acquired by the analysis performed by the virtual machine analyzing unit (121) and the command set architecture analyzing unit (122).
Method and apparatus for identifying dynamically invoked computer code
A method, computerized apparatus and computer program product, the method comprising: obtaining user code; using static analysis, determining from the user code a collection of components upon which the user code depends, the collection of components comprising a first component representing a first entity, wherein one or more components of the collection of components is to be loaded dynamically by the user code; determining whether the user code or the first component from the collection of components uses dynamic invocation; subject to the user code or the first component using dynamic invocation, adding a new connection to a second component from the collection of components, the second component representing a second entity that augments an entity reachable from the first entity; and outputting information about the second entity.
Methods and apparatus for using machine learning on multiple file fragments to identify malware
In some embodiments, a method includes processing at least a portion of a received file into a first set of fragments and analyzing each fragment from the first set of fragments using a machine learning model to identify within each fragment first information potentially relevant to whether the file is malicious. The method includes forming a second set of fragments by combining adjacent fragments from the first set of fragments and analyzing each fragment from the second set of fragments using the machine learning model to identify second information potentially relevant to whether the file is malicious. The method includes identifying the file as malicious based on the first information within at least one fragment from the first set of fragments and the second information within at least one fragment from the second set of fragments. The method includes performing a remedial action based on identifying the file as malicious.
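The two-level fragment analysis can be sketched as follows. The toy "model" simply looks for indicator byte strings and stands in for a trained machine learning model; combining each fragment with its immediate successor (overlapping pairs) and the final decision rule are assumptions, since the abstract does not fix either. All names and indicator values are hypothetical.

```python
INDICATORS = [b"cmd", b"powershell"]  # hypothetical stand-in for a trained model


def split_fragments(data: bytes, size: int) -> list[bytes]:
    """Cut the file into fixed-size fragments (the first set)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def combine_adjacent(fragments: list[bytes]) -> list[bytes]:
    """Join each fragment with its right-hand neighbour (the second set)."""
    return [fragments[i] + fragments[i + 1] for i in range(len(fragments) - 1)]


def toy_model(fragment: bytes) -> set[bytes]:
    """Return the indicators found in one fragment."""
    return {ind for ind in INDICATORS if ind in fragment}


def is_malicious(data: bytes, size: int) -> bool:
    first_set = split_fragments(data, size)
    second_set = combine_adjacent(first_set)
    first_info = set().union(*(toy_model(f) for f in first_set))
    second_info = set().union(*(toy_model(f) for f in second_set))
    # Assumed rule: require a hit at fragment level plus additional
    # evidence that only becomes visible once fragments are combined.
    return bool(first_info) and bool(second_info - first_info)


sample = b"cmd.exe powershell x"
fragments = split_fragments(sample, 8)
verdict = is_malicious(sample, 8)
```

Here `b"powershell"` straddles a fragment boundary, so it is only detected in the combined set, which is the situation the second pass is meant to cover.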
Apparatus and Method for Blocking Malicious Code Embedded in Digital Data
The present invention is a device, system, and method for improving network security using pictorial communication and, in preferred embodiments, optical character recognition for the communication of digital information so as to block malicious code embedded in digital data. More specifically, the present invention in preferred embodiments receives a digital data stream from an open network; identifies and extracts desired digital content from the digital data stream; deletes all remaining digital data; displays the extracted digital content as a pictorial image containing alphanumeric or other characters on one side of an analog air gap; captures the pictorial image on the opposite side of the air gap in a closed network; converts the pictorial image to a digital image file; uses optical character recognition algorithms to recognize and convert the pictorial image into a clean digital content file; and stores a copy of the clean digital content file in the closed network.