MALICIOUS DATABASE REQUEST IDENTIFICATION

20200302078 · 2020-09-24

Abstract

A computer implemented method to identify a malicious database request including receiving a database query for retrieving data from a database; classifying the received query based on query instructions contained in the query to identify a class of query for the query, the class of query having associated attributes defining expected characteristics of queries of the class when executed by the database; monitoring characteristics of the received query executed to retrieve data from the database; and responsive to a determination that the monitored characteristics deviate from the expected characteristics, identifying the query as malicious.

Claims

1. A computer implemented method to identify a malicious database request comprising: receiving a database query for retrieving data from a database; classifying the received database query based on query instructions contained in the database query to identify a class of query for the database query, the class of query having associated attributes defining expected characteristics of queries of the class when executed by the database; monitoring characteristics of the received database query executed to retrieve data from the database; and responsive to a determination that the monitored characteristics deviate from the expected characteristics, identifying the database query as malicious.

2. The method of claim 1, wherein the class of query has associated a class query including the query instructions of the received database query and the expected characteristics are defined based on the execution of the class query.

3. The method of claim 1, wherein the database query is received from a software application and responsive to the determination the software application is identified as a malicious application.

4. The method of claim 3, further comprising rejecting subsequent queries received from the identified malicious application.

5. The method of claim 1, further comprising rejecting subsequent queries belonging to the same class as the received database query and having attributes determined to be similar to attributes of the received database query based on predetermined threshold degree of similarity of attributes.

6. A computer system comprising: a processor and memory storing computer program code identifying a malicious database request, the processor and memory configured to: receive a database query for retrieving data from a database; classify the received database query based on query instructions contained in the database query to identify a class of query for the database query, the class of query having associated attributes defining expected characteristics of queries of the class when executed by the database; monitor characteristics of the received database query executed to retrieve data from the database; and responsive to a determination that the monitored characteristics deviate from the expected characteristics, identify the database query as malicious.

7. A non-transitory computer readable storage medium storing a computer program element comprising computer program code to, when loaded into a computer system and executed thereon, cause the computer to perform the method as claimed in claim 1.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] Embodiments of the present disclosure will now be described, by way of example only, with reference to the accompanying drawings, in which:

[0015] FIG. 1 is a block diagram of a computer system suitable for the operation of embodiments of the present disclosure.

[0016] FIG. 2 is a component diagram of a database driver proxy for identifying a malicious database request in accordance with an embodiment of the present disclosure.

[0017] FIG. 3 is a flowchart of a method of the proxy of FIG. 2 in accordance with an embodiment of the present disclosure.

DETAILED DESCRIPTION

[0018] FIG. 1 is a block diagram of a computer system suitable for the operation of embodiments of the present disclosure. A central processor unit (CPU) 102 is communicatively connected to a storage 104 and an input/output (I/O) interface 106 via a data bus 108. The storage 104 can be any read/write storage device such as a random access memory (RAM) or a non-volatile storage device. An example of a non-volatile storage device includes a disk or tape storage device. The I/O interface 106 is an interface to devices for the input or output of data, or for both input and output of data. Examples of I/O devices connectable to I/O interface 106 include a keyboard, a mouse, a display (such as a monitor) and a network connection.

[0019] FIG. 2 is a component diagram of a database driver proxy 200 for identifying a malicious database request in accordance with an embodiment of the present disclosure. The database driver proxy 200 is a software, hardware or firmware component adapted to receive database queries from one or more applications such as software application 204. Such queries are intended for a database 202 such as a relational database. For example, the queries can be SQL queries. The proxy 200 appears to the application 204 as a database driver for accessing the database 202. According to the arrangement of FIG. 2 the proxy 200 sits between the application 204 and the database driver 212. The database driver 212 is a software, hardware, firmware or combination component for enabling applications such as application 204 to work with, access and interact with the database 202. For example, the database driver 212 can be a driver implementing the Java Database Connectivity (JDBC) Application Programming Interface (API).

[0020] The proxy includes a set of one or more query classifications 206 as classifiers for database queries. Each query classification 206 relates to a type of query having particular instructions. For example, a query of the form:

[0021] SELECT DISTINCT col1, col2 FROM table1 WHERE table1.col3=X

can be characterized by the particular query instructions with the variable X such that, while the value of X might change between queries, queries having instructions consistent with the above form can be classified together. Any number of different query classifications 206 can be adopted with new classifications being added by an operator or learned from queries received from applications such as application 204.
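By way of illustration only, grouping queries that differ only in the value of X can be sketched as a normalization step that replaces literal values with a placeholder. The helper name query_template and the regular expressions below are assumptions made for this sketch and are not part of the disclosure; a practical embodiment would more likely rely on a SQL parser.

```python
import re

# Hypothetical helper (not from the disclosure): reduce a SQL string to a
# template by replacing literal values with "?", so that queries differing
# only in their variable values share one classification key.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
_NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")
_WHITESPACE = re.compile(r"\s+")


def query_template(sql: str) -> str:
    # Replace quoted strings first so digits inside them are not touched twice.
    text = _STRING_LITERAL.sub("?", sql)
    # Replace standalone numbers; digits inside identifiers such as col1 are
    # not matched because there is no word boundary before them.
    text = _NUMBER_LITERAL.sub("?", text)
    # Collapse whitespace and ignore keyword case.
    return _WHITESPACE.sub(" ", text).strip().upper()
```

Under this sketch, two queries of the above form with X=42 and X='abc' reduce to the same template, while a query with different instructions does not.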

[0022] A query classification 206 preferably includes a class query 210 as a database query having instructions corresponding to queries in the class of queries with placeholder or multiple values for variable elements of the class query 210 (such as the variable X in the example query above). Thus for each classification 206 the class query 210 can be executed by the database 202 via the driver 212 to identify attributes 208 for the classification 206. The attributes serve to define expected characteristics of queries of the classification 206 when executed by the database 202.

[0023] Characteristics determined for a classification 206 by executing a class query 210 can include: an expected range of a number of data items, rows or records retrieved by queries of a class; an expected range of a number of data items, rows or records affected by queries of a class, such as by being modified, referenced or the like; and an execution time for queries of a class. Thus the classification 206 permits an identification of queries for execution by the database 202 via the driver 212 that are consistent with or deviate from expected characteristics defined by the class attributes 208.
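A minimal sketch of deriving such characteristics by executing a class query with multiple values for its variable element is given below. It assumes an in-memory SQLite database standing in for the database 202 and driver 212; the names ClassAttributes and profile_class_query, the table contents and the sample values are hypothetical.

```python
import sqlite3
import time
from dataclasses import dataclass


@dataclass
class ClassAttributes:
    min_rows: int
    max_rows: int
    max_seconds: float


def profile_class_query(conn, class_query, sample_values):
    """Run the class query once per sample value and summarize what was observed."""
    row_counts = []
    durations = []
    for value in sample_values:
        started = time.perf_counter()
        rows = conn.execute(class_query, (value,)).fetchall()
        durations.append(time.perf_counter() - started)
        row_counts.append(len(rows))
    return ClassAttributes(min(row_counts), max(row_counts), max(durations))


# Illustrative data only: SQLite stands in for the database and its driver.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT, col2 TEXT, col3 INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("a", "b", 1), ("c", "d", 1), ("e", "f", 2)],
)
attrs = profile_class_query(
    conn,
    "SELECT DISTINCT col1, col2 FROM table1 WHERE table1.col3 = ?",
    [1, 2, 3],
)
```

With the sample data shown, the observed row counts span zero to two rows, which could then be recorded as the expected range for the classification.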

[0024] The attributes 208 for a classification 206 can include value ranges for attributes such as a number of data items returned or a number of rows updated. For example, value ranges could be chosen from the following categories: zero; one; zero to one; zero to one hundred; one to any number (i.e. non-zero); or any number. Other categories of value or ranges of value could alternatively be employed. Similarly, the execution time for a classification 206 can be a range of durations or orders of magnitude of duration.
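The example categories above could, for instance, be represented as named bounds, with an order of magnitude used for execution time. The following is a sketch only; the dictionary keys and the function names within_category and duration_magnitude are assumptions introduced here.

```python
import math

# Each category maps to (lowest count, highest count); None means unbounded.
RANGE_CATEGORIES = {
    "zero": (0, 0),
    "one": (1, 1),
    "zero_to_one": (0, 1),
    "zero_to_one_hundred": (0, 100),
    "non_zero": (1, None),
    "any": (0, None),
}


def within_category(count: int, category: str) -> bool:
    """True when a row or item count falls inside the named category."""
    low, high = RANGE_CATEGORIES[category]
    return count >= low and (high is None or count <= high)


def duration_magnitude(seconds: float) -> int:
    """Order of magnitude of a positive duration, as floor(log10(seconds))."""
    return math.floor(math.log10(seconds))
```

For example, a count of 250 falls outside "zero_to_one_hundred" but inside "non_zero", and durations of 2 seconds and 50 seconds differ by one order of magnitude.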

[0025] The proxy 200 further includes a query classifier 214 as a hardware, software, firmware or combination component for classifying a query 216 received from an application 204 into one of the classifications 206. The classifier 214 can achieve such classification by comparing query instructions of the received query 216 with query instructions of the class query 210 to identify similarity or, preferably, identity (save for variables that will differ). In some embodiments, where a received query 216 cannot be readily classified a closest matching classification 206 can be used or a classification 206 having a class query 210 exhibiting a degree of similarity to a received query 216 exceeding a predetermined threshold can be used. In some embodiments, where a received query 216 cannot be readily classified a new classification can be generated for the received query 216 including defining a new class query based on the instructions for the received query 216 and determining appropriate attributes for the new classification.
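One possible way to realize such a classifier is sketched below, preferring identity and falling back to the closest class query whose similarity meets a predetermined threshold. The use of difflib.SequenceMatcher from the Python standard library as the similarity measure, the function name classify and the threshold value of 0.9 are all assumptions of this sketch; the disclosure does not prescribe a particular measure.

```python
from difflib import SequenceMatcher


def classify(template, class_templates, threshold=0.9):
    """Return the name of the matching classification, or None if none is close enough."""
    # Prefer identity with a class query (variables already replaced by placeholders).
    for name, class_template in class_templates.items():
        if template == class_template:
            return name
    # Otherwise fall back to the closest class query by textual similarity.
    best_name, best_score = None, 0.0
    for name, class_template in class_templates.items():
        score = SequenceMatcher(None, template, class_template).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Hypothetical class queries keyed by classification name.
classes = {
    "by_col3": "SELECT DISTINCT COL1, COL2 FROM TABLE1 WHERE TABLE1.COL3=?",
    "count_all": "SELECT COUNT(*) FROM TABLE2",
}
```

A return value of None corresponds to the case where a new classification might be generated for the received query.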

[0026] The proxy further includes a query executor 218 as a hardware, software, firmware or combination component adapted to execute a received query 216 via the database driver 212. In one embodiment the query executor 218 or the driver 212 maintains a query queue in order to manage the execution of queries received from applications. Thus a query for execution can be added to the query queue. Where a query queue is employed, information relating to an identified classification 206 of the received query 216 can be stored in association with the query in the queue, such as by use of metadata or other associated data indicating or identifying the attributes for the class 206 to identify or indicate the expected characteristics for the query.
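The association of classification information with a queued query might be sketched as follows. The class names QueuedQuery and QueryQueue and the choice of fields are hypothetical, chosen only to show metadata travelling with the query through a first-in, first-out queue.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional, Tuple


@dataclass
class QueuedQuery:
    sql: str
    classification: str                        # name of the identified classification
    expected_rows: Tuple[int, Optional[int]]   # (low, high); None means unbounded


class QueryQueue:
    """First-in, first-out queue that keeps the expected characteristics with each query."""

    def __init__(self) -> None:
        self._pending: Deque[QueuedQuery] = deque()

    def add(self, sql: str, classification: str,
            expected_rows: Tuple[int, Optional[int]]) -> None:
        self._pending.append(QueuedQuery(sql, classification, expected_rows))

    def next(self) -> QueuedQuery:
        return self._pending.popleft()

    def __len__(self) -> int:
        return len(self._pending)
```

When a query is later taken from the queue for execution, its expected characteristics are then available to the monitor without a second classification step.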

[0027] During and subsequent to the execution of the received query by the database 202 via the driver 212 a query monitor 220 as a hardware, software, firmware or combination component monitors characteristics of the query execution for comparison with the expected characteristics defined by way of the attributes 208 of the query classification 206. The query monitor 220 is thus adapted to determine if the monitored characteristics for the execution of the received query 216 deviate from the expected characteristics. Where such deviation is detected the received query 216 can be identified as malicious or potentially malicious and such identification can be flagged or communicated. In some embodiments remediation or protection measures can be adopted in response to such identification.

[0028] Where the query monitor 220 does not identify the received query 216 as actually or potentially malicious, a response 222 to the query can be delivered to the application 204. To this end the query monitor 220 is adapted to monitor the execution of the received query 216 such as by analyzing one or more of: the response/result of the query 216 as a number of data items, records or rows; a number of data items affected by the query 216; a duration of execution of the query 216; and the like. The identification of deviation by the monitor 220 can be informed by predetermined thresholds or extents such that an extent of deviation that meets or exceeds a particular threshold is determined to constitute a deviation that warrants a reaction. Such reaction can include withholding the response 222 from the application 204 and other remediation or protective measures as will be apparent to those skilled in the art.
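A threshold-based deviation check of this kind can be sketched as below. The measure of deviation used here (the absolute overshoot beyond the expected maximum), the default threshold values and the names Expected, deviates and release_response are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Expected:
    max_rows: int
    max_seconds: float


def deviates(expected: Expected, rows_returned: int, seconds: float,
             row_threshold: int = 1, seconds_threshold: float = 0.5) -> bool:
    """True when the overshoot of either characteristic meets or exceeds its threshold."""
    row_overshoot = rows_returned - expected.max_rows
    time_overshoot = seconds - expected.max_seconds
    return row_overshoot >= row_threshold or time_overshoot >= seconds_threshold


def release_response(expected: Expected, rows: List[Tuple],
                     seconds: float) -> Optional[List[Tuple]]:
    """Return the rows for delivery to the application, or None to withhold them."""
    if deviates(expected, len(rows), seconds):
        return None
    return rows
```

With an expected maximum of one row, a result of one row would be released, whereas a result of fifty rows would be withheld.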

[0029] Examples of remediation or protective measures in response to a determination of deviation from expected characteristics include: identifying the application 204 as potentially or actually malicious; rejecting subsequent queries received from the identified malicious application 204; rejecting subsequent queries belonging to the same class as the received query 216 and having attributes determined to be similar to attributes of the received query 216 based on a predetermined threshold degree of similarity of attributes; disconnecting the application 204; and other such measures as will be apparent to those skilled in the art.
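Two of the measures listed above, rejecting queries from an identified application and rejecting similar queries of the same class, might be sketched as follows. The disclosure does not define which query attributes are compared or how similarity is computed, so the attribute dictionaries, the fraction-of-matching-keys measure and the threshold of 0.5 are assumptions of this sketch, as are the names Remediation and attribute_similarity.

```python
def attribute_similarity(a: dict, b: dict) -> float:
    """Fraction of attribute keys on which both dictionaries carry the same value."""
    keys = set(a) | set(b)
    if not keys:
        return 1.0
    same = sum(1 for k in keys if k in a and k in b and a[k] == b[k])
    return same / len(keys)


class Remediation:
    def __init__(self, similarity_threshold: float = 0.5) -> None:
        self.blocked_apps = set()
        self.flagged = []  # list of (classification, attributes) for malicious queries
        self.similarity_threshold = similarity_threshold

    def record_malicious(self, app_id: str, classification: str, attributes: dict) -> None:
        """Remember the offending application and the query that was flagged."""
        self.blocked_apps.add(app_id)
        self.flagged.append((classification, dict(attributes)))

    def should_reject(self, app_id: str, classification: str, attributes: dict) -> bool:
        """Reject queries from blocked applications or resembling a flagged query."""
        if app_id in self.blocked_apps:
            return True
        for flagged_class, flagged_attrs in self.flagged:
            if (flagged_class == classification
                    and attribute_similarity(attributes, flagged_attrs)
                    >= self.similarity_threshold):
                return True
        return False
```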

[0030] In some embodiments the proxy 200 additionally classifies applications from which requests are received such as application 204 based on characteristics of the applications and/or queries received from the applications so as to identify applications having a similar profile. Such characteristics can include: a frequency and/or volume of queries; particular characteristics of the queries themselves such as the classes of queries received from applications; particular characteristics of responses sent to the applications such that applications issuing queries to which responses are of similar size (e.g. in terms of number of data items, records or the like) or of similar duration of execution are classified together; etc. Thus embodiments of the present disclosure provide for the identification of malicious queries and/or applications for databases.
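Grouping applications having a similar profile could, as one simple possibility, use a key built from the classes of queries issued and coarse buckets for query rate and response size. The bucketing by order of magnitude, the function names and the sample observations below are hypothetical and serve only to illustrate the idea.

```python
import math
from collections import defaultdict


def _bucket(value: float) -> int:
    """Coarse size bucket: 0 for values below 1, otherwise the number of integer digits."""
    return 0 if value < 1 else int(math.log10(value)) + 1


def profile_key(query_classes, queries_per_minute, mean_rows):
    """Profile of an application: its query classes plus bucketed rate and response size."""
    return (frozenset(query_classes), _bucket(queries_per_minute), _bucket(mean_rows))


def group_applications(observations):
    """Group application identifiers whose profile keys are identical."""
    groups = defaultdict(list)
    for app_id, (classes, rate, mean_rows) in observations.items():
        groups[profile_key(classes, rate, mean_rows)].append(app_id)
    return dict(groups)


# Illustrative observations: (query classes, queries per minute, mean rows per response).
observations = {
    "web": ({"by_col3"}, 12, 3),
    "mobile": ({"by_col3"}, 15, 4),
    "batch": ({"by_col3", "count_all"}, 900, 5000),
}
groups = group_applications(observations)
```

In this example the "web" and "mobile" applications share a profile, while "batch" forms a group of its own.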

[0031] FIG. 3 is a flowchart of a method of the proxy of FIG. 2 in accordance with an embodiment of the present disclosure. Initially, at 302, the method receives a database query 216 from an application 204. At 304 the received query 216 is classified according to query classifications 206 to identify attributes 208 defining expected characteristics of the received query 216. At 306 the received query 216 is executed by the database 202 via the driver 212. At 308 the monitor 220 monitors characteristics of the execution of the query 216. At 312, if the monitor identifies that the monitored characteristics of the execution of the query 216 deviate from the expected characteristics, the method proceeds to 314 where the query 216 is identified as malicious.
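The overall flow of FIG. 3 can be sketched end to end as follows, again using an in-memory SQLite database in place of the database 202 and driver 212. The function handle_query, the CLASSIFICATIONS table, the verdict strings and the handling of unclassified queries are assumptions of this sketch rather than features of the disclosed method.

```python
import sqlite3
import time

# Hypothetical classifications keyed by class query text, with expected characteristics.
CLASSIFICATIONS = {
    "SELECT col1 FROM table1 WHERE col3 = ?": {"max_rows": 1, "max_seconds": 5.0},
}


def handle_query(conn, sql, params=()):
    """Receive, classify, execute and monitor a query; return (rows or None, verdict)."""
    expected = CLASSIFICATIONS.get(sql)              # classify the received query
    started = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()      # execute via the driver
    elapsed = time.perf_counter() - started          # monitored duration
    if expected is None:
        # Assumption: unclassified queries are passed through and merely labeled.
        return rows, "unclassified"
    if len(rows) > expected["max_rows"] or elapsed > expected["max_seconds"]:
        return None, "malicious"                     # deviation: withhold the response
    return rows, "ok"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT, col3 INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("a", 1), ("b", 7), ("c", 7), ("d", 7)],
)
ok_result = handle_query(conn, "SELECT col1 FROM table1 WHERE col3 = ?", (1,))
bad_result = handle_query(conn, "SELECT col1 FROM table1 WHERE col3 = ?", (7,))
```

Here the first query returns a single row as expected for its class, while the second returns three rows where at most one is expected and is therefore identified as malicious.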

[0032] Insofar as embodiments of the disclosure described are implementable, at least in part, using a software-controlled programmable processing device, such as a microprocessor, digital signal processor or other processing device, data processing apparatus or system, it will be appreciated that a computer program for configuring a programmable device, apparatus or system to implement the foregoing described methods is envisaged as an aspect of the present disclosure. The computer program may be embodied as source code or undergo compilation for implementation on a processing device, apparatus or system or may be embodied as object code, for example.

[0033] Suitably, the computer program is stored on a carrier medium in machine or device readable form, for example in solid-state memory, magnetic memory such as disk or tape, optically or magneto-optically readable memory such as compact disk or digital versatile disk etc., and the processing device utilizes the program or a part thereof to configure it for operation. The computer program may be supplied from a remote source embodied in a communications medium such as an electronic signal, radio frequency carrier wave or optical carrier wave. Such carrier media are also envisaged as aspects of the present disclosure.

[0034] It will be understood by those skilled in the art that, although the present disclosure has been described in relation to the above described example embodiments, the invention is not limited thereto and that there are many possible variations and modifications which fall within the scope of the claims.

[0035] The scope of the present disclosure includes any novel features or combination of features disclosed herein. The applicant hereby gives notice that new claims may be formulated to such features or combination of features during prosecution of this application or of any such further applications derived therefrom. In particular, with reference to the appended claims, features from dependent claims may be combined with those of the independent claims and features from respective independent claims may be combined in any appropriate manner and not merely in the specific combinations enumerated in the claims.