Indicators Of Compromise By Analyzing Data Based On Rolling Baseline
20220377091 · 2022-11-24
Inventors
CPC classification
G06V20/52
PHYSICS
G06V10/763
PHYSICS
International classification
Abstract
Techniques are disclosed for identifying indicators of compromise in a variety of objects. The objects may be finished products or components thereof. The indicators of compromise in the objects are determined/detected by analyzing their data, which may reside in a cloud. The analysis is performed by the instant baseline engine, which first establishes a rolling baseline with a centroid of a conceptual hypercube. The centroid represents the normal population of data packets. Data packets far enough away from the centroid indicate an anomaly that may be an indicator of a compromise of/in the respective object. An early detection of such indicators of compromise in the objects can prevent catastrophic downstream consequences for the concerned party/parties.
Claims
1. A system comprising computer-readable instructions stored in a non-transitory storage medium and at least one microprocessor coupled to said non-transitory storage medium for executing said computer-readable instructions, said at least one microprocessor configured to: (a) analyze data from one or more objects; (b) establish a rolling baseline of said data by assigning each packet of said data to a cluster of packets amongst a plurality of clusters of packets of said data; (c) score, based on its distance from a centroid of said rolling baseline, each packet of said data; and (d) identify based on said distance an indicator of compromise in said one or more objects; wherein said one or more objects comprise one or both of a component and a product.
2. The system of claim 1 wherein said indicator of compromise is manifested as said data being one or both of unintelligible and obfuscated.
3. The system of claim 1 wherein said indicator of compromise is manifested as said data being unintentionally encrypted.
4. The system of claim 1 wherein said indicator of compromise is manifested as said data being misreported.
5. The system of claim 4 wherein said data being misreported is due to one or both of: (e) at least one unauthorized remote command executed on said one or more objects; and (f) at least one unauthorized message sent by said one or more objects.
6. The system of claim 1 wherein said data resides in a cloud.
7. The system of claim 6 wherein said cloud is one or more of a generic automation testing cloud, a device specific cloud, a vendor specific cloud and a component cloud.
8. The system of claim 6 wherein when said one or more objects comprise a component, said cloud is a component cloud used for electronic design automation (EDA) of said component.
9. The system of claim 8 wherein said indicator of compromise signifies a pattern of failure of said component.
10. The system of claim 6 wherein a training dataset is created from said data, said training dataset used to generate an optimal operational model of said one or more objects to facilitate establishing said rolling baseline in element (b) above.
11. The system of claim 1 wherein said indicator of compromise is manifested as one or more of an overload of a CPU, an overuse of a memory, an overuse of a disk storage, an overuse of a network bandwidth and an overage of thermal output of said one or more objects.
12. A computer-implemented method executing computer-readable instructions by at least one processor, said computer-readable instructions stored in a non-transitory storage medium coupled to said at least one processor, and said computer-implemented method comprising the steps of: (a) analyzing data from one or more objects, said one or more objects comprising one or both of a component and a product; (b) establishing a rolling baseline of said data by assigning each packet of said data to a cluster of packets amongst a plurality of clusters of packets of said data; (c) scoring, based on its distance from a centroid of said rolling baseline, each packet of said data; and (d) identifying based on said distance an indicator of compromise in said one or more objects.
13. The computer-implemented method of claim 12 manifesting said indicator of compromise as said data being one or both of unintelligible and obfuscated.
14. The computer-implemented method of claim 12 manifesting said indicator of compromise as said data being encrypted.
15. The computer-implemented method of claim 12 manifesting said indicator of compromise as said data being misreported.
16. The computer-implemented method of claim 12 providing said data to be residing in a cloud.
17. The computer-implemented method of claim 16 providing said cloud to be one of a generic automation testing cloud, a device specific cloud, a vendor specific cloud and a component cloud.
18. The computer-implemented method of claim 16 when said one or more objects comprise a component, providing said cloud to be a component cloud that is used for electronic design automation (EDA) of said component.
19. The computer-implemented method of claim 16 utilizing said data as a training dataset for generating an optimal operational model of said one or more objects for facilitating said establishing of said rolling baseline in step (b) above.
20. The computer-implemented method of claim 12 manifesting said indicator of compromise as one of an overage and an underage of one or more of a CPU usage, a memory usage, a disk storage usage, a network bandwidth usage and a thermal output associated with said one or more objects.
Description
BRIEF DESCRIPTION OF THE DRAWING FIGURES
DETAILED DESCRIPTION
[0056] The figures and the following description relate to preferred embodiments of the present invention by way of illustration only. It should be noted that from the following discussion, alternative embodiments of the structures and methods disclosed herein will be readily recognized as viable alternatives that may be employed without departing from the principles of the claimed invention.
[0057] Reference will now be made in detail to several embodiments of the present invention(s), examples of which are illustrated in the accompanying figures. It is noted that wherever practicable, similar or like reference numbers may be used in the figures and may indicate similar or like functionality. The figures depict embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.
[0058] The techniques described herein may employ computer code that may be implemented purely in software, hardware, firmware or a combination thereof as required for a given implementation. The system and methods of the present technology will be best understood by first reviewing an anomalous subject and device identification system 100 as illustrated in
[0059] Reference numerals 104A . . . 104N may represent anywhere from a single sensor up to hundreds, thousands, or even more sensors, as depicted by the dotted line, that may generate data for rolling baseline engine, or baseline engine 110 for short. Furthermore, non-limiting examples of these sensors are shown in
[0060] The sensors in
[0061] Any number and type of sensors 104A-N may be installed on one or more computing devices, such as mobile devices, including cellular phones and smartphones. Sensors 104A-N may also be on tablets, wearable devices such as smartwatches, and even desktops, etc. It should further be noted that sensor(s) 104A may be one or more asset sensors, sensor(s) 104B may be one or more cameras, sensor(s) 104C may be one or more microphones that may or may not be integrated with camera(s) 104B, sensor(s) 104D may be one or more wireless personal-device sensors, examples of which were noted above, etc.
[0062] In this disclosure, unless otherwise explicitly noted, we may use reference numerals, for example reference numeral 104B to refer to a single sensor or multiple sensors of a given type, in this case camera or cameras. Any of sensors 104 may be operating in one or more kiosks, such as kiosk 105 at site 102. These sensors may be installed on one or more computing devices, fixed or mobile, enterprise or personal.
[0063] According to the present technology, sensors 104A . . . 104N gather data that is related to various subjects or targets 106. Subjects may be sentient beings, such as animals or human beings, shown in
[0064] Explained further, baseline engine 110 analyzes each packet of data gathered by sensors 104. As a part of this analysis, it assigns each packet of data to a cluster of packets amongst clusters of packets of data. The clustering is done preferably by utilizing k-means clustering, specifically by utilizing Eq. (1) of the above-incorporated references and teachings. As a result, baseline engine 110 establishes a rolling or evolving baseline 120 for the data that signifies the mean or normal behavior of the packets.
[0065] Baseline 120 is based on a conceptual hypercube 180 with a centroid 182 as shown in
[0066] Since baseline 120 with centroid 182 signifies the “normal” behavior of packets, packets that are very far away from centroid 182 represent an anomaly. In this way, anomalous subject and device identification system 100 identifies anomalous subjects among subjects 106 that are associated with anomalous packets of data. Once again, for even a more detailed explanation of the workings of baseline engine 110 of anomalous subject and device identification system 100, that is responsible for establishing a rolling baseline 120 and then identifying anomalous data packets, the reader is referred to the above-incorporated references including U.S. Pat. No. 10,542,026 issued on 21 Jan. 2020 to Christian.
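The cluster-assignment and distance-scoring steps described above can be sketched as follows. This is an illustrative approximation only, not the patented engine: the toy packet features, the anomaly threshold, and the plain k-means loop are all assumptions for illustration (the actual engine uses the rolling-baseline formulation of the incorporated references).

```python
# Sketch: build a baseline from "normal" packets via k-means, treat the
# centroids as normal behavior, and flag packets far from every centroid.
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: returns k centroids for the given points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # recompute each centroid as the mean of its cluster
                centroids[i] = tuple(sum(x) / len(cl) for x in zip(*cl))
    return centroids

def flag_anomalies(packets, centroids, threshold):
    """Score each packet by distance to its nearest centroid; packets beyond
    the threshold are potential indicators of compromise."""
    return [(p, min(math.dist(p, c) for c in centroids))
            for p in packets
            if min(math.dist(p, c) for c in centroids) > threshold]

# Hypothetical packets: (cpu_load, net_kbps). The baseline is built from
# normal traffic only; then a clearly abnormal packet arrives.
normal = [(0.2 + 0.01 * i, 50.0 + i) for i in range(20)]
centroids = kmeans(normal, k=2)
incoming = normal + [(9.0, 900.0)]
print(flag_anomalies(incoming, centroids, threshold=12.0))
```

The only packet flagged is the (9.0, 900.0) outlier, since every normal packet lies well within the threshold distance of its nearest centroid.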
[0067] Now let us take a more detailed look at the present technology by reviewing its various embodiments and by taking advantage of
[0068] Furthermore, sensors 204 are collecting data about people 206A, 206B, . . . or simply people 206 at site 202 and supplying it to baseline engine 110 for analysis such that any malicious or anomalous subjects/actors/people/beings amongst people/beings 206 or any anomalous devices at site 202 can be identified. This process depends upon the type of sensor(s) involved. The results of the analysis performed by baseline engine 110 and any other related data are stored in an appropriate data storage mechanism for archival and analytics. Such a storage mechanism may be a database on premises at site 202 or in cloud 230 shown in
[0069] Let us now study the various embodiments utilizing the different types of sensors at a given site based on the present principles while referring to
[0070] Camera(s): Camera(s) or simply camera 204A visually monitors people 206. In various embodiments, camera 204A may be a standard video camera such as a closed-circuit television (CCTV) camera, or a more specialized camera such as a stereoscopic video camera or a thermal camera. Regardless, camera 204A supplies its data as video packets via network backbone 208 to baseline engine 110 of the above discussion.
[0071] Baseline engine 110 then establishes a rolling baseline 120A with conceptual hypercube 180A and centroid 182A for these video packets. It then identifies anomalous video packets as compared to baseline 120A per above-incorporated references and teachings. Anomalous video packets are associated with a specific subject/person, exemplarily person 206C amongst subjects/people 206 at site 202. Based on the analysis performed by baseline engine 110 and identification of anomalous video packet(s) by engine 110, anomalous subject and device identification system 200 of
[0072] Note that in the present and other embodiments discussed in this disclosure, the correspondence of the reference numeral of the baseline to the type of sensor 204 must not be taken too strictly. For example, any number of baselines may be established by baseline engine 110 based on the video stream from a single camera depending on the analysis performed by the baseline engine for a given implementation. There may be one baseline geared towards security aspects, another baseline geared towards training aspects, another towards behavioral aspects, etc. Conversely, data streams from multiple sensors may be combined into a single baseline also, as per the requirements of a given implementation.
[0073] As already mentioned, camera 204A may be a standard video camera such as the one typically integrated with today's cellular phones or smartphones or a more specialized camera or a CCTV camera. The analysis performed by baseline engine 110 for its rolling baseline 120A calculation may then be based on facial recognition and motion tracking of subjects/people/beings 206. Facial recognition and object tracking or simply tracking of people 206 in the video data from camera 204A are performed based on techniques known in the art by data processing module 220 shown in
[0074] Data processing module 220 is also responsible for performing any other data preprocessing tasks before supplying its output as data packets to baseline engine 110 for analysis. In various embodiments, data processing module 220 may be implemented as a single module or it may be comprised of various submodules per the needs of an implementation and based on techniques known in the art. In a preferred embodiment, it is implemented as a shim compatibility layer to baseline engine 110.
[0075] Each subject or person 206A, 206B, . . . at site 202 is identified by a hash signature or an alternate identifying signature/marker/information or simply an identifier for object tracking performed by data processing module 220. The movement data of each signature is then fed to baseline engine 110. Preferably, the movement data comprises (x, y, z) coordinates or other equivalent location information of the respective individual/subject/being at site 202 at various points in time. Alternately or in addition, the movement data comprises his/her speed and direction of movement at the given location and the given point in time.
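The movement data described above (speed and direction at a given location and time) can be derived from successive timestamped (x, y, z) fixes of one identifier. A minimal sketch, with a made-up sample track; the function name and record layout are hypothetical.

```python
# Sketch: derive per-interval speed and heading from timestamped position
# fixes of a tracked identifier. Track values are illustrative.
import math

def movement(track):
    """track: list of (t_seconds, x, y, z); returns (t, speed, heading_deg)
    for each interval between consecutive fixes."""
    out = []
    for (t0, x0, y0, z0), (t1, x1, y1, z1) in zip(track, track[1:]):
        dt = t1 - t0
        speed = math.dist((x0, y0, z0), (x1, y1, z1)) / dt
        heading = round(math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360, 1)
        out.append((t1, speed, heading))
    return out

track = [(0, 0.0, 0.0, 0.0), (2, 6.0, 0.0, 0.0), (4, 6.0, 8.0, 0.0)]
print(movement(track))  # → [(2, 3.0, 0.0), (4, 4.0, 90.0)]
```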
[0076] As that person moves in a building or site, the object tracking function of module 220 tracks the movements of the person having the assigned identifier in the building. If there is more than one camera 204A, object/facial recognition and tracking is performed on the video data streams of all such cameras by module 220. The movement data of tracked people 206 with their respective identifiers is then fed to baseline engine 110 for analysis per above. There are a number of useful scenarios or situations that can be captured by the embodiments. A non-exhaustive list of these includes:
[0077] 1. Erratic/distressed movement pattern: In one embodiment, rolling baseline 120A signifies the average or mean behavior of crowd 206 by a given set of movements or movement pattern/patterns of people 206 that is considered “normal”. An individual or person, such as person 206C with an exemplary hash signature or simply hash or identifier C1369F4789DA, exhibiting an erratic or stressful or distressed movement pattern or patterns may signify an anomaly. In this case, baseline engine 110 will determine the distance of video packets associated with person 206C to be far enough away from centroid 182A of baseline 120A to signal an anomaly. This anomaly is then reported by engine 110 per prior teachings. Anomalous subject and device identification system 200 can then take appropriate actions based on the anomalies reported by baseline engine 110.
[0078] 2. Audio signatures: In a related variation, camera 204A may be integrated with microphone 204B in a single product/device. In such a variation, audio packets of data or audio data stream from microphone 204B are combined with video packets or video data stream from camera 204A to advantageously enhance facial recognition and object tracking of people 206 at site 202. For example, if site 202 is a theatre or studio or the like where the audio signature of each tracked individual may be distinguishable enough, such an audio signature may further help data processing module 220 to recognize and locate each individual with his/her identifier at site 202. Additional embodiments benefiting from audio or microphone sensors 204B are discussed further below.
[0079] As already mentioned, camera 204A may be a stereoscopic camera. Such a stereo camera has the advantage of providing depth information or size information of the object, thus better aiding facial recognition and object tracking of subjects 206 discussed above. In still other variations, camera 204A may be a thermal-video camera, that may or may not also be a stereo camera. Let us study this variation now in greater detail.
[0080] Thermal camera(s): In such a variation, a given site 202, such as a building or an arena or a school or any other site shown in
[0081] However, in other variations, camera 204A is a bi-spectrum camera because it captures both visible and infrared spectrums of the electromagnetic radiation. Preferably, thermal camera 204A is also a stereoscopic or stereo camera because then it can capture depth/size information. Regardless, thermal camera 204A, working in conjunction with data processing module 220, identifies and tracks each individual person amongst persons/people 206 at site 202 and further reads their body temperatures. Thus, each individual/person, along with his/her identifier per above, is also associated with a body temperature reading that is taken in real-time or near real-time. The temperature readings of each tracked/identified person are then provided to baseline engine 110 for analysis.
[0082] Such an embodiment is shown in greater detail in
[0083] These visible and infrared video data streams or simply data streams are communicated to data processing module 220 via network backbone 208. Data processing module 220 identifies and tracks each subject 206A, 206B, . . . amongst subjects 206 per above, and associates a temperature reading with them. It then communicates this information to baseline engine 110 for analysis.
[0084] Preferably, module 220 communicates data packets containing the following information to engine 110:
[0085] 1. A timestamp at which the observation is made by camera 204A.
[0086] 2. An object identifier assigned to each subject/person 206A, 206B, . . . per above.
[0087] 3. (x, y, z) coordinates or location information of each identified subject/person at site 202.
[0088] 4. A temperature reading of the identified subject/person at the timestamp in (1) above.
[0089] These data packets are then parsed by baseline engine 110 which then establishes a baseline 120A for the normal temperature readings for the individuals and identifies anomalous individuals per prior teachings. Preferably, an anomalous threshold value is provided as an input to baseline engine 110. For example, a normal threshold value of 38° C. or 100.4° F. is provided to baseline engine 110 that incorporates this value into baseline 120A with centroid 182A. It then identifies as anomalous any subjects with body temperatures above the normal threshold value.
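The temperature-screening step above can be sketched in a few lines, assuming the packet fields just listed and the 38° C. normal threshold value; the dictionary layout and field names are hypothetical, not taken from the disclosure.

```python
# Sketch: screen per-subject packets against the 38 °C normal threshold
# value supplied to the baseline engine. Field names are hypothetical.
THRESHOLD_C = 38.0

def flag_elevated(packets, threshold=THRESHOLD_C):
    """Return identifiers of subjects whose reading meets or exceeds
    the normal threshold value."""
    return [p["subject_id"] for p in packets if p["temp_c"] >= threshold]

packets = [
    {"ts": 1700000000, "subject_id": "206A", "xyz": (1.0, 2.0, 0.0), "temp_c": 36.8},
    {"ts": 1700000000, "subject_id": "206B", "xyz": (4.5, 1.0, 0.0), "temp_c": 37.1},
    {"ts": 1700000000, "subject_id": "206C", "xyz": (2.2, 7.3, 0.0), "temp_c": 38.6},
]
print(flag_elevated(packets))  # → ['206C']
```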
[0090] A number of very useful scenarios are discovered/caught by the present embodiments of the anomalous subject and device identification system of the present design. The present technology allows an early detection of potential health and security threats in a reliable and flexible manner. A non-exhaustive list of useful scenarios identified/caught by the present design includes:
[0091] 1. Elevated body temperature: Continuing with the above discussion, any individuals, such as person 206C, showing a temperature reading equal to or greater than this normal threshold value are identified as anomalous by baseline engine 110. If there are multiple thermal cameras 204A, then the video data streams from these cameras are processed by combining them at data processing module 220, which then tracks objects/people across the various data streams of the different cameras and identifies anomalous subjects with elevated body temperatures per the above teachings. Preferably, the temperature reading performed by thermal camera(s) 204A is accurate within an error tolerance of less than or equal to 0.3° F.
[0092] 2. Mask detection and/or enforcing mask wearing: The facial recognition capabilities of module 220 also allow detection of facial masks worn by individuals. Preferably, the facial recognition capabilities are not degraded as a result of subjects wearing masks. Therefore, anomalous subject and device identification system 200 of
[0093] Furthermore, while an anomalous subject with an elevated body temperature per above signifies a problem/anomaly, if that individual is also not wearing a mask, then that is an even greater anomaly or problem or threat, and baseline engine 110 can identify him/her as such.
[0094] 3. Enforcing social distancing: Based on the capabilities of the present design and specifically the present embodiments, system 200 is able to enforce social distancing amongst subjects, such as that needed during the Covid-19 pandemic. Because the subjects are assigned an identifier and their location, speed and movements are known/tracked, the system can determine which subjects are not following social distancing guidelines. In the present case, proximity to other subjects may be a dimension on the hypercube of the respective baseline established by engine 110. A proximal distance, for example 6 feet, can be provided as an input to baseline engine 110 representing the minimum threshold value. If a given subject is in repeated violation of the minimum threshold value/distance, then this situation and the subject can be conveniently identified and flagged by baseline engine 110.
[0095] 4. Weapons detection: Depending on the image/object recognition capabilities of data processing module 220, data streams captured by cameras 204A can be used to determine if a subject is carrying a weapon at site 202. Of course, the present technology can support additional specialized sensors for weapons detection, such as metal or ballistic detectors at the site, instead of or in addition to sensors 204 shown in
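The social-distancing check in scenario 3 above can be sketched as follows: pairwise distances between tracked subjects are compared against the 6-foot minimum threshold, and subjects in repeated violation are flagged. The observation layout, subject identifiers, and two-violation cutoff are illustrative assumptions.

```python
# Sketch: flag subjects repeatedly closer to others than the minimum
# threshold distance (6 feet). Positions and identifiers are made up.
import math
from collections import Counter

MIN_FEET = 6.0

def proximity_violations(positions):
    """positions: {subject_id: (x, y)} in feet for one observation.
    Returns the pairs of subjects closer than the threshold."""
    ids = sorted(positions)
    pairs = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if math.dist(positions[a], positions[b]) < MIN_FEET:
                pairs.append((a, b))
    return pairs

def repeated_violators(observations, min_count=2):
    """Flag subjects violating the threshold in at least min_count observations."""
    counts = Counter()
    for obs in observations:
        seen = {s for pair in proximity_violations(obs) for s in pair}
        counts.update(seen)
    return sorted(s for s, n in counts.items() if n >= min_count)

obs = [
    {"206I": (0, 0), "206J": (3, 0), "206L": (30, 30)},  # I and J too close
    {"206I": (0, 0), "206J": (4, 0), "206L": (30, 30)},  # too close again
]
print(repeated_violators(obs))  # → ['206I', '206J']
```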
[0097] Microphone(s): While referring back to
[0098] While typically microphones will come integrated with cameras 204A, this is not necessarily the case. It is conceivable to have a site where audio signatures of subjects alone are used for identification and tracking and for determination of anomalous subjects. Examples of such audio sensitive sites include theaters, studios, etc. Moreover, the audio signatures may be combined with video signatures for better tracking of objects.
[0099] Data processing module 220 of
[0100] It can then better provide the movement patterns or temperature readings of these subjects to baseline engine 110 for analysis per above teachings.
[0101] Asset sensor(s): While still referring to
[0102] Asset sensor 204C captures data from one or more xmitters installed in or near or around assets present at the site. In the embodiments where site 202 is a manufacturing or chip fabrication facility, an xmitter can be any sensor installed in or near a manufacturing equipment or asset that senses/monitors the asset and transmits the sensed/monitored data to asset sensor 204C. An xmitter at a manufacturing or any other site can be based on any suitable wired or wireless technology including Bluetooth, a cellular network, radio frequency identification (RFID), Zigbee, etc.
[0103] Exemplarily, such an xmitter monitors the asset to ensure that it stays at a given location. Alternatively or in addition, such an xmitter may perform measurements of one or more manufacturing parameters for and/or in conjunction with the asset/equipment/tool, such as, reading the value of a voltage, a current, a pH, etc. It then transmits this reading or sensed data, either by a wired connection or wirelessly to an asset sensor of the present design, such as asset sensor 204C.
[0105] Data surveilled or monitored by xmitters 218A-C is then transmitted, by wire or wirelessly, on-demand or at regular intervals or on a real-time or near-real-time basis, to asset sensor(s) 204C. Asset sensor 204C may be any wireless sensor receiving data packets from xmitters 218A-C based on techniques known in the art. For instance, asset sensor(s) 204C may communicate with xmitters 218A-C using one or more of Bluetooth, a cellular network, radio frequency identification (RFID), Zigbee or any other suitable wireless technologies required for a given implementation.
[0106] Asset sensor 204C then communicates this data to data processing module 220 as shown. In the present embodiment, data processing module 220 performs any necessary processing of data received from xmitters 218A-C before providing it to baseline engine 110 for analysis. In an exemplary embodiment, data processing module 220 normalizes data between one or more assets. In the same or another variation, module 220 correlates data between assets of the same type or of different types. In any event, the processed data is provided to baseline engine 110 for analysis. Baseline engine 110 now establishes a rolling baseline for assets 216A-D based on data received from xmitters 218A-C and identifies any assets or subjects that may be anomalous.
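The normalization step mentioned above can be illustrated with a z-score transform, which makes readings from different tools comparable before baselining. The disclosure does not specify a normalization method, so the z-score choice, the function name, and the sample voltages are all assumptions.

```python
# Sketch: z-score normalization of one asset parameter (e.g. a voltage
# reading stream) so readings from different assets are comparable.
from statistics import mean, stdev

def zscore_normalize(readings):
    """readings: list of floats for one asset parameter."""
    m, s = mean(readings), stdev(readings)
    return [(r - m) / s if s else 0.0 for r in readings]

voltages = [3.28, 3.31, 3.30, 3.29, 3.32]  # made-up xmitter readings
print([round(z, 2) for z in zscore_normalize(voltages)])
```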
[0107] In the preferred embodiment, baseline engine 110 establishes a rolling baseline for each different type of asset or manufacturing tool/equipment. For example, if site 202 is a fab then baseline engine 110 may establish a rolling baseline 120B with centroid 182B for chemical vapor deposition tools, and another baseline for metrology tools, etc. as shown. Note that in
[0109] There are a number of useful scenarios that are identifiable by the variations shown in
[0113] As another example, if an unauthorized subject/human has excessive dwell time around a sensitive asset, then this might signify another form of espionage.
[0114] Similarly, a variety of other useful scenarios that are based on correlating data related to subjects 206F-H and captured by camera(s) 204A with the data related to subjects 216A-D captured by asset sensor(s) 204C, are conceivably caught and are identifiable by the embodiments explained in relation to
[0115] Personal-device sensor(s): In a highly preferred set of embodiments, a given site 202 of
[0116] If a personal-device sensor is a Bluetooth sensor, it is responsible for communicating with Bluetooth personal-devices; if it is a cellular signal sensor, it is responsible for communicating with cellular personal-devices such as cellular phones; if it is an RFID reader, it is responsible for communicating with RFID personal-devices such as RFID tags, which may be active, passive or semi-active tags. If the personal-device sensor is a Zigbee sensor, it is responsible for communicating with Zigbee personal-devices such as Zigbee end-devices.
Depending on the requirements of an implementation and the capabilities of a particular wireless technology, any of the communication above may be bi-directional or uni-directional, i.e. only from the personal-devices to the personal-device sensor. Moreover, more than one sensor of the same or a different type may be integrated into a single composite sensor/device in the present or any other embodiments of this disclosure.
[0117] A personal-device carried by a subject may or may not actually be owned by him/her or be his/her “personal” device in a manner of ownership. However, for the purposes of this disclosure any device carried by the subject is termed as a personal-device. Such subjects are typically human beings and the devices carried by them may be cellular phones including smartphones, tablets, wearable devices such as smartwatches, laptop computers, etc. Note however that there are situations that a personal-device is unattended or not carried by any subject. Such a situation is discussed in detail in the embodiments explained below.
[0119] Now, based on triangulation and trilateration techniques known in the art and the availability of a sufficient number of sensors 204D, the present design is able to determine where each device-carrying subject is on the premises of site 202. For this purpose, our data processing module 220 may again be utilized with the necessary algorithms for locating devices 222, 224 and 226 with their respective subjects 206 at site 202. As noted, two such exemplary algorithmic techniques include triangulation and trilateration.
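Of the two techniques named above, trilateration admits a compact closed-form sketch in two dimensions: subtracting the range-circle equations pairwise yields a linear system in the unknown position. The sensor positions and ranges below are made-up values, and module 220's actual algorithms are not specified in the disclosure.

```python
# Sketch: 2-D trilateration of a device from its ranges to three sensors
# at known positions. Values are illustrative only.
import math

def trilaterate(p1, r1, p2, r2, p3, r3):
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Subtract circle equations pairwise: two linear equations in (x, y).
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x2), 2 * (y3 - y2)
    c2 = r2**2 - r3**2 + x3**2 - x2**2 + y3**2 - y2**2
    det = a1 * b2 - a2 * b1  # nonzero when sensors are not collinear
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# A device at (3, 4) would measure these ranges to three known sensors:
device = (3.0, 4.0)
sensors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
ranges = [math.dist(device, s) for s in sensors]
x, y = trilaterate(sensors[0], ranges[0], sensors[1], ranges[1],
                   sensors[2], ranges[2])
print(round(x, 6), round(y, 6))  # recovers approximately (3.0, 4.0)
```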
[0120] As a consequence, module 220 may determine that individual/subject 206I is in region R1 of site 202, individuals/subjects 206J and 206L are in region R2 and subjects 206M and 206N are in region R3. Furthermore, data processing module 220 of the present design also assigns an identifier to each device that it detects at site 202. Note that subject 206K who is not carrying any device will not be detected by sensors 204D1 and 204D2 alone. For this purpose, we will defer to embodiments discussed further below.
[0121] Now, given the above setup, the wireless embodiments of
[0126] Only after a device joins the LAN does it beacon with its correct MAC address, at which point system 200 can use its real MAC address as the device identifier. If it is expected that employees 206I-N will be connected to the LAN, then a device that continues to beacon in the unused MAC address space for a greater than normal period of time will be identified as a suspect device by baseline engine 110.
[0127] More specifically, baseline engine 110 will establish rolling baseline 120D with a normal behavior of data streams from sensors 204D1-2 indicating that devices at site 202 start communicating with their real MAC address within a “normal” time window. This time will be a dimension in the conceptual hypercube with centroid 182D of baseline 120D. If a device such as device 222J carried by employee 206J beacons in the unused MAC address space for greater than the normal time, then it will be far enough away from centroid 182D along this dimension to signify an anomaly. Such an anomaly may indicate a breach or security incident or a threat, or a technical issue. As a result, employee 206J with device 222J will be flagged/signaled as an anomaly by engine 110. These and other useful scenarios are easily identifiable and caught by anomalous subject and device identification system 200 of the personal-device sensor embodiment shown in
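The check just described can be sketched by testing for locally administered (randomized) MAC addresses, which per IEEE 802 set the second-least-significant bit of the first octet, and flagging devices that keep beaconing from that space beyond a normal window. The 300-second window, the record layout, and the specific MACs are assumptions; the disclosure speaks only of an "unused MAC address space" and a "normal" time window generally.

```python
# Sketch: flag devices still beaconing from the locally administered
# (randomized) MAC space beyond an assumed normal join window.
NORMAL_JOIN_WINDOW_S = 300  # hypothetical "normal" time to join the LAN

def is_randomized_mac(mac):
    """Locally administered MACs set bit 0x02 of the first octet
    (second hex digit 2, 6, A or E)."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)

def flag_suspect_devices(beacons, window=NORMAL_JOIN_WINDOW_S):
    """beacons: {device_id: (mac, seconds_observed_beaconing)}."""
    return sorted(
        dev for dev, (mac, secs) in beacons.items()
        if is_randomized_mac(mac) and secs > window
    )

beacons = {
    "222I": ("a4:5e:60:11:22:33", 1200),  # real MAC: fine however long
    "222J": ("da:31:07:aa:bb:cc", 900),   # randomized, 15 min: suspect
    "222L": ("d6:00:12:34:56:78", 60),    # randomized but within window
}
print(flag_suspect_devices(beacons))  # → ['222J']
```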
[0128] Personal-device sensor(s) together with camera(s): In a highly useful set of embodiments personal-device sensors 204D of
[0130] Moreover and very importantly, system 200 with cameras 204A1-2 working in conjunction with data processing module 220 as well as personal-device sensors 204D1-2 is now able to associate a specific subject with each device. Anomalous subject and device identification system 200 of
[0131] Data streams from sensors 204A1-2 and 204D1-2 processed by module 220 are then provided to baseline engine 110. Based on data streams from cameras 204A1-2, baseline engine establishes one or more baselines 120A1, 120A2, 120A3, . . . 120AN for the dimensions of the conceptual hypercube of interest with corresponding centroids 182A1, 182A2, 182A3, . . . 182AN. Similarly, based on data streams from wireless sensors 204D1-2, baseline engine establishes one or more baselines 120D1, 120D2, 120D3, . . . 120DN for the dimensions of the conceptual hypercube of interest with corresponding centroids 182D1, 182D2, 182D3, . . . 182DN. It then scores each incoming packet from these data streams against the above baselines by computing the distance of the packet from the respective centroids on a certain dimension of interest. If the distance is far enough or greater than what is normal for the respective baseline, it identifies that packet as an anomalous packet and signals an anomaly identifying the associated subject and/or device per prior teachings.
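The per-baseline scoring loop above can be sketched with each baseline reduced to a centroid and a "normal" radius on its dimension of interest; a packet farther than that radius from the centroid is signaled as anomalous. The dimension names, centroid values, and radii below are illustrative assumptions, not values from the disclosure.

```python
# Sketch: score one packet against multiple single-dimension baselines,
# each represented by a (centroid, normal_radius) pair.
def score(packet, baselines):
    """packet: {dimension: value}; baselines: {dimension: (centroid, radius)}.
    Returns the dimensions on which the packet is anomalous."""
    anomalous = []
    for dim, value in packet.items():
        centroid, radius = baselines[dim]
        if abs(value - centroid) > radius:
            anomalous.append(dim)
    return anomalous

baselines = {
    "dwell_time_s": (40.0, 30.0),    # e.g. from camera data streams
    "join_delay_s": (120.0, 180.0),  # e.g. from wireless sensor streams
}
packet = {"dwell_time_s": 45.0, "join_delay_s": 900.0}
print(score(packet, baselines))  # → ['join_delay_s']
```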
[0132] Such a capability allows a number of important scenarios to be discovered/caught by anomalous subject and device identification system 200 of
[0135] Any of the above scenarios may simply signify an innocuous situation, such as a lost device. On the other hand, these may also indicate a more serious security incident/threat associated with device 222M and subject 206M. Regardless, the above scenarios along with the subject and/or device in question are signaled by baseline engine 110 as anomalies and identified.

[0136] More specifically, in these scenarios, one dimension of the conceptual hypercube will exemplarily be the number of subjects associated with a device. If the number is 0 or greater than 1, then this indicates an anomaly for site 202. Per above, if there is a prior association of an anomalous device with a subject then that subject is also identified, otherwise just the device itself is identified as anomalous by the anomalous subject and device identification system 200 of the present design.

[0137] 3. Transfer of a device: In an analogous manner, if a device that was once associated with one subject is now associated with another subject, such a situation also rises to a level of concern or anomaly. Again, such an anomaly caught by the present design may be innocuous or a more serious security exposure or threat. In this case also, one dimension of the conceptual hypercube will be the number of subjects associated with a device. If the number is 0 or greater than 1, then this indicates an anomaly for site 202.
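The subjects-per-device dimension described above can be illustrated with a minimal sketch in which a device is anomalous for the site when the number of subjects associated with it is 0 or greater than 1. The association table, the identifiers, and the function names are hypothetical assumptions, not part of the claimed design.

```python
def subjects_per_device(associations):
    """Count distinct subjects associated with each device."""
    counts = {}
    for subject, device in associations:
        counts.setdefault(device, set()).add(subject)
    return {device: len(subjects) for device, subjects in counts.items()}

def anomalous_devices(associations, known_devices):
    """Devices with 0 subjects, or more than 1 subject, are anomalous."""
    counts = subjects_per_device(associations)
    return sorted(d for d in known_devices
                  if counts.get(d, 0) == 0 or counts.get(d, 0) > 1)

# Example: device "222M" is shared by two subjects, "222X" has none.
assocs = [("206J", "222J"), ("206M", "222M"), ("206K", "222M")]
flagged = anomalous_devices(assocs, ["222J", "222M", "222X"])
```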
[0138] Wireless sensors with site instrumentation: In addition to, or as an alternative to, cameras, in some embodiments the wireless sensors of the present design are augmented by wireless antennas instrumented/installed at the site. Like cameras, these local antennas and instrumentation provide additional fidelity to the anomalous subject and device identification system of the present design.
[0139]
[0140] Any number of antennas 232A, 232B or more, installed in the local infrastructure at site 202 can operate in one or more of at least two configurations: (i) the antennas act as a booster for wireless sensors 204D1-2 by collecting data on the ground close to the devices at site 202 and then communicating it to sensors 204D1-2 either by wire or wirelessly, (ii) the antennas themselves operate as sensors 204D installed at optimal locations at site 202 for maximum signal coverage/strength. In other words, they may supplement existing wireless sensors 204D, or may instead, or in addition, act as wireless sensors 204D themselves.
[0141] In the absence of cameras 204A, antennas 232A and 232B assist in the determination of the location of a device with respect to the antennas in conjunction with wireless sensors 204D. As explained earlier in reference to the embodiments of
[0142] Using sensors on computing devices: In a highly useful set of embodiments, sensors available on computing devices are used to accrue the benefits of the anomalous subject and device identification system of the present design. The benefit of these embodiments is that instead of requiring separate sensors, sensors that are already ubiquitously present in today's computing devices are utilized. Exemplary computing devices include laptops, tablets, cellular phones including smartphones, wearable devices (including smartwatches and medical devices), security devices, etc.
[0143] Let us take advantage of
[0144] Kiosk 205 discussed further below has a computing device 240 installed in it. Device 240 may be a tablet or a cellular phone/smartphone or even a laptop or the like. Not all of sensors 204A-D above need to be embodied in computing devices. In other words, any subset of the sensors may be separately installed as in the embodiments of
[0145] All the relevant teachings of the prior embodiments apply to the present embodiments also, except that the sensors are now economically and ubiquitously available on (personal) computing devices. One of the advantages of the present embodiments is that a given site, such as site 302, can be quickly provisioned with the instant anomalous subject and device identification system 300. This is because the computing devices housing the sensors of interest, such as devices 234, 236, 238 and 240, are cheaply and readily available. Moreover, they have a small form factor, such that they can be easily and flexibly deployed at site 302 for optimal results. In an interesting application of the present embodiments, mobile devices with police officers containing cameras, microphones and/or other sensors are used to surveil a location on short notice per the above teachings.
[0146] Kiosks: Referring to
[0147] Referring now to the embodiment of
[0148] Data layering: In the preferred embodiment, the present technology is implemented by storing the data streams from various sensors, such as sensors 204 at site 202/302, as separate data-tracks or layers in a file. Each data layer or track in the data file corresponds to a data stream from a sensor. For example, there may be a radio frequency (RF) data layer, a cellular layer, a Bluetooth layer, a video layer, an audio layer, etc. This layering may be performed by data processing module 220.
[0149] Additionally, as object recognition is performed, an underlying subject/device data layer containing characteristics of the objects being recognized, and to which an identifier is assigned per above, is also created. For instance, if the object recognition function recognizes two persons amongst persons 206 with identifiers 78X67 and Y6790 and heights of 6 feet, 3 inches and 5 feet, 6 inches respectively, then this data is stored in the underlying subject/device data layer in the data file.
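The layering scheme of paragraphs [0148]-[0149] can be sketched, purely illustratively, as a single record with one named layer per sensor stream plus the underlying subject/device layer. The layer names, identifiers, sample values and the JSON container are assumptions, not part of the claimed design.

```python
import json

def build_data_file(sensor_streams, recognized_subjects):
    """Assemble per-sensor layers and a subject/device layer into one record."""
    layers = {name: stream for name, stream in sensor_streams.items()}
    layers["subject_device"] = recognized_subjects
    return json.dumps(layers)

# One layer per sensor stream, mirroring the RF/video/audio layers above.
streams = {
    "rf": [{"t": 0, "rssi": -40}],
    "video": [{"t": 0, "frame": "f0"}],
    "audio": [{"t": 0, "level": 0.2}],
}
# Underlying subject/device layer: identifiers and recognized attributes.
subjects = [
    {"id": "78X67", "height_in": 75},  # 6 feet, 3 inches
    {"id": "Y6790", "height_in": 66},  # 5 feet, 6 inches
]
data_file = build_data_file(streams, subjects)
```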
[0150] Where there are multiple sensors of the same type, such as cameras 204A1 and 204A2 in
[0151] Forensic analysis: As already mentioned, the embodiments of
[0152] For example, let us consider that a site, such as site 202/302 of the prior discussion, is a restaurant/school. Then a claim by a patron/student 206 that he/she got infected with the Covid-19 virus while at the restaurant/school on a given date may be challenged by uncovering evidence in the archive that the patron/student was not wearing a mask on that date at the restaurant/school. In another interesting application of the above embodiments for performing mask wearing enforcement/detection, a local government may audit a chain of hotels or restaurants based on the instant archived data in cloud 230 discussed above to determine if they have been allowing patrons without masks.
[0153] Furthermore, as the data streams from sensors 204 about subjects at site 202/302 are stored in a database, whether the database is on-premise at site 202/302 or in cloud 230, this allows the creation of profiles for individual subjects. This capability is also very useful because any analytics performed on the output of baseline engine 110 can then be matched against the profile of the subject in question to determine whether a specific behavior matches his/her profile. If it does not, then system 200/300 updates the subject or target profile accordingly. The profiling capability further allows system 200/300 to blacklist or whitelist subjects as needed.
[0154] In yet another variation, the anomalous subject and device identification system of the present design further analyzes data from subjects based on their police record. For example, one dimension of the hypercube of the baseline established by baseline engine 110 may be the number of arrests, warrants, charges, etc. for the subjects. This information may then be utilized to determine if a given subject scored on that dimension is likely to be associated with an anomalous situation based on the above teachings.
[0155] Overall: Any of the embodiments taught above may utilize a wired or a wireless connection as appropriate to facilitate communication between sensors, devices and ground infrastructure. Furthermore, backbone 208 discussed in various embodiments above may also be wired or wireless. Moreover, various capabilities of the above embodiments may be combined (mixed and matched) depending on the number and types of various sensors and/or devices involved in an implementation.
[0156] Furthermore, exemplary sites/locations that may benefit from the anomalous subject and device identification system with its above-taught embodiments include airports, train stations, subways, central bus stations, embassies and consulates, government buildings, stadiums, arenas, venues, convention centers, Fortune 500 companies' headquarters or key offices, hospitals, universities/colleges, schools, restaurants and hospitality centers, office buildings, etc.
Indicators of Compromise by Analyzing Data Based on Rolling Baseline
[0157] In a highly preferred set of embodiments, the data from a variety of objects is analyzed by the rolling baseline engine of the prior teachings to determine whether any of those objects may be compromised. The term object in these embodiments refers either to a finished product, or simply a product, or to a component assembly, or simply a component, that may be assembled into, integrated into, or used to produce a product.
[0158] Exemplarily, a product may be a consumer product or device or an end-user device, such as a smart television, a smartphone, a tablet, a thermostat, a smart fridge, a security camera, a microphone, a measuring device, any other internet of things (IoT) device, or any other consumer product/device. The product may also be an industrial product/device, or any other conceivable finished product. A component may be any conceivable component or assembly that is used in the assembly or manufacturing of a finished product, including the ones listed above. A component may also be a semiconductor component, a chip, a circuit board, or circuitry that may eventually be used in a product including the ones listed above.
[0159] A few examples of objects covered by the present embodiments are shown in
[0160]
[0161] Also shown in
[0162] As noted above, data in the present embodiments refers to any data that any of the above objects generates during its course of manufacturing/construction/assembly or operation thereafter. Exemplarily, if the object is a semiconductor chip or a circuit, the manufacturing data may be its prediction, simulation, test, verification, performance, etc. data. Such data may also take the form of “pings” or “heartbeats” or measurements or surveillance footage or an operational report, performance data, or any other type of conceivable data generated by an object or expected to be generated by the object during its lifecycle.
[0163] Baseline engine 110 then analyzes this data and determines if the corresponding object or objects that generated the data may be compromised. This determination is referred to as finding/detecting/generating an indicator of compromise in the object(s) by baseline engine 110. Per prior teachings, this is accomplished by identifying anomalous packets far enough away from centroid 182 representing the normal population of packets of data received from the objects.
[0164] In one example, object 400A is a smart tachometer, such as the one used in a modern car, that may normally transmit its reading every 1 minute to rolling baseline engine 110. Now, if the frequency of such a reading inexplicably changes to 5 minutes, then this may indicate a compromise of/in object/device 400A as detected by baseline engine 110. Explained further, a hacker or a malware may have intruded into object 400A and may be executing remote and unauthorized commands on it and/or be sending/receiving unauthorized messages to/from it. Remote command execution on an IoT device, such as object 400A of
[0165] Our baseline engine 110 is able to detect such a misreporting or deviation of reporting frequency of object 400A as a far enough distance from centroid 182 of hypercube 180 of baseline 120, per prior teachings. This deviation is identified as an anomaly or an indicator of compromise of/in device/object 400A in the present embodiments. As a consequence, remedial or corrective actions may be immediately taken, which otherwise could have resulted in a catastrophic outcome, such as a car accident.
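The reporting-frequency deviation of paragraphs [0164]-[0165] can be illustrated with a minimal sketch in which the inter-arrival time between readings is treated as one hypercube dimension, and a gap deviating from the normal gap by more than a tolerance is flagged. The timestamps, the normal gap of 60 seconds, and the tolerance are hypothetical assumptions.

```python
def inter_arrival_gaps(timestamps):
    """Gaps (in seconds) between consecutive reading timestamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]

def flag_frequency_anomalies(timestamps, normal_gap, tolerance):
    """Flag each gap whose deviation from the normal gap exceeds tolerance."""
    return [abs(gap - normal_gap) > tolerance
            for gap in inter_arrival_gaps(timestamps)]

# Readings arriving every 60 s, then an unexplained 300 s gap,
# analogous to the tachometer's reporting shifting from 1 to 5 minutes.
times = [0, 60, 120, 180, 480]
flags = flag_frequency_anomalies(times, normal_gap=60, tolerance=30)
```

Only the final gap is flagged, mirroring how the engine would see that reading as far from the centroid along the reporting-interval dimension.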
[0166] In a desirable set of variations of the present embodiments, the rolling baseline engine detects indicators of compromise in objects by analyzing their data sent to or residing in a cloud. There are many different types of clouds to which such internet-enabled objects may send their manufacturing, operational or any other data. These include at least:

[0167] 1. Generic automation testing clouds or farms: These clouds, also sometimes referred to as device farms or simply farms, are clouds that are used for the functional automation and testing of various devices, including IoT devices. Current providers of such farms include Saucelabs™, Xamarin™, Perfecto™, etc. These clouds offer generic testing environments for many types of devices running a variety of operating systems, including iOS™ and Android™. The devices testable in these cloud environments include not only mobile phones, but also a variety of IoT devices, such as smart refrigerators, smart TVs, smart thermostats, etc. that run on the supported operating systems.
[0168] As a consequence, a developer team or a concerned party can test the application(s) developed for the “target” device or devices with a target operating system(s) without incurring the capital expense of owning and operating those devices and operating systems. This is especially helpful when the developer needs to support an application on not only a single target device, e.g. iPhone 13, but a range of devices, e.g. iPhone 6S through iPhone 15, running a variety of iOS versions, including iOS 10 through iOS 15.

[0169] 2. Device-specific clouds: These clouds are specific to a given type of object or device. For example, Amazon web services (AWS™) device farms offer mobile application testing services for iOS and Android devices. Although the devices currently testable are mobile phones, the types of such devices are expected to grow and include other types of iOS and Android devices, including smart home and industrial devices, IoT devices, etc.

[0170] 3. Vendor-specific clouds: There are also clouds that receive data from specific vendor devices. For example, LG™ TV, Sony™ TV, Samsung™ TV, and other vendors have dedicated clouds that are intended to receive a variety of operational data from various types of smart appliances, including smart TVs, that are operating in production. There are also vendor-specific clouds for smart TVs, smart fridges, toasters, and any conceivable internet-enabled device. Such clouds could also be specific to individual model numbers of the specific types of objects/devices.

[0171] 4. Component clouds:
[0172] These clouds are used for receiving manufacturing data from individual components of a product. Such components include semiconductor components, microchips or simply chips, circuit boards, or assemblies. The components may also be contained in a consumer/finished product, in which case the finished product may be sending its operational data to a device cloud discussed above while its individual component(s) would send their data to a component cloud.

[0173] Alternatively, the component cloud may be used for electronic design automation (EDA), also sometimes referred to as electronic computer-aided design (ECAD), of the component(s) during their design and manufacturing and before they have gone into production. Some of the current industry participants operating such component clouds include Nvidia™, Lattice™, Siemens™, Synopsys™, etc.
[0174] Those skilled in semiconductor design will appreciate the tremendous benefits of using the cloud for EDA/ECAD. In a cloud-based EDA model, the various design, prediction, simulation, verification, testing and any other EDA artifacts are kept in the component cloud. The cloud is operated by a specialized contract manufacturer or a fab or a foundry or in some cases a large vertically integrated company. Regardless, the cloud-based EDA approach greatly optimizes the time to market, cost, quality and other metrics of EDA for a given chip, instead of the traditional approach of chip design where a company vertically owns and operates the various aspects of EDA for a given chip/product.
[0175] The object/device clouds presented above may be multi-functional, where a single cloud serves as more than one of the above presented clouds. These clouds may also be tiered or hierarchical, such that data in component clouds is combined or integrated with the data in the next higher tier, and so on.
[0176] Regardless of the type and structure of clouds, including the ones described above, a variety of objects, which includes finished or end-products or simply products and/or their components, send their design, manufacturing, testing, performance and/or operational data to a cloud. The purposes of doing so may be various, including their testing/verification, monitoring, improvement and optimization, analysis, etc. As noted, the objects may send their data to the cloud before their operation in production, i.e. during their design and development, during post-deployment operation, or both.
[0177] In the present variations, the rolling baseline engine analyzes this data sent by the objects to the cloud or the data residing in the cloud, and detects any anomalies with the associated contexts per prior teachings. A deviation or anomaly thus detected serves as an indicator of a compromise of/in the associated objects. In other words, if the data in the cloud observed by our baseline engine is far enough away from the centroid of the “normal” population of the data packets, then this acts as an indicator of compromise of the objects. The rolling baseline engine may also itself operate inside the same cloud, a different cloud, or on-premise while analyzing the object data in the cloud.
[0178] The anomaly indicating a compromise may manifest itself in a variety of ways including an overload/overuse/overage or underload/underuse/shortage of any number of metrics or resources of the objects. These include CPU usage, memory usage, disk storage usage, network bandwidth usage, thermal output, to name a few. The manifestation may also be in the form of misreporting of operational data, such as underreporting or overreporting of data. It may also be in the interpretability of data, e.g. if the data report is obfuscated, unintentionally encrypted, or is incomprehensible or gibberish.
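As one possible, purely illustrative heuristic for the interpretability manifestation above (not a technique prescribed by the present design), encrypted or gibberish data can often be distinguished from plain operational reports by its near-maximal Shannon entropy. The threshold of 7 bits per byte and the sample data are assumptions.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_encrypted(data: bytes, threshold: float = 7.0) -> bool:
    """Heuristic: entropy near 8 bits/byte suggests encrypted/gibberish data."""
    return shannon_entropy(data) > threshold

# A plain operational report uses a small alphabet (low entropy);
# a flat byte distribution mimics encrypted output (maximal entropy).
plain_report = b"temp=21.5C status=OK uptime=3600s " * 8
random_like = bytes(range(256)) * 4
```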
[0179] Early knowledge of an indicator of a compromise allows the developer or a concerned party to patch or address the vulnerability that caused the compromise/infiltration. If the object in question is a component, then this prevents a costly production of products with inherent flaws or vulnerabilities in them, only to cause bigger and more consequential damage in the future. If the object is a product, then this facilitates its firmware/software updates, patches or recalls to ensure its secure operation and satisfied customers. The anomaly/anomalies detected by the instant baseline engine may also denote technical flaw(s) in the design/manufacturing of the objects and not necessarily just exploited vulnerabilities.
[0180]
[0181] Regardless, baseline engine 110 analyzes data packets sent to or residing in cloud 452 that originated from various smart TVs 400H. Based on this analysis, it detects indicators of compromise in one or more smart TVs 400H. Per above, cloud 452 may be specific to a given TV manufacturer, and even a given TV model. By way of example, operational data of a given batch of smart TVs 400H of a given model may indicate that 10% of such TVs are getting their current date/time settings reset or corrupted. This may indicate a compromise through an exploit of a vulnerability of that model of TVs. Baseline engine 110 can detect such an indicator of compromise so that adequate patches/updates can be made.
[0182]
[0183] Our baseline engine 110 analyzes this data and detects if a malware or a hacker has compromised components 400E in a way that may be causing their test data to deviate from the normal. Similarly, if verification data generated by an EDA job, such as functional verification data, is unintelligible or incomprehensible or gibberish or misreported, then this may be an indicator of compromise of chips 400E. Similarly, if thermal data from chips 400E indicates an overage of thermal output, this may be because a malware is overloading the chip, or because of a design defect in the heat flow of chips 400E.
[0184] Thus, an immediate response can be launched that could avoid expensive chip/circuit failures down the line. Similarly, EDA simulation data, such as logic simulation data, may be encrypted by an attack, resulting in a large enough distance from centroid 182 of the normal population of data packets per prior teachings. This may be an indicator of compromise, such as a ransomware attack. In a similar manner, the data may be obfuscated in some other way indicating a compromise.
[0185] The proactive knowledge of the compromise allows a developer or a concerned party to patch the flaw or vulnerability in the objects that allowed the exploit to occur. This limits the damage and prevents further and potentially more catastrophic compromises from occurring in the future.
[0186] Data in component clouds is used for improvements and optimization of the chip design. The present technology is thus effectively used in component clouds with its rolling baseline engine for detecting patterns of failures and/or compromise of/in the components/chips. In another exemplary scenario, if rolling baseline engine 110 detects a security flaw in 20% of chips 400E that are sending their data to cloud 454, then this may indicate a pattern of failure or exploitation of the chips. A timely knowledge of such a failure or flaw is useful for efficiently patching the flaw before additional batches of chips 400E are produced and before a “mass compromise” occurs. This is especially the case once these components are fitted into potentially millions or more consumer products that are then sold.
[0187] In yet another interesting application, if a batch of cell phones from a deployment, such as a military deployment, is detected by engine 110 to be sending unauthorized pings to a network, then this is an indicator of compromise of those cell phones. In this application, the rolling baseline engine analyzes the data on a network, which may be a secure cloud, and determines such pings to be an overuse of network bandwidth and an anomaly or an indicator of compromise. Early detection of such a compromise by baseline engine 110 of the present design can prevent serious downstream consequences.
[0188] In still another variation, the operational data received from various objects in the various clouds taught above is used as training data for modeling the operational behavior of the respective objects using artificial intelligence (AI) techniques. Such modeling is then used to determine the normal patterns of behavior, i.e. for generating an optimal operational model of the objects. This optimal operational model signifies or is correlated to centroid 182 of hypercube 180 of baseline engine 110 shown in
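As a simplified stand-in for such an AI-derived operational model (the actual modeling technique is not limited by this sketch), a rolling mean of recent readings can play the role of centroid 182, with a new reading scored anomalous when it deviates too far from that mean. The window size, threshold and sample readings are hypothetical assumptions.

```python
from collections import deque

class RollingBaseline:
    """Rolling-mean stand-in for an operational model of an object."""

    def __init__(self, window: int, threshold: float):
        self.window = deque(maxlen=window)  # most recent "normal" readings
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        """Return True if value deviates too far from the rolling mean."""
        if self.window:
            mean = sum(self.window) / len(self.window)
            anomalous = abs(value - mean) > self.threshold
        else:
            anomalous = False  # no baseline established yet
        self.window.append(value)
        return anomalous

# Five ordinary readings followed by one large deviation.
model = RollingBaseline(window=5, threshold=10.0)
anomaly_flags = [model.observe(v) for v in [50, 51, 49, 50, 52, 95]]
```

Only the final reading is flagged, analogous to a data packet landing far from the centroid of the normal population.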
[0189] In view of the above teachings, a person skilled in the art will recognize that the apparatus and method of the invention can be embodied in many different ways in addition to those described without departing from the principles of the invention. Therefore, the scope of the invention should be judged in view of the appended claims and their legal equivalents.