By Jacob R Hooker
Users are often the bane of my existence. If they’re not busy being phished, they’re working hard to circumvent firewall restrictions or install unapproved software on company devices. While this is frustrating, I find these people hard to fault. Anecdotally, there is rarely any sign of malice, training leads to improvement, and in general people carry out their duties with the best intentions. Statistically, however, that impression is misleading. Insider threats are becoming one of the most common threat vectors in both the public and private sectors. A 2015 SolarWinds report states that insider threats are among the most prevalent and damaging threats to government agencies. Even so, investment in preventative measures for insider threats has not been increasing. This stands in contrast to the uptick in User Behavior Analytics (UBA) tools currently flooding the market. UBA tools are typically offered as a supplement to the traditional SIEM-monitored environments we’re used to. The goal of these tools is to establish a baseline, or normal behavior pattern, for every user in an environment.
Establishing a Baseline
In an environment of any size, the users are the most dynamic source of activity. In a traditional SIEM environment, an insider threat would typically be located retroactively. The nature of log collection means that an event has to happen before it can be detected. With insider threats this poses an obvious problem: odds are the data has been exfiltrated before detection. To counter that, UBA tools start by building a profile for every user. Each profile is a collection of user activities that is used to determine what counts as normal behavior for that particular account. Eventually, with proper integration, it should be possible to determine the role a user has in an organization and the actions that are normal for that type of user. Once this baseline exists, it is easier to recognize irregular behavior from a potentially compromised account or a malicious actor. While this is closer to a solution, the original issue remains. Even with a baseline for comparison, an action still has to occur before it can be compared against that baseline, so the damage may already be done before it is detected. To counter this we employ analytics to try to predict user actions.
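To make the idea of a profile concrete, here is a minimal sketch of a baseline built from one account’s event history. It assumes the simplest possible profile, the relative frequency of each action. The function name `build_baseline` and the event labels are hypothetical, chosen for illustration and not taken from any particular UBA product.

```python
from collections import Counter

def build_baseline(events):
    """Return the relative frequency of each action in a user's event history."""
    counts = Counter(events)
    total = sum(counts.values())
    # An empty history yields an empty profile (the comprehension never divides).
    return {action: n / total for action, n in counts.items()}

# Hypothetical event history for a single account (illustrative labels only).
history = ["logon", "file_share", "web", "web",
           "file_share", "web", "logoff", "logon"]

baseline = build_baseline(history)
for action, share in sorted(baseline.items()):
    print(f"{action:<12}{share:.3f}")
```

A real profile would likely also track time of day, source host and similar context. The point here is only that the baseline is a summary of past behavior, which is why it cannot flag anything until new activity arrives.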
Developing Markov Chains
Behind the scenes of many UBA tools are mathematical devices known as Markov Chains. Put simply, a Markov Chain is a sequence of probability experiments in which the outcome of one experiment can affect the outcome of the next. The idea is based around states. In the real world a state can be any event, and in our context a state can be seen as an action a user takes. The set of these states makes up a Markov Chain. The probability of moving from one state to another is a transition probability. Chains can be created to represent each state possible for the user, along with the probability of transitioning to another state. With all of these combined, a transition matrix can be created. The transition matrix contains the probability, given the current state, of transitioning to each other state. Borrowing the following example from the UC Davis Mathematics department, it can be shown that with known transition probabilities it is possible to reliably predict the next state that will occur. The matrix gives the probabilities that a person who eats at home (H), a Chinese restaurant (C), a Mexican restaurant (M), or a pizza place (P) will choose each of those options next. From here we would calculate a state vector, a column that defines the probability a subject is in a certain state at the time of observation. In the example below, x is the probability that a system is in a certain state at observation period n.
Fig 1. Transition Matrix (states H, C, M, P)
Fig 2. State Vector
(Source: UC Davis Mathematics)
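As a rough sketch of how a transition matrix and a state vector work together, the snippet below advances a state vector by one observation period at a time. The probabilities are made up for illustration and are not the values from the UC Davis figures. It also assumes a column convention, where entry `P[i][j]` is the probability of moving from state j to state i, so each column sums to 1 and the next state vector is the matrix times the current one.

```python
STATES = ["H", "C", "M", "P"]

# Hypothetical transition matrix (NOT the values from the original figure).
# Column j holds the probabilities of where a subject in state j goes next.
P = [
    [0.5, 0.4, 0.4, 0.3],  # to H
    [0.2, 0.3, 0.2, 0.2],  # to C
    [0.2, 0.2, 0.3, 0.2],  # to M
    [0.1, 0.1, 0.1, 0.3],  # to P
]

def step(matrix, x):
    """Advance the state vector one period: x_next[i] = sum_j matrix[i][j] * x[j]."""
    size = len(x)
    return [sum(matrix[i][j] * x[j] for j in range(size)) for i in range(size)]

x0 = [1.0, 0.0, 0.0, 0.0]  # the subject is known to be in state H at period 0
x1 = step(P, x0)           # distribution over states at period 1
x2 = step(P, x1)           # distribution over states at period 2

for label, p1, p2 in zip(STATES, x1, x2):
    print(f"{label}: period 1 = {p1:.2f}, period 2 = {p2:.2f}")
```

Starting from certainty that the subject is in state H, one step simply reads off the first column of the matrix, and every later step blends the columns according to the current state vector.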
Predicting User Behavior
At this point we’re able to combine both ideas of User Behavior Analytics. A baseline will be established for each user. This data will give us an idea of the states a user is typically in, but as we found earlier, it is not enough on its own to preemptively detect an insider threat. The normal data set will prove invaluable, though, because it allows us to gather the probabilities of the user being in a given state. It will also be useful in identifying the types of states or actions that malicious actors might be using. Using those probabilities, it is possible to start the Markov process and create transition matrices for each user or group of users. Once these are in place, it is no longer unreasonable to think that insider threats can be detected and prevented before catastrophic failure. For example, consider a user who is typically logged on only during business hours, accesses a set number of drives and browses the web predictably. A transition matrix is created for this user and state vectors are calculated daily. If this account were compromised, or if the user’s motives changed, the subtle changes in behavior would cause the state vectors to vary far more than is normally expected. Combined with the detection tools in a typical SIEM environment, these deviations would generate an alert, and a security professional could investigate whether an insider threat is present.
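One way to turn this into something measurable is sketched below. It estimates a user’s transition probabilities from past activity and then scores a new session by how surprising its transitions are. Note that this scores individual transitions and is not the daily state vector comparison described above. It is one possible approach and not how any specific UBA tool works. The state names, the add-one smoothing and both function names are assumptions made for illustration.

```python
import math
from collections import defaultdict

def fit_transitions(sequence, states, alpha=1.0):
    """Estimate P(next | current) from an event sequence with add-alpha smoothing."""
    counts = {s: defaultdict(float) for s in states}
    for cur, nxt in zip(sequence, sequence[1:]):
        counts[cur][nxt] += 1
    probs = {}
    for s in states:
        total = sum(counts[s].values()) + alpha * len(states)
        probs[s] = {t: (counts[s][t] + alpha) / total for t in states}
    return probs

def average_surprise(sequence, probs):
    """Mean negative log-probability per transition; higher means less like the baseline."""
    steps = list(zip(sequence, sequence[1:]))
    if not steps:
        return 0.0
    return -sum(math.log(probs[c][n]) for c, n in steps) / len(steps)

# Hypothetical states and history; every event must be one of STATES.
STATES = ["logon", "file_share", "web", "usb_copy", "logoff"]
normal_day = ["logon", "web", "file_share", "web",
              "web", "file_share", "web", "logoff"]
probs = fit_transitions(normal_day * 20, STATES)

typical = ["logon", "web", "file_share", "web", "logoff"]
unusual = ["logon", "usb_copy", "usb_copy", "file_share", "usb_copy", "logoff"]

print("typical session:", round(average_surprise(typical, probs), 2))
print("unusual session:", round(average_surprise(unusual, probs), 2))
```

In practice the alert threshold would have to be tuned per user or per role against historical scores. The smoothing keeps a never-before-seen transition from producing a zero probability, so it shows up as a very high score instead of an error.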
Further Use of Analytics
This type of mathematical analysis has been around for over a century and can be usefully applied in many fields. As a SOC technician, I found it most interesting from the perspective of anticipating insider threats. A more passive benefit that has arisen from it, however, is system auditing. Collecting this type of data on users allows us as administrators to see clearly what access users have and how they use it. This becomes important in large enterprise settings, where access creep is a real issue.