Michael Thumann and me had the chance to give a talk at this year’s ISSE conference in Brussels, Belgium. ISSE was founded in 1999 as an initiative of the European Commission Directorate General Information Society. The con had a focus on eGovernment, electronic business processes and the corresponding security issues. Continue reading “ISSE 2013 – ERNW Rapid Rating System”Continue reading
Today I’m going to discuss the (presumably) most complex and difficult-to-handle of the three parameters contributing to a risk (as of the RRA), that is the “vulnerability [factor]”.
First it should be noted that “likelihood” and “vulnerability” must (“mentally”) be clearly separated which means that “likelihood” denotes: likelihood of threat showing up _without_ consideration of existing controls. Security controls already present will affect the vulnerability factor (in particular if they are effective ;-), but _not_ the likelihood.
First reflect on “how often will somebody stand at the door of our data center with the will to enter?” or “how often will a piece of malware show up at our perimeter?” or “how often will it happen that an operator commits a mistake?” and assign an associated value to the likelihood.
Then, _in a separate_ step, think about: “will that person be able to enter my data center?” (maybe it’s an external support engineer and, given their high workload, your admins are willing to violate the external_people_only_allowed_to_access_dc_when_attended policy. which – of course – is purely fictional and will never happen in your organization ;-)) or “how effective are our perimeter controls as for malware?” (are they? ;-)) or “hmm… what’s the maturity of our change management processes?” and assign an associated value to the vulnerability factor.
As stated in an earlier post: this will allow for identifying areas where to act and thus allow for efficient overall steering of infosec resources.
Mixing likelihood and vulnerability might lead to self complacent stuff like “oh, evidently likelihood of unauthorized access to datacenter is ‘1’ as we have that brand new shiny access control system” …
Now, how to rate the “vulnerability” (on a scale from 1 to 5)?
In my experience a “rough descriptive scale” like
1: Extensive controls in place, threat can only materialize if multiple failures coincide.
2: Multiple controls, but highly skilled+motivated attacker might overcome those.
3: Some control(s) in place, but highly skilled+motivated attacker will overcome those. Overall exposure might play a role.
4: Controls in place but they have limitations. High exposure given and/or medium skilled attacker required.
5: Maybe controls, but with limitations if at all. High Exposure and/or low skills required.
works quite well for exercises with participants somewhat experienced with the approach. If there are people in the room (or call) who’ve already performed RRA (or, for that matter, similar exercises) a joint understanding what a “2” or “4” mean in the context of figuring “the vulnerability [factor]” can be attained quickly. This might even work when most of the participants are complete newcomers to risk assessments. In this case discussing some examples is the key for gaining that joint understanding.
That’s why in RRAs – where getting results in a timely manner is crucial (see introductory notes in part 1 on the complexity vs. efficiency trade-off) – we usually use some scale like the one outlined above.
Still, this – perfectly legitimately – may seem a bit “unscientific” to some people. For example, I currently do a lot of work in an organization where relevant people involved in the (risk assessment) process heavily struggle with the above scale, feeling it does not permit “a justified evaluation of the vulnerability factor due to being too vague”. In general, in such environments going with a “weighted summation method” is a good idea. More or less this works as follows:
a) identify some factors contributing to “vulnerability” (or “attack potential”) like “overall (network connectivity) exposure of system” or skills/time needed by an attacker and so on.
b) assign individual value to those factors, perform some mathematical operation on them (usually simply adding them), map the result to a 1-5 scale and voilà, here’s the – “justified and calculated” – vulnerability factor.
Again, an example might help. Let’s assume the threats-to-be-discussed are different types of attacks (as opposed to all the other classes of threats like acts of god, hardware failures etc.). One might look at three vulnerability-contributing factors, that are “type of attacker”, “exposure of system” and “extent of current controls” and assign values to each three of them as shown in the following – sample – table:
|Attacker (knowledge)||Exposure||Extent of controls||Value|
|Script kid||Internet facing||none||5|
|Bot||Business partners||Some, but insufficient||4|
|Skilled + motivated attacker||Only own organization||some but not resistant to skilled and motivated attacker||3|
|Organized crime||Own organization, restricted||multiple controls, but single failure might lead to attack succeeding||2|
|Nation state/agency||Very restricted or mgmt access only||multiple controls, multiple failures needed at the same time for attack to succeed||1|
with the following “mapping scale”:
Sum of values in the range 1-3 gives overall value: 1.
4-6 gives a 2.
Now, suppose we discuss the vulnerability of certain type of attack against a certain asset (here: some system). In case an attacker shows up (the likelihood of this event would be expressed by the respective value that is not included in this example) a skilled and motivated attacker is assumed to perform the attack (leading to a “3” in the first column). Furthermore the system is exposed to business partners (=> “4” in second column) and, due to the high sensitivity of the data processed, it has multiple layers of security controls (=> “1” for “extent of controls”). Adding these values gives on overall “8” which in turn means a vulnerability [factor] “3”.
For an internet facing system (“5) which we expect to be attacked/attackable by script kids (“5”) and which does not dispose of good controls (thus “4”) adding the respective values gives a 14 and subsequently an overall vulnerability of “5”.
An extensive presentation of this approach can be found in the clause B.4 of the Common Criteria Evaluation Methodolgy (CEM). There the respective values “characterising [the] attack potential” are:
a) Time taken to identify and exploit (Elapsed Time);
b) Specialist technical expertise required (Specialist Expertise);
c) Knowledge of the TOE design and operation (Knowledge of the TOE);
d) Window of opportunity;
e) IT hardware/software or other equipment required for exploitation.
with a fairly advanced description of the different variants for these values and an elaborated point scale.
To the best of my knowledge (and I might be biased given I’m a “BSI certified Common Criteria evaluator”) this is the best description of the “weighted summation method” in the infosec space. Feel free to let me know if there’s a better (or older) source for this.
Back to the topic, I’d like to state that while I certainly have quite some sympathy for this approach, using it for risk assessments – obviously – requires (depending on the type of people involved and their “discussion culture” 😉 potentially much) more time for the actual exercise. Which in turn might endanger the overall goal of delivering timely results. That’s why the latter way of rating the vulnerability is not used in the RRA.
Stay tuned for the next part to follow soon,
This is the second part of the series (part 1 here) providing some background on the way we perform risk assessments. It can be seen as a direct continuation of the last post; today I cover the method of estimation and the scale & calculation formula used.
1.1 Method of Estimation
Again, two main approaches exist:
- Qualitative estimation which uses a scale of qualifying attributes (e.g. Low, Medium, High) to describe the magnitude of each of the contributing factors listed above. [ISO 27005, p. 14] states that qualitative estimation may be used
- As an initial screening activity to identify risks that require more detailed analysis.
- Where this kind of analysis is appropriate for decisions.
- Where the numerical data or resources are inadequate for a quantitative estimation.
As the latter is pretty much always the case for information security risks, in the infosec space usually qualitative estimation can be found. A sample qualitative scale (1–5, mapping to “very low” to “very high”) for the vulnerability factor will be provided in the next part of this series.
- Quantitative estimation which uses a scale with numerical values (rather than the descriptive scales used in qualitative estimation) for impact and likelihood, using data from a variety of sources. [ISO 27005, p. 14] states that “quantitative estimation in most cases uses historical incident data, providing the advantage that it can be related directly to the information security objectives and concerns of the organization. A disadvantage is the lack of such data on new risks or information security weaknesses. A disadvantage of the quantitative approach may occur where factual, auditable data is not available thus creating an illusion of worth and accuracy of the risk assessment.”
1.2 Scale & Calculation Formula Used
Each of the contributing factors (that are likelihood, vulnerability [factor] and impact) will be rated on a scale from 1 (“very low”) to 5 (“very high”). Experience shows that other scales either are not granular enough (as is the case for the scale “1–3”) or lead to endless discussions if too granular (as is the case for the scale “1–10”).
Most (qualitative) approaches use a “1–5” scale.
It should be noted that usually all values (for likelihood, vulnerability and impact) are mapped to concrete definitions; examples to be provided in the next part. Furthermore the “impact value” will not be split into “subvalues” for different security objectives (like individual values for “impact on availability”, “impact on confidentiality” and so on) in order to preserve the efficiency of the overall approach.
To get the resulting risk, all values will be multiplied (which is the most common way of calculating risks anyway). It should be noted that there are some objections with regard to this approach which, again, will be discussed in the next part.
The values themselves should be discussed by a group of experts (appropriate to the exercise-to-be-performed) and should not be assigned by a single person. The usefulness and credibility of the whole approach heavily depends on the credibility and expertise of the people participating in an exercise!
Wherever possible some lines of reasoning should be given for each value assigned, for documentation and future use purposes.
 [ISO 27000], section 18.104.22.168 provides a good overview.
 Models with quantitative estimation don’t use a “vulnerability factor” (as this one usually can’t be expressed in a quantitative way).
 And so does the example 2 (section E.2.2) of [ISO27005] which can be compared to the methodology described here.
Feel free to get back to us with any comments, criticism, case studies, whatever. Thanks,
At several occasions we’ve been asked to provide some background on the Rapid Risk Assessment (RRA) methodology we frequently use for a transparent (and documented) understanding of risks in certain situations and to deliver structured input for subsequent decision taking. As I had to write down (in another context) some notes on risk assessments and – from our perspective – practical, reasonable ways of performing them, I take the opportunity to lay out a bit the underlying ideas of the RRA approach. Which, btw, is no rocket science at all. Honestly, I sometimes wonder why stuff like this isn’t practiced everywhere, on a daily basis 😉
Here we go, second part to follow shortly. Feel free to get back to us with any comments, criticism, case studies, whatever. Thanks,
There’s a number of heterogeneous definitions of the term “risk”, quite some of them with an inconsistent or ambiguous meaning and use. In the following we will rely on the definitions furnished by the standard documents ISO 31000 Risk management — Principles and guidelines (providing a widely recognized paradigm for risk management practitioners from different backgrounds and industry sectors) and ISO/IEC 27005 Information technology — Security techniques — Information security risk management (with a dedicated focus on the information security context).
ISO 31000 simply defines risk simply as
“effect of uncertainty on objectives”
where uncertainty is “the state, even partial, of deficiency of information related to, understanding or knowledge of an event, its consequence, or likelihood” [ISO31000, p. 2].
Accepting uncertainty as being the main constituent of risk is a fundamental prerequisite for our approach outlined below. It must be well understood that first a certain degree of uncertainty is intrinsic to dealing with risks and second that there’s always a trade-off between the – given resource constraints and human bounded rationality – necessary reduction of complexity and (presumed) accuracy during an exercise of risk assessment.
[ISO31000, p. 8] emphasizes that “the success of risk management will depend on the effectiveness of the […] framework” and [ISO31010, p. 18] concludes that “a simple method, well done, may provide better results than a more sophisticated procedure poorly done.”
Or, to express it “the blog way”: going with a simple method and thereby preserving the ability to perform exercises in a time-efficient manner, while accepting some fuzziness, will usually provide better results (e.g. for well-informed decision making) than striving for the big hit of a comprehensive risk enlightenment considering numerous potential dependencies and illuminating various dimensions of security objectives (which usually gets finished at the very moment Godot arrives).
2. Sources of Threats
In general two main possible approaches can be identified here:
- Use of a well-defined threat catalogue (usually one and the same at different points of execution) which might be provided by an industry association, a standards body or a government agency regulating a certain industry sector. While this may serve the common advantages of a standards based approach (accelerated setup of overall procedure, easy acceptance within peer community etc.), [ISO31010, p. 31] lists some major drawbacks of this course of action, laying out that check-lists
- tend to inhibit imagination in the identification of risks;
- address the ‘known knowns’, not the ‘known unknowns’ or the ‘unknown unknown’.
- encourage ‘tick the box’ type behavior
- tend to be observation based, so miss problems that are not readily seen.
- Adoption of (mostly) individual threats for individual risk assessment performances, depending on the amount of available resources, the context and “the question to be answered by means of the exercise”. This certainly requires more creativity and most notably experience on the participating contributors’ side, but will generally produce better and more holistic results in a more time-efficient way.
One essential element of the RRA is to (only + strictly) follow the latter approach (individual threats, depending on context). This means that some key players of the exercise-to-be-performed have to figure out the main threats before the proper risk assessment’s performance (usually by email) and that additional threats are not allowed later on. Again, in our perception, this is one of the critical success factors of the approach!
3. Contributing Factors
ISO 27005 (currently, that is as of 2008) defines information security risk as the
“potential that a given threat will exploit vulnerabilities of an asset […]
and thereby cause harm to the organization”.
Following this, three main factors contribute to the risk associated with a given threat:
- the threat’s potential
- the vulnerabilities to-be-exploited
- the harm caused once the threat successfully materializes.
There’s a vast consensus amongst infosec risk assessment practitioners – and this is reflected by the way the wikipedia article on “risk” explains information security risk – that it hence makes sense to work with an explicit “vulnerability factor” expressing how vulnerable an asset is in case a threat shows up, for two main reasons:
- When thinking about threats, this allows to differentiate between “external phenomena” (malware is around, hardware fails occasionally, humans make errors) and “internal conditions” (“our malware controls might be insufficient”, “we don’t have clustering of some important servers”, “our change control procedures are circumvented too often”).
- This differentiation allows for governance and steering in the phase of risk treatment (“we can’t change [the badness of] the world, but we can mitigate our vulnerability”; which then is expressed by a diminished vulnerability factor and subsequently reduced overall risk.)
This furthermore facilitates looking at some asset’s (e.g. a product’s) intrinsic properties (leading to vulnerabilities) without knowing too many details about the environment the asset is operated in.
 The wikipedia article on the subject might serve as a starting point.
 Based on AS/NZS 4360 which in turn is regarded as a major contribution to the mainstream concept of risk in the 20th century.
The definition of the term “risk” within ISO 31000 is taken from ISO GUIDE 73:2009 and it can be expected that future versions of ISO 27005 will incorporate this definition (and the underlying idea) as well.
 It should be noted that the terms (and concepts) of “risk” and “uncertainty” might dispose of some duality on their own (see [COFTA07, p. 54ff.] for a detailed discussion on this). Still, we strictly follow the ISO 73 approach here.
 Where “assessing them” is one step in “dealing with risks”.
 [COFTA07, p.29] employs the concept of a “transactional horizon” to express the inherent limitations. Furthermore see [KAHNEMANN] or [SIMON] on bounded rationality.
 ISO 31010 gives an overview of risk assessment techniques.Continue reading
Once again, in some customer environment the question of allowing/prohibiting split tunneling for (in this case: IPsec) VPN connections popped up today. Given our strict stance when it comes to “fundamental architectural security principles” the valued reader might easily imagine we’re no big fans of allowing split tunneling (term abbreviated in the following by “ST”), as this usually constitutes a severe violation of the “isolation principle”, further aggravated by the fact that this (violation) takes place on a “trust boundary” (of trusted/untrusted networks).
Still, we’re security practitioners (and not everybody has such a firm belief in the value of “fundamental architectural security principles” as we have), so we had to deal with the proponents’ arguments. In particular as one of them mentioned additional costs (in case of disallowed ST forcing all 80K VPN users’ web browsing through some centralized corporate infrastructure) of US$ 40,000,000.
[yes, you read that correctly: 40 million. I’ve still no idea where this – in my perception: crazy – number comes from]. Anyhow, how to deal with this?
Internally we performed a rapid risk assessment (RRA) focused on two main threats, that were:
a) Targeted attack against $CORP, performed by means of network backdoor access by piece of malware acting as relay point.
a) Client gets infected by drive-by malware as not being protected by (corporate) infrastructure level controls.
[for obvious reasons I’ll not provide the values derived from the exercise here]
While the former seems the main reason why NIST document SP 800-77 “Guide to IPsec VPNs” recommends _against_ allowing ST, the RRA showed that the latter overall constitutes the bigger risk. To illustrate this I’ll give some numbers from the excellent Google Research paper “All Your iFRAMEs Point to Us” which is probably the best source for data on the distribution and propagation of drive-by malware.
Here’s the paper’s abstract:
“As the web continues to play an ever increasing role in information exchange, so too is it
becoming the prevailing platform for infecting vulnerable hosts. In this paper, we provide a
detailed study of the pervasiveness of so-called drive-by downloads on the Internet. Drive-by
downloads are caused by URLs that attempt to exploit their visitors and cause malware to be
installed and run automatically. Our analysis of billions of URLs over a 10 month period shows
that a non-trivial amount, of over 3 million malicious URLs, initiate drive-by downloads. An even
more troubling finding is that approximately 1.3% of the incoming search queries to Google’s
search engine returned at least one URL labeled as malicious in the results page. We also explore
several aspects of the drive-by downloads problem. We study the relationship between the user
browsing habits and exposure to malware, the different techniques used to lure the user into the
malware distribution networks, and the different properties of these networks.”
and I’m going to quote some parts of it in the following, to line some (in the end of the day: not so) small calculations.
The authors observed that from “the top one million URLs appearing in the search engine results, about 6,000 belong to sites that have been verified as malicious at some point during our data collection. Upon closer inspection, we found that these sites appear at uniformly distributed ranks within the top million web sites—with the most popular landing page having a rank of 1,588.”
Now, how many of those (top million websites and in particular of the 0.6%) will be visited _every day_ by a 80,000 user population? And how many of those visits will presumably happen over “the split tunnel”, thus without corporate infrastructure level security controls?
Looking at the post infection impact they noticed that the “number of Windows executables downloaded after visiting a malicious URL […] [was] 8 on average, but as large as 60 in the extreme case”.
To be honest I mostly cite this opportunisticly to refer to this previous post or this one 😉
And I will easily refrain from any comment on this observation ;-):
“We subject each binary for each of the anti-virus scanners using the latest virus definitions on that day. […] The graph reveals […] an average detection rate of 70% for the best engine.”
Part of their conclusion is that the “in-depth analysis of over 66 million URLs (spanning a 10 month period) reveals that the scope of the problem is significant. For instance, we find that 1.3% of the incoming search queries to Google’s search engine return at least one link to a malicious site.”
Let’s use this for some simple math: let’s assume 16K Users being online (20% of those 80K overall VPN users) on a given day, performing only 10 Google queries per user. That’s 160K queries/day. 1.3% of that, i.e. about 2K queries will return a link to at least one malicious site. If one out of 10 users clicks on that one, that means 200 attempted infections _per day_. If you cover 70% of that by local AV, 60 attempts (the remaining 30%) succeed. This gives 60 successful infections _each day_.
So, from our perspective, deliberately allowing a large user base to circumvent corporate infrastructure security controls (only relying on some local anti-malware piece) might simply… not be a good idea…