During our ongoing research on the security of cloud service providers and cloud based applications, we performed a regular audit of our AWS account password. Thinking of popular incidents and evergreens in attack vectors, we were wondering which consequences an online bruteforce attack on our AWS password would have. So we decided to perform a bruteforce attack against our own account. Analyzing the login process of AWS, the following requirements for the bruteforce tool to be used could be derived:
HTTP 3xx support.
It turned out that it was pretty hard to find a password testing tool which fulfilled these requirements and would be able to actually handle the complex AWS login process — eventually there was none. Since we use and like Burp Suite pretty much, the Intruder suggested itself as an alternative which is straight forward to configure even though it might lack the speed and efficiency of special bruteforcing tools. Using burp’s history, we were able to identify the request which triggered the login process:
After the request is sent to the Intruder, the password field is marked
and the payloads to be used are configured.
Using exemplary payloads, it is possible to identify a successful login attempt, since it results in a redirect to the authenticated area/SSO server/whatever whereas a wrong passwords results in HTTP 200 presenting the AWS login page again:
Having this basic bruteforcing process established, the wordlist to be used must be generated. To decide which complexity should be covered, the Amazon password policy must be analyzed — if the restrictions in place deserve to be called a policy. The only restriction is that the password is between 6 and 20 characters (though the upper limit was determined regarding the maxlength field parameter when changing the password using the webfrontend, since there is no documentation about this available. Thinking of business needs, this behavior might be understandable since Amazon loses “endusers” and therefore money if their password policy is too strict). So we decided to use a wordlist which contains all passwords of 6 characters consisting of numbers (which can be generated pretty easy reactivating some old perl scripting skills: perl -le ‘printf “%06d”, $_ foreach(1..999999)’ 😉 ). Such passwords even might be pretty common when thinking of “birthday passwords”.
After performing about 400k requests, we paused the attack and searched for requests which resulted in a HTTP 302 response, just as the baseline request did.
And indeed, it was possible to bruteforce the password — which is not such a big surprise though. The bigger — and worse — surprise is, that it was still possible to login to our amazon account after performing about 2 million requests (including some dry runs) within two days originating from one single IP adress without having the account locked, being throttled down or notified in any way. And we were performing about 80k requests per hour.
Coming back to the title of the blogpost: At the moment of our investigation, there were no protection mechanisms against bruteforce attacks for the key to your datacenter — which your AWS credentials actually can be, if you are hosting a large amount of your services in EC2. Following a repsonsible disclosure policy, we contacted the AWS Security Team and got a very comprehensive answer. As we supposed, they pointed us to their MFA solution, which is basically, even though there was a major incident recently, a viable security control when authenticating users for data center access. But in addition, we had a long and beneficial dialog about potential mechanisms such as connection throttling and account locking. The outcome of our discussion is a CAPTCHA mechanism which kicks in after a brute force attempt is detected — and was also re-tested several times by our bruteforcing attempt. It was quite impressive to see that it was possible for Amazon to implement additional security measures in such a short time frame, regarding the huge size and complexity of the AWS environment. So we were really glad to get in touch with the committed AWS Security Team and were really happy to see that those guys are really into security and trying to communicate with their customers.
eye-catching title of this post, huh?
Actually there is some justification for it ;-), that is bringing this excellent document covering the exact topic to your attention.
Other than that this post contains some unordered reflections which arose in a recent meeting in a quite large organization on the “common current iPad topic” (executives would like to have/use an iPad, infosec doesn’t like the idea, business – as we all know – wins, so bring external expertise in “to help us find a way of doing this securely” yadda yadda yadda).
Which – given those nifty little boxes are _consumer_ devices which were probably never meant to process sensitive corporate data – might be a next-to-impossible task… at least in a way that satisfies business expectations as for “usability”…[btw: can anybody confirm my observation that there’s a correlation between “rigor of restriction approach” to “number of corporate emails forwarded to private webmail accounts”?]
Anyway, in that meeting – due to my usual endeavor to look at things in a structured way – I started categorizing flavors of data wiping. I came up with
a) device-induced (call it “automatic” if you want) wipe. Here the trigger (to wipe) comes from the device itself, usually after some particular condition is met, which might be
number of failed passcode entries. This is supposed to help against an opportunistic attacker who “has found an iPad somewhere” and then tries to get access. Still, assuming a 4-digit passcode, based on their distribution the attacker might have a one-in-seven chance to succeed when the number of passcodes-to-fail is set to ten (isn’t this is the default setting? I don’t use such a device so I really don’t know ;-)).
check some system parameter (“am I jailbroken?”) and then perform a wipe.This somehow raises a – let’s call it – “matrix problem”: “judge the world’s trustworthy state from the own perspective and then delete my memory if found untrustworthy”. But how can I know my decision is a correct one if my own overall (“consciousness”) state might heavily depend on the USB port I’m connected to…
phone home (“Find My iPhone” et.al.), find out “I’m lost or stolen”, quickly wipe myself.This one requires a network connection, so a skilled+motivated attacker going after the data on the device will prevent this exact (network) connection. As most of you probably already knew ;-).
b) remote-wipe. That largely overhyped feature going like “if we learn that one of our devices is lost or stolen, we’ll just push the button and, boom, all the data on the device is wiped remotely”.
Unfortunately this one requires that the organization is able to react once the state of the device changed from “trustworthy environment” to “untrustworthy environment”. Which in turn usually relies on processes involving humans, e.g. might require people to call the organization’s service desk to inform them “I just lost my iPad”… which, depending on various circumstances that I leave the reader to imagine, might happen “in close temporal proximity to the event” or not …
And, of course, a skilled+motivated attacker will prevent the network connection needed for this one, as stated above.
So, all these flavors of wiping have their own share of shortcomings or pitfalls. At some point during that discussion I silently asked myself:
“How crazy is this? why do we spend all these cycles and resources and life hours of smart people on a detective+reactive type of control?”
Why not spend all this energy on avoiding the threat in the first place by just not putting the data on those devices (which lack fundamental security properties and are highly exposed to untrustworthy human behavior and environments)?
Which directly leads to the plea expressed in my Trooperskeynote “Do not process sensitive data on smartphones!” (but use those just as display terminals to applications and data hosted in secure environments).
Yes, I know that “but then we depend on network connectivity and Ms. CxO can’t read her emails while in a plane” argument. And I’m soo tired of it. Spending so much operational effort for those few offline minutes (by pursuing the “we must have the data on the device” approach) seems just a bit of waste to me [and, btw, I’m a CxO “of company driven by innovation” myself ;-)]. Which might even be acceptable if it wouldn’t expose the organization to severe risks at the same time. And if all the effort wasn’t doomed anyway in six months… when your organizations’ executives have found yet another fancy gadget they’d like to use…
Think about it & have a great sunday,
PS: as we’re a company with quite diverse mindsets and a high degree of freedom to conduct an individual lifestyle and express individual opinions, some of my colleagues actually think data processing on those devices can be done in a reasonable secure way. See for example this workshop or wait for our upcoming newsletter on “Certificate based authentication with iPads”.
Today I’m proudly releasing the first version of apnbf, a small python script designed for enumerating valid APNs (Access Point Name) on a GTP-C speaking device. It tries to establish a new PDP session with the endpoint via sending a createPDPContextRequest. This request needs to include a valid APN, so one can easily distinguish from a valid APN (which will be answered with a createPDPContextResponse) and an invalid APN (which will be answered with an error indication message). In addition the tool also parses the error indication and displays the reason (which should be “Missing or unknown APN” in case of an invalid APN).
Don’t waste time, get the source here (5a122f198ea35b1501bc3859fd7e87aa57ef853a)
So, after having a completely new release yesterday, we will stay with already known but updated software today. You might have heard of gtp_scan before, which is a small python script for scanning mainly 3G and 4G devices and detecting GTP (GPRS Tunneling Protocol) enabled ports. As GTP is transported via UDP and we all know, UDP scanning is a pain, the tool uses the GTP build-in echo mechanism to detect GTP speaking ports. Since the last version I’ve implemented some new features:
Support of complete GTP spectrum (GTP-C, GTP-U, GTP’)
Support for scanning on SCTP
Improved result output, including validity check of response packages
Find the sources here (bbdcc8888ebb4739025395f8c1c253fa5fd2bb15).
I’m proud to announce, today a new fuzzing framework will see the light of day. It’s called dizzy and was written because the tools we used for fuzzing in past didn’t match our requirements. Some (unique) features are:
Can send to L2 as well as to upper layers (TCP/UDP/SCTP)
Ability to work with odd length packet fields (no need to match byte borders, so even single flags or 7bit long fields can be represented and fuzzed)
Very easy protocol definition syntax
Ability to do multi packet state-full fuzzing with the ability to use received target data in response.
We already had a lot of success using it, now you will be able to know the true promises.
Find the source here (c715a7ba894b44497b98659242fce52128696a17).
Some days ago a security advisory related to web application firewalls (WAFs) was published on Full Disclosure. Wendel Guglielmetti Henrique found another bug in the IBM Web Application Firewall which can be used to circumvent the WAF and execute typical web application attacks like SQL injection (click here for details). Wendel talked already (look here) at the Troopers Conference in 2009 about the different techniques to identify and bypass WAFs, so this kind of bypass methods are not quite new.
Nevertheless doing a lot of web application assessments and talking about countermeasures to protect web applications there’s a TOP 1 question I have to answer almost every time: “Wouldn’t it be helpful to install a WAF in front of our web application to protect them from attacks?”. My typical answer is “NO” because it’s better to spent the resources for addressing the problems in the code itself. So I will take this opportunity to write some rants about sense and nonsense of WAFs ;-). Let’s start with some – from our humble position – widespread myths:
1. WAFs will protect a web application from all web attacks .
2. WAFs are transparent and can’t be detected .
3. After installation of a WAF our web application is secure, no further “To Dos” .
4. WAFs are smart, so they can be used with any web application, no matter how complex it is .
5. Vulnerabilities in web applications can’t be fixed in time, only a WAF can help to reduce the attack surface.
And now let us dig a little bit deeper into these myths ;-).
1. WAFs will protect a web application from all web attacks
There are different attack detection models used by common WAFs like signature based detection, behavior based detection or a whitelist approach. These detection models are also known by attackers, so it’s not too hard to construct an attack that will pass the detection engines.
Just a simple example for signatures ;-): Studying sql injection attacks we can learn from all the examples that we can manipulate “WHERE clauses” with attacks like “or 1=1”. This is a typical signature for the detection engine of WAFs, but what about using “or 9=9” or even smarter 😉 “or 14<15”? This might sound ridiculous for most of you, but this already worked at least against one WAF 😉 and there are much more leet attacks to circumvent WAFs (sorry that we don’t disclose any vendor names, but this post is about WAFs in general).
Another point to mention are the different types of attacks against web applications, it’s not all about SQL injection and Cross-Site Scripting attacks, there also logic flaws that can be attacked or the typical privilege escalation problem “can user A access data of user B?”. A WAF can’t protect against these attacks, it a WAF can raise the bar for attackers under some circumstances, but it can’t protect a web application from skilled attackers.
2. WAFs are transparent and can’t be detected
In 2009, initially at Troopers ;-), Wendel and Sandro Gauci published a tool called wafw00f and described their approach to fingerprint WAFs in different talks at security conferences. This already proves that this myth is not true. Furthermore there will be another tool release from ERNW soon, so stay tuned, it will be available for download shortly ;-).
3. After installation of a WAF my web application is secure, no further “To Dos”
WAFs require a lot of operational effort just because web applications offer more and more functionality and the main purpose of a web application is to support the organization’s business. WAF administrators have to ensure that the WAF doesn’t block any legitimate traffic. It’s almost the same as with Intrusion Detection and Prevention Systems, they require a lot of fine tuning to detect important attacks and ensure functionality in parallel. History proves that this didn’t (and still doesn’t) work for most IDS/IPS implementations, why should it work for WAFs ;-)?
4. WAFs are smart, so they can be used with any web application, no matter how complex it is
Today’s web applications are often quite complex, they use DOM based communication, web services with encryption and very often they create a lot of dynamic content. A WAF can’t use a whitelist approach or the behavior based detection model with these complex web applications because the content changes dynamically. This reduces the options to the signature based detection model which is not as effective as many people believe (see myth No. 1).
5. Vulnerabilities in web applications can’t be fixed in time, only a WAF can help to reduce the attack surface
This is one of the most common sales arguments, because it contains a lot of reasonable arguments, but what these sales guys don’t tell is the fact, that a WAF won’t solve your problem either ;-).
Talking about risk analysis the ERNW way we have 3 contributing factors: probability, vulnerability and impact. A WAF won’t have any influance on the impact, because if the vulnerability gets exploited there’s still the same impact. Looking at probabilities with the risk analysis view, you have to take care that you don’t consider existing controls (like WAFs 😉 ) because we’re talking about the probability that someone tries to attack your web application and I think that’s pretty clear that the installation of a WAF won’t change that ;-). So there’s only the vulnerability factor left that you can change with the implementation of controls.
But me let me ask one question using the picture of the Fukushima incident: What is the better solution to protect nuclear plants from tsunamis? 1. Building a high wall around it to protect it from the water? 2. Build the nuclear plant at a place where tsunamis can’t occur?
I think the answer is obvious and it’s the same with web application vulnerabilities, if you fix them there’s no need for a WAF. If you start using a Security Development Lifecycle (SDL) you can reach this goal with reasonable effort ;-), so it’s not a matter of costs.
Clarifying these myths of web application firewalls, I think the conclusions are clear. Spend your resources for fixing the vulnerabilities in your web applications instead of buying another appliance that needs operational effort, only slightly reducing the vulnerability instead of eliminating it and also costing more money. We have quite a lot of experience supporting our customers with a SDL and from this experience we can say, that it works effectively and can be implemented more easily than many people think.
You are still not convinced ;-)? In short we will publish an ERNW Newsletter (our newsletter archive can be found here) describing techniques to detect und circumvent WAFs and also a new tool called TSAKWAF (The Swiss Army Knife for Web Application Firewalls) which implements these techniques for practical use. Maybe this will change your mind ;-).
This again is going to be a little series of posts. Their main topic – next to the usual deviations & ranting I tend to include in blogposts 😉 – is some discussion of “trust” and putting this discussion into the context of recent events and future developments in the infosec space. The title originates from a conversation between Angus Blitter and me in a nice Thai restaurant in Zurich where we figured the consequences of the latest RSA revelations. While we both expect that – unfortunately – not much is really going to happen (surprisingly many people, including some CSOs we know, are still trying to somehow downplay this or sweep it under the carpet, shying away from the – obvious – consequences it might have to accept that for a number of environments RSA SecurID is potentially reduced to single factor auth nowadays…), the long term impact on our understanding of 3rd party (e.g. vendor) trust might be more interesting. Furthermore “Broken Trust” seems a promising title for a talk at upcoming Day-Con V… 😉
Despite (or maybe due to) being an apparently essential part of human nature and a fundamental factor of our relationships and societies there is no single, concise definition of “trust”. Quite some researchers discussing trust related topics do not define the term at all and presume some “natural language” understanding. This is even more true in papers in the area of computer science (CS) and related fields, the most prominent example being probably Ken Thompson’s ” Reflections on Trusting Trust” where no definition is provided at all. Given the character and purpose of RFCs in general and their relevance for computer networks it seems an obvious course of action to look for an RFC providing a clarification. In fact the RFC 2828 defines as follows
“Trust […] Information system usage: The extent to which someone who relies on a system can have confidence that the system meets its specifications, i.e., that the system does what it claims to do and does not perform unwanted functions.”
which is usually shortened to statements like “trust = system[s] perform[s] as expected”. We don’t like this definition for a number of reasons (outside the scope of a blogpost) and prefer the definition the Italian sociologist Diego Gambetta published in his paper “Can we trust trust?” in 1988 (and which has – while originating from another discipline – gained quite some adoption in CS) that states
“trust (or, symmetrically, distrust) is a particular level of the subjective probability with which an agent assesses that another agent or group of agents will perform a particular action, both before he can monitor such action (or independently of his capacity ever to be able to monitor it) and in a context in which it affects his own action.”
With Falcone & Castelfranchi we define control as
“a (meta) action:
a) Aimed at ascertaining whether another action has been successfully executed or if a given state of the world has been realized or maintained […]
b) aimed at dealing with the possible deviations and unforeseen events in order to positively cope with them (intervention).”
It should be noted that the term “control” is used here with a much broader sense (thus the attribute “meta” in the above definition) than in some information security standard documents (e.g. the ISO 27000 family) where control is defined as a “means of managing risk, including policies, procedures, guidelines, practices or organizational structures, which can be of administrative, technical, management, or legal nature.” [ISO27002, section 2.2]
Following Cofta’s model both, trust and control, constitute ways to build confidence which he defines as
“one’s subjective probability of expectation that a certain desired event will happen (or that the undesired one will not happen), if the course of action is believed to depend on another agent”.
[I know that this sounds quite similar to Gambetta’s definition of trust but I will skip discussing such subtleties for the moment, in this post. ;-).]
Putting the elements together & bringing the whole stuff to the infosec world
Let’s face it: in the end of the day all the efforts we as security practitioners take ultimately serve a simple goal, that is somebody (be it yourself, your boss or some CxO of your organization) reaching a point where she states “considering the relevant risks and our limited resources, we’ve done what is humanly possible”. Or just “it’s ok the way it is. I can sleep well now”.
I’m well aware that this may sound strange to some readers still believing in a “concret, concise and measurable approach to information security” but this is the reality in most organizations. And the mentioned point of “it’s ok” reflects fairly precisely the above definition of confidence (with the whole of events threatening the security of an environment being “another agent”).
Now, this state of confidence can be attained on two roads, that of “control” (applying security controls and, having done so, subsequently sleeping well) or that of “trust” (reflecting on some elements of the overall picture and then refraining from the application of certain security controls, still sleeping well).
A simple example might help to understand the difference: imagine you move to a new house in another part of the world. Your family is set to arrive one week later, so you have some days left to create an environment you consider “to be safe enough” for them.
What would you do (to reach the state of confidence)? I regularly ask this question in workshops I give and the most common answers go like “install [or check] the doors & locks”, “buy a dog”, “install an alarm system”. These are typical responses for “technology driven people” and the last one, sorry guys, is – as of my humble opinion – a particularly dull one given this is a detective/reactive type of control requiring lots of the most expensive operational resource, that is human brain (namely for follow-up on alarms = incident response). Which, btw, is the very reason why it pretty much never works in a satisfying manner, in pretty much any organization.
And yes, I understand that naming this regrettable reality is against the current vogue of “you’ll get owned anyway – uh, uh APT is around – so you need elaborate detection capabilities. just sign here for our new fancy SIEM/deep packet inspection appliance/deep inspection packet appliance/revolutionary network monitoring platform” BS.
Back to our initial topic (I announced some ranting, didn’t I? ;-)): all this stuff (doors & locks, the dog, the alarm system) follow the “control approach” and, more importantly and often overlooked, they all might require quite some operational effort (key management for the doors – don’t underestimate this, especially if you have a household full of kids 😉 -, walking the dog, tuning the alarm system & as stated above: resolving the incidents one it goes off, etc.).
Another approach – at least for some threats, to some assets – could be to find out which parties are involved in the overall picture (the neighbors, the utility providers, the legal authorities and so on) and then, maybe, deciding “in this environment we have trust in (some of) those and that’s why we don’t need the full set of potential controls”. Living in a hamlet with about 40 inhabitants, in the Bavarian country side, I can tell you that the handling of doors and locks there certainly is a different one than in most European metropolises…
Some of you might argue here “nice for you, Enno, but what’s the application in corporate information security space?”. Actually that’s an easy one. Just think about it:
– Do you encrypt the MPLS links connecting your organization’s main sites? [Most of you don’t, because “we trust our carriers”. Which can be a entirely reasonable security decision, depending on your carriers… and their partners… and the partners of their partners…]
– Do you perform full database encryption for your ERP systems hosted & operated by some outsourcing provider? [Most of you don’t, trusting that provider, which again might be a fully legitimate and reasonable approach].
– Did you ever ask the company providing essential parts of your authentication infrastructure if they kept copies of your key material and, more importantly, if they did so in a sufficiently secure way? [Most of you didn’t, relying on reputation-based trust “everybody uses them, and aren’t they the inventors of that algorithm widely used for banking transactions? and isn’t ‘the military’ using this stuff, too?” or so].
So, in short: trust is a confidence-contributing element and common security instrument in all environments and – here comes the relevant message – this is fully ok. As efficient information security work (leading to confidence) relies on both approaches: trust (where justified) and control (where needed). Alas most infosec people still have a control-driven mindset, not recognizing the value of trust. [this will have to change radically in “the age of the cloud”, more on this in a later part of this series].
Unfortunately, both approaches (trust & control) have their own respective shortcomings:
– following the control road usually leads to increased operational cost & complexity and might have severe business impact.
– trust, by it’s very nature (see the “Gambetta definition” above) is something “subjective” and thereby might not be suited to base corporate security decisions on 😉
BUT – and I’m finally getting to the main point of this post 😉 – if we could transform “subjective trust” into sth documented and justified, it might become a valuable and accepted element in an organization’s infosec governance process [and, again, this will have to happen anyway, as in the age of the cloud, the control approach is doomed. and pls don’t try to tell me the “we’ll audit our cloud providers then” story. ever tried to negotiate with $YOUR_FAVORITE_IAAS_PROVIDER on data center visits or just very basic security reporting or sth.?].
Still, then the question is: What could that those “reasons for trust” be?
Evaluating trust(worthiness) in a structured way
There are various approaches to evaluate factors that contribute to the trustworthiness of another party (called “trustee” in the following) and hence the own (the “trustor’s”) trust towards that party. For example, Cofta lists three main elements, that are (the associated questions in brackets are paraphrases by myself):
Continuity (“How long will we work together?”)
Competence (“Can $TRUSTEE provide what we expect?”)
Motivation (“What’s $TRUSTEE’s motivation?”)
We, howver, usually prefer the avenue the ISECOM uses for their Certified Trust Analyst training which is roughly laid out here. It’s based on ten trust properties, two of which are not aligned with our/Gambetta’s definition of trust and are thereby omitted (these two are “control” and “offsets”, for obvious reasons. Negotiating a compensation to be paid when the trust is broken constitutes the exact opposite of trust… it can contribute to confidence, but not to trust in the above sense). So there’s eight left and these are:
Size (“Who exactly are you going to trust?”. In quite some cases this might be an interesting question. Think of carriers partnering with others in areas of the world called “emerging markets” or just think of RSA and its shareholders. And this is why background checks are performed when you apply for a job in some agency; they want to find out who you interact with in your daily life and who/what might influence your decisions.).
Symmetry (“Do they trust us?”. This again is an interesting, yet often neglected, point. I first stumbled across this when performing an MPLS carrier evaluation back in 2007).
Transparency (“How much do we know about $TRUSTEE?”).
Consistency (“What happened in the past?”. This is the exact reason why to-be-employers ask for criminal records of the to-be-employees.).
Integrity (“[How] Do we notice if $TRUSTEE changes?”).
Value of Reward (“What do we gain by trusting?” If this one has enough weight, all the others might become irrelevant. Which is exactly the mechanism Ponzi schemes are based upon. Or your CIO’s decision “to go to the cloud within the next six months” – overlooking that the departments are already extensively using AWS, “for demo systems” only, of course 😉 – or, for that matter, her (your CIO’s decision) to virtualize highly classified systems by means of VMware products ;-). See also this post of Chris Hoff on “the CxO part in the game”.).
Components: (“Which resources does $TRUSTEE rely on?”).
Porosity (“How separated is $TRUSTEE from it’s environment?”).
Asking all these questions might either help to get a better understanding who to trust & why and thereby contribute to well-informed decision taking or might at least help to identify the areas where additional controls are needed (e.g. asking for enhanced reporting to be put into the contracts).
Applying this stuff to the RSA case
So, what does all this mean when reflecting on the RSA break-in? Why exactly is RSA’s trustworthiness potentially so heavily damaged?
As a little exercise, let’s just pick some of the above questions and try to figure the respective responses. Like I did in this post three days after RSA filed the 8-K report I will leave potential conclusions to the valued reader…
Here we go:
“Size”, so who exactly are you trusting when trusting “RSA, The Security Division of EMC”? Honestly, I do not know much about RSA’s share- and stakeholders. Still, even though not regarding myself as particularly prone to conspiracy theories, I think that Sachar Paulus, the Ex-CSO of SAP and now a professor for Corporate Security and Riskmanagement at the University of Applied Sciences Brandenburg, made some interesting observations in this blogpost.
“Symmetry” (do they trust us?): everybody who had the dubious pleasure to participate in one of those – hilarious – conference calls RSA held with large customers after the initial announcement in late March, going like
“no customer data was compromised.”
“what do you mean, do you mean no seed files where compromised?”
“as we stated, no customer data, that is PII was compromised.”
“So what about the seeds?”
“as we just said, no customer data was compromised.”
“and what about the seeds?”
“we can’t comment on this further due to the ongoing investigation. but we can assure you no customer data was compromised.”
might think of an answer on his/her own…
“Transparency”: well, see above. One might add: “did they ever tell you they kept a copy of your seed files?” but, hey, you never asked them, did you? I mean, even the (US …) defense contractors use this stuff, so why should one have asked such silly questions…
“Integrity”, which the ISECOM defines as “the amount and timely notice of change within the target”. Well… do I really have to comment on “the amount [of information]” and “timely notice” RSA delivered in the last weeks & months? Some people assume we might never have known of the break-in if they’d not been obliged to file a 8-K form (as standard breach-laws might not have kicked in, given – remember – “no customer data was exposed”…) and there’s speculation we might never have known that actually the seeds were compromised if the Lockheed Martin break-in hadn’t happened. I mean, most of us were pretty sure it _was_ about the seed files, but of course, it’s easy to say so in hindsight 😉
“Components” and “porosity”: see above.
If you have been wondering “why do my guts tell me we shouldn’t trust these guys anymore?” this post might serve as a little contribution to answering this question in a structured way. Furthermore the intent was to provide some introduction to the wonderful world of trust, control and confidence and its application in the infosec world. Stay tuned for more stuff to come in this series.
Have a great sunday everybody, thanks
This is the third (and last) part of the series (parts 1 & 2 here). We’ll provide the results from some additional tests supported by public cloud services, namely AWS (Amazon Web Services).
The Amazon Elastic Compute Cloud (short: EC2) provides a flexible environment for the on demand provisioning of virtual machines of different performance levels. For our lab setup, a so-called extra large instance was used. According to Amazon, the technical specs are the following:
15 GB memory
8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
1,690 GB instance storage
I/O Performance: High
API name: m1.xlarge
Since the I/O performance of single disks had turned out to be the bottleneck in the “local” setup, eight Elastic Block Storage (short: EBS) volumes were created and attached to the instance. Each EBS volume is hosted within a specific availability zone and can be attached to instances running in the same zone. EBS volumes can be created and attached issuing two commands of the amazon ec2 command line tools. Therefore the amount of storage can be scaled up very easily. The only requirement (for our tests) is the existence of a sufficient number of EBS volumes which then contain parts of the pcap file to be analyzed.
During the benchmarks, the performance was significantly lower than with the setup described in the previous post, even though eight different EBS volumes were used to avoid the bottleneck of a single storage volume. The overall performance of the test was seemingly limited by the I/O performance restriction within virtualized instances and virtualized storage systems. Following the overall cloud computing paradigm, performance limitations of this kind might be circumvented by using multiple resources which do the processing in parallel. This could be done by using multiple instances or by using frameworks like Amazon MapReduce which are designed to process huge sets of data. Applying this approach to the analysis of pcap files, the structure of the pcap format carries some inherent problems. The format consists of a binary representation of the data which is structured by the time of the captured packets and not by logical packet traces. Therefore it would be necessary to process the complete pcap file by each instance to extract all streams to identify which streams of the file are to be analyzed by the concrete worker instance. This prevents an efficient distribution of the analysis in multiple jobs or input files. If the captured network data would be stored in separate streams instead of one big pcap file, the processing using a map/reduce algorithm would be possible and thus potentially increase scalability significantly.
That said, finally here are the results of our testing (test methodology described in earlier post):
So it took much longer to extract the data from a 500 GB file which can be attributed to the increased latency times accessing centralized storage (from a SAN/over the network) when compared to locally connected SSDs.
Hopefully this little series provided some insight for you, dear readers. We’ll publish the full technical report as an ERNW Newsletter in the next months.
Have a good one, thanks
In the first post I’ve laid out the tools and lab setup, so in this one I’m going to discuss some results.
Description of overall test methodology
To evaluate the performance of the different setups used to analyze capture data, both tcpdump and pcap_extractor (see last post) were used. For the tests, five capture files were created using mergecap. Various sample traffic dumps were merged to five large files with different file sizes. All these files consisted of several capture files containing a variety of protocols (including iSCSI and FCoE packets). Capture files of ∼40, ∼80, ∼200, ∼500, and ∼800 GB size were created and were analyzed with both tools. For all tests the filtering expressions for tcpdump and pcap_extractor were configured to search for a specific source IP and a specific destination IP matching to iSCSI packets contained in the capture file. Additionally pcap_extractor was “instructed” to look for some search string (formatted like a credit card number).To address the performance bottleneck (again, see last post), that is the I/O throughput, two different setups of the testing environment (see above) were implemented, the first one going with a raid0 approach using four SSD hard drives, the second one with four individual SSD hard drives, each of them processing only a fourth of the analyzed capture file. Standard UNIX time command was invoked to measure the time of execution. Additionally the tools analyzing the data were started with the highest possible scheduling priority to ensure execution with the maximum of available resources. This is a sample command line invoking the test:
The most interesting results table is shown below:
So actually extracting a given search string from a 500 GB file could be done in about 21 minutes, employing readily available tools and using COTS hardware for about 3K EUR (as of March 2011). This means that an attacker disposing of (large) data sets resulting from previous eavesdropping attacks will most likely succeed in getting the exact data she’s going after. Furthermore the time needed scales in a lineary fashion with the file size, so that processing a 1 TB data volume presumably would have taken ∼42 minutes, a 2 TB file would have taken ∼84 minutes and so on. In addition, SSD prices are constantly declining, too.
Thus it could be shown that the perception that the sheer volume of data gained from eavesdropping attacks on high speed links might prohibit an attacker from analyzing this data is, well, simply not correct ;-).
Risk Assessment & Mitigating Controls
Several factors come into play when trying to assess the actual risk of this type of attack. Let’s put it like this: once an attacker disposes either of physical access to a fibre at some point or is able to get into the transport path by means of certain network based attacks – which are going to be covered in another, future post – collecting and analyzing the data is an easy task. If you have sufficient reasons for trusting the party actually implementing the connection (e.g. a carrier offering Metro Ethernet services) and “the overall circumstances” you might rely on the isolation properties provided by the service and topology. In case you either don’t have sufficient reasons to trust (some discussion on approaches to “evaluate trustworthiness” can be found here or here) or in highly regulated environments, using encryption technologies on layer 2 (like these or these) might be a safer approach.
In the next post we’ll discuss the cloud based test setup, together with its results. Stay tuned &
have a great day,