I’m proud to announce, today a new fuzzing framework will see the light of day. It’s called dizzy and was written because the tools we used for fuzzing in past didn’t match our requirements. Some (unique) features are:
Python based
Fast!
Can send to L2 as well as to upper layers (TCP/UDP/SCTP)
Ability to work with odd length packet fields (no need to match byte borders, so even single flags or 7bit long fields can be represented and fuzzed)
Very easy protocol definition syntax
Ability to do multi packet state-full fuzzing with the ability to use received target data in response.
We already had a lot of success using it, now you will be able to know the true promises.
Find the source here (c715a7ba894b44497b98659242fce52128696a17).
Some days ago a security advisory related to web application firewalls (WAFs) was published on Full Disclosure. Wendel Guglielmetti Henrique found another bug in the IBM Web Application Firewall which can be used to circumvent the WAF and execute typical web application attacks like SQL injection (click here for details). Wendel talked already (look here) at the Troopers Conference in 2009 about the different techniques to identify and bypass WAFs, so this kind of bypass methods are not quite new.
Nevertheless doing a lot of web application assessments and talking about countermeasures to protect web applications there’s a TOP 1 question I have to answer almost every time: “Wouldn’t it be helpful to install a WAF in front of our web application to protect them from attacks?”. My typical answer is “NO” because it’s better to spent the resources for addressing the problems in the code itself. So I will take this opportunity to write some rants about sense and nonsense of WAFs ;-). Let’s start with some – from our humble position – widespread myths:
1. WAFs will protect a web application from all web attacks .
2. WAFs are transparent and can’t be detected .
3. After installation of a WAF our web application is secure, no further “To Dos” .
4. WAFs are smart, so they can be used with any web application, no matter how complex it is .
5. Vulnerabilities in web applications can’t be fixed in time, only a WAF can help to reduce the attack surface.
And now let us dig a little bit deeper into these myths ;-).
1. WAFs will protect a web application from all web attacks
There are different attack detection models used by common WAFs like signature based detection, behavior based detection or a whitelist approach. These detection models are also known by attackers, so it’s not too hard to construct an attack that will pass the detection engines.
Just a simple example for signatures ;-): Studying sql injection attacks we can learn from all the examples that we can manipulate “WHERE clauses” with attacks like “or 1=1”. This is a typical signature for the detection engine of WAFs, but what about using “or 9=9” or even smarter 😉 “or 14<15”? This might sound ridiculous for most of you, but this already worked at least against one WAF 😉 and there are much more leet attacks to circumvent WAFs (sorry that we don’t disclose any vendor names, but this post is about WAFs in general).
Another point to mention are the different types of attacks against web applications, it’s not all about SQL injection and Cross-Site Scripting attacks, there also logic flaws that can be attacked or the typical privilege escalation problem “can user A access data of user B?”. A WAF can’t protect against these attacks, it a WAF can raise the bar for attackers under some circumstances, but it can’t protect a web application from skilled attackers.
2. WAFs are transparent and can’t be detected
In 2009, initially at Troopers ;-), Wendel and Sandro Gauci published a tool called wafw00f and described their approach to fingerprint WAFs in different talks at security conferences. This already proves that this myth is not true. Furthermore there will be another tool release from ERNW soon, so stay tuned, it will be available for download shortly ;-).
3. After installation of a WAF my web application is secure, no further “To Dos”
WAFs require a lot of operational effort just because web applications offer more and more functionality and the main purpose of a web application is to support the organization’s business. WAF administrators have to ensure that the WAF doesn’t block any legitimate traffic. It’s almost the same as with Intrusion Detection and Prevention Systems, they require a lot of fine tuning to detect important attacks and ensure functionality in parallel. History proves that this didn’t (and still doesn’t) work for most IDS/IPS implementations, why should it work for WAFs ;-)?
4. WAFs are smart, so they can be used with any web application, no matter how complex it is
Today’s web applications are often quite complex, they use DOM based communication, web services with encryption and very often they create a lot of dynamic content. A WAF can’t use a whitelist approach or the behavior based detection model with these complex web applications because the content changes dynamically. This reduces the options to the signature based detection model which is not as effective as many people believe (see myth No. 1).
5. Vulnerabilities in web applications can’t be fixed in time, only a WAF can help to reduce the attack surface
This is one of the most common sales arguments, because it contains a lot of reasonable arguments, but what these sales guys don’t tell is the fact, that a WAF won’t solve your problem either ;-).
Talking about risk analysis the ERNW way we have 3 contributing factors: probability, vulnerability and impact. A WAF won’t have any influance on the impact, because if the vulnerability gets exploited there’s still the same impact. Looking at probabilities with the risk analysis view, you have to take care that you don’t consider existing controls (like WAFs 😉 ) because we’re talking about the probability that someone tries to attack your web application and I think that’s pretty clear that the installation of a WAF won’t change that ;-). So there’s only the vulnerability factor left that you can change with the implementation of controls.
But me let me ask one question using the picture of the Fukushima incident: What is the better solution to protect nuclear plants from tsunamis? 1. Building a high wall around it to protect it from the water? 2. Build the nuclear plant at a place where tsunamis can’t occur?
I think the answer is obvious and it’s the same with web application vulnerabilities, if you fix them there’s no need for a WAF. If you start using a Security Development Lifecycle (SDL) you can reach this goal with reasonable effort ;-), so it’s not a matter of costs.
Clarifying these myths of web application firewalls, I think the conclusions are clear. Spend your resources for fixing the vulnerabilities in your web applications instead of buying another appliance that needs operational effort, only slightly reducing the vulnerability instead of eliminating it and also costing more money. We have quite a lot of experience supporting our customers with a SDL and from this experience we can say, that it works effectively and can be implemented more easily than many people think.
You are still not convinced ;-)? In short we will publish an ERNW Newsletter (our newsletter archive can be found here) describing techniques to detect und circumvent WAFs and also a new tool called TSAKWAF (The Swiss Army Knife for Web Application Firewalls) which implements these techniques for practical use. Maybe this will change your mind ;-).
This again is going to be a little series of posts. Their main topic – next to the usual deviations & ranting I tend to include in blogposts 😉 – is some discussion of “trust” and putting this discussion into the context of recent events and future developments in the infosec space. The title originates from a conversation between Angus Blitter and me in a nice Thai restaurant in Zurich where we figured the consequences of the latest RSA revelations. While we both expect that – unfortunately – not much is really going to happen (surprisingly many people, including some CSOs we know, are still trying to somehow downplay this or sweep it under the carpet, shying away from the – obvious – consequences it might have to accept that for a number of environments RSA SecurID is potentially reduced to single factor auth nowadays…), the long term impact on our understanding of 3rd party (e.g. vendor) trust might be more interesting. Furthermore “Broken Trust” seems a promising title for a talk at upcoming Day-Con V… 😉
Despite (or maybe due to) being an apparently essential part of human nature and a fundamental factor of our relationships and societies there is no single, concise definition of “trust”. Quite some researchers discussing trust related topics do not define the term at all and presume some “natural language” understanding. This is even more true in papers in the area of computer science (CS) and related fields, the most prominent example being probably Ken Thompson’s ” Reflections on Trusting Trust” where no definition is provided at all. Given the character and purpose of RFCs in general and their relevance for computer networks it seems an obvious course of action to look for an RFC providing a clarification. In fact the RFC 2828 defines as follows
“Trust […] Information system usage: The extent to which someone who relies on a system can have confidence that the system meets its specifications, i.e., that the system does what it claims to do and does not perform unwanted functions.”
which is usually shortened to statements like “trust = system[s] perform[s] as expected”. We don’t like this definition for a number of reasons (outside the scope of a blogpost) and prefer the definition the Italian sociologist Diego Gambetta published in his paper “Can we trust trust?” in 1988 (and which has – while originating from another discipline – gained quite some adoption in CS) that states
“trust (or, symmetrically, distrust) is a particular level of the subjective probability with which an agent assesses that another agent or group of agents will perform a particular action, both before he can monitor such action (or independently of his capacity ever to be able to monitor it) and in a context in which it affects his own action.”
With Falcone & Castelfranchi we define control as
“a (meta) action:
a) Aimed at ascertaining whether another action has been successfully executed or if a given state of the world has been realized or maintained […]
b) aimed at dealing with the possible deviations and unforeseen events in order to positively cope with them (intervention).”
It should be noted that the term “control” is used here with a much broader sense (thus the attribute “meta” in the above definition) than in some information security standard documents (e.g. the ISO 27000 family) where control is defined as a “means of managing risk, including policies, procedures, guidelines, practices or organizational structures, which can be of administrative, technical, management, or legal nature.” [ISO27002, section 2.2]
Following Cofta’s model both, trust and control, constitute ways to build confidence which he defines as
“one’s subjective probability of expectation that a certain desired event will happen (or that the undesired one will not happen), if the course of action is believed to depend on another agent”.
[I know that this sounds quite similar to Gambetta’s definition of trust but I will skip discussing such subtleties for the moment, in this post. ;-).]
Putting the elements together & bringing the whole stuff to the infosec world
Let’s face it: in the end of the day all the efforts we as security practitioners take ultimately serve a simple goal, that is somebody (be it yourself, your boss or some CxO of your organization) reaching a point where she states “considering the relevant risks and our limited resources, we’ve done what is humanly possible”. Or just “it’s ok the way it is. I can sleep well now”.
I’m well aware that this may sound strange to some readers still believing in a “concret, concise and measurable approach to information security” but this is the reality in most organizations. And the mentioned point of “it’s ok” reflects fairly precisely the above definition of confidence (with the whole of events threatening the security of an environment being “another agent”).
Now, this state of confidence can be attained on two roads, that of “control” (applying security controls and, having done so, subsequently sleeping well) or that of “trust” (reflecting on some elements of the overall picture and then refraining from the application of certain security controls, still sleeping well).
A simple example might help to understand the difference: imagine you move to a new house in another part of the world. Your family is set to arrive one week later, so you have some days left to create an environment you consider “to be safe enough” for them.
What would you do (to reach the state of confidence)? I regularly ask this question in workshops I give and the most common answers go like “install [or check] the doors & locks”, “buy a dog”, “install an alarm system”. These are typical responses for “technology driven people” and the last one, sorry guys, is – as of my humble opinion – a particularly dull one given this is a detective/reactive type of control requiring lots of the most expensive operational resource, that is human brain (namely for follow-up on alarms = incident response). Which, btw, is the very reason why it pretty much never works in a satisfying manner, in pretty much any organization.
And yes, I understand that naming this regrettable reality is against the current vogue of “you’ll get owned anyway – uh, uh APT is around – so you need elaborate detection capabilities. just sign here for our new fancy SIEM/deep packet inspection appliance/deep inspection packet appliance/revolutionary network monitoring platform” BS.
Back to our initial topic (I announced some ranting, didn’t I? ;-)): all this stuff (doors & locks, the dog, the alarm system) follow the “control approach” and, more importantly and often overlooked, they all might require quite some operational effort (key management for the doors – don’t underestimate this, especially if you have a household full of kids 😉 -, walking the dog, tuning the alarm system & as stated above: resolving the incidents one it goes off, etc.).
Another approach – at least for some threats, to some assets – could be to find out which parties are involved in the overall picture (the neighbors, the utility providers, the legal authorities and so on) and then, maybe, deciding “in this environment we have trust in (some of) those and that’s why we don’t need the full set of potential controls”. Living in a hamlet with about 40 inhabitants, in the Bavarian country side, I can tell you that the handling of doors and locks there certainly is a different one than in most European metropolises…
Some of you might argue here “nice for you, Enno, but what’s the application in corporate information security space?”. Actually that’s an easy one. Just think about it:
– Do you encrypt the MPLS links connecting your organization’s main sites? [Most of you don’t, because “we trust our carriers”. Which can be a entirely reasonable security decision, depending on your carriers… and their partners… and the partners of their partners…]
– Do you perform full database encryption for your ERP systems hosted & operated by some outsourcing provider? [Most of you don’t, trusting that provider, which again might be a fully legitimate and reasonable approach].
– Did you ever ask the company providing essential parts of your authentication infrastructure if they kept copies of your key material and, more importantly, if they did so in a sufficiently secure way? [Most of you didn’t, relying on reputation-based trust “everybody uses them, and aren’t they the inventors of that algorithm widely used for banking transactions? and isn’t ‘the military’ using this stuff, too?” or so].
So, in short: trust is a confidence-contributing element and common security instrument in all environments and – here comes the relevant message – this is fully ok. As efficient information security work (leading to confidence) relies on both approaches: trust (where justified) and control (where needed). Alas most infosec people still have a control-driven mindset, not recognizing the value of trust. [this will have to change radically in “the age of the cloud”, more on this in a later part of this series].
Unfortunately, both approaches (trust & control) have their own respective shortcomings:
– following the control road usually leads to increased operational cost & complexity and might have severe business impact.
– trust, by it’s very nature (see the “Gambetta definition” above) is something “subjective” and thereby might not be suited to base corporate security decisions on 😉
BUT – and I’m finally getting to the main point of this post 😉 – if we could transform “subjective trust” into sth documented and justified, it might become a valuable and accepted element in an organization’s infosec governance process [and, again, this will have to happen anyway, as in the age of the cloud, the control approach is doomed. and pls don’t try to tell me the “we’ll audit our cloud providers then” story. ever tried to negotiate with $YOUR_FAVORITE_IAAS_PROVIDER on data center visits or just very basic security reporting or sth.?].
Still, then the question is: What could that those “reasons for trust” be?
Evaluating trust(worthiness) in a structured way
There are various approaches to evaluate factors that contribute to the trustworthiness of another party (called “trustee” in the following) and hence the own (the “trustor’s”) trust towards that party. For example, Cofta lists three main elements, that are (the associated questions in brackets are paraphrases by myself):
Continuity (“How long will we work together?”)
Competence (“Can $TRUSTEE provide what we expect?”)
Motivation (“What’s $TRUSTEE’s motivation?”)
We, howver, usually prefer the avenue the ISECOM uses for their Certified Trust Analyst training which is roughly laid out here. It’s based on ten trust properties, two of which are not aligned with our/Gambetta’s definition of trust and are thereby omitted (these two are “control” and “offsets”, for obvious reasons. Negotiating a compensation to be paid when the trust is broken constitutes the exact opposite of trust… it can contribute to confidence, but not to trust in the above sense). So there’s eight left and these are:
Size (“Who exactly are you going to trust?”. In quite some cases this might be an interesting question. Think of carriers partnering with others in areas of the world called “emerging markets” or just think of RSA and its shareholders. And this is why background checks are performed when you apply for a job in some agency; they want to find out who you interact with in your daily life and who/what might influence your decisions.).
Symmetry (“Do they trust us?”. This again is an interesting, yet often neglected, point. I first stumbled across this when performing an MPLS carrier evaluation back in 2007).
Transparency (“How much do we know about $TRUSTEE?”).
Consistency (“What happened in the past?”. This is the exact reason why to-be-employers ask for criminal records of the to-be-employees.).
Integrity (“[How] Do we notice if $TRUSTEE changes?”).
Value of Reward (“What do we gain by trusting?” If this one has enough weight, all the others might become irrelevant. Which is exactly the mechanism Ponzi schemes are based upon. Or your CIO’s decision “to go to the cloud within the next six months” – overlooking that the departments are already extensively using AWS, “for demo systems” only, of course 😉 – or, for that matter, her (your CIO’s decision) to virtualize highly classified systems by means of VMware products ;-). See also this post of Chris Hoff on “the CxO part in the game”.).
Components: (“Which resources does $TRUSTEE rely on?”).
Porosity (“How separated is $TRUSTEE from it’s environment?”).
Asking all these questions might either help to get a better understanding who to trust & why and thereby contribute to well-informed decision taking or might at least help to identify the areas where additional controls are needed (e.g. asking for enhanced reporting to be put into the contracts).
Applying this stuff to the RSA case
So, what does all this mean when reflecting on the RSA break-in? Why exactly is RSA’s trustworthiness potentially so heavily damaged?
As a little exercise, let’s just pick some of the above questions and try to figure the respective responses. Like I did in this post three days after RSA filed the 8-K report I will leave potential conclusions to the valued reader…
Here we go:
“Size”, so who exactly are you trusting when trusting “RSA, The Security Division of EMC”? Honestly, I do not know much about RSA’s share- and stakeholders. Still, even though not regarding myself as particularly prone to conspiracy theories, I think that Sachar Paulus, the Ex-CSO of SAP and now a professor for Corporate Security and Riskmanagement at the University of Applied Sciences Brandenburg, made some interesting observations in this blogpost.
“Symmetry” (do they trust us?): everybody who had the dubious pleasure to participate in one of those – hilarious – conference calls RSA held with large customers after the initial announcement in late March, going like
“no customer data was compromised.”
“what do you mean, do you mean no seed files where compromised?”
“as we stated, no customer data, that is PII was compromised.”
“So what about the seeds?”
“as we just said, no customer data was compromised.”
“and what about the seeds?”
“we can’t comment on this further due to the ongoing investigation. but we can assure you no customer data was compromised.”
might think of an answer on his/her own…
“Transparency”: well, see above. One might add: “did they ever tell you they kept a copy of your seed files?” but, hey, you never asked them, did you? I mean, even the (US …) defense contractors use this stuff, so why should one have asked such silly questions…
“Integrity”, which the ISECOM defines as “the amount and timely notice of change within the target”. Well… do I really have to comment on “the amount [of information]” and “timely notice” RSA delivered in the last weeks & months? Some people assume we might never have known of the break-in if they’d not been obliged to file a 8-K form (as standard breach-laws might not have kicked in, given – remember – “no customer data was exposed”…) and there’s speculation we might never have known that actually the seeds were compromised if the Lockheed Martin break-in hadn’t happened. I mean, most of us were pretty sure it _was_ about the seed files, but of course, it’s easy to say so in hindsight 😉
“Components” and “porosity”: see above.
Conclusions
If you have been wondering “why do my guts tell me we shouldn’t trust these guys anymore?” this post might serve as a little contribution to answering this question in a structured way. Furthermore the intent was to provide some introduction to the wonderful world of trust, control and confidence and its application in the infosec world. Stay tuned for more stuff to come in this series.
Have a great sunday everybody, thanks
This is the third (and last) part of the series (parts 1 & 2 here). We’ll provide the results from some additional tests supported by public cloud services, namely AWS (Amazon Web Services).
Lab Setup
The Amazon Elastic Compute Cloud (short: EC2) provides a flexible environment for the on demand provisioning of virtual machines of different performance levels. For our lab setup, a so-called extra large instance was used. According to Amazon, the technical specs are the following:
15 GB memory
8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
1,690 GB instance storage
64-bit platform
I/O Performance: High
API name: m1.xlarge
Since the I/O performance of single disks had turned out to be the bottleneck in the “local” setup, eight Elastic Block Storage (short: EBS) volumes were created and attached to the instance. Each EBS volume is hosted within a specific availability zone and can be attached to instances running in the same zone. EBS volumes can be created and attached issuing two commands of the amazon ec2 command line tools. Therefore the amount of storage can be scaled up very easily. The only requirement (for our tests) is the existence of a sufficient number of EBS volumes which then contain parts of the pcap file to be analyzed.
Results
During the benchmarks, the performance was significantly lower than with the setup described in the previous post, even though eight different EBS volumes were used to avoid the bottleneck of a single storage volume. The overall performance of the test was seemingly limited by the I/O performance restriction within virtualized instances and virtualized storage systems. Following the overall cloud computing paradigm, performance limitations of this kind might be circumvented by using multiple resources which do the processing in parallel. This could be done by using multiple instances or by using frameworks like Amazon MapReduce which are designed to process huge sets of data. Applying this approach to the analysis of pcap files, the structure of the pcap format carries some inherent problems. The format consists of a binary representation of the data which is structured by the time of the captured packets and not by logical packet traces. Therefore it would be necessary to process the complete pcap file by each instance to extract all streams to identify which streams of the file are to be analyzed by the concrete worker instance. This prevents an efficient distribution of the analysis in multiple jobs or input files. If the captured network data would be stored in separate streams instead of one big pcap file, the processing using a map/reduce algorithm would be possible and thus potentially increase scalability significantly.
That said, finally here are the results of our testing (test methodology described in earlier post):
So it took much longer to extract the data from a 500 GB file which can be attributed to the increased latency times accessing centralized storage (from a SAN/over the network) when compared to locally connected SSDs.
Hopefully this little series provided some insight for you, dear readers. We’ll publish the full technical report as an ERNW Newsletter in the next months.
Have a good one, thanks
In the first post I’ve laid out the tools and lab setup, so in this one I’m going to discuss some results.
Description of overall test methodology
To evaluate the performance of the different setups used to analyze capture data, both tcpdump and pcap_extractor (see last post) were used. For the tests, five capture files were created using mergecap. Various sample traffic dumps were merged to five large files with different file sizes. All these files consisted of several capture files containing a variety of protocols (including iSCSI and FCoE packets). Capture files of ∼40, ∼80, ∼200, ∼500, and ∼800 GB size were created and were analyzed with both tools. For all tests the filtering expressions for tcpdump and pcap_extractor were configured to search for a specific source IP and a specific destination IP matching to iSCSI packets contained in the capture file. Additionally pcap_extractor was “instructed” to look for some search string (formatted like a credit card number).To address the performance bottleneck (again, see last post), that is the I/O throughput, two different setups of the testing environment (see above) were implemented, the first one going with a raid0 approach using four SSD hard drives, the second one with four individual SSD hard drives, each of them processing only a fourth of the analyzed capture file. Standard UNIX time command was invoked to measure the time of execution. Additionally the tools analyzing the data were started with the highest possible scheduling priority to ensure execution with the maximum of available resources. This is a sample command line invoking the test:
The most interesting results table is shown below:
Conclusions
So actually extracting a given search string from a 500 GB file could be done in about 21 minutes, employing readily available tools and using COTS hardware for about 3K EUR (as of March 2011). This means that an attacker disposing of (large) data sets resulting from previous eavesdropping attacks will most likely succeed in getting the exact data she’s going after. Furthermore the time needed scales in a lineary fashion with the file size, so that processing a 1 TB data volume presumably would have taken ∼42 minutes, a 2 TB file would have taken ∼84 minutes and so on. In addition, SSD prices are constantly declining, too.
Thus it could be shown that the perception that the sheer volume of data gained from eavesdropping attacks on high speed links might prohibit an attacker from analyzing this data is, well, simply not correct ;-).
Risk Assessment & Mitigating Controls
Several factors come into play when trying to assess the actual risk of this type of attack. Let’s put it like this: once an attacker disposes either of physical access to a fibre at some point or is able to get into the transport path by means of certain network based attacks – which are going to be covered in another, future post – collecting and analyzing the data is an easy task. If you have sufficient reasons for trusting the party actually implementing the connection (e.g. a carrier offering Metro Ethernet services) and “the overall circumstances” you might rely on the isolation properties provided by the service and topology. In case you either don’t have sufficient reasons to trust (some discussion on approaches to “evaluate trustworthiness” can be found here or here) or in highly regulated environments, using encryption technologies on layer 2 (like these or these) might be a safer approach.
In the next post we’ll discuss the cloud based test setup, together with its results. Stay tuned &
have a great day,
As I’m currently developing the ‘next gen’ state-full fuzzing framework @ERNW [called dizzy, to be released soon 😉 ], I will give you an updated set of fuzzing scripts from the ‘old world’.
Some of you will remember the 2008 release of sulley_l2, which was a modified version of the sulley fuzzing framework, enhanced with Layer 2 sending capabilities and a hole bunch of (L2) fuzzing scripts. All the blinking, rebooting, mem-corrupting ciscos gave us some attention. Back from then, we continued to write and use the fuzzing scripts, so the hole collection grew.
There is a common misconception that the sheer amount of data coupled with multiplexed channels (e.g. WDM technology) make successful eavesdropping attacks on high speed Ethernet links – like those connecting data centers – highly unlikely. This is mainly based on the assumption that the amount of resources (e.g. RAM, [sufficiently fast] storage or CPU power) needed to process large files of captured data is a limiting factor. However, to the best of our knowledge, no practical evaluation of these assumptions has so far been performed.
Therefore we conducted some research and started writing a paper (to be released as a technical report shortly) that aims to answer the following questions:
– Can the processing of large amounts of captured data be done “in a feasible way” ?
– How much time and which type of hardware is needed to perform this task?
– Can this be done with readily available tools or is custom code helpful or even required? If so, how should that code operate?
– Can this task be facilitated by means of public cloud services?
We performed a number of tests with files of different sizes and entropy. Tests were both carried out with different sets of dedicated hardware and by means of public cloud services. The paper describes the tools used, the various test setups and, of course, the results. A final section includes some conclusions derived from the insights provided by the test sets.It is assumed that an attacker has already gained access enabling her to eavesdrop on the high speed data link. A detailed description how this can be done can be found e.g. here or here. The focus of our paper is on the subsequent extraction of useful data from the resulting dump file. It is further assumed the collected data is available in standard pcap format.
We’ll summarize some of the stuff in a series of three blog posts, each discussing certain aspects of the overall research task. In the first one we’ll describe the tools and hardware used. In the second we’ll give the results from the test lab with our hardware while the third part describes the tests performed in the (AWS) cloud and provides the conclusions. Furthermore we’ll give a presentation of the results, including a demo (probably the extraction of credit card information from a file with the size 500 GB which roughly equates to a live migration of 16 virtual machines with 32 GB RAM each) at the Infoguard Security Lounge taking place on 8th of June in Zug/Switzerland.
Last but not least before it get’s technical: the majority of the work was performed by Daniel, Hendrik and Matthias. I myself had mostly a “supervisor role” 😉 So kudos to them!
COTS packet analysis tools
A number of tests utilizing available command-line tools (tethereal, tshark, tcpdump and the like) were performed. It turned out that, performance-wise, “classic” tcpdump showed the most promising results. During the following, in-depth testing phase two problems of tcpdump showed up:
– It’s single-threaded so it can’t use multiple processors of a system (for parallel processing). Given the actual bottlenecks to be related with I/O anyway (see below) this was not regarded to be major problem.
– Standard pcap filters do not allow for “keyword search” which somehow limits the attack scenarios (attacker might not be able to search for credit card numbers, user names etc. but would have to perform an IP parameter based search first and then hand over to another tool which might cause an unacceptable delay in the overall analysis process). To address this limitation Daniel wrote a small piece of code that we – not having found an elegant name like Loki so far 😉 – called pcap_extractor.
Pcap_extractor
This is basically the fastest possible implementation of a pcap file reader. It opens a libpcap file handle for the designated input file, applies a libpcap filter to it and loops through all the filter matching packets, writing them to an output pcap file. Contrary to tcpdump and most other libpcap based analysis tools, it provides the possibility to search for a given string inside the matching packets, for example a credit card number or a username. If such a search string is applied, only packets matching the libpcap filter and containing the search string are written to the output file.A call to search a pcap file for iSCSI packets which contain a certain credit card number and write them to the output file would then look like:
The source code of pcap_extractor can be downloaded here.
Identifying the bottleneck(s)
While measuring the performance of multiple pcap analysis tools the profiling of system calls indicated that the tools spend between 85% and 98% of the search time on waiting for I/O. In case of the fastest tool that means 98% of the time the tool does nothing, but waiting for dump data. So the I/O bandwidth turned out to be the major bottleneck in the initial test setups.
Actual lab setup
The final test system was designed to provide as much I/O bandwidth as possible and was composed of:
Intel Core i7-990X Extreme Edition, 6x 3.46 GHz
12GB (3 * 4GB) DDR3 1600MHz, PC3-12800
ASRock X58 Extreme6 S1366 mainboard
4 * Intel 510 Series Elm Crest SSD 250GB
The mainboard and the SSDs were chosen to support SATA3 with a theoretical maximal I/O bandwidth of 6 Gbit/s. FreeBSD was used as operating system.
===
In this post we’ve “prepared the battle ground” (as for the tools and hardware to be used) for the actual testing, in the next one we’ll discuss the results. Stay tuned & have a great day
Recently Micele and I were researching for our talk about the current state of security of Multifunction Devices (MFDs). Since we’re both seasoned pentesters who are quite familar with MFDs, we were really surprised that very little new research is going on on the topic of MFD security. While diving deeper into the topic, we found a very simple explanation for this: As in 2002, it is still possible to download print or scan jobs using PJL, many devices still offer default FTP or Telnet access, and, of course, stored files can be recovered from MFD hard drives — on an enterprise wide scale. To even strengthen our impression of the current state of MFD security, most devices crashed or did go wild while performing some scans — and we do not talk about fuzzing here.
This devastating result lead to the question how MFDs can be secured. Since there are a lot of MFD hardening resources out there, even from vendors, we decided to put together a comprehensive hardening guide for MFDs. To raise the level of awareness, we put together a lot of examples on attacks on MFDs and then focused on the development of our own MFD security guide which is based on the seven sisters. The result of this approach can be found here. And of course, soon there will be a ERNW newsletter to cover this topic in a more academic and structured way 😉
Lots of stuff has been written about this blog post from RSA describing the (potential) details of the attack, so I will refrain from detailed comments on this piece that Marsh Ray nicely called “some of the most egregious hyperbole I’ve read in infosec”.
Just one short note. Presumably the attack, in an early stage, used a “spreadsheet [that] contained a zero-day exploit that installs a backdoor through an Adobe Flash vulnerability (CVE-2011-0609)”.