Some Design Aspects of Hacking Challenges

Matthias Luft

We’re currently starting to prepare the Troopers15 PacketWars Challenge. Since I’ve participated in quite a few CTF games and have helped prepare a number of PacketWars Battles, I thought I’d write down some thoughts on the design of hacking challenges.

First of all, my experience is limited almost exclusively to attack-defend CTFs and interactive war games (such as PacketWars or CCDC). While thinking about this blog post, I also came across several terms that are used inconsistently, so I decided to give a short summary:

War games vs. CTFs: I’m not sure whether there is a clear definition, but I will use “war games” for “pentesting challenges” (i.e. systems that provide known vulnerabilities/misconfigurations), while CTFs provide specifically developed services that intentionally contain flaws.
CTF/attack-defend CTF (see also here): Every team hosts the same system, which has to offer multiple services. By discovering vulnerabilities in those services, you can attack other teams and patch your own services (examples are CIPHER or RuCTFe).
Challenge-based/jeopardy CTF: A series of challenges is provided which the participants must solve. There is no interaction between the teams, just a focus on the challenges. Of course, there are CTFs which offer both attack-defend and some “quests”.
Offensive-only war games/CTF: All teams attack the same network and have to get as far as possible/compromise as much as possible (I would also often refer to this just as a war game).

You see, there is no really clear terminology 😉

When it comes to attack-defend CTFs and war games, I would like to discuss some design aspects:

Use of exotic programming languages: CTF services are regularly written in exotic programming languages (such as an FTP service written in Ghostscript). The motivation is to give everyone a fair chance, as presumably nobody is familiar with the language and everyone has to learn it first. While this can be a valid point, usually a few people across the different teams still know the language and can easily score early, which can be frustrating for other teams and leads to large-scale disabling of the service. Hence I tend to recommend using common programming languages but including sophisticated/well-hidden/complicated vulnerabilities.
Having a clear rating scheme: Nothing is more frustrating to teams than not understanding how points are assigned. Define a clear scheme beforehand, include a certain margin for bonus points, and stick to it. And of course, document it and publish it to the teams.
Take the hacker way into account: Teams will attack everything, and that includes the central infrastructure. There have been several occasions where teams scored due to flaws in the central scoreboard; one time this even determined the winner. While this clearly is the “hacker spirit”, it is very annoying for teams which focus on the CTF and see it as a competition. Just make sure your infrastructure is secure, but you’re doing that anyway, right? 😉
Be careful with bruteforce challenges: In games with a limited/short time frame, you want to be careful with bruteforce challenges, because success depends too much on the correct choice of the key space. By now, I tend to use the easiest possible passwords, which are in every wordlist but are not tried as the first choice (potential hint for Troopers PacketWars here… 😉 )
Think of network sniffing: In attack-defend games, assume that every team is inspecting the network traffic. Most CTF teams nowadays have nice frameworks for analyzing network traffic per service, which makes extracting other teams’ exploits trivial. If possible, think about ways to make this more difficult.
Shared vs. dedicated systems: In war games where teams attack central systems, you need to make a sound decision between shared systems and a dedicated system for each team. This also depends heavily on your scoring system: If you only want the first successful team to score, there is no need for dedicated systems. However, if further challenges rely on e.g. hints stored on the systems, you definitely want dedicated systems because the first successful team will delete those.
Use stable systems: This might sound trivial, but use systems that can handle a certain load. This is usually not a problem for standard (operating) systems, but we tried to include embedded devices in the past (such as a camera 😉 ), and these devices usually crash on every other packet, which is very frustrating for the participants and also introduces a certain randomness as to which team is successful.
Provide transparency: hcesperer states that debug output from central game servers should not be available to the participants, which is a perfectly valid point since it can give away additional information. However, depending on the complexity of certain services, debug output from test scripts can be incredibly helpful for determining what is wrong with your service. RuCTFe has done this in the past, and from a participant’s perspective, it was great.
Balanced challenges: CTFs are for everyone, so you want to provide services and challenges for different skill levels. Even if you think a certain vulnerability is boring, it might make someone’s day to score a few points. A great way to achieve this is to have challenges in which multiple staged vulnerabilities lead to the overall goal.
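
To make the point about a clear rating scheme a bit more concrete, here is a minimal sketch of what a published, deterministic scoring function could look like. All names and weights (ATTACK_POINTS, DEFENSE_POINTS, BONUS_CAP, TeamResult) are made up for illustration and are not taken from any real game; the only idea it tries to show is that the bonus margin is capped and every other point follows from a documented formula.

```python
from dataclasses import dataclass

# Hypothetical weights; a real game would publish its own values beforehand.
ATTACK_POINTS = 10   # per flag captured from another team
DEFENSE_POINTS = 5   # per round in which the team lost no flag
BONUS_CAP = 50       # upper bound for discretionary bonus points


@dataclass
class TeamResult:
    flags_captured: int
    rounds_defended: int
    bonus_requested: int = 0  # what the game masters would like to award


def score(result: TeamResult) -> int:
    """Return the team's points under the (assumed) published scheme."""
    # Clamp the bonus into [0, BONUS_CAP] so it cannot dominate the ranking.
    bonus = max(0, min(result.bonus_requested, BONUS_CAP))
    return (result.flags_captured * ATTACK_POINTS
            + result.rounds_defended * DEFENSE_POINTS
            + bonus)


if __name__ == "__main__":
    print(score(TeamResult(flags_captured=3, rounds_defended=4, bonus_requested=80)))
```

Because the function has no hidden state, teams could recompute their own score from the published weights, which is exactly the kind of transparency argued for above.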

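The bruteforce point can also be checked with some quick arithmetic before the game: given where the chosen password sits in a wordlist and how many guesses per second the target service tolerates, how long would a team need to reach it? The sketch below assumes a purely sequential attack; the function names, the 25% time budget, and the example numbers are hypothetical and only illustrate the calculation.

```python
def seconds_until_hit(position: int, attempts_per_second: float) -> float:
    """Time a sequential wordlist attack needs to reach the entry at `position` (1-based)."""
    if position < 1:
        raise ValueError("position must be >= 1")
    if attempts_per_second <= 0:
        raise ValueError("attempts_per_second must be positive")
    return position / attempts_per_second


def fits_in_game(position: int, attempts_per_second: float,
                 game_seconds: float, budget_fraction: float = 0.25) -> bool:
    """True if the password is reachable within the assumed share of the game time."""
    return seconds_until_hit(position, attempts_per_second) <= game_seconds * budget_fraction


if __name__ == "__main__":
    four_hours = 4 * 3600
    # Entry 5,000 at 10 guesses/s is reached after 500 s.
    print(seconds_until_hit(5_000, 10))
    print(fits_in_game(5_000, 10, four_hours))
    # Entry 1,000,000 at the same rate would need 100,000 s, far beyond the game.
    print(fits_in_game(1_000_000, 10, four_hours))
```

If the check fails for the wordlists teams are likely to bring, the password is probably too deep for a short game.
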
I hope the points made above are helpful to some, but I’m sure there are many more design aspects of hacking challenges, so I would really appreciate your comments!

so long,
Matthias

References:
http://www.cipher-ctf.org/CaptureTheFlag.php
https://ctftime.org/
http://www.sans.org/reading-room/whitepapers/casestudies/capture-flag-education-mentoring-33018
http://ctf.hcesperer.org/talks/wst.pdf
http://ctf.hcesperer.org/
http://ppp.cylab.cmu.edu/wordpress/?p=1182
