This is a guest post from Jose Miguel Esparza (@EternalTodo)
There are already some good blog posts about this exploit, but I think it is a really good example to show how peepdf works and what you can learn if you attend the workshop “Squeezing Exploit Kits and PDF Exploits” at Troopers14. The exploit used the Adobe Reader ToolButton Use-After-Free vulnerability to execute code on the victim’s machine, and then the Windows privilege escalation 0day to bypass the Adobe sandbox and execute a new payload without restrictions.
From this point, we can use the command “js_analyse” to try to emulate the code and extract the escaped bytes automatically, or use the command “js_unescape” to manually unescape the shellcode and ROP chains if necessary. I will show the result of executing “js_analyse”, storing the shellcode in a variable and showing its content later:
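For readers curious about what the manual unescaping step involves, here is a minimal Python sketch. It is not peepdf's implementation; the function name `unescape_to_bytes` is hypothetical, and it assumes the usual convention that each `%uXXXX` escape is a 16-bit value stored little-endian in memory, while `%XX` is emitted as a single byte (a simplification).

```python
import re

# Matches JavaScript-style escapes: %uXXXX (16-bit unit) or %XX (single byte).
_ESCAPE_RE = re.compile(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})")


def unescape_to_bytes(escaped: str) -> bytes:
    """Convert %uXXXX / %XX escapes into raw bytes.

    Assumption: %uXXXX units are written out little-endian, which is how
    they end up in memory on x86. Any text that is not an escape sequence
    (quotes, '+' signs, whitespace) is ignored.
    """
    out = bytearray()
    for match in _ESCAPE_RE.finditer(escaped):
        if match.group(1) is not None:
            # %u4142 -> bytes 0x42 0x41
            out += int(match.group(1), 16).to_bytes(2, "little")
        else:
            # %41 -> byte 0x41 (simplified handling)
            out.append(int(match.group(2), 16))
    return bytes(out)


if __name__ == "__main__":
    # Two NOPs followed by "AA".
    print(unescape_to_bytes("%u9090%u4141").hex())
```

The resulting bytes can be written to a file and loaded into whichever shellcode analysis tool you prefer.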
Shellcode can be emulated with the command “sctest”, but in this case the output is truncated because one of the functions used in the shellcode is not handled by libemu. However, since we can extract the shellcode and write it to a file, we can analyze it in whatever way we prefer: for example, using scdbg (as shown in this article), using shellcode2exe to obtain an executable, or just copying the bytes into a debugger/disassembler. This screenshot shows one part of the shellcode analyzed with IDA:
This shellcode tries to exploit the vulnerability CVE-2013-5065 to bypass the Adobe Reader sandbox and then decode and execute a binary. This binary is embedded within the PDF document, but where? As I mentioned before, the PDF document is quite big for a file that stores just 4 objects. We can see the physical structure of the document with the command “offsets”.
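The idea behind such a physical listing can be sketched in Python: record where each object starts and ends, then report byte ranges that no object accounts for. This is a rough illustration and not how peepdf works internally. The names `object_spans` and `find_gaps` and the `min_size` threshold are hypothetical, and the sketch naively assumes that `endobj` never appears inside stream data.

```python
import re

# A valid indirect object header has the form "X Y obj".
OBJ_HEADER = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")
ENDOBJ = b"endobj"


def object_spans(data: bytes):
    """Return (object_id, start_offset, end_offset) for each object found."""
    spans = []
    pos = 0
    while True:
        header = OBJ_HEADER.search(data, pos)
        if header is None:
            break
        end = data.find(ENDOBJ, header.end())
        if end == -1:
            break
        end += len(ENDOBJ)
        spans.append((int(header.group(1)), header.start(), end))
        pos = end
    return spans


def find_gaps(data: bytes, min_size: int = 64):
    """Return (start, end) ranges before or between objects that are at
    least min_size bytes long and not covered by any object."""
    gaps = []
    cursor = 0
    for _, start, end in sorted(object_spans(data), key=lambda s: s[1]):
        if start - cursor >= min_size:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    return gaps


if __name__ == "__main__":
    sample = (b"%PDF-1.4\n1 0 obj\n<<>>\nendobj\n"
              b"obj 4 0\n" + b"\x00" * 200 + b"\nendobj\n"
              b"2 0 obj\n<<>>\nendobj\n")
    for start, end in find_gaps(sample):
        print(f"unaccounted bytes: {start}-{end} ({end - start} bytes)")
```

Because the header pattern only accepts the “X Y obj” form, any data that follows a malformed header is not counted as an object and surfaces as a gap.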
There is a huge gap between object 10 and object 2, so it is worth taking a quick look at it. We can show the raw bytes of the PDF document with the command “bytes”:
We have found a hidden “object” here. The tool does not show it because it is not really an object: it lacks a valid object header (“X Y obj”). Instead, we have “obj 4 0”, so no PDF reader will parse this object successfully; they will just ignore it. But it is not useless at all, because the shellcode looks for the bytes 0xa0909f2 within the PDF file content (see the IDA screenshot above) and starts decoding from that point. The FireEye team posted the algorithm to decode the content, so there is no need to reinvent the wheel: we can extract all these bytes and then use their Python script to obtain the executable:
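Before running that script, the encoded bytes have to be carved out of the PDF file. Here is a minimal sketch of that extraction step, and only that step: it does not implement the decoding algorithm. The function name `carve_from_marker` is hypothetical, and since the on-disk byte order of the marker is an assumption on my side, the sketch tries both little-endian and big-endian packings of the 32-bit value. Whether the encoded data starts at the marker or right after it is also left to the caller.

```python
import struct


def carve_from_marker(data: bytes, marker_value: int):
    """Locate a 32-bit marker in data and return (offset, data_from_offset).

    Assumption: the marker may be stored in either byte order, so both
    packings are searched and the earliest hit wins. Returns None when
    the marker is not present.
    """
    candidates = {
        struct.pack("<I", marker_value),  # little-endian, as an x86 dword compare would see it
        struct.pack(">I", marker_value),  # big-endian, in case it is stored as written
    }
    best = None
    for pattern in candidates:
        offset = data.find(pattern)
        if offset != -1 and (best is None or offset < best):
            best = offset
    if best is None:
        return None
    return best, data[best:]


if __name__ == "__main__":
    # Synthetic example only; real input would be the raw bytes of the PDF file.
    blob = b"junk" + struct.pack("<I", 0x0A0909F2) + b"encoded-data"
    found = carve_from_marker(blob, 0x0A0909F2)
    if found is not None:
        offset, carved = found
        print(f"marker at offset {offset}, {len(carved)} bytes carved")
```

The carved bytes can then be saved to a file and passed to the decoding script.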
The size of the decoded binary (105,476 bytes) does not match the binary mentioned everywhere (111ed2f02d8af54d0b982d8c9dd4932e, 176,245 bytes). That’s because here we have decoded just the binary, whereas the shellcode decodes from the 0xa0909f2 mark until the end of the file, so it also processes the rest of the PDF file, which is not necessary at all.
If you liked this type of analysis, a really good way to learn more about it is to attend the workshop on analyzing Exploit Kits and PDF exploits at Troopers on the 17th of March. It will be fun! See you there! 😉
Jose Miguel Esparza