16 Nov 2010 @ 4:05 PM 

[This post got mangled because antivirus products flagged its plain text as live exploits. I had to omit most of the stuff I planned to post 😐 ]

Many people don’t consider PDF files a possible threat and, well, I agree with them(!). It is not the PDF files but the rendering software we have to be afraid of. If you think I am referring to those Adobe Reader 0-days popping up periodically, hell yeah, you are RIGHT! We are going to talk about PDF files, a few Adobe Reader vulnerabilities, and the exploits and malware that come along with them 😉

You can read about PDF on its wiki page. PDF files are binary files with a well-defined format, and they look like a collection of objects. You can open a PDF file in a text editor or hex editor to view its object structure.

[Image: pdf file]

As you can see, PDF files start with a magic header, %PDF or %%PDF, followed by the spec version number. From the next line onwards you can see a pattern emerging: [obj][data][endobj]. This is the collection of objects I mentioned earlier. Each object is identified by an object number and a generation number; 41 0 obj represents object 41, generation 0. You can look into the PDF spec for a better understanding of the file architecture. You don’t have to understand every detail of the spec, but you can specifically look into streams, encodings, the JavaScript implementation, AcroForms etc…

Before going further, I would like to explain a little more about streams. Streams are used to store data (images, text, JavaScript etc…), and to make storage efficient PDF allows compression and encoding techniques like Flate/LZW/RLE. This creates a problem for us: we can’t just use a text/hex editor to understand the true content of a PDF! As a programmer I couldn’t ignore this challenge, so I made a tool (PDF Analyzer) to solve it. I will use PDF Analyzer throughout this post, but you won’t be able to get it as it is still a private build (I will release it…eventually ;)). For now you have other options; both commercial and freeware tools are available. Here are some links.
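To make the object and stream idea concrete, here is a minimal Python sketch. It is not PDF Analyzer's code, and the names `iter_objects` and `decode_flate_stream` are made up for illustration. It assumes a simple, well-formed file: it finds `N G obj ... endobj` blocks with a regular expression and inflates any stream whose dictionary mentions /FlateDecode. Real-world (and especially malicious) PDFs use object streams, abbreviated filter names, and broken /Length values that this sketch does not handle.

```python
import re
import zlib

# Matches "N G obj ... endobj"; N is the object number, G the generation number.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj(.*?)endobj", re.DOTALL)
# Captures the raw bytes between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def iter_objects(pdf_bytes):
    """Yield (object_number, generation, body) for each object found."""
    for match in OBJ_RE.finditer(pdf_bytes):
        yield int(match.group(1)), int(match.group(2)), match.group(3)


def decode_flate_stream(body):
    """Return the inflated stream of an object body, or None if not applicable."""
    match = STREAM_RE.search(body)
    if match is None or b"/FlateDecode" not in body:
        return None
    # decompressobj stops at the end of the zlib data, so the end-of-line
    # bytes that precede "endstream" are ignored instead of raising an error.
    return zlib.decompressobj().decompress(match.group(1))
```

Feeding each object body from `iter_objects` into `decode_flate_stream` gives you the readable content that a text or hex editor cannot show.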

PDF Dissector by zynamics – commercial

Origami by Sogeti ESEC Labs – freeware

PDF Stream Dumper by Dave – freeware

Various Python PDF parsers from Didier Stevens and the inREVERSE guys – freeware (search!)

PDF Analyzer is written in C# with only 3 external libraries: zlib (I should have used GZipStream with the 2 byte header hack), BeaEngine (thanks BeatriX) and JSBeautifier (I ported 95% of the code from JS to C#). I spent around 2 weeks of free time on it. It may not be the fastest PDF parser, but it can handle every ill-formatted PDF I have in my repository ;).
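For the curious, the idea behind the "2 byte header hack" can be sketched in Python. This is my reading of the trick, stated as an assumption: a zlib stream is a 2-byte header, then raw deflate data, then a 4-byte checksum, so a decoder that only understands raw deflate can still read it if you skip the first two bytes. The function name below is hypothetical.

```python
import zlib


def inflate_skipping_zlib_header(data):
    """Inflate zlib-wrapped data with a raw-deflate decoder by skipping the header."""
    # Negative wbits tells zlib to expect raw deflate with no header or checksum.
    inflater = zlib.decompressobj(-zlib.MAX_WBITS)
    # The trailing 4-byte Adler-32 checksum is left unread in inflater.unused_data.
    return inflater.decompress(data[2:])
```

The trade-off is that the checksum is never verified, so corrupted data can go unnoticed.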

[Image: pdf analyzer]

Adobe Reader’s top vulnerabilities come from Adobe-specific JavaScript APIs. This gives us a chance to disable JavaScript and protect ourselves from those JavaScript-based exploits. Disabling JavaScript is crucial, but it doesn’t fix vulnerabilities in other parts of Adobe Reader, such as the handling of embedded image files and Flash files.

Now we will look at some malware samples that exploit these vulnerabilities. You can find malware samples on many security blogs, and I must thank two of my friends who sent a big archive of malicious PDFs for analysis and testing :).

[Image: pdf analyzer js]

This particular sample splits its JavaScript into three streams and concatenates them using <</Names[(1)6 0 R (2)7 0 R (3)8 0 R]>>, which refers to the three objects marked in red. After beautification, it turns out to exploit a vulnerability that existed in Adobe Reader, namely this.media.newPlayer(null).
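An analyst can reassemble a split script like this by following the references in order. The Python sketch below is only an illustration: `names_references` and `join_scripts` are hypothetical helpers, and `scripts_by_object` is assumed to be a dict of already-decoded script text keyed by object number. In a real file each referenced object is usually an action dictionary that has to be resolved first.

```python
import re

# Matches one "(name) N G R" pair inside a /Names array.
REF_RE = re.compile(rb"\((.*?)\)\s*(\d+)\s+(\d+)\s+R")


def names_references(names_dict):
    """Return [(name, object_number), ...] in the order listed in /Names."""
    return [(m.group(1).decode("latin-1"), int(m.group(2)))
            for m in REF_RE.finditer(names_dict)]


def join_scripts(names_dict, scripts_by_object):
    """Concatenate the decoded script pieces in /Names order."""
    return "".join(scripts_by_object[number]
                   for _, number in names_references(names_dict))
```

With the dictionary from the sample, `names_references` returns the pairs for objects 6, 7 and 8, and `join_scripts` glues their decoded text back together for beautification.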

[Image: media newPlayer]

It essentially sprays the heap with a NOP sled and shellcode and then calls the vulnerable function. The shellcode here is a dropper/downloader; you can dump it to a file and use IDA to disassemble it.
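Shellcode in these scripts is typically stored as a JavaScript unescape() string such as %u9090%u9090, so dumping it means turning the escapes back into raw bytes. Here is a small Python sketch of that decoding step. It assumes the string consists only of %uXXXX and %XX escapes (literal characters in between are skipped, which real unescape() would not do), and `unescape_to_bytes` is a name I made up.

```python
import re
import struct

# %uXXXX is one 16-bit value; %XX is a single byte.
ESCAPE_RE = re.compile(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})")


def unescape_to_bytes(escaped):
    """Convert %uXXXX / %XX escapes into the bytes they place in memory."""
    out = bytearray()
    for match in ESCAPE_RE.finditer(escaped):
        if match.group(1):
            # x86 is little-endian, so %u4142 lands in memory as 42 41.
            out += struct.pack("<H", int(match.group(1), 16))
        else:
            out.append(int(match.group(2), 16))
    return bytes(out)
```

Write the result to a file opened in binary mode and load it into your disassembler as a raw 32-bit x86 blob.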

Another PDF file, which exploits util.printf, is shown below.

[Image: util printf]

Again, you can dump the shellcode and disassemble it with IDA. Another option is to use PDF Analyzer’s unescape functionality to disassemble the shellcode directly.

[Image: disassembly]

The disassembly starts with pretty straightforward steps to find the base address via delta calculation (call – pop – sub). Then it fetches the kernel32 base from PEB(fs[0x30])->Ldr.InInitOrder[0].base_address. This is eventually used to load other modules and resolve APIs.

Malware writers use multiple techniques to protect their payload. These techniques involve obfuscation and the multiple, multi-level use of encoding/compression schemes.
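Multi-level encoding means a stream's /Filter entry lists several filters, and a parser has to undo them one after another in the listed order. The Python sketch below shows the idea with just two filters, ASCIIHexDecode and FlateDecode. The `DECODERS` table and function names are my own, and a real parser would also need LZW, RunLength, ASCII85 and the abbreviated filter names.

```python
import binascii
import zlib


def ascii_hex_decode(data):
    """Decode /ASCIIHexDecode data: hex digits, optional whitespace, '>' ends it."""
    digits = b"".join(data.split(b">")[0].split())
    if len(digits) % 2:
        digits += b"0"  # an odd final digit is treated as if followed by 0
    return binascii.unhexlify(digits)


def flate_decode(data):
    """Decode /FlateDecode data, tolerating trailing bytes after the zlib stream."""
    return zlib.decompressobj().decompress(data)


DECODERS = {"ASCIIHexDecode": ascii_hex_decode, "FlateDecode": flate_decode}


def apply_filters(data, filter_names):
    """Undo each filter in the order given by the stream's /Filter array."""
    for name in filter_names:
        data = DECODERS[name](data)
    return data
```

For a stream declared with /Filter [/ASCIIHexDecode /FlateDecode], calling `apply_filters(raw, ["ASCIIHexDecode", "FlateDecode"])` first removes the hex layer and then inflates the result.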

[Image: multiple encodings]

If any of you have samples that use multi-level encoding, please send them to me 😉 I would like to test them with PDF Analyzer.

I will conclude the exploit samples by posting the latest exploit, for the printSeps vulnerability. This code was retrieved from the PDF posted on the Full Disclosure list.

[Image: printSeps]

The evil actions of PDF malware vary from regular password stealers to rootkits. Once an attacker has attained arbitrary code execution, the rest is limited only by the malware writer’s imagination. As malware writers mainly target Adobe Reader, try to switch to other PDF rendering software, or at least update to the latest version. There are free PDF readers like Sumatra or Ghostscript; try those out, and always be cautious when opening a PDF file.

Posted By: Dan
Last Edit: 24 Nov 2010 @ 03:23 PM
