Monday, October 22, 2012

Exception Driven "Debugging": Getting behind anti debugging tricks.

Of course, all debugging is exception driven, if only because a breakpoint generates a debug exception which is passed to the debugger. In this article, however, I will refer to regular exceptions.

There are tens if not hundreds of software protectors used by software vendors around the globe. Some are good, some are less good. In either case, vendors rarely use them properly, thinking that simply enabling the anti-debugging features provided by the protector of their choice is enough. I have seen it myself - a widely known commercial application, protected with Themida (which is one of the most complicated protectors), remains SOOO unprotected that Themida is not even noticed while relatively sensitive information is extracted using the application itself.

However, the purpose of this article is not to discuss the pros and cons of Themida or any other protector, nor do I have any intention to disgrace any software vendor. The purpose is to describe a relatively easy way of bypassing common anti debugging tricks (including Windows DRM protection) with DLL injection.

As the term "anti debugging" states - such methods target modern debuggers. There are several commonly known tricks:
  1. IsDebuggerPresent() - you would be surprised to know how many vendors rely on this API alone;
  2. Additional methods of debugger presence detection;
  3. IAT modification - which is not really worth trying;
  4. Redirection of debugging API (e.g. to an infinite loop);
  5. And some more.
Point #4 does not let you implement your own debugger in the hope that it would not be noticed by the victim program (many beginners get stuck at this point).

Point #3 - how much can you modify the IAT? I mean, the system loader still has to be able to parse it, so if the system loader can, everyone can.

Point #1 is not even worth further mentioning here.

In this article I am going to describe a simple way (although some may cry and say it is a hard way) to get around most anti debugging tricks without even noticing their presence, by implementing a simple pseudo debugger DLL which is injected into the target process.


Step #1. Preparations

In order to use any debugger, you have to know where to set your breakpoints. Otherwise, the whole process is meaningless. But how can you find the proper locations if the executable on disk is encrypted (e.g. with Themida) and you still cannot attach a debugger to see what is going on inside?

The solution is quite simple. Simple indeed. Windows provides us with all the instruments to read the memory of another process (given that you have sufficient access rights): the OpenProcess(), ReadProcessMemory() and NtQueryInformationProcess() API functions. Using those, you can simply dump the decrypted executable and any of its modules (DLLs) to a separate file on disk.

NtQueryInformationProcess() provides you with the address of the PEB (see this post for more information on PEB) of the target process. Then you simply parse the linked list of loaded modules, get the base address (module handle) and the image size for each, then use ReadProcessMemory() to copy the image to a file. One complication, though: the PEB and the module list live in the remote process, so you have to read them with ReadProcessMemory() as well.

Once you have dumped the target image to a file, it can easily be loaded into IDA Pro, disassembled and researched statically.
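The "complication" above is that every pointer in the module list is only valid inside the target, so each hop costs one remote read. Here is a rough sketch of that walk. The structures are simplified, hypothetical stand-ins (not the real LIST_ENTRY / LDR_DATA_TABLE_ENTRY layouts), and the read_fn callback is an assumption of mine: in a real dumper it would wrap ReadProcessMemory(), here it is a plain memcpy() so the logic can be tried in a single process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for LIST_ENTRY and LDR_DATA_TABLE_ENTRY (hypothetical layout). */
typedef struct { uintptr_t flink, blink; } remote_list_entry;
typedef struct {
    remote_list_entry links;  /* first member, like InLoadOrderLinks */
    uintptr_t base;           /* plays the role of DllBase */
    uint32_t  size;           /* plays the role of SizeOfImage */
} remote_module;

/* Copies n bytes from address addr of the target into buf; returns 0 on success.
   With the real API this would be a thin wrapper around ReadProcessMemory(). */
typedef int (*read_fn)(uintptr_t addr, void *buf, size_t n);

/* Walks the circular list whose head is at head_addr in the target.
   Stores up to max entries in out; returns the count, or -1 on a failed read. */
static int walk_modules(read_fn rd, uintptr_t head_addr, remote_module *out, int max)
{
    remote_list_entry head;
    int count = 0;
    uintptr_t cur;

    assert(rd != NULL && out != NULL);
    if (rd(head_addr, &head, sizeof head) != 0)
        return -1;
    cur = head.flink;
    while (cur != head_addr && count < max) {   /* back at the head = end of list */
        if (rd(cur, &out[count], sizeof out[count]) != 0)
            return -1;
        cur = out[count].links.flink;           /* next hop is again a remote address */
        count++;
    }
    return count;
}

/* Demonstration reader: treats the "remote" address as a local pointer. */
static int local_read(uintptr_t addr, void *buf, size_t n)
{
    memcpy(buf, (const void *)addr, n);
    return 0;
}
```

With the list in hand, each (base, size) pair is what you would feed to ReadProcessMemory() to copy the image itself.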


Step #2. Injector and DLL

I do not see any reason to describe the DLL injection process here, as it has been described many times, even in this blog. You are free to use the standard injection method, the advanced DLL injection method, or this method if you have problems with the two previously mentioned.


DllMain()

It is suggested not to perform any heavy action in this function. However, we do not really have a choice (although you can launch a separate thread). The first thing to do is to suspend all running threads (except the current one, of course). The problem is that Windows has no API function that would allow you to enumerate the threads of a single process. Instead, it lets you go through all the threads in the system. See the MSDN pages for Thread32First and Thread32Next - there should be a perfect example of getting the threads of the current process. Once all the threads are suspended, you are ready to proceed.
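Since the snapshot is system wide, the only real logic here is the filter: keep entries owned by our process and skip the thread we are running on. A minimal sketch of just that filter follows. The thread_entry type is a hypothetical mirror of two THREADENTRY32 fields; the snapshot walk itself (CreateToolhelp32Snapshot() with TH32CS_SNAPTHREAD, Thread32First(), Thread32Next()) and the actual OpenThread() / SuspendThread() calls are left out, so the block stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the two THREADENTRY32 fields that matter here. */
typedef struct {
    uint32_t owner_pid;   /* th32OwnerProcessID */
    uint32_t thread_id;   /* th32ThreadID */
} thread_entry;

/* Copies into out the IDs of all threads that belong to pid, except self_tid.
   Returns the number of IDs written (at most max). */
static size_t threads_to_suspend(const thread_entry *all, size_t n,
                                 uint32_t pid, uint32_t self_tid,
                                 uint32_t *out, size_t max)
{
    size_t i, count = 0;

    assert(all != NULL && out != NULL);
    for (i = 0; i < n && count < max; i++) {
        if (all[i].owner_pid != pid)
            continue;                 /* thread of some other process */
        if (all[i].thread_id == self_tid)
            continue;                 /* never suspend the thread doing the work */
        out[count++] = all[i].thread_id;
    }
    return count;
}
```

In the DLL, pid and self_tid would come from GetCurrentProcessId() and GetCurrentThreadId(), and the resulting IDs should be kept around, since the same list is needed later to resume the threads.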


Installation of breakpoints 

No, we are not going to use regular 0xCC software breakpoints, neither are we going to make any use of hardware breakpoints here. Instead, we are going to place an instruction that raises an exception at the location of the desired breakpoint. To keep such an instruction short and to avoid changing the values of the registers, 'AAM 0' seems to be a perfect candidate (in 32-bit code, that is; AAM is not a valid instruction in 64-bit mode). It only takes two bytes, 0xD4 0x00, and raises the EXCEPTION_INT_DIVIDE_BY_ZERO exception (exception code 0xC0000094).

Use the VirtualProtect() function to change the access rights of the target address so you can alter its content, back up the original two bytes from that address and overwrite them with the 16-bit value 0x00D4 (which x86, being little-endian, stores as the bytes 0xD4 0x00):

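// VirtualProtect() rounds the range to page boundaries itself, so this also works when the two bytes straddle a page
VirtualProtect((LPVOID)target, 2, PAGE_EXECUTE_READWRITE, (PDWORD)&prevProtect);
original = *((unsigned short*)target);   // back up the two original bytes
*((unsigned short*)target) = 0x00D4;     // stored little-endian as D4 00, i.e. AAM 0
VirtualProtect((LPVOID)target, 2, prevProtect, (PDWORD)&prevProtect);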
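If you plan to set more than one pseudo breakpoint, it helps to keep the saved bytes next to the address they came from. The following is a small sketch of such bookkeeping, not part of the original method: pseudo_breakpoint, bp_install() and bp_remove() are names I made up, and the code assumes the caller has already made the memory writable (with VirtualProtect() as above). It works on any byte buffer, so it can be tried outside a target process.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One pseudo breakpoint: where it is and which bytes it replaced. */
typedef struct {
    uint8_t *address;
    uint8_t  saved[2];
} pseudo_breakpoint;

/* Encoding of AAM 0, which raises a divide-by-zero exception in 32-bit code. */
static const uint8_t AAM_0[2] = { 0xD4, 0x00 };

/* Saves the two bytes at address and overwrites them with AAM 0.
   Assumes the memory at address is already writable. */
static void bp_install(pseudo_breakpoint *bp, uint8_t *address)
{
    assert(bp != NULL && address != NULL);
    bp->address = address;
    memcpy(bp->saved, address, sizeof bp->saved);
    memcpy(address, AAM_0, sizeof AAM_0);
}

/* Puts the original two bytes back. */
static void bp_remove(const pseudo_breakpoint *bp)
{
    assert(bp != NULL && bp->address != NULL);
    memcpy(bp->address, bp->saved, sizeof bp->saved);
}
```

Copying byte by byte with memcpy() instead of storing an unsigned short sidesteps any byte order or alignment question. The saved bytes are also what tells you later which instruction has to be emulated in the handler.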

Now the victim process is almost ready to be continued. One thing is left - the exception handler. We will use the vectored exception handling mechanism, as it allows our handler to be the first (or at least among the first) to handle an exception. Once the handler has been added with AddVectoredExceptionHandler(), you may resume the suspended threads of the process.



Handler

One important thing to do once your handler gets control is to check the address where the exception occurred and the exception code, as we have no intention of dealing with irrelevant exceptions:

LONG CALLBACK handler(PEXCEPTION_POINTERS ep)
{
   // Anything that is not our pseudo breakpoint goes to the next handler
   if(ep->ContextRecord->Eip != target || ep->ExceptionRecord->ExceptionCode != 0xC0000094)
   {
      // Optionally log other exceptions
      return EXCEPTION_CONTINUE_SEARCH;
   }

   // Do your stuff here
   return EXCEPTION_CONTINUE_EXECUTION;
}


Your Stuff

One of the parameters you get with your handler is the pointer to the CONTEXT structure, which provides you with the content of all the registers at the time of the exception. Needless to say, you have access to the process' memory as well. It is just as if you were in a debugger, with the only difference that you have to implement the routine that shows you the data you are interested in. Do not forget to emulate the original instruction replaced by the pseudo breakpoint and advance Eip accordingly before returning from the handler. Otherwise the same 'AAM 0' is executed again and the handler is called forever.

One more thing to mention - it may be a good idea to suspend all other threads of the victim process while in the 'your stuff' portion of the handler.


Stability

I am not claiming this method to be bullet proof, and I am more than sure (I simply know) that there are ways to defeat it. However, personally, I have not yet met software that does. In addition, I have tested this method myself and found it stable.


Hope this article was helpful. See you in the next one.

P.S. Lazy guys, nerds, etc., do not cry for sources. This method is really simple. Besides, if copy/paste is the only programming technique you are aware of, then, probably, this blog is not the right place for you.

Thursday, October 18, 2012

Method of Computer Virus Detection. Sad story of a patent application

It was quite a long time ago (an epoch ago in terms of software development), around the end of 2005 and the beginning of 2006. I was then working for Aladdin Knowledge Systems' eSafe unit as a computer virus researcher (my first formal RE job). Detection methods were quite poor at that time, even heuristic ones (not that they are THAT good these days). There was quite a lot of noise about the Morphine scrambler at that time, and I was responsible for finding a proper solution for that issue by developing a reliable detection method.

I have to admit - Morphine was quite an advanced scrambler at that time. A masterpiece, I should say. Standard methods, at least those used by eSafe at that time, did not work and required some changes to be made to the engine.

As this was about the only task assigned to me at that period, I decided to play a bit more with Morphine while waiting for the aforementioned changes to be made. 

It was so easy to identify Morphine's code by eye, but somehow I could not fit the pattern into any programmatic method (of those used at the time, as I said). Well, there are plenty of expert systems and neural networks that mimic the path of decision making as it happens in our mind, and there were such systems at the time. However, I was not yet aware of most of them, and those I had heard about looked quite complicated.

My decision was to try and build a simple system capable of recognizing logic patterns in the code. It is quite obvious that different implementations of the same algorithm share the same logic, which shows up at least as similar opcode sequences, although the overall binary representation differs even if you merely replace one register with another. This led me to the simple system described below.

It is important to mention that all of the following information is publicly available, so I do not violate any NDA whatsoever.

Code Generalization
Our mind generalizes the disassembled code by extracting the relevant logic information. But how to do that in software? The solution is easier than I initially expected. I simply had to sort the opcodes into categories, assigning a numeric value to each category. For example, let's take three categories - stack, bitwise and flow control operations. The following example shows two pieces of code that are different on a binary level, but are completely identical logically:

Code #1              Code #2                     Generalized form
push  eax            push  edx                   0x0001
xor   eax, eax       xor   edx, edx              0x0002
pop   ebx            pop   ecx                   0x0001
ret                  jmp   dword[esp]            0x0003

As you can see - the two code snippets are identical logically, but are quite different if you try to compare them in compiled form. However, if you generalize those snippets, you will get the same result from both.
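The table above can be reproduced with a few lines of code. This is only a toy sketch of the idea, not the system as it was built: it assumes a disassembler has already produced the mnemonics, the mnemonic table is a tiny sample I picked, and only the three category values from the example (plus an "unknown" value of my own) are used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Category values from the example; CAT_UNKNOWN is an addition for this sketch. */
enum { CAT_UNKNOWN = 0x0000, CAT_STACK = 0x0001, CAT_BITWISE = 0x0002, CAT_FLOW = 0x0003 };

/* Maps a mnemonic to its category; operands are deliberately ignored. */
static uint16_t categorize(const char *mnemonic)
{
    static const struct { const char *name; uint16_t cat; } table[] = {
        { "push", CAT_STACK },   { "pop",  CAT_STACK },
        { "xor",  CAT_BITWISE }, { "and",  CAT_BITWISE }, { "or",   CAT_BITWISE },
        { "ret",  CAT_FLOW },    { "jmp",  CAT_FLOW },    { "call", CAT_FLOW },
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(mnemonic, table[i].name) == 0)
            return table[i].cat;
    return CAT_UNKNOWN;
}

/* Turns n mnemonics into n category values. */
static void generalize(const char *const *mnemonics, size_t n, uint16_t *out)
{
    size_t i;

    assert(mnemonics != NULL && out != NULL);
    for (i = 0; i < n; i++)
        out[i] = categorize(mnemonics[i]);
}
```

Feeding it { "push", "xor", "pop", "ret" } and { "push", "xor", "pop", "jmp" } gives the same four values for both, which is exactly the "Generalized form" column.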

This is really a basic explanation of the system. Besides, it has evolved since that time.

Automatic Signature Generation
The most pleasant thing about this system was its ability to extract signatures automatically. At that time, only two samples of the same malware were needed, right now - one is more than enough. However, let me concentrate on the method as it was initially presented.

As I mentioned - there was a need for two samples of the same malware. Their executable content was generalized using the system described above into a pair of arrays of extracted categories, which were compared to one another, and all similarities were put into a separate list of potential signatures. Why potential? Just because at that stage any of them could be a signature of "legal" logic which might be found in any executable (e.g. a library routine).
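The comparison step can be pictured with a classic technique. The sketch below finds the longest run of categories shared by two generalized samples using the textbook dynamic programming approach for the longest common substring. To be clear, this is my stand-in for illustration: the comparison algorithm actually used is not described here, and the real system collected all similarities rather than only the longest one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Finds the longest run of categories that appears in both a and b.
   Returns its length; *start_a receives the offset of the run in a. */
static size_t longest_common_run(const uint16_t *a, size_t na,
                                 const uint16_t *b, size_t nb,
                                 size_t *start_a)
{
    size_t best = 0, i, j;
    size_t *run;

    assert(a != NULL && b != NULL && start_a != NULL);
    *start_a = 0;
    /* run[j + 1] = length of the common run ending at a[i] and b[j] */
    run = calloc(nb + 1, sizeof *run);
    if (run == NULL)
        return 0;
    for (i = 0; i < na; i++) {
        /* right to left, so run[j] still holds the value from the previous i */
        for (j = nb; j-- > 0; ) {
            if (a[i] == b[j]) {
                run[j + 1] = run[j] + 1;
                if (run[j + 1] > best) {
                    best = run[j + 1];
                    *start_a = i + 1 - best;
                }
            } else {
                run[j + 1] = 0;
            }
        }
    }
    free(run);
    return best;
}
```

The run found this way is only a candidate signature, for the reason given above: it may just as well be ordinary library logic.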

In order to eliminate such "false" signatures, the list was applied to a set of "clean" files and each potential signature found in any clean file was removed.
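The elimination step is a plain filter. A rough sketch, with names of my own choosing (signature, contains_run(), filter_signatures()) and a naive search, which is fine for illustration but would be too slow for a large set of clean files:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A potential signature: a run of category values. */
typedef struct {
    const uint16_t *cats;
    size_t len;
} signature;

/* Returns 1 if sig occurs as a contiguous run inside hay, 0 otherwise. */
static int contains_run(const uint16_t *hay, size_t nh, const uint16_t *sig, size_t ns)
{
    size_t i;

    if (ns == 0 || ns > nh)
        return 0;
    for (i = 0; i + ns <= nh; i++)
        if (memcmp(hay + i, sig, ns * sizeof *sig) == 0)
            return 1;
    return 0;
}

/* Removes, in place, every signature that also occurs in the generalized
   clean file. Returns how many signatures are left at the front of sigs. */
static size_t filter_signatures(signature *sigs, size_t count,
                                const uint16_t *clean, size_t nclean)
{
    size_t i, kept = 0;

    assert(sigs != NULL && clean != NULL);
    for (i = 0; i < count; i++)
        if (!contains_run(clean, nclean, sigs[i].cats, sigs[i].len))
            sigs[kept++] = sigs[i];
    return kept;
}
```

Calling filter_signatures() once per clean file, each time on the survivors of the previous call, leaves only the signatures that appear in none of them.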

Efficiency
The very first test results showed that Morphine, a masterpiece of polymorphism, could be recognized with a single logic signature (and I tested it on thousands of files scrambled with Morphine). The efficiency was just as good for at least 95% of the malware known at that time. Basically, that meant that a database of several tens of megabytes could be replaced with a list of several kilobytes.

What's sad about this?
My employer at that time - Aladdin Knowledge Systems - applied for a patent. Several years later (I was working for some other firm already) I came to know that the application was denied by the USPTO. The reason was quite surprising... As I had a chance to read the correspondence between the patent attorney and the examiner, I discovered that the application was denied based on the comparison algorithms used to compare the sequences of categories, which had TOTALLY NOTHING to do with the idea itself, which was about the preprocessing of data (extraction of logic patterns)... Somehow, this method (despite all the excitement) never got implemented in the product either...

For those interested, the application may be found here.