Work in Progress - my first steps!

Notes on fundamentals for reverse engineers

I remember the day when a friend told me that the “this is fun!” thingy we spent our weekends doing had a name: Reverse Engineering. I got curious about information security just after that! I really enjoy understanding the deeper low-level internals of $thing. And when I decided to take a more “researcher-like” approach to it, I felt like diving into a software developer’s perspective of computer architecture. I am planning to cover the same topic from another perspective, one that is more comfortable and logical for me: I am coming from the hardware level back to software. There are plenty of extremely well-written RE101 tutorials using a plethora of different tools. However, not so many explain how to get the code to analyze in the first place, or give a general idea of how to reverse any kind of binary.

Why do I do what I do?

People ask me a lot how to become a good malware researcher and / or reverse engineer.

First things first: they are not the same thing (at least, not for me!). I know great malware researchers who don’t do much reverse engineering and also great reversers who are not working with malware at all. I like the mix. I get bored really fast and I love puzzles: having malware code to reverse feeds my curiosity most of the time, and the hours of deep reversing and static analysis keep me alive.

Maybe a good beginning would be finding which of these things makes you the happiest :)

If you are not sure you want to be a malware researcher AND reverse engineer, give it a try. There are lots of resources online: just set up a lab, download a sample and do it. Did you like it? Or did you already get that “Uh, maybe later…” feeling while setting up the environment?

Bullet points of things I am going to (MAYBE) write about

Static Analysis

  • ASM syntax flavors
    • Intel-syntax:
    • AT&T syntax:
      • AT&T: % goes before register names and $ before immediate values (numbers). Memory operands use parentheses instead of the square brackets of Intel syntax.
      • AT&T: a suffix is added to the instruction mnemonic to define the operand size:
        • q — quad (64 bits)
        • l — long (32 bits)
        • w — word (16 bits)
        • b — byte (8 bits)
  • General Purpose / Memory Registers
    • Aimed at doing arithmetic operations
    • EAX results register (accumulator)
    • EBX pointer to data
    • ECX counter
    • EDX data register (I/O pointer)
    • ESP stack pointer register. Points to the top of the stack.
    • EBP base pointer (stack frame pointer). It is a register used to calculate addresses relative to another address, typically data on the stack relative to the frame base
    • ESI source index
    • EDI destination index
  • Control Registers
    • They control the function of the processor
      • EIP, extended instruction pointer. Points to the next instruction to be executed
    • The stack grows down the address space: as more data is added to the stack, it is stored at increasingly lower addresses
    • IA-32 machines are little-endian, so multi-byte values are stored least significant byte first: reverse the byte order of an address before using it in a payload
    • NOP (0x90) is useful as padding in shellcode: a longer run of NOPs gives a higher chance of guessing an address that lands on it
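
The byte order and NOP padding notes above can be sketched in a few lines of Python. This is only an illustration under assumptions: the address 0xDEADBEEF and the helper names pack_address and pad_with_nops are made up for the example and do not come from any real target or library.

```python
import struct

NOP = b"\x90"  # single-byte x86 no-op instruction


def pack_address(addr: int) -> bytes:
    """Pack a 32-bit address as IA-32 stores it in memory (little-endian)."""
    # "<" selects little-endian, "I" is an unsigned 32-bit integer.
    return struct.pack("<I", addr)


def pad_with_nops(body: bytes, total_len: int) -> bytes:
    """Left-pad body with NOP bytes until it is total_len bytes long."""
    if len(body) > total_len:
        raise ValueError("body is longer than the requested length")
    return NOP * (total_len - len(body)) + body


addr = 0xDEADBEEF                      # hypothetical address, for illustration only
print(pack_address(addr).hex())        # efbeadde: least significant byte first
print(pad_with_nops(b"AB", 6).hex())   # 909090904142
```

Reading the first output from left to right gives ef be ad de, which is the address written "backwards": exactly the reversal the little-endian note describes.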

Compilation Process

+--------                      +-------+         X
| +---------+                  | +-------+      X X
| |         |   +----------+   | |       |     X   X     +--------+
| |  source |   |          |   | | obj   |    X     X    |        |
| |  code   +---> compiler +---> | file  +---> linker+---> binary |
+ |         |   |          |   | |       |    X     X    |        |
  |         |   +----------+   + |       |     X   X     +--------+
  +---------+                    +-------+      X X
                                                 X
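
The diagram can also be read as two kinds of commands: one compile step per source file, then a single link step over all object files. Here is a minimal sketch that only builds and prints those command lines without running them. It assumes a C toolchain reachable as cc, and the file names main.c, util.c and app are hypothetical; build_commands is a helper invented for this example.

```python
from pathlib import Path


def build_commands(sources, output, cc="cc"):
    """Return the compile and link command lines for a list of C sources."""
    commands = []
    objects = []
    for src in sources:
        obj = str(Path(src).with_suffix(".o"))
        # -c stops after compilation: the result is an object file, not a binary.
        commands.append([cc, "-c", src, "-o", obj])
        objects.append(obj)
    # The last command hands every object file to the linker (via the compiler driver).
    commands.append([cc, *objects, "-o", output])
    return commands


for cmd in build_commands(["main.c", "util.c"], "app"):
    print(" ".join(cmd))
```

For a reverser, the interesting part is the last step: whatever names and structure survive in the binary are what the linker decided to keep.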

  • REMEMBER: copyright laws differ from country to country
    Reverse engineering is mostly legal only in a few specific cases. It is not included in black box testing, and reverse engineering spyware/adware/greyware is illegal in most countries! So if you are not sure, don’t do it!

It is legal to reverse engineer $thing if you want to recover your own source code or to recover data from legacy systems or formats. It is also legal for security research, and I am not just talking about vulnerability or malware research but also, for example, about copyright infringement investigations.

Illegal activities

  • to RE $thing and sell a competing product
  • to crack copy protections
  • to distribute a cracked version / registration for copyrighted software
  • to gain unauthorized access to any computer system

My tools of choice

  • Hex editor: HIEW, HxD
  • Disassembler: IDA, radare2 (r2)
  • Debugger: OllyDbg, WinDbg, Bochs
  • Scripting: Python

If you are still here …

… it means you want to stay on this path? :) AMAZE!! There are lots of RE101 labs and resources like:

So, no need for me to do another one ;)