Buffer Overflow: Oldie but a Goodie

We have all heard of them. Buffer overflows. Simple, yet extremely effective if pulled off correctly. But what exactly are they? Here is the Wikipedia definition.

In computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory locations. This is a special case of the violation of memory safety.

Buffer Overflow

Today I am going to try to go a bit more in depth and show some classic examples of the buffer overflow.

At its core, a buffer overflow works by exploiting flawed code: flaws in error handling and input checking, among others. The overflow happens when you pass or write more data to a buffer than it has been allocated. For example, if the code declared a string buffer with a maximum size of 500 characters and you passed it 508, then without the proper checks in place the extra 8 characters would overflow into the adjacent memory. Now that, in and of itself, won't actually do anything malicious, but I am sure you can quickly start to think of ways to exploit this behavior.
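
To make the "spilling over" concrete without actually corrupting anything, here is a minimal sketch that simulates it inside one flat byte array. The names Region, unchecked_write, BUF_SIZE and REGION_SIZE are made up for this illustration, and the sizes are scaled down from the 500/508 example to 8 and 16 bytes.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 8      /* bytes reserved for the buffer (hypothetical size) */
#define REGION_SIZE 16  /* buffer plus 8 bytes that belong to a neighbour */

/* Simulated memory: bytes 0-7 are the buffer, bytes 8-15 are adjacent data. */
typedef struct {
    unsigned char bytes[REGION_SIZE];
} Region;

/* Copies n bytes to the start of the region without checking them against
   BUF_SIZE, which is the mistake an overflow relies on. The clamp to
   REGION_SIZE exists only so that the simulation itself stays well defined.
   Returns how many bytes landed past the end of the buffer. */
size_t unchecked_write(Region *r, const char *src, size_t n) {
    if (n > REGION_SIZE) {
        n = REGION_SIZE;
    }
    memcpy(r->bytes, src, n);
    return n > BUF_SIZE ? n - BUF_SIZE : 0;
}
```

Writing 12 bytes with unchecked_write returns 4, and the first 4 bytes of the neighbouring data now hold whatever the input contained. Nothing malicious has happened yet, but the neighbour's data is no longer what the program thinks it is.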

Below is an EXTREMELY simple example

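#include <stdio.h>
#include <string.h>

void func(char *str) {
    char buff[5];
    strcpy(buff, str);   /* no length check: more than 4 characters overflows buff */
}

int main(void) {
    char string[16];
    printf("Input any data\n");
    gets(string);        /* deliberately unsafe: gets() ignores the size of string (removed in C11) */
    func(string);
    printf("Next data goes here\n");
    return 0;
}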
NOTE: Exploit runs on second iteration

Now as you can see this is super simple. I just wanted to illustrate the idea for those who are not familiar with languages that let you assign a buffer size yourself. Notice that I have not checked whether the input for either the string or the buff buffer is smaller than the allotted size. This means that if either is given something larger than its assigned size, the extra data will, in essence, blow up the buffer.
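
For contrast, here is a sketch of what the missing check could look like. copy_checked is a name invented for this post, not a standard library function, and refusing oversized input (rather than truncating it) is just one possible policy.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst only if it fits, including the terminating NUL.
   Returns 0 on success, or -1 if src is too long, in which case dst is
   left untouched. src must be a NUL-terminated string. */
int copy_checked(char *dst, size_t dst_size, const char *src) {
    size_t len = strlen(src);
    if (len >= dst_size) {
        return -1;              /* would overflow: refuse instead of copying */
    }
    memcpy(dst, src, len + 1);  /* len characters plus the NUL terminator */
    return 0;
}
```

With a 5 byte buffer, copy_checked accepts "abcd" (4 characters plus the terminator) and rejects "abcde", which is exactly the input that would have blown up buff in the example above.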

Now for a more real world example.

Let's say an application's login page says your username must be between 6 and 24 characters long. Being the intrepid hacker that you are, you would type out exactly 24 characters followed by something like an executable or a command: something to figure out whether the application is actually processing every bit of data sent to the input. For example => Kilorfjutqwembhdtpoasedvrundll32%shell.dll,cmdpromt <= Now, if whoever programmed this application didn't take the time to do any input validation and filter out anything beyond 24 characters, the remaining characters, which in our case are the command, will be put on the stack, waiting. From here there are several ways the code could end up being executed: further errors in the code, a compiler error, or some other kind of shell exploit could cause it to actually run.

There have been some very famous buffer overflow exploits in the past. Below are links to two of the more famous ones: Code Red and SQL Slammer.
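
The validation that the hypothetical login page skipped is tiny. Here is a rough sketch, assuming the 6 to 24 character rule from above; username_length_ok, USERNAME_MIN and USERNAME_MAX are names I made up, and a real application would check more than just the length.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define USERNAME_MIN 6
#define USERNAME_MAX 24

/* Returns true only when the NUL-terminated input is within the allowed
   length. Anything longer is rejected before it gets near a fixed buffer. */
bool username_length_ok(const char *input) {
    size_t len = strlen(input);
    return len >= USERNAME_MIN && len <= USERNAME_MAX;
}
```

The 24 character prefix Kilorfjutqwembhdtpoasedv passes this check, while the full example string with the command appended is 51 characters long and gets rejected.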

Code Red (Exploit in Microsoft's IIS web server)

SQL Slammer (DoS attack that infected almost 75,000 victims and dramatically slowed down the internet)

Now for those of you coming from languages like JavaScript or Python who have never really touched any of the lower level languages such as C or C++, you may still be a bit confused as to how this all works. Higher level languages, for the most part, have abstracted away details such as assigning buffer lengths and made them dynamic. Much like how in JavaScript you do not have to assign a data type because it is determined dynamically at run time, buffer sizes are determined dynamically too. Now I am not going to get into the nitty gritty details of how that happens because that is another post for another time, but for those of you from higher level languages I will give you a very quick refresher on how buffers and the stack work. Understanding these is essential to understanding buffer overflow attacks.

Registers, Buffers and Stacks, oh my! These terms tend to bring fear to the everyday bro-grammer. They want nothing to do with close-to-the-metal processes. But not you, you're a hacker and stuff like this gets your creative juices flowing! (or at least it should) Anyway, let's break them down, piece by piece.

Registers, the building blocks of Assembly and the basic workers that make all of our code run. In a computer, a register is one of a small set of data holding places that are part of a computer processor. A register may hold a computer instruction, a storage address, or any kind of data (such as a bit sequence or individual characters). Some instructions specify registers as part of the instruction. Simply put, they are where the data and instructions the processor is working on right now live. Now there are VOLUMES written on this stuff, and I could go on forever but this should suffice for this context. If you're interested in learning more about this stuff check the links at the end of the post.
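
To give a rough feel for what "dynamic" means here, below is a sketch of a buffer that grows itself when you append to it. This is only an illustration in C of the general idea, not how any particular JavaScript or Python runtime is implemented, and GrowBuf and growbuf_append are made-up names.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *data;  /* heap storage, NUL-terminated once allocated */
    size_t len;  /* bytes in use, not counting the terminator */
    size_t cap;  /* bytes currently allocated */
} GrowBuf;

/* Appends n bytes from src, enlarging the storage first if they would not
   fit. Returns 0 on success or -1 if the size would wrap or allocation fails. */
int growbuf_append(GrowBuf *b, const char *src, size_t n) {
    if (n >= SIZE_MAX - b->len) {
        return -1;                      /* len + n + 1 would wrap around */
    }
    size_t needed = b->len + n + 1;     /* +1 for the NUL terminator */
    if (needed > b->cap) {
        size_t new_cap = b->cap ? b->cap : 8;
        while (new_cap < needed) {
            if (new_cap > SIZE_MAX / 2) {
                new_cap = needed;       /* doubling would wrap: take exact size */
                break;
            }
            new_cap *= 2;
        }
        char *p = realloc(b->data, new_cap);
        if (p == NULL) {
            return -1;                  /* old storage is still valid */
        }
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}
```

Start from a zeroed GrowBuf, append as much as you like, and call free on data when you are done. The size check happens on every append, which is the work a fixed C buffer leaves entirely up to the programmer.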

Registers

Buffers. A data buffer (or just buffer) is a region of physical memory storage used to temporarily store data while it is being moved from one place to another. It's that simple. So for those not familiar with the C syntax we saw earlier, where the buffer or the string was given a length or size, this is what it was doing. The program is telling the machine to set aside a certain amount of space for the incoming data.

Buffer

The Stack. The Stack, or call stack, is a stack data structure (last in, first out) that stores information about the active subroutines of a computer program. The main job of the stack is to keep track of the point to which each active subroutine should return control when it finishes executing (this will become the key to most of our exploits). The call stack is used all throughout computers and computer science, from the core processor to the inner workings of Node.js.
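
Here is a toy model of that last in, first out behavior. It only tracks return points as plain integers, and CallStack, call_push, call_pop and STACK_DEPTH are names invented for this sketch. A real call stack also holds local variables, saved registers and more.

```c
#include <stdbool.h>
#include <stddef.h>

#define STACK_DEPTH 16

typedef struct {
    int return_points[STACK_DEPTH];  /* where each active call should resume */
    size_t top;                      /* number of entries currently in use */
} CallStack;

/* Records where to resume when a subroutine is called.
   Returns false if the stack is already full. */
bool call_push(CallStack *s, int return_point) {
    if (s->top == STACK_DEPTH) {
        return false;
    }
    s->return_points[s->top++] = return_point;
    return true;
}

/* Retrieves the most recently recorded return point when a subroutine
   finishes. Returns false if there is nothing left to return to. */
bool call_pop(CallStack *s, int *return_point) {
    if (s->top == 0) {
        return false;
    }
    *return_point = s->return_points[--s->top];
    return true;
}
```

If you push 10 and then 20, the first pop hands back 20 and the second hands back 10. The program blindly trusts whatever value comes off the stack, which is why tampering with a stored return point is so powerful.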

Stack

Now when we put all of these things together we can start to paint the picture of exactly how a buffer overflow works. When you input something to the application that exceeds the allotted buffer size, it overflows and starts to fill up the memory adjacent to where the buffer ended. All buffers are a certain number of bytes in length. So when you overload one, the extra data spills over.

OK, so now we have the extra data that has spilled over into the adjacent memory. What we need to do now is somehow get that data picked up and executed. Like I mentioned previously this can be done in several ways, but the general idea is to craft an exploit that overwrites the return address stored on the call stack, so that when the function returns, the malicious command or executable gets picked up and run. The illustration below does a great job of visualizing this idea.
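
As a harmless way to see the key step, here is a simulation of a stack frame as a flat byte array: 8 bytes of buffer followed by a 4 byte saved return address. The layout and the names Frame, frame_init, frame_unchecked_copy and frame_return_address are assumptions made for this sketch. Real frames vary by compiler and platform and usually contain more than this.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_BUF_SIZE 8

/* Simulated frame: bytes 0-7 are the local buffer, and the bytes right
   after it hold the saved return address. */
typedef struct {
    unsigned char bytes[FRAME_BUF_SIZE + sizeof(uint32_t)];
} Frame;

/* Clears the buffer and stores the legitimate return address after it. */
void frame_init(Frame *f, uint32_t return_address) {
    memset(f->bytes, 0, FRAME_BUF_SIZE);
    memcpy(f->bytes + FRAME_BUF_SIZE, &return_address, sizeof return_address);
}

/* Copies input into the buffer without checking it against FRAME_BUF_SIZE.
   The clamp to the frame size only keeps the simulation well defined. */
void frame_unchecked_copy(Frame *f, const unsigned char *src, size_t n) {
    if (n > sizeof f->bytes) {
        n = sizeof f->bytes;
    }
    memcpy(f->bytes, src, n);
}

/* Reads back the address the simulated function would return to. */
uint32_t frame_return_address(const Frame *f) {
    uint32_t addr;
    memcpy(&addr, f->bytes + FRAME_BUF_SIZE, sizeof addr);
    return addr;
}
```

Copy 8 bytes in and the saved return address is untouched. Copy 12 bytes in and the last 4 bytes of the input become the return address, so the attacker now decides where the function returns.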

Buffer Overflow

What we really want is for the call stack to return to an address just before our exploit (the sled in the picture). That way our code will get picked up and run! This is referred to as the NOP-sled technique.

So that's it! That's the main idea behind buffer overflow attacks. As you can imagine, there are endless possibilities for exploits with this type of attack. Now this is only a very rudimentary introduction. This category of exploit gets really deep if you start digging. Also, personally, it is what drove me to learn at least the basics of Assembly language programming, which is actually extremely useful knowledge to have. Every good hacker should know this stuff.
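
The reason a sled helps is that the attacker does not need to guess the exact address of the code, only some address inside the sled. Here is a toy simulation of that sliding behavior. It executes nothing: slide is a made-up function that just walks over bytes, 0x90 is the x86 single-byte NOP opcode, and PAYLOAD_MARKER is an arbitrary stand-in value for where real code would start.

```c
#include <stddef.h>

#define OP_NOP 0x90          /* x86 single-byte "do nothing" instruction */
#define PAYLOAD_MARKER 0x01  /* arbitrary stand-in for the first payload byte */

/* Starting at index start, steps over NOP bytes one at a time, the way a
   processor would fall through them. Returns the index of the first byte
   that is not a NOP, or len if the end of the memory was reached. */
size_t slide(const unsigned char *mem, size_t len, size_t start) {
    size_t pc = start;
    while (pc < len && mem[pc] == OP_NOP) {
        pc++;
    }
    return pc;
}
```

With six NOP bytes followed by the marker, slide ends up at index 6 whether it starts at index 0, 3 or 5. Any landing spot in the sled leads to the same place.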

I hope you all enjoyed and learned a little, and as always please feel free to contact me with any questions or comments.

###Everything can be Hacked!###

Here are some supplementary resources. The videos are particularly good!