Once you know what to look for, a buffer overflow is almost as easy to spot as it is to understand. So here’s what a buffer overflow looks like, whether you’re looking at suspicious network traffic or a suspicious file on disk.
A buffer overflow attack typically shows up as a long sequence of NOP operations followed by machine code. The long sequence of NOPs is a telltale sign, but disassembling the data that follows will verify it. If that data disassembles to plausible instructions rather than gibberish, you found a buffer overflow.
A common question on certification tests
I’m about 99% certain I had buffer overflow questions both on my CISSP exam and my Security+ exam. The question on Security+ was more theoretical, asking what a buffer overflow is. The CISSP question was more along the lines of finding one.
Buffer overflows aren’t necessarily the most intuitive concept, but once you understand the concept, these are some of the easiest questions on your test. Buffer overflows happen when a program writes more data into a buffer than the buffer can hold, and the excess spills into adjacent memory that the program later treats as code or as a pointer to code. That makes it possible to get the machine to run your own data as if it were program code. Here’s my explanation of how a buffer overflow works, which remains one of my more popular blog posts on security. If you find the concept confusing, my brief explanation of machine language may help clear things up for you.
Buffer overflows aren’t about overwhelming the CPU. They’re about finding data sitting next to code in memory and overwriting the code by overrunning the data first. My heart sank the first time I heard about this kind of attack, because almost every program I wrote in C or machine language in the 90s probably had a buffer overflow in it. Storing data right next to the code that needed it made programs more efficient, so I usually did it.
Buffer overflows are much less common than they used to be, since it’s pretty easy to separate code from data, and we’ve known for 20 years to do it.
What a buffer overflow looks like
A buffer overflow attack consists of two parts: a NOP sled, followed by a payload. Since an attacker usually doesn’t know exactly how long the attack has to be, the attacker will send a long sequence of NOP bytes to fill up the buffer. NOP literally means “do nothing,” so extra NOPs are harmless: the CPU just slides through them until it reaches the payload. Too few NOPs cause the payload to malfunction, so the attacker will probably pad the sled to be safe.
Buffer overflows are very CPU-specific. If a CPU even has a dedicated NOP instruction, it’s not necessarily the same byte from CPU to CPU. On my old 6502-based Commodore 8-bits, NOP was 0xEA. On Intel x86, NOP is 0x90. If you’ve ever wondered why consumer routers almost never use an Intel-compatible CPU, this is one reason why. A buffer overflow written for Intel may crash a process on a non-Intel CPU, but it won’t run the attacker’s code.
NOP is a do-nothing instruction that doesn’t change the machine’s state at all. Its most common legitimate use is for timing purposes. CISC-architecture CPUs like Intel x86, Motorola 68K, and MOS 6502 all have one. The 68000 and 6502 are very old designs, but still sometimes show up in embedded systems today.
RISC is a different philosophy of CPU design, using a smaller number of simpler instructions that the CPU can run much faster. A dedicated single-byte NOP is usually one of the things RISC CPUs leave out. So on a RISC CPU, the do-nothing instruction is often just an ordinary instruction that happens to not change the machine’s state, such as adding zero to the CPU’s stack pointer or one of its registers. This instruction will always be more than a byte long, frequently four bytes.
The payload is operating system specific, but you don’t have to fully understand the payload in order to spot a buffer overflow. The most common payload will open a shell, probably listening on a weird port. But some pentesters will just run a harmless program, like the calculator.
The payload will vary from CPU to CPU as well, but I’ll use the same payload in my examples below for simplicity’s sake.
What a buffer overflow targeting Intel x86 CPUs looks like
On Intel x86 CPUs, the NOP instruction is 0x90. So the telltale sign is a long run of 0x90 bytes, followed by a sequence of unpredictable, seemingly random bytes. They aren’t random at all, but they vary based on what they’re trying to make the machine do.
90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 50 53 51 52 56 57 55 89 e5 83 ec 18 31 f6 56 6a 63
A 64-bit Intel CPU actually has several multi-byte NOP instructions, so a NOP sled won’t necessarily always be a string of 90s. But to use those, an attacker has to have a better idea of how many bytes there are to work with, so you’re less likely to see them in NOP sleds.
Since this is the simplest example and x86 is so common, it’s worth memorizing the example of 90 repeating followed by a payload. That’s the example you’re most likely to see on a test, or in a job interview.
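If you’d rather let a script do the squinting, here is a minimal Python sketch of the x86 check described above. The function name `find_x86_sled` and the 16-byte threshold are my own assumptions for illustration, not a standard, and the sample is a shortened version of the dump above (32 NOPs rather than the full run).

```python
import re

# Sixteen or more consecutive 0x90 bytes. The threshold is an arbitrary
# assumption; tune it for the data you are looking at.
X86_SLED = re.compile(rb"\x90{16,}")

def find_x86_sled(data: bytes):
    """Return (sled_start, payload_start) for the first long 0x90 run, or None."""
    match = X86_SLED.search(data)
    if match is None:
        return None
    return match.start(), match.end()

# Shortened version of the example dump: 32 NOPs, then the example payload.
sample = bytes.fromhex(
    "90" * 32 + "50 53 51 52 56 57 55 89 e5 83 ec 18 31 f6 56 6a 63"
)

hit = find_x86_sled(sample)
if hit:
    sled_start, payload_start = hit
    print(f"sled at offset {sled_start}, {payload_start - sled_start} bytes long")
    print("bytes after the sled:", sample[payload_start:].hex())
```

A hit only tells you where to start disassembling. A long run of 0x90 followed by gibberish is still just a long run of 0x90.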
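On ARM CPUs, there’s no single-byte NOP. ARM instructions are two or four bytes wide, so the do-nothing instruction is always a multi-byte sequence. F3 AF 80 00 is a good bet on 32-bit ARM (it’s the wide Thumb-2 NOP encoding), and d5 03 20 1f is a good bet on 64-bit ARM. Depending on how the dump was produced, the bytes may appear in a different order.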
This makes a buffer overflow harder to spot. Instead of one byte repeating, it’s a four-byte sequence repeating many times, followed by a payload that’s a lot less repetitive.
F3 AF 80 00 F3 AF 80 00 F3 AF 80 00 F3 AF 80 00 F3 AF 80 00 F3 AF 80 00 F3 AF 80 00 F3 AF 80 00 F3 AF 80 00 F3 AF 80 00 F3 AF 80 00 50 53 51 52 56 57 55 89 e5 83 ec 18 31 f6 56 6a 63
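The same idea can be sketched in Python for four-byte NOP words. This is a rough illustration, assuming the two ARM encodings exactly as written above. The labels `arm32` and `arm64`, the function name `find_word_sled`, and the eight-repeat threshold are all hypothetical choices of mine. If your dump shows the bytes in a different order, the table would need those variants too.

```python
import re

# NOP words as given in the text above; the keys are just labels for this sketch.
NOP_WORDS = {
    "arm32": bytes.fromhex("f3af8000"),
    "arm64": bytes.fromhex("d503201f"),
}

def find_word_sled(data: bytes, min_repeats: int = 8):
    """Return (label, start, repeats) for the first known word repeated often enough."""
    for label, word in NOP_WORDS.items():
        # Build a pattern such as (?:<word>){8,}; re.escape guards bytes that
        # happen to be regex metacharacters (0x20 is a space, for example).
        pattern = re.compile(
            b"(?:" + re.escape(word) + b"){" + str(min_repeats).encode() + b",}"
        )
        match = pattern.search(data)
        if match:
            repeats = (match.end() - match.start()) // len(word)
            return label, match.start(), repeats
    return None

# The 32-bit ARM example: the NOP word eleven times, then the example payload.
arm_sample = bytes.fromhex(
    "f3 af 80 00 " * 11 + "50 53 51 52 56 57 55 89 e5 83 ec 18 31 f6 56 6a 63"
)
print(find_word_sled(arm_sample))
```

Matching known words is the easy case. It does nothing for a do-nothing sequence that isn’t in the table.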
Like ARM, MIPS also has no single-byte NOP. And unfortunately, the do-nothing sequence on some MIPS variants happens to be four zero bytes. So a long sequence of zero bytes whose length is an exact multiple of four might be a NOP sled for a MIPS CPU. But it could also be perfectly innocent, because initializing memory to a known value is good practice, and everyone tends to choose zero.
On some other MIPS variants, the do-nothing sequence is 60 00 00 19. So we’ll use that one for illustrative purposes.
60 00 00 19 60 00 00 19 60 00 00 19 60 00 00 19 60 00 00 19 60 00 00 19 60 00 00 19 60 00 00 19 60 00 00 19 60 00 00 19 50 53 51 52 56 57 55 89 e5 83 ec 18 31 f6 56 6a 63
Spotting buffer overflows as a general rule
A buffer overflow in the real world might be trickier than the simplest case of a 32-bit Intel x86 overflow. If you see a short sequence of bytes, typically one to four, repeating many times in a row, followed by something more complex and much less repetitive, it may be a buffer overflow.
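That general rule can be turned into a rough heuristic without knowing the CPU in advance. The Python sketch below is only an illustration under assumptions I picked myself: a repeat period of one to four bytes, at least 32 bytes of sled, and at least eight distinct byte values in whatever follows. The function name `looks_like_overflow` is hypothetical, and a match is a reason to look closer, not proof.

```python
def looks_like_overflow(data: bytes, max_period: int = 4,
                        min_sled_bytes: int = 32, min_tail_distinct: int = 8):
    """Look for a short byte pattern repeating for a long stretch, followed by
    a varied tail. Returns a dict describing the first match, or None."""
    for period in range(1, max_period + 1):
        start = 0
        while start + period < len(data):
            # Extend the run while each byte equals the byte one period earlier.
            end = start + period
            while end < len(data) and data[end] == data[end - period]:
                end += 1
            if end - start >= min_sled_bytes:
                tail = data[end:]
                # A sled with nothing varied after it (all zeroes, say) is not flagged.
                if len(set(tail)) >= min_tail_distinct:
                    return {"period": period, "sled_start": start,
                            "sled_len": end - start, "tail_len": len(tail)}
            # Any run starting inside this one ends at the same place, so skip ahead.
            start = end - period + 1
    return None

payload = bytes.fromhex("50 53 51 52 56 57 55 89 e5 83 ec 18 31 f6 56 6a 63")
print(looks_like_overflow(b"\x90" * 32 + payload))                   # one-byte pattern
print(looks_like_overflow(bytes.fromhex("f3af8000") * 11 + payload))  # four-byte pattern
print(looks_like_overflow(b"\x00" * 64))                              # zero-filled memory
```

Requiring a varied tail is what keeps plain zero-filled memory from being flagged, which matches the caveat about zero bytes above. It also means a zero-byte sled followed by a real payload would be flagged, so the result still needs a human or a disassembler to confirm it.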