Buffer Overflow

This is not a full write-up, just a collection of reminders.

Definition


A buffer overflow happens when more data is written to or read from a buffer than the buffer can hold.
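
Both halves of the definition (reading too much and writing too much) can be shown with a toy model. This is only a sketch: Python itself checks bounds, so a single bytearray stands in for raw memory here, and the names BUF_SIZE, over_read, and unchecked_write are made up for illustration.

```python
# Toy model of raw memory: an 8-byte buffer directly followed by a
# 4-byte neighbouring variable. Sizes and names are illustrative only.
BUF_SIZE = 8


def over_read(memory, offset, length):
    """Read `length` bytes starting at `offset` with no bounds check."""
    return bytes(memory[offset:offset + length])


def unchecked_write(memory, offset, data):
    """Write `data` at `offset` with no bounds check."""
    memory[offset:offset + len(data)] = data


memory = bytearray(b"hello\x00\x00\x00") + bytearray(b"SAFE")

# Reading 12 bytes from an 8-byte buffer leaks the neighbour.
leaked = over_read(memory, 0, 12)

# Writing 10 bytes into an 8-byte buffer corrupts the neighbour.
unchecked_write(memory, 0, b"A" * 10)
neighbour = bytes(memory[BUF_SIZE:])

print(leaked)     # the buffer contents plus the neighbouring bytes
print(neighbour)  # the first two bytes of the neighbour are now "A"
```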

Real examples


The first self-propagating Internet worm, 1988’s Morris Worm, used a buffer overflow in the Unix finger daemon to spread from machine to machine.
And just this May, a buffer overflow found in a Linux driver left (potentially) millions of home and small office routers vulnerable to attack.
Heartbleed: the 2014 OpenSSL bug, an example of the read side of the definition (a buffer over-read)

Stack it up


functions have addresses, which are almost fixed
the memory layout of a program: virtual addresses, libraries, kernel space, and so on
esp (stack pointer), eip (instruction pointer), ebp (frame pointer), the return address, and so on
code injection, shellcode
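
A rough sketch of why the return address matters, using a toy 32-bit stack frame. The layout (local buffer, then saved ebp, then return address) follows the usual x86 convention, but the addresses, BUF_SIZE, and the function names are invented for illustration; this is a model in a bytearray, not a real exploit.

```python
import struct

BUF_SIZE = 16  # size of the local buffer in the toy frame


def make_frame(return_address):
    """Lay out a toy frame: local buffer, saved ebp, return address."""
    saved_ebp = 0xBFFFF6A8  # arbitrary illustrative value
    tail = struct.pack("<II", saved_ebp, return_address)  # little-endian
    return bytearray(BUF_SIZE) + bytearray(tail)


def vulnerable_copy(frame, data):
    """Copy into the buffer with no length check, as strcpy() would."""
    frame[0:len(data)] = data


def read_return_address(frame):
    """The return address sits after the buffer and the saved ebp."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]


frame = make_frame(0x08048456)  # hypothetical legitimate caller address

# 16 bytes fill the buffer, 4 bytes clobber saved ebp,
# and the last 4 bytes replace the return address.
payload = b"A" * BUF_SIZE + b"B" * 4 + struct.pack("<I", 0xDEADBEEF)
vulnerable_copy(frame, payload)

hijacked = read_return_address(frame)
print(hex(hijacked))
```

When the function returns, eip is loaded from that slot, so whoever controls the overflowing data controls where execution goes next.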

An attacker’s toolkit


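shellcode instructions can contain zero bytes, which string functions treat as the end of the input: convert them into equivalent sequences that avoid the problem byte
the return address can contain zero bytes too (because these addresses always lie in the lower part of the address space): use a “call esp” found elsewhere, perhaps in a library or another function (trampolining)
“NOP sled”: pad the front of the shellcode with NOPs so that an imprecise return address still slides into it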
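
Two of these tricks can be sketched in a few lines. The byte encodings below are the standard x86 ones for mov eax, 0 and xor eax, eax; everything else (the helper names, the 32-byte sled, using a two-byte instruction as a stand-in for real shellcode) is an assumption made for illustration.

```python
NOP = b"\x90"  # x86 single-byte no-op

MOV_EAX_0 = bytes.fromhex("b800000000")  # mov eax, 0: contains zero bytes
XOR_EAX_EAX = bytes.fromhex("31c0")      # xor eax, eax: also zeroes eax


def zero_byte_offsets(code):
    """Offsets of bytes that would terminate a C string copy."""
    return [i for i, b in enumerate(code) if b == 0]


def with_nop_sled(payload, sled_len):
    """Prefix the payload with sled_len NOP bytes."""
    return NOP * sled_len + payload


def lands_in_payload(blob, entry, sled_len):
    """Simulate jumping to `entry`: step over NOPs and report whether
    execution arrives exactly at the start of the payload."""
    pc = entry
    while pc < len(blob) and blob[pc] == NOP[0]:
        pc += 1
    return pc == sled_len


SLED_LEN = 32
blob = with_nop_sled(XOR_EAX_EAX, SLED_LEN)  # stand-in payload

print(zero_byte_offsets(MOV_EAX_0))    # problem bytes in the naive encoding
print(zero_byte_offsets(XOR_EAX_EAX))  # none in the rewritten one
print(all(lands_in_payload(blob, e, SLED_LEN) for e in range(SLED_LEN + 1)))
```

Any entry point inside the sled reaches the payload, so the attacker only needs to guess the address to within the sled length.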

Blame C


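gets(), strcpy(), strcat() do no bounds checking at all, and even strncpy(), strncat() are unsafe: strncpy() leaves the result unterminated when the source fills the buffer, and strncat() expects the space remaining, not the total buffer size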
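
The strncpy() pitfall can be modelled without touching real memory. This is a sketch of the documented C semantics (copy at most n bytes, zero-pad if the source is shorter, add no terminator otherwise); strncpy_model and c_string are hypothetical helpers, and the "SECRET" neighbour is made up.

```python
def strncpy_model(dest, src, n):
    """Model of C strncpy(dest, src, n): copy at most n bytes of src,
    pad with zero bytes if src is shorter, add NO terminator otherwise."""
    text = src.split(b"\x00", 1)[0]  # a C string ends at its first zero byte
    chunk = text[:n]
    dest[0:n] = chunk + b"\x00" * (n - len(chunk))


def c_string(mem):
    """Read a C string: everything up to the first zero byte."""
    return bytes(mem).split(b"\x00", 1)[0]


BUF_SIZE = 8

# An 8-byte buffer followed directly by unrelated data.
memory = bytearray(BUF_SIZE) + bytearray(b"SECRET\x00")
strncpy_model(memory, b"ABCDEFGH", BUF_SIZE)  # source exactly fills the buffer
unterminated = c_string(memory)               # runs on into the neighbour

# The usual fix: copy one byte less and terminate by hand.
memory = bytearray(BUF_SIZE) + bytearray(b"SECRET\x00")
strncpy_model(memory, b"ABCDEFGH", BUF_SIZE - 1)
memory[BUF_SIZE - 1] = 0
fixed = c_string(memory)

print(unterminated)
print(fixed)
```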

Fixing the leaks


safe runtime environments
lots of awful C code is still in use: legacy, performance, low-level operations, and many libraries depend on C (even for C# and friends)
and so buffer overflows are still with us
some tools to analyze source code and running programs:
AddressSanitizer and Valgrind, but they require active involvement of the developer
some systems to make it harder to exploit overflows :
W^X (“write exclusive-or execute”), DEP (“data execution prevention”), NX (“No Xecute”), XD (“eXecute Disable”), EVP (“Enhanced Virus Protection,” a rather peculiar term sometimes used by AMD), XN (“eXecute Never”): different names for the same idea, memory that is writable should not be executable
efficient and cheap, and can be backed by hardware support
can be applied to existing programs retroactively just by updating the operating system to one that supports it (although this is hard for things like the JVM and .NET, which generate code at run time)
has been mainstream since 2004
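
The W^X rule can be sketched as a tiny permission model. Real enforcement is done by the CPU and the operating system through page permissions; the Perm and Page classes below are purely hypothetical stand-ins to show the policy, not any real API.

```python
from enum import Flag, auto


class Perm(Flag):
    R = auto()
    W = auto()
    X = auto()


class Page:
    """Toy memory page that enforces W^X at creation time."""

    def __init__(self, perms, data=b""):
        if Perm.W in perms and Perm.X in perms:
            raise ValueError("W^X: writable or executable, not both")
        self.perms = perms
        self.data = bytearray(data)

    def write(self, data):
        if Perm.W not in self.perms:
            raise PermissionError("page is not writable")
        self.data = bytearray(data)

    def execute(self):
        if Perm.X not in self.perms:
            raise PermissionError("page is not executable")
        return "executed %d bytes" % len(self.data)


stack = Page(Perm.R | Perm.W)   # data: writable, not executable
code = Page(Perm.R | Perm.X, b"\xc3")  # code: executable, not writable

stack.write(b"\x90\x90\xcc")    # injecting bytes into the stack still works
try:
    stack.execute()             # but running them is refused
    blocked = False
except PermissionError:
    blocked = True

print(blocked)
print(code.execute())
```

The overflow itself is not prevented; only the step from "attacker bytes are in memory" to "attacker bytes run" is cut off.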

Beyond NX


system() (the so-called return-to-libc technique): useful, but sometimes system() doesn’t take its arguments from the stack, and calling multiple functions is hard
there are a number of ways to extend return-to-libc, but it remains limited
return-oriented programming (ROP): uses gadgets; each gadget follows a particular pattern: it performs some operation (putting a value in a register, writing to memory, adding two registers, etc.) followed by a return instruction. sometimes you can find a Turing-complete set of these, and they can even be used to change the permissions of memory pages, making them executable
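
Finding gadgets is, at its simplest, a search for short byte runs that end in a return. The sketch below scans for the x86 ret opcode (0xC3) without doing any disassembly, so it is far cruder than real tools; SAMPLE, BASE, and find_gadgets are invented for illustration, and the instruction comments are the standard decodings of those bytes.

```python
import struct

RET = 0xC3  # x86 near return


def find_gadgets(code, max_len=3):
    """Return (offset, bytes) for every run of 1..max_len bytes that is
    immediately followed by a ret and does not itself contain a ret."""
    gadgets = []
    for end, byte in enumerate(code):
        if byte != RET:
            continue
        for length in range(1, max_len + 1):
            start = end - length
            if start < 0 or code[start] == RET:
                break
            gadgets.append((start, code[start:end + 1]))
    return gadgets


# Hypothetical library bytes:
#   55        push ebp
#   89 e5     mov ebp, esp
#   58 c3     pop eax ; ret
#   90        nop
#   01 d8 c3  add eax, ebx ; ret
#   5b c3     pop ebx ; ret
SAMPLE = bytes.fromhex("5589e558c39001d8c35bc3")
found = dict(find_gadgets(SAMPLE))

# A chain is just addresses and data laid out where the stack will be:
# each ret pops the next gadget address, each pop consumes the next value.
BASE = 0x08048000  # assumed load address of SAMPLE
chain = struct.pack("<5I", BASE + 3, 2, BASE + 9, 40, BASE + 6)

print(sorted(found))
print(chain.hex())
```

If executed on a real machine with that layout, the chain would load 2 into eax, 40 into ebx, and add them, all without injecting a single new instruction.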

Getting random


Address Space Layout Randomization (ASLR) : it randomizes the position of the stack and the in-memory location of libraries and executables
it’s useful but hard to retrofit onto existing systems: it’s fine for DLLs, but harder on Linux and for EXEs (compatibility, performance)
but the range of possible addresses is limited on 32-bit x86: total memory is limited, libraries must stay as close together as they can, code always starts at the beginning of a page, etc.
thus, if the chance of a correct guess is 1/256, an attacker can simply keep trying: on average about 256 attempts are enough
the situation is better on x64: the address space is so large that guessing is almost impossible
browsers are a common target: JavaScript and Flash (both contain JITs), PDF plugins, Microsoft’s Office browser plugins (old versions didn’t enable ASLR)
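
The arithmetic behind the 1/256 guess can be checked directly. This sketch assumes each attempt faces a freshly randomized layout, so guesses are independent; if the target keeps the same layout between attempts, trying every possibility in turn succeeds within 256 tries at most. The 28-bit figure for x64 is an assumed example value, not a measured one.

```python
def success_probability(p, attempts):
    """Chance that at least one of `attempts` independent guesses hits,
    when each guess is right with probability p."""
    return 1.0 - (1.0 - p) ** attempts


def expected_attempts(p):
    """Average number of independent guesses until the first hit."""
    return 1.0 / p


P_X86 = 1 / 256       # the figure from the notes above
P_X64 = 1 / 2 ** 28   # assumed: 28 bits of randomness, for illustration

after_256 = success_probability(P_X86, 256)
after_1000 = success_probability(P_X86, 1000)
x64_after_256 = success_probability(P_X64, 256)

print(round(after_256, 3))        # roughly two chances in three
print(round(after_1000, 3))       # close to certain
print(expected_attempts(P_X86))   # 256.0
print(x64_after_256)              # vanishingly small
```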

A never-ending war


Powerful protective systems such as ASLR and NX raise the bar for taking advantage of flaws and together have put the days of the simple stack buffer overflow behind us, but smart attackers can still combine multiple flaws to defeat these protections.
Microsoft’s EMET (“Enhanced Mitigation Experience Toolkit”) includes a range of semi-experimental protections that try to detect heap spraying or attempts to call certain critical functions in ROP-based exploits. But in the continuing digital arms race, even these security techniques have been defeated.
The difficulty (and hence cost) of exploiting flaws goes up with each new mitigation technique, but every defeated protection is a reminder of the need for constant vigilance.