Aug 18, 2013

Debug corrupt stack


How to coerce gdb into giving a backtrace

When debugging with gdb, I sometimes have trouble getting it to produce a usable stack trace. Sometimes it shows only a few function calls, or none at all. Here's an example:
(gdb) bt
#0  0xb7886424 in __kernel_vsyscall ()
#1  0xb7559163 in ?? () from /lib/i686/cmov/
#2  0xb74f1387 in ?? () from /lib/i686/cmov/
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

This obviously isn't a very useful backtrace. At this point it can be useful to have a look at the registers and the contents of the stack.
(gdb) info registers
eax            0xfffffe00 -512
ecx            0x80 128
edx            0x2 2
ebx            0xb75c33a0 -1218694240
esp            0xbfe55018 0xbfe55018    <= stack pointer on x86 
ebp            0xbfe55048 0xbfe55048
esi            0x0 0
edi            0x0 0
eip            0xb7886424 0xb7886424 <__kernel_vsyscall>
eflags         0x202 [ IF ]
cs             0x73 115
ss             0x7b 123
ds             0x7b 123
es             0x7b 123
fs             0x0 0
gs             0x33 51
(gdb) x/64x $sp
0xbfe55018: 0xbfe55048 0x00000002 0x00000080 0xb7559163
0xbfe55028: 0xb75c33a0 0xb75c1ff4 0x09ec0470 0xb74f1387
0xbfe55038: 0xb77c4afd 0xb782199c 0xb782199c 0x09ec0478
0xbfe55048: 0xbfe55078 0xb77c4bf4 0x09ec0478 0xb782199c
0xbfe55058: 0x0000ffff 0xb75c3438 0xbfe55098 0xb789a240
0xbfe55068: 0x00000000 0xb782199c 0x0000ffff 0xb75c3438
0xbfe55078: 0xbfe55098 0xb7818736 0x09ec0478 0x00000000
0xbfe55088: 0xb77c50cd 0xb782199c 0xb7818709 0xb782199c
0xbfe55098: 0xbfe550b8 0xb77c54ad 0x00000000 0x00000000
0xbfe550a8: 0x0000000b 0xb75c3438 0xb77c5449 0xb782199c
0xbfe550b8: 0xbfe550c8 0xb77bb7dd 0xb782199c 0x0000000b
0xbfe550c8: 0xbfe550e8 0xb77bb84e 0x0000ffff 0xb789a240
0xbfe550d8: 0x00000000 0xb77bb830 0xb77bb839 0xb782199c
0xbfe550e8: 0xbfe55108 0xb77bc08f 0x0000000b 0x00000000
0xbfe550f8: 0x00000000 0x00000010 0xb75c1ff4 0x3efafafb
0xbfe55108: 0xbfe556c8 0xb7886400 0x0000000b 0x00000033

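The saved frame pointers (0xbfe55048 at the top of the dump, then 0xbfe55078, 0xbfe55098, 0xbfe550b8 and so on, each one stored at the address the previous one points to) form a linked list back up the stack. If you compiled with -fomit-frame-pointer, things will be harder to figure out, because ebp is then no longer used to maintain that chain. In this case it certainly doesn't look like the stack is corrupt. Perhaps gdb is just confused.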
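
The linked list can also be followed with a small script instead of by eye. The following is only a minimal sketch in Python, not something gdb provides: it assumes the `x/64x $sp` dump above has been pasted in as text, that every function in the chain keeps a frame pointer in ebp, and that the return address sits in the word just above each saved ebp (the usual 32-bit x86 layout). The names `parse_dump` and `walk_frames` are made up for this illustration.

```python
# Hex dump copied verbatim from the `x/64x $sp` output above.
DUMP = """\
0xbfe55018: 0xbfe55048 0x00000002 0x00000080 0xb7559163
0xbfe55028: 0xb75c33a0 0xb75c1ff4 0x09ec0470 0xb74f1387
0xbfe55038: 0xb77c4afd 0xb782199c 0xb782199c 0x09ec0478
0xbfe55048: 0xbfe55078 0xb77c4bf4 0x09ec0478 0xb782199c
0xbfe55058: 0x0000ffff 0xb75c3438 0xbfe55098 0xb789a240
0xbfe55068: 0x00000000 0xb782199c 0x0000ffff 0xb75c3438
0xbfe55078: 0xbfe55098 0xb7818736 0x09ec0478 0x00000000
0xbfe55088: 0xb77c50cd 0xb782199c 0xb7818709 0xb782199c
0xbfe55098: 0xbfe550b8 0xb77c54ad 0x00000000 0x00000000
0xbfe550a8: 0x0000000b 0xb75c3438 0xb77c5449 0xb782199c
0xbfe550b8: 0xbfe550c8 0xb77bb7dd 0xb782199c 0x0000000b
0xbfe550c8: 0xbfe550e8 0xb77bb84e 0x0000ffff 0xb789a240
0xbfe550d8: 0x00000000 0xb77bb830 0xb77bb839 0xb782199c
0xbfe550e8: 0xbfe55108 0xb77bc08f 0x0000000b 0x00000000
0xbfe550f8: 0x00000000 0x00000010 0xb75c1ff4 0x3efafafb
0xbfe55108: 0xbfe556c8 0xb7886400 0x0000000b 0x00000033
"""


def parse_dump(text):
    """Map each 4-byte-aligned address in the dump to the 32-bit word stored there."""
    memory = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        addr_text, words_text = line.split(":")
        base = int(addr_text, 16)
        for i, word in enumerate(words_text.split()):
            memory[base + 4 * i] = int(word, 16)
    return memory


def walk_frames(memory, ebp):
    """Follow the saved-ebp links and return a list of (frame_pointer, return_address)."""
    frames = []
    # Stop as soon as the chain leaves the part of the stack we have dumped.
    while ebp in memory and ebp + 4 in memory:
        saved_ebp = memory[ebp]               # link to the caller's frame
        frames.append((ebp, memory[ebp + 4])) # return address sits one word above
        if saved_ebp <= ebp:                  # a sane chain only moves to higher addresses
            break
        ebp = saved_ebp
    return frames


memory = parse_dump(DUMP)
frames = walk_frames(memory, 0xbfe55048)  # ebp as shown by `info registers`
for fp, ret in frames:
    print(f"frame at {fp:#010x}: return address {ret:#010x}")
```

On this dump the walk finds seven frames before the chain leaves the dumped region at 0xbfe556c8. The chain never breaks or loops, which supports the impression that the stack itself is intact.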

Pick a location slightly further up the stack (here 0xbfe55048, which is also the current value of ebp), set the $sp variable to it, and suddenly the backtrace works!
(gdb) set $sp=0xbfe55048
(gdb) bt
#0  0xb7886424 in __kernel_vsyscall ()
#1  0xb782199c in ?? () from /usr/lib/
#2  0xb7818736 in ?? () from /usr/lib/
#3  0xb77c54ad in ?? () from /usr/lib/
#4  0xb77bb7dd in SDL_QuitSubSystem () from /usr/lib/
#5  0xb77bb84e in SDL_Quit () from /usr/lib/
#6  0xb77bc08f in ?? () from /usr/lib/
#8  0xb74ede80 in ?? () from /lib/i686/cmov/
#9  0xb74efc8c in malloc () from /lib/i686/cmov/
#10 0xb7764eec in ?? () from /usr/lib/
#11 0xb7765b98 in Mix_SetPanning () from /usr/lib/
#12 0x080828b1 in I_SDL_StartSound (id=-1218694088, channel=-1218699276, 
    vol=120, sep=38) at i_sdlsound.c:671
#13 0x08073e9b in S_StartSound (origin_p=0xb624f990, sfx_id=22)
    at s_sound.c:656
#14 0x08061459 in T_MoveFloor (floor=0xb63271b8) at p_floor.c:220
#15 0x0806d0ea in P_RunThinkers () at p_tick.c:119
#16 0x0806d161 in P_Ticker () at p_tick.c:153
#17 0x08052fb6 in G_Ticker () at g_game.c:1151
#18 0x0804dbf4 in D_DoomLoop () at d_main.c:437
#19 0x0804ecc0 in D_DoomMain () at d_main.c:1506
#20 0x08054d1c in main (argc=5, argv=0xbfe55d04) at i_main.c:152


So is the trick simply to set $sp to the correct address, so that gdb can then produce a backtrace?

We need to understand the stack pointer :-)
