Problems with dbx on a custom binary and core file
807578, Oct 22 2007 (edited Oct 26 2007)
Hi, y'all. I'm a developer for VMware's virtual machine monitor group. I'm a big fan of dbx, and would love to kick gdb to the curb as my workaday debugger. Unfortunately, dbx seems to be having some trouble with our environment. I'm using custom ELF binaries produced by our own linker, along with core files that come from our own dumper. A typical session follows:
$ dbx -f vmm64 vmware64-core1
Reading vmm64
dbx: warning: program has entry point of 0
dbx: warning: The corefile was truncated.
It should have been 123969840 bytes long (is only 123969839)
Because of this, some functionality will be missing from dbx.
(See `help core')
core file header read successfully
program terminated by signal SEGV (Segmentation fault)
0xffffffffffffffff: <bad address 0xffffffffffffffff>
dbx: core file read error: address 0xffff81007fea5e90 not in data space
dbx: attempt to read frame failed -- cannot get return address
dbx: warning: No frame with source found
(dbx) where
[1] 0xfffffffffc2cefc6(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffffffc2cefc6
[2] 0xfffffffffc2cf31b(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffffffc2cf31b
[3] 0xfffffffffc240fa6(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xffff00000000, 0xfffffffffc2224b0, 0xfffffffffc048df0, 0xfffffffffc001000, 0xfffffffffc048de0, 0xfffffffffc241320, 0xfffffffffc048f40, 0xfffffffffc31b5d0, 0xfffffffffc048ec0, 0xfffffffffc29f9cb, 0x3000000008, 0xfffffffffc048ed0), at 0xfffffffffc240fa6
[4] 0xfffffffffc29fcb8(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffffffc29fcb8
[5] 0xfffffffffc241320(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffffffc241320
[6] 0xfffffffffc29f9cb(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffffffc29f9cb
[7] 0xfffffffffc2dc7ae(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffffffc2dc7ae
[8] 0xfffffffffc2cf928(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffffffc2cf928
[9] 0xfffffffffc317554(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffffffc317554
[10] 0xfffffffffc31b5d5(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffffffc31b5d5
[11] 0xfffffffffc2dcae1(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffffffc2dcae1
[12] 0xfffffffffc2cf928(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffffffc2cf928
I bet the off-by-one error in the core file is our fault. The fanciful backtrace is a little harder to make sense of; we're using frame pointers n' stuff. All of those virtual addresses look like they're part of the stack to me; any idea what we're doing to prevent dbx from picking up the RIP?
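One way to sanity-check the length complaint, as a minimal sketch only: parse the ELF64 program headers of the core and compare the file's real length with the furthest byte any segment claims to occupy. The assumption here (not confirmed for dbx) is that the expected length is the largest p_offset + p_filesz; the function names are made up for illustration, and the layout is taken to be little-endian ELF64.

```python
import struct

# Standard ELF64 little-endian layouts (64-byte file header, 56-byte program header).
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
PHDR = struct.Struct("<IIQQQQQQ")

def read_phdrs(data):
    """Return (p_type, p_offset, p_vaddr, p_filesz, p_memsz) for each program header."""
    fields = EHDR.unpack_from(data, 0)
    ident, e_phoff, e_phentsize, e_phnum = fields[0], fields[5], fields[9], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    phdrs = []
    for i in range(e_phnum):
        p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz, p_memsz, _align = \
            PHDR.unpack_from(data, e_phoff + i * e_phentsize)
        phdrs.append((p_type, p_offset, p_vaddr, p_filesz, p_memsz))
    return phdrs

def expected_size(data):
    """Assumed rule: the file must reach the end of the furthest segment."""
    return max((off + filesz for _t, off, _va, filesz, _ms in read_phdrs(data)), default=0)

def shortfall(data):
    """Bytes missing from the file under the assumed rule (0 or negative means long enough)."""
    return expected_size(data) - len(data)
```

Running shortfall() over the bytes of the core file would show whether the last segment's p_filesz overshoots the data the dumper actually wrote. If it reports 1, the dumper is the likely culprit for the truncation warning.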
Examining globals seems to work ok:
(dbx) p VC
VC = 0xfffffffffc019000
(dbx) p gPhysL3MPNs
gPhysL3MPNs = (1898348U, 1818511U, 0, 0, 0, 0, 0, 0)
Text symbols are dandy enough:
(dbx) l BT_Init
122 *----------------------------------------------------------------------
123 */
124 void
125 BT_Init(void)
126 {
...
It seems like it's just the stack frames that are giving dbx fits. Any thoughts? My guess is that we're doing something wrong, either in our core file or our ELF file.
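For comparison, here is a rough sketch of the walk a debugger can do when code is built with frame pointers on x86-64: the saved caller RBP sits at [rbp] and the return address at [rbp+8]. This is not dbx's actual unwinder. The helper names and the (vaddr, bytes) segment list are hypothetical stand-ins for the core's loadable segments. A read that falls outside every segment ends the walk, which is roughly the situation the "not in data space" message describes.

```python
import struct

def make_reader(segments):
    """segments: list of (vaddr, bytes-like) pairs, e.g. segment contents from a core."""
    def read_u64(addr):
        for base, blob in segments:
            if base <= addr and addr + 8 <= base + len(blob):
                return struct.unpack_from("<Q", blob, addr - base)[0]
        raise LookupError(f"address {addr:#x} not in any segment")
    return read_u64

def walk_frames(read_u64, rbp, max_frames=64):
    """Follow the saved-RBP chain and collect return addresses."""
    return_addrs = []
    for _ in range(max_frames):
        try:
            saved_rbp = read_u64(rbp)      # caller's frame pointer
            ret = read_u64(rbp + 8)        # return address into the caller
        except LookupError:
            break                          # frame lies outside the dumped memory
        return_addrs.append(ret)
        if saved_rbp <= rbp:               # stack grows down, so callers sit higher
            break
        rbp = saved_rbp
    return return_addrs
```

If a walk like this, started from the RBP saved in the core's register note, yields sensible return addresses while dbx does not, the register set or the segment covering the stack in the core file would be the first things to check.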