[[!meta copyright="Copyright © 2009, 2010, 2011, 2013, 2014 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]]

[[!tag open_issue_hurd]]

Given an `a.out` executable that only does `raise (SIGABRT)`, invoking that
one...

  * ... against `crash-dump-core` will...

      * ... not overwrite existing `core` files.  Is this reasonable?  Linux
        does overwrite them, for example.

      * ... show big variances in running-time behavior:

            $ TIMEFORMAT='real %R user %U system %S'
            $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
            Aborted (core dumped)
            real 1.350 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 21:59 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
            Aborted (core dumped)
            real 22.771 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 21:59 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
            Aborted (core dumped)
            real 1.367 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:00 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
            Aborted (core dumped)
            real 5.789 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:00 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
            Aborted (core dumped)
            real 22.664 user 0.010 system 0.000
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:01 core

      * ... produce a huge `core` file:

            $ du -hs core
            17M core

        On Linux, the `core` file occupies 76 KiB of disk space, which seems
        much more reasonable.  This is possibly related to the default
        128 MiB heap preallocation.

      * ... not always produce a useful backtrace:

        `abort();`

            $ gdb test core
            warning: core file may not match specified executable file.
            [New Thread 86678]
            warning: Wrong size fpregset in core file.
            ...
            Core was generated by `./test'.
            Program terminated with signal 6, Aborted.
            warning: Wrong size fpregset in core file.
            (gdb) bt
            #0 0x00000000 in ?? ()
            #1 0x011f593f in __msg_sig_post (process=72, signal=6, sigcode=0, refport=1) at /build/buildd-eglibc_2.10.2-7-hurd-i386-iGL6op/eglibc-2.10.2/build-tree/hurd-i386-libc/hurd/RPC_msg_sig_post.c:144
            #2 0x0109a433 in kill_port (pid=<value optimized out>) at ../sysdeps/mach/hurd/kill.c:68
            #3 kill_pid (pid=<value optimized out>) at ../sysdeps/mach/hurd/kill.c:105
            #4 0x0109a69f in __kill (pid=21142, sig=6) at ../sysdeps/mach/hurd/kill.c:139
            #5 0x01099af6 in raise (sig=6) at ../sysdeps/posix/raise.c:27
            #6 0x0109de59 in abort () at abort.c:88
            #7 0x0804849f in main ()

        `char *foo = 0; *foo = 1;`

            $ gdb test core
            Program terminated with signal 11, Segmentation fault.
            warning: Wrong size fpregset in core file.
            #0 0x00000000 in ?? ()
            (gdb) bt
            #0 0x00000000 in ?? ()
            #1 0x0108565b in __libc_start_main (main=0x8048464 <main>, argc=1, ubp_av=0x1023e64, init=0x8048490 <__libc_csu_init>, fini=0x8048480 <__libc_csu_fini>, rtld_fini=0xea20 <_dl_fini>, stack_end=0x1023e5c) at libc-start.c:251
            #2 0x080483d1 in _start ()

        `raise (SIGABRT);`

            $ gdb a.out core
            warning: core file may not match specified executable file.
            [New Thread 76651]
            warning: Wrong size fpregset in core file.
            Reading symbols from /lib/libc.so.0.3...[...]
            Core was generated by `./a.out'.
            Program terminated with signal 6, Aborted.
            warning: Wrong size fpregset in core file.
            #0 0x00000000 in ?? ()
            (gdb) bt
            #0 0x00000000 in ?? ()
            Cannot access memory at address 0x17

        [[!tag open_issue_gdb]]
        Probably [[GDB]] doesn't manage to unwind the stack properly.

  * ... against `crash-suspend` will...

      * ... not work at all:

            $ CRASHSERVER=/servers/crash-suspend ./a.out
            $ [returns to the shell and doesn't get suspended]

      * ... show big variances in running-time behavior:

            $ TIMEFORMAT='real %R user %U system %S'
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 1.381 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:04 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 1.332 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:04 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 21.228 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:04 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 1.323 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:05 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 22.279 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:05 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 1.362 user 0.000 system 0.000
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 21.110 user 0.000 system 0.000
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 1.350 user 0.000 system 0.020
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core

      * ... can reliably crash GNU Mach:

        This happens if a `core` file is already present (and won't get
        overwritten; see above).  I reproduced this three times.

            $ TIMEFORMAT='real %R user %U system %S'
            $ time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted
            real 2.856 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core

            panic: zalloc: zone kalloc.8192 exhausted
            Kernel Breakpoint trap, eip 0x20020a77
            Stopped at 0x20020a76: int $3
            db> trace
            0x20020a76(2006aba8,4d0f7e9c,200209b0,0,0)
            0x20020a4d(2006b094,2006ae40,2000,20016803,4a5f4114)
            0x2002bca5(49a03564,1,0,9,1000)
            0x20022f4c(2000,4a5f45d4,4a84879c,49a46564,4ac43e78)
            0x20021e65(4ac43e78,4a5f45d4,4a5f4114,0,0)
            0x2005309d(2106ba9c,3,38,28,1783)
            Bad frame pointer: 0x2106ba78

            $ addr2line -i -f -e /boot/gnumach-xen 0x20020a76 0x20020a4d 0x2002bca5 0x20022f4c 0x20021e65 0x2005309d
            Debugger
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/debug.c:105
            panic
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/debug.c:148
            zalloc
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/zalloc.c:470
            kalloc
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/kalloc.c:185
            ipc_kobject_server
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/ipc_kobject.c:76
            mach_msg_trap
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/ipc/mach_msg.c:1367


# IRC, freenode, #hurd, 2013-09-07

    <rekado> I'm trying to investigate a crash in pfinet, so it will actually die. I just want to know why it dies and what the value of a few variables has been when it died.
    <teythoon> have you tried to make it dump core?
    <rekado> oh, good idea.
    <rekado> I'll try that.
    <teythoon> do you know how?
    <rekado> I don't, but I think I can figure it out.
    <teythoon> look into /servers
    <rekado> do I just have to set CRASHSERVER=/servers/crash-dump-core and run pfinet in that environment?
    <teythoon> possibly, I've never heard of CRASHSERVER, but it's certainly plausible ;)
    <teythoon> I just link crash to crash-dump-core, that way it is permanent and for all processes
    <rekado> found it in the website contents
    <rekado> gotta try that.
    <rekado> hmm, I can't get pfinet to dump core; linked /servers/crash to /servers/crash-dump-core and compiled pfinet to raise(6) at one point.
    <rekado> But no core file is created.
    <teythoon> :/
    <teythoon> rekado: try cd /tmp ; cat & kill -SIGILL %% to see if that dumps core
    <rekado> yes, this works.
    <rekado> I replaced the original pfinet with my crashing version.
    <rekado> Should it dump core to /hurd then?
    <teythoon> I'm not sure about it's wd
    <teythoon> hm, ok, I just did settrans -ca foo /hurd/pfinet and then killed that pfient with SIGILL and it dumped core
    <teythoon> to the directory I issued the settrans from
    <rekado> So I must run it myself. I can't just replace the original binary and have it dump core somewhere.
    <teythoon> it seems that you have to use settrans -ca to start an active translator
    <teythoon> do fsysopts /servers/socket/2 to find out the cmdline of your pfinet
    <rekado> that's very helpful.
    <rekado> thanks
    <teythoon> then use this to restart it, e.g.:
    <teythoon> settrans -afg /servers/socket/2 $(fsysopts /servers/socket/2)
    <teythoon> if it dies it should dump core to you cwd
    <rekado> great. Thank you very much. I had been wondering how to get the full cmdline of pfinet.
    * rekado makes a note of fsysopts
    <rekado> yup, there's the core file. Nice.
    <teythoon> cool 8D
    <teythoon> btw, in case using gdb doesn't work out for your problem, if you start pfinet (or any translator) this way (with -a == active), you can write stuff to stderr
    <rekado> yeah, I noticed that. The assert() call wrote to stderr. Useful.
    <braunr> rekado: core dumps are another not-working-well feature of the hurd :/
    <braunr> i recommend attaching
    <tschwinge> rekado: In case that's still helpful: <http://www.gnu.org/software/hurd/hurd/debugging/translator.html>.


# IRC, freenode, #hurd, 2013-12-14

    <gnu_srs> How to get a core dump?
    <teythoon> either set CRASHSERVER to /servers/crash-dump-core for the process you want the core file of
    <teythoon> or make /servers/crash point to crash-dump-core to make this the default for all processes
    <gnu_srs> does it work now, it did not before?
    <teythoon> it does for me, never had issues
    <gnu_srs> k!
    <teythoon> well, i believe the second option has issues
    <teythoon> if two processes crash, both may write/create a file in the same location

---

If someone is working in this area, they may want to have a look at
[[GDB_gcore]], and port <http://code.google.com/p/google-coredumper/>, too.
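
For anyone repeating the running-time measurements from the top of this page,
the following is a minimal sketch of a helper that automates the
`rm -f core; time env CRASHSERVER=... ./a.out; ls -l core` sequence.  The
function name `time_crash_runs`, its argument order, the default of five runs,
and the one-second resolution (via `date +%s`) are assumptions made for
illustration; they are not part of the Hurd tools.

```shell
#!/bin/sh
# Sketch only: repeat a crash test and report how long each run took.
# Hypothetical helper; name, arguments and output format are assumptions.

# time_crash_runs SERVER PROGRAM [RUNS]
# Prints one line per run: "<run> <elapsed seconds> <core size in bytes, or ->".
time_crash_runs () {
    server=$1
    program=$2
    runs=${3:-5}
    i=1
    while [ "$i" -le "$runs" ]; do
        # Remove any stale core file first, because an existing `core`
        # was observed not to be overwritten.
        rm -f core
        start=$(date +%s)
        # The program is expected to die from a signal; ignore its status.
        env CRASHSERVER="$server" "$program" >/dev/null 2>&1 || true
        end=$(date +%s)
        if [ -f core ]; then
            size=$(wc -c < core | tr -d ' ')
        else
            size=-
        fi
        echo "$i $((end - start)) $size"
        i=$((i + 1))
    done
}
```

It could be invoked as `time_crash_runs /servers/crash-dump-core ./a.out 5`.
Whole-second resolution is coarse, but the variance reported above (roughly
one second versus more than twenty) is large enough to show up.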