[[!meta copyright="Copyright © 2009, 2010, 2011, 2013, 2014 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]]

[[!tag open_issue_hurd]]

Given an `a.out` executable that only does `raise (SIGABRT)`, invoking that one...

  * ... against `crash-dump-core` will...

      * ... not overwrite existing `core` files. Is this reasonable? Linux does overwrite them, for example.

      * ... show big variances in running-time behavior:

            $ TIMEFORMAT='real %R user %U system %S'
            $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
            Aborted (core dumped)
            real 1.350 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 21:59 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
            Aborted (core dumped)
            real 22.771 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 21:59 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
            Aborted (core dumped)
            real 1.367 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:00 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
            Aborted (core dumped)
            real 5.789 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:00 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
            Aborted (core dumped)
            real 22.664 user 0.010 system 0.000
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:01 core

      * ... produce a huge `core` file:

            $ du -hs core
            17M core

        On Linux, the `core` file occupies 76 KiB of disk space, which seems much more reasonable. This is possibly related to the default 128 MiB heap preallocation.

      * ... not always produce a useful backtrace:

        `abort();`

            $ gdb test core
            warning: core file may not match specified executable file.
            [New Thread 86678]
            warning: Wrong size fpregset in core file.
            ...
            Core was generated by `./test'.
            Program terminated with signal 6, Aborted.
            warning: Wrong size fpregset in core file.
            (gdb) bt
            #0 0x00000000 in ?? ()
            #1 0x011f593f in __msg_sig_post (process=72, signal=6, sigcode=0, refport=1)
                at /build/buildd-eglibc_2.10.2-7-hurd-i386-iGL6op/eglibc-2.10.2/build-tree/hurd-i386-libc/hurd/RPC_msg_sig_post.c:144
            #2 0x0109a433 in kill_port (pid=) at ../sysdeps/mach/hurd/kill.c:68
            #3 kill_pid (pid=) at ../sysdeps/mach/hurd/kill.c:105
            #4 0x0109a69f in __kill (pid=21142, sig=6) at ../sysdeps/mach/hurd/kill.c:139
            #5 0x01099af6 in raise (sig=6) at ../sysdeps/posix/raise.c:27
            #6 0x0109de59 in abort () at abort.c:88
            #7 0x0804849f in main ()

        `char *foo = 0; *foo = 1;`

            $ gdb test core
            Program terminated with signal 11, Segmentation fault.
            warning: Wrong size fpregset in core file.
            #0 0x00000000 in ?? ()
            (gdb) bt
            #0 0x00000000 in ?? ()
            #1 0x0108565b in __libc_start_main (main=0x8048464, argc=1, ubp_av=0x1023e64,
                init=0x8048490 <__libc_csu_init>, fini=0x8048480 <__libc_csu_fini>,
                rtld_fini=0xea20 <_dl_fini>, stack_end=0x1023e5c) at libc-start.c:251
            #2 0x080483d1 in _start ()

        `raise (SIGABRT);`

            $ gdb a.out core
            warning: core file may not match specified executable file.
            [New Thread 76651]
            warning: Wrong size fpregset in core file.
            Reading symbols from /lib/libc.so.0.3...[...]
            Core was generated by `./a.out'.
            Program terminated with signal 6, Aborted.
            warning: Wrong size fpregset in core file.
            #0 0x00000000 in ?? ()
            (gdb) bt
            #0 0x00000000 in ?? ()
            Cannot access memory at address 0x17

        [[!tag open_issue_gdb]]
        Probably [[GDB]] doesn't manage to dig in the stack properly.

  * ... against `crash-suspend` will...

      * ... not work at all:

            $ CRASHSERVER=/servers/crash-suspend ./a.out
            $ [returns to the shell and doesn't get suspended]

      * ... show big variances in running-time behavior:

            $ TIMEFORMAT='real %R user %U system %S'
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 1.381 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:04 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 1.332 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:04 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 21.228 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:04 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 1.323 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:05 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 22.279 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:05 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 1.362 user 0.000 system 0.000
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 21.110 user 0.000 system 0.000
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core
            $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted (core dumped)
            real 1.350 user 0.000 system 0.020
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core

      * ... can reliably crash GNU Mach:

        This happens if a `core` file is already present (and won't get overwritten; see above). I reproduced this three times.

            $ TIMEFORMAT='real %R user %U system %S'
            $ time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
            Aborted
            real 2.856 user 0.000 system 0.010
            -rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core

            panic: zalloc: zone kalloc.8192 exhausted
            Kernel Breakpoint trap, eip 0x20020a77
            Stopped at 0x20020a76: int $3
            db> trace
            0x20020a76(2006aba8,4d0f7e9c,200209b0,0,0)
            0x20020a4d(2006b094,2006ae40,2000,20016803,4a5f4114)
            0x2002bca5(49a03564,1,0,9,1000)
            0x20022f4c(2000,4a5f45d4,4a84879c,49a46564,4ac43e78)
            0x20021e65(4ac43e78,4a5f45d4,4a5f4114,0,0)
            0x2005309d(2106ba9c,3,38,28,1783)
            Bad frame pointer: 0x2106ba78

            $ addr2line -i -f -e /boot/gnumach-xen 0x20020a76 0x20020a4d 0x2002bca5 0x20022f4c 0x20021e65 0x2005309d
            Debugger
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/debug.c:105
            panic
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/debug.c:148
            zalloc
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/zalloc.c:470
            kalloc
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/kalloc.c:185
            ipc_kobject_server
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/ipc_kobject.c:76
            mach_msg_trap
            /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/ipc/mach_msg.c:1367


# IRC, freenode, #hurd, 2013-09-07

    I'm trying to investigate a crash in pfinet, so it will actually die.
    I just want to know why it dies and what the value of a few variables has been when it died.
    have you tried to make it dump core?
    oh, good idea. I'll try that.
    do you know how?
    I don't, but I think I can figure it out.
    look into /servers
    do I just have to set CRASHSERVER=/servers/crash-dump-core and run pfinet in that environment?
    possibly, I've never heard of CRASHSERVER, but it's certainly plausible ;)
    I just link crash to crash-dump-core, that way it is permanent and for all processes
    found it in the website contents
    gotta try that.
    hmm, I can't get pfinet to dump core; linked /servers/crash to /servers/crash-dump-core and compiled pfinet to raise(6) at one point. But no core file is created. :/
    rekado: try cd /tmp ; cat & kill -SIGILL %% to see if that dumps core
    yes, this works.
    I replaced the original pfinet with my crashing version. Should it dump core to /hurd then?
    I'm not sure about it's wd
    hm, ok, I just did settrans -ca foo /hurd/pfinet and then killed that pfient with SIGILL and it dumped core to the directory I issued the settrans from
    So I must run it myself. I can't just replace the original binary and have it dump core somewhere.
    it seems that you have to use settrans -ca to start an active translator
    do fsysopts /servers/socket/2 to find out the cmdline of your pfinet
    that's very helpful. thanks
    then use this to restart it, e.g.: settrans -afg /servers/socket/2 $(fsysopts /servers/socket/2)
    if it dies it should dump core to you cwd
    great. Thank you very much. I had been wondering how to get the full cmdline of pfinet.
     * rekado makes a note of fsysopts
    yup, there's the core file. Nice.
    cool 8D
    btw, in case using gdb doesn't work out for your problem, if you start pfinet (or any translator) this way (with -a == active), you can write stuff to stderr
    yeah, I noticed that. The assert() call wrote to stderr. Useful.
    rekado: core dumps are another not-working-well feature of the hurd :/
    i recommend attaching
    rekado: In case that's still helpful: .


# IRC, freenode, #hurd, 2013-12-14

    How to get a core dump?
    either set CRASHSERVER to /servers/crash-dump-core for the process you want the core file of
    or make /servers/crash point to crash-dump-core to make this the default for all processes
    does it work now, it did not before?
    it does for me, never had issues
    k!
    well, i believe the second option has issues
    if two processes crash, both may write/create a file in the same location

---

If someone is working in this area, they may want to have a look at [[GDB_gcore]], and port , too.
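
The two approaches from the IRC discussions above (selecting the crash server per process through `CRASHSERVER`, and restarting a translator as an active translator from the directory where the core file should land) can be wrapped in a pair of small shell helpers. This is only a sketch: the function names `run_with_core_dump` and `restart_translator_here` are hypothetical and not part of the Hurd, while the `env CRASHSERVER=...`, `settrans -afg` and `fsysopts` invocations are taken from the logs. Whether a usable core file results is subject to the issues listed at the top of this page.

```sh
#!/bin/sh
# Sketch only: the helper names are hypothetical; the commands they wrap
# are the ones quoted in the IRC logs above.

# Run one command with the core-dumping crash server selected through the
# CRASHSERVER environment variable. Only that process is affected.
run_with_core_dump() {
    env CRASHSERVER=/servers/crash-dump-core "$@"
}

# Restart the translator sitting on NODE with its current command line.
# It is started as an active translator from the current directory, so a
# core file should end up here if the translator dies.
restart_translator_here() {
    node=$1
    # fsysopts prints the translator's command line; it is left unquoted
    # on purpose so the shell splits it back into separate arguments.
    settrans -afg "$node" $(fsysopts "$node")
}
```

Typical use would be `run_with_core_dump ./a.out` for a single test program, or `cd /tmp && restart_translator_here /servers/socket/2` for pfinet. Making `/servers/crash` point to `crash-dump-core` for all processes is the alternative mentioned in the logs, with the caveat raised there that two crashing processes may try to write to the same location.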