[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]] [[!meta title="fork: mach_port_mod_refs: EKERN_UREFS_OWERFLOW"]] [[!toc]] # Original Report In the [[GCC testsuite|gcc]], at this point: Running /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/unsorted.exp ... ... `expect` had gone bonkers: $ ps --all --format=hurd-long -w PID UID PPID PGrp Sess TH Vmem RSS %CPU User System Args [...] 3567 1000 10295 3567 3567 2 137M 856K 98.2 5hrs 28 hours expect -- /usr/share/dejagnu/runtest.exp --tool gcc [...] Last lines of `gcc/testsuite/gcc/gcc.sum`: PASS: gcc.c-torture/unsorted/q.c, -O2 -flto -flto-partition=none PASS: gcc.c-torture/unsorted/q.c, -O2 -flto PASS: gcc.c-torture/unsorted/r.c, -O0 PASS: gcc.c-torture/unsorted/r.c, -O1 PASS: gcc.c-torture/unsorted/r.c, -O2 PASS: gcc.c-torture/unsorted/r.c, -O3 -fomit-frame-pointer PASS: gcc.c-torture/unsorted/r.c, -O3 -g PASS: gcc.c-torture/unsorted/r.c, -Os PASS: gcc.c-torture/unsorted/r.c, -O2 -flto -flto-partition=none Last lines of `gcc/testsuite/gcc/gcc.log`: Executing on host: /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/ -w -O2 -flto -flto-partition=none -c -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c (timeout = 300) spawn /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/ -w -O2 -flto -flto-partition=none -c -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c PASS: gcc.c-torture/unsorted/r.c, -O2 -flto -flto-partition=none Executing on host: /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/ -w -O2 -flto -c -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c (timeout = 300) spawn /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/ -w -O2 -flto -c -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c The root filesystem is sort-of deadlocked: `syncfs -c /` doesn't finish -- even without `-s`. But it is fine to spawn new processes, execute new commands, etc. GDB on 3567: (gdb) info threads 2 Thread 3567.2 0x011aaf4c in mach_msg_trap () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/mach_msg_trap.S:2 * 1 Thread 3567.1 0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2 (gdb) bt #0 0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2 #1 0x011ac824 in __spin_lock_solid (lock=0x131e8e0) at spin-solid.c:27 #2 0x011aca1d in __mutex_lock_solid (lock=0x131e8e0) at mutex-solid.c:31 #3 0x0122dd0b in __mutex_lock (oldmem=0x8076458, bytes=94) at ../mach/lock-intern.h:89 #4 __libc_realloc (oldmem=0x8076458, bytes=94) at malloc.c:3814 #5 0x0121de62 in _IO_vasprintf (result_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n", args=0x15f3c9c "") at vasprintf.c:86 #6 0x01206d1b in ___asprintf (string_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n") at asprintf.c:37 #7 0x011e2fc8 in __assert_perror_fail (errnum=19, file=0x1305a98 "../sysdeps/mach/hurd/fork.c", line=466, function=0x1305acf "__fork") at assert-perr.c:62 #8 0x012586c8 in __fork () at ../sysdeps/mach/hurd/fork.c:466 #9 0x011f92e9 in do_system (line=0x15f42dc "/bin/stty sane > /dev/ttypa") at ../sysdeps/posix/system.c:119 #10 0x0105bea6 in ?? () from /usr/lib/libexpect.so.5.44.1.15 #11 0x0105bf6d in ?? () from /usr/lib/libexpect.so.5.44.1.15 #12 0x0105c229 in exp_getptyslave () from /usr/lib/libexpect.so.5.44.1.15 #13 0x0103e4b2 in ?? () from /usr/lib/libexpect.so.5.44.1.15 #14 0x01087d79 in ?? () from /usr/lib/libtcl8.5.so.0 #15 0x01088beb in ?? () from /usr/lib/libtcl8.5.so.0 #16 0x0108826a in Tcl_EvalEx () from /usr/lib/libtcl8.5.so.0 #17 0x0108985f in TclEvalObjEx () from /usr/lib/libtcl8.5.so.0 [...] (gdb) bt full #0 0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2 No locals. #1 0x011ac824 in __spin_lock_solid (lock=0x131e8e0) at spin-solid.c:27 No locals. #2 0x011aca1d in __mutex_lock_solid (lock=0x131e8e0) at mutex-solid.c:31 No locals. #3 0x0122dd0b in __mutex_lock (oldmem=0x8076458, bytes=94) at ../mach/lock-intern.h:89 No locals. #4 __libc_realloc (oldmem=0x8076458, bytes=94) at malloc.c:3814 ar_ptr = nb = 104 newp = 0x68 oldp = 0x8076450 oldsize = 104 __func__ = "__libc_realloc" #5 0x0121de62 in _IO_vasprintf (result_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n", args=0x15f3c9c "") at vasprintf.c:86 sf = {_sbf = {_f = {_flags = -72515584, _IO_read_ptr = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", _IO_read_end = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", _IO_read_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", _IO_write_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", _IO_write_ptr = 0x80764b5 "", _IO_write_end = 0x80764bc "\201\004", _IO_buf_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", _IO_buf_end = 0x80764bc "\201\004", _IO_save_base = 0x0, _IO_backup_base = 0x0, _IO_save_end = 0x0, _markers = 0x0, _chain = 0x0, _fileno = 20046008, _flags2 = 0, _old_offset = 23018464, _cur_column = 0, _vtable_offset = 49 '1', _shortbuf = "\001", _lock = 0x0, _offset = 85643859813466072, _codecvt = 0x1304583, _wide_data = 0x15f3bc0, _freeres_list = 0x0, _freeres_buf = 0x15f3c58, _freeres_size = 0, _mode = -1, _unused2 = "\240;_\001\236D0\001\005\000\000\000\340;_\001(K3\001\000\000\000\000\005\000\000\000\000\000\000\000\250\230\060\001\260\303\031\001"}, vtable = 0x131b9c0}, _s = {_allocate_buffer = 0x122cdf0 <__libc_malloc>, _free_buffer = 0x122cd20 <__libc_free>}} ret = needed = 94 #6 0x01206d1b in ___asprintf (string_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n") at asprintf.c:37 done = 1 #7 0x011e2fc8 in __assert_perror_fail (errnum=19, file=0x1305a98 "../sysdeps/mach/hurd/fork.c", line=466, function=0x1305acf "__fork") at assert-perr.c:62 errbuf = "\334\r", '\000' , "\f\265\032\001\000\000\000\000x\262\004\b\000\000\000\000\000\000\000\000\377\377\377\377 \262\004\b\250\065\063\001\070A_\001k\000\000\000\000\000\000\000ı2\001\002", '\000' "\366, \377\377\377\270\235\004\bk\000\000\000X=_\001\037\343\037\001\377\377\377\377\000\000\000\000s=_\001\361\t\006\001\070A_\001\362\t\006\001\350\t\006\001\000\000\000\000\304B_\001\350\t\006\001\330\377_\001\033\000\000\000)\036\024\001\364\267\025\001x=_\001Bq\022\001\000\000\000\000\350:\b\b\230=_\001\364\267\025\001X\313\031\b S\005\b\230=_\001\004\334\f\001X\313\031\b\000\022\030\b\300L\005\b\364\267\025\001\370\021\030\b S\005\b\bB_\001\a\365\f\001 S\005\b\320\021\030\b\001\000\000\000\274A_\001,\316\024\001\001\000\000\000\001\000\000\000\344"... buf = #8 0x012586c8 in __fork () at ../sysdeps/mach/hurd/fork.c:466 newproc = 122 sigthread_refs = 4 portnames = 0x63000 porttypes = 0x64000 sigthread = 130 state = {gs = 1, fs = 18236712, es = 20390776, ds = 17004532, edi = 1, esi = 143348, ebp = 23020160, esp = 0, ebx = 23020088, edx = 23020016, ecx = 23020028, eax = 3966371413, eip = 23020088, cs = 18236712, efl = 0, uesp = 20138312, ss = 18488015} newtask = 121 thread = 139 thread_refs = 65534 statecount = nportnames = 142 nporttypes = 142 env = {{__jmpbuf = {20037620, 23068628, 23020252, 23020128, 23019736, 19231503}, __mask_was_saved = 0, __saved_mask = 4222451713}} pid = err = EKERN_INVALID_ADDRESS __PRETTY_FUNCTION__ = "__fork" ss = 0x1376808 threads = 0x65000 nthreads = 2 stopped = 0 i = 2 #9 0x011f92e9 in do_system (line=0x15f42dc "/bin/stty sane > /dev/ttypa") at ../sysdeps/posix/system.c:119 status = save = pid = sa = {__sigaction_handler = {sa_handler = 0x1, sa_sigaction = 0x1}, sa_mask = 524288, sa_flags = 0} omask = 0 [...] `fork` failed here: 458 /* We have one extra user reference created at the beginning of this 459 function, accounted for by mach_port_names (and which will thus be 460 accounted for in the child below). This extra right gets consumed 461 in the child by the store into _hurd_sigthread in the child fork. */ 462 if (thread_refs > 1 && 463 (err = __mach_port_mod_refs (newtask, ss->thread, 464 MACH_PORT_RIGHT_SEND, 465 thread_refs))) 466 LOSE; This is in the parent, before signal thread setup, registering with the proc server, and starting the new process. The error is 19, `EKERN_UREFS_OVERFLOW`. (This is likely also the reason why the error path did not execute successfully.) [[!tag open_issue_glibc]] On 2010-11-30 and 2010-12-04, when I had again started the GCC testsuite, it failed again, but at another position (understandably), but with the same symptoms as shown below. In particular, the `thread_refs` values were the same ones. # Discussion * * This is likely *simply* a programming error in glibc's fork implementation. # Bounty There is a [[!FF_project 273]][[!tag bounty]] on this task.