summaryrefslogtreecommitdiff
path: root/open_issues/fork_mach_port_mod_refs_ekern_urefs_owerflow.mdwn
blob: 39003ae4aa7d1d5640fd7c6daf2b7d87789e17a1 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!meta title="fork: mach_port_mod_refs: EKERN_UREFS_OWERFLOW"]]

[[!toc]]


# Original Report

In the [[GCC testsuite|gcc]], at this point:

    Running /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/unsorted.exp ...

... `expect` had gone bonkers:

    $ ps --all --format=hurd-long -w
      PID  UID  PPID  PGrp  Sess TH  Vmem   RSS %CPU     User   System Args
    [...]
     3567 1000 10295  3567  3567  2  137M  856K 98.2     5hrs 28 hours expect -- /usr/share/dejagnu/runtest.exp --tool gcc
    [...]

Last lines of `gcc/testsuite/gcc/gcc.sum`:

    PASS: gcc.c-torture/unsorted/q.c,  -O2 -flto -flto-partition=none  
    PASS: gcc.c-torture/unsorted/q.c,  -O2 -flto  
    PASS: gcc.c-torture/unsorted/r.c,  -O0  
    PASS: gcc.c-torture/unsorted/r.c,  -O1  
    PASS: gcc.c-torture/unsorted/r.c,  -O2  
    PASS: gcc.c-torture/unsorted/r.c,  -O3 -fomit-frame-pointer  
    PASS: gcc.c-torture/unsorted/r.c,  -O3 -g  
    PASS: gcc.c-torture/unsorted/r.c,  -Os  
    PASS: gcc.c-torture/unsorted/r.c,  -O2 -flto -flto-partition=none  

Last lines of `gcc/testsuite/gcc/gcc.log`:

    Executing on host: /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/  -w  -O2 -flto -flto-partition=none   -c  -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c    (timeout = 300)
    spawn /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/ -w -O2 -flto -flto-partition=none -c -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c
    PASS: gcc.c-torture/unsorted/r.c,  -O2 -flto -flto-partition=none  
    Executing on host: /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/  -w  -O2 -flto   -c  -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c    (timeout = 300)
    spawn /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/ -w -O2 -flto -c -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c

The root filesystem is sort-of deadlocked: `syncfs -c /` doesn't finish
-- even without `-s`.  But it is fine to spawn new processes, execute new
commands, etc.

GDB on 3567:

    (gdb) info threads
      2 Thread 3567.2  0x011aaf4c in mach_msg_trap () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/mach_msg_trap.S:2
    * 1 Thread 3567.1  0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2
    (gdb) bt
    #0  0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2
    #1  0x011ac824 in __spin_lock_solid (lock=0x131e8e0) at spin-solid.c:27
    #2  0x011aca1d in __mutex_lock_solid (lock=0x131e8e0) at mutex-solid.c:31
    #3  0x0122dd0b in __mutex_lock (oldmem=0x8076458, bytes=94) at ../mach/lock-intern.h:89
    #4  __libc_realloc (oldmem=0x8076458, bytes=94) at malloc.c:3814
    #5  0x0121de62 in _IO_vasprintf (result_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n", args=0x15f3c9c "") at vasprintf.c:86
    #6  0x01206d1b in ___asprintf (string_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n") at asprintf.c:37
    #7  0x011e2fc8 in __assert_perror_fail (errnum=19, file=0x1305a98 "../sysdeps/mach/hurd/fork.c", line=466, function=0x1305acf "__fork") at assert-perr.c:62
    #8  0x012586c8 in __fork () at ../sysdeps/mach/hurd/fork.c:466
    #9  0x011f92e9 in do_system (line=0x15f42dc "/bin/stty sane > /dev/ttypa") at ../sysdeps/posix/system.c:119
    #10 0x0105bea6 in ?? () from /usr/lib/libexpect.so.5.44.1.15
    #11 0x0105bf6d in ?? () from /usr/lib/libexpect.so.5.44.1.15
    #12 0x0105c229 in exp_getptyslave () from /usr/lib/libexpect.so.5.44.1.15
    #13 0x0103e4b2 in ?? () from /usr/lib/libexpect.so.5.44.1.15
    #14 0x01087d79 in ?? () from /usr/lib/libtcl8.5.so.0
    #15 0x01088beb in ?? () from /usr/lib/libtcl8.5.so.0
    #16 0x0108826a in Tcl_EvalEx () from /usr/lib/libtcl8.5.so.0
    #17 0x0108985f in TclEvalObjEx () from /usr/lib/libtcl8.5.so.0
    [...]
    (gdb) bt full
    #0  0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2
    No locals.
    #1  0x011ac824 in __spin_lock_solid (lock=0x131e8e0) at spin-solid.c:27
    No locals.
    #2  0x011aca1d in __mutex_lock_solid (lock=0x131e8e0) at mutex-solid.c:31
    No locals.
    #3  0x0122dd0b in __mutex_lock (oldmem=0x8076458, bytes=94) at ../mach/lock-intern.h:89
    No locals.
    #4  __libc_realloc (oldmem=0x8076458, bytes=94) at malloc.c:3814
            ar_ptr = <value optimized out>
            nb = 104
            newp = 0x68
            oldp = 0x8076450
            oldsize = 104
            __func__ = "__libc_realloc"
    #5  0x0121de62 in _IO_vasprintf (result_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n", args=0x15f3c9c "") at vasprintf.c:86
            sf = {_sbf = {_f = {_flags = -72515584, 
                  _IO_read_ptr = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
                  _IO_read_end = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
                  _IO_read_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
                  _IO_write_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
                  _IO_write_ptr = 0x80764b5 "", _IO_write_end = 0x80764bc "\201\004", 
                  _IO_buf_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
                  _IO_buf_end = 0x80764bc "\201\004", _IO_save_base = 0x0, _IO_backup_base = 0x0, _IO_save_end = 0x0, _markers = 0x0, _chain = 0x0, _fileno = 20046008, 
                  _flags2 = 0, _old_offset = 23018464, _cur_column = 0, _vtable_offset = 49 '1', _shortbuf = "\001", _lock = 0x0, _offset = 85643859813466072, 
                  _codecvt = 0x1304583, _wide_data = 0x15f3bc0, _freeres_list = 0x0, _freeres_buf = 0x15f3c58, _freeres_size = 0, _mode = -1, 
                  _unused2 = "\240;_\001\236D0\001\005\000\000\000\340;_\001(K3\001\000\000\000\000\005\000\000\000\000\000\000\000\250\230\060\001\260\303\031\001"}, 
                vtable = 0x131b9c0}, _s = {_allocate_buffer = 0x122cdf0 <__libc_malloc>, _free_buffer = 0x122cd20 <__libc_free>}}
            ret = <value optimized out>
            needed = 94
    #6  0x01206d1b in ___asprintf (string_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n") at asprintf.c:37
            done = 1
    #7  0x011e2fc8 in __assert_perror_fail (errnum=19, file=0x1305a98 "../sysdeps/mach/hurd/fork.c", line=466, function=0x1305acf "__fork") at assert-perr.c:62
            errbuf = "\334\r", '\000' <repeats 14 times>, "\f\265\032\001\000\000\000\000x\262\004\b\000\000\000\000\000\000\000\000\377\377\377\377 \262\004\b\250\065\063\001\070A_\001k\000\000\000\000\000\000\000ı2\001\002", '\000' <repeats 11 times>"\366, \377\377\377\270\235\004\bk\000\000\000X=_\001\037\343\037\001\377\377\377\377\000\000\000\000s=_\001\361\t\006\001\070A_\001\362\t\006\001\350\t\006\001\000\000\000\000\304B_\001\350\t\006\001\330\377_\001\033\000\000\000)\036\024\001\364\267\025\001x=_\001Bq\022\001\000\000\000\000\350:\b\b\230=_\001\364\267\025\001X\313\031\b S\005\b\230=_\001\004\334\f\001X\313\031\b\000\022\030\b\300L\005\b\364\267\025\001\370\021\030\b S\005\b\bB_\001\a\365\f\001 S\005\b\320\021\030\b\001\000\000\000\274A_\001,\316\024\001\001\000\000\000\001\000\000\000\344"...
            buf = <value optimized out>
    #8  0x012586c8 in __fork () at ../sysdeps/mach/hurd/fork.c:466
            newproc = 122
            sigthread_refs = 4
            portnames = 0x63000
            porttypes = 0x64000
            sigthread = 130
            state = {gs = 1, fs = 18236712, es = 20390776, ds = 17004532, edi = 1, esi = 143348, ebp = 23020160, esp = 0, ebx = 23020088, edx = 23020016, 
              ecx = 23020028, eax = 3966371413, eip = 23020088, cs = 18236712, efl = 0, uesp = 20138312, ss = 18488015}
            newtask = 121
            thread = 139
            thread_refs = 65534
            statecount = <value optimized out>
            nportnames = 142
            nporttypes = 142
            env = {{__jmpbuf = {20037620, 23068628, 23020252, 23020128, 23019736, 19231503}, __mask_was_saved = 0, __saved_mask = 4222451713}}
            pid = <value optimized out>
            err = EKERN_INVALID_ADDRESS
            __PRETTY_FUNCTION__ = "__fork"
            ss = 0x1376808
            threads = 0x65000
            nthreads = 2
            stopped = 0
            i = 2
    #9  0x011f92e9 in do_system (line=0x15f42dc "/bin/stty sane > /dev/ttypa") at ../sysdeps/posix/system.c:119
            status = <value optimized out>
            save = <value optimized out>
            pid = <value optimized out>
            sa = {__sigaction_handler = {sa_handler = 0x1, sa_sigaction = 0x1}, sa_mask = 524288, sa_flags = 0}
            omask = 0
    [...]

`fork` failed here:

       458        /* We have one extra user reference created at the beginning of this
       459           function, accounted for by mach_port_names (and which will thus be
       460           accounted for in the child below).  This extra right gets consumed
       461           in the child by the store into _hurd_sigthread in the child fork.  */
       462        if (thread_refs > 1 &&
       463            (err = __mach_port_mod_refs (newtask, ss->thread,
       464                                         MACH_PORT_RIGHT_SEND,
       465                                         thread_refs)))
       466          LOSE;

This is in the parent, before signal thread setup, registering with the
proc server, and starting the new process.

The error is 19, `EKERN_UREFS_OVERFLOW`.

(This is likely also the reason why the error path did not execute
successfully.)

[[!tag open_issue_glibc]]

On 2010-11-30 and 2010-12-04, when I had again started the GCC testsuite, it
failed again, but at another position (understandably), but with the same
symptoms as shown below.  In particular, the `thread_refs` values were the same
ones.


# Discussion

  * <http://lists.gnu.org/archive/html/bug-hurd/2010-11/msg00028.html>

  * <http://lists.gnu.org/archive/html/bug-hurd/2010-12/msg00002.html>

This is likely *simply* a programming error in glibc's fork implementation. 


# Bounty

There is a [[!FF_project 273]][[!tag bounty]] on this task.