[[!meta copyright="Copyright © 2013, 2015 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
[[!meta title="cancellation points are not cancelling threads"]]
[[!tag open_issue_libpthread]]
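
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/select.h>
    #include <unistd.h>

    void *f (void *foo)
    {
      char buf[128];

      /* With the bug, only uncommenting this line lets the test terminate.  */
      //pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
      while (1) {
        /* read() is supposed to be a cancellation point.  */
        read (0, buf, sizeof(buf));
      }
    }

    int main (void) {
      pthread_t t;

      pthread_create (&t, NULL, f, NULL);
      sleep (1);
      pthread_cancel (t);
      /* Blocks forever if the thread is never cancelled.  */
      pthread_join (t, NULL);
      exit(0);
    }
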
`read()` does not behave as a cancellation point: only setting the cancel
type to asynchronous lets this test case terminate. We do have the
`pthread_setcanceltype` glibc/libpthread hook in the forward structure, but we
are not using it: the `LIBC_CANCEL_ASYNC` macros are void (they expand to
nothing), and we are not using them in the MIG msg call either.
Update: as of glibc 2.33, a cancellation implementation has been added, and the
test above seems to work fine.
# Provenance
## IRC, OFTC, #debian-hurd, 2013-04-15
<paravoid> so, let me say a few things about the bug in the first place
<paravoid> the package builds and runs a test suite
<paravoid> the second test in the test suite blocks forever
<paravoid> a blocked pthread_join is what I see
<paravoid> I'm unsure why
<paravoid> have you seen anything like it before?
<youpi> whenever the thread doesn't actually terminate, sure
<youpi> what is the thread usually blocked on when you cancel it?
<paravoid> this is a hurd-specific issue
<paravoid> works on all other arches
<youpi> could be just that all other archs have more relaxed behavior
<youpi> thus the question of what exactly is supposed to be happening
<youpi> apparently it is inside a select?
<youpi> it seems select is not cancellable here
<pinotree> wasn't the patch you sent?
<youpi> no, my patch was about signals
<youpi> not cancellation
<pinotree> k
<youpi> (even if that could be related, of course)
<paravoid> how did you see that?
<paravoid> what's the equivalent of strace?
<youpi> thread 3 is inside _hurd_select
<paravoid> thread 1 is blocked on join
<paravoid> but the code is
<paravoid> if(gdmaps->reload_thread_spawned) {
<paravoid> pthread_cancel(gdmaps->reload_tid);
<paravoid> pthread_join(gdmaps->reload_tid, NULL);
<paravoid> }
<paravoid> so cancel should have killed the thread
<youpi> cancelling a thread is a complex matter
<youpi> there are cancellation points
<youpi> e.g. a thread performing while(1); can't be cancelled
<paravoid> thread 3 is just a libev event loop
<youpi> yes, "just" calling poll, the most complex system call of unix :)
<youpi> paravoid: anyway, don't look for a bug in your program, it's most
likely a bug in glibc, thanks for the report
<paravoid> I think it all boils down to a problem cancelling a thread in
poll()
<youpi> yes
<youpi> paravoid: ok, actually with the latest libc it does work
<paravoid> oh?
<youpi> where latest = not uploaded yet :/
<paravoid> did you test this on exodar?
<youpi> pinotree: that's the libpthread_cancellation.diff I guess
<paravoid> because I commented out the join :)
<youpi> paravoid: in the root, yes
<youpi> well, I tried my own program
<paravoid> oh, okay
<youpi> which is indeed hanging inside select (or just read) in the chroot
<youpi> but not in the root
<pinotree> ah, richard's patch
<paravoid> url?
<youpi> I've installed the build-dep in the root, if you want to try
<paravoid> strange that root is newer than the chroot :)
<youpi> paravoid: it's the usual eglibc debian source
<paravoid> tried in root, still fails
<youpi> could you keep the process running?
<paravoid> done
<youpi> Mmm, but the thread running gdmaps_reload_thread never set the
cancel type to async?
<youpi> that said I guess read and select are supposed to be cancellation
points
<youpi> thus cancel_deferred should be working, but they are not
<youpi> it seems it's cancellation points which have just not been
implemented
<youpi> (they happen to be one of the most obscure things in posix)
## IRC, freenode, #hurd, 2013-04-15
<youpi> but yes, there is still an issue, with PTHREAD_CANCEL_DEFERRED
<youpi> how calls like read() or select() are supposed to test
cancellation?
<pinotree> iirc there are the LIBC_CANCEL_* macros in glibc
<pinotree> eg sysdeps/unix/sysv/linux/pread.c
<youpi> yes
<youpi> but in our libpthredaD?
<pinotree> could it be we lack the libpthread → glibc bridge of
cancellation stuff?
<youpi> we do have pthread_setcancelstate/type forwards
<youpi> but it seems the default LIBC_CANCEL_ASYNC is void
<pinotree> i mean, so when you cancel a thread, you can get that cancel
status in libc proper, just like it seems done with LIBC_CANCEL_* macros
and nptl
<youpi> as I said, the bridge is there
<youpi> we're just not using it in glibc
<youpi> I'm writing an open_issues page
### IRC, freenode, #hurd, 2013-04-16
<braunr> youpi: yes, we said some time ago that it was lacking
# `userspace-rcu`
With `2.13-39+hurd.3.rbraun.1` (that is, `2.13-39+hurd.3` plus
`hurd-i386/0001-Mask-options-implemented-by-the-userspace-side-of-ma.patch`)
installed, `test_urcu_gc` hangs during `make check` of the `userspace-rcu`
package:
[...]
./test_urcu_gc 4 4 10 -d 0 -b 4096
[hangs]
(gdb) thread apply all bt
Thread 5 (Thread 14933.5):
#0 0x0106785c in mach_msg_trap () at /home/rbraun/devel/debian/packages/eglibc/eglibc-2.13/build-tree/hurd-i386-libc/mach/mach_msg_trap.S:2
#1 0x01068074 in __mach_msg (msg=0x27fff2c, option=3, send_size=24, rcv_size=32, rcv_name=120, timeout=0, notify=0) at msg.c:115
#2 0x011ed35c in __thread_suspend (target_thread=115) at /home/rbraun/devel/debian/packages/eglibc/eglibc-2.13/build-tree/hurd-i386-libc/mach/RPC_thread_suspend.c:84
#3 0x01045016 in __pthread_thread_halt (thread=0x80744a8) at ../libpthread/sysdeps/mach/pt-thread-halt.c:43
#4 0x01041365 in __pthread_exit (status=0x2) at ./pthread/pt-exit.c:118
#5 0x01040e78 in entry_point (start_routine=0x80494b0 <thr_writer>, arg=0x3) at ./pthread/pt-create.c:50
#6 0x00000000 in ?? ()
Thread 4 (Thread 14933.4):
#0 0x0106785c in mach_msg_trap () at /home/rbraun/devel/debian/packages/eglibc/eglibc-2.13/build-tree/hurd-i386-libc/mach/mach_msg_trap.S:2
#1 0x01068074 in __mach_msg (msg=0x25fff2c, option=3, send_size=24, rcv_size=32, rcv_name=119, timeout=0, notify=0) at msg.c:115
#2 0x011ed35c in __thread_suspend (target_thread=113) at /home/rbraun/devel/debian/packages/eglibc/eglibc-2.13/build-tree/hurd-i386-libc/mach/RPC_thread_suspend.c:84
#3 0x01045016 in __pthread_thread_halt (thread=0x8073aa8) at ../libpthread/sysdeps/mach/pt-thread-halt.c:43
#4 0x01041365 in __pthread_exit (status=0x2) at ./pthread/pt-exit.c:118
#5 0x01040e78 in entry_point (start_routine=0x80494b0 <thr_writer>, arg=0x2) at ./pthread/pt-create.c:50
#6 0x00000000 in ?? ()
Thread 3 (Thread 14933.3):
#0 0x0106785c in mach_msg_trap () at /home/rbraun/devel/debian/packages/eglibc/eglibc-2.13/build-tree/hurd-i386-libc/mach/mach_msg_trap.S:2
#1 0x01068074 in __mach_msg (msg=0x23ffe34, option=1282, send_size=0, rcv_size=40, rcv_name=122, timeout=10, notify=0) at msg.c:115
#2 0x0106ece3 in _hurd_select (nfds=0, pollfds=0x0, readfds=0x0, writefds=0x0, exceptfds=0x0, timeout=0x23ffefc, sigmask=0x0) at hurdselect.c:382
#3 0x0115875b in __poll (fds=fds@entry=0x0, nfds=nfds@entry=0, timeout=timeout@entry=10) at ../sysdeps/mach/hurd/poll.c:48
#4 0x0804a1bc in urcu_adaptative_busy_wait (wait=0x23fff48) at ../urcu-wait.h:164
#5 synchronize_rcu_mb () at ../urcu.c:329
#6 0x0804946c in rcu_gc_clear_queue (wtidx=wtidx@entry=1) at test_urcu_gc.c:241
#7 0x080495e6 in rcu_gc_reclaim (old=<optimized out>, wtidx=1) at test_urcu_gc.c:264
#8 thr_writer (data=0x1) at test_urcu_gc.c:295
#9 0x01040e70 in entry_point (start_routine=0x80494b0 <thr_writer>, arg=0x1) at ./pthread/pt-create.c:50
#10 0x00000000 in ?? ()
Thread 2 (Thread 14933.2):
#0 0x0106785c in mach_msg_trap () at /home/rbraun/devel/debian/packages/eglibc/eglibc-2.13/build-tree/hurd-i386-libc/mach/mach_msg_trap.S:2
#1 0x01068074 in __mach_msg (msg=0x17fdf30, option=3, send_size=32, rcv_size=4096, rcv_name=95, timeout=0, notify=0) at msg.c:115
#2 0x01068799 in __mach_msg_server_timeout (demux=0x1079150 <msgport_server>, max_size=4096, rcv_name=95, option=0, timeout=0) at msgserver.c:151
#3 0x0106886b in __mach_msg_server (demux=0x1079150 <msgport_server>, max_size=4096, rcv_name=95) at msgserver.c:196
#4 0x0107911f in _hurd_msgport_receive () at msgportdemux.c:68
#5 0x01040e70 in entry_point (start_routine=0x10790b0 <_hurd_msgport_receive>, arg=0x0) at ./pthread/pt-create.c:50
#6 0x00000000 in ?? ()
Thread 1 (Thread 14933.1):
#0 0x0106785c in mach_msg_trap () at /home/rbraun/devel/debian/packages/eglibc/eglibc-2.13/build-tree/hurd-i386-libc/mach/mach_msg_trap.S:2
#1 0x01068074 in __mach_msg (msg=0x15ff93c, option=2, send_size=0, rcv_size=24, rcv_name=94, timeout=0, notify=0) at msg.c:115
#2 0x010451a2 in __pthread_block (thread=0x805e600) at ../libpthread/sysdeps/mach/pt-block.c:35
#3 0x010443a8 in __pthread_cond_timedwait_internal (cond=0x80730dc, mutex=0x80730bc, abstime=0x0) at ./pthread/../sysdeps/generic/pt-cond-timedwait.c:130
#4 0x01043fcc in __pthread_cond_wait (cond=0x80730dc, mutex=0x80730bc) at ./pthread/../sysdeps/generic/pt-cond-wait.c:36
#5 0x010414ef in pthread_join (thread=8, status=status@entry=0x15ffa6c) at ./pthread/pt-join.c:46
#6 0x08048f9b in main (argc=8, argv=0x15ffb08) at test_urcu_gc.c:466
(gdb) thread 3
[Switching to thread 3 (Thread 14933.3)]
#0 0x0106785c in mach_msg_trap () at /home/rbraun/devel/debian/packages/eglibc/eglibc-2.13/build-tree/hurd-i386-libc/mach/mach_msg_trap.S:2
2 /home/rbraun/devel/debian/packages/eglibc/eglibc-2.13/build-tree/hurd-i386-libc/mach/mach_msg_trap.S: No such file or directory.
(gdb) call pthread_self()
$1 = 8
That is, Thread 1 is blocked in `pthread_join` waiting for Thread 3 (pthread
ID 8), which is stuck in `poll`.
Is this really the [[libpthread_cancellation_points]] issue -- there doesn't
seem to be any thread cancellation involved?