
[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation,
Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!tag open_issue_hurd]]

Hurd servers / VFS libraries are multithreaded.


# Implementation

  * well-known threading libraries

      * [[hurd/libthreads]]

      * [[hurd/libpthread]]


# Design

See [[hurd/libports]]: it uses roughly one thread per incoming request.  This
is not the best approach: the number of worker threads should not scale with
the number of incoming requests, but with the characteristics of the backends
that serve them.
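
To make the alternative concrete, here is a minimal sketch of a worker pool
whose size is fixed up front instead of growing with the request rate.  It is
not how [[hurd/libports]] works today: every name in it (`struct pool`,
`pool_start`, `pool_submit`, `pool_stop`, `count_request`) and both constants
are hypothetical, and plain POSIX threads are assumed.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the libports API.  */
#define POOL_WORKERS 4   /* chosen for the backend, not for the request rate */
#define QUEUE_SLOTS 64

struct request {
  void (*handle) (void *arg);
  void *arg;
};

struct pool {
  pthread_mutex_t lock;
  pthread_cond_t not_empty, not_full;
  struct request queue[QUEUE_SLOTS];
  size_t head, count;
  int shutting_down;
  int nworkers;
  pthread_t workers[POOL_WORKERS];
};

static void *worker_main (void *p)
{
  struct pool *pool = p;
  for (;;)
    {
      pthread_mutex_lock (&pool->lock);
      while (pool->count == 0 && !pool->shutting_down)
        pthread_cond_wait (&pool->not_empty, &pool->lock);
      if (pool->count == 0)
        {
          /* Shutting down and the queue is drained.  */
          pthread_mutex_unlock (&pool->lock);
          return NULL;
        }
      struct request r = pool->queue[pool->head];
      pool->head = (pool->head + 1) % QUEUE_SLOTS;
      pool->count--;
      pthread_cond_signal (&pool->not_full);
      pthread_mutex_unlock (&pool->lock);
      r.handle (r.arg);   /* run the request without holding the lock */
    }
}

/* Returns 0 if all workers were created, -1 otherwise.  */
int pool_start (struct pool *pool)
{
  pthread_mutex_init (&pool->lock, NULL);
  pthread_cond_init (&pool->not_empty, NULL);
  pthread_cond_init (&pool->not_full, NULL);
  pool->head = pool->count = 0;
  pool->shutting_down = 0;
  pool->nworkers = 0;
  for (int i = 0; i < POOL_WORKERS; i++)
    {
      if (pthread_create (&pool->workers[i], NULL, worker_main, pool) != 0)
        return -1;
      pool->nworkers++;
    }
  return 0;
}

/* Queue a request; blocks the caller when the queue is full.  */
void pool_submit (struct pool *pool, void (*handle) (void *), void *arg)
{
  pthread_mutex_lock (&pool->lock);
  while (pool->count == QUEUE_SLOTS)
    pthread_cond_wait (&pool->not_full, &pool->lock);
  size_t tail = (pool->head + pool->count) % QUEUE_SLOTS;
  pool->queue[tail] = (struct request) { handle, arg };
  pool->count++;
  pthread_cond_signal (&pool->not_empty);
  pthread_mutex_unlock (&pool->lock);
}

/* Let the workers drain the queue, then join them.  */
void pool_stop (struct pool *pool)
{
  pthread_mutex_lock (&pool->lock);
  pool->shutting_down = 1;
  pthread_cond_broadcast (&pool->not_empty);
  pthread_mutex_unlock (&pool->lock);
  for (int i = 0; i < pool->nworkers; i++)
    pthread_join (pool->workers[i], NULL);
}

/* Example handler: counts handled requests in an atomic_int.  */
void count_request (void *arg)
{
  atomic_fetch_add ((atomic_int *) arg, 1);
}
```

A full queue blocks the submitter rather than spawning another thread.  A
fixed pool like this is only safe if no handler has to wait for another
request to be served by the same pool, which is the deadlock described in the
IRC discussion below.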

The [[hurd/Critique]] should have some more on this.

[*Event-based Concurrency
Control*](http://soft.vub.ac.be/~tvcutsem/talks/presentations/T37_nobackground.pdf),
Tom Van Cutsem, 2009.


## IRC, freenode, #hurd, 2012-07-08

    <youpi> braunr: about limiting number of threads, IIRC the problem is that
      for some threads, completing their work means triggering some action in
      the server itself, and waiting for it (with, unfortunately, some lock
      held), which never terminates when we can't create new threads any more
    <braunr> youpi: the number of threads should be limited, but not globally
      by libports
    <braunr> pagers should throttle their writeback requests
    <youpi> right
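
One way to read "pagers should throttle their writeback requests" is a
per-pager cap on requests in flight, rather than a global thread limit in
libports.  The following is only a sketch under that assumption, using POSIX
threads.  `struct throttle` and its functions are hypothetical names and do
not exist in the Hurd sources.

```c
#include <pthread.h>

/* Hypothetical limiter: caps how many writeback requests are in flight.  */
struct throttle {
  pthread_mutex_t lock;
  pthread_cond_t slot_free;
  unsigned in_flight;
  unsigned limit;
};

void throttle_init (struct throttle *t, unsigned limit)
{
  pthread_mutex_init (&t->lock, NULL);
  pthread_cond_init (&t->slot_free, NULL);
  t->in_flight = 0;
  t->limit = limit;
}

/* Block until a slot is free, then claim it.  */
void throttle_acquire (struct throttle *t)
{
  pthread_mutex_lock (&t->lock);
  while (t->in_flight >= t->limit)
    pthread_cond_wait (&t->slot_free, &t->lock);
  t->in_flight++;
  pthread_mutex_unlock (&t->lock);
}

/* Non-blocking variant: returns 1 if a slot was claimed, 0 if at the limit.  */
int throttle_try_acquire (struct throttle *t)
{
  int ok = 0;
  pthread_mutex_lock (&t->lock);
  if (t->in_flight < t->limit)
    {
      t->in_flight++;
      ok = 1;
    }
  pthread_mutex_unlock (&t->lock);
  return ok;
}

/* Give a slot back once the writeback has completed.  */
void throttle_release (struct throttle *t)
{
  pthread_mutex_lock (&t->lock);
  t->in_flight--;
  pthread_cond_signal (&t->slot_free);
  pthread_mutex_unlock (&t->lock);
}
```

A pager would call `throttle_acquire` before issuing a writeback and
`throttle_release` when it completes.  The limit then belongs to the pager
and can be tuned to its backing store.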


## IRC, freenode, #hurd, 2012-07-16

    <braunr> hm interesting
    <braunr> when many threads are created to handle requests, they
      automatically create a pool of worker threads by staying around for some
      time
    <braunr> this time is given in the libport call
    <braunr> but the threads always remain
    <braunr> they must be used in turn each time a new request comes in
    <braunr> ah no :(, they're maintained by the periodic sync :(
    <braunr> hm, still not that, so weird
    <antrik> braunr: yes, that's a known problem: unused threads should go away
      after some time, but that doesn't actually happen
    <antrik> don't remember though whether it's broken for some reason, or
      simply not implemented at all...
    <antrik> (this was already a known issue when thread throttling was
      discussed around 2005...)
    <braunr> antrik: ok
    <braunr> hm threads actually do finish ..
    <braunr> libthreads retain them in a pool for faster allocations
    <braunr> hm, it's worse than i thought
    <braunr> i think the hurd does its job well
    <braunr> the cthreads code never reaps threads
    <braunr> when threads are finished, they just wait until assigned a new
      invocation
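
The behaviour antrik expects ("unused threads should go away after some
time") can be sketched with a timed wait: a worker that finds no work before
its idle timeout expires returns instead of parking forever.  This is an
illustration with POSIX threads, not the cthreads or libports code, and
`struct work_source`, `work_source_init` and `idle_reaping_worker` are
hypothetical names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <time.h>

/* Hypothetical shared state between request producers and workers.  */
struct work_source {
  pthread_mutex_t lock;
  pthread_cond_t work_ready;
  int pending;            /* requests waiting to be handled */
  int handled;            /* requests completed so far */
  int live_workers;       /* workers that have not exited yet */
  long idle_timeout_ms;   /* how long an idle worker stays around */
};

void work_source_init (struct work_source *src, int pending,
                       int workers, long idle_timeout_ms)
{
  pthread_mutex_init (&src->lock, NULL);
  pthread_cond_init (&src->work_ready, NULL);
  src->pending = pending;
  src->handled = 0;
  src->live_workers = workers;
  src->idle_timeout_ms = idle_timeout_ms;
}

/* Absolute deadline MS milliseconds from now, on the clock that
   pthread_cond_timedwait uses by default (CLOCK_REALTIME).  */
static void deadline_after_ms (struct timespec *ts, long ms)
{
  clock_gettime (CLOCK_REALTIME, ts);
  ts->tv_sec += ms / 1000;
  ts->tv_nsec += (ms % 1000) * 1000000L;
  if (ts->tv_nsec >= 1000000000L)
    {
      ts->tv_sec++;
      ts->tv_nsec -= 1000000000L;
    }
}

void *idle_reaping_worker (void *arg)
{
  struct work_source *src = arg;
  pthread_mutex_lock (&src->lock);
  for (;;)
    {
      struct timespec deadline;
      int rc = 0;
      deadline_after_ms (&deadline, src->idle_timeout_ms);
      /* Wait for work, but give up on timeout (or any other error).  */
      while (src->pending == 0 && rc == 0)
        rc = pthread_cond_timedwait (&src->work_ready, &src->lock, &deadline);
      if (src->pending == 0)
        break;   /* idle for too long: exit instead of lingering */
      src->pending--;
      pthread_mutex_unlock (&src->lock);
      /* ... handle one request here, without holding the lock ... */
      pthread_mutex_lock (&src->lock);
      src->handled++;
    }
  src->live_workers--;
  pthread_mutex_unlock (&src->lock);
  return NULL;   /* a detached thread's resources are released here */
}
```

Whether the thread's stack is really given back after it returns still
depends on the threading library.  As noted above, the cthreads code keeps
finished threads in a pool and never reaps them.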

    <braunr> i don't understand ports_manage_port_operations_multithread :/
    <braunr> i think i get it
    <braunr> why do people write things in such a complicated way ..
    <braunr> such code is error prone and confuses anyone

    <braunr> i wonder how well nested functions interact with threads when
      sharing variables :/
    <braunr> the simple idea of nested functions hurts my head
    <braunr> do you see my point ? :) variables on the stack automatically
      shared between threads, without the need to explicitly pass them by
      address
    <antrik> braunr: I don't understand. why would variables on the stack be
      shared between threads?...
    <braunr> antrik: one function declares two variables, two nested functions,
      and use these in separate threads
    <braunr> are the local variables still "local"
    <braunr> ?
    <antrik> braunr: I would think so? why wouldn't they? threads have separate
      stacks, right?...
    <antrik> I must admit though that I have no idea how accessing local
      variables from the parent function works at all...
    <braunr> me neither

    <braunr> why don't demuxers get a generic void * like every callback does
      :((
    <antrik> ?
    <braunr> antrik: they get pointers to the input and output messages only
    <antrik> why is this a problem?
    <braunr> ports_manage_port_operations_multithread can be called multiple
      times in the same process
    <braunr> each call must have its own context
    <braunr> currently this is done by using nested functions
    <braunr> also, why demuxers return booleans while mach_msg_server_timeout
      happily ignores them :(
    <braunr> callbacks shouldn't return anything anyway
    <braunr> but then you have a totally meaningless "return 1" in the middle
      of the code
    <braunr> i'd advise not using a single nested function
    <antrik> I don't understand the remark about nested function
    <braunr> they're just horrible extensions
    <braunr> the compiler completely hides what happens behind the scenes, and
      nasty bugs could come out of that
    <braunr> i'll try to rewrite ports_manage_port_operations_multithread
      without them and see if it changes anything
    <braunr> but it's not easy
    <braunr> also, it makes debugging harder :p
    <braunr> i suspect gdb hangs are due to that, since threads directly start
      on a nested function
    <braunr> and if i'm right, they are created on the stack
    <braunr> (which is also horrible for security concerns, but that's another
      story)
    <braunr> (at least the trampolines)
    <antrik> I seriously doubt it will change anything... but feel free to
      prove me wrong :-)
    <braunr> well, i can see really weird things, but it may have nothing to do
      with the fact functions are nested
    <braunr> (i still strongly believe those shouldn't be used at all)
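
What braunr asks for, a demuxer that receives a generic `void *` like other
callbacks, can be sketched as follows.  This is not the existing libports or
Mach interface: `struct msg` stands in for the real message header type, and
`demuxer_ctx_fn`, `struct manage_ctx`, `example_demuxer` and `serve_messages`
are hypothetical names invented for the illustration.

```c
#include <stddef.h>

/* Stand-in for a Mach message; the real code uses message headers.  */
struct msg {
  int id;
  int payload;
};

/* Hypothetical demuxer type that carries a caller-supplied context.  */
typedef int (*demuxer_ctx_fn) (struct msg *in, struct msg *out, void *ctx);

/* Per-call state that a nested function would otherwise capture
   implicitly from the enclosing stack frame.  */
struct manage_ctx {
  int requests_seen;
  int reply_bias;
};

/* An ordinary top-level function: all state arrives through CTX.  */
int example_demuxer (struct msg *in, struct msg *out, void *ctx)
{
  struct manage_ctx *c = ctx;
  c->requests_seen++;
  out->id = in->id;
  out->payload = in->payload + c->reply_bias;
  return 1;   /* message was handled */
}

/* Hypothetical message loop: hands the same CTX to every invocation.  */
void serve_messages (struct msg *in, struct msg *out, size_t n,
                     demuxer_ctx_fn demux, void *ctx)
{
  for (size_t i = 0; i < n; i++)
    demux (&in[i], &out[i], ctx);
}
```

Two calls of `serve_messages` with different `struct manage_ctx` objects keep
their state apart without nested functions, so no trampoline has to be built
on the stack and the data shared between threads is visible in the
signature.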


## IRC, freenode, #hurd, 2012-08-31

    <braunr> and the hurd is all but scalable
    <gnu_srs> I thought scalability was built-in already, at least for hurd??
    <braunr> built in ?
    <gnu_srs> designed in
    <braunr> i guess you think that because you read "aggressively
      multithreaded" ?
    <braunr> well, a system that is unable to control the amount of threads it
      creates for no valid reason and uses global lock about everywhere isn't
      really scalable
    <braunr> it's not smp nor memory scalable
    <gnu_srs> most modern OSes have multi-cpu support.
    <braunr> that doesn't mean they scale
    <braunr> bsd sucks in this area
    <braunr> it got better in recent years but they're way behind linux
    <braunr> linux has this magic thing called rcu
    <braunr> and i want that in my system, from the beginning
    <braunr> and no, the hurd was never designed to scale
    <braunr> that's obvious
    <braunr> a very common mistake of the early 90s


## IRC, freenode, #hurd, 2012-09-06

    <braunr> mel-: the problem with such a true client/server architecture is
      that the scheduling context of clients is not transferred to servers
    <braunr> mel-: and the hurd creates threads on demand, so if it's too slow
      to process requests, more threads are spawned
    <braunr> to prevent hurd servers from creating too many threads, they are
      given a higher priority
    <braunr> and it causes increased latency for normal user applications
    <braunr> a better way, which is what modern synchronous microkernel based
      systems do
    <braunr> is to transfer the scheduling context of the client to the server
    <braunr> the server thread behaves like the client thread from the
      scheduler perspective
    <gnu_srs> how can creating more threads ease the slowness, is that a design
      decision??
    <mel-> what would be needed to implement this?
    <braunr> mel-: thread migration
    <braunr> gnu_srs: is that what i wrote ?
    <mel-> does mach support it?
    <braunr> mel-: some versions do yes
    <braunr> mel-: not ours
    <gnu_srs> (21:49:03) braunr: mel-: and the hurd creates threads on demand,
      so if it's too slow to process requests, more threads are spawned
    <braunr> of course it's a design decision
    <braunr> it doesn't "ease the slowness"
    <braunr> it makes servers able to use multiple processors to handle
      requests
    <braunr> but it's a wrong design decision as the number of threads is
      completely unchecked
    <gnu_srs> what's the idea of creating more threads then, multiple cpus is
      not supported?
    <braunr> it's a very old decision taken at a time when systems and machines
      were very different
    <braunr> mach used to support multiple processors
    <braunr> it was expected gnumach would do so too
    <braunr> mel-: but getting thread migration would also require us to adjust
      our threading library and our servers
    <braunr> it's not an easy task at all
    <braunr> and it doesn't fix everything
    <braunr> thread migration on mach is an optimization
    <mel-> interesting
    <braunr> async ipc remains available, which means notifications, which are
      async by nature, will create messages floods anyway


# Alternative approaches

  * <http://www.concurrencykit.org/>

  * Continuation-passing style

      * [[microkernel/Mach]] internally [[uses
        continuations|microkernel/mach/continuation]], too.

  * [[Erlang-style_parallelism]]

      * [[!wikipedia Actor_model]]; also see overlap with
        {{$capability#wikipedia_object-capability_model}}.

  * [libtcr - Threaded Coroutine Library](http://oss.linbit.com/libtcr/)

  * <http://monkey.org/~provos/libevent/>

---

See also: [[multiprocessing]].