[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

On [[Unix]] systems, `fork` is a rather simple system call.

Our implementation in [[glibc]] is, and needs to be, rather bulky.

For example, it has to duplicate all port rights for the new [[Mach
task|microkernel/mach/task]].  The address space can simply be duplicated by
standard means of the [[microkernel/Mach]], but as [[file descriptor]]s (for
example) are a concept that is implemented inside [[glibc]] (based on [[Mach
port|microkernel/mach/port]]s), these have to be duplicated from userspace,
which requires a small number of [[RPC]]s for each of them.

In sum, [[this affects performance|open_issues/performance/fork]] when new
processes are continuously being spawned from the shell, for example.

Often, a `fork` call will eventually be followed by an `exec`, which will in
turn close (most of) the duplicated port rights.  Unfortunately, this cannot be
known at the time the `fork` is executing, so the code calling `fork` has to be
modified: the `fork`, `exec` combo has to be replaced by a `posix_spawn` call,
for example, to avoid the work of duplicating each port right and then closing
each of them again.
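
As a rough illustration of such a modification, here is a minimal sketch of a
caller that uses `posix_spawnp` instead of a `fork`, `exec` pair.  The helper
name `spawn_and_wait` is hypothetical and only serves this example; whether
the call actually saves the per-port-right work depends on how `posix_spawn`
is implemented in the C library in use.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper, for illustration only: run a program found via PATH
   and return its exit status, or -1 if it could not be spawned or did not
   exit normally.  */
int
spawn_and_wait (const char *file, char *const argv[])
{
  pid_t pid;
  int status;

  /* A single call takes the place of the fork/exec pair.  No file actions
     or attributes are passed, so the child gets the default behaviour.  */
  int err = posix_spawnp (&pid, file, NULL, NULL, argv, environ);
  if (err != 0)
    return -1;  /* The error number is returned directly, not via errno.  */

  if (waitpid (pid, &status, 0) == -1)
    return -1;

  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

A caller would pass a `NULL`-terminated argument vector, for example
`{ "ls", "-l", NULL }`, together with the program name `"ls"`.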

As far as we know, Cygwin has the same problem of `fork` being a nontrivial
operation.  Perhaps we can learn from what they've been doing?  Also, perhaps
they have patches for software packages, to avoid using `fork` followed by
`exec`, for example.


# TODO

  * [[fork: mach_port_mod_refs:
    EKERN_UREFS_OWERFLOW|open_issues/fork_mach_port_mod_refs_ekern_urefs_owerflow]]
    ([[!taglink open_issue_glibc]]).

  * Include and de-duplicate information from elsewhere: [[hurd-paper]],
    [[hurd-talk]], [[hurd/ng/trivialconfinementvsconstructorvsfork]],
    [[open_issues/resource_management_problems/zalloc_panics]] ([[!taglink
    open_issue_glibc open_issue_documentation]]).

  * We no longer support `MACH_IPC_COMPAT`, thus we can get rid of the `err =
    __mach_port_allocate_name ([...]); if (err == KERN_NAME_EXISTS)` code
    ([[!taglink open_issue_glibc]]).


# External

  * [*How fork(2) ought to be*](http://www.greenend.org.uk/rjk/fork.html) by
    Richard Kettlewell.

  * [*The self-pipe trick*](http://cr.yp.to/docs/selfpipe.html) by
    D. J. Bernstein.