summaryrefslogtreecommitdiff
path: root/glibc/fork.mdwn
blob: 378fe835dfc7b4ded8e687bea72dc76b5c014df0 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

On [[Unix]] systems, `fork` is a rather simple [[system call]].

Our implementation in [[glibc]] is and needs to be rather bulky.

For example, it has to duplicate all port rights for the new [[Mach
task|microkernel/mach/task]].  The address space can simply be duplicated by
standard means of the [[microkernel/Mach]], but as [[unix/file_descriptor]]s
(for example) are a concept that is implemented inside [[glibc]] (based on
[[Mach port|microkernel/mach/port]]s), these have to be duplicated from
userspace, which requires a small number of [[RPC]] for each of them.

In sum, [[this affects performance|open_issues/performance/fork]] when new
processes are continuously being spawned from the shell, for example.

Often, a `fork` call will eventually be followed by an `exec`, which [[may in
turn close|open_issues/secure_file_descriptor_handling]] (most of) the
duplicated port rights.  Unfortunately, this cannot be known at the time the
`fork` executing, so in order to optimize this, the code calling `fork` has to
be modified instead, and the `fork`, `exec` combo be replaced by a
`posix_spawn` call, for example, to avoid this work of duplicating each port
right, then closing each again.

As far as we know, Cygwin has the same problem of `fork` being a nontrivial
operation.  Perhaps we can learn from what they're been doing?  Also, perhaps
they have patches for software packages, to avoid using `fork` followed by
`exec`, for example.


# TODO

  * [[fork: mach_port_mod_refs:
    EKERN_UREFS_OWERFLOW|open_issues/fork_mach_port_mod_refs_ekern_urefs_owerflow]]
    ([[!taglink open_issue_glibc]]).

  * Include de-duplicate information from elsewhere: [[hurd-paper]],
    [[hurd-talk]], [[hurd/ng/trivialconfinementvsconstructorvsfork]],
    [[open_issues/resource_management_problems/zalloc_panics]] ([[!taglink
    open_issue_glibc open_issue_documentation]]).

  * We no longer support `MACH_IPC_COMPAT`, thus we can get rid of the `err =
    __mach_port_allocate_name ([...]); if (err == KERN_NAME_EXISTS)` code
    ([[!taglink open_issue_glibc]]).


## Related

  * [[open_issues/secure_file_descriptor_handling]].


# External

  * {{$unix#djb_self-pipe}}.

  * {{$unix#rjk_fork}}.