[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!tag open_issue_gnumach]]


# IRC, freenode, #hurd, 2011-07-20

    <braunr> could we add gnumach forward map entry merging as an open issue ?
    <braunr> probably hurting anything using bash extensively, like most build
      systems
    <braunr> mcsim: this map entry merging problem might interest you
    <braunr> tschwinge: see vm/vm_map.c, line ~905
    <braunr> "See whether we can avoid creating a new entry (and object) by
      extending one of our neighbors.  [So far, we only attempt to extend from
      below.]"
    <braunr> and also vm_object_coalesce
    <braunr> "NOTE:   Only works at the moment if the second object is NULL -
      if it's not, which object do we lock first?"
    <braunr> although map entry merging should be enough
    <braunr> this seems to be the cause for bash having between 400 and 1000+
      map entries
    <braunr> this makes allocations and faults slow, and forks even more
    <braunr> but again, this should be checked before attempting anything
    <braunr> (for example, this comment still exists in freebsd, although they
      solved the problem, so who knows)
    <antrik> braunr: what exactly would you want to check?
    <antrik> braunr: this rather sounds like something you would just have to
      try...
    <braunr> antrik: that map merging is actually incomplete
    <braunr> and that entries can actually be merged
    <antrik> hm, I see...
    <braunr> (i.e. they are adjacent and have compatible properties
    <braunr> )
    <braunr> antrik: i just want to avoid the "hey, splay trees make fork slow,
      let's work on it for a month to see it wasn't the problem"
    <antrik> so basically you need a dump of a task's map to check whether
      there are indeed entries that could/should be merged?
    <antrik> hehe :-)
    <braunr> well, vminfo should give that easily, i just didn't take the time
      to check it
    <jkoenig> braunr, as you pointed out, "vminfo $$" seems to indicate that
      merging _is_ incomplete.
    <braunr> this could actually have a noticeable impact on package builds
    <braunr> hm
    <braunr> the number of entries for instances of bash running scripts
      doesn't exceed 50-55 :/
    <braunr> the issue seems to affect only certain instances (login shells,
      and su -)
    <braunr> jkoenig: i guess dash is just much lighter than bash in many ways
      :)
    <jkoenig> braunr, the number seems to increase with usage (100 here for a
      newly started interactive shell, vs. 150 in an old one)
    <braunr> yes, merging is far from complete in the vm_map code
    <braunr> it only handles null objects (private zeroed memory), and only
      tries to extend a previous entry (this isn't even a true merge)
    <braunr> this works well for the kernel however, which is why there are as
      few as 25 entries
    <braunr> but any request, be it e.g. mmap(), or mprotect(), can easily
      split entries
    <braunr> making their number larger
    <jkoenig> my ext2fs has ~6500 entries, but I guess this is related to
      mapping blocks from the filesystem, right?
    <braunr> i think so
    <braunr> hm not sure actually
    <braunr> i'd say it's fragmentation due to copy on writes when clients have
      mapped memory from it
    <braunr> there aren't that many file mappings though :(
    <braunr> jkoenig: this might just be the same problem as in bash
    <braunr>  0x1308000[0x3000] (prot=RW, max_prot=RWX, mem_obj=584)
    <braunr>  0x130b000[0x6000] (prot=RW, max_prot=RWX, mem_obj=585)
    <braunr>  0x1311000[0x3000] (prot=RX, max_prot=RWX, mem_obj=586)
    <braunr>  0x1314000[0x1000] (prot=RW, max_prot=RWX, mem_obj=586)
    <braunr>  0x1315000[0x2000] (prot=RX, max_prot=RWX, mem_obj=587)
    <braunr> the first two could be merged but not the others
    <jkoenig> theoretically, those may correspond to memory objects backed by
      different portions of the disk, right?
    <braunr> jkoenig: this looks very much like the same issue (many private
      mappings not merged)
    <braunr> jkoenig: i'm not sure
    <braunr> jkoenig: normally there is an offset when the object is valid
    <braunr> but vminfo may simply not display it if 0
    * jkoenig goes read about memory object
    <braunr> ok, vminfo can't actually tell if the object is anonymous or
      file-backed memory
    <jkoenig> (I'm perplexed about how the kernel can merge two memory objects
      if distinct port names exist in the tasks' name space -- that's what
      mem_obj is, right?)
    <braunr> i don't see why
    <braunr> jkoenig: can you be more specific ?
    <jkoenig> braunr, if, say, 584 and 585 above are port names which the task
      expects to be able to access and do stuff with, what will happen to them
      when the memory objects are merged?
    <braunr> good question
    <braunr> but hm
    <braunr> no it's not really a problem
    <braunr> memory objects aren't directly handled by the vm system
    <braunr> vm_object and memory_object are different things
    <braunr> vm_objects can be split and merged
    <braunr> and shadow objects form chains ending on a final vm_object
    <braunr> which references a memory object
    <braunr> hm
    <braunr> jkoenig: ok no solution, they can't be merged :)
    <jkoenig> braunr, I'm confused :-)
    <braunr> jkoenig: but at least, if two vm_objects are created but reference
      the same external memory object, the vm should be able to merge them back
    <braunr> are created as a result of a split
    <braunr> say, you map a memory object, mprotect part of it (=split), then
      mprotect the rest of it (=merge), it should work
    <braunr> jkoenig: does that clarify things a bit ?
    <jkoenig> ok so if I get it right, the entries shown by vmstat are the
      vm_object, and the mem_obj listed is a send right to the memory object
      they're referencing ?
    <braunr> yes
    <braunr> i'm not sure about the type of the integer shown (port name or
      simply an index)
    <braunr> jkoenig: another possibility explaining the high number of entries
      is how anonymous memory is implemented
    <braunr> if every vm_allocate request implies the creation of a memory
      object from the default pager
    <braunr> the vm has no way to merge them
    <jkoenig> and a vm_object is not a capability, but just an internal kernel
      structure used to record the composition of the address space
    <braunr> jkoenig: not exactly the address space, but close enough
    <braunr> jkoenig: it's a container used to know what's in physical memory
      and what isn't
    <jkoenig> braunr, ok I think I'm starting to get it, thanks.
    <braunr> glad i could help
    <braunr> i wonder when vm_map_enter() gets null objects though :/
    <braunr> "If this port is MEMORY_OBJECT_NULL, then zero-filled memory is
      allocated instead"
    <braunr> which means vm_allocate()
    <jkoenig> braunr, when the task uses vm_allocate(), or maybe vm_map_enter()
      with MEMORY_OBJECT_NULL, there's an opportunity to extend an existing
      object though, is that what you referred to earlier ?
    <braunr> jkoenig: yes, and that's what is done
    <jkoenig> but how does that play out with the default pager? (I'm thinking
      aloud, as always feel free to ignore ;-)
    <braunr> the default pager backs vm_objects providing zero filled memory
    <braunr> hm, guess it wasn't your question
    <braunr> well, swap isn't like a file, pages can be placed dynamically,
      which is why the offset is always 0 for this type of memory
    <jkoenig> hmm I see, apparently a memory object does not have a size
    <braunr> are you sure ?
    <jkoenig> from what I can gather from
      http://www.gnu.org/software/hurd/gnumach-doc/External-Memory-Management.html,
      but I looked very quickly
    <braunr> vm_objects have a size
    <braunr> and each map entry records the offset within the object where the
      mapping begins
    <braunr> offset and sizes are used by the kernel when querying the memory
      object pager
    <braunr> see memory_object_data_request for example
    <jkoenig> right.
    <braunr> but the default pager has another interface
    <braunr> jkoenig: after some simple tests, i haven't seen a simple case
      where forward merging could be applied :(
    <braunr> which means it's a lot harder than it first looked
    <braunr> hm
    <braunr> actually, there seem to be cases where this can be done
    <braunr> all of them occurring after a first merge was done
    <braunr> (which means a mapping request perfectly fits between two map
      entries)

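The merge condition discussed above (entries that "are adjacent and have
compatible properties") can be sketched in C as follows. This is only an
illustration under simplifying assumptions: `struct entry`, `can_merge` and
`coalesce` are hypothetical names, not gnumach's `struct vm_map_entry` or any
function in `vm/vm_map.c`, and the sketch ignores shadow object chains, which
the log identifies as one reason the real problem is much harder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a map entry.  This is NOT
   gnumach's struct vm_map_entry. */
struct entry {
    uintptr_t start;     /* first address of the mapping */
    uintptr_t end;       /* one past the last address */
    unsigned prot;       /* current protection bits */
    unsigned max_prot;   /* maximum protection bits */
    const void *object;  /* backing object, NULL for untouched zero-fill */
    uintptr_t offset;    /* offset of `start` within `object` */
};

/* Return true if `next` directly follows `prev` and a single entry could
   describe both.  Assumption: entries are compatible only if they share
   the same object (or both have none) at contiguous offsets. */
static bool can_merge(const struct entry *prev, const struct entry *next)
{
    assert(prev->start <= prev->end && next->start <= next->end);

    if (prev->end != next->start)
        return false;                       /* not adjacent */
    if (prev->prot != next->prot || prev->max_prot != next->max_prot)
        return false;                       /* incompatible protections */
    if (prev->object != next->object)
        return false;                       /* distinct backing objects */
    if (prev->object == NULL)
        return true;                        /* both plain zero-fill memory */
    /* Same object: the offsets must be contiguous as well. */
    return prev->offset + (prev->end - prev->start) == next->offset;
}

/* Merge neighbours in place over an address-sorted array and return the
   new number of entries. */
static size_t coalesce(struct entry *e, size_t n)
{
    if (n == 0)
        return 0;

    size_t out = 0;
    for (size_t i = 1; i < n; i++) {
        if (can_merge(&e[out], &e[i]))
            e[out].end = e[i].end;          /* extend the previous entry */
        else
            e[++out] = e[i];                /* keep as a separate entry */
    }
    return out + 1;
}
```

Under these assumptions, extending an entry "from below" and merging forward
are the same check applied to a different pair of neighbours. The difficulty
raised in the log lies elsewhere: two anonymous regions that each received
their own object fail the `object` comparison and stay separate.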

# IRC, freenode, #hurd, 2011-07-21

    <braunr> tschwinge: you may remove the forward map entry merging issue :/
    <pinotree> what did you discover?
    <braunr> tschwinge: it's actually much more complicated than i thought, and
      needs major changes in the vm, and about the way anonymous memory is
      handled
    <braunr> from what i could see, part of the problem still exists in freebsd
    <braunr> for the same reasons (shadow objects being one of them)

[[mach_shadow_objects]].


# GCC build time using bash vs. dash

<http://gcc.gnu.org/ml/gcc/2011-07/msg00444.html>


# Procedure

  * Analyze.

  * Measure.

  * Fix.

  * Measure again.

  * Have Samuel measure on the buildd.
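
For the analysis step, one possible approach is to scan `vminfo` output for
adjacent regions with identical protections. The sketch below assumes the line
format shown in the 2011-07-20 excerpt (`0xSTART[0xSIZE] (prot=..,
max_prot=.., mem_obj=..)`); `struct region`, `parse_region` and
`count_merge_candidates` are hypothetical helpers, not part of `vminfo`.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* One parsed region.  The fields mirror the vminfo lines quoted in the
   log; the exact output format is an assumption. */
struct region {
    unsigned long start;
    unsigned long size;
    char prot[4];       /* e.g. "RW" */
    char max_prot[4];   /* e.g. "RWX" */
};

/* Parse one line such as
   " 0x1308000[0x3000] (prot=RW, max_prot=RWX, mem_obj=584)".
   Returns false if the line does not match the assumed format. */
static bool parse_region(const char *line, struct region *r)
{
    return sscanf(line, " 0x%lx[0x%lx] (prot=%3[RWX], max_prot=%3[RWX]",
                  &r->start, &r->size, r->prot, r->max_prot) == 4;
}

/* Count pairs of consecutive lines that are adjacent in the address
   space and have the same protections. */
static size_t count_merge_candidates(const char *const *lines, size_t n)
{
    size_t candidates = 0;
    struct region prev;
    bool have_prev = false;

    for (size_t i = 0; i < n; i++) {
        struct region cur;

        if (!parse_region(lines[i], &cur)) {
            have_prev = false;              /* unparsable line breaks the run */
            continue;
        }
        if (have_prev
            && prev.start + prev.size == cur.start
            && strcmp(prev.prot, cur.prot) == 0
            && strcmp(prev.max_prot, cur.max_prot) == 0)
            candidates++;
        prev = cur;
        have_prev = true;
    }
    return candidates;
}
```

The count is only an upper bound on what the kernel could merge. As noted in
the log, `vminfo` cannot tell anonymous from file-backed memory, and regions
backed by distinct memory objects cannot be merged even when their addresses
and protections line up.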