microkernel/mach/gnumach/memory_management.mdwn


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205

[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation,
Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!tag open_issue_documentation open_issue_gnumach]]

[[!toc]]


# IRC, freenode, #hurd, 2011-02-15

    <braunr> etenil: originally, mach had its own virtual space (the kernel
      space)
    <braunr> etenil: in order to use linux 2.0 drivers, it now directly maps
      physical memory, as linux does
    <braunr> etenil: but there is nothing similar to kmap() or vmalloc() in
      mach, so the kernel is limited to its 1 GiB
    <braunr> (3 GiB userspace / 1 GiB kernelspace)
    <braunr> that's the short version, there is a vmalloc() in mach, but this
      trick made it behave almost like a kmalloc()
    <antrik> braunr: the direct mapping is *only* for the benefit of Linux
      drivers?...
    <braunr> also, the configuration of segments limits the kernel space
    <braunr> antrik: i'm not sure, as i said, this is the short version
    <braunr> antrik: but there is a paper which describes the integration of
      those drivers in mach
    <etenil> you mean the linux 2.0 drivers?
    <antrik> braunr: I read it once, but I don't remember anything about the
      physical mapping in there...
    <antrik> etenil: well, originally it was 1.3, but essentially that's the
      same...
    <braunr> i don't see any other reason why there would be a direct mapping
    <braunr> except for performance (because you can use larger - even very
      lage - pages without resetting the mmu often thanks to global pages, but
      that didn't exist at the time)


# IRC, freenode, #hurd, 2011-02-15

    <antrik> however, the kernel won't work in 64 bit mode without some changes
      to physical memory management
    <braunr> and mmu management
    <braunr> (but maybe that's what you meant by physical memory)


## IRC, freenode, #hurd, 2011-02-16

    <braunr> antrik: youpi added it for xen, yes
    <braunr> antrik: but you're right, since mach uses a direct mapped kernel
      space, the true problem is the lack of linux-like highmem support
    <braunr> which isn't required if the kernel space is really virtual


# IRC, freenode, #hurd, 2011-06-09

    <braunr> btw, how can gnumach use 1 GiB of RAM ? did you lower the
      user/kernel boundary address ?
    <youpi> I did
    <braunr> 2G ?
    <youpi> yes
    <braunr> ok
    <youpi> it doesn't make so much sense to let processes have 3G addressing
      space when there can't be more that 1G physical memory
    <braunr> that's sad for an operating system which does most things by
      mapping memory eh
    <youpi> well, if a process wants to map crazy things, 3G may be tight
      already
    <youpi> e.g. ext2fs
    <braunr> yes
    <youpi> so there's little point in supporting them
    <braunr> we need hurd/amd64
    <youpi> and there's quite some benefit in shrinking them to 2G
    <youpi> yes
    <youpi> actually even 2G may become a bit tight
    <youpi> webkit linking needs about 1.5-2GiB
    <youpi> things become really crazy
    <braunr> wow
    <braunr> i remember the linux support for 4G/4G split when there was enough
      RAM to fill the kernel space with struct page entries


# IRC, freenode, #hurd, 2011-11-12

    <youpi> well, the Hurd doesn't "artificially" limits itself to 1.5GiB
      memory
    <youpi> i386 has only 4GiB addressing space
    <youpi> we currently chose 2GiB for the kernel and 2GiB for the userspace
    <youpi> since kernel needs some mappings, that leaves only 1.5GiB usable
      physical memory
    <sea4ever`> Hm? 2GiB for kernel, 2GiB for userspace, 500MiB are used for
      what?
    <youpi> for mappings
    <youpi> such as device iomap
    <youpi> contiguous buffer allocation
    <youpi> and such things
    <sea4ever`> Ah, ok. You map things in kernel space into user space then.
    <youpi> linux does the same without the "bigmem" support
    <youpi> no, just in kernel space
    <youpi> kernel space is what determines how much physical memory you can
      address
    <youpi> unless using the linux-said-awful "bigmem" support


# IRC, freenode, #hurd, 2012-07-05

    <braunr> hm i got an address space exhaustion while building eglibc :/
    <braunr> we really need the 3/1 split back with a 64-bits kernel
    <pinotree> 3/1?
    <braunr> 3 GiB userspace, 1 GiB kernel
    <pinotree> ah
    <braunr> the debian gnumach package is patched to use a 2/2 split
    <braunr> and 2 GiB is really small for some needs
    <braunr> on the bright side, the machine didn't crash
    <braunr> there is issue with watch ./slabinfo which turned in a infinite
      loop, but it didn't affect the stability of the system
    <braunr> actually with a 64-bits kernel, we could use a 4/x split


# IRC, freenode, #hurd, 2012-08-10

    <braunr> all modern systems embed the kernel in every address space
    <braunr> which allows reduced overhead when making a system call
    <braunr> sometimes there is no context switch at all
    <braunr> on i386, there are security checks to upgrade the privilege level
      (switch to ring 0), and when used, kernel page tables are global, so
      they're not flushed
    <braunr> using sysenter/sysexit makes it even faster

[[service_solahart_jakarta_selatan__082122541663/system_call_mechanism]].


# IRC, freenode, #hurd, 2012-12-12

    <braunr> youpi: is the 2g split patch really needed ?
    <braunr> or rather, is it really a good thing for most people ?
    <braunr> instead of the common 3g/1g
    <braunr> it reduces tasks' address space but allows the kernel to reference
      more physical memory
    <braunr> the thing is, because of the current page cache implementation,
      most of the time, this physical memory remains unused, or very rarely
    <youpi> ?
    <braunr> on the other hand, a larger address space for tasks allows running
      more threads (more space for tasks) and not failing while linking webkit
      .. :)
    <youpi> it's needed for quite a few compilations, yes
    <braunr> if you refer to the link stage, with a decent amount of swap, it
      goes without trouble
    <youpi> well, if your kernel doesn't have 2GiB physical addressing
      capacity, userspace won't have >2GiB memory capacity either
    <youpi> does it now?
    <youpi> it didn't use to
    <youpi> and it was crawling like hell for some builds
    <youpi> (until simply hanging)
    <braunr> i never have a problem e.g. runing the big malloc glibc test
    <braunr> (bug22 or something like that)
    <youpi> that doesn't involve objects from the fs, does it?
    <braunr> no
    <braunr> as long as it's anonymous memory, it's ok
    <braunr> the default pager looks safe, i'm pretty sure our lockups are
      because of something in ext2fs
    <youpi> braunr: well, an alternative would be to build two kernels, one 2/2
      and one 3/1
    <braunr> not really worth it
    <braunr> i was just wondering
    <braunr> i usually prefer a 3/1 on darnassus, but i don't build as often as
      a buildd :x
    <youpi> or we can go with 2.5/1.5
    <youpi> I can do that on bach & mozart for instance
    <youpi> (they have their own kernel anyway)
    <braunr> youpi: if you think it's worth the effort
    <braunr> again, i was just wondering out loud
    <youpi> braunr: well, bach & mozart don't have > 1.2GiB mem anyway
    <youpi> so it doesn't pose problem


# IRC, freenode, #hurd, 2013-01-12

    <sobhan> can hurd have more than 1GB of ram ?
    <braunr> sobhan: not with the stock kernel, but yes with a simple patch
    <braunr> sobhan: although you should be aware of the implications of this
      patch
    <braunr> (more kernel memory, thus more physical memory - up to 1.8 GiB -
      but then, less user memory)


# IRC, freenode, #hurd, 2013-06-06

    <nlightnfotis> braunr: quick question, what memory allocation algorithms
      does the Mach use? I know it uses slab allocation, so I can guess buddy
      allocators too?
    <braunr> no
    <braunr> slab allocator for kernel memory (allocation of buffers used by
      the kernel itself)
    <braunr> a simple freelist for physical pages
    <braunr> and a custom allocator based on a red-black tree, a linked list
      and a hint for virtual memory
    <braunr> (which is practically the same in all BSD variants)
    <braunr> and linux does something very close too