summaryrefslogtreecommitdiff
path: root/open_issues/ext2fs_page_cache_swapping_leak.mdwn
blob: 075533e7b8733bce9c6a16a7f697ea991b99f880 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!tag open_issue_hurd]]

There is a [[!FF_project 272]][[!tag bounty]] on this task.

[[!toc]]


# IRC, OFTC, #debian-hurd, 2011-03-24

    <youpi> I still believe we have an ext2fs page cache swapping leak, however
    <youpi> as the 1.8GiB swap was full, yet the ld process was only 1.5GiB big
    <pinotree> a leak at swapping time, you mean?
    <youpi> I mean the ext2fs page cache being swapped out instead of simply
      dropped
    <pinotree> ah
    <pinotree> so the swap tends to accumulate unuseful stuff, i see
    <youpi> yes
    <youpi> the disk content, basicallyt :)

# IRC, freenode, #hurd, 2011-04-18

    <antrik> damn, a cp -a simply gobbles down swap space...
    <braunr> really ?
    <braunr> that's weird
    <braunr> why would a copy use so much anonymous memory ?
    <braunr> unless the external pager is so busy that the kernel falls back to
      its default pager
    <youpi> that's what I suggested some time ago
    <braunr> maybe this case should be traced in the kernel
    <braunr> a simple message in the kernel buffer to warn that this condition
      happened may help
    <youpi> I'm seeing swap space being kept used on buildds for no real reason
      except possibly backing ext2fs pages
    <youpi> that could help, yes
    <antrik> youpi: I think it was actually slpz who suggested that...
    <youpi> I think we're generally missing feedback from memory behavior
    <antrik> youpi: do you think andrei's kernel instrumentation work might be
      helpful with analyzing such things?
    <youpi> antrik: I think I suggested it too, but never mind
    <youpi> antrik: no, because it's not a trace of events that you want
    <youpi> some specific events would be useful
    <youpi> but then we don't really need a whole framework for that
    <antrik> apt-get upgrade eats swap too
    <youpi> the upgrade itself, or the computation of the ugprade?
    <youpi> apt is a memory eater nowadays
    <antrik> installing the packages
    <antrik> seems to have stabilized though after a while...
    <antrik> so perhaps it's not a leak in this case
    <youpi> ideally we should have a way to know what was put in the swap
    <braunr> how would you represent what's in the swap ?
    <antrik> the apt-get process has 46M of virtual memory above the 128 M
      baseline
    <braunr> mostly libraries i guess
    <braunr> are trheads stacks 8 MiB like on Linux ?
    <youpi> braunr: at least knowing how much of each process is in the swap
    <youpi> braunr: 2MiB
    <braunr> ok
    <youpi> vminfo could also report which parts of the address space are in
      the swap
    <antrik> youpi: would be nice to have some simple utility reporting how
      much of a process' address space is anonymous
    <antrik> (in fact, I wonder why it's not reported by standard tools such as
      ps or top... this shouldn't be too difficult I would think?)
    <antrik> it would be much more useful information than the total virt size,
      which includes rather meaningless disk and device mappings...
    <youpi> agreed
    <braunr> well
    <braunr> there are tools like pmap for this
    <braunr> unfortunately, it's difficult in mach to know what backs a
      non-anonymous mapping
    <braunr> pagers should be able to name their mappings
    <youpi> that'd be helpful for debugging yes
    <braunr> there is almost no overhead in doing that, and it would be very
      useful
    <youpi> and could lead to /proc/pid/maps
    <braunr> yes
    <braunr> isn't there a maps already ?
    <youpi> nope
    <braunr> ok
    <youpi> (probably not very useful without the names)
    <braunr>  ithought i remembered maps without names, and guessed it might
      have been on the hurd for that reason
    <braunr> but i'm not sure
    <youpi> there's the vminfo command, yes
    <braunr> 14:06 < youpi> braunr: at least knowing how much of each process
      is in the swap
    <braunr> wouldn't it be clearer to do it the other way around ?
    <braunr> like a swapinfo tool indicating what it contains ?
    <youpi> sure, but it's a lot more difficult
    <braunr> really ?
    <braunr> why ?
    <youpi> because you have to traverse all the mappings
    <youpi> etc
    <youpi> (in all processes, I mean)
    <youpi> and you have to name what is waht
    <braunr> there are other ways
    <braunr> the swap is a central structure
    <youpi> while simply introducing the swap %  in vminfo
    <youpi> for a given process you know what is what
    <braunr> right
    <youpi> and doing that introduction is  probably very simple
    <braunr> that's a good point
    <braunr> top-down is effectively easier than bottom-up resolution in Mach
      VM
    <antrik> hm... the memory use caused by cp doesn't seem to be reflected in
      the virtual size of any particular process
    <antrik> ghost memory
    <braunr> what's cp vmsize at the time of the problem ?
    <antrik> it's at 134 M right now... so considering the 128 M baseline,
      nothing worth speaking of
    <braunr> right
    <braunr> maybe a copy map during I/O
    <braunr> but I don't know Mach copy maps in detail, as they have been
      eliminated from UVM
    <antrik> BTW, the memory eatup happens even before swap comes into
      play... swapping seems to be a result of the problem, not the cause
    <braunr> what do you mean ?
    <braunr> I thought swapping was the issue
    <braunr> you mean RAM is full before swapping ?
    <antrik> well, I don't know what the actual problem is... I just don't
      understand why the memory use increases without any particular process
      seeing an increase in size
    <antrik> the "free" size in vmstat decreses
    <antrik> once it's eatun up, swap space use increases
    <braunr> well it doesn't change much of it
    <braunr> the anonymous memory pager will use RAM before resorting to the
      external default-pager
    <antrik> I would suspect normal block caching... but then, shouldn't this
      show up in the memory info of the ext2 process?
    <braunr> although, again, I'm not sure of the behaviour of the anonymous
      memory pager
    <braunr> antrik: I don't know how block caching behaves
    <antrik> BTW, is it a know problem that doing ^C on a "cp -a" seems to hang
      the whole system?...
    <antrik> (the whole hurd instance that is... the other instance is not
      affected)
    <youpi> not that I know of
    <braunr> seems like a deadlock in the anonymous memory handling
    <youpi> (and I've never seen that)
    <antrik> happens both in my main system (using ancient hurd/libc) and in my
      subhurd (recently upgraded to current stuff)
    <antrik> this make testing this stuff quite a lot harder... [sigh]
    <antrik> any suggestions how to debug this hang?
    <braunr> antrik: no :/

2011-04-28: [[!taglink open_issue_documentation]]

    <antrik> hm... is it normal that "swap free" doesn't increase as a process'
      memory is paged back in?
    <youpi> yes
    <youpi> there's no real use cleaning swap
    <youpi> on the contrary, it makes paging the process out again longer
    <antrik> hm... so essentially, after swapping back and forth a bit, a part
      of the swap equal to the size of physical RAM will be occupied with stuff
      that is actually in RAM?
    <youpi> yes
    <youpi> so that that RAM can be freed immediately if needed
    <antrik> hm... that means my effective swap size is only like 300 MB... no
      wonder I see crashes under load
    <antrik> err... make that 230 actually
    <antrik> indeed, quitting the application freed both the physical RAM and
      swap space
    <braunr> 02:28 < antrik> hm... is it normal that "swap free" doesn't
      increase as a process' memory is paged back in?
    <braunr> swap is the backing store of anonymous memory, like ext2fs is the
      backing store of memory objects created from its pager
    <braunr> so you can view swap as the file system for everything that isn't
      an external memory object


# IRC, freenode, #hurd, 2011-11-15

    <braunr> hm, now my system got unstable
    <braunr> swap is increasing, without any apparent reason
    <antrik> you mean without any load?
    <braunr> with load, yes
    <braunr> :)
    <antrik> well, with load is "normal"...
    <antrik> at least for some loads
    <braunr> i can't create memory pressure to stress reclaiming without any
      load
    <antrik> what load are you using?
    <braunr> ftp mirrorring
    <antrik> hm... never tried that; but I guess it's similar to apt-get
    <antrik> so yes, that's "normal". I talked about it several times, and also
      wrote to the ML
    <braunr> antrik: ok
    <antrik> if you find out how to fix this, you are my hero ;-)
    <braunr> arg :)
    <antrik> I suspect it's the infamous double swapping problem; but that's
      just a guess
    <braunr> looks like this
    <antrik> BTW, if you give me the exact command, I could check if I see it
      too
    <braunr> i use lftp (mirror -Re) from a linux git repository
    <braunr> through sftp
    <braunr> (lots of small files, big content)
    <antrik> can't you just give me the exact command? I don't feel like
      figuring it out myself
    <braunr> antrik: cd linux-stable; lftp sftp://hurd_addr/
    <braunr> inside lftp: mkdir linux-stable; cd linux-stable; mirror -Re
    <braunr> hm, half of physical memory just got freed
    <braunr> our page cache is really weird :/
    <braunr> (i didn't delete any file when that happened)
    <antrik> hurd_addr?
    <braunr> ssh server ip address
    <braunr> or name
    <braunr> of your hurd :)
    <antrik> I'm confused. you are mirroring *from* the Hurd box?
    <braunr> no, to it
    <antrik> ah, so you login via sftp and then push to it?
    <braunr> yes
    <braunr> fragmentation looks very fine
    <braunr> even for the huge pv_entry cache and its 60k+ entries
    <braunr> (and i'm running a kernel with the cpu layer enabled)
    <braunr> git reset/status/diff/log/grep all work correctly
    <braunr> anyway, mcsim's branch looks quite stable to me
    <antrik> braunr: I can't reproduce the swap leak with ftp. free memory
      idles around 6.5 k (seems to be the threshold where paging starts), and
      swap use is constant
    <antrik> might be because everything swappable is already present in swap
      from previous load I guess...
    <antrik> err... scratch that. was connected to the wrong host, silly me
    <antrik> indeed swap gets eaten away, as expected
    <antrik> but only if free memory actually falls below the
      threshold. otherwise it just oscillates around a constant value, and
      never touches swap
    <antrik> so this seems to confirm the double swapping theory
    <youpi> antrik: is that "double swap" theory written somewhere?
    <youpi> (no, a quick google didn't tell me)


## IRC, freenode, #hurd, 2011-11-16

    <antrik> youpi:
      http://lists.gnu.org/archive/html/l4-hurd/2002-06/msg00001.html talks
      about "double paging". probably it's also the term others used for it;
      however, the term is generally used in a completely different meaning, so
      I guess it's not really suitable for googling either ;-)
    <antrik> IIRC slpz (or perhaps someone else?) proposed a solution to this,
      but I don't remember any details
    <youpi> ok so it's the same thing I was thinking about with swap getting
      filled
    <youpi> my question was: is there something to release the double swap,
      once the ext2fs pager managed to recover?
    <antrik> apparently not
    <antrik> the only way to free the memory seems to be terminating the FS
      server
    <youpi> uh :/