[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]]

[[!tag open_issue_gnumach open_issue_hurd]]

There is a [[!FF_project 272]][[!tag bounty]] on this task.

[[!toc]]


# IRC, OFTC, #debian-hurd, 2011-03-24

    I still believe we have an ext2fs page cache swapping leak, however
    as the 1.8GiB swap was full, yet the ld process was only 1.5GiB big
    a leak at swapping time, you mean?
    I mean the ext2fs page cache being swapped out instead of simply dropped
    ah
    so the swap tends to accumulate unuseful stuff, i see
    yes
    the disk content, basically :)


# IRC, freenode, #hurd, 2011-04-18

    damn, a cp -a simply gobbles down swap space...
    really ?
    that's weird
    why would a copy use so much anonymous memory ?
    unless the external pager is so busy that the kernel falls back to its default pager
    that's what I suggested some time ago
    maybe this case should be traced in the kernel
    a simple message in the kernel buffer to warn that this condition happened may help
    I'm seeing swap space being kept used on buildds for no real reason except possibly backing ext2fs pages
    that could help, yes
    youpi: I think it was actually slpz who suggested that...
    I think we're generally missing feedback from memory behavior
    youpi: do you think andrei's kernel instrumentation work might be helpful with analyzing such things?
    antrik: I think I suggested it too, but never mind
    antrik: no, because it's not a trace of events that you want
    some specific events would be useful
    but then we don't really need a whole framework for that
    apt-get upgrade eats swap too
    the upgrade itself, or the computation of the upgrade?
    apt is a memory eater nowadays
    installing the packages seems to have stabilized though after a while... so perhaps it's not a leak in this case
    ideally we should have a way to know what was put in the swap
    how would you represent what's in the swap ?
    the apt-get process has 46M of virtual memory above the 128 M baseline
    mostly libraries i guess
    are thread stacks 8 MiB like on Linux ?
    braunr: at least knowing how much of each process is in the swap
    braunr: 2MiB
    ok
    vminfo could also report which parts of the address space are in the swap
    youpi: would be nice to have some simple utility reporting how much of a process' address space is anonymous
    (in fact, I wonder why it's not reported by standard tools such as ps or top... this shouldn't be too difficult I would think?)
    it would be much more useful information than the total virt size, which includes rather meaningless disk and device mappings...
    agreed
    well there are tools like pmap for this
    unfortunately, it's difficult in mach to know what backs a non-anonymous mapping
    pagers should be able to name their mappings
    that'd be helpful for debugging
    yes
    there is almost no overhead in doing that, and it would be very useful
    and could lead to /proc/pid/maps
    yes
    isn't there a maps already ?
    nope
    ok
    (probably not very useful without the names)
    i thought i remembered maps without names, and guessed it might have been on the hurd for that reason
    but i'm not sure
    there's the vminfo command, yes
    14:06 < youpi> braunr: at least knowing how much of each process is in the swap
    wouldn't it be clearer to do it the other way around ?
    like a swapinfo tool indicating what it contains ?
    sure, but it's a lot more difficult
    really ?
    why ?
    because you have to traverse all the mappings etc
    (in all processes, I mean)
    and you have to name what is what
    there are other ways
    the swap is a central structure
    while simply introducing the swap % in vminfo for a given process
    you know what is what
    right
    and doing that introduction is probably very simple
    that's a good point
    top-down is effectively easier than bottom-up resolution in Mach VM
    hm... the memory use caused by cp doesn't seem to be reflected in the virtual size of any particular process
    ghost memory
    what's cp vmsize at the time of the problem ?
    it's at 134 M right now... so considering the 128 M baseline, nothing worth speaking of
    right
    maybe a copy map during I/O
    but I don't know Mach copy maps in detail, as they have been eliminated from UVM
    BTW, the memory eatup happens even before swap comes into play... swapping seems to be a result of the problem, not the cause
    what do you mean ?
    I thought swapping was the issue
    you mean RAM is full before swapping ?
    well, I don't know what the actual problem is... I just don't understand why the memory use increases without any particular process seeing an increase in size
    the "free" size in vmstat decreases
    once it's eaten up, swap space use increases
    well it doesn't change much of it
    the anonymous memory pager will use RAM before resorting to the external default-pager
    I would suspect normal block caching... but then, shouldn't this show up in the memory info of the ext2 process?
    although, again, I'm not sure of the behaviour of the anonymous memory pager
    antrik: I don't know how block caching behaves
    BTW, is it a known problem that doing ^C on a "cp -a" seems to hang the whole system?...
    (the whole hurd instance that is...
    the other instance is not affected)
    not that I know of
    seems like a deadlock in the anonymous memory handling
    (and I've never seen that)
    happens both in my main system (using ancient hurd/libc) and in my subhurd (recently upgraded to current stuff)
    this makes testing this stuff quite a lot harder... [sigh]
    any suggestions how to debug this hang?
    antrik: no :/

2011-04-28:

[[!taglink open_issue_documentation]]

    hm... is it normal that "swap free" doesn't increase as a process' memory is paged back in?
    yes
    there's no real use cleaning swap
    on the contrary, it makes paging the process out again longer
    hm... so essentially, after swapping back and forth a bit, a part of the swap equal to the size of physical RAM will be occupied with stuff that is actually in RAM?
    yes
    so that that RAM can be freed immediately if needed
    hm... that means my effective swap size is only like 300 MB... no wonder I see crashes under load
    err... make that 230 actually
    indeed, quitting the application freed both the physical RAM and swap space
    02:28 < antrik> hm... is it normal that "swap free" doesn't increase as a process' memory is paged back in?
    swap is the backing store of anonymous memory, like ext2fs is the backing store of memory objects created from its pager
    so you can view swap as the file system for everything that isn't an external memory object


# IRC, freenode, #hurd, 2011-11-15

    hm, now my system got unstable
    swap is increasing, without any apparent reason
    you mean without any load?
    with load, yes :)
    well, with load is "normal"... at least for some loads
    i can't create memory pressure to stress reclaiming without any load
    what load are you using?
    ftp mirroring
    hm... never tried that; but I guess it's similar to apt-get
    so yes, that's "normal".
    I talked about it several times, and also wrote to the ML
    antrik: ok
    if you find out how to fix this, you are my hero ;-)
    arg :)
    I suspect it's the infamous double swapping problem; but that's just a guess
    looks like this
    BTW, if you give me the exact command, I could check if I see it too
    i use lftp (mirror -Re)
    from a linux git repository
    through sftp
    (lots of small files, big content)
    can't you just give me the exact command? I don't feel like figuring it out myself
    antrik: cd linux-stable; lftp sftp://hurd_addr/
    inside lftp: mkdir linux-stable; cd linux-stable; mirror -Re
    hm, half of physical memory just got freed
    our page cache is really weird :/
    (i didn't delete any file when that happened)
    hurd_addr?
    ssh server ip address or name of your hurd :)
    I'm confused. you are mirroring *from* the Hurd box?
    no, to it
    ah, so you login via sftp and then push to it?
    yes
    fragmentation looks very fine even for the huge pv_entry cache and its 60k+ entries
    (and i'm running a kernel with the cpu layer enabled)
    git reset/status/diff/log/grep all work correctly
    anyway, mcsim's branch looks quite stable to me
    braunr: I can't reproduce the swap leak with ftp. free memory idles around 6.5 k (seems to be the threshold where paging starts), and swap use is constant
    might be because everything swappable is already present in swap from previous load I guess...
    err... scratch that. was connected to the wrong host, silly me
    indeed swap gets eaten away, as expected
    but only if free memory actually falls below the threshold. otherwise it just oscillates around a constant value, and never touches swap
    so this seems to confirm the double swapping theory
    antrik: is that "double swap" theory written somewhere? (no, a quick google didn't tell me)


## IRC, freenode, #hurd, 2011-11-16

    youpi: http://lists.gnu.org/archive/html/l4-hurd/2002-06/msg00001.html talks about "double paging".
    probably it's also the term others used for it; however, the term is generally used in a completely different meaning, so I guess it's not really suitable for googling either ;-)
    IIRC slpz (or perhaps someone else?) proposed a solution to this, but I don't remember any details
    ok
    so it's the same thing I was thinking about with swap getting filled
    my question was: is there something to release the double swap, once the ext2fs pager managed to recover?
    apparently not
    the only way to free the memory seems to be terminating the FS server
    uh :/


# IRC, freenode, #hurd, 2011-11-30

    slpz: basically, whenever free memory goes below the paging threshold (which seems to be around 6 MiB) while there is other I/O happening, swap usage begins to increase continuously; and only gets freed again when the filesystem translator in question exits
    so it sounds *very* much like pages go to swap because the filesystem isn't quick enough to properly page them out
    slpz: I think it was you who talked about double paging a while back?
    antrik: probably, sounds like me :-)
    slpz: I have some indication that the degenerating performance and ultimate hang issues I'm seeing are partially or entirely caused by double paging...
    slpz: I don't remember, did you propose some possible fix?
    antrik: hmm... perhaps it wasn't me, because I don't remember trying to fix that problem...
    antrik: at which point do you think pages get duplicated?
    slpz: it was a question.
    I don't remember whether you proposed something or not :-)
    slpz: basically, whenever free memory goes below the paging threshold (which seems to be around 6 MiB) while there is other I/O happening, swap usage begins to increase continuously; and only gets freed again when the filesystem translator in question exits
    so it sounds *very* much like pages go to swap because the filesystem isn't quick enough to properly page them out
    antrik: I see
    antrik: I didn't address this problem directly, but when I modified the pageout mechanism to provide a special treatment for external pages, I also removed the possibility of sending them to the default pager
    antrik: this was in my experimental environment, of course
    slpz: oh, nice... so it may fix the issues I'm seeing? :-)
    anything testable yet?
    antrik: yes, only anonymous memory could be swapped with that
    antrik: it works, but is ugly as hell
    tschwinge: there is also your observation about compilations getting slower on further runs, and my followups... I *suspect* it's the same issue [[performance/degradation]].
    antrik: I'm thinking about establishing a repository for these experimental versions, so they don't get lost with time
    slpz: please do :-)
    antrik: perhaps in savannah's Hurd project
    even if it's not ready for upstream, it would be nice if I could test it -- right now it's bothering me more than any other Hurd issues I think...
    also, there's another problem which causes performance degradation with the simple use of the system
    slpz: Please just push to Savannah Hurd. Under your slpz/... or similar.
    antrik: Might very well be, yes.
    and I'm almost sure it is the fragmentation of the task map
    tschwinge: ok
    after playing a bit with a translator, it can easily get more than 3000 entries in its map
    slpz: yeah, other issues might play a role here as well.
    I observed that terminating the problematic FS servers does free most of the memory and remove most of the performance degradation, but in some cases it's still very slow
    that makes vm_map_lookup a lot slower
    on a related note: any idea what can cause paging errors and a system hang even when there is plenty of free swap?
    (I'm not entirely sure, but my impression is that it *might* be related to the swap usage and performance degradation problems)
    I think this degree of fragmentation has something to do with the reiterative mapping of memory objects which is done in pager-memcpy.c
    antrik: which kind of paging errors?
    hm... I don't think I ever noted down the exact message; but I think it's the same you get when actually running out of swap
    antrik: that could be the default pager dying from some internal bug
    well, but it *seems* to go along with the performance degradation and/or swap usage
    I also have the impression that we're using memory objects the wrong way
    basically, once I get to a certain level of swap use and slowness (after about a month of use), the system eventually dies
    antrik: I never had a system running for that long, so it could be a completely different problem from what I've seen before :-/
    does anybody have experience with block-level caches in microkernel environments?
    slpz: yeah, it typically happens after about a month of my normal use... but I can significantly accelerate it by putting some problematic load on it, such as large apt-get runs...
    I wonder whether it would be better to put them in the kernel or in user space, and in the latter case, whether it would be better to have one per-device cache shared by all accessing translators, or each task having its own cache...
    slpz: http://lists.gnu.org/archive/html/bug-hurd/2011-09/msg00041.html is where I described the issue(s)
    (should send another update for the most recent findings I guess...)
    slpz: well, if we move to userspace drivers, the kernel part of the question is already answered ;-)
    but I'm not sure about per-device cache vs. caching in the FS server