Diffstat (limited to 'open_issues/ext2fs_page_cache_swapping_leak.mdwn')
-rw-r--r--  open_issues/ext2fs_page_cache_swapping_leak.mdwn  |  109
1 file changed, 107 insertions(+), 2 deletions(-)
diff --git a/open_issues/ext2fs_page_cache_swapping_leak.mdwn b/open_issues/ext2fs_page_cache_swapping_leak.mdwn
index 075533e7..7c4cf52d 100644
--- a/open_issues/ext2fs_page_cache_swapping_leak.mdwn
+++ b/open_issues/ext2fs_page_cache_swapping_leak.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -8,7 +8,7 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
-[[!tag open_issue_hurd]]
+[[!tag open_issue_gnumach open_issue_hurd]]
There is a [[!FF_project 272]][[!tag bounty]] on this task.
@@ -27,6 +27,7 @@ There is a [[!FF_project 272]][[!tag bounty]] on this task.
<youpi> yes
<youpi> the disk content, basically :)
+
# IRC, freenode, #hurd, 2011-04-18
<antrik> damn, a cp -a simply gobbles down swap space...
@@ -257,3 +258,107 @@ There is a [[!FF_project 272]][[!tag bounty]] on this task.
<antrik> the only way to free the memory seems to be terminating the FS
server
<youpi> uh :/
+
+
+# IRC, freenode, #hurd, 2011-11-30
+
+ <antrik> slpz: basically, whenever free memory goes below the paging
+ threshold (which seems to be around 6 MiB) while there is other I/O
+ happening, swap usage begins to increase continuously; and only gets
+ freed again when the filesystem translator in question exits
+ <antrik> so it sounds *very* much like pages go to swap because the
+ filesystem isn't quick enough to properly page them out
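+
+A minimal sketch of the paging threshold being described: in Mach-derived
+kernels the pageout daemon starts evicting pages once the free-page count
+drops below a target. The counter names below mirror Mach's vm_page_free_*
+variables, but the logic is an illustrative assumption, not GNU Mach's
+actual code.
+
+    /* Illustrative sketch only, not GNU Mach source.  */
+    extern unsigned int vm_page_free_count;   /* pages currently free */
+    extern unsigned int vm_page_free_target;  /* roughly the ~6 MiB above */
+
+    static int
+    pageout_needed (void)
+    {
+      /* Once free memory drops below the target, the pageout daemon starts
+         evicting pages; if the file system's pager cannot clean its pages
+         fast enough, they end up in swap instead.  */
+      return vm_page_free_count < vm_page_free_target;
+    }
+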
+ <antrik> slpz: I think it was you who talked about double paging a while
+ back?
+ <slpz> antrik: probably, sounds like me :-)
+ <antrik> slpz: I have some indication that the degenerating performance and
+ ultimate hang issues I'm seeing are partially or entirely caused by
+ double paging...
+ <antrik> slpz: I don't remember, did you propose some possible fix?
+ <slpz> antrik: hmm... perhaps it wasn't me, because I don't remember trying
+ to fix that problem...
+ <slpz> antrik: at which point do you think pages get duplicated?
+ <antrik> slpz: it was a question. I don't remember whether you proposed
+ something or not :-)
+ <antrik> slpz: basically, whenever free memory goes below the paging
+ threshold (which seems to be around 6 MiB) while there is other I/O
+ happening, swap usage begins to increase continuously; and only gets
+ freed again when the filesystem translator in question exits
+ <antrik> so it sounds *very* much like pages go to swap because the
+ filesystem isn't quick enough to properly page them out
+ <tschwinge>
+ http://www.bddebian.com:8888/~hurd-web/open_issues/ext2fs_page_cache_swapping_leak/
+ <slpz> tschwinge: thanks
+ <slpz> antrik: I see
+ <tschwinge> Always at your service. ;-)
+ <slpz> antrik: I didn't address this problem directly, but when I
+ modified the pageout mechanism to provide special treatment for
+ external pages, I also removed the possibility of sending them to the
+ default pager
+ <slpz> antrik: this was in my experimental environment, of course
+ <antrik> slpz: oh, nice... so it may fix the issues I'm seeing? :-)
+ <antrik> anything testable yet?
+ <slpz> antrik: yes, only anonymous memory could be swapped with that
+ <slpz> antrik: it works, but is ugly as hell
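+
+The change slpz describes is only hinted at above, so the following is a
+rough, self-contained sketch of the idea, under the assumption that the
+decision is made in the pageout path: external (file-backed) pages are
+always handed back to their own pager, and only internal (anonymous) pages
+may ever reach the default pager, i.e. swap. All types and helpers are
+stand-ins, not GNU Mach structures and not the actual patch.
+
+    #include <stdbool.h>
+
+    struct vm_object { bool internal; };       /* anonymous memory?  */
+    struct vm_page   { struct vm_object *object; };
+
+    /* Hypothetical helpers standing in for the real pageout paths.  */
+    extern void return_page_to_its_pager (struct vm_page *);   /* e.g. ext2fs */
+    extern void send_page_to_default_pager (struct vm_page *); /* i.e. swap */
+
+    static void
+    page_out (struct vm_page *m)
+    {
+      if (!m->object->internal)
+        /* File-backed page: its backing store is the file system itself,
+           so never double-page it into swap, even if the file system's
+           pager is slow to take it back.  */
+        return_page_to_its_pager (m);
+      else
+        /* Anonymous page: swap is its only backing store.  */
+        send_page_to_default_pager (m);
+    }
+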
+ <antrik> tschwinge: there is also your observation about compilations
+ getting slower on further runs, and my followups... I *suspect* it's the
+ same issue
+
+[[performance/degradation]].
+
+ <slpz> antrik: I'm thinking about establishing a repository for these
+ experimental versions, so they don't get lost over time
+ <antrik> slpz: please do :-)
+ <slpz> antrik: perhaps in savannah's HARD project
+ <antrik> even if it's not ready for upstream, it would be nice if I could
+ test it -- right now it's bothering me more than any other Hurd issue I
+ think...
+ <slpz> also, there's another problem which causes performance degradation
+ with simple use of the system
+ <tschwinge> slpz: Please just push to Savannah Hurd. Under your
+ slpz/... or similar.
+ <tschwinge> antrik: Might very well be, yes.
+ <slpz> and I'm almost sure it is the fragmentation of the task map
+ <slpz> tschwinge: ok
+ <slpz> after playing a bit with a translator, it can easily get more than
+ 3000 entries in its map
+ <antrik> slpz: yeah, other issues might play a role here as well. I
+ observed that terminating the problematic FS servers does free most of
+ the memory and remove most of the performance degradation, but in some
+ cases it's still very slow
+ <slpz> that makes vm_map_lookup a lot slower
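+
+A hedged sketch of why that many entries hurt: GNU Mach's vm_map_lookup_entry
+is essentially a linear walk of the task's sorted entry list (helped by a
+hint), so the cost of every fault and mapping operation grows with the
+number of entries. The structures below are simplified stand-ins that only
+illustrate that cost, not the kernel's actual data structures.
+
+    #include <stddef.h>
+    #include <stdint.h>
+
+    struct map_entry
+    {
+      uintptr_t start, end;          /* [start, end) of one mapping */
+      struct map_entry *next;
+    };
+
+    struct task_map
+    {
+      struct map_entry *entries;     /* sorted by address */
+    };
+
+    /* O(number of entries): a translator whose map has grown to ~3000
+       entries pays up to ~3000 comparisons per lookup, on every fault.  */
+    static struct map_entry *
+    map_lookup (struct task_map *map, uintptr_t addr)
+    {
+      for (struct map_entry *e = map->entries; e != NULL; e = e->next)
+        if (addr >= e->start && addr < e->end)
+          return e;
+      return NULL;
+    }
+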
+ <antrik> on a related note: any idea what can cause paging errors and a
+ system hang even when there is plenty of free swap?
+ <antrik> (I'm not entirely sure, but my impression is that it *might* be
+ related to the swap usage and performance degradation problems)
+ <slpz> I think this degree of fragmentation has something to do with the
+ repeated mapping of memory objects which is done in pager-memcpy.c
+ <slpz> antrik: which kind of paging errors?
+ <antrik> hm... I don't think I ever noted down the exact message; but I
+ think it's the same you get when actually running out of swap
+ <slpz> antrik: that could be the default pager dying from some internal bug
+ <antrik> well, but it *seems* to go along with the performance degradation
+ and/or swap usage
+ <slpz> I also have the impression that we're using memory objects the wrong
+ way
+ <antrik> basically, once I get to a certain level of swap use and slowness
+ (after about a month of use), the system eventually dies
+ <slpz> antrik: I never had a system running for that long, so it could be a
+ completely different problem from what I've seen before :-/
+ <slpz> Does anybody have experience with block-level caches in
+ microkernel environments?
+ <antrik> slpz: yeah, it typically happens after about a month of my normal
+ use... but I can significantly accelerate it by putting some problematic
+ load on it, such as large apt-get runs...
+ <slpz> I wonder if it would be better to put them in the kernel or in
+ user space. And in the latter case, whether it would be better to have
+ one per-device cache shared by all accessing translators, or for each
+ task to have its own cache...
+ <antrik> slpz:
+ http://lists.gnu.org/archive/html/bug-hurd/2011-09/msg00041.html is where
+ I described the issue(s)
+ <antrik> (should send another update for the most recent findings I
+ guess...)
+ <antrik> slpz: well, if we move to userspace drivers, the kernel part of
+ the question is already answered ;-)
+ <antrik> but I'm not sure about per-device cache vs. caching in FS server
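+
+To make the last option a bit more concrete, here is a purely hypothetical
+interface for a shared, per-device, user-space block cache. None of these
+names exist in the Hurd; the sketch only illustrates the design choice being
+discussed: one cache server per device, through which every translator
+accessing that device looks blocks up instead of keeping its own private
+copy of the disk contents.
+
+    #include <stddef.h>
+    #include <stdint.h>
+
+    typedef uint64_t block_t;
+
+    /* Opaque handle; the cache itself would live in a per-device server.  */
+    struct block_cache;
+
+    /* Attach to the cache serving one device (e.g. "hd0s1").  */
+    struct block_cache *bc_attach (const char *device);
+
+    /* Get a reference to a cached block, reading it from the device on a
+       miss; the returned buffer stays valid until bc_put().  */
+    int bc_get (struct block_cache *bc, block_t blkno, void **data,
+                size_t *len);
+
+    /* Mark a block dirty so the cache server eventually writes it back.  */
+    int bc_dirty (struct block_cache *bc, block_t blkno);
+
+    /* Drop the reference taken by bc_get().  */
+    void bc_put (struct block_cache *bc, block_t blkno);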