[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!tag open_issue_gnumach open_issue_glibc open_issue_hurd]]

Issues relating to system behavior under memory pressure.

[[!toc]]


# [[gnumach_page_cache_policy]]


# IRC, freenode, #hurd, 2012-07-08

    am i mistaken or is the default pager simply not vm privileged ?
    (which would explain the hangs when memory is very low)
    no idea, but that's very possible
    we start it by hand from the init scripts
    actually, i see no way provided by mach to set that
    i'd assume it would set the property when a thread registers itself
    as the default pager, but it doesn't
    i'll check at runtime and see if fixing it helps
    thread_wire(host, thread, 1) ?
    ./hurd/mach-defpager/wiring.c:  kr = thread_wire(priv_host_port,
    no, look in cprocs.c
    iiuc, it sets a 1:1 kernel/user mapping
    ??
    thread_wire, not cthread_wire
    ah right, i'm getting tired
    youpi: do you understand the comment in default_pager_thread() ?
    well, i'm not sure i know what external vs internal is
    i'm almost sure the default pager is blocked because of a relation
    with an unprivileged thread
    when hangs happen, the pageout daemon is still running, waiting for
    an event so it can continue
    all right, our pageout stuff completely sucks
    when you think the system is hung, it's actually not
    and what's happening instead?
    instead, it seems it's in a very complex recursive state which ends
    in the slab allocator not being able to allocate kernel map entries
    the pageout daemon, unable to continue, progressively slows down in
    the hope that the default pager can service the pageout requests,
    but it can't
    probably the most complicated deadlock i've seen :)
    luckily !
    i've been playing with some tunables involved in waking up the
    pageout daemon and got good results so far (although it's clearly
    not a proper solution)
    one thing the kernel lacks is a way to separate clean from dirty
    pages
    this stupid kernel doesn't try to free clean pages first .. :)
    hm, now i can see the system recover, but some applications are
    still stuck :(
    (but don't worry, my tests are rather aggressive)
    what i mean by aggressive is several builds and various dd of a few
    hundred MiB in parallel, on various file systems
    so far the file systems have been very resilient
    ok, let's try running the hurd with 64 MiB of RAM
    after some initial swapping, it runs smoothly :)
    uh ?
    ah no, i'm still doing my parallel builds, although fewer
    gcc: internal compiler error: Resource lost (program as)
    arg
    lol
    the file system crashed under the compiler
    too much memory required during linking? or should ram+swap have
    been enough?
    there is a lot of swap, i doubt it
    the hurd is such a dumb and impressive system at the same time
    pinotree: what does this tell you ?
    git: hurdsig.c:948: post_signal: Unexpected error: (os/kern) failure.
    something samuel spots often during the builds of haskell packages

Probably also the *sigpost* case mentioned in [[!message-id
"87bol6aixd.fsf@schwinge.name"]].

    actually i should be asking jkoenig
    it seems the lack of memory has a strong impact on signal delivery
    which is bad
    braunr: I have a vague recollection of slpz also saying something
    about missing dirty page tracking a while back...
    I might be confusing stuff though
    pinotree: yes
    it happens often during links, which makes sense
    braunr: "happens often" == "hurdsig.c:948: post_signal: ..."?
    yes
    if you can reproduce it often, what about debugging it? :P
    i mean, the few times i got it, it was often during a link :p
    i'd rather debug the pageout deadlock :(
    but it's hard