From be4193108513f02439a211a92fd80e0651f6721b Mon Sep 17 00:00:00 2001
From: Thomas Schwinge
Date: Wed, 30 Nov 2011 21:21:45 +0100
Subject: IRC.

---
 open_issues/gnumach_memory_management.mdwn | 202 +++++++++++++++++++++++++++++
 1 file changed, 202 insertions(+)

diff --git a/open_issues/gnumach_memory_management.mdwn b/open_issues/gnumach_memory_management.mdwn
index 9a4418c1..c9c3e64f 100644
--- a/open_issues/gnumach_memory_management.mdwn
+++ b/open_issues/gnumach_memory_management.mdwn
@@ -1810,3 +1810,205 @@ There is a [[!FF_project 266]][[!tag bounty]] on this task.
     etenil: but mcsim's work is, for one, useful because the
       allocator code is much clearer, adds some debugging support, and is
       smp-ready
+
+
+# IRC, freenode, #hurd, 2011-11-14
+
+    i've just realized that replacing the zone allocator removes most
+      (if not all) static limit on allocated objects
+    as we have nothing similar to rlimits, this means kernel resources
+      are actually exhaustible
+    and i'm not sure every allocation is cleanly handled in case of
+      memory shortage
+    youpi: antrik: tschwinge: is this acceptable anyway ?
+    (although IMO, it's also a good thing to get rid of those limits
+      that made the kernel panic for no valid reason)
+    there are actually not many static limits on allocated objects
+    only a few have one
+    those defined in kern/mach_param.h
+    most of them are not actually enforced
+    ah ?
+    they are used at zinit() time
+    i thought they were
+    yes, but most zones are actually fine with overcoming the max
+    ok
+    see zone->max_size += (zone->max_size >> 1);
+    you need both !EXHAUSTIBLE and FIXED
+    ok
+    making having rlimits enforced would be nice...
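The growth behaviour quoted above (`zone->max_size += (zone->max_size >> 1);`) can be sketched as a small C model. This is a simplified illustration based only on the discussion in the log; the struct layout and flag names are stand-ins, not the actual gnumach zone definitions. The point it shows: unless a zone carries a flag that enforces its limit, hitting `max_size` just raises the limit by 50%, so most of the static limits never actually bind.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the real zone flags; the exact semantics
 * in gnumach differ slightly. */
#define ZONE_EXHAUSTIBLE 0x1    /* allocation fails when the max is hit */
#define ZONE_FIXED       0x2    /* zone may never grow past max_size */

struct zone {
    size_t cur_size;    /* bytes currently allocated from the zone */
    size_t max_size;    /* nominal limit */
    size_t elem_size;   /* size of one object */
    int flags;
};

/* Returns 1 if allocating one more element may proceed, 0 if the zone
 * is treated as exhausted.  For an ordinary zone (no flags), reaching
 * max_size silently grows the limit by 50% instead of failing. */
static int zone_may_alloc(struct zone *z)
{
    if (z->cur_size + z->elem_size > z->max_size) {
        if (z->flags & (ZONE_EXHAUSTIBLE | ZONE_FIXED))
            return 0;                       /* limit actually enforced */
        z->max_size += z->max_size >> 1;    /* limit quietly raised */
    }
    z->cur_size += z->elem_size;
    return 1;
}
```

With `cur_size` near `max_size`, an unflagged zone grows its maximum (for example from 100 to 150 bytes) rather than failing, matching the "zones are actually fine with overcoming the max" remark.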
+    s/making//
+    pinotree: the kernel wouldn't handle many standard rlimits anyway
+
+    i've just committed my final patch on mcsim's branch, which will
+      serve as the starting point for integration
+    which means code in this branch won't change (or only last minute
+      changes)
+    you're invited to test it
+    there shouldn't be any noticeable difference with the master
+      branch
+    a bit less fragmentation
+    more memory can be reclaimed by the VM system
+    there are debugging features
+    it's SMP ready
+    and overall cleaner than the zone allocator
+    although a bit slower on the free path (because of what's
+      performed to reduce fragmentation)
+    but even "slower" here is completely negligible
+
+
+# IRC, freenode, #hurd, 2011-11-15
+
+    I enabled cpu_pool layer and kentry cache exhausted at "apt-get
+      source gnumach && (cd gnumach-* && dpkg-buildpackage)"
+    I mean kernel with your last commit
+    braunr: I'll make patch how I've done it in a few minutes, ok? It
+      will be more specific.
+    mcsim: did you just remove the #if NCPUS > 1 directives ?
+    no. I replaced macro NCPUS > 1 with SLAB_LAYER, which equals NCPUS
+      > 1, then I redefined macro SLAB_LAYER
+    ah, you want to make the layer optional, even on UP machines
+    mcsim: can you give me the commands you used to trigger the
+      problem ?
+    apt-get source gnumach && (cd gnumach-* && dpkg-buildpackage)
+    mcsim: how much ram & swap ?
+    let's see if it can handle a quite large aptitude upgrade
+    how can I check swap size?
+    free
+    cat /proc/meminfo
+    top
+    whatever
+                 total       used       free     shared    buffers
+      cached
+    Mem:        786368     332296     454072          0          0
+      0
+    -/+ buffers/cache:     332296     454072
+    Swap:      1533948          0    1533948
+    ok, i got the problem too
+    braunr: do you run hurd in qemu?
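The `SLAB_LAYER` change mcsim describes might look roughly like the fragment below. This is a hypothetical reconstruction for illustration, not the actual patch: the CPU pool layer is guarded by its own macro, which defaults to `NCPUS > 1` but can be overridden at build time, so the layer can also be enabled on uniprocessor kernels.

```c
#include <assert.h>

/* NCPUS normally comes from the kernel configuration; default to a
 * uniprocessor build here for illustration. */
#ifndef NCPUS
#define NCPUS 1
#endif

/* Instead of testing NCPUS > 1 directly in the allocator, guard the
 * CPU pool layer with its own macro.  Building with -DSLAB_LAYER=1
 * would enable the layer even when NCPUS == 1. */
#ifndef SLAB_LAYER
#define SLAB_LAYER (NCPUS > 1)
#endif

static int cpu_pool_layer_enabled(void)
{
#if SLAB_LAYER
    return 1;
#else
    return 0;
#endif
}
```

With the defaults above the layer stays disabled, exactly as a plain `#if NCPUS > 1` would behave; the indirection only adds the override point.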
+    yes
+    i guess the cpu layer increases fragmentation a bit
+    which means more map entries are needed
+    hm, something's not right
+    there are only 26 kernel map entries when i get the panic
+    i wonder why the cache gets that stressed
+    hm, reproducing the kentry exhaustion problem takes quite some
+      time
+    braunr: what do you mean?
+    sometimes, dpkg-buildpackage finishes without triggering the
+      problem
+    the problem is in apt-get source gnumach
+    i guess the problem happens because of drains/fills, which
+      allocate/free many more objects than actually preallocated at boot time
+    ah ?
+    ok
+    i've never had it at that point, only later
+    i'm unable to trigger it currently, eh
+    do you use *-dbg kernel?
+    yes
+    well, i use the compiled kernel, with the slab allocator, built
+      with the in kernel debugger
+    when you run apt-get source gnumach, you run it in clean directory?
+      Or there are already present downloaded archives?
+    completely empty
+    ah just got it
+    ok the limit is reached, as expected
+    i'll just bump it
+    the cpu layer drains/fills allocate several objects at once (64 if
+      the size is small enough)
+    the limit of 256 (actually 252 since the slab descriptor is
+      embedded in its slab) is then easily reached
+    mcsim: most direct way to check swap usage is vmstat
+    damn, i can't live without slabtop and the amount of
+      active/inactive cache memory any more
+    hm, weird, we have active/inactive memory in procfs, but not
+      buffers/cached memory
+    we could set buffers to 0 and everything as cached memory, since
+      we're currently unable to communicate the purpose of cached memory
+      (whether it's used by disk servers or file system servers)
+    mcsim: looks like there are about 240 kernel map entries (i forgot
+      about the ones used in kernel submaps)
+    so yes, adding the cpu layer is what makes the kernel reach the
+      limit more easily
+    braunr: so just increasing limit will solve the problem?
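The arithmetic behind the exhaustion is easy to check with the numbers quoted above: 256 statically allocated kentry slots, of which 252 are usable because the slab descriptor is embedded in the slab, CPU pool fills of 64 objects at a time, and roughly 240 entries already in use. The helper below is purely illustrative, not kernel code.

```c
#include <assert.h>

/* Numbers taken from the log above. */
enum {
    KENTRY_TOTAL   = 256,   /* statically allocated kentry slots */
    KENTRY_USABLE  = 252,   /* slab descriptor embedded in the slab */
    CPU_POOL_BATCH = 64,    /* objects moved per CPU pool fill */
};

/* Returns 1 if a pool fill of CPU_POOL_BATCH entries still fits on
 * top of the entries already in use, 0 if it would exhaust the
 * static kentry area. */
static int kentry_fill_fits(int in_use)
{
    return in_use + CPU_POOL_BATCH <= KENTRY_USABLE;
}
```

With about 240 entries in use, a single 64-object fill overruns the 252 usable slots, which is why enabling the CPU pool layer made the kernel hit the limit so much more easily.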
+    mcsim: yes
+    slab reclaiming looks very stable
+    and unfrequent
+    (which is surprising)
+    braunr: "unfrequent"?
+    pinotree: there isn't much memory pressure
+    slab_collect() gets called once a minute on my hurd
+    or is it infrequent ?
+    :)
+    i have no idea :)
+    infrequent, yes
+
+
+# IRC, freenode, #hurd, 2011-11-16
+
+    for those who want to play with the slab branch of gnumach, the
+      slabinfo tool is available at http://git.sceen.net/rbraun/slabinfo.git/
+    for those merely interested in numbers, here is the output of
+      slabinfo, for a hurd running in kvm with 512 MiB of RAM, an unused swap,
+      and a short usage history (gnumach debian packages built, aptitude
+      upgrade for a dozen of packages, a few git commands)
+    http://www.sceen.net/~rbraun/slabinfo.out
+    braunr: numbers for a long usage history would be much more
+      interesting :-)
+
+
+## IRC, freenode, #hurd, 2011-11-17
+
+    antrik: they'll come :)
+    is something going on on darnassus? it's mighty slow
+    yes
+    i've rebooted it to run a modified kernel (with the slab
+      allocator) and i'm building stuff on it to stress it
+    (i don't have any other available machine with that amount of
+      available physical memory)
+    ok
+    braunr: probably would be actually more interesting to test under
+      memory pressure...
+    guess that doesn't make much of a difference for the kernel object
+      allocator though
+    antrik: if ram is larger, there can be more objects stored in
+      kernel space, then, by building something large such as eglibc, memory
+      pressure is created, causing caches to be reaped
+    our page cache is useless because of vm_object_cached_max
+    it's a stupid arbitrary limit masking the inability of the vm to
+      handle pressure correctly
+    if removing it, the kernel freezes soon after ram is filled
+    antrik: it may help trigger the "double swap" issue you mentioned
+    what may help trigger it?
+    not checking this limit
+    hm...
+    indeed I wonder whether the freezes I see might have the
+      same cause
+
+
+## IRC, freenode, #hurd, 2011-11-19
+
+    http://www.sceen.net/~rbraun/slabinfo.out <= state of the slab
+      allocator after building the debian libc packages and removing all files
+      once done
+    it's mostly the same as on any other machine, because of the
+      various arbitrary limits in mach (most importantly, the max number of
+      objects in the page cache)
+    fragmentation is still quite low
+    braunr: actually fragmentation seems to be lower than on the other
+      run...
+    antrik: what makes you think that ?
+    the numbers of currently unused objects seem to be in a similar
+      range IIRC, but more of them are reclaimable I think
+    maybe I'm misremembering the other numbers
+    there had been more reclaims on the other run
+
+
+# IRC, freenode, #hurd, 2011-11-25
+
+    mcsim: i've just updated the slab branch, please review my last
+      commit when you have time
+    braunr: Do you mean compilation/tests?
+    no, just a quick glance at the code, see if it matches what you
+      intended with your original patch
+    braunr: everything is ok
+    good
+    i think the branch is ready for integration
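As a rough illustration of the fragmentation comparison in the 2011-11-19 log: under the assumption that a free object can only be returned to the VM system once every object on its slab is free (so the whole slab can be reclaimed by `slab_collect()`), aggregate counters like the ones slabinfo prints give a best-case bound on reclaimable slabs. The struct and numbers below are illustrative, not actual slabinfo fields.

```c
#include <assert.h>

/* Aggregate per-cache counters, in the spirit of a slabinfo line. */
struct cache_stats {
    int objs_per_slab;  /* objects held by one slab */
    int slabs;          /* slabs currently owned by the cache */
    int used_objs;      /* objects handed out to the kernel */
};

/* Objects sitting free in the cache. */
static int free_objs(const struct cache_stats *c)
{
    return c->slabs * c->objs_per_slab - c->used_objs;
}

/* Best case: if the free objects happened to be packed onto as few
 * slabs as possible, this many whole slabs could be reclaimed.  In
 * the worst case the same free objects are spread one per slab and
 * nothing is reclaimable: that gap is the fragmentation the logs
 * above are comparing between runs. */
static int max_reclaimable_slabs(const struct cache_stats *c)
{
    return free_objs(c) / c->objs_per_slab;
}
```

For example, a cache with 10 slabs of 16 objects and 100 objects in use has 60 free objects, of which at most 3 full slabs' worth could ever be handed back to the VM system.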