There is a [[!FF_project 266]][[!tag bounty]] on this task.

    <braunr> 20 years ago
    <braunr> but it's a source of deadlock
    <mcsim> Indeed. I won't use kmem_alloc_pageable.


# IRC, freenode, #hurd, 2011-08-09

    < braunr> mcsim: what's the "bug related to MEM_CF_VERIFY" you refer to in
      one of your commits ?
    < braunr> mcsim: don't use spin_lock_t as a member of another structure
    < mcsim> braunr: I confused the types in the *_verify functions, so they
      didn't work. Then I fixed it in the commit you mentioned.
    < braunr> in gnumach, most types are actually structure pointers
    < braunr> use simple_lock_data_t
    < braunr> mcsim: ok
    < mcsim> > use simple_lock_data_t
    < mcsim> braunr: ok
    < braunr> mcsim: don't make too many changes to the code base, and if
      you're unsure, don't hesitate to ask
    < braunr> also, i really insist you rename the allocator, as done in x15
      for example
      (http://git.sceen.net/rbraun/x15mach.git/?a=blob;f=vm/kmem.c), instead of
      a name based on mine :/
    < mcsim> braunr: Ok. It was just a working name. When I finish I'll rename
      the allocator.
    < braunr> other than that, it's nice to see progress
    < braunr> although again, it would be better with some reports along
    < braunr> i won't be present at the meeting tomorrow unfortunately, but you
      should use those to report the status of your work
    < mcsim> braunr: You've said that I have to tweak the gc process. Did you
      mean to call mem_gc() when physical memory ends instead of calling it
      every x seconds? Or something else?
    < braunr> there are multiple topics, although only one that really matters
    < braunr> study how zone_gc was called
    < braunr> reclaiming memory should happen when there is pressure on the VM
      subsystem
    < braunr> but it shouldn't happen too often, otherwise there is thrashing
    < braunr> and your caches become mostly useless
    < braunr> the original slab allocator uses a 15-second period after a
      reclaim during which reclaiming has no effect
    < braunr> this allows having a somewhat stable working set for this
      duration
    < braunr> the linux slab allocator uses 5 seconds, but has a more
      complicated reclaiming mechanism
    < braunr> it releases memory gradually, and from reclaimable caches only
      (dentry for example)
    < braunr> for x15 i intend to implement the original 15 second interval and
      then perform full reclaims
    < mcsim> In zalloc mem_gc is called by vm_pageout_scan, but not more often
      than once a second.
    < mcsim> In balloc I've changed the interval to once in 15 seconds.
    < braunr> don't use the code as it is
    < braunr> the version you've based your work on was meant for userspace
    < braunr> where there isn't memory pressure
    < braunr> so a timer is used to trigger reclaims at regular intervals
    < braunr> it's different in a kernel
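
The policy described above fits in a few lines of C. The sketch below is
illustrative only: it assumes a `mem_gc()` entry point called from the pageout
daemon and a once-per-second `sched_tick` counter like the one
`consider_zone_gc()` uses; the `MEM_GC_INTERVAL` name is made up here, its
value taken from the 15-second period braunr mentions.

    /*
     * Rate-limited reclaim trigger, modeled on consider_zone_gc().
     * Meant to be called from the pageout daemon, i.e. under pressure.
     */
    #define MEM_GC_INTERVAL 15              /* seconds between reclaims */

    extern unsigned int sched_tick;         /* advances once per second */
    extern void mem_gc(void);               /* return free slabs to the VM */

    static unsigned int mem_gc_last_tick;

    void
    consider_mem_gc(void)
    {
        /*
         * Ignore requests arriving too soon after the last reclaim, so
         * the caches keep a stable working set instead of thrashing
         * against the VM system.
         */
        if (sched_tick - mem_gc_last_tick >= MEM_GC_INTERVAL) {
            mem_gc_last_tick = sched_tick;
            mem_gc();
        }
    }

With such a gate, the function can safely be called from vm_pageout_scan() as
often as the pageout daemon runs: at most one reclaim per interval is actually
performed.
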
    < braunr> mcsim: where did you see vm_pageout_scan call the zone gc once a
      second ?
    < mcsim> vm_pageout_scan calls consider_zone_gc and consider_zone_gc
      checks if a second has passed.
    < braunr> where ?
    < mcsim> Then zone_gc can be called.
    < braunr> ah ok, it's in zaclloc.c then
    < braunr> zalloc.c
    < braunr> yes this function is fine
    < mcsim> so the old gc didn't consider vm pressure. Or I missed something.
    < braunr> it did
    < mcsim> how?
    < braunr> well, it's called by the pageout daemon
    < braunr> under memory pressure
    < braunr> so it's fine
    < mcsim> so if mem_gc is called by the pageout daemon, is it fine?
    < braunr> it must be changed to do something similar to what
      consider_zone_gc does
    < mcsim> It does. mem_gc does the same work as consider_zone_gc and
      zone_gc.
    < braunr> good
    < mcsim> so the gc process is fine?
    < braunr> should be
    < braunr> i see mem.c only includes mem.h, which then includes other
      headers
    < braunr> don't do that
    < braunr> always include all the headers you need where you need them
    < braunr> if you need avltree.h in both mem.c and mem.h, include it in both
      files
    < braunr> and by the way, i recommend you use the red black tree instead of
      the avl type
    < braunr> (it's the same interface so it shouldn't take long)
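
The include discipline braunr asks for looks like this in practice. The header
contents below are hypothetical; only the dependency pattern matters:

    /* mem.h -- uses the tree type itself, so it includes rbtree.h. */
    #ifndef _MEM_H_
    #define _MEM_H_

    #include "rbtree.h"

    struct mem_cache {
        struct rbtree_node node;    /* per-cache tree linkage */
        /* ... */
    };

    #endif /* _MEM_H_ */

    /* mem.c -- also uses the tree interface directly, so it includes
       rbtree.h again instead of relying on mem.h to drag it in. */
    #include "rbtree.h"
    #include "mem.h"

Each file then keeps compiling even if mem.h later stops needing rbtree.h and
drops the include.
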
    < mcsim> As to the report: if you won't be present at the meeting, I can
      tell you what I have to do now.
    < braunr> sure
    < braunr> in addition, use GPLv2 as the license, the BSD one is meant for
      the userspace version only
    < braunr> GPLv2+ actually
    < braunr> hm you don't need list.c
    < braunr> it would only add dead code
    < braunr> "Zone for dynamical allocator", don't mix terms
    < braunr> this comment refers to a vm_map, so call it a map
    < mcsim> 1. Change constructor for kentry_alloc_cache.
    < mcsim> 2. Make measurements.
    < mcsim> 3. Use simple_lock_data_t
    < mcsim> 4. Replace license
    < braunr> kentry_alloc_cache <= what is that ?
    < braunr> cache for kernel map entries in vm_map ?
    < braunr> the comment for mem_cpu_pool_get doesn't apply in gnumach, as
      there is no kernel preemption
    < braunr> "Don't attempt mem GC more frequently than hz/MEM_GC_INTERVAL
      times a second."
    < mcsim> sorry. I meant vm_map_kentry_cache
    < braunr> hm nothing actually about this comment
    < braunr> mcsim: ok
    < braunr> yes kernel map entries need special handling
    < braunr> i don't know how it's done in gnumach though
    < braunr> static preallocation ?
    < mcsim> yes
    < braunr> that's ugly :p
    < mcsim> but it uses dynamic allocation further even for vm_map kernel
      entries
    < braunr> although such bootstrapping issues are generally difficult to
      solve elegantly
    < braunr> ah
    < mcsim> now I use only static allocation, but I'll add dynamic allocation
      too
    < braunr> when you have time, mind the coding style (convert everything to
      gnumach style, which mostly implies using tabs instead of 4-space
      indentation)
    < braunr> when you work on dynamic allocation for the kernel map entries,
      you may want to review how it's done in x15
    < braunr> the mem_source type was originally intended for that purpose, but
      has slightly changed once the allocator was adapted to work in my kernel
    < mcsim> ok
    < braunr> vm_map_kentry_zone is the only zone created with ZONE_FIXED
    < braunr> and it is zcram()'ed immediately after
    < braunr> so you can consider it a statically allocated zone
    < braunr> in x15 i use another strategy: there is a special kernel submap
      named kentry_map which contains only one map entry (statically
      allocated)
    < braunr> this map is the backend (mem_source) for the kentry_cache
    < braunr> the kentry_cache is created with a special flag that tells it
      memory can't be reclaimed
    < braunr> when the cache needs to grow, the single map entry is extended to
      cover the allocated memory
    < braunr> it's similar to the way pmap_growkernel() works for kernel page
      table pages
    < braunr> (and is actually based on that idea)
    < braunr> it's a compromise between full static and dynamic allocation
      types
    < braunr> the advantage is that the allocator code can be used (so there is
      no need for a special allocator like in netbsd)
    < braunr> the drawback is that some resources can never be returned to
      their source (and under peaks, the amount of unfreeable resources could
      become large, but this is unexpected)
    < braunr> mcsim: for now you shouldn't waste your time with this
    < braunr> i see the number of kernel map entries is fixed at 256
    < braunr> and i've never seen the kernel use more than around 30 entries
    < mcsim> Do you think that I have to leave this problem until the end?
    < braunr> yes
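
The x15 strategy braunr describes can be condensed into a sketch. Every name
below (`kentry_map_entry`, `kentry_slab_alloc`, `kmem_grow_mapping`) is an
illustrative reconstruction from the conversation, not the actual x15 code:

    /*
     * Bootstrapping kernel map entries: the cache of map entries cannot
     * grow through the normal vm_map path, since that path itself
     * consumes map entries.  Instead, the cache's backend covers its
     * allocations with a single statically allocated entry that is
     * simply extended, the way pmap_growkernel() extends kernel page
     * tables.
     */
    static struct vm_map_entry kentry_map_entry;    /* the only entry */

    static vm_offset_t
    kentry_slab_alloc(vm_size_t size)
    {
        vm_offset_t addr;

        /* Take the next chunk of the reserved submap range. */
        addr = kentry_map_entry.vme_end;

        /* Hypothetical helper: back [addr, addr + size) with wired
           pages without going through vm_map_find/vm_map_enter. */
        kmem_grow_mapping(addr, size);

        /* Extend the static entry to account for the new memory.
           Nothing is ever returned, which is the accepted drawback. */
        kentry_map_entry.vme_end += size;
        return addr;
    }

The point of the compromise is that the regular allocator code services kernel
map entries (no special-purpose allocator as in netbsd), at the price that
this memory can never be reclaimed.
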
# IRC, freenode, #hurd, 2011-08-11

    < mcsim> braunr: Hello. Can you give me advice on how I can make my
      measurements better?
    < braunr> mcsim: what kind of measurements
    < mcsim> braunr: How much better is your allocator than zalloc.
    < braunr> slightly :p
    < braunr> that's why i never took the time to put it in gnumach
    < mcsim> braunr: Just I thought that there are some rules or
      recommendations for such measurements. Or can I do them any way I want?
    < braunr> mcsim: i don't know
    < braunr> mcsim: benchmarking is an art of its own, and i don't even know
      how to use the bits of profiling code available in gnumach (if it still
      works)
    < antrik> mcsim: hm... are you saying you already have a running system
      with the slab allocator?... :-)
    < braunr> mcsim: the main advantage i can see is the removal of many
      arbitrary hard limits
    < mcsim> antrik: yes
    < antrik> \o/
    < antrik> nice work!
    < braunr> :)
    < braunr> the cpu layer should also help a bit, but it's hard to measure
    < braunr> i guess it could be seen on the ipc path for very small buffers
    < mcsim> antrik: Thanks. But I still have to 1. Change constructor for
      kentry_alloc_cache. and 2. Make measurements.
    < braunr> and polish the whole thing :p
    < antrik> mcsim: I'm not sure this can be measured... the performance
      difference in any real-life usage is probably just a few percent at most
      -- it's hard to construct a benchmark giving enough precision so it's
      not drowned in noise...
    < antrik> perhaps it conserves some memory -- but that too would be hard
      to measure I fear
    < braunr> yes
    < braunr> there *should* be better allocation times, less fragmentation,
      better accounting ... :)
    < braunr> and no arbitrary limits !
    < antrik> :-)
    < braunr> oh, and the self debugging features can be nice too
    < mcsim> But I need to prove that my work wasn't useless
    < braunr> well it wasn't, but that's hard to measure
    < braunr> it's easy to prove though, since there are additional features
      that weren't present in the zone allocator
    < mcsim> Ok. If there are some profiling features in gnumach can you give
      me a link with their description?
    < braunr> mcsim: sorry, no
    < braunr> mcsim: you could still write the basic loop test, which counts
      the number of allocations performed in a fixed time interval
    < braunr> but as it doesn't match many real life patterns, it won't be
      very useful
    < braunr> and i'm afraid that if you consider real life patterns, you'll
      see how negligible the improvement can be compared to other operations
      such as memory copies or I/O (ouch)
    < mcsim> Do network drivers use this allocator?
    < mcsim> ok. I'll scrape up some test and then I'll report results.
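
The "basic loop test" suggested above could look like the sketch below. The
`sched_tick` time source and the cache `name` field are assumptions, and, as
braunr warns, the resulting figure says little about real-life workloads:

    /* Count alloc/free pairs completed in a fixed number of seconds. */
    extern unsigned int sched_tick;         /* once-per-second counter */

    static void
    mem_bench_rate(struct mem_cache *cache, unsigned int seconds)
    {
        unsigned long count = 0;
        unsigned int start = sched_tick;
        void *obj;

        while (sched_tick - start < seconds) {
            obj = mem_cache_alloc(cache);
            mem_cache_free(cache, obj);
            count++;
        }

        printf("%s: %lu alloc/free pairs in %u seconds\n",
               cache->name, count, seconds);
    }

Run against caches of several object sizes, under both zalloc and the new
allocator, this at least gives a coarse throughput comparison.
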
# IRC, freenode, #hurd, 2011-08-26

    < mcsim> hello. Are there any analogs of linux's copy_to_user and
      copy_from_user in gnumach?
    < mcsim> Or how can I determine the memory map if I know an address? I
      need this for vm_map_copyin
    < guillem> mcsim: vm_map_lookup_entry?
    < mcsim> guillem: but I need to pass a map to this function and it will
      return an entry which contains the specified address.
    < mcsim> And I don't know what map I have to pass.
    < mcsim> I need to transfer a static array from the kernel to user space.
      What map contains static data?
    < antrik> mcsim: Mach doesn't have copy_{from,to}_user -- instead, large
      chunks of data are transferred as out-of-line data in IPC messages
      (i.e. using VM magic)
    < mcsim> antrik: can you give me an example? I just found vm_map_copyin
      being used in host_zone_info.
    < antrik> no idea what vm_map_copyin is to be honest...


# IRC, freenode, #hurd, 2011-08-27

    < braunr> mcsim: the primitives are named copyin/copyout, and they are used
      for messages with inline data
    < braunr> or copyinmsg/copyoutmsg
    < braunr> vm_map_copyin/out should be used for chunks larger than a page
      (or roughly a page)
    < braunr> also, when writing to a task space, see which is better suited:
      vm_map_copyout or vm_map_copy_overwrite
    < mcsim> braunr: and what will be src_map for vm_map_copyin/out?
    < braunr> the caller map
    < braunr> which you can get with current_map() iirc
    < mcsim> braunr: thank you
    < braunr> be careful not to leak anything in the transferred buffers
    < braunr> memset() to 0 if in doubt
    < mcsim> braunr: ok
    < braunr> antrik: vm_map_copyin() is roughly vm_read()
    < antrik> braunr: what is it used for?
    < braunr> antrik: 01:11 < antrik> mcsim: Mach doesn't have
      copy_{from,to}_user -- instead, large chunks of data are transferred as
      out-of-line data in IPC messages (i.e. using VM magic)
    < braunr> antrik: that "VM magic" is partly implemented using vm_map_copy*
      functions
    < antrik> braunr: oh, you mean it doesn't actually copy data, but only page
      table entries? if so, that's *not* really comparable to
      copy_{from,to}_user()...


# IRC, freenode, #hurd, 2011-08-28

    < braunr> antrik: the equivalent of copy_{from,to}_user are
      copy{in,out}{,msg}
    < braunr> antrik: but when the data size is about a page or more, it's
      better not to copy, of course
    < antrik> braunr: it's actually not clear at all that it's really better to
      do VM magic than to copy...


# IRC, freenode, #hurd, 2011-08-29

    < braunr> antrik: at least, that used to be the general idea, and with a
      simpler VM i suspect it's still true
    < braunr> mcsim: did you progress on your host_zone_info replacement ?
    < braunr> mcsim: i think you should stick to what the original
      implementation did
    < braunr> which is making an inline copy if the caller provided enough
      space, using kmem_alloc_pageable otherwise
    < braunr> specify ipc_kernel_map if using kmem_alloc_pageable
    < mcsim> braunr: yes. And it works. But I use kmem_alloc, not pageable. Is
      it worse?
    < mcsim> braunr: the host_zone_info replacement is pushed to the savannah
      repository.
    < braunr> mcsim: i'll have a look
    < mcsim> braunr: I've pushed one more commit just now, which relates to
      host_zone_info.
    < braunr> mem_alloc_early_init should be renamed mem_bootstrap
    < mcsim> ok
    < braunr> mcsim: i don't understand your call to kmem_free
    < mcsim> braunr: It shouldn't be there?
    < braunr> why should it be there ?
    < braunr> you're freeing what the copy object references
    < braunr> it's strange that it even works
    < braunr> also, you shouldn't pass infop directly as the copy object
    < braunr> i guess you get a warning for that
    < braunr> do what the original code does: use an intermediate copy object
      and a cast
    < mcsim> ok
    < braunr> another error (without consequence but still, you should mind
      it)
    < braunr> simple_lock(&mem_cache_list_lock);
    < braunr> [...]
    < braunr> kr = kmem_alloc(ipc_kernel_map, &info, info_size);
    < braunr> you can't hold simple locks while allocating memory
    < braunr> read how the original implementation works around this
    < mcsim> ok
    < braunr> i guess host_zone_info assumes the zone list doesn't change much
      while unlocked
    < braunr> or that it's rather unimportant since it's for debugging
    < braunr> a strict snapshot isn't required
    < braunr> list_for_each_entry(&mem_cache_list, cache, node) max_caches++;
    < braunr> you should really use two separate lines for readability
    < braunr> also, instead of counting each time, you could just maintain a
      global counter
    < braunr> mcsim: use strncpy instead of strcpy for the cache names
    < braunr> not to avoid overflow but rather to clear the unused bytes at
      the end of the buffer
    < braunr> mcsim: about kmem_alloc vs kmem_alloc_pageable, it's a minor
      issue
    < braunr> you're handing off debugging data to a userspace application
    < braunr> a rather dull reporting tool in most cases, which doesn't
      require wired down memory
    < braunr> so in order to better use available memory, pageable memory
      should be used
    < braunr> in the future i guess it could become a not-so-minor issue
      though
    < mcsim> ok. I'll fix it
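
Taken together, braunr's review comments reproduce the shape of the original
host_zone_info(): read the list size, drop the simple lock across the
allocation (which may block), retry if the list grew meanwhile, and hand the
buffer to userspace through an intermediate copy object. A condensed sketch,
with hypothetical mem_* names (`mem_info_t`, `mem_nr_caches`,
`struct mem_cache_info`) and error handling trimmed:

    kern_return_t
    mem_info(host_t host, mem_info_t *infop, unsigned int *countp)
    {
        vm_offset_t addr;
        vm_size_t size;
        unsigned int max_caches;
        vm_map_copy_t copy;
        kern_return_t kr;

        if (host == HOST_NULL)
            return KERN_INVALID_HOST;

        for (;;) {
            simple_lock(&mem_cache_list_lock);
            max_caches = mem_nr_caches;     /* maintained global counter */
            simple_unlock(&mem_cache_list_lock);

            /* Allocate unlocked: kmem_alloc_pageable() can block, and
               blocking while holding a simple lock risks deadlock. */
            size = round_page(max_caches * sizeof(struct mem_cache_info));
            kr = kmem_alloc_pageable(ipc_kernel_map, &addr, size);
            if (kr != KERN_SUCCESS)
                return kr;

            simple_lock(&mem_cache_list_lock);
            if (mem_nr_caches <= max_caches)
                break;                      /* estimate still holds */

            /* The cache list grew while unlocked: retry. */
            simple_unlock(&mem_cache_list_lock);
            kmem_free(ipc_kernel_map, addr, size);
        }

        /* ... fill (struct mem_cache_info *)addr from the cache list,
           using strncpy for the names so trailing bytes are zeroed ... */
        simple_unlock(&mem_cache_list_lock);

        /* Intermediate copy object and cast, as in the original code. */
        kr = vm_map_copyin(ipc_kernel_map, addr, size, TRUE, &copy);
        assert(kr == KERN_SUCCESS);
        *infop = (mem_info_t) copy;
        *countp = max_caches;
        return KERN_SUCCESS;
    }

Note that kmem_free() only runs on the retry path here, to release a buffer
that was never exposed; freeing memory that a copy object already references,
as in the reviewed code, is a bug.
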
    < braunr> mcsim: have you tried to run the kernel with MC_VERIFY always on
      ?
    < braunr> MEM_CF_VERIFY actually
    < mcsim1> yes.
    < braunr> oh
    < braunr> nothing wrong ?
    < mcsim1> it is always set
    < braunr> ok
    < braunr> ah, you set it in macros.h ..
    < braunr> don't
    < braunr> put it in mem.c if you want, or better, make it a compile-time
      option
    < braunr> macros.h is a tiny macro library, it shouldn't define such
      unrelated options
    < mcsim1> ok.
    < braunr> mcsim1: did you try fault injection to make sure the checking
      code actually works and how it behaves when an error occurs ?
    < mcsim1> I think that when I finish I'll merge the files cpu.h and
      macros.h into mem.c
    < braunr> yes that would simplify things
    < mcsim1> Yes. When I confused the types, mem_buf_fill worked incorrectly
      and a panic occurred.
    < braunr> very good
    < braunr> have you progressed concerning the measurements you wanted to do
      ?
    < mcsim1> not much.
    < braunr> ok
    < mcsim1> I think they will be ready in a few days.
    < antrik> what measurements are these?
    < mcsim1> braunr: What is the maximal size for static data and the stack
      in the kernel?
    < braunr> what do you mean ?
    < braunr> kernel stacks are one page if i'm right
    < braunr> static data (rodata+data+bss) are limited by grub bugs only :)
    < mcsim1> braunr: probably they are present, because when I created a too
      big array I couldn't boot the kernel
    < braunr> local variable or static ?
    < mcsim1> static
    < braunr> how large ?
    < mcsim1> 4 MiB
    < braunr> hm
    < braunr> it's not a grub bug then
    < braunr> i was able to embed as much as 32 MiB in x15 while doing this
      kind of tests
    < braunr> I guess it's the gnu mach boot code which only preallocates one
      page for the initial kernel mapping
    < braunr> one PTP (page table page) maps 4 MiB
    < braunr> (x15 does this completely dynamically, unlike mach or even
      current BSDs)
    < mcsim1> antrik: First I want to measure the time of each cache
      creation/allocation/deallocation and then compile the kernel.
    < braunr> cache creation is irrelevant
    < braunr> because of the cpu pools in the new allocator, you should test at
      least two different allocation patterns
    < braunr> one with quick allocs/frees
    < braunr> the other with large numbers of allocs then their matching frees
    < braunr> (larger being at least 100)
    < braunr> i'd say the cpu pool layer is the real advantage over the
      previous zone allocator
    < braunr> (from a performance perspective)
    < mcsim1> But there is only one cpu
    < braunr> it doesn't matter
    < braunr> it's still a very effective cache
    < braunr> in addition to reducing contention
    < braunr> compare mem_cpu_pool_pop() against mem_cache_alloc_from_slab()
    < braunr> mcsim1: work is needed to polish the whole thing, but getting it
      actually working is a nice achievement for someone new on the project
    < braunr> i hope it helped you learn about memory allocation, virtual
      memory, gnu mach and the hurd in general :)
    < antrik> indeed :-)
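
The comparison braunr suggests as a final exercise makes the point about the
cpu layer. A simplified view (structures abbreviated; based on the design
discussed here, not copied from the sources):

    /* Per-CPU pool: a plain array used as a stack of free buffers. */
    struct mem_cpu_pool {
        int nr_objs;            /* buffers currently in the pool */
        void **array;           /* the pooled buffers */
    };

    /*
     * Fast path: after the caller checks nr_objs > 0, an allocation is
     * a single array access.  The slab path, by contrast, must find a
     * non-full slab in the cache lists, unlink a buffer from that
     * slab's free list, update reference counts, and possibly move the
     * slab between lists -- several dereferences and branches per
     * allocation.
     */
    static inline void *
    mem_cpu_pool_pop(struct mem_cpu_pool *cpu_pool)
    {
        return cpu_pool->array[--cpu_pool->nr_objs];
    }

When the pool runs empty, it is refilled from the slab layer in one batch, so
the slow path is amortized over many allocations; on multiprocessors the same
layer also cuts lock contention, but as braunr notes, it is an effective cache
even with a single cpu.
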