From c4ad3f73033c7e0511c3e7df961e1232cc503478 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge
Date: Wed, 26 Feb 2014 12:32:06 +0100
Subject: IRC.

---
 open_issues/profiling.mdwn | 233 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 232 insertions(+), 1 deletion(-)

diff --git a/open_issues/profiling.mdwn b/open_issues/profiling.mdwn
index 545edcf6..e7dde903 100644
--- a/open_issues/profiling.mdwn
+++ b/open_issues/profiling.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation,
+[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation,
 Inc."]]
 
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -138,3 +138,234 @@ done for [[performance analysis|performance]] reasons.
     know what happen and how happen, maybe just suitable for newbie, hope more young hack like it once it's done, everything else is just sugar candy around it
+
+
+# IRC, freenode, #hurd, 2014-01-05
+
+    braunr: do you speak ocaml ?
+    i had this awesome idea for a universal profiling framework for
+      c
+    universal as in not os dependent, so it can be easily used on
+      hurd or in gnu mach
+    it does a source transformation, instrumenting what you are
+      interested in
+    for this transformation, coccinelle is used
+    i have a prototype to measure how often a field in a struct is
+      accessed
+    unfortunately, coccinelle hangs while processing kern/slab.c :/
+    teythoon: I do speak ocaml
+    awesome :)
+    unfortunately, i do not :/
+    i should probably get in touch with the coccinelle devs, most
+      likely the problem is that coccinelle runs in circles somewhere
+    it's not so complex actually
+    possibly, yes
+    do you know coccinelle ?
+    the only really peculiar thing in ocaml is lambda calculus
+    +c
+    I know a bit, although I've never really written an semantic patch
+      myself
+    i'm okay with that
+    but I can understand them
+    then ocaml should be fine for you :)
+    just ask the few bits that you don't understand :)
+    yeah, i haven't really made an effort yet
+    writing ocaml is a bit more difficult because you need to
+      understand the syntax, but for putting printfs it should be easy enough
+    if you get a backtrace with ocamldebug (it basically works like
+      gdb), I can probably explain you what might be happening
+
+
+## IRC, freenode, #hurd, 2014-01-06
+
+    braunr: i'm not doing microoptimizations, i'm developing a
+      profiler :p
+    teythoon: nice :)
+    i thought you might like it
+    teythoon: you may want to look at
+      http://pdos.csail.mit.edu/multicore/dprof/
+    from the same people who brought radixvm
+    which data structure should i test it with next ?
+    uh, no idea :)
+    the ipc ones i suppose
+    yeah, or the task related ones
+    but be careful, there many "inline" versions of many ipc functions
+      in the fast paths
+    and when they say inline, they really mean they copied it
+    +are
+    but i have a microbenchmark for ipc performance
+    you sure have been busy ;p
+    it's funny you're working on a profiler at the same time a
+      collegue of mine said he was interested in writing one in x15 :)
+    i don't think inlining is a problem for my tool
+    well, you can use my tool for x15
+    i told him he could look at what you did
+    so i expect he'll ask soon
+    cool :)
+    my tool uses coccinelle to instrument c code, so this works in
+      any environment
+    one just needs a little glue and a method to get the data
+    seems reasonable
+    for gnumach, i just stuff a tiny bit of code into the kdb
+
+    hm debians bigmem patch with my code transformation makes
+      gnumach hang early on
+    i don't even get a single message from gnumach
+    ouch
+    or it is somethign else entirely
+    it didn't even work without my patches o_O
+    weird
+    uh oh, the kmem_cache array is not properly aligned
+    braunr: http://paste.debian.net/74588/
+    teythoon: do you mean, with your patch ?
+    i'm not sure i understand
+    are you saying gnumach doesn't start because of an alignment issue
+      ?
+    no, that's unrelated
+    i skipped the bigmem patch, have a running gnumach with
+      instrumentation
+    hum, what is that aliased column ?
+    but, despite my efforts with __attribute__((align(64))), i see
+      lot's of accesses to kmem_cache objects which are not properly aligned
+    is that reported by the performance counters ?
+    no
+    http://paste.debian.net/74593/
+    aer those the previous lines accessed by other unrelated code ?
+    previous bytes in the same line*
+    this is a patch generated to instrument the code
+    so i instrument field access of the form i->a
+    but if one does &i->a, my approach will no longer keep track of
+      any access through that pointer
+    so i do not count that as an access but as creating an alias for
+      that field
+    ok
+    so if that aliased count is not zero, the tool might
+      underestimate the access count
+    hm
+    static struct kmem_cache kalloc_caches[KALLOC_NR_CACHES]
+      __attribute__((align(64)));
+    but
+    nm gnumach|grep kalloc_caches
+    c0226e20 b kalloc_caches
+    ah, that's fine
+    yes
+    nevr mind
+    don't we have a macro for the cache line size ?
+    ah, there are a great many more kmem_caches around and noone
+      told me ...
+    teythoon: eh :)
+    aren't you familiar with type-specific caches ?
+    no, i'm not familiar with anything in gnumach-land
+    well, it's the regular slab allocator, carrying the same ideas
+      since 1994
+    it's pretty much the same in linux and other modern unices
+    ok
+    the main difference is likely that we allocate our caches
+      statically because we have no kernel modules and know we'll never destroy
+      them, only reap them
+    is there a macro for the cache line size ?
+    there is one burried in the linux source
+    L1_CACHE_BYTES from linux/src/include/asm-i386/cache.h
+    there is one in kern/slab.h
+    but it is out of date
+    there is ?
+    but it's commented out
+    only used when SLAB_USE_CPU_POOLS is defined
+    but the build system should give you CPU_L1_SHIFT
+    hm
+    and we probably should define CPU_L1_SIZE from that
+      unconditionnally in config.h or a general param.h file if there is one
+    the architecture-specific one perhaps
+    although it's exported to userland so maybe not
+
+
+## IRC, freenode, #hurd, 2014-01-07
+
+    braunr: linux defines ____cacheline_aligned :
+      http://lxr.free-electrons.com/source/include/linux/cache.h#L20
+    where would i put a similar definition in gnumach ?
+    .oO( four underscores ?!? )
+    heh
+    yes, four
+    teythoon: yes :)
+
+    are kmem_cache objects ever allocated dynamically in gnumach ?
+    no
+    hm
+    i figured that, since there are no kernel modules, there is no
+      need to allocate them dynamically, since they're never destroyed
+    so i aligned all statically declarations with
+      __attribute__((align(1 << CPU_L1_SHIFT)))
+    but i still see 77% of all accesses being to objects that are
+      not properly aligned o_O
+    ah
+    >,<
+    you could add an assertion in kmem_cache_init to find out what's
+      wrong
+    *aligned
+    eh :)
+    right
+    grr
+    sweet, the kmem_caches are now all properly aligned :)
+    :)
+
+    hm
+    i guess i should change what vmstat reports as "cache" from the
+      cached objects to the external ones (which map files and not anonymous
+      memory)
+    braunr: http://paste.debian.net/74869/
+    turned out that struct kmem_cache was actually an easy target
+    no bitfields, no embedded structs that were addressed as such
+      (and not aliased)
+    :)
+
+
+## IRC, freenode, #hurd, 2014-01-09
+
+    braunr: i didn't quite get what you and youpi were talking about
+      wrt to the alignment attribute
+    define a type for struct kmem_cache with the alignment attribute
+      ? is that possible ?
+    ah, like it's done for kmem_cpu_pool
+    teythoon: that's it :)
+    note that aligning a struct doesn't change what sizeof returns
+    heh, that save's one a whole lot of trouble indeed
+    you have to align a member inside for that
+    why would it change the size ?
+    imagine an array of such structs
+    ah
+    right
+    but it fits into two cachelines exactly
+    that wouldn't be a problem with an array either
+    so an array of those will still be aligned element-wise
+    yes
+    and it's often used like that, just as i did for the cpu pools
+    but then one is tempted to think the size of each element has
+      changed too
+    and then use that technique for, say, reserving a whole cache line
+      for one variable
+    ah, now i get that remark ;)
+    :)
+
+    braunr: i annotated struct kmem_cache in slab.h with
+      __cacheline_aligned and it did not have the desired effect
+    can you show the diff please ?
+    http://paste.debian.net/75192/
+    i don't know why :/
+    that's how it's done for kmem_cpu_pool
+    i'll try it here
+    wait
+    i made a typo
+    >,<
+    __cachline_aligned
+    bad one
+    uh :)
+    i don't see it
+    ah yes
+    missing e
+    yep, works like a charme :)
+    nice, good to know :)
+    :)
+    given the previous discussion, shall i send it to the list or
+      commit it right away ?
+    i'd say go ahead and commit
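
The alignment approach discussed in the 2014-01-07 and 2014-01-09 logs can be sketched roughly as follows. This is an illustration only, not the gnumach code: `struct demo_cache`, `DEMO_CACHELINE_ALIGNED`, `demo_caches`, `is_cacheline_aligned` and `demo_caches_init` are hypothetical names, and a 64-byte cache line (a `CPU_L1_SHIFT` of 6) is assumed. It uses the GCC attribute spelled `aligned`, attaches it to the struct type, and adds the kind of assertion suggested for `kmem_cache_init`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, i.e. CPU_L1_SHIFT == 6. */
#define CPU_L1_SHIFT 6
#define CPU_L1_SIZE  (1 << CPU_L1_SHIFT)

/* Hypothetical macro modeled on __cacheline_aligned.
 * The GCC attribute name is "aligned", not "align". */
#define DEMO_CACHELINE_ALIGNED __attribute__((aligned(CPU_L1_SIZE)))

/* Hypothetical stand-in for struct kmem_cache; the attribute is
 * attached to the type so every object of it is aligned. */
struct demo_cache {
    size_t obj_size;
    unsigned long nr_objs;
} DEMO_CACHELINE_ALIGNED;

#define DEMO_NR_CACHES 4

/* Statically allocated caches, as in the discussion. */
static struct demo_cache demo_caches[DEMO_NR_CACHES];

/* Returns nonzero if ptr sits on a cache line boundary. */
int is_cacheline_aligned(const void *ptr)
{
    return ((uintptr_t)ptr & (CPU_L1_SIZE - 1)) == 0;
}

/* Initializes every cache and asserts its alignment first,
 * so a misaligned object is caught at startup. */
int demo_caches_init(void)
{
    int i;

    for (i = 0; i < DEMO_NR_CACHES; i++) {
        assert(is_cacheline_aligned(&demo_caches[i]));
        demo_caches[i].obj_size = (size_t)8 << i;
        demo_caches[i].nr_objs = 0;
    }
    return i;
}
```

Calling `demo_caches_init()` once at startup trips the assertion if any element is misaligned, for instance when the attribute name is misspelled and the compiler merely warns about an ignored attribute.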