1 files changed, 728 insertions, 0 deletions
diff --git a/service_solahart_jakarta_selatan__082122541663/rework_gnumach_ipc_spaces.mdwn b/service_solahart_jakarta_selatan__082122541663/rework_gnumach_ipc_spaces.mdwn
new file mode 100644
index 00000000..20ae126d
--- /dev/null
+++ b/service_solahart_jakarta_selatan__082122541663/rework_gnumach_ipc_spaces.mdwn
@@ -0,0 +1,728 @@
+[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_gnumach]]
+
+[[!toc]]
+
+
+# IRC, freenode, #hurd, 2011-05-07
+
+    <braunr> things that are referred to as "system calls" in glibc are
+      actually RPCs to the kernel or other tasks, those RPCs have too lookup
+      port rights
+    <braunr> the main services have tens of thousands of ports, looking up one
+      is slow
+
+There is a [[!FF_project 268]][[!tag bounty]] on this task.
+
+
+# IRC, freenode, #hurd, 2011-04-23
+
+    <braunr> youpi: is there any use of the port renaming facility ?
+    <youpi> I don't know
+    <braunr> at least, did you see such use ?
+    <braunr> i wonder why mach mach_port_insert_right() lets the caller specify
+      the port name
+    <youpi> ../hurd-debian/hurd/serverboot/default_pager.c:	kr =
+      mach_port_rename(	default_pager_self,
+    <braunr> mach_port_rename() is used only once, in the default pager
+    <braunr> so it's not that important
+    <braunr> but mach_port_insert_right() lets userspace task decide the port
+      name value
+    <youpi> just to repeat myself again, I don't know port stuff very much :)
+    <braunr> well you know that a port denotes a right, which denotes a port
+    <youpi> yes, but I don't have any real experience with it
+    <braunr> err
+    <braunr> port name
+    <braunr> the only reason I see is that the caller, say /hurd/exec running a
+      fork()
+    <braunr> hm
+    <braunr> no, i don't even see the reason here
+    <braunr> port names should be allocated by the kernel only, like file
+      descriptors
+    <youpi> you can choose file descriptor values too
+    <braunr> really ?
+    <youpi> with dup2, yes
+    <braunr> oh
+    <braunr> hm
+    <braunr> what's the data structure in current unices to store file
+      descriptors ?
+    <braunr> a hash table ?
+    <youpi> I don't know
+    <braunr> i'll have to look at that
+    <braunr> FYI, i'm asking these questions because i'm thinking of reworking
+      ipc spaces
+    <braunr> i believe the use of splay trees completely destroys performance
+      of tasks with many many port names such as the root file system
+    <youpi> that can be a problem yes
+    <youpi> since there are 3 ports per opened file, and like 3 per thread too
+    <braunr> + the page cache
+    <youpi> with a few thousand opened files and threads, that makes a lot
+    <youpi> by "opened file" I meant page cache actually
+    <braunr> i saw numbers up to 30k
+    <braunr> ok
+    <youpi> on buildds I easily see 100k ports
+    <braunr> for a single task ?
+    <braunr> wow
+    <youpi> yes
+    <youpi> the page cache is 4k files
+    <braunr> so that's definitely worth the try
+    <youpi> so that already makes 12k ports
+    <youpi> and 4k is not so big
+    <braunr> it's limited to 4K ?
+    <youpi> I haven't been able to check where the 100k come from yet
+    <youpi> braunr: yas
+    <braunr> could be leaks :/
+    <youpi> yes
+    <braunr> omg, a hard limit on the page cache ..
+    <youpi> vm/vm_object.c:int		vm_object_cached_max = 4000;	/* may
+      be patched*/
+    <braunr> mach is really old :(
+    <youpi> I've raised it
+    <youpi> before it was 200
+    <youpi> ...
+    <braunr> oO
+    <youpi> I tried to dro pthe limit, but then I was lacking memory
+    <youpi> which I believe have fixed the other day, but I have to test again
+    <braunr> that implementation doesn't know how to deal with memory pressure
+    <youpi> yes
+    <braunr> i saw your recent changes about adding warnings in such cases
+    <braunr> so, back to ipc spaces
+    <braunr> i think splay trees 1/ can get very unbalanced easily
+    <braunr> which isn't hard to imagine
+    <braunr> and 2/ make poor usage of the cpu caches because they're BST and
+      write a lot to memory
+    <youpi> maybe you could write a patch which would dump statistics on that?
+    <braunr> that's part of the job i'm assigning to myself
+    <youpi> ok
+    <braunr> i'd like to try replacing splay trees with radix trees
+    <youpi> I can run it on the buildds
+    <youpi> buildds are very good stress-tests :)
+    <braunr> :)
+    <youpi> 22h building -> 77k ports
+    <youpi> 26h building -> 97k ports
+    <youpi> the problem is that when I add leak debugging (backtraces), I'm
+      getting out of memory :)
+    <braunr> that will be a small summer of code outside the gsoc :p
+    <braunr> :/
+    <braunr> backtraces are very consuming
+    <youpi> but that's only because of hardcoded limits
+    <youpi> I'll have to test again with bigger limits
+    <braunr> again ..
+    <braunr> evil hard limits
+    <youpi> well, actually we could as well just drop them
+    <youpi> but we'd also need to easily get statistics on zone/vm_maps usage
+    <youpi> because else we don't see leaks
+    <youpi> (except that the machine eventually crashes)
+    <braunr> hm
+    <braunr> i haven't explained why i was asking my questions actually
+    <braunr> so, i want radix trees, because they're nice
+    <braunr> they reduce the paths lengths
+    <braunr> they don't get too unbalanced (they're invariant wrt the order of
+      operations)
+    <braunr> they don't need to write to memory on lookups
+    <braunr> the only drawback is that they can create much overhead if their
+      usage pattern isn't appropriate
+    <braunr> elements in such a structure should be close, so that they share
+      common nodes
+    <youpi> the common usage pattern in ext2fs is a big bunch of ever-open
+      ports :)
+    <braunr> if there is one entry per node, it's a big waste
+    <braunr> yes
+    <youpi> there are 3, actually
+    <braunr> but the port names have low values
+    <braunr> they're allocated sequentially, beginning at 0
+    <braunr> (or 1 actually)
+    <braunr> which is perfect for radix trees
+    <youpi> yes
+    <youpi>  97989: send
+    <braunr> but if anyone can rename
+    <braunr> this introduces a new potential weakness
+    <youpi> ah, if it's just a weakness it's probably not a problem
+    <youpi> I thought it was even a no-go
+    <braunr> i think so
+    <youpi> I guess port rename is very seldom
+    <braunr> but in a future version, it would be nice not to allow port
+      renaming
+    <braunr> unless there are similar issues in current unix kernels
+    <braunr> in which case i'd say it's acceptable
+    <youpi> there are
+    <braunr> of that order ?
+    <youpi> and it'd be useful for e.g. processing
+      tracing/debugging/tweaking/whatever
+    <youpi> it's also used to hide fds from a process
+    <braunr> port renaming you mean ?
+    <youpi> you allocate them very high
+    <youpi> yes
+    <braunr> ok
+    <youpi> choosing your port name, generally
+    <youpi> to match what the process expects for instance
+    <braunr> then it would be a matter of resource limiting (which we totally
+      lack afaik)
+    <braunr> along the number of maximum open files, you would have a number of
+      maximum rights
+    <braunr> does that seem fine to you ?
+    <youpi> if done throught rlimits, sure
+    <braunr> something similar yes
+    <youpi> (_no_ PORTS_MAX ;) )
+    <braunr> oh and, in addition, i remember gnumach has a special
+      configuration of the processor in which caching is limited
+    <braunr> like write-through only
+    <youpi> didn't I fix that recently ?
+    <braunr> i don't know :)
+    <braunr> CR0=e001003b
+    <braunr> i don't think it's fixed
+    <youpi> I mean, in the git
+    <braunr> ah
+    <youpi> not in the debian package
+    <braunr> didn't tried the git version yet
+    <braunr> last time i tried (which was a long time ago), it made the kernel
+      crash
+    <braunr> have you figured why ?
+    <youpi> I'm not aware of that
+    <braunr> anyway, splay trees write a lot, and most trees write a lot even
+      at insertion/removal to rebalance
+    <youpi> braunr: Mmm, there's no clearance of CD in the kernel actually
+    <braunr> with radix trees, even if caching can't be fully enabled, it would
+      make much better use of it
+    <braunr> so if port renaming isn't a true issue, i'll choose that data
+      structure
+    <youpi> that'd probably be better yes
+    <youpi> I'm surprised by the CD, I do remember fixing something like this
+      lately
+    <braunr> there are several levels where CD can be set
+    <braunr> the processors ORs all those if i'm right
+    <braunr> to determine if caching is enabled 
+    <youpi> I know
+    <braunr> ok
+    <youpi> but in my memory that was at the CR* level, precisely
+    <braunr> maybe for xen only ?
+    <youpi> no
+    <braunr> well good luck if you hunt that one, i'm off, see you :)
+    <youpi> braunr: ah, no, it was the PGE flag that I had fixed
+
+    <antrik> braunr: explicit port naming is used for example to pass some
+      initial ports to a new task at well-known places IIRC
+    <antrik> braunr: but these tend to be low numbers, so I don't see a problem
+      there
+    <antrik> (I'm not familiar with radix trees... why would high numbers be a
+      problem?)
+
+    <youpi> braunr: iirc the ipc space is limited to ~192k ports
+
+    <braunr> antrik: in most cases i've seen, the insert_right() call is used
+      on task_self()
+    <braunr> and if there really are special ports (like the bootstrap or
+      device ports), they should have special names
+    <braunr> IIRC, these ports are given through command line expansion by the
+      kernel at boot time
+    <braunr> but it seems reasonable to think of port renaming as a potentially
+      useful feature
+    <braunr> antrik: the problem with radix trees isn't them being high, it's
+      them being sparse
+    <braunr> you get the most efficient trees when entries have keys that are
+      close to each other
+    <braunr> because radix trees are a type of tries (the path in the tree is
+      based on the elements composing the key)
+    <braunr> so the more common prefixes you have, the less external nodes you
+      need
+    <braunr> here, keys are port names, but they can be memory addresses or
+      offsets in memory objects (like in the page cache)
+    <braunr> the radix algorithm takes a few bits, say 4 or 6, at a time from a
+      key, and uses that as an index in a node
+    <braunr> if keys are sparse, there can be as little as one entry per node
+    <braunr> IIRC, the worst case (on entry per node with the maximum possible
+      number of nodes for a 32-bits key) is 2% entries
+    <braunr> the reste being null entries and almost-empty nodes containing
+      them
+    <braunr> so if you leave the ability to give port rights the names you
+      want, you can create such worst case trees
+    <braunr> which may consume several MiB of memory per tree
+    <braunr> tens of MiB i'd say
+    <braunr> on the other hand, in the current state, almost all hurd
+      applications use sequentially allocated port names, close to 0 (which
+      allows a nice optimization)
+    <braunr> so a radix ree would be the most efficient
+    <antrik> well, if some processes really feel they must use random numbers
+      for port names, they *ought* to be penalized ;-)
+
+
+# IRC, freenode, #hurd, 2011-04-27
+
+    <braunr> antrik: remember when you asked why high numbers would be a
+      problem with radix trees ?
+    <braunr> here is a radix tree with one entry, which key is around 5000
+    <braunr> [  656.296412] tree height: 3
+    <braunr> [  656.296412] index:  0, level:  0, height:  3, count:  1,
+      bitmap: 0000000000000002
+    <braunr> [  656.296412] index:  1, level:  1, height:  2, count:  1,
+      bitmap: 0000000000004000
+    <braunr> [  656.296412] index: 14, level:  2, height:  1, count:  1,
+      bitmap: 0000000000000080
+    <braunr> three levels, each with an external node (dynamically allocated),
+      for one entry
+    <braunr> so in the worst case of entries with keys close to the highest
+      values, the could be many external nodes with higher paths lengths than
+      when keys are close to 0
+    <braunr> which also brings the problem of port name allocation
+    <braunr> can someone with access to a buildd which has an uptime of at
+      least a few days (and did at least one build) show me the output of
+      portinfo 3 | tail ?
+    <braunr> port names are allocated linearly IIRC, like PIDs, and some parts
+      of the kernel may rely on them not being reused often
+    <braunr> but for maximum effifiency, they should be
+    <braunr> efficiency*
+    <braunr> 00:00 < braunr> can someone with access to a buildd which has an
+      uptime of at least a few days (and did at least one build) show me the
+      output of portinfo 3 | tail ?
+    <braunr> :)
+    <youpi> it's almost like wc -l
+    <youpi>   4905: receive
+    <youpi> vs 4647
+    <youpi> for /
+    <youpi>  52902: receive
+    <youpi> vs 52207
+    <youpi> for the chroot
+    <braunr> even after several builds ?
+    <braunr> and several days ?
+    <youpi> that's after 2 days
+    <youpi> it's not so many builds
+    <youpi> rossini is not so old
+    <youpi> (7h)
+    <youpi> but many builds
+    <youpi> 70927: send
+    <youpi> vs 70938
+    <braunr> ok
+    <braunr> so it seems port names are reused
+    <braunr> good
+    <youpi> yes they are clearly
+    <braunr> i think i remember a comment about why the same port name
+      shouldn't be reused too soon
+    <youpi> well, it could help catching programming errors
+    <braunr> that it helped catch bugs in applications that could
+      deallocate/reallote quickly
+    <braunr> reallocate*
+    <braunr> without carefuly synchronization
+    <braunr> careful
+    <braunr> damn, i'm tired :/
+    <youpi> but that's about debugging
+    <youpi> so we don't care about performance there
+    <braunr> yes
+    <braunr> i'll try to improve allocation performance too
+    <braunr> using e.g. bitmaps in each external node back to the root so that
+      unused slots are quickly found
+    <braunr> i thknk that's what idr does in linux
+    <antrik> braunr: idr?
+    <braunr> antrik: a data structure used to map integers to pointers
+    <braunr> http://fxr.watson.org/fxr/source/lib/idr.c?v=linux-2.6
+
+
+# IRC, freenode, #hurd, 2011-06-08
+
+    <braunr> hm, reverse space/port to name lookups also suck
+    <braunr> having separate types for simple ipc entries and splay tree
+      entries really makes many parts of the ipc code complicated
+    <braunr> and a global hash table for these operations is scary
+
+
+# IRC, freenode, #hurd, 2011-06-09
+
+    <braunr> hm nice, my radix tree code runs inside gnumach, along with the
+      original splay tree code and assertions making sure results are the same
+    <braunr> there is this "collision" thing i'm not sure to understand but
+      once this is solved, replacing the splay trees should be easy
+
+    <braunr> youpi: is there a way to easily know the number of send rights
+      associated to a port ?
+    <youpi> portinfo ?
+    <braunr> portinfo gives information in a space
+    <braunr> but this is specific to a port
+    <braunr> is there an option for that ?
+    <youpi> -v
+    <braunr> hm ok
+    <youpi>     25: send (refs: 550)
+    <braunr> nice
+    <braunr> youpi: if you have time, could you give me the min/max/avg numbers
+      of send rights referring to the same port on buildds ?
+    <braunr> i'm trying to estimate if it's better to have space->list_of_ports
+      or port->list_of_spaces to replace the global ipc hash table
+    <braunr> the latter seems better but there could be unexpected cases on
+      machines using large amounts of resources like the buildds
+    <youpi> max is 64k
+    <youpi> min is 1 of course :)
+    <braunr> 64k
+    <braunr> then it's not what i'm looking for
+    <youpi> avg is 55
+    <braunr> isn't this the number of urefs ?
+    <youpi> I don't know
+    <braunr> hmm
+    <braunr> what i'm looking for is the number of *pure send rights* for the
+      same port
+    <braunr> i don't think portinfo can give it
+    <braunr> there can only be one such send right per task for the same port
+    <braunr> 64k would mean there are 64k tasks
+    <youpi> ok, so it's more difficult
+    <youpi> it means using -t
+    <braunr> ahh
+    <youpi> and run n^2 portinfo over the n processes
+    <braunr> i see
+    <youpi> Mmm, that will however still show any duplicate send right
+    <youpi> but then by min/max/avg, you mean, over time ?
+    <braunr> i'll change the source code, simpler
+    <youpi> e.g. min would be right after boot?
+    <braunr> min is 1
+    <youpi> 1 what ?
+    <braunr> 1 send right to a port
+    <youpi> ah, 1 for a given port
+    <braunr> yes
+    <youpi> ok, it becomes really hairy to compute, I don't hav ethe time :)
+    <braunr> avg and max are more interesting :)
+    <braunr> no worries
+    <youpi> braunr: I wouldn't be surprised that max is the number of tasks
+    <youpi> e.g. for a send port to the proc server for instance
+    <braunr> youpi: it is, but i'm not looking for potential numbers
+    <youpi> I'm not talking about a potential number, but an actual number
+      that's almost always true
+    <braunr> for one port, yes
+    <braunr> but yes, ok for max
+    <braunr> this makes choosing an appropriate data structure difficult
+
+    <antrik> braunr: actually, min number of send rights to a port is 0... but
+      I'm sure you know that already :-)
+
+    <antrik> youpi: normally each client gets a separate port. I'm not sure
+      there are any ports with send rights distributed over many tasks...
+
+    <jkoenig> antrik, what about / ?
+
+    <youpi> antrik: not necessarily
+
+    <antrik> jkoenig: not sure... isn't the "/" port authenticated to the
+      specific user?
+
+    <jkoenig> antrik, I guess so, but a single user could still have many
+      tasks.
+    <jkoenig> (wrt /)
+    <antrik> jkoenig: well, in theory the tasks having exactly the same UIDs
+      and GITs could probably share an auth token... but that's not how things
+      are handled in general
+    <antrik> at least I don't think so
+    <antrik> tasks are authenticated, not users
+    <antrik> err... GIDs :-)
+    <jkoenig> antrik, still, my quick glance to the fork() code seemed to
+      indicate the port is inherited as-is, maybe authentication happens only
+      when something is actually looked up?
+    <jkoenig> hmm "rpctrace ls -d /" does not show any authentication calls,
+      only a lookup("") on the root which returns a different port
+    <jkoenig> so I guess the root port is "deauthenticated" or something when
+      the uid of a process is changed.
+    <antrik> too bad cfhammer isn't around, he digged into all this stuff...
+    <antrik> I know that there is a mechanism which reauths all FDs when the
+      IDs of a process change
+    <antrik> but I'm not sure the "/" port uses this mechanism
+
+    <braunr> antrik: but the radix tree codee is really used as is, which means
+      no locking, no preloading before locking, all of this because simple
+      locks don't exist on UP, and because the kernel isn't preemptible
+
+[[microkernel/mach/gnumach/preemption]].
+
+    <braunr> antrik: and yes, min number is 0, but in that case you don't need
+      (space, port) -> right lookups :)
+    <braunr> antrik: or put in another way, whichever reasonable structure you
+      use, when it's empty, you don't care much
+    <braunr> which also means that the min number has actually no value here
+    <braunr> because the same applies to 1
+
+    <braunr> then what seems to take most time is forks
+    <braunr> and i hope my upcoming kernel changes will help the situation a
+      bit
+    <pinotree> what are your incoming gnumach changes about?
+    <braunr> the ipc translation layer speed
+    <braunr> which basically means operating on port names (looking them up,
+      inserting, removing, renaming, looking up send rights to a specific
+      ports) will be faster
+    <braunr> and i believe forks are (one of) the most demanding use cases wrt
+      ipc space manipulation
+    <braunr> i'm really surprised how badly the splay trees are used
+    <braunr> the worst case for this data structure is traversal
+    <braunr> and this is done in many situations
+    <braunr> leaving the tree in its worst case shape
+    <braunr> and i didn't mentioned the bunch of memory writes occurring, event
+      for things like lookups or traversals
+    <braunr> this is slow and can disrupt many cpu cache lines
+    <braunr> and when there are 10k to 100+k (e.g. in some ext2fs instances on
+      buildds), just imagine the number of operations involved in those
+      operations
+    <braunr> a simple traversal_next involves a rotation *gasp*
+    <braunr> this means potentially writing on 3 different cache lines, for
+      *one* next operation
+    <pinotree> what are you replacing that splay tree with?
+    <braunr> radix trees
+    <braunr> much shorter paths
+    <braunr> extremely few memory writes
+    <braunr> locality of reference when traversing
+    <braunr> good cache usage (as many of the top nodes are reused)
+    <braunr> the two drawbacks are 1/ memory allocation for external nodes,
+      which means the tree must be preloaded before locking
+    <braunr> and 2/ high memory overhead if the keys are sparse
+    <braunr> but this isn't the case with port names, unless someone messes it
+      up by assigning random names to many rights
+
+    <antrik> braunr: so, when will we see the first performance comparision?
+      :-)
+    <braunr> antrik: that's a topic of itself, how to compare ?
+    <braunr> antrik: the thing is, my current gnumach patches only makes
+      assertions
+    <braunr> this is the best way i found to test my tree in real life
+      conditions
+    <braunr> much cleanup is needed
+    <braunr> and what i'd like to do is to completely replace all teh
+      translation layer structures with it
+    <braunr> which means removing much code, making sure it still works as
+      expected
+    <braunr> this is tedious
+    <braunr> so one month at least
+    <antrik> braunr: comparing shouldn't be too hard... the average configure
+      script does a lot of forking, which should be a good benchmark according
+      to your observations
+    <braunr> rough estimates are easy, yes
+    <braunr> but my observations my be wrong :p
+    <antrik> braunr: well, we don't really need precise numbers...
+    <antrik> unless you need to do some kind of fine-tuning?
+    <braunr> i don't know yet
+
+
+# IRC, freenode, #hurd, 2011-06-18
+
+    < braunr> hmm, i'm having a problem with integrating my radix tree code in
+      gnumach
+    < braunr> inserting into such a tree can trigger memory allocation
+    < braunr> so commonly, the tree i loaded with nodes before insertion,
+      usually if it requires strong locking
+    < braunr> ipc spaces are locked using "simple locks" (which are spin locks)
+    < braunr> but spin locks are noops on UP, and gnumach is always UP ..
+    < braunr> so, should i still include preloading code, even if it'll end up
+      dead code ?
+    < antrik> hm... I think we discussed this before; but isn't gnumach
+      supposed to be SMP-capable, minus bugs?...
+    < braunr> it is
+    < braunr> but ofc, if i choose not to include preloading, i'll write
+      #errors so that the day gnumach is built for SMP again, such support will
+      be included
+    < antrik> oh, sorry, I think I misread. what is UP?
+    < braunr> uniprocessor
+    < antrik> well, if it's just bugs forcing the current UP state, I think
+      saying that gnumach is always UP is a stretch...
+    < braunr> sure, but it's a practical consideration
+    < antrik> does the locking complicate stuff? or is it just performance
+      considerations?
+    < braunr> no it's about correctness and completeness
+    < braunr> if you don't preload a tree before locking
+    < braunr> and memory allocation occurs while you're holding a simple lock
+    < braunr> and memory allocation requires the kernel to sleep
+    < braunr> you're screwed
+    < braunr> but i hate the idea of including code that won't be used and
+      which won't be easy to test
+    < braunr> so i'm wondering if it's ok for now to just put this in a TODO
+      comment and write it when the time is right
+    < braunr> or if i should spens the week adding this and tweaking the
+      userspace implementation to "emulate" spin locks
+    < antrik> well, it's tricky situation. on one hand, it seems stupid to
+      include handling for something that presently isn't used, and it's not
+      clear when it will. on the other hand, I'd rather not see additional
+      problems introduced that will make fixing SMP even harder...
+    < braunr> that's why i'm asking here
+    < antrik> of course, you could resolve this question by fixing SMP
+      first... ;-)
+    < braunr> ew
+    < antrik> well, I guess it would be best first to make the code work... and
+      we can still decide about the locking thing before it goes mainline I'd
+      say?
+    < braunr> "make the code work" ?
+    < antrik> I mean make gnumach work with your radix tree code
+    < braunr> without preloading then
+    < antrik> yeah... as a first step... I guess adding it later won't be any
+      harder than adding it right now?
+    < braunr> not much
+    < braunr> testing is what requires time really
+
+
+# IRC, freenode, #hurd, 2011-06-27
+
+    < braunr> ok, here is the radix tree code:
+      http://git.sceen.net/rbraun/libbraunr.git/
+    < braunr> the preloading stuff will be added in the kernel only, as it's
+      really pointless and not easily doable in userspace
+    < youpi> preloading?
+    < braunr> youpi: yes, preloading
+    < braunr> radix trees allocate external nodes
+    < youpi> well, providing a url at some random time of some random day is
+      not a great way  to get eyes on it :)
+    < braunr> and ipc spaces are locked when inserting/allocating names
+    < braunr> we normally don't need preloading in gnumach
+    < braunr> since there is no preemption nor SMP
+
+[[microkernel/mach/gnumach/preemption]].
+
+    < braunr> but in case someone changes that, i'd like the code to be mostly
+      ready
+    < braunr> and correctly handle those ugly simple locks
+    < braunr> youpi: is what i say clear enough or do you need more background
+      on what is done ?
+    < youpi> about preloading?
+    < braunr> yes
+    < youpi> I guess it means allocating nodes in advance?
+    < braunr> yes
+    < youpi> k
+    < braunr> before locking the ipc spaces
+
+
+# IRC, freenode, #hurd, 2011-06-28
+
+    < braunr> antrik: i think i won't write the code for the preloading stuff
+      actually
+    < braunr> antrik: it's not very difficult, but i really hate the idea of
+      not being able to reliably test it
+    < braunr> antrik: and i'd rather concentrate on integrating the radix tree
+      code in gnu mach now
+    < braunr> (i've already removed much code, even some files which weren't
+      actually used before my changes !)
+    < braunr> hmm, i won't be able not to write the preloading code after all
+    < antrik> braunr: not able not to write? how's that?
+    < braunr> antrik: it's actually required
+    < braunr> there are three functions, ipc_entry_get, ipc_entry_alloc, and
+      ipc_entry_grow_table
+    < braunr> ipc_entry_get cannot allocate memory
+    < braunr> if it fails, ipc_entry_grow_table is called, which will allocate
+      memory
+    < braunr> ipc_entry_alloc calls both of them depending on the result of
+      ipc_entry_get
+    < braunr> this is the equivalent of the preloading thing i had in mind
+    < braunr> not a bad thing after all
+    < braunr> the only thing i'm afraid of are the "optimized" version of those
+      ipc functions in te so-called fast paths
+    < braunr> i'm afraid if i don't deal right with those, the kernel may end
+      up using mostly slow paths
+    < braunr> but considering the purpose of those fast paths was merely to
+      avoid the overhead of function calls and some locking functions, it
+      shouldn't be that bad
+    < braunr> this is such a mess eh
+    < antrik> hurray microoptimisations ;-)
+    < braunr> there, the preload functions are done, easy :)
+    < antrik> braunr: seems you spent less time implementing it than pondering
+      whether you should implement it ;-)
+    < braunr> well, i couldn't implement it correctly before knowing what
+      should have been done exactly
+    < braunr> and there are still other problems :/
+    < braunr> and the other problems make me reconsider if this was useful at
+      all eh
+    < braunr> youpi: i'm unable to find where ipc tree entries are released
+      except in ipc_entry_alloc_name(), which could mean they're leaked ...
+    < braunr> youpi: would you have time to take a look ?
+    < youpi> they aren't in ipc_entry_dealloc() ?
+    < braunr> no .....
+    < youpi> it's not so unprobable that they're only freed when the task quits
+    < braunr> i don't see that either
+    < braunr> i only see them released in ipc_entry_alloc_name()
+    < braunr> so maybe they're reused
+    < braunr> but i'm not sure about that when reading the code
+    < braunr> oh wait, yes, they are :/
+    < braunr> my bad
+    < youpi> in the ipc_splay_tree_* fucntions I guess?
+    < braunr> yes
+    < braunr> it's just surprsing to see them allocated outside the tree code
+      only
+    < braunr> but released in both the entry and the splay tree code ...
+
+
+# IRC, freenode, #hurd, 2011-06-29
+
+    < braunr> hmm i missed an important thing :/
+    < braunr> and it's scary
+    < braunr> it looks like the splay tree is mainly used when names are
+      provided
+    < braunr> whereas the entry table is used when names are allocated
+    < braunr> which means the table is the main ipc data structure, even for
+      tasks with lots of rights
+    < braunr> i can make my root ext2fs have more than 10k rights, and i see
+      the ipc table table grow along that number ...
+    < braunr> now thetable has 15k+ entries
+    < braunr> IOW there is no point to put the radix tree code in gnumach :(
+    < antrik> braunr: what do you mean by "provided" and "allocated"?
+    < antrik> and what is that table you are talking about?
+    < braunr> antrik: provided means the user space tasks gives the name of the
+      new right
+    < braunr> antrik: allocated means the kernel generates it
+    < braunr> antrik: the table i'm talking about is is_table in struct
+      ipc_space
+    < braunr>  55  *      Every space has a non-NULL is_table with
+      is_table_size entries.
+    < braunr>  56  *      A space may have a NULL is_tree.  is_tree_small
+      records the
+    < braunr>  57  *      number of entries in the tree that, if the table were
+      to grow
+    < braunr>  58  *      to the next larger size, would move from the tree to
+      the table.
+    < braunr> here is the description which mislead me (in addition of the
+      obscure code)
+    < braunr>  50  *      Spaces hold capabilities for ipc_object_t's (ports
+      and port sets).
+    < braunr>  51  *      Each ipc_entry_t records a capability.  Most
+      capabilities have
+    < braunr>  52  *      small names, and the entries are elements of a table.
+    < braunr>  53  *      Capabilities can have large names, and a splay tree
+      holds
+    < braunr>  54  *      those entries.  The cutoff point between the table
+      and the tree
+    < braunr>  55  *      is adjusted dynamically to minimize memory
+      consumption.
+    < antrik> ah, so the rights with a low name are in a linear table, and only
+      those with "random" high names are stored in the splay tree instead?
+    < antrik> seems a rather complex design... I guess though there isn't much
+      room for further optimisation there :-(
+    < antrik> (well, except for code size optimisation -- which could in fact
+      make a considerable difference...)
+    < braunr> well there are problems with this approach, but most don't
+      concern performance
+    < braunr> when the table gets big (close to the page size or more), it gets
+      remapped when reallocated
+    < braunr> which will incur some penalty because of the tlb
+    < braunr> but it's annoying even for small tables
+    < braunr> the initial table size is 4 entries, and from what i can see,
+      most tables are 128 entries wide when tasks are destroyed
+    < braunr> an obvious simple optimization is to set a larger default size
+    < braunr> the same applies for the dead name tables
+    < braunr> those reallocations are a pain, and they're due to this design
+    < braunr> they can also fail because of fragmentation
+    < braunr> there would be a point to radix trees if they would replace all
+      that, and not just the splay tree
+    < braunr> but this would cause a lot of changes in a lot of places, and in
+      particular the "optimized" fast paths i mentioned yesterday
+    < braunr> we'll see how they perform in x15 :>
+    < braunr> there is a slight noticeable improvement when increasing the
+      initial size of the entry table
+    < antrik> braunr: well, if you use them in a completely different
+      implementation, there will be no way of telling whether they make a
+      difference
+    < antrik> how did you test the improvement?
+    < braunr> antrik: no actually it's completely negligeable
+    < braunr> hm
+    < braunr> is that a valid word ? :)
+    < braunr> negligible
+    < braunr> youpi: did you see my comments about the ipc stuff this earlier
+      today ?
+    < braunr> youpi: well to make things short, when port names are allocated,
+      the right they refer to is allocated from the ipc table
+    < braunr> youpi: the splay tree is only used for user provided names
+    < braunr> youpi: i had tables as large as the number of rights in a space
+      (i could easily reach 20k)
+    < braunr> youpi: whereas the splay trees had at most ~40 entries ..