[[!meta copyright="Copyright © 2013, 2014, 2016, 2018 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]] [[!meta title="Improved System Object Lookups"]] [[!template id=highlight text="""/!\ Obsolete /!\ --- This is no longer valid as a Google Summer of Code project."""]] The Hurd currently uses its ihash library ([[hurd/libihash]]) as a generic container for various objects. While it does its job, it has been reported to suffer from high collision rates. In addition, the "one size fits all" approach contributes to slow things down. One particular use case is looking up an object from a Mach port name, which basically translates to getting the file or socket associated with a file descriptor in traditional Unix systems. It's particular because there are actually two lookups for each object, the first being finding the Mach port from a client port name, which is done in the GNU Mach kernel, and the second being finding the server object from a server port name. The best strategy would probably be to directly associate the address of an object to the receive right of its port, eliminating the need to look up again, but this is quite an intrusive change in the code base. For the time being, optimizing lookups would already be an improvement. The goal of this project is to increase system performance by speeding up object lookups, with a particular focus on name-to-object lookups. Note that there is little room for improvement in the kernel name-to-port lookups because of the various optimizations IPC has received in the past. Looking up server objects from port names could use an algorithm highly tuned for this task, perhaps with better locking (shared/exclusive instead of always mutually exclusive for example). Then, the libihash algorithm could be replaced with a better one, not necessarily a hash based one, to improve all the other users. This task requires proper knowledge of data structure algorithms, taking into account machine properties such as processor caches, as well as the appropriate skills in C and assembly to check the generated code. Being able to perform accurate measurements in a system that lacks modern profiling tools would also be helpful. Possible mentors: Richard Braun # IRC, freenode, #hurd, 2013-09-18 In context of [[!message-id "20130918081345.GA13789@dalaran.sceen.net"]]. braunr: (wrt the gnumach HACK) funny, I was thinking about doind the same for userspace servers, renaming ports to the address of the associated object, saving the need for the hash table... teythoon: see http://darnassus.sceen.net/~hurd-web/community/gsoc/project_ideas/object_lookups/ teythoon: my idea is to allow servers to set a label per port, obtained at mesage recv time because, yes, looking up an object twice is ridiculous you normally still want port names to be close to 0 because it allows some data structure optimizations braunr: yes, I feared that ports should normally be smallish integers and contigious at best braunr: interesting that you say there that libihash suffers from high collision rates I've a theory to why that is, libihash doesn't do any hashing at all there are notes about that in the open_issues section of the wiki but I figured that this is probably ok for port names, as they are small and contigious braunr: That's called protected payload. braunr: The idea is that the kernel appends data to the message in flight. ## IRC, freenode, #hurd, 2013-10-24 and, with some effort, getting rid of the hash table lookup by letting the kernel provide the address of the object (iirc neil knew the proper term for that) teythoon: that is a big interface change how so optimizing libihash and libpthread should already be a good start well how do you intend to add this information ? ok, "big" is overstatement, but still, it's a low level interface change that would probably break a lot of things store a pointer in the port structure in gnumach, make that accessible somehow yes but how ? interesting question indeed my plan for x15 is to make this "label" part of received messages which means you need to change the format of messages that is what i call a big change ok, so we need to provide an update path but once done, the change to hurd will be minimal, patching libports should cover most of that normally yes so this amounts to messing with gnumach and mig and designing a clever way to make the update process safe libihash is known to show high collision rates right, libihash it could use an integer hash function on the keys to distribute them better i think that's already what it tries to do so merely using a better hash algorithm such as murmur should do the job or use another data structure altogether no, it does no hashing of its own on the keys are you sure ? well, it uses only prime numbers as sizes, and computes key % size well that's hashing .. :) but this is not really a good hash yes isn't that what i said ? right ok, I didn't get that ;) also, the sizes start quite small, 3, 7, 19... and each time the hash table is grown, all items will have to be updated which is why we could consider another data structure or, for starters, to thin out that list of sizes my personal preference being radix trees I assume you have an implementation handy? yes cool :D but good hashing is excellent too radix trees have their own issues braunr: http://burtleburtle.net/bob/hash/integer.html i use thomas wang's hashing function in x15 or rather, my own personal c utility library, since x15 doesn't hash anything currently but murmur is better we prefer distribution over hashing performances https://131002.net/siphash/ ## IRC, freenode, #hurd, 2013-11-21 btw, about protected payloads in mach I'm thinking about adding a flag to indicate that mach_msg should return the protected payload pointer instead of the local port field in the message header when would you set it ? i mean, how is it set ? we don't really need the port name, right? and when we do, we look it up in the referenced data structure using a new rpc perhaps ok what do you think? a new rpc on ports themselves, like mach_port_mod_refs i assume ? i think it's a good solution the field is a natural_t, as far as i can see, pointers should fit into it, right? yes the big problem is backward compatibility why? and your solution solves that yes hum natural_t was originally intended to be as large as the machine word but it may no longer stay true i think youpi decided to keep it an int and not a long in his x86_64 branch mach uses a trick for in-kernel port rights where the right is the port address yes, I've seen that but i remember hearing about problems with this strategy in 64-bits or maybe compat problems in mig interfaces i don't remember exactly so youpi looked at how macosx mach deals with the problem well, but so far there is no 64 bit mach, so we do not need to worry about compatibility there, no ? and if i'm right, they forced the ports on 32-bits no you're right we can simply force the field to 64-bits, whatever it contains or change the message format from the beginning to include both the name and the payload then again, why bother ? have a 64-bits specific message format ? well, it's worth discussing, no? maybe i personally don't like the idea we could fix stuff forcing the field to 64-bits should be enough right do you think the idea is worth prototyping ? teythoon: yes braunr: cool :) teythoon: definitely :p actually, doing that can remove a large part (if not all) contention from libports indeed i still think we should work on libihash first [[hurd/libihash]]. converting libihash to murmur2/3 impacts more data structures overall it's also much easier what exactly do you mean by that ? libports uses libihash yes but it's not the only user libihash is known to have high collision rates that should be fixed right, but what do you mean by using murmur2/3 that's a hashing algorithm name using the integer finalizer used by murmur? hm i didn't dig the details and simply assumed it could be used for integer hashing as well the way i see it, murmur can hash arbitrary ammounts if there are better integer hashing algorithms, let's just use that but that is not what we need yes we have a fixed size integer but from what i remember, it's also very efficient for integer hashing ## IRC, freenode, #hurd, 2013-11-22 braunr: /test-pp: msgh_protected_payload is 0x12345678 :) :) but currently I do another ipc_port_translate which is clearly not desireable the msg handling in the kernel is... involved... here is the thing... there are two (kernel) threads involved, the sender and the receiver for the sender, kmsg->ikm_header.msgh_remote_port is a pointer (thanks to ipc_port_translate) to the destination's ipc_port_t that's where the protected_payload is stored but at the receiving thread, the pointer is gone, replaced by a port name so currently I'm doing the lookup there again hum are you sure kmsg is the general structure for all messages ? or is it only for kernel messages ? i don't remember exactly no, for all messages ok I just need to get this pointer across cleanly i thought you wanted to replace that port name in the receiving thread with the payload I do i don't see the problem then well, only the sending thread has the pointer, the receiving thread only has the name i don't see what makes it hard to change it since that's what you want to do the sending thread doesn't have the pointer yes it has well only for in kernel objects and it's an optimization and you shouldn't have to care much about it your work only involves changing how messages are received normally let me push it somewhere, so I can show you the patches ok braunr: http://darnassus.sceen.net/cgit/teythoon/gnumach.git/shortlog/refs/heads/feature-protected-payload-1 teythoon: where should i look at ? the last commit hm see what calls mach_msg_receive the payload flag must be handled before, when the message is actually transferred ipc_kmsg_copyin perhaps well but this is the tricky part i'm not sure which of the sender or receiver code actually performs these translations yes b/c at this point it is not known whether the receiver has specified the MACH_RCV_PROTECTED_PAYLOAD flag or my understanding of the whole process is still somewhat off, which might very well be... it's not something the receiver should set i.e. the flag shouldn't be set at mach_msg time because it's asynchronous it's a flag that should be set near port creation time oh right, I can see how that could work mach_reply_port(); mach_port_set_payload(); mach_msg(); braunr: http://darnassus.sceen.net/cgit/teythoon/gnumach.git/log/refs/heads/feature-protected-payload-2 I think I found the right spot teythoon: looks better indeed :) braunr: yes, thanks for the hint :) teythoon: keep in mind gnumach supports moving receive rights between tasks i don't think it's much of a burden but don't forget :) right, if that happens, the protected payload field should probably be just reset to 0 that preserves the old default behavior teythoon: you shouldn't name the payload "address" though but really "payload" or "label" vm_offset_t isn't the appropriate type either i suggest unsigned long payload braunr: noted what i mean is the payload isn't the userspace structure you want to use it's the value stored in that unsigned long whether it's used as a pointer or an array index or whatever should be at the application discretion yes, I got that concerning vm_offset_t, it's misused a lot, mostly for historical reasons vm_offset_t is actually the ancestor of off_t i.e. an offset inside a *memory object* the size of which may differ from the size of a pointer ok historically, physical and virtual addresses, in addition to memory object sizes, were the same, hence the confusion today we might have 32-bits virtual addresses, 36-bits physical addresses, and 44- to 64-bits file offsets (e.g. PAE with large file support) ## IRC, freenode, #hurd, 2013-11-25 braunr: the object lookup problem is worse than i assumed the lookup is actually done twice... teythoon: isn't that usually the case :) ? inside gnumach ? no once in libports, once in the intrans function which intrans function ? can you point at an example ? right routine file_get_fs_options ( file: file_t; file_t is special mig magic type file_t = mach_port_copy_send_t #ifdef FILE_INTRAN intran: FILE_INTRAN trivfs_begin_using_protid ? for example, yes ugh however, I believe that can be fixed cleanly I revised my gnumach changes it works surprisingly well gnumach is largely clean code i patched libports to use the new falicilty all the fs translators i tested work fine nice tmpfs, ext2fs, nfs, hello* so does exec very nice howevcer, the bootstrap fails a lot more straightforward than i expected i believe proc crashes yes you can use mach_print to manually trace the bootstrap process [[microkernel/mach/gnumach/interface/syscall/mach_print]]. i did that ok it's nice bare knives are :) uh oh, this lookup fix requires some mig changes teythoon: have you built some packages on it ? braunr: some clang test builds nice where is mig getting in the way ? yes, and i compiled lots of stuff any debian package ? let me just push my changes somewhere... no, no deb ok braunr: http://darnassus.sceen.net/cgit/teythoon/gnumach.git/log/refs/heads/feature-protected-payload-3 http://darnassus.sceen.net/cgit/teythoon/hurd.git/log/refs/heads/feature-protected-payload-1 braunr: in particular, http://darnassus.sceen.net/cgit/teythoon/hurd.git/blob/refs/heads/feature-protected-payload-1:/libports/manage-multithread.c#l161 ## IRC, freenode, #hurd, 2013-11-27 btw, my protected payload work is progressing nicely the system actually boots now :) that's great looking forward to seeing it in action I'd love to quickly discuss my mig changes if you've got a minute sure ok first, please look at this http://darnassus.sceen.net/cgit/teythoon/hurd.git/blob/refs/heads/feature-protected-payload-1:/libports/manage-multithread.c#l161 in line 165, the msgh_local_port is restored b/c later some intrans function might use this for the object (re-)lookup yes ok http://darnassus.sceen.net/cgit/teythoon/mig.git/commitdiff/64b7d34f90a41d017a9e1e8179c0533a97012f6f makes sense this makes mig payload aware we'd specify another intrans function that takes a label and returns an object let me remind you said there were 3 lookups actually the mach one the libports one and is the intran one the last, right ? yes so now i optimized away the second one, the one in libports ok so you need intran aware functions to replace that lookup well payload aware intran functions yes ok mostly cast the label, ports_port_ref the object i assume they'd be pretty straightforward yes and easy to add for all existing intran functions most likely the proposed change looks very appropriate :) i'd never thought about intran functions because i didn't want that in my clone ;p they do add a bit of complexity but this upgrade path looks right yes I think so too nothing more to say :) it's so simple i actually don't understand how i could miss it last time i looked i guess i was exhausted heh thanks for the review :) thanks for your work it's been a long time since we had someone spend that much time on the hurd ## IRC, freenode, #hurd, 2013-11-29 I came to believe that there is actually a lot of room for improvement in our rpc path ## IRC, freenode, #hurd, 2013-12-19 teythoon_: how is protected payload branch now ? ready for review ? the kernel and mig patch are patches so pending for approval rathr rather documentation is still missing for those ofc the last parts are the mig mutations iirc err, you lost me i haven't continued to work on the hurd patch series the patch series for gnu mach and mig are feature complete from my point of view i mean the changes needed to remove the third lookup in the mutation functions to do that in hurd, we need a patched mig i was just trying to remember correctly those patches need to be reviewed the hurd patch series is not yet working, but you can see the approach i've taken yes ok the next thing i'd do in this regard is to fix all object lookups so it didn't change from last time i looked no some code, notoriously the control port handling in the *fs libs, uses mach_port_t for the receiver and do the lookup themself. i'd fix that next. ## IRC, freenode, #hurd, 2014-01-20 i've tied up enough loose ends, that i can start working on the protected payload stuff again the next step is fixing the receiver lookups everywhere good :) if everyone uses mig magic for that, the switch will be easy undoing the hack in mach-defpager too ## IRC, freenode, #hurd, 2014-01-24 teythoon: what are you currently working on ? protected payload ? braunr: yes, i'm working with coccinelle to fix all object lookups in the hurd i figured it is easier and cleaner to just fix them instead of converting pointers back to port names for those functions that want port names ## IRC, freenode, #hurd, 2014-02-17 braunr: do you think it's okay to make the 0 payload special ? teythoon: for our needs, sure it's similar to NULL or MACH_PORT_NULL yes maybe i should add a symbolic name for that for consistency but is it wise to let mach_port_set_protected_payload reset the behaviour on a zero payload ? i don't think a symbolic name is needed or maybe as you want what else should it do then ? 00:25 < braunr> but is it wise to let mach_port_set_protected_payload reset the behaviour on a zero payload ? it could return invalid argument instead and the documentation would clearly state 0 is invalid but that would also prevent reverting the mode yes, i consider that not really useful, but i'd be okay with the current behavior but yes, the documentation should make that clear ## IRC, freenode, #hurd, 2014-02-22 braunr: once the pp patch set is in gnumach, i'll make mach-defpager use it it's a good target, as it does not use libports and it's currently abusing the port rename procedure for the lookup, making the rights spill into the splay tree braunr: the wiki mentioned that you once considered to remove the ability to rename ports altogether teythoon: ah right i actually intend to keep it for x15, but only because i want port names to blend as file descriptors right, for dup and friends and the radix tree is a data structure that can cope decently with not too sparsed distributions ## IRC, freenode, #hurd, 2014-02-27 teythoon: ah, just saw the commit that will make our network faster :) network ? eh no, it's about ioctls actually :) i read a bit too quickly one more small step towards fixing all receiver lookups in the hurd... i did not anticipate how much the hurd has to be changed first in order to make use of the protected payloads that was my main reason not to do it actually :/ but you're almost finished with it, aren't you ? not sure tbh i believe the fsys stuff was the largest chunk ## IRC, freenode, #hurd, 2014-03-02 youpi: i cleaned up most of the receiver lookups in hurd by now but there are some tricky cases left 1/ the pager stuff the mig declarations are in gnumach, and do not have the necessary intran declarations that we can mutate 2/ some uses of mach_port_t instead of some_type_t in the hurd rpc definitions (e.g. fsys_forward) for 1/, i'd like to extend the definitions in gnumach i'm not so sure what to do for the second case we could introduce some types for each case or, we do not touch the definitions my protected payload prototype allows us to map payloads back to port names for the functions that want a name i did this by redefining the mach_port_t type for mig that uses the payload to port-name intran function mig allows type redefinitions, but emits a warning message though i consider that a useful feature, it allows one to refine a type ## IRC, freenode, #hurd, 2014-03-04 braunr: i fixed the object lookups in libpager yesterday, a pretty mechanic change teythoon: can't be bad :) amusingly, the resulting packages boot about half way through o_O teythoon: ? it hangs while cleaning left-over files from /tmp ugh, libpager .. yes tricky pager stuff is tricky ? tricky buggy pager stuff is tricky and buggy ^^ let's assume you made things faster, increasing the likelihood of deadlocks .. we had a patch once for a libpager deadlock well, i'm not yet at the point where things might get faster see 901c61a1d25e7c8963e51012760a82730eda1910 the same problem exists elsewhere i think, you might have hit it i'm still just moving the object lookups from the server functions to the mig translation functions hm but yes, i might have influenced the timing, not sure shouldn't cost too much to add some prints iirc, the other potential deadlock is in libpager/pager-attr.c when memory_object_change_attributes is called (which loops back into libpager when the kernel sends data back ) tricky .. i'll try that when i get home aren't you almost done ? not sure tbh :( althouhg libpager would be really great and mach-defpager since this is actually one of the biggest points of contention i'll do that next, and return to libpager later ok for both i needed to change some rpc type definitions in gnu mach skipping lookups in libpager would make it harder to suffer writeback thread storms so i want to make sure that these changes are fine so that i can propose them ok