[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!tag open_issue_hurd]]


# IRC, freenode, #hurd, 2012-07-20

From [[Genode RPC|microkernel/genode/rpc]].

    assuming synchronous ipc is the way to go (it seems so), there is
      still the need for some async ipc (e.g. signalling untrusted
      recipients without risking blocking on them)
    1/ do you agree on that, and 2/ how would this low-overhead async
      ipc be done? (and 3/ are there relevant examples?)
    if you think about this stuff too much you will end up like marcus
      and neal ;-)
    antrik: likely :)
    the truth is that there are various possible designs, all with
      their own tradeoffs, and nobody can really tell which one is
      better
    the only sensible one i found is qnx :/
    but it's still messy
    they have what they call pulses, with a strictly defined format
    so it's actually fine because it guarantees low overhead, and can
      easily be queued
    but i'm not sure about the format
    I must say that Neal's half-sync approach in Viengoos still sounds
      most promising to me. it's actually modelled after the needs of
      a Hurd-like system; and he thought about it a lot...
    damn, i forgot to reread that
    stupid me
    note that you can't come up with a design that allows both
      a) delivering reliably and b) never blocking the sender --
      unless you cache in the kernel, which we don't want
    but I don't think it's really necessary to fulfill both of these
      requirements
    it's up to the receiver to make sure it gets important signals
      right
    caching in the kernel is ok as long as the limit allows the
      receiver to handle its signals
    in the Viengoos approach, the receiver can allocate a number of
      receive buffers; so it's even possible to do some queuing if
      desired
    ah great, limits in the form of resources lent by the receiver
    one thing i really don't like in mach is the behaviour on full
      message queues
    blocking :/
    i bet the libpager deadlock is due to that

[[libpager_deadlock]].

    it simply means async ipc doesn't prevent deadlocks at all
    the sender can set a timeout. blocking only happens when setting
      it to infinite...
    which is commonly the case
    well, if you see places where blocking is done but failing would
      be more appropriate, try changing them, I'd say...
    it's not that easy :/
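The "strictly defined format" of QNX pulses mentioned above is a small,
fixed-size record, and that fixed size is what makes pulses cheap to queue:
the kernel can bound the memory cost per receiver ahead of time.  A sketch of
such a record, loosely modeled on QNX's `struct _pulse`; the exact field
layout here is illustrative, not QNX's definition:

    #include <stdint.h>

    /* A fixed-size, self-contained notification record, in the style
       of a QNX pulse.  Because the size is a small known constant and
       there are no out-of-line payloads or port rights to transfer,
       the kernel can queue a bounded number of these per receiver
       with a hard memory limit.  */
    struct pulse
    {
      uint16_t type;       /* pulse type */
      uint16_t subtype;    /* user-defined subtype */
      int8_t   code;       /* small out-of-band code */
      uint8_t  pad[3];
      uint32_t value;      /* one word of payload */
      int32_t  sender_id;  /* identifies the sending connection */
    };

The send timeout mentioned at the end of the log is a standard `mach_msg`
option.  A minimal sketch of a send that fails instead of blocking forever
when the destination queue is full (error handling reduced to returning the
code; the 100 ms value is an arbitrary choice for illustration):

    #include <mach/mach.h>

    /* Attempt a send, giving up after 100 ms instead of blocking
       indefinitely on a full message queue.  */
    kern_return_t
    try_send (mach_msg_header_t *msg)
    {
      kern_return_t kr;

      kr = mach_msg (msg,
                     MACH_SEND_MSG | MACH_SEND_TIMEOUT,
                     msg->msgh_size,   /* send size */
                     0,                /* no receive */
                     MACH_PORT_NULL,   /* no receive port */
                     100,              /* timeout, in milliseconds */
                     MACH_PORT_NULL);  /* no notify port */

      /* MACH_SEND_TIMED_OUT means the queue stayed full: the caller
         must now choose between retrying, dropping the message, or
         failing upward -- exactly the policy question discussed
         above.  */
      return kr;
    }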
# IRC, freenode, #hurd, 2012-08-18

    what is the deepest design mistake of the Hurd/gnumach?
    lcc: async ipc
    braunr: You mentioned that moving to L4 will create problems. Can
      you name some, please? I thought it was going to be faster on L4
    the problem is that l4 *only* provides sync ipc
    so implementing async communication would require one separate
      thread for each instance of async communication
    But you said that the deepest design mistake of the Hurd is async
      ipc.
    not the hurd, mach
    and the hurd depends on it now
    i said l4 provides *only* sync ipc
    systems require async communication tools
    but they shouldn't be built entirely on top of them
    Hmm, so you mean mach has bad async ipc?
    you can consider mach and l4 as two extremes in os design
    mach *only* has async ipc
    what was viengoos trying to explore?
    * savask is confused
    lcc: half-sync ipc :)
    lcc: i can't tell you more on that, i need to understand it better
      myself before any explanation attempt
    You say that mach's problem is async ipc. And L4's problem is its
      sync ipc. That means there are problems in either of them!
    exactly
    how did apple resolve the issues with mach?
    What is perfect then? A "golden middle"?
    lcc: they have migrating threads, which make most rpc behave as if
      they used sync ipc
    savask: nothing is perfect :p
    braunr: but why is async ipc the problem?
    mcsim: it requires in-kernel buffering
    braunr: Yes, but we can't have problems everywhere o_O
    mcsim: this not only reduces communication performance, but
      creates many resource usage problems
    mcsim: and potential denial of service, which is what we
      experience most of the time when something in the hurd fails
    savask: there are problems we can live with
    braunr: But this could be replaced by a userspace server, couldn't
      it?
    savask: this is what monolithic kernels do
    mcsim: what?
    mcsim: this would be the same, this central buffering server would
      suffer from the same kind of issue
    braunr: async ipc. A special server could hold the buffer.
    But several servers could be created, and the queue could have a
      limit.
    queue limits are a problem
    when a queue limit is reached, you either block (= sync ipc) or
      lose a message
    to keep messaging reliable, mach makes senders block
    the problem is that async ipc is often used to avoid blocking
    so blocking when you don't expect it can create deadlocks
    savask: a good compromise is to use sync ipc most of the time, and
      async ipc for a few special cases, like signals
    this is what okl4 does, if i'm right
    i'm not sure of the details, but like many other projects they
      realized current systems simply need good support for async ipc,
      so they extended l4 or something on top of it to provide it
    it took years of research for very smart people to reach some
      consensus like "sync ipc is better, but async is needed too"
    personally i don't like l4 :/
    really not
    braunr: Anyway, there is some queue for messaging, but at the
      moment if it overflows it panics the kernel. And with a limited
      queue, servers will panic.
    mcsim: it can't overflow
    mach blocks senders
    queuing basically means "block and possibly deadlock" or "lose
      messages and live with it"
    So, deadlocks are still possible?
    of course
    have a look at the libpager debian patch and the discussion around
      it
    it's a perfect example
    braunr: it makes gnu mach slow as hell sometimes, which I guess is
      because all threads (which can be 1000s) wake at the same time
    youpi: you mean are created?
    because they'll have to wake in any case
    i can understand why creating lots of threads is slower, but
      cthreads never destroys kernel threads
    doesn't seem to be a mach problem, rather a cthreads one
    i hope we're able to remove the patch after pthreads are used

[[libpthread]].

    braunr: You state that the hurd can't move to sync ipc, since it
      depends on async ipc. But at the same time async ipc doesn't
      guarantee that a task won't block. So I don't understand why
      limited queues will lead to more deadlocks?
    mcsim: async ipc can block because of queue limits
    mcsim: if you remove the limit, you remove the deadlock problem,
      and replace it with denial of service
    mcsim: i didn't say the hurd can't move to sync ipc
    mcsim: i said it came to depend on async ipc as provided by mach,
      and we would need to change that
    and it's tricky
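To make the queue-limit dilemma above concrete, here is a hypothetical
bounded notification queue (`QUEUE_LIMIT` and all names are invented for
illustration):

    #include <errno.h>

    #define QUEUE_LIMIT 64

    struct queue
    {
      long slots[QUEUE_LIMIT];
      unsigned int head, tail, count;
    };

    /* Policy 1: never block.  On a full queue the message is lost
       unless the caller implements a retry -- the "lose messages and
       live with it" branch.  */
    int
    send_or_fail (struct queue *q, long m)
    {
      if (q->count == QUEUE_LIMIT)
        return -EAGAIN;
      q->slots[q->tail] = m;
      q->tail = (q->tail + 1) % QUEUE_LIMIT;
      q->count++;
      return 0;
    }

    /* Policy 2, Mach's default for the sake of reliability, would
       instead sleep here until count < QUEUE_LIMIT -- which is
       precisely where a sender that did not expect to block can
       become part of a deadlock cycle.  */

Earlier in the log, "one separate thread for each instance of async
communication" refers to the standard way of emulating asynchronous sends on
a kernel that only offers synchronous IPC.  A sketch of that pattern using
POSIX threads, where `sync_send` is a stub standing in for the kernel's
synchronous send primitive and everything else is invented for illustration:

    #include <pthread.h>

    /* Stand-in for the kernel's synchronous IPC send; on an L4-style
       system this may block until the receiver is ready.  */
    static void
    sync_send (long msg)
    {
      (void) msg;
    }

    /* One helper thread per asynchronous channel, with a one-slot
       mailbox: the client deposits a message and returns immediately;
       only the helper ever blocks in the synchronous send.  Note the
       costs: one kernel thread per channel, and an overrun silently
       overwrites the pending message.  */
    struct async_channel
    {
      pthread_mutex_t lock;
      pthread_cond_t cond;
      int pending;
      long msg;
    };

    static void *
    helper_main (void *arg)
    {
      struct async_channel *c = arg;

      for (;;)
        {
          pthread_mutex_lock (&c->lock);
          while (!c->pending)
            pthread_cond_wait (&c->cond, &c->lock);
          long m = c->msg;
          c->pending = 0;
          pthread_mutex_unlock (&c->lock);

          sync_send (m);   /* may block, but only stalls this helper */
        }
      return NULL;
    }

    /* Non-blocking from the caller's point of view.  */
    static void
    async_send (struct async_channel *c, long m)
    {
      pthread_mutex_lock (&c->lock);
      c->msg = m;
      c->pending = 1;
      pthread_cond_signal (&c->cond);
      pthread_mutex_unlock (&c->lock);
    }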
    braunr: no, I really mean are woken. The timeout which gets
      dropped by the patch makes threads wake after some time, to
      realize they should go away. It takes a hell of a long time when
      all these threads wake at the same time (because they got
      created at the same time)
    ahh
    savask: what is perfect regarding IPC is something nobody can
      really answer... there are competing opinions on that matter.
      but we know by now that the Mach model is far from ideal, and
      that the (original) L4 model is also problematic -- at least for
      implementing a UNIX-like system
    personally, if i were to create a system now, i'd use sync ipc for
      almost everything, and implement posix-like signals in the
      kernel
    that's one solution, it's not perfect
    savask: actually the real answer may be "no one knows for now, and
      it still requires work and research"
    so for now, we're using mach
    savask: regarding IPC, the path explored by Viengoos (and briefly
      Coyotos) seems rather promising to me
    savask: and yes, I believe that whatever direction we take, we
      should do so by incrementally reworking Mach rather than jumping
      to a completely new microkernel...
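The closing remarks circle back to the approach praised in the first log:
"limits in the form of resources lent by the receiver".  Below is a
hypothetical sketch of that half-sync idea as described in the discussion --
receiver-donated slots, a sender that never blocks.  This is one reading of
the design, not Viengoos code; all names are invented:

    #include <errno.h>
    #include <stdatomic.h>

    /* The receiver allocates a fixed set of message slots up front,
       so the kernel never buffers on its own account; a sender either
       claims a free slot or fails immediately -- it never blocks.  */

    #define NSLOTS 8

    enum slot_state { FREE, BUSY, READY };

    struct slot
    {
      atomic_int state;
      long payload;               /* one word of message data */
    };

    /* Allocated and registered by the receiver; the memory is charged
       to the receiver, not to the kernel or the sender.  */
    static struct slot slots[NSLOTS];

    /* Sender side: deliver without ever blocking.  */
    int
    deliver (long payload)
    {
      for (int i = 0; i < NSLOTS; i++)
        {
          int expected = FREE;
          if (atomic_compare_exchange_strong (&slots[i].state,
                                              &expected, BUSY))
            {
              slots[i].payload = payload;
              atomic_store (&slots[i].state, READY);
              /* A real kernel would now wake the receiver if it is
                 parked in a synchronous receive.  */
              return 0;
            }
        }
      return -EAGAIN;   /* quota exhausted; the sender is NOT blocked */
    }

    /* Receiver side: drain one pending message, recycling its slot.  */
    int
    collect (long *out)
    {
      for (int i = 0; i < NSLOTS; i++)
        if (atomic_load (&slots[i].state) == READY)
          {
            *out = slots[i].payload;
            atomic_store (&slots[i].state, FREE);
            return 0;
          }
      return -EWOULDBLOCK;
    }

The design choice this illustrates: when `deliver` fails, it is because the
receiver's own quota is exhausted, so the failure penalizes the right party
instead of blocking the sender or consuming anonymous kernel memory.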