-rw-r--r-- | hurd/faq/slash_usr_symlink/discussion.mdwn | 45
-rw-r--r-- | hurd/translator.mdwn | 5
-rw-r--r-- | hurd/translator/devfs.mdwn | 25
-rw-r--r-- | hurd/translator/tmpfs/discussion.mdwn | 93
-rw-r--r-- | microkernel/mach/memory_object/discussion.mdwn | 13
-rw-r--r-- | microkernel/mach/pmap.mdwn | 74
-rw-r--r-- | microkernel/viengoos/documentation.mdwn | 14
-rw-r--r-- | microkernel/viengoos/documentation/irc_2012-02-23.mdwn | 159
-rw-r--r-- | open_issues/boehm_gc.mdwn | 42
-rw-r--r-- | open_issues/bpf.mdwn | 122
-rw-r--r-- | open_issues/dde.mdwn | 350
-rw-r--r-- | open_issues/default_pager.mdwn | 8
-rw-r--r-- | open_issues/glibc_madvise_vs_static_linking.mdwn | 15
-rw-r--r-- | open_issues/gnumach_memory_management.mdwn | 10
-rw-r--r-- | open_issues/linux_as_the_kernel.mdwn | 42
-rw-r--r-- | open_issues/memory_object_model_vs_block-level_cache.mdwn | 273
-rw-r--r-- | open_issues/select.mdwn | 187
-rw-r--r-- | open_issues/trust_the_behavior_of_translators.mdwn | 181
-rw-r--r-- | public_hurd_boxen/xen_handling.mdwn | 9 |
19 files changed, 1642 insertions, 25 deletions
diff --git a/hurd/faq/slash_usr_symlink/discussion.mdwn b/hurd/faq/slash_usr_symlink/discussion.mdwn new file mode 100644 index 00000000..219e14e4 --- /dev/null +++ b/hurd/faq/slash_usr_symlink/discussion.mdwn @@ -0,0 +1,45 @@ +[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_documentation]] + + +# IRC, freenode, #hurd, 2012-02-01 + + <marcusb> I remember the time when we had a /usr symlink. Now fedora 17 + will move / to /usr and have /foo symlinks. :) + <marcusb> braunr: + http://www.freedesktop.org/wiki/Software/systemd/TheCaseForTheUsrMerge + <marcusb> braunr: fedora and others are merging /bin, /sbin and some other + into /usr + <marcusb> braunr: back in 1998 we tried for two years or so to have /usr -> + .. in Debian GNU/Hurd, but eventually we gave up on it, because it broke + some stuff + <gnu_srs> marcusb: Hi, which one is better (in your opinion): / or /usr? + <marcusb> gnu_srs: fedora says that using /usr allows better separation of + distribution files and machine-local files + <braunr> marcusb: won't it break remote /usr ? + <marcusb> so you can atomically mount the OS files to /usr + <marcusb> gnu_srs: but in the end, it's a wash + <marcusb> personally, I think every package should get its own directory + <braunr> marcusb: what PATH then ? + <marcusb> braunr: well, I guess you'd want to assemble a union filesystem + for a POSIX shell + <braunr> marcusb: i don't see what you mean :/ + <braunr> ah this comes from Lennart Poettering + <marcusb> braunr: check out for example how http://nixos.org/ does it + <manuel> braunr: something like, union /package1/bin /package2/bin + /package3/bin for /bin, /package1/lib /package2/lib /package3/lib for + /lib, etc. I guess + <braunr> manuel: would that scale well ? + <marcusb> the idea that there is only one correct binary for each program + with the name foo is noble, but a complete illusion that hides the + complexity of the actual configuration management task + <braunr> marcusb: right diff --git a/hurd/translator.mdwn b/hurd/translator.mdwn index 3527267f..619c0db5 100644 --- a/hurd/translator.mdwn +++ b/hurd/translator.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2007, 2008, 2009, 2010, 2011 Free Software +[[!meta copyright="Copyright © 2007, 2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -75,7 +75,8 @@ available. Read about translator [[short-circuiting]]. The [[concept|concepts]] of translators creates its own problems, too: -[[open_issues/translators_set_up_by_untrusted_users]]. +[[open_issues/translators_set_up_by_untrusted_users]], and +[[trust_the_behavior_of_translators]]. 
# Existing Translators diff --git a/hurd/translator/devfs.mdwn b/hurd/translator/devfs.mdwn index 27df23aa..8784e998 100644 --- a/hurd/translator/devfs.mdwn +++ b/hurd/translator/devfs.mdwn @@ -1,12 +1,12 @@ -[[!meta copyright="Copyright © 2009 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2009, 2012 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license -is included in the section entitled -[[GNU Free Documentation License|/fdl]]."]]"""]] +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] `devfs` is a translator sitting on `/dev` and providing what is to be provided in there in a dynamic fashion -- as compared to static passive translator @@ -18,3 +18,22 @@ settings as they're used now. If applicable, it has to be taken care that all code concerning the page-in path is resident at all times. + +--- + +# IRC, freenode, #hurd, 2012-01-29 + + <pinotree> what would be an hurdish way to achieve something like the + various system (udev, devfs, devd, etc) for populating devices files + automatically according to the found system devices? + <pinotree> (not that i plan anything about that, just curious) + <youpi> it's not really a stupid question at all :) + <youpi> I guess translators in /dev + <youpi> such as a blockfs on /dev/block + <antrik> pinotree: in an ideal world (userspace drivers and all), the + device nodes will be exported by the drivers themselfs; and the drivers + will be launched by the bus respective bus driver + <antrik> an interesting aspect is what to do if we want a traditional flat + /dev directory with unique device names... probably need some + unionfs-like translator that collects the individual driver nodes in an + intelligent manner diff --git a/hurd/translator/tmpfs/discussion.mdwn b/hurd/translator/tmpfs/discussion.mdwn index 0409f046..1d441c7d 100644 --- a/hurd/translator/tmpfs/discussion.mdwn +++ b/hurd/translator/tmpfs/discussion.mdwn @@ -283,3 +283,96 @@ License|/fdl]]."]]"""]] <mcsim> what kind of log do you mean? <antrik> vmstat 1 I mean <mcsim> ah... + + +## IRC, freenode, #hurd, 2012-02-01 + + <mcsim> I run fsx with this command: fsx -N3000 foo/bar -S4 + -l$((1024*1024*8)). And after 70 commands it breaks. + <mcsim> The strangeness is at address 0xc000 there is text, which was + printed in fsx with vfprintf + <mcsim> I've lost log. Wait a bit, while I generate new + <jkoenig_> mcsim, what's fsx / where can I find it ? + <mcsim> fsx is filesystem exersiser + <mcsim> http://codemonkey.org.uk/projects/fsx/ + <jkoenig_> ok thanks + <mcsim> i use it to test tmpfs + <mcsim> here is fsx that compiles on linux: http://paste.debian.net/154390/ + and Makefile for it: http://paste.debian.net/154392/ + <jkoenig_> mcsim, hmm, I get a failure with ext2fs too, is it expected? + <mcsim> yes + <mcsim> i'll show you logs with tmpfs. 
They slightly differ + <mcsim> here: http://paste.debian.net/154399/ + <mcsim> pre last operation is truncate + <mcsim> and last is read + <mcsim> during pre-last (or last) starting from address 0xa000, every + 0x1000 bytes appears text + <mcsim> skipping zero size read + <mcsim> skipping zero size read + <mcsim> truncating to largest ever: 0x705f4b + <mcsim> signal 2 + <mcsim> testcalls = 38 + <mcsim> this text is printed by fsx, by function prt + <mcsim> I've mistaken: this text appears even from every beginning + <mcsim> I know that this text appears exactly at this moment, because I + added check of the whole file after every step. And this error appeared + only after last truncation. + <mcsim> I think that the problem is in defpager (I'm fixing it), but I + don't understand where defpager could get this text + <jkoenig_> wow I get java code and debconf templates + <mcsim> So, my question is: is it possible for defpager to get somehow this + text? + <jkoenig_> possibly recycled, non-zeroed pages? + <mcsim> hmmm... probably you're right + <jkoenig_> 0x1000 bytes is consistent with the page size + <mcsim> Should I clean these pages in tmpfs? + <mcsim> or in defpager? + <mcsim> What is proper way? + <jkoenig_> mcsim, I'd say defpager should do it, to avoid leaking + information, I'm not sure though. + <jkoenig_> maybe tmpfs should also not assume the pages have been blanked + out. + <mcsim> if i do it in both, it could have big influence on performance. + <mcsim> i'll do it only in defpager so far. + <mcsim> jkoenig_: Thank you a lot + <jkoenig_> mcsim, no problem. + + +## IRC, freenode, #hurd, 2012-02-08 + + <tschwinge> mcsim: You pushed another branch with cleaned-up patches? + <mcsim> yes. + <tschwinge> mcsim: Anyway, any data from your report that we could be + interested in? (Though it's not in English.) + <mcsim> It's completely in ukrainian an and mostly describes some aspects + of hurd's work. + <tschwinge> mcsim: OK. So you ran out of time to do the benchmarking, + etc.? + <tschwinge> Comparing tmpfs to ext2fs with RAM backend, etc., I mean. + <mcsim> tschwinge: I made benchmarking and it turned out that tmpfs up to 6 + times faster than ext2fs + <mcsim> tschwinge: is it possible to have a review of work, I've already + done, even if parallel writing doesn't work? + <tschwinge> mcsim: Do you need this for university or just a general review + for inclusion in the Git master branch? + <mcsim> general review + <tschwinge> Will need to find someone who feels competent to do that... + <mcsim> the branch that should be checked is tmpfs-final + <pinotree> cool, i guess you tested also special types of files like + sockets and pipes? (they are used in eg /run, /var/run or similar) + <mcsim> Oh. I accidentally created this branch. It is my private + branch. I'll delete it now and merge everything to mplaneta/tmpfs/master + <mcsim> pinotree: Completely forgot about them :( I'll do it by all means + <pinotree> mcsim: no worries :) + <mcsim> tschwinge: Ready. The right branch is mplaneta/tmpfs/master + + +## IRC, freenode, #hurd, 2012-03-07 + + <pinotree> did you test it with sockets and pipes? + <mcsim> pinotree: pipes work and sockets seems to work too (I've created + new pfinet device for them and pinged it). + <pinotree> try with simple C apps + <mcsim> Anyway all these are just translators, so there shouldn't be any + problems. 
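As context for the fsx failures discussed above (2012-02-01), where jkoenig suspected recycled, non-zeroed pages and the tentative conclusion was that the default pager should clear pages before reuse, here is a minimal sketch of that kind of fix. All names below are invented for illustration; this is not actual defpager or tmpfs code.

    /* Clear a recycled backing page before handing it out again, so stale
       data cannot show up in a freshly truncated or extended file and no
       information leaks between users of the pager.  Hypothetical names;
       illustration only. */
    #include <string.h>

    #define PAGE_SIZE 4096

    struct backing_page
    {
      unsigned char data[PAGE_SIZE];
      int recycled;                     /* non-zero if previously used */
    };

    static void *
    get_clean_page (struct backing_page *page)
    {
      if (page->recycled)
        memset (page->data, 0, PAGE_SIZE);
      page->recycled = 1;
      return page->data;
    }

Doing the clearing in the pager rather than in every filesystem avoids paying the cost twice, which matches the conclusion reached in the log.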
+ <mcsim> pinotree: works diff --git a/microkernel/mach/memory_object/discussion.mdwn b/microkernel/mach/memory_object/discussion.mdwn index a2a1514b..907f859a 100644 --- a/microkernel/mach/memory_object/discussion.mdwn +++ b/microkernel/mach/memory_object/discussion.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,7 +10,10 @@ License|/fdl]]."]]"""]] [[!tag open_issue_documentation open_issue_gnumach]] -IRC, freenode, #hurd, 2011-08-05: +[[!toc]] + + +# IRC, freenode, #hurd, 2011-08-05 < neal> braunr: For instance, memory objects are great as they allow you to specify the mapping policy in user space. @@ -23,7 +26,8 @@ IRC, freenode, #hurd, 2011-08-05: < braunr> the kernel eviction policy :) < neal> that's an implementation detail -IRC, freenode, #hurd, 2011-09-05: + +# IRC, freenode, #hurd, 2011-09-05 <braunr> mach isn't a true modern microkernel, it handles a lot of resources, such as high level virtual memory and cpu time @@ -65,3 +69,6 @@ IRC, freenode, #hurd, 2011-09-05: pages are going to be flushed by themselves [[open_issues/resource_management_problems]]. + + +# [[open_issues/memory_object_model_vs_block-level_cache]] diff --git a/microkernel/mach/pmap.mdwn b/microkernel/mach/pmap.mdwn new file mode 100644 index 00000000..6910bfd3 --- /dev/null +++ b/microkernel/mach/pmap.mdwn @@ -0,0 +1,74 @@ +[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_documentation open_issue_gnumach]] + + +# IRC, freenode, #hurd, 2012-02-01 + + <sekon> on Hurd what is the difference between kernel memory object and + pmap module ?? + <sekon> pmap is heap/libraries table for each thread while kernel memory + object refers to arbitary blobs of data ?? + <braunr> sekon: pmap is the low level memory mapping module + <braunr> i.e. it programs the mmu + <braunr> and these aren't hurd-specific, they are mach modules + <sekon> braunr: so kernel memonry objects consists of a bunch of pmaps ?? + <braunr> sekon: memory objects can be various things, be specific please + <braunr> (they're certainly not a bunch of pmaps though, no) + <braunr> there is one pmap per vm_map, and there is one vm_map per task + <braunr> and there is no need for double question marks, is ther ?? + <sekon> lol then is kernel memory object , please excuse the metaphor + something like a base class for pmap + <braunr> i don't know what a "kernel memory object" is, be specific please, + again + <sekon> braunr: + http://courses.cs.vt.edu/~cs5204/fall05-gback/presentations/MachOS_Rajesh.ppt + <sekon> goto page titled External Memory Management (EMM) on page 15 + <sekon> Kernel memory object shows up + <braunr> you know there are other formats for this document + <sekon> nope .. 
i did not know that + <sekon> in page 17 pmamp shows up + <braunr> "the problems of external memory management" ? + <sekon> braunr: the paper i am also reading is called x15mach_thesis + <braunr> ah, that's mine + * sekon bows + <sekon> :) + <braunr> ok i see page 17 + <sekon> so please good sir explain the relationship between kernel memory + object and pmap + <sekon> (if any) + <sekon> braunr: there is no mention of kernel memory object + <braunr> again, i don't see any reference or definition of "kernel memory + object" + <sekon> but your paper says + <sekon> that when page faults occur + <sekon> the kernel contact the manager for a kernel reference object + <sekon> *memory + <braunr> where ? + <sekon> in section 2.1.3 (unless i read it wrong) + <sekon> no just a sec + <sekon> 2.1.5 + <braunr> i never used the expression "kernel memory object" there :p + <braunr> anyway, you're referring simple to memory objects as seen by + userspace pagers + <braunr> a memory object is a data container + <braunr> usually, it's a file + <braunr> but it can be anything + <braunr> the pager is the task that provides its content and implements the + object methods + <braunr> as for the relation between them and the pmap module, it's a + distant one + <braunr> i'll explain it with an example + <braunr> page fault -> request content of memory object at a given offset + with given length from pager -> ask pmap to establish the mapping in the + mmu + <sekon> braunr: thank you ver much + <sekon> *very diff --git a/microkernel/viengoos/documentation.mdwn b/microkernel/viengoos/documentation.mdwn index 52ff7a48..edcc79a7 100644 --- a/microkernel/viengoos/documentation.mdwn +++ b/microkernel/viengoos/documentation.mdwn @@ -1,12 +1,12 @@ -[[!meta copyright="Copyright © 2008 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2008, 2012 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license -is included in the section entitled -[[GNU Free Documentation License|/fdl]]."]]"""]] +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] The most up-to-date documentation is in the source code itself, see in particular the header files in the hurd directory. @@ -17,7 +17,8 @@ version of that is available [[here|reference-guide.pdf]]. It is not, however, automatically regenerated, and thus may not be up to date. -Academic Papers: + +# Academic Papers * [Viengoos: A Framework for Stakeholder-Directed Resource Allocation](http://walfield.org/papers/2009-walfield-viengoos-a-framework-for-stakeholder-directed-resource-allocation.pdf). @@ -54,3 +55,8 @@ Academic Papers: argue that only a small static number of scheduling policies are needed in practice and advocate hierarchical policy specification and central realization. 
+ + +# Miscellaneous + + * [[IRC_2012-02-23]] diff --git a/microkernel/viengoos/documentation/irc_2012-02-23.mdwn b/microkernel/viengoos/documentation/irc_2012-02-23.mdwn new file mode 100644 index 00000000..a3229be9 --- /dev/null +++ b/microkernel/viengoos/documentation/irc_2012-02-23.mdwn @@ -0,0 +1,159 @@ +[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!meta title="IRC, freenode, #hurd, 2012-02-23"]] + +[[!tag open_issue_documentation open_issue_viengoos]] + + <braunr> neal: i've read a bit about current modern microkernel based + systems, and i'm wondering + <braunr> neal: can a capability be used for both request and replies, or + does messaging need something similar to reply ports ? + <neal> braunr: you want a reply port + <neal> think about a file server: + <neal> the file server publishes a capability to access something + <neal> and multiple entities use it + <neal> if you wanted just bidirectional caps + <braunr> that's the idea i had in mind, i just wondered if it was actually + still the case in practice + <neal> you'd need to create a new capability every time you delegated the + cap + <braunr> yes + <braunr> thanks + <braunr> what about send once rights ? + <neal> also, if you send on a cap and then start waiting on it you could + get your own reply :) + <neal> you can get around send-once rights by using a counter + <braunr> no i mean, is their behaviour still needed/useful ? + <neal> the counter is kernel implemented + <neal> yes + <neal> as an optimization + <braunr> so they're just a special case of capability + <neal> yes + <braunr> not a special capability type of their own + <neal> but they eliminate the constant create/destroy sequence + <braunr> (even if it was already the case at the implementation level in + mach, they were named separately which could confuse people) + <braunr> hm + <braunr> actually, send once rights were used for important notifications + such as dead port notifications + <braunr> is this still handled at the kernel level in modern ukernels ? + <neal> in viengoos, this is called the version field + <neal> see chapter 2 + <neal> + http://www.gnu.org/software/hurd/microkernel/viengoos/documentation/reference-guide.pdf + <braunr> neal: btw, congratulations for viengoos, it really is a very + interesting project: ) + <neal> thanks + <braunr> i don't see the point of rewriting a mach clone after reading + about it eh + <neal> I would definately do the messenger concept again + <neal> but I'd not do persistence + <braunr> i don't fully understand how messengers deal with blocking + <neal> did you read chapter 4? 
+ <braunr> i read all of it but didn't understand everything :) + <braunr> it's quite abstract and i didn't make time to read some of the + source code + <neal> If you have specific questions, I can try to help + <braunr> i'll read those chapter again and formulate my questions after + <neal> I may have to read them as well :) + <braunr> i don't understand how you manage to separate IPC from threading + actually + <braunr> are messengers queues ? + <neal> messengers are super-buffers + <neal> they contain a reference to a thread object + <neal> to send a message, I use a messenger + <neal> I put the data in a buffer + <neal> and then I attach the messenger to the target messenger + <antrik> braunr: my stance is that we should try to incorporate the ideas + from Viengoos into Mach in an evolutionary process... + <neal> this causes an activation to be sent to the target messenger's + thread object + <braunr> neal: which activation ? + <neal> an activation is like a CPU interrupt + <braunr> neal: is it "allocated" at that moment, or taken from the sending + thread ? + <braunr> (i'm not sure my question really makes sense to you :/) + <antrik> braunr: not sure what you are asking exactly; but the basic idea + is that the receiving process preallocates message buffers + <braunr> antrik: maybe, i'm not sure + <antrik> when someone sends a message, it's stored in one of these buffers, + and the process gets a scheduler activation, so it can decide what to do + with it + <neal> antrik is right + <neal> the traget messenger designates a memory buffer + <braunr> i'm wondering about the details of this activation + <braunr> is it similar to thread migration ? + <neal> just before the activation, the data is copied to the messenger's + buffer + <neal> now someone needs to be notified + <neal> (that a message arrived) + <neal> that someone is the thread designated in the target messenger's + thread field + <neal> this is done by an activation + <neal> an activation is just an upcall + <neal> a thread is forced to a particular IP + <neal> an activation isn't a "what" it's a "how" + <neal> I never understood thread migration + <neal> as it's not really about threads + <neal> nor it is about migration + <antrik> neal: what happens if another message comes in before the + activation handling tread is done with the previous one?... + <neal> the messenger is enqueued on the thread object + <neal> it is delivered when the thread is in normal mode + <neal> part of delivering an activation is putting the thread is activation + mode + <neal> when in activation mode, it can't receive any activations + <braunr> i see + <braunr> but then, when a thread receives an activation, does it handle + several queued messengers at once (not to loose events/messages) ? + <neal> (unless it does a blocking receive on a particular messenger, which + is necessary to support memory allocation in activated mode) + <neal> it handles one at a time + <braunr> ah right + <neal> it can't lose events + <braunr> activations are sent per messengers/events + <neal> well, it can + <neal> but it is possible to prevent this + <braunr> neal: also, is message passing completely atomic ? 
+ <neal> I'm not sure what you mean + <neal> which part + <braunr> well, all parts of a message :) + <braunr> in mach, a message can contain several parts + <braunr> data, rights, passing one of them may fail + <braunr> only the header is atomically processed + <neal> it's not atomic in the sense that a thread can observe the data copy + <braunr> that's not what i meant + <braunr> is a message completely transferred or not at all in case of + failure ? + <neal> it may be partially transferred + <braunr> or can it be partially transferred + <braunr> ok + <neal> for instance, if the target thread doesn't provide a memory buffer + <neal> then the data can't be copied + <neal> I don't recall off hand how I dealt with bad addresses + <neal> may be it is not possible + <neal> I don't remember + <neal> sorry + <braunr> but if i read the message structure correctly, there can be one + data block, and several capability addresses in a single message, right ? + <neal> yes + <braunr> ok + <braunr> have you considered passing only one object (either data or + capability) per message ? + <braunr> or is it too inefficient ? + <neal> you at least need a reply port + <neal> s/port/messenger/ + <braunr> yes but can't it be passed separately ? + <neal> then you have server state + <neal> ik + <braunr> hm yes + <braunr> thanks for your answers: ) + <neal> no problem diff --git a/open_issues/boehm_gc.mdwn b/open_issues/boehm_gc.mdwn index 19bd1b21..e7f849f2 100644 --- a/open_issues/boehm_gc.mdwn +++ b/open_issues/boehm_gc.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -281,3 +281,43 @@ It has last been run and compared on 2010-11-10, based on CVS HEAD sources from Git branches (2010-12-15: last change 2009-09). * <http://www.hpl.hp.com/personal/Hans_Boehm/gc/#users> + + +## IRC, OFTC, #debian-hurd, 2012-02-05 + +[[!tag open_issue_porting]] + + <pinotree> youpi: i think i found out the possible cause of the ecl and + mono issuess + <pinotree> -s + <youpi> oh + <pinotree> basically, we don't have the realtime signals (so no + SIGRTMIN/SIGRTMAX defined), hence things use either SIGUSR1 or + SIGUSR2... which are used in libgc to resp. stop/resume threads when + "collecting" + <pinotree> i just patched ecl to use SIGINFO instead of SIGUSR1 (used when + no SIGRTMIN+2 is available), and it seems going on for a while + <youpi> uh, why would SIGINFO work better than SIGUSR1? + <pinotree> it was a test, i tried the first "not common" signal i saw + <pinotree> my test was, use any signal different than USR1/2 + <youpi> ah, sorry, I hadn't understood + <youpi> you mean there's a conflict between ecl and mono using SIGUSR1, as + well as libgc? 
+ <pinotree> yes + <pinotree> for example, in ecl sources see src/c/unixint.d, + install_process_interrupt_handler() + <youpi> SIGINFO seems a sane choice + <youpi> SIGPWR could have been a better choice if it was available :) + <pinotree> i would have chose an "unassigned" number, say SIGLOST (the + bigger one) + 10, but it would be greater than _NSIG (and thus discarded) + <youpi> not a good idea indeed + <pinotree> it seems that linux, beside the range for rt signals, has some + "free space" + <pinotree> i'll start now another ecl build, from scratch this time, with + s/SIGUSR1/SIGINFO/ (making sure ctags won't bother), and if it works i'll + update svante's bug + + <pinotree> mmap(...PROT_NONE...) failed + <pinotree> hmm... + <pinotree> apparently enabling MMAP_ANON in mono's libgc copy was a good + step, let's see diff --git a/open_issues/bpf.mdwn b/open_issues/bpf.mdwn index 98b50430..2a8c897a 100644 --- a/open_issues/bpf.mdwn +++ b/open_issues/bpf.mdwn @@ -440,3 +440,125 @@ This is a collection of resources concerning *Berkeley Packet Filter*s. <braunr> hm, there is a "snoop" source type, using raw sockets <braunr> too far from the packet source, but i'll try it anyway <braunr> hm wrong, snoop was the solaris packet filter fyi + + +## IRC, freenode, #hurd, 2012-01-28 + + <braunr> nice, i have tcpdump working :) + <braunr> let's see if it's as simple with wireshark + <pinotree> \o/ + <braunr> pinotree: it was actually very simple + <pinotree> heh, POV ;) + <braunr> yep, wireshark works too + <braunr> promiscuous mode is harder to test :/ + <braunr> but that's a start + + +## IRC, freenode, #hurd, 2012-01-30 + + <braunr> ok so next step: get tcpreplay working + <antrik> braunr: BTW, when you checked the status of the kernel BPF code, + did you take zhengda's enhancements/fixes into account?... + <braunr> no + <braunr> when did i check it ? + <antrik> braunr: well, you said the kernel BPF code has serious + shortcomings. did you take zhengda's changes into account? + <braunr> antrik: ah, when i mention the issues, i considered the userspace + translator only + <braunr> antrik: and stuff like non blocking io, exporting a selectable + file descriptor + <braunr> antrik: deb http://ftp.sceen.net/debian-hurd experimental/ + <braunr> antrik: this is my easy to use repository with a patched + libpcap0.8 + <braunr> and a small and unoptimized pcap-hurd.c module + <braunr> it doesn't use devopen yet + <braunr> i thought it would be better to have packet filtering working + first as a debian patch, then get the new translator+final patch upstream + <jkoenig> braunr, tcpdump works great here (awesome!). I'm probably using + exactly the same setup and "hardware" as you do, though :-P + + +## IRC, freenode, #hurd, 2012-01-31 + + <braunr> antrik: i tend to think we need a bpf translator, or anything + between the kernel and libpcap to provide selectable file descriptors + <braunr> jkoenig: do you happen to know how mach_msg (as called in a + hello.c file without special macros or options) deals with signals ? + <braunr> i mean, is it wrapped by the libc in a version that sets errno ? + <jkoenig> braunr: no idea. + <pinotree> braunr: what's up with it? (not that i have an idea about your + actual question, just curious) + <braunr> pinotree: i'm improving signal handling in my pcap-hurd module + <braunr> i guess checking for MACH_RCV_INTERRUPTED will dio + <braunr> -INFO is correctly handled :) + <braunr> ok new patch seems fine + <antrik> braunr: selectable file descriptors? 
+ <braunr> antrik: see pcap_fileno() for example + <braunr> it returns a file descriptor matching the underlying object + (usually a socket) that can be multiplexed in a select/poll call + <braunr> obviously a mach port alone can't do the job + <braunr> i've upgraded the libpcap0.8 package with improved signal handling + for tests + <antrik> braunr: no idea what you are talking about :-( + + +## IRC, freenode, #hurd, 2012-02-01 + + <braunr> antrik: you do know about select/poll + <braunr> antrik: you know they work with multiple selectable/pollable file + descriptors + <braunr> on most unix systems, packet capture sources are socket + descriptors + <braunr> they're selectable/pollable + <antrik> braunr: what are packet capture sources? + <braunr> antrik: objects that provide applications with packets :) + <braunr> antrik: a PF_PACKET socket on Linux for example, or a Mach device, + or a BPF file descriptor on BSD + <antrik> for a single network device? or all of them? + <antrik> AIUI the userspace BPF implementation in libpcap opens this + device, waits for packets, and if any arrive, decides depending on the + rules whether to pass them to the main program? + <braunr> antrik: that's it, but it's not the point + <braunr> antrik: the point is that, if programs need to include packet + sources in select/poll calls, they need file descriptors + <braunr> without a translator, i can't provide that + <braunr> so we either decide to stick with the libpcap patch only, and keep + this limitation, or we write a translator that enables this feature + <pinotree> braunr: are the two options exclusive? + <braunr> pinotree: unless we implement a complete bpf translator like i did + years ago, we'll need a patch in libpcap + <braunr> pinotree: the problem with my early translator implementation is + that it's buggy :( + <braunr> pinotree: and it's also slower, as packets are small enough to be + passed through raw copies + <antrik> braunr: I'm not sure what you mean when talking about "programs + including packet sources". programs only interact with packet sources + through libpcap, right? + <antrik> braunr: or are you saying that programs somehow include file + descriptors for packet sources (how do they obtain them?) in their main + loop, and explicitly pass control to libpcap once something arrives on + the respecitive descriptors? + <braunr> antrik: that's the idea, yes + <antrik> braunr: what is the idea? + <braunr> 20:38 < antrik> braunr: or are you saying that programs somehow + include file descriptors for packet sources (how do they obtain them?) in + their main loop, and explicitly pass control to libpcap once something + arrives on the respecitive descriptors? + <antrik> braunr: you didn't answer my question though :-) + <antrik> braunr: how do programs obtain these FDs? + <braunr> antrik: using pcap_fileno() for example + + +## IRC, freenode, #hurd, 2012-02-02 + + <antrik> braunr: oh right, you already mentioned that one... + <antrik> braunr: so you want some entity that exposes the device as + something more POSIXy, so it can be used in standard FS calls, unlike the + Mach devices used for pfinet + <antrik> this is probably a good sentiment in general... but I'm not in + favour of a special solution only for BPF. 
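The pcap_fileno() point braunr makes above, and which the exchange keeps returning to below, is easiest to see in ordinary libpcap client code: on systems where the capture source is a socket or BPF descriptor, applications drop it into a select() loop next to their other descriptors, which is exactly what a bare Mach device port cannot offer without a translator in between. A minimal sketch of such a client follows; the standard libpcap calls are real, while "eth0" and the filter-free setup are just placeholders.

    /* Wait for packets with select() on the descriptor returned by
       pcap_fileno(), alongside whatever other descriptors the program
       multiplexes.  Plain libpcap usage, shown only to illustrate why a
       selectable descriptor matters. */
    #include <pcap/pcap.h>
    #include <sys/select.h>
    #include <stdio.h>

    static void
    handler (unsigned char *user, const struct pcap_pkthdr *h,
             const unsigned char *bytes)
    {
      printf ("captured %u bytes\n", h->caplen);
    }

    int
    main (void)
    {
      char errbuf[PCAP_ERRBUF_SIZE];
      pcap_t *p = pcap_open_live ("eth0", 65535, 0, 1000, errbuf);
      if (p == NULL)
        return 1;

      int fd = pcap_fileno (p);         /* the selectable capture source */
      for (;;)
        {
          fd_set rfds;
          FD_ZERO (&rfds);
          FD_SET (fd, &rfds);
          if (select (fd + 1, &rfds, NULL, NULL, NULL) > 0
              && FD_ISSET (fd, &rfds))
            pcap_dispatch (p, -1, handler, NULL);
        }
    }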
rather I'd take this as an + indication that we probably should expose network interfaces as something + file-like in general after all, and adapt pfinet, eth-multiplexer, and + DDE accordingly + <braunr> antrik: i agree + <braunr> antrik: eth-multiplexer would be the right place diff --git a/open_issues/dde.mdwn b/open_issues/dde.mdwn index e2cff94f..adb070cd 100644 --- a/open_issues/dde.mdwn +++ b/open_issues/dde.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -13,3 +14,350 @@ License|/fdl]]."]]"""]] [[General Information|/dde]]. Still waiting for interface finalization and proper integration. + +[[!toc]] + + +# Upstream Status + + +## IRC, freenode, #hurd, 2012-02-08 + +At the microkernel davroom at [[community/meetings/FOSDEM_2012]]: + + <antrik> there was quite some talk about DDE. I learnt that there are newer + versions in Genode and in Minix (as opposed to the DROPS one we are + using) + <antrik> but apparently none of the guys involved is interested in creating + a proper upstream project with central repository and communication + channels :-( + <antrik> the original DDE creator was also there, but said he isn't working + on it anymore + <tschwinge> OK, and the other two projects basically have their own forks. + <tschwinge> Or are they actively cooperating? + <tschwinge> (If you know about it.) + <antrik> well, Genode is also part of the Dresden L4 group; but apart from + that, I'd rather call it a fork... + <antrik> hm... actually, I'm not sure anymore whether the guy I talked to + was from Genode or Nova... + <antrik> (both from the Dresdem L4 group) + + +## IRC, freenode, #hurd, 2012-02-19 + + <youpi> antrik: do we know exactly which DDE version Zheng Da took as a + base ? + <youpi> (so as to be able to merge new changes easily) + <antrik> youpi: not sure... but from what I gathered at FOSDEM, the version + he based on (from DROPS) is not really actively developed right now; if + we want to go for newer versions, we probably have to look at other + projects (like Genode or Nova or Minix) + <youpi> there's no central project for dde ? + <youpi> that sucks + <antrik> no... and nobody seemed interested in having one :-( + <youpi> pff + <antrik> which makes me seriously question the original decision to build + on DDE... + <braunr> .. + <antrik> if we have to basically maintain it ourselfs anyways, we could + just as well have gone with custom glue + <youpi> well, the advantage of DDE is that it already exists now + <antrik> on the positive side, one of the projcets (not sure which) + apparently have both USB and SATA working with some variant of DDE + + +# IRC, OFTC, #debian-hurd, 2012-02-15 + + <pinotree> i have no idea how the dde system works + <youpi> gnumach patch to provide access to physical memory and interrupts + <youpi> then userland accesses i/o ports by hand to drive things + <youpi> but that assumes that no kernel driver is interfering + <youpi> so one has to disable kernel drivers + <pinotree> how are dde drivers used? can they be loaded on their own + automatically, or you have to settrans yourself to setup a device? 
+ <youpi> there's no autoloader for now + <youpi> we'd need a bus arbitrer that'd do autoprobing + <pinotree> i see + <pinotree> (you see i'm not really that low level, so pardon the flood of + posssibly-noobish questions ;) ) + <youpi> I haven't set it up yet, but IIRC you need to specify which driver + to be used + <youpi> well, I mostly have the same questions actually :) + <youpi> I just have some guesswork here :) + <pinotree> i wonder whether the following could be feasible: + <youpi> I'm wondering how we'll manage to make it work in d-i + <pinotree> a) you create a package which would b-d on linux-source, build a + selection of (network only for now) drivers and install them in + /hurd/dde/ + <youpi> probably through a choice at the boot menu + <youpi> I wouldn't dare depending on linux-source + <youpi> dde is usually not up-to-date + <pinotree> b) add a small utility over the actual fsys_settrans() which + would pick the driver from /hurd/dde/ + <pinotree> ... so you could do `set-dde-driver b43 <device>` (or something + like that) + <youpi> we can provide something like b) yes + <youpi> although documenting the settrans should be fine enough ;) + <pinotree> the a) would help/ease with the fact that you need to compile on + your own the drivers + <pinotree> otherwise we would need to create a new linux-dde-sources-X.Y.Z + only with the sources of the drivers we want from linux X.Y.Z + <pinotree> (or hurd-dde-linux-X.Y.Z) + <CIA-4> samuel.thibault * raccdec3 gnumach/debian/ (changelog + patches/70_dde.patch patches/series): + <CIA-4> Add DDE experimental support + <CIA-4> * debian/patches/70_dde.patch: Add experimental support for irq + passing and + <CIA-4> physical memory allocation for DDE. Also adds nonetdev boot + parameter to + <CIA-4> disable network device drivers. + <youpi> ok, boots fine with the additional nonetdev option + <youpi> now I need to try that dde hurd branch :) + <CIA-4> samuel.thibault * rf8b9426 gnumach/debian/patches/70_dde.patch: Add + experimental.defs to gnuamch-dev + + +# IRC, freenode, #hurd, 2012-02-19 + + * youpi got dde almost working + <youpi> it's able to send packets, but apparently not receive them + <youpi> (e1000) + <youpi> ok, rtl8139 works + <youpi> antrik: the wiki instructions are correct + <youpi> with e1000 I haven't investigated + <antrik> (Archhurd guys also reported problems with e1000 IIRC... the one I + built a while back works fine though on my T40p with real e1000 NIC) + <antrik> maybe I should try with current versions... something might got + broken by later changes :-( + <youpi> at least testing could tell the changeset which breaks it + <youpi> Mmm, it's very odd + <youpi> with the debian package, pfinet's call to device_set_filter returns + D_INVALID_OPERATION + <youpi> and indeed devnode.c returns that + <youpi> ah but it's libmachdev which is supposed to answer here + <antrik> youpi: so, regarding the failing device_set_filter... I guess you + are using some wrong combination of gnumach and pfinet + <youpi> no it's actually that my pfinet was not using bpf + <youpi> I've now fixed it + <antrik> the DDE drivers rely on zhengda's modified pfinet, which uses + devnode, but also switched to using proper BPF filters. so you also need + his BPF additions/fixes in gnumach + <antrik> OK + <youpi> that's the latter + <youpi> I had already fixed the devnode part + <youpi> but hadn't seen that the filter was different + <antrik> err... did I say gnumach? 
that of course doesn't come into play + here + <antrik> so yes, you just need a pfinet using BPF + <youpi> libmachdev does ;) + <antrik> I'm just using pfinet from zhengda's DDE branch... I think devnode + and BPF are the only modifications + <youpi> there's also a libpcap modification in the incubator + <youpi> probably for tcpdump etc. + <antrik> libpcap is used by the modified pfinet to compile the filter rule + <youpi> why does pfinet need to compile the rule ? + <youpi> it's libbpf which is used in the dde driver + <antrik> it doesn't strictly need to... but I guess zhengda considered it + more elegant to put the source rule in pfinet on compile it live, rather + than the compiled blob + <antrik> I probably discussed this with him myself a few years back... but + my memory on this is rather hazy ;-) + <antrik> err... and compile it live + <youpi> ah, right, it's only used when asking pfinet to change its filter + <youpi> but it does not need it for the default filter + <youpi> which is hardcoded + <antrik> I see + <antrik> when would pfinet change its filter? + * youpi now completely converting his hurd box to debian packages with dde + support + <youpi> on SIOCSIFADDR apparently + <youpi> to set "arp or (ip host %s)", + <antrik> well, that sounds like the default filter... + <youpi> the default filter does not choose an IP + <antrik> oh, right... pfinet has to readjust the filter when setting the IP + <youpi> arg we lack support for kernel options for gnumach in update-grub + <antrik> again, I have a vague recollection of discussing this + * youpi crosses fingers + <youpi> yay, works + <antrik> so we *do* need libpcap in pfinet to set proper rules... though I + guess it can also work with a static catchall rule (like it did before + zhengda's changes), only less efficient + <youpi> well in the past we were already catching everything anyway, so at + least it's not a regression :) + <antrik> right + + +# IRC, freenode, #hurd, 2012-02-20 + + <youpi> I was a bit wary of including the ton of dde headers in hurd-dev + <youpi> maybe adding another package for that + <youpi> but that would have delayed introducing the dde binaries + <youpi> probably we can do that for next upload + <pinotree> i can try to work on it, if is feasible (ie if the dde drivers + can currently be built from outside the hurd source tree) + <youpi> it should be, it's a matter of pointing its makefile to a place + where the make scripts and include headers are + <youpi> (and the libraries) + <pinotree> ok + <antrik> youpi: you mean DDEKit headers? + <antrik> pinotree: actually it doesn't matter where the dde-ified Linux + drivers are built -- libdde_linux26 and the actual drivers use a + completetly different build system anyways + <antrik> in fact we concluded at some point that they should live in a + separate repository -- but that change never happened + <antrik> only the base stuff (ddekit, libmachdev etc.) belong in the Hurd + source tree + <youpi> antrik: yes + <youpi> antrik: err, not really completely different + <youpi> the actual drivers' Makefile include the libdde_linux26 mk files + <youpi> the build itself is separate, though + <antrik> youpi: yes, I mean both libdde_linux26 and the drivers use a build + system that is completely distinct from the Hurd one + <youpi> ah, yes + <youpi> libdde_linux26 should however be shipped in the system + <antrik> ideally libdde_linux26 and all the drivers should be built in one + go I'd say... 
+ <youpi> it should be easily feasible to also have a separate driver too + <youpi> e.g. to quickly try a 2.6 driver + <antrik> youpi: I'm not sure about that. it's not even dynamically linked + IIRC?... + <youpi> with scripts to build it + <youpi> it's not + <youpi> but that doesn't mean it can't be separate + <youpi> .a files are usually shipped in -dev packages + <antrik> youpi: ideally we should try to come with a build system that + reuses the original Linux makefile snippets to build all the drivers + automatically without any manual per-driver work + <youpi> there's usually no modification of the drivers themselves? + <youpi> but yeah + <youpi> "ideally", when somebody takes the time to do it + <antrik> unfortunately, it's necessary to include one particular + Hurd/DDE-specific header file in each driver :-( + <youpi> can't it be done through gcc's -include option? + <antrik> zhengda didn't find a way to avoid this... though I still hope + that it must be possible somehow + <antrik> I think the problem is that it has to be included *after* the + other Linux headers. don't remember the details though + <youpi> uh + <youpi> well, a good script can add a line after the last occurrence of + #include + <antrik> yeah... rather hacky, but might work + <youpi> even with a bit of grep, tail, cut, and sed it should work :) + <antrik> note that this is Hurd-specific; the L4 guys didn't need that + <youpi> what is it? + <antrik> don't remember off-hand + + +# IRC, freenode, #hurd, 2012-02-22 + + <youpi> antrik: AIUI, it should be possible to include all network drivers + in just one binary? + <youpi> that'd permit to use it in d-i + <youpi> and completely replace the mach drivers + <youpi> we just need to make sure to include at least what the mach drivers + cover + <youpi> (all DDE network drivers, I mean) + <youpi> of course that doesn't hinder from people to carefully separate + drivers in several binaries if they wish + <youpi> antrik: it does link at least, I'll give a try later + <youpi> yes it works! + <youpi> that looks like a plan + <youpi> throw all network drivers in a /hurd/dde_net + <youpi> settrans it on /dev/dde_net, and settrans devnode on /dev/eth[0-9] + <youpi> I'm also uploading a version of hurd which includes headers & + libraries, so you just need a make in dde_{e100,e1000,etc,net} + <youpi> (uploading it with the dde driver itself :) ) + <youpi> btw, a nice thing is that we don't really care that all drivers are + stuffed into a single binary, since it's a normal process only the useful + pages are mapped and actually take memory :) + <Tekk_> is that really a nice thing though? compared to other systems I + mean + <Tekk_> I know on linux it only loads the modules I need, for example. 
It's + definitely a step up for hurd though :D + <youpi> that's actually precisely what I mean + <youpi> you don't need to load only the modules you need + <youpi> you just load them all + <youpi> and paging eliminates automatically what's not useful + <youpi> even parts of the driver that your device will not need + <Tekk_> ooh + <Tekk_> awesome + <youpi> (actually, it's not even loaded, but the pci tables of the drivers + are loaded, then paged out) + + +# IRC, freenode, #hurd, 2012-02-24 + + <youpi> antrik_: about the #include <ddekit/timer.h>, I see the issue, it's + about jiffies + <youpi> it wouldn't be a very good thing to have a jiffies variable which + we'd have to update 100times per second + <youpi> so ZhengDa preferred to make jiffies a macro which calls a function + which reads the mapped time + <youpi> however, that break any use of the work "jiffies", e.g. structure + members & such + <youpi> actually it's not only after headers that the #include has to be + done, but after any code what uses the word "jiffies" for something else + than the variable + <youpi> pb is: it has to be done *before* any code that uses the word + "jiffies" for the variable, e.g. inline functions in headers + <youpi> in l4dde, there's already the jiffies variable so it's not a + problem + + +# IRC, OFTC, #debian-hurd, 2012-02-27 + + <tschwinge> I plan to do some light performance testing w.r.t. DDE + Ethernet. That is DDE vs. Mach, etc. + <youpi> that'd be good, indeed + <youpi> I'm getting 4MiB/s with dde + <youpi> I don't remember with mach + <tschwinge> Yes. It just shouldn't regress too much. + <tschwinge> Aha, OK. + + +## IRC, freenode, #hurd, 2012-02-27 + + <youpi> tschwinge: nttcp tells me ~80Mbps for mach-rtl8139, ~72Mbps for + dde-rtl8139, ~72Mbps for dde-e1000 + <youpi> civodul: ↑ btw + <ArneBab> youpi: so the dde network device is not much slower than the + kernel-one? + <civodul> youpi: yes, looks good + <ArneBab> rather almost the same speed + <youpi> apparently + <ArneBab> that’s quite a deal. + <ArneBab> what speed should it have as maximum? + <ArneBab> (means: does the mach version get out all that’s possible?) + <ArneBab> differently put: What speed would GNU/Linux get? + <youpi> I'm checking that right now + <ArneBab> cool! + <ArneBab> we need those numbers for the moth after the next + <youpi> Mmm, I'm not sure you really want the linux number :) + <youpi> 1.6Gbps :) + <ArneBab> oh… + <youpi> let me check with udp rather than tcp + <ArneBab> so the Hurd is a “tiny bit” worse at using the network… + <youpi> it might simply be an issue with tcp tuning in pfinet + <ArneBab> hm, yes + <ArneBab> tcp is not that cheap + <ArneBab> and has some pretty advanced stuff for getting to high speeds + <youpi> well, I'm not thinking about being cheap + <youpi> but using more recent tuning + <youpi> that does not believe only 1Mbps network cards exist :) + <ArneBab> like adaptive windows and such? + <ArneBab> :) + <youpi> yes + * ArneBab remembers that TCP was invented when the connections went over + phone lines - by audio :) + <youpi> yep + <ArneBab> what’s the system load while doing the test? + <youpi> yes, udp seems not so bad + <ArneBab> ah, cool! + <youpi> it's very variable (300-3000Mbps), but like on linux + <ArneBab> that pushing it into user space has so low cost is pretty nice. 
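Going back to the jiffies issue youpi describes above (2012-02-24): the reason the DDE glue turns `jiffies` into a macro, and why the corresponding `#include` has to sit after any code that uses the word for something else but before the inline functions that read the variable, can be seen in a few lines. The accessor name below is invented for the sketch; it is not the actual ddekit symbol.

    /* The DDE glue wants every read of `jiffies' to fetch the mapped time
       instead of a variable that would have to be updated 100 times per
       second, so it defines it as a macro (hypothetical accessor name): */
    unsigned long dde_hurd_mapped_jiffies (void);
    #define jiffies (dde_hurd_mapped_jiffies ())

    /* Inline functions in headers that read the variable must see the
       macro, so the #include has to come before them: */
    static inline unsigned long
    time_since (unsigned long start)
    {
      return jiffies - start;           /* expands as intended */
    }

    /* But any code that uses the identifier for something else, e.g.

           struct foo_stats { unsigned long jiffies; };

       would expand into a function declarator inside the struct and fail
       to compile -- hence the #include must also come after all such
       uses, which is what makes automating its placement awkward. */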
+ * ArneBab thinks that that’s a point where Hurd pays off + <youpi> that's actually what AST said to fosdem + <youpi> he doesn't care about putting an RPC for each and every port i/o + <youpi> because hardware is slow anyway + <ArneBab> jupp + <ArneBab> but it is important to see that in real life diff --git a/open_issues/default_pager.mdwn b/open_issues/default_pager.mdwn index 683dd870..9a8e9412 100644 --- a/open_issues/default_pager.mdwn +++ b/open_issues/default_pager.mdwn @@ -10,7 +10,10 @@ License|/fdl]]."]]"""]] [[!tag open_issue_gnumach]] -IRC, freenode, #hurd, 2011-08-31: +[[!toc]] + + +# IRC, freenode, #hurd, 2011-08-31 <antrik> braunr: do you have any idea what could cause the paging errors long before swap is exhausted? @@ -29,3 +32,6 @@ IRC, freenode, #hurd, 2011-08-31: <braunr> uvm is too different <braunr> dragonflybsd maybe, but it's very close to freebsd <braunr> i didn't look at darwin/xnu + + +# [[trust_the_behavior_of_translators]] diff --git a/open_issues/glibc_madvise_vs_static_linking.mdwn b/open_issues/glibc_madvise_vs_static_linking.mdwn index 6238bc77..7b5963d3 100644 --- a/open_issues/glibc_madvise_vs_static_linking.mdwn +++ b/open_issues/glibc_madvise_vs_static_linking.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -24,3 +25,15 @@ to ignore the advice.* (`man madvise`), so we may simply want to turn it into a no-op in glibc, avoiding the link-time warning. 2011-07: This is what Samuel has done for Debian glibc. + + +# IRC, freenode, #hurd, 2012-02-16 + + <tschwinge> youpi: With Roland's fix the situation will be that just using + gcc -static doesn't emit the stub warning, but still will do so in case + that the source code itself links in madvise. Is this acceptable for + you/Debian/...? + <youpi> packages with -Werror will still bug out + <youpi> not that I consider -Werror to be a good idea, though :) + <tschwinge> youpi: Indeed. Compiler warnings can be caused by all kinds of + things. -Werror is good for development, but not for releases. diff --git a/open_issues/gnumach_memory_management.mdwn b/open_issues/gnumach_memory_management.mdwn index c34d1200..d29e316c 100644 --- a/open_issues/gnumach_memory_management.mdwn +++ b/open_issues/gnumach_memory_management.mdwn @@ -2089,7 +2089,15 @@ There is a [[!FF_project 266]][[!tag bounty]] on this task. <braunr> ou alors dans les .h ipc -# IRC, freenode, #hurdfr, 2012-01-18 +# IRC, freenode, #hurd, 2012-01-18 <braunr> does the slab branch need other reviews/reports before being integrated ? + + +# IRC, freenode, #hurd, 2012-01-30 + + <braunr> youpi: do you have some idea about when you want to get the slab + branch in master ? 
+ <youpi> I was considering as soon as mcsim gets his paper + <braunr> right diff --git a/open_issues/linux_as_the_kernel.mdwn b/open_issues/linux_as_the_kernel.mdwn new file mode 100644 index 00000000..f14b777e --- /dev/null +++ b/open_issues/linux_as_the_kernel.mdwn @@ -0,0 +1,42 @@ +[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +Instead of attempting a [[history/port_to_another_microkernel]], or writing an +own one, an implementation of a Hurd system could use another existing +operating system/kernel, like [[UNIX]], for example, the Linux kernel. This is +not a [[microkernel]], but that is not an inherent hindrance; depending on what +the goals are. + +There has been an attempt for building a [[Mach_on_top_of_POSIX]]. + + +# IRC, freenode, #hurd, 2012-02-08 + +Richard's X-15 Mach re-implementation: + + <braunr> and in case you didn't notice, it's stalled + <braunr> actually i don't intend to work on it for the time being + <braunr> i'd rather do as neal suggested: take linux, strip it, and give it + a mach interface + <braunr> (if your goal really is to get something usable for real world + tasks) + <antrik> braunr: why would you want to strip down Linux? I think one of the + major benefits of building a Linux-Frankenmach would be the ability to + use standard Linux functionality alongside Hurd... + <braunr> we could have a linux x86_64 based mach replacement in "little" + time, with a compatible i386 interface for the hurd + <braunr> antrik: well, many of the vfs and network subsystems would be hard + to use + <antrik> BTW, one of the talks at FOSDEM was about the possibility of using + different kernels for Genode, and pariticularily focused on the + possibilities with using Linux... unfortunately, I wasn't able to follow + the whole talk; but they mentioned similar possibilities to what I'm + envisioning here :-) + diff --git a/open_issues/memory_object_model_vs_block-level_cache.mdwn b/open_issues/memory_object_model_vs_block-level_cache.mdwn new file mode 100644 index 00000000..7da5dea4 --- /dev/null +++ b/open_issues/memory_object_model_vs_block-level_cache.mdwn @@ -0,0 +1,273 @@ +[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_documentation open_issue_hurd open_issue_gnumach]] + + +# IRC, freenode, #hurd, 2012-02-14 + + <slpz> Open question: what do you think about dropping the memory object + model and implementing a simple block-level cache? + +[[microkernel/mach/memory_object]]. 
+ + <kilobug> slpz: AFAIK the memory object has more purpose than just cache, + it's allow used for passing chunk of data between processes, handling + swap (which similar to cache, but still slightly different), ... + <slpz> kilobug: user processes usually make their way to data with POSIX + operations, so memory objects are only needed for mmap'ed files + <slpz> kilobug: and swap can be replaced for an in-kernel system or even + could still use the memory object + <braunr> slpz: memory objects are used for the page cache + <kilobug> slpz: translators (especially diskfs based) make heavy use of + memory objects, and if "user processes" use POSIX semantics, Hurd process + (translators, pagers, ...) shouldn't be bound to POSIX + <slpz> braunr: and page cache could be moved to a lower level, near to the + devices + <braunr> not likely + <braunr> well, it could, but then you'd still have the file system overhead + <slpz> kilobug: but the use of memory objects it's not compulsory, you can + easily write a fs translator without implementing memory objects at all + (except to mmap) + <braunr> a unified buffer/VM cache as all modern systems have is probably + the most efficient approach + <slpz> braunr: I agree. I want to look at *BSD/Linux vfs systems to seem + how much cache policy depends on the filesystem + <slpz> braunr: Are you aware of any good papers on this matter? + <braunr> netbsd UVM, the linux virtual memory system + <braunr> both a bit old bit still relevant + <slpz> braunr: Thanks. + <slpz> the problem in our case is that having FS and cache information at + different contexts (kernel vs. translator), I find hard to coordinate + them. + <slpz> that's why I though about a block-level cache that GNU Mach could + manage by itself + <slpz> I wonder how QNX deals with this + <braunr> the point of having a simple page cache is explicitely about not + caring if those pages are blocks or files or whatever + <braunr> the kernel (at least, mach) normally has all the accounting + information it needs to implement its cache policy + <braunr> file system translators shouldn't cache much + <braunr> the pager interface could be refined, but it looks ok to me as it + is + <slpz> Mach has the accounting info, but it's not able to purge the cache + without coordination with translators + <braunr> which is normal + <slpz> And this is a big problem when memory pressure increases, as it + doesn't know for sure when memory is going to be freed + <braunr> Mach flushes its cache when it decides to, and sends back dirty + pages if needed by the pager + <braunr> that's the case with every paging implementation + <braunr> the main difference is security with untrusted pagers + <braunr> but that's another issue + <slpz> but in a monolithic implementation, the kernel is able for force a + chunk of cache memory to be freed without hoping for other process to do + the job + <braunr> that's not true + <braunr> they're not process, they're threads, but the timing issue is the + same + <braunr> see pdflush on linux + <slpz> no, it isn't. + <braunr> when memory is scarce, threads that request memory can either wait + or immediately fail, and if they wait, they're usually woken by one of + the vm threads once flushing is done + <slpz> a kernel thread can access all the information in the kernel, and + synchronization is pretty easy. 
+ <braunr> on mach, synchronization is done with messages, that's even easier + than shared kernel locks + <slpz> with processes in different spaces, resource coordination becomes + really difficult + <braunr> and what kind of info would an external pager need when simply + asked to take back its dirty pages + <braunr> what resources ? + <slpz> just take a look at the thread storm problem when GNU Mach needs to + clean a bunch of pages + <braunr> Mach is big enough to correctly account memory + <braunr> there can be thread storms on monolithic systems + <braunr> that's a Mach issue, not a microkernel issue + <braunr> that's why linux limits the number of pdflush thread instances + <slpz> Mach can account memory, but can't assure when be freed by any + means, in a lesser degree than a monolithic system + <braunr> again i disagree + <braunr> no system can guarantee when memory will be freed with paging + <slpz> a block level cache can, for most situations + <braunr> slpz: why ? + <braunr> slpz: or how i mean ? + <slpz> braunr: with a block-level page cache, GNU Mach should be able to + flush dirty pages directly to the underlaying device without all the + complexity and resource cost involved in a m_o_data_return message. It + can also throttle the rate at which pages are being cleaned, and do all + this while blocking new page allocations to deal with memory exhaustion + cases. + <slpz> braunr: in the current state, when cleaning dirty pages, GNU Mach + sends a bunch on m_o_data_return to the corresponding pagers, hoping they + will do their job as soon and as fast as possible. + <slpz> memory is not really freed, but transformed from page cache to + anonymous memory pertaining to the corresponding translator + <slpz> and GNU Mach never knows for sure when this memory is released, if + it ever is. + <slpz> not being able to flush dirty pages synchronously is a big problem + when you need to throttle memory usage + <slpz> and needing allocating more memory when you're trying to free (which + is the case for the m_o_data_return mechanism) makes the problem even + worse + <braunr> your idea of a block level cache means in kernel block drivers + <braunr> that's not the direction we're taking + <braunr> i agree flushing should be a synchronous process, which was one of + the proposed improvements in the thread migration papers + <braunr> (they didn't achieve it but thought about it for future works, so + that the thread at the origin of the fault would handle it itself) + <braunr> but it should be possible to have kernel threads similar to + pdflush and throttle flush requests too + <braunr> again, i really think it's a mach bug, and having a buffer cache + would be stepping backward + <braunr> the real design issue is allocating memory while trying to free + it, yes + <slpz> braunr: thread migration doesn't apply to asynchronous IPC, and the + entire paging mechanism is implemented this way + <slpz> in fact, trying to do a synchronous m_o_data_return will trigger a + deadlock for sure + <slpz> to achieve synchronous flushing with translators, the entire paging + model must be redesigned + <slpz> It's true that I'm not very confident of the viability of user space + drivers + <slpz> at least, not for every device + <slpz> I know this is against the current ideas for most ukernel designs, + but if we want to achieve real work functionality, I think some + sacrifices must be done. Or at least a reasonable compromise. 
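+
+(To illustrate the allocation-while-freeing hazard raised here, extend the
+pager_write_page sketch from above: before a file system can write a dirty
+page it may first have to read metadata to find where that page lives on
+disk, and that step may itself allocate memory or fault.  Helper names are
+invented again.)
+
+    extern error_t my_map_metadata (struct user_pager_info *pager,
+                                    vm_offset_t page);
+
+    error_t
+    pager_write_page (struct user_pager_info *pager, vm_offset_t page,
+                      vm_address_t buf)
+    {
+      /* 1. Locate PAGE on disk.  For ext2fs this can mean reading an
+            indirect block first, which allocates memory and may fault or
+            wait for free pages...  */
+      error_t err = my_map_metadata (pager, page);
+      if (err)
+        return err;
+
+      /* 2. ...so if the kernel were waiting synchronously on this very call
+            in order to free memory, while step 1 waits for memory, neither
+            side could make progress.  */
+      return my_write_block (page, buf);
+    }
+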
+ <braunr> slpz: thread migration for paging requests implies synchronous + RPC, we don't care much about the IPC layer there + <braunr> and it requires large changes of the VM code in addition, yes + <braunr> let's not talk about this, we don't have thread migration anyway + :p + <braunr> except the allocation-on-free-path issue, i really don't see how + the current pager interface or the page cache creates problems wrt + flushing .. + <braunr> monolithic systems also have that problem, with lower impacts + though, but still + <slpz> braunr: because as it doesn't know when memory is really freed, 1) + it just blindly sends a bunch of m_o_data_return to the pagers, usually + overloading them (causing thread storms), and 2) it can't properly + throttle new page requests to deal with resource exhaustion + <braunr> it does know when memory is really freed + <braunr> and yes, it blindly sends a bunch of requests, they can and should + be trottled + <slpz> but dirty pages freed become indistinguishable from common anonymous + chunks released, so it doesn't really know if page flushes are really + working or not (i.e. doesn't know how fast a device is processing write + requests) + <braunr> memory is freed when the pager deallocates it + <braunr> the speed of the operation is irrelevant + <braunr> no system can rely on disk speed to guarantee correct page flushes + <braunr> disk or anything else + <slpz> requests can't be throttled if Mach doesn't know when they are being + processed + <braunr> it can easily know it + <braunr> they are processed as soon as the request is sent from the kernel + <braunr> and processing is done when the pager acknowledges the end of the + flush + <braunr> memory backing the flushed pages should be released before + acknowleding that to avoid starting new requests too soon + <slpz> AFAIK pagers doesn't acknowledge the end of the flush + <braunr> well that's where the interface should be refined + <slpz> Mach just sends the m_o_data_return and continues on its own + <braunr> that's why flushing should be synrhconous + <braunr> are you sure about that however ? + <slpz> so the entire paging system needs a new design... :) + <slpz> pretty sure + <braunr> not a new design .. + <braunr> there is m_o_supply_completed, i don't see how difficult it would + be to add m_o_data_return_completed + <braunr> it's not a small change, but not a difficult one either + <braunr> i'm more worried about the allocation problem + <braunr> the default pager should probably be wired in memory + <braunr> maybe others + <slpz> let's suppose a case in which Mach needs to free memory due to an + increase in its pressure. vm_pageout_daemon starts running, clean pages + are freed easily, but for each dirty one a m_o_data_return in sent. 1) + when should this daemon stop sending m_o_data_return and start waiting + for m_o_data_return_completed? 2) what happens if the translator needs to + read new blocks to fulfill a write request (pretty common in ext2fs)? 
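+
+(The answer below amounts to bounding the number of in-flight page-out
+requests and, given a completion message such as the proposed, currently
+nonexistent memory_object_data_return_completed, only sending more once
+pagers have acknowledged earlier ones.  A hypothetical sketch, with all
+names invented:)
+
+    struct page;
+
+    /* Invented helpers: queue one memory_object_data_return, sleep until a
+       completion message arrives, wake up sleepers.  */
+    extern void pager_queue_data_return (struct page *p);
+    extern void pageout_wait_for_completion (void);
+    extern void pageout_wakeup (void);
+
+    #define PAGEOUT_MAX_IN_FLIGHT 64  /* arbitrary, cf. Linux's pdflush cap */
+
+    static unsigned int pageout_in_flight;
+
+    static void
+    pageout_send_one (struct page *p)
+    {
+      while (pageout_in_flight >= PAGEOUT_MAX_IN_FLIGHT)
+        pageout_wait_for_completion ();
+      pageout_in_flight++;
+      pager_queue_data_return (p);
+    }
+
+    /* Handler for the hypothetical acknowledgement, in the spirit of the
+       existing memory_object_supply_completed.  */
+    void
+    pageout_return_completed (struct page *p)
+    {
+      pageout_in_flight--;
+      pageout_wakeup ();
+    }
+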
+ <braunr> it should stop after an arbitrary limit is reached + <braunr> a reasonable one + <braunr> linux limits the number of pdflush threads for that reason as i + mentioned (to 8 iirc) + <braunr> the problem of reading blocks while flushing is what i'm worried + about too, hence the need to wire that code + <braunr> well, i'm nto sure it's needed + <braunr> again, a reasonable about of free memory should be reserved for + that at all times + <slpz> but the work for pdflush seems to be a lot easier, as it only deals + directly with block devices (if I understood it correctly, I just started + looking at it). + <braunr> i don't know how other systems compute that, but this is how they + seem to do as well + <braunr> no, i don't think so + <slpz> well, I'll try to invest a few days understanding how pdflush work, + to see if some ideas can be borrowed for Hurd + <braunr> iirc, freebsd has thresholds in percent for each part of its cache + (active, inactive, free, dirty) + <slpz> but I still think simple solutions work better, and using the memory + object for page cache is tremendously complex. + <braunr> the amount of free cache pages is generally sufficient to + guarantee much memory can be released at once if needed, without flushing + anything + <braunr> yes but that's the whole point of the Mach VM + <braunr> and its greatest advance .. + <slpz> what, memory objects? + <braunr> yes + <braunr> using physical memory as a cache for anything, not just block + buffers + <slpz> memory objects work great as a way to provide a shared image of + objects between processes, but as page cache they are an overkill (IMHO). + <slpz> or, at least, in the way we're using them + <braunr> probably + <braunr> http://lwn.net/Articles/326552/ + <braunr> this can help udnerstand the problems we may have without better + knowledge of the underlying devices, yes + <braunr> (e.g. not being able to send multiple requests to pagers that + don't share the same disk) + <braunr> slpz: actually i'm not sure it's that overkill + <braunr> the linux vm uses struct vm_file to represent memory objects iirc + <braunr> there are many links between that structure and some vfs related + subsystems + <braunr> when a system very actively uses the page cache, the kernel has to + maintain a lot of objects to accurately describe the cache content + <braunr> you could consider this overkill at first too + <braunr> the mach way of doing it just implies some ipc messages instead of + function calls, it's not that overkill for me + <braunr> the main problems are recursion (allocation while freeing, + handling page faults in order to handle flushes, that sort of things) + <braunr> struct file and struct address_space actually + <braunr> slpz: see struct address_space, it contains a set of function + pointers that can help understanding the linux pager interface + <braunr> they probably sufferred from similar caveats and worked around + them, adjusting that interface on the way + <slpz> but their strategy makes them able to treat the relationship between + the page cache and the block devices in a really simple way, almost as a + traditional storage cache. 
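+
+(The function pointers braunr refers to are those of Linux's struct
+address_space_operations.  Abridged from memory for kernels of that era,
+roughly 2.6.x/3.x; see include/linux/fs.h for the authoritative definition.)
+
+    struct address_space_operations {
+            int (*writepage)(struct page *page, struct writeback_control *wbc);
+            int (*readpage)(struct file *, struct page *);
+            int (*writepages)(struct address_space *, struct writeback_control *);
+            int (*set_page_dirty)(struct page *page);
+            int (*releasepage)(struct page *, gfp_t);
+            /* ... */
+    };
+
+Functionally this is close to what a Hurd pager implements; the operations
+are just function pointers invoked by the kernel instead of RPCs sent to
+another task.
+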
+ <slpz> meanwhile on Mach+pager scenario, the relationship between a block + in a file and its underlying storage becomes really blurry + <slpz> this is a huge advantage when flusing out data, specially when + resources are scarce + <slpz> I think the idea of using abstract objects for page cache, loses a + bit the point that we just want to avoid accessing constantly to a slow + device + <slpz> and breaking the tight relationship between the device and its + cache, makes things a lot harder + <slpz> this also manifest itself when flushing clean pages, as things like + having an static maximum for cached memory objects + <slpz> we shouldn't care about the number of objects, we just need to + control the number of pages + <slpz> but as we need the pager to flush pages, we need to keep alive a lot + of control ports to them + <mcsim> slpz: When mo_data_return is called, once the memory manager no + longer needs supplied data, it should be deallocated using + vm_deallocate. So this way pagers acknowledges the end of flush. diff --git a/open_issues/select.mdwn b/open_issues/select.mdwn index 0b69d645..abec304d 100644 --- a/open_issues/select.mdwn +++ b/open_issues/select.mdwn @@ -14,9 +14,11 @@ License|/fdl]]."]]"""]] There are a lot of reports about this issue, but no thorough analysis. -# `elinks` +# Short Timeouts -IRC, unknown channel, unknown date. +## `elinks` + +IRC, unknown channel, unknown date: <paakku> This is related to ELinks... I've looked at the select() implementation for the Hurd in glibc and it seems that giving it a short @@ -31,9 +33,186 @@ IRC, unknown channel, unknown date. <paakku> Or do I just imagine this problem? -# dbus +## [[dbus]] + + +## IRC + +### IRC, freenode, #hurd, 2012-01-31 + + <braunr> don't you find vim extremely slow lately ? + <braunr> (and not because of cpu usage but rather unnecessary sleeps) + <jkoenig> yes. + <braunr> wasn't there a discussion to add a minimum timeout to mach_msg for + select() or something like that during the past months ? + <youpi> there was, and it was added + <youpi> that could be it + <youpi> I don't want to drop it though, some app really need it + <braunr> as a debian patch only iirc ? + <youpi> yes + <braunr> ok + <braunr> if i'm right, the proper solution was to fix remote servers + instead of client calls + <youpi> (no drop, unless the actual bug gets fixed of course) + <braunr> so i'm guessing it's just a hack in between + <youpi> not only + <youpi> with a timeout of zero, mach will just give *no* time for the + servers to give an answer + <braunr> that's because the timeout is part of the client call + <youpi> so the protocol has to be rethought, both server/client side + <braunr> a suggested solution was to make it a parameter + <braunr> i mean, part of the message + <braunr> not a mach_msg parameter + <jkoenig> OTOH the servers should probably not be trusted to enforce the + timeout. + <braunr> why ? + <jkoenig> they're not necessarily trusted. (but then again, that's not the + only circumstances where that's a problem) + <braunr> there is a proposed solution for that too (trust root and self + servers only by default) + <jkoenig> I'm not sure they're particularily easy to identify in the + general case + <braunr> "they" ? the solutions you mean ? + <braunr> or the servers ? + <youpi> jkoenig: you can't trust the servers in general to provide an + answer, timeout or not + <jkoenig> yes the root/self servers. 
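+
+(Where the timeout lives today, roughly: glibc's select/poll sends the
+io_select requests and then waits for the replies with a mach_msg receive
+timeout.  This is a paraphrase of glibc's hurdselect.c, not the actual code.
+With a timeout of 0 ms the receive returns before any server has had a
+chance to reply, which is the problem described above.)
+
+    #include <mach.h>
+
+    /* Wait for io_select replies arriving on PORTSET.  */
+    static mach_msg_return_t
+    wait_for_replies (mach_port_t portset, int use_timeout,
+                      mach_msg_timeout_t timeout_ms,
+                      mach_msg_header_t *msg, mach_msg_size_t size)
+    {
+      return mach_msg (msg,
+                       MACH_RCV_MSG | (use_timeout ? MACH_RCV_TIMEOUT : 0),
+                       0,               /* nothing to send here */
+                       size, portset,
+                       use_timeout ? timeout_ms : MACH_MSG_TIMEOUT_NONE,
+                       MACH_PORT_NULL);
+    }
+
+Carrying the timeout in the io_select request itself, as suggested above,
+would let the servers enforce it instead of the client's mach_msg call.
+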
+ <braunr> ah + <youpi> jkoenig: you can stat the actual node before dereferencing the + translator + <jkoenig> could they not report FD activity asynchronously to the message + port? libc would cache the state + <youpi> I don't understand what you mean + <youpi> anyway, really making the timeout part of the message is not a + problem + <braunr> 10:10 < youpi> jkoenig: you can't trust the servers in general to + provide an answer, timeout or not + <youpi> we already trust everything (e.g. read() ) into providing an answer + immediately + <braunr> i don't see why + <youpi> braunr: put sleep(1) in S_io_read() + <youpi> it'll not give you an immediate answer, O_NODELAY being set or not + <braunr> well sleep is evil, but let's just say the server thread blocks + <braunr> ok + <braunr> well fix the server + <youpi> so we agree + <braunr> ? + <youpi> in the current security model, we trust the server into achieve the + timeout + <braunr> yes + <youpi> and jkoenig's remark is more global than just select() + <braunr> taht's why we must make sure we're contacting trusted servers by + default + <youpi> it affects read() too + <braunr> sure + <youpi> so there's no reason not to fix select() + <youpi> that's the important point + <braunr> but this doesn't mean we shouldn't pass the timeout to the server + and expect it to handle it correctly + <youpi> we keep raising issues with things, and not achieve anything, in + the Hurd + <braunr> if it doesn't, then it's a bug, like in any other kernel type + <youpi> I'm not the one to convince :) + <braunr> eh, some would say it's one of the goals :) + <braunr> who's to be convinced then ? + <youpi> jkoenig: + <youpi> who raised the issue + <braunr> ah + <youpi> well, see the irc log :) + <jkoenig> not that I'm objecting to any patch, mind you :-) + <braunr> i didn't understand it that way + <braunr> if you can't trust the servers to act properly, it's similar to + not trusting linux fs code + <youpi> no, the difference is that servers can be non-root + <youpi> while on linux they can't + <braunr> again, trust root and self + <youpi> non-root fuse mounts are not followed by default + <braunr> as with fuse + <youpi> that's still to be written + <braunr> yes + <youpi> and as I said, you can stat the actual node and then dereference + the translator afterwards + <braunr> but before writing anything, we'd better agree on the solution :) + <youpi> which, again, "just" needs to be written + <antrik> err... adding a timeout to mach_msg()? that's just wrong + <antrik> (unless I completely misunderstood what this discussion was + about...) + + +#### IRC, freenode, #hurd, 2012-02-04 + + <youpi> this is confirmed: the select hack patch hurts vim performance a + lot + <youpi> I'll use program_invocation_short_name to make the patch even more + ugly + <youpi> (of course, we really need to fix select somehow) + <pinotree> could it (also) be that vim uses select() somehow "badly"? + <youpi> fsvo "badly", possibly, but still + <gnu_srs1> Could that the select() stuff be the reason for a ten times + slower ethernet too, e.g. scp and apt-get? 
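+
+(On youpi's point above about statting the actual node before dereferencing
+the translator, which is also how non-root FUSE mounts are handled on Linux:
+a rough sketch using the Hurd-specific O_NOTRANS lookup flag, with error
+handling trimmed and the "root and self" policy hard-coded for
+illustration.)
+
+    #include <hurd.h>
+    #include <hurd/io.h>
+    #include <fcntl.h>
+    #include <unistd.h>
+    #include <sys/stat.h>
+
+    static file_t
+    lookup_if_trusted (const char *name)
+    {
+      struct stat st;
+      file_t node = file_name_lookup (name, O_NOTRANS, 0);
+
+      if (node == MACH_PORT_NULL)
+        return MACH_PORT_NULL;
+
+      if (io_stat (node, &st) == 0
+          && (st.st_uid == 0 || st.st_uid == getuid ()))
+        {
+          /* The underlying node belongs to root or to us: follow the
+             translator by redoing the normal lookup.  */
+          mach_port_deallocate (mach_task_self (), node);
+          return file_name_lookup (name, 0, 0);
+        }
+
+      /* Otherwise stick to the untranslated node (or fail, depending on
+         the chosen policy).  */
+      return node;
+    }
+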
+ <pinotree> i didn't find myself neither scp nor apt-get slower, unlike vim + <youpi> see strace: scp does not use select + <youpi> (I haven't checked apt yet) + + +### IRC, freenode, #hurd, 2012-02-14 + + <braunr> on another subject, I'm wondering how to correctly implement + select/poll with a timeout on a multiserver system :/ + <braunr> i guess a timeout of 0 should imply a non blocking round-trip to + servers only + <braunr> oh good, the timeout is already part of the io_select call + + +### IRC, freenode, #hurdfr, 2012-02-22 + + <braunr> the big problem with our implementation is that the select timeout + is a client-side parameter + <braunr> a parameter passed directly to mach_msg + <braunr> so if you set a timeout of 0, chances are that mach_msg returns + before an RPC can even complete entirely (i.e. a full client-server + round trip) + <braunr> and so when the timeout is 0 for non-blocking use, well, you don't + block, but you don't get your events either .. + <abique|work> maybe lowering the timeout from 10ms to 10 us would improve + the situation. + <abique|work> because 10ms is a bit much :) + <braunr> that's the historical unix system interval timer + <braunr> and mach is not preemptible + <braunr> so it's not an option in the current state of things + <braunr> that said, it's not completely related + <braunr> well, actually it is: we'd need something similar to the high res + timers of linux + <braunr> either high resolution timers, or a timer that can easily be + reprogrammed + <braunr> currently only the 8254 gets programmed, and to ensure more or less + correct scheduling, it is programmed once, at 10ms, and that's it + <braunr> so yes, specifying 1ms or 1us won't change anything about the + interval needed to determine that the timer has expired + + +### IRC, freenode, #hurd, 2012-02-27 -See [[dbus]]. + <youpi> braunr: extremely dirty hack + <youpi> I don't even want to detail :) + <braunr> oh + <braunr> does it affect vim only ? + <braunr> or all select users ? + <youpi> we've mostly seen it with vim + <youpi> but possibly fakeroot has some issues too + <youpi> it's very little probable that only vim has the issue :) + <braunr> i mean, is it that dirty to switch behaviour depending on the + calling program ? + <youpi> not all select users + <braunr> ew :) + <youpi> just those which do select({0,0}) + <braunr> well sure + <youpi> braunr: you guessed right :) + <braunr> thanks anyway + <braunr> it's probably a good thing to do currently + <braunr> vim was getting me so mad i was using sshfs lately + <youpi> it's better than nothing yes # See Also diff --git a/open_issues/trust_the_behavior_of_translators.mdwn b/open_issues/trust_the_behavior_of_translators.mdwn new file mode 100644 index 00000000..454c638b --- /dev/null +++ b/open_issues/trust_the_behavior_of_translators.mdwn @@ -0,0 +1,181 @@ +[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_hurd]] + +Apart from the issue of [[translators_set_up_by_untrusted_users]], another +problem is described below.
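+
+(One idea that comes up below is an "authentication" mechanism for pagers: a
+privileged port, handed out like the device master port, marking system
+pagers as trusted, so that page-out never has to rely on an arbitrary
+untrusted task.  A purely hypothetical sketch of such a check, with all
+names invented:)
+
+    struct page;
+
+    struct pager_record
+    {
+      int trusted;   /* this pager presented the privileged "pager master" port */
+      /* ... */
+    };
+
+    extern void pager_queue_data_return (struct page *p);
+    extern void default_pager_take_over (struct page *p);
+
+    static void
+    page_out (struct page *p, struct pager_record *pr)
+    {
+      if (pr->trusted)
+        pager_queue_data_return (p);   /* normal external-pager path */
+      else
+        default_pager_take_over (p);   /* do not depend on an untrusted pager */
+    }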
+ + +# IRC, freenode, #hurd, 2012-02-17 + +(Preceded by the [[memory_object_model_vs_block-level_cache]] discussion.) + + <slpz> what should do Mach with a translator that doesn't clean pages in a + reasonable amount of time? + <slpz> (I'm talking about pages flushed to a pager by the pageout daemon) + <braunr> slpz: i don't know what it should do, but currently, it uses the + default pager + +[[default_pager]]. + + <slpz> braunr: I know, but I was thinking about an alternative, for the + case in which a translator in not behaving properly + <slpz> perhaps freeing the page, removing the map, and letting it die in a + segmentation fault could be an option + <braunr> slpz: what if the translator can't do it properly because of + system resource exhaustion ? + <braunr> (i.e. it doesn't have enough cpu time) + <slpz> braunr: that's the biggest question + <slpz> let's suppose that Mach selects a page, sends it to the pager for + cleaning it up, reinjects the page into the active queue, and later it + founds the page again and it's still dirty + <slpz> but it needs to free some pages because memory it's really, really + scarce + <slpz> Linux just sits there waiting for I/O completion for that page + (trusts its own drivers) + <slpz> but we could be dealing with rogue translator... + <braunr> yes + <braunr> we may need some sort of "authentication" mechanism for pagers + <braunr> so that "system pagers" are trusted and others not + <braunr> using something like the device master port but for pagers + <braunr> a special port passed to trusted pagers only + <slpz> hum... that could be used to workaround the untrusted translator + crossing problem while walking a directory tree + +[[translators_set_up_by_untrusted_users]]. + + <slpz> but I think differentiating between trusted and untrusted + translators was rejected for philosophical reasons + <slpz> (but I'm not sure) + <mcsim> slpz: probably there should be something like oom killer? + <mcsim> braunr: even if translator is trusted it could have a bug which + make it ask more and more memory, so system have something to do with + it. Also, this way TCB is increased, so providing port for trusted + translators may hurt security. + <mcsim> I've read that Genode has "guarded allocators" which help resource + accounting by limiting of memory that could be used. Probably something + like this could be used in Hurd to limit translators. + <antrik> I don't remember how Viengoos deals with this :-( + +[[microkernel/Viengoos]]. + + <braunr> mcsim: the main feature lacking in mach is resource accounting :p + +[[resource_management_problems]]. + + <slpz> mcsim: yes, I think there should be a Hurdish oom killer, paying + special attention to external pagers + +[[microkernel/mach/external_pager_mechanism]]. + + <braunr> the oom killer selects untrusted processes by definition (since + pagers are in kernel) + <mcsim> slpz: and what is better: oom killer or resource accounting? + <mcsim> Under resource accounting I mean mechanism when process can't get + more resources than it is allowed. + <braunr> resource accounting of course + <braunr> but it's not just about that + <braunr> really, how does the kernel deal when a pager refuses to honor a + paging request ? + <braunr> whether it is buggy or malicious + <braunr> is it really possible to keep all pagers out of the TCB ? + <antrik> mcsim: we definitely want proper resource accounting in the long + run. 
the question is how to deal with the situation that resources are + reallocated to other tasks, so some pages have to be freed + <antrik> I really don't remember how Neal proposed to deal with this + <slpz> mcsim: Better: resource accounting (in which resources are accounted + to the user tasks which are requesting them, as in the Viengoos + model). Good enough an realistic: oom killer + <antrik> I'm not sure an OOM killer for non-system pagers is terribly + helpful. in typical use, the vast majority of paging is done by trusted + pagers... + <antrik> and without proper client resource accounting, there are enough + other ways a rogue/broken process can eat system resources -- I'm not + convinced that untrusted pagers have a major impact on the overall + situation + <mcsim> If pager can't free resources because of lack, for example, of cpu + time it's priority could be increased to give it second chance to free + the page. But if it doesn't manage to free resources it could be killed. + <antrik> I think the current approach with default pager taking over is + good enough for dealing with untrusted pagers. the real problem are even + trusted pager frequently failing to deal with the requests + <braunr> i agree with antrik + <braunr> and i'm opposed to an oom killer + <braunr> it's really not a proper fix for our problems + <braunr> mcsim: what if needs 3 attempts before succeeding ? + <braunr> +it + <braunr> and increasing priority without a good reason (e.g. no priority + inversion) leads to unfairness + <braunr> we don't want to deal with tricky problems like malicious pagers + using that to starve other tasks + <mcsim> braunr: this is just temporary decision (for example for half of + second of user time), to increase probability that task was killed not + because of it lacked resources. + <braunr> mcsim: tunables should only affect the efficiency of an operation, + not its success + + +## IRC, freenode, #hurd, 2012-02-19 + + <antrik> neal: the specific question is how to ensure processes free memory + fast enough when their allocation becomes lower due to resource pressure + <neal> antrik: you can't really. + <neal> antrik: the memory manager can act on the application's behalf if + the application marks pages as discardable or pagable. + <neal> antrik: if the memory manager makes an upcall to the application to + free some memory and it doesn't, you have to penalize it. + <neal> antrik: You shouldn't the process like exokernel + <neal> antrik: It's the developers fault, not the user's + <neal> antrik: What you need are controls that ensure that the user stays + in control + <neal> ...shouldn't *kill* the process... + <antrik> neal: well, how can I penalize a process that eats to much + physical memory? + <neal> in the future, you don't give it as much slack memory + <antrik> marking as pagable means a system pager will push them to the swap + partition? 
+ <antrik> ah, OK + <neal> yes + <neal> and you page it more aggressively, i.e., you don't give it a chance + to free memory + <neal> The situation is: + <neal> you have memory pressure + <neal> you choose a victim process and ask it to free memory + <neal> now, you need to wait + <neal> if you wait and it doesn't free memory, you give it bad karma + <neal> if you wait and it frees memory, you're good + <neal> but during that window, a bad process can delay recovery + <neal> so, in the future, we give bad processes less time + <neal> but, we still send a message asking it to free memory + <neal> we just hope it was a bug + <antrik> so the major difference to the approach we have in Mach is that + instead of just redeclaring some pages as anonymous memory that will be + paged to swap by the default pager eventually if the pager in question + fails to handle them properly, we wait some time for the process to free + (any) memory, and only then start paging out some of it's pages to swap + <neal> there's also discardable memory + <antrik> hm... there is some notion of "precious" pages in Mach... I wonder + whether that is also used to decide about discarding pages instead of + pushing them to swap? + <neal> antrik: A precious page is ro data that shouldn't be dropped + <antrik> ah + <antrik> but I guess that means non-precious RO data (such as a cache) can + be dropped without asking the pager, right? + <neal> yes + <antrik> I wonder whether such a karma system can be introduced in Mach as + well to deal with problematic pagers + + +## IRC, freenode, #hurd, 2012-02-21 + + <neal> antrik: One of the main differences between Mach and Viengoos is + that in Mach servers are responsible for managing memory whereas in + Viengoos applications are primarily responsible for managing memory. diff --git a/public_hurd_boxen/xen_handling.mdwn b/public_hurd_boxen/xen_handling.mdwn index 47d92c43..d4e33ce9 100644 --- a/public_hurd_boxen/xen_handling.mdwn +++ b/public_hurd_boxen/xen_handling.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2009 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2009, 2012 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -44,6 +44,7 @@ by typing in `host=flubber`, for example, will be enough to get access to that machine's console. /!\ TODO: How does one get the environment variables `COLUMNS` and `LINES` set -properly when using `xm console`? This is relevant for everything using -`(n)curses` -- for interactive console applications. Using `export COLUMNS=143 -LINES=44` does work, but is a manual process. +properly when using `xm console`? According to Samuel, *you don't, the xen +console doesn't have the notion of terminal size*. This is relevant for +everything using `(n)curses` -- for interactive console applications. Using +`export COLUMNS=143 LINES=44` does work, but is a manual process. |