author    Thomas Schwinge <>  2013-07-10 23:39:29 +0200
committer Thomas Schwinge <>  2013-07-10 23:39:29 +0200
commit    9667351422dec0ca40a784a08dec7ce128482aba (patch)
parent    b8f6fb64171e205c9d4b4a5394e6af0baaf802dc (diff)
20 files changed, 2455 insertions, 20 deletions
diff --git a/community/gsoc/project_ideas/mtab.mdwn b/community/gsoc/project_ideas/mtab.mdwn
index 694effca..d6f04385 100644
--- a/community/gsoc/project_ideas/mtab.mdwn
+++ b/community/gsoc/project_ideas/mtab.mdwn
@@ -159,3 +159,13 @@ quite rudimentary, and it shouldn't be hard to find something to improve.
<braunr> what ?
<braunr> the content is generated on open
<kuldeepdhaka> ooh, ok
+## IRC, freenode, #hurd, 2013-06-04
+ <safinaskar> how to see list of all connected translators?
+ <braunr> you can't directly
+ <braunr> you can use ps to list processes and guess which are translators
+ <braunr> (e.g. everything starting with /hurd/)
+ <braunr> a recursive call to obtain such a list would be useful
+ <braunr> similar to what's needed to implement /proc/mounts
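A minimal sketch of the building block such a listing could start from: the Hurd's `file_get_translator` RPC (used e.g. by `showtrans`), which returns the passive translator record of a node looked up with `O_NOTRANS`. Recursively enumerating *active* translators, as suggested above, is exactly the part that is still missing; the error handling and command-line usage here are illustrative only.

    /* Print a node's passive translator record (sketch).  */
    #include <hurd.h>
    #include <hurd/fs.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <error.h>
    #include <stdio.h>

    int
    main (int argc, char **argv)
    {
      file_t node;
      char *trans = NULL;
      mach_msg_type_number_t trans_len = 0;
      error_t err;

      /* Look up the node itself, not what its translator presents.  */
      node = file_name_lookup (argv[1], O_NOTRANS, 0);
      if (node == MACH_PORT_NULL)
        error (1, errno, "%s", argv[1]);

      err = file_get_translator (node, &trans, &trans_len);
      if (err)
        error (1, err, "file_get_translator");

      /* TRANS is an argv-style, NUL-separated string, e.g. "/hurd/ext2fs\0...".  */
      printf ("%.*s\n", (int) trans_len, trans);
      return 0;
    }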
diff --git a/contributing/web_pages/news/qoth_next.mdwn b/contributing/web_pages/news/qoth_next.mdwn
index 749a42bb..935784ce 100644
--- a/contributing/web_pages/news/qoth_next.mdwn
+++ b/contributing/web_pages/news/qoth_next.mdwn
@@ -25,6 +25,17 @@ else="
<!--basic structure of a QotH entry. Adapt, reduce and add points as needed. At the end, try to make the text flow as a unified whole.-->
+IRC, freenode, #hurd, 2013-05-05, in context of libpthread conversion
+ <braunr> ArneBab_: which also involved fixing libpthread to correctly
+ handle timed waits and cancellation
+ <braunr> although that part was done in january this year
+IRC, freenode, #hurd, 2013-05-10, in context of libpthread conversion
+ <braunr> the "significant" changes i've done in libpthreads are actually
+ related to io_select, for Q1 2013 :)
This quarter [hurd hacker] [item]
Also …
diff --git a/faq/sata_disk_drives/discussion.mdwn b/faq/sata_disk_drives/discussion.mdwn
new file mode 100644
index 00000000..3f063b77
--- /dev/null
+++ b/faq/sata_disk_drives/discussion.mdwn
@@ -0,0 +1,234 @@
+[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+[[!tag open_issue_gnumach]]
+# IRC, freenode, #hurd, 2013-05-10
+ <braunr> what code have you used if any (or is it your own implementation)
+ ?
+ <youpi> I ended up writing my own implementation
+ <braunr> eh :)
+ <youpi> the libahci/ahci code from linux is full of linux-specific stuff
+ <youpi> it would mean working on gluing that
+ <youpi> which would rather be just done in block-dde
+ <youpi> I was told at fosdem that ahci is not actually very difficult
+ <youpi> and it isn't indeed
+ <braunr> that's why i usually encourage to use netbsd code
+ <braunr> any chance using ahci might speed up our virtual machines ?
+ <youpi> they are already using DMA, so probably no
+ <youpi> (with the driver I've pushed)
+ <youpi> adding support for tagged requests would permit submitting
+ several requests at a time
+ <youpi> _that_ could improve it
+ <youpi> (it would make it quite more complex too)
+ <youpi> but not so much actually
+ <anatoly> What about virtio? will it speed up?
+ <youpi> probably not so much
+ <youpi> because in the end it works the same
+ <youpi> the guest writes the physical address in mapped memory
+ <youpi> kvm performs the read into the pointed memory, triggers an irq
+ <youpi> the guest takes the irq, marks as done, starts the next request,
+ etc.
+ <youpi> most enhancements that virtio could bring can already be achieved
+ with ahci
+ <youpi> one can probably go further with virtio, but doing it with ahci
+ will also benefit bare hardware
+ <pinotree>
+ <youpi> anatoly: aka SATA
+ <anatoly> some sort of general protocol to work with any SATA drive via
+ AHCI-compatible host controller?
+ <braunr> yes
+ <youpi> braunr: I may be mistaken, but it does seem ahci is faster than ide
+ <youpi> possibly because the ide driver is full of hardcoded wait loops
+ <braunr> interesting :)
+ <youpi> usleeps here and there
+ <braunr> oh right
+ <braunr> i wonder how they're actually implemented
+ <youpi> so it would make sense to use that on shattrath
+ <youpi> a nasty buggy busy-loop
+ <braunr> yes but ending when ?
+ <youpi> when a given number of loops have elapsed
+ <youpi> that's where "buggy" applies :)
+ <braunr> ok so buggy implies the loop isn't correctly calibrated
+ <youpi> it isn't calibrated at all actually
+ <braunr> ew
+ <youpi> it was probably calibrated on some 486 or pentium hardware :)
+ <braunr> yeah that's what i imagined too
+ <braunr> we'll need some measurements but if it's actually true, it's even
+ better news
+## IRC, freenode, #hurd, 2013-05-11
+ <youpi> ah, also, worth mentioning: the AHCI driver supports up to 2TiB
+ disks
+ <youpi> (as opposed to our IDE driver which supports only LBA28, 128GiB)
+ <youpi> supporting more than 2TiB would require an RPC change, or using
+ bigger sectors
+ <youpi> (which wouldn't be a bad idea anyway)
+ <braunr> i think we should switch to uint64_t addressable vm_objects
+ <braunr> which would allow supporting large files too
+ <youpi> braunr: yep
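The limits quoted above follow directly from the addressing widths, assuming 512-byte logical sectors; a quick check:

    #include <stdio.h>
    #include <stdint.h>

    int
    main (void)
    {
      const uint64_t sector = 512;                   /* logical sector size */
      const uint64_t lba28 = (1ULL << 28) * sector;  /* 28-bit LBA, IDE driver */
      const uint64_t rpc32 = (1ULL << 32) * sector;  /* 32-bit record numbers in the RPC */

      printf ("LBA28 limit:      %llu GiB\n", (unsigned long long) (lba28 >> 30)); /* 128 */
      printf ("32-bit RPC limit: %llu TiB\n", (unsigned long long) (rpc32 >> 40)); /* 2 */
      return 0;
    }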
+## IRC, freenode, #hurd, 2013-05-13
+ <braunr> the hurd, running on vbox, with a sata controller :)
+ <braunr> hum, problem with an extended partition
+ <anatoly_> qemu/kvm doesn't have sata controller, am I right?
+ <braunr> anatoly: recent versions might
+ <braunr>
+ <braunr>
+ <anatoly> braunr: found first link, too. Thanx for the second one
+ <braunr>
+ <braunr> looks ok in recent versions
+ <braunr> looks useful to have virtio drivers though
+ <anatoly> virtio is shown as fastest way for IO in the presentation
+ <anatoly> Hm, failed to run qemu with AHCI enabled
+ <anatoly> qemu 1.1 from debian testing
+ <anatoly> youpi how do you run qemu with AHCI enabled?
+## IRC, freenode, #hurd, 2013-05-14
+ <anatoly> can somebody ask youpi how he runs qemu with AHCI please?
+ <gnu_srs> I think he used vbox? Did not find any AHCI option for kvm
+ (1.1.2+dfsg-6)
+ <anatoly> gnu_srs:
+ <anatoly> but it doesn't work for me the same version of kvm
+ <braunr_> anatoly: have you checked how the debian package builds it ?
+ <anatoly> braunr: mach sees AHCI device
+ <braunr> oh :)
+ <anatoly> the problem is in last option "-device
+ ide-drive,drive=disk,bus=ahci.0"
+ <anatoly> lvm says 'invalid option'
+ <braunr> anatoly: can you give more details please ?
+ <braunr> lvm ?
+ <anatoly> s/lvm/kvm
+ <braunr> i don't understand
+ <braunr> how can mach probe an ahci drive if you can't start kvm ?
+ <anatoly> I ran it without last option
+ <braunr> then why do you want that option ?
+ <anatoly> But, actually I entered command with mistake. I retried it and it
+ works. But got "start ext2fs: ext2fs: device:hd0s2: No such device or
+ address"
+ <anatoly> Sorry for confusing
+ <braunr> that's normal
+ <braunr> it should be sd0s2
+ <bddebian2> Right because the device names are different
+ <braunr> be aware that gnumach couldn't see my extended partitions when i
+ tried that yesterday
+ <braunr> i don't know what causes the problem
+ <anatoly> Yeah, I understand, I just note about it to show that it works
+ <braunr> :)
+ <anatoly> And I was wring
+ <anatoly> s/wring/wrong
+ <braunr> is that the version in wheezy ?
+ <anatoly> I'm using testing, but it's same
+ <braunr> great
+ <braunr> the VMs will soon use that then
+ <anatoly> I don't have extended partitions
+ <anatoly> Booted with AHCI! :-)
+ <anatoly> It freezes while downloading packages for build-essential
+ fake-root dependencies with AHCI enabled
+ <youpi> anatoly: is the IRQ of the ahci controller the same as for your
+ ethernet device? (you can see that in lspci -v)
+ <anatoly> youpi: will check
+ <anatoly> youpi both uses IRQ 111
+ <anatoly> s/111/11
+ <braunr> aw
+ <youpi> anatoly: ok, that might be why
+ <youpi> is this kvm?
+ <youpi> if so, you can set up a second ahci controller
+ <youpi> and attach devices to it
+ <youpi> so the irq is not the same
+ <youpi> basically, the issue is about dde disabling the irq
+ <youpi> during interrupt handler
+ <youpi> which conflicts with ahci driver needs
+## IRC, freenode, #hurd, 2013-05-15
+ <anatoly> youpi: yes, it's kvm. Will try a second ahci controller
+ <Slex> I read recently that an ahci driver was added, is it in userland
+ or kernel-land?
+ <gnu_srs> kernel-land, the change was in gnumach
+## IRC, freenode, #hurd, 2013-05-18
+ <youpi> about the IRQ conflict, it's simply that both dde and the ahci
+ driver need to disable it
+ <youpi> it needs to be coherent somehow
+## IRC, freenode, #hurd, 2013-05-20
+ <anatoly> gnu_srs: kvm -m 1G -drive
+ id=disk,file=<path_hurd_disk_img>,if=none,cache=writeback -device
+ ahci,id=ahci-1 -device ahci,id=ahci-2 -device
+ ide-drive,drive=disk,bus=ahci-2.0
+ <anatoly> who knows what the "ich9-ahci.multifunction=on/off" parameter
+ for kvm's ahci device means?
+ <anatoly> well, I was a bit incorrect :-) The options is relative to PCI
+ multifunction devices
+ <anatoly> s/options is relative/options relates
+## IRC, freenode, #hurd, 2013-05-24
+ <anatoly> I don't see freezes anymore while downloading packages with AHCI
+ enabled
+ <youpi> anatoly: by fixing the shared IRQ ?
+ <anatoly> youpi: yes, I added second AHCI as you suggested
+ <youpi> ok
+ <youpi> so it's probably the shared IRQ issue
+ <anatoly> NIC and AHCI have the same IRQ when only one AHCI is enabled
+ <anatoly> according to lspci output
+ <youpi> yes
+## IRC, freenode, #hurd, 2013-06-18
+ <braunr> youpi: is there a simple way from hurd to check interrupts ?
+ <youpi> what do you mean by "check interrupts" ?
+ <braunr> if they're shared
+ <youpi> I still don't understand :)
+ <braunr> i'm setting up sata
+ <youpi> ah, knowing the number
+ <braunr> yes
+ <youpi> you can read that from lspci -v
+ <braunr> ok
+ <braunr> thanks
+ <braunr> hum
+ <braunr> i get set root='hd-49,msdos1' in grub.cfg when changing the
+ file to point to sd0
+ <youpi> hum
+ <braunr> i wonder if it's necessary
+ <braunr> i guess i just have to tell gnumach to look for sd0, not grub
+ <braunr> youpi: the trick you mentioned was to add another controller, right
+ ?
+ <youpi> yes
+ <braunr> ok
+ <braunr> youpi: looks fine :)
+ <braunr> and yes, i left hd0 in grub's
+ <braunr> although i have lots of errors on hd0s6 (/home)
+ <braunr> youpi: there must be a bug with large sizes
+ <braunr> i'll stick with ide for now, but at least setting sata with
+ libvirt was quite easy to do
+ <braunr> so we can easily switch later
diff --git a/hurd/libfuse.mdwn b/hurd/libfuse.mdwn
index 45ff97ec..78e96022 100644
--- a/hurd/libfuse.mdwn
+++ b/hurd/libfuse.mdwn
@@ -29,6 +29,26 @@ etc.
* File I/O is quite slow.
+## IRC, freenode, #hurd, 2013-05-31
+ <zacts> well the reason I'm asking, is I'm wonder about the eventual
+ possibility of zfs on hurd
+ <pinotree> no, zfs surely not
+ <zacts> *wondering
+ <zacts> pinotree: would that be because of license incompatibilities, or
+ technical reasons?
+ <pinotree> the latter
+ <taylanub> It's just a matter of someone sitting down and implementing it
+ though, not ?
+ <pinotree> possibly
+ <braunr> zacts: the main problem seems to be the interactions between the
+ fuse file system and virtual memory (including caching)
+ <braunr> something the hurd doesn't excel at
+ <braunr> it *may* be possible to find existing userspace implementations
+ that don't use the system cache (e.g. implement their own)
+ <braunr> and they could almost readily use our libfuse version
# Source
[[source_repositories/incubator]], libfuse/master.
diff --git a/hurd/subhurd/discussion.mdwn b/hurd/subhurd/discussion.mdwn
index 6e694677..fac93625 100644
--- a/hurd/subhurd/discussion.mdwn
+++ b/hurd/subhurd/discussion.mdwn
@@ -170,3 +170,13 @@ License|/fdl]]."]]"""]]
<zacts> ah ok
<braunr> in theory, subhurds can run without root privileges
<braunr> (but there are currently a few things that prevent it)
+## IRC, freenode, #hurd, 2011-06-07
+ <zacts> would hurd jails be more powerful than FreeBSD jails? how so?
+ <braunr> not more powerful
+ <braunr> easier to develop
+ <braunr> safer
+ <braunr> perhaps more powerful too, but that entirely depends on the
+ features you want inside
diff --git a/hurd/translator/procfs/jkoenig/discussion.mdwn b/hurd/translator/procfs/jkoenig/discussion.mdwn
index d26f05f9..2ba98150 100644
--- a/hurd/translator/procfs/jkoenig/discussion.mdwn
+++ b/hurd/translator/procfs/jkoenig/discussion.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation,
+[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation,
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -14,12 +14,13 @@ License|/fdl]]."]]"""]]
-# Miscellaneous
+# `/proc/version`
-IRC, #hurd, around September 2010
+[[!taglink open_issue_documentation]]: edit and move to [[FAQ]].
+## IRC, freenode, #hurd, around 2010-09
- <youpi> jkoenig: is it not possible to provide a /proc/self which points at
- the client's pid?
<pinotree> (also, shouldn't /proc/version say something else than "Linux"?)
<youpi> to make linux tools work, no :/
<youpi> kfreebsd does that too
@@ -33,10 +34,103 @@ IRC, #hurd, around September 2010
<youpi> Linux version 2.6.16 ( (gcc version 4.3.5) #4 Sun
Dec 18 04:30:00 CET 1977
<pinotree> k
- <giselher> I had some problems with killall5 to read the pid from /proc, Is
- this now more reliable?
- <youpi> I haven't tested with jkoenig's implementation
- [...]
+## IRC, freenode, #hurd, 2013-06-04
+ <safinaskar> ?@?#@?$?@#???!?!?!?!??!?!?!?! why /proc/version on gnu system
+ reports "Linux version 2.6.1 (GNU 0.3...)"?
+ <braunr> safinaskar: because /proc/version is a linux thing
+ <braunr> applications using it don't expect to see anything else than linux
+ when parsing
+ <braunr> think of it as your web browser allowing you to set the user-agent
+ <safinaskar> braunr: yes, i just thought about user-agent, too
+ <safinaskar> braunr: but freebsd doesn't report it is linux (as far as i
+ know)
+ <braunr> their choice
+ <braunr> we could change it, but frankly, we don't care
+ <safinaskar> so why "uname" says "GNU" and not "Linux"?
+ <braunr> uname is posix
+ <braunr> note that /proc/version also includes GNU and GNU Mach/Hurd
+ versions
+ <safinaskar> if some program reads the word "Linux" from /proc/version, it
+ will assume it is linux. so, i think it is a bad idea
+ <braunr> why ?
+ <safinaskar> there is no standard /proc across unixen
+ <braunr> if a program reads /proc/version, it expects to be run on linux
+ <safinaskar> every unix implements its own /proc
+ <safinaskar> so, we don't need to create /proc which is fully compatible
+ with linux
+ <braunr> procfs doesn't by default
+ <safinaskar> instead, we can make /proc, which is partially compatible with
+ linux
+ <braunr> debian sets the -c compatibility flag
+ <braunr> that's what we did
+ <safinaskar> but /proc/version should really report kernel name and its
+ version
+ <braunr> why ?
+ <braunr> (and again, it does)
+ <safinaskar> because this is why /proc/version was created
+ <pinotree> no?
+ <braunr> on linux, yes
+ <braunr> pinotree: hm ?
+ <safinaskar> and /proc/version should not contain the "Linux" word, because
+ this is not Linux
+ <braunr> pinotree: no to what ? :)
+ <braunr> safinaskar: *sigh*
+ <braunr> i explained the choice to you
+ <pinotree> safinaskar: if you are using /proc/version to get the kernel
+ name and version, you're doing bad already
+ <braunr> disagree if you want
+ <braunr> but there is a point to using the word Linux there
+ <pinotree> safinaskar: there's the proper posix api for that, which is
+ uname
+ <safinaskar> pinotree: okey. so why did we ever implement /proc/version?
+ <braunr> it's a linux thing
+ <braunr> they probably wanted more than what the posix api was intended to
+ do
+ <safinaskar> okey, so why do we need this linux thing? there are a lot of
+ linux things which are useful in hurd. but not this thing. because this
+ is not linux. if we support /proc/version, we should not write "Linux"
+ to it
+ <pinotree> and even on freebsd their linprocfs (mounted on /proc) is not
+ mounted by default
+ <braunr> 10:37 < braunr> applications using it don't expect to see anything
+ else than linux when parsing
+ <braunr> 10:37 < braunr> think of it as your web brower allowing you to set
+ the user-agent
+ <braunr> safinaskar: the answer hasn't changed
+ <safinaskar> pinotree: but they don't export /proc/version with "Linux"
+ word in it anyway
+ <pinotree> safinaskar: they do
+ <safinaskar> pinotree: ??? their /proc/version contain Linux?
+ <pinotree> Linux version 2.6.16 ( (gcc version 4.6.3) #4
+ Sun Dec 18 04:30:00 CET 1977
+ <kilobug> safinaskar: it's like all web browsers reporting "mozilla" in
+ their UA, it may be silly, but it's how it is for
+ compatibility/historical reasons, and it's just not worth the trouble of
+ changing it
+ <pinotree> that's on a debian gnu/kfreebsd machine
+ <pinotree> and on a freebsd machine it is the same
+ <braunr> safinaskar: you should understand that parsing this string allows
+ correctly walking the rest of the /proc tree
+ <pinotree> and given such filesystem on freebsd is called "linprocfs", you
+ can already have a guess what it is for
+ <kilobug> safinaskar: saying "Linux version 2.6.1" just means "I'm
+ compatible with Linux 2.6.1 interfaces", like saying "Mozilla/5.0 (like
+ Gecko)" in the UA means "I'm a modern browser"
+ <safinaskar> so, are there really a lot of programs which expect the
+ "Linux" word in /proc/version even on non-linux platforms?
+ <braunr> no
+ <braunr> but when they do, they do
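The portable way to query the kernel name and version, as pinotree points out above, is the POSIX `uname` interface rather than parsing /proc/version; a minimal illustration:

    #include <stdio.h>
    #include <sys/utsname.h>

    int
    main (void)
    {
      struct utsname u;

      if (uname (&u) != 0)
        return 1;

      /* On GNU/Hurd this reports "GNU", not "Linux".  */
      printf ("%s %s %s\n", u.sysname, u.release, u.machine);
      return 0;
    }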
+# `/proc/self`
+## IRC, freenode, #hurd, around 2010-09
+ <youpi> jkoenig: is it not possible to provide a /proc/self which points at
+ the client's pid?
<pinotree> looks like he did 'self' too, see rootdir_entries[] in rootdir.c
<youpi> but it doesn't point at self
<antrik> youpi: there is no way to provide /proc/self, because the server
@@ -206,6 +300,8 @@ IRC, freenode, #hurd, 2011-07-25
i don't remember)
< pinotree> not a strict need
+See also [[community/gsoc/project_ideas/mtab]].
# `/proc/[PID]/auxv`, `/proc/[PID]/exe`, `/proc/[PID]/mem`
diff --git a/libpthread.mdwn b/libpthread.mdwn
index 801a1a79..0a518996 100644
--- a/libpthread.mdwn
+++ b/libpthread.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2012, 2013 Free Software Foundation,
+Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -60,6 +61,46 @@ even if the current number of threads is lower.
The same issue exists in [[hurd/libthreads]].
+### IRC, freenode, #hurd, 2013-05-09
+ <bddebian> braunr: Speaking of which, didn't you say you had another "easy"
+ task?
+ <braunr> bddebian: make a system call that both terminates a thread and
+ releases memory
+ <braunr> (the memory released being the thread stack)
+ <braunr> this way, a thread can completely terminate itself without the
+ assistance of a managing thread or deferring work
+ <bddebian> braunr: That's "easy" ? :)
+ <braunr> bddebian: since it's just a thread_terminate+vm_deallocate, it is
+ <braunr> something like thread_terminate_self
+ <bddebian> But a syscall not an RPC right?
+ <braunr> in hurd terminology, we don't make the distinction
+ <braunr> the only real syscalls are mach_msg (obviously) and some to get
+ well known port rights
+ <braunr> e.g. mach_task_self
+ <braunr> everything else should be an RPC but could be a system call for
+ performance
+ <braunr> since mach was designed to support clusters, it was necessary that
+ anything not strictly machine-local was an RPC
+ <braunr> and it also helps emulation a lot
+ <braunr> so keep doing RPCs :p
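A sketch of the problem being discussed, with a hypothetical `thread_terminate_self` standing in for the proposed call (the name and signature are illustrative, not an existing gnumach interface):

    #include <mach.h>

    /* What a detached thread would like to do when it exits.  */
    void
    exit_self (vm_address_t stack_base, vm_size_t stack_size)
    {
      /* Doing it in two steps is unsafe:

         vm_deallocate (mach_task_self (), stack_base, stack_size);
            -- the code after this call still runs on the freed stack;

         thread_terminate (mach_thread_self ());
            -- safe alone, but then the stack is never released, unless a
               managing thread or deferred work frees it later.

         Hence the proposal: one RPC that is both a thread_terminate and a
         vm_deallocate, performed atomically by the kernel, so no code ever
         runs on the freed stack.  */

      /* thread_terminate_self (stack_base, stack_size);  -- hypothetical */
    }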
+#### IRC, freenode, #hurd, 2013-05-10
+ <braunr> i'm not sure it should only apply to self though
+ <braunr> youpi: can we get a quick opinion on this please ?
+ <braunr> i've suggested bddebian to work on a new RPC that both terminates
+ a thread and releases its stack to help fix libpthread
+ <braunr> and initially, i thought of it as operating only on the calling
+ thread
+ <braunr> do you see any reason to make it work on any thread ?
+ <braunr> (e.g. a real thread_terminate + vm_deallocate)
+ <braunr> (or any reason not to)
+ <youpi> thread stack deallocation is always a burden indeed
+ <youpi> I'd tend to think it'd be useful, but perhaps ask the list
# Open Issues
[[!inline pages=tag/open_issue_libpthread raw=yes feeds=no]]
diff --git a/microkernel/mach/deficiencies.mdwn b/microkernel/mach/deficiencies.mdwn
index 1294b8b3..d1cdeb54 100644
--- a/microkernel/mach/deficiencies.mdwn
+++ b/microkernel/mach/deficiencies.mdwn
@@ -260,9 +260,9 @@ License|/fdl]]."]]"""]]
solve a number of problems... I just wonder how many others it would open
-# IRC, freenode, #hurd, 2012-09-04
+# X15
+## IRC, freenode, #hurd, 2012-09-04
<braunr> it was intended as a mach clone, but now that i have better
knowledge of both mach and the hurd, i don't want to retain mach
@@ -767,3 +767,1426 @@ In context of [[open_issues/multithreading]] and later [[open_issues/select]].
<braunr> imo, a rewrite is more appropriate
<braunr> sometimes, things done in x15 can be ported to the hurd
<braunr> but it still requires a good deal of effort
+## IRC, freenode, #hurd, 2013-04-26
+ <bddebian> braunr: Did I see that you are back tinkering with X15?
+ <braunr> well yes i am
+ <braunr> and i'm very satisfied with it currently, i hope i can maintain
+ the same level of quality in the future
+ <braunr> it can already handle hundreds of processors with hundreds of GB
+ of RAM in a very scalable way
+ <braunr> most algorithms are O(1)
+ <braunr> even waking up multiple threads is O(1) :)
+ <braunr> i'd like to implement rcu this summer
+ <bddebian> Nice. When are you gonna replace gnumach? ;-P
+ <braunr> never
+ <braunr> it's x15, not x15mach now
+ <braunr> it's not meant to be compatible
+ <bddebian> Who says it has to be compatible? :)
+ <braunr> i don't know, my head
+ <braunr> the point is, the project is about rewriting the hurd now, not
+ just the kernel
+ <braunr> new kernel, new ipc, new interfaces, new libraries, new everything
+ <bddebian> Yikes, now that is some work. :)
+ <braunr> well yes and no
+ <braunr> ipc shouldn't be that difficult/long, considering how simple i
+ want the interface to be
+ <bddebian> Cool.
+ <braunr> networking and drivers will simply be reused from another code
+ base like dde or netbsd
+ <braunr> so besides the kernel, it's a few libraries (e.g. a libports like
+ library), sysdeps parts in the c library, and a file system
+ <bddebian> For inclusion in glibc or are you not intending on using glibc?
+ <braunr> i intend to use glibc, but not for upstream integration, if that's
+ what you meant
+ <braunr> so a private, local branch i assume
+ <braunr> i expect that part to be the hardest
+## IRC, freenode, #hurd, 2013-05-02
+ <zacts> braunr: also, will propel/x15 use netbsd drivers or netdde linux
+ drivers?
+ <zacts> or both?
+ <braunr> probably netbsd drivers
+ <zacts> and if netbsd, will it utilize rump?
+ <braunr> i don't know yet
+ <zacts> ok
+ <braunr> device drivers and networking will arrive late
+ <braunr> the system first has to run in ram, with a truly configurable
+ boot process
+ <braunr> (i.e. a boot process that doesn't use anything static, and can
+ boot from either disk or network)
+ <braunr> rump looks good but it still requires some work since it doesn't
+ take care of messaging as well as we'd want
+ <braunr> e.g. signal relaying isn't that great
+ <zacts> I personally feel like using linux drivers would be cool, just
+ because linux supports more hardware than netbsd iirc..
+ <mcsim> zacts: But it could be problematic as you should take quite a lot
+ of code from linux kernel to add support even for a single driver.
+ <braunr> zacts: netbsd drivers are far more portable
+ <zacts> oh wow, interesting. yeah I did have the idea that netbsd would be
+ more portable.
+ <braunr> mcsim: that doesn't seem to be as big a problem as you might
+ suggest
+ <braunr> the problem is providing the drivers with their requirements
+ <braunr> there are a lot of different execution contexts in linux (hardirq,
+ softirq, bh, threads to name a few)
+ <braunr> being portable (as implied in netbsd) also means being less
+ demanding on the execution context
+ <braunr> which allows reusing code in userspace more easily, as
+ demonstrated by rump
+ <braunr> i don't really care about extensive hardware support, since this
+ is required only for very popular projects such as linux
+ <braunr> and hardware support actually comes with popularity (the driver
+ code base is related with the user base)
+ <zacts> so you think that more users will contribute if the projects takes
+ off?
+ <braunr> i care about clean and maintainable code
+ <braunr> well yes
+ <zacts> I think that's a good attitude
+ <braunr> what i mean is, there is no need for extensive hardware support
+ <mcsim> braunr: TBH, I did not really get the idea of rump. Do they try to
+ run the whole kernel or some chosen subsystems as user tasks?
+ <braunr> mcsim: some subsystems
+ <braunr> well
+ <braunr> all the subsystems required by the code they actually want to run
+ <braunr> (be it a file system or a network stack)
+ <mcsim> braunr: What's the difference with dde?
+ <braunr> it's not kernel oriented
+ <mcsim> what do you mean?
+ <braunr> it's not only meant to run on top of a microkernel
+ <braunr> as the author named it, it's "anykernel"
+ <braunr> if you remember at fosdem, he run code inside a browser
+ <braunr> ran*
+ <braunr> and also, netbsd drivers wouldn't restrict the license
+ <braunr> although not a priority, having a (would be) gnu system under
+ gplv3+ would be nice
+ <zacts> that would be cool
+ <zacts> x15 is already gplv3+
+ <zacts> iirc
+ <braunr> yes
+ <zacts> cool
+ <zacts> yeah, I would agree netbsd drivers do look more attractive in that
+ case
+ <braunr> again, that's clearly not the main reason for choosing them
+ <zacts> ok
+ <braunr> it could also cause other problems, such as accepting a bsd
+ license when contributing back
+ <braunr> but the main feature of the hurd isn't drivers, and what we want
+ to protect with the gpl is the main features
+ <zacts> I see
+ <braunr> drivers, as well as networking, would be third party code, the
+ same way you run e.g. firefox on linux
+ <braunr> with just a bit of glue
+ <zacts> braunr: what do you think of the idea of being able to do updates
+ for propel without rebooting the machine? would that be possible down the
+ road?
+ <braunr> simple answer: no
+ <braunr> that would probably require persistence, and i really don't want
+ that
+ <zacts> does persistence add a lot of complexity to the system?
+ <braunr> not with the code, but at execution, yes
+ <zacts> interesting
+ <braunr> we could add per-program serialization that would allow it but
+ that's clearly not a priority for me
+ <braunr> updating with a reboot is already complex enough :)
+## IRC, freenode, #hurd, 2013-05-09
+ <braunr> the thing is, i consider the basic building blocks of the hurd too
+ crappy to build anything really worth such effort over them
+ <braunr> mach is crappy, mig is crappy, signal handling is crappy, hurd
+ libraries are ok but incur a lot of contention, which is crappy today
+ <bddebian> Understood but it is all we have currently.
+ <braunr> i know
+ <braunr> and it's good as a prototype
+ <bddebian> We have already had L4, viengoos, etc and nothing has ever come
+ to fruition. :(
+ <braunr> my approach is completely different
+ <braunr> it's not a new design
+ <braunr> a few things like ipc and signals are redesigned, but that's minor
+ compared to what was intended for hurdng
+ <braunr> propel is simply meant to be a fast, scalable implementation of
+ the hurd high level architecture
+ <braunr> bddebian: imagine a mig you don't fear using
+ <braunr> imagine interfaces not constrained to 100 calls ...
+ <braunr> imagine per-thread signalling from the start
+ <bddebian> braunr: I am with you 100% but it's vaporware so far.. ;-)
+ <braunr> bddebian: i'm just explaining why i don't want to work on large
+ scale projects on the hurd
+ <braunr> fixing local bugs is fine
+ <braunr> fixing paging is mandatory
+ <braunr> usb could be implemented with dde, perhaps by sharing the pci
+ handling code
+ <braunr> (i.e. have one big dde server with drivers inside, a bit ugly but
+ straightforward compared to a full fledged pci server)
+ <bddebian> braunr: But this is the problem I see. Those of you that have
+ the skills don't have the time or energy to put into fixing that kind of
+ stuff.
+ <bddebian> braunr: That was my thought.
+ <braunr> bddebian: well i have time, and i'm currently working :p
+ <braunr> but not on that
+ <braunr> bddebian: also, it won't be vaporware for long, i may have ipc
+ working well by the end of the year, and optimized and developer-friendly
+ by next year)
+## IRC, freenode, #hurd, 2013-06-05
+ <braunr> i'll soon add my radix tree with support for lockless lookups :>
+ <braunr> a tree organized based on the values of the keys themselves, and
+ not how they relatively compare to each other
+ <braunr> also, a tree of arrays, which takes advantage of cache locality
+ without the burden of expensive resizes
+ <arnuld> you seem to be applying good algorithmic techniques
+ <arnuld> that is nice
+ <braunr> that's one goal of the project
+ <braunr> you can't achieve performance and scalability without the
+ appropriate techniques
+ <braunr> see
+ for the existing userspace implementation
+ <arnuld> in kern/work.c I see one TODO "allocate numeric IDs to better
+ identify worker threads"
+ <braunr> yes
+ <braunr> and i'm adding my radix tree now exactly for that
+ <braunr> (well not only, since radix tree will also back VM objects and IPC
+ spaces, two major data structures of the kernel)
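A toy version of such a structure, to make the idea concrete: an integer-keyed radix tree whose nodes are fixed-size arrays, so a lookup walks a constant number of levels determined by the key width, not by comparisons between keys, and never pays for a resize. This is only a sketch; the real x15 code adds height adjustment and lockless (RCU-style) lookups.

    #include <stddef.h>

    #define RDX_BITS    6                     /* key bits consumed per level */
    #define RDX_SIZE    (1 << RDX_BITS)       /* 64 slots per node */
    #define RDX_MASK    (RDX_SIZE - 1)
    #define RDX_LEVELS  ((64 + RDX_BITS - 1) / RDX_BITS)

    struct rdx_node {
        void *slots[RDX_SIZE];  /* child nodes; values at the last level */
    };

    /* O(RDX_LEVELS) lookup: each level indexes an array with a slice of
       the key, taking advantage of cache locality within a node.  */
    static void *
    rdx_lookup (const struct rdx_node *root, unsigned long long key)
    {
        const struct rdx_node *node = root;
        int shift;

        for (shift = (RDX_LEVELS - 1) * RDX_BITS; shift > 0; shift -= RDX_BITS) {
            if (node == NULL)
                return NULL;

            node = node->slots[(key >> shift) & RDX_MASK];
        }

        return (node == NULL) ? NULL : node->slots[key & RDX_MASK];
    }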
+## IRC, freenode, #hurd, 2013-06-11
+ <braunr> and also starting paging anonymous memory in x15 :>
+ <braunr> well, i've merged my radix tree code, made it safe for lockless
+ access (or so i hope), added generic concurrent work queues
+ <braunr> and once the basic support for anonymous memory is done, x15 will
+ be able to load modules passed from grub into userspace :>
+ <braunr> but i've also been thinking about how to solve a major scalability
+ issue with capability based microkernels that no one else seems to have
+ seen or bothered thinking about
+ <braunr> for those interested, the problem is contention at the port level
+ <braunr> unlike on a monolithic kernel, or a microkernel with thread-based
+ ipc such as l4, mach and similar kernels use capabilities (port rights in
+ mach terminology) to communicate
+ <braunr> the kernel then has to "translate" that reference into a thread to
+ process the request
+ <braunr> this is done by using a port set, putting many ports inside, and
+ making worker threads receive messages on the port set
+ <braunr> and in practice, this gets very similar to a traditional thread
+ pool model
+ <braunr> one thread actually waits for a message, while others sit on a
+ list
+ <braunr> when a message arrives, the receiving thread wakes another from
+ that list so it receives the next message
+ <braunr> this is all done with a lock
+ <bddebian> Maybe they thought about it but couldn't or were too lazy to find
+ a better way? :)
+ <mcsim> braunr: what do you mean under "unlike .... a microkernel with
+ thread-based ipc such as l4, mach and similar kernels use capabilities"?
+ L4 also has capabilities.
+ <braunr> mcsim: not directly
+ <braunr> capabilities are implemented by a server on top of l4
+ <braunr> unless it's OKL4 or another variant with capabilities back in the
+ kernel
+ <braunr> i don't know how fiasco does it
+ <braunr> so the problem with this lock is potentially very heavy contention
+ <braunr> and contention in what is the equivalent of a system call ..
+ <braunr> it's also hard to make it real-time capable
+ <braunr> for example, in qnx, they temporarily apply priority inheritance
+ to *every* server thread since they don't know which one is going to be
+ receiving next
+ <mcsim> braunr: in fiasco you have a capability pool for each thread and
+ this pool is stored in the thread control block. When one allocates a
+ capability the kernel just marks a slot in the pool as busy
+ <braunr> mcsim: ok but, there *is* a thread for each capability
+ <braunr> i mean, when doing ipc, there can only be one thread receiving the
+ message
+ <braunr> (iirc, this was one of the big issue for l4-hurd)
+ <mcsim> ok. i see the difference.
+ <braunr> well i'm asking
+ <braunr> i'm not so sure about fiasco
+ <braunr> but that's what i remember from the generic l4 spec
+ <mcsim> sorry, but where is the question?
+ <braunr> 16:04 < braunr> i mean, when doing ipc, there can only be one
+ thread receiving the message
+ <mcsim> yes, you specify capability to thread you want to send message to
+ <braunr> i'll rephrase:
+ <braunr> when you send a message, do you invoke a capability (as in mach),
+ or do you specify the receiving thread ?
+ <mcsim> you specify a thread
+ <braunr> that's my point
+ <mcsim> but you use local name (that is basically capability)
+ <braunr> i see
+ <braunr> from wikipedia: "Furthermore, Fiasco contains mechanisms for
+ controlling communication rights as well as kernel-level resource
+ consumption"
+ <braunr> not certain that's what it refers to, but that's what i understand
+ from it
+ <braunr> more capability features in the kernel
+ <braunr> but you still send to one thread
+ <mcsim> yes
+ <braunr> that's what makes it "easily" real time capable
+ <braunr> a microkernel that would provide mach-like semantics
+ (object-oriented messaging) but without contention at the messsage
+ passing level (and with resource preallocation for real time) would be
+ really great
+ <braunr> bddebian: i'm not sure anyone did
+ <bddebian> braunr: Well you can be the hero!! ;)
+ <braunr> the various papers i could find that were close to this subject
+ didn't take contention into account
+ <braunr> exception for network-distributed ipc on slow network links
+ <braunr> bddebian: eh
+ <braunr> well i think it's doable actually
+ <mcsim> braunr: can you elaborate on where contention is, because I do not
+ see this clearly?
+ <braunr> mcsim: let's take a practical example
+ <braunr> a file system such as ext2fs, that you know well enough
+ <braunr> imagine a large machine with e.g. 64 processors
+ <braunr> and an ignorant developer like ourselves issuing make -j64
+ <braunr> every file access performed by the gcc tools will look up files,
+ and read/write/close them, concurrently
+ <braunr> at the server side, thread creation isn't a problem
+ <braunr> we could have as many threads as clients
+ <braunr> the problem is the port set
+ <braunr> for each port class/bucket (let's assume they map 1:1), a port set
+ is created, and all receive rights for the objects managed by the server
+ (the files) are inserted in this port set
+ <braunr> then, the server uses ports_manage_port_operations_multithread()
+ to service requests on that port set
+ <braunr> with as many threads required to process incoming messages, much
+ the same way a work queue does it
+ <braunr> but you can't have *all* threads receiving at the same time
+ <braunr> there can only be one
+ <braunr> the others are queued
+ <braunr> i did a change about the queue order a few months ago in mach btw
+ <braunr> mcsim: see ipc/ipc_thread.c in gnumach
+ <braunr> this queue is shared and must be modified, which basically means a
+ lock, and contention
+ <braunr> so the 64 concurrent gcc processes will suffer from contention at
+ the server while they're doing something similar to a system call
+ <braunr> by that, i mean, even before the request is received
+ <braunr> mcsim: if you still don't understand, feel free to ask
+ <mcsim> braunr: I'm thinking on it :) give me some time
+ <braunr> "Fiasco.OC is a third generation microkernel, which evolved from
+ its predecessor L4/Fiasco. Fiasco.OC is capability based"
+ <braunr> ok
+ <braunr> so basically, there are no more interesting l4 variants strictly
+ following the l4v2 spec any more
+ <braunr> "The completely redesigned user-land environment running on top of
+ Fiasco.OC is called L4 Runtime Environment (L4Re). It provides the
+ framework to build multi-component systems, including a client/server
+ communication framework"
+ <braunr> so yes, client/server communication is built on top of the kernel
+ <braunr> something i really want to avoid actually
+ <mcsim> So when 1 core wants to pull something out of queue it has to lock
+ it, and the problem arrives when other 63 cpus are waiting in the same
+ lock. Right?
+ <braunr> mcsim: yes
+ <mcsim> could this be solved by implementing per cpu queues? Like in slab
+ allocator
+ <braunr> solved, no
+ <braunr> reduced, yes
+ <braunr> by using multiple port sets, each with their own thread pool
+ <braunr> but this would still leave core problems unsolved
+ <braunr> (those making real-time hard)
+ <mcsim> to make it real-time is not really essential to solve this problem
+ <braunr> that's the other way around
+ <mcsim> we just need to guarantee that locking protocol is fair
+ <braunr> solving this problem is required for quality real-time
+ <braunr> what you refer to is similar to what i described in qnx earlier
+ <braunr> it's ugly
+ <braunr> keep in mind that message passing is the equivalent of system
+ calls on monolithic kernels
+ <braunr> os ideally, we'd want something as close as possible to an
+ actual system call
+ <braunr> so*
+ <braunr> mcsim: do you see why it's ugly ?
+ <mcsim> no i meant exactly the opposite, I meant to use some deterministic
+ locking protocol
+ <braunr> please elaborate
+ <braunr> because what qnx does is deterministic
+ <mcsim> We know in what sequences threads will acquire the lock, so we will
+ not have to apply inheritance to all threads
+ <braunr> how do you know ?
+ <mcsim> there are different approaches, like you use ticket system or MCS
+ lock (
+ <braunr> that's still locking
+ <braunr> a system call has 0 contention
+ <braunr> 0 potential contention
+ <mcsim> in linux?
+ <braunr> everywhere i assume
+ <mcsim> then why do they need locks?
+ <braunr> they need locks after the system call
+ <braunr> the system call itself is a stupid trap that makes the thread
+ "jump" in the kernel
+ <braunr> and the reason why it's so simple is the same as in fiasco:
+ threads (clients) communicate directly with the "server thread"
+ (themselves in kernel mode)
+ <braunr> so 1/ they don't go through a capability or any other abstraction
+ <braunr> and 2/ they're even faster than on fiasco because they don't need
+ to find the destination, it's implied by the trap mechanism)
+ <braunr> 2/ is only an optimization that we can live without
+ <braunr> but 1/ is a serious bottleneck for microkernels
+ <mcsim> Do you mean that there are system calls that are processed without
+ locks, or do you mean that there are no system calls that use locks?
+ <braunr> this is what makes papers such as
+ valid
+ <braunr> i mean the system call (the mechanism used to query system
+ services) doesn't have to grab any lock
+ <braunr> the idea i have is to make the kernel transparently (well, as much
+ as it can be) associate a server thread to a client thread at the port
+ level
+ <braunr> at the server side, it would work practically the same
+ <braunr> the first time a server thread services a request, it's
+ automatically associated to a client, and subsequent request will
+ directly address this thread
+ <braunr> when the client is destroyed, the server gets notified and
+ destroys the associated server thread
+ <braunr> for real-time tasks, i'm thinking of using a signal that gets sent
+ to all servers, notifying them of the thread creation so that they can
+ preallocate the server thread
+ <braunr> or rather, a signal to all servers wishing to be notified
+ <braunr> or perhaps the client has to reserve the resources itself
+ <braunr> i don't know, but that's the idea
+ <mcsim> and who will send this signal?
+ <braunr> the kernel
+ <braunr> x15 will provide unix like signals
+ <braunr> but i think the client doing explicit reservation is better
+ <braunr> more complicated, but better
+ <braunr> real time developers ought to know what they're doing anyway
+ <braunr> mcsim: the trick is using lockless synchronization (like rcu) at
+ the port so that looking up the matching server thread doesn't grab any
+ lock
+ <braunr> there would still be contention for the very first access, but
+ that looks much better than having it every time
+ <braunr> (potential contention)
+ <braunr> it also simplifies writing servers a lot, because it encourages
+ the use of a single port set for best performance
+ <braunr> instead of burdening the server writer with avoiding contention
+ with e.g. a hierarchical scheme
+ <mcsim> "looking up the matching server" -- looking up where?
+ <braunr> in the port
+ <mcsim> but why can't you just take first?
+ <braunr> that's what triggers contention
+ <braunr> you have to look at the first
+ <mcsim> > (16:34:13) braunr: mcsim: do you see why it's ugly ?
+ <mcsim> BTW, not really
+ <braunr> imagine several clients send concurrently
+ <braunr> mcsim: well, qnx doesn't do it every time
+ <braunr> qnx boosts server threads only when there is no thread currently
+ receiving, and a sender with a higher priority arrives
+ <braunr> since qnx can't know which server thread is going to be receiving
+ next, it boosts every thread
+ <braunr> boosting priority is expensive, and boosting every thread is linear
+ with the number of threads
+ <braunr> so on a big system, it would be damn slow for a system call :)
+ <mcsim> ok
+ <braunr> and grabbing "the first" can't be properly done without
+ serialization
+ <braunr> if several clients send concurrently, only one of them gets
+ serviced by the "first server thread"
+ <braunr> the second client will be serviced by the "second" (or the first
+ if it came back)
+ <braunr> making the second become the first (i call it the manager) must be
+ atomic
+ <braunr> that's the core of the problem
+ <braunr> i think it's very important because that's currently one of the
+ fundamental differences with monolithic kernels
+ <mcsim> so looking up for server is done without contention. And just
+ assigning task to server requires lock, right?
+ <braunr> mcsim: basically yes
+ <braunr> i'm not sure it's that easy in practice but that's what i'll aim
+ at
+ <braunr> almost every argument i've read about microkernel vs monolithic is
+ full of crap
+ <mcsim> Do you mean lock on the whole queue or finer grained one?
+ <braunr> the whole port
+ <braunr> (including the queue)
+ <mcsim> why the whole port?
+ <braunr> how can you make it finer ?
+ <mcsim> is queue a linked list?
+ <braunr> yes
+ <mcsim> then can we just lock current element in the queue and elements
+ that point to current
+ <braunr> that's two locks
+ <braunr> and every sender will want "current"
+ <braunr> which then becomes coarse grained
+ <mcsim> but they want different current
+ <braunr> let's call them the manager and the spare threads
+ <braunr> yes, that's why there is a lock
+ <braunr> so they don't all get the same
+ <braunr> the manager is the one currently waiting for a message, while
+ spare threads are available but not doing anything
+ <braunr> when the manager finally receives a message, it takes the first
+ spare, which becomes the new manager
+ <braunr> exactly like in a common thread pool
+ <braunr> so what are you calling current ?
+ <mcsim> we have in a port queue of threads that wait for message: t1 -> t2
+ -> t3 -> t4; kernel decided to assign message to t3, then t3 and t2 are
+ locked.
+ <braunr> why not t1 and t2 ?
+ <mcsim> i was calling t3 in this example as current
+ <mcsim> some heuristics
+ <braunr> yeah well no
+ <braunr> it wouldn't be deterministic then
+ <mcsim> for instance client runs on core 3 and wants server that also runs
+ on core 3
+ <braunr> i really want the operation as close as a true system call as
+ possible, so O(1)
+ <braunr> what if there are none ?
+ <mcsim> it looks up forward up to the end of queue: t1->t2->t4; takes t4
+ <mcsim> then it starts from the beginning
+ <braunr> that becomes linear in the worst case
+ <mcsim> no
+ <braunr> so 4095 attempts on a 4096 cpus machine
+ <braunr> ?
+ <mcsim> you're right
+ <braunr> unfortunately :/
+ <braunr> a per-cpu scheme could be good
+ <braunr> and applicable
+ <braunr> with much more thought
+ <braunr> and the problem is that, unlike the kernel, which is naturally a
+ one-thread-per-cpu server, userspace servers may have fewer or more
+ threads than cpus
+ <braunr> possibly unbalanced too
+ <braunr> so it would result in complicated code
+ <braunr> one good thing with microkernels is that they're small
+ <braunr> they don't pollute the instruction cache much
+ <braunr> keeping the code small is important for performance too
+ <braunr> so forgetting this kind of optimization makes for not too
+ complicated code, and we rely on the scheduler to properly balance
+ threads
+ <braunr> mcsim: also note that, with your idea, the worst case is twice
+ as expensive as a single lock
+ <braunr> and on a machine with few processors, this worst case would be
+ likely
+ <mcsim> so, you propose every time try to take first server from the queue?
+ <mcsim> braunr: ^
+ <braunr> no
+ <braunr> that's what is done already
+ <braunr> i propose doing that the first time a client sends a message
+ <braunr> but then, the server thread that replied becomes strongly
+ associated to that client (it cannot service requests from other clients)
+ <braunr> and it can be recycled only when the client dies
+ <braunr> (which generates a signal indicating to the server that it can
+ now recycle the server thread)
+ <braunr> (a signal similar to the no-sender or dead-name notifications in
+ mach)
+ <braunr> that signal would be sent from the kernel, in the traditional unix
+ way (i.e. no dedicated signal thread since it would be another source of
+ contention)
+ <braunr> and the server thread would directly receive it, not interfering
+ with the other threads in the server in any way
+ <braunr> => contention on first message only
+ <braunr> now, for something like make -j64, which starts a different
+ process for each compilation (itself starting subprocesses for
+ preprocessing/compiling/assembling)
+ <braunr> it wouldn't be such a big win
+ <braunr> so even this first access should be optimized
+ <braunr> if you ever get an idea, feel free to share :)
+ <mcsim> May mach block thread when it performs asynchronous call?
+ <mcsim> braunr: ^
+ <braunr> sure
+ <braunr> but that's unrelated
+ <braunr> in mach, a sender is blocked only when the message queue is full
+ <mcsim> So we can introduce per cpu queues at the sender side
+ <braunr> (and mach_msg wasn't called in non blocking mode obviously)
+ <braunr> no
+ <braunr> they need to be delivered in order
+ <mcsim> In what order?
+ <braunr> messages can't be reorder once queued
+ <braunr> reordered
+ <braunr> so fifo order
+ <braunr> if you break the queue in per cpu queues, you may break that, or
+ need work to rebuild the order
+ <braunr> which negates the gain from using per cpu queues
+ <mcsim> Messages from the same thread will be kept in order
+ <braunr> are you sure ?
+ <braunr> and i'm not sure it's enough
+ <mcsim> thes cpu queues will be put to common queue once context switch
+ occurs
+ <braunr> *all* messages must be received in order
+ <mcsim> these*
+ <braunr> uh ?
+ <braunr> you want each context switch to grab a global lock ?
+ <mcsim> if you have parallel threads that send messages that do not have
+ dependencies then they are unordered
+ <mcsim> always
+ <braunr> the problem is they might
+ <braunr> consider auth for example
+ <braunr> you have one client attempting to authenticate itself to a server
+ through the auth server
+ <braunr> if message order is messed up, it just won't work
+ <braunr> but i don't have this problem in x15, since all ipc (except
+ signals) is synchronous
+ <mcsim> but it won't be messed up. You just "send" messages in O(1), but
+ then you put these messages that are not actually sent in queue all at
+ once
+ <braunr> i think i need more details please
+ <mcsim> you have lock on the port as it works now, not the kernel lock
+ <mcsim> the idea is to batch these calls
+ <braunr> i see
+ <braunr> batching can be effective, but it would really require queueing
+ <braunr> x15 only queues clients when there is no receiver
+ <braunr> i don't think batching can be applied there
+ <mcsim> you batch messages only from one client
+ <braunr> that's what i'm saying
+ <mcsim> so client can send several messages during his time slice and then
+ you put them into queue all together
+ <braunr> x15 ipc is synchronous, no more than 1 message per client at any
+ time
+ <braunr> there also are other problems with this strategy
+ <braunr> problems we have on the hurd, such as priority handling
+ <braunr> if you delay the reception of messages, you also delay priority
+ inheritance to the server thread
+ <braunr> well not the reception, the queueing actually
+ <braunr> but since batching is about delaying that, it's the same
+ <mcsim> if you use synchronous ipc then there is no sense in batching, at
+ least as I see it.
+ <braunr> yes
+ <braunr> 18:08 < braunr> i don't think batching can be applied there
+ <braunr> and i think sync ipc is the only way to go for a system intended
+ to provide messaging performance as close as possible to the system call
+ <mcsim> do you have as many server threads as cores?
+ <braunr> no
+ <braunr> as many server threads as clients
+ <braunr> which matches the monolithic model
+ <mcsim> in current implementation?
+ <braunr> no
+ <braunr> currently i don't have userspace :>
+ <mcsim> and what is in hurd atm?
+ <mcsim> in gnumach
+ <braunr> asyn ipc
+ <braunr> async
+ <braunr> with message queues
+ <braunr> no priority inheritance, simple "handoff" on message delivery,
+ that's all
+ <anatoly> I managed to read the conversation :-)
+ <braunr> eh
+ <braunr> anatoly: any opinion on this ?
+ <anatoly> braunr: I have no opinion. I understand it partially :-) But
+ association of threads sounds for me as good idea
+ <anatoly> But who am I to say what is good or what is not in that area :-)
+ <braunr> there still is this "first time" issue which needs at least one
+ atomic instruction
+ <anatoly> I see. Does mach do this "first time" thing every time?
+ <braunr> yes
+ <braunr> but gnumach is uniprocessor so it doesn't matter
+ <mcsim> if we have 1:1 relation for client and server threads we need only
+ per-cpu queues
+ <braunr> mcsim: explain that please
+ <braunr> and the problem here is establishing this relation
+ <braunr> with a lockless lookup, i don't even need per cpu queues
+ <mcsim> you said: (18:11:16) braunr: as many server threads as clients
+ <mcsim> how do you create server threads?
+ <braunr> pthread_create
+ <braunr> :)
+ <mcsim> ok :)
+ <mcsim> why and when do you create a server thread?
+ <braunr> there must be at least one unbound thread waiting for a message
+ <braunr> when a message is received, that thread knows it's now bound with
+ a client, and if needed wakes up/spawns another thread to wait for
+ incoming messages
+ <braunr> when it gets a signal indicating the death of the client, it knows
+ it's now unbound, and goes back to waiting for new messages
+ <braunr> becoming either the manager or a spare thread if there already is
+ a manager
+ <braunr> a timer could be used as it's done on the hurd to make unbound
+ threads die after a timeout
+ <braunr> the distinction between the manager and spare threads would only
+ be done at the kernel level
+ <braunr> the server would simply make unbound threads wait on the port set
+ <anatoly> How does a client send a signal to a thread about its death (as
+ I understand a signal is not a message) (sorry for noob question)
+ <mcsim> in what you described there are no queues at all
+ <braunr> anatoly: the kernel does it
+ <braunr> mcsim: there is, in the kernel
+ <braunr> the queue of spare threads
+ <braunr> anatoly: don't apologize for noob questions eh
+ <anatoly> braunr: is that client is a thread of some user space task?
+ <braunr> i don't think it's a newbie topic at all
+ <braunr> anatoly: a thread
+ <mcsim> make these queue per cpu
+ <braunr> why ?
+ <braunr> there can be a lot less spare threads than processors
+ <braunr> i don't think it's a good idea to spawn one thread per cpu per
+ port set
+ <braunr> on a large machine you'd have tons of useless threads
+ <mcsim> if you have many useless threads, then assign 1 thread to several
+ cores, thus you will have half as many threads
+ <mcsim> i mean dynamically
+ <braunr> that becomes a hierarchical model
+ <braunr> it does reduce contention, but it's complicated, and for now i'm
+ not sure it's worth it
+ <braunr> it could be a tunable though
+ <mcsim> if you want something fast you should use something complicated.
+ <braunr> really ?
+ <braunr> a system call is very simple and very fast
+ <braunr> :p
+ <mcsim> why is it fast?
+ <mcsim> you still have a lot of threads in kernel
+ <braunr> but they don't interact during the system call
+ <braunr> the system call itself is usually a simple instruction with most
+ of it handled in hardware
+ <mcsim> if you invoke "write" system call, what do you do in kernel?
+ <braunr> you look up the function address in a table
+ <mcsim> you still have queues
+ <braunr> no
+ <braunr> sorry wait
+ <braunr> by system call, i mean "the transition from userspace to kernel
+ space"
+ <braunr> and the return
+ <braunr> not the service itself
+ <braunr> the equivalent on a microkernel system is sending a message from a
+ client, and receiving it in a server, not processing the request
+ <braunr> ideally, that's what l4 does: switching from one thread to
+ another, as simply and quickly as the hardware can
+ <braunr> so just a context and address space switch
+ <mcsim> at some point you put something in queue even in monolithic kernel
+ and make request to some other kernel thread
+ <braunr> the problem here is the indirection that is the capability
+ <braunr> yes but that's the service
+ <braunr> i don't care about the service here
+ <braunr> i care about how the request reaches the server
+ <mcsim> this division exist for microkernels
+ <mcsim> for monolithic it's all mixed
+ <anatoly> What does thread do when it receive a message?
+ <braunr> anatoly: what it wants :p
+ <braunr> the service
+ <braunr> mcsim: ?
+ <braunr> mixed ?
+ <anatoly> braunr: hm, is it a thread of some server?
+ <mcsim> if you have several working threads in monolithic kernel you have
+ to put request in queue
+ <braunr> anatoly: yes
+ <braunr> mcsim: why would you have working threads ?
+ <mcsim> and there is no difference whether you consider it as the service
+ or just "transition from userspace to kernel space"
+ <braunr> i mean, it's a good thing to have, they usually do, but they're
+ not implied
+ <braunr> they're completely irrelevant to the discussion here
+ <braunr> of course there is
+ <braunr> you might very well perform system calls that don't involve
+ anything shared
+ <mcsim> you can also have only one working thread in microkernel
+ <braunr> yes
+ <mcsim> and all clients will wait for it
+ <braunr> you're mixing up work queues in the discussion here
+ <braunr> server threads are very similar to a work queue, yes
+ <mcsim> but you gave me an example with 64 cores and each core runs some
+ server thread
+ <braunr> they're a thread pool handling requests
+ <mcsim> you can have only one thread in a pool
+ <braunr> they have to exist in a microkernel system to provide concurrency
+ <braunr> monolithic kernels can process concurrently without them though
+ <mcsim> why?
+ <braunr> because on a monolithic system, _every client thread is its own
+ server_
+ <braunr> a thread making a system call is exactly like a client requesting
+ a service
+ <braunr> on a monolithic kernel, the server is the kernel
+ <braunr> and it *already* has as many threads as clients
+ <braunr> and that's pretty much the only thing beautiful about monolithic
+ kernels
+ <mcsim> right
+ <mcsim> have to think about it :)
+ <braunr> that's why they scale so easily compared to microkernel based
+ systems
+ <braunr> and why l4 people chose to have thread-based ipc
+ <braunr> but this just moves the problems to an upper level
+ <braunr> and is probably why they've realized one of the real values of
+ microkernel systems is capabilities
+ <braunr> and if you want to make them fast enough, they should be handled
+ directly by the kernel
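+
+A minimal sketch of the dispatch braunr describes above ("you look up the
+function address in a table"): on a monolithic kernel, once the hardware has
+performed the user/kernel transition, the service lookup is a plain table
+index. All names here are illustrative, not taken from any actual kernel.
+
+    /* Hypothetical syscall dispatch: the transition itself is done by
+       hardware; the kernel side merely indexes a table of handlers. */
+    typedef long (*syscall_handler_t)(long a0, long a1, long a2);
+
+    static long sys_write(long fd, long buf, long len) { return len; }
+
+    static const syscall_handler_t syscall_table[] = {
+        [1] = sys_write,
+    };
+
+    long syscall_dispatch(unsigned long nr, long a0, long a1, long a2)
+    {
+        if (nr >= sizeof(syscall_table) / sizeof(syscall_table[0])
+            || syscall_table[nr] == NULL)
+            return -1;  /* a real kernel would return -ENOSYS */
+        /* The handler runs in the calling thread's context: every client
+           thread is its own server, as noted above. */
+        return syscall_table[nr](a0, a1, a2);
+    }
+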
+## IRC, freenode, #hurd, 2013-06-13
+ <bddebian> Heya Richard. Solve the worlds problems yet? :)
+ <kilobug> bddebian: I fear the worlds problems are NP-complete ;)
+ <bddebian> heh
+ <braunr> bddebian: i wish i could solve mine at least :p
+ <bddebian> braunr: I meant the contention thing you were discussing the
+ other day :)
+ <braunr> bddebian: oh
+ <braunr> i have a solution that improves the behaviour yes, but there is
+ still contention the first time a thread performs an ipc
+ <bddebian> Any thread or the first time there is contention?
+ <braunr> there may be contention the first time a thread sends a message to
+ a server
+ <braunr> (assuming a server uses a single port set to receive requests)
+ <bddebian> Oh aye
+ <braunr> i think it's as much as can be done considering there is a
+ translation from capability to thread
+ <braunr> other schemes are just too heavy, and thus don't scale well
+ <braunr> this translation is one of the two important nice properties of
+ microkernel based systems, and translations (or indirections) usually have
+ a cost
+ <braunr> so we want to keep them
+ <braunr> and we have to accept that cost
+ <braunr> the amount of code in the critical section should be so small it
+ should only matter for machines with several hundreds or thousands of
+ processors
+ <braunr> so it's not such a big problem
+ <bddebian> OK
+ <braunr> but it would have been nice to have an additional valid
+ theoretical argument to explain how ipc isn't that slow compared to
+ system calls
+ <braunr> people keep saying l4 made ipc as fast as system calls without
+ taking that stuff into account
+ <braunr> which makes the community look lame in the eyes of those familiar
+ with it
+ <bddebian> heh
+ <braunr> with my solution, persistent applications like databases should
+ perform as fast as on an l4 like kernel
+ <braunr> but things like parallel builds, which start many different
+ processes for each file, will suffer a bit more from contention
+ <braunr> seems like a fair compromise to me
+ <bddebian> Aye
+ <braunr> as mcsim said, there is a lot of contention just about everywhere
+ in almost every application
+ <braunr> and lockless stuff is hard to correctly implement
+ <braunr> so it should be all right :)
+ <braunr> ... :)
+ <mcsim> braunr: What if we have at least 1 thread for each core that stays
+ in a per-core queue. When we decide to kill a thread and this thread is
+ the last in a queue, we replace it with the load balancer. This is still
+ worse than with a monolithic kernel, but it is simpler to implement from
+ the kernel perspective.
+ <braunr> mcsim: it doesn't scale well
+ <braunr> you end up with one thread per cpu per port set
+ <mcsim> load balancer is only one thread
+ <mcsim> why would it end up like you said?
+ <braunr> remember the goal is to avoid contention
+ <braunr> your proposition is to set per cpu queues
+ <braunr> the way i understand what you said, it means clients will look up
+ a server thread in these queues
+ <braunr> one of them actually, the one for the cpu they're currently
+ running on
+ <braunr> so 1/ it disables migration
+ <braunr> or 2/ you have one server thread per client per cpu
+ <braunr> i don't see what a "load balancer" would do here
+ <mcsim> the client either finds a server thread without contention or it
+ sends a message to the load balancer, which redirects the message to a
+ thread from the global queue, where the global queue is a concatenation
+ of the local ones.
+ <braunr> you can't concatenate local queues in a global one
+ <braunr> if you do that, you end up with a global queue, and a global lock
+ again
+ <mcsim> not global
+ <mcsim> load balancer is just one
+ <braunr> then you serialize all remote messaging through a single thread
+ <mcsim> so contention will be only among local threads and the load
+ balancer
+ <braunr> i don't see how it doesn't make the load balancer global
+ <mcsim> it makes
+ <mcsim> but it just makes bootstrapping harder
+ <braunr> i'm not following
+ <braunr> and i don't see how it improves on my solution
+ <mcsim> in your example with make -j64 very soon there will be local
+ threads on every core
+ <braunr> yes, hence the lack of scalability
+ <mcsim> but that's your goal: create as many server threads as clients you
+ have, isn't it?
+ <braunr> your solution may create a lot more
+ <braunr> again, one per port set (or server) per cpu
+ <braunr> imagine this worst case: you have a single client with one thread
+ <braunr> which gets migrated to every cpu on the machine
+ <braunr> it will spawn one thread per cpu at the server side
+ <mcsim> why would it migrate all the time?
+ <braunr> it's a worst case
+ <braunr> if it can migrate, consider it will
+ <braunr> murphy's law, you know
+ <braunr> also keep in mind contention doesn't always occur with a global
+ lock
+ <braunr> i'm talking about potential contention
+ <braunr> and same things apply: if it can happen, consider it will
+ <mcsim> then we can make a load balancer that also migrates server threads
+ <braunr> ok so in addition to worker threads, we'll add an additional per
+ server load balancer which may have to lock several queues at once
+ <braunr> doesn't it feel completely overkill to you ?
+ <mcsim> load balancer is global, not per-cpu
+ <mcsim> there could be contention for it
+ <braunr> again, keep in mind this problem becomes important for several
+ hundreds processors, not below
+ <braunr> yes but it has to balance
+ <braunr> which means it has to lock cpu queues
+ <braunr> and at least two of them to "migrate" server threads
+ <braunr> and i don't know why it would do that
+ <braunr> i don't see the point of the load balancer
+ <mcsim> so, you start make -j64. First 64 invocations of gcc will suffer
+ from contention for load balancer, but later on it will create enough
+ server threads and contention will disappear
+ <braunr> no
+ <braunr> that's the best case : there is always one server thread per cpu
+ queue
+ <braunr> how do you guarantee your 64 server threads don't end up in the
+ same cpu queue ?
+ <braunr> (without disabling migration)
+ <mcsim> load balancer will try to put some server thread to the core where
+ load balancer was invoked
+ <braunr> so there is no guarantee
+ <mcsim> LB can pin server thread
+ <braunr> unless we invoke it regularly, in a way similar to what is already
+ done in the SMP scheduler :/
+ <braunr> and this also means one balancer per cpu then
+ <mcsim> why one balancer per cpu?
+ <braunr> 15:56 < mcsim> load balancer will try to put some server thread to
+ the core where load balancer was invoked
+ <braunr> why only where it was invoked ?
+ <mcsim> because it assumes that if someone asked for a server at core x, it
+ most likely will ask for the same service from the same core
+ <braunr> i'm not following
+ <mcsim> LB just tries to prefetch where the next call will be
+ <braunr> what you're describing really looks like per-cpu work queues ...
+ <braunr> i don't see how you make sure there aren't too many threads
+ <braunr> i don't see how a load balancer helps
+ <braunr> this is just a heuristic
+ <mcsim> when server thread is created?
+ <mcsim> who creates it?
+ <braunr> and it may be useless, depending on how threads are migrated and
+ when they call the server
+ <braunr> same answer as yesterday
+ <braunr> there must be at least one thread receiving messages on a port set
+ <braunr> when a message arrives, if there aren't any spare threads, it
+ spawns one to receive messages while it processes the request
+ <mcsim> at the moment server threads are killed by timeout, right?
+ <braunr> yes
+ <braunr> well no
+ <braunr> there is a debian patch that disables that
+ <braunr> because there is something wrong with thread destruction
+ <braunr> but that's an implementation bug, not a design issue
+ <mcsim> so that is the mechanism by which we ensure that there aren't too
+ many threads
+ <mcsim> it helps because yesterday I proposed a hierarchical scheme, where
+ one server thread could wait in the cpu queues of several cores
+ <mcsim> but this has to be implemented in kernel
+ <braunr> a hierarchical scheme would help yes
+ <braunr> a bit
+ <mcsim> i propose a scheme that could be implemented in userspace
+ <braunr> ?
+ <mcsim> kernel should not distinguish among load balancer and server thread
+ <braunr> sorry this is too confusing
+ <braunr> please start describing what you have in mind from the start
+ <mcsim> ok
+ <mcsim> so my starting point was to use hierarchical management
+ <mcsim> but the drawback was that to implement it you have to do this in
+ kernel
+ <mcsim> right?
+ <braunr> no
+ <mcsim> so I thought how can this be implemented in user space
+ <braunr> being in kernel isn't the problem
+ <braunr> contention is
+ <braunr> on the contrary, i want ipc in kernel exactly because that's where
+ you have the most control over how it happens
+ <braunr> and can provide the best performance
+ <braunr> ipc is the main kernel responsibility
+ <mcsim> but if you have few clients you have low contention
+ <braunr> the goal was "0 potential contention"
+ <mcsim> and if you have many clients, you have many servers
+ <braunr> let's say server threads
+ <braunr> for me, a server is a server task or process
+ <mcsim> right
+ <braunr> so i think 0 potential contention is just impossible
+ <braunr> or it requires too many resources that make the solution not
+ scalable
+ <mcsim> 0 contention is impossible, since you have an imbalance in the
+ numbers of client threads and server threads
+ <braunr> well no
+ <braunr> it *can* be achieved
+ <braunr> imagine servers register themselves to the kernel
+ <braunr> and the kernel signals them when a client thread is spawned
+ <braunr> you'd effectively have one server thread per client
+ <braunr> (there would be other problems like e.g. when a server thread
+ becomes the client of another, etc..)
+ <braunr> so it's actually possible
+ <braunr> but we clearly don't want that, unless perhaps for real time
+ threads
+ <braunr> but please continue
+ <mcsim> what does "and the kernel signals them when a client thread is
+ spawned" mean?
+ <braunr> it means each time a thread not part of a server thread is
+ created, servers receive a signal meaning "hey, there's a new thread out
+ there, you might want to preallocate a server thread for it"
+ <mcsim> and what is the difference with creating thread on demand?
+ <braunr> on demand can occur when receiving a message
+ <braunr> i.e. during syscall
+ <mcsim> I will continue, I just want to be sure that I'm not basing this on
+ wrong assumptions.
+ <mcsim> and what is bad in that?
+ <braunr> (just to clarify, i use the word "syscall" with the same meaning
+ as "RPC" on a microkernel system, whereas it's a true syscall on a
+ monolithic one)
+ <braunr> contention
+ <braunr> whether you have contention on a list of threads or on map entries
+ when allocating a stack doesn't matter
+ <braunr> the problem is contention
+ <mcsim> and if we create server thread always?
+ <mcsim> and do not keep them in queue?
+ <braunr> always ?
+ <mcsim> yes
+ <braunr> again
+ <braunr> you'd have to allocate a stack for it
+ <braunr> every time
+ <braunr> so two potentially heavy syscalls to allocate/free the stack
+ <braunr> not to mention the thread itself, its associations with its task,
+ ipc space, maintaining reference counts
+ <braunr> (moar contention)
+ <braunr> creating threads was considered cheap at the time the process was
+ the main unit of concurrency
+ <mcsim> ok, then we will have the same contention if we create a thread
+ when "the kernel signals them when a client thread is spawned"
+ <braunr> now we have work queues / thread pools just to avoid that
+ <braunr> no
+ <braunr> because that contention happens at thread creation
+ <braunr> not during a syscall
+ <braunr> i'll redefine the problem: the problem is contention during a
+ system call / IPC
+ <mcsim> ok
+ <braunr> note that my current solution is very close to signalling every
+ server
+ <braunr> it's the lazy version
+ <braunr> match at first IPC time
+ <mcsim> so I was basing my plan on the case when we create a new thread
+ when a client makes a syscall and there are not enough server threads
+ <braunr> the problem exists even when there is enough server threads
+ <braunr> we shouldn't consider the case where there aren't enough server
+ threads
+ <braunr> real time tasks are the only ones which want that, and can
+ preallocate resources explicitly
+ <mcsim> I think that real time tasks should be really separated
+ <mcsim> For them resource availability is much more important than good
+ resource utilisation.
+ <mcsim> So if we talk about real time tasks we should apply one policy and
+ for non-real time another
+ <mcsim> So it shouldn't be critical if a thread is created during a syscall
+ <braunr> agreed
+ <braunr> that's what i was saying :
+ <braunr> :)
+ <braunr> 16:23 < braunr> we shouldn't consider the case where there aren't
+ enough server threads
+ <braunr> in this case, we spawn a thread, and that's ok
+ <braunr> it will live on long enough that we really don't care about the
+ cost of lazily creating it
+ <braunr> so let's concentrate only on the case where there already are
+ enough server threads
+ <mcsim> So if client makes a request to ST (is it ok to use abbreviations?)
+ there are several cases:
+ <mcsim> 1/ There is ST waiting on local queue (trivial case)
+ <mcsim> 2/ There is no ST, only load balancer (LB). LB decides to create a
+ new thread
+ <mcsim> 3/ Like in previous case, but LB decides to perform migration
+ <braunr> migration of what ?
+ <mcsim> migration of ST from other core
+ <braunr> the only case effectively solving the problem is 1
+ <braunr> others introduce contention, and worse, complex code
+ <braunr> i mean a complex solution
+ <braunr> not only code
+ <braunr> even the addition of a load balancer per port set
+ <braunr> the data structures involved for proper migration
+ <mcsim> But 2 and 3 in long run will lead to having enough threads on all
+ cores
+ <braunr> then you end up having 1 per client per cpu
+ <mcsim> migration is needed in any case
+ <braunr> no
+ <braunr> why would it be ?
+ <mcsim> to balance load
+ <mcsim> not only for this case
+ <braunr> there already is load balancing in the scheduler
+ <braunr> we don't want to duplicate its function
+ <mcsim> what kind of load balancing does the scheduler have?
+ <braunr> thread weight / cpu
+ <mcsim> and does it perform migration?
+ <braunr> sure
+ <mcsim> so the scheduler can be simplified if the "when to migrate" policy
+ is moved to user space
+ <braunr> this is becoming a completely different problem
+ <braunr> and i don't want to do that
+ <braunr> it's very complicated for no real world benefit
+ <mcsim> but all this will be done in userspace
+ <braunr> ?
+ <braunr> all what ?
+ <mcsim> migration decisions
+ <braunr> in your scheme you mean ?
+ <mcsim> yes
+ <braunr> explain how
+ <mcsim> LB will decide when thread will migrate
+ <mcsim> and LB is user space task
+ <braunr> what does it bring ?
+ <braunr> imagine that, in the mean time, the scheduler then decides the
+ client should migrate to another processor for fairness
+ <braunr> you'd have migrated a server thread once for no actual benefit
+ <braunr> or again, you need to disable migration for long durations, which
+ sucks
+ <braunr> also
+ <braunr> 17:06 < mcsim> But 2 and 3 in long run will lead to having enough
+ threads on all cores
+ <braunr> contradicts the need for a load balancer
+ <braunr> if you have enough threads every where, why do you need to balance
+ ?
+ <mcsim> and how are you going to deal with the case when client will
+ migrate all the time?
+ <braunr> i intend to implement something close to thread migration
+ <mcsim> because some of them can die because of timeout
+ <braunr> something l4 already does iirc
+ <braunr> the thread scheduler manages scheduling contexts
+ <braunr> which can be shared by different threads
+ <braunr> which means the server thread bound to its client will share the
+ scheduling context
+ <braunr> the only thing that gets migrated is the scheduling context
+ <braunr> the same way a thread can be migrated indifferently on a
+ monolithic system, whether it's in user or kernel space (with kernel
+ preemption enabled ofc)
+ <mcsim> but how can a server thread process requests from different
+ clients?
+ <braunr> mcsim: load becomes a problem when there are too many threads, not
+ when they're dying
+ <braunr> they can't
+ <braunr> at first message, they're *bound*
+ <braunr> => one server thread per client
+ <braunr> when the client dies, the server thread is unbound and can be
+ recycled
+ <mcsim> and you intend to put recycled threads to global queue, right?
+ <braunr> yes
+ <mcsim> and I propose to put them in local queues in hope that next client
+ will be on the same core
+ <braunr> the thing is, i don't see the benefit
+ <braunr> next client could be on another
+ <braunr> in which case it gets a lot heavier than the extremely small
+ critical section i have in mind
+ <mcsim> but most likely it could be on the same
+ <braunr> uh, no
+ <mcsim> because of this the load on this core is decreased
+ <braunr> well, ok, it would likely remain on the same cpu
+ <braunr> but what happens when it migrates ?
+ <braunr> and what about memory usage ?
+ <braunr> one queue per cpu per port set can get very large
+ <braunr> (i understand the proposition better though, i think)
+ <mcsim> we can ask also "What if random access in memory will be more usual
+ than sequential?", but we still optimise sequential one, making random
+ sometimes even worse. The real question is "How can we maximise benefit
+ of knowledge where free server thread resides?"
+ <mcsim> previous was reply to: "(17:17:08) braunr: but what happens when it
+ migrates ?"
+ <braunr> i understand
+ <braunr> you optimize for the common case
+ <braunr> where a lot more ipc occurs than migrations
+ <braunr> agreed
+ <braunr> now, what happens when the server thread isn't in the local queue
+ ?
+ <mcsim> than client request will be handled to LB
+ <braunr> why not search directly itself ?
+ <braunr> (and btw, the right word is "then")
+ <mcsim> LB can decide whom to migrate
+ <mcsim> right, sorry
+ <braunr> i thought you were improving on my scheme
+ <braunr> which implies there is a 1:1 mapping for client and server threads
+ <mcsim> If the job of the LB is too small then it can be removed and
+ everything will be done in kernel
+ <braunr> it can't be done in userspace anyway
+ <braunr> these queues are in the port / port set structures
+ <braunr> it could be done though
+ <braunr> i mean
+ <braunr> using per cpu queues
+ <braunr> server threads could be both in per cpu queues and in a global
+ queue as long as they exist
+ <mcsim> there should be no global queue, because there again will be
+ contention for it
+ <braunr> mcsim: accessing a load balancer implies contention
+ <braunr> there is contention anyway
+ <braunr> what you're trying to do is reduce it in the first message case if
+ i'm right
+ <mcsim> braunr: yes
+ <braunr> well then we have to revise a few assumptions
+ <braunr> 17:26 < braunr> you optimize for the common case
+ <braunr> 17:26 < braunr> where a lot more ipc occurs than migrations
+ <braunr> that actually becomes wrong
+ <braunr> the first message case occurs for newly created threads
+ <mcsim> for make -j64 this is actually common case
+ <braunr> and those are usually not spawned on the processor their parent
+ runs on
+ <braunr> yes
+ <braunr> if you need all processors, yes
+ <braunr> i don't think taking into account this property changes many
+ things
+ <braunr> per cpu queues still remain the best way to avoid contention
+ <braunr> my problem with this solution is that you may end up with one
+ unbound thread per processor per server
+ <braunr> also, i say "per server", but it's actually per port set
+ <braunr> and even per port depending on how a server is written
+ <braunr> (the system will use one port set for one server in the common
+ case but still)
+ <braunr> so i'll start with a global queue for unbound threads
+ <braunr> and the day we decide it should be optimized with local (or
+ hierarchical) queues, we can still do it without changing the interface
+ <braunr> or by simply adding an option at port / port set creation
+ <braunr> which is a non-intrusive change
+ <mcsim> ok. your solution should be simpler. And TBH, what I propose is
+ not clearly much more gainful.
+ <braunr> well it is actually for big systems
+ <braunr> it is because instead of grabbing a lock, you disable preemption
+ <braunr> which means writing to a local, uncontended variable
+ <braunr> with 0 risk of cache line bouncing
+ <braunr> this actually looks very good to me now
+ <braunr> using an option to control this behaviour
+ <braunr> and yes, in the end, it gets very similar to the slab allocator,
+ where you can disable the cpu pool layer with a flag :)
+ <braunr> (except the serialized case would be the default one here)
+ <braunr> mcsim: thanks for insisting
+ <braunr> or being persistent
+ <mcsim> braunr: thanks for conversation :)
+ <mcsim> and probably I should have started from the statement that I
+ wanted to improve the common case
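+
+A rough sketch of the approach braunr settles on above, assuming
+hypothetical preemption and spinlock primitives (none of these names come
+from the actual code): unbound server threads are looked up in a per-cpu
+cache with preemption disabled, i.e. without any lock, and a locked global
+queue is only the fallback.
+
+    #include <stddef.h>
+
+    #define NR_CPUS 64
+
+    struct thread { struct thread *next; };
+
+    /* Assumed primitives; a real kernel provides these. */
+    void preempt_disable(void);
+    void preempt_enable(void);
+    unsigned int cpu_id(void);
+    void spin_lock(int *lock);
+    void spin_unlock(int *lock);
+
+    static struct thread *percpu_cache[NR_CPUS]; /* uncontended, per-cpu */
+    static struct thread *global_head;           /* fallback, locked */
+    static int global_lock;
+
+    struct thread *server_thread_get(void)
+    {
+        struct thread *t;
+        unsigned int cpu;
+
+        preempt_disable();      /* a local, uncontended write: no risk of
+                                   cache line bouncing */
+        cpu = cpu_id();
+        t = percpu_cache[cpu];
+        percpu_cache[cpu] = NULL;
+        preempt_enable();
+
+        if (t == NULL) {        /* slow path: global queue, under a lock */
+            spin_lock(&global_lock);
+            t = global_head;
+            if (t != NULL)
+                global_head = t->next;
+            spin_unlock(&global_lock);
+        }
+        return t;
+    }
+
+As with the slab allocator's cpu pool layer mentioned above, the per-cpu
+cache could be enabled or disabled with an option at port / port set
+creation.
+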
+## IRC, freenode, #hurd, 2013-06-20
+ <congzhang> braunr: how about your x15, is it an improvement of mach or a
+ redesign? I really want to know that:)
+ <braunr> it's both largely based on mach and now quite far from it
+ <braunr> based on mach from a functional point of view
+ <braunr> i.e. the kernel assumes practically the same functions, with a
+ close interface
+ <congzhang> Good point:)
+ <braunr> except for ipc which is entirely rewritten
+ <braunr> why ? :)
+ <congzhang> for "from a functional point of view":) I think each design has
+ its intrinsic advantages and disadvantages
+ <braunr> but why is it good ?
+ <congzhang> if it's a redesign, I may need to wait more time for a new
+ functional hurd
+ <braunr> you'll have to wait a long time anyway :p
+ <congzhang> Improvement was better sometimes, although redesign was more
+ attractive sometimes :)
+ <congzhang> I will wait :)
+ <braunr> i wouldn't put that as a reason for it being good
+ <braunr> this is a departure from what current microkernel projects are
+ doing
+ <braunr> i.e. x15 is a hybrid
+ <congzhang> Sure, it is good from design too:)
+ <braunr> yes but i don't see why you say that
+ <congzhang> Sorry, i did not show my view clearly, it is good from design
+ too:)
+ <braunr> you're just saying it's good, you're not saying why you think it's
+ good
+ <congzhang> I would like to talk hybrid, I want to talk about that, but I
+ am a little afraid that you are all enthusiastic microkernel fans
+ <braunr> well no i'm not
+ <braunr> on the contrary, i'm personally opposed to the so called
+ "microkernel dogma"
+ <braunr> but i can give you reasons why, i'd like you to explain why *you*
+ think a hybrid design is better
+ <congzhang> so, when I talk apple or nextstep, I got one soap :)
+ <braunr> that's different
+ <braunr> these are still monolithic kernels
+ <braunr> well, monolithic systems running on a microkernel
+ <congzhang> yes, I view this as one type of hybrid
+ <braunr> no it's not
+ <congzhang> microkernel wants to divide processes (tasks) from the design
+ view, It is great
+ <congzhang> as the implementation view or execution view, we have one cpu
+ and some physical memory; as the simplest condition, we can't change that
+ <congzhang> that's what resources the system has
+ <braunr> what's your point ?
+ <congzhang> I view this as follow
+ <congzhang> I am cpu and computer
+ <congzhang> application are the things I need to do
+ <congzhang> for running the program and finishing the job, which way is
+ the best way for me
+ <congzhang> I need to keep all the things as simple as possible, divide
+ just from the application design view, for me no different
+ <congzhang> design was microkernel, run just for one cpu and these
+ resources.
+ <braunr> (well there can be many processors actually)
+ <congzhang> I know, I mean hybrid at some level, we can't escape that
+ <congzhang> braunr: did I show my point?
+ <braunr> well l4 systems showed we somehow can
+ <braunr> no you didn't
+ <congzhang> x15's api was rpc, right?
+ <braunr> yes
+ <braunr> well a few system calls, and mostly rpcs on top of the ipc one
+ <braunr> just as with mach
+ <congzhang> and you hope the target logic runs locally just like an
+ in-process function call, right?
+ <braunr> no
+ <braunr> it can't run locally
+ <congzhang> you need thread context switch
+ <braunr> and address space context switch
+ <congzhang> but you cut down the cost
+ <braunr> how so ?
+ <congzhang> I mean you do it, right?
+ <congzhang> x15
+ <braunr> yes but not in this way
+ <braunr> in every other way :p
+ <congzhang> I know, you remember performance anywhere :p
+ <braunr> i still don't see your point
+ <braunr> i'd like you to tell, in one sentence, why you think hybrids are
+ better
+ <congzhang> balance the design and implementation problem :p
+ <braunr> which is ?
+ <congzhang> hybrid for kernel arch
+ <braunr> you're stating the solution inside the problem
+ <congzhang> you are good at mathematics
+ <congzhang> sorry, I am not native english speaker
+ <congzhang> braunr: I will find some more suitable sentence to show my
+ point some day, but I can't find one if you think I did not show my
+ point:)
+ <congzhang> for today
+ <braunr> too bad
+ <congzhang> If i am computer I hope the arch was monolithic, If i am
+ programmer I hope the arch was microkernel, that's my idea
+ <braunr> ok let's get a bit faster
+ <braunr> monolithic for performance ?
+ <congzhang> braunr: sorry for that, and thank you for the talk:)
+ <braunr> (a computer doesn't "hope")
+ <congzhang> braunr: you need very clear answer, I can't give you that,
+ sorry again
+ <braunr> why do you say "If i am computer I hope the arch was monolithic" ?
+ <congzhang> I know you can solve any single problem
+ <braunr> no i don't, and it's not about me
+ <braunr> i'm just curious
+ <congzhang> I do the work for myself, as my own view, all the resources
+ belong to me, I don't think too much arch-related division is needed, if
+ I am the computer :P
+ <braunr> separating address spaces helps avoiding serious errors like
+ corrupting memory of unrelated subsystems
+ <braunr> how does one not want that ?
+ <braunr> (except for performance)
+ <congzhang> braunr: I am the computer when I say those words!
+ <braunr> a computer doesn't want anything
+ <braunr> users (including developers) on the other hand are the point of
+ view you should have
+ <congzhang> I am an engineer at other times
+ <congzhang> we create computers, but they feel like living things to me;
+ hope not to talk about this topic
+ <braunr> what ?
+ <congzhang> I regard computers as living things
+ <braunr> please don't
+ <braunr> and even, i'll make a simple example in favor of isolating
+ resources
+ <braunr> if we, humans, were able to control all of our "resources", we
+ could for example shut down our heart by mistake
+ <congzhang> back to the topic, I think monolithic is easy to understand,
+ and cuts the combinatorial problem count for the perfect software
+ <braunr> the reason the body has so many involuntary functions is probably
+ because those who survived did so because these functions were
+ involuntary and controlled by separated physiological functions
+ <braunr> now that i've made this absurd point, let's just not consider
+ computers as life forms
+ <braunr> microkernels don't make a system that more complicated
+ <congzhang> they do
+ <braunr> no
+ <congzhang> do
+ <braunr> they create isolation
+ <braunr> and another layer of indirection with capabilities
+ <braunr> that's it
+ <braunr> it's not that more complicated
+ <congzhang> view the kernel function from a more natural view: executing
+ some code
+ <braunr> what ?
+ <congzhang> I know the benefit of the microkernel and the os
+ <congzhang> it's complicated
+ <braunr> not that much
+ <congzhang> I agree with you
+ <congzhang> microkernel was the idea of organization
+ <braunr> yes
+ <braunr> but always keep in mind your goal when thinking about means to
+ achieve them
+ <congzhang> we do the work from a different view
+ <kilobug> what's quite complicated is making a microkernel design without
+ too much performance loss, but aside from that performance issue, it's
+ not really much more complicated
+ <congzhang> hurd does the work at the os level
+ <kilobug> even a monolithic kernel is made of several subsystems that
+ communicate with each other using an API
+ <core-ix> i'm reading this conversation for some time now
+ <core-ix> and I have to agree with braunr
+ <core-ix> microkernels simplify the design
+ <braunr> yes and no
+ <braunr> i think it depends a lot on the availability of capabilities
+ <core-ix> i have experience mostly with QNX and i can say it is far easier
+ to write a driver for QNX, compared to Linux/BSD for example ...
+ <braunr> which are the major feature microkernels usually add
+ <braunr> qnx >= 5 do provide capabilities
+ <braunr> (in the form of channels)
+ <core-ix> yeah ... it's the basic communication mechanism
+ <braunr> but my initial and still unanswered question was: why do people
+ think a hybrid kernel is better than a true microkernel, or not
+ <congzhang> I'm not saying what is good or not, I just say hybrid is
+ accepted
+ <braunr> core-ix: and if i'm right, they're directly implemented by the
+ kernel, and not a userspace system server
+ <core-ix> braunr: evolution is more easily accepted than revolution :)
+ <core-ix> braunr: yes, message passing is in the QNX kernel
+ <braunr> not message passing, capabilities
+ <braunr> l4 does message passing in kernel too, but you need to go through
+ a capability server
+ <braunr> (for the l4 variants i have in mind at least)
+ <congzhang> the operating system evolves for its applications.
+ <braunr> congzhang: about evolution, that's one explanation, but other than
+ that ?
+ <braunr> core-ix: ^
+ <core-ix> braunr: by capability you mean (for the lack of a better word
+ i'll use) access control mechanisms?
+ <braunr> i mean reference-rights
+ <core-ix> the "trusted" functionality available in other OS?
+ <braunr>
+ <braunr> i don't know what other systems refer to with "trusted"
+ functionality
+ <core-ix> yeah, the same thing
+ <congzhang> for now, I am searching for a way to make a hurd arm edition
+ suitable for the Raspberry Pi
+ <congzhang> I hope the design or the arch itself can scale
+ <core-ix> braunr: i think (!!!) that those are implemented in the Secure
+ Kernel (
+ <core-ix> never used it though ...
+ <congzhang> rpc makes interception easy :)
+ <braunr> core-ix: regular channels are capabilities
+ <core-ix> yes, and by extension - they are in the kernel
+ <braunr> that's my understanding too
+ <braunr> and that's one thing that, for me, makes qnx a hybrid as well
+ <congzhang> just need to intercept in the kernel,
+ <core-ix> braunr: i wouldn't dive into the academic aspects of this ... in
+ my mind
+ a microkernel is system that provides minimal hardware abstraction,
+ communication primitives (usually message passing), virtual memory
+ protection
+ <braunr> i think it's very important on the contrary
+ <braunr> what you describe is the "microkernel dogma"
+ <braunr> precisely
+ <braunr> that doesn't include capabilities
+ <braunr> that's why l4 messaging is thread-based
+ <braunr> and that's why l4 based systems are so slow
+ <braunr> (except okl4 which put back capabilities in the kernel)
+ <core-ix> so the compromise here is to include capabilities implementation
+ in the kernel, thus making the final product hybrid?
+ <braunr> not only
+ <braunr> because now that you have them in kernel
+ <braunr> the kernel probably has to manage memory for itself
+ <braunr> so you need more features in the virtual memory system
+ <core-ix> true ...
+ <braunr> that's what makes it a hybrid
+ <braunr> other ways being making each client provide memory, but that's
+ when your system becomes very complicated
+ <core-ix> but I believe this is true for pretty much any "general OS" case
+ <braunr> and some resources just can't be provided by a client
+ <braunr> e.g. a client can't provide virtual memory to another process
+ <braunr> okl4 is actually the only pragmatic real-world implementation of
+ l4
+ <braunr> and they also added unix-like signals
+ <braunr> so that's an interesting model
+ <braunr> as well as qnx
+ <braunr> the good thing about the hurd is that, although it's not kernel
+ agnostic, it doesn't require a lot from the underlying kernel
+ <core-ix> about hurd?
+ <braunr> yes
+ <core-ix> i really need to dig into this code at some point :)
+ <braunr> well you may but you may not see that property from the code
+ itself
diff --git a/microkernel/mach/gnumach/memory_management.mdwn b/microkernel/mach/gnumach/memory_management.mdwn
index 4e237269..477f0a18 100644
--- a/microkernel/mach/gnumach/memory_management.mdwn
+++ b/microkernel/mach/gnumach/memory_management.mdwn
@@ -188,3 +188,18 @@ License|/fdl]]."]]"""]]
<braunr> (more kernel memory, thus more physical memory - up to 1.8 GiB -
but then, less user memory)
+# IRC, freenode, #hurd, 2013-06-06
+ <nlightnfotis> braunr: quick question, what memory allocation algorithms
+ does Mach use? I know it uses slab allocation, so I can guess buddy
+ allocators too?
+ <braunr> no
+ <braunr> slab allocator for kernel memory (allocation of buffers used by
+ the kernel itself)
+ <braunr> a simple freelist for physical pages
+ <braunr> and a custom allocator based on a red-black tree, a linked list
+ and a hint for virtual memory
+ <braunr> (which is practically the same in all BSD variants)
+ <braunr> and linux does something very close too
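+
+A toy illustration of the "simple freelist" for physical pages described
+above (not the actual gnumach code; locking and initialization are
+omitted): free pages are chained through their descriptors, so allocation
+and release are O(1).
+
+    #include <stddef.h>
+
+    struct vm_page {
+        struct vm_page *next;
+        unsigned long phys_addr;
+    };
+
+    static struct vm_page *page_freelist;
+
+    struct vm_page *vm_page_grab(void)
+    {
+        struct vm_page *page = page_freelist;
+
+        if (page != NULL)
+            page_freelist = page->next;
+        return page;                /* NULL means no free pages */
+    }
+
+    void vm_page_release(struct vm_page *page)
+    {
+        page->next = page_freelist;
+        page_freelist = page;
+    }
+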
diff --git a/news/2008-09-11.mdwn b/news/2008-09-11.mdwn
index 0765a269..d5aa7811 100644
--- a/news/2008-09-11.mdwn
+++ b/news/2008-09-11.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2008, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2008, 2011, 2013 Free Software Foundation,
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -11,5 +12,6 @@ License|/fdl]]."]]"""]]
[[!meta date="2008-09-11"]]
All five students who worked on the Hurd during the **Google Summer of Code 2008** succeeded
-in their projects. For more information please see [[the_community/gsoc_page|community/gsoc]].
+in their projects.
+For more information please see [[2008 GSoC page|community/gsoc/2008]].
**Congratulations to both students and mentors!**
diff --git a/open_issues/anatomy_of_a_hurd_system.mdwn b/open_issues/anatomy_of_a_hurd_system.mdwn
index 677e4625..11a1f754 100644
--- a/open_issues/anatomy_of_a_hurd_system.mdwn
+++ b/open_issues/anatomy_of_a_hurd_system.mdwn
@@ -380,3 +380,220 @@ Actually, the Hurd has never used an M:N model. Both libthreads (cthreads) and l
<braunr> if you're looking for how to do it for a non-translator
application, the answer is probably somewhere in glibc
<braunr> _hurd_startup i'd guess
+# IRC, freenode, #hurd, 2013-05-23
+ <gnu_srs> Hi, is there any efficient way to control which backend
+ translators are called via RPC with a user space program?
+ <gnu_srs> Take for example io_stat: S_io_stat is defined in boot/boot.c,
+ pfinet/io-ops.c and pflocal/io.c
+ <gnu_srs> And the we have libdiskfs/io-stat.c:diskfs_S_io_stat,
+ libnetfs/io-stat.c:netfs_S_io_stat, libtreefs/s-io.c:treefs_S_io_stat,
+ libtrivfs/io-stat.c:trivfs_S_io_stat
+ <gnu_srs> How are they related?
+ <braunr> gnu_srs: it depends on the server (translator) managing the files
+ (nodes) you're accessing
+ <braunr> so use fsysopts to know the server, and see what this server uses
+ <gnu_srs> fsysopts /hurd/pfinet and fsysopts /hurd/pflocal gives the same
+ answer: ext2fs --writable --no-inherit-dir-group --store-type=typed
+ device:hd0s1
+ <braunr> of course
+ <braunr> the binaries are regular files
+ <braunr> see /servers/socket/1 and /servers/socket/2 instead
+ <braunr> which are the nodes representing the *service*
+ <braunr> again, the hurd uses the file system as a service directory
+ <braunr> this usage of the file system is at the core of the hurd design
+ <braunr> files are not mere files, they're service names
+ <braunr> it happens that, for most files, the service behind them is the
+ same as for regular files
+ <braunr> gnu_srs: this *must* be obvious for you to do any tricky work on
+ the hurd
+ <gnu_srs> fsysopts /servers/socket/2 works but /1 gives Operation not
+ supported.
+[[!taglink open_issue_hurd]].
+ <braunr> ah right, some servers don't implement that
+ <braunr> work around this by using showtrans
+ <braunr> fsysopts asks the server itself how it's running, usually giving
+ its command name and options
+ <braunr> showtrans asks the parent how it starts a passive translator
+ attached to the node
+ <gnu_srs> Yes showtrans works :), thanks.
+ <gnu_srs> Anyway, if I create a test program calling io_stat I assume
+ S_io_stat in pflocal is called.
+ <gnu_srs> How to make the program call S_io_stat in pfinet instead?
+ <braunr> create a socket managed by pfinet
+ <braunr> i.e. an inet or inet6 socket
+ <braunr> you can't assume io_stat is serviced by pflocal
+ <braunr> only stats on unix sockets or pipes will be
+ <gnu_srs> thanks, what about the *_S_io_stat functions?
+ <braunr> what about them ?
+ <gnu_srs> How do they fit into the picture, e.g. diskfs_S_io_stat?
+ <braunr> gnu_srs: if you open a file managed by a server using libdiskfs,
+ e.g. ext2fs, that one will be called
+ <gnu_srs> Using the same user space call: io_stat, right?
+ <braunr> it's all userspace
+ <braunr> say rather, client-side
+ <braunr> the client calls the posix stat() function, which is implemented
+ by glibc, which converts it into a call to io_stat, and sends it to the
+ server managing the open file
+ <braunr> the io_stat can change depending on the server
+ <braunr> the remote io_stat implementation, i mean
+ <braunr> identify the server, and you will identify the actual
+ implementation
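+
+Putting the commands discussed above together in a shell session (the
+pfinet options shown are illustrative; they vary per system):
+
+    # ask the server itself how it is running (works for pfinet):
+    $ fsysopts /servers/socket/2
+    /hurd/pfinet --interface=eth0 --address=192.168.1.10
+
+    # pflocal doesn't implement that; ask the parent how the passive
+    # translator is attached instead:
+    $ showtrans /servers/socket/1
+    /hurd/pflocal
+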
+# IRC, freenode, #hurd, 2013-06-15
+ <damo22> ive been reading a little about exokernels or unikernels, and i
+ was wondering if it might be relevant to the GNU/hurd design. I'm not
+ too familiar with hurd terminology so forgive me. what if every
+ privileged service was compiled as its own mini "kernel" that handled (a)
+ any hardware related to that service (b) any device nodes exposed by that
+ service etc...
+ <braunr> yes but not really that way
+ <damo22> under the current hurd model of the operating system, how would
+ you talk to hardware that required specific timings like sound hardware?
+ <braunr> through mapped memory
+ <damo22> is there such a thing as an interrupt request in hurd?
+ <braunr> obviously
+ <damo22> ok
+ <damo22> is there any documentation i can read that involves a driver that
+ uses irqs for hurd?
+ <braunr> you can read the netdde code
+ <braunr> dde being another project, there may be documentation about it
+ <braunr> somewhere else
+ <braunr> i don't know where
+ <damo22> thanks
+ <damo22> i read a little about dde, apparently it reuses existing code from
+ linux or bsd by reimplementing parts of the old kernel like an api or
+ something
+ <braunr> yes
+ <damo22> it must translate these system calls into ipc or something
+ <damo22> then mach handles it?
+ <braunr> exactly
+ <braunr> that's why i say it's not the exokernel way of doing things
+ <damo22> ok
+ <damo22> so does every low level hardware access go through mach?
+ <braunr> yes
+ <braunr> well no
+ <braunr> interrupts do
+ <braunr> ports (on x86)
+ <braunr> everything else should be doable through mapped memory
+ <damo22> seems surprising that the code for it is so small
+ <braunr> 1/ why surprising ? and 2/ "so small" ?
+ <damo22> it's like the core of the OS, and yet it's tiny compared to say
+ the linux kernel
+ <braunr> it's a microkernel
+ <braunr> well, rather a hybrid
+ <braunr> the size of the equivalent code in linux is about the same
+ <damo22> ok
+ <damo22> with the model that privileged instructions get moved to
+ userspace, how does one draw the line between what is OS and what is user
+ code
+ <braunr> privileged instructions remain in the kernel
+ <braunr> that's one of the few responsibilities of the kernel
+ <damo22> i see, so it is an illusion that the user has privilege in a sense
+ <braunr> hum no
+ <braunr> or, define "illusion"
+ <damo22> well the user can suddenly do things never imaginable in linux
+ <damo22> that would have required sudo
+ <braunr> yes
+ <braunr> well, they're not unimaginable on linux
+ <braunr> it's just not how it's meant to work
+ <damo22> :)
+ <braunr> and why things like fuse are so slow
+ <braunr> i still don't get "i see, so it is an illusion that the user has
+ privilege in a sense"
+ <damo22> because the user doesnt actually have the elevated privilege its
+ the server thing (translator)?
+ <braunr> it does
+ <braunr> not at the hardware level, but at the system level
+ <braunr> not being able to do it directly doesn't mean you can't do it
+ <damo22> right
+ <braunr> it means you need indirections
+ <braunr> that's what the kernel provides
+ <damo22> so the user cant do stuff like outb 0x13, 0x1
+ <braunr> he can
+ <braunr> he also can on linux
+ <damo22> oh
+ <braunr> that's an x86 specificity though
+ <damo22> but the user would need hardware privilege to do that
+ <braunr> no
+ <damo22> or some kind of privilege
+ <braunr> there is a permission bitmap in the TSS that allows userspace to
+ directly access some ports
+ <braunr> but that's really x86 specific, again
+ <damo22> i was using it as an example
+ <damo22> i mean you wouldnt want userspace to directly access everything
+ <braunr> yes
+ <braunr> the only problem with that is dma really
+ <braunr> because dma usually access physical memory directly
+ <damo22> are you saying its good to let userspace access everything minus
+ dma?
+ <braunr> otherwise you can just centralize permissions in one place (the
+ kernel or an I/O server for example)
+ <braunr> no
+ <braunr> you don't let userspace access everything
+ <damo22> ah
+ <damo22> yes
+ <braunr> userspace asks for permission to access one specific part (a
+ memory range through mapping)
+ <braunr> and can't access the rest (except through dma)
+ <damo22> except through dma?? doesnt that pose a large security threat?
+ <braunr> no
+ <braunr> you don't give away dma access to anyone
+ <braunr> only drivers
+ <damo22> ahh
+ <braunr> and drivers are normally privileged applications anyway
+ <damo22> so a driver runs in userspace?
+ <braunr> so the only effect is that bugs can affect other address spaces
+ indirectly
+ <braunr> netdde does
+ <damo22> interesting
+ <braunr> and they all should but that's not the case for historical reasons
+ <damo22> i want to port ALSA to hurd userspace :D
+ <braunr> that's not so simple unfortunately
+ <braunr> one of the reasons it's hard is that pci access needs arbitration
+ <braunr> and we don't have that yet
+ <damo22> i imagine that would be difficult
+ <braunr> yes
+ <braunr> also we're not sure we want alsa
+ <braunr> alsa drivers, maybe, but probably not the interface itself
+ <damo22> its tangled spaghetti
+ <damo22> but the guy who wrote JACK for audio hates OSS, and believes it is
+ rubbish due to the fact it tries to read and write to a pcm device node
+ like a filesystem with no care for timing
+ <braunr> i don't know audio well enough to tell you anything about that
+ <braunr> was that about oss3 or oss4 ?
+ <braunr> also, the hurd isn't a real time system
+ <braunr> so we don't really care about timings
+ <braunr> but with "good enough" latencies, it shouldn't be a problem
+ <damo22> but if the audio doesnt reach the sound card in time, you will get
+ a crackle or a pop or a pause in the signal
+ <braunr> yep
+ <braunr> it happens on linux too when the system gets some load
+ <damo22> some users find this unacceptable
+ <braunr> some users want real time systems
+ <braunr> using soft real time is usually plenty enough to "solve" this kind
+ of problems
+ <damo22> will hurd ever be a real time system?
+ <braunr> no idea
+ <youpi> if somebody works on it why not
+ <youpi> it's the same as linux
+ <braunr> it should certainly be simpler than on linux though
+ <damo22> hmm
+ <braunr> microkernels are well suited for real time because of the well
+ defined interfaces they provide and the small amount of code running in
+ kernel
+ <damo22> that sounds promising
+ <braunr> you usually need to add priority inheritance and take care of just
+ a few corner cases and that's all
+ <braunr> but as youpi said, it still requires work
+ <braunr> and nobody's working on it
+ <braunr> you may want to check l4 fiasco.oc though
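+
+To illustrate the TSS permission bitmap point above, here is a sketch using
+the ioperm() interface, which glibc also provides on GNU/Hurd (implemented
+on top of the kernel's i386 I/O permission RPCs); the port and value are
+arbitrary examples:
+
+    #include <stdio.h>
+    #include <stdlib.h>
+    #include <sys/io.h>   /* ioperm(), outb(): x86 specific */
+
+    int main(void)
+    {
+        /* Request access to one I/O port; requires privilege. */
+        if (ioperm(0x80, 1, 1) < 0) {
+            perror("ioperm");
+            return EXIT_FAILURE;
+        }
+        /* Direct userspace port write, no kernel round trip. */
+        outb(0x01, 0x80);
+        return EXIT_SUCCESS;
+    }
+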
diff --git a/open_issues/glibc.mdwn b/open_issues/glibc.mdwn
index 33a1a071..b06b4f9f 100644
--- a/open_issues/glibc.mdwn
+++ b/open_issues/glibc.mdwn
@@ -1305,6 +1305,23 @@ Failures, mostly in order of appearance:
Due to `ext2fs --no-atime`.
+ * IRC, OFTC, #debian-hurd, 2013-05-08
+ <youpi> bah, tst-atime failure :)
+ <pinotree> do you have its output?
+ <youpi> well it's very simple
+ <youpi> I have the noatime option on / :)
+ <pinotree> oh
+ <youpi> fortunately fsysopts works :)
+ <pinotree> the test checks whether ST_NOATIME is in the mount
+ options, maybe it would be a good idea to provide it
+ <youpi> yes
+ <pinotree> unfortunately it isn't in posix, so i'm not sure whether
+ adding it to the general bits/statvfs.h would be welcome
+ <pinotree> or whether we should fork it, like it is done for linux
+ <pinotree> oh no, we fork it already
+ <pinotree> \o/
directory atime changed
diff --git a/open_issues/gnumach_integer_overflow.mdwn b/open_issues/gnumach_integer_overflow.mdwn
index 2166e591..08a29268 100644
--- a/open_issues/gnumach_integer_overflow.mdwn
+++ b/open_issues/gnumach_integer_overflow.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -15,3 +15,36 @@ License|/fdl]]."]]"""]]
<braunr> yes, we have integer overflows on resident_page_count, but
luckily, the member is rarely used
+See also [[gnumach_vm_object_resident_page_count]].
+## IRC, freenode, #hurd, 2013-06-04
+ <elmig> this is declared as int on vm_object.h
+ <elmig> and as it's a counter it's always positive
+ <braunr> yes it should be unsigned
+ <elmig> ok
+ <braunr> but leave it as it is for consistency with the rest
+ <elmig> i send patch :)
+ <braunr> please no
+ <braunr> unless you've fully determined the side effects
+ <elmig> i've grepped the vars and saw only comparisons > and = 0
+ <elmig> never less than 0
+ <braunr> > 0 is the same
+ <braunr> well
+ <braunr> > not, but >= would be a problem
+ <elmig>
+ <elmig> actually no >= 0
+ <braunr> still, i don't want to change that unless it's strictly necessary
+ <braunr> hum, you're grepping ref_count, not resident_page_count
+ <elmig> i did both
+ <elmig> on resident_page_count theres resident_page_count >= 0
+ <elmig> = 0, == 0
+ <braunr> this isn't the only possible issue
+ <braunr> anyway
+ <braunr> for now there is no reason to change anything unless you do a full
+ review
+ <elmig> the only place i see resident_page_count and page_count being
+ decreased is in vm/vm_resident.c
+ <elmig> vm_page_remove() and vm_page_replace()
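+
+A small example of the ">= would be a problem" point above: with an
+unsigned counter, an underflow check written against the signed type
+silently becomes dead code, which is why the change needs a full review.
+
+    #include <assert.h>
+
+    int main(void)
+    {
+        int signed_count = 0;
+        unsigned int unsigned_count = 0;
+
+        signed_count--;     /* a missed increment shows up as -1 */
+        unsigned_count--;   /* wraps around to UINT_MAX instead */
+
+        assert(signed_count < 0);       /* underflow is detectable */
+        assert(unsigned_count >= 0);    /* vacuously true: dead check */
+        return 0;
+    }
+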
diff --git a/open_issues/gnumach_vm_object_resident_page_count.mdwn b/open_issues/gnumach_vm_object_resident_page_count.mdwn
index cc1b8897..5c1247b2 100644
--- a/open_issues/gnumach_vm_object_resident_page_count.mdwn
+++ b/open_issues/gnumach_vm_object_resident_page_count.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -20,3 +20,29 @@ License|/fdl]]."]]"""]]
<braunr> luckily, this should be easy to solve
+## IRC, freenode, #hurd, 2013-06-03
+ <elmig> regarding
+ this is fixed. it's an int. what should happen to this page? /dev/null
+ <elmig> ?
+ <youpi> I guess so
+## IRC, freenode, #hurd, 2013-06-04
+ <elmig>
+ <elmig> this is an int
+ <elmig> how to deal with the page? delete it? archive it?
+ <braunr> ?
+ <elmig> the issue originally reported was fixed, right?
+ <braunr> i think so, yes
+ <braunr> for now at least
+ <elmig> so this stays on the open_issues on the wiki anyway?
+ <braunr> no, it should go away
+ <elmig> i dont know how to suggest deletion on the wiki
+ <braunr> don't
+ <braunr> i'll do it later
diff --git a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
index 670c82cb..11bebd6e 100644
--- a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
+++ b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
@@ -133,3 +133,29 @@ License|/fdl]]."]]"""]]
<braunr> (imo, the mach_debug interface should be adjusted to be used with
privileged ports only)
<braunr> (well, maybe not all mach_debug RPCs)
+# `gnumach.defs`
+## IRC, freenode, #hurd, 2013-05-13
+ <braunr> youpi: what's the point of the last commit in the upstream hurd
+ repository (utils/vmstat: Use gnumach.defs from gnumach) ?
+ <braunr> or rather, i think i see the point, but then why do it only for
+ gnumach and not for the rest ?
+ <youpi> most probably because nobody did it, probably
+ <braunr> aiui, it makes the hurd build process not rely on system headers
+ <youpi> (and nobody had any issue with it)
+ <braunr> well yes, that's why i'm wondering :)
+ <braunr> it looks perfectly fine to me to use system headers instead of
+ generating them
+ <youpi> ah right
+ <youpi> I thought there was actually a reason
+ <youpi> I'll revert
+ <youpi> could you answer David about it?
+ <braunr> sure
diff --git a/open_issues/open_symlink.mdwn b/open_issues/open_symlink.mdwn
index 20e4a4fe..f71109a9 100644
--- a/open_issues/open_symlink.mdwn
+++ b/open_issues/open_symlink.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -10,9 +10,21 @@ License|/fdl]]."]]"""]]
[[!tag open_issue_glibc]]
# IRC, freenode, #hurd, 2012-01-02
<pinotree> hm, is it a known issue that open("somesymlink", O_RDONLY |
O_NOFOLLOW) does not fail with ELOOP?
<youpi> pinotree: iirc there is code for it, maybe not the same behavior as
on linux
+## IRC, OFTC, #debian-hurd, 2013-05-08
+ <pinotree> the hurd issue is that O_NOFOLLOW seems broken on symlinks, and
+ thus open(symlink, O_NOFOLLOW) doesn't fail with ELOOP
+ <youpi> I don't really see why it should fail
+ <youpi> since NOFOLLOW says not to follow the symlink
+ <pinotree> yeah, but you cannot open a symlink
+ <youpi> ah right ok
+ <youpi> interesting :)
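+
+A minimal test for the expected POSIX behaviour discussed above: open()
+with O_NOFOLLOW on a symbolic link must fail with ELOOP.
+
+    #include <errno.h>
+    #include <fcntl.h>
+    #include <stdio.h>
+    #include <unistd.h>
+
+    int main(void)
+    {
+        if (symlink("/etc/passwd", "testlink") < 0 && errno != EEXIST) {
+            perror("symlink");
+            return 1;
+        }
+        int fd = open("testlink", O_RDONLY | O_NOFOLLOW);
+        if (fd < 0 && errno == ELOOP)
+            printf("ok: ELOOP as expected\n");
+        else
+            printf("bug: open() did not fail with ELOOP\n");
+        if (fd >= 0)
+            close(fd);
+        unlink("testlink");
+        return 0;
+    }
+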
diff --git a/open_issues/profiling.mdwn b/open_issues/profiling.mdwn
index 26e6c97c..545edcf6 100644
--- a/open_issues/profiling.mdwn
+++ b/open_issues/profiling.mdwn
@@ -9,10 +9,14 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
+[[!meta title="Profiling, Tracing"]]
*Profiling* ([[!wikipedia Profiling_(computer_programming) desc="Wikipedia
article"]]) is a tool for tracing where CPU time is spent. This is usually
done for [[performance analysis|performance]] reasons.
+ * [[hurd/debugging/rpctrace]]
* [[gprof]]
Should be working, but some issues have been reported, regarding GCC spec
@@ -33,3 +37,104 @@ done for [[performance analysis|performance]] reasons.
* [[SystemTap]]
* ... or some other Linux thing.
+# IRC, freenode, #hurd, 2013-06-17
+ <congzhang> is it possible we develop an rpc msg analysis tool? to make it
+ clear how to view the system at different levels?
+ <congzhang> hurd is a dynamic system, how can we just read logs line by
+ line
+ <kilobug> congzhang: well, you can use rpctrace and then analyze the logs,
+ but rpctrace is quite intrusive and will slow down things (like strace or
+ similar)
+ <kilobug> congzhang: I don't know if a low-overhead solution could be made
+ or not
+ <congzhang> that's the problem
+ <congzhang> when the real system runs, the msgs cross different servers,
+ and the debug action should not intrude on the process itself
+ <congzhang> we observe the system and analyse the os
+ <congzhang> when rms chose a microkernel, it was expected to accelerate
+ progress, but it didn't
+ <congzhang> the microkernel makes debugging a little hard
+ <kilobug> well, it's not limited to microkernels, debugging/tracing is
+ intrusive and slows things down, it's a universal law of compsci
+ <kilobug> no, it makes debugging easier
+ <congzhang> I don't think so
+ <kilobug> you can gdb the various services (like ext2fs or pfinet) more
+ easily
+ <kilobug> and rpctrace isn't any worse than strace
+ <congzhang> how easy when debug lpc
+ <kilobug> lpc ?
+ <congzhang> because cross context
+ <congzhang> classic function call
+ <congzhang> when finding the bug source, I don't care about performance, I
+ want to know if it's right or wrong by design, If it works as I expect
+ <congzhang> I optimize it later
+ <congzhang> I have an idea, but don't know whether it's useful or not
+ <braunr> rpctrace is a lot less intrusive than ptrace based tools
+ <braunr> congzhang: debugging is not made hard by the design choice, but by
+ implementation details
+ <braunr> as a simple counter example, someone often cited usb development
+ on l3 being made a lot easier than on a monolithic kernel
+ <congzhang> Collect the trace information first, and then lay out the msgs
+ as a graph; when something goes wrong, I focus on the troublesome rpc, and
+ find out what happened around it
+ <braunr> "by graph" ?
+ <congzhang> yes
+ <congzhang> braunr: directed graph or something similar
+ <braunr> and not caring about performance when debugging is actually stupid
+ <braunr> i've seen it on many occasions, people not being able to use
+ debugging tools because they were far too inefficient and slow
+ <braunr> why a graph ?
+ <braunr> what you want is the complete trace, taking into account cross
+ address space boundaries
+ <congzhang> yes
+ <braunr> well it's linear
+ <braunr> switching server
+ <congzhang> by independent process view it's linear
+ <congzhang> it's linear on cpu's view too
+ <congzhang> yes, I need complete trace, and dynamic control at microkernel
+ level
+ <congzhang> so, if a server crashes, then I know what the others are
+ doing, from the graph
+ <congzhang> the graph needn't be one; if they are not connected together,
+ sort them by time
+ <congzhang> when hurd is completely ok, some tools may help too
+ <braunr> i don't get what you want on that graph
+ <congzhang> sorry, I need a context
+ <congzhang> like uml sequence diagram, I need what happen one by one
+ <congzhang> from server's view and from the function's view
+ <braunr> that's still linear
+ <braunr> so please stop using the word graph
+ <braunr> you want a trace
+ <braunr> a simple call trace
+ <congzhang> yes, and a tool
+ <braunr> with some work gdb could do it
+ <congzhang> you mean under some microkernel infrastructure help
+ <congzhang> ?
+ <braunr> if needed
+ <congzhang> braunr: will that be easy?
+ <braunr> not too hard
+ <braunr> i've had this idea for a long time actually
+ <braunr> another reason i insist on migrating threads (or rather, binding
+ server and client threads)
+ <congzhang> braunr: that's great
+ <braunr> the current problem we have when using gdb is that we don't know
+ which server thread is handling the request of which client
+ <braunr> we can guess it
+ <braunr> but it's not always obvious
+ <congzhang> I read the talk, know some of your idea
+ <congzhang> make things happen like a classic kernel, just from the
+ function view, sure:)
+ <braunr> that's it
+ <congzhang> I think you and others have done a lot of work to improve
+ Mach and the Hurd, but we lack design documents and diagrams; one
+ diagram is worth a thousand words
+ <braunr> diagrams are made after the prototypes that prove they're doable
+ <braunr> i'm not a researcher
+ <braunr> and we have little time
+ <braunr> the prototype is the true spec
+ <congzhang> that's why i want to collect the trace info and show it, so
+ you can know what happened and how it happened; maybe it's just
+ suitable for newbies, i hope more young hackers like it
+ <braunr> once it's done, everything else is just sugar candy around it
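+
+The "complete trace" discussed above is essentially what rpctrace collects:
+an interposing task receives each request, records it, and passes it on. A
+grossly simplified sketch in C of such an interposition loop, assuming a
+proxy that already holds a receive right (`trace_port`) and a send right to
+the real server (`server_port`), both hypothetical names; complex messages
+(port rights or out-of-line data in the body) are not handled, and replies
+go straight back to the client without being logged:
+
+    #include <stdio.h>
+    #include <mach.h>
+
+    /* Log every request arriving on trace_port, then forward it to
+       the real server.  */
+    static void
+    trace_loop (mach_port_t trace_port, mach_port_t server_port)
+    {
+      union
+      {
+        mach_msg_header_t head;
+        char space[4096];
+      } msg;
+
+      while (mach_msg (&msg.head, MACH_RCV_MSG, 0, sizeof msg,
+                       trace_port, MACH_MSG_TIMEOUT_NONE,
+                       MACH_PORT_NULL) == MACH_MSG_SUCCESS)
+        {
+          /* The raw trace record: which RPC was invoked.  */
+          printf ("msgh_id %d, %u bytes\n",
+                  (int) msg.head.msgh_id, (unsigned) msg.head.msgh_size);
+
+          /* Re-target the message: the client's send-once reply right
+             (now in msgh_remote_port) becomes the local port, and the
+             real server becomes the destination, so the server replies
+             directly to the client.  */
+          msg.head.msgh_local_port = msg.head.msgh_remote_port;
+          msg.head.msgh_remote_port = server_port;
+          msg.head.msgh_bits =
+            MACH_MSGH_BITS (MACH_MSG_TYPE_COPY_SEND,
+                            MACH_MSG_TYPE_MOVE_SEND_ONCE);
+
+          mach_msg (&msg.head, MACH_SEND_MSG, msg.head.msgh_size, 0,
+                    MACH_PORT_NULL, MACH_MSG_TIMEOUT_NONE,
+                    MACH_PORT_NULL);
+        }
+    }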
diff --git a/open_issues/sendmsg_scm_creds.mdwn b/open_issues/sendmsg_scm_creds.mdwn
index cf0103df..d4a6126e 100644
--- a/open_issues/sendmsg_scm_creds.mdwn
+++ b/open_issues/sendmsg_scm_creds.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation,
+[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation,
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -11,7 +11,8 @@ License|/fdl]]."]]"""]]
[[!tag open_issue_glibc]]
-IRC, unknown channel, unknown date.
+# IRC, unknown channel, unknown date
<pinotree> Credentials: s_uid 1000, c_uid 1000, c_gid 100, c_pid 2722
<pinotree> 2722: Credentials: s_uid 1000, c_uid 1000, c_gid 100, c_pid 2724
@@ -91,10 +92,80 @@ IRC, unknown channel, unknown date.
<pinotree> yep
<youpi> ok, good :)
-/!\ IRC, freenode, #hurd, 2011-08-11
+## IRC, freenode, #hurd, 2011-08-11
< pinotree> (but that patch is lame)
+## IRC, freenode, #hurd, 2013-05-09
+ <gnu_srs> youpi: Since you are online tonight: which authentication
+ callbacks should be used for SCM_CREDS calls?
+ <gnu_srs> I have working code and need to add this to make things
+ complete. The auth server, lib* or where?
+ <youpi> I don't understand the question
+ <gnu_srs> authentication callbacks like for SCM_RIGHTS, see
+ <gnu_srs>
+ <youpi> I still don't understand: what are you trying to do actually?
+ <gnu_srs> solving the SCM_CREDS problems with e.g. dbus.
+ <youpi> so what is the relation with pinotree's patch on the page above?
+ <youpi> (I have no idea of the current status of all that)
+ <gnu_srs> his patch was not merged, right? have to shut down, sorry, bbl,
+ gn8
+ <pinotree> that patch was not merged since it is not in the correct place
+ <youpi> as I said, I have no idea about the status
+ <pinotree> youpi: basically, it boils down to knowing, when executing the
+ code implementing an rpc, who requested that rpc (pid, uid, gid)
+ <youpi> i.e. getting information about the reply port for instance?
+ <youpi> well that might be somehow faked
+ <youpi> (by perhaps giving another task's port as reply port)
+ <pinotree> for example (which would be the code path for SCM_CREDS), when
+ you call the socket sendmsg(), pflocal would know who did that rpc
+ and fill the auxiliary data
+ <pinotree> youpi: yes, i know about this faking issue; iirc antrik also
+ mentioned it quite some time ago
+ <youpi> ok
+ <pinotree> that's one of the (imho) two issues of this
+ <pinotree> my hurd-foo is not enough to know whether there are solutions to
+ the problem above
+### IRC, freenode, #hurd, 2013-05-14
+ <gnu_srs> Hi, regarding SCM_CREDS: I have some working code in
+ sendmsg.c. Now I need to make a callback to authenticate the pid, uid,
+ etc.
+ <gnu_srs> Where should that call be hooked into pflocal?
+ <gnu_srs> the auth server?
+ <gnu_srs> maybe _io_restrict_auth is the correct call to use (same as for
+### IRC, freenode, #hurd, 2013-05-17
+ <gnu_srs> I'm working on the scm credentials right now to enable (via dbus)
+ more X window managers to work properly.
+ <gnu_srs> seems to be rather tricky :-(
+ <pochu> gnu_srs: I guess you also need SCM_CREDS, right?
+ <gnu_srs> hi pochu, that's what I'm working on, extending your SCM_RIGHTS
+ work to SCM_CREDS
+ <pinotree> that's what i did as a proof of concept, years ago?
+ <gnu_srs> it would be good to know which server calls to make, I'll be back
+ with proposals of functions to use.
+ <pinotree> there was talk about this years ago when i started, and a few
+ days ago too
+ <pinotree> every method has its own drawbacks, and basically so far it
+ seems that in every method the sender identity can be faked somehow
+ <gnu_srs> pinotree: Yes of course your patch was perfect, but it seemed
+ like people wanted a server acknowledgement too.
+ <pinotree> no, my patch was not perfect at all
+ <pinotree> if it was, it would have been cleaned up and sent few years ago
+ already
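+
+For reference, the client side of SCM_CREDS with the BSD-style struct
+cmsgcred that glibc declares looks roughly like the sketch below; whether
+pflocal fills in, verifies, or simply trusts these fields is exactly the
+open question in the discussion above, since the client can write anything
+here. `send_creds` is a hypothetical helper, not existing code:
+
+    #include <string.h>
+    #include <unistd.h>
+    #include <sys/uio.h>
+    #include <sys/socket.h>
+
+    static ssize_t
+    send_creds (int fd, const char *payload, size_t len)
+    {
+      struct msghdr msg = { 0 };
+      struct iovec iov = { (void *) payload, len };
+      union
+      {
+        struct cmsghdr align;
+        char buf[CMSG_SPACE (sizeof (struct cmsgcred))];
+      } u;
+      struct cmsghdr *cmsg;
+      struct cmsgcred *creds;
+
+      msg.msg_iov = &iov;
+      msg.msg_iovlen = 1;
+      msg.msg_control = u.buf;
+      msg.msg_controllen = sizeof u.buf;
+
+      cmsg = CMSG_FIRSTHDR (&msg);
+      cmsg->cmsg_level = SOL_SOCKET;
+      cmsg->cmsg_type = SCM_CREDS;
+      cmsg->cmsg_len = CMSG_LEN (sizeof (struct cmsgcred));
+
+      /* The sender fills in its own ids; nothing stops a malicious
+         client from lying here, which is the faking problem
+         mentioned above.  */
+      creds = (struct cmsgcred *) CMSG_DATA (cmsg);
+      memset (creds, 0, sizeof *creds);
+      creds->cmcred_pid = getpid ();
+      creds->cmcred_uid = getuid ();
+      creds->cmcred_euid = geteuid ();
+      creds->cmcred_gid = getgid ();
+
+      return sendmsg (fd, &msg, 0);
+    }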
See also [[dbus]], [[pflocal_socket_credentials_for_local_sockets]] and
diff --git a/open_issues/translate_fd_or_port_to_file_name.mdwn b/open_issues/translate_fd_or_port_to_file_name.mdwn
index 0d786d2a..0d6a460c 100644
--- a/open_issues/translate_fd_or_port_to_file_name.mdwn
+++ b/open_issues/translate_fd_or_port_to_file_name.mdwn
@@ -105,6 +105,57 @@ License|/fdl]]."]]"""]]
<tschwinge> Ah, for /proc/*/maps, right. I've been thinking more globally.
+## task_get_name, task_set_name RPCs
+[[!message-id ""]]
+## IRC, freenode, #hurd, 2013-05-10
+ <youpi> tschwinge's suggestion to put names on ports instead of tasks would
+ be useful too
+ <braunr> do you get task ports as easily as you get tasks in kdb ?
+ <youpi> there is task->itk_self & such
+ <youpi> or itk_space
+ <youpi> I don't remember which one is used by userspace
+ <braunr> i mean
+ <braunr> when you use the debugger, can you easily find its ports ?
+ <braunr> the task ports i mean
+ <braunr> or thread ports or whatever
+ <youpi> once you have a task, it's a matter of getting the itk_self port
+ <youpi> s/port/field member/
+ <braunr> so the debugger provides you with the addresses of the structs
+ <braunr> right ?
+ <youpi> yes, that's what we have already
+ <braunr> then ok
+ <braunr> bddebian: do that :p
+ <braunr> hehe
+ <youpi> see show all thread
+ <braunr> (haven't used kdb in a long time)
+ <bddebian> So, adding a name to ports like I did with tasks?
+ <braunr> remove what you did for tasks
+ <braunr> move it to ports
+ <braunr> it's very similar
+ <braunr> but hm
+ <braunr> i'm not sure where the RPC would be
+ <braunr> this RPC would exist for *all* ports
+ <braunr> or only for kernel objects if added to gnumach.defs
+ <youpi> it's just about moving the char array field to another structure
+ <youpi> and plugging that
+ <bddebian> But mach_task_self is a syscall; it looks like itk_self is
+ just a pointer to an ipc_port?
+ <braunr> so ?
+ <braunr> you take that pointer and you get the port
+ <braunr> just like vm_map gets a struct vm_map from a task
+ <bddebian> So I am just adding ipc_port_name to the ipc_port struct in this
+ case?
+ <braunr> yes
+ <braunr> actually
+ <braunr> don't do anything just yet
+ <braunr> we need to sort a few details out first
+ <braunr> see bug-hurd
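+
+To make the shape of that change concrete, a purely illustrative sketch
+(not actual gnumach code; IP_NAME_SIZE, ip_name and port_set_name are
+made-up names):
+
+    /* Hypothetical: a label embedded in (or hung off) each port.  */
+    #define IP_NAME_SIZE 32
+
+    struct ipc_port
+      {
+        /* ... existing fields ... */
+        char ip_name[IP_NAME_SIZE]; /* new: human-readable label */
+      };
+
+    /* The corresponding RPC, sketched in MIG .defs style.  As noted
+       above, it is an open question whether this belongs in
+       gnumach.defs (kernel object ports only) or in an interface
+       covering all ports:
+
+       routine port_set_name (
+               target_task : task_t;
+               name        : mach_port_name_t;
+               new_name    : string_t);
+    */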
# IRC, freenode, #hurd, 2011-07-13
A related issue:
diff --git a/system_call.mdwn b/system_call.mdwn
index f180a79b..16d706c7 100644
--- a/system_call.mdwn
+++ b/system_call.mdwn
@@ -18,3 +18,18 @@ See [[GNU Mach's system calls|microkernel/mach/gnumach/interface/syscall]].
In the [[GNU Hurd|hurd]], a lot of what is traditionally considered to be a UNIX
system call is implemented (primarily by means of [[RPC]]) inside [[glibc]].
+# IRC, freenode, #hurd, 2013-06-15
+ <braunr> true system calls are always implemented the same way, by the
+ kernel, using traps or specialized instructions that enable crossing from
+ user to kernel space
+ <braunr> glibc simply translates function calls to system calls by packing
+ arguments appropriately and using that trap or syscall instruction
+ <braunr> on microkernel based systems however, true system calls are
+ normally used only for IPC
+ <braunr> so we also use the term syscall to refer to those RPCs that
+ provide system services
+ <braunr> e.g. open() is a call to a file system server (and maybe several
+ actually)
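+
+To illustrate the last point, on the Hurd the lookup half of open() can be
+performed explicitly with file_name_lookup() from <hurd.h>, which resolves
+the path through dir_lookup RPCs to one or more filesystem servers and
+returns a port; no dedicated "open" trap into the kernel is involved. A
+minimal sketch:
+
+    #include <stdio.h>
+    #include <fcntl.h>
+    #include <hurd.h>
+    #include <mach.h>
+
+    int
+    main (void)
+    {
+      /* Roughly the lookup half of open ("/etc/motd", O_RDONLY).  */
+      file_t port = file_name_lookup ("/etc/motd", O_RDONLY, 0);
+      if (port == MACH_PORT_NULL)
+        {
+          perror ("file_name_lookup");
+          return 1;
+        }
+      printf ("send right %u names the file on its server\n",
+              (unsigned) port);
+      mach_port_deallocate (mach_task_self (), port);
+      return 0;
+    }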