author: Thomas Schwinge, 2012-11-29 01:33:22 +0100
committer: Thomas Schwinge, 2012-11-29 01:33:22 +0100
commit: 5bd36fdff16871eb7d06fc26cac07e7f2703432b
parent: 2603401fa1f899a8ff60ec6a134d5bd511073a9d
50 files changed, 4703 insertions, 45 deletions
diff --git a/community/gsoc/2012/virt/discussion.mdwn b/community/gsoc/2012/virt/discussion.mdwn
index 31b9ce01..e0085322 100644
--- a/community/gsoc/2012/virt/discussion.mdwn
+++ b/community/gsoc/2012/virt/discussion.mdwn
@@ -214,3 +214,176 @@ License|/fdl]]."]]"""]]
<braunr> or these meetings won't be as useful as they could be
<tschwinge> Yes. But also please don't wait for the meetings, but ask
questions throughout the week, too.
+# IRC, freenode, #hurd, 2012-08-09
+ <nowhere_man> hey, does anyone know the network device interface well?
+ <nowhere_man> I don't get it by reading net_io.c/h in gnumach
+ <braunr> nowhere_man: ask your question
+ <braunr> nowhere_man: <- this may
+ help
+ <nowhere_man> I don't see what the entry point is
+ <nowhere_man> I finally understood that I actually don't need to touch
+ pfinet for gsoc project
+ <nowhere_man> but I should do a replacement network device instead
+ <nowhere_man> is the net_io_init function called at start?
+ <braunr> what entry point ?
+ <braunr> and you should perhaps have a look at the eth-multiplexer by
+ zhengda
+ <braunr> yes net_io_init is called at startup
+ <braunr> nowhere_man: did you find your answers about networking ?
+ <nowhere_man> no, I'm still digging in mach's code
+ <braunr> nowhere_man: well keep asking :/
+ <braunr> you left conversation without notice :/
+ <braunr> nowhere_man: and why mach ?
+ <nowhere_man> I thought hardware devices are there
+ <tschwinge> nowhere_man: You wanted to push your documentation one/two
+ weeks ago. Why has that not yet happened?
+ <youpi> nowhere_man: they used to be there, they are now in netdde, but in
+ both case it's just a matter of the same RPC interface
+ <nowhere_man> tschwinge: I spent very little time this week on gsoc, and
+ completely forgot about the push on savannah
+ <braunr> nowhere_man: i told you to look at the work by zhengda concerning
+ eth-multiplexer, did you do that ?
+ <tschwinge> nowhere_man: You realize GSoC is meant to be a full-time job?
+ <tschwinge> Or, next to full-time?
+ <braunr> it's full-time normally
+ <braunr> the payment is justified by that
+ <youpi> nowhere_man: most RPC operations you need to know about network can
+ be seen at work in pfinet/ethernet.c, wherever "ether_port" appears
+ <youpi> i.e. device_open, set_filter, write, set/get_status
+ <braunr> again, should guide you
+ pretty well
+ <braunr> since it's the very least necessary to use that interface
+ <tschwinge> nowhere_man: How, roughly but realistically, are your plans to
+ continue this task?
+ <tschwinge> nowhere_man: What has been blocking you this week so you
+ couldn't work on your task?
+ <nowhere_man> tschwinge: mostly a previous work that was supposed to end at
+ the beginning of the summer and only went online now, for which I'm
+ basically sysadmin
+ <braunr> 21:25 < tschwinge> nowhere_man: How, roughly but realistically,
+ are your plans to continue this task?
+ <braunr> this question is really more interesting actually
+ <nowhere_man> right now, I want to write a network device that just sends
+ its frames by IPC
+ <braunr> why ?
+ <nowhere_man> as I never wrote any program using Mach's IPC, that seems the
+ easiest to get them right
+ <braunr> you won't have time
+ <braunr> 21:22 < braunr> nowhere_man: i told you to look at the work by
+ zhengda concerning eth-multiplexer, did you do that ?
+ <nowhere_man> braunr: not yet, no
+ <braunr> well that's your best chance to make some progress
+ <nowhere_man> braunr: is writing the virtual network device that hard?
+ <braunr> basically, it allows "bridging" the pfinet instances of various
+ subhurds
+ <braunr> the virtual network device you want *is* eth-multiplexer
+ <tschwinge> nowhere_man: GSoC is nearly over. That's why I'm asking how
+ this task is going to continue. I'm sorry but I reckon you have not
+ spent anywhere near the amount of hours that are meant to be spent on it.
+ <braunr> and from what antrik told me, yes it's hard, and moreover, why
+ rewrite it if it already exists and you're late
+ <braunr> i agree
+ <nowhere_man> tschwinge: I know, I've started way too late because of my
+ second round of exams
+ <tschwinge> nowhere_man: OK, that's how you started. But how is it going
+ to continue...
+ <nowhere_man> tschwinge: in short, I write a prototype that just starts a
+ subhurd, and when that works correctly I add the network
+ <tschwinge> nowhere_man: I mean from an organizational point of view.
+ <nowhere_man> well, between now and the beginning of september, I'll work
+ full-time on this
+ <nowhere_man> up until september 8th
+# IRC, freenode, #hurd, 2012-08-09
+ <antrik> nowhere_man: you do *not* have to do a replacement network
+ device. zhengda did that years ago.
+ <antrik> nowhere_man: also note that zhengda also implemented the support
+ for *using* the virtual network device (in fact any replacement devices
+ -- except that no others actually exist yet) in boot
+ <youpi> which is already in, actually, isn't it?
+ <antrik> youpi: hm, yes... it was the patch that zhengda posted on the list
+ once, but later updated, and at some later point you merged the outdated
+ variant from the list...
+ <youpi> outdated?
+ <youpi> ah, but he never posted the updated one, and it got lost in git
+ repos, right?
+ <youpi> (what was updated actually?)
+ <antrik> he changed the option name and description later for more
+ clarity. don't remember whether there were other changes
+ <antrik> -f, --device=device_name=device_file
+ <antrik> Specify a device file used by subhurd
+ and its
+ <antrik> virtual name.
+ <antrik> that's the one from the Debian package
+ <antrik> -m, --device-map=DEBICENAME=DEVICEFILE
+ <antrik> Map the device in subhurd to the
+ device in the
+ <antrik> main Hurd.
+ <antrik> that's the one I have locally built from his tree
+ <youpi> so you actually have access to his tree?
+ <antrik> uhm... I used to... it was on flubber
+# IRC, freenode, #hurd, 2012-08-18
+ <nowhere_man> so, this week I discovered how fun it is to work on a
+ non-mainstream OS
+ <nowhere_man> I hoped to start coding the tool itself, put together the
+ skeleton, but every Lisp implementation I tried had problems
+ <braunr> ah you want to write it in lisp ?
+ <nowhere_man> ECL, that I had ported a few years ago, actually FTBFS since
+ <nowhere_man> I hoped to be able, it would be easier for me
+ <nowhere_man> and when I tried Scheme, I started with Guile (it's GNU's own
+ Scheme implementation, after all)
+ <nowhere_man> and when I execute the FFI functions, to access functions in
+ libmachdec
+ <nowhere_man> I get SIGILL
+ <braunr> i can't advise you about anything lisp related
+ <braunr> the most reliable thing you'll find on the hurd is C
+ <nowhere_man> I tried to debug that, but running Guile in GDB gets me a
+ <nowhere_man> I'll try to make ECL to build again
+ <braunr> this seems like a waste of time to me
+ <braunr> avoid spending time on anything that isn't directly related to
+ your goal if you still hope to finish it
+ <nowhere_man> I'm ten times more comfortable coding in Lisp
+ <braunr> it doesn't matter, you're late
+ <nowhere_man> yeah, I know, so taking the time to correct that problem
+ won't change the fact that I won't finish in time
+ <nowhere_man> so I'll finish anyway, and in Lisp
+ <braunr> and if you lack something else, like some mach/hurd specific lisp
+ bindings, you'll have to spend more time on that
+ <braunr> ok
+ <nowhere_man> do you know if someone had a SIGILL situation on Hurd in the
+ past?
+ <nowhere_man> I'm wondering if that's a known kind of issue
+ <braunr> there are lots of issues
+ <braunr> especially when it comes to other languages and runtime
+ environments
+ <nowhere_man> but is it like MAX_PATH_LEN, something that is known to
+ happen when porting something on Hurd?
+ <braunr> i'm not sure how comparable it is
+ <braunr> i'd say it's often before of the conformance issues of the hurd
+ <braunr> because*
+ <nowhere_man> like missing bits of POSIX ?
+ <braunr> or simple wrong for some corner cases
+ <braunr> simply*
+ <bubu^> nowhere_man, I was able to run guile on my hurd image through qemu
+ <bubu^> but I didn't make any complex programs to check if everything
+ works fine
+ <nowhere_man> yeah, it runs fine
+ <nowhere_man> FFI functions get you a SIGILL
+ <nowhere_man>
+ <nowhere_man> the define-module form at the beginning triggers the signal
+ <antrik> nowhere_man: what do you want to implement in Lisp?
+ <antrik> BTW, the guy working on Lisp bindings a couple of years ago used
+ Clisp
+ <antrik> it was working back then
+ <nowhere_man> antrik: the program that sets up a subhurd
+ <nowhere_man> I always forget about clisp, I'll try it right away
diff --git a/community/gsoc/project_ideas.mdwn b/community/gsoc/project_ideas.mdwn
index 8ce10ffa..9d486b00 100644
--- a/community/gsoc/project_ideas.mdwn
+++ b/community/gsoc/project_ideas.mdwn
@@ -1,5 +1,5 @@
-[[!meta copyright="Copyright © 2008, 2009, 2010, 2011 Free Software Foundation,
+[[!meta copyright="Copyright © 2008, 2009, 2010, 2011, 2012 Free Software
+Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -87,6 +87,7 @@ other: language_bindings, gnat, gccgo, perl_python. -->
[[!inline pages="community/gsoc/project_ideas/tcp_ip_stack" show=0 feeds=no actions=yes]]
[[!inline pages="community/gsoc/project_ideas/nfs" show=0 feeds=no actions=yes]]
[[!inline pages="community/gsoc/project_ideas/pthreads" show=0 feeds=no actions=yes]]
+[[!inline pages="community/gsoc/project_ideas/smp" show=0 feeds=no actions=yes]]
[[!inline pages="community/gsoc/project_ideas/sound" show=0 feeds=no actions=yes]]
[[!inline pages="community/gsoc/project_ideas/disk_io_performance" show=0 feeds=no actions=yes]]
[[!inline pages="community/gsoc/project_ideas/vm_tuning" show=0 feeds=no actions=yes]]
diff --git a/community/gsoc/project_ideas/smp.mdwn b/community/gsoc/project_ideas/smp.mdwn
new file mode 100644
index 00000000..e17c2ccf
--- /dev/null
+++ b/community/gsoc/project_ideas/smp.mdwn
@@ -0,0 +1,16 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!meta title="SMP"]]
+# IRC, freenode, #hurd, 2012-09-30
+ <braunr> i expect smp to be our next gsoc project
diff --git a/glibc/select.mdwn b/glibc/select.mdwn
new file mode 100644
index 00000000..bafda141
--- /dev/null
+++ b/glibc/select.mdwn
@@ -0,0 +1,25 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!tag open_issue_documentation]]
+# IRC, freenode, #hurd, 2012-08-10
+ <afleck> what is the use of having a port set name that can receive from
+ multiple ports?
+ <youpi> think of select()
+ <afleck> I haven't really gotten into it yet, I was just reading the Mach
+ Kernel Guide and I didn't understand the difference between having a port
+ set and multiple ports, since you can't choose which port receives in a
+ port set.
+ <youpi> with multiple ports, you'd have to have as many threads to block in
+ reception
+ <youpi> or poll in turn
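As a rough analogy for youpi's answer, the sketch below uses Python's `select.select` on two socket pairs: one thread blocks once and is told which endpoint has data, much as a single receive on a port set returns a message from whichever member port has one. This is not Mach code; the helper name `wait_any` and the use of socket pairs as stand-ins for ports are assumptions made purely for illustration.

```python
import select
import socket


def wait_any(receivers, timeout=1.0):
    """Block once on all receivers; return (ready_socket, data), or None on timeout."""
    readable, _, _ = select.select(receivers, [], [], timeout)
    if not readable:
        return None
    ready = readable[0]
    return ready, ready.recv(64)


# Two independent channels, standing in for two ports (illustrative only).
a_recv, a_send = socket.socketpair()
b_recv, b_send = socket.socketpair()

# Only the second channel has a message pending.
b_send.sendall(b"hello from b")

# A single blocking call covers both channels; no second thread, no polling loop.
ready, data = wait_any([a_recv, b_recv])
print(ready is b_recv, data)

for s in (a_recv, a_send, b_recv, b_send):
    s.close()
```

Without such a combined wait, the caller would need one blocked thread per channel or would have to poll each channel in turn, which are the two alternatives youpi mentions.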
diff --git a/hurd/console/discussion.mdwn b/hurd/console/discussion.mdwn
new file mode 100644
index 00000000..f887d826
--- /dev/null
+++ b/hurd/console/discussion.mdwn
@@ -0,0 +1,42 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!tag open_issue_documentation]]
+# IRC, OFTC, #debian-hurd, 2012-09-24
+ <allesa> hello, I'm trying to get familiar with the Hurd and would like to
+ change the keyboard layout in use. It seems all the information I can
+ find (relating to console-driver-xkb) is out of date, with the latest
+ info relating to it being that this package should not be used anymore…
+ <allesa> does anyone know how changing keyboard layouts currently works?
+ <allesa> ah, never mind. I assume it doesn't currently work:
+ <allesa> *
+ <youpi> it does actually work
+ <youpi> simply dpkg-reconfigure keyboard-configuration
+ <youpi> and reboot
+ <youpi> (see
+ <youpi> )
+ <allesa> mhm, I got that far — but selecting my layout gave me no joy, even
+ after restart. Seem to be stuck with the layout chosen during
+ installation (d-i). Just to check I'm using the right version — still on
+ the installer isos from 15 July?
+ <allesa> wait… progress is being made — slowly and subtly…
+ <allesa> Ok, so the XKBLAYOUT is changing as you described, but XKBVARIANT
+ seems to be ignored. Could this be right?
+ <youpi> yes, the hurd console only supports keymaps
+ <youpi> (currently)
+ <allesa> Ah OK, thanks for your help on this. I imagine this is not
+ something that just requires simple repetitive work, but some actual
+ hacking?
+ <allesa> to fix that is…
+ <youpi> some hacking yes
diff --git a/hurd/libstore/nbd_store.mdwn b/hurd/libstore/nbd_store.mdwn
index 5874b162..4d4a769f 100644
--- a/hurd/libstore/nbd_store.mdwn
+++ b/hurd/libstore/nbd_store.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2007, 2008, 2009 Free Software Foundation,
+[[!meta copyright="Copyright © 2007, 2008, 2009, 2012 Free Software Foundation,
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -10,3 +10,8 @@ is included in the section entitled [[GNU Free Documentation
[[!meta title="nbd store: Linux-compatible network block device"]]
+# Open Issues
+ * [[!GNU_Savannah_task 5722]]
diff --git a/hurd/libthreads.mdwn b/hurd/libthreads.mdwn
index 8b1a97e6..c8d819d4 100644
--- a/hurd/libthreads.mdwn
+++ b/hurd/libthreads.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -13,10 +13,14 @@ License|/fdl]]."]]"""]]
# Internals
+## Threading Model
+libthreads has a 1:1 threading model.
## Threads' Death
-C threads death doesn't actually free the thread's stack (and maybe not the
+A thread's death doesn't actually free the thread's stack (and maybe not the
associated Mach ports either). That's because there's no way to free the stack
after the thread dies (because the thread of control is gone); the stack needs
to be freed by something else, and there's nothing convenient to do it. There
@@ -26,3 +30,5 @@ However, it isn't really a leak, because the unfreed resources do get used for
the next thread. So the issue is that the shrinkage of resource consumption
never happens, but it doesn't grow without bounds; it just stays at the maximum
even if the current number of threads is lower.
+The same issue exists in [[libpthread]].
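The behaviour described in this section can be modelled in a few lines. This is a toy model, not libthreads code: the class `StackCache` and its method names are hypothetical, and real stacks are memory regions rather than strings. It only shows why consumption stays at the high-water mark without growing beyond it.

```python
class StackCache:
    """Toy model: stacks of dead threads are parked and reused, never released."""

    def __init__(self):
        self.free = []        # stacks left behind by threads that have exited
        self.allocated = 0    # total stacks ever allocated; this never shrinks

    def thread_create(self):
        # Prefer a stack left over from a dead thread.
        if self.free:
            return self.free.pop()
        self.allocated += 1
        return f"stack-{self.allocated}"

    def thread_exit(self, stack):
        # The dying thread cannot free the stack it is running on,
        # so the stack is kept for the next thread instead.
        self.free.append(stack)


cache = StackCache()

# Burst of five threads, all of which then exit.
stacks = [cache.thread_create() for _ in range(5)]
for s in stacks:
    cache.thread_exit(s)

# Later, only three threads run: they reuse parked stacks.
later = [cache.thread_create() for _ in range(3)]

print(cache.allocated, len(cache.free))
```

In this model the number of allocated stacks equals the peak number of simultaneously live threads, which matches the statement that usage neither shrinks nor grows without bound.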
diff --git a/hurd/running/qemu.mdwn b/hurd/running/qemu.mdwn
index 512ea602..3648c7d6 100644
--- a/hurd/running/qemu.mdwn
+++ b/hurd/running/qemu.mdwn
@@ -105,6 +105,16 @@ If your machine supports hardware acceleration, you should really use the kvm va
to the command line, see below, if you are running Linux kernels 2.6.37 or 2.6.38 else IRQs may hang sooner or later. The kvm irq problems will be solved in kernel 2.6.39.
+IRC, freenode, #hurd, 2012-08-29:
+ <braunr> youpi: do you remember which linux versions require the
+ -no-kvm-irqchip option ?
+ <braunr> your page indicates 2.6.37-38, but i'm seeing weird things on
+ 2.6.32
+ <braunr> looks like a good thing to use that option all the time actually
+ <gnu_srs> seems like kvm -h says: -no-kvm-irqchip and man kvm says:
+ -machine kernel_irqchip=off
/!\ Note that there are known performance issues with KVM on Linux 2.6.39
kernels, compared to 2.6.32: [[!debbug 634149]]. We're preparing a change
on our side to work around this.
diff --git a/hurd/settrans/discussion.mdwn b/hurd/settrans/discussion.mdwn
index c9ec4d34..74f1c8f5 100644
--- a/hurd/settrans/discussion.mdwn
+++ b/hurd/settrans/discussion.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -16,3 +16,24 @@ License|/fdl]]."]]"""]]
<antrik> ugh... I just realized why settrans -a without -f doesn't
generally work on filesystem translators
<antrik> obviously, it needs -R too!
+# IRC, freenode, #hurd, 2012-08-17
+ <antrik> youpi: no, only the -g is redundant; i.e. -ga is the same as -a
+ <antrik> (actually, not redundant, but rather simply meaningless in this
+ case)
+ <antrik> -g tells what to do with an active translator *when a passive one
+ is changed*
+ <antrik> if no passive one is changed, it does nothing
+ <antrik> (and I realized that after using the Hurd for only 6 years or so
+ ;-) )
+ <braunr> it's not obvious
+ <antrik> braunr: indeed. it's not obvious at all from the --help output :-(
+ <antrik> not sure though how to make it clearer
+ <braunr> the idea isn't obvious
+ <braunr> perhaps telling that "setting a passive translator" also applies
+ to removing it, i.e. setting it to none
+ <antrik> braunr: well, the fact that a translator is unset by setting it to
+ nothing is unclear in general, not only for passive translator. I agree
+ that pointing this out should make things much more clear in general...
diff --git a/hurd/translator/ext2fs.mdwn b/hurd/translator/ext2fs.mdwn
index 460194f9..13a1d9ec 100644
--- a/hurd/translator/ext2fs.mdwn
+++ b/hurd/translator/ext2fs.mdwn
@@ -89,6 +89,20 @@ small backend stores, like floppy devices.
<youpi> which can be quite probable
+## Sync Interval
+[[!tag open_issue_hurd]]
+### IRC, freenode, #hurd, 2012-10-08
+ <braunr> btw, how about we increase our ext2 sync interval to 30 seconds,
+ like others do ?
+ <braunr> not really because others do it that way, but because it severely
+ breaks performance on the hurd
+ <braunr> and 30 seems like a reasonable amount (better than 5 at least)
# Documentation
* <>
diff --git a/hurd/translator/nfs.mdwn b/hurd/translator/nfs.mdwn
index bf24370a..81372204 100644
--- a/hurd/translator/nfs.mdwn
+++ b/hurd/translator/nfs.mdwn
@@ -10,6 +10,11 @@ License|/fdl]]."]]"""]]
Translator acting as a NFS client.
+Only NFSv2/v3 is currently supported.
+[[!tag open_issue_hurd]]There are a few unmerged changes on a former GSoC
+project's topic-branch.
# See Also
diff --git a/hurd/translator/procfs/jkoenig/discussion.mdwn b/hurd/translator/procfs/jkoenig/discussion.mdwn
index 0ee1e238..da6f081e 100644
--- a/hurd/translator/procfs/jkoenig/discussion.mdwn
+++ b/hurd/translator/procfs/jkoenig/discussion.mdwn
@@ -329,3 +329,23 @@ Needed by glibc's `pldd` tool (commit
* pinotree has a local work to add the /proc/$pid/cwd symlink, but relying
on "internal" (but exported) glibc functions
+# "Unusual" PIDs
+Not actually related to procfs, but here seems to be a convenient place for
+filing these:
+## IRC, freenode, #hurd, 2012-08-10
+ <braunr> too bad the proc server has pid 0
+ <braunr> top & co won't show it
+## IRC, OFTC, #debian-hurd, 2012-09-18
+ <pinotree> youpi: did you see
+ <pinotree> ?
+ <youpi> nope
diff --git a/libpthread.mdwn b/libpthread.mdwn
index b31876b3..27ca0da9 100644
--- a/libpthread.mdwn
+++ b/libpthread.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -25,6 +25,27 @@ Mach|microkernel/mach/gnumach]], some [[microkernel/L4]] variants, and
+## Threading Model
+libpthread has a 1:1 threading model.
+## Threads' Death
+A thread's death doesn't actually free the thread's stack (and maybe not the
+associated Mach ports either). That's because there's no way to free the stack
+after the thread dies (because the thread of control is gone); the stack needs
+to be freed by something else, and there's nothing convenient to do it. There
+are many ways to make it work.
+However, it isn't really a leak, because the unfreed resources do get used for
+the next thread. So the issue is that the shrinkage of resource consumption
+never happens, but it doesn't grow without bounds; it just stays at the maximum
+even if the current number of threads is lower.
+The same issue exists in [[hurd/libthreads]].
# Open Issues
[[!inline pages=tag/open_issue_libpthread raw=yes feeds=no]]
diff --git a/microkernel/mach/deficiencies.mdwn b/microkernel/mach/deficiencies.mdwn
index f2f49975..e1f6debc 100644
--- a/microkernel/mach/deficiencies.mdwn
+++ b/microkernel/mach/deficiencies.mdwn
@@ -258,3 +258,265 @@ License|/fdl]]."]]"""]]
working on research around mach
<antrik> braunr: BTW, I have little doubt that making RPC first-class would
solve a number of problems... I just wonder how many others it would open
+# IRC, freenode, #hurd, 2012-09-04
+ <braunr> it was intended as a mach clone, but now that i have better
+ knowledge of both mach and the hurd, i don't want to retain mach
+ compatibility
+ <braunr> and unlike viengoos, it's not really experimental
+ <braunr> it's focused on memory and cpu scalability, and performance, with
+ techniques like thread migration and rcu
+ <braunr> the design i have in mind is closer to what exists today, with
+ strong emphasis on scalability and performance, that's all
+ <braunr> and the reason the hurd can't be modified first is that my design
+ relies on some important design changes
+ <braunr> so there is a strong dependency on these mechanisms that requires
+ the kernel to exist first
+## IRC, freenode, #hurd, 2012-09-06
+In context of [[open_issues/multithreading]] and later [[open_issues/select]].
+ <gnu_srs> And you will address the design flaws or implementation faults
+ with x15?
+ <braunr> no
+ <braunr> i'll address the implementation details :p
+ <braunr> and some design issues like cpu and memory resource accounting
+ <braunr> but i won't implement generic resource containers
+ <braunr> assuming it's completed, my work should provide a hurd system on
+ par with modern monolithic systems
+ <braunr> (less performant of course, but performant, scalable, and with
+ about the same kinds of problems)
+ <braunr> for example, thread migration should be mandatory
+ <braunr> which would make client calls behave exactly like a userspace task
+ asking a service from the kernel
+ <braunr> you have to realize that, on a monolithic kernel, applications are
+ clients, and the kernel is a server
+ <braunr> and when performing a system call, the calling thread actually
+ services itself by running kernel code
+ <braunr> which is exactly what thread migration is for a multiserver system
+ <braunr> thread migration also implies sync IPC
+ <braunr> and sync IPC is inherently more performant because it only
+ requires one copy, no in kernel buffering
+ <braunr> sync ipc also avoids message floods, since client threads must run
+ server code
+ <gnu_srs> and this is not achievable with evolved gnumach and/or hurd?
+ <braunr> well that's not entirely true, because there is still a form of
+ async ipc, but it's a lot less likely
+ <braunr> it probably is
+ <braunr> but there are so many things to change i prefer starting from
+ scratch
+ <braunr> scalability itself probably requires a revamp of the hurd core
+ libraries
+ <braunr> and these libraries are like more than half of the hurd code
+ <braunr> mach ipc and vm are also very complicated
+ <braunr> it's better to get something new and simpler from the start
+ <gnu_srs> a major task nevertheless:-D
+ <braunr> at least with the vm, netbsd showed it's easier to achieve good
+ results from new code, as other mach vm based systems like freebsd
+ struggled to get as good
+ <braunr> well yes
+ <braunr> but at least it's not experimental
+ <braunr> everything i want to implement already exists, and is tested on
+ production systems
+ <braunr> it's just time to assemble those ideas and components together
+ into something that works
+ <braunr> you could see it as a qnx-like system with thread migration, the
+ global architecture of the hurd, and some improvements from linux like
+ rcu :)
+### IRC, freenode, #hurd, 2012-09-07
+ <antrik> braunr: thread migration is tested on production systems?
+ <antrik> BTW, I don't think that generally increasing the priority of
+ servers is a good idea
+ <antrik> in most cases, IPC should actually be sync. slpz looked at it at
+ some point, and concluded that the implementation actually has a
+ fast-path for that case. I wonder what happens to scheduling in this case
+ -- is the receiver scheduled immediately? if not, that's something to
+ fix...
+ <braunr> antrik: qnx does something very close to thread migration, yes
+ <braunr> antrik: i agree increasing the priority isn't a good thing, but
+ it's the best of the quick and dirty ways to reduce message floods
+ <braunr> the problem isn't sync ipc in mach
+ <braunr> the problem is the notifications (in our cases the dead name
+ notifications) that are by nature async
+ <braunr> and a malicious program could send whatever it wants at the
+ fastest rate it can
+ <antrik> braunr: malicious programs can do any number of DOS attacks on the
+ Hurd; I don't see how increasing priority of system servers is relevant
+ in that context
+ <antrik> (BTW, I don't think dead name notifications are async by
+ nature... just like for most other IPC, the *usual* case is that a server
+ thread is actively waiting for the message when it's generated)
+ <braunr> antrik: it's async with respect to the client
+ <braunr> antrik: and malicious programs shouldn't be able to do that kind
+ of dos
+ <braunr> but this won't be fixed any time soon
+ <braunr> on the other hand, a higher priority helps servers not create too
+ many threads because of notifications, and that's a good thing
+ <braunr> gnu_srs: the "fix" for this will be to rewrite select so that it's
+ synchronous btw
+ <braunr> replacing dead name notifications with something like cancelling a
+ previously installed select request
+ <antrik> no idea what "async with respect to the client" means
+ <braunr> it means the client doesn't wait for anything
+ <antrik> what is the client? what scenario are you talking about? how does
+ it affect scheduling?
+ <braunr> for notifications, it's usually the kernel
+ <braunr> it doesn't directly affect scheduling
+ <braunr> it affects the amount of messages a hurd server has to take care
+ of
+ <braunr> and the more messages, the more threads
+ <braunr> i'm talking about event loops
+ <braunr> and non blocking (or very short) selects
+ <antrik> the amount of messages is always the same. the question is whether
+ they can be handled before more come in. which would be the case if by
+ default the receiver gets scheduled as soon as a message is sent...
+ <braunr> no
+ <braunr> scheduling handoff doesn't imply the thread will be ready to
+ service the next message by the time a client sends a new one
+ <braunr> the rate at which a message queue gets filled has nothing to do
+ with scheduling handoff
+ <antrik> I very much doubt rates come into play at all
+ <braunr> well they do
+ <antrik> in my understanding the problem is that a lot of messages are sent
+ before the receiver ever has a chance to handle them. so no matter how
+ fast the receiver is, it loses
+ <braunr> a lot of non blocking selects means a lot of reply ports
+ destroyed, a lot of dead name notifications, and what i call message
+ floods at server side
+ <braunr> no
+ <braunr> it used to work fine with cthreads
+ <braunr> it doesn't any more with pthreads because pthreads are slightly
+ slower
+ <antrik> if the receiver gets a chance to do some work each time a message
+ arrives, in most cases it would be free to service the next request with
+ the same thread
+ <braunr> no, because that thread won't have finished soon enough
+ <antrik> no, it *never* worked fine. it might have been slightly less
+ terrible.
+ <braunr> ok it didn't work fine, it worked ok
+ <braunr> it's entirely a matter of rate here
+ <braunr> and that's the big problem, because it shouldn't
+ <antrik> I'm pretty sure the thread would finish before the time slice ends
+ in almost all cases
+ <braunr> no
+ <braunr> too much contention
+ <braunr> and in addition locking a contended spin lock depresses priority
+ <braunr> so servers really waste a lot of time because of that
+ <antrik> I doubt contention would be a problem if the server gets a chance
+ to handle each request before 100 others come in
+ <braunr> i don't see how this is related
+ <braunr> handling a request doesn't mean entirely processing it
+ <braunr> there is *no* relation between handoff and the incoming
+ message rate
+ <braunr> unless you assume threads can always complete their task in some
+ fixed and low duration
+ <antrik> sure there is. we are talking about a single-processor system
+ here.
+ <braunr> which is definitely not the case
+ <braunr> i don't see what it changes
+ <antrik> I'm pretty sure notifications can generally be handled in a very
+ short time
+ <braunr> if the server thread is scheduled as soon as it gets a message, it
+ can also get preempted by the kernel before replying
+ <braunr> no, notifications can actually be very long
+ <braunr> hurd_thread_cancel calls condition_broadcast
+ <braunr> so if there are a lot of threads on that ..
+ <braunr> (this is one of the optimizations i have in mind for pthreads,
+ since it's possible to precisely select the target thread with a doubly
+ linked list)
+ <braunr> but even if that's the case, there is no guarantee
+ <braunr> you can't assume it will be "quick enough"
+ <antrik> there is no guarantee. but I'm pretty sure it will be "quick
+ enough" in the vast majority of cases. which is all it needs.
+ <braunr> ok
+ <braunr> that's also the idea behind raising server priorities
+ <antrik> braunr: so you are saying the storms are all caused by select(),
+ and once this is fixed, the problem should be mostly gone and the
+ workaround not necessary anymore?
+ <braunr> yes
+ <antrik> let's hope you are right :-)
+ <braunr> :)
+ <antrik> (I still think though that making hand-off scheduling default is
+ the right thing to do, and would improve performance in general...)
+ <braunr> sure
+ <braunr> well
+ <braunr> no it's just a hack ;p
+ <braunr> but it's a right one
+ <braunr> the right thing to do is a lot more complicated
+ <braunr> as roland wrote a long time ago, the hurd doesn't need dead-name
+ notifications, or any notification other than the no-sender (which can be
+ replaced by a synchronous close on fd like operation)
+ <antrik> well, yes... I still think the viengoos approach is promising. I
+ meant the right thing to do in the existing context ;-)
+ <braunr> better than this priority hack
+ <antrik> oh? you happen to have a link? never heard of that...
+ <braunr> i didn't want to do it initially, even resorting to priority
+ depression on thread creation to work around the problem
+ <braunr> hm maybe it wasn't him, i can't manage to find it
+ <braunr> antrik:
+ <braunr> "Long ago, in specifying the constraints of
+ <braunr> what the Hurd needs from an underlying IPC system/object model we
+ made it
+ <braunr> very clear that we only need no-senders notifications for object
+ <braunr> implementors (servers)"
+ <braunr> "We don't in general make use of dead-name notifications,
+ <braunr> which are the general kind of object death notification Mach
+ provides and
+ <braunr> what serves as task death notification."
+ <braunr> "In the places we do, it's to serve
+ <braunr> some particular quirky need (and mostly those are side effects of
+ Mach's
+ <braunr> decouplable RPCs) and not a semantic model we insist on having."
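
Earlier in this log braunr notes that `hurd_thread_cancel` calls `condition_broadcast`, which wakes every waiter, and that a doubly linked list would make it possible to select the target thread precisely. The following is a minimal sketch of that idea only, not the libpthread or cthreads implementation: the names `struct waiter`, `struct cond`, `cond_enqueue`, `cond_wake_one` and `cond_broadcast` are hypothetical, and real blocking and locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical waiter record, one per blocked thread. */
struct waiter {
    struct waiter *prev;
    struct waiter *next;
    int thread_id;          /* stand-in for a real thread handle */
    bool woken;
};

/* Hypothetical condition object: a circular doubly linked list of
   waiters with a sentinel head. */
struct cond {
    struct waiter head;
};

void cond_init(struct cond *c)
{
    c->head.prev = &c->head;
    c->head.next = &c->head;
    c->head.thread_id = -1;
    c->head.woken = false;
}

/* Append a waiter at the tail in O(1). */
void cond_enqueue(struct cond *c, struct waiter *w)
{
    w->woken = false;
    w->prev = c->head.prev;
    w->next = &c->head;
    c->head.prev->next = w;
    c->head.prev = w;
}

/* Wake exactly one known waiter: O(1) unlink, no scan of the list,
   and no other waiter is disturbed. */
void cond_wake_one(struct waiter *w)
{
    assert(w->prev != NULL && w->next != NULL);
    w->prev->next = w->next;
    w->next->prev = w->prev;
    w->prev = NULL;
    w->next = NULL;
    w->woken = true;
}

/* Broadcast for comparison: every waiter is woken, so the cost grows
   with the number of threads blocked on the condition. */
int cond_broadcast(struct cond *c)
{
    int count = 0;

    while (c->head.next != &c->head) {
        cond_wake_one(c->head.next);
        count++;
    }
    return count;
}
```

With this layout, cancelling a single thread costs one unlink regardless of how many other threads wait on the same condition, whereas the broadcast path touches all of them.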
+### IRC, freenode, #hurd, 2012-09-08
+ <antrik> The notion that seemed appropriate when we thought about these
+ issues for
+ <antrik> Fluke was that the "alert" facility be a feature of the IPC system
+ itself
+ <antrik> rather than another layer like the Hurd's io_interrupt protocol.
+ <antrik> braunr: funny, that's *exactly* what I was thinking when looking
+ at the io_interrupt mess :-)
+ <antrik> (and what ultimately convinced me that the Hurd could be much more
+ elegant with a custom-tailored kernel rather than building around Mach)
+## IRC, freenode, #hurd, 2012-09-24
+ <braunr> my initial attempt was a mach clone
+ <braunr> but now i want a mach-like kernel, without compatibility
+ <lisporu> which new licence ?
+ <braunr> and some very important changes like sync ipc
+ <braunr> gplv3
+ <braunr> (or later)
+ <lisporu> cool 8)
+ <braunr> yes it is gplv2+ since i didn't take the time to read gplv3, but
+ now that i have, i can't use anything else for such a project: )
+ <lisporu> what is mach-like ? (how it is different from Pistachio like ?)
+ <braunr> l4 doesn't provide capabilities
+ <lisporu> hmmm..
+ <braunr> you need a userspace for that
+ <braunr> +server
+ <braunr> and it relies on complete external memory management
+ <lisporu> how much work is done ?
+ <braunr> my kernel will provide capabilities, similar to mach ports, but
+ simpler (less overhead)
+ <braunr> i want the primitives right
+ <braunr> like multiprocessor, synchronization, virtual memory, etc..
+### IRC, freenode, #hurd, 2012-09-30
+ <braunr> for those interested, x15 is now a project of its own, with no
+ gnumach compatibility goal, and covered by gplv3+
diff --git a/microkernel/mach/gnumach/memory_management.mdwn b/microkernel/mach/gnumach/memory_management.mdwn
index c630af05..3e158b7c 100644
--- a/microkernel/mach/gnumach/memory_management.mdwn
+++ b/microkernel/mach/gnumach/memory_management.mdwn
@@ -48,6 +48,7 @@ License|/fdl]]."]]"""]]
<braunr> and mmu management
<braunr> (but maybe that's what you meant by physical memory)
## IRC, freenode, #hurd, 2011-02-16
<braunr> antrik: youpi added it for xen, yes
@@ -119,3 +120,16 @@ License|/fdl]]."]]"""]]
<braunr> there is issue with watch ./slabinfo which turned in a infinite
loop, but it didn't affect the stability of the system
<braunr> actually with a 64-bits kernel, we could use a 4/x split
+# IRC, freenode, #hurd, 2012-08-10
+ <braunr> all modern systems embed the kernel in every address space
+ <braunr> which allows reduced overhead when making a system call
+ <braunr> sometimes there is no context switch at all
+ <braunr> on i386, there are security checks to upgrade the privilege level
+ (switch to ring 0), and when used, kernel page tables are global, so
+ they're not flushed
+ <braunr> using sysenter/sysexit makes it even faster
diff --git a/microkernel/mach/gnumach/ports.mdwn b/microkernel/mach/gnumach/ports.mdwn
index f114460c..e7fdb446 100644
--- a/microkernel/mach/gnumach/ports.mdwn
+++ b/microkernel/mach/gnumach/ports.mdwn
@@ -1,5 +1,5 @@
-[[!meta copyright="Copyright © 2007, 2008, 2009, 2011 Free Software Foundation,
+[[!meta copyright="Copyright © 2007, 2008, 2009, 2011, 2012 Free Software
+Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -13,6 +13,11 @@ License|/fdl]]."]]"""]]
* [[Xen]]
+ * [[open_issues/64-bit_port]]. There is some preliminary work for a
+ x86\_64 port.
+ * [[open_issues/ARM_port]]. Is not in a usable state.
* [PowerPC]( Is not in a usable state.
* Alpha: [project I](, and
diff --git a/microkernel/mach/gnumach/projects/clean_up_the_code.mdwn b/microkernel/mach/gnumach/projects/clean_up_the_code.mdwn
index 2a9b4b60..89a27b01 100644
--- a/microkernel/mach/gnumach/projects/clean_up_the_code.mdwn
+++ b/microkernel/mach/gnumach/projects/clean_up_the_code.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2005, 2006, 2007, 2008, 2010 Free Software
+[[!meta copyright="Copyright © 2005, 2006, 2007, 2008, 2010, 2012 Free Software
Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -121,3 +121,12 @@ further files (also exported ones) that serve no real value, but are being
# Rewrite ugly code
+# IRC, freenode, #hurd, 2012-09-06
+ <mcsim> hello. Why size parameter of rpc device_read has type
+ "mach_msg_type_number_t *"? Why not just "vm_size_t *"?
+ <mcsim> this parameter has name data_count
+ <braunr> that's one of the reasons mach is confusing
+ <braunr> i can't really tell you why, it's messy :/
diff --git a/microkernel/mach/history.mdwn b/microkernel/mach/history.mdwn
index 5a3608cd..776bb1d7 100644
--- a/microkernel/mach/history.mdwn
+++ b/microkernel/mach/history.mdwn
@@ -58,3 +58,23 @@ Verbatim copying and distribution of this entire article is permitted in any med
Apple's Macintosh OSX (OS 10.x) is based on [Darwin]( _"Darwin uses a monolithic kernel based on [[TWiki/FreeBSD]] 4.4 and the OSF/mk Mach 3."_ Darwin also has a [Kernel Programming]( Book.
-- [[Main/GrantBow]] - 22 Oct 2002
+IRC, freenode, #hurd, 2012-08-29:
+ <pavlx> was moved the page from about darwin kernel programming
+ as described on the
+ <pavlx> i found the page and it's
+ <pavlx> it's not anymore the old page
+ <pavlx> and the link about darwin does not exist anymore ! the new one
+ could be
+ <pavlx> the old one was
+ <pavlx> the link to Darwin is changed i suppose that the new one is
+ <pavlx> and the link to Kern Programming it's
+ <pavlx> can't be anymore
diff --git a/microkernel/mach/port.mdwn b/microkernel/mach/port.mdwn
index 26b55456..ccc7286f 100644
--- a/microkernel/mach/port.mdwn
+++ b/microkernel/mach/port.mdwn
@@ -1,5 +1,5 @@
-[[!meta copyright="Copyright © 2002, 2003, 2007, 2008, 2010, 2011 Free Software
-Foundation, Inc."]]
+[[!meta copyright="Copyright © 2002, 2003, 2007, 2008, 2010, 2011, 2012 Free
+Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -86,4 +86,4 @@ When a server process' thread receives from a port set, it dequeues exactly one
message from any of the ports that has a message available in its queue.
This concept of port sets is also the facility that makes convenient
-implementation of [[UNIX]]'s `select` [[system_call]] possible.
+implementation of [[UNIX's `select` system call|glibc/select]] possible.
diff --git a/open_issues/64-bit_port.mdwn b/open_issues/64-bit_port.mdwn
index 797d540f..2d273ba1 100644
--- a/open_issues/64-bit_port.mdwn
+++ b/open_issues/64-bit_port.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -10,7 +10,11 @@ License|/fdl]]."]]"""]]
[[!tag open_issue_gnumach open_issue_mig]]
-IRC, freenode, #hurd, 2011-10-16:
+There is a `master-x86_64` GNU Mach branch. As of 2012-11-20, it only supports
+the [[microkernel/mach/gnumach/ports/Xen]] platform.
+# IRC, freenode, #hurd, 2011-10-16
<youpi> it'd be really good to have a 64bit kernel, no need to care about
addressing space :)
@@ -34,3 +38,22 @@ IRC, freenode, #hurd, 2011-10-16:
<youpi> and it'd boost userland addrespace to 4GiB
<braunr> yes
<youpi> leaving time for a 64bit userland :)
+# IRC, freenode, #hurd, 2012-10-03
+ <braunr> youpi: just so you know in case you try the master-x86_64 with
+ grub
+ <braunr> youpi:
+ <youpi> ok, thx
+ <braunr> the squeeze version is fine but i had to patch the wheezy/sid one
+ <youpi> I actually hadn't hoped to boot into 64bit directly from grub
+ <braunr> youpi: there is code in viengoos that could be reused
+ <braunr> i've been thinking about it for a time now
+ <youpi> ok
+ <braunr> the two easiest ways are 1/ the viengoos one (a -m32 object file
+ converted with objcopy as an embedded loader)
+ <braunr> and 2/ establishing an identity mapping using 4x1 GB large pages
+ and switching to long mode, then jumping to c code to complete the
+ initialization
+ <braunr> i think i'll go the second way with x15, so you'll have the two :)
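
The second approach braunr describes (an identity mapping built from 4 x 1 GiB large pages, set up before switching to long mode) can be sketched as follows. This is an illustration under stated assumptions, not code from GNU Mach, Viengoos or x15: `build_identity_map` and the macro names are hypothetical, the tables are plain arrays so the logic can be checked on a host, and the steps that actually enter long mode (loading CR3, setting CR4.PAE, EFER.LME and CR0.PG) are omitted. Note that not every x86-64 CPU supports 1 GiB pages.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT   (UINT64_C(1) << 0)
#define PTE_WRITABLE  (UINT64_C(1) << 1)
#define PTE_PS        (UINT64_C(1) << 7)   /* in a PDPT entry: maps a 1 GiB page */
#define TABLE_ENTRIES 512
#define GIB           (UINT64_C(1) << 30)

/* Fill a PML4 and a PDPT so that virtual addresses 0..4 GiB map to the
   same physical addresses.  pdpt_phys is the physical address of the
   PDPT and must be 4 KiB aligned. */
void build_identity_map(uint64_t pml4[TABLE_ENTRIES],
                        uint64_t pdpt[TABLE_ENTRIES],
                        uint64_t pdpt_phys)
{
    assert((pdpt_phys & 0xfff) == 0);

    for (int i = 0; i < TABLE_ENTRIES; i++) {
        pml4[i] = 0;
        pdpt[i] = 0;
    }

    /* One PML4 slot covers 512 GiB; only the first one is needed. */
    pml4[0] = pdpt_phys | PTE_PRESENT | PTE_WRITABLE;

    /* Four PDPT entries, each a 1 GiB page at physical address i GiB. */
    for (uint64_t i = 0; i < 4; i++)
        pdpt[i] = (i * GIB) | PTE_PRESENT | PTE_WRITABLE | PTE_PS;
}
```

Only two tables (8 KiB) are needed this way, which keeps the early assembly small before jumping to C code to complete the initialization.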
diff --git a/open_issues/arm_port.mdwn b/open_issues/arm_port.mdwn
new file mode 100644
index 00000000..2d8b9038
--- /dev/null
+++ b/open_issues/arm_port.mdwn
@@ -0,0 +1,238 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+Several people have expressed interest in a port of GNU/Hurd for the ARM
+# IRC, freenode, #hurd, 2012-10-09
+ <mcsim> bootinfdsds: There was an unfinished port to arm, if you're
+ interested.
+ <tschwinge> mcsim: Has that ever been published?
+ <mcsim> tschwinge: I don't think so. But I have an email of that person and
+ I think that this could be discussed with him.
+## IRC, freenode, #hurd, 2012-10-10
+ <tschwinge> mcsim: If you have a contact to the ARM porter, could you
+ please ask him to post what he has?
+ <antrik> tschwinge: we all have the "contact" -- let me remind you that he
+ posted his questions to the list...
+## IRC, freenode, #hurd, 2012-10-17
+ <mcsim> tschwinge: Hello. The person who I wrote regarding arm port of
+ gnumach still hasn't answered. And I don't think that he is going to
+ answer.
+# IRC, freenode, #hurd, 2012-11-15
+ <matty3269> Well, I have a big interest in the ARM architecture, I worked
+ at ARM for a bit too, and I've written my own little OS that runs on
+ qemu. Is there an interest in getting hurd running on ARM?
+ <braunr> matty3269: not really currently
+ <braunr> but if that's what you want to do, sure
+ <tschwinge> matty3269: Well, interest -- sure!, but we don't really have
+ people savvy in low-level kernel implementation on ARM. I do know some
+ bits about it, but more about the instruction set than about its memory
+ architecture, for example.
+ <tschwinge> matty3269: But if you're feeling adventurous, by all means work
+ on it, and we'll try to help as we can.
+ <tschwinge> matty3269: There has been one previous attempt for an ARM port,
+ but that person never published his code, and apparently moved to a
+ different project.
+ <tschwinge> matty3269: I can help with toolchains (GCC, etc.) things for
+ ARM, if there's need.
+ <matty3269> tschwinge: That sounds great, thanks! Where would you recommend
+ I start (at the moment I've got Mach checked out and am trying to get it
+ compiled for i386)
+ <matty3269> I'm guessing that the Mach micro-kernel is all that would need
+ to be ported or are there arch-dependent bits of code in the server
+ processes?
+ <tschwinge> matty3269:
+ has some
+ information. Mach is the biggest part, yes. Then some bits in glibc and
+ libpthread, and even less in the Hurd libraries and servers.
+ <tschwinge> matty3269: Basically, you'd need equivalents for the i386 (and
+ similar) directories, yep.
+ <tschwinge> Though, you may be able to avoid some cruft in there.
+ <tschwinge> Does building for x86 have any issues?
+ <tschwinge> matty3269: How is generally your understanding of the Hurd on
+ Mach system architecture, and on microkernel-based systems generally, and
+ on Mach in particular?
+ <matty3269> tschwinge: yes, it seems to be progressing... I've got mig
+ installed and it's just compiling now
+ <matty3269> hmm, not too great if I'm honest, I've done mostly monolithic
+ kernel development so having such low-level processes, such as
+ scheduling, done in user-space seems a little strange
+ <tschwinge> Ah, yes, MIG will need a little bit of porting, too. I can
+ help with that, but that's not a priority -- first you have to get Mach
+ to boot at all; MIG will only be needed once you need to deal with RPCs,
+ so user-land/kernel interaction, basically. Before, you can hack around
+ it.
+ <matty3269> tschwinge: I have been running a GNU/Hurd system for a while
+ now though
+ <tschwinge> I'm happy to tell you that the scheduler is still in the
+ kernel. ;-)
+ <tschwinge> OK, good, so you know about the basic ideas.
+ <braunr> matty3269: there has to be machine specific stuff in user space
+ <braunr> for initial thread contexts for example
+ <matty3269> tschwinge: Ok, just got gnumach built
+ <braunr> but there isn't much and you can easily base your work from the
+ x86 implementation
+ <tschwinge> Yes. Mach itself is the more difficult one.
+ <matty3269> braunr: Yeah, looking around at things, it doesn't seem that
+ there will be too much work involved in the user-space stuff
+ <tschwinge> braunr: Do you know off-hand whether there are some old Mach
+ research papers describing architecture ports?
+ <tschwinge> I know there are some describing the memory system (obviously),
+ and I/O system -- which may help matty3269 to understand the general
+ design/structure.
+ <tschwinge> We might want to identify some documents, and make a list.
+ <braunr> all mach related documentation i have is available here:
+ <braunr> (also through http://)
+ <tschwinge> matty3269: Oh, definitely I'd suggest the Mach 3 Kernel
+ Principles book. That gives a good description of the Mach architecture.
+ <matty3269> Great, that's my weekends reading then!
+ <braunr> you don't need all that for a port
+ <matty3269> Is it possible to run the gnumach binary standalone with qemu?
+ <braunr> you won't go far with it
+ <braunr> you really need at least one program
+ <braunr> but sure, for a port development, it can easily be done
+ <braunr> i'd suggest writing a basic static application for your tests once
+ you reach an advanced state
+ <braunr> the critical parts of a port are memory and interrupts
+ <braunr> and memory can be particularly difficult to implement correctly
+ <tschwinge> matty3269: I once used QEMU's
+ virtual-FAT-filesystem-from-a-directory-on-the-host, and configured GRUB
+ to boot from that one, so it was easy to quickly reboot for kernel
+ development.
+ <braunr> but the good news is that almost every bsd system still uses a
+ similar interface
+ <tschwinge> matty3269: And, you may want to become familiar with QEMU's
+ built-in gdbserver, and how to connect to and use that.
+ <braunr> so, for example, you could base your work from the netbsd/arm pmap
+ module
+ <tschwinge> matty3269: I think that's better than starting on real
+ hardware.
+ <braunr> tschwinge: you can use -kernel with a multiboot binary now
+ <braunr> tschwinge: and even creating iso images is so fast it's not any
+ slower
+ <tschwinge> braunr: Yeah, I thought so, but never checked this out --
+ recently I saw in qemu --help's output some »multiboot« thing flashing
+ by. :-)
+ <braunr> i think it only supports 32-bits executables though
+ <matty3269> braunr: Yeah, I just tried passing gnumach as the -kernel
+ parameter to qemu, but it segged qemu :S
+ <braunr> otherwise i'd be using it for x15
+ <matty3269> qemu: fatal: Trying to execute code outside RAM or ROM at
+ 0xc0100000
+ <braunr> how much ram did you give qemu ?
+ <matty3269> I used '-m 512'
+ <braunr> hum, so the -kernel option doesn't correctly implement elf loading
+ or something like that
+ <braunr> anyway, i'm not sure how well building gnumach on a non-hurd
+ system is supported
+ <braunr> so you may want to simply develop inside your VM for the time
+ being, and reboot
+ <matty3269> doing an objdump of it seems fine...
+ <braunr> ?
+ <braunr> ah, the gnumach executable is a correct elf image
+ <braunr> that's not the point
+ <matty3269> Is there particular reason that mach is linked at 0xc0100000?
+ <matty3269> or is that where it is expected to be in VM?
+ <tschwinge> That's my understanding.
+ <braunr> kernels commonly sit at high addresses
+ <braunr> that's the "standard" 3G/1G split for user/kernel space
+ <matty3269> I think Linux sits at a similar VA for 32-bit
+ <braunr> no
+ <matty3269> Oh, I thought it did, I know it does on ARM, the kernel is
+ mapped to 0xc000000
+ <braunr> i don't know arm, but are you sure about this number ?
+ <braunr> seems to lack a 0
+ <matty3269> Ah, yes sorry
+ <matty3269> so 0xC0000000
+ <braunr> 0xc0100000 is just 1 MiB above it
+ <braunr> the .text section of linux on x86 actually starts at c1000000
+ (above 16 MiB, certainly to preserve as much dma-able memory since modern
+ machines now have a lot more)
+ <tschwinge> Surely the GRUB multiboot loader is not that much used/tested?
+ <braunr> unfortunately, no
+ <braunr> matty3269: FYI, my kernel starts at 0xfff00000 :p
+ <matty3269> braunr: hmm, you could be right, I know it's arround there
+ someone
+ <matty3269> somewhere*
+ <matty3269> braunr: that's an interesting address :S
+ <matty3269> braunr: is that the PA address of the kernel or the VA inside a
+ process?
+ <braunr> the VA
+ <matty3269> hmm
+ <braunr> it can't be a PA
+ <braunr> such high addresses are normally device memory
+ <braunr> but don't worry, i have good reasons to have chosen this address
+ :)
+ <matty3269> so with gnumach, does the boot-up sequence use PIC until VM is
+ active and the kernel mapped to the linking address?
+ <braunr> no
+ <braunr> actually i'm not certain of the details
+ <braunr> but there is no PIC
+ <braunr> either special sections are linked at physical addresses
+ <braunr> or it relies on the fact that all executable code uses near jumps
+ <braunr> and uses offsets when accessing data
+ <braunr> (which is why the kernel text is at 3 GiB + 1 MiB, and not 3 GiB)
+ <matty3269> hmm,
+ <matty3269> gah, I need to learn x86
+ <braunr> that would certainly help
+ <matty3269> I've just had a look at boothdr.S; I presume that there must be
+ something else that is executed before this to setup VM, switch to 32-bit
+ more etc...?
+ <matty3269> mode*
+ <braunr> have a look at the multiboot specification
+ <braunr> it sets protected mode
+ <braunr> but not paging
+ <braunr> (i mean, the boot loader does, before passing control to the
+ kernel)
+ <matty3269> Ah, I see
+ <tschwinge> matty3269: Multiboot should be documented in the GRUB package.
+ <matty3269> tschwinge: yep, got that, thanks
+ <matty3269> hmm, I can't find any reference to CR0 in gnumach so paging
+ must be enabled elsewhere
+ <matty3269> oh wait, found it
+ <braunr> $ git grep -i '\<cr0\>'
+ <braunr> i386/i386/proc_reg.h, linux/dev/include/asm-i386/system.h
+ <braunr> although i suspect only the first one is relevant to us :)
+ <matty3269> Yeah, that seems to have the setup code for paging :)
+ <matty3269> I'm still confused how it could run that without paging or PIC
+ though
+ <matty3269> I think I need to watch the boot sequence with qemu
+ <braunr> it's a bit tricky
+ <braunr> but actually simple
+ <braunr> 00:44 < braunr> either special sections are linked at physical
+ addresses
+ <braunr> 00:44 < braunr> or it relies on the fact that all executable code
+ uses near jumps
+ <braunr> that's really all there is
+ <braunr> but you shouldn't worry about that i suppose, as the protocol
+ between the boot loader and an arm kernel will certainly not be the saem
+ <braunr> same*
+ <matty3269> indeed, ARM is tricky because memory maps are vastly different
+ on every platform
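
The address arithmetic from the discussion above (the 3G/1G user/kernel split and the kernel text linked at 3 GiB + 1 MiB) can be sketched as below. The helper names are hypothetical, and two things are assumptions rather than facts taken from the log: that the kernel image is loaded at physical 1 MiB, and that kernel virtual addresses are physical addresses plus a constant 3 GiB offset. `inside_ram` shows one possible reading of the QEMU error: treated as a physical address, 0xc0100000 lies far beyond the 512 MiB given with `-m 512`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_BASE    UINT32_C(0xC0000000)   /* 3 GiB: start of kernel space */
#define KERNEL_LOAD_PA UINT32_C(0x00100000)   /* 1 MiB: assumed physical load address */
#define KERNEL_LINK_VA (KERNEL_BASE + KERNEL_LOAD_PA)

/* With a 3G/1G split, everything at or above 3 GiB belongs to the kernel. */
bool is_kernel_address(uint32_t va)
{
    return va >= KERNEL_BASE;
}

/* Assumed linear mapping: physical = virtual - 3 GiB. */
uint32_t kernel_va_to_pa(uint32_t va)
{
    assert(is_kernel_address(va));
    return va - KERNEL_BASE;
}

/* Inverse of the above; only the first 1 GiB of physical memory can be
   reached through a 1 GiB kernel window. */
uint32_t kernel_pa_to_va(uint32_t pa)
{
    assert(pa < UINT32_C(0x40000000));
    return pa + KERNEL_BASE;
}

/* Does a physical address fall inside RAM of the given size in MiB? */
bool inside_ram(uint32_t pa, uint32_t ram_mib)
{
    return (uint64_t)pa < ((uint64_t)ram_mib << 20);
}
```

Under these assumptions `KERNEL_LINK_VA` evaluates to 0xC0100000, the link address matty3269 observed, and `inside_ram(0xC0100000, 512)` is false while `inside_ram(0x00100000, 512)` is true.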
+## IRC, freenode, #hurd, 2012-11-21
+ <matty3269> Well, I have a ARM gnumach kernel compiled. It just doesn't
+ run! :)
+ <braunr> matty3269: good luck :)
diff --git a/open_issues/code_analysis/discussion.mdwn b/open_issues/code_analysis/discussion.mdwn
index f8a0657d..6f2acc08 100644
--- a/open_issues/code_analysis/discussion.mdwn
+++ b/open_issues/code_analysis/discussion.mdwn
@@ -42,3 +42,20 @@ License|/fdl]]."]]"""]]
tool, please add it to open_issues/code_analysis.mdwn
<antrik> (I guess we should have a "proper" page listing useful debugging
+## IRC, freenode, #hurd, 2012-09-03
+ <mcsim> hello. Has anyone tried some memory debugging tools like duma or
+ dmalloc with hurd?
+ <braunr> mcsim: yes, but i couldn't
+ <braunr> i tried duma, and it crashes, probably because of cthreads :)
+## IRC, freenode, #hurd, 2012-09-08
+ <mcsim> hello. What static analyzer would you suggest (probably you have
+ tried it for hurd already)?
+ <braunr> mcsim: if you find some good free static analyzer, let me know :)
+ <pinotree> a simple one is cppcheck
+ <mcsim> braunr: I'm choosing now between splint and adlint
diff --git a/open_issues/console_tty1.mdwn b/open_issues/console_tty1.mdwn
new file mode 100644
index 00000000..614c02c9
--- /dev/null
+++ b/open_issues/console_tty1.mdwn
@@ -0,0 +1,151 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!tag open_issue_hurd]]
+Seen in context of [[libpthread]], but probably not directly related to it.
+# IRC, freenode, #hurd, 2012-08-30
+ <gnu_srs> Do you also experience a frozen hurd console?
+ <braunr> yes
+ <braunr> i didn't check but i'm almost certain it's a bug in my branch
+ <braunr> the replacement of condition_implies was a bit hasty in some
+ places
+ <braunr> this is why i want to rework it separately
+## IRC, freenode, #hurd, 2012-09-03
+ <gnu_srs> braunr: Did you find the cause of the Hurd console freeze for
+ your libpthread branch?
+ <braunr> gnu_srs: like i said, a bug
+ <braunr> probably in the replacement of condition_implies
+ <braunr> i rewrote that part in libpipe and it no works
+ <braunr> now*
+ <braunr> gnu_srs: the packages have been updated
+ <braunr> and these apparently fix the hurd console issue correctly
+## IRC, freenode, #hurd, 2012-09-04
+ <braunr> gnu_srs: this hurd console problem isn't fixed
+ <braunr> it seems to be due to a race condition that only affects the first
+ console
+ <braunr> and by reading the code i can't see how it can even work oO
+ <gnu_srs> braunr: just rebooted, tty1 is still locked, tty2-6 works. And
+ the floppy error stays (maybe a kvm bug??)
+ <braunr> the floppy error is probably a kvm bug as we discussed
+ <braunr> the tty1 locked isn't
+ <braunr> i have it too
+ <braunr> it seems to be a bug in the hurd console server
+ <braunr> which is started by tty1, but for some reason, doesn't return a
+ valid answer at init time
+ <braunr> if you kill the term handling tty1, you'll see your first tty
+ starts working
+ <braunr> for now i'll try a hack that starts the hurd console server before
+ the clients
+ <braunr> doesn't work eh
+ <braunr> tty1 is the only one started before runttys
+ <braunr> indeed, fixing /etc/hurd/runsystem.gnu so that it doesn't touch
+ tty1 fixes the problem
+ <gnu_srs> do you have an explanation?
+ <braunr> not really no
+ <braunr> but it will do for now
+ <pinotree> samuel added that with the comment above, apparently to
+ workaround some other issue of the hurd console
+ <braunr> i'm pretty sure the bug is already visible with cthreads
+ <braunr> the first console always seems weird compared to the others
+ <braunr> with a login: at the bottom of the screen
+ <braunr> didn't you notice ?
+ <pinotree> sometimes, but not often
+ <braunr> typical of a race
+ <pinotree> (at least for me)
+ <braunr> pthreads being slightly slower exposes it
+ <gnu_srs> confirmed, it works by commenting out touch /dev/tty1
+ <gnu_srs> yes, the login is at the bottom of the screen, sometimes one in
+ the upper part too:-/
+ <braunr> so we have a new open issue
+ <braunr> hm
+ <braunr> exiting the first tty doesn't work
+ <braunr> which makes me think of the issue we have with screen
+ <gnu_srs> confirmed!
+ <braunr> also, i don't understand why we have getty on tty1, but nothing on
+ the other terminals
+ <braunr> something is really wrong with terminals on hurd *sigh*
+ <braunr> ah, the problem looks like it happens when getty attempts to
+ handle a terminal !
+ <braunr> gnu_srs: anyway, i don't think it should be blocking for the
+ conversion to pthreads
+ <braunr> but it would be better if someone could assign himself that bug
+ <braunr> :)
+## IRC, freenode, #hurd, 2012-09-05
+ <antrik> braunr: the login at the bottom of the screen is from the Mach
+ console I believe
+ <braunr> antrik: well maybe, but it shouldn't be there anyway
+ <antrik> braunr: why not?
+ <antrik> it's confusing, but perfectly correct as far as I can tell
+ <braunr> antrik: two login: on the same screen ?
+ <braunr> antrik: it's even more confusing when comparing with other ttys
+ <antrik> I mean it's correct from a technical point of view... I'm not
+ saying it's helpful for the user ;-)
+ <braunr> i'm not even sure it's correct
+ <braunr> i've double checked the pthreads patch and didn't see anything
+ wrong there
+ <antrik> perhaps the startup of the Hurd console could be delayed a bit to
+ make sure it happens after the Mach console login is done printing
+ stuff...
+ <braunr> why are our gettys stubs ?
+ <antrik> I never understood the point of a getty TBH...
+ <braunr> well you need to communicate to something behind your terminal,
+ don't you ?
+ <braunr> with*
+ <antrik> why not just launch the login program or login shell right away?
+ <braunr> what if you want something else than a login program ?
+ <antrik> like what?
+ <antrik> and how would a getty help with that?
+ <braunr> an ascii-art version of star wars
+ <braunr> it would be configured to start something else
+ <antrik> and why does that need a getty? why not just start something else
+ directly?
+ <braunr> well getty is about the serial line parameters actually
+ <antrik> yeah, I had a vague understanding that it has something to do with
+ serial lines (or real TTY lines)... but we hardly need that on local
+ cosoles :-)
+ <antrik> consoles
+ <braunr> right
+ <braunr> but then why even bother with something like runttys
+ <antrik> well, something has to start the terminal servers?...
+ <antrik> I might be confused though
+ <braunr> what i don't understand is
+ <braunr> why is there no getty at startup, whereas they are spawned when
+ logging off ?
+ <antrik> they are? that's fascinating indeed ;-)
+ <braunr> does it behave like this on your old version ?
+ <antrik> I don't remember ever having seen a "getty" process on my Hurd
+ systems...
+ <braunr> can you log on e.g. tty2 and then log out, and see ?
+ <antrik> OTOH, I'm hardly ever using consoles...
+ <antrik> hm... I think that should be possible remotely using the console
+ client with ncurses driver? never tried that...
+ <braunr> ncurses driver ?
+ <braunr> hum i don't know, never tried either
+ <braunr> and it may add other bugs :p
+ <braunr> better wait to be close to the machine
+ <antrik> hehe
+ <antrik> well, it's a good excuse for trying the ncurses driver ;-)
+ <antrik> hrm
+ <antrik> alien:~# console -d ncursesw
+ <antrik> console: loading driver `ncursesw' failed: Gratuitous error
+ <antrik> I guess nobody tested that stuff in years
diff --git a/open_issues/console_vs_xorg.mdwn b/open_issues/console_vs_xorg.mdwn
new file mode 100644
index 00000000..ffefb389
--- /dev/null
+++ b/open_issues/console_vs_xorg.mdwn
@@ -0,0 +1,31 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!tag open_issue_glibc open_issue_hurd]]
+# IRC, freenode, #hurd, 2012-08-30
+ <gean> braunr: I have some errors about keyboard in the xorg log, but
+ keyboard is working on the X
+ <braunr> gean: paste the log somewhere please
+ <gean> braunr:
+ [...]
+ [1987693.272] Fatal server error:
+ [1987693.272] Cannot set event mode on keyboard (Inappropriate ioctl for device)
+ [...]
+ [1987693.292] FatalError re-entered, aborting
+ [1987693.302] can't reset keyboard mode (Inappropriate ioctl for device)
+ [...]
+ <braunr> hum
+ <braunr> it looks like the xorg keyboard driver evolved and now uses ioctls
+ our drivers don't implement
+ <braunr> thanks for the report, we'll have to work on this
+ <braunr> i'm not sure the problem is new actually
diff --git a/open_issues/dde.mdwn b/open_issues/dde.mdwn
index 8f00c950..5f6fcf6a 100644
--- a/open_issues/dde.mdwn
+++ b/open_issues/dde.mdwn
@@ -17,6 +17,9 @@ Still waiting for interface finalization and proper integration.
+See [[user-space_device_drivers]] for generic discussion related to user-space
+device drivers.
# Disk Drivers
@@ -25,24 +28,6 @@ Not yet supported.
The plan is to use [[libstore_parted]] for accessing partitions.
-## Booting
-A similar problem is described in
-[[community/gsoc/project_ideas/unionfs_boot]], and needs to be implemented.
-### IRC, freenode, #hurd, 2012-07-17
- <bddebian> OK, here is a stupid question I have always had. If you move
- PCI and disk drivers in to userspace, how do do initial bootstrap to get
- the system booting?
- <braunr> that's hard
- <braunr> basically you make the boot loader load all the components you
- need in ram
- <braunr> then you make it give each component something (ports) so they can
- communicate
# Upstream Status
@@ -68,6 +53,33 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
<antrik> (both from the Dresdem L4 group)
+### IRC, freenode, #hurd, 2012-08-12
+ <antrik>
+ <antrik> I wonder whether the very detailed explanation was prompted by our
+ DDE discussions at FOSDEM...
+ <pinotree> antrik: one could think about approaching them to develop the
+ common dde libs + dde_linux together
+ <antrik> pinotree: that's what I did at FOSDEM -- they weren't interested
+ <pinotree> antrik: this year's one? why weren't they?
+ <pinotree> maybe at that time dde was not integrated properly yet (netdde
+ is just few months "old")
+ <braunr> do you really consider it integrated properly ?
+ <pinotree> no, but a bit better than last year
+ <antrik> I don't see what our integration has to do with anything...
+ <antrik> they just prefer hacking things ad hoc to having some central
+ upstream
+ <pinotree> the helenos people?
+ <antrik> err... how did helenos come into the picture?...
+ <antrik> we are talking about genode
+ <pinotree> sorry, confused wrong microkernel OS
+ <antrik> actually, I don't remember exactly who said what; there were
+ people from genode there and from one or more other DDE projects... but
+ none of them seemed interested in a common DDE
+ <antrik> err... one or two other L4 projects
## IRC, freenode, #hurd, 2012-02-19
<youpi> antrik: do we know exactly which DDE version Zheng Da took as a
@@ -91,6 +103,12 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
apparently have both USB and SATA working with some variant of DDE
+### IRC, freenode, #hurd, 2012-11-03
+ <mcsim> DrChaos: there is DDEUSB framework for L4. You could port it, if
+ you want. It uses Linux 2.6.26 usb subsystem.
# IRC, OFTC, #debian-hurd, 2012-02-15
<pinotree> i have no idea how the dde system works
@@ -457,6 +475,59 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
<antrik> hm... good point
+# IRC, freenode, #hurd, 2012-08-14
+ <braunr> it's amazing how much code just gets reimplemented needlessly ...
+ <braunr> libddekit has its own mutex, condition, semaphore etc.. objects
+ <braunr> with the *exact* same comment about the dequeueing-on-timeout
+ problem found in libpthread
+ <braunr> *sigh*
+# IRC, freenode, #hurd, 2012-08-18
+ <braunr> hum, leaks and potential deadlocks in libddekit/thread.c :/
+# IRC, freenode, #hurd, 2012-08-18
+ <braunr> nice, dde relies on a race to start ..
+# IRC, freenode, #hurd, 2012-08-18
+ <braunr> hm looks like if netdde crashes, the kernel doesn't handle it
+ cleanly, and we can't attach another netdde instance
+[[!message-id ""]]
+# IRC, freenode, #hurd, 2012-08-21
+In context of [[libpthread]].
+ <braunr> hm, i thought my pthreads patches introduced a deadlock, but
+ actually this one is present in the current upstream/debian code :/
+ <braunr> (the deadlock occurs when receiving data fast with sftp)
+ <braunr> either in netdde or pfinet
+# DDE for Filesystems
+## IRC, freenode, #hurd, 2012-10-07
+ * pinotree wonders whether the dde layer could theoretically also support
+ file systems
+ <antrik> pinotree: yeah, I also brought up the idea of creating a DDE
+ extension or DDE-like wrapper for Linux filesystems a while back... don't
+ know enough about it though to decide whether it's doable
+ <antrik> OTOH, I'm not sure it would be worthwhile. we still should
+ probably have a native (not GPLv2-only) implementation for the main FS at
+ least; so the wrapper would only be for accessing external
+ partitions/media...
# virtio
diff --git a/open_issues/exec_leak.mdwn b/open_issues/exec_leak.mdwn
new file mode 100644
index 00000000..b58d2c81
--- /dev/null
+++ b/open_issues/exec_leak.mdwn
@@ -0,0 +1,57 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!tag open_issue_hurd]]
+# IRC, freenode, #hurd, 2012-08-11
+ <braunr> the exec servers seems to leak a lot
+ <braunr> server*
+ <braunr> exec now uses 109M on darnassus
+ <braunr> it really leaks a lot
+ <pinotree> only 109mb? few months ago, exec on exodar was taking more than
+ 200mb after few days of uptime with builds done
+ <braunr> i wonder how much it takes on the buildds
+# IRC, freenode, #hurd, 2012-08-17
+ <braunr> the exec leak is tricky
+ <braunr> bddebian: btw, look at the TODO file in the hurd source code
+ <braunr> bddebian: there is a note from thomas bushnell about that
+ <braunr> "*** Handle dead name notifications on execserver ports. !
+ <braunr> not sure it's still a todo item, but it might be worth checking
+ <bddebian> braunr: diskfs_execboot_class = ports_create_class (0, 0);
+ This is what would need to change right? It should call some cleanup
+ routine in the first argument?
+ <bddebian> Would be ideal if it could just use deadboot() from exec.
+ <braunr> bddebian: possible
+ <braunr> bddebian: hum execboot, i'm not so sure
+ <bddebian> Execboot is the exec task, no?
+ <braunr> i don't know what execboot is
+ <bddebian> It's from libdiskfs
+ <braunr> but "diskfs_execboot_class" looks like a class of ports used at
+ startup only
+ <braunr> ah
+ <braunr> then it's something run in the diskfs users ?
+ <bddebian> yes
+ <braunr> the leak is in exec
+ <braunr> if clients misbehave, it shouldn't affect that server
+ <bddebian> That's a different issue, this was about the TODO thing
+ <braunr> ah
+ <braunr> i don't know
+ <bddebian> Me either :)
+ <bddebian> For the leak I'm still focusing on do-bunzip2 but I am baffled
+ at my results..
+ <braunr> ?
+ <bddebian> Where my counters are zero if I always increment on different
+ vars but wild freaking numbers if I increment on malloc and decrement on
+ free
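The counting approach bddebian describes (increment on `malloc`, decrement on `free`) can be sketched as below. This is only an illustration, not code from the exec server: the names `counted_malloc`, `counted_free` and `live_allocations` are hypothetical, and the sketch assumes every allocation and release of interest goes through the two wrappers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bookkeeping for this sketch: the number of blocks that
   were handed out by counted_malloc() and not yet given back through
   counted_free().  A value that keeps growing over a balanced workload
   hints at a leak. */
long live_allocations = 0;

/* Allocate SIZE bytes and count the block only if malloc() succeeded. */
void *counted_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_allocations++;
    return p;
}

/* Release P.  free(NULL) is a no-op, so a null pointer is not counted;
   otherwise the counter would drift below the real number of blocks. */
void counted_free(void *p)
{
    if (p != NULL) {
        assert(live_allocations > 0);
        live_allocations--;
    }
    free(p);
}
```

With one counter the balance is easy to check: after `counted_malloc` followed by `counted_free` on the same pointer, `live_allocations` is back at its previous value. Blocks freed through a path that bypasses the wrappers would make the numbers look wrong without there being a leak.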
diff --git a/open_issues/fork_deadlock.mdwn b/open_issues/fork_deadlock.mdwn
index 6b90aa0a..c1fa9208 100644
--- a/open_issues/fork_deadlock.mdwn
+++ b/open_issues/fork_deadlock.mdwn
@@ -63,3 +63,34 @@ Another one in `dash`:
stopped = 1
i = 6
+# IRC, OFTC, #debian-hurd, 2012-11-24
+ <youpi> the lockups are about a SIGCHLD which gets lost
+ <pinotree> ah, ok
+ <youpi> which makes bash spin
+ <pinotree> is that happening more often recently, or it's just something i
+ just noticed?
+ <youpi> it's more often recently
+ <youpi> where "recently" means "some months ago"
+ <youpi> I didn't notice exactly when
+ <pinotree> i see
+ <youpi> it's at most since june, apparently
+ <youpi> (libtool managed to build without a fuss, while now it's a pain)
+ <youpi> (libtool building is a good test, it seems to be triggering quite
+ reliably)
+## IRC, freenode, #hurd, 2012-11-27
+ <youpi> we also have the shell wait issue
+ <youpi> it's particularly bad on libtool calls
+ <youpi> the libtool package (with testsuite) is a good reproducer :)
+ <youpi> the symptom is shell scripts eating CPU
+ <youpi> busy-waiting for a SIGCHLD which never gets received
+ <braunr> that could be what i got
+ <braunr>
+ <braunr> last part
+ <youpi> perhaps watch has the same issue as the shell, yes
diff --git a/open_issues/gcc/pie.mdwn b/open_issues/gcc/pie.mdwn
new file mode 100644
index 00000000..a4598d1e
--- /dev/null
+++ b/open_issues/gcc/pie.mdwn
@@ -0,0 +1,40 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!meta title="Position-Independent Executables"]]
+[[!tag open_issue_gcc]]
+# IRC, freenode, #debian-hurd, 2012-11-08
+ <pinotree> tschwinge: i'm not totally sure, but it seems the pie options
+ for gcc/ld are causing issues
+ <pinotree> namely, producing executables that sigsegv straight away
+ <tschwinge> pinotree: OK, I do remember some issues about these, too.
+ <tschwinge> Also for -pg.
+ <tschwinge> These have in common that they use different crt*.o files for
+ linking.
+ <tschwinge> Might well be there's some bugs there.
+ <pinotree> one way is to try the w3m debian build: the current build
+ configuration also enables pie, which in turn makes a helper executable
+ (mktable) sigsegv when invoked
+ <pinotree> if «,-pie» is appended to the DEB_BUILD_MAINT_OPTIONS variable
+ in debian/rules, pie is not added and the resulting mktable runs
+ correctly
+## IRC, OFTC, #debian-hurd, 2012-11-09
+ <pinotree> youpi: ah, as i noted to tschwinge earlier, it seems -fPIE -pie
+ miscompile stuff
+ <youpi> uh
+ <pinotree> this causes the w3m build failure and (indirectly, due to elinks
+ built with -pie) aptitude
diff --git a/open_issues/glibc.mdwn b/open_issues/glibc.mdwn
index e94a4f1f..3b4e5efa 100644
--- a/open_issues/glibc.mdwn
+++ b/open_issues/glibc.mdwn
@@ -81,6 +81,35 @@ Last reviewed up to the [[Git mirror's e80d6f94e19d17b91e3cd3ada7193cc88f621feb
Might simply be a missing patch(es) from master.
+ * `--disable-multi-arch`
+ IRC, freenode, #hurd, 2012-11-22
+ <pinotree> tschwinge: is your glibc build w/ or w/o multiarch?
+ <tschwinge> pinotree: See open_issues/glibc: --disable-multi-arch
+ <pinotree> ah, because you do cross-compilation?
+ <tschwinge> No, that's natively.
+ <tschwinge> There is also a not of what happened in cross-gnu when I
+ enabled multi-arch.
+ <tschwinge> No idea whether that's still relevant, though.
+ <pinotree> EPARSE
+ <tschwinge> s%not%note
+ <tschwinge> Better?
+ <pinotree> yes :)
+ <tschwinge> As for native builds: I guess I just didn't (want to) play
+ with it yet.
+ <pinotree> it is enabled in debian since quite some time, maybe other
+ i386/i686 patches (done for linux) help us too
+ <tschwinge> I though we first needed some CPU identification
+ infrastructure before it can really work?
+ <tschwinge> I thought [...].
+ <pinotree> as in use the i686 variant as runtime automatically? i guess
+ so
+ <tschwinge> I thought I had some notes about that, but can't currently
+ find them.
+ <tschwinge> Ah, I probably have been thinking about open_issues/ifunc
+ and open_issues/libc_variant_selection.
* --build=X
`long double` test: due to `cross_compiling = maybe` wants to execute a
@@ -350,6 +379,24 @@ Last reviewed up to the [[Git mirror's e80d6f94e19d17b91e3cd3ada7193cc88f621feb
<pinotree> like posix/tst-waitid.c, you mean?
<youpi> yes
+ * `getconf` things
+ IRC, freenode, #hurd, 2012-10-03
+ <pinotree> getconf -a | grep CACHE
+ <Tekk_> pinotree: I hate spoiling data, but 0 :P
+ <pinotree> had that feeling, but wanted to be sure -- thanks!
+ <Tekk_>
+ <Tekk_> except for uhh
+ <Tekk_> L4 linesize
+ <Tekk_> that didn't have any number associated
+ <pinotree> weird
+ <Tekk_> I actually didn't even know that there was L4 cache
+ <pinotree> what do you get if you run `getconf
+ <Tekk_> pinotree: undefined
+ <pinotree> expected, given the output above
For specific packages:
* [[octave]]
@@ -384,6 +431,270 @@ Last reviewed up to the [[Git mirror's e80d6f94e19d17b91e3cd3ada7193cc88f621feb
* `sysdeps/unix/sysv/linux/syslog.c`
+ * `fsync` on a pipe
+ IRC, freenode, #hurd, 2012-08-21:
+ <braunr> pinotree: i think gnu_srs spotted a conformance problem in
+ glibc
+ <pinotree> (only one?)
+ <braunr> pinotree: namely, fsync on a pipe (which is actually a
+ socketpair) doesn't return EINVAL when the "operation not supported"
+ error is returned as a "bad request message ID"
+ <braunr> pinotree: what do you think of this case ?
+ <pinotree> i'm far from an expert on such stuff, but seems a proper E*
+ should be returned
+ <braunr> (there also is a problem in clisp falling in an infinite loop
+ when trying to handle this, since it uses fsync inside the error
+ handling code, eww, but we don't care :p)
+ <braunr> basically, here is what clisp does
+ <braunr> if fsync fails, and the error isn't EINVAL, let's report the
+ error
+ <braunr> and reporting the error in turn writes something on the
+ output/error stream, which in turn calls fsync again
+ <pinotree> smart
+ <braunr> after the stack is exhausted, clisp happily crashes
+ <braunr> gnu_srs: i'll alter the clisp code a bit so it knows about our
+ mig specific error
+ <braunr> if that's the problem (which i strongly suspect), the solution
+ will be to add an error conversion for fsync so that it returns
+ <braunr> if pinotree is willing to do that, he'll be the only one
+ suffering from the dangers of sending stuff to the glibc maintainers
+ :p
+ <pinotree> that shouldn't be an issue i think, there are other glibc
+ hurd implementations that do such checks
+ <gnu_srs> does fsync return EINVAL for other OSes?
+ <braunr> EROFS, EINVAL
+ <braunr> fd is bound to a special file which does not
+ support synchronization.
+ <braunr> obviously, pipes and sockets don't
+ <pinotree>
+ <braunr> so yes, other OSes do just that
+ <pinotree> now that you speak about it, it could be the failure that
+ the gnulib fsync+fdatasync testcase have when being run with `make
+ check` (although not when running as ./test-foo)
+ <braunr> hm we may not need change glibc
+ <braunr> clisp has a part where it defines a macro IS_EINVAL which is
+ system specific
+ <braunr> (but we should change it in glibc for conformance anyway)
+ <braunr> #elif defined(UNIX_DARWIN) || defined(UNIX_FREEBSD) ||
+ defined(UNIX_NETBSD) || defined(UNIX_OPENBSD) #define IS_EINVAL_EXTRA
+ ((errno==EOPNOTSUPP)||(errno==ENOTSUP)||(errno==ENODEV))
+ <pinotree> i'd rather add nothing to clisp
+ <braunr> let's see what posix says
+ <braunr> EINVAL
+ <braunr> so right, we should simply convert it in glibc
+ <gnu_srs> man fsync mentions EINVAL
+ <braunr> man pages aren't posix, even if they are usually close
+ <gnu_srs> aha
+ <pinotree> i think checking for MIG_BAD_ID and EOPNOTSUPP (like other
+ parts do) will b enough
+ <pinotree> *be
+ <braunr> gnu_srs: there, it finished correctly even when piped
+ <gnu_srs> I saw that, congrats!
+ <braunr> clisp is quite tricky to debug
+ <braunr> i never had to deal with a program that installs break points
+ and handles segfaults itself in order to implement growing stacks :p
+ <braunr> i suppose most interpreters do that
+ <gnu_srs> So the permanent change will be in glibc, not clisp?
+ <braunr> yes
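The error conversion discussed above can be sketched roughly as follows. This is not the glibc patch: `sketch_translate_sync_error`, `sketch_fsync_errno_on_pipe` and `SKETCH_MIG_BAD_ID` are hypothetical names, and the numeric value used for the "bad request message ID" code is only a placeholder for the constant that `<mach/mig_errors.h>` provides on a real system. The second function simply probes what `fsync` reports on a pipe on the system it runs on.

```c
#include <errno.h>
#include <unistd.h>

/* Placeholder for the Mach "bad request message ID" code.  The real
   constant comes from <mach/mig_errors.h>; the value here is an
   assumption made so that the sketch compiles anywhere. */
#define SKETCH_MIG_BAD_ID (-303)

/* Map "the server cannot do this" results to EINVAL, the value POSIX
   documents for fsync() on an object that does not support
   synchronization.  Every other error is passed through unchanged. */
int sketch_translate_sync_error(int err)
{
    if (err == EOPNOTSUPP || err == SKETCH_MIG_BAD_ID)
        return EINVAL;
    return err;
}

/* Call fsync() on the write end of a fresh pipe.  Returns the errno
   that fsync() reported, 0 if fsync() succeeded, or -1 if the pipe
   could not be created. */
int sketch_fsync_errno_on_pipe(void)
{
    int fds[2];
    int result = 0;

    if (pipe(fds) != 0)
        return -1;
    if (fsync(fds[1]) != 0)
        result = errno;
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

A caller such as clisp that treats `EINVAL` from `fsync` as "not applicable, carry on" would then stop reporting the failure, which is what broke the report/fsync/report recursion described in the log.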
+ IRC, freenode, #hurd, 2012-08-24:
+ <gnu_srs1> pinotree: The changes needed for fsync.c is at
+ if you want to try it out (confirmed
+ with rbraun)
+ <youpi> I agree with the patch, posix indeed documents einval as the
+ "proper" error value
+ <pinotree> there's fdatasync too
+ <pinotree> other places use MIG_BAD_ID instead of EMIG_BAD_ID
+ <braunr> pinotree: i assume that if you're telling us, it's because
+ they have different values
+ <pinotree> braunr: tbh i never seen the E version, and everywhere in
+ glibc the non-E version is used
+ <gnu_srs1> in sysdeps/mach/hurd/bits/errno.h only the E version is
+ defined
+ <pinotree> look in gnumach/include/mach/mig_errors.h
+ <pinotree> (as the comment in errno.h say)
+ <gnu_srs1> mig_errors.h yes. Which comment: from errors.h: /* Errors
+ from <mach/mig_errors.h>. */ and then the EMIG_ stuff?
+ <gnu_srs1> Which one is used when building libc?
+ <gnu_srs1> Answer: At least in fsync.c errno.h is used: #include
+ <errno.h>
+ <gnu_srs1> Yes, fdatasync.c should be patched too.
+ <gnu_srs1> pinotree: You are right: EMIG_ or MIG_ is confusing.
+ <gnu_srs1> /usr/include/i386-gnu/bits/errno.h: /* Errors from
+ <mach/mig_errors.h>. */
+ <gnu_srs1> /usr/include/hurd.h:#include <mach/mig_errors.h>
+ IRC, freenode, #hurd, 2012-09-02:
+ <antrik> braunr: regarding fsync(), I agree that EOPNOTSUPP probably
+ should be translated to EINVAL, if that's what POSIX says. it does
+ *not* sound right to translate MIG_BAD_ID though. the server should
+ explicitly return EOPNOTSUPP, and that's what the default trivfs stub
+ does. if you actually do see MIG_BAD_ID, there must be some other
+ bug...
+ <braunr> antrik: right, pflocal doesn't call the trivfs stub for socket
+ objects
+ <braunr> trivfs_demuxer is only called by the pflocal node demuxer, for
+ socket objects it's another call, and i don't think it's the right
+ thing to call trivfs_demuxer there either
+ <pinotree> handling MIG_BAD_ID isn't a bad idea anyway, you never know
+ what the underlying server actually implements
+ <pinotree> (imho)
+ <braunr> for me, a bad id is the same as a not supported operation
+ <pinotree> ditto
+ <pinotree> from fsync's POV, both the results are the same anyway, ie
+ that the server does not support a file_sync operation
+ <antrik> no, a bad ID means the server doesn't implement the protocol
+ (or not properly at least)
+ <antrik> it's usually a bug IMHO
+ <antrik> there is a reason we have EOPNOTSUPP for operations that are
+ part of a protocol but not implemented by a particular server
+ <pinotree> antrik: even if it could be the case, there's no reason to
+ make fsync fail anyway
+ <antrik> pinotree: I think there is. it indicates a bug, which should
+ not be hidden
+ <pinotree> well, patches welcome then...
+ <antrik> thing is, if sock objects are actually not supposed to
+ implement the file interface, glibc shouldn't even *try* to call
+ fsync on them
+ <pinotree> how?
+ <pinotree> i mean, can you check whether the file interface is not
+ implemented, without doing a roundtrip^
+ <pinotree> ?
+ <antrik> well, the sock objects are not files, i.e. they were *not*
+ obtained by file_name_lookup(), but rather a specific call. so glibc
+ actually *knows* that they are not files.
+ <braunr> antrik: this way of thinking means we need an "fd" protocol
+ <braunr> so that objects accessed through a file descriptor implement
+ all fd calls
+ <antrik> now I wonder though whether there are conceivable use cases
+ where it would make sense for objects obtained through the socket
+ call to optionally implement the file interface...
+ <antrik> which could actually make sense, if libc lets through other
+ file calls as well (which I guess it does, if the sock ports are
+ wrapped in normal fd structures?)
+ <braunr> antrik: they are
+ <braunr> and i'd personally be in favor of such an fd protocol, even if
+ it means implementing stubs for many useless calls
+ <braunr> but the way things are now suggests a bad id really means an
+ operation is simply not supported
+ <antrik> the question in this case is whether we should make the file
+ protocol mandatory for anything that can end up in an FD; or whether
+ we should keep it optional, and add the MIG_BAD_ID calls to *all* FD
+ operations
+ <antrik> (there is no reason for fsync to be special in this regard)
+ <braunr> yes
+ <antrik> braunr: BTW, I'm rather undecided whether the right approach
+ is a) requiring an FD interface collection, b) always checking
+ MIG_BAD_ID, or perhaps c) think about introducing a mechanism to
+ explicitly query supported interfaces...
+ IRC, freenode, #hurd, 2012-09-03:
+ <braunr> antrik: querying interfaces sounds like an additional penalty
+ on performance
+ <antrik> braunr: the query usually has to be done only once. in fact it
+ could be integrated into the name lookup...
+ <braunr> antrik: once for every object
+ <braunr> antrik: yes, along with the lookup would be a nice thing
+ [[!message-id ""]].
+ * `t/no-hp-timing`
+ IRC, freenode, #hurd, 2012-11-16
+ <pinotree> tschwinge: wrt the glibc topgit branch t/no-hp-timing,
+ couldn't that file be just replaced by #include
+ <sysdeps/generic/hp-timing.h>?
+ * `flockfile`/`ftrylockfile`/`funlockfile`
+ IRC, freenode, #hurd, 2012-11-16
+ <pinotree> youpi: uhm, in glibc we use
+ stdio-common/f{,try,un}lockfile.c, which do nothing (as opposed to eg
+ the nptl versions, which do lock/trylock/unlock); do you know more
+ about them?
+ <youpi> pinotree: ouch
+ <youpi> no, I don't know
+ <youpi> well, I do know what they're supposed to do
+ <pinotree> i'm trying to fill them in, let's see
+ <youpi> but not why we don't have them
+ <youpi> (except that libpthread is "recent")
+ <youpi> yet another reason to build libpthread in glibc, btw
+ <youpi> oh, but we do provide lockfile in libpthread, don't we ?
+ <youpi> pinotree: yes, and libc has weak variants, so the libpthread
+ will take over
+ <pinotree> youpi: sure, but that in stuff linking to pthreads
+ <pinotree> if you do a simple application doing eg main() { fopen +
+ fwrite + fclose }, you get no locking
+ <youpi> so?
+ <youpi> if you don't have threads, you don't need locks :)
+ <pinotree> ... unless there is some indirect recursion
+ <youpi> ?
+ <pinotree> basically, i was debugging why glibc tests with mtrace() and
+ ending with muntrace() would die (while tests without muntrace call
+ wouldn't)
+ <youpi> well, I still don't see what a lock will bring
+ <pinotree> if you look at the muntrace implementation (in
+ malloc/mtrace.c), basically fclose can trigger a malloc hook (because
+ of the free for the FILE*)
+ <youpi> either you have threads, and it's needed, or you don't, and it's
+ a nop
+ <youpi> yes, and ?
+ <braunr> does the signal thread count ?
+ <youpi> again, in linux, when you don't have threads, the lock is a nop
+ <youpi> does the signal thread use IO ?
+ <braunr> that's the question :)
+ <braunr> i hope not
+ <youpi> IIRC the signal thread just manages signals, and doesn't
+ execute the handler itself
+ <braunr> sure
+ <braunr> i was more thinking about debug stuff
+ <youpi> can't hurt to add them anyway, but let me still doubt that it'd
+ fix muntrace, I don't see why it would, unless you have threads
+ <pinotree> that's what i'm going next
+ <pinotree> pardon, it seems i got confused a bit
+ <pinotree> it'd look like a genuine muntrace bug (muntrace → fclose →
+ free hook → lock lock → fprint (since the FILE is still set) → malloc
+ → malloc hook → lock lock → spin)
+ <pinotree> at least i got some light over the flockfile stuff, thanks
+ ;)
+ <pinotree> youpi: otoh, __libc_lock_lock (etc) are noop in the base
+ implementation, while doing real locks on hurd in any case, and on
+ linux only if nptl is loaded, it seems
+ <pinotree> that would explain why on linux you get no deadlock
+ <youpi> unless using nptl, that is?
+ <pinotree> hm no, even with pthread it works
+ <pinotree> but hey, at least the affected glibc test now passes
+ <pinotree> will maybe try to do investigation on why it works on linux
+ tomorrow
+ [[!message-id ""]].
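The call chain pinotree describes (a hook takes a lock, logs, the logging allocates, and the nested hook call takes the same lock again) can be modelled in a few lines. This is a toy model only, not glibc code: `sketch_lock`, `sketch_hook` and the other names are hypothetical, and where a real non-recursive lock would spin forever, the model records the re-entry and returns so that it terminates.

```c
#include <assert.h>

/* Toy non-recursive lock: it only remembers whether it is held. */
struct sketch_lock { int held; };

struct sketch_lock hook_lock = { 0 };
int reentry_detected = 0;

/* Returns 1 if the lock was taken, 0 if it was already held.  A real
   non-recursive lock would spin or block in the second case. */
int sketch_try_lock(struct sketch_lock *l)
{
    if (l->held)
        return 0;
    l->held = 1;
    return 1;
}

void sketch_unlock(struct sketch_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/* Stand-in for a malloc/free hook that logs while holding the lock.
   Logging may allocate, which enters the hook a second time; DEPTH
   limits the model to that single nested call. */
void sketch_hook(int depth)
{
    if (!sketch_try_lock(&hook_lock)) {
        reentry_detected = 1;   /* the real code would hang here */
        return;
    }
    if (depth == 0)
        sketch_hook(depth + 1); /* models: log -> allocate -> hook */
    sketch_unlock(&hook_lock);
}

/* Run the model once; returns 1 if the nested call hit the held lock. */
int sketch_run(void)
{
    reentry_detected = 0;
    sketch_hook(0);
    return reentry_detected;
}
```

In this model `sketch_run()` reports the re-entry even though only one thread exists, which matches youpi's and pinotree's conclusion that the problem is one of recursion rather than of missing `flockfile` locking between threads.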
+ * `t/pagesize`
+ IRC, freenode, #hurd, 2012-11-16
+ <pinotree> tschwinge: somehow related to your t/pagesize branch: due to
+ the fact that EXEC_PAGESIZE is not defined on hurd, libio/libioP.h
+ switches the allocation modes from mmap to malloc
+ * `LD_DEBUG`
+ IRC, freenode, #hurd, 2012-11-22
+ <pinotree> woot, `LD_DEBUG=libs /bin/ls >/dev/null` prints stuff and
+ then sigsegv
+ <tschwinge> Yeah, that's known for years... :-D
+ <tschwinge> Probably not too difficult to resolve, though.
* Verify baseline changes, if we need any follow-up changes:
* a11ec63713ea3903c482dc907a108be404191a02
@@ -559,6 +870,11 @@ Last reviewed up to the [[Git mirror's e80d6f94e19d17b91e3cd3ada7193cc88f621feb
* *baseline*
* [high] `sendmmsg` usage, c030f70c8796c7743c3aa97d6beff3bd5b8dcd5d --
need a `ENOSYS` stub.
+ * ea4d37b3169908615b7c17c9c506c6a6c16b3a26 -- IRC, freenode, #hurd,
+ 2012-11-20, pinotree: »tschwinge: i agree on your comments on
+ ea4d37b3169908615b7c17c9c506c6a6c16b3a26, especially since mach's
+ sleep.c is buggy (not considers interruption, extra time() (= RPC)
+ call)«.
# Build
diff --git a/open_issues/gnumach_page_cache_policy.mdwn b/open_issues/gnumach_page_cache_policy.mdwn
index 375e153b..d128c668 100644
--- a/open_issues/gnumach_page_cache_policy.mdwn
+++ b/open_issues/gnumach_page_cache_policy.mdwn
@@ -771,3 +771,15 @@ License|/fdl]]."]]"""]]
## IRC, freenode, #hurd, 2012-07-26
<braunr> hm i killed darnassus, probably the page cache patch again
+## IRC, freenode, #hurd, 2012-09-19
+ <youpi> I was wondering about the page cache information structure
+ <youpi> I guess the idea is that if we need to add a field, we'll just
+ define another RPC?
+ <youpi> braunr: ↑
+ <braunr> i've done that already, yes
+ <braunr> youpi: have a look at the rbraun/page_cache gnumach branch
+ <youpi> that's what I was referring to
+ <braunr> ok
diff --git a/open_issues/gnumach_vm_map_entry_forward_merging.mdwn b/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
index 90137766..7739f4d1 100644
--- a/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
+++ b/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -181,6 +181,8 @@ License|/fdl]]."]]"""]]
<braunr> from what i could see, part of the problem still exists in freebsd
<braunr> for the same reasons (shadow objects being one of them)
# GCC build time using bash vs. dash
diff --git a/open_issues/gnumach_vm_map_red-black_trees.mdwn b/open_issues/gnumach_vm_map_red-black_trees.mdwn
index 7a54914f..53ff66c5 100644
--- a/open_issues/gnumach_vm_map_red-black_trees.mdwn
+++ b/open_issues/gnumach_vm_map_red-black_trees.mdwn
@@ -198,3 +198,149 @@ License|/fdl]]."]]"""]]
get all that crap
<braunr> that's very good
<braunr> more test cases to fix the vm
+### IRC, freenode, #hurd, 2012-11-01
+ <youpi> braunr: Assertion `diff != 0' failed in file "vm/vm_map.c", line
+ 1002
+ <youpi> that's in rbtree_insert
+ <braunr> youpi: the problem isn't the tree, it's the map entries
+ <braunr> some must overlap
+ <braunr> if you can inspect that, it would be helpful
+ <youpi> I have a kdb there
+ <youpi> it's within a port_name_to_task system call
+ <braunr> this assertion basically means there already is an item in the
+ tree where the new item is supposed to be inserted
+ <youpi> this port_name_to_task presence in the stack is odd
+ <braunr> it's in vm_map_enter
+ <youpi> there's a vm_map just after that (and the assembly trap code
+ before)
+ <youpi> I know
+ <youpi> I'm wondering about the caller
+ <braunr> do you have a way to inspect the inserted map entry ?
+ <youpi> I'm actually wondering whether I have the right kernel in gdb
+ <braunr> oh
+ <youpi> better
+ <youpi> with the right kernel :)
+ <youpi> 0x80039acf (syscall_vm_map)
+ (target_map=d48b6640,address=d3b63f90,size=0,mask=0,anywhere=1)
+ <youpi> size == 0 seems odd to me
+ <youpi> (same parameters for vm_map)
+ <braunr> right
+ <braunr> my code does assume an entry has a non null size
+ <braunr> (in the entry comparison function)
+ <braunr> EINVAL (since Linux 2.6.12) length was 0.
+ <braunr> that's a quick glance at mmap(2)
+ <braunr> might help track bugs from userspace (e.g. in exec .. :))
+ <braunr> posix says the saem
+ <braunr> same*
+ <braunr> the gnumach manual isn't that precise
+ <youpi> I don't seem to manage to read the entry
+ <youpi> but I guess size==0 is the problem anyway
+ <mcsim> youpi, braunr: Is there another kernel fault? Was that in my
+ kernel?
+ <braunr> no that's another problem
+ <braunr> which became apparent following the addition of red black trees in
+ the vm_map code
+ <braunr> (but which was probably present long before)
+ <mcsim> braunr: BTW, do you know if there where some specific circumstances
+ that led to memory exhaustion in my code? Or it just aggregated over
+ time?
+ <braunr> mcsim: i don't know
+ <mcsim> s/where/were
+ <mcsim> braunr: ok
+### IRC, freenode, #hurd, 2012-11-05
+ <tschwinge> braunr: I have now also hit the diff != 0 assertion error;
+ sitting in KDB, waiting for your commands.
+ <braunr> tschwinge: can you check the backtrace, have a look at the system
+ call and its parameters like youpi did ?
+ <tschwinge> If I manage to figure out how to do that... :-)
+ * tschwinge goes read scrollback.
+ <braunr> "trace" i suppose
+ <braunr> if running inside qemu, you can use the integrated gdb server
+ <tschwinge> braunr: No, hardware. And work intervened. And mobile phone
+ <-> laptop via bluetooth didn't work. But now:
+ <tschwinge> Pretty similar to Samuel's:
+ <tschwinge> Assert([...])
+ <tschwinge> vm_map_enter(0xc11de6c8, 0xc1785f94, 0, 0, 1)
+ <tschwinge> vm_map(0xc11de6c8, 0xc1785f94, 0, 0, 1)
+ <tschwinge> syscall_vm_map(1, 0x1024a88, 0, 0, 1)
+ <tschwinge> mach_call_call(1, 0x1024a88, 0, 0, 1)
+ <braunr> thanks
+ <braunr> same as youpi observed, the requested size for the mapping is 0
+ <braunr> tschwinge: thanks
+ <tschwinge> braunr: Anything else you'd like to see before I reboot?
+ <braunr> tschwinge: no, that's enough for now, and the other kind of info
+ i'd like are much more difficult to obtain
+ <braunr> if we still have the problem once a small patch to prevent null
+ size is applied, then it'll be worth looking more into it
+ <pinotree> isn't it possible to find out who called with that size?
+ <braunr> not easy, no
+ <braunr> it's also likely that the call that fails isn't the first one
+ <pinotree> ah sure
+ <pinotree> braunr: making mmap reject 0 size length could help? posix says
+ such size should be rejected straight away
+ <braunr> 17:09 < braunr> if we still have the problem once a small patch to
+ prevent null size is applied, then it'll be worth looking more into it
+ <braunr> that's the idea
+ <braunr> making faulty processes choke on it should work fine :)
+ <pinotree> «If len is zero, mmap() shall fail and no mapping shall be
+ established.»
+ <pinotree> braunr: should i cook up such patch for mmap?
+ <braunr> no, the change must be applied in gnumach
+ <pinotree> sure, but that could simply such condition in mmap (ie avoiding
+ to call io_map on a file)
+ <braunr> such calls are erroneous and rare, i don't see the need
+ <pinotree> ok
+ <braunr> i bet it comes from the exec server anyway :p
+ <tschwinge> braunr: Is the mmap with size 0 already a reproducible testcase
+ you can use for the diff != 0 assertion?
+ <tschwinge> Otherwise I'd have a reproducer now.
+ <braunr> tschwinge: i'm not sure but probably yes
+ <tschwinge> braunr: Otherwise, take GDB sources, then: gcc -fsplit-stack
+ gdb/testsuite/gdb.base/morestack.c && ./a.out
+ <tschwinge> I have not looked what exactly this does; I think -fsplit-stack
+ is not really implemented for us (needs something in libgcc we might not
+ have), is on my GCC TODO list already.
+ <braunr> tschwinge: interesting too :)
+### IRC, freenode, #hurd, 2012-11-19
+ <tschwinge> braunr: Hmm, I have now hit the diff != 0 GNU Mach assertion
+ failure during some GCC invocation (GCC testsuite) that does not relate
+ to -fsplit-stack (as the others before always have).
+ <tschwinge> Reproduced:
+ /media/erich/home/thomas/tmp/gcc/hurd/
+ -B/media/erich/home/thomas/tmp/gcc/hurd/
+ /home/thomas/tmp/gcc/hurd/master/gcc/testsuite/gcc.dg/torture/pr42878-1.c
+ -fno-diagnostics-show-caret -O2 -flto -fuse-linker-plugin
+ -fno-fat-lto-objects -fcompare-debug -S -o pr42878-1.s
+ <tschwinge> Will check whether it's the same backtrace in GNU Mach.
+ <tschwinge> Yes, same.
+ <braunr> tschwinge: as youpi seems quite busy these days, i'll cook a patch
+ and commit it directly
+ <tschwinge> braunr: Thanks! I have, by the way, confirmed that the
+ following is enough to trigger the issue: vm_map(mach_task_self(), 0, 0,
+ 0, 1, 0, 0, 0, 0, 0, 0);
+ <tschwinge> ... and before the allocator patch, GNU Mach did accept that
+ and return 0 -- though I did not check what effect it actually has. (And
+ I don't think it has any useful one.) I'm also reading that as of lately
+ (Linux 2.6.12), mmap (length = 0) is to return EINVAL, which I think is
+ the foremost user of vm_map.
+ <pinotree> tschwinge: posix too says to return EINVAL for length = 0
+ <braunr> yes, we checked that earlier with youpi
+[[!message-id ""]].
+ <braunr> tschwinge: well, actually your patch is what i had in mind
+ (although i'd like one in vm_map_enter to catch wrong kernel requests
+ too)
+ <braunr> tschwinge: i'll work on it tonight, and do some testing to make
+ sure we don't regress critical stuff (exec is another major direct user
+ of vm_map iirc)
+ <tschwinge> braunr: Oh, OK. :-)
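Editorial sketch (not part of the log): the POSIX-level behaviour mentioned above, `mmap` with a length of 0 failing with `EINVAL`, can be checked with a few lines of C. The helper name `mmap_zero_length_errno` is made up for this illustration, and the snippet exercises the host's `mmap`, not GNU Mach's `vm_map` directly.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: request a zero-length anonymous mapping and
   return the resulting errno, or 0 if the call unexpectedly succeeded. */
int mmap_zero_length_errno(void)
{
    void *p;

    errno = 0;
    p = mmap(NULL, 0, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p != MAP_FAILED)
        return 0;                /* zero-length mapping: nothing to unmap */
    return errno;
}
```

On a system that follows the rule quoted in the log, the helper is expected to return `EINVAL`.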
diff --git a/open_issues/implementing_hurd_on_top_of_another_system.mdwn b/open_issues/implementing_hurd_on_top_of_another_system.mdwn
index 95b71ebb..220c69cc 100644
--- a/open_issues/implementing_hurd_on_top_of_another_system.mdwn
+++ b/open_issues/implementing_hurd_on_top_of_another_system.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation,
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -15,9 +16,12 @@ One obvious variant is [[emulation]] (using [[hurd/running/QEMU]], for
example), but
doing that does not really integrate the Hurd guest into the host system.
There is also a more direct way, more powerful, but it also has certain
-requirements to do it effectively:
+requirements to do it effectively.
-IRC, #hurd, August / September 2010
+See also [[Mach_on_top_of_POSIX]].
+# IRC, freenode, #hurd, August / September 2010
<marcusb> silver_hook: the Hurd can also refer to the interfaces of the
filesystems etc, and a lot of that is really just server/client APIs that
@@ -56,7 +60,7 @@ IRC, #hurd, August / September 2010
<marcusb> ArneBab: in fact, John Tobey did this a couple of years ago, or
started it
-([[tschwinge]] has tarballs of John's work.)
<marcusb> ArneBab: or you can just implement parts of it and relay to Linux
for the rest
@@ -64,11 +68,10 @@ IRC, #hurd, August / September 2010
are sufficiently happy with the translator stuff, it's not hard to bring
the Hurd to Linux or BSD
-Continue reading about the [[benefits of a native Hurd implementation]].
+Continue reading about the [[benefits_of_a_native_Hurd_implementation]].
-IRC, #hurd, 2010-12-28
+# IRC, freenode, #hurd, 2010-12-28
<antrik> kilobug: there is no real requirement for the Hurd to run on a
microkernel... as long as the important mechanisms are provided (most
@@ -79,9 +82,8 @@ IRC, #hurd, 2010-12-28
Hurd on top of a monolithic kernel would actually be a useful approach
for the time being...
-IRC, #hurd, 2011-02-11
+# IRC, freenode, #hurd, 2011-02-11
<neal> marcus and I were discussing how to add Mach to Linux
<neal> one could write a module to implement Mach IPC
@@ -115,3 +117,303 @@ IRC, #hurd, 2011-02-11
<neal> I'm unlikely to work on it, sorry
<antrik> didn't really expect that :-)
<antrik> would be nice though if you could write up your conclusions...
+# IRC, freenode, #hurd, 2012-10-12
+ <peo-xaci> do hurd system libraries make raw system calls ever
+ (i.e. inlined syscall() / raw assembly)?
+ <braunr> sure
+ <peo-xaci> hmm, so a hurd emulation layer would need to use ptrace if it
+ should be fool proof? :/
+ <braunr> there is no real need for raw assembly, and the very syscalls are
+ all available through macros
+ <braunr> hum what are you trying to say ?
+ <peo-xaci> well, if they are done through syscall, as a function, not a
+ macro, then they can be intercepted with LD_PRELOAD
+ <peo-xaci> so applications that do Hurd (Mach?) syscalls could work on
+ f.e. Linux, if a special libc is injected into the program with
+ <peo-xaci> same thing with making standard Linux-applications go through
+ the Hurd emulation layer
+ <peo-xaci> without recompilation
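Editorial sketch (not part of the log): the distinction peo-xaci draws can be illustrated on a Linux host. Both helpers below reach the kernel through ordinary C library functions, so calls made from the program itself go through dynamic symbols that a preloaded library could in principle interpose; a system call issued from inline assembly would bypass both. The helper names are invented for this example, and it uses the Linux `SYS_getpid` number, not a Mach trap.

```c
#define _GNU_SOURCE              /* for the syscall() declaration on glibc */
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Goes through the C library's dedicated getpid() wrapper. */
pid_t pid_via_wrapper(void)
{
    return getpid();
}

/* Goes through the generic syscall() function, which is still an
   ordinary function exported by the C library. */
pid_t pid_via_syscall_function(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Both helpers should report the same process ID; only the path taken to the kernel differs.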
+ <mel-_> peo-xaci: the second direction is implemented in glibc.
+ <mel-_> for the other direction, I personally see little use for it
+ <braunr> peo-xaci: ok i misunderstood
+ <braunr> peo-xaci: i don't think there is any truly direct syscall usage
+ in the hurd
+ <peo-xaci> hmm, I'm not sure I understand what directions you are referring
+ to mel-_
+ <braunr> peo-xaci: what are you trying to achieve ?
+ <peo-xaci> I want to make the Hurd design more accessible by letting Hurd
+ applications run on the Linux kernel, preferably without
+ recompilation. This would be done with a daemon that implements Mach and
+ which all syscalls would go to.
+ <peo-xaci> then, I also want so that standard Linux applications can go
+ through that Mach daemon as well, if a special libc is preloaded
+ <braunr> you might want to discuss this with antrik
+ <peo-xaci> what I'm trying to figure out specifically is if there is some
+ library/interface that glue Hurd with Mach and would be better suited to
+ emulate than Mach? Mach seems to be more of an implementation detail to
+ the hurd and not something an application would directly use.
+ <braunr> yes, the various hurd libraries (libports and libpager mostly)
+ <peo-xaci> From []:
+ "libports is not (at least, not for now) a generalization / abstraction
+ of Mach ports to the functionality the Hurd needs, that is, it is not
+ meant to provide an interface independently of the underlying
+ microkernel."
+ <peo-xaci> Is this still true?
+ <peo-xaci> Does libpager abstract the rest?
+ <peo-xaci> (and the other hurd libraries)
+ <braunr> there is nothing that really abstracts the hurd from mach
+ <braunr> for example, reference counting often happens here and there
+ <braunr> and core libraries like glibc and libpthread heavily rely on it
+ (through sysdeps specific code though)
+ <braunr> libports and libpager are meant to simplify object manipulation
+ for the former, and pager operations for the latter
+ <peo-xaci> and applications, such as translators, often use Mach interfaces
+ directly?
+ <peo-xaci> correct?
+ <braunr> depends on what often means
+ <braunr> let's say they do
+ <peo-xaci> :/ then it probably is better to emulate Mach after all
+ <braunr> there was a mach on posix port a long time ago
+ <peo-xaci> I thought applications were completely separated from the
+ microkernel in use by the Hurd
+ <braunr> that level of abstraction is pretty new
+ <braunr> genode is the only system i know which does that
+ <braunr> and it's still for "l4 variants"
+ <pinotree> ah, thanks (i forgot that name)
+ <antrik> braunr: Genode also runs on Linux and a few other non-L4
+ environments IIRC
+ <antrik> peo-xaci: I'm not sure binary emulation is really useful. rather,
+ I'd recompile stuff as "regular" Linux executables, only using a special
+ glibc
+ <antrik> where the special glibc could be basically a port of the Hurd
+ glibc communicating with the Mach emulation instead of real Mach; or it
+ could do emulation at a higher level
+ <antrik> a higher level emulation would be more complicated to implement,
+ but more efficient, and allow better integration with the ordinary
+ GNU/Linux environment
+ <antrik> also note that any regular program could be recompiled against the
+ HELL glibc to run in the Hurdish environment...
+ <antrik> (well, glibc + hurd server libraries)
+ <peo-xaci> I'm willing to accept that Hurd-application would need to be
+ recompiled to work on the HELL
+ <peo-xaci> but not Linux-applications :)
+ <antrik> peo-xaci: if you happen to understand German, there is a fairly
+ good overview in my thesis report ;-)
+ <antrik> peo-xaci: there are no "Hurd applications" or "Linux applications"
+ <peo-xaci> well, let me define what I mean by the terms: Hurd applications
+ use Hurd-specific interfaces/syscalls, and Linux applications use
+ Linux-specific interfaces/syscalls
+ <antrik> a few programs use Linux-specific interfaces (and we probably
+ can't run them in HELL just as we can't run them on actual Hurd); but all
+ other programs work in any glibc environment
+ <antrik> (usually in any POSIX environment in fact...)
+ <antrik> peo-xaci: no sane application uses syscalls
+ <peo-xaci> they do under the hood
+ <peo-xaci> I have read about inlined syscalls
+ <antrik> again, there are *some* applications using Linux-specific
+ interfaces (sometimes because they are inherently bound to Linux
+ features, sometimes unnecessarily)
+ <antrik> so far there are no applications using Hurd-specific interfaces
+ <peo-xaci> translators do?
+ <peo-xaci> they are standard executables are they not?
+ <peo-xaci> I would like so that translators also can be run in the HELL
+ <antrik> I wouldn't consider them applications. all existing translators
+ are pretty much components of the Hurd itself
+ <peo-xaci> okay, it's a question about semantics, perhaps I should use
+ another word than "applications" :)
+ <peo-xaci> for me, applications are what have a main-function, or similar
+ single entry point
+ <braunr> hum
+ <braunr> that's not a good enough definition
+ <antrik> anyways, as I said, I think recompiling translators against a
+ Hurdish glibc and ported translator libraries seems the most reasonable
+ approach to me
+ <braunr> let's say applications are userspace processes that make use of
+ services provided by the operating system
+ <braunr> translators being part of the operating system here
+ <antrik> braunr: do you know whether the Mach-on-POSIX was actually
+ functional, or just an abandoned experiment?...
+ <antrik> (I don't remember hearing of it before...)
+ <braunr> incomplete iirc
+ <peo-xaci> braunr: still, when I've explained what I meant, even if I used
+ the wrong term, then my previous statements should come in another light
+ <peo-xaci> antrik / braunr: are you still interested in hearing my
+ thoughts/ideas about HELL?
+ <antrik> oh, there is more to come? ;-)
+ <peo-xaci> yes! I don't think I have made myself completely understood :/
+ <peo-xaci> what I envision is a HELL system that works on as low level as
+ feasible, to make it possible to do almost anything that can be done on
+ the real Hurd (except possibly testing hardware drivers and such very low
+ level stuff).
+ <braunr> sure
+ <peo-xaci> I want it to be more than just allowing programs to access a
+ virtual filesystem à la FUSE. My idea is that all user space system
+ libraries/programs of the Hurd should be inside the HELL as well, and
+ they should not be emulated.
+ <peo-xaci> The system should at the very least be API compatible, so at the
+ very most a recompilation is necessary.
+ <peo-xaci> I also want so that GNU/Linux-programs can access the features
+ of the HELL with little effort on the user. At most perhaps a script that
+ wraps LD_PRELOADing has to be run on the binary. Best would be if it
+ could work also with insane assembly programs using raw system calls, or
+ if glibc happens to have some well hidden syscall being inlined to raw
+ assembly code.
+ <peo-xaci> And I think I have an idea on how an implementation could
+ satisfy these things!
+ <peo-xaci> By modifying the kernel and replacing those syscalls that make
+ sense for the Hurd/Mach
+ <peo-xaci> with "the kernel", I meant Linux
+ <braunr> it's possible but tedious and not very useful so better do that
+ later
+ <braunr> mach did something similar at its time
+ <braunr> there was a syscall emulation library
+ <peo-xaci> but isn't it about as much work as emulating the interface on
+ user-level?
+ <braunr> and the kernel cooperated so that unmodified unix binaries
+ performing syscalls would actually jump to functions provided by that
+ library, which generally made an RPC
+ <peo-xaci> instead of a bunch of extern-declarations, one would put the
+ symbols in the syscall table
+ <braunr> define what "those syscalls that make sense for the Hurd/Mach"
+ actually means
+ <peo-xaci> open/close, for example
+ <braunr> otherwise i don't see another better way than what the old mach
+ folks did
+ <braunr> well, with that old, but existing support, your open would perform
+ a syscall
+ <braunr> the kernel would catch it and redirect the caller to its syscall
+ emulation library
+ <braunr> which would call the open RPC instead
+ <peo-xaci> wait, so this "existing support" you're talking about; is this a
+ module for the Linux kernel (or a fork, or something else)?
+ <peo-xaci> where can I find it?
+ <braunr> no
+ <braunr> it was for mach
+ <braunr> in order to run unmodified unix binaries
+ <braunr> the opposite of what you're trying to do
+ <peo-xaci> ah okay
+ <braunr> well
+ <braunr> not really either :)
+ <peo-xaci> does posix/unix define a standard for how a syscall table should
+ look like, to allow binary syscall compatibility?
+ <braunr> absolutely not
+ <peo-xaci> so how could this mach module run any unmodified unix binary? if
+ they expected different sys calls at different offsets?
+ <braunr> posix specifically (and very early) states that it almost forbids
+ itself to deal with anything regarding ABIs
+ <braunr> depends
+ <braunr> since it was old, there weren't that many unix systems
+ <braunr> and even today, there are techniques like those used by netbsd
+ (and many other actually)
+ <braunr> that are able to inspect the binary and load a syscall emulation
+ environment depending on its exposed ABI
+ <braunr> e.g. file on an executable states which system it's for
+ <peo-xaci> hmm, I'm not sure how a kernel would implement that in
+ practice.. I thought these things were so hard coded and dependent on raw
+ memory reads that it would not be possible
+ <braunr> but i really think it's not worth the time for your project
+ <peo-xaci> to be honest I have virtually no experience of practical kernel
+ programming
+ <braunr> with an LDT on x86 for example
+ <braunr> no, there is really not that much hardcoded
+ <braunr> quite the contrary
+ <braunr> there is a lot of runtime detection today
+ <peo-xaci> well I mean how the syscall table is read
+ <braunr> it's not read
+ <peo-xaci> it's read to find the function pointer to the syscall handler in
+ the kernel?
+ <braunr> no
+ <braunr> that's the really basic approach
+ <braunr> (and in practice it can happen of course)
+ <braunr> what really happens is that, for example, on linux, the user space
+ system call code is loaded as a virtual shared library
+ <braunr> use ldd on an executable to see it
+ <braunr> this virtual object provides code that, depending on what the
+ kernel has detected, will use the appropriate method to perform a system
+ call
+ <peo-xaci> but these user space system calls need to make some kind of cpu
+ interrupt to communicate with the kernel, right?
+ <braunr> the glibc itself has no idea how a system call will look like in
+ the end
+ <braunr> yes
+ <peo-xaci> an assembler programmer would be able to get around this glue
+ code?
+ <braunr> that's precisely what is embedded in this virtual library
+ <braunr> it could yes
+ <braunr> i think even when sysenter/sysexit is supported, legacy traps are
+ still implemented to support old binaries
+ <braunr> but then all these different entry points will lead to the same
+ code inside the kernel
+ <peo-xaci> but when the glue code is used, then it's API compatible, and
+ then I can understand that the kernel can allow different syscall
+ implementations for different executables
+ <braunr> what glue code ?
+ <peo-xaci> what you talked about above "the user space system call code is
+ loaded as a virtual shared library"
+ <braunr> let's call it vdso
+ <braunr> i have to leave in a few minutes
+ <braunr> keep going, i'll read later
+ <peo-xaci> thanks, I looked it up on Wikipedia and understand immediately
+ :P
+ <peo-xaci> so VDSOs are provided by the kernel, not a regular library file,
+ right?
+ <vdox2> What does HELL stand for :) ?
+ <dardevelin> vdox2, Hurd Emulation Layer for Linux
+ <vdox2> dardevelin: thanks
+ <braunr> peo-xaci: yes
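Editorial sketch (not part of the log): on Linux with glibc, a process can ask where the kernel mapped the vDSO by reading the `AT_SYSINFO_EHDR` entry of its auxiliary vector. The helper name is made up for this example, and the snippet assumes glibc's `getauxval` is available.

```c
#include <sys/auxv.h>

/* Hypothetical helper: return the address at which the kernel mapped
   the vDSO into this process, or 0 if no such entry exists. */
unsigned long vdso_base_address(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}
```

A non-zero result indicates that the mapping comes from the kernel and not from a regular library file on disk.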
+ <antrik> peo-xaci: I believe your goals are conflicting. a low-level
+ implementation makes it basically impossible to interact between the HELL
+ environment and the GNU/Linux environment in any meaningful way. to allow
+ such interaction, you *have* to have some glue at a higher semantic level
+ <braunr> agreed
+ <antrik> peo-xaci: BTW, if you want regular Linux binaries to get somehow
+ redirected to access HELL facilities, there is already a framework (don't
+ remember the name right now) that allows this kind of system call
+ redirection on Linux
+ <antrik> (it can run both through LD_PRELOAD or as a kernel module -- where
+ obviously only the latter would allow raw system call redirection... but
+ TBH, I don't think that's worthwhile anyways. the rare cases where
+ programs use raw system calls are usually for extremely system-specific
+ stuff anyways...)
+ <antrik> ViewOS is the name
+ <antrik> err... View-OS I mean
+ <antrik> or maybe View OS ? ;-)
+ <antrik> whatever, you'll find it :-)
+ <antrik> I'm not sure it's really worthwhile to use this either
+ though... the most meaningful interaction is probably at the FS level,
+ and that can be done with FUSE
+ <antrik> OHOH, View-OS probably allows doing more interesting stuff than
+ FUSE, such as modifying the way the VFS works...
+ <antrik> OTOH
+ <antrik> so it could expose more of the Hurd features, at least in theory
+## IRC, freenode, #hurd, 2012-10-13
+ <peo-xaci> antrik / braunr: thanks for your input! I'm not entirely
+ convinced though. :) I will probably return to this project once I have
+ acquired a lot more knowledge about low level stuff. I want to see for
+ myself whether a low level HELL is not feasible. :P
+ <braunr> peo-xaci: what's the point of a low level hell ?
+ <peo-xaci> more Hurd code can be tested in the hell, if the hell is at a
+ low level
+ <peo-xaci> at a higher level, some Hurd code cannot run, because the
+ interfaces they use would not be accessible from the higher level
+ emulation
+ <antrik> peo-xaci: I never said it's not possible. I actually said it would
+ be easier to do. I just said you can't do it low level *and* have
+ meaningful interaction with the host system
+ <peo-xaci> I don't understand why
+ <braunr> peo-xaci: i really don't see what you want to achieve with low
+ level support
+ <braunr> what would be unavailable with a higher level approach ?
diff --git a/open_issues/libpthread.mdwn b/open_issues/libpthread.mdwn
index 03a52218..81f1a382 100644
--- a/open_issues/libpthread.mdwn
+++ b/open_issues/libpthread.mdwn
@@ -566,3 +566,671 @@ There is a [[!FF_project 275]][[!tag bounty]] on this task.
<braunr> ouch
<bddebian> braunr: Do you have debugging enabled in that custom kernel you
installed? Apparently it is sitting at the debug prompt.
+## IRC, freenode, #hurd, 2012-08-12
+ <braunr> hmm, it seems the hurd notion of cancellation is actually not the
+ pthread one at all
+ <braunr> pthread_cancel merely marks a thread as being cancelled, while
+ hurd_thread_cancel interrupts it
+ <braunr> ok, i have a pthread_hurd_cond_wait_np function in glibc
+## IRC, freenode, #hurd, 2012-08-13
+ <braunr> nice, i got ext2fs work with pthreads
+ <braunr> there are issues with the stack size strongly limiting the number
+ of concurrent threads, but that's easy to fix
+ <braunr> one problem with the hurd side is the condition implications
+ <braunr> i think it should be dealt with separately, and before doing anything
+ with pthreads
+ <braunr> but that's minor, the most complex part is, again, the term server
+ <braunr> other than that, it was pretty easy to do
+ <braunr> but, i shouldn't speak too soon, who knows what tricky bootstrap
+ issue i'm gonna face ;p
+ <braunr> tschwinge: i'd like to know how i should proceed if i want a
+ symbol in a library overridden by that of a main executable
+ <braunr> e.g. have libpthread define a default stack size, and let
+ executables define their own if they want to change it
+ <braunr> tschwinge: i suppose i should create a weak alias in the library
+ and a normal variable in the executable, right ?
+ <braunr> hm i'm making this too complicated
+ <braunr> don't mind that stupid question
+ <tschwinge> braunr: A simple variable definition would do, too, I think?
+ <tschwinge> braunr: Anyway, I'd first like to know why we can't reduce the
+ size of libpthread threads from 2 MiB to 64 KiB as libthreads had. Is
+ that a requirement of the pthread specification?
+ <braunr> tschwinge: it's a requirement yes
+ <braunr> the main reason i see is that hurd threadvars (which are still
+ present) rely on common stack sizes and alignment to work
+ <tschwinge> Mhm, I see.
+ <braunr> so for now, i'm using this approach as a hack only
+ <tschwinge> I'm working on phasing out threadvars, but we're not there yet.
+ <tschwinge> Yes, that's fine for the moment.
+ <braunr> tschwinge: a simple definition wouldn't work
+ <braunr> tschwinge: i resorted to a weak symbol, and see how it goes
+ <braunr> tschwinge: i supposed i need to export my symbol as a global one,
+ otherwise making it weak makes no sense, right ?
+ <braunr> suppose*
+ <braunr> tschwinge: also, i'm not actually sure what you meant is a
+ requirement about the stack size, i shouldn't have answered right away
+ <braunr> no there is actually no requirement
+ <braunr> i misunderstood your question
+ <braunr> hm when adding this weak variable, starting a program segfaults :(
+ <braunr> apparently on ___pthread_self, a tls variable
+ <braunr> fighting black magic begins
+ <braunr> arg, i can't manage to use that weak symbol to reduce stack sizes
+ :(
+ <braunr> ah yes, finally
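Editorial sketch (not part of the log): the weak-symbol idea discussed above can be written down in a few lines of GNU C. This is only an illustration of the general technique, not the code braunr used; the names `sketch_default_stack_size` and `sketch_thread_stack_size` are invented, and the 2 MiB value is the figure quoted in the log.

```c
#include <stddef.h>

/* Library side: a weak definition supplies the default value.
   A program that wants another size would provide its own ordinary
   (strong) definition, for example
       size_t sketch_default_stack_size = 64 * 1024;
   and in a statically linked program that strong definition takes
   precedence over the weak one at link time. */
__attribute__((weak)) size_t sketch_default_stack_size = 2 * 1024 * 1024;

/* Library code reads the variable when it creates a thread. */
size_t sketch_thread_stack_size(void)
{
    return sketch_default_stack_size;
}
```

With no overriding definition present, the library default is what the helper returns.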
+ <braunr> git clone /path/to/glibc.git on a pthread-powered ext2fs server :>
+ <braunr> tschwinge: seems i have problems using __thread in hurd code
+ <braunr> tschwinge: they produce undefined symbols
+ <braunr> tschwinge: forget that, another mistake on my part
+ <braunr> so, current state: i just need to create another patch, for the
+ code that is included in the debian hurd package but not in the upstream
+ hurd repository (e.g. procfs, netdde), and i should be able to create
+ hurd packages that completely use pthreads
+## IRC, freenode, #hurd, 2012-08-14
+ <braunr> tschwinge: i have weird bootstrap issues, as expected
+ <braunr> tschwinge: can you point me to important files involved during
+ bootstrap ?
+ <braunr> my ext2fs.static server refuses to start as a rootfs, whereas it
+ seems to work fine otherwise
+ <braunr> hm, it looks like it's related to global signal dispositions
+## IRC, freenode, #hurd, 2012-08-15
+ <braunr> ahah, a subhurd running pthreads-powered hurd servers only
+ <LarstiQ> braunr: \o/
+ <braunr> i can even long on ssh
+ <braunr> log
+ <braunr> pinotree: for reference, i uploaded my debian-specific changes
+ there :
+ <braunr>
+ <braunr> darnassus is now running a pthreads-enabled hurd system :)
+## IRC, freenode, #hurd, 2012-08-16
+ <braunr> my pthreads-enabled hurd systems can quickly die under load
+ <braunr> youpi: with hurd servers using pthreads, i occasionally see thread
+ storms apparently due to a deadlock
+ <braunr> youpi: it makes me think of the problem you sometimes have (and
+ had often with the page cache patch)
+ <braunr> in cthreads, mutex and condition operations are macros, and they
+ check the mutex/condition queue without holding the internal
+ mutex/condition lock
+ <braunr> i'm not sure where this can lead to, but it doesn't seem right
+ <pinotree> isn't that a bit dangerous?
+ <braunr> i believe it is
+ <braunr> i mean
+ <braunr> it looks dangerous
+ <braunr> but it may be perfectly safe
+ <pinotree> could it be?
+ <braunr> aiui, it's an optimization, e.g. "don't take the internal lock if
+ there is no thread to wake"
+ <braunr> but if there is a thread enqueuing itself at the same time, it
+ might not be woken
+ <pinotree> yeah
+ <braunr> pthreads don't have this issue
+ <braunr> and what i see looks like a deadlock
+ <pinotree> anything can happen between the unlocked checking and the
+ following instruction
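Editorial sketch (not part of the log): the usual way to rule out the lost wakeup described above is to test and change the shared state only while holding the mutex, on both the waiting and the signalling side. The example below uses plain POSIX threads; `struct sketch_event` and `sketch_run_handoff` are names invented for this illustration and do not come from cthreads or libpthread.

```c
#include <pthread.h>
#include <stddef.h>

struct sketch_event {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int ready;                   /* predicate, only touched under lock */
};

static void *sketch_waiter(void *arg)
{
    struct sketch_event *ev = arg;

    pthread_mutex_lock(&ev->lock);
    while (!ev->ready)           /* checked under the lock, re-checked on wakeup */
        pthread_cond_wait(&ev->cond, &ev->lock);
    pthread_mutex_unlock(&ev->lock);
    return ev;
}

/* Start a waiter, publish the event under the lock, and return 1 once
   the waiter has observed it. */
int sketch_run_handoff(void)
{
    struct sketch_event ev = {
        PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
    };
    pthread_t thread;
    void *result = NULL;

    if (pthread_create(&thread, NULL, sketch_waiter, &ev) != 0)
        return 0;
    pthread_mutex_lock(&ev.lock);
    ev.ready = 1;                /* state change and signal in one critical section */
    pthread_cond_signal(&ev.cond);
    pthread_mutex_unlock(&ev.lock);
    pthread_join(thread, &result);
    return result == &ev;
}
```

Because the waiter checks `ready` under the same lock that the signaller holds while setting it, the waiter either sees the flag already set or is already waiting when the signal is sent.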
+ <braunr> so i'm not sure how a situation working around a faulty
+ implementation would result in a deadlock with a correct one
+ <braunr> on the other hand, the error youpi reported
+ ( seems
+ to indicate something is deeply wrong with libports
+ <pinotree> it could also be the current code does not really "works around"
+ that, but simply implicitly relies on the so-generated behaviour
+ <braunr> luckily not often
+ <braunr> maybe
+ <braunr> i think we have to find and fix these issues before moving to
+ pthreads entirely
+ <braunr> (ofc, using pthreads to trigger those bugs is a good procedure)
+ <pinotree> indeed
+ <braunr> i wonder if tweaking the error checking mode of pthreads to abort
+ on EDEADLK is a good approach to detecting this problem
+ <braunr> let's try !
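Editorial sketch (not part of the log): standard POSIX threads already offer a mutex type that reports a self-deadlock instead of hanging. The snippet shows that behaviour with `PTHREAD_MUTEX_ERRORCHECK`; it does not reproduce braunr's modified libpthread, which is described as aborting on `EDEADLK`. The helper name is invented.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: lock an error-checking mutex twice from the
   same thread and return the result of the second attempt. */
int sketch_relock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int err;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mutex);
    err = pthread_mutex_lock(&mutex);    /* relock by the owner is refused */
    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return err;
}
```

A debugging build could turn such a return value into an `abort()` so that the offending call site shows up in a backtrace.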
+ <braunr> youpi: eh, i think i've spotted the libports ref mistake
+ <youpi> ooo!
+ <youpi> .oOo.!!
+ <gnu_srs> Same problem but different patches
+ <braunr> look at libports/bucket-iterate.c
+ <braunr> in the HURD_IHASH_ITERATE loop, pi->refcnt is incremented without
+ a lock
+ <youpi> Mmm, the incrementation itself would probably be compiled into an
+ INC, which is safe in UP
+ <youpi> it's an add currently actually
+ <youpi> 0x00004343 <+163>: addl $0x1,0x4(%edi)
+ <braunr> 40c4: 83 47 04 01 addl $0x1,0x4(%edi)
+ <youpi> that makes it SMP unsafe, but not UP unsafe
+ <braunr> right
+ <braunr> too bad
+ <youpi> that still deserves fixing :)
+ <braunr> the good side is my mind is already wired for smp
+ <youpi> well, it's actually not UP either
+ <youpi> in general
+ <youpi> when the processor is not able to do the add in one instruction
+ <braunr> sure
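Editorial sketch (not part of the log): if a reference count really were updated without any lock, one portable way to make the update safe on SMP is a C11 atomic read-modify-write. The structure and function names below are invented; they are not the libports API.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for an object with a reference count. */
struct sketch_port {
    atomic_int refcnt;
};

/* Take one reference; the increment is a single atomic operation. */
void sketch_port_ref(struct sketch_port *port)
{
    atomic_fetch_add(&port->refcnt, 1);
}

/* Drop one reference; returns 1 if this call released the last one. */
int sketch_port_deref(struct sketch_port *port)
{
    return atomic_fetch_sub(&port->refcnt, 1) == 1;
}
```

The return value of `sketch_port_deref` tells the caller whether it is responsible for destroying the object.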
+ <braunr> youpi: looks like i'm wrong, refcnt is protected by the global
+ libports lock
+ <youpi> braunr: but aren't there pieces of code which manipulate the refcnt
+ while taking another lock than the global libports lock
+ <youpi> it'd not be scalable to use the global libports lock to protect
+ refcnt
+ <braunr> youpi: imo, the scalability issues are present because global
+ locks are taken all the time, indeed
+ <youpi> urgl
+ <braunr> yes ..
+ <braunr> when enabling mutex checks in libpthread, pfinet dies :/
+ <braunr> grmbl, when trying to start "ls" using my deadlock-detection
+ libpthread, the terminal gets unresponsive, and i can't even use ps .. :(
+ <pinotree> braunr: one could say your deadlock detection works too
+ good... :P
+ <braunr> pinotree: no, i made a mistake :p
+ <braunr> it works now :)
+ <braunr> well, works is a bit fast
+ <braunr> i can't attach gdb now :(
+ <braunr> *sigh*
+ <braunr> i guess i'd better revert to a cthreads hurd and debug from there
+ <braunr> eh, with my deadlock-detection changes, recursive mutexes are now
+ failing on _pthread_self(), which for some obscure reason generates this
+ <braunr> => 0x0107223b <+283>: jmp 0x107223b
+ <__pthread_mutex_timedlock_internal+283>
+ <braunr> *sigh*
+## IRC, freenode, #hurd, 2012-08-17
+ <braunr> aw, the thread storm i see isn't a deadlock
+ <braunr> seems to be mere contention ....
+ <braunr> youpi: what do you think of the way
+ ports_manage_port_operations_multithread determines it needs to spawn a
+ new thread ?
+ <braunr> it grabs a lock protecting the number of threads to determine if
+ it needs a new thread
+ <braunr> then releases it, to retake it right after if a new thread must be
+ created
+ <braunr> aiui, it could lead to a situation where many threads could
+ determine they need to create threads
+ <youpi> braunr: there's no reason to release the spinlock before re-taking
+ it
+ <youpi> that can indeed lead to too many thread creations
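Editorial sketch (not part of the log): one way to avoid releasing and retaking the lock is to make the spawn decision and the bookkeeping for the new thread inside a single critical section. The code below is a simplified model with invented names (`struct sketch_pool`, `idle`, `total`); it is not the libports implementation and uses a pthread mutex where the original uses a spin lock.

```c
#include <assert.h>
#include <pthread.h>

struct sketch_pool {
    pthread_mutex_t lock;
    int idle;                    /* threads waiting for a request */
    int total;                   /* all threads, busy or idle */
};

/* Called by a thread that just received a request.  Returns 1 if the
   caller should create one more thread after returning. */
int sketch_begin_request(struct sketch_pool *pool)
{
    int spawn;

    pthread_mutex_lock(&pool->lock);
    assert(pool->idle > 0);      /* the caller itself was idle until now */
    pool->idle--;
    spawn = (pool->idle == 0);
    if (spawn) {
        /* Account for the new thread before unlocking, so no other
           thread can reach the same decision in between. */
        pool->total++;
        pool->idle++;
    }
    pthread_mutex_unlock(&pool->lock);
    return spawn;
}

/* Called when the request has been serviced. */
void sketch_end_request(struct sketch_pool *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->idle++;
    pthread_mutex_unlock(&pool->lock);
}
```

The thread creation itself still happens outside the lock; only the counters are updated atomically with the decision.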
+ <braunr> youpi: a harder question
+ <braunr> youpi: what if thread creation fails ? :/
+ <braunr> if i'm right, hurd servers simply never expect thread creation to
+ fail
+ <youpi> indeed
+ <braunr> and as some patterns have threads blocking until another produces
+ an event
+ <braunr> i'm not sure there is any point handling the failure at all :/
+ <youpi> well, at least produce some output
+ <braunr> i added a perror
+ <youpi> so we know that happened
+ <braunr> async messaging is quite evil actually
+ <braunr> the bug i sometimes have with pfinet is usually triggered by
+ fakeroot
+ <braunr> it seems to use select a lot
+ <braunr> and select often destroys ports when it has something to return to
+ the caller
+ <braunr> which creates dead name notifications
+ <braunr> and if done often enough, a lot of them
+ <youpi> uh
+ <braunr> and as pfinet is creating threads to service new messages, already
+ existing threads are starved and can't continue
+ <braunr> which leads to pfinet exhausting its address space with thread
+ stacks (at about 30k threads)
+ <braunr> i initially thought it was a deadlock, but my modified libpthread
+ didn't detect one, and indeed, after i killed fakeroot (the whole
+ dpkg-buildpackage process hierarchy), pfinet just "cooled down"
+ <braunr> with almost all 30k threads simply waiting for requests to
+ service, and the few expected select calls blocking (a few ssh sessions,
+ exim probably, possibly others)
+ <braunr> i wonder why this doesn't happen with cthreads
+ <youpi> there's a 4k guard between stacks, otherwise I don't see anything
+ obvious
+ <braunr> i'll test my pthreads package with the fixed
+ ports_manage_port_operations_multithread
+ <braunr> but even if this "fix" should reduce thread creation, it doesn't
+ prevent the starvation i observed
+ <braunr> evil concurrency :p
+ <braunr> youpi: hm i've just spotted an important difference actually
+ <braunr> youpi: glibc sched_yield is __swtch(), cthreads is
+ <braunr> i'll change the glibc implementation, see how it affects the whole
+ system
+ <braunr> youpi: do you think bootsting the priority or cancellation
+ requests is an acceptable workaround ?
+ <braunr> boosting
+ <braunr> of*
+ <youpi> workaround for what?
+ <braunr> youpi: the starvation i described earlier
+ <youpi> well, I guess I'm not into the thing enough to understand
+ <youpi> you meant the dead port notifications, right?
+ <braunr> yes
+ <braunr> they are the cancellation triggers
+ <youpi> cancelling what?
+ <braunr> a blocking select for example
+ <braunr> ports_do_mach_notify_dead_name -> ports_dead_name ->
+ ports_interrupt_notified_rpcs -> hurd_thread_cancel
+ <braunr> so it's important they are processed quickly, to allow blocking
+ threads to unblock, reply, and be recycled
+ <youpi> you mean the threads in pfinet?
+ <braunr> the issue applies to all servers, but yes
+ <youpi> k
+ <youpi> well, it can not not be useful :)
+ <braunr> whatever the choice, it seems to me there will be a security issue
+ (a denial of service of some kind)
+ <youpi> well, it's not only in that case
+ <youpi> you can always queue a lot of requests to a server
+ <braunr> sure, i'm just focusing on this particular problem
+ <braunr> hm
+ <braunr> i'd say POLICY_TIMESHARE just in case
+ <braunr> (and i'm not sure mach handles fixed priority threads first
+ actually :/)
+ <braunr> hm my current hack which consists of calling swtch_pri(0) from a
+ freshly created thread seems to do the job eh
+ <braunr> (it may be what cthreads unintentionally does by acquiring a spin
+ lock from the entry function)
+ <braunr> not a single issue any more with this hack
+ <bddebian> Nice
+ <braunr> bddebian: well it's a hack :p
+ <braunr> and the problem is that, in order to boost a thread's priority,
+ one would need to implement that in libpthread
+ <bddebian> there isn't thread priority in libpthread?
+ <braunr> it's not implemented
+ <bddebian> Interesting
+ <braunr> if you want to do it, be my guest :p
+ <braunr> mach should provide the basic stuff for a partial implementation
+ <braunr> but for now, i'll fall back on the hack, because that's what
+ cthreads "does", and it's "reliable enough"
+ <antrik> braunr: I don't think the locking approach in
+ ports_manage_port_operations_multithread() could cause issues. the worst
+ that can happen is that some other thread becomes idle between the check
+ and creating a new thread -- and I can't think of a situation where this
+ could have any impact...
+ <braunr> antrik: hm ?
+ <braunr> the worst case is that many threads will evaluate spawn to 1 and
+ create threads, whereas only one of them should have
+ <antrik> braunr: I'm not sure perror() is a good way to handle the
+ situation where thread creation failed. this would usually happen because
+ of resource shortage, right? in that case, it should work in non-debug
+ builds too
+ <braunr> perror isn't specific to debug builds
+ <braunr> i'm building glibc packages with a pthreads-enabled hurd :>
+ <braunr> (which at one point run the test allocating and filling 2 GiB of
+ memory, which passed)
+ <braunr> (with a kernel using a 3/1 split of course, swap usage reached
+ something like 1.6 GiB)
+ <antrik> braunr: BTW, I think the observation that thread storms tend to
+ happen on destroying stuff more than on creating stuff has been made
+ before...
+ <braunr> ok
+ <antrik> braunr: you are right about perror() of course. brain fart -- was
+ thinking about assert_perror()
+ <antrik> (which is misused in some places in existing Hurd code...)
+ <antrik> braunr: I still don't see the issue with the "spawn"
+ locking... the only situation where this code can be executed
+ concurrently is when multiple threads are idle and handling incoming
+ request -- but in that case spawning does *not* happen anyways...
+ <antrik> unless you are talking about something else than what I'm thinking
+ of...
+ <braunr> well imagine you have idle threads, yes
+ <braunr> let's say a lot like a thousand
+ <braunr> and the server gets a thousand requests
+ <braunr> and one more :p
+ <braunr> normally only one thread should be created to handle it
+ <braunr> but here, the worst case is that all threads run internal_demuxer
+ roughly at the same time
+ <braunr> and they all determine they need to spawn a thread
+ <braunr> leading to another thousand
+ <braunr> (that's extreme and very unlikely in practice of course)
+ <antrik> oh, I see... you mean all the idle threads decide that no spawning
+ is necessary; but before they proceed, finally one comes in and decides
+ that it needs to spawn; and when the other ones are scheduled again they
+ all spawn unnecessarily?
+ <braunr> no, spawn is a local variable
+ <braunr> it's rather, all idle threads become busy, and right before
+ servicing their request, they all decide they must spawn a thread
+ <antrik> I don't think that's how it works. changing the status to busy (by
+ decrementing the idle counter) and checking that there are no idle
+ threads is atomic, isn't it?
+ <braunr> no
+ <antrik> oh
+ <antrik> I guess I should actually look at that code (again) before
+ commenting ;-)
+ <braunr> let me check
+ <braunr> no sorry you're right
+ <braunr> so right, you can't lead to that situation
+ <braunr> i don't even understand how i can't see that :/
+ <braunr> let's say it's the heat :p
+ <braunr> 22:08 < braunr> so right, you can't lead to that situation
+ <braunr> it can't lead to that situation
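The conclusion reached above is that marking a thread busy and checking whether any idle thread remains happen as one atomic step, so only the thread that takes the last idle slot decides to spawn. A minimal Python sketch of that idea, assuming a single lock around the idle counter. `IdleCounter`, `begin_request` and `count_spawn_decisions` are illustrative names, not libports identifiers:

```python
import threading


class IdleCounter:
    """Hypothetical model of the idle-thread accounting discussed above."""

    def __init__(self, idle):
        self._lock = threading.Lock()
        self._idle = idle

    def begin_request(self):
        """Mark the calling thread busy; return True if it should spawn.

        The decrement and the zero check share one critical section, so
        exactly one caller can observe the counter reaching zero.
        """
        with self._lock:
            self._idle -= 1
            return self._idle == 0

    def end_request(self):
        """Mark the calling thread idle again."""
        with self._lock:
            self._idle += 1


def count_spawn_decisions(idle_threads):
    """Let every idle thread pick up a request at once; count spawn votes."""
    counter = IdleCounter(idle_threads)
    decisions = []
    workers = [
        threading.Thread(target=lambda: decisions.append(counter.begin_request()))
        for _ in range(idle_threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return sum(decisions)
```

If the decrement and the check were two separate critical sections, several threads could see the counter at zero and each create a thread, which is the scenario that was feared at first.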
+## IRC, freenode, #hurd, 2012-08-18
+ <braunr> one more attempt at fixing netdde, hope i get it right this time
+ <braunr> some parts assume a ddekit thread is a cthread, because they share
+ the same address
+ <braunr> it's not as easy when using pthread_self :/
+ <braunr> good, i got netdde work with pthreads
+ <braunr> youpi: for reference, there are now glibc, hurd and netdde
+ packages on my repository
+ <braunr> youpi: the debian specific patches can be found at my git
+ repository ( and
+ <braunr> except a freeze during boot (between exec and init) which happens
+ rarely, and the starvation which still exists to some extent (fakeroot
+ can cause many threads to be created in pfinet and pflocal), the
+ glibc/hurd packages have been working fine for a few days now
+ <braunr> the threading issue in pfinet/pflocal is directly related to
+ select, which the io_select_timeout patches should fix once merged
+ <braunr> well, considerably reduce at least
+ <braunr> and maybe fix completely, i'm not sure
+## IRC, freenode, #hurd, 2012-08-27
+ <pinotree> braunr: wrt a78a95d in your pthread branch of hurd.git,
+ shouldn't that job theoretically have been done using the pthread api (of
+ course after implementing it)?
+ <braunr> pinotree: sure, it could be done through pthreads
+ <braunr> pinotree: i simply restricted myself to moving the hurd to
+ pthreads, not augment libpthread
+ <braunr> (you need to remember that i work on hurd with pthreads because it
+ became a dependency of my work on fixing select :p)
+ <braunr> and even if it wasn't the reason, it is best to do these tasks
+ (replace cthreads and implement pthread scheduling api) separately
+ <pinotree> braunr: hm ok
+ <pinotree> implementing the pthread priority bits could be done
+ independently though
+ <braunr> youpi: there are more than 9000 threads for /hurd/streamio kmsg on
+ ironforge oO
+ <youpi> kmsg ?!
+ <youpi> it's only /dev/klog right?
+ <braunr> not sure but it seems so
+ <pinotree> which syslog daemon is running?
+ <youpi> inetutils
+ <youpi> I've restarted the klog translator, to see whether when it grows
+ again
+ <braunr> 6 hours and 21 minutes to build glibc on darnassus
+ <braunr> pfinet still runs only 24 threads
+ <braunr> the ext2 instance used for the build runs 2k threads, but that's
+ because of the pageouts
+ <braunr> so indeed, the priority patch helps a lot
+ <braunr> (pfinet used to have several hundreds, sometimes more than a
+ thousand threads after a glibc build, and potentially increasing with
+ each use of fakeroot)
+ <braunr> exec weighs 164M eww, we definitely have to fix that leak
+ <braunr> the leaks are probably due to wrong mmap/munmap usage
+### IRC, freenode, #hurd, 2012-08-29
+ <braunr> youpi: btw, after my glibc build, there were as little as between
+ 20 and 30 threads for pflocal and pfinet
+ <braunr> with the priority patch
+ <braunr> ext2fs still had around 2k because of pageouts, but that's
+ expected
+ <youpi> ok
+ <braunr> overall the results seem very good and allow the switch to
+ pthreads
+ <youpi> yep, so it seems
+ <braunr> youpi: i think my first integration branch will include only a few
+ changes, such as this priority tuning, and the replacement of
+ condition_implies
+ <youpi> sure
+ <braunr> so we can push the move to pthreads after all its small
+ dependencies
+ <youpi> yep, that's the most readable way
+## IRC, freenode, #hurd, 2012-09-03
+ <gnu_srs> braunr: Compiling yodl-3.00.0-7:
+ <gnu_srs> pthreads: real 13m42.460s, user 0m0.000s, sys 0m0.030s
+ <gnu_srs> cthreads: real 9m 6.950s, user 0m0.000s, sys 0m0.020s
+ <braunr> thanks
+ <braunr> i'm not exactly certain about what causes the problem though
+ <braunr> it could be due to libpthread using doubly-linked lists, but i
+ don't think the overhead would be so much heavier because of that alone
+ <braunr> there is so much contention sometimes that it could
+ <braunr> the hurd would have been better off with single threaded servers
+ :/
+ <braunr> we should probably replace spin locks with mutexes everywhere
+ <braunr> on the other hand, i don't have any more starvation problem with
+ the current code
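To put the two yodl build times quoted above side by side, here is a small Python calculation of the wall-clock ratio. It uses only the `real` figures from the log; it is a single measurement per variant, so the result is indicative rather than a benchmark:

```python
def to_seconds(minutes, seconds):
    """Convert a `time`-style minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


pthreads_real = to_seconds(13, 42.460)  # real time reported for pthreads
cthreads_real = to_seconds(9, 6.950)    # real time reported for cthreads

slowdown = pthreads_real / cthreads_real
print(f"pthreads build took {slowdown:.2f}x the cthreads wall-clock time")
```

The ratio comes out at roughly 1.5, i.e. the pthreads build needed about half as much time again as the cthreads one.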
+### IRC, freenode, #hurd, 2012-09-06
+ <gnu_srs> braunr: Yes you are right, the new pthread-based Hurd is _much_
+ slower.
+ <gnu_srs> One annoying example is when compiling, the standard output is
+ written in bursts with _long_ periods of no output in between:-(
+ <braunr> that's more probably because of the priority boost, not the
+ overhead
+ <braunr> that's one of the big issues with our mach-based model
+ <braunr> we either give high priorities to our servers, or we can suffer
+ from message floods
+ <braunr> that's in fact more a hurd problem than a mach one
+ <gnu_srs> braunr: any immediate ideas how to speed up responsiveness of the
+ pthread-hurd? It is annoyingly slow (slow-witted)
+ <braunr> gnu_srs: i already answered that
+ <braunr> it doesn't look that slower on my machines though
+ <gnu_srs> you said you had some ideas, not which. except for mcsims work.
+ <braunr> i have ideas about what makes it slower
+ <braunr> it doesn't mean i have solutions for that
+ <braunr> if i had, don't you think i'd have applied them ? :)
+ <gnu_srs> ok, how to make it more responsive on the console? and printing
+ stdout more regularly, now several pages are stored and then flushed.
+ <braunr> give more details please
+ <gnu_srs> it behaves like a loaded linux desktop, with little memory
+ left...
+ <braunr> details about what you're doing
+ <gnu_srs> apt-get source any big package and: fakeroot debian/rules binary
+ 2>&1 | tee ../binary.logg
+ <braunr> i see
+ <braunr> well no, we can't improve responsiveness
+ <braunr> without reintroducing the starvation problem
+ <braunr> they are linked
+ <braunr> and what you're doing involves a few buffers, so the laggy feel is
+ expected
+ <braunr> if we can fix that simply, we'll do so after it is merged upstream
+### IRC, freenode, #hurd, 2012-09-07
+ <braunr> gnu_srs: i really don't feel the sluggishness you described with
+ hurd+pthreads on my machines
+ <braunr> gnu_srs: what's your hardware ?
+ <braunr> and your VM configuration ?
+ <gnu_srs> Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
+ <gnu_srs> kvm -m 1024 -net nic,model=rtl8139 -net
+ user,hostfwd=tcp::5562-:22 -drive
+ cache=writeback,index=0,media=disk,file=hurd-experimental.img -vnc :6
+ -cdrom isos/netinst_2012-07-15.iso -no-kvm-irqchip
+ <braunr> what is the file system type where your disk image is stored ?
+ <gnu_srs> ext3
+ <braunr> and how much physical memory on the host ?
+ <braunr> (paste meminfo somewhere please)
+ <gnu_srs> 4G, and it's on the limit, 2 kvm instances+gnome,etc
+ <gnu_srs> 80% in use by programs, 14% in cache.
+ <braunr> ok, that's probably the reason then
+ <braunr> the writeback option doesn't help a lot if you don't have much
+ cache
+ <gnu_srs> well the other instance is cthreads based, and not so sluggish.
+ <braunr> we know hurd+pthreads is slower
+ <braunr> i just wondered why i didn't feel it that much
+ <gnu_srs> try to fire up more kvm instances, and do a heavy compile...
+ <braunr> i don't do that :)
+ <braunr> that's why i never had the problem
+ <braunr> most of the time i have like 2-3 GiB of cache
+ <braunr> and of course more on shattrath
+ <braunr> (the host of the hurdboxes, which has 16 GiB of ram)
+### IRC, freenode, #hurd, 2012-09-11
+ <gnu_srs> Monitoring the cthreads and the pthreads load under Linux shows:
+ <gnu_srs> cthread version: load can jump very high, less cpu usage than
+ pthread version
+ <gnu_srs> pthread version: less memory usage, background cpu usage higher
+ than for cthread version
+ <braunr> that's the expected behaviour
+ <braunr> gnu_srs: are you using the lifothreads gnumach kernel ?
+ <gnu_srs> for experimental, yes.
+ <gnu_srs> i.e. pthreads
+ <braunr> i mean, you're measuring on it right now, right ?
+ <gnu_srs> yes, one instance running cthreads, and one pthreads (with lifo
+ gnumach)
+ <braunr> ok
+ <gnu_srs> no swap used in either instance, will try a heavy compile later
+ on.
+ <braunr> what for ?
+ <gnu_srs> E.g. for memory when linking. I have swap available, but no swap
+ is used currently.
+ <braunr> yes but, what do you intend to measure ?
+ <gnu_srs> don't know, just to see if swap is used at all. it seems to be
+ used not very much.
+ <braunr> depends
+ <braunr> be warned that using the swap means there is pageout, which is one
+ of the triggers for global system freeze :p
+ <braunr> anonymous memory pageout
+ <gnu_srs> for linux swap is used constructively, why not on hurd?
+ <braunr> because of hard to squash bugs
+ <gnu_srs> aha, so it is bugs hindering swap usage:-/
+ <braunr> yup :/
+ <gnu_srs> Let's find them thenO:-), piece of cake
+ <braunr> remember my page cache branch in gnumach ? :)
+ <gnu_srs> not much
+ <braunr> i started it before fixing non blocking select
+ <braunr> anyway, as a side effect, it should solve this stability issue
+ too, but it'll probably take time
+ <gnu_srs> is that branch integrated? I only remember slab and the lifo
+ stuff.
+ <gnu_srs> and mcsims work
+ <braunr> no it's not
+ <braunr> it's unfinished
+ <gnu_srs> k!
+ <braunr> it correctly extends the page cache to all available physical
+ memory, but since the hurd doesn't scale well, it slows the system down
+## IRC, freenode, #hurd, 2012-09-14
+ <braunr> arg
+ <braunr> darnassus seems to eat 100% cpu and make top freeze after some
+ time
+ <braunr> seems like there is an important leak in the pthreads version
+ <braunr> could be the lifothreads patch :/
+ <cjbirk> there's a memory leak?
+ <cjbirk> in pthreads?
+ <braunr> i don't think so, and it's not a memory leak
+ <braunr> it's a port leak
+ <braunr> probably in the kernel
+### IRC, freenode, #hurd, 2012-09-17
+ <braunr> nice, the port leak is actually caused by the exim4 loop bug
+### IRC, freenode, #hurd, 2012-09-23
+ <braunr> the port leak i observed a few days ago is because of exim4 (the
+ infamous loop eating the cpu we've been seeing regularly)
+ <youpi> oh
+ <braunr> next time it happens, and if i have the occasion, i'll examine the
+ problem
+ <braunr> tip: when you can't use top or ps -e, you can use ps -e -o
+ pid=,args=
+ <youpi> or -M ?
+ <braunr> haven't tested
+## IRC, freenode, #hurd, 2012-09-23
+ <braunr> tschwinge: i committed the last hurd pthread change,
+ <braunr> tschwinge: please tell me if you consider it ok for merging
+### IRC, freenode, #hurd, 2012-11-27
+ <youpi> braunr: btw, I forgot to forward here, with the glibc patch it does
+ boot fine, I'll push all that and build some almost-official packages for
+ people to try out what will come when eglibc gets the change in unstable
+ <braunr> youpi: great :)
+ <youpi> thanks for managing the final bits of this
+ <youpi> (and thanks for everybody involved)
+ <braunr> sorry again for the non obvious parts
+ <braunr> if you need the debian specific parts refined (e.g. nice commits
+ for procfs & others), i can do that
+ <youpi> I'll do that, no pb
+ <braunr> ok
+ <braunr> after that (well, during also), we should focus more on bug
+ hunting
+## IRC, freenode, #hurd, 2012-10-26
+ <mcsim1> hello. What does the following error message mean? "unable to adjust
+ libports thread priority: Operation not permitted" It appears when I set
+ translators.
+ <mcsim1> Seems to have some relation to libpthread. Also the following appeared
+ when I tried to remove a translator: "pthread_create: Resource temporarily
+ unavailable"
+ <mcsim1> Oh, first message appears very often, when I use translator I set.
+ <braunr> mcsim1: it's related to a recent patch i sent
+ <braunr> mcsim1: hurd servers attempt to increase their priority on startup
+ (when a thread is created actually)
+ <braunr> to reduce message floods and thread storms (such sweet names :))
+ <braunr> but if you start them as an unprivileged user, it fails, which is
+ ok, it's just a warning
+ <braunr> the second way is weird
+ <braunr> it normally happens when you're out of available virtual space,
+ not when shutting a translator down
+ <mcsim1> braunr: you mean this patch: libports: reduce thread starvation on
+ message floods?
+ <braunr> yes
+ <braunr> remember you're running on darnassus
+ <braunr> with a heavily modified hurd/glibc
+ <braunr> you can go back to the cthreads version if you wish
+ <mcsim1> it's better to check translators privileges, before attempting to
+ increase their priority, I think.
+ <braunr> no
+ <mcsim1> it's just a bit annoying
+ <braunr> privileges can be changed during execution
+ <braunr> well remove it
+ <mcsim1> But warning should not appear.
+ <braunr> what could be done is to limit the warning to one occurrence
+ <braunr> mcsim1: i prefer that it appears
+ <mcsim1> ok
+ <braunr> it's always better to be explicit and verbose
+ <braunr> well not always, but very often
+ <braunr> one of the reasons the hurd is so difficult to debug is the lack
+ of a "message server" à la dmesg
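The suggestion above to limit the priority warning to one occurrence can be sketched as follows. This is an illustrative Python model, not the libports code; `warn_once` and its `key` parameter are hypothetical names:

```python
import sys
import threading

_warned = set()
_warned_lock = threading.Lock()


def warn_once(key, message, stream=None):
    """Print `message` the first time `key` is seen; return True if printed."""
    with _warned_lock:
        if key in _warned:
            return False  # this warning was already reported
        _warned.add(key)
    print(message, file=stream if stream is not None else sys.stderr)
    return True
```

Every new thread could then call `warn_once("priority", "unable to adjust libports thread priority: Operation not permitted")`, and the message would stay visible without being repeated for each thread.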
diff --git a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn
new file mode 100644
index 00000000..37231c66
--- /dev/null
+++ b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn
@@ -0,0 +1,21 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!tag open_issue_libpthread]]
+# IRC, freenode, #hurd, 2012-08-30
+ <braunr> tschwinge: this issue needs more cooperation with the kernel
+ <braunr> tschwinge: i.e. the ability to tell the kernel where the stack is,
+ so it's unmapped when the thread dies
+ <braunr> which requiring another thread to perform this deallocation
diff --git a/open_issues/libpthread_CLOCK_MONOTONIC.mdwn b/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
index 86a613d3..22b2cd3b 100644
--- a/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
+++ b/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
@@ -103,3 +103,11 @@ License|/fdl]]."]]"""]]
<pinotree> it'll be safe when implementing some private
__hurd_clock_get{time,res} in libc proper, making librt just forward to
it and adapting the gettimeofday to use it
+## IRC, freenode, #hurd, 2012-10-22
+ <pinotree> youpi: apparently somebody in glibc land is indirectly solving
+ our "libpthread needs librt which pulls libpthread" circular issue by
+ moving the clock_* functions to libc proper
+ <youpi> I've seen that yes :)
diff --git a/open_issues/libpthread_timeout_dequeue.mdwn b/open_issues/libpthread_timeout_dequeue.mdwn
new file mode 100644
index 00000000..5ebb2e11
--- /dev/null
+++ b/open_issues/libpthread_timeout_dequeue.mdwn
@@ -0,0 +1,22 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!tag open_issue_libpthread]]
+# IRC, freenode, #hurd, 2012-08-17
+ <braunr> pthread_cond_timedwait and pthread_mutex_timedlock *can* produce
+ segfaults in our implementation
+ <braunr> if a timeout happens, but before the thread dequeues itself,
+ another tries to wake it, it will be dequeued twice
+ <braunr> this is the issue i spent a week on when working on fixing select
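The double dequeue described above can be avoided if unlinking a waiter is idempotent, so that whichever of the timeout path and the wakeup path arrives second becomes a no-op. A minimal Python sketch under that assumption; `Waiter` and `WaitQueue` are hypothetical names and this is not the actual libpthread fix:

```python
import threading


class Waiter:
    """One blocked thread; `queued` records whether it is still linked."""

    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None
        self.queued = False


class WaitQueue:
    """Hypothetical doubly-linked wait queue with an idempotent dequeue."""

    def __init__(self):
        self._lock = threading.Lock()
        self._head = None

    def enqueue(self, waiter):
        """Link `waiter` at the head of the queue."""
        with self._lock:
            waiter.prev = None
            waiter.next = self._head
            if self._head is not None:
                self._head.prev = waiter
            self._head = waiter
            waiter.queued = True

    def dequeue(self, waiter):
        """Unlink `waiter` once; return False if someone already did."""
        with self._lock:
            if not waiter.queued:
                return False  # lost the race: timeout or wakeup got here first
            if waiter.prev is not None:
                waiter.prev.next = waiter.next
            else:
                self._head = waiter.next
            if waiter.next is not None:
                waiter.next.prev = waiter.prev
            waiter.prev = waiter.next = None
            waiter.queued = False
            return True

    def names(self):
        """Return the names of the queued waiters, head first."""
        with self._lock:
            out, node = [], self._head
            while node is not None:
                out.append(node.name)
                node = node.next
            return out
```

Without the `queued` check, the second unlink would follow stale `prev`/`next` pointers, which in C corresponds to the kind of memory corruption that shows up as a segfault.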
diff --git a/open_issues/mach_federations.mdwn b/open_issues/mach_federations.mdwn
new file mode 100644
index 00000000..50c939c3
--- /dev/null
+++ b/open_issues/mach_federations.mdwn
@@ -0,0 +1,66 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!tag open_issue_documentation]]
+# IRC, freenode, #hurd, 2012-08-18
+ <braunr> well replacing parts of it is possible on the hurd, but for core
+ servers it's limited
+ <braunr> minix has features for that
+ <braunr> this was interesting too:
+ <braunr> lcc: you'll always have some kind of dependency problems which are
+ hard to solve
+ <savask> braunr: One of my friends asked me if it's possible to run different
+ parts of Hurd on different computers and make a cluster therefore. So, is
+ it, at least theoretically?
+ <braunr> savask: no
+ <savask> Okay, then I guessed a right answer.
+ <youpi> well, theoretically it's possible, but it's not implemented
+ <braunr> well it's possible everywhere :p
+ <braunr> there are projects for that on linux
+ <braunr> but it requires serious changes in both the protocols and servers
+ <braunr> and it depends on the features you want (i assume here you want
+ e.g. process checkpointing so they can be migrated to other machines to
+ transparently balance loads)
+ <lcc> is it even theoretically possible to have a system in which core
+ servers can be modified while the system is running? hm... I will look
+ more into it. just curious.
+ <savask> lcc: Linux can be updated on the fly, without rebooting.
+ <braunr> lcc: to some degree, it is
+ <braunr> savask: the whole kernel is rebooted actually
+ <braunr> well not rebooted, but restarted
+ <braunr> there is a project that provides kernel updates through binary
+ patches
+ <braunr> ksplice
+ <savask> braunr: But it will look like everything continued running.
+ <braunr> as long as the new code expects the same data structures and other
+ implications, yes
+ <braunr> "Ksplice can handle many security updates but not changes to data
+ structures"
+ <braunr> obviously
+ <braunr> so it's good for small changes
+ <braunr> and ksplice is very specific, it's intended for security updates,
+ and the primary users are telecommunication providers who don't want
+ downtime
+ <antrik> braunr: well, protocols and servers on Mach-based systems should
+ be ready for federations... although some Hurd protocols are not clean
+ for federations with heterogeneous architectures, at least on homogeneous
+ clusters it should actually work with only some extra bootstrapping code,
+ if the support existed in our Mach variant...
+ <braunr> antrik: why do you want the support in the kernel ?
+ <antrik> braunr: I didn't say I *want* federation support in the
+ kernel... in fact I agree with Shapiro that it's probably a bad idea. I
+ just said that it *should* actually work with the system design as it is
+ now :-)
+ <antrik> braunr: yes, I said that it wouldn't work on heterogeneous
+ federations. if all machines use the same architecture it should work.
diff --git a/open_issues/mach_on_top_of_posix.mdwn b/open_issues/mach_on_top_of_posix.mdwn
index 7574feb0..a3e47685 100644
--- a/open_issues/mach_on_top_of_posix.mdwn
+++ b/open_issues/mach_on_top_of_posix.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -14,3 +14,5 @@ License|/fdl]]."]]"""]]
At the beginning of the 2000s, there was a *Mach on Top of POSIX* port started
by John Edwin Tobey. Status unknown. Ask [[tschwinge]] for the source code.
+See also [[implementing_hurd_on_top_of_another_system]].
diff --git a/open_issues/mach_shadow_objects.mdwn b/open_issues/mach_shadow_objects.mdwn
new file mode 100644
index 00000000..0669041a
--- /dev/null
+++ b/open_issues/mach_shadow_objects.mdwn
@@ -0,0 +1,24 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!tag open_issue_gnumach]]
+See also [[gnumach_vm_map_entry_forward_merging]].
+# IRC, freenode, #hurd, 2012-11-16
+ <mcsim> hi. do I understand correct that following is true: vm_object_t a;
+ a->shadow->copy == a;?
+ <braunr> mcsim: not completely sure, but i'd say no
+ <braunr> but mach terminology isn't always firm, so possible
+ <braunr> mcsim: apparently you're right, although be careful that it may
+ not be the case *all* the time
+ <braunr> there may be inconsistent states
diff --git a/open_issues/multithreading.mdwn b/open_issues/multithreading.mdwn
index c9567828..f42601b4 100644
--- a/open_issues/multithreading.mdwn
+++ b/open_issues/multithreading.mdwn
@@ -134,6 +134,75 @@ Tom Van Cutsem, 2009.
<braunr> (i still strongly believe those shouldn't be used at all)
+## IRC, freenode, #hurd, 2012-08-31
+ <braunr> and the hurd is all but scalable
+ <gnu_srs> I thought scalability was built-in already, at least for hurd??
+ <braunr> built in ?
+ <gnu_srs> designed in
+ <braunr> i guess you think that because you read "aggressively
+ multithreaded" ?
+ <braunr> well, a system that is unable to control the amount of threads it
+ creates for no valid reason and uses global lock about everywhere isn't
+ really scalable
+ <braunr> it's not smp nor memory scalable
+ <gnu_srs> most modern OSes have multi-cpu support.
+ <braunr> that doesn't mean they scale
+ <braunr> bsd sucks in this area
+ <braunr> it got better in recent years but they're way behind linux
+ <braunr> linux has this magic thing called rcu
+ <braunr> and i want that in my system, from the beginning
+ <braunr> and no, the hurd was never designed to scale
+ <braunr> that's obvious
+ <braunr> a very common mistake of the early 90s
+## IRC, freenode, #hurd, 2012-09-06
+ <braunr> mel-: the problem with such a true client/server architecture is
+ that the scheduling context of clients is not transferred to servers
+ <braunr> mel-: and the hurd creates threads on demand, so if it's too slow
+ to process requests, more threads are spawned
+ <braunr> to prevent hurd servers from creating too many threads, they are
+ given a higher priority
+ <braunr> and it causes increased latency for normal user applications
+ <braunr> a better way, which is what modern synchronous microkernel based
+ systems do
+ <braunr> is to transfer the scheduling context of the client to the server
+ <braunr> the server thread behaves like the client thread from the
+ scheduler perspective
+ <gnu_srs> how can creating more threads ease the slowness, is that a design
+ decision??
+ <mel-> what would be needed to implement this?
+ <braunr> mel-: thread migration
+ <braunr> gnu_srs: is that what i wrote ?
+ <mel-> does mach support it?
+ <braunr> mel-: some versions do yes
+ <braunr> mel-: not ours
+ <gnu_srs> 21:49:03) braunr: mel-: and the hurd creates threads on demand,
+ so if it's too slow to process requests, more threads are spawned
+ <braunr> of course it's a design decision
+ <braunr> it doesn't "ease the slowness"
+ <braunr> it makes servers able to use multiple processors to handle
+ requests
+ <braunr> but it's a wrong design decision as the number of threads is
+ completely unchecked
+ <gnu_srs> what's the idea of creating more threads then, multiple cpus is
+ not supported?
+ <braunr> it's a very old decision taken at a time when systems and machines
+ were very different
+ <braunr> mach used to support multiple processors
+ <braunr> it was expected gnumach would do so too
+ <braunr> mel-: but getting thread migration would also require us to adjust
+ our threading library and our servers
+ <braunr> it's not an easy task at all
+ <braunr> and it doesn't fix everything
+ <braunr> thread migration on mach is an optimization
+ <mel-> interesting
+ <braunr> async ipc remains available, which means notifications, which are
+ async by nature, will create message floods anyway
# Alternative approaches:
* <>
diff --git a/open_issues/packaging_libpthread.mdwn b/open_issues/packaging_libpthread.mdwn
index 528e0b01..2d90779e 100644
--- a/open_issues/packaging_libpthread.mdwn
+++ b/open_issues/packaging_libpthread.mdwn
@@ -187,3 +187,60 @@ License|/fdl]]."]]"""]]
<youpi> the slibdir change, however, is odd
<youpi> it must be a leftover
+# IRC, freenode, #hurd, 2012-11-16
+ <pinotree> *** $(common-objpfx)resolv/gai_suspend.o: uses
+ /usr/include/i386-gnu/bits/pthread.h
+ <pinotree> so the ones in the libpthread addon are not used...
+ <tschwinge> pinotree: The latter at leash should be useful information.
+ <pinotree> tschwinge: i'm afraid i didn't get you :) what are you referring
+ to?
+ <tschwinge> pinotree: s%leash%least -- what I meant was that it's actually a
+ real bug that the in-tree libpthread addon include files are not being
+ used.
+ <pinotree> tschwinge: ah sure -- basically, the stuff in
+ libpthread/sysdeps/generic are not used at all
+ <pinotree> (glibc only uses generic for glibc/sysdeps/generic)
+ <pinotree> tschwinge: i might have an idea how to fix it: moving the
+ contents from libpthread/sysdeps/generic to libpthread/sysdeps/pthread,
+ and that would depend on one of the latest libpthread patches i sent
+# libihash
+## IRC, freenode, #hurd, 2012-11-16
+ <pinotree> also, libpthread uses hurd's ihash
+ <tschwinge> Yes, I already thought a little bit about the ihash thing. I
+ basically see two options: move ihash into glibc ((probably?) not as a
+ public interface, though), or have libpthread use one of the hash
+ implementations that surely are already present in glibc.
+ <tschwinge> My notes say:
+ <tschwinge> * include/inline-hashtab.h
+ <tschwinge> * locale/programs/simple-hash.h
+ <tschwinge> * misc/hsearch_r.c
+ <tschwinge> * NNS; cf. f46f0abfee5a2b34451708f2462a1c3b1701facd
+ <tschwinge> No idea whether they're equivalent/usable.
+ <pinotree> interesting
+ <tschwinge> And no immediate recollection what NNS is;
+ f46f0abfee5a2b34451708f2462a1c3b1701facd is not a glibc commit after all.
+ ;-)
+ <tschwinge> Oh, and: libiberty: `hashtab.c`
+ <pinotree> hmm, but then you would need to properly ifdef the libpthread
+ hash usage (iirc only for pthread keys) depending on whether it's in
+ glibc or standalone
+ <pinotree> but that shouldn't be an ussue, i guess
+ <pinotree> *issue
+ <tschwinge> No that'd be fine.
+ <tschwinge> My understanding is that the long-term goal (well, not so
+ long-term, actually) is to completely move libpthread into glibc.
+ <pinotree> ie have it buildable only as a glibc addon?
+ <tschwinge> Yes.
+ <tschwinge> No need for more than one mechanism for building it, I think.
+ <tschwinge> Hmm, this doesn't bring us any further:
+ <pinotree> yay for acronyms ;)
+ <tschwinge> So, if someone figures out what NNS and this commit id are: one
+ beer. ;-)
diff --git a/open_issues/performance.mdwn b/open_issues/performance.mdwn
index ec14fa52..8147e5eb 100644
--- a/open_issues/performance.mdwn
+++ b/open_issues/performance.mdwn
@@ -81,3 +81,34 @@ call|/glibc/fork]]'s case.
gnumach and the hurd) just wake every thread waiting for an event when
the event occurs (there are a few exceptions, but not many)
<antrik> ouch
+# IRC, freenode, #hurd, 2012-09-13
+ <braunr> the phoronix benchmarks don't actually test the operating system
+ ..
+ <hroi_> braunr: well, at least it tests its ability to run programs for
+ those particular tasks
+ <braunr> exactly, it tests how programs that don't make much use of the
+ operating system run
+ <braunr> well yes, we can run programs :)
+ <pinotree> those are just cpu-taking tasks
+ <hroi_> ok
+ <pinotree> if you do a benchmark with also i/o, you can see how it is
+ (quite) slower on hurd
+ <hroi_> perhaps they should have run 10 of those programs in parallel, that
+ would test the kernel multitasking I suppose
+ <braunr> not even I/O, simply system calls
+ <braunr> no, multitasking is ok on the hurd
+ <braunr> and it's very similar to what is done on other systems, which
+ hasn't changed much for a long time
+ <braunr> (except for multiprocessor)
+ <braunr> true OS benchmarks measure system calls
+ <hroi_> ok, so I'm sensing the view that the actual OS kernel architecture
+ doesn't really make that much difference, good software does
+ <braunr> not at all
+ <braunr> i'm only saying that the phoronix benchmark results are useless
+ <braunr> because they didn't measure the right thing
+ <hroi_> ok
diff --git a/open_issues/performance/io_system/read-ahead.mdwn b/open_issues/performance/io_system/read-ahead.mdwn
index 657318cd..706e1632 100644
--- a/open_issues/performance/io_system/read-ahead.mdwn
+++ b/open_issues/performance/io_system/read-ahead.mdwn
@@ -1845,3 +1845,714 @@ License|/fdl]]."]]"""]]
<braunr> but that's one way to do it
<braunr> defaults work well too
<braunr> as shown in other implementations
+## IRC, freenode, #hurd, 2012-08-09
+ <mcsim> braunr: I'm still debugging ext2 with large storage patch
+ <braunr> mcsim: tough problems ?
+ <mcsim> braunr: The same issues as I always meet when do debugging, but it
+ takes time.
+ <braunr> mcsim: so nothing blocking so far ?
+ <mcsim> braunr: I can't tell you for sure that I will finish up to 13th of
+ August and this is unofficial pencil down date.
+ <braunr> all right, but are you blocked ?
+ <mcsim> braunr: If you mean the issues that I can not even imagine how to
+ solve than there is no ones.
+ <braunr> good
+ <braunr> mcsim: i'll try to review your code again this week end
+ <braunr> mcsim: make sure to commit everything even if it's messy
+ <mcsim> braunr: ok
+ <mcsim> braunr: I made changes to defpager, but I haven't tried
+ them. Commit them too?
+ <braunr> mcsim: sure
+ <braunr> mcsim: does it work fine without the large storage patch ?
+ <mcsim> braunr: looks fine, but TBH I can't even run such things like fsx,
+ because even without my changes it failed mightily at once.
+ <braunr> mcsim: right, well, that will be part of another task :)
+## IRC, freenode, #hurd, 2012-08-13
+ <mcsim> braunr: hello. Seems ext2fs with large store patch works.
+## IRC, freenode, #hurd, 2012-08-19
+ <mcsim> hello. Consider such situation. There is a page fault and kernel
+ decided to request pager for several pages, but at the moment pager is
+ able to provide only first pages, the rest ones are not known yet. Is it
+ possible to supply only one page and regarding rest ones tell the kernel
+ something like: "Rest pages try again later"?
+ <mcsim> I tried pager_data_unavailable && pager_flush_some, but this seems
+ does not work.
+ <mcsim> Or I have to supply something anyway?
+ <braunr> mcsim: better not provide them
+ <braunr> the kernel only really needs one page
+ <braunr> don't try to implement "try again later", the kernel will do that
+ if other page faults occur for those pages
+ <mcsim> braunr: No, translator just hangs
+ <braunr> ?
+ <mcsim> braunr: And I even can't deattach it without reboot
+ <braunr> hangs when what
+ <braunr> ?
+ <braunr> i mean, what happens when it hangs ?
+ <mcsim> If kernel request 2 pages and I provide one, than when page fault
+ occurs in second page translator hangs.
+ <braunr> well that's a bug
+ <braunr> clustered pager transfer is a mere optimization, you shouldn't
+ transfer more than you can just to satisfy some requested size
+ <mcsim> I think that it because I create fictitious pages before calling
+ mo_data_request
+ <braunr> as placeholders ?
+ <mcsim> Yes. Is it correct if I will not grab fictitious pages?
+ <braunr> no
+ <braunr> i don't know the details well enough about fictitious pages
+ unfortunately, but it really feels wrong to use them where real physical
+ pages should be used instead
+ <braunr> normally, an in-transfer page is simply marked busy
+ <mcsim> But If page is already marked busy kernel will not ask it another
+ time.
+ <braunr> when the pager replies, you unbusy them
+ <braunr> your bug may be that you incorrectly use pmap
+ <braunr> you shouldn't create mmu mappings for pages you didn't receive
+ from the pagers
+ <mcsim> I don't create them
+ <braunr> ok so you correctly get the second page fault
+ <mcsim> If pager supplies only first pages, when asked were two, than
+ second page will not become un-busy.
+ <braunr> that's a bug
+ <braunr> your code shouldn't assume the pager will provide all the pages it
+ was asked for
+ <braunr> only the main one
+ <mcsim> Will it be ok if I will provide special attribute that will keep
+ information that page has been advised?
+ <braunr> what for ?
+ <braunr> i don't understand "page has been advised"
+ <mcsim> Advised page is page that is asked in cluster, but there wasn't a
+ page fault in it.
+ <mcsim> I need this attribute because if I don't inform kernel about this
+ page anyhow, than kernel will not change attributes of this page.
+ <braunr> why would it change its attributes ?
+ <mcsim> But if page fault will occur in page that was asked than page will
+ be already busy by the moment.
+ <braunr> and what attribute ?
+ <mcsim> advised
+ <braunr> i'm lost
+ <braunr> 08:53 < mcsim> I need this attribute because if I don't inform
+ kernel about this page anyhow, than kernel will not change attributes of
+ this page.
+ <braunr> you need the advised attribute because if you don't inform the
+ kernel about this page, the kernel will not change the advised attribute
+ of this page ?
+ <mcsim> Not only advised, but busy as well.
+ <mcsim> And if page fault will occur in this page, kernel will not ask it
+ second time. Kernel will just block.
+ <braunr> well that's normal
+ <mcsim> But if kernel will block and pager is not going to report somehow
+ about this page, than translator will hang.
+ <braunr> but the pager is going to report
+ <braunr> and in this report, there can be less pages then requested
+ <mcsim> braunr: You told not to report
+ <braunr> the kernel can deduce it didn't receive all the pages, and mark
+ them unbusy anyway
+ <braunr> i told not to transfer more than requested
+ <braunr> but not sending data can be a form of communication
+ <braunr> i mean, sending a message in which data is missing
+ <braunr> it simply means its not there, but this info is sufficient for the
+ kernel
+ <mcsim> hmmm... Seems I understood you. Let me try something.
+ <mcsim> braunr: I informed kernel about missing page as follows:
+ pager_data_supply (pager, precious, writelock, i, 1, NULL, 0); Am I
+ right?
+ <braunr> i don't know the interface well
+ <braunr> what does it mean
+ <braunr> ?
+ <braunr> are you passing NULL as the data for a missing page ?
+ <mcsim> yes
+ <braunr> i see
+ <braunr> you shouldn't need a request for that though, avoiding useless ipc
+ is a good thing
+ <mcsim> i is number of page, 1 is quantity
+ <braunr> but if you can't find a better way for now, it will do
+ <mcsim> But this does not work :(
+ <braunr> that's a bug
+ <braunr> in your code probably
+ <mcsim> braunr: supplying NULL as data returns MACH_SEND_INVALID_MEMORY
+ <braunr> but why would it work ?
+ <braunr> mach expects something
+ <braunr> you have to change that
+ <mcsim> It's mig who refuses data. Mach does not even get the call.
+ <braunr> hum
+ <mcsim> That's why I propose to provide new attribute, that will keep
+ information regarding whether the page was asked as advice or not.
+ <braunr> i still don't understand why
+ <braunr> why don't you fix mig so you can your null message instead ?
+ <braunr> +send
+ <mcsim> braunr: because usually this is an error
+ <braunr> the kernel will decide if it's an erro
+ <braunr> r
+ <braunr> what kind of reply do you intend to send the kernel with for these
+ "advised" pages ?
+ <mcsim> no reply. But when page fault will occur in busy page and it will
+ be also advised, kernel will not block, but ask this page another time.
+ <mcsim> And how kernel will know that this is an error or not?
+ <braunr> why ask another time ?!
+ <braunr> you really don't want to flood pagers with useless messages
+ <braunr> here is how it should be
+ <braunr> 1/ the kernel requests pages from the pager
+ <braunr> it know the range
+ <braunr> 2/ the pager replies what it can, full range, subset of it, even
+ only one page
+ <braunr> 3/ the kernel uses what the pager replied, and unbusies the other
+ pages
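The three steps braunr lists can be sketched in C roughly as follows. This is a minimal model, not GNU Mach code: `struct page_slot`, `request_range` and `complete_request` are hypothetical names, and the reply is assumed to be a single contiguous sub-range of what was requested.

```c
#include <stdbool.h>
#include <stddef.h>

#define CLUSTER_MAX 16

/* Hypothetical per-page state; a stand-in for the real page structure. */
struct page_slot {
    bool busy;      /* a transfer is pending for this page */
    bool resident;  /* the pager supplied data for this page */
};

/* Step 1: the kernel marks every page of the requested range busy. */
static void request_range(struct page_slot *pages, size_t first, size_t count)
{
    for (size_t i = first; i < first + count; i++) {
        pages[i].busy = true;
        pages[i].resident = false;
    }
}

/*
 * Steps 2 and 3: the pager replied with the sub-range
 * [reply_first, reply_first + reply_count), possibly smaller than the
 * request. The kernel keeps what it received and unbusies everything
 * else, so a later fault on a missing page issues a fresh request
 * instead of blocking forever.
 * Returns false if the page the fault needs (main_page) was not supplied.
 */
static bool complete_request(struct page_slot *pages,
                             size_t first, size_t count,
                             size_t reply_first, size_t reply_count,
                             size_t main_page)
{
    for (size_t i = first; i < first + count; i++) {
        pages[i].resident = (i >= reply_first && i < reply_first + reply_count);
        pages[i].busy = false;
    }
    return pages[main_page].resident;
}
```

The point of the sketch is that the request range has to be remembered somewhere on the kernel side, which is exactly what mcsim says the current code does not do.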
+ <mcsim> First time page was asked because page fault occurred in
+ neighborhood. And second time because PF occurred in page.
+ <braunr> well it shouldn't
+ <braunr> or it should, but then you have a segfault
+ <mcsim> But kernel does not keep bound of range, that it asked.
+ <braunr> if the kernel can't find the main page, the one it needs to make
+ progress, it's a segfault
+ <mcsim> And this range could be supplied in several messages.
+ <braunr> absolutely not
+ <braunr> you defeat the purpose of clustered pageins if you use several
+ messages
+ <mcsim> But interface supports it
+ <braunr> interface supported single page transfers, doesn't mean it's good
+ <braunr> well, you could use several messages
+ <braunr> as what we really want is less I/O
+ <mcsim> Noone keeps bounds of requested range, so it couldn't be checked
+ that range was split
+ <braunr> but it would be so much better to do it all with as few messages
+ as possible
+ <braunr> does the kernel knows the main page ?
+ <braunr> know*
+ <mcsim> Splitting range is not optimal, but it's not an error.
+ <braunr> i assume it does
+ <braunr> doesn't it ?
+ <mcsim> no, that's why I want to provide new attribute.
+ <braunr> i'm sorry i'm lost again
+ <braunr> how does the kernel knows a page fault has been serviced ?
+ <braunr> know*
+ <mcsim> It receives an interrupt
+ <braunr> ?
+ <braunr> let's not mix terms
+ <mcsim> oh.. I read as received. Sorry
+ <mcsim> It get mo_data_supply message. Than it replaces fictitious pages
+ with real ones.
+ <braunr> so you get a message
+ <braunr> and you kept track of the range using fictitious pages
+ <braunr> use the busy flag instead, and another way to retain the range
+ <mcsim> I allocate fictitious pages to reserve place. Than if page fault
+ will occur in this page fictitious page kernel will not send another
+ mo_data_request call, it will wait until fictitious page unblocks.
+ <braunr> i'll have to check the code but it looks unoptimal to me
+ <braunr> we really don't want to allocate useless objects when a simple
+ busy flag would do
+ <mcsim> busy flag for what? There is no page yet
+ <braunr> we're talking about mo_data_supply
+ <braunr> actually we're talking about the whole page fault process
+ <mcsim> We can't mark nothing as busy, that's why kernel allocates
+ fictitious page and marks it as busy until real page would be supplied.
+ <braunr> what do you mean "nothing" ?
+ <mcsim> VM_PAGE_NULL
+ <braunr> uh ?
+ <braunr> when are physical pages allocated ?
+ <braunr> on request or on reply from the pager ?
+ <braunr> i'm reading mo_data_supply, and it looks like the page is already
+ busy at that time
+ <mcsim> they are allocated by pager and than supplied in reply
+ <mcsim> Yes, but these pages are fictitious
+ <braunr> show me please
+ <braunr> in the master branch, not yours
+ <mcsim> that page is fictitious?
+ <braunr> yes
+ <braunr> i'm referring to the way mach currently does things
+ <mcsim> vm/vm_fault.c:582
+ <braunr> that's memory_object_lock_page
+ <braunr> hm wait
+ <braunr> my bad
+ <braunr> ah that damn object chaining :/
+ <braunr> ok
+ <braunr> the original code is stupid enough to use fictitious pages all the
+ time, you probably have to do the same
+ <mcsim> hm... Attributes will be useless, pager should tell something about
+ pages, that it is not going to supply.
+ <braunr> yes
+ <braunr> that's what null is for
+ <mcsim> Not null, null is error.
+ <braunr> one problem i can think of is making sure the kernel doesn't
+ interpret missing as error
+ <braunr> right
+ <mcsim> I think better have special value for mo_data_error
+ <braunr> probably
+### IRC, freenode, #hurd, 2012-08-20
+ <antrik> braunr: I think it's useful to allow supplying the data in several
+ batches. the kernel should *not* assume that any data missing in the
+ first batch won't be supplied later.
+ <braunr> antrik: it really depends
+ <braunr> i personally prefer synchronous approaches
+ <antrik> demanding that all data is supplied at once could actually turn
+ readahead into a performance killer
+ <mcsim> antrik: Why? The only drawback I see is higher response time for
+ page fault, but it also leads to reduced overhead.
+ <braunr> that's why "it depends"
+ <braunr> mcsim: it brings benefit only if enough preloaded pages are
+ actually used to compensate for the time it took the pager to provide
+ them
+ <braunr> which is the case for many workloads (including sequential access,
+ which is the common case we want to optimize here)
+ <antrik> mcsim: the overhead of an extra RPC is negligible compared to
+ increased latencies when dealing with slow backing stores (such as disk
+ or network)
+ <mcsim> antrik: also many replies lead to fragmentation, while in one reply
+ all data is gathered in one bunch. If all data is placed consecutively,
+ than it may be transferred next time faster.
+ <braunr> mcsim: what kind of fragmentation ?
+ <antrik> I really really don't think it's a good idea for the page to hold
+ back the first page (which is usually the one actually blocking) while
+ it's still loading some other pages (which will probably be needed only
+ in the future anyways, if at all)
+ <antrik> err... for the pager to hold back
+ <braunr> antrik: then all pagers should be changed to handle asynchronous
+ data supply
+ <braunr> it's a bit late to change that now
+ <mcsim> there could be two cases of data placement in backing store: 1/ all
+ asked data is placed consecutively; 2/ it is spread among backing
+ store. If pager gets data in one message it more like place it
+ consecutively. So to have data consecutive in each pager, each pager has
+ to try send data in one message. Having data placed consecutive is
+ important, since reading of such data is much more faster.
+ <braunr> mcsim: you're confusing things ..
+ <braunr> or you're not telling them properly
+ <mcsim> Ok. Let me try one more time
+ <braunr> since you're working *only* on pagein, not pageout, how do you
+ expect spread pages being sent in a single message be better than
+ multiple messages ?
+ <mcsim> braunr: I think about future :)
+ <braunr> ok
+ <braunr> but antrik is right, paging in too much can reduce performance
+ <braunr> so the default policy should be adjusted for both the worst case
+ (one page) and the average/best (some/mane contiguous pages)
+ <braunr> through measurement ideally
+ <antrik> mcsim: BTW, I still think implementing clustered pageout has
+ higher priority than implementing madvise()... but if the latter is less
+ work, it might still make sense to do it first of course :-)
+ <braunr> many*
+ <braunr> there aren't many users of madvise, true
+ <mcsim> antrik: Implementing madvise I expect to be very simple. It should
+ just translate call to vm_advise
+ <antrik> well, that part is easy of course :-) so you already implemented
+ vm_advise itself I take it?
+ <mcsim> antrik: Yes, that was also quite easy.
+ <antrik> great :-)
+ <antrik> in that case it would be silly of course to postpone implementing
+ the madvise() wrapper. in other words: never mind my remark about
+ priorities :-)
+## IRC, freenode, #hurd, 2012-09-03
+ <mcsim> I try a test with ext2fs. It works, than I just recompile ext2fs
+ and it stops working, than I recompile it again several times and each
+ time the result is unpredictable.
+ <braunr> sounds like a concurrency issue
+ <mcsim> I can run the same test several times and ext2 works until I
+ recompile it. That's the problem. Could that be concurrency too?
+ <braunr> mcsim: without bad luck, yes, unless "several times" is a lot
+ <braunr> like several dozens of tries
+## IRC, freenode, #hurd, 2012-09-04
+ <mcsim> hello. I want to tell that ext2fs translator, that I work on,
+ replaced for my system old variant that processed only single pages
+ requests. And it works with partitions bigger than 2 Gb.
+ <mcsim> Probably I'm not far from the end.
+ <mcsim> But it's worth to mention that I didn't fix that nasty bug that I
+ told yesterday about.
+ <mcsim> braunr: That bug sometimes appears after recompilation of ext2fs
+ and always disappears after sync or reboot. Now I'm going to finish
+ defpager and test other translators.
+## IRC, freenode, #hurd, 2012-09-17
+ <mcsim> braunr: hello. Do you remember that you said that pager has to
+ inform kernel about appropriate cluster size for readahead?
+ <mcsim> I don't understand how kernel store this information, because it
+ does not know about such unit as "pager".
+ <mcsim> Can you give me an advice about how this could be implemented?
+ <youpi> mcsim: it can store it in the object
+ <mcsim> youpi: It too big overhead
+ <mcsim> youpi: at least from my pow
+ <mcsim> *pov
+ <braunr> mcsim: we discussed this already
+ <braunr> mcsim: there is no "pager" entity in the kernel, which is a defect
+ from my PoV
+ <braunr> mcsim: the best you can do is follow what the kernel already does
+ <braunr> that is, store this property per object
+ <braunr> we don't care much about the overhead for now
+ <braunr> my guess is there is already some padding, so the overhead is
+ likely to be amortized by this
+ <braunr> like youpi said
+ <mcsim> I remember that discussion, but I didn't get than whether there
+ should be only one or two values for all policies. Or each policy should
+ have its own values?
+ <mcsim> braunr: ^
+ <braunr> each policy should have its own values, which means it can be
+ implemented with a simple static array somewhere
+ <braunr> the information in each object is a policy selector, such as an
+ index in this static array
+ <mcsim> ok
+ <braunr> mcsim: if you want to minimize the overhead, you can make this
+ selector a char, and place it near another char member, so that you use
+ space that was previously used as padding by the compiler
+ <braunr> mcsim: do you see what i mean ?
+ <mcsim> yes
+ <braunr> good
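A minimal sketch of the layout braunr suggests, assuming a simplified object structure: a one-byte policy selector sits next to an existing `char` member and indexes a static table. The struct names, fields and cluster sizes are hypothetical and do not reflect the real `struct vm_object`.

```c
#include <stddef.h>

/* Hypothetical read-ahead policies; the values below are illustrative. */
enum ra_policy { RA_NORMAL, RA_RANDOM, RA_SEQUENTIAL, RA_NR_POLICIES };

struct ra_params {
    unsigned int cluster_pages;
};

/* One static, global set of parameters per policy. */
static const struct ra_params ra_table[RA_NR_POLICIES] = {
    [RA_NORMAL]     = { 4 },
    [RA_RANDOM]     = { 1 },
    [RA_SEQUENTIAL] = { 16 },
};

/* Simplified stand-in for an object before the change. */
struct object_before {
    void *pager_port;
    unsigned int ref_count;
    char can_persist;
    /* the compiler pads here up to the struct's alignment */
};

/* Same object with the selector placed in the former padding. */
struct object_after {
    void *pager_port;
    unsigned int ref_count;
    char can_persist;
    unsigned char ra_policy;   /* index into ra_table */
};

static unsigned int object_cluster_pages(const struct object_after *obj)
{
    return ra_table[obj->ra_policy].cluster_pages;
}
```

On common 32-bit and 64-bit ABIs both structs have the same size, so the selector costs no extra memory per object. Whether that holds for the real structure depends on its actual member layout.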
+## IRC, freenode, #hurd, 2012-09-17
+ <mcsim> hello. May I add function krealloc to slab.c?
+ <braunr> mcsim: what for ?
+ <mcsim> braunr: It is quite useful for creating dynamic arrays
+ <braunr> you don't want dynamic arrays
+ <mcsim> why?
+ <braunr> they're expensive
+ <braunr> try other data structures
+ <mcsim> more expensive than linked lists?
+ <braunr> depends
+ <braunr> but linked lists aren't the only other alternative
+ <braunr> that's why btrees and radix trees (basically trees of arrays)
+ exist
+ <braunr> the best general purpose data structure we have in mach is the red
+ black tree currently
+ <braunr> but always think about what you want to do with it
+ <mcsim> I want to store there sets of sizes for different memory
+ policies. I don't expect this array to be big. But for sure I can use
+ rbtree for it.
+ <braunr> why not a static array ?
+ <braunr> arrays are perfect for known data sizes
+ <mcsim> I expect from pager to supply its own sizes. So at the beginning in
+ this array is only default policy. When pager wants to supply it own
+ policy kernel lookups table of advice. If this policy is new set of sizes
+ then kernel creates new entry in table of advice.
+ <braunr> that would mean one set of sizes for each object
+ <braunr> why don't you make things simple first ?
+ <mcsim> Object stores only pointer to entry in this table.
+ <braunr> but there is no pager object shared by memory objects in the
+ kernel
+ <mcsim> I mean struct vm_object
+ <braunr> so that's what i'm saying, one set per object
+ <braunr> it's useless overhead
+ <braunr> i would really suggest using a global set of policies for now
+ <mcsim> Probably, I don't understand you. Where do you want to store this
+ static array?
+ <braunr> it's a global one
+ <mcsim> "for now"? It is not a problem to implement a table for local
+ advice, using either rbtree or dynamic array.
+ <braunr> it's useless overhead
+ <braunr> and it's not a single integer, you want a whole container per
+ object
+ <braunr> don't do anything fancy unless you know you really want it
+ <braunr> i'll link the netbsd code again as a very good example of how to
+ implement global policies that work more than decently for every file
+ system in this OS
+ <braunr>
+ <braunr> look for uvmadvice
+ <mcsim> But different translators have different demands. Thus changing of
+ global policy for one translator would have impact on behavior of another
+ one.
+ <braunr> i understand
+ <braunr> this isn't l4, or anything experimental
+ <braunr> we want something that works well for us
+ <mcsim> And this is acceptable?
+ <braunr> until you're able to demonstrate we need different policies, i'd
+ recommend not making things more complicated than they already are and
+ need to be
+ <braunr> why wouldn't it ?
+ <braunr> we've been discussing this a long time :/
+ <mcsim> because every process runs in isolated environment and the fact
+ that there is something outside this environment, that has no rights to
+ do that, does it surprises me.
+ <braunr> ?
+ <mcsim> ok. let me dip in uvm code. Probably my questions disappear
+ <braunr> i don't think it will
+ <braunr> you're asking about the system design here, not implementation
+ details
+ <braunr> with l4, there are as you'd expect well defined components
+ handling policies for address space allocation, or paging, or whatever
+ <braunr> but this is mach
+ <braunr> mach has a big shared global vm server with in kernel policies for
+ it
+ <braunr> so it's ok to implement a global policy for this
+ <braunr> and let's be pragmatic, if we don't need complicated stuff, why
+ would we waste time on this ?
+ <mcsim> It is not complicated.
+ <braunr> retaining a whole container for each object, whereas they're all
+ going to contain exactly the same stuff for years to come seems overly
+ complicated for me
+ <mcsim> I'm not going to create separate container for each object.
+ <braunr> i'm not following you then
+ <braunr> how can pagers upload their sizes in the kernel ?
+ <mcsim> I'm going to create a new container only for combination of cluster
+ sizes that are not present in table of advice.
+ <braunr> that's equivalent
+ <braunr> you're ruling out the default set, but that's just an optimization
+ <braunr> whenever a file system decides to use other sizes, the problem
+ will arise
+ <mcsim> Before creating a container I'm going to lookup a table. And only
+ than create
+ <braunr> a table ?
+ <mcsim> But there will be the same container for a huge bunch of objects
+ <braunr> how do you select it ?
+ <braunr> if it's a per pager container, remember there is no shared pager
+ object in the kernel, only ports to external programs
+ <mcsim> I'll give an example
+ <mcsim> Suppose there are only two policies. At the beginning we have table
+ {{random = 4096, sequential = 8192}}. Than pager 1 wants to add new
+ policy where random cluster size is 8192. He asks kernel to create it and
+ after this table will be following: {{random = 4096, sequential = 8192},
+ {random = 8192, sequential = 8192}}. If pager 2 wants to create the same
+ policy as pager 1, kernel will look up the table and will not create new
+ entry. So the table will be the same.
+ <mcsim> And each object has link to appropriate table entry
+ <braunr> i'm not sure how this can work
+ <braunr> how can pagers 1 and 2 know the sizes are the same for the same
+ policy ?
+ <braunr> (and actually they shouldn't)
+ <mcsim> For faster lookup there will be create hash keys for each entry
+ <braunr> what's the lookup key ?
+ <mcsim> They do not know
+ <mcsim> The kernel knows
+ <braunr> then i really don't understand
+ <braunr> and how do you select sizes based on the policy ?
+ <braunr> and how do you remove unused entries ?
+ <braunr> (ok this can be implemented with a simple ref counter)
+ <mcsim> "and how do you select sizes based on the policy ?" you mean at
+ page fault?
+ <braunr> yes
+ <mcsim> entry or object keeps pointer to appropriate entry in the table
+ <braunr> ok your per object data is a pointer to the table entry and the
+ policy is the index inside
+ <braunr> so you really need a ref counter there
+ <mcsim> yes
+ <braunr> and you need to maintain this table
+ <braunr> for me it's uselessly complicated
+ <mcsim> but this keeps design clear
+ <braunr> not for me
+ <braunr> i don't see how this is clearer
+ <braunr> it's just more powerful
+ <braunr> a power we clearly don't need now
+ <braunr> and in the following years
+ <braunr> in addition, i'm very worried about the potential problems this
+ can introduce
+ <mcsim> In fact I don't feel comfortable from the thought that one
+ translator can impact on behavior of another.
+ <braunr> simple example: the table is shared, it needs a lock, other data
+ structures you may have added in your patch may also need a lock
+ <braunr> but our locks are noop for now, so you just can't be sure there is
+ no deadlock or other issues
+ <braunr> and adding smp is a *lot* more important than being able to select
+ precisely policy sizes that we're very likely not to change a lot
+ <braunr> what do you mean by "one translator can impact another" ?
+ <mcsim> As I understand your idea (I haven't read uvm code yet) that there
+ is a global table of cluster sizes for different policies. And every
+ translator can change values in this table. That is what I mean under one
+ translator will have an impact on another one.
+ <braunr> absolutely not
+ <braunr> translators *can't* change sizes
+ <braunr> the sizes are completely static, assumed to be fit all
+ <braunr> -be
+ <braunr> it's not optimial but it's very simple and effective in practice
+ <braunr> optimal*
+ <braunr> and it's not a table of cluster sizes
+ <braunr> it's a table of pages before/after the faulted one
+ <braunr> this reflects the fact that in mach, virtual memory (implementation
+ and policy) is in the kernel
+ <braunr> translators must not be able to change that
+ <braunr> let's talk about pagers here, not translators
+ <mcsim> Finally I got you. This is an acceptable tradeoff.
+ <braunr> it took some time :)
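The "table of pages before/after the faulted one" can be pictured with the sketch below. It is only loosely modelled on the idea braunr describes: the type names, the `nback`/`nforw` fields and the numbers are assumptions for illustration, not values taken from NetBSD or GNU Mach.

```c
#include <stddef.h>

enum advice { ADV_NORMAL, ADV_RANDOM, ADV_SEQUENTIAL, ADV_COUNT };

/* Pages to fetch before and after the faulted page. */
struct advice_window {
    size_t nback;
    size_t nforw;
};

/* Static and global: pagers cannot modify it. Values are illustrative. */
static const struct advice_window advice_table[ADV_COUNT] = {
    [ADV_NORMAL]     = { 3, 4 },
    [ADV_RANDOM]     = { 0, 0 },
    [ADV_SEQUENTIAL] = { 8, 7 },
};

struct page_range {
    size_t first;
    size_t count;
};

/*
 * Compute the range of pages to request around fault_page, clamped to
 * the object. Assumes object_pages > 0 and fault_page < object_pages.
 */
static struct page_range fault_window(enum advice adv, size_t fault_page,
                                      size_t object_pages)
{
    const struct advice_window *w = &advice_table[adv];
    size_t first = fault_page > w->nback ? fault_page - w->nback : 0;
    size_t last = fault_page + w->nforw;    /* inclusive */

    if (last >= object_pages)
        last = object_pages - 1;

    return (struct page_range){ first, last - first + 1 };
}
```

With these example values, a fault on page 10 of a 100-page object under `ADV_NORMAL` would request pages 7 through 14, while `ADV_RANDOM` would request only page 10.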
+ <braunr> just to clear something
+ <braunr> 20:12 < mcsim> For faster lookup there will be create hash keys
+ for each entry
+ <braunr> i'm not sure i understand you here
+ <mcsim> To found out if there is such policy (set of sizes) in the table we
+ can lookup every entry and compare each value. But it is better to create
+ a hash value for set and thus find equal policies.
+ <braunr> first, i'm really not comfortable with hash tables
+ <braunr> they really need careful configuration
+ <braunr> next, as we don't expect many entries in this table, there is
+ probably no need for this overhead
+ <braunr> remember that one property of tables is locality of reference
+ <braunr> you access the first entry, the processor automatically fills a
+ whole cache line
+ <braunr> so if your table fits on just a few, it's probably faster to
+ compare entries completely than to jump around in memory
+ <mcsim> But we can sort hash keys, and in this way find policies quickly.
+ <braunr> cache misses are way slower than computation
+ <braunr> so unless you have massive amounts of data, don't use an optimized
+ container
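For comparison, the find-or-add table mcsim proposed could use the plain linear scan braunr recommends for small tables, as in this sketch. All names are hypothetical, there is no locking, and the conversation ends up preferring a fixed static table over this design.

```c
#include <stddef.h>

#define POLICY_TABLE_MAX 8

struct size_set {
    unsigned int random_bytes;
    unsigned int sequential_bytes;
};

struct policy_entry {
    struct size_set sizes;
    unsigned int refs;      /* objects currently pointing at this entry */
};

/* Entry 0 is the default set of sizes and is never released. */
static struct policy_entry policy_table[POLICY_TABLE_MAX] = {
    { { 4096, 8192 }, 1 },
};
static size_t policy_count = 1;

/*
 * Return the index of the entry matching s, adding it if absent.
 * Entries are compared field by field: with only a handful of entries
 * the whole table fits in a few cache lines, so no hashing is needed.
 * Returns -1 when the table is full.
 */
static int policy_get(struct size_set s)
{
    for (size_t i = 0; i < policy_count; i++) {
        if (policy_table[i].sizes.random_bytes == s.random_bytes
            && policy_table[i].sizes.sequential_bytes == s.sequential_bytes) {
            policy_table[i].refs++;
            return (int)i;
        }
    }

    if (policy_count == POLICY_TABLE_MAX)
        return -1;

    policy_table[policy_count].sizes = s;
    policy_table[policy_count].refs = 1;
    return (int)policy_count++;
}
```

Two pagers asking for the same sizes end up sharing one entry, as in mcsim's example. A real version would also need a release path and a lock around the table, which is part of braunr's objection.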
+ <mcsim> (20:38:53) braunr: that's why btrees and radix trees (basically
+ trees of arrays) exist
+ <mcsim> and what will be the key?
+ <braunr> i'm not saying to use a tree instead of a hash table
+ <braunr> i'm saying, unless you have many entries, just use a simple table
+ <braunr> and since pagers don't add and remove entries from this table
+ often, it's on case reallocation is ok
+ <braunr> one*
+ <mcsim> So here dynamic arrays fit the most?
+ <braunr> probably
+ <braunr> it really depends on the number of entries and the write ratio
+ <braunr> keep in mind current processors have 32-bits or (more commonly)
+ 64-bits cache line sizes
+ <mcsim> bytes probably?
+ <braunr> yes bytes
+ <braunr> but i'm not willing to add a realloc like call to our general
+ purpose kernel allocator
+ <braunr> i don't want to make it easy for people to rely on it, and i hope
+ the lack of it will make them think about other solutions instead :)
+ <braunr> and if they really want to, they can just use alloc/free
+ <mcsim> Under "other solutions" you mean trees?
+ <braunr> i mean anything else :)
+ <braunr> lists are simple, trees are elegant (but add non negligible
+ overhead)
+ <braunr> i like trees because they truly "gracefully" scale
+ <braunr> but they're still O(log n)
+ <braunr> a good hash table is O(1), but must be carefully measured and
+ adjusted
+ <braunr> there are many other data structures, many of them you can find in
+ linux
+ <braunr> but in mach we don't need a lot of them
+ <mcsim> Your favorite data structures are lists and trees. Next, what
+ should you claim, is that lisp is your favorite language :)
+ <braunr> functional programming should eventually rule the world, yes
+ <braunr> i wouldn't count lists are my favorite, which are really trees
+ <braunr> as*
+ <braunr> there is a reason why red black trees back higher level data
+ structures like vectors or maps in many common libraries ;)
+ <braunr> mcsim: hum but just to make it clear, i asked this question about
+ hashing because i was curious about what you had in mind, i still think
+ it's best to use static predetermined values for policies
+ <mcsim> braunr: I understand this.
+ <braunr> :)
+ <mcsim> braunr: Yeah. You should be cautious with me :)
+## IRC, freenode, #hurd, 2012-09-21
+ <antrik> mcsim: there is only one cluster size per object -- it depends on
+ the properties of the backing store, nothing else.
+ <antrik> (while the readahead policies depend on the use pattern of the
+ application, and thus should be selected per mapping)
+ <antrik> but I'm still not convinced it's worthwhile to bother with cluster
+ size at all. do other systems even do that?...
+## IRC, freenode, #hurd, 2012-09-23
+ <braunr> mcsim: how long do you think it will take you to polish your gsoc
+ work ?
+ <braunr> (and when before you begin that part actually, because we'll have to
+ review the whole stuff prior to polishing it)
+ <mcsim> braunr: I think about 2 weeks
+ <mcsim> But you may already start review it, if you're intended to do it
+ before I'll rearrange commits.
+ <mcsim> Gnumach, ext2fs and defpager are ready. I just have to polish the
+ code.
+ <braunr> mcsim: i don't know when i'll be able to do that
+ <braunr> so expect a few weeks on my (our) side too
+ <mcsim> ok
+ <braunr> sorry for being slow, that's how hurd development is :)
+ <mcsim> What should I do with libc patch that adds madvise support?
+ <mcsim> Post it to bug-hurd?
+ <braunr> hm probably the same i did for pthreads, create a topic branch in
+ glibc.git
+ <mcsim> there is only one commit
+ <braunr> yes
+ <braunr> (mine was a one liner :p)
+ <mcsim> ok
+ <braunr> it will probably be a debian patch before going into glibc anyway,
+ just for making sure it works
+ <mcsim> But according to term. I expect that my study begins in a week and
+ I'll have to do some stuff then, so actually probably I'll need a week
+ more.
+ <braunr> don't worry, that's expected
+ <braunr> and that's the reason why we're slow
+ <mcsim> And what should I do with large store patch?
+ <braunr> hm good question
+ <braunr> what did you do for now ?
+ <braunr> include it in your work ?
+ <braunr> that's what i saw iirc
+ <mcsim> Yes. It consists of two parts.
+ <braunr> the original part and the modificaionts ?
+ <braunr> modifications*
+ <braunr> i think youpi would know better about that
+ <mcsim> First (small) adds notification to libpager interface and second
+ one adds support for large stores.
+ <braunr> i suppose we'll probably merge the large store patch at some point
+ anyway
+ <mcsim> Yes both original and modifications
+ <braunr> good
+ <mcsim> I'll split these parts to different commits and I'll try to make
+ support for large stores independent from other work.
+ <braunr> that would be best
+ <braunr> if you can make it so that, by omitting (or including) one patch,
+ we can add your patches to the debian package, it would be great
+ <braunr> (only with regard to the large store change, not other potential
+ smaller conflicts)
+ <mcsim> braunr: I also found several bugs in defpager, that I haven't fixed
+ since winter.
+ <braunr> oh
+ <mcsim> seems nobody hasn't expect them.
+ <braunr> i'm very interested in those actually (not too soon because it
+ concerns my work on pageout, which is postponed after pthreads and
+ select)
+ <mcsim> ok. then I'll do it first.
+## IRC, freenode, #hurd, 2012-09-24
+ <braunr> mcsim: what is vm_get_advice_info ?
+ <mcsim> braunr: hello. It should supply some machine specific parameters
+ regarding clustered reading. At the moment it supplies only maximal
+ possible size of cluster.
+ <braunr> mcsim: why such a need ?
+ <mcsim> It is used by defpager, as it can't allocate memory dynamically and
+ every thread has to allocate maximal size beforehand
+ <braunr> mcsim: i see
+## IRC, freenode, #hurd, 2012-10-05
+ <mcsim> braunr: I think it's not worth separating the large store patch for
+ ext2 from the patch for moving it to the new libpager interface. Am I right?
+ <braunr> mcsim: it's worth separating, but not creating two versions
+ <braunr> i'm not sure what you mean here
+ <mcsim> First, I applied large store patch, and then I was changing patched
+ code, to make it work with new libpager interface. So changes to make
+ ext2 work with new interface depend on large store patch.
+ <mcsim> braunr: ^
+ <braunr> mcsim: you're not forced to make each version resulting from a new
+ commit work
+ <braunr> but don't make big commits
+ <braunr> so if changing an interface requires its users to be updated
+ twice, it doesn't make sense to do that
+ <braunr> just update the interface cleanly, you'll have one or more commits
+ that produce intermediate versions that don't build, that's ok
+ <braunr> then in another, separate commit, adjust the users
+ <mcsim> braunr: The only user now is ext2. And the problem with ext2 is
+ that I updated not the version from git repository, but the version, that
+ I've got after applying the large store patch. So in other words my
+ question is as follows: should I make a commit that moves to new interface
+ version of ext2fs without large store patch?
+ <braunr> you're asking if you can include the large store patch in your
+ work, and by extension, in the main branch
+ <braunr> i would say yes, but this must be discussed with others
diff --git a/open_issues/select.mdwn b/open_issues/select.mdwn
index 6bed94ca..12807e11 100644
--- a/open_issues/select.mdwn
+++ b/open_issues/select.mdwn
@@ -1395,6 +1395,114 @@ IRC, unknown channel, unknown date:
+## IRC, freenode, #hurd, 2012-08-07
+ <rbraun_hurd> anyone knows of applications extensively using non-blocking
+ networking functions ?
+ <rbraun_hurd> (well, networking functions in a non-blocking way)
+ <antrik> rbraun_hurd: X perhaps?
+ <antrik> it's single-threaded, so I guess it must be pretty async ;-)
+ <antrik> thinking about it, perhaps it's the reason it works so poorly on
+ Hurd...
+ <braunr> it does ?
+ <rbraun_hurd> ah maybe at the client side, right
+ <rbraun_hurd> hm no, the client side is synchronous
+ <rbraun_hurd> oh by the way, i can use gitk on darnassys
+ <rbraun_hurd> i wonder if it's because of the select fix
+ <tschwinge> rbraun_hurd: If you want, you could also have a look if there's
+ any improvement for these:
+ (elinks),
+ <tschwinge> rbraun_hurd: And congratulations, again! :-)
+ <rbraun_hurd> tschwinge: too bad it can't be merged before the pthread port
+ :(
+ <antrik> rbraun_hurd: I was talking about server. most clients are probably
+ sync.
+ <rbraun_hurd> antrik: i guessed :)
+ <antrik> (though certainly not all... multithreaded clients are not really
+ supported with xlib IIRC)
+ <rbraun_hurd> but i didn't have much trouble with X
+ <antrik> tried something pushing a lot of data? like, say, glxgears? :-)
+ <rbraun_hurd> why not
+ <rbraun_hurd> the problem with tests involving "a lot of data" is that it
+ can easily degenerate into a livelock
+ <antrik> yeah, sounds about right
+ <rbraun_hurd> (with the current patch i mean)
+ <antrik> the symptoms I got were general jerkiness, with occasional long
+ hangs
+ <rbraun_hurd> that applies to about everything on the hurd
+ <rbraun_hurd> so it didn't alarm me
+ <antrik> another interesting testcase is freeciv-gtk... it reproducibly
+ caused a thread explosion after idling for some time -- though I don't
+ remember the details; and never managed to come up with a way to track
+ down how this happens...
+ <rbraun_hurd> dbus is more worthwhile
+ <rbraun_hurd> pinotree: how do i test that ?
+ <pinotree> eh?
+ <rbraun_hurd> pinotree: you once mentioned dbus had trouble with non
+ blocking selects
+ <pinotree> it does a poll() with a 0s timeout
+ <rbraun_hurd> that's the non blocking select part, yes
+ <pinotree> you'll need also fixes for the socket credentials though,
+ otherwise it won't work ootb
+ <rbraun_hurd> right but, isn't it already used somehow ?
+ <antrik> rbraun_hurd: uhm... none of the non-X applications I use expose a
+ visible jerkiness/long hangs pattern... though that may well be a result
+ of general load patterns rather than X I guess
+ <rbraun_hurd> antrik: that's my feeling
+ <rbraun_hurd> antrik: heavy communication channels, unoptimal scheduling,
+ lack of scalability, they're clearly responsible for the generally
+ perceived "jerkiness" of the system
+ <antrik> again, I can't say I observe "general jerkiness". apart from slow
+ I/O the system behaves rather normally for the things I do
+ <antrik> I'm pretty sure the X jerkiness *is* caused by the socket
+ communication
+ <antrik> which of course might be a scheduling issue
+ <antrik> but it seems perfectly possible that it *is* related to the select
+ implementation
+ <antrik> at least worth a try I'd say
+ <rbraun_hurd> sure
+ <rbraun_hurd> there is still some work to do on it though
+ <rbraun_hurd> the client side changes i did could be optimized a bit more
+ <rbraun_hurd> (but i'm afraid it would lead to ugly things like 2 timeout
+ parameters in the io_select_timeout call, one for the client side, the
+ other for the servers, eh)
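The "non blocking select" that pinotree describes for dbus, a `poll()` with a zero timeout, can be illustrated with a small sketch. This is only an illustration using Python's standard `select.poll` on an ordinary pipe; the helper name `ready_now` is made up here, and nothing below reflects how glibc or the Hurd servers implement `io_select`.

```python
import os
import select


def ready_now(fd):
    """Return True if fd is readable right now, without ever blocking."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    # A timeout of 0 makes poll() return immediately: the non-blocking case.
    return bool(poller.poll(0))


read_end, write_end = os.pipe()
empty_result = ready_now(read_end)   # nothing has been written yet
os.write(write_end, b"x")
full_result = ready_now(read_end)    # one byte is now waiting
os.close(read_end)
os.close(write_end)
```

The caller gets an answer in both cases without sleeping, which is why a slow or incorrect zero-timeout path is so visible to applications that poll in a loop.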
+## IRC, freenode, #hurd, 2012-08-07
+ <braunr> when running gitk on [darnassus], yesterday, i could push the CPU
+ to 100% by simply moving the mouse in the window :p
+ <braunr> (but it may also be caused by the select fix)
+ <antrik> braunr: that cursor might be "normal"
+ <rbraunrh> antrik: what do you mean ?
+ <antrik> the 100% CPU
+ <rbraunh> antrik: yes i got that, but what would make it normal ?
+ <rbraunh> antrik: right i get similar behaviour on linux actually
+ <rbraunh> (not 100% because two threads are spread on different cores, but
+ their cpu usage add up to 100%)
+ <rbraunh> antrik: so you think as long as there are events to process, the
+ x client is running
+ <rbraunh> that would mean latencies are small enough to allow that, which
+ is actually a very good thing
+ <antrik> hehe... sound kinda funny :-)
+ <rbraunh> this linear search on dequeue is a real pain :/
+## IRC, freenode, #hurd, 2012-08-09
+`screen` doesn't close a window/hangs after exiting the shell.
+ <rbraunh> the screen issue seems linked to select :p
+ <rbraunh> tschwinge: the term server may not correctly implement it
+ <rbraunh> tschwinge: the problem looks related to the term consoles not
+ dying
+ <rbraunh>
# See Also
See also [[select_bogus_fd]] and [[select_vs_signals]].
diff --git a/open_issues/synchronous_ipc.mdwn b/open_issues/synchronous_ipc.mdwn
index 57bcdda7..53d5d69d 100644
--- a/open_issues/synchronous_ipc.mdwn
+++ b/open_issues/synchronous_ipc.mdwn
@@ -62,3 +62,124 @@ From [[Genode RPC|microkernel/genode/rpc]].
<antrik> well, if you see places where blocking is done but failing would
be more appropriate, try changing them I'd say...
<braunr> it's not that easy :/
+# IRC, freenode, #hurd, 2012-08-18
+ <lcc> what is the deepest design mistake of the HURD/gnumach?
+ <braunr> lcc: async ipc
+ <savask> braunr: You mentioned that moving to L4 will create problems. Can
+ you name some, please?
+ <savask> I thought it was going to be faster on L4
+ <braunr> the problem is that l4 *only* provides sync ipc
+ <braunr> so implementing async communication would require one separate
+ thread for each instance of async communication
+ <savask> But you said that the deepest design mistake of Hurd is asynch
+ ipc.
+ <braunr> not the hurd, mach
+ <braunr> and hurd depends on it now
+ <braunr> i said l4 provides *only* sync ipc
+ <braunr> systems require async communication tools
+ <braunr> but they shouldn't be built entirely on top of them
+ <savask> Hmm, so you mean mach has bad asynch ipc?
+ <braunr> you can consider mach and l4 as two extremes in os design
+ <braunr> mach *only* has async ipc
+ <lcc> what was viengoos trying to explore?
+ * savask is confused
+ <braunr> lcc: half-sync ipc :)
+ <braunr> lcc: i can't tell you more on that, i need to understand it better
+ myself before any explanation attempt
+ <savask> You say that mach problem is asynch ipc. And L4's problem is its
+ sync ipc. That means problems are in either of them!
+ <braunr> exactly
+ <lcc> how did apple resolve issues with mach?
+ <savask> What is perfect then? A "golden middle"?
+ <braunr> lcc: they have migrating threads, which make most rpc behave as if
+ they used sync ipc
+ <braunr> savask: nothing is perfect :p
+ <mcsim> braunr: but why async ipc is the problem?
+ <braunr> mcsim: it requires in-kernel buffering
+ <savask> braunr: Yes, but we can't have problems everywhere o_O
+ <braunr> mcsim: this not only reduces communication performance, but
+ creates many resource usage problems
+ <braunr> mcsim: and potential denial of service, which is what we
+ experience most of the time when something in the hurd fails
+ <braunr> savask: there are problems we can live with
+ <mcsim> braunr: But this could be replaced by userspace server, isn't it?
+ <braunr> savask: this is what monolithic kernels do
+ <braunr> mcsim: what ?
+ <braunr> mcsim: this would be the same, this central buffering server would
+ suffer from the same kind of issue
+ <mcsim> braunr: async ipc. Buffer can hold special server
+ <mcsim> But there could be created several servers, and queue could have
+ limit.
+ <braunr> queue limits are a problem
+ <braunr> when a queue limit is reached, you either block (= sync ipc) or
+ lose a message
+ <braunr> to keep messaging reliable, mach makes senders block
+ <braunr> the problem is that async ipc is often used to avoid blocking
+ <braunr> so blocking when you don't expect it can create deadlocks
+ <braunr> savask: a good compromise is to use sync ipc most of the time, and
+ async ipc for a few special cases, like signals
+ <braunr> this is what okl4 does if i'm right
+ <braunr> i'm not sure of the details, but like many other projects they
+ realized current systems simply need good support for async ipc, so they
+ extended l4 or something on top of it to provide it
+ <braunr> it took years of research for very smart people to get to some
+ consensus like "sync ipc is better but async is needed too"
+ <braunr> personally i don't like l4 :/
+ <braunr> really not
+ <mcsim> braunr: Anyway there is some queue for messaging, but at the moment
+ if it overflows, the kernel panics. And with limited queue servers will panic.
+ <braunr> mcsim: it can't overflow
+ <braunr> mach blocks senders
+ <braunr> queuing basically means "block and possibly deadlock" or "lose
+ messages and live with it"
+ <mcsim> So, deadlocks are still possible?
+ <braunr> of course
+ <braunr> have a look at the libpager debian patch and the discussion around
+ it
+ <braunr> it's a perfect example
+ <youpi> braunr: it makes gnu mach slow as hell sometimes, which I guess is
+ because all threads (which can be 1000s) wake at the same time
+ <braunr> youpi: you mean are created ?
+ <braunr> because they'll have to wake in any case
+ <braunr> i can understand why creating lots of threads is slower, but
+ cthreads never destroys kernel threads
+ <braunr> doesn't seem to be a mach problem, rather a cthreads one
+ <braunr> i hope we're able to remove the patch after pthreads are used
+ <mcsim> braunr: You state that hurd can't move to sync ipc, since it
+ depends on async ipc. But at the same time async ipc doesn't guarantee
+ that task wouldn't block. So, I don't understand why limited queues will
+ lead to more deadlocks?
+ <braunr> mcsim: async ipc can block because of queue limits
+ <braunr> mcsim: if you remove the limit, you remove the deadlock problem,
+ and replace it with denial of service
+ <braunr> mcsim: i didn't say the hurd can't move to sync ipc
+ <braunr> mcsim: i said it came to depend on async ipc as provided by mach,
+ and we would need to change that
+ <braunr> and it's tricky
+ <youpi> braunr: no, I really mean are woken. The timeout which gets dropped
+ by the patch makes threads wake after some time, to realize they should
+ go away. It's a hell long when all these threads wake at the same time
+ (because they got created at the same time)
+ <braunr> ahh
+ <antrik> savask: what is perfect regarding IPC is something nobody can
+ really answer... there are competing opinions on that matter. but we know
+ by now that the Mach model is far from ideal, and that the (original) L4
+ model is also problematic -- at least for implementing a UNIX-like system
+ <braunr> personally, if i'd create a system now, i'd use sync ipc for
+ almost everything, and implement posix-like signals in the kernel
+ <braunr> that's one solution, it's not perfect
+ <braunr> savask: actually the real answer may be "no one knows for now and
+ it still requires work and research"
+ <braunr> so for now, we're using mach
+ <antrik> savask: regarding IPC, the path explored by Viengoos (and briefly
+ Coyotos) seems rather promising to me
+ <antrik> savask: and yes, I believe that whatever direction we take, we
+ should do so by incrementally reworking Mach rather than jumping to a
+ completely new microkernel...
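braunr's point about queue limits ("you either block or lose a message") can be sketched with a bounded queue. This is a toy model built on Python's standard `queue.Queue`, not Mach's port implementation; `LIMIT`, `send_or_drop` and `send_or_block` are names invented for the illustration, and the timeout exists only so the demonstration terminates.

```python
import queue

LIMIT = 2  # hypothetical queue limit, chosen only for the demo


def send_or_drop(port, message):
    """Never blocks: returns False when the queue is full and the message is lost."""
    try:
        port.put_nowait(message)
        return True
    except queue.Full:
        return False


def send_or_block(port, message, timeout):
    """Blocks while the queue is full, which is the choice braunr attributes
    to Mach. Without the timeout, a sender with no receiver would wait forever."""
    try:
        port.put(message, timeout=timeout)
        return True
    except queue.Full:
        return False


port = queue.Queue(maxsize=LIMIT)
results = [send_or_drop(port, n) for n in range(3)]        # third message is lost
blocked = not send_or_block(port, "late", timeout=0.05)    # sender stalls, then gives up
```

With no receiver draining the queue, the blocking sender only returns because of the artificial timeout. A sender that blocks this way while the receiver is itself waiting on that sender is the deadlock case discussed above.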
diff --git a/open_issues/system_stats.mdwn b/open_issues/system_stats.mdwn
new file mode 100644
index 00000000..9a13b29a
--- /dev/null
+++ b/open_issues/system_stats.mdwn
@@ -0,0 +1,39 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+[[!tag open_issue_documentation]]There should be a page listing ways to get
+system statistics, how to interpret them, and some example/expected values.
+# IRC, freenode, #hurd, 2012-11-04
+ <mcsim> Hi, is that normal that memory cache "ipc_port" is 24 Mb already?
+ Some memory has been already swapped out.
+ <mcsim> Other caches are big too
+ <braunr> how many ports ?
+ <mcsim> 45922
+ <braunr> yes it's normal
+ <braunr> ipc_port 0010 76 4k 50 45937 302050
+ 24164k 4240k
+ <braunr> it's a bug in exim
+ <braunr> or triggered by exim, from time to time
+ <braunr> lots of ports are created until the faulty processes are killed
+ <braunr> the other big caches you have are vm_object and vm_map_entry,
+ probably because of a big build like glibc
+ <braunr> and if they remain big, it's because there was no memory pressure
+ since they got big
+ <braunr> memory pressure can only be caused by very large files on the
+ hurd, because of the limited page cache size (4000 objects at most)
+ <braunr> the reason you have swapped memory is probably because of a glibc
+ test that allocates a very large (more than 1.5 GiB iirc) block and fills
+ it
+ <mcsim> yes
+ <braunr> (a test that fails with the 2G/2G split of the debian kernel, but
+ not on your vanilla version btw)
diff --git a/open_issues/term_blocking.mdwn b/open_issues/term_blocking.mdwn
index 19d18d0e..39803779 100644
--- a/open_issues/term_blocking.mdwn
+++ b/open_issues/term_blocking.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2009, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2009, 2011, 2012 Free Software Foundation,
+Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -117,6 +118,128 @@ noninvasive on`, attach to the *term* that GDB is using.
+# IRC, freenode, #hurd, 2012-08-09
+In context of the [[select]] issue.
+ <braunr> i wonder where the tty allocation is made
+ <braunr> it could simply be that current applications don't handle old BSD
+ ptys correctly
+ <braunr> hm no, allocation is fine
+ <braunr> does someone know why there is no term instance for /dev/ttypX ?
+ <braunr> showtrans says "/hurd/term /dev/ttyp0 pty-slave /dev/ptyp0" though
+ <youpi> braunr: /dev/ttypX share the same translator with /dev/ptypX
+ <braunr> youpi: but how ?
+ <youpi> see the main function of term
+ <youpi> it attaches itself to the other node
+ <youpi> with file_set_translator
+ <youpi> just like pfinet can attach itself to /servers/socket/26 too
+ <braunr> youpi: isn't there a possible race when the same translator tries
+ to sets itself on several nodes ?
+ <youpi> I don't know
+ <tschwinge> There is.
+ <braunr> i guess it would just faikl
+ <braunr> fail
+ <tschwinge> I remember some discussion about this, possibly in context of
+ the IPv6 project.
+ <braunr> gdb shows weird traces in term
+ <braunr> i got this earlier today:
+ <braunr> 0x805e008 is the ptyctl, the trivs control for the pty
+ <tschwinge> braunr: How do you mean »weird«?
+ <braunr> tschwinge: some peropen (po) are never destroyed
+ <tschwinge> Well, can't they possibly still be open?
+ <braunr> they shouldn't
+ <braunr> that's why term doesn't close cleanly, why select still reports
+ readiness, and why screen loops on it
+ <braunr> (and why each ssh session uses a different pty)
+ <tschwinge> ... but only on darnassus, I think? (I think I haven't seen
+ this anywhere else.)
+ <braunr> really ?
+ <braunr> i had it on my virtual machines too
+ <tschwinge> But perhaps I've always been rebooting systems quickly enough
+ to not notice.
+ <tschwinge> OK, I'll have a look next time I boot mine.
+ <braunr> i suppose it's why you can't login anymore quickly when syslog is
+ running
+ <braunr> i've traced the problem to ptyio.c, where pty_open_hook returns
+ EBUSY because ptyopen is still true
+ <braunr> ptyopen remains true because pty_po_create_hook doesn't get called
+ <youpi> tschwinge: I've seen the pty issue on exodar too, and on my qemu
+ image too
+ <braunr> err, pty_po_destroy_hook
+ <tschwinge> OK.
+ <braunr> and pty_po_destroy_hook doesn't get called from users.c because
+ po->cntl != ptyctl
+ <braunr> which means, somehow, the pty never gets closed
+ <youpi> oddly enough it seems to happen on all qemu systems I have, and no
+ xen system I have
+ <braunr> Oo
+ <braunr> are they all (xen and qemu) up to date ?
+ <braunr> (so we can remove versions as a factor)
+ <tschwinge> Aha. I only have Xen and real hardware.
+ <youpi> braunr: no
+ <braunr> youpi: do you know any obscure site about ptys ? :)
+ <youpi> no
+ <youpi> well, actually yes
+ <youpi> (in french)
+ <braunr> :D
+ <braunr> looks
+ interesting
+ <youpi> indeed
+## IRC, freenode, #hurdfr, 2012-08-09
+ <braunr> youpi: what i have the most trouble understanding is what a
+ "controlling tty" is
+ <youpi> it's the most obscure of the obscure :)
+ <braunr> if it's exclusive to one application, how should it behave on a
+ fork, etc..
+ <youpi> put simply, it's what makes ^C work
+ <braunr> well yes, and that's surely where it blows up
+ <youpi> it's not exclusive, it's inherited
+ <braunr>
+## IRC, freenode, #hurd, 2012-08-10
+ <braunr> youpi: and just to be sure about the test procedure, i log on a
+ system, type tty, see e.g. ttyp0, log out, and in again, then tty returns
+ ttyp1, etc..
+ <youpi> yes
+ <braunr> youpi: and an open (e.g. cat) on /dev/ptyp0 returns EBUSY
+ <youpi> indeed
+ <braunr> so on xen it doesn't
+ <braunr> grmbl
+ <youpi> I've never seen it, more precisely
+ <braunr> i also have the problem with a non-accelerated qemu
+ <braunr> antrik: do you have the term problems we've seen on your bare
+ hardware ?
+ <antrik> I'm not sure what problem you are seeing exactly :-)
+ <braunr> antrik: when logging through ssh, tty first returns ttyp0, and the
+ second time (after logging out from the first session) ttyp1
+ <braunr> antrik: and term servers that have been used are then stuck in a
+ busy state
+ <antrik> braunr: my ptys seem to be reused just fine
+ <braunr> or perhaps they didn't have the bug
+ <braunr> antrik: that's so weird
+ <antrik> (I do *sometimes* get hanging ptys, but that's a different issue
+ -- these are *not* busy; they just hang when reused...)
+ <braunr> antrik: yes i saw that too
+ <antrik> braunr: note though that my hurd package is many months old...
+ <antrik> (in fact everything on this system)
+ <braunr> antrik: i didn't see anything relevant about the term server in
+ years
+ <braunr> antrik: what shell do you use ?
+ <antrik> yeah, but such errors could be caused by all kinds of changes in
+ other parts of the Hurd, glibc, whatever...
+ <antrik> bash
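The chain of causes braunr traced on 2012-08-09 (the open hook returns EBUSY because `ptyopen` stays true, because the destroy hook is skipped when `po->cntl != ptyctl`) can be restated as a toy model. The sketch below only mirrors the names mentioned in the log; the class layout and logic are assumptions made for illustration and are not copied from the term server's `ptyio.c` or `users.c`.

```python
import errno


class Peropen:
    """Stand-in for a per-open structure; cntl is the control it belongs to."""

    def __init__(self, cntl):
        self.cntl = cntl


class PtyModel:
    """Hypothetical model of the pty master's open/busy bookkeeping."""

    def __init__(self):
        self.ptyctl = object()
        self.ptyopen = False

    def open_hook(self):
        # Only one open of the master side is allowed at a time.
        if self.ptyopen:
            return errno.EBUSY
        self.ptyopen = True
        return 0

    def po_destroy(self, po):
        # The flag is cleared only for peropens on the pty control, so a
        # peropen with a different control leaves the pty marked busy.
        if po.cntl is self.ptyctl:
            self.ptyopen = False


pty = PtyModel()
first = pty.open_hook()
pty.po_destroy(Peropen(cntl=object()))      # wrong control: flag stays set
second = pty.open_hook()                    # reproduces the EBUSY symptom
pty.po_destroy(Peropen(cntl=pty.ptyctl))    # matching control: flag cleared
third = pty.open_hook()
```

In this model the second open fails exactly as `cat /dev/ptyp0` does in the test procedure, and the pty becomes reusable only once a peropen with the matching control is destroyed.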
# Formal Verification
This issue may be a simple programming error, or it may be more complicated.
diff --git a/open_issues/user-space_device_drivers.mdwn b/open_issues/user-space_device_drivers.mdwn
index 25168fce..8cde8281 100644
--- a/open_issues/user-space_device_drivers.mdwn
+++ b/open_issues/user-space_device_drivers.mdwn
@@ -50,6 +50,65 @@ Also see [[device drivers and IO systems]].
* I/O MMU.
+### IRC, freenode, #hurd, 2012-08-15
+ <carli2> hi. does hurd support mesa?
+ <braunr> carli2: software only, but yes
+ <carli2> :(
+ <carli2> so you did not solve the problem with the CS checkers and GPU DMA
+ for microkernels yet, right?
+ <braunr> cs = ?
+ <carli2> control stream
+ <carli2> the data sent to the gpu
+ <braunr> no
+ <braunr> and to be honest we're not currently trying to
+ <carli2> well, a microkernel containing cs checkers for each hardware is
+ not a microkernel any more
+ <braunr> the problem is having the ability to check
+ <braunr> or rather, giving only what's necessary to delegate checking to
+ mmus
+ <carli2> but maybe the kernel could have a smaller interface like a
+ function to check if a memory block is owned by a process
+ <braunr> i'm not sure what you refer to
+ <carli2> about DMA-capable devices you can send messages to
+ <braunr> carli2: dma must be delegated to a trusted server
+ <carli2> linux checks the data sent to these devices, parses them and
+ checks all pointers if they are in a memory range that the client is
+ allowed to read/write from
+ <braunr> the client ?
+ <carli2> in linux, 3d drivers are in user space, so the kernel side checks
+ the pointer sent to the GPU
+ <youpi> carli2: mach could do that as well
+ <braunr> well, there is a rather large part in kernel space too
+ <carli2> so in hurd I trust some drivers to not do evil things?
+ <braunr> those in the kernel yes
+ <carli2> what does "in the kernel" mean? afaik a microkernel only has
+ memory manager and some basic memory sharing and messaging functionality
+ <braunr> did you read about the hurd ?
+ <braunr> mach is considered an hybrid kernel, not a true microkernel
+ <braunr> even with all drivers outside, it's still an hybrid
+ <youpi> although we're to move some parts into userlands :)
+ <youpi> braunr: ah, why?
+ <braunr> youpi: the vm part is too large
+ <youpi> ok
+ <braunr> the microkernel dogma is no policy inside the kernel
+ <braunr> "except scheduling because it's very complicated"
+ <braunr> but all modern systems have moved memory management outside the
+ kernel, leaving just the kernel abstraction inside
+ <braunr> the address space kernel abstraction
+ <braunr> and the two components required to make it work are what l4re
+ calls region mappers (the rough equivalent of our vm_map), which decides
+ how to allocate regions in an address space
+ <braunr> and the pager, like ours, which are already external
+ <carli2> i'm not a OS developer, i mostly develop games, web services and
+ sometimes I fix gpu drivers
+ <braunr> that was just FYI
+ <braunr> but yes, dma must be considered something privileged
+ <braunr> and the hurd doesn't have the infrastructure you seem to be
+ looking for
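The check carli2 describes, verifying that every pointer in the data sent to a DMA-capable device falls in memory the client is allowed to access, comes down to a range test. The sketch below is a simplified illustration under stated assumptions: the command format (a list of address and size pairs), the half-open region tuples and both function names are hypothetical, and it is not how Linux's command stream checkers or any Mach interface actually work.

```python
def range_allowed(start, length, allowed_regions):
    """True if [start, start + length) lies fully inside one allowed region.

    Regions are half-open (low, high) tuples, an assumption of this sketch.
    """
    if length <= 0 or start < 0:
        return False
    end = start + length
    return any(low <= start and end <= high for low, high in allowed_regions)


def check_command_stream(commands, allowed_regions):
    """Reject the whole stream if any (address, size) pair leaves the client's memory."""
    return all(range_allowed(addr, size, allowed_regions) for addr, size in commands)


regions = [(0x1000, 0x3000)]
ok = check_command_stream([(0x1000, 0x800), (0x2800, 0x800)], regions)
bad = check_command_stream([(0x2800, 0x1000)], regions)   # runs past 0x3000
```

The range test itself is the easy part. As the discussion notes, the difficulty is that parsing the device-specific stream to find those addresses needs per-hardware knowledge, which is what would have to live in a trusted component.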
## I/O Ports
* Security considerations.
@@ -63,8 +122,13 @@ Also see [[device drivers and IO systems]].
* [[GNU Mach|microkernel/mach/gnumach]] is said to have a high overhead when
doing RPC calls.
## System Boot
+A similar problem is described in
+[[community/gsoc/project_ideas/unionfs_boot]], and needs to be implemented.
### IRC, freenode, #hurd, 2011-07-27
< braunr> btw, was there any formulation of the modifications required to
@@ -89,12 +153,270 @@ Also see [[device drivers and IO systems]].
< Tekk_> mhm
< braunr> s/disk/storage/
### IRC, freenode, #hurd, 2012-04-25
<youpi> btw, remember the initrd thing?
<youpi> I just came across task.c in libstore/ :)
+### IRC, freenode, #hurd, 2012-07-17
+ <bddebian> OK, here is a stupid question I have always had. If you move
+ PCI and disk drivers in to userspace, how do you do initial bootstrap to get
+ the system booting?
+ <braunr> that's hard
+ <braunr> basically you make the boot loader load all the components you
+ need in ram
+ <braunr> then you make it give each component something (ports) so they can
+ communicate
+### IRC, freenode, #hurd, 2012-08-12
+ <antrik> braunr: so, about booting with userspace disk drivers
+ <antrik> after rereading the chapter in my thesis, I see that there aren't
+ really all that many interesting options...
+ <antrik> I pondered some variants involving a temporary boot filesystem
+ with handoff to the real root FS; but ultimately concluded with another
+ option that is slightly less elegant but probably gets a much better
+ usefulness/complexity ratio:
+ <antrik> just start the root filesystem as the first process as we used to;
+ only hack it so that initially it doesn't try to access the disk, but
+ instead gets the files from GRUB
+ <antrik> once the disk driver is operational, we flip a switch, and the
+ root filesystem starts reading stuff from disk normally
+ <antrik> transparently for all other processes
+ <bddebian> How does grub access the disk without drivers?
+ <antrik> bddebian: GRUB obviously has its own drivers... that's how it
+ loads the kernel and modules
+ <antrik> bddebian: basically, it would have to load additional modules for
+ all the components necessary to get the Hurd disk driver going
+ <bddebian> Right, why wouldn't that be possible?
+ <antrik> (I have some more crazy ideas too -- but these are mostly
+ orthogonal :-) )
+ <antrik> ?
+ <antrik> I'm describing this because I'm pretty sure it *is* possible :-)
+ <bddebian> That grub loads the kernel and whatever server/module gets
+ access to the disk
+ <antrik> not sure what you mean
+ <bddebian> Well as usual I probably don't know the proper terminology but
+ why couldn't grub load gnumach and the hurd "disk server" that contains the
+ userspace drivers?
+ <antrik> disk server?
+ <bddebian> Oh FFS whatever contains the disk drivers :)
+ <bddebian> diskdde, whatever :)
+ <antrik> actually, I never liked the idea of having a big driver blob very
+ much... ideally each driver should have it's own file
+ <antrik> but that's admittedly beside the point :-)
+ <antrik> its
+ <antrik> so to restate: in addition to gnumach, ext2fs.static, and,
+ in the new scenario GRUB will also load exec, the disk driver, any
+ libraries these two depend upon, and any additional infrastructure
+ involved in getting the disk driver running (for automatic probing or
+ whatever)
+ <antrik> probably some other Hurd core servers too, so we can have a more
+ complete POSIX environment for the disk driver to run in
+ <bddebian> There ya go :)
+ <antrik> the interesting part is modifying ext2fs so it will access only
+ the GRUB-provided files, until it is told that it's OK now to access the
+ real disk
+ <antrik> (and the mechanism how ext2 actually gets at the GRUB-provided
+ files)
+ <bddebian> Or write some new really small ext2fs? :)
+ <antrik> ?
+ <bddebian> I'm just talking out my butt. Something temporary that gets
+ disposed of when the real disk is available :)
+ <antrik> well, I mentioned above that I considered some handoff
+ schemes... but they would probably be more complex to implement than
+ doing the switchover internally in ext2
+ <bddebian> Ah
+ <bddebian> boot up in a ramdisk? :)
+ <antrik> (and the temporary FS would *not* be an ext2 obviously, but rather
+ some special ramdisk-like filesystem operating from GRUB-loaded files...)
+ <antrik> again, that would require a complicated handoff-scheme
+ <bddebian> Bah, what do I know? :)
+ <antrik> (well, you could of course go with a trivial chroot()... but that
+ would be ugly and inefficient, as the initial processes would still run
+ from the ramdisk)
+ <bddebian> Aren't most things running in memory initially anyway? At what
+ point must it have access to the real disk?
+ <braunr> antrik: but doesn't that require that disk drivers be statically
+ linked ?
+ <braunr> and having all disk drivers in separate tasks (which is what we
+ prefer to blobs as you put it) seems to pretty much forbid using static
+ linking
+ <braunr> hm actually, i don't see how any solution could work without
+ static linking, as it would create a recursion
+ <braunr> and the only one required is the one used by the root file system
+ <braunr> others can be run from the dynamically linked version
+ <braunr> antrik: i agree, it's a good approach, requiring only a slightly
+ more complicated boot script/sequence
+ <antrik> bddebian: at some point we have to access the real disk so we
+ don't have to work exclusively with stuff loaded by grub... but there is
+ no specific point where it *has* to happen. generally speaking, the
+ sooner the better
+ <antrik> braunr: why wouldn't that work with a dynamically linked disk
+ driver? we only need to make sure all required libraries are loaded by
+ grub too
+ <braunr> antrik: i have a problem with that approach :p
+ <braunr> antrik: it would probably require a reboot when those libraries
+ are upgraded, wouldn't it ?
+ <antrik> I'd actually wish we could run with a dynamically linked ext2fs as
+ well... but that would require a separated boot filesystem and some kind
+ of handoff approach, which would be much more complicated I fear...
+ <braunr> and if a driver is restarted, would it use those libraries too ?
+ and if so, how to find them ?
+ <braunr> but how can you run a dynamically linked root file system ?
+ <braunr> unless the libraries it uses are provided by something else, as
+ you said
+ <antrik> braunr: well, if you upgrade the libraries, *and* want the disk
+ driver to use the upgraded libraries, you are obviously in a tricky
+ situation ;-)
+ <braunr> yes
+ <antrik> perhaps you could tell ext2 to preload the new libraries before
+ restarting the disk driver...
+ <antrik> but that's a minor quibble anyways IMHO
+ <braunr> but that case isn't that important actually, since upgrading these
+ libraries usually means we're upgrading the system, which can imply a
+ reboot
+ <braunr> i don't think it is
+ <braunr> it looks very complicated to me
+ <braunr> think of restart as after a crash :p
+ <braunr> you can't preload stuff in that case
+ <antrik> uh? I don't see anything particularly complicated. but my point
+ was more that it's not a big thing if that's not implemented IMHO
+ <braunr> right
+ <braunr> it's not that important
+ <braunr> but i still think statically linking is better
+ <braunr> although i'm not sure about some details
+ <antrik> oh, you mean how to make the root filesystem use new libraries
+ without a reboot? that would be tricky indeed... but this is not possible
+ right now either, so that's not a regression
+ <braunr> i assume that, when statically linking, only the .o providing the
+ required symbols are included, right ?
+ <antrik> making the root filesystem restartable is a whole different epic
+ story ;-)
+ <braunr> antrik: not the root file system, but the disk driver
+ <braunr> but i guess it's the same
+ <antrik> no, it's not
+ <braunr> ah
+ <antrik> for the disk driver it's really not that hard I believe
+ <antrik> still some extra effort, but definitely doable
+ <braunr> with the preload you mentioned
+ <antrik> yes
+ <braunr> i see
+ <braunr> i don't think it's worth the trouble actually
+ <braunr> statically linking looks way simpler and should make for smaller
+ binaries than if libraries were loaded by grub
+ <antrik> no, I really don't want statically linked disk drivers
+ <braunr> why ?
+ <antrik> again, I'd prefer even ext2fs to be dynamic -- only that would be
+ much more complicated
+ <braunr> the point of dynamically linking is sharing
+ <antrik> while dynamic disk drivers do not require any extra effort beyond
+ loading the libraries with grub
+ <braunr> but if it means sharing big files that are seldom used (i assume
+ there is a lot of code that simply isn't used by hurd servers), i don't
+ see the point
+ <antrik> right. and with the approach I proposed that will work just as it
+ should
+ <antrik> err... what big files?
+ <braunr> glibc ?
+ <antrik> I don't get your point
+ <antrik> you prefer statically linking everything needed before the disk
+ driver runs (which BTW is much more than only the disk driver itself) to
+ using normal shared libraries like the rest of the system?...
+ <braunr> it's not "like the rest of the system"
+ <braunr> the libraries loaded by grub wouldn't be backed by the ext2fs server
+ <braunr> they would be wired in memory
+ <braunr> you'd have two copies of them, the one loaded by grub, and the one
+ shared by normal executables
+ <antrik> no
+ <braunr> i prefer static linking because, if done correctly, the combined
+ size of the root file system and the disk driver should be smaller than
+ that of the rootfs+disk driver and libraries loaded by grub
+ <antrik> apparently I was not quite clear how my approach would work :-(
+ <braunr> probably not
+ <antrik> (preventing that is actually the reason why I do *not* want a
+ simple boot filesystem+chroot approach)
+ <braunr> and initramfs can be easily freed after init
+ <braunr> an*
+ <braunr> it wouldn't be a chroot but something a bit more involved like
+ switch_root in linux
+ <antrik> not if various servers use files provided by that init filesystem
+ <antrik> yes, that's the complex handoff I'm talking about
+ <braunr> yes
+ <braunr> that's one approach
+ <antrik> as I said, that would be a quite elegant approach (allowing a
+ dynamically linked ext2); but it would be much more complicated to
+ implement I believe
+ <braunr> how would it allow a dynamically linked ext2 ?
+ <braunr> how can the root file system be linked with code backed by itself
+ ?
+ <braunr> unless it requires wiring all its memory ?
+ <antrik> it would be loaded from the init filesystem before the handoff
+ <braunr> init isn't the problem here
+ <braunr> i understand how it would boot
+ <braunr> but then, you need to make sure the root fs is never used to
+ service page faults on its own address space
+ <braunr> or any address space it depends on, like the disk driver
+ <braunr> so this basically requires wiring all the system libraries, glibc
+ included
+ <braunr> why not
+ <antrik> ah. yes, that's something I covered in a separate section in my
+ thesis ;-)
+ <braunr> eh :)
+ <antrik> we have to do that anyways, if we want *any* dynamically linked
+ components (such as the disk driver) in the paging path
+ <braunr> yes
+ <braunr> and it should make swapping more reliable too
+ <antrik> so that adds a couple MiB of wired memory... I guess we will just
+ have to live with that
+ <braunr> yes it seems acceptable
+ <braunr> thanks
+ <antrik> (it is actually one reason why I want to avoid static linking as
+ much as possible... so at least we have to wire these libraries only
+ *once*)
+ <antrik> anyways, back to my "simpler" approach
+ <antrik> the idea is that a (static) ext2fs would still be the first task
+ running, and immediately able to serve filesystem access requests -- only
+ it would serve these requests from files preloaded by GRUB rather than
+ the actual disk driver
+ <braunr> i understand now
+ <antrik> until a switch is flipped telling it that now the disk driver (and
+ anything it depends upon) is operational
+ <braunr> you still need to make sure all this is wired
+ <antrik> yes
+ <antrik> that's orthogonal
+ <antrik> which is why I have a separate section about it :-)
+ <braunr> what was the relation with ggi ?
+ <antrik> none strictly speaking
+ <braunr> i'll rephrase it: how did it end up in your thesis ?
+ <antrik> I just covered all aspects of userspace drivers in one of the
+ "introduction" sections of my thesis
+ <braunr> ok
+ <antrik> before going into specifics of KGI
+ <antrik> (and throwing in along the way that most of the issues described
+ do not matter for KGI ;-) )
+ <braunr> hehe
+ <braunr> i'm wondering, do we have mlockall on the hurd ? it seems not
+ <braunr> that's something deeply missing in mach
+ <antrik> well, bootstrap in general *is* actually relevant for KGI as well,
+ because of console messages during boot... but the filesystem bootstrap
+ is mostly irrelevant there ;-)
+ <antrik> braunr: oh? that's a problem then... I just assumed we have it
+ <braunr> well, it's possible to implement MCL_CURRENT, but not MCL_FUTURE
+ <braunr> or at least, it would be a bit difficult
+ <braunr> every allocation would need to be aware of that property
+ <braunr> it's better to have it managed by the vm system
+ <braunr> mach-defpager has its own version of vm_allocate for that
+ <antrik> braunr: I don't think we care about MCL_FUTURE here
+ <antrik> hm, wait... MCL_CURRENT is fine for code, but it might indeed be a
+ problem for dynamically allocated memory :-(
+ <braunr> yes
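The `MCL_CURRENT` / `MCL_FUTURE` distinction raised at the end of this log can be sketched with the POSIX calls. This is only an illustration for a system that provides `mlockall` (the log suggests the Hurd did not at the time); the function name `wire_current_then_buffer` is made up for this sketch and is not part of any Hurd or glibc interface.

```c
#include <stdlib.h>
#include <sys/mman.h>

/* Wire everything that is mapped right now, then allocate one more
   buffer and wire it by hand.  Returns 0 on success, -1 on failure.  */
int wire_current_then_buffer(size_t len, void **out)
{
    void *buf;

    /* MCL_CURRENT only covers pages mapped at the time of the call
       (program text, libraries, existing data).  */
    if (mlockall(MCL_CURRENT) != 0)
        return -1;

    /* Memory mapped after that call is not guaranteed to be covered.
       Without MCL_FUTURE, each later allocation has to be locked
       explicitly, which is the problem noted for dynamic memory.  */
    buf = malloc(len);
    if (buf == NULL)
        return -1;

    if (mlock(buf, len) != 0) {
        free(buf);
        return -1;
    }

    *out = buf;
    return 0;
}
```

A server in the paging path would have to apply the second step to every allocation it makes, which is why the log prefers having the VM system manage the property.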
# Plan
* Examine what other systems are doing.
@@ -116,6 +438,112 @@ Also see [[device drivers and IO systems]].
and parallel port drivers, using `libtrivfs`.
+## I/O Server
+### IRC, freenode, #hurd, 2012-08-10
+ <braunr> usually you'd have an I/O server, and several device drivers
+ using it
+ <bddebian> Well maybe that's my question. Should there be unique servers
+ for say ISA, PCI, etc or could all of that be served by one "server"?
+ <braunr> forget about ISA
+ <bddebian> How? Oh because the ISA bus is now served via a PCI bridge?
+ <braunr> the I/O server would merely be there to help device drivers map
+ only what they require, and avoid conflicts
+ <braunr> because it's a relic of the past :p
+ <braunr> and because it requires too high privileges
+ <bddebian> But still exists in several PCs :)
+ <braunr> so usually, you'd directly ask the kernel for the I/O ports you
+ need
+ <mel-> so do floppy drives
+ <mel-> :)
+ <braunr> if i'm right, even the l4 guys do it that way
+ <braunr> he's right, some devices are still considered ISA
+ <bddebian> But that is where my confusion lies. Something has to figure
+ out what/where those I/O ports are
+ <braunr> and that's why i tell you to forget about it
+ <braunr> ISA has both statically allocated ports (the historical ones) and
+ others usually detected through PnP, when it works
+ <braunr> PCI is much cleaner, and memory mapped I/O is both better and much
+ more popular currently
+ <bddebian> So let's say I have a PCI SCSI card. I need some device driver
+ to know how to talk to that, right?
+ <bddebian> something is going to enumerate all the PCI devices and map them
+ to an address space
+ <braunr> bddebian: that would be the I/O server
+ <braunr> we'll call it the PCI server
+ <bddebian> OK, that is where I am headed. What if everything isn't PCI?
+ Is the "I/O server" generic enough?
+ <youpi> nowadays everything is PCI
+ <bddebian> So we are completely ignoring legacy hardware?
+ <braunr> we could have separate servers using a shared library that would
+ provide allocation routines like resource maps
+ <braunr> yes
+ <youpi> for what is not, the translator just needs to be run as root
+ <youpi> to get i/o perm from the kernel
+ <braunr> the idea for projects like ours, where the user base is very small
+ is: don't implement what you can't test
+ <youpi> bddebian: legacy can not be supported in a nice way, so for them we
+ can just afford a bad solution
+ <youpi> i.e. leave the driver in kernel
+ <braunr> right
+ <youpi> e.g. the keyboard
+ <bddebian> Well what if I have a USB keyboard? :-P
+ <braunr> that's a different matter
+ <youpi> USB keyboard is not legacy hardware
+ <youpi> it's usb
+ <youpi> which can be enumerated like pci
+ <braunr> and USB uses PCI
+ <youpi> and pci could be on usb :)
+ <braunr> so it's just a separate stack on top of the PCI server
+ <bddebian> Sure so would SCSI in my example above but is still a separate
+ bus
+ <braunr> netbsd has a very nice way of attaching drivers to buses
+ <youpi> bddebian: also, yes, and it can be enumerated
+ <bddebian> Which was my original question. This magic I/O server handles
+ all of the buses?
+ <youpi> no, just PCI, and then you'd have other servers for other busses
+ <braunr> i didn't mean that there would be *one* I/O server instance
+ <bddebian> So then it isn't a generic I/O server is it?
+ <bddebian> Ahhhh
+ <youpi> that way you can even put scsi over ppp or other crazy things
+ <braunr> it's more of an idea
+ <braunr> there would probably be a generic interface for basic stuff
+ <braunr> and i assume it could be augmented with specific (e.g. USB)
+ interfaces for servers that need more detailed communication
+ <braunr> (well, i'm pretty sure of it)
+ <bddebian> So the I/O server generalizes all functions, say read and write,
+ and then the PCI, USB, SCSI, whatever servers are contacted by it?
+ <braunr> no, not read and write
+ <braunr> resource allocation rather
+ <youpi> and enumeration
+ <braunr> probing perhaps
+ <braunr> bddebian: the goal of the I/O server is to make it possible for
+ device drivers to access the resources they need without a chance to
+ interfere with other device drivers
+ <braunr> (at least, that's one of the goals)
+ <braunr> so a driver would request the bus space matching the device(s) and
+ obtain that through memory mapping
+ <bddebian> Shouldn't that be in the "global address space"? Sorry if I am
+ using the wrong terminology
+ <youpi> well, the i/o server should also trigger the start of that driver
+ <youpi> bddebian: address space is not a matter for drivers
+ <braunr> bddebian: i'm not sure what you think of with "global address
+ space"
+ <youpi> bddebian: it's just a matter for the pci enumerator when (and if)
+ it places the BARs in physical address space
+ <youpi> drivers merely request mapping that, they don't need to know about
+ actual physical addresses
+ <braunr> i'm almost sure you lost him at BARs
+ <braunr> :(
+ <braunr> youpi: that's what i meant with probing actually
+ <bddebian> Actually I know BARs I have been reading on PCI :)
+ <bddebian> I suppose physical address space is more what I meant when I
+ used "global address space"
+ <braunr> i see
+ <youpi> bddebian: probably, yes
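The "resource map" idea mentioned in this log (a shared library that lets bus servers hand out address ranges without conflicts) can be sketched as below. All names here (`struct resmap`, `resmap_claim`, the fixed capacity) are hypothetical and chosen for illustration; the log does not describe an actual interface.

```c
#include <stddef.h>
#include <stdint.h>

#define RESMAP_MAX 32           /* arbitrary capacity for this sketch */

struct res_range {
    uint64_t start;             /* first address of the range */
    uint64_t len;               /* length in bytes, never zero */
    int owner;                  /* hypothetical driver identifier */
};

struct resmap {
    struct res_range used[RESMAP_MAX];
    size_t count;
};

/* Do the half-open ranges [a, a+alen) and [b, b+blen) overlap?  */
static int ranges_overlap(uint64_t a, uint64_t alen,
                          uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/* Claim [start, start+len) for OWNER.  Returns 0 on success, or -1 if
   the range is empty, wraps around, conflicts with an existing claim,
   or the map is full.  */
int resmap_claim(struct resmap *map, uint64_t start, uint64_t len, int owner)
{
    size_t i;

    if (len == 0 || start + len < start || map->count == RESMAP_MAX)
        return -1;

    for (i = 0; i < map->count; i++)
        if (ranges_overlap(start, len, map->used[i].start, map->used[i].len))
            return -1;

    map->used[map->count].start = start;
    map->used[map->count].len = len;
    map->used[map->count].owner = owner;
    map->count++;
    return 0;
}
```

In this sketch a PCI server would record each BAR it assigns, and a driver asking for a range already granted to another driver would be refused rather than allowed to interfere.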
# Documentation
* [An Architecture for Device Drivers Executing as User-Level
diff --git a/open_issues/vm_map_kernel_bug.mdwn b/open_issues/vm_map_kernel_bug.mdwn
new file mode 100644
index 00000000..613c1317
--- /dev/null
+++ b/open_issues/vm_map_kernel_bug.mdwn
@@ -0,0 +1,54 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+[[!tag open_issue_glibc open_issue_gnumach]]
+# IRC, freenode, #hurd, 2012-11-04
+ <tschwinge> braunr, pinotree, youpi: Has either of you already figured out
+ what [glibc]/sysdeps/mach/hurd/dl-sysdep.c:fmh »XXX loser kludge for
+ vm_map kernel bug« is about?
+ <pinotree> tschwinge: ETOOLOWLEVELFORME :)
+ <pinotree> tschwinge: 5bf62f2d3a8af353fac661b224fc1604d4de51ea added it
+ <braunr> tschwinge: no, but that looks interesting
+ <braunr> i'll have a look later
+ <tschwinge> Heh, "interesting". ;-)
+ <tschwinge> It seems related to vm_map's mask
+ parameter/ELF_MACHINE_USER_ADDRESS_MASK, though the latter is only used
+ in the mmap implementation in sysdeps/mach/hurd/dl-sysdep.c (in mmap.c, 0
+ is passed; perhaps due to the bug?).
+ <tschwinge> braunr: Anyway, I'd already welcome a patch to simply turn that
+ into a more comprehensible form.
+ <braunr> tschwinge: ELF_MACHINE_USER_ADDRESS_MASK is defined as "Mask
+ identifying addresses reserved for the user program, where the dynamic
+ linker should not map anything."
+ <braunr> about the vm_map parameter, which is a mask, it is described by
+ "Bits asserted in this mask must not be asserted in the address returned"
+ <braunr> so it's an alignment constraint
+ <braunr> the kludge disables alignment, apparently because gnumach doesn't
+ handle them correctly for some cases
+ <tschwinge> braunr: But ELF_MACHINE_USER_ADDRESS_MASK is 0xf8000000, so I'd
+ rather assume this means to restrict to addresses lower than 0xf8000000.
+ (What are higher ones reserved for?)
+ <braunr> tschwinge: the linker i suppose
+ <braunr> tschwinge: sorry, i don't understand what
+ ELF_MACHINE_USER_ADDRESS_MASK really is used for :/
+ <braunr> tschwinge: it looks unused for the other systems
+ <braunr> tschwinge: i guess it's just one way to partition the address
+ space, so that the linker knows where to load libraries and mmap can
+ still allocate large contiguous blocks
+ <braunr> tschwinge: 0xf8000000 means each "chunk" of linker/other blocks
+ are 128 MiB large
+ <tschwinge> braunr: OK, thanks for looking. I guess I'll ask Roland about
+ it.
+ <braunr> it could be that gnumach isn't good at aligning to large values
+[[!message-id ""]]
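The two readings of the mask in this log can be checked with a little arithmetic. The sketch below takes the quoted `vm_map` rule literally ("bits asserted in this mask must not be asserted in the address returned") and applies it to the value 0xf8000000 given for `ELF_MACHINE_USER_ADDRESS_MASK`. The helper names are invented for illustration, and whether the dynamic linker actually relies on this reading is not established by the log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value quoted in the log for ELF_MACHINE_USER_ADDRESS_MASK.  */
#define USER_ADDRESS_MASK UINT32_C(0xf8000000)

/* True if ADDR has none of the bits in MASK set, i.e. it would be an
   acceptable result under the quoted vm_map rule.  */
static bool addr_satisfies_mask(uint32_t addr, uint32_t mask)
{
    return (addr & mask) == 0;
}

/* Lowest set bit of MASK: the size of the smallest block the mask
   distinguishes.  Returns 0 for an empty mask.  */
static uint32_t mask_granule(uint32_t mask)
{
    return mask & (~mask + 1);
}
```

With a conventional alignment mask such as 0xfff, only page-aligned addresses pass. With 0xf8000000 the granule is 0x08000000 (128 MiB, matching the chunk size mentioned in the log), and under the literal rule only addresses below 0x08000000 pass, which is a range restriction rather than an alignment in the usual sense.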