IRC.

author: Thomas Schwinge <tschwinge@gnu.org> 2012-08-07 23:25:26 +0200
committer: Thomas Schwinge <tschwinge@gnu.org> 2012-08-07 23:25:26 +0200
commit: 2603401fa1f899a8ff60ec6a134d5bd511073a9d
tree: ccac6e11638ddeee8da94055b53f4fdfde73aa5c /open_issues
parent: d72694b33a81919368365da2c35d5b4a264648e0
27 files changed, 3272 insertions, 29 deletions
diff --git a/open_issues/alarm_setitimer.mdwn b/open_issues/alarm_setitimer.mdwn
index 99b2d7b6..3255683c 100644
--- a/open_issues/alarm_setitimer.mdwn
+++ b/open_issues/alarm_setitimer.mdwn
@@ -21,3 +21,11 @@ See also the attached file: on other OSes (e.g. Linux) it blocks waiting
 for a signal, while on GNU/Hurd it gets a new alarm and exits.
 
 [[alrm.c]]
+
+
+# IRC, freenode, #hurd, 2012-07-29
+
+    <braunr> our setitimer is bugged
+    <braunr> it doesn't seem to leave a timer disarmed when the interval
+      is set to 0
+    <braunr> (which means a one shot timer is actually periodic ..)
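Editor's sketch, not part of the commit: the behaviour described above can be checked with a small C test. It arms `ITIMER_REAL` with a zero `it_interval` and counts `SIGALRM` deliveries over several multiples of the expiry time; per POSIX a zero interval means the timer is not reloaded, so the count should be 1. The helper names (`on_alarm`, `sleep_ms`, `count_one_shot_expirations`) and the 50 ms / 300 ms timings are assumptions chosen for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Number of SIGALRM deliveries seen so far. */
static volatile sig_atomic_t alarm_count;

static void on_alarm(int signo)
{
    (void)signo;
    alarm_count++;
}

/* Sleep for the full duration, resuming after signal interruptions. */
static void sleep_ms(long ms)
{
    struct timespec ts;

    assert(ms >= 0);
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    /* On EINTR nanosleep stores the remaining time back into ts. */
    while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
        ;
}

/* Arm a one-shot ITIMER_REAL timer and report how many times it fired
   while we waited for six times its expiry time.  Returns -1 on error. */
int count_one_shot_expirations(void)
{
    struct sigaction sa;
    struct itimerval tv;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    alarm_count = 0;
    memset(&tv, 0, sizeof tv);
    tv.it_value.tv_usec = 50 * 1000;  /* first and only expiry: 50 ms */
    /* it_interval stays zero, so the timer must not be reloaded. */
    if (setitimer(ITIMER_REAL, &tv, NULL) == -1)
        return -1;

    sleep_ms(300);
    return (int)alarm_count;
}
```

Calling `count_one_shot_expirations()` from `main` and printing the result gives a quick regression check: a value greater than 1 would indicate the periodic behaviour reported in the log.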
diff --git a/open_issues/automatic_backtraces_when_assertions_hit.mdwn b/open_issues/automatic_backtraces_when_assertions_hit.mdwn
index 1cfacaf5..71007f99 100644
--- a/open_issues/automatic_backtraces_when_assertions_hit.mdwn
+++ b/open_issues/automatic_backtraces_when_assertions_hit.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
 
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
 id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -10,9 +10,65 @@ License|/fdl]]."]]"""]]
 
 [[!tag open_issue_glibc]]
 
-IRC, unknown channel, unknown date.
+
+# IRC, unknown channel, unknown date
 
     <azeem> tschwinge: ext2fs.static: thread-cancel.c:55: hurd_thread_cancel: Assertion `! __spin_lock_locked (&ss->critical_section_lock)' failed.
     <youpi> it'd be great if we could have backtraces in such case
     <youpi> at least just the function names
     <youpi> and in this case (static), just addresses would be enough
+
+
+# IRC, freenode, #hurd, 2012-07-19
+
+In context of the [[ext2fs_libports_reference_counting_assertion]].
+
+    <braunr> pinotree: tschwinge: do you know if our packages are built with
+      -rdynamic ?
+    <pinotree> braunr: debian's cflags don't include it, so unless the upstream
+      build systems do, -rdynamic is not added
+    <braunr> i doubt glibc's backtrace() is able to find debugging symbol files
+      on its own
+    <pinotree> what do you mean?
+    <braunr> the port reference bug youpi noticed is rare
+    <pinotree> even on linux, a program compiled with normal optimizations (eg
+      -O2 -g) can give just pointer values in backtrace()'s output
+    <braunr> core dumps are unreliable at best
+
+[[crash_server]].
+
+    <braunr> uh, no, backtrace does give names
+    <braunr> but not with -fomit-frame-pointer
+    <braunr> unless the binary is built with -rdynamic
+    <braunr> at least it used to
+    <pinotree> not really, when being optimized some steps can be optimized
+      away (eg inlines)
+    <braunr> that's ok
+    <braunr> anyway, the point is i'd like a way that can give us as much
+      information as possible when the problem happens
+    <braunr> the stack trace being the most useful imo
+    <pinotree> do you face issues currently with backtrace()?
+    <braunr> not tried yet
+    <braunr> i guess i could make the application trap in the kernel, and fault
+      there, so we can attach gdb while still in the pager address space :>
+    <pinotree> that would imply the need for interactivity when the fault
+      happens, wouldn't it?
+    <braunr> no
+    <braunr> it would remain this way until someone comes, hours, days later
+    <braunr> pinotree: well ok, it would require interactivity, but not *when*
+      it happens ;p
+    <braunr> pinotree: right, it needs -rdynamic
+
+
+## IRC, freenode, #hurd, 2012-07-21
+
+    <braunr> tschwinge: my current "approach" is to introduce an infinite loop
+    <braunr> it makes the faulting task mapped in often enough to use gdb
+      through qemu
+    <braunr> ... :)
+    <tschwinge> My understanding is that glibc already does have some mechanism
+      for that: I have seen it print backtraces when detecting malloc
+      inconsistencies (double free and the like).
+    <braunr> yes, i thought it used the backtrace functions internally though
+    <braunr> that is, execinfo
+    <braunr> but this does require -rdynamic
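Editor's sketch, not part of the commit: the execinfo functions mentioned above can be wrapped in an assertion macro that prints the call chain before aborting. `backtrace()` and `backtrace_symbols_fd()` are the glibc interfaces from `<execinfo.h>`; `dump_backtrace`, `ASSERT_BT` and `MAX_FRAMES` are hypothetical names. As the log notes, function names typically appear in the output only when the binary is linked with `-rdynamic`; otherwise the output is mostly raw addresses.

```c
#include <assert.h>
#include <execinfo.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Write the current call chain to fd, one frame per line, and return the
   number of frames captured. */
int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int n;

    assert(fd >= 0);
    n = backtrace(frames, MAX_FRAMES);
    /* Unlike backtrace_symbols(), this variant does not call malloc(),
       which matters when the heap may already be corrupted. */
    backtrace_symbols_fd(frames, n, fd);
    return n;
}

/* Hypothetical assert replacement: print a backtrace, then abort. */
#define ASSERT_BT(expr)                          \
    do {                                         \
        if (!(expr)) {                           \
            dump_backtrace(STDERR_FILENO);       \
            abort();                             \
        }                                        \
    } while (0)
```

A translator could use `ASSERT_BT(pi->refcnt || pi->weakrefcnt)` in place of a plain `assert`; whether this is usable inside a statically linked `ext2fs.static` is not established by the discussion.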
diff --git a/open_issues/bpf.mdwn b/open_issues/bpf.mdwn
index e24d761b..02dc7f87 100644
--- a/open_issues/bpf.mdwn
+++ b/open_issues/bpf.mdwn
@@ -585,3 +585,11 @@ This is a collection of resources concerning *Berkeley Packet Filter*s.
       in libpcap, and let users of that library benefit from it
     <braunr> instead of implementing the low level bpf interface, which
       nonetheless has some system-specific variants ..
+
+
+## IRC, freenode, #hurd, 2012-08-03
+
+In context of the [[select]] issue.
+
+    <braunr> i understand now why my bpf translator was so buggy
+    <braunr> the condition_timedwait i wrote at the time was .. incomplete :)
diff --git a/open_issues/dde.mdwn b/open_issues/dde.mdwn
index aff988d5..8f00c950 100644
--- a/open_issues/dde.mdwn
+++ b/open_issues/dde.mdwn
@@ -31,6 +31,18 @@ A similar problem is described in
 [[community/gsoc/project_ideas/unionfs_boot]], and needs to be implemented.
 
 
+### IRC, freenode, #hurd, 2012-07-17
+
+    <bddebian> OK, here is a stupid question I have always had.  If you move
+      PCI and disk drivers into userspace, how do you do the initial bootstrap
+      to get the system booting?
+    <braunr> that's hard
+    <braunr> basically you make the boot loader load all the components you
+      need in ram
+    <braunr> then you make it give each component something (ports) so they can
+      communicate
+
+
 # Upstream Status
 
 
@@ -90,6 +102,9 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
       automatically, or you have to settrans yourself to setup a device?
     <youpi> there's no autoloader for now
     <youpi> we'd need a bus arbitrer that'd do autoprobing
+
+[[PCI_arbiter]].
+
     <pinotree> i see
     <pinotree> (you see i'm not really that low level, so pardon the flood of
       posssibly-noobish questions ;) )
@@ -200,21 +215,10 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
     <antrik> right
 
 
-# IRC, freenode, #hurd, 2012-02-19
-
-    <youpi> antrik: we should probably add a gsoc idea on pci bus arbitration
-    <youpi> DDE is still experimental for now so it's ok that you  have to
-      configure it by hand, but it should be automatic at some ponit
-
+# [[PCI_Arbiter]]
 
 ## IRC, freenode, #hurd, 2012-02-21
 
-    <braunr> i'm not familiar with the new gnumach interface for userspace
-      drivers, but can this pci enumerator be written with it as it is ?
-    <braunr> (i'm not asking for a precise answer, just yes - even probably -
-      or no)
-    <braunr> (idk or utsl will do as well)
-    <youpi> I'd say yes
     <youpi> since all drivers need is interrupts, io ports and iomem
     <youpi> the latter was already available through /dev/mem
     <youpi> io ports through the i386 rpcs
diff --git a/open_issues/ext2fs_deadlock.mdwn b/open_issues/ext2fs_deadlock.mdwn
index 369875fe..23f54a4a 100644
--- a/open_issues/ext2fs_deadlock.mdwn
+++ b/open_issues/ext2fs_deadlock.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
 
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
 id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -44,9 +44,8 @@ pull the information out of the process' memory manually (how to do that,
 anyways?), and also didn't have time to continue with debugging GDB itself, but
 this sounds like a [[!taglink open_issue_gdb]]...)
 
----
 
-IRC, #hurd, 2010-10-27
+# IRC, freenode, #hurd, 2010-10-27
 
     <youpi> thread 8 hung on ports_begin_rpc
     <youpi> that's probably where one could investigated first
diff --git a/open_issues/ext2fs_libports_reference_counting_assertion.mdwn b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn
new file mode 100644
index 00000000..ff1c4c38
--- /dev/null
+++ b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn
@@ -0,0 +1,93 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+    libports/port-ref.c:31: ports_port_ref: Assertion `pi->refcnt || pi->weakrefcnt' failed
+
+This is seen every now and then.
+
+
+# [[gnumach_page_cache_policy]]
+
+With that patch in place, the assertion failure is seen more often.
+
+
+## IRC, freenode, #hurd, 2012-07-14
+
+    <youpi> braunr: I'm getting ext2fs.static:
+      /usr/src/hurd-debian/./libports/port-ref.c:31: ports_port_ref: Assertion
+      `pi->refcnt || pi->weakrefcnt' failed.
+    <youpi> oddly enough, that happens on one of the buildds only
+    <braunr> :/
+    <braunr> i fear the patch can wake many of these issues
+
+
+## IRC, freenode, #hurd, 2012-07-15
+
+    <youpi> braunr: same assertion failed on a second buildd
+    <braunr> can you paste it again please ?
+    <youpi> ext2fs.static: /usr/src/hurd-debian/./libports/port-ref.c:31:
+      ports_port_ref: Assertion `pi->refcnt || pi->weakrefcnt' failed.
+    <braunr> or better, answer the ml thread for future reference
+    <braunr> thanks
+    <youpi> braunr: I can't keep your patch on the buildds, it makes them too
+      unreliable
+    <braunr> youpi: ok
+    <braunr> i never got this error though, that's weird
+    <braunr> youpi: was the failure during the same build ?
+    <youpi> no, it was during package installation, and not the same
+    <youpi> braunr: note that I've already seen such errors, it's not new, but
+      it was way rarer
+    <youpi> like every month only
+    <braunr> ah ok
+    <braunr> yes it's less surprising then
+    <braunr> a tricky reference counting / locking mistake somewhere in the
+      hurd :) ...
+    <braunr> ah ! just got it !
+    <bddebian> braunr: Got the error or found the problem? :)
+    <braunr> the former unfortunately :/
+
+
+## IRC, freenode, #hurd, 2012-07-19
+
+    <braunr> hm, i think those ext2fs port refs errors may also be due to stack
+      overflows
+    <pinotree> --verbose
+    <braunr> hm ?
+    <braunr> http://lists.gnu.org/archive/html/bug-hurd/2012-07/msg00051.html
+    <pinotree> i mean, why do you think they could be due to that?
+    <braunr> the error is that both strong and weak refs in a port are 0 when
+      adding a reference
+    <braunr> weak refs are almost never used so let's forget about them
+    <braunr> when a ref count drops to 0, the port is automatically deallocated
+    <braunr> so what other than memory corruption setting this counter to 0
+      could possibly do that ? :)
+    <pinotree> one could also guess an unbalanced ref/unref logic, somehow
+    <braunr> what do you mean ?
+    <pinotree> that because of a bug, an early return, etc. a port gets
+      unref'ed more often than it is ref'ed
+    <braunr> highly unlikely, as they're protected by a lock
+    <braunr> pinotree: ah you mean, the object gets deallocated early because
+      of a deref overflow ?
+    <braunr> pinotree: could be, yes
+    <braunr> pinotree: i wonder if it could happen because of the periodic sync
+      duplicating the node table without holding references
+    <braunr> rah, libports uses a big lock in many places :(
+    <pinotree> braunr: yes, i meant that
+    <braunr> we could try using libduma some day
+    <braunr> i wonder if it could work out of the box
+    <pinotree> but that wouldn't help to find out whether a port gets deref'ed
+      too often, for instance
+    <pinotree> although it could be adapted to do so, i guess
+    <braunr> reproducing + a call trace or core would be best, but i'm not even
+      sure we can get that easily lol
+
+[[automatic_backtraces_when_assertions_hit]].
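Editor's sketch, not part of the commit: the "unbalanced ref/unref" hypothesis from the discussion can be illustrated with a toy model. `struct demo_port` and the `demo_*` functions are hypothetical stand-ins, not the libports API; only the checked condition (`refcnt || weakrefcnt`) is taken from the assertion message above. The point of the sketch is that a lock around each operation does not protect against one code path dropping a reference twice.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted port object. */
struct demo_port {
    int refcnt;      /* strong references */
    int weakrefcnt;  /* weak references */
    bool freed;      /* set once the last reference is dropped */
};

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take a strong reference.  Returns false in exactly the state that the
   assertion in the log rejects: both counters already at zero. */
bool demo_port_ref(struct demo_port *pi)
{
    bool ok;

    pthread_mutex_lock(&demo_lock);
    ok = pi->refcnt || pi->weakrefcnt;
    if (ok)
        pi->refcnt++;
    pthread_mutex_unlock(&demo_lock);
    return ok;
}

/* Drop a strong reference; the last one marks the object as freed. */
void demo_port_deref(struct demo_port *pi)
{
    pthread_mutex_lock(&demo_lock);
    assert(pi->refcnt > 0);
    if (--pi->refcnt == 0 && pi->weakrefcnt == 0)
        pi->freed = true;  /* real code would deallocate here */
    pthread_mutex_unlock(&demo_lock);
}

/* One extra deref on some path leaves the counters at zero, so the next
   legitimate caller is the one that trips the check. */
bool demo_ref_after_extra_deref(void)
{
    struct demo_port port = { 1, 0, false };  /* creator's reference */

    demo_port_ref(&port);    /* a user takes a reference: refcnt == 2 */
    demo_port_deref(&port);  /* and drops it:             refcnt == 1 */
    demo_port_deref(&port);  /* bug: dropped a second time, refcnt == 0 */

    return demo_port_ref(&port);
}
```

In this model the failure surfaces far from the faulty deref, which is consistent with the difficulty of getting a useful trace that the log describes.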
diff --git a/open_issues/glibc/t/tls-threadvar.mdwn b/open_issues/glibc/t/tls-threadvar.mdwn
index e72732ab..4afd8a1a 100644
--- a/open_issues/glibc/t/tls-threadvar.mdwn
+++ b/open_issues/glibc/t/tls-threadvar.mdwn
@@ -29,3 +29,32 @@ IRC, freenode, #hurd, 2011-10-23:
 
 After this has been done, probably the whole `__libc_tsd_*` stuff can be
 dropped altogether, and `__thread` directly be used in glibc.
+
+
+# IRC, freenode, #hurd, 2012-08-07
+
+    <tschwinge> r5219: Update libpthread patch to replace threadvar with tls
+      for pthread_self
+    <tschwinge> r5224: revert r5219 too, it's not ready either
+    <youpi> as the changelog says, the __thread revertal is because it posed
+      problems
+    <youpi> and I just didn't have any time to check them while the freeze was
+      so close
+    <tschwinge> OK.  What kind of problems?  Should it be reverted upstream,
+      too?
+    <youpi> I don't remember exactly
+    <youpi> it should just be fixed
+    <youpi> we can revert it upstream, but it'd be good that we manage to
+      progress, at some point...
+    <tschwinge> Of course -- however as long as we don't know what kind of
+      problem, it is a bit difficult.  ;-)
+    <youpi> since I didn't leave a note, it was most probably a mere glibc run,
+      or boot with the patched libpthread
+    <youpi> *testsuite run
+    <tschwinge> OK.
+    <tschwinge> The libpthread testsuite doesn't show any issues with that
+      patch applied, though.  But I didn't test anything else.
+    <tschwinge> youpi: Also, you have probably seen my glibc __thread errno
+      email -- rmcgrath wanted to find some time this week to comment/help, and
+      I take it you don't have any immediate comments to that issue?
+    <youpi> I saw the mails, but didn't investigate at all
diff --git a/open_issues/gnat.mdwn b/open_issues/gnat.mdwn
index fb624fad..2d17e275 100644
--- a/open_issues/gnat.mdwn
+++ b/open_issues/gnat.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
 
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
 id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -38,6 +38,55 @@ svn://svn.debian.org/gcccvs/branches/sid@5638
 6ca36cf4-e1d1-0310-8c6f-e303bb2178ca'
 
 
+## IRC, freenode, #hurd, 2012-07-17
+
+    <gnu_srs> I've found the remaining problem with gnat backtrace for Hurd!
+      Related to the stack frame.
+    <gnu_srs> This version does not work: one relying on static assumptions
+      about the frame layout
+    <gnu_srs> Causing segfaults.
+    <gnu_srs> Any interest to create a test case out of that piece of code,
+      taken from gcc/ada/tracebak.c?
+    <braunr> gnu_srs: sure
+
+
+### IRC, freenode, #hurd, 2012-07-18
+
+    <braunr> "Digging further revealed that the GNU/Hurd stack frame does not
+      seem to
+    <braunr> be static enough to define USE_GENERIC_UNWINDER in
+      gcc/ada/tracebak.c.
+    <braunr> "
+    <braunr> what do you mean by a "stack frame does not seem to be static
+      enough" ?
+    <gnu_srs> I can quote from the source file if you want. Otherwise look at
+      the code yourself: gcc/ada/tracebak.c
+    <gnu_srs> I mean that something is wrong with the stack frame for
+      Hurd. This is the code I wanted to use as a test case for the stack.
+    <gnu_srs> Remember?
+    <braunr> more or less
+    <braunr> ah, "static assumptions"
+    <braunr> all right, i don't think anything is "wrong" with stack frames
+    <braunr> but if you use a recent version of gcc, as indicated in the code,
+      -fomit-frame-pointer is enabled by default
+    <braunr> so your stack frame won't look like it used to be without the
+      option
+    <braunr> hence the need for USE_GCC_UNWINDER
+    <braunr> http://en.wikipedia.org/wiki/Call_stack explains this very well
+    <gnu_srs> However, kfreebsd does not seem to need USE_GCC_UNWINDER, how
+      come?
+    <braunr> i guess they don't omit the frame pointer
+    <braunr> your fix is good btw
+    <gnu_srs> thanks
+
+
+### IRC, freenode, #hurd, 2012-07-19
+
+    <gnu_srs> tschwinge: The bug in #681998 should go upstream. Applied in
+      Debian already. Hopefully this is the last patch needed for the port of
+      GNAT to Hurd.
+
+
 ---
 
 
diff --git a/open_issues/gnumach_page_cache_policy.mdwn b/open_issues/gnumach_page_cache_policy.mdwn
index 03cb3725..375e153b 100644
--- a/open_issues/gnumach_page_cache_policy.mdwn
+++ b/open_issues/gnumach_page_cache_policy.mdwn
@@ -108,6 +108,9 @@ License|/fdl]]."]]"""]]
       12k random data
     <braunr> i'll try with other values
     <braunr> i get crashes, deadlocks, livelocks, and it's not pretty :)
+
+[[libpager_deadlock]].
+
     <braunr> and always in ext2, mach doesn't seem affected by the issue, other
       than the obvious
     <braunr> (well i get the usual "deallocating an invalid port", but as
@@ -625,3 +628,146 @@ License|/fdl]]."]]"""]]
 
 
 ## [[metadata_caching]]
+
+
+## IRC, freenode, #hurd, 2012-07-12
+
+    <braunr> i'm only adding a cached pages count you know :)
+    <braunr> (well actually, this is now a vm_stats call that can replace
+      vm_statistics, and uses flavors similar to task_info)
+    <braunr> my goal being to see that yellow bar in htop
+    <braunr> ... :)
+    <pinotree> yellow?
+    <braunr> yes, yellow
+    <braunr> as in http://www.sceen.net/~rbraun/htop.png
+    <pinotree> ah
+
+
+## IRC, freenode, #hurd, 2012-07-13
+
+    <braunr> i always get a "no more room for vm_map_enter" error when building
+      glibc :/
+    <braunr> but the build continues, probably a failed test
+    <braunr> ah yes, i can see the yellow bar :>
+    <antrik> braunr: congrats :-)
+    <braunr> antrik: thanks
+    <braunr> but i think my patch can't make it into the git repo until the
+      swap deadlock is solved (or at least very infrequent ..)
+
+[[libpager_deadlock]].
+
+    <braunr> well, the page cache accounting tells me something is wrong there
+      too lol
+    <braunr> during a build 112M of data was created, of which only 28M made it
+      into the cache
+    <braunr> which may imply something is still holding references on the
+      other objects (shadow objects hold references to their underlying
+      object, which could explain this)
+    <braunr> ok i'm stupid, i just forgot to subtract the cached pages from the
+      used pages .. :>
+    <braunr> (hm, actually i'm tired, i don't think this should be done)
+    <braunr> ahh yes much better
+    <braunr> i simply forgot to convert pages to kilobytes .... :>
+    <braunr> with the fix, the accounting of cached files is perfect :)
+
+
+## IRC, freenode, #hurd, 2012-07-14
+
+    <youpi> braunr: btw, if you want to stress big builds, you might want to
+      try webkit, ppl, rquantlib, rheolef, yade
+    <youpi> they don't pass on bach (1.3GiB), but do on ironforge (1.8GiB)
+    <braunr> youpi: i don't need to, i already know my patch triggers swap
+      deadlocks more often, which was expected
+    <youpi> k
+    <braunr> there are 3 tasks concerning my work : 1/ page cache accounting
+      (i'm sending the patch right now) 2/ removing the fixed limit and 3/
+      hunting the swap deadlock and fixing as much as possible
+    <braunr> 2/ can't get in the repository without 3/ imo
+    <youpi> btw, the increase of PAGE_FREE_* in your 2/ could go already,
+      couldn't it?
+    <braunr> yes
+    <braunr> but we should test with higher thresholds
+    <braunr> well
+    <braunr> it really depends on the usage pattern :/
+
+
+## [[ext2fs_libports_reference_counting_assertion]]
+
+
+## IRC, freenode, #hurd, 2012-07-15
+
+    <braunr> concerning the page cache patch, i've been using it for quite some
+      time now, did lots of builds with it, and i actually wonder if it hurts
+      stability as much as i think
+    <braunr> considering i didn't stress the system as much before
+    <braunr> and it really improves performance
+
+    <braunr> cached memobjs:   138606
+    <braunr> cache:             1138M
+    <braunr> i bet ext2fs can have a hard time scanning 138k entries in a
+      linked list, using callback functions on each of them :x
+
+
+
+## IRC, freenode, #hurd, 2012-07-16
+
+    <tschwinge> braunr: Sorry that I didn't have better results to present.
+      :-/
+    <braunr> eh, that was expected :)
+    <braunr> my biggest problem is the hurd itself :/
+    <braunr> for my patch to be useful (and the rest of the intended work), the
+      hurd needs some serious fixing
+    <braunr> not syncing from the pagers
+    <braunr> and scalable algorithms everywhere of course
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+    <braunr> youpi: FYI, the branches rbraun/page_cache in the gnupach and hurd
+      repos are ready to be merged after review
+    <braunr> gnumach*
+    <youpi> so you fixed the hangs & such?
+    <braunr> they only contain the cache stats, not the "improved" cache
+    <braunr> no
+    <braunr> it requires much more work for that :)
+    <youpi> braunr: my concern is that the tests on buildds show stability
+      regression
+    <braunr> youpi: tschwinge also reported performance degradation
+    <braunr> and not the minor kind
+    <youpi> uh
+    <tschwinge> :-/
+    <braunr> far less pageins, but twice as many pageouts, and probably high
+      cpu overhead
+    <braunr> building (which is what buildds do) means lots of small files
+    <braunr> so lots of objects
+    <braunr> huge lists, long scans, etc..
+    <braunr> so it definitely requires more work
+    <braunr> the stability issue comes first to mind, and i don't see a way to
+      obtain a usable trace
+    <braunr> do you ?
+    <youpi> nope
+    <braunr> (except making it loop forever instead of calling assert() and
+      attach gdb to a qemu instance)
+    <braunr> youpi: if you think the infinite loop trick is ok, we could
+      proceed with that
+    <youpi> which assert?
+    <braunr> the port refs one
+    <youpi> which one?
+    <braunr> which prevented you from using the page cache patch on buildds
+    <youpi> ah, the libports one
+    <youpi> for that one, I'd tend to take the time to perhaps use coccicheck
+      actually
+
+[[code_analysis]].
+
+    <braunr> oh
+    <youpi> it's one of those which is supposed to be statically ananyzable
+    <youpi> s/n/l
+    <braunr> that would be great
+    <tschwinge> :-)
+    <tschwinge> And set precedence.
+
+
+## IRC, freenode, #hurd, 2012-07-26
+
+    <braunr> hm i killed darnassus, probably the page cache patch again
diff --git a/open_issues/gnumach_vm_map_red-black_trees.mdwn b/open_issues/gnumach_vm_map_red-black_trees.mdwn
index d7407bfe..7a54914f 100644
--- a/open_issues/gnumach_vm_map_red-black_trees.mdwn
+++ b/open_issues/gnumach_vm_map_red-black_trees.mdwn
@@ -172,3 +172,29 @@ License|/fdl]]."]]"""]]
       crasher le noyau)
     <braunr> (enfin jveux dire, qui faisait crasher le noyau de façon très
       obscure avant le patch rbtree)
+
+
+### IRC, freenode, #hurd, 2012-07-15
+
+    <bddebian> I get errors in vm_map.c whenever I try to "mount" a CD
+    <bddebian> Hmm, this time it rebooted the machine
+    <bddebian> braunr: The translator set this time and the machine reboots
+      before I can get the full message about vm_map, but here is some of the
+      crap I get:  http://paste.debian.net/179191/
+    <braunr> oh
+    <braunr> nice
+    <braunr> that may be the bug youpi saw with my redblack tree patch
+    <braunr> bddebian: assert(diff != 0); ?
+    <bddebian> Aye
+    <braunr> good
+    <braunr> it means we're trying to insert a vm_map_entry at a region in a
+      map which is already occupied
+    <bddebian> Oh
+    <braunr> and unlike the previous code, the tree actually checks that
+    <braunr> it has to
+    <braunr> so you just simply use the iso9660fs translator and it crashes ?
+    <bddebian> Well it used to on just trying to set the translator.  This time
+      I was able to set the translator but as soon as I cd to the mount point I
+      get all that crap
+    <braunr> that's very good
+    <braunr> more test cases to fix the vm
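Editor's sketch, not part of the commit: the check described above (a comparison that yields 0 when a new entry lands on an occupied region) can be modelled with a plain binary search tree. This is a simplification under stated assumptions: the tree is unbalanced rather than red-black, `struct demo_entry` and the `demo_*` functions are hypothetical, and the overlap is reported through a return value where the log mentions `assert(diff != 0)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified map entry covering the range [start, end). */
struct demo_entry {
    unsigned long start, end;
    struct demo_entry *left, *right;
};

/* Order entries by address range: negative if a lies entirely below b,
   positive if entirely above, and 0 if the two ranges overlap. */
static int demo_entry_cmp(const struct demo_entry *a,
                          const struct demo_entry *b)
{
    if (a->end <= b->start)
        return -1;
    if (a->start >= b->end)
        return 1;
    return 0;
}

/* Insert entry into the tree rooted at *root.  Returns 0 on success and
   -1 if the range is already occupied. */
int demo_map_insert(struct demo_entry **root, struct demo_entry *entry)
{
    assert(entry->start < entry->end);
    entry->left = entry->right = NULL;

    while (*root != NULL) {
        int diff = demo_entry_cmp(entry, *root);

        if (diff == 0)
            return -1;  /* overlap: the point where diff != 0 is asserted */
        root = diff < 0 ? &(*root)->left : &(*root)->right;
    }
    *root = entry;
    return 0;
}
```

A list-based map can silently accept such an insertion, whereas a tree keyed on address ranges has to decide on which side the new entry belongs, which is why the overlap becomes visible there.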
diff --git a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
index 80fc9fcd..57eb403d 100644
--- a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
+++ b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation,
+Inc."]]
 
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
 id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -104,3 +105,11 @@ License|/fdl]]."]]"""]]
       of embedding it ?
     <braunr> right
     <antrik> now that's a good question... no idea TBH :-)
+
+
+# IRC, freenode, #hurd, 2012-07-23
+
+    <pinotree> aren't libmachuser and libhurduser supposed to be slowly faded
+      out?
+    <tschwinge> pinotree: That discussion has not yet come to a conclusion, I
+      think.  (I'd say: yes.)
diff --git a/open_issues/libpager_deadlock.mdwn b/open_issues/libpager_deadlock.mdwn
new file mode 100644
index 00000000..017ecff6
--- /dev/null
+++ b/open_issues/libpager_deadlock.mdwn
@@ -0,0 +1,165 @@
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+Deadlocks in libpager/periodic sync have been found.
+
+
+# [[gnumach_page_cache_policy]]
+
+
+## IRC, freenode, #hurd, 2012-07-12
+
+    <braunr> ah great, a paper about the mach pageout daemon !
+    <mcsim> braunr: Where is paper about the mach pageout daemon?
+    <braunr> ftp://ftp.cs.cmu.edu/project/mach/doc/published/defaultmm.ps
+    <braunr> might give us a clue about the swap deadlock (although i still
+      have a few ideas to check)
+    <braunr>
+      http://www.sceen.net/~rbraun/moving_the_default_memory_manager_out_of_the_mach_kernel.pdf
+    <braunr> we should more seriously consider sergio's advisory pageout branch
+      some day
+    <braunr> i'll try to get in touch with him about that before he completely
+      loses interest
+    <braunr> i'll include it in my "make that page cache as decent as possible"
+      task
+    <braunr> many of his comments match what i've seen
+    <braunr> and we both did a few optimizations the same way
+    <braunr> (like not deactivating pages when they enter the cache)
+
+
+## IRC, freenode, #hurd, 2012-07-13
+
+    <braunr> antrik: i'm able to consistently reproduce the swap deadlocks you
+      regularly had when using apt with my page cache patch
+    <braunr> it happens when lots of dirty pages are write back to their pagers
+    <braunr> so apt, or a big file copy or anything that writes several MiB
+      very quickly is a good candidate
+    <braunr> written*
+    <antrik> braunr: nice...
+    <braunr> antrik: well in a way, yes, as it will allow us to track it more
+      easily
+
+
+## IRC, freenode, #hurd, 2012-07-15
+
+    <braunr> oh btw, i think i can say with confidence that the hurd *doesn't*
+      deadlock
+    <braunr> (at least, concerning swapping)
+    <braunr> lol, one of my hurd systems has been hitting the "swap deadlock"
+      for more than an hour, and suddenly got out of it
+    <braunr> something is really wrong in the pageout daemon, but it's not a
+      deadlock
+    <youpi> a livelock then
+    <braunr> do you get out of livelocks ?
+    <braunr> i mean, it's not even a "lock"
+    <braunr> just a big damn tricky slowdown 
+    <youpi> yes, you can, by giving a few more resources for instance
+    <youpi> depends on the kind of livelock of course
+    <braunr> i think it's that
+    <braunr> the pageout daemon clearly throttles itself, waiting for pagers to
+      complete
+    <braunr> and another dangerous thing is the line in vm_resident, which only
+      wakes one thread to avoid starvation
+    <braunr> hum, during the livelock, the kernel spends much time waiting in
+      db_read_address
+    <braunr> could be a bad stack
+    <braunr> so, the pageout daemon seems to slow itself as much as waiting
+      several seconds between each iteration when under load
+    <braunr> but each iteration possibly removes clean pages
+    <braunr> so at some point, there is enough memory to unblock waiting pagers
+    <braunr> for now i'll try a simple solution, like limiting the pausing
+      delay
+    <braunr> but we'll need more page lists in the future (inactive-clean,
+      inactive-dirty, etc..)
+    <braunr> limiting the amount of dirty pages is the only way to really make
+      it safe actually
+    <braunr> wow, the pageout loop is still running even after many pages were
+      freed, and it is unable to free more pages
+    <braunr> i think i have an idea about the livelock
+    <braunr> i think it comes from the periodic syncing
+    <bddebian> Too often?
+    <braunr> that's not the problem
+    <braunr> the problem is that it can happen at the same time with paging
+    <bddebian> Oh
+    <braunr> if paging gets slow, it won't stop the periodic syncing
+    <braunr> which will grab any page it can as soon as some are free
+    <braunr> but then, before it even finishes, another sync may occur
+    <braunr> i have yet to check that it is possible
+    <braunr> and i don't understand why syncing isn't done by the kernel
+    <braunr> the kernel is supposed to handle the paging policy
+    <braunr> and it would make paging really scale
+    <bddebian> It's done on the Hurd side?
+    <braunr> (instead of having external pagers make one request for each
+      object, even if they're clean)
+    <braunr> yes
+    <bddebian> Hmm, interesting
+    <braunr> ofc, with ext2fs --debug, i can't reproduce anything
+    <bddebian> Ugh
+    <braunr> sync are serialized
+    <braunr> grmbl
+    <braunr> there is a big lock taken at sync time though
+    <braunr> uhg
+
+
+## IRC, freenode, #hurd, 2012-07-16
+
+    <braunr> all right so, there *is* a deadlock, and it may be due to the
+      default pager actually
+    <braunr> the vm_page_laundry_count doesn't decrease at some point, even
+      when there are more than enough free pages
+    <braunr> antrik: the thing is, i think the deadlock concerns the default
+      pager
+    <antrik> the deadlock?
+    <braunr> yes
+    <braunr> when swapping
+
+
+## IRC, freenode, #hurd, 2012-07-17
+
+    <braunr> i can't even reproduce the swap deadlock when using upstrea ext2fs
+      :(
+    <braunr> upstream*
+
+
+## IRC, freenode, #hurd, 2012-07-19
+
+    <braunr> the libpager deadlock patch looks wrong to me
+    <braunr> hm no, the libpager patch is ok actually
+
+
+## [[synchronous_ipc]]
+
+### IRC, freenode, #hurd, 2012-07-20
+
+    <braunr> but actually after reviewing more, the debian patch for this
+      particular issue seems correct
+    <antrik> well, it's most probably done by youpi, so I would be shocked if
+      it wasn't correct... ;-)
+    <braunr> he wasn't sure at all about it
+    <antrik> still ;-)
+    <braunr> :)
+    <antrik> well, if you also think it's correct, I guess it's time to push it
+      upstream...
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+    <braunr> i still can't conclude if we have any pageout deadlock, or if it's
+      simply a side effect of the active and inactive lists getting very very
+      large
+    <braunr> but almost every time this issue happens, it somehow recovers,
+      sometimes hours later
+
+
+# See Also
+
+  * [[ext2fs_deadlock]]
diff --git a/open_issues/libpthread.mdwn b/open_issues/libpthread.mdwn
index c5054b7f..03a52218 100644
--- a/open_issues/libpthread.mdwn
+++ b/open_issues/libpthread.mdwn
@@ -42,3 +42,527 @@ There is a [[!FF_project 275]][[!tag bounty]] on this task.
     <youpi> there'll still be the issue that only one will be initialized
     <youpi> and one that provides libc thread safety functions, etc.
     <pinotree> that's what i wanted to knew, thanks :)
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+    <bddebian> So I am not sure what to do with the hurd_condition_wait stuff
+    <braunr> i would also like to know what's the real issue with cancellation
+      here
+    <braunr> because my understanding is that libpthread already implements it
+    <braunr> does it look ok to you to make hurd_condition_timedwait return an
+      errno code (like ETIMEDOUT and ECANCELED) ?
+    <youpi> braunr: that's what pthread_* function usually do, yes
+    <braunr> i thought they used their own code
+    <youpi> no
+    <braunr> thanks
+    <braunr> well, first, do you understand what hurd_condition_wait is ?
+    <braunr> it's similar to condition_wait or pthread_cond_wait with a subtle
+      difference
+    <braunr> it differs from the original cthreads version by handling
+      cancellation
+    <braunr> but it also differs from the second by how it handles cancellation
+    <braunr> instead of calling registered cleanup routines and leaving, it
+      returns an error code
+    <braunr> (well simply !0 in this case)
+    <braunr> so there are two ways
+    <braunr> first, change the call to pthread_cond_wait
+    <bddebian> Are you saying we could fix stuff to use pthread_cond_wait()
+      properly?
+    <braunr> it's possible but not easy
+    <braunr> because you'd have to rewrite the cancellation code
+    <braunr> probably writing cleanup routines
+    <braunr> this can be hard and error prone
+    <braunr> and is useless if the code already exists
+    <braunr> so it seems reasonable to keep this hurd extension
+    <braunr> but now, as it *is* a hurd extension no one else uses
+    <antrik> braunr: BTW, when trying to figure out a tricky problem with the
+      auth server, cfhammer digged into the RPC cancellation code quite a bit,
+      and it's really a horrible complex monstrosity... plus the whole concept
+      is actually broken in some regards I think -- though I don't remember the
+      details
+    <braunr> antrik: i had the same kind of thoughts
+    <braunr> antrik: the hurd or pthreads ones ?
+    <antrik> not sure what you mean. I mean the RPC cancellation code -- which
+      involves thread management too
+    <braunr> ok
+    <antrik> I don't know how it is related to hurd_condition_wait though
+    <braunr> well i found two main entry points there
+    <braunr> hurd_thread_cancel and hurd_condition_wait
+    <braunr> and it didn't look that bad
+    <braunr> whereas in the pthreads code, there are many corner cases
+    <braunr> and even the standard itself looks insane
+    <antrik> well, perhaps the threading part is not that bad...
+    <antrik> it's not where we saw the problems at any rate :-)
+    <braunr> rpc interruption maybe ?
+    <antrik> oh, right... interruption is probably the right term
+    <braunr> yes that thing looks scary
+    <braunr> :))
+    <braunr> the migration thread paper mentions some things about the problems
+      concerning threads controllability
+    <antrik> I believe it's a very strong example for why building around
+      standard Mach features is a bad idea, instead of adapting the primitives
+      to our actual needs...
+    <braunr> i wouldn't be surprised if the "monstrosities" are work arounds
+    <braunr> right
+
+
+## IRC, freenode, #hurd, 2012-07-26
+
+    <bddebian> Uhm, where does /usr/include/hurd/signal.h come from?
+    <pinotree> head -n4 /usr/include/hurd/signal.
+    <pinotree> h
+    <bddebian> Ohh glibc?
+    <bddebian> That makes things a little more difficult :(
+    <braunr> why ?
+    <bddebian> Hurd includes it which brings in cthreads
+    <braunr> ?
+    <braunr> the hurd already brings in cthreads
+    <braunr> i don't see what you mean
+    <bddebian> Not anymore :)
+    <braunr> the system cthreads header ?
+    <braunr> well it's not that difficult to trick the compiler not to include
+      them
+    <bddebian> signal.h includes cthreads.h  I need to stop that
+    <braunr> just define the _CTHREADS_ macro before including anything
+    <braunr> remember that header files are normally enclosed in such macros to
+      avoid multiple inclusions
+    <braunr> this isn't specific to cthreads
+    <pinotree> converting hurd from cthreads to pthreads will make hurd and
+      glibc break source and binary compatibility
+    <bddebian> Of course
+    <braunr> reminds me of the similar issues of the late 90s
+    <bddebian> Ugh, why is he using _pthread_self()?
+    <pinotree> maybe because it accesses to the internals
+    <braunr> "he" ?
+    <bddebian> Thomas in his modified cancel-cond.c
+    <braunr> well, you need the internals to implement it
+    <braunr> hurd_condition_wait is similar to pthread_cond_wait, except
+      that instead of stopping the thread and calling cleanup routines, it
+      returns 1 if cancelled
+    <pinotree> not that i looked at it, but there's really no way to implement
+      it using public api?
+    <bddebian> Even if I am using glibc pthreads?
+    <braunr> unlikely
+    <bddebian> God I had all of this worked out before I dropped off for a
+      couple years.. :(
+    <braunr> this will come back :p
+    <pinotree> that makes you the perfect guy to work on it ;)
+    <bddebian> I can't find a pt-internal.h anywhere.. :(
+    <pinotree> clone the hurd/libpthread.git repo from savannah
+    <bddebian> Of course when I was doing this libpthread was still in hurd
+      sources...
+    <bddebian> So if I am using glibc pthread, why can't I use pthread_self()
+      instead?
+    <pinotree> that won't give you access to the internals
+    <bddebian> OK, dumb question time.  What internals?
+    <pinotree> the libpthread ones
+    <braunr> that's where you will find if your thread has been cancelled or
+      not
+    <bddebian> pinotree: But isn't that assuming that I am using hurd's
+      libpthread?
+    <pinotree> if you aren't inside libpthread, no
+    <braunr> pthread_self is normally not portable
+    <braunr> you can only use it with pthread_equal
+    <braunr> so unless you *know* the internals, you can't use it
+    <braunr> and you won't be able to do much
+    <braunr> so, as it was done with cthreads, hurd_condition_wait should be
+      close to the libpthread implementation
+    <braunr> inside, normally
+    <braunr> now, if it's too long for you (i assume you don't want to build
+      glibc)
+    <braunr> you can just implement it outside, grabbing the internal headers
+      for now
+    <pinotree> another "not that i looked at it" question: isn't there no way
+      to rewrite the code using that custom condwait stuff to use the standard
+      libpthread one?
+    <braunr> and once it works, it'll get integrated
+    <braunr> pinotree: it looks very hard
+    <bddebian> braunr: But the internal headers are assuming hurd libpthread
+      which isn't in the source anymore
+    <braunr> from what i could see while working on select, servers very often
+      call hurd_condition_wait
+    <braunr> and they return EINTR if cancelled
+    <braunr> so if you use the standard pthread_cond_wait function, your thread
+      won't be able to return anything, unless you push the reply in a
+      completely separate callback
+    <braunr> i'm not sure how well mig can cope with that
+    <braunr> i'd say it can't :)
+    <braunr> no really it looks ugly
+    <braunr> it's far better to have this hurd specific function and keep the
+      existing user code as it is
+    <braunr> bddebian: you don't need the implementation, only the headers
+    <braunr> the thread, cond, mutex structures mostly
+    <bddebian> I should turn <pt-internal.h> to "pt-internal.h" and just put it
+      in libshouldbeinlibc, no?
+    <pinotree> no, that header is not installed
+    <bddebian> Obviously not the "best" way
+    <bddebian> pinotree: ??
+    <braunr> pinotree: what does it change ?
+    <pinotree> braunr: it == ?
+    <braunr> bddebian: you could even copy it entirely in your new
+      cancel-cond.c and mention where it was copied from
+    <braunr> pinotree: it == pt-internal.h not being installed
+    <pinotree> that he cannot include it in libshouldbeinlibc sources?
+    <pinotree> ah, he wants to copy it?
+    <braunr> yes
+    <braunr> i want him to copy it actually :p
+    <braunr> it may be hard if there are a lot of macro options
+    <pinotree> the __pthread struct changes size and content depending on other
+      internal sysdeps headers
+    <braunr> well he needs to copy those too :p
+    <bddebian> Well even if this works we are going to have to do something
+      more "correct" about hurd_condition_wait.  Maybe even putting it in
+      glibc?
+    <braunr> sure
+    <braunr> but again, don't waste time on this for now
+    <braunr> make it *work*, then it'll get integrated
+    <bddebian> Like it has already?  This "patch" is only about 5 years old
+      now... ;-P
+    <braunr> but is it complete ?
+    <bddebian> Probably not :)
+    <bddebian> Hmm, I wonder how many undefined references I am going to get
+      though.. :(
+    <bddebian> Shit, 5
+    <bddebian> One of which is ___pthread_self.. :(
+    <bddebian> Does that mean I am actually going to have to build hurds
+      libpthreads in libshouldbeinlibc?
+    <bddebian> Seriously, do I really need ___pthread_self, __pthread_self,
+      _pthread_self and pthread_self???
+    <bddebian> I'm still unclear what to do with cancel-cond.c.  It seems to me
+      that if I leave it the way it is currently I am going to have to either
+      re-add libpthreads or steal all of the libpthreads code under
+      libshouldbeinlibc.
+    <braunr> then add it in libc
+    <braunr> glib
+    <braunr> glibc
+    <braunr> maybe under the name __hurd_condition_wait
+    <bddebian> Shouldn't I be able to interrupt cancel-cond stuff to use glibc
+      pthreads?
+    <braunr> interrupt ?
+    <bddebian> Meaning interject like they are doing.  I may be missing the
+      point but they are just obfuscating libpthreads thread with some other
+      "namespace"?  (I know my terminology is wrong, sorry).
+    <braunr> they ?
+    <bddebian> Well Thomas in this case but even in the old cthreads code,
+      whoever wrote cancel-cond.c
+    <braunr> but they use internal thread structures ..
+    <bddebian> Understood but at some level they are still just getting to a
+      libpthread thread, no?
+    <braunr> absolutely not ..
+    <braunr> there is *no* pthread stuff in the hurd
+    <braunr> that's the problem :p
+    <bddebian> Bah damnit...
+    <braunr> cthreads are directly implement on top of mach threads
+    <braunr> implemeneted*
+    <braunr> implemented*
+    <bddebian> Sure but hurd_condition_wait wasn't
+    <braunr> of course it is
+    <braunr> it's almost the same as condition_wait
+    <braunr> but returns 1 if a cancelation request was made
+    <bddebian> Grr, maybe I am just confusing myself because I am looking at
+      the modified (pthreads) version instead of the original cthreads version
+      of cancel-cond.c
+    <braunr> well if the modified version is fine, why not directly use that ?
+    <braunr> normally, hurd_condition_wait should sit next to other pthread
+      internal stuff
+    <braunr> it could be renamed __hurd_condition_wait, i'm not sure
+    <braunr> that's irrelevant for your work anyway
+    <bddebian> I am using it but it relies on libpthread and I am trying to use
+      glibc pthreads
+    <braunr> hum
+    <braunr> what's the difference between libpthread and "glibc pthreads" ?
+    <braunr> aren't glibc pthreads the merged libpthread ?
+    <bddebian> quite possibly but then I am missing something obvious.  I'm
+      getting ___pthread_self in libshouldbeinlibc but it is *UND*
+    <braunr> bddebian: with unmodified binaries ?
+    <bddebian> braunr: No I added cancel-cond.c to libshouldbeinlibc
+    <bddebian> And some of the pt-xxx.h headers
+    <braunr> well it's normal then
+    <braunr> i suppose
+    <bddebian> braunr: So how do I get those defined without including
+      pthreads.c from libpthreads? :)
+    <antrik> pinotree: hm... I think we should try to make sure glibc works
+      both with cthreads hurd and pthreads hurd. I hope that shouldn't be so
+      hard.
+    <antrik> breaking binary compatibility for the Hurd libs is not too
+      terrible I'd say -- as much as I'd like that, we do not exactly have a
+      lot of external stuff depending on them :-)
+    <braunr> bddebian: *sigh*
+    <braunr> bddebian: just add cancel-cond to glibc, near the pthread code :p
+    <bddebian> braunr: Wouldn't I still have the same issue?
+    <braunr> bddebian: what issue ?
+    <antrik> is hurd_condition_wait() the name of the original cthreads-based
+      function?
+    <braunr> antrik: the original is condition_wait
+    <antrik> I'm confused
+    <antrik> is condition_wait() a standard cthreads function, or a
+      Hurd-specific extension?
+    <braunr> antrik: as standard as you can get for something like cthreads
+    <bddebian> braunr: Where hurd_condition_wait is looking for "internals" as
+      you call them.  I.E. there is no __pthread_self() in glibc pthreads :)
+    <braunr> hurd_condition_wait is the hurd-specific addition for cancelation
+    <braunr> bddebian: who cares ?
+    <braunr> bddebian: there is a pthread structure, and conditions, and
+      mutexes
+    <braunr> you need those definitions
+    <braunr> so you either import them in the hurd
+    <antrik> braunr: so hurd_condition_wait() *is* also used in the original
+      cthread-based implementation?
+    <braunr> or you write your code directly where they're available
+    <braunr> antrik: what do you call "original" ?
+    <antrik> not transitioned to pthreads
+    <braunr> ok, let's simply call that cthreads
+    <braunr> yes, it's used by every hurd servers
+    <braunr> virtually
+    <braunr> if not really everyone of them
+    <bddebian> braunr: That is where you are losing me.  If I can just use
+      glibc pthreads structures, why can't I just use them in the new pthreads
+      version of cancel-cond.c which is what I was originally asking.. :)
+    <braunr> you *have* to do that
+    <braunr> but then, you have to build the whole glibc
+    * bddebian shoots himself
+    <braunr> and i was under the impression you wanted to avoid that
+    <antrik> do any standard pthread functions use identical names to any
+      standard cthread functions?
+    <braunr> what you *can't* do is use the standard pthreads interface
+    <braunr> no, not identical
+    <braunr> but very close
+    <braunr> bddebian: there is a difference between using pthreads, which
+      means using the standard posix interface, and using the glibc pthreads
+      structure, which means toying with the internal implementation
+    <braunr> you *cannot* implement hurd_condition_wait with the standard posix
+      interface, you need to use the internal structures
+    <braunr> hurd_condition_wait is actually a shurd specific addition to the
+      threading library
+    <braunr> hurd*
+    <antrik> well, in that case, the new pthread-based variant of
+      hurd_condition_wait() should also use a different name from the
+      cthread-based one
+    <braunr> so it's normal to put it in that threading library, like it was
+      done for cthreads
+    <braunr> 21:35 < braunr> it could be renamed __hurd_condition_wait, i'm not
+      sure
+    <bddebian> Except that I am trying to avoid using that threading library
+    <braunr> what ?
+    <bddebian> If I am understanding you correctly it is an extension to the
+      hurd specific libpthreads?
+    <braunr> to the threading library, whichever it is
+    <braunr> antrik: although, why not keeping the same name ?
+    <antrik> braunr: I don't think having hurd_condition_wait() for the cthread
+      variant and __hurd_condition_wait() would exactly help clarity...
+    <antrik> I was talking about a really new name. something like
+      pthread_hurd_condition_wait() or so
+    <antrik> braunr: to avoid confusion. to avoid accidentally pulling in the
+      wrong one at build and/or runtime.
+    <antrik> to avoid possible namespace conflicts
+    <braunr> ok
+    <braunr> well yes, makes sense
+    <bddebian> braunr: Let me state this as plainly as I hope I can.  If I want
+      to use glibc's pthreads, I have no choice but to add it to glibc?
+    <braunr> and pthread_hurd_condition_wait is a fine name
+    <braunr> bddebian: no
+    <braunr> bddebian: you either add it there
+    <braunr> bddebian: or you copy the headers defining the internal structures
+      somewhere else and implement it there
+    <braunr> but adding it to glibc is better
+    <braunr> it's just longer in the beginning, and now i'm working on it, i'm
+      really not sure
+    <braunr> add it to glibc directly :p
+    <bddebian> That's what I am trying to do but the headers use pthread
+      specific stuff which should be coming from glibc's pthreads
+    <braunr> yes
+    <braunr> well it's not the headers you need
+    <braunr> you need the internal structure definitions
+    <braunr> sometimes they're in c files for opacity
+    <bddebian> So ___pthread_self() should eventually be an obfuscation of
+      glibcs pthread_self(), no?
+    <braunr> i don't know what it is
+    <braunr> read the cthreads variant of hurd_condition_wait, understand it,
+      do the same for pthreads
+    <braunr> it's easy :p
+    <bddebian> For you bastards that have a clue!! ;-P
+    <antrik> I definitely vote for adding it to the hurd pthreads
+      implementation in glibc right away. trying to do it externally only adds
+      unnecessary complications
+    <antrik> and we seem to agree that this new pthread function should be
+      named pthread_hurd_condition_wait(), not just hurd_condition_wait() :-)
+
+
+## IRC, freenode, #hurd, 2012-07-27
+
+    <bddebian> OK this hurd_condition_wait stuff is getting ridiculous the way
+      I am trying to tackle it. :(  I think I need a new tactic.
+    <braunr> bddebian: what do you mean ?
+    <bddebian> braunr: I know I am thick headed but I still don't get why I
+      cannot implement it in libshouldbeinlibc for now but still use glibc
+      pthreads internals
+    <bddebian> I thought I was getting close last night by bringing in all of
+      the hurd pthread headers and .c files but it just keeps getting uglier
+      and uglier
+    <bddebian> youpi: Just to verify.  The /usr/lib/i386-gnu/libpthread.so that
+      ships with Debian now is from glibc, NOT libpthreads from Hurd right?
+      Everything I need should be available in glibc's libpthreads? (Except for
+      hurd_condition_wait obviously).
+    <braunr> 22:35 < antrik> I definitely vote for adding it to the hurd
+      pthreads implementation in glibc right away. trying to do it externally
+      only adds unnecessary complications
+    <youpi> bddebian: yes
+    <youpi> same as antrik
+    <bddebian> fuck
+    <youpi> libpthread *already* provides some odd symbols (cthread
+      compatibility), it can provide others
+    <braunr> bddebian: don't curse :p it will be easier in the long run
+    * bddebian breaks out glibc :(
+    <braunr> but you should tell thomas that too
+    <bddebian> braunr: I know it just adds a level of complexity that I may not
+      be able to deal with
+    <braunr> we wouldn't want him to waste too much time on the external
+      libpthread
+    <braunr> which one ?
+    <bddebian> glibc for one.  hurd_condition_wait() for another which I don't
+      have a great grasp on.  Remember my knowledge/skillsets are limited
+      currently.
+    <braunr> bddebian: tschwinge has good instructions to build glibc
+    <braunr> keep your tree around and it shouldn't be long to hack on it
+    <braunr> for hurd_condition_wait, i can help
+    <bddebian> Oh I was thinking about using Debian glibc for now.  You think I
+      should do it from git?
+    <braunr> no
+    <braunr> debian rules are even more reliable
+    <braunr> (just don't build all the variants)
+    <pinotree> `debian/rules build_libc` builds the plain i386 variant only
+    <bddebian> So put pthread_hurd_cond_wait in its own .c file or just put it
+      in pt-cond-wait.c ?
+    <braunr> i'd put it in pt-cond-wait.c
+    <bddebian> youpi or braunr: OK, another dumb question.  What (if anything)
+      should I do about hurd/hurd/signal.h.  Should I stop it from including
+      cthreads?
+    <youpi> it's not a dumb question. it should probably stop, yes, but there
+      might be uncovered issues, which we'll have to take care of
+    <bddebian> Well I know antrik suggested trying to keep compatibility but I
+      don't see how you would do that
+    <braunr> compability between what ?
+    <braunr> and source and/or binary ?
+    <youpi> hurd/signal.h implicitly including cthreads.h
+    <braunr> ah
+    <braunr> well yes, it has to change obviously
+    <bddebian> Which will break all the cthreads stuff of course
+    <bddebian> So are we agreeing on pthread_hurd_cond_wait()?
+    <braunr> that's fine
+    <bddebian> Ugh, shit there is stuff in glibc using cthreads??
+    <braunr> like what ?
+    <bddebian> hurdsig, hurdsock, setauth, dtable, ...
+    <youpi> it's just using the compatibility stuff, that pthread does provide
+    <bddebian> but it includes cthreads.h implicitly
+    <bddebian> s/it/they in many cases
+    <youpi> not a problem, we provide the functions
+    <bddebian> Hmm, then what do I do about signal.h?  It includes cthreads.h
+      because it uses extern struct mutex ...
+    <youpi> ah, then keep the include
+    <youpi> the pthread mutexes are compatible with that
+    <youpi> we'll clean that afterwards
+    <bddebian> arf, OK
+    <youpi> that's what I meant by "uncover issues"
+
+
+## IRC, freenode, #hurd, 2012-07-28
+
+    <bddebian> Well crap, glibc built but I have no symbol for
+      pthread_hurd_cond_wait in libpthread.so :(
+    <bddebian> Hmm, I wonder if I have to add pthread_hurd_cond_wait to
+      forward.c and Versions? (Versions obviously eventually)
+    <pinotree> bddebian: most probably not about forward.c, but definitely you
+      have to export public stuff using Versions
+
+
+## IRC, freenode, #hurd, 2012-07-29
+
+    <bddebian> braunr: http://paste.debian.net/181078/
+    <braunr> ugh, inline functions :/
+    <braunr> "Tell hurd_thread_cancel how to unblock us"
+    <braunr> i think you need that one too :p
+    <bddebian> ??
+    <braunr> well, they work in pair
+    <braunr> one cancels, the other notices it
+    <braunr> hurd_thread_cancel is in the hurd though, iirc
+    <braunr> or uh wait
+    <braunr> no it's in glibc, hurd/thread-cancel.c
+    <braunr> otherwise it looks like a correct reuse of the original code, but
+      i need to understand the pthreads internals better to really say anything
+
+
+## IRC, freenode, #hurd, 2012-08-03
+
+    <braunr> pinotree: what do you think of
+      condition_implies/condition_unimplies ?
+    <braunr> the work on pthread will have to replace those
+
+
+## IRC, freenode, #hurd, 2012-08-06
+
+    <braunr> bddebian: so, where is the work being done ?
+    <bddebian> braunr: Right now I would just like to test getting my glibc
+      with pthread_hurd_cond_wait installed on the clubber subhurd.  It is in
+      /home/bdefreese/glibc-debian2
+    <braunr> we need a git branch
+    <bddebian> braunr: Then I want to rebuild hurd with Thomas's pthread
+      patches against that new libc
+    <bddebian> Aye
+    <braunr> i don't remember, did thomas set a git repository somewhere for
+      that ?
+    <bddebian> He has one but I didn't have much luck with it since he is using
+      an external libpthreads
+    <braunr> i can manage the branches
+    <bddebian> I was actually patching debian/hurd then adding his patches on
+      top of that.  It is in /home/bdefreese/debian-hurd but he has updated
+      some stuff since then
+    <bddebian> Well we need to agree on a strategy.  libpthreads only exists in
+      debian/glibc
+    <braunr> it would be better to have something upstream than to work on a
+      debian specific branch :/
+    <braunr> tschwinge: do you think it can be done 
+    <braunr> ?
+
+
+## IRC, freenode, #hurd, 2012-08-07
+
+    <tschwinge> braunr: You mean to create on Savannah branches for the
+      libpthread conversion?  Sure -- that's what I have been suggesting to
+      Barry and Thomas D. all the time.
+
+    <bddebian> braunr: OK, so I installed my glibc with
+      pthread_hurd_condition_wait in the subhurd and now I have built Debian
+      Hurd with Thomas D's pthread patches.
+    <braunr> bddebian: i'm not sure we're ready for tests yet :p
+    <bddebian> braunr: Why not? :)
+    <braunr> bddebian: a few important bits are missing
+    <bddebian> braunr: Like?
+    <braunr> like condition_implies
+    <braunr> i'm not sure they have been handled everywhere
+    <braunr> it's still interesting to try, but i bet your system won't finish
+      booting
+    <bddebian> Well I haven't "installed" the built hurd yet
+    <bddebian> I was trying to think of a way to test a little bit first, like
+      maybe ext2fs.static or something
+    <bddebian> Ohh, it actually mounted the partition
+    <bddebian> How would I actually "test" it?
+    <braunr> git clone :p
+    <braunr> building a debian package inside
+    <braunr> removing the whole content after
+    <braunr> that sort of things
+    <bddebian> Hmm, I think I killed clubber :(
+    <bddebian> Yep.. Crap! :(
+    <braunr> ?
+    <braunr> how did you do that ?
+    <bddebian> Mounted a new partition with the pthreads ext2fs.static then did
+      an apt-get source hurd to it..
+    <braunr> what partition, and what mount point ?
+    <bddebian> I added a new 2Gb partition on /dev/hd0s6 and set the translator
+      on /home/bdefreese/part6
+    <braunr> shouldn't kill your hurd
+    <bddebian> Well it might still be up but killed my ssh session at the very
+      least :)
+    <braunr> ouch
+    <bddebian> braunr: Do you have debugging enabled in that custom kernel you
+      installed?  Apparently it is sitting at the debug prompt.
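A note on the libpthread.mdwn log above: the calling convention braunr describes for hurd_condition_wait (return a non-zero value when cancelled, instead of running cleanup handlers and terminating the thread) can be modelled roughly as follows. This is only a sketch under loud assumptions: `sketch_waiter`, `sketch_cond_wait`, `sketch_cancel` and `sketch_server_op` are hypothetical names, the cancellation flag is an ordinary field rather than the threading library's internal thread state, and, as the log points out, the real function cannot be built on the public POSIX interface like this. It only illustrates why server code can keep its "return EINTR if cancelled" shape.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-waiter state.  In a real threading library the
   equivalent would live in the internal thread structure. */
struct sketch_waiter {
    bool cancelled;   /* protected by the mutex passed to the calls below */
};

/* Wait on COND with MUTEX held.  Returns 0 on an ordinary wakeup and
   ECANCELED if a cancellation request was seen; no cleanup handlers
   run and the thread keeps executing either way. */
static int sketch_cond_wait(struct sketch_waiter *w,
                            pthread_cond_t *cond, pthread_mutex_t *mutex)
{
    if (!w->cancelled)
        pthread_cond_wait(cond, mutex);
    if (w->cancelled) {
        w->cancelled = false;   /* consume the request */
        return ECANCELED;
    }
    return 0;
}

/* Assumed counterpart of the cancelling side: flag the waiter and
   wake it so that it notices the request. */
static void sketch_cancel(struct sketch_waiter *w,
                          pthread_cond_t *cond, pthread_mutex_t *mutex)
{
    pthread_mutex_lock(mutex);
    w->cancelled = true;
    pthread_cond_broadcast(cond);
    pthread_mutex_unlock(mutex);
}

/* Server-style caller: loop on a predicate and turn a cancelled wait
   into EINTR for the client, as the log says servers commonly do. */
static int sketch_server_op(struct sketch_waiter *w, pthread_cond_t *cond,
                            pthread_mutex_t *mutex, const bool *ready)
{
    int err = 0;

    pthread_mutex_lock(mutex);
    while (!*ready) {
        if (sketch_cond_wait(w, cond, mutex) != 0) {
            err = EINTR;
            break;
        }
    }
    pthread_mutex_unlock(mutex);
    return err;
}
```

With standard pthread_cond_wait and deferred cancellation, the thread would instead unwind through its cleanup handlers and never reach the `err = EINTR` line, which is the difficulty braunr raises for MIG-generated server stubs.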
diff --git a/open_issues/libpthread_CLOCK_MONOTONIC.mdwn b/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
index 2c8f10f8..86a613d3 100644
--- a/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
+++ b/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
@@ -76,3 +76,30 @@ License|/fdl]]."]]"""]]
     <pinotree> kind of, yes
     <youpi> I have reverted the change in libc for now
     <pinotree> ok
+
+
+## IRC, freenode, #hurd, 2012-07-22
+
+    <tschwinge> pinotree, youpi: I once saw you discussing an issue with librt
+      usage in libpthread -- is it this issue?  http://sourceware.org/PR14304
+    <youpi> tschwinge: (librt): no
+    <youpi> it's the converse
+    <pinotree> tschwinge: kind of
+    <youpi> unexpectedly loading libpthread is almost never a problem
+    <youpi> it's unexpectedly loading librt which was a problem for glib
+    <youpi> tschwinge: basically what happened with glib is that at configure
+      time, it could find clock_gettime without any -lrt, because of pulling
+      -lpthread, but at link time that wouldn't happen
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+    <braunr> pinotree: oh, i see you changed __pthread_timedblock to use
+      clock_gettime
+    <braunr> i wonder if i should do the same in libthreads
+    <pinotree> yeah, i realized later it was a bad move
+    <braunr> ok
+    <braunr> i'll stick to gettimeofday for now
+    <pinotree> it'll be safe when implementing some private
+      __hurd_clock_get{time,res} in libc proper, making librt just forward to
+      it and adapting the gettimeofday to use it
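A note on the libpthread_CLOCK_MONOTONIC.mdwn log above: braunr's choice to "stick to gettimeofday for now" for timed blocking amounts to building an absolute deadline without touching clock_gettime (and therefore without pulling in librt). A minimal sketch, assuming a non-negative relative timeout in milliseconds; `sketch_deadline_from` and `sketch_deadline` are hypothetical helper names and are not taken from libthreads or libpthread.

```c
#include <sys/time.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/* Pure helper: add REL_MS milliseconds (assumed >= 0) to NOW and
   return the result as a normalised timespec. */
static struct timespec sketch_deadline_from(struct timeval now, long rel_ms)
{
    struct timespec ts;

    ts.tv_sec = now.tv_sec + rel_ms / 1000;
    ts.tv_nsec = (long) now.tv_usec * 1000L + (rel_ms % 1000) * 1000000L;

    /* Both terms are below one second, so one carry is enough. */
    if (ts.tv_nsec >= SKETCH_NSEC_PER_SEC) {
        ts.tv_sec += 1;
        ts.tv_nsec -= SKETCH_NSEC_PER_SEC;
    }
    return ts;
}

/* Read the wall clock with gettimeofday and store the deadline in OUT.
   Returns 0 on success and -1 if gettimeofday fails. */
static int sketch_deadline(long rel_ms, struct timespec *out)
{
    struct timeval now;

    if (gettimeofday(&now, NULL) != 0)
        return -1;
    *out = sketch_deadline_from(now, rel_ms);
    return 0;
}
```

A deadline built this way is on the realtime clock, which is also the default clock for pthread_cond_timedwait; it does not give the monotonic behaviour this page is tracking, which is why pinotree's private `__hurd_clock_get{time,res}` idea remains the longer-term fix.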
diff --git a/open_issues/mission_statement.mdwn b/open_issues/mission_statement.mdwn
index 17f148a9..b32d6ba6 100644
--- a/open_issues/mission_statement.mdwn
+++ b/open_issues/mission_statement.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
 
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
 id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -658,3 +658,42 @@ License|/fdl]]."]]"""]]
       FUSE in this case though... it doesn't really change the functionality of
       the VFS; only rearranges the tree a bit
     <antrik> (might even be doable with standard Linux features)
+
+
+# IRC, freenode, #hurd, 2012-07-25
+
+    <braunr> because it has design problems, because it has implementation
+      problems, lots of problems, and far too few people to keep up with other
+      systems that are already dominating
+    <braunr> also, considering other research projects get much more funding
+      than we do, they probably have a better chance at being adopted
+    <rah> you consider the Hurd to be a research project?
+    <braunr> and as they're more recent, they sometimes overcome some of the
+      issues we have
+    <braunr> yes and no
+    <braunr> yes because it was, at the time of its creation, and it hasn't
+      changed much, and there aren't many (any?) other systems with such a
+      design
+    <braunr> and no because the hurd is actually working, and being released as
+      part of something like debian
+    <braunr> which clearly shows it's able to do the stuff it was intended for
+    <braunr> i consider it a technically very interesting project for
+      developers who want to know more about microkernel based extensible
+      systems
+    <antrik> rah: I don't expect the Hurd to achieve world domination, because
+      most people consider Linux "good enough" and will stick with it
+    <antrik> I for my part think though we could do better than Linux (in
+      certain regards I consider important), which is why I still consider it
+      interesting and worthwhile
+    <nowhere_man> I think that in some respect the OS scene may evolve a bit
+      like the PL one, where everyone progressively adopts ideas from Lisp but
+      doesn't want to do Lisp: everyone slowly shifts towards what µ-kernels
+      OSes have done from the start, but they don't want µ-kernels...
+    <braunr> nowhere_man: that's my opinion too
+    <braunr> and this is why i think something like the hurd still has valuable
+      purpose
+    <nowhere_man> braunr: in honesty, I still ponder the fact that it's my
+      coping mechanism to accept being a Lisp and Hurd fan ;-)
+    <braunr> nowhere_man: it can be used that way too
+    <braunr> functional programming is getting more and more attention
+    <braunr> so it's fine if you're a lisp fan really
diff --git a/open_issues/multithreading.mdwn b/open_issues/multithreading.mdwn
index 5924d3f9..c9567828 100644
--- a/open_issues/multithreading.mdwn
+++ b/open_issues/multithreading.mdwn
@@ -49,6 +49,91 @@ Tom Van Cutsem, 2009.
     <youpi> right
 
 
+## IRC, freenode, #hurd, 2012-07-16
+
+    <braunr> hm interesting
+    <braunr> when many threads are created to handle requests, they
+      automatically create a pool of worker threads by staying around for some
+      time
+    <braunr> this time is given in the libport call
+    <braunr> but the threads always remain
+    <braunr> they must be used in turn each time a new request comes in
+    <braunr> ah no :(, they're maintained by the periodic sync :(
+    <braunr> hm, still not that, so weird
+    <antrik> braunr: yes, that's a known problem: unused threads should go away
+      after some time, but that doesn't actually happen
+    <antrik> don't remember though whether it's broken for some reason, or
+      simply not implemented at all...
+    <antrik> (this was already a known issue when thread throttling was
+      discussed around 2005...)
+    <braunr> antrik: ok
+    <braunr> hm threads actually do finish ..
+    <braunr> libthreads retains them in a pool for faster allocations
+    <braunr> hm, it's worse than i thought
+    <braunr> i think the hurd does its job well
+    <braunr> the cthreads code never reaps threads
+    <braunr> when threads are finished, they just wait until assigned a new
+      invocation
+
+    <braunr> i don't understand ports_manage_port_operations_multithread :/
+    <braunr> i think i get it
+    <braunr> why do people write things in such a complicated way ..
+    <braunr> such code is error prone and confuses anyone
+
+    <braunr> i wonder how well nested functions interact with threads when
+      sharing variables :/
+    <braunr> the simple idea of nested functions hurts my head
+    <braunr> do you see my point ? :) variables on the stack automatically
+      shared between threads, without the need to explicitly pass them by
+      address
+    <antrik> braunr: I don't understand. why would variables on the stack be
+      shared between threads?...
+    <braunr> antrik: one function declares two variables, two nested functions,
+      and use these in separate threads
+    <braunr> are the local variables still "local"
+    <braunr> ?
+    <antrik> braunr: I would think so? why wouldn't they? threads have separate
+      stacks, right?...
+    <antrik> I must admit though that I have no idea how accessing local
+      variables from the parent function works at all...
+    <braunr> me neither
+
+    <braunr> why don't demuxers get a generic void * like every callback does
+      :((
+    <antrik> ?
+    <braunr> antrik: they get pointers to the input and output messages only
+    <antrik> why is this a problem?
+    <braunr> ports_manage_port_operations_multithread can be called multiple
+      times in the same process
+    <braunr> each call must have its own context
+    <braunr> currently this is done by using nested functions
+    <braunr> also, why demuxers return booleans while mach_msg_server_timeout
+      happily ignores them :(
+    <braunr> callbacks shouldn't return anything anyway
+    <braunr> but then you have a totally meaningless "return 1" in the middle
+      of the code
+    <braunr> i'd advise not using a single nested function
+    <antrik> I don't understand the remark about nested function
+    <braunr> they're just horrible extensions
+    <braunr> the compiler completely hides what happens behind the scenes, and
+      nasty bugs could come out of that
+    <braunr> i'll try to rewrite ports_manage_port_operations_multithread
+      without them and see if it changes anything
+    <braunr> but it's not easy
+    <braunr> also, it makes debugging harder :p
+    <braunr> i suspect gdb hangs are due to that, since threads directly start
+      on a nested function
+    <braunr> and if i'm right, they are created on the stack
+    <braunr> (which is also horrible for security concerns, but that's another
+      story)
+    <braunr> (at least the trampolines)
+    <antrik> I seriously doubt it will change anything... but feel free to
+      prove me wrong :-)
+    <braunr> well, i can see really weird things, but it may have nothing to do
+      with the fact functions are nested
+    <braunr> (i still strongly believe those shouldn't be used at all)
+
+
 # Alternative approaches:
 
   * <http://www.concurrencykit.org/>
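In the multithreading.mdwn log above, braunr objects that demuxers only receive the input and output messages, so per-call state for ports_manage_port_operations_multithread has to be captured through nested functions. The following is a minimal sketch of the alternative he asks for, a demuxer that takes a generic `void *` context. Everything in it is hypothetical: `struct msg`, `demuxer_fn`, `struct server_ctx` and `serve_messages` are illustrative stand-ins, not the Mach or libports types and signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Mach message header; not the real type. */
struct msg { int id; int payload; };

/* A demuxer that also receives an opaque per-call context pointer. */
typedef int (*demuxer_fn)(const struct msg *in, struct msg *out, void *ctx);

/* State that a nested function would otherwise capture implicitly from
   the enclosing stack frame. */
struct server_ctx { int handled; };

/* Returns 1 if the message was recognised and a reply was built, else 0. */
int demuxer(const struct msg *in, struct msg *out, void *ctx)
{
    struct server_ctx *sc = ctx;

    assert(sc != NULL);
    if (in->id != 1)
        return 0;
    out->id = in->id + 100;          /* made-up reply id convention */
    out->payload = in->payload * 2;
    sc->handled++;
    return 1;
}

/* Toy message loop: hands every message to the demuxer together with the
   caller's context, and returns how many messages were handled. */
int serve_messages(const struct msg *msgs, size_t n, demuxer_fn demux,
                   void *ctx)
{
    int handled = 0;

    for (size_t i = 0; i < n; i++) {
        struct msg reply = { 0, 0 };
        if (demux(&msgs[i], &reply, ctx))
            handled++;
    }
    return handled;
}
```

Two calls to `serve_messages` with two different `struct server_ctx` objects keep their state apart without any nested function or stack trampoline, which is the property the log asks for when the server loop is entered several times in one process.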
diff --git a/open_issues/packaging_libpthread.mdwn b/open_issues/packaging_libpthread.mdwn
index d243aaaa..528e0b01 100644
--- a/open_issues/packaging_libpthread.mdwn
+++ b/open_issues/packaging_libpthread.mdwn
@@ -137,3 +137,53 @@ License|/fdl]]."]]"""]]
     <youpi> I know, I've asked tschwinge about it
     <youpi> it's not urging anyway
     <pinotree> right
+
+
+## IRC, freenode, #hurd, 2012-07-21
+
+    <pinotree> tschwinge: btw, samuel suggested to rename in libpthread ia32 →
+      i386, to better fit with glibc
+    <tschwinge> pinotree: Hmm, that'd somewhat break interoperability with
+      Viengoos' use of libpthread.
+    <pinotree> how would it break with viengoos?
+    <tschwinge> I assume it is using the i386 names.  Hmm, no isn't it x86_64
+      only?
+    <tschwinge> I'll check.
+    <pinotree> does it use automake (with the Makefile.am in repo)?
+    <tschwinge> I have no idea what the current branch arrangement is.
+    <pinotree> tschwinge: it looks like ia32 is hardcoded in Makefile and
+      Makefile.am
+
+
+## IRC, freenode, #hurd, 2012-08-07
+
+    <tschwinge> Also, the Savannah hurd/glibc.git one does not/not yet include
+      libpthread.
+    <tschwinge> But that could easily be added as a Git submodule.
+    <tschwinge> youpi: To put libpthread into glibc it is literally enough to
+      make Savannah hurd/libpthread.git appear at [glibc]/libpthread?
+    <youpi> tschwinge: there are some patches needed in the rest of the tree
+    <youpi> see in debian, libpthread_clean.diff, tg-libpthread_depends.diff,
+      unsubmitted-pthread.diff, unsubmitted-pthread_posix_options.diff
+    <tschwinge> The libpthread in Debian glibc is
+      hurd/libpthread.git:b428baaa85c0adca9ef4884c637f289a0ab5e2d6 but with
+      25260994c812050a5d7addf125cdc90c911ca5c1 »Store self in __thread variable
+      instead of threadvar« reverted (why?), and the following additional
+      change applied to Makefile:
+    <tschwinge>  ifeq ($(IN_GLIBC),yes)
+    <tschwinge>  $(inst_libdir)/libpthread.so:
+      $(objpfx)libpthread.so$(libpthread.so-version) \
+    <tschwinge>                               $(+force)
+    <tschwinge> -       ln -sf $(slibdir)/libpthread.so$(libpthread.so-version)
+      $@
+    <tschwinge> +       ln -sf libpthread.so$(libpthread.so-version) $@
+    <braunr> tschwinge: is there any plan to merge libpthread.git in glibc.git
+      upstream ?
+    <tschwinge> braunr, youpi: Has not yet been discussed with Roland, as far
+      as I know.
+    <youpi> has not
+    <youpi> libpthread.diff is supposed to be a verbatim copy of the repository
+    <youpi> and then there are a couple patches which don't (yet) make sense
+      upstream
+    <youpi> the slibdir change, however, is odd
+    <youpi> it must be a leftover
diff --git a/open_issues/pci_arbiter.mdwn b/open_issues/pci_arbiter.mdwn
new file mode 100644
index 00000000..7730cee0
--- /dev/null
+++ b/open_issues/pci_arbiter.mdwn
@@ -0,0 +1,256 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+For [[DDE]]/X.org/...
+
+
+# IRC, freenode, #hurd, 2012-02-19
+
+    <youpi> antrik: we should probably add a gsoc idea on pci bus arbitration
+    <youpi> DDE is still experimental for now so it's ok that you have to
+      configure it by hand, but it should be automatic at some point
+
+
+## IRC, freenode, #hurd, 2012-02-21
+
+    <braunr> i'm not familiar with the new gnumach interface for userspace
+      drivers, but can this pci enumerator be written with it as it is ?
+    <braunr> (i'm not asking for a precise answer, just yes - even probably -
+      or no)
+    <braunr> (idk or utsl will do as well)
+    <youpi> I'd say yes
+    <youpi> since all drivers need is interrupts, io ports and iomem
+    <youpi> the latter was already available through /dev/mem
+    <youpi> io ports through the i386 rpcs
+    <youpi> the changes provide both interrupts, and physical-contiguous
+      allocation
+    <youpi> it should be way enough
+    <braunr> youpi: ok
+    <braunr> youpi: thanks for the details :)
+    <antrik> braunr: this was mentioned in the context of the interrupt
+      forwarding interface... the original one implemented by zhengda isn't
+      suitable for a PCI server; but the ones proposed by youpi and tschwinge
+      would work
+    <antrik> same for the physical memory interface: the current implementation
+      doesn't allow delegation; but I already said that it's wrong
+
+
+# IRC, freenode, #hurd, 2012-07-15
+
+    <bddebian> youpi: Oh, BTW, I keep meaning to ask you.  Could sound be done
+      with dde or would there still need to be some kernel work?
+    <youpi> bddebian: we'd need a PCI arbiter for that
+    <youpi> for now just one userland poking with PCI is fine
+    <youpi> but two can produce bonks
+    <bddebian> They can't use the same?
+    <youpi> that's precisely the matter
+    <youpi> they have to use the same
+    <youpi> and not poke with it themselves
+    <braunr> that's what an arbiter is for
+    <bddebian> OK, so if we don't have a PCI arbiter now, how do things like
+      netdde and video not collide currently?
+    <bddebian> s/netdde/network/
+    <bddebian> or disk for that matter
+    <braunr> bddebian: ah currently, well currently, the network is the only
+      thing using the pci bus
+    <bddebian> How is that possible when I have a PCI video card and disk
+      controller?
+    <braunr> they are accessed through compatible means
+    <bddebian> I suppose one of the hardest parts is prioritization?
+    <braunr> i don't think it matters much, no
+    <youpi> bddebian: netdde and Xorg don't collide essentially because they
+      are not started at the same time (hopefully)
+    <bddebian> braunr: What do you mean it doesn't matter?
+    <braunr> bddebian: well the point is rather serializing access, we don't
+      need more
+    <braunr> do other systems actually schedule access to the pci bus ?
+    <bddebian> From what I am reading, yes
+    <braunr> ok
+
+
+# IRC, freenode, #hurd, 2012-07-16
+
+    <antrik> youpi: the lack of a PCI arbiter is a problem, but I wouldn't
+      consider it a precondition for adding another userspace driver
+      class... it's up to the user to make sure he has only one class active,
+      or take the risk of not doing so...
+    <antrik> (plus, I suspect writing the arbiter is a smaller task than
+      implementing another DDE class anyways...)
+    <bddebian> Where would the arbiter need to reside, in gnumach?
+    <antrik> bddebian: kernel would be one possible place (with the advantage
+      of running both userspace and kernel drivers without the potential for
+      conflicts)
+    <antrik> but I think I would prefer a userspace server
+    <youpi> antrik: we'd rather have PCI devices automatically set up
+    <youpi> just like /dev/netdde is already set up for the user
+    <youpi> so you can't count on the user
+    <youpi> for the arbiter, it could as well be userland, while still
+      interacting with the kernel for some devices
+    <youpi> we however "just" need to get disk drivers in userland to drop PCI
+      drivers from kernel, actually
+
+
+# IRC, freenode, #hurd, 2012-07-17
+
+    <bddebian> youpi: So this PCI arbiter should be a hurd server?
+    <youpi> that'd be better
+    <bddebian> youpi: Is there anything existing to look at as a basis?
+    <youpi> no idea off-hand
+    <bddebian> I mean you couldn't take what netdde does and generalize it?
+    <youpi> netdde doesn't do any arbitration
+
+
+# IRC, OFTC, #debian-hurd, 2012-07-19
+
+    <bdefreese> youpi: Well at some point if you ever have time I'd like to
+      understand better how you see the PCI architecture working in Hurd.
+      I.E. would you expect the server to do enumeration and arbitration?
+    <youpi> I'd expect both, yes, but that's probably to be discussed rather
+      with antrik, he's the one who took some time to think about it
+    <bdefreese> netdde uses libpciaccess currently, right?
+    <youpi> yes
+    <youpi> libpciaccess would have to be fixed into using the arbiter
+    <youpi> (that'd fix xorg as well)
+    <bdefreese> Man, I am still a bit unclear on how this is all interacting
+      currently.. :(
+    <youpi> currently it's not
+    <youpi> and it's just by luck that it doesn't break
+    <bdefreese> Long term xxxdde would use the new server, correct?
+    <youpi> (well, we are also sure that the gnumach enumeration comes always
+      before the netdde enumeration, and xorg is currently not started
+      automatically, so its enumeration is also always after that)
+    <youpi> yes
+    <youpi> the server would essentially provide an interface equivalent to
+      libpciaccess
+    <bdefreese> Right
+    <bdefreese> In general, where does the pci map get "stored"?  In GNU/Linux,
+      is it all /proc based?
+    <youpi> what do you mean by "pci map" ?
+    <bdefreese> Once I have enumerated all of the buses and devices, does it
+      stay stored or is it just redone for every call to a pci device?
+    <youpi> in linux it's stored in the kernel
+    <youpi> the arbiter would store it itself
+
+
+# IRC, freenode, #hurd, 2012-07-20
+
+    <bddebian> antrik: BTW, youpi says you are the one to talk to for design of
+      a PCI server :)
+    <antrik> oh, am I?
+    * antrik feels honoured :-)
+    <antrik> I guess it's true though: I *did* spend a little thought on
+      it... even mentioned something in my thesis IIRC
+    <antrik> there is one tricky aspect to it though, which I'm not sure how to
+      handle best: we need two different instances of libpciaccess
+    <bddebian> Why two instances of libpciaccess?
+    <antrik> one used by the PCI server to access the hardware directly (using
+      the existing port poking backend), and one using a new backend to access
+      our PCI server...
+    <braunr> bddebian: hum, both i guess ?
+    <bddebian> antrik: Why wouldn't the server access the hardware directly?  I
+      thought libpciaccess was supposed to be generic on purpose?
+    <antrik> hm... guess I wasn't clear
+    <antrik> the point is that the PCI server should use the direct hardware
+      access backend of libpciaccess
+    <antrik> however, *clients* should use the PCI server backend of
+      libpciaccess
+    <antrik> I'm not sure backends can be selected at runtime...
+    <antrik> which might mean that we actually have to compile two different
+      versions of the library. erk.
+    <bddebian> So you are saying the pci server should itself use libpciaccess
+      rather than having its own?
+    <antrik> admittedly, that's not the most fundamental design decision to
+      make ;-)
+    <antrik> bddebian: yes. no need to rewrite (or copy) this code...
+    <bddebian> Hmm
+    <antrik> actually that was the plan all along when I first suggested
+      implementing the register poking backend for libpciaccess
+    <bddebian> Hmm, not sure I like it but I am certainly in no position to
+      question it right now :)
+    <braunr> why don't you like it ?
+    <bddebian> I shouldn't need an Xorg specific library to access PCI on my OS
+      :)
+    <braunr> oh
+    <bddebian> Though I don't disagree that reinventing the wheel is a bit
+      tedious. :)
+    <antrik> bddebian: although it originates from X.Org, I don't think there
+      is anything about the library technically making it X-specific...
+    <braunr> yes that's my opinion too
+    <antrik> (well, there are some X-specific functions IIRC, but these do not
+      hurt the other functionality)
+    <bddebian> But what is there is api/abi breakage? :)
+    <bddebian> s/is/if/
+    <antrik> BTW according to rdepends there appear to be a number of non-X
+      things using the library now
+    <pinotree> like, uhm, hurd
+    <antrik> yeah, that too... we are already using it for DDE
+    <pinotree> if you have deb-src lines in your sources.list, use the
+      grep-dctrl power:
+    <pinotree> grep-dctrl -sPackage -FBuild-Depends libpciaccess-dev
+      /var/lib/apt/lists/*_source_Sources | sort -u
+    <bddebian> I know we are using it for netdde.
+    <antrik> nice thing about it is that once we have the PCI server and an
+      appropriate backend for libpciaccess, the same netdde and X binaries
+      should work either with or without the PCI server
+    <bddebian> Then why have the server at all?
+    <braunr> it's the arbiter
+    <braunr> you can use the library directly only if you're the only user
+    <braunr> and what antrik means is that the interface should be the same for
+      both modes
+    <bddebian> Ugh, that is where I am getting confused
+    <bddebian> In that case shouldn't everything use libpciaccess and the PCI
+      server has to arbitrate the requests?
+    <braunr> bd	?
+    <braunr> bddebian: yes
+    <braunr> bddebian: but they use the indirect version of the library
+    <braunr> whereas the server uses the raw version
+    <bddebian> OK, I gotcha (I think)
+    <braunr> (but they both provide the same interface, so if you don't have a
+      pci server and you know you're the only user, the direct version can be
+      used)
+    <bddebian> But I am not sure I see the difference between creating a second
+      library or just moving the raw access to the PCI server :)
+    <braunr> uh, there is no difference in that
+    <braunr> and you shouldn't do it
+    <braunr> (if that's what antrik meant at least)
+    <braunr> if you can select the backend (raw or pci server) easily, then
+      stick to the same code base
+    <bddebian> That's where I struggle.  In my worthless opinion, raw access
+      should be the OS job while indirect access would be the library's
+      responsibility
+    <braunr> that's true
+    <braunr> but as an optimization, if an application is the only user, it can
+      directly use raw access
+    <bddebian> How would you know that?
+    <bddebian> I'm sorry if these are dumb questions
+    <braunr> hum, don't try to make this behaviour automatic
+    <braunr> it would be selected by the user through command line switches
+    <bddebian> But the OS itself uses PCI for things like disk access and
+      video, no?
+    <braunr> (it could be automatic but it makes things more complicated)
+    <braunr> you don't need an arbiter all the time
+    <braunr> i can't tell you more, wait for antrik to return
+    <braunr> i realize i might already have said some bullshit
+    <antrik> bddebian: well, you have a point there that once we have the
+      arbiter and use it for everything, it isn't strictly useful to still have
+      the register poking in the library
+    <antrik> however, the code will remain in the library anyways, so we better
+      continue using it rather than introducing redundancy...
+    <antrik> but again, that's rather a side issue concerning the design of the
+      PCI server
+    <bddebian> antrik: Fair enough. :)  So how would I even start on this?
+    <antrik> bddebian: actually, libpciaccess is a good starting point:
+      checking the API should give you a fairly good idea what functionality
+      the server needs to implement
+    <pinotree> (+1 on library (re)use)
+    <bddebian> antrik: KK
+    <antrik> sorry, I'm a bit busy right now...
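The pci_arbiter.mdwn log above settles on one interface with two backends: the PCI server itself uses direct hardware access, clients go through the server, and the choice is made explicitly (for example by a command line switch) rather than automatically. A minimal sketch of that shape follows, under loud assumptions: `struct pci_backend`, `pci_select_backend` and the simulated configuration space are invented for illustration and are not the libpciaccess API, and the "server" backend simply forwards to the raw one where a real client would send an RPC to the arbiter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical backend interface; NOT the real libpciaccess API. */
struct pci_backend {
    const char *name;
    uint32_t (*cfg_read)(unsigned dev, unsigned reg);
};

/* Simulated configuration space for two devices, four registers each. */
static const uint32_t fake_cfg_space[2][4] = {
    { 0x12345678u, 0, 0, 0 },
    { 0x9abcdef0u, 0, 0, 0 },
};

static unsigned raw_accesses;     /* times the "hardware" was touched */
static unsigned server_requests;  /* times the arbiter was asked */

/* Raw backend: stands in for direct register poking. */
static uint32_t raw_cfg_read(unsigned dev, unsigned reg)
{
    raw_accesses++;
    if (dev >= 2 || reg >= 4)
        return 0xffffffffu;       /* absent devices read as all ones */
    return fake_cfg_space[dev][reg];
}

/* Server backend: a real client would send an RPC to the arbiter, which
   would be the only user of the raw backend. Here it forwards directly. */
static uint32_t server_cfg_read(unsigned dev, unsigned reg)
{
    server_requests++;
    return raw_cfg_read(dev, reg);
}

static const struct pci_backend backends[] = {
    { "raw", raw_cfg_read },
    { "server", server_cfg_read },
};

/* Pick a backend by name, e.g. from a command line switch. Returns NULL
   for an unknown name instead of guessing. */
const struct pci_backend *pci_select_backend(const char *name)
{
    for (size_t i = 0; i < sizeof backends / sizeof backends[0]; i++)
        if (strcmp(backends[i].name, name) == 0)
            return &backends[i];
    return NULL;
}
```

Because both backends expose the same `cfg_read` entry point, a driver written against `struct pci_backend` would not change when the arbiter is introduced, which matches antrik's remark that the same netdde and X binaries should work with or without the PCI server.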
diff --git a/open_issues/performance.mdwn b/open_issues/performance.mdwn
index 8dbe1160..ec14fa52 100644
--- a/open_issues/performance.mdwn
+++ b/open_issues/performance.mdwn
@@ -52,3 +52,32 @@ call|/glibc/fork]]'s case.
     <braunr> the more i study the code, the more i think a lot of time is
       wasted on cpu, unlike the common belief of the lack of performance being
       only due to I/O
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+    <braunr> there are several kinds of scalability issues
+    <braunr> iirc, i found some big locks in core libraries like libpager and
+      libdiskfs
+    <braunr> but anyway we can live with those
+    <braunr> in the case i observed, ext2fs, relying on libdiskfs and libpager,
+      scans the entire file list to ask for writebacks, as it can't know if the
+      pages are dirty or not
+    <braunr> the mistake here is moving part of the pageout policy out of the
+      kernel
+    <braunr> so it would require the kernel to handle periodic syncs of the
+      page cache
+    <antrik> braunr: as for big locks: considering that we don't have any SMP
+      so far, does it really matter?...
+    <braunr> antrik: yes
+    <braunr> we have multithreading
+    <braunr> there is no reason to block many threads while if most of them
+      could continue
+    <braunr> -while
+    <antrik> so that's more about latency than throughput?
+    <braunr> considering sleeping/waking is expensive, it's also about
+      throughput
+    <braunr> currently, everything that deals with sleepable locks (both
+      gnumach and the hurd) just wake every thread waiting for an event when
+      the event occurs (there are a few exceptions, but not many)
+    <antrik> ouch
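The last remark in the performance.mdwn log above, that sleepable locks wake every waiting thread when the event occurs, can be made concrete with a toy count. This is a single-threaded simulation, not gnumach or Hurd code, and it assumes each event lets exactly one sleeper make progress (as with a lock release); `count_wakeups` and `enum wake_policy` are hypothetical names.

```c
#include <assert.h>  /* used by the checks in the usage note */

enum wake_policy { WAKE_ALL, WAKE_ONE };

/* Count the thread wakeups needed to let `waiters` sleeping threads each
   make progress once, when every event can satisfy only one of them. */
unsigned long count_wakeups(unsigned waiters, enum wake_policy policy)
{
    unsigned long wakeups = 0;
    unsigned sleeping = waiters;

    while (sleeping > 0) {
        /* One event occurs. WAKE_ALL wakes every sleeper, although only
           one can proceed; WAKE_ONE wakes just that one. */
        wakeups += (policy == WAKE_ALL) ? sleeping : 1;
        sleeping--;  /* the winner leaves, the rest go back to sleep */
    }
    return wakeups;
}
```

Under these assumptions, `assert(count_wakeups(8, WAKE_ALL) == 36)` and `assert(count_wakeups(8, WAKE_ONE) == 8)` both hold: waking everyone costs a number of wakeups that grows quadratically with the number of waiters, which is why braunr ties the issue to throughput as well as latency.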
diff --git a/open_issues/performance/io_system/read-ahead.mdwn b/open_issues/performance/io_system/read-ahead.mdwn
index 710c746b..657318cd 100644
--- a/open_issues/performance/io_system/read-ahead.mdwn
+++ b/open_issues/performance/io_system/read-ahead.mdwn
@@ -1565,3 +1565,283 @@ License|/fdl]]."]]"""]]
     <braunr> mcsim1: just use sane values inside the kernel :p
     <braunr> this simplifies things by only adding the new vm_advise call and
       not change the existing external pager interface
+
+
+## IRC, freenode, #hurd, 2012-07-12
+
+    <braunr> mcsim: so, to begin with, tell us what state you've reached please
+    <mcsim> braunr: I'm writing code for hurd and gnumach. For gnumach I'm
+      implementing memory policies now. RANDOM and NORMAL seem to work, but in
+      hurd I found an error that I made while editing ext2fs. So for now
+      ext2fs does not work
+    <braunr> policies ?
+    <braunr> what about mechanism ?
+    <mcsim> also I moved some translators to new interface.
+    <mcsim> It works too
+    <braunr> well that's impressive
+    <mcsim> braunr: I'm not sure yet that everything works
+    <braunr> right, but that's already a very good step
+    <braunr> i thought you were still working on the interfaces to be honest
+    <mcsim> And with mechanism I didn't implement moving pages to inactive
+      queue
+    <braunr> what do you mean ?
+    <braunr> ah you mean with the sequential policy ?
+    <mcsim> yes
+    <braunr> you can consider this a secondary goal
+    <mcsim> sequential I was going to implement like you've said, but I still
+      want to support moving pages to inactive queue
+    <braunr> i think you shouldn't
+    <braunr> first get to a state where clustered transfers do work fine
+    <mcsim> policies are implemented in function calculate_clusters
+    <braunr> then, you can try, and measure the difference
+    <mcsim> ok. I'm now working on fixing ext2fs
+    <braunr> so, except from bug squashing, what's left to do ?
+    <mcsim> finish policies and ext2fs; move fatfs, ufs, isofs to new
+      interface; test this all; edit patches from debian repository, that
+      conflict with my changes; rearrange commits and fix code indentation;
+      update documentation;
+    <braunr> think about measurements too
+    <tschwinge> mcsim: Please don't spend a lot of time on ufs.  No testing
+      required for that one.
+    <braunr> and keep us informed about your progress on bug fixing, so we can
+      test soon
+    <mcsim> Forgot about moving system to new interfaces (I mean determine form
+      of vm_advise and memory_object_change_attributes)
+    <braunr> s/determine/final/
+    <mcsim> braunr: ok.
+    <braunr> what do you mean "moving system to new interfaces" ?
+    <mcsim> braunr: I also pushed code changes to gnumach and hurd git
+      repositories
+    <mcsim> I met an issue with memory_object_change_attributes when I tried to
+      use it as I have to update all applications that use it. This includes
+      libc and translators that are not in hurd repository or use debian
+      patches. So I will not be able to run system with new
+      memory_object_change_attributes interface, until I update all software
+      that use this rpc
+    <braunr> this is a bit like the problem i had with my change
+    <braunr> the solution is : don't do it
+    <braunr> i mean, don't change the interface in an incompatible way
+    <braunr> if you can't change an existing call, add a new one
+    <mcsim> temporarily I changed memory_object_set_attributes as it isn't used
+      any more.
+    <mcsim> braunr: ok. Adding new call is a good idea :)
+
+
+## IRC, freenode, #hurd, 2012-07-16
+
+    <braunr> mcsim: how did you deal with multiple page transfers towards the
+      default pager ?
+    <mcsim> braunr: hello. Didn't handle this yet, but AFAIR default pager
+      supports multiple page transfers.
+    <braunr> mcsim: i'm almost sure it doesn't
+    <mcsim> braunr: indeed
+    <mcsim> braunr: So, I'll update it just other translators.
+    <braunr> like other translators you mean ?
+    <mcsim> *just as
+    <mcsim> braunr: yes
+    <braunr> ok
+    <braunr> be aware also that it may need some support in vm_pageout.c in
+      gnumach
+    <mcsim> braunr: thank you
+    <braunr> if you see anything strange in the default pager, don't hesitate
+      to talk about it
+    <mcsim> braunr: ok. I didn't finish with ext2fs yet.
+    <braunr> so it's a good thing you're aware of it now, before you begin
+      working on it :)
+    <mcsim> braunr: I'm working on ext2 now.
+    <braunr> yes i understand
+    <braunr> i meant "before beginning work on the default pager"
+    <mcsim> ok
+
+    <antrik> mcsim: BTW, we were mostly talking about readahead (pagein) over
+      the past weeks, so I wonder what the status on clustered page*out* is?...
+    <mcsim> antrik: I don't work on this, but following, I think, is an example
+      of *clustered* pageout: _pager_seqnos_memory_object_data_return: object =
+      113, seqno = 4, control = 120, start_address = 0, length = 8192, dirty =
+      1. This is an example of debugging printout that shows that pageout
+      manipulates chunks bigger than a page.
+    <mcsim> antrik: Another one with bigger length
+      _pager_seqnos_memory_object_data_return: object = 125, seqno = 124,
+      control = 132, start_address = 131072, length = 126976, dirty = 1, kcopy
+    <antrik> mcsim: that's odd -- I didn't know the functionality for that even
+      exists in our codebase...
+    <antrik> my understanding was that Mach always sends individual pageout
+      requests for every single page it wants cleaned...
+    <antrik> (and this being the reason for the dreadful thread storms we are
+      facing...)
+    <braunr> antrik: ok
+    <braunr> antrik: yes that's what is happening
+    <braunr> the thread storms aren't that much of a problem now
+    <braunr> (by carefully throttling pageouts, which is a task i intend to
+      work on during the following months, this won't be an issue any more)
+
+
+## IRC, freenode, #hurd, 2012-07-19
+
+    <mcsim> I moved fatfs, ufs, isofs to new interface, corrected some errors
+      in others that I already moved, moved kernel to new interface (renamed
+      vm_advice to vm_advise and added rpcs memory_object_set_advice and
+      memory_object_get_advice). Made some changes in mechanism and tried to
+      finish ext2 translator.
+    <mcsim> braunr: I've got an issue with fictitious pages...
+    <mcsim> When I determine bounds of cluster in external object I never know
+      its actual size. So, mo_data_request call could ask data that are behind
+      object bounds. The problem is that pager returns data that it has and
+      because of this fictitious pages that were allocated are not freed.
+    <braunr> why don't you know the size ?
+    <mcsim> I see 2 solutions. First one is do not allocate fictitious pages at
+      all (but I think that there could be issues). Another lies in allocating
+      fictitious pages, but then freeing them with mo_data_lock.
+    <mcsim> braunr: Because pager does not inform kernel about object size.
+    <braunr> i don't understand what you mean
+    <mcsim> I think that second way is better.
+    <braunr> so how does it happen ?
+    <braunr> you get a page fault
+    <mcsim> Don't you understand problem or solutions?
+    <braunr> then a lookup in the map finds the map entry
+    <braunr> and the map entry gives you the link to the underlying object
+    <mcsim> from vm_object.h: 	vm_size_t		size;		/*
+      Object size (only valid if internal)				 */
+    <braunr> mcsim: ugh
+    <mcsim> For external they are either 0x8000 or 0x20000...
+    <braunr> and for internal ?
+    <braunr> i'm very surprised to learn that
+    <mcsim> braunr: for internal size is actual
+    <braunr> right sorry, wrong question
+    <braunr> did you find what 0x8000 and 0x20000 are ?
+    <mcsim> for external I met only these 2 magic numbers when printed out
+      arguments of functions _pager_seqno_memory_object_... when they were
+      called.
+    <braunr> yes but did you try to find out where they come from ?
+    <mcsim> braunr: no. I think that 0x2000(many zeros) is maximal possible
+      object size.
+    <braunr> what's the exact value ?
+    <mcsim> can't tell exactly :/ My hurd box has broken again.
+    <braunr> mcsim: how does the vm find the backing content then ?
+    <mcsim> braunr: Do you know if it is guaranteed that map_entry size will be
+      not bigger than external object size?
+    <braunr> mcsim: i know it's not
+    <braunr> but you can use the map entry boundaries though
+    <mcsim> braunr: vm asks pager
+    <braunr> but if the page is already present
+    <braunr> how does it know ?
+    <braunr> it must be inside a vm_object ..
+    <mcsim> If I can use these boundaries then the problem I described is no
+      longer relevant.
+    <braunr> good
+    <braunr> it makes sense to use these boundaries, as the application can't
+      use data outside the mapping
+    <mcsim> I ask page with vm_page_lookup
+    <braunr> it would matter for shared objects, but then they have their own
+      faults :p
+    <braunr> ok
+    <braunr> so the size is actually completely ignored
+    <mcsim> if it is present then I stop expansion of cluster.
+    <braunr> which makes sense
+    <mcsim> braunr: yes, for external.
+    <braunr> all right
+    <braunr> use the mapping boundaries, it will do
+    <braunr> mcsim: i have only one comment about what i could see
+    <braunr> mcsim: there are 'advice' fields in both vm_map_entry and
+      vm_object
+    <braunr> there should be something else in vm_object
+    <braunr> i told you about pages before and after
+    <braunr> mcsim: how are you using this per object "advice" currently ?
+    <braunr> (in addition, using the same name twice for both mechanism and
+      policy is very confusing)
+    <mcsim> braunr: I try to expand cluster as much as possible, but not more
+      than the limit
+    <mcsim> they both determine policy, but advice for entry has bigger
+      priority
+    <braunr> that's wrong
+    <braunr> mapping and content shouldn't compete for policy
+    <braunr> the mapping tells the policy (=the advice) while the content tells
+      how to implement (e.g. how much content)
+    <braunr> IMO, you could simply get rid of the per object "advice" field and
+      use default values for now
+    <mcsim> braunr: What meaning should these values for number of pages
+      before and after have?
+    <braunr> or use something well known, easy, and effective like preceding
+      and following pages
+    <braunr> they give the vm the amount of content to ask the backing pager
+    <mcsim> braunr: maximal amount, minimal amount or exact amount?
+    <braunr> neither
+    <braunr> that's why i recommend you forget it for now
+    <braunr> but
+    <braunr> imagine you implement the three standard policies (normal, random,
+      sequential)
+    <braunr> then the pager assigns preceding and following numbers for each of
+      them, say [5;5], [0;0], [15;15] respectively
+    <braunr> these numbers would tell the vm how many pages to ask the pagers
+      in a single request and from where
+    <mcsim> braunr: but in fact there could be much more policies.
+    <braunr> yes
+    <mcsim> also in kernel context there is no such unit as pager.
+    <braunr> so there should be a call like memory_object_set_advice(int
+      advice, int preceding, int following);
+    <braunr> for example
+    <braunr> what ?
+    <braunr> the pager is the memory manager
+    <braunr> it does exist in kernel context
+    <braunr> (or i don't understand what you mean)
+    <mcsim> there is only port, but port could be either pager or something
+      else
+    <braunr> no, it's a pager
+    <braunr> it's a port whose receive right is held by a task implementing the
+      pager interface
+    <braunr> either the default pager or an untrusted task
+    <braunr> (or null if the object is anonymous memory not yet sent to the
+      default pager)
+    <mcsim> port is always pager?
+    <braunr> the object port is, yes
+    <braunr>         struct ipc_port         *pager;         /* Where to get
+      data */
+    <mcsim> So, you suggest to keep set of advices for each object?
+    <braunr> i suggest you don't change anything in objects for now
+    <braunr> keep the advice in the mappings only, and implement default
+      behaviour for the known policies
+    <braunr> mcsim: if you understand this point, then i have nothing more to
+      say, and we should let nowhere_man present his work
+    <mcsim> braunr: ok. I'll implement only default behaviors for known
+      policies for now.
+    <braunr> (actually, using the mapping boundaries is slightly unoptimal, as
+      we could have several mappings for the same content, e.g. a program with
+      read only executable mapping, then ro only)
+    <braunr> mcsim: another way to know the "size" is to actually lookup for
+      pages in objects
+    <braunr> hm no, that's not true
+    <mcsim> braunr: But if there is no page we have to ask it
+    <mcsim> and I don't understand why using mappings boundaries is unoptimal
+    <braunr> here is bash
+    <braunr> 0000000000400000    868K r-x--  /bin/bash
+    <braunr> 00000000006d9000     36K rw---  /bin/bash
+    <braunr> two entries, same file
+    <braunr> (there is the anonymous memory layer for the second, but it would
+      matter for the first cow faults)
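
The clustering rule discussed above (per-policy preceding/following page counts, clipped to the mapping boundaries, and stopping at pages that are already resident) can be sketched as follows. This is only an illustrative model in Python, not GNU Mach code: `CLUSTER_DEFAULTS`, `cluster_range` and the page-index representation are hypothetical names chosen here, and the numbers are the example values braunr quotes.

```python
from typing import Dict, Set, Tuple

# Hypothetical defaults: (preceding, following) page counts per policy.
# The figures are the example values from the discussion, not tuned numbers.
CLUSTER_DEFAULTS: Dict[str, Tuple[int, int]] = {
    "normal": (5, 5),
    "random": (0, 0),
    "sequential": (15, 15),
}


def cluster_range(fault_page: int, advice: str,
                  entry_start: int, entry_end: int,
                  resident: Set[int]) -> Tuple[int, int]:
    """Return the inclusive (first, last) page range to request from the pager.

    entry_start/entry_end are the mapping boundaries in pages, [start, end).
    resident holds the pages already present in the object.
    """
    if not entry_start <= fault_page < entry_end:
        raise ValueError("fault outside the mapping")
    preceding, following = CLUSTER_DEFAULTS[advice]

    first = last = fault_page
    # Grow backwards until the policy limit, the mapping start,
    # or a page that is already resident.
    while (fault_page - first < preceding
           and first - 1 >= entry_start
           and (first - 1) not in resident):
        first -= 1
    # Grow forwards under the same three conditions.
    while (last - fault_page < following
           and last + 1 < entry_end
           and (last + 1) not in resident):
        last += 1
    return first, last
```

With these assumptions, a fault on page 100 of a large mapping under the "normal" policy asks for pages 95 to 105, while "random" asks for the faulted page only.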
+
+
+## IRC, freenode, #hurd, 2012-08-02
+
+    <mcsim> braunr: You said that I probably need some support in vm_pageout.c
+      to make defpager work with clustered page transfers, but TBH I thought
+      that I have to implement only pagein. Do you expect me to implement
+      pageout as well? Or do I misunderstand the role of vm_pageout.c?
+    <braunr> no
+    <braunr> you're expected to implement only pageins for now
+    <mcsim> well, I'm finishing merging of ext2fs patch for large stores and
+      work on defpager in parallel.
+    <mcsim> braunr: Also I didn't get your idea about configuring of paging
+      mechanism on behalf of pagers.
+    <braunr> which one ?
+    <mcsim> braunr: You said that pager has to somehow pass size of desired
+      clusters for different paging policies.
+    <braunr> mcsim: i said not to care about that
+    <braunr> and the wording isn't correct, it's not "on behalf of pagers"
+    <mcsim> servers?
+    <braunr> pagers could tell the kernel what size (before and after a faulted
+      page) they prefer for each existing policy
+    <braunr> but that's one way to do it
+    <braunr> defaults work well too
+    <braunr> as shown in other implementations
diff --git a/open_issues/pfinet_vs_system_time_changes.mdwn b/open_issues/pfinet_vs_system_time_changes.mdwn
index 46705047..09b00d30 100644
--- a/open_issues/pfinet_vs_system_time_changes.mdwn
+++ b/open_issues/pfinet_vs_system_time_changes.mdwn
@@ -11,14 +11,16 @@ License|/fdl]]."]]"""]]
 
 [[!tag open_issue_hurd]]
 
-IRC, unknown channel, unknown date.
+
+# IRC, unknown channel, unknown date
 
     <grey_gandalf> I did a sudo date...
     <grey_gandalf> and the machine hangs
 
-This was very likely a misdiagnosis:
+This was very likely a misdiagnosis.
+
 
-IRC, freenode, #hurd, 2011-03-25:
+# IRC, freenode, #hurd, 2011-03-25
 
     <tschwinge> antrik: I suspect it'S some timing stuff in pfinet that perhaps
       uses absolute time, and somehow wildely gets confused?
@@ -42,7 +44,8 @@ IRC, freenode, #hurd, 2011-03-25:
       wrap-around, and thus the same result.)
     <tschwinge> Yes.
 
-IRC, freenode, #hurd, 2011-10-26:
+
+# IRC, freenode, #hurd, 2011-10-26
 
     <antrik> anyways, when ntpdate adjusts to the past, the connections hang,
       roughly for the amount of time being adjusted
@@ -50,7 +53,8 @@ IRC, freenode, #hurd, 2011-10-26:
     <antrik> (well, if it's long enough, they probably timeout on the other
       side...)
 
-IRC, freenode, #hurd, 2011-10-27:
+
+# IRC, freenode, #hurd, 2011-10-27
 
     <antrik> oh, another interesting thing I observed is that the the subhurd
       pfinet did *not* drop the connection... only the main Hurd one. I thought
@@ -60,7 +64,8 @@ IRC, freenode, #hurd, 2011-10-27:
       where I set the date is affected, and not the pfinet in the other
       instance
 
-IRC, freenode, #hurd, 2012-06-28:
+
+# IRC, freenode, #hurd, 2012-06-28
 
     <bddebian> great, now setting the date/time fucked my machine
     <pinotree> yes, we lack a monotonic clock
@@ -80,3 +85,17 @@ IRC, freenode, #hurd, 2012-06-28:
       it fucked me because I now cannot get to it.. :)
     <antrik> bddebian: that's odd... you should be able to just log in again
       IIRC
+
+
+# IRC, freenode, #hurd, 2012-07-29
+
+    <antrik> pfinet can't cope with larger system time changes because it can't
+      use a monotonic clock
+
+[[clock_gettime]].
+
+    <braunr> well when librt becomes easily usable everywhere (if it's
+      possible), it will be quite easy to work around this issue
+    <pinotree> yes and no, you just need a monotonic clock and clock_gettime
+      able to use it
+    <braunr> why "no" ?
diff --git a/open_issues/select.mdwn b/open_issues/select.mdwn
index abec304d..6bed94ca 100644
--- a/open_issues/select.mdwn
+++ b/open_issues/select.mdwn
@@ -215,6 +215,1186 @@ IRC, unknown channel, unknown date:
     <youpi> it's better than nothing yes
 
 
+# IRC, freenode, #hurd, 2012-07-21
+
+    <braunr> damn, select is actually completely misdesigned :/
+    <braunr> iiuc, it makes servers *block*, in turn :/
+    <braunr> can't be right
+    <braunr> ok i understand it better
+    <braunr> yes, timeouts should be passed along with the other parameters to
+      correctly implement non blocking select
+    <braunr> (or the round-trip io_select should only ask for notification
+      requests instead of making a server thread block, but this would require
+      even more work)
+    <braunr> adding the timeout in the io_select call should be easy enough for
+      whoever wants to take over a not-too-complicated-but-not-one-liner-either
+      task :)
+    <antrik> braunr: why is a blocking server thread a problem?
+    <braunr> antrik: handling the timeout at client side while server threads
+      block is the problem
+    <braunr> the timeout must be handled along with blocking obviously
+    <braunr> so you either do it at server side when async ipc is available,
+      which is the case here
+    <braunr> or request notifications (synchronously) and block at client side,
+      waiting for those notifications
+    <antrik> braunr: are you saying the client has a receive timeout, but when
+      it elapses, the server thread keeps on blocking?...
+    <braunr> antrik: no i'm referring to the non-blocking select issue we have
+    <braunr> antrik: the client doesn't block in this case, whereas the servers
+      do
+    <braunr> which obviously doesn't work ..
+    <braunr> see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=79358
+    <braunr> this is the reason why vim (and probably others) are slow on the
+      hurd, while not consuming any cpu
+    <braunr> the current work around is that whenever a non-blocking select is
+      done, it's transformed into a blocking select with the smallest possible
+      timeout
+    <antrik> braunr: well, note that the issue only began after fixing some
+      other select issue... it was fine before
+    <braunr> apparently, the issue was raised in 2000
+    <braunr> also, note that there is a delay between sending the io_select
+      requests and blocking on the replies
+    <braunr> when machines were slow, this delay could almost guarantee a
+      preemption between these steps, making the servers reply soon enough even
+      for a non blocking select
+    <braunr> the problem occurs when sending all the requests and checking for
+      replies is done before servers have a chance to send the reply
+    <antrik> braunr: I don't know what issue was raised in 2000, but I do know
+      that vim worked perfectly fine until last year or so. then some select
+      fix was introduced, which in turn broke vim
+    <braunr> antrik: could be the timeout rounding, Aug 2 2010
+    <braunr> hum but, the problem wasn't with vim
+    <braunr> vim does still work fine (in fact, glibc is patched to check some
+      well known process names and selectively fix the timeout)
+    <braunr> which is why vim is fast and view isn't
+    <braunr> the problem was with other services apparently
+    <braunr> and in order to fix them, that workaround had to be introduced
+    <braunr> i think it has nothing to do with the timeout rounding
+    <braunr> it must be the time when youpi added the patch to the debian
+      package
+    <antrik> braunr: the problem is that with the patch changing the timeout
+      rounding, vim got extremely slow. this is why the ugly hacky exception
+      was added later...
+    <antrik> after reading the report, I agree that the timeout needs to be
+      handled by the server. at least the timeout=0 case.
+    <pinotree> vim uses often 0-time selects to check whether there's input
+    <antrik> client-side handling might still be OK for other timeout settings
+      I guess
+    <antrik> I'm a bit ambivalent about that
+    <antrik> I tend to agree with Neal though: it really doesn't make much
+      sense to have a client-side watchdog timer for this specific call, while
+      for all other ones we trust the servers not to block...
+    <antrik> or perhaps not. for standard sync I/O, clients should expect that
+      an operation could take long (though not forever); but they might use
+      select() precisely to avoid long delays in I/O... so it makes some sense
+      to make sure that select() really doesn't delay because of a busy server
+    <antrik> OTOH, unless the server is actually broken (in which anything
+      could happen), a 0-time select should never actually block for an
+      extended period of time... I guess it's not wrong to trust the servers on
+      that
+    <antrik> pinotree: hm... that might explain a certain issue I *was*
+      observing with Vim on Hurd -- though I never really thought about it
+      being an actual bug, as opposed to just general Hurd sluggishness...
+    <antrik> but it makes sense now
+    <pinotree> antrik:
+      http://patch-tracker.debian.org/patch/series/view/eglibc/2.13-34/hurd-i386/local-select.diff
+    <antrik> so I guess we all agree that moving the select timeout to the
+      server is probably the most reasonable approach...
+    <antrik> braunr: BTW, I wouldn't really consider the sync vs. async IPC
+      cases any different. the client blocks waiting for the server to reply
+      either way...
+    <antrik> the only difference is that in the sync IPC case, the server might
+      want to take some special precaution so it doesn't have to block until
+      the client is ready to receive the reply
+    <antrik> but that's optional and not really select-specific I'd say
+    <antrik> (I'd say the only sane approach with sync IPC is probably for the
+      server never to wait -- if the client fails to set up for receiving the
+      reply in time, it loses...)
+    <antrik> and with the receive buffer approach in Viengoos, this can be done
+      really easy and nice :-)
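
The non-blocking select problem described above can be illustrated with a toy timing model. It is a sketch under stated assumptions, not the glibc or Hurd implementation: `Server`, `poll_client_side`, `poll_server_side` and the `check_delay` parameter are hypothetical names, and a single latency figure stands in for all scheduling effects.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    ready: bool           # is the descriptor ready when the request arrives?
    reply_latency: float  # seconds until the server's reply reaches the client


def poll_client_side(servers: List[Server], check_delay: float) -> List[int]:
    """Timeout handled by the client: it sends every request, then looks at
    the reply port check_delay seconds later and gives up immediately.
    A ready descriptor is only seen if its reply has already arrived."""
    return [fd for fd, s in enumerate(servers)
            if s.ready and s.reply_latency <= check_delay]


def poll_server_side(servers: List[Server]) -> List[int]:
    """Timeout passed along with the request: each server answers at once
    with its current state, and the client simply waits for all replies."""
    return [fd for fd, s in enumerate(servers) if s.ready]
```

In this model a slow client (large `check_delay`) happens to see the ready descriptor, which matches the remark that the issue was masked on slow machines, whereas a fast client misses it unless the timeout travels to the server.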
+
+
+## IRC, freenode, #hurd, 2012-07-22
+
+    <braunr> antrik: you can't block in servers with sync ipc
+    <braunr> so in this case, "select" becomes a request for notifications
+    <braunr> whereas with async ipc, you can, so it's less efficient to make a
+      full round trip just to ask for requests when you can just do async
+      requests (doing the actual blocking) and wait for any reply after
+    <antrik> braunr: I don't understand. why can't you block in servers with
+      async IPC?
+    <antrik> braunr: err... with sync IPC I mean
+    <braunr> antrik: because select operates on more than one fd
+    <antrik> braunr: and what has that got to do with sync vs. async IPC?...
+    <antrik> maybe you are thinking of endpoints here, which is a whole
+      different story
+    <antrik> traditional L4 has IPC ports bound to specific threads; so
+      implementing select requires a separate client thread for each
+      server. but that's not mandatory for sync IPC. Viengoos has endpoints not
+      bound to threads
+    <braunr> antrik: i don't know what "endpoint" means here
+    <braunr> but, you can't use sync IPC to implement select on multiple fds
+      (and thus possibly multiple servers) by blocking in the servers
+    <braunr> you'd block in the first and completely miss the others
+    <antrik> braunr: I still don't see why... or why async IPC would change
+      anything in that regard
+    <braunr> antrik: well, you call select on 3 fds, each implemented by
+      different servers
+    <braunr> antrik: you call a sync select on the first fd, obviously you'll
+      block there
+    <braunr> antrik: if it's async, you don't block, you just send the
+      requests, and wait for any reply
+    <braunr> like we do
+    <antrik> braunr: I think you might be confused about the meaning of sync
+      IPC. it doesn't in any way imply that after sending an RPC request you
+      have to block on some particular reply...
+    <youpi> antrik: what does sync mean then?
+    <antrik> braunr: you can have any number of threads listening for replies
+      from the various servers (if using an L4-like model); or even a single
+      thread, if you have endpoints that can listen on replies from different
+      sources (which was pretty much the central concern in the Viengoos IPC
+      design AIUI)
+    <youpi> antrik: I agree with your "so it makes some sense to make sure that
+      select() really doesn't delay because of a busy server" (for blocking
+      select) and "OTOH, unless the server is actually broken (in which
+      anything could happen), a 0-time select should never actually block" (for
+      non-blocking select)
+    <antrik> youpi: regarding the select, I was thinking out loud; the former
+      statement was mostly cancelled by my later conclusions...
+    <antrik> and I'm not sure the latter statement was quite clear
+    <youpi> do you know when it was?
+    <antrik> after rethinking it, I finally concluded that it's probably *not*
+      a problem to rely on the server to observe the timeout. if it's really
+      busy, it might take longer than the designated timeout (especially if
+      timeout is 0, hehe) -- but I don't think this is a problem
+    <antrik> and if it doesn't observe the timeout because it's
+      broken/malicious, that's not more problematic than any other RPC the
+      server doesn't handle as expected
+    <youpi> ok
+    <youpi> did somebody write down the conclusion "let's make select timeout
+      handled at server side" somewhere?
+    <antrik> youpi: well, neal already said that in a followup to the select
+      issue Debian bug... and after some consideration, I completely agree with
+      his reasoning (as does braunr)
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+    <braunr> antrik: i was meaning sync in the most common meaning, yes, the
+      client blocking on the reply
+    <antrik> braunr: I think you are confusing sync IPC with sync I/O ;-)
+    <antrik> braunr: by that definition, the vast majority of Hurd IPC would be
+      sync... but that's obviously not the case
+    <antrik> synchronous IPC means that send and receive happen at the same
+      time -- nothing more, nothing less. that's why it's called synchronous
+    <braunr> antrik: yes
+    <braunr> antrik: so it means the client can't continue unless he actually
+      receives
+    <antrik> in a pure sync model such as L4 or EROS, this means either the
+      sender or the receiver has to block, so synchronisation can happen. which
+      one is server and which one is client is completely irrelevant here --
+      this is about individual message transfer, not any RPC model on top of it
+    <braunr> in the case of select, i assume sender == client
+    <antrik> in Viengoos, the IPC is synchronous in the sense that transfer
+      from the send buffer to the receive buffer happens at the same time; but
+      it's asynchronous in the sense that the receiver doesn't necessarily have
+      to be actively waiting for the incoming message
+    <braunr> ok, i was talking about a pure sync model
+    <antrik> (though in most cases it will still do so...)
+    <antrik> braunr: BTW, in the case of select, the sender is *not* the
+      client. the reply is relevant here, not the request -- so the client is
+      the receiver
+    <antrik> (the select request is boring)
+    <braunr> sorry, i don't understand, you seem to dismiss the select request
+      for no valid reason
+    <antrik> I still don't see how sync vs. async affects the select reply
+      receive though... blocking seems the right approach in either case
+    <braunr> blocking is required
+    <braunr> but you either block in the servers, or in the client
+    <braunr> (and if blocking in the servers, the client also blocks)
+    <braunr> i'll explain how i see it again
+    <braunr> there are two approaches to implementing select
+    <braunr> 1/ send requests to all servers, wait for any reply, this is what
+      the hurd does
+    <braunr> but it's possible because you can send all the requests without
+      waiting for the replies
+    <braunr> 2/ send notification requests, wait for a notification
+    <braunr> this doesn't require blocking in the servers (so if you have many
+      clients, you don't need as many threads)
+    <braunr> i was wondering which approach was used by the hurd, and if it
+      made sense to change
+    <antrik> TBH I don't see the difference between 1) and 2)... whether the
+      message from the server is called an RPC reply or a notification is just
+      a matter of definition
+    <antrik> I think I see though what you are getting at
+    <antrik> with sync IPC, if the client sent all requests and only afterwards
+      started to listen for replies, the servers might need to block while
+      trying to deliver the reply because the client is not ready yet
+    <braunr> that's one thing yes
+    <antrik> but even in the sync case, the client can immediately wait for
+      replies to each individual request -- it might just be more complicated,
+      depending on the specifics of the IPC design
+    <braunr> what i mean by "send notification requests" is actually more than
+      just sending, it's a complete RPC
+    <braunr> and notifications are non-blocking, yes
+    <antrik> (with L4, it would require a separate client thread for each
+      server contacted... which is precisely why a different mechanism was
+      designed for Viengoos)
+    <braunr> seems weird though
+    <braunr> don't they have a portset like abstraction ?
+    <antrik> braunr: well, having an immediate reply to the request and a
+      separate notification later is just a waste of resources... the immediate
+      reply would have no information value
+    <antrik> no, in original L4 IPC is always directed to specific threads
+    <braunr> antrik: some could see the waste of resource as being the
+      duplication of the number of client threads in the server
+    <antrik> you could have one thread listening to replies from several
+      servers -- but then, replies can get lost
+    <braunr> i see
+    <antrik> (or the servers have to block on the reply)
+    <braunr> so, there are really no capabilities in the original l4 design ?
+    <antrik> though I guess in the case of select() it wouldn't really matter
+      if replies get lost, as long as at least one is handled... would just
+      require the listener thread to be separate from the thread sending the
+      requests
+    <antrik> braunr: right. no capabilities of any kind
+    <braunr> that was my initial understanding too
+    <braunr> thanks
+    <antrik> so I partially agree: in a purely sync IPC design, it would be
+      more complicated (but not impossible) to make sure the client gets the
+      replies without the server having to block while sending replies
+
+    <braunr> arg, we need hurd_condition_timedwait (and possibly
+      condition_timedwait) to cleanly fix io_select
+    <braunr> luckily, i still have my old patch for condition_timedwait :>
+    <braunr> bddebian: in order to implement timeouts in select calls, servers
+      now have to use a hurd_condition_timedwait function
+    <braunr> is it possible that a thread both gets canceled and timeout on a
+      wait ?
+    <braunr> looks unlikely to me
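
What a server-side wait with both a timeout and cancellation might look like can be sketched with Python's `threading.Condition`. This is an analogy only: `SelectWaiter` and its methods are hypothetical, and the real `hurd_condition_timedwait` would be built on the Hurd's own threading primitives. The sketch checks cancellation first, so a waiter that was both cancelled and timed out is reported as cancelled.

```python
import threading
from typing import Optional


class SelectWaiter:
    """One blocked io_select request inside a server (illustrative model)."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._ready = False
        self._cancelled = False

    def notify_ready(self) -> None:
        # Called when the descriptor becomes readable/writable.
        with self._cond:
            self._ready = True
            self._cond.notify_all()

    def cancel(self) -> None:
        # Called when the RPC is interrupted.
        with self._cond:
            self._cancelled = True
            self._cond.notify_all()

    def wait(self, timeout: Optional[float]) -> str:
        """Block until ready, cancelled, or timeout seconds have elapsed.
        timeout=None blocks indefinitely; timeout=0 only probes."""
        with self._cond:
            self._cond.wait_for(lambda: self._ready or self._cancelled,
                                timeout)
            if self._cancelled:
                return "cancelled"
            if self._ready:
                return "ready"
            return "timeout"
```

A zero timeout returns immediately with the current state, which is the behaviour a non-blocking select needs from the server.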
+
+    <braunr> hm, i guess the same kind of compatibility constraints exist for
+      hurd interfaces
+    <braunr> so, should we have an io_select1 ?
+    <antrik> braunr: I would use a more descriptive name: io_select_timeout()
+    <braunr> antrik: ah yes
+    <braunr> well, i don't really like the idea of having 2 interfaces for the
+      same call :)
+    <braunr> because all select should be select_timeout :)
+    <braunr> but ok
+    <braunr> antrik: actually, having two select calls may be better
+    <braunr> oh it's really minor, we don't care actually
+    <antrik> braunr: two select calls?
+    <braunr> antrik: one with a timeout and one without
+    <braunr> the glibc would choose at runtime
+    <antrik> right. that was the idea. like with most transitions, that's
+      probably the best option
+    <braunr> there is no need to pass the timeout value if it's not needed, and
+      it's easier to pass NULL this way
+    <antrik> oh
+    <antrik> nah, that would make the transition more complicated I think
+    <braunr> ?
+    <braunr> ok
+    <braunr> :)
+    <braunr> this way, it becomes very easy
+    <braunr> the existing io_select call moves into a select_common() function
+    <antrik> the old variant doesn't know that the server has to return
+      immediately; changing that would be tricky. better just use the new
+      variant for the new behaviour, and deprecate the old one
+    <braunr> and the entry points just call this common function with either
+      NULL or the given timeout
+    <braunr> no need to deprecate the old one
+    <braunr> that's what i'm saying
+    <braunr> and i don't understand "the old variant doesn't know that the
+      server has to return immediately"
+    <antrik> won't the old variant block indefinitely in the server if there
+      are no ready fds?
+    <braunr> yes it will
+    <antrik> oh, you mean using the old variant if there is no timeout value?
+    <braunr> yes
+    <antrik> well, I guess this would work
+    <braunr> well of course, the question is rather if we want this or not :)
+    <antrik> hm... not sure
+    <braunr> we need something to improve the process of changing our
+      interfaces
+    <braunr> it's really painful currently
+    <antrik> inside the servers, we probably want to use common code
+      anyways... so in the long run, I think it simplifies the code when we can
+      just drop the old variant at some point
+    <braunr> a lot of the work we need to do involves changing interfaces, and
+      we very often get to the point where we don't know how to do that and
+      hardly agree on a final version :
+    <braunr> :/
+    <braunr> ok but
+    <braunr> how do you tell the server you don't want a timeout ?
+    <braunr> a special value ? like { -1; -1 } ?
+    <antrik> hm... good point
+    <braunr> i'll do it that way for now
+    <braunr> it's the best way to test it
+    <antrik> which way you mean now?
+    <braunr> keeping io_select as it is, add io_select_timeout
+    <antrik> yeah, I thought we agreed on that part... the question is just
+      whether io_select_timeout should also handle the no-timeout variant going
+      forward, or keep io_select for that. I'm really not sure
+    <antrik> maybe I'll form an opinion over time :-)
+    <antrik> but right now I'm undecided
+    <braunr> i say we keep io_select
+    <braunr> anyway it won't change much
+    <braunr> we can just change that at the end if we decide otherwise
+    <antrik> right
+    <braunr> even passing special values is ok
+    <braunr> with a carefully written hurd_condition_timedwait, it's very easy
+      to add the timeouts :)
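
The layout discussed above (keep `io_select`, add `io_select_timeout`, and route both through a shared `select_common`) might look roughly like this. It is a Python model of the control flow only: the three function names come from the conversation, but their signatures, the `(seconds, microseconds)` pair and the returned strings are assumptions made for illustration.

```python
from typing import Optional, Tuple

Timeout = Tuple[int, int]        # (seconds, microseconds), timeval-like
NO_TIMEOUT: Timeout = (-1, -1)   # sentinel value floated in the discussion


def select_common(fd_ready: bool, timeout_us: Optional[int]) -> str:
    """Shared implementation; reports what the server would do."""
    if fd_ready:
        return "ready"
    if timeout_us is None:
        return "block until ready"
    if timeout_us == 0:
        return "timed out"
    return f"block for at most {timeout_us} us"


def io_select(fd_ready: bool) -> str:
    # Old entry point: no timeout argument, so it always means "no timeout".
    return select_common(fd_ready, None)


def io_select_timeout(fd_ready: bool, timeout: Timeout) -> str:
    # New entry point: the sentinel maps to the no-timeout case.
    if timeout == NO_TIMEOUT:
        return select_common(fd_ready, None)
    sec, usec = timeout
    if sec < 0 or not 0 <= usec < 1_000_000:
        raise ValueError("invalid timeout")
    return select_common(fd_ready, sec * 1_000_000 + usec)
```

Whether the sentinel is needed at all depends on the open question in the log, namely whether `io_select` stays around for the no-timeout case.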
+    <youpi> antrik, braunr: I'm wondering, another solution is to add an
+      io_probe, i.e. the server has to return an immediate result, and the
+      client then just waits for all results, without timeout
+    <youpi> that'd be a mere addition in the glibc select() call: when timeout
+      is 0, use that, and otherwise use the previous code
+    <youpi> the good point is that it looks nicer in fs.defs
+    <youpi> are there bad points?
+    <youpi> (I don't have the whole issues in the mind now, so I'm probably
+      missing things)
+    <braunr> youpi: the bad point is duplicating the implementation maybe
+    <youpi> what duplication ?
+    <youpi> ah you mean for the select case
+    <braunr> yes
+    <braunr> although it would be pretty much the same
+    <braunr> that is, if probe only, don't enter the wait loop
+    <youpi> could that be just some ifs here and there?
+    <youpi> (though not making the code easier to read...)
+    <braunr> hm i'm not sure it's fine
+    <youpi> in that case io_select_timeout looks nicer indeed :)
+    <braunr> my problem with the current implementation is having the timeout
+      at the client side whereas the server side is doing the blocking
+    <youpi> I wonder how expensive a notification is, compared to blocking
+    <youpi> a blocking indeed needs a thread stack
+    <youpi> (and kernel thread stuff)
+    <braunr> with the kind of async ipc we have, it's still better to do it
+      that way
+    <braunr> and all the code already exists
+    <braunr> having the timeout at the client side also has its advantage
+    <braunr> latency is more precise
+    <braunr> so the real problem is indeed the non blocking case only
+    <youpi> isn't it bound to kernel ticks anyway ?
+    <braunr> uh, not if your server sucks
+    <braunr> or is loaded for whatever reason
+    <youpi> ok, that's not what I understood by "precision" :)
+    <youpi> I'd rather call it robustness :)
+    <braunr> hm
+    <braunr> right
+    <braunr> there are several ways to do this, but the io_select_timeout one
+      looks fine to me
+    <braunr> and is already well on its way
+    <braunr> and it's reliable
+    <braunr> (whereas i'm not sure about reliability if we keep the timeout at
+      client side)
+    <youpi> btw make the timeout nanoseconds
+    <braunr> ??
+    <youpi> pselect uses timespec, not timeval
+    <braunr> do we want pselect ?
+    <youpi> err, that's the only safe way with signals
+    <braunr> not only, no
+    <youpi> and poll is timespec also
+    <youpi> not only??
+    <braunr> you mean ppoll
+    <youpi> no, poll too
+    <youpi> by "the only safe way", I mean for select calls
+    <braunr> i understand the race issue
+    <youpi> ppoll is a gnu extension
+    <braunr> int poll(struct pollfd *fds, nfds_t nfds, int timeout);
+    <youpi> ah, right, I was also looking at ppoll
+    <youpi> anyway
+    <youpi> we can use nanosecs
+    <braunr> most event loops use a pipe or a socketpair
+    <youpi> there's no reason not to
+    <antrik> youpi: I briefly considered special-casing 0 timeouts last time we
+      discussed this; but I concluded that it's probably better to handle all
+      timeouts server-side
+    <youpi> I don't see why we should even discuss that
+    <braunr> and translate signals to writes into the pipe/socketpair
+    <youpi> antrik: ok
+    <antrik> you can't count on select() timeout precision anyways
+    <antrik> a few ms more shouldn't hurt any sanely written program
+    <youpi> braunr: "most" doesn't mean "all"
+    <youpi> there *are* applications which use pselect
+    <braunr> well mach only handles milliseconds
+    <youpi> and it's not going out of the standard
+    <youpi> mach is not the hurd
+    <youpi> if we change mach, we can still keep the hurd ipcs
+    <youpi> anyway
+    <youpi> again
+    <youpi> I really don't see the point of the discussion
+    <youpi> is there anything *against* using nanoseconds?
+    <braunr> i chose the types specifically because of that :p
+    <braunr> but ok i can change again
+    <youpi> because what??
+    <braunr> i chose to use mach's native time_value_t
+    <braunr> because it matches timeval nicely
+    <youpi> but it doesn't match timespec nicely
+    <braunr> no it doesn't
+    <braunr> should i add a hurd specific time_spec_t then ?
+    <youpi> "how do you tell the server you don't want a timeout ? a special
+      value ? like { -1; -1 } ?"
+    <youpi> you meant infinite blocking?
+    <braunr> youpi: yes
+    <braunr> oh right, pselect is posix
+    <youpi> actually posix says that there can be limitations on the maximum
+      timeout supported, which should be at least 31 days
+    <youpi> -1;-1 is thus fine
+    <braunr> yes
+    <braunr> which is why i could choose time_value_t (a struct of 2 integer_t)
+    <youpi> well, I'd say gnumach could grow a nanosecond-precision time value
+    <youpi> e.g. for clock_gettime precision and such
+    <braunr> so you would prefer me adding the time_spec_t time to gnumach
+      rather than the hurd ?
+    <youpi> well, if hurd RPCs are using mach types and there's no mach type
+      for nanoseconds, it makes sense to add one
+    <youpi> I don't know about the first part
+    <braunr> yes some hurd interfaces also use time_value_t
+    <antrik> in general, I don't think Hurd interfaces should rely on a Mach
+      timevalue. it's really only meaningful when Mach is involved...
+    <antrik> we could even pass the time value as an opaque struct. don't
+      really need an explicit MIG type for that.
+    <braunr> opaque ?
+    <youpi> an opaque type would be a step backward from multi-machine support
+      ;)
+    <antrik> youpi: that's a sham anyways ;-)
+    <youpi> what?
+    <youpi> ah, using an opaque type, yes :)
+    <braunr> probably why my head bugged while reading that
+    <antrik> it wouldn't be fully opaque either. it would be two ints, right?
+      even if Mach doesn't know what these two ints mean, it still could do
+      byte order conversion, if we ever actually supported setups where it
+      matters...
+    <braunr> so uh, should this new time_spec_t be added in gnumach or the hurd
+      ?
+    <braunr> youpi: you're the maintainer, you decide :p
+    <youpi> well, I don't like deciding when I haven't even read fs.defs :)
+    <youpi> but I'd say the way forward is defining it in the hurd
+    <youpi> and put a comment "should be our own type" above use of the mach
+      type
+    <braunr> ok
+    <braunr> and, by the way, is using integer_t fine wrt the 64-bits port ?
+    <youpi> I believe we settled on keeping integer_t a 32bit integer, like xnu
+      does
+    <braunr> ok so it's not
+    <braunr> uh well
+    <youpi> why "not" ?
+    <braunr> keeping it 32-bits for the 32-bits userspace hurd
+    <braunr> but i'm talking about a true 64-bits version
+    <braunr> wouldn't integer_t get 64-bits then ?
+    <youpi> I meant we settled on a no
+    <youpi> like xnu does
+    <braunr> xnu uses 32-bits integer_t even when userspace runs in 64-bits
+      mode ?
+    <youpi> because things for which we'd need 64bits then are offset_t,
+      vm_size_t, and such
+    <youpi> yes
+    <braunr> ok
+    <braunr> youpi: but then what is the type to use for long integers ?
+    <braunr> or uintptr_t
+    <youpi> braunr: uintptr_t
+    <braunr> the mig type i mean
+    <youpi> type memory_object_offset_t     = uint64_t;
+    <youpi> (and size)
+    <braunr> well that's a 64-bits type
+    <youpi> well, yes
+    <braunr> natural_t and integer_t were supposed to have the processor word
+      size
+    <youpi> probably I didn't understand your question
+    <braunr> if we remove that property, what else has it ?
+    <youpi> yes, but see roland's comment on this
+    <braunr> ah ?
+    <youpi> ah, no, he just says the same
+    <antrik> braunr: well, it's debatable whether the processor word size is
+      really 64 bit on x86_64...
+    <antrik> all known compilers still consider int to be 32 bit
+    <antrik> (and int is the default word size)
+    <braunr> not really
+    <youpi> as in?
+    <braunr> the word size really is 64-bits
+    <braunr> the question concerns the data model
+    <braunr> with ILP32 and LP64, int is always 32-bits, and long gets the
+      processor word size
+    <braunr> and those are the only ones current unices support
+    <braunr> (which is why long is used everywhere for this purpose instead of
+      uintptr_t in linux)
+    <antrik> I don't think int is 32 bit on alpha?
+    <antrik> (and probably some other 64 bit arches)
+    <braunr> also, assuming we want to maintain the ability to support single
+      system images, do we really want RPC with variable size types ?
+    <youpi> antrik: linux alpha's int is 32bit
+    <braunr> sparc64 too
+    <youpi> I don't know any 64bit port with 64bit int
+    <braunr> i wonder how posix will solve the year 2038 problem ;p
+    <youpi> time_t is a long
+    <youpi> the hope is that there'll be no 32bit systems by 2038 :)
+    <braunr> :)
+    <youpi> but yes, that matters to us
+    <youpi> number of seconds should not be just an int
+    <braunr> we can force a 64-bits type then
+    <braunr> i tend to think we should have no variable size type in any mig
+      interface
+    <braunr> youpi: so, new hurd type, named time_spec_t, composed of two
+      64-bits signed integers
+    <pinotree> braunr: i added that in my prototype of monotonic clock patch
+      for gnumach
+    <braunr> oh
+    <youpi> braunr: well, 64bit is not needed for the nanosecond part
+    <braunr> right
+    <braunr> it will be aligned anyway :p
+    <youpi> I know
+    <youpi> uh, actually linux uses long there
+    <braunr> pinotree: i guess your patch is still in debian ?
+    <braunr> youpi: well yes
+    <braunr> youpi: why wouldn't it ? :)
+    <pinotree> no, never applied
+    <youpi> braunr: because 64bit is not needed
+    <braunr> ah, i see what you mean
+    <youpi> oh, posix says long actually
+    <youpi> *exactly* long
+    <braunr> i'll use the same sizes
+    <braunr> so it fits nicely with timespec
+    <braunr> hm
+    <braunr> but timespec is only used at the client side
+    <braunr> glibc would simply move the timespec values into our hurd specific
+      type (which can use 32-bits nanosecs) and servers would only use that
+      type
+    <braunr> all right, i'll do it that way, unless there are additional
+      comments next morning :)
+    <antrik> braunr: we never supported federations, and I'm pretty sure we
+      never will. the remnants of network IPC code were ripped out some years
+      ago. some of the Hurd interfaces use opaque structs too, so it wouldn't
+      even work if it existed. as I said earlier, it's really all a sham
+    <antrik> as for the timespec type, I think it's easier to stick with the
+      API definition at RPC level too
+
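The client-side conversion discussed above (glibc copying the `timespec` values into a fixed-width type, with `{ -1, -1 }` standing for "no timeout") could look roughly like the following. This is a minimal sketch only: `struct wire_time` and both helper functions are hypothetical names made up for illustration, not the type or code that was eventually committed, and the sketch uses two 64-bit fields even though the log notes that 32 bits would be enough for the nanoseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical fixed-width wire type: two signed 64-bit integers, so the
   layout does not depend on the size of `long' on the client.  */
struct wire_time
{
  int64_t seconds;
  int64_t nanoseconds;
};

/* Sentinel meaning "no timeout, block indefinitely" ({ -1, -1 }).  */
static const struct wire_time WIRE_TIME_INFINITE = { -1, -1 };

/* Client-side conversion from the POSIX type to the wire type.
   A NULL timespec (as accepted by pselect) maps to the sentinel.  */
static struct wire_time
wire_time_from_timespec (const struct timespec *ts)
{
  struct wire_time wt;

  if (ts == NULL)
    return WIRE_TIME_INFINITE;

  /* A valid timeout has 0 <= tv_nsec < 1000000000.  */
  assert (ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);

  wt.seconds = (int64_t) ts->tv_sec;
  wt.nanoseconds = (int64_t) ts->tv_nsec;
  return wt;
}

/* Server-side check for the sentinel.  */
static int
wire_time_is_infinite (struct wire_time wt)
{
  return wt.seconds == -1 && wt.nanoseconds == -1;
}
```

With this shape, servers only ever see the fixed-width type, and `struct timespec` stays a purely client-side concern.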
+
+## IRC, freenode, #hurd, 2012-07-24
+
+    <braunr> youpi: antrik: is vm_size_t an appropriate type for a c long ?
+    <braunr> (appropriate mig type)
+    <antrik> I wouldn't say so. while technically they are pretty much
+      guaranteed to be the same, conceptually they are entirely different
+      things -- it would be confusing at least to do it that way...
+    <braunr> antrik: well which one then ? :(
+    <antrik> braunr: no idea TBH
+    <braunr> antrik_: that should have been natural_t and integer_t
+    <braunr> so maybe we should add new types to replace them
+    <antrik_> braunr: actually, RPCs should never have any machine-specific
+      types... which makes me realise that a 1:1 translation to the POSIX
+      definition is actually not possible if we want to follow the Mach ideals
+    <braunr> i agree
+    <braunr> (well, the original mach authors used natural_t in quite a bunch
+      of places ..)
+    <braunr> the mig interfaces look extremely messy to me because of this type
+      issue
+    <braunr> and i just want to move forward with my work now
+    <braunr> i could just use 2 integer_t, that would get converted in the
+      massive future revamp of the interfaces for the 64-bits userspace
+    <braunr> or 2 64-bits types
+    <braunr> i'd like us to agree on one of the two not too late so i can
+      continue
+
+
+## IRC, freenode, #hurd, 2012-07-25
+
+    <antrik_> braunr: well, for actual kernel calls, machine-specific types are
+      probably hard to avoid... the problem is when they are used in other RPCs
+    <braunr> antrik: i opted for a hurd specific time_data_t = struct[2] of
+      int64
+    <braunr> and going on with this for now
+    <braunr> once it works we'll finalize the types if needed
+    <antrik> I'm really not sure how to best handle such 32 vs. 64 bit issues
+      in Hurd interfaces...
+    <braunr> you *could* consider time_t and long to be machine specific types
+    <antrik> well, they clearly are
+    <braunr> long is
+    <braunr> time_t isn't really
+    <antrik> didn't you say POSIX demands it to be longs?
+    <braunr> we could decide to make it 64 bits in all versions of the hurd
+    <braunr> no
+    <braunr> posix requires the nanoseconds field of timespec to be long
+    <braunr> the way i see it, i don't see any problem (other than a little bit
+      of storage and performance) using 64-bits types here
+    <antrik> well, do we really want to use a machine-independent time format,
+      if the POSIX interfaces we are mapping do not?...
+    <antrik> (perhaps we should; I'm just uncertain what's better in this case)
+    <braunr> this would require creating new types for that
+    <braunr> probably mach types for consistency
+    <braunr> to replace natural_t and integer_t
+    <braunr> now this concerns a totally different issue than select
+    <braunr> which is how we're gonna handle the 64-bits port
+    <braunr> because natural_t and integer_t are used almost everywhere
+    <antrik> indeed
+    <braunr> and we must think of 2 ports
+    <braunr> the 32-bits over 64-bits gnumach, and the complete 64-bits one
+    <antrik> what do we do for the interfaces that are explicitly 64 bit?
+    <braunr> what do you mean ?
+    <braunr> i'm not sure there is anything to do
+    <antrik> I mean what is done in the existing ones?
+    <braunr> like off64_t ?
+    <antrik> yeah
+    <braunr> they use int64 and unsigned64
+    <antrik> OK. so we shouldn't have any trouble with that at least...
+    <pinotree> braunr: were you adding a time_value_t in mach, but for
+      nanoseconds?
+    <braunr> no i'm adding a time_data_t to the hurd
+    <braunr> for nanoseconds yes
+    <pinotree> ah ok
+    <pinotree> (make sure it is available in hurd/hurd_types.defs)
+    <braunr> yes it's there
+    <pinotree> \o/
+    <braunr> i mean, i didn't forget to add it there
+    <braunr> for now it's a struct[2] of int64
+    <braunr> but we're not completely sure of that
+    <braunr> currently i'm teaching the hurd how to use timeouts
+    <pinotree> cool
+    <braunr> which basically involves adding a time_data_t *timeout parameter
+      to many functions
+    <braunr> and replacing hurd_condition_wait with hurd_condition_timedwait
+    <braunr> and making sure a timeout isn't an error on the return path
+    * pinotree has a simpler idea for time_data_t: add a file_utimesns to
+        fs.defs
+    <braunr> hmm, some functions have a nonblocking parameter
+    <braunr> i'm not sure if it's better to replace them with the timeout, or
+      add the timeout parameter
+    <braunr> considering the functions involved may return EWOULDBLOCK
+    <braunr> for now i'll add a timeout parameter, so that the code requires as
+      little modification as possible
+    <braunr> tell me your opinion on that please
+    <antrik> braunr: what functions?
+    <braunr> connq_listen in pflocal for example
+    <antrik> braunr: I don't really understand what you are talking about :-(
+    <braunr> some servers implement select this way :
+    <braunr> 1/ call a function in non-blocking mode, if it indicates data is
+      available, return immediately
+    <braunr> 2/ call the same function, in blocking mode
+    <braunr> normally, with the new timeout parameter, non-blocking could be
+      passed in the timeout parameter (with a timeout of 0)
+    <braunr> operating in non-blocking mode, i mean
+    <braunr> antrik: is it clear now ? :)
+    <braunr> i wonder how the hurd managed to grow so much code without a
+      cond_timedwait function :/
+    <braunr> i think i have finished my io_select_timeout patch on the hurd
+      side
+    <braunr> :)
+    <braunr> a small step for the hurd, but a big one against vim latencies !!
+    <braunr> (which is the true reason i'm working on this haha)
+    <braunr> new hurd rbraun/io_select_timeout branch for those interested
+    <braunr> hm, my changes clash hard with the debian pflocal patch by neal
+      :/
+    <antrik> braunr: replace I'd say. no need to introduce redundancy; and
+      code changes not affecting interfaces are cheap
+    <antrik> (in general, I'm always in favour of refactoring)
+    <braunr> antrik: replace what ?
+    <antrik> braunr: wow, didn't think moving the timeouts to server would be
+      such a quick task :-)
+    <braunr> antrik: :)
+    <antrik> 16:57 < braunr> hmm, some functions have a nonblocking parameter
+    <antrik> 16:58 < braunr> i'm not sure if it's better to replace them with
+      the timeout, or add the timeout parameter
+    <braunr> antrik: ah about that, ok
+
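The behaviour referred to above, where a timeout of 0 means "operate in non-blocking mode", is what callers already see with plain POSIX `select()`. The sketch below only illustrates that client-visible convention; it is not Hurd server code, and `poll_readable` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Return 1 if FD is readable right now, 0 if it is not, -1 on error.
   A zeroed timeval makes select() poll instead of blocking; passing a
   NULL timeout instead would block until the descriptor is ready.  */
static int
poll_readable (int fd)
{
  fd_set readfds;
  struct timeval zero;
  int n;

  /* FD_SET is undefined for descriptors outside [0, FD_SETSIZE).  */
  assert (fd >= 0 && fd < FD_SETSIZE);

  zero.tv_sec = 0;
  zero.tv_usec = 0;

  FD_ZERO (&readfds);
  FD_SET (fd, &readfds);

  n = select (fd + 1, &readfds, NULL, NULL, &zero);
  if (n < 0)
    return -1;
  return n > 0 && FD_ISSET (fd, &readfds);
}
```

Programs such as vim issue this kind of zero-timeout call very frequently, which is why the latency of the non-blocking path matters so much in this discussion.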
+
+## IRC, freenode, #hurd, 2012-07-26
+
+    <pinotree> braunr: wrt your select_timeout branch, why not push only the
+      time_data stuff to master?
+    <braunr> pinotree: we didn't agree on that yet
+
+    <braunr> ah better, with the correct ordering of io routines, my hurd boots
+      :)
+    <pinotree> and works too? :p
+    <braunr> so far yes
+    <braunr> i've spotted some issues in libpipe but nothing major
+    <braunr> i "only" have to adjust the client side select implementation now
+
+
+## IRC, freenode, #hurd, 2012-07-27
+
+    <braunr> io_select should remain a routine (i.e. synchronous) for server
+      side stub code
+    <braunr> but should be asynchronous (send only) for client side stub code
+    <braunr> (since _hurd_select manually handles replies through a port set)
+
+
+## IRC, freenode, #hurd, 2012-07-28
+
+    <braunr> why are there both REPLY_PORTS and IO_SELECT_REPLY_PORT macros in
+      the hurd ..
+    <braunr> and for the select call only :(
+    <braunr> and doing the exact same thing unless i'm mistaken
+    <braunr> the reply port is required for select anyway ..
+    <braunr> i just want to squeeze them into a new IO_SELECT_SERVER macro
+    <braunr> i don't think i can maintain the use of the existing io_select
+      call as it is
+    <braunr> grr, the io_request/io_reply files aren't synced with the io.defs
+      file
+    <braunr> calls like io_sigio_request seem totally unused
+    <antrik> yeah, that's a major shortcoming of MIG -- we shouldn't need to
+      have separate request/reply defs
+    <braunr> they're not even used :/
+    <braunr> i did something a bit ugly but it seems to do what i wanted
+
+
+## IRC, freenode, #hurd, 2012-07-29
+
+    <braunr> good, i have a working client-side select
+    <braunr> now i need to fix the servers a bit :x
+    <braunr> arg, my test cases work, but vim doesn't :((
+    <braunr> i hate select :p
+    <braunr> ah good, my problems are caused by a deadlock because of my glibc
+      changes
+    <braunr> ah yes, found my locking problem
+    <braunr> building my final libc now
+    * braunr crosses fingers
+    <braunr> (the deadlock issue was of course a one liner)
+    <braunr> grr deadlocks again
+    <braunr> grmbl, my deadlock is in pfinet :/
+    <braunr> my select_timeout code makes servers deadlock on the libports
+      global lock :/
+    <braunr> wtf..
+    <braunr> youpi: it may be related to the failed assertion
+    <braunr> deadlocking on mutex_unlock oO
+    <braunr> grr
+    <braunr> actually, mutex_unlock sends a message to notify other threads
+      that the lock is ready
+    <braunr> and that's what is blocking ..
+    <braunr> i'm not sure it's a fundamental problem here
+    <braunr> it may simply be a corruption
+    <braunr> i have several (but not that many) threads blocked in mutex_unlock
+      and one blocked in mutex_lock
+    <braunr> i fail to see how my changes can create such a behaviour
+    <braunr> the weird thing is that i can't reproduce this with my test cases
+      :/
+    <braunr> only vim makes things crazy
+    <braunr> and i suppose it's related to the terminal
+    <braunr> (don't terminals relay select requests ?)
+    <braunr> when starting vim through ssh, pfinet deadlocks, and when starting
+      it on the mach console, the console term deadlocks
+    <pinotree> no help/hints when started with rpctrace?
+    <braunr> i only get assertions with rpctrace
+    <braunr> it's completely unusable for me
+    <braunr> gdb tells vim is indeed blocked in a select request
+    <braunr> and i can't see any in the remote servers :/
+    <braunr> this is so weird ..
+    <braunr> when using vim with the unmodified c library, i clearly see the
+      select call, and everything works fine ....
+    <braunr>     2e27:       a1 c4 d2 b7 f7          mov    0xf7b7d2c4,%eax
+    <braunr>     2e2c:       62                      (bad)  
+    <braunr>     2e2d:       f6 47 b6 69             testb  $0x69,-0x4a(%edi)
+    <braunr> what's the "bad" line ??
+    <braunr> ew, i think i understand my problem now
+    <braunr> the timeout makes blocking threads wake prematurely
+    <braunr> but on a mutex unlock, or a condition signal/broadcast, a message
+      is still sent, as it is expected a thread is still waiting
+    <braunr> but the receiving thread, having returned sooner than expected
+      from mach_msg, doesn't dequeue the message
+    <braunr> as vim does a lot of non blocking selects, this fills the message
+      queue ...
+
+
+## IRC, freenode, #hurd, 2012-07-30
+
+    <braunr> hm nice, the problem i have with my hurd_condition_timedwait seems
+      to also exist in libpthread
+
+[[!taglink open_issue_libpthread]].
+
+    <braunr> although to a lesser degree (the implementation already correctly
+      removes a thread that timed out from a condition queue, and there is a
+      nice FIXME comment asking what to do with any stale wakeup message)
+    <braunr> and the only solution i can think of for now is to drain the
+      message queue
+    <braunr> ah yes, i now have vim running with my io_select_timeout code :>
+    <braunr> but hum
+    <braunr> eating all cpu
+    <braunr> ah nice, an infinite loop in _hurd_critical_section_unlock
+    <braunr> grmbl
+    <tschwinge> braunr: But not this one?
+      http://www.gnu.org/software/hurd/open_issues/fork_deadlock.html
+    <braunr> it looks similar, yes
+    <braunr> let me try again to compare in detail
+    <braunr> pretty much the same yes
+    <braunr> there is only one difference but i really don't think it matters
+    <braunr> (#3  _hurd_sigstate_lock (ss=0x2dff718) at hurdsig.c:173
+    <braunr> instead of
+    <braunr> #3  _hurd_sigstate_lock (ss=0x1235008) at hurdsig.c:172)
+    <braunr> ok so we need to review jeremie's work
+    <braunr> tschwinge: thanks for pointing me at this
+    <braunr> the good thing with my patch is that i can reproduce in a few
+      seconds
+    <braunr> consistently
+    <tschwinge> braunr: You're welcome.  Great -- a reproducer!
+    <tschwinge> You might also build a glibc without his patches as a
+      cross-test to see whether the issue goes away?
+    <braunr> right
+    <braunr> i hope they're easy to find :)
+    <tschwinge> Hmm, have you already done changes to glibc?  Otherwise you
+      might also simply use a Debian package from before?
+    <braunr> yes i have local changes to _hurd_select
+    <tschwinge> OK, too bad.
+    <tschwinge> braunr: debian/patches/hurd-i386/tg-hurdsig-*, I think.
+    <braunr> ok
+    <braunr> hmmmmm
+    <braunr> it may be related to my last patch on the select_timeout branch
+    <braunr> (i mean, this may be caused by what i mentioned earlier this
+      morning)
+    <braunr> damn i can't build glibc without the signal disposition patches :(
+    <braunr> libpthread_sigmask.diff depends on it
+    <braunr> tschwinge: doesn't libpthread (as implemented in the debian glibc
+      patches) depend on global signal dispositions ?
+    <braunr> i think i'll use an older glibc for now
+    <braunr> but hmm which one ..
+    <braunr> oh whatever, let's fix the deadlock, it's simpler
+    <braunr> and more productive anyway
+    <tschwinge> braunr: May be that you need to revert some libpthread patch,
+      too.  Or even take out the libpthread build completely (you don't need it
+      for your current work, I think).
+    <tschwinge> braunr: Or, of course, you locate the deadlock.  :-)
+    <braunr> hum, now why would __io_select_timeout return
+      EMACH_SEND_INVALID_DEST :(
+    <braunr> the current glibc code just transparently reports any such error
+      as a false positive oO
+    <braunr> hm nice, segfault through recursion
+    <braunr> "task foo destroying an invalid port bar" everywhere :((
+    <braunr> i still have problems at the server side ..
+    <braunr> ok i think i have a solution for the "synchronization problem"
+    <braunr> (by this name, i refer to the way mutex and condition variables
+      are implemented)
+    <braunr> (the problem being that, when a thread unblocks early, because of
+      a timeout, another may still send a message to attempt to wake it, which
+      may fill up the message queue and make the sender block, causing a
+      deadlock)
+    <bddebian> Attempts to wake a dead thread?
+    <braunr> no
+    <braunr> attempt to wake an already active thread
+    <braunr> which won't dequeue the message because it's doing something else
+    <braunr> bddebian: i'm mentioning this because the problem potentially also
+      exists in libpthread
+
+[[!taglink open_issue_libpthread]].
+
+    <braunr> since the underlying algorithms are exactly the same
+    <youpi> (fortunately the time-out versions are not often used)
+    <braunr> for now :)
+    <braunr> for reference, my idea is to make the wake call truly non
+      blocking, by setting a timeout of 0
+    <braunr> i also limit the message queue size to 1, to limit the amount of
+      spurious wakeups
+    <braunr> i'll be able to test that in 30 mins or so
+    <braunr> hum
+    <braunr> how can mach_msg block with a timeout of 0 ??
+    <braunr> never mind :p
+    <braunr> unfortunately, my idea alone isn't enough
+    <braunr> for those interested in the problem, i've updated the analysis in
+      my last commit
+      (http://git.savannah.gnu.org/cgit/hurd/hurd.git/commit/?h=rbraun/select_timeout&id=40fe717ba9093c0c893d9ea44673e46a6f9e0c7d)
+
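The idea described above (a wake call that never blocks, combined with a wakeup queue limited to one message) can be modelled without any Mach calls. The following is a toy, single-threaded sketch under that assumption: `struct wakeup_slot` and both functions are hypothetical names, and a real implementation would need proper synchronization. As noted in the log, this idea alone turned out not to be sufficient.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a per-thread wakeup queue limited to one message.
   Nothing here calls Mach; all names are hypothetical.  */
struct wakeup_slot
{
  bool pending;   /* true while an unconsumed wakeup is queued */
};

/* Non-blocking wake: queue a wakeup if the slot is free, otherwise drop
   it.  Returns true if a new wakeup was queued.  The sender never waits,
   so a waiter that already timed out cannot make it block.  */
static bool
wakeup_try_send (struct wakeup_slot *slot)
{
  if (slot->pending)
    return false;
  slot->pending = true;
  return true;
}

/* Consume a queued wakeup, if any.  Returns true if one was pending.
   A waiter that timed out could call this to drain a stale wakeup.  */
static bool
wakeup_try_receive (struct wakeup_slot *slot)
{
  bool was_pending = slot->pending;

  slot->pending = false;
  return was_pending;
}
```

Because at most one wakeup is ever pending, repeated wake attempts towards a thread that is busy elsewhere collapse into a single spurious wakeup instead of filling a queue.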
+
+## IRC, freenode, #hurd, 2012-08-01
+
+    <braunr> damn, i can't manage to make threads calling condition_wait
+      dequeue themselves from the condition queue :(
+    <braunr> (instead of the one sending the signal/broadcast)
+    <braunr> my changes on cthreads introduce 2 intrusive changes
+    <braunr> the first is that the wakeup port is limited to 1 port, and the
+      wakeup operation is totally non blocking
+    <braunr> which is something we should probably add in any case
+    <braunr> the second is that condition_wait dequeues itself after blocking,
+      instead of condition_signal/broadcast
+    <braunr> and this second change seems to introduce deadlocks, for reasons
+      completely unknown to me :((
+    <braunr> limited to 1 message*
+    <braunr> if anyone has an idea about why it is bad for a thread to remove
+      itself from a condition/mutex queue, i'm all ears
+    <braunr> i'm hitting a wall :(
+    <braunr> antrik: if you have some motivation, can you review this please ?
+      http://www.sceen.net/~rbraun/0001-Rework-condition-signal-broadcast.patch
+    <braunr> with this patch, i get threads blocked in condition_wait,
+      apparently waiting for a wakeup that never comes (or was already
+      consumed)
+    <braunr> and i don't understand why :
+    <braunr> :(
+    <bddebian> braunr: The condition never happens?
+    <braunr> bddebian: it works without the patch, so i guess that's not the
+      problem
+    <braunr> bddebian: hm, you could be right actually :p
+    <bddebian> braunr: About what? :)
+    <braunr> 17:50 < bddebian> braunr: The condition never happens?
+    <braunr> although i doubt it again
+    <braunr> this problem is getting very very frustrating
+    <bddebian> :(
+    <braunr> it frightens me because i don't see any flaw in the logic :(
+
+
+## IRC, freenode, #hurd, 2012-08-02
+
+    <braunr> ah, seems i found a reliable workaround to my deadlock issue, and
+      more than a workaround, it should increase efficiency by reducing
+      messaging
+    * braunr happy
+    <kilobug> congrats :)
+    <braunr> the downside is that we may have a problem with non blocking send
+      calls :/
+    <braunr> which are used for signals
+    <braunr> i mean, this could be a mach bug
+    <braunr> let's try running a complete hurd with the change
+    <braunr> arg, the boot doesn't complete with the patch .. :(
+    <braunr> grmbl, by changing only a few bits in cthreads, the boot process
+      freezes in an infinite loop in something started after auth
+      (/etc/hurd/runsystem i assume)
+
+
+## IRC, freenode, #hurd, 2012-08-03
+
+    <braunr> glibc actually makes some direct use of cthreads condition
+      variables
+    <braunr> and my patch seems to work with servers in an already working
+      hurd, but doesn't allow it to boot
+    <braunr> and the hang happens on bash, the first thing that doesn't come
+      from the hurd package
+    <braunr> (i mean, during the boot sequence)
+    <braunr> which means we can't change cthreads headers (as some primitives
+      are macros)
+    <braunr> *sigh*
+    <braunr> the thing is, i can't fix select until i have a
+      condition_timedwait primitive
+    <braunr> and i can't add this primitive until either 1/ cthreads are fixed
+      not to allow the inlining of its primitives, or 2/ the switch to pthreads
+      is done
+    <braunr> which might take a loong time :p
+    <braunr> i'll have to rebuild a whole libc package with a fixed cthreads
+      version
+    <braunr> let's do this
+    <braunr> pinotree: i see two __condition_wait calls in glibc, how is the
+      double underscore handled ?
+    <pinotree> where do you see it?
+    <braunr> sysdeps/mach/hurd/setpgid.c and sysdeps/mach/hurd/setsid.c
+    <braunr> i wonder if it's even used
+    <braunr> looks like we use posix/setsid.c now
+    <pinotree> #ifdef noteven
+    <braunr> ?
+    <pinotree> the two __condition_wait calls you pointed out are in such
+      preprocessor block
+    <pinotree> s
+    <braunr> but what does it mean ?
+    <pinotree> no idea
+    <braunr> ok
+    <pinotree> these two files should definitely be used, they are found
+      earlier in the vpath
+    <braunr> hum, posix/setsid.c is a nop stub
+    <pinotree> i don't see anything defining "noteven" in glibc itself nor in
+      hurd
+    <braunr> :(
+    <pinotree> yes, most of the stuff in posix/, misc/, signal/, time/ are
+      ENOSYS stubs, to be reimplemented in a sysdep
+    <braunr> hm, i may have made a small mistake in cthreads itself actually
+    <braunr> right
+    <braunr> when i try to debug using a subhurd, gdb tells me the blocked
+      process is spinning in ld ..
+    <braunr> i mean ld.so
+    <braunr> and i can't see any debugging symbol
+    <braunr> some progress, it hangs at process_envvars
+    <braunr> eh
+    <braunr> i've partially traced my problem
+    <braunr> when a "normal" program starts, libc creates the signal thread
+      early
+    <braunr> the main thread waits for the creation of this thread by polling
+      its address
+    <braunr> (i.e. while (signal_thread == 0); )
+    <braunr> for some reason, it is stuck in this loop
+    <braunr> cthread creation being actually governed by
+      condition_wait/broadcast, it makes some sense
+    <bddebian> braunr: When you say the "main" thread, do you mean the main
+      thread of the program?
+    <braunr> bddebian: yes
+    <braunr> i think i've determined my mistake
+    <braunr> glibc has its own variants of the mutex primitives
+    <braunr> and i changed one :/
+    <bddebian> Ah
+    <braunr> it's good news for me :)
+    <braunr> hum no, that's not exactly what i described
+    <braunr> glibc has some stubs, but it's not the problem, the problem is
+      that mutex_lock/unlock are macros, and i changed one of them
+    <braunr> so everything that used that macro inside glibc wasn't changed
+    <braunr> yes!
+    <braunr> my patched hurd now boots :)
+    * braunr relieved
+    <braunr> this experience at least taught me that it's not possible to
+      easily change the singly linked queues of threads (waiting for a mutex or
+      a condition variable) :(
+    <braunr> for now, i'm using a linear search from the start
+    <braunr> so, not only does this patched hurd boot, but i was able to use
+      aptitude, git, build a whole hurd, copy the whole thing, and remove
+      everything, and it still runs fine (whereas usually it would fail very
+      early)
+    * braunr happy
+    <antrik> and vim works fine now?
+    <braunr> err, wait
+    <braunr> this patch does only one thing
+    <braunr> it alters the way condition_signal/broadcast and
+      {hurd_,}condition_wait operate
+    <braunr> currently, condition_signal/broadcast dequeues threads from a
+      condition queue and wakes them
+    <braunr> my patch makes these functions only wake the target threads
+    <braunr> which dequeue themselves
+    <braunr> (a necessary requirement to allow clean timeout handling)
+    <braunr> the next step is to fix my hurd_condition_wait patch
+    <braunr> and reapply the whole hurd patch introducing io_select_timeout
+    <braunr> then i'll be able to tell you
+    <braunr> one side effect of my current changes is that the linear search
+      required when a thread dequeues itself is ugly
+    <braunr> so it'll be an additional reason to help the pthreads porting
+      effort
+    <braunr> (pthreads have the same sort of issues wrt timeout handling, but
+      threads are on a doubly-linked list, making it way easier to adjust)
+    <braunr> damn i'm happy
+    <braunr> 3 days on this stupid bug
+    <braunr> (which is actually responsible for what i initially feared to be a
+      mach bug on non blocking sends)
+    <braunr> (and because of that, i worked on the code to make sure that 1/
+      waking is truly non blocking and 2/ only one message is required for
+      wakeups)
+    <braunr> a simple flag is tested instead of sending in a non blocking way
+      :)
+    <braunr> these improvements should be ported to pthreads some day
+
+[[!taglink open_issue_libpthread]]
+
+    <braunr> ahah !
+    <braunr> view is now FAST !
+    <mel-> braunr: what do you mean by 'view'?
+    <braunr> mel-: i mean the read-only version of vim
+    <mel-> aah
+    <braunr> i still have a few port leaks to fix
+    <braunr> and some polishing
+    <braunr> but basically, the non-blocking select issue seems fixed
+    <braunr> and with some luck, we should get unexpected speedups here and
+      there
+    <mel-> so vim was considerable slow on the Hurd before? didn't know that.
+    <braunr> not exactly
+    <braunr> at first, it wasn't, but the non blocking select/poll calls
+      misbehaved
+    <braunr> so a patch was introduced to make these block at least 1 ms
+    <braunr> then vim became slow, because it does a lot of non blocking select
+    <braunr> so another patch was introduced, not to set the 1ms timeout for a
+      few programs
+    <braunr> youpi: darnassus is already running the patched hurd, which shows
+      (as expected) that it can safely be used with an older libc
+    <youpi> i.e. servers with the additional io_select?
+    <braunr> yes
+    <youpi> k
+    <youpi> good :)
+    <braunr> and the modified cthreads
+    <braunr> which is the most intrusive change
+    <braunr> port leaks fixed
+    <gnu_srs> braunr: Congrats:-D
+    <braunr> thanks
+    <braunr> it's not over yet :p
+    <braunr> tests, reviews, more tests, polishing, commits, packaging
+
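The linear search mentioned in the 2012-08-03 log, needed when a waiting thread removes itself from a singly linked queue, can be sketched as follows. This is an illustration only: `struct waiter`, `struct wait_queue` and the two functions are hypothetical names, not the cthreads data structures, and the locking the real code requires is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical waiter record; a real one would hold thread state.  */
struct waiter
{
  struct waiter *next;
  int id;
};

/* Singly linked FIFO of waiters (no tail pointer, for brevity).  */
struct wait_queue
{
  struct waiter *head;
};

/* Append W at the end of Q.  */
static void
wait_queue_push (struct wait_queue *q, struct waiter *w)
{
  struct waiter **link = &q->head;

  while (*link != NULL)
    link = &(*link)->next;
  w->next = NULL;
  *link = w;
}

/* Remove W from Q with a linear search from the head.  Returns true if
   W was found and unlinked, false if it was not on the queue.  With a
   doubly linked list this would be O(1) instead of O(n).  */
static bool
wait_queue_remove (struct wait_queue *q, struct waiter *w)
{
  struct waiter **link;

  for (link = &q->head; *link != NULL; link = &(*link)->next)
    if (*link == w)
      {
        *link = w->next;
        w->next = NULL;
        return true;
      }
  return false;
}
```

Walking the list through a pointer to the previous `next` field avoids special-casing the head, but the cost still grows with the number of waiters, which is the drawback noted in the log.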
+
+## IRC, freenode, #hurd, 2012-08-04
+
+    <braunr> grmbl, apt-get fails on select in my subhurd with the updated
+      glibc
+    <braunr> otherwise it boots and runs fine
+    <braunr> fixed :)
+    <braunr> grmbl, there is a deadlock in pfinet with my patch
+    <braunr> deadlock fixed
+    <braunr> the sigstate and the condition locks must be taken at the same
+      time, for some obscure reason explained in the cthreads code
+    <braunr> but when a thread awakes and dequeues itself from the condition
+      queue, it only took the condition lock
+    <braunr> i noted in my todo list that this could create problems, but
+      wanted to leave it as it is to really see it happen
+    <braunr> well, i saw :)
+    <braunr> the last commit of my hurd branch includes the 3 line fix
+    <braunr> these fixes will be required for libpthreads
+      (pthread_mutex_timedlock and pthread_cond_timedwait) some day
+    <braunr> after the select bug is fixed, i'll probably work on that with you
+      and thomas d
+
+
+## IRC, freenode, #hurd, 2012-08-05
+
+    <braunr> eh, i made dpkg-buildpackage use the patched c library, and it
+      finished the build oO
+    <gnu_srs> braunr: :)
+    <braunr> faked-tcp was blocked in a select call :/
+    <braunr> (with the old libc i mean)
+    <braunr> with mine it just worked at the first attempt
+    <braunr> i'm not sure what it means
+    <braunr> it could mean that the patched hurd servers are not completely
+      compatible with the current libc, for some weird corner cases
+    <braunr> the slowness of faked-tcp is apparently inherent to its
+      implementation
+    <braunr> all right, let's put all these packages online
+    <braunr> eh, right when i upload them, i get a deadlock
+    <braunr> this one seems specific to pfinet
+    <braunr> only one deadlock so far, and the libc wasn't in sync with the
+      hurd
+    <braunr> :/
+    <braunr> damn, another deadlock as soon as i send a mail on bug-hurd :(
+    <braunr> grr
+    <pinotree> thou shall not email
+    <braunr> aptitude seems to be a heavy user of select
+    <braunr> oh, it may be due to my script regularly changing the system time
+    <braunr> or it may not be a deadlock, but simply the linear queue getting
+      extremely large
+
+
+## IRC, freenode, #hurd, 2012-08-06
+
+    <braunr> i have bad news :( it seems there can be memory corruptions with
+      my io_select patch
+    <braunr> i've just seen an auth server (!) spinning on a condition lock
+      (the internal spin lock), probably because the condition was corrupted ..
+    <braunr> i guess it's simply because conditions embedded in dynamically
+      allocated structures can be freed while there are still threads waiting
+      ...
+    <braunr> so, yes the solution to my problem is simply to dequeue threads
+      from both the waker when there is one, and the waiter when no wakeup
+      message was received
+    <braunr> simple
+    <braunr> it's so obvious i wonder how i didn't think of it earlier :(-
+    <antrik> braunr: an elegant solution always seems obvious afterwards... ;-)
+    <braunr> antrik: let's hope this time, it's completely right
+    <braunr> good, my latest hurd packages seem fixed finally
+    <braunr> looks like i got another deadlock
+    * braunr hangs himself
+    <braunr> that, or again, condition queues can get very large (e.g. on
+      thread storms)
+    <braunr> looks like this is the case yes
+    <braunr> after some time the system recovered :(
+    <braunr> which means a doubly linked list is required to avoid pathological
+      behaviours
+    <braunr> arg
+    <braunr> it won't be easy at all to add a doubly linked list to condition
+      variables :(
+    <braunr> actually, just a bit messy
+    <braunr> youpi: other than this linear search on dequeue, darnassus has
+      been working fine so far
+    <youpi> k
+    <youpi> Mmm, you'd need to bump the abi soname if changing the condition
+      structure layout
+    <braunr> :(
+    <braunr> youpi: how are we going to solve that ?
+    <youpi> well, either bump soname, or finish transition to libpthread :)
+    <braunr> it looks better to work on pthread now
+    <braunr> to avoid too many abi changes
+
+[[libpthread]].
+
+
 # See Also
 
 See also [[select_bogus_fd]] and [[select_vs_signals]].
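
The linear search on dequeue mentioned in the 2012-08-06 log is what a doubly linked wait queue avoids. The following is a minimal sketch in C, not the cthreads condition structure: `struct waiter`, `struct wait_queue` and the `wait_queue_*` functions are hypothetical names, and all locking is left out. It only illustrates why unlinking a waiter that timed out becomes O(1) once each node carries a back pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical waiter node; this is NOT the actual cthreads layout. */
struct waiter {
    struct waiter *prev;
    struct waiter *next;
    int id;
};

/* Circular list with a sentinel node, so there are no NULL special cases. */
struct wait_queue {
    struct waiter head;
};

void wait_queue_init(struct wait_queue *q)
{
    q->head.prev = &q->head;
    q->head.next = &q->head;
}

int wait_queue_empty(const struct wait_queue *q)
{
    return q->head.next == &q->head;
}

/* Append a waiter at the tail: O(1). */
void wait_queue_enqueue(struct wait_queue *q, struct waiter *w)
{
    assert(w != &q->head);
    w->prev = q->head.prev;
    w->next = &q->head;
    q->head.prev->next = w;
    q->head.prev = w;
}

/* Unlink an arbitrary waiter (e.g. after a timeout) without walking
   the queue: O(1) thanks to the back pointer. */
void wait_queue_remove(struct waiter *w)
{
    w->prev->next = w->next;
    w->next->prev = w->prev;
    w->prev = w;
    w->next = w;
}

/* Pop the oldest waiter, or return NULL if the queue is empty. */
struct waiter *wait_queue_dequeue_first(struct wait_queue *q)
{
    struct waiter *w;

    if (wait_queue_empty(q))
        return NULL;
    w = q->head.next;
    wait_queue_remove(w);
    return w;
}
```

In this sketch a thread that times out would call `wait_queue_remove` on its own node while holding the relevant locks, with no walk of the queue. As noted in the log, adding such fields to the real condition structure would change its layout and therefore the ABI.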
diff --git a/open_issues/strict_aliasing.mdwn b/open_issues/strict_aliasing.mdwn
index 01019372..b7d39805 100644
--- a/open_issues/strict_aliasing.mdwn
+++ b/open_issues/strict_aliasing.mdwn
@@ -19,3 +19,13 @@ License|/fdl]]."]]"""]]
       instead?
     <braunr> pinotree: if we can rely on gcc for the warnings, yes
     <braunr> but i suspect there might be other silent issues in very old code
+
+
+# IRC, freenode, #hurd, 2012-07-12
+
+    <braunr> btw, i'm building glibc right now, and i can see a few strict
+      aliasing warnings
+    <braunr> fixing them will allow us to avoid wasting time on very obscure
+      issues (if gcc catches them all)
+    <tschwinge> The strict aliasing things should be fixed, yes.  Some might be
+      from MIG.
diff --git a/open_issues/synchronous_ipc.mdwn b/open_issues/synchronous_ipc.mdwn
new file mode 100644
index 00000000..57bcdda7
--- /dev/null
+++ b/open_issues/synchronous_ipc.mdwn
@@ -0,0 +1,64 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+
+# IRC, freenode, #hurd, 2012-07-20
+
+From [[Genode RPC|microkernel/genode/rpc]].
+
+    <braunr> assuming synchronous ipc is the way to go (it seems so), there is
+      still the need for some async ipc (e.g. signalling untrusted recipients
+      without risking blocking on them)
+    <braunr> 1/ do you agree on that and 2/ how would this low-overhead async
+      ipc be done ? (and 3/ are there relevant examples ?)
+    <antrik> if you think about this stuff too much you will end up like marcus
+      and neal ;-)
+    <braunr> antrik: likely :)
+    <antrik> the truth is that there are various possible designs all with
+      their own tradeoffs, and nobody can really tell which one is better
+    <braunr> the only sensible one i found is qnx :/
+    <braunr> but it's still messy
+    <braunr> they have what they call pulses, with a strictly defined format
+    <braunr> so it's actually fine because it guarantees low overhead, and can
+      easily be queued
+    <braunr> but i'm not sure about the format
+    <antrik> I must say that Neal's half-sync approach in Viengoos still sounds
+      most promising to me. it's actually modelled after the needs of a
+      Hurd-like system; and he thought about it a lot...
+    <braunr> damn i forgot to reread that
+    <braunr> stupid me
+    <antrik> note that you can't come up with a design that allows both a)
+      delivering reliably and b) never blocking the sender -- unless you cache
+      in the kernel, which we don't want
+    <antrik> but I don't think it's really necessary to fulfill both of these
+      requirements
+    <antrik> it's up to the receiver to make sure it gets important signals
+    <braunr> right
+    <braunr> caching in the kernel is ok as long as the limit allows the
+      receiver to handle its signals
+    <antrik> in the Viengoos approach, the receiver can allocate a number of
+      receive buffers; so it's even possible to do some queuing if desired
+    <braunr> ah great, limits in the form of resources lent by the receiver
+    <braunr> one thing i really don't like in mach is the behaviour on full
+      message queues
+    <braunr> blocking :/
+    <braunr> i bet the libpager deadlock is due to that
+
+[[libpager_deadlock]].
+
+    <braunr> it simply means async ipc doesn't prevent deadlocks at all
+    <antrik> the sender can set a timeout. blocking only happens when setting
+      it to infinite...
+    <braunr> which is commonly the case
+    <antrik> well, if you see places where blocking is done but failing would
+      be more appropriate, try changing them I'd say...
+    <braunr> it's not that easy :/
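
The idea discussed above, where the receiver lends a bounded number of buffers and the sender fails instead of blocking when they are all in use, can be modelled with a small fixed-size queue. This is only a single-threaded sketch under stated assumptions: it is neither the Mach nor the Viengoos API, and `NOTIFY_SLOTS`, `struct notify_queue` and the `notify_*` functions are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed number of buffers lent by the receiver (illustrative value). */
#define NOTIFY_SLOTS 4

/* Hypothetical bounded notification queue owned by the receiver. */
struct notify_queue {
    int slots[NOTIFY_SLOTS];
    size_t head;    /* index of the oldest queued value */
    size_t count;   /* number of queued values */
};

void notify_queue_init(struct notify_queue *q)
{
    q->head = 0;
    q->count = 0;
}

/* Never blocks: returns false when every lent buffer is in use,
   leaving it to the sender to decide what to do next. */
bool notify_try_send(struct notify_queue *q, int value)
{
    if (q->count == NOTIFY_SLOTS)
        return false;
    q->slots[(q->head + q->count) % NOTIFY_SLOTS] = value;
    q->count++;
    return true;
}

/* Returns false when nothing is queued; otherwise stores the oldest
   value in *value and frees its slot. */
bool notify_receive(struct notify_queue *q, int *value)
{
    if (q->count == 0)
        return false;
    *value = q->slots[q->head];
    q->head = (q->head + 1) % NOTIFY_SLOTS;
    q->count--;
    return true;
}
```

The trade-off matches what antrik describes: delivery is not guaranteed once the queue is full, so it is up to the receiver to lend enough buffers and drain them in time for the signals it cares about.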
diff --git a/open_issues/usleep.mdwn b/open_issues/usleep.mdwn
new file mode 100644
index 00000000..b71cd902
--- /dev/null
+++ b/open_issues/usleep.mdwn
@@ -0,0 +1,25 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc]]
+
+# IRC, OFTC, #debian-hurd, 2012-07-14
+
+    <pinotree> eeek, usleep has the issues which i fixed in nanosleep
+    <bdefreese> pinotree: ?
+    * pinotree ponders a `mv sysdeps/unix/sysv/linux/usleep.c
+        sysdeps/mach/usleep.c`
+    <pinotree> s/mv/cp/
+    <bdefreese> What the heck is the point of usleep(0) anyway?  Isn't that
+      basically saying suspend for 0 milliseconds?
+    <youpi> it's rounded up by the kernel I guess
+    <youpi> i.e. suspend for the shortest time possible (a clock tick)
+    <pinotree> posix 2001 says that «If the value of useconds is 0, then the
+      call has no effect.»
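
The POSIX 2001 wording quoted above suggests that a zero argument should return immediately rather than be rounded up to a clock tick. Below is a minimal sketch of that behaviour on top of `nanosleep`. It is not the glibc implementation of `usleep`, and `usleep_sketch` is a hypothetical name used only for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() under strict -std modes */
#endif

#include <stddef.h>
#include <time.h>

/* Hypothetical wrapper, not glibc's usleep: suspend for `useconds`
   microseconds, treating 0 as a no-op as POSIX 2001 describes. */
int usleep_sketch(unsigned long useconds)
{
    struct timespec ts;

    if (useconds == 0)
        return 0;   /* no effect: do not enter the kernel at all */

    ts.tv_sec = (time_t)(useconds / 1000000UL);
    ts.tv_nsec = (long)(useconds % 1000000UL) * 1000L;
    return nanosleep(&ts, NULL);
}
```

Calling `usleep_sketch(0)` returns 0 without sleeping, while a non-zero argument returns whatever `nanosleep` returns, including -1 if the sleep is interrupted by a signal.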
diff --git a/open_issues/virtualbox.mdwn b/open_issues/virtualbox.mdwn
index 9440284f..d0608b4a 100644
--- a/open_issues/virtualbox.mdwn
+++ b/open_issues/virtualbox.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
 
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
 id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -8,11 +8,15 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
 is included in the section entitled [[GNU Free Documentation
 License|/fdl]]."]]"""]]
 
+[[!toc]]
+
+
+# Running GNU Mach in VirtualBox crashes during initialization.
+
 [[!tag open_issue_gnumach]]
 
-Running GNU Mach in VirtualBox crashes during initialization.
 
-IRC, freenode, #hurd, 2011-08-15
+## IRC, freenode, #hurd, 2011-08-15
 
     <BlueT_> HowTo Reproduce: 1) Use `reboot` to reboot the system.  2) Once
       you see the Grub menu, turn off the debian hurd box.  3) Let the box boot
@@ -97,3 +101,37 @@ IRC, freenode, #hurd, 2011-08-15
     <youpi> what's interesting is that that one means that $USER_DS did load in
       %es fine at least once 
     <youpi> and it's the reload that fails
+
+
+# Slow SCSI probing
+
+[[!tag open_issue_gnumach]]
+
+
+## IRC, freenode, #hurd, 2012-08-07
+
+    <braunr> youpi: it seems the slow boot on virtualbox is really because of
+      scsi (it spends a long time in scsi_init, probing for all the drivers)
+    <youpi> braunr: we know that
+    <youpi> isn't it in the io port probe printed at boot?
+    <youpi> iirc that was that
+    <braunr> the discussion i found was about eata
+    <braunr> not the whole scsi group
+    <youpi> there used to be another in eata, yes
+    <braunr> oh
+    <braunr> i must have missed the first discussion then
+    <youpi> I mean
+    <youpi> the eata is the first
+    <braunr> ok
+    <youpi> and scsi was mentioned later
+    <youpi> just nobody took the time to track it down
+    <braunr> ok
+    <braunr> so it's not just a matter of disabling a single driver :(
+    <youpi> braunr: I still believe it's a matter of disabling a single driver
+    <youpi> I don't see why scsi in general should take a lot of time
+    <braunr> youpi: it doesn't on qemu, it may simply be virtualbox's fault
+    <youpi> it is, yes
+    <youpi> and virtualbox people say it's hurd's fault, of course
+    <braunr> both are possible
+    <braunr> but we can't expect them to fix it :)
+    <youpi> that's what I mean
diff --git a/open_issues/wait_errors.mdwn b/open_issues/wait_errors.mdwn
new file mode 100644
index 00000000..855b9add
--- /dev/null
+++ b/open_issues/wait_errors.mdwn
@@ -0,0 +1,25 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_hurd]]
+
+# IRC, freenode, #hurd, 2012-07-12
+
+    <braunr> tschwinge: have you encountered wait() errors ?
+    <tschwinge> What kind of wait errors?
+    <braunr> when running htop or watch vmstat, other apparently unrelated
+      processes calling wait() sometimes fail with an error
+    <braunr> i saw it mostly during builds, as they spawn lots of children
+    <braunr> (and used the aforementioned commands to monitor the builds)
+    <tschwinge> Sounds nasty...  No, don't remember seeing that.  But I don't
+      typically invoke such commands during builds.
+    <tschwinge> So this wait thing suggests there's something going wrong in
+      the proc server?
+    <braunr> tschwinge: yes