Diffstat (limited to 'open_issues')
-rw-r--r--  open_issues/64-bit_port.mdwn | 27
-rw-r--r--  open_issues/alarm_setitimer.mdwn | 8
-rw-r--r--  open_issues/anatomy_of_a_hurd_system.mdwn | 111
-rw-r--r--  open_issues/arm_port.mdwn | 238
-rw-r--r--  open_issues/automatic_backtraces_when_assertions_hit.mdwn | 60
-rw-r--r--  open_issues/bpf.mdwn | 8
-rw-r--r--  open_issues/code_analysis/discussion.mdwn | 17
-rw-r--r--  open_issues/console_tty1.mdwn | 151
-rw-r--r--  open_issues/console_vs_xorg.mdwn | 31
-rw-r--r--  open_issues/dde.mdwn | 111
-rw-r--r--  open_issues/exec_leak.mdwn | 57
-rw-r--r--  open_issues/ext2fs_deadlock.mdwn | 5
-rw-r--r--  open_issues/ext2fs_libports_reference_counting_assertion.mdwn | 93
-rw-r--r--  open_issues/fakeroot_eagain.mdwn | 216
-rw-r--r--  open_issues/fork_deadlock.mdwn | 31
-rw-r--r--  open_issues/gcc/pie.mdwn | 40
-rw-r--r--  open_issues/glibc.mdwn | 316
-rw-r--r--  open_issues/glibc/t/tls-threadvar.mdwn | 29
-rw-r--r--  open_issues/gnat.mdwn | 51
-rw-r--r--  open_issues/gnumach_memory_management.mdwn | 49
-rw-r--r--  open_issues/gnumach_page_cache_policy.mdwn | 158
-rw-r--r--  open_issues/gnumach_vm_map_entry_forward_merging.mdwn | 4
-rw-r--r--  open_issues/gnumach_vm_map_red-black_trees.mdwn | 172
-rw-r--r--  open_issues/implementing_hurd_on_top_of_another_system.mdwn | 320
-rw-r--r--  open_issues/libmachuser_libhurduser_rpc_stubs.mdwn | 11
-rw-r--r--  open_issues/libpager_deadlock.mdwn | 165
-rw-r--r--  open_issues/libpthread.mdwn | 1284
-rw-r--r--  open_issues/libpthread/t/fix_have_kernel_resources.mdwn | 21
-rw-r--r--  open_issues/libpthread_CLOCK_MONOTONIC.mdwn | 35
-rw-r--r--  open_issues/libpthread_timeout_dequeue.mdwn | 22
-rw-r--r--  open_issues/mach_federations.mdwn | 66
-rw-r--r--  open_issues/mach_on_top_of_posix.mdwn | 4
-rw-r--r--  open_issues/mach_shadow_objects.mdwn | 24
-rw-r--r--  open_issues/mission_statement.mdwn | 41
-rw-r--r--  open_issues/multithreading.mdwn | 154
-rw-r--r--  open_issues/netstat.mdwn | 34
-rw-r--r--  open_issues/packaging_libpthread.mdwn | 107
-rw-r--r--  open_issues/pci_arbiter.mdwn | 256
-rw-r--r--  open_issues/performance.mdwn | 163
-rw-r--r--  open_issues/performance/io_system/read-ahead.mdwn | 991
-rw-r--r--  open_issues/pfinet_vs_system_time_changes.mdwn | 31
-rw-r--r--  open_issues/robustness.mdwn | 65
-rw-r--r--  open_issues/select.mdwn | 1416
-rw-r--r--  open_issues/strict_aliasing.mdwn | 10
-rw-r--r--  open_issues/synchronous_ipc.mdwn | 185
-rw-r--r--  open_issues/system_stats.mdwn | 39
-rw-r--r--  open_issues/term_blocking.mdwn | 125
-rw-r--r--  open_issues/user-space_device_drivers.mdwn | 428
-rw-r--r--  open_issues/usleep.mdwn | 25
-rw-r--r--  open_issues/virtualbox.mdwn | 44
-rw-r--r--  open_issues/vm_map_kernel_bug.mdwn | 54
-rw-r--r--  open_issues/wait_errors.mdwn | 25
52 files changed, 8064 insertions, 64 deletions
diff --git a/open_issues/64-bit_port.mdwn b/open_issues/64-bit_port.mdwn
index 797d540f..2d273ba1 100644
--- a/open_issues/64-bit_port.mdwn
+++ b/open_issues/64-bit_port.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -10,7 +10,11 @@ License|/fdl]]."]]"""]]
[[!tag open_issue_gnumach open_issue_mig]]
-IRC, freenode, #hurd, 2011-10-16:
+There is a `master-x86_64` GNU Mach branch. As of 2012-11-20, it only supports
+the [[microkernel/mach/gnumach/ports/Xen]] platform.
+
+
+# IRC, freenode, #hurd, 2011-10-16
<youpi> it'd be really good to have a 64bit kernel, no need to care about
addressing space :)
@@ -34,3 +38,22 @@ IRC, freenode, #hurd, 2011-10-16:
<youpi> and it'd boost userland address space to 4GiB
<braunr> yes
<youpi> leaving time for a 64bit userland :)
+
+
+# IRC, freenode, #hurd, 2012-10-03
+
+ <braunr> youpi: just so you know in case you try the master-x86_64 with
+ grub
+ <braunr> youpi: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=689509
+ <youpi> ok, thx
+ <braunr> the squeeze version is fine but i had to patch the wheezy/sid one
+ <youpi> I actually hadn't hoped to boot into 64bit directly from grub
+ <braunr> youpi: there is code in viengoos that could be reused
+ <braunr> i've been thinking about it for a time now
+ <youpi> ok
+ <braunr> the two easiest ways are 1/ the viengoos one (a -m32 object file
+ converted with objcopy as an embedded loader)
+ <braunr> and 2/ establishing an identity mapping using 4x1 GB large pages
+ and switching to long mode, then jumping to c code to complete the
+ initialization
+ <braunr> i think i'll go the second way with x15, so you'll have the two :)
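+
+As a rough illustration of the second approach, a 32-bit loader can cover the
+whole 32-bit physical space with four 1 GiB large pages before switching to
+long mode. The following is only a sketch written for this page (flag names
+and values are from the x86 manuals, not from any of the branches mentioned
+above); 1 GiB pages additionally require the `pdpe1gb` CPU feature:
+
+    #include <stdint.h>
+
+    #define PTE_PRESENT 0x001ULL
+    #define PTE_WRITE   0x002ULL
+    #define PTE_PS      0x080ULL  /* in a PDPT entry: 1 GiB page */
+
+    static uint64_t pml4[512] __attribute__ ((aligned (4096)));
+    static uint64_t pdpt[512] __attribute__ ((aligned (4096)));
+
+    static void
+    build_identity_map (void)
+    {
+      int i;
+
+      /* Identity-map the first 4 GiB with four 1 GiB pages.  */
+      for (i = 0; i < 4; i++)
+        pdpt[i] = ((uint64_t) i << 30) | PTE_PRESENT | PTE_WRITE | PTE_PS;
+      pml4[0] = (uint64_t) (uintptr_t) pdpt | PTE_PRESENT | PTE_WRITE;
+
+      /* What remains: load CR3 with pml4, set CR4.PAE and EFER.LME, enable
+         CR0.PG, then far-jump to 64-bit code.  Those steps are assembly
+         and omitted here.  */
+    }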
diff --git a/open_issues/alarm_setitimer.mdwn b/open_issues/alarm_setitimer.mdwn
index 99b2d7b6..3255683c 100644
--- a/open_issues/alarm_setitimer.mdwn
+++ b/open_issues/alarm_setitimer.mdwn
@@ -21,3 +21,11 @@ See also the attached file: on other OSes (e.g. Linux) it blocks waiting
for a signal, while on GNU/Hurd it gets a new alarm and exits.
[[alrm.c]]
+
+
+# IRC, freenode, #hurd, 2012-07-29
+
+ <braunr> our setitimer is bugged
+ <braunr> it doesn't seem to leave a timer disarmed when the interval
+ is set to 0
+ <braunr> (which means a one shot timer is actually periodic ..)
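+
+A sketch of a test (similar in spirit to the attached alrm.c) that makes the
+bug visible; on a correct implementation it prints 1, while with the bug
+described above the timer keeps rearming itself:
+
+    #include <signal.h>
+    #include <stdio.h>
+    #include <sys/time.h>
+    #include <time.h>
+    #include <unistd.h>
+
+    static volatile sig_atomic_t fired;
+
+    static void
+    on_alrm (int sig)
+    {
+      (void) sig;
+      fired++;
+    }
+
+    int
+    main (void)
+    {
+      /* One-shot timer: it_interval is zero, it_value is 100 ms.  */
+      struct itimerval it = { { 0, 0 }, { 0, 100000 } };
+      time_t end = time (NULL) + 2;
+
+      signal (SIGALRM, on_alrm);
+      setitimer (ITIMER_REAL, &it, NULL);
+      while (time (NULL) < end)
+        sleep (1);
+      printf ("SIGALRM fired %d time(s)\n", (int) fired);
+      return 0;
+    }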
diff --git a/open_issues/anatomy_of_a_hurd_system.mdwn b/open_issues/anatomy_of_a_hurd_system.mdwn
index 99ef170b..3e585876 100644
--- a/open_issues/anatomy_of_a_hurd_system.mdwn
+++ b/open_issues/anatomy_of_a_hurd_system.mdwn
@@ -13,7 +13,10 @@ License|/fdl]]."]]"""]]
A bunch of this should also be covered in other (introductory) material,
like Bushnell's Hurd paper. All this should be unified and streamlined.
-IRC, freenode, #hurd, 2011-03-08:
+[[!toc]]
+
+
+# IRC, freenode, #hurd, 2011-03-08
<foocraft> I've a question on what are the "units" in the hurd project, if
you were to divide them into units if they aren't, and what are the
@@ -38,9 +41,8 @@ IRC, freenode, #hurd, 2011-03-08:
<antrik> no
<antrik> servers often depend on other servers for certain functionality
----
-IRC, freenode, #hurd, 2011-03-12:
+# IRC, freenode, #hurd, 2011-03-12
<dEhiN> when mach first starts up, does it have some basic i/o or fs
functionality built into it to start up the initial hurd translators?
@@ -72,24 +74,24 @@ IRC, freenode, #hurd, 2011-03-12:
<antrik> it also does some bootstrapping work during startup, to bring the
rest of the system up
----
+
+# Source Code Documentation
Provide a cross-linked sources documentation, including generated files, like
RPC stubs.
* <http://www.gnu.org/software/global/>
----
-[[Hurd_101]].
+# [[Hurd_101]]
+
----
+# [[hurd/IO_path]]
-More stuff like [[hurd/IO_path]].
+Need more stuff like that.
----
-IRC, freenode, #hurd, 2011-10-18:
+# IRC, freenode, #hurd, 2011-10-18
<frhodes> what happens @ boot. and which translators are started in what
order?
@@ -97,9 +99,8 @@ IRC, freenode, #hurd, 2011-10-18:
ext2; ext2 starts exec; ext2 execs a few other servers; ext2 execs
init. from there on, it's just standard UNIX stuff
----
-IRC, OFTC, #debian-hurd, 2011-11-02:
+# IRC, OFTC, #debian-hurd, 2011-11-02
<sekon_> is __dir_lookup a RPC ??
<sekon_> where can i find the source of __dir_lookup ??
@@ -123,9 +124,8 @@ IRC, OFTC, #debian-hurd, 2011-11-02:
<tschwinge> sekon_: This may help a bit:
http://www.gnu.org/software/hurd/hurd/hurd_hacking_guide.html
----
-IRC, freenode, #hurd, 2012-01-08:
+# IRC, freenode, #hurd, 2012-01-08
<abique> can you tell me how is done in hurd: "ls | grep x" ?
<abique> in bash
@@ -187,7 +187,8 @@ IRC, freenode, #hurd, 2012-01-08:
<antrik> that's probably the most fundamental design feature of the Hurd
<antrik> (all filesystem operations actually, not only lookups)
-IRC, freenode, #hurd, 2012-01-09:
+
+## IRC, freenode, #hurd, 2012-01-09
<braunr> youpi: are you sure cthreads are M:N ? i'm almost sure they're 1:1
<braunr> and no modern OS is a right place for any thread userspace
@@ -266,3 +267,83 @@ IRC, freenode, #hurd, 2012-01-09:
<youpi> they help only when the threads are living
<braunr> ok
<youpi> now as I said I don't have to talk much more, I have to leave :)
+
+
+# IRC, freenode, #hurd, 2012-12-06
+
+ <braunr> spiderweb: have you read
+ http://www.gnu.org/software/hurd/hurd-paper.html ?
+ <spiderweb> I'll have a look.
+ <braunr> and also the beginning of
+ http://ftp.sceen.net/mach/mach_a_new_kernel_foundation_for_unix_development.pdf
+ <braunr> these two should provide a good look at the big picture the hurd
+ attempts to achieve
+ <Tekk_> I can't help but wonder though, what advantages were really
+ achieved with early mach?
+ <Tekk_> weren't they just running a monolithic unix server like osx does?
+ <braunr> most mach-based systems were
+ <braunr> but thanks to that, they could provide advanced features over
+ other well established unix systems
+ <braunr> while also being compatible
+ <Tekk_> so basically it was just an ease of development thing
+ <braunr> well that's what mach aimed at being
+ <braunr> same for the hurd
+ <braunr> making things easy
+ <Tekk_> but as a side effect hurd actually delivers on the advantages of
+ microkernels aside from that, but the older systems wouldn't, correct?
+ <braunr> that's how there could be network file systems in very short time
+ and very scarce resources (i.e. developers working on it), while on other
+ systems it required a lot more to accomplish that
+ <braunr> no, it's not a side effect of the microkernel
+ <braunr> the hurd retains and extends the concept of flexibility introduced
+ by mach
+ <Tekk_> the improved stability, etc. isn't a side effect of being able to
+ restart generally thought of as system-critical processes?
+ <braunr> no
+ <braunr> you can't restart system critical processes on the hurd either
+ <braunr> that's one feature of minix, and they worked hard on it
+ <Tekk_> ah, okay. so that's currently just the domain of minix
+ <Tekk_> okay
+ <Tekk_> spiderweb: well, there's 1 advantage of minix for you :P
+ <braunr> the main idea of mach is to make it easy to extend unix
+ <braunr> without having hundreds of system calls
+ <braunr> the hurd keeps that and extends it by making many operations
+ unprivileged
+ <braunr> you don't need special code for kernel modules any more
+ <braunr> it's easy
+ <braunr> you don't need special code to handle suid bits and other ugly
+ similar hacks,
+ <braunr> it's easy
+ <braunr> you don't need fuse
+ <braunr> easy
+ <braunr> etc..
+
+
+# IRC, freenode, #hurd, 2012-12-06
+
+ <spiderweb> what is the #1 feature that distinguished hurd from other
+ operating systems. the concept of translators. (will read more when I get
+ more time).
+ <braunr> yes, translators
+ <braunr> using the VFS as a service directory
+ <braunr> and the VFS permissions to control access to those services
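+
+The following sketch illustrates what "using the VFS as a service directory"
+means in practice: looking up a node yields a port to the server behind it.
+`/servers/socket/2` is the conventional node for the PF_INET server; the
+program below is an illustration written for this page, not existing Hurd
+code:
+
+    #include <hurd.h>
+    #include <mach.h>
+    #include <stdio.h>
+
+    int
+    main (void)
+    {
+      /* The lookup returns a send right; RPCs can then be sent to the
+         translator attached to the node.  */
+      mach_port_t server = file_name_lookup ("/servers/socket/2", 0, 0);
+
+      if (server == MACH_PORT_NULL)
+        {
+          perror ("file_name_lookup");
+          return 1;
+        }
+      printf ("got a send right to the PF_INET server: %u\n",
+              (unsigned int) server);
+      mach_port_deallocate (mach_task_self (), server);
+      return 0;
+    }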
+
+
+# IRC, freenode, #hurd, 2012-12-10
+
+ <spiderweb> I want to work on hurd, but I think I'm going to start with
+ minix, I own the minix book 3rd ed. it seems like a good intro to
+ operating systems in general. like I don't even know what a semaphore is
+ yet.
+ <braunr> well, enjoy learning :)
+ <spiderweb> once I finish that book, what reading do you guys recommend?
+ <spiderweb> other than the wiki
+ <braunr> i wouldn't recommend starting with a book that focuses on one
+ operating system anyway
+ <braunr> you tend to think in terms of what is done in that specific
+ implementation and compare everything else to that
+ <braunr> tanenbaum is not only the main author of minix, but also the
+ author of the book http://en.wikipedia.org/wiki/Modern_Operating_Systems
+ <braunr>
+ http://en.wikipedia.org/wiki/List_of_important_publications_in_computer_science#Operating_systems
+ should be a pretty good list :)
diff --git a/open_issues/arm_port.mdwn b/open_issues/arm_port.mdwn
new file mode 100644
index 00000000..2d8b9038
--- /dev/null
+++ b/open_issues/arm_port.mdwn
@@ -0,0 +1,238 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+Several people have expressed interest in a port of GNU/Hurd for the ARM
+architecture.
+
+
+# IRC, freenode, #hurd, 2012-10-09
+
+ <mcsim> bootinfdsds: There was an unfinished port to arm, if you're
+ interested.
+ <tschwinge> mcsim: Has that ever been published?
+ <mcsim> tschwinge: I don't think so. But I have an email of that person and
+ I think that this could be discussed with him.
+
+
+## IRC, freenode, #hurd, 2012-10-10
+
+ <tschwinge> mcsim: If you have a contact to the ARM porter, could you
+ please ask him to post what he has?
+ <antrik> tschwinge: we all have the "contact" -- let me remind you that he
+ posted his questions to the list...
+
+
+## IRC, freenode, #hurd, 2012-10-17
+
+ <mcsim> tschwinge: Hello. The person who I wrote regarding arm port of
+ gnumach still hasn't answered. And I don't think that he is going to
+ answer.
+
+
+# IRC, freenode, #hurd, 2012-11-15
+
+ <matty3269> Well, I have a big interest in the ARM architecture, I worked
+ at ARM for a bit too, and I've written my own little OS that runs on
+ qemu. Is there an interest in getting hurd running on ARM?
+ <braunr> matty3269: not really currently
+ <braunr> but if that's what you want to do, sure
+ <tschwinge> matty3269: Well, interest -- sure!, but we don't really have
+ people savvy in low-level kernel implementation on ARM. I do know some
+ bits about it, but more about the instruction set than about its memory
+ architecture, for example.
+ <tschwinge> matty3269: But if you're feeling adventurous, by all means work
+ on it, and we'll try to help as we can.
+ <tschwinge> matty3269: There has been one previous attempt for an ARM port,
+ but that person never published his code, and apparently moved to a
+ different project.
+ <tschwinge> matty3269: I can help with toolchains (GCC, etc.) things for
+ ARM, if there's need.
+ <matty3269> tschwinge: That sounds great, thanks! Where would you recommend
+ I start (at the moment I've got Mach checked out and am trying to get it
+ compiled for i386)
+ <matty3269> I'm guessing that the Mach micro-kernel is all that would need
+ to be ported or are there arch-dependant bits of code in the server
+ processes?
+ <tschwinge> matty3269:
+ http://www.gnu.org/software/hurd/faq/system_port.html has some
+ information. Mach is the biggest part, yes. Then some bits in glibc and
+ libpthread, and even less in the Hurd libraries and servers.
+ <tschwinge> matty3269: Basically, you'd need equivalents for the i386 (and
+ similar) directories, yep.
+ <tschwinge> Though, you may be able to avoid some cruft in there.
+ <tschwinge> Does building for x86 have any issues?
+ <tschwinge> matty3269: How is generally your understanding of the Hurd on
+ Mach system architecture, and on microkernel-based systems generally, and
+ on Mach in particular?
+ <matty3269> tschwinge: yes, it seems to be progressing... I've got mig
+ installed and it's just compiling now
+ <matty3269> hmm, not too great if I'm honest, I've done mostly monolithic
+ kernel development so having such low-level processes, such as
+ scheduling, done in user-space seems a little strange
+ <tschwinge> Ah, yes, MIG will need a little bit of porting, too. I can
+ help with that, but that's not a priority -- first you have to get Mach
+ to boot at all; MIG will only be needed once you need to deal with RPCs,
+ so user-land/kernel interaction, basically. Before, you can hack around
+ it.
+ <matty3269> tschwinge: I have been running a GNU/Hurd system for a while
+ now though
+ <tschwinge> I'm happy to tell you that the scheduler is still in the
+ kernel. ;-)
+ <tschwinge> OK, good, so you know about the basic ideas.
+ <braunr> matty3269: there has to be machine specific stuff in user space
+ <braunr> for initial thread contexts for example
+ <matty3269> tschwinge: Ok, just got gnumach built
+ <braunr> but there isn't much and you can easily base your work from the
+ x86 implementation
+ <tschwinge> Yes. Mach itself is the more difficult one.
+ <matty3269> braunr: Yeah, looking around at things, it doesn't seem that
+ there will be too much work involved in the user-space stuff
+ <tschwinge> braunr: Do you know off-hand whether there are some old Mach
+ research papers describing architecture ports?
+ <tschwinge> I know there are some describing the memory system (obviously),
+ and I/O system -- which may help matty3269 to understand the general
+ design/structure.
+ <tschwinge> We might want to identify some documents, and make a list.
+ <braunr> all mach related documentation i have is available here:
+ ftp://ftp.sceen.net/mach/
+ <braunr> (also through http://)
+ <tschwinge> matty3269: Oh, definitely I'd suggest the Mach 3 Kernel
+ Principles book. That gives a good description of the Mach architecture.
+ <matty3269> Great, that's my weekends reading then!
+ <braunr> you don't need all that for a port
+ <matty3269> Is it possible to run the gnumach binary standalone with qemu?
+ <braunr> you won't go far with it
+ <braunr> you really need at least one program
+ <braunr> but sure, for a port development, it can easily be done
+ <braunr> i'd suggest writing a basic static application for your tests once
+ you reach an advanced state
+ <braunr> the critical parts of a port are memory and interrupts
+ <braunr> and memory can be particularly difficult to implement correctly
+ <tschwinge> matty3269: I once used QEMU's
+ virtual-FAT-filesystem-from-a-directory-on-the-host, and configured GRUB
+ to boot from that one, so it was easy to quickly reboot for kernel
+ development.
+ <braunr> but the good news is that almost every bsd system still uses a
+ similar interface
+ <tschwinge> matty3269: And, you may want to become familiar with QEMU's
+ built-in gdbserver, and how to connect to and use that.
+ <braunr> so, for example, you could base your work from the netbsd/arm pmap
+ module
+ <tschwinge> matty3269: I think that's better than starting on real
+ hardware.
+ <braunr> tschwinge: you can use -kernel with a multiboot binary now
+ <braunr> tschwinge: and even creating iso images is so fast it's not any
+ slower
+ <tschwinge> braunr: Yeah, I thought so, but never checked this out --
+ recently I saw in qemu --help's output some »multiboot« thing flashing
+ by. :-)
+ <braunr> i think it only supports 32-bits executables though
+ <matty3269> braunr: Yeah, I just tried passing gnumach as the -kernel
+ parameter to qemu, but it segged qemu :S
+ <braunr> otherwise i'd be using it for x15
+ <matty3269> qemu: fatal: Trying to execute code outside RAM or ROM at
+ 0xc0100000
+ <braunr> how much ram did you give qemu ?
+ <matty3269> I used '-m 512'
+ <braunr> hum, so the -kernel option doesn't correctly implement elf loading
+ or something like that
+ <braunr> anyway, i'm not sure how well building gnumach on a non-hurd
+ system is supported
+ <braunr> so you may want to simply develop inside your VM for the time
+ being, and reboot
+ <matty3269> doing an objdump of it seems fine...
+ <braunr> ?
+ <braunr> ah, the gnumach executable is a correct elf image
+ <braunr> that's not the point
+ <matty3269> Is there particular reason that mach is linked at 0xc0100000?
+ <matty3269> or is that where it is expected to be in VM?
+ <tschwinge> That's my understanding.
+ <braunr> kernels commonly sit at high addresses
+ <braunr> that's the "standard" 3G/1G split for user/kernel space
+ <matty3269> I think Linux sits at a similar VA for 32-bit
+ <braunr> no
+ <matty3269> Oh, I thought it did, I know it does on ARM, the kernel is
+ mapped to 0xc000000
+ <braunr> i don't know arm, but are you sure about this number ?
+ <braunr> seems to lack a 0
+ <matty3269> Ah, yes sorry
+ <matty3269> so 0xC0000000
+ <braunr> 0xc0100000 is just 1 MiB above it
+ <braunr> the .text section of linux on x86 actually starts at c1000000
+ (above 16 MiB, certainly to preserve as much dma-able memory since modern
+ machines now have a lot more)
+ <tschwinge> Surely the GRUB multiboot loader is not that much used/tested?
+ <braunr> unfortunately, no
+ <braunr> matty3269: FYI, my kernel starts at 0xfff00000 :p
+ <matty3269> braunr: hmm, you could be right, I know it's around there
+ somewhere
+ <matty3269> braunr: that's an interesting address :S
+ <matty3269> braunr: is that the PA address of the kernel or the VA inside a
+ process?
+ <braunr> the VA
+ <matty3269> hmm
+ <braunr> it can't be a PA
+ <braunr> such high addresses are normally device memory
+ <braunr> but don't worry, i have good reasons to have chosen this address
+ :)
+ <matty3269> so with gnumach, does the boot-up sequence use PIC until VM is
+ active and the kernel mapped to the linking address?
+ <braunr> no
+ <braunr> actually i'm not certain of the details
+ <braunr> but there is no PIC
+ <braunr> either special sections are linked at physical addresses
+ <braunr> or it relies on the fact that all executable code uses near jumps
+ <braunr> and uses offsets when accessing data
+ <braunr> (which is why the kernel text is at 3 GiB + 1 MiB, and not 3 GiB)
+ <matty3269> hmm,
+ <matty3269> gah, I need to learn x86
+ <braunr> that would certainly help
+ <matty3269> I've just had a look at boothdr.S; I presume that there must be
+ something else that is executed before this to setup VM, switch to 32-bit
+ mode etc...?
+ <braunr> have a look at the multiboot specification
+ <braunr> it sets protected mode
+ <braunr> but not paging
+ <braunr> (i mean, the boot loader does, before passing control to the
+ kernel)
+ <matty3269> Ah, I see
+ <tschwinge> matty3269: Multiboot should be documented in the GRUB package.
+ <matty3269> tschwinge: yep, got that, thanks
+ <matty3269> hmm, I can't find any reference to CR0 in gnumach so paging
+ must be enabled elsewhere
+ <matty3269> oh wait, found it
+ <braunr> $ git grep -i '\<cr0\>'
+ <braunr> i386/i386/proc_reg.h, linux/dev/include/asm-i386/system.h
+ <braunr> although i suspect only the first one is relevant to us :)
+ <matty3269> Yeah, that seems to have the setup code for paging :)
+ <matty3269> I'm still confused how it could run that without paging or PIC
+ though
+ <matty3269> I think I need to watch the boot sequence with qemu
+ <braunr> it's a bit tricky
+ <braunr> but actually simple
+ <braunr> 00:44 < braunr> either special sections are linked at physical
+ addresses
+ <braunr> 00:44 < braunr> or it relies on the fact that all executable code
+ uses near jumps
+ <braunr> that's really all there is
+ <braunr> but you shouldn't worry about that i suppose, as the protocol
+ between the boot loader and an arm kernel will certainly not be the same
+ <matty3269> indeed, ARM is tricky because memory maps are vastly different
+ on every platform
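+
+For reference, the i386 mechanism discussed above boils down to pointing CR3
+at a page directory and setting the PG bit in CR0. A purely illustrative
+sketch (gnumach keeps the register accessors in `i386/i386/proc_reg.h`; this
+is not its code):
+
+    #define CR0_PG 0x80000000UL
+
+    static inline void
+    enable_paging (unsigned long pdir_pa)
+    {
+      unsigned long cr0;
+
+      /* The instructions doing this must run from an identity-mapped
+         address, since enabling paging changes address translation.  */
+      asm volatile ("movl %0, %%cr3" : : "r" (pdir_pa) : "memory");
+      asm volatile ("movl %%cr0, %0" : "=r" (cr0));
+      asm volatile ("movl %0, %%cr0" : : "r" (cr0 | CR0_PG) : "memory");
+    }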
+
+
+## IRC, freenode, #hurd, 2012-11-21
+
+ <matty3269> Well, I have a ARM gnumach kernel compiled. It just doesn't
+ run! :)
+ <braunr> matty3269: good luck :)
diff --git a/open_issues/automatic_backtraces_when_assertions_hit.mdwn b/open_issues/automatic_backtraces_when_assertions_hit.mdwn
index 1cfacaf5..71007f99 100644
--- a/open_issues/automatic_backtraces_when_assertions_hit.mdwn
+++ b/open_issues/automatic_backtraces_when_assertions_hit.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -10,9 +10,65 @@ License|/fdl]]."]]"""]]
[[!tag open_issue_glibc]]
-IRC, unknown channel, unknown date.
+
+# IRC, unknown channel, unknown date
<azeem> tschwinge: ext2fs.static: thread-cancel.c:55: hurd_thread_cancel: Assertion `! __spin_lock_locked (&ss->critical_section_lock)' failed.
<youpi> it'd be great if we could have backtraces in such case
<youpi> at least just the function names
<youpi> and in this case (static), just addresses would be enough
+
+
+# IRC, freenode, #hurd, 2012-07-19
+
+In context of the [[ext2fs_libports_reference_counting_assertion]].
+
+ <braunr> pinotree: tschwinge: do you know if our packages are built with
+ -rdynamic ?
+ <pinotree> braunr: debian's cflags don't include it, so unless the upstream
+ build systems do, -rdynamic is not added
+ <braunr> i doubt glibc's backtrace() is able to find debugging symbol files
+ on its own
+ <pinotree> what do you mean?
+ <braunr> the port reference bug youpi noticed is rare
+ <pinotree> even on linux, a program compiled with normal optimizations (eg
+ -O2 -g) can give just pointer values in backtrace()'s output
+ <braunr> core dumps are unreliable at best
+
+[[crash_server]].
+
+ <braunr> uh, no, backtrace does give names
+ <braunr> but not with -fomit-frame-pointer
+ <braunr> unless the binary is built with -rdynamic
+ <braunr> at least it used to
+ <pinotree> not really, when being optimized some steps can be optimized
+ away (eg inlines)
+ <braunr> that's ok
+ <braunr> anyway, the point is i'd like a way that can give us as much
+ information as possible when the problem happens
+ <braunr> the stack trace being the most useful imo
+ <pinotree> do you face issues currently with backtrace()?
+ <braunr> not tried yet
+ <braunr> i guess i could make the application trap in the kernel, and fault
+ there, so we can attach gdb while still in the pager address space :>
+ <pinotree> that would imply the need for interactivity when the fault
+ happens, wouldn't it?
+ <braunr> no
+ <braunr> it would remain this way until someone comes, hours, days later
+ <braunr> pinotree: well ok, it would require interactivity, but not *when*
+ it happens ;p
+ <braunr> pinotree: right, it needs -rdynamic
+
+
+## IRC, freenode, #hurd, 2012-07-21
+
+ <braunr> tschwinge: my current "approach" is to introduce an infinite loop
+ <braunr> it makes the faulting task mapped in often enough to use gdb
+ through qemu
+ <braunr> ... :)
+ <tschwinge> My understanding is that glibc already does have some mechanism
+ for that: I have seen it print backtraces when detecting malloc
+ inconsistencies (double free and the like).
+ <braunr> yes, i thought it used the backtrace functions internally though
+ <braunr> that is, execinfo
+ <braunr> but this does require -rdynamic
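+
+For reference, the interface in question is glibc's `<execinfo.h>`. A minimal
+sketch of dumping a trace from a failure path; as said above, names are only
+resolved when the binary is linked with -rdynamic, otherwise only raw
+addresses are printed:
+
+    #include <execinfo.h>
+
+    static void
+    print_trace (void)
+    {
+      void *frames[32];
+      int n = backtrace (frames, 32);
+
+      /* Writes one line per frame to file descriptor 2 (stderr).  */
+      backtrace_symbols_fd (frames, n, 2);
+    }
+
+    int
+    main (void)
+    {
+      print_trace ();
+      return 0;
+    }
+
+Building with `gcc -g -rdynamic` then yields function names in the output.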
diff --git a/open_issues/bpf.mdwn b/open_issues/bpf.mdwn
index e24d761b..02dc7f87 100644
--- a/open_issues/bpf.mdwn
+++ b/open_issues/bpf.mdwn
@@ -585,3 +585,11 @@ This is a collection of resources concerning *Berkeley Packet Filter*s.
in libpcap, and let users of that library benefit from it
<braunr> instead of implementing the low level bpf interface, which
nonetheless has some system-specific variants ..
+
+
+## IRC, freenode, #hurd, 2012-08-03
+
+In context of the [[select]] issue.
+
+ <braunr> i understand now why my bpf translator was so buggy
+ <braunr> the condition_timedwait i wrote at the time was .. incomplete :)
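+
+For reference, the canonical shape of a timed condition wait, which an
+"incomplete" implementation typically gets wrong: the predicate must be
+re-checked in a loop, and once more after a timeout, so that neither spurious
+wakeups nor a wakeup racing with the timeout are lost. A generic pthread
+sketch written for this page (the original code was cthreads-based):
+
+    #include <errno.h>
+    #include <pthread.h>
+    #include <stdbool.h>
+    #include <time.h>
+
+    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
+    static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
+    static bool ready;
+
+    /* Wait until `ready' is set or ABSTIME passes; return false on
+       timeout.  */
+    static bool
+    wait_ready (const struct timespec *abstime)
+    {
+      bool ok = true;
+
+      pthread_mutex_lock (&lock);
+      while (!ready && ok)
+        if (pthread_cond_timedwait (&cond, &lock, abstime) == ETIMEDOUT)
+          ok = ready;  /* re-check the predicate one last time */
+      pthread_mutex_unlock (&lock);
+      return ok;
+    }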
diff --git a/open_issues/code_analysis/discussion.mdwn b/open_issues/code_analysis/discussion.mdwn
index f6b1e8c1..7ac3beb1 100644
--- a/open_issues/code_analysis/discussion.mdwn
+++ b/open_issues/code_analysis/discussion.mdwn
@@ -32,3 +32,20 @@ License|/fdl]]."]]"""]]
tool, please add it to open_issues/code_analysis.mdwn
<antrik> (I guess we should have a "proper" page listing useful debugging
tools...)
+
+
+## IRC, freenode, #hurd, 2012-09-03
+
+ <mcsim> hello. Has anyone tried some memory debugging tools like duma or
+ dmalloc with hurd?
+ <braunr> mcsim: yes, but i couldn't
+ <braunr> i tried duma, and it crashes, probably because of cthreads :)
+
+
+## IRC, freenode, #hurd, 2012-09-08
+
+ <mcsim> hello. What static analyzer would you suggest (probably you have
+ tried it for hurd already)?
+ <braunr> mcsim: if you find some good free static analyzer, let me know :)
+ <pinotree> a simple one is cppcheck
+ <mcsim> braunr: I'm choosing now between splint and adlint
diff --git a/open_issues/console_tty1.mdwn b/open_issues/console_tty1.mdwn
new file mode 100644
index 00000000..614c02c9
--- /dev/null
+++ b/open_issues/console_tty1.mdwn
@@ -0,0 +1,151 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+Seen in context of [[libpthread]], but probably not directly related to it.
+
+
+# IRC, freenode, #hurd, 2012-08-30
+
+ <gnu_srs> Do you also experience a frozen hurd console?
+ <braunr> yes
+ <braunr> i didn't check but i'm almost certain it's a bug in my branch
+ <braunr> the replacement of condition_implies was a bit hasty in some
+ places
+ <braunr> this is why i want to rework it separately
+
+
+## IRC, freenode, #hurd, 2012-09-03
+
+ <gnu_srs> braunr: Did you find the cause of the Hurd console freeze for
+ your libpthread branch?
+ <braunr> gnu_srs: like i said, a bug
+ <braunr> probably in the replacement of condition_implies
+ <braunr> i rewrote that part in libpipe and it now works
+
+ <braunr> gnu_srs: the packages have been updated
+ <braunr> and these apparently fix the hurd console issue correctly
+
+## IRC, freenode, #hurd, 2012-09-04
+
+ <braunr> gnu_srs: this hurd console problem isn't fixed
+ <braunr> it seems to be due to a race condition that only affects the first
+ console
+ <braunr> and by reading the code i can't see how it can even work oO
+ <gnu_srs> braunr: just rebooted, tty1 is still locked, tty2-6 works. And
+ the floppy error stays (maybe a kvm bug??)
+ <braunr> the floppy error is probably a kvm bug as we discussed
+ <braunr> the tty1 locked isn't
+ <braunr> i have it too
+ <braunr> it seems to be a bug in the hurd console server
+ <braunr> which is started by tty1, but for some reason, doesn't return a
+ valid answer at init time
+ <braunr> if you kill the term handling tty1, you'll see your first tty
+ starts working
+ <braunr> for now i'll try a hack that starts the hurd console server before
+ the clients
+ <braunr> doesn't work eh
+ <braunr> tty1 is the only one started before runttys
+ <braunr> indeed, fixing /etc/hurd/runsystem.gnu so that it doesn't touch
+ tty1 fixes the problem
+ <gnu_srs> do you have an explanation?
+ <braunr> not really no
+ <braunr> but it will do for now
+ <pinotree> samuel added that with the comment above, apparently to
+ work around some other issue of the hurd console
+ <braunr> i'm pretty sure the bug is already visible with cthreads
+ <braunr> the first console always seems weird compared to the others
+ <braunr> with a login: at the bottom of the screen
+ <braunr> didn't you notice ?
+ <pinotree> sometimes, but not often
+ <braunr> typical of a race
+ <pinotree> (at least for me)
+ <braunr> pthreads being slightly slower exposes it
+ <gnu_srs> confirmed, it works by commenting out touch /dev/tty1
+ <gnu_srs> yes, the login is at the bottom of the screen, sometimes one in
+ the upper part too:-/
+ <braunr> so we have a new open issue
+ <braunr> hm
+ <braunr> exiting the first tty doesn't work
+ <braunr> which makes me think of the issue we have with screen
+ <gnu_srs> confirmed!
+ <braunr> also, i don't understand why we have getty on tty1, but nothing on
+ the other terminals
+ <braunr> something is really wrong with terminals on hurd *sigh*
+ <braunr> ah, the problem looks like it happens when getty attempts to
+ handle a terminal !
+ <braunr> gnu_srs: anyway, i don't think it should be blocking for the
+ conversion to pthreads
+ <braunr> but it would be better if someone could assign himself that bug
+ <braunr> :)
+
+
+## IRC, freenode, #hurd, 2012-09-05
+
+ <antrik> braunr: the login at the bottom of the screen is from the Mach
+ console I believe
+ <braunr> antrik: well maybe, but it shouldn't be there anyway
+ <antrik> braunr: why not?
+ <antrik> it's confusing, but perfectly correct as far as I can tell
+ <braunr> antrik: two login: on the same screen ?
+ <braunr> antrik: it's even more confusing when comparing with other ttys
+ <antrik> I mean it's correct from a technical point of view... I'm not
+ saying it's helpful for the user ;-)
+ <braunr> i'm not even sure it's correct
+ <braunr> i've double checked the pthreads patch and didn't see anything
+ wrong there
+ <antrik> perhaps the startup of the Hurd console could be delayed a bit to
+ make sure it happens after the Mach console login is done printing
+ stuff...
+ <braunr> why are our gettys stubs ?
+ <antrik> I never understood the point of a getty TBH...
+ <braunr> well you need to communicate with something behind your terminal,
+ don't you ?
+ <antrik> why not just launch the login program or login shell right away?
+ <braunr> what if you want something else than a login program ?
+ <antrik> like what?
+ <antrik> and how would a getty help with that?
+ <braunr> an ascii-art version of star wars
+ <braunr> it would be configured to start something else
+ <antrik> and why does that need a getty? why not just start something else
+ directly?
+ <braunr> well getty is about the serial line parameters actually
+ <antrik> yeah, I had a vague understanding that it has something to do with
+ serial lines (or real TTY lines)... but we hardly need that on local
+ cosoles :-)
+ <antrik> consoles
+ <braunr> right
+ <braunr> but then why even bother with something like runttys
+ <antrik> well, something has to start the terminal servers?...
+ <antrik> I might be confused though
+ <braunr> what i don't understand is
+ <braunr> why is there no getty at startup, whereas they are spawned when
+ logging off ?
+ <antrik> they are? that's fascinating indeed ;-)
+ <braunr> does it behave like this on your old version ?
+ <antrik> I don't remember ever having seen a "getty" process on my Hurd
+ systems...
+ <braunr> can you log on e.g. tty2 and then log out, and see ?
+ <antrik> OTOH, I'm hardly ever using consoles...
+ <antrik> hm... I think that should be possible remotely using the console
+ client with ncurses driver? never tried that...
+ <braunr> ncurses driver ?
+ <braunr> hum i don't know, never tried either
+ <braunr> and it may add other bugs :p
+ <braunr> better wait to be close to the machine
+ <antrik> hehe
+ <antrik> well, it's a good excuse for trying the ncurses driver ;-)
+ <antrik> hrm
+ <antrik> alien:~# console -d ncursesw
+ <antrik> console: loading driver `ncursesw' failed: Gratuitous error
+ <antrik> I guess nobody tested that stuff in years
diff --git a/open_issues/console_vs_xorg.mdwn b/open_issues/console_vs_xorg.mdwn
new file mode 100644
index 00000000..ffefb389
--- /dev/null
+++ b/open_issues/console_vs_xorg.mdwn
@@ -0,0 +1,31 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_hurd]]
+
+
+# IRC, freenode, #hurd, 2012-08-30
+
+ <gean> braunr: I have some errors about keyboard in the xorg log, but
+ keyboard is working on the X
+ <braunr> gean: paste the log somewhere please
+ <gean> braunr: http://justpaste.it/19jb
+ [...]
+ [1987693.272] Fatal server error:
+ [1987693.272] Cannot set event mode on keyboard (Inappropriate ioctl for device)
+ [...]
+ [1987693.292] FatalError re-entered, aborting
+ [1987693.302] can't reset keyboard mode (Inappropriate ioctl for device)
+ [...]
+ <braunr> hum
+ <braunr> it looks like the xorg keyboard driver evolved and now uses ioctls
+ our drivers don't implement
+ <braunr> thanks for the report, we'll have to work on this
+ <braunr> i'm not sure the problem is new actually
diff --git a/open_issues/dde.mdwn b/open_issues/dde.mdwn
index aff988d5..5f6fcf6a 100644
--- a/open_issues/dde.mdwn
+++ b/open_issues/dde.mdwn
@@ -17,6 +17,9 @@ Still waiting for interface finalization and proper integration.
[[!toc]]
+See [[user-space_device_drivers]] for generic discussion related to user-space
+device drivers.
+
# Disk Drivers
@@ -25,12 +28,6 @@ Not yet supported.
The plan is to use [[libstore_parted]] for accessing partitions.
-## Booting
-
-A similar problem is described in
-[[community/gsoc/project_ideas/unionfs_boot]], and needs to be implemented.
-
-
# Upstream Status
@@ -56,6 +53,33 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
<antrik> (both from the Dresdem L4 group)
+### IRC, freenode, #hurd, 2012-08-12
+
+ <antrik>
+ http://genode.org/documentation/release-notes/12.05#Re-approaching_the_Linux_device-driver_environment
+ <antrik> I wonder whether the very detailed explanation was prompted by our
+ DDE discussions at FOSDEM...
+ <pinotree> antrik: one could think about approaching them to develop the
+ common dde libs + dde_linux together
+ <antrik> pinotree: that's what I did at FOSDEM -- they weren't interested
+ <pinotree> antrik: this year's one? why weren't they?
+ <pinotree> maybe at that time dde was not integrated properly yet (netdde
+ is just few months "old")
+ <braunr> do you really consider it integrated properly ?
+ <pinotree> no, but a bit better than last year
+ <antrik> I don't see what our integration has to do with anything...
+ <antrik> they just prefer hacking things ad-hoc rather than having some
+ central upstream
+ <pinotree> the helenos people?
+ <antrik> err... how did helenos come into the picture?...
+ <antrik> we are talking about genode
+ <pinotree> sorry, confused wrong microkernel OS
+ <antrik> actually, I don't remember exactly who said what; there were
+ people from genode there and from one or more other DDE projects... but
+ none of them seemed interested in a common DDE
+ <antrik> err... one or two other L4 projects
+
+
## IRC, freenode, #hurd, 2012-02-19
<youpi> antrik: do we know exactly which DDE version Zheng Da took as a
@@ -79,6 +103,12 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
apparently have both USB and SATA working with some variant of DDE
+### IRC, freenode, #hurd, 2012-11-03
+
+ <mcsim> DrChaos: there is DDEUSB framework for L4. You could port it, if
+ you want. It uses Linux 2.6.26 usb subsystem.
+
+
# IRC, OFTC, #debian-hurd, 2012-02-15
<pinotree> i have no idea how the dde system works
@@ -90,6 +120,9 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
automatically, or you have to settrans yourself to setup a device?
<youpi> there's no autoloader for now
<youpi> we'd need a bus arbitrer that'd do autoprobing
+
+[[PCI_arbiter]].
+
<pinotree> i see
<pinotree> (you see i'm not really that low level, so pardon the flood of
posssibly-noobish questions ;) )
@@ -200,21 +233,10 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
<antrik> right
-# IRC, freenode, #hurd, 2012-02-19
-
- <youpi> antrik: we should probably add a gsoc idea on pci bus arbitration
- <youpi> DDE is still experimental for now so it's ok that you have to
- configure it by hand, but it should be automatic at some ponit
-
+# [[PCI_Arbiter]]
## IRC, freenode, #hurd, 2012-02-21
- <braunr> i'm not familiar with the new gnumach interface for userspace
- drivers, but can this pci enumerator be written with it as it is ?
- <braunr> (i'm not asking for a precise answer, just yes - even probably -
- or no)
- <braunr> (idk or utsl will do as well)
- <youpi> I'd say yes
<youpi> since all drivers need is interrupts, io ports and iomem
<youpi> the latter was already available through /dev/mem
<youpi> io ports through the i386 rpcs
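+
+As a sketch of that interface (the RPC names are those of gnumach's
+`mach_i386.defs`; the header path below is an assumption and may differ), a
+user-space driver would ask the kernel for an I/O permission object covering
+a port range, then enable it for itself:
+
+    #include <mach.h>
+    #include <mach/i386/mach_i386.h>  /* header path is an assumption */
+
+    /* Request access to the PC keyboard controller's I/O ports
+       (0x60 through 0x64) and enable them for the calling task.  */
+    kern_return_t
+    grab_kbd_ports (mach_port_t device_master)
+    {
+      mach_port_t io_perm;
+      kern_return_t err;
+
+      err = i386_io_perm_create (device_master, 0x60, 0x64, &io_perm);
+      if (err)
+        return err;
+      return i386_io_perm_modify (mach_task_self (), io_perm, 1);
+    }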
@@ -453,6 +475,59 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
<antrik> hm... good point
+# IRC, freenode, #hurd, 2012-08-14
+
+ <braunr> it's amazing how much code just gets reimplemented needlessly ...
+ <braunr> libddekit has its own mutex, condition, semaphore etc.. objects
+ <braunr> with the *exact* same comment about the dequeueing-on-timeout
+ problem found in libpthread
+ <braunr> *sigh*
+
+
+# IRC, freenode, #hurd, 2012-08-18
+
+ <braunr> hum, leaks and potential deadlocks in libddekit/thread.c :/
+
+
+# IRC, freenode, #hurd, 2012-08-18
+
+ <braunr> nice, dde relies on a race to start ..
+
+
+# IRC, freenode, #hurd, 2012-08-18
+
+ <braunr> hm looks like if netdde crashes, the kernel doesn't handle it
+ cleanly, and we can't attach another netdde instance
+
+[[!message-id "877gu8klq3.fsf@kepler.schwinge.homeip.net"]]
+
+
+# IRC, freenode, #hurd, 2012-08-21
+
+In context of [[libpthread]].
+
+ <braunr> hm, i thought my pthreads patches introduced a deadlock, but
+ actually this one is present in the current upstream/debian code :/
+ <braunr> (the deadlock occurs when receiving data fast with sftp)
+ <braunr> either in netdde or pfinet
+
+
+# DDE for Filesystems
+
+## IRC, freenode, #hurd, 2012-10-07
+
+ * pinotree wonders whether the dde layer could also theoretically support
+ file systems
+ <antrik> pinotree: yeah, I also brought up the idea of creating a DDE
+ extension or DDE-like wrapper for Linux filesystems a while back... don't
+ know enough about it though to decide whether it's doable
+ <antrik> OTOH, I'm not sure it would be worthwhile. we still should
+ probably have a native (not GPLv2-only) implementation for the main FS at
+ least; so the wrapper would only be for accessing external
+ partitions/media...
+
+
# virtio
diff --git a/open_issues/exec_leak.mdwn b/open_issues/exec_leak.mdwn
new file mode 100644
index 00000000..b58d2c81
--- /dev/null
+++ b/open_issues/exec_leak.mdwn
@@ -0,0 +1,57 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+
+# IRC, freenode, #hurd, 2012-08-11
+
+ <braunr> the exec server seems to leak a lot
+ <braunr> exec now uses 109M on darnassus
+ <braunr> it really leaks a lot
+ <pinotree> only 109mb? few months ago, exec on exodar was taking more than
+ 200mb after few days of uptime with builds done
+ <braunr> i wonder how much it takes on the buildds
+
+
+# IRC, freenode, #hurd, 2012-08-17
+
+ <braunr> the exec leak is tricky
+ <braunr> bddebian: btw, look at the TODO file in the hurd source code
+ <braunr> bddebian: there is a note from thomas bushnell about that
+ <braunr> "*** Handle dead name notifications on execserver ports. !
+ <braunr> not sure it's still a todo item, but it might be worth checking
+ <bddebian> braunr: diskfs_execboot_class = ports_create_class (0, 0);
+ This is what would need to change right? It should call some cleanup
+ routine in the first argument?
+ <bddebian> Would be ideal if it could just use deadboot() from exec.
+ <braunr> bddebian: possible
+ <braunr> bddebian: hum execboot, i'm not so sure
+ <bddebian> Execboot is the exec task, no?
+ <braunr> i don't know what execboot is
+ <bddebian> It's from libdiskfs
+ <braunr> but "diskfs_execboot_class" looks like a class of ports used at
+ startup only
+ <braunr> ah
+ <braunr> then it's something run in the diskfs users ?
+ <bddebian> yes
+ <braunr> the leak is in exec
+ <braunr> if clients misbehave, it shouldn't affect that server
+ <bddebian> That's a different issue, this was about the TODO thing
+ <braunr> ah
+ <braunr> i don't know
+ <bddebian> Me either :)
+ <bddebian> For the leak I'm still focusing on do-bunzip2 but I am baffled
+ at my results..
+ <braunr> ?
+ <bddebian> Where my counters are zero if I always increment on different
+ vars but wild freaking numbers if I increment on malloc and decrement on
+ free
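+
+A sketch of the direction discussed above (not a tested fix): libports
+classes take a clean routine that is run when the last reference to a port
+of that class is dropped, so the diskfs users could register one instead of
+passing two null pointers:
+
+    #include <hurd/ports.h>
+
+    /* Hypothetical clean routine, in the spirit of exec's deadboot ():
+       release whatever state the boot port was holding.  */
+    static void
+    execboot_clean (void *port)
+    {
+      /* ... */
+    }
+
+    static struct port_class *diskfs_execboot_class;
+
+    static void
+    init_execboot_class (void)
+    {
+      /* Instead of ports_create_class (0, 0).  */
+      diskfs_execboot_class = ports_create_class (execboot_clean, 0);
+    }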
diff --git a/open_issues/ext2fs_deadlock.mdwn b/open_issues/ext2fs_deadlock.mdwn
index 369875fe..23f54a4a 100644
--- a/open_issues/ext2fs_deadlock.mdwn
+++ b/open_issues/ext2fs_deadlock.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -44,9 +44,8 @@ pull the information out of the process' memory manually (how to do that,
anyways?), and also didn't have time to continue with debugging GDB itself, but
this sounds like a [[!taglink open_issue_gdb]]...)
----
-IRC, #hurd, 2010-10-27
+# IRC, freenode, #hurd, 2010-10-27
<youpi> thread 8 hung on ports_begin_rpc
<youpi> that's probably where one could investigated first
diff --git a/open_issues/ext2fs_libports_reference_counting_assertion.mdwn b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn
new file mode 100644
index 00000000..ff1c4c38
--- /dev/null
+++ b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn
@@ -0,0 +1,93 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+ libports/port-ref.c:31: ports_port_ref: Assertion `pi->refcnt || pi->weakrefcnt' failed
+
+This is seen every now and then.
+
+
+# [[gnumach_page_cache_policy]]
+
+With that patch in place, the assertion failure is seen more often.
+
+
+## IRC, freenode, #hurd, 2012-07-14
+
+ <youpi> braunr: I'm getting ext2fs.static:
+ /usr/src/hurd-debian/./libports/port-ref.c:31: ports_port_ref: Assertion
+ `pi->refcnt || pi->weakrefcnt' failed.
+ <youpi> oddly enough, that happens on one of the buildds only
+ <braunr> :/
+ <braunr> i fear the patch can wake many of these issues
+
+
+## IRC, freenode, #hurd, 2012-07-15
+
+ <youpi> braunr: same assertion failed on a second buildd
+ <braunr> can you paste it again please ?
+ <youpi> ext2fs.static: /usr/src/hurd-debian/./libports/port-ref.c:31:
+ ports_port_ref: Assertion `pi->refcnt || pi->weakrefcnt' failed.
+ <braunr> or better, answer the ml thread for future reference
+ <braunr> thanks
+ <youpi> braunr: I can't keep your patch on the buildds, it makes them too
+ unreliable
+ <braunr> youpi: ok
+ <braunr> i never got this error though, that's weird
+ <braunr> youpi: was the failure during the same build ?
+ <youpi> no, it was during package installation, and not the same
+ <youpi> braunr: note that I've already seen such errors, it's not new, but
+ it was way rarer
+ <youpi> like every month only
+ <braunr> ah ok
+ <braunr> yes it's less surprising then
+ <braunr> a tricky reference counting / locking mistake somewhere in the
+ hurd :) ...
+ <braunr> ah ! just got it !
+ <bddebian> braunr: Got the error or found the problem? :)
+ <braunr> the former unfortunately :/
+
+
+## IRC, freenode, #hurd, 2012-07-19
+
+ <braunr> hm, i think those ext2fs port refs errors may also be due to stack
+ overflows
+ <pinotree> --verbose
+ <braunr> hm ?
+ <braunr> http://lists.gnu.org/archive/html/bug-hurd/2012-07/msg00051.html
+ <pinotree> i mean, why do you think they could be due to that?
+ <braunr> the error is that both strong and weak refs in a port are 0 when
+ adding a reference
+ <braunr> weak refs are almost never used so let's forget about them
+ <braunr> when a ref count drops to 0, the port is automatically deallocated
+ <braunr> so what other than memory corruption setting this counter to 0
+ could possibly do that ? :)
+ <pinotree> one could also guess an unbalanced ref/unref logic, somehow
+ <braunr> what do you mean ?
+ <pinotree> that for a bug, an early return, etc a port gets unref'ed more
+ often than it is ref'ed
+ <braunr> highly unlikely, as they're protected by a lock
+ <braunr> pinotree: ah you mean, the object gets deallocated early because
+ of a deref overflow ?
+ <braunr> pinotree: could be, yes
+ <braunr> pinotree: i wonder if it could happen because of the periodic sync
+ duplicating the node table without holding references
+ <braunr> rah, libports uses a big lock in many places :(
+ <pinotree> braunr: yes, i meant that
+ <braunr> we could try using libduma some day
+ <braunr> i wonder if it could work out of the box
+ <pinotree> but that wouldn't help to find out whether a port gets deref'ed
+ too often, for instance
+ <pinotree> although it could be adapted to do so, i guess
+ <braunr> reproducing + a call trace or core would be best, but i'm not even
+ sure we can get that easily lol
+
+[[automatic_backtraces_when_assertions_hit]].
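+
+For reference, a condensed sketch (paraphrased, not the exact libports
+source) of the invariant the assertion enforces: once both counters reach
+zero the structure is freed, so taking a new strong reference at that point
+means the caller holds a dangling pointer:
+
+    #include <assert.h>
+
+    struct port_info
+    {
+      int refcnt;      /* strong references */
+      int weakrefcnt;  /* weak references */
+      /* ...  */
+    };
+
+    void
+    ports_port_ref (struct port_info *pi)
+    {
+      /* Called with the global _ports_lock held.  */
+      assert (pi->refcnt || pi->weakrefcnt);
+      pi->refcnt++;
+    }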
diff --git a/open_issues/fakeroot_eagain.mdwn b/open_issues/fakeroot_eagain.mdwn
new file mode 100644
index 00000000..6b684a04
--- /dev/null
+++ b/open_issues/fakeroot_eagain.mdwn
@@ -0,0 +1,216 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_porting]]
+
+
+# IRC, freenode, #hurd, 2012-12-05
+
+ <braunr> rbraun 18813 R 2hrs ln -sf ../af_ZA/LC_NUMERIC
+ debian/locales-all/usr/lib/locale/en_BW/LC_NUMERIC
+ <braunr> when building glibc
+ <braunr> is this a known issue ?
+ <tschwinge> braunr: No. Can you get a backtrace?
+ <braunr> tschwinge: with gdb you mean ?
+ <tschwinge> Yes. If you have any debugging symbols (glibc?).
+ <braunr> or the build log leading to that ?
+ <braunr> ok, i will next time i have it
+ <tschwinge> OK.
+ <braunr> (i regularly had it when working on the pthreads port)
+ <braunr> tschwinge:
+ http://www.sceen.net/~rbraun/hurd_glibc_build_deadlock_trace
+ <braunr> youpi: ^
+ <youpi> Mmm, there's not so much we can do about this one
+ <braunr> youpi: what do you mean ?
+ <youpi> the problem is that it's really a reentrancy issue of the libc
+ locale
+ <youpi> it would happen just the same on linux
+ <braunr> sure
+ <braunr> but that doesn't mean we can't report and/or fix it :)
+ <youpi> (the _nl_state_lock)
+ <braunr> do you have any workaround in mind ?
+ <youpi> no
+ <youpi> actually that's what I meant by "there's not so much we can do
+ about this"
+ <braunr> ok
+ <youpi> because it's a bad interaction between libfakeroot and glibc
+ <youpi> glibc believes fxstat64 would never call locale functions
+ <youpi> but with libfakeroot it does
+ <braunr> i see
+ <youpi> only because we get an EAGAIN here
+ <braunr> but hm, doesn't it happen on linux ?
+ <youpi> EAGAIN doesn't happen on linux for fxstat64, no :)
+ <braunr> why does it happen on the hurd ?
+ <youpi> I mean for fakeroot stuff
+ <youpi> probably because fakeroot uses socket functions
+ <youpi> for which we probably don't properly handle EAGAIN
+ <youpi> I've already seen such kind of issue
+ <youpi> in buildd failures
+ <braunr> ok
+ <youpi> (so the actual bug here is EAGAIN
+ <youpi> )
+ <braunr> yes, so we can do something about it
+ <braunr> worth a look
+ <pinotree> (implement sysv semaphores)
+ <youpi> pinotree: if we could also solve all these buildd EAGAIN issues
+ that'd be nice :)
+ <braunr> that EAGAIN error might also be what makes exim behave badly and
+ loop forever
+ <youpi> possibly
+ <braunr> i've updated the trace with debugging symbols
+ <braunr> it fails on connect
+ <pinotree> like http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=563342 ?
+ <braunr> it's EAGAIN, not ECONNREFUSED
+ <pinotree> ah ok
+ <braunr> might be an error in tcp_v4_get_port
+
+
+## IRC, freenode, #hurd, 2012-12-06
+
+ <braunr> hmm, tcp_v4_get_port sometimes fails indeed
+ <gnu_srs> braunr: may I ask how you found out, adding print statements in
+ pfinet, or?
+ <braunr> yes
+ <gnu_srs> OK, so that's the only (easy) way to debug.
+ <braunr> that's the last resort
+ <braunr> gdb is easy too
+ <braunr> i could have added a breakpoint too
+ <braunr> but i didn't want to block pfinet while i was away
+ <braunr> is it possible to force the use of fakeroot-tcp on linux ?
+ <braunr> the problem seems to be that fakeroot doesn't close the sockets
+ that it connected to faked-tcp
+ <braunr> which, at some point, exhausts the port space
+ <pinotree> braunr: sure
+ <pinotree> change the fakeroot dpkg alternative
+ <braunr> ok
+ <pinotree> calling it explicitly `fakeroot-tcp command` or
+ `dpkg-buildpackage -rfakeroot-tcp ...` should work too
+ <braunr> fakeroot-tcp looks really evil :p
+ <braunr> hum, i don't see any faked-tcp process on linux :/
+ <pinotree> not even with `fakeroot-tcp bash -c "sleep 10"`?
+ <braunr> pinotree: now yes
+ <braunr> but, does it mean faked-tcp is started for *each* process loading
+ fakeroot-tcp ?
+ <braunr> (the lib i mean)
+ <pinotree> i think so
+ <braunr> well the hurd doesn't seem to do that at all
+ <braunr> or maybe it does and i don't see it
+ <braunr> the stale faked-tcp processes could be those that failed something
+ only
+ <pinotree> yes, there's also that issue: sometimes there are stale
+ faked-tcp processes
+ <braunr> hum no, i see one faked-tcp that consumes cpu when building glibc
+ <braunr> it's the same process for all commands
+ <pinotree> <braunr> but, does it mean faked-tcp is started for *each*
+ process loading fakeroot-tcp ?
+ <pinotree> → everytime you start fakeroot, there's a new faked-xxx for it
+ <braunr> it doesn't look that way
+ <braunr> again, on the hurd, i see one faked-tcp, consuming cpu while
+ building so i assume it services libfakeroot-tcp requests
+ <pinotree> yes
+ <braunr> which means i probably won't reproduce the problem on linux
+ <pinotree> it serves that fakeroot under which the binary(-arch) target is
+ run
+ <braunr> or perhaps it's the normal fakeroot-tcp behaviour on sid
+ <braunr> pinotree: a faked-tcp that is started for each command invocation
+ will implicitly make the network stack close all its sockets when
+ exiting
+ <braunr> pinotree: as our fakeroot-tcp uses the same instance of faked-tcp,
+ it's a lot more likely to exhaust the port space
+ <pinotree> i see
+ <braunr> i'll try on sid and see how it behaves
+ <braunr> pinotree: on the other hand, forking so many processes at each
+ command invocation may make exec leak a lot :p
+ <braunr> or rather, a lot more
+ <braunr> (or maybe not, since it leaks only in some cases)
+
+[[exec_leak]].
+
+ <braunr> pinotree: actually, the behaviour under linux is the same with the
+ alternative correctly set, whereas faked-tcp is restarted (if used at
+ all) with -rfakeroot-tcp
+ <braunr> hm no, even that isn't true
+ <braunr> grr
+ <braunr> pinotree: i think i found a handy workaround for fakeroot
+ <braunr> pinotree: the range of local ports in our networking stack is a
+ lot more limited than what is configured in current systems
+ <braunr> by extending it, i can now build glibc \o/
+ <pinotree> braunr: what are the current ours and the usual one?
+ <braunr> see pfinet/linux-src/net/ipv4/tcp_ipv4.c
+ <braunr> the modern ones are the ones suggested in the comment
+ <braunr> sysctl_local_port_range is the symbol storing the range
+ <pinotree> i see
+ <pinotree> what's the current range on linux?
+ <braunr> 20:44 < braunr> the modern ones are the ones suggested in the
+ comment
+ <pinotree> i see
+ <braunr> $ cat /proc/sys/net/ipv4/ip_local_port_range
+ <braunr> 32768 61000
+ <braunr> so, i'm not sure why we have the problem, since even on linux,
+ netstat doesn't show open bound ports, but it does help
+ <braunr> the fact faked-tcp can remain after its use is more problematic
+ <pinotree> (maybe pfinet could grow a (startup-only?) option to change it,
+ similar to that sysctl)
+ <braunr> but it can also stems from the same issue gnu_srs found about
+ closed sockets that haven't been shut down
+ <braunr> perhaps
+ <braunr> but i don't see the point actually
+ <braunr> we could simply change the values in the code
+
+ <braunr> youpi: first, in pfinet, i increased the range of local ports to
+ reduce the likeliness of port space exhaustion
+ <braunr> so we should get a lot less EAGAIN after that
+ <braunr> (i've not committed any of those changes)
+ <youpi> range of local ports?
+ <braunr> see pfinet/linux-src/net/ipv4/tcp_ipv4.c, tcp_v4_get_port function
+ and sysctl_local_port_range array
+ <youpi> oh
+ <braunr> EAGAIN is caused by tcp_v4_get_port failing at
+ <braunr> /* Exhausted local port range during search? */
+ <braunr> if (remaining <= 0)
+ <braunr> goto fail;
+ <youpi> interesting
+ <youpi> so it's not a hurd bug after all
+ <youpi> just a problem in fakeroot eating a lot of ports
+ <braunr> maybe because of the same issue gnu_srs worked on (bad socket
+ close when no clean shutdown)
+ <braunr> maybe, maybe not
+ <braunr> but increasing the range is effective
+ <braunr> and i compared with what linux does today, which is exactly what
+ is in the comment above sysctl_local_port_range
+ <braunr> so it looks safe
+    <youpi> so that means that the pfinet just uses ports 1024-4999 for
+ auto-allocated ports?
+ <braunr> i guess so
+ <youpi> the linux pfinet I meant
+ <braunr> i haven't checked the whole code but it looks that way
+ <youpi> ./sysctl_net_ipv4.c:static int ip_local_port_range_min[] = { 1, 1
+ };
+ <youpi> ./sysctl_net_ipv4.c:static int ip_local_port_range_max[] = { 65535,
+ 65535 };
+ <youpi> looks like they have increased it since then :)
+ <braunr> hum :)
+ <braunr> $ cat /proc/sys/net/ipv4/ip_local_port_range
+ <braunr> 32768 61000
+ <youpi> yep, same here
+ <youpi> ./inet_connection_sock.c: .range = { 32768, 61000 },
+ <youpi> so there are two things apparently
+ <youpi> but linux now defaults to 32k-61k
+ <youpi> braunr: please just push the port range upgrade to 32Ki-61K
+ <braunr> ok, will do
+    <youpi> there's no reason not to do it
+
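+The fix under discussion is confined to pfinet's embedded Linux TCP/IP
+code.  A sketch of what it amounts to, assuming the Linux 2.2-style array
+named above (the change as actually pushed may differ in form):
+
+    /* pfinet/linux-src/net/ipv4/tcp_ipv4.c (sketch only) */
+
+    /* The old default, { 1024, 4999 }, leaves fewer than 4000
+     * automatically bound local ports, which a long-lived faked-tcp
+     * instance can exhaust; modern Linux defaults to the range below.  */
+    int sysctl_local_port_range[2] = { 32768, 61000 };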
+
+## IRC, freenode, #hurd, 2012-12-11
+
+ <braunr> youpi: at least, i haven't had any failure building eglibc since
+ the port range patch
+ <youpi> good :)
diff --git a/open_issues/fork_deadlock.mdwn b/open_issues/fork_deadlock.mdwn
index 6b90aa0a..c1fa9208 100644
--- a/open_issues/fork_deadlock.mdwn
+++ b/open_issues/fork_deadlock.mdwn
@@ -63,3 +63,34 @@ Another one in `dash`:
stopped = 1
i = 6
[...]
+
+
+# IRC, OFTC, #debian-hurd, 2012-11-24
+
+ <youpi> the lockups are about a SIGCHLD which gets lost
+ <pinotree> ah, ok
+ <youpi> which makes bash spin
+    <pinotree> is that happening more often recently, or is it just
+      something i just noticed?
+ <youpi> it's more often recently
+ <youpi> where "recently" means "some months ago"
+ <youpi> I didn't notice exactly when
+ <pinotree> i see
+ <youpi> it's at most since june, apparently
+ <youpi> (libtool managed to build without a fuss, while now it's a pain)
+ <youpi> (libtool building is a good test, it seems to be triggering quite
+ reliably)
+
+
+## IRC, freenode, #hurd, 2012-11-27
+
+ <youpi> we also have the shell wait issue
+ <youpi> it's particularly bad on libtool calls
+ <youpi> the libtool package (with testsuite) is a good reproducer :)
+ <youpi> the symptom is shell scripts eating CPU
+ <youpi> busy-waiting for a SIGCHLD which never gets received
+ <braunr> that could be what i got
+ <braunr>
+ http://www.gnu.org/software/hurd/microkernel/mach/gnumach/memory_management.html
+ <braunr> last part
+ <youpi> perhaps watch has the same issue as the shell, yes
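+
+To make the symptom concrete, here is a hypothetical, self-contained
+sketch of the pattern described (not bash's actual code): a flag set from
+the SIGCHLD handler is polled, so a lost signal turns into endless
+CPU-burning spinning.
+
+    #include <signal.h>
+    #include <sys/types.h>
+    #include <sys/wait.h>
+    #include <unistd.h>
+
+    static volatile sig_atomic_t got_sigchld;
+
+    static void
+    on_sigchld (int sig)
+    {
+      (void) sig;
+      got_sigchld = 1;
+    }
+
+    int
+    main (void)
+    {
+      signal (SIGCHLD, on_sigchld);
+      for (;;)
+        {
+          pid_t pid = fork ();
+          if (pid < 0)
+            return 1;
+          if (pid == 0)
+            _exit (0);
+          /* Spin until the handler has run; if the SIGCHLD is lost,
+             this loop never terminates and eats CPU, which is the
+             observed symptom.  */
+          while (!got_sigchld)
+            ;
+          got_sigchld = 0;
+          waitpid (pid, NULL, 0);
+        }
+    }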
diff --git a/open_issues/gcc/pie.mdwn b/open_issues/gcc/pie.mdwn
new file mode 100644
index 00000000..a4598d1e
--- /dev/null
+++ b/open_issues/gcc/pie.mdwn
@@ -0,0 +1,40 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!meta title="Position-Independent Executables"]]
+
+[[!tag open_issue_gcc]]
+
+
+# IRC, freenode, #debian-hurd, 2012-11-08
+
+ <pinotree> tschwinge: i'm not totally sure, but it seems the pie options
+ for gcc/ld are causing issues
+ <pinotree> namely, producing executables that sigsegv straight away
+ <tschwinge> pinotree: OK, I do remember some issues about these, too.
+ <tschwinge> Also for -pg.
+ <tschwinge> These have in common that they use different crt*.o files for
+ linking.
+ <tschwinge> Might well be there's some bugs there.
+ <pinotree> one way is to try the w3m debian build: the current build
+ configuration enables also pie, which in turns makes an helper executable
+ (mktable) sigsegv when invoked
+ <pinotree> if «,-pie» is appended to the DEB_BUILD_MAINT_OPTIONS variable
+ in debian/rules, pie is not added and the resulting mktable runs
+ correctly
+
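+For reference, the workaround described above amounts to one line in the
+affected package's debian/rules (hypothetical fragment; the variable's
+existing value depends on the package, here assumed to enable all
+hardening flags):
+
+    # debian/rules: disable PIE while keeping the other hardening flags
+    export DEB_BUILD_MAINT_OPTIONS = hardening=+all,-pie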
+
+## IRC, OFTC, #debian-hurd, 2012-11-09
+
+ <pinotree> youpi: ah, as i noted to tschwinge earlier, it seems -fPIE -pie
+ miscompile stuff
+ <youpi> uh
+ <pinotree> this causes the w3m build failure and (indirectly, due to elinks
+ built with -pie) aptitude
diff --git a/open_issues/glibc.mdwn b/open_issues/glibc.mdwn
index e94a4f1f..3b4e5efa 100644
--- a/open_issues/glibc.mdwn
+++ b/open_issues/glibc.mdwn
@@ -81,6 +81,35 @@ Last reviewed up to the [[Git mirror's e80d6f94e19d17b91e3cd3ada7193cc88f621feb
Might simply be a missing patch(es) from master.
+ * `--disable-multi-arch`
+
+ IRC, freenode, #hurd, 2012-11-22
+
+ <pinotree> tschwinge: is your glibc build w/ or w/o multiarch?
+ <tschwinge> pinotree: See open_issues/glibc: --disable-multi-arch
+ <pinotree> ah, because you do cross-compilation?
+ <tschwinge> No, that's natively.
+      <tschwinge> There is also a note of what happened in cross-gnu when I
+        enabled multi-arch.
+      <tschwinge> No idea whether that's still relevant, though.
+ <tschwinge> As for native builds: I guess I just didn't (want to) play
+ with it yet.
+ <pinotree> it is enabled in debian since quite some time, maybe other
+ i386/i686 patches (done for linux) help us too
+      <tschwinge> I thought we first needed some CPU identification
+        infrastructure before it can really work?
+ <pinotree> as in use the i686 variant as runtime automatically? i guess
+ so
+ <tschwinge> I thought I had some notes about that, but can't currently
+ find them.
+ <tschwinge> Ah, I probably have been thinking about open_issues/ifunc
+ and open_issues/libc_variant_selection.
+
* --build=X
`long double` test: due to `cross_compiling = maybe` wants to execute a
@@ -350,6 +379,24 @@ Last reviewed up to the [[Git mirror's e80d6f94e19d17b91e3cd3ada7193cc88f621feb
<pinotree> like posix/tst-waitid.c, you mean?
<youpi> yes
+ * `getconf` things
+
+ IRC, freenode, #hurd, 2012-10-03
+
+ <pinotree> getconf -a | grep CACHE
+ <Tekk_> pinotree: I hate spoiling data, but 0 :P
+ <pinotree> had that feeling, but wanted to be sure -- thanks!
+ <Tekk_> http://dpaste.com/809519/
+ <Tekk_> except for uhh
+ <Tekk_> L4 linesize
+ <Tekk_> that didn't have any number associated
+ <pinotree> weird
+ <Tekk_> I actually didn't even know that there was L4 cache
+ <pinotree> what do you get if you run `getconf
+ LEVEL4_CACHE_LINESIZE`?
+ <Tekk_> pinotree: undefined
+ <pinotree> expected, given the output above
+
For specific packages:
* [[octave]]
@@ -384,6 +431,270 @@ Last reviewed up to the [[Git mirror's e80d6f94e19d17b91e3cd3ada7193cc88f621feb
* `sysdeps/unix/sysv/linux/syslog.c`
+ * `fsync` on a pipe
+
+ IRC, freenode, #hurd, 2012-08-21:
+
+ <braunr> pinotree: i think gnu_srs spotted a conformance problem in
+ glibc
+ <pinotree> (only one?)
+ <braunr> pinotree: namely, fsync on a pipe (which is actually a
+ socketpair) doesn't return EINVAL when the "operation not supported"
+ error is returned as a "bad request message ID"
+ <braunr> pinotree: what do you think of this case ?
+ <pinotree> i'm far from an expert on such stuff, but seems a proper E*
+ should be returned
+ <braunr> (there also is a problem in clisp falling in an infinite loop
+ when trying to handle this, since it uses fsync inside the error
+ handling code, eww, but we don't care :p)
+ <braunr> basically, here is what clisp does
+ <braunr> if fsync fails, and the error isn't EINVAL, let's report the
+ error
+ <braunr> and reporting the error in turn writes something on the
+ output/error stream, which in turn calls fsync again
+ <pinotree> smart
+ <braunr> after the stack is exhausted, clisp happily crashes
+ <braunr> gnu_srs: i'll alter the clisp code a bit so it knows about our
+ mig specific error
+ <braunr> if that's the problem (which i strongly suspect), the solution
+ will be to add an error conversion for fsync so that it returns
+ EINVAL
+ <braunr> if pinotree is willing to do that, he'll be the only one
+ suffering from the dangers of sending stuff to the glibc maintainers
+ :p
+ <pinotree> that shouldn't be an issue i think, there are other glibc
+ hurd implementations that do such checks
+ <gnu_srs> does fsync return EINVAL for other OSes?
+ <braunr> EROFS, EINVAL
+ <braunr> fd is bound to a special file which does not
+ support synchronization.
+ <braunr> obviously, pipes and sockets don't
+ <pinotree>
+ http://pubs.opengroup.org/onlinepubs/9699919799/functions/fsync.html
+ <braunr> so yes, other OSes do just that
+ <pinotree> now that you speak about it, it could be the failure that
+ the gnulib fsync+fdatasync testcase have when being run with `make
+ check` (although not when running as ./test-foo)
+ <braunr> hm we may not need change glibc
+ <braunr> clisp has a part where it defines a macro IS_EINVAL which is
+ system specific
+ <braunr> (but we should change it in glibc for conformance anyway)
+ <braunr> #elif defined(UNIX_DARWIN) || defined(UNIX_FREEBSD) ||
+ defined(UNIX_NETBSD) || defined(UNIX_OPENBSD) #define IS_EINVAL_EXTRA
+ ((errno==EOPNOTSUPP)||(errno==ENOTSUP)||(errno==ENODEV))
+ <pinotree> i'd rather add nothing to clisp
+ <braunr> let's see what posix says
+ <braunr> EINVAL
+ <braunr> so right, we should simply convert it in glibc
+ <gnu_srs> man fsync mentions EINVAL
+ <braunr> man pages aren't posix, even if they are usually close
+ <gnu_srs> aha
+      <pinotree> i think checking for MIG_BAD_ID and EOPNOTSUPP (like other
+        parts do) will be enough
+ <braunr> gnu_srs: there, it finished correctly even when piped
+ <gnu_srs> I saw that, congrats!
+ <braunr> clisp is quite tricky to debug
+ <braunr> i never had to deal with a program that installs break points
+ and handles segfaults itself in order to implement growing stacks :p
+ <braunr> i suppose most interpreters do that
+ <gnu_srs> So the permanent change will be in glibc, not clisp?
+ <braunr> yes
+
+ IRC, freenode, #hurd, 2012-08-24:
+
+ <gnu_srs1> pinotree: The changes needed for fsync.c is at
+ http://paste.debian.net/185379/ if you want to try it out (confirmed
+ with rbraun)
+ <youpi> I agree with the patch, posix indeed documents einval as the
+ "proper" error value
+ <pinotree> there's fdatasync too
+ <pinotree> other places use MIG_BAD_ID instead of EMIG_BAD_ID
+ <braunr> pinotree: i assume that if you're telling us, it's because
+ they have different values
+ <pinotree> braunr: tbh i never seen the E version, and everywhere in
+ glibc the non-E version is used
+ <gnu_srs1> in sysdeps/mach/hurd/bits/errno.h only the E version is
+ defined
+ <pinotree> look in gnumach/include/mach/mig_errors.h
+ <pinotree> (as the comment in errno.h say)
+ <gnu_srs1> mig_errors.h yes. Which comment: from errors.h: /* Errors
+ from <mach/mig_errors.h>. */ and then the EMIG_ stuff?
+ <gnu_srs1> Which one is used when building libc?
+ <gnu_srs1> Answer: At least in fsync.c errno.h is used: #include
+ <errno.h>
+ <gnu_srs1> Yes, fdatasync.c should be patched too.
+ <gnu_srs1> pinotree: You are right: EMIG_ or MIG_ is confusing.
+ <gnu_srs1> /usr/include/i386-gnu/bits/errno.h: /* Errors from
+ <mach/mig_errors.h>. */
+ <gnu_srs1> /usr/include/hurd.h:#include <mach/mig_errors.h>
+
+ IRC, freenode, #hurd, 2012-09-02:
+
+ <antrik> braunr: regarding fsync(), I agree that EOPNOTSUPP probably
+ should be translated to EINVAL, if that's what POSIX says. it does
+ *not* sound right to translate MIG_BAD_ID though. the server should
+ explicitly return EOPNOTSUPP, and that's what the default trivfs stub
+ does. if you actually do see MIG_BAD_ID, there must be some other
+ bug...
+ <braunr> antrik: right, pflocal doesn't call the trivfs stub for socket
+ objects
+ <braunr> trivfs_demuxer is only called by the pflocal node demuxer, for
+ socket objects it's another call, and i don't think it's the right
+ thing to call trivfs_demuxer there either
+      <pinotree> handling MIG_BAD_ID isn't a bad idea anyway, you never know
+ what the underlying server actually implements
+ <pinotree> (imho)
+ <braunr> for me, a bad id is the same as a not supported operation
+ <pinotree> ditto
+ <pinotree> from fsync's POV, both the results are the same anyway, ie
+ that the server does not support a file_sync operation
+ <antrik> no, a bad ID means the server doesn't implement the protocol
+ (or not properly at least)
+ <antrik> it's usually a bug IMHO
+ <antrik> there is a reason we have EOPNOTSUPP for operations that are
+ part of a protocol but not implemented by a particular server
+ <pinotree> antrik: even if it could be the case, there's no reason to
+ make fsync fail anyway
+ <antrik> pinotree: I think there is. it indicates a bug, which should
+ not be hidden
+ <pinotree> well, patches welcome then...
+ <antrik> thing is, if sock objects are actually not supposed to
+ implement the file interface, glibc shouldn't even *try* to call
+ fsync on them
+ <pinotree> how?
+      <pinotree> i mean, can you check whether the file interface is not
+        implemented, without doing a roundtrip?
+ <antrik> well, the sock objects are not files, i.e. they were *not*
+ obtained by file_name_lookup(), but rather a specific call. so glibc
+ actually *knows* that they are not files.
+ <braunr> antrik: this way of thinking means we need an "fd" protocol
+ <braunr> so that objects accessed through a file descriptor implement
+ all fd calls
+ <antrik> now I wonder though whether there are conceivable use cases
+ where it would make sense for objects obtained through the socket
+ call to optionally implement the file interface...
+ <antrik> which could actually make sense, if libc lets through other
+ file calls as well (which I guess it does, if the sock ports are
+ wrapped in normal fd structures?)
+ <braunr> antrik: they are
+ <braunr> and i'd personally be in favor of such an fd protocol, even if
+ it means implementing stubs for many useless calls
+ <braunr> but the way things are now suggest a bad id really means an
+ operation is simply not supported
+ <antrik> the question in this case is whether we should make the file
+ protocol mandatory for anything that can end up in an FD; or whether
+ we should keep it optional, and add the MIG_BAD_ID calls to *all* FD
+ operations
+ <antrik> (there is no reason for fsync to be special in this regard)
+ <braunr> yes
+ <antrik> braunr: BTW, I'm rather undecided whether the right approach
+ is a) requiring an FD interface collection, b) always checking
+ MIG_BAD_ID, or perhaps c) think about introducing a mechanism to
+ explicitly query supported interfaces...
+
+ IRC, freenode, #hurd, 2012-09-03:
+
+ <braunr> antrik: querying interfaces sounds like an additional penalty
+ on performance
+ <antrik> braunr: the query usually has to be done only once. in fact it
+ could be integrated into the name lookup...
+ <braunr> antrik: once for every object
+ <braunr> antrik: yes, along with the lookup would be a nice thing
+
+ [[!message-id "1351231423.8019.19.camel@hp.my.own.domain"]].
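+
+    A sketch of the conversion that was settled on, modeled on glibc's
+    `sysdeps/mach/hurd/fsync.c` (the actual patch is the one in the paste
+    referenced above, and may differ in detail):
+
+        /* Sketch, not the committed code.  */
+        #include <errno.h>
+        #include <hurd.h>
+        #include <hurd/fd.h>
+        #include <mach/mig_errors.h>
+
+        int
+        fsync (int fd)
+        {
+          error_t err = HURD_DPORT_USE (fd, __file_sync (port, 1, 0));
+          /* POSIX wants EINVAL when FD does not support synchronization
+             (pipes, sockets).  A server may answer EOPNOTSUPP, or
+             MIG_BAD_ID if it lacks the file interface entirely.  */
+          if (err == MIG_BAD_ID || err == EOPNOTSUPP)
+            err = EINVAL;
+          return err ? __hurd_dfail (fd, err) : 0;
+        }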
+
+ * `t/no-hp-timing`
+
+ IRC, freenode, #hurd, 2012-11-16
+
+ <pinotree> tschwinge: wrt the glibc topgit branch t/no-hp-timing,
+ couldn't that file be just replaced by #include
+ <sysdeps/generic/hp-timing.h>?
+
+ * `flockfile`/`ftrylockfile`/`funlockfile`
+
+ IRC, freenode, #hurd, 2012-11-16
+
+ <pinotree> youpi: uhm, in glibc we use
+ stdio-common/f{,try,un}lockfile.c, which do nothing (as opposed to eg
+ the nptl versions, which do lock/trylock/unlock); do you know more
+ about them?
+ <youpi> pinotree: ouch
+ <youpi> no, I don't know
+ <youpi> well, I do know what they're supposed to do
+      <pinotree> i'm trying filling them, let's see
+ <youpi> but not why we don't have them
+ <youpi> (except that libpthread is "recent")
+ <youpi> yet another reason to build libpthread in glibc, btw
+ <youpi> oh, but we do provide lockfile in libpthread, don't we ?
+ <youpi> pinotree: yes, and libc has weak variants, so the libpthread
+ will take over
+      <pinotree> youpi: sure, but that's in stuff linking to pthreads
+ <pinotree> if you do a simple application doing eg main() { fopen +
+ fwrite + fclose }, you get no locking
+ <youpi> so?
+ <youpi> if you don't have threads, you don't need locks :)
+ <pinotree> ... unless there is some indirect recursion
+ <youpi> ?
+ <pinotree> basically, i was debugging why glibc tests with mtrace() and
+ ending with muntrace() would die (while tests without muntrace call
+ wouldn't)
+ <youpi> well, I still don't see what a lock will bring
+ <pinotree> if you look at the muntrace implementation (in
+ malloc/mtrace.c), basically fclose can trigger a malloc hook (because
+ of the free for the FILE*)
+      <youpi> either you have threads, and it's needed, or you don't, and
+        it's a nop
+ <youpi> yes, and ?
+ <braunr> does the signal thread count ?
+ <youpi> again, in linux, when you don't have threads, the lock is a nop
+ <youpi> does the signal thread use IO ?
+ <braunr> that's the question :)
+ <braunr> i hope not
+ <youpi> IIRC the signal thread just manages signals, and doesn't
+ execute the handler itself
+ <braunr> sure
+ <braunr> i was more thinking about debug stuff
+ <youpi> can't hurt to add them anyway, but let me still doubt that it'd
+ fix muntrace, I don't see why it would, unless you have threads
+ <pinotree> that's what i'm going next
+ <pinotree> pardon, it seems i got confused a bit
+      <pinotree> it'd look like a genuine muntrace bug (muntrace → fclose →
+        free hook → lock lock → fprintf (since the FILE is still set) →
+        malloc → malloc hook → lock lock → spin)
+ <pinotree> at least i got some light over the flockfile stuff, thanks
+ ;)
+ <pinotree> youpi: otoh, __libc_lock_lock (etc) are noop in the base
+ implementation, while doing real locks on hurd in any case, and on
+ linux only if nptl is loaded, it seems
+ <pinotree> that would explain why on linux you get no deadlock
+ <youpi> unless using nptl, that is?
+ <pinotree> hm no, even with pthread it works
+ <pinotree> but hey, at least the affected glibc test now passes
+ <pinotree> will maybe try to do investigation on why it works on linux
+ tomorrow
+
+ [[!message-id "201211172058.21035.toscano.pino@tiscali.it"]].
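+
+    A minimal reproducer sketch for the recursion described above
+    (hypothetical, not the actual glibc test; MALLOC_TRACE must be set in
+    the environment for mtrace to install its hooks):
+
+        #include <mcheck.h>
+        #include <stdlib.h>
+
+        int
+        main (void)
+        {
+          mtrace ();             /* installs the malloc/free hooks */
+          free (malloc (32));    /* traced allocation */
+          /* muntrace () fcloses the trace FILE; freeing the FILE runs
+             the free hook, which takes the lock and fprintfs to the
+             still-set trace FILE, which mallocs, which runs the malloc
+             hook, which takes the lock again -> spin.  */
+          muntrace ();
+          return 0;
+        }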
+
+ * `t/pagesize`
+
+ IRC, freenode, #hurd, 2012-11-16
+
+ <pinotree> tschwinge: somehow related to your t/pagesize branch: due to
+ the fact that EXEC_PAGESIZE is not defined on hurd, libio/libioP.h
+ switches the allocation modes from mmap to malloc
+
+ * `LD_DEBUG`
+
+ IRC, freenode, #hurd, 2012-11-22
+
+ <pinotree> woot, `LD_DEBUG=libs /bin/ls >/dev/null` prints stuff and
+ then sigsegv
+ <tschwinge> Yeah, that's known for years... :-D
+ <tschwinge> Probably not too difficult to resolve, though.
+
* Verify baseline changes, if we need any follow-up changes:
* a11ec63713ea3903c482dc907a108be404191a02
@@ -559,6 +870,11 @@ Last reviewed up to the [[Git mirror's e80d6f94e19d17b91e3cd3ada7193cc88f621feb
* *baseline*
* [high] `sendmmsg` usage, c030f70c8796c7743c3aa97d6beff3bd5b8dcd5d --
need a `ENOSYS` stub.
+ * ea4d37b3169908615b7c17c9c506c6a6c16b3a26 -- IRC, freenode, #hurd,
+ 2012-11-20, pinotree: »tschwinge: i agree on your comments on
+ ea4d37b3169908615b7c17c9c506c6a6c16b3a26, especially since mach's
+ sleep.c is buggy (not considers interruption, extra time() (= RPC)
+ call)«.
# Build
diff --git a/open_issues/glibc/t/tls-threadvar.mdwn b/open_issues/glibc/t/tls-threadvar.mdwn
index e72732ab..4afd8a1a 100644
--- a/open_issues/glibc/t/tls-threadvar.mdwn
+++ b/open_issues/glibc/t/tls-threadvar.mdwn
@@ -29,3 +29,32 @@ IRC, freenode, #hurd, 2011-10-23:
After this has been done, probably the whole `__libc_tsd_*` stuff can be
dropped altogether, and `__thread` directly be used in glibc.
+
+
+# IRC, freenode, #hurd, 2012-08-07
+
+ <tschwinge> r5219: Update libpthread patch to replace threadvar with tls
+ for pthread_self
+ <tschwinge> r5224: revert r5219 too, it's not ready either
+ <youpi> as the changelog says, the __thread revertal is because it posed
+ problems
+ <youpi> and I just didn't have any time to check them while the freeze was
+ so close
+ <tschwinge> OK. What kind of problems? Should it be reverted upstream,
+ too?
+ <youpi> I don't remember exactly
+ <youpi> it should just be fixed
+ <youpi> we can revert it upstream, but it'd be good that we manage to
+ progress, at some point...
+ <tschwinge> Of course -- however as long as we don't know what kind of
+ problem, it is a bit difficult. ;-)
+    <youpi> since I didn't leave a note, it was most probably a mere glibc
+      testsuite run, or boot with the patched libpthread
+ <tschwinge> OK.
+ <tschwinge> The libpthread testsuite doesn't show any issues with that
+      patch applied, though. But I didn't test anything else.
+ <tschwinge> youpi: Also, you have probably seen my glibc __thread errno
+ email -- rmcgrath wanted to find some time this week to comment/help, and
+ I take it you don't have any immediate comments to that issue?
+ <youpi> I saw the mails, but didn't investigate at all
diff --git a/open_issues/gnat.mdwn b/open_issues/gnat.mdwn
index fb624fad..2d17e275 100644
--- a/open_issues/gnat.mdwn
+++ b/open_issues/gnat.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -38,6 +38,55 @@ svn://svn.debian.org/gcccvs/branches/sid@5638
6ca36cf4-e1d1-0310-8c6f-e303bb2178ca'
+## IRC, freenode, #hurd, 2012-07-17
+
+ <gnu_srs> I've found the remaining problem with gnat backtrace for Hurd!
+ Related to the stack frame.
+ <gnu_srs> This version does not work: one relying on static assumptions
+ about the frame layout
+ <gnu_srs> Causing segfaults.
+ <gnu_srs> Any interest to create a test case out of that piece of code,
+ taken from gcc/ada/tracebak.c?
+ <braunr> gnu_srs: sure
+
+
+### IRC, freenode, #hurd, 2012-07-18
+
+ <braunr> "Digging further revealed that the GNU/Hurd stack frame does not
+ seem to
+ <braunr> be static enough to define USE_GENERIC_UNWINDER in
+ gcc/ada/tracebak.c.
+ <braunr> "
+ <braunr> what do you mean by a "stack frame does not seem to be static
+ enough" ?
+    <gnu_srs> I can quote from the source file if you want. Otherwise look
+      at the code yourself: gcc/ada/tracebak.c
+ <gnu_srs> I mean that something is wrong with the stack frame for
+ Hurd. This is the code I wanted to use as a test case for the stack.
+ <gnu_srs> Remember?
+ <braunr> more or less
+ <braunr> ah, "static assumptions"
+ <braunr> all right, i don't think anything is "wrong" with stack frames
+ <braunr> but if you use a recent version of gcc, as indicated in the code,
+ -fomit-frame-pointer is enabled by default
+ <braunr> so your stack frame won't look like it used to be without the
+ option
+ <braunr> hence the need for USE_GCC_UNWINDER
+ <braunr> http://en.wikipedia.org/wiki/Call_stack explains this very well
+ <gnu_srs> However, kfreebsd does not seem to need USE_GCC_UNWINDER, how
+ come?
+ <braunr> i guess they don't omit the frame pointer
+ <braunr> your fix is good btw
+ <gnu_srs> thanks
+
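+What the "static assumptions about the frame layout" boil down to is a
+frame-pointer walk like the illustration below (not the actual
+gcc/ada/tracebak.c code):
+
+    /* Assumes every function starts with "pushl %ebp; movl %esp, %ebp",
+       so each frame begins with the saved frame pointer followed by the
+       return address.  With -fomit-frame-pointer (enabled by default in
+       recent GCC, as noted above), %ebp becomes a scratch register and
+       this walk dereferences garbage, hence the need for
+       USE_GCC_UNWINDER.  */
+    #include <stdio.h>
+
+    struct frame
+    {
+      struct frame *next;   /* saved %ebp of the caller */
+      void *ret_addr;       /* return address pushed by the call */
+    };
+
+    void
+    print_backtrace (void)
+    {
+      struct frame *fp = __builtin_frame_address (0);
+
+      while (fp != NULL && fp->ret_addr != NULL)
+        {
+          printf ("%p\n", fp->ret_addr);
+          fp = fp->next;
+        }
+    }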
+
+### IRC, freenode, #hurd, 2012-07-19
+
+ <gnu_srs> tschwinge: The bug in #681998 should go upstream. Applied in
+ Debian already. Hopefully this is the last patch needed for the port of
+ GNAT to Hurd.
+
+
---
diff --git a/open_issues/gnumach_memory_management.mdwn b/open_issues/gnumach_memory_management.mdwn
index 9feb30c8..e5e9d2c5 100644
--- a/open_issues/gnumach_memory_management.mdwn
+++ b/open_issues/gnumach_memory_management.mdwn
@@ -2133,3 +2133,52 @@ There is a [[!FF_project 266]][[!tag bounty]] on this task.
<braunr> do you want to review ?
<youpi> I don't think there is any need to
<braunr> ok
+
+
+# IRC, freenode, #hurd, 2012-12-08
+
+    <mcsim> braunr: hi. Do I understand correctly that essentially the same
+      technique is used in linux to determine the slab where the object to
+      be freed resides?
+ <braunr> yes but it's faster on linux since it uses a direct mapping of
+ physical memory
+ <braunr> it just has to shift the virtual address to obtain the physical
+      one, whereas x15 has to walk the page tables
+ <braunr> of course it only works for kmalloc, vmalloc is entirely different
+    <mcsim> btw, does it make sense to use some kind of B-tree instead of
+      AVL to decrease the number of cache misses? AFAIK, in modern
+      processors the size of an L1 cache line is at least 64 bytes, so in
+      one node we can put at least 4 leaves (key + pointer to data), making
+      search faster.
+ <braunr> that would be a b-tree
+ <braunr> and yes, red-black trees were actually developed based on
+ properties observed on b-trees
+ <braunr> but increasing the size of the nodes also increases memory
+ overhead
+ <braunr> and code complexity
+ <braunr> that's why i have a radix trees for cases where there are a large
+ number of entries with keys close to each other :)
+ <braunr> a radix-tree is basically a b-tree using the bits of the key as
+ indexes in the various arrays it walks instead of comparing keys to each
+ other
+ <braunr> the original avl tree used in my slab allocator was intended to
+ reduce the average height of the tree (avl is better for that)
+ <braunr> avl trees are more suited for cases where there are more lookups
+ than inserts/deletions
+ <braunr> they make the tree "flatter" but the maximum complexity of
+ operations that change the tree is 2log2(n), since rebalancing the tree
+ can make the algorithm reach back to the tree root
+ <braunr> red-black trees have slightly bigger heights but insertions are
+ limited to 2 rotations and deletions to 3
+    <mcsim> there shouldn't be many lookups in slab allocators
+ <braunr> which explains why they're more generally found in generic
+ containers
+ <mcsim> or do I misunderstand something?
+ <braunr> well, there is a lookup for each free()
+ <braunr> whereas there are insertions/deletions when a slab becomes
+ non-empty/empty
+ <mcsim> I see
+ <braunr> so it was very efficient for caches of small objects, where slabs
+ have many of them
+ <braunr> also, i wrote the implementation in userspace, without
+ functionality pmap provides (although i could have emulated it
+ afterwards)
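+
+To make the trade-off concrete, here is a sketch of the two lookup styles
+discussed (illustrative only, neither the Linux nor the x15 code;
+`slab_tree_lookup` is a hypothetical tree API):
+
+    #include <stdint.h>
+
+    #define PAGE_SIZE 4096UL
+
+    struct slab;
+    struct page { struct slab *slab; };  /* per-physical-page descriptor */
+    struct rbtree;
+
+    /* hypothetical tree API: one balanced-tree lookup */
+    struct slab *slab_tree_lookup (struct rbtree *tree, uintptr_t addr);
+
+    /* Linux kmalloc style: kernel memory is direct-mapped, so the owning
+       slab is reached with plain address arithmetic, no tree walk.  */
+    static inline struct slab *
+    obj_to_slab_direct (const struct page *page_descs, uintptr_t map_base,
+                        void *obj)
+    {
+      return page_descs[((uintptr_t) obj - map_base) / PAGE_SIZE].slab;
+    }
+
+    /* Tree style, as in the userspace slab allocator discussed here:
+       slabs are registered in a balanced tree keyed by address, and
+       every free () pays one lookup.  */
+    static inline struct slab *
+    obj_to_slab_tree (struct rbtree *slab_tree, void *obj)
+    {
+      return slab_tree_lookup (slab_tree, (uintptr_t) obj);
+    }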
diff --git a/open_issues/gnumach_page_cache_policy.mdwn b/open_issues/gnumach_page_cache_policy.mdwn
index 03cb3725..d128c668 100644
--- a/open_issues/gnumach_page_cache_policy.mdwn
+++ b/open_issues/gnumach_page_cache_policy.mdwn
@@ -108,6 +108,9 @@ License|/fdl]]."]]"""]]
12k random data
<braunr> i'll try with other values
<braunr> i get crashes, deadlocks, livelocks, and it's not pretty :)
+
+[[libpager_deadlock]].
+
<braunr> and always in ext2, mach doesn't seem affected by the issue, other
than the obvious
<braunr> (well i get the usual "deallocating an invalid port", but as
@@ -625,3 +628,158 @@ License|/fdl]]."]]"""]]
## [[metadata_caching]]
+
+
+## IRC, freenode, #hurd, 2012-07-12
+
+ <braunr> i'm only adding a cached pages count you know :)
+ <braunr> (well actually, this is now a vm_stats call that can replace
+ vm_statistics, and uses flavors similar to task_info)
+ <braunr> my goal being to see that yellow bar in htop
+ <braunr> ... :)
+ <pinotree> yellow?
+ <braunr> yes, yellow
+ <braunr> as in http://www.sceen.net/~rbraun/htop.png
+ <pinotree> ah
+
+
+## IRC, freenode, #hurd, 2012-07-13
+
+ <braunr> i always get a "no more room for vm_map_enter" error when building
+ glibc :/
+ <braunr> but the build continues, probably a failed test
+ <braunr> ah yes, i can see the yellow bar :>
+ <antrik> braunr: congrats :-)
+ <braunr> antrik: thanks
+ <braunr> but i think my patch can't make it into the git repo until the
+ swap deadlock is solved (or at least very infrequent ..)
+
+[[libpager_deadlock]].
+
+ <braunr> well, the page cache accounting tells me something is wrong there
+ too lol
+ <braunr> during a build 112M of data was created, of which only 28M made it
+ into the cache
+ <braunr> which may imply something is still holding references on the
+ others objects (shadow objects hold references to their underlying
+ object, which could explain this)
+ <braunr> ok i'm stupid, i just forgot to subtract the cached pages from the
+ used pages .. :>
+ <braunr> (hm, actually i'm tired, i don't think this should be done)
+ <braunr> ahh yes much better
+ <braunr> i simply forgot to convert pages in kilobytes .... :>
+ <braunr> with the fix, the accounting of cached files is perfect :)
+
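+(The accounting fix mentioned above is the usual unit conversion, sketched
+here with hypothetical variable names: the statistics are kept in pages,
+while tools such as htop expect kibibytes.)
+
+    uint64_t cache_kib = (uint64_t) cached_pages * PAGE_SIZE / 1024;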
+
+## IRC, freenode, #hurd, 2012-07-14
+
+ <youpi> braunr: btw, if you want to stress big builds, you might want to
+ try webkit, ppl, rquantlib, rheolef, yade
+ <youpi> they don't pass on bach (1.3GiB), but do on ironforge (1.8GiB)
+ <braunr> youpi: i don't need to, i already know my patch triggers swap
+ deadlocks more often, which was expected
+ <youpi> k
+ <braunr> there are 3 tasks concerning my work : 1/ page cache accounting
+ (i'm sending the patch right now) 2/ removing the fixed limit and 3/
+ hunting the swap deadlock and fixing as much as possible
+ <braunr> 2/ can't get in the repository without 3/ imo
+ <youpi> btw, the increase of PAGE_FREE_* in your 2/ could go already,
+ couldn't it?
+ <braunr> yes
+ <braunr> but we should test with higher thresholds
+ <braunr> well
+ <braunr> it really depends on the usage pattern :/
+
+
+## [[ext2fs_libports_reference_counting_assertion]]
+
+
+## IRC, freenode, #hurd, 2012-07-15
+
+    <braunr> concerning the page cache patch, i've been using it for quite
+      some time now, did lots of builds with it, and i actually wonder if
+      it hurts stability as much as i think
+ <braunr> considering i didn't stress the system as much before
+ <braunr> and it really improves performance
+
+ <braunr> cached memobjs: 138606
+ <braunr> cache: 1138M
+ <braunr> i bet ext2fs can have a hard time scanning 138k entries in a
+ linked list, using callback functions on each of them :x
+
+
+## IRC, freenode, #hurd, 2012-07-16
+
+ <tschwinge> braunr: Sorry that I didn't have better results to present.
+ :-/
+ <braunr> eh, that was expected :)
+ <braunr> my biggest problem is the hurd itself :/
+ <braunr> for my patch to be useful (and the rest of the intended work), the
+ hurd needs some serious fixing
+ <braunr> not syncing from the pagers
+ <braunr> and scalable algorithms everywhere of course
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <braunr> youpi: FYI, the branches rbraun/page_cache in the gnupach and hurd
+ repos are ready to be merged after review
+ <braunr> gnumach*
+ <youpi> so you fixed the hangs & such?
+    <braunr> they only add the cache stats, not the "improved" cache
+ <braunr> no
+ <braunr> it requires much more work for that :)
+ <youpi> braunr: my concern is that the tests on buildds show stability
+ regression
+ <braunr> youpi: tschwinge also reported performance degradation
+ <braunr> and not the minor kind
+ <youpi> uh
+ <tschwinge> :-/
+ <braunr> far less pageins, but twice as many pageouts, and probably high
+ cpu overhead
+ <braunr> building (which is what buildds do) means lots of small files
+ <braunr> so lots of objects
+ <braunr> huge lists, long scans, etc..
+ <braunr> so it definitely requires more work
+ <braunr> the stability issue comes first in mind, and i don't see a way to
+ obtain a usable trace
+ <braunr> do you ?
+ <youpi> nope
+ <braunr> (except making it loop forever instead of calling assert() and
+ attach gdb to a qemu instance)
+ <braunr> youpi: if you think the infinite loop trick is ok, we could
+ proceed with that
+ <youpi> which assert?
+ <braunr> the port refs one
+ <youpi> which one?
+    <braunr> which prevented you from using the page cache patch on buildds
+ <youpi> ah, the libports one
+ <youpi> for that one, I'd tend to take the time to perhaps use coccicheck
+ actually
+
+[[code_analysis]].
+
+ <braunr> oh
+    <youpi> it's one of those which is supposed to be statically analyzable
+ <braunr> that would be great
+ <tschwinge> :-)
+ <tschwinge> And set precedence.
+
+
+## IRC, freenode, #hurd, 2012-07-26
+
+ <braunr> hm i killed darnassus, probably the page cache patch again
+
+
+## IRC, freenode, #hurd, 2012-09-19
+
+ <youpi> I was wondering about the page cache information structure
+ <youpi> I guess the idea is that if we need to add a field, we'll just
+ define another RPC?
+ <youpi> braunr: ↑
+ <braunr> i've done that already, yes
+ <braunr> youpi: have a look at the rbraun/page_cache gnumach branch
+ <youpi> that's what I was referring to
+ <braunr> ok
diff --git a/open_issues/gnumach_vm_map_entry_forward_merging.mdwn b/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
index 90137766..7739f4d1 100644
--- a/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
+++ b/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -181,6 +181,8 @@ License|/fdl]]."]]"""]]
<braunr> from what i could see, part of the problem still exists in freebsd
<braunr> for the same reasons (shadow objects being one of them)
+[[mach_shadow_objects]].
+
# GCC build time using bash vs. dash
diff --git a/open_issues/gnumach_vm_map_red-black_trees.mdwn b/open_issues/gnumach_vm_map_red-black_trees.mdwn
index d7407bfe..53ff66c5 100644
--- a/open_issues/gnumach_vm_map_red-black_trees.mdwn
+++ b/open_issues/gnumach_vm_map_red-black_trees.mdwn
@@ -172,3 +172,175 @@ License|/fdl]]."]]"""]]
      crash the kernel)
    <braunr> (well, I mean, which crashed the kernel in a very obscure way
      before the rbtree patch)
+
+
+### IRC, freenode, #hurd, 2012-07-15
+
+ <bddebian> I get errors in vm_map.c whenever I try to "mount" a CD
+ <bddebian> Hmm, this time it rebooted the machine
+ <bddebian> braunr: The translator set this time and the machine reboots
+ before I can get the full message about vm_map, but here is some of the
+ crap I get: http://paste.debian.net/179191/
+ <braunr> oh
+ <braunr> nice
+ <braunr> that may be the bug youpi saw with my redblack tree patch
+ <braunr> bddebian: assert(diff != 0); ?
+ <bddebian> Aye
+ <braunr> good
+ <braunr> it means we're trying to insert a vm_map_entry at a region in a
+ map which is already occupied
+ <bddebian> Oh
+ <braunr> and unlike the previous code, the tree actually checks that
+ <braunr> it has to
+ <braunr> so you just simply use the iso9660fs translator and it crashes ?
+ <bddebian> Well it used to on just trying to set the translator. This time
+ I was able to set the translator but as soon as I cd to the mount point I
+ get all that crap
+ <braunr> that's very good
+ <braunr> more test cases to fix the vm
+
+
+### IRC, freenode, #hurd, 2012-11-01
+
+ <youpi> braunr: Assertion `diff != 0' failed in file "vm/vm_map.c", line
+ 1002
+ <youpi> that's in rbtree_insert
+ <braunr> youpi: the problem isn't the tree, it's the map entries
+ <braunr> some must overlap
+ <braunr> if you can inspect that, it would be helpful
+ <youpi> I have a kdb there
+ <youpi> it's within a port_name_to_task system call
+ <braunr> this assertion basically means there already is an item in the
+ tree where the new item is supposed to be inserted
+ <youpi> this port_name_to_task presence in the stack is odd
+ <braunr> it's in vm_map_enter
+ <youpi> there's a vm_map just after that (and the assembly trap code
+ before)
+ <youpi> I know
+ <youpi> I'm wondering about the caller
+ <braunr> do you have a way to inspect the inserted map entry ?
+ <youpi> I'm actually wondering whether I have the right kernel in gdb
+ <braunr> oh
+ <youpi> better
+ <youpi> with the right kernel :)
+ <youpi> 0x80039acf (syscall_vm_map)
+ (target_map=d48b6640,address=d3b63f90,size=0,mask=0,anywhere=1)
+ <youpi> size == 0 seems odd to me
+ <youpi> (same parameters for vm_map)
+ <braunr> right
+ <braunr> my code does assume an entry has a non null size
+ <braunr> (in the entry comparison function)
+ <braunr> EINVAL (since Linux 2.6.12) length was 0.
+ <braunr> that's a quick glance at mmap(2)
+ <braunr> might help track bugs from userspace (e.g. in exec .. :))
+    <braunr> posix says the same
+ <braunr> the gnumach manual isn't that precise
+ <youpi> I don't seem to manage to read the entry
+ <youpi> but I guess size==0 is the problem anyway
+ <mcsim> youpi, braunr: Is there another kernel fault? Was that in my
+ kernel?
+ <braunr> no that's another problem
+ <braunr> which became apparent following the addition of red black trees in
+ the vm_map code
+ <braunr> (but which was probably present long before)
+    <mcsim> braunr: BTW, do you know if there were some specific
+      circumstances that led to memory exhaustion in my code? Or did it
+      just aggregate over time?
+    <braunr> mcsim: i don't know
+ <mcsim> braunr: ok
+
+
+### IRC, freenode, #hurd, 2012-11-05
+
+ <tschwinge> braunr: I have now also hit the diff != 0 assertion error;
+ sitting in KDB, waiting for your commands.
+ <braunr> tschwinge: can you check the backtrace, have a look at the system
+ call and its parameters like youpi did ?
+ <tschwinge> If I manage to figure out how to do that... :-)
+ * tschwinge goes read scrollback.
+ <braunr> "trace" i suppose
+ <braunr> if running inside qemu, you can use the integrated gdb server
+ <tschwinge> braunr: No, hardware. And work intervened. And mobile phone
+ <-> laptop via bluetooth didn't work. But now:
+ <tschwinge> Pretty similar to Samuel's:
+ <tschwinge> Assert([...])
+ <tschwinge> vm_map_enter(0xc11de6c8, 0xc1785f94, 0, 0, 1)
+ <tschwinge> vm_map(0xc11de6c8, 0xc1785f94, 0, 0, 1)
+ <tschwinge> syscall_vm_map(1, 0x1024a88, 0, 0, 1)
+ <tschwinge> mach_call_call(1, 0x1024a88, 0, 0, 1)
+ <braunr> thanks
+ <braunr> same as youpi observed, the requested size for the mapping is 0
+ <braunr> tschwinge: thanks
+ <tschwinge> braunr: Anything else you'd like to see before I reboot?
+ <braunr> tschwinge: no, that's enough for now, and the other kind of info
+ i'd like are much more difficult to obtain
+ <braunr> if we still have the problem once a small patch to prevent null
+ size is applied, then it'll be worth looking more into it
+ <pinotree> isn't it possible to find out who called with that size?
+ <braunr> not easy, no
+ <braunr> it's also likely that the call that fails isn't the first one
+ <pinotree> ah sure
+ <pinotree> braunr: making mmap reject 0 size length could help? posix says
+ such size should be rejected straight away
+ <braunr> 17:09 < braunr> if we still have the problem once a small patch to
+ prevent null size is applied, then it'll be worth looking more into it
+ <braunr> that's the idea
+ <braunr> making faulty processes choke on it should work fine :)
+ <pinotree> «If len is zero, mmap() shall fail and no mapping shall be
+ established.»
+ <pinotree> braunr: should i cook up such patch for mmap?
+ <braunr> no, the change must be applied in gnumach
+    <pinotree> sure, but that could simplify such condition in mmap (ie
+      avoiding calling io_map on a file)
+ <braunr> such calls are erroneous and rare, i don't see the need
+ <pinotree> ok
+ <braunr> i bet it comes from the exec server anyway :p
+ <tschwinge> braunr: Is the mmap with size 0 already a reproducible testcase
+ you can use for the diff != 0 assertion?
+ <tschwinge> Otherwise I'd have a reproducer now.
+ <braunr> tschwinge: i'm not sure but probably yes
+ <tschwinge> braunr: Otherwise, take GDB sources, then: gcc -fsplit-stack
+ gdb/testsuite/gdb.base/morestack.c && ./a.out
+ <tschwinge> I have not looked what exactly this does; I think -fsplit-stack
+ is not really implemented for us (needs something in libgcc we might not
+ have), is on my GCC TODO list already.
+ <braunr> tschwinge: interesting too :)
+
+
+### IRC, freenode, #hurd, 2012-11-19
+
+ <tschwinge> braunr: Hmm, I have now hit the diff != 0 GNU Mach assertion
+ failure during some GCC invocation (GCC testsuite) that does not relate
+ to -fsplit-stack (as the others before always have).
+ <tschwinge> Reproduced:
+ /media/erich/home/thomas/tmp/gcc/hurd/master.build/gcc/xgcc
+ -B/media/erich/home/thomas/tmp/gcc/hurd/master.build/gcc/
+ /home/thomas/tmp/gcc/hurd/master/gcc/testsuite/gcc.dg/torture/pr42878-1.c
+ -fno-diagnostics-show-caret -O2 -flto -fuse-linker-plugin
+ -fno-fat-lto-objects -fcompare-debug -S -o pr42878-1.s
+ <tschwinge> Will check whether it's the same backtrace in GNU Mach.
+ <tschwinge> Yes, same.
+ <braunr> tschwinge: as youpi seems quite busy these days, i'll cook a patch
+ and commit it directly
+ <tschwinge> braunr: Thanks! I have, by the way, confirmed that the
+ following is enough to trigger the issue: vm_map(mach_task_self(), 0, 0,
+ 0, 1, 0, 0, 0, 0, 0, 0);
+ <tschwinge> ... and before the allocator patch, GNU Mach did accept that
+ and return 0 -- though I did not check what effect it actually has. (And
+ I don't think it has any useful one.) I'm also reading that as of lately
+ (Linux 2.6.12), mmap (length = 0) is to return EINVAL, which I think is
+ the foremost user of vm_map.
+ <pinotree> tschwinge: posix too says to return EINVAL for length = 0
+ <braunr> yes, we checked that earlier with youpi
+
+[[!message-id "87sj8522zx.fsf@kepler.schwinge.homeip.net"]].
+
+ <braunr> tschwinge: well, actually your patch is what i had in mind
+ (although i'd like one in vm_map_enter to catch wrong kernel requests
+ too)
+ <braunr> tschwinge: i'll work on it tonight, and do some testing to make
+ sure we don't regress critical stuff (exec is another major direct user
+ of vm_map iirc)
+ <tschwinge> braunr: Oh, OK. :-)
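+
+The reproducer from this discussion, wrapped into a complete program
+(sketch): with the size check applied, the zero-size request fails cleanly
+instead of tripping the `diff != 0` assertion.
+
+    #include <mach.h>
+    #include <stdio.h>
+
+    int
+    main (void)
+    {
+      /* Zero-size mapping request; POSIX mmap would return EINVAL.  */
+      kern_return_t kr = vm_map (mach_task_self (), 0, 0, 0, 1,
+                                 0, 0, 0, 0, 0, 0);
+      printf ("vm_map (size = 0) returned %d\n", kr);
+      return 0;
+    }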
diff --git a/open_issues/implementing_hurd_on_top_of_another_system.mdwn b/open_issues/implementing_hurd_on_top_of_another_system.mdwn
index 95b71ebb..220c69cc 100644
--- a/open_issues/implementing_hurd_on_top_of_another_system.mdwn
+++ b/open_issues/implementing_hurd_on_top_of_another_system.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation,
+Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -15,9 +16,12 @@ One obvious variant is [[emulation]] (using [[hurd/running/QEMU]], for
example), but
doing that does not really integrate the Hurd guest into the host system.
There is also a more direct way, more powerful, but it also has certain
-requirements to do it effectively:
+requirements to do it effectively.
-IRC, #hurd, August / September 2010
+See also [[Mach_on_top_of_POSIX]].
+
+
+# IRC, freenode, #hurd, August / September 2010
<marcusb> silver_hook: the Hurd can also refer to the interfaces of the
filesystems etc, and a lot of that is really just server/client APIs that
@@ -56,7 +60,7 @@ IRC, #hurd, August / September 2010
<marcusb> ArneBab: in fact, John Tobey did this a couple of years ago, or
started it
-([[tschwinge]] has tarballs of John's work.)
+[[Mach_on_top_of_POSIX]].
<marcusb> ArneBab: or you can just implement parts of it and relay to Linux
for the rest
@@ -64,11 +68,10 @@ IRC, #hurd, August / September 2010
are sufficiently happy with the translator stuff, it's not hard to bring
the Hurd to Linux or BSD
-Continue reading about the [[benefits of a native Hurd implementation]].
+Continue reading about the [[benefits_of_a_native_Hurd_implementation]].
----
-IRC, #hurd, 2010-12-28
+# IRC, freenode, #hurd, 2010-12-28
<antrik> kilobug: there is no real requirement for the Hurd to run on a
microkernel... as long as the important mechanisms are provided (most
@@ -79,9 +82,8 @@ IRC, #hurd, 2010-12-28
Hurd on top of a monolithic kernel would actually be a useful approach
for the time being...
----
-IRC, #hurd, 2011-02-11
+# IRC, freenode, #hurd, 2011-02-11
<neal> marcus and I were discussing how to add Mach to Linux
<neal> one could write a module to implement Mach IPC
@@ -115,3 +117,303 @@ IRC, #hurd, 2011-02-11
<neal> I'm unlikely to work on it, sorry
<antrik> didn't really expect that :-)
<antrik> would be nice though if you could write up your conclusions...
+
+
+# IRC, freenode, #hurd, 2012-10-12
+
+ <peo-xaci> do hurd system libraries make raw system calls ever
+ (i.e. inlined syscall() / raw assembly)?
+ <braunr> sure
+ <peo-xaci> hmm, so a hurd emulation layer would need to use ptrace if it
+ should be fool proof? :/
+ <braunr> there is no real need for raw assembly, and the very syscalls are
+ all available through macros
+ <braunr> hum what are you trying to say ?
+ <peo-xaci> well, if they are done through syscall, as a function, not a
+ macro, then they can be intercepted with LD_PRELOAD
+ <peo-xaci> so applications that do Hurd (Mach?) syscalls could work on
+ f.e. Linux, if a special libc is injected into the program with
+ LD_PRELOAD
+ <peo-xaci> same thing with making standard Linux-applications go through
+ the Hurd emulation layer
+ <peo-xaci> without recompilation
+ <mel-_> peo-xaci: the second direction is implemented in glibc.
+ <mel-_> for the other direction, I personally see little use for it
+ <braunr> peo-xaci: ok i misunderstood
+ <braunr> peo-xaci: i don't think there is any truely direct syscall usage
+ in the hurd
+ <peo-xaci> hmm, I'm not sure I understand what directions you are referring
+ to mel-_
+ <braunr> peo-xaci: what are you trying to achieve ?
+    <peo-xaci> I want to make the Hurd design more accessible by letting
+      Hurd applications run on the Linux kernel, preferably without
+ recompilation. This would be done with a daemon that implements Mach and
+ which all syscalls would go to.
+ <peo-xaci> then, I also want so that standard Linux applications can go
+ through that Mach daemon as well, if a special libc is preloaded
+ <braunr> you might want to discuss this with antrik
+ <peo-xaci> what I'm trying to figure out specifically is if there is some
+ library/interface that glue Hurd with Mach and would be better suited to
+ emulate than Mach? Mach seems to be more of an implementation detail to
+ the hurd and not something an application would directly use.
+ <braunr> yes, the various hurd libraries (libports and libpager mostly)
+ <peo-xaci> From [http://www.gnu.org/software/hurd/hurd/libports.html]:
+ "libports is not (at least, not for now) a generalization / abstraction
+ of Mach ports to the functionality the Hurd needs, that is, it is not
+ meant to provide an interface independently of the underlying
+ microkernel."
+ <peo-xaci> Is this still true?
+ <peo-xaci> Does libpager abstract the rest?
+ <peo-xaci> (and the other hurd libraries)
+ <braunr> there is nothing that really abstracts the hurd from mach
+ <braunr> for example, reference counting often happens here and there
+ <braunr> and core libraries like glibc and libpthread heavily rely on it
+ (through sysdeps specific code though)
+ <braunr> libports and libpager are meant to simplify object manipulation
+ for the former, and pager operations for the latter
+ <peo-xaci> and applications, such as translators, often use Mach interfaces
+ directly?
+ <peo-xaci> correct?
+ <braunr> depends on what often means
+ <braunr> let's say they do
+ <peo-xaci> :/ then it probably is better to emulate Mach after all
+ <braunr> there was a mach on posix port a long time ago
+ <peo-xaci> I thought applications were completely separated from the
+ microkernel in use by the Hurd
+ <braunr> that level of abstraction is pretty new
+ <braunr> genode is the only system i know which does that
+
+[[microkernel/Genode]].
+
+ <braunr> and it's still for "l4 variants"
+ <pinotree> ah, thanks (i forgot that name)
+ <antrik> braunr: Genode also runs on Linux and a few other non-L4
+ environments IIRC
+ <antrik> peo-xaci: I'm not sure binary emulation is really useful. rather,
+ I'd recompile stuff as "regular" Linux executables, only using a special
+ glibc
+ <antrik> where the special glibc could be basically a port of the Hurd
+ glibc communicating with the Mach emulation instead of real Mach; or it
+ could do emulation at a higher level
+ <antrik> a higher level emulation would be more complicated to implement,
+ but more efficient, and allow better integration with the ordinary
+ GNU/Linux environment
+ <antrik> also note that any regular program could be recompiled against the
+ HELL glibc to run in the Hurdish environment...
+ <antrik> (well, glibc + hurd server libraries)
+ <peo-xaci> I'm willing to accept that Hurd-application would need to be
+ recompiled to work on the HELL
+ <peo-xaci> but not Linux-applications :)
+ <antrik> peo-xaci: if you happen to understand German, there is a fairly
+ good overview in my thesis report ;-)
+ <antrik> peo-xaci: there are no "Hurd applications" or "Linux applications"
+ <peo-xaci> well, let me define what I mean by the terms: Hurd applications
+ use Hurd-specific interfaces/syscalls, and Linux applications use
+ Linux-specific interfaces/syscalls
+ <antrik> a few programs use Linux-specific interfaces (and we probably
+ can't run them in HELL just as we can't run them on actual Hurd); but all
+ other programs work in any glibc environment
+ <antrik> (usually in any POSIX environment in fact...)
+ <antrik> peo-xaci: no sane application uses syscalls
+ <peo-xaci> they do under the hood
+ <peo-xaci> I have read about inlined syscalls
+ <antrik> again, there are *some* applications using Linux-specific
+ interfaces (sometimes because they are inherently bound to Linux
+ features, sometimes unnecessarily)
+ <antrik> so far there are no applications using Hurd-specific interfaces
+ <peo-xaci> translators do?
+ <peo-xaci> they are standard executables are they not?
+ <peo-xaci> I would like so that translators also can be run in the HELL
+ <antrik> I wouldn't consider them applications. all existing translators
+ are pretty much components of the Hurd itself
+ <peo-xaci> okay, it's a question about semantics, perhaps I should use
+ another word than "applications" :)
+ <peo-xaci> for me, applications are what have a main-function, or similar
+ single entry point
+ <braunr> hum
+ <braunr> that's not a good enough definition
+ <antrik> anyways, as I said, I think recompiling translators against a
+ Hurdish glibc and ported translator libraries seems the most reasonable
+ approach to me
+ <braunr> let's say applications are userspace processes that make use of
+ services provided by the operating system
+ <braunr> translators being part of the operating system here
+ <antrik> braunr: do you know whether the Mach-on-POSIX was actually
+ functional, or just an abandoned experiment?...
+ <antrik> (I don't remember hearing of it before...)
+ <braunr> incomplete iirc
+ <peo-xaci> braunr: still, when I've explained what I meant, even if I used
+ the wrong term, then my previous statements should come in another light
+ <peo-xaci> antrik / braunr: are you still interested in hearing my
+ thoughts/ideas about HELL?
+ <antrik> oh, there is more to come? ;-)
+ <peo-xaci> yes! I don't think I have made myself completely understood :/
+ <peo-xaci> what I envision is a HELL system that works on as low level as
+ feasible, to make it possible to do almost anything that can be done on
+ the real Hurd (except possibly testing hardware drivers and such very low
+ level stuff).
+ <braunr> sure
+ <peo-xaci> I want it to be more than just allowing programs to access a
+ virtual filesystem à la FUSE. My idea is that all user space system
+ libraries/programs of the Hurd should be inside the HELL as well, and
+ they should not be emulated.
+ <peo-xaci> The system should at the very least be API compatible, so at the
+ very most a recompilation is necessary.
+ <peo-xaci> I also want so that GNU/Linux-programs can access the features
+ of the HELL with little effort on the user. At most perhaps a script that
+ wraps LD_PRELOADing has to be run on the binary. Best would be if it
+ could work also with insane assembly programs using raw system calls, or
+ if glibc happens to have some well hidden syscall being inlined to raw
+ assembly code.
+ <peo-xaci> And I think I have an idea on how an implementation could
+ satisfy these things!
+ <peo-xaci> By modifying the kernel and replace those syscalls that make
+ sense for the Hurd/Mach
+ <peo-xaci> with "the kernel", I meant Linux
+ <braunr> it's possible but tedious and not very useful so better do that
+ later
+ <braunr> mach did something similar at its time
+ <braunr> there was a syscall emulation library
+ <peo-xaci> but isn't it about as much work as emulating the interface on
+ user-level?
+ <braunr> and the kernel cooperated so that unmodified unix binaries
+ performing syscalls would actually jump to functions provided by that
+ library, which generally made an RPC
+    <peo-xaci> instead of a bunch of extern declarations, one would put the
+      symbols in the syscall table
+ <braunr> define what "those syscalls that make sense for the Hurd/Mach"
+ actually means
+ <peo-xaci> open/close, for example
+ <braunr> otherwise i don't see another better way than what the old mach
+ folks did
+ <braunr> well, with that old, but existing support, your open would perform
+ a syscall
+ <braunr> the kernel would catch it and redirect the caller to its syscall
+ emulation library
+ <braunr> which would call the open RPC instead
+ <peo-xaci> wait, so this "existing support" you're talking about; is this a
+ module for the Linux kernel (or a fork, or something else)?
+ <peo-xaci> where can I find it?
+ <braunr> no
+ <braunr> it was for mach
+ <braunr> in order to run unmodified unix binaries
+ <braunr> the opposite of what you're trying to do
+ <peo-xaci> ah okay
+ <braunr> well
+ <braunr> not really either :)
+    <peo-xaci> does posix/unix define a standard for what a syscall table
+      should look like, to allow binary syscall compatibility?
+ <braunr> absolutely not
+ <peo-xaci> so how could this mach module run any unmodified unix binary? if
+ they expected different sys calls at different offsets?
+    <braunr> posix specifically (and very early) states that it almost
+      forbids itself to deal with anything regarding ABIs
+ <braunr> depends
+ <braunr> since it was old, there weren't that many unix systems
+ <braunr> and even today, there are techniques like those used by netbsd
+ (and many other actually)
+ <braunr> that are able to inspect the binary and load a syscall emulation
+ environment depending on its exposed ABI
+ <braunr> e.g. file on an executable states which system it's for
+ <peo-xaci> hmm, I'm not sure how a kernel would implement that in
+ practice.. I thought these things were so hard coded and dependent on raw
+ memory reads that it would not be possible
+ <braunr> but i really think it's not worth the time for your project
+ <peo-xaci> to be honest I have virtually no experience of practical kernel
+ programming
+ <braunr> with an LDT on x86 for example
+ <braunr> no, there is really not that much hardcoded
+ <braunr> quite the contrary
+ <braunr> there is a lot of runtime detection today
+ <peo-xaci> well I mean how the syscall table is read
+ <braunr> it's not read
+ <peo-xaci> it's read to find the function pointer to the syscall handler in
+ the kernel?
+ <braunr> no
+ <braunr> that's the really basic approach
+ <braunr> (and in practice it can happen of course)
+ <braunr> what really happens is that, for example, on linux, the user space
+ system call code is loaded as a virtual shared library
+ <braunr> use ldd on an executable to see it
+ <braunr> this virtual object provides code that, depending on what the
+ kernel has detected, will use the appropriate method to perform a system
+ call
+ <peo-xaci> but these user space system calls need to make some kind of cpu
+ interrupt to communicate with the kernel, right?
+ <braunr> the glibc itself has no idea what a system call will look like in
+ the end
+ <braunr> yes
+ <peo-xaci> an assembler programmer would be able to get around this glue
+ code?
+ <braunr> that's precisely what is embedded in this virtual library
+ <braunr> it could yes
+ <braunr> i think even when sysenter/sysexit is supported, legacy traps are
+ still implemented to support old binaries
+ <braunr> but then all these different entry points will lead to the same
+ code inside the kernel
+ <peo-xaci> but when the glue code is used, then it's API compatible, and
+ then I can understand that the kernel can allow different syscall
+ implementations for different executables
+ <braunr> what glue code ?
+ <peo-xaci> what you talked about above "the user space system call code is
+ loaded as a virtual shared library"
+ <braunr> let's call it vdso
+ <braunr> i have to leave in a few minutes
+ <braunr> keep going, i'll read later
+ <peo-xaci> thanks, I looked it up on Wikipedia and understand immediately
+ :P
+ <peo-xaci> so VDSOs are provided by the kernel, not a regular library file,
+ right?
+ <vdox2> What does HELL stand for :) ?
+ <dardevelin> vdox2, Hurd Emulation Layer for Linux
+ <vdox2> dardevelin: thanks
+ <braunr> peo-xaci: yes
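+
+A minimal illustration of the kind of binary such glue cannot help with: on
+i386 GNU/Linux, a program may trap into the kernel directly, bypassing glibc
+and the vDSO entirely.  (Sketch only; it relies on the well-known i386 Linux
+trap convention: syscall number in `%eax`, arguments in `%ebx`/`%ecx`/`%edx`.)
+
+    /* Raw i386 GNU/Linux system call, bypassing glibc and the vDSO.
+       Binaries doing this are what raw system call redirection would
+       have to catch.  */
+    #include <sys/syscall.h>
+
+    static long
+    raw_write (int fd, const void *buf, unsigned long len)
+    {
+      long ret;
+      /* Legacy trap entry point; the vDSO would pick sysenter/syscall
+         instead when the CPU supports it.  */
+      asm volatile ("int $0x80"
+                    : "=a" (ret)
+                    : "0" (SYS_write), "b" (fd), "c" (buf), "d" (len)
+                    : "memory");
+      return ret;
+    }
+
+    int
+    main (void)
+    {
+      raw_write (1, "hello\n", 6);
+      return 0;
+    }
+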
+ <antrik> peo-xaci: I believe your goals are conflicting. a low-level
+ implementation makes it basically impossible to interact between the HELL
+ environment and the GNU/Linux environment in any meaningful way. to allow
+ such interaction, you *have* to have some glue at a higher semantic level
+ <braunr> agreed
+ <antrik> peo-xaci: BTW, if you want regular Linux binaries to get somehow
+ redirected to access HELL facilities, there is already a framework (don't
+ remember the name right now) that allows this kind of system call
+ redirection on Linux
+ <antrik> (it can run both through LD_PRELOAD or as a kernel module -- where
+ obviously only the latter would allow raw system call redirection... but
+ TBH, I don't think that's worthwhile anyways. the rare cases where
+ programs use raw system calls are usually for extremely system-specific
+ stuff anyways...)
+ <antrik> ViewOS is the name
+ <antrik> err... View-OS I mean
+ <antrik> or maybe View OS ? ;-)
+ <antrik> whatever, you'll find it :-)
+
+[[Virtual_Square_View-OS]].
+
+ <antrik> I'm not sure it's really worthwhile to use this either
+ though... the most meaningful interaction is probably at the FS level,
+ and that can be done with FUSE
+ <antrik> OHOH, View-OS probably allows doing more interesting stuff that
+ FUSE, such as modyfing the way the VFS works...
+ <antrik> OTOH
+ <antrik> so it could expose more of the Hurd features, at least in theory
+
+
+## IRC, freenode, #hurd, 2012-10-13
+
+ <peo-xaci> antrik / braunr: thanks for your input! I'm not entirely
+ convinced though. :) I will probably return to this project once I have
+ acquired a lot more knowledge about low level stuff. I want to see for
+ myself whether a low level HELL is not feasible. :P
+ <braunr> peo-xaci: what's the point of a low level hell ?
+ <peo-xaci> more Hurd code can be tested in the hell, if the hell is at a
+ low level
+ <peo-xaci> at a higher level, some Hurd code cannot run, because the
+ interfaces they use would not be accessible from the higher level
+ emulation
+ <antrik> peo-xaci: I never said it's not possible. I actually said it would
+ be easier to do. I just said you can't do it low level *and* have
+ meaningful interaction with the host system
+ <peo-xaci> I don't understand why
+ <braunr> peo-xaci: i really don't see what you want to achieve with low
+ level support
+ <braunr> what would be unavailable with a higher level approach ?
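+
+The higher-level redirection discussed above can be sketched with a plain
+`LD_PRELOAD` interposer.  This only illustrates the technique; the `/hell/`
+prefix and the redirection hook are invented for the example.
+
+    /* hell_open.c
+       build: gcc -shared -fPIC hell_open.c -o hell_open.so -ldl
+       use:   LD_PRELOAD=./hell_open.so some-gnu-linux-binary  */
+    #define _GNU_SOURCE
+    #include <dlfcn.h>
+    #include <fcntl.h>
+    #include <stdarg.h>
+    #include <string.h>
+    #include <sys/types.h>
+
+    int
+    open (const char *path, int flags, ...)
+    {
+      static int (*real_open) (const char *, int, ...);
+      mode_t mode = 0;
+
+      if (real_open == NULL)
+        real_open = (int (*) (const char *, int, ...))
+          dlsym (RTLD_NEXT, "open");
+
+      if (flags & O_CREAT)
+        {
+          va_list ap;
+          va_start (ap, flags);
+          mode = va_arg (ap, int);
+          va_end (ap);
+        }
+
+      if (strncmp (path, "/hell/", 6) == 0)
+        {
+          /* Hand the request over to the emulation layer here.  */
+        }
+
+      return real_open (path, flags, mode);
+    }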
diff --git a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
index 80fc9fcd..57eb403d 100644
--- a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
+++ b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation,
+Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -104,3 +105,11 @@ License|/fdl]]."]]"""]]
of embedding it ?
<braunr> right
<antrik> now that's a good question... no idea TBH :-)
+
+
+# IRC, freenode, #hurd, 2012-07-23
+
+ <pinotree> aren't libmachuser and libhurduser supposed to be slowly faded
+ out?
+ <tschwinge> pinotree: That discussion has not yet come to a conclusion, I
+ think. (I'd say: yes.)
diff --git a/open_issues/libpager_deadlock.mdwn b/open_issues/libpager_deadlock.mdwn
new file mode 100644
index 00000000..017ecff6
--- /dev/null
+++ b/open_issues/libpager_deadlock.mdwn
@@ -0,0 +1,165 @@
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+Deadlocks have been found in libpager's periodic sync code.
+
+
+# [[gnumach_page_cache_policy]]
+
+
+## IRC, freenode, #hurd, 2012-07-12
+
+ <braunr> ah great, a paper about the mach pageout daemon !
+ <mcsim> braunr: Where is paper about the mach pageout daemon?
+ <braunr> ftp://ftp.cs.cmu.edu/project/mach/doc/published/defaultmm.ps
+ <braunr> might give us a clue about the swap deadlock (although i still
+ have a few ideas to check)
+ <braunr>
+ http://www.sceen.net/~rbraun/moving_the_default_memory_manager_out_of_the_mach_kernel.pdf
+ <braunr> we should more seriously consider sergio's advisory pageout branch
+ some day
+ <braunr> i'll try to get in touch with him about that before he completely
+ looses interest
+ <braunr> i'll include it in my "make that page cache as decent as possible"
+ task
+ <braunr> many of his comments match what i've seen
+ <braunr> and we both did a few optimizations the same way
+ <braunr> (like not deactivating pages when they enter the cache)
+
+
+## IRC, freenode, #hurd, 2012-07-13
+
+ <braunr> antrik: i'm able to consistently reproduce the swap deadlocks you
+ regularly had when using apt with my page cache patch
+ <braunr> it happens when lots of dirty pages are write back to their pagers
+ <braunr> so apt, or a big file copy or anything that writes several MiB
+ very quickly is a good candidate
+ <braunr> written*
+ <antrik> braunr: nice...
+ <braunr> antrik: well in a way, yes, as it will allow us to track it more
+ easily
+
+
+## IRC, freenode, #hurd, 2012-07-15
+
+ <braunr> oh btw, i think i can say with confidence that the hurd *doesn't*
+ deadlock
+ <braunr> (at least, concerning swapping)
+ <braunr> lol, one of my hurd systems has been hitting the "swap deadlock"
+ for more than an hour, and suddenly got out of it
+ <braunr> something is really wrong in the pageout daemon, but it's not a
+ deadlock
+ <youpi> a livelock then
+ <braunr> do you get out of livelocks ?
+ <braunr> i mean, it's not even a "lock"
+ <braunr> just a big damn tricky slowdown
+ <youpi> yes, you can, by giving a few more resources for instance
+ <youpi> depends on the kind of livelock of course
+ <braunr> i think it's that
+ <braunr> the pageout daemon clearly throttles itself, waiting for pagers to
+ complete
+ <braunr> and another dangerous thing is the line in vm_resident, which only
+ wakes one thread to avoid starvation
+ <braunr> hum, during the livelock, the kernel spends much time waiting in
+ db_read_address
+ <braunr> could be a bad stack
+ <braunr> so, the pageout daemon seems to slow itself down, waiting as much
+ as several seconds between each iteration when under load
+ <braunr> but each iteration possibly removes clean pages
+ <braunr> so at some point, there is enough memory to unblock waiting pagers
+ <braunr> for now i'll try a simple solution, like limiting the pausing
+ delay
+ <braunr> but we'll need more page lists in the future (inactive-clean,
+ inactive-dirty, etc..)
+ <braunr> limiting the amount of dirty pages is the only way to really make
+ it safe actually
+ <braunr> wow, the pageout loop is still running even after many pages were
+ freed, and is unable to free more pages
+ <braunr> i think i have an idea about the livelock
+ <braunr> i think it comes from the periodic syncing
+ <bddebian> Too often?
+ <braunr> that's not the problem
+ <braunr> the problem is that it can happen at the same time with paging
+ <bddebian> Oh
+ <braunr> if paging gets slow, it won't stop the periodic syncing
+ <braunr> which will grab any page it can as soon as some are free
+ <braunr> but then, before it even finishes, another sync may occur
+ <braunr> i have yet to check that it is possible
+ <braunr> and i don't understand why syncing isn't done by the kernel
+ <braunr> the kernel is supposed to handle the paging policy
+ <braunr> and it would make paging really scale
+ <bddebian> It's done on the Hurd side?
+ <braunr> (instead of having external pagers make one request for each
+ object, even if they're clean)
+ <braunr> yes
+ <bddebian> Hmm, interesting
+ <braunr> ofc, with ext2fs --debug, i can't reproduce anything
+ <bddebian> Ugh
+ <braunr> sync are serialized
+ <braunr> grmbl
+ <braunr> there is a big lock taken at sync time though
+ <braunr> uhg
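+
+The "simple solution" mentioned above can be pictured as a bounded back-off:
+pause between pageout iterations when no progress is made, but never longer
+than a fixed limit, so the daemon keeps reclaiming clean pages while pagers
+are slow.  This is an illustrative sketch only, not the actual GNU Mach
+code; all names and values are invented.
+
+    /* Illustrative only -- not vm_pageout.c.  */
+    #define PAGEOUT_PAUSE_MIN   10  /* made-up values, in milliseconds */
+    #define PAGEOUT_PAUSE_MAX  500
+
+    static unsigned int pageout_pause = PAGEOUT_PAUSE_MIN;
+
+    static void
+    pageout_iteration_done (int pages_freed)
+    {
+      if (pages_freed > 0)
+        pageout_pause = PAGEOUT_PAUSE_MIN;  /* progress: reset the delay */
+      else if (pageout_pause < PAGEOUT_PAUSE_MAX)
+        pageout_pause *= 2;  /* back off, but only up to the bound */
+
+      pageout_sleep (pageout_pause);  /* hypothetical sleep primitive */
+    }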
+
+
+## IRC, freenode, #hurd, 2012-07-16
+
+ <braunr> all right so, there *is* a deadlock, and it may be due to the
+ default pager actually
+ <braunr> the vm_page_laundry_count doesn't decrease at some point, even
+ when there are more than enough free pages
+ <braunr> antrik: the thing is, i think the deadlock concerns the default
+ pager
+ <antrik> the deadlock?
+ <braunr> yes
+ <braunr> when swapping
+
+
+## IRC, freenode, #hurd, 2012-07-17
+
+ <braunr> i can't even reproduce the swap deadlock when using upstrea ext2fs
+ :(
+ <braunr> upstream*
+
+
+## IRC, freenode, #hurd, 2012-07-19
+
+ <braunr> the libpager deadlock patch looks wrong to me
+ <braunr> hm no, the libpager patch is ok acually
+
+
+## [[synchronous_ipc]]
+
+### IRC, freenode, #hurd, 2012-07-20
+
+ <braunr> but actually after reviewing more, the debian patch for this
+ particular issue seems correct
+ <antrik> well, it's most probably done by youpi, so I would be shocked if
+ it wasn't correct... ;-)
+ <braunr> he wasn't sure at all about it
+ <antrik> still ;-)
+ <braunr> :)
+ <antrik> well, if you also think it's correct, I guess it's time to push it
+ upstream...
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <braunr> i still can't conclude if we have any pageout deadlock, or if it's
+ simply a side effect of the active and inactive lists getting very very
+ large
+ <braunr> but almost every time this issue happens, it somehow recovers,
+ sometimes hours later
+
+
+# See Also
+
+ * [[ext2fs_deadlock]]
diff --git a/open_issues/libpthread.mdwn b/open_issues/libpthread.mdwn
index c5054b7f..befc1378 100644
--- a/open_issues/libpthread.mdwn
+++ b/open_issues/libpthread.mdwn
@@ -42,3 +42,1287 @@ There is a [[!FF_project 275]][[!tag bounty]] on this task.
<youpi> there'll still be the issue that only one will be initialized
<youpi> and one that provides libc thread safety functions, etc.
<pinotree> that's what i wanted to knew, thanks :)
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <bddebian> So I am not sure what to do with the hurd_condition_wait stuff
+ <braunr> i would also like to know what's the real issue with cancellation
+ here
+ <braunr> because my understanding is that libpthread already implements it
+ <braunr> does it look ok to you to make hurd_condition_timedwait return an
+ errno code (like ETIMEDOUT and ECANCELED) ?
+ <youpi> braunr: that's what pthread_* function usually do, yes
+ <braunr> i thought they used their own code
+ <youpi> no
+ <braunr> thanks
+ <braunr> well, first, do you understand what hurd_condition_wait is ?
+ <braunr> it's similar to condition_wait or pthread_cond_wait with a subtle
+ difference
+ <braunr> it differs from the original cthreads version by handling
+ cancellation
+ <braunr> but it also differs from the second by how it handles cancellation
+ <braunr> instead of calling registered cleanup routines and leaving, it
+ returns an error code
+ <braunr> (well simply !0 in this case)
+ <braunr> so there are two ways
+ <braunr> first, change the call to pthread_cond_wait
+ <bddebian> Are you saying we could fix stuff to use pthread_cond_wait()
+ properly?
+ <braunr> it's possible but not easy
+ <braunr> because you'd have to rewrite the cancellation code
+ <braunr> probably writing cleanup routines
+ <braunr> this can be hard and error prone
+ <braunr> and is useless if the code already exists
+ <braunr> so it seems reasonable to keep this hurd extension
+ <braunr> but now, as it *is* a hurd extension no one else uses
+ <antrik> braunr: BTW, when trying to figure out a tricky problem with the
+ auth server, cfhammer dug into the RPC cancellation code quite a bit,
+ and it's really a horrible complex monstrosity... plus the whole concept
+ is actually broken in some regards I think -- though I don't remember the
+ details
+ <braunr> antrik: i had the same kind of thoughts
+ <braunr> antrik: the hurd or pthreads ones ?
+ <antrik> not sure what you mean. I mean the RPC cancellation code -- which
+ involves thread management too
+ <braunr> ok
+ <antrik> I don't know how it is related to hurd_condition_wait though
+ <braunr> well i found two main entry points there
+ <braunr> hurd_thread_cancel and hurd_condition_wait
+ <braunr> and it didn't look that bad
+ <braunr> whereas in the pthreads code, there are many corner cases
+ <braunr> and even the standard itself looks insane
+ <antrik> well, perhaps the threading part is not that bad...
+ <antrik> it's not where we saw the problems at any rate :-)
+ <braunr> rpc interruption maybe ?
+ <antrik> oh, right... interruption is probably the right term
+ <braunr> yes that thing looks scary
+ <braunr> :))
+ <braunr> the migration thread paper mentions some things about the problems
+ concerning threads controllability
+ <antrik> I believe it's a very strong example for why building around
+ standard Mach features is a bad idea, instead of adapting the primitives
+ to our actual needs...
+ <braunr> i wouldn't be surprised if the "monstrosities" are work arounds
+ <braunr> right
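+
+For reference, the usage pattern under discussion, as it appears all over
+the Hurd servers (sketch; `input_ready` and the lock/condition pair are
+stand-ins for whatever state the server is actually waiting on):
+
+    /* cthreads-era pattern: block on a condition, but let
+       hurd_thread_cancel interrupt the wait, and reply EINTR then.  */
+    #include <cthreads.h>
+    #include <errno.h>
+
+    extern int hurd_condition_wait (condition_t c, mutex_t m);
+    extern int input_ready (void);  /* stand-in predicate */
+
+    static struct mutex global_lock;
+    static struct condition input_cond;
+
+    error_t
+    wait_for_input (void)
+    {
+      error_t err = 0;
+
+      mutex_lock (&global_lock);
+      while (!input_ready () && !err)
+        if (hurd_condition_wait (&input_cond, &global_lock))
+          err = EINTR;  /* cancelled: unwind and reply */
+      mutex_unlock (&global_lock);
+      return err;
+    }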
+
+
+## IRC, freenode, #hurd, 2012-07-26
+
+ <bddebian> Uhm, where does /usr/include/hurd/signal.h come from?
+ <pinotree> head -n4 /usr/include/hurd/signal.
+ <pinotree> h
+ <bddebian> Ohh glibc?
+ <bddebian> That makes things a little more difficult :(
+ <braunr> why ?
+ <bddebian> Hurd includes it which brings in cthreads
+ <braunr> ?
+ <braunr> the hurd already brings in cthreads
+ <braunr> i don't see what you mean
+ <bddebian> Not anymore :)
+ <braunr> the system cthreads header ?
+ <braunr> well it's not that difficult to trick the compiler not to include
+ them
+ <bddebian> signal.h includes cthreads.h I need to stop that
+ <braunr> just define the _CTHREADS_ macro before including anything
+ <braunr> remember that header files are normally enclosed in such macros to
+ avoid multiple inclusions
+ <braunr> this isn't specific to cthreads
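+
+Spelled out, the trick is (assuming `_CTHREADS_` really is the guard macro
+used by the cthreads.h in question):
+
+    /* Pre-defining the include guard turns any later inclusion of
+       cthreads.h into a no-op.  */
+    #define _CTHREADS_ 1
+    #include <hurd/signal.h>  /* would otherwise pull in cthreads.h */
+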
+ <pinotree> converting hurd from cthreads to pthreads will make hurd and
+ glibc break source and binary compatibility
+ <bddebian> Of course
+ <braunr> reminds me of the similar issues of the late 90s
+ <bddebian> Ugh, why is he using _pthread_self()?
+ <pinotree> maybe because it accesses to the internals
+ <braunr> "he" ?
+ <bddebian> Thomas in his modified cancel-cond.c
+ <braunr> well, you need the internals to implement it
+ <braunr> hurd_condition_wait is similar to pthread_condition_wait, except
+ that instead of stopping the thread and calling cleanup routines, it
+ returns 1 if cancelled
+ <pinotree> not that i looked at it, but there's really no way to implement
+ it using public api?
+ <bddebian> Even if I am using glibc pthreads?
+ <braunr> unlikely
+ <bddebian> God I had all of this worked out before I dropped off for a
+ couple years.. :(
+ <braunr> this will come back :p
+ <pinotree> that makes you the perfect guy to work on it ;)
+ <bddebian> I can't find a pt-internal.h anywhere.. :(
+ <pinotree> clone the hurd/libpthread.git repo from savannah
+ <bddebian> Of course when I was doing this libpthread was still in hurd
+ sources...
+ <bddebian> So if I am using glibc pthread, why can't I use pthread_self()
+ instead?
+ <pinotree> that won't give you access to the internals
+ <bddebian> OK, dumb question time. What internals?
+ <pinotree> the libpthread ones
+ <braunr> that's where you will find if your thread has been cancelled or
+ not
+ <bddebian> pinotree: But isn't that assuming that I am using hurd's
+ libpthread?
+ <pinotree> if you aren't inside libpthread, no
+ <braunr> pthread_self is normally not portable
+ <braunr> you can only use it with pthread_equal
+ <braunr> so unless you *know* the internals, you can't use it
+ <braunr> and you won't be able to do much
+ <braunr> so, as it was done with cthreads, hurd_condition_wait should be
+ close to the libpthread implementation
+ <braunr> inside, normally
+ <braunr> now, if it's too long for you (i assume you don't want to build
+ glibc)
+ <braunr> you can just implement it outside, grabbing the internal headers
+ for now
+ <pinotree> another "not that i looked at it" question: isn't there no way
+ to rewrite the code using that custom condwait stuff to use the standard
+ libpthread one?
+ <braunr> and once it works, it'll get integrated
+ <braunr> pinotree: it looks very hard
+ <bddebian> braunr: But the internal headers are assuming hurd libpthread
+ which isn't in the source anymore
+ <braunr> from what i could see while working on select, servers very often
+ call hurd_condition_wait
+ <braunr> and they return EINTR if cancelled
+ <braunr> so if you use the standard pthread_cond_wait function, your thread
+ won't be able to return anything, unless you push the reply in a
+ completely separate callback
+ <braunr> i'm not sure how well mig can cope with that
+ <braunr> i'd say it can't :)
+ <braunr> no really it looks ugly
+ <braunr> it's far better to have this hurd specific function and keep the
+ existing user code as it is
+ <braunr> bddebian: you don't need the implementation, only the headers
+ <braunr> the thread, cond, mutex structures mostly
+ <bddebian> I should turn <pt-internal.h> to "pt-internal.h" and just put it
+ in libshouldbelibc, no?
+ <pinotree> no, that header is not installed
+ <bddebian> Obviously not the "best" way
+ <bddebian> pinotree: ??
+ <braunr> pinotree: what does it change ?
+ <pinotree> braunr: it == ?
+ <braunr> bddebian: you could even copy it entirely in your new
+ cancel-cond.C and mention where it was copied from
+ <braunr> pinotree: it == pt-internal.H not being installed
+ <pinotree> that he cannot include it in libshouldbelibc sources?
+ <pinotree> ah, he wants to copy it?
+ <braunr> yes
+ <braunr> i want him to copy it actually :p
+ <braunr> it may be hard if there are a lot of macro options
+ <pinotree> the __pthread struct changes size and content depending on other
+ internal sysdeps headers
+ <braunr> well he needs to copy those too :p
+ <bddebian> Well even if this works we are going to have to do something
+ more "correct" about hurd_condition_wait. Maybe even putting it in
+ glibc?
+ <braunr> sure
+ <braunr> but again, don't waste time on this for now
+ <braunr> make it *work*, then it'll get integrated
+ <bddebian> Like it has already? This "patch" is only about 5 years old
+ now... ;-P
+ <braunr> but is it complete ?
+ <bddebian> Probably not :)
+ <bddebian> Hmm, I wonder how many undefined references I am going to get
+ though.. :(
+ <bddebian> Shit, 5
+ <bddebian> One of which is ___pthread_self.. :(
+ <bddebian> Does that mean I am actually going to have to build hurds
+ libpthreads in libshouldbeinlibc?
+ <bddebian> Seriously, do I really need ___pthread_self, __pthread_self,
+ _pthread_self and pthread_self???
+ <bddebian> I'm still unclear what to do with cancel-cond.c. It seems to me
+ that if I leave it the way it is currently I am going to have to either
+ re-add libpthreads or still all of the libpthreads code under
+ libshouldbeinlibc.
+ <braunr> then add it in libc
+ <braunr> glib
+ <braunr> glibc
+ <braunr> maybe under the name __hurd_condition_wait
+ <bddebian> Shouldn't I be able to interrupt cancel-cond stuff to use glibc
+ pthreads?
+ <braunr> interrupt ?
+ <bddebian> Meaning interject like they are doing. I may be missing the
+ point but they are just obfuscating libpthreads thread with some other
+ "namespace"? (I know my terminology is wrong, sorry).
+ <braunr> they ?
+ <bddebian> Well Thomas in this case but even in the old cthreads code,
+ whoever wrote cancel-cond.c
+ <braunr> but they use internal thread structures ..
+ <bddebian> Understood but at some level they are still just getting to a
+ libpthread thread, no?
+ <braunr> absolutely not ..
+ <braunr> there is *no* pthread stuff in the hurd
+ <braunr> that's the problem :p
+ <bddebian> Bah damnit...
+ <braunr> cthreads are directly implement on top of mach threads
+ <braunr> implemeneted*
+ <braunr> implemented*
+ <bddebian> Sure but hurd_condition_wait wasn't
+ <braunr> of course it is
+ <braunr> it's almost the same as condition_wait
+ <braunr> but returns 1 if a cancelation request was made
+ <bddebian> Grr, maybe I am just confusing myself because I am looking at
+ the modified (pthreads) version instead of the original cthreads version
+ of cancel-cond.c
+ <braunr> well if the modified version is fine, why not directly use that ?
+ <braunr> normally, hurd_condition_wait should sit next to other pthread
+ internal stuff
+ <braunr> it could be renamed __hurd_condition_wait, i'm not sure
+ <braunr> that's irrelevant for your work anyway
+ <bddebian> I am using it but it relies on libpthread and I am trying to use
+ glibc pthreads
+ <braunr> hum
+ <braunr> what's the difference between libpthread and "glibc pthreads" ?
+ <braunr> aren't glibc pthreads the merged libpthread ?
+ <bddebian> quite possibly but then I am missing something obvious. I'm
+ getting ___pthread_self in libshouldbeinlibc but it is *UND*
+ <braunr> bddebian: with unmodified binaries ?
+ <bddebian> braunr: No I added cancel-cond.c to libshouldbeinlibc
+ <bddebian> And some of the pt-xxx.h headers
+ <braunr> well it's normal then
+ <braunr> i suppose
+ <bddebian> braunr: So how do I get those defined without including
+ pthreads.c from libpthreads? :)
+ <antrik> pinotree: hm... I think we should try to make sure glibc works
+ both with cthreads hurd and pthreads hurd. I hope that shouldn't be so
+ hard.
+ <antrik> breaking binary compatibility for the Hurd libs is not too
+ terrible I'd say -- as much as I'd like that, we do not exactly have a
+ lot of external stuff depending on them :-)
+ <braunr> bddebian: *sigh*
+ <braunr> bddebian: just add cancel-cond to glibc, near the pthread code :p
+ <bddebian> braunr: Wouldn't I still have the same issue?
+ <braunr> bddebian: what issue ?
+ <antrik> is hurd_condition_wait() the name of the original cthreads-based
+ function?
+ <braunr> antrik: the original is condition_wait
+ <antrik> I'm confused
+ <antrik> is condition_wait() a standard cthreads function, or a
+ Hurd-specific extension?
+ <braunr> antrik: as standard as you can get for something like cthreads
+ <bddebian> braunr: Where hurd_condition_wait is looking for "internals" as
+ you call them. I.E. there is no __pthread_self() in glibc pthreads :)
+ <braunr> hurd_condition_wait is the hurd-specific addition for cancelation
+ <braunr> bddebian: who cares ?
+ <braunr> bddebian: there is a pthread structure, and conditions, and
+ mutexes
+ <braunr> you need those definitions
+ <braunr> so you either import them in the hurd
+ <antrik> braunr: so hurd_condition_wait() *is* also used in the original
+ cthread-based implementation?
+ <braunr> or you write your code directly where they're available
+ <braunr> antrik: what do you call "original" ?
+ <antrik> not transitioned to pthreads
+ <braunr> ok, let's simply call that cthreads
+ <braunr> yes, it's used by every hurd server
+ <braunr> virtually
+ <braunr> if not really everyone of them
+ <bddebian> braunr: That is where you are losing me. If I can just use
+ glibc pthreads structures, why can't I just use them in the new pthreads
+ version of cancel-cond.c which is what I was originally asking.. :)
+ <braunr> you *have* to do that
+ <braunr> but then, you have to build the whole glibc
+ * bddebian shoots himself
+ <braunr> and i was under the impression you wanted to avoid that
+ <antrik> do any standard pthread functions use identical names to any
+ standard cthread functions?
+ <braunr> what you *can't* do is use the standard pthreads interface
+ <braunr> no, not identical
+ <braunr> but very close
+ <braunr> bddebian: there is a difference between using pthreads, which
+ means using the standard posix interface, and using the glibc pthreads
+ structure, which means toying with the internal implementation
+ <braunr> you *cannot* implement hurd_condition_wait with the standard posix
+ interface, you need to use the internal structures
+ <braunr> hurd_condition_wait is actually a shurd specific addition to the
+ threading library
+ <braunr> hurd*
+ <antrik> well, in that case, the new pthread-based variant of
+ hurd_condition_wait() should also use a different name from the
+ cthread-based one
+ <braunr> so it's normal to put it in that threading library, like it was
+ done for cthreads
+ <braunr> 21:35 < braunr> it could be renamed __hurd_condition_wait, i'm not
+ sure
+ <bddebian> Except that I am trying to avoid using that threading library
+ <braunr> what ?
+ <bddebian> If I am understanding you correctly it is an extention to the
+ hurd specific libpthreads?
+ <braunr> to the threading library, whichever it is
+ <braunr> antrik: although, why not keeping the same name ?
+ <antrik> braunr: I don't think having hurd_condition_wait() for the cthread
+ variant and __hurd_condition_wait() would exactly help clarity...
+ <antrik> I was talking about a really new name. something like
+ pthread_hurd_condition_wait() or so
+ <antrik> braunr: to avoid confusion. to avoid accidentally pulling in the
+ wrong one at build and/or runtime.
+ <antrik> to avoid possible namespace conflicts
+ <braunr> ok
+ <braunr> well yes, makes sense
+ <bddebian> braunr: Let me state this as plainly as I hope I can. If I want
+ to use glibc's pthreads, I have no choice but to add it to glibc?
+ <braunr> and pthread_hurd_condition_wait is a fine name
+ <braunr> bddebian: no
+ <braunr> bddebian: you either add it there
+ <braunr> bddebian: or you copy the headers defining the internal structures
+ somewhere else and implement it there
+ <braunr> but adding it to glibc is better
+ <braunr> it's just longer in the beginning, and now i'm working on it, i'm
+ really not sure
+ <braunr> add it to glibc directly :p
+ <bddebian> That's what I am trying to do but the headers use pthread
+ specific stuff which should be coming from glibc's pthreads
+ <braunr> yes
+ <braunr> well it's not the headers you need
+ <braunr> you need the internal structure definitions
+ <braunr> sometimes they're in c files for opacity
+ <bddebian> So ___pthread_self() should eventually be an obfuscation of
+ glibcs pthread_self(), no?
+ <braunr> i don't know what it is
+ <braunr> read the cthreads variant of hurd_condition_wait, understand it,
+ do the same for pthreads
+ <braunr> it's easy :p
+ <bddebian> For you bastards that have a clue!! ;-P
+ <antrik> I definitely vote for adding it to the hurd pthreads
+ implementation in glibc right away. trying to do it externally only adds
+ unnecessary complications
+ <antrik> and we seem to agree that this new pthread function should be
+ named pthread_hurd_condition_wait(), not just hurd_condition_wait() :-)
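+
+To make the contract concrete, here is a deliberately simplified sketch of
+what the pthread-based variant has to do, using the `_np` name that shows up
+later in these logs.  This is not the code that was eventually merged: the
+real version sits next to the libpthread internals precisely so it can
+enqueue on the condition and check for cancellation atomically, whereas
+this sketch has a window between the check and the wait (and assumes a
+single waiter).
+
+    #include <pthread.h>
+    #include <hurd/signal.h>
+
+    static pthread_cond_t *cancel_cond;  /* assumption: one wait at a time */
+
+    static void
+    cancel_hook (void)
+    {
+      /* Called by hurd_thread_cancel: kick the waiter out of its wait.  */
+      pthread_cond_broadcast (cancel_cond);
+    }
+
+    int
+    pthread_hurd_cond_wait_np (pthread_cond_t *cond, pthread_mutex_t *mutex)
+    {
+      struct hurd_sigstate *ss = _hurd_self_sigstate ();
+      int cancel;
+
+      __spin_lock (&ss->lock);
+      cancel_cond = cond;
+      ss->cancel_hook = cancel_hook;
+      cancel = ss->cancel;
+      ss->cancel = 0;
+      __spin_unlock (&ss->lock);
+
+      if (!cancel)
+        pthread_cond_wait (cond, mutex);  /* may be woken by the hook */
+
+      __spin_lock (&ss->lock);
+      ss->cancel_hook = NULL;
+      cancel |= ss->cancel;
+      ss->cancel = 0;
+      __spin_unlock (&ss->lock);
+
+      return cancel;  /* nonzero: the caller typically replies EINTR */
+    }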
+
+
+## IRC, freenode, #hurd, 2012-07-27
+
+ <bddebian> OK this hurd_condition_wait stuff is getting ridiculous the way
+ I am trying to tackle it. :( I think I need a new tactic.
+ <braunr> bddebian: what do you mean ?
+ <bddebian> braunr: I know I am thick headed but I still don't get why I
+ cannot implement it in libshouldbeinlibc for now but still use glibc
+ pthreads internals
+ <bddebian> I thought I was getting close last night by bringing in all of
+ the hurd pthread headers and .c files but it just keeps getting uglier
+ and uglier
+ <bddebian> youpi: Just to verify. The /usr/lib/i386-gnu/libpthread.so that
+ ships with Debian now is from glibc, NOT libpthreads from Hurd right?
+ Everything I need should be available in glibc's libpthreads? (Except for
+ hurd_condition_wait obviously).
+ <braunr> 22:35 < antrik> I definitely vote for adding it to the hurd
+ pthreads implementation in glibc right away. trying to do it externally
+ only adds unnecessary complications
+ <youpi> bddebian: yes
+ <youpi> same as antrik
+ <bddebian> fuck
+ <youpi> libpthread *already* provides some odd symbols (cthread
+ compatibility), it can provide others
+ <braunr> bddebian: don't curse :p it will be easier in the long run
+ * bddebian breaks out glibc :(
+ <braunr> but you should tell thomas that too
+ <bddebian> braunr: I know it just adds a level of complexity that I may not
+ be able to deal with
+ <braunr> we wouldn't want him to waste too much time on the external
+ libpthread
+ <braunr> which one ?
+ <bddebian> glibc for one. hurd_condition_wait() for another which I don't
+ have a great grasp on. Remember my knowledge/skillsets are limited
+ currently.
+ <braunr> bddebian: tschwinge has good instructions to build glibc
+ <braunr> keep your tree around and it shouldn't be long to hack on it
+ <braunr> for hurd_condition_wait, i can help
+ <bddebian> Oh I was thinking about using Debian glibc for now. You think I
+ should do it from git?
+ <braunr> no
+ <braunr> debian rules are even more reliable
+ <braunr> (just don't build all the variants)
+ <pinotree> `debian/rules build_libc` builds the plain i386 variant only
+ <bddebian> So put pthread_hurd_cond_wait in it's own .c file or just put it
+ in pt-cond-wait.c ?
+ <braunr> i'd put it in pt-cond-wait.C
+ <bddebian> youpi or braunr: OK, another dumb question. What (if anything)
+ should I do about hurd/hurd/signal.h. Should I stop it from including
+ cthreads?
+ <youpi> it's not a dumb question. it should probably stop, yes, but there
+ might be uncovered issues, which we'll have to take care of
+ <bddebian> Well I know antrik suggested trying to keep compatibility but I
+ don't see how you would do that
+ <braunr> compability between what ?
+ <braunr> and source and/or binary ?
+ <youpi> hurd/signal.h implicitly including cthreads.h
+ <braunr> ah
+ <braunr> well yes, it has to change obviously
+ <bddebian> Which will break all the cthreads stuff of course
+ <bddebian> So are we agreeing on pthread_hurd_cond_wait()?
+ <braunr> that's fine
+ <bddebian> Ugh, shit there is stuff in glibc using cthreads??
+ <braunr> like what ?
+ <bddebian> hurdsig, hurdsock, setauth, dtable, ...
+ <youpi> it's just using the compatibility stuff, that pthread does provide
+ <bddebian> but it includes cthreads.h implicitly
+ <bddebian> s/it/they in many cases
+ <youpi> not a problem, we provide the functions
+ <bddebian> Hmm, then what do I do about signal.h? It includes cthreads.h
+ because it uses extern struct mutex ...
+ <youpi> ah, then keep the include
+ <youpi> the pthread mutexes are compatible with that
+ <youpi> we'll clean that afterwards
+ <bddebian> arf, OK
+ <youpi> that's what I meant by "uncover issues"
+
+
+## IRC, freenode, #hurd, 2012-07-28
+
+ <bddebian> Well crap, glibc built but I have no symbol for
+ pthread_hurd_cond_wait in libpthread.so :(
+ <bddebian> Hmm, I wonder if I have to add pthread_hurd_cond_wait to
+ forward.c and Versions? (Versions obviously eventually)
+ <pinotree> bddebian: most probably not about forward.c, but definitely you
+ have to export public stuff using Versions
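+
+The Versions entry in question would look roughly like this (the exact
+version node was still to be decided at this point; `GLIBC_2.13` below is
+only a placeholder):
+
+    libpthread {
+      GLIBC_2.13 {
+        pthread_hurd_cond_wait_np;
+      }
+    }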
+
+
+## IRC, freenode, #hurd, 2012-07-29
+
+ <bddebian> braunr: http://paste.debian.net/181078/
+ <braunr> ugh, inline functions :/
+ <braunr> "Tell hurd_thread_cancel how to unblock us"
+ <braunr> i think you need that one too :p
+ <bddebian> ??
+ <braunr> well, they work in pair
+ <braunr> one cancels, the other notices it
+ <braunr> hurd_thread_cancel is in the hurd though, iirc
+ <braunr> or uh wait
+ <braunr> no it's in glibc, hurd/thread-cancel.c
+ <braunr> otherwise it looks like a correct reuse of the original code, but
+ i need to understand the pthreads internals better to really say anything
+
+
+## IRC, freenode, #hurd, 2012-08-03
+
+ <braunr> pinotree: what do you think of
+ condition_implies/condition_unimplies ?
+ <braunr> the work on pthread will have to replace those
+
+
+## IRC, freenode, #hurd, 2012-08-06
+
+ <braunr> bddebian: so, where is the work being done ?
+ <bddebian> braunr: Right now I would just like to test getting my glibc
+ with pthread_hurd_cond_wait installed on the clubber subhurd. It is in
+ /home/bdefreese/glibc-debian2
+ <braunr> we need a git branch
+ <bddebian> braunr: Then I want to rebuild hurd with Thomas's pthread
+ patches against that new libc
+ <bddebian> Aye
+ <braunr> i don't remember, did thomas set a git repository somewhere for
+ that ?
+ <bddebian> He has one but I didn't have much luck with it since he is using
+ an external libpthreads
+ <braunr> i can manage the branches
+ <bddebian> I was actually patching debian/hurd then adding his patches on
+ top of that. It is in /home/bdefreese/debian-hurd but he has updated
+ some stuff since then
+ <bddebian> Well we need to agree on a strategy. libpthreads only exists in
+ debian/glibc
+ <braunr> it would be better to have something upstream than to work on a
+ debian specific branch :/
+ <braunr> tschwinge: do you think it can be done
+ <braunr> ?
+
+
+## IRC, freenode, #hurd, 2012-08-07
+
+ <tschwinge> braunr: You mean to create on Savannah branches for the
+ libpthread conversion? Sure -- that's what I have been suggesting to
+ Barry and Thomas D. all the time.
+
+ <bddebian> braunr: OK, so I installed my glibc with
+ pthread_hurd_condition_wait in the subhurd and now I have built Debian
+ Hurd with Thomas D's pthread patches.
+ <braunr> bddebian: i'm not sure we're ready for tests yet :p
+ <bddebian> braunr: Why not? :)
+ <braunr> bddebian: a few important bits are missing
+ <bddebian> braunr: Like?
+ <braunr> like condition_implies
+ <braunr> i'm not sure they have been handled everywhere
+ <braunr> it's still interesting to try, but i bet your system won't finish
+ booting
+ <bddebian> Well I haven't "installed" the built hurd yet
+ <bddebian> I was trying to think of a way to test a little bit first, like
+ maybe ext2fs.static or something
+ <bddebian> Ohh, it actually mounted the partition
+ <bddebian> How would I actually "test" it?
+ <braunr> git clone :p
+ <braunr> building a debian package inside
+ <braunr> removing the whole content after
+ <braunr> that sort of things
+ <bddebian> Hmm, I think I killed clubber :(
+ <bddebian> Yep.. Crap! :(
+ <braunr> ?
+ <braunr> how did you do that ?
+ <bddebian> Mounted a new partition with the pthreads ext2fs.static then did
+ an apt-get source hurd to it..
+ <braunr> what partition, and what mount point ?
+ <bddebian> I added a new 2Gb partition on /dev/hd0s6 and set the translator
+ on /home/bdefreese/part6
+ <braunr> shouldn't kill your hurd
+ <bddebian> Well it might still be up but killed my ssh session at the very
+ least :)
+ <braunr> ouch
+ <bddebian> braunr: Do you have debugging enabled in that custom kernel you
+ installed? Apparently it is sitting at the debug prompt.
+
+
+## IRC, freenode, #hurd, 2012-08-12
+
+ <braunr> hmm, it seems the hurd notion of cancellation is actually not the
+ pthread one at all
+ <braunr> pthread_cancel merely marks a thread as being cancelled, while
+ hurd_thread_cancel interrupts it
+ <braunr> ok, i have a pthread_hurd_cond_wait_np function in glibc
+
+
+## IRC, freenode, #hurd, 2012-08-13
+
+ <braunr> nice, i got ext2fs to work with pthreads
+ <braunr> there are issues with the stack size strongly limiting the number
+ of concurrent threads, but that's easy to fix
+ <braunr> one problem with the hurd side is the condition implications
+ <braunr> i think it should be dealt with separately, and before doing
+ anything with pthreads
+ <braunr> but that's minor, the most complex part is, again, the term server
+ <braunr> other than that, it was pretty easy to do
+ <braunr> but, i shouldn't speak too soon, who knows what tricky bootstrap
+ issue i'm gonna face ;p
+ <braunr> tschwinge: i'd like to know how i should proceed if i want a
+ symbol in a library overriden by that of a main executable
+ <braunr> e.g. have libpthread define a default stack size, and let
+ executables define their own if they want to change it
+ <braunr> tschwinge: i suppose i should create a weak alias in the library
+ and a normal variable in the executable, right ?
+ <braunr> hm i'm making this too complicated
+ <braunr> don't mind that stupid question
+ <tschwinge> braunr: A simple variable definition would do, too, I think?
+ <tschwinge> braunr: Anyway, I'd first like to know why we can't reduce the
+ size of libpthread threads from 2 MiB to 64 KiB as libthreads had. Is
+ that a requirement of the pthread specification?
+ <braunr> tschwinge: it's a requirement yes
+ <braunr> the main reason i see is that hurd threadvars (which are still
+ present) rely on common stack sizes and alignment to work
+ <tschwinge> Mhm, I see.
+ <braunr> so for now, i'm using this approach as a hack only
+ <tschwinge> I'm working on phasing out threadvars, but we're not there yet.
+ <tschwinge> Yes, that's fine for the moment.
+ <braunr> tschwinge: a simple definition wouldn't work
+ <braunr> tschwinge: i resorted to a weak symbol, and see how it goes
+ <braunr> tschwinge: i supposed i need to export my symbol as a global one,
+ otherwise making it weak makes no sense, right ?
+ <braunr> suppose*
+ <braunr> tschwinge: also, i'm not actually sure what you meant is a
+ requirement about the stack size, i shouldn't have answered right away
+ <braunr> no there is actually no requirement
+ <braunr> i misunderstood your question
+ <braunr> hm when adding this weak variable, starting a program segfaults :(
+ <braunr> apparently on ___pthread_self, a tls variable
+ <braunr> fighting black magic begins
+ <braunr> arg, i can't manage to use that weak symbol to reduce stack sizes
+ :(
+ <braunr> ah yes, finally
+ <braunr> git clone /path/to/glibc.git on a pthread-powered ext2fs server :>
+ <braunr> tschwinge: seems i have problems using __thread in hurd code
+ <braunr> tschwinge: they produce undefined symbols
+ <braunr> tschwinge: forget that, another mistake on my part
+ <braunr> so, current state: i just need to create another patch, for the
+ code that is included in the debian hurd package but not in the upstream
+ hurd repository (e.g. procfs, netdde), and i should be able to create
+ hurd packages that completely use pthreads
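+
+The weak-symbol scheme for the stack size boils down to this (the symbol
+name is invented for the illustration):
+
+    /* In libpthread: a weak definition provides the default.  */
+    #include <stddef.h>
+    size_t __pthread_default_stacksize __attribute__ ((weak)) = 64 * 1024;
+
+    /* In an executable that wants bigger stacks: a strong definition of
+       the same global symbol wins over the weak one at link time.  */
+    size_t __pthread_default_stacksize = 8 * 1024 * 1024;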
+
+
+## IRC, freenode, #hurd, 2012-08-14
+
+ <braunr> tschwinge: i have weird bootstrap issues, as expected
+ <braunr> tschwinge: can you point me to important files involved during
+ bootstrap ?
+ <braunr> my ext2fs.static server refuses to start as a rootfs, whereas it
+ seems to work fine otherwise
+ <braunr> hm, it looks like it's related to global signal dispositions
+
+
+## IRC, freenode, #hurd, 2012-08-15
+
+ <braunr> ahah, a subhurd running pthreads-powered hurd servers only
+ <LarstiQ> braunr: \o/
+ <braunr> i can even long on ssh
+ <braunr> log
+ <braunr> pinotree: for reference, i uploaded my debian-specific changes
+ there :
+ <braunr> http://git.sceen.net/rbraun/debian_hurd.git/
+ <braunr> darnassus is now running a pthreads-enabled hurd system :)
+
+
+## IRC, freenode, #hurd, 2012-08-16
+
+ <braunr> my pthreads-enabled hurd systems can quickly die under load
+ <braunr> youpi: with hurd servers using pthreads, i occasionally see thread
+ storms apparently due to a deadlock
+ <braunr> youpi: it makes me think of the problem you sometimes have (and
+ had often with the page cache patch)
+ <braunr> in cthreads, mutex and condition operations are macros, and they
+ check the mutex/condition queue without holding the internal
+ mutex/condition lock
+ <braunr> i'm not sure where this can lead to, but it doesn't seem right
+ <pinotree> isn't that a bit dangerous?
+ <braunr> i believe it is
+ <braunr> i mean
+ <braunr> it looks dangerous
+ <braunr> but it may be perfectly safe
+ <pinotree> could it be?
+ <braunr> aiui, it's an optimization, e.g. "don't take the internal lock if
+ there are no threads to wake"
+ <braunr> but if there is a thread enqueuing itself at the same time, it
+ might not be woken
+ <pinotree> yeah
+ <braunr> pthreads don't have this issue
+ <braunr> and what i see looks like a deadlock
+ <pinotree> anything can happen between the unlocked checking and the
+ following instruction
+ <braunr> so i'm not sure how a situation working around a faulty
+ implementation would result in a deadlock with a correct one
+ <braunr> on the other hand, the error youpi reported
+ (http://lists.gnu.org/archive/html/bug-hurd/2012-07/msg00051.html) seems
+ to indicate something is deeply wrong with libports
+ <pinotree> it could also be the current code does not really "works around"
+ that, but simply implicitly relies on the so-generated behaviour
+ <braunr> luckily not often
+ <braunr> maybe
+ <braunr> i think we have to find and fix these issues before moving to
+ pthreads entirely
+ <braunr> (ofc, using pthreads to trigger those bugs is a good procedure)
+ <pinotree> indeed
+ <braunr> i wonder if tweaking the error checking mode of pthreads to abort
+ on EDEADLK is a good approach to detecting this problem
+ <braunr> let's try !
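+
+The error checking mode referred to is standard pthreads: an ERRORCHECK
+mutex returns EDEADLK on a relock instead of hanging, so asserting on that
+error turns (some) deadlocks into immediate, debuggable aborts.
+
+    #include <assert.h>
+    #include <errno.h>
+    #include <pthread.h>
+
+    static pthread_mutex_t lock;
+
+    static void
+    lock_init (void)
+    {
+      pthread_mutexattr_t attr;
+
+      pthread_mutexattr_init (&attr);
+      pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_ERRORCHECK);
+      pthread_mutex_init (&lock, &attr);
+      pthread_mutexattr_destroy (&attr);
+    }
+
+    static void
+    lock_acquire (void)
+    {
+      int err = pthread_mutex_lock (&lock);
+      assert (err != EDEADLK);  /* abort loudly instead of hanging */
+    }
+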
+ <braunr> youpi: eh, i think i've spotted the libports ref mistake
+ <youpi> ooo!
+ <youpi> .oOo.!!
+ <gnu_srs> Same problem but different patches
+ <braunr> look at libports/bucket-iterate.c
+ <braunr> in the HURD_IHASH_ITERATE loop, pi->refcnt is incremented without
+ a lock
+ <youpi> Mmm, the incrementation itself would probably be compiled into an
+ INC, which is safe in UP
+ <youpi> it's an add currently actually
+ <youpi> 0x00004343 <+163>: addl $0x1,0x4(%edi)
+ <braunr> 40c4: 83 47 04 01 addl $0x1,0x4(%edi)
+ <youpi> that makes it SMP unsafe, but not UP unsafe
+ <braunr> right
+ <braunr> too bad
+ <youpi> that still deserves fixing :)
+ <braunr> the good side is my mind is already wired for smp
+ <youpi> well, it's actually not UP either
+ <youpi> in general
+ <youpi> when the processor is not able to do the add in one instruction
+ <braunr> sure
+ <braunr> youpi: looks like i'm wrong, refcnt is protected by the global
+ libports lock
+ <youpi> braunr: but aren't there pieces of code which manipulate the refcnt
+ while taking another lock than the global libports lock
+ <youpi> it'd not be scalable to use the global libports lock to protect
+ refcnt
+ <braunr> youpi: imo, the scalability issues are present because global
+ locks are taken all the time, indeed
+ <youpi> urgl
+ <braunr> yes ..
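+
+For reference, had refcnt really been updated outside the global lock, the
+SMP-safe fix would be an atomic increment (GCC builtin shown; the struct is
+stripped down for the example):
+
+    struct port_info { int refcnt; /* other members elided */ };
+
+    static inline void
+    port_ref (struct port_info *pi)
+    {
+      /* Emits a LOCK-prefixed add on x86, safe on SMP, unlike the
+         plain addl generated for pi->refcnt++.  */
+      __sync_fetch_and_add (&pi->refcnt, 1);
+    }
+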
+ <braunr> when enabling mutex checks in libpthread, pfinet dies :/
+ <braunr> grmbl, when trying to start "ls" using my deadlock-detection
+ libpthread, the terminal gets unresponsive, and i can't even use ps .. :(
+ <pinotree> braunr: one could say your deadlock detection works too
+ good... :P
+ <braunr> pinotree: no, i made a mistake :p
+ <braunr> it works now :)
+ <braunr> well, works is a bit fast
+ <braunr> i can't attach gdb now :(
+ <braunr> *sigh*
+ <braunr> i guess i'd better revert to a cthreads hurd and debug from there
+ <braunr> eh, with my deadlock-detection changes, recursive mutexes are now
+ failing on _pthread_self(), which for some obscure reason generates this
+ <braunr> => 0x0107223b <+283>: jmp 0x107223b
+ <__pthread_mutex_timedlock_internal+283>
+ <braunr> *sigh*
+
+
+## IRC, freenode, #hurd, 2012-08-17
+
+ <braunr> aw, the thread storm i see isn't a deadlock
+ <braunr> seems to be mere contention ....
+ <braunr> youpi: what do you think of the way
+ ports_manage_port_operations_multithread determines it needs to spawn a
+ new thread ?
+ <braunr> it grabs a lock protecting the number of threads to determine if
+ it needs a new thread
+ <braunr> then releases it, to retake it right after if a new thread must be
+ created
+ <braunr> aiui, it could lead to a situation where many threads could
+ determine they need to create threads
+ <youpi> braunr: there's no reason to release the spinlock before re-taking
+ it
+ <youpi> that can indeed lead to too much thread creations
+ <braunr> youpi: a harder question
+ <braunr> youpi: what if thread creation fails ? :/
+ <braunr> if i'm right, hurd servers simply never expect thread creation to
+ fail
+ <youpi> indeed
+ <braunr> and as some patterns have threads blocking until another produce
+ an event
+ <braunr> i'm not sure there is any point handling the failure at all :/
+ <youpi> well, at least produce some output
+ <braunr> i added a perror
+ <youpi> so we know that happened
+ <braunr> async messaging is quite evil actually
+ <braunr> the bug i sometimes have with pfinet is usually triggered by
+ fakeroot
+ <braunr> it seems to use select a lot
+ <braunr> and select often destroys ports when it has something to return to
+ the caller
+ <braunr> which creates dead name notifications
+ <braunr> and if done often enough, a lot of them
+ <youpi> uh
+ <braunr> and as pfinet is creating threads to service new messages, already
+ existing threads are starved and can't continue
+ <braunr> which leads to pfinet exhausting its address space with thread
+ stacks (at about 30k threads)
+ <braunr> i initially thought it was a deadlock, but my modified libpthread
+ didn't detect one, and indeed, after i killed fakeroot (the whole
+ dpkg-buildpackage process hierarchy), pfinet just "cooled down"
+ <braunr> with almost all 30k threads simply waiting for requests to
+ service, and the few expected select calls blocking (a few ssh sessions,
+ exim probably, possibly others)
+ <braunr> i wonder why this doesn't happen with cthreads
+ <youpi> there's a 4k guard between stacks, otherwise I don't see anything
+ obvious
+ <braunr> i'll test my pthreads package with the fixed
+ ports_manage_port_operations_multithread
+ <braunr> but even if this "fix" should reduce thread creation, it doesn't
+ prevent the starvation i observed
+ <braunr> evil concurrency :p
+
+ <braunr> youpi: hm i've just spotted an important difference actually
+ <braunr> youpi: glibc sched_yield is __swtch(), cthreads is
+ thread_switch(MACH_PORT_NULL, SWITCH_OPTION_DEPRESS, 10)
+ <braunr> i'll change the glibc implementation, see how it affects the whole
+ system
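+
+The two yield flavours compared above are both Mach traps; the difference
+is that `thread_switch` with `SWITCH_OPTION_DEPRESS` really hands the
+processor over, by depressing the caller's priority for a while.
+
+    #include <mach.h>
+    #include <mach/thread_switch.h>
+
+    static void
+    yield_glibc_style (void)
+    {
+      swtch ();  /* offer the CPU; returns at once if nobody is runnable */
+    }
+
+    static void
+    yield_cthreads_style (void)
+    {
+      /* Also depress our priority for 10 ms, so lower-priority threads
+         actually get a chance to run.  */
+      thread_switch (MACH_PORT_NULL, SWITCH_OPTION_DEPRESS, 10);
+    }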
+
+ <braunr> youpi: do you think bootsting the priority or cancellation
+ requests is an acceptable workaround ?
+ <braunr> boosting
+ <braunr> of*
+ <youpi> workaround for what?
+ <braunr> youpi: the starvation i described earlier
+ <youpi> well, I guess I'm not into the thing enough to understand
+ <youpi> you meant the dead port notifications, right?
+ <braunr> yes
+ <braunr> they are the cancellation triggers
+ <youpi> cancelling whaT?
+ <braunr> a blocking select for example
+ <braunr> ports_do_mach_notify_dead_name -> ports_dead_name ->
+ ports_interrupt_notified_rpcs -> hurd_thread_cancel
+ <braunr> so it's important they are processed quickly, to allow blocking
+ threads to unblock, reply, and be recycled
+ <youpi> you mean the threads in pfinet?
+ <braunr> the issue applies to all servers, but yes
+ <youpi> k
+ <youpi> well, it can not not be useful :)
+ <braunr> whatever the choice, it seems to be there will be a security issue
+ (a denial of service of some kind)
+ <youpi> well, it's not only in that case
+ <youpi> you can always queue a lot of requests to a server
+ <braunr> sure, i'm just focusing on this particular problem
+ <braunr> hm
+ <braunr> max POLICY_TIMESHARE or min POLICY_FIXEDPRI ?
+ <braunr> i'd say POLICY_TIMESHARE just in case
+ <braunr> (and i'm not sure mach handles fixed priority threads first
+ actually :/)
+ <braunr> hm my current hack which consists of calling swtch_pri(0) from a
+ freshly created thread seems to do the job eh
+ <braunr> (it may be what cthreads unintentionally does by acquiring a spin
+ lock from the entry function)
+ <braunr> not a single issue any more with this hack
+ <bddebian> Nice
+ <braunr> bddebian: well it's a hack :p
+ <braunr> and the problem is that, in order to boost a thread's priority,
+ one would need to implement that in libpthread
+ <bddebian> there isn't thread priority in libpthread?
+ <braunr> it's not implemented
+ <bddebian> Interesting
+ <braunr> if you want to do it, be my guest :p
+ <braunr> mach should provide the basic stuff for a partial implementation
+ <braunr> but for now, i'll fall back on the hack, because that's what
+ cthreads "does", and it's "reliable enough"
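+
+The hack itself is a single line in the thread entry point (cthreads got a
+comparable effect by accident, by taking a spin lock when the thread
+starts):
+
+    #include <mach.h>
+
+    static void *
+    server_thread_entry (void *arg)
+    {
+      /* Yield immediately, with a priority depression, so threads that
+         are already busy (holding locks, about to reply) run first.  */
+      swtch_pri (0);
+
+      /* ... normal message service loop ...  */
+      return NULL;
+    }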
+
+ <antrik> braunr: I don't think the locking approach in
+ ports_manage_port_operations_multithread() could cause issues. the worst
+ that can happen is that some other thread becomes idle between the check
+ and creating a new thread -- and I can't think of a situation where this
+ could have any impact...
+ <braunr> antrik: hm ?
+ <braunr> the worst case is that many threads will evaluate spawn to 1 and
+ create threads, whereas only one of them should have
+ <antrik> braunr: I'm not sure perror() is a good way to handle the
+ situation where thread creation failed. this would usually happen because
+ of resource shortage, right? in that case, it should work in non-debug
+ builds too
+ <braunr> perror isn't specific to debug builds
+ <braunr> i'm building glibc packages with a pthreads-enabled hurd :>
+ <braunr> (which at one point ran the test allocating and filling 2 GiB of
+ memory, which passed)
+ <braunr> (with a kernel using a 3/1 split of course, swap usage reached
+ something like 1.6 GiB)
+ <antrik> braunr: BTW, I think the observation that thread storms tend to
+ happen on destroying stuff more than on creating stuff has been made
+ before...
+ <braunr> ok
+ <antrik> braunr: you are right about perror() of course. brain fart -- was
+ thinking about assert_perror()
+ <antrik> (which is misused in some places in existing Hurd code...)
+ <antrik> braunr: I still don't see the issue with the "spawn"
+ locking... the only situation where this code can be executed
+ concurrently is when multiple threads are idle and handling incoming
+ request -- but in that case spawning does *not* happen anyways...
+ <antrik> unless you are talking about something else than what I'm thinking
+ of...
+ <braunr> well imagine you have idle threads, yes
+ <braunr> let's say a lot like a thousand
+ <braunr> and the server gets a thousand requests
+ <braunr> a one more :p
+ <braunr> normally only one thread should be created to handle it
+ <braunr> but here, the worst case is that all threads run internal_demuxer
+ roughly at the same time
+ <braunr> and they all determine they need to spawn a thread
+ <braunr> leading to another thousand
+ <braunr> (that's extreme and very unlikely in practice of course)
+ <antrik> oh, I see... you mean all the idle threads decide that no spawning
+ is necessary; but before they proceed, finally one comes in and decides
+ that it needs to spawn; and when the other ones are scheduled again they
+ all spawn unnecessarily?
+ <braunr> no, spawn is a local variable
+ <braunr> it's rather, all idle threads become busy, and right before
+ servicing their request, they all decide they must spawn a thread
+ <antrik> I don't think that's how it works. changing the status to busy (by
+ decrementing the idle counter) and checking that there are no idle
+ threads is atomic, isn't it?
+ <braunr> no
+ <antrik> oh
+ <antrik> I guess I should actually look at that code (again) before
+ commenting ;-)
+ <braunr> let me check
+ <braunr> no sorry you're right
+ <braunr> so right, you can't lead to that situation
+ <braunr> i don't even understand how i can't see that :/
+ <braunr> let's say it's the heat :p
+ <braunr> 22:08 < braunr> so right, you can't lead to that situation
+ <braunr> it can't lead to that situation
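+
+Putting the two earlier points together -- hold the lock across the whole
+decision, and at least report a failed thread creation -- the demuxer
+prologue comes down to something like this (names simplified from the
+actual libports code):
+
+    #include <pthread.h>
+    #include <stdio.h>
+
+    static pthread_spinlock_t lock;
+    static int threads_total, threads_idle;
+
+    static void *worker (void *arg);
+
+    static void
+    demuxer_prologue (void)
+    {
+      pthread_t tid;
+      int spawn = 0;
+
+      /* One critical section: going busy and deciding whether a new
+         worker is needed happen atomically.  */
+      pthread_spin_lock (&lock);
+      threads_idle--;
+      if (threads_idle == 0)
+        {
+          threads_total++;
+          spawn = 1;
+        }
+      pthread_spin_unlock (&lock);
+
+      if (spawn && pthread_create (&tid, NULL, worker, NULL) != 0)
+        perror ("pthread_create");  /* creation can fail: at least say so */
+    }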
+
+
+## IRC, freenode, #hurd, 2012-08-18
+
+ <braunr> one more attempt at fixing netdde, hope i get it right this time
+ <braunr> some parts assume a ddekit thread is a cthread, because they share
+ the same address
+ <braunr> it's not as easy when using pthread_self :/
+ <braunr> good, i got netdde to work with pthreads
+ <braunr> youpi: for reference, there are now glibc, hurd and netdde
+ packages on my repository
+ <braunr> youpi: the debian specific patches can be found at my git
+ repository (http://git.sceen.net/rbraun/debian_hurd.git/ and
+ http://git.sceen.net/rbraun/debian_netdde.git/)
+ <braunr> except a freeze during boot (between exec and init) which happens
+ rarely, and the starvation which still exists to some extent (fakeroot
+ can cause many threads to be created in pfinet and pflocal), the
+ glibc/hurd packages have been working fine for a few days now
+ <braunr> the threading issue in pfinet/pflocal is directly related to
+ select, which the io_select_timeout patches should fix once merged
+ <braunr> well, considerably reduce at least
+ <braunr> and maybe fix completely, i'm not sure
+
+
+## IRC, freenode, #hurd, 2012-08-27
+
+ <pinotree> braunr: wrt a78a95d in your pthread branch of hurd.git,
+ shouldn't that job theoretically have been done using pthread api (of
+ course after implementing it)?
+ <braunr> pinotree: sure, it could be done through pthreads
+ <braunr> pinotree: i simply restricted myself to moving the hurd to
+ pthreads, not augment libpthread
+ <braunr> (you need to remember that i work on hurd with pthreads because it
+ became a dependency of my work on fixing select :p)
+ <braunr> and even if it wasn't the reason, it is best to do these tasks
+ (replace cthreads and implement pthread scheduling api) separately
+ <pinotree> braunr: hm ok
+ <pinotree> implementing the pthread priority bits could be done
+ independently though
+
+ <braunr> youpi: there are more than 9000 threads for /hurd/streamio kmsg on
+ ironforge oO
+ <youpi> kmsg ?!
+ <youpi> it's only /dev/klog right?
+ <braunr> not sure but it seems so
+ <pinotree> which syslog daemon is running?
+ <youpi> inetutils
+ <youpi> I've restarted the klog translator, to see whether and when it
+ grows again
+
+ <braunr> 6 hours and 21 minutes to build glibc on darnassus
+ <braunr> pfinet still runs only 24 threads
+ <braunr> the ext2 instance used for the build runs 2k threads, but that's
+ because of the pageouts
+ <braunr> so indeed, the priority patch helps a lot
+ <braunr> (pfinet used to have several hundreds, sometimes more than a
+ thousand threads after a glibc build, and potentially increasing with
+ each use of fakeroot)
+ <braunr> exec weighs 164M eww, we definitely have to fix that leak
+ <braunr> the leaks are probably due to wrong mmap/munmap usage
+
+[[exec_leak]].
+
+
+### IRC, freenode, #hurd, 2012-08-29
+
+ <braunr> youpi: btw, after my glibc build, there were as little as between
+ 20 and 30 threads for pflocal and pfinet
+ <braunr> with the priority patch
+ <braunr> ext2fs still had around 2k because of pageouts, but that's
+ expected
+ <youpi> ok
+ <braunr> overall the results seem very good and allow the switch to
+ pthreads
+ <youpi> yep, so it seems
+ <braunr> youpi: i think my first integration branch will include only a few
+ changes, such as this priority tuning, and the replacement of
+ condition_implies
+ <youpi> sure
+ <braunr> so we can push the move to pthreads after all its small
+ dependencies
+ <youpi> yep, that's the most readable way
+
+
+## IRC, freenode, #hurd, 2012-09-03
+
+ <gnu_srs> braunr: Compiling yodl-3.00.0-7:
+ <gnu_srs> pthreads: real 13m42.460s, user 0m0.000s, sys 0m0.030s
+ <gnu_srs> cthreads: real 9m 6.950s, user 0m0.000s, sys 0m0.020s
+ <braunr> thanks
+ <braunr> i'm not exactly certain about what causes the problem though
+ <braunr> it could be due to libpthread using doubly-linked lists, but i
+      don't think the overhead would be so much heavier because of that alone
+ <braunr> there is so much contention sometimes that it could
+ <braunr> the hurd would have been better off with single threaded servers
+ :/
+ <braunr> we should probably replace spin locks with mutexes everywhere
+ <braunr> on the other hand, i don't have any more starvation problem with
+ the current code
+
+
+### IRC, freenode, #hurd, 2012-09-06
+
+ <gnu_srs> braunr: Yes you are right, the new pthread-based Hurd is _much_
+ slower.
+ <gnu_srs> One annoying example is when compiling, the standard output is
+ written in bursts with _long_ periods of no output in between:-(
+ <braunr> that's more probably because of the priority boost, not the
+ overhead
+ <braunr> that's one of the big issues with our mach-based model
+ <braunr> we either give high priorities to our servers, or we can suffer
+ from message floods
+ <braunr> that's in fact more a hurd problem than a mach one
+    <gnu_srs> braunr: any immediate ideas how to speed up responsiveness of
+      the pthread-hurd? It is annoyingly slow (slow-witted)
+ <braunr> gnu_srs: i already answered that
+ <braunr> it doesn't look that slower on my machines though
+ <gnu_srs> you said you had some ideas, not which. except for mcsims work.
+ <braunr> i have ideas about what makes it slower
+ <braunr> it doesn't mean i have solutions for that
+ <braunr> if i had, don't you think i'd have applied them ? :)
+ <gnu_srs> ok, how to make it more responsive on the console? and printing
+ stdout more regularly, now several pages are stored and then flushed.
+ <braunr> give more details please
+ <gnu_srs> it behaves like a loaded linux desktop, with little memory
+ left...
+ <braunr> details about what you're doing
+ <gnu_srs> apt-get source any big package and: fakeroot debian/rules binary
+ 2>&1 | tee ../binary.logg
+    <braunr> i see
+ <braunr> well no, we can't improve responsiveness
+ <braunr> without reintroducing the starvation problem
+ <braunr> they are linked
+    <braunr> and what you're doing involves a few buffers, so the laggy feel is
+ expected
+ <braunr> if we can fix that simply, we'll do so after it is merged upstream
+
+
+### IRC, freenode, #hurd, 2012-09-07
+
+ <braunr> gnu_srs: i really don't feel the sluggishness you described with
+ hurd+pthreads on my machines
+ <braunr> gnu_srs: what's your hardware ?
+ <braunr> and your VM configuration ?
+ <gnu_srs> Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
+ <gnu_srs> kvm -m 1024 -net nic,model=rtl8139 -net
+ user,hostfwd=tcp::5562-:22 -drive
+ cache=writeback,index=0,media=disk,file=hurd-experimental.img -vnc :6
+ -cdrom isos/netinst_2012-07-15.iso -no-kvm-irqchip
+ <braunr> what is the file system type where your disk image is stored ?
+ <gnu_srs> ext3
+ <braunr> and how much physical memory on the host ?
+ <braunr> (paste meminfo somewhere please)
+ <gnu_srs> 4G, and it's on the limit, 2 kvm instances+gnome,etc
+ <gnu_srs> 80% in use by programs, 14% in cache.
+ <braunr> ok, that's probably the reason then
+ <braunr> the writeback option doesn't help a lot if you don't have much
+ cache
+ <gnu_srs> well the other instance is cthreads based, and not so sluggish.
+ <braunr> we know hurd+pthreads is slower
+ <braunr> i just wondered why i didn't feel it that much
+ <gnu_srs> try to fire up more kvm instances, and do a heavy compile...
+ <braunr> i don't do that :)
+ <braunr> that's why i never had the problem
+ <braunr> most of the time i have like 2-3 GiB of cache
+ <braunr> and of course more on shattrath
+ <braunr> (the host of the sceen.net hurdboxes, which has 16 GiB of ram)
+
+
+### IRC, freenode, #hurd, 2012-09-11
+
+ <gnu_srs> Monitoring the cthreads and the pthreads load under Linux shows:
+ <gnu_srs> cthread version: load can jump very high, less cpu usage than
+ pthread version
+ <gnu_srs> pthread version: less memory usage, background cpu usage higher
+ than for cthread version
+ <braunr> that's the expected behaviour
+ <braunr> gnu_srs: are you using the lifothreads gnumach kernel ?
+ <gnu_srs> for experimental, yes.
+ <gnu_srs> i.e. pthreads
+ <braunr> i mean, you're measuring on it right now, right ?
+ <gnu_srs> yes, one instance running cthreads, and one pthreads (with lifo
+ gnumach)
+ <braunr> ok
+ <gnu_srs> no swap used in either instance, will try a heavy compile later
+ on.
+ <braunr> what for ?
+ <gnu_srs> E.g. for memory when linking. I have swap available, but no swap
+ is used currently.
+ <braunr> yes but, what do you intend to measure ?
+ <gnu_srs> don't know, just to see if swap is used at all. it seems to be
+ used not very much.
+ <braunr> depends
+ <braunr> be warned that using the swap means there is pageout, which is one
+ of the triggers for global system freeze :p
+ <braunr> anonymous memory pageout
+ <gnu_srs> for linux swap is used constructively, why not on hurd?
+ <braunr> because of hard to squash bugs
+ <gnu_srs> aha, so it is bugs hindering swap usage:-/
+ <braunr> yup :/
+ <gnu_srs> Let's find them thenO:-), piece of cake
+ <braunr> remember my page cache branch in gnumach ? :)
+
+[[gnumach_page_cache_policy]].
+
+ <gnu_srs> not much
+ <braunr> i started it before fixing non blocking select
+ <braunr> anyway, as a side effect, it should solve this stability issue
+ too, but it'll probably take time
+ <gnu_srs> is that branch integrated? I only remember slab and the lifo
+ stuff.
+ <gnu_srs> and mcsims work
+ <braunr> no it's not
+ <braunr> it's unfinished
+ <gnu_srs> k!
+ <braunr> it correctly extends the page cache to all available physical
+ memory, but since the hurd doesn't scale well, it slows the system down
+
+
+## IRC, freenode, #hurd, 2012-09-14
+
+ <braunr> arg
+ <braunr> darnassus seems to eat 100% cpu and make top freeze after some
+ time
+ <braunr> seems like there is an important leak in the pthreads version
+ <braunr> could be the lifothreads patch :/
+ <cjbirk> there's a memory leak?
+ <cjbirk> in pthreads?
+ <braunr> i don't think so, and it's not a memory leak
+ <braunr> it's a port leak
+ <braunr> probably in the kernel
+
+
+### IRC, freenode, #hurd, 2012-09-17
+
+ <braunr> nice, the port leak is actually caused by the exim4 loop bug
+
+
+### IRC, freenode, #hurd, 2012-09-23
+
+ <braunr> the port leak i observed a few days ago is because of exim4 (the
+ infamous loop eating the cpu we've been seeing regularly)
+
+[[fork_deadlock]]?
+
+ <youpi> oh
+ <braunr> next time it happens, and if i have the occasion, i'll examine the
+ problem
+ <braunr> tip: when you can't use top or ps -e, you can use ps -e -o
+ pid=,args=
+ <youpi> or -M ?
+ <braunr> haven't tested
+
+
+## IRC, freenode, #hurd, 2012-09-23
+
+ <braunr> tschwinge: i committed the last hurd pthread change,
+ http://git.savannah.gnu.org/cgit/hurd/hurd.git/log/?h=master-pthreads
+ <braunr> tschwinge: please tell me if you consider it ok for merging
+
+
+### IRC, freenode, #hurd, 2012-11-27
+
+ <youpi> braunr: btw, I forgot to forward here, with the glibc patch it does
+ boot fine, I'll push all that and build some almost-official packages for
+ people to try out what will come when eglibc gets the change in unstable
+ <braunr> youpi: great :)
+ <youpi> thanks for managing the final bits of this
+ <youpi> (and thanks for everybody involved)
+ <braunr> sorry again for the non obvious parts
+ <braunr> if you need the debian specific parts refined (e.g. nice commits
+ for procfs & others), i can do that
+ <youpi> I'll do that, no pb
+ <braunr> ok
+ <braunr> after that (well, during also), we should focus more on bug
+ hunting
+
+
+## IRC, freenode, #hurd, 2012-10-26
+
+    <mcsim1> hello. What does the following error message mean? "unable to
+      adjust libports thread priority: Operation not permitted" It appears
+      when I set translators.
+    <mcsim1> Seems to have some relation to libpthread. Also the following
+      appeared when I tried to remove a translator: "pthread_create:
+      Resource temporarily unavailable"
+    <mcsim1> Oh, the first message appears very often when I use a
+      translator I set.
+ <braunr> mcsim1: it's related to a recent patch i sent
+ <braunr> mcsim1: hurd servers attempt to increase their priority on startup
+ (when a thread is created actually)
+ <braunr> to reduce message floods and thread storms (such sweet names :))
+ <braunr> but if you start them as an unprivileged user, it fails, which is
+ ok, it's just a warning
+ <braunr> the second way is weird
+ <braunr> it normally happens when you're out of available virtual space,
+      not when shutting a translator down
+ <mcsim1> braunr: you mean this patch: libports: reduce thread starvation on
+ message floods?
+ <braunr> yes
+ <braunr> remember you're running on darnassus
+ <braunr> with a heavily modified hurd/glibc
+ <braunr> you can go back to the cthreads version if you wish
+    <mcsim1> it's better to check translators' privileges before attempting
+      to increase their priority, I think.
+ <braunr> no
+ <mcsim1> it's just a bit annoying
+ <braunr> privileges can be changed during execution
+ <braunr> well remove it
+    <mcsim1> But the warning should not appear.
+ <braunr> what could be done is to limit the warning to one occurrence
+ <braunr> mcsim1: i prefer that it appears
+ <mcsim1> ok
+ <braunr> it's always better to be explicit and verbose
+ <braunr> well not always, but very often
+ <braunr> one of the reasons the hurd is so difficult to debug is the lack
+ of a "message server" à la dmesg
+
+[[translator_stdout_stderr]].
+
+
+### IRC, freenode, #hurd, 2012-12-10
+
+ <youpi> braunr: unable to adjust libports thread priority: (ipc/send)
+ invalid destination port
+ <youpi> I'll see what package brought that
+ <youpi> (that was on a buildd)
+ <braunr> wow
+ <youpi> mkvtoolnix_5.9.0-1:
+ <pinotree> shouldn't that code be done in pthreads and then using such
+ pthread api? :p
+ <braunr> pinotree: you've already asked that question :p
+ <pinotree> i know :p
+ <braunr> the semantics of pthreads are larger than what we need, so that
+ will be done "later"
+ <braunr> but this error shouldn't happen
+ <braunr> it looks more like a random mach bug
+ <braunr> youpi: anything else on the console ?
+ <youpi> nope
+ <braunr> i'll add traces to know which step causes the error
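+
+If the adjustment were done through the pthread API, as pinotree suggests
+above, it could look roughly like the following sketch; the priority value
+and the wording of the warning are illustrative, and pthread_setschedparam
+is not implemented in our libpthread yet:
+
+    #include <pthread.h>
+    #include <sched.h>
+    #include <stdio.h>
+    #include <string.h>
+
+    /* Hypothetical replacement for the direct Mach thread_priority
+       call: raise the calling thread's priority through the pthread
+       scheduling interface, and warn on failure.  */
+    static void
+    boost_thread_priority (void)
+    {
+      struct sched_param param = { .sched_priority = 2 /* illustrative */ };
+      int err = pthread_setschedparam (pthread_self (), SCHED_OTHER, &param);
+      if (err)
+        fprintf (stderr, "unable to adjust libports thread priority: %s\n",
+                 strerror (err));
+    }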
+
+
+## IRC, freenode, #hurd, 2012-12-05
+
+ <braunr> tschwinge: i'm currently working on a few easy bugs and i have
+ planned improvements for libpthreads soon
+ <pinotree> wotwot, which ones?
+ <braunr> pinotree: first, fixing pthread_cond_timedwait (and everything
+ timedsomething actually)
+ <braunr> pinotree: then, fixing cancellation
+ <braunr> pinotree: and last but not least, optimizing thread wakeup
+ <braunr> i also want to try replacing spin locks and see if it does what i
+ expect
+ <pinotree> which fixes do you plan applying to cond_timedwait?
+ <braunr> see sysdeps/generic/pt-cond-timedwait.c
+ <braunr> the FIXME comment
+ <pinotree> ah that
+ <braunr> well that's important :)
+ <braunr> did you have something else in mind ?
+ <pinotree> hm, __pthread_timedblock... do you plan fixing directly there? i
+ remember having seem something related to that (but not on conditions),
+ but wasn't able to see further
+ <braunr> it has the same issue
+ <braunr> i don't remember the details, but i wrote a cthreads version that
+ does it right
+ <braunr> in the io_select_timeout branch
+ <braunr> see
+ http://git.savannah.gnu.org/cgit/hurd/hurd.git/tree/libthreads/cancel-cond.c?h=rbraun/select_timeout
+ for example
+ * pinotree looks
+ <braunr> what matters is the msg_delivered member used to synchronize
+ sleeper and waker
+ <braunr> the waker code is in
+ http://git.savannah.gnu.org/cgit/hurd/hurd.git/tree/libthreads/cprocs.c?h=rbraun/select_timeout
+ <pinotree> never seen cthreads' code before :)
+ <braunr> soon you shouldn't have any more reason to :p
+ <pinotree> ah, so basically the cthread version of the pthread cleanup
+ stack + cancelation (ie the cancel hook) broadcasts the condition
+ <braunr> yes
+ <pinotree> so a similar fix would be needed in all the places using
+ __pthread_timedblock, that is conditions and mutexes
+ <braunr> and that's what's missing in glibc that prevents deploying a
+ pthreads based hurd currently
+ <braunr> no that's unrelated
+ <pinotree> ok
+ <braunr> the problem is how __pthread_block/__pthread_timedblock is
+ synchronized with __pthread_wakeup
+ <braunr> libpthreads does exactly the same thing as cthreads for that,
+ i.e. use messages
+ <braunr> but the message alone isn't enough, since, as explained in the
+ FIXME comment, it can arrive too late
+ <braunr> it's not a problem for __pthread_block because this function can
+ only resume after receiving a message
+ <braunr> but it's a problem for __pthread_timedblock which can resume
+ because of a timeout
+ <braunr> my solution is to add a flag that says whether a message was
+ actually sent, and lock around sending the message, so that the thread
+ resume can accurately tell in which state it is
+ <braunr> and drain the message queue if needed
+ <pinotree> i see, race between the "i stop blocking because of timeout" and
+ "i stop because i got a message" with the actual check for the real cause
+ <braunr> locking around mach_msg may seem overkill but it's not in
+ practice, since there can only be one message at most in the message
+ queue
+ <braunr> and i checked that in practice by limiting the message queue size
+ and check for such errors
+ <braunr> but again, it would be far better with mutexes only, and no spin
+ locks
+ <braunr> i wondered for a long time why the load average was so high on the
+ hurd under even "light" loads
+ <braunr> now i know :)
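+
+A hedged sketch of the flag-plus-lock scheme braunr describes, with
+made-up names; the real code is libpthread's __pthread_timedblock and
+__pthread_wakeup, and this assumes the waker holds a send right to the
+sleeper's wakeup port:
+
+    #include <mach.h>
+    #include <pthread.h>
+
+    struct blocked_thread
+    {
+      pthread_spinlock_t lock;  /* serializes waker and timed-out sleeper */
+      int msg_sent;             /* set once the wakeup message is sent */
+      mach_port_t wakeup_port;
+    };
+
+    /* Waker: send the single wakeup message and record that fact under
+       the lock, so a sleeper that timed out can tell what happened.  */
+    void
+    wakeup (struct blocked_thread *t)
+    {
+      mach_msg_header_t msg;
+
+      pthread_spin_lock (&t->lock);
+      msg.msgh_bits = MACH_MSGH_BITS (MACH_MSG_TYPE_COPY_SEND, 0);
+      msg.msgh_remote_port = t->wakeup_port;
+      msg.msgh_local_port = MACH_PORT_NULL;
+      msg.msgh_size = sizeof msg;
+      mach_msg (&msg, MACH_SEND_MSG, sizeof msg, 0, MACH_PORT_NULL,
+                MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
+      t->msg_sent = 1;
+      pthread_spin_unlock (&t->lock);
+    }
+
+    /* Sleeper, after its timed receive returned MACH_RCV_TIMED_OUT:
+       if a wakeup raced with the timeout, drain the pending message so
+       the port is empty for the next block.  Returns nonzero if the
+       timeout really happened.  */
+    int
+    timed_out (struct blocked_thread *t)
+    {
+      struct { mach_msg_header_t head; } msg;
+      int woken;
+
+      pthread_spin_lock (&t->lock);
+      woken = t->msg_sent;
+      pthread_spin_unlock (&t->lock);
+      if (woken)
+        mach_msg (&msg.head, MACH_RCV_MSG, 0, sizeof msg, t->wakeup_port,
+                  MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
+      return !woken;
+    }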
diff --git a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn
new file mode 100644
index 00000000..37231c66
--- /dev/null
+++ b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn
@@ -0,0 +1,21 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_libpthread]]
+
+`t/have_kernel_resources`
+
+
+# IRC, freenode, #hurd, 2012-08-30
+
+ <braunr> tschwinge: this issue needs more cooperation with the kernel
+ <braunr> tschwinge: i.e. the ability to tell the kernel where the stack is,
+ so it's unmapped when the thread dies
+      without requiring another thread to perform this deallocation
diff --git a/open_issues/libpthread_CLOCK_MONOTONIC.mdwn b/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
index 2c8f10f8..22b2cd3b 100644
--- a/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
+++ b/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
@@ -76,3 +76,38 @@ License|/fdl]]."]]"""]]
<pinotree> kind of, yes
<youpi> I have reverted the change in libc for now
<pinotree> ok
+
+
+## IRC, freenode, #hurd, 2012-07-22
+
+    <tschwinge> pinotree, youpi: I once saw you discussing an issue with
+      librt usage in libpthread -- is it this issue?
+      http://sourceware.org/PR14304
+ <youpi> tschwinge: (librt): no
+ <youpi> it's the converse
+ <pinotree> tschwinge: kind of
+ <youpi> unexpectedly loading libpthread is almost never a problem
+ <youpi> it's unexpectedly loading librt which was a problem for glib
+ <youpi> tschwinge: basically what happened with glib is that at configure
+ time, it could find clock_gettime without any -lrt, because of pulling
+ -lpthread, but at link time that wouldn't happen
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <braunr> pinotree: oh, i see you changed __pthread_timedblock to use
+ clock_gettime
+ <braunr> i wonder if i should do the same in libthreads
+ <pinotree> yeah, i realized later it was a bad move
+ <braunr> ok
+ <braunr> i'll stick to gettimeofday for now
+ <pinotree> it'll be safe when implementing some private
+ __hurd_clock_get{time,res} in libc proper, making librt just forward to
+ it and adapting the gettimeofday to use it
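+
+For context, a minimal sketch of the computation __pthread_timedblock has
+to perform, assuming the Mach timed receive wants a relative timeout in
+milliseconds; sticking to gettimeofday is only correct for CLOCK_REALTIME
+abstimes, which is why a clock_gettime without the librt circularity
+matters:
+
+    #include <sys/time.h>
+    #include <time.h>
+
+    /* Turn an absolute CLOCK_REALTIME timeout into the relative
+       milliseconds a Mach timed receive expects (illustrative).  */
+    static int
+    relative_timeout_ms (const struct timespec *abstime)
+    {
+      struct timeval now;
+      long ms;
+
+      gettimeofday (&now, NULL);
+      ms = (abstime->tv_sec - now.tv_sec) * 1000
+           + (abstime->tv_nsec / 1000 - now.tv_usec) / 1000;
+      return ms > 0 ? (int) ms : 0;
+    }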
+
+
+## IRC, freenode, #hurd, 2012-10-22
+
+ <pinotree> youpi: apparently somebody in glibc land is indirectly solving
+      our "libpthread needs librt which pulls libpthread" circular issue by
+ moving the clock_* functions to libc proper
+ <youpi> I've seen that yes :)
diff --git a/open_issues/libpthread_timeout_dequeue.mdwn b/open_issues/libpthread_timeout_dequeue.mdwn
new file mode 100644
index 00000000..5ebb2e11
--- /dev/null
+++ b/open_issues/libpthread_timeout_dequeue.mdwn
@@ -0,0 +1,22 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_libpthread]]
+
+
+# IRC, freenode, #hurd, 2012-08-17
+
+ <braunr> pthread_cond_timedwait and pthread_mutex_timedlock *can* produce
+ segfaults in our implementation
+ <braunr> if a timeout happens, but before the thread dequeues itself,
+ another tries to wake it, it will be dequeued twice
+ <braunr> this is the issue i spent a week on when working on fixing select
+
+[[select]]
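+
+A hedged sketch of the guard that prevents the double dequeue; all names
+are illustrative, and remove_from_queue stands in for whatever list
+manipulation the real code does:
+
+    #include <pthread.h>
+
+    struct waiter
+    {
+      int queued;   /* protected by the queue lock */
+      /* ... queue linkage, wakeup port, ... */
+    };
+
+    struct wait_queue
+    {
+      pthread_spinlock_t lock;
+      /* ... list head ... */
+    };
+
+    void remove_from_queue (struct wait_queue *, struct waiter *);
+
+    /* Waker: only dequeue and wake the waiter if it is still queued.  */
+    void
+    wake_one (struct wait_queue *q, struct waiter *w)
+    {
+      pthread_spin_lock (&q->lock);
+      if (w->queued)
+        {
+          remove_from_queue (q, w);
+          w->queued = 0;
+          /* ... send the wakeup message ... */
+        }
+      pthread_spin_unlock (&q->lock);
+    }
+
+    /* Timed-out waiter: dequeue ourselves only if the waker did not
+       already do it, so nobody dequeues the same entry twice.  */
+    void
+    on_timeout (struct wait_queue *q, struct waiter *w)
+    {
+      pthread_spin_lock (&q->lock);
+      if (w->queued)
+        {
+          remove_from_queue (q, w);
+          w->queued = 0;
+        }
+      pthread_spin_unlock (&q->lock);
+    }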
diff --git a/open_issues/mach_federations.mdwn b/open_issues/mach_federations.mdwn
new file mode 100644
index 00000000..50c939c3
--- /dev/null
+++ b/open_issues/mach_federations.mdwn
@@ -0,0 +1,66 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_documentation]]
+
+
+# IRC, freenode, #hurd, 2012-08-18
+
+ <braunr> well replacing parts of it is possible on the hurd, but for core
+ servers it's limited
+ <braunr> minix has features for that
+ <braunr> this was interesting too:
+ http://static.usenix.org/event/osdi08/tech/full_papers/david/david_html/
+ <braunr> lcc: you'll always have some kind of dependency problems which are
+ hard to solve
+    <savask> braunr: A friend of mine asked me if it's possible to run
+      different parts of the Hurd on different computers and make a cluster
+      that way. So, is it, at least theoretically?
+ <braunr> savask: no
+ <savask> Okay, then I guessed a right answer.
+    <youpi> well, theoretically it's possible, but it's not implemented
+ <braunr> well it's possible everywhere :p
+ <braunr> there are projects for that on linux
+ <braunr> but it requires serious changes in both the protocols and servers
+ <braunr> and it depends on the features you want (i assume here you want
+ e.g. process checkpointing so they can be migrated to other machines to
+ transparently balance loads)
+ <lcc> is it even theoretically possible to have a system in which core
+ servers can be modified while the system is running? hm... I will look
+ more into it. just curious.
+ <savask> lcc: Linux can be updated on the fly, without rebooting.
+ <braunr> lcc: to some degree, it is
+ <braunr> savask: the whole kernel is rebooted actually
+ <braunr> well not rebooted, but restarted
+ <braunr> there is a project that provides kernel updates through binary
+ patches
+ <braunr> ksplice
+ <savask> braunr: But it will look like everything continued running.
+ <braunr> as long as the new code expects the same data structures and other
+ implications, yes
+ <braunr> "Ksplice can handle many security updates but not changes to data
+ structures"
+ <braunr> obviously
+ <braunr> so it's good for small changes
+ <braunr> and ksplice is very specific, it's intended for security updates,
+      and the primary users are telecommunication providers who don't want
+ downtime
+ <antrik> braunr: well, protocols and servers on Mach-based systems should
+ be ready for federations... although some Hurd protocols are not clean
+      for federations with heterogeneous architectures, at least on
+      homogeneous clusters it should actually work with only some extra
+      bootstrapping code,
+ if the support existed in our Mach variant...
+ <braunr> antrik: why do you want the support in the kernel ?
+ <antrik> braunr: I didn't say I *want* federation support in the
+ kernel... in fact I agree with Shapiro that it's probably a bad idea. I
+ just said that it *should* actually work with the system design as it is
+ now :-)
+    <antrik> braunr: yes, I said that it wouldn't work on heterogeneous
+ federations. if all machines use the same architecture it should work.
diff --git a/open_issues/mach_on_top_of_posix.mdwn b/open_issues/mach_on_top_of_posix.mdwn
index 7574feb0..a3e47685 100644
--- a/open_issues/mach_on_top_of_posix.mdwn
+++ b/open_issues/mach_on_top_of_posix.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -14,3 +14,5 @@ License|/fdl]]."]]"""]]
At the beginning of the 2000s, there was a *Mach on Top of POSIX* port started
by John Edwin Tobey. Status unknown. Ask [[tschwinge]] for the source code.
+
+See also [[implementing_hurd_on_top_of_another_system]].
diff --git a/open_issues/mach_shadow_objects.mdwn b/open_issues/mach_shadow_objects.mdwn
new file mode 100644
index 00000000..0669041a
--- /dev/null
+++ b/open_issues/mach_shadow_objects.mdwn
@@ -0,0 +1,24 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_gnumach]]
+
+See also [[gnumach_vm_map_entry_forward_merging]].
+
+
+# IRC, freenode, #hurd, 2012-11-16
+
+ <mcsim> hi. do I understand correct that following is true: vm_object_t a;
+ a->shadow->copy == a;?
+ <braunr> mcsim: not completely sure, but i'd say no
+ <braunr> but mach terminology isn't always firm, so possible
+ <braunr> mcsim: apparently you're right, although be careful that it may
+ not be the case *all* the time
+ <braunr> there may be inconsistent states
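+
+Expressed in code, the relation mcsim asks about would be checked as
+below; per braunr's warning this is a heuristic, not a hard invariant,
+since transient states can violate it:
+
+    /* Usually true for a copy-on-write object `a' (illustrative).  */
+    assert (a->shadow == VM_OBJECT_NULL || a->shadow->copy == a);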
diff --git a/open_issues/mission_statement.mdwn b/open_issues/mission_statement.mdwn
index 17f148a9..b32d6ba6 100644
--- a/open_issues/mission_statement.mdwn
+++ b/open_issues/mission_statement.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -658,3 +658,42 @@ License|/fdl]]."]]"""]]
FUSE in this case though... it doesn't really change the functionality of
the VFS; only rearranges the tree a bit
<antrik> (might even be doable with standard Linux features)
+
+
+# IRC, freenode, #hurd, 2012-07-25
+
+ <braunr> because it has design problems, because it has implementation
+ problems, lots of problems, and far too few people to keep up with other
+ systems that are already dominating
+ <braunr> also, considering other research projects get much more funding
+ than we do, they probably have a better chance at being adopted
+ <rah> you consider the Hurd to be a research project?
+ <braunr> and as they're more recent, they sometimes overcome some of the
+ issues we have
+ <braunr> yes and no
+ <braunr> yes because it was, at the time of its creation, and it hasn't
+ changed much, and there aren't many (any?) other systems with such a
+ design
+ <braunr> and no because the hurd is actually working, and being released as
+ part of something like debian
+ <braunr> which clearly shows it's able to do the stuff it was intended for
+ <braunr> i consider it a technically very interesting project for
+ developers who want to know more about microkernel based extensible
+ systems
+ <antrik> rah: I don't expect the Hurd to achieve world domination, because
+ most people consider Linux "good enough" and will stick with it
+ <antrik> I for my part think though we could do better than Linux (in
+ certain regards I consider important), which is why I still consider it
+ interesting and worthwhile
+ <nowhere_man> I think that in some respect the OS scene may evolve a bit
+ like the PL one, where everyone progressively adopts ideas from Lisp but
+ doesn't want to do Lisp: everyone slowly shifts towards what µ-kernels
+ OSes have done from the start, but they don't want µ-kernels...
+ <braunr> nowhere_man: that's my opinion too
+ <braunr> and this is why i think something like the hurd still has valuable
+ purpose
+ <nowhere_man> braunr: in honesty, I still ponder the fact that it's my
+ coping mechanism to accept being a Lisp and Hurd fan ;-)
+ <braunr> nowhere_man: it can be used that way too
+ <braunr> functional programming is getting more and more attention
+ <braunr> so it's fine if you're a lisp fan really
diff --git a/open_issues/multithreading.mdwn b/open_issues/multithreading.mdwn
index 5924d3f9..f42601b4 100644
--- a/open_issues/multithreading.mdwn
+++ b/open_issues/multithreading.mdwn
@@ -49,6 +49,160 @@ Tom Van Cutsem, 2009.
<youpi> right
+## IRC, freenode, #hurd, 2012-07-16
+
+ <braunr> hm interesting
+    <braunr> when many threads are created to handle requests, they
+ automatically create a pool of worker threads by staying around for some
+ time
+ <braunr> this time is given in the libport call
+    <braunr> but the threads always remain
+    <braunr> they must be used in turn each time a new request comes in
+ <braunr> ah no :(, they're maintained by the periodic sync :(
+ <braunr> hm, still not that, so weird
+ <antrik> braunr: yes, that's a known problem: unused threads should go away
+ after some time, but that doesn't actually happen
+ <antrik> don't remember though whether it's broken for some reason, or
+ simply not implemented at all...
+ <antrik> (this was already a known issue when thread throttling was
+ discussed around 2005...)
+ <braunr> antrik: ok
+ <braunr> hm threads actually do finish ..
+ <braunr> libthreads retain them in a pool for faster allocations
+ <braunr> hm, it's worse than i thought
+ <braunr> i think the hurd does its job well
+ <braunr> the cthreads code never reaps threads
+ <braunr> when threads are finished, they just wait until assigned a new
+ invocation
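+
+The intended behaviour, sketched with POSIX threads; IDLE_TIMEOUT,
+dequeue and handle are illustrative stand-ins for whatever the real
+worker pool uses:
+
+    #include <errno.h>
+    #include <pthread.h>
+    #include <time.h>
+
+    #define IDLE_TIMEOUT 10   /* seconds, illustrative */
+
+    struct request;
+    struct pool
+    {
+      pthread_mutex_t lock;
+      pthread_cond_t cond;
+      struct request *pending;
+      int nr_threads;
+    };
+
+    struct request *dequeue (struct pool *);
+    void handle (struct request *);
+
+    /* Worker loop: wait up to IDLE_TIMEOUT for a new request, and
+       leave the pool (reaping the thread) if none arrives, which is
+       what apparently never happens in the cthreads code.  */
+    static void *
+    worker (void *arg)
+    {
+      struct pool *pool = arg;
+
+      for (;;)
+        {
+          struct timespec deadline;
+          clock_gettime (CLOCK_REALTIME, &deadline);
+          deadline.tv_sec += IDLE_TIMEOUT;
+
+          pthread_mutex_lock (&pool->lock);
+          while (pool->pending == NULL)
+            if (pthread_cond_timedwait (&pool->cond, &pool->lock,
+                                        &deadline) == ETIMEDOUT)
+              {
+                pool->nr_threads--;
+                pthread_mutex_unlock (&pool->lock);
+                return NULL;   /* idle for too long: go away */
+              }
+
+          struct request *r = dequeue (pool);
+          pthread_mutex_unlock (&pool->lock);
+          handle (r);
+        }
+    }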
+
+ <braunr> i don't understand ports_manage_port_operations_multithread :/
+ <braunr> i think i get it
+ <braunr> why do people write things in such a complicated way ..
+ <braunr> such code is error prone and confuses anyone
+
+ <braunr> i wonder how well nested functions interact with threads when
+ sharing variables :/
+ <braunr> the simple idea of nested functions hurts my head
+ <braunr> do you see my point ? :) variables on the stack automatically
+      shared between threads, without the need to explicitly pass them by
+ address
+ <antrik> braunr: I don't understand. why would variables on the stack be
+ shared between threads?...
+ <braunr> antrik: one function declares two variables, two nested functions,
+ and use these in separate threads
+ <braunr> are the local variables still "local"
+ <braunr> ?
+ <antrik> braunr: I would think so? why wouldn't they? threads have separate
+ stacks, right?...
+ <antrik> I must admit though that I have no idea how accessing local
+ variables from the parent function works at all...
+ <braunr> me neither
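+
+A small GNU C example of the situation being discussed; the nested
+function keeps referring to the parent's stack frame, so the variable is
+effectively shared with the other thread, and the function pointer goes
+through a trampoline generated on the (executable) stack:
+
+    #include <pthread.h>
+    #include <stdio.h>
+
+    static void *
+    run (void *arg)
+    {
+      void (*fn) (void) = arg;
+      fn ();
+      return NULL;
+    }
+
+    int
+    main (void)
+    {
+      int counter = 0;          /* lives in main's stack frame */
+
+      void increment (void)     /* nested function, GNU C extension */
+      {
+        counter++;              /* touches main's frame directly */
+      }
+
+      pthread_t t;
+      pthread_create (&t, NULL, run, (void *) increment);
+      pthread_join (t, NULL);
+      printf ("%d\n", counter); /* prints 1: the local was shared */
+      return 0;
+    }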
+
+ <braunr> why don't demuxers get a generic void * like every callback does
+ :((
+ <antrik> ?
+ <braunr> antrik: they get pointers to the input and output messages only
+ <antrik> why is this a problem?
+ <braunr> ports_manage_port_operations_multithread can be called multiple
+ times in the same process
+ <braunr> each call must have its own context
+ <braunr> currently this is done by using nested functions
+    <braunr> also, why do demuxers return booleans while
+      mach_msg_server_timeout happily ignores them :(
+ <braunr> callbacks shouldn't return anything anyway
+ <braunr> but then you have a totally meaningless "return 1" in the middle
+ of the code
+ <braunr> i'd advise not using a single nested function
+ <antrik> I don't understand the remark about nested function
+ <braunr> they're just horrible extensions
+ <braunr> the compiler completely hides what happens behind the scenes, and
+ nasty bugs could come out of that
+ <braunr> i'll try to rewrite ports_manage_port_operations_multithread
+ without them and see if it changes anything
+ <braunr> but it's not easy
+ <braunr> also, it makes debugging harder :p
+ <braunr> i suspect gdb hangs are due to that, since threads directly start
+ on a nested function
+ <braunr> and if i'm right, they are created on the stack
+ <braunr> (which is also horrible for security concerns, but that's another
+ story)
+ <braunr> (at least the trampolines)
+ <antrik> I seriously doubt it will change anything... but feel free to
+ prove me wrong :-)
+ <braunr> well, i can see really weird things, but it may have nothing to do
+ with the fact functions are nested
+ <braunr> (i still strongly believe those shouldn't be used at all)
+
+
+## IRC, freenode, #hurd, 2012-08-31
+
+ <braunr> and the hurd is all but scalable
+ <gnu_srs> I thought scalability was built-in already, at least for hurd??
+ <braunr> built in ?
+ <gnu_srs> designed in
+ <braunr> i guess you think that because you read "aggressively
+ multithreaded" ?
+ <braunr> well, a system that is unable to control the amount of threads it
+      creates for no valid reason and uses global locks about everywhere isn't
+ really scalable
+ <braunr> it's not smp nor memory scalable
+ <gnu_srs> most modern OSes have multi-cpu support.
+ <braunr> that doesn't mean they scale
+ <braunr> bsd sucks in this area
+ <braunr> it got better in recent years but they're way behind linux
+ <braunr> linux has this magic thing called rcu
+ <braunr> and i want that in my system, from the beginning
+ <braunr> and no, the hurd was never designed to scale
+ <braunr> that's obvious
+ <braunr> a very common mistake of the early 90s
+
+
+## IRC, freenode, #hurd, 2012-09-06
+
+ <braunr> mel-: the problem with such a true client/server architecture is
+ that the scheduling context of clients is not transferred to servers
+ <braunr> mel-: and the hurd creates threads on demand, so if it's too slow
+ to process requests, more threads are spawned
+ <braunr> to prevent hurd servers from creating too many threads, they are
+ given a higher priority
+ <braunr> and it causes increased latency for normal user applications
+ <braunr> a better way, which is what modern synchronous microkernel based
+ systems do
+ <braunr> is to transfer the scheduling context of the client to the server
+ <braunr> the server thread behaves like the client thread from the
+ scheduler perspective
+ <gnu_srs> how can creating more threads ease the slowness, is that a design
+ decision??
+ <mel-> what would be needed to implement this?
+ <braunr> mel-: thread migration
+ <braunr> gnu_srs: is that what i wrote ?
+ <mel-> does mach support it?
+ <braunr> mel-: some versions do yes
+ <braunr> mel-: not ours
+    <gnu_srs> (21:49:03) braunr: mel-: and the hurd creates threads on demand,
+ so if it's too slow to process requests, more threads are spawned
+ <braunr> of course it's a design decision
+ <braunr> it doesn't "ease the slowness"
+ <braunr> it makes servers able to use multiple processors to handle
+ requests
+ <braunr> but it's a wrong design decision as the number of threads is
+ completely unchecked
+    <gnu_srs> what's the idea of creating more threads then, multiple cpus is
+ not supported?
+ <braunr> it's a very old decision taken at a time when systems and machines
+ were very different
+ <braunr> mach used to support multiple processors
+ <braunr> it was expected gnumach would do so too
+ <braunr> mel-: but getting thread migration would also require us to adjust
+ our threading library and our servers
+ <braunr> it's not an easy task at all
+ <braunr> and it doesn't fix everything
+ <braunr> thread migration on mach is an optimization
+ <mel-> interesting
+ <braunr> async ipc remains available, which means notifications, which are
+ async by nature, will create messages floods anyway
+
+
# Alternative approaches:
* <http://www.concurrencykit.org/>
diff --git a/open_issues/netstat.mdwn b/open_issues/netstat.mdwn
new file mode 100644
index 00000000..b575ea7f
--- /dev/null
+++ b/open_issues/netstat.mdwn
@@ -0,0 +1,34 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_hurd open_issue_porting]]
+
+
+# IRC, freenode, #hurd, 2012-12-06
+
+ <braunr> we need a netstat command
+ <pinotree> wouldn't that require rpcs and notifications in pfinet to get
+ info on the known sockets?
+ <braunr> depends on the interface
+ <braunr> netstat currently uses /proc/net/* so that's out of the question
+ <braunr> but a bsd netstat using ioctls could do the job
+ <braunr> i'm not sure if it's done that way
+ <braunr> i don't see why it would require notifications though
+    <pinotree> if you add such rpcs to pfinet, you could show the sockets
+      in procfs
+ <braunr> yes
+ <braunr> that's the clean way :p
+ <braunr> but why notifications ?
+ <pinotree> to get changes on data of sockets (status change, i/o activity,
+ etc)
+ <pinotree> (possibly i'm forgetting some already there features to know
+ that)
+ <braunr> the socket state is centralized in pfinet
+ <braunr> netstat polls it
+ <braunr> (or asks it once)
diff --git a/open_issues/packaging_libpthread.mdwn b/open_issues/packaging_libpthread.mdwn
index d243aaaa..2d90779e 100644
--- a/open_issues/packaging_libpthread.mdwn
+++ b/open_issues/packaging_libpthread.mdwn
@@ -137,3 +137,110 @@ License|/fdl]]."]]"""]]
<youpi> I know, I've asked tschwinge about it
<youpi> it's not urging anyway
<pinotree> right
+
+
+## IRC, freenode, #hurd, 2012-07-21
+
+ <pinotree> tschwinge: btw, samuel suggested to rename in libpthread ia32 →
+ i386, to better fit with glibc
+    <tschwinge> pinotree: Hmm, that'd somewhat break interoperability with
+      Viengoos' use of libpthread.
+ <pinotree> how would it break with viengoos?
+ <tschwinge> I assume it is using the i386 names. Hmm, no isn't it x86_64
+ only?
+ <tschwinge> I'll check.
+ <pinotree> does it use automake (with the Makefile.am in repo)?
+ <tschwinge> I have no idea what the current branch arrangement is.
+ <pinotree> tschwinge: it looks like ia32 is hardcoded in Makefile and
+ Makefile.am
+
+
+## IRC, freenode, #hurd, 2012-08-07
+
+ <tschwinge> Also, the Savannah hurd/glibc.git one does not/not yet include
+ libpthread.
+ <tschwinge> But that could easily be added as a Git submodule.
+ <tschwinge> youpi: To put libpthread into glibc it is literally enough to
+ make Savannah hurd/libpthread.git appear at [glibc]/libpthread?
+ <youpi> tschwinge: there are some patches needed in the rest of the tree
+ <youpi> see in debian, libpthread_clean.diff, tg-libpthread_depends.diff,
+ unsubmitted-pthread.diff, unsubmitted-pthread_posix_options.diff
+ <tschwinge> The libpthread in Debian glibc is
+ hurd/libpthread.git:b428baaa85c0adca9ef4884c637f289a0ab5e2d6 but with
+ 25260994c812050a5d7addf125cdc90c911ca5c1 »Store self in __thread variable
+ instead of threadvar« reverted (why?), and the following additional
+ change applied to Makefile:
+ <tschwinge> ifeq ($(IN_GLIBC),yes)
+ <tschwinge> $(inst_libdir)/libpthread.so:
+ $(objpfx)libpthread.so$(libpthread.so-version) \
+ <tschwinge> $(+force)
+ <tschwinge> - ln -sf $(slibdir)/libpthread.so$(libpthread.so-version)
+ $@
+ <tschwinge> + ln -sf libpthread.so$(libpthread.so-version) $@
+ <braunr> tschwinge: is there any plan to merge libpthread.git in glibc.git
+ upstream ?
+ <tschwinge> braunr, youpi: Has not yet been discussed with Roland, as far
+ as I know.
+ <youpi> has not
+ <youpi> libpthread.diff is supposed to be a verbatim copy of the repository
+ <youpi> and then there are a couple patches which don't (yet) make sense
+ upstream
+ <youpi> the slibdir change, however, is odd
+ <youpi> it must be a leftover
+
+
+# IRC, freenode, #hurd, 2012-11-16
+
+ <pinotree> *** $(common-objpfx)resolv/gai_suspend.o: uses
+ /usr/include/i386-gnu/bits/pthread.h
+ <pinotree> so the ones in the libpthread addon are not used...
+ <tschwinge> pinotree: The latter at leash should be useful information.
+ <pinotree> tschwinge: i'm afraid i didn't get you :) what are you referring
+ to?
+    <tschwinge> pinotree: s%leash%least -- what I meant was that it's
+      actually a real bug that the in-tree libpthread addon include files
+      are not being used.
+ <pinotree> tschwinge: ah sure -- basically, the stuff in
+ libpthread/sysdeps/generic are not used at all
+ <pinotree> (glibc only uses generic for glibc/sysdeps/generic)
+ <pinotree> tschwinge: i might have an idea how to fix it: moving the
+ contents from libpthread/sysdeps/generic to libpthread/sysdeps/pthread,
+ and that would depend on one of the latest libpthread patches i sent
+
+
+# libihash
+
+## IRC, freenode, #hurd, 2012-11-16
+
+ <pinotree> also, libpthread uses hurd's ihash
+ <tschwinge> Yes, I already thought a little bit about the ihash thing. I
+      basically see two options: move ihash into glibc ((probably?) not as a
+      public interface, though), or have libpthread use one of the hash
+ implementations that surely are already present in glibc.
+ <tschwinge> My notes say:
+ <tschwinge> * include/inline-hashtab.h
+ <tschwinge> * locale/programs/simple-hash.h
+ <tschwinge> * misc/hsearch_r.c
+ <tschwinge> * NNS; cf. f46f0abfee5a2b34451708f2462a1c3b1701facd
+ <tschwinge> No idea whether they're equivalent/usable.
+ <pinotree> interesting
+ <tschwinge> And no immediate recollection what NNS is;
+ f46f0abfee5a2b34451708f2462a1c3b1701facd is not a glibc commit after all.
+ ;-)
+ <tschwinge> Oh, and: libiberty: `hashtab.c`
+ <pinotree> hmm, but then you would need to properly ifdef the libpthread
+ hash usage (iirc only for pthread keys) depending on whether it's in
+ glibc or standalone
+ <pinotree> but that shouldn't be an ussue, i guess
+ <pinotree> *issue
+ <tschwinge> No that'd be fine.
+    <tschwinge> My understanding is that the long-term goal (well, not so
+      long-term, actually) is to completely move libpthread into glibc.
+    <pinotree> ie have it buildable only as a glibc addon?
+ <tschwinge> Yes.
+ <tschwinge> No need for more than one mechanism for building it, I think.
+ <tschwinge> Hmm, this doesn't bring us any further:
+ https://www.google.com/search?q=f46f0abfee5a2b34451708f2462a1c3b1701facd
+ <pinotree> yay for acronyms ;)
+ <tschwinge> So, if someone figures out what NNS and this commit it are: one
+ beer. ;-)
diff --git a/open_issues/pci_arbiter.mdwn b/open_issues/pci_arbiter.mdwn
new file mode 100644
index 00000000..7730cee0
--- /dev/null
+++ b/open_issues/pci_arbiter.mdwn
@@ -0,0 +1,256 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+For [[DDE]]/X.org/...
+
+
+# IRC, freenode, #hurd, 2012-02-19
+
+ <youpi> antrik: we should probably add a gsoc idea on pci bus arbitration
+ <youpi> DDE is still experimental for now so it's ok that you have to
+      configure it by hand, but it should be automatic at some point
+
+
+## IRC, freenode, #hurd, 2012-02-21
+
+ <braunr> i'm not familiar with the new gnumach interface for userspace
+ drivers, but can this pci enumerator be written with it as it is ?
+ <braunr> (i'm not asking for a precise answer, just yes - even probably -
+ or no)
+ <braunr> (idk or utsl will do as well)
+ <youpi> I'd say yes
+ <youpi> since all drivers need is interrupts, io ports and iomem
+ <youpi> the latter was already available through /dev/mem
+ <youpi> io ports through the i386 rpcs
+ <youpi> the changes provide both interrupts, and physical-contiguous
+ allocation
+ <youpi> it should be way enough
+ <braunr> youpi: ok
+ <braunr> youpi: thanks for the details :)
+ <antrik> braunr: this was mentioned in the context of the interrupt
+ forwarding interface... the original one implemented by zhengda isn't
+ suitable for a PCI server; but the ones proposed by youpi and tschwinge
+ would work
+ <antrik> same for the physical memory interface: the current implementation
+ doesn't allow delegation; but I already said that it's wrong
+
+
+# IRC, freenode, #hurd, 2012-07-15
+
+ <bddebian> youpi: Oh, BTW, I keep meaning to ask you. Could sound be done
+ with dde or would there still need to be some kernel work?
+    <youpi> bddebian: we'd need a PCI arbiter for that
+ <youpi> for now just one userland poking with PCI is fine
+ <youpi> but two can produce bonks
+ <bddebian> They can't use the same?
+ <youpi> that's precisely the matter
+ <youpi> they have to use the same
+ <youpi> and not poke with it themselves
+ <braunr> that's what an arbiter is for
+ <bddebian> OK, so if we don't have a PCI arbiter now, how do things like
+ netdde and video not collide currently?
+ <bddebian> s/netdde/network/
+ <bddebian> or disk for that matter
+ <braunr> bddebian: ah currently, well currently, the network is the only
+ thing using the pci bus
+ <bddebian> How is that possible when I have a PCI video card and disk
+ controller?
+ <braunr> they are accessed through compatible means
+ <bddebian> I suppose one of the hardest parts is prioritization?
+ <braunr> i don't think it matters much, no
+ <youpi> bddebian: netdde and Xorg don't collide essentially because they
+ are not started at the same time (hopefully)
+ <bddebian> braunr: What do you mean it doesn't matter?
+ <braunr> bddebian: well the point is rather serializing access, we don't
+ need more
+ <braunr> do other systems actually schedule access to the pci bus ?
+ <bddebian> From what I am reading, yes
+ <braunr> ok
+
+
+# IRC, freenode, #hurd, 2012-07-16
+
+    <antrik> youpi: the lack of a PCI arbiter is a problem, but I wouldn't
+ consider it a precondition for adding another userspace driver
+ class... it's up to the user to make sure he has only one class active,
+ or take the risk of not doing so...
+ <antrik> (plus, I suspect writing the arbiter is a smaller task than
+ implementing another DDE class anyways...)
+ <bddebian> Where would the arbiter need to reside, in gnumach?
+ <antrik> bddebian: kernel would be one possible place (with the advantage
+ of running both userspace and kernel drivers without the potential for
+ conflicts)
+ <antrik> but I think I would prefer a userspace server
+ <youpi> antrik: we'd rather have PCI devices automatically set up
+ <youpi> just like /dev/netdde is already set up for the user
+ <youpi> so you can't count on the user
+    <youpi> for the arbiter, it could as well be userland, while still
+ interacting with the kernel for some devices
+ <youpi> we however "just" need to get disk drivers in userland to drop PCI
+ drivers from kernel, actually
+
+
+# IRC, freenode, #hurd, 2012-07-17
+
+ <bddebian> youpi: So this PCI arbiter should be a hurd server?
+ <youpi> that'd be better
+ <bddebian> youpi: Is there anything existing to look at as a basis?
+ <youpi> no idea off-hand
+ <bddebian> I mean you couldn't take what netdde does and generalize it?
+ <youpi> netdde doesn't do any arbitration
+
+
+# IRC, OFTC, #debian-hurd, 2012-07-19
+
+ <bdefreese> youpi: Well at some point if you ever have time I'd like to
+ understand better how you see the PCI architecture working in Hurd.
+ I.E. would you expect the server to do enumeration and arbitration?
+ <youpi> I'd expect both, yes, but that's probably to be discussed rather
+ with antrik, he's the one who took some time to think about it
+ <bdefreese> netdde uses libpciaccess currently, right?
+ <youpi> yes
+    <youpi> libpciaccess would have to be fixed into using the arbiter
+ <youpi> (that'd fix xorg as well)
+    <bdefreese> Man, I am still a bit unclear on how this is all
+      interacting currently.. :(
+ <youpi> currently it's not
+ <youpi> and it's just by luck that it doesn't break
+ <bdefreese> Long term xxxdde would use the new server, correct?
+ <youpi> (well, we are also sure that the gnumach enumeration comes always
+ before the netdde enumeration, and xorg is currently not started
+ automatically, so its enumeration is also always after that)
+ <youpi> yes
+ <youpi> the server would essentially provide an interface equivalent to
+ libpciaccess
+ <bdefreese> Right
+ <bdefreese> In general, where does the pci map get "stored"? In GNU/Linux,
+ is it all /proc based?
+ <youpi> what do you mean by "pci map" ?
+ <bdefreese> Once I have enumerated all of the buses and devices, does it
+ stay stored or is it just redone for every call to a pci device?
+ <youpi> in linux it's stored in the kernel
+    <youpi> the arbiter would store it itself
+
+
+# IRC, freenode, #hurd, 2012-07-20
+
+ <bddebian> antrik: BTW, youpi says you are the one to talk to for design of
+ a PCI server :)
+ <antrik> oh, am I?
+ * antrik feels honoured :-)
+ <antrik> I guess it's true though: I *did* spent a little thought on
+ it... even mentioned something in my thesis IIRC
+ <antrik> there is one tricky aspect to it though, which I'm not sure how to
+ handle best: we need two different instances of libpciaccess
+ <bddebian> Why two instances of libpciaccess?
+ <antrik> one used by the PCI server to access the hardware directly (using
+ the existing port poking backend), and one using a new backend to access
+ our PCI server...
+ <braunr> bddebian: hum, both i guess ?
+ <bddebian> antrik: Why wouldn't the server access the hardware directly? I
+ thought libpciaccess was supposed to be generic on purpose?
+ <antrik> hm... guess I wasn't clear
+ <antrik> the point is that the PCI server should use the direct hardware
+ access backend of libpciaccess
+ <antrik> however, *clients* should use the PCI server backend of
+ libpciaccess
+ <antrik> I'm not sure backends can be selected at runtime...
+ <antrik> which might mean that we actually have to compile two different
+ versions of the library. erk.
+    <bddebian> So you are saying the pci server should itself use
+      libpciaccess rather than having its own?
+ <antrik> admittedly, that's not the most fundamental design decision to
+ make ;-)
+ <antrik> bddebian: yes. no need to rewrite (or copy) this code...
+ <bddebian> Hmm
+ <antrik> actually that was the plan all along when I first suggested
+ implementing the register poking backend for libpciaccess
+ <bddebian> Hmm, not sure I like it but I am certainly in no position to
+ question it right now :)
+ <braunr> why don't you like it ?
+ <bddebian> I shouldn't need an Xorg specific library to access PCI on my OS
+ :)
+ <braunr> oh
+ <bddebian> Though I don't disagree that reinventing the wheel is a bit
+ tedious. :)
+ <antrik> bddebian: although it originates from X.Org, I don't think there
+ is anything about the library technically making it X-specific...
+ <braunr> yes that's my opinion too
+ <antrik> (well, there are some X-specific functions IIRC, but these do not
+ hurt the other functionality)
+ <bddebian> But what is there is api/abi breakage? :)
+ <bddebian> s/is/if/
+ <antrik> BTW according to rdepends there appear to be a number of non-X
+ things using the library now
+ <pinotree> like, uhm, hurd
+ <antrik> yeah, that too... we are already using it for DDE
+ <pinotree> if you have deb-src lines in your sources.list, use the
+ grep-dctrl power:
+ <pinotree> grep-dctrl -sPackage -FBuild-Depends libpciaccess-dev
+ /var/lib/apt/lists/*_source_Sources | sort -u
+ <bddebian> I know we are using it for netdde.
+ <antrik> nice thing about it is that once we have the PCI server and an
+ appropriate backend for libpciaccess, the same netdde and X binaries
+ should work either with or without the PCI server
+ <bddebian> Then why have the server at all?
+ <braunr> it's the arbiter
+ <braunr> you can use the library directly only if you're the only user
+ <braunr> and what antrik means is that the interface should be the same for
+ both modes
+ <bddebian> Ugh, that is where I am getting confused
+ <bddebian> In that case shouldn't everything use libpciaccess and the PCI
+ server has to arbitrate the requests?
+ <braunr> bd ?
+ <braunr> bddebian: yes
+ <braunr> bddebian: but they use the indirect version of the library
+ <braunr> whereas the server uses the raw version
+ <bddebian> OK, I gotcha (I think)
+ <braunr> (but they both provide the same interface, so if you don't have a
+ pci server and you know you're the only user, the direct version can be
+ used)
+ <bddebian> But I am not sure I see the difference between creating a second
+ library or just moving the raw access to the PCI server :)
+ <braunr> uh, there is no difference in that
+ <braunr> and you shouldn't do it
+ <braunr> (if that's what antrik meant at least)
+ <braunr> if you can select the backend (raw or pci server) easily, then
+ stick to the same code base
+ <bddebian> That's where I struggle. In my worthless opinion, raw access
+      should be the OS's job while indirect access would be the library's
+      responsibility
+ <braunr> that's true
+ <braunr> but as an optimization, if an application is the only user, it can
+ directly use raw access
+ <bddebian> How would you know that?
+ <bddebian> I'm sorry if these are dumb questions
+ <braunr> hum, don't try to make this behaviour automatic
+ <braunr> it would be selected by the user through command line switches
+ <bddebian> But the OS itself uses PCI for things like disk access and
+ video, no?
+ <braunr> (it could be automatic but it makes things more complicated)
+ <braunr> you don't need an arbiter all the time
+ <braunr> i can't tell you more, wait for antrik to return
+ <braunr> i realize i might already have said some bullshit
+ <antrik> bddebian: well, you have a point there that once we have the
+ arbiter and use it for everthing, it isn't strictly useful to still have
+ the register poking in the library
+ <antrik> however, the code will remain in the library anyways, so we better
+ continue using it rather than introducing redundancy...
+ <antrik> but again, that's rather a side issue concerning the design of the
+ PCI server
+ <bddebian> antrik: Fair enough. :) So how would I even start on this?
+ <antrik> bddebian: actually, libpciaccess is a good starting point:
+ checking the API should give you a fairly good idea what functionality
+ the server needs to implement
+ <pinotree> (+1 on library (re)use)
+ <bddebian> antrik: KK
+ <antrik> sorry, I'm a bit busy right now...
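+
+As a concrete starting point, the enumeration part of the libpciaccess
+API the arbiter would have to mirror looks roughly like this (the
+functions and struct fields are the real libpciaccess ones; error
+handling is minimal):
+
+    #include <pciaccess.h>
+    #include <stdio.h>
+
+    int
+    main (void)
+    {
+      if (pci_system_init ())
+        return 1;
+
+      /* A NULL slot match iterates over every PCI device.  */
+      struct pci_device_iterator *it
+        = pci_slot_match_iterator_create (NULL);
+      struct pci_device *dev;
+
+      while ((dev = pci_device_next (it)) != NULL)
+        {
+          pci_device_probe (dev);   /* fill in device information */
+          printf ("%02x:%02x.%d %04x:%04x\n", dev->bus, dev->dev,
+                  dev->func, dev->vendor_id, dev->device_id);
+        }
+
+      pci_iterator_destroy (it);
+      pci_system_cleanup ();
+      return 0;
+    }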
diff --git a/open_issues/performance.mdwn b/open_issues/performance.mdwn
index 8dbe1160..ae05e128 100644
--- a/open_issues/performance.mdwn
+++ b/open_issues/performance.mdwn
@@ -52,3 +52,166 @@ call|/glibc/fork]]'s case.
<braunr> the more i study the code, the more i think a lot of time is
wasted on cpu, unlike the common belief of the lack of performance being
only due to I/O
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <braunr> there are several kinds of scalability issues
+ <braunr> iirc, i found some big locks in core libraries like libpager and
+ libdiskfs
+ <braunr> but anyway we can live with those
+ <braunr> in the case i observed, ext2fs, relying on libdiskfs and libpager,
+ scans the entire file list to ask for writebacks, as it can't know if the
+ pages are dirty or not
+ <braunr> the mistake here is moving part of the pageout policy out of the
+ kernel
+ <braunr> so it would require the kernel to handle periodic synces of the
+ page cache
+ <antrik> braunr: as for big locks: considering that we don't have any SMP
+ so far, does it really matter?...
+ <braunr> antrik: yes
+ <braunr> we have multithreading
+    <braunr> there is no reason to block many threads if most of them could
+      continue
+ <antrik> so that's more about latency than throughput?
+ <braunr> considering sleeping/waking is expensive, it's also about
+ throughput
+ <braunr> currently, everything that deals with sleepable locks (both
+ gnumach and the hurd) just wake every thread waiting for an event when
+ the event occurs (there are a few exceptions, but not many)
+ <antrik> ouch
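+
+In pthread terms, the wake-all pattern braunr describes looks like the
+first call below; every waiter wakes up, one wins, and the rest pay a
+useless sleep/wake cycle each:
+
+    /* What most sleepable-lock paths effectively do today:  */
+    pthread_cond_broadcast (&cond);  /* every waiter wakes, one wins */
+
+    /* What they could do whenever a single waiter can proceed:  */
+    pthread_cond_signal (&cond);     /* exactly one waiter wakes */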
+
+
+## [[!message-id "20121202101508.GA30541@mail.sceen.net"]]
+
+
+## IRC, freenode, #hurd, 2012-12-04
+
+ <damo22> why do some people think hurd is slow? i find it works well even
+ under heavy load inside a virtual machine
+ <braunr> damo22: the virtual machine actually assists the hurd a lot :p
+ <braunr> but even with that, the hurd is a slow system
+ <damo22> i would have thought it would have the potential to be very fast,
+ considering the model of the kernel
+ <braunr> the design implies by definition more overhead, but the true cause
+ is more than 15 years without optimization on the core components
+ <braunr> how so ?
+ <damo22> since there are less layers of code between the hardware bare
+ metal and the application that users run
+ <braunr> how so ? :)
+ <braunr> it's the contrary actually
+ <damo22> VFS -> IPC -> scheduler -> device drivers -> hardware
+ <damo22> that is monolithic
+ <braunr> well, it's not really meaningful
+ <braunr> and i'd say the same applies for a microkernel system
+ <damo22> if the application can talk directly to hardware through the
+ kernel its almost like plugging directly into the hardware
+ <braunr> you never talk directly to hardware
+ <braunr> you talk to servers instead of the kernel
+ <damo22> ah
+ <braunr> consider monolithic kernel systems like systems with one big
+ server
+ <braunr> the kernel
+ <braunr> whereas a multiserver system is a kernel and many servers
+ <braunr> you still need the VFS to identify your service (and thus your
+ server)
+ <braunr> you need much more IPC, since system calls are "replaced" with RPC
+ <braunr> the scheduler is basically the same
+ <damo22> okay
+ <braunr> device drivers are similar too, except they run in thread context
+ (which is usually a bit heavier)
+ <damo22> but you can do cool things like report when an interrupt line is
+ blocked
+ <braunr> and there are many context switches between all that
+ <braunr> you can do all that in a monolithic kernel too, and faster
+ <braunr> but it's far more elegant, and (when well done) easy to do on a
+ microkernel based system
+ <damo22> yes
+ <damo22> i like elegant, makes coding easier if you know the basics
+    <braunr> there are only two major differences between a monolithic kernel
+ and a multiserver microkernel system
+ * damo22 listens
+ <braunr> 1/ independence of location (your resources could be anywhere)
+ <braunr> 2/ separation of address spaces (your servers have their own
+ addresses)
+ <damo22> wow
+ <braunr> these both imply additional layers of indirection, making the
+ system as a whole slower
+ <damo22> but it would be far more secure though i suspect
+ <braunr> yes
+ <braunr> and reliable
+ <braunr> that's why systems like qnx were usually adopted for critical
+ tasks
+ <damo22> security and reliability are very important, i would switch to the
+ hurd if it supported all the hardware i use
+ <braunr> so would i :)
+ <braunr> but performance matters too
+ <damo22> not to me
+ <braunr> it should :p
+ <braunr> it really does matter a lot in practice
+ <damo22> i mean, a 2x slowdown compared to linux would not affect me
+ <damo22> if it had all the benefits we mentioned above
+    <braunr> but the hurd is really slow for other reasons than its
+      additional layers of indirection unfortunately
+ <damo22> is it because of lack of optimisation in the core code?
+ <braunr> we're working on these issues, but it's not easy and takes a lot
+ of time :p
+ <damo22> like you said
+ <braunr> yes
+ <braunr> and also because of some fundamental design choices related to the
+ microkernel back in the 80s
+ <damo22> what about the darwin system
+ <damo22> it uses a mach kernel?
+ <braunr> yes
+ <damo22> what is stopping someone taking the MIT code from darwin and
+ creating a monster free OS
+ <braunr> what for ?
+ <damo22> because it already has hardware support
+ <damo22> and a mach kernel
+ <braunr> in kernel drivers ?
+ <damo22> it has kernel extensions
+ <damo22> you can do things like kextload module
+    <braunr> first, being a mach kernel doesn't make it compatible or even
+      easily usable with the hurd, the interfaces have evolved
+      independently
+ <braunr> and second, we really do want more stuff out of the kernel
+ <braunr> drivers in particular
+ <damo22> may i ask why you are very keen to have drivers out of kernel?
+ <braunr> for the same reason we want other system services out of the
+ kernel
+ <braunr> security, reliability, etc..
+ <braunr> ease of debugging
+ <braunr> the ability to restart drivers separately, without restarting the
+ kernel
+ <damo22> i see
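+
+To make the "system calls are replaced with RPCs" point concrete, here is
+roughly what a read amounts to on the Hurd: the client holds a port to the
+server backing the file and performs an io_read RPC on it, instead of
+trapping into an in-kernel VFS.  A sketch; check the exact signatures
+against hurd/io.defs.
+
+    #include <errno.h>
+    #include <error.h>
+    #include <fcntl.h>
+    #include <hurd.h>
+    #include <hurd/io.h>
+    #include <mach.h>
+    #include <stdio.h>
+
+    int
+    main (void)
+    {
+      /* Name lookup returns a port to the translator backing the node.  */
+      file_t node = file_name_lookup ("/etc/motd", O_RDONLY, 0);
+      if (node == MACH_PORT_NULL)
+        error (1, errno, "file_name_lookup");
+
+      char *data = NULL;
+      mach_msg_type_number_t len = 0;
+
+      /* The "system call": an RPC to the file system server.
+         -1 means read at the current file pointer.  */
+      error_t err = io_read (node, &data, &len, -1, 1024);
+      if (!err)
+        {
+          fwrite (data, 1, len, stdout);
+          vm_deallocate (mach_task_self (), (vm_address_t) data, len);
+        }
+      mach_port_deallocate (mach_task_self (), node);
+      return err ? 1 : 0;
+    }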
+
+
+# IRC, freenode, #hurd, 2012-09-13
+
+{{$news/2011-q2#phoronix-3}}.
+
+ <braunr> the phoronix benchmarks don't actually test the operating system
+ ..
+ <hroi_> braunr: well, at least it tests its ability to run programs for
+ those particular tasks
+ <braunr> exactly, it tests how programs that don't make much use of the
+ operating system run
+ <braunr> well yes, we can run programs :)
+ <pinotree> those are just cpu-taking tasks
+ <hroi_> ok
+ <pinotree> if you do a benchmark with also i/o, you can see how it is
+ (quite) slower on hurd
+ <hroi_> perhaps they should have run 10 of those programs in parallel, that
+ would test the kernel multitasking I suppose
+ <braunr> not even I/O, simply system calls
+ <braunr> no, multitasking is ok on the hurd
+ <braunr> and it's very similar to what is done on other systems, which
+ hasn't changed much for a long time
+ <braunr> (except for multiprocessor)
+ <braunr> true OS benchmarks measure system calls
+    <hroi_> ok, so I'm sensing the view that the actual OS kernel
+      architecture doesn't really make that much difference, good software
+      does
+ <braunr> not at all
+ <braunr> i'm only saying that the phoronix benchmark results are useless
+ <braunr> because they didn't measure the right thing
+ <hroi_> ok
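+
+To illustrate "true OS benchmarks measure system calls": a crude
+microbenchmark that times a syscall-bound loop instead of a CPU-bound one.
+On the Hurd each iteration additionally crosses IPC to the server
+providing the file, which is exactly the overhead CPU-bound benchmarks
+never see.  A sketch:
+
+    #include <fcntl.h>
+    #include <stdio.h>
+    #include <sys/time.h>
+    #include <unistd.h>
+
+    int
+    main (void)
+    {
+      enum { N = 100000 };
+      char buf[1];
+      struct timeval t0, t1;
+      int fd = open ("/dev/zero", O_RDONLY);
+
+      gettimeofday (&t0, NULL);
+      for (int i = 0; i < N; i++)
+        /* each iteration is a full round trip into the OS */
+        read (fd, buf, 1);
+      gettimeofday (&t1, NULL);
+
+      double us = (t1.tv_sec - t0.tv_sec) * 1e6
+                  + (t1.tv_usec - t0.tv_usec);
+      printf ("%.2f us per read\n", us / N);
+      return 0;
+    }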
diff --git a/open_issues/performance/io_system/read-ahead.mdwn b/open_issues/performance/io_system/read-ahead.mdwn
index 710c746b..706e1632 100644
--- a/open_issues/performance/io_system/read-ahead.mdwn
+++ b/open_issues/performance/io_system/read-ahead.mdwn
@@ -1565,3 +1565,994 @@ License|/fdl]]."]]"""]]
<braunr> mcsim1: just use sane values inside the kernel :p
<braunr> this simplifies things by only adding the new vm_advise call and
not change the existing external pager interface
+
+
+## IRC, freenode, #hurd, 2012-07-12
+
+ <braunr> mcsim: so, to begin with, tell us what state you've reached please
+    <mcsim> braunr: I'm writing code for the hurd and gnumach. For gnumach
+      I'm implementing memory policies now. RANDOM and NORMAL seem to
+      work, but in the hurd I found an error that I made while editing
+      ext2fs. So for now ext2fs does not work
+ <braunr> policies ?
+ <braunr> what about mechanism ?
+ <mcsim> also I moved some translators to new interface.
+ <mcsim> It works too
+ <braunr> well that's impressive
+ <mcsim> braunr: I'm not sure yet that everything works
+ <braunr> right, but that's already a very good step
+ <braunr> i thought you were still working on the interfaces to be honest
+ <mcsim> And with mechanism I didn't implement moving pages to inactive
+ queue
+ <braunr> what do you mean ?
+ <braunr> ah you mean with the sequential policy ?
+ <mcsim> yes
+ <braunr> you can consider this a secondary goal
+ <mcsim> sequential I was going to implement like you've said, but I still
+ want to support moving pages to inactive queue
+ <braunr> i think you shouldn't
+ <braunr> first get to a state where clustered transfers do work fine
+ <mcsim> policies are implemented in function calculate_clusters
+ <braunr> then, you can try, and measure the difference
+ <mcsim> ok. I'm now working on fixing ext2fs
+ <braunr> so, except from bug squashing, what's left to do ?
+ <mcsim> finish policies and ext2fs; move fatfs, ufs, isofs to new
+ interface; test this all; edit patches from debian repository, that
+ conflict with my changes; rearrange commits and fix code indentation;
+ update documentation;
+ <braunr> think about measurements too
+ <tschwinge> mcsim: Please don't spend a lot of time on ufs. No testing
+ required for that one.
+ <braunr> and keep us informed about your progress on bug fixing, so we can
+ test soon
+    <mcsim> Forgot about moving the system to the new interfaces (I mean
+      the final form of vm_advise and memory_object_change_attributes)
+ <braunr> what do you mean "moving system to new interfaces" ?
+ <mcsim> braunr: I also pushed code changes to gnumach and hurd git
+ repositories
+ <mcsim> I met an issue with memory_object_change_attributes when I tried to
+ use it as I have to update all applications that use it. This includes
+ libc and translators that are not in hurd repository or use debian
+ patches. So I will not be able to run system with new
+ memory_object_change_attributes interface, until I update all software
+ that use this rpc
+ <braunr> this is a bit like the problem i had with my change
+ <braunr> the solution is : don't do it
+ <braunr> i mean, don't change the interface in an incompatible way
+ <braunr> if you can't change an existing call, add a new one
+    <mcsim> temporarily I changed memory_object_set_attributes as it isn't
+      used any more.
+ <mcsim> braunr: ok. Adding new call is a good idea :)
+
+
+## IRC, freenode, #hurd, 2012-07-16
+
+ <braunr> mcsim: how did you deal with multiple page transfers towards the
+ default pager ?
+ <mcsim> braunr: hello. Didn't handle this yet, but AFAIR default pager
+ supports multiple page transfers.
+ <braunr> mcsim: i'm almost sure it doesn't
+ <mcsim> braunr: indeed
+    <mcsim> braunr: So, I'll update it just as other translators.
+    <braunr> like other translators you mean ?
+    <mcsim> braunr: yes
+ <braunr> ok
+ <braunr> be aware also that it may need some support in vm_pageout.c in
+ gnumach
+ <mcsim> braunr: thank you
+ <braunr> if you see anything strange in the default pager, don't hesitate
+ to talk about it
+ <mcsim> braunr: ok. I didn't finish with ext2fs yet.
+ <braunr> so it's a good thing you're aware of it now, before you begin
+ working on it :)
+ <mcsim> braunr: I'm working on ext2 now.
+ <braunr> yes i understand
+ <braunr> i meant "before beginning work on the default pager"
+ <mcsim> ok
+
+ <antrik> mcsim: BTW, we were mostly talking about readahead (pagein) over
+ the past weeks, so I wonder what the status on clustered page*out* is?...
+    <mcsim> antrik: I don't work on this, but the following, I think, is
+      an example of *clustered* pageout:
+      _pager_seqnos_memory_object_data_return: object = 113, seqno = 4,
+      control = 120, start_address = 0, length = 8192, dirty = 1. This is
+      a debugging printout showing that pageout manipulates chunks bigger
+      than page size.
+ <mcsim> antrik: Another one with bigger length
+ _pager_seqnos_memory_object_data_return: object = 125, seqno = 124,
+ control = 132, start_address = 131072, length = 126976, dirty = 1, kcopy
+ <antrik> mcsim: that's odd -- I didn't know the functionality for that even
+ exists in our codebase...
+    <antrik> my understanding was that Mach always sends individual
+      pageout requests for every single page it wants cleaned...
+ <antrik> (and this being the reason for the dreadful thread storms we are
+ facing...)
+ <braunr> antrik: ok
+ <braunr> antrik: yes that's what is happening
+ <braunr> the thread storms aren't that much of a problem now
+ <braunr> (by carefully throttling pageouts, which is a task i intend to
+ work on during the following months, this won't be an issue any more)
+
+
+## IRC, freenode, #hurd, 2012-07-19
+
+ <mcsim> I moved fatfs, ufs, isofs to new interface, corrected some errors
+ in other that I already moved, moved kernel to new interface (renamed
+ vm_advice to vm_advise and added rpcs memory_object_set_advice and
+ memory_object_get_advice). Made some changes in mechanism and tried to
+ finish ext2 translator.
+ <mcsim> braunr: I've got an issue with fictitious pages...
+ <mcsim> When I determine bounds of cluster in external object I never know
+ its actual size. So, mo_data_request call could ask data that are behind
+ object bounds. The problem is that pager returns data that it has and
+ because of this fictitious pages that were allocated are not freed.
+ <braunr> why don't you know the size ?
+ <mcsim> I see 2 solutions. First one is do not allocate fictitious pages at
+ all (but I think that there could be issues). Another lies in allocating
+ fictitious pages, but then freeing them with mo_data_lock.
+    <mcsim> braunr: Because pagers do not inform the kernel about object
+      size.
+ <braunr> i don't understand what you mean
+ <mcsim> I think that second way is better.
+ <braunr> so how does it happen ?
+ <braunr> you get a page fault
+ <mcsim> Don't you understand problem or solutions?
+ <braunr> then a lookup in the map finds the map entry
+ <braunr> and the map entry gives you the link to the underlying object
+    <mcsim> from vm_object.h: vm_size_t size; /* Object size (only valid
+      if internal) */
+ <braunr> mcsim: ugh
+ <mcsim> For external they are either 0x8000 or 0x20000...
+ <braunr> and for internal ?
+ <braunr> i'm very surprised to learn that
+ <mcsim> braunr: for internal size is actual
+ <braunr> right sorry, wrong question
+ <braunr> did you find what 0x8000 and 0x20000 are ?
+ <mcsim> for external I met only these 2 magic numbers when printed out
+ arguments of functions _pager_seqno_memory_object_... when they were
+ called.
+ <braunr> yes but did you try to find out where they come from ?
+ <mcsim> braunr: no. I think that 0x2000(many zeros) is maximal possible
+ object size.
+ <braunr> what's the exact value ?
+ <mcsim> can't tell exactly :/ My hurd box has broken again.
+ <braunr> mcsim: how does the vm find the backing content then ?
+ <mcsim> braunr: Do you know if it is guaranteed that map_entry size will be
+ not bigger than external object size?
+ <braunr> mcsim: i know it's not
+ <braunr> but you can use the map entry boundaries though
+ <mcsim> braunr: vm asks pager
+ <braunr> but if the page is already present
+ <braunr> how does it know ?
+ <braunr> it must be inside a vm_object ..
+    <mcsim> If I can use these boundaries then the problem I described is
+      not real.
+ <braunr> good
+ <braunr> it makes sense to use these boundaries, as the application can't
+ use data outside the mapping
+ <mcsim> I ask page with vm_page_lookup
+ <braunr> it would matter for shared objects, but then they have their own
+ faults :p
+ <braunr> ok
+    <braunr> so the size is actually completely ignored
+    <mcsim> if it is present then I stop expansion of the cluster.
+ <braunr> which makes sense
+ <mcsim> braunr: yes, for external.
+ <braunr> all right
+ <braunr> use the mapping boundaries, it will do
+ <braunr> mcsim: i have only one comment about what i could see
+ <braunr> mcsim: there are 'advice' fields in both vm_map_entry and
+ vm_object
+ <braunr> there should be something else in vm_object
+ <braunr> i told you about pages before and after
+ <braunr> mcsim: how are you using this per object "advice" currently ?
+    <braunr> (in addition, using the same name twice for both mechanism
+      and policy is very confusing)
+    <mcsim> braunr: I try to expand the cluster as much as possible, but
+      not more than the limit
+ <mcsim> they both determine policy, but advice for entry has bigger
+ priority
+ <braunr> that's wrong
+ <braunr> mapping and content shouldn't compete for policy
+ <braunr> the mapping tells the policy (=the advice) while the content tells
+ how to implement (e.g. how much content)
+ <braunr> IMO, you could simply get rid of the per object "advice" field and
+ use default values for now
+    <mcsim> braunr: What meaning should these values for the number of
+      pages before and after have?
+ <braunr> or use something well known, easy, and effective like preceding
+ and following pages
+ <braunr> they give the vm the amount of content to ask the backing pager
+ <mcsim> braunr: maximal amount, minimal amount or exact amount?
+ <braunr> neither
+ <braunr> that's why i recommend you forget it for now
+ <braunr> but
+ <braunr> imagine you implement the three standard policies (normal, random,
+ sequential)
+ <braunr> then the pager assigns preceding and following numbers for each of
+ them, say [5;5], [0;0], [15;15] respectively
+ <braunr> these numbers would tell the vm how many pages to ask the pagers
+ in a single request and from where
+ <mcsim> braunr: but in fact there could be much more policies.
+ <braunr> yes
+ <mcsim> also in kernel context there is no such unit as pager.
+ <braunr> so there should be a call like memory_object_set_advice(int
+ advice, int preceding, int following);
+ <braunr> for example
+ <braunr> what ?
+ <braunr> the pager is the memory manager
+ <braunr> it does exist in kernel context
+ <braunr> (or i don't understand what you mean)
+ <mcsim> there is only port, but port could be either pager or something
+ else
+ <braunr> no, it's a pager
+    <braunr> it's a port whose receive right is held by a task
+      implementing the pager interface
+ <braunr> either the default pager or an untrusted task
+ <braunr> (or null if the object is anonymous memory not yet sent to the
+ default pager)
+ <mcsim> port is always pager?
+ <braunr> the object port is, yes
+ <braunr> struct ipc_port *pager; /* Where to get
+ data */
+ <mcsim> So, you suggest to keep set of advices for each object?
+ <braunr> i suggest you don't change anything in objects for now
+ <braunr> keep the advice in the mappings only, and implement default
+ behaviour for the known policies
+ <braunr> mcsim: if you understand this point, then i have nothing more to
+ say, and we should let nowhere_man present his work
+    <mcsim> braunr: ok. I'll implement only default behaviors for known
+      policies for now.
+ <braunr> (actually, using the mapping boundaries is slightly unoptimal, as
+ we could have several mappings for the same content, e.g. a program with
+ read only executable mapping, then ro only)
+ <braunr> mcsim: another way to know the "size" is to actually lookup for
+ pages in objects
+ <braunr> hm no, that's not true
+ <mcsim> braunr: But if there is no page we have to ask it
+ <mcsim> and I don't understand why using mappings boundaries is unoptimal
+ <braunr> here is bash
+ <braunr> 0000000000400000 868K r-x-- /bin/bash
+ <braunr> 00000000006d9000 36K rw--- /bin/bash
+ <braunr> two entries, same file
+ <braunr> (there is the anonymous memory layer for the second, but it would
+ matter for the first cow faults)
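+
+A sketch of the scheme outlined above: a static table of pages to request
+before and after the faulted page for each policy (using the example
+values given), with the resulting cluster clamped to the map entry
+boundaries, since data outside the mapping can't be used by the faulting
+task anyway.  Names are illustrative, not the actual gnumach code.
+
+    /* Pages requested around the faulted page, per advice policy.  */
+    struct advice_window { unsigned int before, after; };
+
+    static const struct advice_window advice_table[] = {
+      { 5, 5 },    /* normal */
+      { 0, 0 },    /* random */
+      { 15, 15 },  /* sequential */
+    };
+
+    /* Compute the cluster [*startp, *endp) for a fault at page-aligned
+       address addr, clamped to the map entry [entry_start, entry_end).
+       vm_offset_t and PAGE_SIZE as in gnumach.  */
+    static void
+    cluster_bounds (vm_offset_t addr, unsigned int policy,
+                    vm_offset_t entry_start, vm_offset_t entry_end,
+                    vm_offset_t *startp, vm_offset_t *endp)
+    {
+      const struct advice_window *w = &advice_table[policy];
+      vm_offset_t start = addr - w->before * PAGE_SIZE;
+      vm_offset_t end = addr + (1 + w->after) * PAGE_SIZE;
+
+      if (start > addr || start < entry_start)  /* also catches wraparound */
+        start = entry_start;
+      if (end < addr || end > entry_end)
+        end = entry_end;
+      *startp = start;
+      *endp = end;
+    }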
+
+
+## IRC, freenode, #hurd, 2012-08-02
+
+ <mcsim> braunr: You said that I probably need some support in vm_pageout.c
+ to make defpager work with clustered page transfers, but TBH I thought
+ that I have to implement only pagein. Do you expect from me implementing
+ pageout either? Or I misunderstand role of vm_pageout.c?
+ <braunr> no
+    <braunr> you're expected to implement only pageins for now
+ <mcsim> well, I'm finishing merging of ext2fs patch for large stores and
+ work on defpager in parallel.
+ <mcsim> braunr: Also I didn't get your idea about configuring of paging
+ mechanism on behalf of pagers.
+ <braunr> which one ?
+    <mcsim> braunr: You said that the pager has to somehow pass the size
+      of desired clusters for different paging policies.
+ <braunr> mcsim: i said not to care about that
+ <braunr> and the wording isn't correct, it's not "on behalf of pagers"
+ <mcsim> servers?
+ <braunr> pagers could tell the kernel what size (before and after a faulted
+ page) they prefer for each existing policy
+ <braunr> but that's one way to do it
+ <braunr> defaults work well too
+ <braunr> as shown in other implementations
+
+
+## IRC, freenode, #hurd, 2012-08-09
+
+ <mcsim> braunr: I'm still debugging ext2 with large storage patch
+ <braunr> mcsim: tough problems ?
+    <mcsim> braunr: The same issues as I always meet when debugging, but
+      it takes time.
+ <braunr> mcsim: so nothing blocking so far ?
+    <mcsim> braunr: I can't tell you for sure that I will finish by the
+      13th of August, which is the unofficial pencils-down date.
+ <braunr> all right, but are you blocked ?
+    <mcsim> braunr: If you mean issues that I can not even imagine how to
+      solve, then there are none.
+ <braunr> good
+ <braunr> mcsim: i'll try to review your code again this week end
+ <braunr> mcsim: make sure to commit everything even if it's messy
+ <mcsim> braunr: ok
+ <mcsim> braunr: I made changes to defpager, but I haven't tried
+ them. Commit them too?
+ <braunr> mcsim: sure
+ <braunr> mcsim: does it work fine without the large storage patch ?
+ <mcsim> braunr: looks fine, but TBH I can't even run such things like fsx,
+ because even without my changes it failed mightily at once.
+ <braunr> mcsim: right, well, that will be part of another task :)
+
+
+## IRC, freenode, #hurd, 2012-08-13
+
+ <mcsim> braunr: hello. Seems ext2fs with large store patch works.
+
+
+## IRC, freenode, #hurd, 2012-08-19
+
+    <mcsim> hello. Consider this situation. There is a page fault and the
+      kernel decided to request several pages from the pager, but at the
+      moment the pager is able to provide only the first pages; the rest
+      are not known yet. Is it possible to supply only one page and,
+      regarding the rest, tell the kernel something like: "For the rest,
+      try again later"?
+    <mcsim> I tried pager_data_unavailable && pager_flush_some, but this
+      does not seem to work.
+ <mcsim> Or I have to supply something anyway?
+ <braunr> mcsim: better not provide them
+ <braunr> the kernel only really needs one page
+ <braunr> don't try to implement "try again later", the kernel will do that
+ if other page faults occur for those pages
+ <mcsim> braunr: No, translator just hangs
+ <braunr> ?
+    <mcsim> braunr: And I can't even detach it without a reboot
+ <braunr> hangs when what
+ <braunr> ?
+ <braunr> i mean, what happens when it hangs ?
+    <mcsim> If the kernel requests 2 pages and I provide one, then when a
+      page fault occurs in the second page the translator hangs.
+ <braunr> well that's a bug
+ <braunr> clustered pager transfer is a mere optimization, you shouldn't
+ transfer more than you can just to satisfy some requested size
+    <mcsim> I think that it is because I create fictitious pages before
+      calling mo_data_request
+ <braunr> as placeholders ?
+ <mcsim> Yes. Is it correct if I will not grab fictitious pages?
+ <braunr> no
+ <braunr> i don't know the details well enough about fictitious pages
+ unfortunately, but it really feels wrong to use them where real physical
+ pages should be used instead
+ <braunr> normally, an in-transfer page is simply marked busy
+    <mcsim> But if a page is already marked busy the kernel will not ask
+      for it another time.
+ <braunr> when the pager replies, you unbusy them
+ <braunr> your bug may be that you incorrectly use pmap
+ <braunr> you shouldn't create mmu mappings for pages you didn't receive
+ from the pagers
+ <mcsim> I don't create them
+ <braunr> ok so you correctly get the second page fault
+    <mcsim> If the pager supplies only the first page when two were
+      asked, then the second page will not become un-busy.
+ <braunr> that's a bug
+ <braunr> your code shouldn't assume the pager will provide all the pages it
+ was asked for
+ <braunr> only the main one
+    <mcsim> Will it be ok if I provide a special attribute that keeps the
+      information that a page has been advised?
+ <braunr> what for ?
+ <braunr> i don't understand "page has been advised"
+    <mcsim> An advised page is a page that was asked for in a cluster,
+      but there wasn't a page fault in it.
+    <mcsim> I need this attribute because if I don't inform the kernel
+      about this page somehow, then the kernel will not change the
+      attributes of this page.
+ <braunr> why would it change its attributes ?
+    <mcsim> But if a page fault occurs in a page that was asked for, the
+      page will already be busy by that moment.
+ <braunr> and what attribute ?
+ <mcsim> advised
+ <braunr> i'm lost
+    <braunr> 08:53 < mcsim> I need this attribute because if I don't
+      inform the kernel about this page somehow, then the kernel will not
+      change the attributes of this page.
+ <braunr> you need the advised attribute because if you don't inform the
+ kernel about this page, the kernel will not change the advised attribute
+ of this page ?
+ <mcsim> Not only advised, but busy as well.
+    <mcsim> And if a page fault occurs in this page, the kernel will not
+      ask for it a second time. The kernel will just block.
+ <braunr> well that's normal
+    <mcsim> But if the kernel blocks and the pager is not going to report
+      about this page somehow, then the translator will hang.
+ <braunr> but the pager is going to report
+    <braunr> and in this report, there can be fewer pages than requested
+ <mcsim> braunr: You told not to report
+ <braunr> the kernel can deduce it didn't receive all the pages, and mark
+ them unbusy anyway
+ <braunr> i told not to transfer more than requested
+ <braunr> but not sending data can be a form of communication
+ <braunr> i mean, sending a message in which data is missing
+    <braunr> it simply means it's not there, but this info is sufficient
+      for the kernel
+ <mcsim> hmmm... Seems I understood you. Let me try something.
+ <mcsim> braunr: I informed kernel about missing page as follows:
+ pager_data_supply (pager, precious, writelock, i, 1, NULL, 0); Am I
+ right?
+ <braunr> i don't know the interface well
+ <braunr> what does it mean
+ <braunr> ?
+ <braunr> are you passing NULL as the data for a missing page ?
+ <mcsim> yes
+ <braunr> i see
+ <braunr> you shouldn't need a request for that though, avoiding useless ipc
+ is a good thing
+ <mcsim> i is number of page, 1 is quantity
+ <braunr> but if you can't find a better way for now, it will do
+ <mcsim> But this does not work :(
+ <braunr> that's a bug
+ <braunr> in your code probably
+ <mcsim> braunr: supplying NULL as data returns MACH_SEND_INVALID_MEMORY
+ <braunr> but why would it work ?
+ <braunr> mach expects something
+ <braunr> you have to change that
+ <mcsim> It's mig who refuses data. Mach does not even get the call.
+ <braunr> hum
+ <mcsim> That's why I propose to provide new attribute, that will keep
+ information regarding whether the page was asked as advice or not.
+ <braunr> i still don't understand why
+    <braunr> why don't you fix mig so you can send your null message
+      instead ?
+ <mcsim> braunr: because usually this is an error
+    <braunr> the kernel will decide if it's an error
+    <braunr> what kind of reply do you intend to send the kernel for
+      these "advised" pages ?
+    <mcsim> no reply. But when a page fault occurs in a busy page that is
+      also advised, the kernel will not block, but ask for this page
+      another time.
+    <mcsim> And how will the kernel know whether this is an error or not?
+ <braunr> why ask another time ?!
+ <braunr> you really don't want to flood pagers with useless messages
+ <braunr> here is how it should be
+ <braunr> 1/ the kernel requests pages from the pager
+    <braunr> it knows the range
+ <braunr> 2/ the pager replies what it can, full range, subset of it, even
+ only one page
+ <braunr> 3/ the kernel uses what the pager replied, and unbusies the other
+ pages
+    <mcsim> The first time the page was asked for because a page fault
+      occurred in its neighborhood. And the second time because a PF
+      occurred in the page itself.
+ <braunr> well it shouldn't
+ <braunr> or it should, but then you have a segfault
+    <mcsim> But the kernel does not keep the bounds of the range that it
+      asked for.
+ <braunr> if the kernel can't find the main page, the one it needs to make
+ progress, it's a segfault
+ <mcsim> And this range could be supplied in several messages.
+ <braunr> absolutely not
+ <braunr> you defeat the purpose of clustered pageins if you use several
+ messages
+ <mcsim> But interface supports it
+    <braunr> the interface supports single page transfers, that doesn't
+      mean it's good
+ <braunr> well, you could use several messages
+ <braunr> as what we really want is less I/O
+    <mcsim> No one keeps the bounds of the requested range, so it can't
+      be checked whether the range was split
+ <braunr> but it would be so much better to do it all with as few messages
+ as possible
+    <braunr> does the kernel know the main page ?
+ <mcsim> Splitting range is not optimal, but it's not an error.
+ <braunr> i assume it does
+ <braunr> doesn't it ?
+ <mcsim> no, that's why I want to provide new attribute.
+ <braunr> i'm sorry i'm lost again
+    <braunr> how does the kernel know a page fault has been serviced ?
+ <mcsim> It receives an interrupt
+ <braunr> ?
+ <braunr> let's not mix terms
+ <mcsim> oh.. I read as received. Sorry
+    <mcsim> It gets the mo_data_supply message. Then it replaces
+      fictitious pages with real ones.
+ <braunr> so you get a message
+ <braunr> and you kept track of the range using fictitious pages
+ <braunr> use the busy flag instead, and another way to retain the range
+    <mcsim> I allocate fictitious pages to reserve a place. Then, if a
+      page fault occurs in this fictitious page, the kernel will not send
+      another mo_data_request call; it will wait until the fictitious
+      page unblocks.
+ <braunr> i'll have to check the code but it looks unoptimal to me
+ <braunr> we really don't want to allocate useless objects when a simple
+ busy flag would do
+ <mcsim> busy flag for what? There is no page yet
+ <braunr> we're talking about mo_data_supply
+ <braunr> actually we're talking about the whole page fault process
+    <mcsim> There is nothing to mark as busy; that's why the kernel
+      allocates a fictitious page and marks it busy until the real page
+      is supplied.
+ <braunr> what do you mean "nothing" ?
+ <mcsim> VM_PAGE_NULL
+ <braunr> uh ?
+ <braunr> when are physical pages allocated ?
+ <braunr> on request or on reply from the pager ?
+ <braunr> i'm reading mo_data_supply, and it looks like the page is already
+ busy at that time
+    <mcsim> they are allocated by the pager and then supplied in reply
+ <mcsim> Yes, but these pages are fictitious
+ <braunr> show me please
+ <braunr> in the master branch, not yours
+ <mcsim> that page is fictitious?
+ <braunr> yes
+ <braunr> i'm referring to the way mach currently does things
+ <mcsim> vm/vm_fault.c:582
+ <braunr> that's memory_object_lock_page
+ <braunr> hm wait
+ <braunr> my bad
+ <braunr> ah that damn object chaining :/
+ <braunr> ok
+ <braunr> the original code is stupid enough to use fictitious pages all the
+ time, you probably have to do the same
+    <mcsim> hm... Attributes will be useless; the pager should say
+      something about the pages that it is not going to supply.
+ <braunr> yes
+ <braunr> that's what null is for
+ <mcsim> Not null, null is error.
+ <braunr> one problem i can think of is making sure the kernel doesn't
+ interpret missing as error
+ <braunr> right
+ <mcsim> I think better have special value for mo_data_error
+ <braunr> probably
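+
+The rule stated above, as kernel-side pseudocode: when the pager's reply
+covers less than the requested range, the placeholder pages that were not
+supplied must be released, so that later faults ask again instead of
+blocking forever.  Names are from gnumach's vm/vm_page.h (vm_page_lookup,
+PAGE_WAKEUP_DONE, VM_PAGE_FREE); locking is omitted and the exact release
+protocol should be checked against vm_fault.c.  Treat this as a sketch of
+the idea, not the actual fix.
+
+    /* We requested [start, end) from the pager; its reply supplied only
+       [supplied_start, supplied_end).  The object is assumed locked.  */
+    vm_offset_t off;
+
+    for (off = start; off < end; off += PAGE_SIZE)
+      {
+        vm_page_t m;
+
+        if (off >= supplied_start && off < supplied_end)
+          continue;  /* real data arrived for this page */
+
+        m = vm_page_lookup (object, off);
+        if (m != VM_PAGE_NULL && m->busy && m->fictitious)
+          {
+            /* Wake any thread sleeping on the placeholder and drop it;
+               a later fault will simply ask the pager again.  */
+            PAGE_WAKEUP_DONE (m);
+            VM_PAGE_FREE (m);
+          }
+      }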
+
+
+### IRC, freenode, #hurd, 2012-08-20
+
+ <antrik> braunr: I think it's useful to allow supplying the data in several
+ batches. the kernel should *not* assume that any data missing in the
+ first batch won't be supplied later.
+ <braunr> antrik: it really depends
+ <braunr> i personally prefer synchronous approaches
+ <antrik> demanding that all data is supplied at once could actually turn
+ readahead into a performace killer
+ <mcsim> antrik: Why? The only drawback I see is higher response time for
+ page fault, but it also leads to reduced overhead.
+ <braunr> that's why "it depends"
+ <braunr> mcsim: it brings benefit only if enough preloaded pages are
+ actually used to compensate for the time it took the pager to provide
+ them
+ <braunr> which is the case for many workloads (including sequential access,
+ which is the common case we want to optimize here)
+ <antrik> mcsim: the overhead of an extra RPC is negligible compared to
+ increased latencies when dealing with slow backing stores (such as disk
+ or network)
+    <mcsim> antrik: also many replies lead to fragmentation, while in one
+      reply all data is gathered in one bunch. If all data is placed
+      consecutively, then it may be transferred faster next time.
+ <braunr> mcsim: what kind of fragmentation ?
+    <antrik> I really really don't think it's a good idea for the pager
+      to hold back the first page (which is usually the one actually
+      blocking) while it's still loading some other pages (which will
+      probably be needed only in the future anyways, if at all)
+ <braunr> antrik: then all pagers should be changed to handle asynchronous
+ data supply
+ <braunr> it's a bit late to change that now
+    <mcsim> there could be two cases of data placement in the backing
+      store: 1/ all asked data is placed consecutively; 2/ it is spread
+      among the backing store. If the pager gets data in one message it
+      is more likely to place it consecutively. So to have data
+      consecutive in each pager, each pager has to try to send data in
+      one message. Having data placed consecutively is important, since
+      reading such data is much faster.
+ <braunr> mcsim: you're confusing things ..
+ <braunr> or you're not telling them properly
+ <mcsim> Ok. Let me try one more time
+ <braunr> since you're working *only* on pagein, not pageout, how do you
+ expect spread pages being sent in a single message be better than
+ multiple messages ?
+ <mcsim> braunr: I think about future :)
+ <braunr> ok
+ <braunr> but antrik is right, paging in too much can reduce performance
+    <braunr> so the default policy should be adjusted for both the worst
+      case (one page) and the average/best (some/many contiguous pages)
+ <braunr> through measurement ideally
+ <antrik> mcsim: BTW, I still think implementing clustered pageout has
+ higher priority than implementing madvise()... but if the latter is less
+ work, it might still make sense to do it first of course :-)
+ <braunr> there aren't many users of madvise, true
+    <mcsim> antrik: I expect implementing madvise to be very simple. It
+      should just translate the call to vm_advise
+ <antrik> well, that part is easy of course :-) so you already implemented
+ vm_advise itself I take it?
+ <mcsim> antrik: Yes, that was also quite easy.
+ <antrik> great :-)
+ <antrik> in that case it would be silly of course to postpone implementing
+ the madvise() wrapper. in other words: never mind my remark about
+ priorities :-)
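+
+The wrapper mcsim describes would be a thin shim over the new RPC.  The
+vm_advise prototype below is an assumption based on this discussion (the
+call was being designed here), and the advice constants are hypothetical;
+only the general shape is the point.
+
+    #include <errno.h>
+    #include <mach.h>
+    #include <sys/mman.h>
+
+    /* Hypothetical prototype of the new gnumach RPC.  */
+    extern kern_return_t vm_advise (mach_port_t target_task,
+                                    vm_address_t address, vm_size_t size,
+                                    int advice);
+
+    int
+    madvise (void *addr, size_t len, int advice)
+    {
+      int vm_advice;
+
+      switch (advice)
+        {
+        case MADV_NORMAL:     vm_advice = 0; break;  /* hypothetical values */
+        case MADV_RANDOM:     vm_advice = 1; break;
+        case MADV_SEQUENTIAL: vm_advice = 2; break;
+        default:
+          errno = EINVAL;
+          return -1;
+        }
+
+      if (vm_advise (mach_task_self (), (vm_address_t) addr, len,
+                     vm_advice) != KERN_SUCCESS)
+        {
+          /* A real wrapper would map Mach errors properly.  */
+          errno = EINVAL;
+          return -1;
+        }
+      return 0;
+    }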
+
+
+## IRC, freenode, #hurd, 2012-09-03
+
+    <mcsim> I'm trying a test with ext2fs. It works, then I just
+      recompile ext2fs and it stops working, then I recompile it again
+      several times and each time the result is unpredictable.
+ <braunr> sounds like a concurrency issue
+ <mcsim> I can run the same test several times and ext2 works until I
+ recompile it. That's the problem. Could that be concurrency too?
+ <braunr> mcsim: without bad luck, yes, unless "several times" is a lot
+ <braunr> like several dozens of tries
+
+
+## IRC, freenode, #hurd, 2012-09-04
+
+    <mcsim> hello. I want to report that the ext2fs translator that I
+      work on has replaced, on my system, the old variant that processed
+      only single-page requests. And it works with partitions bigger than
+      2 GB.
+    <mcsim> Probably I'm not far from the end.
+    <mcsim> But it's worth mentioning that I didn't fix that nasty bug
+      that I told you about yesterday.
+ <mcsim> braunr: That bug sometimes appears after recompilation of ext2fs
+ and always disappears after sync or reboot. Now I'm going to finish
+ defpager and test other translators.
+
+
+## IRC, freenode, #hurd, 2012-09-17
+
+    <mcsim> braunr: hello. Do you remember that you said that the pager
+      has to inform the kernel about the appropriate cluster size for
+      readahead?
+    <mcsim> I don't understand how the kernel stores this information,
+      because it does not know about such a unit as a "pager".
+    <mcsim> Can you give me advice about how this could be implemented?
+ <youpi> mcsim: it can store it in the object
+    <mcsim> youpi: It is too big an overhead
+    <mcsim> youpi: at least from my PoV
+ <braunr> mcsim: we discussed this already
+ <braunr> mcsim: there is no "pager" entity in the kernel, which is a defect
+ from my PoV
+ <braunr> mcsim: the best you can do is follow what the kernel already does
+    <braunr> that is, store this property per object
+ <braunr> we don't care much about the overhead for now
+ <braunr> my guess is there is already some padding, so the overhead is
+ likely to be amortized by this
+ <braunr> like youpi said
+    <mcsim> I remember that discussion, but I didn't get then whether
+      there should be only one or two values for all policies, or whether
+      each policy should have its own values?
+ <mcsim> braunr: ^
+ <braunr> each policy should have its own values, which means it can be
+ implemented with a simple static array somewhere
+ <braunr> the information in each object is a policy selector, such as an
+ index in this static array
+ <mcsim> ok
+ <braunr> mcsim: if you want to minimize the overhead, you can make this
+ selector a char, and place it near another char member, so that you use
+ space that was previously used as padding by the compiler
+ <braunr> mcsim: do you see what i mean ?
+ <mcsim> yes
+ <braunr> good
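+
+What the padding trick could look like: one global, static array of
+per-policy values shared by all objects, and a one-byte selector in
+struct vm_object placed next to an existing char-sized member so that it
+occupies what was previously compiler padding.  Member names here are
+hypothetical.
+
+    /* Global table, one entry per readahead policy.  */
+    struct advice_sizes { unsigned short before, after; };
+
+    static const struct advice_sizes vm_advice_sizes[] = {
+      { 5, 5 },    /* normal */
+      { 0, 0 },    /* random */
+      { 15, 15 },  /* sequential */
+    };
+
+    struct vm_object {
+      /* ... existing members ... */
+      unsigned char some_existing_flag; /* char-sized member already there */
+      unsigned char advice;             /* new: index into vm_advice_sizes;
+                                           fills former padding, so the
+                                           struct does not grow */
+      /* ... */
+    };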
+
+
+## IRC, freenode, #hurd, 2012-09-17
+
+ <mcsim> hello. May I add function krealloc to slab.c?
+ <braunr> mcsim: what for ?
+ <mcsim> braunr: It is quite useful for creating dynamic arrays
+ <braunr> you don't want dynamic arrays
+ <mcsim> why?
+ <braunr> they're expensive
+ <braunr> try other data structures
+ <mcsim> more expensive than linked lists?
+ <braunr> depends
+ <braunr> but linked lists aren't the only other alternative
+ <braunr> that's why btrees and radix trees (basically trees of arrays)
+ exist
+ <braunr> the best general purpose data structure we have in mach is the red
+ black tree currently
+ <braunr> but always think about what you want to do with it
+ <mcsim> I want to store there sets of sizes for different memory
+ policies. I don't expect this array to be big. But for sure I can use
+ rbtree for it.
+ <braunr> why not a static array ?
+ <braunr> arrays are perfect for known data sizes
+    <mcsim> I expect the pager to supply its own sizes. So at the
+      beginning this array contains only the default policy. When a pager
+      wants to supply its own policy, the kernel looks up the table of
+      advice. If this policy is a new set of sizes then the kernel
+      creates a new entry in the table of advice.
+ <braunr> that would mean one set of sizes for each object
+ <braunr> why don't you make things simple first ?
+ <mcsim> Object stores only pointer to entry in this table.
+ <braunr> but there is no pager object shared by memory objects in the
+ kernel
+ <mcsim> I mean struct vm_object
+ <braunr> so that's what i'm saying, one set per object
+ <braunr> it's useless overhead
+ <braunr> i would really suggest using a global set of policies for now
+ <mcsim> Probably, I don't understand you. Where do you want to store this
+ static array?
+ <braunr> it's a global one
+ <mcsim> "for now"? It is not a problem to implement a table for local
+ advice, using either rbtree or dynamic array.
+ <braunr> it's useless overhead
+ <braunr> and it's not a single integer, you want a whole container per
+ object
+ <braunr> don't do anything fancy unless you know you really want it
+ <braunr> i'll link the netbsd code again as a very good example of how to
+ implement global policies that work more than decently for every file
+ system in this OS
+ <braunr>
+ http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/uvm/uvm_fault.c?rev=1.194&content-type=text/x-cvsweb-markup&only_with_tag=MAIN
+ <braunr> look for uvmadvice
+    <mcsim> But different translators have different demands. Thus
+      changing the global policy for one translator would have an impact
+      on the behavior of another one.
+ <braunr> i understand
+ <braunr> this isn't l4, or anything experimental
+ <braunr> we want something that works well for us
+ <mcsim> And this is acceptable?
+ <braunr> until you're able to demonstrate we need different policies, i'd
+ recommend not making things more complicated than they already are and
+ need to be
+ <braunr> why wouldn't it ?
+ <braunr> we've been discussing this a long time :/
+    <mcsim> because every process runs in an isolated environment, and
+      the fact that there is something outside this environment that
+      affects it, without it having any rights over that, surprises me.
+ <braunr> ?
+ <mcsim> ok. let me dip in uvm code. Probably my questions disappear
+ <braunr> i don't think it will
+ <braunr> you're asking about the system design here, not implementation
+ details
+ <braunr> with l4, there are as you'd expect well defined components
+ handling policies for address space allocation, or paging, or whatever
+ <braunr> but this is mach
+ <braunr> mach has a big shared global vm server with in kernel policies for
+ it
+ <braunr> so it's ok to implement a global policy for this
+ <braunr> and let's be pragmatic, if we don't need complicated stuff, why
+ would we waste time on this ?
+ <mcsim> It is not complicated.
+ <braunr> retaining a whole container for each object, whereas they're all
+ going to contain exactly the same stuff for years to come seems overly
+ complicated for me
+ <mcsim> I'm not going to create separate container for each object.
+ <braunr> i'm not following you then
+ <braunr> how can pagers upload their sizes in the kernel ?
+    <mcsim> I'm going to create a new container only for combinations of
+      cluster sizes that are not present in the table of advice.
+ <braunr> that's equivalent
+ <braunr> you're ruling out the default set, but that's just an optimization
+ <braunr> whenever a file system decides to use other sizes, the problem
+ will arise
+    <mcsim> Before creating a container I'm going to look up the table,
+      and only then create one
+ <braunr> a table ?
+ <mcsim> But there will be the same container for a huge bunch of objects
+ <braunr> how do you select it ?
+ <braunr> if it's a per pager container, remember there is no shared pager
+ object in the kernel, only ports to external programs
+ <mcsim> I'll give an example
+    <mcsim> Suppose there are only two policies. At the beginning we have
+      the table {{random = 4096, sequential = 8192}}. Then pager 1 wants
+      to add a new policy where the random cluster size is 8192. It asks
+      the kernel to create it and after this the table will be the
+      following: {{random = 4096, sequential = 8192}, {random = 8192,
+      sequential = 8192}}. If pager 2 wants to create the same policy as
+      pager 1, the kernel will look up the table and will not create a
+      new entry. So the table will be the same.
+ <mcsim> And each object has link to appropriate table entry
+ <braunr> i'm not sure how this can work
+ <braunr> how can pagers 1 and 2 know the sizes are the same for the same
+ policy ?
+ <braunr> (and actually they shouldn't)
+    <mcsim> For faster lookup, hash keys will be created for each entry
+ <braunr> what's the lookup key ?
+ <mcsim> They do not know
+ <mcsim> The kernel knows
+ <braunr> then i really don't understand
+ <braunr> and how do you select sizes based on the policy ?
+ <braunr> and how do you remove unused entries ?
+ <braunr> (ok this can be implemented with a simple ref counter)
+ <mcsim> "and how do you select sizes based on the policy ?" you mean at
+ page fault?
+ <braunr> yes
+ <mcsim> entry or object keeps pointer to appropriate entry in the table
+ <braunr> ok your per object data is a pointer to the table entry and the
+ policy is the index inside
+ <braunr> so you really need a ref counter there
+ <mcsim> yes
+ <braunr> and you need to maintain this table
+ <braunr> for me it's uselessly complicated
+ <mcsim> but this keeps design clear
+ <braunr> not for me
+ <braunr> i don't see how this is clearer
+ <braunr> it's just more powerful
+ <braunr> a power we clearly don't need now
+ <braunr> and in the following years
+ <braunr> in addition, i'm very worried about the potential problems this
+ can introduce
+    <mcsim> In fact I don't feel comfortable with the thought that one
+      translator can impact the behavior of another.
+ <braunr> simple example: the table is shared, it needs a lock, other data
+ structures you may have added in your patch may also need a lock
+ <braunr> but our locks are noop for now, so you just can't be sure there is
+ no deadlock or other issues
+ <braunr> and adding smp is a *lot* more important than being able to select
+ precisely policy sizes that we're very likely not to change a lot
+ <braunr> what do you mean by "one translator can impact another" ?
+    <mcsim> As I understand your idea (I haven't read the uvm code yet),
+      there is a global table of cluster sizes for different policies,
+      and every translator can change the values in this table. That is
+      what I mean by one translator having an impact on another one.
+ <braunr> absolutely not
+ <braunr> translators *can't* change sizes
+ <braunr> the sizes are completely static, assumed to be fit all
+ <braunr> -be
+ <braunr> it's not optimial but it's very simple and effective in practice
+ <braunr> optimal*
+ <braunr> and it's not a table of cluster sizes
+ <braunr> it's a table of pages before/after the faulted one
+    <braunr> this reflects the fact that in mach, virtual memory
+      (implementation and policy) is in the kernel
+ <braunr> translators must not be able to change that
+ <braunr> let's talk about pagers here, not translators
+ <mcsim> Finally I got you. This is an acceptable tradeoff.
+ <braunr> it took some time :)
+ <braunr> just to clear something
+    <braunr> 20:12 < mcsim> For faster lookup, hash keys will be created
+      for each entry
+ <braunr> i'm not sure i understand you here
+    <mcsim> To find out if such a policy (set of sizes) is in the table
+      we could look at every entry and compare each value. But it is
+      better to create a hash value for the set and thus find equal
+      policies.
+ <braunr> first, i'm really not comfortable with hash tables
+ <braunr> they really need careful configuration
+ <braunr> next, as we don't expect many entries in this table, there is
+ probably no need for this overhead
+ <braunr> remember that one property of tables is locality of reference
+ <braunr> you access the first entry, the processor automatically fills a
+ whole cache line
+ <braunr> so if your table fits on just a few, it's probably faster to
+ compare entries completely than to jump around in memory
+ <mcsim> But we can sort hash keys, and in this way find policies quickly.
+ <braunr> cache misses are way slower than computation
+ <braunr> so unless you have massive amounts of data, don't use an optimized
+ container
+ <mcsim> (20:38:53) braunr: that's why btrees and radix trees (basically
+ trees of arrays) exist
+ <mcsim> and what will be the key?
+ <braunr> i'm not saying to use a tree instead of a hash table
+ <braunr> i'm saying, unless you have many entries, just use a simple table
+    <braunr> and since pagers don't add and remove entries from this
+      table often, it's one case where reallocation is ok
+ <mcsim> So here dynamic arrays fit the most?
+ <braunr> probably
+ <braunr> it really depends on the number of entries and the write ratio
+ <braunr> keep in mind current processors have 32-bits or (more commonly)
+ 64-bits cache line sizes
+ <mcsim> bytes probably?
+ <braunr> yes bytes
+ <braunr> but i'm not willing to add a realloc like call to our general
+ purpose kernel allocator
+ <braunr> i don't want to make it easy for people to rely on it, and i hope
+ the lack of it will make them think about other solutions instead :)
+ <braunr> and if they really want to, they can just use alloc/free
+ <mcsim> Under "other solutions" you mean trees?
+ <braunr> i mean anything else :)
+ <braunr> lists are simple, trees are elegant (but add non negligible
+ overhead)
+ <braunr> i like trees because they truely "gracefully" scale
+ <braunr> but they're still O(log n)
+ <braunr> a good hash table is O(1), but must be carefully measured and
+ adjusted
+ <braunr> there are many other data structures, many of them you can find in
+ linux
+ <braunr> but in mach we don't need a lot of them
+    <mcsim> Your favorite data structures are lists and trees. Next
+      you'll claim that lisp is your favorite language :)
+ <braunr> functional programming should eventually rule the world, yes
+    <braunr> i wouldn't count lists as my favorite, which are really
+      trees
+ <braunr> there is a reason why red black trees back higher level data
+ structures like vectors or maps in many common libraries ;)
+ <braunr> mcsim: hum but just to make it clear, i asked this question about
+ hashing because i was curious about what you had in mind, i still think
+ it's best to use static predetermined values for policies
+ <mcsim> braunr: I understand this.
+ <braunr> :)
+ <mcsim> braunr: Yeah. You should be cautious with me :)
+
+
+## IRC, freenode, #hurd, 2012-09-21
+
+ <antrik> mcsim: there is only one cluster size per object -- it depends on
+ the properties of the backing store, nothing else.
+ <antrik> (while the readahead policies depend on the use pattern of the
+ application, and thus should be selected per mapping)
+ <antrik> but I'm still not convinced it's worthwhile to bother with cluster
+ size at all. do other systems even do that?...
+
+
+## IRC, freenode, #hurd, 2012-09-23
+
+ <braunr> mcsim: how long do you think it will take you to polish your gsoc
+ work ?
+    <braunr> (and before you begin that part actually, because we'll have
+      to review the whole stuff prior to polishing it)
+ <mcsim> braunr: I think about 2 weeks
+    <mcsim> But you may already start reviewing it, if you intend to do
+      it before I rearrange the commits.
+ <mcsim> Gnumach, ext2fs and defpager are ready. I just have to polish the
+ code.
+ <braunr> mcsim: i don't know when i'll be able to do that
+ <braunr> so expect a few weeks on my (our) side too
+ <mcsim> ok
+ <braunr> sorry for being slow, that's how hurd development is :)
+ <mcsim> What should I do with libc patch that adds madvise support?
+ <mcsim> Post it to bug-hurd?
+ <braunr> hm probably the same i did for pthreads, create a topic branch in
+ glibc.git
+ <mcsim> there is only one commit
+ <braunr> yes
+ <braunr> (mine was a one liner :p)
+ <mcsim> ok
+ <braunr> it will probably be a debian patch before going into glibc anyway,
+ just for making sure it works
+    <mcsim> But regarding the term: I expect that my studies begin in a
+      week and I'll have to do some other stuff then, so I'll probably
+      actually need one more week.
+ <braunr> don't worry, that's expected
+ <braunr> and that's the reason why we're slow
+ <mcsim> And what should I do with large store patch?
+ <braunr> hm good question
+ <braunr> what did you do for now ?
+ <braunr> include it in your work ?
+ <braunr> that's what i saw iirc
+ <mcsim> Yes. It consists of two parts.
+    <braunr> the original part and the modifications ?
+ <braunr> i think youpi would know better about that
+ <mcsim> First (small) adds notification to libpager interface and second
+ one adds support for large stores.
+ <braunr> i suppose we'll probably merge the large store patch at some point
+ anyway
+ <mcsim> Yes both original and modifications
+ <braunr> good
+ <mcsim> I'll split these parts to different commits and I'll try to make
+ support for large stores independent from other work.
+ <braunr> that would be best
+    <braunr> if you can make it so that, by omitting (or including) one
+      patch, we can add your patches to the debian package, it would be
+      great
+ <braunr> (only with regard to the large store change, not other potential
+ smaller conflicts)
+ <mcsim> braunr: I also found several bugs in defpager, that I haven't fixed
+ since winter.
+ <braunr> oh
+    <mcsim> seems nobody has noticed them.
+ <braunr> i'm very interested in those actually (not too soon because it
+ concerns my work on pageout, which is postponed after pthreads and
+ select)
+ <mcsim> ok. than I'll do it first.
+
+
+## IRC, freenode, #hurd, 2012-09-24
+
+ <braunr> mcsim: what is vm_get_advice_info ?
+    <mcsim> braunr: hello. It should supply some machine-specific
+      parameters regarding clustered reading. At the moment it supplies
+      only the maximal possible cluster size.
+ <braunr> mcsim: why such a need ?
+ <mcsim> It is used by defpager, as it can't allocate memory dynamically and
+ every thread has to allocate maximal size beforehand
+ <braunr> mcsim: i see
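+
+How the default pager would use this, per the explanation above: query the
+maximal possible cluster size once, and preallocate that much per thread,
+since no dynamic allocation is possible later in the paging path.  The
+vm_get_advice_info prototype is an assumption; only its one documented
+output, the maximal cluster size, is taken from the discussion.
+
+    #include <mach.h>
+
+    /* Hypothetical prototype of the RPC discussed above.  */
+    extern kern_return_t vm_get_advice_info (mach_port_t host,
+                                             vm_size_t *max_cluster_size);
+
+    static vm_size_t max_cluster_size;
+
+    /* Executed once per worker thread, before it may have to page.  */
+    static vm_address_t
+    preallocate_cluster_buffer (void)
+    {
+      vm_address_t buf = 0;
+
+      if (max_cluster_size == 0
+          && vm_get_advice_info (mach_host_self (), &max_cluster_size)
+             != KERN_SUCCESS)
+        max_cluster_size = vm_page_size;  /* fall back to single pages */
+
+      /* Reserve the worst case up front.  */
+      if (vm_allocate (mach_task_self (), &buf, max_cluster_size, TRUE)
+          != KERN_SUCCESS)
+        return 0;
+      return buf;
+    }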
+
+
+## IRC, freenode, #hurd, 2012-10-05
+
+    <mcsim> braunr: I think it's not worth separating the large store
+      patch for ext2 from the patch moving it to the new libpager
+      interface. Am I right?
+ <braunr> mcsim: it's worth separating, but not creating two versions
+ <braunr> i'm not sure what you mean here
+    <mcsim> First, I applied the large store patch, and then I changed
+      the patched code to make it work with the new libpager
+      interface. So the changes that make ext2 work with the new
+      interface depend on the large store patch.
+ <mcsim> braunr: ^
+ <braunr> mcsim: you're not forced to make each version resulting from a new
+ commit work
+ <braunr> but don't make big commits
+ <braunr> so if changing an interface requires its users to be updated
+ twice, it doesn't make sense to do that
+    <braunr> just update the interface cleanly, you'll have one or more
+      commits that produce intermediate versions that don't build, that's
+      ok
+ <braunr> then in another, separate commit, adjust the users
+    <mcsim> braunr: The only user now is ext2. And the problem with ext2
+      is that I updated not the version from the git repository, but the
+      version that I got after applying the large store patch. So in
+      other words my question is as follows: should I make a commit that
+      moves the version of ext2fs without the large store patch to the
+      new interface?
+ <braunr> you're asking if you can include the large store patch in your
+ work, and by extension, in the main branch
+ <braunr> i would say yes, but this must be discussed with others
diff --git a/open_issues/pfinet_vs_system_time_changes.mdwn b/open_issues/pfinet_vs_system_time_changes.mdwn
index 46705047..09b00d30 100644
--- a/open_issues/pfinet_vs_system_time_changes.mdwn
+++ b/open_issues/pfinet_vs_system_time_changes.mdwn
@@ -11,14 +11,16 @@ License|/fdl]]."]]"""]]
[[!tag open_issue_hurd]]
-IRC, unknown channel, unknown date.
+
+# IRC, unknown channel, unknown date
<grey_gandalf> I did a sudo date...
<grey_gandalf> and the machine hangs
-This was very likely a misdiagnosis:
+This was very likely a misdiagnosis.
+
-IRC, freenode, #hurd, 2011-03-25:
+# IRC, freenode, #hurd, 2011-03-25
<tschwinge> antrik: I suspect it'S some timing stuff in pfinet that perhaps
uses absolute time, and somehow wildely gets confused?
@@ -42,7 +44,8 @@ IRC, freenode, #hurd, 2011-03-25:
wrap-around, and thus the same result.)
<tschwinge> Yes.
-IRC, freenode, #hurd, 2011-10-26:
+
+# IRC, freenode, #hurd, 2011-10-26
<antrik> anyways, when ntpdate adjusts to the past, the connections hang,
roughly for the amount of time being adjusted
@@ -50,7 +53,8 @@ IRC, freenode, #hurd, 2011-10-26:
<antrik> (well, if it's long enough, they probably timeout on the other
side...)
-IRC, freenode, #hurd, 2011-10-27:
+
+# IRC, freenode, #hurd, 2011-10-27
<antrik> oh, another interesting thing I observed is that the the subhurd
pfinet did *not* drop the connection... only the main Hurd one. I thought
@@ -60,7 +64,8 @@ IRC, freenode, #hurd, 2011-10-27:
where I set the date is affected, and not the pfinet in the other
instance
-IRC, freenode, #hurd, 2012-06-28:
+
+# IRC, freenode, #hurd, 2012-06-28
<bddebian> great, now setting the date/time fucked my machine
<pinotree> yes, we lack a monotonic clock
@@ -80,3 +85,17 @@ IRC, freenode, #hurd, 2012-06-28:
it fucked me because I now cannot get to it.. :)
<antrik> bddebian: that's odd... you should be able to just log in again
IIRC
+
+
+# IRC, freenode, #hurd, 2012-07-29
+
+ <antrik> pfinet can't cope with larger system time changes because it can't
+ use a monotonic clock
+
+[[clock_gettime]].
+
+    <braunr> well when librt becomes easily usable everywhere (if it's
+      possible), it will be quite easy to work around this issue
+ <pinotree> yes and no, you just need a monotonic clock and clock_gettime
+ able to use it
+ <braunr> why "no" ?
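+
+What the timers need, as pinotree says: intervals measured against
+CLOCK_MONOTONIC, which is unaffected when someone runs date or ntpdate.
+A minimal sketch of the difference (link with -lrt where librt is
+separate):
+
+    #include <stdio.h>
+    #include <time.h>
+    #include <unistd.h>
+
+    int
+    main (void)
+    {
+      struct timespec wall0, wall1, mono0, mono1;
+
+      clock_gettime (CLOCK_REALTIME, &wall0);
+      clock_gettime (CLOCK_MONOTONIC, &mono0);
+
+      sleep (2);  /* imagine "date -s ..." running meanwhile */
+
+      clock_gettime (CLOCK_REALTIME, &wall1);
+      clock_gettime (CLOCK_MONOTONIC, &mono1);
+
+      /* A retransmit timeout computed from CLOCK_REALTIME jumps with
+         the system time; one computed from CLOCK_MONOTONIC does not.  */
+      printf ("wall delta: %ld s, mono delta: %ld s\n",
+              (long) (wall1.tv_sec - wall0.tv_sec),
+              (long) (mono1.tv_sec - mono0.tv_sec));
+      return 0;
+    }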
diff --git a/open_issues/robustness.mdwn b/open_issues/robustness.mdwn
index d32bd509..1f8aa0c6 100644
--- a/open_issues/robustness.mdwn
+++ b/open_issues/robustness.mdwn
@@ -62,3 +62,68 @@ License|/fdl]]."]]"""]]
<antrik> well, I'm not aware of the Minix implementation working across
reboots. the one I have in mind based on a generic session management
infrastructure should though :-)
+
+
+## IRC, freenode, #hurd, 2012-12-06
+
+ <Tekk_> out of curiosity, would it be possible to strap on a resurrection
+ server to hurd?
+ <Tekk_> in the future, that is
+ <braunr> sure
+ <Tekk_> cool :)
+ <braunr> but this requires things like persistence
+ <spiderweb> like a reincarnation server?
+    <braunr> it's a lot of work, with non negligible overhead
+ <Tekk_> spiderweb: yes, exactly. I didn't remember tanenbaum's wording on
+ that
+ <braunr> i'm pretty sure most people would be against that
+ <spiderweb> braunr: why so?
+ <Tekk_> it was actually the feature that convinced me that ukernels were a
+ good idea
+ <Tekk_> spiderweb: because then you need a process that keeps track of all
+ the other servers
+ <Tekk_> and they have to be replying to "useless" pings to see if they're
+ still alive
+ <braunr> spiderweb: the hurd community isn't looking for a system reliable
+ in critical environments
+ <braunr> just a general purpose system
+ <braunr> and persistence requires regular data saves
+ <braunr> it's expensive
+ <Tekk_> as well as that
+ <braunr> we already have performance problems because of the nature of the
+ system, adding more without really looking for the benefits is useless
+ <spiderweb> so you can't theoretically have both?
+ <braunr> persistence and performance ?
+ <braunr> it's hard
+ <Tekk_> spiderweb: you need to modify the other translators to be
+ persistent
+ <braunr> only the ones you care about actually
+ <braunr> but it's just better to make the critical servers very stable
+ <Tekk_> so it's not just turning on and off the reincarnation
+ <braunr> (there isn't that much code there)
+ <braunr> and the other servers restartable
+    <mcsim> braunr: I think that if the aim is to make something like a
+      resurrection server, then most servers will need to be rewritten to
+      make them stateless, won't they?
+ <braunr> that's a lot easier and already works with non essential passive
+ translators
+ <Tekk_> mcsim: pretty much
+ <braunr> mcsim: only those you care about
+ <braunr> mcsim: the proc auth exec servers for example, perhaps the file
+ system servers that can act as root fs, but the others would simply be
+ restarted by the passive translator mechanism
+ <spiderweb> what about restarting device drivers, that would be simple
+ right?
+ <braunr> that's perfectly doable, yes
+ <spiderweb> (being an OS newbie) - it does seem to me that the whole
+ reincarnation server concept could quite possibly be a band aid.
+ <braunr> spiderweb: no it really works
+ <braunr> many systems do that actually
+ <braunr> let me give you a link
+ <braunr>
+ http://ftp.sceen.net/curios_improving_reliability_through_operating_system_structure.pdf
+ <braunr> it's a bit old, but there is a review of systems aiming at
+ resilience and how they achieve part of it
+ <spiderweb> neat, thanks
+ <braunr> actually it's not that old at all
+ <braunr> around 2007
diff --git a/open_issues/select.mdwn b/open_issues/select.mdwn
index abec304d..778af530 100644
--- a/open_issues/select.mdwn
+++ b/open_issues/select.mdwn
@@ -215,6 +215,1422 @@ IRC, unknown channel, unknown date:
<youpi> it's better than nothing yes
+# IRC, freenode, #hurd, 2012-07-21
+
+ <braunr> damn, select is actually completely misdesigned :/
+ <braunr> iiuc, it makes servers *block*, in turn :/
+ <braunr> can't be right
+ <braunr> ok i understand it better
+ <braunr> yes, timeouts should be passed along with the other parameters to
+ correctly implement non blocking select
+ <braunr> (or the round-trip io_select should only ask for notification
+ requests instead of making a server thread block, but this would require
+ even more work)
+ <braunr> adding the timeout in the io_select call should be easy enough for
+ whoever wants to take over a not-too-complicated-but-not-one-liner-either
+ task :)
+ <antrik> braunr: why is a blocking server thread a problem?
+ <braunr> antrik: handling the timeout at client side while server threads
+ block is the problem
+ <braunr> the timeout must be handled along with blocking obviously
+ <braunr> so you either do it at server side when async ipc is available,
+ which is the case here
+ <braunr> or request notifications (synchronously) and block at client side,
+ waiting for those notifications
+ <antrik> braunr: are you saying the client has a receive timeout, but when
+ it elapses, the server thread keeps on blocking?...
+ <braunr> antrik: no i'm referring to the non-blocking select issue we have
+ <braunr> antrik: the client doesn't block in this case, whereas the servers
+ do
+ <braunr> which obviously doesn't work ..
+ <braunr> see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=79358
+ <braunr> this is the reason why vim (and probably others) are slow on the
+ hurd, while not consuming any cpu
+ <braunr> the current workaround is that whenever a non-blocking select
+ is done, it's transformed into a blocking select with the smallest
+ possible timeout
+ <antrik> braunr: well, note that the issue only began after fixing some
+ other select issue... it was fine before
+ <braunr> apparently, the issue was raised in 2000
+ <braunr> also, note that there is a delay between sending the io_select
+ requests and blocking on the replies
+ <braunr> when machines were slow, this delay could almost guarantee a
+ preemption between these steps, making the servers reply soon enough even
+ for a non blocking select
+ <braunr> the problem occurs when sending all the requests and checking for
+ replies is done before servers have a chance to send the reply
+ <antrik> braunr: I don't know what issue was raised in 2000, but I do know
+ that vim worked perfectly fine until last year or so. then some select
+ fix was introduced, which in turn broke vim
+ <braunr> antrik: could be the timeout rounding, Aug 2 2010
+ <braunr> hum but, the problem wasn't with vim
+ <braunr> vim does still work fine (in fact, glibc is patched to check some
+ well known process names and selectively fix the timeout)
+ <braunr> which is why vim is fast and view isn't
+ <braunr> the problem was with other services apparently
+ <braunr> and in order to fix them, that workaround had to be introduced
+ <braunr> i think it has nothing to do with the timeout rounding
+ <braunr> it must be the time when youpi added the patch to the debian
+ package
+ <antrik> braunr: the problem is that with the patch changing the timeout
+ rounding, vim got extremely slow. this is why the ugly hacky exception
+ was added later...
+ <antrik> after reading the report, I agree that the timeout needs to be
+ handled by the server. at least the timeout=0 case.
+ <pinotree> vim often uses 0-time selects to check whether there's input
+ <antrik> client-side handling might still be OK for other timeout settings
+ I guess
+ <antrik> I'm a bit ambivalent about that
+ <antrik> I tend to agree with Neal though: it really doesn't make much
+ sense to have a client-side watchdog timer for this specific call, while
+ for all other ones we trust the servers not to block...
+ <antrik> or perhaps not. for standard sync I/O, clients should expect that
+ an operation could take long (though not forever); but they might use
+ select() precisely to avoid long delays in I/O... so it makes some sense
+ to make sure that select() really doesn't delay because of a busy server
+ <antrik> OTOH, unless the server is actually broken (in which case anything
+ could happen), a 0-time select should never actually block for an
+ extended period of time... I guess it's not wrong to trust the servers on
+ that
+ <antrik> pinotree: hm... that might explain a certain issue I *was*
+ observing with Vim on Hurd -- though I never really thought about it
+ being an actual bug, as opposed to just general Hurd sluggishness...
+ <antrik> but it makes sense now
+ <pinotree> antrik:
+ http://patch-tracker.debian.org/patch/series/view/eglibc/2.13-34/hurd-i386/local-select.diff
+ <antrik> so I guess we all agree that moving the select timeout to the
+ server is probably the most reasonable approach...
+ <antrik> braunr: BTW, I wouldn't really consider the sync vs. async IPC
+ cases any different. the client blocks waiting for the server to reply
+ either way...
+ <antrik> the only difference is that in the sync IPC case, the server might
+ want to take some special precaution so it doesn't have to block until
+ the client is ready to receive the reply
+ <antrik> but that's optional and not really select-specific I'd say
+ <antrik> (I'd say the only sane approach with sync IPC is probably for the
+ server never to wait -- if the client fails to set up for receiving the
+ reply in time, it loses...)
+ <antrik> and with the receive buffer approach in Viengoos, this can be done
+ really easy and nice :-)
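+
+For reference, the general shape of the interim client-side workaround
+mentioned above (the real one is the `local-select.diff` linked earlier;
+this is only a sketch, not the actual code):
+
+    /* Never pass a zero timeout down to the reply wait: round it up so
+       the servers get a chance to send their replies.  */
+    struct timeval to = *user_timeout;
+    if (to.tv_sec == 0 && to.tv_usec == 0)
+      to.tv_usec = 1000;    /* the smallest possible timeout, ~1ms */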
+
+
+## IRC, freenode, #hurd, 2012-07-22
+
+ <braunr> antrik: you can't block in servers with sync ipc
+ <braunr> so in this case, "select" becomes a request for notifications
+ <braunr> whereas with async ipc, you can, so it's less efficient to make a
+ full round trip just to ask for notifications when you can just do async
+ requests (doing the actual blocking) and wait for any reply after
+ <antrik> braunr: I don't understand. why can't you block in servers with
+ async IPC?
+ <antrik> braunr: err... with sync IPC I mean
+ <braunr> antrik: because select operates on more than one fd
+ <antrik> braunr: and what does that got to do with sync vs. async IPC?...
+ <antrik> maybe you are thinking of endpoints here, which is a whole
+ different story
+ <antrik> traditional L4 has IPC ports bound to specific threads; so
+ implementing select requires a separate client thread for each
+ server. but that's not mandatory for sync IPC. Viengoos has endpoints not
+ bound to threads
+ <braunr> antrik: i don't know what "endpoint" means here
+ <braunr> but, you can't use sync IPC to implement select on multiple fds
+ (and thus possibly multiple servers) by blocking in the servers
+ <braunr> you'd block in the first and completely miss the others
+ <antrik> braunr: I still don't see why... or why async IPC would change
+ anything in that regard
+ <braunr> antrik: well, you call select on 3 fds, each implemented by
+ different servers
+ <braunr> antrik: you call a sync select on the first fd, obviously you'll
+ block there
+ <braunr> antrik: if it's async, you don't block, you just send the
+ requests, and wait for any reply
+ <braunr> like we do
+ <antrik> braunr: I think you might be confused about the meaning of sync
+ IPC. it doesn't in any way imply that after sending an RPC request you
+ have to block on some particular reply...
+ <youpi> antrik: what does sync mean then?
+ <antrik> braunr: you can have any number of threads listening for replies
+ from the various servers (if using an L4-like model); or even a single
+ thread, if you have endpoints that can listen on replies from different
+ sources (which was pretty much the central concern in the Viengoos IPC
+ design AIUI)
+ <youpi> antrik: I agree with your "so it makes some sense to make sure that
+ select() really doesn't delay because of a busy server" (for blocking
+ select) and "OTOH, unless the server is actually broken (in which
+ anything could happen), a 0-time select should never actually block" (for
+ non-blocking select)
+ <antrik> youpi: regarding the select, I was thinking out loud; the former
+ statement was mostly cancelled by my later conclusions...
+ <antrik> and I'm not sure the latter statement was quite clear
+ <youpi> do you know when it was?
+ <antrik> after rethinking it, I finally concluded that it's probably *not*
+ a problem to rely on the server to observe the timeout. if it's really
+ busy, it might take longer than the designated timeout (especially if
+ timeout is 0, hehe) -- but I don't think this is a problem
+ <antrik> and if it doesn't observe the timeout because it's
+ broken/malicious, that's not more problematic than any other RPC the
+ server doesn't handle as expected
+ <youpi> ok
+ <youpi> did somebody write down the conclusion "let's make select timeout
+ handled at server side" somewhere?
+ <antrik> youpi: well, neal already said that in a followup to the select
+ issue Debian bug... and after some consideration, I completely agree with
+ his reasoning (as does braunr)
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <braunr> antrik: i was meaning sync in the most common meaning, yes, the
+ client blocking on the reply
+ <antrik> braunr: I think you are confusing sync IPC with sync I/O ;-)
+ <antrik> braunr: by that definition, the vast majority of Hurd IPC would be
+ sync... but that's obviously not the case
+ <antrik> synchronous IPC means that send and receive happen at the same
+ time -- nothing more, nothing less. that's why it's called synchronous
+ <braunr> antrik: yes
+ <braunr> antrik: so it means the client can't continue unless he actually
+ receives
+ <antrik> in a pure sync model such as L4 or EROS, this means either the
+ sender or the receiver has to block, so synchronisation can happen. which
+ one is server and which one is client is completely irrelevant here --
+ this is about individual message transfer, not any RPC model on top of it
+ <braunr> in the case of select, i assume sender == client
+ <antrik> in Viengoos, the IPC is synchronous in the sense that transfer
+ from the send buffer to the receive buffer happens at the same time; but
+ it's asynchronous in the sense that the receiver doesn't necessarily have
+ to be actively waiting for the incoming message
+ <braunr> ok, i was talking about a pure sync model
+ <antrik> (though in most cases it will still do so...)
+ <antrik> braunr: BTW, in the case of select, the sender is *not* the
+ client. the reply is relevant here, not the request -- so the client is
+ the receiver
+ <antrik> (the select request is boring)
+ <braunr> sorry, i don't understand, you seem to dismiss the select request
+ for no valid reason
+ <antrik> I still don't see how sync vs. async affects the select reply
+ receive though... blocking seems the right approach in either case
+ <braunr> blocking is required
+ <braunr> but you either block in the servers, or in the client
+ <braunr> (and if blocking in the servers, the client also blocks)
+ <braunr> i'll explain how i see it again
+ <braunr> there are two approaches to implementing select
+ <braunr> 1/ send requests to all servers, wait for any reply, this is what
+ the hurd does
+ <braunr> but it's possible because you can send all the requests without
+ waiting for the replies
+ <braunr> 2/ send notification requests, wait for a notification
+ <braunr> this doesn't require blocking in the servers (so if you have many
+ clients, you don't need as many threads)
+ <braunr> i was wondering which approach was used by the hurd, and if it
+ made sense to change
+ <antrik> TBH I don't see the difference between 1) and 2)... whether the
+ message from the server is called an RPC reply or a notification is just
+ a matter of definition
+ <antrik> I think I see though what you are getting at
+ <antrik> with sync IPC, if the client sent all requests and only afterwards
+ started to listen for replies, the servers might need to block while
+ trying to deliver the reply because the client is not ready yet
+ <braunr> that's one thing yes
+ <antrik> but even in the sync case, the client can immediately wait for
+ replies to each individual request -- it might just be more complicated,
+ depending on the specifics of the IPC design
+ <braunr> what i mean by "send notification requests" is actually more than
+ just sending, it's a complete RPC
+ <braunr> and notifications are non-blocking, yes
+ <antrik> (with L4, it would require a separate client thread for each
+ server contacted... which is precisely why a different mechanism was
+ designed for Viengoos)
+ <braunr> seems weird though
+ <braunr> don't they have a portset like abstraction ?
+ <antrik> braunr: well, having an immediate reply to the request and a
+ separate notification later is just a waste of resources... the immediate
+ reply would have no information value
+ <antrik> no, in original L4 IPC is always directed to specific threads
+ <braunr> antrik: some could see the waste of resource as being the
+ duplication of the number of client threads in the server
+ <antrik> you could have one thread listening to replies from several
+ servers -- but then, replies can get lost
+ <braunr> i see
+ <antrik> (or the servers have to block on the reply)
+ <braunr> so, there are really no capabilities in the original l4 design ?
+ <antrik> though I guess in the case of select() it wouldn't really matter
+ if replies get lost, as long as at least one is handled... would just
+ require the listener thread to be separate from the thread sending the
+ requests
+ <antrik> braunr: right. no capabilities of any kind
+ <braunr> that was my initial understanding too
+ <braunr> thanks
+ <antrik> so I partially agree: in a purely sync IPC design, it would be
+ more complicated (but not impossible) to make sure the client gets the
+ replies without the server having to block while sending replies
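+
+Approach 1/ above, roughly as glibc's `_hurd_select` does it (a simplified
+sketch with invented variable names, not the actual code):
+
+    /* Fire one io_select request per fd without blocking, collecting all
+       reply ports into a port set.  */
+    mach_port_t portset, reply;
+    int i, type;
+
+    __mach_port_allocate (__mach_task_self (),
+                          MACH_PORT_RIGHT_PORT_SET, &portset);
+    for (i = 0; i < nfds; i++)
+      {
+        __mach_port_allocate (__mach_task_self (),
+                              MACH_PORT_RIGHT_RECEIVE, &reply);
+        __mach_port_move_member (__mach_task_self (), reply, portset);
+        type = fds[i].select_type;
+        /* Send-only stub: queues the request, does not wait for a reply.  */
+        __io_select (fds[i].io_port, reply, &type);
+      }
+    /* A single mach_msg receive on PORTSET then blocks until the first
+       reply arrives; the client-side timeout is applied here.  */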
+
+ <braunr> arg, we need hurd_condition_timedwait (and possible
+ condition_timedwait) to cleanly fix io_select
+ <braunr> luckily, i still have my old patch for condition_timedwait :>
+ <braunr> bddebian: in order to implement timeouts in select calls, servers
+ now have to use a hurd_condition_timedwait function
+ <braunr> is it possible that a thread both gets canceled and timeout on a
+ wait ?
+ <braunr> looks unlikely to me
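+
+The missing primitive, in one possible shape (the exact signature was not
+settled at this point; `time_data` refers to the timeout type discussed
+below):
+
+    /* Like condition_wait, but gives up once *TIMEOUT elapses; like
+       hurd_condition_wait, it also reports cancellation.  */
+    int hurd_condition_timedwait (condition_t cond, mutex_t mutex,
+                                  const struct time_data *timeout);
+    /* returns 0 when woken, ETIMEDOUT on timeout, EINTR if cancelled */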
+
+ <braunr> hm, i guess the same kind of compatibility constraints exist for
+ hurd interfaces
+ <braunr> so, should we have an io_select1 ?
+ <antrik> braunr: I would use a more descriptive name: io_select_timeout()
+ <braunr> antrik: ah yes
+ <braunr> well, i don't really like the idea of having 2 interfaces for the
+ same call :)
+ <braunr> because all select should be select_timeout :)
+ <braunr> but ok
+ <braunr> antrik: actually, having two select calls may be better
+ <braunr> oh it's really minor, we don't care actually
+ <antrik> braunr: two select calls?
+ <braunr> antrik: one with a timeout and one without
+ <braunr> the glibc would choose at runtime
+ <antrik> right. that was the idea. like with most transitions, that's
+ probably the best option
+ <braunr> there is no need to pass the timeout value if it's not needed, and
+ it's easier to pass NULL this way
+ <antrik> oh
+ <antrik> nah, that would make the transition more complicated I think
+ <braunr> ?
+ <braunr> ok
+ <braunr> :)
+ <braunr> this way, it becomes very easy
+ <braunr> the existing io_select call moves into a select_common() function
+ <antrik> the old variant doesn't know that the server has to return
+ immediately; changing that would be tricky. better just use the new
+ variant for the new behaviour, and deprecate the old one
+ <braunr> and the entry points just call this common function with either
+ NULL or the given timeout
+ <braunr> no need to deprecate the old one
+ <braunr> that's what i'm saying
+ <braunr> and i don't understand "the old variant doesn't know that the
+ server has to return immediately"
+ <antrik> won't the old variant block indefinitely in the server if there
+ are no ready fds?
+ <braunr> yes it will
+ <antrik> oh, you mean using the old variant if there is no timeout value?
+ <braunr> yes
+ <antrik> well, I guess this would work
+ <braunr> well of course, the question is rather if we want this or not :)
+ <antrik> hm... not sure
+ <braunr> we need something to improve the process of changing our
+ interfaces
+ <braunr> it's really painful currently
+ <antrik> inside the servers, we probably want to use common code
+ anyways... so in the long run, I think it simplifies the code when we can
+ just drop the old variant at some point
+ <braunr> a lot of the work we need to do involves changing interfaces, and
+ we very often get to the point where we don't know how to do that and
+ hardly agree on a final version :/
+ <braunr> ok but
+ <braunr> how do you tell the server you don't want a timeout ?
+ <braunr> a special value ? like { -1; -1 } ?
+ <antrik> hm... good point
+ <braunr> i'll do it that way for now
+ <braunr> it's the best way to test it
+ <antrik> which way you mean now?
+ <braunr> keeping io_select as it is, add io_select_timeout
+ <antrik> yeah, I thought we agreed on that part... the question is just
+ whether io_select_timeout should also handle the no-timeout variant going
+ forward, or keep io_select for that. I'm really not sure
+ <antrik> maybe I'll form an opinion over time :-)
+ <antrik> but right now I'm undecided
+ <braunr> i say we keep io_select
+ <braunr> anyway it won't change much
+ <braunr> we can just change that at the end if we decide otherwise
+ <antrik> right
+ <braunr> even passing special values is ok
+ <braunr> with a carefully written hurd_condition_timedwait, it's very easy
+ to add the timeouts :)
+ <youpi> antrik, braunr: I'm wondering, another solution is to add an
+ io_probe, i.e. the server has to return an immediate result, and the
+ client then just waits for all results, without timeout
+ <youpi> that'd be a mere addition in the glibc select() call: when timeout
+ is 0, use that, and otherwise use the previous code
+ <youpi> the good point is that it looks nicer in fs.defs
+ <youpi> are there bad points?
+ <youpi> (I don't have the whole issues in the mind now, so I'm probably
+ missing things)
+ <braunr> youpi: the bad point is duplicating the implementation maybe
+ <youpi> what duplication ?
+ <youpi> ah you mean for the select case
+ <braunr> yes
+ <braunr> although it would be pretty much the same
+ <braunr> that is, if probe only, don't enter the wait loop
+ <youpi> could that be just some ifs here and there?
+ <youpi> (though not making the code easier to read...)
+ <braunr> hm i'm not sure it's fine
+ <youpi> in that case io_select_timeout looks nicer indeed :)
+ <braunr> my problem with the current implementation is having the timeout
+ at the client side whereas the server side is doing the blocking
+ <youpi> I wonder how expensive a notification is, compared to blocking
+ <youpi> a blocking indeed needs a thread stack
+ <youpi> (and kernel thread stuff)
+ <braunr> with the kind of async ipc we have, it's still better to do it
+ that way
+ <braunr> and all the code already exists
+ <braunr> having the timeout at the client side also has its advantage
+ <braunr> latency is more precise
+ <braunr> so the real problem is indeed the non blocking case only
+ <youpi> isn't it bound to kernel ticks anyway ?
+ <braunr> uh, not if your server sucks
+ <braunr> or is loaded for whatever reason
+ <youpi> ok, that's not what I understood by "precision" :)
+ <youpi> I'd rather call it robustness :)
+ <braunr> hm
+ <braunr> right
+ <braunr> there are several ways to do this, but the io_select_timeout one
+ looks fine to me
+ <braunr> and is already well on its way
+ <braunr> and it's reliable
+ <braunr> (whereas i'm not sure about reliability if we keep the timeout at
+ client side)
+ <youpi> btw make the timeout nanoseconds
+ <braunr> ??
+ <youpi> pselect uses timespec, not timeval
+ <braunr> do we want pselect ?
+ <youpi> err, that's the only safe way with signals
+ <braunr> not only, no
+ <youpi> and poll is timespec also
+ <youpi> not only??
+ <braunr> you mean ppoll
+ <youpi> no, poll too
+ <youpi> by "the only safe way", I mean for select calls
+ <braunr> i understand the race issue
+ <youpi> ppoll is a gnu extension
+ <braunr> int poll(struct pollfd *fds, nfds_t nfds, int timeout);
+ <youpi> ah, right, I was also looking at ppoll
+ <youpi> anyway
+ <youpi> we can use nanosecs
+ <braunr> most event loops use a pipe or a socketpair
+ <youpi> there's no reason not to
+ <antrik> youpi: I briefly considered special-casing 0 timeouts last time we
+ discussed this; but I concluded that it's probably better to handle all
+ timeouts server-side
+ <youpi> I don't see why we should even discuss that
+ <braunr> and translate signals to writes into the pipe/socketpair
+ <youpi> antrik: ok
+ <antrik> you can't count on select() timeout precision anyways
+ <antrik> a few ms more shouldn't hurt any sanely written program
+ <youpi> braunr: "most" doesn't mean "all"
+ <youpi> there *are* applications which use pselect
+ <braunr> well mach only handles milliseconds
+ <youpi> and it's not going out of the standard
+ <youpi> mach is not the hurd
+ <youpi> if we change mach, we can still keep the hurd ipcs
+ <youpi> anyway
+ <youpi> again
+ <youpi> I really don't see the point of the discussion
+ <youpi> is there anything *against* using nanoseconds?
+ <braunr> i chose the types specifically because of that :p
+ <braunr> but ok i can change again
+ <youpi> because what??
+ <braunr> i chose to use mach's native time_value_t
+ <braunr> because it matches timeval nicely
+ <youpi> but it doesn't match timespec nicely
+ <braunr> no it doesn't
+ <braunr> should i add a hurd specific time_spec_t then ?
+ <youpi> "how do you tell the server you don't want a timeout ? a special
+ value ? like { -1; -1 } ?"
+ <youpi> you meant infinite blocking?
+ <braunr> youpi: yes
+ <braunr> oh right, pselect is posix
+ <youpi> actually posix says that there can be limitations on the maximum
+ timeout supported, which should be at least 31 days
+ <youpi> -1;-1 is thus fine
+ <braunr> yes
+ <braunr> which is why i could choose time_value_t (a struct of 2 integer_t)
+ <youpi> well, I'd say gnumach could grow a nanosecond-precision time value
+ <youpi> e.g. for clock_gettime precision and such
+ <braunr> so you would prefer me adding the time_spec_t time to gnumach
+ rather than the hurd ?
+ <youpi> well, if hurd RPCs are using mach types and there's no mach type
+ for nanoseconds, it makes sense to add one
+ <youpi> I don't know about the first part
+ <braunr> yes some hurd interfaces also use time_value_t
+ <antrik> in general, I don't think Hurd interfaces should rely on a Mach
+ timevalue. it's really only meaningful when Mach is involved...
+ <antrik> we could even pass the time value as an opaque struct. don't
+ really need an explicit MIG type for that.
+ <braunr> opaque ?
+ <youpi> an opaque type would be a step backward from multi-machine support
+ ;)
+ <antrik> youpi: that's a sham anyways ;-)
+ <youpi> what?
+ <youpi> ah, using an opaque type, yes :)
+ <braunr> probably why my head bugged while reading that
+ <antrik> it wouldn't be fully opaque either. it would be two ints, right?
+ even if Mach doesn't know what these two ints mean, it still could do
+ byte order conversion, if we ever actually supported setups where it
+ matters...
+ <braunr> so uh, should this new time_spec_t be added in gnumach or the hurd
+ ?
+ <braunr> youpi: you're the maintainer, you decide :p
+ <youpi> well, I don't like deciding when I haven't even read fs.defs :)
+ <youpi> but I'd say the way forward is defining it in the hurd
+ <youpi> and put a comment "should be our own type" above use of the mach
+ type
+ <braunr> ok
+ <braunr> and, by the way, is using integer_t fine wrt the 64-bits port ?
+ <youpi> I believe we settled on keeping integer_t a 32bit integer, like xnu
+ does
+ <braunr> ok so it's not
+ <braunr> uh well
+ <youpi> why "not" ?
+ <braunr> keeping it 32-bits for the 32-bits userspace hurd
+ <braunr> but i'm talking about a true 64-bits version
+ <braunr> wouldn't integer_t get 64-bits then ?
+ <youpi> I meant we settled on a no
+ <youpi> like xnu does
+ <braunr> xnu uses 32-bits integer_t even when userspace runs in 64-bits
+ mode ?
+ <youpi> because things for which we'd need 64bits then are offset_t,
+ vm_size_t, and such
+ <youpi> yes
+ <braunr> ok
+ <braunr> youpi: but then what is the type to use for long integers ?
+ <braunr> or uintptr_t
+ <youpi> braunr: uintptr_t
+ <braunr> the mig type i mean
+ <youpi> type memory_object_offset_t = uint64_t;
+ <youpi> (and size)
+ <braunr> well that's a 64-bits type
+ <youpi> well, yes
+ <braunr> natural_t and integer_t were supposed to have the processor word
+ size
+ <youpi> probably I didn't understand your question
+ <braunr> if we remove that property, what else has it ?
+ <youpi> yes, but see roland's comment on this
+ <braunr> ah ?
+ <youpi> ah, no, he just says the same
+ <antrik> braunr: well, it's debatable whether the processor word size is
+ really 64 bit on x86_64...
+ <antrik> all known compilers still consider int to be 32 bit
+ <antrik> (and int is the default word size)
+ <braunr> not really
+ <youpi> as in?
+ <braunr> the word size really is 64-bits
+ <braunr> the question concerns the data model
+ <braunr> with ILP32 and LP64, int is always 32-bits, and long gets the
+ processor word size
+ <braunr> and those are the only ones current unices support
+ <braunr> (which is why long is used everywhere for this purpose instead of
+ uintptr_t in linux)
+ <antrik> I don't think int is 32 bit on alpha?
+ <antrik> (and probably some other 64 bit arches)
+ <braunr> also, assuming we want to maintain the ability to support single
+ system images, do we really want RPC with variable size types ?
+ <youpi> antrik: linux alpha's int is 32bit
+ <braunr> sparc64 too
+ <youpi> I don't know any 64bit port with 64bit int
+ <braunr> i wonder how posix will solve the year 2038 problem ;p
+ <youpi> time_t is a long
+ <youpi> the hope is that there'll be no 32bit systems by 2038 :)
+ <braunr> :)
+ <youpi> but yes, that matters to us
+ <youpi> number of seconds should not be just an int
+ <braunr> we can force a 64-bits type then
+ <braunr> i tend to think we should have no variable size type in any mig
+ interface
+ <braunr> youpi: so, new hurd type, named time_spec_t, composed of two
+ 64-bits signed integers
+ <pinotree> braunr: i added that in my prototype of monotonic clock patch
+ for gnumach
+ <braunr> oh
+ <youpi> braunr: well, 64bit is not needed for the nanosecond part
+ <braunr> right
+ <braunr> it will be aligned anyway :p
+ <youpi> I know
+ <youpi> uh, actually linux uses long there
+ <braunr> pinotree: i guess your patch is still in debian ?
+ <braunr> youpi: well yes
+ <braunr> youpi: why wouldn't it ? :)
+ <pinotree> no, never applied
+ <youpi> braunr: because 64bit is not needed
+ <braunr> ah, i see what you mean
+ <youpi> oh, posix says long actually
+ <youpi> *exactly* long
+ <braunr> i'll use the same sizes
+ <braunr> so it fits nicely with timespec
+ <braunr> hm
+ <braunr> but timespec is only used at the client side
+ <braunr> glibc would simply move the timespec values into our hurd specific
+ type (which can use 32-bits nanosecs) and servers would only use that
+ type
+ <braunr> all right, i'll do it that way, unless there are additional
+ comments next morning :)
+ <antrik> braunr: we never supported federations, and I'm pretty sure we
+ never will. the remnants of network IPC code were ripped out some years
+ ago. some of the Hurd interfaces use opaque structs too, so it wouldn't
+ even work if it existed. as I said earlier, it's really all a sham
+ <antrik> as for the timespec type, I think it's easier to stick with the
+ API definition at RPC level too
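+
+The type that comes out of this discussion, as a sketch (names are
+illustrative: two 64-bit signed integers mirroring `struct timespec`, with
+`{ -1, -1 }` standing for "no timeout, block indefinitely"):
+
+    typedef struct
+      {
+        int64_t seconds;
+        int64_t nanoseconds;   /* POSIX only needs a long here; the
+                                  64-bit slot is free due to alignment */
+      } time_data_t;
+
+    #define TIME_DATA_NO_TIMEOUT ((time_data_t) { -1, -1 })
+
+glibc would convert the caller's `struct timespec` into this type before
+passing it to the servers.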
+
+
+## IRC, freenode, #hurd, 2012-07-24
+
+ <braunr> youpi: antrik: is vm_size_t an appropriate type for a c long ?
+ <braunr> (appropriate mig type)
+ <antrik> I wouldn't say so. while technically they are pretty much
+ guaranteed to be the same, conceptually they are entirely different
+ things -- it would be confusing at least to do it that way...
+ <braunr> antrik: well which one then ? :(
+ <antrik> braunr: no idea TBH
+ <braunr> antrik_: that should have been natural_t and integer_t
+ <braunr> so maybe we should add new types to replace them
+ <antrik_> braunr: actually, RPCs should never have any machine-specific
+ types... which makes me realise that a 1:1 translation to the POSIX
+ definition is actually not possible if we want to follow the Mach ideals
+ <braunr> i agree
+ <braunr> (well, the original mach authors used natural_t in quite a bunch
+ of places ..)
+ <braunr> the mig interfaces look extremely messy to me because of this type
+ issue
+ <braunr> and i just want to move forward with my work now
+ <braunr> i could just use 2 integer_t, that would get converted in the
+ massive future revamp of the interfaces for the 64-bits userspace
+ <braunr> or 2 64-bits types
+ <braunr> i'd like us to agree on one of the two not too late so i can
+ continue
+
+
+## IRC, freenode, #hurd, 2012-07-25
+
+ <antrik_> braunr: well, for actual kernel calls, machine-specific types are
+ probably hard to avoid... the problem is when they are used in other RPCs
+ <braunr> antrik: i opted for a hurd specific time_data_t = struct[2] of
+ int64
+ <braunr> and going on with this for now
+ <braunr> once it works we'll finalize the types if needed
+ <antrik> I'm really not sure how to best handle such 32 vs. 64 bit issues
+ in Hurd interfaces...
+ <braunr> you *could* consider time_t and long to be machine specific types
+ <antrik> well, they clearly are
+ <braunr> long is
+ <braunr> time_t isn't really
+ <antrik> didn't you say POSIX demands it to be longs?
+ <braunr> we could decide to make it 64 bits in all versions of the hurd
+ <braunr> no
+ <braunr> posix requires the nanoseconds field of timespec to be long
+ <braunr> the way i see it, i don't see any problem (other than a little bit
+ of storage and performance) using 64-bits types here
+ <antrik> well, do we really want to use a machine-independent time format,
+ if the POSIX interfaces we are mapping do not?...
+ <antrik> (perhaps we should; I'm just uncertain what's better in this case)
+ <braunr> this would require creating new types for that
+ <braunr> probably mach types for consistency
+ <braunr> to replace natural_t and integer_t
+ <braunr> now this concerns a totally different issue than select
+ <braunr> which is how we're gonna handle the 64-bits port
+ <braunr> because natural_t and integer_t are used almost everywhere
+ <antrik> indeed
+ <braunr> and we must think of 2 ports
+ <braunr> the 32-bits over 64-bits gnumach, and the complete 64-bits one
+ <antrik> what do we do for the interfaces that are explicitly 64 bit?
+ <braunr> what do you mean ?
+ <braunr> i'm not sure there is anything to do
+ <antrik> I mean what is done in the existing ones?
+ <braunr> like off64_t ?
+ <antrik> yeah
+ <braunr> they use int64 and unsigned64
+ <antrik> OK. so we shouldn't have any trouble with that at least...
+ <pinotree> braunr: were you adding a time_value_t in mach, but for
+ nanoseconds?
+ <braunr> no i'm adding a time_data_t to the hurd
+ <braunr> for nanoseconds yes
+ <pinotree> ah ok
+ <pinotree> (make sure it is available in hurd/hurd_types.defs)
+ <braunr> yes it's there
+ <pinotree> \o/
+ <braunr> i mean, i didn't forget to add it there
+ <braunr> for now it's a struct[2] of int64
+ <braunr> but we're not completely sure of that
+ <braunr> currently i'm teaching the hurd how to use timeouts
+ <pinotree> cool
+ <braunr> which basically involves adding a time_data_t *timeout parameter
+ to many functions
+ <braunr> and replacing hurd_condition_wait with hurd_condition_timedwait
+ <braunr> and making sure a timeout isn't an error on the return path
+ * pinotree has a simpler idea for time_data_t: add a file_utimesns to
+ fs.defs
+ <braunr> hmm, some functions have a nonblocking parameter
+ <braunr> i'm not sure if it's better to replace them with the timeout, or add the timeout parameter
+ <braunr> considering the functions involved may return EWOULDBLOCK
+ <braunr> for now i'll add a timeout parameter, so that the code requires as little modification as possible
+ <braunr> tell me your opinion on that please
+ <antrik> braunr: what functions?
+ <braunr> connq_listen in pflocal for example
+ <antrik> braunr: I don't really understand what you are talking about :-(
+ <braunr> some servers implement select this way :
+ <braunr> 1/ call a function in non-blocking mode, if it indicates data is available, return immediately
+ <braunr> 2/ call the same function, in blocking mode
+ <braunr> normally, with the new timeout parameter, non-blocking could be passed in the timeout parameter (with a timeout of 0)
+ <braunr> operating in non-blocking mode, i mean
+ <braunr> antrik: is it clear now ? :)
+ <braunr> i wonder how the hurd managed to grow so much code without a cond_timedwait function :/
+ <braunr> i think i have finished my io_select_timeout patch on the hurd side
+ <braunr> :)
+ <braunr> a small step for the hurd, but a big one against vim latencies !!
+ <braunr> (which is the true reason i'm working on this haha)
+ <braunr> new hurd rbraun/io_select_timeout branch for those interested
+ <braunr> hm, my changes clash hard with the debian pflocal patch by neal :/
+ <antrik> braunr: replace I'd say. no need to introduce redundancy; and code changes not affecting interfaces are cheap
+ <antrik> (in general, I'm always in favour of refactoring)
+ <braunr> antrik: replace what ?
+ <antrik> braunr: wow, didn't think moving the timeouts to server would be such a quick task :-)
+ <braunr> antrik: :)
+ <antrik> 16:57 < braunr> hmm, some functions have a nonblocking parameter
+ <antrik> 16:58 < braunr> i'm not sure if it's better to replace them with the timeout, or add the timeout parameter
+ <braunr> antrik: ah about that, ok
+
+
+## IRC, freenode, #hurd, 2012-07-26
+
+ <pinotree> braunr: wrt your select_timeout branch, why not push only the
+ time_data stuff to master?
+ <braunr> pinotree: we didn't agree on that yet
+
+ <braunr> ah better, with the correct ordering of io routines, my hurd boots
+ :)
+ <pinotree> and works too? :p
+ <braunr> so far yes
+ <braunr> i've spotted some issues in libpipe but nothing major
+ <braunr> i "only" have to adjust the client side select implementation now
+
+
+## IRC, freenode, #hurd, 2012-07-27
+
+ <braunr> io_select should remain a routine (i.e. synchronous) for server
+ side stub code
+ <braunr> but should be asynchronous (send only) for client side stub code
+ <braunr> (since _hurs_select manually handles replies through a port set)
+
+
+## IRC, freenode, #hurd, 2012-07-28
+
+ <braunr> why are there both REPLY_PORTS and IO_SELECT_REPLY_PORT macros in
+ the hurd ..
+ <braunr> and for the select call only :(
+ <braunr> and doing the exact same thing unless i'm mistaken
+ <braunr> the reply port is required for select anyway ..
+ <braunr> i just want to squeeze them into a new IO_SELECT_SERVER macro
+ <braunr> i don't think i can maintain the use of the existing io_select
+ call as it is
+ <braunr> grr, the io_request/io_reply files aren't synced with the io.defs
+ file
+ <braunr> calls like io_sigio_request seem totally unused
+ <antrik> yeah, that's a major shortcoming of MIG -- we shouldn't need to
+ have separate request/reply defs
+ <braunr> they're not even used :/
+ <braunr> i did something a bit ugly but it seems to do what i wanted
+
+
+## IRC, freenode, #hurd, 2012-07-29
+
+ <braunr> good, i have a working client-side select
+ <braunr> now i need to fix the servers a bit :x
+ <braunr> arg, my test cases work, but vim doesn't :((
+ <braunr> i hate select :p
+ <braunr> ah good, my problems are caused by a deadlock because of my glibc
+ changes
+ <braunr> ah yes, found my locking problem
+ <braunr> building my final libc now
+ * braunr crosses fingers
+ <braunr> (the deadlock issue was of course a one liner)
+ <braunr> grr deadlocks again
+ <braunr> grmbl, my deadlock is in pfinet :/
+ <braunr> my select_timeout code makes servers deadlock on the libports
+ global lock :/
+ <braunr> wtf..
+ <braunr> youpi: it may be related to the failed assertion
+ <braunr> deadlocking on mutex_unlock oO
+ <braunr> grr
+ <braunr> actually, mutex_unlock sends a message to notify other threads
+ that the lock is ready
+ <braunr> and that's what is blocking ..
+ <braunr> i'm not sure it's a fundamental problem here
+ <braunr> it may simply be a corruption
+ <braunr> i have several (but not that many) threads blocked in mutex_unlock
+ and one blocked in mutex_lock
+ <braunr> i fail to see how my changes can create such a behaviour
+ <braunr> the weird thing is that i can't reproduce this with my test cases
+ :/
+ <braunr> only vim makes things crazy
+ <braunr> and i suppose it's related to the terminal
+ <braunr> (don't terminals relay select requests ?)
+ <braunr> when starting vim through ssh, pfinet deadlocks, and when starting
+ it on the mach console, the console term deadlocks
+ <pinotree> no help/hints when started with rpctrace?
+ <braunr> i only get assertions with rpctrace
+ <braunr> it's completely unusable for me
+ <braunr> gdb tells vim is indeed blocked in a select request
+ <braunr> and i can't see any in the remote servers :/
+ <braunr> this is so weird ..
+ <braunr> when using vim with the unmodified c library, i clearly see the
+ select call, and everything works fine ....
+ <braunr> 2e27: a1 c4 d2 b7 f7 mov 0xf7b7d2c4,%eax
+ <braunr> 2e2c: 62 (bad)
+ <braunr> 2e2d: f6 47 b6 69 testb $0x69,-0x4a(%edi)
+ <braunr> what's the "bad" line ??
+ <braunr> ew, i think i understand my problem now
+ <braunr> the timeout makes blocking threads wake prematurely
+ <braunr> but on an mutex unlock, or a condition signal/broadcast, a message
+ is still sent, as it is expected a thread is still waiting
+ <braunr> but the receiving thread, having returned sooner than expected
+ from mach_msg, doesn't dequeue the message
+ <braunr> as vim does a lot of non blocking selects, this fills the message
+ queue ...
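+
+The failure mode diagnosed above, in schematic form (invented names; the
+real code is the cthreads condition implementation):
+
+    /* Waiter side of a timed condition wait.  */
+    enqueue (&cond->queue, self);
+    err = mach_msg (&msg, MACH_RCV_MSG | MACH_RCV_TIMEOUT, 0, sizeof msg,
+                    self->wakeup_port, timeout_ms, MACH_PORT_NULL);
+    if (err == MACH_RCV_TIMED_OUT)
+      return;   /* BUG: this thread is still on COND->queue, so a later
+                   signal/broadcast will send a wakeup message that nobody
+                   dequeues; with many non-blocking selects such stale
+                   messages pile up until senders block: deadlock.  */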
+
+
+## IRC, freenode, #hurd, 2012-07-30
+
+ <braunr> hm nice, the problem i have with my hurd_condition_timedwait seems
+ to also exist in libpthread
+
+[[!taglink open_issue_libpthread]].
+
+ <braunr> although at a lesser degree (the implementation already correctly
+ removes a thread that timed out from a condition queue, and there is a
+ nice FIXME comment asking what to do with any stale wakeup message)
+ <braunr> and the only solution i can think of for now is to drain the
+ message queue
+ <braunr> ah yes, i now have vim running with my io_select_timeout code :>
+ <braunr> but hum
+ <braunr> eating all cpu
+ <braunr> ah nice, an infinite loop in _hurd_critical_section_unlock
+ <braunr> grmbl
+ <tschwinge> braunr: But not this one?
+ http://www.gnu.org/software/hurd/open_issues/fork_deadlock.html
+ <braunr> it looks similar, yes
+ <braunr> let me try again to compare in detail
+ <braunr> pretty much the same yes
+ <braunr> there is only one difference but i really don't think it matters
+ <braunr> (#3 _hurd_sigstate_lock (ss=0x2dff718) at hurdsig.c:173
+ <braunr> instead of
+ <braunr> #3 _hurd_sigstate_lock (ss=0x1235008) at hurdsig.c:172)
+ <braunr> ok so we need to review jeremie's work
+ <braunr> tschwinge: thanks for pointing me at this
+ <braunr> the good thing with my patch is that i can reproduce in a few
+ seconds
+ <braunr> consistently
+ <tschwinge> braunr: You're welcome. Great -- a reproducer!
+ <tschwinge> You might also build a glibc without his patches as a
+ cross-test to see if the issues go away?
+ <braunr> right
+ <braunr> i hope they're easy to find :)
+ <tschwinge> Hmm, have you already done changes to glibc? Otherwise you
+ might also simply use a Debian package from before?
+ <braunr> yes i have local changes to _hurd_select
+ <tschwinge> OK, too bad.
+ <tschwinge> braunr: debian/patches/hurd-i386/tg-hurdsig-*, I think.
+ <braunr> ok
+ <braunr> hmmmmm
+ <braunr> it may be related to my last patch on the select_timeout branch
+ <braunr> (i mean, this may be caused by what i mentioned earlier this
+ morning)
+ <braunr> damn i can't build glibc without the signal disposition patches :(
+ <braunr> libpthread_sigmask.diff depends on it
+ <braunr> tschwinge: doesn't libpthread (as implemented in the debian glibc
+ patches) depend on global signal dispositions ?
+ <braunr> i think i'll use an older glibc for now
+ <braunr> but hmm which one ..
+ <braunr> oh whatever, let's fix the deadlock, it's simpler
+ <braunr> and more productive anyway
+ <tschwinge> braunr: May be that you need to revert some libpthread patch,
+ too. Or even take out the libpthread build completely (you don't need it
+ for you current work, I think).
+ <tschwinge> braunr: Or, of course, you locate the deadlock. :-)
+ <braunr> hum, now why would __io_select_timeout return
+ EMACH_SEND_INVALID_DEST :(
+ <braunr> the current glibc code just transparently reports any such error
+ as a false positive oO
+ <braunr> hm nice, segfault through recursion
+ <braunr> "task foo destroying an invalid port bar" everywhere :((
+ <braunr> i still have problems at the server side ..
+ <braunr> ok i think i have a solution for the "synchronization problem"
+ <braunr> (by this name, i refer to the way mutex and condition variables
+ are implemented"
+ <braunr> (the problem being that, when a thread unblocks early, because of
+ a timeout, another may still send a message to attempt to wake it, which
+ may fill up the message queue and make the sender block, causing a
+ deadlock)
+ <bddebian> Attempts to wake a dead thread?
+ <braunr> no
+ <braunr> attempt to wake an already active thread
+ <braunr> which won't dequeue the message because it's doing something else
+ <braunr> bddebian: i'm mentioning this because the problem potentially also
+ exists in libpthread
+
+[[!taglink open_issue_libpthread]].
+
+ <braunr> since the underlying algorithms are exactly the same
+ <youpi> (fortunately the time-out versions are not often used)
+ <braunr> for now :)
+ <braunr> for reference, my idea is to make the wake call truly non
+ blocking, by setting a timeout of 0
+ <braunr> i also limit the message queue size to 1, to limit the amount of
+ spurious wakeups
+ <braunr> i'll be able to test that in 30 mins or so
+ <braunr> hum
+ <braunr> how can mach_msg block with a timeout of 0 ??
+ <braunr> never mind :p
+ <braunr> unfortunately, my idea alone isn't enough
+ <braunr> for those interested in the problem, i've updated the analysis in
+ my last commit
+ (http://git.savannah.gnu.org/cgit/hurd/hurd.git/commit/?h=rbraun/select_timeout&id=40fe717ba9093c0c893d9ea44673e46a6f9e0c7d)
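+
+Two of the measures described above, using the actual Mach primitives (the
+surrounding names are invented; note that the log later replaces the
+non-blocking send with a simple flag test):
+
+    /* At most one pending wakeup message per thread...  */
+    mach_port_set_qlimit (mach_task_self (), wakeup_port, 1);
+
+    /* ...and a wakeup send that can never block: a full queue only means
+       a wakeup is already pending, which is fine.  */
+    err = mach_msg (&msg.header, MACH_SEND_MSG | MACH_SEND_TIMEOUT,
+                    msg.header.msgh_size, 0, MACH_PORT_NULL,
+                    0 /* timeout */, MACH_PORT_NULL);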
+
+
+## IRC, freenode, #hurd, 2012-08-01
+
+ <braunr> damn, i can't manage to make threads calling condition_wait to
+ dequeue themselves from the condition queue :(
+ <braunr> (instead of the one sending the signal/broadcast)
+ <braunr> my changes on cthreads introduce 2 intrusive changes
+ <braunr> the first is that the wakeup port is limited to 1 message, and the
+ wakeup operation is totally non blocking
+ <braunr> which is something we should probably add in any case
+ <braunr> the second is that condition_wait dequeues itself after blocking,
+ instead of condition_signal/broadcast
+ <braunr> and this second change seems to introduce deadlocks, for reasons
+ completely unknown to me :((
+ <braunr> if anyone has an idea about why it is bad for a thread to remove
+ itself from a condition/mutex queue, i'm all ears
+ <braunr> i'm hitting a wall :(
+ <braunr> antrik: if you have some motivation, can you review this please ?
+ http://www.sceen.net/~rbraun/0001-Rework-condition-signal-broadcast.patch
+ <braunr> with this patch, i get threads blocked in condition_wait,
+ apparently waiting for a wakeup that never comes (or was already
+ consumed)
+ <braunr> and i don't understand why :(
+ <bddebian> braunr: The condition never happens?
+ <braunr> bddebian: it works without the patch, so i guess that's not the
+ problem
+ <braunr> bddebian: hm, you could be right actually :p
+ <bddebian> braunr: About what? :)
+ <braunr> 17:50 < bddebian> braunr: The condition never happens?
+ <braunr> although i doubt it again
+ <braunr> this problem is getting very very frustrating
+ <bddebian> :(
+ <braunr> it frightens me because i don't see any flaw in the logic :(
+
+
+## IRC, freenode, #hurd, 2012-08-02
+
+ <braunr> ah, seems i found a reliable workaround to my deadlock issue, and
+ more than a workaround, it should increase efficiency by reducing
+ messaging
+ * braunr happy
+ <kilobug> congrats :)
+ <braunr> the downside is that we may have a problem with non blocking send
+ calls :/
+ <braunr> which are used for signals
+ <braunr> i mean, this could be a mach bug
+ <braunr> let's try running a complete hurd with the change
+ <braunr> arg, the boot doesn't complete with the patch .. :(
+ <braunr> grmbl, by changing only a few bits in cthreads, the boot process
+ freezes in an infinite loop in something started after auth
+ (/etc/hurd/runsystem i assume)
+
+
+## IRC, freenode, #hurd, 2012-08-03
+
+ <braunr> glibc actually makes some direct use of cthreads condition
+ variables
+ <braunr> and my patch seems to work with servers in an already working
+ hurd, but doesn't allow it to boot
+ <braunr> and the hang happens on bash, the first thing that doesn't come
+ from the hurd package
+ <braunr> (i mean, during the boot sequence)
+ <braunr> which means we can't change cthreads headers (as some primitives
+ are macros)
+ <braunr> *sigh*
+ <braunr> the thing is, i can't fix select until i have a
+ condition_timedwait primitive
+ <braunr> and i can't add this primitive until either 1/ cthreads are fixed
+ not to allow the inlining of its primitives, or 2/ the switch to pthreads
+ is done
+ <braunr> which might take a loong time :p
+ <braunr> i'll have to rebuild a whole libc package with a fixed cthreads
+ version
+ <braunr> let's do this
+ <braunr> pinotree: i see two __condition_wait calls in glibc, how is the
+ double underscore handled ?
+ <pinotree> where do you see it?
+ <braunr> sysdeps/mach/hurd/setpgid.c and sysdeps/mach/hurd/setsid.c
+ <braunr> i wonder if it's even used
+ <braunr> looks like we use posix/setsid.c now
+ <pinotree> #ifdef noteven
+ <braunr> ?
+ <pinotree> the two __condition_wait calls you pointed out are in such
+ preprocessor blocks
+ <braunr> but what does it mean ?
+ <pinotree> no idea
+ <braunr> ok
+ <pinotree> these two files should definitely be used, they are found
+ earlier in the vpath
+ <braunr> hum, posix/setsid.c is a nop stub
+ <pinotree> i don't see anything defining "noteven" in glibc itself nor in
+ hurd
+ <braunr> :(
+ <pinotree> yes, most of the stuff in posix/, misc/, signal/, time/ are
+ ENOSYS stubs, to be reimplemented in a sysdep
+ <braunr> hm, i may have made a small mistake in cthreads itself actually
+ <braunr> right
+ <braunr> when i try to debug using a subhurd, gdb tells me the blocked
+ process is spinning in ld.so
+ <braunr> and i can't see any debugging symbol
+ <braunr> some progress, it hangs at process_envvars
+ <braunr> eh
+ <braunr> i've partially traced my problem
+ <braunr> when a "normal" program starts, libc creates the signal thread
+ early
+ <braunr> the main thread waits for the creation of this thread by polling
+ its address
+ <braunr> (i.e. while (signal_thread == 0); )
+ <braunr> for some reason, it is stuck in this loop
+ <braunr> cthread creation being actually governed by
+ condition_wait/broadcast, it makes some sense
+ <bddebian> braunr: When you say the "main" thread, do you mean the main
+ thread of the program?
+ <braunr> bddebian: yes
+ <braunr> i think i've determined my mistake
+ <braunr> glibc has its own variants of the mutex primitives
+ <braunr> and i changed one :/
+ <bddebian> Ah
+ <braunr> it's good news for me :)
+ <braunr> hum no, that's not exactly what i described
+ <braunr> glibc has some stubs, but it's not the problem, the problem is
+ that mutex_lock/unlock are macros, and i changed one of them
+ <braunr> so everything that used that macro inside glibc wasn't changed
+ <braunr> yes!
+ <braunr> my patched hurd now boots :)
+ * braunr relieved
+ <braunr> this experience at least taught me that it's not possible to
+ easily change the singly linked queues of threads (waiting for a mutex or
+ a condition variable) :(
+ <braunr> for now, i'm using a linear search from the start
+ <braunr> so, not only does this patched hurd boot, but i was able to use
+ aptitude, git, build a whole hurd, copy the whole thing, and remove
+ everything, and it still runs fine (whereas usually it would fail very
+ early)
+ * braunr happy
+ <antrik> and vim works fine now?
+ <braunr> err, wait
+ <braunr> this patch does only one thing
+ <braunr> it alters the way condition_signal/broadcast and
+ {hurd_,}condition_wait operate
+ <braunr> currently, condition_signal/broadcast dequeues threads from a
+ condition queue and wake them
+ <braunr> my patch makes these functions only wake the target threads
+ <braunr> which dequeue themselves
+ <braunr> (a necessary requirement to allow clean timeout handling)
+ <braunr> the next step is to fix my hurd_condition_wait patch
+ <braunr> and reapply the whole hurd patch introducing io_select_timeout
+ <braunr> then i'll be able to tell you
+ <braunr> one side effect of my current changes is that the linear search
+ required when a thread dequeues itself is ugly
+ <braunr> so it'll be an additional reason to help the pthreads porting
+ effort
+ <braunr> (pthreads have the same sort of issues wrt timeout handling,
+ but threads are on doubly-linked lists, making it way easier to adjust)
+ <braunr> damn i'm happy
+ <braunr> 3 days on this stupid bug
+ <braunr> (which is actually responsible for what i initially feared to be a
+ mach bug on non blocking sends)
+ <braunr> (and because of that, i worked on the code to make sure that 1/
+ waking is truly non blocking and 2/ only one message is required for
+ wakeups)
+ <braunr> a simple flag is tested instead of sending in a non blocking way
+ :)
+ <braunr> these improvements should be ported to pthreads some day
+
+[[!taglink open_issue_libpthread]]
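+
+A minimal sketch (invented names) of the reworked scheme described above:
+signal/broadcast only set a flag and post a wakeup; each waiter unlinks
+itself from the queue afterwards, whether it was woken or timed out.
+
+    void
+    condition_broadcast (struct condition *cond)
+    {
+      struct waiter *w;
+
+      spin_lock (&cond->lock);
+      for (w = cond->queue; w != NULL; w = w->next)
+        {
+          w->woken = 1;             /* flag tested by the waiter, so the */
+          wakeup_nonblocking (w);   /* wakeup itself can never block */
+        }
+      spin_unlock (&cond->lock);
+      /* Each waiter removes itself from COND->queue on return from the
+         (timed) wait -- currently a linear search, since the cthreads
+         queue is singly-linked.  */
+    }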
+
+ <braunr> ahah !
+ <braunr> view is now FAST !
+ <mel-> braunr: what do you mean by 'view'?
+ <braunr> mel-: i mean the read-only version of vim
+ <mel-> aah
+ <braunr> i still have a few port leaks to fix
+ <braunr> and some polishing
+ <braunr> but basically, the non-blocking select issue seems fixed
+ <braunr> and with some luck, we should get unexpected speedups here and
+ there
+ <mel-> so vim was considerably slow on the Hurd before? didn't know that.
+ <braunr> not exactly
+ <braunr> at first, it wasn't, but the non blocking select/poll calls
+ misbehaved
+ <braunr> so a patch was introduced to make these block at least 1 ms
+ <braunr> then vim became slow, because it does a lot of non blocking select
+ <braunr> so another patch was introduced, not to set the 1ms timeout for a
+ few programs
+ <braunr> youpi: darnassus is already running the patched hurd, which shows
+ (as expected) that it can safely be used with an older libc
+ <youpi> i.e. servers with the additional io_select?
+ <braunr> yes
+ <youpi> k
+ <youpi> good :)
+ <braunr> and the modified cthreads
+ <braunr> which is the most intrusive change
+ <braunr> port leaks fixed
+ <gnu_srs> braunr: Congrats:-D
+ <braunr> thanks
+ <braunr> it's not over yet :p
+ <braunr> tests, reviews, more tests, polishing, commits, packaging
+
+
+## IRC, freenode, #hurd, 2012-08-04
+
+ <braunr> grmbl, apt-get fails on select in my subhurd with the updated
+ glibc
+ <braunr> otherwise it boots and runs fine
+ <braunr> fixed :)
+ <braunr> grmbl, there is a deadlock in pfinet with my patch
+ <braunr> deadlock fixed
+ <braunr> the sigstate and the condition locks must be taken at the same
+ time, for some obscure reason explained in the cthreads code
+ <braunr> but when a thread awakes and dequeues itself from the condition
+ queue, it only took the condition lock
+ <braunr> i noted in my todo list that this could create problems, but
+ wanted to leave it as it is to really see it happen
+ <braunr> well, i saw :)
+ <braunr> the last commit of my hurd branch includes the 3 line fix
+ <braunr> these fixes will be required for libpthreads
+ (pthread_mutex_timedlock and pthread_cond_timedwait) some day
+ <braunr> after the select bug is fixed, i'll probably work on that with you
+ and thomas d
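+
+The shape of that three-line fix (approximate names): the timed-out waiter
+takes the sigstate lock together with the condition lock before unlinking
+itself, mirroring the locking done by the cthreads wakeup path.
+
+    spin_lock (&ss->lock);       /* sigstate lock first, as the waker does */
+    spin_lock (&cond->lock);
+    self_dequeue (&cond->queue, self);
+    spin_unlock (&cond->lock);
+    spin_unlock (&ss->lock);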
+
+
+## IRC, freenode, #hurd, 2012-08-05
+
+ <braunr> eh, i made dpkg-buildpackage use the patched c library, and it
+ finished the build oO
+ <gnu_srs> braunr: :)
+ <braunr> faked-tcp was blocked in a select call :/
+ <braunr> (with the old libc i mean)
+ <braunr> with mine it just worked at the first attempt
+ <braunr> i'm not sure what it means
+ <braunr> it could mean that the patched hurd servers are not completely
+ compatible with the current libc, for some weird corner cases
+ <braunr> the slowness of faked-tcp is apparently inherent to its
+ implementation
+ <braunr> all right, let's put all these packages online
+ <braunr> eh, right when i upload them, i get a deadlock
+ <braunr> this one seems specific to pfinet
+ <braunr> only one deadlock so far, and the libc wasn't in sync with the
+ hurd
+ <braunr> :/
+ <braunr> damn, another deadlock as soon as i send a mail on bug-hurd :(
+ <braunr> grr
+ <pinotree> thou shall not email
+ <braunr> aptitude seems to be a heavy user of select
+ <braunr> oh, it may be due to my script regularly changing the system time
+ <braunr> or it may not be a deadlock, but simply the linear queue getting
+ extremely large
+
+
+## IRC, freenode, #hurd, 2012-08-06
+
+ <braunr> i have bad news :( it seems there can be memory corruptions with
+ my io_select patch
+ <braunr> i've just seen an auth server (!) spinning on a condition lock
+ (the internal spin lock), probably because the condition was corrupted ..
+ <braunr> i guess it's simply because conditions embedded in dynamically
+ allocated structures can be freed while there are still threads waiting
+ ...
+ <braunr> so, yes, the solution to my problem is simply to dequeue threads
+ from both sides: the waker when there is one, and the waiter when no
+ wakeup message was received
+ <braunr> simple
+ <braunr> it's so obvious i wonder how i didn't think of it earlier :(-
+ <antrik> braunr: an elegant solution always seems obvious afterwards... ;-)
+ <braunr> antrik: let's hope this time, it's completely right
+ <braunr> good, my latest hurd packages seem fixed finally
+ <braunr> looks like i got another deadlock
+ * braunr hangs himself
+ <braunr> that, or again, condition queues can get very large (e.g. on
+ thread storms)
+ <braunr> looks like this is the case yes
+ <braunr> after some time the system recovered :(
+ <braunr> which means a doubly linked list is required to avoid pathological
+ behaviours
+ <braunr> arg
+ <braunr> it won't be easy at all to add a doubly linked list to condition
+ variables :(
+ <braunr> actually, just a bit messy
+ <braunr> youpi: other than this linear search on dequeue, darnassus has
+ been working fine so far
+ <youpi> k
+ <youpi> Mmm, you'd need to bump the abi soname if changing the condition
+ structure layout
+ <braunr> :(
+ <braunr> youpi: how are we going to solve that ?
+ <youpi> well, either bump soname, or finish transition to libpthread :)
+ <braunr> it looks better to work on pthread now
+ <braunr> to avoid too many abi changes
+
+[[libpthread]].
+
+
+## IRC, freenode, #hurd, 2012-08-07
+
+ <rbraun_hurd> anyone knows of applications extensively using non-blocking
+ networking functions ?
+ <rbraun_hurd> (well, networking functions in a non-blocking way)
+ <antrik> rbraun_hurd: X perhaps?
+ <antrik> it's single-threaded, so I guess it must be pretty async ;-)
+ <antrik> thinking about it, perhaps it's the reason it works so poorly on
+ Hurd...
+ <braunr> it does ?
+ <rbraun_hurd> ah maybe at the client side, right
+ <rbraun_hurd> hm no, the client side is synchronous
+ <rbraun_hurd> oh by the way, i can use gitk on darnassus
+ <rbraun_hurd> i wonder if it's because of the select fix
+ <tschwinge> rbraun_hurd: If you want, you could also have a look if there's
+ any improvement for these:
+ http://www.gnu.org/software/hurd/open_issues/select.html (elinks),
+ http://www.gnu.org/software/hurd/open_issues/dbus.html,
+ http://www.gnu.org/software/hurd/open_issues/runit.html
+ <tschwinge> rbraun_hurd: And congratulations, again! :-)
+ <rbraun_hurd> tschwinge: too bad it can't be merged before the pthread port
+ :(
+ <antrik> rbraun_hurd: I was talking about server. most clients are probably
+ sync.
+ <rbraun_hurd> antrik: i guessed :)
+ <antrik> (though certainly not all... multithreaded clients are not really
+ supported with xlib IIRC)
+ <rbraun_hurd> but i didn't have much trouble with X
+ <antrik> tried something pushing a lot of data? like, say, glxgears? :-)
+ <rbraun_hurd> why not
+ <rbraun_hurd> the problem with tests involving "a lot of data" is that it
+ can easily degenerate into a livelock
+ <antrik> yeah, sounds about right
+ <rbraun_hurd> (with the current patch i mean)
+ <antrik> the symptoms I got were general jerkiness, with occasional long
+ hangs
+ <rbraun_hurd> that applies to about everything on the hurd
+ <rbraun_hurd> so it didn't alarm me
+ <antrik> another interesting testcase is freeciv-gtk... it reproducibly
+ caused a thread explosion after idling for some time -- though I don't
+ remember the details; and never managed to come up with a way to track
+ down how this happens...
+ <rbraun_hurd> dbus is more worthwhile
+ <rbraun_hurd> pinotree: how do i test that ?
+ <pinotree> eh?
+ <rbraun_hurd> pinotree: you once mentioned dbus had trouble with non
+ blocking selects
+ <pinotree> it does a poll() with a 0s timeout
+ <rbraun_hurd> that's the non blocking select part, yes
+ <pinotree> you'll need also fixes for the socket credentials though,
+ otherwise it won't work ootb
+ <rbraun_hurd> right but, isn't it already used somehow ?
+ <antrik> rbraun_hurd: uhm... none of the non-X applications I use expose a
+ visible jerkiness/long hangs pattern... though that may well be a result
+ of general load patterns rather than X I guess
+ <rbraun_hurd> antrik: that's my feeling
+ <rbraun_hurd> antrik: heavy communication channels, unoptimal scheduling,
+ lack of scalability, they're clearly responsible for the generally
+ perceived "jerkiness" of the system
+ <antrik> again, I can't say I observe "general jerkiness". apart from slow
+ I/O the system behaves rather normally for the things I do
+ <antrik> I'm pretty sure the X jerkiness *is* caused by the socket
+ communication
+ <antrik> which of course might be a scheduling issue
+ <antrik> but it seems perfectly possible that it *is* related to the select
+ implementation
+ <antrik> at least worth a try I'd say
+ <rbraun_hurd> sure
+ <rbraun_hurd> there is still some work to do on it though
+ <rbraun_hurd> the client side changes i did could be optimized a bit more
+ <rbraun_hurd> (but i'm afraid it would lead to ugly things like 2 timeout
+ parameters in the io_select_timeout call, one for the client side, the
+ other for the servers, eh)
+
+
+## IRC, freenode, #hurd, 2012-08-07
+
+ <braunr> when running gitk on [darnassus], yesterday, i could push the CPU
+ to 100% by simply moving the mouse in the window :p
+ <braunr> (but it may also be caused by the select fix)
+ <antrik> braunr: that cursor might be "normal"
+ <rbraunrh> antrik: what do you mean ?
+ <antrik> the 100% CPU
+ <rbraunh> antrik: yes i got that, but what would make it normal ?
+ <rbraunh> antrik: right i get similar behaviour on linux actually
+ <rbraunh> (not 100% because two threads are spread on different cores, but
+ their cpu usage add up to 100%)
+ <rbraunh> antrik: so you think as long as there are events to process, the
+ x client is running
+ <rbraunh> that would mean latencies are small enough to allow that, which
+ is actually a very good thing
+ <antrik> hehe... sound kinda funny :-)
+ <rbraunh> this linear search on dequeue is a real pain :/
+
+
+## IRC, freenode, #hurd, 2012-08-09
+
+`screen` doesn't close a window/hangs after exiting the shell.
+
+ <rbraunh> the screen issue seems linked to select :p
+ <rbraunh> tschwinge: the term server may not correctly implement it
+ <rbraunh> tschwinge: the problem looks related to the term consoles not
+ dying
+ <rbraunh> http://www.gnu.org/software/hurd/open_issues/term_blocking.html
+
+[[Term_blocking]].
+
+
+# IRC, freenode, #hurd, 2012-12-05
+
+ <braunr> well if i'm unable to build my own packages, i'll send you the one
+ line patch i wrote that fixes select/poll for the case where there is
+ only one descriptor
+ <braunr> (the current code calls mach_msg twice, each time with the same
+ timeout, doubling the total wait time when there is no event)
+
+
+## IRC, freenode, #hurd, 2012-12-06
+
+ <braunr> damn, my eglibc patch breaks select :x
+ <braunr> i guess i'll just simplify the code by using the same path for
+ both single fd and multiple fd calls
+ <braunr> at least, the patch does fix the case i wanted it to .. :)
+ <braunr> htop and ping act at the right regular interval
+ <braunr> my select patch is :
+ <braunr> /* Now wait for reply messages. */
+ <braunr> - if (!err && got == 0)
+ <braunr> + if (!err && got == 0 && firstfd != -1 && firstfd != lastfd)
+ <braunr> basically, when there is a single fd, the code calls io_select
+ with a timeout
+ <braunr> and later calls mach_msg with the same timeout
+ <braunr> effectively making the maximum wait time twice what it should be
+ <pinotree> ouch
+ <braunr> which is why htop and ping are "laggy"
+ <braunr> and perhaps also why fakeroot is when building libc
+ <braunr> well
+ <braunr> when building packages
+ <braunr> my patch avoids entering the mach_msg call if there is only one fd
+ <braunr> (my failed attempt didn't have the firstfd != -1 check, leading to
+ the 0 fd case skipping mach_msg too, which is wrong since in that case
+ there is just no wait, making applications use select/poll for sleeping
+ consume all cpu)
+
+ <braunr> the second is a fix in select (yet another) for the case where a
+ single fd is passed
+ <braunr> in which case there is one timeout directly passed in the
+ io_select call, but then yet another in the mach_msg call that waits for
+ replies
+ <braunr> this can account for the slowness of a bunch of select/poll users
+
+
+## IRC, freenode, #hurd, 2012-12-07
+
+ <braunr> finally, my select patch works :)
+
+
+## IRC, freenode, #hurd, 2012-12-08
+
+ <braunr> for those interested, i pushed my eglibc packages that include
+ this little select/poll timeout fix on my debian repository
+ <braunr> deb http://ftp.sceen.net/debian-hurd experimental/
+ <braunr> reports are welcome, i'm especially interested in potential
+ regressions
+
+
+## IRC, freenode, #hurd, 2012-12-10
+
+ <gnu_srs> I have verified your double timeout bug in hurdselect.c.
+ <gnu_srs> Since I'm also working on hurdselect I have a few questions
+ about where the timeouts in mach_msg and io_select are implemented.
+ <gnu_srs> Have a big problem to trace them down to actual code: mig magic
+ again?
+ <braunr> yes
+ <braunr> see hurd/io.defs, io_select includes a waittime timeout:
+ natural_t; parameter
+ <braunr> waittime is mig magic that tells the client side not to wait more
+ than the timeout
+ <braunr> and in _hurd_select, you can see these lines :
+ <braunr> err = __io_select (d[i].io_port, d[i].reply_port,
+ <braunr> /* Poll only if there's a single
+ descriptor. */
+ <braunr> (firstfd == lastfd) ? to : 0,
+ <braunr> to being the timeout previously computed
+ <braunr> "to"
+ <braunr> and later, when waiting for replies :
+ <braunr> while ((msgerr = __mach_msg (&msg.head,
+ <braunr> MACH_RCV_MSG | options,
+ <braunr> 0, sizeof msg, portset, to,
+ <braunr> MACH_PORT_NULL)) ==
+ MACH_MSG_SUCCESS)
+ <braunr> the same timeout is used
+ <braunr> hope it helps
+ <gnu_srs> Additional stuff on io-select question is at
+ http://paste.debian.net/215401/
+ <gnu_srs> Sorry, should have posted it before your comment, but I was
+ interrupted.
+ <braunr> 14:13 < braunr> waittime is mig magic that tells the client side
+ not to wait more than the timeout
+ <braunr> the waittime argument is a client argument only
+ <braunr> that's one of the main source of problems with select/poll, and
+ the one i fixed 6 months ago
+ <gnu_srs> so there is no relation to the third argument of the client call
+ and the third argument of the server code?
+ <braunr> no
+ <braunr> the 3rd argument at server side is undoubtedly the 4th at client
+ side here
+ <gnu_srs> but for the fourth argument there is?
+ <braunr> i think i've just answered that
+ <braunr> when in doubt, check the code generated by mig when building glibc
+ <gnu_srs> as I said before, I have verified the timeout bug you solved.
+ <gnu_srs> which code to look for RPC_*?
+ <braunr> should be easy to guess
+ <gnu_srs> is it the same with mach_msg()? No explicit usage of the timeout
+ there either.
+ <gnu_srs> in the code for the function I mean.
+ <braunr> gnu_srs: mach_msg is a low level system call
+ <braunr> see
+ http://www.gnu.org/software/hurd/gnumach-doc/Mach-Message-Call.html#Mach-Message-Call
+ <gnu_srs> found the definition of __io_select in: RPC_io_select.c, thanks.
+ <gnu_srs> so the client code to look for wrt RPC_ is in hurd/*.defs? what
+ about the gnumach/*/include/*.defs?
+ <gnu_srs> a final question: why use a timeout if there is a single FD for
+ the __io_select call, not when there are more than one?
+ <braunr> well, the code is obviously buggy, so don't expect me to justify
+ wrong code
+ <braunr> but i suppose the idea was : if there is only one fd, perform a
+ classical synchronous RPC, whereas if there are more use a heavyweight
+ portset and additional code to receive replies
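+
+As for where the `waittime` ends up: the MIG-generated stub in
+`RPC_io_select.c` presumably passes it to its own `mach_msg` call along
+the lines of the following reconstruction (check the generated file for
+the exact options); the timeout never reaches the server:
+
+    kern_return_t err =
+      mach_msg (&msg.head,
+                MACH_SEND_MSG | MACH_RCV_MSG
+                | MACH_SEND_TIMEOUT | MACH_RCV_TIMEOUT,
+                msg.head.msgh_size, sizeof msg, reply_port,
+                timeout,        /* the waittime argument */
+                MACH_PORT_NULL);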
+
+ <youpi> exim4 didn't get fixed by the libc patch, unfortunately
+ <braunr> yes i noticed
+ <braunr> gdb can't attach correctly to exim, so it's probably something
+ completely different
+ <braunr> i'll try the non intrusive mode
+
+
# See Also
See also [[select_bogus_fd]] and [[select_vs_signals]].
diff --git a/open_issues/strict_aliasing.mdwn b/open_issues/strict_aliasing.mdwn
index 01019372..b7d39805 100644
--- a/open_issues/strict_aliasing.mdwn
+++ b/open_issues/strict_aliasing.mdwn
@@ -19,3 +19,13 @@ License|/fdl]]."]]"""]]
instead?
<braunr> pinotree: if we can rely on gcc for the warnings, yes
<braunr> but i suspect there might be other silent issues in very old code
+
+
+# IRC, freenode, #hurd, 2012-07-12
+
+ <braunr> btw, i'm building glibc right now, and i can see a few strict
+ aliasing warnings
+ <braunr> fixing them will allow us to avoid wasting time on very obscure
+ issues (if gcc catches them all)
+ <tschwinge> The strict aliasing things should be fixed, yes. Some might be
+ from MIG.
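+
+A typical instance of what GCC warns about, with a common fix; this is a
+made-up example, not code from glibc or MIG:
+
+    #include <string.h>
+
+    /* Bad: dereferencing a float * that really points to an int breaks
+       the aliasing rules, so GCC may reorder or cache the accesses.  */
+    float
+    as_float_bad (int *i)
+    {
+      return *(float *) i;
+    }
+
+    /* Good: memcpy (or a union) tells the compiler the bytes are being
+       reinterpreted, keeping the accesses correctly ordered.  */
+    float
+    as_float_good (const int *i)
+    {
+      float f;
+      memcpy (&f, i, sizeof f);
+      return f;
+    }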
diff --git a/open_issues/synchronous_ipc.mdwn b/open_issues/synchronous_ipc.mdwn
new file mode 100644
index 00000000..53d5d69d
--- /dev/null
+++ b/open_issues/synchronous_ipc.mdwn
@@ -0,0 +1,185 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+
+# IRC, freenode, #hurd, 2012-07-20
+
+From [[Genode RPC|microkernel/genode/rpc]].
+
+ <braunr> assuming synchronous ipc is the way to go (it seems so), there is
+ still the need for some async ipc (e.g signalling untrusted recipients
+ without risking blocking on them)
+ <braunr> 1/ do you agree on that and 2/ how would this low-overhead async
+ ipc be done ? (and 3/ are there relevant examples ?
+ <antrik> if you think about this stuff too much you will end up like marcus
+ and neal ;-)
+ <braunr> antrik: likely :)
+ <antrik> the truth is that there are various possible designs all with
+ their own tradeoffs, and nobody can really tell which one is better
+ <braunr> the only sensible one i found is qnx :/
+ <braunr> but it's still messy
+ <braunr> they have what they call pulses, with a strictly defined format
+ <braunr> so it's actually fine because it guarantees low overhead, and can
+ easily be queued
+ <braunr> but i'm not sure about the format
+ <antrik> I must say that Neal's half-sync approach in Viengoos still sounds
+ most promising to me. it's actually modelled after the needs of a
+ Hurd-like system; and he thought about it a lot...
+ <braunr> damn i forgot to reread that
+ <braunr> stupid me
+ <antrik> note that you can't come up with a design that allows both a)
+ delivering reliably and b) never blocking the sender -- unless you cache
+ in the kernel, which we don't want
+ <antrik> but I don't think it's really necessary to fulfill both of these
+ requirements
+ <antrik> it's up to the receiver to make sure it gets important signals
+ <braunr> right
+ <braunr> caching in the kernel is ok as long as the limit allows the
+ receiver to handle its signals
+ <antrik> in the Viengoos approach, the receiver can allocate a number of
+ receive buffers; so it's even possible to do some queuing if desired
+ <braunr> ah great, limits in the form of resources lent by the receiver
+ <braunr> one thing i really don't like in mach is the behaviour on full
+ message queues
+ <braunr> blocking :/
+ <braunr> i bet the libpager deadlock is due to that
+
+[[libpager_deadlock]].
+
+ <braunr> it simply means async ipc doesn't prevent at all from deadlocks
+ <antrik> the sender can set a timeout. blocking only happens when setting
+ it to infinite...
+ <braunr> which is commonly the case
+ <antrik> well, if you see places where blocking is done but failing would
+ be more appropriate, try changing them I'd say...
+ <braunr> it's not that easy :/
+
+
+# IRC, freenode, #hurd, 2012-08-18
+
+ <lcc> what is the deepest design mistake of the HURD/gnumach?
+ <braunr> lcc: async ipc
+ <savask> braunr: You mentioned that moving to L4 will create problems. Can
+ you name some, please?
+ <savask> I thought it was going to be faster on L4
+ <braunr> the problem is that l4 *only* provides sync ipc
+ <braunr> so implementing async communication would require one separate
+ thread for each instance of async communication
+ <savask> But you said that the deepest design mistake of Hurd is asynch
+ ipc.
+ <braunr> not the hurd, mach
+ <braunr> and hurd depends on it now
+ <braunr> i said l4 provides *only* sync ipc
+ <braunr> systems require async communication tools
+ <braunr> but they shouldn't be built entirely on top of them
+ <savask> Hmm, so you mean mach has bad asynch ipc?
+ <braunr> you can consider mach and l4 as two extremes in os design
+ <braunr> mach *only* has async ipc
+ <lcc> what was viengoos trying to explore?
+ * savask is confused
+ <braunr> lcc: half-sync ipc :)
+ <braunr> lcc: i can't tell you more on that, i need to understand it better
+ myself before any explanation attempt
+ <savask> You say that mach's problem is asynch ipc. And L4's problem is
+ its sync ipc. That means problems are in either of them!
+ <braunr> exactly
+ <lcc> how did apple resolve issues with mach?
+ <savask> What is perfect then? A "golden middle"?
+ <braunr> lcc: they have migrating threads, which make most rpc behave as if
+ they used sync ipc
+ <braunr> savask: nothing is perfect :p
+ <mcsim> braunr: but why async ipc is the problem?
+ <braunr> mcsim: it requires in-kernel buffering
+ <savask> braunr: Yes, but we can't have problems everywhere o_O
+ <braunr> mcsim: this not only reduces communication performance, but
+ creates many resource usage problems
+ <braunr> mcsim: and potential denial of service, which is what we
+ experience most of the time when something in the hurd fails
+ <braunr> savask: there are problems we can live with
+ <mcsim> braunr: But this could be replaced by userspace server, isn't it?
+ <braunr> savask: this is what monolithic kernels do
+ <braunr> mcsim: what ?
+ <braunr> mcsim: this would be the same, this central buffering server would
+ suffer from the same kind of issue
+ <mcsim> braunr: async ipc. Buffer can hold special server
+ <mcsim> But there could be created several servers, and queue could have
+ limit.
+ <braunr> queue limits are a problem
+ <braunr> when a queue limit is reached, you either block (= sync ipc) or
+ lose a message
+ <braunr> to keep messaging reliable, mach makes senders block
+ <braunr> the problem is that async ipc is often used to avoid blocking
+ <braunr> so blocking when you don't expect it can create deadlocks
+ <braunr> savask: a good compromise is to use sync ipc most of the time, and
+ async ipc for a few special cases, like signals
+ <braunr> this is what okl4 does if i'm right
+ <braunr> i'm not sure of the details, but like many other projects they
+ realized current systems simply need good support for async ipc, so they
+ extended l4 or something on top of it to provide it
+ <braunr> it took years of research for very smart people to get to some
+ consensus like "sync ipc is better but async is needed too"
+ <braunr> personaly i don't like l4 :/
+ <braunr> really not
+ <mcsim> braunr: Anyway there is some queue for messaging, but at the moment
+ if it overflows it panics the kernel. And with a limited queue servers
+ will panic.
+ <braunr> mcsim: it can't overflow
+ <braunr> mach blocks senders
+ <braunr> queuing basically means "block and possible deadlock" or "lose
+ messages and live with it"
+ <mcsim> So, deadlocks are still possible?
+ <braunr> of course
+ <braunr> have a look at the libpager debian patch and the discussion around
+ it
+ <braunr> it's a perfect example
+ <youpi> braunr: it makes gnu mach slow as hell sometimes, which I guess is
+ because all threads (which can be 1000s) wake at the same time
+ <braunr> youpi: you mean are created ?
+ <braunr> because they'll have to wake in any case
+ <braunr> i can understand why creating lots of threads is slower, but
+ cthreads never destroys kernel threads
+ <braunr> doesn't seem to be a mach problem, rather a cthreads one
+ <braunr> i hope we're able to remove the patch after pthreads are used
+
+[[libpthread]].
+
+ <mcsim> braunr: You state that hurd can't move to sync ipc, since it
+ depends on async ipc. But at the same time async ipc doesn't guarantee
+ that task wouldn't block. So, I don't understand why limited queues will
+ lead to more deadlocks?
+ <braunr> mcsim: async ipc can block because of queue limits
+ <braunr> mcsim: if you remove the limit, you remove the deadlock problem,
+ and replace it with denial of service
+ <braunr> mcsim: i didn't say the hurd can't move to sync ipc
+ <braunr> mcsim: i said it came to depend on async ipc as provided by mach,
+ and we would need to change that
+ <braunr> and it's tricky
+ <youpi> braunr: no, I really mean are woken. The timeout which gets dropped
+ by the patch makes threads wake after some time, to realize they should
+ go away. It's a hell long when all these threads wake at the same time
+ (because they got created at the same time)
+ <braunr> ahh
+
+ <antrik> savask: what is perfect regarding IPC is something nobody can
+ really answer... there are competing opinions on that matter. but we know
+ by now that the Mach model is far from ideal, and that the (original) L4
+ model is also problematic -- at least for implementing a UNIX-like system
+ <braunr> personally, if i'd create a system now, i'd use sync ipc for
+ almost everything, and implement posix-like signals in the kernel
+ <braunr> that's one solution, it's not perfect
+ <braunr> savask: actually the real answer may be "noone knows for now and
+ it still requires work and research"
+ <braunr> so for now, we're using mach
+ <antrik> savask: regarding IPC, the path explored by Viengoos (and briefly
+ Coyotos) seems rather promising to me
+ <antrik> savask: and yes, I believe that whatever direction we take, we
+ should do so by incrementally reworking Mach rather than jumping to a
+ completely new microkernel...
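+
+The queue-limit tradeoff discussed above, condensed into a sketch (a
+hypothetical bounded queue, not Mach's actual implementation): when the
+queue is full, the sender must either block, reintroducing the deadlocks
+synchronous IPC is blamed for, or drop the message, losing reliability;
+Mach chooses to block.
+
+    #include <stdbool.h>
+
+    enum send_policy { SEND_BLOCK, SEND_DROP };
+
+    struct queue { unsigned int len; unsigned int limit; };
+
+    extern void wait_for_receiver (struct queue *q);
+
+    /* Returns false only when the message is dropped.  */
+    bool
+    queue_send (struct queue *q, enum send_policy policy)
+    {
+      while (q->len == q->limit)
+        {
+          if (policy == SEND_DROP)
+            return false;       /* Lose the message and live with it.  */
+          /* SEND_BLOCK: a receiver that never drains its queue leaves
+             the sender stuck here -- the deadlock scenario.  */
+          wait_for_receiver (q);
+        }
+      q->len++;
+      return true;
+    }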
diff --git a/open_issues/system_stats.mdwn b/open_issues/system_stats.mdwn
new file mode 100644
index 00000000..9a13b29a
--- /dev/null
+++ b/open_issues/system_stats.mdwn
@@ -0,0 +1,39 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_documentation]]There should be a page listing ways to get
+system statistics, how to interpret them, and some example/expected values.
+
+
+# IRC, freenode, #hurd, 2012-11-04
+
+ <mcsim> Hi, is that normal that memory cache "ipc_port" is 24 Mb already?
+ Some memory has been already swapped out.
+ <mcsim> Other caches are big too
+ <braunr> how many ports ?
+ <mcsim> 45922
+ <braunr> yes it's normal
+ <braunr> ipc_port 0010 76 4k 50 45937 302050
+ 24164k 4240k
+ <braunr> it's a bug in exim
+ <braunr> or triggered by exim, from time to time
+ <braunr> lots of ports are created until the faulty processes are killed
+ <braunr> the other big caches you have are vm_object and vm_map_entry,
+ probably because of a big build like glibc
+ <braunr> and if they remain big, it's because there was no memory pressure
+ since they got big
+ <braunr> memory pressure can only be caused by very large files on the
+ hurd, because of the limited page cache size (4000 objects at most)
+ <braunr> the reason you have swapped memory is probably because of a glibc
+ test that allocates a very large (more than 1.5 GiB iirc) block and fills
+ it
+ <mcsim> yes
+ <braunr> (a test that fails with the 2G/2G split of the debian kernel, but
+ not on your vanilla version btw)
diff --git a/open_issues/term_blocking.mdwn b/open_issues/term_blocking.mdwn
index 19d18d0e..39803779 100644
--- a/open_issues/term_blocking.mdwn
+++ b/open_issues/term_blocking.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2009, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2009, 2011, 2012 Free Software Foundation,
+Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -117,6 +118,128 @@ noninvasive on`, attach to the *term* that GDB is using.
[[2011-07-04]].
+# IRC, freenode, #hurd, 2012-08-09
+
+In context of the [[select]] issue.
+
+ <braunr> i wonder where the tty allocation is made
+ <braunr> it could simply be that current applications don't handle old BSD
+ ptys correctly
+ <braunr> hm no, allocation is fine
+ <braunr> does someone know why there is no term instance for /dev/ttypX ?
+ <braunr> showtrans says "/hurd/term /dev/ttyp0 pty-slave /dev/ptyp0" though
+ <youpi> braunr: /dev/ttypX share the same translator with /dev/ptypX
+ <braunr> youpi: but how ?
+ <youpi> see the main function of term
+ <youpi> it attaches itself to the other node
+ <youpi> with file_set_translator
+ <youpi> just like pfinet can attach itself to /servers/socket/26 too
+ <braunr> youpi: isn't there a possible race when the same translator tries
+ to sets itself on several nodes ?
+ <youpi> I don't know
+ <tschwinge> There is.
+ <braunr> i guess it would just fail
+ <tschwinge> I remember some discussion about this, possibly in context of
+ the IPv6 project.
+ <braunr> gdb shows weird traces in term
+ <braunr> i got this earlier today: http://www.sceen.net/~rbraun/gdb.txt
+ <braunr> 0x805e008 is the ptyctl, the trivs control for the pty
+ <tschwinge> braunr: How do you mean »weird«?
+ <braunr> tschwinge: some peropen (po) are never destroyed
+ <tschwinge> Well, can't they possibly still be open?
+ <braunr> they shouldn't
+ <braunr> that's why term doesn't close cleanly, why select still reports
+ readiness, and why screen loops on it
+ <braunr> (and why each ssh session uses a different pty)
+ <tschwinge> ... but only on darnassus, I think? (I think I haven't seen
+ this anywhere else.)
+ <braunr> really ?
+ <braunr> i had it on my virtual machines too
+ <tschwinge> But perhaps I've always been rebooting systems quickly enough
+ to not notice.
+ <tschwinge> OK, I'll have a look next time I boot mine.
+ <braunr> i suppose it's why you can't login anymore quickly when syslog is
+ running
+
+[[syslog]]?
+
+ <braunr> i've traced the problem to ptyio.c, where pty_open_hook returns
+ EBUSY because ptyopen is still true
+ <braunr> ptyopen remains true because pty_po_create_hook doesn't get called
+ <youpi> tschwinge: I've seen the pty issue on exodar too, and on my qemu
+ image too
+ <braunr> err, pty_po_destroy_hook
+ <tschwinge> OK.
+ <braunr> and pty_po_destroy_hook doesn't get called from users.c because
+ po->cntl != ptyctl
+ <braunr> which means, somehow, the pty never gets closed
+ <youpi> oddly enough it seems to happen on all qemu systems I have, and no
+ xen system I have
+ <braunr> Oo
+ <braunr> are they all (xen and qemu) up to date ?
+ <braunr> (so we can remove versions as a factor)
+ <tschwinge> Aha. I only hve Xen and real hardware.
+ <youpi> braunr: no
+ <braunr> youpi: do you know any obscure site about ptys ? :)
+ <youpi> no
+ <youpi> well, actually yes
+ <youpi> http://dept-info.labri.fr/~thibault/a (in french)
+ <braunr> :D
+ <braunr> http://www.linusakesson.net/programming/tty/index.php looks
+ interesting
+ <youpi> indeed
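+
+The dual attachment youpi describes can be sketched roughly as follows;
+the argument order follows `hurd/fs.defs`, but see `term/main.c` for the
+real flags and error handling:
+
+    #include <hurd.h>
+    #include <hurd/hurd_types.h>
+    #include <fcntl.h>
+    #include <errno.h>
+
+    /* Attach the already-running translator, whose control port is
+       CONTROL, as the active translator on a second node, PATH.  */
+    error_t
+    attach_other_node (const char *path, fsys_t control)
+    {
+      file_t node = file_name_lookup (path, O_NOTRANS, 0);
+
+      if (node == MACH_PORT_NULL)
+        return errno;
+
+      return file_set_translator (node,
+                                  0, FS_TRANS_SET | FS_TRANS_EXCL, 0,
+                                  NULL, 0, control);
+    }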
+
+
+## IRC, freenode, #hurdfr, 2012-08-09
+
+ <braunr> youpi: what I have the most trouble understanding is what a
+ "controlling tty" actually is
+ <youpi> it's the most obscure of the obscure :)
+ <braunr> if it's exclusive to one application, how is it supposed to
+ behave across a fork, etc..
+ <youpi> put simply, it's what makes ^C work
+ <braunr> ah yes, and that's surely where things blow up
+ <youpi> it's not exclusive, it's inherited
+ <braunr>
+ http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/bernstein-on-ttys/cttys.html
+
+
+## IRC, freenode, #hurd, 2012-08-10
+
+ <braunr> youpi: and just to be sure about the test procedure, i log on a
+ system, type tty, see e.g. ttyp0, log out, and in again, then tty returns
+ ttyp1, etc..
+ <youpi> yes
+ <braunr> youpi: and an open (e.g. cat) on /dev/ptyp0 returns EBUSY
+ <youpi> indeed
+ <braunr> so on xen it doesn't
+ <braunr> grmbl
+ <youpi> I've never seen it, more precisely
+ <braunr> i also have the problem with a non-accelerated qemu
+ <braunr> antrik: do you have the term problems we've seen on your bare
+ hardware ?
+ <antrik> I'm not sure what problem you are seeing exactly :-)
+ <braunr> antrik: when logging through ssh, tty first returns ttyp0, and the
+ second time (after logging out from the first session) ttyp1
+ <braunr> antrik: and term servers that have been used are then stuck in a
+ busy state
+ <antrik> braunr: my ptys seem to be reused just fine
+ <braunr> or perhaps they didn't have the bug
+ <braunr> antrik: that's so weird
+ <antrik> (I do *sometimes* get hanging ptys, but that's a different issue
+ -- these are *not* busy; they just hang when reused...)
+ <braunr> antrik: yes i saw that too
+ <antrik> braunr: note though that my hurd package is many months old...
+ <antrik> (in fact everything on this system)
+ <braunr> antrik: i didn't see anything relevant about the term server in
+ years
+ <braunr> antrik: what shell do you use ?
+ <antrik> yeah, but such errors could be caused by all kinds of changes in
+ other parts of the Hurd, glibc, whatever...
+ <antrik> bash
+
+
# Formal Verification
This issue may be a simple programming error, or it may be more complicated.
diff --git a/open_issues/user-space_device_drivers.mdwn b/open_issues/user-space_device_drivers.mdwn
index 25168fce..8cde8281 100644
--- a/open_issues/user-space_device_drivers.mdwn
+++ b/open_issues/user-space_device_drivers.mdwn
@@ -50,6 +50,65 @@ Also see [[device drivers and IO systems]].
* I/O MMU.
+
+### IRC, freenode, #hurd, 2012-08-15
+
+ <carli2> hi. does hurd support mesa?
+ <braunr> carli2: software only, but yes
+ <carli2> :(
+ <carli2> so you did not solve the problem with the CS checkers and GPU DMA
+ for microkernels yet, right?
+ <braunr> cs = ?
+ <carli2> control stream
+ <carli2> the data sent to the gpu
+ <braunr> no
+ <braunr> and to be honest we're not currently trying to
+ <carli2> well, a microkernel containing cs checkers for each hardware is
+ not a microkernel any more
+ <braunr> the problem is having the ability to check
+ <braunr> or rather, giving only what's necessary to delegate checking to
+ mmus
+ <carli2> but maybe the kernel could have a smaller interface like a
+ function to check if a memory block is owned by a process
+ <braunr> i'm not sure what you refer to
+ <carli2> about DMA-capable devices you can send messages to
+ <braunr> carli2: dma must be delegated to a trusted server
+ <carli2> linux checks the data sent to these devices, parses them and
+ checks all pointers if they are in a memory range that the client is
+ allowed to read/write from
+ <braunr> the client ?
+ <carli2> in linux, 3d drivers are in user space, so the kernel side checks
+ the pointer sent to the GPU
+ <youpi> carli2: mach could do that as well
+ <braunr> well, there is a rather large part in kernel space too
+ <carli2> so in hurd I trust some drivers to not do evil things?
+ <braunr> those in the kernel yes
+ <carli2> what does "in the kernel" mean? afaik a microkernel only has
+ memory manager and some basic memory sharing and messaging functionality
+ <braunr> did you read about the hurd ?
+ <braunr> mach is considered a hybrid kernel, not a true microkernel
+ <braunr> even with all drivers outside, it's still a hybrid
+ <youpi> although we're about to move some parts into userland :)
+ <youpi> braunr: ah, why?
+ <braunr> youpi: the vm part is too large
+ <youpi> ok
+ <braunr> the microkernel dogma is no policy inside the kernel
+ <braunr> "except scheduling because it's very complicated"
+ <braunr> but all modern systems have moved memory management outside the
+ kernel, leaving just the kernel abstraction inside
+ <braunr> the address space kernel abstraction
+ <braunr> and the two components required to make it work are what l4re
+ calls region mappers (the rough equivalent of our vm_map), which decides
+ how to allocate regions in an address space
+ <braunr> and the pager, like ours, which are already external
+ <carli2> i'm not a OS developer, i mostly develop games, web services and
+ sometimes I fix gpu drivers
+ <braunr> that was just FYI
+ <braunr> but yes, dma must be considered something privileged
+ <braunr> and the hurd doesn't have the infrastructure you seem to be
+ looking for
+
+
## I/O Ports
* Security considerations.
@@ -63,8 +122,13 @@ Also see [[device drivers and IO systems]].
* [[GNU Mach|microkernel/mach/gnumach]] is said to have a high overhead when
doing RPC calls.
+
## System Boot
+A similar problem is described in
+[[community/gsoc/project_ideas/unionfs_boot]]; a solution still needs to be
+implemented.
+
+
### IRC, freenode, #hurd, 2011-07-27
< braunr> btw, was there any formulation of the modifications required to
@@ -89,12 +153,270 @@ Also see [[device drivers and IO systems]].
< Tekk_> mhm
< braunr> s/disk/storage/
+
### IRC, freenode, #hurd, 2012-04-25
<youpi> btw, remember the initrd thing?
<youpi> I just came across task.c in libstore/ :)
+### IRC, freenode, #hurd, 2012-07-17
+
+ <bddebian> OK, here is a stupid question I have always had. If you move
+ PCI and disk drivers into userspace, how do you do the initial bootstrap
+ to get the system booting?
+ <braunr> that's hard
+ <braunr> basically you make the boot loader load all the components you
+ need in ram
+ <braunr> then you make it give each component something (ports) so they can
+ communicate
+
+
+### IRC, freenode, #hurd, 2012-08-12
+
+ <antrik> braunr: so, about booting with userspace disk drivers
+ <antrik> after rereading the chapter in my thesis, I see that there aren't
+ really all than many interesting options...
+ <antrik> I pondered some variants involving a temporary boot filesystem
+ with handoff to the real root FS; but ultimately concluded with another
+ option that is slightly less elegant but probably gets a much better
+ usefulness/complexity ratio:
+ <antrik> just start the root filesystem as the first process as we used to;
+ only hack it so that initially it doesn't try to access the disk, but
+ instead gets the files from GRUB
+ <antrik> once the disk driver is operational, we flip a switch, and the
+ root filesystem starts reading stuff from disk normally
+ <antrik> transparently for all other processes
+ <bddebian> How does grub access the disk without drivers?
+ <antrik> bddebian: GRUB obviously has its own drivers... that's how it
+ loads the kernel and modules
+ <antrik> bddebian: basically, it would have to load additional modules for
+ all the components necessary to get the Hurd disk driver going
+ <bddebian> Right, why wouldn't that be possible?
+ <antrik> (I have some more crazy ideas too -- but these are mostly
+ orthogonal :-) )
+ <antrik> ?
+ <antrik> I'm describing this because I'm pretty sure it *is* possible :-)
+ <bddebian> That grub loads the kernel and whatever server/module gets
+ access to the disk
+ <antrik> not sure what you mean
+ <bddebian> Well as usual I probably don't know the proper terminology but
+ why could grub load gnumach and the hurd "disk server" that contains the
+ userspace drivers?
+ <antrik> disk server?
+ <bddebian> Oh FFS whatever contains the disk drivers :)
+ <bddebian> diskdde, whatever :)
+ <antrik> actually, I never liked the idea of having a big driver blob very
+ much... ideally each driver should have its own file
+ <antrik> but that's admittedly beside the point :-)
+ <antrik> so to restate: in addition to gnumach, ext2fs.static, and ld.so,
+ in the new scenario GRUB will also load exec, the disk driver, any
+ libraries these two depend upon, and any additional infrastructure
+ involved in getting the disk driver running (for automatic probing or
+ whatever)
+ <antrik> probably some other Hurd core servers too, so we can have a more
+ complete POSIX environment for the disk driver to run in
+ <bddebian> There ya go :)
+ <antrik> the interesting part is modifying ext2fs so it will access only
+ the GRUB-provided files, until it is told that it's OK now to access the
+ real disk
+ <antrik> (and the mechanism how ext2 actually gets at the GRUB-provided
+ files)
+ <bddebian> Or write some new really small ext2fs? :)
+ <antrik> ?
+ <bddebian> I'm just talking out my butt. Something temporary that gets
+ disposed of when the real disk is available :)
+ <antrik> well, I mentioned above that I considered some handoff
+ schemes... but they would probably be more complex to implement than
+ doing the switchover internally in ext2
+ <bddebian> Ah
+ <bddebian> boot up in a ramdisk? :)
+ <antrik> (and the temporary FS would *not* be an ext2 obviously, but rather
+ some special ramdisk-like filesystem operating from GRUB-loaded files...)
+ <antrik> again, that would require a complicated handoff-scheme
+ <bddebian> Bah, what do I know? :)
+ <antrik> (well, you could of course go with a trivial chroot()... but that
+ would be ugly and inefficient, as the initial processes would still run
+ from the ramdisk)
+ <bddebian> Aren't most things running in memory initially anyway? At what
+ point must it have access to the real disk?
+ <braunr> antrik: but doesn't that require that disk drivers be statically
+ linked ?
+ <braunr> and having all disk drivers in separate tasks (which is what we
+ prefer to blobs as you put it) seems to pretty much forbid using static
+ linking
+ <braunr> hm actually, i don't see how any solution could work without
+ static linking, as it would create a recursion
+ <braunr> and the only one required is the one used by the root file system
+ <braunr> others can be run from the dynamically linked version
+ <braunr> antrik: i agree, it's a good approach, requiring only a slightly
+ more complicated boot script/sequence
+ <antrik> bddebian: at some point we have to access the real disk so we
+ don't have to work exclusively with stuff loaded by grub... but there is
+ no specific point where it *has* to happen. generally speaking, the
+ sooner the better
+ <antrik> braunr: why wouldn't that work with a dynamically linked disk
+ driver? we only need to make sure all required libraries are loaded by
+ grub too
+ <braunr> antrik: i have a problem with that approach :p
+ <braunr> antrik: it would probably require a reboot when those libraries
+ are upgraded, wouldn't it ?
+ <antrik> I'd actually wish we could run with a dynamically linked ext2fs as
+ well... but that would require a separated boot filesystem and some kind
+ of handoff approach, which would be much more complicated I fear...
+ <braunr> and if a driver is restarted, would it use those libraries too ?
+ and if so, how to find them ?
+ <braunr> but how can you run a dynamically linked root file system ?
+ <braunr> unless the libraries it uses are provided by something else, as
+ you said
+ <antrik> braunr: well, if you upgrade the libraries, *and* want the disk
+ driver to use the upgraded libraries, you are obviously in a tricky
+ situation ;-)
+ <braunr> yes
+ <antrik> perhaps you could tell ext2 to preload the new libraries before
+ restarting the disk driver...
+ <antrik> but that's a minor quibble anyways IMHO
+ <braunr> but that case isn't that important actually, since upgrading these
+ libraries usually means we're upgrading the system, which can imply a
+ reboot
+ <braunr> i don't think it is
+ <braunr> it looks very complicated to me
+ <braunr> think of restart as after a crash :p
+ <braunr> you can't preload stuff in that case
+ <antrik> uh? I don't see anything particularily complicated. but my point
+ was more that it's not a big thing if that's not implemented IMHO
+ <braunr> right
+ <braunr> it's not that important
+ <braunr> but i still think statically linking is better
+ <braunr> although i'm not sure about some details
+ <antrik> oh, you mean how to make the root filesystem use new libraries
+ without a reboot? that would be tricky indeed... but this is not possible
+ right now either, so that's not a regression
+ <braunr> i assume that, when statically linking, only the .o providing the
+ required symbols are included, right ?
+ <antrik> making the root filesystem restartable is a whole different epic
+ story ;-)
+ <braunr> antrik: not the root file system, but the disk driver
+ <braunr> but i guess it's the same
+ <antrik> no, it's not
+ <braunr> ah
+ <antrik> for the disk driver it's really not that hard I believe
+ <antrik> still some extra effort, but definitely doable
+ <braunr> with the preload you mentioned
+ <antrik> yes
+ <braunr> i see
+ <braunr> i don't think it's worth the trouble actually
+ <braunr> statically linking looks way simpler and should make for smaller
+ binaries than if libraries were loaded by grub
+ <antrik> no, I really don't want statically linked disk drivers
+ <braunr> why ?
+ <antrik> again, I'd prefer even ext2fs to be dynamic -- only that would be
+ much more complicated
+ <braunr> the point of dynamically linking is sharing
+ <antrik> while dynamic disk drivers do not require any extra effort beyond
+ loading the libraries with grub
+ <braunr> but if it means sharing big files that are seldom used (i assume
+ there is a lot of code that simply isn't used by hurd servers), i don't
+ see the point
+ <antrik> right. and with the approach I proposed that will work just as it
+ should
+ <antrik> err... what big files?
+ <braunr> glibc ?
+ <antrik> I don't get your point
+ <antrik> you prefer statically linking everything needed before the disk
+ driver runs (which BTW is much more than only the disk driver itself) to
+ using normal shared libraries like the rest of the system?...
+ <braunr> it's not "like the rest of the system"
+ <braunr> the libraries loaded by grub wouldn't be backed by the ext2fs
+ server
+ <braunr> they would be wired in memory
+ <braunr> you'd have two copies of them, the one loaded by grub, and the one
+ shared by normal executables
+ <antrik> no
+ <braunr> i prefer static linking because, if done correctly, the combined
+ size of the root file system and the disk driver should be smaller than
+ that of the rootfs+disk driver and libraries loaded by grub
+ <antrik> apparently I was not quite clear how my approach would work :-(
+ <braunr> probably not
+ <antrik> (preventing that is actually the reason why I do *not* want as
+ simple boot filesystem+chroot approach)
+ <braunr> and an initramfs can be easily freed after init
+ <braunr> it wouldn't be a chroot but something a bit more involved like
+ switch_root in linux
+ <antrik> not if various servers use files provided by that init filesystem
+ <antrik> yes, that's the complex handoff I'm talking about
+ <braunr> yes
+ <braunr> that's one approach
+ <antrik> as I said, that would be a quite elegant approach (allowing a
+ dynamically linked ext2); but it would be much more complicated to
+ implement I believe
+ <braunr> how would it allow a dynamically linked ext2 ?
+ <braunr> how can the root file system be linked with code backed by itself
+ ?
+ <braunr> unless it requires wiring all its memory ?
+ <antrik> it would be loaded from the init filesystem before the handoff
+ <braunr> init isn't the problem here
+ <braunr> i understand how it would boot
+ <braunr> but then, you need to make sure the root fs is never used to
+ service page faults on its own address space
+ <braunr> or any address space it depends on, like the disk driver
+ <braunr> so this basically requires wiring all the system libraries, glibc
+ included
+ <braunr> why not
+ <antrik> ah. yes, that's something I covered in a separate section in my
+ thesis ;-)
+ <braunr> eh :)
+ <antrik> we have to do that anyways, if we want *any* dynamically linked
+ components (such as the disk driver) in the paging path
+ <braunr> yes
+ <braunr> and it should make swapping more reliable too
+ <antrik> so that adds a couple MiB of wired memory... I guess we will just
+ have to live with that
+ <braunr> yes it seems acceptable
+ <braunr> thanks
+ <antrik> (it is actually one reason why I want to avoid static linking as
+ much as possible... so at least we have to wire these libraries only
+ *once*)
+ <antrik> anyways, back to my "simpler" approach
+ <antrik> the idea is that a (static) ext2fs would still be the first task
+ running, and immediately able to serve filesystem access requests -- only
+ it would serve these requests from files preloaded by GRUB rather than
+ the actual disk driver
+ <braunr> i understand now
+ <antrik> until a switch is flipped telling it that now the disk driver (and
+ anything it depends upon) is operational
+ <braunr> you still need to make sure all this is wired
+ <antrik> yes
+ <antrik> that's orthogonal
+ <antrik> which is why I have a separate section about it :-)
+ <braunr> what was the relation with ggi ?
+ <antrik> none strictly speaking
+ <braunr> i'll rephrase it: how did it end up in your thesis ?
+ <antrik> I just covered all aspects of userspace drivers in one of the
+ "introduction" sections of my thesis
+ <braunr> ok
+ <antrik> before going into specifics of KGI
+ <antrik> (and throwing in along the way that most of the issues described
+ do not matter for KGI ;-) )
+ <braunr> hehe
+ <braunr> i'm wondering, do we have mlockall on the hurd ? it seems not
+ <braunr> that's something deeply missing in mach
+ <antrik> well, bootstrap in general *is* actually relevant for KGI as well,
+ because of console messages during boot... but the filesystem bootstrap
+ is mostly irrelevant there ;-)
+ <antrik> braunr: oh? that's a problem then... I just assumed we have it
+ <braunr> well, it's possible to implement MCL_CURRENT, but not MCL_FUTURE
+ <braunr> or at least, it would be a bit difficult
+ <braunr> every allocation would need to be aware of that property
+ <braunr> it's better to have it managed by the vm system
+ <braunr> mach-defpager has its own version of vm_allocate for that
+ <antrik> braunr: I don't think we care about MCL_FUTURE here
+ <antrik> hm, wait... MCL_CURRENT is fine for code, but it might indeed be a
+ problem for dynamically allocated memory :-(
+ <braunr> yes
+
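+The switchover antrik describes could look roughly like this inside the
+root filesystem's storage layer; this is a conceptual sketch only, and
+`bootstrap_store_read` and `disk_store_read` are made-up helpers:
+
+    #include <errno.h>
+    #include <sys/types.h>
+
+    /* Made-up helpers; see the lead-in above.  */
+    extern error_t bootstrap_store_read (off_t offset, size_t len, void *buf);
+    extern error_t disk_store_read (off_t offset, size_t len, void *buf);
+
+    /* Flipped once the user-space disk driver is operational.  */
+    static int disk_ready;
+
+    static error_t
+    store_read (off_t offset, size_t len, void *buf)
+    {
+      if (!disk_ready)
+        /* Early boot: serve from the file images GRUB loaded into RAM.  */
+        return bootstrap_store_read (offset, len, buf);
+
+      /* Normal operation: talk to the user-space disk driver.  */
+      return disk_store_read (offset, len, buf);
+    }
+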
+
# Plan
* Examine what other systems are doing.
@@ -116,6 +438,112 @@ Also see [[device drivers and IO systems]].
and parallel port drivers, using `libtrivfs`.
+## I/O Server
+
+### IRC, freenode, #hurd, 2012-08-10
+
+ <braunr> usually you'd have an I/O server, and several device drivers
+ using it
+ <bddebian> Well maybe that's my question. Should there be unique servers
+ for say ISA, PCI, etc or could all of that be served by one "server"?
+ <braunr> forget about ISA
+ <bddebian> How? Oh because the ISA bus is now served via a PCI bridge?
+ <braunr> the I/O server would merely be there to help device drivers map
+ only what they require, and avoid conflicts
+ <braunr> because it's a relic of the past :p
+ <braunr> and because it requires too high privileges
+ <bddebian> But still exists in several PCs :)
+ <braunr> so usually, you'd directly ask the kernel for the I/O ports you
+ need
+ <mel-> so do floppy drives
+ <mel-> :)
+ <braunr> if i'm right, even the l4 guys do it that way
+ <braunr> he's right, some devices are still considered ISA
+ <bddebian> But that is where my confusion lies. Something has to figure
+ out what/where those I/O ports are
+ <braunr> and that's why i tell you to forget about it
+ <braunr> ISA has both statically allocated ports (the historical ones) and
+ others usually detected through PnP, when it works
+ <braunr> PCI is much cleaner, and memory mapped I/O is both better and much
+ more popular currently
+ <bddebian> So let's say I have a PCI SCSI card. I need some device driver
+ to know how to talk to that, right?
+ <bddebian> something is going to enumerate all the PCI devices and map them
+ to an address space
+ <braunr> bddebian: that would be the I/O server
+ <braunr> we'll call it the PCI server
+ <bddebian> OK, that is where I am headed. What if everything isn't PCI?
+ Is the "I/O server" generic enough?
+ <youpi> nowadays everything is PCI
+ <bddebian> So we are completely ignoring legacy hardware?
+ <braunr> we could have separate servers using a shared library that would
+ provide allocation routines like resource maps
+ <braunr> yes
+ <youpi> for what is not, the translator just needs to be run as root
+ <youpi> to get i/o perm from the kernel
+ <braunr> the idea for projects like ours, where the user base is very small
+ is: don't implement what you can't test
+ <youpi> bddebian: legacy can not be supported in a nice way, so for them we
+ can just afford a bad solution
+ <youpi> i.e. leave the driver in kernel
+ <braunr> right
+ <youpi> e.g. the keyboard
+ <bddebian> Well what if I have a USB keyboard? :-P
+ <braunr> that's a different matter
+ <youpi> USB keyboard is not legacy hardware
+ <youpi> it's usb
+ <youpi> which can be enumerated like pci
+ <braunr> and USB uses PCI
+ <youpi> and pci could be on usb :)
+ <braunr> so it's just a separate stack on top of the PCI server
+ <bddebian> Sure so would SCSI in my example above but is still a separate
+ bus
+ <braunr> netbsd has a very nice way of attaching drivers to buses
+ <youpi> bddebian: also, yes, and it can be enumerated
+ <bddebian> Which was my original question. This magic I/O server handles
+ all of the buses?
+ <youpi> no, just PCI, and then you'd have other servers for other busses
+ <braunr> i didn't mean that there would be *one* I/O server instance
+ <bddebian> So then it isn't a generic I/O server is it?
+ <bddebian> Ahhhh
+ <youpi> that way you can even put scsi over ppp or other crazy things
+ <braunr> it's more of an idea
+ <braunr> there would probably be a generic interface for basic stuff
+ <braunr> and i assume it could be augmented with specific (e.g. USB)
+ interfaces for servers that need more detailed communication
+ <braunr> (well, i'm pretty sure of it)
+ <bddebian> So the I/O server generalizes all functions, say read and write,
+ and then the PCI, USB, SCSI, whatever servers are contacted by it?
+ <braunr> no, not read and write
+ <braunr> resource allocation rather
+ <youpi> and enumeration
+ <braunr> probing perhaps
+ <braunr> bddebian: the goal of the I/O server is to make it possible for
+ device drivers to access the resources they need without a chance to
+ interfere with other device drivers
+ <braunr> (at least, that's one of the goals)
+ <braunr> so a driver would request the bus space matching the device(s) and
+ obtain that through memory mapping
+ <bddebian> Shouldn't that be in the "global address space"? Sorry if I am
+ using the wrong terminology
+ <youpi> well, the i/o server should also trigger the start of that driver
+ <youpi> bddebian: address space is not a matter for drivers
+ <braunr> bddebian: i'm not sure what you think of with "global address
+ space"
+ <youpi> bddebian: it's just a matter for the pci enumerator when (and if)
+ it places the BARs in physical address space
+ <youpi> drivers merely request mapping that, they don't need to know about
+ actual physical addresses
+ <braunr> i'm almost sure you lost him at BARs
+ <braunr> :(
+ <braunr> youpi: that's what i meant with probing actually
+ <bddebian> Actually I know BARs I have been reading on PCI :)
+ <bddebian> I suppose physical address space is more what I meant when I
+ used "global address space"
+ <braunr> i see
+ <youpi> bddebian: probably, yes
+
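+What such a PCI server interface might look like, as a hypothetical
+sketch -- all names are made up for illustration; a driver asks for one
+device and maps a BAR into its own task, never touching physical
+addresses itself:
+
+    #include <stdint.h>
+    #include <mach.h>
+    #include <errno.h>
+
+    /* Find a device by vendor/device ID; the returned port names the
+       device within the PCI server.  */
+    error_t pci_server_find_device (mach_port_t pci_server,
+                                    uint16_t vendor, uint16_t device,
+                                    mach_port_t *device_port);
+
+    /* Map BAR number BAR of the device into the caller's task.  The
+       server checks that no other driver has claimed it, which is the
+       conflict avoidance mentioned above.  */
+    error_t pci_server_map_bar (mach_port_t device_port, int bar,
+                                vm_address_t *addr, vm_size_t *size);
+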
+
# Documentation
* [An Architecture for Device Drivers Executing as User-Level
diff --git a/open_issues/usleep.mdwn b/open_issues/usleep.mdwn
new file mode 100644
index 00000000..b71cd902
--- /dev/null
+++ b/open_issues/usleep.mdwn
@@ -0,0 +1,25 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc]]
+
+# IRC, OFTC, #debian-hurd, 2012-07-14
+
+ <pinotree> eeek, usleep has the issues which i fixed in nanosleep
+ <bdefreese> pinotree: ?
+ * pinotree ponders a `mv sysdeps/unix/sysv/linux/usleep.c
+ sysdeps/mach/usleep.c`
+ <pinotree> s/mv/cp/
+ <bdefreese> What the heck is the point of usleep(0) anyway? Isn't that
+ basically saying suspend for 0 milliseconds?
+ <youpi> it's rounded up by the kernel I guess
+ <youpi> i.e. suspend for the shortest time possible (a clock tick)
+ <pinotree> posix 2001 says that «If the value of useconds is 0, then the
+ call has no effect.»
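+
+Copying the generic implementation would roughly amount to the following
+sketch -- `usleep` on top of the already-fixed `nanosleep`, honoring the
+POSIX requirement that `usleep (0)` have no effect (simplified; see the
+actual sysdeps file for the real code):
+
+    #include <time.h>
+    #include <unistd.h>
+
+    int
+    usleep (useconds_t useconds)
+    {
+      struct timespec ts =
+        {
+          .tv_sec = useconds / 1000000,
+          .tv_nsec = (useconds % 1000000) * 1000
+        };
+
+      if (useconds == 0)
+        return 0;
+
+      return nanosleep (&ts, NULL);
+    }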
diff --git a/open_issues/virtualbox.mdwn b/open_issues/virtualbox.mdwn
index 9440284f..d0608b4a 100644
--- a/open_issues/virtualbox.mdwn
+++ b/open_issues/virtualbox.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -8,11 +8,15 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
+[[!toc]]
+
+
+# Running GNU Mach in VirtualBox crashes during initialization.
+
[[!tag open_issue_gnumach]]
-Running GNU Mach in VirtualBox crashes during initialization.
-IRC, freenode, #hurd, 2011-08-15
+## IRC, freenode, #hurd, 2011-08-15
<BlueT_> HowTo Reproduce: 1) Use `reboot` to reboot the system. 2) Once
you see the Grub menu, turn off the debian hurd box. 3) Let the box boot
@@ -97,3 +101,37 @@ IRC, freenode, #hurd, 2011-08-15
<youpi> what's interesting is that that one means that $USER_DS did load in
%es fine at least once
<youpi> and it's the reload that fails
+
+
+# Slow SCSI probing
+
+[[!tag open_issue_gnumach]]
+
+
+## IRC, freenode, #hurd, 2012-08-07
+
+ <braunr> youpi: it seems the slow boot on virtualbox is really because of
+ scsi (it spends a long time in scsi_init, probing for all the drivers)
+ <youpi> braunr: we know that
+ <youpi> isn't it in the io port probe printed at boot?
+ <youpi> iirc that was that
+ <braunr> the discussion i found was about eata
+ <braunr> not the whole scsi group
+ <youpi> there used to be another in eata, yes
+ <braunr> oh
+ <braunr> i must have missed the first discussion then
+ <youpi> I mean
+ <youpi> the eata is the first
+ <braunr> ok
+ <youpi> and scsi was mentioned later
+ <youpi> just nobody took the time to track it down
+ <braunr> ok
+ <braunr> so it's not just a matter of disabling a single driver :(
+ <youpi> braunr: I still believe it's a matter of disabling a single driver
+ <youpi> I don't see why scsi in general should take a lot of time
+ <braunr> youpi: it doesn't on qemu, it may simply be virtualbox's fault
+ <youpi> it is, yes
+ <youpi> and virtualbox people say it's hurd's fault, of course
+ <braunr> both are possible
+ <braunr> but we can't expect them to fix it :)
+ <youpi> that's what I mean
diff --git a/open_issues/vm_map_kernel_bug.mdwn b/open_issues/vm_map_kernel_bug.mdwn
new file mode 100644
index 00000000..613c1317
--- /dev/null
+++ b/open_issues/vm_map_kernel_bug.mdwn
@@ -0,0 +1,54 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_gnumach]]
+
+
+# IRC, freenode, #hurd, 2012-11-04
+
+ <tschwinge> braunr, pinotree, youpi: Has either of you already figured out
+ what [glibc]/sysdeps/mach/hurd/dl-sysdep.c:fmh »XXX loser kludge for
+ vm_map kernel bug« is about?
+ <pinotree> tschwinge: ETOOLOWLEVELFORME :)
+ <pinotree> tschwinge: 5bf62f2d3a8af353fac661b224fc1604d4de51ea added it
+ <braunr> tschwinge: no, but that looks interesting
+ <braunr> i'll have a look later
+ <tschwinge> Heh, "interesting". ;-)
+ <tschwinge> It seems related to vm_map's mask
+ parameter/ELF_MACHINE_USER_ADDRESS_MASK, though the latter is only used
+ in the mmap implementation in sysdeps/mach/hurd/dl-sysdep.c (in mmap.c, 0
+ is passed; perhaps due to the bug?).
+ <tschwinge> braunr: Anyway, I'd already welcome a patch to simply turn that
+ into a more comprehensible form.
+ <braunr> tschwinge: ELF_MACHINE_USER_ADDRESS_MASK is defined as "Mask
+ identifying addresses reserved for the user program, where the dynamic
+ linker should not map anything."
+ <braunr> about the vm_map parameter, which is a mask, it is described by
+ "Bits asserted in this mask must not be asserted in the address returned"
+ <braunr> so it's an alignment constraint
+ <braunr> the kludge disables alignment, apparently because gnumach doesn't
+ handle them correctly for some cases
+ <tschwinge> braunr: But ELF_MACHINE_USER_ADDRESS_MASK is 0xf8000000, so I'd
+ rather assume this means to restrict to addresses lower than 0xf8000000.
+ (What are higher ones reserved for?)
+ <braunr> tschwinge: the linker i suppose
+ <braunr> tschwinge: sorry, i don't understand what
+ ELF_MACHINE_USER_ADDRESS_MASK really is used for :/
+ <braunr> tschwinge: it looks unused for the other systems
+ <braunr> tschwinge: i guess it's just one way to partition the address
+ space, so that the linker knows where to load libraries and mmap can
+ still allocate large contiguous blocks
+ <braunr> tschwinge: 0xf8000000 means each "chunk" of linker/other blocks
+ are 128 MiB large
+ <tschwinge> braunr: OK, thanks for looking. I guess I'll ask Roland about
+ it.
+ <braunr> it could be that gnumach isn't good at aligning to large values
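+
+For reference, the mask in question is the fourth argument of `vm_map`,
+used as sketched below (following the documented gnumach interface; how
+this relates to the actual kludge is still to be confirmed):
+
+    #include <mach.h>
+
+    /* Ask for memory whose address has none of the bits of
+       ELF_MACHINE_USER_ADDRESS_MASK (0xf8000000) set, i.e. an address
+       in the low 128 MiB.  */
+    kern_return_t
+    map_low (vm_size_t size, vm_address_t *addr)
+    {
+      *addr = 0;
+      return vm_map (mach_task_self (), addr, size,
+                     0xf8000000,    /* mask: bits that must be clear */
+                     TRUE,          /* anywhere */
+                     MACH_PORT_NULL, 0, FALSE,
+                     VM_PROT_READ | VM_PROT_WRITE,
+                     VM_PROT_ALL, VM_INHERIT_DEFAULT);
+    }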
+
+[[!message-id "87fw4pb4c7.fsf@kepler.schwinge.homeip.net"]]
diff --git a/open_issues/wait_errors.mdwn b/open_issues/wait_errors.mdwn
new file mode 100644
index 00000000..855b9add
--- /dev/null
+++ b/open_issues/wait_errors.mdwn
@@ -0,0 +1,25 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_hurd]]
+
+# IRC, freenode, #hurd, 2012-07-12
+
+ <braunr> tschwinge: have you encountered wait() errors ?
+ <tschwinge> What kind of wait errors?
+ <braunr> when running htop or watch vmstat, other apparently unrelated
+ processes calling wait() sometimes fail with an error
+ <braunr> i saw it mostly during builds, as they spawn lots of children
+ <braunr> (and used the aforementioned commands to monitor the builds)
+ <tschwinge> Sounds nasty... No, don't remember seeing that. But I don't
+ typically invoke such commands during builds.
+ <tschwinge> So this wait thing suggests there's something going wrong in
+ the proc server?
+ <braunr> tschwinge: yes