-rw-r--r-- community/gsoc/2012/virt/discussion.mdwn 389
-rw-r--r-- community/gsoc/project_ideas.mdwn 5
-rw-r--r-- community/gsoc/project_ideas/gnat.mdwn 32
-rw-r--r-- community/gsoc/project_ideas/smp.mdwn 16
-rw-r--r-- glibc/select.mdwn 25
-rw-r--r-- glibc/signal/discussion.mdwn 18
-rw-r--r-- hurd/console/discussion.mdwn 42
-rw-r--r-- hurd/debugging/rpctrace.mdwn 8
-rw-r--r-- hurd/debugging/translator/capturing_stdout_and_stderr.mdwn 13
-rw-r--r-- hurd/documentation.mdwn 4
-rw-r--r-- hurd/libstore/nbd_store.mdwn 27
-rw-r--r-- hurd/libthreads.mdwn 10
-rw-r--r-- hurd/rpc.mdwn 121
-rw-r--r-- hurd/running.mdwn 7
-rw-r--r-- hurd/running/chroot.mdwn (renamed from hurd/chroot.mdwn) 15
-rw-r--r-- hurd/running/debian/after_install.mdwn 4
-rw-r--r-- hurd/running/debian/dhcp.mdwn 12
-rw-r--r-- hurd/running/qemu.mdwn 10
-rw-r--r-- hurd/settrans/discussion.mdwn 23
-rw-r--r-- hurd/subhurd/discussion.mdwn 96
-rw-r--r-- hurd/translator.mdwn 2
-rw-r--r-- hurd/translator/exec.mdwn 4
-rw-r--r-- hurd/translator/ext2fs.mdwn 16
-rw-r--r-- hurd/translator/ext2fs/internal_allocator.mdwn 39
-rw-r--r-- hurd/translator/firmlink.mdwn 22
-rw-r--r-- hurd/translator/nfs.mdwn 5
-rw-r--r-- hurd/translator/pfinet/ipv6.mdwn 21
-rw-r--r-- hurd/translator/procfs/jkoenig/discussion.mdwn 26
-rw-r--r-- libpthread.mdwn 43
-rw-r--r-- microkernel.mdwn 4
-rw-r--r-- microkernel/genode.mdwn 17
-rw-r--r-- microkernel/genode/rpc.mdwn 65
-rw-r--r-- microkernel/l4.mdwn 10
-rw-r--r-- microkernel/mach/deficiencies.mdwn 262
-rw-r--r-- microkernel/mach/gnumach/memory_management.mdwn 14
-rw-r--r-- microkernel/mach/gnumach/ports.mdwn 9
-rw-r--r-- microkernel/mach/gnumach/projects/clean_up_the_code.mdwn 11
-rw-r--r-- microkernel/mach/history.mdwn 20
-rw-r--r-- microkernel/mach/message.mdwn 8
-rw-r--r-- microkernel/mach/message/msgh_id.mdwn 254
-rw-r--r-- microkernel/mach/port.mdwn 6
-rw-r--r-- microkernel/mach/rpc.mdwn 6
-rw-r--r-- open_issues/64-bit_port.mdwn 27
-rw-r--r-- open_issues/alarm_setitimer.mdwn 8
-rw-r--r-- open_issues/anatomy_of_a_hurd_system.mdwn 111
-rw-r--r-- open_issues/arm_port.mdwn 238
-rw-r--r-- open_issues/automatic_backtraces_when_assertions_hit.mdwn 65
-rw-r--r-- open_issues/binutils.mdwn 22
-rw-r--r-- open_issues/bpf.mdwn 8
-rw-r--r-- open_issues/code_analysis.mdwn 71
-rw-r--r-- open_issues/code_analysis/discussion.mdwn 29
-rw-r--r-- open_issues/console_tty1.mdwn 151
-rw-r--r-- open_issues/console_vs_xorg.mdwn 31
-rw-r--r-- open_issues/dde.mdwn 111
-rw-r--r-- open_issues/exec_leak.mdwn 57
-rw-r--r-- open_issues/exec_memory_leaks.mdwn 24
-rw-r--r-- open_issues/ext2fs_deadlock.mdwn 5
-rw-r--r-- open_issues/ext2fs_libports_reference_counting_assertion.mdwn 93
-rw-r--r-- open_issues/fakeroot_eagain.mdwn 216
-rw-r--r-- open_issues/fifo_thread_explosion.mdwn 20
-rw-r--r-- open_issues/fork_deadlock.mdwn 31
-rw-r--r-- open_issues/gcc.mdwn 234
-rw-r--r-- open_issues/gcc/pie.mdwn 40
-rw-r--r-- open_issues/gdb.mdwn 228
-rw-r--r-- open_issues/glibc.mdwn 454
-rw-r--r-- open_issues/glibc/t/tls-threadvar.mdwn 29
-rw-r--r-- open_issues/glibc_ptrace.mdwn 6
-rw-r--r-- open_issues/gnat.mdwn 102
-rw-r--r-- open_issues/gnumach_memory_management.mdwn 49
-rw-r--r-- open_issues/gnumach_page_cache_policy.mdwn 158
-rw-r--r-- open_issues/gnumach_vm_map_entry_forward_merging.mdwn 4
-rw-r--r-- open_issues/gnumach_vm_map_red-black_trees.mdwn 172
-rw-r--r-- open_issues/hurdextras.mdwn 8
-rw-r--r-- open_issues/implementing_hurd_on_top_of_another_system.mdwn 320
-rw-r--r-- open_issues/libmachuser_libhurduser_rpc_stubs.mdwn 11
-rw-r--r-- open_issues/libpager_deadlock.mdwn 165
-rw-r--r-- open_issues/libpthread.mdwn 1284
-rw-r--r-- open_issues/libpthread/t/fix_have_kernel_resources.mdwn 21
-rw-r--r-- open_issues/libpthread_1fcd93fd3c733eb19bcad8d03e65f13ec4b0e998..master-viengoos-on-bare-metal.mdwn 849
-rw-r--r-- open_issues/libpthread_CLOCK_MONOTONIC.mdwn 39
-rw-r--r-- open_issues/libpthread_timeout_dequeue.mdwn 22
-rw-r--r-- open_issues/mach_federations.mdwn 66
-rw-r--r-- open_issues/mach_on_top_of_posix.mdwn 4
-rw-r--r-- open_issues/mach_shadow_objects.mdwn 24
-rw-r--r-- open_issues/mission_statement.mdwn 41
-rw-r--r-- open_issues/multithreading.mdwn 154
-rw-r--r-- open_issues/netstat.mdwn 34
-rw-r--r-- open_issues/packaging_libpthread.mdwn 95
-rw-r--r-- open_issues/pci_arbiter.mdwn 256
-rw-r--r-- open_issues/performance.mdwn 163
-rw-r--r-- open_issues/performance/io_system/read-ahead.mdwn 991
-rw-r--r-- open_issues/pfinet_vs_system_time_changes.mdwn 31
-rw-r--r-- open_issues/robustness.mdwn 65
-rw-r--r-- open_issues/select.mdwn 1416
-rw-r--r-- open_issues/strict_aliasing.mdwn 10
-rw-r--r-- open_issues/synchronous_ipc.mdwn 185
-rw-r--r-- open_issues/system_stats.mdwn 39
-rw-r--r-- open_issues/term_blocking.mdwn 192
-rw-r--r-- open_issues/user-space_device_drivers.mdwn 428
-rw-r--r-- open_issues/usleep.mdwn 25
-rw-r--r-- open_issues/virtualbox.mdwn 44
-rw-r--r-- open_issues/vm_map_kernel_bug.mdwn 54
-rw-r--r-- open_issues/wait_errors.mdwn 25
-rw-r--r-- open_issues/whole_system_debugging.mdwn 19
-rw-r--r-- rpc.mdwn 100
-rw-r--r-- shortcuts.mdwn 7
-rw-r--r-- source_repositories/gdb.mdwn 9
-rw-r--r-- source_repositories/glibc.mdwn 33
m--------- toolchain/logs 10
109 files changed, 11396 insertions, 378 deletions
diff --git a/community/gsoc/2012/virt/discussion.mdwn b/community/gsoc/2012/virt/discussion.mdwn
new file mode 100644
index 00000000..e0085322
--- /dev/null
+++ b/community/gsoc/2012/virt/discussion.mdwn
@@ -0,0 +1,389 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+
+# IRC, freenode, #hurd, 2012-07-19
+
+ <nowhere_man> well, I really actively started last week, so I'm ironing my
+ various use cases and above all I'm taking my bearings in Hurd's code
+ <nowhere_man> I'm currently reading boot/ and pfinet/
+ <braunr> sorry for asking but
+ <braunr> can you describe briefly what you mean to achieve
+ <braunr> i know it sounds weird but the project description is a bit vague
+ for me
+ <nowhere_man> OK
+ <nowhere_man> the main goal is to be able to easily spawn a subhurd that's
+ connected in some way to its host
+ <braunr> ok
+ <nowhere_man> mainly connected by network, possibly sharing resources like
+ FS
+ <braunr> is it similar in spirit with something like linux containers ?
+ <nowhere_man> IIRC about them, yes
+ <braunr> ok
+ <braunr> that will do for me then
+ <tschwinge> Yes, so not complete virtualization, but instead limited to
+ several components.
+ <braunr> lxc with more runtime features to increase/decrease the level of
+ isolation
+ <nowhere_man> at first it would be static, at creation time only
+ <braunr> ok, i clearly understand the proposal now :)
+ <braunr> what kind of help could you need in the near future ?
+ <braunr> (except permanent access to youpi's brain?)
+ <tschwinge> Yes, that's my question, too -- what can we do to "get this
+ thing going".
+ <nowhere_man> by monday or tuesday I should be clear on what I understand
+ or not in the code
+ <nowhere_man> I'm still a bit up to my elbows in it
+ <nowhere_man> at that point I'll be happy to be able to pop a lot of
+ questions about it
+ <braunr> so you'll be ready for the next meeting
+ <nowhere_man> yeah
+ <tschwinge> Please do as soon as there are questions that you cannot
+ resolve in a reasonably short amount of time.
+ <tschwinge> So often a quick hint from someone else already helps to get
+ un-stuck.
+ <nowhere_man> OK
+ <tschwinge> There is no problem with asking for help given this huge and
+ convoluted code-base, where often design decisions are not obvious, too.
+ <nowhere_man> I will
+ <tschwinge> Good. :-)
+ <antrik> nowhere_man: hm... what you said so far doesn't sound any
+ different than the work zhengda already did on boot years ago...
+ <antrik> (although none of it ever got upstream IIRC :-( )
+ <nowhere_man> antrik: wasn't aware of it, is there some code published?
+ <tschwinge> There are bits and pieces, but certainly there is enough work
+ left to be done, to put it all together.
+ <antrik> yes, his git repository should be up somewhere. it's quite
+ convoluted though, as he worked on several things, and also wasn't very
+ experienced with revision control in the beginning
+ <tschwinge> nowhere_man:
+ http://www.gnu.org/software/hurd/community/gsoc/2008.html
+ <tschwinge> nowhere_man: http://www.gnu.org/software/hurd/user/zhengda.html
+ <tschwinge> Second section of the latter one.
+ <antrik> well, my understanding of the proposal (and more or less what I
+ was driving at in the project idea, which is rather vague admittedly) is
+ something lighter than a real subhurd... rather some kind of thin
+ subenvironment that doesn't actually boot a complete system instance with
+ various daemons etc.
+ <tschwinge> nowhere_man: It is certainly valid for you to use pre-existing
+ code/patches, by the way.
+ <antrik> BTW, regarding the "full subhurd" thing, the missing pieces are
+ mostly virtual device implementations
+ <antrik> (that and some tough bug(s) remaining in zhengda's modified
+ boot...)
+ <nowhere_man> cool, I'll take a look
+ <antrik> in any case, getting a picture of the work zhengda did is
+ definitely the first thing to do :-)
+ <tschwinge> nowhere_man: I'll also try to locate some bits and stuff from
+ his various repositories (I just found a Subversion one; will convert to
+ Git).
+ <antrik> tschwinge: I'm pretty sure zhengda's git repository was converted
+ from the SVN one...
+ <tschwinge> antrik: Thanks for reminding us about this -- I failed to
+ remember all that.
+ <antrik> (which was in turn converted from CVS...)
+ <tschwinge> antrik: OK, will have a look.
+ <tschwinge> Yeah, found a CVS tree, too. ;-)
+ <antrik> BTW, zhengda's work more exactly was about subhurd without root
+ privileges. but that lays a lot of the groundwork for all kinds of more
+ flexible subhurd usage
+ <antrik> (but it's still quite a different thing that thing
+ subenvironments, so don't get confused...)
+ <antrik> err... thin subenvironments
+
+
+# IRC, freenode, #hurd, 2012-07-27
+
+ <nowhere_man> bddebian: I'm actually not progressing much while reading the
+ source, I'm jumping all over the place to grasp the various types and
+ functions used where I start
+ <nowhere_man> would there be a few starting points that could help me?
+ <tschwinge> nowhere_man: So what exactly is your status; what are you
+ doing, what do you need help with? We surely can provide help, but need
+ to know where.
+ <nowhere_man> I'm starting from the source of boot/ and pfinet/ and as soon
+ as I encounter something that I don't understand, I find its definition
+ <nowhere_man> I'm kind of doing a depth-first search of what I need to
+ understand in the source code
+ <nowhere_man> I'm wondering if there are a few places in the source code
+ that I should start reading before anything else
+ <nowhere_man> well, I'll have to go in a few minutes
+ <nowhere_man> I'll continue my DFS ;-)
+
+
+# IRC, freenode, #hurd, 2012-08-02
+
+ <nowhereman> well, I made a leap forward in understanding the code, when I
+ stopped my DFS
+ <nowhereman> in hindsight, I'd say my way of approaching the code was
+ probably one of the worst possible
+ <braunr> oh
+ <tschwinge> OK, so at least you learned something, which is good.
+ <tschwinge> So, what's the new approach? And what are you working on at
+ the moment
+ <tschwinge> ?
+ <nowhereman> I just remembered SICP, the idea of wishful thinking when you
+ code, and didn't bother with the fine details behind what I'm reading
+ <nowhereman> like, I don't really get what happens when a Mach port is
+ allocated, but I know approximately what a Mach port is
+ <tschwinge> So originally you worked on investigating all that, every line
+ of code?
+ <nowhereman> almost, yeah
+ <braunr> nowhereman: again, feel free to ask
+ <tschwinge> Yes indeed -- that's too complex for a single person to tackle
+ at one time.
+ <braunr> and quickly
+ <braunr> don't lose time
+ <tschwinge> Not even braunr and I have looked up all these things.
+ (Speaking for Richard here, but I'm quite sure he'll agree. Perhaps he
+ has in fact looked up all the Mach things, though.)
+ <tschwinge> nowhereman: ufc?
+ <nowhereman> BTW, last week I wanted to push my description of how the tool
+ could be used, the use cases
+ <nowhereman> ufs
+ <nowhereman> but flubber is not online
+ <tschwinge> nowhereman: Oh, why ufs specifically?
+ <braunr> don't waste time on ufs
+ <braunr> really
+ <tschwinge> nowhereman: Yes, flubber is down. But you can push directly to
+ the Savannah repository.
+ <tschwinge> nowhereman: Please immediately tell us if you're stuck on
+ something, like flubber not being available.
+ <tschwinge> We may not be able to help immediately, but we're then at least
+ aware of issues.
+ <braunr> and we may be able to help immediately :)
+ <tschwinge> As we're not sitting in a lab next to each other, we can't tell
+ otherwise what's going on.
+ <tschwinge> We may in fact even be able to tell you immediately to use
+ Savannah instead of flubber, indeed.
+ <tschwinge> nowhereman: So, back to ufs -- which you don't specifically
+ need to look at, I think -- ext2fs is what everyone uses. But even there
+ you shouldn't really need to know many details/internals.
+ <nowhereman> OK, I was looking into it as it appears in hurd.boot
+ <tschwinge> Ah, OK. Yeah, that's just an example/template, and should use
+ ext2fs nowadays.
+ <nowhereman> in fact, as far as FS are concerned, I suppose I will merely
+ need to know how to pass a port to the host's FS to some proxy FS in the
+ subhurd
+ <nowhereman> mmmh, Savannah only mentions a hurd.git
+ <tschwinge> Exactly that is the abstraction level you need, yes.
+ <nowhereman> I'm looking at http://savannah.gnu.org/git/?group=hurd
+ <tschwinge> Yeah, that's a known shortcoming -- look here instead:
+ http://git.savannah.gnu.org/cgit/hurd
+ <tschwinge> Here is some more up-to-date stuff on subhurds:
+ http://www.gnu.org/software/hurd/hurd/subhurd.html
+ <tschwinge> nowhereman: You know how to tell git to add a new remote to
+ your web pages checkout and such stuff?
+ <nowhereman> yeah, no problem with that
+ <braunr> have you prepared any question to ask us ?
+ <nowhereman> the only I have now is if you can tell me where to look in the
+ code about passing Mach ports
+ <braunr> you don't pass ports, you pass rights
+ <braunr> http://www.gnu.org/software/hurd/gnumach-doc/index.html is the
+ best location to have a look at
+ <braunr>
+ http://www.gnu.org/software/hurd/gnumach-doc/Exchanging-Port-Rights.html#Exchanging-Port-Rights
+ <braunr> i suppose the mig doc will help too, as you may be using a higher
+ level interface to exchange rights
+ <braunr> be careful about user references on port rights
+ <braunr> deallocate releases a reference, it doesn't immediately destroy a
+ resource
+ <braunr> portinfo -v can help monitoring a task's rights
+ <braunr> nowhereman: so what are you planning to do now ?
+ <braunr> during the next week
+ <nowhereman> documenting what I understand from the boot process and where
+ things can be changed to fit my various use cases
+ <braunr> do you expect that to take the whole week ?
+ <nowhereman> and doing some first modifications to servers for the simplest
+ cases
+ <braunr> ok
+ <braunr> well i hope you're able to really start working on it soon, and
+ won't face weird issues in the meantime
+ <braunr> i'm a bit disappointed that you don't have more questions
+ <braunr> my feeling is you either did understand everything (except passing
+ port rights), or you didn't attempt to seriously understand the code
+ <braunr> or you don't dare ask questions
+ <braunr> this is something that must change
+ <braunr> or these meetings won't be as useful as they could be
+ <tschwinge> Yes. But also please don't wait for the meetings, but ask
+ questions throughout the week, too.
+
+
+# IRC, freenode, #hurd, 2012-08-09
+
+ <nowhere_man> hey, does anyone know the network device interface well?
+ <nowhere_man> I don't get it by reading net_io.c/h in gnumach
+ <braunr> nowhere_man: ask your question
+ <braunr> nowhere_man: http://www.sceen.net/~rbraun/pcap-hurd.c <- this may
+ help
+ <nowhere_man> I don't see what the entry point is
+ <nowhere_man> I finally understood that I actually don't need to touch
+ pfinet for gsoc project
+ <nowhere_man> but I should do a replacement network device instead
+ <nowhere_man> is the net_io_init function called at start?
+ <braunr> what entry point ?
+ <braunr> and you should perhaps have a look at the eth-multiplexer by
+ zhengda
+ <braunr> yes net_io_init is called at startup
+ <braunr> nowhere_man: did you find your answers about networking ?
+ <nowhere_man> no, I'm still digging in mach's code
+ <braunr> nowhere_man: well keep asking :/
+ <braunr> you left conversation without notice :/
+ <braunr> nowhere_man: and why mach ?
+ <nowhere_man> I thought hardware devices are there
+ <tschwinge> nowhere_man: You wanted to push your documentation one/two
+ weeks ago. Why has that not yet happened?
+ <youpi> nowhere_man: they used to be there, they are now in netdde, but in
+ both case it's just a matter of the same RPC interface
+ <nowhere_man> tschwinge: I spent very little time this week on gsoc, and
+ completely forgot about the push on savannah
+ <braunr> nowhere_man: i told you to look at the work by zhengda concerning
+ eth-multiplexer, did you do that ?
+ <tschwinge> nowhere_man: You realize GSoC is meant to be a full-time job?
+ <tschwinge> Or, next to full-time?
+ <braunr> it's full-time normally
+ <braunr> the payment is justified by that
+ <youpi> nowhere_man: most RPC operations you need to know about network can
+ be seen at work in pfinet/ethernet.c, wherever "ether_port" appears
+ <youpi> i.e. device_open, set_filter, write, set/get_status
+ <braunr> again, http://www.sceen.net/~rbraun/pcap-hurd.c should guide you
+ pretty well
+ <braunr> since it's the very least necessary to use that interface
+ <tschwinge> nowhere_man: How, roughly but realistically, are your plans to
+ continue this task?
+ <tschwinge> nowhere_man: What has been blocking you this week so you
+ couldn't work on your task?
+ <nowhere_man> tschwinge: mostly a previous work that was supposed to end at
+ the beginning of the summer and only went online now, for which I'm
+ basically sysadmin
+ <braunr> 21:25 < tschwinge> nowhere_man: How, roughly but realistically,
+ are your plans to continue this task?
+ <braunr> this question is really more interesting actually
+ <nowhere_man> right now, I want to write a network device that just sends
+ its frames by IPC
+ <braunr> why ?
+ <nowhere_man> as I never wrote any program using Mach's IPC, that seems the
+ easiest to get them right
+ <braunr> you won't have time
+ <braunr> 21:22 < braunr> nowhere_man: i told you to look at the work by
+ zhengda concerning eth-multiplexer, did you do that ?
+ <nowhere_man> braunr: not yet, no
+ <braunr> well that's your best chance to make some progress
+ <nowhere_man> braunr: is writing the virtual network device that hard?
+ <braunr> basically, it allows "bridging" the pfinet instances of various
+ subhurds
+ <braunr> the virtual network device you want *is* eth-multiplexer
+ <tschwinge> nowhere_man: GSoC is nearly over. That's why I'm asking how
+ this task is going to continue. I'm sorry but I reckon you have not
+ spent anywhere near the amount of hours that are meant to be spent on it.
+ <braunr> and from what antrik told me, yes it's hard, and moreover, why
+ rewrite it if it already exists and you're late
+ <braunr> i agree
+ <nowhere_man> tschwinge: I know, I've started way too late because of my
+ second round of exams
+ <tschwinge> nowhere_man: OK, that's how you started. But how is it going
+ to continue...
+ <nowhere_man> tschwinge: in short, I write a prototype that just starts a
+ subhurd, and when that works correctly I add the network
+ <tschwinge> nowhere_man: I mean from an organizational point of view.
+ <nowhere_man> well, between now and the beginning of september, I'll work
+ full-time on this
+ <nowhere_man> up until september 8th
+
+
+# IRC, freenode, #hurd, 2012-08-09
+
+ <antrik> nowhere_man: you do *not* have to do a replacement network
+ device. zhengda did that years ago.
+ <antrik> nowhere_man: also note that zhengda also implemented the support
+ for *using* the virtual network device (in fact any replacement devices
+ -- except that no others actually exist yet) in boot
+ <youpi> which is already in, actually, isn't it?
+ <antrik> youpi: hm, yes... it was the patch that zhengda posted on the list
+ once, but later updated, and at some later point you merged the outdated
+ variant from the list...
+ <youpi> outdated?
+ <youpi> ah, but he never posted the updated one, and it got lost in git
+ repos, right?
+ <youpi> (what was updated actually?)
+ <antrik> he changed the option name and description later for more
+ clarity. don't remember whether there were other changes
+ <antrik> -f, --device=device_name=device_file
+ <antrik> Specify a device file used by subhurd
+ and its
+ <antrik> virtual name.
+ <antrik> that's the one from the Debian package
+ <antrik> -m, --device-map=DEBICENAME=DEVICEFILE
+ <antrik> Map the device in subhurd to the
+ device in the
+ <antrik> main Hurd.
+ <antrik> that's the one I have locally built from his tree
+ <youpi> so you actually have access to his tree?
+ <antrik> uhm... I used to... it was on flubber
+
+
+# IRC, freenode, #hurd, 2012-08-18
+
+ <nowhere_man> so, this week I discovered how fun it is to work on a
+ non-mainstream OS
+ <nowhere_man> I hoped to start coding the tool itself, put together the
+ skeleton, but every Lisp implementation I tried had problems
+ <braunr> ah you want to write it in lisp ?
+ <nowhere_man> ECL, that I had ported a few years ago, actually FTBFS since
+ <nowhere_man> I hoped to be able, it would be easier for me
+ <nowhere_man> and when I tried Scheme, I started with Guile (it's GNU's own
+ Scheme implementation, after all)
+ <nowhere_man> and when I execute the FFI functions, to access functions in
+ libmachdec
+ <nowhere_man> I get SIGILL
+ <braunr> i can't advise you about anything lisp related
+ <braunr> the most reliable thing you'll find on the hurd is C
+ <nowhere_man> I tried to debug that, but running Guile in GDB gets me a
+ SIGSEGV
+ <nowhere_man> I'll try to make ECL to build again
+ <braunr> this seems like a waste of time to me
+ <braunr> avoid spending time on anything that isn't directly related to
+ your goal if you still hope to finish it
+ <nowhere_man> I'm ten times more comfortable coding in Lisp
+ <braunr> it doesn't matter, you're late
+ <nowhere_man> yeah, I know, so taking the time to correct that problem
+ won't change the fact that I won't finish in time
+ <nowhere_man> so I'll finish anyway, and in Lisp
+ <braunr> and if you lack something else, like some mach/hurd specific lisp
+ bindings, you'll have to spend more time on that
+ <braunr> ok
+ <nowhere_man> do you know if someone had a SIGILL situation on Hurd in the
+ past?
+ <nowhere_man> I'm wondering if that's a known kind of issue
+ <braunr> there are lots of issues
+ <braunr> especially when it comes to other languages and runtime
+ environments
+ <nowhere_man> but is it like MAX_PATH_LEN, something that is known to
+ happen when porting something on Hurd?
+ <braunr> i'm not sure how comparable it is
+ <braunr> i'd say it's often before of the conformance issues of the hurd
+ <braunr> because*
+ <nowhere_man> like missing bits of POSIX ?
+ <braunr> or simple wrong for some corner cases
+ <braunr> simply*
+ <bubu^> nowhere_man, I was able to run guile on my hurd image through qemu
+ <bubu^> but I didn't make any complex programs to check if everything
+ works fine
+ <nowhere_man> yeah, it runs fine
+ <nowhere_man> FFI functions get you a SIGILL
+ <nowhere_man>
+ http://www.gnu.org/software/guile/manual/html_node/Dynamic-FFI.html
+ <nowhere_man> the define-module form at the beginning triggers the signal
+ <antrik> nowhere_man: what do you want to implement in Lisp?
+ <antrik> BTW, the guy working on Lisp bindings a couple of years ago used
+ Clisp
+ <antrik> it was working back then
+ <nowhere_man> antrik: the program that sets up a subhurd
+ <nowhere_man> I always forget about clisp, I'll try it right away
diff --git a/community/gsoc/project_ideas.mdwn b/community/gsoc/project_ideas.mdwn
index 8ce10ffa..9d486b00 100644
--- a/community/gsoc/project_ideas.mdwn
+++ b/community/gsoc/project_ideas.mdwn
@@ -1,5 +1,5 @@
-[[!meta copyright="Copyright © 2008, 2009, 2010, 2011 Free Software Foundation,
-Inc."]]
+[[!meta copyright="Copyright © 2008, 2009, 2010, 2011, 2012 Free Software
+Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -87,6 +87,7 @@ other: language_bindings, gnat, gccgo, perl_python. -->
[[!inline pages="community/gsoc/project_ideas/tcp_ip_stack" show=0 feeds=no actions=yes]]
[[!inline pages="community/gsoc/project_ideas/nfs" show=0 feeds=no actions=yes]]
[[!inline pages="community/gsoc/project_ideas/pthreads" show=0 feeds=no actions=yes]]
+[[!inline pages="community/gsoc/project_ideas/smp" show=0 feeds=no actions=yes]]
[[!inline pages="community/gsoc/project_ideas/sound" show=0 feeds=no actions=yes]]
[[!inline pages="community/gsoc/project_ideas/disk_io_performance" show=0 feeds=no actions=yes]]
[[!inline pages="community/gsoc/project_ideas/vm_tuning" show=0 feeds=no actions=yes]]
diff --git a/community/gsoc/project_ideas/gnat.mdwn b/community/gsoc/project_ideas/gnat.mdwn
new file mode 100644
index 00000000..ba34cc9c
--- /dev/null
+++ b/community/gsoc/project_ideas/gnat.mdwn
@@ -0,0 +1,32 @@
+[[!meta copyright="Copyright © 2009, 2011, 2012 Free Software Foundation,
+Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!meta title="Porting GNAT (GCC)"]]
+
+An initial port of the GNU Ada Translator (GNAT) is available for the Hurd.
+
+The goal of this project is getting GNAT fully working in Debian GNU/Hurd. It
+requires implementing some explicitly system-specific stuff in GNAT (mostly in
+its runtime libraries), and for that also address a number of issues in Hurd
+and other libraries. Knowledge of Ada is a must; some Hurd
+knowledge will have to be acquired while working on the project.
+
+Designing and implementing [[language_bindings]] is a follow-up project.
+
+Possible mentors: [[Samuel Thibault (youpi)|samuelthibault]], [[Thomas Schwinge
+(tschwinge)|tschwinge]].
+
+Exercise: Fix one of the known issues of GNAT on the Hurd.
+
+---
+
+[[Open Issue page|open_issues/gnat]]. [Entry in the GCC
+wiki](http://gcc.gnu.org/wiki/SummerOfCode#gnat_hurd).
diff --git a/community/gsoc/project_ideas/smp.mdwn b/community/gsoc/project_ideas/smp.mdwn
new file mode 100644
index 00000000..e17c2ccf
--- /dev/null
+++ b/community/gsoc/project_ideas/smp.mdwn
@@ -0,0 +1,16 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!meta title="SMP"]]
+
+
+# IRC, freenode, #hurd, 2012-09-30
+
+ <braunr> i expect smp to be our next gsoc project
diff --git a/glibc/select.mdwn b/glibc/select.mdwn
new file mode 100644
index 00000000..bafda141
--- /dev/null
+++ b/glibc/select.mdwn
@@ -0,0 +1,25 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_documentation]]
+
+
+# IRC, freenode, #hurd, 2012-08-10
+
+ <afleck> what is the use of having a port set name that can receive from
+ multiple ports?
+ <youpi> think of select()
+ <afleck> I haven't really gotten into it yet, I was just reading the Mach
+ Kernel Guide and I didn't understand the difference between having a port
+ set and multiple ports, since you can't choose which port receives in a
+ port set.
+ <youpi> with multiple ports, you'd have to have as many threads to block in
+ reception
+ <youpi> or poll in turn
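
Annotation on youpi's answer above: a minimal sketch, in C, of why one wait over many sources is useful. It uses POSIX `select()` on file descriptors purely as an analogy; it does not use the Mach port-set API, and `wait_any_readable` is a hypothetical helper name, not something provided by glibc or the Hurd.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: block in ONE place until any of the n descriptors
   becomes readable, then return the index of the first ready one.
   Returns -1 on error or if n is 0.  This mirrors what a port set gives a
   Mach receiver: a single blocking wait instead of one thread per source
   or a polling loop. */
int wait_any_readable(const int *fds, size_t n)
{
    fd_set readable;
    int maxfd = -1;

    assert(fds != NULL);
    if (n == 0)
        return -1;

    FD_ZERO(&readable);
    for (size_t i = 0; i < n; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;          /* descriptor cannot be put in an fd_set */
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* No timeout: sleep until at least one descriptor is readable. */
    if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
        return -1;

    /* select() rewrote the set to contain only the ready descriptors. */
    for (size_t i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readable))
            return (int)i;
    return -1;
}
```

With one descriptor per event source, a single thread can call `wait_any_readable(fds, n)` and service whichever index comes back, rather than keeping one blocked thread per source or checking each source in turn.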
diff --git a/glibc/signal/discussion.mdwn b/glibc/signal/discussion.mdwn
new file mode 100644
index 00000000..064c1c5b
--- /dev/null
+++ b/glibc/signal/discussion.mdwn
@@ -0,0 +1,18 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+
+# `_hurd_sigstates`
+
+In an [[hurd/translator/ext2fs]] instance with 1068 threads, `_hurd_sigstates`
+was a linked list with 1067 entries, in one with 351 threads, 351 entries. Is
+this noticeable already? Perhaps a different data structure is needed?
+Though, a linked list is perfect for the common case of processes with only a
+handful of threads.
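
To make the cost in question concrete, here is a minimal sketch of the linear walk that a per-thread lookup in a singly linked list implies. The names `sigstate_sketch` and `find_sigstate` are hypothetical stand-ins chosen for illustration; they are not glibc's actual definitions, and the `thread` field is just an integer here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one per-thread signal state record. */
struct sigstate_sketch {
    struct sigstate_sketch *next;
    unsigned long thread;       /* stand-in for a thread identifier */
};

/* Walk the list from head looking for `thread`.  Stores the number of
   nodes visited in *visited and returns the matching node, or NULL if
   the thread is not in the list. */
struct sigstate_sketch *
find_sigstate(struct sigstate_sketch *head, unsigned long thread,
              size_t *visited)
{
    struct sigstate_sketch *ss;
    size_t steps = 0;

    assert(visited != NULL);
    for (ss = head; ss != NULL; ss = ss->next) {
        steps++;
        if (ss->thread == thread)
            break;
    }
    *visited = steps;
    return ss;
}
```

In a list of 1067 entries, finding the last entry (or failing to find one) visits all 1067 nodes, while a process with a handful of threads pays almost nothing. That is the trade-off the note above is weighing.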
diff --git a/hurd/console/discussion.mdwn b/hurd/console/discussion.mdwn
new file mode 100644
index 00000000..f887d826
--- /dev/null
+++ b/hurd/console/discussion.mdwn
@@ -0,0 +1,42 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_documentation]]
+
+
+# IRC, OFTC, #debian-hurd, 2012-09-24
+
+ <allesa> hello, I'm trying to get familiar with the Hurd and would like to
+ change the keyboard layout in use. It seems all the information I can
+ find (relating to console-driver-xkb) is out of date, with the latest
+ info relating to it being that this package should not be used anymore…
+ <allesa> does anyone know how changing keyboard layouts currently works?
+ <allesa> ah, never mind. I assume it doesn't currently work:
+ http://www.gnu.org/software/hurd/hurd/console.htmlq
+ <allesa> *http://www.gnu.org/software/hurd/hurd/console.html
+ <youpi> it does actually work
+ <youpi> simply dpkg-reconfigure keyboard-configuration
+ <youpi> and reboot
+ <youpi> (see http://www.debian.org/ports/hurd/hurd-install
+ <youpi> )
+ <allesa> mhm, I got that far — but selecting my layout gave me no joy, even
+ after restart. Seem to be stuck with the layout chosen during
+ installation (d-i). Just to check I'm using the right version — still on
+ the installer isos from 15 July?
+ <allesa> wait… progress is being made — slowly and subtly…
+ <allesa> Ok, so the XKBLAYOUT is changing as you described, but XKBVARIANT
+ seems to be ignored. Could this be right?
+ <youpi> yes, the hurd console only supports keymaps
+ <youpi> (currently)
+ <allesa> Ah OK, thanks for your help on this. I imagine this is not
+ something that just requires simple repetitive work, but some actual
+ hacking?
+ <allesa> to fix that is…
+ <youpi> some hacking yes
diff --git a/hurd/debugging/rpctrace.mdwn b/hurd/debugging/rpctrace.mdwn
index df6290f7..c506861a 100644
--- a/hurd/debugging/rpctrace.mdwn
+++ b/hurd/debugging/rpctrace.mdwn
@@ -167,6 +167,14 @@ See `rpctrace --help` about how to use it.
Debian-specific, but not ready for upstream either...
<youpi> antrik: yes
+* IRC, freenode, #hurd, 2012-07-18
+
+ <braunr> hm, rpctrace on gitk gives an interesting result
+ <braunr> 152<--153(pid1849)->io_set_all_openmodes_request (267) = 0
+ <braunr> rpctrace:
+ /home/rbraun/hd0s7/hurd/hurd-20120710/./utils/rpctrace.c:1287:
+ trace_and_forward: Assertion `reply_type == 18' failed.
+
# See Also
diff --git a/hurd/debugging/translator/capturing_stdout_and_stderr.mdwn b/hurd/debugging/translator/capturing_stdout_and_stderr.mdwn
index b7cfc3c9..47fbbc48 100644
--- a/hurd/debugging/translator/capturing_stdout_and_stderr.mdwn
+++ b/hurd/debugging/translator/capturing_stdout_and_stderr.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2008, 2009, 2010 Free Software Foundation,
+[[!meta copyright="Copyright © 2008, 2009, 2010, 2012 Free Software Foundation,
Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -6,8 +6,8 @@ id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
-is included in the section entitled
-[[GNU Free Documentation License|/fdl]]."]]"""]]
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
Sometimes it may already be helpful to capture a translator's `stdout` and
`stderr`, for example in this situation where [[translator/pfinet]] was
@@ -15,13 +15,14 @@ silently dying all the time, without any console output:
$ sudo settrans -fgap ↩
/servers/socket/2 ↩
- /bin/sh -c '/hurd/pfinet -i eth0 -a [...] > /tmp/stdout 2> /tmp/stderr'
+ /bin/sh -c 'exec >> /root/pfinet.log 2>&1 && date && ↩
+ /hurd/pfinet -i eth0 -a [...]'
$ [...]
- $ cat /tmp/stdout
+ $ cat /root/pfinet.log
+ [date]
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP
TCP: Hash tables configured (ehash 65536 bhash 65536)
- $ cat /tmp/stderr
pfinet: ../../hurd.work/pfinet/ethernet.c:196: ethernet_xmit: Unexpected error: (os/device) invalid IO size.
(Trying to run [[GDB]] in this case was of no help -- due to a bug in GDB
diff --git a/hurd/documentation.mdwn b/hurd/documentation.mdwn
index b87ea964..ec19e90b 100644
--- a/hurd/documentation.mdwn
+++ b/hurd/documentation.mdwn
@@ -1,5 +1,5 @@
[[!meta copyright="Copyright © 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008,
-2009, 2011 Free Software Foundation, Inc."]]
+2009, 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -44,7 +44,7 @@ is included in the section entitled
# Development
- * [[RPCs|/rpc]]: A description of what RPCs are.
+ * [[RPC]]: our usage of *Remote Procedure Call*s.
* *[[The_GNU_Hurd_Reference_Manual|reference_manual]]*.
diff --git a/hurd/libstore/nbd_store.mdwn b/hurd/libstore/nbd_store.mdwn
index 5874b162..8560fd44 100644
--- a/hurd/libstore/nbd_store.mdwn
+++ b/hurd/libstore/nbd_store.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2007, 2008, 2009 Free Software Foundation,
+[[!meta copyright="Copyright © 2007, 2008, 2009, 2012 Free Software Foundation,
Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -10,3 +10,28 @@ is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
[[!meta title="nbd store: Linux-compatible network block device"]]
+
+[[!wikipedia "Network block device"]].
+
+
+# Servers
+
+
+## [Network Block Device (TCP version)](http://nbd.sourceforge.net/)
+
+[[tschwinge]] once was testing this (years ago), and found it didn't work.
+Perhaps the protocol was extended?
+
+
+## [xNBD](https://bitbucket.org/hirofuchi/xnbd/)
+
+
+## [jNbd](http://vanheusden.com/java/JNbd/)
+
+
+## [BlackHole](http://vanheusden.com/java/BlackHole/)
+
+
+# Open Issues
+
+ * [[!GNU_Savannah_task 5722]]
diff --git a/hurd/libthreads.mdwn b/hurd/libthreads.mdwn
index 8b1a97e6..c8d819d4 100644
--- a/hurd/libthreads.mdwn
+++ b/hurd/libthreads.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -13,10 +13,14 @@ License|/fdl]]."]]"""]]
# Internals
+## Threading Model
+
+libthreads has a 1:1 threading model.
+
## Threads' Death
-C threads death doesn't actually free the thread's stack (and maybe not the
+A thread's death doesn't actually free the thread's stack (and maybe not the
associated Mach ports either). That's because there's no way to free the stack
after the thread dies (because the thread of control is gone); the stack needs
to be freed by something else, and there's nothing convenient to do it. There
@@ -26,3 +30,5 @@ However, it isn't really a leak, because the unfreed resources do get used for
the next thread. So the issue is that the shrinkage of resource consumption
never happens, but it doesn't grow without bounds; it just stays at the maximum
even if the current number of threads is lower.
+
+The same issue exists in [[libpthread]].
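
The behaviour described in the libthreads text above (stacks of dead threads are never freed, but are reused by the next thread) can be sketched as a simple free list. This is only an illustration under assumptions, not libthreads' actual implementation: all `demo_*` names and the fixed `DEMO_STACK_SIZE` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

enum { DEMO_STACK_SIZE = 64 * 1024 };   /* hypothetical fixed stack size */

/* The list node lives at the start of the cached stack itself. */
struct demo_stack {
    struct demo_stack *next;
};

static struct demo_stack *demo_free_stacks;  /* stacks of dead threads */
static size_t demo_stacks_allocated;         /* high-water mark, never shrinks */

/* Thread creation: reuse a cached stack if there is one, else allocate. */
void *demo_stack_get(void)
{
    struct demo_stack *s = demo_free_stacks;

    if (s != NULL) {
        demo_free_stacks = s->next;
        return s;
    }
    s = malloc(DEMO_STACK_SIZE);
    if (s != NULL)
        demo_stacks_allocated++;
    return s;
}

/* Thread death: something other than the dying thread parks the stack
   on the free list; it is never handed back to the system. */
void demo_stack_put(void *stack)
{
    struct demo_stack *s = stack;

    assert(s != NULL);
    s->next = demo_free_stacks;
    demo_free_stacks = s;
}

/* Number of stacks ever allocated. */
size_t demo_stacks_total(void)
{
    return demo_stacks_allocated;
}
```

Creating two threads, letting both die, and creating a third leaves `demo_stacks_total()` at 2: consumption stays at the maximum but does not grow beyond it.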
diff --git a/hurd/rpc.mdwn b/hurd/rpc.mdwn
new file mode 100644
index 00000000..f4ddaab5
--- /dev/null
+++ b/hurd/rpc.mdwn
@@ -0,0 +1,121 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[Remote procedure call|/rpc]]s are the basis for nearly everything in the Hurd.
+They're based on the [[Mach RPC mechanism (`mach_msg` system
+call)|microkernel/mach/rpc]]. An RPC is made against a [[Mach
+port|microkernel/mach/port]], which is the gateway to the [[translator]] that
+will serve the RPC. Let's explore the case of `open`ing a file, and advancing
+(`lseek`) ten bytes into it. The user program will be something like:
+
+    #include <fcntl.h>   /* open, O_RDONLY */
+    #include <unistd.h>  /* lseek, SEEK_CUR */
+
+    int main(void) {
+      int fd = open("test.txt", O_RDONLY);
+      lseek(fd, 10, SEEK_CUR);
+    }
+
+Both `open` and `lseek` are functions provided by [[glibc]], which translates
+these into the appropriate remote procedure calls.
+
+`open` first has to find its way to the actual translator serving that file,
+but for a file on the root filesystem, what happens boils down to calling the
+`dir_lookup` function against the root filesystem. This is an RPC from the
+[[`fs` interface (see `fs.defs`)|interface/fs]]. The implementation of this
+function is thus actually generated during the glibc build in
+`RPC_dir_lookup.c`, based on the `fs.defs` file, using
+[[microkernel/mach/MIG]]. This generated function essentially [[encodes the
+parameters into a data buffer|idl]], and makes a `mach_msg` system call to send
+the buffer to the root filesystem port, with the `dir_lookup` RPC ID.
+
+The root filesystem, for instance [[translator/ext2fs]], was sitting in its
+main service loop (`libdiskfs/init-first.c:master_thread_function`), which
+calls `ports_manage_port_operations_multithread`, which essentially keeps
+making `mach_msg` system calls to receive [[microkernel/mach/message]]s,
+and calls the demultiplexer on each of them, here the `diskfs_demuxer`. This
+demultiplexer calls the demultiplexers for the various interfaces supported by
+ext2fs. These demuxers are generated using MIG during the Hurd build. For
+instance, the `fs` interface demultiplexer for [[diskfs|libdiskfs]],
+`diskfs_fs_server`, is in `libdiskfs/fsServer.c`. It simply checks whether the
+RPC ID is an `fs` interface ID, and if so uses the `diskfs_fs_server_routines`
+array for calling the appropriate function corresponding to the RPC ID. Here
+it's `_Xdir_lookup` which thus gets called. This one decodes the parameters
+from the message data buffer, and calls `diskfs_S_dir_lookup`.
+
+`diskfs_S_dir_lookup` in the ext2fs translator checks, among other things,
+that the file exists, and eventually creates a new port, which will represent
+the open file, and a structure to keep information about it. It returns this new
+port to its caller, `_Xdir_lookup`, which puts it into the reply message data
+buffer and returns. `ports_manage_port_operations_multithread` then calls
+`mach_msg` to send that port to the user program.
+
+The `mach_msg` call in the user program thus returns, returning the port,
+decoded by `dir_lookup`. glibc adds a new slot to its
+[[glibc/file_descriptor]] table, and records the port in it.
+
+`lseek` is simpler. The glibc implementation simply calls the `__io_seek`
+function against the port of the file descriptor. This is an RPC from the
+[[`io` interface (see `io.defs`)|interface/io]]. As explained above, the
+implementation is thus in `RPC_io_seek.c`; it encodes parameters and makes a
+`mach_msg` system call to the port of the file descriptor with the `io_seek`
+RPC ID.
+
+In the root filesystem, it's now the demultiplexer for the `io` interface,
+`diskfs_io_server`, which will recognize the RPC ID, and call `_Xio_seek`,
+which retrieves the data structure for the port, and calls `diskfs_S_io_seek`.
+The latter simply modifies ext2fs' internal data structure to account for the
+file position change, and returns the new position. `_Xio_seek` encodes the
+position into the reply message, which is sent back by
+`ports_manage_port_operations_multithread` through `mach_msg`.
+
+The `mach_msg` call in the user program thus returns the new offset, decoded by
+`__io_seek`. `lseek` can then return it to the user application.
+
+
+When hacking, one usually does *not* have to keep all that in mind. All one
+needs to remember (or look up) is that when the application program calls
+`open`, the glibc implementation actually calls `dir_lookup`, which triggers a
+call to `diskfs_S_dir_lookup` in the ext2fs translator. When the application
+program calls `lseek`, the glibc implementation calls `__io_seek`, which
+triggers a call to `diskfs_S_io_seek` in the ext2fs translator. And so on...
+
+
+# Questions and Answers
+
+## How do I know whether a function is an RPC or not?
+
+Simply `grep` the function name (without leading underscores) in the
+`/usr/include/hurd/*.defs` files.
+
+
+## Why is it a libdiskfs function that gets called?
+
+Because the filesystem serving the file, ext2fs, is [[libdiskfs]]-based (see
+`HURDLIBS = diskfs` in `ext2fs/Makefile`). Other translators are
+[[libnetfs]]-based or [[libtrivfs]]-based. `grep` for RPC names in those
+according to what your translator is based on.
+
+
+## How do I know which translator the RPC gets into?
+
+Check the type of file whose port the RPC was made on. Most files are handled
+by the translator which is mounted where the files are opened. Some special
+files are handled by particular translators:
+
+ * `PF_LOCAL`/`PF_UNIX` sockets are served by [[translator/pflocal]], see
+ [[hurd/networking]];
+ * `PF_INET`/`PF_INET6` sockets are served by [[translator/pfinet]], see
+ [[hurd/networking]];
+ * named pipes (also known as FIFOs) are served by [[translator/fifo]].
+
+
+# See Also
+
+ * [[hurd/debugging/rpctrace]]
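
The server-side dispatch described in the `hurd/rpc.mdwn` walkthrough above (each interface demultiplexer checks whether the RPC ID belongs to its interface and, if so, indexes a routines array) can be sketched roughly as follows. This is a simplified model, not MIG-generated code: the message layout, the ID bases and every `demo_*` name are assumptions made for the example, and real Mach messages carry ports, sizes and type descriptors that are omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified message: one in- and one out-parameter. */
struct demo_msg {
    int id;        /* RPC ID, selects the routine */
    long arg;
    long result;
};

typedef void (*demo_routine_t)(const struct demo_msg *in, struct demo_msg *out);

static long demo_file_pos;   /* stands in for per-open-file state */

/* Server stubs, playing the role of the _X... functions. */
static void demo_X_lookup(const struct demo_msg *in, struct demo_msg *out)
{
    out->result = in->arg + 1000;   /* pretend "new port" number */
}

static void demo_X_seek(const struct demo_msg *in, struct demo_msg *out)
{
    demo_file_pos += in->arg;
    out->result = demo_file_pos;    /* new position goes into the reply */
}

/* Made-up ID ranges, one per interface. */
enum { DEMO_FS_BASE = 100, DEMO_IO_BASE = 200 };

static const demo_routine_t demo_fs_routines[] = { demo_X_lookup };
static const demo_routine_t demo_io_routines[] = { demo_X_seek };

/* Per-interface demultiplexer: returns 1 if the ID was ours, 0 otherwise. */
static int demo_dispatch(const demo_routine_t *table, size_t n, int base,
                         const struct demo_msg *in, struct demo_msg *out)
{
    assert(in != NULL && out != NULL);
    if (in->id < base || (size_t)(in->id - base) >= n)
        return 0;
    table[in->id - base](in, out);
    return 1;
}

/* Top-level demultiplexer: try each supported interface in turn. */
int demo_demuxer(const struct demo_msg *in, struct demo_msg *out)
{
    return demo_dispatch(demo_fs_routines,
                         sizeof demo_fs_routines / sizeof demo_fs_routines[0],
                         DEMO_FS_BASE, in, out)
        || demo_dispatch(demo_io_routines,
                         sizeof demo_io_routines / sizeof demo_io_routines[0],
                         DEMO_IO_BASE, in, out);
}
```

A message with ID `DEMO_IO_BASE` and argument 10 ends up in `demo_X_seek` and yields a reply carrying the new position; an ID that no interface claims makes `demo_demuxer` return 0, leaving it to the caller to report an error.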
diff --git a/hurd/running.mdwn b/hurd/running.mdwn
index a14106e1..41855433 100644
--- a/hurd/running.mdwn
+++ b/hurd/running.mdwn
@@ -1,5 +1,5 @@
-[[!meta copyright="Copyright © 2007, 2008, 2009, 2011 Free Software Foundation,
-Inc."]]
+[[!meta copyright="Copyright © 2007, 2008, 2009, 2011, 2012 Free Software
+Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -17,11 +17,10 @@ There are several different ways to run a GNU/Hurd system:
* [[microkernel/mach/gnumach/ports/Xen]] - In Xen
* [[Live_CD]]
* [[QEMU]] - In QEMU
+* [[chroots|chroot]] need a couple of tricks to work properly.
* [[VirtualBox]] - In VirtualBox
* [[vmware]] (**non-free!**)
* [[FAQ]]
* [[Public_hurd_boxen]]
-
-[[chroots|chroot]] need a couple of tricks to work properly.
diff --git a/hurd/chroot.mdwn b/hurd/running/chroot.mdwn
index 60bf47b7..29b00a8f 100644
--- a/hurd/chroot.mdwn
+++ b/hurd/running/chroot.mdwn
@@ -8,15 +8,15 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
-This documents the currently-needed tricks to successfully build a chroot in
-GNU/Hurd.
+This documents the currently-needed tricks to successfully build a
+[[glibc/chroot]] in GNU/Hurd.
# Preparation
For proper translator startup, the chroot storage needs to be handled by a
separate translator, for instance:
- # dd < /dev/zero > storage
+ # dd [...] < /dev/zero > storage
# mke2fs storage
# settrans -c chroot /hurd/ext2fs $PWD/storage
@@ -29,13 +29,15 @@ Debootstrap should be able to build the content:
# Tricks
One current issue to know about chroots is that since passive translators (e.g.
-/servers/socket/pflocal) are started by the root translator, which is not aware
+`/servers/socket/1`) are started by the root translator, which is not aware
of the chrooting, these passive translators are started non-chrooted, leading to
a few issues.
+[[!tag open_issue_hurd]]
## Sockets
-Since the passive pflocal translator will not be chrooted, local socket creation
+Since the passive [[translator/pflocal]] translator will not be chrooted, local
+socket creation
will actually happen in the root filesystem. To make things work correctly the
programs inside the chroot need to be able to access them:
@@ -45,7 +47,8 @@ programs inside the chroot need to be able to access them:
## Network
-Unless using a separate IP for the chroot, it is preferrable to share the pfinet translator:
+Unless using a separate IP for the chroot, it is preferable to share the
+[[translator/pfinet]] translator:
# settrans chroot/servers/socket/2 /hurd/firmlink /servers/socket/2
# settrans chroot/servers/socket/26 /hurd/firmlink /servers/socket/26
diff --git a/hurd/running/debian/after_install.mdwn b/hurd/running/debian/after_install.mdwn
index 419940a7..72ea70a9 100644
--- a/hurd/running/debian/after_install.mdwn
+++ b/hurd/running/debian/after_install.mdwn
@@ -1,6 +1,8 @@
First steps after installation.
-See http://www.debian.org/ports/hurd/hurd-install for configuration bits and tips and tricks.
+See <http://www.debian.org/ports/hurd/hurd-install> for configuration bits and
+tips and tricks.
+
# Setup GRUB
diff --git a/hurd/running/debian/dhcp.mdwn b/hurd/running/debian/dhcp.mdwn
index 8d351aae..afa46799 100644
--- a/hurd/running/debian/dhcp.mdwn
+++ b/hurd/running/debian/dhcp.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -10,16 +10,16 @@ License|/fdl]]."]]"""]]
[[!tag open_issue_porting]]
-In order to use DHCP, you need to install the `ifup` and `isc-dhcp-client`
-packages, and manually create the following two symbolic links:
+In order to use DHCP, you need to install the `ifupdown` and `isc-dhcp-client`
+packages, and manually create the following symbolic link:
- # ln -s ../rcS.d/S06ifupdown-clean ../rcS.d/S11networking /etc/rc.boot/
+ # ln -s ../rcS.d/S10networking /etc/rc.boot/
-During execution at boot time, the `S11networking` script will emit some error
+During execution at boot time, the `S10networking` script will emit some error
messages while trying to configure the loopback interface. These are not
fatal.
-Debian GNU/Hurd doesn't currently execute's Debian standard `/etc/rcS.d/*` boot
+Debian GNU/Hurd doesn't currently execute Debian standard `/etc/rcS.d/*` boot
scripts, but has its own `/libexec/rc` script -- which integrates scripts from
`/etc/rc.boot/` instead.
diff --git a/hurd/running/qemu.mdwn b/hurd/running/qemu.mdwn
index 512ea602..3648c7d6 100644
--- a/hurd/running/qemu.mdwn
+++ b/hurd/running/qemu.mdwn
@@ -105,6 +105,16 @@ If your machine supports hardware acceleration, you should really use the kvm va
to the command line, see below, if you are running Linux kernels 2.6.37 or 2.6.38 else IRQs may hang sooner or later. The kvm irq problems will be solved in kernel 2.6.39.
+IRC, freenode, #hurd, 2012-08-29:
+
+ <braunr> youpi: do you remember which linux versions require the
+ -no-kvm-irqchip option ?
+ <braunr> your page indicates 2.6.37-38, but i'm seeing weird things on
+ 2.6.32
+ <braunr> looks like a good thing to use that option all the time actually
+ <gnu_srs> seems like kvm -h says: -no-kvm-irqchip and man kvm says:
+ -machine kernel_irqchip=off
+
/!\ Note that there are known performance issues with KVM on Linux 2.6.39
kernels, compared to 2.6.32: [[!debbug 634149]]. We're preparing on a change
on our side to work around this.
diff --git a/hurd/settrans/discussion.mdwn b/hurd/settrans/discussion.mdwn
index c9ec4d34..74f1c8f5 100644
--- a/hurd/settrans/discussion.mdwn
+++ b/hurd/settrans/discussion.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -16,3 +16,24 @@ License|/fdl]]."]]"""]]
<antrik> ugh... I just realized why settrans -a without -f doesn't
generally work on filesystem translators
<antrik> obviously, it needs -R too!
+
+
+# IRC, freenode, #hurd, 2012-08-17
+
+ <antrik> youpi: no, only the -g is redundant; i.e. -ga is the same as -a
+ <antrik> (actually, not redundant, but rather simply meaningless in this
+ case)
+ <antrik> -g tells what to do with an active translator *when a passive one
+ is changed*
+ <antrik> if no passive one is changed, it does nothing
+ <antrik> (and I realized that after using the Hurd for only 6 years or so
+ ;-) )
+ <braunr> it's not obvious
+ <antrik> braunr: indeed. it's not obvious at all from the --help output :-(
+ <antrik> not sure though how to make it clearer
+ <braunr> the idea isn't obvious
+ <braunr> perhaps telling that "setting a passive translator" also applies
+ to removing it, i.e. setting it to none
+ <antrik> braunr: well, the fact that a translator is unset by setting it to
+ nothing is unclear in general, not only for passive translator. I agree
+ that pointing this out should make things much more clear in general...
diff --git a/hurd/subhurd/discussion.mdwn b/hurd/subhurd/discussion.mdwn
index 3449edcd..c4fc047f 100644
--- a/hurd/subhurd/discussion.mdwn
+++ b/hurd/subhurd/discussion.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -8,9 +8,10 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
-[[!tag open_issue_documentation]]
+[[!tag open_issue_documentation open_issue_hurd]]
-IRC, freenode, #hurd, 2011-08-10
+
+# IRC, freenode, #hurd, 2011-08-10
< braunr> youpi: aren't sub-hurds actually called "neighbor hurds" ?
< youpi> no idea
@@ -67,3 +68,92 @@ IRC, freenode, #hurd, 2011-08-10
< antrik> I don't think that's actually supported in the boot
program... but it doesn't really matter, as you don't really need the
terminal anyways -- you can always log in through the network
+
+
+# IRC, freenode, #hurd, 2012-07-31
+
+ <gg0> subhurd seems like bsd jail (tried none of them)
+ <antrik> gg0: nope. BSD jails are mostly chroot AIUI. subhurd is quite
+ different
+ <antrik> gg0: you actually boot a completely new system instance
+ <antrik> complete with all the Hurd servers, UNIX daemons etc.
+ <braunr> jails are between subhhurds and chroots :p
+ <braunr> i suppose there is nothing against making the root server of the
+ subhurd use a file instead of a raw disk, is there ?
+ <gg0> well, I said jails cos afaik are more isolated from real system than
+ chroots
+ <braunr> yes
+ <gg0> maybe comparing subhurd to virtual machines would be more
+ appropriated then
+ <braunr> they're not VMs either
+ <gg0> say chroot -> jail -> subhurd -> vm ?
+ <braunr> unless you consider the microkernel to be a hypervisor, with its
+ own architecture, which some actually do
+ <braunr> gg0: something like that, yes
+ <gg0> [system-in-system evolution]
+ <braunr> a subhurd is an operating system instance
+ <braunr> i think the closest analogy you can get is openvz
+ <antrik> yeah, I'd also consider it closest. but it's still quite
+ different: with OpenVZ, the kernel facilities are only logically
+ isolated; but they use the same kernel code. with subhurds, most of the
+ system facilities are independent
+
+
+# IRC, freenode, #hurd, 2012-08-03
+
+ <antrik> hm... are Mach task IDs exposed to userspace?
+ <braunr> antrik: ids ?
+ <braunr> antrik: what do you call a mach task id ?
+ <antrik> task have numeric IDs in the kernel
+ <antrik> I wonder whether these are ever exposed to userspace
+ <braunr> i'm not sure
+ <braunr> i don't remember the had numeric IDs
+ <braunr> they*
+ <antrik> well, perhaps I'm making things up... but I believe I saw such IDs
+ in the debugger and/or in error messages
+ <braunr> probably their address
+ <braunr> or creation time orpc_sample
+ <antrik> braunr: well, any unique ID would do
+ <braunr> antrik: yes but i was wondering what kdb would actually show
+ <antrik> I just realised that it would be useful for debugging accross
+ subhurds or kernel/userspace if some kind of unique task IDs could be
+ shown in ps output
+ <braunr> yes
+ <braunr> this requires some thought though
+ <braunr> ps shouldn't show that
+ <braunr> there should be mach specific commands i suppose
+ <braunr> but then, gdb and other tools wouldn't have access to subhurd
+ tasks either
+ <antrik> why shouldn't ps show that? I don't think it's any more sensitive
+ information than all the other stuff ps shows...
+ <braunr> it doesn't feel right
+ <braunr> i would want my system instances to be truely isolated
+ <braunr> and use special "cross instance" facilities
+ <braunr> when necessary
+ <antrik> that's completely orthogonal to what I'm talking about
+ <braunr> like eth-multiplexer
+ <braunr> you seem to be talking about security
+ <braunr> or privacy
+ <antrik> we discussed such options when zhengda worked on rootless subhurd
+ <antrik> no, I'm talking about convenient debugging
+ <braunr> right
+ <braunr> i don't think it'zs orthogonal here
+ <braunr> if we increase separation, it becomes less convenient
+ <antrik> for debugging purposes you would *not* use the isolation options
+ <braunr> ok so you propose two modes of operations
+ <antrik> BTW, as an isolated subhurd relies on the parent, it makes no
+ sense to hide subhurd tasks from the parent hurd -- only hide parent hurd
+ task from the subhurd
+ <braunr> agreed
+ <antrik> so even with an isolated subhurd global task IDs would still be
+ useful
+
+
+# IRC, freenode, #hurd, 2012-08-06
+
+ <braunr> antrik: if i'm right, the root file system executable is read from
+ the parent, right ?
+ <antrik> braunr: probably... I'm not sure about that part
+ <braunr> antrik: i've installed the same packages in both the main and
+ subhurds to be sure
+ <braunr> and to have the right binary and debugging symbols in gdb anyway
diff --git a/hurd/translator.mdwn b/hurd/translator.mdwn
index d504b41f..37f4e8bc 100644
--- a/hurd/translator.mdwn
+++ b/hurd/translator.mdwn
@@ -93,6 +93,8 @@ The [[concept|concepts]] of translators creates its own problems, too:
* [[magic]]
* [[unionfs]]
* [[nfs]]
+* [[symlink]]
+* [[firmlink]]
* ...
diff --git a/hurd/translator/exec.mdwn b/hurd/translator/exec.mdwn
index d5b6bfbc..54abba7e 100644
--- a/hurd/translator/exec.mdwn
+++ b/hurd/translator/exec.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2009 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2009, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -10,3 +10,5 @@ License|/fdl]]."]]"""]]
The *exec* server, listening on `/servers/exec`, is responsible for
preparing the execution of processes.
+
+ * [[open_issues/exec_memory_leaks]].
diff --git a/hurd/translator/ext2fs.mdwn b/hurd/translator/ext2fs.mdwn
index 8e15d1c7..13a1d9ec 100644
--- a/hurd/translator/ext2fs.mdwn
+++ b/hurd/translator/ext2fs.mdwn
@@ -20,6 +20,8 @@ License|/fdl]]."]]"""]]
* [[metadata_caching]]
+ * [[internal_allocator]]
+
## Large Stores
@@ -87,6 +89,20 @@ small backend stores, like floppy devices.
<youpi> which can be quite probable
+## Sync Interval
+
+[[!tag open_issue_hurd]]
+
+
+### IRC, freenode, #hurd, 2012-10-08
+
+ <braunr> btw, how about we increase our ext2 sync interval to 30 seconds,
+ like others do ?
+ <braunr> not really because others do it that way, but because it severely
+ breaks performance on the hurd
+ <braunr> and 30 seems like a reasonable amount (better than 5 at least)
+
+
# Documentation
* <http://e2fsprogs.sourceforge.net/ext2.html>
diff --git a/hurd/translator/ext2fs/internal_allocator.mdwn b/hurd/translator/ext2fs/internal_allocator.mdwn
new file mode 100644
index 00000000..f3678a28
--- /dev/null
+++ b/hurd/translator/ext2fs/internal_allocator.mdwn
@@ -0,0 +1,39 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_documentation]]
+
+
+# IRC, freenode, #hurd, 2012-07-30
+
+ <mcsim> Why for big buffers in ext2fs used own allocator, that just
+ allocates many pages at once, instead of using malloc?
+ <mcsim> i.e. can I replace it with malloc, because it just complicates
+ things?
+ <braunr> mcsim: probably because of alignment
+ <braunr> what gets complicated by that ?
+ <mcsim> braunr: than valloc?
+ <mcsim> braunr: this allocator allows to allocate only buffer with size of
+ vm_page_size.
+ <mcsim> valloc just would be clearer.
+ <braunr> valloc ?
+ <braunr> valloc is obsolete
+ <mcsim> braunr: than memalign or posix_memalign?
+ <mcsim> memalign obsolete too... would posix_memalign be eligible?
+ <braunr> mcsim: why memalign instead of the custom allocator ?
+ <mcsim> because, I think, it is clearer. Also, since I need to allocate any
+ amount of pages, not just one, I have to edit custom allocator. Although
+ it is not hard, but using ready stuff seems more sane for me.
+ <mcsim> braunr: ^
+ <braunr> right, but make sure posix_memalign doesn't create too much
+ overhead
+ <mcsim> braunr: what kind of overhead?
+ <braunr> fragmentation
+ <braunr> i assume the glibc implementation is careful about that, but still
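
The replacement discussed in the IRC log above (allocating any number of page-aligned pages with `posix_memalign` instead of a custom allocator) might look like the sketch below. It is an assumption-laden illustration, not the ext2fs code: `demo_alloc_pages` is a hypothetical helper, and the page size is taken from `sysconf(_SC_PAGESIZE)` rather than Mach's `vm_page_size` to keep the example portable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Allocate NPAGES contiguous, page-aligned pages.
   Returns NULL on failure; release the buffer with free(). */
void *demo_alloc_pages(size_t npages)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;

    assert(npages > 0);
    /* Reject an unknown page size and guard the multiplication below. */
    if (page <= 0 || npages > SIZE_MAX / (size_t)page)
        return NULL;
    if (posix_memalign(&buf, (size_t)page, npages * (size_t)page) != 0)
        return NULL;
    return buf;
}
```

Whether this fragments the heap more than a dedicated page allocator, as raised in the log, is not something the sketch can answer; that would need measuring.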
diff --git a/hurd/translator/firmlink.mdwn b/hurd/translator/firmlink.mdwn
new file mode 100644
index 00000000..038879db
--- /dev/null
+++ b/hurd/translator/firmlink.mdwn
@@ -0,0 +1,22 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_documentation]]
+
+
+# IRC, freenode, #hurd, 2012-07-20
+
+ <infinity0> does hurd have equivalent of mount --bind yet?
+ <kilobug> infinity0: unionfs with just one back-end ?
+ <infinity0> ah cool i'll try thaty
+ <kilobug> there may be something better, but that's the one I know about ;)
+ <braunr> infinity0: firmlinks
+ <infinity0> ah thanks i'll look that up
+ <kilobug> braunr: oh, true, I forgot about that one
diff --git a/hurd/translator/nfs.mdwn b/hurd/translator/nfs.mdwn
index bf24370a..81372204 100644
--- a/hurd/translator/nfs.mdwn
+++ b/hurd/translator/nfs.mdwn
@@ -10,6 +10,11 @@ License|/fdl]]."]]"""]]
Translator acting as a NFS client.
+Only NFSv2/v3 is currently supported.
+
+[[!tag open_issue_hurd]]There are a few unmerged changes on a former GSoC
+project's topic-branch.
+
# See Also
diff --git a/hurd/translator/pfinet/ipv6.mdwn b/hurd/translator/pfinet/ipv6.mdwn
index 5afee0c6..d30cc850 100644
--- a/hurd/translator/pfinet/ipv6.mdwn
+++ b/hurd/translator/pfinet/ipv6.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2007, 2008, 2010 Free Software Foundation,
+[[!meta copyright="Copyright © 2007, 2008, 2010, 2012 Free Software Foundation,
Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -6,8 +6,8 @@ id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
-is included in the section entitled
-[[GNU Free Documentation License|/fdl]]."]]"""]]
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
[[Stefan_Siegl|stesie]] has added IPv6 support to the pfinet [[translator]].
This was [Savannah task #5470](http://savannah.gnu.org/task/?5470).
@@ -55,3 +55,18 @@ Quite the same, but with static IPv6 address assignment:
# Missing Functionality
Amongst other things, support for [[IOCTL]]s is missing.
+
+
+## IRC, freenode, #hurd, 2012-12-10
+
+[[!tag open_issue_hurd]]
+
+ <braunr> looks like pfinet -G option doesn't work
+ <braunr> if someone is interested in fixing this (it concerns static IPv6
+ routing)
+ <braunr> youpi: have you ever successfully used pfinet with global
+ statically configured ipv6 addresses ?
+ <youpi> never tried
+ <braunr> ok
+ <braunr> i'd like to set this up on my VMs but it looks bugged :/
+ <braunr> i can't manage to correctly set the gateway
diff --git a/hurd/translator/procfs/jkoenig/discussion.mdwn b/hurd/translator/procfs/jkoenig/discussion.mdwn
index 4f6492ed..612983db 100644
--- a/hurd/translator/procfs/jkoenig/discussion.mdwn
+++ b/hurd/translator/procfs/jkoenig/discussion.mdwn
@@ -215,7 +215,8 @@ Needed by glibc's `pldd` tool (commit
# `/proc/self/exe`
-[[!message-id "alpine.LFD.2.02.1110111111260.2016@akari"]]
+[[!message-id "alpine.LFD.2.02.1110111111260.2016@akari"]]. Needed by glibc's
+`stdlib/tst-secure-getenv.c`.
# `/proc/[PID]/fd/`
@@ -278,6 +279,9 @@ Needed by glibc's `pldd` tool (commit
# `/proc/[PID]/maps`
+[[!tag GNU_Savannah_bug 32770]]
+
+
## IRC, OFTC, #debian-hurd, 2012-06-20
<pinotree> bdefreese: the two elfutils tests fail because there are no
@@ -313,3 +317,23 @@ Needed by glibc's `pldd` tool (commit
* pinotree has a local work to add the /proc/$pid/cwd symlink, but relying
on "internal" (but exported) glibc functions
+
+
+# "Unusual" PIDs
+
+Not actually related to procfs, but here seems to be a convenient place for
+filing these:
+
+
+## IRC, freenode, #hurd, 2012-08-10
+
+ <braunr> too bad the proc server has pid 0
+ <braunr> top & co won't show it
+
+
+## IRC, OFTC, #debian-hurd, 2012-09-18
+
+ <pinotree> youpi: did you see
+ https://enc.com.au/2012/09/careful-with-pids/
+ <pinotree> ?
+ <youpi> nope
diff --git a/libpthread.mdwn b/libpthread.mdwn
index b31876b3..801a1a79 100644
--- a/libpthread.mdwn
+++ b/libpthread.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -20,9 +20,44 @@ License|/fdl]]."]]"""]]
Porting libpthread to a specific architecture is non-trivial.
-Our libpthread is currently used by / ported to the [[Hurd]] on [[GNU
-Mach|microkernel/mach/gnumach]], some [[microkernel/L4]] variants, and
-[[microkernel/Viengoos]].
+Our libpthread is currently used by/ported to the [[Hurd]] on [[GNU
+Mach|microkernel/mach/gnumach]], and [[microkernel/Viengoos]].
+
+
+# History
+
+There has been a libpthread port for Hurd on L4 use (working directly on L4: no
+further OS personality support required), which was dead and has been removed
+in commit a0bca9895bca67591127680860077b2658830e96. This had been superseded
+by a [[microkernel/Viengoos]] port, which has its own branches:
+`master-viengoos` (an implementation of Viengoos that runs on L4) and its
+successor, `master-viengoos-on-bare-metal` (runs directly on x86-64, is a
+bit more advanced, and provides everything that `master-viengoos` does and
+more).
+
+There has also been an incomplete and unmaintained PowerPC port which has been
+removed in commit a5387f6a45d6b3f2b381d861f5c288b79da6204f.
+
+
+## Threading Model
+
+libpthread has a 1:1 threading model.
+
+
+## Threads' Death
+
+A thread's death doesn't actually free the thread's stack (and maybe not the
+associated Mach ports either). That's because there's no way to free the stack
+after the thread dies (because the thread of control is gone); the stack needs
+to be freed by something else, and there's nothing convenient to do it. There
+are many ways to make it work.
+
+However, it isn't really a leak, because the unfreed resources do get used for
+the next thread. So the issue is that the shrinkage of resource consumption
+never happens, but it doesn't grow without bounds; it just stays at the maximum
+even if the current number of threads is lower.
+
+The same issue exists in [[hurd/libthreads]].
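
The behaviour described under "Threads' Death" can be illustrated with a small
toy model. This is only a sketch under stated assumptions, not libpthread's
actual code: the class `StackCache` and its methods are hypothetical names
invented for this illustration, and a stack is represented by a plain integer.

```python
class StackCache:
    """Toy model: stacks of dead threads are kept for reuse, never released."""

    def __init__(self):
        self.free_stacks = []  # stacks left behind by threads that have exited
        self.allocated = 0     # number of stacks ever obtained from the system

    def thread_create(self):
        # Reuse a cached stack if one is available; allocate only otherwise.
        if self.free_stacks:
            return self.free_stacks.pop()
        self.allocated += 1
        return self.allocated  # hypothetical stack identifier

    def thread_exit(self, stack):
        # Nothing releases the stack once the thread of control is gone,
        # so this model simply keeps it around for the next thread.
        self.free_stacks.append(stack)


cache = StackCache()

# Eight threads run at the same time, then all of them exit.
first_wave = [cache.thread_create() for _ in range(8)]
for stack in first_wave:
    cache.thread_exit(stack)

# Three new threads reuse cached stacks; no new stack is allocated.
second_wave = [cache.thread_create() for _ in range(3)]
print(cache.allocated, len(cache.free_stacks))
```

In this model the number of allocated stacks stays at the earlier maximum (8)
although only three threads are alive: consumption never shrinks, but it does
not grow without bounds either.
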
# Open Issues
diff --git a/microkernel.mdwn b/microkernel.mdwn
index edefddb7..754c7aee 100644
--- a/microkernel.mdwn
+++ b/microkernel.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2007, 2008, 2010 Free Software Foundation,
+[[!meta copyright="Copyright © 2007, 2008, 2010, 2012 Free Software Foundation,
Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -53,3 +53,5 @@ A 2002 article about [[microkernel_FUD|FUD]] (Fear, Uncertainty, Doubt).
* [[Barrelfish]]
* [[Viengoos]]
+
+ * [[Genode]]
diff --git a/microkernel/genode.mdwn b/microkernel/genode.mdwn
new file mode 100644
index 00000000..66dd6b38
--- /dev/null
+++ b/microkernel/genode.mdwn
@@ -0,0 +1,17 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+ * [[RPC]]
+
+
+# IRC, freenode, #hurd, 2012-08-02
+
+ <mcsim> If someone interested, there is a channel with lectures about
+ Genode and L4: http://www.youtube.com/user/drsartakov?feature=watch
diff --git a/microkernel/genode/rpc.mdwn b/microkernel/genode/rpc.mdwn
new file mode 100644
index 00000000..4f68a9f6
--- /dev/null
+++ b/microkernel/genode/rpc.mdwn
@@ -0,0 +1,65 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+
+# IRC, freenode, #hurd, 2012-07-20
+
+ <braunr> for the curious, genode has a very interesting way to provide IPC
+ <braunr> i think they're on the right path at least (synchronous rpc,
+ shared memory, signals, no IDL)
+ <braunr> (i just don't like their choosing C++ at the system core)
+ <kilobug> braunr: hum, how do you write the rpc if there is no IDL ?
+ <kilobug> braunr: in a dynamic language like Python or Ruby, you can have
+ "transparent" RPC with no IDL, but in a language like C or C++ ?
+ <kilobug> when you call them I mean
+ <braunr> kilobug: they call this dynamic marshalling based on C++ streams
+ <braunr> http://genode-labs.com/publications/dynrpc-2007.pdf
+ <kilobug> sounds quite ugly to use :s but that may because I'm not fond of
+ C++ itself ;)
+ <braunr> same for me
+ <braunr> they say inheritance in RPC interfaces is "a must"
+ <braunr> makes me skeptical
+ <braunr> other than that, it's very promising
+ <kilobug> from the caller side, having the RPC appearing to be normal
+ function calls (like you do with Mig or Corba) is quite pleasant, even if
+ writing IDL is burdensome, you write IDL only once while calling RPC is
+ done very often
+ <braunr> oh but they have that as well
+ <braunr> there is just an additional, thin layer of hand written code to
+ provide that
+ <kilobug> ok
+ <braunr> basically, interfaces are C++ virtual classes, which are then
+ inherited in client and server classes
+ <braunr> (although they're changing that with recursive templates)
+ <braunr> but i really like the idea of not relying on an IDL
+ <kilobug> recursive templates :s
+ <braunr> yeah :>
+ <braunr> must be some tricky code, but i guess once it's there, it must be
+ practical
+ <braunr> see
+ http://genode.org/documentation/release-notes/11.05#New_API_for_type-safe_inter-process_communication
+ <braunr> they also added typed capabilities, but i don't really like that
+ idea
+ <antrik> braunr: shared memory for what?
+ <braunr> antrik: for uh.. sharing ? :)
+ <braunr> antrik: these systems don't provide ipc primitives able to share
+ memory directly
+ <braunr> messages are always copied (although zero copy can be used)
+ <braunr> so sharing must be done separately
+ <antrik> hm... I realise that I have no idea how the map operation in L4 is
+ actually done...
+ <braunr> iirc, privileged threads handle that
+ <antrik> I guess you have to explicitly map before an RPC and revoke
+ afterwards, which is some overhead...
+ <braunr> so i guess it's separated as well
+ <braunr> i have one question in mind for now, maybe you can help me with
+ that :
+
+[[open_issues/synchronous_ipc]].
diff --git a/microkernel/l4.mdwn b/microkernel/l4.mdwn
index 7af5e6fc..de311497 100644
--- a/microkernel/l4.mdwn
+++ b/microkernel/l4.mdwn
@@ -1,5 +1,5 @@
-[[!meta copyright="Copyright © 2004, 2006, 2007, 2008, 2010, 2011 Free Software
-Foundation, Inc."]]
+[[!meta copyright="Copyright © 2004, 2006, 2007, 2008, 2010, 2011, 2012 Free
+Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -24,6 +24,12 @@ There was a GNU/Hurd [[port to L4|history/port_to_another_microkernel]], which
is now stalled.
+# IRC, freenode, #hurd, 2012-08-02
+
+ <mcsim> If someone interested, there is a channel with lectures about
+ Genode and L4: http://www.youtube.com/user/drsartakov?feature=watch
+
+
[[!ymlfront data="""
sel4:
diff --git a/microkernel/mach/deficiencies.mdwn b/microkernel/mach/deficiencies.mdwn
index f2f49975..e1f6debc 100644
--- a/microkernel/mach/deficiencies.mdwn
+++ b/microkernel/mach/deficiencies.mdwn
@@ -258,3 +258,265 @@ License|/fdl]]."]]"""]]
working on research around mach
<antrik> braunr: BTW, I have little doubt that making RPC first-class would
solve a number of problems... I just wonder how many others it would open
+
+
+# IRC, freenode, #hurd, 2012-09-04
+
+X15
+
+ <braunr> it was intended as a mach clone, but now that i have better
+ knowledge of both mach and the hurd, i don't want to retain mach
+ compatibility
+ <braunr> and unlike viengoos, it's not really experimental
+ <braunr> it's focused on memory and cpu scalability, and performance, with
+ techniques likes thread migration and rcu
+ <braunr> the design i have in mind is closer to what exists today, with
+ strong emphasis on scalability and performance, that's all
+ <braunr> and the reason the hurd can't be modified first is that my design
+ relies on some important design changes
+ <braunr> so there is a strong dependency on these mechanisms that requires
+ the kernel to exists first
+
+
+## IRC, freenode, #hurd, 2012-09-06
+
+In context of [[open_issues/multithreading]] and later [[open_issues/select]].
+
+ <gnu_srs> And you will address the design flaws or implementation faults
+ with x15?
+ <braunr> no
+ <braunr> i'll address the implementation details :p
+ <braunr> and some design issues like cpu and memory resource accounting
+ <braunr> but i won't implement generic resource containers
+ <braunr> assuming it's completed, my work should provide a hurd system on
+ par with modern monolithic systems
+ <braunr> (less performant of course, but performant, scalable, and with
+ about the same kinds of problems)
+ <braunr> for example, thread migration should be mandatory
+ <braunr> which would make client calls behave exactly like a userspace task
+ asking a service from the kernel
+ <braunr> you have to realize that, on a monolithic kernel, applications are
+ clients, and the kernel is a server
+ <braunr> and when performing a system call, the calling thread actually
+ services itself by running kernel code
+ <braunr> which is exactly what thread migration is for a multiserver system
+ <braunr> thread migration also implies sync IPC
+ <braunr> and sync IPC is inherently more performant because it only
+ requires one copy, no in kernel buffering
+ <braunr> sync ipc also avoids message floods, since client threads must run
+ server code
+ <gnu_srs> and this is not achievable with evolved gnumach and/or hurd?
+ <braunr> well that's not entirely true, because there is still a form of
+ async ipc, but it's a lot less likely
+ <braunr> it probably is
+ <braunr> but there are so many things to change i prefer starting from
+ scratch
+ <braunr> scalability itself probably requires a revamp of the hurd core
+ libraries
+ <braunr> and these libraries are like more than half of the hurd code
+ <braunr> mach ipc and vm are also very complicated
+ <braunr> it's better to get something new and simpler from the start
+ <gnu_srs> a major task nevertheless:-D
+ <braunr> at least with the vm, netbsd showed it's easier to achieve good
+ results from new code, as other mach vm based systems like freebsd
+ struggled to get as good
+ <braunr> well yes
+ <braunr> but at least it's not experimental
+ <braunr> everything i want to implement already exists, and is tested on
+ production systems
+ <braunr> it's just time to assemble those ideas and components together
+ into something that works
+ <braunr> you could see it as a qnx-like system with thread migration, the
+ global architecture of the hurd, and some improvements from linux like
+ rcu :)
+
+
+### IRC, freenode, #hurd, 2012-09-07
+
+ <antrik> braunr: thread migration is tested on production systems?
+ <antrik> BTW, I don't think that generally increasing the priority of
+ servers is a good idea
+ <antrik> in most cases, IPC should actually be sync. slpz looked at it at
+ some point, and concluded that the implementation actually has a
+ fast-path for that case. I wonder what happens to scheduling in this case
+ -- is the receiver scheduled immediately? if not, that's something to
+ fix...
+ <braunr> antrik: qnx does something very close to thread migration, yes
+ <braunr> antrik: i agree increasing the priority isn't a good thing, but
+ it's the best of the quick and dirty ways to reduce message floods
+ <braunr> the problem isn't sync ipc in mach
+ <braunr> the problem is the notifications (in our cases the dead name
+ notifications) that are by nature async
+ <braunr> and a malicious program could send whatever it wants at the
+ fastest rate it can
+ <antrik> braunr: malicious programs can do any number of DOS attacks on the
+ Hurd; I don't see how increasing priority of system servers is relevant
+ in that context
+ <antrik> (BTW, I don't think dead name notifications are async by
+ nature... just like for most other IPC, the *usual* case is that a server
+ thread is actively waiting for the message when it's generated)
+ <braunr> antrik: it's async with respect to the client
+ <braunr> antrik: and malicious programs shouldn't be able to do that kind
+ of dos
+ <braunr> but this won't be fixed any time soon
+ <braunr> on the other hand, a higher priority helps servers not create too
+ many threads because of notifications, and that's a good thing
+ <braunr> gnu_srs: the "fix" for this will be to rewrite select so that it's
+ synchronous btw
+ <braunr> replacing dead name notifications with something like cancelling a
+ previously installed select request
+ <antrik> no idea what "async with respect to the client" means
+ <braunr> it means the client doesn't wait for anything
+ <antrik> what is the client? what scenario are you talking about? how does
+ it affect scheduling?
+ <braunr> for notifications, it's usually the kernel
+ <braunr> it doesn't directly affect scheduling
+ <braunr> it affects the amount of messages a hurd server has to take care
+ of
+ <braunr> and the more messages, the more threads
+ <braunr> i'm talking about event loops
+ <braunr> and non blocking (or very short) selects
+ <antrik> the amount of messages is always the same. the question is whether
+ they can be handled before more come in. which would be the case if by
+ default the receiver gets scheduled as soon as a message is sent...
+ <braunr> no
+ <braunr> scheduling handoff doesn't imply the thread will be ready to
+ service the next message by the time a client sends a new one
+ <braunr> the rate at which a message queue gets filled has nothing to do
+ with scheduling handoff
+ <antrik> I very much doubt rates come into play at all
+ <braunr> well they do
+ <antrik> in my understanding the problem is that a lot of messages are sent
+ before the receiver ever has a chance to handle them. so no matter how
+ fast the receiver is, it loses
+ <braunr> a lot of non blocking selects means a lot of reply ports
+ destroyed, a lot of dead name notifications, and what i call message
+ floods at server side
+ <braunr> no
+ <braunr> it used to work fine with cthreads
+ <braunr> it doesn't any more with pthreads because pthreads are slightly
+ slower
+ <antrik> if the receiver gets a chance to do some work each time a message
+ arrives, in most cases it would be free to service the next request with
+ the same thread
+ <braunr> no, because that thread won't have finished soon enough
+ <antrik> no, it *never* worked fine. it might have been slighly less
+ terrible.
+ <braunr> ok it didn't work fine, it worked ok
+ <braunr> it's entirely a matter of rate here
+ <braunr> and that's the big problem, because it shouldn't
+ <antrik> I'm pretty sure the thread would finish before the time slice ends
+ in almost all cases
+ <braunr> no
+ <braunr> too much contention
+ <braunr> and in addition locking a contended spin lock depresses priority
+ <braunr> so servers really waste a lot of time because of that
+ <antrik> I doubt contention would be a problem if the server gets a chance
+ to handle each request before 100 others come in
+ <braunr> i don't see how this is related
+ <braunr> handling a request doesn't mean entirely processing it
+ <braunr> there is *no* relation between handoff and the rate of incoming
+ message rate
+ <braunr> unless you assume threads can always complete their task in some
+ fixed and low duration
+ <antrik> sure there is. we are talking about a single-processor system
+ here.
+ <braunr> which is definitely not the case
+ <braunr> i don't see what it changes
+ <antrik> I'm pretty sure notifications can generally be handled in a very
+ short time
+ <braunr> if the server thread is scheduled as soon as it gets a message, it
+ can also get preempted by the kernel before replying
+ <braunr> no, notifications can actually be very long
+ <braunr> hurd_thread_cancel calls condition_broadcast
+ <braunr> so if there are a lot of threads on that ..
+ <braunr> (this is one of the optimizations i have in mind for pthreads,
+ since it's possible to precisely select the target thread with a doubly
+ linked list)
+ <braunr> but even if that's the case, there is no guarantee
+ <braunr> you can't assume it will be "quick enough"
+ <antrik> there is no guarantee. but I'm pretty sure it will be "quick
+ enough" in the vast majority of cases. which is all it needs.
+ <braunr> ok
+ <braunr> that's also the idea behind raising server priorities
+ <antrik> braunr: so you are saying the storms are all caused by select(),
+ and once this is fixed, the problem should be mostly gone and the
+ workaround not necessary anymore?
+ <braunr> yes
+ <antrik> let's hope you are right :-)
+ <braunr> :)
+ <antrik> (I still think though that making hand-off scheduling default is
+ the right thing to do, and would improve performance in general...)
+ <braunr> sure
+ <braunr> well
+ <braunr> no it's just a hack ;p
+ <braunr> but it's a right one
+ <braunr> the right thing to do is a lot more complicated
+ <braunr> as roland wrote a long time ago, the hurd doesn't need dead-name
+ notifications, or any notification other than the no-sender (which can be
+ replaced by a synchronous close on fd like operation)
+ <antrik> well, yes... I still think the viengoos approach is promising. I
+ meant the right thing to do in the existing context ;-)
+ <braunr> better than this priority hack
+ <antrik> oh? you happen to have a link? never heard of that...
+ <braunr> i didn't want to do it initially, even resorting to priority
+ depression on thread creation to work around the problem
+ <braunr> hm maybe it wasn't him, i can't manage to find it
+ <braunr> antrik:
+ http://lists.gnu.org/archive/html/l4-hurd/2003-09/msg00009.html
+ <braunr> "Long ago, in specifying the constraints of
+ <braunr> what the Hurd needs from an underlying IPC system/object model we
+ made it
+ <braunr> very clear that we only need no-senders notifications for object
+ <braunr> implementors (servers)"
+ <braunr> "We don't in general make use of dead-name notifications,
+ <braunr> which are the general kind of object death notification Mach
+ provides and
+ <braunr> what serves as task death notification."
+ <braunr> "In the places we do, it's to serve
+ <braunr> some particular quirky need (and mostly those are side effects of
+ Mach's
+ <braunr> decouplable RPCs) and not a semantic model we insist on having."
+
+
+### IRC, freenode, #hurd, 2012-09-08
+
+ <antrik> The notion that seemed appropriate when we thought about these
+ issues for
+ <antrik> Fluke was that the "alert" facility be a feature of the IPC system
+ itself
+ <antrik> rather than another layer like the Hurd's io_interrupt protocol.
+ <antrik> braunr: funny, that's *exactly* what I was thinking when looking
+ at the io_interrupt mess :-)
+ <antrik> (and what ultimately convinced me that the Hurd could be much more
+ elegant with a custom-tailored kernel rather than building around Mach)
+
+
+## IRC, freenode, #hurd, 2012-09-24
+
+ <braunr> my initial attempt was a mach clone
+ <braunr> but now i want a mach-like kernel, without compability
+ <lisporu> which new licence ?
+ <braunr> and some very important changes like sync ipc
+ <braunr> gplv3
+ <braunr> (or later)
+ <lisporu> cool 8)
+ <braunr> yes it is gplv2+ since i didn't take the time to read gplv3, but
+ now that i have, i can't use anything else for such a project: )
+ <lisporu> what is mach-like ? (how it is different from Pistachio like ?)
+ <braunr> l4 doesn't provide capabilities
+ <lisporu> hmmm..
+ <braunr> you need a userspace for that
+ <braunr> +server
+ <braunr> and it relies on complete external memory management
+ <lisporu> how much work is done ?
+ <braunr> my kernel will provide capabilities, similar to mach ports, but
+ simpler (less overhead)
+ <braunr> i want the primitives right
+ <braunr> like multiprocessor, synchronization, virtual memory, etc..
+
+
+### IRC, freenode, #hurd, 2012-09-30
+
+ <braunr> for those interested, x15 is now a project of its own, with no
+ gnumach compability goal, and covered by gplv3+
diff --git a/microkernel/mach/gnumach/memory_management.mdwn b/microkernel/mach/gnumach/memory_management.mdwn
index c630af05..3e158b7c 100644
--- a/microkernel/mach/gnumach/memory_management.mdwn
+++ b/microkernel/mach/gnumach/memory_management.mdwn
@@ -48,6 +48,7 @@ License|/fdl]]."]]"""]]
<braunr> and mmu management
<braunr> (but maybe that's what you meant by physical memory)
+
## IRC, freenode, #hurd, 2011-02-16
<braunr> antrik: youpi added it for xen, yes
@@ -119,3 +120,16 @@ License|/fdl]]."]]"""]]
<braunr> there is issue with watch ./slabinfo which turned in a infinite
loop, but it didn't affect the stability of the system
<braunr> actually with a 64-bits kernel, we could use a 4/x split
+
+
+# IRC, freenode, #hurd, 2012-08-10
+
+ <braunr> all modern systems embed the kernel in every address space
+ <braunr> which allows reduced overhead when making a system call
+ <braunr> sometimes there is no context switch at all
+ <braunr> on i386, there are security checks to upgrade the privilege level
+ (switch to ring 0), and when used, kernel page tables are global, so
+ they're not flushed
+ <braunr> using sysenter/sysexit makes it even faster
+
+[[open_issues/system_call_mechanism]].
diff --git a/microkernel/mach/gnumach/ports.mdwn b/microkernel/mach/gnumach/ports.mdwn
index f114460c..e7fdb446 100644
--- a/microkernel/mach/gnumach/ports.mdwn
+++ b/microkernel/mach/gnumach/ports.mdwn
@@ -1,5 +1,5 @@
-[[!meta copyright="Copyright © 2007, 2008, 2009, 2011 Free Software Foundation,
-Inc."]]
+[[!meta copyright="Copyright © 2007, 2008, 2009, 2011, 2012 Free Software
+Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -13,6 +13,11 @@ License|/fdl]]."]]"""]]
* [[Xen]]
+ * [[open_issues/64-bit_port]]. There is some preliminary work for an
+ x86\_64 port.
+
+ * [[open_issues/ARM_port]]. Is not in a usable state.
+
* [PowerPC](http://www.pjbruin.dds.nl/hurd/). Is not in a usable state.
* Alpha: [project I](http://savannah.nongnu.org/projects/hurd-alpha), and
diff --git a/microkernel/mach/gnumach/projects/clean_up_the_code.mdwn b/microkernel/mach/gnumach/projects/clean_up_the_code.mdwn
index 2a9b4b60..89a27b01 100644
--- a/microkernel/mach/gnumach/projects/clean_up_the_code.mdwn
+++ b/microkernel/mach/gnumach/projects/clean_up_the_code.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2005, 2006, 2007, 2008, 2010 Free Software
+[[!meta copyright="Copyright © 2005, 2006, 2007, 2008, 2010, 2012 Free Software
Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -121,3 +121,12 @@ further files (also exported ones) that serve no real value, but are being
# Rewrite ugly code
+
+
+# IRC, freenode, #hurd, 2012-09-06
+
+ <mcsim> hello. Why size parameter of rpc device_read has type
+ "mach_msg_type_number_t *"? Why not just "vm_size_t *"?
+ <mcsim> this parameter has name data_count
+ <braunr> that's one of the reasons mach is confusing
+ <braunr> i can't really tell you why, it's messy :/
diff --git a/microkernel/mach/history.mdwn b/microkernel/mach/history.mdwn
index 5a3608cd..776bb1d7 100644
--- a/microkernel/mach/history.mdwn
+++ b/microkernel/mach/history.mdwn
@@ -58,3 +58,23 @@ Verbatim copying and distribution of this entire article is permitted in any med
Apple's Macintosh OSX (OS 10.x) is based on [Darwin](http://www.apple.com/macosx/technologies/darwin.html). _"Darwin uses a monolithic kernel based on [[TWiki/FreeBSD]] 4.4 and the OSF/mk Mach 3."_ Darwin also has a [Kernel Programming](http://developer.apple.com/techpubs/macosx/Darwin/General/KernelProgramming/About/index.html) Book.
-- [[Main/GrantBow]] - 22 Oct 2002
+
+IRC, freenode, #hurd, 2012-08-29:
+
+ <pavlx> was moved the page from apple.com about darwin kernel programming
+ as described on the
+ https://www.gnu.org/software/hurd/microkernel/mach/history.html
+ <pavlx> i found the page and it's
+ https://developer.apple.com/library/mac/#documentation/Darwin/Conceptual/KernelProgramming/About/About.html
+ <pavlx> it's not anymore the old page
+ http://developer.apple.com/techpubs/macosx/Darwin/General/KernelProgramming/About/index.html
+ <pavlx> and the link about darwin does noit exists anymore ! the new one
+ could be https://ssl.apple.com/science/profiles/cornell
+ <pavlx> the old one was
+ http://www.apple.com/macosx/technologies/darwin.html
+ <pavlx> the link to Darwin is changed i suppose that the nw one it's
+ https://ssl.apple.com/science/profiles/cornell
+ <pavlx> and the link to Kern Programming it's
+ https://developer.apple.com/library/mac/#documentation/Darwin/Conceptual/KernelProgramming/About/About.html
+ <pavlx> can't be anymore
+ http://developer.apple.com/techpubs/macosx/Darwin/General/KernelProgramming/About/index.html
diff --git a/microkernel/mach/message.mdwn b/microkernel/mach/message.mdwn
index ba47671e..4c49af17 100644
--- a/microkernel/mach/message.mdwn
+++ b/microkernel/mach/message.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2002, 2003, 2010 Free Software Foundation,
+[[!meta copyright="Copyright © 2002, 2003, 2010, 2012 Free Software Foundation,
Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -9,9 +9,11 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
-*Messages* are collections of typed data, with a defined layout.
+*Messages* are collections of typed data, with a defined layout, including an
+[[ID|msgh_id]].
-They are used for [[IPC]], and are sent to and received from [[port]]s.
+They are used for [[IPC]], and are sent to and received from [[port]]s using
+the `mach_msg` interface.
These messages are not only opaque data. They can also contain [[port
rights|port]] to be passed to another [[task]]. Port rights are either
diff --git a/microkernel/mach/message/msgh_id.mdwn b/microkernel/mach/message/msgh_id.mdwn
new file mode 100644
index 00000000..986fcbc7
--- /dev/null
+++ b/microkernel/mach/message/msgh_id.mdwn
@@ -0,0 +1,254 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_mig]]
+
+Every [[message]] has an ID field, which is defined in the [[RPC]] `*.defs`
+files.
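
As a rough illustration of how such IDs come about, here is a minimal sketch.
It assumes the usual MIG convention that routines are numbered sequentially
from the subsystem's base number, that a `skip` entry consumes a number without
defining a routine, and that a reply carries the request ID plus 100. The
subsystem base `9000` and the routine names are hypothetical, chosen only for
this example.

```python
REPLY_OFFSET = 100  # assumed MIG convention: reply ID = request ID + 100


def routine_ids(subsystem_base, entries):
    """Assign message IDs to the entries of a .defs-like routine list.

    `None` stands for a `skip` entry: it uses up an ID, so the routines
    that follow it keep their numbers.
    """
    ids = {}
    for index, name in enumerate(entries):
        if name is not None:
            ids[name] = subsystem_base + index
    return ids


def reply_id(request_id):
    """ID expected in the reply message for a given request ID."""
    return request_id + REPLY_OFFSET


# Hypothetical subsystem with one skipped (obsolete) slot.
ids = routine_ids(9000, ["example_open", None, "example_read"])
print(ids["example_read"], reply_id(ids["example_read"]))
```

Under these assumptions, filling the skipped slot with a new routine would not
renumber `example_read`, whereas removing the `skip` would shift its ID and
break existing clients.
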
+
+
+# IRC, freenode, #hurd, 2012-07-12
+
+[Extending an existing RPC.]
+
+ <antrik> create a new call, either with a new variant of vm_statistics_t,
+ or a new structure with only the extra fields
+ <braunr> that seems cleaner indeed
+ <braunr> but using different names for the same thing seems so tedious and
+ unnecessary :/
+ <antrik> it's extra effort, but it pays off
+ <braunr> i agree, it's the right way to do it
+ <braunr> but this implies some kind of versioning
+ <braunr> which is currently more or less done using mig subsystem numbers,
+ and skipping obsolete calls in rpc definition files
+ <braunr> and a subsystem is like 100 calls (200 with the replies)
+ <braunr> at some point we should recycle them
+ <braunr> or use truely huge ranges
+ <antrik> braunr: that's not something we need to worry about until we get
+ there -- which is not likely to happen any time soon :-)
+ <braunr> "There is no more room in this interface for additional calls."
+ <braunr> in mach.defs
+ <braunr> i'll use the mach4.defs file
+ <braunr> but it really makes no sense at all to do such things just because
+ we want to be compatible with 20 year old software nobody uses any more
+ <braunr> who cares about the skips used to keep us from using the old mach
+ 2.5 interface ..
+ <braunr> (and this 100 arbitrary limit is really ugly too)
+ <antrik> braunr: I agree that we don't want to be compatible with 20 years
+ old software. just Hurd stuff from the last few years is perfectly fine.
+ <tschwinge> braunr, antrik: I agree with the approach of using a new
+ RPC/data structure for incompatible changes, and I also agree that
+ recycling RPC slots that have been unused (skipped) for some years is
+ fine.
+ <antrik> tschwinge: well, we probably shouldn't just reuse them
+ arbitrarily; but rather do a mass purge if the need really arises...
+ <antrik> it would be confusing otherwise IMHO
+ <tschwinge> antrik: What do you understand by doing a mass purge?
+ <tschwinge> My idea indeed was to replace arbitrary "skip"s by new RPC
+ definitions.
+ <braunr> a purge would be good along with a mig change to make subsystem
+ and routines identifier larger
+ <braunr> i guess 16-bits width should do
+ <tschwinge> But what do you understand by a "purge" in this context?
+ <braunr> removing all the skips
+ <tschwinge> But that moves the RPC ids following after?
+ <braunr> yes
+ <braunr> that's why i think it's not a good thing, unless we also change
+ the numbering
+ <tschwinge> ... which is an incompatible change for all clients.
+ <braunr> yes
+ <tschwinge> OK, so you'd propose a new system and deprecate the current
+ one.
+ <braunr> not really new
+ <braunr> just larger numbers
+ <braunr> we must acknowledge interfaces change with time
+ <tschwinge> Yes, that's "new" enough. ;-)
+ <tschwinge> New in the sense that all clients use new interfaces.
+ <braunr> that's enough to completely break compatibility, yes
+ <braunr> at least binary
+ <tschwinge> Yes.
+ <tschwinge> However, I don't see an urgent need for that, do you?
+ <tschwinge> Why not just recycle a skip that has been unused for a decade?
+ <braunr> i don't think we should care much about that, as the only real
+ issue i can see is when upgrading a system
+ <braunr> i don't say we shouldn't do that
+ <braunr> actually, my current patch does exactly this
+ <tschwinge> OK. :-)
+ <braunr> purging is another topic
+ <braunr> but purging without making numbers larger seems a bit pointless
+ <braunr> as the point is allowing developers to change interfaces without
+ breaking short time compability
+ <braunr> compatibility*
+ <braunr> also, interfaces, even stable, can have more than 100 calls
+ <braunr> (at the same time, i don't think there would ever be many
+ interfaces, so using 16-bits integers for the subsystems and the calls
+ should really be fine, and cleanly aligned in memory)
+ <antrik> tschwinge: you are right, it was a brain fart :-)
+ <antrik> no purge obviously
+ <antrik> but I think we only should start with filling skips once all IDs
+ in the subsystem are exhausted
+ <antrik> braunr: the 100 is not fixed in MIG IIRC; it's a definition we
+ make somewhere
+ <antrik> BTW, using multiple subsystems for "overflowing" interfaces is a
+ bit ugly, but not to bad I'd say... so I wouldn't really consider this a
+ major problem
+ <antrik> err... not too bad
+ <antrik> especially since Hurd subsystems usually are spaced 1000 apart, so
+ there are some "spare" blocks between them anyways
+ <braunr> hm i'm almost sure it's related to mig
+ <braunr> that's how the reply id is computed
+ <antrik> of course it is related to MIG... but I have a vague recollection
+ that this constant is not fixed in the MIG code, but rather supplied
+ somewhere. might be wrong though :-)
+ <pinotree> you mean like the 101-200 skip block in hurd/tioctl.defs?
+ <antrik> pinotree: exactly
+ <antrik> these are reserved for reply message IDs
+ <antrik> at 200 a new request message block begins...
+ <braunr> server.c: fprintf(file, "\tOutP->Head.msgh_id = InP->msgh_id +
+ 100;\n");
+ <braunr> it's not even a define in the mig code :/
+ <pinotree> meaning that in the space of an hurd subsystem there are max 500
+ effective rpc's?
+ <antrik> actually, ioctls are rather special, as the numbers are computed
+ from the ioctl properties...
+ <antrik> braunr: :-(
+ <braunr> pinotree: how do you get this value ?
+ <pinotree> braunr: 1000/2? :)
+ <braunr> ?
+ <braunr> why not 20000/3 ?
+ <antrik> pinotree: yes
+ <braunr> where do they come from ?
+ <braunr> ah ok sorry
+ <pinotree> braunr: 1000 is the space of each subsystem, and each rpc takes
+ an id + its replu
+ <pinotree> *reply
+ <braunr> right
+ <braunr> 500 is fine
+ <braunr> better than 100
+ <braunr> but still, 64k is way better
+ <braunr> and not harder to do
+ <pinotree> (hey, i'm the noob in this :) )
+ <antrik> braunr: it's just how "we" lay out subsystems... nothing fixed
+ about it really; we could just as well define new subsystems with 10000
+ or whatever if we wanted
+ <braunr> yes
+ <braunr> but we still have to consider this mig limit
+ <antrik> there are one or two odd exceptions though, with "related"
+ subsystems starting at ??500...
+ <antrik> braunr: right. it's not pretty -- but I wouldn't consider it
+ enough of a problem to invest major effort in changing this...
+ <braunr> agreed
+ <braunr> at least not while our interfaces don't change often
+ <braunr> which shouldn't happen any time soon
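The ID layout discussed above can be sketched in C. This is only an illustration of the arithmetic: the `+ 100` reply offset comes from the `server.c` line quoted in the log, and the 1000-ID spacing comes from the discussion. Every function and constant name below is hypothetical and not part of MIG or the Hurd.

```c
#include <assert.h>
#include <stdbool.h>

/* Offset added to a request ID to form its reply ID (from the quoted
   server.c line).  The name is made up for this sketch. */
enum { REPLY_OFFSET = 100 };

/* Spacing between Hurd subsystem bases as described in the log.  This is
   a layout convention, not something MIG enforces. */
enum { SUBSYSTEM_SPACING = 1000 };

/* Reply message ID that corresponds to a request message ID. */
int reply_id_for(int request_id)
{
    return request_id + REPLY_OFFSET;
}

/* True if `id` lies in a block that must stay free for reply IDs, for a
   subsystem starting at `base`.  Blocks of 100 alternate between
   requests (even blocks) and replies (odd blocks). */
bool is_reply_slot(int base, int id)
{
    int offset = id - base;
    assert(offset >= 0 && offset < SUBSYSTEM_SPACING);
    return (offset / REPLY_OFFSET) % 2 == 1;
}

/* Number of IDs usable for requests within one subsystem's range. */
int usable_request_slots(void)
{
    int count = 0;
    for (int offset = 0; offset < SUBSYSTEM_SPACING; offset++)
        if ((offset / REPLY_OFFSET) % 2 == 0)
            count++;
    return count;
}
```

Under these assumptions `usable_request_slots()` yields the 500 that pinotree derives as 1000/2, and a skip block such as the one in `hurd/tioctl.defs` corresponds to offsets for which `is_reply_slot` is true.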
+
+ <tschwinge> Hmm, I also remember seeing some emails about indeed versioning
+ RPCs (by Roland, I think). I can try to look that up if there's
+ interest.
+
+ <braunr> i'm only adding a cached pages count you know :)
+ <braunr> (well actually, this is now a vm_stats call that can replace
+ vm_statistics, and uses flavors similar to task_info)
+ <antrik> braunr: I don't think introducing "flavors" is a good idea
+ <braunr> i just did it the way others calls were done
+ <braunr> other*
+ <braunr> would you prefer a larger structure with append-only upgrades ?
+ <antrik> I prefer introducing new calls. it avoids an unnecessary layer of
+ indirection
+ <antrik> flavors are not exactly RPC-over-RPC, but definitely going down
+ that road...
+ <braunr> right
+ <antrik> as fetching VM statistics is not performance-critical, I would
+ suggest adding a new call with only the extra stats you are
+ introducing. then if someone runs an old kernel not implementing that
+ call, the values are simply left blank in the caller. makes
+ backward-compatibility a no-brainer
+ <antrik> (the alternative is a new call fetching both the traditional and
+ the new stats -- but this is not necessary here, as an extra call
+ shouldn't hurt)
+ <braunr> antrik: all right
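A rough C sketch of the fallback antrik describes: the caller tries the new call and, if the kernel does not implement it, leaves the extra values blank. The RPC is modelled as a plain function pointer so the example runs anywhere. The error code, the struct and all function names are hypothetical stand-ins, not gnumach or MIG interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes for this sketch only. */
#define CALL_OK              0
#define CALL_NOT_IMPLEMENTED (-1)

/* The extra statistics; `valid` is false when the kernel could not
   provide them, so callers can print them as blank. */
struct cache_stats {
    bool valid;
    unsigned cached_objects;
    unsigned cached_pages;
};

/* Stand-in for the new RPC: fills both counters or reports an error. */
typedef int (*cache_stats_call)(unsigned *cached_objects,
                                unsigned *cached_pages);

/* Try the new call; on any failure return stats marked as not valid. */
struct cache_stats fetch_cache_stats(cache_stats_call call)
{
    struct cache_stats stats = { false, 0, 0 };
    unsigned objects = 0, pages = 0;

    if (call != NULL && call(&objects, &pages) == CALL_OK) {
        stats.valid = true;
        stats.cached_objects = objects;
        stats.cached_pages = pages;
    }
    return stats;
}

/* Simulates an old kernel that does not know the call. */
int old_kernel_call(unsigned *cached_objects, unsigned *cached_pages)
{
    (void)cached_objects;
    (void)cached_pages;
    return CALL_NOT_IMPLEMENTED;
}

/* Simulates a new kernel; the numbers are arbitrary sample values. */
int new_kernel_call(unsigned *cached_objects, unsigned *cached_pages)
{
    *cached_objects = 12;
    *cached_pages = 3400;
    return CALL_OK;
}
```

The existing statistics call stays untouched in this scheme, which is why the extra round trip is acceptable only because fetching VM statistics is not performance-critical.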
+
+
+## IRC, freenode, #hurd, 2012-07-13
+
+ <braunr> so, should i replace old, unused mach.defs RPCs with mine, or add
+ them to e.g. mach4.defs ?
+ <antrik> braunr: hm... actually I wonder whether we shouldn't add a
+ gnumach.defs -- after all, it's neither old mach nor mach4 interfaces...
+ <braunr> true
+ <braunr> good idea
+ <braunr> i'll do just that
+ <braunr> hm, doesn't adding a new interface file require some handling in
+ glibc ?
+ <youpi> simply rebuild it
+ <braunr> youpi: no i mean
+ <braunr> youpi: glibc knows about mach.defs and mach4.defs, but i guess we
+ should add something so that it knows about gnumach.defs
+ <youpi> ah
+ <youpi> probably, yes
+ <braunr> ok
+ <braunr> i don't understand why these files are part of the glibc headers
+ <pinotree> are they?
+ <braunr> (i mean mach_interface.h and mach4.h)
+ <braunr> for example
+ <braunr> youpi: the interface i'll add is vm_cache_statistics(task,
+ &cached_objects, &cached_pages)
+ <braunr> if it's ok i'll commit directly into the gnumach repository
+ <youpi> shouldn't it rather be an int array, to make it extensible?
+ <youpi> like other stat functions of gnumach
+ <braunr> antrik was against doing that
+ <braunr> well, he was against using flavors
+ <braunr> maybe we could have an extensible array yes, and require additions
+ at the end of the structure
+
+
+## IRC, freenode, #hurd, 2012-07-14
+
+ <antrik> braunr: there are two reasons why the files are part of glibc. one
+ is that glibc itself uses them, so it would be painful to handle
+ otherwise. the other is that libc is traditionally responsible for
+ providing the system interface...
+ <antrik> having said that, I'm not sure we should stick with that :-)
+ <braunr> antrik: what do you think about having a larger structure with
+ reserved fields ? sounds a lot better than flavors, doesn't it ?
+ <youpi> antrik: it's in debian, yes
+ <braunr> grmbl, adding a new interface just for a single call is really
+ tedious
+ <braunr> i'll just add it to mach4
+ <antrik> braunr: well, it's not unlikely there will be other new calls in
+ the future... but I guess using mach4.defs isn't too bad
+ <antrik> braunr: as for reserved fields, I guess that is somewhat better
+ than flavors; but I can't say I exactly like the idea either...
+ <braunr> antrik: there is room in mach4 ;p
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <tschwinge> I'm not sure yet whether I'm happy with adding the RPC to
+ mach4.defs.
+ <braunr> that's the only question yes
+ <braunr> (well, no, not only)
+ <braunr> as i now have a better view of what's involved, it may make sense
+ to create a gnumach.defs file
+ <braunr> tschwinge: all right i'll create a gnumach.defs file
+ <tschwinge> braunr: Well, if there is general agreement that this is the
+ way to go.
+ <tschwinge> braunr: In that case, I guess there's no point in being more
+ fine-grained -- gnumach-vm.defs or similar -- that'd probably be
+ over-engineering. If the glibc bits for libmachuser are not
+ straight-forward, I can help with that of course.
+ <braunr> ok
+
+
+## IRC, freenode, #hurd, 2012-07-27
+
+ <braunr> tschwinge: i've pushed a patch on the gnumach page_cache branch
+ that adds a gnumach.defs interface
+ <braunr> tschwinge: if you think it's ok, i'll rewrite a formal changelog
+ so it can be applied
diff --git a/microkernel/mach/port.mdwn b/microkernel/mach/port.mdwn
index 26b55456..ccc7286f 100644
--- a/microkernel/mach/port.mdwn
+++ b/microkernel/mach/port.mdwn
@@ -1,5 +1,5 @@
-[[!meta copyright="Copyright © 2002, 2003, 2007, 2008, 2010, 2011 Free Software
-Foundation, Inc."]]
+[[!meta copyright="Copyright © 2002, 2003, 2007, 2008, 2010, 2011, 2012 Free
+Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -86,4 +86,4 @@ When a server process' thread receives from a port set, it dequeues exactly one
message from any of the ports that has a message available in its queue.
This concept of port sets is also the facility that makes convenient
-implementation of [[UNIX]]'s `select` [[system_call]] possible.
+implementation of [[UNIX's `select` system call|glibc/select]] possible.
diff --git a/microkernel/mach/rpc.mdwn b/microkernel/mach/rpc.mdwn
index 422e0441..3615fc12 100644
--- a/microkernel/mach/rpc.mdwn
+++ b/microkernel/mach/rpc.mdwn
@@ -1,5 +1,5 @@
-[[!meta copyright="Copyright © 2002, 2003, 2007, 2008, 2010, 2011 Free Software
-Foundation, Inc."]]
+[[!meta copyright="Copyright © 2002, 2003, 2007, 2008, 2010, 2011, 2012 Free
+Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -18,7 +18,7 @@ transparently. This can be implemented with user [[task]]s, but there is an
implementation in the kernel possible, too, which is called *NORMA*, but is not
available in [[GNU Mach|gnumach]].
-The RPC stub code generated by [[MIG]].
+The RPC stub code is generated by [[MIG]] to send appropriate [[message]]s.
# Tracing
diff --git a/open_issues/64-bit_port.mdwn b/open_issues/64-bit_port.mdwn
index 797d540f..2d273ba1 100644
--- a/open_issues/64-bit_port.mdwn
+++ b/open_issues/64-bit_port.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -10,7 +10,11 @@ License|/fdl]]."]]"""]]
[[!tag open_issue_gnumach open_issue_mig]]
-IRC, freenode, #hurd, 2011-10-16:
+There is a `master-x86_64` GNU Mach branch. As of 2012-11-20, it only supports
+the [[microkernel/mach/gnumach/ports/Xen]] platform.
+
+
+# IRC, freenode, #hurd, 2011-10-16
<youpi> it'd be really good to have a 64bit kernel, no need to care about
addressing space :)
@@ -34,3 +38,22 @@ IRC, freenode, #hurd, 2011-10-16:
<youpi> and it'd boost userland addrespace to 4GiB
<braunr> yes
<youpi> leaving time for a 64bit userland :)
+
+
+# IRC, freenode, #hurd, 2012-10-03
+
+ <braunr> youpi: just so you know in case you try the master-x86_64 with
+ grub
+ <braunr> youpi: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=689509
+ <youpi> ok, thx
+ <braunr> the squeeze version is fine but i had to patch the wheezy/sid one
+ <youpi> I actually hadn't hoped to boot into 64bit directly from grub
+ <braunr> youpi: there is code in viengoos that could be reused
+ <braunr> i've been thinking about it for a time now
+ <youpi> ok
+ <braunr> the two easiest ways are 1/ the viengoos one (a -m32 object file
+ converted with objcopy as an embedded loader)
+ <braunr> and 2/ establishing an identity mapping using 4x1 GB large pages
+ and switching to long mode, then jumping to c code to complete the
+ initialization
+ <braunr> i think i'll go the second way with x15, so you'll have the two :)
diff --git a/open_issues/alarm_setitimer.mdwn b/open_issues/alarm_setitimer.mdwn
index 99b2d7b6..3255683c 100644
--- a/open_issues/alarm_setitimer.mdwn
+++ b/open_issues/alarm_setitimer.mdwn
@@ -21,3 +21,11 @@ See also the attached file: on other OSes (e.g. Linux) it blocks waiting
for a signal, while on GNU/Hurd it gets a new alarm and exits.
[[alrm.c]]
+
+
+# IRC, freenode, #hurd, 2012-07-29
+
+ <braunr> our setitimer is bugged
+ <braunr> it doesn't seem to leave a timer disarmed when the interval
+ is set to 0
+ <braunr> (which means a one shot timer is actually periodic ..)
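The behaviour described above can be checked with a small POSIX test. A minimal sketch, assuming `ITIMER_REAL` and `SIGALRM`: it arms a timer whose `it_interval` is zero and counts how many times the signal arrives. The function name and the chosen delays are made up for illustration; this is not the attached `alrm.c`.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

static volatile sig_atomic_t alarm_count = 0;

static void on_alarm(int sig)
{
    (void)sig;
    alarm_count++;
}

/* Arm a one-shot ITIMER_REAL that fires after delay_ms, then wait
   wait_ms in total.  Returns the number of SIGALRMs seen, or -1 if
   setting things up failed. */
int count_one_shot_expirations(long delay_ms, long wait_ms)
{
    struct sigaction sa;
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    alarm_count = 0;

    struct itimerval timer;
    timer.it_value.tv_sec = delay_ms / 1000;
    timer.it_value.tv_usec = (delay_ms % 1000) * 1000;
    /* A zero interval requests a one-shot timer: after it expires once
       it should stay disarmed. */
    timer.it_interval.tv_sec = 0;
    timer.it_interval.tv_usec = 0;
    if (setitimer(ITIMER_REAL, &timer, NULL) != 0)
        return -1;

    /* nanosleep returns early when a signal arrives; keep sleeping for
       the remaining time so the full wait is observed. */
    struct timespec left;
    left.tv_sec = wait_ms / 1000;
    left.tv_nsec = (wait_ms % 1000) * 1000000L;
    while (nanosleep(&left, &left) != 0) {
        if (errno != EINTR)
            return -1;
    }
    return alarm_count;
}
```

With a 10 ms delay and a 100 ms wait, a conforming system is expected to deliver exactly one signal. A timer that silently turns periodic, as reported here, would show up as a larger count.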
diff --git a/open_issues/anatomy_of_a_hurd_system.mdwn b/open_issues/anatomy_of_a_hurd_system.mdwn
index 99ef170b..3e585876 100644
--- a/open_issues/anatomy_of_a_hurd_system.mdwn
+++ b/open_issues/anatomy_of_a_hurd_system.mdwn
@@ -13,7 +13,10 @@ License|/fdl]]."]]"""]]
A bunch of this should also be covered in other (introductory) material,
like Bushnell's Hurd paper. All this should be unified and streamlined.
-IRC, freenode, #hurd, 2011-03-08:
+[[!toc]]
+
+
+# IRC, freenode, #hurd, 2011-03-08
<foocraft> I've a question on what are the "units" in the hurd project, if
you were to divide them into units if they aren't, and what are the
@@ -38,9 +41,8 @@ IRC, freenode, #hurd, 2011-03-08:
<antrik> no
<antrik> servers often depend on other servers for certain functionality
----
-IRC, freenode, #hurd, 2011-03-12:
+# IRC, freenode, #hurd, 2011-03-12
<dEhiN> when mach first starts up, does it have some basic i/o or fs
functionality built into it to start up the initial hurd translators?
@@ -72,24 +74,24 @@ IRC, freenode, #hurd, 2011-03-12:
<antrik> it also does some bootstrapping work during startup, to bring the
rest of the system up
----
+
+# Source Code Documentation
Provide a cross-linked sources documentation, including generated files, like
RPC stubs.
* <http://www.gnu.org/software/global/>
----
-[[Hurd_101]].
+# [[Hurd_101]]
+
----
+# [[hurd/IO_path]]
-More stuff like [[hurd/IO_path]].
+Need more stuff like that.
----
-IRC, freenode, #hurd, 2011-10-18:
+# IRC, freenode, #hurd, 2011-10-18
<frhodes> what happens @ boot. and which translators are started in what
order?
@@ -97,9 +99,8 @@ IRC, freenode, #hurd, 2011-10-18:
ext2; ext2 starts exec; ext2 execs a few other servers; ext2 execs
init. from there on, it's just standard UNIX stuff
----
-IRC, OFTC, #debian-hurd, 2011-11-02:
+# IRC, OFTC, #debian-hurd, 2011-11-02
<sekon_> is __dir_lookup a RPC ??
<sekon_> where can i find the source of __dir_lookup ??
@@ -123,9 +124,8 @@ IRC, OFTC, #debian-hurd, 2011-11-02:
<tschwinge> sekon_: This may help a bit:
http://www.gnu.org/software/hurd/hurd/hurd_hacking_guide.html
----
-IRC, freenode, #hurd, 2012-01-08:
+# IRC, freenode, #hurd, 2012-01-08
<abique> can you tell me how is done in hurd: "ls | grep x" ?
<abique> in bash
@@ -187,7 +187,8 @@ IRC, freenode, #hurd, 2012-01-08:
<antrik> that's probably the most fundamental design feature of the Hurd
<antrik> (all filesystem operations actually, not only lookups)
-IRC, freenode, #hurd, 2012-01-09:
+
+## IRC, freenode, #hurd, 2012-01-09
<braunr> youpi: are you sure cthreads are M:N ? i'm almost sure they're 1:1
<braunr> and no modern OS is a right place for any thread userspace
@@ -266,3 +267,83 @@ IRC, freenode, #hurd, 2012-01-09:
<youpi> they help only when the threads are living
<braunr> ok
<youpi> now as I said I don't have to talk much more, I have to leave :)
+
+
+# IRC, freenode, #hurd, 2012-12-06
+
+ <braunr> spiderweb: have you read
+ http://www.gnu.org/software/hurd/hurd-paper.html ?
+ <spiderweb> I'll have a look.
+ <braunr> and also the beginning of
+ http://ftp.sceen.net/mach/mach_a_new_kernel_foundation_for_unix_development.pdf
+ <braunr> these two should provide a good look at the big picture the hurd
+ attempts to achieve
+ <Tekk_> I can't help but wonder though, what advantages were really
+ achieved with early mach?
+ <Tekk_> weren't they just running a monolithic unix server like osx does?
+ <braunr> most mach-based systems were
+ <braunr> but thanks to that, they could provide advanced features over
+ other well established unix systems
+ <braunr> while also being compatible
+ <Tekk_> so basically it was just an ease of development thing
+ <braunr> well that's what mach aimed at being
+ <braunr> same for the hurd
+ <braunr> making things easy
+ <Tekk_> but as a side effect hurd actually delivers on the advantages of
+ microkernels aside from that, but the older systems wouldn't, correct?
+ <braunr> that's how there could be network file systems in very short time
+ and very scarce resources (i.e. developers working on it), while on other
+ systems it required a lot more to accomplish that
+ <braunr> no, it's not a side effect of the microkernel
+ <braunr> the hurd retains and extends the concept of flexibility introduced
+ by mach
+ <Tekk_> the improved stability, etc. isn't a side effect of being able to
+ restart generally thought of as system-critical processes?
+ <braunr> no
+ <braunr> you can't restart system critical processes on the hurd either
+ <braunr> that's one feature of minix, and they worked hard on it
+ <Tekk_> ah, okay. so that's currently just the domain of minix
+ <Tekk_> okay
+ <Tekk_> spiderweb: well, there's 1 advantage of minix for you :P
+ <braunr> the main idea of mach is to make it easy to extend unix
+ <braunr> without having hundreds of system calls
+ <braunr> the hurd keeps that and extends it by making many operations
+ unprivileged
+ <braunr> you don't need special code for kernel modules any more
+ <braunr> it's easy
+ <braunr> you don't need special code to handle suid bits and other ugly
+ similar hacks,
+ <braunr> it's easy
+ <braunr> you don't need fuse
+ <braunr> easy
+ <braunr> etc..
+
+
+# IRC, freenode, #hurd, 2012-12-06
+
+ <spiderweb> what is the #1 feature that distinguished hurd from other
+ operating systems. the concept of translators. (will read more when I get
+ more time).
+ <braunr> yes, translators
+ <braunr> using the VFS as a service directory
+ <braunr> and the VFS permissions to control access to those services
+
+
+# IRC, freenode, #hurd, 2012-12-10
+
+ <spiderweb> I want to work on hurd, but I think I'm going to start with
+ minix, I own the minix book 3rd ed. it seems like a good intro to
+ operating systems in general. like I don't even know what a semaphore is
+ yet.
+ <braunr> well, enjoy learning :)
+ <spiderweb> once I finish that book, what reading do you guys recommend?
+ <spiderweb> other than the wiki
+ <braunr> i wouldn't recommend starting with a book that focuses on one
+ operating system anyway
+ <braunr> you tend to think in terms of what is done in that specific
+ implementation and compare everything else to that
+ <braunr> tanenbaum is not only the main author of minix, but also the one
+ of the book http://en.wikipedia.org/wiki/Modern_Operating_Systems
+ <braunr>
+ http://en.wikipedia.org/wiki/List_of_important_publications_in_computer_science#Operating_systems
+ should be a pretty good list :)
diff --git a/open_issues/arm_port.mdwn b/open_issues/arm_port.mdwn
new file mode 100644
index 00000000..2d8b9038
--- /dev/null
+++ b/open_issues/arm_port.mdwn
@@ -0,0 +1,238 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+Several people have expressed interest in a port of GNU/Hurd for the ARM
+architecture.
+
+
+# IRC, freenode, #hurd, 2012-10-09
+
+ <mcsim> bootinfdsds: There was an unfinished port to arm, if you're
+ interested.
+ <tschwinge> mcsim: Has that ever been published?
+ <mcsim> tschwinge: I don't think so. But I have an email of that person and
+ I think that this could be discussed with him.
+
+
+## IRC, freenode, #hurd, 2012-10-10
+
+ <tschwinge> mcsim: If you have a contact to the ARM porter, could you
+ please ask him to post what he has?
+ <antrik> tschwinge: we all have the "contact" -- let me remind you that he
+ posted his questions to the list...
+
+
+## IRC, freenode, #hurd, 2012-10-17
+
+ <mcsim> tschwinge: Hello. The person who I wrote regarding arm port of
+ gnumach still hasn't answered. And I don't think that he is going to
+ answer.
+
+
+# IRC, freenode, #hurd, 2012-11-15
+
+ <matty3269> Well, I have a big interest in the ARM architecture, I worked
+ at ARM for a bit too, and I've written my own little OS that runs on
+ qemu. Is there an interest in getting hurd running on ARM?
+ <braunr> matty3269: not really currently
+ <braunr> but if that's what you want to do, sure
+ <tschwinge> matty3269: Well, interest -- sure!, but we don't really have
+ people savvy in low-level kernel implementation on ARM. I do know some
+ bits about it, but more about the instruction set than about its memory
+ architecture, for example.
+ <tschwinge> matty3269: But if you're feeling adventurous, by all means work
+ on it, and we'll try to help as we can.
+ <tschwinge> matty3269: There has been one previous attempt for an ARM port,
+ but that person never published his code, and apparently moved to a
+ different project.
+ <tschwinge> matty3269: I can help with toolchains (GCC, etc.) things for
+ ARM, if there's need.
+ <matty3269> tschwinge: That sounds great, thanks! Where would you recommend
+ I start (at the moment I've got Mach checked out and am trying to get it
+ compiled for i386)
+ <matty3269> I'm guessing that the Mach micro-kernel is all that would need
+ to be ported or are there arch-dependent bits of code in the server
+ processes?
+ <tschwinge> matty3269:
+ http://www.gnu.org/software/hurd/faq/system_port.html has some
+ information. Mach is the biggest part, yes. Then some bits in glibc and
+ libpthread, and even less in the Hurd libraries and servers.
+ <tschwinge> matty3269: Basically, you'd need equivalents for the i386 (and
+ similar) directories, yep.
+ <tschwinge> Though, you may be able to avoid some cruft in there.
+ <tschwinge> Does building for x86 have any issues?
+ <tschwinge> matty3269: How is generally your understanding of the Hurd on
+ Mach system architecture, and on microkernel-based systems generally, and
+ on Mach in particular?
+ <matty3269> tschwinge: yes, it seems to be progressing... I've got mig
+ installed and it's just compiling now
+ <matty3269> hmm, not too great if I'm honest, I've done mostly monolithic
+ kernel development so having such low-level processes, such as
+ scheduling, done in user-space seems a little strange
+ <tschwinge> Ah, yes, MIG will need a little bit of porting, too. I can
+ help with that, but that's not a priority -- first you have to get Mach
+ to boot at all; MIG will only be needed once you need to deal with RPCs,
+ so user-land/kernel interaction, basically. Before, you can hack around
+ it.
+ <matty3269> tschwinge: I have been running a GNU/Hurd system for a while
+ now though
+ <tschwinge> I'm happy to tell you that the scheduler is still in the
+ kernel. ;-)
+ <tschwinge> OK, good, so you know about the basic ideas.
+ <braunr> matty3269: there has to be machine specific stuff in user space
+ <braunr> for initial thread contexts for example
+ <matty3269> tschwinge: Ok, just got gnumach built
+ <braunr> but there isn't much and you can easily base your work from the
+ x86 implementation
+ <tschwinge> Yes. Mach itself is the more difficult one.
+ <matty3269> braunr: Yeah, looking around at things, it doesn't seem that
+ there will be too much work involved in the user-space stuff
+ <tschwinge> braunr: Do you know off-hand whether there are some old Mach
+ research papers describing architecture ports?
+ <tschwinge> I know there are some describing the memory system (obviously),
+ and I/O system -- which may help matty3269 to understand the general
+ design/structure.
+ <tschwinge> We might want to identify some documents, and make a list.
+ <braunr> all mach related documentation i have is available here:
+ ftp://ftp.sceen.net/mach/
+ <braunr> (also through http://)
+ <tschwinge> matty3269: Oh, definitely I'd suggest the Mach 3 Kernel
+ Principles book. That gives a good description of the Mach architecture.
+ <matty3269> Great, that's my weekend's reading then!
+ <braunr> you don't need all that for a port
+ <matty3269> Is it possible to run the gnumach binary standalone with qemu?
+ <braunr> you won't go far with it
+ <braunr> you really need at least one program
+ <braunr> but sure, for a port development, it can easily be done
+ <braunr> i'd suggest writing a basic static application for your tests once
+ you reach an advanced state
+ <braunr> the critical parts of a port are memory and interrupts
+ <braunr> and memory can be particularly difficult to implement correctly
+ <tschwinge> matty3269: I once used QEMU's
+ virtual-FAT-filesystem-from-a-directory-on-the-host, and configured GRUB
+ to boot from that one, so it was easy to quickly reboot for kernel
+ development.
+ <braunr> but the good news is that almost every bsd system still uses a
+ similar interface
+ <tschwinge> matty3269: And, you may want to become familiar with QEMU's
+ built-in gdbserver, and how to connect to and use that.
+ <braunr> so, for example, you could base your work from the netbsd/arm pmap
+ module
+ <tschwinge> matty3269: I think that's better than starting on real
+ hardware.
+ <braunr> tschwinge: you can use -kernel with a multiboot binary now
+ <braunr> tschwinge: and even creating iso images is so fast it's not any
+ slower
+ <tschwinge> braunr: Yeah, I thought so, but never checked this out --
+ recently I saw in qemu --help's output some »multiboot« thing flashing
+ by. :-)
+ <braunr> i think it only supports 32-bits executables though
+ <matty3269> braunr: Yeah, I just tried passing gnumach as the -kernel
+ parameter to qemu, but it segged qemu :S
+ <braunr> otherwise i'd be using it for x15
+ <matty3269> qemu: fatal: Trying to execute code outside RAM or ROM at
+ 0xc0100000
+ <braunr> how much ram did you give qemu ?
+ <matty3269> I used '-m 512'
+ <braunr> hum, so the -kernel option doesn't correctly implement elf loading
+ or something like that
+ <braunr> anyway, i'm not sure how well building gnumach on a non-hurd
+ system is supported
+ <braunr> so you may want to simply develop inside your VM for the time
+ being, and reboot
+ <matty3269> doing an objdump of it seems fine...
+ <braunr> ?
+ <braunr> ah, the gnumach executable is a correct elf image
+ <braunr> that's not the point
+ <matty3269> Is there particular reason that mach is linked at 0xc0100000?
+ <matty3269> or is that where it is expected to be in VM?
+ <tschwinge> That's my understanding.
+ <braunr> kernels commonly sit at high addresses
+ <braunr> that's the "standard" 3G/1G split for user/kernel space
+ <matty3269> I think Linux sits at a similar VA for 32-bit
+ <braunr> no
+ <matty3269> Oh, I thought it did, I know it does on ARM, the kernel is
+ mapped to 0xc000000
+ <braunr> i don't know arm, but are you sure about this number ?
+ <braunr> seems to lack a 0
+ <matty3269> Ah, yes sorry
+ <matty3269> so 0xC0000000
+ <braunr> 0xc0100000 is just 1 MiB above it
+ <braunr> the .text section of linux on x86 actually starts at c1000000
+ (above 16 MiB, certainly to preserve as much dma-able memory since modern
+ machines now have a lot more)
+ <tschwinge> Surely the GRUB multiboot loader is not that much used/tested?
+ <braunr> unfortunately, no
+ <braunr> matty3269: FYI, my kernel starts at 0xfff00000 :p
+ <matty3269> braunr: hmm, you could be right, I know it's around there
+ someone
+ <matty3269> somewhere*
+ <matty3269> braunr: that's an interesting address :S
+ <matty3269> braunr: is that the PA address of the kernel or the VA inside a
+ process?
+ <braunr> the VA
+ <matty3269> hmm
+ <braunr> it can't be a PA
+ <braunr> such high addresses are normally device memory
+ <braunr> but don't worry, i have good reasons to have chosen this address
+ :)
+ <matty3269> so with gnumach, does the boot-up sequence use PIC until VM is
+ active and the kernel mapped to the linking address?
+ <braunr> no
+ <braunr> actually i'm not certain of the details
+ <braunr> but there is no PIC
+ <braunr> either special sections are linked at physical addresses
+ <braunr> or it relies on the fact that all executable code uses near jumps
+ <braunr> and uses offsets when accessing data
+ <braunr> (which is why the kernel text is at 3 GiB + 1 MiB, and not 3 GiB)
+ <matty3269> hmm,
+ <matty3269> gah, I need to learn x86
+ <braunr> that would certainly help
+ <matty3269> I've just had a look at boothdr.S; I presume that there must be
+ something else that is executed before this to setup VM, switch to 32-bit
+ more etc...?
+ <matty3269> mode*
+ <braunr> have a look at the multiboot specification
+ <braunr> it sets protected mode
+ <braunr> but not paging
+ <braunr> (i mean, the boot loader does, before passing control to the
+ kernel)
+ <matty3269> Ah, I see
+ <tschwinge> matty3269: Multiboot should be documented in the GRUB package.
+ <matty3269> tschwinge: yep, got that, thanks
+ <matty3269> hmm, I can't find any reference to CR0 in gnumach so paging
+ must be enabled elsewhere
+ <matty3269> oh wait, found it
+ <braunr> $ git grep -i '\<cr0\>'
+ <braunr> i386/i386/proc_reg.h, linux/dev/include/asm-i386/system.h
+ <braunr> although i suspect only the first one is relevant to us :)
+ <matty3269> Yeah, that seems to have the setup code for paging :)
+ <matty3269> I'm still confused how it could run that without paging or PIC
+ though
+ <matty3269> I think I need to watch the boot sequence with qemu
+ <braunr> it's a bit tricky
+ <braunr> but actually simple
+ <braunr> 00:44 < braunr> either special sections are linked at physical
+ addresses
+ <braunr> 00:44 < braunr> or it relies on the fact that all executable code
+ uses near jumps
+ <braunr> that's really all there is
+ <braunr> but you shouldn't worry about that i suppose, as the protocol
+ between the boot loader and an arm kernel will certainly not be the saem
+ <braunr> same*
+ <matty3269> indeed, ARM is tricky because memory maps are vastly different
+ on every platform
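The address arithmetic from this discussion, written out in C. The 3 GiB base and the 1 MiB offset are the figures quoted in the log. The constant-offset translation between virtual and physical addresses is an assumption made for this sketch and is not confirmed for gnumach; all names are hypothetical.

```c
#include <stdint.h>

/* Start of kernel space in the 3G/1G user/kernel split mentioned above. */
#define KERNEL_VIRT_BASE UINT32_C(0xC0000000)

/* One mebibyte. */
#define MIB (UINT32_C(1) << 20)

/* Assumed constant-offset mapping: a kernel virtual address maps to the
   physical address obtained by subtracting the base. */
uint32_t kernel_virt_to_phys(uint32_t virt)
{
    return virt - KERNEL_VIRT_BASE;
}

/* Inverse of the mapping above. */
uint32_t kernel_phys_to_virt(uint32_t phys)
{
    return phys + KERNEL_VIRT_BASE;
}
```

Under that assumption the link address 0xc0100000 corresponds to physical address 1 MiB, which matches the "3 GiB + 1 MiB" remark. Code that only uses relative jumps can run at the physical address before paging is enabled and at the linked address afterwards.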
+
+
+## IRC, freenode, #hurd, 2012-11-21
+
+ <matty3269> Well, I have an ARM gnumach kernel compiled. It just doesn't
+ run! :)
+ <braunr> matty3269: good luck :)
diff --git a/open_issues/automatic_backtraces_when_assertions_hit.mdwn b/open_issues/automatic_backtraces_when_assertions_hit.mdwn
index 1cfacaf5..f6bf5856 100644
--- a/open_issues/automatic_backtraces_when_assertions_hit.mdwn
+++ b/open_issues/automatic_backtraces_when_assertions_hit.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -10,9 +10,70 @@ License|/fdl]]."]]"""]]
[[!tag open_issue_glibc]]
-IRC, unknown channel, unknown date.
+
+# IRC, unknown channel, unknown date
<azeem> tschwinge: ext2fs.static: thread-cancel.c:55: hurd_thread_cancel: Assertion `! __spin_lock_locked (&ss->critical_section_lock)' failed.
<youpi> it'd be great if we could have backtraces in such case
<youpi> at least just the function names
<youpi> and in this case (static), just addresses would be enough
+
+
+# IRC, freenode, #hurd, 2012-07-19
+
+In context of the [[ext2fs_libports_reference_counting_assertion]].
+
+ <braunr> pinotree: tschwinge: do you know if our packages are built with
+ -rdynamic ?
+ <pinotree> braunr: debian's cflags don't include it, so unless the upstream
+ build systems do, -rdynamic is not added
+ <braunr> i doubt glibc' backtrace() is able to find debugging symbol files
+ on its own
+ <pinotree> what do you mean?
+ <braunr> the port reference bug youpi noticed is rare
+ <pinotree> even on linux, a program compiled with normal optimizations (eg
+ -O2 -g) can give just pointer values in backtrace()'s output
+ <braunr> core dumps are unreliable at best
+
+[[crash_server]].
+
+ <braunr> uh, no, backtrace does give names
+ <braunr> but not with -fomit-frame-pointer
+ <braunr> unless the binary is built with -rdynamic
+ <braunr> at least it used to
+ <pinotree> not really, when being optimized some steps can be optimized
+ away (eg inlines)
+ <braunr> that's ok
+ <braunr> anyway, the point is i'd like a way that can give us as much
+ information as possible when the problem happens
+ <braunr> the stack trace being the most useful imo
+ <pinotree> do you face issues currently with backtrace()?
+ <braunr> not tried yet
+ <braunr> i guess i could make the application trap in the kernel, and fault
+ there, so we can attach gdb while still in the pager address space :>
+ <pinotree> that would imply the need for interactivity when the fault
+ happens, wouldn't it?
+ <braunr> no
+ <braunr> it would remain this way until someone comes, hours, days later
+ <braunr> pinotree: well ok, it would require interactivity, but not *when*
+ it happens ;p
+ <braunr> pinotree: right, it needs -rdynamic
+
+
+## IRC, freenode, #hurd, 2012-07-21
+
+ <braunr> tschwinge: my current "approach" is to introduce an infinite loop
+ <braunr> it makes the faulting task mapped in often enough to use gdb
+ through qemu
+ <braunr> ... :)
+ <tschwinge> My understanding is that glibc already does have some mechanism
+ for that: I have seen it print backtraces when detecting malloc
+ inconsistencies (double free and the like).
+ <braunr> yes, i thought it used the backtrace functions internally though
+ <braunr> that is, execinfo
+ <braunr> but this does require -rdynamic
+
+
+# GCC's libbacktrace
+
+Introduced in commit ecd3459e7bb829202601e3274411135a15c64dde.
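As a minimal sketch of the `backtrace()` approach discussed in the log above: glibc's `execinfo.h` functions `backtrace` and `backtrace_symbols_fd` can print the call stack at the point where a check fails. The names `dump_backtrace` and `CHECK_OR_BACKTRACE` are hypothetical helpers, not part of glibc or the Hurd. As noted in the discussion, function names are only resolved for symbols in the dynamic symbol table, so the binary would likely need to be linked with `-rdynamic`; otherwise (for example in a static binary) only addresses are printed.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Hypothetical helper: write the current call stack to stderr and
   return the number of frames captured.  backtrace_symbols_fd() is
   used instead of backtrace_symbols() because it does not call
   malloc(), which matters if the heap is already corrupted.  */
int dump_backtrace(void)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);

    backtrace_symbols_fd(frames, count, STDERR_FILENO);
    return count;
}

/* Hypothetical assert-like macro: on failure, print the failed
   expression and a backtrace, then abort as assert() would.  */
#define CHECK_OR_BACKTRACE(expr)                                  \
    do {                                                          \
        if (!(expr)) {                                            \
            fprintf(stderr, "%s:%d: check `%s' failed\n",         \
                    __FILE__, __LINE__, #expr);                   \
            dump_backtrace();                                     \
            abort();                                              \
        }                                                         \
    } while (0)
```

A build along the lines of `gcc -g -rdynamic file.c` should then show function names instead of bare addresses; inlined functions will still be missing, as pinotree points out.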
diff --git a/open_issues/binutils.mdwn b/open_issues/binutils.mdwn
index 8d6b3a94..eec5154f 100644
--- a/open_issues/binutils.mdwn
+++ b/open_issues/binutils.mdwn
@@ -123,20 +123,22 @@ sources|source_repositories/binutils]], run on kepler.SCHWINGE and
coulomb.SCHWINGE.
$ export LC_ALL=C
- $ ../master/configure --prefix="$PWD".install SHELL=/bin/dash CC=gcc-4.6 CXX=g++-4.6 2>&1 | tee log_build
+ $ ../master/configure --prefix="$PWD".install --with-sysroot=/ SHELL=/bin/dash CC=gcc-4.6 CXX=g++-4.6 2>&1 | tee log_build
[...]
$ make 2>&1 | tee log_build_
[...]
Different hosts may default to different shells and compiler versions; thus
-harmonized.
+harmonized. Debian GCC (which is used in binutils' testsuite) likes to pass
+`--sysroot=/` to `ld`, so we need to configure binutils with support for
+sysroots.
This takes up around 120 MiB, and needs roughly 4 min on kepler.SCHWINGE and
15 min on coulomb.SCHWINGE.
<!--
- $ (make && touch .go-install) 2>&1 | tee log_build_ && test -f .go-install && (make install && touch .go-check) 2>&1 | tee log_install && test -f .go-check && make -k check 2>&1 | tee log_check
+ $ (make && touch .go-install) 2>&1 | tee log_build_ && test -f .go-install && (make install && touch .go-test) 2>&1 | tee log_install && test -f .go-test && make -k check 2>&1 | tee log_test
-->
@@ -147,9 +149,7 @@ x86 GNU/Linux' and GNU/Hurd's configurations are slightly different, thus mask
out most of the differences that are due to GNU/Linux supporting more core file
formats, and more emulation vectors.
- $ ssh kepler.SCHWINGE 'cd tmp/source/binutils/ && cat hurd/master.build/log_build* | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/binutils/linux/log_build
- $ ssh coulomb.SCHWINGE 'cd tmp/binutils/ && cat hurd/master.build/log_build* | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/binutils/hurd/log_build
- $ diff -wu <(sed -f toolchain/logs/binutils/linux/log_build.sed < toolchain/logs/binutils/linux/log_build) <(sed -f toolchain/logs/binutils/hurd/log_build.sed < toolchain/logs/binutils/hurd/log_build) > toolchain/logs/binutils/log_build.diff
+ $ toolchain/logs/process binutils build
# Install
@@ -163,9 +163,7 @@ min on coulomb.SCHWINGE.
## Analysis
- $ ssh kepler.SCHWINGE 'cd tmp/source/binutils/ && cat hurd/master.build/log_install | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/binutils/linux/log_install
- $ ssh coulomb.SCHWINGE 'cd tmp/binutils/ && cat hurd/master.build/log_install | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/binutils/hurd/log_install
- $ diff -wu <(sed -f toolchain/logs/binutils/linux/log_install.sed < toolchain/logs/binutils/linux/log_install) <(sed -f toolchain/logs/binutils/hurd/log_install.sed < toolchain/logs/binutils/hurd/log_install) > toolchain/logs/binutils/log_install.diff
+ $ toolchain/logs/process binutils install
* `libtool: finish`: `ldconfig` is not run for the Hurd.
@@ -177,13 +175,11 @@ min on coulomb.SCHWINGE.
This needs roughly 3 min on kepler.SCHWINGE and 13 min on coulomb.SCHWINGE.
- $ ssh kepler.SCHWINGE 'cd tmp/source/binutils/ && cat hurd/master.build/*/*.sum hurd/master.build/*/*/*.sum | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/binutils/linux/sum
- $ ssh coulomb.SCHWINGE 'cd tmp/binutils/ && cat hurd/master.build/*/*.sum hurd/master.build/*/*/*.sum | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/binutils/hurd/sum
- $ diff -u -F ^Running toolchain/logs/binutils/linux/sum toolchain/logs/binutils/hurd/sum > toolchain/logs/binutils/sum.diff
-
## Analysis
+ $ toolchain/logs/process binutils test
+
* <a name="static"><!-- stable_URL -->`FAIL: static [...]`</a>
The testsuite isn't prepared for using `crt0.o` instead of `crt1.o`
diff --git a/open_issues/bpf.mdwn b/open_issues/bpf.mdwn
index e24d761b..02dc7f87 100644
--- a/open_issues/bpf.mdwn
+++ b/open_issues/bpf.mdwn
@@ -585,3 +585,11 @@ This is a collection of resources concerning *Berkeley Packet Filter*s.
in libpcap, and let users of that library benefit from it
<braunr> instead of implementing the low level bpf interface, which
nonetheless has some system-specific variants ..
+
+
+## IRC, freenode, #hurd, 2012-08-03
+
+In context of the [[select]] issue.
+
+ <braunr> i understand now why my bpf translator was so buggy
+ <braunr> the condition_timedwait i wrote at the time was .. incomplete :)
diff --git a/open_issues/code_analysis.mdwn b/open_issues/code_analysis.mdwn
index 00915651..2ab8bf1d 100644
--- a/open_issues/code_analysis.mdwn
+++ b/open_issues/code_analysis.mdwn
@@ -42,10 +42,13 @@ There is a [[!FF_project 276]][[!tag bounty]] on some of these tasks.
"1123688017.3905.22.camel@buko.sinrega.org"]]. This could be checked by a
static analysis tool.
- * [Static Source Code Analysis Tools for C](http://spinroot.com/static/)
-
* [[!wikipedia List_of_tools_for_static_code_analysis]]
+ * [Engineering zero-defect software](http://esr.ibiblio.org/?p=4340), Eric
+ S. Raymond, 2012-05-13
+
+ * [Static Source Code Analysis Tools for C](http://spinroot.com/static/)
+
* [Cppcheck](http://sourceforge.net/apps/mediawiki/cppcheck/)
For example, [Debian's hurd_20110319-2
@@ -59,13 +62,9 @@ There is a [[!FF_project 276]][[!tag bounty]] on some of these tasks.
* <http://www.google.com/search?q=coccinelle+analysis>
- * clang
-
- * <http://www.google.com/search?q=clang+analysis>
+ * [clang](http://www.google.com/search?q=clang+analysis)
- * Linux' sparse
-
- * <https://sparse.wiki.kernel.org/>
+ * [Linux' sparse](https://sparse.wiki.kernel.org/)
* <http://klee.llvm.org/>
@@ -83,13 +82,34 @@ There is a [[!FF_project 276]][[!tag bounty]] on some of these tasks.
* [sixgill](http://sixgill.org/)
+ * [s-spider](http://code.google.com/p/s-spider/)
+
+ * [CIL (C Intermediate Language)](http://kerneis.github.com/cil/)
+
+ * [Frama-C](http://frama-c.com/)
+
* [Coverity](http://www.coverity.com/) (nonfree?)
+ * [Splint](http://www.splint.org/)
+
+ * IRC, freenode, #hurd, 2011-12-04
+
+ <mcsim> has anyone used splint on hurd?
+ <mcsim> this is tool for statically checking C programs
+ <mcsim> seems I made it work
+
# Dynamic
* [[community/gsoc/project_ideas/Valgrind]]
+ * glibc's `libmcheck`
+
+ * Used by GDB, for example.
+
+ * Is not thread-safe, [[!sourceware_PR 6547]], [[!sourceware_PR 9939]],
+ [[!sourceware_PR 12751]], [[!stackoverflow_question 314931]].
+
* <http://en.wikipedia.org/wiki/Electric_Fence>
* <http://sourceforge.net/projects/duma/>
@@ -98,18 +118,25 @@ There is a [[!FF_project 276]][[!tag bounty]] on some of these tasks.
* <https://wiki.ubuntu.com/CompilerFlags>
- * IRC, freenode, #glibc, 2011-09-28
+ * `MALLOC_CHECK_`/`MALLOC_PERTURB_`
- <vsrinivas> two things you can do -- there is an environment variable
- (DEBUG_MALLOC_ iirc?) that can be set to 2 to make ptmalloc (glibc's
- allocator) more forceful and verbose wrt error checking
- <vsrinivas> another is to grab a copy of Tor's source tree and copy out
- OpenBSD's allocator (its a clearly-identifyable file in the tree);
- LD_PRELOAD it or link it into your app, it is even more aggressive
- about detecting memory misuse.
- <vsrinivas> third, Red hat has a gdb python plugin that can instrument
- glibc's heap structure. its kinda handy, might help?
- <vsrinivas> MALLOC_CHECK_ was the envvar you want, sorry.
+ * IRC, freenode, #glibc, 2011-09-28
+
+ <vsrinivas> two things you can do -- there is an environment
+ variable (DEBUG_MALLOC_ iirc?) that can be set to 2 to make
+ ptmalloc (glibc's allocator) more forceful and verbose wrt error
+ checking
+ <vsrinivas> another is to grab a copy of Tor's source tree and copy
+ out OpenBSD's allocator (its a clearly-identifyable file in the
+ tree); LD_PRELOAD it or link it into your app, it is even more
+ aggressive about detecting memory misuse.
+ <vsrinivas> third, Red hat has a gdb python plugin that can
+ instrument glibc's heap structure. its kinda handy, might help?
+ <vsrinivas> MALLOC_CHECK_ was the envvar you want, sorry.
+
+ * [`MALLOC_PERTURB_`](http://udrepper.livejournal.com/11429.html)
+
+ * <http://git.fedorahosted.org/cgit/initscripts.git/diff/?id=deb0df0124fbe9b645755a0a44c7cb8044f24719>
* In context of [[!message-id
"1341350006-2499-1-git-send-email-rbraun@sceen.net"]]/the `alloca` issue
@@ -125,6 +152,12 @@ There is a [[!FF_project 276]][[!tag bounty]] on some of these tasks.
<youpi> ah, no, the libthreads code properly sets the guard, just for
grow-up stacks
+ * GCC's AddressSanitizer (ASan; `-faddress-sanitizer`)
+
+ [Finding races and memory errors with GCC instrumentation
+ (AddressSanitizer)](http://gcc.gnu.org/wiki/cauldron2012#Finding_races_and_memory_errors_with_GCC_instrumentation_.28AddressSanitizer.29),
+ GNU Tools Cauldron 2012.
+
* Input fuzzing
Not a new topic; has been used (and a paper published) for early UNIX
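The `MALLOC_PERTURB_` setting listed above can also be requested from inside a program. The following is a minimal sketch, assuming glibc's malloc: `mallopt` with `M_PERTURB` is the documented programmatic counterpart of the environment variable. The wrapper name `enable_malloc_perturb` is hypothetical, and other allocators may not offer this option.

```c
#include <malloc.h>

/* Hypothetical wrapper: ask glibc's allocator to fill newly allocated
   and freed memory with byte patterns derived from `value`, which
   tends to make use of uninitialized or freed memory more visible.
   Returns 1 on success and 0 on error, as mallopt() does.  */
int enable_malloc_perturb(int value)
{
    return mallopt(M_PERTURB, value);
}
```

Calling `enable_malloc_perturb(0xA5)` early in `main` is roughly what setting `MALLOC_PERTURB_=165` in the environment is meant to achieve, without touching the way the program is started.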
diff --git a/open_issues/code_analysis/discussion.mdwn b/open_issues/code_analysis/discussion.mdwn
index f8a0657d..7ac3beb1 100644
--- a/open_issues/code_analysis/discussion.mdwn
+++ b/open_issues/code_analysis/discussion.mdwn
@@ -15,17 +15,6 @@ License|/fdl]]."]]"""]]
# IRC, freenode, #hurd, 2011-12-04
- <mcsim> has anyone used splice on hurd?
- <mcsim> splice -> splint
- <youpi> not that I know of
- <mcsim> this is tool for statically checking C programs
- <mcsim> seems I made it work
- <braunr> hm i realli i personnally dislike such tools a lot, but sometimes
- it might help
- <braunr> hello hurd people
- <mcsim> braunr: hello
- <braunr> mcsim: duma may be helpful as replacement for the memcheck part of
- valgrind
<mcsim> defpager uses it's own dynamic memory allocator, which uses
vm_allocate/vm_deallocate as backing store? Am I able to use duma in such
case?
@@ -38,7 +27,25 @@ License|/fdl]]."]]"""]]
<mcsim> yes, wired memory
<braunr> you'll have to change that in duma then
<braunr> but apart from such details, it should be straightforward
+
<antrik> braunr: I have no idea about duma; but if you think it's a useful
tool, please add it to open_issues/code_analysis.mdwn
<antrik> (I guess we should have a "proper" page listing useful debugging
tools...)
+
+
+## IRC, freenode, #hurd, 2012-09-03
+
+ <mcsim> hello. Has anyone tried some memory debugging tools like duma or
+ dmalloc with hurd?
+ <braunr> mcsim: yes, but i couldn't
+ <braunr> i tried duma, and it crashes, probably because of cthreads :)
+
+
+## IRC, freenode, #hurd, 2012-09-08
+
+ <mcsim> hello. What static analyzer would you suggest (probably you have
+ tried it for hurd already)?
+ <braunr> mcsim: if you find some good free static analyzer, let me know :)
+ <pinotree> a simple one is cppcheck
+ <mcsim> braunr: I'm choosing now between splint and adlint
diff --git a/open_issues/console_tty1.mdwn b/open_issues/console_tty1.mdwn
new file mode 100644
index 00000000..614c02c9
--- /dev/null
+++ b/open_issues/console_tty1.mdwn
@@ -0,0 +1,151 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+Seen in context of [[libpthread]], but probably not directly related to it.
+
+
+# IRC, freenode, #hurd, 2012-08-30
+
+ <gnu_srs> Do you also experience a frozen hurd console?
+ <braunr> yes
+ <braunr> i didn't check but i'm almost certain it's a bug in my branch
+ <braunr> the replacement of condition_implies was a bit hasty in some
+ places
+ <braunr> this is why i want to rework it separately
+
+
+## IRC, freenode, #hurd, 2012-09-03
+
+ <gnu_srs> braunr: Did you find the cause of the Hurd console freeze for
+ your libpthread branch?
+ <braunr> gnu_srs: like i said, a bug
+ <braunr> probably in the replacement of condition_implies
+ <braunr> i rewrote that part in libpipe and it no works
+ <braunr> now*
+
+ <braunr> gnu_srs: the packages have been updated
+ <braunr> and these apparently fix the hurd console issue correctly
+
+## IRC, freenode, #hurd, 2012-09-04
+
+ <braunr> gnu_srs: this hurd console problem isn't fixed
+ <braunr> it seems to be due to a race condition that only affects the first
+ console
+ <braunr> and by reading the code i can't see how it can even work oO
+ <gnu_srs> braunr: just rebooted, tty1 is still locked, tty2-6 works. And
+ the floppy error stays (maybe a kvm bug??)
+ <braunr> the floppy error is probably a kvm bug as we discussed
+ <braunr> the tty1 locked isn't
+ <braunr> i have it too
+ <braunr> it seems to be a bug in the hurd console server
+ <braunr> which is started by tty1, but for some reason, doesn't return a
+ valid answer at init time
+ <braunr> if you kill the term handling tty1, you'll see your first tty
+ starts working
+ <braunr> for now i'll try a hack that starts the hurd console server before
+ the clients
+ <braunr> doesn't work eh
+ <braunr> tty1 is the only one started before runttys
+ <braunr> indeed, fixing /etc/hurd/runsystem.gnu so that it doesn't touch
+ tty1 fixes the problem
+ <gnu_srs> do you have an explanation?
+ <braunr> not really no
+ <braunr> but it will do for now
+ <pinotree> samuel added that with the comment above, apparently to
+ workaround some other issue of the hurd console
+ <braunr> i'm pretty sure the bug is already visible with cthreads
+ <braunr> the first console always seems weird compared to the others
+ <braunr> with a login: at the bottom of the screen
+ <braunr> didn't you notice ?
+ <pinotree> sometimes, but not often
+ <braunr> typical of a race
+ <pinotree> (at least for me)
+ <braunr> pthreads being slightly slower exposes it
+ <gnu_srs> confirmed, it works by commenting out touch /dev/tty1
+ <gnu_srs> yes, the login is at the bottom of the screen, sometimes one in
+ the upper part too:-/
+ <braunr> so we have a new open issue
+ <braunr> hm
+ <braunr> exiting the first tty doesn't work
+ <braunr> which makes me think of the issue we have with screen
+ <gnu_srs> confirmed!
+ <braunr> also, i don't understand why we have getty on tty1, but nothing on
+ the other terminals
+ <braunr> something is really wrong with terminals on hurd *sigh*
+ <braunr> ah, the problem looks like it happens when getty attempts to
+ handle a terminal !
+ <braunr> gnu_srs: anyway, i don't think it should be blocking for the
+ conversion to pthreads
+ <braunr> but it would be better if someone could assign himself that bug
+ <braunr> :)
+
+
+## IRC, freenode, #hurd, 2012-09-05
+
+ <antrik> braunr: the login at the bottom of the screen is from the Mach
+ console I believe
+ <braunr> antrik: well maybe, but it shouldn't be there anyway
+ <antrik> braunr: why not?
+ <antrik> it's confusing, but perfectly correct as far as I can tell
+ <braunr> antrik: two login: on the same screen ?
+ <braunr> antrik: it's even more confusing when comparing with other ttys
+ <antrik> I mean it's correct from a technical point of view... I'm not
+ saying it's helpful for the user ;-)
+ <braunr> i'm not even sure it's correct
+ <braunr> i've double checked the pthreads patch and didn't see anything
+ wrong there
+ <antrik> perhaps the startup of the Hurd console could be delayed a bit to
+ make sure it happens after the Mach console login is done printing
+ stuff...
+ <braunr> why are our gettys stubs ?
+ <antrik> I never understood the point of a getty TBH...
+ <braunr> well you need to communicate to something behind your terminal,
+ don't you ?
+ <braunr> with*
+ <antrik> why not just launch the login program or login shell right away?
+ <braunr> what if you want something else than a login program ?
+ <antrik> like what?
+ <antrik> and how would a getty help with that?
+ <braunr> an ascii-art version of star wars
+ <braunr> it would be configured to start something else
+ <antrik> and why does that need a getty? why not just start something else
+ directly?
+ <braunr> well getty is about the serial line parameters actually
+ <antrik> yeah, I had a vague understanding that it has something to do with
+ serial lines (or real TTY lines)... but we hardly need that on local
+ cosoles :-)
+ <antrik> consoles
+ <braunr> right
+ <braunr> but then why even bother with something like runttys
+ <antrik> well, something has to start the terminal servers?...
+ <antrik> I might be confused though
+ <braunr> what i don't understand is
+ <braunr> why is there no getty at startup, whereas they are spawned when
+ logging off ?
+ <antrik> they are? that's fascinating indeed ;-)
+ <braunr> does it behave like this on your old version ?
+ <antrik> I don't remember ever having seen a "getty" process on my Hurd
+ systems...
+ <braunr> can you log on e.g. tty2 and then log out, and see ?
+ <antrik> OTOH, I'm hardly ever using consoles...
+ <antrik> hm... I think that should be possible remotely using the console
+ client with ncurses driver? never tried that...
+ <braunr> ncurses driver ?
+ <braunr> hum i don't know, never tried either
+ <braunr> and it may add other bugs :p
+ <braunr> better wait to be close to the machine
+ <antrik> hehe
+ <antrik> well, it's a good excuse for trying the ncurses driver ;-)
+ <antrik> hrm
+ <antrik> alien:~# console -d ncursesw
+ <antrik> console: loading driver `ncursesw' failed: Gratuitous error
+ <antrik> I guess nobody tested that stuff in years
diff --git a/open_issues/console_vs_xorg.mdwn b/open_issues/console_vs_xorg.mdwn
new file mode 100644
index 00000000..ffefb389
--- /dev/null
+++ b/open_issues/console_vs_xorg.mdwn
@@ -0,0 +1,31 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_hurd]]
+
+
+# IRC, freenode, #hurd, 2012-08-30
+
+ <gean> braunr: I have some errors about keyboard in the xorg log, but
+ keyboard is working on the X
+ <braunr> gean: paste the log somewhere please
+ <gean> braunr: http://justpaste.it/19jb
+ [...]
+ [1987693.272] Fatal server error:
+ [1987693.272] Cannot set event mode on keyboard (Inappropriate ioctl for device)
+ [...]
+ [1987693.292] FatalError re-entered, aborting
+ [1987693.302] can't reset keyboard mode (Inappropriate ioctl for device)
+ [...]
+ <braunr> hum
+ <braunr> it looks like the xorg keyboard driver evolved and now uses ioctls
+ our drivers don't implement
+ <braunr> thanks for the report, we'll have to work on this
+ <braunr> i'm not sure the problem is new actually
diff --git a/open_issues/dde.mdwn b/open_issues/dde.mdwn
index aff988d5..5f6fcf6a 100644
--- a/open_issues/dde.mdwn
+++ b/open_issues/dde.mdwn
@@ -17,6 +17,9 @@ Still waiting for interface finalization and proper integration.
[[!toc]]
+See [[user-space_device_drivers]] for generic discussion related to user-space
+device drivers.
+
# Disk Drivers
@@ -25,12 +28,6 @@ Not yet supported.
The plan is to use [[libstore_parted]] for accessing partitions.
-## Booting
-
-A similar problem is described in
-[[community/gsoc/project_ideas/unionfs_boot]], and needs to be implemented.
-
-
# Upstream Status
@@ -56,6 +53,33 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
<antrik> (both from the Dresden L4 group)
+### IRC, freenode, #hurd, 2012-08-12
+
+ <antrik>
+ http://genode.org/documentation/release-notes/12.05#Re-approaching_the_Linux_device-driver_environment
+ <antrik> I wonder whether the very detailed explanation was prompted by our
+ DDE discussions at FOSDEM...
+ <pinotree> antrik: one could think about approaching them to develop the
+ common dde libs + dde_linux together
+ <antrik> pinotree: that's what I did at FOSDEM -- they weren't interested
+ <pinotree> antrik: this year's one? why weren't they?
+ <pinotree> maybe at that time dde was not integrated properly yet (netdde
+ is just few months "old")
+ <braunr> do you really consider it integrated properly ?
+ <pinotree> no, but a bit better than last year
+ <antrik> I don't see what our integration has to do with anything...
+ <antrik> they just prefer hacking thing ad-hoc than having some central
+ upstream
+ <pinotree> the helenos people?
+ <antrik> err... how did helenos come into the picture?...
+ <antrik> we are talking about genode
+ <pinotree> sorry, confused wrong microkernel OS
+ <antrik> actually, I don't remember exactly who said what; there were
+ people from genode there and from one or more other DDE projects... but
+ none of them seemed interested in a common DDE
+ <antrik> err... one or two other L4 projects
+
+
## IRC, freenode, #hurd, 2012-02-19
<youpi> antrik: do we know exactly which DDE version Zheng Da took as a
@@ -79,6 +103,12 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
apparently have both USB and SATA working with some variant of DDE
+### IRC, freenode, #hurd, 2012-11-03
+
+ <mcsim> DrChaos: there is DDEUSB framework for L4. You could port it, if
+ you want. It uses Linux 2.6.26 usb subsystem.
+
+
# IRC, OFTC, #debian-hurd, 2012-02-15
<pinotree> i have no idea how the dde system works
@@ -90,6 +120,9 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
automatically, or you have to settrans yourself to setup a device?
<youpi> there's no autoloader for now
<youpi> we'd need a bus arbitrer that'd do autoprobing
+
+[[PCI_arbiter]].
+
<pinotree> i see
<pinotree> (you see i'm not really that low level, so pardon the flood of
posssibly-noobish questions ;) )
@@ -200,21 +233,10 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
<antrik> right
-# IRC, freenode, #hurd, 2012-02-19
-
- <youpi> antrik: we should probably add a gsoc idea on pci bus arbitration
- <youpi> DDE is still experimental for now so it's ok that you have to
- configure it by hand, but it should be automatic at some ponit
-
+# [[PCI_Arbiter]]
## IRC, freenode, #hurd, 2012-02-21
- <braunr> i'm not familiar with the new gnumach interface for userspace
- drivers, but can this pci enumerator be written with it as it is ?
- <braunr> (i'm not asking for a precise answer, just yes - even probably -
- or no)
- <braunr> (idk or utsl will do as well)
- <youpi> I'd say yes
<youpi> since all drivers need is interrupts, io ports and iomem
<youpi> the latter was already available through /dev/mem
<youpi> io ports through the i386 rpcs
@@ -453,6 +475,59 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]:
<antrik> hm... good point
+# IRC, freenode, #hurd, 2012-08-14
+
+ <braunr> it's amazing how much code just gets reimplemented needlessly ...
+ <braunr> libddekit has its own mutex, condition, semaphore etc.. objects
+ <braunr> with the *exact* same comment about the dequeueing-on-timeout
+ problem found in libpthread
+ <braunr> *sigh*
+
+
+
+# IRC, freenode, #hurd, 2012-08-18
+
+ <braunr> hum, leaks and potential deadlocks in libddekit/thread.c :/
+
+
+# IRC, freenode, #hurd, 2012-08-18
+
+ <braunr> nice, dde relies on a race to start ..
+
+
+# IRC, freenode, #hurd, 2012-08-18
+
+ <braunr> hm looks like if netdde crashes, the kernel doesn't handle it
+ cleanly, and we can't attach another netdde instance
+
+[[!message-id "877gu8klq3.fsf@kepler.schwinge.homeip.net"]]
+
+
+# IRC, freenode, #hurd, 2012-08-21
+
+In context of [[libpthread]].
+
+ <braunr> hm, i thought my pthreads patches introduced a deadlock, but
+ actually this one is present in the current upstream/debian code :/
+ <braunr> (the deadlock occurs when receiving data fast with sftp)
+ <braunr> either in netdde or pfinet
+
+
+# DDE for Filesystems
+
+## IRC, freenode, #hurd, 2012-10-07
+
+ * pinotree wonders whether the dde layer could also theoretically support
+ file systems
+ <antrik> pinotree: yeah, I also brought up the idea of creating a DDE
+ extension or DDE-like wrapper for Linux filesystems a while back... don't
+ know enough about it though to decide whether it's doable
+ <antrik> OTOH, I'm not sure it would be worthwhile. we still should
+ probably have a native (not GPLv2-only) implementation for the main FS at
+ least; so the wrapper would only be for accessing external
+ partitions/media...
+
+
# virtio
diff --git a/open_issues/exec_leak.mdwn b/open_issues/exec_leak.mdwn
new file mode 100644
index 00000000..b58d2c81
--- /dev/null
+++ b/open_issues/exec_leak.mdwn
@@ -0,0 +1,57 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+
+# IRC, freenode, #hurd, 2012-08-11
+
+ <braunr> the exec servers seems to leak a lot
+ <braunr> server*
+ <braunr> exec now uses 109M on darnassus
+ <braunr> it really leaks a lot
+ <pinotree> only 109mb? few months ago, exec on exodar was taking more than
+ 200mb after few days of uptime with builds done
+ <braunr> i wonder how much it takes on the buildds
+
+
+# IRC, freenode, #hurd, 2012-08-17
+
+ <braunr> the exec leak is tricky
+ <braunr> bddebian: btw, look at the TODO file in the hurd source code
+ <braunr> bddebian: there is a note from thomas bushnell about that
+ <braunr> "*** Handle dead name notifications on execserver ports. !
+ <braunr> not sure it's still a todo item, but it might be worth checking
+ <bddebian> braunr: diskfs_execboot_class = ports_create_class (0, 0);
+ This is what would need to change right? It should call some cleanup
+ routine in the first argument?
+ <bddebian> Would be ideal if it could just use deadboot() from exec.
+ <braunr> bddebian: possible
+ <braunr> bddebian: hum execboot, i'm not so sure
+ <bddebian> Execboot is the exec task, no?
+ <braunr> i don't know what execboot is
+ <bddebian> It's from libdiskfs
+ <braunr> but "diskfs_execboot_class" looks like a class of ports used at
+ startup only
+ <braunr> ah
+ <braunr> then it's something run in the diskfs users ?
+ <bddebian> yes
+ <braunr> the leak is in exec
+ <braunr> if clients misbehave, it shouldn't affect that server
+ <bddebian> That's a different issue, this was about the TODO thing
+ <braunr> ah
+ <braunr> i don't know
+ <bddebian> Me either :)
+ <bddebian> For the leak I'm still focusing on do-bunzip2 but I am baffled
+ at my results..
+ <braunr> ?
+ <bddebian> Where my counters are zero if I always increment on different
+ vars but wild freaking numbers if I increment on malloc and decrement on
+ free
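The log does not say why counting on `malloc` and `free` gave wild numbers. One possible cause, stated here only as an assumption, is that several server threads updated a plain counter without synchronization. A minimal sketch of leak counting with C11 atomics, using the hypothetical names `counted_malloc`, `counted_free` and `live_allocation_count`, could look like this:

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Number of blocks handed out by counted_malloc() and not yet
   released through counted_free().  Atomic, so that concurrent
   threads cannot lose updates.  */
static atomic_long live_allocations;

/* Hypothetical wrapper: allocate and count successful allocations.  */
void *counted_malloc(size_t size)
{
    void *block = malloc(size);

    if (block != NULL)
        atomic_fetch_add(&live_allocations, 1);
    return block;
}

/* Hypothetical wrapper: release and uncount; free(NULL) is a no-op,
   so NULL must not change the counter either.  */
void counted_free(void *block)
{
    if (block != NULL) {
        atomic_fetch_sub(&live_allocations, 1);
        free(block);
    }
}

/* Current number of outstanding blocks.  */
long live_allocation_count(void)
{
    return atomic_load(&live_allocations);
}
```

If the count keeps growing across otherwise identical requests, the code path between the paired calls is a candidate for the leak. Memory obtained by other means (for example `vm_allocate` or `mmap`) is not covered by such wrappers.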
diff --git a/open_issues/exec_memory_leaks.mdwn b/open_issues/exec_memory_leaks.mdwn
new file mode 100644
index 00000000..1a73ce9a
--- /dev/null
+++ b/open_issues/exec_memory_leaks.mdwn
@@ -0,0 +1,24 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+There is a memory leak in [[`exec`|hurd/translator/exec]].  After twelve
+hours worth of `fork/exec` ([[GCC]]'s `check-c` part of the testsuite), we got:
+
+ PID UID PPID PGrp Sess TH Vmem RSS %CPU User System Args
+ 4 0 3 1 1 10 392M 262M 0.0 2:18.29 2hrs /hurd/exec
+
+The *RSS* seems a tad high. Also the system part of CPU time consumption is
+quite noticeable. In comparison:
+
+ 0 0 1 1 1 19 131M 1.14M 0.0 3:30.25 9:17.79 /hurd/proc
+ 3 0 1 1 1 224 405M 12.6M 0.2 42:20.25 67min ext2fs --readonly --multiboot-command-line=root=device:hd0s6 --host-priv-port=1 --device-master-port=2 --exec-server-task=3 -T typed device:hd0s6
+ 276 0 3 1 1 344 442M 28.2M 0.6 48:09.36 91min /hurd/ext2fs /dev/hd2s5
diff --git a/open_issues/ext2fs_deadlock.mdwn b/open_issues/ext2fs_deadlock.mdwn
index 369875fe..23f54a4a 100644
--- a/open_issues/ext2fs_deadlock.mdwn
+++ b/open_issues/ext2fs_deadlock.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -44,9 +44,8 @@ pull the information out of the process' memory manually (how to do that,
anyways?), and also didn't have time to continue with debugging GDB itself, but
this sounds like a [[!taglink open_issue_gdb]]...)
----
-IRC, #hurd, 2010-10-27
+# IRC, freenode, #hurd, 2010-10-27
<youpi> thread 8 hung on ports_begin_rpc
<youpi> that's probably where one could investigated first
diff --git a/open_issues/ext2fs_libports_reference_counting_assertion.mdwn b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn
new file mode 100644
index 00000000..ff1c4c38
--- /dev/null
+++ b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn
@@ -0,0 +1,93 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+ libports/port-ref.c:31: ports_port_ref: Assertion `pi->refcnt || pi->weakrefcnt' failed
+
+This is seen every now and then.
+
+
+# [[gnumach_page_cache_policy]]
+
+With that patch in place, the assertion failure is seen more often.
+
+
+## IRC, freenode, #hurd, 2012-07-14
+
+ <youpi> braunr: I'm getting ext2fs.static:
+ /usr/src/hurd-debian/./libports/port-ref.c:31: ports_port_ref: Assertion
+ `pi->refcnt || pi->weakrefcnt' failed.
+ <youpi> oddly enough, that happens on one of the buildds only
+ <braunr> :/
+ <braunr> i fear the patch can wake many of these issues
+
+
+## IRC, freenode, #hurd, 2012-07-15
+
+ <youpi> braunr: same assertion failed on a second buildd
+ <braunr> can you paste it again please ?
+ <youpi> ext2fs.static: /usr/src/hurd-debian/./libports/port-ref.c:31:
+ ports_port_ref: Assertion `pi->refcnt || pi->weakrefcnt' failed.
+ <braunr> or better, answer the ml thread for future reference
+ <braunr> thanks
+ <youpi> braunr: I can't keep your patch on the buildds, it makes them too
+ unreliable
+ <braunr> youpi: ok
+ <braunr> i never got this error though, that's weird
+ <braunr> youpi: was the failure during the same build ?
+ <youpi> no, it was during package installation, and not the same
+ <youpi> braunr: note that I've already seen such errors, it's not new, but
+ it was way rarer
+ <youpi> like every month only
+ <braunr> ah ok
+ <braunr> yes it's less surprising then
+ <braunr> a tricky reference counting / locking mistake somewhere in the
+ hurd :) ...
+ <braunr> ah ! just got it !
+ <bddebian> braunr: Got the error or found the problem? :)
+ <braunr> the former unfortunately :/
+
+
+## IRC, freenode, #hurd, 2012-07-19
+
+ <braunr> hm, i think those ext2fs port refs errors may also be due to stack
+ overflows
+ <pinotree> --verbose
+ <braunr> hm ?
+ <braunr> http://lists.gnu.org/archive/html/bug-hurd/2012-07/msg00051.html
+ <pinotree> i mean, why do you think they could be due to that?
+ <braunr> the error is that both strong and weak refs in a port are 0 when
+ adding a reference
+ <braunr> weak refs are almost never used so let's forget about them
+ <braunr> when a ref count drops to 0, the port is automatically deallocated
+ <braunr> so what other than memory corruption setting this counter to 0
+ could possibly do that ? :)
+ <pinotree> one could also guess an unbalanced ref/unref logic, somehow
+ <braunr> what do you mean ?
+ <pinotree> that for a bug, an early return, etc a port gets unref'ed more
+ often than it is ref'ed
+ <braunr> highly unlikely, as they're protected by a lock
+ <braunr> pinotree: ah you mean, the object gets deallocated early because
+ of an deref overflow ?
+ <braunr> pinotree: could be, yes
+ <braunr> pinotree: i wonder if it could happen because of the periodic sync
+ duplicating the node table without holding references
+ <braunr> rah, libports uses a big lock in many places :(
+ <pinotree> braunr: yes, i meant that
+ <braunr> we could try using libduma some day
+ <braunr> i wonder if it could work out of the box
+ <pinotree> but that wouldn't help to find out whether a port gets deref'ed
+ too often, for instance
+ <pinotree> although it could be adapted to do so, i guess
+ <braunr> reproducing + a call trace or core would be best, but i'm not even
+ sure we can get that easily lol
+
+[[automatic_backtraces_when_assertions_hit]].
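
The failure mode discussed in the log above can be modeled outside the Hurd. The following is a minimal Python sketch, not libports code: the `Port` class, its `ref`/`deref` methods and the `freed` flag are hypothetical names used only for illustration. It shows how a single unbalanced dereference (pinotree's guess) leaves both counters at zero, so that the next attempt to take a reference trips the same kind of check as the `pi->refcnt || pi->weakrefcnt` assertion in `ports_port_ref`.

```python
class Port:
    """Toy model of a reference-counted port (hypothetical, not libports)."""

    def __init__(self):
        self.refcnt = 1       # the creator holds one strong reference
        self.weakrefcnt = 0   # weak references are rarely used
        self.freed = False

    def ref(self):
        # Mirrors the asserted invariant: a reference may only be added
        # while at least one strong or weak reference still exists.
        if not (self.refcnt or self.weakrefcnt):
            raise AssertionError("pi->refcnt || pi->weakrefcnt")
        self.refcnt += 1

    def deref(self):
        self.refcnt -= 1
        if self.refcnt == 0 and self.weakrefcnt == 0:
            self.freed = True  # the real code would deallocate the port here


def unbalanced_use():
    """Drop one reference too many, then try to take a new one."""
    port = Port()
    port.ref()      # a user takes a reference
    port.deref()    # ... and releases it again
    port.deref()    # the extra, unbalanced release: the count reaches 0
    try:
        port.ref()  # a later user finds both counters at 0
    except AssertionError:
        return "assertion failed"
    return "ok"
```

The model only covers the unbalanced ref/deref hypothesis; it says nothing about the other candidate raised in the log, memory corruption zeroing the counter.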
diff --git a/open_issues/fakeroot_eagain.mdwn b/open_issues/fakeroot_eagain.mdwn
new file mode 100644
index 00000000..6b684a04
--- /dev/null
+++ b/open_issues/fakeroot_eagain.mdwn
@@ -0,0 +1,216 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_porting]]
+
+
+# IRC, freenode, #hurd, 2012-12-05
+
+ <braunr> rbraun 18813 R 2hrs ln -sf ../af_ZA/LC_NUMERIC
+ debian/locales-all/usr/lib/locale/en_BW/LC_NUMERIC
+ <braunr> when building glibc
+ <braunr> is this a known issue ?
+ <tschwinge> braunr: No. Can you get a backtrace?
+ <braunr> tschwinge: with gdb you mean ?
+ <tschwinge> Yes. If you have any debugging symbols (glibc?).
+ <braunr> or the build log leading to that ?
+ <braunr> ok, i will next time i have it
+ <tschwinge> OK.
+ <braunr> (i regularly had it when working on the pthreads port)
+ <braunr> tschwinge:
+ http://www.sceen.net/~rbraun/hurd_glibc_build_deadlock_trace
+ <braunr> youpi: ^
+ <youpi> Mmm, there's not so much we can do about this one
+ <braunr> youpi: what do you mean ?
+ <youpi> the problem is that it's really a reentrancy issue of the libc
+ locale
+ <youpi> it would happen just the same on linux
+ <braunr> sure
+ <braunr> but that doesn't mean we can't report and/or fix it :)
+ <youpi> (the _nl_state_lock)
+ <braunr> do you have any workaround in mind ?
+ <youpi> no
+ <youpi> actually that's what I meant by "there's not so much we can do
+ about this"
+ <braunr> ok
+ <youpi> because it's a bad interaction between libfakeroot and glibc
+ <youpi> glibc believes fxstat64 would never call locale functions
+ <youpi> but with libfakeroot it does
+ <braunr> i see
+ <youpi> only because we get an EAGAIN here
+ <braunr> but hm, doesn't it happen on linux ?
+ <youpi> EAGAIN doesn't happen on linux for fxstat64, no :)
+ <braunr> why does it happen on the hurd ?
+ <youpi> I mean for fakeroot stuff
+ <youpi> probably because fakeroot uses socket functions
+ <youpi> for which we probably don't properly handle EAGAIN
+ <youpi> I've already seen such kind of issue
+ <youpi> in buildd failures
+ <braunr> ok
+ <youpi> (so the actual bug here is EAGAIN
+ <youpi> )
+ <braunr> yes, so we can do something about it
+ <braunr> worth a look
+ <pinotree> (implement sysv semaphores)
+ <youpi> pinotree: if we could also solve all these buildd EAGAIN issues
+ that'd be nice :)
+ <braunr> that EAGAIN error might also be what makes exim behave badly and
+ loop forever
+ <youpi> possibly
+ <braunr> i've updated the trace with debugging symbols
+ <braunr> it fails on connect
+ <pinotree> like http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=563342 ?
+ <braunr> it's EAGAIN, not ECONNREFUSED
+ <pinotree> ah ok
+ <braunr> might be an error in tcp_v4_get_port
+
+
+## IRC, freenode, #hurd, 2012-12-06
+
+ <braunr> hmm, tcp_v4_get_port sometimes fails indeed
+ <gnu_srs> braunr: may I ask how you found out, adding print statements in
+ pfinet, or?
+ <braunr> yes
+ <gnu_srs> OK, so that's the only (easy) way to debug.
+ <braunr> that's the last resort
+ <braunr> gdb is easy too
+ <braunr> i could have added a breakpoint too
+ <braunr> but i didn't want to block pfinet while i was away
+ <braunr> is it possible to force the use of fakeroot-tcp on linux ?
+ <braunr> the problem seems to be that fakeroot doesn't close the sockets
+ that it connected to faked-tcp
+ <braunr> which, at some point, exhausts the port space
+ <pinotree> braunr: sure
+ <pinotree> change the fakeroot dpkg alternative
+ <braunr> ok
+ <pinotree> calling it explicitly `fakeroot-tcp command` or
+ `dpkg-buildpackage -rfakeroot-tcp ...` should work too
+ <braunr> fakeroot-tcp looks really evil :p
+ <braunr> hum, i don't see any faked-tcp process on linux :/
+ <pinotree> not even with `fakeroot-tcp bash -c "sleep 10"`?
+ <braunr> pinotree: now yes
+ <braunr> but, does it mean faked-tcp is started for *each* process loading
+ fakeroot-tcp ?
+ <braunr> (the lib i mean)
+ <pinotree> i think so
+ <braunr> well the hurd doesn't seem to do that at all
+ <braunr> or maybe it does and i don't see it
+ <braunr> the stale faked-tcp processes could be those that failed something
+ only
+ <pinotree> yes, there's also that issue: sometimes there are stake
+ faked-tcp processes
+ <braunr> hum no, i see one faked-tcp that consumes cpu when building glibc
+ <pinotree> *stale
+ <braunr> it's the same process for all commands
+ <pinotree> <braunr> but, does it mean faked-tcp is started for *each*
+ process loading fakeroot-tcp ?
+ <pinotree> → everytime you start fakeroot, there's a new faked-xxx for it
+ <braunr> it doesn't look that way
+ <braunr> again, on the hurd, i see one faked-tcp, consuming cpu while
+ building so i assume it services libfakeroot-tcp requests
+ <pinotree> yes
+ <braunr> which means i probably won't reproduce the problem on linux
+ <pinotree> it serves that fakeroot under which the binary(-arch) target is
+ run
+ <braunr> or perhaps it's the normal fakeroot-tcp behaviour on sid
+ <braunr> pinotree: a faked-tcp that is started for each command invocation
+ will implicitly make the network stack close all its sockets when
+ exiting
+ <braunr> pinotree: as our fakeroot-tcp uses the same instance of faked-tcp,
+ it's a lot more likely to exhaust the port space
+ <pinotree> i see
+ <braunr> i'll try on sid and see how it behaves
+ <braunr> pinotree: on the other hand, forking so many processes at each
+ command invocation may make exec leak a lot :p
+ <braunr> or rather, a lot more
+ <braunr> (or maybe not, since it leaks only in some cases)
+
+[[exec_leak]].
+
+ <braunr> pinotree: actually, the behaviour under linux is the same with the
+ alternative correctly set, whereas faked-tcp is restarted (if used at
+ all) with -rfakeroot-tcp
+ <braunr> hm no, even that isn't true
+ <braunr> grr
+ <braunr> pinotree: i think i found a handy workaround for fakeroot
+ <braunr> pinotree: the range of local ports in our networking stack is a
+ lot more limited than what is configured in current systems
+ <braunr> by extending it, i can now build glibc \o/
+ <pinotree> braunr: what are the current ours and the usual one?
+ <braunr> see pfinet/linux-src/net/ipv4/tcp_ipv4.c
+ <braunr> the modern ones are the ones suggested in the comment
+ <braunr> sysctl_local_port_range is the symbol storing the range
+ <pinotree> i see
+ <pinotree> what's the current range on linux?
+ <braunr> 20:44 < braunr> the modern ones are the ones suggested in the
+ comment
+ <pinotree> i see
+ <braunr> $ cat /proc/sys/net/ipv4/ip_local_port_range
+ <braunr> 32768 61000
+ <braunr> so, i'm not sure why we have the problem, since even on linux,
+ netstat doesn't show open bound ports, but it does help
+ <braunr> the fact faked-tcp can remain after its use is more problematic
+ <pinotree> (maybe pfinet could grow a (startup-only?) option to change it,
+ similar to that sysctl)
+ <braunr> but it can also stem from the same issue gnu_srs found about
+ closed sockets that haven't been shut down
+ <braunr> perhaps
+ <braunr> but i don't see the point actually
+ <braunr> we could simply change the values in the code
+
+ <braunr> youpi: first, in pfinet, i increased the range of local ports to
+ reduce the likelihood of port space exhaustion
+ <braunr> so we should get a lot less EAGAIN after that
+ <braunr> (i've not committed any of those changes)
+ <youpi> range of local ports?
+ <braunr> see pfinet/linux-src/net/ipv4/tcp_ipv4.c, tcp_v4_get_port function
+ and sysctl_local_port_range array
+ <youpi> oh
+ <braunr> EAGAIN is caused by tcp_v4_get_port failing at
+ <braunr> /* Exhausted local port range during search? */
+ <braunr> if (remaining <= 0)
+ <braunr> goto fail;
+ <youpi> interesting
+ <youpi> so it's not a hurd bug after all
+ <youpi> just a problem in fakeroot eating a lot of ports
+ <braunr> maybe because of the same issue gnu_srs worked on (bad socket
+ close when no clean shutdown)
+ <braunr> maybe, maybe not
+ <braunr> but increasing the range is effective
+ <braunr> and i compared with what linux does today, which is exactly what
+ is in the comment above sysctl_local_port_range
+ <braunr> so it looks safe
+ <youpi> so that means that the pfinet just uses ports 1024- 4999 for
+ auto-allocated ports?
+ <braunr> i guess so
+ <youpi> the linux pfinet I meant
+ <braunr> i haven't checked the whole code but it looks that way
+ <youpi> ./sysctl_net_ipv4.c:static int ip_local_port_range_min[] = { 1, 1
+ };
+ <youpi> ./sysctl_net_ipv4.c:static int ip_local_port_range_max[] = { 65535,
+ 65535 };
+ <youpi> looks like they have increased it since then :)
+ <braunr> hum :)
+ <braunr> $ cat /proc/sys/net/ipv4/ip_local_port_range
+ <braunr> 32768 61000
+ <youpi> yep, same here
+ <youpi> ./inet_connection_sock.c: .range = { 32768, 61000 },
+ <youpi> so there are two things apparently
+ <youpi> but linux now defaults to 32k-61k
+ <youpi> braunr: please just push the port range upgrade to 32Ki-61K
+ <braunr> ok, will do
+ <youpi> there's no reason not to do it
+
+
+## IRC, freenode, #hurd, 2012-12-11
+
+ <braunr> youpi: at least, i haven't had any failure building eglibc since
+ the port range patch
+ <youpi> good :)
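
A rough sketch of why widening the range helps. This is a toy Python model, not pfinet code: `LocalPortAllocator` and `connections_until_eagain` are hypothetical names, the linear search is a simplification of what `tcp_v4_get_port` does, and the model assumes that the client never closes its sockets (the behaviour braunr suspected of fakeroot). The two ranges are the ones quoted in the log: 1024-4999 (old) and 32768-61000 (the Linux default at the time).

```python
import errno


class LocalPortAllocator:
    """Toy model of automatic local port selection (hypothetical)."""

    def __init__(self, low, high):
        self.low, self.high = low, high   # inclusive range
        self.in_use = set()
        self.rover = low                  # where the next search starts

    def get_port(self):
        remaining = self.high - self.low + 1
        while remaining > 0:
            port = self.rover
            self.rover = self.low if port == self.high else port + 1
            if port not in self.in_use:
                self.in_use.add(port)
                return port
            remaining -= 1
        # Corresponds to "Exhausted local port range during search?"
        raise OSError(errno.EAGAIN, "local port range exhausted")

    def release(self, port):
        self.in_use.discard(port)


def connections_until_eagain(low, high):
    """Count successful allocations when no port is ever released."""
    allocator = LocalPortAllocator(low, high)
    count = 0
    while True:
        try:
            allocator.get_port()
        except OSError as err:
            if err.errno != errno.EAGAIN:
                raise
            return count
        count += 1
```

Under these assumptions the old range allows 3976 leaked connections before `EAGAIN` and the new one 28233, roughly seven times as many. That is consistent with the observation that the wider range makes the failure much rarer, while leaving the underlying socket leak in place.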
diff --git a/open_issues/fifo_thread_explosion.mdwn b/open_issues/fifo_thread_explosion.mdwn
new file mode 100644
index 00000000..08f682f2
--- /dev/null
+++ b/open_issues/fifo_thread_explosion.mdwn
@@ -0,0 +1,20 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+As reported in [[!message-id "87sj80yb3e.fsf@kepler.schwinge.homeip.net"]],
+after a [[GCC]] build (native, so a three-stage bootstrap), we got:
+
+ PID UID PPID PGrp Sess TH Vmem RSS %CPU User System Args
+ 449 1000 3 1 1 10118 782M 198M 0.0 0:40.78 2:26.65 /hurd/fifo
+
+The other processes, in particular two instances of ext2fs and one of [[exec]],
+looked reasonable.
diff --git a/open_issues/fork_deadlock.mdwn b/open_issues/fork_deadlock.mdwn
index 6b90aa0a..c1fa9208 100644
--- a/open_issues/fork_deadlock.mdwn
+++ b/open_issues/fork_deadlock.mdwn
@@ -63,3 +63,34 @@ Another one in `dash`:
stopped = 1
i = 6
[...]
+
+
+# IRC, OFTC, #debian-hurd, 2012-11-24
+
+ <youpi> the lockups are about a SIGCHLD which gets lost
+ <pinotree> ah, ok
+ <youpi> which makes bash spin
+ <pinotree> is that happening more often recently, or it's just something i
+ just noticed?
+ <youpi> it's more often recently
+ <youpi> where "recently" means "some months ago"
+ <youpi> I didn't notice exactly when
+ <pinotree> i see
+ <youpi> it's at most since june, apparently
+ <youpi> (libtool managed to build without a fuss, while now it's a pain)
+ <youpi> (libtool building is a good test, it seems to be triggering quite
+ reliably)
+
+
+## IRC, freenode, #hurd, 2012-11-27
+
+ <youpi> we also have the shell wait issue
+ <youpi> it's particularly bad on libtool calls
+ <youpi> the libtool package (with testsuite) is a good reproducer :)
+ <youpi> the symptom is shell scripts eating CPU
+ <youpi> busy-waiting for a SIGCHLD which never gets received
+ <braunr> that could be what i got
+ <braunr>
+ http://www.gnu.org/software/hurd/microkernel/mach/gnumach/memory_management.html
+ <braunr> last part
+ <youpi> perhaps watch has the same issue as the shell, yes
diff --git a/open_issues/gcc.mdwn b/open_issues/gcc.mdwn
index d9940716..574a743b 100644
--- a/open_issues/gcc.mdwn
+++ b/open_issues/gcc.mdwn
@@ -31,14 +31,14 @@ example. Especially all the compiler magic is all the same.
<!--
git checkout reviewed
-git log --reverse --pretty=fuller --stat=$COLUMNS,$COLUMNS -p -C --cc ..upstream/trunk
+git log --reverse --topo-order --pretty=fuller --stat=$COLUMNS,$COLUMNS -w -p -C --cc ..upstream/trunk
-i
-/^commit |^---$|hurd|linux|nptl|glibc
+/^commit |^Merge:|^---$|hurd|linux|nacl|nptl|glibc|gs:
-->
-Last reviewed up to the [[Git mirror's dfed30bca14de84e0446cc02f5a27407dbfdc3e1
-(2012-06-11) sources|source_repositories/gcc]].
+Last reviewed up to the [[Git mirror's 769bf18a20ee2540ca7601cdafabd62b18b9751b
+(2012-10-01) sources|source_repositories/gcc]].
<http://gcc.gnu.org/install/configure.html> has documentation for the
`configure` switches.
@@ -74,24 +74,27 @@ Last reviewed up to the [[Git mirror's dfed30bca14de84e0446cc02f5a27407dbfdc3e1
* [[`libmudflap`|libmudflap]].
- * Might [`-fsplit-stack`](http://nickclifton.livejournal.com/6889.html) be
- worthwhile w.r.t. our [[multithreaded|multithreading]] libraries?
+ * [`-fsplit-stack`](http://nickclifton.livejournal.com/6889.html)
* Also see `libgcc/config/i386/morestack.S`: comments w.r.t
- `TARGET_THREAD_SPLIT_STACK_OFFSET`; likely needs porting.
+ `TARGET_THREAD_SPLIT_STACK_OFFSET`/`%gs:0x30` usage; likely needs
+ porting.
- As per `libgcc/config/i386/t-stack-i386`, the former file is only used for
- `-fsplit-stack` support -- which is currently enabled for us in
- `libgcc/config.host`, but not usable via GCC proper.
+ * As per `libgcc/config/i386/t-stack-i386`, the former file is only used
+ for `-fsplit-stack` support -- which is currently enabled for us in
+ `libgcc/config.host`.
* `gcc/config/gnu-user.h` defines `*SPLIT_STACK*` macros -- which aren't
valid for us (yet), I think.
+ * Might `-fsplit-stack` be useful for us with respect to our
+ [[multithreaded|multithreading]] libraries?
+
* `--enable-languages=[...]`
- * GNAT is not yet ported / bootstrapped?
+ * [[Ada (GNAT)|GNAT]] support is work in progress.
- * The Google Go's libgo (introduced in
+ * The [[Google Go's libgo|gccgo]] (introduced in
e440a3286bc89368b8d3a8fd6accd47191790bf2 (2010-12-03)) needs
OS configuration / support.
@@ -136,9 +139,13 @@ Last reviewed up to the [[Git mirror's dfed30bca14de84e0446cc02f5a27407dbfdc3e1
* [-fstack-protector shouldn't use TLS in freestanding
mode](http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29838)
+ * See also commit bf1c0af128f33bd342636c4afeaa8f3a8a7cf8ca (reverted in
+ commit a204f0622242865ffea889bd698bc7c7bd236bd1), commit
+ 05c1aa95e6c37b3b281d749c76c673392941a031.
+
* Check before/after Joseph changes. (Should be fine.)
- * 34618b3190c110b8926cc2b1db4b4eac95451995
+ * 34618b3190c110b8926cc2b1db4b4eac95451995 »config-list.mk«
What's this used for? (Check ML.) Ask to include i686-pc-gnu (once it is
buildable out of the box)? See also
@@ -194,6 +201,17 @@ Last reviewed up to the [[Git mirror's dfed30bca14de84e0446cc02f5a27407dbfdc3e1
to find out why some stuff wasn't compiling even after kfreebsd
porting patches adding preprocessors checks for __GLIBC__
+ GNU/kFreeBSD and GNU/kNetBSD: commit
+ 6396cc37141180db4d2c8f73cab4f5977d8a1e19 (2004-06-24, r83577),
+ GNU/kOpenSolaris: commit 3bef40126fb1633018fce47828df0fa9f65f110c
+ (2009-01-29, r143768). See also GDB commits
+ fda1b24c62843f81d31de2af57b1ed9c55f1e348 and
+ 1acb4f4ff73d20850a7524fc939d2651be75f47b, and binutils commits
+ e3081899be7570eb90ccfd5d767950d3a62871ee,
+ 127c4d4a4fe65bd17ea64db1be7f3c93d393afcb,
+ 47dbf5b634b955c2db1221715d15751e1281546a, and
+ ad2be7e8b846f4cd67fa1e032f98d5dc1cdb6b8d.
+
IRC, freenode, #hurd, 2012-05-25:
<gnu_srs> Hi, looks like __GLIBC__ is not defined by default for GNU?
@@ -248,6 +266,8 @@ Last reviewed up to the [[Git mirror's dfed30bca14de84e0446cc02f5a27407dbfdc3e1
<pinotree> what should be done first is, however, find out why that
define has been added to gcc
+ [[!message-id "201211061305.02565.pino@debian.org"]].
+
* [low] Does `-mcpu=native` etc. work? (For example,
2ae1f0cc764e998bfc684d662aba0497e8723e52.)
@@ -278,18 +298,20 @@ Last reviewed up to the [[Git mirror's dfed30bca14de84e0446cc02f5a27407dbfdc3e1
C.f. [[!message-id "x57jobtqx89w.fsf@frobland.mtv.corp.google.com"]],
[[!message-id "x57jd359fkx3.fsf@frobland.mtv.corp.google.com"]] as well as
[[!debbug 629866]]/[[!message-id
- "20110609002620.GA16719@const.famille.thibault.fr"]].
+ "20110609002620.GA16719@const.famille.thibault.fr"]]. commit
+ 026e608ecebcb2a6193971006a85276307d79b00.
# Build
Here's a log of a GCC build run; this is from our [[Git repository's
-2e2db3f92b534460c68c2f9ae64455884424beb6 (2012-06-15; 2012-06-06)
+b401cb7ed15602d244a6807835b0b9d740a302a8 (2012-11-26;
+769bf18a20ee2540ca7601cdafabd62b18b9751b (2012-10-01))
sources|source_repositories/gcc]], run on kepler.SCHWINGE and coulomb.SCHWINGE.
$ export LC_ALL=C
$ (cd ../master/ && contrib/gcc_update --touch)
- $ ../master/configure --prefix="$PWD".install SHELL=/bin/dash CC=gcc-4.6 CXX=g++-4.6 --enable-build-with-cxx --enable-languages=all,ada 2>&1 | tee log_build
+ $ ../master/configure --prefix="$PWD".install SHELL=/bin/dash CC=gcc-4.6 CXX=g++-4.6 --enable-languages=all,ada 2>&1 | tee log_build
[...]
$ make 2>&1 | tee log_build_
[...]
@@ -297,12 +319,12 @@ sources|source_repositories/gcc]], run on kepler.SCHWINGE and coulomb.SCHWINGE.
Different hosts may default to different shells and compiler versions; thus
harmonized.
-This takes up around 3.1 GiB, and needs roughly 3.0 h on kepler.SCHWINGE and
-12.75 h on coulomb.SCHWINGE.
+This takes up around 3.1 GiB, and needs roughly 3.25 h on kepler.SCHWINGE and
+13.25 h on coulomb.SCHWINGE.
<!--
- $ (make && touch .go-install) 2>&1 | tee log_build_ && test -f .go-install && (make install && touch .go-check) 2>&1 | tee log_install && test -f .go-check && make -k RUNTESTFLAGS=-v check 2>&1 | tee log_check
+ $ (make && touch .go-install) 2>&1 | tee log_build_ && test -f .go-install && (make install && touch .go-test) 2>&1 | tee log_install && test -f .go-test && make -k RUNTESTFLAGS=-v check 2>&1 | tee log_test
-->
@@ -382,7 +404,7 @@ This takes up around 3.1 GiB, and needs roughly 3.0 h on kepler.SCHWINGE and
Just different order of object files, or another problem? TODO
- * `libobjc/encoding.c`:
+ * `libobjc/encoding.c`:
libtool: compile: [...]/hurd/master.build/./gcc/xgcc [...] [...]/hurd/master/libobjc/encoding.c -c [...]
+[...]/hurd/master/libobjc/encoding.c:128:1: warning: '_darwin_rs6000_special_round_type_align' defined but not used [-Wunused-function]
@@ -416,9 +438,11 @@ This takes up around 3.1 GiB, and needs roughly 3.0 h on kepler.SCHWINGE and
* *default library search path*
- -checking for the default library search path... /lib /usr/lib /lib/i386-linux-gnu /usr/lib/i386-linux-gnu /lib/i486-linux-gnu /usr/lib/i486-linux-gnu /usr/local/lib /lib64 /usr/lib64
+ -checking for the default library search path... /lib /usr/lib /lib/i386-linux-gnu /usr/lib/i386-linux-gnu /lib/i486-linux-gnu /usr/lib/i486-linux-gnu /usr/local/lib
+checking for the default library search path... /lib /usr/lib
+ [[binutils]] issue? Should be aligned by Samuel's binutils patch.
+
* `./classpath/[...]/*.properties`
Just different order of files, or another problem?
@@ -452,13 +476,6 @@ This takes up around 3.1 GiB, and needs roughly 3.0 h on kepler.SCHWINGE and
There are other instances of this in the following.
- * *default library search path*
-
- -checking for the default library search path... /lib /usr/lib /lib/[MULTIARCH] /usr/lib/[MULTIARCH] /lib/i486-linux-gnu /usr/lib/i486-linux-gnu /usr/local/lib /lib64 /usr/lib64
- +checking for the default library search path... /lib /usr/lib
-
- Should be aligned by Samuel's binutils patch.
-
* `value-unwind.h`
-DEFINES='' HEADERS='../../../master/libgcc/config/i386/value-unwind.h' \
@@ -482,13 +499,22 @@ This takes up around 3.1 GiB, and needs roughly 3.0 h on kepler.SCHWINGE and
* `libatomic` on GNU/Linux compiles several more files than on GNU/Hurd. Is
that correct? Probably futex support.
+ * 2e2db3f92b534460c68c2f9ae64455884424beb6..3336556d2cb32f46322922a83015f760cfb79d8f
+
+ Both GNU/Linux and GNU/Hurd:
+
+ -checking assembler for rep and lock prefix... yes
+ +checking assembler for rep and lock prefix... no
+
+ TODO.
+
# Install
$ make install 2>&1 | tee log_install
[...]
-This takes up around 850 MiB, and needs roughly 4 min on kepler.SCHWINGE and 45
+This takes up around 850 MiB, and needs roughly 4 min on kepler.SCHWINGE and 35
min on coulomb.SCHWINGE.
@@ -514,24 +540,158 @@ min on coulomb.SCHWINGE.
Testing on GNU/Hurd is blocked on
[[fork_mach_port_mod_refs_ekern_urefs_owerflow]].
-TODO. Can use parallel testing, see [[!message-id
-"20110331070322.GI11563@sunsite.ms.mff.cuni.cz"]].
+TODO. On GNU/Hurd, it is advisable to reboot after having built and installed
+GCC, before running the testsuite, as otherwise the system tends to crash
+during the `gcc.c-torture/compile/limits-structnest.c` tests, which are rather
+memory hungry, see [[!message-id "87bol6aixd.fsf@schwinge.name"]]. Likewise,
+it also seems advisable to add further reboots in between, that is, to split
+`make check`'s `check-host` into several separate runs, followed by one for
+`check-target` (see `[build]/Makefile:do-check`,
+`[build]/gcc/Makefile:CHECK_TARGETS`), as otherwise the system tends to crash
+sooner or later. (Running `check-host` amounts to something like 44 hours'
+worth of forking/execing of GCC and testcases.) On GNU/Linux we run it in one
+go, so that we'll catch any fundamental rearrangements of/additions to the
+testsuites.
+
+kepler.SCHWINGE:
+
+ $ make -k check 2>&1 | tee log_test
+ [...]
+
+coulomb.SCHWINGE:
+
+ $ awk '/^maybe-check-target/ { next; }; /^maybe-check-[^:]*:./ { print; };' < Makefile
+ maybe-check-fixincludes: check-fixincludes
+ maybe-check-gcc: check-gcc
+ maybe-check-intl: check-intl
+ maybe-check-libbacktrace: check-libbacktrace
+ maybe-check-libcpp: check-libcpp
+ maybe-check-libdecnumber: check-libdecnumber
+ maybe-check-libiberty: check-libiberty
+ maybe-check-zlib: check-zlib
+ maybe-check-gnattools: check-gnattools
+ maybe-check-lto-plugin: check-lto-plugin
+ $ grep ^CHECK_TARGETS gcc/Makefile
+ CHECK_TARGETS = check-ada check-c check-c++ check-fortran check-java check-lto check-objc
- $ make -k RUNTESTFLAGS=-v check 2>&1 | tee log_check
+ $ export LC_ALL=C
+
+ $ make -k check-fixincludes 2>&1 | tee log_test_1_check-fixincludes
+ [...]
+ $ make -k -C gcc check-ada 2>&1 | tee log_test_2_gcc_check-ada
+ [...]
+ [reboot]
+ $ make -k -C gcc check-c 2>&1 | tee log_test_2_gcc_check-c
+ [...]
+ [reboot]
+ $ make -k -C gcc check-c++ 2>&1 | tee log_test_2_gcc_check-c++
+ [...]
+ [reboot]
+ $ make -k -C gcc check-fortran check-java check-lto check-objc 2>&1 | tee log_test_2_gcc_check-fortran,check-java,check-lto,check-objc
+ [...]
+ [reboot]
+ $ make -k check-intl check-libbacktrace check-libcpp check-libdecnumber check-libiberty check-zlib check-gnattools check-lto-plugin 2>&1 | tee log_test_3
+ [...]
+ $ make -k check-target 2>&1 | tee log_test_4_check-target
[...]
-This needs roughly 6.5 h on kepler.SCHWINGE and 50.25 h on coulomb.SCHWINGE.
+This needs roughly 6.75 h on kepler.SCHWINGE and 3.5 h (`check-fixincludes`,
+`gcc/check-ada`) + 10 h (`gcc/check-c`) + 3.75 h (`gcc/check-c++`) + 5.5 h
+(`gcc/check-fortran`, `gcc/check-java`, `gcc/check-lto`, `gcc/check-objc`) +
+8.25 h (`check-intl`, [...], `check-lto-plugin`, `check-target`) = 31 h on
+coulomb.SCHWINGE.
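
The `awk` filter used on coulomb.SCHWINGE above can be read as follows. This Python sketch is only an illustration (the function name `host_check_targets` is made up, and unlike the `awk` command it returns the prerequisites rather than printing whole rules): skip every `maybe-check-target...` rule, then keep the `maybe-check-*` rules whose prerequisite list is not empty.

```python
import re

# Same pattern as the awk rule /^maybe-check-[^:]*:./ : the trailing "."
# requires at least one character after the colon, which drops rules
# that have no prerequisites.
RULE_WITH_PREREQS = re.compile(r"^maybe-check-[^:]*:.")


def host_check_targets(makefile_text):
    """Return the enabled host-side check targets, in Makefile order."""
    targets = []
    for line in makefile_text.splitlines():
        if line.startswith("maybe-check-target"):
            continue  # awk: /^maybe-check-target/ { next; }
        if RULE_WITH_PREREQS.match(line):
            # Keep only what follows the colon, split on whitespace.
            targets.extend(line.split(":", 1)[1].split())
    return targets
```

Given the Makefile lines shown above, this would yield `check-fixincludes`, `check-gcc`, and so on, which are the pieces that the separate `make -k ...` runs are assembled from.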
## Analysis
$ toolchain/logs/process gcc test
-TODO.
+ * PTYs
+
+ Occasionally tests FAIL due to:
+
+ spawn -open -1 failed, 1 5, The system has no more ptys. Ask your system administrator to create more.
+
+ TODO.
+
+ * As of b401cb7ed15602d244a6807835b0b9d740a302a8 (2012-11-26;
+ 769bf18a20ee2540ca7601cdafabd62b18b9751b (2012-10-01)), all
+ `gcc.dg/guality` and `g++.dg/guality` and a few more are no longer tested
+ on coulomb.SCHWINGE and kepler.SCHWINGE.
+
+ * As of b401cb7ed15602d244a6807835b0b9d740a302a8 (2012-11-26;
+ 769bf18a20ee2540ca7601cdafabd62b18b9751b (2012-10-01)), there are
+ regressions (FAILs) in libgomp execution tests on coulomb.SCHWINGE.
+
+ * TODO
+
+
+## Enhancements
+
+
+### `contrib/testsuite-management/`, `contrib/regression/`
+
+ * 35a27ee8c4b349fea44fd1fadc9614ab3cc9d578 `Add an xfail manifest for
+ x86_64-unknown-linux-gnu to trunk.`
+
+
+### Parallel Testing
+
+[[!message-id "20110331070322.GI11563@sunsite.ms.mff.cuni.cz"]].
+
+
+### Distributed Testing
+
+
+#### IRC, OFTC, #gcc, 2012-05-31
+ <dnovillo> jsm28: in your mentor testing, you have the source and build
+ tree available for make check? or it's a pure installed-tree test?
+ <jsm28> dnovillo: Source tree, install tree, no build tree.
+ <dnovillo> jsm28: so, you run make check on top of the source tree or copy
+ the */testsuite trees to a testing area?
+ <jsm28> Create a site.exp and do runtest in a temporary directory. runtest
+ is pointed to the source tree to find sources.
+ <jsm28> For cross testing for GNU/Linux targets, the temporary directory is
+ mounted at the same path on host and target.
+ <dnovillo> jsm28: thanks. i guess i'll have to find the slice of the
+ source tree i need to copy.
+ <dnovillo> jsm28: for libstdc++ do you write a different site.exp?
+ <dnovillo> i noticed that it generates a different site.exp there.
+ <jsm28> The site.exp is mostly the same for all testsuites (so includes
+ settings that only some testsuites use).
+ <dnovillo> ok, thanks.
+ <dnovillo> and when you say "pointed to the source tree" you mean "set
+ srcdir /path/to/top/of/gcc" ?
+ <dnovillo> (in site.exp)
+ <jsm28> The GDB testsuite requires that you run the GDB testsuite's
+ configure script in the temporary directory where you will run runtest.
+ I don't think any GCC testsuites we use have requirements like that.
+ <jsm28> dnovillo: --srcdir option to runtest.
+ <dnovillo> ah, yes.
+ <jsm28> (and --tool, --target_board etc.)
+ <dnovillo> right
+ <dnovillo> since i'm distributing the tests. i want each node to only do a
+ bunch of files. this means that i either use 'tool.exp=file-pattern' or
+ simply copy the subset of files i want tool.exp to find.
+ <dnovillo> i chose the second approach, but that breaks in a handful of
+ cases that need files from other sub-directories.
+ <dnovillo> like g++.dg gcc.dg using stuff from c-c++-common.
+ <dnovillo> for libstdc++, the possibilities for splitting are enormous as
+ it has many directories.
+ <dnovillo> but i'm not setting it right. runtest runs without even trying
+ to test anything.
+ <dnovillo> i'm not having it pick up the right driver.
+ <jsm28> Probably all .exp files should be copied to anywhere running
+ testsuites, since some read .exp files from other directories.
+ <dnovillo> jsm28: that could be it too. it's irritating that libstdc++
+ does not even error out. runtest just does nothing and returns 0.
-# Specific Languages
+##### IRC, OFTC, #gcc, 2012-06-06
- * [[GNAT]]
+ <dnovillo> any libstdc++ maintainer around?
+ <dnovillo> or, does anyone know when the testsuite/data files are copied
+ into the running testsuite/ dir?
+ <dnovillo> seems to be done in advance by make.
- * [[gccgo]]
+##### [[!message-id "4FC7791E.6040407@gmail.com"]]
diff --git a/open_issues/gcc/pie.mdwn b/open_issues/gcc/pie.mdwn
new file mode 100644
index 00000000..a4598d1e
--- /dev/null
+++ b/open_issues/gcc/pie.mdwn
@@ -0,0 +1,40 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!meta title="Position-Independent Executables"]]
+
+[[!tag open_issue_gcc]]
+
+
+# IRC, freenode, #debian-hurd, 2012-11-08
+
+ <pinotree> tschwinge: i'm not totally sure, but it seems the pie options
+ for gcc/ld are causing issues
+ <pinotree> namely, producing executables that sigsegv straight away
+ <tschwinge> pinotree: OK, I do remember some issues about these, too.
+ <tschwinge> Also for -pg.
+ <tschwinge> These have in common that they use different crt*.o files for
+ linking.
+ <tschwinge> Might well be there's some bugs there.
+ <pinotree> one way is to try the w3m debian build: the current build
+ configuration also enables pie, which in turn makes a helper executable
+ (mktable) sigsegv when invoked
+ <pinotree> if «,-pie» is appended to the DEB_BUILD_MAINT_OPTIONS variable
+ in debian/rules, pie is not added and the resulting mktable runs
+ correctly
+
+
+## IRC, OFTC, #debian-hurd, 2012-11-09
+
+ <pinotree> youpi: ah, as i noted to tschwinge earlier, it seems -fPIE -pie
+ miscompile stuff
+ <youpi> uh
+ <pinotree> this causes the w3m build failure and (indirectly, due to elinks
+ built with -pie) aptitude
diff --git a/open_issues/gdb.mdwn b/open_issues/gdb.mdwn
index 1652031b..f5daff48 100644
--- a/open_issues/gdb.mdwn
+++ b/open_issues/gdb.mdwn
@@ -24,8 +24,17 @@ Here's what's to be done for maintaining GNU GDB.
# Configuration
-Last reviewed up to the [[Git mirror's ea9812279fe436be9a010d07ef1dbe465199a3d7
-(2011-09-07) sources|source_repositories/gdb]].
+<!--
+
+git checkout reviewed
+git log --reverse --topo-order --pretty=fuller --stat=$COLUMNS,$COLUMNS -w -p -C --cc ..sourceware/master
+-i
+/^commit |^merge:|^---$|hurd|linux|nacl|nptl|glibc|gs:|gnu-nat|i386gnu
+
+-->
+
+Last reviewed up to the [[Git mirror's ded7dfe6274b281d92a6ed76cedf29d06c918dec
+(2012-12-10) sources|source_repositories/gdb]].
* Globally
@@ -51,15 +60,22 @@ Last reviewed up to the [[Git mirror's ea9812279fe436be9a010d07ef1dbe465199a3d7
* [[gdbserver]]
+ * 82763a3d329b0d342d0273941b1521be9ef0c604 »MODIFIED is unknown, pass it as
+ true.«
+
+ * Configure so that Debian system's `/usr/lib/debug/[...]` will be loaded
+ automatically.
+
# Build
-Here's a log of a GDB build run; this is from our [[Git repository's
-695f61ff0f378e1680964128585044799de27015 (2011-09-06)
-sources|source_repositories/gdb]], run on kepler.SCHWINGE and coulomb.SCHWINGE.
+Here's a log of a GDB build run; this is from our [[Git
+repository|source_repositories/gdb]]'s `tschwinge/Ferry_Tagscherer` branch,
+commit ded7dfe6274b281d92a6ed76cedf29d06c918dec (2012-12-10), run on
+kepler.SCHWINGE and coulomb.SCHWINGE.
$ export LC_ALL=C
- $ ../master/configure --prefix="$PWD".install SHELL=/bin/dash CC=gcc-4.6 CXX=g++-4.6 --disable-werror 2>&1 | tee log_build
+ $ ../Ferry_Tagscherer/configure --prefix="$PWD".install SHELL=/bin/dash CC=gcc-4.6 CXX=g++-4.6 --disable-werror 2>&1 | tee log_build
[...]
$ make 2>&1 | tee log_build_
[...]
@@ -71,9 +87,15 @@ There are several occurences of *error: dereferencing type-punned pointer will
break strict-aliasing rules* in the MIG-generated stub files; thus no `-Werror`
until that is resolved ([[strict_aliasing]]).
-This takes up around 140 MiB and needs roughly 6 min on kepler.SCHWINGE and 30
+This takes up around 200 MiB and needs roughly 7 min on kepler.SCHWINGE and 23
min on coulomb.SCHWINGE.
+<!--
+
+ $ (make && touch .go-install) 2>&1 | tee log_build_ && test -f .go-install && (make install && touch .go-test) 2>&1 | tee log_install && test -f .go-test && make -k check 2>&1 | tee log_test
+
+-->
+
## Analysis
@@ -81,12 +103,86 @@ x86 GNU/Linux' and GNU/Hurd's configurations are slightly different, thus mask
out most of the differences that are due to GNU/Linux supporting more core file
formats and more emulation vectors.
- $ ssh kepler.SCHWINGE 'cd tmp/source/gdb/ && cat hurd/master.build/log_build* | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/gdb/linux/log_build
- $ ssh coulomb.SCHWINGE 'cd tmp/gdb/ && cat hurd/master.build/log_build* | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/gdb/hurd/log_build
- $ diff -wu <(sed -f toolchain/logs/gdb/linux/log_build.sed < toolchain/logs/gdb/linux/log_build) <(sed -f toolchain/logs/gdb/hurd/log_build.sed < toolchain/logs/gdb/hurd/log_build) > toolchain/logs/gdb/log_build.diff
+ $ toolchain/logs/process gdb build
* Why do we specify `-D_GNU_SOURCE`, and GNU/Linux doesn't?
+ * GNU/Linux: `gdb/symfile-mem.c` for vDSO.
+
+ * GNU/Linux: `gdb/i386-nat.c` for hardware breakpoints, etc. -- we should
+ probably use that, too. Related to Samuel's Hurd GDB patch?
+
+ * `gdb/gnu-nat.c`
+
+ gnu-nat.c: In function 'proc_set_exception_port':
+ gnu-nat.c:409:3: warning: format '%d' expects argument of type 'int', but argument 8 has type 'mach_port_t' [-Wformat]
+ gnu-nat.c: In function 'proc_steal_exc_port':
+ gnu-nat.c:449:7: warning: format '%d' expects argument of type 'int', but argument 8 has type 'mach_port_t' [-Wformat]
+ gnu-nat.c:470:7: warning: format '%d' expects argument of type 'int', but argument 8 has type 'mach_port_t' [-Wformat]
+ gnu-nat.c: In function 'make_proc':
+ gnu-nat.c:583:7: warning: format '%d' expects argument of type 'int', but argument 2 has type 'mach_port_t' [-Wformat]
+ gnu-nat.c:586:7: warning: format '%d' expects argument of type 'int', but argument 8 has type 'mach_port_t' [-Wformat]
+ gnu-nat.c: In function 'inf_set_pid':
+ gnu-nat.c:761:3: warning: format '%d' expects argument of type 'int', but argument 7 has type 'task_t' [-Wformat]
+ gnu-nat.c: In function 'inf_validate_procs':
+ gnu-nat.c:1085:6: warning: format '%d' expects argument of type 'int', but argument 8 has type 'thread_t' [-Wformat]
+ gnu-nat.c: In function 'inf_signal':
+ gnu-nat.c:1349:4: warning: format '%d' expects argument of type 'int', but argument 7 has type 'thread_t' [-Wformat]
+ gnu-nat.c:1349:4: warning: format '%d' expects argument of type 'int', but argument 8 has type 'thread_t' [-Wformat]
+ gnu-nat.c: In function 'S_exception_raise_request':
+ gnu-nat.c:1668:3: warning: format '%d' expects argument of type 'int', but argument 7 has type 'thread_t' [-Wformat]
+ gnu-nat.c:1668:3: warning: format '%d' expects argument of type 'int', but argument 8 has type 'task_t' [-Wformat]
+ gnu-nat.c:1705:8: warning: format '%d' expects argument of type 'int', but argument 7 has type 'mach_port_t' [-Wformat]
+ gnu-nat.c:1711:8: warning: format '%d' expects argument of type 'int', but argument 7 has type 'mach_port_t' [-Wformat]
+ gnu-nat.c: In function 'do_mach_notify_dead_name':
+ gnu-nat.c:1762:3: warning: format '%d' expects argument of type 'int', but argument 7 has type 'mach_port_t' [-Wformat]
+ gnu-nat.c: In function 'gnu_write_inferior':
+ gnu-nat.c:2383:8: warning: format '%x' expects argument of type 'unsigned int', but argument 2 has type 'vm_address_t' [-Wformat]
+ gnu-nat.c:2393:8: warning: format '%x' expects argument of type 'unsigned int', but argument 2 has type 'vm_address_t' [-Wformat]
+ gnu-nat.c: In function 'steal_exc_port':
+ gnu-nat.c:2864:5: warning: format '%d' expects argument of type 'int', but argument 2 has type 'mach_port_t' [-Wformat]
+
+
+ * fe19822761b4635f392875a186e48af446b40f41..7a63e9515491f21eaf07301df87d389def20e317:
+
+ `-Wmissing-prototypes`
+
+ gnu-nat.c: At top level:
+ gnu-nat.c:643:1: warning: no previous prototype for 'make_inf' []
+ gnu-nat.c: At top level:
+ gnu-nat.c:879:1: warning: no previous prototype for 'inf_set_traced' []
+ gnu-nat.c:980:1: warning: no previous prototype for 'inf_port_to_thread' []
+ gnu-nat.c: At top level:
+ gnu-nat.c:1748:1: warning: no previous prototype for 'inf_task_died_status' []
+ gnu-nat.c: At top level:
+ gnu-nat.c:2273:1: warning: no previous prototype for 'gnu_read_inferior' []
+ gnu-nat.c:2319:1: warning: no previous prototype for 'gnu_write_inferior' []
+ gnu-nat.c: At top level:
+ gnu-nat.c:3415:1: warning: no previous prototype for '_initialize_gnu_nat' []
+ notify_S.c:305:24: warning: no previous prototype for 'notify_server' []
+ notify_S.c:341:28: warning: no previous prototype for 'notify_server_routine' []
+ process_reply_S.c:343:24: warning: no previous prototype for 'process_reply_server' []
+ process_reply_S.c:379:28: warning: no previous prototype for 'process_reply_server_routine' []
+ msg_reply_S.c:165:24: warning: no previous prototype for 'msg_reply_server' []
+ msg_reply_S.c:201:28: warning: no previous prototype for 'msg_reply_server_routine' []
+ exc_request_S.c:157:24: warning: no previous prototype for 'exc_server' []
+ exc_request_S.c:193:28: warning: no previous prototype for 'exc_server_routine' []
+
+ * `dlopen`/`-ldl`
+
+ -checking for library containing dlopen... none required
+ +checking for library containing dlopen... -ldl
+
+ * `O_NOFOLLOW`
+
+ First seen in
+ 20f498edfd7e57d3297febcf9c7c7d667cc74239..69a5e2b022c7d15ec4c7c49e6f53a8d924d3b72b:
+
+ -checking for working fcntl.h... yes
+ +checking for working fcntl.h... no (bad O_NOFOLLOW)
+
+ [[!taglink open_issue_glibc]]?
+
* Why does GNU/Linux have an additional `-ldl -rdynamic` when linking `gdb`?
@@ -95,33 +191,125 @@ formats and more emulation vectors.
$ make install 2>&1 | tee log_install
[...]
-This takes up around 50 MiB, and needs roughly 1 min on kepler.SCHWINGE and 3
+This takes up around 50 MiB, and needs roughly 1 min on kepler.SCHWINGE and 2
min on coulomb.SCHWINGE.
## Analysis
- $ ssh kepler.SCHWINGE 'cd tmp/source/gdb/ && cat hurd/master.build/log_install | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/gdb/linux/log_install
- $ ssh coulomb.SCHWINGE 'cd tmp/gdb/ && cat hurd/master.build/log_install | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/gdb/hurd/log_install
- $ diff -wu <(sed -f toolchain/logs/gdb/linux/log_install.sed < toolchain/logs/gdb/linux/log_install) <(sed -f toolchain/logs/gdb/hurd/log_install.sed < toolchain/logs/gdb/hurd/log_install) > toolchain/logs/gdb/log_install.diff
+ $ toolchain/logs/process gdb install
* `libtool: finish`: `ldconfig` is not run for the Hurd.
# Testsuite
-On GNU/Hurd, hampered by the [[term_blocking]] issue.
-
$ make -k check
[...]
-This needs roughly 45 min on kepler.SCHWINGE and TODO min on coulomb.SCHWINGE.
+This needs roughly 14 min on kepler.SCHWINGE and 110 min on coulomb.SCHWINGE.
- $ ssh kepler.SCHWINGE 'cd tmp/source/gdb/ && sed -e "s%\(/media/data\)\?${PWD}%[...]%g" < hurd/master.build/gdb/testsuite/gdb.sum' > toolchain/logs/gdb/linux/sum
- $ ssh coulomb.SCHWINGE 'cd tmp/gdb/ && sed -e "s%\(/media/erich\)\?${PWD}%[...]%g" < hurd/master.build/gdb/testsuite/gdb.sum' > toolchain/logs/gdb/hurd/sum
- $ diff -u -F ^Running toolchain/logs/gdb/linux/sum toolchain/logs/gdb/hurd/sum > toolchain/logs/gdb/sum.diff
+When running `make -k check 2>&1 | tee log_test`, the `tee` process does not
+terminate at the end of the testsuite if stray leftover processes still [have
+their stdout/stderr
+open](http://sourceware.org/ml/gdb-patches/2012-10/msg00489.html). `kill`ing
+these (`SIGKILL` may be needed) makes the `tee` process terminate, too. On
+GNU/Hurd, these generally are `gdb.multi/watchpoint-multi`, and an unknown
+(`?`) one ("57 PIDs before" `expect [...] gdb.cp`).
## Analysis
+ $ toolchain/logs/process gdb test
+
+ * Disabled
+
+ * `gdb.base/readline.exp`
+
+ [[term_blocking]] issue.
+
+ * `gdb.base/sigall.exp`
+
+ From `send signal TSTP` on, all FAIL running into timeouts.
+
+ * `gdb.python/py-inferior.exp` (mostly disabled)
+
+ Running ../../../Ferry_Tagscherer/gdb/testsuite/gdb.python/py-inferior.exp ...
+ [...]
+ python print 'result =', i0.was_attached
+ result = False
+ (gdb) PASS: gdb.python/py-inferior.exp: test Inferior.was_attached
+ python print i0.threads ()
+ (<gdb.InferiorThread object at 0x61170>, <gdb.InferiorThread object at 0x61160>)
+ (gdb) FAIL: gdb.python/py-inferior.exp: test Inferior.threads
+ break check_threads
+ Breakpoint 2 at 0x8048869: file ../../../Ferry_Tagscherer/gdb/testsuite/gdb.python/py-inferior.c, line 61.
+ (gdb) continue
+ Continuing.
+ [New Thread 25670.6]
+ [New Thread 25670.7]
+ [New Thread 25670.8]
+ [New Thread 25670.9]
+ [New Thread 25670.10]
+ [New Thread 25670.11]
+ [New Thread 25670.12]
+ [New Thread 25670.13]
+
+ Breakpoint 2, check_threads (barrier=0x15ff144) at ../../../Ferry_Tagscherer/gdb/testsuite/gdb.python/py-inferior.c:61
+ 61 pthread_barrier_wait (barrier);
+ (gdb) PASS: gdb.python/py-inferior.exp: continue to breakpoint: cont to check_threads
+ python print len (i0.threads ())
+ 10
+ (gdb) FAIL: gdb.python/py-inferior.exp: test Inferior.threads 2
+ break 28
+ Breakpoint 3 at 0x80487c2: file ../../../Ferry_Tagscherer/gdb/testsuite/gdb.python/py-inferior.c, line 28.
+ (gdb) continue
+ Continuing.
+ FAIL: gdb.python/py-inferior.exp: continue to breakpoint: cont to Break here. (timeout)
+ python addr = gdb.selected_frame ().read_var ('str')
+ FAIL: gdb.python/py-inferior.exp: read str address (timeout)
+ [All following tests FAIL with timeout.]
+ FAIL: gdb.python/py-inferior.exp: Switch to first inferior (timeout)
+ remove-inferiors 3
+ FAIL: gdb.python/py-inferior.exp: Remove second inferior (timeout)
+
+ At this point, the system hangs; no new processes can be spawned, so
+ perhaps an issue with the exec server.
+
+ * `UNSUPPORTED: gdb.threads/ia64-sigill.exp: Couldn't compile ../../../master/gdb/testsuite/gdb.threads/ia64-sigill.c: unrecognized error`
+
+ ../../../master/gdb/testsuite/gdb.threads/ia64-sigill.c:29:24: fatal error: asm/unistd.h: No such file or directory
+
+ * `UNSUPPORTED: gdb.threads/multi-create.exp: Couldn't compile ../../../master/gdb/testsuite/gdb.threads/multi-create.c: unrecognized error`
+
+ ../../../master/gdb/testsuite/gdb.threads/multi-create.c: In function 'create_function':
+ ../../../master/gdb/testsuite/gdb.threads/multi-create.c:46:39: error: 'PTHREAD_STACK_MIN' undeclared (first use in this function)
+ ../../../master/gdb/testsuite/gdb.threads/multi-create.c:46:39: note: each undeclared identifier is reported only once for each function it appears in
+ ../../../master/gdb/testsuite/gdb.threads/multi-create.c: In function 'main':
+ ../../../master/gdb/testsuite/gdb.threads/multi-create.c:73:39: error: 'PTHREAD_STACK_MIN' undeclared (first use in this function)
+
+ * `UNSUPPORTED: gdb.threads/staticthreads.exp: Couldn't compile ../../../master/gdb/testsuite/gdb.threads/staticthreads.c: unrecognized error`
+
+ ../../../master/gdb/testsuite/gdb.threads/staticthreads.c: In function 'main':
+ ../../../master/gdb/testsuite/gdb.threads/staticthreads.c:52:37: error: 'PTHREAD_STACK_MIN' undeclared (first use in this function)
+ ../../../master/gdb/testsuite/gdb.threads/staticthreads.c:52:37: note: each undeclared identifier is reported only once for each function it appears in
+
+ * `UNSUPPORTED: gdb.threads/watchpoint-fork.exp: parent: multithreaded: Couldn't compile ../../../Ferry_Tagscherer/gdb/testsuite/gdb.threads/watchpoint-fork-mt.c ../../../Ferry_Tagscherer/gdb/testsuite/gdb.threads/watchpoint-fork-parent.c: unrecognized error`
+
+ ../../../Ferry_Tagscherer/gdb/testsuite/gdb.threads/watchpoint-fork-mt.c:29:24: fatal error: asm/unistd.h: No such file or directory
+
+ * `UNSUPPORTED: gdb.threads/create-fail.exp: Couldn't compile ../../../Ferry_Tagscherer/gdb/testsuite/gdb.threads/create-fail.c: unrecognized error`
+
+ [...]/gdb.threads/create-fail.c:77: undefined reference to `pthread_attr_setaffinity_np'
+ [...]/gdb.threads/create-fail.c:83: undefined reference to `pthread_create'
+
+ * `UNSUPPORTED: gdb.threads/siginfo-threads.exp: Couldn't compile ../../../Ferry_Tagscherer/gdb/testsuite/gdb.threads/siginfo-threads.c: unrecognized error`
+
+ ../../../Ferry_Tagscherer/gdb/testsuite/gdb.threads/sigstep-threads.c:22:24: fatal error: asm/unistd.h: No such file or directory
+
+ * `UNTESTED: gdb.base/longest-types.exp: longest-types.exp`
+
+ ../../../Ferry_Tagscherer/gdb/testsuite/gdb.base/longest-types.c:20:8: error: size of array 'buf' is too large
+
+ Also on GNU/Linux.
+
TODO.
diff --git a/open_issues/glibc.mdwn b/open_issues/glibc.mdwn
index 31cafbfe..734806a1 100644
--- a/open_issues/glibc.mdwn
+++ b/open_issues/glibc.mdwn
@@ -36,8 +36,8 @@ git log --reverse --pretty=fuller --stat=$COLUMNS,$COLUMNS -w -p -C --cc ..sourc
-->
-Last reviewed up to the [[Git mirror's 56e49b714ecd32c72c334802b00e3d62008d98e3
-(2012-07-25) sources|source_repositories/glibc]].
+Last reviewed up to the [[Git mirror's d3bd58cf0a027016544949ffd27300ac5fb01bb8
+(2012-11-03) sources|source_repositories/glibc]].
* `t/hurdsig-fixes`
@@ -83,6 +83,35 @@ Last reviewed up to the [[Git mirror's 56e49b714ecd32c72c334802b00e3d62008d98e3
Might simply be a missing patch(es) from master.
+ * `--disable-multi-arch`
+
+ IRC, freenode, #hurd, 2012-11-22
+
+ <pinotree> tschwinge: is your glibc build w/ or w/o multiarch?
+ <tschwinge> pinotree: See open_issues/glibc: --disable-multi-arch
+ <pinotree> ah, because you do cross-compilation?
+ <tschwinge> No, that's natively.
+ <tschwinge> There is also a not of what happened in cross-gnu when I
+ enabled multi-arch.
+ <tschwinge> No idea whether that's still relevant, though.
+ <pinotree> EPARSE
+ <tschwinge> s%not%note
+ <tschwinge> Better?
+ <pinotree> yes :)
+ <tschwinge> As for native builds: I guess I just didn't (want to) play
+ with it yet.
+ <pinotree> it is enabled in debian since quite some time, maybe other
+ i386/i686 patches (done for linux) help us too
+ <tschwinge> I though we first needed some CPU identification
+ infrastructure before it can really work?
+ <tschwinge> I thought [...].
+ <pinotree> as in use the i686 variant as runtime automatically? i guess
+ so
+ <tschwinge> I thought I had some notes about that, but can't currently
+ find them.
+ <tschwinge> Ah, I probably have been thinking about open_issues/ifunc
+ and open_issues/libc_variant_selection.
+
* --build=X
`long double` test: due to `cross_compiling = maybe` wants to execute a
@@ -184,7 +213,8 @@ Last reviewed up to the [[Git mirror's 56e49b714ecd32c72c334802b00e3d62008d98e3
`AT_EMPTY_PATH`, `CLOCK_BOOTTIME`, `CLOCK_BOOTTIME_ALARM`,
`CLOCK_REALTIME_ALARM`, `O_PATH`,
- `PTRACE_*` (for example, cbff0d9689c4d68578b6a4f0a17807232506ea27),
+ `PTRACE_*` (for example, cbff0d9689c4d68578b6a4f0a17807232506ea27,
+ b1b2aaf8eb9eed301ea8f65b96844568ca017f8b),
`RLIMIT_RTTIME`, `SEEK_DATA` (`unistd.h`), `SEEK_HOLE` (`unistd.h`)
`clock_adjtime`, `fallocate`, `fallocate64`, `name_to_handle_at`,
`open_by_handle_at`, `process_vm_readv`, `process_vm_writev`, `sendmmsg`,
@@ -274,7 +304,7 @@ Last reviewed up to the [[Git mirror's 56e49b714ecd32c72c334802b00e3d62008d98e3
We should be easily able to implement that one.
- * `futimesat`, `readlinkat`, `renameat`
+ * `futimesat`, `readlinkat`
If we have all of 'em (check Linux kernel), `#define __ASSUME_ATFCTS`.
@@ -352,6 +382,24 @@ Last reviewed up to the [[Git mirror's 56e49b714ecd32c72c334802b00e3d62008d98e3
<pinotree> like posix/tst-waitid.c, you mean?
<youpi> yes
+ * `getconf` things
+
+ IRC, freenode, #hurd, 2012-10-03
+
+ <pinotree> getconf -a | grep CACHE
+ <Tekk_> pinotree: I hate spoiling data, but 0 :P
+ <pinotree> had that feeling, but wanted to be sure -- thanks!
+ <Tekk_> http://dpaste.com/809519/
+ <Tekk_> except for uhh
+ <Tekk_> L4 linesize
+ <Tekk_> that didn't have any number associated
+ <pinotree> weird
+ <Tekk_> I actually didn't even know that there was L4 cache
+ <pinotree> what do you get if you run `getconf
+ LEVEL4_CACHE_LINESIZE`?
+ <Tekk_> pinotree: undefined
+ <pinotree> expected, given the output above
+
For specific packages:
* [[octave]]
@@ -386,6 +434,270 @@ Last reviewed up to the [[Git mirror's 56e49b714ecd32c72c334802b00e3d62008d98e3
* `sysdeps/unix/sysv/linux/syslog.c`
+ * `fsync` on a pipe
+
+ IRC, freenode, #hurd, 2012-08-21:
+
+ <braunr> pinotree: i think gnu_srs spotted a conformance problem in
+ glibc
+ <pinotree> (only one?)
+ <braunr> pinotree: namely, fsync on a pipe (which is actually a
+ socketpair) doesn't return EINVAL when the "operation not supported"
+ error is returned as a "bad request message ID"
+ <braunr> pinotree: what do you think of this case ?
+ <pinotree> i'm far from an expert on such stuff, but seems a proper E*
+ should be returned
+ <braunr> (there also is a problem in clisp falling in an infinite loop
+ when trying to handle this, since it uses fsync inside the error
+ handling code, eww, but we don't care :p)
+ <braunr> basically, here is what clisp does
+ <braunr> if fsync fails, and the error isn't EINVAL, let's report the
+ error
+ <braunr> and reporting the error in turn writes something on the
+ output/error stream, which in turn calls fsync again
+ <pinotree> smart
+ <braunr> after the stack is exhausted, clisp happily crashes
+ <braunr> gnu_srs: i'll alter the clisp code a bit so it knows about our
+ mig specific error
+ <braunr> if that's the problem (which i strongly suspect), the solution
+ will be to add an error conversion for fsync so that it returns
+ EINVAL
+ <braunr> if pinotree is willing to do that, he'll be the only one
+ suffering from the dangers of sending stuff to the glibc maintainers
+ :p
+ <pinotree> that shouldn't be an issue i think, there are other glibc
+ hurd implementations that do such checks
+ <gnu_srs> does fsync return EINVAL for other OSes?
+ <braunr> EROFS, EINVAL
+ <braunr> fd is bound to a special file which does not
+ support synchronization.
+ <braunr> obviously, pipes and sockets don't
+ <pinotree>
+ http://pubs.opengroup.org/onlinepubs/9699919799/functions/fsync.html
+ <braunr> so yes, other OSes do just that
+ <pinotree> now that you speak about it, it could be the failure that
+ the gnulib fsync+fdatasync testcase have when being run with `make
+ check` (although not when running as ./test-foo)
+ <braunr> hm we may not need to change glibc
+ <braunr> clisp has a part where it defines a macro IS_EINVAL which is
+ system specific
+ <braunr> (but we should change it in glibc for conformance anyway)
+ <braunr> #elif defined(UNIX_DARWIN) || defined(UNIX_FREEBSD) ||
+ defined(UNIX_NETBSD) || defined(UNIX_OPENBSD) #define IS_EINVAL_EXTRA
+ ((errno==EOPNOTSUPP)||(errno==ENOTSUP)||(errno==ENODEV))
+ <pinotree> i'd rather add nothing to clisp
+ <braunr> let's see what posix says
+ <braunr> EINVAL
+ <braunr> so right, we should simply convert it in glibc
+ <gnu_srs> man fsync mentions EINVAL
+ <braunr> man pages aren't posix, even if they are usually close
+ <gnu_srs> aha
+ <pinotree> i think checking for MIG_BAD_ID and EOPNOTSUPP (like other
+ parts do) will b enough
+ <pinotree> *be
+ <braunr> gnu_srs: there, it finished correctly even when piped
+ <gnu_srs> I saw that, congrats!
+ <braunr> clisp is quite tricky to debug
+ <braunr> i never had to deal with a program that installs break points
+ and handles segfaults itself in order to implement growing stacks :p
+ <braunr> i suppose most interpreters do that
+ <gnu_srs> So the permanent change will be in glibc, not clisp?
+ <braunr> yes
+
+ IRC, freenode, #hurd, 2012-08-24:
+
+ <gnu_srs1> pinotree: The changes needed for fsync.c are at
+ http://paste.debian.net/185379/ if you want to try it out (confirmed
+ with rbraun)
+ <youpi> I agree with the patch, posix indeed documents einval as the
+ "proper" error value
+ <pinotree> there's fdatasync too
+ <pinotree> other places use MIG_BAD_ID instead of EMIG_BAD_ID
+ <braunr> pinotree: i assume that if you're telling us, it's because
+ they have different values
+ <pinotree> braunr: tbh i never seen the E version, and everywhere in
+ glibc the non-E version is used
+ <gnu_srs1> in sysdeps/mach/hurd/bits/errno.h only the E version is
+ defined
+ <pinotree> look in gnumach/include/mach/mig_errors.h
+ <pinotree> (as the comment in errno.h says)
+ <gnu_srs1> mig_errors.h yes. Which comment: from errors.h: /* Errors
+ from <mach/mig_errors.h>. */ and then the EMIG_ stuff?
+ <gnu_srs1> Which one is used when building libc?
+ <gnu_srs1> Answer: At least in fsync.c errno.h is used: #include
+ <errno.h>
+ <gnu_srs1> Yes, fdatasync.c should be patched too.
+ <gnu_srs1> pinotree: You are right: EMIG_ or MIG_ is confusing.
+ <gnu_srs1> /usr/include/i386-gnu/bits/errno.h: /* Errors from
+ <mach/mig_errors.h>. */
+ <gnu_srs1> /usr/include/hurd.h:#include <mach/mig_errors.h>
+
+ IRC, freenode, #hurd, 2012-09-02:
+
+ <antrik> braunr: regarding fsync(), I agree that EOPNOTSUPP probably
+ should be translated to EINVAL, if that's what POSIX says. it does
+ *not* sound right to translate MIG_BAD_ID though. the server should
+ explicitly return EOPNOTSUPP, and that's what the default trivfs stub
+ does. if you actually do see MIG_BAD_ID, there must be some other
+ bug...
+ <braunr> antrik: right, pflocal doesn't call the trivfs stub for socket
+ objects
+ <braunr> trivfs_demuxer is only called by the pflocal node demuxer, for
+ socket objects it's another call, and i don't think it's the right
+ thing to call trivfs_demuxer there either
+ <pinotree> handling MIG_BAD_ID isn't a bad idea anyway, you never know
+ what the underlying server actually implements
+ <pinotree> (imho)
+ <braunr> for me, a bad id is the same as a not supported operation
+ <pinotree> ditto
+ <pinotree> from fsync's POV, both the results are the same anyway, ie
+ that the server does not support a file_sync operation
+ <antrik> no, a bad ID means the server doesn't implement the protocol
+ (or not properly at least)
+ <antrik> it's usually a bug IMHO
+ <antrik> there is a reason we have EOPNOTSUPP for operations that are
+ part of a protocol but not implemented by a particular server
+ <pinotree> antrik: even if it could be the case, there's no reason to
+ make fsync fail anyway
+ <antrik> pinotree: I think there is. it indicates a bug, which should
+ not be hidden
+ <pinotree> well, patches welcome then...
+ <antrik> thing is, if sock objects are actually not supposed to
+ implement the file interface, glibc shouldn't even *try* to call
+ fsync on them
+ <pinotree> how?
+ <pinotree> i mean, can you check whether the file interface is not
+ implemented, without doing a roundtrip^
+ <pinotree> ?
+ <antrik> well, the sock objects are not files, i.e. they were *not*
+ obtained by file_name_lookup(), but rather a specific call. so glibc
+ actually *knows* that they are not files.
+ <braunr> antrik: this way of thinking means we need an "fd" protocol
+ <braunr> so that objects accessed through a file descriptor implement
+ all fd calls
+ <antrik> now I wonder though whether there are conceivable use cases
+ where it would make sense for objects obtained through the socket
+ call to optionally implement the file interface...
+ <antrik> which could actually make sense, if libc lets through other
+ file calls as well (which I guess it does, if the sock ports are
+ wrapped in normal fd structures?)
+ <braunr> antrik: they are
+ <braunr> and i'd personally be in favor of such an fd protocol, even if
+ it means implementing stubs for many useless calls
+ <braunr> but the way things are now suggests a bad id really means an
+ operation is simply not supported
+ <antrik> the question in this case is whether we should make the file
+ protocol mandatory for anything that can end up in an FD; or whether
+ we should keep it optional, and add the MIG_BAD_ID calls to *all* FD
+ operations
+ <antrik> (there is no reason for fsync to be special in this regard)
+ <braunr> yes
+ <antrik> braunr: BTW, I'm rather undecided whether the right approach
+ is a) requiring an FD interface collection, b) always checking
+ MIG_BAD_ID, or perhaps c) think about introducing a mechanism to
+ explicitly query supported interfaces...
+
+ IRC, freenode, #hurd, 2012-09-03:
+
+ <braunr> antrik: querying interfaces sounds like an additional penalty
+ on performance
+ <antrik> braunr: the query usually has to be done only once. in fact it
+ could be integrated into the name lookup...
+ <braunr> antrik: once for every object
+ <braunr> antrik: yes, along with the lookup would be a nice thing
+
+ [[!message-id "1351231423.8019.19.camel@hp.my.own.domain"]].
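
The error conversion proposed in this discussion can be modelled as below. This is a sketch of the logic only, written in Python for brevity; it is not the glibc patch from the paste linked above. The function name is hypothetical, and the numeric value given to `MIG_BAD_ID` is an assumption standing in for the definition in `<mach/mig_errors.h>`.

```python
import errno

# Stand-in for MIG_BAD_ID from <mach/mig_errors.h>; the numeric value is an
# assumption made for this sketch.
MIG_BAD_ID = -303


def fsync_result(server_error: int) -> int:
    """Map the error returned by the server's sync RPC to fsync's errno.

    0 means success.  "Operation not supported" and "bad request message
    ID" both mean the object cannot be synchronized, which POSIX reports
    as EINVAL.  Every other error is passed through unchanged.
    """
    if server_error in (errno.EOPNOTSUPP, MIG_BAD_ID):
        return errno.EINVAL
    return server_error
```

antrik's objection corresponds to dropping `MIG_BAD_ID` from the tuple, so that a server which does not implement the protocol at all still surfaces as a distinct error instead of being hidden behind `EINVAL`.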
+
+ * `t/no-hp-timing`
+
+ IRC, freenode, #hurd, 2012-11-16
+
+ <pinotree> tschwinge: wrt the glibc topgit branch t/no-hp-timing,
+ couldn't that file be just replaced by #include
+ <sysdeps/generic/hp-timing.h>?
+
+ * `flockfile`/`ftrylockfile`/`funlockfile`
+
+ IRC, freenode, #hurd, 2012-11-16
+
+ <pinotree> youpi: uhm, in glibc we use
+ stdio-common/f{,try,un}lockfile.c, which do nothing (as opposed to eg
+ the nptl versions, which do lock/trylock/unlock); do you know more
+ about them?
+ <youpi> pinotree: ouch
+ <youpi> no, I don't know
+ <youpi> well, I do know what they're supposed to do
+ <pinotree> i'm trying filling them, let's see
+ <youpi> but not why we don't have them
+ <youpi> (except that libpthread is "recent")
+ <youpi> yet another reason to build libpthread in glibc, btw
+ <youpi> oh, but we do provide lockfile in libpthread, don't we ?
+ <youpi> pinotree: yes, and libc has weak variants, so the libpthread
+ will take over
+ <pinotree> youpi: sure, but that in stuff linking to pthreads
+ <pinotree> if you do a simple application doing eg main() { fopen +
+ fwrite + fclose }, you get no locking
+ <youpi> so?
+ <youpi> if you don't have threads, you don't need locks :)
+ <pinotree> ... unless there is some indirect recursion
+ <youpi> ?
+ <pinotree> basically, i was debugging why glibc tests with mtrace() and
+ ending with muntrace() would die (while tests without muntrace call
+ wouldn't)
+ <youpi> well, I still don't see what a lock will bring
+ <pinotree> if you look at the muntrace implementation (in
+ malloc/mtrace.c), basically fclose can trigger a malloc hook (because
+ of the free for the FILE*)
+ <youpi> either you have threads, and it's needed, or you don't, and it's
+ a nop
+ <youpi> yes, and ?
+ <braunr> does the signal thread count ?
+ <youpi> again, in linux, when you don't have threads, the lock is a nop
+ <youpi> does the signal thread use IO ?
+ <braunr> that's the question :)
+ <braunr> i hope not
+ <youpi> IIRC the signal thread just manages signals, and doesn't
+ execute the handler itself
+ <braunr> sure
+ <braunr> i was more thinking about debug stuff
+ <youpi> can't hurt to add them anyway, but let me still doubt that it'd
+ fix muntrace, I don't see why it would, unless you have threads
+ <pinotree> that's what i'm going next
+ <pinotree> pardon, it seems i got confused a bit
+ <pinotree> it'd look like a genuine muntrace bug (muntrace → fclose →
+ free hook → lock lock → fprint (since the FILE is still set) → malloc
+ → malloc hook → lock lock → spin)
+ <pinotree> at least i got some light over the flockfile stuff, thanks
+ ;)
+ <pinotree> youpi: otoh, __libc_lock_lock (etc) are noop in the base
+ implementation, while doing real locks on hurd in any case, and on
+ linux only if nptl is loaded, it seems
+ <pinotree> that would explain why on linux you get no deadlock
+ <youpi> unless using nptl, that is?
+ <pinotree> hm no, even with pthread it works
+ <pinotree> but hey, at least the affected glibc test now passes
+ <pinotree> will maybe try to do investigation on why it works on linux
+ tomorrow
+
+ [[!message-id "201211172058.21035.toscano.pino@tiscali.it"]].
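
The re-entrancy chain pinotree describes (free hook takes the lock, logging allocates, malloc hook tries to take the same lock) can be modelled in a few lines. This is an illustrative model, not glibc code: the hook names and the event list are hypothetical, and a non-blocking acquire is used so that the sketch reports the would-be deadlock instead of spinning.

```python
import threading

# A plain, non-recursive lock, standing in for the lock taken by the hooks.
hook_lock = threading.Lock()
events = []


def malloc_hook():
    # A blocking acquire would never return here, because the caller already
    # holds the lock; the non-blocking form lets the sketch report that.
    if not hook_lock.acquire(blocking=False):
        events.append("malloc hook: lock already held (would deadlock)")
        return
    try:
        events.append("malloc hook: logged allocation")
    finally:
        hook_lock.release()


def free_hook():
    with hook_lock:
        events.append("free hook: logging")
        # Logging allocates, which re-enters the hooks while the lock is held.
        malloc_hook()


free_hook()
```

With a no-op lock, as in the base `__libc_lock_lock` implementation mentioned above, the inner acquire would trivially succeed, which is consistent with the observation that the single-threaded case does not hang there.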
+
+ * `t/pagesize`
+
+ IRC, freenode, #hurd, 2012-11-16
+
+ <pinotree> tschwinge: somehow related to your t/pagesize branch: due to
+ the fact that EXEC_PAGESIZE is not defined on hurd, libio/libioP.h
+ switches the allocation modes from mmap to malloc
+
+ * `LD_DEBUG`
+
+ IRC, freenode, #hurd, 2012-11-22
+
+ <pinotree> woot, `LD_DEBUG=libs /bin/ls >/dev/null` prints stuff and
+ then sigsegv
+ <tschwinge> Yeah, that's known for years... :-D
+ <tschwinge> Probably not too difficult to resolve, though.
+
* Verify baseline changes, if we need any follow-up changes:
* a11ec63713ea3903c482dc907a108be404191a02
@@ -554,9 +866,6 @@ Last reviewed up to the [[Git mirror's 56e49b714ecd32c72c334802b00e3d62008d98e3
* [low] CFI for `_start`, 6a1bd2a100c958d30bbfe8c9b8f9071d24b7c3f4,
[[!message-id "20120316180551.GA6291@host2.jankratochvil.net"]] -- what
about other architectures?
- * `sendmmsg` usage, c030f70c8796c7743c3aa97d6beff3bd5b8dcd5d -- need a
- `ENOSYS` stub, [[!message-id "87a9zubdm9.fsf@schwinge.name"]],
- `t/sendmmsg`.
* `linkobj/libc.so`, 510bbf14b4f25fec8ee3a2d24de3f24bdbf84333 -- need to
adapt for (conditional?) Sun RPC reversion (if that was the original
cause for the patch)?
@@ -564,13 +873,35 @@ Last reviewed up to the [[Git mirror's 56e49b714ecd32c72c334802b00e3d62008d98e3
3e5aef87d76cfa7354f2b0d82b96e59280720796, [[!message-id
"20120517134700.GA19046@intel.com"]] -- only updates one copy of
`bits/statfs.h`; update the others, too, for consistency.
+ * [low] 789bd351b45f024b7f51e4886bf46b8e887ab6da: remove
+ `libc_hidden_def` in `sysdeps/mach/hurd/accept4.c`?
+ * 0948c3af9dfb3bc1312d6bed2f3a6bfd4e96eef4,
+ b80af2f40631871cf53a5e39d08d5d5516473b96,
+ 04570aaa8ad88caad303f8afe469beb4cf851e17 `_dl_initial_dtv`: OK?
+ * [very low] ea4d37b3169908615b7c17c9c506c6a6c16b3a26 `Implement
+ POSIX-generic sleep via nanosleep rather than SIGARLM.`: any benefit
+ using that one (with `sysdeps/mach/nanosleep.c`) instead of
+ `sysdeps/mach/sleep.c`?
* *baseline*
+ * ea4d37b3169908615b7c17c9c506c6a6c16b3a26 -- IRC, freenode, #hurd,
+ 2012-11-20, pinotree: »tschwinge: i agree on your comments on
+ ea4d37b3169908615b7c17c9c506c6a6c16b3a26, especially since mach's
+ sleep.c is buggy (not considers interruption, extra time() (= RPC)
+ call)«.
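
To make the comparison concrete, here is a minimal sketch of a `sleep` built on `nanosleep`, assuming only the POSIX interface. It is not the glibc implementation; the name `sleep_via_nanosleep` and the rounding of the remaining time are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for `seconds`, returning the number of
   seconds left unslept if a signal interrupted the call. */
unsigned int sleep_via_nanosleep(unsigned int seconds)
{
    struct timespec req = { .tv_sec = (time_t) seconds, .tv_nsec = 0 };
    struct timespec rem = { 0, 0 };

    if (nanosleep(&req, &rem) == 0)
        return 0;                       /* slept the full interval */

    if (errno != EINTR)
        return seconds;                 /* nothing was slept */

    /* Interrupted: nanosleep already reports the remaining time, so no
       extra time() call is needed.  Round to the nearest second. */
    return (unsigned int) rem.tv_sec + (rem.tv_nsec >= 500000000L ? 1 : 0);
}
```

The point of the sketch is that interruption handling comes from `nanosleep` itself through its `rem` argument, which addresses both of the issues pinotree mentions.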
+
+
+## Update
+
+`baseline`, `t/regenerate_configure` (could now be removed),
+`t/master_backports`, `t/eglibc_backports`, `t/host-independency`,
+`tschwinge/Roger_Whittaker`
# Build
Here's a log of a glibc build run; this is from our [[Git repository's
-8958805c11c741d9211e20612c86271d906c9a0b (2012-07-28; 2012-06-30)
+28b74f8dbc3eb639d35fc0f93021ac5eb1fde9a4 (2012-11-03;
+fbeafedeea37e0af1984a6511018d159f5ceed6a (2012-11-03))
sources|source_repositories/glibc]], run on coulomb.SCHWINGE.
$ export LC_ALL=C
@@ -579,12 +910,12 @@ sources|source_repositories/glibc]], run on coulomb.SCHWINGE.
$ make install_root=/INVALID 2>&1 | tee log_build_
[...]
-This takes up around 500 MiB and needs roughly X min on kepler.SCHWINGE and 100
-min on coulomb.SCHWINGE.
+This takes up around 500 MiB, and needs roughly X min on kepler.SCHWINGE and
+100 min on coulomb.SCHWINGE.
<!--
- $ (make install_root=/INVALID && touch .go-install) 2>&1 | tee log_build_ && test -f .go-install && (make install_root="$PWD".install install && touch .go-check) 2>&1 | tee log_install && test -f .go-check && ln -s /usr/lib/i386-*gnu/libstdc++.so.6 /lib/i386-*gnu/libpthread-stubs.so.0 /lib/i386-*gnu/libgcc_s.so.1 mach/libmachuser.so.1 hurd/libhurduser.so.0.3 ./ && make -k install_root=/INVALID check fast-check=yes 2>&1 | tee log_test
+ $ (make install_root=/INVALID && touch .go-install) 2>&1 | tee log_build_ && test -f .go-install && (make install_root="$PWD".install install && touch .go-test) 2>&1 | tee log_install && test -f .go-test && ln -s /usr/lib/i386-*gnu/libstdc++.so.6 /lib/i386-*gnu/libpthread-stubs.so.0 /lib/i386-*gnu/libgcc_s.so.1 mach/libmachuser.so.1 hurd/libhurduser.so.0.3 ./ && make -k install_root=/INVALID check fast-check=yes 2>&1 | tee log_test
Mask out gcc-4.X (with possibly a backslash before the dot), GCC 4.5's column
output for (warning, error) messages, GCC 4.6's `[-Wsomething]` or `[enabled by
@@ -641,19 +972,6 @@ TODO.
* baseline
fd5bdc0924e0cfd1688b632068c1b26f3b0c88da..2ba92745c36eb3c3f3af0ce1b0aebd255c63a13b
- introduces:
-
- genops.c: In function '_IO_flush_all_lockp':
- genops.c:869:3: warning: passing argument 1 of '__save_FCT' makes pointer from integer without a cast [enabled by default]
- genops.c:869:3: note: expected 'void *' but argument is of type 'int'
-
- A similar warning has already been (and still is) seen here:
-
- dl-iteratephdr.c:83:3: warning: passing argument 1 of '__save_FCT' makes pointer from integer without a cast [enabled by default]
- dl-iteratephdr.c:83:3: note: expected 'void *' but argument is of type 'int'
-
- * baseline
- fd5bdc0924e0cfd1688b632068c1b26f3b0c88da..2ba92745c36eb3c3f3af0ce1b0aebd255c63a13b
(or probably Samuel's mmap backport) introduces:
../sysdeps/mach/hurd/mmap.c: In function '__mmap':
@@ -678,13 +996,34 @@ TODO.
2ba92745c36eb3c3f3af0ce1b0aebd255c63a13b..7a270350a9bc3110cd5ba12bbd8c5c8c365e0032
introduces:
- In file included from regex.c:62:0:
- regcomp.c: In function 'init_word_char':
- regcomp.c:935:4: warning: large integer implicitly truncated to unsigned type [-Woverflow]
- regcomp.c:936:4: warning: large integer implicitly truncated to unsigned type [-Woverflow]
-
tst-relsort1.c:6:1: warning: function declaration isn't a prototype [-Wstrict-prototypes]
+ * baseline
+ fc56c5bbc1a0d56b9b49171dd377c73c268ebcfd..cbc818d0ee66065f3942beffdca82986615aa19a
+ introduces
+
+ +gcc-4.6 tst-printf-round.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline -Wwrite-strings -fmerge-all-constants -frounding-math -g -Wno-parentheses -Wstrict-prototypes -I../include -I[...]/tschwinge/Roger_Whittaker.build-gcc-4.
+ +tst-printf-round.c: In function 'do_test':
+ +tst-printf-round.c:203:11: warning: passing argument 3 of 'test_hex_in_one_mode' discards 'const' qualifier from pointer target type [enabled by default]
+ +tst-printf-round.c:139:1: note: expected 'const char **' but argument is of type 'const char * const*'
+ +tst-printf-round.c:208:8: warning: passing argument 3 of 'test_hex_in_one_mode' discards 'const' qualifier from pointer target type [enabled by default]
+ +tst-printf-round.c:139:1: note: expected 'const char **' but argument is of type 'const char * const*'
+ +tst-printf-round.c:216:8: warning: passing argument 3 of 'test_hex_in_one_mode' discards 'const' qualifier from pointer target type [enabled by default]
+ +tst-printf-round.c:139:1: note: expected 'const char **' but argument is of type 'const char * const*'
+ +tst-printf-round.c:224:8: warning: passing argument 3 of 'test_hex_in_one_mode' discards 'const' qualifier from pointer target type [enabled by default]
+ +tst-printf-round.c:139:1: note: expected 'const char **' but argument is of type 'const char * const*'
+
+ gcc-4.6 test-wcschr.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline -Wwrite-strings -fmerge-all-constants -frounding-math -g -Wno-parentheses -Wstrict-prototypes -I../include -I[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486
+ +In file included from test-wcschr.c:2:0:
+ +../string/test-strchr.c: In function 'check1':
+ +../string/test-strchr.c:249:3: warning: passing argument 1 of 'stupid_STRCHR' from incompatible pointer type [enabled by default]
+ +../string/test-strchr.c:77:1: note: expected 'const wchar_t *' but argument is of type 'char *'
+ +../string/test-strchr.c:249:22: warning: initialization from incompatible pointer type [enabled by default]
+ +../string/test-strchr.c:252:5: warning: passing argument 2 of 'check_result' from incompatible pointer type [enabled by default]
+ +../string/test-strchr.c:92:1: note: expected 'const wchar_t *' but argument is of type 'char *'
+ +../string/test-strchr.c:252:5: warning: passing argument 4 of 'check_result' from incompatible pointer type [enabled by default]
+ +../string/test-strchr.c:92:1: note: expected 'const wchar_t *' but argument is of type 'char *'
+
# Install
@@ -760,10 +1099,12 @@ There is quite a baseline of failures.
`configure` magic akin to the `fixincludes` stuff (`gcc-4.4
-print-file-name=libstdc++.so.6`, etc.).
- * `debug/tst-chk4`, `debug/tst-chk5`, `debug/tst-chk6`, `debug/tst-lfschk4`,
- `debug/tst-lfschk5`, `debug/tst-lfschk6`
+ Even if that is being worked around, the tests fail with:
- Fail in the same way as the C ones, `tst-chk1..3`.
+ dlopen failed: [...]/libc.so.0.3: version `GLIBC_2.13_DEBIAN_31' not found (required by [...]/libstdc++.so.6)
+ dlopen failed: [...]/libc.so.0.3: version `GLIBC_2.13_DEBIAN_31' not found (required by [...]/libgcc_s.so.1)
+
+ [[packaging_libpthread]].
* `io/ftwtest`, `posix/globtest`, `iconvdata/iconv-test`, `intl/tst-gettext`,
`malloc/tst-mtrace`, `elf/tst-pathopt`, `iconvdata/tst-tables`,
@@ -793,7 +1134,7 @@ There is quite a baseline of failures.
SIGSEGV.
- * `rt/tst-aio10`, `rt/tst-aio9`
+ * `rt-tst-aio2`, `rt-tst-aio3`, `rt/tst-aio10`, `rt/tst-aio9`
/home/thomas/tmp/glibc/tschwinge/Roger_Whittaker.build-gcc-4.4-486.O/rt/tst-aio10.o: In function `do_test':
tst-aio10.c:(.text+0x1b): undefined reference to `pthread_self'
@@ -806,7 +1147,7 @@ There is quite a baseline of failures.
collect2: ld returned 1 exit status
make[2]: *** [/home/thomas/tmp/glibc/tschwinge/Roger_Whittaker.build-gcc-4.4-486.O/rt/tst-aio10] Error 1
- * `rt-tst-aio2`, `rt-tst-aio3`, `rt/tst-mqueue3`, `rt/tst-mqueue6`,
+ * `rt/tst-mqueue3`, `rt/tst-mqueue6`,
`rt/tst-mqueue8`, `elf/tst-thrlock`, `rt/tst-timer3`,
`nss//libnss_test1.so`
@@ -843,7 +1184,8 @@ There is quite a baseline of failures.
Is not implemented; see above. In 8958805c11c741d9211e20612c86271d906c9a0b
testing, `stdlib/bug-getcontext.out` now says: *Skipping test; no support
- for FP exceptions.*
+ for FP exceptions.*, in cba1c83ad62a11347684a9daf349e659237a1741 testing,
+ it's back to the previous failure.
* `elf/tst-unique3lib.so`, `elf/tst-unique3lib2.so`, `elf/tst-unique4lib.so`
@@ -873,6 +1215,48 @@ There is quite a baseline of failures.
As of 8958805c11c741d9211e20612c86271d906c9a0b, this test now passes --
correct?
+ * `stdlib/tst-secure-getenv.out`
+
+ Needs [[`/proc/self/exe`|hurd/translator/procfs/jkoenig/discussion]].
+
+ * `elf/tst-array*`
+
+ Failures also seen on GNU/Linux; [[!message-id
+ "50950082.1070906@df1tl.local.here"]].
+
+ gcc-4.6 tst-array1.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline -Wwrite-strings -fmerge-all-constants -frounding-math -g -Wno-parentheses -Wstrict-prototypes -I../include -I[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/
+ gcc-4.6 -nostdlib -nostartfiles -o [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array1 -Wl,-dynamic-linker=/lib/ld.so.1 -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486
+ [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/ld.so.1 --library-path [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486:[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/math:[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf:[
+ cmp [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array1.out tst-array1.exp > /dev/null
+ make[2]: *** [[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array1.out] Error 1
+ gcc-4.6 tst-array2.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline -Wwrite-strings -fmerge-all-constants -frounding-math -g -Wno-parentheses -Wstrict-prototypes -I../include -I[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/
+ gcc-4.6 tst-array2dep.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline -Wwrite-strings -fmerge-all-constants -frounding-math -g -Wno-parentheses -Wstrict-prototypes -fPIC -I../include -I[...]/tschwinge/Roger_Whittaker.build-gcc
+ gcc-4.6 -shared -static-libgcc -Wl,-dynamic-linker=/lib/ld.so.1 -Wl,-z,defs -B[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/csu/ -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both -L[...]/tschwinge/Roger_Whittaker.build-gcc-4.6
+ gcc-4.6 -nostdlib -nostartfiles -o [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array2 -Wl,-dynamic-linker=/lib/ld.so.1 -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486
+ [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/ld.so.1 --library-path [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486:[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/math:[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf:[
+ cmp [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array2.out tst-array2.exp > /dev/null
+ make[2]: *** [[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array2.out] Error 1
+ gcc-4.6 tst-array3.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline -Wwrite-strings -fmerge-all-constants -frounding-math -g -Wno-parentheses -Wstrict-prototypes -I../include -I[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/
+ gcc-4.6 -nostdlib -nostartfiles -o [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array3 -Wl,-dynamic-linker=/lib/ld.so.1 -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486
+ [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/ld.so.1 --library-path [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486:[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/math:[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf:[
+ cmp [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array3.out tst-array1.exp > /dev/null
+ make[2]: *** [[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array3.out] Error 1
+ gcc-4.6 tst-array4.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline -Wwrite-strings -fmerge-all-constants -frounding-math -g -Wno-parentheses -Wstrict-prototypes -I../include -I[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/
+ gcc-4.6 -nostdlib -nostartfiles -o [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array4 -Wl,-dynamic-linker=/lib/ld.so.1 -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486
+ [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/ld.so.1 --library-path [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486:[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/math:[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf:[
+ cmp [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array4.out tst-array4.exp > /dev/null
+ make[2]: *** [[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array4.out] Error 1
+
+ `tst-array5` passes.
+
+ gcc-4.6 tst-array1-static.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline -Wwrite-strings -fmerge-all-constants -frounding-math -g -Wno-parentheses -Wstrict-prototypes -I../include -I[...]/tschwinge/Roger_Whittaker.build-gcc-4
+ gcc-4.6 -nostdlib -nostartfiles -static -o [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array1-static [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/csu/crt0.o [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/csu/crti
+ [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array1-static > [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array1-static.out
+ cmp [...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array1-static.out tst-array1.exp > /dev/null
+ make[2]: *** [[...]/tschwinge/Roger_Whittaker.build-gcc-4.6-486/elf/tst-array1-static.out] Error 1
+
+ `tst-array5-static` passes.
+
## OLD
diff --git a/open_issues/glibc/t/tls-threadvar.mdwn b/open_issues/glibc/t/tls-threadvar.mdwn
index e72732ab..4afd8a1a 100644
--- a/open_issues/glibc/t/tls-threadvar.mdwn
+++ b/open_issues/glibc/t/tls-threadvar.mdwn
@@ -29,3 +29,32 @@ IRC, freenode, #hurd, 2011-10-23:
After this has been done, probably the whole `__libc_tsd_*` stuff can be
dropped altogether, and `__thread` directly be used in glibc.
+
+
+# IRC, freenode, #hurd, 2012-08-07
+
+ <tschwinge> r5219: Update libpthread patch to replace threadvar with tls
+ for pthread_self
+ <tschwinge> r5224: revert r5219 too, it's not ready either
+ <youpi> as the changelog says, the __thread revertal is because it posed
+ problems
+ <youpi> and I just didn't have any time to check them while the freeze was
+ so close
+ <tschwinge> OK. What kind of problems? Should it be reverted upstream,
+ too?
+ <youpi> I don't remember exactly
+ <youpi> it should just be fixed
+ <youpi> we can revert it upstream, but it'd be good that we manage to
+ progress, at some point...
+ <tschwinge> Of course -- however as long as we don't know what kind of
+ problem, it is a bit difficult. ;-)
+ <youpi> since I didn't leave a note, it was most probably a mere glibc run,
+ or boot with the patched libpthread
+ <youpi> *testsuite run
+ <tschwinge> OK.
+ <tschwinge> The libpthread testsuite doesn't show any issues with that
+ patch applied, though. But I didn't test anything else.
+ <tschwinge> youpi: Also, you have probably seen my glibc __thread errno
+ email -- rmcgrath wanted to find some time this week to comment/help, and
+ I take it you don't have any immediate comments to that issue?
+ <youpi> I saw the mails, but didn't investigate at all
diff --git a/open_issues/glibc_ptrace.mdwn b/open_issues/glibc_ptrace.mdwn
index b4c529d7..6704ed80 100644
--- a/open_issues/glibc_ptrace.mdwn
+++ b/open_issues/glibc_ptrace.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2009 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2009, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -33,8 +33,8 @@ License|/fdl]]."]]"""]]
and for us it is a `struct i386_thread_state` from
`mach/i386/thread_status.h`;
- * Linux probides some functionality that we don't provide, e.g.,
- `PTRACE_SINGLESTEP`.
+ * Linux provides some functionality that we don't provide:
+ `PTRACE_GETFPXREGS`, `PTRACE_SINGLESTEP`.
* Some parts are wrongly implemented, e.g., `PTRACE_GETREGS` and
`PTRACE_SETREGS` both do the same thing.
diff --git a/open_issues/gnat.mdwn b/open_issues/gnat.mdwn
new file mode 100644
index 00000000..2d17e275
--- /dev/null
+++ b/open_issues/gnat.mdwn
@@ -0,0 +1,102 @@
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!meta title="Enable Ada programming (GCC: GNAT)"]]
+
+[[!tag open_issue_gcc]]
+
+Make the Ada programming language available on GNU/Hurd in its [[GCC]] GNAT
+implementation, and enable Hurd-specific features.
+
+There is a [[!FF_project 259]][[!tag bounty]] on this task.
+
+---
+
+
+# Part I
+
+First, make the language functional, have its test suite pass without errors.
+
+
+## Original [[community/GSoC]] Task Description
+
+[[!inline pages=community/gsoc/project_ideas/gnat feeds=no]]
+
+
+## Debian GCC
+
+A patch has been added for GNU/kFreeBSD:
+`bfe081336914729fc0180c07ab4afa41965100f2`, `git-svn-id:
+svn://svn.debian.org/gcccvs/branches/sid@5638
+6ca36cf4-e1d1-0310-8c6f-e303bb2178ca`
+
+
+## IRC, freenode, #hurd, 2012-07-17
+
+ <gnu_srs> I've found the remaining problem with gnat backtrace for Hurd!
+ Related to the stack frame.
+ <gnu_srs> This version does not work: one relying on static assumptions
+ about the frame layout
+ <gnu_srs> Causing segfaults.
+ <gnu_srs> Any interest to create a test case out of that piece of code,
+ taken from gcc/ada/tracebak.c?
+ <braunr> gnu_srs: sure
+
+
+### IRC, freenode, #hurd, 2012-07-18
+
+ <braunr> "Digging further revealed that the GNU/Hurd stack frame does not
+ seem to
+ <braunr> be static enough to define USE_GENERIC_UNWINDER in
+ gcc/ada/tracebak.c.
+ <braunr> "
+ <braunr> what do you mean by a "stack frame does not seem to be static
+ enough" ?
+ <gnu_srs> I can quote from the source file if you want. Otherwise look at
+ the code yourself: gcc/ada/tracebak.c
+ <gnu_srs> I mean that something is wrong with the stack frame for
+ Hurd. This is the code I wanted to use as a test case for the stack.
+ <gnu_srs> Remember?
+ <braunr> more or less
+ <braunr> ah, "static assumptions"
+ <braunr> all right, i don't think anything is "wrong" with stack frames
+ <braunr> but if you use a recent version of gcc, as indicated in the code,
+ -fomit-frame-pointer is enabled by default
+ <braunr> so your stack frame won't look like it used to be without the
+ option
+ <braunr> hence the need for USE_GCC_UNWINDER
+ <braunr> http://en.wikipedia.org/wiki/Call_stack explains this very well
+ <gnu_srs> However, kfreebsd does not seem to need USE_GCC_UNWINDER, how
+ come?
+ <braunr> i guess they don't omit the frame pointer
+ <braunr> your fix is good btw
+ <gnu_srs> thanks
+
+
+### IRC, freenode, #hurd, 2012-07-19
+
+ <gnu_srs> tschwinge: The bug in #681998 should go upstream. Applied in
+ Debian already. Hopefully this is the last patch needed for the port of
+ GNAT to Hurd.
+
+
+---
+
+
+# Part II
+
+Next, Hurd-specific features can be added. Add an interface to the
+language/environment for being able to do [[RPC]] calls, in order to program
+[[hurd/translator]]s natively in Ada.
+
+
+## Original [[community/GSoC]] Task Description
+
+[[!inline pages=community/gsoc/project_ideas/language_bindings feeds=no]]
diff --git a/open_issues/gnumach_memory_management.mdwn b/open_issues/gnumach_memory_management.mdwn
index 9feb30c8..e5e9d2c5 100644
--- a/open_issues/gnumach_memory_management.mdwn
+++ b/open_issues/gnumach_memory_management.mdwn
@@ -2133,3 +2133,52 @@ There is a [[!FF_project 266]][[!tag bounty]] on this task.
<braunr> do you want to review ?
<youpi> I don't think there is any need to
<braunr> ok
+
+
+# IRC, freenode, #hurd, 2012-12-08
+
+ <mcsim> braunr: hi. Do I understand correct that merely the same technique
+ is used in linux to determine the slab where, the object to be freed,
+ resides?
+ <braunr> yes but it's faster on linux since it uses a direct mapping of
+ physical memory
+ <braunr> it just has to shift the virtual address to obtain the physical
+ one, whereas x15 has to walk the pages tables
+ <braunr> of course it only works for kmalloc, vmalloc is entirely different
+ <mcsim> btw, is there sense to use some kind of B-tree instead of AVL to
+ decrease number of cache misses? AFAIK, in modern processors size of L1
+ cache line is at least 64 bytes, so in one node we can put at least 4
+ leafs (key + pointer to data) making search faster.
+ <braunr> that would be a b-tree
+ <braunr> and yes, red-black trees were actually developed based on
+ properties observed on b-trees
+ <braunr> but increasing the size of the nodes also increases memory
+ overhead
+ <braunr> and code complexity
+ <braunr> that's why i have a radix trees for cases where there are a large
+ number of entries with keys close to each other :)
+ <braunr> a radix-tree is basically a b-tree using the bits of the key as
+ indexes in the various arrays it walks instead of comparing keys to each
+ other
+ <braunr> the original avl tree used in my slab allocator was intended to
+ reduce the average height of the tree (avl is better for that)
+ <braunr> avl trees are more suited for cases where there are more lookups
+ than inserts/deletions
+ <braunr> they make the tree "flatter" but the maximum complexity of
+ operations that change the tree is 2log2(n), since rebalancing the tree
+ can make the algorithm reach back to the tree root
+ <braunr> red-black trees have slightly bigger heights but insertions are
+ limited to 2 rotations and deletions to 3
+ <mcsim> there should be not much lookups in slab allocators
+ <braunr> which explains why they're more generally found in generic
+ containers
+ <mcsim> or do I misunderstand something?
+ <braunr> well, there is a lookup for each free()
+ <braunr> whereas there are insertions/deletions when a slab becomes
+ non-empty/empty
+ <mcsim> I see
+ <braunr> so it was very efficient for caches of small objects, where slabs
+ have many of them
+ <braunr> also, i wrote the implementation in userspace, without
+ functionality pmap provides (although i could have emulated it
+ afterwards)
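
braunr's description of a radix tree (key bits used as array indexes, no key comparisons) can be sketched as follows. This is a toy under stated assumptions, not the x15 or gnumach code: the 16-bit key, the 4 bits per level, and all names (`radix_node`, `radix_insert`, `radix_lookup`) are chosen here for illustration, and nodes are never freed.

```c
#include <stdint.h>
#include <stdlib.h>

#define RADIX_BITS   4                    /* key bits consumed per level */
#define RADIX_SLOTS  (1u << RADIX_BITS)   /* 16 slots per node */
#define RADIX_LEVELS 4                    /* 4 levels cover a 16-bit key */

struct radix_node {
    void *slots[RADIX_SLOTS];   /* child nodes, or values at the last level */
};

/* Extract the slot index for `key` at `level` (0 is the root). */
static unsigned int radix_index(uint16_t key, int level)
{
    int shift = (RADIX_LEVELS - 1 - level) * RADIX_BITS;
    return ((unsigned int) key >> shift) & (RADIX_SLOTS - 1);
}

/* Store `value` under `key`, allocating inner nodes on demand.
   Returns 0 on success, -1 if memory allocation fails. */
int radix_insert(struct radix_node *root, uint16_t key, void *value)
{
    struct radix_node *node = root;

    for (int level = 0; level < RADIX_LEVELS - 1; level++) {
        unsigned int i = radix_index(key, level);
        if (node->slots[i] == NULL) {
            node->slots[i] = calloc(1, sizeof(struct radix_node));
            if (node->slots[i] == NULL)
                return -1;
        }
        node = node->slots[i];
    }
    node->slots[radix_index(key, RADIX_LEVELS - 1)] = value;
    return 0;
}

/* Return the value stored under `key`, or NULL if there is none. */
void *radix_lookup(const struct radix_node *root, uint16_t key)
{
    const struct radix_node *node = root;

    for (int level = 0; level < RADIX_LEVELS - 1; level++) {
        node = node->slots[radix_index(key, level)];
        if (node == NULL)
            return NULL;
    }
    return node->slots[radix_index(key, RADIX_LEVELS - 1)];
}
```

Keys that are numerically close share their upper levels, so they reuse the same inner nodes. That is the case braunr singles out as a good fit for this structure.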
diff --git a/open_issues/gnumach_page_cache_policy.mdwn b/open_issues/gnumach_page_cache_policy.mdwn
index 03cb3725..d128c668 100644
--- a/open_issues/gnumach_page_cache_policy.mdwn
+++ b/open_issues/gnumach_page_cache_policy.mdwn
@@ -108,6 +108,9 @@ License|/fdl]]."]]"""]]
12k random data
<braunr> i'll try with other values
<braunr> i get crashes, deadlocks, livelocks, and it's not pretty :)
+
+[[libpager_deadlock]].
+
<braunr> and always in ext2, mach doesn't seem affected by the issue, other
than the obvious
<braunr> (well i get the usual "deallocating an invalid port", but as
@@ -625,3 +628,158 @@ License|/fdl]]."]]"""]]
## [[metadata_caching]]
+
+
+## IRC, freenode, #hurd, 2012-07-12
+
+ <braunr> i'm only adding a cached pages count you know :)
+ <braunr> (well actually, this is now a vm_stats call that can replace
+ vm_statistics, and uses flavors similar to task_info)
+ <braunr> my goal being to see that yellow bar in htop
+ <braunr> ... :)
+ <pinotree> yellow?
+ <braunr> yes, yellow
+ <braunr> as in http://www.sceen.net/~rbraun/htop.png
+ <pinotree> ah
+
+
+## IRC, freenode, #hurd, 2012-07-13
+
+ <braunr> i always get a "no more room for vm_map_enter" error when building
+ glibc :/
+ <braunr> but the build continues, probably a failed test
+ <braunr> ah yes, i can see the yellow bar :>
+ <antrik> braunr: congrats :-)
+ <braunr> antrik: thanks
+ <braunr> but i think my patch can't make it into the git repo until the
+ swap deadlock is solved (or at least very infrequent ..)
+
+[[libpager_deadlock]].
+
+ <braunr> well, the page cache accounting tells me something is wrong there
+ too lol
+ <braunr> during a build 112M of data was created, of which only 28M made it
+ into the cache
+ <braunr> which may imply something is still holding references on the
+ others objects (shadow objects hold references to their underlying
+ object, which could explain this)
+ <braunr> ok i'm stupid, i just forgot to subtract the cached pages from the
+ used pages .. :>
+ <braunr> (hm, actually i'm tired, i don't think this should be done)
+ <braunr> ahh yes much better
+ <braunr> i simply forgot to convert pages in kilobytes .... :>
+ <braunr> with the fix, the accounting of cached files is perfect :)
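
The unit slip braunr mentions is easy to reproduce. The sketch below assumes 4 KiB pages, which the log does not state; the helper name `pages_to_kib` is hypothetical. Under that assumption the 4x gap between 112M and 28M is what one would see if a page count were displayed as if it were already in KiB, though that reading is an inference.

```c
/* Hypothetical helper: convert a page count into KiB.
   Assumes page_size is a multiple of 1024 bytes. */
unsigned long pages_to_kib(unsigned long pages, unsigned long page_size)
{
    return pages * (page_size / 1024);
}
```

For example, 28672 pages of 4096 bytes are 114688 KiB (112 MiB), whereas reading the raw count 28672 as KiB would suggest only 28 MiB.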
+
+
+## IRC, freenode, #hurd, 2012-07-14
+
+ <youpi> braunr: btw, if you want to stress big builds, you might want to
+ try webkit, ppl, rquantlib, rheolef, yade
+ <youpi> they don't pass on bach (1.3GiB), but do on ironforge (1.8GiB)
+ <braunr> youpi: i don't need to, i already know my patch triggers swap
+ deadlocks more often, which was expected
+ <youpi> k
+ <braunr> there are 3 tasks concerning my work : 1/ page cache accounting
+ (i'm sending the patch right now) 2/ removing the fixed limit and 3/
+ hunting the swap deadlock and fixing as much as possible
+ <braunr> 2/ can't get in the repository without 3/ imo
+ <youpi> btw, the increase of PAGE_FREE_* in your 2/ could go already,
+ couldn't it?
+ <braunr> yes
+ <braunr> but we should test with higher thresholds
+ <braunr> well
+ <braunr> it really depends on the usage pattern :/
+
+
+## [[ext2fs_libports_reference_counting_assertion]]
+
+
+## IRC, freenode, #hurd, 2012-07-15
+
+ <braunr> concerning the page cache patch, i've been using for quite some
+ time now, did lots of builds with it, and i actually wonder if it hurts
+ stability as much as i think
+ <braunr> considering i didn't stress the system as much before
+ <braunr> and it really improves performance
+
+ <braunr> cached memobjs: 138606
+ <braunr> cache: 1138M
+ <braunr> i bet ext2fs can have a hard time scanning 138k entries in a
+ linked list, using callback functions on each of them :x
+
+
+
+## IRC, freenode, #hurd, 2012-07-16
+
+ <tschwinge> braunr: Sorry that I didn't have better results to present.
+ :-/
+ <braunr> eh, that was expected :)
+ <braunr> my biggest problem is the hurd itself :/
+ <braunr> for my patch to be useful (and the rest of the intended work), the
+ hurd needs some serious fixing
+ <braunr> not syncing from the pagers
+ <braunr> and scalable algorithms everywhere of course
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <braunr> youpi: FYI, the branches rbraun/page_cache in the gnupach and hurd
+ repos are ready to be merged after review
+ <braunr> gnumach*
+ <youpi> so you fixed the hangs & such?
+ <braunr> they only have the cache stats, not the "improved" cache
+ <braunr> no
+ <braunr> it requires much more work for that :)
+ <youpi> braunr: my concern is that the tests on buildds show stability
+ regression
+ <braunr> youpi: tschwinge also reported performance degradation
+ <braunr> and not the minor kind
+ <youpi> uh
+ <tschwinge> :-/
+ <braunr> far less pageins, but twice as many pageouts, and probably high
+ cpu overhead
+ <braunr> building (which is what buildds do) means lots of small files
+ <braunr> so lots of objects
+ <braunr> huge lists, long scans, etc..
+ <braunr> so it definitely requires more work
+ <braunr> the stability issue comes first in mind, and i don't see a way to
+ obtain a usable trace
+ <braunr> do you ?
+ <youpi> nope
+ <braunr> (except making it loop forever instead of calling assert() and
+ attach gdb to a qemu instance)
+ <braunr> youpi: if you think the infinite loop trick is ok, we could
+ proceed with that
+ <youpi> which assert?
+ <braunr> the port refs one
+ <youpi> which one?
+ <braunr> which prevented you from using the page cache patch on buildds
+ <youpi> ah, the libports one
+ <youpi> for that one, I'd tend to take the time to perhaps use coccicheck
+ actually
+
+[[code_analysis]].
+
+ <braunr> oh
+ <youpi> it's one of those which is supposed to be statically ananyzable
+ <youpi> s/n/l
+ <braunr> that would be great
+ <tschwinge> :-)
+ <tschwinge> And set precedence.
+
+
+## IRC, freenode, #hurd, 2012-07-26
+
+ <braunr> hm i killed darnassus, probably the page cache patch again
+
+
+## IRC, freenode, #hurd, 2012-09-19
+
+ <youpi> I was wondering about the page cache information structure
+ <youpi> I guess the idea is that if we need to add a field, we'll just
+ define another RPC?
+ <youpi> braunr: ↑
+ <braunr> i've done that already, yes
+ <braunr> youpi: have a look at the rbraun/page_cache gnumach branch
+ <youpi> that's what I was referring to
+ <braunr> ok
diff --git a/open_issues/gnumach_vm_map_entry_forward_merging.mdwn b/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
index 90137766..7739f4d1 100644
--- a/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
+++ b/open_issues/gnumach_vm_map_entry_forward_merging.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -181,6 +181,8 @@ License|/fdl]]."]]"""]]
<braunr> from what i could see, part of the problem still exists in freebsd
<braunr> for the same reasons (shadow objects being one of them)
+[[mach_shadow_objects]].
+
# GCC build time using bash vs. dash
diff --git a/open_issues/gnumach_vm_map_red-black_trees.mdwn b/open_issues/gnumach_vm_map_red-black_trees.mdwn
index d7407bfe..53ff66c5 100644
--- a/open_issues/gnumach_vm_map_red-black_trees.mdwn
+++ b/open_issues/gnumach_vm_map_red-black_trees.mdwn
@@ -172,3 +172,175 @@ License|/fdl]]."]]"""]]
crasher le noyau)
<braunr> (well, I mean, which made the kernel crash in a very obscure way
before the rbtree patch)
+
+
+### IRC, freenode, #hurd, 2012-07-15
+
+ <bddebian> I get errors in vm_map.c whenever I try to "mount" a CD
+ <bddebian> Hmm, this time it rebooted the machine
+ <bddebian> braunr: The translator set this time and the machine reboots
+ before I can get the full message about vm_map, but here is some of the
+ crap I get: http://paste.debian.net/179191/
+ <braunr> oh
+ <braunr> nice
+ <braunr> that may be the bug youpi saw with my redblack tree patch
+ <braunr> bddebian: assert(diff != 0); ?
+ <bddebian> Aye
+ <braunr> good
+ <braunr> it means we're trying to insert a vm_map_entry at a region in a
+ map which is already occupied
+ <bddebian> Oh
+ <braunr> and unlike the previous code, the tree actually checks that
+ <braunr> it has to
+ <braunr> so you just simply use the iso9660fs translator and it crashes ?
+ <bddebian> Well it used to on just trying to set the translator. This time
+ I was able to set the translator but as soon as I cd to the mount point I
+ get all that crap
+ <braunr> that's very good
+ <braunr> more test cases to fix the vm
+
+
+### IRC, freenode, #hurd, 2012-11-01
+
+ <youpi> braunr: Assertion `diff != 0' failed in file "vm/vm_map.c", line
+ 1002
+ <youpi> that's in rbtree_insert
+ <braunr> youpi: the problem isn't the tree, it's the map entries
+ <braunr> some must overlap
+ <braunr> if you can inspect that, it would be helpful
+ <youpi> I have a kdb there
+ <youpi> it's within a port_name_to_task system call
+ <braunr> this assertion basically means there already is an item in the
+ tree where the new item is supposed to be inserted
+ <youpi> this port_name_to_task presence in the stack is odd
+ <braunr> it's in vm_map_enter
+ <youpi> there's a vm_map just after that (and the assembly trap code
+ before)
+ <youpi> I know
+ <youpi> I'm wondering about the caller
+ <braunr> do you have a way to inspect the inserted map entry ?
+ <youpi> I'm actually wondering whether I have the right kernel in gdb
+ <braunr> oh
+ <youpi> better
+ <youpi> with the right kernel :)
+ <youpi> 0x80039acf (syscall_vm_map)
+ (target_map=d48b6640,address=d3b63f90,size=0,mask=0,anywhere=1)
+ <youpi> size == 0 seems odd to me
+ <youpi> (same parameters for vm_map)
+ <braunr> right
+ <braunr> my code does assume an entry has a non null size
+ <braunr> (in the entry comparison function)
+ <braunr> EINVAL (since Linux 2.6.12) length was 0.
+ <braunr> that's a quick glance at mmap(2)
+ <braunr> might help track bugs from userspace (e.g. in exec .. :))
+ <braunr> posix says the saem
+ <braunr> same*
+ <braunr> the gnumach manual isn't that precise
+ <youpi> I don't seem to manage to read the entry
+ <youpi> but I guess size==0 is the problem anyway
+ <mcsim> youpi, braunr: Is there another kernel fault? Was that in my
+ kernel?
+ <braunr> no that's another problem
+ <braunr> which became apparent following the addition of red black trees in
+ the vm_map code
+ <braunr> (but which was probably present long before)
+ <mcsim> braunr: BTW, do you know if there where some specific circumstances
+ that led to memory exhaustion in my code? Or it just aggregated over
+ time?
+ <braunr> mcsim: i don't know
+ <mcsim> s/where/were
+ <mcsim> braunr: ok
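
The point made above, that the tree (unlike the previous code) notices an insertion into an occupied region and that the comparison assumes a non-empty entry, can be sketched as follows. This is a minimal illustration, not the GNU Mach code: `struct entry`, `entry_cmp`, `map_insert` and the return codes are hypothetical names, and a plain unbalanced binary tree stands in for the red-black tree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a map entry: the half-open range [start, end). */
struct entry {
    unsigned long start;
    unsigned long end;
    struct entry *left;
    struct entry *right;
};

/* Hypothetical result codes for this sketch. */
enum { INSERT_OK = 0, INSERT_EMPTY = -1, INSERT_OVERLAP = -2 };

/* Negative if a lies wholly below b, positive if wholly above, 0 on overlap.
   The ordering is only meaningful when both ranges are non-empty. */
int entry_cmp(const struct entry *a, const struct entry *b)
{
    if (a->end <= b->start)
        return -1;
    if (a->start >= b->end)
        return 1;
    return 0;
}

/* Insert e into the tree rooted at *root, refusing empty and overlapping
   ranges instead of tripping an assertion deep inside the tree code. */
int map_insert(struct entry **root, struct entry *e)
{
    assert(root != NULL && e != NULL);

    /* A zero-size (or inverted) range has no well-defined position. */
    if (e->start >= e->end)
        return INSERT_EMPTY;

    e->left = NULL;
    e->right = NULL;
    while (*root != NULL) {
        int diff = entry_cmp(e, *root);
        if (diff == 0)
            return INSERT_OVERLAP;   /* the "diff != 0" case from the log */
        root = (diff < 0) ? &(*root)->left : &(*root)->right;
    }
    *root = e;
    return INSERT_OK;
}
```

Checking the size before touching the tree turns the obscure assertion failure into an error the caller can handle, which is the direction the discussion takes.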
+
+
+### IRC, freenode, #hurd, 2012-11-05
+
+ <tschwinge> braunr: I have now also hit the diff != 0 assertion error;
+ sitting in KDB, waiting for your commands.
+ <braunr> tschwinge: can you check the backtrace, have a look at the system
+ call and its parameters like youpi did ?
+ <tschwinge> If I manage to figure out how to do that... :-)
+ * tschwinge goes read scrollback.
+ <braunr> "trace" i suppose
+ <braunr> if running inside qemu, you can use the integrated gdb server
+ <tschwinge> braunr: No, hardware. And work intervened. And mobile phone
+ <-> laptop via bluetooth didn't work. But now:
+ <tschwinge> Pretty similar to Samuel's:
+ <tschwinge> Assert([...])
+ <tschwinge> vm_map_enter(0xc11de6c8, 0xc1785f94, 0, 0, 1)
+ <tschwinge> vm_map(0xc11de6c8, 0xc1785f94, 0, 0, 1)
+ <tschwinge> syscall_vm_map(1, 0x1024a88, 0, 0, 1)
+ <tschwinge> mach_call_call(1, 0x1024a88, 0, 0, 1)
+ <braunr> thanks
+ <braunr> same as youpi observed, the requested size for the mapping is 0
+ <braunr> tschwinge: thanks
+ <tschwinge> braunr: Anything else you'd like to see before I reboot?
+ <braunr> tschwinge: no, that's enough for now, and the other kind of info
+ i'd like are much more difficult to obtain
+ <braunr> if we still have the problem once a small patch to prevent null
+ size is applied, then it'll be worth looking more into it
+ <pinotree> isn't it possible to find out who called with that size?
+ <braunr> not easy, no
+ <braunr> it's also likely that the call that fails isn't the first one
+ <pinotree> ah sure
+ <pinotree> braunr: making mmap reject 0 size length could help? posix says
+ such size should be rejected straight away
+ <braunr> 17:09 < braunr> if we still have the problem once a small patch to
+ prevent null size is applied, then it'll be worth looking more into it
+ <braunr> that's the idea
+ <braunr> making faulty processes choke on it should work fine :)
+ <pinotree> «If len is zero, mmap() shall fail and no mapping shall be
+ established.»
+ <pinotree> braunr: should i cook up such patch for mmap?
+ <braunr> no, the change must be applied in gnumach
+ <pinotree> sure, but that could simply such condition in mmap (ie avoiding
+ to call io_map on a file)
+ <braunr> such calls are erroneous and rare, i don't see the need
+ <pinotree> ok
+ <braunr> i bet it comes from the exec server anyway :p
+ <tschwinge> braunr: Is the mmap with size 0 already a reproducible testcase
+ you can use for the diff != 0 assertion?
+ <tschwinge> Otherwise I'd have a reproducer now.
+ <braunr> tschwinge: i'm not sure but probably yes
+ <tschwinge> braunr: Otherwise, take GDB sources, then: gcc -fsplit-stack
+ gdb/testsuite/gdb.base/morestack.c && ./a.out
+ <tschwinge> I have not looked what exactly this does; I think -fsplit-stack
+ is not really implemented for us (needs something in libgcc we might not
+ have), is on my GCC TODO list already.
+ <braunr> tschwinge: interesting too :)
+
+
+### IRC, freenode, #hurd, 2012-11-19
+
+ <tschwinge> braunr: Hmm, I have now hit the diff != 0 GNU Mach assertion
+ failure during some GCC invocation (GCC testsuite) that does not relate
+ to -fsplit-stack (as the others before always have).
+ <tschwinge> Reproduced:
+ /media/erich/home/thomas/tmp/gcc/hurd/master.build/gcc/xgcc
+ -B/media/erich/home/thomas/tmp/gcc/hurd/master.build/gcc/
+ /home/thomas/tmp/gcc/hurd/master/gcc/testsuite/gcc.dg/torture/pr42878-1.c
+ -fno-diagnostics-show-caret -O2 -flto -fuse-linker-plugin
+ -fno-fat-lto-objects -fcompare-debug -S -o pr42878-1.s
+ <tschwinge> Will check whether it's the same backtrace in GNU Mach.
+ <tschwinge> Yes, same.
+ <braunr> tschwinge: as youpi seems quite busy these days, i'll cook a patch
+ and commit it directly
+ <tschwinge> braunr: Thanks! I have, by the way, confirmed that the
+ following is enough to trigger the issue: vm_map(mach_task_self(), 0, 0,
+ 0, 1, 0, 0, 0, 0, 0, 0);
+ <tschwinge> ... and before the allocator patch, GNU Mach did accept that
+ and return 0 -- though I did not check what effect it actually has. (And
+ I don't think it has any useful one.) I'm also reading that as of lately
+ (Linux 2.6.12), mmap (length = 0) is to return EINVAL, which I think is
+ the foremost user of vm_map.
+ <pinotree> tschwinge: posix too says to return EINVAL for length = 0
+ <braunr> yes, we checked that earlier with youpi
+
+[[!message-id "87sj8522zx.fsf@kepler.schwinge.homeip.net"]].
+
+ <braunr> tschwinge: well, actually your patch is what i had in mind
+ (although i'd like one in vm_map_enter to catch wrong kernel requests
+ too)
+ <braunr> tschwinge: i'll work on it tonight, and do some testing to make
+ sure we don't regress critical stuff (exec is another major direct user
+ of vm_map iirc)
+ <tschwinge> braunr: Oh, OK. :-)
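
For comparison, the `mmap` behaviour quoted above (EINVAL for a zero length) can be observed from user space with a few lines of C. This is a sketch for a POSIX system that provides `MAP_ANONYMOUS` (for example GNU/Linux); it does not exercise `vm_map`, and the function name `zero_length_mmap_errno` is made up for the example.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: request a zero-length anonymous mapping.
   Returns 0 if the request was (unexpectedly) accepted, otherwise the
   errno value reported by mmap(). */
int zero_length_mmap_errno(void)
{
    void *p;

    errno = 0;
    p = mmap(NULL, 0, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p != MAP_FAILED)
        return 0;

    /* mmap() failed, so it must have set errno. */
    assert(errno != 0);
    return errno;
}
```

On a system following the POSIX wording quoted earlier, the helper returns `EINVAL`.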
diff --git a/open_issues/hurdextras.mdwn b/open_issues/hurdextras.mdwn
index f31802da..d4f9d1bc 100644
--- a/open_issues/hurdextras.mdwn
+++ b/open_issues/hurdextras.mdwn
@@ -46,12 +46,20 @@ tarball(s).
# Not OK
+Sent email to all *NOK*s on 2012-07-14, asking for assignment.
+
+
## httpfs
* Arun V. <arunsark@yahoo.com> -- NOK
* Gopika U. K. <gopika78@yahoo.com> -- NOK
* mrphython / James A. Morrison <ja2morri@uwaterloo.ca> -- OK
+## ipc_guide
+
+ * Manuel Pavón Valderrama <mpavon@ugr.es> -- NOK
+ * <cp46tan@hotpop.com> -- NOK
+
## jfs
* Sajith T S <sajith@symonds.net> -- NOK
diff --git a/open_issues/implementing_hurd_on_top_of_another_system.mdwn b/open_issues/implementing_hurd_on_top_of_another_system.mdwn
index 95b71ebb..220c69cc 100644
--- a/open_issues/implementing_hurd_on_top_of_another_system.mdwn
+++ b/open_issues/implementing_hurd_on_top_of_another_system.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation,
+Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -15,9 +16,12 @@ One obvious variant is [[emulation]] (using [[hurd/running/QEMU]], for
example), but
doing that does not really integrate the Hurd guest into the host system.
There is also a more direct way, more powerful, but it also has certain
-requirements to do it effectively:
+requirements to do it effectively.
-IRC, #hurd, August / September 2010
+See also [[Mach_on_top_of_POSIX]].
+
+
+# IRC, freenode, #hurd, August / September 2010
<marcusb> silver_hook: the Hurd can also refer to the interfaces of the
filesystems etc, and a lot of that is really just server/client APIs that
@@ -56,7 +60,7 @@ IRC, #hurd, August / September 2010
<marcusb> ArneBab: in fact, John Tobey did this a couple of years ago, or
started it
-([[tschwinge]] has tarballs of John's work.)
+[[Mach_on_top_of_POSIX]].
<marcusb> ArneBab: or you can just implement parts of it and relay to Linux
for the rest
@@ -64,11 +68,10 @@ IRC, #hurd, August / September 2010
are sufficiently happy with the translator stuff, it's not hard to bring
the Hurd to Linux or BSD
-Continue reading about the [[benefits of a native Hurd implementation]].
+Continue reading about the [[benefits_of_a_native_Hurd_implementation]].
----
-IRC, #hurd, 2010-12-28
+# IRC, freenode, #hurd, 2010-12-28
<antrik> kilobug: there is no real requirement for the Hurd to run on a
microkernel... as long as the important mechanisms are provided (most
@@ -79,9 +82,8 @@ IRC, #hurd, 2010-12-28
Hurd on top of a monolithic kernel would actually be a useful approach
for the time being...
----
-IRC, #hurd, 2011-02-11
+# IRC, freenode, #hurd, 2011-02-11
<neal> marcus and I were discussing how to add Mach to Linux
<neal> one could write a module to implement Mach IPC
@@ -115,3 +117,303 @@ IRC, #hurd, 2011-02-11
<neal> I'm unlikely to work on it, sorry
<antrik> didn't really expect that :-)
<antrik> would be nice though if you could write up your conclusions...
+
+
+# IRC, freenode, #hurd, 2012-10-12
+
+ <peo-xaci> do hurd system libraries make raw system calls ever
+ (i.e. inlined syscall() / raw assembly)?
+ <braunr> sure
+ <peo-xaci> hmm, so a hurd emulation layer would need to use ptrace if it
+ should be fool proof? :/
+ <braunr> there is no real need for raw assembly, and the very syscalls are
+ all available through macros
+ <braunr> hum what are you trying to say ?
+ <peo-xaci> well, if they are done through syscall, as a function, not a
+ macro, then they can be intercepted with LD_PRELOAD
+ <peo-xaci> so applications that do Hurd (Mach?) syscalls could work on
+ f.e. Linux, if a special libc is injected into the program with
+ LD_PRELOAD
+ <peo-xaci> same thing with making standard Linux-applications go through
+ the Hurd emulation layer
+ <peo-xaci> without recompilation
+ <mel-_> peo-xaci: the second direction is implemented in glibc.
+ <mel-_> for the other direction, I personally see little use for it
+ <braunr> peo-xaci: ok i misunderstood
+ <braunr> peo-xaci: i don't think there is any truly direct syscall usage
+ in the hurd
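
The distinction drawn above (a system call made through a library function versus one inlined into the caller) can be illustrated on GNU/Linux, where `syscall()` is an ordinary exported libc function. The sketch below assumes Linux with glibc; the two function names are hypothetical. Both calls resolve to dynamic libc symbols, which is what makes them candidates for `LD_PRELOAD` interposition, whereas an inlined trap instruction would bypass the dynamic linker entirely.

```c
#define _GNU_SOURCE       /* for the syscall() declaration */
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: ask for our PID through the generic syscall()
   libc function, i.e. through a dynamically resolved symbol. */
long pid_via_syscall_function(void)
{
    long pid = syscall(SYS_getpid);

    assert(pid > 0);      /* getpid cannot fail */
    return pid;
}

/* Hypothetical helper: the same request through the dedicated wrapper. */
long pid_via_wrapper(void)
{
    return (long) getpid();
}
```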
+ <peo-xaci> hmm, I'm not sure I understand what directions you are referring
+ to mel-_
+ <braunr> peo-xaci: what are you trying to achieve ?
+ <peo-xaci> I want to make the Hurd design more accessible by letting Hurd
+ application run on the Linux kernel, preferably without
+ recompilation. This would be done with a daemon that implements Mach and
+ which all syscalls would go to.
+ <peo-xaci> then, I also want so that standard Linux applications can go
+ through that Mach daemon as well, if a special libc is preloaded
+ <braunr> you might want to discuss this with antrik
+ <peo-xaci> what I'm trying to figure out specifically is if there is some
+ library/interface that glue Hurd with Mach and would be better suited to
+ emulate than Mach? Mach seems to be more of an implementation detail to
+ the hurd and not something an application would directly use.
+ <braunr> yes, the various hurd libraries (libports and libpager mostly)
+ <peo-xaci> From [http://www.gnu.org/software/hurd/hurd/libports.html]:
+ "libports is not (at least, not for now) a generalization / abstraction
+ of Mach ports to the functionality the Hurd needs, that is, it is not
+ meant to provide an interface independently of the underlying
+ microkernel."
+ <peo-xaci> Is this still true?
+ <peo-xaci> Does libpager abstract the rest?
+ <peo-xaci> (and the other hurd libraries)
+ <braunr> there is nothing that really abstracts the hurd from mach
+ <braunr> for example, reference counting often happens here and there
+ <braunr> and core libraries like glibc and libpthread heavily rely on it
+ (through sysdeps specific code though)
+ <braunr> libports and libpager are meant to simplify object manipulation
+ for the former, and pager operations for the latter
+ <peo-xaci> and applications, such as translators, often use Mach interfaces
+ directly?
+ <peo-xaci> correct?
+ <braunr> depends on what often means
+ <braunr> let's say they do
+ <peo-xaci> :/ then it probably is better to emulate Mach after all
+ <braunr> there was a mach on posix port a long time ago
+ <peo-xaci> I thought applications were completely separated from the
+ microkernel in use by the Hurd
+ <braunr> that level of abstraction is pretty new
+ <braunr> genode is the only system i know which does that
+
+[[microkernel/Genode]].
+
+ <braunr> and it's still for "l4 variants"
+ <pinotree> ah, thanks (i forgot that name)
+ <antrik> braunr: Genode also runs on Linux and a few other non-L4
+ environments IIRC
+ <antrik> peo-xaci: I'm not sure binary emulation is really useful. rather,
+ I'd recompile stuff as "regular" Linux executables, only using a special
+ glibc
+ <antrik> where the special glibc could be basically a port of the Hurd
+ glibc communicating with the Mach emulation instead of real Mach; or it
+ could do emulation at a higher level
+ <antrik> a higher level emulation would be more complicated to implement,
+ but more efficient, and allow better integration with the ordinary
+ GNU/Linux environment
+ <antrik> also note that any regular program could be recompiled against the
+ HELL glibc to run in the Hurdish environment...
+ <antrik> (well, glibc + hurd server libraries)
+ <peo-xaci> I'm willing to accept that Hurd-application would need to be
+ recompiled to work on the HELL
+ <peo-xaci> but not Linux-applications :)
+ <antrik> peo-xaci: if you happen to understand German, there is a fairly
+ good overview in my thesis report ;-)
+ <antrik> peo-xaci: there are no "Hurd applications" or "Linux applications"
+ <peo-xaci> well, let me define what I mean by the terms: Hurd applications
+ use Hurd-specific interfaces/syscalls, and Linux applications use
+ Linux-specific interfaces/syscalls
+ <antrik> a few programs use Linux-specific interfaces (and we probably
+ can't run them in HELL just as we can't run them on actual Hurd); but all
+ other programs work in any glibc environment
+ <antrik> (usually in any POSIX environment in fact...)
+ <antrik> peo-xaci: no sane application uses syscalls
+ <peo-xaci> they do under the hood
+ <peo-xaci> I have read about inlined syscalls
+ <antrik> again, there are *some* applications using Linux-specific
+ interfaces (sometimes because they are inherently bound to Linux
+ features, sometimes unnecessarily)
+ <antrik> so far there are no applications using Hurd-specific interfaces
+ <peo-xaci> translators do?
+ <peo-xaci> they are standard executables are they not?
+ <peo-xaci> I would like so that translators also can be run in the HELL
+ <antrik> I wouldn't consider them applications. all existing translators
+ are pretty much components of the Hurd itself
+ <peo-xaci> okay, it's a question about semantics, perhaps I should use
+ another word than "applications" :)
+ <peo-xaci> for me, applications are what have a main-function, or similar
+ single entry point
+ <braunr> hum
+ <braunr> that's not a good enough definition
+ <antrik> anyways, as I said, I think recompiling translators against a
+ Hurdish glibc and ported translator libraries seems the most reasonable
+ approach to me
+ <braunr> let's say applications are userspace processes that make use of
+ services provided by the operating system
+ <braunr> translators being part of the operating system here
+ <antrik> braunr: do you know whether the Mach-on-POSIX was actually
+ functional, or just an abandoned experiment?...
+ <antrik> (I don't remember hearing of it before...)
+ <braunr> incomplete iirc
+ <peo-xaci> braunr: still, when I've explained what I meant, even if I used
+ the wrong term, then my previous statements should come in another light
+ <peo-xaci> antrik / braunr: are you still interested in hearing my
+ thoughts/ideas about HELL?
+ <antrik> oh, there is more to come? ;-)
+ <peo-xaci> yes! I don't think I have made myself completely understood :/
+ <peo-xaci> what I envision is a HELL system that works on as low level as
+ feasible, to make it possible to do almost anything that can be done on
+ the real Hurd (except possibly testing hardware drivers and such very low
+ level stuff).
+ <braunr> sure
+ <peo-xaci> I want it to be more than just allowing programs to access a
+ virtual filesystem à la FUSE. My idea is that all user space system
+ libraries/programs of the Hurd should be inside the HELL as well, and
+ they should not be emulated.
+ <peo-xaci> The system should at the very least be API compatible, so at the
+ very most a recompilation is necessary.
+ <peo-xaci> I also want so that GNU/Linux-programs can access the features
+ of the HELL with little effort on the user. At most perhaps a script that
+ wraps LD_PRELOADing has to be run on the binary. Best would be if it
+ could work also with insane assembly programs using raw system calls, or
+ if glibc happens to have some well hidden syscall being inlined to raw
+ assembly code.
+ <peo-xaci> And I think I have an idea on how an implementation could
+ satisfy these things!
+ <peo-xaci> By modifying the kernel and replace those syscalls that make
+ sense for the Hurd/Mach
+ <peo-xaci> with "the kernel", I meant Linux
+ <braunr> it's possible but tedious and not very useful so better do that
+ later
+ <braunr> mach did something similar at its time
+ <braunr> there was a syscall emulation library
+ <peo-xaci> but isn't it about as much work as emulating the interface on
+ user-level?
+ <braunr> and the kernel cooperated so that unmodified unix binaries
+ performing syscalls would actually jump to functions provided by that
+ library, which generally made an RPC
+ <peo-xaci> instead of a bunch of extern-declarations, one would put the
+ symbols in the syscall table
+ <braunr> define what "those syscalls that make sense for the Hurd/Mach"
+ actually means
+ <peo-xaci> open/close, for example
+ <braunr> otherwise i don't see another better way than what the old mach
+ folks did
+ <braunr> well, with that old, but existing support, your open would perform
+ a syscall
+ <braunr> the kernel would catch it and redirect the caller to its syscall
+ emulation library
+ <braunr> which would call the open RPC instead
+ <peo-xaci> wait, so this "existing support" you're talking about; is this a
+ module for the Linux kernel (or a fork, or something else)?
+ <peo-xaci> where can I find it?
+ <braunr> no
+ <braunr> it was for mach
+ <braunr> in order to run unmodified unix binaries
+ <braunr> the opposite of what you're trying to do
+ <peo-xaci> ah okay
+ <braunr> well
+ <braunr> not really either :)
+ <peo-xaci> does posix/unix define a standard for how a syscall table should
+ look like, to allow binary syscall compatibility?
+ <braunr> absolutely not
+ <peo-xaci> so how could this mach module run any unmodified unix binary? if
+ they expected different sys calls at different offsets?
+ <braunr> posix specifically (and very early) states that it almost forbids
+ itself to deal with anything regarding to ABIs
+ <braunr> depends
+ <braunr> since it was old, there weren't that many unix systems
+ <braunr> and even today, there are techniques like those used by netbsd
+ (and many other actually)
+ <braunr> that are able to inspect the binary and load a syscall emulation
+ environment depending on its exposed ABI
+ <braunr> e.g. file on an executable states which system it's for
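
As a rough illustration of "inspecting the binary", the sketch below reads the OS/ABI byte from an ELF identification header. It is only one of several places where such a marker can live: many GNU/Linux binaries leave this byte at the System V value and carry the target system in a note section instead, so a real loader would look at more than this. The function name `elf_osabi` and the enum names are invented for the example; the byte offset and values follow the ELF specification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
    IDENT_SIZE   = 16,  /* length of the e_ident array */
    IDENT_OSABI  = 7,   /* index of the OS/ABI byte (EI_OSABI) */
    OSABI_SYSV   = 0,
    OSABI_NETBSD = 2,
    OSABI_GNU    = 3    /* also known as ELFOSABI_LINUX */
};

/* Hypothetical helper: return the OS/ABI byte of an ELF image,
   or -1 if buf does not start with a complete ELF identification. */
int elf_osabi(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    assert(buf != NULL);
    if (len < IDENT_SIZE || memcmp(buf, magic, sizeof magic) != 0)
        return -1;
    return buf[IDENT_OSABI];
}
```

A kernel or loader could switch on the returned value to pick a system call emulation environment, falling back to other markers when the byte is `OSABI_SYSV`.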
+ <peo-xaci> hmm, I'm not sure how a kernel would implement that in
+ practice.. I thought these things were so hard coded and dependent on raw
+ memory reads that it would not be possible
+ <braunr> but i really think it's not worth the time for your project
+ <peo-xaci> to be honest I have virtually no experience of practical kernel
+ programming
+ <braunr> with an LDT on x86 for example
+ <braunr> no, there is really not that much hardcoded
+ <braunr> quite the contrary
+ <braunr> there is a lot of runtime detection today
+ <peo-xaci> well I mean how the syscall table is read
+ <braunr> it's not read
+ <peo-xaci> it's read to find the function pointer to the syscall handler in
+ the kernel?
+ <braunr> no
+ <braunr> that's the really basic approach
+ <braunr> (and in practice it can happen of course)
+ <braunr> what really happens is that, for example, on linux, the user space
+ system call code is loaded as a virtual shared library
+ <braunr> use ldd on an executable to see it
+ <braunr> this virtual object provides code that, depending on what the
+ kernel has detected, will use the appropriate method to perform a system
+ call
+ <peo-xaci> but these user space system calls need to make some kind of cpu
+ interrupt to communicate with the kernel, right?
+ <braunr> the glibc itself has no idea how a system call will look like in
+ the end
+ <braunr> yes
+ <peo-xaci> an assembler programmer would be able to get around this glue
+ code?
+ <braunr> that's precisely what is embedded in this virtual library
+ <braunr> it could yes
+ <braunr> i think even when sysenter/sysexit is supported, legacy traps are
+ still implemented to support old binaries
+ <braunr> but then all these different entry points will lead to the same
+ code inside the kernel
+ <peo-xaci> but when the glue code is used, then it's API compatible, and
+ then I can understand that the kernel can allow different syscall
+ implementations for different executables
+ <braunr> what glue code ?
+ <peo-xaci> what you talked about above "the user space system call code is
+ loaded as a virtual shared library"
+ <braunr> let's call it vdso
+ <braunr> i have to leave in a few minutes
+ <braunr> keep going, i'll read later
+ <peo-xaci> thanks, I looked it up on Wikipedia and understand immediately
+ :P
+ <peo-xaci> so VDSOs are provided by the kernel, not a regular library file,
+ right?
+ <vdox2> What does HELL stand for :) ?
+ <dardevelin> vdox2, Hurd Emulation Layer for Linux
+ <vdox2> dardevelin: thanks
+ <braunr> peo-xaci: yes
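
That the vDSO is mapped by the kernel rather than loaded from a file can be checked from a process on GNU/Linux: the kernel passes the address of the mapping in the auxiliary vector. The sketch below assumes Linux with a libc providing `getauxval()`; `vdso_check` is a hypothetical name.

```c
#include <assert.h>
#include <elf.h>          /* AT_SYSINFO_EHDR */
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>     /* getauxval() */

/* Hypothetical helper. Returns:
     1  if the kernel reports a vDSO and it starts with the ELF magic,
     0  if an address is reported but does not look like an ELF image,
    -1  if no vDSO address is reported at all. */
int vdso_check(void)
{
    unsigned long base = getauxval(AT_SYSINFO_EHDR);

    /* The cast below relies on addresses fitting in unsigned long. */
    assert(sizeof(unsigned long) == sizeof(void *));

    if (base == 0)
        return -1;
    return memcmp((const void *) (uintptr_t) base, "\177ELF", 4) == 0 ? 1 : 0;
}
```

When a vDSO is present, the reported address points at a complete ELF shared object living only in memory, which is why `ldd` lists it without a path.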
+ <antrik> peo-xaci: I believe your goals are conflicting. a low-level
+ implementation makes it basically impossible to interact between the HELL
+ environment and the GNU/Linux environment in any meaningful way. to allow
+ such interaction, you *have* to have some glue at a higher semantic level
+ <braunr> agreed
+ <antrik> peo-xaci: BTW, if you want regular Linux binaries to get somehow
+ redirected to access HELL facilities, there is already a framework (don't
+ remember the name right now) that allows this kind of system call
+ redirection on Linux
+ <antrik> (it can run both through LD_PRELOAD or as a kernel module -- where
+ obviously only the latter would allow raw system call redirection... but
+ TBH, I don't think that's worthwhile anyways. the rare cases where
+ programs use raw system calls are usually for extremely system-specific
+ stuff anyways...)
+ <antrik> ViewOS is the name
+ <antrik> err... View-OS I mean
+ <antrik> or maybe View OS ? ;-)
+ <antrik> whatever, you'll find it :-)
+
+[[Virtual_Square_View-OS]].
+
+ <antrik> I'm not sure it's really worthwhile to use this either
+ though... the most meaningful interaction is probably at the FS level,
+ and that can be done with FUSE
+ <antrik> OHOH, View-OS probably allows doing more interesting stuff than
+ FUSE, such as modifying the way the VFS works...
+ <antrik> OTOH
+ <antrik> so it could expose more of the Hurd features, at least in theory
+
+
+## IRC, freenode, #hurd, 2012-10-13
+
+ <peo-xaci> antrik / braunr: thanks for your input! I'm not entirely
+ convinced though. :) I will probably return to this project once I have
+ acquired a lot more knowledge about low level stuff. I want to see for
+ myself whether a low level HELL is not feasible. :P
+ <braunr> peo-xaci: what's the point of a low level hell ?
+ <peo-xaci> more Hurd code can be tested in the hell, if the hell is at a
+ low level
+ <peo-xaci> at a higher level, some Hurd code cannot run, because the
+ interfaces they use would not be accessible from the higher level
+ emulation
+ <antrik> peo-xaci: I never said it's not possible. I actually said it would
+ be easier to do. I just said you can't do it low level *and* have
+ meaningful interaction with the host system
+ <peo-xaci> I don't understand why
+ <braunr> peo-xaci: i really don't see what you want to achieve with low
+ level support
+ <braunr> what would be unavailable with a higher level approach ?
diff --git a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
index 80fc9fcd..57eb403d 100644
--- a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
+++ b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation,
+Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -104,3 +105,11 @@ License|/fdl]]."]]"""]]
of embedding it ?
<braunr> right
<antrik> now that's a good question... no idea TBH :-)
+
+
+# IRC, freenode, #hurd, 2012-07-23
+
+ <pinotree> aren't libmachuser and libhurduser supposed to be slowly faded
+ out?
+ <tschwinge> pinotree: That discussion has not yet come to a conclusion, I
+ think. (I'd say: yes.)
diff --git a/open_issues/libpager_deadlock.mdwn b/open_issues/libpager_deadlock.mdwn
new file mode 100644
index 00000000..017ecff6
--- /dev/null
+++ b/open_issues/libpager_deadlock.mdwn
@@ -0,0 +1,165 @@
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+Deadlocks in libpager/periodic sync have been found.
+
+
+# [[gnumach_page_cache_policy]]
+
+
+## IRC, freenode, #hurd, 2012-07-12
+
+ <braunr> ah great, a paper about the mach pageout daemon !
+ <mcsim> braunr: Where is paper about the mach pageout daemon?
+ <braunr> ftp://ftp.cs.cmu.edu/project/mach/doc/published/defaultmm.ps
+ <braunr> might give us a clue about the swap deadlock (although i still
+ have a few ideas to check)
+ <braunr>
+ http://www.sceen.net/~rbraun/moving_the_default_memory_manager_out_of_the_mach_kernel.pdf
+ <braunr> we should more seriously consider sergio's advisory pageout branch
+ some day
+ <braunr> i'll try to get in touch with him about that before he completely
+ loses interest
+ <braunr> i'll include it in my "make that page cache as decent as possible"
+ task
+ <braunr> many of his comments match what i've seen
+ <braunr> and we both did a few optimizations the same way
+ <braunr> (like not deactivating pages when they enter the cache)
+
+
+## IRC, freenode, #hurd, 2012-07-13
+
+ <braunr> antrik: i'm able to consistently reproduce the swap deadlocks you
+ regularly had when using apt with my page cache patch
+ <braunr> it happens when lots of dirty pages are write back to their pagers
+ <braunr> so apt, or a big file copy or anything that writes several MiB
+ very quickly is a good candidate
+ <braunr> written*
+ <antrik> braunr: nice...
+ <braunr> antrik: well in a way, yes, as it will allow us to track it more
+ easily
+
+
+## IRC, freenode, #hurd, 2012-07-15
+
+ <braunr> oh btw, i think i can say with confidence that the hurd *doesn't*
+ deadlock
+ <braunr> (at least, concerning swapping)
+ <braunr> lol, one of my hurd systems has been hitting the "swap deadlock"
+ for more than an hour, and suddenly got out of it
+ <braunr> something is really wrong in the pageout daemon, but it's not a
+ deadlock
+ <youpi> a livelock then
+ <braunr> do you get out of livelocks ?
+ <braunr> i mean, it's not even a "lock"
+ <braunr> just a big damn tricky slowdown
+ <youpi> yes, you can, by giving a few more resources for instance
+ <youpi> depends on the kind of livelock of course
+ <braunr> i think it's that
+ <braunr> the pageout daemon clearly throttles itself, waiting for pagers to
+ complete
+ <braunr> and another dangerous thing is the line in vm_resident, which only
+ wakes one thread to avoid starvation
+ <braunr> hum, during the livelock, the kernel spends much time waiting in
+ db_read_address
+ <braunr> could be a bad stack
+ <braunr> so, the pageout daemon seems to slow itself as much as waiting
+ several seconds between each iteration when under load
+ <braunr> but each iteration possibly removes clean pages
+ <braunr> so at some point, there is enough memory to unblock waiting pagers
+ <braunr> for now i'll try a simple solution, like limiting the pausing
+ delay
+ <braunr> but we'll need more page lists in the future (inactive-clean,
+ inactive-dirty, etc..)
+ <braunr> limiting the amount of dirty pages is the only way to really make
+ it safe actually
+ <braunr> wow, the pageout loop is still running even after many pages were
+ freed, and it is unable to free more pages
+ <braunr> i think i have an idea about the livelock
+ <braunr> i think it comes from the periodic syncing
+ <bddebian> Too often?
+ <braunr> that's not the problem
+ <braunr> the problem is that it can happen at the same time with paging
+ <bddebian> Oh
+ <braunr> if paging gets slow, it won't stop the periodic syncing
+ <braunr> which will grab any page it can as soon as some are free
+ <braunr> but then, before it even finishes, another sync may occur
+ <braunr> i have yet to check that it is possible
+ <braunr> and i don't understand why syncing isn't done by the kernel
+ <braunr> the kernel is supposed to handle the paging policy
+ <braunr> and it would make paging really scale
+ <bddebian> It's done on the Hurd side?
+ <braunr> (instead of having external pagers make one request for each
+ object, even if they're clean)
+ <braunr> yes
+ <bddebian> Hmm, interesting
+ <braunr> ofc, with ext2fs --debug, i can't reproduce anything
+ <bddebian> Ugh
+ <braunr> sync are serialized
+ <braunr> grmbl
+ <braunr> there is a big lock taken at sync time though
+ <braunr> uhg
+
+
+## IRC, freenode, #hurd, 2012-07-16
+
+ <braunr> all right so, there *is* a deadlock, and it may be due to the
+ default pager actually
+ <braunr> the vm_page_laundry_count doesn't decrease at some point, even
+ when there are more than enough free pages
+ <braunr> antrik: the thing is, i think the deadlock concerns the default
+ pager
+ <antrik> the deadlock?
+ <braunr> yes
+ <braunr> when swapping
+
+
+## IRC, freenode, #hurd, 2012-07-17
+
+ <braunr> i can't even reproduce the swap deadlock when using upstream ext2fs
+ :(
+
+
+## IRC, freenode, #hurd, 2012-07-19
+
+ <braunr> the libpager deadlock patch looks wrong to me
+ <braunr> hm no, the libpager patch is ok actually
+
+
+## [[synchronous_ipc]]
+
+### IRC, freenode, #hurd, 2012-07-20
+
+ <braunr> but actually after reviewing more, the debian patch for this
+ particular issue seems correct
+ <antrik> well, it's most probably done by youpi, so I would be shocked if
+ it wasn't correct... ;-)
+ <braunr> he wasn't sure at all about it
+ <antrik> still ;-)
+ <braunr> :)
+ <antrik> well, if you also think it's correct, I guess it's time to push it
+ upstream...
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <braunr> i still can't conclude if we have any pageout deadlock, or if it's
+ simply a side effect of the active and inactive lists getting very very
+ large
+ <braunr> but almost every time this issue happens, it somehow recovers,
+ sometimes hours later
+
+
+# See Also
+
+ * [[ext2fs_deadlock]]
diff --git a/open_issues/libpthread.mdwn b/open_issues/libpthread.mdwn
index c5054b7f..befc1378 100644
--- a/open_issues/libpthread.mdwn
+++ b/open_issues/libpthread.mdwn
@@ -42,3 +42,1287 @@ There is a [[!FF_project 275]][[!tag bounty]] on this task.
<youpi> there'll still be the issue that only one will be initialized
<youpi> and one that provides libc thread safety functions, etc.
<pinotree> that's what i wanted to knew, thanks :)
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <bddebian> So I am not sure what to do with the hurd_condition_wait stuff
+ <braunr> i would also like to know what's the real issue with cancellation
+ here
+ <braunr> because my understanding is that libpthread already implements it
+ <braunr> does it look ok to you to make hurd_condition_timedwait return an
+ errno code (like ETIMEDOUT and ECANCELED) ?
+ <youpi> braunr: that's what pthread_* function usually do, yes
+ <braunr> i thought they used their own code
+ <youpi> no
+ <braunr> thanks
+ <braunr> well, first, do you understand what hurd_condition_wait is ?
+ <braunr> it's similar to condition_wait or pthread_cond_wait with a subtle
+ difference
+ <braunr> it differs from the original cthreads version by handling
+ cancellation
+ <braunr> but it also differs from the second by how it handles cancellation
+ <braunr> instead of calling registered cleanup routines and leaving, it
+ returns an error code
+ <braunr> (well simply !0 in this case)
+ <braunr> so there are two ways
+ <braunr> first, change the call to pthread_cond_wait
+ <bddebian> Are you saying we could fix stuff to use pthread_cond_wait()
+ properly?
+ <braunr> it's possible but not easy
+ <braunr> because you'd have to rewrite the cancellation code
+ <braunr> probably writing cleanup routines
+ <braunr> this can be hard and error prone
+ <braunr> and is useless if the code already exists
+ <braunr> so it seems reasonable to keep this hurd extension
+ <braunr> but now, as it *is* a hurd extension no one else uses
+ <antrik> braunr: BTW, when trying to figure out a tricky problem with the
+ auth server, cfhammer digged into the RPC cancellation code quite a bit,
+ and it's really a horrible complex monstrosity... plus the whole concept
+ is actually broken in some regards I think -- though I don't remember the
+ details
+ <braunr> antrik: i had the same kind of thoughts
+ <braunr> antrik: the hurd or pthreads ones ?
+ <antrik> not sure what you mean. I mean the RPC cancellation code -- which
+ involves thread management too
+ <braunr> ok
+ <antrik> I don't know how it is related to hurd_condition_wait though
+ <braunr> well i found two main entry points there
+ <braunr> hurd_thread_cancel and hurd_condition_wait
+ <braunr> and it didn't look that bad
+ <braunr> whereas in the pthreads code, there are many corner cases
+ <braunr> and even the standard itself looks insane
+ <antrik> well, perhaps the threading part is not that bad...
+ <antrik> it's not where we saw the problems at any rate :-)
+ <braunr> rpc interruption maybe ?
+ <antrik> oh, right... interruption is probably the right term
+ <braunr> yes that thing looks scary
+ <braunr> :))
+ <braunr> the migration thread paper mentions some things about the problems
+ concerning threads controllability
+ <antrik> I believe it's a very strong example for why building around
+ standard Mach features is a bad idea, instead of adapting the primitives
+ to our actual needs...
+ <braunr> i wouldn't be surprised if the "monstrosities" are work arounds
+ <braunr> right
+
+
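A minimal sketch of the semantics braunr describes above, not the actual Hurd code: a condition wait that reports cancellation through its return value, so a server routine can unlock and return `EINTR` itself instead of relying on cleanup routines. The names `sketch_thread`, `sketch_queue`, `sketch_cond_wait`, `sketch_cancel` and `sketch_handle_request` are hypothetical. The cancel flag is an explicit structure here; the log says the real function has to read it from libpthread's internal thread structure.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread state kept inside libpthread. */
struct sketch_thread {
    bool cancel_requested;      /* written by the canceller, under the mutex */
};

/* Hypothetical resource a server thread waits on. */
struct sketch_queue {
    pthread_mutex_t lock;
    pthread_cond_t wakeup;
    bool ready;
};

/*
 * Wait on COND with MUTEX held. Returns 0 after a normal (or spurious)
 * wakeup and 1 if a cancellation request was pending or arrived while
 * blocked. The request is consumed, and MUTEX is held again on return.
 */
static int sketch_cond_wait(struct sketch_thread *self,
                            pthread_cond_t *cond, pthread_mutex_t *mutex)
{
    assert(self != NULL);

    if (!self->cancel_requested)
        pthread_cond_wait(cond, mutex);

    int cancelled = self->cancel_requested ? 1 : 0;
    self->cancel_requested = false;
    return cancelled;
}

/* The canceller's half: mark the target, then wake it so it can notice. */
static void sketch_cancel(struct sketch_thread *target, struct sketch_queue *q)
{
    pthread_mutex_lock(&q->lock);
    target->cancel_requested = true;
    pthread_cond_broadcast(&q->wakeup);
    pthread_mutex_unlock(&q->lock);
}

/* Server-style caller: the cancellation path is ordinary control flow. */
static int sketch_handle_request(struct sketch_thread *self,
                                 struct sketch_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (!q->ready) {
        if (sketch_cond_wait(self, &q->wakeup, &q->lock)) {
            pthread_mutex_unlock(&q->lock);
            return EINTR;
        }
    }
    q->ready = false;           /* consume the event */
    pthread_mutex_unlock(&q->lock);
    return 0;
}
```

With plain `pthread_cond_wait` and POSIX cancellation, the thread would leave through its cleanup handlers and never reach the `return EINTR` line, which is the difficulty raised in the discussion.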
+## IRC, freenode, #hurd, 2012-07-26
+
+ <bddebian> Uhm, where does /usr/include/hurd/signal.h come from?
+ <pinotree> head -n4 /usr/include/hurd/signal.h
+ <bddebian> Ohh glibc?
+ <bddebian> That makes things a little more difficult :(
+ <braunr> why ?
+ <bddebian> Hurd includes it which brings in cthreads
+ <braunr> ?
+ <braunr> the hurd already brings in cthreads
+ <braunr> i don't see what you mean
+ <bddebian> Not anymore :)
+ <braunr> the system cthreads header ?
+ <braunr> well it's not that difficult to trick the compiler not to include
+ them
+ <bddebian> signal.h includes cthreads.h I need to stop that
+ <braunr> just define the _CTHREADS_ macro before including anything
+ <braunr> remember that header files are normally enclosed in such macros to
+ avoid multiple inclusions
+ <braunr> this isn't specific to cthreads
+ <pinotree> converting hurd from cthreads to pthreads will make hurd and
+ glibc break source and binary compatibility
+ <bddebian> Of course
+ <braunr> reminds me of the similar issues of the late 90s
+ <bddebian> Ugh, why is he using _pthread_self()?
+ <pinotree> maybe because it accesses to the internals
+ <braunr> "he" ?
+ <bddebian> Thomas in his modified cancel-cond.c
+ <braunr> well, you need the internals to implement it
+ <braunr> hurd_condition_wait is similar to pthread_cond_wait, except
+ that instead of stopping the thread and calling cleanup routines, it
+ returns 1 if cancelled
+ <pinotree> not that i looked at it, but there's really no way to implement
+ it using public api?
+ <bddebian> Even if I am using glibc pthreads?
+ <braunr> unlikely
+ <bddebian> God I had all of this worked out before I dropped off for a
+ couple years.. :(
+ <braunr> this will come back :p
+ <pinotree> that makes you the perfect guy to work on it ;)
+ <bddebian> I can't find a pt-internal.h anywhere.. :(
+ <pinotree> clone the hurd/libpthread.git repo from savannah
+ <bddebian> Of course when I was doing this libpthread was still in hurd
+ sources...
+ <bddebian> So if I am using glibc pthread, why can't I use pthread_self()
+ instead?
+ <pinotree> that won't give you access to the internals
+ <bddebian> OK, dumb question time. What internals?
+ <pinotree> the libpthread ones
+ <braunr> that's where you will find if your thread has been cancelled or
+ not
+ <bddebian> pinotree: But isn't that assuming that I am using hurd's
+ libpthread?
+ <pinotree> if you aren't inside libpthread, no
+ <braunr> pthread_self is normally not portable
+ <braunr> you can only use it with pthread_equal
+ <braunr> so unless you *know* the internals, you can't use it
+ <braunr> and you won't be able to do much
+ <braunr> so, as it was done with cthreads, hurd_condition_wait should be
+ close to the libpthread implementation
+ <braunr> inside, normally
+ <braunr> now, if it's too long for you (i assume you don't want to build
+ glibc)
+ <braunr> you can just implement it outside, grabbing the internal headers
+ for now
+ <pinotree> another "not that i looked at it" question: is there no way
+ to rewrite the code using that custom condwait stuff to use the standard
+ libpthread one?
+ <braunr> and once it works, it'll get integrated
+ <braunr> pinotree: it looks very hard
+ <bddebian> braunr: But the internal headers are assuming hurd libpthread
+ which isn't in the source anymore
+ <braunr> from what i could see while working on select, servers very often
+ call hurd_condition_wait
+ <braunr> and they return EINTR if cancelled
+ <braunr> so if you use the standard pthread_cond_wait function, your thread
+ won't be able to return anything, unless you push the reply in a
+ completely separate callback
+ <braunr> i'm not sure how well mig can cope with that
+ <braunr> i'd say it can't :)
+ <braunr> no really it looks ugly
+ <braunr> it's far better to have this hurd specific function and keep the
+ existing user code as it is
+ <braunr> bddebian: you don't need the implementation, only the headers
+ <braunr> the thread, cond, mutex structures mostly
+ <bddebian> I should turn <pt-internal.h> to "pt-internal.h" and just put it
+ in libshouldbeinlibc, no?
+ <pinotree> no, that header is not installed
+ <bddebian> Obviously not the "best" way
+ <bddebian> pinotree: ??
+ <braunr> pinotree: what does it change ?
+ <pinotree> braunr: it == ?
+ <braunr> bddebian: you could even copy it entirely in your new
+ cancel-cond.c and mention where it was copied from
+ <braunr> pinotree: it == pt-internal.h not being installed
+ <pinotree> that he cannot include it in libshouldbeinlibc sources?
+ <pinotree> ah, he wants to copy it?
+ <braunr> yes
+ <braunr> i want him to copy it actually :p
+ <braunr> it may be hard if there are a lot of macro options
+ <pinotree> the __pthread struct changes size and content depending on other
+ internal sysdeps headers
+ <braunr> well he needs to copy those too :p
+ <bddebian> Well even if this works we are going to have to do something
+ more "correct" about hurd_condition_wait. Maybe even putting it in
+ glibc?
+ <braunr> sure
+ <braunr> but again, don't waste time on this for now
+ <braunr> make it *work*, then it'll get integrated
+ <bddebian> Like it has already? This "patch" is only about 5 years old
+ now... ;-P
+ <braunr> but is it complete ?
+ <bddebian> Probably not :)
+ <bddebian> Hmm, I wonder how many undefined references I am going to get
+ though.. :(
+ <bddebian> Shit, 5
+ <bddebian> One of which is ___pthread_self.. :(
+ <bddebian> Does that mean I am actually going to have to build hurds
+ libpthreads in libshouldbeinlibc?
+ <bddebian> Seriously, do I really need ___pthread_self, __pthread_self,
+ _pthread_self and pthread_self???
+ <bddebian> I'm still unclear what to do with cancel-cond.c. It seems to me
+ that if I leave it the way it is currently I am going to have to either
+ re-add libpthreads or still all of the libpthreads code under
+ libshouldbeinlibc.
+ <braunr> then add it in libc
+ <braunr> glib
+ <braunr> glibc
+ <braunr> maybe under the name __hurd_condition_wait
+ <bddebian> Shouldn't I be able to interrupt cancel-cond stuff to use glibc
+ pthreads?
+ <braunr> interrupt ?
+ <bddebian> Meaning interject like they are doing. I may be missing the
+ point but they are just obfuscating libpthreads thread with some other
+ "namespace"? (I know my terminology is wrong, sorry).
+ <braunr> they ?
+ <bddebian> Well Thomas in this case but even in the old cthreads code,
+ whoever wrote cancel-cond.c
+ <braunr> but they use internal thread structures ..
+ <bddebian> Understood but at some level they are still just getting to a
+ libpthread thread, no?
+ <braunr> absolutely not ..
+ <braunr> there is *no* pthread stuff in the hurd
+ <braunr> that's the problem :p
+ <bddebian> Bah damnit...
+ <braunr> cthreads are directly implemented on top of mach threads
+ <bddebian> Sure but hurd_condition_wait wasn't
+ <braunr> of course it is
+ <braunr> it's almost the same as condition_wait
+ <braunr> but returns 1 if a cancelation request was made
+ <bddebian> Grr, maybe I am just confusing myself because I am looking at
+ the modified (pthreads) version instead of the original cthreads version
+ of cancel-cond.c
+ <braunr> well if the modified version is fine, why not directly use that ?
+ <braunr> normally, hurd_condition_wait should sit next to other pthread
+ internal stuff
+ <braunr> it could be renamed __hurd_condition_wait, i'm not sure
+ <braunr> that's irrelevant for your work anyway
+ <bddebian> I am using it but it relies on libpthread and I am trying to use
+ glibc pthreads
+ <braunr> hum
+ <braunr> what's the difference between libpthread and "glibc pthreads" ?
+ <braunr> aren't glibc pthreads the merged libpthread ?
+ <bddebian> quite possibly but then I am missing something obvious. I'm
+ getting ___pthread_self in libshouldbeinlibc but it is *UND*
+ <braunr> bddebian: with unmodified binaries ?
+ <bddebian> braunr: No I added cancel-cond.c to libshouldbeinlibc
+ <bddebian> And some of the pt-xxx.h headers
+ <braunr> well it's normal then
+ <braunr> i suppose
+ <bddebian> braunr: So how do I get those defined without including
+ pthreads.c from libpthreads? :)
+ <antrik> pinotree: hm... I think we should try to make sure glibc works
+ both with cthreads hurd and pthreads hurd. I hope that shouldn't be so
+ hard.
+ <antrik> breaking binary compatibility for the Hurd libs is not too
+ terrible I'd say -- as much as I'd like that, we do not exactly have a
+ lot of external stuff depending on them :-)
+ <braunr> bddebian: *sigh*
+ <braunr> bddebian: just add cancel-cond to glibc, near the pthread code :p
+ <bddebian> braunr: Wouldn't I still have the same issue?
+ <braunr> bddebian: what issue ?
+ <antrik> is hurd_condition_wait() the name of the original cthreads-based
+ function?
+ <braunr> antrik: the original is condition_wait
+ <antrik> I'm confused
+ <antrik> is condition_wait() a standard cthreads function, or a
+ Hurd-specific extension?
+ <braunr> antrik: as standard as you can get for something like cthreads
+ <bddebian> braunr: Where hurd_condition_wait is looking for "internals" as
+ you call them. I.E. there is no __pthread_self() in glibc pthreads :)
+ <braunr> hurd_condition_wait is the hurd-specific addition for cancelation
+ <braunr> bddebian: who cares ?
+ <braunr> bddebian: there is a pthread structure, and conditions, and
+ mutexes
+ <braunr> you need those definitions
+ <braunr> so you either import them in the hurd
+ <antrik> braunr: so hurd_condition_wait() *is* also used in the original
+ cthread-based implementation?
+ <braunr> or you write your code directly where they're available
+ <braunr> antrik: what do you call "original" ?
+ <antrik> not transitioned to pthreads
+ <braunr> ok, let's simply call that cthreads
+ <braunr> yes, it's used by every hurd servers
+ <braunr> virtually
+ <braunr> if not really everyone of them
+ <bddebian> braunr: That is where you are losing me. If I can just use
+ glibc pthreads structures, why can't I just use them in the new pthreads
+ version of cancel-cond.c which is what I was originally asking.. :)
+ <braunr> you *have* to do that
+ <braunr> but then, you have to build the whole glibc
+ * bddebian shoots himself
+ <braunr> and i was under the impression you wanted to avoid that
+ <antrik> do any standard pthread functions use identical names to any
+ standard cthread functions?
+ <braunr> what you *can't* do is use the standard pthreads interface
+ <braunr> no, not identical
+ <braunr> but very close
+ <braunr> bddebian: there is a difference between using pthreads, which
+ means using the standard posix interface, and using the glibc pthreads
+ structure, which means toying with the internal implementation
+ <braunr> you *cannot* implement hurd_condition_wait with the standard posix
+ interface, you need to use the internal structures
+ <braunr> hurd_condition_wait is actually a hurd specific addition to the
+ threading library
+ <antrik> well, in that case, the new pthread-based variant of
+ hurd_condition_wait() should also use a different name from the
+ cthread-based one
+ <braunr> so it's normal to put it in that threading library, like it was
+ done for cthreads
+ <braunr> 21:35 < braunr> it could be renamed __hurd_condition_wait, i'm not
+ sure
+ <bddebian> Except that I am trying to avoid using that threading library
+ <braunr> what ?
+ <bddebian> If I am understanding you correctly it is an extension to the
+ hurd specific libpthreads?
+ <braunr> to the threading library, whichever it is
+ <braunr> antrik: although, why not keeping the same name ?
+ <antrik> braunr: I don't think having hurd_condition_wait() for the cthread
+ variant and __hurd_condition_wait() would exactly help clarity...
+ <antrik> I was talking about a really new name. something like
+ pthread_hurd_condition_wait() or so
+ <antrik> braunr: to avoid confusion. to avoid accidentally pulling in the
+ wrong one at build and/or runtime.
+ <antrik> to avoid possible namespace conflicts
+ <braunr> ok
+ <braunr> well yes, makes sense
+ <bddebian> braunr: Let me state this as plainly as I hope I can. If I want
+ to use glibc's pthreads, I have no choice but to add it to glibc?
+ <braunr> and pthread_hurd_condition_wait is a fine name
+ <braunr> bddebian: no
+ <braunr> bddebian: you either add it there
+ <braunr> bddebian: or you copy the headers defining the internal structures
+ somewhere else and implement it there
+ <braunr> but adding it to glibc is better
+ <braunr> it's just longer in the beginning, and now i'm working on it, i'm
+ really not sure
+ <braunr> add it to glibc directly :p
+ <bddebian> That's what I am trying to do but the headers use pthread
+ specific stuff which should be coming from glibc's pthreads
+ <braunr> yes
+ <braunr> well it's not the headers you need
+ <braunr> you need the internal structure definitions
+ <braunr> sometimes they're in c files for opacity
+ <bddebian> So ___pthread_self() should eventually be an obfuscation of
+ glibcs pthread_self(), no?
+ <braunr> i don't know what it is
+ <braunr> read the cthreads variant of hurd_condition_wait, understand it,
+ do the same for pthreads
+ <braunr> it's easy :p
+ <bddebian> For you bastards that have a clue!! ;-P
+ <antrik> I definitely vote for adding it to the hurd pthreads
+ implementation in glibc right away. trying to do it externally only adds
+ unnecessary complications
+ <antrik> and we seem to agree that this new pthread function should be
+ named pthread_hurd_condition_wait(), not just hurd_condition_wait() :-)
+
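A small, self-contained illustration of the include-guard trick braunr mentions earlier in this log (defining the guard macro before including anything). It is only a sketch: the guard name `SKETCH_LEGACY_THREADS_H` and every other identifier are hypothetical, and the "header" is pasted inline so that the example fits in a single file.

```c
#include <assert.h>

/* Step 1: the includer pre-defines the guard macro (hypothetical name). */
#define SKETCH_LEGACY_THREADS_H 1

/*
 * Step 2: what the preprocessor sees when the guarded header is pulled in
 * later, directly or through another header. The whole body is skipped
 * because the guard is already defined.
 */
#ifndef SKETCH_LEGACY_THREADS_H
#define SKETCH_LEGACY_THREADS_H 1
#define SKETCH_LEGACY_API_VISIBLE 1
typedef struct sketch_legacy_mutex { int held; } sketch_legacy_mutex_t;
#endif

/* Step 3: later code can tell whether the legacy declarations exist. */
#ifdef SKETCH_LEGACY_API_VISIBLE
enum { sketch_legacy_api_visible = 1 };
#else
enum { sketch_legacy_api_visible = 0 };
#endif

/* Reports at run time what the preprocessor decided above. */
static int sketch_legacy_declarations_present(void)
{
    assert(sketch_legacy_api_visible == 0 || sketch_legacy_api_visible == 1);
    return sketch_legacy_api_visible;
}
```

The trick hides everything the header declares, so it only helps when nothing later depends on those declarations. The 2012-07-27 log below runs into exactly that limit: `signal.h` needs `struct mutex`, so the include stays.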
+
+## IRC, freenode, #hurd, 2012-07-27
+
+ <bddebian> OK this hurd_condition_wait stuff is getting ridiculous the way
+ I am trying to tackle it. :( I think I need a new tactic.
+ <braunr> bddebian: what do you mean ?
+ <bddebian> braunr: I know I am thick headed but I still don't get why I
+ cannot implement it in libshouldbeinlibc for now but still use glibc
+ pthreads internals
+ <bddebian> I thought I was getting close last night by bringing in all of
+ the hurd pthread headers and .c files but it just keeps getting uglier
+ and uglier
+ <bddebian> youpi: Just to verify. The /usr/lib/i386-gnu/libpthread.so that
+ ships with Debian now is from glibc, NOT libpthreads from Hurd right?
+ Everything I need should be available in glibc's libpthreads? (Except for
+ hurd_condition_wait obviously).
+ <braunr> 22:35 < antrik> I definitely vote for adding it to the hurd
+ pthreads implementation in glibc right away. trying to do it externally
+ only adds unnecessary complications
+ <youpi> bddebian: yes
+ <youpi> same as antrik
+ <bddebian> fuck
+ <youpi> libpthread *already* provides some odd symbols (cthread
+ compatibility), it can provide others
+ <braunr> bddebian: don't curse :p it will be easier in the long run
+ * bddebian breaks out glibc :(
+ <braunr> but you should tell thomas that too
+ <bddebian> braunr: I know it just adds a level of complexity that I may not
+ be able to deal with
+ <braunr> we wouldn't want him to waste too much time on the external
+ libpthread
+ <braunr> which one ?
+ <bddebian> glibc for one. hurd_condition_wait() for another which I don't
+ have a great grasp on. Remember my knowledge/skillsets are limited
+ currently.
+ <braunr> bddebian: tschwinge has good instructions to build glibc
+ <braunr> keep your tree around and it shouldn't be long to hack on it
+ <braunr> for hurd_condition_wait, i can help
+ <bddebian> Oh I was thinking about using Debian glibc for now. You think I
+ should do it from git?
+ <braunr> no
+ <braunr> debian rules are even more reliable
+ <braunr> (just don't build all the variants)
+ <pinotree> `debian/rules build_libc` builds the plain i386 variant only
+ <bddebian> So put pthread_hurd_cond_wait in its own .c file or just put it
+ in pt-cond-wait.c ?
+ <braunr> i'd put it in pt-cond-wait.c
+ <bddebian> youpi or braunr: OK, another dumb question. What (if anything)
+ should I do about hurd/hurd/signal.h. Should I stop it from including
+ cthreads?
+ <youpi> it's not a dumb question. it should probably stop, yes, but there
+ might be uncovered issues, which we'll have to take care of
+ <bddebian> Well I know antrik suggested trying to keep compatibility but I
+ don't see how you would do that
+ <braunr> compatibility between what ?
+ <braunr> and source and/or binary ?
+ <youpi> hurd/signal.h implicitly including cthreads.h
+ <braunr> ah
+ <braunr> well yes, it has to change obviously
+ <bddebian> Which will break all the cthreads stuff of course
+ <bddebian> So are we agreeing on pthread_hurd_cond_wait()?
+ <braunr> that's fine
+ <bddebian> Ugh, shit there is stuff in glibc using cthreads??
+ <braunr> like what ?
+ <bddebian> hurdsig, hurdsock, setauth, dtable, ...
+ <youpi> it's just using the compatibility stuff, that pthread does provide
+ <bddebian> but it includes cthreads.h implicitly
+ <bddebian> s/it/they in many cases
+ <youpi> not a problem, we provide the functions
+ <bddebian> Hmm, then what do I do about signal.h? It includes cthreads.h
+ because it uses extern struct mutex ...
+ <youpi> ah, then keep the include
+ <youpi> the pthread mutexes are compatible with that
+ <youpi> we'll clean that afterwards
+ <bddebian> arf, OK
+ <youpi> that's what I meant by "uncover issues"
+
+
+## IRC, freenode, #hurd, 2012-07-28
+
+ <bddebian> Well crap, glibc built but I have no symbol for
+ pthread_hurd_cond_wait in libpthread.so :(
+ <bddebian> Hmm, I wonder if I have to add pthread_hurd_cond_wait to
+ forward.c and Versions? (Versions obviously eventually)
+ <pinotree> bddebian: most probably not about forward.c, but definitely you
+ have to export public stuff using Versions
+
+
+## IRC, freenode, #hurd, 2012-07-29
+
+ <bddebian> braunr: http://paste.debian.net/181078/
+ <braunr> ugh, inline functions :/
+ <braunr> "Tell hurd_thread_cancel how to unblock us"
+ <braunr> i think you need that one too :p
+ <bddebian> ??
+ <braunr> well, they work in pair
+ <braunr> one cancels, the other notices it
+ <braunr> hurd_thread_cancel is in the hurd though, iirc
+ <braunr> or uh wait
+ <braunr> no it's in glibc, hurd/thread-cancel.c
+ <braunr> otherwise it looks like a correct reuse of the original code, but
+ i need to understand the pthreads internals better to really say anything
+
+
+## IRC, freenode, #hurd, 2012-08-03
+
+ <braunr> pinotree: what do you think of
+ condition_implies/condition_unimplies ?
+ <braunr> the work on pthread will have to replace those
+
+
+## IRC, freenode, #hurd, 2012-08-06
+
+ <braunr> bddebian: so, where is the work being done ?
+ <bddebian> braunr: Right now I would just like to test getting my glibc
+ with pthread_hurd_cond_wait installed on the clubber subhurd. It is in
+ /home/bdefreese/glibc-debian2
+ <braunr> we need a git branch
+ <bddebian> braunr: Then I want to rebuild hurd with Thomas's pthread
+ patches against that new libc
+ <bddebian> Aye
+ <braunr> i don't remember, did thomas set a git repository somewhere for
+ that ?
+ <bddebian> He has one but I didn't have much luck with it since he is using
+ an external libpthreads
+ <braunr> i can manage the branches
+ <bddebian> I was actually patching debian/hurd then adding his patches on
+ top of that. It is in /home/bdefreese/debian-hurd but he has updated
+ some stuff since then
+ <bddebian> Well we need to agree on a strategy. libpthreads only exists in
+ debian/glibc
+ <braunr> it would be better to have something upstream than to work on a
+ debian specific branch :/
+ <braunr> tschwinge: do you think it can be done
+ <braunr> ?
+
+
+## IRC, freenode, #hurd, 2012-08-07
+
+ <tschwinge> braunr: You mean to create on Savannah branches for the
+ libpthread conversion? Sure -- that's what I have been suggesting to
+ Barry and Thomas D. all the time.
+
+ <bddebian> braunr: OK, so I installed my glibc with
+ pthread_hurd_condition_wait in the subhurd and now I have built Debian
+ Hurd with Thomas D's pthread patches.
+ <braunr> bddebian: i'm not sure we're ready for tests yet :p
+ <bddebian> braunr: Why not? :)
+ <braunr> bddebian: a few important bits are missing
+ <bddebian> braunr: Like?
+ <braunr> like condition_implies
+ <braunr> i'm not sure they have been handled everywhere
+ <braunr> it's still interesting to try, but i bet your system won't finish
+ booting
+ <bddebian> Well I haven't "installed" the built hurd yet
+ <bddebian> I was trying to think of a way to test a little bit first, like
+ maybe ext2fs.static or something
+ <bddebian> Ohh, it actually mounted the partition
+ <bddebian> How would I actually "test" it?
+ <braunr> git clone :p
+ <braunr> building a debian package inside
+ <braunr> removing the whole content after
+ <braunr> that sort of things
+ <bddebian> Hmm, I think I killed clubber :(
+ <bddebian> Yep.. Crap! :(
+ <braunr> ?
+ <braunr> how did you do that ?
+ <bddebian> Mounted a new partition with the pthreads ext2fs.static then did
+ an apt-get source hurd to it..
+ <braunr> what partition, and what mount point ?
+ <bddebian> I added a new 2Gb partition on /dev/hd0s6 and set the translator
+ on /home/bdefreese/part6
+ <braunr> shouldn't kill your hurd
+ <bddebian> Well it might still be up but killed my ssh session at the very
+ least :)
+ <braunr> ouch
+ <bddebian> braunr: Do you have debugging enabled in that custom kernel you
+ installed? Apparently it is sitting at the debug prompt.
+
+
+## IRC, freenode, #hurd, 2012-08-12
+
+ <braunr> hmm, it seems the hurd notion of cancellation is actually not the
+ pthread one at all
+ <braunr> pthread_cancel merely marks a thread as being cancelled, while
+ hurd_thread_cancel interrupts it
+ <braunr> ok, i have a pthread_hurd_cond_wait_np function in glibc
+
+
+## IRC, freenode, #hurd, 2012-08-13
+
+ <braunr> nice, i got ext2fs work with pthreads
+ <braunr> there are issues with the stack size strongly limiting the number
+ of concurrent threads, but that's easy to fix
+ <braunr> one problem with the hurd side is the condition implications
+ <braunr> i think it should be dealt with separately, and before doing anything
+ with pthreads
+ <braunr> but that's minor, the most complex part is, again, the term server
+ <braunr> other than that, it was pretty easy to do
+ <braunr> but, i shouldn't speak too soon, who knows what tricky bootstrap
+ issue i'm gonna face ;p
+ <braunr> tschwinge: i'd like to know how i should proceed if i want a
+ symbol in a library overridden by that of a main executable
+ <braunr> e.g. have libpthread define a default stack size, and let
+ executables define their own if they want to change it
+ <braunr> tschwinge: i suppose i should create a weak alias in the library
+ and a normal variable in the executable, right ?
+ <braunr> hm i'm making this too complicated
+ <braunr> don't mind that stupid question
+ <tschwinge> braunr: A simple variable definition would do, too, I think?
+ <tschwinge> braunr: Anyway, I'd first like to know why we can't reduce the
+ size of libpthread threads from 2 MiB to 64 KiB as libthreads had. Is
+ that a requirement of the pthread specification?
+ <braunr> tschwinge: it's a requirement yes
+ <braunr> the main reason i see is that hurd threadvars (which are still
+ present) rely on common stack sizes and alignment to work
+ <tschwinge> Mhm, I see.
+ <braunr> so for now, i'm using this approach as a hack only
+ <tschwinge> I'm working on phasing out threadvars, but we're not there yet.
+ <tschwinge> Yes, that's fine for the moment.
+ <braunr> tschwinge: a simple definition wouldn't work
+ <braunr> tschwinge: i resorted to a weak symbol, and see how it goes
+ <braunr> tschwinge: i suppose i need to export my symbol as a global one,
+ otherwise making it weak makes no sense, right ?
+ <braunr> tschwinge: also, i'm not actually sure what you meant is a
+ requirement about the stack size, i shouldn't have answered right away
+ <braunr> no there is actually no requirement
+ <braunr> i misunderstood your question
+ <braunr> hm when adding this weak variable, starting a program segfaults :(
+ <braunr> apparently on ___pthread_self, a tls variable
+ <braunr> fighting black magic begins
+ <braunr> arg, i can't manage to use that weak symbol to reduce stack sizes
+ :(
+ <braunr> ah yes, finally
+ <braunr> git clone /path/to/glibc.git on a pthread-powered ext2fs server :>
+ <braunr> tschwinge: seems i have problems using __thread in hurd code
+ <braunr> tschwinge: they produce undefined symbols
+ <braunr> tschwinge: forget that, another mistake on my part
+ <braunr> so, current state: i just need to create another patch, for the
+ code that is included in the debian hurd package but not in the upstream
+ hurd repository (e.g. procfs, netdde), and i should be able to create
+ hurd packages that completely use pthreads
+
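A sketch of the weak-symbol approach discussed above, assuming GCC or Clang (the `weak` attribute is a compiler extension). The variable `sketch_default_stacksize` and the helper `sketch_attr_init` are hypothetical names, not those used in braunr's patch. The 64 KiB default is simply the libthreads figure tschwinge quotes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Library side: a weak, globally visible default. A program that links a
 * normal (strong) definition of the same variable replaces this value at
 * link time; without one, the weak default below is used.
 */
size_t sketch_default_stacksize __attribute__((weak)) = 64 * 1024;

/* Library side: initialise thread attributes with the selected stack size. */
static int sketch_attr_init(pthread_attr_t *attr)
{
    assert(attr != NULL);

    int err = pthread_attr_init(attr);
    if (err != 0)
        return err;

    err = pthread_attr_setstacksize(attr, sketch_default_stacksize);
    if (err != 0)
        pthread_attr_destroy(attr);   /* do not leak a half-set-up attr */
    return err;
}
```

An executable that wants a different size would contain an ordinary definition such as `size_t sketch_default_stacksize = 2 * 1024 * 1024;`. The symbol has to be global on both sides for that override to take effect, which matches the point braunr raises about exporting it.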
+
+## IRC, freenode, #hurd, 2012-08-14
+
+ <braunr> tschwinge: i have weird bootstrap issues, as expected
+ <braunr> tschwinge: can you point me to important files involved during
+ bootstrap ?
+ <braunr> my ext2fs.static server refuses to start as a rootfs, whereas it
+ seems to work fine otherwise
+ <braunr> hm, it looks like it's related to global signal dispositions
+
+
+## IRC, freenode, #hurd, 2012-08-15
+
+ <braunr> ahah, a subhurd running pthreads-powered hurd servers only
+ <LarstiQ> braunr: \o/
+ <braunr> i can even log on ssh
+ <braunr> pinotree: for reference, i uploaded my debian-specific changes
+ there :
+ <braunr> http://git.sceen.net/rbraun/debian_hurd.git/
+ <braunr> darnassus is now running a pthreads-enabled hurd system :)
+
+
+## IRC, freenode, #hurd, 2012-08-16
+
+ <braunr> my pthreads-enabled hurd systems can quickly die under load
+ <braunr> youpi: with hurd servers using pthreads, i occasionally see thread
+ storms apparently due to a deadlock
+ <braunr> youpi: it makes me think of the problem you sometimes have (and
+ had often with the page cache patch)
+ <braunr> in cthreads, mutex and condition operations are macros, and they
+ check the mutex/condition queue without holding the internal
+ mutex/condition lock
+ <braunr> i'm not sure where this can lead to, but it doesn't seem right
+ <pinotree> isn't that a bit dangerous?
+ <braunr> i believe it is
+ <braunr> i mean
+ <braunr> it looks dangerous
+ <braunr> but it may be perfectly safe
+ <pinotree> could it be?
+ <braunr> aiui, it's an optimization, e.g. "dont take the internal lock if
+ there are no threads to wake"
+ <braunr> but if there is a thread enqueuing itself at the same time, it
+ might not be woken
+ <pinotree> yeah
+ <braunr> pthreads don't have this issue
+ <braunr> and what i see looks like a deadlock
+ <pinotree> anything can happen between the unlocked checking and the
+ following instruction
+ <braunr> so i'm not sure how a situation working around a faulty
+ implementation would result in a deadlock with a correct one
+ <braunr> on the other hand, the error youpi reported
+ (http://lists.gnu.org/archive/html/bug-hurd/2012-07/msg00051.html) seems
+ to indicate something is deeply wrong with libports
+ <pinotree> it could also be the current code does not really "works around"
+ that, but simply implicitly relies on the so-generated behaviour
+ <braunr> luckily not often
+ <braunr> maybe
+ <braunr> i think we have to find and fix these issues before moving to
+ pthreads entirely
+ <braunr> (ofc, using pthreads to trigger those bugs is a good procedure)
+ <pinotree> indeed
+ <braunr> i wonder if tweaking the error checking mode of pthreads to abort
+ on EDEADLK is a good approach to detecting this problem
+ <braunr> let's try !
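For reference, the standard way to get `EDEADLK` out of a POSIX mutex is the error-checking mutex type. The following is a minimal sketch, assuming a conforming `pthread_mutexattr_settype`; the helper names `errorcheck_mutex_init` and `checked_lock` are hypothetical and this is not the modified libpthread mentioned in the log. Note that this only catches a thread relocking a mutex it already holds, not deadlocks involving several threads.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Initialise MUTEX as an error-checking mutex.
   Returns 0 on success or an error number.  */
int
errorcheck_mutex_init (pthread_mutex_t *mutex)
{
  pthread_mutexattr_t attr;
  int err;

  assert (mutex != NULL);
  err = pthread_mutexattr_init (&attr);
  if (err != 0)
    return err;
  err = pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_ERRORCHECK);
  if (err == 0)
    err = pthread_mutex_init (mutex, &attr);
  pthread_mutexattr_destroy (&attr);
  return err;
}

/* Lock MUTEX and abort on any failure, including EDEADLK, so that the
   faulty call site shows up in a core dump or debugger.  */
void
checked_lock (pthread_mutex_t *mutex)
{
  int err = pthread_mutex_lock (mutex);
  if (err != 0)
    {
      fprintf (stderr, "pthread_mutex_lock: %s\n", strerror (err));
      abort ();
    }
}
```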
+ <braunr> youpi: eh, i think i've spotted the libports ref mistake
+ <youpi> ooo!
+ <youpi> .oOo.!!
+ <gnu_srs> Same problem but different patches
+ <braunr> look at libports/bucket-iterate.c
+ <braunr> in the HURD_IHASH_ITERATE loop, pi->refcnt is incremented without
+ a lock
+ <youpi> Mmm, the incrementation itself would probably be compiled into an
+ INC, which is safe in UP
+ <youpi> it's an add currently actually
+ <youpi> 0x00004343 <+163>: addl $0x1,0x4(%edi)
+ <braunr> 40c4: 83 47 04 01 addl $0x1,0x4(%edi)
+ <youpi> that makes it SMP unsafe, but not UP unsafe
+ <braunr> right
+ <braunr> too bad
+ <youpi> that still deserves fixing :)
+ <braunr> the good side is my mind is already wired for smp
+ <youpi> well, it's actually not UP either
+ <youpi> in general
+ <youpi> when the processor is not able to do the add in one instruction
+ <braunr> sure
+ <braunr> youpi: looks like i'm wrong, refcnt is protected by the global
+ libports lock
+ <youpi> braunr: but aren't there pieces of code which manipulate the refcnt
+ while taking another lock than the global libports lock
+ <youpi> it'd not be scalable to use the global libports lock to protect
+ refcnt
+ <braunr> youpi: imo, the scalability issues are present because global
+ locks are taken all the time, indeed
+ <youpi> urgl
+ <braunr> yes ..
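As the log concludes, `refcnt` is in fact protected by the global libports lock. The alternative youpi hints at, a counter that is safe on SMP without any lock, could look roughly like the sketch below. It uses C11 atomics; `struct object` and its functions are hypothetical stand-ins, not the libports `port_info` structure or API.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a reference-counted object.  */
struct object
{
  atomic_int refcnt;
};

/* The creator holds the first reference.  */
void
object_init (struct object *o)
{
  atomic_init (&o->refcnt, 1);
}

/* Take a reference.  Unlike a plain `addl', the atomic read-modify-write
   is safe when several processors update the counter concurrently.  */
void
object_ref (struct object *o)
{
  int old = atomic_fetch_add (&o->refcnt, 1);
  assert (old > 0);
  (void) old;
}

/* Drop a reference.  Returns nonzero when the last one was dropped and
   the caller should release the object.  */
int
object_deref (struct object *o)
{
  int old = atomic_fetch_sub (&o->refcnt, 1);
  assert (old > 0);
  return old == 1;
}
```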
+ <braunr> when enabling mutex checks in libpthread, pfinet dies :/
+ <braunr> grmbl, when trying to start "ls" using my deadlock-detection
+ libpthread, the terminal gets unresponsive, and i can't even use ps .. :(
+ <pinotree> braunr: one could say your deadlock detection works too
+ good... :P
+ <braunr> pinotree: no, i made a mistake :p
+ <braunr> it works now :)
+ <braunr> well, works is a bit fast
+ <braunr> i can't attach gdb now :(
+ <braunr> *sigh*
+ <braunr> i guess i'd better revert to a cthreads hurd and debug from there
+ <braunr> eh, with my deadlock-detection changes, recursive mutexes are now
+ failing on _pthread_self(), which for some obscure reason generates this
+ <braunr> => 0x0107223b <+283>: jmp 0x107223b
+ <__pthread_mutex_timedlock_internal+283>
+ <braunr> *sigh*
+
+
+## IRC, freenode, #hurd, 2012-08-17
+
+ <braunr> aw, the thread storm i see isn't a deadlock
+ <braunr> seems to be mere contention ....
+ <braunr> youpi: what do you think of the way
+ ports_manage_port_operations_multithread determines it needs to spawn a
+ new thread ?
+ <braunr> it grabs a lock protecting the number of threads to determine if
+ it needs a new thread
+ <braunr> then releases it, to retake it right after if a new thread must be
+ created
+ <braunr> aiui, it could lead to a situation where many threads could
+ determine they need to create threads
+ <youpi> braunr: there's no reason to release the spinlock before re-taking
+ it
+ <youpi> that can indeed lead to too much thread creations
+ <braunr> youpi: a harder question
+ <braunr> youpi: what if thread creation fails ? :/
+ <braunr> if i'm right, hurd servers simply never expect thread creation to
+ fail
+ <youpi> indeed
+ <braunr> and as some patterns have threads blocking until another produce
+ an event
+ <braunr> i'm not sure there is any point handling the failure at all :/
+ <youpi> well, at least produce some output
+ <braunr> i added a perror
+ <youpi> so we know that happened
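A minimal sketch of reporting a failed thread creation, assuming POSIX threads. `pthread_create` returns the error number directly and does not set `errno`, so the value is copied into `errno` before calling `perror`. The names `spawn_detached` and `idle_worker` are hypothetical; this is not the actual libports change.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Trivial thread body used for illustration only.  */
void *
idle_worker (void *arg)
{
  return arg;
}

/* Create a detached thread running FN (ARG).  On failure, print a
   diagnostic so the event is at least visible, and return the error
   number; on success return 0.  */
int
spawn_detached (void *(*fn) (void *), void *arg)
{
  pthread_t thread;
  int err;

  assert (fn != NULL);
  err = pthread_create (&thread, NULL, fn, arg);
  if (err != 0)
    {
      errno = err;              /* perror reads errno */
      perror ("pthread_create");
      return err;
    }
  return pthread_detach (thread);
}
```

When resources run out, the diagnostic printed would be of the form `pthread_create: Resource temporarily unavailable` (`EAGAIN`).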
+ <braunr> async messaging is quite evil actually
+ <braunr> the bug i sometimes have with pfinet is usually triggered by
+ fakeroot
+ <braunr> it seems to use select a lot
+ <braunr> and select often destroys ports when it has something to return to
+ the caller
+ <braunr> which creates dead name notifications
+ <braunr> and if done often enough, a lot of them
+ <youpi> uh
+ <braunr> and as pfinet is creating threads to service new messages, already
+ existing threads are starved and can't continue
+ <braunr> which leads to pfinet exhausting its address space with thread
+ stacks (at about 30k threads)
+ <braunr> i initially thought it was a deadlock, but my modified libpthread
+ didn't detect one, and indeed, after i killed fakeroot (the whole
+ dpkg-buildpackage process hierarchy), pfinet just "cooled down"
+ <braunr> with almost all 30k threads simply waiting for requests to
+ service, and the few expected select calls blocking (a few ssh sessions,
+ exim probably, possibly others)
+ <braunr> i wonder why this doesn't happen with cthreads
+ <youpi> there's a 4k guard between stacks, otherwise I don't see anything
+ obvious
+ <braunr> i'll test my pthreads package with the fixed
+ ports_manage_port_operations_multithread
+ <braunr> but even if this "fix" should reduce thread creation, it doesn't
+ prevent the starvation i observed
+ <braunr> evil concurrency :p
+
+ <braunr> youpi: hm i've just spotted an important difference actually
+ <braunr> youpi: glibc sched_yield is __swtch(), cthreads is
+ thread_switch(MACH_PORT_NULL, SWITCH_OPTION_DEPRESS, 10)
+ <braunr> i'll change the glibc implementation, see how it affects the whole
+ system
+
+ <braunr> youpi: do you think bootsting the priority or cancellation
+ requests is an acceptable workaround ?
+ <braunr> boosting
+ <braunr> of*
+ <youpi> workaround for what?
+ <braunr> youpi: the starvation i described earlier
+ <youpi> well, I guess I'm not into the thing enough to understand
+ <youpi> you meant the dead port notifications, right?
+ <braunr> yes
+ <braunr> they are the cancellation triggers
+ <youpi> cancelling what?
+ <braunr> a blocking select for example
+ <braunr> ports_do_mach_notify_dead_name -> ports_dead_name ->
+ ports_interrupt_notified_rpcs -> hurd_thread_cancel
+ <braunr> so it's important they are processed quickly, to allow blocking
+ threads to unblock, reply, and be recycled
+ <youpi> you mean the threads in pfinet?
+ <braunr> the issue applies to all servers, but yes
+ <youpi> k
+ <youpi> well, it can not not be useful :)
+ <braunr> whatever the choice, it seems to be there will be a security issue
+ (a denial of service of some kind)
+ <youpi> well, it's not only in that case
+ <youpi> you can always queue a lot of requests to a server
+ <braunr> sure, i'm just focusing on this particular problem
+ <braunr> hm
+ <braunr> max POLICY_TIMESHARE or min POLICY_FIXEDPRI ?
+ <braunr> i'd say POLICY_TIMESHARE just in case
+ <braunr> (and i'm not sure mach handles fixed priority threads first
+ actually :/)
+ <braunr> hm my current hack which consists of calling swtch_pri(0) from a
+ freshly created thread seems to do the job eh
+ <braunr> (it may be what cthreads unintentionally does by acquiring a spin
+ lock from the entry function)
+ <braunr> not a single issue any more with this hack
+ <bddebian> Nice
+ <braunr> bddebian: well it's a hack :p
+ <braunr> and the problem is that, in order to boost a thread's priority,
+ one would need to implement that in libpthread
+ <bddebian> there isn't thread priority in libpthread?
+ <braunr> it's not implemented
+ <bddebian> Interesting
+ <braunr> if you want to do it, be my guest :p
+ <braunr> mach should provide the basic stuff for a partial implementation
+ <braunr> but for now, i'll fall back on the hack, because that's what
+ cthreads "does", and it's "reliable enough"
+
+ <antrik> braunr: I don't think the locking approach in
+ ports_manage_port_operations_multithread() could cause issues. the worst
+ that can happen is that some other thread becomes idle between the check
+ and creating a new thread -- and I can't think of a situation where this
+ could have any impact...
+ <braunr> antrik: hm ?
+ <braunr> the worst case is that many threads will evaluate spawn to 1 and
+ create threads, whereas only one of them should have
+ <antrik> braunr: I'm not sure perror() is a good way to handle the
+ situation where thread creation failed. this would usually happen because
+ of resource shortage, right? in that case, it should work in non-debug
+ builds too
+ <braunr> perror isn't specific to debug builds
+ <braunr> i'm building glibc packages with a pthreads-enabled hurd :>
+ <braunr> (which at one point run the test allocating and filling 2 GiB of
+ memory, which passed)
+ <braunr> (with a kernel using a 3/1 split of course, swap usage reached
+ something like 1.6 GiB)
+ <antrik> braunr: BTW, I think the observation that thread storms tend to
+ happen on destroying stuff more than on creating stuff has been made
+ before...
+ <braunr> ok
+ <antrik> braunr: you are right about perror() of course. brain fart -- was
+ thinking about assert_perror()
+ <antrik> (which is misused in some places in existing Hurd code...)
+ <antrik> braunr: I still don't see the issue with the "spawn"
+ locking... the only situation where this code can be executed
+ concurrently is when multiple threads are idle and handling incoming
+ request -- but in that case spawning does *not* happen anyways...
+ <antrik> unless you are talking about something else than what I'm thinking
+ of...
+ <braunr> well imagine you have idle threads, yes
+ <braunr> let's say a lot like a thousand
+ <braunr> and the server gets a thousand requests
+ <braunr> and one more :p
+ <braunr> normally only one thread should be created to handle it
+ <braunr> but here, the worst case is that all threads run internal_demuxer
+ roughly at the same time
+ <braunr> and they all determine they need to spawn a thread
+ <braunr> leading to another thousand
+ <braunr> (that's extreme and very unlikely in practice of course)
+ <antrik> oh, I see... you mean all the idle threads decide that no spawning
+ is necessary; but before they proceed, finally one comes in and decides
+ that it needs to spawn; and when the other ones are scheduled again they
+ all spawn unnecessarily?
+ <braunr> no, spawn is a local variable
+ <braunr> it's rather, all idle threads become busy, and right before
+ servicing their request, they all decide they must spawn a thread
+ <antrik> I don't think that's how it works. changing the status to busy (by
+ decrementing the idle counter) and checking that there are no idle
+ threads is atomic, isn't it?
+ <braunr> no
+ <antrik> oh
+ <antrik> I guess I should actually look at that code (again) before
+ commenting ;-)
+ <braunr> let me check
+ <braunr> no sorry you're right
+ <braunr> so right, you can't lead to that situation
+ <braunr> i don't even understand how i can't see that :/
+ <braunr> let's say it's the heat :p
+ <braunr> 22:08 < braunr> so right, you can't lead to that situation
+ <braunr> it can't lead to that situation
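The conclusion of this exchange, that marking a thread busy and checking for remaining idle threads happen in one critical section, can be sketched as follows. This is an assumption-laden illustration, not the libports code: `struct pool`, `pool_begin_request` and `pool_end_request` are hypothetical names, and a pthread mutex stands in for the spin lock. The sketch also reserves the new thread's slot before unlocking, so the lock does not have to be released and retaken as discussed earlier.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for a pool of service threads.  */
struct pool
{
  pthread_mutex_t lock;
  int idle;                     /* threads waiting for a request */
  int total;                    /* all service threads */
};

/* Called by a thread that has just received a request.  The caller is
   marked busy and, in the same critical section, it is decided whether
   it was the last idle thread.  If so, the slot for the replacement
   thread is accounted for before unlocking.  Returns true when the
   caller must create one new thread (and undo the accounting if that
   creation fails).  */
bool
pool_begin_request (struct pool *p)
{
  bool spawn;

  pthread_mutex_lock (&p->lock);
  assert (p->idle > 0);
  p->idle--;
  spawn = (p->idle == 0);
  if (spawn)
    {
      p->idle++;
      p->total++;
    }
  pthread_mutex_unlock (&p->lock);
  return spawn;
}

/* Called when the request has been serviced.  */
void
pool_end_request (struct pool *p)
{
  pthread_mutex_lock (&p->lock);
  p->idle++;
  pthread_mutex_unlock (&p->lock);
}
```

With two idle threads and two simultaneous requests, only the second caller gets `true`, so a single thread is created however the two calls interleave.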
+
+
+## IRC, freenode, #hurd, 2012-08-18
+
+ <braunr> one more attempt at fixing netdde, hope i get it right this time
+ <braunr> some parts assume a ddekit thread is a cthread, because they share
+ the same address
+ <braunr> it's not as easy when using pthread_self :/
+ <braunr> good, i got netdde work with pthreads
+ <braunr> youpi: for reference, there are now glibc, hurd and netdde
+ packages on my repository
+ <braunr> youpi: the debian specific patches can be found at my git
+ repository (http://git.sceen.net/rbraun/debian_hurd.git/ and
+ http://git.sceen.net/rbraun/debian_netdde.git/)
+ <braunr> except a freeze during boot (between exec and init) which happens
+ rarely, and the starvation which still exists to some extent (fakeroot
+ can cause many threads to be created in pfinet and pflocal), the
+ glibc/hurd packages have been working fine for a few days now
+ <braunr> the threading issue in pfinet/pflocal is directly related to
+ select, which the io_select_timeout patches should fix once merged
+ <braunr> well, considerably reduce at least
+ <braunr> and maybe fix completely, i'm not sure
+
+
+## IRC, freenode, #hurd, 2012-08-27
+
+ <pinotree> braunr: wrt a78a95d in your pthread branch of hurd.git,
+ shouldn't that job theoretically be done using pthread api (of course
+ after implementing it)?
+ <braunr> pinotree: sure, it could be done through pthreads
+ <braunr> pinotree: i simply restricted myself to moving the hurd to
+ pthreads, not augment libpthread
+ <braunr> (you need to remember that i work on hurd with pthreads because it
+ became a dependency of my work on fixing select :p)
+ <braunr> and even if it wasn't the reason, it is best to do these tasks
+ (replace cthreads and implement pthread scheduling api) separately
+ <pinotree> braunr: hm ok
+ <pinotree> implementing the pthread priority bits could be done
+ independently though
+
+ <braunr> youpi: there are more than 9000 threads for /hurd/streamio kmsg on
+ ironforge oO
+ <youpi> kmsg ?!
+ <youpi> it's only /dev/klog right?
+ <braunr> not sure but it seems so
+ <pinotree> which syslog daemon is running?
+ <youpi> inetutils
+ <youpi> I've restarted the klog translator, to see whether when it grows
+ again
+
+ <braunr> 6 hours and 21 minutes to build glibc on darnassus
+ <braunr> pfinet still runs only 24 threads
+ <braunr> the ext2 instance used for the build runs 2k threads, but that's
+ because of the pageouts
+ <braunr> so indeed, the priority patch helps a lot
+ <braunr> (pfinet used to have several hundreds, sometimes more than a
+ thousand threads after a glibc build, and potentially increasing with
+ each use of fakeroot)
+ <braunr> exec weighs 164M eww, we definitely have to fix that leak
+ <braunr> the leaks are probably due to wrong mmap/munmap usage
+
+[[exec_leak]].
+
+
+### IRC, freenode, #hurd, 2012-08-29
+
+ <braunr> youpi: btw, after my glibc build, there were as little as between
+ 20 and 30 threads for pflocal and pfinet
+ <braunr> with the priority patch
+ <braunr> ext2fs still had around 2k because of pageouts, but that's
+ expected
+ <youpi> ok
+ <braunr> overall the results seem very good and allow the switch to
+ pthreads
+ <youpi> yep, so it seems
+ <braunr> youpi: i think my first integration branch will include only a few
+ changes, such as this priority tuning, and the replacement of
+ condition_implies
+ <youpi> sure
+ <braunr> so we can push the move to pthreads after all its small
+ dependencies
+ <youpi> yep, that's the most readable way
+
+
+## IRC, freenode, #hurd, 2012-09-03
+
+ <gnu_srs> braunr: Compiling yodl-3.00.0-7:
+ <gnu_srs> pthreads: real 13m42.460s, user 0m0.000s, sys 0m0.030s
+ <gnu_srs> cthreads: real 9m 6.950s, user 0m0.000s, sys 0m0.020s
+ <braunr> thanks
+ <braunr> i'm not exactly certain about what causes the problem though
+ <braunr> it could be due to libpthread using doubly-linked lists, but i
+ don't think the overhead would be so heavier because of that alone
+ <braunr> there is so much contention sometimes that it could
+ <braunr> the hurd would have been better off with single threaded servers
+ :/
+ <braunr> we should probably replace spin locks with mutexes everywhere
+ <braunr> on the other hand, i don't have any more starvation problem with
+ the current code
+
+
+### IRC, freenode, #hurd, 2012-09-06
+
+ <gnu_srs> braunr: Yes you are right, the new pthread-based Hurd is _much_
+ slower.
+ <gnu_srs> One annoying example is when compiling, the standard output is
+ written in bursts with _long_ periods of no output in between:-(
+ <braunr> that's more probably because of the priority boost, not the
+ overhead
+ <braunr> that's one of the big issues with our mach-based model
+ <braunr> we either give high priorities to our servers, or we can suffer
+ from message floods
+ <braunr> that's in fact more a hurd problem than a mach one
+ <gnu_srs> braunr: any immediate ideas how to speed up responsiveness the
+ pthread-hurd. It is annoyingly slow (slow-witted)
+ <braunr> gnu_srs: i already answered that
+ <braunr> it doesn't look that slower on my machines though
+ <gnu_srs> you said you had some ideas, not which. except for mcsims work.
+ <braunr> i have ideas about what makes it slower
+ <braunr> it doesn't mean i have solutions for that
+ <braunr> if i had, don't you think i'd have applied them ? :)
+ <gnu_srs> ok, how to make it more responsive on the console? and printing
+ stdout more regularly, now several pages are stored and then flushed.
+ <braunr> give more details please
+ <gnu_srs> it behaves like a loaded linux desktop, with little memory
+ left...
+ <braunr> details about what you're doing
+ <gnu_srs> apt-get source any big package and: fakeroot debian/rules binary
+ 2>&1 | tee ../binary.logg
+ <braunr> i see
+ <braunr> well no, we can't improve responsiveness
+ <braunr> without reintroducing the starvation problem
+ <braunr> they are linked
+ <braunr> and what you're doing involves a few buffers, so the laggy feel is
+ expected
+ <braunr> if we can fix that simply, we'll do so after it is merged upstream
+
+
+### IRC, freenode, #hurd, 2012-09-07
+
+ <braunr> gnu_srs: i really don't feel the sluggishness you described with
+ hurd+pthreads on my machines
+ <braunr> gnu_srs: what's your hardware ?
+ <braunr> and your VM configuration ?
+ <gnu_srs> Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
+ <gnu_srs> kvm -m 1024 -net nic,model=rtl8139 -net
+ user,hostfwd=tcp::5562-:22 -drive
+ cache=writeback,index=0,media=disk,file=hurd-experimental.img -vnc :6
+ -cdrom isos/netinst_2012-07-15.iso -no-kvm-irqchip
+ <braunr> what is the file system type where your disk image is stored ?
+ <gnu_srs> ext3
+ <braunr> and how much physical memory on the host ?
+ <braunr> (paste meminfo somewhere please)
+ <gnu_srs> 4G, and it's on the limit, 2 kvm instances+gnome,etc
+ <gnu_srs> 80% in use by programs, 14% in cache.
+ <braunr> ok, that's probably the reason then
+ <braunr> the writeback option doesn't help a lot if you don't have much
+ cache
+ <gnu_srs> well the other instance is cthreads based, and not so sluggish.
+ <braunr> we know hurd+pthreads is slower
+ <braunr> i just wondered why i didn't feel it that much
+ <gnu_srs> try to fire up more kvm instances, and do a heavy compile...
+ <braunr> i don't do that :)
+ <braunr> that's why i never had the problem
+ <braunr> most of the time i have like 2-3 GiB of cache
+ <braunr> and of course more on shattrath
+ <braunr> (the host of the sceen.net hurdboxes, which has 16 GiB of ram)
+
+
+### IRC, freenode, #hurd, 2012-09-11
+
+ <gnu_srs> Monitoring the cthreads and the pthreads load under Linux shows:
+ <gnu_srs> cthread version: load can jump very high, less cpu usage than
+ pthread version
+ <gnu_srs> pthread version: less memory usage, background cpu usage higher
+ than for cthread version
+ <braunr> that's the expected behaviour
+ <braunr> gnu_srs: are you using the lifothreads gnumach kernel ?
+ <gnu_srs> for experimental, yes.
+ <gnu_srs> i.e. pthreads
+ <braunr> i mean, you're measuring on it right now, right ?
+ <gnu_srs> yes, one instance running cthreads, and one pthreads (with lifo
+ gnumach)
+ <braunr> ok
+ <gnu_srs> no swap used in either instance, will try a heavy compile later
+ on.
+ <braunr> what for ?
+ <gnu_srs> E.g. for memory when linking. I have swap available, but no swap
+ is used currently.
+ <braunr> yes but, what do you intend to measure ?
+ <gnu_srs> don't know, just to see if swap is used at all. it seems to be
+ used not very much.
+ <braunr> depends
+ <braunr> be warned that using the swap means there is pageout, which is one
+ of the triggers for global system freeze :p
+ <braunr> anonymous memory pageout
+ <gnu_srs> for linux swap is used constructively, why not on hurd?
+ <braunr> because of hard to squash bugs
+ <gnu_srs> aha, so it is bugs hindering swap usage:-/
+ <braunr> yup :/
+ <gnu_srs> Let's find them thenO:-), piece of cake
+ <braunr> remember my page cache branch in gnumach ? :)
+
+[[gnumach_page_cache_policy]].
+
+ <gnu_srs> not much
+ <braunr> i started it before fixing non blocking select
+ <braunr> anyway, as a side effect, it should solve this stability issue
+ too, but it'll probably take time
+ <gnu_srs> is that branch integrated? I only remember slab and the lifo
+ stuff.
+ <gnu_srs> and mcsims work
+ <braunr> no it's not
+ <braunr> it's unfinished
+ <gnu_srs> k!
+ <braunr> it correctly extends the page cache to all available physical
+ memory, but since the hurd doesn't scale well, it slows the system down
+
+
+## IRC, freenode, #hurd, 2012-09-14
+
+ <braunr> arg
+ <braunr> darnassus seems to eat 100% cpu and make top freeze after some
+ time
+ <braunr> seems like there is an important leak in the pthreads version
+ <braunr> could be the lifothreads patch :/
+ <cjbirk> there's a memory leak?
+ <cjbirk> in pthreads?
+ <braunr> i don't think so, and it's not a memory leak
+ <braunr> it's a port leak
+ <braunr> probably in the kernel
+
+
+### IRC, freenode, #hurd, 2012-09-17
+
+ <braunr> nice, the port leak is actually caused by the exim4 loop bug
+
+
+### IRC, freenode, #hurd, 2012-09-23
+
+ <braunr> the port leak i observed a few days ago is because of exim4 (the
+ infamous loop eating the cpu we've been seeing regularly)
+
+[[fork_deadlock]]?
+
+ <youpi> oh
+ <braunr> next time it happens, and if i have the occasion, i'll examine the
+ problem
+ <braunr> tip: when you can't use top or ps -e, you can use ps -e -o
+ pid=,args=
+ <youpi> or -M ?
+ <braunr> haven't tested
+
+
+## IRC, freenode, #hurd, 2012-09-23
+
+ <braunr> tschwinge: i committed the last hurd pthread change,
+ http://git.savannah.gnu.org/cgit/hurd/hurd.git/log/?h=master-pthreads
+ <braunr> tschwinge: please tell me if you consider it ok for merging
+
+
+### IRC, freenode, #hurd, 2012-11-27
+
+ <youpi> braunr: btw, I forgot to forward here, with the glibc patch it does
+ boot fine, I'll push all that and build some almost-official packages for
+ people to try out what will come when eglibc gets the change in unstable
+ <braunr> youpi: great :)
+ <youpi> thanks for managing the final bits of this
+ <youpi> (and thanks for everybody involved)
+ <braunr> sorry again for the non obvious parts
+ <braunr> if you need the debian specific parts refined (e.g. nice commits
+ for procfs & others), i can do that
+ <youpi> I'll do that, no pb
+ <braunr> ok
+ <braunr> after that (well, during also), we should focus more on bug
+ hunting
+
+
+## IRC, freenode, #hurd, 2012-10-26
+
+ <mcsim1> hello. What does following error message means? "unable to adjust
+ libports thread priority: Operation not permitted" It appears when I set
+ translators.
+ <mcsim1> Seems has some attitude to libpthread. Also following appeared
+ when I tried to remove translator: "pthread_create: Resource temporarily
+ unavailable"
+ <mcsim1> Oh, first message appears very often, when I use translator I set.
+ <braunr> mcsim1: it's related to a recent patch i sent
+ <braunr> mcsim1: hurd servers attempt to increase their priority on startup
+ (when a thread is created actually)
+ <braunr> to reduce message floods and thread storms (such sweet names :))
+ <braunr> but if you start them as an unprivileged user, it fails, which is
+ ok, it's just a warning
+ <braunr> the second way is weird
+ <braunr> it normally happens when you're out of available virtual space,
+ not when shutting a translator down
+ <mcsim1> braunr: you mean this patch: libports: reduce thread starvation on
+ message floods?
+ <braunr> yes
+ <braunr> remember you're running on darnassus
+ <braunr> with a heavily modified hurd/glibc
+ <braunr> you can go back to the cthreads version if you wish
+ <mcsim1> it's better to check translators privileges, before attempting to
+ increase their priority, I think.
+ <braunr> no
+ <mcsim1> it's just a bit annoying
+ <braunr> privileges can be changed during execution
+ <braunr> well remove it
+ <mcsim1> But warning should not appear.
+ <braunr> what could be done is to limit the warning to one occurrence
+ <braunr> mcsim1: i prefer that it appears
+ <mcsim1> ok
+ <braunr> it's always better to be explicit and verbose
+ <braunr> well not always, but very often
+ <braunr> one of the reasons the hurd is so difficult to debug is the lack
+ of a "message server" à la dmesg
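Limiting the warning to one occurrence, as suggested above, could be done with a single atomic flag. A minimal sketch, assuming C11 atomics; `warn_once` is a hypothetical helper, not something present in libports.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Set once the warning has been printed.  */
static atomic_flag warned = ATOMIC_FLAG_INIT;

/* Print MSG on stderr the first time this is called, from whichever
   thread gets there first.  Returns 1 if the message was printed and 0
   if it was suppressed.  */
int
warn_once (const char *msg)
{
  assert (msg != NULL);
  if (atomic_flag_test_and_set (&warned))
    return 0;
  fprintf (stderr, "%s\n", msg);
  return 1;
}
```

The message stays visible, which keeps the explicit behaviour braunr prefers, while repeated thread creations no longer flood the console.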
+
+[[translator_stdout_stderr]].
+
+
+### IRC, freenode, #hurd, 2012-12-10
+
+ <youpi> braunr: unable to adjust libports thread priority: (ipc/send)
+ invalid destination port
+ <youpi> I'll see what package brought that
+ <youpi> (that was on a buildd)
+ <braunr> wow
+ <youpi> mkvtoolnix_5.9.0-1:
+ <pinotree> shouldn't that code be done in pthreads and then using such
+ pthread api? :p
+ <braunr> pinotree: you've already asked that question :p
+ <pinotree> i know :p
+ <braunr> the semantics of pthreads are larger than what we need, so that
+ will be done "later"
+ <braunr> but this error shouldn't happen
+ <braunr> it looks more like a random mach bug
+ <braunr> youpi: anything else on the console ?
+ <youpi> nope
+ <braunr> i'll add traces to know which step causes the error
+
+
+## IRC, freenode, #hurd, 2012-12-05
+
+ <braunr> tschwinge: i'm currently working on a few easy bugs and i have
+ planned improvements for libpthreads soon
+ <pinotree> wotwot, which ones?
+ <braunr> pinotree: first, fixing pthread_cond_timedwait (and everything
+ timedsomething actually)
+ <braunr> pinotree: then, fixing cancellation
+ <braunr> pinotree: and last but not least, optimizing thread wakeup
+ <braunr> i also want to try replacing spin locks and see if it does what i
+ expect
+ <pinotree> which fixes do you plan applying to cond_timedwait?
+ <braunr> see sysdeps/generic/pt-cond-timedwait.c
+ <braunr> the FIXME comment
+ <pinotree> ah that
+ <braunr> well that's important :)
+ <braunr> did you have something else in mind ?
+ <pinotree> hm, __pthread_timedblock... do you plan fixing directly there? i
+ remember having seen something related to that (but not on conditions),
+ but wasn't able to see further
+ <braunr> it has the same issue
+ <braunr> i don't remember the details, but i wrote a cthreads version that
+ does it right
+ <braunr> in the io_select_timeout branch
+ <braunr> see
+ http://git.savannah.gnu.org/cgit/hurd/hurd.git/tree/libthreads/cancel-cond.c?h=rbraun/select_timeout
+ for example
+ * pinotree looks
+ <braunr> what matters is the msg_delivered member used to synchronize
+ sleeper and waker
+ <braunr> the waker code is in
+ http://git.savannah.gnu.org/cgit/hurd/hurd.git/tree/libthreads/cprocs.c?h=rbraun/select_timeout
+ <pinotree> never seen cthreads' code before :)
+ <braunr> soon you shouldn't have any more reason to :p
+ <pinotree> ah, so basically the cthread version of the pthread cleanup
+ stack + cancelation (ie the cancel hook) broadcasts the condition
+ <braunr> yes
+ <pinotree> so a similar fix would be needed in all the places using
+ __pthread_timedblock, that is conditions and mutexes
+ <braunr> and that's what's missing in glibc that prevents deploying a
+ pthreads based hurd currently
+ <braunr> no that's unrelated
+ <pinotree> ok
+ <braunr> the problem is how __pthread_block/__pthread_timedblock is
+ synchronized with __pthread_wakeup
+ <braunr> libpthreads does exactly the same thing as cthreads for that,
+ i.e. use messages
+ <braunr> but the message alone isn't enough, since, as explained in the
+ FIXME comment, it can arrive too late
+ <braunr> it's not a problem for __pthread_block because this function can
+ only resume after receiving a message
+ <braunr> but it's a problem for __pthread_timedblock which can resume
+ because of a timeout
+ <braunr> my solution is to add a flag that says whether a message was
+ actually sent, and lock around sending the message, so that the thread
+ resume can accurately tell in which state it is
+ <braunr> and drain the message queue if needed
+ <pinotree> i see, race between the "i stop blocking because of timeout" and
+ "i stop because i got a message" with the actual check for the real cause
+ <braunr> locking around mach_msg may seem overkill but it's not in
+ practice, since there can only be one message at most in the message
+ queue
+ <braunr> and i checked that in practice by limiting the message queue size
+ and check for such errors
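The idea described above, a flag written under a lock so that the resuming thread can tell a timeout from a real wakeup, can be illustrated with a portable analogy. This is only a sketch under assumptions: libpthread blocks and wakes threads with Mach messages, whereas the code below uses a condition variable, and `struct sleeper`, `sleeper_wake` and `sleeper_timedblock` are hypothetical names. The `delivered` flag plays the role of `msg_delivered`, and clearing it stands in for draining the message queue.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

struct sleeper
{
  pthread_mutex_t lock;
  pthread_cond_t cond;
  bool delivered;               /* set by the waker while holding LOCK */
};

/* Waker: record the wakeup and send it within one critical section.  */
void
sleeper_wake (struct sleeper *s)
{
  pthread_mutex_lock (&s->lock);
  s->delivered = true;
  pthread_cond_signal (&s->cond);
  pthread_mutex_unlock (&s->lock);
}

/* Block for at most TIMEOUT_MS milliseconds.  Returns true if a wakeup
   was delivered, false on timeout.  The flag, not the return code of
   the wait, is authoritative: a wakeup racing with the timeout is
   still observed and consumed.  */
bool
sleeper_timedblock (struct sleeper *s, long timeout_ms)
{
  struct timespec deadline;
  bool woken;

  assert (timeout_ms >= 0);
  clock_gettime (CLOCK_REALTIME, &deadline);
  deadline.tv_sec += timeout_ms / 1000;
  deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
  if (deadline.tv_nsec >= 1000000000L)
    {
      deadline.tv_sec++;
      deadline.tv_nsec -= 1000000000L;
    }

  pthread_mutex_lock (&s->lock);
  while (!s->delivered)
    if (pthread_cond_timedwait (&s->cond, &s->lock, &deadline) != 0)
      break;                    /* timeout or error: stop waiting */
  woken = s->delivered;
  s->delivered = false;         /* consume the wakeup */
  pthread_mutex_unlock (&s->lock);
  return woken;
}
```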
+ <braunr> but again, it would be far better with mutexes only, and no spin
+ locks
+ <braunr> i wondered for a long time why the load average was so high on the
+ hurd under even "light" loads
+ <braunr> now i know :)
diff --git a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn
new file mode 100644
index 00000000..37231c66
--- /dev/null
+++ b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn
@@ -0,0 +1,21 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_libpthread]]
+
+`t/have_kernel_resources`
+
+
+# IRC, freenode, #hurd, 2012-08-30
+
+ <braunr> tschwinge: this issue needs more cooperation with the kernel
+ <braunr> tschwinge: i.e. the ability to tell the kernel where the stack is,
+ so it's unmapped when the thread dies
+ <braunr> without requiring another thread to perform this deallocation
diff --git a/open_issues/libpthread_1fcd93fd3c733eb19bcad8d03e65f13ec4b0e998..master-viengoos-on-bare-metal.mdwn b/open_issues/libpthread_1fcd93fd3c733eb19bcad8d03e65f13ec4b0e998..master-viengoos-on-bare-metal.mdwn
new file mode 100644
index 00000000..4396cf59
--- /dev/null
+++ b/open_issues/libpthread_1fcd93fd3c733eb19bcad8d03e65f13ec4b0e998..master-viengoos-on-bare-metal.mdwn
@@ -0,0 +1,849 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_libpthread]]
+
+Things found in a `git diff
+1fcd93fd3c733eb19bcad8d03e65f13ec4b0e998..master-viengoos-on-bare-metal` that
+are not specific to L4 or Viengoos, and may be worth having on master, too.
+
+
+# `__pthread_alloc` init with `malloc` or `calloc`
+
+ diff --git a/pthread/pt-alloc.c b/pthread/pt-alloc.c
+ index 6af2da9..c63801f 100644
+ --- a/pthread/pt-alloc.c
+ +++ b/pthread/pt-alloc.c
+ @@ -123,7 +123,7 @@ __pthread_alloc (struct __pthread **pthread)
+ }
+
+ /* Allocate a new thread structure. */
+ - new = malloc (sizeof (struct __pthread));
+ + new = calloc (sizeof (struct __pthread), 1);
+ if (new == NULL)
+ return ENOMEM;
+
+
+
+# `atomic.h`
+
+Later on master, commit 608a12659f15d57abf42a972c1e56c6a24cfe244: `Rename
+bits/atomic.h to bits/pt-atomic.h`.
+
+ diff --git a/pthread/pt-create.c b/pthread/pt-create.c
+ index 8f62b78..504cacc 100644
+ --- a/pthread/pt-create.c
+ +++ b/pthread/pt-create.c
+ @@ -22,7 +22,7 @@
+ #include <pthread.h>
+ #include <signal.h>
+
+ -#include <bits/atomic.h>
+ +#include <atomic.h>
+
+ #include <pt-internal.h>
+
+ @@ -33,7 +33,7 @@
+ /* The total number of pthreads currently active. This is defined
+ here since it would be really stupid to have a threads-using
+ program that doesn't call `pthread_create'. */
+ -__atomic_t __pthread_total;
+ +atomic_fast32_t __pthread_total;
+
+
+ /* The entry-point for new threads. */
+ @@ -163,7 +163,7 @@ __pthread_create_internal (struct __pthread **thread,
+ the number of threads from within the new thread isn't an option
+ since this thread might return and call `pthread_exit' before the
+ new thread runs. */
+ - __atomic_inc (&__pthread_total);
+ + atomic_increment (&__pthread_total);
+
+ /* Store a pointer to this thread in the thread ID lookup table. We
+ could use __thread_setid, however, we only lock for reading as no
+ @@ -190,7 +190,7 @@ __pthread_create_internal (struct __pthread **thread,
+
+ failed_starting:
+ __pthread_setid (pthread->thread, NULL);
+ - __atomic_dec (&__pthread_total);
+ + atomic_decrement (&__pthread_total);
+ failed_sigstate:
+ __pthread_sigstate_destroy (pthread);
+ failed_setup:
+ diff --git a/pthread/pt-exit.c b/pthread/pt-exit.c
+ index 5fe0ba8..68c56d7 100644
+ --- a/pthread/pt-exit.c
+ +++ b/pthread/pt-exit.c
+ @@ -24,7 +24,7 @@
+
+ #include <pt-internal.h>
+
+ -#include <bits/atomic.h>
+ +#include <atomic.h>
+
+
+ /* Terminate the current thread and make STATUS available to any
+ @@ -57,7 +57,7 @@ pthread_exit (void *status)
+
+ /* Decrease the number of threads. We use an atomic operation to
+ make sure that only the last thread calls `exit'. */
+ - if (__atomic_dec_and_test (&__pthread_total))
+ + if (atomic_decrement_and_test (&__pthread_total))
+ /* We are the last thread. */
+ exit (0);
+
+ diff --git a/pthread/pt-internal.h b/pthread/pt-internal.h
+ index cb441d0..986ec6b 100644
+ --- a/pthread/pt-internal.h
+ +++ b/pthread/pt-internal.h
+ @@ -26,13 +26,15 @@
+ #include <signal.h>
+ #include <assert.h>
+
+ -#include <bits/atomic.h>
+ +#include <atomic.h>
+ [...]
+ @@ -136,7 +144,7 @@ __pthread_dequeue (struct __pthread *thread)
+ )
+
+ /* The total number of threads currently active. */
+ -extern __atomic_t __pthread_total;
+ +extern atomic_fast32_t __pthread_total;
+
+ /* The total number of thread IDs currently in use, or on the list of
+ available thread IDs. */
+ diff --git a/sysdeps/ia32/bits/atomic.h b/sysdeps/ia32/bits/atomic.h
+ deleted file mode 100644
+ index 0dfc1f6..0000000
+ --- a/sysdeps/ia32/bits/atomic.h
+ +++ /dev/null
+ @@ -1,66 +0,0 @@
+ -/* Atomic operations. i386 version.
+ - Copyright (C) 2000 Free Software Foundation, Inc.
+ - This file is part of the GNU C Library.
+ -
+ - The GNU C Library is free software; you can redistribute it and/or
+ - modify it under the terms of the GNU Library General Public License as
+ - published by the Free Software Foundation; either version 2 of the
+ - License, or (at your option) any later version.
+ -
+ - The GNU C Library is distributed in the hope that it will be useful,
+ - but WITHOUT ANY WARRANTY; without even the implied warranty of
+ - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ - Library General Public License for more details.
+ -
+ - You should have received a copy of the GNU Library General Public
+ - License along with the GNU C Library; see the file COPYING.LIB. If not,
+ - write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ - Boston, MA 02111-1307, USA. */
+ -
+ -#ifndef _BITS_ATOMIC_H
+ -#define _BITS_ATOMIC_H 1
+ -
+ -typedef __volatile int __atomic_t;
+ -
+ -static inline void
+ -__atomic_inc (__atomic_t *__var)
+ -{
+ - __asm__ __volatile ("lock; incl %0" : "=m" (*__var) : "m" (*__var));
+ -}
+ -
+ -static inline void
+ -__atomic_dec (__atomic_t *__var)
+ -{
+ - __asm__ __volatile ("lock; decl %0" : "=m" (*__var) : "m" (*__var));
+ -}
+ -
+ -static inline int
+ -__atomic_dec_and_test (__atomic_t *__var)
+ -{
+ - unsigned char __ret;
+ -
+ - __asm__ __volatile ("lock; decl %0; sete %1"
+ - : "=m" (*__var), "=qm" (__ret) : "m" (*__var));
+ - return __ret != 0;
+ -}
+ -
+ -/* We assume that an __atomicptr_t is only used for pointers to
+ - word-aligned objects, and use the lowest bit for a simple lock. */
+ -typedef __volatile int * __atomicptr_t;
+ -
+ -/* Actually we don't implement that yet, and assume that we run on
+ - something that has the i486 instruction set. */
+ -static inline int
+ -__atomicptr_compare_and_swap (__atomicptr_t *__ptr, void *__oldval,
+ - void * __newval)
+ -{
+ - char __ret;
+ - int __dummy;
+ -
+ - __asm__ __volatile ("lock; cmpxchgl %3, %1; sete %0"
+ - : "=q" (__ret), "=m" (*__ptr), "=a" (__dummy)
+ - : "r" (__newval), "m" (*__ptr), "a" (__oldval));
+ - return __ret;
+ -}
+ -
+ -#endif
+
+
+# Memory Barriers
+
+ diff --git a/sysdeps/generic/bits/memory.h b/sysdeps/generic/bits/memory.h
+ new file mode 100644
+ index 0000000..7b88a7e
+ --- /dev/null
+ +++ b/sysdeps/generic/bits/memory.h
+ @@ -0,0 +1,36 @@
+ +/* Memory barrier operations. Generic version.
+ + Copyright (C) 2008 Free Software Foundation, Inc.
+ + This file is part of the GNU Hurd.
+ +
+ + The GNU Hurd is free software; you can redistribute it and/or
+ + modify it under the terms of the GNU General Public License as
+ + published by the Free Software Foundation; either version 3 of the
+ + License, or (at your option) any later version.
+ +
+ + The GNU Hurd is distributed in the hope that it will be useful, but
+ + WITHOUT ANY WARRANTY; without even the implied warranty of
+ + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ + General Public License for more details.
+ +
+ + You should have received a copy of the GNU General Public License
+ + along with this program. If not, see
+ + <http://www.gnu.org/licenses/>. */
+ +
+ +#ifndef _BITS_MEMORY_H
+ +#define _BITS_MEMORY_H 1
+ +
+ +/* Prevent read and write reordering across this function. */
+ +static inline void
+ +__memory_barrier (void)
+ +{
+ + /* Any lock'ed instruction will do. */
+ + __sync_synchronize ();
+ +}
+ +
+ +/* Prevent read reordering across this function. */
+ +#define __memory_read_barrier __memory_barrier
+ +
+ +/* Prevent write reordering across this function. */
+ +#define __memory_write_barrier __memory_barrier
+ +
+ +#endif
+
+
+# Spin Locks
+
+ diff --git a/sysdeps/generic/bits/spin-lock-inline.h b/sysdeps/generic/bits/spin-lock-inline.h
+ new file mode 100644
+ index 0000000..6c3e06e
+ --- /dev/null
+ +++ b/sysdeps/generic/bits/spin-lock-inline.h
+ @@ -0,0 +1,99 @@
+ +/* Machine-specific definitions for spin locks. Generic version.
+ + Copyright (C) 2000, 2005, 2008 Free Software Foundation, Inc.
+ + This file is part of the GNU C Library.
+ +
+ + The GNU C Library is free software; you can redistribute it and/or
+ + modify it under the terms of the GNU Library General Public License as
+ + published by the Free Software Foundation; either version 2 of the
+ + License, or (at your option) any later version.
+ +
+ + The GNU C Library is distributed in the hope that it will be useful,
+ + but WITHOUT ANY WARRANTY; without even the implied warranty of
+ + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ + Library General Public License for more details.
+ +
+ + You should have received a copy of the GNU Library General Public
+ + License along with the GNU C Library; see the file COPYING.LIB. If not,
+ + write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ + Boston, MA 02111-1307, USA. */
+ +
+ +/*
+ + * Never include this file directly; use <pthread.h> or <cthreads.h> instead.
+ + */
+ +
+ +#ifndef _BITS_SPIN_LOCK_INLINE_H
+ +#define _BITS_SPIN_LOCK_INLINE_H 1
+ +
+ +#include <features.h>
+ +#include <bits/spin-lock.h>
+ +
+ +__BEGIN_DECLS
+ +
+ +#if defined __USE_EXTERN_INLINES || defined _FORCE_INLINES
+ +
+ +# if !defined (__EBUSY) || !defined (__EINVAL)
+ +# include <errno.h>
+ +# ifndef __EBUSY
+ +# define __EBUSY EBUSY
+ +# endif
+ +# ifndef __EINVAL
+ +# define __EINVAL EINVAL
+ +# endif
+ +# endif
+ +
+ +# ifndef __PT_SPIN_INLINE
+ +# define __PT_SPIN_INLINE __extern_inline
+ +# endif
+ +
+ +__PT_SPIN_INLINE int __pthread_spin_destroy (__pthread_spinlock_t *__lock);
+ +
+ +__PT_SPIN_INLINE int
+ +__pthread_spin_destroy (__pthread_spinlock_t *__lock)
+ +{
+ + return 0;
+ +}
+ +
+ +__PT_SPIN_INLINE int __pthread_spin_init (__pthread_spinlock_t *__lock,
+ + int __pshared);
+ +
+ +__PT_SPIN_INLINE int
+ +__pthread_spin_init (__pthread_spinlock_t *__lock, int __pshared)
+ +{
+ + *__lock = __SPIN_LOCK_INITIALIZER;
+ + return 0;
+ +}
+ +
+ +__PT_SPIN_INLINE int __pthread_spin_trylock (__pthread_spinlock_t *__lock);
+ +
+ +__PT_SPIN_INLINE int
+ +__pthread_spin_trylock (__pthread_spinlock_t *__lock)
+ +{
+ + int __locked = __sync_val_compare_and_swap (__lock, 0, 1);
+ + return __locked ? __EBUSY : 0;
+ +}
+ +
+ +__extern_inline int __pthread_spin_lock (__pthread_spinlock_t *__lock);
+ +extern int _pthread_spin_lock (__pthread_spinlock_t *__lock);
+ +
+ +__extern_inline int
+ +__pthread_spin_lock (__pthread_spinlock_t *__lock)
+ +{
+ + if (__pthread_spin_trylock (__lock))
+ + return _pthread_spin_lock (__lock);
+ + return 0;
+ +}
+ +
+ +__PT_SPIN_INLINE int __pthread_spin_unlock (__pthread_spinlock_t *__lock);
+ +
+ +__PT_SPIN_INLINE int
+ +__pthread_spin_unlock (__pthread_spinlock_t *__lock)
+ +{
+ + int __locked = __sync_val_compare_and_swap (__lock, 1, 0);
+ + return __locked ? 0 : __EINVAL;
+ +}
+ +
+ +#endif /* Use extern inlines or force inlines. */
+ +
+ +__END_DECLS
+ +
+ +#endif /* bits/spin-lock.h */
+ diff --git a/sysdeps/l4/bits/pthread-np.h b/sysdeps/generic/bits/spin-lock.h
+ similarity index 67%
+ rename from sysdeps/l4/bits/pthread-np.h
+ rename to sysdeps/generic/bits/spin-lock.h
+ index 6a02bdc..c2ba332 100644
+ --- a/sysdeps/l4/bits/pthread-np.h
+ +++ b/sysdeps/generic/bits/spin-lock.h
+ @@ -1,5 +1,5 @@
+ -/* Non-portable functions. L4 version.
+ - Copyright (C) 2003, 2007 Free Software Foundation, Inc.
+ +/* Machine-specific definitions for spin locks. Generic version.
+ + Copyright (C) 2000, 2005, 2008 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ @@ -21,15 +21,19 @@
+ * Never include this file directly; use <pthread.h> or <cthreads.h> instead.
+ */
+
+ -#ifndef _BITS_PTHREAD_NP_H
+ -#define _BITS_PTHREAD_NP_H 1
+ +#ifndef _BITS_SPIN_LOCK_H
+ +#define _BITS_SPIN_LOCK_H 1
+
+ -#include <l4.h>
+ +#include <features.h>
+
+ -/* Add the thread TID to the internal kernel thread pool. */
+ -extern int pthread_pool_add_np (l4_thread_id_t tid);
+ +__BEGIN_DECLS
+
+ -/* Get the first thread from the pool. */
+ -extern l4_thread_id_t pthread_pool_get_np (void);
+ +/* The type of a spin lock object. */
+ +typedef __volatile int __pthread_spinlock_t;
+
+ -#endif /* bits/pthread-np.h */
+ +/* Initializer for a spin lock object. */
+ +# define __SPIN_LOCK_INITIALIZER ((__pthread_spinlock_t) 0)
+ +
+ +__END_DECLS
+ +
+ +#endif /* bits/spin-lock.h */
+
+
+# Signal Stuff
+
+ diff --git a/pthread/pt-internal.h b/pthread/pt-internal.h
+ index cb441d0..986ec6b 100644
+ --- a/pthread/pt-internal.h
+ +++ b/pthread/pt-internal.h
+ @@ -26,13 +26,15 @@
+ [...]
+ #include <pt-sysdep.h>
+ #include <pt-machdep.h>
+
+ +#include <sig-internal.h>
+ +
+ /* Thread state. */
+ enum pthread_state
+ {
+ @@ -54,6 +56,10 @@ enum pthread_state
+ # define PTHREAD_SYSDEP_MEMBERS
+ #endif
+
+ +#ifndef PTHREAD_SIGNAL_MEMBERS
+ +# define PTHREAD_SIGNAL_MEMBERS
+ +#endif
+ +
+ /* This structure describes a POSIX thread. */
+ struct __pthread
+ {
+ @@ -89,6 +95,8 @@ struct __pthread
+
+ PTHREAD_SYSDEP_MEMBERS
+
+ + PTHREAD_SIGNAL_MEMBERS
+ +
+ struct __pthread *next, **prevp;
+ };
+
+ diff --git a/signal/kill.c b/signal/kill.c
+ index 27c9c32..c281640 100644
+ --- a/signal/kill.c
+ +++ b/signal/kill.c
+ @@ -20,6 +20,8 @@
+
+ #include "sig-internal.h"
+
+ +#include <string.h>
+ +
+ int
+ kill (pid_t pid, int signo)
+ {
+ @@ -65,6 +67,12 @@ kill (pid_t pid, int signo)
+ current thread has blocked the signal, the correct thing to do is
+ to iterate over all the other threads and find on that hasn't
+ blocked it. */
+ +
+ + extern int __pthread_num_threads;
+ + if (__pthread_num_threads == 0)
+ + panic ("signal %d received before pthread library is able to handle it",
+ + signo);
+ +
+ return pthread_kill (pthread_self (), signo);
+ }
+
+ diff --git a/signal/pt-kill-siginfo-np.c b/signal/pt-kill-siginfo-np.c
+ index 9bdf6cc..35642c3 100644
+ --- a/signal/pt-kill-siginfo-np.c
+ +++ b/signal/pt-kill-siginfo-np.c
+ @@ -75,7 +75,8 @@ pthread_kill_siginfo_np (pthread_t tid, siginfo_t si)
+ || (ss->stack.ss_flags & SS_DISABLE)
+ || (ss->stack.ss_flags & SS_ONSTACK)))
+ /* We are sending a signal to ourself and we don't use an
+ - alternate stack. */
+ + alternate stack. (Recall: SA_ONSTACK means use the alt
+ + stack.) */
+ signal_dispatch (ss, &si);
+ else
+ signal_dispatch_lowlevel (ss, tid, si);
+ diff --git a/signal/signal-dispatch.c b/signal/signal-dispatch.c
+ index 40440b7..6fafcc1 100644
+ --- a/signal/signal-dispatch.c
+ +++ b/signal/signal-dispatch.c
+ @@ -20,6 +20,8 @@
+
+ #include "sig-internal.h"
+
+ +#include <viengoos/math.h>
+ +
+ /* This is the signal handler entry point. A thread is forced into
+ this state when it receives a signal. We need to save the thread's
+ state and then invoke the high-level signal dispatcher. SS->LOCK
+ @@ -107,7 +109,7 @@ signal_dispatch (struct signal_state *ss, siginfo_t *si)
+ sigset_t pending = ~ss->blocked & ss->pending;
+ if (! pending)
+ pending = ~ss->blocked & process_pending;
+ - signo = l4_lsb64 (pending);
+ + signo = vg_lsb64 (pending);
+ }
+ while (signo);
+
+ diff --git a/signal/sigwaiter.c b/signal/sigwaiter.c
+ index 8d041ac..adc05ca 100644
+ --- a/signal/sigwaiter.c
+ +++ b/signal/sigwaiter.c
+ @@ -20,7 +20,7 @@
+
+ #include "sig-internal.h"
+
+ -#include <hurd/futex.h>
+ +#include <viengoos/futex.h>
+
+ struct sigwaiter *sigwaiters;
+
+ diff --git a/signal/sigwaitinfo.c b/signal/sigwaitinfo.c
+ index 1b47079..dea3ef4 100644
+ --- a/signal/sigwaitinfo.c
+ +++ b/signal/sigwaitinfo.c
+ @@ -43,7 +43,7 @@ sigwaitinfo (const sigset_t *restrict set, siginfo_t *restrict info)
+
+ assert (extant);
+
+ - int signo = l4_msb64 (extant);
+ + int signo = vg_msb64 (extant);
+
+ if (info)
+ {
+
+
+# `ALWAYS_TRACK_MUTEX_OWNER`
+
+ diff --git a/sysdeps/generic/pt-mutex-timedlock.c b/sysdeps/generic/pt-mutex-timedlock.c
+ index ee43219..265a453 100644
+ --- a/sysdeps/generic/pt-mutex-timedlock.c
+ +++ b/sysdeps/generic/pt-mutex-timedlock.c
+ @@ -36,7 +36,6 @@ __pthread_mutex_timedlock_internal (struct __pthread_mutex *mutex,
+ if (__pthread_spin_trylock (&mutex->__held) == 0)
+ /* Successfully acquired the lock. */
+ {
+ -#ifdef ALWAYS_TRACK_MUTEX_OWNER
+ #ifndef NDEBUG
+ self = _pthread_self ();
+ if (self)
+ @@ -48,7 +47,6 @@ __pthread_mutex_timedlock_internal (struct __pthread_mutex *mutex,
+ mutex->owner = _pthread_self ();
+ }
+ #endif
+ -#endif
+
+ if (mutex->attr)
+ switch (mutex->attr->mutex_type)
+ @@ -75,16 +73,14 @@ __pthread_mutex_timedlock_internal (struct __pthread_mutex *mutex,
+ self = _pthread_self ();
+ assert (self);
+
+ - if (! mutex->attr || mutex->attr->mutex_type == PTHREAD_MUTEX_NORMAL)
+ - {
+ -#if defined(ALWAYS_TRACK_MUTEX_OWNER)
+ - assert (mutex->owner != self);
+ -#endif
+ - }
+ - else
+ + if (mutex->attr)
+ {
+ switch (mutex->attr->mutex_type)
+ {
+ + case PTHREAD_MUTEX_NORMAL:
+ + assert (mutex->owner != self);
+ + break;
+ +
+ case PTHREAD_MUTEX_ERRORCHECK:
+ if (mutex->owner == self)
+ {
+ @@ -106,10 +102,9 @@ __pthread_mutex_timedlock_internal (struct __pthread_mutex *mutex,
+ LOSE;
+ }
+ }
+ + else
+ + assert (mutex->owner != self);
+
+ -#if !defined(ALWAYS_TRACK_MUTEX_OWNER)
+ - if (mutex->attr && mutex->attr->mutex_type != PTHREAD_MUTEX_NORMAL)
+ -#endif
+ assert (mutex->owner);
+
+ if (abstime && (abstime->tv_nsec < 0 || abstime->tv_nsec >= 1000000000))
+ @@ -146,12 +141,9 @@ __pthread_mutex_timedlock_internal (struct __pthread_mutex *mutex,
+ else
+ __pthread_block (self);
+
+ -#if !defined(ALWAYS_TRACK_MUTEX_OWNER)
+ - if (mutex->attr && mutex->attr->mutex_type != PTHREAD_MUTEX_NORMAL)
+ -#endif
+ - {
+ +#ifndef NDEBUG
+ assert (mutex->owner == self);
+ - }
+ +#endif
+
+ if (mutex->attr)
+ switch (mutex->attr->mutex_type)
+ diff --git a/sysdeps/generic/pt-mutex-transfer-np.c b/sysdeps/generic/pt-mutex-transfer-np.c
+ index 7796ac4..bcb809d 100644
+ --- a/sysdeps/generic/pt-mutex-transfer-np.c
+ +++ b/sysdeps/generic/pt-mutex-transfer-np.c
+ @@ -45,12 +45,7 @@ __pthread_mutex_transfer_np (struct __pthread_mutex *mutex, pthread_t tid)
+ }
+
+ #ifndef NDEBUG
+ -# if !defined(ALWAYS_TRACK_MUTEX_OWNER)
+ - if (mutex->attr && mutex->attr->mutex_type != PTHREAD_MUTEX_NORMAL)
+ -# endif
+ - {
+ mutex->owner = thread;
+ - }
+ #endif
+
+ return 0;
+ diff --git a/sysdeps/generic/pt-mutex-unlock.c b/sysdeps/generic/pt-mutex-unlock.c
+ index 7645fd4..f299750 100644
+ --- a/sysdeps/generic/pt-mutex-unlock.c
+ +++ b/sysdeps/generic/pt-mutex-unlock.c
+ @@ -33,16 +33,19 @@ __pthread_mutex_unlock (pthread_mutex_t *mutex)
+
+ if (! mutex->attr || mutex->attr->mutex_type == PTHREAD_MUTEX_NORMAL)
+ {
+ -#if defined(ALWAYS_TRACK_MUTEX_OWNER)
+ # ifndef NDEBUG
+ if (_pthread_self ())
+ {
+ assert (mutex->owner);
+ - assert (mutex->owner == _pthread_self ());
+ + assertx (mutex->owner == _pthread_self (),
+ + "%p("VG_THREAD_ID_FMT") != %p("VG_THREAD_ID_FMT")",
+ + mutex->owner,
+ + ((struct __pthread *) mutex->owner)->threadid,
+ + _pthread_self (),
+ + _pthread_self ()->threadid);
+ mutex->owner = NULL;
+ }
+ # endif
+ -#endif
+ }
+ else
+ switch (mutex->attr->mutex_type)
+ @@ -81,12 +84,7 @@ __pthread_mutex_unlock (pthread_mutex_t *mutex)
+ __pthread_dequeue (wakeup);
+
+ #ifndef NDEBUG
+ -# if !defined (ALWAYS_TRACK_MUTEX_OWNER)
+ - if (mutex->attr && mutex->attr->mutex_type != PTHREAD_MUTEX_NORMAL)
+ -# endif
+ - {
+ mutex->owner = wakeup;
+ - }
+ #endif
+
+ /* We do not unlock MUTEX->held: we are transferring the ownership
+
+
+# `t/fix_have_kernel_resources`
+
+See topic branch of that name.
+
+ diff --git a/sysdeps/mach/hurd/pt-sysdep.h b/sysdeps/mach/hurd/pt-sysdep.h
+ index f14a136..83bad96 100644
+ --- a/sysdeps/mach/hurd/pt-sysdep.h
+ +++ b/sysdeps/mach/hurd/pt-sysdep.h
+ @@ -1,5 +1,5 @@
+ /* Internal defenitions for pthreads library.
+ - Copyright (C) 2000, 2002, 2008 Free Software Foundation, Inc.
+ + Copyright (C) 2000, 2002 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ @@ -32,8 +32,7 @@
+
+ #define PTHREAD_SYSDEP_MEMBERS \
+ thread_t kernel_thread; \
+ - mach_msg_header_t wakeupmsg; \
+ - int have_kernel_resources;
+ + mach_msg_header_t wakeupmsg;
+
+ #define _HURD_THREADVAR_THREAD _HURD_THREADVAR_DYNAMIC_USER
+
+ diff --git a/sysdeps/mach/pt-thread-alloc.c b/sysdeps/mach/pt-thread-alloc.c
+ index 3d7c046..1acba98 100644
+ --- a/sysdeps/mach/pt-thread-alloc.c
+ +++ b/sysdeps/mach/pt-thread-alloc.c
+ @@ -1,5 +1,5 @@
+ /* Start thread. Mach version.
+ - Copyright (C) 2000, 2002, 2005, 2008 Free Software Foundation, Inc.
+ + Copyright (C) 2000, 2002, 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ @@ -63,9 +63,6 @@ create_wakeupmsg (struct __pthread *thread)
+ int
+ __pthread_thread_alloc (struct __pthread *thread)
+ {
+ - if (thread->have_kernel_resources)
+ - return 0;
+ -
+ error_t err;
+
+ err = create_wakeupmsg (thread);
+ @@ -100,7 +97,5 @@ __pthread_thread_alloc (struct __pthread *thread)
+ return EAGAIN;
+ }
+
+ - thread->have_kernel_resources = 1;
+ -
+ return 0;
+ }
+
+
+# Miscellaneous
+
+ diff --git a/Makefile b/Makefile
+ index 04dfb26..a4c0c52 100644
+ --- a/Makefile
+ +++ b/Makefile
+ @@ -71,7 +71,6 @@ SRCS := pt-attr.c pt-attr-destroy.c pt-attr-getdetachstate.c \
+ pt-mutex-init.c pt-mutex-destroy.c \
+ pt-mutex-lock.c pt-mutex-trylock.c pt-mutex-timedlock.c \
+ pt-mutex-unlock.c \
+ - pt-mutex-transfer-np.c \
+ pt-mutex-getprioceiling.c pt-mutex-setprioceiling.c \
+ \
+ pt-rwlock-attr.c \
+ @@ -100,7 +99,6 @@ SRCS := pt-attr.c pt-attr-destroy.c pt-attr-getdetachstate.c \
+ pt-thread-dealloc.c \
+ pt-thread-start.c \
+ pt-thread-halt.c \
+ - pt-startup.c \
+ \
+ pt-getconcurrency.c pt-setconcurrency.c \
+ \
+ @@ -143,7 +141,6 @@ sysdeps_headers = \
+ semaphore.h \
+ \
+ bits/pthread.h \
+ - bits/pthread-np.h \
+ bits/mutex.h \
+ bits/condition.h \
+ bits/condition-attr.h \
+ diff --git a/Makefile.am b/Makefile.am
+ index e59c946..e73d8d6 100644
+ --- a/Makefile.am
+ +++ b/Makefile.am
+ @@ -20,17 +20,18 @@
+ if ARCH_IA32
+ arch=ia32
+ endif
+ +if ARCH_X86_64
+ + arch=x86_64
+ +endif
+ if ARCH_POWERPC
+ arch=powerpc
+ endif
+
+ # The source files are scattered over several directories. Add
+ # all these directories to the vpath.
+ -SYSDEP_PATH = $(srcdir)/sysdeps/l4/hurd/${arch} \
+ - $(srcdir)/sysdeps/l4/${arch} \
+ +SYSDEP_PATH = $(srcdir)/sysdeps/viengoos/${arch} \
+ $(srcdir)/sysdeps/${arch} \
+ - $(srcdir)/sysdeps/l4/hurd \
+ - $(srcdir)/sysdeps/l4 \
+ + $(srcdir)/sysdeps/viengoos \
+ $(srcdir)/sysdeps/hurd \
+ $(srcdir)/sysdeps/generic \
+ $(srcdir)/sysdeps/posix \
+ @@ -68,7 +69,6 @@ libpthread_a_SOURCES = pt-attr.c pt-attr-destroy.c pt-attr-getdetachstate.c \
+ pt-alloc.c \
+ pt-create.c \
+ pt-getattr.c \
+ - pt-pool-np.c \
+ pt-equal.c \
+ pt-dealloc.c \
+ pt-detach.c \
+ diff --git a/headers.m4 b/headers.m4
+ index 5a58b9b..7c73cf2 100644
+ --- a/headers.m4
+ +++ b/headers.m4
+ @@ -14,10 +14,9 @@ AC_CONFIG_LINKS([
+ sysroot/include/pthread.h:libpthread/include/pthread.h
+ sysroot/include/pthread/pthread.h:libpthread/include/pthread/pthread.h
+ sysroot/include/pthread/pthreadtypes.h:libpthread/include/pthread/pthreadtypes.h
+ - sysroot/include/bits/memory.h:libpthread/sysdeps/${arch}/bits/memory.h
+ - sysroot/include/bits/spin-lock.h:libpthread/sysdeps/${arch}/bits/spin-lock.h
+ - sysroot/include/bits/spin-lock-inline.h:libpthread/sysdeps/${arch}/bits/spin-lock-inline.h
+ - sysroot/include/bits/pthreadtypes.h:libpthread/sysdeps/generic/bits/pthreadtypes.h
+ + sysroot/include/bits/memory.h:libpthread/sysdeps/generic/bits/memory.h
+ + sysroot/include/bits/spin-lock.h:libpthread/sysdeps/generic/bits/spin-lock.h
+ + sysroot/include/bits/spin-lock-inline.h:libpthread/sysdeps/generic/bits/spin-lock-inline.h
+ sysroot/include/bits/barrier-attr.h:libpthread/sysdeps/generic/bits/barrier-attr.h
+ sysroot/include/bits/barrier.h:libpthread/sysdeps/generic/bits/barrier.h
+ sysroot/include/bits/cancelation.h:libpthread/sysdeps/generic/bits/cancelation.h
+ @@ -30,9 +29,8 @@ AC_CONFIG_LINKS([
+ sysroot/include/bits/rwlock-attr.h:libpthread/sysdeps/generic/bits/rwlock-attr.h
+ sysroot/include/bits/rwlock.h:libpthread/sysdeps/generic/bits/rwlock.h
+ sysroot/include/bits/thread-attr.h:libpthread/sysdeps/generic/bits/thread-attr.h
+ - sysroot/include/bits/thread-barrier.h:libpthread/sysdeps/generic/bits/thread-barrier.h
+ sysroot/include/bits/thread-specific.h:libpthread/sysdeps/generic/bits/thread-specific.h
+ - sysroot/include/bits/pthread-np.h:libpthread/sysdeps/l4/hurd/bits/pthread-np.h
+ + sysroot/include/bits/pthread-np.h:libpthread/sysdeps/viengoos/bits/pthread-np.h
+ sysroot/include/semaphore.h:libpthread/include/semaphore.h
+ sysroot/include/bits/semaphore.h:libpthread/sysdeps/generic/bits/semaphore.h
+ sysroot/include/signal.h:libpthread/signal/signal.h
+ @@ -41,5 +39,5 @@ AC_CONFIG_LINKS([
+ AC_CONFIG_COMMANDS_POST([
+ mkdir -p sysroot/lib libpthread &&
+ ln -sf ../../libpthread/libpthread.a sysroot/lib/ &&
+ - touch libpthread/libpthread.a
+ + echo '/* This file intentionally left blank. */' >libpthread/libpthread.a
+ ])
+ diff --git a/sysdeps/hurd/pt-setspecific.c b/sysdeps/hurd/pt-setspecific.c
+ index 89ca4d7..d2d1157 100644
+ --- a/sysdeps/hurd/pt-setspecific.c
+ +++ b/sysdeps/hurd/pt-setspecific.c
+ @@ -1,5 +1,5 @@
+ /* pthread_setspecific. Generic version.
+ - Copyright (C) 2002 Free Software Foundation, Inc.
+ + Copyright (C) 2002, 2008 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ @@ -30,7 +30,8 @@ pthread_setspecific (pthread_key_t key, const void *value)
+
+ if (! self->thread_specifics)
+ {
+ - err = hurd_ihash_create (&self->thread_specifics, HURD_IHASH_NO_LOCP);
+ + err = hurd_ihash_create (&self->thread_specifics, false,
+ + HURD_IHASH_NO_LOCP);
+ if (err)
+ return ENOMEM;
+ }
+ diff --git a/sysdeps/mach/pt-thread-halt.c b/sysdeps/mach/pt-thread-halt.c
+ index 973cde1..9f86024 100644
+ --- a/sysdeps/mach/pt-thread-halt.c
+ +++ b/sysdeps/mach/pt-thread-halt.c
+ @@ -30,8 +30,14 @@
+ being halted, thus the last action should be halting the thread
+ itself. */
+ void
+ -__pthread_thread_halt (struct __pthread *thread)
+ +__pthread_thread_halt (struct __pthread *thread, int need_dealloc)
+ {
+ - error_t err = __thread_terminate (thread->kernel_thread);
+ + error_t err;
+ + thread_t tid = thread->kernel_thread;
+ +
+ + if (need_dealloc)
+ + __pthread_dealloc (thread);
+ +
+ + err = __thread_terminate (tid);
+ assert_perror (err);
+ }
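
Note on the generic spin-lock hunk above: the following is a minimal, self-contained sketch of the same compare-and-swap pattern, assuming GCC's `__sync_val_compare_and_swap` builtin is available. All `demo_` names are hypothetical and are not part of libpthread; the busy-wait slow path in particular is an assumption, since the real `_pthread_spin_lock` is not shown in the diff.

```c
#include <errno.h>

/* Hypothetical stand-in for __pthread_spinlock_t: 0 = free, 1 = held.  */
typedef volatile int demo_spinlock_t;

/* Try to take the lock once; 0 on success, EBUSY if it was already held.  */
static int
demo_spin_trylock (demo_spinlock_t *lock)
{
  /* The builtin returns the value that was stored before the swap.  */
  int was_locked = __sync_val_compare_and_swap (lock, 0, 1);
  return was_locked ? EBUSY : 0;
}

/* Spin until the lock is taken.  Assumption: a real slow path may back
   off or yield; this sketch simply retries.  */
static int
demo_spin_lock (demo_spinlock_t *lock)
{
  while (demo_spin_trylock (lock) != 0)
    ;
  return 0;
}

/* Release the lock; EINVAL if it was not held.  */
static int
demo_spin_unlock (demo_spinlock_t *lock)
{
  int was_locked = __sync_val_compare_and_swap (lock, 1, 0);
  return was_locked ? 0 : EINVAL;
}
```

As in the hunk, unlocking a lock that is not held is reported with `EINVAL` rather than silently ignored, which costs an atomic operation on unlock but catches unbalanced calls.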
diff --git a/open_issues/libpthread_CLOCK_MONOTONIC.mdwn b/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
index 2c8f10f8..5a99778b 100644
--- a/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
+++ b/open_issues/libpthread_CLOCK_MONOTONIC.mdwn
@@ -76,3 +76,42 @@ License|/fdl]]."]]"""]]
<pinotree> kind of, yes
<youpi> I have reverted the change in libc for now
<pinotree> ok
+
+
+## IRC, freenode, #hurd, 2012-07-22
+
+ <tschwinge> pinotree, youpi: I once saw you discussing issue with librt
+ usage in libpthread -- is it this issue? http://sourceware.org/PR14304
+ <youpi> tschwinge: (librt): no
+ <youpi> it's the converse
+ <pinotree> tschwinge: kind of
+ <youpi> unexpectedly loading libpthread is almost never a problem
+ <youpi> it's unexpectedly loading librt which was a problem for glib
+ <youpi> tschwinge: basically what happened with glib is that at configure
+ time, it could find clock_gettime without any -lrt, because of pulling
+ -lpthread, but at link time that wouldn't happen
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <braunr> pinotree: oh, i see you changed __pthread_timedblock to use
+ clock_gettime
+ <braunr> i wonder if i should do the same in libthreads
+ <pinotree> yeah, i realized later it was a bad move
+ <braunr> ok
+ <braunr> i'll stick to gettimeofday for now
+ <pinotree> it'll be safe when implementing some private
+ __hurd_clock_get{time,res} in libc proper, making librt just forward to
+ it and adapting the gettimeofday to use it
+
+
+## IRC, freenode, #hurd, 2012-10-22
+
+ <pinotree> youpi: apparently somebody in glibc land is indirectly solving
+ our "libpthread needs librt which pulls libpthread" circular issue by
+ moving the clock_* functions to libc proper
+ <youpi> I've seen that yes :)
+
+[[!sourceware_PR 14304]], [[!sourceware_PR 14743]], [[!message-id
+"CAH6eHdQRyTgkXE7k+UVpaObNTOZf7QF_fNoU-bqbMhfzXxXUDg@mail.gmail.com"]], commit
+6e6249d0b461b952d0f544792372663feb6d792a (2012-10-24).
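
Note on the timed-block discussion above: whichever clock the library ends up reading (`gettimeofday` or `clock_gettime`), a timed block has to turn the caller's absolute deadline into a remaining wait. The following is a sketch of that arithmetic only, under the assumption that the wait is expressed in milliseconds; `demo_remaining_ms` is a hypothetical name and not a libpthread function, and the caller is assumed to pass normalized `timespec` values from the same clock.

```c
#include <time.h>

/* Hypothetical helper: milliseconds left from NOW until ABSTIME,
   rounded up, or 0 if the deadline has already passed.  Both values
   are assumed to have tv_nsec in [0, 1000000000).  Very distant
   deadlines could overflow `long'; this sketch ignores that.  */
static long
demo_remaining_ms (struct timespec now, struct timespec abstime)
{
  long sec = (long) (abstime.tv_sec - now.tv_sec);
  long nsec = abstime.tv_nsec - now.tv_nsec;

  /* Borrow one second if the nanosecond part went negative.  */
  if (nsec < 0)
    {
      sec -= 1;
      nsec += 1000000000L;
    }

  if (sec < 0)
    return 0;

  /* Round up so that the wait is never shorter than requested.  */
  return sec * 1000 + (nsec + 999999) / 1000000;
}
```

Keeping the clock read outside the helper means the same arithmetic works regardless of which library provides the clock function.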
diff --git a/open_issues/libpthread_timeout_dequeue.mdwn b/open_issues/libpthread_timeout_dequeue.mdwn
new file mode 100644
index 00000000..5ebb2e11
--- /dev/null
+++ b/open_issues/libpthread_timeout_dequeue.mdwn
@@ -0,0 +1,22 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_libpthread]]
+
+
+# IRC, freenode, #hurd, 2012-08-17
+
+ <braunr> pthread_cond_timedwait and pthread_mutex_timedlock *can* produce
+ segfaults in our implementation
+ <braunr> if a timeout happens, but before the thread dequeues itself,
+ another tries to wake it, it will be dequeued twice
+ <braunr> this is the issue i spent a week on when working on fixing select
+
+[[select]]
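
Note on the double dequeue described above: one way to make it harmless is to make the dequeue idempotent, so that whichever of the timed-out thread and the waker comes second finds nothing left to do. The following is only a sketch under stated assumptions, not the fix that went into libpthread: `struct demo_waiter` and both functions are hypothetical, modelled on the `next`/`prevp` links of `struct __pthread`, and the caller is assumed to hold the lock protecting the queue around every call.

```c
#include <stddef.h>

/* Hypothetical waiter with the same link layout as struct __pthread.
   prevp == NULL means "not on any queue".  */
struct demo_waiter
{
  struct demo_waiter *next, **prevp;
};

/* Insert W at the head of the queue.  Caller holds the queue lock.  */
static void
demo_enqueue (struct demo_waiter **head, struct demo_waiter *w)
{
  w->next = *head;
  w->prevp = head;
  if (*head)
    (*head)->prevp = &w->next;
  *head = w;
}

/* Remove W if it is still queued.  Returns 1 if this call removed it,
   0 if someone else already had.  Caller holds the queue lock.  */
static int
demo_dequeue_once (struct demo_waiter *w)
{
  if (w->prevp == NULL)
    return 0;

  if (w->next)
    w->next->prevp = w->prevp;
  *w->prevp = w->next;

  /* Mark as unqueued so that a second call is a no-op.  */
  w->next = NULL;
  w->prevp = NULL;
  return 1;
}
```

The return value tells the timed-out thread whether it lost the race to a waker, which it may need to know in order to report success rather than a timeout.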
diff --git a/open_issues/mach_federations.mdwn b/open_issues/mach_federations.mdwn
new file mode 100644
index 00000000..50c939c3
--- /dev/null
+++ b/open_issues/mach_federations.mdwn
@@ -0,0 +1,66 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_documentation]]
+
+
+# IRC, freenode, #hurd, 2012-08-18
+
+ <braunr> well replacing parts of it is possible on the hurd, but for core
+ servers it's limited
+ <braunr> minix has features for that
+ <braunr> this was interesting too:
+ http://static.usenix.org/event/osdi08/tech/full_papers/david/david_html/
+ <braunr> lcc: you'll always have some kind of dependency problems which are
+ hard to solve
+ <savask> braunr: One my friend asked me if it's possible to run different
+ parts of Hurd on different computers and make a cluster therefore. So, is
+ it, at least theoretically?
+ <braunr> savask: no
+ <savask> Okay, then I guessed a right answer.
+ <youpi> well, theoretically it's possible, but it's not implemented
+ <braunr> well it's possible everywhere :p
+ <braunr> there are projects for that on linux
+ <braunr> but it requires serious changes in both the protocols and servers
+ <braunr> and it depends on the features you want (i assume here you want
+ e.g. process checkpointing so they can be migrated to other machines to
+ transparently balance loads)
+ <lcc> is it even theoretically possible to have a system in which core
+ servers can be modified while the system is running? hm... I will look
+ more into it. just curious.
+ <savask> lcc: Linux can be updated on the fly, without rebooting.
+ <braunr> lcc: to some degree, it is
+ <braunr> savask: the whole kernel is rebooted actually
+ <braunr> well not rebooted, but restarted
+ <braunr> there is a project that provides kernel updates through binary
+ patches
+ <braunr> ksplice
+ <savask> braunr: But it will look like everything continued running.
+ <braunr> as long as the new code expects the same data structures and other
+ implications, yes
+ <braunr> "Ksplice can handle many security updates but not changes to data
+ structures"
+ <braunr> obviously
+ <braunr> so it's good for small changes
+ <braunr> and ksplice is very specific, it's intended for security updates,
+ and the primary users are telecommunication providers who don't want
+ downtime
+ <antrik> braunr: well, protocols and servers on Mach-based systems should
+ be ready for federations... although some Hurd protocols are not clean
+ for federations with heterogenous architectures, at least on homogenous
+ clusters it should actually work with only some extra bootstrapping code,
+ if the support existed in our Mach variant...
+ <braunr> antrik: why do you want the support in the kernel ?
+ <antrik> braunr: I didn't say I *want* federation support in the
+ kernel... in fact I agree with Shapiro that it's probably a bad idea. I
+ just said that it *should* actually work with the system design as it is
+ now :-)
+ <antrik> braunr: yes, I said that it wouldn't work on heterogenous
+ federations. if all machines use the same architecture it should work.
diff --git a/open_issues/mach_on_top_of_posix.mdwn b/open_issues/mach_on_top_of_posix.mdwn
index 7574feb0..a3e47685 100644
--- a/open_issues/mach_on_top_of_posix.mdwn
+++ b/open_issues/mach_on_top_of_posix.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -14,3 +14,5 @@ License|/fdl]]."]]"""]]
At the beginning of the 2000s, there was a *Mach on Top of POSIX* port started
by John Edwin Tobey. Status unknown. Ask [[tschwinge]] for the source code.
+
+See also [[implementing_hurd_on_top_of_another_system]].
diff --git a/open_issues/mach_shadow_objects.mdwn b/open_issues/mach_shadow_objects.mdwn
new file mode 100644
index 00000000..0669041a
--- /dev/null
+++ b/open_issues/mach_shadow_objects.mdwn
@@ -0,0 +1,24 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_gnumach]]
+
+See also [[gnumach_vm_map_entry_forward_merging]].
+
+
+# IRC, freenode, #hurd, 2012-11-16
+
+ <mcsim> hi. do I understand correctly that the following is true: vm_object_t a;
+ a->shadow->copy == a;?
+ <braunr> mcsim: not completely sure, but i'd say no
+ <braunr> but mach terminology isn't always firm, so possible
+ <braunr> mcsim: apparently you're right, although be careful that it may
+ not be the case *all* the time
+ <braunr> there may be inconsistent states
diff --git a/open_issues/mission_statement.mdwn b/open_issues/mission_statement.mdwn
index 17f148a9..b32d6ba6 100644
--- a/open_issues/mission_statement.mdwn
+++ b/open_issues/mission_statement.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -658,3 +658,42 @@ License|/fdl]]."]]"""]]
FUSE in this case though... it doesn't really change the functionality of
the VFS; only rearranges the tree a bit
<antrik> (might even be doable with standard Linux features)
+
+
+# IRC, freenode, #hurd, 2012-07-25
+
+ <braunr> because it has design problems, because it has implementation
+ problems, lots of problems, and far too few people to keep up with other
+ systems that are already dominating
+ <braunr> also, considering other research projects get much more funding
+ than we do, they probably have a better chance at being adopted
+ <rah> you consider the Hurd to be a research project?
+ <braunr> and as they're more recent, they sometimes overcome some of the
+ issues we have
+ <braunr> yes and no
+ <braunr> yes because it was, at the time of its creation, and it hasn't
+ changed much, and there aren't many (any?) other systems with such a
+ design
+ <braunr> and no because the hurd is actually working, and being released as
+ part of something like debian
+ <braunr> which clearly shows it's able to do the stuff it was intended for
+ <braunr> i consider it a technically very interesting project for
+ developers who want to know more about microkernel based extensible
+ systems
+ <antrik> rah: I don't expect the Hurd to achieve world domination, because
+ most people consider Linux "good enough" and will stick with it
+ <antrik> I for my part think though we could do better than Linux (in
+ certain regards I consider important), which is why I still consider it
+ interesting and worthwhile
+ <nowhere_man> I think that in some respect the OS scene may evolve a bit
+ like the PL one, where everyone progressively adopts ideas from Lisp but
+ doesn't want to do Lisp: everyone slowly shifts towards what µ-kernel
+ OSes have done from the start, but they don't want µ-kernels...
+ <braunr> nowhere_man: that's my opinion too
+ <braunr> and this is why i think something like the hurd still has valuable
+ purpose
+ <nowhere_man> braunr: in honesty, I still ponder the fact that it's my
+ coping mechanism to accept being a Lisp and Hurd fan ;-)
+ <braunr> nowhere_man: it can be used that way too
+ <braunr> functional programming is getting more and more attention
+ <braunr> so it's fine if you're a lisp fan really
diff --git a/open_issues/multithreading.mdwn b/open_issues/multithreading.mdwn
index 5924d3f9..f42601b4 100644
--- a/open_issues/multithreading.mdwn
+++ b/open_issues/multithreading.mdwn
@@ -49,6 +49,160 @@ Tom Van Cutsem, 2009.
<youpi> right
+## IRC, freenode, #hurd, 2012-07-16
+
+ <braunr> hm interesting
+ <braunr> when many threads are created to handle requests, they
+ automatically create a pool of worker threads by staying around for some
+ time
+ <braunr> this time is given in the libport call
+ <braunr> but the threads always remain
+ <braunr> they must be used in turn each time a new request comes in
+ <braunr> ah no :(, they're maintained by the periodic sync :(
+ <braunr> hm, still not that, so weird
+ <antrik> braunr: yes, that's a known problem: unused threads should go away
+ after some time, but that doesn't actually happen
+ <antrik> don't remember though whether it's broken for some reason, or
+ simply not implemented at all...
+ <antrik> (this was already a known issue when thread throttling was
+ discussed around 2005...)
+ <braunr> antrik: ok
+ <braunr> hm threads actually do finish ..
+ <braunr> libthreads retains them in a pool for faster allocations
+ <braunr> hm, it's worse than i thought
+ <braunr> i think the hurd does its job well
+ <braunr> the cthreads code never reaps threads
+ <braunr> when threads are finished, they just wait until assigned a new
+ invocation
+
+ <braunr> i don't understand ports_manage_port_operations_multithread :/
+ <braunr> i think i get it
+ <braunr> why do people write things in such a complicated way ..
+ <braunr> such code is error prone and confuses anyone
+
+ <braunr> i wonder how well nested functions interact with threads when
+ sharing variables :/
+ <braunr> the simple idea of nested functions hurts my head
+ <braunr> do you see my point ? :) variables on the stack automatically
+ shared between threads, without the need to explicitly pass them by
+ address
+ <antrik> braunr: I don't understand. why would variables on the stack be
+ shared between threads?...
+ <braunr> antrik: one function declares two variables, two nested functions,
+ and use these in separate threads
+ <braunr> are the local variables still "local"
+ <braunr> ?
+ <antrik> braunr: I would think so? why wouldn't they? threads have separate
+ stacks, right?...
+ <antrik> I must admit though that I have no idea how accessing local
+ variables from the parent function works at all...
+ <braunr> me neither
+
+ <braunr> why don't demuxers get a generic void * like every callback does
+ :((
+ <antrik> ?
+ <braunr> antrik: they get pointers to the input and output messages only
+ <antrik> why is this a problem?
+ <braunr> ports_manage_port_operations_multithread can be called multiple
+ times in the same process
+ <braunr> each call must have its own context
+ <braunr> currently this is done by using nested functions
+ <braunr> also, why do demuxers return booleans while mach_msg_server_timeout
+ happily ignores them :(
+ <braunr> callbacks shouldn't return anything anyway
+ <braunr> but then you have a totally meaningless "return 1" in the middle
+ of the code
+ <braunr> i'd advise not using a single nested function
+ <antrik> I don't understand the remark about nested function
+ <braunr> they're just horrible extensions
+ <braunr> the compiler completely hides what happens behind the scenes, and
+ nasty bugs could come out of that
+ <braunr> i'll try to rewrite ports_manage_port_operations_multithread
+ without them and see if it changes anything
+ <braunr> but it's not easy
+ <braunr> also, it makes debugging harder :p
+ <braunr> i suspect gdb hangs are due to that, since threads directly start
+ on a nested function
+ <braunr> and if i'm right, they are created on the stack
+ <braunr> (which is also horrible for security concerns, but that's another
+ story)
+ <braunr> (at least the trampolines)
+ <antrik> I seriously doubt it will change anything... but feel free to
+ prove me wrong :-)
+ <braunr> well, i can see really weird things, but it may have nothing to do
+ with the fact functions are nested
+ <braunr> (i still strongly believe those shouldn't be used at all)
+
+
+## IRC, freenode, #hurd, 2012-08-31
+
+ <braunr> and the hurd is all but scalable
+ <gnu_srs> I thought scalability was built-in already, at least for hurd??
+ <braunr> built in ?
+ <gnu_srs> designed in
+ <braunr> i guess you think that because you read "aggressively
+ multithreaded" ?
+ <braunr> well, a system that is unable to control the amount of threads it
+ creates for no valid reason and uses global locks just about everywhere isn't
+ really scalable
+ <braunr> it's not smp nor memory scalable
+ <gnu_srs> most modern OSes have multi-cpu support.
+ <braunr> that doesn't mean they scale
+ <braunr> bsd sucks in this area
+ <braunr> it got better in recent years but they're way behind linux
+ <braunr> linux has this magic thing called rcu
+ <braunr> and i want that in my system, from the beginning
+ <braunr> and no, the hurd was never designed to scale
+ <braunr> that's obvious
+ <braunr> a very common mistake of the early 90s
+
+
+## IRC, freenode, #hurd, 2012-09-06
+
+ <braunr> mel-: the problem with such a true client/server architecture is
+ that the scheduling context of clients is not transferred to servers
+ <braunr> mel-: and the hurd creates threads on demand, so if it's too slow
+ to process requests, more threads are spawned
+ <braunr> to prevent hurd servers from creating too many threads, they are
+ given a higher priority
+ <braunr> and it causes increased latency for normal user applications
+ <braunr> a better way, which is what modern synchronous microkernel based
+ systems do
+ <braunr> is to transfer the scheduling context of the client to the server
+ <braunr> the server thread behaves like the client thread from the
+ scheduler perspective
+ <gnu_srs> how can creating more threads ease the slowness, is that a design
+ decision??
+ <mel-> what would be needed to implement this?
+ <braunr> mel-: thread migration
+ <braunr> gnu_srs: is that what i wrote ?
+ <mel-> does mach support it?
+ <braunr> mel-: some versions do yes
+ <braunr> mel-: not ours
+ <gnu_srs> (21:49:03) braunr: mel-: and the hurd creates threads on demand,
+ so if it's too slow to process requests, more threads are spawned
+ <braunr> of course it's a design decision
+ <braunr> it doesn't "ease the slowness"
+ <braunr> it makes servers able to use multiple processors to handle
+ requests
+ <braunr> but it's a wrong design decision as the number of threads is
+ completely unchecked
+ <gnu_srs> what's the idea of creating more threads then, multiple cpus is
+ not supported?
+ <braunr> it's a very old decision taken at a time when systems and machines
+ were very different
+ <braunr> mach used to support multiple processors
+ <braunr> it was expected gnumach would do so too
+ <braunr> mel-: but getting thread migration would also require us to adjust
+ our threading library and our servers
+ <braunr> it's not an easy task at all
+ <braunr> and it doesn't fix everything
+ <braunr> thread migration on mach is an optimization
+ <mel-> interesting
+ <braunr> async ipc remains available, which means notifications, which are
+ async by nature, will create message floods anyway
+
+
# Alternative approaches:
* <http://www.concurrencykit.org/>
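
As a sketch of the point raised in the 2012-07-16 log above (demuxers only receive the input and output messages, so per-call state ends up captured by nested functions), the following C fragment shows the alternative of handing each demuxer an explicit `void *` context. Everything in it is hypothetical: `struct msg`, `struct server_ctx`, `demux_fn`, `example_demuxer` and `serve_messages` are illustration names, not libports or Mach interfaces, and the reply id of request id plus 100 only imitates the MIG convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message type; stands in for the request/reply headers
   a real demuxer would receive. */
struct msg {
    int id;
    int payload;
};

/* Per-call context, passed explicitly instead of being captured from
   the enclosing stack frame by a nested function. */
struct server_ctx {
    int base_id;   /* first message id this server instance accepts */
    int handled;   /* number of requests this instance has accepted */
};

/* Assumed demuxer signature with a generic context pointer.
   Returns 1 if the message was handled, 0 otherwise. */
typedef int (*demux_fn)(const struct msg *in, struct msg *out, void *ctx);

static int example_demuxer(const struct msg *in, struct msg *out, void *ctx)
{
    struct server_ctx *sc = ctx;

    assert(sc != NULL);
    /* Each instance accepts a window of 100 ids starting at base_id. */
    if (in->id < sc->base_id || in->id >= sc->base_id + 100)
        return 0;

    out->id = in->id + 100;          /* illustrative reply id */
    out->payload = in->payload * 2;  /* placeholder for real work */
    sc->handled++;
    return 1;
}

/* Hypothetical dispatch loop: feeds n messages to the demuxer and
   returns how many of them were handled. */
static int serve_messages(const struct msg *in, struct msg *out, size_t n,
                          demux_fn demux, void *ctx)
{
    int handled = 0;

    for (size_t i = 0; i < n; i++)
        handled += demux(&in[i], &out[i], ctx) ? 1 : 0;
    return handled;
}
```

Two calls to `serve_messages` with different `struct server_ctx` values keep independent state, and no trampoline has to be generated on the stack for the callback.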
diff --git a/open_issues/netstat.mdwn b/open_issues/netstat.mdwn
new file mode 100644
index 00000000..b575ea7f
--- /dev/null
+++ b/open_issues/netstat.mdwn
@@ -0,0 +1,34 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_hurd open_issue_porting]]
+
+
+# IRC, freenode, #hurd, 2012-12-06
+
+ <braunr> we need a netstat command
+ <pinotree> wouldn't that require rpcs and notifications in pfinet to get
+ info on the known sockets?
+ <braunr> depends on the interface
+ <braunr> netstat currently uses /proc/net/* so that's out of the question
+ <braunr> but a bsd netstat using ioctls could do the job
+ <braunr> i'm not sure if it's done that way
+ <braunr> i don't see why it would require notifications though
+ <pinotree> if you add such rpcs to pfinet, you could show the sockets in procfs
+ <braunr> yes
+ <braunr> that's the clean way :p
+ <braunr> but why notifications ?
+ <pinotree> to get changes on data of sockets (status change, i/o activity,
+ etc)
+ <pinotree> (possibly i'm forgetting some already there features to know
+ that)
+ <braunr> the socket state is centralized in pfinet
+ <braunr> netstat polls it
+ <braunr> (or asks it once)
diff --git a/open_issues/packaging_libpthread.mdwn b/open_issues/packaging_libpthread.mdwn
index d243aaaa..18f124b4 100644
--- a/open_issues/packaging_libpthread.mdwn
+++ b/open_issues/packaging_libpthread.mdwn
@@ -93,7 +93,6 @@ License|/fdl]]."]]"""]]
by anybody?
<youpi> they are half-finished (no __PTHREAD_SPIN_LOCK_INITIALIZER), and
come in the way when building in glibc
- <youpi> also, any reason for using ia32 and not i386? glibc uses the latter
<youpi> pinotree: rid of pthread-stubs yes
<pinotree> \o/
<tschwinge> youpi: You mean sysdeps/mach/i386/machine-lock.h? No idea
@@ -101,7 +100,7 @@ License|/fdl]]."]]"""]]
<youpi> I'm talking about libpthread
<youpi> not glibc
<tschwinge> Oh.
- <tschwinge> sysdeps/ia32/bits/spin-lock.h:# define
+ <tschwinge> sysdeps/i386/bits/spin-lock.h:# define
__PTHREAD_SPIN_LOCK_INITIALIZER ((__pthread_spinlock_t) 0)
<tschwinge> Anyway, no idea about that either.
<youpi> that one is meant to be used with the spin-lock.h just below
@@ -128,12 +127,86 @@ License|/fdl]]."]]"""]]
no-add-needed issue
-## IRC, freenode, #hurd, 2012-04-27
-
- <pinotree> youpi: wouldn't be the case to rename ia32 subdirs to i386 in
- libpthread?
- <pinotree> after all, Makefile hardcodes it, Makefile.am sets the variable
- for it, and glibc expects i386
- <youpi> I know, I've asked tschwinge about it
- <youpi> it's not urging anyway
- <pinotree> right
+## IRC, freenode, #hurd, 2012-08-07
+
+ <tschwinge> Also, the Savannah hurd/glibc.git one does not/not yet include
+ libpthread.
+ <tschwinge> But that could easily be added as a Git submodule.
+ <tschwinge> youpi: To put libpthread into glibc it is literally enough to
+ make Savannah hurd/libpthread.git appear at [glibc]/libpthread?
+ <youpi> tschwinge: there are some patches needed in the rest of the tree
+ <youpi> see in debian, libpthread_clean.diff, tg-libpthread_depends.diff,
+ unsubmitted-pthread.diff, unsubmitted-pthread_posix_options.diff
+ <tschwinge> The libpthread in Debian glibc is
+ hurd/libpthread.git:b428baaa85c0adca9ef4884c637f289a0ab5e2d6 but with
+ 25260994c812050a5d7addf125cdc90c911ca5c1 »Store self in __thread variable
+ instead of threadvar« reverted (why?), [...]
+
+..., and 549aba4335946c26f2701c2b43be0e6148d27c09 »Fix libpthread.so symlink«
+cherry-picked.
+
+ <braunr> tschwinge: is there any plan to merge libpthread.git in glibc.git
+ upstream ?
+ <tschwinge> braunr, youpi: Has not yet been discussed with Roland, as far
+ as I know.
+ <youpi> has not
+ <youpi> libpthread.diff is supposed to be a verbatim copy of the repository
+ <youpi> and then there are a couple patches which don't (yet) make sense
+ upstream
+
+
+# IRC, freenode, #hurd, 2012-11-16
+
+ <pinotree> *** $(common-objpfx)resolv/gai_suspend.o: uses
+ /usr/include/i386-gnu/bits/pthread.h
+ <pinotree> so the ones in the libpthread addon are not used...
+ <tschwinge> pinotree: The latter at leash should be useful information.
+ <pinotree> tschwinge: i'm afraid i didn't get you :) what are you referring
+ to?
+ <tschwinge> pinotree: s%leash%least -- what I meant was that it's actually a
+ real bug that the in-tree libpthread addon include files are not being
+ used.
+ <pinotree> tschwinge: ah sure -- basically, the stuff in
+ libpthread/sysdeps/generic are not used at all
+ <pinotree> (glibc only uses generic for glibc/sysdeps/generic)
+ <pinotree> tschwinge: i might have an idea how to fix it: moving the
+ contents from libpthread/sysdeps/generic to libpthread/sysdeps/pthread,
+ and that would depend on one of the latest libpthread patches i sent
+
+
+# libihash
+
+## IRC, freenode, #hurd, 2012-11-16
+
+ <pinotree> also, libpthread uses hurd's ihash
+ <tschwinge> Yes, I already thought a little bit about the ihash thing. I
+ basically see two options: move ihash into glibc ((probably?) not as a
+ public interface, though), or have libpthread use one of the hash
+ implementations that surely are already present in glibc.
+ <tschwinge> My notes say:
+ <tschwinge> * include/inline-hashtab.h
+ <tschwinge> * locale/programs/simple-hash.h
+ <tschwinge> * misc/hsearch_r.c
+ <tschwinge> * NNS; cf. f46f0abfee5a2b34451708f2462a1c3b1701facd
+ <tschwinge> No idea whether they're equivalent/usable.
+ <pinotree> interesting
+ <tschwinge> And no immediate recollection what NNS is;
+ f46f0abfee5a2b34451708f2462a1c3b1701facd is not a glibc commit after all.
+ ;-)
+ <tschwinge> Oh, and: libiberty: `hashtab.c`
+ <pinotree> hmm, but then you would need to properly ifdef the libpthread
+ hash usage (iirc only for pthread keys) depending on whether it's in
+ glibc or standalone
+ <pinotree> but that shouldn't be an ussue, i guess
+ <pinotree> *issue
+ <tschwinge> No that'd be fine.
+ <tschwinge> My understanding is that the long-term goal (well, not so
+ long-term, actually) is to completely move libpthread into glibc.
+ <pinotree> ie have it buildable only as a glibc addon?
+ <tschwinge> Yes.
+ <tschwinge> No need for more than one mechanism for building it, I think.
+ <tschwinge> Hmm, this doesn't bring us any further:
+ https://www.google.com/search?q=f46f0abfee5a2b34451708f2462a1c3b1701facd
+ <pinotree> yay for acronyms ;)
+ <tschwinge> So, if someone figures out what NNS and this commit are: one
+ beer. ;-)
diff --git a/open_issues/pci_arbiter.mdwn b/open_issues/pci_arbiter.mdwn
new file mode 100644
index 00000000..7730cee0
--- /dev/null
+++ b/open_issues/pci_arbiter.mdwn
@@ -0,0 +1,256 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+For [[DDE]]/X.org/...
+
+
+# IRC, freenode, #hurd, 2012-02-19
+
+ <youpi> antrik: we should probably add a gsoc idea on pci bus arbitration
+ <youpi> DDE is still experimental for now so it's ok that you have to
+ configure it by hand, but it should be automatic at some point
+
+
+## IRC, freenode, #hurd, 2012-02-21
+
+ <braunr> i'm not familiar with the new gnumach interface for userspace
+ drivers, but can this pci enumerator be written with it as it is ?
+ <braunr> (i'm not asking for a precise answer, just yes - even probably -
+ or no)
+ <braunr> (idk or utsl will do as well)
+ <youpi> I'd say yes
+ <youpi> since all drivers need is interrupts, io ports and iomem
+ <youpi> the latter was already available through /dev/mem
+ <youpi> io ports through the i386 rpcs
+ <youpi> the changes provide both interrupts, and physical-contiguous
+ allocation
+ <youpi> it should be way enough
+ <braunr> youpi: ok
+ <braunr> youpi: thanks for the details :)
+ <antrik> braunr: this was mentioned in the context of the interrupt
+ forwarding interface... the original one implemented by zhengda isn't
+ suitable for a PCI server; but the ones proposed by youpi and tschwinge
+ would work
+ <antrik> same for the physical memory interface: the current implementation
+ doesn't allow delegation; but I already said that it's wrong
+
+
+# IRC, freenode, #hurd, 2012-07-15
+
+ <bddebian> youpi: Oh, BTW, I keep meaning to ask you. Could sound be done
+ with dde or would there still need to be some kernel work?
+ <youpi> bddebian: we'd need a PCI arbiter for that
+ <youpi> for now just one userland poking with PCI is fine
+ <youpi> but two can produce bonks
+ <bddebian> They can't use the same?
+ <youpi> that's precisely the matter
+ <youpi> they have to use the same
+ <youpi> and not poke with it themselves
+ <braunr> that's what an arbiter is for
+ <bddebian> OK, so if we don't have a PCI arbiter now, how do things like
+ netdde and video not collide currently?
+ <bddebian> s/netdde/network/
+ <bddebian> or disk for that matter
+ <braunr> bddebian: ah currently, well currently, the network is the only
+ thing using the pci bus
+ <bddebian> How is that possible when I have a PCI video card and disk
+ controller?
+ <braunr> they are accessed through compatible means
+ <bddebian> I suppose one of the hardest parts is prioritization?
+ <braunr> i don't think it matters much, no
+ <youpi> bddebian: netdde and Xorg don't collide essentially because they
+ are not started at the same time (hopefully)
+ <bddebian> braunr: What do you mean it doesn't matter?
+ <braunr> bddebian: well the point is rather serializing access, we don't
+ need more
+ <braunr> do other systems actually schedule access to the pci bus ?
+ <bddebian> From what I am reading, yes
+ <braunr> ok
+
+
+# IRC, freenode, #hurd, 2012-07-16
+
+ <antrik> youpi: the lack of a PCI arbiter is a problem, but I wouldn't
+ consider it a precondition for adding another userspace driver
+ class... it's up to the user to make sure he has only one class active,
+ or take the risk of not doing so...
+ <antrik> (plus, I suspect writing the arbiter is a smaller task than
+ implementing another DDE class anyways...)
+ <bddebian> Where would the arbiter need to reside, in gnumach?
+ <antrik> bddebian: kernel would be one possible place (with the advantage
+ of running both userspace and kernel drivers without the potential for
+ conflicts)
+ <antrik> but I think I would prefer a userspace server
+ <youpi> antrik: we'd rather have PCI devices automatically set up
+ <youpi> just like /dev/netdde is already set up for the user
+ <youpi> so you can't count on the user
+ <youpi> for the arbiter, it could as well be userland, while still
+ interacting with the kernel for some devices
+ <youpi> we however "just" need to get disk drivers in userland to drop PCI
+ drivers from kernel, actually
+
+
+# IRC, freenode, #hurd, 2012-07-17
+
+ <bddebian> youpi: So this PCI arbiter should be a hurd server?
+ <youpi> that'd be better
+ <bddebian> youpi: Is there anything existing to look at as a basis?
+ <youpi> no idea off-hand
+ <bddebian> I mean you couldn't take what netdde does and generalize it?
+ <youpi> netdde doesn't do any arbitration
+
+
+# IRC, OFTC, #debian-hurd, 2012-07-19
+
+ <bdefreese> youpi: Well at some point if you ever have time I'd like to
+ understand better how you see the PCI architecture working in Hurd.
+ I.E. would you expect the server to do enumeration and arbitration?
+ <youpi> I'd expect both, yes, but that's probably to be discussed rather
+ with antrik, he's the one who took some time to think about it
+ <bdefreese> netdde uses libpciaccess currently, right?
+ <youpi> yes
+ <youpi> libpciaccess would have to be fixed into using the arbiter
+ <youpi> (that'd fix xorg as well)
+ <bdefreese> Man, I am still a bit unclear on how this is all interacting
+ currently.. :(
+ <youpi> currently it's not
+ <youpi> and it's just by luck that it doesn't break
+ <bdefreese> Long term xxxdde would use the new server, correct?
+ <youpi> (well, we are also sure that the gnumach enumeration comes always
+ before the netdde enumeration, and xorg is currently not started
+ automatically, so its enumeration is also always after that)
+ <youpi> yes
+ <youpi> the server would essentially provide an interface equivalent to
+ libpciaccess
+ <bdefreese> Right
+ <bdefreese> In general, where does the pci map get "stored"? In GNU/Linux,
+ is it all /proc based?
+ <youpi> what do you mean by "pci map" ?
+ <bdefreese> Once I have enumerated all of the buses and devices, does it
+ stay stored or is it just redone for every call to a pci device?
+ <youpi> in linux it's stored in the kernel
+ <youpi> the arbiter would store it itself
+
+
+# IRC, freenode, #hurd, 2012-07-20
+
+ <bddebian> antrik: BTW, youpi says you are the one to talk to for design of
+ a PCI server :)
+ <antrik> oh, am I?
+ * antrik feels honoured :-)
+ <antrik> I guess it's true though: I *did* spend a little thought on
+ it... even mentioned something in my thesis IIRC
+ <antrik> there is one tricky aspect to it though, which I'm not sure how to
+ handle best: we need two different instances of libpciaccess
+ <bddebian> Why two instances of libpciaccess?
+ <antrik> one used by the PCI server to access the hardware directly (using
+ the existing port poking backend), and one using a new backend to access
+ our PCI server...
+ <braunr> bddebian: hum, both i guess ?
+ <bddebian> antrik: Why wouldn't the server access the hardware directly? I
+ thought libpciaccess was supposed to be generic on purpose?
+ <antrik> hm... guess I wasn't clear
+ <antrik> the point is that the PCI server should use the direct hardware
+ access backend of libpciaccess
+ <antrik> however, *clients* should use the PCI server backend of
+ libpciaccess
+ <antrik> I'm not sure backends can be selected at runtime...
+ <antrik> which might mean that we actually have to compile two different
+ versions of the library. erk.
+ <bddebian> So you are saying the pci server should itself use libpciaccess
+ rather than having its own?
+ <antrik> admittedly, that's not the most fundamental design decision to
+ make ;-)
+ <antrik> bddebian: yes. no need to rewrite (or copy) this code...
+ <bddebian> Hmm
+ <antrik> actually that was the plan all along when I first suggested
+ implementing the register poking backend for libpciaccess
+ <bddebian> Hmm, not sure I like it but I am certainly in no position to
+ question it right now :)
+ <braunr> why don't you like it ?
+ <bddebian> I shouldn't need an Xorg specific library to access PCI on my OS
+ :)
+ <braunr> oh
+ <bddebian> Though I don't disagree that reinventing the wheel is a bit
+ tedious. :)
+ <antrik> bddebian: although it originates from X.Org, I don't think there
+ is anything about the library technically making it X-specific...
+ <braunr> yes that's my opinion too
+ <antrik> (well, there are some X-specific functions IIRC, but these do not
+ hurt the other functionality)
+ <bddebian> But what is there is api/abi breakage? :)
+ <bddebian> s/is/if/
+ <antrik> BTW according to rdepends there appear to be a number of non-X
+ things using the library now
+ <pinotree> like, uhm, hurd
+ <antrik> yeah, that too... we are already using it for DDE
+ <pinotree> if you have deb-src lines in your sources.list, use the
+ grep-dctrl power:
+ <pinotree> grep-dctrl -sPackage -FBuild-Depends libpciaccess-dev
+ /var/lib/apt/lists/*_source_Sources | sort -u
+ <bddebian> I know we are using it for netdde.
+ <antrik> nice thing about it is that once we have the PCI server and an
+ appropriate backend for libpciaccess, the same netdde and X binaries
+ should work either with or without the PCI server
+ <bddebian> Then why have the server at all?
+ <braunr> it's the arbiter
+ <braunr> you can use the library directly only if you're the only user
+ <braunr> and what antrik means is that the interface should be the same for
+ both modes
+ <bddebian> Ugh, that is where I am getting confused
+ <bddebian> In that case shouldn't everything use libpciaccess and the PCI
+ server has to arbitrate the requests?
+ <braunr> bd ?
+ <braunr> bddebian: yes
+ <braunr> bddebian: but they use the indirect version of the library
+ <braunr> whereas the server uses the raw version
+ <bddebian> OK, I gotcha (I think)
+ <braunr> (but they both provide the same interface, so if you don't have a
+ pci server and you know you're the only user, the direct version can be
+ used)
+ <bddebian> But I am not sure I see the difference between creating a second
+ library or just moving the raw access to the PCI server :)
+ <braunr> uh, there is no difference in that
+ <braunr> and you shouldn't do it
+ <braunr> (if that's what antrik meant at least)
+ <braunr> if you can select the backend (raw or pci server) easily, then
+ stick to the same code base
+ <bddebian> That's where I struggle. In my worthless opinion, raw access
+ should be the OS's job while indirect access would be the library's
+ responsibility
+ <braunr> that's true
+ <braunr> but as an optimization, if an application is the only user, it can
+ directly use raw access
+ <bddebian> How would you know that?
+ <bddebian> I'm sorry if these are dumb questions
+ <braunr> hum, don't try to make this behaviour automatic
+ <braunr> it would be selected by the user through command line switches
+ <bddebian> But the OS itself uses PCI for things like disk access and
+ video, no?
+ <braunr> (it could be automatic but it makes things more complicated)
+ <braunr> you don't need an arbiter all the time
+ <braunr> i can't tell you more, wait for antrik to return
+ <braunr> i realize i might already have said some bullshit
+ <antrik> bddebian: well, you have a point there that once we have the
+ arbiter and use it for everything, it isn't strictly useful to still have
+ the register poking in the library
+ <antrik> however, the code will remain in the library anyways, so we better
+ continue using it rather than introducing redundancy...
+ <antrik> but again, that's rather a side issue concerning the design of the
+ PCI server
+ <bddebian> antrik: Fair enough. :) So how would I even start on this?
+ <antrik> bddebian: actually, libpciaccess is a good starting point:
+ checking the API should give you a fairly good idea what functionality
+ the server needs to implement
+ <pinotree> (+1 on library (re)use)
+ <bddebian> antrik: KK
+ <antrik> sorry, I'm a bit busy right now...
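
The design discussed in the 2012-07-20 log (one interface, a raw register-poking backend used by the PCI server, and a second backend that forwards client requests to that server) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not libpciaccess code: `struct fake_bus`, `struct arbiter`, `struct pci_backend`, `raw_read32`, `server_read32` and `pci_read32` are all invented names, the "hardware" is an in-memory array, and the "server" is a plain struct rather than a Hurd translator reached over RPC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_DEVS 4
#define FAKE_REGS 16

/* Stand-in for PCI configuration space; a real raw backend would poke
   I/O ports instead of reading an array. */
struct fake_bus {
    uint32_t cfg[FAKE_DEVS][FAKE_REGS];
};

/* Raw backend: direct access, only safe when there is a single user. */
static int raw_read32(void *state, unsigned dev, unsigned reg, uint32_t *val)
{
    struct fake_bus *bus = state;

    if (dev >= FAKE_DEVS || reg >= FAKE_REGS)
        return -1;
    *val = bus->cfg[dev][reg];
    return 0;
}

/* Hypothetical arbiter: the single owner of the raw backend. Here it
   only counts requests; a real one would serialize concurrent access. */
struct arbiter {
    struct fake_bus *bus;
    unsigned requests;
};

/* Server backend: clients go through the arbiter, which in turn uses
   the raw backend on their behalf. */
static int server_read32(void *state, unsigned dev, unsigned reg, uint32_t *val)
{
    struct arbiter *arb = state;

    arb->requests++;
    return raw_read32(arb->bus, dev, reg, val);
}

/* Both backends sit behind the same interface, so a client does not
   change when the backend is switched. */
struct pci_backend {
    const char *name;
    int (*read32)(void *state, unsigned dev, unsigned reg, uint32_t *val);
    void *state;
};

static int pci_read32(const struct pci_backend *b, unsigned dev, unsigned reg,
                      uint32_t *val)
{
    assert(b != NULL && b->read32 != NULL);
    return b->read32(b->state, dev, reg, val);
}
```

A client holding a `struct pci_backend` initialised with `raw_read32` or with `server_read32` issues the same `pci_read32` calls either way, which is the property the log asks for: the same binaries should work with or without the PCI server.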
diff --git a/open_issues/performance.mdwn b/open_issues/performance.mdwn
index 8dbe1160..ae05e128 100644
--- a/open_issues/performance.mdwn
+++ b/open_issues/performance.mdwn
@@ -52,3 +52,166 @@ call|/glibc/fork]]'s case.
<braunr> the more i study the code, the more i think a lot of time is
wasted on cpu, unlike the common belief of the lack of performance being
only due to I/O
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <braunr> there are several kinds of scalability issues
+ <braunr> iirc, i found some big locks in core libraries like libpager and
+ libdiskfs
+ <braunr> but anyway we can live with those
+ <braunr> in the case i observed, ext2fs, relying on libdiskfs and libpager,
+ scans the entire file list to ask for writebacks, as it can't know if the
+ pages are dirty or not
+ <braunr> the mistake here is moving part of the pageout policy out of the
+ kernel
+ <braunr> so it would require the kernel to handle periodic syncs of the
+ page cache
+ <antrik> braunr: as for big locks: considering that we don't have any SMP
+ so far, does it really matter?...
+ <braunr> antrik: yes
+ <braunr> we have multithreading
+ <braunr> there is no reason to block many threads while if most of them
+ could continue
+ <braunr> -while
+ <antrik> so that's more about latency than throughput?
+ <braunr> considering sleeping/waking is expensive, it's also about
+ throughput
+ <braunr> currently, everything that deals with sleepable locks (both
+ gnumach and the hurd) just wake every thread waiting for an event when
+ the event occurs (there are a few exceptions, but not many)
+ <antrik> ouch
+
+
+## [[!message-id "20121202101508.GA30541@mail.sceen.net"]]
+
+
+## IRC, freenode, #hurd, 2012-12-04
+
+ <damo22> why do some people think hurd is slow? i find it works well even
+ under heavy load inside a virtual machine
+ <braunr> damo22: the virtual machine actually assists the hurd a lot :p
+ <braunr> but even with that, the hurd is a slow system
+ <damo22> i would have thought it would have the potential to be very fast,
+ considering the model of the kernel
+ <braunr> the design implies by definition more overhead, but the true cause
+ is more than 15 years without optimization on the core components
+ <braunr> how so ?
+ <damo22> since there are less layers of code between the hardware bare
+ metal and the application that users run
+ <braunr> how so ? :)
+ <braunr> it's the contrary actually
+ <damo22> VFS -> IPC -> scheduler -> device drivers -> hardware
+ <damo22> that is monolithic
+ <braunr> well, it's not really meaningful
+ <braunr> and i'd say the same applies for a microkernel system
+ <damo22> if the application can talk directly to hardware through the
+ kernel its almost like plugging directly into the hardware
+ <braunr> you never talk directly to hardware
+ <braunr> you talk to servers instead of the kernel
+ <damo22> ah
+ <braunr> consider monolithic kernel systems like systems with one big
+ server
+ <braunr> the kernel
+ <braunr> whereas a multiserver system is a kernel and many servers
+ <braunr> you still need the VFS to identify your service (and thus your
+ server)
+ <braunr> you need much more IPC, since system calls are "replaced" with RPC
+ <braunr> the scheduler is basically the same
+ <damo22> okay
+ <braunr> device drivers are similar too, except they run in thread context
+ (which is usually a bit heavier)
+ <damo22> but you can do cool things like report when an interrupt line is
+ blocked
+ <braunr> and there are many context switches between all that
+ <braunr> you can do all that in a monolithic kernel too, and faster
+ <braunr> but it's far more elegant, and (when well done) easy to do on a
+ microkernel based system
+ <damo22> yes
+ <damo22> i like elegant, makes coding easier if you know the basics
+ <braunr> there are only two major differences between a monolithic kernel
+ and a multiserver microkernel system
+ * damo22 listens
+ <braunr> 1/ independence of location (your resources could be anywhere)
+ <braunr> 2/ separation of address spaces (your servers have their own
+ addresses)
+ <damo22> wow
+ <braunr> these both imply additional layers of indirection, making the
+ system as a whole slower
+ <damo22> but it would be far more secure though i suspect
+ <braunr> yes
+ <braunr> and reliable
+ <braunr> that's why systems like qnx were usually adopted for critical
+ tasks
+ <damo22> security and reliability are very important, i would switch to the
+ hurd if it supported all the hardware i use
+ <braunr> so would i :)
+ <braunr> but performance matters too
+ <damo22> not to me
+ <braunr> it should :p
+ <braunr> it really does matter a lot in practice
+ <damo22> i mean, a 2x slowdown compared to linux would not affect me
+ <damo22> if it had all the benefits we mentioned above
+ <braunr> but the hurd is really slow for other reasons than its additional
+ layers of indirection unfortunately
+ <damo22> is it because of lack of optimisation in the core code?
+ <braunr> we're working on these issues, but it's not easy and takes a lot
+ of time :p
+ <damo22> like you said
+ <braunr> yes
+ <braunr> and also because of some fundamental design choices related to the
+ microkernel back in the 80s
+ <damo22> what about the darwin system
+ <damo22> it uses a mach kernel?
+ <braunr> yes
+ <damo22> what is stopping someone taking the MIT code from darwin and
+ creating a monster free OS
+ <braunr> what for ?
+ <damo22> because it already has hardware support
+ <damo22> and a mach kernel
+ <braunr> in kernel drivers ?
+ <damo22> it has kernel extensions
+ <damo22> you can do things like kextload module
+ <braunr> first, being a mach kernel doesn't make it compatible or even
+ easily usable with the hurd, the interfaces have evolved independently
+ <braunr> and second, we really do want more stuff out of the kernel
+ <braunr> drivers in particular
+ <damo22> may i ask why you are very keen to have drivers out of kernel?
+ <braunr> for the same reason we want other system services out of the
+ kernel
+ <braunr> security, reliability, etc..
+ <braunr> ease of debugging
+ <braunr> the ability to restart drivers separately, without restarting the
+ kernel
+ <damo22> i see
+
+
+# IRC, freenode, #hurd, 2012-09-13
+
+{{$news/2011-q2#phoronix-3}}.
+
+ <braunr> the phoronix benchmarks don't actually test the operating system
+ ..
+ <hroi_> braunr: well, at least it tests its ability to run programs for
+ those particular tasks
+ <braunr> exactly, it tests how programs that don't make much use of the
+ operating system run
+ <braunr> well yes, we can run programs :)
+ <pinotree> those are just cpu-taking tasks
+ <hroi_> ok
+ <pinotree> if you do a benchmark with also i/o, you can see how it is
+ (quite) slower on hurd
+ <hroi_> perhaps they should have run 10 of those programs in parallel, that
+ would test the kernel multitasking I suppose
+ <braunr> not even I/O, simply system calls
+ <braunr> no, multitasking is ok on the hurd
+ <braunr> and it's very similar to what is done on other systems, which
+ hasn't changed much for a long time
+ <braunr> (except for multiprocessor)
+ <braunr> true OS benchmarks measure system calls
+ <hroi_> ok, so Im sensing the view that the actual OS kernel architecture
+ dont really make that much difference, good software does
+ <braunr> not at all
+ <braunr> i'm only saying that the phoronix benchmark results are useless
+ <braunr> because they didn't measure the right thing
+ <hroi_> ok
diff --git a/open_issues/performance/io_system/read-ahead.mdwn b/open_issues/performance/io_system/read-ahead.mdwn
index 710c746b..706e1632 100644
--- a/open_issues/performance/io_system/read-ahead.mdwn
+++ b/open_issues/performance/io_system/read-ahead.mdwn
@@ -1565,3 +1565,994 @@ License|/fdl]]."]]"""]]
<braunr> mcsim1: just use sane values inside the kernel :p
<braunr> this simplifies things by only adding the new vm_advise call and
not change the existing external pager interface
+
+
+## IRC, freenode, #hurd, 2012-07-12
+
+ <braunr> mcsim: so, to begin with, tell us what state you've reached please
+ <mcsim> braunr: I'm writing code for hurd and gnumach. For gnumach I'm
+ implementing memory policies now. RANDOM and NORMAL seems work, but in
+ hurd I found error that I made during editing ext2fs. So for now ext2fs
+ does not work
+ <braunr> policies ?
+ <braunr> what about mechanism ?
+ <mcsim> also I moved some translators to new interface.
+ <mcsim> It works too
+ <braunr> well that's impressive
+ <mcsim> braunr: I'm not sure yet that everything works
+ <braunr> right, but that's already a very good step
+ <braunr> i thought you were still working on the interfaces to be honest
+ <mcsim> And with mechanism I didn't implement moving pages to inactive
+ queue
+ <braunr> what do you mean ?
+ <braunr> ah you mean with the sequential policy ?
+ <mcsim> yes
+ <braunr> you can consider this a secondary goal
+ <mcsim> sequential I was going to implement like you've said, but I still
+ want to support moving pages to inactive queue
+ <braunr> i think you shouldn't
+ <braunr> first get to a state where clustered transfers do work fine
+ <mcsim> policies are implemented in function calculate_clusters
+ <braunr> then, you can try, and measure the difference
+ <mcsim> ok. I'm now working on fixing ext2fs
+ <braunr> so, except from bug squashing, what's left to do ?
+ <mcsim> finish policies and ext2fs; move fatfs, ufs, isofs to new
+ interface; test this all; edit patches from debian repository, that
+ conflict with my changes; rearrange commits and fix code indentation;
+ update documentation;
+ <braunr> think about measurements too
+ <tschwinge> mcsim: Please don't spend a lot of time on ufs. No testing
+ required for that one.
+ <braunr> and keep us informed about your progress on bug fixing, so we can
+ test soon
+ <mcsim> Forgot about moving system to new interfaces (I mean determine form
+ of vm_advise and memory_object_change_attributes)
+ <braunr> s/determine/final/
+ <mcsim> braunr: ok.
+ <braunr> what do you mean "moving system to new interfaces" ?
+ <mcsim> braunr: I also pushed code changes to gnumach and hurd git
+ repositories
+ <mcsim> I met an issue with memory_object_change_attributes when I tried to
+ use it as I have to update all applications that use it. This includes
+ libc and translators that are not in hurd repository or use debian
+ patches. So I will not be able to run system with new
+ memory_object_change_attributes interface, until I update all software
+ that use this rpc
+ <braunr> this is a bit like the problem i had with my change
+ <braunr> the solution is : don't do it
+ <braunr> i mean, don't change the interface in an incompatible way
+ <braunr> if you can't change an existing call, add a new one
+ <mcsim> temporary I changed memory_object_set_attributes as it isn't used
+ any more.
+ <mcsim> braunr: ok. Adding new call is a good idea :)
+
+
+## IRC, freenode, #hurd, 2012-07-16
+
+ <braunr> mcsim: how did you deal with multiple page transfers towards the
+ default pager ?
+ <mcsim> braunr: hello. Didn't handle this yet, but AFAIR default pager
+ supports multiple page transfers.
+ <braunr> mcsim: i'm almost sure it doesn't
+ <mcsim> braunr: indeed
+ <mcsim> braunr: So, I'll update it just other translators.
+ <braunr> like other translators you mean ?
+ <mcsim> *just as
+ <mcsim> braunr: yes
+ <braunr> ok
+ <braunr> be aware also that it may need some support in vm_pageout.c in
+ gnumach
+ <mcsim> braunr: thank you
+ <braunr> if you see anything strange in the default pager, don't hesitate
+ to talk about it
+ <mcsim> braunr: ok. I didn't finish with ext2fs yet.
+ <braunr> so it's a good thing you're aware of it now, before you begin
+ working on it :)
+ <mcsim> braunr: I'm working on ext2 now.
+ <braunr> yes i understand
+ <braunr> i meant "before beginning work on the default pager"
+ <mcsim> ok
+
+ <antrik> mcsim: BTW, we were mostly talking about readahead (pagein) over
+ the past weeks, so I wonder what the status on clustered page*out* is?...
+ <mcsim> antrik: I don't work on this, but following, I think, is an example
+ of *clustered* pageout: _pager_seqnos_memory_object_data_return: object =
+ 113, seqno = 4, control = 120, start_address = 0, length = 8192, dirty =
+ 1. This is an example of debugging printout that shows that pageout
+ manipulates with chunks bigger than page sized.
+ <mcsim> antrik: Another one with bigger length
+ _pager_seqnos_memory_object_data_return: object = 125, seqno = 124,
+ control = 132, start_address = 131072, length = 126976, dirty = 1, kcopy
+ <antrik> mcsim: that's odd -- I didn't know the functionality for that even
+ exists in our codebase...
+ <antrik> my understanding was that Mach always sends individual pageout
+ requests for every single page it wants cleaned...
+ <antrik> (and this being the reason for the dreadful thread storms we are
+ facing...)
+ <braunr> antrik: ok
+ <braunr> antrik: yes that's what is happening
+ <braunr> the thread storms aren't that much of a problem now
+ <braunr> (by carefully throttling pageouts, which is a task i intend to
+ work on during the following months, this won't be an issue any more)
+
+
+## IRC, freenode, #hurd, 2012-07-19
+
+ <mcsim> I moved fatfs, ufs, isofs to new interface, corrected some errors
+ in other that I already moved, moved kernel to new interface (renamed
+ vm_advice to vm_advise and added rpcs memory_object_set_advice and
+ memory_object_get_advice). Made some changes in mechanism and tried to
+ finish ext2 translator.
+ <mcsim> braunr: I've got an issue with fictitious pages...
+ <mcsim> When I determine bounds of cluster in external object I never know
+ its actual size. So, mo_data_request call could ask data that are behind
+ object bounds. The problem is that pager returns data that it has and
+ because of this fictitious pages that were allocated are not freed.
+ <braunr> why don't you know the size ?
+ <mcsim> I see 2 solutions. First one is do not allocate fictitious pages at
+ all (but I think that there could be issues). Another lies in allocating
+ fictitious pages, but then freeing them with mo_data_lock.
+ <mcsim> braunr: Because pages does not inform kernel about object size.
+ <braunr> i don't understand what you mean
+ <mcsim> I think that second way is better.
+ <braunr> so how does it happen ?
+ <braunr> you get a page fault
+ <mcsim> Don't you understand problem or solutions?
+ <braunr> then a lookup in the map finds the map entry
+ <braunr> and the map entry gives you the link to the underlying object
+ <mcsim> from vm_object.h: vm_size_t size; /*
+ Object size (only valid if internal) */
+ <braunr> mcsim: ugh
+ <mcsim> For external they are either 0x8000 or 0x20000...
+ <braunr> and for internal ?
+ <braunr> i'm very surprised to learn that
+ <mcsim> braunr: for internal size is actual
+ <braunr> right sorry, wrong question
+ <braunr> did you find what 0x8000 and 0x20000 are ?
+ <mcsim> for external I met only these 2 magic numbers when printed out
+ arguments of functions _pager_seqno_memory_object_... when they were
+ called.
+ <braunr> yes but did you try to find out where they come from ?
+ <mcsim> braunr: no. I think that 0x2000(many zeros) is maximal possible
+ object size.
+ <braunr> what's the exact value ?
+ <mcsim> can't tell exactly :/ My hurd box has broken again.
+ <braunr> mcsim: how does the vm find the backing content then ?
+ <mcsim> braunr: Do you know if it is guaranteed that map_entry size will be
+ not bigger than external object size?
+ <braunr> mcsim: i know it's not
+ <braunr> but you can use the map entry boundaries though
+ <mcsim> braunr: vm asks pager
+ <braunr> but if the page is already present
+ <braunr> how does it know ?
+ <braunr> it must be inside a vm_object ..
+ <mcsim> If I can use these boundaries than the problem, I described is not
+ actual.
+ <braunr> good
+ <braunr> it makes sense to use these boundaries, as the application can't
+ use data outside the mapping
+ <mcsim> I ask page with vm_page_lookup
+ <braunr> it would matter for shared objects, but then they have their own
+ faults :p
+ <braunr> ok
+ <braunr> so the size is actually completely ignored
+ <mcsim> if it is present than I stop expansion of cluster.
+ <braunr> which makes sense
+ <mcsim> braunr: yes, for external.
+ <braunr> all right
+ <braunr> use the mapping boundaries, it will do
+ <braunr> mcsim: i have only one comment about what i could see
+ <braunr> mcsim: there are 'advice' fields in both vm_map_entry and
+ vm_object
+ <braunr> there should be something else in vm_object
+ <braunr> i told you about pages before and after
+ <braunr> mcsim: how are you using this per object "advice" currently ?
+ <braunr> (in addition, using the same name twice for both mechanism and
+ policy is very sonfusing)
+ <braunr> confusing*
+ <mcsim> braunr: I try to expand cluster as much as it possible, but not
+ much than limit
+ <mcsim> they both determine policy, but advice for entry has bigger
+ priority
+ <braunr> that's wrong
+ <braunr> mapping and content shouldn't compete for policy
+ <braunr> the mapping tells the policy (=the advice) while the content tells
+ how to implement (e.g. how much content)
+ <braunr> IMO, you could simply get rid of the per object "advice" field and
+ use default values for now
+ <mcsim> braunr: What sense these values for number of pages before and
+ after should have?
+ <braunr> or use something well known, easy, and effective like preceding
+ and following pages
+ <braunr> they give the vm the amount of content to ask the backing pager
+ <mcsim> braunr: maximal amount, minimal amount or exact amount?
+ <braunr> neither
+ <braunr> that's why i recommend you forget it for now
+ <braunr> but
+ <braunr> imagine you implement the three standard policies (normal, random,
+ sequential)
+ <braunr> then the pager assigns preceding and following numbers for each of
+ them, say [5;5], [0;0], [15;15] respectively
+ <braunr> these numbers would tell the vm how many pages to ask the pagers
+ in a single request and from where
+ <mcsim> braunr: but in fact there could be much more policies.
+ <braunr> yes
+ <mcsim> also in kernel context there is no such unit as pager.
+ <braunr> so there should be a call like memory_object_set_advice(int
+ advice, int preceding, int following);
+ <braunr> for example
+ <braunr> what ?
+ <braunr> the pager is the memory manager
+ <braunr> it does exist in kernel context
+ <braunr> (or i don't understand what you mean)
+ <mcsim> there is only port, but port could be either pager or something
+ else
+ <braunr> no, it's a pager
+ <braunr> it's a port whose receive right is hold by a task implementing the
+ pager interface
+ <braunr> either the default pager or an untrusted task
+ <braunr> (or null if the object is anonymous memory not yet sent to the
+ default pager)
+ <mcsim> port is always pager?
+ <braunr> the object port is, yes
+ <braunr> struct ipc_port *pager; /* Where to get
+ data */
+ <mcsim> So, you suggest to keep set of advices for each object?
+ <braunr> i suggest you don't change anything in objects for now
+ <braunr> keep the advice in the mappings only, and implement default
+ behaviour for the known policies
+ <braunr> mcsim: if you understand this point, then i have nothing more to
+ say, and we should let nowhere_man present his work
+ <mcsim> braunr: ok. I'll implement only default behaviors for known policies
+ for now.
+ <braunr> (actually, using the mapping boundaries is slightly unoptimal, as
+ we could have several mappings for the same content, e.g. a program with
+ read only executable mapping, then ro only)
+ <braunr> mcsim: another way to know the "size" is to actually lookup for
+ pages in objects
+ <braunr> hm no, that's not true
+ <mcsim> braunr: But if there is no page we have to ask it
+ <mcsim> and I don't understand why using mappings boundaries is unoptimal
+ <braunr> here is bash
+ <braunr> 0000000000400000 868K r-x-- /bin/bash
+ <braunr> 00000000006d9000 36K rw--- /bin/bash
+ <braunr> two entries, same file
+ <braunr> (there is the anonymous memory layer for the second, but it would
+ matter for the first cow faults)
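The cluster sizing discussed in this log can be sketched roughly as follows. This is a minimal model and not gnumach code: the names `compute_cluster`, `cluster_policy` and `default_policies` are hypothetical, the `resident` array stands in for `vm_page_lookup`, and the per-policy values are simply the [5;5], [0;0], [15;15] figures braunr gives as an example. The window grows around the faulted page, is clamped to the mapping boundaries, and stops at the first page that is already present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the gnumach interface. */
enum advice { ADVICE_NORMAL, ADVICE_RANDOM, ADVICE_SEQUENTIAL };

struct cluster_policy {
    size_t preceding;   /* pages to request before the faulted page */
    size_t following;   /* pages to request after the faulted page */
};

/* Default values taken from the example in the discussion. */
static const struct cluster_policy default_policies[] = {
    [ADVICE_NORMAL]     = { 5, 5 },
    [ADVICE_RANDOM]     = { 0, 0 },
    [ADVICE_SEQUENTIAL] = { 15, 15 },
};

struct cluster {
    size_t first;   /* index of the first page to request */
    size_t count;   /* number of pages to request */
};

/*
 * Compute the range of pages to ask the pager for.
 * The mapping covers pages [map_start, map_end).
 * resident[i] is true when page i is already in the object
 * (a stand-in for a vm_page_lookup() hit).
 */
struct cluster
compute_cluster(enum advice advice, size_t fault_page,
                size_t map_start, size_t map_end, const bool *resident)
{
    assert(map_start <= fault_page && fault_page < map_end);

    const struct cluster_policy *p = &default_policies[advice];
    size_t first = fault_page;
    size_t last = fault_page;

    /* Expand backwards until the limit, the mapping start,
       or an already resident page is reached. */
    for (size_t n = 0;
         n < p->preceding && first > map_start && !resident[first - 1];
         n++)
        first--;

    /* Expand forwards under the same three conditions. */
    for (size_t n = 0;
         n < p->following && last + 1 < map_end && !resident[last + 1];
         n++)
        last++;

    return (struct cluster){ first, last - first + 1 };
}
```

Keeping the advice in the mapping and the preceding/following counts in a separate table follows the split braunr argues for: the mapping states the policy, and the table only says how much content to ask for.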
+
+
+## IRC, freenode, #hurd, 2012-08-02
+
+ <mcsim> braunr: You said that I probably need some support in vm_pageout.c
+ to make defpager work with clustered page transfers, but TBH I thought
+ that I have to implement only pagein. Do you expect from me implementing
+ pageout either? Or I misunderstand role of vm_pageout.c?
+ <braunr> no
+ <braunr> you're expected to implement only pagins for now
+ <braunr> pageins
+ <mcsim> well, I'm finishing merging of ext2fs patch for large stores and
+ work on defpager in parallel.
+ <mcsim> braunr: Also I didn't get your idea about configuring of paging
+ mechanism on behalf of pagers.
+ <braunr> which one ?
+ <mcsim> braunr: You said that pager has somehow pass size of desired
+ clusters for different paging policies.
+ <braunr> mcsim: i said not to care about that
+ <braunr> and the wording isn't correct, it's not "on behalf of pagers"
+ <mcsim> servers?
+ <braunr> pagers could tell the kernel what size (before and after a faulted
+ page) they prefer for each existing policy
+ <braunr> but that's one way to do it
+ <braunr> defaults work well too
+ <braunr> as shown in other implementations
+
+
+## IRC, freenode, #hurd, 2012-08-09
+
+ <mcsim> braunr: I'm still debugging ext2 with large storage patch
+ <braunr> mcsim: tough problems ?
+ <mcsim> braunr: The same issues as I always meet when do debugging, but it
+ takes time.
+ <braunr> mcsim: so nothing blocking so far ?
+ <mcsim> braunr: I can't tell you for sure that I will finish up to 13th of
+ August and this is unofficial pencil down date.
+ <braunr> all right, but are you blocked ?
+ <mcsim> braunr: If you mean the issues that I can not even imagine how to
+ solve than there is no ones.
+ <braunr> good
+ <braunr> mcsim: i'll try to review your code again this week end
+ <braunr> mcsim: make sure to commit everything even if it's messy
+ <mcsim> braunr: ok
+ <mcsim> braunr: I made changes to defpager, but I haven't tried
+ them. Commit them too?
+ <braunr> mcsim: sure
+ <braunr> mcsim: does it work fine without the large storage patch ?
+ <mcsim> braunr: looks fine, but TBH I can't even run such things like fsx,
+ because even without my changes it failed mightily at once.
+ <braunr> mcsim: right, well, that will be part of another task :)
+
+
+## IRC, freenode, #hurd, 2012-08-13
+
+ <mcsim> braunr: hello. Seems ext2fs with large store patch works.
+
+
+## IRC, freenode, #hurd, 2012-08-19
+
+ <mcsim> hello. Consider such situation. There is a page fault and kernel
+ decided to request pager for several pages, but at the moment pager is
+ able to provide only first pages, the rest ones are not known yet. Is it
+ possible to supply only one page and regarding rest ones tell the kernel
+ something like: "Rest pages try again later"?
+ <mcsim> I tried pager_data_unavailable && pager_flush_some, but this seems
+ does not work.
+ <mcsim> Or I have to supply something anyway?
+ <braunr> mcsim: better not provide them
+ <braunr> the kernel only really needs one page
+ <braunr> don't try to implement "try again later", the kernel will do that
+ if other page faults occur for those pages
+ <mcsim> braunr: No, translator just hangs
+ <braunr> ?
+ <mcsim> braunr: And I even can't deattach it without reboot
+ <braunr> hangs when what
+ <braunr> ?
+ <braunr> i mean, what happens when it hangs ?
+ <mcsim> If kernel request 2 pages and I provide one, than when page fault
+ occurs in second page translator hangs.
+ <braunr> well that's a bug
+ <braunr> clustered pager transfer is a mere optimization, you shouldn't
+ transfer more than you can just to satisfy some requested size
+ <mcsim> I think that it because I create fictitious pages before calling
+ mo_data_request
+ <braunr> as placeholders ?
+ <mcsim> Yes. Is it correct if I will not grab fictitious pages?
+ <braunr> no
+ <braunr> i don't know the details well enough about fictitious pages
+ unfortunately, but it really feels wrong to use them where real physical
+ pages should be used instead
+ <braunr> normally, an in-transfer page is simply marked busy
+ <mcsim> But If page is already marked busy kernel will not ask it another
+ time.
+ <braunr> when the pager replies, you unbusy them
+ <braunr> your bug may be that you incorrectly use pmap
+ <braunr> you shouldn't create mmu mappings for pages you didn't receive
+ from the pagers
+ <mcsim> I don't create them
+ <braunr> ok so you correctly get the second page fault
+ <mcsim> If pager supplies only first pages, when asked were two, than
+ second page will not become un-busy.
+ <braunr> that's a bug
+ <braunr> your code shouldn't assume the pager will provide all the pages it
+ was asked for
+ <braunr> only the main one
+ <mcsim> Will it be ok if I will provide special attribute that will keep
+ information that page has been advised?
+ <braunr> what for ?
+ <braunr> i don't understand "page has been advised"
+ <mcsim> Advised page is page that is asked in cluster, but there wasn't a
+ page fault in it.
+ <mcsim> I need this attribute because if I don't inform kernel about this
+ page anyhow, than kernel will not change attributes of this page.
+ <braunr> why would it change its attributes ?
+ <mcsim> But if page fault will occur in page that was asked than page will
+ be already busy by the moment.
+ <braunr> and what attribute ?
+ <mcsim> advised
+ <braunr> i'm lost
+ <braunr> 08:53 < mcsim> I need this attribute because if I don't inform
+ kernel about this page anyhow, than kernel will not change attributes of
+ this page.
+ <braunr> you need the advised attribute because if you don't inform the
+ kernel about this page, the kernel will not change the advised attribute
+ of this page ?
+ <mcsim> Not only advised, but busy as well.
+ <mcsim> And if page fault will occur in this page, kernel will not ask it
+ second time. Kernel will just block.
+ <braunr> well that's normal
+ <mcsim> But if kernel will block and pager is not going to report somehow
+ about this page, than translator will hang.
+ <braunr> but the pager is going to report
+ <braunr> and in this report, there can be less pages then requested
+ <mcsim> braunr: You told not to report
+ <braunr> the kernel can deduce it didn't receive all the pages, and mark
+ them unbusy anyway
+ <braunr> i told not to transfer more than requested
+ <braunr> but not sending data can be a form of communication
+ <braunr> i mean, sending a message in which data is missing
+ <braunr> it simply means its not there, but this info is sufficient for the
+ kernel
+ <mcsim> hmmm... Seems I understood you. Let me try something.
+ <mcsim> braunr: I informed kernel about missing page as follows:
+ pager_data_supply (pager, precious, writelock, i, 1, NULL, 0); Am I
+ right?
+ <braunr> i don't know the interface well
+ <braunr> what does it mean
+ <braunr> ?
+ <braunr> are you passing NULL as the data for a missing page ?
+ <mcsim> yes
+ <braunr> i see
+ <braunr> you shouldn't need a request for that though, avoiding useless ipc
+ is a good thing
+ <mcsim> i is number of page, 1 is quantity
+ <braunr> but if you can't find a better way for now, it will do
+ <mcsim> But this does not work :(
+ <braunr> that's a bug
+ <braunr> in your code probably
+ <mcsim> braunr: supplying NULL as data returns MACH_SEND_INVALID_MEMORY
+ <braunr> but why would it work ?
+ <braunr> mach expects something
+ <braunr> you have to change that
+ <mcsim> It's mig who refuses data. Mach does not even get the call.
+ <braunr> hum
+ <mcsim> That's why I propose to provide new attribute, that will keep
+ information regarding whether the page was asked as advice or not.
+ <braunr> i still don't understand why
+ <braunr> why don't you fix mig so you can your null message instead ?
+ <braunr> +send
+ <mcsim> braunr: because usually this is an error
+ <braunr> the kernel will decide if it's an erro
+ <braunr> r
+ <braunr> what kind of reply do you intend to send the kernel with for these
+ "advised" pages ?
+ <mcsim> no reply. But when page fault will occur in busy page and it will
+ be also advised, kernel will not block, but ask this page another time.
+ <mcsim> And how kernel will know that this is an error or not?
+ <braunr> why ask another time ?!
+ <braunr> you really don't want to flood pagers with useless messages
+ <braunr> here is how it should be
+ <braunr> 1/ the kernel requests pages from the pager
+ <braunr> it knows the range
+ <braunr> 2/ the pager replies what it can, full range, subset of it, even
+ only one page
+ <braunr> 3/ the kernel uses what the pager replied, and unbusies the other
+ pages
+ <mcsim> First time page was asked because page fault occurred in
+ neighborhood. And second time because PF occurred in page.
+ <braunr> well it shouldn't
+ <braunr> or it should, but then you have a segfault
+ <mcsim> But kernel does not keep bound of range, that it asked.
+ <braunr> if the kernel can't find the main page, the one it needs to make
+ progress, it's a segfault
+ <mcsim> And this range could be supplied in several messages.
+ <braunr> absolutely not
+ <braunr> you defeat the purpose of clustered pageins if you use several
+ messages
+ <mcsim> But interface supports it
+ <braunr> interface supported single page transfers, doesn't mean it's good
+ <braunr> well, you could use several messages
+ <braunr> as what we really want is less I/O
+ <mcsim> Noone keeps bounds of requested range, so it couldn't be checked
+ that range was split
+ <braunr> but it would be so much better to do it all with as few messages
+ as possible
+ <braunr> does the kernel knows the main page ?
+ <braunr> know*
+ <mcsim> Splitting range is not optimal, but it's not an error.
+ <braunr> i assume it does
+ <braunr> doesn't it ?
+ <mcsim> no, that's why I want to provide new attribute.
+ <braunr> i'm sorry i'm lost again
+ <braunr> how does the kernel knows a page fault has been serviced ?
+ <braunr> know*
+ <mcsim> It receives an interrupt
+ <braunr> ?
+ <braunr> let's not mix terms
+ <mcsim> oh.. I read as received. Sorry
+ <mcsim> It get mo_data_supply message. Than it replaces fictitious pages
+ with real ones.
+ <braunr> so you get a message
+ <braunr> and you kept track of the range using fictitious pages
+ <braunr> use the busy flag instead, and another way to retain the range
+ <mcsim> I allocate fictitious pages to reserve place. Than if page fault
+ will occur in this page fictitious page kernel will not send another
+ mo_data_request call, it will wait until fictitious page unblocks.
+ <braunr> i'll have to check the code but it looks unoptimal to me
+ <braunr> we really don't want to allocate useless objects when a simple
+ busy flag would do
+ <mcsim> busy flag for what? There is no page yet
+ <braunr> we're talking about mo_data_supply
+ <braunr> actually we're talking about the whole page fault process
+ <mcsim> We can't mark nothing as busy, that's why kernel allocates
+ fictitious page and marks it as busy until real page would be supplied.
+ <braunr> what do you mean "nothing" ?
+ <mcsim> VM_PAGE_NULL
+ <braunr> uh ?
+ <braunr> when are physical pages allocated ?
+ <braunr> on request or on reply from the pager ?
+ <braunr> i'm reading mo_data_supply, and it looks like the page is already
+ busy at that time
+ <mcsim> they are allocated by pager and than supplied in reply
+ <mcsim> Yes, but these pages are fictitious
+ <braunr> show me please
+ <braunr> in the master branch, not yours
+ <mcsim> that page is fictitious?
+ <braunr> yes
+ <braunr> i'm referring to the way mach currently does things
+ <mcsim> vm/vm_fault.c:582
+ <braunr> that's memory_object_lock_page
+ <braunr> hm wait
+ <braunr> my bad
+ <braunr> ah that damn object chaining :/
+ <braunr> ok
+ <braunr> the original code is stupid enough to use fictitious pages all the
+ time, you probably have to do the same
+ <mcsim> hm... Attributes will be useless, pager should tell something about
+ pages, that it is not going to supply.
+ <braunr> yes
+ <braunr> that's what null is for
+ <mcsim> Not null, null is error.
+ <braunr> one problem i can think of is making sure the kernel doesn't
+ interpret missing as error
+ <braunr> right
+ <mcsim> I think better have special value for mo_data_error
+ <braunr> probably
+
+
+### IRC, freenode, #hurd, 2012-08-20
+
+ <antrik> braunr: I think it's useful to allow supplying the data in several
+ batches. the kernel should *not* assume that any data missing in the
+ first batch won't be supplied later.
+ <braunr> antrik: it really depends
+ <braunr> i personally prefer synchronous approaches
+ <antrik> demanding that all data is supplied at once could actually turn
+ readahead into a performance killer
+ <mcsim> antrik: Why? The only drawback I see is higher response time for
+ page fault, but it also leads to reduced overhead.
+ <braunr> that's why "it depends"
+ <braunr> mcsim: it brings benefit only if enough preloaded pages are
+ actually used to compensate for the time it took the pager to provide
+ them
+ <braunr> which is the case for many workloads (including sequential access,
+ which is the common case we want to optimize here)
+ <antrik> mcsim: the overhead of an extra RPC is negligible compared to
+ increased latencies when dealing with slow backing stores (such as disk
+ or network)
+ <mcsim> antrik: also many replies lead to fragmentation, while in one reply
+ all data is gathered in one bunch. If all data is placed consecutively,
+ then it may be transferred faster next time.
+ <braunr> mcsim: what kind of fragmentation ?
+ <antrik> I really really don't think it's a good idea for the page to hold
+ back the first page (which is usually the one actually blocking) while
+ it's still loading some other pages (which will probably be needed only
+ in the future anyways, if at all)
+ <antrik> err... for the pager to hold back
+ <braunr> antrik: then all pagers should be changed to handle asynchronous
+ data supply
+ <braunr> it's a bit late to change that now
+ <mcsim> there could be two cases of data placement in backing store: 1/ all
+ asked data is placed consecutively; 2/ it is spread among backing
+ store. If pager gets data in one message it is more likely to place it
+ consecutively. So to have data consecutive in each pager, each pager has
+ to try to send data in one message. Having data placed consecutively is
+ important, since reading such data is much faster.
+ <braunr> mcsim: you're confusing things ..
+ <braunr> or you're not telling them properly
+ <mcsim> Ok. Let me try one more time
+ <braunr> since you're working *only* on pagein, not pageout, how do you
+ expect spread pages being sent in a single message be better than
+ multiple messages ?
+ <mcsim> braunr: I think about future :)
+ <braunr> ok
+ <braunr> but antrik is right, paging in too much can reduce performance
+ <braunr> so the default policy should be adjusted for both the worst case
+ (one page) and the average/best (some/mane contiguous pages)
+ <braunr> through measurement ideally
+ <antrik> mcsim: BTW, I still think implementing clustered pageout has
+ higher priority than implementing madvise()... but if the latter is less
+ work, it might still make sense to do it first of course :-)
+ <braunr> many*
+ <braunr> there aren't many users of madvise, true
+ <mcsim> antrik: Implementing madvise I expect to be very simple. It should
+ just translate call to vm_advise
+ <antrik> well, that part is easy of course :-) so you already implemented
+ vm_advise itself I take it?
+ <mcsim> antrik: Yes, that was also quite easy.
+ <antrik> great :-)
+ <antrik> in that case it would be silly of course to postpone implementing
+ the madvise() wrapper. in other words: never mind my remark about
+ priorities :-)
+
+
+## IRC, freenode, #hurd, 2012-09-03
+
+ <mcsim> I try a test with ext2fs. It works, then I just recompile ext2fs
+ and it stops working, then I recompile it again several times and each
+ time the result is unpredictable.
+ <braunr> sounds like a concurrency issue
+ <mcsim> I can run the same test several times and ext2 works until I
+ recompile it. That's the problem. Could that be concurrency too?
+ <braunr> mcsim: without bad luck, yes, unless "several times" is a lot
+ <braunr> like several dozens of tries
+
+
+## IRC, freenode, #hurd, 2012-09-04
+
+ <mcsim> hello. I want to tell that ext2fs translator, that I work on,
+ replaced for my system old variant that processed only single pages
+ requests. And it works with partitions bigger than 2 Gb.
+ <mcsim> Probably I'm not far from the end.
+ <mcsim> But it's worth to mention that I didn't fix that nasty bug that I
+ told yesterday about.
+ <mcsim> braunr: That bug sometimes appears after recompilation of ext2fs
+ and always disappears after sync or reboot. Now I'm going to finish
+ defpager and test other translators.
+
+
+## IRC, freenode, #hurd, 2012-09-17
+
+ <mcsim> braunr: hello. Do you remember that you said that pager has to
+ inform kernel about appropriate cluster size for readahead?
+ <mcsim> I don't understand how kernel store this information, because it
+ does not know about such unit as "pager".
+ <mcsim> Can you give me an advice about how this could be implemented?
+ <youpi> mcsim: it can store it in the object
+ <mcsim> youpi: It too big overhead
+ <mcsim> youpi: at least from my pow
+ <mcsim> *pov
+ <braunr> mcsim: we discussed this already
+ <braunr> mcsim: there is no "pager" entity in the kernel, which is a defect
+ from my PoV
+ <braunr> mcsim: the best you can do is follow what the kernel already does
+ <braunr> that is, store this property per object
+ <braunr> we don't care much about the overhead for now
+ <braunr> my guess is there is already some padding, so the overhead is
+ likely to be amortized by this
+ <braunr> like youpi said
+ <mcsim> I remember that discussion, but I didn't get then whether there
+ should be only one or two values for all policies. Or each policy should
+ have its own values?
+ <mcsim> braunr: ^
+ <braunr> each policy should have its own values, which means it can be
+ implemented with a simple static array somewhere
+ <braunr> the information in each object is a policy selector, such as an
+ index in this static array
+ <mcsim> ok
+ <braunr> mcsim: if you want to minimize the overhead, you can make this
+ selector a char, and place it near another char member, so that you use
+ space that was previously used as padding by the compiler
+ <braunr> mcsim: do you see what i mean ?
+ <mcsim> yes
+ <braunr> good
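
A minimal sketch of the layout braunr describes above, assuming a simplified stand-in for `struct vm_object` (the real structure has many more members). The names `advice_policy`, `policy_table`, `toy_object`, `object_policy` and the page counts are hypothetical and chosen only for illustration. It shows a one-byte policy selector placed next to an existing `char` member, so that it occupies bytes the compiler previously used as padding, and is used as an index into a static, global policy array:

```c
#include <stddef.h>

/* Hypothetical policy identifiers (not gnumach names). */
enum { POLICY_DEFAULT, POLICY_RANDOM, POLICY_SEQUENTIAL, POLICY_COUNT };

/* Per-policy values; the numbers are placeholders, not measured. */
struct advice_policy {
    unsigned int cluster_pages;
};

/* One global, static table shared by every object. */
static const struct advice_policy policy_table[POLICY_COUNT] = {
    [POLICY_DEFAULT]    = { 4 },
    [POLICY_RANDOM]     = { 1 },
    [POLICY_SEQUENTIAL] = { 16 },
};

/* Simplified stand-in for an object before the change. */
struct toy_object_old {
    void  *pager_port;
    char   flag;      /* followed by padding up to the alignment of size */
    size_t size;
};

/* Same object with the selector stored in the former padding. */
struct toy_object {
    void         *pager_port;
    char          flag;
    unsigned char policy;   /* index into policy_table */
    size_t        size;
};

/* Resolve the per-object selector to the shared policy entry. */
static const struct advice_policy *object_policy(const struct toy_object *obj)
{
    return &policy_table[obj->policy];
}
```

On common ABIs both structures have the same size, which is the point of placing the selector next to the other `char` member.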
+
+
+## IRC, freenode, #hurd, 2012-09-17
+
+ <mcsim> hello. May I add function krealloc to slab.c?
+ <braunr> mcsim: what for ?
+ <mcsim> braunr: It is quite useful for creating dynamic arrays
+ <braunr> you don't want dynamic arrays
+ <mcsim> why?
+ <braunr> they're expensive
+ <braunr> try other data structures
+ <mcsim> more expensive than linked lists?
+ <braunr> depends
+ <braunr> but linked lists aren't the only other alternative
+ <braunr> that's why btrees and radix trees (basically trees of arrays)
+ exist
+ <braunr> the best general purpose data structure we have in mach is the red
+ black tree currently
+ <braunr> but always think about what you want to do with it
+ <mcsim> I want to store there sets of sizes for different memory
+ policies. I don't expect this array to be big. But for sure I can use
+ rbtree for it.
+ <braunr> why not a static array ?
+ <braunr> arrays are perfect for known data sizes
+ <mcsim> I expect from pager to supply its own sizes. So at the beginning in
+ this array is only default policy. When pager wants to supply it own
+ policy kernel lookups table of advice. If this policy is new set of sizes
+ then kernel creates new entry in table of advice.
+ <braunr> that would mean one set of sizes for each object
+ <braunr> why don't you make things simple first ?
+ <mcsim> Object stores only pointer to entry in this table.
+ <braunr> but there is no pager object shared by memory objects in the
+ kernel
+ <mcsim> I mean struct vm_object
+ <braunr> so that's what i'm saying, one set per object
+ <braunr> it's useless overhead
+ <braunr> i would really suggest using a global set of policies for now
+ <mcsim> Probably, I don't understand you. Where do you want to store this
+ static array?
+ <braunr> it's a global one
+ <mcsim> "for now"? It is not a problem to implement a table for local
+ advice, using either rbtree or dynamic array.
+ <braunr> it's useless overhead
+ <braunr> and it's not a single integer, you want a whole container per
+ object
+ <braunr> don't do anything fancy unless you know you really want it
+ <braunr> i'll link the netbsd code again as a very good example of how to
+ implement global policies that work more than decently for every file
+ system in this OS
+ <braunr>
+ http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/uvm/uvm_fault.c?rev=1.194&content-type=text/x-cvsweb-markup&only_with_tag=MAIN
+ <braunr> look for uvmadvice
+ <mcsim> But different translators have different demands. Thus changing of
+ global policy for one translator would have impact on behavior of another
+ one.
+ <braunr> i understand
+ <braunr> this isn't l4, or anything experimental
+ <braunr> we want something that works well for us
+ <mcsim> And this is acceptable?
+ <braunr> until you're able to demonstrate we need different policies, i'd
+ recommend not making things more complicated than they already are and
+ need to be
+ <braunr> why wouldn't it ?
+ <braunr> we've been discussing this a long time :/
+ <mcsim> because every process runs in isolated environment and the fact
+ that there is something outside this environment, that has no rights to
+ do that, does it surprises me.
+ <braunr> ?
+ <mcsim> ok. let me dip in uvm code. Probably my questions disappear
+ <braunr> i don't think it will
+ <braunr> you're asking about the system design here, not implementation
+ details
+ <braunr> with l4, there are as you'd expect well defined components
+ handling policies for address space allocation, or paging, or whatever
+ <braunr> but this is mach
+ <braunr> mach has a big shared global vm server with in kernel policies for
+ it
+ <braunr> so it's ok to implement a global policy for this
+ <braunr> and let's be pragmatic, if we don't need complicated stuff, why
+ would we waste time on this ?
+ <mcsim> It is not complicated.
+ <braunr> retaining a whole container for each object, whereas they're all
+ going to contain exactly the same stuff for years to come seems overly
+ complicated for me
+ <mcsim> I'm not going to create separate container for each object.
+ <braunr> i'm not following you then
+ <braunr> how can pagers upload their sizes in the kernel ?
+ <mcsim> I'm going to create a new container only for combination of cluster
+ sizes that are not present in table of advice.
+ <braunr> that's equivalent
+ <braunr> you're ruling out the default set, but that's just an optimization
+ <braunr> whenever a file system decides to use other sizes, the problem
+ will arise
+ <mcsim> Before creating a container I'm going to lookup a table. And only
+ then create
+ <braunr> a table ?
+ <mcsim> But there will be the same container for a huge bunch of objects
+ <braunr> how do you select it ?
+ <braunr> if it's a per pager container, remember there is no shared pager
+ object in the kernel, only ports to external programs
+ <mcsim> I'll give an example
+ <mcsim> Suppose there are only two policies. At the beginning we have table
+ {{random = 4096, sequential = 8192}}. Then pager 1 wants to add new
+ policy where random cluster size is 8192. He asks kernel to create it and
+ after this table will be following: {{random = 4096, sequential = 8192},
+ {random = 8192, sequential = 8192}}. If pager 2 wants to create the same
+ policy as pager 1, kernel will look up the table and will not create new
+ entry. So the table will be the same.
+ <mcsim> And each object has link to appropriate table entry
+ <braunr> i'm not sure how this can work
+ <braunr> how can pagers 1 and 2 know the sizes are the same for the same
+ policy ?
+ <braunr> (and actually they shouldn't)
+ <mcsim> For faster lookup there will be create hash keys for each entry
+ <braunr> what's the lookup key ?
+ <mcsim> They do not know
+ <mcsim> The kernel knows
+ <braunr> then i really don't understand
+ <braunr> and how do you select sizes based on the policy ?
+ <braunr> and how do you remove unused entries ?
+ <braunr> (ok this can be implemented with a simple ref counter)
+ <mcsim> "and how do you select sizes based on the policy ?" you mean at
+ page fault?
+ <braunr> yes
+ <mcsim> entry or object keeps pointer to appropriate entry in the table
+ <braunr> ok your per object data is a pointer to the table entry and the
+ policy is the index inside
+ <braunr> so you really need a ref counter there
+ <mcsim> yes
+ <braunr> and you need to maintain this table
+ <braunr> for me it's uselessly complicated
+ <mcsim> but this keeps design clear
+ <braunr> not for me
+ <braunr> i don't see how this is clearer
+ <braunr> it's just more powerful
+ <braunr> a power we clearly don't need now
+ <braunr> and in the following years
+ <braunr> in addition, i'm very worried about the potential problems this
+ can introduce
+ <mcsim> In fact I don't feel comfortable from the thought that one
+ translator can impact on behavior of another.
+ <braunr> simple example: the table is shared, it needs a lock, other data
+ structures you may have added in your patch may also need a lock
+ <braunr> but our locks are noop for now, so you just can't be sure there is
+ no deadlock or other issues
+ <braunr> and adding smp is a *lot* more important than being able to select
+ precisely policy sizes that we're very likely not to change a lot
+ <braunr> what do you mean by "one translator can impact another" ?
+ <mcsim> As I understand your idea (I haven't read uvm code yet) that there
+ is a global table of cluster sizes for different policies. And every
+ translator can change values in this table. That is what I mean under one
+ translator will have an impact on another one.
+ <braunr> absolutely not
+ <braunr> translators *can't* change sizes
+ <braunr> the sizes are completely static, assumed to be fit all
+ <braunr> -be
+ <braunr> it's not optimial but it's very simple and effective in practice
+ <braunr> optimal*
+ <braunr> and it's not a table of cluster sizes
+ <braunr> it's a table of pages before/after the faulted one
+ <braunr> this reflects the fact that in mach, virtual memory (implementation
+ and policy) is in the kernel
+ <braunr> translators must not be able to change that
+ <braunr> let's talk about pagers here, not translators
+ <mcsim> Finally I got you. This is an acceptable tradeoff.
+ <braunr> it took some time :)
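
The "table of pages before/after the faulted one" can be sketched as follows. This is only an illustration in the spirit of the `uvmadvice` table braunr points to, not code from NetBSD or gnumach: the names `advice_window`, `advice_table`, `page_range` and `fault_window` are hypothetical, and the page counts are placeholders. The sketch assumes `object_pages > 0` and `fault_page < object_pages`.

```c
#include <stddef.h>

/* Hypothetical advice identifiers. */
enum { ADV_NORMAL, ADV_RANDOM, ADV_SEQUENTIAL, ADV_COUNT };

/* Pages to bring in before and after the faulted page. */
struct advice_window {
    unsigned int nback;
    unsigned int nforw;
};

/* Static and global: pagers cannot modify it. Placeholder values. */
static const struct advice_window advice_table[ADV_COUNT] = {
    [ADV_NORMAL]     = { 3, 4 },
    [ADV_RANDOM]     = { 0, 0 },
    [ADV_SEQUENTIAL] = { 8, 7 },
};

struct page_range {
    size_t first;   /* index of the first page to request */
    size_t count;   /* number of pages, always at least 1 */
};

/* Compute the pages to request around fault_page, clamped to the
   object bounds [0, object_pages). */
static struct page_range fault_window(size_t fault_page, size_t object_pages,
                                      int advice)
{
    const struct advice_window *w = &advice_table[advice];
    size_t first = fault_page > w->nback ? fault_page - w->nback : 0;
    size_t last = fault_page + w->nforw;          /* inclusive */

    if (last >= object_pages)
        last = object_pages - 1;

    return (struct page_range){ first, last - first + 1 };
}
```

With these placeholder values, a fault on page 10 of a 100-page object under `ADV_NORMAL` requests pages 7 to 14, while `ADV_RANDOM` requests only the faulted page.
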
+ <braunr> just to clear something
+ <braunr> 20:12 < mcsim> For faster lookup there will be create hash keys
+ for each entry
+ <braunr> i'm not sure i understand you here
+ <mcsim> To find out if there is such policy (set of sizes) in the table we
+ can lookup every entry and compare each value. But it is better to create
+ a hash value for set and thus find equal policies.
+ <braunr> first, i'm really not comfortable with hash tables
+ <braunr> they really need careful configuration
+ <braunr> next, as we don't expect many entries in this table, there is
+ probably no need for this overhead
+ <braunr> remember that one property of tables is locality of reference
+ <braunr> you access the first entry, the processor automatically fills a
+ whole cache line
+ <braunr> so if your table fits on just a few, it's probably faster to
+ compare entries completely than to jump around in memory
+ <mcsim> But we can sort hash keys, and in this way find policies quickly.
+ <braunr> cache misses are way slower than computation
+ <braunr> so unless you have massive amounts of data, don't use an optimized
+ container
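
To illustrate the point about small tables, here is a hedged sketch of the lookup mcsim describes, done with a plain linear scan and full comparison of each entry instead of hash keys. It is not the design braunr recommends (he still prefers static, predetermined values) and it ignores locking entirely. The names `size_set`, `size_table`, `size_set_get` and the fixed capacity `MAX_SETS` are assumptions made for this example.

```c
#include <stddef.h>

#define MAX_SETS 8   /* hypothetical capacity; the table is expected to stay tiny */

struct size_set {
    unsigned int random;       /* cluster size for random access */
    unsigned int sequential;   /* cluster size for sequential access */
    unsigned int refs;         /* number of objects pointing at this entry */
};

/* The table starts with the default set only. */
static struct size_set size_table[MAX_SETS] = { { 4096, 8192, 1 } };
static size_t size_table_len = 1;

/* Return the index of an equal set, adding it if absent.
   Takes one reference on the entry. Returns -1 if the table is full. */
static int size_set_get(unsigned int random, unsigned int sequential)
{
    /* Entries are contiguous, so scanning a handful of them touches
       only a few cache lines. */
    for (size_t i = 0; i < size_table_len; i++) {
        if (size_table[i].random == random
            && size_table[i].sequential == sequential) {
            size_table[i].refs++;
            return (int)i;
        }
    }

    if (size_table_len == MAX_SETS)
        return -1;

    size_table[size_table_len] = (struct size_set){ random, sequential, 1 };
    return (int)size_table_len++;
}
```

Two pagers asking for `{8192, 8192}` end up sharing entry 1, as in mcsim's example, without either of them knowing about the other.
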
+ <mcsim> (20:38:53) braunr: that's why btrees and radix trees (basically
+ trees of arrays) exist
+ <mcsim> and what will be the key?
+ <braunr> i'm not saying to use a tree instead of a hash table
+ <braunr> i'm saying, unless you have many entries, just use a simple table
+ <braunr> and since pagers don't add and remove entries from this table
+ often, it's on case reallocation is ok
+ <braunr> one*
+ <mcsim> So here dynamic arrays fit the most?
+ <braunr> probably
+ <braunr> it really depends on the number of entries and the write ratio
+ <braunr> keep in mind current processors have 32-bits or (more commonly)
+ 64-bits cache line sizes
+ <mcsim> bytes probably?
+ <braunr> yes bytes
+ <braunr> but i'm not willing to add a realloc like call to our general
+ purpose kernel allocator
+ <braunr> i don't want to make it easy for people to rely on it, and i hope
+ the lack of it will make them think about other solutions instead :)
+ <braunr> and if they really want to, they can just use alloc/free
+ <mcsim> Under "other solutions" you mean trees?
+ <braunr> i mean anything else :)
+ <braunr> lists are simple, trees are elegant (but add non negligible
+ overhead)
+ <braunr> i like trees because they truly "gracefully" scale
+ <braunr> but they're still O(log n)
+ <braunr> a good hash table is O(1), but must be carefully measured and
+ adjusted
+ <braunr> there are many other data structures, many of them you can find in
+ linux
+ <braunr> but in mach we don't need a lot of them
+ <mcsim> Your favorite data structures are lists and trees. Next, what
+ should you claim, is that lisp is your favorite language :)
+ <braunr> functional programming should eventually rule the world, yes
+ <braunr> i wouldn't count lists are my favorite, which are really trees
+ <braunr> as*
+ <braunr> there is a reason why red black trees back higher level data
+ structures like vectors or maps in many common libraries ;)
+ <braunr> mcsim: hum but just to make it clear, i asked this question about
+ hashing because i was curious about what you had in mind, i still think
+ it's best to use static predetermined values for policies
+ <mcsim> braunr: I understand this.
+ <braunr> :)
+ <mcsim> braunr: Yeah. You should be cautious with me :)
+
+
+## IRC, freenode, #hurd, 2012-09-21
+
+ <antrik> mcsim: there is only one cluster size per object -- it depends on
+ the properties of the backing store, nothing else.
+ <antrik> (while the readahead policies depend on the use pattern of the
+ application, and thus should be selected per mapping)
+ <antrik> but I'm still not convinced it's worthwhile to bother with cluster
+ size at all. do other systems even do that?...
+
+
+## IRC, freenode, #hurd, 2012-09-23
+
+ <braunr> mcsim: how long do you think it will take you to polish your gsoc
+ work ?
+ <braunr> (and when before you begin that part actually, because we'll have to
+ review the whole stuff prior to polishing it)
+ <mcsim> braunr: I think about 2 weeks
+ <mcsim> But you may already start review it, if you're intended to do it
+ before I'll rearrange commits.
+ <mcsim> Gnumach, ext2fs and defpager are ready. I just have to polish the
+ code.
+ <braunr> mcsim: i don't know when i'll be able to do that
+ <braunr> so expect a few weeks on my (our) side too
+ <mcsim> ok
+ <braunr> sorry for being slow, that's how hurd development is :)
+ <mcsim> What should I do with libc patch that adds madvise support?
+ <mcsim> Post it to bug-hurd?
+ <braunr> hm probably the same i did for pthreads, create a topic branch in
+ glibc.git
+ <mcsim> there is only one commit
+ <braunr> yes
+ <braunr> (mine was a one liner :p)
+ <mcsim> ok
+ <braunr> it will probably be a debian patch before going into glibc anyway,
+ just for making sure it works
+ <mcsim> But according to term. I expect that my study begins in a week and
+ I'll have to do some stuff then, so actually probably I'll need a week
+ more.
+ <braunr> don't worry, that's expected
+ <braunr> and that's the reason why we're slow
+ <mcsim> And what should I do with large store patch?
+ <braunr> hm good question
+ <braunr> what did you do for now ?
+ <braunr> include it in your work ?
+ <braunr> that's what i saw iirc
+ <mcsim> Yes. It consists of two parts.
+ <braunr> the original part and the modificaionts ?
+ <braunr> modifications*
+ <braunr> i think youpi would know better about that
+ <mcsim> First (small) adds notification to libpager interface and second
+ one adds support for large stores.
+ <braunr> i suppose we'll probably merge the large store patch at some point
+ anyway
+ <mcsim> Yes both original and modifications
+ <braunr> good
+ <mcsim> I'll split these parts to different commits and I'll try to make
+ support for large stores independent from other work.
+ <braunr> that would be best
+ <braunr> if you can make it so that, by omitting (or including) one patch,
+ we can add your patches to the debian package, it would be great
+ <braunr> (only with regard to the large store change, not other potential
+ smaller conflicts)
+ <mcsim> braunr: I also found several bugs in defpager, that I haven't fixed
+ since winter.
+ <braunr> oh
+ <mcsim> seems nobody hasn't expect them.
+ <braunr> i'm very interested in those actually (not too soon because it
+ concerns my work on pageout, which is postponed after pthreads and
+ select)
+ <mcsim> ok. then I'll do it first.
+
+
+## IRC, freenode, #hurd, 2012-09-24
+
+ <braunr> mcsim: what is vm_get_advice_info ?
+ <mcsim> braunr: hello. It should supply some machine specific parameters
+ regarding clustered reading. At the moment it supplies only maximal
+ possible size of cluster.
+ <braunr> mcsim: why such a need ?
+ <mcsim> It is used by defpager, as it can't allocate memory dynamically and
+ every thread has to allocate maximal size beforehand
+ <braunr> mcsim: i see
+
+
+## IRC, freenode, #hurd, 2012-10-05
+
+ <mcsim> braunr: I think it's not worth to separate large store patch for
+ ext2 and patch for moving it to new libpager interface. Am I right?
+ <braunr> mcsim: it's worth separating, but not creating two versions
+ <braunr> i'm not sure what you mean here
+ <mcsim> First, I applied large store patch, and then I was changing patched
+ code, to make it work with new libpager interface. So changes to make
+ ext2 work with new interface depend on large store patch.
+ <mcsim> braunr: ^
+ <braunr> mcsim: you're not forced to make each version resulting from a new
+ commit work
+ <braunr> but don't make big commits
+ <braunr> so if changing an interface requires its users to be updated
+ twice, it doesn't make sense to do that
+ <braunr> just update the interface cleanly, you'll have one or more commits
+ that produce intermediate version that don't build, that's ok
+ <braunr> then in another, separate commit, adjust the users
+ <mcsim> braunr: The only user now is ext2. And the problem with ext2 is
+ that I updated not the version from git repository, but the version, that
+ I've got after applying the large store patch. So in other words my
+ question is follows: should I make a commit that moves to new interface
+ version of ext2fs without large store patch?
+ <braunr> you're asking if you can include the large store patch in your
+ work, and by extension, in the main branch
+ <braunr> i would say yes, but this must be discussed with others
diff --git a/open_issues/pfinet_vs_system_time_changes.mdwn b/open_issues/pfinet_vs_system_time_changes.mdwn
index 46705047..09b00d30 100644
--- a/open_issues/pfinet_vs_system_time_changes.mdwn
+++ b/open_issues/pfinet_vs_system_time_changes.mdwn
@@ -11,14 +11,16 @@ License|/fdl]]."]]"""]]
[[!tag open_issue_hurd]]
-IRC, unknown channel, unknown date.
+
+# IRC, unknown channel, unknown date
<grey_gandalf> I did a sudo date...
<grey_gandalf> and the machine hangs
-This was very likely a misdiagnosis:
+This was very likely a misdiagnosis.
+
-IRC, freenode, #hurd, 2011-03-25:
+# IRC, freenode, #hurd, 2011-03-25
<tschwinge> antrik: I suspect it'S some timing stuff in pfinet that perhaps
uses absolute time, and somehow wildely gets confused?
@@ -42,7 +44,8 @@ IRC, freenode, #hurd, 2011-03-25:
wrap-around, and thus the same result.)
<tschwinge> Yes.
-IRC, freenode, #hurd, 2011-10-26:
+
+# IRC, freenode, #hurd, 2011-10-26
<antrik> anyways, when ntpdate adjusts to the past, the connections hang,
roughly for the amount of time being adjusted
@@ -50,7 +53,8 @@ IRC, freenode, #hurd, 2011-10-26:
<antrik> (well, if it's long enough, they probably timeout on the other
side...)
-IRC, freenode, #hurd, 2011-10-27:
+
+# IRC, freenode, #hurd, 2011-10-27
<antrik> oh, another interesting thing I observed is that the the subhurd
pfinet did *not* drop the connection... only the main Hurd one. I thought
@@ -60,7 +64,8 @@ IRC, freenode, #hurd, 2011-10-27:
where I set the date is affected, and not the pfinet in the other
instance
-IRC, freenode, #hurd, 2012-06-28:
+
+# IRC, freenode, #hurd, 2012-06-28
<bddebian> great, now setting the date/time fucked my machine
<pinotree> yes, we lack a monotonic clock
@@ -80,3 +85,17 @@ IRC, freenode, #hurd, 2012-06-28:
it fucked me because I now cannot get to it.. :)
<antrik> bddebian: that's odd... you should be able to just log in again
IIRC
+
+
+# IRC, freenode, #hurd, 2012-07-29
+
+ <antrik> pfinet can't cope with larger system time changes because it can't
+ use a monotonic clock
+
+[[clock_gettime]].
+
+ <braunr> well when librt becomes easily usable everywhere (if it's
+ possible), it will be quite easy to work around this issue
+ <pinotree> yes and no, you just need a monotonic clock and clock_gettime
+ able to use it
+ <braunr> why "no" ?
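
As a hedged illustration of what pinotree refers to, the sketch below measures a timeout against `CLOCK_MONOTONIC` through the standard POSIX `clock_gettime` call, so that changing the system date does not affect it. It is not pfinet code and assumes a system where `CLOCK_MONOTONIC` is available; the helper names `elapsed_ms` and `timer_expired` are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Milliseconds between two readings of the same clock
   (the sub-millisecond part is truncated). */
static long long elapsed_ms(const struct timespec *start,
                            const struct timespec *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000
         + (end->tv_nsec - start->tv_nsec) / 1000000;
}

/* Return 1 once timeout_ms have passed since start, 0 if not yet,
   -1 if the monotonic clock cannot be read. */
static int timer_expired(const struct timespec *start, long long timeout_ms)
{
    struct timespec now;

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;

    return elapsed_ms(start, &now) >= timeout_ms;
}
```

The caller records `start` with `clock_gettime(CLOCK_MONOTONIC, &start)` when arming the timer. A timer built on wall-clock time instead would stall or fire early when the date is set.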
diff --git a/open_issues/robustness.mdwn b/open_issues/robustness.mdwn
index d32bd509..1f8aa0c6 100644
--- a/open_issues/robustness.mdwn
+++ b/open_issues/robustness.mdwn
@@ -62,3 +62,68 @@ License|/fdl]]."]]"""]]
<antrik> well, I'm not aware of the Minix implementation working across
reboots. the one I have in mind based on a generic session management
infrastructure should though :-)
+
+
+## IRC, freenode, #hurd, 2012-12-06
+
+ <Tekk_> out of curiosity, would it be possible to strap on a resurrection
+ server to hurd?
+ <Tekk_> in the future, that is
+ <braunr> sure
+ <Tekk_> cool :)
+ <braunr> but this requires things like persistence
+ <spiderweb> like a reincarnation server?
+ <braunr> it's a lot of work, with non negligible overhead
+ <Tekk_> spiderweb: yes, exactly. I didn't remember tanenbaum's wording on
+ that
+ <braunr> i'm pretty sure most people would be against that
+ <spiderweb> braunr: why so?
+ <Tekk_> it was actually the feature that convinced me that ukernels were a
+ good idea
+ <Tekk_> spiderweb: because then you need a process that keeps track of all
+ the other servers
+ <Tekk_> and they have to be replying to "useless" pings to see if they're
+ still alive
+ <braunr> spiderweb: the hurd community isn't looking for a system reliable
+ in critical environments
+ <braunr> just a general purpose system
+ <braunr> and persistence requires regular data saves
+ <braunr> it's expensive
+ <Tekk_> as well as that
+ <braunr> we already have performance problems because of the nature of the
+ system, adding more without really looking for the benefits is useless
+ <spiderweb> so you can't theoretically have both?
+ <braunr> persistence and performance ?
+ <braunr> it's hard
+ <Tekk_> spiderweb: you need to modify the other translators to be
+ persistent
+ <braunr> only the ones you care about actually
+ <braunr> but it's just better to make the critical servers very stable
+ <Tekk_> so it's not just turning on and off the reincarnation
+ <braunr> (there isn't that much code there)
+ <braunr> and the other servers restartable
+ <mcsim> braunr: I think that if there will be aim to make something like
+ resurrection server then it will be necessary to rewrite most servers to make
+ them stateless, isn't it?
+ <braunr> that's a lot easier and already works with non essential passive
+ translators
+ <Tekk_> mcsim: pretty much
+ <braunr> mcsim: only those you care about
+ <braunr> mcsim: the proc auth exec servers for example, perhaps the file
+ system servers that can act as root fs, but the others would simply be
+ restarted by the passive translator mechanism
+ <spiderweb> what about restarting device drivers, that would be simple
+ right?
+ <braunr> that's perfectly doable, yes
+ <spiderweb> (being an OS newbie) - it does seem to me that the whole
+ reincarnation server concept could quite possibly be a band aid.
+ <braunr> spiderweb: no it really works
+ <braunr> many systems do that actually
+ <braunr> let me give you a link
+ <braunr>
+ http://ftp.sceen.net/curios_improving_reliability_through_operating_system_structure.pdf
+ <braunr> it's a bit old, but there is a review of systems aiming at
+ resilience and how they achieve part of it
+ <spiderweb> neat, thanks
+ <braunr> actually it's not that old at all
+ <braunr> around 2007
diff --git a/open_issues/select.mdwn b/open_issues/select.mdwn
index abec304d..778af530 100644
--- a/open_issues/select.mdwn
+++ b/open_issues/select.mdwn
@@ -215,6 +215,1422 @@ IRC, unknown channel, unknown date:
<youpi> it's better than nothing yes
+# IRC, freenode, #hurd, 2012-07-21
+
+ <braunr> damn, select is actually completely misdesigned :/
+ <braunr> iiuc, it makes servers *block*, in turn :/
+ <braunr> can't be right
+ <braunr> ok i understand it better
+ <braunr> yes, timeouts should be passed along with the other parameters to
+ correctly implement non blocking select
+ <braunr> (or the round-trip io_select should only ask for notification
+ requests instead of making a server thread block, but this would require
+ even more work)
+ <braunr> adding the timeout in the io_select call should be easy enough for
+ whoever wants to take over a not-too-complicated-but-not-one-liner-either
+ task :)
+ <antrik> braunr: why is a blocking server thread a problem?
+ <braunr> antrik: handling the timeout at client side while server threads
+ block is the problem
+ <braunr> the timeout must be handled along with blocking obviously
+ <braunr> so you either do it at server side when async ipc is available,
+ which is the case here
+ <braunr> or request notifications (synchronously) and block at client side,
+ waiting for those notifications
+ <antrik> braunr: are you saying the client has a receive timeout, but when
+ it elapses, the server thread keeps on blocking?...
+ <braunr> antrik: no i'm referring to the non-blocking select issue we have
+ <braunr> antrik: the client doesn't block in this case, whereas the servers
+ do
+ <braunr> which obviously doesn't work ..
+ <braunr> see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=79358
+ <braunr> this is the reason why vim (and probably others) are slow on the
+ hurd, while not consuming any cpu
+ <braunr> the current work around is that whenevever a non-blocking select
+ is done, it's transformed into a blocking select with the smallest
+ possible timeout
+ <braunr> whenever*
+ <antrik> braunr: well, note that the issue only began after fixing some
+ other select issue... it was fine before
+ <braunr> apparently, the issue was raised in 2000
+ <braunr> also, note that there is a delay between sending the io_select
+ requests and blocking on the replies
+ <braunr> when machines were slow, this delay could almost guarantee a
+ preemption between these steps, making the servers reply soon enough even
+ for a non blocking select
+ <braunr> the problem occurs when sending all the requests and checking for
+ replies is done before servers have a chance to send the reply
+ <antrik> braunr: I don't know what issue was raised in 2000, but I do know
+ that vim worked perfectly fine until last year or so. then some select
+ fix was introduced, which in turn broke vim
+ <braunr> antrik: could be the timeout rounding, Aug 2 2010
+ <braunr> hum but, the problem wasn't with vim
+ <braunr> vim does still work fine (in fact, glibc is patched to check some
+ well known process names and selectively fix the timeout)
+ <braunr> which is why vim is fast and view isn't
+ <braunr> the problem was with other services apparently
+ <braunr> and in order to fix them, that workaround had to be introduced
+ <braunr> i think it has nothing to do with the timeout rounding
+ <braunr> it must be the time when youpi added the patch to the debian
+ package
+ <antrik> braunr: the problem is that with the patch changing the timeout
+ rounding, vim got extremely slow. this is why the ugly hacky exception
+ was added later...
+ <antrik> after reading the report, I agree that the timeout needs to be
+ handled by the server. at least the timeout=0 case.
+ <pinotree> vim uses often 0-time selects to check whether there's input
+ <antrik> client-side handling might still be OK for other timeout settings
+ I guess
+ <antrik> I'm a bit ambivalent about that
+ <antrik> I tend to agree with Neal though: it really doesn't make much
+ sense to have a client-side watchdog timer for this specific call, while
+ for all other ones we trust the servers not to block...
+ <antrik> or perhaps not. for standard sync I/O, clients should expect that
+ an operation could take long (though not forever); but they might use
+ select() precisely to avoid long delays in I/O... so it makes some sense
+ to make sure that select() really doesn't delay because of a busy server
+ <antrik> OTOH, unless the server is actually broken (in which anything
+ could happen), a 0-time select should never actually block for an
+ extended period of time... I guess it's not wrong to trust the servers on
+ that
+ <antrik> pinotree: hm... that might explain a certain issue I *was*
+ observing with Vim on Hurd -- though I never really thought about it
+ being an actual bug, as opposed to just general Hurd sluggishness...
+ <antrik> but it makes sense now
+ <pinotree> antrik:
+ http://patch-tracker.debian.org/patch/series/view/eglibc/2.13-34/hurd-i386/local-select.diff
+ <antrik> so I guess we all agree that moving the select timeout to the
+ server is probably the most reasonable approach...
+ <antrik> braunr: BTW, I wouldn't really consider the sync vs. async IPC
+ cases any different. the client blocks waiting for the server to reply
+ either way...
+ <antrik> the only difference is that in the sync IPC case, the server might
+ want to take some special precaution so it doesn't have to block until
+ the client is ready to receive the reply
+ <antrik> but that's optional and not really select-specific I'd say
+ <antrik> (I'd say the only sane approach with sync IPC is probably for the
+ server never to wait -- if the client fails to set up for receiving the
+ reply in time, it loses...)
+ <antrik> and with the receive buffer approach in Viengoos, this can be done
+ really easy and nice :-)
+
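Editorial illustration (not part of the log): the 0-time select that pinotree mentions can be sketched with plain POSIX `select()`. The helper name `input_ready` is hypothetical and is not taken from vim or glibc; the sketch only shows what such a non-blocking probe looks like from the application side.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if fd is readable right now, 0 if it is
   not, and -1 on error.  A zeroed timeval asks select() to poll without
   sleeping. */
int input_ready(int fd)
{
    fd_set readfds;
    struct timeval zero = { 0, 0 };

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    int n = select(fd + 1, &readfds, NULL, NULL, &zero);
    if (n < 0)
        return -1;
    return n > 0 && FD_ISSET(fd, &readfds);
}
```

With the client-side timeout described in the log, a probe of this kind is where the race shows up: the call can return before any server has had a chance to send its reply.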
+
+## IRC, freenode, #hurd, 2012-07-22
+
+ <braunr> antrik: you can't block in servers with sync ipc
+ <braunr> so in this case, "select" becomes a request for notifications
+ <braunr> whereas with async ipc, you can, so it's less efficient to make a
+ full round trip just to ask for requests when you can just do async
+ requests (doing the actual blocking) and wait for any reply after
+ <antrik> braunr: I don't understand. why can't you block in servers with
+ async IPC?
+ <antrik> braunr: err... with sync IPC I mean
+ <braunr> antrik: because select operates on more than one fd
+ <antrik> braunr: and what does that have to do with sync vs. async IPC?...
+ <antrik> maybe you are thinking of endpoints here, which is a whole
+ different story
+ <antrik> traditional L4 has IPC ports bound to specific threads; so
+ implementing select requires a separate client thread for each
+ server. but that's not mandatory for sync IPC. Viengoos has endpoints not
+ bound to threads
+ <braunr> antrik: i don't know what "endpoint" means here
+ <braunr> but, you can't use sync IPC to implement select on multiple fds
+ (and thus possibly multiple servers) by blocking in the servers
+ <braunr> you'd block in the first and completely miss the others
+ <antrik> braunr: I still don't see why... or why async IPC would change
+ anything in that regard
+ <braunr> antrik: well, you call select on 3 fds, each implemented by
+ different servers
+ <braunr> antrik: you call a sync select on the first fd, obviously you'll
+ block there
+ <braunr> antrik: if it's async, you don't block, you just send the
+ requests, and wait for any reply
+ <braunr> like we do
+ <antrik> braunr: I think you might be confused about the meaning of sync
+ IPC. it doesn't in any way imply that after sending an RPC request you
+ have to block on some particular reply...
+ <youpi> antrik: what does sync mean then?
+ <antrik> braunr: you can have any number of threads listening for replies
+ from the various servers (if using an L4-like model); or even a single
+ thread, if you have endpoints that can listen on replies from different
+ sources (which was pretty much the central concern in the Viengoos IPC
+ design AIUI)
+ <youpi> antrik: I agree with your "so it makes some sense to make sure that
+ select() really doesn't delay because of a busy server" (for blocking
+ select) and "OTOH, unless the server is actually broken (in which
+ anything could happen), a 0-time select should never actually block" (for
+ non-blocking select)
+ <antrik> youpi: regarding the select, I was thinking out loud; the former
+ statement was mostly cancelled by my later conclusions...
+ <antrik> and I'm not sure the latter statement was quite clear
+ <youpi> do you know when it was?
+ <antrik> after rethinking it, I finally concluded that it's probably *not*
+ a problem to rely on the server to observe the timeout. if it's really
+ busy, it might take longer than the designated timeout (especially if
+ timeout is 0, hehe) -- but I don't think this is a problem
+ <antrik> and if it doesn't observe the timeout because it's
+ broken/malicious, that's not more problematic than any other RPC the
+ server doesn't handle as expected
+ <youpi> ok
+ <youpi> did somebody write down the conclusion "let's make select timeout
+ handled at server side" somewhere?
+ <antrik> youpi: well, neal already said that in a followup to the select
+ issue Debian bug... and after some consideration, I completely agree with
+ his reasoning (as does braunr)
+
+
+## IRC, freenode, #hurd, 2012-07-23
+
+ <braunr> antrik: i was meaning sync in the most common meaning, yes, the
+ client blocking on the reply
+ <antrik> braunr: I think you are confusing sync IPC with sync I/O ;-)
+ <antrik> braunr: by that definition, the vast majority of Hurd IPC would be
+ sync... but that's obviously not the case
+ <antrik> synchronous IPC means that send and receive happen at the same
+ time -- nothing more, nothing less. that's why it's called synchronous
+ <braunr> antrik: yes
+ <braunr> antrik: so it means the client can't continue unless he actually
+ receives
+ <antrik> in a pure sync model such as L4 or EROS, this means either the
+ sender or the receiver has to block, so synchronisation can happen. which
+ one is server and which one is client is completely irrelevant here --
+ this is about individual message transfer, not any RPC model on top of it
+ <braunr> in the case of select, i assume sender == client
+ <antrik> in Viengoos, the IPC is synchronous in the sense that transfer
+ from the send buffer to the receive buffer happens at the same time; but
+ it's asynchronous in the sense that the receiver doesn't necessarily have
+ to be actively waiting for the incoming message
+ <braunr> ok, i was talking about a pure sync model
+ <antrik> (though in most cases it will still do so...)
+ <antrik> braunr: BTW, in the case of select, the sender is *not* the
+ client. the reply is relevant here, not the request -- so the client is
+ the receiver
+ <antrik> (the select request is boring)
+ <braunr> sorry, i don't understand, you seem to dismiss the select request
+ for no valid reason
+ <antrik> I still don't see how sync vs. async affects the select reply
+ receive though... blocking seems the right approach in either case
+ <braunr> blocking is required
+ <braunr> but you either block in the servers, or in the client
+ <braunr> (and if blocking in the servers, the client also blocks)
+ <braunr> i'll explain how i see it again
+ <braunr> there are two approaches to implementing select
+ <braunr> 1/ send requests to all servers, wait for any reply, this is what
+ the hurd does
+ <braunr> but it's possible because you can send all the requests without
+ waiting for the replies
+ <braunr> 2/ send notification requests, wait for a notification
+ <braunr> this doesn't require blocking in the servers (so if you have many
+ clients, you don't need as many threads)
+ <braunr> i was wondering which approach was used by the hurd, and if it
+ made sense to change
+ <antrik> TBH I don't see the difference between 1) and 2)... whether the
+ message from the server is called an RPC reply or a notification is just
+ a matter of definition
+ <antrik> I think I see though what you are getting at
+ <antrik> with sync IPC, if the client sent all requests and only afterwards
+ started to listen for replies, the servers might need to block while
+ trying to deliver the reply because the client is not ready yet
+ <braunr> that's one thing yes
+ <antrik> but even in the sync case, the client can immediately wait for
+ replies to each individual request -- it might just be more complicated,
+ depending on the specifics of the IPC design
+ <braunr> what i mean by "send notification requests" is actually more than
+ just sending, it's a complete RPC
+ <braunr> and notifications are non-blocking, yes
+ <antrik> (with L4, it would require a separate client thread for each
+ server contacted... which is precisely why a different mechanism was
+ designed for Viengoos)
+ <braunr> seems weird though
+ <braunr> don't they have a portset like abstraction ?
+ <antrik> braunr: well, having an immediate reply to the request and a
+ separate notification later is just a waste of resources... the immediate
+ reply would have no information value
+ <antrik> no, in original L4 IPC is always directed to specific threads
+ <braunr> antrik: some could see the waste of resource as being the
+ duplication of the number of client threads in the server
+ <antrik> you could have one thread listening to replies from several
+ servers -- but then, replies can get lost
+ <braunr> i see
+ <antrik> (or the servers have to block on the reply)
+ <braunr> so, there are really no capabilities in the original l4 design ?
+ <antrik> though I guess in the case of select() it wouldn't really matter
+ if replies get lost, as long as at least one is handled... would just
+ require the listener thread to be separate from the thread sending the
+ requests
+ <antrik> braunr: right. no capabilities of any kind
+ <braunr> that was my initial understanding too
+ <braunr> thanks
+ <antrik> so I partially agree: in a purely sync IPC design, it would be
+ more complicated (but not impossible) to make sure the client gets the
+ replies without the server having to block while sending replies
+
+ <braunr> arg, we need hurd_condition_timedwait (and possibly
+ condition_timedwait) to cleanly fix io_select
+ <braunr> luckily, i still have my old patch for condition_timedwait :>
+ <braunr> bddebian: in order to implement timeouts in select calls, servers
+ now have to use a hurd_condition_timedwait function
+ <braunr> is it possible that a thread both gets canceled and timeout on a
+ wait ?
+ <braunr> looks unlikely to me
+
+ <braunr> hm, i guess the same kind of compatibility constraints exist for
+ hurd interfaces
+ <braunr> so, should we have an io_select1 ?
+ <antrik> braunr: I would use a more descriptive name: io_select_timeout()
+ <braunr> antrik: ah yes
+ <braunr> well, i don't really like the idea of having 2 interfaces for the
+ same call :)
+ <braunr> because all select should be select_timeout :)
+ <braunr> but ok
+ <braunr> antrik: actually, having two select calls may be better
+ <braunr> oh it's really minor, we don't care actually
+ <antrik> braunr: two select calls?
+ <braunr> antrik: one with a timeout and one without
+ <braunr> the glibc would choose at runtime
+ <antrik> right. that was the idea. like with most transitions, that's
+ probably the best option
+ <braunr> there is no need to pass the timeout value if it's not needed, and
+ it's easier to pass NULL this way
+ <antrik> oh
+ <antrik> nah, that would make the transition more complicated I think
+ <braunr> ?
+ <braunr> ok
+ <braunr> :)
+ <braunr> this way, it becomes very easy
+ <braunr> the existing io_select call moves into a select_common() function
+ <antrik> the old variant doesn't know that the server has to return
+ immediately; changing that would be tricky. better just use the new
+ variant for the new behaviour, and deprecate the old one
+ <braunr> and the entry points just call this common function with either
+ NULL or the given timeout
+ <braunr> no need to deprecate the old one
+ <braunr> that's what i'm saying
+ <braunr> and i don't understand "the old variant doesn't know that the
+ server has to return immediately"
+ <antrik> won't the old variant block indefinitely in the server if there
+ are no ready fds?
+ <braunr> yes it will
+ <antrik> oh, you mean using the old variant if there is no timeout value?
+ <braunr> yes
+ <antrik> well, I guess this would work
+ <braunr> well of course, the question is rather if we want this or not :)
+ <antrik> hm... not sure
+ <braunr> we need something to improve the process of changing our
+ interfaces
+ <braunr> it's really painful currently
+ <antrik> inside the servers, we probably want to use common code
+ anyways... so in the long run, I think it simplifies the code when we can
+ just drop the old variant at some point
+ <braunr> a lot of the work we need to do involves changing interfaces, and
+ we very often get to the point where we don't know how to do that and
+ hardly agree on a final version :
+ <braunr> :/
+ <braunr> ok but
+ <braunr> how do you tell the server you don't want a timeout ?
+ <braunr> a special value ? like { -1; -1 } ?
+ <antrik> hm... good point
+ <braunr> i'll do it that way for now
+ <braunr> it's the best way to test it
+ <antrik> which way you mean now?
+ <braunr> keeping io_select as it is, add io_select_timeout
+ <antrik> yeah, I thought we agreed on that part... the question is just
+ whether io_select_timeout should also handle the no-timeout variant going
+ forward, or keep io_select for that. I'm really not sure
+ <antrik> maybe I'll form an opinion over time :-)
+ <antrik> but right now I'm undecided
+ <braunr> i say we keep io_select
+ <braunr> anyway it won't change much
+ <braunr> we can just change that at the end if we decide otherwise
+ <antrik> right
+ <braunr> even passing special values is ok
+ <braunr> with a carefully written hurd_condition_timedwait, it's very easy
+ to add the timeouts :)
+ <youpi> antrik, braunr: I'm wondering, another solution is to add an
+ io_probe, i.e. the server has to return an immediate result, and the
+ client then just waits for all results, without timeout
+ <youpi> that'd be a mere addition in the glibc select() call: when timeout
+ is 0, use that, and otherwise use the previous code
+ <youpi> the good point is that it looks nicer in fs.defs
+ <youpi> are there bad points?
+ <youpi> (I don't have the whole issues in the mind now, so I'm probably
+ missing things)
+ <braunr> youpi: the bad point is duplicating the implementation maybe
+ <youpi> what duplication ?
+ <youpi> ah you mean for the select case
+ <braunr> yes
+ <braunr> although it would be pretty much the same
+ <braunr> that is, if probe only, don't enter the wait loop
+ <youpi> could that be just some ifs here and there?
+ <youpi> (though not making the code easier to read...)
+ <braunr> hm i'm not sure it's fine
+ <youpi> in that case io_select_timeout looks nicer indeed :)
+ <braunr> my problem with the current implementation is having the timeout
+ at the client side whereas the server side is doing the blocking
+ <youpi> I wonder how expensive a notification is, compared to blocking
+ <youpi> a blocking indeed needs a thread stack
+ <youpi> (and kernel thread stuff)
+ <braunr> with the kind of async ipc we have, it's still better to do it
+ that way
+ <braunr> and all the code already exists
+ <braunr> having the timeout at the client side also have its advantage
+ <braunr> has*
+ <braunr> latency is more precise
+ <braunr> so the real problem is indeed the non blocking case only
+ <youpi> isn't it bound to kernel ticks anyway ?
+ <braunr> uh, not if your server sucks
+ <braunr> or is loaded for whatever reason
+ <youpi> ok, that's not what I understood by "precision" :)
+ <youpi> I'd rather call it robustness :)
+ <braunr> hm
+ <braunr> right
+ <braunr> there are several ways to do this, but the io_select_timeout one
+ looks fine to me
+ <braunr> and is already well on its way
+ <braunr> and it's reliable
+ <braunr> (whereas i'm not sure about reliability if we keep the timeout at
+ client side)
+ <youpi> btw make the timeout nanoseconds
+ <braunr> ??
+ <youpi> pselect uses timespec, not timeval
+ <braunr> do we want pselect ?
+ <youpi> err, that's the only safe way with signals
+ <braunr> not only, no
+ <youpi> and poll is timespec also
+ <youpi> not only??
+ <braunr> you mean ppol
+ <braunr> ppoll
+ <youpi> no, poll too
+ <youpi> by "the only safe way", I mean for select calls
+ <braunr> i understand the race issue
+ <youpi> ppoll is a gnu extension
+ <braunr> int poll(struct pollfd *fds, nfds_t nfds, int timeout);
+ <youpi> ah, right, I was also looking at ppoll
+ <youpi> any
+ <youpi> way
+ <youpi> we can use nanosecs
+ <braunr> most event loops use a pipe or a socketpair
+ <youpi> there's no reason not to
+ <antrik> youpi: I briefly considered special-casing 0 timeouts last time we
+ discussed this; but I concluded that it's probably better to handle all
+ timeouts server-side
+ <youpi> I don't see why we should even discuss that
+ <braunr> and translate signals to writes into the pipe/socketpair
+ <youpi> antrik: ok
+ <antrik> you can't count on select() timeout precision anyways
+ <antrik> a few ms more shouldn't hurt any sanely written program
+ <youpi> braunr: "most" doesn't mean "all"
+ <youpi> there *are* applications which use pselect
+ <braunr> well mach only handles millisedonds
+ <braunr> seconds
+ <youpi> and it's not going out of the standard
+ <youpi> mach is not the hurd
+ <youpi> if we change mach, we can still keep the hurd ipcs
+ <youpi> anyway
+ <youpi> again
+ <youpi> I really don't see the point of the discussion
+ <youpi> is there anything *against* using nanoseconds?
+ <braunr> i chose the types specifically because of that :p
+ <braunr> but ok i can change again
+ <youpi> because what??
+ <braunr> i chose to use mach's native time_value_t
+ <braunr> because it matches timeval nicely
+ <youpi> but it doesn't match timespec nicely
+ <braunr> no it doesn't
+ <braunr> should i add a hurd specific time_spec_t then ?
+ <youpi> "how do you tell the server you don't want a timeout ? a special
+ value ? like { -1; -1 } ?"
+ <youpi> you meant infinite blocking?
+ <braunr> youpi: yes
+ <braunr> oh right, pselect is posix
+ <youpi> actually posix says that there can be limitations on the maximum
+ timeout supported, which should be at least 31 days
+ <youpi> -1;-1 is thus fine
+ <braunr> yes
+ <braunr> which is why i could choose time_value_t (a struct of 2 integer_t)
+ <youpi> well, I'd say gnumach could grow a nanosecond-precision time value
+ <youpi> e.g. for clock_gettime precision and such
+ <braunr> so you would prefer me adding the time_spec_t type to gnumach
+ rather than the hurd ?
+ <youpi> well, if hurd RPCs are using mach types and there's no mach type
+ for nanoseconds, it makes sense to add one
+ <youpi> I don't know about the first part
+ <braunr> yes some hurd interfaces also use time_value_t
+ <antrik> in general, I don't think Hurd interfaces should rely on a Mach
+ timevalue. it's really only meaningful when Mach is involved...
+ <antrik> we could even pass the time value as an opaque struct. don't
+ really need an explicit MIG type for that.
+ <braunr> opaque ?
+ <youpi> an opaque type would be a step backward from multi-machine support
+ ;)
+ <antrik> youpi: that's a sham anyways ;-)
+ <youpi> what?
+ <youpi> ah, using an opaque type, yes :)
+ <braunr> probably why my head bugged while reading that
+ <antrik> it wouldn't be fully opaque either. it would be two ints, right?
+ even if Mach doesn't know what these two ints mean, it still could do
+ byte order conversion, if we ever actually supported setups where it
+ matters...
+ <braunr> so uh, should this new time_spec_t be added in gnumach or the hurd
+ ?
+ <braunr> youpi: you're the maintainer, you decide :p
+ <youpi> well, I don't like deciding when I didn't even have read fs.defs :)
+ <youpi> but I'd say the way forward is defining it in the hurd
+ <youpi> and put a comment "should be our own type" above use of the mach
+ type
+ <braunr> ok
+ <braunr> and, by the way, is using integer_t fine wrt the 64-bits port ?
+ <youpi> I believe we settled on keeping integer_t a 32bit integer, like xnu
+ does
+ <braunr> ok so it's not
+ <braunr> uh well
+ <youpi> why "not" ?
+ <braunr> keeping it 32-bits for the 32-bits userspace hurd
+ <braunr> but i'm talking about a true 64-bits version
+ <braunr> wouldn't integer_t get 64-bits then ?
+ <youpi> I meant we settled on a no
+ <youpi> like xnu does
+ <braunr> xnu uses 32-bits integer_t even when userspace runs in 64-bits
+ mode ?
+ <youpi> because things for which we'd need 64bits then are offset_t,
+ vm_size_t, and such
+ <youpi> yes
+ <braunr> ok
+ <braunr> youpi: but then what is the type to use for long integers ?
+ <braunr> or uintptr_t
+ <youpi> braunr: uintptr_t
+ <braunr> the mig type i mean
+ <youpi> type memory_object_offset_t = uint64_t;
+ <youpi> (and size)
+ <braunr> well that's a 64-bits type
+ <youpi> well, yes
+ <braunr> natural_t and integer_t were supposed to have the processor word
+ size
+ <youpi> probably I didn't understand your question
+ <braunr> if we remove that property, what else has it ?
+ <youpi> yes, but see rolands comment on this
+ <braunr> ah ?
+ <youpi> ah, no, he just says the same
+ <antrik> braunr: well, it's debatable whether the processor word size is
+ really 64 bit on x86_64...
+ <antrik> all known compilers still consider int to be 32 bit
+ <antrik> (and int is the default word size)
+ <braunr> not really
+ <youpi> as in?
+ <braunr> the word size really is 64-bits
+ <braunr> the question concerns the data model
+ <braunr> with ILP32 and LP64, int is always 32-bits, and long gets the
+ processor word size
+ <braunr> and those are the only ones current unices support
+ <braunr> (which is why long is used everywhere for this purpose instead of
+ uintptr_t in linux)
+ <antrik> I don't think int is 32 bit on alpha?
+ <antrik> (and probably some other 64 bit arches)
+ <braunr> also, assuming we want to maintain the ability to support single
+ system images, do we really want RPC with variable size types ?
+ <youpi> antrik: linux alpha's int is 32bit
+ <braunr> sparc64 too
+ <youpi> I don't know any 64bit port with 64bit int
+ <braunr> i wonder how posix will solve the year 2038 problem ;p
+ <youpi> time_t is a long
+ <youpi> the hope is that there'll be no 32bit systems by 2038 :)
+ <braunr> :)
+ <youpi> but yes, that matters to us
+ <youpi> number of seconds should not be just an int
+ <braunr> we can force a 64-bits type then
+ <braunr> i tend to think we should have no variable size type in any mig
+ interface
+ <braunr> youpi: so, new hurd type, named time_spec_t, composed of two
+ 64-bits signed integers
+ <pinotree> braunr: i added that in my prototype of monotonic clock patch
+ for gnumach
+ <braunr> oh
+ <youpi> braunr: well, 64bit is not needed for the nanosecond part
+ <braunr> right
+ <braunr> it will be aligned anyway :p
+ <youpi> I know
+ <youpi> uh, actually linux uses long there
+ <braunr> pinotree: i guess your patch is still in debian ?
+ <braunr> youpi: well yes
+ <braunr> youpi: why wouldn't it ? :)
+ <pinotree> no, never applied
+ <youpi> braunr: because 64bit is not needed
+ <braunr> ah, i see what you mean
+ <youpi> oh, posix says long actually
+ <youpi> *exactly* long
+ <braunr> i'll use the same sizes
+ <braunr> so it fits nicely with timespec
+ <braunr> hm
+ <braunr> but timespec is only used at the client side
+ <braunr> glibc would simply move the timespec values into our hurd specific
+ type (which can use 32-bits nanosecs) and servers would only use that
+ type
+ <braunr> all right, i'll do it that way, unless there are additional
+ comments next morning :)
+ <antrik> braunr: we never supported federations, and I'm pretty sure we
+ never will. the remnants of network IPC code were ripped out some years
+ ago. some of the Hurd interfaces use opaque structs too, so it wouldn't
+ even work if it existed. as I said earlier, it's really all a sham
+ <antrik> as for the timespec type, I think it's easier to stick with the
+ API definition at RPC level too
+
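Editorial illustration (not part of the log): the idea of a fixed-width time value made of two signed 64-bit integers, with `{ -1; -1 }` standing for "no timeout", can be sketched as below. The names `wire_time`, `wire_time_from_timespec` and `wire_time_is_infinite` are hypothetical; the actual Hurd type, its name and its field widths were still being debated at this point in the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical RPC-level time value: two fixed-width signed integers, so
   the layout does not depend on the size of long or time_t. */
struct wire_time {
    int64_t seconds;
    int64_t nanoseconds;
};

/* Client side: copy a POSIX timespec into the fixed-width type.
   A NULL timespec means "block without timeout" and is encoded as the
   sentinel { -1, -1 } suggested in the log. */
struct wire_time wire_time_from_timespec(const struct timespec *ts)
{
    struct wire_time wt = { -1, -1 };

    if (ts != NULL) {
        assert(ts->tv_sec >= 0);
        assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
        wt.seconds = (int64_t)ts->tv_sec;
        wt.nanoseconds = (int64_t)ts->tv_nsec;
    }
    return wt;
}

/* Server side: recognise the "no timeout" sentinel. */
int wire_time_is_infinite(const struct wire_time *wt)
{
    return wt->seconds == -1 && wt->nanoseconds == -1;
}
```

Because a valid timespec never has negative fields, the sentinel cannot collide with a real timeout, and the struct has the same size on 32-bit and 64-bit builds.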
+
+## IRC, freenode, #hurd, 2012-07-24
+
+ <braunr> youpi: antrik: is vm_size_t an appropriate type for a c long ?
+ <braunr> (appropriate mig type)
+ <antrik> I wouldn't say so. while technically they are pretty much
+ guaranteed to be the same, conceptually they are entirely different
+ things -- it would be confusing at least to do it that way...
+ <braunr> antrik: well which one then ? :(
+ <antrik> braunr: no idea TBH
+ <braunr> antrik_: that should have been natural_t and integer_t
+ <braunr> so maybe we should add new types to replace them
+ <antrik_> braunr: actually, RPCs should never have any machine-specific
+ types... which makes me realise that a 1:1 translation to the POSIX
+ definition is actually not possible if we want to follow the Mach ideals
+ <braunr> i agree
+ <braunr> (well, the original mach authors used natural_t in quite a bunch
+ of places ..)
+ <braunr> the mig interfaces look extremely messy to me because of this type
+ issue
+ <braunr> and i just want to move forward with my work now
+ <braunr> i could just use 2 integer_t, that would get converted in the
+ massive future revamp of the interfaces for the 64-bits userspace
+ <braunr> or 2 64-bits types
+ <braunr> i'd like us to agree on one of the two not too late so i can
+ continue
+
+
+## IRC, freenode, #hurd, 2012-07-25
+
+ <antrik_> braunr: well, for actual kernel calls, machine-specific types are
+ probably hard to avoid... the problem is when they are used in other RPCs
+ <braunr> antrik: i opted for a hurd specific time_data_t = struct[2] of
+ int64
+ <braunr> and going on with this for now
+ <braunr> once it works we'll finalize the types if needed
+ <antrik> I'm really not sure how to best handle such 32 vs. 64 bit issues
+ in Hurd interfaces...
+ <braunr> you *could* consider time_t and long to be machine specific types
+ <antrik> well, they clearly are
+ <braunr> long is
+ <braunr> time_t isn't really
+ <antrik> didn't you say POSIX demands it to be longs?
+ <braunr> we could decide to make it 64 bits in all versions of the hurd
+ <braunr> no
+ <braunr> posix requires the nanoseconds field of timespec to be long
+ <braunr> the way i see it, i don't see any problem (other than a little bit
+ of storage and performance) using 64-bits types here
+ <antrik> well, do we really want to use a machine-independent time format,
+ if the POSIX interfaces we are mapping do not?...
+ <antrik> (perhaps we should; I'm just uncertain what's better in this case)
+ <braunr> this would require creating new types for that
+ <braunr> probably mach types for consistency
+ <braunr> to replace natural_t and integer_t
+ <braunr> now this concerns a totally different issue than select
+ <braunr> which is how we're gonna handle the 64-bits port
+ <braunr> because natural_t and integer_t are used almost everywhere
+ <antrik> indeed
+ <braunr> and we must think of 2 ports
+ <braunr> the 32-bits over 64-bits gnumach, and the complete 64-bits one
+ <antrik> what do we do for the interfaces that are explicitly 64 bit?
+ <braunr> what do you mean ?
+ <braunr> i'm not sure there is anything to do
+ <antrik> I mean what is done in the existing ones?
+ <braunr> like off64_t ?
+ <antrik> yeah
+ <braunr> they use int64 and unsigned64
+ <antrik> OK. so we shouldn't have any trouble with that at least...
+ <pinotree> braunr: were you adding a time_value_t in mach, but for
+ nanoseconds?
+ <braunr> no i'm adding a time_data_t to the hurd
+ <braunr> for nanoseconds yes
+ <pinotree> ah ok
+ <pinotree> (make sure it is available in hurd/hurd_types.defs)
+ <braunr> yes it's there
+ <pinotree> \o/
+ <braunr> i mean, i didn't forget to add it there
+ <braunr> for now it's a struct[2] of int64
+ <braunr> but we're not completely sure of that
+ <braunr> currently i'm teaching the hurd how to use timeouts
+ <pinotree> cool
+ <braunr> which basically involves adding a time_data_t *timeout parameter
+ to many functions
+ <braunr> and replacing hurd_condition_wait with hurd_condition_timedwait
+ <braunr> and making sure a timeout isn't an error on the return path
+ * pinotree has a simpler idea for time_data_t: add a file_utimesns to
+ fs.defs
+ <braunr> hmm, some functions have a nonblocking parameter
+ <braunr> i'm not sure if it's better to replace them with the timeout, or add the timeout parameter
+ <braunr> considering the functions involved may return EWOULDBLOCK
+ <braunr> for now i'll add a timeout parameter, so that the code requires as little modification as possible
+ <braunr> tell me your opinion on that please
+ <antrik> braunr: what functions?
+ <braunr> connq_listen in pflocal for example
+ <antrik> braunr: I don't really understand what you are talking about :-(
+ <braunr> some servers implement select this way :
+ <braunr> 1/ call a function in non-blocking mode, if it indicates data is available, return immediately
+ <braunr> 2/ call the same function, in blocking mode
+ <braunr> normally, with the new timeout parameter, non-blocking could be passed in the timeout parameter (with a timeout of 0)
+ <braunr> operating in non-blocking mode, i mean
+ <braunr> antrik: is it clear now ? :)
+ <braunr> i wonder how the hurd managed to grow so much code without a cond_timedwait function :/
+ <braunr> i think i have finished my io_select_timeout patch on the hurd side
+ <braunr> :)
+ <braunr> a small step for the hurd, but a big one against vim latencies !!
+ <braunr> (which is the true reason i'm working on this haha)
+ <braunr> new hurd rbraun/io_select_timeout branch for those interested
+ <braunr> hm, my changes clashes hard with the debian pflocal patch by neal :/
+ <braunr> clash*
+ <antrik> braunr: replace I'd say. no need to introduce redundancy; and code changes not affecting interfaces are cheap
+ <antrik> (in general, I'm always in favour of refactoring)
+ <braunr> antrik: replace what ?
+ <antrik> braunr: wow, didn't think moving the timeouts to server would be such a quick task :-)
+ <braunr> antrik: :)
+ <antrik> 16:57 < braunr> hmm, some functions have a nonblocking parameter
+ <antrik> 16:58 < braunr> i'm not sure if it's better to replace them with the timeout, or add the timeout parameter
+ <braunr> antrik: ah about that, ok
+
+
+## IRC, freenode, #hurd, 2012-07-26
+
+ <pinotree> braunr: wrt your select_timeout branch, why not push only the
+ time_data stuff to master?
+ <braunr> pinotree: we didn't agree on that yet
+
+ <braunr> ah better, with the correct ordering of io routines, my hurd boots
+ :)
+ <pinotree> and works too? :p
+ <braunr> so far yes
+ <braunr> i've spotted some issues in libpipe but nothing major
+ <braunr> i "only" have to adjust the client side select implementation now
+
+
+## IRC, freenode, #hurd, 2012-07-27
+
+ <braunr> io_select should remain a routine (i.e. synchronous) for server
+ side stub code
+ <braunr> but should be asynchronous (send only) for client side stub code
+ <braunr> (since _hurd_select manually handles replies through a port set)
+
+
+## IRC, freenode, #hurd, 2012-07-28
+
+ <braunr> why are there both REPLY_PORTS and IO_SELECT_REPLY_PORT macros in
+ the hurd ..
+ <braunr> and for the select call only :(
+ <braunr> and doing the exact same thing unless i'm mistaken
+ <braunr> the reply port is required for select anyway ..
+ <braunr> i just want to squeeze them into a new IO_SELECT_SERVER macro
+ <braunr> i don't think i can maintain the use of the existing io_select call
+ as it is
+ <braunr> grr, the io_request/io_reply files aren't synced with the io.defs
+ file
+ <braunr> calls like io_sigio_request seem totally unused
+ <antrik> yeah, that's a major shortcoming of MIG -- we shouldn't need to
+ have separate request/reply defs
+ <braunr> they're not even used :/
+ <braunr> i did something a bit ugly but it seems to do what i wanted
+
+
+## IRC, freenode, #hurd, 2012-07-29
+
+ <braunr> good, i have a working client-side select
+ <braunr> now i need to fix the servers a bit :x
+ <braunr> arg, my test cases work, but vim doesn't :((
+ <braunr> i hate select :p
+ <braunr> ah good, my problems are caused by a deadlock because of my glibc
+ changes
+ <braunr> ah yes, found my locking problem
+ <braunr> building my final libc now
+ * braunr crosses fingers
+ <braunr> (the deadlock issue was of course a one liner)
+ <braunr> grr deadlocks again
+ <braunr> grmbl, my deadlock is in pfinet :/
+ <braunr> my select_timeout code makes servers deadlock on the libports
+ global lock :/
+ <braunr> wtf..
+ <braunr> youpi: it may be related to the failed assertion
+ <braunr> deadlocking on mutex_unlock oO
+ <braunr> grr
+ <braunr> actually, mutex_unlock sends a message to notify other threads
+ that the lock is ready
+ <braunr> and that's what is blocking ..
+ <braunr> i'm not sure it's a fundamental problem here
+ <braunr> it may simply be a corruption
+ <braunr> i have several (but not that many) threads blocked in mutex_unlock
+ and one blocked in mutex_lock
+ <braunr> i fail to see how my changes can create such a behaviour
+ <braunr> the weird thing is that i can't reproduce this with my test cases
+ :/
+ <braunr> only vim makes things crazy
+ <braunr> and i suppose it's related to the terminal
+ <braunr> (don't terminals relay select requests ?)
+ <braunr> when starting vim through ssh, pfinet deadlocks, and when starting
+ it on the mach console, the console term deadlocks
+ <pinotree> no help/hints when started with rpctrace?
+ <braunr> i only get assertions with rpctrace
+ <braunr> it's completely unusable for me
+ <braunr> gdb tells vim is indeed blocked in a select request
+ <braunr> and i can't see any in the remote servers :/
+ <braunr> this is so weird ..
+ <braunr> when using vim with the unmodified c library, i clearly see the
+ select call, and everything works fine ....
+ <braunr> 2e27: a1 c4 d2 b7 f7 mov 0xf7b7d2c4,%eax
+ <braunr> 2e2c: 62 (bad)
+ <braunr> 2e2d: f6 47 b6 69 testb $0x69,-0x4a(%edi)
+ <braunr> what's the "bad" line ??
+ <braunr> ew, i think i understand my problem now
+ <braunr> the timeout makes blocking threads wake prematurely
+ <braunr> but on a mutex unlock, or a condition signal/broadcast, a message
+ is still sent, as it is expected a thread is still waiting
+ <braunr> but the receiving thread, having returned sooner than expected
+ from mach_msg, doesn't dequeue the message
+ <braunr> as vim does a lot of non blocking selects, this fills the message
+ queue ...
+
+
+## IRC, freenode, #hurd, 2012-07-30
+
+ <braunr> hm nice, the problem i have with my hurd_condition_timedwait seems
+ to also exist in libpthread
+
+[[!taglink open_issue_libpthread]].
+
+ <braunr> although at a lesser degree (the implementation already correctly
+ removes a thread that timed out from a condition queue, and there is a
+ nice FIXME comment asking what to do with any stale wakeup message)
+ <braunr> and the only solution i can think of for now is to drain the
+ message queue
+ <braunr> ah yes, i now have vim running with my io_select_timeout code :>
+ <braunr> but hum
+ <braunr> eating all cpu
+ <braunr> ah nice, an infinite loop in _hurd_critical_section_unlock
+ <braunr> grmbl
+ <tschwinge> braunr: But not this one?
+ http://www.gnu.org/software/hurd/open_issues/fork_deadlock.html
+ <braunr> it looks similar, yes
+ <braunr> let me try again to compare in detail
+ <braunr> pretty much the same yes
+ <braunr> there is only one difference but i really don't think it matters
+ <braunr> (#3 _hurd_sigstate_lock (ss=0x2dff718) at hurdsig.c:173
+ <braunr> instead of
+ <braunr> #3 _hurd_sigstate_lock (ss=0x1235008) at hurdsig.c:172)
+ <braunr> ok so we need to review jeremie's work
+ <braunr> tschwinge: thanks for pointing me at this
+ <braunr> the good thing with my patch is that i can reproduce in a few
+ seconds
+ <braunr> consistently
+ <tschwinge> braunr: You're welcome. Great -- a reproducer!
+ <tschwinge> You might also build a glibc without his patches as a
+ cross-test to see if the issue goes away?
+ <braunr> right
+ <braunr> i hope they're easy to find :)
+ <tschwinge> Hmm, have you already done changes to glibc? Otherwise you
+ might also simply use a Debian package from before?
+ <braunr> yes i have local changes to _hurd_select
+ <tschwinge> OK, too bad.
+ <tschwinge> braunr: debian/patches/hurd-i386/tg-hurdsig-*, I think.
+ <braunr> ok
+ <braunr> hmmmmm
+ <braunr> it may be related to my last patch on the select_timeout branch
+ <braunr> (i mean, this may be caused by what i mentioned earlier this
+ morning)
+ <braunr> damn i can't build glibc without the signal disposition patches :(
+ <braunr> libpthread_sigmask.diff depends on it
+ <braunr> tschwinge: doesn't libpthread (as implemented in the debian glibc
+ patches) depend on global signal dispositions ?
+ <braunr> i think i'll use an older glibc for now
+ <braunr> but hmm which one ..
+ <braunr> oh whatever, let's fix the deadlock, it's simpler
+ <braunr> and more productive anyway
+ <tschwinge> braunr: May be that you need to revert some libpthread patch,
+ too. Or even take out the libpthread build completely (you don't need it
+ for your current work, I think).
+ <tschwinge> braunr: Or, of course, you locate the deadlock. :-)
+ <braunr> hum, now why would __io_select_timeout return
+ EMACH_SEND_INVALID_DEST :(
+ <braunr> the current glibc code just transparently reports any such error
+ as a false positive oO
+ <braunr> hm nice, segfault through recursion
+ <braunr> "task foo destroying an invalid port bar" everywhere :((
+ <braunr> i still have problems at the server side ..
+ <braunr> ok i think i have a solution for the "synchronization problem"
+ <braunr> (by this name, i refer to the way mutex and condition variables
+ are implemented)
+ <braunr> (the problem being that, when a thread unblocks early, because of
+ a timeout, another may still send a message to attempt it, which may fill
+ up the message queue and make the sender block, causing a deadlock)
+ <braunr> s/attempt/attempt to wake/
+ <bddebian> Attempts to wake a dead thread?
+ <braunr> no
+ <braunr> attempt to wake an already active thread
+ <braunr> which won't dequeue the message because it's doing something else
+ <braunr> bddebian: i'm mentioning this because the problem potentially also
+ exists in libpthread
+
+[[!taglink open_issue_libpthread]].
+
+ <braunr> since the underlying algorithms are exactly the same
+ <youpi> (fortunately the time-out versions are not often used)
+ <braunr> for now :)
+ <braunr> for reference, my idea is to make the wake call truly non
+ blocking, by setting a timeout of 0
+ <braunr> i also limit the message queue size to 1, to limit the amount of
+ spurious wakeups
+ <braunr> i'll be able to test that in 30 mins or so
+ <braunr> hum
+ <braunr> how can mach_msg block with a timeout of 0 ??
+ <braunr> never mind :p
+ <braunr> unfortunately, my idea alone isn't enough
+ <braunr> for those interested in the problem, i've updated the analysis in
+ my last commit
+ (http://git.savannah.gnu.org/cgit/hurd/hurd.git/commit/?h=rbraun/select_timeout&id=40fe717ba9093c0c893d9ea44673e46a6f9e0c7d)
+
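The wakeup idea described above (a queue limited to one message and a send that never blocks) can be modelled without Mach. This is only a minimal sketch: `struct wakeup_port`, `wakeup_send` and `wakeup_receive` are hypothetical names that stand in for a per-thread wakeup port and do not come from cthreads. As the log itself notes, this idea alone was not enough to fix the deadlock.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a per-thread wakeup port whose queue is
   limited to a single message.  */
struct wakeup_port
{
  unsigned int queued;   /* Messages currently queued: 0 or 1.  */
  unsigned int dropped;  /* Wakeups discarded because one was pending.  */
};

/* Model of a send with a timeout of 0: it never blocks.  Returns true if
   the message was queued, false if the queue was already full.  */
bool
wakeup_send (struct wakeup_port *port)
{
  if (port->queued == 1)
    {
      port->dropped++;
      return false;
    }
  port->queued = 1;
  return true;
}

/* Model of the receive done by a waiter.  Returns true if a wakeup was
   consumed, false if nothing was queued (the wait would time out).  */
bool
wakeup_receive (struct wakeup_port *port)
{
  if (port->queued == 0)
    return false;
  port->queued = 0;
  return true;
}
```

In this model a thread that timed out can be signalled any number of times: at most one stale message stays queued and the sender never blocks. The next wait still has to treat that stale message as a possibly spurious wakeup.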
+
+## IRC, freenode, #hurd, 2012-08-01
+
+ <braunr> damn, i can't manage to make threads calling condition_wait to
+ dequeue themselves from the condition queue :(
+ <braunr> (instead of the one sending the signal/broadcast)
+ <braunr> my changes on cthreads introduce 2 intrusive changes
+ <braunr> the first is that the wakeup port is limited to 1 port, and the
+ wakeup operation is totally non blocking
+ <braunr> which is something we should probably add in any case
+ <braunr> the second is that condition_wait dequeues itself after blocking,
+ instead of condition_signal/broadcast
+ <braunr> and this second change seems to introduce deadlocks, for reasons
+ completely unknown to me :((
+ <braunr> limited to 1 message*
+ <braunr> if anyone has an idea about why it is bad for a thread to remove
+ itself from a condition/mutex queue, i'm all ears
+ <braunr> i'm hitting a wall :(
+ <braunr> antrik: if you have some motivation, can you review this please ?
+ http://www.sceen.net/~rbraun/0001-Rework-condition-signal-broadcast.patch
+ <braunr> with this patch, i get threads blocked in condition_wait,
+ apparently waiting for a wakeup that never comes (or was already
+ consumed)
+ <braunr> and i don't understand why :
+ <braunr> :(
+ <bddebian> braunr: The condition never happens?
+ <braunr> bddebian: it works without the patch, so i guess that's not the
+ problem
+ <braunr> bddebian: hm, you could be right actually :p
+ <bddebian> braunr: About what? :)
+ <braunr> 17:50 < bddebian> braunr: The condition never happens?
+ <braunr> although i doubt it again
+ <braunr> this problem is getting very very frustrating
+ <bddebian> :(
+ <braunr> it frightens me because i don't see any flaw in the logic :(
+
+
+## IRC, freenode, #hurd, 2012-08-02
+
+ <braunr> ah, seems i found a reliable workaround to my deadlock issue, and
+ more than a workaround, it should increase efficiency by reducing
+ messaging
+ * braunr happy
+ <kilobug> congrats :)
+ <braunr> the downside is that we may have a problem with non blocking send
+ calls :/
+ <braunr> which are used for signals
+ <braunr> i mean, this could be a mach bug
+ <braunr> let's try running a complete hurd with the change
+ <braunr> arg, the boot doesn't complete with the patch .. :(
+ <braunr> grmbl, by changing only a few bits in cthreads, the boot process
+ freezes in an infinite loop in something started after auth
+ (/etc/hurd/runsystem i assume)
+
+
+## IRC, freenode, #hurd, 2012-08-03
+
+ <braunr> glibc actually makes some direct use of cthreads condition
+ variables
+ <braunr> and my patch seems to work with servers in an already working
+ hurd, but doesn't allow it to boot
+ <braunr> and the hang happens on bash, the first thing that doesn't come
+ from the hurd package
+ <braunr> (i mean, during the boot sequence)
+ <braunr> which means we can't change cthreads headers (as some primitives
+ are macros)
+ <braunr> *sigh*
+ <braunr> the thing is, i can't fix select until i have a
+ condition_timedwait primitive
+ <braunr> and i can't add this primitive until either 1/ cthreads are fixed
+ not to allow the inlining of its primitives, or 2/ the switch to pthreads
+ is done
+ <braunr> which might take a loong time :p
+ <braunr> i'll have to rebuild a whole libc package with a fixed cthreads
+ version
+ <braunr> let's do this
+ <braunr> pinotree: i see two __condition_wait calls in glibc, how is the
+ double underscore handled ?
+ <pinotree> where do you see it?
+ <braunr> sysdeps/mach/hurd/setpgid.c and sysdeps/mach/hurd/setsid.c
+ <braunr> i wonder if it's even used
+ <braunr> looks like we use posix/setsid.c now
+ <pinotree> #ifdef noteven
+ <braunr> ?
+ <pinotree> the two __condition_wait calls you pointed out are in such
+ preprocessor block
+ <pinotree> s
+ <braunr> but what does it mean ?
+ <pinotree> no idea
+ <braunr> ok
+ <pinotree> these two files should definitely be used, they are found
+ earlier in the vpath
+ <braunr> hum, posix/setsid.c is a nop stub
+ <pinotree> i don't see anything defining "noteven" in glibc itself nor in
+ hurd
+ <braunr> :(
+ <pinotree> yes, most of the stuff in posix/, misc/, signal/, time/ are
+ ENOSYS stubs, to be reimplemented in a sysdep
+ <braunr> hm, i may have made a small mistake in cthreads itself actually
+ <braunr> right
+ <braunr> when i try to debug using a subhurd, gdb tells me the blocked
+ process is spinning in ld ..
+ <braunr> i mean ld.so
+ <braunr> and i can't see any debugging symbol
+ <braunr> some progress, it hangs at process_envvars
+ <braunr> eh
+ <braunr> i've partially traced my problem
+ <braunr> when a "normal" program starts, libc creates the signal thread
+ early
+ <braunr> the main thread waits for the creation of this thread by polling
+ its address
+ <braunr> (i.e. while (signal_thread == 0); )
+ <braunr> for some reason, it is stuck in this loop
+ <braunr> cthread creation being actually governed by
+ condition_wait/broadcast, it makes some sense
+ <bddebian> braunr: When you say the "main" thread, do you mean the main
+ thread of the program?
+ <braunr> bddebian: yes
+ <braunr> i think i've determined my mistake
+ <braunr> glibc has its own variants of the mutex primitives
+ <braunr> and i changed one :/
+ <bddebian> Ah
+ <braunr> it's good news for me :)
+ <braunr> hum no, that's not exactly what i described
+ <braunr> glibc has some stubs, but it's not the problem, the problem is
+ that mutex_lock/unlock are macros, and i changed one of them
+ <braunr> so everything that used that macro inside glibc wasn't changed
+ <braunr> yes!
+ <braunr> my patched hurd now boots :)
+ * braunr relieved
+ <braunr> this experience at least taught me that it's not possible to
+ easily change the singly linked queues of thread (waiting for a mutex or
+ a condition variable) :(
+ <braunr> for now, i'm using a linear search from the start
+ <braunr> so, not only does this patched hurd boot, but i was able to use
+ aptitude, git, build a whole hurd, copy the whole thing, and remove
+ everything, and it still runs fine (whereas usually it would fail very
+ early)
+ * braunr happy
+ <antrik> and vim works fine now?
+ <braunr> err, wait
+ <braunr> this patch does only one thing
+ <braunr> it alters the way condition_signal/broadcast and
+ {hurd_,}condition_wait operate
+ <braunr> currently, condition_signal/broadcast dequeues threads from a
+ condition queue and wake them
+ <braunr> my patch makes these functions only wake the target threads
+ <braunr> which dequeue themselves
+ <braunr> (a necessary requirement to allow clean timeout handling)
+ <braunr> the next step is to fix my hurd_condition_wait patch
+ <braunr> and reapply the whole hurd patch indotrucing io_select_timeout
+ <braunr> introducing*
+ <braunr> then i'll be able to tell you
+ <braunr> one side effect of my current changes is that the linear search
+ required when a thread dequeues itself is ugly
+ <braunr> so it'll be an additional reason to help the pthreads porting
+ effort
+ <braunr> (pthreads have the same sort of issues wrt to timeout handling,
+ but threads are a doubly-linked lists, making it way easier to adjust)
+ <braunr> +on
+ <braunr> damn i'm happy
+ <braunr> 3 days on this stupid bug
+ <braunr> (which is actually responsible for what i initially feared to be a
+ mach bug on non blocking sends)
+ <braunr> (and because of that, i worked on the code to make sure that 1/
+ waking is truly non blocking and 2/ only one message is required for
+ wakeups
+ <braunr> )
+ <braunr> a simple flag is tested instead of sending in a non blocking way
+ :)
+ <braunr> these improvements should be ported to pthreads some day
+
+[[!taglink open_issue_libpthread]]
+
+ <braunr> ahah !
+ <braunr> view is now FAST !
+ <mel-> braunr: what do you mean by 'view'?
+ <braunr> mel-: i mean the read-only version of vim
+ <mel-> aah
+ <braunr> i still have a few port leaks to fix
+ <braunr> and some polishing
+ <braunr> but basically, the non-blocking select issue seems fixed
+ <braunr> and with some luck, we should get unexpected speedups here and
+ there
+ <mel-> so vim was considerably slow on the Hurd before? didn't know that.
+ <braunr> not exactly
+ <braunr> at first, it wasn't, but the non blocking select/poll calls
+ misbehaved
+ <braunr> so a patch was introduced to make these block at least 1 ms
+ <braunr> then vim became slow, because it does a lot of non blocking select
+ <braunr> so another patch was introduced, not to set the 1ms timeout for a
+ few programs
+ <braunr> youpi: darnassus is already running the patched hurd, which shows
+ (as expected) that it can safely be used with an older libc
+ <youpi> i.e. servers with the additional io_select?
+ <braunr> yes
+ <youpi> k
+ <youpi> good :)
+ <braunr> and the modified cthreads
+ <braunr> which is the most intrusive change
+ <braunr> port leaks fixed
+ <gnu_srs> braunr: Congrats:-D
+ <braunr> thanks
+ <braunr> it's not over yet :p
+ <braunr> tests, reviews, more tests, polishing, commits, packaging
+
+
+## IRC, freenode, #hurd, 2012-08-04
+
+ <braunr> grmbl, apt-get fails on select in my subhurd with the updated
+ glibc
+ <braunr> otherwise it boots and runs fine
+ <braunr> fixed :)
+ <braunr> grmbl, there is a deadlock in pfinet with my patch
+ <braunr> deadlock fixed
+ <braunr> the sigstate and the condition locks must be taken at the same
+ time, for some obscure reason explained in the cthreads code
+ <braunr> but when a thread awakes and dequeues itself from the condition
+ queue, it only took the condition lock
+ <braunr> i noted in my todo list that this could create problems, but
+ wanted to leave it as it is to really see it happen
+ <braunr> well, i saw :)
+ <braunr> the last commit of my hurd branch includes the 3 line fix
+ <braunr> these fixes will be required for libpthreads
+ (pthread_mutex_timedlock and pthread_cond_timedwait) some day
+ <braunr> after the select bug is fixed, i'll probably work on that with you
+ and thomas d
+
+
+## IRC, freenode, #hurd, 2012-08-05
+
+ <braunr> eh, i made dpkg-buildpackage use the patched c library, and it
+ finished the build oO
+ <gnu_srs> braunr: :)
+ <braunr> faked-tcp was blocked in a select call :/
+ <braunr> (with the old libc i mean)
+ <braunr> with mine it just worked at the first attempt
+ <braunr> i'm not sure what it means
+ <braunr> it could mean that the patched hurd servers are not completely
+ compatible with the current libc, for some weird corner cases
+ <braunr> the slowness of faked-tcp is apparently inherent to its
+ implementation
+ <braunr> all right, let's put all these packages online
+ <braunr> eh, right when i upload them, i get a deadlock
+ <braunr> this one seems specific to pfinet
+ <braunr> only one deadlock so far, and the libc wasn't in sync with the
+ hurd
+ <braunr> :/
+ <braunr> damn, another deadlock as soon as i send a mail on bug-hurd :(
+ <braunr> grr
+ <pinotree> thou shall not email
+ <braunr> aptitude seems to be a heavy user of select
+ <braunr> oh, it may be due to my script regularly changing the system time
+ <braunr> or it may not be a deadlock, but simply the linear queue getting
+ extremely large
+
+
+## IRC, freenode, #hurd, 2012-08-06
+
+ <braunr> i have bad news :( it seems there can be memory corruptions with
+ my io_select patch
+ <braunr> i've just seen an auth server (!) spinning on a condition lock
+ (the internal spin lock), probably because the condition was corrupted ..
+ <braunr> i guess it's simply because conditions embedded in dynamically
+ allocated structures can be freed while there are still threads waiting
+ ...
+ <braunr> so, yes the solution to my problem is simply to dequeue threads
+ from both the waker when there is one, and the waiter when no wakeup
+ message was received
+ <braunr> simple
+ <braunr> it's so obvious i wonder how i didn't think of it earlier :(-
+ <antrik> braunr: an elegant solution always seems obvious afterwards... ;-)
+ <braunr> antrik: let's hope this time, it's completely right
+ <braunr> good, my latest hurd packages seem fixed finally
+ <braunr> looks like i got another deadlock
+ * braunr hangs himself
+ <braunr> that, or again, condition queues can get very large (e.g. on
+ thread storms)
+ <braunr> looks like this is the case yes
+ <braunr> after some time the system recovered :(
+ <braunr> which means a doubly linked list is required to avoid pathological
+ behaviours
+ <braunr> arg
+ <braunr> it won't be easy at all to add a doubly linked list to condition
+ variables :(
+ <braunr> actually, just a bit messy
+ <braunr> youpi: other than this linear search on dequeue, darnassus has
+ been working fine so far
+ <youpi> k
+ <youpi> Mmm, you'd need to bump the abi soname if changing the condition
+ structure layout
+ <braunr> :(
+ <braunr> youpi: how are we going to solve that ?
+ <youpi> well, either bump soname, or finish transition to libpthread :)
+ <braunr> it looks better to work on pthread now
+ <braunr> to avoid too many abi changes
+
+[[libpthread]].
+
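The cost of a waiter dequeuing itself can be illustrated with two toy queues. This is a sketch under assumptions: `struct waiter`, `slist_remove` and `dlist_remove` are hypothetical and do not mirror the cthreads or libpthread structures. On a singly linked queue the waiter has to search from the head, while a doubly linked one lets it unlink itself in constant time.

```c
#include <stddef.h>

/* Hypothetical waiter node; prev is only used by the doubly linked
   variant.  */
struct waiter
{
  struct waiter *next;
  struct waiter *prev;
};

/* Singly linked queue: find the link that points to SELF by walking from
   the head.  Returns the number of nodes visited, or -1 if SELF is not
   queued (for instance because the waker already dequeued it).  */
int
slist_remove (struct waiter **head, struct waiter *self)
{
  int visited = 0;
  struct waiter **link;

  for (link = head; *link != NULL; link = &(*link)->next)
    {
      visited++;
      if (*link == self)
        {
          *link = self->next;
          self->next = NULL;
          return visited;
        }
    }
  return -1;
}

/* Doubly linked queue: unlink SELF in constant time, whatever its
   position.  SELF must currently be queued.  */
void
dlist_remove (struct waiter **head, struct waiter *self)
{
  if (self->prev != NULL)
    self->prev->next = self->next;
  else
    *head = self->next;
  if (self->next != NULL)
    self->next->prev = self->prev;
  self->next = self->prev = NULL;
}
```

The singly linked search visits every node in front of the waiter, which is what makes long condition queues (thread storms) expensive. The doubly linked variant needs an extra pointer per waiter, which is the kind of layout change that raises the ABI question discussed above.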
+
+## IRC, freenode, #hurd, 2012-08-07
+
+ <rbraun_hurd> anyone knows of applications extensively using non-blocking
+ networking functions ?
+ <rbraun_hurd> (well, networking functions in a non-blocking way)
+ <antrik> rbraun_hurd: X perhaps?
+ <antrik> it's single-threaded, so I guess it must be pretty async ;-)
+ <antrik> thinking about it, perhaps it's the reason it works so poorly on
+ Hurd...
+ <braunr> it does ?
+ <rbraun_hurd> ah maybe at the client side, right
+ <rbraun_hurd> hm no, the client side is synchronous
+ <rbraun_hurd> oh by the way, i can use gitk on darnassus
+ <rbraun_hurd> i wonder if it's because of the select fix
+ <tschwinge> rbraun_hurd: If you want, you could also have a look if there's
+ any improvement for these:
+ http://www.gnu.org/software/hurd/open_issues/select.html (elinks),
+ http://www.gnu.org/software/hurd/open_issues/dbus.html,
+ http://www.gnu.org/software/hurd/open_issues/runit.html
+ <tschwinge> rbraun_hurd: And congratulations, again! :-)
+ <rbraun_hurd> tschwinge: too bad it can't be merged before the pthread port
+ :(
+ <antrik> rbraun_hurd: I was talking about server. most clients are probably
+ sync.
+ <rbraun_hurd> antrik: i guessed :)
+ <antrik> (though certainly not all... multithreaded clients are not really
+ supported with xlib IIRC)
+ <rbraun_hurd> but i didn't have much trouble with X
+ <antrik> tried something pushing a lot of data? like, say, glxgears? :-)
+ <rbraun_hurd> why not
+ <rbraun_hurd> the problem with tests involving "a lot of data" is that it
+ can easily degenerate into a livelock
+ <antrik> yeah, sounds about right
+ <rbraun_hurd> (with the current patch i mean)
+ <antrik> the symptoms I got were general jerkiness, with occasional long
+ hangs
+ <rbraun_hurd> that applies to about everything on the hurd
+ <rbraun_hurd> so it didn't alarm me
+ <antrik> another interesting testcase is freeciv-gtk... it reproducibly
+ caused a thread explosion after idling for some time -- though I don't
+ remember the details; and never managed to come up with a way to track
+ down how this happens...
+ <rbraun_hurd> dbus is more worthwhile
+ <rbraun_hurd> pinotree: how do i test that ?
+ <pinotree> eh?
+ <rbraun_hurd> pinotree: you once mentioned dbus had trouble with non
+ blocking selects
+ <pinotree> it does a poll() with a 0s timeout
+ <rbraun_hurd> that's the non blocking select part, yes
+ <pinotree> you'll need also fixes for the socket credentials though,
+ otherwise it won't work ootb
+ <rbraun_hurd> right but, isn't it already used somehow ?
+ <antrik> rbraun_hurd: uhm... none of the non-X applications I use expose a
+ visible jerkiness/long hangs pattern... though that may well be a result
+ of general load patterns rather than X I guess
+ <rbraun_hurd> antrik: that's my feeling
+ <rbraun_hurd> antrik: heavy communication channels, unoptimal scheduling,
+ lack of scalability, they're clearly responsible for the generally
+ perceived "jerkiness" of the system
+ <antrik> again, I can't say I observe "general jerkiness". apart from slow
+ I/O the system behaves rather normally for the things I do
+ <antrik> I'm pretty sure the X jerkiness *is* caused by the socket
+ communication
+ <antrik> which of course might be a scheduling issue
+ <antrik> but it seems perfectly possible that it *is* related to the select
+ implementation
+ <antrik> at least worth a try I'd say
+ <rbraun_hurd> sure
+ <rbraun_hurd> there is still some work to do on it though
+ <rbraun_hurd> the client side changes i did could be optimized a bit more
+ <rbraun_hurd> (but i'm afraid it would lead to ugly things like 2 timeout
+ parameters in the io_select_timeout call, one for the client side, the
+ other for the servers, eh)
+
+
+## IRC, freenode, #hurd, 2012-08-07
+
+ <braunr> when running gitk on [darnassus], yesterday, i could push the CPU
+ to 100% by simply moving the mouse in the window :p
+ <braunr> (but it may also be caused by the select fix)
+ <antrik> braunr: that cursor might be "normal"
+ <rbraunrh> antrik: what do you mean ?
+ <antrik> the 100% CPU
+ <rbraunh> antrik: yes i got that, but what would make it normal ?
+ <rbraunh> antrik: right i get similar behaviour on linux actually
+ <rbraunh> (not 100% because two threads are spread on different cores, but
+ their cpu usage add up to 100%)
+ <rbraunh> antrik: so you think as long as there are events to process, the
+ x client is running
+ <rbraunh> that would mean latencies are small enough to allow that, which
+ is actually a very good thing
+ <antrik> hehe... sound kinda funny :-)
+ <rbraunh> this linear search on dequeue is a real pain :/
+
+
+## IRC, freenode, #hurd, 2012-08-09
+
+`screen` doesn't close a window/hangs after exiting the shell.
+
+ <rbraunh> the screen issue seems linked to select :p
+ <rbraunh> tschwinge: the term server may not correctly implement it
+ <rbraunh> tschwinge: the problem looks related to the term consoles not
+ dying
+ <rbraunh> http://www.gnu.org/software/hurd/open_issues/term_blocking.html
+
+[[Term_blocking]].
+
+
+# IRC, freenode, #hurd, 2012-12-05
+
+ <braunr> well if i'm unable to build my own packages, i'll send you the one
+ line patch i wrote that fixes select/poll for the case where there is
+ only one descriptor
+ <braunr> (the current code calls mach_msg twice, each time with the same
+ timeout, doubling the total wait time when there is no event)
+
+
+## IRC, freenode, #hurd, 2012-12-06
+
+ <braunr> damn, my eglibc patch breaks select :x
+ <braunr> i guess i'll just simplify the code by using the same path for
+ both single fd and multiple fd calls
+ <braunr> at least, the patch does fix the case i wanted it to .. :)
+ <braunr> htop and ping act at the right regular interval
+ <braunr> my select patch is :
+ <braunr> /* Now wait for reply messages. */
+ <braunr> - if (!err && got == 0)
+ <braunr> + if (!err && got == 0 && firstfd != -1 && firstfd != lastfd)
+ <braunr> basically, when there is a single fd, the code calls io_select
+ with a timeout
+ <braunr> and later calls mach_msg with the same timeout
+ <braunr> effectively making the maximum wait time twice what it should be
+ <pinotree> ouch
+ <braunr> which is why htop and ping are "laggy"
+ <braunr> and perhaps also why fakeroot is when building libc
+ <braunr> well
+ <braunr> when building packages
+ <braunr> my patch avoids entering the mach_msg call if there is only one fd
+ <braunr> (my failed attempt didn't have the firstfd != -1 check, leading to
+ the 0 fd case skipping mach_msg too, which is wrong since in that case
+ there is just no wait, making applications use select/poll for sleeping
+ consume all cpu)
+
+ <braunr> the second is a fix in select (yet another) for the case where a
+ single fd is passed
+ <braunr> in which case there is one timeout directly passed in the
+ io_select call, but then yet another in the mach_msg call that waits for
+ replies
+ <braunr> this can account for the slowness of a bunch of select/poll users
+
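The single-descriptor double wait described above can be modelled in isolation. This is only a sketch, not the glibc code: `must_wait_for_replies` and `total_wait_ms` are hypothetical names, and it assumes `firstfd` and `lastfd` are both -1 when no descriptor was passed. It follows the prose description (skip the reply wait only when there is exactly one descriptor, keep it for zero descriptors so that select/poll can still be used to sleep). Note that the condition as quoted in the log evaluates to false for zero descriptors as well, so the final patch may differ from that quote.

```c
#include <stdbool.h>

/* Hypothetical model of the decision taken after the io_select requests
   have been sent.  firstfd and lastfd are assumed to be -1 when no
   descriptor was passed.  */
bool
must_wait_for_replies (int err, int got, int firstfd, int lastfd)
{
  if (err || got != 0)
    return false;            /* Failure, or a reply was already received.  */
  if (firstfd == -1)
    return true;             /* No descriptor: select is used as a sleep.  */
  return firstfd != lastfd;  /* One descriptor: io_select already waited.  */
}

/* Worst-case wait in milliseconds when no event occurs.  The io_select
   call only carries the timeout in the single-descriptor case.  */
unsigned int
total_wait_ms (int firstfd, int lastfd, unsigned int to, bool patched)
{
  bool single = firstfd != -1 && firstfd == lastfd;
  unsigned int waited = single ? to : 0;

  if (!patched || must_wait_for_replies (0, 0, firstfd, lastfd))
    waited += to;            /* The mach_msg wait on the port set.  */
  return waited;
}
```

With a 1000 ms timeout, the unpatched model waits 2000 ms for a single descriptor and the patched one 1000 ms. The zero-descriptor and multi-descriptor cases wait 1000 ms either way, so a program that calls select with no descriptors in order to sleep does not spin.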
+
+## IRC, freenode, #hurd, 2012-12-07
+
+ <braunr> finally, my select patch works :)
+
+
+## IRC, freenode, #hurd, 2012-12-08
+
+ <braunr> for those interested, i pushed my eglibc packages that include
+ this little select/poll timeout fix on my debian repository
+ <braunr> deb http://ftp.sceen.net/debian-hurd experimental/
+ <braunr> reports are welcome, i'm especially interested in potential
+ regressions
+
+
+## IRC, freenode, #hurd, 2012-12-10
+
+ <gnu_srs> I have verified your double timeout bug in hurdselect.c.
+ <gnu_srs> Since I'm also working on hurdselect I have a few questions
+ about where the timeouts in mach_msg and io_select are implemented.
+ <gnu_srs> Have a big problem to trace them down to actual code: mig magic
+ again?
+ <braunr> yes
+ <braunr> see hurd/io.defs, io_select includes a waittime timeout:
+ natural_t; parameter
+ <braunr> waittime is mig magic that tells the client side not to wait more
+ than the timeout
+ <braunr> and in _hurd_select, you can see these lines :
+ <braunr> err = __io_select (d[i].io_port, d[i].reply_port,
+ <braunr> /* Poll only if there's a single
+ descriptor. */
+ <braunr> (firstfd == lastfd) ? to : 0,
+ <braunr> to being the timeout previously computed
+ <braunr> "to"
+ <braunr> and later, when waiting for replies :
+ <braunr> while ((msgerr = __mach_msg (&msg.head,
+ <braunr> MACH_RCV_MSG | options,
+ <braunr> 0, sizeof msg, portset, to,
+ <braunr> MACH_PORT_NULL)) ==
+ MACH_MSG_SUCCESS)
+ <braunr> the same timeout is used
+ <braunr> hope it helps
+ <gnu_srs> Additional stuff on io-select question is at
+ http://paste.debian.net/215401/
+ <gnu_srs> Sorry, should have posted it before you comment, but was
+ disturbed.
+ <braunr> 14:13 < braunr> waittime is mig magic that tells the client side
+ not to wait more than the timeout
+ <braunr> the waittime argument is a client argument only
+ <braunr> that's one of the main sources of problems with select/poll, and
+ the one i fixed 6 months ago
+ <gnu_srs> so there is no relation to the third argument of the client call
+ and the third argument of the server code?
+ <braunr> no
+ <braunr> the 3rd argument at server side is undoubtedly the 4th at client
+ side here
+ <gnu_srs> but for the fourth argument there is?
+ <braunr> i think i've just answered that
+ <braunr> when in doubt, check the code generated by mig when building glibc
+ <gnu_srs> as I said before, I have verified the timeout bug you solved.
+ <gnu_srs> which code to look for RPC_*?
+ <braunr> should be easy to guess
+ <gnu_srs> is it the same with mach_msg()? No explicit usage of the timeout
+ there either.
+ <gnu_srs> in the code for the function I mean.
+ <braunr> gnu_srs: mach_msg is a low level system call
+ <braunr> see
+ http://www.gnu.org/software/hurd/gnumach-doc/Mach-Message-Call.html#Mach-Message-Call
+ <gnu_srs> found the definition of __io_select in: RPC_io_select.c, thanks.
+ <gnu_srs> so the client code to look for wrt RPC_ is in hurd/*.defs? what
+ about the gnumach/*/include/*.defs?
+ <gnu_srs> a final question: why use a timeout if there is a single FD for
+ the __io_select call, not when there are more than one?
+ <braunr> well, the code is obviously buggy, so don't expect me to justify
+ wrong code
+ <braunr> but i suppose the idea was : if there is only one fd, perform a
+ classical synchronous RPC, whereas if there are more use a heavyweight
+ portset and additional code to receive replies
+
+ <youpi> exim4 didn't get fixed by the libc patch, unfortunately
+ <braunr> yes i noticed
+ <braunr> gdb can't attach correctly to exim, so it's probably something
+ completely different
+ <braunr> i'll try the non intrusive mode
+
+
# See Also
See also [[select_bogus_fd]] and [[select_vs_signals]].
diff --git a/open_issues/strict_aliasing.mdwn b/open_issues/strict_aliasing.mdwn
index 01019372..b7d39805 100644
--- a/open_issues/strict_aliasing.mdwn
+++ b/open_issues/strict_aliasing.mdwn
@@ -19,3 +19,13 @@ License|/fdl]]."]]"""]]
instead?
<braunr> pinotree: if we can rely on gcc for the warnings, yes
<braunr> but i suspect there might be other silent issues in very old code
+
+
+# IRC, freenode, #hurd, 2012-07-12
+
+ <braunr> btw, i'm building glibc right now, and i can see a few strict
+ aliasing warnings
+ <braunr> fixing them will allow us to avoid wasting time on very obscure
+ issues (if gcc catches them all)
+ <tschwinge> The strict aliasing things should be fixed, yes. Some might be
+ from MIG.
diff --git a/open_issues/synchronous_ipc.mdwn b/open_issues/synchronous_ipc.mdwn
new file mode 100644
index 00000000..53d5d69d
--- /dev/null
+++ b/open_issues/synchronous_ipc.mdwn
@@ -0,0 +1,185 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+
+# IRC, freenode, #hurd, 2012-07-20
+
+From [[Genode RPC|microkernel/genode/rpc]].
+
+ <braunr> assuming synchronous ipc is the way to go (it seems so), there is
+ still the need for some async ipc (e.g signalling untrusted recipients
+ without risking blocking on them)
+ <braunr> 1/ do you agree on that and 2/ how would this low-overhead async
+ ipc be done ? (and 3/ are there relevant examples ?)
+ <antrik> if you think about this stuff too much you will end up like marcus
+ and neal ;-)
+ <braunr> antrik: likely :)
+ <antrik> the truth is that there are various possible designs all with
+ their own tradeoffs, and nobody can really tell which one is better
+ <braunr> the only sensible one i found is qnx :/
+ <braunr> but it's still messy
+ <braunr> they have what they call pulses, with a strictly defined format
+ <braunr> so it's actually fine because it guarantees low overhead, and can
+ easily be queued
+ <braunr> but i'm not sure about the format
+ <antrik> I must say that Neal's half-sync approach in Viengoos still sounds
+ most promising to me. it's actually modelled after the needs of a
+ Hurd-like system; and he thought about it a lot...
+ <braunr> damn i forgot to reread that
+ <braunr> stupid me
+ <antrik> note that you can't come up with a design that allows both a)
+ delivering reliably and b) never blocking the sender -- unless you cache
+ in the kernel, which we don't want
+ <antrik> but I don't think it's really necessary to fulfill both of these
+ requirements
+ <antrik> it's up to the receiver to make sure it gets important signals
+ <braunr> right
+ <braunr> caching in the kernel is ok as long as the limit allows the
+ receiver to handle its signals
+ <antrik> in the Viengoos approach, the receiver can allocate a number of
+ receive buffers; so it's even possible to do some queuing if desired
+ <braunr> ah great, limits in the form of resources lent by the receiver
+ <braunr> one thing i really don't like in mach is the behaviour on full
+ message queues
+ <braunr> blocking :/
+ <braunr> i bet the libpager deadlock is due to that
+
+[[libpager_deadlock]].
+
+ <braunr> it simply means async ipc doesn't prevent at all from deadlocks
+ <antrik> the sender can set a timeout. blocking only happens when setting
+ it to infinite...
+ <braunr> which is commonly the case
+ <antrik> well, if you see places where blocking is done but failing would
+ be more appropriate, try changing them I'd say...
+ <braunr> it's not that easy :/
+
+
+# IRC, freenode, #hurd, 2012-08-18
+
+ <lcc> what is the deepest design mistake of the HURD/gnumach?
+ <braunr> lcc: async ipc
+ <savask> braunr: You mentioned that moving to L4 will create problems. Can
+ you name some, please?
+ <savask> I thought it was going to be faster on L4
+ <braunr> the problem is that l4 *only* provides sync ipc
+ <braunr> so implementing async communication would require one separate
+ thread for each instance of async communication
+ <savask> But you said that the deepest design mistake of Hurd is asynch
+ ipc.
+ <braunr> not the hurd, mach
+ <braunr> and hurd depends on it now
+ <braunr> i said l4 provides *only* sync ipc
+ <braunr> systems require async communication tools
+ <braunr> but they shouldn't be built entirely on top of them
+ <savask> Hmm, so you mean mach has bad asynch ipc?
+ <braunr> you can consider mach and l4 as two extremes in os design
+ <braunr> mach *only* has async ipc
+ <lcc> what was viengoos trying to explore?
+ * savask is confused
+ <braunr> lcc: half-sync ipc :)
+ <braunr> lcc: i can't tell you more on that, i need to understand it better
+ myself before any explanation attempt
+ <savask> You say that mach problem is asynch ipc. And L4's problem is it's
+ sync ipc. That means problems are in either of them!
+ <braunr> exactly
+ <lcc> how did apple resolve issues with mach?
+ <savask> What is perfect then? A "golden middle"?
+ <braunr> lcc: they have migrating threads, which make most rpc behave as if
+ they used sync ipc
+ <braunr> savask: nothing is perfect :p
+ <mcsim> braunr: but why async ipc is the problem?
+ <braunr> mcsim: it requires in-kernel buffering
+ <savask> braunr: Yes, but we can't have problems everywhere o_O
+ <braunr> mcsim: this not only reduces communication performance, but
+ creates many resource usage problems
+ <braunr> mcsim: and potential denial of service, which is what we
+ experience most of the time when something in the hurd fails
+ <braunr> savask: there are problems we can live with
+ <mcsim> braunr: But this could be replaced by userspace server, isn't it?
+ <braunr> savask: this is what monolithic kernels do
+ <braunr> mcsim: what ?
+ <braunr> mcsim: this would be the same, this central buffering server would
+ suffer from the same kind of issue
+ <mcsim> braunr: async ipc. Buffer can hold special server
+ <mcsim> But there could be created several servers, and queue could have
+ limit.
+ <braunr> queue limits are a problem
+ <braunr> when a queue limit is reached, you either block (= sync ipc) or
+ lose a message
+ <braunr> to keep messaging reliable, mach makes senders block
+ <braunr> the problem is that async ipc is often used to avoid blocking
+ <braunr> so blocking when you don't expect it can create deadlocks
+ <braunr> savask: a good compromise is to use sync ipc most of the time, and
+ async ipc for a few special cases, like signals
+ <braunr> this is what okl4 does if i'm right
+ <braunr> i'm not sure of the details, but like many other projects they
+ realized current systems simply need good support for async ipc, so they
+ extended l4 or something on top of it to provide it
+ <braunr> it took years of research for very smart people to get to some
+ consensus like "sync ipc is better but async is needed too"
+ <braunr> personally i don't like l4 :/
+ <braunr> really not
+ <mcsim> braunr: Anyway there is some queue for messaging, but at the moment
+ if it overflows panics kernel. And with limited queue servers will panic.
+ <braunr> mcsim: it can't overflow
+ <braunr> mach blocks senders
+ <braunr> queuing basically means "block and possible deadlock" or "lose
+ messages and live with it"
+ <mcsim> So, deadlocks are still possible?
+ <braunr> of course
+ <braunr> have a look at the libpager debian patch and the discussion around
+ it
+ <braunr> it's a perfect example
+ <youpi> braunr: it makes gnu mach slow as hell sometimes, which I guess is
+ because all threads (which can be 1000s) wake at the same time
+ <braunr> youpi: you mean are created ?
+ <braunr> because they'll have to wake in any case
+ <braunr> i can understand why creating lots of threads is slower, but
+ cthreads never destroys kernel threads
+ <braunr> doesn't seem to be a mach problem, rather a cthreads one
+ <braunr> i hope we're able to remove the patch after pthreads are used
+
+[[libpthread]].
+
+ <mcsim> braunr: You state that hurd can't move to sync ipc, since it
+ depends on async ipc. But at the same time async ipc doesn't guarantee
+ that task wouldn't block. So, I don't understand why limited queues will
+ lead to more deadlocks?
+ <braunr> mcsim: async ipc can block because of queue limits
+ <braunr> mcsim: if you remove the limit, you remove the deadlock problem,
+ and replace it with denial of service
+ <braunr> mcsim: i didn't say the hurd can't move to sync ipc
+ <braunr> mcsim: i said it came to depend on async ipc as provided by mach,
+ and we would need to change that
+ <braunr> and it's tricky
+ <youpi> braunr: no, I really mean are woken. The timeout which gets dropped
+ by the patch makes threads wake after some time, to realize they should
+ go away. It's a hell long when all these threads wake at the same time
+ (because they got created at the same time)
+ <braunr> ahh
+
+ <antrik> savask: what is perfect regarding IPC is something nobody can
+ really answer... there are competing opinions on that matter. but we know
+ by now that the Mach model is far from ideal, and that the (original) L4
+ model is also problematic -- at least for implementing a UNIX-like system
+ <braunr> personally, if i'd create a system now, i'd use sync ipc for
+ almost everything, and implement posix-like signals in the kernel
+ <braunr> that's one solution, it's not perfect
+ <braunr> savask: actually the real answer may be "no one knows for now and
+ it still requires work and research"
+ <braunr> so for now, we're using mach
+ <antrik> savask: regarding IPC, the path explored by Viengoos (and briefly
+ Coyotos) seems rather promising to me
+ <antrik> savask: and yes, I believe that whatever direction we take, we
+ should do so by incrementally reworking Mach rather than jumping to a
+ completely new microkernel...
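
The queue-limit trade-off discussed above (when a queue is full, the sender either blocks or the message is lost) can be sketched in portable C. This is only a toy model, not Mach code: `QUEUE_LIMIT`, `queue_send`, `queue_receive` and the policy names are hypothetical, and "blocking" is represented by a `SEND_WOULD_BLOCK` return value rather than by actually suspending a thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_LIMIT 4 /* hypothetical limit, chosen only for illustration */

/* What the queue does when a sender hits the limit. */
enum full_policy { POLICY_BLOCK, POLICY_DROP };

enum send_result { SEND_OK, SEND_WOULD_BLOCK, SEND_DROPPED };

/* Fixed-size ring buffer standing in for a per-port message queue. */
struct msg_queue {
    int slots[QUEUE_LIMIT];
    size_t head;  /* index of the oldest queued message */
    size_t count; /* number of queued messages */
};

/* Enqueue one message. On a full queue nothing is stored: the caller is
   told it would have to block, or that the message was dropped. */
static enum send_result queue_send(struct msg_queue *q, int msg,
                                   enum full_policy policy)
{
    assert(q != NULL);
    if (q->count == QUEUE_LIMIT)
        return policy == POLICY_BLOCK ? SEND_WOULD_BLOCK : SEND_DROPPED;
    q->slots[(q->head + q->count) % QUEUE_LIMIT] = msg;
    q->count++;
    return SEND_OK;
}

/* Dequeue the oldest message; returns false if the queue is empty. */
static bool queue_receive(struct msg_queue *q, int *msg)
{
    assert(q != NULL && msg != NULL);
    if (q->count == 0)
        return false;
    *msg = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_LIMIT;
    q->count--;
    return true;
}
```

In this model, a sender that gets `SEND_WOULD_BLOCK` while the receiver is itself waiting on that sender corresponds to the deadlock case described in the log; choosing `POLICY_DROP` avoids the wait but gives up reliable delivery.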
diff --git a/open_issues/system_stats.mdwn b/open_issues/system_stats.mdwn
new file mode 100644
index 00000000..9a13b29a
--- /dev/null
+++ b/open_issues/system_stats.mdwn
@@ -0,0 +1,39 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_documentation]]There should be a page listing ways to get
+system statistics, how to interpret them, and some example/expected values.
+
+
+# IRC, freenode, #hurd, 2012-11-04
+
+ <mcsim> Hi, is that normal that memory cache "ipc_port" is 24 Mb already?
+ Some memory has been already swapped out.
+ <mcsim> Other caches are big too
+ <braunr> how many ports ?
+ <mcsim> 45922
+ <braunr> yes it's normal
+ <braunr> ipc_port 0010 76 4k 50 45937 302050
+ 24164k 4240k
+ <braunr> it's a bug in exim
+ <braunr> or triggered by exim, from time to time
+ <braunr> lots of ports are created until the faulty processes are killed
+ <braunr> the other big caches you have are vm_object and vm_map_entry,
+ probably because of a big build like glibc
+ <braunr> and if they remain big, it's because there was no memory pressure
+ since they got big
+ <braunr> memory pressure can only be caused by very large files on the
+ hurd, because of the limited page cache size (4000 objects at most)
+ <braunr> the reason you have swapped memory is probably because of a glibc
+ test that allocates a very large (more than 1.5 GiB iirc) block and fills
+ it
+ <mcsim> yes
+ <braunr> (a test that fails with the 2G/2G split of the debian kernel, but
+ not on your vanilla version btw)
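
As a worked example of interpreting the `ipc_port` line quoted above, the following C sketch recomputes the cache size from the other fields. The column meanings are an assumption inferred from the numbers themselves (object size 76 bytes, slab size 4k, 50 buffers per slab, 45937 objects in use, 302050 buffers in total); they are not taken from documentation, and the struct and function names are hypothetical.

```c
#include <assert.h>

/* One line of the cache statistics, with ASSUMED column meanings. */
struct cache_line {
    unsigned long obj_size;      /* bytes per object (76) */
    unsigned long slab_size_kib; /* KiB per slab (4) */
    unsigned long bufs_per_slab; /* buffers per slab (50) */
    unsigned long objs_in_use;   /* allocated objects (45937) */
    unsigned long bufs_total;    /* buffers in all slabs (302050) */
};

/* Memory held by the cache: number of slabs times slab size. */
static unsigned long cache_total_kib(const struct cache_line *c)
{
    assert(c != 0 && c->bufs_per_slab != 0);
    /* Round up in case the last slab is only partially populated. */
    unsigned long slabs =
        (c->bufs_total + c->bufs_per_slab - 1) / c->bufs_per_slab;
    return slabs * c->slab_size_kib;
}

/* Memory occupied by objects that are actually in use (rounded down). */
static unsigned long cache_in_use_kib(const struct cache_line *c)
{
    assert(c != 0);
    return c->objs_in_use * c->obj_size / 1024;
}
```

Under these assumptions, 302050 buffers at 50 per slab give 6041 slabs of 4k each, that is 24164k, which matches the total shown in the log, while the roughly 46000 ports actually in use account for only about 3.4 MiB of it.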
diff --git a/open_issues/term_blocking.mdwn b/open_issues/term_blocking.mdwn
index 19d18d0e..0ed0b4df 100644
--- a/open_issues/term_blocking.mdwn
+++ b/open_issues/term_blocking.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2009, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2009, 2011, 2012 Free Software Foundation,
+Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -113,9 +114,198 @@ user started GDB test suite, noticed the PTY it's using; in a root shell
started GDB (the system one, for `.debug` stuff) on `/hurd/term`, `set
noninvasive on`, attach to the *term* that GDB is using.
+---
[[2011-07-04]].
+---
+
+2012-11-05
+
+Log file from a 2011-09-07 run:
+
+ [...]
+ Running ../../../master/gdb/testsuite/gdb.base/readline.exp ...
+ spawn [...]/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory [...]/gdb/testsuite/../data-directory
+ GNU gdb (GDB) 7.3.50.20110906-cvs
+ Copyright (C) 2011 Free Software Foundation, Inc.
+ License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
+ This is free software: you are free to change and redistribute it.
+ There is NO WARRANTY, to the extent permitted by law. Type "show copying"
+ and "show warranty" for details.
+ This GDB was configured as "i686-unknown-gnu0.3".
+ For bug reporting instructions, please see:
+ <http://www.gnu.org/software/gdb/bugs/>.
+ (gdb) set height 0
+ (gdb) set width 0
+ (gdb) dir
+ Reinitialize source path to empty? (y or n) y
+ Source directories searched: $cdir:$cwd
+ (gdb) dir ../../../master/gdb/testsuite/gdb.base
+ Source directories searched: [...]/gdb/testsuite/../../../master/gdb/testsuite/gdb.base:$cdir:$cwd
+ (gdb) p 1
+ $1 = 1
+ PASS: gdb.base/readline.exp: Simple operate-and-get-next - send p 1
+ (gdb) p 2
+ $2 = 2
+ PASS: gdb.base/readline.exp: Simple operate-and-get-next - send p 2
+ (gdb) p 3
+ $3 = 3
+ PASS: gdb.base/readline.exp: Simple operate-and-get-next - send p 3
+ (gdb) p 3(gdb) p 3PASS: gdb.base/readline.exp: Simple operate-and-get-next - C-p to p 3
+ ^H2(gdb) p 2PASS: gdb.base/readline.exp: Simple operate-and-get-next - C-p to p 2
+ ^H1(gdb) p 1PASS: gdb.base/readline.exp: Simple operate-and-get-next - C-p to p 1
+ ^OFAIL: gdb.base/readline.exp: Simple operate-and-get-next - C-o for p 1
+ FAIL: gdb.base/readline.exp: operate-and-get-next with secondary prompt - send if 1 > 0
+ FAIL: gdb.base/readline.exp: print 42 (timeout)
+ FAIL: gdb.base/readline.exp: arrow keys with secondary prompt (timeout)
+ spawn [...]/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory [...]/gdb/testsuite/../data-directory
+ ERROR: (timeout) GDB never initialized after 10 seconds.
+ ERROR: no fileid for coulomb
+ ERROR: no fileid for coulomb
+ UNRESOLVED: gdb.base/readline.exp: Simple operate-and-get-next - send p 7
+ testcase ../../../master/gdb/testsuite/gdb.base/readline.exp completed in 646 seconds
+ Running ../../../master/gdb/testsuite/gdb.base/wchar.exp ...
+ Executing on host: gcc -c -g -o [...]/gdb/testsuite/gdb.base/wchar0.o ../../../master/gdb/testsuite/gdb.base/wchar.c (timeout = 300)
+ spawn gcc -c -g -o [...]/gdb/testsuite/gdb.base/wchar0.o ../../../master/gdb/testsuite/gdb.base/wchar.c
+ Executing on host: gcc [...]/gdb/testsuite/gdb.base/wchar0.o -g -lm -o [...]/gdb/testsuite/gdb.base/wchar (timeout = 300)
+ spawn gcc [...]/gdb/testsuite/gdb.base/wchar0.o -g -lm -o [...]/gdb/testsuite/gdb.base/wchar
+ get_compiler_info: gcc-4-6-1
+ spawn [...]/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory [...]/gdb/testsuite/../data-directory
+ ERROR: (timeout) GDB never initialized after 10 seconds.
+ ERROR: no fileid for coulomb
+ ERROR: no fileid for coulomb
+ ERROR: no fileid for coulomb
+ ERROR: couldn't load [...]/gdb/testsuite/gdb.base/wchar into [...]/gdb/testsuite/../../gdb/gdb (timed out).
+ ERROR: no fileid for coulomb
+ ERROR: Delete all breakpoints in delete_breakpoints (timeout)
+ ERROR: no fileid for coulomb
+ UNRESOLVED: gdb.base/wchar.exp: setting breakpoint at wchar.c:34 (timeout)
+ testcase ../../../master/gdb/testsuite/gdb.base/wchar.exp completed in 797 seconds
+ [...]
+
+
+# IRC, freenode, #hurd, 2012-08-09
+
+In context of the [[select]] issue.
+
+ <braunr> i wonder where the tty allocation is made
+ <braunr> it could simply be that current applications don't handle old BSD
+ ptys correctly
+ <braunr> hm no, allocation is fine
+ <braunr> does someone know why there is no term instance for /dev/ttypX ?
+ <braunr> showtrans says "/hurd/term /dev/ttyp0 pty-slave /dev/ptyp0" though
+ <youpi> braunr: /dev/ttypX share the same translator with /dev/ptypX
+ <braunr> youpi: but how ?
+ <youpi> see the main function of term
+ <youpi> it attaches itself to the other node
+ <youpi> with file_set_translator
+ <youpi> just like pfinet can attach itself to /servers/socket/26 too
+ <braunr> youpi: isn't there a possible race when the same translator tries
+ to sets itself on several nodes ?
+ <youpi> I don't know
+ <tschwinge> There is.
+ <braunr> i guess it would just faikl
+ <braunr> fail
+ <tschwinge> I remember some discussion about this, possibly in context of
+ the IPv6 project.
+ <braunr> gdb shows weird traces in term
+ <braunr> i got this earlier today: http://www.sceen.net/~rbraun/gdb.txt
+ <braunr> 0x805e008 is the ptyctl, the trivs control for the pty
+ <tschwinge> braunr: How do you mean »weird«?
+ <braunr> tschwinge: some peropen (po) are never destroyed
+ <tschwinge> Well, can't they possibly still be open?
+ <braunr> they shouldn't
+ <braunr> that's why term doesn't close cleanly, why select still reports
+ readiness, and why screen loops on it
+ <braunr> (and why each ssh session uses a different pty)
+ <tschwinge> ... but only on darnassus, I think? (I think I haven't seen
+ this anywhere else.)
+ <braunr> really ?
+ <braunr> i had it on my virtual machines too
+ <tschwinge> But perhaps I've always been rebooting systems quickly enough
+ to not notice.
+ <tschwinge> OK, I'll have a look next time I boot mine.
+ <braunr> i suppose it's why you can't login anymore quickly when syslog is
+ running
+
+[[syslog]]?
+
+ <braunr> i've traced the problem to ptyio.c, where pty_open_hook returns
+ EBUSY because ptyopen is still true
+ <braunr> ptyopen remains true because pty_po_create_hook doesn't get called
+ <youpi> tschwinge: I've seen the pty issue on exodar too, and on my qemu
+ image too
+ <braunr> err, pty_po_destroy_hook
+ <tschwinge> OK.
+ <braunr> and pty_po_destroy_hook doesn't get called from users.c because
+ po->cntl != ptyctl
+ <braunr> which means, somehow, the pty never gets closed
+ <youpi> oddly enough it seems to happen on all qemu systems I have, and no
+ xen system I have
+ <braunr> Oo
+ <braunr> are they all (xen and qemu) up to date ?
+ <braunr> (so we can remove versions as a factor)
+ <tschwinge> Aha. I only have Xen and real hardware.
+ <youpi> braunr: no
+ <braunr> youpi: do you know any obscure site about ptys ? :)
+ <youpi> no
+ <youpi> well, actually yes
+ <youpi> http://dept-info.labri.fr/~thibault/a (in French)
+ <braunr> :D
+ <braunr> http://www.linusakesson.net/programming/tty/index.php looks
+ interesting
+ <youpi> indeed
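
The failure chain traced above (the open hook returns `EBUSY` because `ptyopen` is still true, and `ptyopen` stays true because the destroy hook is skipped when `po->cntl != ptyctl`) can be illustrated with a small self-contained model. This is not the term server's code: the `model_` functions and the two structs are hypothetical stand-ins that only reuse the names `ptyopen`, `ptyctl` and `cntl` from the discussion, and the model does not explain why the control pointers differ in the real server.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct control { int unused; };           /* stand-in for a control structure */
struct peropen { struct control *cntl; }; /* stand-in for a per-open record */

static struct control ptyctl_storage;
static struct control *ptyctl = &ptyctl_storage; /* the pty's control */
static bool ptyopen = false;                     /* "master side is open" */

/* Model of the open hook: refuse a second open while the flag is set. */
static int model_pty_open(struct peropen *po, struct control *cntl)
{
    assert(po != NULL);
    if (ptyopen)
        return EBUSY;
    ptyopen = true;
    po->cntl = cntl;
    return 0;
}

/* Model of the per-open destruction path: the flag is only cleared when
   the per-open record points at the pty control. */
static void model_po_destroy(struct peropen *po)
{
    assert(po != NULL);
    if (po->cntl == ptyctl)
        ptyopen = false;
}
```

In this model, a per-open record whose `cntl` is not `ptyctl` leaves `ptyopen` set after it is destroyed, so every later open fails with `EBUSY`, which is the behaviour observed on `/dev/ptyp0`.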
+
+
+## IRC, freenode, #hurdfr, 2012-08-09
+
+ <braunr> youpi: what i have the most trouble understanding is what a
+ "controlling tty" is
+ <youpi> it's the most obscure of the obscure :)
+ <braunr> whether it's exclusive to one application, how it should behave
+ on a fork, etc..
+ <youpi> put simply, it's what makes ^C work
+ <braunr> right, and that's surely where it blows up
+ <youpi> it's not exclusive, it's inherited
+ <braunr>
+ http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/bernstein-on-ttys/cttys.html
+
+
+## IRC, freenode, #hurd, 2012-08-10
+
+ <braunr> youpi: and just to be sure about the test procedure, i log on a
+ system, type tty, see e.g. ttyp0, log out, and in again, then tty returns
+ ttyp1, etc..
+ <youpi> yes
+ <braunr> youpi: and an open (e.g. cat) on /dev/ptyp0 returns EBUSY
+ <youpi> indeed
+ <braunr> so on xen it doesn't
+ <braunr> grmbl
+ <youpi> I've never seen it, more precisely
+ <braunr> i also have the problem with a non-accelerated qemu
+ <braunr> antrik: do you have the term problems we've seen on your bare
+ hardware ?
+ <antrik> I'm not sure what problem you are seeing exactly :-)
+ <braunr> antrik: when logging through ssh, tty first returns ttyp0, and the
+ second time (after logging out from the first session) ttyp1
+ <braunr> antrik: and term servers that have been used are then stuck in a
+ busy state
+ <antrik> braunr: my ptys seem to be reused just fine
+ <braunr> or perhaps they didn't have the bug
+ <braunr> antrik: that's so weird
+ <antrik> (I do *sometimes* get hanging ptys, but that's a different issue
+ -- these are *not* busy; they just hang when reused...)
+ <braunr> antrik: yes i saw that too
+ <antrik> braunr: note though that my hurd package is many months old...
+ <antrik> (in fact everything on this system)
+ <braunr> antrik: i didn't see anything relevant about the term server in
+ years
+ <braunr> antrik: what shell do you use ?
+ <antrik> yeah, but such errors could be caused by all kinds of changes in
+ other parts of the Hurd, glibc, whatever...
+ <antrik> bash
+
# Formal Verification
diff --git a/open_issues/user-space_device_drivers.mdwn b/open_issues/user-space_device_drivers.mdwn
index 25168fce..8cde8281 100644
--- a/open_issues/user-space_device_drivers.mdwn
+++ b/open_issues/user-space_device_drivers.mdwn
@@ -50,6 +50,65 @@ Also see [[device drivers and IO systems]].
* I/O MMU.
+
+### IRC, freenode, #hurd, 2012-08-15
+
+ <carli2> hi. does hurd support mesa?
+ <braunr> carli2: software only, but yes
+ <carli2> :(
+ <carli2> so you did not solve the problem with the CS checkers and GPU DMA
+ for microkernels yet, right?
+ <braunr> cs = ?
+ <carli2> control stream
+ <carli2> the data sent to the gpu
+ <braunr> no
+ <braunr> and to be honest we're not currently trying to
+ <carli2> well, a microkernel containing cs checkers for each hardware is
+ not a microkernel any more
+ <braunr> the problem is having the ability to check
+ <braunr> or rather, giving only what's necessary to delegate checking to
+ mmus
+ <carli2> but maybe the kernel could have a smaller interface like a
+ function to check if a memory block is owned by a process
+ <braunr> i'm not sure what you refer to
+ <carli2> about DMA-capable devices you can send messages to
+ <braunr> carli2: dma must be delegated to a trusted server
+ <carli2> linux checks the data sent to these devices, parses them and
+ checks all pointers if they are in a memory range that the client is
+ allowed to read/write from
+ <braunr> the client ?
+ <carli2> in linux, 3d drivers are in user space, so the kernel side checks
+ the pointer sent to the GPU
+ <youpi> carli2: mach could do that as well
+ <braunr> well, there is a rather large part in kernel space too
+ <carli2> so in hurd I trust some drivers to not do evil things?
+ <braunr> those in the kernel yes
+ <carli2> what does "in the kernel" mean? afaik a microkernel only has
+ memory manager and some basic memory sharing and messaging functionality
+ <braunr> did you read about the hurd ?
+ <braunr> mach is considered a hybrid kernel, not a true microkernel
+ <braunr> even with all drivers outside, it's still a hybrid
+ <youpi> although we're to move some parts into userlands :)
+ <youpi> braunr: ah, why?
+ <braunr> youpi: the vm part is too large
+ <youpi> ok
+ <braunr> the microkernel dogma is no policy inside the kernel
+ <braunr> "except scheduling because it's very complicated"
+ <braunr> but all modern systems have moved memory management outside the
+ kernel, leaving just the kernel abstraction inside
+ <braunr> the address space kernel abstraction
+ <braunr> and the two components required to make it work are what l4re
+ calls region mappers (the rough equivalent of our vm_map), which decides
+ how to allocate regions in an address space
+ <braunr> and the pager, like ours, which are already external
+ <carli2> i'm not a OS developer, i mostly develop games, web services and
+ sometimes I fix gpu drivers
+ <braunr> that was just FYI
+ <braunr> but yes, dma must be considered something privileged
+ <braunr> and the hurd doesn't have the infrastructure you seem to be
+ looking for
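
The kind of check carli2 describes (verifying that every pointer in data sent to a DMA-capable device lies in memory the client may access) can be sketched as a plain range test. This is a minimal illustration, not code from Linux, Mach or any GPU driver: `struct mem_range`, `range_contains` and `command_stream_ok` are hypothetical names, and a real checker would also have to parse the device-specific command format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A contiguous region: either one the client owns, or one that a
   command asks the device to access. */
struct mem_range {
    uintptr_t base;
    size_t len;
};

/* True if [addr, addr + len) lies entirely inside *allowed.
   Written with subtractions so that addr + len cannot overflow. */
static bool range_contains(const struct mem_range *allowed,
                           uintptr_t addr, size_t len)
{
    assert(allowed != NULL);
    if (addr < allowed->base)
        return false;
    uintptr_t off = addr - allowed->base;
    if (off > allowed->len)
        return false;
    return len <= allowed->len - off;
}

/* True if every region referenced by the command stream falls inside
   at least one region the client is allowed to use. */
static bool command_stream_ok(const struct mem_range *allowed, size_t n_allowed,
                              const struct mem_range *refs, size_t n_refs)
{
    for (size_t i = 0; i < n_refs; i++) {
        bool ok = false;
        for (size_t j = 0; j < n_allowed && !ok; j++)
            ok = range_contains(&allowed[j], refs[i].base, refs[i].len);
        if (!ok)
            return false;
    }
    return true;
}
```

Whichever component performs this check has to be trusted, which is the point made in the log: DMA is privileged, so the check belongs in the kernel or in a trusted server.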
+
+
## I/O Ports
* Security considerations.
@@ -63,8 +122,13 @@ Also see [[device drivers and IO systems]].
* [[GNU Mach|microkernel/mach/gnumach]] is said to have a high overhead when
doing RPC calls.
+
## System Boot
+A similar problem is described in
+[[community/gsoc/project_ideas/unionfs_boot]], and needs to be implemented.
+
+
### IRC, freenode, #hurd, 2011-07-27
< braunr> btw, was there any formulation of the modifications required to
@@ -89,12 +153,270 @@ Also see [[device drivers and IO systems]].
< Tekk_> mhm
< braunr> s/disk/storage/
+
### IRC, freenode, #hurd, 2012-04-25
<youpi> btw, remember the initrd thing?
<youpi> I just came across task.c in libstore/ :)
+### IRC, freenode, #hurd, 2012-07-17
+
+ <bddebian> OK, here is a stupid question I have always had. If you move
+ PCI and disk drivers in to userspace, how do you do the initial bootstrap
+ to get the system booting?
+ <braunr> that's hard
+ <braunr> basically you make the boot loader load all the components you
+ need in ram
+ <braunr> then you make it give each component something (ports) so they can
+ communicate
+
+
+### IRC, freenode, #hurd, 2012-08-12
+
+ <antrik> braunr: so, about booting with userspace disk drivers
+ <antrik> after rereading the chapter in my thesis, I see that there aren't
+ really all that many interesting options...
+ <antrik> I pondered some variants involving a temporary boot filesystem
+ with handoff to the real root FS; but ultimately concluded with another
+ option that is slightly less elegant but probably gets a much better
+ usefulness/complexity ratio:
+ <antrik> just start the root filesystem as the first process as we used to;
+ only hack it so that initially it doesn't try to access the disk, but
+ instead gets the files from GRUB
+ <antrik> once the disk driver is operational, we flip a switch, and the
+ root filesystem starts reading stuff from disk normally
+ <antrik> transparently for all other processes
+ <bddebian> How does grub access the disk without drivers?
+ <antrik> bddebian: GRUB obviously has its own drivers... that's how it
+ loads the kernel and modules
+ <antrik> bddebian: basically, it would have to load additional modules for
+ all the components necessary to get the Hurd disk driver going
+ <bddebian> Right, why wouldn't that be possible?
+ <antrik> (I have some more crazy ideas too -- but these are mostly
+ orthogonal :-) )
+ <antrik> ?
+ <antrik> I'm describing this because I'm pretty sure it *is* possible :-)
+ <bddebian> That grub loads the kernel and whatever server/module gets
+ access to the disk
+ <antrik> not sure what you mean
+ <bddebian> Well as usual I probably don't know the proper terminology but
+ why could grub load gnumach and the hurd "disk server" that contains the
+ userspace drivers?
+ <antrik> disk server?
+ <bddebian> Oh FFS whatever contains the disk drivers :)
+ <bddebian> diskdde, whatever :)
+ <antrik> actually, I never liked the idea of having a big driver blob very
+ much... ideally each driver should have it's own file
+ <antrik> but that's admittedly beside the point :-)
+ <antrik> its
+ <antrik> so to restate: in addition to gnumach, ext2fs.static, and ld.so,
+ in the new scenario GRUB will also load exec, the disk driver, any
+ libraries these two depend upon, and any additional infrastructure
+ involved in getting the disk driver running (for automatic probing or
+ whatever)
+ <antrik> probably some other Hurd core servers too, so we can have a more
+ complete POSIX environment for the disk driver to run in
+ <bddebian> There ya go :)
+ <antrik> the interesting part is modifying ext2fs so it will access only
+ the GRUB-provided files, until it is told that it's OK now to access the
+ real disk
+ <antrik> (and the mechanism how ext2 actually gets at the GRUB-provided
+ files)
+ <bddebian> Or write some new really small ext2fs? :)
+ <antrik> ?
+ <bddebian> I'm just talking out my butt. Something temporary that gets
+ disposed of when the real disk is available :)
+ <antrik> well, I mentioned above that I considered some handoff
+ schemes... but they would probably be more complex to implement than
+ doing the switchover internally in ext2
+ <bddebian> Ah
+ <bddebian> boot up in a ramdisk? :)
+ <antrik> (and the temporary FS would *not* be an ext2 obviously, but rather
+ some special ramdisk-like filesystem operating from GRUB-loaded files...)
+ <antrik> again, that would require a complicated handoff-scheme
+ <bddebian> Bah, what do I know? :)
+ <antrik> (well, you could of course go with a trivial chroot()... but that
+ would be ugly and inefficient, as the initial processes would still run
+ from the ramdisk)
+ <bddebian> Aren't most things running in memory initially anyway? At what
+ point must it have access to the real disk?
+ <braunr> antrik: but doesn't that require that disk drivers be statically
+ linked ?
+ <braunr> and having all disk drivers in separate tasks (which is what we
+ prefer to blobs as you put it) seems to pretty much forbid using static
+ linking
+ <braunr> hm actually, i don't see how any solution could work without
+ static linking, as it would create a recursion
+ <braunr> and the only one required is the one used by the root file system
+ <braunr> others can be run from the dynamically linked version
+ <braunr> antrik: i agree, it's a good approach, requiring only a slightly
+ more complicated boot script/sequence
+ <antrik> bddebian: at some point we have to access the real disk so we
+ don't have to work exclusively with stuff loaded by grub... but there is
+ no specific point where it *has* to happen. generally speaking, the
+ sooner the better
+ <antrik> braunr: why wouldn't that work with a dynamically linked disk
+ driver? we only need to make sure all required libraries are loaded by
+ grub too
+ <braunr> antrik: i have a problem with that approach :p
+ <braunr> antrik: it would probably require a reboot when those libraries
+ are upgraded, wouldn't it ?
+ <antrik> I'd actually wish we could run with a dynamically linked ext2fs as
+ well... but that would require a separated boot filesystem and some kind
+ of handoff approach, which would be much more complicated I fear...
+ <braunr> and if a driver is restarted, would it use those libraries too ?
+ and if so, how to find them ?
+ <braunr> but how can you run a dynamically linked root file system ?
+ <braunr> unless the libraries it uses are provided by something else, as
+ you said
+ <antrik> braunr: well, if you upgrade the libraries, *and* want the disk
+ driver to use the upgraded libraries, you are obviously in a tricky
+ situation ;-)
+ <braunr> yes
+ <antrik> perhaps you could tell ext2 to preload the new libraries before
+ restarting the disk driver...
+ <antrik> but that's a minor quibble anyways IMHO
+ <braunr> but that case isn't that important actually, since upgrading these
+ libraries usually means we're upgrading the system, which can imply a
+ reboot
+ <braunr> i don't think it is
+ <braunr> it looks very complicated to me
+ <braunr> think of restart as after a crash :p
+ <braunr> you can't preload stuff in that case
+ <antrik> uh? I don't see anything particularly complicated. but my point
+ was more that it's not a big thing if that's not implemented IMHO
+ <braunr> right
+ <braunr> it's not that important
+ <braunr> but i still think statically linking is better
+ <braunr> although i'm not sure about some details
+ <antrik> oh, you mean how to make the root filesystem use new libraries
+ without a reboot? that would be tricky indeed... but this is not possible
+ right now either, so that's not a regression
+ <braunr> i assume that, when statically linking, only the .o providing the
+ required symbols are included, right ?
+ <antrik> making the root filesystem restartable is a whole different epic
+ story ;-)
+ <braunr> antrik: not the root file system, but the disk driver
+ <braunr> but i guess it's the same
+ <antrik> no, it's not
+ <braunr> ah
+ <antrik> for the disk driver it's really not that hard I believe
+ <antrik> still some extra effort, but definitely doable
+ <braunr> with the preload you mentioned
+ <antrik> yes
+ <braunr> i see
+ <braunr> i don't think it's worth the trouble actually
+ <braunr> statically linking looks way simpler and should make for smaller
+ binaries than if libraries were loaded by grub
+ <antrik> no, I really don't want statically linked disk drivers
+ <braunr> why ?
+ <antrik> again, I'd prefer even ext2fs to be dynamic -- only that would be
+ much more complicated
+ <braunr> the point of dynamically linking is sharing
+ <antrik> while dynamic disk drivers do not require any extra effort beyond
+ loading the libraries with grub
+ <braunr> but if it means sharing big files that are seldom used (i assume
+ there is a lot of code that simply isn't used by hurd servers), i don't
+ see the point
+ <antrik> right. and with the approach I proposed that will work just as it
+ should
+ <antrik> err... what big files?
+ <braunr> glibc ?
+ <antrik> I don't get your point
+ <antrik> you prefer statically linking everything needed before the disk
+ driver runs (which BTW is much more than only the disk driver itself) to
+ using normal shared libraries like the rest of the system?...
+ <braunr> it's not "like the rest of the system"
+ <braunr> the libraries loaded by grub wouldn't be backed by the ext2fs server
+ <braunr> they would be wired in memory
+ <braunr> you'd have two copies of them, the one loaded by grub, and the one
+ shared by normal executables
+ <antrik> no
+ <braunr> i prefer static linking because, if done correctly, the combined
+ size of the root file system and the disk driver should be smaller than
+ that of the rootfs+disk driver and libraries loaded by grub
+ <antrik> apparently I was not quite clear how my approach would work :-(
+ <braunr> probably not
+ <antrik> (preventing that is actually the reason why I do *not* want a
+ simple boot filesystem+chroot approach)
+ <braunr> and initramfs can be easily freed after init
+ <braunr> an*
+ <braunr> it wouldn't be a chroot but something a bit more involved like
+ switch_root in linux
+ <antrik> not if various servers use files provided by that init filesystem
+ <antrik> yes, that's the complex handoff I'm talking about
+ <braunr> yes
+ <braunr> that's one approach
+ <antrik> as I said, that would be a quite elegant approach (allowing a
+ dynamically linked ext2); but it would be much more complicated to
+ implement I believe
+ <braunr> how would it allow a dynamically linked ext2 ?
+ <braunr> how can the root file system be linked with code backed by itself
+ ?
+ <braunr> unless it requires wiring all its memory ?
+ <antrik> it would be loaded from the init filesystem before the handoff
+ <braunr> init isn't the problem here
+ <braunr> i understand how it would boot
+ <braunr> but then, you need to make sure the root fs is never used to
+ service page faults on its own address space
+ <braunr> or any address space it depends on, like the disk driver
+ <braunr> so this basically requires wiring all the system libraries, glibc
+ included
+ <braunr> why not
+ <antrik> ah. yes, that's something I covered in a separate section in my
+ thesis ;-)
+ <braunr> eh :)
+ <antrik> we have to do that anyways, if we want *any* dynamically linked
+ components (such as the disk driver) in the paging path
+ <braunr> yes
+ <braunr> and it should make swapping more reliable too
+ <antrik> so that adds a couple MiB of wired memory... I guess we will just
+ have to live with that
+ <braunr> yes it seems acceptable
+ <braunr> thanks
+ <antrik> (it is actually one reason why I want to avoid static linking as
+ much as possible... so at least we have to wire these libraries only
+ *once*)
+ <antrik> anyways, back to my "simpler" approach
+ <antrik> the idea is that a (static) ext2fs would still be the first task
+ running, and immediately able to serve filesystem access requests -- only
+ it would serve these requests from files preloaded by GRUB rather than
+ the actual disk driver
+ <braunr> i understand now
+ <antrik> until a switch is flipped telling it that now the disk driver (and
+ anything it depends upon) is operational
+ <braunr> you still need to make sure all this is wired
+ <antrik> yes
+ <antrik> that's orthogonal
+ <antrik> which is why I have a separate section about it :-)
+ <braunr> what was the relation with ggi ?
+ <antrik> none strictly speaking
+ <braunr> i'll rephrase it: how did it end up in your thesis ?
+ <antrik> I just covered all aspects of userspace drivers in one of the
+ "introduction" sections of my thesis
+ <braunr> ok
+ <antrik> before going into specifics of KGI
+ <antrik> (and throwing in along the way that most of the issues described
+ do not matter for KGI ;-) )
+ <braunr> hehe
+ <braunr> i'm wondering, do we have mlockall on the hurd ? it seems not
+ <braunr> that's something deeply missing in mach
+ <antrik> well, bootstrap in general *is* actually relevant for KGI as well,
+ because of console messages during boot... but the filesystem bootstrap
+ is mostly irrelevant there ;-)
+ <antrik> braunr: oh? that's a problem then... I just assumed we have it
+ <braunr> well, it's possible to implement MCL_CURRENT, but not MCL_FUTURE
+ <braunr> or at least, it would be a bit difficult
+ <braunr> every allocation would need to be aware of that property
+ <braunr> it's better to have it managed by the vm system
+ <braunr> mach-defpager has its own version of vm_allocate for that
+ <antrik> braunr: I don't think we care about MCL_FUTURE here
+ <antrik> hm, wait... MCL_CURRENT is fine for code, but it might indeed be a
+ problem for dynamically allocated memory :-(
+ <braunr> yes
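
The "simpler" approach described in the log above (a static ext2fs that answers requests from GRUB-preloaded files until it is told the disk driver is operational) can be pictured with a toy model. This is only a sketch under stated assumptions: the class, method names, and file contents are hypothetical and do not correspond to any Hurd interface, and the wiring (memory locking) concern raised in the discussion is not modelled at all.

```python
class BootstrapFilesystem:
    """Toy model: serve reads from preloaded files until the disk is ready."""

    def __init__(self, preloaded):
        # Files placed in memory by the boot loader (hypothetical contents).
        self.preloaded = dict(preloaded)
        self.disk = None  # no disk driver available yet

    def attach_disk(self, disk):
        # "Flip the switch": the disk driver and its dependencies are up.
        self.disk = disk

    def read(self, path):
        # Before the switch, only preloaded files can be served.
        if self.disk is None:
            return self.preloaded[path]
        return self.disk[path]


fs = BootstrapFilesystem({"/lib/libc.so": b"preloaded libc"})
early = fs.read("/lib/libc.so")          # served from the preloaded copy

# A plain dict stands in for the real disk-backed store.
fs.attach_disk({"/lib/libc.so": b"libc from disk", "/etc/fstab": b"..."})
late = fs.read("/lib/libc.so")           # now served through the "disk"
```

The point of the model is that clients always talk to the same filesystem object; only its backing source changes, which is what avoids a separate boot filesystem and the handoff it would need.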
+
+
# Plan
* Examine what other systems are doing.
@@ -116,6 +438,112 @@ Also see [[device drivers and IO systems]].
and parallel port drivers, using `libtrivfs`.
+## I/O Server
+
+### IRC, freenode, #hurd, 2012-08-10
+
+ <braunr> usually you'd have an I/O server, and several device drivers
+ using it
+ <bddebian> Well maybe that's my question. Should there be unique servers
+ for say ISA, PCI, etc or could all of that be served by one "server"?
+ <braunr> forget about ISA
+ <bddebian> How? Oh because the ISA bus is now served via a PCI bridge?
+ <braunr> the I/O server would merely be there to help device drivers map
+ only what they require, and avoid conflicts
+ <braunr> because it's a relic of the past :p
+ <braunr> and because it requires too high privileges
+ <bddebian> But still exists in several PCs :)
+ <braunr> so usually, you'd directly ask the kernel for the I/O ports you
+ need
+ <mel-> so do floppy drives
+ <mel-> :)
+ <braunr> if i'm right, even the l4 guys do it that way
+ <braunr> he's right, some devices are still considered ISA
+ <bddebian> But that is where my confusion lies. Something has to figure
+ out what/where those I/O ports are
+ <braunr> and that's why i tell you to forget about it
+ <braunr> ISA has both statically allocated ports (the historical ones) and
+ others usually detected through PnP, when it works
+ <braunr> PCI is much cleaner, and memory mapped I/O is both better and much
+ more popular currently
+ <bddebian> So let's say I have a PCI SCSI card. I need some device driver
+ to know how to talk to that, right?
+ <bddebian> something is going to enumerate all the PCI devices and map them
+ to an address space
+ <braunr> bddebian: that would be the I/O server
+ <braunr> we'll call it the PCI server
+ <bddebian> OK, that is where I am headed. What if everything isn't PCI?
+ Is the "I/O server" generic enough?
+ <youpi> nowadays everything is PCI
+ <bddebian> So we are completely ignoring legacy hardware?
+ <braunr> we could have separate servers using a shared library that would
+ provide allocation routines like resource maps
+ <braunr> yes
+ <youpi> for what is not, the translator just needs to be run as root
+ <youpi> to get i/o perm from the kernel
+ <braunr> the idea for projects like ours, where the user base is very small
+ is: don't implement what you can't test
+ <youpi> bddebian: legacy can not be supported in a nice way, so for them we
+ can just afford a bad solution
+ <youpi> i.e. leave the driver in kernel
+ <braunr> right
+ <youpi> e.g. the keyboard
+ <bddebian> Well what if I have a USB keyboard? :-P
+ <braunr> that's a different matter
+ <youpi> USB keyboard is not legacy hardware
+ <youpi> it's usb
+ <youpi> which can be enumerated like pci
+ <braunr> and USB uses PCI
+ <youpi> and pci could be on usb :)
+ <braunr> so it's just a separate stack on top of the PCI server
+ <bddebian> Sure so would SCSI in my example above but is still a separate
+ bus
+ <braunr> netbsd has a very nice way of attaching drivers to buses
+ <youpi> bddebian: also, yes, and it can be enumerated
+ <bddebian> Which was my original question. This magic I/O server handles
+ all of the buses?
+ <youpi> no, just PCI, and then you'd have other servers for other busses
+ <braunr> i didn't mean that there would be *one* I/O server instance
+ <bddebian> So then it isn't a generic I/O server is it?
+ <bddebian> Ahhhh
+ <youpi> that way you can even put scsi over ppp or other crazy things
+ <braunr> it's more of an idea
+ <braunr> there would probably be a generic interface for basic stuff
+ <braunr> and i assume it could be augmented with specific (e.g. USB)
+ interfaces for servers that need more detailed communication
+ <braunr> (well, i'm pretty sure of it)
+ <bddebian> So the I/O server generalizes all functions, say read and write,
+ and then the PCI, USB, SCSI, whatever servers are contacted by it?
+ <braunr> no, not read and write
+ <braunr> resource allocation rather
+ <youpi> and enumeration
+ <braunr> probing perhaps
+ <braunr> bddebian: the goal of the I/O server is to make it possible for
+ device drivers to access the resources they need without a chance to
+ interfere with other device drivers
+ <braunr> (at least, that's one of the goals)
+ <braunr> so a driver would request the bus space matching the device(s) and
+ obtain that through memory mapping
+ <bddebian> Shouldn't that be in the "global address space"? Sorry if I am
+ using the wrong terminology
+ <youpi> well, the i/o server should also trigger the start of that driver
+ <youpi> bddebian: address space is not a matter for drivers
+ <braunr> bddebian: i'm not sure what you think of with "global address
+ space"
+ <youpi> bddebian: it's just a matter for the pci enumerator when (and if)
+ it places the BARs in physical address space
+ <youpi> drivers merely request mapping that, they don't need to know about
+ actual physical addresses
+ <braunr> i'm almost sure you lost him at BARs
+ <braunr> :(
+ <braunr> youpi: that's what i meant with probing actually
+ <bddebian> Actually I know BARs I have been reading on PCI :)
+ <bddebian> I suppose physical address space is more what I meant when I
+ used "global address space"
+ <braunr> i see
+ <youpi> bddebian: probably, yes
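
The "resource maps" mentioned in the log above, meaning allocation routines that let drivers obtain the ranges they need without interfering with each other, could look roughly like the following. This is a minimal sketch and not an existing Hurd or Mach API: `ResourceMap`, `claim`, the owner names, and the example ranges are all assumptions made for illustration.

```python
class ResourceMap:
    """Tracks non-overlapping claims on one resource space (ports or MMIO)."""

    def __init__(self):
        self.claims = []  # list of (start, end, owner), end exclusive

    def claim(self, start, size, owner):
        if size <= 0:
            raise ValueError("size must be positive")
        end = start + size
        # Refuse any range that intersects an existing claim.
        for s, e, o in self.claims:
            if start < e and s < end:
                raise ValueError(f"range overlaps claim held by {o}")
        self.claims.append((start, end, owner))
        return (start, end)


ports = ResourceMap()
first = ports.claim(0x1F0, 8, "driver-a")      # hypothetical I/O port range

try:
    ports.claim(0x1F4, 4, "driver-b")          # overlaps driver-a's range
    conflict = False
except ValueError:
    conflict = True

second = ports.claim(0x1F8, 8, "driver-b")     # adjacent range, no overlap
```

A shared library exposing such a map would let separate bus servers (PCI, USB, and so on) reuse the same conflict checking, as suggested in the discussion.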
+
+
# Documentation
* [An Architecture for Device Drivers Executing as User-Level
diff --git a/open_issues/usleep.mdwn b/open_issues/usleep.mdwn
new file mode 100644
index 00000000..b71cd902
--- /dev/null
+++ b/open_issues/usleep.mdwn
@@ -0,0 +1,25 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc]]
+
+# IRC, OFTC, #debian-hurd, 2012-07-14
+
+ <pinotree> eeek, usleep has the issues which i fixed in nanosleep
+ <bdefreese> pinotree: ?
+ * pinotree ponders a `mv sysdeps/unix/sysv/linux/usleep.c
+ sysdeps/mach/usleep.c`
+ <pinotree> s/mv/cp/
+ <bdefreese> What the heck is the point of usleep(0) anyway? Isn't that
+ basically saying suspend for 0 milliseconds?
+ <youpi> it's rounded up by the kernel I guess
+ <youpi> i.e. suspend for the shortest time possible (a clock tick)
+ <pinotree> posix 2001 says that «If the value of useconds is 0, then the
+ call has no effect.»
diff --git a/open_issues/virtualbox.mdwn b/open_issues/virtualbox.mdwn
index 9440284f..d0608b4a 100644
--- a/open_issues/virtualbox.mdwn
+++ b/open_issues/virtualbox.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -8,11 +8,15 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
+[[!toc]]
+
+
+# Running GNU Mach in VirtualBox crashes during initialization.
+
[[!tag open_issue_gnumach]]
-Running GNU Mach in VirtualBox crashes during initialization.
-IRC, freenode, #hurd, 2011-08-15
+## IRC, freenode, #hurd, 2011-08-15
<BlueT_> HowTo Reproduce: 1) Use `reboot` to reboot the system. 2) Once
you see the Grub menu, turn off the debian hurd box. 3) Let the box boot
@@ -97,3 +101,37 @@ IRC, freenode, #hurd, 2011-08-15
<youpi> what's interesting is that that one means that $USER_DS did load in
%es fine at least once
<youpi> and it's the reload that fails
+
+
+# Slow SCSI probing
+
+[[!tag open_issue_gnumach]]
+
+
+## IRC, freenode, #hurd, 2012-08-07
+
+ <braunr> youpi: it seems the slow boot on virtualbox is really because of
+ scsi (it spends a long time in scsi_init, probing for all the drivers)
+ <youpi> braunr: we know that
+ <youpi> isn't it in the io port probe printed at boot?
+ <youpi> iirc that was that
+ <braunr> the discussion i found was about eata
+ <braunr> not the whole scsi group
+ <youpi> there used to be another in eata, yas
+ <braunr> oh
+ <braunr> i must have missed the first discussion then
+ <youpi> I mean
+ <youpi> the eata is the first
+ <braunr> ok
+ <youpi> and scsi was mentioned later
+ <youpi> just nobody took the time to track it down
+ <braunr> ok
+ <braunr> so it's not just a matter of disabling a single driver :(
+ <youpi> braunr: I still believe it's a matter of disabling a single driver
+ <youpi> I don't see why scsi in general should take a lot of time
+ <braunr> youpi: it doesn't on qemu, it may simply be virtualbox's fault
+ <youpi> it is, yes
+ <youpi> and virtualbox people say it's hurd's fault, of course
+ <braunr> both are possible
+ <braunr> but we can't expect them to fix it :)
+ <youpi> that's what I mean
diff --git a/open_issues/vm_map_kernel_bug.mdwn b/open_issues/vm_map_kernel_bug.mdwn
new file mode 100644
index 00000000..613c1317
--- /dev/null
+++ b/open_issues/vm_map_kernel_bug.mdwn
@@ -0,0 +1,54 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_gnumach]]
+
+
+# IRC, freenode, #hurd, 2012-11-04
+
+ <tschwinge> braunr, pinotree, youpi: Has either of you already figured out
+ what [glibc]/sysdeps/mach/hurd/dl-sysdep.c:fmh »XXX loser kludge for
+ vm_map kernel bug« is about?
+ <pinotree> tschwinge: ETOOLOWLEVELFORME :)
+ <pinotree> tschwinge: 5bf62f2d3a8af353fac661b224fc1604d4de51ea added it
+ <braunr> tschwinge: no, but that looks interesting
+ <braunr> i'll have a look later
+ <tschwinge> Heh, "interesting". ;-)
+ <tschwinge> It seems related to vm_map's mask
+ parameter/ELF_MACHINE_USER_ADDRESS_MASK, though the latter is only used
+ in the mmap implementation in sysdeps/mach/hurd/dl-sysdep.c (in mmap.c, 0
+ is passed; perhaps due to the bug?).
+ <tschwinge> braunr: Anyway, I'd already welcome a patch to simply turn that
+ into a more comprehensible form.
+ <braunr> tschwinge: ELF_MACHINE_USER_ADDRESS_MASK is defined as "Mask
+ identifying addresses reserved for the user program, where the dynamic
+ linker should not map anything."
+ <braunr> about the vm_map parameter, which is a mask, it is described by
+ "Bits asserted in this mask must not be asserted in the address returned"
+ <braunr> so it's an alignment constraint
+ <braunr> the kludge disables alignment, apparently because gnumach doesn't
+ handle them correctly for some cases
+ <tschwinge> braunr: But ELF_MACHINE_USER_ADDRESS_MASK is 0xf8000000, so I'd
+ rather assume this means to restrict to addresses lower than 0xf8000000.
+ (What are higher ones reserved for?)
+ <braunr> tschwinge: the linker i suppose
+ <braunr> tschwinge: sorry, i don't understand what
+ ELF_MACHINE_USER_ADDRESS_MASK really is used for :/
+ <braunr> tschwinge: it looks unused for the other systems
+ <braunr> tschwinge: i guess it's just one way to partition the address
+ space, so that the linker knows where to load libraries and mmap can
+ still allocate large contiguous blocks
+ <braunr> tschwinge: 0xf8000000 means each "chunk" of linker/other blocks
+ are 128 MiB large
+ <tschwinge> braunr: OK, thanks for looking. I guess I'll ask Roland about
+ it.
+ <braunr> it could be that gnumach isn't good at aligning to large values
+
+[[!message-id "87fw4pb4c7.fsf@kepler.schwinge.homeip.net"]]
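
The rule quoted in the log, "Bits asserted in this mask must not be asserted in the address returned", can be checked with plain arithmetic. The sketch below only applies that quoted sentence to two mask values. It makes no claim about what glibc or GNU Mach actually do, and the helper names are hypothetical.

```python
def mask_allows(address, mask):
    """True if no bit set in `mask` is also set in `address`."""
    return address & mask == 0


def lowest_excluded(mask):
    """Smallest address the mask rejects: the value of its lowest set bit."""
    return mask & -mask


PAGE_MASK = 0xFFF                 # low bits set: a classic alignment mask
USER_ADDRESS_MASK = 0xF8000000    # value quoted in the discussion

# Low-bit mask: only 4 KiB-aligned addresses pass.
aligned_ok = mask_allows(0x2000, PAGE_MASK)
unaligned_ok = mask_allows(0x2010, PAGE_MASK)

# High-bit mask: every address below 0x08000000 passes, aligned or not.
below_ok = mask_allows(0x07FFF010, USER_ADDRESS_MASK)
above_ok = mask_allows(0x08000000, USER_ADDRESS_MASK)

chunk = lowest_excluded(USER_ADDRESS_MASK)   # 0x08000000
```

Read this way, a mask with only high bits set bounds the returned address rather than aligning it, and the lowest set bit of `0xf8000000` is 128 MiB, which matches the chunk size mentioned above.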
diff --git a/open_issues/wait_errors.mdwn b/open_issues/wait_errors.mdwn
new file mode 100644
index 00000000..855b9add
--- /dev/null
+++ b/open_issues/wait_errors.mdwn
@@ -0,0 +1,25 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_hurd]]
+
+# IRC, freenode, #hurd, 2012-07-12
+
+ <braunr> tschwinge: have you encountered wait() errors ?
+ <tschwinge> What kind of wait errors?
+ <braunr> when running htop or watch vmstat, other apparently unrelated
+ processes calling wait() sometimes fail with an error
+ <braunr> i saw it mostly during builds, as they spawn lots of children
+ <braunr> (and used the aforementioned commands to monitor the builds)
+ <tschwinge> Sounds nasty... No, don't remember seeing that. But I don't
+ typically invoke such commands during builds.
+ <tschwinge> So this wait thing suggests there's something going wrong in
+ the proc server?
+ <braunr> tschwinge: yes
diff --git a/open_issues/whole_system_debugging.mdwn b/open_issues/whole_system_debugging.mdwn
new file mode 100644
index 00000000..b438c5cf
--- /dev/null
+++ b/open_issues/whole_system_debugging.mdwn
@@ -0,0 +1,19 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_gdb open_issue_gnumach]]
+
+Given our distributed system structure, it'd be immensely useful if, when an
+[[RPC]] to another entity is made, [[GDB]] followed suit.
+
+[[GDB]] does have some *multi-process* debugging infrastructure which should
+basically be usable for this.
+
+[[`mach_msg`|microkernel/mach/message]] is the *great barrier*, of course.
diff --git a/rpc.mdwn b/rpc.mdwn
index 87f22593..5fed0aa2 100644
--- a/rpc.mdwn
+++ b/rpc.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2007, 2008, 2010 Free Software Foundation,
+[[!meta copyright="Copyright © 2007, 2008, 2010, 2012 Free Software Foundation,
Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -11,105 +11,11 @@ License|/fdl]]."]]"""]]
RPC stands for remote procedure call.
-This is the basis for about everything in the Hurd. It is based on the Mach
-RPC mechanism (the mach_msg system call). An RPC is made against a Mach port,
-which is the gateway to the translator which will serve the RPC. Let's take for
-instance the case of opening a file, and advancing (lseek) 10 bytes into it. The
-user program will be something like:
-
- #include <fcntl.h>
-
- int main(void) {
- int fd = open("test.txt", O_RDONLY);
- lseek(fd, 10, SEEK_CUR);
- }
-
-Both open and lseek are functions provided the libc, which make RPC calls.
-
-Open is a bit complex because it finds its way to the eventual translator, but
-for a mere file on the root filesystem, what happens boils down to calling
-the dir_lookup function against the root filesystem. This is an RPC from the fs
-interface (see fs.defs). The implementation of the function is thus actually
-generated from the fs.defs file using mig during the glibc build in RPC_dir_lookup.c.
-This generated function essentially encodes the parameters into
-a data buffer, and makes a mach_msg system call to send the buffer to the root
-filesystem port, with the dir_lookup RPC id.
-
-The root filesystem, for instance ext2fs, was sitting in its main
-loop (libdiskfs/init-first.c, master_thread_function()), which calls
-ports_manage_port_operations_multithread(), which essentially simply keeps
-making a mach_msg system call to receive a message, and calls the demuxer
-on it, here the demuxer parameter, diskfs_demuxer. This demuxer calls the
-demuxers for the various interfaces supported by ext2fs. These demuxers are
-generated using mig during the hurd build. For instance, the fs interface
-demuxer for diskfs, diskfs_fs_server, is in libdiskfs/fsServer.c. It simply
-checks whether the RPC id is an fs interface ID, and if so uses the
-diskfs_fs_server_routines array to know which function should be
-called according to the RPC id. Here it's _Xdir_lookup which thus gets
-called. This decodes the parameters from the message data buffer, and calls
-diskfs_S_dir_lookup.
-
-diskfs_S_dir_lookup() in the ext2fs translator does stuff to check that the file
-exists, etc. and eventually creates a new port, which will symbolize the file
-being opened, and a structure to store information about it. It returns the
-port to its caller, _Xdir_lookup, which puts it into the reply message data
-buffer and returns. ports_manage_port_operations_multithread() then calls
-mach_msg to send that port to the user program.
-
-The mach_msg call in the user program thus returns, returning the port, decoded
-by dir_lookup. glibc creates a new fd entry in its fd table, and records the
-port into it. It returns 0 (success).
-
-Lseek is simpler. The glibc implementation simply calls the __io_seek function
-against the port of the fd. This is an RPC from the io interface (see io.defs).
-As explained above, the implementation is thus in RPC_io_seek.c, it encodes
-parameters and makes a mach_msg system call to the port of the fd with the
-io_seek RPC id.
-
-In the root filesystem, it's now the demuxer for the io interface,
-diskfs_io_server, which will recognize the RPC id, and call _Xio_seek, which
-finds the data structure for the port, and calls diskfs_S_io_seek. The latter
-simply modifies the data structure to account for the file position change, and
-returns the new position. _Xio_seek encodes the position into the reply message,
-which is sent back by ports_manage_port_operations_multithread()
-through mach_msg.
-
-The mach_msg call in the user program thus returns the new offset, decoded by
-__io_seek. lseek can then return it to the user application.
-
-
-When hacking, one does *not* need to keep all that in mind. All one needs
-to remember is that when the application program calls open(), the glibc
-implementation actually calls dir_lookup(), which triggers a call to
-diskfs_S_dir_lookup in the ext2fs translator. When the application program calls
-lseek(), the glibc implementation calls __io_seek(), which triggers a call to
-diskfs_S_io_seek in the ext2fs translator. And so on...
-
-Q&A
-
-Q: How do I know whether a function is an RPC or not?
-
-A: Simply grep the function name (without leading underscores) in the
-/usr/include/hurd/*.defs files.
-
-Q: Why is it libdiskfs functions which get called?
-
-A: Because ext2fs is libdiskfs-based (see HURDLIBS = diskfs in ext2fs/Makefile).
-Other translators are libnetfs-based or libtrivfs-based. grep for RPC names into
-those according to what your translator is based on.
-
-Q: How do I know which translator the RPC gets into?
-
-A: Check the type of file whose port the RPC was made on. Most files are handled
-by the translator which is mounted where the files are opened. Some special
-files are handled by particular translators:
-
-* PF_LOCAL/PF_UNIX sockets are served by /hurd/pflocal, see [[hurd/networking]]
-* PF_INET sockets are served by /hurd/pfinet, see [[hurd/networking]]
-* named sockets (aka fifo) are served by /hurd/fifo
# See Also
* [[Mach RPC|microkernel/mach/rpc]]s
+ * [[RPC usage in the Hurd|hurd/rpc]]
+
* the [[Hurd's rpctrace|hurd/debugging/rpctrace]]
diff --git a/shortcuts.mdwn b/shortcuts.mdwn
index 38c7b3c7..b978939e 100644
--- a/shortcuts.mdwn
+++ b/shortcuts.mdwn
@@ -108,3 +108,10 @@ ikiwiki will include your shortcut in the standard underlay.
* [[!shortcut name=sourceware_PR
url="http://sourceware.org/PR%s"
desc="sourceware [BZ #%s]"]]
+
+
+## <http://stackoverflow.com/>
+
+ * [[!shortcut name=stackoverflow_question
+ url="http://stackoverflow.com/questions/%s"
+ desc="Stack Overflow question %s"]]
diff --git a/source_repositories/gdb.mdwn b/source_repositories/gdb.mdwn
index 76b82534..7418f5e4 100644
--- a/source_repositories/gdb.mdwn
+++ b/source_repositories/gdb.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation,
+Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -11,5 +12,11 @@ License|/fdl]]."]]"""]]
There is a repository for maintenance of [[/GDB]] for the Hurd's needs:
`grubber:~tschwinge/tmp/gdb/git`.
+<!--
+
+No longer, but can't access/remove at the moment.
+
This repository uses [[TopGit]] and is based on
<http://sourceware.org/git/?p=gdb.git;a=summary>.
+
+-->
diff --git a/source_repositories/glibc.mdwn b/source_repositories/glibc.mdwn
index fabd7cab..d9a470ae 100644
--- a/source_repositories/glibc.mdwn
+++ b/source_repositories/glibc.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -9,13 +9,26 @@ is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
There is a repository for maintenance of [[/glibc]] for the Hurd's needs:
-<http://git.savannah.gnu.org/cgit/hurd/glibc.git/>.
+<http://git.savannah.gnu.org/cgit/hurd/glibc.git/>. It's mainly used for
+testing glibc's master branch, but with all the patches that we need on top of
+it, and also for development and sharing of (Hurd-specific) glibc patches.
This repository uses [[TopGit]].
-*A plan for the Hurd-specific glibc repository*, thread
-[begins](http://lists.gnu.org/archive/html/bug-hurd/2010-01/msg00062.html),
-[continues](http://lists.gnu.org/archive/html/bug-hurd/2010-02/msg00021.html).
+History: *A plan for the Hurd-specific glibc repository*, [[!message-id
+"878wbp81ed.fsf@dirichlet.schwinge.homeip.net"]].
+
+
+# Relation to Debian glibc
+
+For a lot of topic branches there is a correspondence to a Debian glibc patch,
+and vice-versa, which is also indicated by the Debian glibc patch files' names.
+
+Debian glibc is based on EGLIBC 2.13 and the Savannah hurd/glibc.git one is
+tracking sourceware master.
+
+The Savannah hurd/glibc.git one does not/not yet include
+[[libpthread|open_issues/libpthread]], [[open_issues/packaging_libpthread]].
# Usage
@@ -91,3 +104,13 @@ Make `tschwinge/Roger_Whittaker` (the current branch) depend on it:
4 files changed, 20 insertions(+), 14 deletions(-)
rename {nptl/sysdeps/pthread => sysdeps/gnu}/rt-unwind-resume.c (100%)
rename {nptl/sysdeps/pthread => sysdeps/gnu}/unwind-resume.c (93%)
+
+
+# Maintenance
+
+## Tags
+
+Occasionally push new tags from the sourceware repository to the Savannah one:
+
+ $ git fetch sourceware
+ $ git tag | grep ^glibc- | sed 's%^%tag %' | xargs git push savannah
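
To make the pipeline above easier to follow: `grep` keeps tag names starting with `glibc-`, `sed` prefixes each with the word `tag`, and `xargs` appends the result to `git push savannah`, so each tag is pushed using the `tag <name>` form that `git push` accepts. The helper below is a hypothetical illustration of the command line that results. The tag names are examples only, and with very many tags `xargs` may split the work across several invocations.

```python
def push_tags_argv(remote, tags, prefix="glibc-"):
    """Build the argv that the grep | sed | xargs pipeline would produce."""
    argv = ["git", "push", remote]
    for tag in tags:
        if tag.startswith(prefix):      # grep ^glibc-
            argv += ["tag", tag]        # sed 's%^%tag %'
    return argv


# Example tag list; only the names starting with "glibc-" are pushed.
argv = push_tags_argv("savannah", ["glibc-2.15", "other/tag", "glibc-2.16"])
command = " ".join(argv)
```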
diff --git a/toolchain/logs b/toolchain/logs
-Subproject d3878230df905bb9b85aa1548da7485f71c3efc
+Subproject 272397686eea60669290da0add796ca601b1a2e