author | Samuel Thibault <samuel.thibault@ens-lyon.org> | 2011-03-27 21:37:59 +0200 |
---|---|---|
committer | Samuel Thibault <samuel.thibault@ens-lyon.org> | 2011-03-27 21:37:59 +0200 |
commit | 3f5b019c3f6e0c6e1683b2374cc86116251ecf2b | |
tree | 318230b32e3aa98afa4a28a0c0855e86a78fc721 /open_issues | |
parent | 51c4760238ec774f0eb4facb1eb17c4abd516029 | |
parent | 4f51f8e21f1962a0749ec7081567f4916bab7910 | |
Merge branch 'master' of flubber:~hurd-web/hurd-web
Diffstat (limited to 'open_issues')
28 files changed, 493 insertions, 247 deletions
diff --git a/open_issues/anatomy_of_a_hurd_system.mdwn b/open_issues/anatomy_of_a_hurd_system.mdwn new file mode 100644 index 00000000..e1d5c9d8 --- /dev/null +++ b/open_issues/anatomy_of_a_hurd_system.mdwn @@ -0,0 +1,73 @@ +[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!taglink open_issue_documentation]] + +A bunch of this should also be covered in other (introductionary) material, +like Bushnell's Hurd paper. All this should be unfied and streamlined. + +IRC, freenode, #hurd, 2011-03-08 + + <foocraft> I've a question on what are the "units" in the hurd project, if + you were to divide them into units if they aren't, and what are the + dependency relations between those units(roughly, nothing too pedantic + for now) + <antrik> there is GNU Mach (the microkernel); there are the server + libraries in the Hurd package; there are the actual servers in the same; + and there is the POSIX implementation layer in glibc + <antrik> relations are a bit tricky + <antrik> Mach is the base layer which implements IPC and memory management + <foocraft> hmm I'll probably allocate time for dependency graph generation, + in the worst case + <antrik> on top of this, the Hurd servers, using the server libraries, + implement various aspects of the system functionality + <antrik> client programs use libc calls to use the servers + <antrik> (servers also use libc to communicate with other servers and/or + Mach though) + <foocraft> so every server depends solely on mach, and no other 
server? + <foocraft> s/mach/mach and/or libc/ + <antrik> I think these things should be pretty clear one you are somewhat + familiar with the Hurd architecture... nothing really tricky there + <antrik> no + <antrik> servers often depend on other servers for certain functionality + +--- + +IRC, freenode, #hurd, 2011-03-12 + + <dEhiN> when mach first starts up, does it have some basic i/o or fs + functionality built into it to start up the initial hurd translators? + <antrik> I/O is presently completely in Mach + <antrik> filesystems are in userspace + <antrik> the root filesystem and exec server are loaded by grub + <dEhiN> o I see + <dEhiN> so in order to start hurd, you would have to start mach and + simultaneously start the root filesystem and exec server? + <antrik> not exactly + <antrik> GRUB loads all three, and then starts Mach. Mach in turn starts + the servers according to the multiboot information passed from GRUB + <dEhiN> ok, so does GRUB load them into ram? + <dEhiN> I'm trying to figure out in my mind how hurd is initially started + up from a low-level pov + <antrik> yes, as I said, GRUB loads them + <dEhiN> ok, thanks antrik...I'm new to the idea of microkernels, but a + veteran of monolithic kernels + <dEhiN> although I just learned that windows nt is a hybrid kernel which I + never knew! + <rm> note there's a /hurd/ext2fs.static + <rm> I belive that's what is used initially... right? + <antrik> yes + <antrik> loading the shared libraries in addition to the actual server + would be unweildy + <antrik> so the root FS server is linked statically instead + <dEhiN> what does the root FS server do? 
+ <antrik> well, it serves the root FS ;-) + <antrik> it also does some bootstrapping work during startup, to bring the + rest of the system up diff --git a/open_issues/code_analysis.mdwn b/open_issues/code_analysis.mdwn index ad59f962..21e09089 100644 --- a/open_issues/code_analysis.mdwn +++ b/open_issues/code_analysis.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -40,7 +40,7 @@ analysis|performance]], [[formal_verification]], as well as general * <http://blog.llvm.org/2010/04/whats-wrong-with-this-code.html> - * [[Valgrind]] + * [[community/gsoc/project_ideas/Valgrind]] * [Smatch](http://smatch.sourceforge.net/) diff --git a/open_issues/crash_server.mdwn b/open_issues/crash_server.mdwn index d97f5458..7ed4afbf 100644 --- a/open_issues/crash_server.mdwn +++ b/open_issues/crash_server.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2009, 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2009, 2010, 2011 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -187,3 +188,8 @@ one... /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/ipc_kobject.c:76 mach_msg_trap /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/ipc/mach_msg.c:1367 + +--- + +If someone is working in this area, they may want to have a look at +[[GDB_gcore]], and port <http://code.google.com/p/google-coredumper/>, too. 
diff --git a/open_issues/debian_cross_toolchain.mdwn b/open_issues/debian_cross_toolchain.mdwn new file mode 100644 index 00000000..e0665466 --- /dev/null +++ b/open_issues/debian_cross_toolchain.mdwn @@ -0,0 +1,15 @@ +[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_hurd]] + +Have a look at the Debian *Cross Toolchain* project, +<https://alioth.debian.org/projects/crosstoolchain/>, +<http://wiki.debian.org/ToolChain/Cross>. diff --git a/open_issues/debugging.mdwn b/open_issues/debugging.mdwn index e66a086f..e5fbf7a0 100644 --- a/open_issues/debugging.mdwn +++ b/open_issues/debugging.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -42,7 +42,10 @@ We have debugging infrastructure. For example: Continues: <http://lwn.net/Articles/414264/>, which introduces <http://dmtcp.sourceforge.net/>. - * [[locking]] + * [[crash_server}}, [[GDB_gcore]], + <http://code.google.com/p/google-coredumper/> + + * [[community/gsoc/project_ideas/libdiskfs_locking]] * <http://lwn.net/Articles/415728/>, or <http://lwn.net/Articles/415471/> -- just two examples; there's a lot of such stuff for Linux. 
diff --git a/open_issues/dtrace.mdwn b/open_issues/dtrace.mdwn deleted file mode 100644 index cbac28fb..00000000 --- a/open_issues/dtrace.mdwn +++ /dev/null @@ -1,47 +0,0 @@ -[[!meta copyright="Copyright © 2008, 2009, 2011 Free Software Foundation, -Inc."]] - -[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable -id="license" text="Permission is granted to copy, distribute and/or modify this -document under the terms of the GNU Free Documentation License, Version 1.2 or -any later version published by the Free Software Foundation; with no Invariant -Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license -is included in the section entitled [[GNU Free Documentation -License|/fdl]]."]]"""]] - -One of the main problems of the current Hurd implementation is very poor -[[performance]]. While we have a bunch of ideas what could cause the performance -problems, these are mostly just guesses. Better understanding what really -causes bad performance is necessary to improve the situation. - -For that, we need tools for performance measurements. While all kinds of more -or less specific [[profiling]] tools could be conceived, the most promising and -generic approach seems to be a framework for logging certain events in the -running system (both in the microkernel and in the Hurd servers). This would -allow checking how much time is spent in certain modules, how often certain -situations occur, how things interact, etc. It could also prove helpful in -debugging some issues that are otherwise hard to find because of complex -interactions. - -The most popular framework for that is Sun's dtrace; but there might be others. -The student has to evaluate the existing options, deciding which makes most -sense for the Hurd; and implement that one. (Apple's implementation of dtrace -in their Mach-based kernel might be helpful here...) 
- -This project requires ability to evaluate possible solutions, and experience -with integrating existing components as well as low-level programming. - -Possible mentors: Samuel Thibault (youpi) - -Related: [[profiling]], [[LTTng]], [[SystemTap]] - -Exercise: In lack of a good exercise directly related to this task, just pick -one of the kernel-related or generally low-level tasks from the bug/task -trackers on savannah, and make a go at it. You might not be able to finish the -task in a limited amount of time, but you should at least be able to make a -detailed analysis of the issue. - -*Status*: Andei Barbu was working on -[SystemTap](http://csclub.uwaterloo.ca/~abarbu/hurd/) for GSoC 2008, but it -turned out too Linux-specific. He implemented kernel probes, but there is no -nice frontend yet. diff --git a/open_issues/ext2fs_page_cache_swapping_leak.mdwn b/open_issues/ext2fs_page_cache_swapping_leak.mdwn new file mode 100644 index 00000000..0ace5cd3 --- /dev/null +++ b/open_issues/ext2fs_page_cache_swapping_leak.mdwn @@ -0,0 +1,23 @@ +[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_hurd]] + +IRC, OFTC, #debian-hurd, 2011-03-24 + + <youpi> I still believe we have an ext2fs page cache swapping leak, however + <youpi> as the 1.8GiB swap was full, yet the ld process was only 1.5GiB big + <pinotree> a leak at swapping time, you mean? 
+ <youpi> I mean the ext2fs page cache being swapped out instead of simply + dropped + <pinotree> ah + <pinotree> so the swap tends to accumulate unuseful stuff, i see + <youpi> yes + <youpi> the disk content, basicallyt :) diff --git a/open_issues/sudo_date_crash.mdwn b/open_issues/file_system_exerciser.mdwn index 53303abc..4277e5e7 100644 --- a/open_issues/sudo_date_crash.mdwn +++ b/open_issues/file_system_exerciser.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -8,9 +8,8 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]] -[[!tag open_issue_gnumach]] +[[!tag open_issue_hurd]] -IRC, unknown channel, unknown date. +Test our file system implementations with the File System Exerciser. - <grey_gandalf> I did a sudo date... - <grey_gandalf> and the machine hangs + * <http://codemonkey.org.uk/projects/fsx/> diff --git a/open_issues/gdb_gcore.mdwn b/open_issues/gdb_gcore.mdwn index 7d4980f1..69211ac0 100644 --- a/open_issues/gdb_gcore.mdwn +++ b/open_issues/gdb_gcore.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2009 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2009, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -21,3 +21,6 @@ GDB's `gcore` command doesn't work / needs to be implemented / ported in GDB: /media/data/home/tschwinge/core.cA0ICY:2: Error in sourced command file: Undefined command: "gcore". Try "help". 
gcore: failed to create core.8371 + +If someone is working in this area, they may want to port +<http://code.google.com/p/google-coredumper/>, too. diff --git a/open_issues/gdb_noninvasive_mode_new_threads.mdwn b/open_issues/gdb_noninvasive_mode_new_threads.mdwn new file mode 100644 index 00000000..9b3992f4 --- /dev/null +++ b/open_issues/gdb_noninvasive_mode_new_threads.mdwn @@ -0,0 +1,15 @@ +[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_gdb]] + +Debugging a translator. `gdb binary`. `set noninvasive on`. `attach [PID]`. +Translator does some work. GDB doesn't notice new threads. `detach`. `attach +[PID]` -- now new threads are visible. diff --git a/open_issues/gdb_thread_ids.mdwn b/open_issues/gdb_thread_ids.mdwn index c31a9967..c04a10ee 100644 --- a/open_issues/gdb_thread_ids.mdwn +++ b/open_issues/gdb_thread_ids.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2008, 2009, 2010 Free Software Foundation, +[[!meta copyright="Copyright © 2008, 2009, 2010, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -6,8 +6,8 @@ id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license -is included in the section entitled -[[GNU Free Documentation License|/fdl]]."]]"""]] +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] [[!meta title="GDB: thread ids"]] @@ -23,3 +23,9 @@ GNU GDB's Pedro Alves: Also see [[thread numbering of ps and GDB]]. + +--- + +`attach` to a multi-threaded process. See threads 1 to 5. `detach`. `attach` +again -- thread numbers continue where they stopped last time: now they're +threads 6 to 10. diff --git a/open_issues/locking.mdwn b/open_issues/locking.mdwn deleted file mode 100644 index 11a10524..00000000 --- a/open_issues/locking.mdwn +++ /dev/null @@ -1,40 +0,0 @@ -[[!meta copyright="Copyright © 2008, 2009, 2010 Free Software Foundation, -Inc."]] - -[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable -id="license" text="Permission is granted to copy, distribute and/or modify this -document under the terms of the GNU Free Documentation License, Version 1.2 or -any later version published by the Free Software Foundation; with no Invariant -Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license -is included in the section entitled [[GNU Free Documentation -License|/fdl]]."]]"""]] - -[[!tag open_issue_hurd]] - -Every now and then, new locking issues are discovered in -[[hurd/libdiskfs]] or [[hurd/translator/ext2fs]], for example. Nowadays -these in fact seem to be the most often encountered cause of Hurd crashes -/ lockups. - -One of these could be traced -recently, and turned out to be a lock inside [[hurd/libdiskfs]] that was taken -and not released in some cases. There is reason to believe that there are more -faulty paths causing these lockups. - -The task is systematically checking the [[hurd/libdiskfs]] code for this kind of locking -issues. To achieve this, some kind of test harness has to be implemented: For -example instrumenting the code to check locking correctness constantly at -runtime. 
Or implementing a [[unit testing]] framework that explicitly checks -locking in various code paths. (The latter could serve as a template for -implementing unit tests in other parts of the Hurd codebase...) - -(A [[systematic code review|security]] would probably suffice to find the -existing locking -issues; but it wouldn't document the work in terms of actual code produced, and -thus it's not suitable for a GSoC project...) - -This task requires experience with debugging locking issues in -[[multithreaded|multithreading]] applications. - -Tools have been written for automated [[code analysis]]; these can help to -locate and fix such errors. diff --git a/open_issues/multithreading.mdwn b/open_issues/multithreading.mdwn index 39203333..addc29c3 100644 --- a/open_issues/multithreading.mdwn +++ b/open_issues/multithreading.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,7 +10,21 @@ License|/fdl]]."]]"""]] [[!tag open_issue_hurd]] -Hurd servers / VFS libraries are multithreaded, roughly using one thread per +Hurd servers / VFS libraries are multithreaded. + + +# Implementation + + * well-known threading libraries + + * [[hurd/libthreads]] + + * [[hurd/libpthread]] + + +# Design + +Roughly using one thread per incoming request. This is not the best approach: it doesn't really make sense to scale the number of worker threads with the number of incoming requests, but instead they should be scaled according to the backends' characteristics. @@ -18,7 +32,9 @@ instead they should be scaled according to the backends' characteristics. The [[hurd/Critique]] should have some more on this. 
-Alternative approaches: +# Alternative approaches: + + * <http://www.concurrencykit.org/> * Continuation-passing style diff --git a/open_issues/nightly_builds.mdwn b/open_issues/nightly_builds.mdwn index 506697bb..5d1257fb 100644 --- a/open_issues/nightly_builds.mdwn +++ b/open_issues/nightly_builds.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -15,6 +15,8 @@ Resources: * [[toolchain/cross-gnu]] + * [[Debian_Cross_Toolchain]] + * As reported in the [[news/2010-05-31]] news, there's Hydra doing nightly builds / Nix packages. diff --git a/open_issues/nightly_builds_deb_packages.mdwn b/open_issues/nightly_builds_deb_packages.mdwn index 9f5e2373..11fc4c79 100644 --- a/open_issues/nightly_builds_deb_packages.mdwn +++ b/open_issues/nightly_builds_deb_packages.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -24,4 +24,8 @@ There is infrastructure available to test whole OS installations. --- +[[Debian_Cross_Toolchain]] for cross-building? + +--- + See also [[nightly_builds]]. 
diff --git a/open_issues/performance.mdwn b/open_issues/performance.mdwn index 9b3701b3..eb9f3f8a 100644 --- a/open_issues/performance.mdwn +++ b/open_issues/performance.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -18,7 +18,7 @@ In [[microkernel]]-based systems, there is generally a considerable [[RPC]] overhead. In a multi-server system, it is non-trivial to implement a high-performance -[[I/O System|io_system]]. +[[I/O System|community/gsoc/project_ideas/disk_io_performance]]. When providing [[faq/POSIX_compatibility]] (and similar interfaces) in an environemnt that doesn't natively implement these interfaces, there may be a diff --git a/open_issues/performance/fork.mdwn b/open_issues/performance/fork.mdwn index 2748be53..5ceb6455 100644 --- a/open_issues/performance/fork.mdwn +++ b/open_issues/performance/fork.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -14,3 +14,24 @@ Our [[`fork` implementation|glibc/fork]] is nontrivial. To do: hard numbers. [[Microbenchmarks]]? + + +# Windows / Cygwin + + * <http://www.google.com/search?q=cygwin+fork> + + * <http://www.redhat.com/support/wpapers/cygnus/cygnus_cygwin/architecture.html> + + In particular, *5.6. Process Creation*. + + * <http://archive.gamedev.net/community/forums/topic.asp?topic_id=360290> + + * <http://cygwin.com/cgi-bin/cvsweb.cgi/src/winsup/cygwin/how-cygheap-works.txt?cvsroot=src> + + > Cygwin has recently adopted something called the "cygwin heap". 
This is + > an internal heap that is inherited by forked/execed children. It + > consists of process specific information that should be inherited. So + > things like the file descriptor table, the current working directory, and + > the chroot value live there. + + * <http://www.perlmonks.org/?node_id=588994> diff --git a/open_issues/performance/io_system.mdwn b/open_issues/performance/io_system.mdwn deleted file mode 100644 index 4af093ba..00000000 --- a/open_issues/performance/io_system.mdwn +++ /dev/null @@ -1,50 +0,0 @@ -[[!meta copyright="Copyright © 2008, 2009, 2010, 2011 Free Software Foundation, -Inc."]] - -[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable -id="license" text="Permission is granted to copy, distribute and/or modify this -document under the terms of the GNU Free Documentation License, Version 1.2 or -any later version published by the Free Software Foundation; with no Invariant -Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license -is included in the section entitled [[GNU Free Documentation -License|/fdl]]."]]"""]] - -[[!meta title="I/O System"]] - -[[!tag open_issue_hurd]] - -The most obvious reason for the Hurd feeling slow compared to mainstream -systems like GNU/Linux, is a low I/O system performance, in particular very -slow hard disk access. - -The reason for this slowness is lack and/or bad implementation of common -optimization techniques, like scheduling reads and writes to minimize head -movement; effective block caching; effective reads/writes to partial blocks; -[[reading/writing multiple blocks at once|clustered_page_faults]]; and -[[read-ahead]]. The -[[ext2_filesystem_server|hurd/translator/ext2fs]] might also need some -optimizations at a higher logical level. - -The goal of this project is to analyze the current situation, and implement/fix -various optimizations, to achieve significantly better disk performance. 
It -requires understanding the data flow through the various layers involved in -disk access on the Hurd ([[filesystem|hurd/virtual_file_system]], -[[pager|hurd/libpager]], driver), and general experience with -optimizing complex systems. That said, the killing feature we are definitely -missing is the [[read-ahead]], and even a very simple implementation would bring -very big performance speedups. - -Here are some real testcases: - - * [[binutils_ld_64ksec]]; - - * running the Git testsuite which is mostly I/O bound; - - * use [[TopGit]] on a non-toy repository. - - -Possible mentors: Samuel Thibault (youpi) - -Exercise: Look through all the code involved in disk I/O, and try something -easy to improve. It's quite likely though that you will find nothing obvious -- -in this case, please contact us about a different exercise task. diff --git a/open_issues/performance/io_system/binutils_ld_64ksec.mdwn b/open_issues/performance/io_system/binutils_ld_64ksec.mdwn index b59a87a7..79c2300f 100644 --- a/open_issues/performance/io_system/binutils_ld_64ksec.mdwn +++ b/open_issues/performance/io_system/binutils_ld_64ksec.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,7 +10,8 @@ License|/fdl]]."]]"""]] [[!tag open_issue_hurd]] -This one may be considered as a testcase for I/O system optimization. +This one may be considered as a testcase for [[I/O system +optimization|community/gsoc/project_ideas/disk_io_performance]]. 
It is taken from the [[binutils testsuite|binutils]], `ld/ld-elf/sec64k.exp`, where this diff --git a/open_issues/performance/io_system/clustered_page_faults.mdwn b/open_issues/performance/io_system/clustered_page_faults.mdwn index 3a187523..37433e06 100644 --- a/open_issues/performance/io_system/clustered_page_faults.mdwn +++ b/open_issues/performance/io_system/clustered_page_faults.mdwn @@ -10,6 +10,8 @@ License|/fdl]]."]]"""]] [[!tag open_issue_gnumach open_issue_hurd]] +[[community/gsoc/project_ideas/disk_io_performance]]. + IRC, freenode, #hurd, 2011-02-16 <braunr> exceptfor the kernel, everything in an address space is diff --git a/open_issues/performance/io_system/read-ahead.mdwn b/open_issues/performance/io_system/read-ahead.mdwn index 3ee30b5d..b6851edd 100644 --- a/open_issues/performance/io_system/read-ahead.mdwn +++ b/open_issues/performance/io_system/read-ahead.mdwn @@ -10,6 +10,8 @@ License|/fdl]]."]]"""]] [[!tag open_issue_gnumach open_issue_hurd]] +[[community/gsoc/project_ideas/disk_io_performance]] + IRC, #hurd, freenode, 2011-02-13: <etenil> youpi: Would libdiskfs/diskfs.h be in the right place to make diff --git a/open_issues/perlmagick.mdwn b/open_issues/perlmagick.mdwn index 1daac62b..8a57a8fd 100644 --- a/open_issues/perlmagick.mdwn +++ b/open_issues/perlmagick.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -57,6 +57,49 @@ Etc. +/usr/lib/gcc/i486-gnu/4.4.2/include/omp.h: +# State as of 2011-03-06 + +freenode, #hurd channel, 2011-03-06: + + <pinotree> tschwinge: (speaking on working perl, how did it en with that + "(glibc) double free" crash with perl?) + <pinotree> *end + <tschwinge> I think I remember I suspected it's a libgomp (!) issue in the + end. 
I have not yet continued working on that. + <pinotree> libogmp? looks like you know more than me, then :) + <youpi> tschwinge: oh, I'm interested + <youpi> I know a bit about libgomp :) + <tschwinge> I bisected this down to where Imagemagick added -fgomp (or + whatever it is). And then the perl library (Imagemagick.pm?) which loads + the imagemagick.so segfaulted. + <tschwinge> ImageMagick did this change in the middle of a x.x.x.something + release.. + <tschwinge> My next step would have been to test whether libgomp works at + all for us. + <youpi> ./usr/sbin/debootstrap:DEBOOTSTRAP_CHECKSUM_FIELD="SHA$SHA_SIZE" + <youpi> erf + <youpi> so they switched to another checksum + <youpi> but we don't have that one on all of our packages :) + <youpi> tschwinge: + <youpi> buildd@bach:~$ OMP_NUM_THREADS=2 ./test + <youpi> I'm 0x1 + <youpi> I'm 0x3 + <youpi> libgomp works at least a bit + <tschwinge> OK. + <pinotree> i guess we should hope the working bits don't stop at that point + ;) + <tschwinge> If open_issues/perlmagick is to be believed a diff of 6.4.1-1 + and 6.4.1-2 should tell what exactly was changed. + <tschwinge> Oh! + <tschwinge> I even have it on the page already! ;-) + <tschwinge> -fopenmp + <youpi> I've tried the pragmas that imagemagick uses + <youpi> they work + <tschwinge> Might be the issue fixed itself? 
+ <youpi> I don't know, it's the latest libc here + <youpi> (and latest hurd, to be uploaded) + + # Other [[!debbug 551017]] diff --git a/open_issues/pfinet_vs_system_time_changes.mdwn b/open_issues/pfinet_vs_system_time_changes.mdwn new file mode 100644 index 00000000..714c8784 --- /dev/null +++ b/open_issues/pfinet_vs_system_time_changes.mdwn @@ -0,0 +1,42 @@ +[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_hurd]] + +IRC, unknown channel, unknown date. + + <grey_gandalf> I did a sudo date... + <grey_gandalf> and the machine hangs + +This was very likely a misdiagnosis: + +IRC, freenode, #hurd, 2011-03-25 + + <tschwinge> antrik: I suspect it'S some timing stuff in pfinet that perhaps + uses absolute time, and somehow wildely gets confused? + <antrik> tschwinge: BTW, pfinet doesn't actually die I think -- it just + drops open connections... + <antrik> perhaps it thinks they timed out + <tschwinge> antrik: Isn't the translator restarted instead? + <antrik> don't think so + <antrik> when pfinet actually dies, I also loose the NFS mounts, which + doesn't happen in this case + <antrik> hehe "... and the machine hangs" + <antrik> he didn't bother to check that the machine is perfectly fine, only + the SSH connection got dropped + <tschwinge> Ah, I see. So it'S perhaps indeed simply closes TCP + connections that have been without data for ``too long''? 
+ <antrik> yeah, that's my guess + <antrik> my clock is speeding, so ntpdate sets it in the past + <antrik> perhaps there is some math that concludes the connection have been + inactive for -200 seconds, which (unsigned) is more than any timeout :-) + <tschwinge> (The other way round, you might likely get some integer + wrap-around, and thus the same result.) + <tschwinge> Yes. diff --git a/open_issues/profiling.mdwn b/open_issues/profiling.mdwn index e04fb08a..7e3c7350 100644 --- a/open_issues/profiling.mdwn +++ b/open_issues/profiling.mdwn @@ -17,7 +17,7 @@ done for [[performance analysis|performance]] reasons. Should be working, but some issues have been reported, regarding GCC spec files. Should be possible to fix (if not yet done) easily. - * [[dtrace]] + * [[community/gsoc/project_ideas/dtrace]] Have a look at this, integrate it into the main trees. diff --git a/open_issues/rpc_to_self_with_rendez-vous_leading_to_duplicate_port_destroy.mdwn b/open_issues/rpc_to_self_with_rendez-vous_leading_to_duplicate_port_destroy.mdwn new file mode 100644 index 00000000..9db92250 --- /dev/null +++ b/open_issues/rpc_to_self_with_rendez-vous_leading_to_duplicate_port_destroy.mdwn @@ -0,0 +1,163 @@ +[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+[RPC to self with rendez-vous leading to duplicate port
+destroy](http://lists.gnu.org/archive/html/bug-hurd/2011-03/msg00045.html)
+
+IRC, freenode, #hurd, 2011-03-14
+
+ <antrik> youpi: I wonder, why does the root FS call diskfs_S_dir_lookup()
+ at all?...
+ <youpi> errr, because a client asked for it?
+ <youpi> (problem with RPCs is you can't easily know where they come from :)
+ )
+ <youpi> (especially when it's the root fs...)
+ <antrik> ah, it's about a client request... didn't see that
+ <youpi> well, I just said "is called", yes
+ <antrik> I do not really understand though why it tries to reauthenticate
+ against itself...
+ <antrik> I fear my memory of the lookup mechanism grew a bit dim
+ <youpi> see the source
+ <youpi> it's about a translated entry
+ <antrik> (and I never fully understood some aspects anyways...)
+ <youpi> it needs to start the translated entry as another user, possibly
+ <antrik> yes, but a translated entry normally would be served by *another*
+ process?...
+ <youpi> sure, but ext2fs has to prepare it
+ <youpi> thus reauthenticate to prepare the correct set of rights
+ <antrik> prepare what?
+ <youpi> rights
+ <youpi> so the process is not root, doesn't have / opened as root, etc.
+ <antrik> rights for what?
+ <youpi> err, about everything
+ <antrik> IIRC the reauthentication is done by the parent FS on the port to
+ the *translated* node
+ <antrik> and the translated node should be a different process?...
+ <youpi> that's not what I read in the source
+ <youpi> fshelp_fetch_root
+ <youpi> ports[INIT_PORT_CRDIR] = reauth (getcrdir ());
+ <youpi> here, getcrdir() returns ext2fs itself
+ <antrik> well, perhaps the issue is that I have no idea what
+ fshelp_fetch_root() does, nor why it is called here...
+ <youpi> it notably starts the translator that dir_lookup is looking at, if
+ needed
+ <youpi> possibly as a different user, thus reauthentication of CRDIR
+ <antrik> so this is about a port that is passed to the translator being
+ started?
+ <youpi> no
+ <youpi> well, depends on what you mean by "port"
+ <youpi> it's about reauthenticating a port to be passed to the translator
+ being started
+ <youpi> and for that a rendez-vous port is needed for the reauthentication
+ <youpi> and that's the one at stake
+ <antrik> yeah, I meant the port that is reauthenticated
+ <antrik> what is CRDIR?
+ <youpi> current root dir ...
+ <antrik> so the parent translator passes it's own root dir to the child
+ translator; and the issue is that for the root FS the root dir points to
+ the root FS itself...
+ <youpi> yes
+ <antrik> OK, that makes sense
+ <youpi> (but that's only one example, rgrep mach_port_destroy hurd/ show
+ other potential issues)
+ <antrik> well, that's actually what I wanted to mention next... why is the
+ rendez-vous port destroyed, instead of just deallocating the port right
+ and letting reference counting to it's thing?...
+ <antrik> do its thing
+ <youpi> "just to make sure" I guess
+ <antrik> it's pretty obvious that this will cause trouble for any RPC
+ referencing itself...
+ <youpi> well, follow-up with that on the list
+ <youpi> with roland/tb in CC
+ <youpi> only they would know any real reason for destroy
+ <youpi> btw, if you knew how we could make _hurd_select()'s raw __mach_msg
+ call be interruptible by signals, that'll permit to fix sudo
+ <youpi> (damn, I need sleep, my tenses are all wrong)
+ <antrik> BTW, does this cause any actual trouble?...
+ <antrik> I don't know much about interruption... cfhammer might have a
+ better idea, he look into that stuff quite a bit AIUI
+ <antrik> looked
+ <antrik> (hehe, it's not only your tenses... guess there's something in the
+ ether ;-) )
+ <youpi> it makes sudo, mailq, etc.
fail sometimes
+ <antrik> I mean the rendez-vous thing
+ <youpi> that's it, yes
+ <youpi> sudo etc. fail at least due to this
+ <antrik> so these are two different problems that both affect sudo?
+ <antrik> (rendez-vous and interruption I mean)
+ <youpi> yes
+ <youpi> with my patch the buildds have much fewer issues, but still some
+ <youpi> (my interrupt-related patch)
+ <youpi> I'm installing a s/destroy/deallocate/ version of ext2fs on the
+ buildds, we'll see how it behaves
+ <youpi> (it fixes my testcase at least)
+ <antrik> interrupt-related patch?
+ <antrik> only thing interrupt-related I remember was the reauthentication
+ race...
+ <youpi> that's what I mean
+ <antrik> well, cfhammer investigated this is quite some depth, explaining
+ quite well why the race is only mitigated but still exists... problem is
+ that we didn't know how to fix it properly
+ <antrik> because nobody seems to understand the cancellation code, except
+ perhaps for Roland and Thomas
+ <antrik> (and I'm not even entirely sure about them :-) )
+ <antrik> I think his findings and our conclusions are documented on the
+ ML...
+ <youpi> by "much fewer issues", I mean that some of the symptoms have
+ disappeared, others haven't
+ <antrik> BTW, couldn't the rendez-vous thing be worked around by simply
+ ignoring the errors from the failing deallocate?...
+ <youpi> no, failing deallocate are actually dangerous
+ <antrik> why?
+ <youpi> since the name might have been reused for something else in the
+ meanwhile
+ <youpi> that's the whole point of the warning I had added in the kernel
+ itself
+ <antrik> I see
+ <youpi> such things really deserve tracking, since they can have any kind
+ of consequence
+ <antrik> does Mach try to reuse names quickly, rather than only after
+ wrapping around?...
+ <youpi> it seems to
+ <antrik> OK, then this is a serious problem indeed
+ <youpi> (note: I rarely divine issues when there aren't actual frequent
+ symptoms :) )
+ <antrik> well, the problem with the warning is that it only shows in the
+ cases that do *not* cause a problem... so it's hard to associate them
+ with any specific issues
+ <youpi> well, most of the time the port is not reused quickly enough
+ <youpi> so in most case it shows up more often than causing problem
+
+IRC, freenode, #hurd, 2011-03-14
+
+ <youpi> ok, mach_port_deallocate actually can't be used
+ <youpi> since mach_reply_port() returns a receive right, not a send right
+ * youpi guesses he will really have to manage to understand all that port
+ stuff completely
+ <antrik> oh, right
+ <antrik> youpi: hm... now I'm confused though. if one client holds a
+ receive right, the other client (or in this case the same process) should
+ have a send or send-once right -- these should *not* share the same name
+ in my understanding
+ <antrik> destroying the receive right should turn the send right into a
+ dead name
+ <antrik> so unless I'm missing something, the destroy shouldn't be a
+ problem, and there must be something else going wrong
+ <antrik> hm... actually I'm probably wrong
+ <antrik> yeah, definitely wrong. receive rights and "ordinary" send rights
+ share the name. only send-once rights are special
+ <antrik> I wonder whether the problem could be worked around by using a
+ send-once right...
+ <antrik> mach_port_mod_refs(mach_task_self(), name,
+ MACH_PORT_RIGHT_RECEIVE, -1) can be used to deallocate only the receive
+ right
+ <antrik> oh, you already figured that out :-)
diff --git a/open_issues/sync_but_still_unclean_filesystem.mdwn b/open_issues/sync_but_still_unclean_filesystem.mdwn
index f1fbb4e0..83c7951e 100644
--- a/open_issues/sync_but_still_unclean_filesystem.mdwn
+++ b/open_issues/sync_but_still_unclean_filesystem.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
 id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -10,9 +10,19 @@ License|/fdl]]."]]"""]]
 [[!tag open_issue_gnumach open_issue_hurd]]
+Also filed as [[!GNU_Savannah_bug 29292]].
+
 \#hurd, 2010, end of May / beginning of June
 [runnign sync, but sill unclean filesystem at next boot]
- <slpz> guillem: when libpager syncs an object, it sends an m_o_lock_request and waits (if the synchronous argument was specified) for a m_o_lock_completed. But m_o_lock_completed only means that dirty pages have been sent to the translator, and this one still needs to write them to the backing storage
- <slpz> guillem: there's no problem if sync() returns before actually writting the changes to disk, but this also happens when shutting down the translator
- <slpz> guillem: in theory, locking mechanisms in libpager should prevent this from happening by keeping track of write operations, but this seems to fail in some situations
+ <slpz> guillem: when libpager syncs an object, it sends an m_o_lock_request
+ and waits (if the synchronous argument was specified) for a
+ m_o_lock_completed.
But m_o_lock_completed only means that dirty pages
+ have been sent to the translator, and this one still needs to write them
+ to the backing storage
+ <slpz> guillem: there's no problem if sync() returns before actually
+ writting the changes to disk, but this also happens when shutting down
+ the translator
+ <slpz> guillem: in theory, locking mechanisms in libpager should prevent
+ this from happening by keeping track of write operations, but this seems
+ to fail in some situations
diff --git a/open_issues/unit_testing.mdwn b/open_issues/unit_testing.mdwn
index d7821fd3..1378be85 100644
--- a/open_issues/unit_testing.mdwn
+++ b/open_issues/unit_testing.mdwn
@@ -8,6 +8,11 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
 is included in the section entitled [[GNU Free Documentation
 License|/fdl]]."]]"""]]
+This task may be suitable for [[community/GSoC]]:
+[[community/gsoc/project_ideas/testing_framework]]
+
+---
+
 A collection of thoughts with respect to unit testing.
 We definitely want to add unit test suites to our code base.
@@ -54,6 +59,8 @@ abandoned).
 Developers*](http://lwn.net/Articles/412302/) by Steven Rostedt, 2010-10-28. [v2](http://lwn.net/Articles/414064/), 2010-11-08.
+ * <http://autotest.kernel.org/wiki/WhitePaper>
+
 # Related
@@ -70,3 +77,10 @@ abandoned).
 * [LaBrea](https://github.com/dustin/labrea/wiki), or similar tools can be used for modelling certain aspects of system behavior (long response times, for example).
+
+
+# Discussion
+
+See the [[GSoC project idea|community/gsoc/project_ideas/testing_framework]]'s
+[[discussion
+subpage|community/gsoc/project_ideas/testing_framework/discussion]].
diff --git a/open_issues/valgrind.mdwn b/open_issues/valgrind.mdwn
deleted file mode 100644
index 2b0624d7..00000000
--- a/open_issues/valgrind.mdwn
+++ /dev/null
@@ -1,80 +0,0 @@
-[[!meta copyright="Copyright © 2009, 2010 Free Software Foundation, Inc."]]
-
-[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
-id="license" text="Permission is granted to copy, distribute and/or modify this
-document under the terms of the GNU Free Documentation License, Version 1.2 or
-any later version published by the Free Software Foundation; with no Invariant
-Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
-is included in the section entitled [[GNU Free Documentation
-License|/fdl]]."]]"""]]
-
-[[!meta title="Porting Valgrind to the Hurd"]]
-
-[Valgrind](http://valgrind.org/) is an extremely useful debugging tool for memory errors.
-(And some other kinds of hard-to-find errors too.)
-Aside from being useful for program development in general,
-a Hurd port will help finding out why certain programs segfault on the Hurd,
-although they work on Linux.
-Even more importantly, it will help finding bugs in the Hurd servers themselfs.
-
-To keep track of memory use,
-Valgrind however needs to know how each [[system call]] affects the validity of memory regions.
-This knowledge is highly kernel-specific,
-and thus Valgrind needs to be explicitely ported for every system.
-
-Such a port involves two major steps:
-making Valgrind understand how kernel traps work in general on the system in question;
-and how all the individual kernel calls affect memory.
-The latter step is where most of the work is,
-as the behaviour of each single [[system call]] needs to be described.
-
-Compared to Linux,
-[[microkernel/Mach]] (the microkernel used by the Hurd) has very few kernel traps.
-Almost all [[system call]]s are implemented as [[RPC]]s instead --
-either handled by Mach itself, or by the various [[Hurd servers|hurd/translator]].
-All RPCs use a pair of `mach_msg` invocations:
-one to send a request message, and one to receive a reply.
-However, while all RPCs use the same `mach_msg` trap,
-the actual effect of the call varies greatly depending on which RPC is invoked --
-similar to the `ioctl` call on Linux.
-Each request thus must be handled individually.
-
-Unlike `ioctl`,
-the RPC invocations have explicit type information for the parameters though,
-which can be retrieved from the message header.
-By analyzing the parameters of the RPC reply message,
-Valgrind can know exactly which memory regions are affected by that call,
-even without specific knowledge of the RPC in question.
-Thus implementing a general parser for the reply messages
-will already give Valgrind a fairly good approximation of memory validity --
-without having to specify the exact semantic of each RPC by hand.
-
-While this should make Valgrind quite usable on the Hurd already, it's not perfect:
-some RPCs might return a buffer that is only partially filled with valid data;
-or some reply parameters might be optional,
-and only contain valid data under certain conditions.
-Such specific semantics can't be deduced from the message headers alone.
-Thus for a complete port,
-it will still be necessary to go through the list of all known RPCs,
-and implement special handling in Valgrind for those RPCs that need it.
-
-The goal of this task is at minimum to make Valgrind grok Mach traps,
-and to implement the generic RPC handler.
-Ideally, specific handling for RPCs needing it should also be implemented.
-
-Completing this project will require digging into Valgrind's handling of [[system call]]s,
-and into Hurd RPCs.
-It is not an easy task, but a fairly predictable one --
-there shouldn't be any unexpected difficulties,
-and no major design work is necessary.
-It doesn't require any specific previous knowledge:
-only good programming skills in general.
-On the other hand,
-the student will obtain a good understanding of Hurd RPCs while working on this task,
-and thus perfect qualifications for Hurd development in general :-)
-
-Possible mentors: Samuel Thibault (youpi)
-
-Exercise: As a starter,
-students can try to teach valgrind a couple of Linux ioctls,
-as this will make them learn how to use the read/write primitives of valgrind. |