From 47e4d194dc36adfcfd2577fa4630c9fcded005d3 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Sun, 27 Oct 2013 19:15:06 +0100 Subject: IRC. --- open_issues/64-bit_port.mdwn | 7 + open_issues/anatomy_of_a_hurd_system.mdwn | 8 + open_issues/boehm_gc.mdwn | 19 ++ open_issues/code_analysis/discussion.mdwn | 56 +++- open_issues/dbus.mdwn | 112 ++++++++ .../debugging_gnumach_startup_qemu_gdb.mdwn | 34 ++- open_issues/emacs.mdwn | 17 +- open_issues/exec_memory_leaks.mdwn | 25 ++ ...t2fs_libports_reference_counting_assertion.mdwn | 13 +- open_issues/gdb_qemu_debugging_gnumach.mdwn | 19 -- open_issues/gdb_signal_handler.mdwn | 71 +++++ open_issues/git-core-2.mdwn | 107 +++++++ open_issues/glibc.mdwn | 319 +++++++++++++++++++++ open_issues/glibc/t/tls-threadvar.mdwn | 37 +++ open_issues/gnumach_page_cache_policy.mdwn | 60 ++++ open_issues/hurd_101.mdwn | 38 +++ open_issues/hurd_init.mdwn | 8 + .../libpthread/t/fix_have_kernel_resources.mdwn | 64 +++++ open_issues/lsof.mdwn | 40 ++- open_issues/mach-defpager_swap.mdwn | 21 ++ open_issues/multiprocessing.mdwn | 6 +- open_issues/performance.mdwn | 4 +- open_issues/performance/io_system/read-ahead.mdwn | 10 + .../performance/microkernel_multi-server.mdwn | 183 +++++++++++- open_issues/pthread_atfork.mdwn | 86 ++++++ open_issues/smp.mdwn | 8 + open_issues/strict_aliasing.mdwn | 15 +- ..._spin_lock_locked_ss_critical_section_lock.mdwn | 2 + open_issues/time.mdwn | 14 + open_issues/wine.mdwn | 15 +- 30 files changed, 1384 insertions(+), 34 deletions(-) delete mode 100644 open_issues/gdb_qemu_debugging_gnumach.mdwn (limited to 'open_issues') diff --git a/open_issues/64-bit_port.mdwn b/open_issues/64-bit_port.mdwn index b0c95612..edb2dccd 100644 --- a/open_issues/64-bit_port.mdwn +++ b/open_issues/64-bit_port.mdwn @@ -155,3 +155,10 @@ In context of [[mondriaan_memory_protection]]. 
the problem is the interfaces themselves type widths as passed between userspace and kernel + + +# IRC, OFTC, #debian-hurd, 2013-10-05 + + and what about 64 bit support, almost done? + kernel part is done + MIG 32/64 translation missing diff --git a/open_issues/anatomy_of_a_hurd_system.mdwn b/open_issues/anatomy_of_a_hurd_system.mdwn index ba72b00f..a3c55063 100644 --- a/open_issues/anatomy_of_a_hurd_system.mdwn +++ b/open_issues/anatomy_of_a_hurd_system.mdwn @@ -803,3 +803,11 @@ Actually, the Hurd has never used an M:N model. Both libthreads (cthreads) and l
hi, I am dotgnu work on hurd, and even winforms app s/am/make and maybe c# hello world translate another day :) + + +## Leak Detection + +### IRC, freenode, #hurd, 2013-10-17 + + I spent the last two days integrating libgc - the boehm + conservative garbage collector - into hurd + it can be used in leak detection mode + whoa, cool + and it actually kind of works, finds malloc leaks in translators + i think there were problems with signal handling in libgc + i'm not sure we support nested signal handling well + yes, I read about them + libgc uses SIGUSR1/2, so any program installing handlers on them + will break + (which is not a problem on Linux, cause there some RT-signals or so + are used) + yes diff --git a/open_issues/code_analysis/discussion.mdwn b/open_issues/code_analysis/discussion.mdwn index 7ac3beb1..4cb03293 100644 --- a/open_issues/code_analysis/discussion.mdwn +++ b/open_issues/code_analysis/discussion.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -42,6 +43,8 @@ License|/fdl]]."]]"""]] i tried duma, and it crashes, probably because of cthreads :) +# Static Analysis + ## IRC, freenode, #hurd, 2012-09-08 hello. What static analyzer would you suggest (probably you have @@ -49,3 +52,54 @@ License|/fdl]]."]]"""]] mcsim: if you find some good free static analyzer, let me know :) a simple one is cppcheck braunr: I'm choosing now between splint and adlint + + +## IRC, freenode, #hurd, 2013-10-17 + + whoa, llvm kinda works, enough to make scan-build work :) + teythoon: what is scan-build ? 
+ braunr: clangs static analyzer + ok + I'm doing a full build of the hurd using it, I will post the + report once it is finished + this will help spot many problems + well, here are the scan-build reports I got so far: + https://teythoon.cryptobitch.de/qa/2013-10-17/scan-build/ + I noticed it finds problems in mig generated code, so there are + probably lot's of duplictaes for those kind of problems + what's a... better one to look at? + it's also good at spotting error handling errors, and can spot + leaks sometimes + hm + + https://teythoon.cryptobitch.de/qa/2013-10-17/scan-build/report-yVBHO1.html + that's minor, the device always exist + but that's still ugly + + https://teythoon.cryptobitch.de/qa/2013-10-17/scan-build/report-MtgWSa.html + + https://teythoon.cryptobitch.de/qa/2013-10-17/scan-build/report-QdsZIm.html + this could be important: + https://teythoon.cryptobitch.de/qa/2013-10-17/scan-build/report-PDMEbk.html + this is the issue it finds in mig generated server stubs: + https://teythoon.cryptobitch.de/qa/2013-10-17/scan-build/report-iU3soc.html + this one is #if TypeCheck1 + the libports one looks weird indeed + but TypeCheck is 1 (the tooltip shows macro expansion) + it is defined in line 23 + oh + hmmm... clang does not support nested functions, that will limit + its usefulness for us :/ + yes + one more reason not to use them + + +### IRC, freenode, #hurd, 2013-10-18 + + more complete, now with index: + https://teythoon.cryptobitch.de/qa/2013-10-17/scan-build-2/ + + +# Leak Detection + +See *Leak Detection* on [[boehm_gc]]. diff --git a/open_issues/dbus.mdwn b/open_issues/dbus.mdwn index a41515a1..4473fba0 100644 --- a/open_issues/dbus.mdwn +++ b/open_issues/dbus.mdwn @@ -253,3 +253,115 @@ See [[glibc]], *Missing interfaces, amongst many more*, *`SOCK_CLOEXEC`*. to know how to find this sendmsg.c file? 
(it's in glibc, but otherwise the remark is valid) s/otherwise/anyway/ + + +# Emails + +# IRC, freenode, #hurd, 2013-10-16 + + gnu_srs: how could you fail to understand credentials need to be + checked ? + braunr: If data is sent via sendmsg, no problem, right? + gnu_srs: that's irrelevant + It's just to move the check to the receive side. + and that is the whole problem + it's not "just" doing it + first, do you know what the receive side is ? + do you know what it can be ? + do you know where the corresponding source code is to be found ? + please, describe a scenario where receiving faulty ancillary data + could be a problem instead + dbus + a user starting privileged stuff although he's not part of a + privileged group of users for example + gnome, kde and others use dbus to pass user ids around + if you can't rely on these ids being correct, you can compromise + the whole system + because dbus runs as root and can give root privileges + or maybe not root, i don't remember but a system user probably + "messagebus" + k! + see http://www.gnu.org/software/hurd/open_issues/dbus.html + IRC, freenode, #hurd, 2013-07-17 + and the proper fix is to patch pflocal to query the + auth server and add the credentials? + possibly + that doesn't sound to bad, did you give it a try? + + +# IRC, freenode, #hurd, 2013-10-22 + + I think I have a solution on the receive side for SCM_CREDS :) + + A question related to SCM_CREDS: dbus use a zero data byte to get + credentials sent. + however, kfreebsd does not care which data (and credentials) is + sent, they report the credentials anyway + should the hurd implementation do the same as kfreebsd? + gnu_srs: I'm not sure to understand: what happens on linux then? + does it see zero data byte as being bogus, and refuse to send the + creds? 
+ linux is also transparent, it sends the credentials independent + of the data (but data has to be non-null) + ok + anyway, what the sending application writes does not matter indeed + so we can just ignore that + and have creds sent anyway + i think the interface normally requires at least a byte of data + for ancilliary data + possibly, yes + To pass file descriptors or credentials over a SOCK_STREAM, + you need to send or + receive at least one byte of non-ancillary data in + the same sendmsg(2) or + recvmsg(2) call. + but that may simply be linux specific + gnu_srs: how do you plan on implementing right checking ? + Yes, data has to be sent, at least one byte, I was asking about + e.g. sending an integer + just send a zero + well + dbus already does that + just don't change anything + let applications pass the data they want + the socket interface already deals with port rights correctly + what you need to do is make sure the rights received match the + credentials + The question is to special case on a zero byte, and forbid + anything else, or allow any data. + why would you forbid + ? + linux and kfreebsd does not special case on a received zero byte + same question, why would you want to do that ? + linux sends credentials data even if no SCM_CREDENTIALS structure + is created, kfreebsd don't + i doubt that + To be specific:msgh.msg_control = NULL; msgh.msg_controllen = 0; + bbl + see the test code: + http://lists.debian.org/debian-hurd/2013/08/msg00091.html + back + why would the hurd include groups when sending a zero byte, but + only uid when not ? + ? + 1) Sent credentials are correct: + no flags: Hurd: OK, only sent ids + -z Hurd: OK, sent IDs + groups + and how can it send more than one uid and gid ? + "sent credentials are not honoured, received ones are created" + Sorry, the implementation is changed by now. And I don't special + case on a zero byte. + what does this mean ? + then why give me that link ? 
+ The code still applies for Linux and kFreeBSD. + It means that whatever you send, the kernel emits does not read + that data: see + socket.h: before struct cmsgcred: the sender's structure is + ignored ... + do you mean receiving on a socket can succeed with gaining + credentials, although the sender sent wrong ones ? + Looks like it. I don't have a kfreebsd image available right now. + linux returns EPERM + anyway + how do you plan to implement credential checking ? + I'll mail patches RSN diff --git a/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn b/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn index e3a6b648..3faa56fc 100644 --- a/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn +++ b/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -12,8 +13,22 @@ License|/fdl]]."]]"""]] [[!tag open_issue_gdb open_issue_gnumach]] +[[!toc]] -# IRC, freenode, #hurd, 2011-07-14 + +# Memory Map + +## IRC, freenode, #hurd, 2010-06 (?) + + is there a way to get gdb to map addresses as required when + debugging mach with qemu ? + I can examine the data if I manually map the addresses th + 0xc0000000 but maybe there's an easier way... + jkoenig: I haven't found a way + I'm mostly using the internal kdb + + +## IRC, freenode, #hurd, 2011-07-14 Hello. I have problem with debugging gnumach. I set 2 brakepoints in file i386/i386at/model_dep.c on functions gdt_init and idt_init. Then @@ -114,3 +129,18 @@ License|/fdl]]."]]"""]] oh, right, without GDB... though if that's what he meant, his statement was very misleading at least + + +# Multiboot + +See also discussion about *multiboot* on [[arm_port]]. 
+ + +## IRC, freenode, #hurd, 2013-10-09 + + I was just wondering - once gnumach is compiled and I have the + gnumach elf, is that bootable? I.e. can I use something like + "qemu-system-i386 -kernel gnumach"? + matlea01: you need something with multiboot support (like grub) + to provide the various bootstrap modules to the kernel + Ah, I see diff --git a/open_issues/emacs.mdwn b/open_issues/emacs.mdwn index cdd1b10d..749649be 100644 --- a/open_issues/emacs.mdwn +++ b/open_issues/emacs.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2009 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2009, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -1525,3 +1525,18 @@ perhaps prepared (I did not yet have a look), and re-tries again and again? Why doesn't Mach page out some pages to make memory available? This is stock GNU Mach from Git, no patches, configured for Xen domU usage. + + +# IRC, freenode, #hurd, 2013-10-04 + + given you are an emacs user: could you please pick the build + patch from deb#725099, recompile emacs24 and test it with your daily + work? + + +## IRC, freenode, #hurd, 2013-10-07 + + Wow! emacs24 runs in X:-D + pinotree: I've now built and installed emacs 24.3. 
So far so good + ^ + good, keep testing and stressing diff --git a/open_issues/exec_memory_leaks.mdwn b/open_issues/exec_memory_leaks.mdwn index 67281bdc..1fc5a928 100644 --- a/open_issues/exec_memory_leaks.mdwn +++ b/open_issues/exec_memory_leaks.mdwn @@ -94,3 +94,28 @@ After running the libtool testsuite for some time: 8 39.5 0:15.60 28:48.57 9 0.0 0:04.49 10:24.12 10 12.8 0:08.84 19:34.45 + + +# IRC, freenode, #hurd, 2013-10-08 + + * braunr hunting the exec leak + and i think i found it + yes :> + testing a bit more and committing the fix later tonight + pinotree: i've been building glibc for 40 mins and exec is still + consuming around 1m memory + wow nice + i've been noticing exec leaking quite some time ago, then forgot + to pay more attention to that + it's been more annoying since darnassus provides web access to + cgis + automated tools make requests every seconds + the leak occurred when starting a shell script or using system() + youpi: not sure you saw it, i fixed the exec leak + + +## IRC, freenode, #hurd, 2013-10-10 + + braunr: http://postimg.org/image/jd764wfpp/ + exec 797M + this should be fixed with the release of the next hurd packages diff --git a/open_issues/ext2fs_libports_reference_counting_assertion.mdwn b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn index ff1c4c38..9ff43afa 100644 --- a/open_issues/ext2fs_libports_reference_counting_assertion.mdwn +++ b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -91,3 +91,14 @@ With that patch in place, the assertion failure is seen more often. sure we can get that easily lol [[automatic_backtraces_when_assertions_hit]]. 
+ + +# IRC, freenode, #hurd, 2013-10-09 + + mhmm, i may have an explanation for the weird assertions we + sometimes see in ext2fs + glibc uses alloca to reserve memory for one reply port per thread + in abort_all_rpcs + if this erases the thread-specific area, we can expect all kinds + of wreckage + i'm not sure how to fix this though diff --git a/open_issues/gdb_qemu_debugging_gnumach.mdwn b/open_issues/gdb_qemu_debugging_gnumach.mdwn deleted file mode 100644 index d3105f50..00000000 --- a/open_issues/gdb_qemu_debugging_gnumach.mdwn +++ /dev/null @@ -1,19 +0,0 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] - -[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable -id="license" text="Permission is granted to copy, distribute and/or modify this -document under the terms of the GNU Free Documentation License, Version 1.2 or -any later version published by the Free Software Foundation; with no Invariant -Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license -is included in the section entitled [[GNU Free Documentation -License|/fdl]]."]]"""]] - -[[!tag open_issue_gdb open_issue_gnumach]] - -\#hurd, freenode, June (?) 2010 - - is there a way to get gdb to map addresses as required when debugging mach with qemu ? - I can examine the data if I manually map the addresses th 0xc0000000 but maybe there's an easier way... - jkoenig: I haven't found a way - I'm mostly using the internal kdb - diff --git a/open_issues/gdb_signal_handler.mdwn b/open_issues/gdb_signal_handler.mdwn index 3084f7e3..5e27a099 100644 --- a/open_issues/gdb_signal_handler.mdwn +++ b/open_issues/gdb_signal_handler.mdwn @@ -401,3 +401,74 @@ License|/fdl]]."]]"""]] braunr: are you sure? there is minimal user-code run before the signal is going into the handler. you "step out of the handler" + + +# IRC, freenode, #hurd, 2013-10-24 + + how come some executables are not debuggable with gdb, e.g Cannot + access memory at address xxx. 
-fPIC flag? + no + i'm not sure but it's certainly not -fPIC + Another example is localedef: ./debian/tmp-libc/usr/bin/localedef + -i en_GB -c -f UTF-8 -A /usr/share/locale/locale.alias en_GB.UTF-8 + segfaults + and in gdb hangs after creating a thread., after C-c no useful + info: stack ends with: Cannot access memory at address 0x8382c385 + if it's on the stack, it's probably a stack corruption + gnu_srs: are u using 'x' command or 'print' in GDB? IIRC + print may throw such message, but x may not + bt + x may too + what you're showing looks like an utf-8 string + c385 is Å + 83 is a special f + 82 is a comma + so the stack is corrupted:-( + probably + well, certainly + but gdb should show you where the program counter is + is that: ECX: the count register + no + eip + program counter == instruction pointer + k!, the program counter is at first entry in bt: #0 0x01082612 + in _hurd_intr_rpc_msg_in_trap () at intr-msg.c:133 + this is the hurd interruptible version of mach_msg + so it probably means the corruption was made by a signal handler + which is one of the reasons why gdb can't handle Ctrl-c + what to do in such a case, follow the source code + single-stepping? 
+ single stepping also uses signals + and using printf will probably create an infinite recursion + in those cases, i use mach_print + as a first step, you could make sure a signal is actually received + and which one + hmm + also, before rushing into conclusions, make sure you're looking at + the right thread + i don't expect localedef to be multithreaded + but gdb sometimes just doesn't get the thread where the segfault + actually occurred + two threads: 1095.4 and 1095.5 (created when starting localedef + in gdb) + no, at the time of the crash + the second thread is always the signal thread + OK,in gdb the program hangs, interrupted by C-c, outside it + segfaults + when you use bt to get the corrupted stack, you can also use info + threads and thread apply all bt + I did: http://paste.debian.net/61170/ + ok so it confirms there is only one real application thread, the + main one + and that the corruption probably occurs during signal handling + rpctrace (edited out non-printable characters): + http://paste.debian.net/61178/ + Ah, have to do it again as root;-) + yes .. :p + new last part: http://paste.debian.net/61181/ + so, there is a seek, then a stat, then a close perhaps (port + deallocation) and then a signal received (probably sigsegv) + gnu_srs: when you try running it in gdb, do you get a sigkill ? + damn, gdb on darnassus is bugged :-( + It hangs, interrupted with C-c. + ok diff --git a/open_issues/git-core-2.mdwn b/open_issues/git-core-2.mdwn index cbf47bd2..a92b3ebb 100644 --- a/open_issues/git-core-2.mdwn +++ b/open_issues/git-core-2.mdwn @@ -61,6 +61,113 @@ Fixing this situation is easy enough: Still seen. +## IRC, freenode, #hurd, 2013-10-10 + + Huh? I've cloned the 'hurd' repository and I'm attempting to compile + it, but the 'rtnetlink.h' header in + 'hurd/pfinet/linux-src/include/linux/' is just blank. 
(Which leads to an + error later down when a macro that's supposed to be defined in there is + first used) + So I'm just wondering, is that file really blank? Or is this some + unexpected error of decompression? + clone again and see + the file is definitely not empty + I cloned it twice, both have that file blank. BUT, I want to point + out that both clones do have some decompression errors. (Some files are + missing chunks in /both/ cloned repositories). + where did you clone it from ? + git.sv.gnu.org/hurd/hurd.git + hum decompression errors ? + can you paste them please ? + Hmm, I can clone again and show you an example if I find one + This was on the hurd. When I run: git clone $repo;, it seems to fail + almost randomly with "incorrect header check", but when it does succeed, + occasionally some files are missing chunks + and apparently entire files can be blank + http or git ? + git. + that's really weird + actually i don't even have problems with http any more nowadays .. + This is using the hurd image from sthibault + So once I get it recompiled and shuffle in the new binaries, the + problem should probably go away + no + well maybe but + don't recompile + upgrade packages instead + Alright, I'll do an upgrade instead. Why that path specifically? + rebuilding is long + i wonder if the image you got is corrupted + compute the checksum + we've had weird reports in the past about the images he provides + well not the images themselves, but differences after dowloading + .. + downloading* + The MD5SUMS file on his site isn't including the values for the most + recent images. + It stops at 2012-12-28 + hummm + Anyway, let's see. git clone failed again: + Receiving objects: 100% (50955/50955), 15.48 MiB | 42 KiB/s, done. 
+ error: inflate: data stream error (incorrect header check) <- This + is the interesting part + fatal: serious inflate inconsistency + fatal: index-pack failed + not interesting enough unfortunately + but it might come from savannah too + try the mirrors at + http://darnassus.sceen.net/gitweb/?a=project_list;pf=savannah_mirror + Let's see..if I try: 'git clone + git://darnassus.sceen.net/gitweb/savannah_mirror/hurd.git', I get: + 'fatal: remote error: access denied or repository not exported: + /gitweb/savannah_mirror/hurd.git' + my bad + that's weird, it should work .. + oh, stupid translation error + translation? From one human language to another? + not translation actually + typo :) + it's either + git://darnassus.sceen.net/savannah_mirror/hurd.git + or + http://darnassus.sceen.net/gitweb/savannah_mirror/hurd.git + copy paste the url exactly please + /gitweb/ is only present in the http url + Ah, right. Okay, I'll paste it exactly + Ehm. The whole thing locked up badly. I'll reboot it and try again. + are you sure it locked oO ? + the hurd can easily become unresponsive when performing io + operations + but you need more than such a git repository to reach that state + Yeah, that happens occasionally. It's not related to git, but rather + it happens when I cancel some command. + your image must be corrupted + have you enabled host io caching btw ? + By now it's corrupted for sure..everytime it crashes the filesystem + gets into a weird state. + I'll unpack a fresh image, then update the packages, and then try + cloning this git repository. 
+ i'll get the image too so we can compare sums + 957bb0768c9558564f0c3e0adb9b317e ./debian-hurd.img.tar.gz + Which unpacks to: debian-hurd-20130504.img + the NSA might backdoor the Hurd, in anticipation of our scheduled + world-dominance + for now they're doing it passively : + :p + sea`: same thing here + sea`: if you still have problems, the image itself might be wrong + in which case you should try with the debian network installer + Ah, so if problems persist, try with the network installer. Okay + Is there some recipe for constructing a hurd/mach minimal + environment? + A system with only just enough tools and libraries to compile and + poke at things. + not currently + we all work in debian environments + the reason being that a lot of patches are queued for integration + upstream + + # 2010-11-17 A very similar issue. The working tree had a lot of diff --git a/open_issues/glibc.mdwn b/open_issues/glibc.mdwn index b453b44f..292c6256 100644 --- a/open_issues/glibc.mdwn +++ b/open_issues/glibc.mdwn @@ -330,6 +330,33 @@ Last reviewed up to the [[Git mirror's 0323d08657f111267efa47bd448fbf6cd76befe8 clearly not a priority ok + IRC, freenode, #hurd, 2013-09-26: + + if I want to have epoll/kqueue like things, where + should it dwell? kernel or some libs? + libs + userland + that would be a good project to work on, something i + intended to do (so i can help) but it requires a lot of work + you basically need to add a way to explicitely install and + remove polling requests (instead of the currently way that + implicitely remove polling requests when select/poll returns) + while keeping the existing way working for some time + glibc implements select + the hurd io interface shows the select interface + servers such as pfinet/pflocal implement it + glibc implements the client-side of the call + where's poll? 
since epoll just added edge-trigger in + poll + both select and poll are implemented on top of the hurd io + select call (which isn't exactly select) + + http://darnassus.sceen.net/gitweb/savannah_mirror/hurd.git/blob/HEAD:/hurd/io.defs + this is the io interface + + http://darnassus.sceen.net/gitweb/savannah_mirror/glibc.git/blob/refs/heads/tschwinge/Roger_Whittaker:/hurd/hurdselect.c + this is the client side implementation + * `sys/eventfd.h` * `sys/inotify.h` @@ -854,6 +881,298 @@ Last reviewed up to the [[Git mirror's 0323d08657f111267efa47bd448fbf6cd76befe8 to check where those locks are held and determine the right order + IRC, OFTC, #debian-hurd, 2013-09-28: + + now we'd just need tls + http://bugs.ruby-lang.org/issues/8937 + well, it would pass makecheck at least. makecheckall would + keep hanging on threads/pipes tests i guess, unless tls/thread + destruction patches fix them + + IRC, OFTC, #debian-hurd, 2013-10-05: + + so what is missing for ruby2.0, only disabling use of + context for now, no? + i'm not tracking it closely, gg0_ is + maybe terceiro would accept a patch which only disables + *context, "maybe" because he rightly said changes must go + upstream + anyway with or without *context, many many tests in + makecheckall fail by making it hang, first with and without + assertion you removed, now they all simply hang + youpi: what do we want to do? if you're about finishing tls + migration (as i thought a couple of weeks ago), i won't propose + anything upstream. otherwise i could but that will have to be + reverted upstream once you finish + about tests, current ruby2.0 doesn't run makecheckall, only + makecheck which succeeds on hurd (w/o context) + if anyone wants to give it a try: + http://paste.debian.net/plain/51089 + first hunk makes makecheck (not makecheckall) succeed and + has been upstreamed, not packaged yet + what about makecheckall for ruby2.0? 
+ 16:58 < gg0_> anyway with or without *context, many many + tests in makecheckall fail by making it hang, first with and + without assertion you removed, now they all simply hang + i for a moment thought it as for 1.9.1, ok + these hangs should be debugged, yes + nope, tests behavior doesn't change between 1.9 and 2.0. i + started suppressing tests onebyone on 2.0 as well and as happened + on 1.9, i gave up cause there were too many + yep a smart mind could start debugging them, starting from + patch above pasted by a lazy one owner + one problem is that one can't reproduce them by isolate + them, they don't fail. start makecheckall then wait for one fail + now after my stupid report, someone like pinotree could take + it over, play with it for half an hour/an hour (which equals to + half a gg0's year/a gg0's year + ) + and fix them all + + 17:05 < gg0_> youpi: what do we want to do? if you're about + finishing tls migration (as i thought a couple of weeks ago), i + won't propose anything upstream. otherwise i could but that will + have to be reverted upstream once you finish + gg0_: I don't really know what to answer + that's why I didn't answer :) + youpi: well then we could upstream context disable and keep + it disabled even if you fix tls. ruby won't be as fast as it + would be with context but i don't think anyone will complain + about that. then once packaged, if terceiro doesn't enable + makecheckall, we will have ruby2.0 in main + that can be a plan yes + btw reverting it upstream should not be a problem eventually + sure, the thing is remembering to do it + filed http://bugs.ruby-lang.org/issues/8990 + please don't fix tls too soon :) + s/makecheck/maketest/g + + IRC, OFTC, #debian-hurd, 2013-10-08: + + ok. *context disabled http://bugs.ruby-lang.org/issues/8990 + + bt full of an attached stuck ruby test + http://paste.debian.net/plain/53788/ + anything useful? + uh, is that really all? 
+ there's not much interesting unfortunately + did you run thread apply all bt full ? + (not just bt full) + no just bt full + http://paste.debian.net/plain/53790/ + wait, there's a child + damn ctrl-c'ing while it was loading symbols made it crash :/ + restarted testsuite + isn't it interesting that failed tests fail only if testsuite + runs from beginning, whereas if run singularly, they succeed? + as it got out of whatever resources + youpi: http://paste.debian.net/plain/53798/ + the interesting part is actually right at the top + it's indeed stuck in the critical section spinlock + question being what is keeping it + iirc I had already checked in the whole glibc code that all + paths which lock critical_section_lock actually release it in + all cases, but maybe I have missed some + (I did find some missing paths, which I fixed) + i guess the same check you and braunr talk about in + discussion just before this anchor + http://darnassus.sceen.net/~hurd-web/open_issues/glibc/#recvmmsg + yes, but the issue we were discussing there is not what + happens here + we would see another thread stuck in the other way roudn, + otherwise + no way to get what is locking? + no, that's not recorded + and what about writing it somewhere right after getting the + lock? + one will have to do that in all spots taking that lock + but yes, that's the usual approach + i would give it try but eglibc rebuild takes too much time, + that conflicts with my laziness + i read even making locks timed would help + + IRC, OFTC, #debian-hurd, 2013-10-09: + + so correct order would be: + __spin_lock (&ss->lock); // locks sigstate + __spin_lock (&ss->critical_section_lock); + [do critical stuff] + __spin_unlock (&ss->critical_section_lock); + __spin_unlock (&ss->lock); // unlocks sigstate + ? 
+ + 21:44 < gg0> terceiro: backported to 2.0 (backport to 1.9 is + waiting) https://bugs.ruby-lang.org/issues/9000 + 21:46 < gg0> that means that if you take a 2.0 snapshot, + it'll build fine on hurd (unless you introduce maketestall as in + 1.9, that would make it get stuck like 1.9) + 21:48 < terceiro> gg0: nice + 21:48 < terceiro> I will try to upload a snapshot as soon as + I can + 21:52 < gg0> no problem. you might break my "conditional + satisfaction" by adding maketestall. better if you do that on + next+1 upload so we'll have at least one 2.0 built :) + + would it be a problem granting me access to a porter box to + rebuild eglibc+ruby2.0? + i'm already doing it on another vm but host often loses power + you cannot install random stuff on a porterbox though + i know i'd just need build-deps of eglibc+ruby2.0 i guess + (already accessed to porter machines in the past, account + lele, mips iirc) + ldap should remember that + don't want to disturb anyone else work btw. if it's not a + problem, nice. otherwise no problem + please send a request to admin@exodar.debian.net so it + is not forgotten + following this one would be too "official"? + http://dsa.debian.org/doc/guest-account/ + hurd is not a release architecture, so hurd machines are + not managed by DSA + ok + the general procedure outlines is ok though, just need + to be sent to the address above + sent + (1st signed mail with mutt, in the worst case i've attached + passphrase :)) + gg0: could you send me an ssh key? + no alioth account? + yes, but EPERM + youpi: sent to youpi@ + youpi@ ? + (... which doesn't exist :/) + sthibault@ + please test gg0-guest@exodar.debian.net ? + (I'd rather not adduser the ldap name, who knows what might + happen when you get your DD account) + i'm in. thanks + you're welcome + ldap users need to be adduser'ed? 
+ I'm not getting your ldap user account from ud-replicate, + at least + (btw i never planned to apply nm, i'd be honoured but i + simply think not to deserve it) + never say never ;) + bah i like failing. that would be a success. i can't :) + gg0-guest@exodar:~$ dchroot + E: Access not authorised + I: You do not have permission to access the schroot service. + I: This failure will be reported. + ah, right, iirc I need to add you somewhere + gg0: please retry? + works + good + are there already eglibc+ruby2.0 build-deps? + yes + oh that means i should do something myself now :) + yep, that had to happen at some point :) + my laziness thanks: "at some point" is better than "now" :) + + IRC, freenode, #hurd, 2013-10-10: + + ok just reproduced the + former. ../sysdeps/mach/hurd/jmp-unwind.c:53 waits + 20:37 < braunr> gg0: does ruby create and destroy threads + ? + no idea + braunr: days ago you and youpi talked about locking order + (just before this anchor + http://darnassus.sceen.net/~hurd-web/open_issues/glibc/#recvmmsg) + oh right + could you submit the fix for jmp-unwind.c to + upstream? + it didn't make it into the todo list + so correct order is in hurd_thread_cancel, right? + sorry about that + we need to make a pass to make sure it is + that means locking first ss->critical_section_lock _then_ + ss->lock + correct? + but considering how critical hurd_thread_cancel is, i + expect so + + i get the same deadlock by swapping locks + braunr: youpi: fyi ^ + 20:51 < braunr> 20:37 < braunr> gg0: does ruby create and + destroy threads ? + how could i check it? + gg0: ps -eflw + gg0: that's not surprising, since in the backtrace you + posted there isn't another thread locked in the other order + so it's really that somehow the thread is already in + critical section + youpi: you mean there is ? 
+ ah, it's not the same bug + no, in what he posted, no other thread is stuck + so it's not a locking order + just that the critical section is actually busy + youpi: ack + braunr: what's the other bug? ext2fs one? + gg0: idk + braunr: thanks. doesn't show threads (found -T for that) but + at least doesn't limit columns number if piped (thanks to -w) + it does + there is a TH column + ok thread count. -T gives more info + + IRC, freenode, #hurd, 2013-10-24: + + ruby2.0 builds fine with the to-be-uploaded libc btw + youpi: without d-ports patches? surprise me :) + gg0: plain main archive source + you did it. surprised + ah ok you just pushed your tls. great! + tls will fix a lot of things + + * `sigaltstack` + + IRC, freenode, #hurd, 2013-10-09: + + Hi, is sigaltstack() really supported, even if it is + defined as well as SA_ONSTACK? + probably not + well, + i don't know actually, mistaking with something else + it may be supported + iirc no + pinotree: are you sure? + this is what i remember + if you want to be sure that $foo works, just do the + usual way: test it yourself + found it: hurd/TODO: *** does sigaltstack/sigstack + really work? -- NO + well TODO is old and there were signal-related patches + by jk in the meanwhile, although i don't think they would have + changed this lack + in any case, test it + anybody fluent in assembly? Looks like this code + destroys the stack: http://paste.debian.net/54331/ + gnu_srs1: why would it ? + it does something special with the stack pointer but it + just looks like it aligns it to 16 bytes, maybe because of sse2 + restrictions (recent gcc align the stack already anyway) + Well, in that case it is the called function: + http://paste.debian.net/54341/ + how do you know there is a problem with the stack in the + first place ? + tracing up to here, everything is OK. then esp and ebp + are destroyed. + and single stepping goes backward until it segfaults + "destroyed" ? + zero if I remember correctly now. 
the x86 version built + for is i586, should that be changed to i486? + this shouldn't change anything + and they shouldn't get to 0 + use gdb to determine exactly which instruction resets the + stack pointer + how to step into the assembly part? using 's' steps + through the function since no line information: + Single stepping until exit from function + wine_call_on_stack, + which has no line number information. + gnu_srs1: use break on the address + how do i get the address of where the assembly starts? + * `recvmmsg`/`sendmmsg` (`t/sendmmsg`) From [[!message-id "20120625233206.C000A2C06F@topped-with-meat.com"]], diff --git a/open_issues/glibc/t/tls-threadvar.mdwn b/open_issues/glibc/t/tls-threadvar.mdwn index 7ce36f41..40d1463e 100644 --- a/open_issues/glibc/t/tls-threadvar.mdwn +++ b/open_issues/glibc/t/tls-threadvar.mdwn @@ -116,3 +116,40 @@ dropped altogether, and `__thread` directly be used in glibc. ## IRC, OFTC, #debian-hurd, 2013-09-23 yay, errno threadvar conversion success + + +## IRC, OFTC, #debian-hurd, 2013-10-05 + + youpi: any ETA for tls? 
+ gg0_: one can't have an ETA for bugfixing + i don't call them bugs if there's something missing to implement btw + no, here it's bugs + the implementation is already in the glibc branches in our + repository + it just makes some important regressions + + +## IRC, OFTC, #debian-hurd, 2013-10-07 + + about tls, I've made some "progress": now I'm wondering how raise() + has ever been working before :) + + +## IRC, OFTC, #debian-hurd, 2013-10-15 + + good, reply_port tls is now ok + last but not least, sigstate + + +## IRC, OFTC, #debian-hurd, 2013-10-21 + + started testsuite with threadvars dropped completely + so far so good + + +## IRC, OFTC, #debian-hurd, 2013-10-24 + + ok, hurd boots with full-tls libc, no threadvars at all any more + \o/ + good bye threadvars bugs, welcome tls ones ;) + now I need to check that threads can really use another stack :) diff --git a/open_issues/gnumach_page_cache_policy.mdwn b/open_issues/gnumach_page_cache_policy.mdwn index 5e93887e..77e52ddb 100644 --- a/open_issues/gnumach_page_cache_policy.mdwn +++ b/open_issues/gnumach_page_cache_policy.mdwn @@ -811,3 +811,63 @@ License|/fdl]]."]]"""]] have* and even if laggy, it doesn't feel much more than the usual lag of a network (ssh) based session + + +# IRC, freenode, #hurd, 2013-10-08 + + hmm i have to change what gnumach reports as being cached memory + + +## IRC, freenode, #hurd, 2013-10-09 + + mhmm, i'm able to copy files as big as 256M while building debian + packages, using a gnumach kernel patched for maximum memory usage in the + page cache + just because i used --sync=30 in ext2fs + a bit of swapping (around 40M), no deadlock yet + gitweb is a bit slow but that's about it + that's quite impressive + i suspect thread storms might not even be the cataclysmic event + that we thought it was + the true problem might simply be parallel fs synces + + +## IRC, freenode, #hurd, 2013-10-10 + + even with the page cache patch, memory filled, swap used, and lots + of cached objects (over 
200k), darnassus is impressively resilient + i really wonder whether we fixed ext2fs deadlock + + youpi: fyi, darnassus is currently running a patched gnumach with + the vm cache changes, in hope of reproducing the assertion errors we had + in the past + i increased the sync interval of ext2fs to 30s like we discussed a + few months back + and for now, it has been very resilient, failing only because of + the lack of kernel map entries after several heavy package builds + wait the latter wasn't a deadlock it resumed after 1363.06 s + gg0: thread storms can sometimes (rarely) fade and let the system + resume "normally" + which is why i increased the sync interval to 30s, this leaves + time between two intervals for normal operations + otherwise writebacks are queued one after the other, and never + processed fast enough for that queue to become empty again (except + rarely) + youpi: i think we should consider applying at least the sync + interval to exodar, since many DDs are just unaware of the potential + problems with large IOs + sure + + 222k cached objects (1G of cached memory) and darnassus is still + kicking :) + youpi: those lock fixing patches your colleague sent last year + must have helped somewhere + :) + + +## IRC, freenode, #hurd, 2013-10-13 + + braunr: how are your tests going with the object cache? + youpi: not so good + youpi: it failed after 2 days of straight building without a + single error output :/ diff --git a/open_issues/hurd_101.mdwn b/open_issues/hurd_101.mdwn index 574a03ec..25822512 100644 --- a/open_issues/hurd_101.mdwn +++ b/open_issues/hurd_101.mdwn @@ -60,3 +60,41 @@ Not the first time that something like this is proposed... 
how ipc works and understand exactly what state is stored where ok + + +# IRC, freenode, #hurd, 2013-10-12 + + Hi all, can anyone expand on + https://www.gnu.org/software/hurd/contributing.html - if I proceed with + the quick start and have the system running in a virtual image, how do I + go from there to being able to start tweaking the source (and recompiling + ) in a meaningful way? + Would I modify the source, compile within the VM and then what + would be the next step to actually test my new changes? + ahungry: we use debian + i suggest formatting your changes into patches, importing them + into debian packages, rebuilding those packages, and installing them over + the upstream ones + what about modifications to mach itself? or say I wanted to try + to work on the wifi drives - I would build the translator or module or + whatever and just add to the running instance of hurd? + s/drives/drivers + same thing + although + during development, it's obviously a bit too expensive to rebuild + complete packages each time + you can use the hurd on top of a gnumach kernel built completely + from upstream sources + you need a few debian patches for the hurd itself + a lot of them for glibc + i usually create a temporary local branch with the debian patches + i need to make my code run + and then create the true development branch itself from that one + drivers are a dark corner of the hurd + i wouldn't recommend starting there + but if you did, yes, you'd write a server to run drivers, and + start it + you'd probably write a translator (which is a special kind of + server), yes + braunr: thanks for all the info, hittin the sack now but ill have + to set up a box and try to contribute diff --git a/open_issues/hurd_init.mdwn b/open_issues/hurd_init.mdwn index b0b58a70..cc06935c 100644 --- a/open_issues/hurd_init.mdwn +++ b/open_issues/hurd_init.mdwn @@ -214,3 +214,11 @@ License|/fdl]]."]]"""]] I've been hacking on init/startup, I've looked into cleaning it up + + +## 
IRC, freenode, #hurd, 2013-10-07 + + braunr: btw, what do you think of my /hurd/startup proposal? + i haven't read it in detail yet + it's about separating init right ? + yes diff --git a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn index 6f09ea0d..feea7c0d 100644 --- a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn +++ b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn @@ -413,3 +413,67 @@ Address problem mentioned in [[/libpthread]], *Threads' Death*. oh, git is multithreaded great so i've actually tested my libpthread patch quite a lot + + +## IRC, freenode, #hurd, 2013-09-25 + + on a side note, i was able to build gnumach/libc/hurd packages + with thread destruction + nice :) + they boot and work mostly fine, although they add their own issues + e.g. the comm field of the root ext2fs is empty + ps crashes when trying to display threads + but thread destruction actually works, i.e. servers (those that + are configured that way at least) go away after some time, and even + heavily used servers such as ext2fs dynamically scale over time :) + + +## IRC, freenode, #hurd, 2013-10-10 + + concerning threads, i think i figured out the last bugs i had with + thread destruction + it should be well on its way to be merged by the end of the year + + +## IRC, freenode, #hurd, 2013-10-11 + + braunr: is your thread destruction patch ready for testing? 
+ gg0: there are packages at my repository, yes + but i still have hurd fixes to do before i polish it + in particular, posix says returning from main() stops the entire + process and all other threads + i didn't check that during the switch to pthreads, and ext2fs (and + maybe others) actually return from main but expect other threads to live + on + this creates problems when the main thread is actually destroyed, + but not the process + braunr: tmpfs does something like that, but calls pthread_exit + at the end of main + same effect + this was fine with cthreads, but must be changed with pthreads + and libpthread must be fixed to enforce it + (or libc) + + diskfs_startup_diskfs should probably be changed to reuse the main + thread instead of returning + + +## IRC, freenode, #hurd, 2013-10-19 + + I know what threads are, but what is 'thread destruction'? + the hurd currently never destroys individual threads + they're destroyed when tasks are destroyed + if the number of threads in a task peaks at a high number, say + thousands of them, they'll remain until the task is terminated + such tasks are usually file systems, normally never restarted (and + in the case of the root file system, not restartable) + this results in a form of leak + another effect of this leak is that servers which should go away + because of inactivity still remain + since thread destruction doesn't actually work, the debian package + uses a patch to prevent worker threads from timeouting + and to finish with, since thread destruction actually doesn't + work, normal (unpatched) applications that destroy threads are certainly + failing bad + i just need to polish a few things, wait for youpi to finish his + work on TLS to resolve conflicts, and that will be all diff --git a/open_issues/lsof.mdwn b/open_issues/lsof.mdwn index 2cbf2302..2651932d 100644 --- a/open_issues/lsof.mdwn +++ b/open_issues/lsof.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] 
+[[!meta copyright="Copyright © 2010, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -11,3 +11,41 @@ License|/fdl]]."]]"""]] We don't have a `lsof` tool. Perhaps we could cook something with having a look at which ports are open at the moment (as [[`portinfo`|hurd/portinfo]] does, for example)? + + +# IRC, freenode, #hurd, 2013-10-16 + + braunr: there's something I've been working on, it's not yet + finished but usable + http://paste.debian.net/58266/ + it graphs port usage + it's a bit heavy on the dependency-side though... + but + is it able to link rights from different ipc spaces ? + no + what do you mean exactly? + know that send right 123 in task 1 refers to receive right 321 in + task 2 + basically, lsof + i'm not sure it's possible right now, and that's what we'd really + need + does the kernel hand out this information? + ^ + right, I'm not sure it's possible either + but a graph maker in less than 300 is cute :) + 300 lines* + well, it leverages pymatplotlib or something, it needs half of + the pythonverse ;) + lsof and pmap and two tools we really lack on the hurd + what does portinfo --translate=PID do? 
+ i guess it asks proc so that ports that refer to task actually + give useful info + hml + no + doesn't make sense to give a pid in this case + teythoon: looks like it does what we talked about + :) + teythoon: the output looks a bit weird anyway, i think we need to + look at the code to be sure + braunr: this is what aptitude update looks like: + https://teythoon.cryptobitch.de/portmonitor/aptitude_portmonitor.svg diff --git a/open_issues/mach-defpager_swap.mdwn b/open_issues/mach-defpager_swap.mdwn index 7d3b001c..6e4dc088 100644 --- a/open_issues/mach-defpager_swap.mdwn +++ b/open_issues/mach-defpager_swap.mdwn @@ -18,3 +18,24 @@ License|/fdl]]."]]"""]] I allocated a 5GB partition as swap, but hurd only found 1GB use 2GiB swaps only, >2Gib are not supported (and apparently it just truncates the size, to be investigated) + +## IRC, freenode, #hurd, 2013-10-25 + + mkswap truncated the swap partition to 2GB + :/ + have you checked with 'free' ? + I have a 4gb swap partition on one of my boxes + how did you create it? + 2gig swap alright + according to free + + +# Swap Files + +## IRC, freenode, #hurd, 2013-10-25 + + C-Keen: swapfiles are known to work very badly on the hurd + swapfiles cause recursion and reservation problems on every system + but on the hurd, we just never took the time to fix the swap code + +Same issues as we generally would have with `hurd-defpager`? diff --git a/open_issues/multiprocessing.mdwn b/open_issues/multiprocessing.mdwn index 0ac7f195..eaaa2289 100644 --- a/open_issues/multiprocessing.mdwn +++ b/open_issues/multiprocessing.mdwn @@ -17,7 +17,7 @@ for applying multiprocessing. That is, however, only true from a first and inexperienced point of view: there are many difficulties. 
-IRC, freenode, #hurd, August / September 2010 +# IRC, freenode, #hurd, August / September 2010 silver_hook: because multi-server systems depend on inter-process communication, and inter-process communication is many times more @@ -32,7 +32,7 @@ IRC, freenode, #hurd, August / September 2010 serious research challenges -IRC, freenode, #hurd, 2011-07-26 +# IRC, freenode, #hurd, 2011-07-26 < braunr> 12:03 < CTKArcher> and does the hurd take more advantages in a multicore architecture than linux ? @@ -57,7 +57,7 @@ IRC, freenode, #hurd, 2011-07-26 < braunr> (here, thread migration means being dispatched on another cpu) -debian-hurd list +# debian-hurd list On Thu, Jan 02, 2003 at 05:40:00PM -0800, Thomas Bushnell, BSG wrote: > Georg Lehner writes: diff --git a/open_issues/performance.mdwn b/open_issues/performance.mdwn index ae05e128..772fd865 100644 --- a/open_issues/performance.mdwn +++ b/open_issues/performance.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -44,6 +44,8 @@ call|/glibc/fork]]'s case. * [[metadata_caching]] + * [[community/gsoc/project_ideas/object_lookups]] + --- diff --git a/open_issues/performance/io_system/read-ahead.mdwn b/open_issues/performance/io_system/read-ahead.mdwn index cd39328f..05a58f2e 100644 --- a/open_issues/performance/io_system/read-ahead.mdwn +++ b/open_issues/performance/io_system/read-ahead.mdwn @@ -3031,3 +3031,13 @@ License|/fdl]]."]]"""]] so, add? if that's what you want to do, ok i'll think about your initial question tomorrow + + +## IRC, freenode, #hurd, 2013-09-30 + + talking about which... did the clustered I/O work ever get + concluded? 
+ antrik: yes, mcsim was able to finish clustered pageins, and it's + still on my TODO list + it will get merged eventually, now that the large store patch has + also been applied diff --git a/open_issues/performance/microkernel_multi-server.mdwn b/open_issues/performance/microkernel_multi-server.mdwn index 111d2b88..0382c835 100644 --- a/open_issues/performance/microkernel_multi-server.mdwn +++ b/open_issues/performance/microkernel_multi-server.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -12,7 +12,8 @@ License|/fdl]]."]]"""]] Performance issues due to the microkernel/multi-server system architecture? -IRC, freenode, #hurd, 2011-07-26 + +# IRC, freenode, #hurd, 2011-07-26 < CTKArcher> I read that, because of its microkernel+servers design, the hurd was slower than a monolithic kernel, is that confirmed ? @@ -45,3 +46,181 @@ IRC, freenode, #hurd, 2011-07-26 < braunr> but in 95, processors weren't that fast compared to other components as they are now < youpi> while disk/mem haven't evovled so fast + + +# IRC, freenode, #hurd, 2013-09-30 + + ok.. i noticed when installing debian packages in X, the mouse + lagged a little bit + that takes me back to classic linux days + it could be a side effect of running under virtualisation who + knows + no + it's because of the difference of priorities between server and + client tasks + is it simple enough to increase the priority of the X server? + it does remind me of the early linux days.. people were more + interested in making things work, and making things not crash.. 
than + improving the desktop interactivity or responsiveness + very low priority :P + snadge: actually it's not the difference in priority, it's the + fact that some asynchronous processing is done at server side + the priority difference just gives more time overall to servers + for that processing + snadge: when i talk about servers, i mean system (hurd) servers, + no x + yeah.. linux is the same.. in the sense that, that was its + priority and focus + snadge: ? + servers + what are you talking about ? + going back 10 years or so.. linux had very poor desktop + performance + i'm not talking about priorities for developers + it has obviously improved significantly + i'm talking about things like nice values + right.. and some of the modifications that have been done to + improve interactivity of an X desktop, are not relevant to servers + not relevant at all since it's a hurd problem, not an x problem + yeah.. that was more of a linux problem too, some time ago was the + only real point i was making.. a redundant one :p + where i was going with that.. was desktop interactivity is not a + focus for hurd at this time + it's not "desktop interactivity" + it's just correct scheduling + is it "correct" though.. the scheduler in linux is configurable, + and selectable + depending on the type of workload you expect to be doing + not really + it can be interactive, for desktop loads.. or more batched, for + server type loads.. is my basic understanding + no + that's the scheduling policy + the scheduler is cfs currently + and that's the main difference + cfs means completely fair + whereas back in 2.4 and before, it was a multilevel feedback + scheduler + i.e. 
a scheduler with a lot of heuristics + the gnumach scheduler is similar, since it was the standard + practice from unix v6 at the time + (gnumach code base comes from bsd) + so 1/ we would need a completely fair scheduler too + and 2/ we need to remove asynchronous processing by using mostly + synchronous rpc + im just trying to appreciate the difference between async and sync + event processing + on unix, the only thing asynchronous is signals + on the hurd, simply cancelling select() can cause many + asynchronous notifications at the server to remove now unneeded resources + when i say cancelling select, i mean one or more fds now have + pending events, and the others must be cleaned + yep.. thats a pretty fundamental change though isnt it? .. if im + following you, you're talking about every X event.. so mouse move, + keyboard press etc etc etc + instead of being handled async.. you're polling for them at some + sort of timing interval? + never mind.. i just read about async and sync with regards to rpc, + and feel like a bit of a noob + async provides a callback, sync waits for the result.. got it :p + async is resource intensive on hurd for the above mentioned + reasons.. makes sense now + how about optimising the situation where a select is cancelled, + and deferring the signal to the server to clean up resources until a + later time? + so like java.. dont clean up, just make a mess + then spend lots of time later trying to clean it up.. sounds like + my life ;) + reuse stale objects instead of destroying and recreating them, and + all the problems associated with that + but if you're going to all these lengths to avoid sending messages + between processes + then you may as well just use linux? :P + im still trying to wrap my head around how converting X to use + synchronous rpc calls will improve responsiveness + what has X to do with it? + nothing wrong with X.. 
braunr just mentioned that hurd doesnt + really handle the async calls so well + there is more overhead.. that it would be more efficient on hurd, + if it uses sync rpc instead + and perhaps a different task scheduler would help also + ala cfs + but i dont think anyone is terribly motivated in turning hurd into + a desktop operating system just yet.. but i could be wrong ;) + i didn't say that + i misinterpreted what you said then .. im not surprised, im a + linux sysadmin by trade.. and have basic university OS understanding (ie + crap all) at a hobbyist level + i said there is asynchronous processing (i.e. servers still have + work to do even when there is no client) + that processing mostly comes from select requests cancelling what + they installed + i.e. you select fd 1 2 3, event on 2, you cancel on 1 and 3 + those cancellations aren't synchronous + the client deletes ports, and the server asynchronously receives + dead name notifications + since servers have a greater priority, these notifications are + processed before the client can continue + which is what makes you feel lag + X is actually a client here + when i say server, i mean hurd servers + the stuff implementing sockets and files + also, you don't need to turn the hurd into a desktop os + any correct way to do fair scheduling will do + can the X client be made to have a higher priority than the hurd + servers? + or perhaps something can be added to hurd to interface with X + well, the future is wayland + ufs .. unfair scheduling.. give priority to X over everything else + hurd almost seems ideal for that idea.. 
since the majority of the + system is separated from the kernel + im likely very wrong though :p + snadge: the reason we elevated the priority of servers is to avoid + delaying the processing of notifications + because each notification can spawn a server thread + and this led to cases where processing notifications was so slow + that spawning threads would occur more frequently, leading to the server + exhausting its address space because of thread stacks + cant it wait for X though? .. or does it lead to that situation + you just described + we should never need such special cases + we should remove async notifications + my logic is this.. if you're not running X then it doesnt + matter.. if you are, then it might.. its sort of up to you whether you + want priority over your desktop interface or whether it can wait for more + important things, which creates perceptible lag + snadge: no it doesn't + X is clearly not the only process involved + the whole chain should act synchronously + from the client through the server through the drivers, including + the file system and sockets, and everything that is required + it's a general problem, not specific to X + right.. from googling around, it looks like people get very + excited about asynchronous + there was a move to that for some reason.. it sounds great in + theory + continue processing something else whilst you wait for a + potentially time consuming process.. and continue processing that when + you get the result + its also the only way to improve performance with parallelism? 
+ which is of no concern to hurd at this time + snadge: please don't make such statements when you don't know what + you're talking about + it is a concern + and yes, async processing is a way to improve performance + but don't mistake async rpc and async processing + async rpc simply means you can send and receive at any time + sync means you need to recv right after send, blocking until a + reply arrives + the key word here is *blocking* + okay sure.. that makes sense + what is the disadvantage to doing it that way? + you potentially have more processes that are blocking? + a system implementing posix such as the hurd needs signals + and some event handling facility like select + implementing them synchronously means a thread ready to service + these events + the hurd currently has such a message thread + but it's complicated and also a scalability concern + e.g. you have at least two threads per process + bbl diff --git a/open_issues/pthread_atfork.mdwn b/open_issues/pthread_atfork.mdwn index 1b656f05..06b9d6c6 100644 --- a/open_issues/pthread_atfork.mdwn +++ b/open_issues/pthread_atfork.mdwn @@ -18,3 +18,89 @@ can probably be borrowed from `nptl/sysdeps/unix/sysv/linux/register-atfork.c`. SRCDIR/opal/mca/memory/linux/arena.c:387: warning: warning: pthread_atfork is not implemented and will always fail + + +# Samuel's implementation + +TODO. + + +## IRC, OFTC, #debian-hurd, 2013-10-08 + + youpi: if you need/want to test your pthread_atfork + implementation, you can check libposix-atfork-perl and its test suite + (whose test 004 hangs now, with eglibc -93) + while it failed previously indeed + we might simply need to rebuild perl against it + (I see ifdef pthread_atfork in perl) + + +## IRC, freenode, #hurd, 2013-10-16 + + tschwinge: I'd love to try your cross-gnu tool, the wiki page + suggests that the list of required source packages is outdated. can you + give me some hints? 
+ tschwinge: I got this error running cross-gnu: + http://paste.debian.net/58303/ + make[4]: Leaving directory `/home/teythoon/repos/hurd/cross/src/glibc/setjmp' + make subdir=string -C ../string ..=../ objdir=/home/teythoon/repos/hurd/cross/obj/glibc -f Makefile -f ../elf/rtld-Rules rtld-all rtld-modules='rtld-strchr.os rtld-strcmp.os rtld-strcpy.os rtld-strlen.os rtld-strnlen.os rtld-memchr.os rtld-memcmp.os rtld-memmove.os rtld-memset.os rtld-mempcpy.os rtld-stpcpy.os rtld-memcpy.os rtld-rawmemchr.os rtld-argz-count.os rtld-argz-extract.os rtld-stpncpy.os' + make[4]: Entering directory `/home/teythoon/repos/hurd/cross/src/glibc/string' + make[4]: Leaving directory `/home/teythoon/repos/hurd/cross/src/glibc/string' + make[4]: Entering directory `/home/teythoon/repos/hurd/cross/src/glibc/string' + make[4]: Nothing to be done for `rtld-all'. + make[4]: Leaving directory `/home/teythoon/repos/hurd/cross/src/glibc/string' + make[3]: Leaving directory `/home/teythoon/repos/hurd/cross/src/glibc/elf' + i686-pc-gnu-gcc -shared -static-libgcc -Wl,-O1 -Wl,-z,defs -Wl,-dynamic-linker=/lib/ld.so.1 -B/home/teythoon/repos/hurd/cross/obj/glibc/csu/ -Wl,--version-script=/home/teythoon/repos/hurd/cross/obj/glibc/libc.map -Wl,-soname=libc.so.0.3 -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both -nostdlib -nostartfiles -e __libc_main -L/home/teythoon/repos/hurd/cross/obj/glibc -L/home/teythoon/repos/hurd/cross/obj/glibc/math -L/home/teythoon/repos/hurd/cross/obj/glibc/elf -L/home/teythoon/repos/hurd/cross/obj/glibc/dlfcn -L/home/teythoon/repos/hurd/cross/obj/glibc/nss -L/home/teythoon/repos/hurd/cross/obj/glibc/nis -L/home/teythoon/repos/hurd/cross/obj/glibc/rt -L/home/teythoon/repos/hurd/cross/obj/glibc/resolv -L/home/teythoon/repos/hurd/cross/obj/glibc/crypt -L/home/teythoon/repos/hurd/cross/obj/glibc/mach -L/home/teythoon/repos/hurd/cross/obj/glibc/hurd 
-Wl,-rpath-link=/home/teythoon/repos/hurd/cross/obj/glibc:/home/teythoon/repos/hurd/cross/obj/glibc/math:/home/teythoon/repos/hurd/cross/obj/glibc/elf:/home/teythoon/repos/hurd/cross/obj/glibc/dlfcn:/home/teythoon/repos/hurd/cross/obj/glibc/nss:/home/teythoon/repos/hurd/cross/obj/glibc/nis:/home/teythoon/repos/hurd/cross/obj/glibc/rt:/home/teythoon/repos/hurd/cross/obj/glibc/resolv:/home/teythoon/repos/hurd/cross/obj/glibc/crypt:/home/teythoon/repos/hurd/cross/obj/glibc/mach:/home/teythoon/repos/hurd/cross/obj/glibc/hurd -o /home/teythoon/repos/hurd/cross/obj/glibc/libc.so -T /home/teythoon/repos/hurd/cross/obj/glibc/shlib.lds /home/teythoon/repos/hurd/cross/obj/glibc/csu/abi-note.o /home/teythoon/repos/hurd/cross/obj/glibc/elf/soinit.os /home/teythoon/repos/hurd/cross/obj/glibc/libc_pic.os /home/teythoon/repos/hurd/cross/obj/glibc/elf/sofini.os /home/teythoon/repos/hurd/cross/obj/glibc/elf/interp.os /home/teythoon/repos/hurd/cross/obj/glibc/elf/ld.so /home/teythoon/repos/hurd/cross/obj/glibc/mach/libmachuser-link.so /home/teythoon/repos/hurd/cross/obj/glibc/hurd/libhurduser-link.so -lgcc + /home/teythoon/repos/hurd/cross/obj/glibc/libc_pic.os: In function `__fork': + /home/teythoon/repos/hurd/cross/src/glibc/posix/../sysdeps/mach/hurd/fork.c:70: undefined reference to `__start__hurd_atfork_prepare_hook' + /home/teythoon/repos/hurd/cross/lib/gcc/i686-pc-gnu/4.8.1/../../../../i686-pc-gnu/bin/ld: /home/teythoon/repos/hurd/cross/obj/glibc/libc_pic.os: relocation R_386_GOTOFF against undefined hidden symbol `__start__hurd_atfork_prepare_hook' can not be used when making a shared object + /home/teythoon/repos/hurd/cross/lib/gcc/i686-pc-gnu/4.8.1/../../../../i686-pc-gnu/bin/ld: final link failed: Bad value + collect2: error: ld returned 1 exit status + make[2]: *** [/home/teythoon/repos/hurd/cross/obj/glibc/libc.so] Error 1 + make[2]: Leaving directory `/home/teythoon/repos/hurd/cross/src/glibc/elf' + make[1]: *** [elf/subdir_lib] Error 2 + make[1]: Leaving directory 
`/home/teythoon/repos/hurd/cross/src/glibc'
+    make: *** [all] Error 2
+    + rm -f /home/teythoon/repos/hurd/cross/sys_root/lib/ld.so
+    + exit 100
+
+    binutils-2.23.2,
+    gcc-4.8.1,
+    everything else is from git as specified in the wiki.
+
+
+## IRC, freenode, #hurd, 2013-10-24
+
+    in recent glibc commits (tschwinge/Roger_Whittaker branch) there
+    are references to _hurd_atfork_* symbols in sysdeps/mach/hurd/fork.c, and
+    some _hurd_fork_* symbols, some of the _hurd_fork_* symbols seem to be
+    defined in Hurd's boot/frankemul.ld (mostly guessing by their names being
+    mentioned, I don't know linker script syntax), but those _hurd_atfork_*
+    symbols don't seem to be defined there, are they supposed to be defined
+    elsewhere or is th
+    does anyone know where the _hurd_atfork_* group of symbols
+    referenced in glibc are defined (if anywhere)?
+    AliciaC: it's the DEFINE_HOOK (_hurd_atfork_prepare_hook, (void));
+    in glibc/sysdeps/mach/hurd/fork.c
+    hm, is that not just a declaration?
+    no, it's a definition, as its name suggests :
+    (despite the macro name)
+    :)
+    ok
+    I should look into it more, I could have sworn I was getting
+    undefined references, but maybe the symbol names used are different from
+    those defined, but that'd be odd as well, in the same file and all
+    I mean, I do get undefined references, but question is if it's to
+    things that should have been defined or not
+    what undefined references do you gaT?
+    s/gaT/get
+    I'll get back to you once I have that system up again
+    youpi: sysdeps/mach/hurd/fork.c:70: undefined reference to
+    `__start__hurd_atfork_prepare_hook'
+    fork.c:70: 'RUN_HOOK (_hurd_atfork_prepare_hook, ());'
+    DEFINE_HOOK (_hurd_atfork_prepare_hook, (void)); is higher up in
+    the file
+    though there is also this message: build/libc_pic.os: relocation
+    R_386_GOTOFF against undefined hidden symbol
+    `__start__hurd_atfork_prepare_hook' can not be used when making a shared
+    object
diff --git a/open_issues/smp.mdwn b/open_issues/smp.mdwn
index a45a1e22..89474d25 100644
--- a/open_issues/smp.mdwn
+++ b/open_issues/smp.mdwn
@@ -37,3 +37,11 @@ See also the [[FAQ entry|faq/smp]].
 ## Richard, 2013-03-20
 
 This task actually looks too big for a GSoC project.
+
+
+## IRC, freenode, #hurd, 2013-09-30
+
+    also, while the problem with hurd is about I/O, it's actually a
+    lot more about caching, and even with more data cached in, the true
+    problem is contention, in which case having several processors would
+    actually slow things down even more
diff --git a/open_issues/strict_aliasing.mdwn b/open_issues/strict_aliasing.mdwn
index b7d39805..0e59f796 100644
--- a/open_issues/strict_aliasing.mdwn
+++ b/open_issues/strict_aliasing.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]]
 
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
 id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -29,3 +29,16 @@ License|/fdl]]."]]"""]]
     issues (if gcc catches them all)
     The strict aliasing things should be fixed, yes. Some might be
     from MIG.
+
+
+# IRC, freenode, #hurd, 2013-10-17
+
+    we should build gnumach and the hurd with -fno-strict-aliasing
+    aren't the mig-generated stubs the only issues related to that?
+    no
+    b/c we often have pointers of different type pointing to the
+    same address?
for example code using libports?
+    the old linux code, including pfinet, and even the hurd libraries,
+    use techniques that assume aliasing
+    exactly
+    right, I agree
diff --git a/open_issues/thread-cancel_c_55_hurd_thread_cancel_assertion___spin_lock_locked_ss_critical_section_lock.mdwn b/open_issues/thread-cancel_c_55_hurd_thread_cancel_assertion___spin_lock_locked_ss_critical_section_lock.mdwn
index 7159551d..f40e0455 100644
--- a/open_issues/thread-cancel_c_55_hurd_thread_cancel_assertion___spin_lock_locked_ss_critical_section_lock.mdwn
+++ b/open_issues/thread-cancel_c_55_hurd_thread_cancel_assertion___spin_lock_locked_ss_critical_section_lock.mdwn
@@ -50,3 +50,5 @@ IRC, unknown channel, unknown date:
     result in others trying to take it...
     nope: look at the code :)
     or maybe the cancel_hook, but I really doubt it
+
+See discussion about *`critical_section_lock`* on [[glibc]].
diff --git a/open_issues/time.mdwn b/open_issues/time.mdwn
index 367db872..d9f1fa1d 100644
--- a/open_issues/time.mdwn
+++ b/open_issues/time.mdwn
@@ -837,3 +837,17 @@ not get a define for `HZ`, which is then defined with a fallback value of 60.
     braunr: Guile2 works smoothly now, let me try something cool
     with it
     nalaginrut: nice
+
+
+### IRC, OFTC, #debian-hurd, 2013-09-29
+
+    youpi: is the latest glibc carrying the changes related to
+    timing? what about gb guile-2.0 with it?
+    it does
+    so that was the only issue with guile?
+    well at least we'll see
+    iirc yes
+    according to nalaginrut and the latest build log, it'd seem so
+    started
+    yay, guile-2.0 :)
+    yay
diff --git a/open_issues/wine.mdwn b/open_issues/wine.mdwn
index 65e6c584..f8bb469b 100644
--- a/open_issues/wine.mdwn
+++ b/open_issues/wine.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation,
+Inc."]]
 
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
 id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -21,7 +22,7 @@ requirements Wine has: only libc / POSIX / etc., or if there are
 allocation. There is kernel support for this,* however.
 
 
-IRC, freenode, #hurd, 2011-08-11
+# IRC, freenode, #hurd, 2011-08-11
 
     < arethusa> I've been trying to make Wine work inside a Debian GNU/Hurd VM,
     and to that end, I've successfully compiled the latest sources from Git
@@ -67,3 +68,13 @@ IRC, freenode, #hurd, 2011-08-11
     < youpi> yes
     < pinotree> (but that patch is lame)
+
+
+# IRC, freenode, #hurd, 2013-10-02
+
+    youpi: I've come a little further with wine, see debian bug
+    #724681 (same problem).
+    Now the problem is probably due to the specific address space
+    and stack issues to be
+    fixed for wine to run as braunr pointed out some months ago
+    (IRC?) when we discussed wine.
-- 
cgit v1.2.3