From 12c341b917921eb631026ec44a284c4d884e5de6 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge
Date: Wed, 6 Mar 2013 21:52:20 +0100
Subject: IRC.

---
 open_issues/arm_port.mdwn | 16 +-
 open_issues/bash.mdwn | 16 +-
 open_issues/clock_gettime.mdwn | 121 +++++-
 open_issues/dde.mdwn | 85 +++-
 open_issues/gcc/pie.mdwn | 10 +-
 open_issues/git-core-2.mdwn | 52 ++-
 open_issues/git_duplicated_content.mdwn | 7 +-
 open_issues/glibc.mdwn | 270 +++++++++++-
 open_issues/gnumach_page_cache_policy.mdwn | 29 +-
 open_issues/gnumach_panic_thread_dispatch.mdwn | 20 +
 open_issues/hurd_101.mdwn | 25 +-
 open_issues/libmachuser_libhurduser_rpc_stubs.mdwn | 14 +-
 open_issues/libpthread.mdwn | 346 ++++++++++++++++
 open_issues/mach_tasks_memory_usage.mdwn | 36 +-
 open_issues/mission_statement.mdwn | 11 +-
 open_issues/multithreading.mdwn | 90 +++-
 open_issues/nice_vs_mach_thread_priorities.mdwn | 14 +
 open_issues/ogi.mdwn | 31 +-
 open_issues/packaging_libpthread.mdwn | 24 ++
 open_issues/performance/io_system/read-ahead.mdwn | 429 ++++++++++++++++++-
 open_issues/pfinet_timers.mdwn | 17 +
 ...local_socket_credentials_for_local_sockets.mdwn | 26 +-
 open_issues/ps_SIGSEGV.mdwn | 17 +
 open_issues/rpc_stub_generator.mdwn | 49 ++-
 open_issues/select.mdwn | 458 +++++++++++++++++++--
 open_issues/select_vs_signals.mdwn | 15 +
 open_issues/some_todo_list.mdwn | 1 -
 open_issues/subhurd_vs_proc_server.mdwn | 54 +++
 open_issues/syslog.mdwn | 11 +-
 open_issues/system_stats.mdwn | 8 +-
 open_issues/systemd.mdwn | 14 +-
 .../translators_set_up_by_untrusted_users.mdwn | 229 ++++++++++-
 open_issues/virtualization.mdwn | 2 +
 open_issues/virtualization/fakeroot.mdwn | 17 +
 .../virtualization/remap_root_translator.mdwn | 44 ++
 open_issues/vm_map_kernel_bug.mdwn | 19 +-
 36 files changed, 2533 insertions(+), 94 deletions(-)
 create mode 100644 open_issues/gnumach_panic_thread_dispatch.mdwn
 create mode 100644 open_issues/pfinet_timers.mdwn
 create mode 100644 open_issues/ps_SIGSEGV.mdwn
 create mode 100644 
open_issues/subhurd_vs_proc_server.mdwn create mode 100644 open_issues/virtualization/fakeroot.mdwn diff --git a/open_issues/arm_port.mdwn b/open_issues/arm_port.mdwn index 65a82d92..8a2a037f 100644 --- a/open_issues/arm_port.mdwn +++ b/open_issues/arm_port.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -260,3 +260,17 @@ architecture. Well, I have a ARM gnumach kernel compiled. It just doesn't run! :) matty3269: good luck :) + + +# IRC, freenode, #hurd, 2013-01-30 + + Hi, i've read there's an ongoing effort to port GNU Mach to ARM. How + is it going? + not sure where you read that + but i'm pretty sure it's not started if it exists + braunr: http://www.gnu.org/software/hurd/open_issues/arm_port.html + i confirm what i said + braunr: OK, thanks. I'm interested on it, and didn't want to + duplicate efforts. + little addition: it may have started, but we don't know about it + diff --git a/open_issues/bash.mdwn b/open_issues/bash.mdwn index 47598071..f6b14a08 100644 --- a/open_issues/bash.mdwn +++ b/open_issues/bash.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2009 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2009, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -8,7 +8,8 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]] -[[!tag open_issue_porting]] +[[!tag open_issue_glibc open_issue_porting]] + # *bash* 4.0 vs. 
typing `C-c` (*SIGINT*) @@ -45,3 +46,14 @@ After having noticed that this error doesn't occur if starting *bash* with bash: start_pipeline: pgrp pipe: (ipc/mig) wrong reply message ID So, there's something different with stdout in / after the SIGINT handler. + + +# IRC, freenode, #hurd, 2013-01-13 + +Perhaps completely unrelated to the issue above, perhaps not. + + bash: xmalloc: ../../../bash/lib/sh/strtrans.c:60: cannot + allocate 261 bytes (323584 bytes allocated) + 1.5 GiB RAM were free. + This happened when I did a rever history search (C-r [...]), + and then pressed C-c. diff --git a/open_issues/clock_gettime.mdwn b/open_issues/clock_gettime.mdwn index 5345ed6b..83ad81e8 100644 --- a/open_issues/clock_gettime.mdwn +++ b/open_issues/clock_gettime.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -23,7 +23,8 @@ applications assume that it is. What about adding a nanosecond-precision clock, too? --[[tschwinge]] -IRC, freenode, #hurd, 2011-08-26: + +# IRC, freenode, #hurd, 2011-08-26 < pinotree> youpi: thing is: apparently i found a simple way to have a monotonic clock as mmap-able device inside gnumach @@ -40,7 +41,8 @@ IRC, freenode, #hurd, 2011-08-26: < braunr> sure < youpi> and that's the way I was considering implementing it -IRC, freenode, #hurd, 2011-09-06: + +# IRC, freenode, #hurd, 2011-09-06 yeah, i had a draft of improved idea for also handling nanoseconds @@ -69,3 +71,116 @@ IRC, freenode, #hurd, 2011-09-06: yes And it will forever be a witness of the evolving of this map_time interface. :-) + + +# IRC, freenode, #hurd, 2013-02-11 + +In context of [[select]]. + + braunr: would you send for review (and inclusion) your + time_data_t addition? 
+ this way we could add nanosecs-based utime rpc (and then their + implementation in libc) + pinotree: it's part of the hurd branch + do you want it sent separately ? + yeah + ok + let me get it right first :) + sure :) + + +## IRC, freenode, #hurd, 2013-02-12 + + pinotree: + http://git.savannah.gnu.org/cgit/hurd/hurd.git/commit/?h=rbraun/select_timeout_pthread_v2&id=6ec50e62d9792c803d00cbff1cab2c0b3675690a + uh nice + will need two small inline functions to convert time_data_t <-> + timespec, but that's it + hm right + i could have thought about it + but i'll leave it for another patch :p + oh sure, no hurry + + +## IRC, freenode, #hurd, 2013-02-19 + + braunr: about time_data_t, I get it's needed that it be an array + so it can be passed by reference, not by value? + by address, yes + that's the difference between array and struct + + +## IRC, freenode, #hurd, 2013-02-25 + + braunr: why did you want to see time_data passed as pointer, not as + struct? + to microoptimize + the struct is 2 64-bit integers + well, we already pass structs along in a few cases, + e.g. io_statbuf_t, rusage_t, etc. + be it written t[0].sec or t->sec, it seems odd + copying 2 64bit integers is not much compared to the potential for + bugs here + bugs ? + yes, as in trying to access t[1], passing a wrong pointer, etc. + or the reader frowning on "why is this case different than the + others?" + well, i'm already usually frowning when i see what mig does .. + right + on the plus side, it's only the client side, i.e. mostly glibc, + which sees the t[0] + and the practice established by my patch is to convert to struct + timespec as soon as possible + the direct use of this type is therefore limited + could we define time_data_t as a struct time_data * instead of + struct time_data[1] ? 
+ (in the .h) + that would make more sense to define a struct time_data, and pass a + pointer to it + i'm not sure + the mach server writing guide was very clear about array implying + a C array too + and i remember having compilation problems before doing that + but i don't remember their nature exactly + I'm not sure to understand what you said about converting to struct + timespec + what makes it not possible now? + and what is the relation with being an array or a pointer? + concerning struct timespec, what i mean is that the functions + called by the mig stub code directly convert time_data_t to a struct + timespec (which is the real type used throughout the hurd code) + about the rest, i'm not sure, i'd have to try again + mig just assumes it's an array + and why not just using struct timespec? + (for the mig type too) + my brain can't correctly compute variable sized types in mig + definition files + i wanted something that would remain correct for the 64-bit port + ah, you mean because tv_nsec is a long, which will not be the same + type? + and tv_sec being a time_t (thus a long too) + but we have the same issue e.g. for the rusage structure, don't we? + yes + so we'll have to fix things for that too anyway + sure + making a special case will not necessarily help + but it doesn't mean new interfaces have to be buggy too + well, using the proper type in the server itself is nicer + instead of having to convert + yes + i'm not exactly sure where to declare struct timespec then + should it be declared in hurd_types.h, and simply reused by the + libc headers ? + ? 
AIUI, it's the converse, hurd_types.h uses the struct timespec + from libc headers, and defines timespec_t + ok + timespec_t being the internal type whose definition gets done right + for mig to do the right thing + yes + i see + so, you'd like a struct of integer_t instead of an array of + signed64 + for our current 32bit userland yes + do you want to make the changes yourself or should i add a new + branch ? + and we'll make that a 64bit struct when we have a64bit userland diff --git a/open_issues/dde.mdwn b/open_issues/dde.mdwn index 5f6fcf6a..b25e53d7 100644 --- a/open_issues/dde.mdwn +++ b/open_issues/dde.mdwn @@ -33,7 +33,7 @@ The plan is to use [[libstore_parted]] for accessing partitions. ## IRC, freenode, #hurd, 2012-02-08 -At the microkernel davroom at [[community/meetings/FOSDEM_2012]]: +After the microkernel devroom at [[community/meetings/FOSDEM_2012]]: there was quite some talk about DDE. I learnt that there are newer versions in Genode and in Minix (as opposed to the DROPS one we are @@ -109,6 +109,40 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]: you want. It uses Linux 2.6.26 usb subsystem. +## IRC, freenode, #hurd, 2013-02-15 + +After the microkernel devroom at [[community/meetings/FOSDEM_2013]]. + + youpi: speaking of dde, was there any will among other + microkernel os developers to eventually develop one single dde (with + every team handling the custom glue of the own kernel)? 
+ well, there is still upstream dde actually + in dresden + nothing was really decided or anything (it was a round table, not a + workgroup) + but conversation converged into sharing the DDE maintenance, yes + and dresden would be the logical central place + pb is that they don't have the habit of being very open + http://svn.tudos.org/repos/oc/tudos/trunk/l4/pkg/dde has a recent + enough version + which macsim confirmed having all the latest commits from the + internal repository + i see + so it seems a viable solution on the medium term + the long term might need a real visible open source project + but we should probably still keep basing on dresden work + (better take work being done anywhere) + well, if the upstream is not really open, microkernel teams + could just fork it and all work on it + that's what I mean + should still be a win than everybody maintaining their own dde + sure + ah yes, i was writing and i'm slow at it :) + but at least we can try to work with dresden + see how open they could become by just asking :) + right + + # IRC, OFTC, #debian-hurd, 2012-02-15 i have no idea how the dde system works @@ -484,26 +518,17 @@ At the microkernel davroom at [[community/meetings/FOSDEM_2012]]: *sigh* - -# IRC, freenode, #hurd, 2012-08-18 +## IRC, freenode, #hurd, 2012-08-18 hum, leaks and potential deadlocks in libddekit/thread.c :/ -# IRC, freenode, #hurd, 2012-08-18 +## IRC, freenode, #hurd, 2012-08-18 nice, dde relies on a race to start .. -# IRC, freenode, #hurd, 2012-08-18 - - hm looks like if netdde crashes, the kernel doesn't handle it - cleanly, and we can't attach another netdde instance - -[[!message-id "877gu8klq3.fsf@kepler.schwinge.homeip.net"]] - - -# IRC, freenode, #hurd, 2012-08-21 +## IRC, freenode, #hurd, 2012-08-21 In context of [[libpthread]]. @@ -513,6 +538,40 @@ In context of [[libpthread]]. 
either in netdde or pfinet +## IRC, freenode, #hurd, 2013-02-28 + + (which needs the same kinds of fixes that libpthread got) + actually i'm not sure why he didn't simply reuse the pthread + functions :/ + which kind of fixes? + cancellation? + timeouts + cancellation too but that's less an issue + I'm not sure it really needs timeout work + on what RPC? + pfinet is just using the mach interface + i don't know but it clearly copies some of the previous pthread + code from pthread_cond_timedwait + see libddekit/thread.c:_sem_timedwait_internal + I recognize the comment indeed :) + I guess he thought he might need some particular semantic that + libpthread may not provide + also, now that i think about it, he couldn't have used libpthread, + could he ? + and there was no condition_timedwait in cthreads + there is a deadlock in netdde + it occurs sometimes, at high network speeds + (well high, 4 MiB/s or more) + + +# IRC, freenode, #hurd, 2012-08-18 + + hm looks like if netdde crashes, the kernel doesn't handle it + cleanly, and we can't attach another netdde instance + +[[!message-id "877gu8klq3.fsf@kepler.schwinge.homeip.net"]] + + # DDE for Filesystems ## IRC, freenode, #hurd, 2012-10-07 diff --git a/open_issues/gcc/pie.mdwn b/open_issues/gcc/pie.mdwn index a4598d1e..da951001 100644 --- a/open_issues/gcc/pie.mdwn +++ b/open_issues/gcc/pie.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -13,7 +13,7 @@ License|/fdl]]."]]"""]] [[!tag open_issue_gcc]] -# IRC, freenode, #debian-hurd, 2012-11-08 +# IRC, freenode, #hurd, 2012-11-08 tschwinge: i'm not totally sure, but it seems the pie options for gcc/ld are causing issues @@ -38,3 +38,9 @@ License|/fdl]]."]]"""]] uh this causes the w3m build failure 
and (indirectly, due to elinks built with -pie) aptitude + + +## IRC, freenode, #hurd, 2013-01-19 + + pinotree: I can confirm that -fPIE -pie fails and only -fPIE + works for mktable in w3m. Still have to check with elinks. What's up doc? diff --git a/open_issues/git-core-2.mdwn b/open_issues/git-core-2.mdwn index 2d8ad96b..cbf47bd2 100644 --- a/open_issues/git-core-2.mdwn +++ b/open_issues/git-core-2.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2008, 2009, 2010, 2011 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2008, 2009, 2010, 2011, 2013 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -16,9 +16,7 @@ License|/fdl]]."]]"""]] [[!toc]] -# Log - -December, 2008. +# 2008-12 On the otherwise-idle flubber: @@ -57,11 +55,15 @@ Fixing this situation is easy enough: # On branch master nothing to commit (working directory clean) -Still seen on 2010-03-16. ---- +## 2010-03-16 + +Still seen. + -A very similar issue, seen on 2010-11-17. The working tree had a lot of +# 2010-11-17 + +A very similar issue. The working tree had a lot of differences to HEAD. tschwinge@grubber:~/tmp/gcc/hurd $ git reset --hard HEAD @@ -92,9 +94,10 @@ differences to HEAD. Checking out files: 100% (1149/1149), done. HEAD is now at fe3e43c Merge commit 'refs/top-bases/hurd/master' into hurd/master ---- -2010-12-22, grubber: +## 2010-12-22 + +grubber: $ git remote update Fetching savannah @@ -106,9 +109,10 @@ differences to HEAD. 
fatal: index-pack failed error: Could not fetch savannah ---- -2011-06-10, coulomb.SCHWINGE, checking out [[binutils]]' master branch, +## 2011-06-10 + +coulomb.SCHWINGE, checking out [[binutils]]' master branch, starting from an empty working directory (after an external `git push`): $ git checkout -f @@ -144,7 +148,7 @@ starting from an empty working directory (after an external `git push`): # Analysis -2011-06-13 +## 2011-06-13 Running `git checkout -f` under GDB: @@ -188,3 +192,25 @@ there are cases where `unlink` apparently returns EINTR, which is not kosher either. Etc. Do we have problems with `SA_RESTART` vs. the atomicity of our syscall-alikes? + + +## IRC, freenode, #hurd, 2013-01-30 + + hm, let's try to clone a huge repository + hm, cloning a whole linux repo, and still no problem :) + weren't most/all the issues at unpack time? + i don't remember + we'll see when it gets there + the longest part is "resolving deltas", for which ext2fs is + clearly the big bottleneck (no I/O, page-cache only, but still) + pinotree: well, slow, but no error + + +### IRC, freenode, #hurd, 2013-01-31 + + fyi, i've tried several checkouts of big repositories, and never + got a single error + youpi: looks like the recent fixes also solved some git issues we + had + i could clone big repositories without any problem + cool :) diff --git a/open_issues/git_duplicated_content.mdwn b/open_issues/git_duplicated_content.mdwn index cbc171a7..f0ffad77 100644 --- a/open_issues/git_duplicated_content.mdwn +++ b/open_issues/git_duplicated_content.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -116,16 +116,13 @@ Some more copying: No further difference. 
---- - $ git-new-workdir git master master $ diff -x .git -ur tar_master/ master/ > master.diff $ rm -rf ar_master* && (cd git/ && git archive master) | (mkdir ar_master && cd ar_master/ && tar -x) && diff -x .git -ru tar_master/ ar_master/ > ar_master.diff; ls -l ar_master.diff $ (cd git/ && git archive master) | md5sum ---- -2011-06-13 +# 2011-06-13 -> [[git-core-2]] diff --git a/open_issues/glibc.mdwn b/open_issues/glibc.mdwn index 4111700b..425ce827 100644 --- a/open_issues/glibc.mdwn +++ b/open_issues/glibc.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2007, 2008, 2010, 2011, 2012 Free Software +[[!meta copyright="Copyright © 2007, 2008, 2010, 2011, 2012, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -385,6 +385,51 @@ Last reviewed up to the [[Git mirror's d3bd58cf0a027016544949ffd27300ac5fb01bb8 pinotree: undefined expected, given the output above + * `getsockopt`, `setsockopt` + + IRC, freenode, #hurd, 2013-02-14 + + Hi, {get,set}sockopt is not supported on Hurd. This shows + e.g. in the gnulib's test-{poll,select} code. + Reading + http://hea-www.harvard.edu/~fine/Tech/addrinuse.html there might + be reasons _not_ to implement them, comments? + uh? they are supported on hurd + not SO_REUSEPORT for setsockopt() + that isn't the same as claiming "get/setsockopt is not + supported on hurd" + most probably that option is not implemented by the + socket family you are using + OK, some options like SO_REUSEPORT then, more info in + the link. + note also SO_REUSEPORT is not posix + and i don't see SO_REUSEPORT mentioned in the page you + linked + No, but SO_REUSEADDR + + IRC, freenode, #hurd, 2013-02-23 + + as an example, the poll test code from gnulib fails due + to that problem (and I've told you before) + gnu_srs: what's the actual failure? + can you provide a minimal test case showing the issue? 
+ pinotree: A smaller test program: + http://paste.debian.net/237495/ + gnu_srs: setting SO_REUSEADDR before binding the socket + works... + and it seems it was a bug in the gnulib tests, see + http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commit;h=6ed6dffbe79bcf95e2ed5593eee94ab32fcde3f4 + pinotree: You are right, still the code I pasted pass on + Linux, not on Hurd. + so? + the code is wrong + you cannot change what bind does after you have called + it + * pinotree → out + so linux is buggy? + no, linux is more permissive + (at least, on this matter) + For specific packages: * [[octave]] @@ -669,6 +714,198 @@ Last reviewed up to the [[Git mirror's d3bd58cf0a027016544949ffd27300ac5fb01bb8 [[!message-id "201211172058.21035.toscano.pino@tiscali.it"]]. + In context of [[libpthread]]. + + IRC, freenode, #hurd, 2013-01-21 + + ah, found something interesting + tschwinge: there seems to be a race on our file descriptors + the content written by one thread seems to be retained + somewhere and another thread writing data to the file descriptor will + resend what the first already did + it could be a FILE race instead of fd one though + yes, it's not at the fd level, it's above + so good news, seems like the low level message/signalling code + isn't faulty here + all right, simple explanation: our IO_lockfile functions are + no-ops + braunr: i found that out days ago, and samuel said they were + okay + well, they're not no-ops in libpthreads + so i suppose they replace the default libc stubs, yes + so the issue happens in cthreads-using apps? + no + we don't have cthreads apps any more + and aiui, libpthreads provides cthreads compatibility calls to + libc, so everything is actually using pthreads + more buffer management debugging needed :/ + hm, so how can it be that there's a multithread app with no + libpthread-provided file locking? + ? 
+ file locking looks fine + hm, the recursive locking might be wrong though + ./sysdeps/mach/hurd/bits/libc-lock.h:#define + __libc_lock_owner_self() ((void *) __hurd_threadvar_location (0)) + nop, looks fine too + indeed, without stream buffering, the problem seems to go away + pinotree: it really looks like the stub IO_flockfile is used + i'll try to make sure it's the root of the problem + braunr: you earlier said that there's some race with + different threads, no? + yes + either a race or an error in the iostream management code + but i highly doubt the latter + if the stub locks are used, then libpthread is not + loaded... so which different threads are running? + that's the thing + the libpthread versions should be used + so the application is linked to pthread? + yes + i see, that was the detail i was missing earlier + the common code looks fine, but i can see wrong values even + there + e.g. when vfprintf calls write, the buffer is already wrong + i've made similar tests on linux sid, and it behaves as it + should + hm + i even used load to "slow down" my test program so that + preemption is much more likely to happen + note we have slightly different behaviour in glibc's libio, + ie different memory allocation ways (mmap on linux, malloc for us) + the problem gets systematic on the hurd while it never occurs + on linux + that shouldn't matter either + ok + but i'll make sure it doesn't anyway + this mach_print system call is proving very handy :) + and also, with load, unbuffered output is always correct too + braunr: you could try the following hack + http://paste.debian.net/227106/ + what does it do ? + (yes, ugly as f**k) + does it force libio to use mmap ? + or rather, enable ? + provides a EXEC_PAGESIZE define in libio, so it makes it use + mmap (like on linux) instead of malloc + + `t/pagesize`. 
+ + yes, the stub is used instead of the libpthreads code + tschwinge: ^ + i'll override those to check that it fixes the problem + hm, not that easy actually + copy their files from libpthreads to sysdeps/mach/hurd + hm right, in libpthread they are not that split as in glibc + let's check symbol declaration to understand why the stubs + aren't overriden by ld + _IO_vfprintf correctly calls @plt versions + i don't know enough about dynamic linking to see what causes + the problem :/ + youpi: it seems our stdio functions use the stub IO_flockfile + functions + really? I thought we were going through cthreads-compat.c + yes really + i don't know why, but that's the origin of the "duplicated" + messages issue + messages aren't duplicated, there is a race that makes on + thread reuse the content of the stream buffer + one* + k, quite bad + at least we know where the problem comes from now + youpi: what would be the most likely reason why weak symbols + in libc wouldn't be overriden by global ones from libpthread ? + being loaded after libc + i tried preloading it + i'll compare with what is done on wheezy + you have the local-dl-dynamic-weak.diff patch, right? + (on squeeze, the _IO_flockfile function in libc seems to do + real work unlike our noop stub) + it's the debian package, i have all patches provided there + indeed, on linux, libc provides valid IO_flock functions + ./sysdeps/pthread/flockfile.c:strong_alias (__flockfile, + _IO_flockfile) + that's how ntpl exports it + nptl* + imho we should restructure libpthread to be more close to + nptl + i wish i knew what it involves + file structing for sources and tests, for example + well yes obviously :) + i've just found a patch that does exactly that for linuxthreads + that = fix the file locking? 
+ in addition to linuxthreads/lockfile.c (which we also + equivalently provide), there is + linuxthreads/sysdeps/pthread/flockfile.c + no, restructiring + restructuring* + i still have only a very limited idea of how the glibc sources + are organized + the latter is used as source file when compiling flockfile.c + in stdio-common + shouldn't we provide one too ? + that would mean it would be compiled as part of libc proper, + not libpthread + yes + that's what both linuxthreads and nptl seem to do + and the code is strictly the same, i.e. a call to the internal + _IO_lock_xxx functions + I guess that's for the hot-dlopen case + you need to have locks properly taken at dlopen time + youpi: do you mean adding an flockfile.c file to our sysdeps + will only solve the problem by side effect ? + and that the real problem is that the libpthread versions + aren't used ? + yes + ok + youpi: could it simply be a versioning issue ? + could be + it seems so + i've rebuilt with the flockfile functions versioned to 2.2.6 + (same as in libc) and the cthreads_compat functions are now used + and the problem doesn't occur any more with my test code + :) + could you post a patch? + i need a few info before + it'd be good to check which such functions are hooked + i suppose the version for functions declared in libpthreads + shouldn't change, right ? + yes + ok + they didn't have a vresion before + shall i commit directly ? 
+ so it should be fine + well, they did + 2.12 + yes, but please tell me when it's done + sure + so I can commit that to debian's eglibc + I mean, before we integrated libpthread build into glibc + so they never had any version before 2.12 + ok + basically we need to check the symbols which are both in + libpthread and referenced in libc + to make sure they have the same version in the reference + ok + only weak references need to be checked, others would have + produced a runtime error + youpi: done + arg, the version i mention in the comment is wrong + i suppose people understand nonetheless + probably, yes + ah, i can now appreciate the headache this bug hunting gave me + these last days :) + + IRC, freenode, #hurd, 2013-01-22 + + braunr: commited to debian glibc + btw, it's normal that the program doesn't terminate, right? + (i.e. it's the original bug you were chasing) + youpi: about your earlier question (yesterday) about my test + code, it's expected to block, which is the problem i was initially + working on + ok, so all god + +o + * `t/pagesize` IRC, freenode, #hurd, 2012-11-16 @@ -677,6 +914,37 @@ Last reviewed up to the [[Git mirror's d3bd58cf0a027016544949ffd27300ac5fb01bb8 the fact that EXEC_PAGESIZE is not defined on hurd, libio/libioP.h switches the allocation modes from mmap to malloc + IRC, freenode, #hurd, 2013-01-21 + + why is it a hack ? + because most probably glibc shouldn't rely on EXEC_PAGESIZE + like that + ah + there's a mail from roland, replying to thomas about this + issue, that this use of EXEC_PAGESIZE to enable mmap or not is just + wrong + ok + (the above is + http://thread.gmane.org/87mxd9hl2n.fsf@kepler.schwinge.homeip.net ) + thanks + (just added the reference to that in the wiki) + pinotree: btw, what's wrong with using malloc instead of mmap + in libio ? + braunr: i'm still not totally sure, most probably it should + be slightly slower currently + locking contention ? 
+ pinotree: + http://www.sourceware.org/ml/libc-alpha/2006-11/msg00061.html + pinotree: it looks to me there is now no valid reason not to + use malloc + the best argument for mmap is that libio requires zeroed + memory, but as the OP says, zeroing a page is usually more expensive + than a small calloc (even on kernel that keep a list of zeroed pages + for quick allocations, frequent mmaps() often make this list empty) + braunr: mmap allocations in libio are rounded to the page + size + well they have to + * `LD_DEBUG` IRC, freenode, #hurd, 2012-11-22 diff --git a/open_issues/gnumach_page_cache_policy.mdwn b/open_issues/gnumach_page_cache_policy.mdwn index 22b05953..5e93887e 100644 --- a/open_issues/gnumach_page_cache_policy.mdwn +++ b/open_issues/gnumach_page_cache_policy.mdwn @@ -771,12 +771,12 @@ License|/fdl]]."]]"""]] And set precedence. -## IRC, freenode, #hurd, 2012-07-26 +# IRC, freenode, #hurd, 2012-07-26 hm i killed darnassus, probably the page cache patch again -## IRC, freenode, #hurd, 2012-09-19 +# IRC, freenode, #hurd, 2012-09-19 I was wondering about the page cache information structure I guess the idea is that if we need to add a field, we'll just @@ -786,3 +786,28 @@ License|/fdl]]."]]"""]] youpi: have a look at the rbraun/page_cache gnumach branch that's what I was referring to ok + + +# IRC, freenode, #hurd, 2013-01-15 + + hm, no wonder the page cache patch reduced performance so much + the page cache when building even moderately large packages is + about a few dozens MiB (around 50) + the patch enlarged it to several hundreds :/ + braunr: so the big page cache essentially killed memory locality? 
+ ArneBab: no, it made ext2fs crazy (disk translators - used as + pagers - scan their cached pages every 5 seconds to flush the dirty ones) + you can imagine what happens if scanning and flushing a lot of + pages takes more than 5 seconds + ouch… that’s heavy, yes + I already see it pile up in my mindb + and it's completely linear, using a lock to protect the whole list + darnassus is currently showing such a behaviour, because tschwinge + is linking huge files (one object with lots of pages) + 446 MB of swap used, between 200 and 1850 MiB of RAM used, and i + can still use vim and build stuff without being too disturbed + the system does feel laggy, but there has been great stability + improvements + have* + and even if laggy, it doesn't feel much more than the usual lag of + a network (ssh) based session diff --git a/open_issues/gnumach_panic_thread_dispatch.mdwn b/open_issues/gnumach_panic_thread_dispatch.mdwn new file mode 100644 index 00000000..db094f2f --- /dev/null +++ b/open_issues/gnumach_panic_thread_dispatch.mdwn @@ -0,0 +1,20 @@ +[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_gnumach]] + + +# IRC, freenode, #hurd, 2013-02-10 + + panic: thread_dispatch: thread a958c950 has unexpected state 114 + hum ): + ouch + during a perl build + TH_SWAPPED | TH_HALTED | TH_RUN diff --git a/open_issues/hurd_101.mdwn b/open_issues/hurd_101.mdwn index 6146885d..574a03ec 100644 --- a/open_issues/hurd_101.mdwn +++ b/open_issues/hurd_101.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -12,7 +12,8 @@ License|/fdl]]."]]"""]] Not the first time that something like this is proposed... -IRC, freenode, #hurd, 2011-07-25 + +# IRC, freenode, #hurd, 2011-07-25 [failed GNU/Hurd project] < antrik> gnu_srs1: I wouldn't say he was on track. just one of the many @@ -39,3 +40,23 @@ IRC, freenode, #hurd, 2011-07-25 < Tekk_`> under the right conditions < cluck> antrik: jokes aside some sort of triage system/training ground for newcomers could be helpful + + +# IRC, freenode, #hurd, 2013-01-20 + + so once I have written my first translators, and really understand + that, what kinds of projects would you recommend to an operating + systems/hurd newbie. + I am reading the minix book now as I have it, but I'm waiting on + getting the modern operating systems book by the same author. + I was initially going to start working on minix, but their focus + seems to be on embedded, and I want to work on a system that is more + general purpose, and I like the philosophy of freedom surrounding the + hurd. + I like how the hurd design allows more freedom for users of the + operating system, but I would also like to incorporate ideas from minix + on the hurd. 
mainly, rebootless updates of servers/translators. + then you should study how translators work + how ipc works + and understand exactly what state is stored where + ok diff --git a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn index 80fe36f8..670c82cb 100644 --- a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn +++ b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -121,3 +121,15 @@ License|/fdl]]."]]"""]] libmachuser and libhurduser? should they be linked to explicitly, or assume libc brings them? pinotree: libc should bring them + + +# IRC, freenode, #hurd, 2013-02-25 + + we should also discuss the mach_debug interface some day + it's not exported by libc, but the kernel provides it + slabinfo depends on it, and i'd like to include it in the hurd + but i don't know what kind of security problems giving access to + mach_debug RPCs would create + (imo, the mach_debug interface should be adjusted to be used with + privileged ports only) + (well, maybe not all mach_debug RPCs) diff --git a/open_issues/libpthread.mdwn b/open_issues/libpthread.mdwn index 05aab85f..f0c0db58 100644 --- a/open_issues/libpthread.mdwn +++ b/open_issues/libpthread.mdwn @@ -1170,6 +1170,12 @@ There is a [[!FF_project 275]][[!tag bounty]] on this task. haven't tested +### IRC, freenode, #hurd, 2013-01-26 + + ah great, one of the recent fixes (probably select-eintr or + setitimer) fixed exim4 :) + + ## IRC, freenode, #hurd, 2012-09-23 tschwinge: i committed the last hurd pthread change, @@ -1270,6 +1276,17 @@ There is a [[!FF_project 275]][[!tag bounty]] on this task. 
that's it, yes +### IRC, freenode, #hurd, 2013-03-01 + + braunr: btw, "unable to adjust libports thread priority: (ipc/send) + invalid destination port" is actually not a sign of fatality + bach recovered from it + youpi: well, it never was a sign of fatality + but it means that, for some reason, a process looses a right for a + very obscure reason :/ + weird sentence, agreed :p + + ## IRC, freenode, #hurd, 2012-12-05 tschwinge: i'm currently working on a few easy bugs and i have @@ -1459,3 +1476,332 @@ Same issue as [[term_blocking]] perhaps? we have a similar problem with the hurd-specific cancellation code, it's in my todo list with io_select ah, no, the condvar is not global + + +## IRC, freenode, #hurd, 2013-01-14 + + *sigh* thread cancellable is totally broken :( + cancellation* + it looks like playing with thread cancellability can make some + functions completely restart + (e.g. one call to printf to write twice its output) + +[[git_duplicated_content]], [[git-core-2]]. + + * braunr is cooking a patch to fix pthread cancellation in + pthread_cond_{,timed}wait, smells good + youpi: ever heard of something that would make libc functions + "restart" ? + you mean as a feature, or as a bug ? + when changing the pthread cancellation state of a thread, i + sometimes see printf print its output twice + or perhaps after a signal dispatch? + i'll post my test code + that could be a duplicate write + due to restarting after signal + http://www.sceen.net/~rbraun/pthreads_test_cancel.c + #include <pthread.h> + #include <sched.h> + #include <stdarg.h> + #include <stdio.h> + #include <stdlib.h> + + static pthread_cond_t cond = PTHREAD_COND_INITIALIZER; + static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; + static int predicate; + static int ready; + static int cancelled; + + static void + uncancellable_printf(const char *format, ...)
+ { + int oldstate; + va_list ap; + + va_start(ap, format); + pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate); + vprintf(format, ap); + pthread_setcancelstate(oldstate, &oldstate); + va_end(ap); + } + + static void * + run(void *arg) + { + uncancellable_printf("thread: setting ready\n"); + ready = 1; + uncancellable_printf("thread: spin until cancellation is sent\n"); + + while (!cancelled) + sched_yield(); + + uncancellable_printf("thread: locking mutex\n"); + pthread_mutex_lock(&mutex); + uncancellable_printf("thread: waiting for predicate\n"); + + while (!predicate) + pthread_cond_wait(&cond, &mutex); + + uncancellable_printf("thread: unlocking mutex\n"); + pthread_mutex_unlock(&mutex); + uncancellable_printf("thread: exit\n"); + return NULL; + } + + int + main(int argc, char *argv[]) + { + pthread_t thread; + + uncancellable_printf("main: create thread\n"); + pthread_create(&thread, NULL, run, NULL); + uncancellable_printf("main: spin until thread is ready\n"); + + while (!ready) + sched_yield(); + + uncancellable_printf("main: sending cancellation\n"); + pthread_cancel(thread); + uncancellable_printf("main: setting cancelled\n"); + cancelled = 1; + uncancellable_printf("main: joining thread\n"); + pthread_join(thread, NULL); + uncancellable_printf("main: exit\n"); + return EXIT_SUCCESS; + } + youpi: i'd see two calls to write, the second because of a signal, + as normal, as long as the second call resumes, but not restarts after + finishing :/ + or restarts because nothing was done (or everything was entirely + rolled back) + well, with an RPC you may not be sure whether it's finished or not + ah + we don't really have rollback + i don't really see the difference with a syscall there + the kernel controls the interruption in the case of the syscall + except that write is normally atomic if i'm right + it can't happen on the way back to userland + but that could be exactly the same with RPCs + while perhaps it can happen on the mach_msg back to 
userland + back to userland ok, back to the application, no + anyway, that's a side issue + i'm fixing a few bugs in libpthread + and noticed that + (i should soon have patches to fix - at least partially - thread + cancellation and timed blocking) + i was just wondering how cancellation how handled in glibc wrt + libpthread + I don't know + (because the non standard hurd cancellation has nothing to do with + pthread cancellation)à + ok + s/how h/is h/ + + +### IRC, freenode, #hurd, 2013-01-15 + + braunr: Re »one call to printf to write twice its output«: + sounds familiar: + http://www.gnu.org/software/hurd/open_issues/git_duplicated_content.html + and http://www.gnu.org/software/hurd/open_issues/git-core-2.html + tschwinge: what i find strange with the duplicated operations i've + seen is that i merely use pthreads and printf, nothing else + no setitimer, no alarm, no select + so i wonder how cancellation/syscall restart is actually handled + in our glibc + but i agree with you on the analysis + + +### IRC, freenode, #hurd, 2013-01-16 + + neal: do you (by any chance) remember if there could possibly be + spurious wakeups in your libpthread implementation ? + braunr: There probably are. + but I don't recall + + i think the duplicated content issue is due to the libmach/glibc + mach_msg wrapper + which restarts a message send if interrupted + Hrm, depending on which point it has been interrupted you mean? + yes + not sure yet and i could be wrong + but i suspect that if interrupted after send and during receive, + the restart might be wrongfully done + i'm currently reworking the timed* pthreads functions, doing the + same kind of changes i did last summer when working on select (since + implement the timeout at the server side requires pthread_cond_timedwait) + and i limit the message queue size of the port used to wake up + threads to 1 + and it seems i have the same kind of problems, i.e. 
blocking + because of a second, unexpected send + i'll try using __mach_msg_trap directly and see how it goes + Hrm, mach/msg.c:__mach_msg does look correct to me, but yeah, + won't hurd to confirm this by looking what direct usage of + __mach_msg_trap is doing. + tschwinge: can i ask if you still have a cthreads based hurd + around ? + tschwinge: and if so, to send me libthreads.so.0.3 ... :) + braunr: darnassus:~tschwinge/libthreads.so.0.3 + call 19c0 + so, cthreads were also using the glibc wrapper + and i never had a single MACH_SEND_INTERRUPTED + or a busy queue :/ + (IOW, no duplicated messages, and the wrapper indeed looks + correct, so it's something else) + (Assuming Mach is doing the correct thing re interruptions, of + course...) + mach doesn't implement it + it's explicitely meant to be done in userspace + mach merely reports the error + i checked the osfmach code of libmach, it's almost exactly the + same as ours + Yeah, I meant Mach returns the interurption code but anyway + completed the RPC. + ok + i don't expect mach wouldn't do it right + the only difference in osf libmach is that, when retrying, + MACH_SEND_INTERRUPT|MACH_RCV_INTERRUPT are both masked (for both the + send/send+receive and receive cases) + Hrm. + but they say it's for performance, i.e. mach won't take the slow + path because of unexpected bits in the options + we probably should do the same anyway + + +### IRC, freenode, #hurd, 2013-01-17 + + tschwinge: i think our duplicated RPCs come from + hurd/intr-msg.c:148 (err == MACH_SEND_INTERRUPTED but !(option & + MACH_SEND_MSG)) + a thread is interrupted by a signal meant for a different thread + hum no, still not that .. + or maybe .. :) + Hrm. Why would it matter for for the current thread for which + reason (different thread) mach_msg_trap returns *_INTERRUPTED? 
+ mach_msg wouldn't return it, as explained in the comment + the signal thread would, to indicate the send was completed but + the receive must be retried + however, when retrying, the original user_options are used again, + which contain MACH_SEND_MSG + i'll test with a modified version that masks it + tschwinge: hm no, doesn't fix anything :( + + +### IRC, freenode, #hurd, 2013-01-18 + + the duplicated rpc calls is one i find very very frustrating :/ + you mean the dup writes we've seen lately? + yes + k + + +### IRC, freenode, #hurd, 2013-01-19 + + all right, i think the duplicated message sends are due to thread + creation + the duplicated message seems to be sent by the newly created + thread + arg no, misread + + +### IRC, freenode, #hurd, 2013-01-20 + + tschwinge: youpi: about the diplucated messages issue, it seems to + be caused by two threads (with pthreads) doing an rpc concurrently + duplicated* + + +### IRC, freenode, #hurd, 2013-01-21 + + ah, found something interesting + tschwinge: there seems to be a race on our file descriptors + the content written by one thread seems to be retained somewhere + and another thread writing data to the file descriptor will resend what + the first already did + it could be a FILE race instead of fd one though + yes, it's not at the fd level, it's above + so good news, seems like the low level message/signalling code + isn't faulty here + all right, simple explanation: our IO_lockfile functions are + no-ops + braunr: i found that out days ago, and samuel said they were + okay + +[[glibc]], `flockfile`/`ftrylockfile`/`funlockfile`. + + +## IRC, freenode, #hurd, 2013-01-15 + + hmm, looks like subhurds have been broken by the pthreads patch :/ + arg, we really do have broken subhurds :(( + time for an immersion in the early hurd bootstrapping stuff + Hrm. Narrowed down to cthreads -> pthread you say. 
+ i think so + but i think the problem is only exposed + it was already present before + even for the main hurd, i sometimes have systems blocking on exec + there must be a race there that showed far less frequently with + cthreads + youpi: we broke subhurds :/ + ? + i can't start one + exec seems to die and prevent the root file system from + progressing + there must be a race, exposed by the switch to pthreads + arg, looks like exec doesn't even reach main :( + now, i'm wondering if it could be the tls support that stops exec + although i wonder why exec would start correctly on a main hurd, + and not on a subhurd :( + i even wonder how much progress ld.so.1 is able to make, and don't + have much idea on how to debug that + + +### IRC, freenode, #hurd, 2013-01-22 + + hm, subhurds seem to be broken because of select + damn select ! + hm i see, we can't boot a subhurd that still uses libthreads from + a main hurd that doesn't + the linker can't find it and doesn't start exec + pinotree: do you understand what the fmh function does in + sysdeps/mach/hurd/dl-sysdep.c ? + i think we broke subhurds by fixing vm_map with size 0 + braunr: no idea, but i remember thomas talking about this code + +[[vm_map_kernel_bug]] + + it checks for KERN_INVALID_ADDRESS and KERN_NO_SPACE + and calls assert_perror(err); to make sure it's one of them + but now, KERN_INVALID_ARGUMENT can be returned + ok i understand what it does + and youpi has changed the code, so he does too + (now i'm wondering why he didn't think of it when we fixed vm_map + size with 0 but his head must already be filled with other things so ..) 
+ anyway, once this is dealt with, we get subhurds back :) + yes, with a slight change, my subhurd starts again \o/ + youpi: i found the bug that prevents subhurds from booting + it's caused by our fixing of vm_map with size 0 + when ld.so.1 starts exec, the code in + sysdeps/mach/hurd/dl-sysdep.c fails because it doesn't expect the new + error code we introduced + (the fmh functions) + ah :) + good :) + adding KERN_INVALID_ARGUMENT to the list should do the job, but if + i understand the code correctly, checking if fmhs isn't 0 before calling + vm_map should do the work too + s/do the work/work/ + i'm not sure which is the preferred way + otherwise I believe fmh could be just fixed to avoid calling vm_map + in the !fmhs case + yes that's what i currently do + at the start of the loop, just after computing it + seems to work so far + + +## IRC, freenode, #hurd, 2013-01-22 + + i have almost completed fixing both cancellation and timeout + handling, but there are still a few bugs remaining + fyi, the related discussion was + https://lists.gnu.org/archive/html/bug-hurd/2012-08/msg00057.html diff --git a/open_issues/mach_tasks_memory_usage.mdwn b/open_issues/mach_tasks_memory_usage.mdwn index 9abb7639..7a7a77ce 100644 --- a/open_issues/mach_tasks_memory_usage.mdwn +++ b/open_issues/mach_tasks_memory_usage.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -8,9 +8,10 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]] -[[!tag open_issue_documentation]] +[[!tag open_issue_documentation open_issue_gnumach]] -IRC, freenode, #hurd, 2011-01-06 + +# IRC, freenode, #hurd, 2011-01-06 hm, odd... 
vmstat tells me that ~500 MiB of RAM are in use; but the sum of all RSS is <300 MiB... what's the rest? @@ -100,7 +101,7 @@ IRC, freenode, #hurd, 2011-01-06 libraries -IRC, freenode, #hurd, 2011-07-24 +# IRC, freenode, #hurd, 2011-07-24 < braunr> the panic is probably due to memory shortage < braunr> so as antrik suggested, use more swap @@ -145,3 +146,30 @@ IRC, freenode, #hurd, 2011-07-24 looks like it is on both seqnos_memory_object_data_initialize and seqnos_memory_object_data_write < braunr> antrik: so i guess reserved memory is accounted for + + +# IRC, freenode, #hurd, 2013-01-12 + + darnassus linking clang: 600 MiB swap in use and 22 MiB RAM + free, of 2 GiB. But ps shows a RSS of just 100 MiB, huh? + Getting "better": near the end of the link, nearly 1 GiB swap + in use, and 200 KiB (!) RAM free. + can hurd have more than 1GB of ram ? + And then it completed; 75 MiB swap in use, and 1.2 GiB RAM + free. + tschwinge: unless i'm mistaken, mach uses the legacy "swapping" + bsd mechanism + tschwinge: i.e. when it swaps a process, it swaps all of it + tschwinge: the rest is probably one big anonymous vm object + containing the process space + cached objects aren't currently well accounted + (well, since youpi got my page cache patches in, they are, but + procfs isn't yet modified to report them) + tschwinge: right, i'm currently looking at the machine and it + doesn't add up, i suppoe there are some big files still in the cache + ah, git packed objects :p + and a few llvm .a/.so/executable files too + and since they're probably targets, they're built last, which + explains why they're retained in the cache for a while + +[[microkernel/mach/message/msgh_id]] (why on *that* page?). 
diff --git a/open_issues/mission_statement.mdwn b/open_issues/mission_statement.mdwn index b32d6ba6..a1c8f235 100644 --- a/open_issues/mission_statement.mdwn +++ b/open_issues/mission_statement.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -697,3 +698,11 @@ License|/fdl]]."]]"""]] nowhere_man: it can be used that way too functional programming is getting more and more attention so it's fine if you're a lisp fan really + + +# IRC, freenode, #hurd, 2013-02-04 + + BTW, it's weird that the mission statement linked from + hurd.gnu.org is in weblog/ and written in the first person + yes + very :) diff --git a/open_issues/multithreading.mdwn b/open_issues/multithreading.mdwn index f631a80b..d7804864 100644 --- a/open_issues/multithreading.mdwn +++ b/open_issues/multithreading.mdwn @@ -266,6 +266,94 @@ Tom Van Cutsem, 2009. async by nature, will create messages floods anyway +### IRC, freenode, #hurd, 2013-02-23 + + hmm let's try something + iirc, we cannot limit the max number of threads in libports + but did someone try limiting the number of threads used by + libpager ? + (the only source of system stability problems i currently have are + the unthrottled writeback requests) + braunr: perhaps we can limit the amount of requests batched by the + ext2fs sync? 
+ youpi: that's another approach, yes + (I'm not sure to understand what threads libpager create) + youpi: one for each writeback request + ew + but it makes its own call to + ports_manage_port_operations_multithread + i'll write a new ports_manage_port_operations_multithread_n + function that takes a mx threads parameter + and see if it helps + i thought replacing spin locks with mutexes would help, but it's + not enough, the true problem is simply far too much contention + youpi: i still think we should increase the page dirty timeout to + 30 seconds + wouldn't that actually increase the amount of request done in one + go? + it would + but other systems (including linux) do that + but they group requests + what linux does is scan pages every 5 seconds, and writeback those + who have been dirty for more than 30 secs + hum yes but that's just a performance issue + i mean, a separate one + a great source of fs performance degradation is due to this + regular scan happenning at the same time regular I/O calls are made + e.G. 
aptitude update + so, as a first step, until the sync scan is truley optimized, we + could increase that interval + I'm afraid of the resulting stability regression + having 6 times as much writebacks to do + i see + my current patch seems to work fine for now + i'll stress it some more + (it limits the number of paging threads to 10 currently) + but iirc, you fixed a deadlock with a debian patch there + i think the case was a pager thread sending a request to the + kernel, and waiting for the kernel to call another RPC that would unblock + the pager thread + ah yes it was merged upstream + which means a thread calling memory_object_lock_request with sync + == 1 must wait for a memory_object_lock_completed + so it can deadlock, whatever the number of threads + i'll try creating two separate pools with a limited number of + threads then + we probably have the same deadlock issue in + pager_change_attributes btw + hm no, i can still bring a hurd down easily with a large i/o + request :( + and now it just recovered after 20 seconds without any visible cpu + or i/o usage .. + i'm giving up on this libpager issue + it simply requires a redesign + + +### IRC, freenode, #hurd, 2013-02-28 + + so what causes the stability issues? or is that not really + known yet? 
+ the basic idea is that the kernel handles the page cache + and writebacks aren't correctly throttled + so a huge number of threads (several hundreds, sometimes + thousands) are created + when this pathological state is reached, it's very hard to recover + because of the various sources of (low) I/O in the system + a simple line sent to syslog increases the load average + the solution requires reworking the libpager library, and probably + the libdiskfs one too, perhaps others, certainly also the pagers + maybe the kernel too, i'm not sure + i'd say so because it manages a big part of the paging policy + + +### IRC, freenode, #hurd, 2013-03-02 + + i think i have a simple-enough solution for the writeback + instability + +[[hurd/libpager]]. + + ## Alternative approaches: * @@ -273,7 +361,7 @@ Tom Van Cutsem, 2009. * Continuation-passing style * [[microkernel/Mach]] internally [[uses - continuations|microkernel/mach/continuation]], too. + continuations|microkernel/mach/gnumach/continuation]], too. * [[Erlang-style_parallelism]] diff --git a/open_issues/nice_vs_mach_thread_priorities.mdwn b/open_issues/nice_vs_mach_thread_priorities.mdwn index 76788a53..e27d3018 100644 --- a/open_issues/nice_vs_mach_thread_priorities.mdwn +++ b/open_issues/nice_vs_mach_thread_priorities.mdwn @@ -373,3 +373,17 @@ here. braunr: can't remember right now, either that or to fix a ftbfs in debian iirc it's coreutils which wants proper nice levels + + +# IRC, OFTC, #debian-hurd, 2013-03-04 + + Is it not possible to set the priority of a process to 1 ? 
+ these macros: + #define MACH_PRIORITY_TO_NICE(prio) (2 * ((prio) - 12)) + #define NICE_TO_MACH_PRIORITY(nice) (12 + ((nice) / 2)) + are used in the setpriority() implementation of Hurd + so setting a process' priority to 1 is just like setting it to 0 + Steap: that has already been discussed to drop the *2 + the issue is mach not supporting enough sched levels + can be fixed, of course + just nobody did yet diff --git a/open_issues/ogi.mdwn b/open_issues/ogi.mdwn index e4372dc0..c58d2ee1 100644 --- a/open_issues/ogi.mdwn +++ b/open_issues/ogi.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -23,3 +23,32 @@ interesting. checking copyright situation, also for thesis / w.r.t. university project + + IRC, freenode, #hurd, 2013-02-15: + + ogi: The question was rather (IIRC) whether your + university has the copyright of this project, given it was done + on their time. + tschwinge: no problems with my university + + +# IRC, freenode, #hurd, 2013-02-15 + + braunr: i want to update my ext3fs server to ext4 actually + you have an ext3 server ? + braunr: this was my M.Sc. thesis and the 2G patch was a side effect + braunr: but it easily crashes under stress, so not usable + it does ? + braunr: it's not available for download ATM + are you sure it's not a thread storm issue caused by the + unthrottled mach writebacks ? + braunr: i don't know, haven't looked at it since 2004 + oh :) + ok + i have all ext3fs stuff archived, just haven't put it on + http://fire.tower.3.bg/ yet + ogi: If the copyright situation is clear, we can put it into + upstream Git repositories, no matter how dirty it is. + "dirty" in the sense of that it needs cleanup, has bugs, etc. 
+ so at some point i want to audit libdiskfs and then continue with + ext4fs: https://savannah.gnu.org/patch/?1839 diff --git a/open_issues/packaging_libpthread.mdwn b/open_issues/packaging_libpthread.mdwn index 18f124b4..171dc7a0 100644 --- a/open_issues/packaging_libpthread.mdwn +++ b/open_issues/packaging_libpthread.mdwn @@ -155,6 +155,30 @@ cherry-picked. upstream +## IRC, OFTC, #debian-hurd, 2013-02-08 + + I also have it on my (never-ending) agenda to add libpthread to + the tschwinge/Roger_Whittaker branch and/or propose it be added upstream + (as a Git submodule?). + imho a git submodule could be a solution, if glibc people would + accept it + if so, libpthread.git would need proper glibc/x.y branches to + follow glibc + Yep. + I though that would be the least invasive approach for glibc + upstream -- and quite convenient for us, too. + after all, git submodules don't track branches, but point to + specific commits, no? + Correct. + So we can do locally/in Debian whatever we want, and every once + in a while update the upstream glibc commit ID for libpthread. + so we could update the git submodule references in glibc when + we've tested enough libpthread changes + Just like when committing patches upstream, just without + pestering them with all the patches/commits. + Yep. 
+ + # IRC, freenode, #hurd, 2012-11-16 *** $(common-objpfx)resolv/gai_suspend.o: uses diff --git a/open_issues/performance/io_system/read-ahead.mdwn b/open_issues/performance/io_system/read-ahead.mdwn index 706e1632..be582e8a 100644 --- a/open_issues/performance/io_system/read-ahead.mdwn +++ b/open_issues/performance/io_system/read-ahead.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -2556,3 +2557,429 @@ License|/fdl]]."]]"""]] you're asking if you can include the large store patch in your work, and by extension, in the main branch i would say yes, but this must be discussed with others + + +## IRC, freenode, #hurd, 2013-02-18 + + mcsim: so, currently reviewing gnumach + braunr: hello + mcsim: the review branch, right ? + braunr: yes + braunr: What do you start with? + memory refreshing + i see you added the advice twice, to vm_object and vm_map_entry + iirc, we agreed to only add it to map entries + am i wrong ? + let me see + the real question being: what do you use the object advice for ? + >iirc, we agreed to only add it to map entries + braunr: TBH, do not remember that. At some point we came to + conclusion that there should be only one advice. But I'm not sure if it + was final point. + maybe it wasn't, yes + that's why i've just reformulated the question + if (map_entry && (map_entry->advice != VM_ADVICE_DEFAULT)) + advice = map_entry->advice; + else + advice = object->advice; + ok + It just participates in determining actual advice + ok that's not a bad thing + let's keep it + please document VM_ADVICE_KEEP + and rephrase "How to handle page faults" in vm_object.h to + something like 'How to tune page fault handling" + mcsim: what's the point of VM_ADVICE_KEEP btw ? 
+ braunr: Probably it is better to remove it? + well if it doesn't do anything, probably + braunr: advising was part of mo_set_attributes before + no it is redudant + i see + so yes, remove it + s/no/now + (don't waste time on a gcs-like changelog format for now) + i also suggest creating _vX branches + so we can compare the changes between each of your review branches + hm, minor coding style issues like switch(...) instead of switch + (...) + why does syscall_vm_advise return MACH_SEND_INTERRUPTED if the + target map is NULL ? + is it modelled after an existing behaviour ? + ah, it's the syscall version + braunr: every syscall does so + and the error is supposed to be used by user stubs to switch to + the rpc version + ok + hm + you've replaced obsolete port_set_select and port_set_backup calls + with your own + don't do that + instead, add your calls to the new gnumach interface + mcsim: out of curiosity, have you actually tried the syscall + version ? + braunr: Isn't it called by default? + i don't think so, no + than no + ok + you could name vm_get_advice_info vm_advice_info + regarding obsolete calls, did you say that only in regard of + port_set_* or all other calls too? + all of the + m + i missed one, yes + the idea is: don't change the existing interface + >you could name vm_get_advice_info vm_advice_info + could or should? i.e. rename? + i'd say should, to remain consistent with the existing similar + calls + ok + can you explain KERN_NO_DATA a bit more ? + i suppose it's what servers should answer for neighbour pages that + don't exist in the backend, right ? + kernel can ask server for some data to read them beforehand, but + server can be in situation when it does not know what data should be + prefetched + yes + ok + it is used by ext2 server + with large store patch + so its purpose is to allow the kernel to free the preallocated + pages that won't be used + do i get it right ? + no. 
+ ext2 server has a buffer for pages and when kernel asks to read + pages ahead it specifies region of that buffer + ah ok + but consecutive pages in buffer does not correspond to consecutive + pages on disk + so, the kernel can only prefetch pages that were already read by + the server ? + no, it can ask a server to prefetch pages that were not read by + server + hum + ok + but in case with buffer, if buffer page is empty, server does not + know what to prefetch + i'm not sure i'm following + well, i'm sure i'm not following + what happens when the kernel requests data from a server, right + after a page fault ? + what does the message afk for ? + kernel is unaware regarding actual size of file where was page + fault because of buffer indirection, right? + i don't know what "buffer" refers to here + this is buffer in memory where ext2 server reads pages + with large store patch ext2 server does not map the whole disk, but + some of its pages + and it maps these pages in special buffer + that means that constructiveness of pages in memory does not mean + that they are consecutive on disk or logically (belong to the same file) + ok so it's a page pool + with unordered pages + but what do you mean when you say "server does not know what to + prefetch" + it normally has everything to determine that + For instance, page fault occurs that leads to reading of + 4k-file. But kernel does not know actual size of file and asks to + prefetch 16K bytes + yes + There is no sense to prefetch something that does not belong to + this file + yes but the server *knows* that + and server answers with KERN_NO_DATA + server should always say something about every page that was asked + then, again, isn't the purpose of KERN_NO_DATA to notify the + kernel it can release the preallocated pages meant for the non existing + data ? 
+ (non existing or more generally non prefetchable) + yes + then + why did you answer no to + 15:46 < braunr> so its purpose is to allow the kernel to free the + preallocated pages that won't be used + is there something missing ? + (well obviously, notify the kernel it can go on with page fault + handling) + braunr: sorry, misunderstoo/misread + ok + so good, i got this right :) + i wonder if KERN_NO_DATA may be a bit too vague + people might confuse it with ENODATA + Actually, this is transformation of ENODATA + I was looking among POSIX error codes and thought that this is the + most appropriate + i'm not sure it is + first, it's about STREAMS, a commonly unused feature + and second, the code is obsolete + braunr: AFAIR purpose of KERN_NO_DATA is not only free + pages. Without this call something should hang + 15:59 < braunr> (well obviously, notify the kernel it can go on + with page fault handling) + yes + hm + sorry again + i don't see anything better for the error name for now + and it's really minor so let's keep it as it is + actually, ENODATA being obsolete helps here + ok, done for now, work calling + we'll continue later or tomorrow + braunr: ok + other than that, this looks ok on the kernel side for now + the next change is a bit larger so i'd like to take the time to + read it + braunr: ok + regarding moving calls in mach.defs, can I put them elsewhere? + gnumach.defs + you'll probably need to rebase your changes to get it + braunr: I'll rebase this later, when we finish with review + ok + keep the comments in a list then, not to forget + (logging irc is also useful) + + +## IRC, freenode, #hurd, 2013-02-20 + + mcsim: why does VM_ADVICE_DEFAULT have its own entry ? + braunr: this kind of fallback mode + i suppose that even random strategy could even read several pages + at once + yes + but then, why did you name it "default" ? 
+ because it is assigned by default + ah + so you expect pagers to set something else + for all objects they create + yes + ok + why not, but add a comment please + at least until all pagers will support clustered reading + ok + even after that, it's ok + just say it's there to keep the previous behaviour by default + so people don't get the idea of changing it too easily + comment in vm_advice.h? + no, in vm_fault.C + right above the array + why does vm_calculate_clusters return two ranges ? + also, "Function PAGE_IS_NOT_ELIGIBLE is used to determine if", + PAGE_IS_NOT_ELIGIBLE doesn't look like a function + I thought make it possible not only prefetch range, but also free + some memory that is not used already + braunr: ^ + but didn't implement it :/ + don't overengineer it + reduce to what's needed + braunr: ok + braunr: do you think it's worth to implement? + no + braunr: it could be useful for sequential policy + describe what you have in mind a bit more please, i think i don't + have the complete picture + with sequential policy user supposed to read strictly in sequential + order, so pages that user is not supposed to read could be put in unused + list + what pages the user isn't supposed to read ? + if user read pages in increasing order than it is not supposed to + read pages that are right before the page where page fault occured + right ? + do you mean higher ? + that are before + before would be lower then + oh + "right before" + yes :) + why not ? + the initial assumption, that MADV_SEQUENTIAL expects *strict* + sequential access, looks wrong + remember it's just a hint + a user could just acces pages that are closer to one another and + still use MADV_SEQUENTIAL, expecting a speedup because pages are close + well ok, this wouldn't be wise + MADV_SEQUENTIAL should be optimized for true sequential access, + agreed + but i'm not sure i'm following you + but I'm not going to page these pages out. 
Just put in unused + list, and if they will be used later they will be move to active list + your optimization seem to be about freeing pages that were + prefetched and not actually accessed + what's the unused list ? + inactive list + ok + so that they're freed sooner + yes + well, i guess all neighbour pages should first be put in the + inactive list + iirc, pages in the inactive list aren't mapped + this would force another page fault, with a quick resolution, to + tell the vm system the page was actually used, and must become active, + and paged out later than other inactive pages + but i really think it's not worth doing it now + clustered pagins is about improving I/O + page faults without I/O are orders of magnitude faster than I/O + it wouldn't bring much right now + ok, I remove this, but put in TODO + I'm not sure that right list is inactive list, but the list that is + scanned to pageout pages to swap partition. There should be such list + both the active and inactive are + the active one is scanned when the inactive isn't large enough + (the current ratio of active pages is limited to 1/3) + (btw, we could try increasing it to 1/2) + iirc, linux uses 1/2 + your comment about unlock_request isn't obvious, i'll have to + reread again + i mean, the problem isn't obvious + ew, functions with so many indentation levels :/ + i forgot how ugly some parts of the mach vm were + mcsim: basically it's ok, i'll wait for the simplified version for + another pass + simplified? + 22:11 < braunr> reduce to what's needed + ok + and what comment? + your XXX in vm_fault.c + when calling vm_calculate_clusters + is m->unlock_request the same for all cluster or I should + recalculate it for every page? 
+ s/all/whole + that's what i say, i'll have to come back to that later + after i have reviewed the userspace code i think + so i understand the interactions better + braunr: pushed v1 branch + braunr: "Move new calls to gnumach.defs file" and "Implement + putting pages in inactive list with sequential policy" are in my TODO + mcsim: ok + + +## IRC, freenode, #hurd, 2013-02-24 + + mcsim: where does the commit from neal (reworking libpager) come + from ? + (ok the question looks a little weird semantically but i think you + get my point) + braunr: you want me to give you a link to mail with this commit? + why not, yes + http://permalink.gmane.org/gmane.os.hurd.bugs/446 + ok so + http://lists.gnu.org/archive/html/bug-hurd/2012-06/msg00001.html + ok so, we actually have three things to review here + that libpager patch, the ext2fs large store one, and your work + mcsim: i suppose something in your work depends on neal's patch, + right ? + i mean, why did you work on top of it ? + Yes + All user level code + i see it adds some notifications + no + notifacations are for large store + ok + but the rest is for my work + but what does it do that you require ? + braunr: this patch adds support for multipage work. There were just + stubs that returned errors for chunks longer than one page before. + ok + for now, i'll just consider that it's ok, as well as the large + store patch + ok i've skipped all patches up to "Make mach-defpager process + multipage requests in m_o_data_request." since they're obvious + but this one isn't + mcsim: why is the offset member a vm_size_t in struct block ? + (these things matter for large file support on 32-bit systems) + braunr: It should be vm_offset_t, right? 
+ yes + well + it seems so but + im not sure what offset is here + vm_offset is normally the offset inside a vm_object + and if we want large file support, it could become a 64-bit + integer + while vm_size_t is a size inside an address space, so it's either + 32 or 64-bit, depending on the address space size + but here, if offset is an offset inside an address space, + vm_size_t is fine + same question for send_range_parameters + braunr: TBH, I do not differ vm_size_t and vm_offset_t well + they can be easily confused yes + they're both offsets and sizes actually + they're integers + so here I used vm_offset_t because field name is offset + but vm_size_t is an offset/size inside an address space (a + vm_map), while vm_offset_t is an offset/size inside an object + braunr: I didn't know that + it's not clear at all + and it may not have been that clear in mach either + but i think it's best to consider them this way from now on + well, it's not that important anyway since we don't have large + file support, but we should some day :/ + i'm afraid we'll have it as a side effect of the 64-bit port + mcsim: just name them vm_offset_t when they're offsets for + consistency + but seems that I guessed, because I use vm_offset_t variables in + mo_ functions + well ok, but my question was about struct block + where you use vm_size_t + braunr: I consider this like a mistake + ok + moving on + in upload_range, there are two XXX comments + i'm not sure to understand + Second XXX I put because at the moment when I wrote this not all + hurd libraries and servers supported size different from vm_page_size + But then I fixed this and replaced vm_page_size with size in + page_read_file_direct + ok then update the comment accordingly + When I was adding third XXX, I tried to check everything. But I + still had felling that I forgot something. 
+ No it is better to remove second and third XXX, since I didn't find + what I missed + well, that's what i mean by "update" :) + ok + and first XXX just an optimisation. Its idea is that there is no + case when the whole structure is used in one function. + ok + But I was not sure if was worth to do, because if there will appear + some bug in future it could be hard to find it. + I mean that maintainability decreases because of using union + So, I'd rather keep it like it is + how is struct send_range_parameters used ? + it doesn't looked to be something stored long + also, you're allowed to use GNU extensions + It is used to pass parameters from one function to another + which of them? + see + http://gcc.gnu.org/onlinedocs/gcc-4.4.7/gcc/Unnamed-Fields.html#Unnamed-Fields + mcsim: if it's used to pass parameters, it's likely always on the + stack + braunr: I use it when necessary + we really don't care much about a few extra words on the stack + the difference in size would + agree + matter + oops + the difference in size would matter if a lot of those were stored + in memory for long durations + that's not the case, so the size isn't a problem, and you should + remove the comment + ok + mcsim: if i get it right, the libpager rework patch changes some + parameters from byte offset to page frame numbers + braunr: yes + why don't you check errors in send_range ? + braunr: it was absent in original code, but you're right, I should + do this + i'm not sure how to handle any error there, but at least an assert + I found a place where pager just panics + for now it's ok + your work isn't about avoiding panics, but there must be a check, + so if we can debug it and reach that point, we'll know what went wrong + i don't understand the prototype change of default_read :/ + it looks like it doesn't return anything any more + has it become asynchronous ? + It was returning some status before, but now it handles this status + on its own + hum + how ? 
+ how do you deal with errors ? + in old code default_read returned kr and this kr was used to + determine what m_o_ function will be used + now default_read calls m_o_ on its own + ok diff --git a/open_issues/pfinet_timers.mdwn b/open_issues/pfinet_timers.mdwn new file mode 100644 index 00000000..387ad4fe --- /dev/null +++ b/open_issues/pfinet_timers.mdwn @@ -0,0 +1,17 @@ +[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_hurd]] + + +# IRC, freenode, #hurd, 2013-02-11 + + now that there is a pthread_hurd_cond_timedwait_np function + available, we could replace the ulgy timers in pfinet diff --git a/open_issues/pflocal_socket_credentials_for_local_sockets.mdwn b/open_issues/pflocal_socket_credentials_for_local_sockets.mdwn index dfdc213c..d252eb54 100644 --- a/open_issues/pflocal_socket_credentials_for_local_sockets.mdwn +++ b/open_issues/pflocal_socket_credentials_for_local_sockets.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -8,10 +8,11 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]] -IRC, freenode, #hurd, 2011-03-28 - [[!tag open_issue_hurd]] + +# IRC, freenode, #hurd, 2011-03-28 + basically, i'm trying to implement socket credentials for local sockets, and i guessed doing it in pflocal would be the appropriate place what i thought was filling the cmsg data for MSG_CRED at @@ -41,6 +42,25 @@ IRC, freenode, #hurd, 2011-03-28 yes nice thanks, i will try that change first + +# IRC, OFTC, #debian-hurd, 2013-02-20 + + youpi: while debugging #700530, it seems that xorg does not have + working socket credentials on kfreebsd (and hurd too) + julien provided sune with + http://people.debian.org/~jcristau/kbsd-peercred.diff to test, but of + course that won't work for us (even if we would have working socket + credentials with cmsg) + (that patch is not tested yet) + at least, we're aware there's another place in need for working + socket credentials now + k + youpi: (the patch above has been confirmed to work, with + s/SOL_SOCKET/0/ ) + 0 ?! + yeah + + --- See also [[pflocal_reauth]] and [[sendmsg_scm_creds]]. diff --git a/open_issues/ps_SIGSEGV.mdwn b/open_issues/ps_SIGSEGV.mdwn new file mode 100644 index 00000000..24d5cb4f --- /dev/null +++ b/open_issues/ps_SIGSEGV.mdwn @@ -0,0 +1,17 @@ +[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_hurd]] + + +# IRC, freenode, #hurd, 2013-02-05 + + ps -l segfaults + ah, perhaps because of the subhurd diff --git a/open_issues/rpc_stub_generator.mdwn b/open_issues/rpc_stub_generator.mdwn index 05eb53b8..d4622d67 100644 --- a/open_issues/rpc_stub_generator.mdwn +++ b/open_issues/rpc_stub_generator.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -97,3 +97,50 @@ License|/fdl]]."]]"""]] scatter-gather to be used with x15 once more, i fell back on omg idl oh, there is also flick that looks interesting + + +# IRC, freenode, #hurd, 2013-13-16 + + braunr: By the way, regarding your recent IDL considerations + (and I too suggest using some kind of RPC generator basone on whichever + IDL) -- are you aware that for Viengoos, Neal has written a RPC stub + generator entirely in C Preprocessor macros? No idea whather that's + suitable for your case, but may be worth having a look at. + it probably isn't easy to port to Mach + genode has an ipc generator as well + which is written in a real langugage + that might be worth checking out as well + (note: I haven't followed the conversation at all.) + i was considering using macros only too actually + (i thought genode had switched to complex c++ templates) + dunno + I'm not up to date + macros are nice, but marshalling complicated data structures is hard + why implement it with just macros ?? 
+ no lexer, no parser + no special special tools + the first are a burden + the latter is a pain + + http://git.savannah.gnu.org/gitweb/?p=hurd/viengoos.git;a=blob;f=libviengoos/viengoos/rpc.h;h=721768358a0299637fb79f226aea6a304571da85;hb=refs/heads/viengoos-on-bare-metal + in the same directory, you there are headers that use it + neal: cf. http://genode.org/documentation/release-notes/11.05 + tschwinge: why do you recommend an IDL ? + braunr: What about it? + neal: it shows the difference between the earlier ipc/rpc + interface, and the new one based only on templates and dynamic + marshalling using c++ streams + ok + braunr: In my book, the definition of RPC interfaces is just + "data" in the sense that it describes data structures (exchanged + messages) and as such should be expressed as data (by means of an IDL), + instead of directly codifying it in a specific programming language. + Of course, there may be other reasons for doing the latter + anyway, such as performance/optimization reasons. + tschwinge: well, from my pov, you're justifying the use of an idl + from the definition of an rpc + i'm not sure it makes much sense for me + in addition, the idl becomes the "specific programming language" + Well, I see it as data that has to be translated into several + formats: different programming languages' stub code. 
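Editorial sketch of the "interface as data" argument made above: one description, several generated forms. The description format is invented for this illustration; it is not MIG syntax, not the Viengoos macros, and not any existing IDL. Only `kern_return_t` and `mach_port_t` are real Mach type names.

```python
# Hypothetical interface description; not MIG or any real IDL syntax.
INTERFACE = {
    "name": "example_ping",
    "args": [("port", "mach_port_t"), ("value", "int")],
}


def c_prototype(rpc):
    """Render the description as a C function prototype string."""
    params = ", ".join(f"{ctype} {name}" for name, ctype in rpc["args"])
    return f"kern_return_t {rpc['name']}({params});"


def python_stub(rpc):
    """Render the same description as a Python function signature."""
    params = ", ".join(name for name, _ in rpc["args"])
    return f"def {rpc['name']}({params}): ..."
```

Both generators read the same data, which is the point being argued; whether that outweighs the cost of a separate tool is exactly what the conversation goes on to dispute.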
+ you could consider c the "common" language :) diff --git a/open_issues/select.mdwn b/open_issues/select.mdwn index 391509a9..caecc437 100644 --- a/open_issues/select.mdwn +++ b/open_issues/select.mdwn @@ -215,7 +215,7 @@ IRC, unknown channel, unknown date: it's better than nothing yes -# IRC, freenode, #hurd, 2012-07-21 +### IRC, freenode, #hurd, 2012-07-21 damn, select is actually completely misdesigned :/ iiuc, it makes servers *block*, in turn :/ @@ -315,7 +315,7 @@ IRC, unknown channel, unknown date: really easy and nice :-) -## IRC, freenode, #hurd, 2012-07-22 +#### IRC, freenode, #hurd, 2012-07-22 antrik: you can't block in servers with sync ipc so in this case, "select" becomes a request for notifications @@ -379,7 +379,7 @@ IRC, unknown channel, unknown date: his reasoning (as does braunr) -## IRC, freenode, #hurd, 2012-07-23 +#### IRC, freenode, #hurd, 2012-07-23 antrik: i was meaning sync in the most common meaning, yes, the client blocking on the reply @@ -650,6 +650,9 @@ IRC, unknown channel, unknown date: which is why i could choose time_value_t (a struct of 2 integer_t) well, I'd say gnumach could grow a nanosecond-precision time value e.g. for clock_gettime precision and such + +[[clock_gettime]]. + so you would prefer me adding the time_spec_t time to gnumach rather than the hurd ? well, if hurd RPCs are using mach types and there's no mach type @@ -782,7 +785,7 @@ IRC, unknown channel, unknown date: API definition at RPC level too -## IRC, freenode, #hurd, 2012-07-24 +#### IRC, freenode, #hurd, 2012-07-24 youpi: antrik: is vm_size_t an appropriate type for a c long ? (appropriate mig type) @@ -809,7 +812,7 @@ IRC, unknown channel, unknown date: continue -## IRC, freenode, #hurd, 2012-07-25 +#### IRC, freenode, #hurd, 2012-07-25 braunr: well, for actual kernel calls, machine-specific types are probably hard to avoid... 
the problem is when they are used in other RPCs @@ -900,7 +903,7 @@ IRC, unknown channel, unknown date: antrik: ah about that, ok -## IRC, freenode, #hurd, 2012-07-26 +#### IRC, freenode, #hurd, 2012-07-26 braunr: wrt your select_timeout branch, why not push only the time_data stuff to master? @@ -914,7 +917,7 @@ IRC, unknown channel, unknown date: i "only" have to adjust the client side select implementation now -## IRC, freenode, #hurd, 2012-07-27 +#### IRC, freenode, #hurd, 2012-07-27 io_select should remain a routine (i.e. synchronous) for server side stub code @@ -922,7 +925,14 @@ IRC, unknown channel, unknown date: (since _hurs_select manually handles replies through a port set) -## IRC, freenode, #hurd, 2012-07-28 +##### IRC, freenode, #hurd, 2013-02-09 + + io_select becomes a simpleroutine, except inside the hurd, where + it's a routine to keep the receive and reply mig stub code + (the server side) + + +#### IRC, freenode, #hurd, 2012-07-28 why are there both REPLY_PORTS and IO_SELECT_REPLY_PORT macros in the hurd .. @@ -941,7 +951,7 @@ IRC, unknown channel, unknown date: i did something a bit ugly but it seems to do what i wanted -## IRC, freenode, #hurd, 2012-07-29 +#### IRC, freenode, #hurd, 2012-07-29 good, i have a working client-side select now i need to fix the servers a bit :x @@ -998,7 +1008,7 @@ IRC, unknown channel, unknown date: queue ... 
-## IRC, freenode, #hurd, 2012-07-30 +#### IRC, freenode, #hurd, 2012-07-30 hm nice, the problem i have with my hurd_condition_timedwait seems to also exist in libpthread @@ -1096,7 +1106,7 @@ IRC, unknown channel, unknown date: (http://git.savannah.gnu.org/cgit/hurd/hurd.git/commit/?h=rbraun/select_timeout&id=40fe717ba9093c0c893d9ea44673e46a6f9e0c7d) -## IRC, freenode, #hurd, 2012-08-01 +#### IRC, freenode, #hurd, 2012-08-01 damn, i can't manage to make threads calling condition_wait to dequeue themselves from the condition queue :( @@ -1132,7 +1142,7 @@ IRC, unknown channel, unknown date: it frightens me because i don't see any flaw in the logic :( -## IRC, freenode, #hurd, 2012-08-02 +#### IRC, freenode, #hurd, 2012-08-02 ah, seems i found a reliable workaround to my deadlock issue, and more than a workaround, it should increase efficiency by reducing @@ -1150,7 +1160,7 @@ IRC, unknown channel, unknown date: (/etc/hurd/runsystem i assume) -## IRC, freenode, #hurd, 2012-08-03 +#### IRC, freenode, #hurd, 2012-08-03 glibc actually makes some direct use of cthreads condition variables @@ -1302,7 +1312,7 @@ IRC, unknown channel, unknown date: tests, reviews, more tests, polishing, commits, packaging -## IRC, freenode, #hurd, 2012-08-04 +#### IRC, freenode, #hurd, 2012-08-04 grmbl, apt-get fails on select in my subhurd with the updated glibc @@ -1324,7 +1334,7 @@ IRC, unknown channel, unknown date: and thomas d -## IRC, freenode, #hurd, 2012-08-05 +#### IRC, freenode, #hurd, 2012-08-05 eh, i made dpkg-buildpackage use the patched c library, and it finished the build oO @@ -1352,7 +1362,7 @@ IRC, unknown channel, unknown date: extremely large -## IRC, freenode, #hurd, 2012-08-06 +#### IRC, freenode, #hurd, 2012-08-06 i have bad news :( it seems there can be memory corruptions with my io_select patch @@ -1395,7 +1405,7 @@ IRC, unknown channel, unknown date: [[libpthread]]. 
-## IRC, freenode, #hurd, 2012-08-07 +#### IRC, freenode, #hurd, 2012-08-07 anyone knows of applications extensively using non-blocking networking functions ? @@ -1470,7 +1480,7 @@ IRC, unknown channel, unknown date: other for the servers, eh) -## IRC, freenode, #hurd, 2012-08-07 +#### IRC, freenode, #hurd, 2012-08-07 when running gitk on [darnassus], yesterday, i could push the CPU to 100% by simply moving the mouse in the window :p @@ -1490,7 +1500,7 @@ IRC, unknown channel, unknown date: this linear search on dequeue is a real pain :/ -## IRC, freenode, #hurd, 2012-08-09 +#### IRC, freenode, #hurd, 2012-08-09 `screen` doesn't close a window/hangs after exiting the shell. @@ -1503,7 +1513,7 @@ IRC, unknown channel, unknown date: [[Term_blocking]]. -# IRC, freenode, #hurd, 2012-12-05 +### IRC, freenode, #hurd, 2012-12-05 well if i'm unable to build my own packages, i'll send you the one line patch i wrote that fixes select/poll for the case where there is @@ -1512,7 +1522,7 @@ IRC, unknown channel, unknown date: timeout, doubling the total wait time when there is no event) -## IRC, freenode, #hurd, 2012-12-06 +#### IRC, freenode, #hurd, 2012-12-06 damn, my eglibc patch breaks select :x i guess i'll just simplify the code by using the same path for @@ -1546,12 +1556,12 @@ IRC, unknown channel, unknown date: this can account for the slowness of a bunch of select/poll users -## IRC, freenode, #hurd, 2012-12-07 +#### IRC, freenode, #hurd, 2012-12-07 finally, my select patch works :) -## IRC, freenode, #hurd, 2012-12-08 +#### IRC, freenode, #hurd, 2012-12-08 for those interested, i pushed my eglibc packages that include this little select/poll timeout fix on my debian repository @@ -1560,7 +1570,7 @@ IRC, unknown channel, unknown date: regressions -## IRC, freenode, #hurd, 2012-12-10 +#### IRC, freenode, #hurd, 2012-12-10 I have verified your double timeout bug in hurdselect.c. 
Since I'm also working on hurdselect I have a few questions @@ -1631,7 +1641,13 @@ IRC, unknown channel, unknown date: i'll try the non intrusive mode -## IRC, freenode, #hurd, 2012-12-11 +##### IRC, freenode, #hurd, 2013-01-26 + + ah great, one of the recent fixes (probably select-eintr or + setitimer) fixed exim4 :) + + +#### IRC, freenode, #hurd, 2012-12-11 braunr: What is the technical difference of having the delay at io_select compared to mach_msg for one FD? @@ -1641,7 +1657,7 @@ IRC, unknown channel, unknown date: (for L4 guys it wouldn't be considered a slight optimization :)) -## IRC, freenode, #hurd, 2012-12-17 +#### IRC, freenode, #hurd, 2012-12-17 tschwinge: http://git.savannah.gnu.org/cgit/hurd/glibc.git/log/?h=rbraun/select_timeout_for_one_fd @@ -1668,20 +1684,20 @@ IRC, unknown channel, unknown date: notifications resulting from the way io_select works -## IRC, freenode, #hurd, 2012-12-19 +#### IRC, freenode, #hurd, 2012-12-19 tschwinge: i've tested the glibc rbraun/select_timeout_for_one_fd branch for a few days on darnassus now, and nothing wrong to report -## IRC, freenode, #hurd, 2012-12-20 +#### IRC, freenode, #hurd, 2012-12-20 braunr: so, shall I commit the single hurd select timeout fix to the debian package? youpi: i'd say so yes -## IRC, freenode, #hurd, 2013-01-03 +#### IRC, freenode, #hurd, 2013-01-03 gnu_srs: sorry, i don't understand your poll_timeout patch it basically reverts mine for poll only @@ -1848,7 +1864,7 @@ IRC, unknown channel, unknown date: to the poll stuff: Have to check further with my poll patch... -## IRC, freenode, #hurd, 2013-01-04 +#### IRC, freenode, #hurd, 2013-01-04 Summary of the eglibc-2.13-38 issues: without the unsubmitted-setitimer_fix.diff patch and with @@ -2037,6 +2053,388 @@ IRC, unknown channel, unknown date: See also [[alarm_setitimer]]. 
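Editorial sketch related to the double timeout discussed in this section (the full timeout being waited twice when no event arrives). One common way to avoid that class of problem is to compute a single deadline and give each wait only the time that is left. The helper names below are hypothetical, and this is not the actual change made to hurdselect.c.

```python
def remaining(deadline, now):
    """Time left before the deadline, never negative."""
    return max(0.0, deadline - now)


def two_phase_wait(timeout, wait, clock):
    """Run two waits that together never exceed `timeout`.

    `wait(t)` blocks for up to t seconds and returns True on an event;
    `clock()` returns the current time. Both are injected, hypothetical
    callables so the logic can be exercised without sleeping.
    """
    deadline = clock() + timeout
    if wait(remaining(deadline, clock())):
        return True
    # The second wait only gets what the first one left over.
    return wait(remaining(deadline, clock()))
```

With a naive implementation that passes `timeout` to both waits, a caller asking for 2 seconds could block for 4; with the deadline, the second wait is given 0 once the first has used up the budget.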
+#### IRC, freenode, #hurd, 2013-01-22 + + youpi: Maybe it's overkill to have a separate case for DELAY; but + it enhances readability (and simplifies a lot too) + but it reduces factorization + if select is already supposed to behave the same way as delay, + there is no need for a separate code + OK; I'll make a two-way split then. What about POLL and nfds=0, + timeout !=0? + gnu_srs: handle nfds=0 as a pure timeout as the linux man page + describes + it makes sense, and as other popular systems do it, it's better to + do it the same way + and i disagree with you, factorization doesn't imply less + readability + So you agree with me to have a special case for DELAY? + Coding style is a matter of taste: for me case a: case b: etc is + more readable than "if then elseif then else ..." + it's not coding style + avoiding duplication is almost always best + whatever the style + i don't see the need for a special delay case + it's the same mach_msg call + (for now) + gnu_srs: i'd say the only reason to duplicate is when you can't do + otherwise + ways of coding then... And I agree with the idea of avoiding code + duplication, ever heard of Literate Programming + we'll need a "special case" when the timeout is handled at the + server side, but it's like two lines .. + + +#### IRC, freenode, #hurd, 2013-02-11 + + braunr: the libpthread hurd_cond_timedwait_np looks good to me + + +##### IRC, freenode, #hurd, 2013-02-15 + + braunr: does cond_timedwait_np depend on the cancellation fix? 
+ yes + ok + the timeout fix + so I also have to pull that into my glibc build + (i fixed cancellation too because the cleanup routine had to be + adjusted anyway + ) + ah, and I need the patches hurd package too + if unsure, you can check my packages + ok, not for tonight then + i listed the additional patches in the changelog + yep, I'll probably use them + + +#### IRC, freenode, #hurd, 2013-02-11 + + braunr: I don't understand one change in glibc: + - err = __io_select (d[i].io_port, d[i].reply_port, 0, &type); + + err = __io_select (d[i].io_port, d[i].reply_port, type); + youpi: the waittime parameter ahs been removed + has* + where? when? + in the hurd branch + in the defs? + yes + I don't see this change + only the addition of io_select_timeout + hum + also, io_select_timeout should be documented along io_select in + hurd.texi + be6e5b86bdb9055b01ab929cb6b6eec49521ef93 + Selectively compile io_select{,_timeout} as a routine + * hurd/io.defs (io_select_timeout): Declare as a routine if + _HURD_IO_SELECT_ROUTINE is defined, or a simpleroutine + otherwise. + (io_select): Likewise. In addition, remove the waittime + timeout parameter. + ah, it's in another commit + yes, perhaps misplaced + that's the kind of thing i want to polish + my main issue currently is that time_data_t is passed by value + i'm trying to pass it by address + I don't know the details of routine vs simpleroutine + it made sense for me to remove the waittime parameter at the same + time as adding the _HURD_IO_SELECT_ROUTINE macro, since waittime is what + allows glibc to use a synchronous RPC in an asynchronous way + is it only a matter of timeout parameter? + simpleroutine sends a message + routine sends and receives + by having a waittime parameter, _hurd_select could make io_select + send a message and return before having a reply + ah, that's why in glibc you replaced MACH_RCV_TIMED_OUT by 0 + yes + it seems a bit odd to have a two-face call + it is + can't we just keep it as such? 
+ no + damn + well we could, but it really wouldn't make any sense + why not? + because the way select is implemented implies io_select doesn't + expect a reply + (except for the single df case but that's an optimization) + fd* + that's how it is already, yes? + yes + well yes and no + that's complicated :) + there are two passes + let me check before saying anything ;p + :) + in the io_select(timeout=0) case, can it ever happen that we + receive an answer? + i don't think it is + you mean non blocking right ? + not infinite timeout + I mean calling io_select with the timeout parameter being set to 0 + so yes, non blocking + no, i think we always get MACH_RCV_TIMED_OUT + for me non-blocking can mean a lot of things :) + ok + i was thinking mach_msg here + ok so, let's not consider the single fd case + the first pass simply calls io_select with a timeout 0 to send + messages + I don't think it's useful to try to optimize it + it'd only lead to bugs :) + me neither + yes + (as was shown :) ) + what seems useful to me however is to optimize the io_select call + with a waittime parameter, the generated code is an RPC (send | + receive) + whereas, as a simpleroutine, it becomes a simple send + ok + my concern is that, as you change it, you change the API of the + __io_select() function + (from libhurduser) + yes but glibc is the only user + and actually no + i mean + i change the api at the client side only + that's what I mean + remember that io.Defs is almost full + "full" ? + i'm almost certain it becomes full with io_select_timeout + there is a practical limit of 100 calls per interface iirc + since the reply identifiers are request + 100 + are we at it already? 
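Editorial sketch of the arithmetic behind the limit recalled above: if a reply identifier is the request identifier plus 100, then the 101st call of an interface would get an id that is already in the reply range. The offset of 100 is taken from the remark above (itself hedged with "iirc"), and the base value used in the usage note is an arbitrary illustration.

```python
REPLY_OFFSET = 100  # per the recollection above; treat as an assumption


def request_id(base, index):
    """Identifier of the index-th call of an interface starting at base."""
    return base + index


def reply_id(base, index):
    """Identifier of the reply matching the index-th call."""
    return request_id(base, index) + REPLY_OFFSET


def collides(base, index):
    """True if this request id falls into the range used by replies."""
    return request_id(base, index) >= base + REPLY_OFFSET
```

For an illustrative base of 21000, call number 99 is still safe, while call number 100 would share its id with the reply to call number 0.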
+ i remember i had problems with it so probably + but anyway, I'm not thinking about introducing yet another RPC + but get a reasonable state of io_select + i'l have to check that limit + it looks wrong now + or was it 50 + i don't remember :/ + i understand + but what i can guarantee is that, while the api changes at the + client side, it doesn't at the server side + ideally, the client api of io_select could be left as it is, and + libc use it as a simpleroutine + sure, I understand that + which means glibc, whether patched or not, still works fine with + that call + yes it could + that's merely a performance optimization + my concern is that an API depends on the presence of + _HURD_IO_SELECT_ROUTINE, and backward compatibility being brought by + defining it! :) + yes + i personally don't mind much + I'd rather avoid the clutter + what do you mean ? + anything that avoids this situation + like just using timeout = 0 + well, in that case, we'll have both a useless timeout at the + client side + and another call for truely passing a timeout + that's also weird + how so a useless timeout at the client side? + 22:39 < youpi> - err = __io_select (d[i].io_port, d[i].reply_port, + 0, &type); + 0 here is the waittime parameter + that's a 0-timeout + and it will have to be 0 + yes + that's confusing + ah, you mean the two io_select calls? + yes + but isn't that necessary for the several-fd case, anyway? + ? + if the io_select calls are simple routines, this useless waittime + parameter can just be omitted like i did + don't we *have* to make several calls when we select on several + fds? + suure but i don't see how it's related + well then I don't see what optimization you are doing then + except dropping a parameter + which does not bring much to my standard :) + a simpleroutine makes mach_msg take a much shorter path + that the 0-timeout doesn't take? 
+ yes + it's a send | receive + ok, but that's why I asked before + so there are a bunch of additional checks until the timeout is + handled + whether timeout=0 means we can't get a receive + and thus the kernel could optimize + that's not the same thing :) + ok + it's a longer path to the same result + I'd really rather see glibc building its own private simpleroutine + version of io_select + iirc we already have such kind of thing + ok + well there are io_request and io_reply defs + but i haven't seen them used anywhere + but agreed, we should do that + braunr: the prototype for io_select seems bogus in the io_request, + id_tag is no more since ages :) + youpi: yes + youpi: i'll recreate my hurd branch with only one commit + without the routine/simpleroutine hack + and with time_data_t passed by address + and perhaps other very minor changes + braunr: the firstfd == -1 test needs a comment + or better, i'll create a v2 branch to make it easy to compare them + ok + braunr: actually it's also the other branch of the if which needs a + comment: "we rely on servers implementing the timeout" + youpi: ok + - (msg.success.result & SELECT_ALL) == 0) + why removing that test? + you also need to document the difference between got and ready + hm i'll have to remember + i wrote this code like a year ago :) + almost + AIUI, got is the number of replies + but i think it has to do with error handling + and + + if (d[i].type) + + ++ready; + while ready is the number of successful replies + is what replaces it + youpi: yes + the poll wrapper already normalizes the timeout parameter to + _hurd_select + no you probably don't + the whole point of the patch is to remove this ugly hack + youpi: ok so + 23:24 < youpi> - (msg.success.result & SELECT_ALL) + == 0) + when a request times out + ah, right + we could get a result with no event + and no error + and this is what makes got != ready + tell that to the source, not me :) + sure :) + i'm also saying it to myself + ... 
:) + right, using io_select_request() is only an optimization, which we + can do later + what i currently do is remove the waittime parameter from + io_select + what we'll do instead (soon) is let the parameter there to keep + the API unchancged + but always use a waittime of 0 + to make the mach_msg call non blocking + then we'll try to get the io_request/io_reply definitions back so + we can have simpleroutines (send only) version of the io RPCs + and we'll use io_select_request (without a waittime) + youpi: is that what you understood too ? + yes + (and we can do that later) + gnu_srs: does it make more sense for you ? + this change is quite sparsed so it's not easy to get the big + picture + sparse* + it requires changes in libpthread, the hurd, and glibc + the libpthread change can be almost forgotten + it's just yet another cond_foo function :) + well not if he's building his own packages + right + ok, apart from the io_select_request() and documenting the newer + io_select_timeout(), the changes seem good to me + youpi: actually, a send | timeout takes the slow path in mach_msg + and i actually wonder if send | receive | timeout = 0 can get a + valid reply from the server + but the select code already handles that so it shouldn't be much + of a problem + k + + +##### IRC, freenode, #hurd, 2013-02-12 + + hum + io_select_timeout actually has to be a simpleroutine at the client + side :/ + grmbl + ah? + otherwise it blocks + how so? + routines wait for replies + even with timeout 0? + there is no waittime for io_select_timeout + adding one would be really weird + oh, sorry, I thought you were talking about io_select + it would be more interesting to directly use + io_select_timeout_request + but this means additional and separate work to make the + request/reply defs up to date + and used + personally i don't mind, but it matters for wheezy + youpi: i suppose it's not difficult to add .defs to glibc, is it ? 
+ i mean, make glibc build the stub code + it's probably not difficult indeed + ok then it's better to do that first + yes + there's faultexec for instance in hurd/Makefile + ok + or rather, apparently it'd be simply user-interfaces + it'll probably be linked into libhurduser + but with an odd-enough name it shouldn't matter + youpi: adding io_request to the list does indeed build the RPCs :) + i'll write a patch to sync io/io_reply/io_request + youpi: oh by the way, i'm having a small issue with the + io_{reply,request} interfaces + the generated headers both share the same enclosing macro + (_io_user) + so i'm getting compiler warning + s + we could fix that quickly in mig, couldn't we? + youpi: i suppose, yes, just mentioning + + +##### IRC, freenode, #hurd, 2013-02-19 + + in the hurdselect.c code, I'd rather see it td[0]. rather than + td-> + ok + otherwise it's frownprone + (it has just made me frown :) ) + yes, that looked odd to me too, but at the same time, i didn't + want it to seem to contain several elements + I prefer it to look like there could be several elements (and then + the reader has to find out how many, i.e. 1), rather than it to look like + the pointer is not initialized + right + I'll also rather move that code further + so the preparation can set timeout to 0 + (needed for poll) + how about turning your branch into a tg branch? + feel free to add your modifications on top of it + sure + ok + I'll handle these then + youpi: i made an updated changelog entry in the + io_select_timeout_v3 branch + could you rather commit that to the t/io_select_timeout branch I've + just created? 
+ i mean, i did that a few days ago + (in the .topmsg file) + ah + k + + +##### IRC, freenode, #hurd, 2013-02-26 + + youpi: i've just pushed a rbraun/select_timeout_pthread_v4 branch + in the hurd repository that includes the changes we discussed yesterday + untested, but easy to compare with the previous version + + +##### IRC, freenode, #hurd, 2013-02-27 + + braunr: io_select_timeout seems to be working fine here + braunr: I feel like uploading them to debian-ports, what do you + think? + youpi: the packages i rebuild last night work fine too + + # See Also See also [[select_bogus_fd]] and [[select_vs_signals]]. diff --git a/open_issues/select_vs_signals.mdwn b/open_issues/select_vs_signals.mdwn index 9e9699b8..db616acb 100644 --- a/open_issues/select_vs_signals.mdwn +++ b/open_issues/select_vs_signals.mdwn @@ -42,6 +42,21 @@ In context of [[alarm_setitimer]]. Proposed patch: [[!message-id "20130105162817.GW5965@type.youpi.perso.aquilenet.fr"]]. + +## IRC, freenode, #hurd, 2013-01-15 + + <_d3f> Hello, any one else having problems with git? + _d3f: yes + _d3f: it will be fixed in the next glibc release + <_d3f> oh thx. what was the problem? + http://lists.gnu.org/archive/html/bug-hurd/2013-01/msg00005.html + exactly this problem is preventing us building glibc + it's indeed very annoying + and this fix will probably have a visible and positive effect on + other issues + <_d3f> let's hope so. + well, i'm already using it and see the difference + --- See also [[select]] and [[select_bogus_fd]]. 
diff --git a/open_issues/some_todo_list.mdwn b/open_issues/some_todo_list.mdwn index 82822a29..80592abf 100644 --- a/open_issues/some_todo_list.mdwn +++ b/open_issues/some_todo_list.mdwn @@ -27,7 +27,6 @@ From Marcus, 2002: * xkb driver for console (for international users) * kbd leds in console (well, in general, Roland's new driver in oskit for that crap) -* fixing fakeroot (it's buggy) * fixing tmpfs (it's buggy, Neal says it's Mach's fault) * adding posix shared memory (requires the io\_close call to be implemented) * adding posix file locking (requires the io\_close call to be implemented) diff --git a/open_issues/subhurd_vs_proc_server.mdwn b/open_issues/subhurd_vs_proc_server.mdwn new file mode 100644 index 00000000..36d150f8 --- /dev/null +++ b/open_issues/subhurd_vs_proc_server.mdwn @@ -0,0 +1,54 @@ +[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_hurd]] + + +# IRC, freenode, #hurd, 2013-02-09 + + also, can you actually gdb a process of another subhurd? + yes + but you need to talk to its proc server, don't you? + i don't know + but i did it several times + how? + the usual way + gdb /path/to/bin pid + but which pid? 
+ the hard part was finding the right pid + well, gdb still needs to talk with the right proc too + i don't think it does + btw about the "unable to adjust libports thread priority" errors + I'm seeing on the buildd consoles + from what i've seen, proc "creates" tasks when it first sees them + too + it's about the destination port + yes + i have those when starting a subhurd too + so it would mean that proc somehow got bogus + ah + so you can actually use your own proc + yes + and it feels bogus to me + and I guess mach lets that proc access the task because your proc + is privileged + probably + it feels bogus because, you can't rely on pids being allocated per + task + what i mean is that, if some tasks spawn and die quickly + and you start another application running long enough to see it in + ps + it's pid will be +1, not +the number of created tasks + which means the proc server will never have seen those previous + tasks + it's minor but a bit confusing + i personally don't like seeing the tasks of other systems in ps :/ + and despite the ability to use gdb from another hurd, i think we + should improve the intra system debugging tools diff --git a/open_issues/syslog.mdwn b/open_issues/syslog.mdwn index 19cba82e..ab32b2e1 100644 --- a/open_issues/syslog.mdwn +++ b/open_issues/syslog.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -13,6 +13,7 @@ License|/fdl]]."]]"""]] [[!toc]] + # IRC, unknown channel, unknown date scolobb: In wiki edit 60accafa79f645ae61b578403f7fc0c11914b725 @@ -105,3 +106,11 @@ IRC, OFTC, #debian-hurd, 2011-11-02: it depends on how many things are actually logged. 
IIRC the hang happens when some client sends 128 messages to syslog or something like that + + +# IRC, freenode, #hurd, 2013-02-09 + + tschwinge: looks like now you could disable syslog no + ... more + Is that working now? + should be yes, samuel fixed its issue many months ago diff --git a/open_issues/system_stats.mdwn b/open_issues/system_stats.mdwn index 9a13b29a..ce34ec09 100644 --- a/open_issues/system_stats.mdwn +++ b/open_issues/system_stats.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -37,3 +37,9 @@ system statistics, how to interpret them, and some example/expected values. yes (a test that fails with the 2G/2G split of the debian kernel, but not on your vanilla version btw) + + +## IRC, freenode, #hurd, 2013-01-26 + + ah great, one of the recent fixes (probably select-eintr or + setitimer) fixed exim4 :) diff --git a/open_issues/systemd.mdwn b/open_issues/systemd.mdwn index 1d774307..c23f887f 100644 --- a/open_issues/systemd.mdwn +++ b/open_issues/systemd.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -92,7 +93,16 @@ Likely there's also some other porting needed.
agreed -# Requires Interfaces +## IRC, freenode, #hurd, 2013-01-18 + + systemd relies on linux specific stuff that is difficult to + implement + notably cgroups to isolate the deamons it starts so it knows when + they stopped regardless of their pid + just assume you can't use systemd on anything else than linux + + +# Required Interfaces In the thread starting [here](http://lists.debian.org/debian-devel/2011/07/threads.html#00269), a diff --git a/open_issues/translators_set_up_by_untrusted_users.mdwn b/open_issues/translators_set_up_by_untrusted_users.mdwn index 1dac130c..521331e9 100644 --- a/open_issues/translators_set_up_by_untrusted_users.mdwn +++ b/open_issues/translators_set_up_by_untrusted_users.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -350,3 +350,230 @@ IRC, freenode, #hurd, 2011-09-14: cjuner: either glibc or the parent translators Continued discussion about [[resource_management_problems/pagers]]. + + +# IRC, freenode, #hurd, 2013-02-24 + + on a more general topic, i've been thinking about client and + server trust + there have been many talkbs about it regarding l4/coyotos/hurdng + I generally think the client can trust the server + and passing the select timeout to servers corroborates this + because it's either root, or it's the same user + hum yes, but that's not exactly my question, you'll see + there is one feature the hurd has, and i'm not sure we should have + it considering what it requires + the feature is that clients can, at any time, "break" from a + server + "break" ? 
+ the current implementation is to cancel the current RPC after 3 + seconds without a reply when the user sends SIGINT + the problem is that, moving to a complete migrating thread model + would make that impossible (or very complicated to do right) + +[[mach_migrating_threads]]. + + would it be ok to remove this feature ? + well, we need to have SIGINT working, don't we? + obviously + but that's not what the feature is meant to do + it allows clients to recover from a server that misbehaves and + doesn't return + then I don't understand in enough details what you mean :) + imagine a buggy driver in linux that gets into an uninterruptible + sleep + you can't even kill your process + that's what the feature is meant to solve + that's a quite useful thing + e.g. stuck nfs etc., it's useful to be able to recover from that + forbidding uninterruptible sleeps would also be a solution, but + then it means relying on servers acting right + which is why i mention we usually trust servers + well, there is "trust" and "trust" :) + i.e. 
security-wise and robustness-wise + I meant clients can usually trust servers security-wise + my current idea for x15 is to forbid this kind of breaking, but + also forbid uninterruptible sleeps + robustness-wise, I'd say no + this way, sending a signal directly reaches the server, which is + trusted to return right away (well, conforming to the handling behaviour) + so yes, buggy servers would prevent that, but on the other hand, + stuck nfs wouldn't + provided the nfs implementation is not bogus + yes + I'd tend to agree, but would rather see this discussed on the list + yes + actually, it wouldn't be that hard to keep the current behaviour, + since i won't implement true migrating threads + but it means retaining some issues we have (most importantely, + denial of service) + -e + what i want to avoid is + http://www.gnu.org/software/hurd/hurd/ng/cancellationforwarding.html + for non-trusted servers, we could have a safety wrapper + which we trust and does things carefully when talking with the + non-trusted server + what would a non trusted server be ? + whatever is neither root nor me + e.g. nobody-provided /ftp:, or $HOME of another user, etc. 
+ i'd argue we don't talk to non trusted servers at all, period + users won't like it :) + and i'd extend root to a system provided list + actually the nobody /ftp: case is important + users should be able to create their own list of trusted users + it's also the nobody /dev/null case + atm it's root + yes + i see the point + i'm just saying the idea of "using non-trusted server" doesn't + make sense + actually running /ftp: under nobody is dangerous + since if you're running as nobody (because you broke into the + system or whatever), then you can poke with nobody-provided servers + yes + so we'd rather have really-nobody processes + taht's an already existing problem + which can't be poked into + (and thus can't poke into each other) + or a separate user for each + that'd be difficult + or separate tokens, it's not important + for /ftp:/ftp.debian.org used by someone, and /ftp:/ftp.foo.org + used by someone else + what i mean is that, by talking to a server, a client implicitely + trusts it + youpi: wouldn't that just be the same "ftp" user ? + ideally, a carefully-crafted client could avoid having to trust it + really ? + braunr: that's the problem: then each ftpfs can poke on each other + well, each global one + there's the daemon-sharing issue too, yes + i wasn't thinking about ftpfs, but rather the "system" pfinet for + example + like /dev/null is shared + when you say root or me, it's "system" or me + by default, users trust their system + they don't trust other users + avoid having to trust it: yes, by using timeouts etc. + that's clearly not enough + why? 
+ shapiro described this in a mail but i can't find it right now + I wouldn't like to have to trust ftpfs + well time is one thing, data provided for example is another + well, you do + who knows what bug ftpfs has + ideally I would be able not to have to + braunr: you can check data + i don't think that ideal is possible + it you set a ftp translator with a user account, you give it the + password + which password? + the account password + which account? + "a user account" + i.e. not anonymoius + ah + well, sure, you have to give that to ftpfs + I mean the ftp server might be malicious or whatever + and trigger a bug in ftpfs + yes + so I don't want to have to trust ftpfs + what would that mean in practice ? + have a trusted translation layer which papers over it, checking + timeouts & data + how do you check data ? + by knowing the protocol + ? + can you give a quick example ? + well, which data check do you need? + (it's you who mentioned data issues :) ) + i don't know what you mean by that so, choose as you see fit + well the password one for example + i was merely saying that, buy using an application, be it a + regular one or a translator, you automatically trust it + you mean the ftp user password ? + it becomes part of your tcb + of course you have to provide it to ftpfs + that's not a problem + yes, but it's not because you connect to an http website that you + trust the webserver on the other end + your web browser does checking for you + when the protocol allows it + (in this case, i'm thinking assymmetrical cryptography) + in which case example doesn't it ? 
+ it seems we're not talking about the same kind of issue, thus not + understanding each other + indeed + by "trusting", I don't mean "be sure that it's the right server at + the other end" + my point is that not trusting a server is impossible + I mean "it behaves correectly" + yes + it may not behave correctly, and we might not know it + as long as it doesn't make the program crash, that's fine + that's what I mean + that's where the difference is + but giving the password is not my concern here + and giving the password is a matter of cryptography, etc. yes, but + that's completely not my point + i'm concerned about absolute correct behaviour + hm + no actually i was + but not any more, the discussion has shifted back to the timeout + issue + ah no, i remember + we talked about which servers to trust + and how to alter communication accordingly + and my point was that altering communication shouldn't be done, we + either trust the server, and talk to it, or we don't, and we stay away + the wrapper would help for this specific blocking issue, yes + I don't agree on this + let me take a way more simple example + a server that provides data through io_read + I don't want to trust it because it's provided by some joe user + but I still want to have a look at the data that it produces + I'm fine that the data may be completely non-sense, that's not a + problem + what is a problem, however, is if the hexdump program I've run over + it can't be ^C-ed + yes, that's the specific issue i mentioned + and for that, a mere trusted intermediate could be enough + iirc, there is a firmlink-related issue + ? 
+ + http://www.gnu.org/software/hurd/open_issues/translators_set_up_by_untrusted_users.html + I'm not able to guess what conclusion you are drawing here :) + don't talk to untrusted servers + or be careful + the rm -fr /tmp being aabout being careful actually + right + i have a very unix-centric view for my system actually + i think posix compatibility is very important for me + more than it seems to have been in the past when the hurd was + designed + to* me + so i don't trust tools to be careful + that's why a wrapping translator could make it back to posix + compatibility + but i see what you mean + being careful for the tools + hum, a wrapping _translator_ ? + yes, similar to remap and fakeroot + ok + you'd tell it "for this path, please be careful for my tools" + ok so + it would basically still end up trusting system or me + but you'd add this wrapper to the system + "it" ? + the situation + i don't know :) + the implementation, whatever + the shell I'm running, you mean + and it would be the job of this translator to shield the user + yes + that's a good idea, yes + it could reduce the allowed RPC set to what it knows to check + how would the shell use it ? + would it "shadow" / ? + yes + ok diff --git a/open_issues/virtualization.mdwn b/open_issues/virtualization.mdwn index 10cf73db..34074c18 100644 --- a/open_issues/virtualization.mdwn +++ b/open_issues/virtualization.mdwn @@ -46,3 +46,5 @@ An index of things to work on w.r.t. virtualization. 
* [[Networking]] * [[remap_root_translator]] + + * [[fakeroot]] diff --git a/open_issues/virtualization/fakeroot.mdwn b/open_issues/virtualization/fakeroot.mdwn new file mode 100644 index 00000000..ec762b59 --- /dev/null +++ b/open_issues/virtualization/fakeroot.mdwn @@ -0,0 +1,17 @@ +[[!meta copyright="Copyright © 2010, 2013 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_hurd]] + + +# IRC, freenode, #hurd, 2013-02-26 + + btw, about fakeroot-hurd + the remaining issue I see is with argv[0] (yes, again...) 
diff --git a/open_issues/virtualization/remap_root_translator.mdwn b/open_issues/virtualization/remap_root_translator.mdwn index 3cb574ae..67d64ae0 100644 --- a/open_issues/virtualization/remap_root_translator.mdwn +++ b/open_issues/virtualization/remap_root_translator.mdwn @@ -95,3 +95,47 @@ License|/fdl]]."]]"""]] his own one ok attached to the remapping + + +## IRC, freenode, #hurd, 2013-01-29 + + ok, the remap translator was too easy + just took fakeroot.c + added if (!strcmp("bin/foo", filename)) filename = + "bin/bash"; in + netfs_S_dir_lookup + and it just works + ok, remap does indeed take my own pfinet + good :) + pfinet's tun seems to be working too + it's however not really flexible, it has to show up in /dev/tunx + I'll have a look at fixing that + yep, works fine + + +## IRC, freenode, #hurd, 2013-02-01 + + braunr: as I expected, simply passing FS_RETRY_REAUTH does the + remapping trick + + +# IRC, freenode, #hurd, 2013-02-12 + + + http://darnassus.sceen.net/~hurd-web/community/gsoc/project_ideas/server_overriding/ + youpi: isn't that your remap translator ? + completely + remap being (5) + + +# IRC, freenode, #hurd, 2013-02-25 + + I'm just having an issue with getcwd getting in the sky + I wonder whether libc might need patching to understand it's in + some sort of chroot + or perhaps remap fixed into avoiding .. of / being odd + erf, it's actually an explicit error + libc just doesn't want to have a ".." 
/ being different from CRDIR + let me just comment out that :) + way better :) + yep, just works fine diff --git a/open_issues/vm_map_kernel_bug.mdwn b/open_issues/vm_map_kernel_bug.mdwn index 613c1317..159e9d04 100644 --- a/open_issues/vm_map_kernel_bug.mdwn +++ b/open_issues/vm_map_kernel_bug.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -52,3 +52,20 @@ License|/fdl]]."]]"""]] it could be that gnumach isn't good at aligning to large values [[!message-id "87fw4pb4c7.fsf@kepler.schwinge.homeip.net"]] + + +# IRC, freenode, #hurd, 2013-01-22 + +In context of [[libpthread]]. + + pinotree: do you understand what the fmh function does in + sysdeps/mach/hurd/dl-sysdep.c ? + ok i understand what it does + and youpi has changed the code, so he does too + youpi: do you have a suggestion about how to solve this issue in + the fmh function ? + do we remember which bug it's after? + what do you mean ? + ah + no :/ + it could be a good occasion to get rid of it, yes -- cgit v1.2.3