author | Thomas Schwinge <thomas@codesourcery.com> | 2014-02-26 12:32:06 +0100 |
---|---|---|
committer | Thomas Schwinge <thomas@codesourcery.com> | 2014-02-26 12:32:06 +0100 |
commit | c4ad3f73033c7e0511c3e7df961e1232cc503478 (patch) | |
tree | 16ddfd3348bfeec014a4d8bb8c1701023c63678f /open_issues/libpthread/t | |
parent | d9079faac8940c4654912b0e085e1583358631fe (diff) |
IRC.
Diffstat (limited to 'open_issues/libpthread/t')
-rw-r--r-- | open_issues/libpthread/t/fix_have_kernel_resources.mdwn | 824 |
1 files changed, 823 insertions, 1 deletions
diff --git a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn index feea7c0d..02b6ab05 100644 --- a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn +++ b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -477,3 +478,824 @@ Address problem mentioned in [[/libpthread]], *Threads' Death*. failing bad <braunr> i just need to polish a few things, wait for youpi to finish his work on TLS to resolve conflicts, and that will be all + + +## IRC, freenode, #hurd, 2013-10-30 + + <braunr> FYI, the packages on my repository enable actual thread + destruction, and i've altered the libports_stability.patch + <braunr> it nows only sets the global timeout to 0 + <braunr> now* + <braunr> we actually can't let translator "die" on global timeout because + of a race issue + <braunr> tested for about two weeks now and no major problem sighted + <braunr> top reports processes running for 100% of their time when + terminating threads, but i expect it's simply mach/proc aggregating their + run time to the task + <braunr> 100% of cpu time + + +## IRC, freenode, #hurd, 2013-11-08 + + <braunr> teythoon: darnassus is currently running a modified glibc with + thread destruction, yes + <teythoon> braunr: did that require any fixups in Hurd that I'd have missed + ? 
+ <braunr> no + <braunr> well + <teythoon> b/c the resulting hurd package would not boot + <braunr> actually yes + <braunr> one + <braunr> i'll push the patch somewhere + <teythoon> iirc the mach-defpager spewed some error and /hurd/init failed + to bootstrap the system + <braunr> teythoon: + http://darnassus.sceen.net/~rbraun/0001-Prevent-diskfs-translators-from-destroying-main-thre.patch + <braunr> make sure you have the proper gnumach packages too :p + <teythoon> well, that could very well account for my trouble ;) + <teythoon> uh + <teythoon> well + <braunr> gnumach implements thread destruction, glibc uses it, hurd makes + sure it doesn't exit from main + + +## IRC, freenode, #hurd, 2013-11-12 + + <braunr> ok so, calling pthread_exit() from main isn't the same as + returning from main() + <braunr> unlike what some man pages seem to say + <braunr> so loosing task info when destroying the main thread is actually a + proc bug + <braunr> ugh + <teythoon> ^^ + <braunr> or a glibc one + <teythoon> the proc server, your favorite Hurd component... + <braunr> :) + <braunr> hm :/ + <braunr> looks like command line arguments are stored on the stack of the + main thread + <braunr> and proc merely receives the addresses of those in the target task + <neal> why not just keep the main thread around? + <neal> it represents a minor resource leak, true + <braunr> yes + <braunr> that's the hack i suggested + <neal> but it is relatively small + <braunr> well no + <braunr> my hack was about diskfs translators + <braunr> it should be generalized in libpthread + <braunr> seems reasonable + <braunr> let's do it >) + + +## IRC, freenode, #hurd, 2013-11-13 + + <youpi> braunr: there is a thread destruction issue in the experimental + ocaml build, worth looking at, probably + <braunr> what do you mean ? + <youpi> ... testing 'testfork.ml': ocamlcocamlrun: + ../libpthread/sysdeps/mach/pt-thread-halt.c:51: __pthread_thread_halt: + Unexpected error: (ipc/send) invalid destination port. 
+ <youpi> during the experimental ocaml build + <braunr> well yes + <braunr> thread recycling is buggy + <braunr> i had the choice to fix it, or implement true destruction + <braunr> i'm tweaking my patch so it leaves the main thread stack untouched + on destruction + <braunr> and it should be ready + <braunr> for review at least + + +## IRC, OFTC, #debian-hurd, 2013-11-13 + + <gg0> ironforge out of memory during ruby1.9.1 rebuild. during test which + creates 10000 threads + <gg0> ironforge out of memory during ruby1.9.1 rebuild, test which creates + 10000 threads + <gg0> i guess ironforge kernel has been rebuilt against -95, correct? + <youpi> err, what kernel? + <gg0> 23:37 < youpi> hurd needs a rebuild to be able to work with the newer + eglibc + <gg0> i mean hurd + <youpi> yes, libc0.3 breaks the old packages anyway + <gg0> wrt ENOMEM, was it expected? + <gg0> wrt disk problems, aren't there on alioth only? + <youpi> well 10,000 threads is a lot, especially on 32bit machine with 2M + default stack size + <youpi> that makes 2GiB stacks + <youpi> can't fit in a 2/2 split model, which gnumach uses + <gg0> well, though active thread should die right away, just after set x to + false, if i read it correctly + <youpi> perhaps the stacks are not correctly reused + <youpi> that's probably worth digging in libpthread + <youpi> by putting printfs, etc. + <youpi> it seems stacks are never reused indeed, damn + <youpi> I just wrote a small test that creates threads which just print + their stack address + <youpi> that takes just a few minutes to do + <gg0> i see. 
about reusage i guess you mean base address is kindof always + incremented + * gg0 likes being wrong + <youpi> that's it, yes + <youpi> gg0: take care, by keeping being wrong all the time, sometimes you + get right ;) + <youpi> and you are definitely right here :) + <youpi> Mmm, but the stack is really deallocated + <youpi> and the numbers wrap around + <youpi> I wonder how that is :) + <youpi> ok, creating 20 000 threads does work + <youpi> perhaps ruby does odd things which makes it not work + + +### IRC, OFTC, #debian-hurd, 2013-11-14 + + <gg0> UID PID PPID TH MSGI MSGO SZ RSS SC STAT TIME COMMAND + <gg0> 1012 16446 15473 720 987 509 1.89G 23.6M 1 Hu 0:00.15 + /home/gg0-guest/ruby/ruby1.9.git/ruby1.9.1 + -I/home/gg0-guest/ruby/ruby1.9.git/lib -W0 bootstraptest.tmp.rb + <gg0> 720 threads, stuck + <youpi> 2G SZ is very big :) + <gg0> 00:42 < youpi> perhaps ruby does odd things which makes it not work + <gg0> is that enough to file a ruby bug? as ruby suggests itself btw + <youpi> no, they will probably not be able to investigate + <youpi> but you can already check out how they create threads + <youpi> and try to reproduce the same with a small C program + <gg0> ehm on ruby2.0 with *context _enabled_ i can not reproduce it + +See [[/open_issues/glibc]] for `*context` functions. 
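As a rough check of the figures in the log above, the sketch below works out how much address space thread stacks alone would reserve. It assumes the 2 MiB default stack size and the 2 GiB user address space (the 2/2 split) mentioned in the discussion; the function names are made up for illustration and do not come from glibc or libpthread.

```python
# Back-of-the-envelope check of the stack figures quoted in the log.
# Assumptions (taken from the discussion, not measured): 2 MiB default
# pthread stack size, 2 GiB of user address space (2/2 split).

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

DEFAULT_STACK = 2 * MIB
USER_SPACE = 2 * GIB


def stack_bytes(threads, stack_size=DEFAULT_STACK):
    """Address space reserved for the stacks of `threads` live threads."""
    return threads * stack_size


def max_threads(address_space=USER_SPACE, stack_size=DEFAULT_STACK):
    """Upper bound on live threads if stacks were the only mappings."""
    return address_space // stack_size


if __name__ == "__main__":
    for n in (720, 1000, 10000):
        print(f"{n:>6} threads x 2 MiB = {stack_bytes(n) / GIB:.2f} GiB")
    print("max live threads with 2 MiB stacks:", max_threads())
```

Under these assumptions, 10,000 simultaneously live threads would need about 19.5 GiB of stack rather than 2 GiB, and at most 1024 such stacks fit in 2 GiB even with nothing else mapped. The 720 threads seen in the `ps` output account for roughly 1.4 GiB of the reported 1.89G. The estimate only applies to threads that are alive at the same time, or whose stacks are not released after they exit.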
+ + +## IRC, freenode, #hurd, 2013-11-14 + + <braunr> nice, i got glibc packages with thread destruction + <braunr> building hurd packages against it now + <braunr> everything seems fine + <braunr> hurd packages ready, let's see + + <gg0> ruby1.9.1 FTBFS due to a couple of tests + https://buildd.debian.org/status/fetch.php?pkg=ruby1.9.1&arch=hurd-i386&ver=1.9.3.448-1&stamp=1384265526 + <gg0> second one creates 10000 threads and machine got ENOMEM + <braunr> bootstraptest.tmp.rb: [BUG] [BUG] pthread_cond_init: Cannot + allocate memory (ENOMEM) ew + <gg0> few hours ago trying to reproduce it: + <gg0> 01:20 < gg0> UID PID PPID TH MSGI MSGO SZ RSS SC STAT + TIME COMMAND + <gg0> 01:20 < gg0> 1012 16446 15473 720 987 509 1.89G 23.6M 1 Hu + 0:00.15 /home/gg0-guest/ruby/ruby1.9.git/ruby1.9.1 + -I/home/gg0-guest/ruby/ruby1.9.git/lib -W0 bootstraptest.tmp.rb + <braunr> yes that's expected + <braunr> our stacks are 2M + <braunr> 10k threads means right over 2G of stacks + <braunr> userspace is restricted to 2G + <gg0> but if i read correctly test in question, thread should just set x to + false then die + <braunr> so ? + <gg0> and ENOMEM popped upk when there were thread count was at 720 + <braunr> hum + <braunr> 10k threads would actually be 20G + <braunr> 1k threads is 2G + <braunr> 720 is about 1.5G + <braunr> the rest is probably the ruby runtime + <gg0> youpi tried to create 10000 thread, no problem. he guessed something + wrong on ruby side + <gg0> indeed on ruby2.0 such test succeeds + <braunr> you can't create 10k threads unless you change the stack size + <braunr> hurd servers use a stack size of 64k by default which allows them + to go up to 30k iirc + <braunr> but normal applications use the default 2M + <gg0> i guess you mean 10000 threads active at the same time. 
test in + question should make them die after simply setting x to false, i guess + youpi's test did so as well + <braunr> no + <braunr> it's about stacks + <braunr> hm + <braunr> yes at the same time but + <braunr> thread recycling is known to be buggy + <braunr> which is what i'm currently fixing btw + <neal> what's the bug? + <braunr> neal: there are several subtle issues + <braunr> for example, joining a thread that is also calling pthread_exit + can fail badly + <neal> hmm + <neal> good that you are on it then :) + <braunr> or detaching + <braunr> i don't remember the details + <braunr> but i remember such problems + <braunr> apparently, keeping the stack of the main thread isn't enough + <braunr> :( + <braunr> for now, i'll keep the entire thread + + +## IRC, freenode, #hurd, 2013-11-15 + + <gg0> i wasn't doing anything, just some single test runs. but yes, also + that one which creates hundreds of threads + <gg0> it would like creating 10000 but goes out of memory after ~720 + <gg0> btw same tests succeed on ruby2.0, so they should be fixed by + backporting some changes + <braunr> actually it looks more like a deadlock .. + <gg0> deadlock that says ENOMEM? + <braunr> ? + <braunr> ENOMEM is returned because the test task has no more virtual + memory + <braunr> this doesn't mean the rest of the system should fail + <gg0> ok i thought you were talking about such test + <braunr> no it's something else + <braunr> a deadlock in a critical server + <braunr> the root file system maybe + <gg0> braunr: htop and ps hang. 
just run the test once again + <gg0> now you should still be able to login + <braunr> htop/ps hanging means one process is unable to reply to queries + sent to the message port/thread + <braunr> procfs does that to report on what a process is waiting + <braunr> it usually mean there is a bug around signals, since the message + thread is also in charge of delivering signals + <braunr> use ps -eM + <braunr> and kill -KILL + <braunr> hum + <braunr> root 954 S<o 0:00.05 /hurd/crash --dump-core + <braunr> dumping cores is known not to work most of the time + <braunr> exodar shouldn't be configured like that + <braunr> so yes, the crash server is hanging + <braunr> gg0: i've set it to crash --kill and killed the hanging crash + instances blocking top/ps + <gg0> nice + + <braunr> my thread destruction patch and tls are indeed conflicting a bit + <braunr> i suspect the tcb is used after being freed + <braunr> i think i'll simply recycle the tcb, along with the pthread + structs + <braunr> ok i think it's fine now + <braunr> there was also a small bug in the tls code, keeping a reference on + the thread port + <braunr> mach reference counting is so counter intuitive :/ + <braunr> well, error-prone + + <braunr> argh, more bugs in libc :( + <teythoon> :/ + <teythoon> but don't worry, there is always one more bug ;) + <braunr> this one might explain crashes that are long to trigger + <braunr> _hurd_self_sigstate() is implemented like this : + _hurd_thread_sigstate (__mach_thread_self ()); + <braunr> it leaks a reference on the current thread each time it's called + <teythoon> >,< + <braunr> but glibc maintains such references, so if the maximum value is + reached, and references are dropped, the value can reach 0 + <teythoon> ouch + <braunr> at which point any call on a thread will result in an invalid send + right + <braunr> and probably an assertion + <teythoon> well it's a good thing then that you found it :) + <braunr> i think it's always been there + <braunr> but it's 
more apparent since jknoenig's patch on signal + dispositions + <braunr> the maximum number of user references in mach is 64k + <braunr> this right leak isn't easy + <braunr> tls is very tricky heh :) + <braunr> for the main thread, tls initialization happens after the thread + creation, obviously + <braunr> but for other threads, it's initialized before starting them + <braunr> the leak was probably an overlook caused by that complexity + <braunr> teythoon: actually that leak i mentioned in _hurd_self_sigstate + has only been recently added in Convert sigstate to TLS + <braunr> so it's merely tls integration polishing + <braunr> youpi: i'm currently reviewing changes related to tls and i think + there is a bug in _hurd_self_sigstate + <braunr> calls to mach_thread_self() should be paired with + mach_port_deallocate to avoid urefs overflows + <braunr> and right leaks + <braunr> _hurd_critical_section_lock is probably affected too + <braunr> hm + <braunr> mhmm + <braunr> in glibc, hurd/hurd/signal.h, _hurd_critical_section_lock + <braunr> why is the sigstate unlocked after the call to + _hurd_thread_sigstate + <braunr> _hurd_thread_sigstate doesn't seem to lock it .. + <braunr> unless __spin_lock_init does it + <braunr> yes, leak solved :) + + +## IRC, freenode, #hurd, 2013-11-16 + + <braunr> argh, _hurd_critical_section_lock is called before the send right + on the main thread is fetched in libpthread :/ + <teythoon> is that bad ? 
+ <braunr> the sigstate is supposed to be initialized after pthreads + <braunr> _hurd_critical_section_lock will create it if it sees there is + none + <braunr> creating the sigstate is currently what makes the send right leak + <teythoon> ok + <teythoon> it's bad then + <braunr> it may be due to my patch + <braunr> _hurd_critical_section_lock is called during pthreads + initializatio + <braunr> n + <braunr> before the sigstate for the main thread is created, but after the + pthread init routine is called + <braunr> it does indeed look like the code wasn't written with thread being + destroyed some day in mind :/ + <teythoon> braunr: btw, if you ever feel like benchmarking, sysbench has a + benchmark for threads contending for a lock + <braunr> yes i've used it before + <teythoon> was it useful for this purpose ? + <braunr> no :) + <teythoon> :/ + <braunr> we already know libpthread isn't optimized + <braunr> and felt it when we switched from cthreads + <braunr> humpf + <braunr> simply calling malloc implies a call to + _hurd_critical_section_lock + <braunr> on the other hand, unlike what some glibc comments say, this does + work + + +## IRC, freenode, #hurd, 2013-11-17 + + <braunr> looks like i've fixed all leak issues with thread destruction and + tls :) + <braunr> let's see if ext2fs.static works fine too + <youpi> braunr: \o/ + <youpi> sorry about introducing the tls ones :) + <braunr> no worries, it was expected + <braunr> and tls was really needed :) + <braunr> i mean, i expected to have some problems when rebasing on tls :p + <teythoon> braunr: this is good news, how is your rootfs translator holding + up? + <braunr> building hurd packages right now + <braunr> for now, only test applications and a few really multithreaded + ones (e.g. 
iceweasel) have been tested + <braunr> well, the system boots :) + <teythoon> awesome :) + <braunr> stressing the file system with git while watching youtube videos + with gnash doesn't make the system crash + <teythoon> you can actually watch yt videos on your Hurd box ? + <braunr> yes + <braunr> for a while now + <teythoon> o_O + <braunr> can't you ? + <teythoon> I never even dared to try + <braunr> hehe + <braunr> teythoon: looks stable enough to install on darnassus + + +## IRC, freenode, #hurd, 2013-11-18 + + <teythoon> braunr: wrt to your thread destruction patchset, I thought you + also had to fix the proc server ? + <braunr> teythoon: no + <braunr> the problem was in glibc + <braunr> i may have to fix proc/procfs though, because cpu time gets wrong + with the patch + <braunr> currently, it's the addition of the cpu time of all threads + <braunr> mach provides aggregate times including destroyed threads though + <teythoon> ah, I see + <braunr> one side effect is that you'll see processes sometimes taking 100% + of cpu time although the cpu is unused + <braunr> or the cpu time of a process gets reduced :) + <braunr> i guess the 100% cpu is how top sees a negative increment + <teythoon> ^^ + <braunr> gg0: do my threadterm packages help with ruby1.9 ? + <braunr> i mean, can you test with them some time ? 
:) + + +## IRC, freenode, #hurd, 2013-11-21 + + <braunr> youpi: ping about my question regarding error handling in the + proposed thread_terminate_release call + <youpi> I agree with what Neal said + <braunr> he didn't say anything about error handling + <braunr> see + http://lists.gnu.org/archive/html/bug-hurd/2013-11/msg00181.html + <braunr> i think i should make the call fail on first error + <braunr> it shouldn't happen, so it would merely serve to catch bugs + <braunr> it's not easily recoverable (if it's recoverable at all) + <youpi> uh, I thought he had + <youpi> I must have dreamt + + <braunr> i think i'll go ahead with thread destruction integration + + +## IRC, freenode, #hurd, 2013-11-25 + + <braunr> i've pushed the thread destruction patches for gnumach upstream + <braunr> and made a branch in glibc for that too + <teythoon> awesome :) + <braunr> youpi: i don't remember how glibc changes should be managed + <braunr> once those are applied, i'll commit in libpthread + <youpi> braunr: usually we create a topgit branch, and then we add the + patch from that to the debian repository + + +## IRC, freenode, #hurd, 2013-11-29 + + <braunr> youpi: i still have a leak somewhere with the thread destruction + patches + <braunr> maybe on the host priv port in bootstrap servers (root fs and proc + server) + <braunr> it prevents priority adjusting in libports and can easily bring + down a system because servers can start trashing a lot sooner, as it was + the case during the pthread migration + +See discussion about that on [[/open_issues/libpthread]]. + + <braunr> so i'll hunt it down before merging + + +## IRC, freenode, #hurd, 2013-12-19 + + <braunr> darnassus still has the libports priority adjustement leaks + <braunr> i'll apply a few more patches to my hurd packages + + <braunr> humpf, proc seems to have a problem getting the host priv port :/ + <teythoon> thats bad + <teythoon> what did you do ? 
+ <braunr> i fixed all the leaks in libports when adjusting priorities + <braunr> the last one being releasing the host priv right + <braunr> and i get errors at boot time from the proc server + <teythoon> remember when i had this problem ? + <braunr> proc doesn't get the host priv port the normal way since the + normal way is to get it from proc iirc + <teythoon> ah, thought you fixed that + <braunr> so i guess the alternate way doesn't add a reference + <braunr> well the leak is fixed + <braunr> the problem you had was due to the leak which made the host priv + port reach its max uref value + <braunr> now it's just the proc server + <braunr> the system works fine though + <teythoon> for real ? + <teythoon> the proc server needs the host priv port for getting the new + tasks + <braunr> well yes + <teythoon> how can it work w/o it ? + <braunr> i don't know .. + <braunr> i guess the problem is internal to glibc + <braunr> i mean, get_priv_ports fails, but that doesn't mean the host priv + port is lost + <teythoon> could be + <teythoon> are you running a patched rootfs translator too ? + <braunr> yes + <teythoon> ok + <teythoon> b/c i remember having trouble with that + <braunr> right, the glibc call would make proc call __proc_getprivports + <braunr> hum + <braunr> teythoon: do you remember how proc gets its host priv port ? + <teythoon> from init + <teythoon> i think + <braunr> startup_procinit ? 
+ <teythoon> possibly + <braunr> right + <braunr> so it's probably not the host priv port + <braunr> i mean, the error is about another invalid send right + <braunr> hm nope, it is on host_priv :/ + <braunr> hm ok i see, looks like a bug from a debian patch + <braunr> or rather, a bug fix not yet imported into the debian package + <braunr> teythoon: you actually fixed it in + 2c9422595f41635e2f4f7ef1afb7eece9001feae + <braunr> great :) + <teythoon> ah, that one + <braunr> i was looking at the upstream code and couldn't understand what + was going wrong + <braunr> :) + <braunr> much better + <braunr> except ps -eT doesn't work any more .. + <braunr> interestingly, with the thread destruction patch, ps -eT sometimes + work, and sometimes doesn't + <braunr> the behaviour doesn't seem to change without a reboot + <braunr> and of course, as soon as i say it, i'm proven wrong by the next + test :) + + +## IRC, freenode, #hurd, 2013-12-26 + + <braunr> __pthread_sigstate_init doesn't seem to be converted to TLS in the + upstream repository master branch + + <braunr> ah dammit, the global signal dispositions patch touches both glibc + and libpthread @#! + <braunr> what a mess + + <braunr> youpi: do you have some time to quickly review the + rbraun/thread_destruction branch in libpthread ? + <braunr> there might be conflict with some glibc patches + <braunr> or do you prefer it on the mailing list ? 
+ <braunr> (i used a branch because it's not based on master) + <youpi> rather mail the list, yes + <braunr> ok + <youpi> it'd also be useful to write the rationale + <youpi> probably to be left as comment in the source code + <braunr> yes, that branch was for personal storage :) + <youpi> so the reader knows how things are recycled or not + <braunr> hm + <braunr> that should already be the case + <youpi> ok + <braunr> the two structures that are still recycled are the pthread struct + and tls + <braunr> it's quite obvious from pthread_alloc + <braunr> and well commented there + <braunr> for tls, it's explained in pthread_exit + + <braunr> there, thread destruction finally merged in + <braunr> and now, we can remove the ugly hacks that were done for + threadvars + <braunr> :) + <braunr> change stacks at will and support all sorts of weird languages and + runtimes + <teythoon> braunr: cool :) + + +## IRC, freenode, #hurd, 2013-12-31 + + <youpi1> braunr: I've added sigstate_locking, sigstate_thread_reference and + tls_thread_leak to the debian glibc 2.18 package + <youpi1> I believe that's complete? + <youpi1> is mach_msg_uspace_options ready for being added? Does it bring + much speedup? + <youpi1> AIUI, thread_terminate_release is the union of the branches + mentioned above? + <youpi1> (I'm cleaning up branches in the glibc repo) + <braunr> youpi1: mach_msg_uspace_options can be left over, it only affects + selects and not noticeably + <braunr> yes, those three branches are the only ones needed for thread + destruction + <youpi1> ok + <youpi> does the hurd changes depend on these changes ? + <braunr> no + <youpi> good :) + <braunr> only on tls for one of them + <braunr> (it's about the default stack size of 64k for hurd servers) + <youpi> and we have had this in debian for a long time already :) + <braunr> yes + <youpi> (how big were they before?) + <youpi> (where they a couple MiB, and thus exploding to GiBs on thousands + of threads?) 
+ <braunr> 64k + <braunr> pthread stacks are 2M by default + <braunr> yes + + +## IRC, freenode, #hurd, 2014-01-14 + + <youpi> braunr: it seems your time change in libps made ps produce odd re + <youpi> results + <youpi> samy 10987 5 -514358:-18:-42.17 /hurd/firmlink tmp + <braunr> youpi: wow :) + <braunr> that change is supposed to run on a system where threads actually + get destroyed + <braunr> but i don't see what could trigger this side effect + <youpi> root 8629 664 56 years make -j 3 + <youpi> :) + <braunr> heh + <braunr> youpi: does the hurd package on darnassus include that patch ? + <youpi> yes + <braunr> i don't reproduce the problem :/ + <youpi> err + <braunr> what command are you using ? + <youpi> ps -feM on darnassus + <youpi> root 29642 473 7 months /usr/sbin/sshd -R + <braunr> hmmmm + <braunr> i don't see it with a make -j + <youpi> well, it's not systematic + <youpi> it's like once over two launches + <braunr> hhhhmmmmm + <youpi> it'd look like some random numbers get added + <braunr> strangely, the gcc processes started by a recursive make aren't + children of make .. 
+ <braunr> ps -eF hurd seems to report the correct values + <braunr> even ps -eM + <braunr> oO + <braunr> ps -ef too + <braunr> the problem seems to be with ps -efM + <youpi> too bad I'm always using that :) + <braunr> another way to see it is that it makes us spot the issue ;p + + +### IRC, freenode, #hurd, 2014-01-15 + + <braunr> ok i have an idea of what goes wrong in libps + + <braunr> youpi: for some reason, ps -efM lacks the PSTAT_TASK_BASIC flag + <braunr> my patch is wrong since it doesn't try to determine whether the + stats apply to a task or a thread, but that is easy to fix + <braunr> ps -efM should nonetheless provide basic task info, obviously + <braunr> in addition, the problems i've observed with ps -T (occasional + segfaults) seem to have existed before thread destruction + <braunr> they're just strongly exposed now that the thread list can be + shrunk + + <braunr> libps is quite complicated + <braunr> even hairy, i'd say .. + + +### IRC, freenode, #hurd, 2014-01-16 + + <braunr> youpi: i think i have a proper fix for libps + <braunr> i'll commit it soon + <youpi> ok + <braunr> basically, getting system times simply set the PSTAT_THREAD_BASIC + flag + <braunr> whereas getting the run time of the terminated threads requires + PSTAT_TASK_BASIC + <braunr> i assumed it was always set in the function i changed when dealing + with a task and not a thread + <braunr> and well, that was a wrong assumtion, -M can remove it if not + strictly needed by the format + <braunr> the default format asks for suspend_count, which forces the + retrieval of task basic info, os it works with -eM + <braunr> but -f doesn't :) + <youpi> so extremely bad lucky combination of flags :) + <braunr> indeed + <braunr> i added a pstat_times using the last (!) 
available flag bit + <braunr> looks clean to me + <braunr> i hope there is no abi issue + <braunr> (at least everything works with the unmodified ps-hurd executable + and a new libps.so) + + <braunr> hm, small bug in the thread destruction patch :/ + + +### IRC, freenode, #hurd, 2014-01-17 + + <braunr> good, i have proper fixes for tls in the main thread and thread + termination :) + <teythoon> awesome :) + <teythoon> i've been wondering, what does it take to get the thread + destruction stuff into the debian package ? + <braunr> i still have to build test packages, look for (unlikely, heh) + regressions and work some integration details with samuel + <braunr> hum the main thread tls fixup i guess + <braunr> youpi was waiting for me to fix that + <braunr> gnumach already provides the RPC + <braunr> so it will be in glibc soon + <braunr> i just have to get those last bits right + <braunr> teythoon: i'm quite slow at integrating stuff + <teythoon> and samuel then builds packages ? + <teythoon> i mean, is our libc package build linked to the other libc + packages ? + <braunr> libpthread is applied as a patch to glibc + <braunr> and loaded as a plugin + + +## IRC, freenode, #hurd, 2014-01-17 + + <braunr> uhm, did we break fakeroot-tcp ? + <teythoon> we did ? + <youpi> fakeroot-tcp just works fine on buildds + <braunr> with fakeroot-tcp, i get + <braunr> make[4]: Entering directory + `/home/rbraun/devel/debian/packages/hurd/hurd-0.5.git20140113/libdde-linux26/contrib/include' + <braunr> rm -f .general.d + <braunr> make[4]: *** [cleanall] Killed + <braunr> when cleaning the package before building .. + + +### IRC, freenode, #hurd, 2014-01-18 + + <braunr> damn, fakeroot-tcp won't work on darnassus .. + <braunr> uh, looks like my tls/thread destruction "fixes" do cause + regressions :( + <braunr> fakeroot works fine with debian glibc + <teythoon> which one ? 
+ <teythoon> which fakeroot i mean + <braunr> -tcp + <braunr> yes, it fails as soon as i use the patched glibc :/ + <braunr> at least it's easy to reproduce + + +### IRC, freenode, #hurd, 2014-01-20 + + <braunr> great, 3rd libc version installed on darnassus, let's see if i can + build hurd packages against that + + +### IRC, freenode, #hurd, 2014-01-21 + + <braunr> damn, fakeroot-tcp still crashes with my latest changes .... + + <braunr> darnassus looks in good shape + <braunr> youpi: ^ + <braunr> youpi: if you have other tests, feel free to do them now + <braunr> i feel confident about committing the changes, if you're ok with + it + <youpi> which changes ? + <youpi> I'm a bit lost in what you were talking about :) + <braunr> you can find them in 2 patches in /var/tmp on darnassus + <braunr> one is about fixing thread destruction + <braunr> i'm pretty certain about this one so i'll commit it directly + <braunr> the other is fixing the tcb of the main thread + +[[open_issues/libpthread]]. + + <braunr> where i simply do tcb->self = thread->kernel_thread :) + <braunr> with a comment explaining why i don't do something else like + deallocating the unused tcb + <youpi> braunr: ok, that looks good + <teythoon> braunr: awesome :) + <braunr> youpi: ok + + +### IRC, freenode, #hurd, 2014-01-22 + + <braunr> there, libpthread should be fine now + + +## IRC, freenode, #hurd, 2014-02-06 + + <braunr> youpi: in case you're planning to upgrade glibc (or not), the + thread destruction changes are complete + <braunr> youpi: darnassus has been running them for some weeks with no + visible regression + <youpi> braunr: ok, good + <youpi> including it in glibc was on my todo list indeed + <youpi> and Adam indeed plan for a 2.18 upload + <braunr> good :) + <youpi> braunr: this is up to 7c6dc6e28b2fc4b67934223f41cf080ffe58b230, + right? 
(Wed Jan 22, Fix up the main thread TCB) + <braunr> yes + <braunr> oh, i just saw 2.17-98~0 glibc packages on debian-ports :) + <youpi> yes, it's just to fix the dhcp crash + <braunr> ah yes, it's not 2.18 + <youpi> 2.18 is available in experimental + + <youpi> braunr: just to make sure: did you have + 983b18a6ff16f5687a9ece63a50d1831dec88609 in libc on darnassus? + <youpi> (which drops the stack size hack) + <braunr> youpi: let me check + <braunr> youpi: ah no, i don't, you're right + <youpi> well, I was just wondering, nothing make me think that was the case + :) + <youpi> what was the issue that it was raising btw? + <braunr> threadvards + <youpi> ok, b ut in which case? + <youpi> (to make sure I test that before committing) + <braunr> now that we switched to tls, i would assume the transition path to + be 1/ hurd stops defining that symbol, 2/ libpthread can stop using it + <braunr> the goal was to reduce the stack size of hurd server threads + <youpi> well, that's not my question :) I'm wondering in which precise case + that was breaking things + <braunr> youpi: i don't know, it shouldn't break + <youpi> ok + <braunr> youpi: just in case, don't forget that last one line patch i + committed last night, fakeroot can't work right without it + <braunr> (i made a minor change while reviewing before comitting, and + obviously got it wrong :p) + <youpi> ok + + <youpi> braunr: I've upgraded libpthread in debian's eglibc btw + + <braunr> + /home/rbraun/devel/debian/packages/eglibc/eglibc-2.17/build-tree/hurd-i386-libc/libc.so.phdr: + *** executable stack signaled + <braunr> from build-tree/hurd-i386-libc/elf/check-execstack.out + <braunr> i thought glibc didn't use those + <braunr> anyway it doesn't look to be the regression i'm having + <braunr> does this ring a bell : + <braunr> Encountered regressions that don't match expected failures + (debian/testsuite-checking/expected-results-i486-gnu-libc): + <braunr> test-stpcpy_chk.out, Error 1 + <braunr> TEST 
test-stpcpy_chk.out: __stpcpy_chk normal_stpcpy + simple_stpcpy_chk + <youpi> nope + <youpi> after what are you getting this regression? + <braunr> building glibc 2.17-97 with thread destruction patches, including + the one removing the stack size hack + <braunr> during tests + <braunr> there also are "progressions", but i'm not sure what these are + <youpi> some progressions are just luck, other seem to happen on some + platforms only + <youpi> I'm not sure you want to test 2.17 + <youpi> a lot has changed between 2.17's libpthread and 2.18's libpthread + (which is now equal to cvs's libpthread + <youpi> ) + <youpi> s/cvs/git/ + <braunr> yes + <braunr> i usually build with nocheck + + +## IRC, freenode, #hurd, 2014-02-07 + + <braunr> youpi: on a vm with hurd 1:0.5.git20140203-1, upgrading to a + patched glibc 2.17-97 that includes the patch which reverts the stack + size hack, the system reboots and works fine + <youpi> ok. I don't remember what problem I was seeing + <braunr> that version of the hurd no longer defines the symbol + <braunr> but even then, there shouldn't have been any problem + <braunr> hm, or does it + <braunr> yes, it does + <braunr> youpi: the hurd package patch mentions + <braunr> Revert this for now, will have to wait for dropping the use of + <braunr> __pthread_stack_default_size from eglibc's + libpthread_hurd_cond_wait.diff + <braunr> i wonder how it got there + <youpi> IIRC I was wondering too + <braunr> i've installed my c library on darnassus and it works fine there + too + <braunr> with older (january) hurd packages + <braunr> looks good to me + + +## IRC, freenode, #hurd, 2014-02-10 + + <teythoon> braunr: btw, do the new libc packages contain your thread + destruction work ? + <braunr> teythoon: the -98 ones on experimental ? + <braunr> i don't think they do + <braunr> the -18 ones should do |