author Samuel Thibault <samuel.thibault@ens-lyon.org> 2015-02-18 00:58:35 +0100
committer Samuel Thibault <samuel.thibault@ens-lyon.org> 2015-02-18 00:58:35 +0100
commit 49a086299e047b18280457b654790ef4a2e5abfa
tree c2b29e0734d560ce4f58c6945390650b5cac8a1b /open_issues/libpthread
parent e2b3602ea241cd0f6bc3db88bf055bee459028b6
Revert "rename open_issues.mdwn to service_solahart_jakarta_selatan__082122541663.mdwn"
This reverts commit 95878586ec7611791f4001a4ee17abf943fae3c1.
Diffstat (limited to 'open_issues/libpthread')
-rw-r--r-- open_issues/libpthread/t/fix_have_kernel_resources.mdwn 1301
1 files changed, 1301 insertions, 0 deletions
diff --git a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn
new file mode 100644
index 00000000..02b6ab05
--- /dev/null
+++ b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn
@@ -0,0 +1,1301 @@
+[[!meta copyright="Copyright © 2012, 2013, 2014 Free Software Foundation,
+Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_libpthread]]
+
+`t/fix_have_kernel_resources`
+
+Address problem mentioned in [[/libpthread]], *Threads' Death*.
+
+
+# IRC, freenode, #hurd, 2012-08-30
+
+ <braunr> tschwinge: this issue needs more cooperation with the kernel
+ <braunr> tschwinge: i.e. the ability to tell the kernel where the stack is,
+ so it's unmapped when the thread dies
+ <braunr> without requiring another thread to perform this deallocation
+
+
+## IRC, freenode, #hurd, 2013-05-09
+
+ <bddebian> braunr: Speaking of which, didn't you say you had another "easy"
+ task?
+ <braunr> bddebian: make a system call that both terminates a thread and
+ releases memory
+ <braunr> (the memory released being the thread stack)
+ <braunr> this way, a thread can completely terminate itself without the
+ assistance of a managing thread or deferring work
+ <bddebian> braunr: That's "easy" ? :)
+ <braunr> bddebian: since it's just a thread_terminate+vm_deallocate, it is
+ <braunr> something like thread_terminate_self
+ <bddebian> But a syscall not an RPC right?
+ <braunr> in hurd terminology, we don't make the distinction
+ <braunr> the only real syscalls are mach_msg (obviously) and some to get
+ well known port rights
+ <braunr> e.g. mach_task_self
+ <braunr> everything else should be an RPC but could be a system call for
+ performance
+ <braunr> since mach was designed to support clusters, it was necessary that
+ anything not strictly machine-local was an RPC
+ <braunr> and it also helps emulation a lot
+ <braunr> so keep doing RPCs :p
+
+
+## IRC, freenode, #hurd, 2013-05-10
+
+ <braunr> i'm not sure it should only apply to self though
+ <braunr> youpi: can we get a quick opinion on this please ?
+ <braunr> i've suggested bddebian to work on a new RPC that both terminates
+ a thread and releases its stack to help fix libpthread
+ <braunr> and initially, i thought of it as operating only on the calling
+ thread
+ <braunr> do you see any reason to make it work on any thread ?
+ <braunr> (e.g. a real thread_terminate + vm_deallocate)
+ <braunr> (or any reason not to)
+ <youpi> thread stack deallocation is always a burden indeed
+ <youpi> I'd tend to think it'd be useful, but perhaps ask the list
+
+
+## IRC, freenode, #hurd, 2013-06-26
+
+ <braunr> looks like there is a port right leak in libpthread
+ <braunr> grmbl, the port leak seems to come from mach_port_destroy being
+ buggy :/
+ <braunr> hum, apparently we're not the only ones to suffer from port leaks
+ wrt mach_port_destroy
+ <braunr> ew, libpthread is leaking
+ <pinotree> memory or ports?
+ <braunr> both
+ <pinotree> sounds great ;)
+ <braunr> as it is, libpthread doesn't destroy threads
+ <braunr> it queues them so they're recycled late
+ <braunr> r
+ <braunr> but there is confusion between the thread structure itself and its
+ internal resources
+ <braunr> i.e. there is pthread_alloc which allocates a thread structure,
+ and pthread_create which allocates everything else
+ <braunr> but on pthread_exit, nothing is destroyed
+ <braunr> when a thread structure is reused, its internal resources are
+ replaced by new instances
+ <pinotree> oh
+ <braunr> it's ok for joinable threads but most of our threads are detached
+ <braunr> pinotree: as expected, it's bigger than expected :p
+ <braunr> so i won't be able to write a quick fix
+ <braunr> the true way to fix this is make it possible for threads to free
+ their own resources
+ <braunr> let's do that :p
+ <braunr> ok, got the new thread termination function, i'll build eglibc
+ package providing it, then experiment with libpthread
+ <pinotree> braunr: iirc there's also a tschwinge patch in the debian eglibc
+ about that
+ <braunr> ah
+ <pinotree> libpthread_fix.diff
+ <braunr> i see
+ <braunr> thanks for the notice
+ <braunr> bddebian:
+ http://www.sceen.net/~rbraun/0001-thread_terminate_deallocate.patch
+ <braunr> bddebian: this is what it looks like
+ <braunr> see, short and easy
+ <bddebian> Aye but didn't youpi say not to bother with it??
+ <braunr> he did ?
+ <braunr> i don't remember
+ <bddebian> I thought that was the implication. Or maybe that was the one I
+ already did!?
+ <braunr> i'd be interested in reading that
+ <braunr> anyway, there still are problems in libpthread, and this call is
+ one building block to fix some of them
+ <braunr> some important ones
+ <braunr> (big leaks)
+
+
+## IRC, freenode, #hurd, 2013-06-29
+
+ <braunr> damn, i fix leaks in libpthread, only to find out leaks somewhere
+ else :(
+ <braunr> bddebian: ok, actually it was a bit more complicated than what i
+ showed you
+ <braunr> because in addition to the stack, the call must also release the
+ send right in the caller's ipc space
+ <braunr> (it can't be released before since there would be no means to
+ reference the thread to destroy)
+ <braunr> or perhaps it should strictly be reserved to self termination
+ <braunr> hmm
+ <braunr> yes it would probably be simpler
+ <braunr> but it should be a decent compromise
+ <braunr> i'm close to having a libpthread that doesn't leak anything
+ <braunr> and that properly destroys threads and their resources
+
+
+## IRC, freenode, #hurd, 2013-06-30
+
+ <braunr> bddebian: ok, it was even more tricky, because the kernel would
+ save the return value on the user stack (which is released by the call
+ and then invalid) before checking for asynchronous software traps (ASTs,
+ a kind of software interrupts in mach), and terminating the calling
+ thread is done by a deferred AST ... :)
+ <braunr> hmm, making threads able to terminate themselves makes rpctrace a
+ bit useless :/
+ <braunr> well, more restricted
+
+ <braunr> ok so, tough question :
+ <braunr> i have a small test program that creates a thread, and inspects its
+ state before any thread dies
+ <braunr> i can see msg_report_wait requests when using ps
+ <braunr> (one per thread)
+ <braunr> one of these requests creates a new receive right, apparently for
+ the second thread in the test program
+ <braunr> each time i use ps, i can see the sequence numbers of two receive
+ rights increase
+ <braunr> i guess these rights are related to proc and signal handling per
+ thread
+ <braunr> but i can't find what creates them
+ <braunr> does anyone know ?
+ <braunr> tschwing_: ^ :)
+
+ <braunr> again, too many things wrong elsewhere to cleanly destroy threads
+ ..
+ <braunr> something is deeply wrong with controlling terminals ..
+
+
+## IRC, freenode, #hurd, 2013-07-01
+
+ <braunr> youpi: if you happen to notice what receive right is created for
+ each thread (beyond the obvious port used for blocking and waking up),
+ please let me know
+ <braunr> it's the only port leak i have with thread destruction
+ <braunr> and i think it's related to the proc server since i see the
+ sequence number increase every time i use ps
+
+ <braunr> pinotree: my change doesn't fix all the pthread leaks but it's a
+ lot better
+ <braunr> bddebian: i've spent almost the whole week end trying to find the
+ last port leak without success
+ <braunr> there is some weird bug related to the controlling tty that hits
+ me every time i try to change something
+ <braunr> it's the same bug that prevents ttys from being correctly closed
+ when using ssh or screen
+ <braunr> well maybe not the same, but it's close
+ <braunr> some stale receive right kept around for no apparent reason
+ <braunr> and i can't find its source
+
+
+## IRC, freenode, #hurd, 2013-07-02
+
+ <braunr> and btw, i don't think i can make my libpthread patch work
+ <braunr> i'll just aim at avoiding leaks, but destroying threads and their
+ related resources depends on other changes i don't clearly see
+
+
+## IRC, freenode, #hurd, 2013-07-03
+
+ <braunr> grmbl, i don't want to give up thread destruction ..
+
+
+## IRC, freenode, #hurd, 2013-07-15
+
+ <braunr> btw, my work on thread destruction is currently stalled
+ <braunr> i don't have much free time right now
+
+
+## IRC, freenode, #hurd, 2013-09-13
+
+ <braunr> i think i know why my thread_terminate_deallocate patches leak one
+ receive port :>
+ <braunr> but now i'm not sure of the proper solution
+ <braunr> every time a thread is created and destroyed, a receive right is
+ leaked
+ <braunr> i guess it's simply the reply port ..
+ <braunr> grmbl
+ <braunr> i guess i have to make it a simpleroutine ...
+ <braunr> hm too bad, it's not the reply port :(
+ <braunr> it's also leaking some memory
+ <braunr> it doesn't seem related to my changes though
+ <braunr> stacks, rights, and threads are correctly destroyed
+ <braunr> some obscure state is left behind
+ <braunr> i wonder how exception ports are dealt with
+ <braunr> vminfo seems to confirm memory is leaking in the heap
+ <braunr> humpf
+ <braunr> oh silly me
+ <braunr> i don't detach threads
+ <teythoon> well, detach them ;)
+ <braunr> hm worse :p
+ <braunr> now i get additional dead names
+ <braunr> but it's a step forward
+
+
+## IRC, freenode, #hurd, 2013-09-16
+
+ <braunr> that thread port leak is so strange
+ <braunr> the leaked port seems to be created when the new thread starts
+ running
+ <braunr> so it looks like a port the kernel would implicitly create
+ <braunr> hm could it be a thread-specific reply port ?
+ <youpi> ah, yes, there is one of those
+ <braunr> how come mach/mig-reply.c in glibc isn't thread-safe ?
+ <youpi> it is overridden by sysdeps/mach/hurd/mig-reply.c I guess
+ <youpi> which uses a threadvar for the mig reply port
+ <braunr> oh
+ <youpi> talking of which, there is also last_value in
+ sysdeps/mach/strerror_l.c
+ <youpi> strerror_thread_freeres is supposed to get called, but who knows
+ <braunr> it does look to be that port
+ <youpi> iirc that's the issue which prevents us from making threads exit on
+ idleness?
+ <braunr> one of them
+ <youpi> ok
+ <braunr> maybe the only one, yes
+ <braunr> i see memory leaks but they could be related/normal
+ <braunr> (i.e. not actual leaks)
+ <braunr> on the other hand, i also can't boot a hurd with my patch
+ <braunr> but i consider removing such leaks a priority
+ <braunr> does anyone know the semantic difference between
+ __mig_put_reply_port and __mig_dealloc_reply_port ?
+ <braunr> i guess __mig_dealloc_reply_port is actually a destruction
+ operation, right ?
+ <youpi> AIUI, dealloc is used when one wants the port not to be reused at
+ all
+ <youpi> because it has been used as a reference for something, and can
+ still be currently in use
+ <youpi> while put_reply would be when we're really done with it, and won't
+ use it again, and can thus be used as such
+ <youpi> or at least something like that
+ <braunr> heh
+ <braunr> __mig_dealloc_reply_port calls __mach_port_mod_refs, which is a
+ RPC, and creates a new reply port when destroying the current one
+ <youpi> bah
+ <youpi> that's fine, it's a deref of the old port, which is not in the
+ reply_port variable any more
+ <braunr> it's fine, but still a leak
+ <youpi> well, dealloc does not completely deallocate, yes
+ <braunr> that's not really the problem here
+ <braunr> i've introduced a case that wasn't considered at the time, namely
+ that a thread can destroy itself
+ <youpi> we probably need another function to be called from the thread exit
+ <braunr> i'll simply try with mach_port_destroy
+ <braunr> mach_port_destroy seems to be a RPC too ...
+ <braunr> grmbl
+ <youpi> isn't there a trap version somehow ?
+ <braunr> not in libc
+ <youpi> erf
+ <braunr> at least i know what's wrong now :)
+ <braunr> there still is a small memory leak i have to investigate
+ <braunr> but outside the stack
+ <braunr> the stack, the thread name and the thread are correctly destroyed
+ <braunr> slabinfo confirms only one port leak and nothing else is leaked
+ <braunr> ok so the port leak was indeed the thread-specific reply port,
+ taken care of
+ <braunr> there are also memory leaks too
+
+
+## IRC, freenode, #hurd, 2013-09-17
+
+ <braunr> teythoon: on my side, i'm getting to know our threading
+ implementation better
+ <braunr> getting close to clean thread destruction
+ <braunr> x15 ipc will hide reply ports ;p
+ <braunr> memory leaks solved \o/
+ <braunr> now, have to fix memory release when joining
+ <braunr> proper reference counting on detach/join/exit, let's see how it
+ goes ..
+ <braunr> seems to work fine
+
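The "proper reference counting on detach/join/exit" mentioned above can be pictured with a small toy model. This is only a sketch under assumptions: `struct toy_thread` and the `toy_thread_*` functions are hypothetical names, not libpthread's real structures or API, and the model is single-threaded (a real implementation would need atomic operations or a lock around the counter). The idea shown is that a descriptor starts with two references, one dropped by the exiting thread and one dropped by whoever joins or detaches it, so the descriptor is released exactly once whichever happens last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical thread descriptor; not libpthread's real structure. */
struct toy_thread {
    int refs;      /* outstanding references to this descriptor */
    bool *freed;   /* set to true when the descriptor is released */
};

/* A new descriptor starts with two references: one owned by the thread
   itself, one owned by whoever will later join or detach it. */
struct toy_thread *toy_thread_create(bool *freed)
{
    struct toy_thread *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->refs = 2;
    t->freed = freed;
    *freed = false;
    return t;
}

/* Drop one reference; whoever drops the last one frees the descriptor. */
static void toy_thread_unref(struct toy_thread *t)
{
    assert(t->refs > 0);
    if (--t->refs == 0) {
        *t->freed = true;
        free(t);
    }
}

/* Called by the thread itself when it exits. */
void toy_thread_exit(struct toy_thread *t)
{
    toy_thread_unref(t);
}

/* Called by another thread; exactly one of join or detach per thread. */
void toy_thread_join(struct toy_thread *t)
{
    toy_thread_unref(t);
}

void toy_thread_detach(struct toy_thread *t)
{
    toy_thread_unref(t);
}
```

In this model the order does not matter: detach followed by exit, or exit followed by join, both release the descriptor on the second call and never before.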
+
+## IRC, freenode, #hurd, 2013-09-18
+
+ <braunr> ok i'll soon have gnumach and libc packages including proper
+ thread destruction :>
+ <teythoon> braunr: why did you have to touch gnumach?
+ <braunr> to add a call allowing threads to release ports and memory
+ <braunr> i.e. their last self reference, their reply port and their stack
+ <braunr> let me publish my current patches
+ <teythoon> braunr: thread_commit_suicide ?
+ <braunr> hehe
+ <braunr> initially thread_terminate_self but
+ <braunr> it can be used by other threads too
+ <braunr> so i named it thread_terminate_release
+ <braunr> http://darnassus.sceen.net/~rbraun/0001-pthread_thread_halt.patch
+ <braunr>
+ http://darnassus.sceen.net/~rbraun/0001-thread_terminate_release.patch
+ <braunr> the pthread patch needs to be polished because it changes the
+ semantics of pthread_thread_halt
+ <braunr> but other than that, it should be complete
+ <pinotree> pthread_thread_halt_reallyhalt
+ <braunr> ok let's try these libc packages
+ <braunr> old static ext2fs for the root, but other than that, it boots
+ <braunr> let's try iceweasel
+ <braunr> (i'll need to build a hurd package against this new libc, removing
+ the libports_stability patch which prevents thread destruction in servers
+ on the way)
+ <teythoon> prevents thread destruction o_O
+ <braunr> yes
+ <braunr> in libports only ;p
+ <teythoon> oh, *only* in libports, I assumed for a moment that it affected
+ almost every component of the Hurd...
+ <teythoon> *phew*
+ <braunr> ... :)
+ <braunr> that's why, after a burst of messages, say because of aptitude
+ (select), you may see a few hundred threads still hanging around
+ <braunr> also why unused servers remain running even after several minutes,
+ where the normal timeout is 2mins
+ <teythoon> I wondered about that, some servers (symlink comes to mind) seem
+ to go away if unused (or that's how I read the code)
+ <braunr> symlinks are usually not servers, since most of them actually
+ exist in file systems, and are implemented through an optimization
+ <teythoon> yes I know that
+ <teythoon> trans/symlink.c reads:
+ <teythoon> /* The timeout here is 10 minutes */
+ <teythoon> err = mach_msg_server_timeout (fsys_server, 0, control,
+ <teythoon> MACH_RCV_TIMEOUT, 1000 * 60 * 10);
+ <teythoon> if (err == MACH_RCV_TIMED_OUT)
+ <teythoon> exit (0);
+ <braunr> ok
+ <teythoon> hm, /hurd/symlink doesn't feel at all like a symlink... but
+ works like one
+ <braunr> well, starting iceweasel makes X on my host freeze oO
+ <braunr> bbl
+ <teythoon> /hurd/symlink translators do go away after being unused for 10
+ minutes... this is funny if they are set up by hand instead of being
+ started from a passive translator record
+ <teythoon> magically vanishing symlinks ;)
+
+
+## IRC, freenode, #hurd, 2013-09-19
+
+ <braunr> hum, i can't rebuild a hurd package :(
+ <teythoon> braunr: with your thread destruction patches in libc?
+ <braunr> yes but it's unrelated
+ <braunr> In file included from ../../libdiskfs/boot-start.c:38:0:
+ <braunr> ./fsys_reply_U.h:173:15: error: conflicting types for
+ ‘fsys_get_children’
+ <braunr> i didn't see a new libc debian release
+ <teythoon> hm, David reported that as well
+ <teythoon>
+ id:CAEvUa7=QzOiS41G5Vq8k4AiaN10jAPm+CL_205OHJnL0xpJXbw@mail.gmail.com
+ <teythoon> uh oh
+ <teythoon> it seems I didn't add a _reply suffix to the reply routines :/
+ <teythoon> there's quite a bit of fallout from my patches, I kinda feel bad
+ :(
+ <braunr> teythoon: what i'm wondering is what youpi did too, since he got
+ hurd binary packages
+ <teythoon> braunr: well neither he nor I noticed that b/c for us the
+ declarations were just missing
+ <braunr> from libc you mean ?
+ <braunr> or hum gnumach-common ?
+ <teythoon> not sure actually
+ <braunr> no it's not a gnumach thing
+ <braunr> hurd-dev then
+ <teythoon> the build system should have caught these, or mig...
+ <braunr> also, i see you changed fsys_reply.defs, but nothing about
+ fsys_request.defs
+ <teythoon> I have no fsys_requests.defs
+ <braunr> looks like there was no fsys_request.defs in the first place
+ ... *sigh*
+ <braunr> do you know an application that often creates and destroys threads
+ ?
+ <teythoon> no, sorry
+ <pinotree> maybe some test suite
+ <braunr> ah right
+ <braunr> sysbench maybe
+ <braunr> also, i've been hit by a lot more network deadlocks than usual
+ lately
+ <braunr> fixing netdde has gained some priority in my todo list
+
+
+## IRC, freenode, #hurd, 2013-09-20
+
+ <braunr> oh, git is multithreaded
+ <braunr> great
+ <braunr> so i've actually tested my libpthread patch quite a lot
+
+
+## IRC, freenode, #hurd, 2013-09-25
+
+ <braunr> on a side note, i was able to build gnumach/libc/hurd packages
+ with thread destruction
+ <teythoon> nice :)
+ <braunr> they boot and work mostly fine, although they add their own issues
+ <braunr> e.g. the comm field of the root ext2fs is empty
+ <braunr> ps crashes when trying to display threads
+ <braunr> but thread destruction actually works, i.e. servers (those that
+ are configured that way at least) go away after some time, and even
+ heavily used servers such as ext2fs dynamically scale over time :)
+
+
+## IRC, freenode, #hurd, 2013-10-10
+
+ <braunr> concerning threads, i think i figured out the last bugs i had with
+ thread destruction
+ <braunr> it should be well on its way to be merged by the end of the year
+
+
+## IRC, freenode, #hurd, 2013-10-11
+
+ <gg0> braunr: is your thread destruction patch ready for testing?
+ <braunr> gg0: there are packages at my repository, yes
+ <braunr> but i still have hurd fixes to do before i polish it
+ <braunr> in particular, posix says returning from main() stops the entire
+ process and all other threads
+ <braunr> i didn't check that during the switch to pthreads, and ext2fs (and
+ maybe others) actually return from main but expect other threads to live
+ on
+ <braunr> this creates problems when the main thread is actually destroyed,
+ but not the process
+ <teythoon> braunr: tmpfs does something like that, but calls pthread_exit
+ at the end of main
+ <braunr> same effect
+ <braunr> this was fine with cthreads, but must be changed with pthreads
+ <braunr> and libpthread must be fixed to enforce it
+ <braunr> (or libc)
+
+ <braunr> diskfs_startup_diskfs should probably be changed to reuse the main
+ thread instead of returning
+
+
+## IRC, freenode, #hurd, 2013-10-19
+
+ <zacts> I know what threads are, but what is 'thread destruction'?
+ <braunr> the hurd currently never destroys individual threads
+ <braunr> they're destroyed when tasks are destroyed
+ <braunr> if the number of threads in a task peaks at a high number, say
+ thousands of them, they'll remain until the task is terminated
+ <braunr> such tasks are usually file systems, normally never restarted (and
+ in the case of the root file system, not restartable)
+ <braunr> this results in a form of leak
+ <braunr> another effect of this leak is that servers which should go away
+ because of inactivity still remain
+ <braunr> since thread destruction doesn't actually work, the debian package
+ uses a patch to prevent worker threads from timing out
+ <braunr> and to finish with, since thread destruction actually doesn't
+ work, normal (unpatched) applications that destroy threads are certainly
+ failing bad
+ <braunr> i just need to polish a few things, wait for youpi to finish his
+ work on TLS to resolve conflicts, and that will be all
+
+
+## IRC, freenode, #hurd, 2013-10-30
+
+ <braunr> FYI, the packages on my repository enable actual thread
+ destruction, and i've altered the libports_stability.patch
+ <braunr> it nows only sets the global timeout to 0
+ <braunr> now*
+ <braunr> we actually can't let translator "die" on global timeout because
+ of a race issue
+ <braunr> tested for about two weeks now and no major problem sighted
+ <braunr> top reports processes running for 100% of their time when
+ terminating threads, but i expect it's simply mach/proc aggregating their
+ run time to the task
+ <braunr> 100% of cpu time
+
+
+## IRC, freenode, #hurd, 2013-11-08
+
+ <braunr> teythoon: darnassus is currently running a modified glibc with
+ thread destruction, yes
+ <teythoon> braunr: did that require any fixups in Hurd that I'd have missed
+ ?
+ <braunr> no
+ <braunr> well
+ <teythoon> b/c the resulting hurd package would not boot
+ <braunr> actually yes
+ <braunr> one
+ <braunr> i'll push the patch somewhere
+ <teythoon> iirc the mach-defpager spewed some error and /hurd/init failed
+ to bootstrap the system
+ <braunr> teythoon:
+ http://darnassus.sceen.net/~rbraun/0001-Prevent-diskfs-translators-from-destroying-main-thre.patch
+ <braunr> make sure you have the proper gnumach packages too :p
+ <teythoon> well, that could very well account for my trouble ;)
+ <teythoon> uh
+ <teythoon> well
+ <braunr> gnumach implements thread destruction, glibc uses it, hurd makes
+ sure it doesn't exit from main
+
+
+## IRC, freenode, #hurd, 2013-11-12
+
+ <braunr> ok so, calling pthread_exit() from main isn't the same as
+ returning from main()
+ <braunr> unlike what some man pages seem to say
+ <braunr> so losing task info when destroying the main thread is actually a
+ proc bug
+ <braunr> ugh
+ <teythoon> ^^
+ <braunr> or a glibc one
+ <teythoon> the proc server, your favorite Hurd component...
+ <braunr> :)
+ <braunr> hm :/
+ <braunr> looks like command line arguments are stored on the stack of the
+ main thread
+ <braunr> and proc merely receives the addresses of those in the target task
+ <neal> why not just keep the main thread around?
+ <neal> it represents a minor resource leak, true
+ <braunr> yes
+ <braunr> that's the hack i suggested
+ <neal> but it is relatively small
+ <braunr> well no
+ <braunr> my hack was about diskfs translators
+ <braunr> it should be generalized in libpthread
+ <braunr> seems reasonable
+ <braunr> let's do it >)
+
+
+## IRC, freenode, #hurd, 2013-11-13
+
+ <youpi> braunr: there is a thread destruction issue in the experimental
+ ocaml build, worth looking at, probably
+ <braunr> what do you mean ?
+ <youpi> ... testing 'testfork.ml': ocamlcocamlrun:
+ ../libpthread/sysdeps/mach/pt-thread-halt.c:51: __pthread_thread_halt:
+ Unexpected error: (ipc/send) invalid destination port.
+ <youpi> during the experimental ocaml build
+ <braunr> well yes
+ <braunr> thread recycling is buggy
+ <braunr> i had the choice to fix it, or implement true destruction
+ <braunr> i'm tweaking my patch so it leaves the main thread stack untouched
+ on destruction
+ <braunr> and it should be ready
+ <braunr> for review at least
+
+
+## IRC, OFTC, #debian-hurd, 2013-11-13
+
+ <gg0> ironforge out of memory during ruby1.9.1 rebuild. during test which
+ creates 10000 threads
+ <gg0> ironforge out of memory during ruby1.9.1 rebuild, test which creates
+ 10000 threads
+ <gg0> i guess ironforge kernel has been rebuilt against -95, correct?
+ <youpi> err, what kernel?
+ <gg0> 23:37 < youpi> hurd needs a rebuild to be able to work with the newer
+ eglibc
+ <gg0> i mean hurd
+ <youpi> yes, libc0.3 breaks the old packages anyway
+ <gg0> wrt ENOMEM, was it expected?
+ <gg0> wrt disk problems, aren't there on alioth only?
+ <youpi> well 10,000 threads is a lot, especially on 32bit machine with 2M
+ default stack size
+ <youpi> that makes 20GiB stacks
+ <youpi> can't fit in a 2/2 split model, which gnumach uses
+ <gg0> well, though active thread should die right away, just after set x to
+ false, if i read it correctly
+ <youpi> perhaps the stacks are not correctly reused
+ <youpi> that's probably worth digging in libpthread
+ <youpi> by putting printfs, etc.
+ <youpi> it seems stacks are never reused indeed, damn
+ <youpi> I just wrote a small test that creates threads which just print
+ their stack address
+ <youpi> that takes just a few minutes to do
+ <gg0> i see. about reusage i guess you mean base address is kindof always
+ incremented
+ * gg0 likes being wrong
+ <youpi> that's it, yes
+ <youpi> gg0: take care, by keeping being wrong all the time, sometimes you
+ get right ;)
+ <youpi> and you are definitely right here :)
+ <youpi> Mmm, but the stack is really deallocated
+ <youpi> and the numbers wrap around
+ <youpi> I wonder how that is :)
+ <youpi> ok, creating 20 000 threads does work
+ <youpi> perhaps ruby does odd things which makes it not work
+
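A test along the lines of the one youpi describes above (threads that just report their stack address) could look like the sketch below. It is not his actual program, which is not shown in the log. The function names are made up for illustration, and the sketch assumes each thread is joined before the next one is created, so that a freed stack has a chance to be reused. It uses only standard POSIX calls (`pthread_create`, `pthread_join`).

```c
#include <inttypes.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Thread body: report the address of a local variable, which lives on
   this thread's own stack. */
static void *report_stack_address(void *arg)
{
    int marker = 0;
    *(uintptr_t *)arg = (uintptr_t)&marker;
    return NULL;
}

/* Create `count` threads one after another, joining each one before the
   next is started. Returns how many threads reported a stack address
   different from the previous thread's, or -1 if a pthread call failed.
   A result of 1 means every thread after the first reused the same stack
   location; a result equal to `count` means no two consecutive threads
   shared one. */
int count_stack_address_changes(int count, int verbose)
{
    uintptr_t previous = 0;
    int changes = 0;

    for (int i = 0; i < count; i++) {
        pthread_t thread;
        uintptr_t address = 0;

        if (pthread_create(&thread, NULL, report_stack_address, &address) != 0)
            return -1;
        if (pthread_join(thread, NULL) != 0)
            return -1;

        if (verbose)
            printf("thread %d: stack near 0x%" PRIxPTR "\n", i, address);
        if (address != previous)
            changes++;
        previous = address;
    }
    return changes;
}
```

Calling `count_stack_address_changes(20000, 1)` from a small `main` prints one line per thread. Whether the addresses repeat, increase steadily, or wrap around depends on the threading library and the kernel, which is exactly what such a test is meant to reveal.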
+
+### IRC, OFTC, #debian-hurd, 2013-11-14
+
+ <gg0> UID PID PPID TH MSGI MSGO SZ RSS SC STAT TIME COMMAND
+ <gg0> 1012 16446 15473 720 987 509 1.89G 23.6M 1 Hu 0:00.15
+ /home/gg0-guest/ruby/ruby1.9.git/ruby1.9.1
+ -I/home/gg0-guest/ruby/ruby1.9.git/lib -W0 bootstraptest.tmp.rb
+ <gg0> 720 threads, stuck
+ <youpi> 2G SZ is very big :)
+ <gg0> 00:42 < youpi> perhaps ruby does odd things which makes it not work
+ <gg0> is that enough to file a ruby bug? as ruby suggests itself btw
+ <youpi> no, they will probably not be able to investigate
+ <youpi> but you can already check out how they create threads
+ <youpi> and try to reproduce the same with a small C program
+ <gg0> ehm on ruby2.0 with *context _enabled_ i can not reproduce it
+
+See [[/open_issues/glibc]] for `*context` functions.
+
+
+## IRC, freenode, #hurd, 2013-11-14
+
+ <braunr> nice, i got glibc packages with thread destruction
+ <braunr> building hurd packages against it now
+ <braunr> everything seems fine
+ <braunr> hurd packages ready, let's see
+
+ <gg0> ruby1.9.1 FTBFS due to a couple of tests
+ https://buildd.debian.org/status/fetch.php?pkg=ruby1.9.1&arch=hurd-i386&ver=1.9.3.448-1&stamp=1384265526
+ <gg0> second one creates 10000 threads and machine got ENOMEM
+ <braunr> bootstraptest.tmp.rb: [BUG] [BUG] pthread_cond_init: Cannot
+ allocate memory (ENOMEM) ew
+ <gg0> few hours ago trying to reproduce it:
+ <gg0> 01:20 < gg0> UID PID PPID TH MSGI MSGO SZ RSS SC STAT
+ TIME COMMAND
+ <gg0> 01:20 < gg0> 1012 16446 15473 720 987 509 1.89G 23.6M 1 Hu
+ 0:00.15 /home/gg0-guest/ruby/ruby1.9.git/ruby1.9.1
+ -I/home/gg0-guest/ruby/ruby1.9.git/lib -W0 bootstraptest.tmp.rb
+ <braunr> yes that's expected
+ <braunr> our stacks are 2M
+ <braunr> 10k threads means right over 2G of stacks
+ <braunr> userspace is restricted to 2G
+ <gg0> but if i read correctly test in question, thread should just set x to
+ false then die
+ <braunr> so ?
+ <gg0> and ENOMEM popped up when the thread count was at 720
+ <braunr> hum
+ <braunr> 10k threads would actually be 20G
+ <braunr> 1k threads is 2G
+ <braunr> 720 is about 1.5G
+ <braunr> the rest is probably the ruby runtime
+ <gg0> youpi tried to create 10000 threads, no problem. he guessed something
+ wrong on ruby side
+ <gg0> indeed on ruby2.0 such test succeeds
+ <braunr> you can't create 10k threads unless you change the stack size
+ <braunr> hurd servers use a stack size of 64k by default which allows them
+ to go up to 30k iirc
+ <braunr> but normal applications use the default 2M
+ <gg0> i guess you mean 10000 threads active at the same time. test in
+ question should make them die after simply setting x to false, i guess
+ youpi's test did so as well
+ <braunr> no
+ <braunr> it's about stacks
+ <braunr> hm
+ <braunr> yes at the same time but
+ <braunr> thread recycling is known to be buggy
+ <braunr> which is what i'm currently fixing btw
+ <neal> what's the bug?
+ <braunr> neal: there are several subtle issues
+ <braunr> for example, joining a thread that is also calling pthread_exit
+ can fail badly
+ <neal> hmm
+ <neal> good that you are on it then :)
+ <braunr> or detaching
+ <braunr> i don't remember the details
+ <braunr> but i remember such problems
+ <braunr> apparently, keeping the stack of the main thread isn't enough
+ <braunr> :(
+ <braunr> for now, i'll keep the entire thread
+
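The stack arithmetic in the log above can be checked with a few lines of C. This is only a sketch: the constants are the figures quoted in the discussion (2M default stacks, 64k stacks for Hurd servers, 2G of user address space), the function names are made up for illustration, and the results are upper bounds that ignore everything else mapped in the task.

```c
#include <stdint.h>

#define KIB UINT64_C(1024)
#define MIB (1024 * KIB)
#define GIB (1024 * MIB)

/* Figures quoted in the discussion; treated here as assumptions. */
#define DEFAULT_STACK_SIZE (2 * MIB)   /* default for normal applications */
#define SERVER_STACK_SIZE  (64 * KIB)  /* default for Hurd servers */
#define USER_ADDRESS_SPACE (2 * GIB)   /* userspace limit */

/* Address space consumed by the stacks of `threads` live threads. */
uint64_t total_stack_bytes(uint64_t threads, uint64_t stack_size)
{
    return threads * stack_size;
}

/* Upper bound on simultaneously live threads if stacks were the only
   thing mapped in the address space. */
uint64_t max_threads(uint64_t address_space, uint64_t stack_size)
{
    return address_space / stack_size;
}
```

With these figures, 10000 default stacks need 20000 MiB (close to 20 GiB), 720 need 1440 MiB (about 1.4 GiB), and a 2 GiB address space holds at most 1024 default stacks or 32768 server-sized ones, which is in line with the "1k threads is 2G" and "up to 30k" estimates in the log.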
+
+## IRC, freenode, #hurd, 2013-11-15
+
+ <gg0> i wasn't doing anything, just some single test runs. but yes, also
+ that one which creates hundreds of threads
+ <gg0> it would like creating 10000 but goes out of memory after ~720
+ <gg0> btw same tests succeed on ruby2.0, so they should be fixed by
+ backporting some changes
+ <braunr> actually it looks more like a deadlock ..
+ <gg0> deadlock that says ENOMEM?
+ <braunr> ?
+ <braunr> ENOMEM is returned because the test task has no more virtual
+ memory
+ <braunr> this doesn't mean the rest of the system should fail
+ <gg0> ok i thought you were talking about such test
+ <braunr> no it's something else
+ <braunr> a deadlock in a critical server
+ <braunr> the root file system maybe
+ <gg0> braunr: htop and ps hang. just run the test once again
+ <gg0> now you should still be able to login
+ <braunr> htop/ps hanging means one process is unable to reply to queries
+ sent to the message port/thread
+ <braunr> procfs does that to report on what a process is waiting
+ <braunr> it usually means there is a bug around signals, since the message
+ thread is also in charge of delivering signals
+ <braunr> use ps -eM
+ <braunr> and kill -KILL
+ <braunr> hum
+ <braunr> root 954 S<o 0:00.05 /hurd/crash --dump-core
+ <braunr> dumping cores is known not to work most of the time
+ <braunr> exodar shouldn't be configured like that
+ <braunr> so yes, the crash server is hanging
+ <braunr> gg0: i've set it to crash --kill and killed the hanging crash
+ instances blocking top/ps
+ <gg0> nice
+
+ <braunr> my thread destruction patch and tls are indeed conflicting a bit
+ <braunr> i suspect the tcb is used after being freed
+ <braunr> i think i'll simply recycle the tcb, along with the pthread
+ structs
+ <braunr> ok i think it's fine now
+ <braunr> there was also a small bug in the tls code, keeping a reference on
+ the thread port
+ <braunr> mach reference counting is so counter intuitive :/
+ <braunr> well, error-prone
+
+ <braunr> argh, more bugs in libc :(
+ <teythoon> :/
+ <teythoon> but don't worry, there is always one more bug ;)
+ <braunr> this one might explain crashes that are long to trigger
+ <braunr> _hurd_self_sigstate() is implemented like this :
+ _hurd_thread_sigstate (__mach_thread_self ());
+ <braunr> it leaks a reference on the current thread each time it's called
+ <teythoon> >,<
+ <braunr> but glibc maintains such references, so if the maximum value is
+ reached, and references are dropped, the value can reach 0
+ <teythoon> ouch
+ <braunr> at which point any call on a thread will result in an invalid send
+ right
+ <braunr> and probably an assertion
+ <teythoon> well it's a good thing then that you found it :)
+ <braunr> i think it's always been there
+ <braunr> but it's more apparent since jknoenig's patch on signal
+ dispositions
+ <braunr> the maximum number of user references in mach is 64k
+ <braunr> this right leak isn't easy
+ <braunr> tls is very tricky heh :)
+ <braunr> for the main thread, tls initialization happens after the thread
+ creation, obviously
+ <braunr> but for other threads, it's initialized before starting them
+ <braunr> the leak was probably an oversight caused by that complexity
+ <braunr> teythoon: actually that leak i mentioned in _hurd_self_sigstate
+ has only been recently added in Convert sigstate to TLS
+ <braunr> so it's merely tls integration polishing
+ <braunr> youpi: i'm currently reviewing changes related to tls and i think
+ there is a bug in _hurd_self_sigstate
+ <braunr> calls to mach_thread_self() should be paired with
+ mach_port_deallocate to avoid urefs overflows
+ <braunr> and right leaks
+ <braunr> _hurd_critical_section_lock is probably affected too
+ <braunr> hm
+ <braunr> mhmm
+ <braunr> in glibc, hurd/hurd/signal.h, _hurd_critical_section_lock
+ <braunr> why is the sigstate unlocked after the call to
+ _hurd_thread_sigstate
+ <braunr> _hurd_thread_sigstate doesn't seem to lock it ..
+ <braunr> unless __spin_lock_init does it
+ <braunr> yes, leak solved :)
+
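The user-reference leak discussed above can be illustrated with a toy model. This is only a sketch and does not use the Mach API: `fake_thread_self` and `fake_port_deallocate` are hypothetical stand-ins for `mach_thread_self` and `mach_port_deallocate`, and the saturating counter is an assumption made to mirror the "64k" limit mentioned in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the user-reference count of a send right.  NOT the Mach
   API: every name below is hypothetical.  */
#define UREFS_MAX 65535u  /* models the "64k" limit from the log */

struct fake_right {
    uint32_t urefs;
};

/* Acquire one user reference; assumed to saturate at UREFS_MAX.  */
static void fake_thread_self(struct fake_right *r)
{
    if (r->urefs < UREFS_MAX)
        r->urefs++;
}

/* Drop one user reference.  */
static void fake_port_deallocate(struct fake_right *r)
{
    assert(r->urefs > 0);
    r->urefs--;
}

/* Leaky pattern: the reference taken for the lookup is never released.  */
static void lookup_leaky(struct fake_right *r)
{
    fake_thread_self(r);
    /* ... use the right ... */
}

/* Paired pattern: each acquisition is matched by a deallocation.  */
static void lookup_paired(struct fake_right *r)
{
    fake_thread_self(r);
    /* ... use the right ... */
    fake_port_deallocate(r);
}

/* Run `calls` lookups with the chosen pattern; return the final count.  */
static uint32_t run_lookups(uint32_t start, uint32_t calls, int paired)
{
    struct fake_right r = { start };
    for (uint32_t i = 0; i < calls; i++) {
        if (paired)
            lookup_paired(&r);
        else
            lookup_leaky(&r);
    }
    return r.urefs;
}
```

In this model, `run_lookups(1, 100000, 1)` leaves the count unchanged, while `run_lookups(1, 100000, 0)` pins it at the maximum. Once the count is stuck there, later legitimate deallocations are no longer balanced by the acquisitions that were dropped at the limit, which is how the count can reach 0 while the right is still in use.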
+
+## IRC, freenode, #hurd, 2013-11-16
+
+ <braunr> argh, _hurd_critical_section_lock is called before the send right
+ on the main thread is fetched in libpthread :/
+ <teythoon> is that bad ?
+ <braunr> the sigstate is supposed to be initialized after pthreads
+ <braunr> _hurd_critical_section_lock will create it if it sees there is
+ none
+ <braunr> creating the sigstate is currently what makes the send right leak
+ <teythoon> ok
+ <teythoon> it's bad then
+ <braunr> it may be due to my patch
+ <braunr> _hurd_critical_section_lock is called during pthreads
+ initialization
+ <braunr> before the sigstate for the main thread is created, but after the
+ pthread init routine is called
+ <braunr> it does indeed look like the code wasn't written with threads being
+ destroyed some day in mind :/
+ <teythoon> braunr: btw, if you ever feel like benchmarking, sysbench has a
+ benchmark for threads contending for a lock
+ <braunr> yes i've used it before
+ <teythoon> was it useful for this purpose ?
+ <braunr> no :)
+ <teythoon> :/
+ <braunr> we already know libpthread isn't optimized
+ <braunr> and felt it when we switched from cthreads
+ <braunr> humpf
+ <braunr> simply calling malloc implies a call to
+ _hurd_critical_section_lock
+ <braunr> on the other hand, unlike what some glibc comments say, this does
+ work
+
+
+## IRC, freenode, #hurd, 2013-11-17
+
+ <braunr> looks like i've fixed all leak issues with thread destruction and
+ tls :)
+ <braunr> let's see if ext2fs.static works fine too
+ <youpi> braunr: \o/
+ <youpi> sorry about introducing the tls ones :)
+ <braunr> no worries, it was expected
+ <braunr> and tls was really needed :)
+ <braunr> i mean, i expected to have some problems when rebasing on tls :p
+ <teythoon> braunr: this is good news, how is your rootfs translator holding
+ up?
+ <braunr> building hurd packages right now
+ <braunr> for now, only test applications and a few really multithreaded
+ ones (e.g. iceweasel) have been tested
+ <braunr> well, the system boots :)
+ <teythoon> awesome :)
+ <braunr> stressing the file system with git while watching youtube videos
+ with gnash doesn't make the system crash
+ <teythoon> you can actually watch yt videos on your Hurd box ?
+ <braunr> yes
+ <braunr> for a while now
+ <teythoon> o_O
+ <braunr> can't you ?
+ <teythoon> I never even dared to try
+ <braunr> hehe
+ <braunr> teythoon: looks stable enough to install on darnassus
+
+
+## IRC, freenode, #hurd, 2013-11-18
+
+ <teythoon> braunr: wrt to your thread destruction patchset, I thought you
+ also had to fix the proc server ?
+ <braunr> teythoon: no
+ <braunr> the problem was in glibc
+ <braunr> i may have to fix proc/procfs though, because cpu time gets wrong
+ with the patch
+ <braunr> currently, it's the addition of the cpu time of all threads
+ <braunr> mach provides aggregate times including destroyed threads though
+ <teythoon> ah, I see
+ <braunr> one side effect is that you'll see processes sometimes taking 100%
+ of cpu time although the cpu is unused
+ <braunr> or the cpu time of a process gets reduced :)
+ <braunr> i guess the 100% cpu is how top sees a negative increment
+ <teythoon> ^^
+ <braunr> gg0: do my threadterm packages help with ruby1.9 ?
+ <braunr> i mean, can you test with them some time ? :)
+
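The CPU time issue described above can be sketched as follows. This is a minimal illustration, not code from proc, procfs or libps: the structure, the function names and the millisecond figures are all made up. It shows why summing only the live threads can make a process total go backwards when a thread is destroyed, and why adding an aggregate for terminated threads keeps it monotonic.

```c
#include <stddef.h>

/* Hypothetical per-thread sample: CPU time in milliseconds.  */
struct thread_sample {
    long cpu_ms;
};

/* Sum of the CPU time of the threads that are still alive.  */
static long live_cpu_ms(const struct thread_sample *threads, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += threads[i].cpu_ms;
    return total;
}

/* Process total: live threads plus time accumulated by terminated ones.  */
static long process_cpu_ms(const struct thread_sample *threads, size_t n,
                           long terminated_ms)
{
    return live_cpu_ms(threads, n) + terminated_ms;
}

/* Made-up scenario: two threads (300 ms and 200 ms); between two samples
   the first one runs 10 ms more and the second one is destroyed.
   Returns the increment a monitoring tool would compute.  */
static long demo_delta(int include_terminated)
{
    struct thread_sample before[] = { { 300 }, { 200 } };
    struct thread_sample after[]  = { { 310 } };
    long terminated_ms = include_terminated ? 200 : 0;

    return process_cpu_ms(after, 1, terminated_ms)
           - process_cpu_ms(before, 2, 0);
}
```

With these figures, `demo_delta(0)` is negative (the total shrinks by 190 ms), whereas `demo_delta(1)` yields the expected 10 ms increment.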
+
+## IRC, freenode, #hurd, 2013-11-21
+
+ <braunr> youpi: ping about my question regarding error handling in the
+ proposed thread_terminate_release call
+ <youpi> I agree with what Neal said
+ <braunr> he didn't say anything about error handling
+ <braunr> see
+ http://lists.gnu.org/archive/html/bug-hurd/2013-11/msg00181.html
+ <braunr> i think i should make the call fail on first error
+ <braunr> it shouldn't happen, so it would merely serve to catch bugs
+ <braunr> it's not easily recoverable (if it's recoverable at all)
+ <youpi> uh, I thought he had
+ <youpi> I must have dreamt
+
+ <braunr> i think i'll go ahead with thread destruction integration
+
+
+## IRC, freenode, #hurd, 2013-11-25
+
+ <braunr> i've pushed the thread destruction patches for gnumach upstream
+ <braunr> and made a branch in glibc for that too
+ <teythoon> awesome :)
+ <braunr> youpi: i don't remember how glibc changes should be managed
+ <braunr> once those are applied, i'll commit in libpthread
+ <youpi> braunr: usually we create a topgit branch, and then we add the
+ patch from that to the debian repository
+
+
+## IRC, freenode, #hurd, 2013-11-29
+
+ <braunr> youpi: i still have a leak somewhere with the thread destruction
+ patches
+ <braunr> maybe on the host priv port in bootstrap servers (root fs and proc
+ server)
+ <braunr> it prevents priority adjusting in libports and can easily bring
+ down a system because servers can start thrashing a lot sooner, as was
+ the case during the pthread migration
+
+See discussion about that on [[/open_issues/libpthread]].
+
+ <braunr> so i'll hunt it down before merging
+
+
+## IRC, freenode, #hurd, 2013-12-19
+
+ <braunr> darnassus still has the libports priority adjustment leaks
+ <braunr> i'll apply a few more patches to my hurd packages
+
+ <braunr> humpf, proc seems to have a problem getting the host priv port :/
+ <teythoon> thats bad
+ <teythoon> what did you do ?
+ <braunr> i fixed all the leaks in libports when adjusting priorities
+ <braunr> the last one being releasing the host priv right
+ <braunr> and i get errors at boot time from the proc server
+ <teythoon> remember when i had this problem ?
+ <braunr> proc doesn't get the host priv port the normal way since the
+ normal way is to get it from proc iirc
+ <teythoon> ah, thought you fixed that
+ <braunr> so i guess the alternate way doesn't add a reference
+ <braunr> well the leak is fixed
+ <braunr> the problem you had was due to the leak which made the host priv
+ port reach its max uref value
+ <braunr> now it's just the proc server
+ <braunr> the system works fine though
+ <teythoon> for real ?
+ <teythoon> the proc server needs the host priv port for getting the new
+ tasks
+ <braunr> well yes
+ <teythoon> how can it work w/o it ?
+ <braunr> i don't know ..
+ <braunr> i guess the problem is internal to glibc
+ <braunr> i mean, get_priv_ports fails, but that doesn't mean the host priv
+ port is lost
+ <teythoon> could be
+ <teythoon> are you running a patched rootfs translator too ?
+ <braunr> yes
+ <teythoon> ok
+ <teythoon> b/c i remember having trouble with that
+ <braunr> right, the glibc call would make proc call __proc_getprivports
+ <braunr> hum
+ <braunr> teythoon: do you remember how proc gets its host priv port ?
+ <teythoon> from init
+ <teythoon> i think
+ <braunr> startup_procinit ?
+ <teythoon> possibly
+ <braunr> right
+ <braunr> so it's probably not the host priv port
+ <braunr> i mean, the error is about another invalid send right
+ <braunr> hm nope, it is on host_priv :/
+ <braunr> hm ok i see, looks like a bug from a debian patch
+ <braunr> or rather, a bug fix not yet imported into the debian package
+ <braunr> teythoon: you actually fixed it in
+ 2c9422595f41635e2f4f7ef1afb7eece9001feae
+ <braunr> great :)
+ <teythoon> ah, that one
+ <braunr> i was looking at the upstream code and couldn't understand what
+ was going wrong
+ <braunr> :)
+ <braunr> much better
+ <braunr> except ps -eT doesn't work any more ..
+ <braunr> interestingly, with the thread destruction patch, ps -eT sometimes
+ works, and sometimes doesn't
+ <braunr> the behaviour doesn't seem to change without a reboot
+ <braunr> and of course, as soon as i say it, i'm proven wrong by the next
+ test :)
+
+
+## IRC, freenode, #hurd, 2013-12-26
+
+ <braunr> __pthread_sigstate_init doesn't seem to be converted to TLS in the
+ upstream repository master branch
+
+ <braunr> ah dammit, the global signal dispositions patch touches both glibc
+ and libpthread @#!
+ <braunr> what a mess
+
+ <braunr> youpi: do you have some time to quickly review the
+ rbraun/thread_destruction branch in libpthread ?
+ <braunr> there might be conflict with some glibc patches
+ <braunr> or do you prefer it on the mailing list ?
+ <braunr> (i used a branch because it's not based on master)
+ <youpi> rather mail the list, yes
+ <braunr> ok
+ <youpi> it'd also be useful to write the rationale
+ <youpi> probably to be left as comment in the source code
+ <braunr> yes, that branch was for personal storage :)
+ <youpi> so the reader knows how things are recycled or not
+ <braunr> hm
+ <braunr> that should already be the case
+ <youpi> ok
+ <braunr> the two structures that are still recycled are the pthread struct
+ and tls
+ <braunr> it's quite obvious from pthread_alloc
+ <braunr> and well commented there
+ <braunr> for tls, it's explained in pthread_exit
+
+ <braunr> there, thread destruction finally merged in
+ <braunr> and now, we can remove the ugly hacks that were done for
+ threadvars
+ <braunr> :)
+ <braunr> change stacks at will and support all sorts of weird languages and
+ runtimes
+ <teythoon> braunr: cool :)
+
+
+## IRC, freenode, #hurd, 2013-12-31
+
+ <youpi1> braunr: I've added sigstate_locking, sigstate_thread_reference and
+ tls_thread_leak to the debian glibc 2.18 package
+ <youpi1> I believe that's complete?
+ <youpi1> is mach_msg_uspace_options ready for being added? Does it bring
+ much speedup?
+ <youpi1> AIUI, thread_terminate_release is the union of the branches
+ mentioned above?
+ <youpi1> (I'm cleaning up branches in the glibc repo)
+ <braunr> youpi1: mach_msg_uspace_options can be left over, it only affects
+ selects and not noticeably
+ <braunr> yes, those three branches are the only ones needed for thread
+ destruction
+ <youpi1> ok
+ <youpi> do the hurd changes depend on these changes ?
+ <braunr> no
+ <youpi> good :)
+ <braunr> only on tls for one of them
+ <braunr> (it's about the default stack size of 64k for hurd servers)
+ <youpi> and we have had this in debian for a long time already :)
+ <braunr> yes
+ <youpi> (how big were they before?)
+ <youpi> (were they a couple MiB, and thus exploding to GiBs on thousands
+ of threads?)
+ <braunr> 64k
+ <braunr> pthread stacks are 2M by default
+ <braunr> yes
+
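The stack size figures quoted above (64k for hurd servers, 2M default for pthread stacks) can be turned into a quick estimate. The helper below is only a back-of-the-envelope sketch with a made-up name; it counts reserved address space and says nothing about how much of it is actually resident.

```c
/* Address space reserved for thread stacks, in MiB.
   Hypothetical helper, for illustration only.  */
static unsigned long long stack_reservation_mib(unsigned long long nthreads,
                                                unsigned long long stack_bytes)
{
    const unsigned long long mib = 1024ULL * 1024ULL;

    return nthreads * stack_bytes / mib;
}
```

For 1024 threads, `stack_reservation_mib(1024, 2097152)` gives 2048 MiB with 2 MiB stacks, against 64 MiB for `stack_reservation_mib(1024, 65536)` with 64 KiB stacks.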
+
+## IRC, freenode, #hurd, 2014-01-14
+
+ <youpi> braunr: it seems your time change in libps made ps produce odd
+ results
+ <youpi> samy 10987 5 -514358:-18:-42.17 /hurd/firmlink tmp
+ <braunr> youpi: wow :)
+ <braunr> that change is supposed to run on a system where threads actually
+ get destroyed
+ <braunr> but i don't see what could trigger this side effect
+ <youpi> root 8629 664 56 years make -j 3
+ <youpi> :)
+ <braunr> heh
+ <braunr> youpi: does the hurd package on darnassus include that patch ?
+ <youpi> yes
+ <braunr> i don't reproduce the problem :/
+ <youpi> err
+ <braunr> what command are you using ?
+ <youpi> ps -feM on darnassus
+ <youpi> root 29642 473 7 months /usr/sbin/sshd -R
+ <braunr> hmmmm
+ <braunr> i don't see it with a make -j
+ <youpi> well, it's not systematic
+ <youpi> it's like once over two launches
+ <braunr> hhhhmmmmm
+ <youpi> it'd look like some random numbers get added
+ <braunr> strangely, the gcc processes started by a recursive make aren't
+ children of make ..
+ <braunr> ps -eF hurd seems to report the correct values
+ <braunr> even ps -eM
+ <braunr> oO
+ <braunr> ps -ef too
+ <braunr> the problem seems to be with ps -efM
+ <youpi> too bad I'm always using that :)
+ <braunr> another way to see it is that it makes us spot the issue ;p
+
+
+### IRC, freenode, #hurd, 2014-01-15
+
+ <braunr> ok i have an idea of what goes wrong in libps
+
+ <braunr> youpi: for some reason, ps -efM lacks the PSTAT_TASK_BASIC flag
+ <braunr> my patch is wrong since it doesn't try to determine whether the
+ stats apply to a task or a thread, but that is easy to fix
+ <braunr> ps -efM should nonetheless provide basic task info, obviously
+ <braunr> in addition, the problems i've observed with ps -T (occasional
+ segfaults) seem to have existed before thread destruction
+ <braunr> they're just strongly exposed now that the thread list can be
+ shrunk
+
+ <braunr> libps is quite complicated
+ <braunr> even hairy, i'd say ..
+
+
+### IRC, freenode, #hurd, 2014-01-16
+
+ <braunr> youpi: i think i have a proper fix for libps
+ <braunr> i'll commit it soon
+ <youpi> ok
+ <braunr> basically, getting system times simply sets the PSTAT_THREAD_BASIC
+ flag
+ <braunr> whereas getting the run time of the terminated threads requires
+ PSTAT_TASK_BASIC
+ <braunr> i assumed it was always set in the function i changed when dealing
+ with a task and not a thread
+ <braunr> and well, that was a wrong assumption, -M can remove it if not
+ strictly needed by the format
+ <braunr> the default format asks for suspend_count, which forces the
+ retrieval of task basic info, so it works with -eM
+ <braunr> but -f doesn't :)
+ <youpi> so extremely bad lucky combination of flags :)
+ <braunr> indeed
+ <braunr> i added a pstat_times using the last (!) available flag bit
+ <braunr> looks clean to me
+ <braunr> i hope there is no abi issue
+ <braunr> (at least everything works with the unmodified ps-hurd executable
+ and a new libps.so)
+
+ <braunr> hm, small bug in the thread destruction patch :/
+
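The libps flag problem described above comes down to reading a field whose "valid" flag was never requested. The sketch below is an illustration under assumptions, not the libps code: the `DEMO_*` bits, the structure and the functions are hypothetical, and the real `PSTAT_*` values and their dependency handling differ.

```c
/* Hypothetical flag bits; the real PSTAT_* values in libps differ.  */
enum {
    DEMO_THREAD_BASIC = 1 << 0,  /* basic info of live threads fetched */
    DEMO_TASK_BASIC   = 1 << 1,  /* task-wide basic info fetched */
    DEMO_TIMES        = 1 << 2   /* dedicated request for times */
};

struct demo_stat {
    int have;              /* which fields below are valid */
    long live_threads_ms;  /* valid only with DEMO_THREAD_BASIC */
    long terminated_ms;    /* valid only with DEMO_TASK_BASIC */
};

/* Expand a request: asking for times pulls in both prerequisites, so the
   caller no longer relies on some other field forcing them.  */
static int demo_expand(int need)
{
    if (need & DEMO_TIMES)
        need |= DEMO_THREAD_BASIC | DEMO_TASK_BASIC;
    return need;
}

/* Total run time; only reads the fields whose flag is set.  */
static long demo_total_ms(const struct demo_stat *s)
{
    long total = 0;

    if (s->have & DEMO_THREAD_BASIC)
        total += s->live_threads_ms;
    if (s->have & DEMO_TASK_BASIC)
        total += s->terminated_ms;
    return total;
}

/* Convenience wrapper used to try the function with given values.  */
static long demo_total_from(int have, long live_ms, long terminated_ms)
{
    struct demo_stat s = { have, live_ms, terminated_ms };

    return demo_total_ms(&s);
}
```

Here `demo_total_from(DEMO_THREAD_BASIC, 40, 999999)` ignores the unfetched task field instead of adding garbage to the result, which is the kind of random-looking addition seen in the `ps -efM` output.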
+
+### IRC, freenode, #hurd, 2014-01-17
+
+ <braunr> good, i have proper fixes for tls in the main thread and thread
+ termination :)
+ <teythoon> awesome :)
+ <teythoon> i've been wondering, what does it take to get the thread
+ destruction stuff into the debian package ?
+ <braunr> i still have to build test packages, look for (unlikely, heh)
+ regressions and work some integration details with samuel
+ <braunr> hum the main thread tls fixup i guess
+ <braunr> youpi was waiting for me to fix that
+ <braunr> gnumach already provides the RPC
+ <braunr> so it will be in glibc soon
+ <braunr> i just have to get those last bits right
+ <braunr> teythoon: i'm quite slow at integrating stuff
+ <teythoon> and samuel then builds packages ?
+ <teythoon> i mean, is our libc package build linked to the other libc
+ packages ?
+ <braunr> libpthread is applied as a patch to glibc
+ <braunr> and loaded as a plugin
+
+
+## IRC, freenode, #hurd, 2014-01-17
+
+ <braunr> uhm, did we break fakeroot-tcp ?
+ <teythoon> we did ?
+ <youpi> fakeroot-tcp just works fine on buildds
+ <braunr> with fakeroot-tcp, i get
+ <braunr> make[4]: Entering directory
+ `/home/rbraun/devel/debian/packages/hurd/hurd-0.5.git20140113/libdde-linux26/contrib/include'
+ <braunr> rm -f .general.d
+ <braunr> make[4]: *** [cleanall] Killed
+ <braunr> when cleaning the package before building ..
+
+
+### IRC, freenode, #hurd, 2014-01-18
+
+ <braunr> damn, fakeroot-tcp won't work on darnassus ..
+ <braunr> uh, looks like my tls/thread destruction "fixes" do cause
+ regressions :(
+ <braunr> fakeroot works fine with debian glibc
+ <teythoon> which one ?
+ <teythoon> which fakeroot i mean
+ <braunr> -tcp
+ <braunr> yes, it fails as soon as i use the patched glibc :/
+ <braunr> at least it's easy to reproduce
+
+
+### IRC, freenode, #hurd, 2014-01-20
+
+ <braunr> great, 3rd libc version installed on darnassus, let's see if i can
+ build hurd packages against that
+
+
+### IRC, freenode, #hurd, 2014-01-21
+
+ <braunr> damn, fakeroot-tcp still crashes with my latest changes ....
+
+ <braunr> darnassus looks in good shape
+ <braunr> youpi: ^
+ <braunr> youpi: if you have other tests, feel free to do them now
+ <braunr> i feel confident about committing the changes, if you're ok with
+ it
+ <youpi> which changes ?
+ <youpi> I'm a bit lost in what you were talking about :)
+ <braunr> you can find them in 2 patches in /var/tmp on darnassus
+ <braunr> one is about fixing thread destruction
+ <braunr> i'm pretty certain about this one so i'll commit it directly
+ <braunr> the other is fixing the tcb of the main thread
+
+[[open_issues/libpthread]].
+
+ <braunr> where i simply do tcb->self = thread->kernel_thread :)
+ <braunr> with a comment explaining why i don't do something else like
+ deallocating the unused tcb
+ <youpi> braunr: ok, that looks good
+ <teythoon> braunr: awesome :)
+ <braunr> youpi: ok
+
+
+### IRC, freenode, #hurd, 2014-01-22
+
+ <braunr> there, libpthread should be fine now
+
+
+## IRC, freenode, #hurd, 2014-02-06
+
+ <braunr> youpi: in case you're planning to upgrade glibc (or not), the
+ thread destruction changes are complete
+ <braunr> youpi: darnassus has been running them for some weeks with no
+ visible regression
+ <youpi> braunr: ok, good
+ <youpi> including it in glibc was on my todo list indeed
+ <youpi> and Adam indeed plans for a 2.18 upload
+ <braunr> good :)
+ <youpi> braunr: this is up to 7c6dc6e28b2fc4b67934223f41cf080ffe58b230,
+ right? (Wed Jan 22, Fix up the main thread TCB)
+ <braunr> yes
+ <braunr> oh, i just saw 2.17-98~0 glibc packages on debian-ports :)
+ <youpi> yes, it's just to fix the dhcp crash
+ <braunr> ah yes, it's not 2.18
+ <youpi> 2.18 is available in experimental
+
+ <youpi> braunr: just to make sure: did you have
+ 983b18a6ff16f5687a9ece63a50d1831dec88609 in libc on darnassus?
+ <youpi> (which drops the stack size hack)
+ <braunr> youpi: let me check
+ <braunr> youpi: ah no, i don't, you're right
+ <youpi> well, I was just wondering, nothing makes me think that was the case
+ :)
+ <youpi> what was the issue that it was raising btw?
+ <braunr> threadvars
+ <youpi> ok, but in which case?
+ <youpi> (to make sure I test that before committing)
+ <braunr> now that we switched to tls, i would assume the transition path to
+ be 1/ hurd stops defining that symbol, 2/ libpthread can stop using it
+ <braunr> the goal was to reduce the stack size of hurd server threads
+ <youpi> well, that's not my question :) I'm wondering in which precise case
+ that was breaking things
+ <braunr> youpi: i don't know, it shouldn't break
+ <youpi> ok
+ <braunr> youpi: just in case, don't forget that last one line patch i
+ committed last night, fakeroot can't work right without it
+ <braunr> (i made a minor change while reviewing before committing, and
+ obviously got it wrong :p)
+ <youpi> ok
+
+ <youpi> braunr: I've upgraded libpthread in debian's eglibc btw
+
+ <braunr>
+ /home/rbraun/devel/debian/packages/eglibc/eglibc-2.17/build-tree/hurd-i386-libc/libc.so.phdr:
+ *** executable stack signaled
+ <braunr> from build-tree/hurd-i386-libc/elf/check-execstack.out
+ <braunr> i thought glibc didn't use those
+ <braunr> anyway it doesn't look to be the regression i'm having
+ <braunr> does this ring a bell :
+ <braunr> Encountered regressions that don't match expected failures
+ (debian/testsuite-checking/expected-results-i486-gnu-libc):
+ <braunr> test-stpcpy_chk.out, Error 1
+ <braunr> TEST test-stpcpy_chk.out: __stpcpy_chk normal_stpcpy
+ simple_stpcpy_chk
+ <youpi> nope
+ <youpi> after what are you getting this regression?
+ <braunr> building glibc 2.17-97 with thread destruction patches, including
+ the one removing the stack size hack
+ <braunr> during tests
+ <braunr> there also are "progressions", but i'm not sure what these are
+ <youpi> some progressions are just luck, other seem to happen on some
+ platforms only
+ <youpi> I'm not sure you want to test 2.17
+ <youpi> a lot has changed between 2.17's libpthread and 2.18's libpthread
+ (which is now equal to git's libpthread)
+ <braunr> yes
+ <braunr> i usually build with nocheck
+
+
+## IRC, freenode, #hurd, 2014-02-07
+
+ <braunr> youpi: on a vm with hurd 1:0.5.git20140203-1, upgrading to a
+ patched glibc 2.17-97 that includes the patch which reverts the stack
+ size hack, the system reboots and works fine
+ <youpi> ok. I don't remember what problem I was seeing
+ <braunr> that version of the hurd no longer defines the symbol
+ <braunr> but even then, there shouldn't have been any problem
+ <braunr> hm, or does it
+ <braunr> yes, it does
+ <braunr> youpi: the hurd package patch mentions
+ <braunr> Revert this for now, will have to wait for dropping the use of
+ <braunr> __pthread_stack_default_size from eglibc's
+ libpthread_hurd_cond_wait.diff
+ <braunr> i wonder how it got there
+ <youpi> IIRC I was wondering too
+ <braunr> i've installed my c library on darnassus and it works fine there
+ too
+ <braunr> with older (january) hurd packages
+ <braunr> looks good to me
+
+
+## IRC, freenode, #hurd, 2014-02-10
+
+ <teythoon> braunr: btw, do the new libc packages contain your thread
+ destruction work ?
+ <braunr> teythoon: the -98 ones on experimental ?
+ <braunr> i don't think they do
+ <braunr> the -18 ones should do