path: root/open_issues/libpthread.mdwn
author Thomas Schwinge, 2012-11-29
commit 5bd36fdff16871eb7d06fc26cac07e7f2703432b
    <braunr> ouch
    <bddebian> braunr: Do you have debugging enabled in that custom kernel you
      installed? Apparently it is sitting at the debug prompt.

## IRC, freenode, #hurd, 2012-08-12

    <braunr> hmm, it seems the hurd notion of cancellation is actually not the
      pthread one at all
    <braunr> pthread_cancel merely marks a thread as being cancelled, while
      hurd_thread_cancel interrupts it
    <braunr> ok, i have a pthread_hurd_cond_wait_np function in glibc

## IRC, freenode, #hurd, 2012-08-13

    <braunr> nice, i got ext2fs to work with pthreads
    <braunr> there are issues with the stack size strongly limiting the number
      of concurrent threads, but that's easy to fix
    <braunr> one problem with the hurd side is the condition implications
    <braunr> i think it should be dealt with separately, and before doing
      anything with pthreads
    <braunr> but that's minor, the most complex part is, again, the term server
    <braunr> other than that, it was pretty easy to do
    <braunr> but, i shouldn't speak too soon, who knows what tricky bootstrap
      issue i'm gonna face ;p
    <braunr> tschwinge: i'd like to know how i should proceed if i want a
      symbol in a library overridden by that of a main executable
    <braunr> e.g. have libpthread define a default stack size, and let
      executables define their own if they want to change it
    <braunr> tschwinge: i suppose i should create a weak alias in the library
      and a normal variable in the executable, right ?
    <braunr> hm i'm making this too complicated
    <braunr> don't mind that stupid question
    <tschwinge> braunr: A simple variable definition would do, too, I think?
    <tschwinge> braunr: Anyway, I'd first like to know why we can't reduce the
      size of libpthread threads from 2 MiB to 64 KiB as libthreads had. Is
      that a requirement of the pthread specification?
    <braunr> tschwinge: it's a requirement yes
    <braunr> the main reason i see is that hurd threadvars (which are still
      present) rely on common stack sizes and alignment to work
    <tschwinge> Mhm, I see.
    <braunr> so for now, i'm using this approach as a hack only
    <tschwinge> I'm working on phasing out threadvars, but we're not there yet.
    <tschwinge> Yes, that's fine for the moment.
    <braunr> tschwinge: a simple definition wouldn't work
    <braunr> tschwinge: i resorted to a weak symbol, and see how it goes
    <braunr> tschwinge: i suppose i need to export my symbol as a global one,
      otherwise making it weak makes no sense, right ?
    <braunr> tschwinge: also, i'm not actually sure what you meant is a
      requirement about the stack size, i shouldn't have answered right away
    <braunr> no there is actually no requirement
    <braunr> i misunderstood your question
    <braunr> hm when adding this weak variable, starting a program segfaults :(
    <braunr> apparently on ___pthread_self, a tls variable
    <braunr> fighting black magic begins
    <braunr> arg, i can't manage to use that weak symbol to reduce stack sizes
      :(
    <braunr> ah yes, finally
    <braunr> git clone /path/to/glibc.git on a pthread-powered ext2fs server :>
    <braunr> tschwinge: seems i have problems using __thread in hurd code
    <braunr> tschwinge: they produce undefined symbols
    <braunr> tschwinge: forget that, another mistake on my part
    <braunr> so, current state: i just need to create another patch, for the
      code that is included in the debian hurd package but not in the upstream
      hurd repository (e.g. procfs, netdde), and i should be able to create
      hurd packages that completely use pthreads

## IRC, freenode, #hurd, 2012-08-14

    <braunr> tschwinge: i have weird bootstrap issues, as expected
    <braunr> tschwinge: can you point me to important files involved during
      bootstrap ?
    <braunr> my ext2fs.static server refuses to start as a rootfs, whereas it
      seems to work fine otherwise
    <braunr> hm, it looks like it's related to global signal dispositions

## IRC, freenode, #hurd, 2012-08-15

    <braunr> ahah, a subhurd running pthreads-powered hurd servers only
    <LarstiQ> braunr: \o/
    <braunr> i can even log on ssh
    <braunr> pinotree: for reference, i uploaded my debian-specific changes
      there :
    <braunr>
    <braunr> darnassus is now running a pthreads-enabled hurd system :)

## IRC, freenode, #hurd, 2012-08-16

    <braunr> my pthreads-enabled hurd systems can quickly die under load
    <braunr> youpi: with hurd servers using pthreads, i occasionally see thread
      storms apparently due to a deadlock
    <braunr> youpi: it makes me think of the problem you sometimes have (and
      had often with the page cache patch)
    <braunr> in cthreads, mutex and condition operations are macros, and they
      check the mutex/condition queue without holding the internal
      mutex/condition lock
    <braunr> i'm not sure where this can lead to, but it doesn't seem right
    <pinotree> isn't that a bit dangerous?
    <braunr> i believe it is
    <braunr> i mean
    <braunr> it looks dangerous
    <braunr> but it may be perfectly safe
    <pinotree> could it be?
    <braunr> aiui, it's an optimization, e.g. "dont take the internal lock if
      there are no threads to wake"
    <braunr> but if there is a thread enqueuing itself at the same time, it
      might not be woken
    <pinotree> yeah
    <braunr> pthreads don't have this issue
    <braunr> and what i see looks like a deadlock
    <pinotree> anything can happen between the unlocked checking and the
      following instruction
    <braunr> so i'm not sure how a situation working around a faulty
      implementation would result in a deadlock with a correct one
    <braunr> on the other hand, the error youpi reported
      ( seems
      to indicate something is deeply wrong with libports
    <pinotree> it could also be that the current code does not really "work
      around" that, but simply implicitly relies on the so-generated behaviour
    <braunr> luckily not often
    <braunr> maybe
    <braunr> i think we have to find and fix these issues before moving to
      pthreads entirely
    <braunr> (ofc, using pthreads to trigger those bugs is a good procedure)
    <pinotree> indeed
    <braunr> i wonder if tweaking the error checking mode of pthreads to abort
      on EDEADLK is a good approach to detecting this problem
    <braunr> let's try !
    <braunr> youpi: eh, i think i've spotted the libports ref mistake
    <youpi> ooo!
    <youpi> .oOo.!!
    <gnu_srs> Same problem but different patches
    <braunr> look at libports/bucket-iterate.c
    <braunr> in the HURD_IHASH_ITERATE loop, pi->refcnt is incremented without
      a lock
    <youpi> Mmm, the incrementation itself would probably be compiled into an
      INC, which is safe in UP
    <youpi> it's an add currently actually
    <youpi> 0x00004343 <+163>: addl $0x1,0x4(%edi)
    <braunr> 40c4: 83 47 04 01 addl $0x1,0x4(%edi)
    <youpi> that makes it SMP unsafe, but not UP unsafe
    <braunr> right
    <braunr> too bad
    <youpi> that still deserves fixing :)
    <braunr> the good side is my mind is already wired for smp
    <youpi> well, it's actually not UP safe either
    <youpi> in general
    <youpi> when the processor is not able to do the add in one instruction
    <braunr> sure
    <braunr> youpi: looks like i'm wrong, refcnt is protected by the global
      libports lock
    <youpi> braunr: but aren't there pieces of code which manipulate the refcnt
      while taking another lock than the global libports lock
    <youpi> it'd not be scalable to use the global libports lock to protect
      refcnt
    <braunr> youpi: imo, the scalability issues are present because global
      locks are taken all the time, indeed
    <youpi> urgl
    <braunr> yes ..
    <braunr> when enabling mutex checks in libpthread, pfinet dies :/
    <braunr> grmbl, when trying to start "ls" using my deadlock-detection
      libpthread, the terminal gets unresponsive, and i can't even use ps .. :(
    <pinotree> braunr: one could say your deadlock detection works too
      well... :P
    <braunr> pinotree: no, i made a mistake :p
    <braunr> it works now :)
    <braunr> well, "works" is a bit fast
    <braunr> i can't attach gdb now :(
    <braunr> *sigh*
    <braunr> i guess i'd better revert to a cthreads hurd and debug from there
    <braunr> eh, with my deadlock-detection changes, recursive mutexes are now
      failing on _pthread_self(), which for some obscure reason generates this
    <braunr> => 0x0107223b <+283>: jmp 0x107223b
      <__pthread_mutex_timedlock_internal+283>
    <braunr> *sigh*

## IRC, freenode, #hurd, 2012-08-17

    <braunr> aw, the thread storm i see isn't a deadlock
    <braunr> seems to be mere contention ....
    <braunr> youpi: what do you think of the way
      ports_manage_port_operations_multithread determines it needs to spawn a
      new thread ?
    <braunr> it grabs a lock protecting the number of threads to determine if
      it needs a new thread
    <braunr> then releases it, to retake it right after if a new thread must be
      created
    <braunr> aiui, it could lead to a situation where many threads could
      determine they need to create threads
    <youpi> braunr: there's no reason to release the spinlock before re-taking
      it
    <youpi> that can indeed lead to too many thread creations
    <braunr> youpi: a harder question
    <braunr> youpi: what if thread creation fails ? :/
    <braunr> if i'm right, hurd servers simply never expect thread creation to
      fail
    <youpi> indeed
    <braunr> and as some patterns have threads blocking until another produces
      an event
    <braunr> i'm not sure there is any point handling the failure at all :/
    <youpi> well, at least produce some output
    <braunr> i added a perror
    <youpi> so we know that happened
    <braunr> async messaging is quite evil actually
    <braunr> the bug i sometimes have with pfinet is usually triggered by
      fakeroot
    <braunr> it seems to use select a lot
    <braunr> and select often destroys ports when it has something to return to
      the caller
    <braunr> which creates dead name notifications
    <braunr> and if done often enough, a lot of them
    <youpi> uh
    <braunr> and as pfinet is creating threads to service new messages, already
      existing threads are starved and can't continue
    <braunr> which leads to pfinet exhausting its address space with thread
      stacks (at about 30k threads)
    <braunr> i initially thought it was a deadlock, but my modified libpthread
      didn't detect one, and indeed, after i killed fakeroot (the whole
      dpkg-buildpackage process hierarchy), pfinet just "cooled down"
    <braunr> with almost all 30k threads simply waiting for requests to
      service, and the few expected select calls blocking (a few ssh sessions,
      exim probably, possibly others)
    <braunr> i wonder why this doesn't happen with cthreads
    <youpi> there's a 4k guard between stacks, otherwise I don't see anything
      obvious
    <braunr> i'll test my pthreads package with the fixed
      ports_manage_port_operations_multithread
    <braunr> but even if this "fix" should reduce thread creation, it doesn't
      prevent the starvation i observed
    <braunr> evil concurrency :p
    <braunr> youpi: hm i've just spotted an important difference actually
    <braunr> youpi: glibc sched_yield is __swtch(), cthreads is
    <braunr> i'll change the glibc implementation, see how it affects the whole
      system
    <braunr> youpi: do you think boosting the priority of cancellation
      requests is an acceptable workaround ?
    <youpi> workaround for what?
    <braunr> youpi: the starvation i described earlier
    <youpi> well, I guess I'm not into the thing enough to understand
    <youpi> you meant the dead port notifications, right?
    <braunr> yes
    <braunr> they are the cancellation triggers
    <youpi> cancelling what?
    <braunr> a blocking select for example
    <braunr> ports_do_mach_notify_dead_name -> ports_dead_name ->
      ports_interrupt_notified_rpcs -> hurd_thread_cancel
    <braunr> so it's important they are processed quickly, to allow blocking
      threads to unblock, reply, and be recycled
    <youpi> you mean the threads in pfinet?
    <braunr> the issue applies to all servers, but yes
    <youpi> k
    <youpi> well, it can not not be useful :)
    <braunr> whatever the choice, it seems there will be a security issue
      (a denial of service of some kind)
    <youpi> well, it's not only in that case
    <youpi> you can always queue a lot of requests to a server
    <braunr> sure, i'm just focusing on this particular problem
    <braunr> hm
    <braunr> i'd say POLICY_TIMESHARE just in case
    <braunr> (and i'm not sure mach handles fixed priority threads first
      actually :/)
    <braunr> hm my current hack which consists of calling swtch_pri(0) from a
      freshly created thread seems to do the job eh
    <braunr> (it may be what cthreads unintentionally does by acquiring a spin
      lock from the entry function)
    <braunr> not a single issue any more with this hack
    <bddebian> Nice
    <braunr> bddebian: well it's a hack :p
    <braunr> and the problem is that, in order to boost a thread's priority,
      one would need to implement that in libpthread
    <bddebian> there isn't thread priority in libpthread?
    <braunr> it's not implemented
    <bddebian> Interesting
    <braunr> if you want to do it, be my guest :p
    <braunr> mach should provide the basic stuff for a partial implementation
    <braunr> but for now, i'll fall back on the hack, because that's what
      cthreads "does", and it's "reliable enough"
    <antrik> braunr: I don't think the locking approach in
      ports_manage_port_operations_multithread() could cause issues. the worst
      that can happen is that some other thread becomes idle between the check
      and creating a new thread -- and I can't think of a situation where this
      could have any impact...
    <braunr> antrik: hm ?
    <braunr> the worst case is that many threads will evaluate spawn to 1 and
      create threads, whereas only one of them should have
    <antrik> braunr: I'm not sure perror() is a good way to handle the
      situation where thread creation failed. this would usually happen because
      of resource shortage, right? in that case, it should work in non-debug
      builds too
    <braunr> perror isn't specific to debug builds
    <braunr> i'm building glibc packages with a pthreads-enabled hurd :>
    <braunr> (which at one point run the test allocating and filling 2 GiB of
      memory, which passed)
    <braunr> (with a kernel using a 3/1 split of course, swap usage reached
      something like 1.6 GiB)
    <antrik> braunr: BTW, I think the observation that thread storms tend to
      happen on destroying stuff more than on creating stuff has been made
      before...
    <braunr> ok
    <antrik> braunr: you are right about perror() of course. brain fart -- was
      thinking about assert_perror()
    <antrik> (which is misused in some places in existing Hurd code...)
    <antrik> braunr: I still don't see the issue with the "spawn"
      locking... the only situation where this code can be executed
      concurrently is when multiple threads are idle and handling incoming
      requests -- but in that case spawning does *not* happen anyways...
    <antrik> unless you are talking about something else than what I'm thinking
      of...
    <braunr> well imagine you have idle threads, yes
    <braunr> let's say a lot, like a thousand
    <braunr> and the server gets a thousand requests
    <braunr> and one more :p
    <braunr> normally only one thread should be created to handle it
    <braunr> but here, the worst case is that all threads run internal_demuxer
      roughly at the same time
    <braunr> and they all determine they need to spawn a thread
    <braunr> leading to another thousand
    <braunr> (that's extreme and very unlikely in practice of course)
    <antrik> oh, I see... you mean all the idle threads decide that no spawning
      is necessary; but before they proceed, finally one comes in and decides
      that it needs to spawn; and when the other ones are scheduled again they
      all spawn unnecessarily?
    <braunr> no, spawn is a local variable
    <braunr> it's rather, all idle threads become busy, and right before
      servicing their request, they all decide they must spawn a thread
    <antrik> I don't think that's how it works. changing the status to busy (by
      decrementing the idle counter) and checking that there are no idle
      threads is atomic, isn't it?
    <braunr> no
    <antrik> oh
    <antrik> I guess I should actually look at that code (again) before
      commenting ;-)
    <braunr> let me check
    <braunr> no sorry you're right
    <braunr> so right, it can't lead to that situation
    <braunr> i don't even understand how i couldn't see that :/
    <braunr> let's say it's the heat :p

## IRC, freenode, #hurd, 2012-08-18

    <braunr> one more attempt at fixing netdde, hope i get it right this time
    <braunr> some parts assume a ddekit thread is a cthread, because they share
      the same address
    <braunr> it's not as easy when using pthread_self :/
    <braunr> good, i got netdde to work with pthreads
    <braunr> youpi: for reference, there are now glibc, hurd and netdde
      packages on my repository
    <braunr> youpi: the debian specific patches can be found at my git
      repository ( and
    <braunr> except a freeze during boot (between exec and init) which happens
      rarely, and the starvation which still exists to some extent (fakeroot
      can cause many threads to be created in pfinet and pflocal), the
      glibc/hurd packages have been working fine for a few days now
    <braunr> the threading issue in pfinet/pflocal is directly related to
      select, which the io_select_timeout patches should fix once merged
    <braunr> well, considerably reduce at least
    <braunr> and maybe fix completely, i'm not sure

## IRC, freenode, #hurd, 2012-08-27

    <pinotree> braunr: wrt a78a95d in your pthread branch of hurd.git,
      shouldn't that job theoretically be done using the pthread api (of
      course after implementing it)?
    <braunr> pinotree: sure, it could be done through pthreads
    <braunr> pinotree: i simply restricted myself to moving the hurd to
      pthreads, not augmenting libpthread
    <braunr> (you need to remember that i work on hurd with pthreads because it
      became a dependency of my work on fixing select :p)
    <braunr> and even if it wasn't the reason, it is best to do these tasks
      (replace cthreads and implement the pthread scheduling api) separately
    <pinotree> braunr: hm ok
    <pinotree> implementing the pthread priority bits could be done
      independently though
    <braunr> youpi: there are more than 9000 threads for /hurd/streamio kmsg on
      ironforge oO
    <youpi> kmsg ?!
    <youpi> it's only /dev/klog right?
    <braunr> not sure but it seems so
    <pinotree> which syslog daemon is running?
    <youpi> inetutils
    <youpi> I've restarted the klog translator, to see whether it grows
      again
    <braunr> 6 hours and 21 minutes to build glibc on darnassus
    <braunr> pfinet still runs only 24 threads
    <braunr> the ext2 instance used for the build runs 2k threads, but that's
      because of the pageouts
    <braunr> so indeed, the priority patch helps a lot
    <braunr> (pfinet used to have several hundreds, sometimes more than a
      thousand threads after a glibc build, and potentially increasing with
      each use of fakeroot)
    <braunr> exec weighs 164M eww, we definitely have to fix that leak
    <braunr> the leaks are probably due to wrong mmap/munmap usage

### IRC, freenode, #hurd, 2012-08-29

    <braunr> youpi: btw, after my glibc build, there were as few as between
      20 and 30 threads for pflocal and pfinet
    <braunr> with the priority patch
    <braunr> ext2fs still had around 2k because of pageouts, but that's
      expected
    <youpi> ok
    <braunr> overall the results seem very good and allow the switch to
      pthreads
    <youpi> yep, so it seems
    <braunr> youpi: i think my first integration branch will include only a few
      changes, such as this priority tuning, and the replacement of
      condition_implies
    <youpi> sure
    <braunr> so we can push the move to pthreads after all its small
      dependencies
    <youpi> yep, that's the most readable way

## IRC, freenode, #hurd, 2012-09-03

    <gnu_srs> braunr: Compiling yodl-3.00.0-7:
    <gnu_srs> pthreads: real 13m42.460s, user 0m0.000s, sys 0m0.030s
    <gnu_srs> cthreads: real 9m 6.950s, user 0m0.000s, sys 0m0.020s
    <braunr> thanks
    <braunr> i'm not exactly certain about what causes the problem though
    <braunr> it could be due to libpthread using doubly-linked lists, but i
      don't think the overhead would be so much heavier because of that alone
    <braunr> there is so much contention sometimes that it could
    <braunr> the hurd would have been better off with single threaded servers
      :/
    <braunr> we should probably replace spin locks with mutexes everywhere
    <braunr> on the other hand, i don't have any more starvation problems with
      the current code

### IRC, freenode, #hurd, 2012-09-06

    <gnu_srs> braunr: Yes you are right, the new pthread-based Hurd is _much_
      slower.
    <gnu_srs> One annoying example is when compiling, the standard output is
      written in bursts with _long_ periods of no output in between :-(
    <braunr> that's more probably because of the priority boost, not the
      overhead
    <braunr> that's one of the big issues with our mach-based model
    <braunr> we either give high priorities to our servers, or we can suffer
      from message floods
    <braunr> that's in fact more a hurd problem than a mach one
    <gnu_srs> braunr: any immediate ideas how to speed up the responsiveness of
      the pthread-hurd? It is annoyingly slow (slow-witted)
    <braunr> gnu_srs: i already answered that
    <braunr> it doesn't look that much slower on my machines though
    <gnu_srs> you said you had some ideas, but not which ones. except for
      mcsim's work.
    <braunr> i have ideas about what makes it slower
    <braunr> it doesn't mean i have solutions for that
    <braunr> if i had, don't you think i'd have applied them ? :)
    <gnu_srs> ok, how to make it more responsive on the console? and print
      stdout more regularly, now several pages are stored and then flushed.
    <braunr> give more details please
    <gnu_srs> it behaves like a loaded linux desktop, with little memory
      left...
    <braunr> details about what you're doing
    <gnu_srs> apt-get source any big package and: fakeroot debian/rules binary
      2>&1 | tee ../binary.logg
    <braunr> i see
    <braunr> well no, we can't improve responsiveness
    <braunr> without reintroducing the starvation problem
    <braunr> they are linked
    <braunr> and what you're doing involves a few buffers, so the laggy feel is
      expected
    <braunr> if we can fix that simply, we'll do so after it is merged upstream

### IRC, freenode, #hurd, 2012-09-07

    <braunr> gnu_srs: i really don't feel the sluggishness you described with
      hurd+pthreads on my machines
    <braunr> gnu_srs: what's your hardware ?
    <braunr> and your VM configuration ?
    <gnu_srs> Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
    <gnu_srs> kvm -m 1024 -net nic,model=rtl8139 -net
      user,hostfwd=tcp::5562-:22 -drive
      cache=writeback,index=0,media=disk,file=hurd-experimental.img -vnc :6
      -cdrom isos/netinst_2012-07-15.iso -no-kvm-irqchip
    <braunr> what is the file system type where your disk image is stored ?
    <gnu_srs> ext3
    <braunr> and how much physical memory on the host ?
    <braunr> (paste meminfo somewhere please)
    <gnu_srs> 4G, and it's on the limit, 2 kvm instances+gnome, etc
    <gnu_srs> 80% in use by programs, 14% in cache.
    <braunr> ok, that's probably the reason then
    <braunr> the writeback option doesn't help a lot if you don't have much
      cache
    <gnu_srs> well the other instance is cthreads based, and not so sluggish.
    <braunr> we know hurd+pthreads is slower
    <braunr> i just wondered why i didn't feel it that much
    <gnu_srs> try to fire up more kvm instances, and do a heavy compile...
    <braunr> i don't do that :)
    <braunr> that's why i never had the problem
    <braunr> most of the time i have like 2-3 GiB of cache
    <braunr> and of course more on shattrath
    <braunr> (the host of the hurdboxes, which has 16 GiB of ram)

### IRC, freenode, #hurd, 2012-09-11

    <gnu_srs> Monitoring the cthreads and the pthreads load under Linux shows:
    <gnu_srs> cthread version: load can jump very high, less cpu usage than
      pthread version
    <gnu_srs> pthread version: less memory usage, background cpu usage higher
      than for cthread version
    <braunr> that's the expected behaviour
    <braunr> gnu_srs: are you using the lifothreads gnumach kernel ?
    <gnu_srs> for experimental, yes.
    <gnu_srs> i.e. pthreads
    <braunr> i mean, you're measuring on it right now, right ?
    <gnu_srs> yes, one instance running cthreads, and one pthreads (with lifo
      gnumach)
    <braunr> ok
    <gnu_srs> no swap used in either instance, will try a heavy compile later
      on.
    <braunr> what for ?
    <gnu_srs> E.g. for memory when linking. I have swap available, but no swap
      is used currently.
    <braunr> yes but, what do you intend to measure ?
    <gnu_srs> don't know, just to see if swap is used at all. it seems to be
      used not very much.
    <braunr> depends
    <braunr> be warned that using the swap means there is pageout, which is one
      of the triggers for global system freeze :p
    <braunr> anonymous memory pageout
    <gnu_srs> for linux swap is used constructively, why not on hurd?
    <braunr> because of hard to squash bugs
    <gnu_srs> aha, so it is bugs hindering swap usage :-/
    <braunr> yup :/
    <gnu_srs> Let's find them then O:-), piece of cake
    <braunr> remember my page cache branch in gnumach ? :)
    <gnu_srs> not much
    <braunr> i started it before fixing non blocking select
    <braunr> anyway, as a side effect, it should solve this stability issue
      too, but it'll probably take time
    <gnu_srs> is that branch integrated? I only remember slab and the lifo
      stuff.
    <gnu_srs> and mcsims work
    <braunr> no it's not
    <braunr> it's unfinished
    <gnu_srs> k!
    <braunr> it correctly extends the page cache to all available physical
      memory, but since the hurd doesn't scale well, it slows the system down

## IRC, freenode, #hurd, 2012-09-14

    <braunr> arg
    <braunr> darnassus seems to eat 100% cpu and make top freeze after some
      time
    <braunr> seems like there is an important leak in the pthreads version
    <braunr> could be the lifothreads patch :/
    <cjbirk> there's a memory leak?
    <cjbirk> in pthreads?
    <braunr> i don't think so, and it's not a memory leak
    <braunr> it's a port leak
    <braunr> probably in the kernel

### IRC, freenode, #hurd, 2012-09-17

    <braunr> nice, the port leak is actually caused by the exim4 loop bug

### IRC, freenode, #hurd, 2012-09-23

    <braunr> the port leak i observed a few days ago is because of exim4 (the
      infamous loop eating the cpu we've been seeing regularly)
    <youpi> oh
    <braunr> next time it happens, and if i have the occasion, i'll examine the
      problem
    <braunr> tip: when you can't use top or ps -e, you can use ps -e -o
      pid=,args=
    <youpi> or -M ?
    <braunr> haven't tested

## IRC, freenode, #hurd, 2012-09-23

    <braunr> tschwinge: i committed the last hurd pthread change,
    <braunr> tschwinge: please tell me if you consider it ok for merging

### IRC, freenode, #hurd, 2012-11-27

    <youpi> braunr: btw, I forgot to forward here, with the glibc patch it does
      boot fine, I'll push all that and build some almost-official packages for
      people to try out what will come when eglibc gets the change in unstable
    <braunr> youpi: great :)
    <youpi> thanks for managing the final bits of this
    <youpi> (and thanks to everybody involved)
    <braunr> sorry again for the non obvious parts
    <braunr> if you need the debian specific parts refined (e.g. nice commits
      for procfs & others), i can do that
    <youpi> I'll do that, no pb
    <braunr> ok
    <braunr> after that (well, during also), we should focus more on bug
      hunting

## IRC, freenode, #hurd, 2012-10-26

    <mcsim1> hello. What does the following error message mean? "unable to
      adjust libports thread priority: Operation not permitted" It appears
      when I set translators.
    <mcsim1> Seems to have something to do with libpthread. Also the following
      appeared when I tried to remove a translator: "pthread_create: Resource
      temporarily unavailable"
    <mcsim1> Oh, the first message appears very often, when I use a translator
      I set.
    <braunr> mcsim1: it's related to a recent patch i sent
    <braunr> mcsim1: hurd servers attempt to increase their priority on startup
      (when a thread is created actually)
    <braunr> to reduce message floods and thread storms (such sweet names :))
    <braunr> but if you start them as an unprivileged user, it fails, which is
      ok, it's just a warning
    <braunr> the second one is weird
    <braunr> it normally happens when you're out of available virtual space,
      not when shutting a translator down
    <mcsim1> braunr: you mean this patch: libports: reduce thread starvation on
      message floods?
    <braunr> yes
    <braunr> remember you're running on darnassus
    <braunr> with a heavily modified hurd/glibc
    <braunr> you can go back to the cthreads version if you wish
    <mcsim1> it's better to check translators' privileges before attempting to
      increase their priority, I think.
    <braunr> no
    <mcsim1> it's just a bit annoying
    <braunr> privileges can be changed during execution
    <braunr> well, remove it
    <mcsim1> But the warning should not appear.
    <braunr> what could be done is to limit the warning to one occurrence
    <braunr> mcsim1: i prefer that it appears
    <mcsim1> ok
    <braunr> it's always better to be explicit and verbose
    <braunr> well not always, but very often
    <braunr> one of the reasons the hurd is so difficult to debug is the lack
      of a "message server" à la dmesg