From 9933cec0a18ae2a3d752f269d1bb12c19f51199d Mon Sep 17 00:00:00 2001
From: Thomas Schwinge
Date: Sun, 21 Jul 2013 15:35:02 -0400
Subject: IRC.

---
 open_issues/libpthread.mdwn | 197 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 195 insertions(+), 2 deletions(-)

diff --git a/open_issues/libpthread.mdwn b/open_issues/libpthread.mdwn
index e2fda122..8e3fde71 100644
--- a/open_issues/libpthread.mdwn
+++ b/open_issues/libpthread.mdwn
@@ -1260,7 +1260,7 @@ Most of the issues raised on this page has been resolved, a few remain.
     i'll add traces to know which step causes the error
-### IRC, freenode, #hurd, 2012-12-11
+#### IRC, freenode, #hurd, 2012-12-11
     braunr: mktoolnix seems like a reproducer for the libports thread priority issue
@@ -1273,7 +1273,7 @@ Most of the issues raised on this page has been resolved, a few remain.
     that's it, yes
-### IRC, freenode, #hurd, 2013-03-01
+#### IRC, freenode, #hurd, 2013-03-01
     braunr: btw, "unable to adjust libports thread priority: (ipc/send) invalid destination port" is actually not a sign of fatality
@@ -1284,6 +1284,34 @@ Most of the issues raised on this page has been resolved, a few remain.
     weird sentence, agreed :p
+#### IRC, freenode, #hurd, 2013-06-14
+
+    Hi, when running check for gccgo the following occurs (multiple
+    times) locking up the console
+    unable to adjust libports thread priority: (ipc/send) invalid
+    destination port
+    (not locking up the console, it was just completely filled with
+    messages))
+    gnu_srs: are you running your translator as root ?
+    or, do you have a translator running as an unprivileged user ?
+    hm, invalid dest port
+    that's a bug :p
+    but i don't know why
+    i'll have to take some time to track it down
+    it might be a user ref overflow or something similarly tricky
+    gnu_srs: does it happen everytime you run gccgo checks or only
+    after the system has been alive for some time ?
+    (some time being at least a few hours, more probably days)
+
+#### IRC, freenode, #hurd, 2013-07-05
+
+    ok, found the bug about invalid ports when adjusting priorities
+    thhe hurd must be plagued with wrong deallocations :(
+    i have so many problems when trying to cleanly destroy threads
+
+[[libpthread/t/fix_have_kernel_resources]].
+
+
 ### IRC, freenode, #hurd, 2013-03-11
     youpi: oh btw, i noticed a problem with the priority adjustement
@@ -1296,6 +1324,171 @@ Most of the issues raised on this page has been resolved, a few remain.
     uh indeed
+### IRC, freenode, #hurd, 2013-07-01
+
+    braunr: it seems as if pfinet is not prioritized enough
+    I'm getting network connectivity issues when the system is quite
+    loaded
+    loaded with what ?
+    it could be ext2fs having a lot more threads than other servers
+    building packages
+    I'm talking about the buildds
+    ok
+    ironforge or others ?
+    they're having troubles uploading packages while building stuff
+    ironforge and others
+    that happened already in the past sometimes
+    but at the moment it's really pronounced
+    i don't think it's a priority issue
+    i think it's swapping
+    ah, that's not impossible indeed
+    but why would it swap?
+    there's a lot of available memory
+    a big file is enough
+    it pushes anonymous memory out
+    to fill 900MiB memory ?
+    i see 535M of swap on if
+    yes
+    ironforge is just building libc
+    and for some reason, swapping is orders of magnitude slower than
+    anything else
+    not linking it yet
+    i also see 1G of free memory on it
+    that's what I meant with 900MiB
+    so at some point, it needed a lot of memory, caused swapping
+    and from time to time it's probably swapping back
+    well, pfinet had all the time to swap back already
+    I don't see why it should be still suffering from it
+    swapping is a kernel activity
+    ok, but once you're back, you're back
+    unless something else pushes you out
+    if the kernel is busy waiting for the default pager, nothing makes
+    progress
+    (eccept the default pager hopefully)
+    sure but pfinet should be back already, since it does work
+    so I don't see why it should wait for something
+    the kernel is waiting
+    and the kernel isn't preemptibl
+    e
+    although i'm not sure preemption is the problem here
+    well
+    what I don't understand is what we have changed that could have so
+    much impact
+    the only culprit I can see is the priorities we have changed
+    recently
+    do you mean it happens a lot more frequently than before ?
+    yes
+    way
+    ok
+    ironforge is almost unusable while building glibc
+    I've never seen that
+    that's weird, i don't have these problems on darnassus
+    but i think i reboot it more often
+    could be a scalability issue then
+    combined with the increased priorities
+    if is indeed running full time on the host, whereas swapping
+    issues show the cpu being almost idle
+    loadavg is high too so i guess there are many threads
+    0 971 3 -20 -20 1553 305358625 866485906 523M 63M * S
+    0 972 3 -20 -20 1434 125237556 719443981 483M 5.85M * S
+    around 1k5 each
+    that's quite usual
+    could be the priorities then
+    but i'm afraid that if we lower them, the number of threads will
+    grow out of control
+    (good thing is i'm currently working on a way to make libpthread
+    actually remove kernel resources)
+    but the priorities should be the same in ext2fs and pfinet,
+    shouldn't they?
+    yes but ext2 has a lot more threads than pfinet
+    the scheduler only sees threads, we don't have a grouping feature
+    right
+    we also should remove priority depressing in glibc
+    (in sched_yield)
+    it affects spin locks
+
+    youpi: is it normal to see priorities of 26 ?
+    braunr: we have changed the nice factor
+    ah, factor
+    Mm, I'm however realizing the gnumach kernel running these systems
+    hasn't been upgraded in a while
+    it may not even have the needed priority levels
+    ar euare you using top right now on if ?
+    hm no i don't see it any more
+    well yes, could be the priorities ..
+    I've rebooted with an upgraded kernel
+    no issue so far
+    package uploads will tell me on the long run
+    i bet it's also a scalability issue
+    but why would it appear now only?
+    until the cache and other data containers start to get filled,
+    processing is fast enough that we don't see it hapenning
+    sure, but I haven't seen that in the past
+    oh it's combined with the increased priorities
+    even after a week building packages
+    what i mean is, increased priorities don't affect much if threads
+    porcess things fast
+    things get longer with more data, and then increased prioritis
+    give more time to these threads
+    and that's when the problem appears
+    but increased priorities give more time to the pfinet threads too,
+    don't they?
+    yes
+    so what is different ?
+    but again, there are a lot more threads elsewhere
+    with a lot more data to process
+    sure, but that has alwasy been so
+    hm
+    really, 1k5 threads does not surprise me at all :)
+    10k would
+    there aren't all active either
+    yes
+    but right, i don't know why pfinet would be given less time than
+    other threads ..
+    compared to before
+    particularly on xen-based buildds
+    libpthread is slower than cthreads
+    where it doesn't even have to wait for netdde
+    threads need more quanta to achieve the same ting
+    perhaps processing could usually be completed in one go before,
+    and not any more
+    we had a discussion about this with antrik
+
+    youpi: concerning the buildd issue, i don't think pfinet is
+    affected actually
+    but the applications using the network may be
+    why using the network would be a difference ?
+    normal applications have a lower priority
+    what i mean is, pfinet simply has nothing to do, because normal
+    applications don't have enough cpu time
+    (what you said earlier seemed to imply pfinet had issues, i don't
+    think it has)
+    it should be easy to test by pinging the machine while under load
+    we should also check the priority of the special thread used to
+    handle packets, both in pfinet and netdde
+    this one isn't spawned by libports and is likely to have a lower
+    priority as well
+
+    youpi: you're right, something very recent slowed things down a
+    lot
+    perhaps the new priority factor
+    well not the factor but i suppose the priority range has been
+    increased
+
+[[nice_vs_mach_thread_priorities]].
+
+    braunr: haven't had any upload issue so far
+    over 20 uploads
+    while it was usually 1 every 2 before...
+    so it was very probably the kernel missing the priorities levels
+    ok
+    i think i've had the same problem on another virtual machine
+    with a custom kernel i built a few weeks ago
+    same kind of issue i guess
+    it's fine now, and always was on darnassus
 ## IRC, freenode, #hurd, 2012-12-05
-- 
cgit v1.2.3