Diffstat (limited to 'open_issues/fakeroot_eagain.mdwn')
-rw-r--r-- | open_issues/fakeroot_eagain.mdwn | 216 |
1 files changed, 216 insertions, 0 deletions
diff --git a/open_issues/fakeroot_eagain.mdwn b/open_issues/fakeroot_eagain.mdwn
new file mode 100644
index 00000000..6b684a04
--- /dev/null
+++ b/open_issues/fakeroot_eagain.mdwn
@@ -0,0 +1,216 @@
+[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_glibc open_issue_porting]]
+
+
+# IRC, freenode, #hurd, 2012-12-05
+
+ <braunr> rbraun 18813 R 2hrs ln -sf ../af_ZA/LC_NUMERIC
+ debian/locales-all/usr/lib/locale/en_BW/LC_NUMERIC
+ <braunr> when building glibc
+ <braunr> is this a known issue ?
+ <tschwinge> braunr: No. Can you get a backtrace?
+ <braunr> tschwinge: with gdb you mean ?
+ <tschwinge> Yes. If you have any debugging symbols (glibc?).
+ <braunr> or the build log leading to that ?
+ <braunr> ok, i will next time i have it
+ <tschwinge> OK.
+ <braunr> (i regularly had it when working on the pthreads port)
+ <braunr> tschwinge:
+ http://www.sceen.net/~rbraun/hurd_glibc_build_deadlock_trace
+ <braunr> youpi: ^
+ <youpi> Mmm, there's not so much we can do about this one
+ <braunr> youpi: what do you mean ?
+ <youpi> the problem is that it's really a reentrancy issue of the libc
+ locale
+ <youpi> it would happen just the same on linux
+ <braunr> sure
+ <braunr> but that doesn't mean we can't report and/or fix it :)
+ <youpi> (the _nl_state_lock)
+ <braunr> do you have any workaround in mind ?
+ <youpi> no
+ <youpi> actually that's what I meant by "there's not so much we can do
+ about this"
+ <braunr> ok
+ <youpi> because it's a bad interaction between libfakeroot and glibc
+ <youpi> glibc believes fxstat64 would never call locale functions
+ <youpi> but with libfakeroot it does
+ <braunr> i see
+ <youpi> only because we get an EAGAIN here
+ <braunr> but hm, doesn't it happen on linux ?
+ <youpi> EAGAIN doesn't happen on linux for fxstat64, no :)
+ <braunr> why does it happen on the hurd ?
+ <youpi> I mean for fakeroot stuff
+ <youpi> probably because fakeroot uses socket functions
+ <youpi> for which we probably don't properly handle EAGAIN
+ <youpi> I've already seen such kind of issue
+ <youpi> in buildd failures
+ <braunr> ok
+ <youpi> (so the actual bug here is EAGAIN
+ <youpi> )
+ <braunr> yes, so we can do something about it
+ <braunr> worth a look
+ <pinotree> (implement sysv semaphores)
+ <youpi> pinotree: if we could also solve all these buildd EAGAIN issues
+ that'd be nice :)
+ <braunr> that EAGAIN error might also be what makes exim behave badly and
+ loop forever
+ <youpi> possibly
+ <braunr> i've updated the trace with debugging symbols
+ <braunr> it fails on connect
+ <pinotree> like http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=563342 ?
+ <braunr> it's EAGAIN, not ECONNREFUSED
+ <pinotree> ah ok
+ <braunr> might be an error in tcp_v4_get_port
+
+
+## IRC, freenode, #hurd, 2012-12-06
+
+ <braunr> hmm, tcp_v4_get_port sometimes fails indeed
+ <gnu_srs> braunr: may I ask how you found out, adding print statements in
+ pfinet, or?
+ <braunr> yes
+ <gnu_srs> OK, so that's the only (easy) way to debug.
+ <braunr> that's the last resort
+ <braunr> gdb is easy too
+ <braunr> i could have added a breakpoint too
+ <braunr> but i didn't want to block pfinet while i was away
+ <braunr> is it possible to force the use of fakeroot-tcp on linux ?
+ <braunr> the problem seems to be that fakeroot doesn't close the sockets
+ that it connected to faked-tcp
+ <braunr> which, at some point, exhausts the port space
+ <pinotree> braunr: sure
+ <pinotree> change the fakeroot dpkg alternative
+ <braunr> ok
+ <pinotree> calling it explicitly `fakeroot-tcp command` or
+ `dpkg-buildpackage -rfakeroot-tcp ...` should work too
+ <braunr> fakeroot-tcp looks really evil :p
+ <braunr> hum, i don't see any faked-tcp process on linux :/
+ <pinotree> not even with `fakeroot-tcp bash -c "sleep 10"`?
+ <braunr> pinotree: now yes
+ <braunr> but, does it mean faked-tcp is started for *each* process loading
+ fakeroot-tcp ?
+ <braunr> (the lib i mean)
+ <pinotree> i think so
+ <braunr> well the hurd doesn't seem to do that at all
+ <braunr> or maybe it does and i don't see it
+ <braunr> the stale faked-tcp processes could be those that failed something
+ only
+ <pinotree> yes, there's also that issue: sometimes there are stake
+ faked-tcp processes
+ <braunr> hum no, i see one faked-tcp that consumes cpu when building glibc
+ <pinotree> *stale
+ <braunr> it's the same process for all commands
+ <pinotree> <braunr> but, does it mean faked-tcp is started for *each*
+ process loading fakeroot-tcp ?
+ <pinotree> → everytime you start fakeroot, there's a new faked-xxx for it
+ <braunr> it doesn't look that way
+ <braunr> again, on the hurd, i see one faked-tcp, consuming cpu while
+ building so i assume it services libfakeroot-tcp requests
+ <pinotree> yes
+ <braunr> which means i probably won't reproduce the problem on linux
+ <pinotree> it serves that fakeroot under which the binary(-arch) target is
+ run
+ <braunr> or perhaps it's the normal fakeroot-tcp behaviour on sid
+ <braunr> pinotree: a faked-tcp that is started for each command invocation
+ will implicitly make the network stack close all its sockets when
+ exiting
+ <braunr> pinotree: as our fakeroot-tcp uses the same instance of faked-tcp,
+ it's a lot more likely to exhaust the port space
+ <pinotree> i see
+ <braunr> i'll try on sid and see how it behaves
+ <braunr> pinotree: on the other hand, forking so many processes at each
+ command invocation may make exec leak a lot :p
+ <braunr> or rather, a lot more
+ <braunr> (or maybe not, since it leaks only in some cases)
+
+[[exec_leak]].
+
+ <braunr> pinotree: actually, the behaviour under linux is the same with the
+ alternative correctly set, whereas faked-tcp is restarted (if used at
+ all) with -rfakeroot-tcp
+ <braunr> hm no, even that isn't true
+ <braunr> grr
+ <braunr> pinotree: i think i found a handy workaround for fakeroot
+ <braunr> pinotree: the range of local ports in our networking stack is a
+ lot more limited than what is configured in current systems
+ <braunr> by extending it, i can now build glibc \o/
+ <pinotree> braunr: what are the current ours and the usual one?
+ <braunr> see pfinet/linux-src/net/ipv4/tcp_ipv4.c
+ <braunr> the modern ones are the ones suggested in the comment
+ <braunr> sysctl_local_port_range is the symbol storing the range
+ <pinotree> i see
+ <pinotree> what's the current range on linux?
+ <braunr> 20:44 < braunr> the modern ones are the ones suggested in the
+ comment
+ <pinotree> i see
+ <braunr> $ cat /proc/sys/net/ipv4/ip_local_port_range
+ <braunr> 32768 61000
+ <braunr> so, i'm not sure why we have the problem, since even on linux,
+ netstat doesn't show open bound ports, but it does help
+ <braunr> the fact faked-tcp can remain after its use is more problematic
+ <pinotree> (maybe pfinet could grow a (startup-only?) option to change it,
+ similar to that sysctl)
+ <braunr> but it can also stem from the same issue gnu_srs found about
+ closed sockets that haven't been shut down
+ <braunr> perhaps
+ <braunr> but i don't see the point actually
+ <braunr> we could simply change the values in the code
+
+ <braunr> youpi: first, in pfinet, i increased the range of local ports to
+ reduce the likeliness of port space exhaustion
+ <braunr> so we should get a lot less EAGAIN after that
+ <braunr> (i've not committed any of those changes)
+ <youpi> range of local ports?
+ <braunr> see pfinet/linux-src/net/ipv4/tcp_ipv4.c, tcp_v4_get_port function
+ and sysctl_local_port_range array
+ <youpi> oh
+ <braunr> EAGAIN is caused by tcp_v4_get_port failing at
+ <braunr> /* Exhausted local port range during search? */
+ <braunr> if (remaining <= 0)
+ <braunr> goto fail;
+ <youpi> interesting
+ <youpi> so it's not a hurd bug after all
+ <youpi> just a problem in fakeroot eating a lot of ports
+ <braunr> maybe because of the same issue gnu_srs worked on (bad socket
+ close when no clean shutdown)
+ <braunr> maybe, maybe not
+ <braunr> but increasing the range is effective
+ <braunr> and i compared with what linux does today, which is exactly what
+ is in the comment above sysctl_local_port_range
+ <braunr> so it looks safe
+ <youpi> so that means that the pfinet just uses ports 1024-4999 for
+ auto-allocated ports?
+ <braunr> i guess so
+ <youpi> the linux pfinet I meant
+ <braunr> i haven't checked the whole code but it looks that way
+ <youpi> ./sysctl_net_ipv4.c:static int ip_local_port_range_min[] = { 1, 1
+ };
+ <youpi> ./sysctl_net_ipv4.c:static int ip_local_port_range_max[] = { 65535,
+ 65535 };
+ <youpi> looks like they have increased it since then :)
+ <braunr> hum :)
+ <braunr> $ cat /proc/sys/net/ipv4/ip_local_port_range
+ <braunr> 32768 61000
+ <youpi> yep, same here
+ <youpi> ./inet_connection_sock.c: .range = { 32768, 61000 },
+ <youpi> so there are two things apparently
+ <youpi> but linux now defaults to 32k-61k
+ <youpi> braunr: please just push the port range upgrade to 32Ki-61K
+ <braunr> ok, will do
+ <youpi> there's no reason not to do it
+
+
+## IRC, freenode, #hurd, 2012-12-11
+
+ <braunr> youpi: at least, i haven't had any failure building eglibc since
+ the port range patch
+ <youpi> good :)
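
Editor's note: the log attributes the EAGAIN to the automatic local port search running out of candidates, and reports that widening the range from 1024-4999 to 32768-61000 made the failures go away. The sketch below is a toy model of that arithmetic, not pfinet's or Linux's actual `tcp_v4_get_port`. The class `PortAllocator` and the helper `connections_until_eagain` are hypothetical names introduced only for illustration, and the model assumes that sockets are never released, as the discussion suspects of fakeroot.

```python
import errno

# Ranges quoted in the log; both bounds are treated as inclusive here.
OLD_RANGE = (1024, 4999)     # range pfinet appeared to use, per the discussion
NEW_RANGE = (32768, 61000)   # value shown by /proc/sys/net/ipv4/ip_local_port_range


class PortAllocator:
    """Toy model of automatic local port selection (hypothetical, not pfinet code)."""

    def __init__(self, low, high):
        self.low, self.high = low, high
        self.in_use = set()
        self.rover = low  # next candidate port to try

    def size(self):
        return self.high - self.low + 1

    def get_port(self):
        # Try every port in the range at most once, starting at the rover.
        remaining = self.size()
        while remaining > 0:
            port = self.rover
            self.rover = self.low if self.rover >= self.high else self.rover + 1
            if port not in self.in_use:
                self.in_use.add(port)
                return port
            remaining -= 1
        # Mirrors the "Exhausted local port range during search?" case in the log.
        raise OSError(errno.EAGAIN, "local port range exhausted")

    def release(self, port):
        self.in_use.discard(port)


def connections_until_eagain(low, high):
    """Count how many ports can be taken before the search fails with EAGAIN."""
    alloc = PortAllocator(low, high)
    count = 0
    try:
        while True:
            alloc.get_port()
            count += 1
    except OSError as exc:
        assert exc.errno == errno.EAGAIN
    return count
```

Under these assumptions, `connections_until_eagain(*OLD_RANGE)` returns 3976 and `connections_until_eagain(*NEW_RANGE)` returns 28233, so the wider range leaves roughly seven times as many ports before a client that never frees them hits EAGAIN. It postpones exhaustion rather than fixing the leak.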