[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]] [[!tag open_issue_glibc open_issue_porting]] # IRC, freenode, #hurd, 2012-12-05 <braunr> rbraun 18813 R 2hrs ln -sf ../af_ZA/LC_NUMERIC debian/locales-all/usr/lib/locale/en_BW/LC_NUMERIC <braunr> when building glibc <braunr> is this a known issue ? <tschwinge> braunr: No. Can you get a backtrace? <braunr> tschwinge: with gdb you mean ? <tschwinge> Yes. If you have any debugging symbols (glibc?). <braunr> or the build log leading to that ? <braunr> ok, i will next time i have it <tschwinge> OK. <braunr> (i regularly had it when working on the pthreads port) <braunr> tschwinge: http://www.sceen.net/~rbraun/hurd_glibc_build_deadlock_trace <braunr> youpi: ^ <youpi> Mmm, there's not so much we can do about this one <braunr> youpi: what do you mean ? <youpi> the problem is that it's really a reentrency issue of the libc locale <youpi> it would happen just the same on linux <braunr> sure <braunr> but hat doesn't mean we can't report and/or fix it :) <youpi> (the _nl_state_lock) <braunr> do you have any workaround in mind ? <youpi> no <youpi> actually that's what I meant by "there's not so much we can do about this" <braunr> ok <youpi> because it's a bad interaction between libfakeroot and glibc <youpi> glibc believe fxtstat64 would never call locale functions <youpi> but with libfakeroot it does <braunr> i see <youpi> only because we get an EAGAIN here <braunr> but hm, doesn't it happen on linux ? <youpi> EAGAIN doesn't happen on linux for fxstat64, no :) <braunr> why does it happen on the hurd ? <youpi> I mean for fakeroot stuff <youpi> probably because fakeroot uses socket functions <youpi> for which we probably don't properly handleEAGAIN <youpi> I've already seen such kind of issue <youpi> in buildd failures <braunr> ok <youpi> (so the actual bug here is EAGAIN <youpi> ) <braunr> yes, so we can do something about it <braunr> worth a look <pinotree> (implement sysv semaphores) <youpi> pinotree: if we could also solve all these buildd EAGAIN issues that'd be nice :) <braunr> that EAGAIN error might also be what makes exim behave badly and loop forever <youpi> possibly <braunr> i've updated the trace with debugging symbols <braunr> it fails on connect <pinotree> like http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=563342 ? <braunr> it's EAGAIN, not ECONNREFUSED <pinotree> ah ok <braunr> might be an error in tcp_v4_get_port ## IRC, freenode, #hurd, 2012-12-06 <braunr> hmm, tcp_v4_get_port sometimes fails indeed <gnu_srs> braunr: may I ask how you found out, adding print statements in pfinet, or? <braunr> yes <gnu_srs> OK, so that's the only (easy) way to debug. <braunr> that's the last resort <braunr> gdb is easy too <braunr> i could have added a breakpoint too <braunr> but i didn't want to block pfinet while i was away <braunr> is it possible to force the use of fakeroot-tcp on linux ? <braunr> the problem seems to be that fakeroot doesn't close the sockets that it connected to faked-tcp <braunr> which, at some point, exhauts the port space <pinotree> braunr: sure <pinotree> change the fakeroot dpkg alternative <braunr> ok <pinotree> calling it explicitly `fakeroot-tcp command` or `dpkg-buildpackage -rfakeroot-tcp ...` should work too <braunr> fakeroot-tcp looks really evil :p <braunr> hum, i don't see any faked-tcp process on linux :/ <pinotree> not even with `fakeroot-tcp bash -c "sleep 10"`? <braunr> pinotree: now yes <braunr> but, does it mean faked-tcp is started for *each* process loading fakeroot-tcp ? <braunr> (the lib i mean) <pinotree> i think so <braunr> well the hurd doesn't seem to do that at all <braunr> or maybe it does and i don't see it <braunr> the stale faked-tcp processes could be those that failed something only <pinotree> yes, there's also that issue: sometimes there are stake faked-tcp processes <braunr> hum no, i see one faked-tcp that consumes cpu when building glibc <pinotree> *stale <braunr> it's the same process for all commands <pinotree> <braunr> but, does it mean faked-tcp is started for *each* process loading fakeroot-tcp ? <pinotree> → everytime you start fakeroot, there's a new faked-xxx for it <braunr> it doesn't look that way <braunr> again, on the hurd, i see one faked-tcp, consuming cpu while building so i assume it services libfakeroot-tcp requests <pinotree> yes <braunr> which means i probably won't reproduce the problem on linux <pinotree> it serves that fakeroot under which the binary(-arch) target is run <braunr> or perhaps it's the normal fakeroot-tcp behaviour on sid <braunr> pinotree: a faked-tcp that is started for each command invocation will implicitely make the network stack close all its sockets when exiting <braunr> pinotree: as our fakeroot-tcp uses the same instance of faked-tcp, it's a lot more likely to exhaust the port space <pinotree> i see <braunr> i'll try on sid and see how it behaves <braunr> pinotree: on the other hand, forking so many processes at each command invocation may make exec leak a lot :p <braunr> or rather, a lot more <braunr> (or maybe not, since it leaks only in some cases) [[exec_memory_leaks]]. <braunr> pinotree: actually, the behaviour under linux is the same with the alternative correctly set, whereas faked-tcp is restarted (if used at all) with -rfakeroot-tcp <braunr> hm no, even that isn't true <braunr> grr <braunr> pinotree: i think i found a handy workaround for fakeroot <braunr> pinotree: the range of local ports in our networking stack is a lot more limited than what is configured in current systems <braunr> by extending it, i can now build glibc \o/ <pinotree> braunr: what are the current ours and the usual one? <braunr> see pfinet/linux-src/net/ipv4/tcp_ipv4.c <braunr> the modern ones are the ones suggested in the comment <braunr> sysctl_local_port_range is the symbol storing the range <pinotree> i see <pinotree> what's the current range on linux? <braunr> 20:44 < braunr> the modern ones are the ones suggested in the comment <pinotree> i see <braunr> $ cat /proc/sys/net/ipv4/ip_local_port_range <braunr> 32768 61000 <braunr> so, i'm not sure why we have the problem, since even on linux, netstat doesn't show open bound ports, but it does help <braunr> the fact faked-tcp can remain after its use is more problematic <pinotree> (maybe pfinet could grow a (startup-only?) option to change it, similar to that sysctl) <braunr> but it can also stems from the same issue gnu_srs found about closed sockets that haven't been shut down <braunr> perhaps <braunr> but i don't see the point actually <braunr> we could simply change the values in the code <braunr> youpi: first, in pfinet, i increased the range of local ports to reduce the likeliness of port space exhaustion <braunr> so we should get a lot less EAGAIN after that <braunr> (i've not committed any of those changes) <youpi> range of local ports? <braunr> see pfinet/linux-src/net/ipv4/tcp_ipv4.c, tcp_v4_get_port function and sysctl_local_port_range array <youpi> oh <braunr> EAGAIN is caused by tcp_v4_get_port failing at <braunr> /* Exhausted local port range during search? */ <braunr> if (remaining <= 0) <braunr> goto fail; <youpi> interesting <youpi> so it's not a hurd bug after all <youpi> just a problem in fakeroot eating a lot of ports <braunr> maybe because of the same issue gnu_srs worked on (bad socket close when no clean shutdown) <braunr> maybe, maybe not <braunr> but increasing the range is effective <braunr> and i compared with what linux does today, which is exactly what is in the comment above sysctl_local_port_range <braunr> so it looks safe <youpi> so that means that the pfinet just uses ports 1024- 4999 for auto-allocated ports? <braunr> i guess so <youpi> the linux pfinet I meant <braunr> i haven't checked the whole code but it looks that way <youpi> ./sysctl_net_ipv4.c:static int ip_local_port_range_min[] = { 1, 1 }; <youpi> ./sysctl_net_ipv4.c:static int ip_local_port_range_max[] = { 65535, 65535 }; <youpi> looks like they have increased it since then :) <braunr> hum :) <braunr> $ cat /proc/sys/net/ipv4/ip_local_port_range <braunr> 32768 61000 <youpi> yep, same here <youpi> ./inet_connection_sock.c: .range = { 32768, 61000 }, <youpi> so there are two things apparently <youpi> but linux now defaults to 32k-61k <youpi> braunr: please just push the port range upgrade to 32Ki-61K <braunr> ok, will do <youpi> there's not reason not to do it ## IRC, freenode, #hurd, 2012-12-11 <braunr> youpi: at least, i haven't had any failure building eglibc since the port range patch <youpi> good :)