[[!meta copyright="Copyright © 2007, 2008, 2010, 2011, 2012, 2013 Free Software
Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
[[!tag open_issue_glibc]]
Here's what's to be done for maintaining glibc.
[[!toc levels=2]]
# [[General information|/glibc]]
# [[Sources|source_repositories/glibc]]
# [[Debian Cheat Sheet|debian]]
# Configuration
Last reviewed up to the [[Git mirror's 0323d08657f111267efa47bd448fbf6cd76befe8
(2013-05-24) sources|source_repositories/glibc]].
* `t/hurdsig-fixes`
hurdsig.c: In function '_hurd_internal_post_signal':
hurdsig.c:1188:26: warning: 'pending' may be used uninitialized in this function [-Wmaybe-uninitialized]
hurdsig.c:1168:12: note: 'pending' was declared here
* `t/host-independency`
[[!message-id "87bougerfb.fsf@kepler.schwinge.homeip.net"]], [[!message-id
"20120525202732.GA31088@intel.com"]], commit
918b56067a444572f1c71b02f18255ae4540b043. [[!GCC_PR 53183]], GCC commit
c05436a7e361b8040ee899266e15bea817212c37.
* `t/pie-sbrk`
[[gcc/PIE]].
* `t/sysvshm`
../sysdeps/mach/hurd/shmat.c: In function '__shmat':
../sysdeps/mach/hurd/shmat.c:57:7: warning: implicit declaration of function '__close' [-Wimplicit-function-declaration]
../sysdeps/mach/hurd/shmget.c: In function 'get_exclusive':
../sysdeps/mach/hurd/shmget.c:85:8: warning: variable 'is_private' set but not used [-Wunused-but-set-variable]
../sysdeps/mach/hurd/shmget.c:102:8: warning: 'dir' may be used uninitialized in this function [-Wmaybe-uninitialized]
../sysdeps/mach/hurd/shmget.c:102:8: warning: 'file' may be used uninitialized in this function [-Wmaybe-uninitialized]
* [[`t/tls`|t/tls]]
* [[`t/tls-threadvar`|t/tls-threadvar]]
* `t/verify.h`
People didn't like this too much.
Other examples:
* 11988f8f9656042c3dfd9002ac85dff33173b9bd -- `static_assert`
* [[toolchain/cross-gnu]], without `--disable-multi-arch`
i686-pc-gnu-gcc ../sysdeps/i386/i686/multiarch/strcmp.S -c [...]
../sysdeps/i386/i686/multiarch/../strcmp.S: Assembler messages:
../sysdeps/i386/i686/multiarch/../strcmp.S:31: Error: symbol `strcmp' is already defined
make[2]: *** [/media/boole-data/thomas/tmp/gnu-0/src/glibc.obj/string/strcmp.o] Error 1
make[2]: Leaving directory `/media/boole-data/thomas/tmp/gnu-0/src/glibc/string'
This might simply be due to one or more patches missing from master.
* `--disable-multi-arch`
IRC, freenode, #hurd, 2012-11-22
tschwinge: is your glibc build w/ or w/o multiarch?
pinotree: See open_issues/glibc: --disable-multi-arch
ah, because you do cross-compilation?
No, that's natively.
There is also a not of what happened in cross-gnu when I
enabled multi-arch.
No idea whether that's still relevant, though.
EPARSE
s%not%note
Better?
yes :)
As for native builds: I guess I just didn't (want to) play
with it yet.
it is enabled in debian since quite some time, maybe other
i386/i686 patches (done for linux) help us too
I though we first needed some CPU identification
infrastructe before it can really work?
I thought [...].
as in use the i686 variant as runtime automatically? i guess
so
I thought I had some notes about that, but can't currently
find them.
Ah, I probably have been thinking about open_issues/ifunc
and open_issues/libc_variant_selection.
* `--build=X`
`long double` test: due to `cross_compiling = maybe`, `configure` wants to
execute a file, which fails. Thus `--build=X` has to be set.
* Check what all these are:
running configure fragment for sysdeps/mach/hurd
checking Hurd header version... ok
running configure fragment for sysdeps/mach
checking for i586-pc-gnu-mig... i586-pc-gnu-mig
checking for mach/mach_types.h... yes
checking for mach/mach_types.defs... yes
checking for task_t in mach/mach_types.h... task_t
checking for thread_t in mach/mach_types.h... thread_t
checking for creation_time in task_basic_info... yes
checking for mach/mach.defs... yes
checking for mach/mach4.defs... yes
checking for mach/clock.defs... no
checking for mach/clock_priv.defs... no
checking for mach/host_priv.defs... no
checking for mach/host_security.defs... no
checking for mach/ledger.defs... no
checking for mach/lock_set.defs... no
checking for mach/processor.defs... no
checking for mach/processor_set.defs... no
checking for mach/task.defs... no
checking for mach/thread_act.defs... no
checking for mach/vm_map.defs... no
checking for mach/memory_object.defs... yes
checking for mach/memory_object_default.defs... yes
checking for mach/default_pager.defs... yes
checking for mach/i386/mach_i386.defs... yes
checking for egrep... grep -E
checking for host_page_size in mach_host.defs... no
checking for mach/machine/ndr_def.h... no
checking for machine/ndr_def.h... no
checking for i386_io_perm_modify in mach_i386.defs... yes
checking for i386_set_gdt in mach_i386.defs... yes
checking whether i586-pc-gnu-mig supports the retcode keyword... yes
* `sysdeps/i386/stackguard-macros.h`
See [[t/tls|t/tls]].
* Verify 77c84aeb81808c3109665949448dba59965c391e against
`~/shared/glibc/make_TAGS.patch`.
* `HP_SMALL_TIMING_AVAIL` not defined anywhere.
* Unify `CPUCLOCK_WHICH` stuff in `clock_*` files.
* Not all tests are re-run in a `make -k tests; make tests-clean; make -k
tests` cycle. For example, after `make tests-clean`:
$ find ./ -name \*.out
./localedata/tst-locale.out
./localedata/sort-test.out
./localedata/de_DE.out
./localedata/en_US.out
./localedata/da_DK.out
./localedata/hr_HR.out
./localedata/sv_SE.out
./localedata/tr_TR.out
./localedata/fr_FR.out
./localedata/si_LK.out
./localedata/tst-mbswcs.out
./iconvdata/iconv-test.out
./iconvdata/tst-tables.out
./stdlib/isomac.out
./posix/wordexp-tst.out
./posix/annexc.out
./posix/tst-getconf.out
./elf/check-textrel.out
./elf/check-execstack.out
./elf/check-localplt.out
./c++-types-check.out
./check-local-headers.out
./begin-end-check.out
* `CPUCLOCK_WHICH`, `t/cpuclock`
/media/boole-data/thomas/tmp/gnu-0/src/glibc.obj/rt/librt_pic.a(clock_settime.os): In function `clock_settime':
/media/boole-data/thomas/tmp/gnu-0/src/glibc/rt/../sysdeps/unix/clock_settime.c:113: undefined reference to `CPUCLOCK_WHICH'
/media/boole-data/thomas/tmp/gnu-0/src/glibc/rt/../sysdeps/unix/clock_settime.c:114: undefined reference to `CPUCLOCK_WHICH'
collect2: error: ld returned 1 exit status
make[2]: *** [/media/boole-data/thomas/tmp/gnu-0/src/glibc.obj/rt/librt.so] Error 1
make[2]: Leaving directory `/media/boole-data/thomas/tmp/gnu-0/src/glibc/rt'
make[1]: *** [rt/others] Error 2
make[1]: Leaving directory `/media/boole-data/thomas/tmp/gnu-0/src/glibc'
make: *** [all] Error 2
* Missing interfaces, amongst many more.
Many more are missing; some of them have been announced in `NEWS`, others
typically haven't (like new flags to existing functions). Typically,
porters will notice missing functionality. But in case you're looking for
something to work on, here's a list.
`AT_EMPTY_PATH`, `CLOCK_BOOTTIME`, `CLOCK_BOOTTIME_ALARM`,
`CLOCK_REALTIME_ALARM`, `O_PATH`,
`PTRACE_*` (for example, cbff0d9689c4d68578b6a4f0a17807232506ea27,
b1b2aaf8eb9eed301ea8f65b96844568ca017f8b),
`RLIMIT_RTTIME`, `SEEK_DATA` (`unistd.h`), `SEEK_HOLE` (`unistd.h`),
`clock_adjtime`, `fallocate`, `fallocate64`, `name_to_handle_at`,
`open_by_handle_at`, `process_vm_readv`, `process_vm_writev`,
`setns`, `sync_file_range`, [[`mremap`|mremap]] and [[several
`MAP_*`|glibc/mmap]], `PTR_MANGLE`/`PTR_DEMANGLE` (`t/ptrmangle`)
Also check the contents of `gnu/stubs.h`, which lists all the functions
that are marked as stubs, that is, that always fail with `ENOSYS`.
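The stub markers can also be checked at compile time: for every stub, `gnu/stubs.h` contains a `#define __stub_FUNCTION`. A minimal sketch, assuming a glibc-based system; the helper name `count_sample_stubs` is made up for illustration, and the three functions tested are just samples taken from this page:

```c
#include <stdio.h>
#include <unistd.h>   /* Any glibc header pulls in <features.h>, which
                         in turn includes <gnu/stubs.h>.  */

/* Count how many of a few sample functions this C library marks as
   stubs.  A function FOO is a stub if __stub_FOO is defined.  */
int
count_sample_stubs (void)
{
  int stubs = 0;

#ifdef __stub_chflags
  puts ("chflags: stub");
  stubs++;
#endif
#ifdef __stub_setns
  puts ("setns: stub");
  stubs++;
#endif
#ifdef __stub_sync_file_range
  puts ("sync_file_range: stub");
  stubs++;
#endif

  return stubs;
}
```

The same `#ifdef __stub_FOO` test is what `configure` scripts typically use to avoid relying on a function that links fine but always fails.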
* `chflags`
Patch sent, [[!message-id "20120427012130.GZ19431@type.famille.thibault.fr"]].
IRC, OFTC, #debian-hurd, 2012-04-27:
Does anyone have any idea why int main(void) { return
chflags(); } will compile with gcc but not with g++ ? It says
that "chflags" was not declared in this scope.
I get the same error on FreeBSD, but including sys/stat.h
makes it work
Can't find a solution on Hurd though :/
the Hurd doesn't have chflags
apparently linux neither
what does it do?
change flags :)
Are you sure the Hurd does not have chflags ? Because gcc
does not complain
there is no chflags function in /usr/include
but what flags does it change?
According to the FreeBSD manpage, it can set flags such as
UF_NODUMP, UF_IMMUTABLE etc.
Hum, there is actually a chflags() definition
but no declaration
so actually chflags is supported, but the declaration was
forgotten
probably because since linux doens't have it, it has never
been a problem up to now
so I'd say ignore the error for now, we'll add the
declaration
* [[t/tls-threadvar]]
* `futimesat`
If we have all of them (check against the Linux kernel), `#define __ASSUME_ATFCTS`.
* `bits/stat.h [__USE_ATFILE]`: `UTIME_NOW`, `UTIME_OMIT`
* `io/fcntl.h [__USE_ATFILE]`
Do we support `AT_FDCWD` et al.?
(80b4e5f3ef231702b24d44c33e8dceb70abb3a06.)
* `t/opendirat`: `opendirat` (`scandirat`, `scandirat64`)
Need changes equivalent to c55fbd1ea768f9fdef34a01377702c0d72cbc213 +
14d96785125abee5e9a49a1c3037f35a581750bd.
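Whether `AT_FDCWD`, `UTIME_NOW` and `UTIME_OMIT` are usable can be probed with a small test program. A sketch only: `touch_mtime` is a hypothetical helper name, and the `ENOSYS` fallback for the case where the macros are absent is an assumption, not something glibc prescribes.

```c
#include <errno.h>
#include <fcntl.h>      /* AT_FDCWD */
#include <sys/stat.h>   /* utimensat, UTIME_NOW, UTIME_OMIT */
#include <time.h>       /* struct timespec */

/* Set PATH's modification time to the current time and leave its access
   time untouched.  Returns 0 on success, -1 on failure with errno set.  */
int
touch_mtime (const char *path)
{
#if defined AT_FDCWD && defined UTIME_NOW && defined UTIME_OMIT
  struct timespec times[2];

  times[0].tv_sec = 0;
  times[0].tv_nsec = UTIME_OMIT;   /* Access time: leave unchanged.  */
  times[1].tv_sec = 0;
  times[1].tv_nsec = UTIME_NOW;    /* Modification time: set to now.  */

  return utimensat (AT_FDCWD, path, times, 0);
#else
  /* Assumed fallback when the *at interfaces are not exposed.  */
  (void) path;
  errno = ENOSYS;
  return -1;
#endif
}
```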
* `madvise`, `MADV_DONTNEED`, `MADV_DONTDUMP`, `MADV_DODUMP`
[[glibc_madvise_vs_static_linking]].
IRC, OFTC, #debian-hurd, 2013-09-09:
does hurd MADV_DONTNEED or MADV_FREE or none?
http://sources.debian.net/src/jemalloc/3.4.0-1/include/jemalloc/jemalloc_defs.h.in#L239
seems it builds by defining JEMALLOC_PURGE_MADVISE_DONTNEED
but i don't know what i'm talking about, so it could build with
JEMALLOC_PURGE_MADVISE_FREE as well
IRC, OFTC, #debian-hurd, 2013-09-10:
gg0: it implements none, even if it defines DONTNEED (but
not FREE)
See also:
gnash (0.8.11~git20130903-1) unstable; urgency=low
* Git snapshot.
+ Embedded jemalloc copy has been replaced by system one.
[...]
- Disable jemalloc on hurd and kfreebsd-*. No longer disabled upstream.
* `msync`
Once implemented, define `_POSIX_MAPPED_FILES`, `_POSIX_SYNCHRONIZED_IO`.
* `epoll`, `sys/epoll.h`
Used by [[wayland]], for example.
IRC, freenode, #hurd, 2013-08-08:
is there any possible to have kquque/epoll alike
things in hurd? or there is one?
nalaginrut: use select/poll
is it possible to implement epoll?
it is
we don't care enough about it to do it
(for now)
well, since I wrote a server with Guile, and it could
take advantage of epoll, never mind, if there's no, it'll use
select automatically
but if someday someone care about it, I'll be
interested on it
epoll is a scalability improvement over poll
the hurd being full of scalability issues, this one is
clearly not a priority
ok
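The select/poll fallback suggested in the log above can be sketched as follows. This is illustrative only; `wait_readable` is a made-up helper, not an existing interface.

```c
#include <poll.h>
#include <unistd.h>

/* Wait up to TIMEOUT_MS milliseconds for FD to become readable.
   Returns 1 if data (or end-of-file) can be read, 0 on timeout, and -1 on
   error or if poll reports only an error condition for FD.  */
int
wait_readable (int fd, int timeout_ms)
{
  struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
  int n = poll (&pfd, 1, timeout_ms);

  if (n < 0)
    return -1;
  if (n == 0)
    return 0;
  return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}
```

With many descriptors, `poll` has to scan the whole array on every call, which is the scalability cost that `epoll` avoids.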
* `sys/eventfd.h`
* `sys/inotify.h`
* `sys/signalfd.h`
* `sys/timerfd.h`
* `timespec_get` (74033a2507841cf077e31221de2481ff30b43d51,
87f51853ce3671f4ba9a9953de1fff952c5f7e52)
* `waitflags.h` (`WEXITED`, `WNOWAIT`, `WSTOPPED`, `WCONTINUED`)
IRC, freenode, #hurd, 2012-04-20:
in glibc, we use the generic waitflags.h which, unlike
linux's version, does not define WEXITED, WNOWAIT, WSTOPPED,
WCONTINUED
should the generic bits/waitflags.h define them anyway,
since they are posix?
well, we'd have to implement them anyway
but otherwise, I'd say yes
sure, but since glibc headers should expose at least
everything declared by posix, i thought they should be defined
anyway
that might bring bugs
some applications might be #ifdefing them
and break when they are defined but not working
i guess they would define them to 0, andd having them to
non-zero values shouldn't break them (since those values don't do
anything, so they would act as if they were 0.. or not?)
no, I mean they would do something else, not define them to
0
like posix/tst-waitid.c, you mean?
yes
See `posix/tst-waitid.out` failure below.
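To illustrate the concern about applications that `#ifdef` these flags: code like the following takes the `waitid` path as soon as the macros are defined, so it has to cope with the call failing. A sketch with a hypothetical helper name (`run_and_reap`); the fallback to `waitpid` is one possible way to stay robust, not what any particular application does.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with STATUS and return the exit status seen by
   the parent, or -1 on error.  */
int
run_and_reap (int status)
{
  int wstatus;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (status);

#if defined WEXITED && defined WNOWAIT
  {
    /* Peek at the child without reaping it.  If the flags are defined
       but waitid does not work, ignore the failure and fall through.  */
    siginfo_t info;
    (void) waitid (P_PID, pid, &info, WEXITED | WNOWAIT);
  }
#endif

  /* Reap the child for real.  */
  if (waitpid (pid, &wstatus, 0) != pid)
    return -1;
  return WIFEXITED (wstatus) ? WEXITSTATUS (wstatus) : -1;
}
```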
* `getconf` things (see below the results of `tst-getconf.out`)
* `getsockopt`, `setsockopt`
IRC, freenode, #hurd, 2013-02-14
Hi, {get,set}sockopt is not supported on Hurd. This shows
e.g. in the gnulib's test-{poll,select} code.
Reading
http://hea-www.harvard.edu/~fine/Tech/addrinuse.html there might
be reasons _not_ to implement them, comments?
uh? they are supported on hurd
not SO_REUSEPORT for setsockopt()
that isn't the same as claiming "get/setsockopt is not
supported on hurd"
most probably that option is not implemented by the
socket family you are using
OK, some options like SO_REUSEPORT then, more info in
the link.
note also SO_REUSEPORT is not posix
and i don't see SO_REUSEPORT mentioned in the page you
linked
No, but SO_REUSEADDR
IRC, freenode, #hurd, 2013-02-23
as an example, the poll test code from gnulib fails due
to that problem (and I've told you before)
gnu_srs: what's the actual failure?
can you provide a minimal test case showing the issue?
pinotree: A smaller test program:
http://paste.debian.net/237495/
gnu_srs: setting SO_REUSEADDR before binding the socket
works...
and it seems it was a bug in the gnulib tests, see
http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commit;h=6ed6dffbe79bcf95e2ed5593eee94ab32fcde3f4
pinotree: You are right, still the code I pasted pass on
Linux, not on Hurd.
so?
the code is wrong
you cannot change what bind does after you have called
it
* pinotree → out
so linux is buggy?
no, linux is more permissive
(at least, on this matter)
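The conclusion of the log above is that `SO_REUSEADDR` has to be set before `bind` is called. A minimal sketch of that ordering; `bind_reusable` is a hypothetical helper, and the loopback address is chosen only to keep the example self-contained:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket bound to 127.0.0.1:PORT, with SO_REUSEADDR set
   before the bind call.  Returns the descriptor, or -1 on error.  */
int
bind_reusable (unsigned short port)
{
  struct sockaddr_in addr;
  int one = 1;
  int fd = socket (AF_INET, SOCK_STREAM, 0);

  if (fd < 0)
    return -1;

  /* The option must be set first; it cannot change what an earlier bind
     has already done.  */
  if (setsockopt (fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one) != 0)
    {
      close (fd);
      return -1;
    }

  memset (&addr, 0, sizeof addr);
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (port);

  if (bind (fd, (struct sockaddr *) &addr, sizeof addr) != 0)
    {
      close (fd);
      return -1;
    }
  return fd;
}
```

Passing port 0 lets the system pick a free port, which is convenient for tests.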
* `getcontext`/`makecontext`/`setcontext`/`swapcontext`
Support for these functions within the Hurd threadvar environment has
been added, but for multi-threaded applications ([[libpthread]]), it is
a bit clunky: as a practical requirement, a thread's stack size always
has to be equal to `PTHREAD_STACK_DEFAULT`, 2 MiB, and also has to be
naturally aligned. The idea is still to [[get rid of Hurd threadvars
and replace them with TLS|t/tls-threadvar]].
Aside from [[gccgo]], the following packages might make use of these
functions, according to a search for
`\b(get|set|make|swap)context\s*\(` on 2013-05-18: boost1.49,
chromium-browser, gtk-vnc, guile-1.8, iceape, icedove, iceweasel,
libgc, libsigsegv, luatex, mono, nspr, pth, ruby1.8, texlive-bin, uim,
and more.
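For reference, a context running on a stack that meets the constraint described above (2 MiB in size and naturally aligned) can be set up as sketched below. This is an illustration, not tested on the Hurd: `STACK_SIZE` is assumed to match `PTHREAD_STACK_DEFAULT`, and the names `coroutine` and `run_on_aligned_stack` are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <ucontext.h>

/* Assumed to equal PTHREAD_STACK_DEFAULT (2 MiB).  */
#define STACK_SIZE ((size_t) 2 * 1024 * 1024)

static ucontext_t main_ctx, co_ctx;
static int co_ran;

static void
coroutine (void)
{
  co_ran = 1;
  /* Returning resumes the context named in uc_link, that is, main_ctx.  */
}

/* Run coroutine once on a STACK_SIZE stack whose start address is a
   multiple of STACK_SIZE.  Returns 0 on success, -1 on failure.  */
int
run_on_aligned_stack (void)
{
  void *stack;

  if (posix_memalign (&stack, STACK_SIZE, STACK_SIZE) != 0)
    return -1;
  assert (((uintptr_t) stack & (STACK_SIZE - 1)) == 0);

  if (getcontext (&co_ctx) != 0)
    {
      free (stack);
      return -1;
    }
  co_ctx.uc_stack.ss_sp = stack;
  co_ctx.uc_stack.ss_size = STACK_SIZE;
  co_ctx.uc_link = &main_ctx;
  makecontext (&co_ctx, coroutine, 0);

  co_ran = 0;
  if (swapcontext (&main_ctx, &co_ctx) != 0)
    {
      free (stack);
      return -1;
    }

  free (stack);
  return co_ran ? 0 : -1;
}
```

Most existing users of these functions allocate smaller or unaligned stacks, which is why they run into the threadvar restriction.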
IRC, OFTC, #debian-hurd, 2013-09-08:
oh, and even ruby2.0 suffers because of fixed-stack
threads
yes, we definitely need to finish fixing it
my current work is in our glibc repo, youpi/tls-threadvar
| *** makecontext: a stack at 0xbc000 with size 0x40000
is not usable with threadvars
all 8 failing tests with that
maybe we can hand-disable the use of contexts in ruby for
now?
gg0: ↑ :)
after the pseudo-patch i RFCed, i don't deserve to say
anything else about that :)
i mean, feel free to investigate and "fix" ruby2.0 as
above :)
eh maybe i'd just be able to hand-disable failing
thread-related _tests_ :)
i'm still hoping some real developer picks and actually fixes
it, seems it's not enough interesting though
21:37 < youpi> yes, we definitely need to finish fixing it
afaiu youpi is working on threadvars-tls migration, which
would mean fixing them all. i just meant fixing ruby, which would
mean having puppet btw
gg0: "actually fixing" means fixing threadvars-tls
migration
"just fixing" ruby can be done by simply disabling context
use in ruby
IRC, OFTC, #debian-hurd, 2013-09-10:
this one fixes make test by disabling context and giving more
time to timing related tests http://paste.debian.net/plain/37977/
make test-all is another story
gg0: AIUI, the sleep part should get fixed by the next
glibc upload, which will include the getclk patch
but the disabling context part could be good to submit to
the debian ruby package, mentioning that this is a workaround for
now
unfortunately still not enough, test-all still fails
does it make the package not build?
test-all is the second part of what we call tests
they build and package (they produce all ruby packages),
after that they run debian/run-test-suites.bash which is make
test + make test-all
well after or during the build doesn't matter, it's their
testsuite
ok just failed:
TestBug4409#test_bug4409 = Illegal instruction
make: *** [yes-test-all] Error 132
what to do with Illegal instruction?
just found 2 words that make everybody shut up :p
same as above: debug it
gg0: have you confirmed that this is reproducible? I've
once had a process die with SIGILL and it was not and I figured
it might have been a (qemu?) glitch
seems i'm running tests which are disabled on _all_ archs,
better so
well, this should be reproducible. i just got it on a qemu, i
could try to reproduce it on real hardware but as just said, i
was testing tests disabled by maintainer so completely useless
gg0: yeah, I'm running all my hurd instances on qemu/kvm
as well, I meant did you get this twice in a row?
to be honest i got another illegal instruction months ago but
don't recall doing what
nope not twice, i've commented it out. then run the remaining
and then found out i should not have done what i was doing
but i could try to reproduce it
ok now i recall i got it another one few hours ago on real
hardware, from logs:
TestIO#test_copy_stream_socket = Illegal instruction
teythoon: on real hardware though
and this is the one i should debug once it finishes, still
running
IRC, freenode, #hurd, 2013-09-11:
../sysdeps/mach/hurd/jmp-unwind.c:53: _longjmp_unwind:
Assertion `! __spin_lock_locked (&ss->critical_section_lock)'
failed.
and
../libpthread/sysdeps/mach/pt-thread-halt.c:51:
__pthread_thread_halt: Unexpected error: (ipc/send) invalid
destination port.
gg0_: Which libpthread source are these? Stock Debian
package?
tschwinge: everything debian, ruby rebuilt with
http://paste.debian.net/plain/38519/ which should disable
*context
IRC, OFTC, #debian-hurd, 2013-09-11:
wrt ruby, i'd propose a patch that disables *context and
comments out failed tests (a dozen). most of them are timing
related, don't always fail
if they failed gracefully, we could leave them enabled and
just ignoring testsuite result, but most of them block testsuite
run when fail
anyone against? any better idea (and intention to implement
it? :p)?
youpi: is disabling some tests acceptable? ^
it'd be good to at least know what is failing
so as to know what impact hiding these failures will have
remember that hiding bugs usually means getting bitten by
them even harder later :)
many of them use pipes
here the final list, see commented out ones
http://paste.debian.net/plain/38426
and as said some don't always fails
test_copy_stream_socket uses a socket
note that we can still at least build packages with notest
at least to get the binaries uploaded
disabling *context should however really be done
and the pipe issues are concerning
I don't remember other pipe issues
so maybe it's a but in the ruby bindings
i just remember they didn't die, then something unknown
fixed it
I see something frightening in io.c
#if BSD_STDIO
preserving_errno(fseeko(f, lseek(fileno(f),
(off_t)0, SEEK_CUR), SEEK_SET));
#endif
this looks very much like a workaround for an odd thing in
BSD
it happens that that gets enabled on hurd too, since
__MACH__ is defined
you could try to drop these three lines, just to see
this is very probably very worth investigating, at any rate
even just test_gets_limit_extra_arg is a very simple test,
that I fail to see why it should ever fail on hurd-i386
starting debugging it would be a matter of putting printfs
in io.c, to check what gets called, with what parameters, etc.
just a matter of taking the time to do it, it's not very
complex
youpi: are you looking at 1.8? no BSD_STDIO here
yes, 1.8
1.9.3.448
landed to sid few days ago
ah, I have 1.87
+.
my favourites are TestIO#test_copy_stream_socket and
TestIO#test_cross_thread_close_fd -> Illegal instruction
TestIO#test_io_select_with_many_files sometimes Illegal
instruction, sometimes ruby1.9.1:
../sysdeps/mach/hurd/jmp-unwind.c:53: _longjmp_unwind: Assertion
`! __spin_lock_locked (&ss->critical_section_lock)' failed.
[[thread-cancel_c_55_hurd_thread_cancel_assertion___spin_lock_locked_ss_critical_section_lock]]?
trying to debug illegal instruction
http://paste.debian.net/plain/38585/
(yes, i'm not even good at gdbing)
any hint?
oh found out there's an intree .gdbinit, that might
complicate things
IRC, OFTC, #debian-hurd, 2013-09-13:
where should it be implemented MAP_STACK? plus, is it worth
doing it considering migration to tls, wouldn't it be useless?
sysdeps/mach/hurd/mmap.c i should reduce stupid questions
frequency from daily to weekly basis
IRC, OFTC, #debian-hurd, 2013-09-14:
say i managed to mmap 0x200000-aligned memory
now i get almost the same failed tests i get disabling
*context
that would mean they don't depend on threading
IRC, freenode, #hurd, 2013-09-16:
i get many ../sysdeps/mach/hurd/jmp-unwind.c:53:
_longjmp_unwind: Assertion `! __spin_lock_locked
(&ss->critical_section_lock)' failed.
by running ruby testsuite, especially during test_read* tests
http://sources.debian.net/src/ruby1.9.1/1.9.3.448-1/test/ruby/test_io.rb#L972
read/write operations with pipes
gg0: that's weird
gg0: debian glibc ?
braunr: yep, debian 2.17-92
sometimes assertion above, sometimes tests in question get
stuck reading
it would be nice reproducing it w/o ruby
probably massive io on pipes could do the job
also more nice finding someone who finds it interesting to
fix :p
ruby is rebuilt with http://paste.debian.net/plain/40755/, no
*context
pipe function in tests above creates one thread for write,
one for read
http://sources.debian.net/src/ruby1.9.1/1.9.3.448-1/test/ruby/test_io.rb#L26
gg0: About the jmp-unwind assertion failure: is it be
chance this issue:
?
I didn't look in detail.
tschwinge: that's what i thought too about the assertion,
which is why i find it strange
asserting it's not locked then locking it doesn't exclude
race conditions
IRC, OFTC, #debian-hurd, 2013-09-17:
youpi: i guess no one saw it anymore since
tg-thread-cancel.diff patch
it =
http://www.gnu.org/software/hurd/open_issues/thread-cancel_c_55_hurd_thread_cancel_assertion___spin_lock_locked_ss_critical_section_lock.html
this one comes from sysdeps/mach/hurd/jmp-unwind.c:53 though
another assertion to remove?
gg0: it's not exactly the same: in hurd_thread_cancel we
hold no lock at all at the assertion point
in jmp-unwind.c, we do hold a lock
and the assertion might be actually true because all other
threads are supposed to hold the first lock before taking the
other one
you could check for that in other places
and maybe it's the other place which wouldhave to be fixed
also look for documentation which would say that
IRC, freenode, #hurd, 2013-09-17:
gg0: is that what we do ??
braunr: well, i was looking at
http://sources.debian.net/src/eglibc/2.17-92/debian/patches/hurd-i386/tg-thread-cancel.diff
which afaics fixes
http://www.gnu.org/software/hurd/open_issues/thread-cancel_c_55_hurd_thread_cancel_assertion___spin_lock_locked_ss_critical_section_lock.html
the one i get now is
http://sources.debian.net/src/eglibc/2.17-92/sysdeps/mach/hurd/jmp-unwind.c#L53
09:12 < youpi> gg0: it's not exactly the same: in
hurd_thread_cancel we hold no lock at all at the assertion point
09:13 < youpi> in jmp-unwind.c, we do hold a lock
09:13 < youpi> and the assertion might be actually true
because all other threads are supposed to hold the first lock
before taking the other one
gg0: that assertion is normal
it says there is a deadlock
ss->critical_section_lock must be taken before ss->lock
you mean ss->lock before ss->critical_section_lock
no
ah ok got it
that's a bug
longjmp
ugh
you could make a pass through the various uses of those
locks and check what the intended locking protocol should be
i inferred ss->critical_section_lock before ss->lock from
hurd_thread_cancel
this might be wrong too but considering this function is
used a lot, i doubt it
(no, i hadn't got it, i was looking at jmp-unwind.c where
lock is before critical_section_lock)
could we get useful info from gdb'ing the assertion?
gg0: Only if you first get an understanding why it is
happening, what you expect to happen instead/why it shall not
happen/etc. Then you can perhaps use GDB to verify that.
i can offer an irc interface if anyone is interested, it's
ready, just to attach :)
this is the test
http://sources.debian.net/src/ruby1.9.1/1.9.3.448-1/test/ruby/test_io.rb#L937
pipe function creates two threads
http://sources.debian.net/src/ruby1.9.1/1.9.3.448-1/test/ruby/test_io.rb#L26
Attaching to pid 15552
[New Thread 15552.1]
[New Thread 15552.2]
(gdb)
IRC, freenode, #hurd, 2013-09-21:
gg0: it seems the assert (! __spin_lock_locked
(&ss->critical_section_lock)); is bogus
but it'd be good to catch a call trace
well, it may not be bogus, in case that lock is only ever
taken by the thread itself
in that case, inside longjmp_unwind we're not supposed to
have it already
ok, that's what we had tried to discuss with Roland
it can happen when playing with thread cancelation
youpi: the assertion isn't exactly bogus
the lock ordering is
braunr: which one are you talking about?
the one in hurd_thread_cancel looks really wrong
and some parts of the code keep the critical section lock
without ss->lock held, so I don't see how lock ordering can help
IRC, OFTC, #debian-hurd, 2013-09-22:
how much does this patch suck on a scale from 1 to 10?
http://paste.debian.net/plain/44810/
well, the stack allocation issue will go away once I get
the threadvars away
I'm working on it right now
about the lib paths, it makes sense to add the gnu case,
but i386-gnu shouldn't be put in the path
that's great
so seems the wrong moment for what i've already done
ie. asking terceiro what he thinks about patch above :/
any distro-independent way to get libc.so and libm.so path?
ruby as last resource takes them from "ldd ruby"
gg0: should work fine then
well it does. but gnu doesn't have a case so it hits default
which is broken
http://bugs.ruby-lang.org/projects/ruby-trunk/repository/revisions/40235/entry/test/dl/test_base.rb
btw even linux and kfreebsd with debian multipath have broken
cases but they don't hit default and get fixed by ldd later
why it is broken? are arguments passed to that script?
i'm not sure about what propose. a broken case so it doesn't
hit default like linux and kfbsd
yes they are :/