From c4ad3f73033c7e0511c3e7df961e1232cc503478 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Wed, 26 Feb 2014 12:32:06 +0100 Subject: IRC. --- open_issues/64-bit_port.mdwn | 106 +- open_issues/anatomy_of_a_hurd_system.mdwn | 520 +++- open_issues/boehm_gc.mdwn | 6 + open_issues/bpf.mdwn | 61 +- ..._create__dev_null__interrupted_system_call.mdwn | 193 ++ open_issues/clock_gettime.mdwn | 132 +- open_issues/code_analysis.mdwn | 66 +- open_issues/code_analysis/discussion.mdwn | 142 +- open_issues/crash_server.mdwn | 20 +- open_issues/dbus.mdwn | 137 +- open_issues/dbus_in_linux_kernel.mdwn | 90 +- open_issues/dde.mdwn | 50 +- .../debugging_gnumach_startup_qemu_gdb.mdwn | 60 +- open_issues/default_pager.mdwn | 9 +- ...t2fs_libports_reference_counting_assertion.mdwn | 9 +- open_issues/gcc.mdwn | 104 +- open_issues/gdb_catch_syscall.mdwn | 4 +- open_issues/glibc.mdwn | 550 ++++- open_issues/glibc/0.4.mdwn | 44 +- open_issues/glibc/debian/experimental.mdwn | 156 +- open_issues/glibc_ioctls.mdwn | 103 +- open_issues/gnumach_memory_management.mdwn | 128 +- open_issues/hurd_101.mdwn | 262 +- open_issues/libmachuser_libhurduser_rpc_stubs.mdwn | 44 +- open_issues/libpthread.mdwn | 408 ++- .../libpthread/t/fix_have_kernel_resources.mdwn | 824 ++++++- open_issues/libpthread_dlopen.mdwn | 104 +- open_issues/libpthread_set_stack_size.mdwn | 91 +- open_issues/linux_as_the_kernel.mdwn | 33 +- open_issues/mach_migrating_threads.mdwn | 17 +- open_issues/mig_portable_rpc_declarations.mdwn | 130 +- open_issues/mig_strings.mdwn | 38 + open_issues/mig_stub_functions.mdwn | 14 +- open_issues/multithreading.mdwn | 184 +- open_issues/nightly_builds.mdwn | 26 +- open_issues/nightly_builds_deb_packages.mdwn | 81 +- open_issues/nptl.mdwn | 69 +- open_issues/performance.mdwn | 26 +- .../io_system/clustered_page_faults.mdwn | 5 +- open_issues/performance/io_system/read-ahead.mdwn | 25 +- open_issues/pfinet_timers.mdwn | 60 +- open_issues/profiling.mdwn | 233 +- open_issues/robustness.mdwn | 
50 +- open_issues/serial_console.mdwn | 58 +- open_issues/system_initialization.mdwn | 26 +- open_issues/systemd.mdwn | 2603 +++++++++++++++++++- open_issues/ti-rpc_then_nfs.mdwn | 87 +- open_issues/tmux.mdwn | 35 +- open_issues/translate_fd_or_port_to_file_name.mdwn | 87 +- open_issues/user-space_device_drivers.mdwn | 423 +++- open_issues/virtualization/fakeroot.mdwn | 1224 ++++++++- open_issues/wine.mdwn | 98 +- open_issues/xattr.mdwn | 12 +- 53 files changed, 9878 insertions(+), 189 deletions(-) create mode 100644 open_issues/cannot_create__dev_null__interrupted_system_call.mdwn create mode 100644 open_issues/mig_strings.mdwn (limited to 'open_issues') diff --git a/open_issues/64-bit_port.mdwn b/open_issues/64-bit_port.mdwn index edb2dccd..04273630 100644 --- a/open_issues/64-bit_port.mdwn +++ b/open_issues/64-bit_port.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -23,22 +23,8 @@ the [[microkernel/mach/gnumach/ports/Xen]] platform. i guess it wouldn't be too hard to have a special mach kernel for 64 bits processors, but 32 bits userland only well, it means tinkering with mig - like old sparc systems :p - to build the 32bit interface, not the 64bit one - ah yes - hm - i'm not sure - mig would assume a 32 bits kernel, like now - and you'll have all kinds of discrepancies in vm_size_t & such - yes - the 64 bits type should be completely internal - types* - but it would be far less work than changing all the userspace bits - for 64 bit (ofc we'll do that some day but in the meanwhile ..) - yes - and it'd boost userland addrespace to 4GiB - yes - leaving time for a 64bit userland :) + +[[mig_portable_rpc_declarations]]. # IRC, freenode, #hurd, 2012-10-03 @@ -60,87 +46,7 @@ the [[microkernel/mach/gnumach/ports/Xen]] platform. 
i think i'll go the second way with x15, so you'll have the two :) -# IRC, freenode, #hurd, 2012-12-12 - -In context of [[microkernel/mach/gnumach/memory_management]]. - - Or with a 64-bit one? ;-P - tschwinge: i think we all had that idea in mind :) - tschwinge: patches welcome :P - tschwinge: sure, please help us settle down with the mig stuff - what was blocking me was just deciding how to do it - hum, what's blocking x86_64, except time to work on it ? - deciding the mig types & such things - i.e. the RPC ABI - ok - easy answer: keep it the same - sorry, let me rephrase - decide what ABI is supposed to be on a 64bit system, so as to know - which way to rewrite the types of the kernel MIG part to support 64/32 - conversion - can't this be done in two steps ? - well, it'd mean revamping the whole kernel twice - as the types at stake are referenced in the whole RPC code - the first step i imagine would simply imply having an x86_64 - kernel for 32-bits userspace, without any type change (unless restricting - to 32-bits when a type is automatically enlarged on 64-bits) - it's not so simple - the RPC code is tricky - and there are alignments things that RPC code uses - which become different when build with a 64bit compiler - there are also things like int[N] for io_stat_struct and so on - i see - making the code wrong for 32 - thus having to change the types - pinotree: yes - (doesn't mig support structs, or it is too clumsy to be used in - practice?) - pinotree: what's the problem with that (i explcitely said changing - int to e.g. int32_t) - that won't fly for some of the calls - e.g. getting a thread state - pinotree: no it doesn't support struct - braunr: that some types in struct stat are long, for instance - pinotree: same thing with longs - youpi: why wouldn't it ? 
- that wouldn't work on a 64bit system - so we can't make it int32_t in the interface definition - i understand the alignment issues and that the mig code adjusts - the generated code, but not the content of what is transfered - well of course - i'm talking about the first step here - which targets a 32-bits userspace only - ok, so we agree - the second step would have to revamp the whole RPC code again - i imagine the first to be less costly - well, actually no - you're right, the mig stuff would be easy on the application side, - but more complicated on the kernel side, since it would really mean - dealing with 64-bits values there - (unless we keep a 3/1 split instead of giving the full 4g to - applications) - -See also [[microkernel/mach/gnumach/memory_management]]. - - (I don't see what that changes) - if the kernel still runs with 32-bits addresses, everything it - recevies from or sends through mig can be stored with the user side - 32-bits types - err, ok, but what's the point of the 64bit kernel then ? :) - and it simply uses 64-bits addresses to deal with physical memory - ok - that could even be a 3.5/0.5 split then - but the memory model forces us to run either at the low 2g or the - highest ones - but linux has 3/1, so we don't need that - otherwise we need an mcmodel=medium - we could do with mcmodel=medium though, for a time - hm actually no, it would require mcmodel=large - hum, that's stupid, we can make the kernel run at -2g, and use 3g - up to the sign extension hole for the kernel map - - -# IRC, freenode, #hurd, 2013-07-02 +## IRC, freenode, #hurd, 2013-07-02 In context of [[mondriaan_memory_protection]]. @@ -157,8 +63,10 @@ In context of [[mondriaan_memory_protection]]. as passed between userspace and kernel -# IRC, OFTC, #debian-hurd, 2013-10-05 +## IRC, OFTC, #debian-hurd, 2013-10-05 and what about 64 bit support, almost done? kernel part is done MIG 32/64 trnaslation missing + +[[mig_portable_rpc_declarations]]. 
diff --git a/open_issues/anatomy_of_a_hurd_system.mdwn b/open_issues/anatomy_of_a_hurd_system.mdwn index a3c55063..33635b80 100644 --- a/open_issues/anatomy_of_a_hurd_system.mdwn +++ b/open_issues/anatomy_of_a_hurd_system.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -43,7 +43,11 @@ like Bushnell's Hurd paper. All this should be unfied and streamlined. servers often depend on other servers for certain functionality -# IRC, freenode, #hurd, 2011-03-12 +# Bootstrap + +## [[hurd_init]] + +## IRC, freenode, #hurd, 2011-03-12 when mach first starts up, does it have some basic i/o or fs functionality built into it to start up the initial hurd translators? @@ -76,6 +80,112 @@ like Bushnell's Hurd paper. All this should be unfied and streamlined. rest of the system up +## IRC, freenode, #hurd, 2014-01-03 + + hmpf, the hurd bootstrapping process is complicated and fragile, + maybe to the point that it is to be considered broken + aiui the hurd uses the filesystem for service lookup + older mach documentation suggests that there once existed a name + server instead for this purpose + the hurd approach is elegant and plan9ish + the problem is in the early bootstrapping + what if the root filesystem is r/o and there is no /servers or + /servers/exec ? + e. g. rm /servers/exec && reboot -> the rootfs dies early in the + hurd server bootstrap :/ + well yes + it's normal to have such constraints + uh no + at the same time, the boot protocol must be improved, if only to + support userspace disk drivers + totally unacceptable + why not ? + b/c my box just died and lost it's exec node + so ? 
+ loosing the exec node is unacceptable + well, linux dies too if you don't have /dev populated at least a + bit + not being able to boot without the "exec" service is pretty normal + the hurd turns the vfs into a service directory + the exec service is there, only the lookup mechanism is broken + replacing the name server you mentioned earlier + yes + if you don't have services, you don't have them + i don't see the problem + the problem is the lookup mechanism getting broken + ... that easily + imagine a boot protocol based on a ramfs filled from a cpio + i do actually ;) + there would be no reason at all the lookup mechanism would break + yes + but the current situation is not acceptable + i agree + ^^ + ext2fs is too unreliable for that + but using the VFS as a directory is more than acceptable + it's probably the main hurd feature + yes + i see it rather as a circular dependency problem + and if you have good ideas, i'm all ear for propel ... :> + antrik already talked about some of them for the bootstrap + protocol + we should sum them up somewhere if not done already + i've been pondering how to install a tmpfs translator as root + translator + braunr: we could create a special translator for /servers + maybe + very much like fakeroot, it just proxies messages to a real + translator + but if operations like settrans fail, we handle them + transparently, like an overlay + i consider /servers to be very close to /dev + yes + so something like devfs seems obvious yes + i don't even think there needs to be an overlay + y not ? + why does /servers need real nodes ? + for persistence + what for ? + e.g. crash server selection + hm ok + network configuration + i personally wouldn't make that persistent + it can be configured in files and installed at boot time + me neither, but that's how it's currently done + are you planning to actually work on that soon ? 
+ if we need no persistence, we can just use tmpfs + it wouldn't be a mere tmpfs + it could + it's a tmpfs that performs automatic discovery and registration of + system services + with some special wrapper that preserves e.g. /servers/exec + oh + so rather, devtmpfs + it is o_O :p + ? + what is what ? + well, it could be a tmpfs and some utility creating the nodes + whether the node management is merged in or separate doesn't + matter that much i guess + i'd personally imagine it merged, and tmpfs available as a + library, so that stuff like sysfs or netstatfs can easily be written + + +## IRC, freenode, #hurd, 2014-02-12 + + braunr: i fixed all fsys-related receiver lookups in libdiskfs + and surely enough the bootstrap hangs with no indication whats wrong + teythoon: use mach_print :/ + braunr: the hurd bootstrap is both fragile and hard to tweak in + interesting ways :/ + teythoon: i agree with that + teythoon: maybe this will help : + http://wiki.hurdfr.org/upload/graphviz/dot9b65733655309d059dca236f940ef37a.png + although i guess you probably already know that + heh, unicode for the win >,< + :/ + + # Source Code Documentation Provide a cross-linked sources documentation, including generated files, like @@ -311,6 +421,9 @@ Actually, the Hurd has never used an M:N model. Both libthreads (cthreads) and l spiderweb: well, there's 1 advantage of minix for you :P the main idea of mach is to make it easy to extend unix without having hundreds of system calls + +[[/system_call]]. + the hurd keeps that and extends it by making many operations unprivileged you don't need special code for kernel modules any more @@ -539,6 +652,9 @@ Actually, the Hurd has never used an M:N model. Both libthreads (cthreads) and l it must translate these system calls into ipc or something then mach handles it? exactly + +[[/system_call]]. + that's why i say it's not the exokernel way of doing things ok so does every low level hardware access go through mach?' 
@@ -811,3 +927,403 @@ Actually, the Hurd has never used an M:N model. Both libthreads (cthreads) and l ahungry: ctrl-c does work, you just missed something somewhere and are running a shell directly on a console, without a terminal to handle signals + + +# IRC, freenode, #hurd, 2013-11-04 + + nalaginrut: you can't use the hurd for real embedded stuff without + a lot of work on it + but the hurd design applies very well to embedded environments + the fact that we're able to dynamically link practically all hurd + servers against the c library can visibly reduce the system code size + it also reduces the TCB + what about the memory occupation? + code size is about memory occupation + also, the system is composable like lego, don't need tcp - don't + include pfinet then + the memory overheald of a capability based system like the hurd + are, well, capabilities + teythoon: that's not an argument compared to modular kernels like + linux + yes it is + why ? + if you don't need tcp in linux, you just don't load it + same thing + ok, right + on the other hand, a traditional unix kernel can never be linked + against the c library + much less dynamically + right + I think the point is that it's easy to cut, since it has + better modularity than monolithic, and could be done in userland relative + easier + modularity isn't better + that's a big misconception + also, restarting components is easier on a distributed system + on the hurd, this is a side effect + and it doesn't apply well + braunr: oops, misconception + many core servers such as proc, auth, exec, the root fs server + can't be restarted at all + not yet + and servers like pfinet can be restarted, but at the cost of posix + servers not expecting that + looping on errors such as EBADF because the target socket doesn't + exist any more + I've been working on a restartable exec server during some of my + gsoc weekends + ah right + linux has kexec + and can be patched at run time + sounds like Hurd needs something 
similar to generalizable + continuation + so again, it's not a real advantage + no + sorry serilizable + that would persistence + personally, i don't want it at all + yes it is a real advantage, b/c the means of communication + (ports) is common to every IPC method on Hurd, and ports are first class + objects + so preserving the state is much easier on Hurd + if a monolithic kernel can do it too, it's not a real advantage + yes, but it is more work + that is one true advantage of the hurd + but don't reuse it each time + oh, that's nice for the ports + why not? + what we're talking about here is resilience + the fact that it's easier to implement doesn't mean the hurd is + better because it has resilience + it simply means the hurd is better because it's easier to + implement things on it + same for development in general + debugging + virtualization + etc.. + yes, but why we stick to compare it to monolithic + but it's still *one* property + well, minix advertises this feature a lot, even if minix can + only restart very simple things like printer servers + minix sucks + let them advertise what they can + ^^ + it has cool features, that's enough, no need to find a feature + that monolithic can never done + no it's not enough + minix isn't a general purpose system + let's just not compare it to general purpose systems + + +# IRC, freenode, #hurd, 2013-11-08 + + and, provided you have suitable language bindings, you can + replace almost any hurd server with your own implementation in any + language + teythoon: language bindings? + Do you mean language bindings against C libraries? + either that or for the low level mach primitives + For your information, IPC is independent of languages. + sure, that's the beauty + Why is hurd best for replacing parts written in C with other + languages? 
+ because Hurd consists of many servers, each server managing one + kind of resource + so you have /hurd/proc managing posix processes + you could reimplement /hurd/proc in say python or go, and + replace just that component of the Hurd system + you cannot do this with any other (general purpose) operating + system that I know of + you could incrementally replace the Hurd with your own + Hurd-compatible set of servers written in X + use a language that you can verify, i.e. prove that a certain + specification is fulfilled, and you end up with an awesome stable and + secure operating system + Any microkernel OS fits the description. + teythoon, Does hurd have formal protocols for IPC communications? + sure, name some other general purpose and somewhat + posix-compatible microkernel based operating system please + what do you mean by formal protocols ? + IPC communications need to be defined in documents. + the "wire" format is specified of course, the semantic not so + much + network protocols exist. + HTTP is a transport protocol. + Without formal protocols, IPC communications suffer from + debugging difficulties. + Formal protocols make it possible to develop and test each module + independently. + as I said, the wire format is specified, the semantics only in + written form in the source + this is an example of the ipc specification for the proc server + http://git.savannah.gnu.org/cgit/hurd/hurd.git/tree/hurd/process.defs + teythoon, how file server interacts with file clients should be + defined as a formal protocol, too. + do you consider the ipc description a kind of formal protocol ? + + http://git.savannah.gnu.org/cgit/hurd/hurd.git/tree/hurd/process.defs can + be considered as a formal protocol. + However, the file server protocol should be defined on top of IPC + protocol. 
+ the file server protocol is in fs.defs + every protocol spoken is defined in that ipc description + language + it is used to derive code from + crocket: not any system can be used to implement system services + in any language + in theory, they do, but in theory only + the main reason they don't is because most aren't posix compliant + from the ground up + posix compliance is achieved through virtualization + which isolates services too much for them to get useful, + notwithstanding the impacts on performance, memory, etc.. + braunr, Do you mean it's difficult to achieve POSIX compliance + with haskell? + crocket: i mean most l4 based systems aren't posix + genode isn't posix + helenos is by design not posix + the hurd is the only multi server system providing such a good + level of posix conformance + and with tls on the way, we'll support even more non-posix + applications that are nonetheless very common on unices because of + historical interfaces still present, such as mcontext + and modern ones + e.g. ruby is now working, go should be there after tls + * teythoon drools over the perspective of having go on the Hurd... + braunr, Is posix relevant now? 
+ it's hugely relevant + conforming to posix and some native unix interfaces is the only + way to reuse a lot of existing production applications + and for the matter at hand (system services not written in c), it + means almost readily getting runtimes for other languages than c + something other microkernel based system will not have + imagine this + one day, one of us could create a company for a hurd-like system, + presenting this idea as the killer feature + by supporting posix, customers could port their software with very + little effort + *very little effort* is what makes software attractive + + http://stackoverflow.com/questions/1806585/why-is-linux-called-a-monolithic-kernel/1806597#1806597 + says "The disadvantage to a microkernel is that asynchronous IPC + messaging can become very difficult to debug, especially if fibrils are + implemented." + " GNU Hurd suffers from these debugging problems (reference)." + stackoverflow is usually a nice place + but concerning microkernel stuff, you'll read a lot of crap + anywhere + whether it's sync or async, tracking references is a hard task + it's a bit more difficult in distributed systems, but not that + much if the proper debugging features are provided + we actually don't suffer from that too much + many of us have been able to debug reference leaks in the past, + without too much trouble + we lack some tools that would give us a better view of the system + state + braunr, But is it more difficult with microkernel? + crocket: it's more difficult with distributed systems + How much more difficult? + i don't know + distributed systems + not much + braunr, How do you define distributed systems? + crocket: not monolithic + braunr, Hurd is distributed, then. + multiserver if you prefer + yes it is + braunr, So it is more difficult with hurd. + How much more difficult? How do you debug? + just keep in mind that a monolithic system can run on a + microkenrel + we use tools that show us references + braunr, like? 
+ like portinfo + braunr, Does hurd use unix-socket to implement IPC? + no + unix-socket use mach ipc + I'm confused + ipc is provided by the microkernel, gnumach (a variant of mach) + unix sockets are provided by one of the hurd servers (pflocal) + servers and clients communicate through mach ipc + braunr, Do you think it's feasible to build servers in haskell? + why not ? + ok + I've been thinking about that + in go, with cgo, you can call go functions from c code + so it should be possible to create bindings for say libtrivfs + I'd like to write an OS in clojure or haskell. + crocket: what for ? + braunr, I want to see a better system programming language than + C. + i don't see how clojure or haskell would be "better system + programming languages" than c + and even assuming that, what for ? + braunr, It's better for programmers. + haskell + haskell is expressive. + personally i disagree + it's better for some things + not for system programming + For system programming, Google Go is trying to replace C. But I + doubt it will. + we may not be referring to the same thing here when we say "system + programming" + braunr, What do you think is a better one? + crocket: i don't think there is a better one currently + braunr, Even Rust and D? + i don't know them well enough + certainly not D if it's what i think it is + C is too slow. + C is too slow to develop. + depends + again, i disagree + rust looks good but i don't know it well to comment + C is a tank, and clojure is an airplane. + A tank is reliable but slow. + Clojure is fast but lacks some accuracy. + c is as reliable as the developer is skilled with it + it's clearly not a tank + there are many traps + crocket: are you suggesting to rewrite Hurd in Clojure? + no + Why rewrite hud? + hurd + I'd rather start from scratch. + which is what a rewrite is + I am not expert on Clojure, but I don't think it is made for + system programming. 
+ If you want alternate language, I thing Go is only serious + candidate other than C + Or Rust + However, some people wrote OSes in haskell. + again, why ? + if it's only for the sake of using another language, i think it's + bad reason + Because haskell provides a high level of abstraction that helps + programmers. + It is more secure with monads. + If you want your OS to become successful Free Software project, + you have to use popular language. Haskell is not. + Most Haskell programmers are not into kernels + They do high level stuff. + So little contributors. + crocket: so you aim at security ? + I mean, candidats for contribution + braunr, security and higher abstraction. + i don't understand higher abstraction + braunr, FP can be useful to systems. + FP ? + functional programming + right + but you can abstract a lot with c too, with more efforts + braunr, like that's easy. + it's not that hard + i'm just questioning the goals and the solution of using a + particular language + the reason c is still the preferred language for system + programming is because it provides control over how the hardware does + stuff + which is very important for performance + the hurd never took off because of bad performance + performance doesn't mean doing things faster, it means being able + to do things or not, or doing things a new way + so ok, great, you have your amazing file system written in + haskell, and you find out it doesn't scale at all beyond some threshold + of processors or memory + braunr, L4 is fast. + l4 is merely an architecture abstraction + and it's not written in haskell :p + don't assume anything running on top of something fast will be + fast + Hurd is slow and written in C. + yes + not because of c though + Becuase it's microkernel? + because c wasn't used well enough to make the most of the hardware + in many places + far too many places + A microkernel can be as fast as a monolithic kernel according to + L4. 
+ no + it can't + it can for very specific cases + almost none of which are real world + but that's not the problem + again, i'm questioning your choice of another language in relation + to your goals, that's all + c can do things you really can't do easily in other languages + be aware of that + braunr, "Monolithic kernel are faster than microkernel . while + The first microkernel Mach is 50% slower than Monolithic kernel while + later version like L4 only 2% or 4% slower than the Monolithic kernel ." + 14:05 < braunr> but concerning microkernel stuff, you'll read a + lot of crap anywhere + simple counterexample : + the measurements you're giving consider a bare l4 kernel with + nothing on top of it + doing thread-to-thread ipc + this model of communication is hardly used in any real world + application + one of the huge features people look for with microkernels are + capabilities + and that alone will bump your 4% up + since capabilities will be used for practically every ipc + ok + + +# Hurd From Scratch + +## IRC, freenode, #hurd, 2013-11-30 + + because I think there is no way to understand the whole pile, + you need to go step by step + for example, I'm starting with mach only, then adding one + server, then another and on each step I have working system + that's how I want to understand it + you are interested in the early bootstrapping of the hurd system + ? + now I'm starting debian gnu/mach, it hungs, show me black + screen and I have no idea how to fix it + if you are unable to fix this, why do you think you can build a + hurd system from scratch ? + not gnu/mach, gnu/hurd I mean + or, you could describe your problem in more detail and one of + the nice people around here might help you ;) + as I said, it will be easier to understand and fix bugs, if I + will go step by step, and I will be able to see where bugs appears + so you should help me with that + and I tend to disagree + but you could always read my blog. 
you'll learn lots of things + about bootstrapping a hurd system + but it's complicated + http://www.linuxfromscratch.org/ + also, you'll need at least four hurd servers before you'll + actually see much + five + yeah, i know lfs + if somebody is interested in creating such a project, let me + know + you seem to be interested + yes, but I need the a real hurd master to help me + become one. fix your system and get to know it + I need knowledge, somebody built the system but didn't write + documentation about it, I have to extract it from your heads + hurdmaster: extract something from here + http://teythoon.cryptobitch.de + I need my head ;) + thanks + okay, what's the smallest thing I can run? + life of a Hurd system starts with the root filesystem, and the + exec server is loaded but not started + you could get rid of the exec server and replace the root + filesystem with your own program + statically linked, uses no unix stuff, only mach stuff + can I get 'hello world' on pure mach? + you could + hurdmaster: actually, here it is: + http://darnassus.sceen.net/gitweb/rbraun/mach_print.git/ + compile it statically, put it somewhere in /boot + make sure you're running a debug kernel + load it from grub instead of /hurd/ext2fs.static + look at the grub config for how this is done + let me know if it worked ;) diff --git a/open_issues/boehm_gc.mdwn b/open_issues/boehm_gc.mdwn index 8cd2415a..2913eea8 100644 --- a/open_issues/boehm_gc.mdwn +++ b/open_issues/boehm_gc.mdwn @@ -528,6 +528,12 @@ restults of GNU/Linux and GNU/Hurd look very similar. 
and maybe c# hello world translate another day :) +### IRC, freenode, #hurd, 2013-12-16 + + gnu_srs: ah, libgc + there are signal-related problems with libgc + + ## Leak Detection ### IRC, freenode, #hurd, 2013-10-17 diff --git a/open_issues/bpf.mdwn b/open_issues/bpf.mdwn index 02dc7f87..d051c2d8 100644 --- a/open_issues/bpf.mdwn +++ b/open_issues/bpf.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2009, 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2009, 2012, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -593,3 +594,61 @@ In context of the [[select]] issue. i understand now why my bpf translator was so buggy the condition_timedwait i wrote at the time was .. incomplete :) + + +## IRC, freenode, #hurd, 2014-02-04 + + btw, why is there a bpf filter in gnumach ? + braunr: didn't you put it there ? + teythoon: ah yes i did + teythoon: i completed the work of a friend + teythoon: the original filters in mach were netf filters + teythoon: we added bpf so that libpcap could directly upload them + to the kernel + in order to apply filters as close as possible to the packet + source and save copies + so they were used with the in-kernel network drivers ? + only by experimental code and pfinet which sets a + receive-all-inet4/6 filter + i also have a pcap-hurd.c file for libpcap but integration is a + bit tricky because of netdde + maybe i could work on it again some day + it should be easy to get into the debian package at least + so they can still be used with a netdde-based driver ? 
+ i'm not sure + the pcap-hurd.c file i wrote uses the libpcap bpf filter + oh, ok, i misinterpreted what you said wrt netdde + the problem caused by netdde is about where to get packets from, + but devnode should take care of that + did you mean that the integration is tricky b/c when netdde is + used, a different approach is necessary and that would have to be + detected at runtime ? + something like that + right + i didn't want to detect anything + right + i was waiting for things to settle but netdde is still debian only + but that's ok, this oculd be a debian only patch for now + so is eth-filter the netdde equivalent or am i getting a wrong + picture here ? + i don't know + it seems to implement bpf filters as well + it could very well be + whatever the driver, pfinet must be able to install a filter + even if it's almost a catch-all + i guess it could start a eth-filter and use this, why not + sure + + +### IRC, freenode, #hurd, 2014-02-06 + + teythoon: the BPF filter in Mach can also be used by + eth-multiplexer or eth-filter when running on in-kernel network + drivers... in fact the implementation was finished by the guy who created + eth-multiplexer; it was not fully working before + it's not useful at all when using netdde I believe + teythoon: IIRC eth-filted both relies on BPF being implemented by + the layer below it (whatever it is) to do the actual filtering, as well + as implements BPF itself so any layer on top of it can in turn use BPF + netdde should provide BPF filters too I'd say... 
but don't + remember for sure diff --git a/open_issues/cannot_create__dev_null__interrupted_system_call.mdwn b/open_issues/cannot_create__dev_null__interrupted_system_call.mdwn new file mode 100644 index 00000000..b0f14a17 --- /dev/null +++ b/open_issues/cannot_create__dev_null__interrupted_system_call.mdwn @@ -0,0 +1,193 @@ +[[!meta copyright="Copyright © 2013, 2014 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_hurd]] + + +# IRC, freenode, #hurd, 2013-12-05 + + Creating device nodes: fd fdX std vcs hdX hdXsY hdXs1Y sdX sdXsY + sdXs1Y cdX netdde ethX loopX ttyX ptyp ptyq/sbin/MAKEDEV: 75: + /sbin/MAKEDEV: cannot create /dev/null: Interrupted system call + that's new + teythoon: ouch + braunr: everything works fine though + teythoon: that part isn't too surprising + y? + teythoon: /dev/null already existed, didn't it ? + braunr: sure, yes + + +## IRC, freenode, #hurd, 2013-12-19 + + hm + i'm seeing those /sbin/MAKEDEV: cannot create /dev/null: + Interrupted system call messages too + + +## IRC, freenode, #hurd, 2013-12-20 + + braunr: interesting, I've seen some of those as well + + +## IRC, freenode, #hurd, 2014-01-26 + + cannot create /dev/null: Interrupted system call + + http://gnashdev.org:8010/builders/z-sid-hurd-i386/builds/26/steps/system_upgrade/logs/stdio + + +## IRC, freenode, #hurd, 2014-01-27 + + gg0: I had same /dev/null error after upgrading my old image + (more than 6 months old) a week ago. But I got such message only on boot + and it didn't autostart hurd console. 
+ Tried to upgrade current qemu image (from topic) to reproduce it + but it works OK after upgrade + i can reproduce it with # apt-get install --reinstall python2.7 dbus + # for instance + http://paste.debian.net/plain/78566/ + gg0: i've seen those as well, but i cannot reliably reproduce it + to track it down + i believe it's benign though + in shell scripts if -e is set, it aborts on failures like those + uh, it does? :/ + so if this happens in prerm/postinst scripts, package is not properly + installed/removed/configured and it fails + redirecting stdout and strerr to /dev/null shouldn't be so + problematic, anything wrong in my setup? + can you reproduce it? + not reliably + gg0: but i do not believe that anything is wrong with your + machine + any way to debug it? + having a minimal test case that triggers this reliably would be + great + but i fear it might be a race + + +## IRC, freenode, #hurd, 2014-01-28 + + have you seen the /dev/null issue ? + yes + what do you make of it ? + no idea + i believe it is related to the inlining work i've done + just like the bogus deallocation at boot, it needs debugging :) + hm i don't think so + no ? + i think we saw it even before your started working on the hurd ;p + i've never seen it before my recent patches + maybe i made it worse + not worse, just exposed more + right + + +## IRC, freenode, #hurd, 2014-01-29 + + cannot reproduce "cannot create /dev/null: Interrupted system call" + on a faster VM + might depend on that? 
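Editorial note on the error reported above: "Interrupted system call" is the `strerror` text for `EINTR`, meaning the `open` call behind the `> /dev/null` redirection returned before completing. The log does not establish the root cause, so the following is only a minimal sketch of what a C caller can do when it hits this error, not a fix for the underlying Hurd problem. `open_retry_eintr` is a hypothetical helper name, not an existing glibc or Hurd function.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not part of glibc or the Hurd): open PATH with FLAGS,
   trying again for as long as the call fails with EINTR.  Any other error is
   returned to the caller unchanged, with errno still set by open.  */
static int
open_retry_eintr (const char *path, int flags)
{
  int fd;

  do
    fd = open (path, flags);
  while (fd < 0 && errno == EINTR);

  return fd;
}
```

A caller would use `open_retry_eintr ("/dev/null", O_WRONLY)` where it previously called `open`. A shell redirection has no equivalent retry, which is consistent with the remark above that scripts running with `-e` abort when the redirection fails.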
+ + +## IRC, OFTC, #debian-hurd, 2014-02-02 + + but now saw a strange message at the end of the boot: + /etc/init.dhurd-console: 55: /etc/init.d/hurd-console: cannot create + /dev/null: Interrupted system call + oh well known on a slow VM (even old qemu/kvm btw), i can't reproduce + it on a faster/more recent one + slow VM = gnash buildbot slave + http://gnashdev.org:8010/builders/z-sid-hurd-i386/builds/26/steps/system_upgrade/logs/stdio + especially bad on system upgrade because it doesn't finish to run + prerm/postinst scripts :/ + + +## IRC, freenode, #hurd, 2014-02-05 + + Creating device nodes: fd fdX std vcs hdX hdXsY/sbin/MAKEDEV: 75: + /sbin/MAKEDEV: cannot create /dev/null: Interrupted system call hdXs1Y + sdX sdXsY sdXs1Y cdX netdde ethX loopX ttyX ptyp ptyq lprX comX random + urandom kbd mouse shm. + + +## IRC, freenode, #hurd, 2014-02-11 + + typical dist-upgrade http://paste.debian.net/plain/81346/ + many fewer cannot create /dev/null: Interrupted system call + on a faster machine + gg0: wow, so many interrupted system call messages + i don't get as many, but makedev produces a few every time i run + it as well + + +## IRC, OFTC, #debian-hurd, 2014-02-16 + + anyone here got any idea why upgrading initscripts fail on the hurd + gnash autobuilder, as reported on ? + pere: cannot create /dev/null: Interrupted system call + gg0: I noticed the message, but fail to understand how this could + happen. 
+ 13:16 < gg0> oh well known on a slow VM (even old qemu/kvm btw), i + can't reproduce it on a faster/more recent one + 13:17 < gg0> slow VM = gnash buildbot slave + http://gnashdev.org:8010/builders/z-sid-hurd-i386/builds/26/steps/system_upgrade/logs/stdio + 13:18 < gg0> especially bad on system upgrade because it doesn't + finish to run prerm/postinst scripts :/ + i remember teythoon talking about something racy + gg0: the /dev/null issue is known for a long time + gg0: some of the recent work (i believe mine) has made the + problem more apparent + gg0: that's what braunr told me + i see. it would be really nice fixing it. really annoying. i + workaround it by moving null away and moving it back under /dev before + halting/rebooting + + +## IRC, freenode, #hurd, 2014-02-17 + + Earlier today, I upgraded my Debian GNU/Hurd installation from + several months ago, and I'm now seeing bogus things as follows; is that a + known issue? + checking for i686-unknown-gnu0.5-ar... ar + configure: updating cache ./config.cache + configure: creating ./config.status + +./config.status: 299: ./config.status: cannot create + /dev/null: Interrupted system call + config.status: creating Makefile + (The plus is from a build log diff.) + 13:36 < gg0> pere: cannot create /dev/null: Interrupted system call + 20:10 < teythoon> gg0: the /dev/null issue is known for a long time + Anyone working on resolving this? I't causing build issues: + checking for i686-unknown-gnu0.5-ranlib... (cached) ranlib + checking command to parse nm output from gcc-4.8 + object... [...]/opcodes/configure: 6760: ./configure.lineno: cannot + create /dev/null: Interrupted system call + failed + checking for dlfcn.h... yes + Anyway, will go researching IRC logs. 
+ tschwinge: (that one was from #debian-hurd) + I assume teythoon and/or braunr can comment once he's back + they're* + tschwinge: we've been seing this more often lately but noone has + attempted to fix it yet + tschwinge: if you have a reliable way to reproduce that /dev/null: + Interrupted system call error, please let us know + + +## IRC, freenode, #hurd, 2014-02-23 + + braunr: cool. i'd vote /dev/null one as next one in your todo + still frequent on this slow vm + http://gnashdev.org:8010/builders/z-sid-hurd-i386/builds/30/steps/system_upgrade/logs/stdio + especially during setup-translators -k + yes diff --git a/open_issues/clock_gettime.mdwn b/open_issues/clock_gettime.mdwn index 65ab52df..baa21bbb 100644 --- a/open_issues/clock_gettime.mdwn +++ b/open_issues/clock_gettime.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -158,6 +159,9 @@ In context of [[select]]. my brain can't correctly compute variable sized types in mig definition files i wanted something that would remain correct for the 64-bit port + +[[64-bit_port]], [[mig_portable_rpc_declarations]]. + ah, you mean because tv_nsec is a long, which will not be the same type? and tv_sec being a time_t (thus a long too) @@ -208,3 +212,129 @@ In context of [[select]]. # Candidate for [[vDSO]] code? + + +# IRC, freenode, #hurd, 2014-02-23 + + GLib (gthread-posix.c): Unexpected error from C library during + 'pthread_condattr_setclock': Invalid argument. Aborting. + uh oh... + time to go digging in glibc i guess... + what are you trying to run ? + glib + with what ? 
+ just running glib's test suite under jhbuild + i maintain glib and i made some changes recently -- i wanted to + make sure they didn't break the hurd + and it seems they have ;/ + well + the hurd doesn't completely comply with posix 2008 + long story short: we've keyed our timed waits on condition + variables to the monotonic clock for a long time now, but we never tested + that it actually worked + so i just added an assert -- and indeed it fails on hurd + our glibc lies about supporting timers + good thinking + we don't support the monotonic clock + clock_gettime(CLOCK_MONOTONIC) seems to work + and you should know that, even if clock selection and timers are + available (which posix 2008 requires), it's still optional + no, glibc lies + !! + our "support" is a mere hack shifting CLOCK_REALTIME + it should at least lie consistently :) + we need to implement CLOCK_MONOTONIC properly + ya... that would be very nice indeed + not that hard either + i agree! + we just have to do it right + fwiw, i plan to keep this assert in glib + yes, it's good + is there anywhere i can file a bug to give you guys some advance + warning? + i don't think it's needed + we know the problem + k -- consider yourself warned, then :) + and it's been a bigger concern recently + awesome. glad i don't have to do anything :) + if it's not already done, i suggest you check for the + CLOCK_MONOTONIC option + fwiw, i'm trying to get a regular debian/gnu/hurd build of + glib/gtk/etc setup + regular ? + ya... out of git master on a daily basis + from sources ? 
+ oh nice + we recently set this up for freebsd as well + few maintainers take the pain :) + our non-linux 'problem discovery' is a bit crap before now :/ + i guess that's pretty normal + i don't consider it the responsibility of the maintainers to test + every possible platform + glib is a bit unique -- portability is our business + taking our patches into consideration is what we ask most + right + and the "please take the patches" thing is something we want to + stop doing + why ? + mostly because we often look at a patch that someone sent a few + years ago and say "do we even still need this?" + and have no way to know + uh + you would not believe how many patches like this we've + accumulated... + but if we send it now ? :) + braunr: new policy is roughly this: + https://wiki.gnome.org/Projects/GLib/SupportedPlatforms + ie: fixes for issues that are general portability improvements and + POSIX compliance are welcome... + patches that introduce platform-specific #ifdef sections are + rejected unless we have a regular builder to test that code + i see + again, regarding portability, don't consider CLOCK_MONOTONIC to be + readily available, check for it + an #error would be enough but it has to be checked + it basically comes down to: we don't want to have code in our + version control that we have no possible way of testing + yes + braunr: we do check for it + ok + we assert() if clock_gettime(CLOCK_MONOTONIC) fails + no i mean + as POSIX said it should if CLOCK_MONOTONIC is not supported + if you lie to us.... well, not much we can do + POSIX_MONOTONIC_CLOCK + _POSIX_MONOTONIC_CLOCK + this is actually defined to 0 on most platforms... + which does not mean that it's unsupported -- it means that the + runtime must be ready to deal with it not actually existing at runtime + really ? 
+ yes + we used to rely on this and got a bug that we were doing it wrong + :) + and indeed, even on linux, both with glibc and uclibc: + /usr/include/bits/posix_opt.h:#define _POSIX_MONOTONIC_CLOCK + 0 + /usr/include/uClibc/bits/posix_opt.h:#define _POSIX_MONOTONIC_CLOCK + 0 + ok it's described in 2.1.6 Options + so your check is appropriate + so does clock_gettime(MONOTONIC) on debian/hurd get me realtime? + either that, or a value shifted from it + if so, i'll just hack out the condattr_setclock() check and proceed + trying to build past glib... + * desrt checks + as it is, even the build of glib fails since we use some tools + linked against ourselves during the build process... + 1393124084790000 1393124084790000 + those look the same.... + heh + i also notice that your clocks are not very high precision :) + that's right + HZ = 100, i guess + yes + fair enough + our mainloop doesn't support better-than-millisecond accuracy yet + anyway :) + (although it will soon...) + nice diff --git a/open_issues/code_analysis.mdwn b/open_issues/code_analysis.mdwn index 67798c6a..d61d5921 100644 --- a/open_issues/code_analysis.mdwn +++ b/open_issues/code_analysis.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -87,8 +87,70 @@ There is a [[!FF_project 276]][[!tag bounty]] on some of these tasks. * [Frama-C](http://frama-c.com/) + btw, I've been looking at http://frama-c.com/ lately + it's a theorem prover for c/c++ + oh nice + I think it's most impressive, it works on the hurd (aptitude + install frama-c o_O) + *and it works + "Simple things should be simple, + complex things should be possible." 
+ :) + looks great + even the gui is awesome, allows one to browse source code in + a very impressive way + clear separation between value changes, dependencies, side + effects + we could have plugins for stuff like ports + handles concurrency oO + so you want to use Frame-C to analyze the whole Hurd code + base? + nalaginrut: well, frama-c looks "able" to assist in + analyzing the Hurd, yes + nalaginrut: but theorem proving is a manual process, one + needs to guide the prover + nalaginrut: b/c some stuff is not decideable + I ask this because I can imagine how to analyze Linux + since all the code is in a directory. But Hurd's codes are + distributed to many other projects + that's not a problem + each server can be analyzed separately + braunr: also, each "entry point" + alright, but sounds a big work + it is + otherwise, formal verification would be widespread :) + that, and most tools are horrible to use, frama-c is really + an exception in this regard + * [Coverity](http://www.coverity.com/) (nonfree?) + * IRC, OFTC, #debian-hurd, 2014-02-03 + + btw, did you consider adding hurd and mach to to detect bugs automatically? + I found lots of bugs in gnash, ipmitool and sysvinit when I + started scanning those projects. :) + i did some static analysis work, i haven't used coverty + but free tools for that + i think thomas wanted to look into coverty though + quite easy to set up, but you need to download and run a + non-free tarball on the build host. + does that tar ball contains binary code ? + that'd be a show stopper for the hurd of course + did not investigate. I just put it in a contained virtual + machine. + did not want it on my laptop. :) + prefer free software here. :) + but I did not have to "accept license", at least. :) + + * IRC, OFTC, #debian-hurd, 2014-02-05 + + ah, cool. + is now in place. :) + + [[microkernel/mach/gnumach/projects/clean_up_the_code]], + *Code_Analysis, Coverity*. 
+ * [Splint](http://www.splint.org/) * IRC, freenode, #hurd, 2011-12-04 diff --git a/open_issues/code_analysis/discussion.mdwn b/open_issues/code_analysis/discussion.mdwn index 4cb03293..45126b91 100644 --- a/open_issues/code_analysis/discussion.mdwn +++ b/open_issues/code_analysis/discussion.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -100,6 +100,146 @@ License|/fdl]]."]]"""]] https://teythoon.cryptobitch.de/qa/2013-10-17/scan-build-2/ +### IRC, freenode, #hurd, 2013-11-04 + + btw, why does the nested functions stuff needs the executable + stack? for trampolines? + yes + I didn't even realize that, that's one more reason to avoid them + indeed + + braunr: kern/slab.c (1471): vm_size_t info_size = info_size; + yes ? + braunr: what's up with that? + that's one way to silence gcc warnings about uninitialized + variables + this warning can easily result in false positives when gcc is + unable to determine dependencies + e.g. if (flag & FLAG_CREATE) myvar = create(); ...; ... if (flag & + FLAG_CREATE) use(myvar) + well, ok, that's a shortcomming of gcc + braunr: your way of silencing that in gcc still shows up in + scan-build and most likely any more advanced analysis tool + as it should of course, but it is noisy + teythoon: there is a gcc attribute for that + __attribute__((unused)) + analysis tools might know that better + braunr: could you have a quick look at + http://darnassus.sceen.net/~teythoon/qa/gnumach/scan-build/2013-11-04/report-mXqstT.html#EndPath + ? + nice + anything else on the rbtree code ? + well + + http://darnassus.sceen.net/~teythoon/qa/gnumach/scan-build/2013-11-04/report-LyiOO1.html#EndPath + but this is of length 18, so it might be far-fetched + ?? 
+ the length of the chain of argumentation + i don't understand that issue + isn't 18 the analysis step ? + well, the greater the length, the more assumption the tool + makes, the more likely it is that it just does not "get" some invariant + probably yes + the code can segfault if input parameters are invalid + that's expected + right, looks like this only happens if the tree is invalid + if in line 349 brother->children[right] is NULL + this is a very good target for verification using frama-c + :) + the code already has many assertions that will be picked up by + it automatically + so what about the dead store, is it a bug or is it harmless ? + harmless probably + certainly + a simple overlook when polishing + + +### IRC, freenode, #hurd, 2014-01-16 + + braunr: hi. Once, when I wrote a lot if inline gcc functions in + kernel you said me not to use them. And one of the arguments was that you + want to know which binary will be produced. Do you remember that? + not exactly + it seems likely that i advice not to use many inline functions + but i don't see myself stating such a reason + braunr: ok + so, what do you think about using some high level primitives in + kernel + like inline-functions + ? + "high level primitives" ? + you mean switching big and important functions into inline code ? + braunr: something that is hard to translate in assembly directly + braunr: I mean in general + i think it's bad habit + braunr: why? + don't inline anything at first, then profile, then inline if + function calls really are a bottleneck + my argument would be that it makes code more readable + https://www.kernel.org/doc/Documentation/CodingStyle <= see the + "inline disease" + uh + more readable ? 
+ the only difference is an inline keyword + sorry + i confused with functions that you declare inside functions + nested + forgot the word + sorry + ah nested + my main argument against nested functions is that they're not + standard and hard to support for non-gcc tools + another argument was that it required an executable stack but + there is apparently a way to reliably make nested functions without this + requirement + so, at the language level, they bring nice closures + the problem for me is at the machine level + i don't know them well so i'm unable to predict the kind of code + they generate + but i guess anyone who would take the time to study their + internals would be able to do that + and why this last argument is important? + because machine code runs on machines + one shouldn't ignore the end result .. + if you don't know the implications of what you're doing precisely, + you loose control over the result + if you can trust the tool, fine + mcsim: in general, when you use something you don't really + understand how it works internally, you've a much higher risk of making + bugs or inefficient code because you just didn't realize it couldn't work + or would be inefficient + but in the case of a kernel, it often happens that you can't, or + at least not in a straightforward way + s/loose/lose/ + kilobug: and that's why for kernel programming you try to use the + most straightforward primitives as possible? 
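Editorial note on the portability argument above: a minimal sketch of the standard C idiom that can replace a GNU C nested function used as a callback. The state that the nested function would have captured from its enclosing function is passed explicitly through a context pointer, so no trampoline is involved and tools other than gcc can process the code. All names here (`sum_ctx`, `for_each`, `add_to_sum`, `sum_values`) are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* State that a nested function would have captured implicitly.  */
struct sum_ctx
{
  long total;
};

/* Callback type: receives one element plus the caller's context.  */
typedef void (*visit_fn) (int value, void *ctx);

/* Call FN on each of the N elements of V, handing CTX through untouched.  */
static void
for_each (const int *v, size_t n, visit_fn fn, void *ctx)
{
  for (size_t i = 0; i < n; i++)
    fn (v[i], ctx);
}

/* Ordinary file-scope function standing in for the nested function.  */
static void
add_to_sum (int value, void *ctx)
{
  struct sum_ctx *sum = ctx;

  sum->total += value;
}

/* Sum the N elements of V using the callback machinery above.  */
static long
sum_values (const int *v, size_t n)
{
  struct sum_ctx ctx = { 0 };

  for_each (v, n, add_to_sum, &ctx);
  return ctx.total;
}
```

The cost is some extra boilerplate compared with a closure, but the data flow stays visible at every call site.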
+ no + mcsim: not necessarily the most straightforward ones, but ones + you understand well + keeping things simple is a way to keep control complexity in any + software + as long as you understand, and decouple complicated things apart, + you can keep things simple + nested functions doesn't have to do with complexity + don't* + it's just that, since they're not standard and commonly used + outside gnu projects, they're not well known + i don't "master" them + also, they decouple the data flow from the control flow + which in my book is bad for imparative languages + and support for them in tools like gdb is poor + braunr: I remembered nested functions because now I use C++ and I + question myself if I may use all these C++ facilities, like lambdas, + complicated templates and other stuff. + kilobug: And using only things that you understand well sounds + straightforward and logical + that's why i don't write c++ code :) + it's very complicated and requires a lot of effort for the + developer to actually master it + mcsim: you can use those features, but sparsely, when they really + do bring something useful + + # Leak Detection See *Leak Detection* on [[boehm_gc]]. diff --git a/open_issues/crash_server.mdwn b/open_issues/crash_server.mdwn index 5182df6f..3d656082 100644 --- a/open_issues/crash_server.mdwn +++ b/open_issues/crash_server.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2009, 2010, 2011, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2009, 2010, 2011, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -248,6 +248,22 @@ one... rekado: In case that's still helpful: . + +# IRC, freenode, #hurd, 2013-12-14 + + How to get a core dump? 
+ either set CRASHSERVER to /servers/crash-dump-core for the + process you want the core file of + or make /servers/crash point to crash-dump-core to make this the + default for all processes + does it work now, it did not before? + it does for me, never had issues + k! + well, i believe the second option has issues + if two processes crash, both may write/create a file in the same + location + + --- If someone is working in this area, they may want to have a look at diff --git a/open_issues/dbus.mdwn b/open_issues/dbus.mdwn index 4473fba0..b3bebf48 100644 --- a/open_issues/dbus.mdwn +++ b/open_issues/dbus.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -365,3 +365,138 @@ See [[glibc]], *Missing interfaces, amongst many more*, *`SOCK_CLOEXEC`*. anyway how do you plan to implement credential checking ? I'll mail patches RSN + + +# IRC, freenode, #hurd, 2013-11-03 + + Finally, SCM_CREDS (IDs) works:) I was on the right track all the + time, it was just a small misunderstanding. + remains to solve the PID check + gnu_srs: it should be a matter of adding + proc_user/server_authenticate + there are no proc_user/server_authenticate RPCs? + do you mean adding them to process.defs (and implement them)? + gnu_srs: I mean that, yes + + +# IRC, freenode, #hurd, 2013-11-13 + + BTW: I have to modify the SCM_RIGHTS patch to work together with + SCM_CREDS, OK? + probably + depends on what you change of course + + +# IRC, freenode, #hurd, 2013-11-15 + + Hi, any ideas where this originates, gdb? warning: Error setting + exception port for process 9070: (ipc/send) invalid destination port + gnu_srs: what's process 9070 ? + braunr: It's a test program for sending credentials over a + socket. Have to create a reproducible case, it's intermittent. 
+ The error happens when running through gdb and the sending + program is chrooted: + -rwsr-sr-x 1 root root 21156 Nov 15 15:12 + scm_rights+creds_send.chroot + + +## IRC, freenode, #hurd, 2013-11-16 + + Hi, I have a problem debugging a suid program, see + http://paste.debian.net/66171/ + I think this reveals a gnumach/hurd bug, it makes things behave + strangely for other programs. + How to get further on with this? + Or can't I debug a suid program as non-root? + gnu_srs: if gdb doesn't work for setuid programs on hurd, I suppose + you could chmod -s the binary you're trying to debug, login as root and + run it under gdb + pochu: When logged in as root the program works, independent of + the s flag setting. + right, probably the setuid has no effect in that case because your + effective uid is already fine + so you don't hit the gdb bug in that case + (just guessing) + It doesn't work in Linux either, so it might be futile. + trying + hmm that may be the expected behaviour. after all, gdb needs to be + priviledged to debug priviledged processes + Problem is that it was just the suid properties I wanted to + test:( + gnu_srs: imagine if you could just alter the code or data of any + suid program just because you're debugging it + + +## IRC, freenode, #hurd, 2013-11-18 + + Hi, is the code path different for a suid program compared to run + as root? + Combined with LD_PRELOAD? + gnu_srs: afaik LD_PRELOAD is ignored by suid programs for + obvious security reasons + aha, thanks:-/ + gnu_srs: what's your problem with suid ? + I made changes to libc and tried them out with + LD_PRELOAD=... test_progam. It worked as any user (including root), + but not with suid settings. Justus explained why not. + well i did too + but is that all ? + i mean, why did you test with suid programs in the first place ? 
+ to get different euid and egid numbers + + hi, anybody seen this with eglibc-2.17-96: locale: relocation + error: locale: symbol errno, + version GLIBC_PRIVATE not defined in file libc.so.0.3 with link + time reference + yes, I have + but afaics nothing did break, so I ignored it + + +## IRC, freenode, #hurd, 2013-11-23 + + Finally 8-) + Good news: soon both SCM_CREDS _and_ SCM_RIGHTS is supported + jointly. RFCs will be sent soon. + + +## IRC, freenode, #hurd, 2013-12-05 + + I have a problem with the SCM_CREDS patch and dbus. gamin and my + test code runs fine. + the problem with the dbus code is that it won't work well with + auth_user_authenticate in sendmsg and auth_server_authenticate in + recvmsg. + Should I try to modify the dbus code to make it work? + unless you manage to prove that dbus is not following the posix + standard, there is no reason why you should have to modify dbus + I think the implementation is correct, + but auth_user_authenticate hangs sendmsg until + auth_seerver_authenticate is executed in recvmsg. + and dbus is not doing that, so it hangs in sendmsg writing a + credentials byte. + well the credentials byte is definitely non-posix. + I found a bug related to the HURD_DPORT_USE macro too:-( + ah, yes, auth_user_authenticate might be synchronous indeed, let me + think about it + Nevertheless, I think it's time to publish the code so it can be + commented on:-D + sure + publish early, publish often + + +# IRC, freenode, #hurd, 2014-01-17 + + youpi: as a start all our requested dbus changes are now + committed, and in Debian unstable + good :) + + +# IRC, freenode, #hurd, 2014-01-30 + + dbus has some known problems + known fixes too? 
+ http://www.gnu.org/software/hurd/open_issues/dbus.html + pochu: Maybe that page should be updated: + http://lists.nongnu.org/archive/html/bug-hurd/2013-12/msg00150.html + gnu_srs: well, maybe you can do it : + ) diff --git a/open_issues/dbus_in_linux_kernel.mdwn b/open_issues/dbus_in_linux_kernel.mdwn index caf47711..6f83db03 100644 --- a/open_issues/dbus_in_linux_kernel.mdwn +++ b/open_issues/dbus_in_linux_kernel.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -74,3 +75,90 @@ Might be interesting to watch how this develops. [AF_BUS, D-Bus, and the Linux kernel](http://www.kroah.com/log/linux/af_bus.html), Greg Kroah-Hartman, 2013-02-08. + + +# kdbus + + +## IRC, freenode, #hurd, 2014-01-28 + + i would like to see things like dbus and zeromq use an optimized + microkernel transport one day + we could port kdbus >,< + why not + you port cgroups first + exactly + :p + +[[systemd]]. + + +## IRC, freenode, #hurd, 2014-02-23 + +In context of [[linux_as_the_kernel]], *IRC, freenode, #hurd, 2014-02-23*. + + mach seems like this really simple thing when you first explain + what a microkernel is + and because of that, i think it's better to start the right + solution directly + it looks simple, it's clearly not + but i did a bit of looking into it... 
it's a bit non-trivial after + all :) + mach ipc is over complicated and error prone + it leads to unefficient communication compared to other solutions + such as what l4 does + ya -- i hear that this is a big part of the performance hit + that's why i've started x15 + i was also doing some reading about how it's based on mapping + memory segments between processes + first, it was a mach clone, but since i've come to know mach + better, it's now a "spiritual" mach successor .. :) + these are two issues that we've been dealing with at another + level... in the design of kdbus + ah kdbus :) + this is something that started with my masters thesis a long time + ago... + ah you too + first thing we did is make the serialisation format so that all + messages are valid and therefore never need to be checked + (old dbus format requires checks at every step on the way) + looks interesting + then of course we cut the daemon out + but some other interesting things: security is super-simple... it's + based enirely on endpoints + either you're allowed to send messages between two processes or + you're not + there is no checking for message types, for example + yes + and the other thing: memory mapping is usually bad + that's what i mean when i say mach ipc is over complicated + it depends + the kdbus guys did some performance testing and found out that if + the message is less than ~512k then the cost of invalidating the TLB in + order to change the memory mapping is higher than the cost of just + copying the data + yes, we know that too + that's why zero copy isn't the normal way of passing small amounts + of data over mach either + nice + i got the impression in some of my reading (wikipedia, honestly) + that memory mapping was being done all the time + well + no it's not + memory mapping is unfortunately a small fraction of the + performance overhead + that's good :) + that being said + memory mapping can be very useful + for example, it's hard for us to comply with posix 
requirements of + being able to read/write at least 2G of data in a single call + weird bugs occur beyond 512M iirc + you do want memory mapping for that + ya... for things of this size.... you don't want to copy that + through a socket :) + monolithic kernels have it naturally, since the kernel is mapped + everywhere + for microkernels, it's a little more complicated + and the problem gets worse on smp + again, that's why i preferred starting a new kernel instead of + reusing linux diff --git a/open_issues/dde.mdwn b/open_issues/dde.mdwn index fe9fd8aa..9d8bf509 100644 --- a/open_issues/dde.mdwn +++ b/open_issues/dde.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -579,6 +579,41 @@ In context of [[libpthread]]. (well high, 4 MiB/s or more) +## IRC, freenode, #hurd, 2013-11-20 + + for example, netdde needs more reviewing and polishing + it is known to deadlock sometimes + what deadlocks ? 
+ i'm not sure + ah, netdde + right + yes + I'm seeing that to on one of my vms + nasty one + i know something is wrong with the condition_wait_timeout function + for example + breaks sysvinit shutdown + because it was taken without modification from libpthread + it might be that, or something else + well, dhclient hangs releasing the lease + that's still on my todo list + so I'm pretty sure it's related + hm + maybe + :/ + + +## IRC, freenode, #hurd, 2014-02-11 + + teythoon: looks like a netdde/pfinet freeze/deadlock + yes a netdde deadlock + i really have to fix that too one day :( + hehe :) + the netdde locking privimites are copies of the "old" pthread + ones, instead of reusing pthread + primitives* + + # IRC, freenode, #hurd, 2012-08-18 hm looks like if netdde crashes, the kernel doesn't handle it @@ -602,4 +637,15 @@ In context of [[libpthread]]. partitions/media... +## IRC, freenode, #hurd, 2013-12-03 + + how about porting linux block device layer via dde as mcsim wanted to + do? then all linux filesystems could be brought in, right? + gg0: that should be done, but we need to correctly deal with + multiple pci devices in userspace and arbitration + wouldn't adding support to passive translator into Linux + filesystems be quite some work ? IIRC ext2fs needs a special "owner = + hurd" mode to handle them + + # [[virtio]] diff --git a/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn b/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn index 3faa56fc..7b300ea1 100644 --- a/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn +++ b/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -144,3 +144,61 @@ See also discussion about *multiboot* on [[arm_port]]. 
matlea01: you need something with multiboot support (like grub) to provide the various bootstrap modules to the kernel Ah, I see + + +## IRC, freenode, #hurd, 2014-02-24 + + hi, will grub load mach kernel to fix address? and which + address? + I want to use qemu gdb support to debug mach + need add-symble-file to right address + congzhang: see objdump gnumach + grub simply follows what's provided by the ELF format of the ELF + file + I think it's default value of _start in ELF, right? + hmm...the actual entry point should plus the size of + multi_boot header, at least 0xc... + youpi: I try that, but not works + I start qemu with -s + the /bin/console was very easy to cause black death, and I want + to use gdb to check whether the mach is death + I will try again later + Anyone know some tutorial to debug mach with qemu? + for better debug, I suggest bochs + although it's slower + nalaginrut: maybe it's my problem, I did not do the right thing + qemu with kvm was great. + qemu with kvm is cool to run, but not so cool for debug kernel + anyway, it's personal taste + you may use gdb for that + for bochs, you don't have to use external debugger + thanks for explain + does anyone succeed boot hurd with qemu multiboot boot + function? + with -kernel and -initrd command line parameter + I boot it with grub, in qemu, it's fine. Then I moved to + physical machine + boot with grub work for me too + I want to know whether it is possible to boot from qemu + directly + qemu can directly load kernel and hurd module for linux + nalaginrut: can you help to test whether hurd-console service + start will cause hurd black death? 
+ I know qemu can boot Linux without MBR, but I don't know if + it's true for Hurd too + congzhang: I'm busy for other works now ;-) + ok, thks:) + qemu's multiboot options don't seem to allow providing + ext2fs.static and ld.so, so I don't think it's possible + I try to do this, because hurd hurd-console cause system to + death very high frequency + (because qemu doesn't implement all of multiboot) + qemu help show that's possible, -initrd support multi module + and parameter + en, I will check with them later + how do you pass parameters to modules? + ah, right, it's after the file name + well, then simply try to pass the kernel, and the two modules + with the same option as in the grub config templates + it's fortunate that neither ext2fs nor exec need a comma on their + command line... diff --git a/open_issues/default_pager.mdwn b/open_issues/default_pager.mdwn index 9a8e9412..38c9a2be 100644 --- a/open_issues/default_pager.mdwn +++ b/open_issues/default_pager.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -35,3 +36,9 @@ License|/fdl]]."]]"""]] # [[trust_the_behavior_of_translators]] + + +# IRC, freenode, #hurd, 2013-10-30 + + it also seems that the kernel has trouble resuming processes that + have been swapped out diff --git a/open_issues/ext2fs_libports_reference_counting_assertion.mdwn b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn index 9ff43afa..2b9f28e8 100644 --- a/open_issues/ext2fs_libports_reference_counting_assertion.mdwn +++ b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013, 2014 Free Software 
Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -102,3 +103,9 @@ With that patch in place, the assertion failure is seen more often. if this erases the thread-specific area, we can expect all kinds of wreckage i'm not sure how to fix this though + + +# IRC, freenode, #hurd, 2014-01-29 + + ext2fs: ../../libports/port-ref.c:30: ports_port_ref: Assertion + `pi->refcnt || pi->weakrefcnt' failed. diff --git a/open_issues/gcc.mdwn b/open_issues/gcc.mdwn index 2b772cfc..6c14fdd4 100644 --- a/open_issues/gcc.mdwn +++ b/open_issues/gcc.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2007, 2008, 2009, 2010, 2011, 2012, 2013 Free -Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014 +Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -303,6 +303,47 @@ Last reviewed up to the [[Git mirror's 3a930d3fc68785662f5f3f4af02474cb21a62056 * [`-fsplit-stack`](http://nickclifton.livejournal.com/6889.html) + IRC, freenode, #hurd, 2014-01-10: + + Hi, I assume gcc -fsplit-stack is not yet supported? + gnu_srs1: + https://lists.gnu.org/archive/html/bug-hurd/2013-06/msg00100.html + braunr: That's exactly where the problem is: + src/libgcc/generic-morestack.c:814:__morestack_load_mmap + no return value recorded + creating a call: page = mmap ((void*)0x0, 0, 4, 2, -1, 0);, + returning EINVAL + lenght of 0 ? + yes, __morestack_current_segment, is zero + mmap is expected to return einval if the requested mapping has + a size of 0 .. 
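The EINVAL behaviour described for the quoted `mmap` call can be reproduced with a small sketch. This is not the libgcc code: `try_anon_mmap` is a hypothetical helper, and the flags used here (`MAP_PRIVATE | MAP_ANONYMOUS`, `PROT_READ`) are assumptions chosen so the call is otherwise valid, leaving the zero length as the only thing that can fail.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: try to map `len` bytes of anonymous memory.
   Returns 0 on success (the mapping is released again), or the errno
   value reported by mmap on failure. */
static int try_anon_mmap(size_t len)
{
    errno = 0;  /* start clean so a stale error is not mistaken for ours */
    void *page = mmap(NULL, len, PROT_READ,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return errno;           /* a zero length is rejected with EINVAL */
    munmap(page, len);
    return 0;
}
```

Calling `try_anon_mmap(0)` should report `EINVAL`, while a page-sized request should succeed, which matches the remark that a zero-sized mapping is expected to be refused.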
+ i don't know what split stack is, but i remember it's a + problem for the hurd + sorry, the address is zero from the above, and the length in + the call is zero too + yes that's what i understood + and i'm telling you it's normal + the size is invalid + libgcc/generic-morestack.c: mmap + (__morestack_current_segment, 0, PROT_READ, MAP_ANONYMOUS, -1, 0); + well this is wrong + and the error code stays, not being reset in subsequent + calls + causing an error later on + as roland says in + https://lists.gnu.org/archive/html/bug-hurd/2013-06/msg00102.html, it + should be possible to support split-stack now that we have tls + as thomas reported + i don't see the relation between split-stack and the mmap + invocation + tls s in 2.17-97, right? that's the one I tried + tls is there, but not split stack support + and libpthread still has bugs related to changing the stack + apparently + fixed upstream but not yet in debian packages + unless you want to try with the thread destruction packages + not sure it will change much though + * Also see `libgcc/config/i386/morestack.S`: comments w.r.t `TARGET_THREAD_SPLIT_STACK_OFFSET`/`%gs:0x30` usage; likely needs porting. @@ -498,6 +539,29 @@ Last reviewed up to the [[Git mirror's 3a930d3fc68785662f5f3f4af02474cb21a62056 [[!message-id "201211061305.02565.pino@debian.org"]]. + IRC, freenode, #hurd, 2014-01-08: + + How come __GLIBC__ is defined in gcc for kFreeBSD and not + GNU? 
They sometimes use that instead of __FreeBSD_kernel__ + it's defined by libc's /usr/include/features.h + pochu: __GLIBC__ is defined in features.h both for GNU and + kFreeBSD, but only in gcc/cpp for kFreeBSD: touch foo.h;gcc -E -dM + foo.h|grep GLIBC + gnu_srs: #include + pochu: they both include + gnu_srs: I get __GLIBC__ defined if I include features.h + with an empty file (as suggested by your `touch foo.h') I don't + get it defined, whether on hurd or linux, but I think that's expected + pochu: might be so but it is not pre-defined in CPP, as it is + for kFreeBSD. + I think it should not be defined, or it should be defined by + all three: GNU,.kFreeBSD and Linux + an anomaly, something for tschwinge + https://lists.debian.org/debian-bsd/2012/11/msg00016.html + braunr: good finding, I assume nothing has happened since + then? + not likely + * [low] Does `-mcpu=native` etc. work? (For example, 2ae1f0cc764e998bfc684d662aba0497e8723e52.) @@ -535,6 +599,42 @@ Last reviewed up to the [[Git mirror's 3a930d3fc68785662f5f3f4af02474cb21a62056 A lot of Linux-specific things. + * `libcilkrts` + + IRC, freenode, #hurd, 2014-01-10: + + bwaarf, libcilkrts in gcc-4.9 + libcilkrts? + the runtime for the cilk language I guess + Yes. That most likely needs disabling for us. + I'll hve a look eventually. + As soon as I get + + resolved, actually. + + [[!debbug 734973]]. 
+ + * `WCONTINUED` + + IRC, OFTC, #debian-hurd, 2014-02-25: + + youpi: some gcc-4.9 packages (and source) are needed for + gnat-4.9 to build: Is it OK to propose this patch: + http://paste.debian.net/84079/ + --- a/src/gcc/lto_lto.c.orig 2014-02-14 19:22:14.000000000 +0100 + +++ b/src/gcc/lto/lto.c 2014-02-25 20:50:20.000000000 +0100 + @@ -2476,7 +2476,11 @@ + int status; + do + { + +#ifdef __GNU__ + + int w = waitpid(0, &status, WUNTRACED); + +#else + int w = waitpid(0, &status, WUNTRACED | WCONTINUED); + +#endif + if (w == -1) + fatal_error ("waitpid failed"); + gnu_srs: rather ifndef WCONTINUED diff --git a/open_issues/gdb_catch_syscall.mdwn b/open_issues/gdb_catch_syscall.mdwn index 366c88f5..a875b211 100644 --- a/open_issues/gdb_catch_syscall.mdwn +++ b/open_issues/gdb_catch_syscall.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -8,7 +8,7 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]] -[[meta title="GDB: catch syscall"]] +[[!meta title="GDB: catch syscall"]] (gdb) catch syscall The feature 'catch syscall' is not supported on this architeture yet. 
diff --git a/open_issues/glibc.mdwn b/open_issues/glibc.mdwn index 5aec5139..8d18d1e2 100644 --- a/open_issues/glibc.mdwn +++ b/open_issues/glibc.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2007, 2008, 2010, 2011, 2012, 2013 Free Software -Foundation, Inc."]] +[[!meta copyright="Copyright © 2007, 2008, 2010, 2011, 2012, 2013, 2014 Free +Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -210,6 +210,14 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 * Missing interfaces, amongst many more. + IRC, freenode, #hurd, 2014-02-25: + + youpi et al.: Is it a useful GSoC task to have the student + implement interfaces in glibc that we are currently missing? + tschwinge: definitely + posix_timers would be great + tschwinge: probably + Many more are missing, some of which have been announced in `NEWS`, others typically haven't (like new flags to existing functions). Typically, porters will notice missing functionaly. But in case you're looking for @@ -270,6 +278,20 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 If we have all of 'em (check Linux kernel), `#define __ASSUME_ATFCTS`. + * `futimens` + + IRC, freenode, #hurd, 2014-02-09: + + it seems apt 0.9.15.1 has troubles downloading packages + etc., as opposed to apt 0.9.15 + ah, that version uses futimens unconditionally + and we haven't implemented that yet + did somebody file a bug for that apt-get issue? 
+ I haven't + I'll commit the fix in eglibc + but perhaps a bug report would be good for the kfreebsd + case + * `bits/stat.h [__USE_ATFILE]`: `UTIME_NOW`, `UTIME_OMIT` * `io/fcntl.h [__USE_ATFILE]` @@ -362,6 +384,374 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 http://darnassus.sceen.net/gitweb/savannah_mirror/glibc.git/blob/refs/heads/tschwinge/Roger_Whittaker:/hurd/hurdselect.c this is the client side implementation + IRC, freenode, #hurd, 2014-02-14: + + also: do you know if hurd has a modern-day poll() + replacement? ala epoll, kqueue, iocp, port_create(), etc? + last thing I remember was that there was no epoll + equivalent, but that was a few years ago :) + braunr: ^ + * desrt is about to replace gmaincontext in glib with something + more modern + * desrt really very much wants not to have to write a poll() + backend.... + it seems that absolutely every system that i care about, + except for hurd, has a new approach here :/ + even illumos has solaris-style ports + desrt: I suggest you bring up the question on bug-hurd + the poll() system call there to satisfy POSIX, but there + might be a better Hurd-specific thing you could use + is there* + that would be ideal + i have to assume that a system that passes to many messages + has some other facilities :) + *so many + the question is if they work with fds.... + bug-hurd doesn't seem like a good place to ask open-ended + questions.... + it's the main development lists, it's just old GNU naming + list* + k. thanks. + bug-hurd@gnu.org is the address + * desrt goes to bug... hurd + written. thanks. 
+ desrt: the hurd has only select/poll + it suffers from so many scalability issues there isn't + much point providing one currently + we focus more on bug fixing and posix compliance right now + fair answer + you should want a poll-based backend + it's the most portable one, and doesn't suck as much as + select + very easy to write + although, internally, our select/poll works just like a + bare epoll + i.e. select requests are installed, the client waits for + one or more messages, then uninstalls the requests + + IRC, freenode, #hurd, 2014-02-23: + + brings me to another question i asked here recently that + nobody had a great answer for: any plan to do kqueue? + not for now + i remember answering you about that + ah. on IRC or the list? + that internally, our select/poll implementation works just + like epoll + on irc + well "just like" is a bit far from the truth + well... poll() doesn't really work like epoll :p + internally, it does + even on linux + since both of us have to do the linear scan on the list + which is really the entire difference + that's the user interface part + i'm talking about the implementation + ya -- but it's the interface that makes it unscalable + i know + what i mean is + since the implementation already works like a more modern + poll + we could in theory add such an interface + but epoll adds some complicated detail + you'll have to forgive me a bit -- i wasn't around from a + time that i could imagine what a non-modern poll would look like + inside of a kernel :) + what i mean with a modern poll is a scalable poll-like + interface + epoll being the reference + * desrt is not super-crazy about the epoll interface.... 
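As a rough illustration of the poll-based backend suggested in this log, the following sketch waits for the first readable descriptor. `wait_readable` and its fixed limit of 16 descriptors are assumptions made for the example and come neither from glib nor from the Hurd; the closing loop is the linear scan mentioned above.

```c
#include <poll.h>
#include <stddef.h>

#define MAX_WATCHED 16          /* arbitrary limit for this sketch */

/* Hypothetical helper: wait up to timeout_ms for any of `fds` to become
   readable.  Returns the index of the first ready descriptor, -1 on
   timeout, or -2 on error (including EINTR, which a real backend would
   retry). */
static int wait_readable(const int *fds, size_t nfds, int timeout_ms)
{
    struct pollfd pfds[MAX_WATCHED];

    if (nfds > MAX_WATCHED)
        return -2;
    for (size_t i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    int n = poll(pfds, (nfds_t)nfds, timeout_ms);
    if (n < 0)
        return -2;
    if (n == 0)
        return -1;

    /* Linear scan over the whole set: every call pays for every fd. */
    for (size_t i = 0; i < nfds; i++)
        if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR))
            return (int)i;
    return -1;
}
```

The whole descriptor list is rebuilt and rescanned on each call, which is the scalability cost of the interface discussed here, as opposed to the implementation behind it.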
+ me neither + kevent() is amazing -- one syscall for everything you need + i don't know kqueue enough to talk about it + no need to do 100 epollctls when you have a whole batch of + updates to do + there's two main differences + first is that instead of having a bunch of separate fds for + things like inotify, timerfd, eventfd, signalfd, etc -- they're + all built in as different 'filter' types + second is that instead of a separate epoll_ctl() call to + update the list of monitored things, the kevent() call + (epoll_wait() equivalent) takes two lists: one is the list of + updates to make and the other is the list of events to + return.... so you only do one syscall + well, again, that's the interface + internally, there still are updates and waits + and on a multiserver system like the hurd, this would mean + one system call per update per fd + and then one per wait + on the implementation side, i think kqueue also has a nice + feature: the kernel somehow has some magic that lets it post + events to a userspace queue.... so if you're not making updates + and you do a kevent() that would not block, you don't even enter + the kernel + ok + hm. that's an interesting point + "unix" as such is just another server for you guys, right? + no + that's a major difference between the hurd and other + microkernel based systems + even multiserver ones like minix + we don't have a unix server + we don't have a vfs server or even an "fd server" + so mach knows about things like fds? + no + only glibc + oh. weird! + yes + that's the hurd's magic :) + being so posix compliant despite how exotic it is + this starts to feel like msvcrt :p + maybe, i wouldn't know + windows is a hybrid after all + with multiple servers for its file system + so why not + anyway + so windows doesn't have fds in the kernel either... 
the C + library runtime emulates them + mach has something close to file descriptors + which is fun when you get into dll hell -- sometimes you + have multiple copies of the C library runtime in the same program + -- and you have to take care not to use fds from one of them with + the other one + yes .. + that, i knew :) + but back to the hurd + since fds are a glibc thing here, and because "files" can + be implemented by multiple servers + (sockets actually most of the time with select/poll) + we have to make per fd requests + the implementation uses the "port set" kernel abstraction + right -- we could have different "fd" coming from different + places + do you know what a mach port is ? + not even a little bit + hm + i think it's what a plane does when it goes really fast, + right? + let's say it's a kernel message queue + no it's not a sonic boom + :) + ;p + so + ports are queues + (aside: i did briefly run into mach ports recently on macos + where they modified their kqueue to support them...) + queues of RPC requests usually + (but i didn't use them or look into them at all) + they can be referenced through mach port names, which are + integers much like file descriptors + they're also used for replies but, except for weird calls + like select/poll, you don't need to know that :) + a port set is one object containing multiple ports + sounds like dbus :) + the point of a port set is to provide the ability to + perform a single operation (wait for a message) on multiple ports + sounds like an epoll fd.... + is the port set itself a port? + so, when a client calls select, it translates the list of + fds into port names, creates reply ports for each of them, puts + them into a port set, sends one select request for each, and does + one blocking wait on the port set + no, but you can wait for a message on a port set the same + way you do on a port + and that's all it does + does that mean that you can put a port set inside of + another port set?
+ hm maybe + i guess in some way that doesn't actually make sense + i guess + because i assume that the message you sent to each port in + your example is "tell me when you have some stuff" + yes + and you'd have to send an equivalent message to the port + set.... and that just doesn't make sense + since it's not really a thing, per se + it would + instead of port -> port set, it would just be port -> port + set -> port set + but we don't have any interface where an fd stands for a + port set + what i'm trying to tell here is that + considering how it's done, you can easily see that there + has to be non trivial communication + each with the cost of a system call + and not just any system call, a messaging one + mach is clearly not as good as l4 when it comes to that + hrmph + and the fact that most pollable fds are either unix or + inet/inet6 sockets means that there will be contention in the + socket servers anyway + i've seen some of the crazy things you guys can do as a + result of the way mach works and way that hurd uses it, in + particular + normal users setting up little tcp/ip universes for + themselves, and so on + yes :) + but i guess this all has a cost + the cost here comes more from the implementation than the + added abstractions + mach provides async ipc, which can partially succeed + if i spin up a subhurd, it's using the same mach, right? + yes + that's neat + we tend to call them neighbour hurds because of that + i'm not sure it is + it puts it half way between linux containers and outright + VMs + because you have a new kernel.... ish... + well, it is for the same reasons hypervisors are neat + but the kernel exists within this construct.... + a new kernel ? + a new hurd + yes + but not a new mach + exactly + ya -- that's very cool + it's halfway between hypervisors and containers/jails + what matters is that we didn't need to write much code to + make it work + and that the design naturally guarantees strong isolation + right.
that's what i'm getting at + unlike containers + it shows that the interaction between mach and these set of + crazy things collectively referred to as the hurd is really + proper + usually + sometimes i think it's not + but that's another story :) + don't worry -- you can fix it when you port to L4 ;) + eh, no :) + btw: is this fundamentally the same mach as darwin? + yes + so i guess there are multiple separate implementations of a + standard set of interfaces? + ? + * desrt has to assume that apple wouldn't be using GNU mach, for + example... + no it's the same code base + they couldn't + but only because the forks have diverged a bit + ah + and they probably changed a lot of things in their virtual + memory implementation + so i guess original mach was under some BSDish type thing + and GNU mach forked from that and started adding GPL code? + something like that + makes sense + we have very few "non-standard" mach interfaces + but we now rely on them so we couldn't use another mach + either + back to the select/poll stuff + * desrt gets a lesson tonight :) + it costs, it's not scalable + but + we have scalability problems in our servers + they're old code, they use global locks + right. this is the story i heard last time. + probably from me + poll works good enough for us right now + we're more interested in bug fixes than scalability + currently + the reason this negative impacts me is because now i need + to write a bunch more code ;p + i hope this changes but we still get weird errors that + many applications don't expect and they react badly to those + well, poll really is the posix fallback + every other OS that we want to support has some sort of new + scalable epoll-type interface or is Windows (which needs separate + code anyway) + a very large number of them have kqueue... linux has + epoll... 
solaris/illumos is the odd one out with this weird thing + that's sort of like epoll + i would think you want a posix fallback for such a + commonly used interface + hm + braunr: hurd is pretty much the only one that doesn't + already have something better.... + linux can be built without epoll + and the nice thing about all of these things is that every + single one of them gives me an fd that can be polled when any + event is ready + i don't see why anyone would do that, but it's a compile + time option ;p + yes ... + we don't have xxxfd() :) + and we want to expose that fd on our API... so people can + chain gmaincontext into other mainloops + that's expected + so for hurd this means that i will need to spin up a + separate thread doing poll() and communicating back to the main + thread when anything becomes ready + i was looking forward to not having to do that :) + it matches the unix "everything is a file" idea, and + windows concept of "events" + i understand but again, it's a posix fallback + you probably want it anyway + probably + it could help new systems trying to be posix like + i honestly thought i'd get away with it, though + this is true... + CLOCK_MONOTONIC is an easy enough requirement to implement + or fake.... "modern event polling framework" is another story... + + [[clock_gettime]]. 
+ yes, but again, we do have the underlying machinery to add + it + i appreciate if your priorities are elsewhere ;) + it's just not worth the effort right now + although we do have performance and latency improvements + in our patch queues currently + if our network stack gets replaced, it would become + interesting + we need to improve posix compliance first + make more applications not choke on unexpected errors + and then we can think of improving scalability + +1 vote from me for implementing monotonic time :) + (and also pthread_condattr_setclock()) + and we probably won't implement the epoll interface ;p + yes + it's worth noting that there is also a semi-widely + available non-standard extension called + pthread_cond_timedwait_relative_np that you could implement + instead + it takes a (relative) timeout instead of an absolute one -- + we can use that if it's available + desrt: why would you want relative timeouts ? + braunr: if you're willing to take the calculations into + your own hands and you don't have another way to base it on + monotonic time it starts to look like a good alternative + and indeed, this is the case on android and macos at least + hm + not great as a user-facing API of course.... due to the + spurious wakeup possibility and need to retry + so it's a non standard alternative to a monotonic clock ?
+ no -- these systems have monotonic clocks + what they lack is pthread_condattr_setclock() + oh right + which is documented in POSIX but labelled as 'optional' + so relative is implicitely monotonic + yes + i imagine it would be the same 'relative' you get as the + timeout you pass to poll() + since basing anything like this on wallclock time is + absolutely insane + (which is exactly why we refuse to use wallclock time on + our timed waits) + sure + i'm surprised clock_monotonic is even optional in posix + 2008 + but i guess that's to give some transition margin for + small embedded systems + when you think about it, CLOCK_REALTIME really ought to + have been the optional feature + monotonic time is so utterly basic + yes + and that's how it's normally implemented + kernels provide a monotonic clock, and realtime is merely + shifted from it + * `sys/eventfd.h` * `sys/inotify.h` @@ -1129,6 +1519,82 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 ah ok you just pushed your tls. great! tls will fix a lot of things + IRC, OFTC, #debian-hurd, 2013-11-03: + + gg0: + #252 test_fork.rb:30:in `': core dumped + [ruby-core:28924] + FAIL 1/949 tests failed + with the to-be-uploaded glibc + why does it coredump? + that's the test i had workarounded by increasing sleep from 1 + to 3 but i don't recall it coredump'ed + *recall if + "sleep 1" at bootstraptest/test_fork.rb:33 + how can I run the test alone? + + IRC, OFTC, #debian-hurd, 2013-11-04: + + gg0: ^ + it should not take much + run $ make OPTS=-v test + found out how to minimize + mkdir _youpi && cp bootstraptest/{runner,test_fork}.rb _youpi + then run $ ./miniruby -I./lib -I. 
-I.ext/common + ./tool/runruby.rb --extout=.ext -- --disable-gems + "./_youpi/runner.rb" --ruby="ruby2.0 -I./lib" -q -v + youpi: that should work + #1 test_fork.rb:1:in `': No such file or + directory - /usr/src/ruby1.9.1-1.9.3.448/ruby2.0 + -I/usr/src/ruby1.9.1-1.9.3.448/lib -W0 bootstraptest.tmp.rb + [ruby-dev:32404] + seems it can't find /usr/src/ruby1.9.1-1.9.3.448/ruby2.0 + well it's ruby1.9.1 indeed :) + ok, got core + replace 2.0 with 1.9, check what you have in rootdir + k + Mmm, no, there's no core file + does stupidly increasing sleep time work? + nope + without *context it runs "make test" fine. real problems come + later with "make test-all" + wrt test_fork, is correspondence between signals correct? i + recall i read something about USR1 not implemented + USR1 is implemented, it's SIGRT which is not implemented + my next wild guess is that that has something to do with + atfork, whatever that means + it makes 2 forks: one sleeps for 1 sec then kills -USR1 + itself, the second traps USR1 in getting current time. in the + meanwhile parent sleeps for 2 secs + + IRC, OFTC, #debian-hurd, 2013-11-07: + + ruby2.0 just built on unstable + + IRC, OFTC, #debian-hurd, 2013-11-09: + + youpi: just found out a more "official" way to run one test + only + http://anonscm.debian.org/gitweb/?p=collab-maint/ruby1.9.1.git;a=blob;f=debian/README.porters;h=94aff7dd3ecd9f748498f2e285b4a4313b4b8f36;hb=HEAD + btw still getting coredumps? 
+ + IRC, OFTC, #debian-hurd, 2013-11-13: + + wrt the other test test_fork i suppose you made it not to + segfault anymore, it simply does fail + I haven't taken any particular care + didn't have any time to deal with it + + IRC, OFTC, #debian-hurd, 2013-11-14: + + btw patches to disable *context have been backported to 1.9 + as well so next 1.9 point release should have *context disabled + as 2.0 have + *has + i guess you'd like to get them reverted now + youpi: ^ + after testing that *context work, yes + * `sigaltstack` IRC, freenode, #hurd, 2013-10-09: @@ -1316,6 +1782,77 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 socket/socketpair, didn't we talk about them when i worked on eglibc 2.17? + * `mlock`, `munlock`, `mlockall`, `munlockall` + + IRC, freenode, #hurd, 2014-01-09: + + Hi, is mlock, mlockall et al implemented? + i doubt it + mlock could be, but mlockall only partially + + * [[glibc_IOCTLs]] + + * Support for `$ORIGIN` in the dynamic linker, `ld.so` + + IRC, freenode, #hurd, 2014-02-23: + + + https://www.gnu.org/software/hurd/user/jkoenig/java/report.html + says $ORIGIN patches have been added to Hurd. Have those hit the + mainline codebase? + + [[user/jkoenig/java]], [[user/jkoenig/java/report]]. + + It doesn't seem to work here, but perhaps I'm missing + something (I'm using the prebuilt Debian/Hurd 2014-02-11 VM + image) + objdump -x says the value of RPATH is $ORIGIN + But it doesn't load a library I placed in the same dir as + the binary + sjamaan: i'm not sure + sjamaan: what are you trying to do ? + + IRC, freenode, #hurd, 2014-02-24: + + braunr: I am working on a release of the CHICKEN Scheme + compiler. Its test suite is currently failing on the stand-alone + deployment tests. 
Either it should work and use $ORIGIN, or the + test should be disabled, saying Hurd is not supported for + stand-alone deployment-directories + braunr: The basic idea is to be able to create "appdirs" + like on OS X or PC-BSD, containing all the dependencies a program + needs, which can then simply be untarred + sjamaan: ok so you do need $ORIGIN + yeah + iiuc, so does Java. Does Java work on Hurd? + we had packages at the time jkoenig worked on it + integration of patches may have been incomplete, i wasn't + there at the time and i'm not sure + So it's safest to claim it's unsupported, for now? + yes + Thank you, I'll do that and revisit it later + + * `mig_reply_setup` + + IRC, freenode, #hurd, 2014-02-24: + + braunr: neither hurd, gnu mach or glibc provides + mig_reply_setup + i want to provide this function, where should i put it ? + i found some mach source that put it in libmach afaic + + ftp://ftp.sra.co.jp/.a/pub/os/mach/extracted/mach3/mk/user/libmach/mig_reply_setup.c + teythoon: what does it do ? + braunr: not much, it just initializes the reply message + libports does this as well, in the + ports_manage_port_operations* functions + teythoon: is it a new function you're adding ? + braunr: yes + braunr: glibc has a declaration for it, but no + implementation + teythoon: i think it should be in glibc + maybe in mach/ + For specific packages: * [[octave]] @@ -2115,6 +2652,15 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 +tst-tls-atexit-lib.c:35:3: warning: implicit declaration of function '__cxa_thread_atexit_impl' [-Wimplicit-function-declaration] * a600e5cef53e10147932d910cdb2fdfc62afae4e `Consolidate Linux and POSIX libc_fatal code.` -- is `backtrace_and_maps` specific to Linux? + + IRC, freenode, #hurd, 2014-02-06: + + why wouldn't glibc double free detection code also print + the backtrace on hurd ? 
+ I don't see any reason why + except missing telling glibc that it's essentially like on + linux + * 288f7d79fe2dcc8e62c539f57b25d7662a2cd5ff `Use __ehdr_start, if available, as fallback for AT_PHDR.` -- once we require Binutils 2.23, can we simplify [[glibc's process startup|glibc/process]] diff --git a/open_issues/glibc/0.4.mdwn b/open_issues/glibc/0.4.mdwn index 8991d4c0..33ef8f3a 100644 --- a/open_issues/glibc/0.4.mdwn +++ b/open_issues/glibc/0.4.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -15,6 +16,8 @@ Things to consider doing when bumping the glibc SONAME. There are some comments in the sources, for example `hurd/geteuids.c`: `XXX Remove this alias when we bump the libc soname.` +[[!toc]] + # IRC, freenode, #hurd, 2012-12-14 @@ -33,3 +36,42 @@ In context of [[packaging_libpthread]]/[[libpthread]]. [[!GNU_Savannah_bug 28934]], [[user/pochu]], [[!message-id "4BFA500A.7030502@gmail.com"]]. + + +# `time_t` -- Unix Epoch vs. 2038 + +## IRC, freenode, #hurd, 2013-12-12 + + because it gets discussed in #debian-devel for the Linux i386 + architecture right now: what's the deal with hurd-i386 and the 32bit + epoch overflow in 2038? + what do you mean ? + braunr: http://lwn.net/Articles/563285/ + ok but what do you mean ? 
+ i don't think there is anything special with the hurd about that + well, time_t is 64bit on amd64 AIUI + it's a signed long + so maybe the Hurd guys were clever from the start + k, k + our big advantage is that we can afford to break things a little + without too much trouble + in a system at work, we use unsigned 32-bit words + which overflows in 2106 + and we already include funny comments that predict our successors, + if any, will probably fail to deal with the problem until short before + the overflow :> + luckily, no nuclear reactors are running the Hurd sofar + i wonder how the problem will be dealt with though + ah, openbsd decided to break their abi + yeah + that's probably the simplest solution + "just recompile" + and they can afford it too + yeah + good to see people actually worry about it + I guess people are getting worried about where Linux embedded is + being put into + they're right about that + "Please, don't fix the 2038 year issue. I also want to have some + job security :)" + haha diff --git a/open_issues/glibc/debian/experimental.mdwn b/open_issues/glibc/debian/experimental.mdwn index 5168479d..273f02fd 100644 --- a/open_issues/glibc/debian/experimental.mdwn +++ b/open_issues/glibc/debian/experimental.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -130,6 +130,101 @@ Now in unstable. btw i saw too the segmentation fault when generating locales +## IRC, freenode, #hurd, 2014-02-04 + + hello + I just updated + Setting up locales (2.17-98~0) ... + Generating locales (this might take a while)... + en_US.UTF-8...Segmentation fault + done + bu^: That's known, it still seems to work, though. If you have + the time please debug. 
I've tried but not found the solution yet:-( + ok, just wanted to notify + + +## IRC, freenode, #hurd, 2014-02-19 + + for info, the localedef segfault has been fixed upstream + or rather, upstream has been written in a way that won't trigger + the segfault + it is caused by the locale archive code that maps the locale + archive file in the address space, enlarging the mapping as needed, but + unmaps the complete reserved size of 512M on close + munmap is implemented through vm_deallocate, but it looks like the + latter doesn't allow deallocating unmapped regions of the address space + (to be confirmed) + upstream code tracks the mapping size so vm_deallocate won't whine + i expect we'll have that in eglibc 2.18 + hm actually, posix says munmap must refer to memory obtained with + mmap :) + (or actually, that the behaviour is undefined, which most unix + systems allow anyway, but not us) + + also, before i leave, i have partially traced the localedef + segfault + ah, cool + localedef maps the locale archive, and enlarges the mapping as + needed + but munmaps the complete 512m reserved area + and i strongly suspect it unmaps something it shouldn't on the + hurd + since linux mmap has different boundaries depending on the mapping + use + while our glibc will happily maps stacks below text + the good news is that it looks fixed upstream + ah :) + + https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=17db6e8d6b12f55e312fcab46faf5d332c806fb6 + see the change about close_archive + i haven't tested it though + + +## IRC, freenode, #hurd, 2014-02-21 + + just upgraded to 2.18, locales still segfaults + ok + + +## IRC, freenode, #hurd, 2014-02-23 + + ok, as expected, the localdef bug is because of some mmap issue + +[[glibc/mmap]]. 
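The reserve-then-grow mapping pattern described in the 2014-02-19 log above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the glibc locale archive code: `struct archive`, `archive_open`, `archive_grow` and `archive_close` are hypothetical names invented here, and anonymous memory stands in for the archive file. The only point it tries to make is the one from the log: on close, unmap the size that was actually recorded at open time, not a hard-coded constant.

```c
#define _DEFAULT_SOURCE        /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping for a growable mapping (not glibc's real struct). */
struct archive {
    void  *addr;      /* start of the region obtained from mmap */
    size_t reserved;  /* bytes of address space actually reserved */
    size_t mapped;    /* bytes currently made accessible, <= reserved */
};

/* Reserve `reserve` bytes of address space without making them accessible. */
static int archive_open(struct archive *a, size_t reserve)
{
    void *p = mmap(NULL, reserve, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    a->addr = p;
    a->reserved = reserve;
    a->mapped = 0;
    return 0;
}

/* Make the first `len` bytes usable; refuse to grow past the reservation. */
static int archive_grow(struct archive *a, size_t len)
{
    if (len > a->reserved)
        return -1;
    if (mprotect(a->addr, len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    a->mapped = len;
    return 0;
}

/* Unmap exactly what archive_open recorded, never a fixed constant. */
static int archive_close(struct archive *a)
{
    assert(a->mapped <= a->reserved);
    int rc = munmap(a->addr, a->reserved);
    a->addr = NULL;
    a->reserved = 0;
    a->mapped = 0;
    return rc;
}
```

Growing via `mprotect` is only a shortcut to keep the sketch short; mapping a file over the reservation with `MAP_FIXED` would presumably use the same bookkeeping, with `archive_close` still relying on the recorded `reserved` value.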
+ + looks like our mmap doesn't like mapping files with PROT_NONE + shouldn't be too hard to fix + gg0: i should have a fix ready soon for localedef + + youpi: i have a patch for glibc about the localedef segfault + is that the backport we talked about, or something else? + something else + in short + mmap() PROT_NONE on files return 0 + ok + seems like fixable indeed + nothing is mapped, and the localdef code doesn't consider this an + error + my current fix is to handle PROT_NONE like PROT_READ + doesn't vm_protect allow to map something without giving read + right? + it probably does + the problem is in glibc + ok + when i say like PROT_READ, i mean a memory object gets a reference + on the read port returned by io_map + since it's not accessible anyway, it shouldn't make a difference + but i preferred to have the memory object referenced anyway to + match what i expect is done by other systems + + +## IRC, freenode, #hurd, 2014-02-24 + + braunr: ah ok + + ok that mmap fix looks fine, i'll add comments and commit it soon + + # IRC, OFTC, #debian-hurd, 2013-06-20 damn @@ -173,3 +268,62 @@ Now in unstable. I'd warmly welcome a way to detect whether being the / translator process btw it seems far from trivial + + +# glibc 2.18 vs. GCC 4.8 + +## IRC, freenode, #hurd, 2013-11-25 + + grmbl, installing a glibc 2.18 rebuilt with gcc-4.8 brings an + unbootable system + + +## IRC, freenode, #hurd, 2013-11-29 + + so, what do I do? rebuild the glibc 2.18 package with gcc4.8 and + see what breaks ? + when I boot a system with that libc that is ? + I wish youpi would have been more specific, I've never built the + libc before... + debian/rules build in the debian package + ctrl-c when you see gcc invocations + cd buildir; make lib others + although hm + what breaks is at boot time right ? + yes + heh .. 
+ then dpkg-buildpackage + DEB_BUILD_OPTIONS=nocheck speeds things up + just answer on the mailing list and ask him + he usually answers quickly + + +## IRC, freenode, #hurd, 2013-12-18 + + teythoon: k!, any luck with eglibc-2.18? + tbh i didn't look into this after two unsuccessful attempts at + building the libc package + there was a post over at the libc-alpha list that sounded + familiar + http://www.cygwin.com/ml/libc-alpha/2013-12/msg00281.html + wow + ? + this looks tricky + and why ia64 only + indeed + it's rare to see aurel32 ask such questions + + +## IRC, freenode, #hurd, 2014-01-22 + + btw, did anybody investigate the glibc-built-with-gcc-4.8 issue? + oddly enough, a subhurd boots completely fine with it + i didn't + no, sorry + I was wondering whether the bogus deallocation at boot might have + something to do + which one ? + ah + yes + maybe + quoted earlier here diff --git a/open_issues/glibc_ioctls.mdwn b/open_issues/glibc_ioctls.mdwn index 14329d0f..3f396754 100644 --- a/open_issues/glibc_ioctls.mdwn +++ b/open_issues/glibc_ioctls.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,7 +10,8 @@ License|/fdl]]."]]"""]] [[!tag open_issue_glibc]] -IRC, unknown channel, unknown date. + +# IRC, unknown channel, unknown date d'oh, broken defines for ioctl()! http://paste.debian.net/45021/ ← any idea about this? looks like something fishy with the SIO* defines @@ -70,3 +71,101 @@ IRC, unknown channel, unknown date. right which might end up in mach, other processes, other machines, etc. * pinotree s/Mach/Hurd/ :) + + +# `TIOCCONS` + +## IRC, freenode, #hurd, 2014-02-05 + + Hi, anybody have time to look at what fails with: ioctl(0, + TIOCCONS, NULL)? 
+ found a program doing the same function call as bootlogd: + http://paste.debian.net/80231/ + rpctrace: http://paste.debian.net/80232/ + gnu_srs: it seems there is a misunderstanding between linux and + *bsd on this one + to be able to work on *bsd (and on hurd too), the source code + should replace its NULL parameter with the address of an integer + containing 1 + see + http://lists.freebsd.org/pipermail/freebsd-current/2011-January/022116.html + for the bsd implementation, for instance + youpi: replacing 0 with &i where int i=1 gives: TIOCCONS: + Inappropriate ioctl for device + so be it, but that's clearly needed to be able to work on bsd + and probably the implementation is just missing on the Hurd for now + jus to be clear: do you mean 0 or NULL in: ioctl(0, TIOCCONS, + NULL)? + yes, for instance there is an implementation do_tiocsctty in glibc, + but no to_tioccons + I mean NULL + OK, that's where I changed, the first argument id the FD + well, when I wrote "NULL", I really meant "NULL" ... + yes sure, so you say that it is not yet implemented? + yes, for instance there is an implementation do_tiocsctty in glibc, + but no to_tioccons + easy to do? + no idea, I don't even know what that is suppsoed to do + it's probably something like tiocsctty, but I don't really know + Redirecting console output to a pseudotty + omg that ioctl is so ugly + the way I can see it working is to add an RPC to the /dev/console + translator (i.e. /hurd/term) to give it the fd, and have /hurd/term write + to it whenever it gets writes, instead of writing to the console device + gnu_srs: what do you need that for? + bootlogd in sysvinit use that for logging. + should I propose a patch to avoid the segfault when booting then? + at least, yes + *bsd will need it anyway + youpi: btw: hurd console does not work when running openrc, + neither is halt/reboot. Maybe you should try it out? + bootlogd use ioctl(0, TIOCCONS, NULL) a Linux (only) construct + ? 
+ gnu_srs: I had infinite time in the day, I would be able to try it + out, yes + heh + giving NULL to TIOCCONS is a linux-only construct, yes + to be compatible with *BSD, you have to pass the parameter + mentioned above + instead of NULL + well bootlogd is from sysvinit, so it is a matter if we move to + that for init. + ***checking if bootlogd segfaults on kFreeBSD too + + +# Non-constant structures as IOCTL parameter + +[[!debbug 413734]]. + + +## IRC, OFTC, #debian-hurd, 2014-02-16 + + https://bugs.debian.org/413734 + patch #2 has become http://paste.debian.net/plain/82412/ + ie. almost entirely ifdef'ing DeviceEnum + ok final patch is http://paste.debian.net/plain/82440/ + could anyone review it, especially last 3 oss hunks? + gg0: well probably it would be cleaner to have autoconf check for + any of the three soundcard.h include locations? + azeem: i think if upstream is ok with 2 it could be ok with 3 too + my concern is about linux/ in header path (hurd is not linux) and + about ways cleaner than last 2 hunks + well yeah, #ifdef __GNU__ #include certainly looks + ugly + i'll ifdef ioctls only + + +### IRC, OFTC, #debian-hurd, 2014-02-17 + + http://paste.debian.net/plain/82446/ + https://trac.videolan.org/vlc/ticket/10696 + + +### IRC, freenode, #hurd, 2014-02-17 + + porting vlc with http://paste.debian.net/plain/82446/ + + http://paste.debian.net/plain/82510/ + what's the proper way to fix ioctl instead of ifdef'ing them? 
+ see https://bugs.debian.org/413734 + gg0: defining them in libc + and in servers implementing them ofc diff --git a/open_issues/gnumach_memory_management.mdwn b/open_issues/gnumach_memory_management.mdwn index 60ec7357..b36c674a 100644 --- a/open_issues/gnumach_memory_management.mdwn +++ b/open_issues/gnumach_memory_management.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -2231,6 +2231,132 @@ There is a [[!FF_project 266]][[!tag bounty]] on this task. more of them to be needed) +## IRC, freenode, #hurd, 2014-02-11 + + youpi: what's the issue with kentry_data_size ? + I don't know + so back to 64pages from 256 ? + in debian for now yes + :/ + from what i recall with x15, grub is indeed allowed to put modules + and command lines around as it likes + restricted to 4G + iirc, command lines were in the first 1M while modules could be + loaded right after the kernel or at the end of memory, depending on the + versions + braunr: possibly VM_KERNEL_MAP_SIZE is then not big enough + youpi: what's the size of the ramdisk ? + youpi: or kmem_map too big + we discussed this earlier with teythoon + +[[user-space_device_drivers]], *Open Issues*, *System Boot*, *IRC, freenode, +\#hurd, 2011-07-27*, *IRC, freenode, #hurd, 2014-02-10* + + or maybe we want to remove kmem_map altogether and directly use + kernel_map + it's 6.2MiB big + hm + err no + looks small + 70MiB + ok yes + (uncompressed) + well + kernel_map is supposed to have 64M on i386 ... + it's 192M large, with kmem_map taking 128M + so at most 64M, with possible fragmentation + i believe the compressed initrd is stored in the ramdisk + ah, right it's ext2fs which uncompresses it + uncompresses it where + ? 
+ libstore does that + module --nounzip /boot/${gtk}initrd.gz + braunr: in userland memory + it's not grub which uncompresses it for sure + braunr: so my ramdisk isn't 64 megs either + which explains why it sometimes works + yes + mine is like 15 megs + kentry_data_size calls pmap_steal_memory, an early allocation + function which changes virtual_space_start, which is later used to create + the first kernel map entry + err, pmap_steal_memory is called with kentry_data_size as its + argument + this first kernel map entry is installed inside kernel_map and + reduces the amount of available virtual memory there + so yes, it all points to a layout problem + i suggest reducing kmem_map down to 64M + that's enough to get d-i back to boot + what would be the downside? + (why did you raise it to 128 actually? :) ) + i merged the map used by generic kalloc allocations into kmem_map + both were 64M + i don't see any downside for the moment + i rarely see more than 50M used by the slab allocator + and with the recent code i added to collect reclaimable memory on + kernel allocation failures, it's unlikely the slab allocator will be + starved + but then we need that patch too + no + it would be needed if kmem_map gets filled + this very rarely happens + is "very rarely" enough ? 
:) + actualy i've never seen it happen + i added it because i had port leaks with fakeroot + port rights are a bit special because they're stored in a table in + kernel space + this table is enlarged with kmem_realloc + when an ipc space gets very large, fragmentation makes it very + difficult to successfully resize it + that should be the only possible issue + actually, there is another submap that steals memory from + kernel_map: device_io_map is 16M large + so kernel_map gets down to 48M + if the initial entry (that is, kentry_data_size + the physical + page table size) gets a bit large, kernel_map may have very little + available room + the physical page table size obviously varies depending on the + amount of physical memory loaded, which may explain why the installer + worked on some machines + well, it works up to 1855M + at 1856 it doesn't work any more :) + heh :) + and that's about the max gnumach can handle anyway + then reducing kmem_map down to 96M should be enough + it works indeed + could you check the amount of available space in kernel_map ? + the value of kernel_map->size should do + printing it "multiboot modules" print should be fine I guess? + + +### IRC, freenode, #hurd, 2014-02-12 + + probably + ? + i expect a bit more than 160M + (for the value of kernel_map->size) + teythoon: ? + well, it's 2110210048 + what is multiboot modules printing ? + almost last in gnumach bootup + humm + it must account directly mapped physical pages + considering the kernel has exactly 2G, this means there is 36M + available in kernel_map + youpi: is the ramdisk loaded at that moment ? + what do you mean by "loaded" ? :) + created + where? + allocated in kernel memory + the script hasn't started yet + ok + its size was 6M+ right ? 
+ so it leaves around 30M + something like this yes + and changing kmem_map from 128M to 96M gave us 32M + so that's it + + # IRC, freenode, #hurd, 2013-04-18 oh nice, i've found a big scalability issue with my slab allocator diff --git a/open_issues/hurd_101.mdwn b/open_issues/hurd_101.mdwn index 25822512..e55b0e8e 100644 --- a/open_issues/hurd_101.mdwn +++ b/open_issues/hurd_101.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -98,3 +99,262 @@ Not the first time that something like this is proposed... server), yes braunr: thanks for all the info, hittin the sack now but ill have to set up a box and try to contribute + + +# Documentation + +## IRC, freenode, #hurd, 2013-11-04 + + i think the problem my hurd have not more developers or + contubutors is the project idears and management , eg, the most problem + is the mach kernel and documatation and the missing subsystem goals + (driver, etc) + no i think you and other have a clue but this is not + tranzparent when i read the webpage + well, fwiw I agree, the documentation is lacking + about what ? + something that doesn't exist ? + like smp or a generic device driver framework ? + no, high level concepts, design stuff + what ? + how come ? + not even the gnumach documentation is complete + for example ? + see http://www.sceen.net/~rbraun/doc/mach/ + which is my personal collection of docs on mach/hurd + and it's lacking at least one paper + well two, since i can't find the original article about the hurd + in pdf format + project ideas are clearly listed in the project ideas page + braunr: do you think the mach kernel decumatation a compleat? 
+ and you think its good documentatition about "how write a drive for mach" + and you think a answare is found why dont work smp and why is have no + arm, x64 support ? + stargater: + http://darnassus.sceen.net/~hurd-web/community/gsoc/project_ideas/ + the page is even named "project ideas" + the mach kernel is probably the most documented in the world + even today + and if there is no documentation about "how to write drivers for + mach", that's because we don't want in kernel drivers any more + and the state of our driver framework is practically non existent + it's basically netdde + partial support for network drivers from linux + that's all + we need to improve that + someone needs to do the job + noone has for now + that's all + why would we document something that doesn't exist ? + only stupid project managers with no clue about the real world do + that + (or great ones who already know everything there is to know before + writing code, but that's rare) + stargater: the answer about smp, architectures etc.. is the same + spirit and magic are nice ;-) braunr sorry, that is only my + meanig and i will help, so i ask and say what i think. when you say, hurd + and mach are good and we on the right way, then its ok for me . i wonder + why not more developer help hurd. and i can read and see the project page + fro side a first time user/developer + i didn't say they're good + they're not, they need to be improved + clearly + ok, then sorry + i wondered about that too, and my conclusion is that people aren't + interested that much in system architectures + and those who are considered the hurd too old to be interesting, + and don't learn about it + consider* + stargater: why are you interested in the hurd ? 
+ that's a question everyone intending to work on it should ask + the spirit of free software and new and other operation system, + with focus to make good stuff with less code and working code for ever + and everone can it used + well, if the focus was really to produce good stuff, the hurd + wouldn't be so crappy + it is now, but it wasn't in the past + a good point whas more documentation in now and in the future, + eg, i like the small project http://wiki.osdev.org/ and i like to see + more how understanding mach and hurd + I love osdev much, it taught me a lot ;-D + osdev is a great source for beginners + teythoon: what else did you find lacking ? + braunr: in my opinion the learning curve of Hurd development is + quite steep at the beginning + yes, documentation exists, but it is distributed all over the + internets + teythoon: hm ok + yes the learning curve is too hard + that's an entry barrier + + +# IRC, freenode, #hurd, 2014-02-04 + +[[!tag open_issue_documentation]] + + Does the GNU Mach kernel have concepts of capabilities? + yes + see ports, port rights and port names + Does it follow the take grant approch + approach* + probably + Can for example I take an endpoint that I retype from untyped + memory and mint it such that it only has read access and pass that to the + cspace of another task over ipc. + Where that read minted cap enforces it may onnly wait on that ep. + ep ? + ah + Endpoint. + probably + Alright cool. + it's a bit too abstract for me to answer reliably + ports are message queues + port rights are capabilities to ports + Not sure exactly how it would be implemented but essentially you + would have a guarded page table with 2 levels, 2^pow slots. + port names are integers referring to port rights + we don't care about the implementation of page tables + Each slot contains a kernel object, which in itself may be more + page tabels that store more caps. + it's not l4 :p + mach is more of a hybrid + It isn't a page table for memory. 
+ it manages virtual memory + Ah ok. + whatever, we don't care about the implementation + So if I want to say port an ethernet driver over. + whether memory or capabilities, mach manages them + Can I forward the interrupts through to my new process? + yes + it has been implemented for netdde + these are debian specific patches for the time being though + Great, and shared memory set ups are all nice and dandy. + yes, the mach vm takes care of that + Can I forward page faults? + Or does mach actually handle the faults? + (Sorry for so many questions just comparing what I know from my + microkernel knowledge to mach and gnu mach) + mach handles them but translates them to requests to userspace + pagers + (Still have a mach paper to read) + Alright that sounds sane. + Does GNU mach have benchmarks on its IPC times? + no but expect them to suck :) + Isn't it fixable though? + mach ipc is known to be extremely heavy in comparison with modern + l4-like kernels + not easily + Yeah so I know that IPC is an issue but never dug into why it is + bad on Mach. + So what design decision really screwed up IPC speed? + for one because they're completely async, and also because they + were designed for network clusters, meaning data is typed inside messages + Oh weird + So how is type marshalled in the message? + in its own field + messages have their own header + and each data field inside has its own header + Oh ok, so I can see this being heavy. + So the big advantage is for RPC + It would make things nice in that case. + Is it possible to send an IPC without the guff though? + Or would this break the model mach is trying to achieve? + I am assuming Mach wanted something where you couldn't tell if a + process was local or not. + So I am assuming then that IPC is costly for system calls from a + user process. + You have some sort of blocking wait on the call to the service + that dispatches the syscall. + I am assuming the current variants of GNU/Hurd run on glibc. 
+ It would be interesting to possibly replace that with UlibC or do + a full port of the FlexSC exceptionless system calls. + Could get rid of some of the bottlenecks in hurd assuming it is + very IPC heavy. + And that won't break the async model. + Actually should be simpler if it is already designed for that. + But would break the "distributed" vibe unless you had the faults + to those shared pages hit a page faulter that sent them over the network + on write. + + bwright: a lot of POSIX compatibility is handled by the glibc, + "porting" another libc to the Hurd will be a titanic task + In theory exceptionless system calls work fine on glibc, it is + just harder to get them working. + has not been done or was not explored in the paper. + Something about it having a few too many annoying assumptions. + Would be interesting to run some benchmarks on hurd and figure + out where the bottlenecks really are. + At least for an exercise in writing good benchmarks :P + I have a paper on the design of hurd I should read actually. + After I get through this l4 ref man. + the main bottleneck is scalability + there are a lot of global locks + and servers are prone to spawning lots of threads + because, despite the fact mach provides async ipc, the hurd mostly + uses sync ipc + so the way to handle async notifications is to receive messages + and spawn threads as needed + Lets take a senario + beyond that, core algorithms such as scanning pages in pagers, are + suboptimal + I want to get a file and send it across the network. + How many copies of the data occur? + define send + ouch :) + disk drivers are currently in the kernel + I read a block from disk, I pass this to my file system it passes + it to the app and it sends to the lwip or whatever interface then out the + ethernet card. 
+ and "block device drivers" in userspace (storeio) are able to + redirect file system servers directly to those in kernel drivers + so + kernel -> fs -> client -> pfinet -> netdde (user space network + drivers on debian hurd) + Alright. Hopefully each arrow is not a copy :p + it is + My currently multiserver does this same thing with zero copy. + because buffers are usually small + yes but zero copy requires some care + Which is possible. + and usually, posix clients don't care about that + Yes it requires a lot of care. + POSIX ruins this + Absolutely. + they assume read/write copy data, or that the kernel is directly + able to access data + But there are some things you can take care with + And not break posix and still have this work. + pfinet handles ethernet packets one at a time, and 1500 isn't + worth zero copying + This depends though right? + i'm not saying it's not possible + i'm saying most often, there are copies + So if I have high throughput I can load up lots of packets and + the data section can then be sectioned with scatter gather + again, the current interface doesn't provide that + Alright yeah that is what I expected which is fine. + It will be POSIX compliant which is the main goal. + not really scatter gather here but rather segment offloading for + example + ah you're working on something like that too :) + Yeah I am an intern :) + Have it mostly working, just lots of pain. + Have you read the netmap paper? + Really interesting. + not sure i have + unless it has another full name + 14.86 million packets per second out of the ethernet card :p + SMOKES everything else. + Implemented in Linux and FreeBSD now. + Packets are UDP 1 byte MTU I think + 1 byte data * + To be correct :p + right, i see + Break posix again + "More Extend" + i've actually worked on a proprietary implementation of such a + thing where i'm currently working + Bloody useful for high frequency trading etc. 
+ Final year as an undergraduate this year doing my thesis which + should be fun, going to be something OS hopefully. + Very fun field lots of weird and crazy problems. diff --git a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn index 11bebd6e..b571b82e 100644 --- a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn +++ b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -107,6 +107,34 @@ License|/fdl]]."]]"""]] now that's a good question... no idea TBH :-) +## IRC, freenode, #hurd, 2013-02-25 + + we should also discuss the mach_debug interface some day + it's not exported by libc, but the kernel provides it + slabinfo depends on it, and i'd like to include it in the hurd + but i don't know what kind of security problems giving access to + mach_debug RPCs would create + (imo, the mach_debug interface should be adjusted to be used with + privileged ports only) + (well, maybe not all mach_debug RPCs) + + +## IRC, freenode, #hurd, 2013-11-20 + + [...] 
we have to make the mach_debug interface available + well, i never took the time to integrate slabinfo into the hurd + repository + because it relies on the mach_debug interface + ah + while enabling that interface alone can't do harm, some debugging + functions shouldn't be usable by unprivileged applications + so it requires some discussions + i always delayed it because of more important stuff to do + but slabinfo is actually very useful + the more information we have about the system state, the better + so it's actually important + + # IRC, freenode, #hurd, 2012-07-23 aren't libmachuser and libhurduser supposed to be slowly faded @@ -123,18 +151,6 @@ License|/fdl]]."]]"""]] pinotree: libc should bring them -# IRC, freenode, #hurd, 2013-02-25 - - we should also discuss the mach_debug interface some day - it's not exported by libc, but the kernel provides it - slabinfo depends on it, and i'd like to include it in the hurd - but i don't know what kind of security problems giving access to - mach_debug RPCs would create - (imo, the mach_debug interface should be adjusted to be used with - privileged ports only) - (well, maybe not all mach_debug RPCs) - - # `gnumach.defs` [[!message-id diff --git a/open_issues/libpthread.mdwn b/open_issues/libpthread.mdwn index 0b426884..0294b008 100644 --- a/open_issues/libpthread.mdwn +++ b/open_issues/libpthread.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -1303,6 +1303,7 @@ Most of the issues raised on this page has been resolved, a few remain. after the system has been alive for some time ? 
(some time being at least a few hours, more probably days) + #### IRC, freenode, #hurd, 2013-07-05 ok, found the bug about invalid ports when adjusting priorities @@ -1312,6 +1313,149 @@ Most of the issues raised on this page has been resolved, a few remain. [[libpthread/t/fix_have_kernel_resources]]. +#### IRC, freenode, #hurd, 2013-11-25 + + youpi: btw, my last commit on the hurd repo fixes the urefs + overflow we've sometimes seen in the past in the priority adjusting code + of libports + + +#### IRC, freenode, #hurd, 2013-11-29 + +See also [[open_issues/libpthread/t/fix_have_kernel_resources]]. + + there still are some leak ports making servers spawn threads with + non-elevated priorities :/ + leaks* + issues with your thread destruction work ? + err, wait + why does a port leak cause that ? + because it causes urefs overflows + and the priority adjustment code does check errors :p + ^^ + ah yes, urefs... + apparently it only affects the root file system + hm + i'll spend an hour looking for it, and whatever i find, i'll + install the upstream debian packages so you can build glibc without too + much trouble + we need a clean build chroot on darnassus for this situation + ah yes + i should have time to set things up this week end + 1: send (refs: 65534) + i wonder what the first right is in the root file system + hm + search doesn't help so i'm pretty sure it's a kernel object + perhaps the host priv port + could be the thread port or something ? + no, not the thread port + why would it have so many refs ? + the task port maybe but it's fine if it overflows + also, some urefs are clamped at max, so maybe this is fine ? + it may be fine yes + err = get_privileged_ports (&host_priv, NULL); + iirc, this function should pass copies of the name, not increment + the urefs counter + it may behave differently if built statically + o_O y would it ? 
+ no idea + something doesn't behave as it should :) + i'm not asking why, i'm asking where :) + the proc server is also affected + so it does look like it has something to do with bootstrap + I'm not surprised :/ + + +#### IRC, freenode, #hurd, 2013-11-30 + + so yes, the host_priv port gets a reference when calling + get_privileged_ports + but only in the rootfs and proc servers, probably because others + use the code path to fetch it from proc + ah + well, it shouldn't behave differently + ? + get_privileged_ports + get_privileged_ports is explictely described to cache references + i don't get it + you said it behaved differently for proc and the rootfs + that's undesireable, isn't it ? + yes + ok + so it should behave differently than it does + yes + right + teythoon: during your work this summer, have you come across the + bootstrap port of a task ? + i wonder what the bootstrap port of the root file system is + maybe i got the description wrong since references on host or + master are deallocated where get_privileged_ports is used .. + no, I do not believe i did anything bootstrap port related + ok + i don't need that any more fortunately + i just wonder how someone could write a description so error-prone + .. + and apparently, this problem should affect all servers, but for + some reason i didn't see it + there, problem fixed + ? + last leak eliminated + cool :) + how ? + i simply deallocate host_priv in addition to the others when + adjusting thread priority + as simple as that .. + uh + sure ? 
+ so many system calls just for reference counting + yes + i did that, and broke the rootfs + well i'm using one right now + ok + maybe i should let it run a bit :) + no, for me it failed on the first write + teythoon: looks weird + so i figured it was wrong to deallocate that port + i'll reboot it and see if there may be a race + thought i didn't get a reference after all or something + I believe there is a race in ext2fs + teythoon: that's not good news for me + when doing fsysopts --update / (which remounts /) + sometimes, the system hangs + :/ + might be a deadlock, or the rootfs dies and noone notices + with my protected payload stuff, the system would reboot instead + of just hanging + oh + which might point to a segfault in ext2fs + maybe the exception message carries a bad payload + makes sense + exception handling in ext2fs is messy .. + braunr: and, doing sleep 0.1 before remounting / makes the + problem less likely to appear + ugh + and system load on my host system seems to affect this + but it is hard to tell + sometimes, this doesn't show up at all + sometimes several times in a row + the system load might simply indicate very short lived processes + (or threads) + system load on my host + ah + this makes me believe that it is a race somewhere + all of this + well, i can't get anything wrong with my patched rootfs + braunr: ok, maybe I messed up + or maybe you were very unlucky + and there is a rare race + but i'll commit anyway + no, i never got it to work, always hung at the first write + it won't be the first or last rare problem we'll have to live with + hm + then you probably did something wrong, yes + that's reassuring + + ### IRC, freenode, #hurd, 2013-03-11 youpi: oh btw, i noticed a problem with the priority adjustement @@ -1582,6 +1726,9 @@ Same issue as [[term_blocking]] perhaps? ## IRC, freenode, #hurd, 2013-01-06 it seems fakeroot has become slow as hell + +[[pfinet_timers]]. 
+ fakeroot is the main source of dead name notifications well, a very heavy one with pthreads hurd servers, their priority is raised, precisely to @@ -2008,3 +2155,260 @@ Same issue as [[term_blocking]] perhaps? handling, but there are still a few bugs remaining fyi, the related discussion was https://lists.gnu.org/archive/html/bug-hurd/2012-08/msg00057.html + + +## IRC, freenode, #hurd, 2014-01-01 + + braunr: I have an issue with tls_thread_leak + int main(void) { + pthread_create(&t, NULL, foo, NULL); + pthread_exit(0); + } + this fails at least with the libpthread without your libpthread + thread termination patch + because for the main thread, tcb->self doesn't contain thread_self + where is tcb->self supposed to be initialized for the main thread? + there's also the case of fork()ing from main(), then calling + pthread_exit() + (calling pthread_exit() from the child) + the child would inherit the tcb->self value from the parent, and + thus pthread_exit() would try to kill the father + can't we still do tcb->self = self, even if we don't keep a + reference over the name? + (the pthread_exit() issue above should be fixed by your thread + termination patch actually) + Mmm, it seems the thread_t port that the child inherits actually + properly references the thread of the child, and not the thread of the + father? + “For the name we use for our own thread port, we will insert the + thread port for the child main user thread after we create it.” Oh, good + :) + and, “Skip the name we use for any of our own thread ports.”, good + too :) + youpi: reading + youpi: if we do tcb->self = self, we have to keep the reference + this is strange though, i had tests that did exactlt what you're + talking about, and they didn't fail + why? 
+ if you don't keep the reference, it means you deallocate self + with the thread termination patch, tcb->self is not used for + destruction + hum + no it isn't + but it must be deallocated at some point if it's not temporary + normally, libpthread should set it for the main thread too, i + don't understand + I don't see which code is supposed to do it + sure it needs to be deallocated at some point + but does tcb->self has to wear the reference? + init_routine should do it + it calls __pthread_create_internal + which allocates the tcb + i think at some point, __pthread_setup should be called for it too + but what makes pthread->kernel_thread contain the port for the + thread? + but i have to check that + __pthread_thread_alloc does that + so normally it should work + is your libpthread up to date as well ? + no, as I said it doesn't contain the thread destruction patch + ah + that may explain + but the tcb->self uninitialized issue happens on darnassus too + it just doesn't happen to crash because it's not used + that's weird :/ + see ~youpi/test.c there for instance + humpf + i don't see why :/ + i'll debug that later + youpi: did you find the problem ? + no + I'm working on fixing the libpthread hell in the glibc debian + package :) + i.e. replace a dozen patches with a git snapshot + ah you reverted commit + +a + i imagine it's hairy :) + not too much actually + wow :) + with the latest commits, things have converged + it's now about small build details + I just take time to make sure I'm getting the same source code in + the end :) + :) + i hope i can determine what's going wrong tonight + youpi: with mach_print, i do see self being set by libpthread + .. + but to something other than 0 ? + yes + strangely, the other thread doesn't have the same value + are you sure it's self that you're displaying with the assembly code ?
+ oops, english + see test2 + so I'm positive + well, there obviously is a bug + but are you certain your assembly code displays the thread port + name ? + I'm certain it displays tcb->self + oh wait, hexadecimal, ok + and the value happens to be what mach_thread_self returns + ah right + ah, right, names are usually decimals :) + hm + what's the problem with test2 ? + none + ok + I was just checking what happens on fork from another thread + ok i do have 0x68 now + so the self field gets erased somehow + 15:34 < youpi> this fails at least with the libpthread without + your libpthread thread termination patch + how does it fail ? + ../libpthread/sysdeps/mach/pt-thread-halt.c:44: + __pthread_thread_halt: Unexpected error: (ipc/send) invalid destination + port. + hm + i don't have that problem on darnassus + with the new libc? + the pthread destruction patch actually doesn't use the tcb->self + name if i'm right + yes + what is tcb->self used for ? + it used to be used by pt-thread-halt + but is darnassus using your thread destruction patch? 
+ as I said, since your thread destruction pathc doesn't use + tcb->self, it doesn't have the issue + the patched libpthread merely uses the sysdeps kernel_thread + member + ok + it's the old libpthread against the new libc which has issues + yes it is + so for me, the only thing to do is make sure tcb->self remains + valid + we could simply add a third user ref but i don't like the idea + well, as you said the issue is rather that tcb->self gets + overwritten + there is no reason why it should + the value is still valid when init_routine exits, so it must be in + libc + or perhaps for some reason tls gets initialized twice + maybe + and thus what libpthread's init writes to is not what's used later + i've add a print in pthread_create, to see if self actually got + overwritten + and it doesn't + there is a disrepancy between the tcb member in libpthread and + what libc uses for tls + added* + (the print is at the very start of pthread_create, and displays + the thread name of the caller only) + well, yes, for the main thread libpthread shouldn't be allocating a + new tcb + and just use the existing one + ? + the main thread's tcb is initialized before the threading library + iirc + hmm + it would make sense if we actually had non-threaded programs :) + at any rate, the address of the tcb allocated by libpthread is not + put into registers + how does it get there for the other threads ? 
+ __pthread_setup does it + so + looks like dl_main is called after init_routine + and it then calls init_tls + init_tls returns the tcb for the main thread, and that's what + overrides the libpthread one + yes, _hurd_tls_init is called very early, before init_routine + __pthread_create_internal could fetch the tcb pointer from gs:0 + when it's the main thread + so there is something i didn't get right + i thought _hurd_tls_init was called as part of dl_main + well, it's not a bug of yours, it has always been bug :) + which is called *after* init_routine + and that explains why the libpthread tcb isn't the one installed + in the thread register + i can actually check that quite easily + where do you see dl_main called after init_routine? + well no i got that wrong somehow + or i'm unable to find it again + let's see + init_routine is called by init which is called by _dl_init_first + which i can only find in the macro RTLD_START_SPECIAL_INIT + with print traces, i see dl_main called before init_routine + so yes, libpthread should reuse it + the tcb isn't overriden, it's just never installed + i'm not sure how to achieve that cleanly + well, it is installed, by _hurd_tls_init + it's the linker which creates the main thread's tcb + and calls _hurd_tls_init to install it + before the thread library enters into action + agreed + + +### IRC, freenode, #hurd, 2014-01-14 + + btw, are you planning to do something with regard to the main + thread tcb initialization issue ? + well, I thought you were working on it + ok + i wasn't sure + + +### IRC, freenode, #hurd, 2014-01-19 + + i have some fixup code for the main thread tcb + but it sometimes crashes on tcb deallocation + is there anything particular that you would know about the tcb of + the main thread ? 
+ (that could help explaining this) + Mmmm, I don't think there is anything particular + doesn't look like the first tcb can be reused safely + i think we should instead update the thread register to point to + the pthread tcb + what do you mean by "the first tcb" exactly? + + +## IRC, freenode, #hurd, 2014-01-03 + + braunr: hurd from your repo can't boot. restored debian one + gg0: it does boot + gg0: but you need everything (gnumach and glibc) in order to make + it work + i think youpi did take care of compatibility with older kernels + braunr: so do we need a rebuilt libc for the latest hurd from + git ? + teythoon: no, the hurd isn't the problem + ok + good + the problem is the libports_stability patch + what about it ? + the hurd can't work correctly without it since the switch to + pthreads + because of subtle bugs concerning resource recycling + ok + these have been fixed recently by youpi and me (youpi fixed them + exactly as i did, which made my life very easy when merging :)) + there is also the problem of the stack sizes, which means the hurd + servers could use 2M stacks with an older glibc + or perhaps it chokes on an error when attempting to set the stack + size because it was unsupported + i don't know + that may be what gg0 suffered from + yes, both gnumach and eglibc were from debian. seems i didn't + manually upgrade eglibc from yours + i'll reinstall them now. 
let's screw it up once again + :) + bbl + ok it boots + # apt-get install + {hurd,hurd-dev,hurd-libs0.3}=1:0.5.git20131101-1+rbraun.7 + {libc0.3,libc0.3-dev,libc0.3-dbg,libc-dev-bin}=2.17-97+hurd.0+rbraun.1+threadterm.1 + there must a simpler way + besides apt-pinning + making it a real "experimental" release might help with -t option for + instance + btw locales still segfaults + rpctrace from teythoon gets stuck at + http://paste.debian.net/plain/74072/ + ("rpctrace locale-gen", last 300 lines) diff --git a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn index feea7c0d..02b6ab05 100644 --- a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn +++ b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -477,3 +478,824 @@ Address problem mentioned in [[/libpthread]], *Threads' Death*. 
failing bad i just need to polish a few things, wait for youpi to finish his work on TLS to resolve conflicts, and that will be all + + +## IRC, freenode, #hurd, 2013-10-30 + + FYI, the packages on my repository enable actual thread + destruction, and i've altered the libports_stability.patch + it nows only sets the global timeout to 0 + now* + we actually can't let translator "die" on global timeout because + of a race issue + tested for about two weeks now and no major problem sighted + top reports processes running for 100% of their time when + terminating threads, but i expect it's simply mach/proc aggregating their + run time to the task + 100% of cpu time + + +## IRC, freenode, #hurd, 2013-11-08 + + teythoon: darnassus is currently running a modified glibc with + thread destruction, yes + braunr: did that require any fixups in Hurd that I'd have missed + ? + no + well + b/c the resulting hurd package would not boot + actually yes + one + i'll push the patch somewhere + iirc the mach-defpager spewed some error and /hurd/init failed + to bootstrap the system + teythoon: + http://darnassus.sceen.net/~rbraun/0001-Prevent-diskfs-translators-from-destroying-main-thre.patch + make sure you have the proper gnumach packages too :p + well, that could very well account for my trouble ;) + uh + well + gnumach implements thread destruction, glibc uses it, hurd makes + sure it doesn't exit from main + + +## IRC, freenode, #hurd, 2013-11-12 + + ok so, calling pthread_exit() from main isn't the same as + returning from main() + unlike what some man pages seem to say + so loosing task info when destroying the main thread is actually a + proc bug + ugh + ^^ + or a glibc one + the proc server, your favorite Hurd component... + :) + hm :/ + looks like command line arguments are stored on the stack of the + main thread + and proc merely receives the addresses of those in the target task + why not just keep the main thread around? 
+ it represents a minor resource leak, true + yes + that's the hack i suggested + but it is relatively small + well no + my hack was about diskfs translators + it should be generalized in libpthread + seems reasonable + let's do it >) + + +## IRC, freenode, #hurd, 2013-11-13 + + braunr: there is a thread destruction issue in the experimental + ocaml build, worth looking at, probably + what do you mean ? + ... testing 'testfork.ml': ocamlcocamlrun: + ../libpthread/sysdeps/mach/pt-thread-halt.c:51: __pthread_thread_halt: + Unexpected error: (ipc/send) invalid destination port. + during the experimental ocaml build + well yes + thread recycling is buggy + i had the choice to fix it, or implement true destruction + i'm tweaking my patch so it leaves the main thread stack untouched + on destruction + and it should be ready + for review at least + + +## IRC, OFTC, #debian-hurd, 2013-11-13 + + ironforge out of memory during ruby1.9.1 rebuild. during test which + creates 10000 threads + ironforge out of memory during ruby1.9.1 rebuild, test which creates + 10000 threads + i guess ironforge kernel has been rebuilt against -95, correct? + err, what kernel? + 23:37 < youpi> hurd needs a rebuild to be able to work with the newer + eglibc + i mean hurd + yes, libc0.3 breaks the old packages anyway + wrt ENOMEM, was it expected? + wrt disk problems, aren't there on alioth only? + well 10,000 threads is a lot, especially on 32bit machine with 2M + default stack size + that makes 2GiB stacks + can't fit in a 2/2 split model, which gnumach uses + well, though active thread should die right away, just after set x to + false, if i read it correctly + perhaps the stacks are not correctly reused + that's probably worth digging in libpthread + by putting printfs, etc. + it seems stacks are never reused indeed, damn + I just wrote a small test that creates threads which just print + their stack address + that takes just a few minutes to do + i see. 
about reusage i guess you mean base address is kindof always + incremented + * gg0 likes being wrong + that's it, yes + gg0: take care, by keeping being wrong all the time, sometimes you + get right ;) + and you are definitely right here :) + Mmm, but the stack is really deallocated + and the numbers wrap around + I wonder how that is :) + ok, creating 20 000 threads does work + perhaps ruby does odd things which makes it not work + + +### IRC, OFTC, #debian-hurd, 2013-11-14 + + UID PID PPID TH MSGI MSGO SZ RSS SC STAT TIME COMMAND + 1012 16446 15473 720 987 509 1.89G 23.6M 1 Hu 0:00.15 + /home/gg0-guest/ruby/ruby1.9.git/ruby1.9.1 + -I/home/gg0-guest/ruby/ruby1.9.git/lib -W0 bootstraptest.tmp.rb + 720 threads, stuck + 2G SZ is very big :) + 00:42 < youpi> perhaps ruby does odd things which makes it not work + is that enough to file a ruby bug? as ruby suggests itself btw + no, they will probably not be able to investigate + but you can already check out how they create threads + and try to reproduce the same with a small C program + ehm on ruby2.0 with *context _enabled_ i can not reproduce it + +See [[/open_issues/glibc]] for `*context` functions. 
+ + +## IRC, freenode, #hurd, 2013-11-14 + + nice, i got glibc packages with thread destruction + building hurd packages against it now + everything seems fine + hurd packages ready, let's see + + ruby1.9.1 FTBFS due to a couple of tests + https://buildd.debian.org/status/fetch.php?pkg=ruby1.9.1&arch=hurd-i386&ver=1.9.3.448-1&stamp=1384265526 + second one creates 10000 threads and machine got ENOMEM + bootstraptest.tmp.rb: [BUG] [BUG] pthread_cond_init: Cannot + allocate memory (ENOMEM) ew + few hours ago trying to reproduce it: + 01:20 < gg0> UID PID PPID TH MSGI MSGO SZ RSS SC STAT + TIME COMMAND + 01:20 < gg0> 1012 16446 15473 720 987 509 1.89G 23.6M 1 Hu + 0:00.15 /home/gg0-guest/ruby/ruby1.9.git/ruby1.9.1 + -I/home/gg0-guest/ruby/ruby1.9.git/lib -W0 bootstraptest.tmp.rb + yes that's expected + our stacks are 2M + 10k threads means right over 2G of stacks + userspace is restricted to 2G + but if i read correctly test in question, thread should just set x to + false then die + so ? + and ENOMEM popped upk when there were thread count was at 720 + hum + 10k threads would actually be 20G + 1k threads is 2G + 720 is about 1.5G + the rest is probably the ruby runtime + youpi tried to create 10000 thread, no problem. he guessed something + wrong on ruby side + indeed on ruby2.0 such test succeeds + you can't create 10k threads unless you change the stack size + hurd servers use a stack size of 64k by default which allows them + to go up to 30k iirc + but normal applications use the default 2M + i guess you mean 10000 threads active at the same time. test in + question should make them die after simply setting x to false, i guess + youpi's test did so as well + no + it's about stacks + hm + yes at the same time but + thread recycling is known to be buggy + which is what i'm currently fixing btw + what's the bug? 
+ neal: there are several subtle issues + for example, joining a thread that is also calling pthread_exit + can fail badly + hmm + good that you are on it then :) + or detaching + i don't remember the details + but i remember such problems + apparently, keeping the stack of the main thread isn't enough + :( + for now, i'll keep the entire thread + + +## IRC, freenode, #hurd, 2013-11-15 + + i wasn't doing anything, just some single test runs. but yes, also + that one which creates hundreds of threads + it would like creating 10000 but goes out of memory after ~720 + btw same tests succeed on ruby2.0, so they should be fixed by + backporting some changes + actually it looks more like a deadlock .. + deadlock that says ENOMEM? + ? + ENOMEM is returned because the test task has no more virtual + memory + this doesn't mean the rest of the system should fail + ok i thought you were talking about such test + no it's something else + a deadlock in a critical server + the root file system maybe + braunr: htop and ps hang. 
just run the test once again + now you should still be able to login + htop/ps hanging means one process is unable to reply to queries + sent to the message port/thread + procfs does that to report on what a process is waiting + it usually mean there is a bug around signals, since the message + thread is also in charge of delivering signals + use ps -eM + and kill -KILL + hum + root 954 S dumping cores is known not to work most of the time + exodar shouldn't be configured like that + so yes, the crash server is hanging + gg0: i've set it to crash --kill and killed the hanging crash + instances blocking top/ps + nice + + my thread destruction patch and tls are indeed conflicting a bit + i suspect the tcb is used after being freed + i think i'll simply recycle the tcb, along with the pthread + structs + ok i think it's fine now + there was also a small bug in the tls code, keeping a reference on + the thread port + mach reference counting is so counter intuitive :/ + well, error-prone + + argh, more bugs in libc :( + :/ + but don't worry, there is always one more bug ;) + this one might explain crashes that are long to trigger + _hurd_self_sigstate() is implemented like this : + _hurd_thread_sigstate (__mach_thread_self ()); + it leaks a reference on the current thread each time it's called + >,< + but glibc maintains such references, so if the maximum value is + reached, and references are dropped, the value can reach 0 + ouch + at which point any call on a thread will result in an invalid send + right + and probably an assertion + well it's a good thing then that you found it :) + i think it's always been there + but it's more apparent since jknoenig's patch on signal + dispositions + the maximum number of user references in mach is 64k + this right leak isn't easy + tls is very tricky heh :) + for the main thread, tls initialization happens after the thread + creation, obviously + but for other threads, it's initialized before starting them + the leak was probably 
an overlook caused by that complexity + teythoon: actually that leak i mentioned in _hurd_self_sigstate + has only been recently added in Convert sigstate to TLS + so it's merely tls integration polishing + youpi: i'm currently reviewing changes related to tls and i think + there is a bug in _hurd_self_sigstate + calls to mach_thread_self() should be paired with + mach_port_deallocate to avoid urefs overflows + and right leaks + _hurd_critical_section_lock is probably affected too + hm + mhmm + in glibc, hurd/hurd/signal.h, _hurd_critical_section_lock + why is the sigstate unlocked after the call to + _hurd_thread_sigstate + _hurd_thread_sigstate doesn't seem to lock it .. + unless __spin_lock_init does it + yes, leak solved :) + + +## IRC, freenode, #hurd, 2013-11-16 + + argh, _hurd_critical_section_lock is called before the send right + on the main thread is fetched in libpthread :/ + is that bad ? + the sigstate is supposed to be initialized after pthreads + _hurd_critical_section_lock will create it if it sees there is + none + creating the sigstate is currently what makes the send right leak + ok + it's bad then + it may be due to my patch + _hurd_critical_section_lock is called during pthreads + initializatio + n + before the sigstate for the main thread is created, but after the + pthread init routine is called + it does indeed look like the code wasn't written with thread being + destroyed some day in mind :/ + braunr: btw, if you ever feel like benchmarking, sysbench has a + benchmark for threads contending for a lock + yes i've used it before + was it useful for this purpose ? 
+ no :) + :/ + we already know libpthread isn't optimized + and felt it when we switched from cthreads + humpf + simply calling malloc implies a call to + _hurd_critical_section_lock + on the other hand, unlike what some glibc comments say, this does + work + + +## IRC, freenode, #hurd, 2013-11-17 + + looks like i've fixed all leak issues with thread destruction and + tls :) + let's see if ext2fs.static works fine too + braunr: \o/ + sorry about introducing the tls ones :) + no worries, it was expected + and tls was really needed :) + i mean, i expected to have some problems when rebasing on tls :p + braunr: this is good news, how is your rootfs translator holding + up? + building hurd packages right now + for now, only test applications and a few really multithreaded + ones (e.g. iceweasel) have been tested + well, the system boots :) + awesome :) + stressing the file system with git while watching youtube videos + with gnash doesn't make the system crash + you can actually watch yt videos on your Hurd box ? + yes + for a while now + o_O + can't you ? + I never even dared to try + hehe + teythoon: looks stable enough to install on darnassus + + +## IRC, freenode, #hurd, 2013-11-18 + + braunr: wrt to your thread destruction patchset, I thought you + also had to fix the proc server ? + teythoon: no + the problem was in glibc + i may have to fix proc/procfs though, because cpu time gets wrong + with the patch + currently, it's the addition of the cpu time of all threads + mach provides aggregate times including destroyed threads though + ah, I see + one side effect is that you'll see processes sometimes taking 100% + of cpu time although the cpu is unused + or the cpu time of a process gets reduced :) + i guess the 100% cpu is how top sees a negative increment + ^^ + gg0: do my threadterm packages help with ruby1.9 ? + i mean, can you test with them some time ? 
:) + + +## IRC, freenode, #hurd, 2013-11-21 + + youpi: ping about my question regarding error handling in the + proposed thread_terminate_release call + I agree with what Neal said + he didn't say anything about error handling + see + http://lists.gnu.org/archive/html/bug-hurd/2013-11/msg00181.html + i think i should make the call fail on first error + it shouldn't happen, so it would merely serve to catch bugs + it's not easily recoverable (if it's recoverable at all) + uh, I thought he had + I must have dreamt + + i think i'll go ahead with thread destruction integration + + +## IRC, freenode, #hurd, 2013-11-25 + + i've pushed the thread destruction patches for gnumach upstream + and made a branch in glibc for that too + awesome :) + youpi: i don't remember how glibc changes should be managed + once those are applied, i'll commit in libpthread + braunr: usually we create a topgit branch, and then we add the + patch from that to the debian repository + + +## IRC, freenode, #hurd, 2013-11-29 + + youpi: i still have a leak somewhere with the thread destruction + patches + maybe on the host priv port in bootstrap servers (root fs and proc + server) + it prevents priority adjusting in libports and can easily bring + down a system because servers can start trashing a lot sooner, as it was + the case during the pthread migration + +See discussion about that on [[/open_issues/libpthread]]. + + so i'll hunt it down before merging + + +## IRC, freenode, #hurd, 2013-12-19 + + darnassus still has the libports priority adjustement leaks + i'll apply a few more patches to my hurd packages + + humpf, proc seems to have a problem getting the host priv port :/ + thats bad + what did you do ? + i fixed all the leaks in libports when adjusting priorities + the last one being releasing the host priv right + and i get errors at boot time from the proc server + remember when i had this problem ? 
+ proc doesn't get the host priv port the normal way since the + normal way is to get it from proc iirc + ah, thought you fixed that + so i guess the alternate way doesn't add a reference + well the leak is fixed + the problem you had was due to the leak which made the host priv + port reach its max uref value + now it's just the proc server + the system works fine though + for real ? + the proc server needs the host priv port for getting the new + tasks + well yes + how can it work w/o it ? + i don't know .. + i guess the problem is internal to glibc + i mean, get_priv_ports fails, but that doesn't mean the host priv + port is lost + could be + are you running a patched rootfs translator too ? + yes + ok + b/c i remember having trouble with that + right, the glibc call would make proc call __proc_getprivports + hum + teythoon: do you remember how proc gets its host priv port ? + from init + i think + startup_procinit ? + possibly + right + so it's probably not the host priv port + i mean, the error is about another invalid send right + hm nope, it is on host_priv :/ + hm ok i see, looks like a bug from a debian patch + or rather, a bug fix not yet imported into the debian package + teythoon: you actually fixed it in + 2c9422595f41635e2f4f7ef1afb7eece9001feae + great :) + ah, that one + i was looking at the upstream code and couldn't understand what + was going wrong + :) + much better + except ps -eT doesn't work any more .. + interestingly, with the thread destruction patch, ps -eT sometimes + work, and sometimes doesn't + the behaviour doesn't seem to change without a reboot + and of course, as soon as i say it, i'm proven wrong by the next + test :) + + +## IRC, freenode, #hurd, 2013-12-26 + + __pthread_sigstate_init doesn't seem to be converted to TLS in the + upstream repository master branch + + ah dammit, the global signal dispositions patch touches both glibc + and libpthread @#! 
+ what a mess + + youpi: do you have some time to quickly review the + rbraun/thread_destruction branch in libpthread ? + there might be conflict with some glibc patches + or do you prefer it on the mailing list ? + (i used a branch because it's not based on master) + rather mail the list, yes + ok + it'd also be useful to write the rationale + probably to be left as comment in the source code + yes, that branch was for personal storage :) + so the reader knows how things are recycled or not + hm + that should already be the case + ok + the two structures that are still recycled are the pthread struct + and tls + it's quite obvious from pthread_alloc + and well commented there + for tls, it's explained in pthread_exit + + there, thread destruction finally merged in + and now, we can remove the ugly hacks that were done for + threadvars + :) + change stacks at will and support all sorts of weird languages and + runtimes + braunr: cool :) + + +## IRC, freenode, #hurd, 2013-12-31 + + braunr: I've added sigstate_locking, sigstate_thread_reference and + tls_thread_leak to the debian glibc 2.18 package + I believe that's complete? + is mach_msg_uspace_options ready for being added? Does it bring + much speedup? + AIUI, thread_terminate_release is the union of the branches + mentioned above? + (I'm cleaning up branches in the glibc repo) + youpi1: mach_msg_uspace_options can be left over, it only affects + selects and not noticeably + yes, those three branches are the only ones needed for thread + destruction + ok + does the hurd changes depend on these changes ? + no + good :) + only on tls for one of them + (it's about the default stack size of 64k for hurd servers) + and we have had this in debian for a long time already :) + yes + (how big were they before?) + (where they a couple MiB, and thus exploding to GiBs on thousands + of threads?) 
+ 64k + pthread stacks are 2M by default + yes + + +## IRC, freenode, #hurd, 2014-01-14 + + braunr: it seems your time change in libps made ps produce odd re + results + samy 10987 5 -514358:-18:-42.17 /hurd/firmlink tmp + youpi: wow :) + that change is supposed to run on a system where threads actually + get destroyed + but i don't see what could trigger this side effect + root 8629 664 56 years make -j 3 + :) + heh + youpi: does the hurd package on darnassus include that patch ? + yes + i don't reproduce the problem :/ + err + what command are you using ? + ps -feM on darnassus + root 29642 473 7 months /usr/sbin/sshd -R + hmmmm + i don't see it with a make -j + well, it's not systematic + it's like once over two launches + hhhhmmmmm + it'd look like some random numbers get added + strangely, the gcc processes started by a recursive make aren't + children of make .. + ps -eF hurd seems to report the correct values + even ps -eM + oO + ps -ef too + the problem seems to be with ps -efM + too bad I'm always using that :) + another way to see it is that it makes us spot the issue ;p + + +### IRC, freenode, #hurd, 2014-01-15 + + ok i have an idea of what goes wrong in libps + + youpi: for some reason, ps -efM lacks the PSTAT_TASK_BASIC flag + my patch is wrong since it doesn't try to determine whether the + stats apply to a task or a thread, but that is easy to fix + ps -efM should nonetheless provide basic task info, obviously + in addition, the problems i've observed with ps -T (occasional + segfaults) seem to have existed before thread destruction + they're just strongly exposed now that the thread list can be + shrunk + + libps is quite complicated + even hairy, i'd say .. 
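A minimal sketch of the kind of check the 2014-01-15 discussion above points at: a time that depends on task-level information should only be reported when that information was actually fetched. All names below (`STAT_THREAD_BASIC`, `STAT_TASK_BASIC`, `struct proc_stat_sketch`, `total_time_us`) are hypothetical and only loosely modelled on the flags mentioned in the log; this is not the libps API and not the fix that was committed.

```c
#include <stddef.h>

/* Hypothetical flag bits, named after the ones mentioned in the log.
   The values are made up for this sketch; this is not libps. */
#define STAT_THREAD_BASIC 0x1u
#define STAT_TASK_BASIC   0x2u

struct proc_stat_sketch {
    unsigned have;           /* which groups of fields were fetched */
    long thread_time_us;     /* valid only with STAT_THREAD_BASIC */
    long terminated_time_us; /* valid only with STAT_TASK_BASIC */
};

/* Store the total run time in *out and return 0, or return -1 when a
   required group of fields was never fetched. Reading the fields
   anyway would add arbitrary values to the result. */
static int total_time_us(const struct proc_stat_sketch *ps, long *out)
{
    const unsigned need = STAT_THREAD_BASIC | STAT_TASK_BASIC;

    if (ps == NULL || out == NULL)
        return -1;
    if ((ps->have & need) != need)
        return -1;

    *out = ps->thread_time_us + ps->terminated_time_us;
    return 0;
}
```

With only the thread group present the function refuses to report a time, and once both groups are present it returns their sum. The design choice is simply to make the dependency on the task-level group explicit instead of assuming it is always set.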
+ + +### IRC, freenode, #hurd, 2014-01-16 + + youpi: i think i have a proper fix for libps + i'll commit it soon + ok + basically, getting system times simply set the PSTAT_THREAD_BASIC + flag + whereas getting the run time of the terminated threads requires + PSTAT_TASK_BASIC + i assumed it was always set in the function i changed when dealing + with a task and not a thread + and well, that was a wrong assumtion, -M can remove it if not + strictly needed by the format + the default format asks for suspend_count, which forces the + retrieval of task basic info, os it works with -eM + but -f doesn't :) + so extremely bad lucky combination of flags :) + indeed + i added a pstat_times using the last (!) available flag bit + looks clean to me + i hope there is no abi issue + (at least everything works with the unmodified ps-hurd executable + and a new libps.so) + + hm, small bug in the thread destruction patch :/ + + +### IRC, freenode, #hurd, 2014-01-17 + + good, i have proper fixes for tls in the main thread and thread + termination :) + awesome :) + i've been wondering, what does it take to get the thread + destruction stuff into the debian package ? + i still have to build test packages, look for (unlikely, heh) + regressions and work some integration details with samuel + hum the main thread tls fixup i guess + youpi was waiting for me to fix that + gnumach already provides the RPC + so it will be in glibc soon + i just have to get those last bits right + teythoon: i'm quite slow at integrating stuff + and samuel then builds packages ? + i mean, is our libc package build linked to the other libc + packages ? + libpthread is applied as a patch to glibc + and loaded as a plugin + + +## IRC, freenode, #hurd, 2014-01-17 + + uhm, did we break fakeroot-tcp ? + we did ? 
+ fakeroot-tcp just works fine on buildds + with fakeroot-tcp, i get + make[4]: Entering directory + `/home/rbraun/devel/debian/packages/hurd/hurd-0.5.git20140113/libdde-linux26/contrib/include' + rm -f .general.d + make[4]: *** [cleanall] Killed + when cleaning the package before building .. + + +### IRC, freenode, #hurd, 2014-01-18 + + damn, fakeroot-tcp won't work on darnassus .. + uh, looks like my tls/thread destruction "fixes" do cause + regressions :( + fakeroot works fine with debian glibc + which one ? + which fakeroot i mean + -tcp + yes, it fails as soon as i use the patched glibc :/ + at least it's easy to reproduce + + +### IRC, freenode, #hurd, 2014-01-20 + + great, 3rd libc version installed on darnassus, let's see if i can + build hurd packages against that + + +### IRC, freenode, #hurd, 2014-01-21 + + damn, fakeroot-tcp still crashes with my latest changes .... + + darnassus looks in good shape + youpi: ^ + youpi: if you have other tests, feel free to do them now + i feel confident about committing the changes, if you're ok with + it + which changes ? + I'm a bit lost in what you were talking about :) + you can find them in 2 patches in /var/tmp on darnassus + one is about fixing thread destruction + i'm pretty certain about this one so i'll commit it directly + the other is fixing the tcb of the main thread + +[[open_issues/libpthread]]. 
+ + where i simply do tcb->self = thread->kernel_thread :) + with a comment explaining why i don't do something else like + deallocating the unused tcb + braunr: ok, that looks good + braunr: awesome :) + youpi: ok + + +### IRC, freenode, #hurd, 2014-01-22 + + there, libpthread should be fine now + + +## IRC, freenode, #hurd, 2014-02-06 + + youpi: in case you're planning to upgrade glibc (or not), the + thread destruction changes are complete + youpi: darnassus has been running them for some weeks with no + visible regression + braunr: ok, good + including it in glibc was on my todo list indeed + and Adam indeed plan for a 2.18 upload + good :) + braunr: this is up to 7c6dc6e28b2fc4b67934223f41cf080ffe58b230, + right? (Wed Jan 22, Fix up the main thread TCB) + yes + oh, i just saw 2.17-98~0 glibc packages on debian-ports :) + yes, it's just to fix the dhcp crash + ah yes, it's not 2.18 + 2.18 is available in experimental + + braunr: just to make sure: did you have + 983b18a6ff16f5687a9ece63a50d1831dec88609 in libc on darnassus? + (which drops the stack size hack) + youpi: let me check + youpi: ah no, i don't, you're right + well, I was just wondering, nothing make me think that was the case + :) + what was the issue that it was raising btw? + threadvards + ok, b ut in which case? 
+ (to make sure I test that before committing) + now that we switched to tls, i would assume the transition path to + be 1/ hurd stops defining that symbol, 2/ libpthread can stop using it + the goal was to reduce the stack size of hurd server threads + well, that's not my question :) I'm wondering in which precise case + that was breaking things + youpi: i don't know, it shouldn't break + ok + youpi: just in case, don't forget that last one line patch i + committed last night, fakeroot can't work right without it + (i made a minor change while reviewing before comitting, and + obviously got it wrong :p) + ok + + braunr: I've upgraded libpthread in debian's eglibc btw + + + /home/rbraun/devel/debian/packages/eglibc/eglibc-2.17/build-tree/hurd-i386-libc/libc.so.phdr: + *** executable stack signaled + from build-tree/hurd-i386-libc/elf/check-execstack.out + i thought glibc didn't use those + anyway it doesn't look to be the regression i'm having + does this ring a bell : + Encountered regressions that don't match expected failures + (debian/testsuite-checking/expected-results-i486-gnu-libc): + test-stpcpy_chk.out, Error 1 + TEST test-stpcpy_chk.out: __stpcpy_chk normal_stpcpy + simple_stpcpy_chk + nope + after what are you getting this regression? + building glibc 2.17-97 with thread destruction patches, including + the one removing the stack size hack + during tests + there also are "progressions", but i'm not sure what these are + some progressions are just luck, other seem to happen on some + platforms only + I'm not sure you want to test 2.17 + a lot has changed between 2.17's libpthread and 2.18's libpthread + (which is now equal to cvs's libpthread + ) + s/cvs/git/ + yes + i usually build with nocheck + + +## IRC, freenode, #hurd, 2014-02-07 + + youpi: on a vm with hurd 1:0.5.git20140203-1, upgrading to a + patched glibc 2.17-97 that includes the patch which reverts the stack + size hack, the system reboots and works fine + ok. 
I don't remember what problem I was seeing + that version of the hurd no longer defines the symbol + but even then, there shouldn't have been any problem + hm, or does it + yes, it does + youpi: the hurd package patch mentions + Revert this for now, will have to wait for dropping the use of + __pthread_stack_default_size from eglibc's + libpthread_hurd_cond_wait.diff + i wonder how it got there + IIRC I was wondering too + i've installed my c library on darnassus and it works fine there + too + with older (january) hurd packages + looks good to me + + +## IRC, freenode, #hurd, 2014-02-10 + + braunr: btw, do the new libc packages contain your thread + destruction work ? + teythoon: the -98 ones on experimental ? + i don't think they do + the -18 ones should do diff --git a/open_issues/libpthread_dlopen.mdwn b/open_issues/libpthread_dlopen.mdwn index 3c36eb26..a825fdff 100644 --- a/open_issues/libpthread_dlopen.mdwn +++ b/open_issues/libpthread_dlopen.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -125,6 +125,108 @@ IRC, freenode, #hurd, 2011-08-17 and yes, it's known already, just nobody worked on solving it +# IRC, freenode, #hurd, 2014-01-28 + + braunr: Is this fixed by your recent patches? test_dbi: + ./pthread/../sysdeps/generic/pt-mutex-timedlock.c:70: + __pthread_mutex_timedlock_internal: Assertion `__pthread_threads' + failed. + faq/libpthread_dlopen.mdwn: + ./pthread/../sysdeps/generic/pt-mutex-timedlock.c:70: + __pthread_mutex_time + youpi: tks. A workaround seems to be available: + LD_PRELOAD=/lib/i386-gnu/libpthread.so.0.3 + Is that possible on a buildd? 
+ it would be simpler to just make the package explicitly link + libpthread + Package is libdbi-drivers, providing libdbd-sqlite3 needed by + gnucash + + +# IRC, freenode, #hurd, 2014-02-17 + + hm ok, looks like iceweasel errors all have something to do with + the libc dns resolver + http://darnassus.sceen.net/~rbraun/iceweasel_crash + apparently, it's simply because the memory chunk isn't page + aligned .. + looks like not preloading libpthread tirggers lots of tricky + issues + anyway, apparently, the malloc/free calls in libresolv don't use + locks if libpthread isn't preloaded, which explains why the program state + looked impossible to reach and why crashes look random + debian linux does not have the pthread load problem. + congzhang: it had it + maybe not debian but i've found one such report for opensuse + + ok the bug is simple + for some reason, our glibc still uses a global _res state for dns + resolution instead of per thread ones + uh, apparently, it's libpthread's job to define a __res_state + function for that :( + +## IRC, freenode, #hurd, 2014-02-18 + + usually when i say it, it crashes soon after, so let's try it : + i've been running iceweasel 27 fine for like 10 minutes with a + patched libpthread + still no crash ;p + with luck this extremely lightweight patch will fix all + multithreaded applications doing concurrent name resolution .... :) + nice :) + let's try gnash .... + uh, segfault on termination + gnash works :) + sweet :) + i'm very surprised we could live so long with that resolv bug + + +## IRC, freenode, #hurd, 2014-02-19 + + youpi: the eglibc bug is about libresolv + it uses a global resolver state even in multithreaded applications + libresolv is a horrible part of glibc :) + which is obviously bad + yes .. 
:) + here is the patch : + http://darnassus.sceen.net/~rbraun/0001-libpthread-per-thread-resolver-states.patch + it's very short, it basically allocates a resolver state per + thread in the pthread struct, and sets the TLS variable __resp when the + thread starts + should we make that hurd-specific ? + or enclose that assignment with #ifdef ENABLE_TLS ? + well, ENABLE_TLS is now always 1, iirc :) + for the hurd, yes + I'm surprised linux never had the issue + no, not for the hurd + ah + I *had* to implement TLS for hurd because it was always 1 for + everybody :) + ok + so all those ifdefs could be removed and libpthread can assume tls + is enabled + in which case my patch looks fine + ah, thats a libpthread patch, not glibc patch + yes + nptl obviously did that from the start . :) + linuxthreads had the problem a looong time ago + ok + i'm surprised we overlooked it for so long + but anyway, that's a good fix + indeed + it seems all good to me + well, __resp is a __thread variable + i could add #ifdef ENABLE_TLS, but then what of the case where TLS + isn't enabled, and do we actually care ? + #error maybe ? + or #warning ? + I don't think we care about the non-TLS case any more + ok + topgit branch i suppose ? + well, not, hurd libpthread repo :) + oh right ... :) + + # libthreads vs. 
libpthread The same symptom appears in an odd case, for instance: diff --git a/open_issues/libpthread_set_stack_size.mdwn b/open_issues/libpthread_set_stack_size.mdwn index 68f81752..21c2f18e 100644 --- a/open_issues/libpthread_set_stack_size.mdwn +++ b/open_issues/libpthread_set_stack_size.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2012, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -23,3 +24,91 @@ IRC, freenode, #hurd, 2011-10-21: it's simply on the so-long TODO list [[glibc/t/tls-threadvar]]. + +2012-12-28: + +Hurd commit 3a3fcc811e6b50b21124a5c5a128652e788a3b67 `libports: remove the +threadvars stack size hack`. + +IRC, freenode, #hurd, 2014-01-09: + + braunr: i'm afraid it might be your patch 3a3fcc81 that breaks + proc + w/ the current debian libc that is + braunr: i reverted that patch and now it boots again + is alternate stack and arbitrary stack sizes supported by now, or + upcoming? + gnu_srs: supported + well + considering what teythoon just said, maybe not + need to remove __pthread_stack_default_size from + libpthread_hurd_cond_wait patch too i guess + teythoon: i don't understand why this change has any negative + effect :/ + or + hm no .. 
+ there may be a bug in the latest glibc, where changing the stack + is allowed on the ground that threadvars have been replaced with tls, but + the libpthread stack handling code does it wrong + see 714413a7694ff534855e9e5904899695eac6c9bb in libpthread + which the thread destruction patches already did before it was + fixed in libpthread + and may explain why my packages work + + +IRC, freenode, #hurd, 2014-01-14: + + teythoon: Mmm, I tried to update to the latest hurd commits, but + init dies early at boot + exec init proc auth, and then init crashes + downgrading libports to previous makes the issue go away + youpi: previous ? + previous debian package + which patch makes it fail ? + I'm bisecting + i remember teythoon saying he had failures with the patch that + removes the threadvars stack size hack + I'll try that already, ok + yes, boots fine without this change + ok + perhaps some missing patches in the current 2.17-97 glibc + or libpthread reacting badly to new stack sizes + is 714413a7694ff534855e9e5904899695eac6c9bb included in your glibc + ? + (714413a7694ff534855e9e5904899695eac6c9bb from libpthread) + or maybe that's not the problem + anyway, it's normally fixed with the thread destruction patch + i did test it and checked the stack size were correct + sizes* + yes, debian's glibc has it + ok + so that can wait + is 959f7365fccd1c89be9938c2655eba9122171e6a (Drop threadvars + entirely) also in your glibc ? + yes + that's weird :/ + the only thing i can think of is __pthread_stack_alloc miserably + failing with 2M stacks and "many" threads for some odd reason .. + anyway, see you tomorrow + hurd-i386/libpthread_hurd_cond_wait.diff keeps using + __pthread_stack_default_size. isn't it the problem? + * youpi wonders what that change is doing there + and it's there from the start of that patch... 
+ + if (&__pthread_stack_default_size != NULL) + checks if the symbol is actually resolved + that's what allows regular applications to work + it should be the same for hurd servers + + +# sigaltstack + +Likewise, `sigaltstack` is not usable at the moment. + +IRC, freenode, #hurd, 2014-02-25: + + braunr: are the split/alternate stack etc problems solved by now + so gccgo can work properly? + i don't know + i suspect it wouldn't require much work now that tls is well + supported + alternate stack is supposed to be working diff --git a/open_issues/linux_as_the_kernel.mdwn b/open_issues/linux_as_the_kernel.mdwn index 1d84d777..2656b1a3 100644 --- a/open_issues/linux_as_the_kernel.mdwn +++ b/open_issues/linux_as_the_kernel.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -235,3 +235,34 @@ Richard's X-15 Mach re-implementation: i'll have to check, it's been a long time since i've really used it they must use a pure devfs instance now + + +# IRC, freenode, #hurd, 2014-02-23 + + so crazy idea: would it be possible to have mach as a linux kernel + module? + ie: some new binfmt type thing that could load mach binaries and + implement the required kernel ABI for them + and then run the entire hurd under that.... + desrt: that's an idea, yes + and not a new one + * desrt did a bit of googling but didn't find any information about it + desrt: but why are you thinking of it ? 
+ we talked about it here, informally + braunr: mostly because running hurd in a VM sucks + if we had mach-via-linux, we'd have: + - no vm overhead + - no device virtualisation + - 64bit (physical at least) memory support + - SMP + - access to the linux drivers, natively + and maybe some other nice things + yes we talkbed about all this + but i still consider that to be an incomplete solution + i don't consider it to be running "the hurd" as your OS... but it + would be a nice solution for development and virtualisation + we probably don't want to use drivers natively, since we want them + to run in their own address space, with their own namespace context + it would, certainly + but it would require a lot of effort anyway + right diff --git a/open_issues/mach_migrating_threads.mdwn b/open_issues/mach_migrating_threads.mdwn index bbc6ac45..16547838 100644 --- a/open_issues/mach_migrating_threads.mdwn +++ b/open_issues/mach_migrating_threads.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -101,3 +102,17 @@ In context of [[resource_management_problems]]. i initially downloaded osfmach sources to see an example of how thread migration was used from userspace and they do have a special threading library for that + + +# IRC, freenode, #hurd, 2014-02-18 + + has anyone here ever tried to enable the thread migration bits + in gnumach to see where things break and how far that effort has been + taken ? + without proper userspace support, i don't see how this could work + but is the kernel part finished or close to being finished ? 
+ no idea + i don't think it is + i didn't see much code related to that feature, and practically + none that looked like what the paper described + some structures, but not used diff --git a/open_issues/mig_portable_rpc_declarations.mdwn b/open_issues/mig_portable_rpc_declarations.mdwn index ecfa06ae..f5f18880 100644 --- a/open_issues/mig_portable_rpc_declarations.mdwn +++ b/open_issues/mig_portable_rpc_declarations.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,8 +11,35 @@ License|/fdl]]."]]"""]] [[!tag open_issue_mig]] +[[!toc]] -# IRC, freenode, #hurd, 2011-11-14 + +# 32-Bit vs. 64-Bit Interfaces + +## IRC, freenode, #hurd, 2011-10-16 + + i guess it wouldn't be too hard to have a special mach kernel for + 64 bits processors, but 32 bits userland only + well, it means tinkering with mig + like old sparc systems :p + to build the 32bit interface, not the 64bit one + ah yes + hm + i'm not sure + mig would assume a 32 bits kernel, like now + and you'll have all kinds of discrepancies in vm_size_t & such + yes + the 64 bits type should be completely internal + types* + but it would be far less work than changing all the userspace bits + for 64 bit (ofc we'll do that some day but in the meanwhile ..) + yes + and it'd boost userland addrespace to 4GiB + yes + leaving time for a 64bit userland :) + + +## IRC, freenode, #hurd, 2011-11-14 also, what's the best way to deal with types such as type cache_info_t = struct[23] of integer_t; @@ -58,7 +86,103 @@ License|/fdl]]."]]"""]] (which I still need to follow up on... [sigh]) -# IRC, freenode, #hurd, 2013-06-25 +## IRC, freenode, #hurd, 2012-12-12 + +In context of [[microkernel/mach/gnumach/memory_management]]. 
+ + Or with a 64-bit one? ;-P + tschwinge: i think we all had that idea in mind :) + tschwinge: patches welcome :P + tschwinge: sure, please help us settle down with the mig stuff + what was blocking me was just deciding how to do it + hum, what's blocking x86_64, except time to work on it ? + deciding the mig types & such things + i.e. the RPC ABI + ok + easy answer: keep it the same + sorry, let me rephrase + decide what ABI is supposed to be on a 64bit system, so as to know + which way to rewrite the types of the kernel MIG part to support 64/32 + conversion + can't this be done in two steps ? + well, it'd mean revamping the whole kernel twice + as the types at stake are referenced in the whole RPC code + the first step i imagine would simply imply having an x86_64 + kernel for 32-bits userspace, without any type change (unless restricting + to 32-bits when a type is automatically enlarged on 64-bits) + it's not so simple + the RPC code is tricky + and there are alignments things that RPC code uses + which become different when build with a 64bit compiler + there are also things like int[N] for io_stat_struct and so on + i see + making the code wrong for 32 + thus having to change the types + pinotree: yes + (doesn't mig support structs, or it is too clumsy to be used in + practice?) + pinotree: what's the problem with that (i explcitely said changing + int to e.g. int32_t) + that won't fly for some of the calls + e.g. getting a thread state + pinotree: no it doesn't support struct + braunr: that some types in struct stat are long, for instance + pinotree: same thing with longs + youpi: why wouldn't it ? 
+ that wouldn't work on a 64bit system + so we can't make it int32_t in the interface definition + i understand the alignment issues and that the mig code adjusts + the generated code, but not the content of what is transfered + well of course + i'm talking about the first step here + which targets a 32-bits userspace only + ok, so we agree + the second step would have to revamp the whole RPC code again + i imagine the first to be less costly + well, actually no + you're right, the mig stuff would be easy on the application side, + but more complicated on the kernel side, since it would really mean + dealing with 64-bits values there + (unless we keep a 3/1 split instead of giving the full 4g to + applications) + +See also [[microkernel/mach/gnumach/memory_management]]. + + (I don't see what that changes) + if the kernel still runs with 32-bits addresses, everything it + recevies from or sends through mig can be stored with the user side + 32-bits types + err, ok, but what's the point of the 64bit kernel then ? :) + and it simply uses 64-bits addresses to deal with physical memory + ok + that could even be a 3.5/0.5 split then + but the memory model forces us to run either at the low 2g or the + highest ones + but linux has 3/1, so we don't need that + otherwise we need an mcmodel=medium + we could do with mcmodel=medium though, for a time + hm actually no, it would require mcmodel=large + hum, that's stupid, we can make the kernel run at -2g, and use 3g + up to the sign extension hole for the kernel map + + +## IRC, freenode, #hurd, 2013-12-03 + + I believe the main issue is redoing the RPCs in 64bit, i.e. 
the + Mach/Hurd interface + mach has always been 64-bits capable + the problem is both mach and the hurd + it's at the system interface (the .defs of the RPCs) + azeem: ah, actually that's why you also say + but i consider it to be a hurd problem + the hurd itself is defined as being a set of interfaces and + servers implementing them, i wouldn't exclude the interfaces + that's what* + + +# Structured Data + +## IRC, freenode, #hurd, 2013-06-25 is there a nice way to get structured data through mig that I haven't found yet? diff --git a/open_issues/mig_strings.mdwn b/open_issues/mig_strings.mdwn new file mode 100644 index 00000000..3693fcc2 --- /dev/null +++ b/open_issues/mig_strings.mdwn @@ -0,0 +1,38 @@ +[[!meta copyright="Copyright © 2014 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_mig]] + +[[!toc]] + + +# IRC, freenode, #hurd, 2014-02-21 + + grml... migs support for variable-length c strings is broken :( + completely .. + no one told me :p + noone dares + to tell me ? + or anyone else ;p + ^^ + root@debian:~# pkill mtab + task /hurd/procfs(19) �O� deallocating an invalid port 1049744, + most probably a bug. + :) + it's still an improvement >,< + uh the joys... + gnu machs mig_strncpy behaves differently from glibcs + the mach version always 0-terminates the target string, the libc + variant does not + which one should i "fix" ? + strncpy should behave like strncpy + not according to the documentation in gnumach... 
+ people who know it expect it not to always null terminate + you can either fix mig_strncpy, or call it mig_strlcpy diff --git a/open_issues/mig_stub_functions.mdwn b/open_issues/mig_stub_functions.mdwn index 24a582b1..474a7675 100644 --- a/open_issues/mig_stub_functions.mdwn +++ b/open_issues/mig_stub_functions.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -39,3 +39,15 @@ License|/fdl]]."]]"""]] btw, is there any reason why mig couldn't generate the request and reply routines from the synchronous routines? i guess it could + + +# Compiler Optimization + +## IRC, freenode, #hurd, 2013-12-02 + + braunr: inlining the mach generated x_server_procedure functions + shaved 5 minutes off my hurd package build :) + i guess fakeroot-tcp benefits most from this... I'm going to try + this w/o fakeroot and on real hardware shortly + teythoon: nice + :) diff --git a/open_issues/multithreading.mdwn b/open_issues/multithreading.mdwn index 03614fae..d5c0272c 100644 --- a/open_issues/multithreading.mdwn +++ b/open_issues/multithreading.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -362,6 +362,8 @@ Tom Van Cutsem, 2009. having servers go away when unneeded is a valuable and visible feature of modularity +[[open_issues/libpthread/t/fix_have_kernel_resources]]. + ### IRC, freenode, #hurd, 2013-04-03 @@ -381,6 +383,184 @@ Tom Van Cutsem, 2009. 
ok +### IRC, freenode, #hurd, 2013-11-30 + +"Thread storms". + + if you copy a large file for example, it is loaded in memory, each + page is touched and becomes dirty, and when the file system requests them + to be flushed, the kernel sends one message for each page + the file system spawns a thread as soon as a message arrives and + there is no idle thread left + if the amount of message is large and arrives very quickly, a lot + of threads are created + and they compete for cpu time + How do you plan to work around that? + first i have to merge in some work about pagein clustering + then i intend to implement a specific thread pool for paging + messages + with a fixed size + something compareable for a kernel scheduler? + no + the problem in the hurd is that it spawns threads as soon as it + needs + the thread does both the receiving and the processing + But you want to queue such threads? + what i want is to separate those tasks for paging + and manage action queues internally + in the past, it was attempted to limit the amount ot threads in + servers, but since receiving is bound with processing, and some actions + in libpager depend on messages not yet received, file systems would + sometimes freeze + that's entirely the task of the hurd? One cannot solve that in + the microkernel itself? + it could, but it would involve redesigning the paging interface + and the less there is in the microkernel, the better + + +#### IRC, freenode, #hurd, 2013-12-03 + + i think our greatest problem currently is our file system and our + paging library + if someone can spend some time getting to know the details and + fixing the major problems they have, we would have a much more stable + system + braunr: The paging library because it cannot predict or keep + statistics on pages to evict or not? + braunr: I.e. 
in short - is it a stability problem or a + performance problem (or both :) ) + it's a scalability problem + the sclability problem makes paging so slow that paging requests + stack up until the system becomes almost completely unresponsive + ah + So one should chase defpager code then + no + defpager is for anonymous memory + vmm? + Ah ok ofc + our swap has problems of its own, but we don't suffer from it as + much as from ext2fs + From what I have picked up from the mailing lists is the ext2fs + just because no one really have put lots of love in it? While paging is + because it is hard? + (and I am not at that level of wizardry!) + no + just because it was done at a time when memory was a lot smaller, + and developers didn't anticipate the huge growth of data that came during + the 90s and after + that's what scalability is about + properly dealing with any kind of quantity + braunr: are we talking about libpager ? + yes + and ext2fs + yeah, i got that one :p + :) + the linear scans are in ext2fs + the main drawback of libpager is that it doesn't restrict the + amount of concurrent paging requests + i think we talked about that recently + i don't remember + maybe with someone else then + that doesn't sound too hard to add, is it ? + what are the requirements ? + and more importantly, will it make the system faster ? + it's not too hard + well + it's not that easy to do reliably because of the async nature of + the paging requests + teythoon: the problem with paging on top of mach is that paging + requests are asynchronous + ok + libpager uses the bare thread pool from libports to deal with + that, i.e. a thread is spawned as soon as a message arrives and all + threads are busy + if a lot of messages arrive in a burst, a lot of threads are + created + libports implies a lot of contention (which should hopefully be + lowered with your payload patch) + +[[community/gsoc/project_ideas/object_lookups]]. 
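
A rough illustration of the fixed-size pool idea from the 2013-11-30 log ("a specific thread pool for paging messages with a fixed size", with receiving separated from processing and "action queues" managed internally). This is only a sketch under stated assumptions, not libpager or libports code: every name here (`pager_pool`, `pager_submit`, `pager_run_burst`, `PAGER_WORKERS`) is hypothetical, plain POSIX threads stand in for Mach message reception, and the case where a request has to wait for a later message before it can complete is not handled at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define PAGER_WORKERS 4                 /* fixed pool size (hypothetical value) */

struct request {                        /* stand-in for one paging message */
    struct request *next;
    int page;
};

struct pager_pool {
    pthread_mutex_t lock;
    pthread_cond_t  work;               /* signalled when the queue grows */
    struct request *head, *tail;        /* FIFO of requests not yet processed */
    int closing;                        /* set once no more requests will come */
    int handled;                        /* number of requests processed */
    pthread_t workers[PAGER_WORKERS];
};

/* Worker: dequeue and process; it never receives messages itself. */
static void *pager_worker(void *arg)
{
    struct pager_pool *p = arg;

    pthread_mutex_lock(&p->lock);
    for (;;) {
        while (p->head == NULL && !p->closing)
            pthread_cond_wait(&p->work, &p->lock);
        if (p->head == NULL)
            break;                      /* closing and queue drained */
        struct request *r = p->head;
        p->head = r->next;
        if (p->head == NULL)
            p->tail = NULL;
        pthread_mutex_unlock(&p->lock);
        free(r);                        /* the real page-in/page-out would go here */
        pthread_mutex_lock(&p->lock);
        p->handled++;
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Receiver side: queue the request and return at once, never spawn a thread. */
static int pager_submit(struct pager_pool *p, int page)
{
    struct request *r = malloc(sizeof *r);

    if (r == NULL)
        return -1;
    r->next = NULL;
    r->page = page;
    pthread_mutex_lock(&p->lock);
    if (p->tail != NULL)
        p->tail->next = r;
    else
        p->head = r;
    p->tail = r;
    pthread_cond_signal(&p->work);
    pthread_mutex_unlock(&p->lock);
    return 0;
}

/* Push a burst of n requests through a fresh pool; returns how many were handled. */
static int pager_run_burst(int n)
{
    struct pager_pool p = { .head = NULL, .tail = NULL, .closing = 0, .handled = 0 };
    int i;

    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.work, NULL);
    for (i = 0; i < PAGER_WORKERS; i++)
        pthread_create(&p.workers[i], NULL, pager_worker, &p);
    for (i = 0; i < n; i++)
        pager_submit(&p, i);
    pthread_mutex_lock(&p.lock);
    p.closing = 1;                      /* workers drain the queue, then exit */
    pthread_cond_broadcast(&p.work);
    pthread_mutex_unlock(&p.lock);
    for (i = 0; i < PAGER_WORKERS; i++)
        pthread_join(p.workers[i], NULL);
    pthread_mutex_destroy(&p.lock);
    pthread_cond_destroy(&p.work);
    return p.handled;
}
```

However large the burst, the number of threads stays at `PAGER_WORKERS`; what grows instead is the queue. Provided every allocation succeeds, `pager_run_burst(1000)` handles all 1000 requests with four threads.
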
+ + that contention is part of the scalability problem + a simple solution is to use a more controlled thread pool that + merely queues requests until user threads can process them + i'll try to make it clearer : we can't simply limit the amout of + threads in libports, because some paging requests require the reception + of future paging requests in order to complete an operation + why would that help with the async nature of paging requests ? + it wouldn't + right + thaht's a solution to the scalability problem, not to reliability + well, that kind of queue could also be useful for the other hurd + servers, no ? + i don't think so + why not ? + teythoon: why would it ? + the only other major async messages in the hurd are the no sender + and dead name notification + notifications* + we could cap the number of threads + two problems with that solution + does not solve the dos issue, but makes it less interruptive, + no? + 1/ it would dynamically scale + and 2/ it would prevent the reception of messages that allow + operations to complete + why would it block the reception ? + it won't be processed, but accepting it should be possilbe + because all worker threads would be blocked, waiting for a future + message to arrive to complete, and no thread would be available to + receive that message + accepting, yes + that's why i was suggesting a separate pool just for that + 15:35 < braunr> a simple solution is to use a more controlled + thread pool that merely queues requests until user threads can process + them + "user threads" is a poor choice + i used that to mirror what happens in current kernels, where + threads are blocked until the system tells them they can continue + hm + but user threads don't handle their own page faults on mach + so how would the threads be blocked exactly, mach_msg ? + phread_locks ? + probably a pthread_hurd_cond_wait_np yes + that's not really the problem + why not ? 
that's the point where we could yield the thread and + steal some work from our queue + this solution (a specific thread pool of a limited number of + threads to receive messages) has the advantage that it solves one part of + the scalability issue + if you do that, you loose the current state, and you have to use + something like continuations instead + indeed ;) + this is about the same as making threads uninterruptible when + waiting for IO in unix + it makes things simpler + less error prone + but then, the problem has just been moved + instead of a large number of threads, we might have a large number + of queued requests + actually, it's not completely asynchronous + the pageout code in mach uses some heuristics to slow down + it's ugly, and is the reason why the system can get extremely slow + when swap is used + solving that probably requires a new paging interface with the + kernel + ok, we will postpone this + I'll have to look at libpager for the protected payload series + anyways + 15:38 < braunr> 1/ it would dynamically scale + + not + why not ? + 15:37 < teythoon> we could cap the number of threads + to what value ? 
+ we could adjust the number of threads and the queue size based + on some magic unicorn function + :) + this one deserves a smiley too + ^^ + + ## Alternative approaches: * diff --git a/open_issues/nightly_builds.mdwn b/open_issues/nightly_builds.mdwn index 96567685..f6d2c311 100644 --- a/open_issues/nightly_builds.mdwn +++ b/open_issues/nightly_builds.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -29,9 +29,25 @@ Resources: * + IRC, freenode, #hurd, 2013-11-15: + + today I discovered buildbot, and both the master as well as + the build slave works just fine out of the box on Hurd :) + I'd love to set one up on darnassus + ah nice + we use buildbot at work too + even better, so you already know it + sure we can + no i don't + i just know we use it :) + k + but that would be a good occasion to learn + i'm a bit busy right now, have to go soon + we'll see the details later + yes :) + + [[Nightly_Builds_deb_Packages]]. + * [LAVA (Linaro Automated Validation Architecture)](http://lava.readthedocs.org/) ---- - -See also [[nightly_builds_deb_packages]]. diff --git a/open_issues/nightly_builds_deb_packages.mdwn b/open_issues/nightly_builds_deb_packages.mdwn index 11fc4c79..da7bdc7d 100644 --- a/open_issues/nightly_builds_deb_packages.mdwn +++ b/open_issues/nightly_builds_deb_packages.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -16,6 +17,13 @@ packages. 
* Need to have an automation to get from Hurd upstream Git branches to a branch usable in Debian. + IRC, freenode, #hurd, 2013-12-18: + + http://darnassus.sceen.net/~teythoon/hurd-ci/ has hurd and + mig and gnumach packages built directly from the upstream git + repository + + --- There is infrastructure available to test whole OS installations. @@ -29,3 +37,74 @@ There is infrastructure available to test whole OS installations. --- See also [[nightly_builds]]. + + +# Debian Jenkins Instance + +## IRC, OFTC, #debian-hurd, 2014-02-24 + + hi. can hurd be installed using d-i? If so, what about scripting + the installation on ? + pere: d-i works for Hurd, yes, with full graphical interface I + dunno. Maybe you can ask about scripting in #hurd, more people are + present there? + gnu_srs: the scripts in questions are for jenkins. quite easy to + write (d-i preseed scripts and qemu boot rules). + +## IRC, OFTC, #debian-hurd, 2014-02-25 + + getting a automated test in jenkins running could show the status. + what is needed to boot the hurd d-i image with a preseed file using qemu? + git://git.debian.org/git/users/holger/jenkins.debian.net.git is the + repo with the jenkins build rules. + youpi: is it possible to start the hurd d-i installer with a preseed + file from the qemu command line? --append need --kernel, which I suspect + do not make sense with hurd? + can the d-i hurd installer take a preseed file at all? my initial + try failed. :( + i don't know + there has been talk here the other day about using qemus + multiboot capabilities to directly boot the hurd + +[[debugging_gnumach_startup_qemu_gdb]], *Multiboot* + + i always wanted to try that out + the jenkins rules to test the install uses --kernel, --initrd and + --append in qemu to specify the preseed file. without a similar method + to boot hurd, it will be hard to automate the test. rewriting the iso + might be an option, but not a very nice one. 
+ i believe that it is possible to use those options to boot a + hurd + i'll report back to you + I tried adding an url= option to grub when booting the installer, + but it seem to be ignored. + I suspect it did not make it to /proc/cmdline, but am not sure. + um + it should + could be. I am unable to get a shell in the installer, so I do not + know. + root@pluto ~ # cat /proc/cmdline + root=device:hd0s1 + oh ? select expert install, then spawn a shell or something + perhaps the preseed udeb is missing, or the network support was + enabled after preseed looked for the file? + uh, i don't know about that stuff, youpi creates the d-i images + ok. seem to me that the d-i images do not support preseeding at the + moment. + pere: when i try to use qemus multiboot support to boot the + hurd, qemu crashes :/ + youpi: ^ did you succeed? if so, can you share how? + teythoon: nope, I concluded it didn't work, and left it to other to + fix. :) + pere, teythoon: IIRC preseeding can be put on the gnumach kernel + command line + but I'm wondering why you can't simply modify the disk image into + doing what you want + or you mean reinstalling the image each time? + youpi: the point is testing the installer, and that can only be done + by using the installer. :) + ok + I would like to see something like for hurd. diff --git a/open_issues/nptl.mdwn b/open_issues/nptl.mdwn index 3c84bfb0..be0270df 100644 --- a/open_issues/nptl.mdwn +++ b/open_issues/nptl.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -15,7 +16,8 @@ License|/fdl]]."]]"""]] # IRC, freenode, #hurd, 2010-07-31 - Other question: how difficult is a NPTL port? Futexes and some kernel interfaces for scheduling stuff etc. -- what else? 
+ Other question: how difficult is a NPTL port? Futexes and some + kernel interfaces for scheduling stuff etc. -- what else? actually NPTL doesn't _require_ futexes it just requires low-level locks Mmm, it seems to be so only in principle @@ -25,8 +27,10 @@ License|/fdl]]."]]"""]] I'm not sure we really want to port NPTL OK. Drepper will keep finding things to add - while the interface between glibc and libpthread isn't increasing _so_ much - ... and even less so the interfavce that actual applications are using. + while the interface between glibc and libpthread isn't increasing + _so_ much + ... and even less so the interfavce that actual applications + are using. We'd need to evaluate which benefits NPTL would bring. @@ -44,6 +48,63 @@ License|/fdl]]."]]"""]] and http://lists.debian.org/debian-bsd/2013/07/msg00138.html +# IRC, freenode, #hurd, 2013-12-26 + + hm? has NPTL already supported for Hurd? + probably won't ever be + so no plan for it? + what for ? + no one interested in it, or no necessary adding it? + why would you want nptl ? + ntpl was created to overcome the defficiencies of linuxthreads + we have our own libpthread + (with its own defficiencies) + supporting nptl would probably force us to implement something a + la clone + well, just inertia, now that Linux/kFreebsd has it + are you sure kfreebsd has it ? + * teythoon thought we have clone + http://www.gnu.org/software/hurd/open_issues/nptl.html + seems someone mentioned it + it's a "nptl-like implementation" + yes, I don't think it should be the same with Linux one, but + something like it + but what for ? + as mentioned in the link you just gave, " We'd need to + evaluate which benefits NPTL would bring." + well, it's the note of 2010, I don't know if it's relative now + relevant* + ah thanks + but that still doesn't answer anything + why are *you* talking about nptl ? 
+ just saw pthread, then recall nptl, dunno + just asking + :) + but you mentioned that Hurd has its own thread implementation, + is it similar or better than Linux NPTL? + or there's no benchmark yet? + it's inferior in performance + almost everything in the hurd is inferior performance-wise because + of the lack of optimizations + currently we care more about correctness + speak the NPTL, I ever argued with a friend since I saw + drepper mentioned NPTL should be m:n, then I thought it is...But finally + I was failed, he didn't implement it yet... + what ? + nptl was always 1:1 + but in nptl-design draft, I thought it's m:n + anyway, it's draft + and seems being a draft for long time + never read anything like that + I think it's my misread + I have to go, see you guys tomorrow + The consensus among the kernel developers was that an M-on-N + implementation + would not fit into the Linux kernel concept. The necessary + infrastructure which would + have to be added comes with a cost which is too high. + + --- # Resources diff --git a/open_issues/performance.mdwn b/open_issues/performance.mdwn index 772fd865..3dab6d4c 100644 --- a/open_issues/performance.mdwn +++ b/open_issues/performance.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -217,3 +217,25 @@ call|/glibc/fork]]'s case. 
i'm only saying that the phoronix benchmark results are useless because they didn't measure the right thing ok + + +# Optimizing Data Structure Layout + +## IRC, freenode, #hurd, 2014-01-02 + + teythoon_: wow, digging into the vm code :) + i discovered pahole and gnumach was a tempting target :) + never heard of pahole :/ + it's nice + braunr: try pahole -C kmem_cache /boot/gnumach + on linux that is. ... + ok + braunr: http://paste.debian.net/73864/ + very nice + + +## IRC, freenode, #hurd, 2014-01-03 + + teythoon: pahole is a very handy tool :) + yes + i especially like how general it is diff --git a/open_issues/performance/io_system/clustered_page_faults.mdwn b/open_issues/performance/io_system/clustered_page_faults.mdwn index a3baf30d..8bd6ba72 100644 --- a/open_issues/performance/io_system/clustered_page_faults.mdwn +++ b/open_issues/performance/io_system/clustered_page_faults.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -160,3 +160,6 @@ License|/fdl]]."]]"""]] immediately when he stopped attending meetings... 
slpz: oh, you even already looked into vm_pageout_scan() back then :-) + + +# [[Read-Ahead]] diff --git a/open_issues/performance/io_system/read-ahead.mdwn b/open_issues/performance/io_system/read-ahead.mdwn index 05a58f2e..711f7691 100644 --- a/open_issues/performance/io_system/read-ahead.mdwn +++ b/open_issues/performance/io_system/read-ahead.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -3041,3 +3041,26 @@ License|/fdl]]."]]"""]] still on my TODO list it will get merged eventually, now that the large store patch has also been applied + + +## IRC, freenode, #hurd, 2013-12-31 + + mcsim: do you think you'll have time during january to work out + your clustered pagein work again ? :) + braunr: hello. yes, I think. Depends how much time :) + shouldn't be much i guess + what exactly should be done there? 
+ probably a rebase, and once the review and tests have been + completed, writing the full changelogs + ok + the libpager notification on eviction patch has been pushed in as + part of the merge of the ext2fs large store patch + i have to review neal's rework patch again, and merge it + and then i'll test your work and make debian packages for + darnassus + play with it a bit, see how itgoes + mcsim: i guess you could start with + 62004794b01e9e712af4943e02d889157ea9163f (Fix bugs and warnings in + mach-defpager) + rebase it, send it as a patch on bug-hurd, it should be + straightforward and short diff --git a/open_issues/pfinet_timers.mdwn b/open_issues/pfinet_timers.mdwn index 5db192e3..244ca98b 100644 --- a/open_issues/pfinet_timers.mdwn +++ b/open_issues/pfinet_timers.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -117,3 +117,61 @@ License|/fdl]]."]]"""]] yes, schedule_timeout could need a review actually, fakeroot rm -rf * is a good test and it's still damn slow + + +## IRC, freenode, #hurd, 2013-11-04 + + i think i know why fakeroot is slow no + now + schedule_timeout as implemented in pfinet can only be awaken by a + timeout + even when the expected even comes in earlier + so yes, the proper solution is to rewrite the timers using + interruptible_sleep_on_timeout (and in turn + pthread_hurd_cond_timedwait_np) + hm no, it's still not that straightforward :( + + +## IRC, freenode, #hurd, 2013-11-05 + + youpi: i found the bug slowing down fakeroot-tcp + it's actually a bug that slows down anything using the loopback + device + (although there still is a problem with fakeroot chown) + oh! 
+ basically + the loopback device calls netif_rx from its xmit function + which is perfectly fine + except the glue code makes mark_bh (used to raise bottom halves) + broadcast a condition + and since netif_rx is called from within xmit, which is called + from the net_bh worker thread + the thread itself is never waiting for the condition when it is + broadcast + it's very simple to fix, i'll send a patch later + netcat to netcat now consumes 100% cpu + as does fakeroot ls -Rl + but for some reason fakeroot chown is still extremely slow + and i've seen deadlocks in glibc (e.g. setlocale() getting the + locale lock, which is locked again in case libfakeroot fails and calls + strerror) + so still a bit of debugging work needed + + +## IRC, freenode, #hurd, 2013-11-06 + + chown being slow with fakeroot-tcp can also be seen on linux + + did your recent patch improve the performance of fakeroot-tcp ? + yes + very nice :) + but fakeroot chown is still slow + although it's also slow on linux + so i'm not looking into that any more for the time being + as long as it's not used recursively on huge directories, it's + fine + + +## IRC, freenode, #hurd, 2013-11-09 + + braunr: fakeroot-tcp is indeed much faster now :) diff --git a/open_issues/profiling.mdwn b/open_issues/profiling.mdwn index 545edcf6..e7dde903 100644 --- a/open_issues/profiling.mdwn +++ b/open_issues/profiling.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -138,3 +138,234 @@ done for [[performance analysis|performance]] reasons. know what happen and how happen, maybe just suitable for newbie, hope more young hack like it once it's done, everything else is just sugar candy around it + + +# IRC, freenode, #hurd, 2014-01-05 + + braunr: do you speak ocaml ? 
+ i had this awesome idea for a universal profiling framework for + c + universal as in not os dependent, so it can be easily used on + hurd or in gnu mach + it does a source transformation, instrumenting what you are + interested in + for this transformation, coccinelle is used + i have a prototype to measure how often a field in a struct is + accessed + unfortunately, coccinelle hangs while processing kern/slab.c :/ + teythoon: I do speak ocaml + awesome :) + unfortunately, i do not :/ + i should probably get in touch with the coccinelle devs, most + likely the problem is that coccinelle runs in circles somewhere + it's not so complex actually + possibly, yes + do you know coccinelle ? + the only really peculiar thing in ocaml is lambda calculus + +c + I know a bit, although I've never really written an semantic patch + myself + i'm okay with that + but I can understand them + then ocaml should be fine for you :) + just ask the few bits that you don't understand :) + yeah, i haven't really made an effort yet + writing ocaml is a bit more difficult because you need to + understand the syntax, but for putting printfs it should be easy enough + if you get a backtrace with ocamldebug (it basically works like + gdb), I can probably explain you what might be happening + + +## IRC, freenode, #hurd, 2014-01-06 + + braunr: i'm not doing microoptimizations, i'm developing a + profiler :p + teythoon: nice :) + i thought you might like it + teythoon: you may want to look at + http://pdos.csail.mit.edu/multicore/dprof/ + from the same people who brought radixvm + which data structure should i test it with next ? 
+ uh, no idea :) + the ipc ones i suppose + yeah, or the task related ones + but be careful, there many "inline" versions of many ipc functions + in the fast paths + and when they say inline, they really mean they copied it + +are + but i have a microbenchmark for ipc performance + you sure have been busy ;p + it's funny you're working on a profiler at the same time a + collegue of mine said he was interested in writing one in x15 :) + i don't think inlining is a problem for my tool + well, you can use my tool for x15 + i told him he could look at what you did + so i expect he'll ask soon + cool :) + my tool uses coccinelle to instrument c code, so this works in + any environment + one just needs a little glue and a method to get the data + seems reasonable + for gnumach, i just stuff a tiny bit of code into the kdb + + hm debians bigmem patch with my code transformation makes + gnumach hang early on + i don't even get a single message from gnumach + ouch + or it is somethign else entirely + it didn't even work without my patches o_O + weird + uh oh, the kmem_cache array is not properly aligned + braunr: http://paste.debian.net/74588/ + teythoon: do you mean, with your patch ? + i'm not sure i understand + are you saying gnumach doesn't start because of an alignment issue + ? + no, that's unrelated + i skipped the bigmem patch, have a running gnumach with + instrumentation + hum, what is that aliased column ? + but, despite my efforts with __attribute__((align(64))), i see + lot's of accesses to kmem_cache objects which are not properly aligned + is that reported by the performance counters ? + no + http://paste.debian.net/74593/ + aer those the previous lines accessed by other unrelated code ? 
+ previous bytes in the same line* + this is a patch generated to instrument the code + so i instrument field access of the form i->a + but if one does &i->a, my approach will no longer keep track of + any access through that pointer + so i do not count that as an access but as creating an alias for + that field + ok + so if that aliased count is not zero, the tool might + underestimate the access count + hm + static struct kmem_cache kalloc_caches[KALLOC_NR_CACHES] + __attribute__((align(64))); + but + nm gnumach|grep kalloc_caches + c0226e20 b kalloc_caches + ah, that's fine + yes + nevr mind + don't we have a macro for the cache line size ? + ah, there are a great many more kmem_caches around and noone + told me ... + teythoon: eh :) + aren't you familiar with type-specific caches ? + no, i'm not familiar with anything in gnumach-land + well, it's the regular slab allocator, carrying the same ideas + since 1994 + it's pretty much the same in linux and other modern unices + ok + the main difference is likely that we allocate our caches + statically because we have no kernel modules and know we'll never destroy + them, only reap them + is there a macro for the cache line size ? + there is one burried in the linux source + L1_CACHE_BYTES from linux/src/include/asm-i386/cache.h + there is one in kern/slab.h + but it is out of date + there is ? + but it's commented out + only used when SLAB_USE_CPU_POOLS is defined + but the build system should give you CPU_L1_SHIFT + hm + and we probably should define CPU_L1_SIZE from that + unconditionnally in config.h or a general param.h file if there is one + the architecture-specific one perhaps + although it's exported to userland so maybe not + + +## IRC, freenode, #hurd, 2014-01-07 + + braunr: linux defines ____cacheline_aligned : + http://lxr.free-electrons.com/source/include/linux/cache.h#L20 + where would i put a similar definition in gnumach ? + .oO( four underscores ?!? 
) + heh + yes, four + teythoon: yes :) + + are kmem_cache objects ever allocated dynamically in gnumach ? + no + hm + i figured that, since there are no kernel modules, there is no + need to allocate them dynamically, since they're never destroyed + so i aligned all statically declarations with + __attribute__((align(1 << CPU_L1_SHIFT))) + but i still see 77% of all accesses being to objects that are + not properly aligned o_O + ah + >,< + you could add an assertion in kmem_cache_init to find out what's + wrong + *aligned + eh :) + right + grr + sweet, the kmem_caches are now all properly aligned :) + :) + + hm + i guess i should change what vmstat reports as "cache" from the + cached objects to the external ones (which map files and not anonymous + memory) + braunr: http://paste.debian.net/74869/ + turned out that struct kmem_cache was actually an easy target + no bitfields, no embedded structs that were addressed as such + (and not aliased) + :) + + +## IRC, freenode, #hurd, 2014-01-09 + + braunr: i didn't quite get what you and youpi were talking about + wrt to the alignment attribute + define a type for struct kmem_cache with the alignment attribute + ? is that possible ? + ah, like it's done for kmem_cpu_pool + teythoon: that's it :) + note that aligning a struct doesn't change what sizeof returns + heh, that save's one a whole lot of trouble indeed + you have to align a member inside for that + why would it change the size ? 
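
A small sketch of the alignment check suggested above ("add an assertion in kmem_cache_init"), assuming GCC or a compatible compiler. The attribute GCC recognizes is spelled `aligned`. The types and arrays below are made up for illustration and are not gnumach code; `CACHE_LINE` is an assumed 64-byte line size standing in for `1 << CPU_L1_SHIFT`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed line size; gnumach would use 1 << CPU_L1_SHIFT */

/* 40 bytes of payload, no attribute on the type. */
struct plain_obj { char data[40]; };

/* Same payload, attribute on the type (the style used for kmem_cpu_pool). */
struct lined_obj { char data[40]; } __attribute__((aligned(CACHE_LINE)));

/* Attribute on the variable: only the start of the array is aligned. */
static struct plain_obj plain_array[4] __attribute__((aligned(CACHE_LINE)));

/* Alignment comes with the type: every element is aligned. */
static struct lined_obj lined_array[4];

/* The check one could assert on in an init routine. */
static int is_line_aligned(const void *p)
{
    return ((uintptr_t)p % CACHE_LINE) == 0;
}

/* Number of elements of plain_array that start on a cache line boundary. */
static int count_aligned_plain(void)
{
    int n = 0;

    for (size_t i = 0; i < 4; i++)
        n += is_line_aligned(&plain_array[i]);
    return n;
}

/* Number of elements of lined_array that start on a cache line boundary. */
static int count_aligned_lined(void)
{
    int n = 0;

    for (size_t i = 0; i < 4; i++)
        n += is_line_aligned(&lined_array[i]);
    return n;
}
```

With these definitions only the first element of `plain_array` starts on a line boundary (the others sit at offsets 40, 80 and 120), while all four elements of `lined_array` do, because with GCC the type-level attribute also makes the element stride a multiple of the line size.
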
+ imagine an array of such structs + ah + right + but it fits into two cachelines exactly + that wouldn't be a problem with an array either + so an array of those will still be aligned element-wise + yes + and it's often used like that, just as i did for the cpu pools + but then one is tempted to think the size of each element has + changed too + and then use that technique for, say, reserving a whole cache line + for one variable + ah, now i get that remark ;) + :) + + braunr: i annotated struct kmem_cache in slab.h with + __cacheline_aligned and it did not have the desired effect + can you show the diff please ? + http://paste.debian.net/75192/ + i don't know why :/ + that's how it's done for kmem_cpu_pool + i'll try it here + wait + i made a typo + >,< + __cachline_aligned + bad one + uh :) + i don't see it + ah yes + missing e + yep, works like a charme :) + nice, good to know :) + :) + given the previous discussion, shall i send it to the list or + commit it right away ? + i'd say go ahead and commit diff --git a/open_issues/robustness.mdwn b/open_issues/robustness.mdwn index a6b0dbfb..4b0cdc9b 100644 --- a/open_issues/robustness.mdwn +++ b/open_issues/robustness.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -12,6 +13,7 @@ License|/fdl]]."]]"""]] [[!toc]] + # IRC, freenode, #hurd, 2011-11-18 I'm learning about GNU Hurd and was speculating with a friend @@ -167,3 +169,49 @@ License|/fdl]]."]]"""]] http://darnassus.sceen.net/gitweb/teythoon/reincarnation.git/blame/HEAD:/reincarnation.defshttp://darnassus.sceen.net/gitweb/teythoon/reincarnation.git/blame/HEAD:/reincarnation.defs < teythoon> uh >,< sorry, pasted twice < braunr> oh ok + + +## IRC, freenode, #hurd, 2014-02-01 + 
+ btw, can hurd upgrade the kernel without reboot? + no + but since most functionality is not within the kernel, the more + interesting question is, what parts of the hurd can be replaced at + runtime + ok. what is the answer to that question? + no hurd server can be restarted transparently, i.e. w/o its + clients noticing that + however, if a server is not in use, it can be easily restarted + transparently restarting servers would be nice + and i believe it is even possible on mach + teythoon: how ? + one has to retain two things, client-related state and the port + right + doesn't that require persistence ? + it does + but i see no reason why it should not be possible to implement + this on top of mach + maybe + the most crucial thing is to preserve the receive port, and to + replace the server without race-conditions + receive rights can be transfered using the notification + mechanism + + braunr: restarting servers doesn't exactly require + persistance. you only need to pass the state from the old server to the + new one, rather than serialising it for on-disk storage. it's a slightly + easier requirement... + (most notably, you don't need any magic to keep the capabilities + around -- just pass them over using normal IPC) + antrik: i agree, but then again, once this is in place, adding + persistence is only a little step + teythoon: depends. if it's implemented with persistence in mind + from the beginning, it might be a fairly small step indeed; but + otherwise, it could be two entirely different things + this also depends on the kind of persistence you want + I must say that for the kind of persistence *I* would like, it is + indeed quite related + well, please elaborate a little :) + what do you have in mind ? + busy right now... 
remind me some other time if I forget :-) + sure diff --git a/open_issues/serial_console.mdwn b/open_issues/serial_console.mdwn index ed6358a2..827fd211 100644 --- a/open_issues/serial_console.mdwn +++ b/open_issues/serial_console.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,7 +10,8 @@ License|/fdl]]."]]"""]] [[!tag open_issue_documentation]] -IRC, #hurdfr, 2010-09-20 + +# IRC, freenode, #hurdfr, 2010-09-20 you can compile your gnumach so that it uses the serial console, and you put the serial port on the qemu console @@ -50,3 +51,56 @@ IRC, #hurdfr, 2010-09-20 for xen i set £ as the shortcut that seems simpler to me in that case a nod to English society :) + + +# IRC, freenode, #hurd, 2014-02-20 + + 04:06:45< gg0> ok a configuration that works w/o patching anything is + 9600 7S1 [ 7bits - parity Space - 1 stopbit ] + 04:07:57< gg0> it displays correctly gnumach, ext2fs and following + outputs + 04:28:05< gg0> youpi: instead if you want a patch, this one makes + gnumach default to 8N1. someone should still implement serial line + settings for ext2fs though + seems something broke it later + or it never worked on real hardware + we definitely want it to work with 8N1 + never had problems with _virtual_ serial consoles + never = during last 2 years = since + http://git.savannah.gnu.org/gitweb/?p=hurd/gnumach.git;a=commitdiff;h=2a603e88f86bee88e013c2451eacf076fbcaed81 + but i don't think i was on real hardware at that time + + +## IRC, freenode, #hurd, 2014-02-21 + + yeah, i have one rebuilt trying to fix serial console (already give + up) + what were you trying to fix ?
+ i didn't fix anything but it's been useful somehow :) + this one http://paste.debian.net/plain/83292 + initial messages from mach/hurd outputs like there was no line feed + each line overwrites previous one + then ext2fs outputs garbage + then openrc start outputting fine + minicom 9600 8N1 + this is from a real machine ? + yep real machine + nice :) + i fixed last line, last garbage, by switching c: from 38400 to 9600 + in inittab + i've a vt510 terminal connected to my hurd box, and i started to + make the serial setting in gnumach more configurable + and disabling T0 + didn't finish it though + physical vt510 connected to virtual hurd box? + no, it's a real box as well + good. and does it behave as described/pasted above? + currently i do not put the mach console on the serial line + b/c it has a fixed baud rate of 9600 + and both grub and the getty are configured at a higher speed + hence my desire to improve gnumachs serial port setup + i don't care much about speed. such no-line-feed behavior is quite + annoying though + i thought it was related to CRMOD which afaiu should translate cr to + cr-lf, but i was surely missing something + (annoying till one does ^A-A to make minicom add line feeds itself) diff --git a/open_issues/system_initialization.mdwn b/open_issues/system_initialization.mdwn index 9048b615..0df1078e 100644 --- a/open_issues/system_initialization.mdwn +++ b/open_issues/system_initialization.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,7 +10,8 @@ License|/fdl]]."]]"""]] [[!tag open_issue_hurd]] -IRC, freenode, #hurd, 2011-03-30 + +# IRC, freenode, #hurd, 2011-03-30 init=/bin/sh hack doesn't work for GNU/Hurd ? kilobug: I don't think you can override init on Hurd. 
the init @@ -19,6 +20,23 @@ IRC, freenode, #hurd, 2011-03-30 server to *only* do that, and then pass on to standard sysv init... with that it could actually work ---- - * [[systemd]], etc. +# IRC, freenode, #hurd, 2013-11-29 + + we need to make the bootstrap code more robust and fix the error + handling there + for example, you can kill the exec server and the rootfs w/o + /hurd/init noticing it... + yes + there are plans in init.c to take over the exception port of the + essential processes + that could help + + +# [[hurd_init]] + + +# [[Anatomy_of_a_Hurd_System]] + + +# [[systemd]] diff --git a/open_issues/systemd.mdwn b/open_issues/systemd.mdwn index 1f3eea03..ca910491 100644 --- a/open_issues/systemd.mdwn +++ b/open_issues/systemd.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -27,7 +27,9 @@ Daniel Gollub, Stefan Seyfried, 2010-10-14. Likely there's also some other porting needed. -# IRC, OFTC, #debian-hurd, 2011-05-19 +# Discussion + +## IRC, OFTC, #debian-hurd, 2011-05-19 pochu: http://news.gmane.org/gmane.comp.gnome.desktop - the "systemd as dependency" and all the messages in it don't give me a bright @@ -172,7 +174,7 @@ Likely there's also some other porting needed. anyway, I'll talk to the upstart guys about libnih -## IRC, OFTC, #debian-hurd, 2013-08-15 +### IRC, OFTC, #debian-hurd, 2013-08-15 btw, I talked to vorlon about upstart and the Hurd so the situation with libnih is that it is basically @@ -183,6 +185,16 @@ Likely there's also some other porting needed. patches +### IRC, OFTC, #debian-hurd, 2013-11-28 + + teythoon: did you see they got libnih ported to kfreebsd? 
+ http://lists.debian.org/debian-devel/2013/11/msg00395.html + "I haven't started looking into Hurd yet," sounds promising + saw that + i looked at libnih too + wrote a mail about that + + ## IRC, freenode, #hurd, 2013-08-26 < youpi> teythoon: I tend to agree with mbanck @@ -1035,6 +1047,2591 @@ Likely there's also some other porting needed. wrecks havoc on the system +## IRC, freenode, #hurd, 2014-01-03 + + openrc on debian + https://buildd.debian.org/status/package.php?p=openrc&suite=experimental + gg0: ah nice + + +## IRC, freenode, #hurd, 2014-01-11 + + teythoon: is the Hurd boot now fully init compatible? I would + like to try to boot with a ported openrc in a sandbox kvm:P + + +### IRC, freenode, #hurd, 2014-01-12 + + gnu_srs1: yes, go ahead + gnu_srs1: you'll have to switch to sysvinit first + for that, you need patched sysvinit packages + + teythoon: do you mean the parches in #721917? + gnu_srs: yes, mostly, but there is one final patch missing + uploading patched sysvinit to debian-ports? (or braunr's or + teythoon's repos) + gg0, gnu_srs: they are actually here + http://teythoon.cryptobitch.de/gsoc/heap/debian/ but outdated + teythoon: if the sysvinit patches are outdated, can you update + them please? and provide a package for upload to -ports (as gg0 proposed) + gnu_srs: i will + tks:) + + +### IRC, freenode, #hurd, 2014-01-13 + + gnu_srs: i updated the sysvinit patches + gnu_srs: for your convenience, here are packages: + http://darnassus.sceen.net/~teythoon/heap/sysvinit/ + gnu_srs: you have to install the sysvinit-core package first, + then the others + to switch to sysvinit, do update-alternatives --config runsystem + and select runsystem.sysv + then, do reboot-hurd and hope for the best ;) + + teythoon: thanks, will try soon. Are you submitting the updated + patches to #721917 too? 
+ gnu_srs: i already did + good;-) + teythoon: rebooted with sysv:http://paste.debian.net/75925/ + gnu_srs: please, whenever you run into a problem, give more + context + which file are you talking about ? + also, as the postinst script advised you, you need to use + {halt,reboot}-hurd *whenever* you switch the runsystem + not doing so wont do any harm, but it wont work + shutdown: /run/initctl: No such file or directory <-- that's + what happens if you run reboot (=reboot-sysv) w/o sysvinit being run + if you don't get a getty on the console, check /etc/inittab + I did note see a message from any posinst script about + {halt,reboot}-hurd, only LC* related messages + A I missed it: You must use halt-hurd or reboot-hurd to halt or + reboot the + system whenever you change the runsystem. + I don't see anything suspicious in /etc/inittab, + eg. 1:2345:respawn:/sbin/getty 38400 tty1 is there + 7:2345:respawn:/sbin/getty 38400 console + then, you'll get a getty on the mach console, even if the + hurd-console does not start + teythoon: with 7:2345:respawn:/sbin/getty 38400 console in + /etc/inittab I get a (mach) console. + never seen that mentioned anywhere + anyway, the image is now booted with sysvinit. next to try will + be openrc:P + gnu_srs: you haven't heard of the inittab entry for the mach + console before b/c the inittab was not used before on the hurd + i should probably write that down in the wiki somewhere... + shouldn't the upgrade of the sysvinit package do it too? + (does it at least install a correct version on newer installs?) + it probably should / i'm not sure + + +## IRC, freenode, #hurd, 2014-01-13 + + gnu_srs: have you ported openrc already ? 
+ I made it build (with temporary workarounds for PATH_MAX) but + need to change at least one file to be hurd-specific before trying to + boot + cool :) + i guess not much different from http://paste.debian.net/plain/75893/ + (i didn't say it sucks but one can find it out by taking a look) + gg0: Have you talked to zigo in #openrc?. He has partial patches + (submitted to the debian repo), you do and me too. + Maybe we should align our work. + The file to make Hurd-specific is: init.sh.GNU (you start with + copy of the Linux version, I start from a copy of the BSD version). + BTW: I don't think fstabinfo is available for GNU/Hurd! + gg0: Sorry, fstabinfo and moutinfo are parts of openrc, my bad:-D + mountinfo* + + +## IRC, freenode, #hurd, 2014-01-15 + + Hi, is these some simple way to find out the sequence of commands + executed during boot: + current using runsystem.gnu and with sysv-rc using runsystem.sysv + I need to edit on file of OpenRC before trying to boot with + it. (mainly mounting /run/*) + Is mount functional or is settrans .needed? + + +## IRC, freenode, #hurd, 2014-01-16 + + gnu_srs: you are adding OpenRC? cool! + ArneBab: Working on it, will try booting when my questions here + have been answered ;-) + gnu_srs: mount is functional enough to boot Debian/Hurd using + sysvinit + gnu_srs: you could add "set -x" to runsystem.*, or add "bash" to + just drop into a shell and examine the environment interactively + teythoon: Hi, is mount a wrapper on top of settrans ...? + yes + how to log the boot sequence, when booting the mach console is + cleared when the hurd console starts? + you could just disable the hurd console + and the kvm console does not have scrolling functionality + it's actually the mach console that lacks this + copying manually is cumbersome, even if all is readable + but as a workaround you can use kvm .... 
-curses and use xterms + backlog + and c&p works then :) + tks, I'll try with that:P + + +## IRC, freenode, #hurd, 2014-01-17 + + BTW: zigo successfully booted openrc on Hurd, I haven't tried + yet,, you know things coming in between. He used my patches to create + updated ones:) + that version is now in experimental (I still have to operate away + all those PATH_MAX issues, and fins at least one sh file). + :/ + + +## IRC, freenode, #hurd, 2014-01-21 + + teythoon: I don't get a scrollable output when using -curses in + kvm, to be able to see all startup messages. Any other ideas? + gnu_srs: are you sure ? i just tested this, and it works nicely + for me + gnu_srs: that's how i created all the "screenshots" for my blog + posts + teythoon: kvm -m 1024 -net nic,model=rtl8139 -net + user,hostfwd=tcp::5564-:22 -curses -hda debian-hurd-20140115.img + ah, my bad + gnu_srs: try -nographic + oh, and maybe you need to add console=com0 to the gnumach + command line + b/c with -nographic, the first serial port is connected to qemus + stdio + sorry, i mixed this up + and how to add console=com0 to the qemu start oprtions? -kernel + and -append are Linux only + # grep console /etc/default/grub + GRUB_CMDLINE_GNUMACH="console=com0 --crash-debug" + and if you want grub on the serial port: + # grep serial /etc/default/grub + GRUB_TERMINAL=serial + GRUB_SERIAL_COMMAND="serial --speed=9600 --unit=0 --word=8 + --parity=no --stop=1" + teythoon: with -nographic I don't get any output at all? + did you run update-grub ? + aha, will do + still no scrollbar with gnome-terminal, will try with xterm and + rxvt + it works: with rxvt, tks:-D + good :) + i found -nographic to be quite handy + in /etc/default/grub: GRUB_CMDLINE_LINUX_DEFAULT="quiet" and + GRUB_CMDLINE_LINUX="" #GRUB_DISABLE_LINUX_UUID=true + linux configuration parameters in a gnumach boot setup? 
+ those won't be used + unless the grub scripts find a linux kernel in /boot + it's just the stock debian configuration file + nevertheless:-( + what ? + there could be OS-specific files: Linux, kFreeBSD, Hurd? + or, preferebly, one that works on every os ? like it is now ;) + OK, one that works on every OS, with a common part and + OS-specific parts? + that's how it is now + stuff with LINUX in it is used for linux + stuff with GNUMACH in it is used for gnumach + + +## IRC, freenode, #hurd, 2014-01-22 + + teythoon: A boot message segfault: (syv-rc specific?) + + exec /sbin/init -a + INIT: version 2.88 booting + Using makefile-style concurrent boot in runlevel S. + end_request: I/O error, dev 02:00, sector 0 + Segmentation fault + Activating swap...done. + Checking root file system...fsck from util-linux 2.20.1 + another: mount: cannot remount /proc: Invalid argument + ... + df: Warning: cannot read table of mounted file systems: No such + file or directory + openrc boots on Hurd, login (user,root) works, read-only mode so + far, have to tweak some scripts:) + not bad + gnu_srs: woah! + very cool! + + +## IRC, freenode, #hurd, 2014-01-22 + + I think with that you are doing the most useful thing to avoid + OpenRC: If it provides almost the same as systemd and runs on the Hurd, + then there is no technical reason for using systemd, but many against it. + s/avoid OpenRC/avoid systemd/ + (gah, brain is jumbled) + I hate systemd because it monopolizes cgroups + which is SUPPOSED to be a generic interface open to anyone + I do not want an unholy alliance in a kernel-user api + ArneBab: the openrc maintainer will take care it will get + communicated + ArneBab: also, not sure what you mean about systemd, the question + isn't so much between openrc vs. systemd, but upstart vs. 
systemd + at least for the Technical Committee decision, none of the + tech-ctte members seems to consider openrc as n realistic contender + s/as n/as a/ + azeem_: seem like it is so:-( + maybe in a future, if openrc gets some attention and developers, + it could become a one-for-all solution;-) + gnu_srs: nice :) + ignore the proc related message + gnu_srs: there is no way to associate the segfault with a + process for me, can you shed some light on which process dies ? + as for df complaining, you could fix this up like youpi did: + grep ln /etc/hurd/rc + ln -s /proc/mounts /var/run/mtab + the proper way is to fix our libc of course + teythoon: I was just coping the boot messages, I don't know + either which process segfaults + hm, maybe you can make openrc more verbose about what it starts + All I wrote earlier was from sysv-rc + ah + i've never seen that then + azeem_: actually I think OpenRC is the only sane choice: It is + the only choice which supports other kernels. + Shentino: I can’t stand systemd, because it establishes a tight + control over the init process by encouraging developers to add + dependencies to libraries which are so tightly coupled with others, that + they cannot be adapted without affecting the whole system. + Shentino: But I wrote about that in much more details: + http://draketo.de/light/english/top-5-systemd-troubles TL;DR: + distributions become completely dependent on a small group and they throw + away the skills their maintainers already have (shell scripting) + And systemd is Linux-only… + …with no intention of changing that. + why would debian strive to support other kernels ? + instead of other kernels adjusting ? + if posix introduces new apis, are we going to say no, or are we + going to try and support them ? 
+ the issue of multi-kernel support is completely irrelevant + what you're saying about tight coupling is actually the only real + issue of systemd + braunr: I see a difference between providing a stable API which + others can easily replicate and a running target with no intention to + become cross-kernel usable (my experience with udev suggests that they + won’t really try to keep anything stable for long). + braunr: but the tight coupling is the main issue for me, too: + that creates a vulnerability for the free software community. + no, the free software community doesn't risk much here + it's a technical problem + ok, yes, posix as a point of convergence is clearly not the same + as linux as an implementation that diverges + agreed + if the systemd people decide to go a certain direction which + makes it impossible to provide a certain feature while using their new + tech, then there is a problem. + but it still implies we have to adapt + from my point of view, multi-kernel distributions are a technical + heresy + if you want something really efficient, you want it very well + integrated + i'm concerned by the linux kernel making up interfaces w/o + proper considerations + braunr: in Gentoo we had all the hassle with /usr on a separate + partition. There are usecases for that, and Gentoo wanted to provide + them, but udev (now systemd) made that impossible. + teythoon: yes i'm concerned about that too + we will never be able to implement the cgroup interface for + example b/c it is too badly designed + badly ? 
+ it's system specific + braunr: also the systemd folks could essentially hold Linus at + ransom: “We couple userspace tightly to implementation details in the + kernel, so when you break the implementation in a way which we don’t + like, you’ll break userspace in the worst possible way” + it's very hard to design an interface without properly + understanding what it would internally imply in the implementation + ArneBab: that's already the case + system specific in a way that it will be impossible to implement + on non-monolithic kernels + teythoon: exactly + they didn't think of that because they don't care + and why would they ? + it doesn't make the interface bad per se + it is the case in systemd, but not in sysVinit + well it is too + but sysvint is less demanding + again, the coupling is the problem + yes + systemd comes from people with other goals and interests + I think everything I wrote comes down to that. + they're very technical, very business oriented + they want to get up to speed with competitors quickly + they're not wrong in doing that + it just helps understand why they get with such results + A distribution would be foolish to let other people take over a + crucial part of the system when those other people have a track record of + coupling more and more parts of the system with their product. + and i agree, i don't want it either + but please, stop with the nonsense + don't say openrc is the only sane one because it's the only + multikernel one + personally, i consider that very argument almost insane itself + considering distributions that are hardly used can really have any + weight in the decision is absurd + openrc is the only sane one, because it keeps already aquired + skills useful. 
+ s/distributions/kernels/ + (that’s my opinion) + we have to make progress + the init system is clearly obsolete and lacking features + so "acquired" skills here are irrelevant too + if it takes acquiring new skills to operate a better init system, + i'm all for it + after all, it makes a lot more sense to me than all those fancy + languages/technologies like C# and ruby that have gained so much + popularity in so little time + If you can get a similarly good init system wiothut forcing + people to learn new skills, that’s a big win. + you probably can't + OpenRC is pretty close in features to systemd + err + not even close + teythoon is right + openrc is just sysvinit++ + no + openrc replaces the sysv rc, not sysvinit + ok + it complements it + i wasn"'t being pedantic here + nicely in my opinion + yes i like it too + but i'm afraid it's not a complete solution + I think I need to be more pedantic in what I say: A system-boot + with OpenRC is pretty close in features to a system-boot using systemd. + on the other hand, when i see discussions about event driven + systems and handling of dependencies, it sounds like something like + openrc could do the job, and something else, system-specific, would + handle the rest + ArneBab: i disagree + me too + ArneBab: have you actually used systemd? + I have read about what it provides. + My udev experience burned me pretty badly. + udev is only one part + but actually, coupling is both a problem and a great feature + yes + it's precisely the integration of many services previously + organized in a very messy way that makes it better + and cgroups, by accurately tracking resources, allow even better + control + heh, i watched lennarts recent talk about kdbus + but it does so by pulling in more and more parts instead of + providing a clean interface which separate projects can use. 
+ again, the coupling is too tight + it's hard to hook in between + teythoon: I watched lennart troll a talk pretty badly… + braunr: yes + he cites mach and hurd for having an nice ipc mechanism, and + linux lacking such a system + haha + i was expecting such comparisons :) + that’s why he writes an init-system which does not run on the + Hurd… + ArneBab: that's trolling on your part ;) + :) + somehow yes… + what i personally get out of this is that, in the end, proper + messaging at the kernel level is something people do want + and if you make stuff like x use it, why not things like the + network stack and the file system + i wish the linux kernel would allow the kernel devs to write + nicer interfaces + yes + they're almost in the process of acknowledging the merits of + multiserver architectures :) + b/c they lack a proper ipc mechanism, they do stuff like ad-hoc + filesystem-based interfaces that are crappy to support on the hurd :-/ + * ArneBab has been out of the loop for too long… + teythoon: what file system do you consider "crappy to support on + the hurd" ? + braunr: cgroupfs in particular + not crappy, but impossible + well, that's probably because we need realy resource containers + first + real* + no, we'll never be able to implement the current interface + i didn't study it as you did so i trust you + braunr: + http://teythoon.cryptobitch.de/posts/cgroupfs-is-as-cgroupy-as-it-gets/ + ok this would require proper support at the client side + yes + i wouldn't say impossible but definitely not as clean as we would + want it + far from it + how would you ever implement it w/o fixing the client + (i.e. fixing the interface first) ? + the client would translate the request + magical write retries ? 
+ probably + uh + clients are the only entities which know what their file + desctiptors refer to + descriptors* + yes + so writing such a request would make the client get a magic retry, + and use the proper rpc, passing the proper rights instead + yeah, i can see how that could work + but i'm not sure that we should go down this path ... + we probably really do'nt want to :) + i'd personally be fine if debian would allow two init systems + me too + with the powerful linux-specific one still allowing sysvinit + scripts + in particular b/c the sysvinit scripts are already there + from what i've read, they all provide some decent backward + compatibility with sysvinit + yes + and i think we can count on the linux community to riot if, + assuming systemd was chosen, it becomes too hard to use and tweak + again, these people want their software to be used + so they'll probably manage something decent in the long run, + whatever is chosen + i don't care much + :) + AFAIK Debian is planning to let users chose the init system, the + discussion is only on what should be the main/default one; but I might + have misunderstood it + that was one of the possibilities, yes + maybe we could help the debate by agreeing on whether or not we + consider supporting ports is that important, as port maintainers, + considering we'll probably keep the ability to use sysvinit scripts + anyway + and making that decision known + and stating that we consider openrc an worthwile incremental + improvement, whatever debian decides to do wrt to the default init system + for example, yes + we should discuss that with youpi and thomas + tschwinge: ^ + when they have some time later :) + + +## IRC, freenode, #hurd, 2014-01-24 + + Good news, a successful boot of Hurd with OpenRC: + http://paste.debian.net/78119/ :-) + ramains to fix the false negative for checkpath -W + remains* + not bad + + teythoon: btw, the segfault happens when starting the bootlogd + service: + end_request: I/O error, dev 
02:00, sector 0 + Segmentation fault + gnu_srs: nice progress :) + i've never seen bootlogd crash like that, though i + i'm not sure it is installed + how can I check / ? it is mounted RW and even if cd to /run which + is on tmpfs, fsysopts --readonly fails: + :fsysopts: /: --readonly: Device or resource busy + I don't have bootlogd installed the segfault is at: + checkroot.sh: hwclock.sh mountdevsubfs.sh hostname.sh hdparm + keyboard-setup + called by /etc/rcS.d/S06checkroot.sh + you should probably create this directory that it fails to + create early in the boot process + + +## IRC, freenode, #hurd, 2014-01-25 + + braunr: being Linux-only is *part* of the "tight coupling" + strategy of the systemd cabal + of course you could implement all the Linux-specific interfaces on + other systems; as you could implement any other interfaces relied upon or + provided by systemd components... + (this is in fact Lennart's favourit cop-out argument whenever + someone raises concern about this) + the problem however is that such alternative implementations + usually have prohibitive costs + yes i know + (and Lennart knows that perfectly well... he doesn't exactly take + pains to conceal the fact that it's a cop-out) + their whole point is to create a tightly integrated stack of + monopolistic components, giving a shit about any possible alternatives + this does have an obvious appeal: it *significantly* reduces the + cost of innovation within their stack + at the same time however it kills the traditional innovation + driver in the free software eco-system, which is competition among + interchangable components + quite frankly, it makes little sense that other distributions are + embracing systemd in droves: the tight coupling pretty much turns them + all into Fedora look-alikes, questioning the point of their very + existence... + what is dmd? 
+ as for Debian considering fringe kernels in their decision, I + think it makes *perfect* sense: the real value of Debian is precisely the + fact that it supports so many different things, making it a good base to + build upon + (it's just unfortunate that many Debian developers do not realise + this, and instead try to compete with user-oriented distributions...) + zacts: daemon managing daemon? yet another new init system... + yeah + didn't know if you have an opinion on it vs systemd + and whether or not hurd will use it.. + hm... not sure whether I do ;-) + antrik: one could argue an init system is hard to make + interchangeable without also making it quite poor in functionality + the GNU system uses it, right? when using the GNU system with the + Hurd (as it's really meant to be), that would obviously mean using DMD + with Hurd. though I'm not sure whether anyone has actually tried that + combination ;-) + just to make it clear, i'm totally not in favor of systemd + i'm just trying to measure the value of an interchangeable init + system here + value versus cost + why is it bad to try to compete with user oriented distros ? + braunr: I suspect most of the really good things about systemd + could be kept while making it somewhat more open at fairly little cost... + braunr: because that's not Debian's strength -- and never will be + trying to compete in this space too hard is bound to fail, at only + bears the risk of loosing the actual strengths + antrik: sounds true + hm... thinking about it, I'd say it actually makes more sense for + the init system to be distribution-specific than kernel-specific... + that makes sense + but systemd isn't just an init system + it's really the distribution's job to create a well-integrated + system. and basically, that's what the systemd cabal is doing for + Fedora... 
+ it's just problematic that they have so much influence in + important upstream projects, that they are basically killing any chance + for others to integrate things in different ways + antrik: agreed + the tight coupling i refer to is about the init system and the + upstream projects you mention such as udev, acpid, console-kit, etc.. + yeah... and GNOME + is it really that coupled now ? + don't really know; but judging from remarks people make, it must + be pretty bad + this reminds me of the talk on gnome 3 last year at fosdem + it would have been hilarious if gnome wasn't such an important + project + (specifically, GNOME is now pretty much tied to logind AIUI, which + is not entirely inseparable from systemd -- but again, the cost is + prohibitive...) + i don't get what all the hate here is about ... + in fact, certain people used that as an argument why Debian must + switch to systemd as init, as they are already pretty much forced to use + various of the other coupled components anyways, and trying to decouple + them is too costly for Debian... + teythoon: hate ? here ? + i mean they don't do this for fun, they actually provide + something of value, right ? + some value + teythoon: they? + but they remove the kind of value that made free software evolve + the way it did, as antrik said + the evil cabal around systemd ;) + I didn't say "evil"... not explicitly at least ;-) + then again, if you are runnign linux/gnome3 and plug in a second + monitor, that one is automatically activated + yes, that's what they want to achieve + that's what they achieved + i mean, they targetted that, it's not a side effect + and anyone not happy with how they did that can surely provide a + nicer solution ;) + teythoon: as I said, there are clearly good aspects to what they + are doing -- but at the same time it's very dangerous to the free + software eco-system... + teythoon: not easily + antrik: i don't buy that + i do + braunr: yes, not easily. 
that is kind of the point, right ? + pulling projects such as gnome into a category of kernel specific + applications is dangerous + teythoon: well, considering who they are and the means they have, + they could have spent the time to do it right for everyone + maybe + err... activating a second monitor is not in any way tied to + systemd or related compontents... I think you are talking about a second + seat + that's another killer feature they achieved, yes + (which is nice, but quite frankly, a niche use case in my book...) + maybe you're not the typical user + I'm not. but the *typical* user definitely doesn't care about + multi-seat + if you say so + antrik: when you say it's dangerous what 'they' are doing, what + do you mean exactly ? + dangerous for whom ? + asides from schools in developing countries, who try everything to + save on IT costs, I really can't think of many users for multi-seat... + (maybe schools all around the world trying to cut down their + costs?) + or like everyone, here, a $30 dongle that gives you an extra + workstation, how awesome is that ? + teythoon: see above: they are killing the ability to combine + interchangable components, which has always been a core asset of the free + software ecosystem + antrik: so gnome is going for systemd, and gnome loses the + ability to be used w/o systemd + why do you care ? how does this affect the whole ecosystem ? + i really don't get why everyone is getting so upset about this + teythoon: who cares about a dongle giving an extra workstation? + the remaining users of workstations are either corporate -- who prefer + dedicated boxes for organisational reasons -- or gamers, who want all the + power to themselves... 
+ teythoon: well gnome is kind of one of the major destkop software + in the free software world + s/one of// + antrik: you stated that you havent used gnome3, yet you have an + opinion how tightly it should be coupled with systemd or linux + people who haven't used systemd or upstart have an opinion about + which one should be preferred + teythoon: why do you think people shouldn't think about systems as + a whole ? + teythoon: actually, I am using it (for some value of "use") -- + though in legacy mode, as my hardware can't run the new bling... + in that case, people shouldn't be allowed to vote, because that + would require them to be politicians .. + it's okay to think about that + i don't think it is + teythoon: but seriously, whether *I* have used it is quite beside + the point. I have no illusions about being a niche user + people don't need to use something to actually understand it + but i cannot stand all the whining lately in the free software + world... + whining isn't fair + i mean, the word + y ? + it's a big problem and complaining to force a debate is important + yes, but "they" are solving problems, and everyone is + complaining for one reason or the other + they are also creating problems + and not everyone is complaining + as opposed to offering alternatives + that's a major issue, a lot of people are favorable to these + changes + and if you don't like what "they" are building, you are free not + to use it, no ? that's a freedom too ;) + no + you aren't + what ? + that's precisely the point + you'll be de facto forced to use it if you want to keep using the + rest + i'm free not to use gnome3 + you won't be free from using linux if you want gnome3 + what kind of argument is that ? 
+ i'm abusing the word freedom + because it has no clear meaning in practice + as antrik said, it's about interchangeability and portability + and alternatives + accepting the way systemd is designed is a major shift towards + making linux its own standard, away from the rest + and the way it's done isn't thought to easily allow the + alternatives to keep up with the changes + we agreed the other day that they shouldn't create ad-hoc + interfaces like they do, yes + well that's the whole point + you just talked "about the way systemd is designed" + they could invest some more effort to make well designed + interfaces that allow changing both the dependencies and the services + provided + how is that related to bad interface design ? + for me, it's almost a synonym + and we discussed it + aren't tightness of coupling and quality of interfaces + completely orthogonal ? + it is designed with a narrow set of apparently company directed + interested towards a single system, a single distribution even, and + nothing else + no + absolutely not, when it's about something that should be + interchangeable + an interface that forces tight coupling is of low quality to me + braunr: they claim it's not actually company-directed... and I + tend to believe them on *that* point TBH + antrik: this would have been a valid reason at least + teythoon: it's just not right that some people can no longer use + major pieces of free software just because a tiny but highly vocal cabal + decides to disrupt the whole ecosystem + what are you talking about ? you are free to use older versions + of the software + i's not technically feasible + or it would require forking to maintain + again, it's the start of a rift + but, if the gnome people want to go into that direction, who are + you to say that they shouldn't ?? that's what i get the least about this + kind of argument... 
+ i'm part of the free software community + more accurately, the free unix-like community + and you are actively developing gnome... ? + if they want to get out of this community, they'll hurt it, and + themselves + do you understand what a rift is ? + but that's their choice, no ? + a major division ? + so what ? + it doesn't mean it's a good one + you pick the desktop environment you like next best and be done + with it ? + it's almost public service at this point + what if they all do the same thing ? + err + they don't + you won't be free to do what you want because the technical + possibility will have disappeared + kde might + if only to compete with gnome + well, if you don't like hte direction a project is taking, you + fork it + that's what happened + exactly .. + why the long faces ? + forks increase complexity and reduce manpower + fork == division + forking in the free software community is normally a last resort + huh ? since when is this considered a bad thing ? + it's not a bad thing per se + it usually implies a bad situation + < braunr> fork == division + and division == rift + think of these situations that were caused by stupid drama and + lead to the duplication of a lot of effort + openbsd, eglibc, jenkins, to name a few + i don't + why would i ? i never created these forks + it affects the community as a whole + but the people who did thought it was necessary + the fact they could do it is good, the fact they had to do it + isn't + they were usually forced by the situation + and often by the stupidity of other people + someone forced someone else to fork a project ? with a gun or + something like this ? + i don't buy this ;) + of course not .. 
+ eglibc was forced by the inability of drepper to accept a whole + class of patches + openbsd because theo de raadt has some huge ego + for jenkins, it was a licensing issue iirc + nothing technical at all + nothing in the interest of the community + err + it brings diversity + no + netbsd versus freebsd brings diversity + i thought that was a good thing + openbsd was just agotistic crap + ego* + if there is no diversity, why should stuff be interchangeable if + there are no alternatives? + and netbsd and freebsd aren't exactly forks, they're both bsd + based but had different goals from the start + that's not what i'm talking about + eglibc isn't exactly a new libc + it's glibc+the stuff that should have gone into it + teythoon: the stuff the systemd cabal does builds on the work of + thousands of projects and people; yet they act as if the don't own anyone + anything, and it's fine to boot out large parts of the community whos + work they are building on + iceweasel isn't a whole new firefox + most often, alternatives aren't forks of one another + if they are, they have diverged a lot + antrik: that is your interpretation, and i respectfully disagree + with it;) + and usually have different goals + that's diversity, and i'm very ok with it + (being a hurd guy and all) + but forking because of decisions that prevent alternatives is a + very bad reason to fork + again, who are you to tell a project (say gnome) what they + should do or not ? + that question makes no sense + we're trying to think objectively + forget who we are + think about what should be done + no such thing ;) + ok well, in that case, i'm a very smart person who knows a lot of + things, and people had better do what i tell them ;p + satisfied ? :) + yes + that's much better actually + not really .. + it's more honest + no it was sarcasm + what was honest are the arguments i explained + why care about who says them ? + i do + teythoon: there is not much interpretation in there really. 
some + of their own statements are quite explicit... + damn non scalable kernel .. + who is "their"? what statements ? + teythoon: when building glibc, there are so many nodes to fake + that ext2fs+fakeroot allocate enough ports to starve kernel memory ... + if i were mr. gnome3 and you would tell me that i should cuddle + with systemd b/c that's bad for one reason or another, the first thing + i'd like to know is who is telling me that + teythoon: why not solely consider the argument ? + braunr: yes, i can imagine fakeroot doing that + teythoon: Lennart and his friends. not sure how much of these + statements I have seen written down -- part of it I heard myself from + their own mouths + braunr: b/c maybe i like to develop my project in the direction + i want + that's unrelated + and if anyone disagrees, she may fork + this is a debate + why ? + so now we are debating what i may develop or not ? you lost me + ;) + a way to reach consensus + many people are discussing so that projects like debian and gnome3 + make the best decisions + a naive way to explain it is that the result is the sum of what + everyone likes and how louds he speaks for it + sure but you are not a gnome developer, no ? + no, but again, i'm a free software community member + and this affects the whole community + because gnome3 is a major software component used by a lot of + people + well, gnome at least + so the gnome project needs to seek consensus with everyone of + the free software community ? + no + that would be unanimity + but wrt to the systemd integration ? 
+ siding with systemd is starting to get away from the free software + community + or, by bringing a lot of people along, dividing it + that's your interpretation + yes + always + you don't have to say it, we're not doing raw science here + it's implicit + i think it's important to point that out and make it explicit + you made it several times + we got the point + what matters in the current discussion is whether you agree or not + and why + and this will be your interpretation too + and we'll see if it's convincing + but, from experience, i expect noone will be convinced ;p + ^^ + the issue is too tied with the core goals we have in mind + but why does it matter whether i agree or not + that's my point actually + you seem to have a problem understanding the issue, i was trying + to convince you there is one + so, if i want to achieve that, it matters + what core goals ? + basic dialectic + well, for example, for me, i want people to think of the system as + a whole + i want something effective, technically very good, and that + respects user freedoms + i also want alternatives, i won't explain why, let's say it's + obvious + i agree + well, systemd people don't think of the system as a whole + here, what i call "system" is very large + it would almost equal society + i understand why they do that + they have the right to do that + but then i could say i understand why people make proprietary + software, and they also have the right to do it, i still won't approve it + it contradicts my personal goals, my personal view of how things + should be + i completely agree + but then again, what you said now and the way you said it was + very different + maybe, it's 3am, i'm sick and exhausted :) + more abstract + when i give an opinion + actually, when anyone gives an opinion + i consider it implicit that it's their point of view alone + they're not enforcing anything + merely speaking out + people tend to overestimate the importance of their own opinion + hm i wouldn't 
say so + and that's probably why the "who" doesn't matter a lot to me + it would matter if the person in question had real power + and his opinion could have a strong influence + in which case it wouldn't be overestimated + i could say what i think to systemd people + teythoon: quite frankly, I'm not sure what you are complaining + about. the systemd followers are trying to impose their opinions on + various projects. other people (including braunr and me, among many + others) are voicing counter-opinions. what's wrong with that? + but i'm pertty certain the weight they'll associate to what i tell + them will be very low :) + antrik: he called it "annoying whining" + i think it's the only problem + braunr: I don't think the systemd people associate much weight to + *anything* others say... ;-) + heh :) + to make an historic analogy + it seems to me they're repeating the same mistakes others did + during the unix wars + antrik: but when you say "the systemd followers are trying to + impose their opinion on various projects", don't you dismiss the + possibility that the gnome3 people just want to make external displays + hot-pluggable? + of course they do + don't you dismiss that proprietary software author just want to + make money ? + no + well, if that's the only thing you keep in mind to make your + opinion, you'll miss important points + that is an example of course + they're sacrificing interchangeability and starting a possibly + major rift in the community for hot pluggable displays + it may not be worth it + not supporting stuff like that might make the whole ecosystem + obsolete + i'm not saying it shouldn't be done + i'm saying it should be done while sacrificing other important + things + it would just take a little mort effort + and even if it wasn't done + that's what i meant by "whining" + no offense + what is the problem of it being "obsolete" ? + but talk is cheap, offering alternative solutions is hard + isn't unix obsolete ? isn't xorg obsolete ? 
+ hum no + no one did, so they implemented their nice features + the point isn't to offer alternative solutions + it's to make them possible + or at least, not deny their technical feasibility because they + don't care + teythoon: see, "interchangeability and starting a possibly major + rift" don't look to conflict with your personal goals + that's the point where i think i can no longer do anything to + convince you + so i'll head to bed :) + heh, me too :) + honestly, i don't care a lot + i mean + it won't change much for me + but again, my brain is wired to think of things as a whole + on that note, good night :) + good night :) + teythoon: again, IT'S NOT ABOUT DISPLAYS + believe me, I do have some understanding how display hotplugging + works + also, the problem is not that gnome3 supports logind. the problem + is that gnome3 works *only* with logind now AIUI + there is yet another way to state the fundamental problem + there is a kind of social contract among free software projects: + every maintainer takes a reasonable amount of extra effort to support use + cases beyond his own. in return, his use cases are supported by other + maintainers + the systemd guys are breaking this contract, by explicitly + refusing, up front, to take *any* effort to accomodate other projects' + needs + + +## IRC, freenode, #hurd, 2014-01-28 + + teythoon: + https://plus.google.com/+LennartPoetteringTheOneAndOnly/posts/EgKwQV8te7s + azeem_: pffff :) + heh + which reminds me + if we want to state our position wrt the default init system + debate we should probably do it right now + yes + ml or collaborative editor ? + well, tech-ctte chair called the vote only for the default init + system for the Linux-ports + the vote got shot down on technicalities, but that might stand + I think that is a good thing, cause it implies that not one init + system has to be adopted across all ports + we talked the other day that it might make sense just to state + our view and our needs + sure. 
+ I think what's needed is (i) an init-system agnostic system to set + the enable/disable state of services (ii) possibly mandating a .ini-style + config file along the style of whatever init system gets chosen as + default for Linux, to be used by non-Linux init systems as inut + input* + just my 0.02 EUR + uh + looks overkill + i was thinking more along the lines of 1) we have never used the + default debian init system and are cool with not using the default in the + future, 2) we intend to use sysvinit in the future, 3) to that end, we + ask the init script machinery to be left in place + but then, people managed to write stuff like libvirt + so who knows + 4) we will help maintaining it as part of our porter effort + i agree with teythoon + 5) we look forward to using openrc as incremental improvement, + complementing our sysvinit boot solution + yes that would be nice + i'll write a draft to debian-hurd, ok ? + openrc now has a dependency loop resolver, so parallel would + work:) + so is insserv, isn't it ? + there were complaints on openrc + https://bugs.gentoo.org/show_bug.cgi?id=391945 in the tech-ctte + discussions, now fixed + gnu_srs: please accept the fact that openrc will not be picked by + the tech-ctte for the Linux ports + azeem_: I do, I'm referring to arguments during the discussion + (history) + sure, just checking + teythoon: your post is being used to portray systemd cgroups + treatment as the right way… + ArneBab: so ? + it probably is the right way + that's not the problem + do you want to clear that up? (do I remember correctly that you + did not like that way?) + we don't like the cgroups interface + i will + not the feature + braunr: that’s what I meant + exactly + the feature amounts to resource containers in the hurd critique + ... + we do want that too :) + anatoly: you want them to rewrite cgroups ? + err + ArneBab: ^ + +[[dbus_in_linux_kernel]]. 
+ + i've been thinking + maybe the magic write stuff isn't that bad after all + :) + i was thinking the same thing actually + i mean, it's not the nicest thing, but it shows how flexible our + solution is + the hurd is a lot about glue code already so why not + the problem is that there is no way to test cgroupfs + the main user is systemd, and it requires tons of other stuff + right + any other user of cgroups is also probably using other + linux-interfaces too + + +## IRC, freenode, #hurd, 2014-01-29 + + About openrc having a dependency loop resolver: : so is + insserv, isn't it ? + I found is_loop_detected() in insserv/listing.c but that one just + exits without telling where the loop is + + +## IRC, OFTC, #debian-hurd, 2014-01-29 + + * youpi trying the new sysvinit + hopefully we'll then be able to at last use the proper ifup/ifdown + debian way for networking :) + teythoon: why leaving hurd's runsystem by default rather than + sysvinit's? + ah, another issue, too, now that /dev/vcs appears in /proc/mounts, + umountfs would umount it + ideally umountfs would not umount passive translators + we could blacklist /dev/vcs in umountfs, but the same issue would + happen for user-defined translators in their own home, for instance + + +## IRC, freenode, #hurd, 2014-01-30 + + booting with the new sysvinit and openrc versions: works:), but + only in recovery mode:-( Hangs before INIT: version 2.88 booting + after start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1] + exec init proc authtask c1120dc8 deallocating an invalid port 134517370, + most probably a bug. + related or an openrc problem? will test with sysv-rc + I don't have such issue with sysv-rc + k! + shouldn't recovery mode mean starting in runlevel 1, I get + runlevel 2? + it should + gnu_srs: recovery mode normally mean single user, which is between + rcS and rc2 + I get INIT: Entering runlevel: 2 + rcS.d should really have been named rcboot.d, as that is really what + it is. 
+ ah, right, recovery is not single + (single as in init 1) + runlevel 1 is not single user either. it is more a gateway into + single user. see /etc/init.d/single to see what happens at the end of + runlevel 1. + init 1 and init 2 seem to work + well, the openrc dependency loop detector has found an init + script loop, maybe it has to be fixed? + disabling the hurd console solved the dependency loop problems, + thanks openrc;-) + (have to dig deeper to see where the loop is, and how to solve + it) + + +## IRC, freenode, #hurd, 2014-01-31 + + Hi, does the hurd console work with sysv-rc: In openrc I get with + #console -d vga -d pc_mouse --repeat=mouse -d pc_kbd --repeat=kbd -d + generic_speaker -c /dev/vcs + console: Console library initialization failed: Not a directory + gnu_srs: yes, it works with sysvrc + gnu_srs: check that /dev/vcs has the appropriate translator + record + showtrans /dev/vcs: empty; on another box: /hurd/console + yes, fix that and your console will be fine + settrans /dev/vcs /hurd/console? + or should it be active? + no, set a passive translator record so that this will be + persistent + something is wrong: when starting the hurd console screen is + blanked (and hangs) + can I get the hurd console when running with the serial console + (to see boot messages)? + gnu_srs: yes, you can + will try that image then, tks:) + teythoon: how to create all underlying directories? ls /dev/vcs: + 1 2 3 4 5 6 + don't, /hurd/console takes care of that + is settrans /dev/vcs /hurd/console correct? + yes + What are those underlying directories representing ? + the hurd console is a console multiplexer + bringing multiple virtual consoles to the hurd + # showtrans /dev/tty1 + /hurd/term /dev/tty1 hurdio /dev/vcs/1/console + aha: console -d vga -d pc_mouse --repeat=mouse -d pc_kbd + --repeat=kbd -d generic_speaker -c /dev/vcs + task c1120e70 deallocating an invalid port 1782, most probably a + bug. + teythoon: Is it that /dev/tty1 has multiple translators ?
+ no + exactly one translator is bound to any given node in the vfs + something is strange with the hurd console: booting with it + enabled still runs the mach console, halting: + http://paste.debian.net/79438/ + what is strange about that ? + when starting the hurd console: task c1120e70 deallocating an + invalid port 1782, most probably a bug. + so ? + and the paste when halting: twice + that is a known issue + with the hurd console? + how do you know it's the hurd console ? + that message comes from the kernel + currently, it is not possible to tell which process is + responsible + b/c the task is given as a pointer to the kernel task structure + not as a pid + I don't, it is triggered by it at least + currently there is no way to map the former to the latter + why do you think it's a problem ? is something not working as + expected ? + maybe a reproducible way to hunt that bug! + we have one already + it happens every time the hurd boots + yes, hurd console does not start, even when enabled:-( + then please say so ;) + I did: (11:23:30) srs: something is strange with the hurd + console: booting with it enabled still runs the mach console, halting: + http://paste.debian.net/79438/ + where do you say that the hurd console did not start ? + maybe it is easier to hunt the bug in an already booted system + you just said that the mach console is still active, which it is + even if the hurd console starts + yes + please start the hurd console by hand + -d current_vcs -c /dev/vcs -d vga -d pc_kbd --keymap us + --repeat=kbd -d pc_mouse --protocol=ps/2 --repeat=mouse + err + /bin/console -d current_vcs -c /dev/vcs -d vga -d pc_kbd + --keymap us --repeat=kbd -d pc_mouse --protocol=ps/2 --repeat=mouse + when I log in I have the mach console not the hurd console + yes, log in as root, then run that command + I've done that: (11:10:27) srs: aha: console -d vga -d pc_mouse + --repeat=mouse -d pc_kbd --repeat=kbd -d generic_speaker -c /dev/vcs + please read?
+ and you discovered in that process that /dev/vcs lacked a + translator record + did you run it again after fixing that ? + the reply was: (11:10:27) srs: task c1120e70 deallocating an + invalid port 1782, most probably a bug. + well, if you are feeling that what i ask you to do is + unreasonable, i'm not sure how i can help you + yes, the translator was running! + you could hunt down the port deallocation bug, that'd be awesome + and most welcome + but i don't believe it is causing your console malfunction + I did what you asked for?? + I'll do it again! + ok, now I don't get that error, but still no hurd console? the + process is running, logging out and then in, no hurd console. + not possible in serial console? + no, the hurd console is displayed using the graphic card + you asked for that with -d vga ;) + not sure if there are any other display drivers + when you asked whether you can use the serial line, i assumed + you used both qemu's graphic terminal and a serial console + try kvm ... -serial telnet::1236,server,nowait, then use telnet + localhost 1236 to connect to the serial console + then, you can start the hurd console over the serial console and + see whether that worked + OK; that's what I asked before. I tried with the graphic one, + I'll try again + telnet output is empty + frozen + did you start a getty there ? + in hurd? + b/c if you dropped the console=com0 argument from your gnumach + command line, the mach console will be put on the vga screen, not on the + serial console + I dropped console=com0 from grub.cfg, yes + ok + so simply no one is talking to the serial port anymore + did you try to start the hurd console ? + I did before, can do it again + starting the HC blanks the screen, and freezes the vga output:-( + ssh still working + hm + try ps Ax | grep tty, are there any term servers running for + /dev/tty1..6 ?
+ plenty of them: http://paste.debian.net/79442/ + good, even gettys are there + and the console translator runs + hm + root 1224 5 7 months /hurd/console + root 1227 1226 7 months /bin/console -d vga -d pc_mouse + pc_mouse -d pc_kb... + yes, everything looks good + just to be sure, you are currently using qemu's graphical + frontend, right ? + yes + hm :/ + gnu_srs: do you see loginpr processes ? + nope + hum + this strikes me as odd + on my system, i see no gettys but only loginpr processes + this is b/c the hurd getty does little other than to print some + text and run the login program + but on your system the getty sticks around + is /sbin/getty really the hurd getty? it's easily recognized by + its crappiness: + /sbin/getty --help || echo $? + 1 + 1 + hm + still funny though + you could try to run the hurd console, then run a getty manually + e.g. /sbin/getty 38400 tty1 + from the ssh login? + yes + then the graphic display is back showing the login prompt:P + weird + well, so most things work + that's a good thing + funny that hurd's getty should get stuck like this + and the terminal is hurd:-) + any chance you can produce a stack trace of one of your getty + processes ? + how? + gdb --pid=the_pid /sbin/getty + then, do bt like usual + so you mean tty2-6 are broken? + no + it's just for some reason your gettys do not behave nicely when + run from init + from running tty2: bt #0 0x01087b09 in ?? () + #1 0x00000000 in ?? () + not much + hm :/ + indeed + our getty logs to syslog, can you see anything of interest here + ? + Jan 31 12:00:46 debian-openrc-20140123 rsyslogd-2066: could not + load module '/usr/lib/rsyslog/imklog.so', dlopen: + /usr/lib/rsyslog/imklog.so: undefined symbol: klogAfterRun + [try http://www.rsyslog.com/e/2066 ] + nothing tty related + gnu_srs: oh, i just noticed, please look into auth.log, the + getty stuff ends up there + teythoon: http://paste.debian.net/79465/ + well, that is interesting :) + /dev/tty1 not a directory?
+ for instance, yes + it says bad syntax if it was invoked in the wrong way, i.e. not + with exactly two arguments + that might have been you yourself, right ? + with getty --help i mean + for the not a directory message, please verify that + # showtrans /dev//tty1 + /hurd/term /dev/tty1 hurdio /dev/vcs/1/console + and stat /dev/vcs/1/console says it's a character special file + I used exactly: /sbin/getty --help || echo $? + yes, that accounts for that bad syntax message + what so bad about that? + showtrans /dev//tty1 + /hurd/term /dev/tty1 hurdio /dev/vcs/1/console + getty is so simple minded that it doesn't really parse its + arguments + stat: http://paste.debian.net/79469/ + looks nice + everything looks nice, i'm at my wits end here + and everything works OK with sysv-rc? + yes + by the way, are you using the sysvinit init scripts or something + openrc related ? + openrc use all the scripts in /etc/init.d + actually, could you try to kill -HUP 1 ? + BTW: the dependency loop detector has found many loops in those + scripts + kill -HUP 1: nothing happens + ok, try to kill one of those gettys and see if the one that + respawns works + then again, the getty should try to reopen the device every + minute until it succeeds + getty tty1 and tty2 disappeared? kill -HUP tty3 respawns + immediately + now no getty processes are left? + /dev//tty4: Not a directory etc? + sorry, i should have expressed myself more clearly + kill -HUP 1 sends a SIGHUP to sysvinit, this makes it reload + it's configuration + when i said kill some getty, i meant just kill some_pid + when you said 'kill -HUP tty3 respawns immediately', did you + mean you killed the getty that was listening on /dev/tty3, and then a new + one appeared and you got a login prompt at tty3 ? + a new pid appeared, the login prompt is on tty1 + this one? /hurd/term /dev/tty1 hurdio /dev/vcs/1/console + i'd like to invite you to look at daemons/getty.c + not a big piece of code: anything specific? 
+ no, just look what it roughly does + not a directory is not coming from that code + correct + it execl-s login + yes + inevitably + but you do not observe this + how come when they are running? + this is the question that you will have to answer in order to + make any progress + I killed only one of them: kill -HUP 1031 and they all + disappeared + i thought along these lines: the most obvious way to stall getty + is if it never exits that loop + so i guessed it might be failing to open the device + we already observed that getty works fine if invoked by you + manually + the question thus is, what is different when getty is invoked by + init ? + if a process started by init in this way is killed, init will + restart it + please note, that if anyone says kill that process, she means + send a signal that results in process termination + and while sighup causes processes to die if the signal is not + handled, it is not the ideal signal to kill processes + b/c some processes handle sighup + like sysvinit, which reloads its configuration + many daemons do this + see 'man 7 signal' for how signals affect processes + sorry, have to leave for now, bbl and thanks a LOT so far:) + ok :) + you are welcome :) + teythoon: I'm back but cannot spend to much time on this + tonight. Maybe you should try it yourself, do you want another image on + my box? + it'd be nice if you put your packages somewhere + there are no special packages sysvinit (-46) and openrc (-8) + surely openrc with some patches ? + from #openrc: (17:37:41) srs: start with sysvinit and make it + work first! + (17:28:43) srs: zigo: Then I copied that working image to + another, and changing hostname, and continued from there. 
+ openrc with the hurd patches for /lib/rc/sh/init.sh (v8 should be + available from experimental by now) + sweet :) + gnu_srs: maybe it was just some weird issue with your system + i just switched to openrc and everything seems to just work + i'll redo what i just did more cleanly to get a clean test vm... + nice:) + teythoon: And you got the hurd console? + heh, i believe so >,< + i didn't see it b/c i was using --nographic + but ps Ax looked alright + hrm + gnu_srs: i can reproduce your trouble, umount still strips the + translator record from /dev/vcs + at system shutdown time + so that's the reason. Additionally I have to issue halt twice + from a ssh login, see http://paste.debian.net/79517/ + funny indeed + gnu_srs: i can reliably recover the hurd console by doing + settrans /dev/vcs /hurd/console && service hurd-console restart + && pkill getty ; sleep 5 ; pkill getty + humm, as you say, halt doesn't work + + +## IRC, OFTC, #debian-hurd, 2014-02-01 + + I've just uploaded a new new sysvinit package to experimental, with + all the latest hurd fixes. + + +## IRC, freenode, #hurd, 2014-02-01 + + 17:53:28< teythoon> settrans /dev/vcs /hurd/console && service + hurd-console restart && pkill getty ; sleep 5 ; pkill getty + teythoon: Any ideas on how to solve this? + gnu_srs: yes, i have that on my todo list + so it is not an openrc problem? + gnu_srs: no + + +## IRC, freenode, #hurd, 2014-02-01 + + start ext2fs: Hurd server bootstrap: ext2fs[gunzip:device:rd0] + exec init proc au + thtask with pid 6 deallocating an invalid port 134517370, most + probably a bug. + :) + pid 6 is exec o_O + teythoon: Nice to see that you added pid numbers for error + print-outs:) + so the boot error comes from the exec sever? + so it seems + server* + have you found where? 
+ no + + +## IRC, OFTC, #debian-hurd, 2014-02-02 + + but when I install the new packages, and run update-alternatives + --config runsystem to select sysv, the boot fails with: start ext2fs: Hurd + server bootstrap: ext2fs[device:hd0s1] exec init proc authtask c1128dc8 + deallocating an invalid port 134517370, most probably a bug. + was that the wrong approach? + is there some way to recover when hurd fails to boot with sysvinit? + I was able to boot in recovery mode. :) + and this time sysvinit booted. saw a segfault message just after + sysvinit started, no idea what caused it. + looks like it is startpar that segfaults. + looks like the invalid port message comes every time, no matter if + the boot hangs or not. + I was wrong. it isn't startpar segfaulting, it is something in + rcS.d/. + bootlogd is the process segfaulting at boot. + looks like the boot success rate is 30% or so. + reported bootlogd problem as . + I really miss valgrind. :) + pere: yes, the invalid port message is from the exec server + pere: i see the hurd boot process hang sometimes, no matter if i + use sysvinit or not + i believe it's a race condition in the ext2fs, not sure though + teythoon: but did the frequency of the hang go up with sysvinit or + not? to me it seems like that. + pere: yes, i believe it got worse + what hangs is fsysopts --update / + runsystem.sysv does that quite early + able to debug it? + I like the fact that runsystem.sysv sets up ip at boot time, while + with .gnu, I have to run dhclient /dev/eth0 manually + it is quite confusing that hurd got two init processes with + sysvinit. one as pid 1, and another that seems to be the parent of all + internal stuff. perhaps the latter could be renamed to hurd-system or + something like that? + "sleep 0.2 # Work around a race condition (probably in the root + translator)." does not look too good... + (I increased from 0.1 to see if it helps me. :) + did it ? + i plan to rename /hurd/init to /hurd/startup + +[[hurd_init]].
:) + five boots in a row hung. :( + still no go... + are you using a vm or real hardware ? + vm + kvm, via virt-manager, to be exact. + me too + on the sixt boot, after waiting a long time between try 5 and 6 + (gave up a bit), it booted. + sleep 1 did not help either. + :( + well, it's not *that* bad for me + in fact recently it has been a lot better + you might try my packages + pere: here http://darnassus.sceen.net/~teythoon/hurd-ci/ + teythoon: tested it, and it seem to solve the problem. + is also rid of the strange error at the start. + teythoon: your packages even work without the sleep 0.1, at least + some of the time. :) + hm, but the success rate without sleep 0.1 is very low. I was able + to boot once, and never again. :( + pere: yes, i fixed the spurious port allocation today :) + pere: nice to hear that the sleep 0.1 i put in does increase + your chance to boot as well + + +## IRC, freenode, #hurd, 2014-02-02 + + gnu_srs: i found the spurious port deallocation :) + Cangrats:-D + trouble is, i introduced it >,< + Congrats* + Ah, you did? + gnu_srs: yes, in debian/patches/exec_filename_fix.patch + + http://darnassus.sceen.net/gitweb/teythoon/packaging/hurd.git/commitdiff/6da3e0be8fde0594bd84a13536d9d93048186790 + * teythoon . o O (diffs of diffs are trippy :) + + +### IRC, freenode, #hurd, 2014-02-03 + + teythoon: oh nice, you found that bug :) + braunr: yes, once i knew where to look it was easy to fix ;) + + +### IRC, freenode, #hurd, 2014-02-05 + + i wonder why the port deallocation bug made the system hang when + the libc was compiled with the newer gcc + teythoon: so it was indeed the problem ? + braunr: youpi said so, yes + oh right + +[[glibc/debian/experimental]], *glibc 2.18 vs. GCC 4.8*? 
+ + +## IRC, OFTC, #debian-hurd, 2014-02-03 + + + http://people.skolelinux.org/pere/blog/Testing_sysvinit_from_experimental_in_Debian_Hurd.html + :) + pere: sounds like your hurd-console isn't running and there is + no getty on the mach console + pere: you could add sth like 8:2345:respawn:/sbin/getty 38400 + console to your inittab + I'd rather wait until the hurd porters get it right in the debs. :) + I suspect upgrading the downloadable image to use the latest + packages also would help a lot. + with upgraded packages, /proc is working and pstree, pkill, top, etc + is working out of the box. :) + + +## IRC, OFTC, #debian-hurd, 2014-02-04 + + I just uploaded sysvinit with hurd support to unstable. :) + + +## IRC, freenode, #hurd, 2014-02-04 + + teythoon: Hi, the segfault during boot is coming from bootlogd, + see bug #737375 + also the output on the console is from there: end_request: I/O + error, dev 02:00, sector 0 + gnu_srs: interesting :) + gnu_srs: i believe the end_request message comes from gnumach + yes, that's just a floppy disk access attempt + might be so yes + it's not a "might", it's sure :) + dev 02:00 is the flopy + k! + + +## [[glibc_IOCTLs]], `TIOCCONS` + + +## IRC, OFTC, #debian-hurd, 2014-02-04 + + Each time I upgrade my hurd box, I cannot login into it ... + No login prompt. + WTF is going on? + How to fix? + zigo: most likely your hurd console is not running and there is no getty started for the mach console + teythoon: How to fix? (note: I already have the partition mounted in a loopback) + Or maybe go in recovery mode? + depends + do you use sysvinit ? + do you use the hurd packages from hurd-ci ? + + +## IRC, OFTC, #debian-hurd, 2014-02-05 + + teythoon: Sorry, didn't see your reply. I just used the Hurd image, + untar it, and apt-get update / dist-upgrade. That's it, nothing more or + less. + teythoon: I obviously would like to install sysvinit, and later + OpenRC. 
That's the reason why I'm running Hurd: to make sure OpenRC works + with it without issues. + teythoon: It seems it "sometimes work" or what??? + I was able to repair it using the recovery mode, it seems. + grrr... + I got this issue again, again and again ... + Sometimes, got the tty1, sometimes, it doesn't appear. + That's REALLY frustrating. + zigo: and yes, the success rate for boot is not 100%. it increases + a bit by using the packages teythoon created at hurd-ci. + apparently some race condition somewhere. + pere: So, I should just try and reboot again and again ? + pere: Is it improving after switching to sysvinit? + once I had to boot six times before I got it running... + I was told that the race involves a call to fsysopts, and that the + success rate with sysvinit was smaller because fsysopts command was + called earlier. I can not confirm nor deny this. + with the latest packages from hurd-ci the success rate is almost + 100% again. + pere: Where do get that? + zigo: see + pere: What's the "update-alternatives --config runsystem" for? + to switch to sysvinit + Right, that's what I was missing then! :) + the new sysvinit version in unstable was built for hurd one and a + half hour ago. so soon hurd users can skip experimental for that. + pere: I've just succeeded in booting with OpenRC! :) + Though this console pb is REAAAALLLYYYY getting on my nerves! :) + Also, any idea why we don't get the nice colorfull output when + booting? + When booting with OpenRC, I've noticed that the dependency loop + detects some loops with the hurd-console thing. 
+ zigo: good to hear that you got it working + the console problem is the following + when you shutdown using sysvinit, the system will run umount -a + it will then mistake some translators (like the one on /dev/vcs) + for file systems and remove their passive translator records + you can fix this by running '/usr/lib/hurd/setup-translators -k + -p' + you can avoid it for the time being by using reboot-hurd or + halt-hurd + teythoon: btw, how often is the hurd boot image available for + download updated? + not very often + teythoon: Can I run '/usr/lib/hurd/setup-translators -k -p' + mounting my hurd image in a chroot? + Hum... + Probably better to do that in the recovery mode, no? :) + dpkg-reconfigure hurd + would be easier to type :) + but we really need to fix that /dev/vcs unmounting + missing working getty and missing symlink from /run/mtab to + /proc/mount are the most serious problems I still see. + The recovery mode doesn't work with OpenRC ! :( + (it does in kFreeBSD and Linux, not with hurd ...) + What happens is that it continues to runlevel 2. + How can I fix then? + pere: missing working getty? + I don't see what issue you are referring to + about the missing symlink, I'm wondering what is supposed to add it + zigo: I don't know if anybody investigated it yet + youpi: yes, after boot there is no login prompt. + * pere have no idea, suspect a script in initscripts. + youpi: I'm reffering to the fact that I have no login prompt after + boot, and that I don't know how to fix, since I don't have a recovery + mode to my disposal anymore. + pere: but is the console started? + (I mean the hurd console) + pere: I suspect a wrong dependency, which OpenRC by the way, prints. 
+ pere: otherwise, unless you have a /dev/console getty in + /etc/inittab, it's expected you don't have a prompt + zigo: add + c:23:respawn:/sbin/getty 38400 console + to your /etc/inittab + youpi: yes, we need to get that fixed + grrrr + * youpi wanted to change the image file on people.d.o + but I can't do that without downloading it on my laptop, to be able + to modify it + I would have been, if people was a hurd system :) + the proper way to fix this is to implement the get_source stuff + and get rid of the heuristic in mtab.c + youpi: nope, no console process running. + then that's why, /dev/vcs got unmounted + I already have a console getty in inittab. got it from the last + sysvinit package + * youpi should have brown-bag-fixed these bugs before this week-end + actually :) + pere: but you don't get a getty prompt on the mach console? I don't + understand why + it does work for me + brown-bag-fixed ? + youpi: Adding that in /etc/inittab didn't fix anything. + yes, ugly hacks uploaded to debian-ports + zigo: even with rebooting? + could you snapshot your screen so we can make sure what you are + actually getting? + youpi: I did it mounting my partition in a loopback... + Then booted up, and still couldn't see the console prompt. + ok, but please take a snapshot, so we are sure what is actually + happening + whether the console starts, etc. + that info passed out of the screen and is not shown after my boot, + at least. + which info? + again, please take a snapshot of the screen + otherwise we are just guessing, and that's never good for debugging + Maybe you'll find this interesting: http://paste.debian.net/80246/ + This is the output of OpenRC booting and detecting dependency loops + in the LSB header scripts. + youpi: the info about the console being started or not. I'll show + you, give me a minute. + zigo: well, that shouldn't be more problems than the dependency + loop already existing between rc.local and rmnologin + youpi: any loop is a fatal problem. 
+ how come the rc.local vs rmnologin is not a problem ? + With sysv-rc in Debian, there's all sorts of loops that are just + silent. + I have not seen that loop on my linux system, so I am unsure what + you talk about. + (the actual issues is simply that all three use Required-start: + $all, and thus all depend on each other) + That's a huge pb IMO. + pere: well, + zigo: show me one? + rc.local:# Required-Start: $all + rmnologin:# Required-Start: $remote_fs $all + Yeah, the $all is just *bad*. + that is no loop. + I do believe we should implement a lintian warning about it. + sure, $all do not behave the way most people expect, and should be + avoided as much as possible. + any other loops? + no + (not that I know of) + youpi: sending you the screenshot via irc. + uh, long time no use dcc send, I don't even know where it sent it + to :o) + ok. aborting and trying another approach. + http://www.picpaste.com/booted-herd.png + ok, so boot didn't actually finish + that's why you don't get gettys or hurd-console (which is last) + there must be some init script hanging in the meanwhile + logging in via ssh show no running startpar process, so I doubt that + is the case. + syslog contain this: Feb 5 10:10:27 hurdtest console[808]: Console + library initialization failed: Not a directory + that is due to /dev/vcs not mounted + but that should have not prevented the boot from completing... + the boot is completed, as far as I can tell. + you can disable the hurd console in /etc/defaults/hurd-console + do you have gettys running? + no such file. + oops, -s + http://paste.debian.net/80251/ + pere: check your /etc/inittab, is there a getty for the mach + console ? + he said yes earlier + oh ok + i wonder why it doesn't show up then + same for me + if the getty cannot open the device, it will loop + ah, I was wrong. the inittab is not the one I thought. the current + one is after a reinstall, while I checked the content before that. 
+ pere: check /var/log/auth.log + there is indeed no console entry in /etc/inittab. I thought it + would be copied into place during upgrades? + not if it exists + iirc + indeed + ah, great. "cp /usr/share/sysvinit/inittab /etc/inittab" and a + reboot fixed it. :) + phew :) + it really should try harder to update the inittab on hurd to a + working one. + didn't i do something like this to fix the getty path ? + yes. that was the code I expected to solve this. + it didn't work ? + well, I had the wrong inittab file... + btw, do hurd have the needed syscalls for bootlogd to work? + i haven't looked at bootlogd yet + would be nice to have a text dump of the boot when trying to figure + out what went wrong. + yes, that'd be nice + + pere: could you blacklist /dev/vcs in umountfs, just like already + done for /proc|/dev|/.dev etc. ? + so at least that case, which is really problematic, gets fixed now, + and not have to wait for another, more hurdish solution + youpi: just send patches to bts, and I'll pick it up from there. + nice. i'll work on the proper solution. bbl + teythoon: Can we add those translators to the exclusion lists in + umount[nfs]? + Sorry, I just noticed youpi's comment. I'm a bit behind. + rleigh: good to see you! are you back to the keyboard? fully + recovered? + Not quite fully, but on the mend, thanks! + :] + rleigh: yeah, good to see you again. I got a burst of energy and + brushed a bit on sysvinit in your absence. :) Even revitalized the + #pkg-sysvinit channel. :) + pere: Yes, I saw all the commit emails flying by! + I realistically won't be doing much for several weeks at least + though, I'm afraid. + no worries. spend your time getting well. :) it would be great to + have you on #pkg-sysvinit, though. :) + I'll join, no worries. I should add it to my irssi config so I + can't forget! + teythoon: serial console always works, right? no matter how + hurd-console behaves. 
+ heroxbd: yes + but you need a getty on it + well, just like on linux :) + yes + almost + on mach, we have the mach console. by default that is put on the + vga screen, but you can make mach put it on a serial port using the + gnumach command line flag console=comX + well, just like on linux :) + understood, thanks! + oh, i didn't realize linux has this as well + teythoon: you'll use it a lot on a embedded system + an* + ok + + plus, seems it can't cleanly umount /, at boot it fsck's it, fixes it + and auto-reboot + it's odd that / doesn't get unmounted, don't you get a message at + "notifying ext2fs device:hd0s1 of shutown" ? + on console last 3 lines on halt are + Deactivating swap...swapoff: /dev/hd0s5: 4193208k swap space + done. + Unmounting local filesystems...done. + INIT: no more processes left in this runlevel + is this on reboot or on halt? + halt + then you should also be getting the "notifying" messages, as well + as "In tight loop: hit ctl-alt-del to reboot" message + it umounts uncleanly on reboot too + if you don't wait for these, there's little wonder it's not + properly unmounted + i waited many seconds, time to rewrite 3 lines above for you for + instance (not a fast typist) + on reboot it's harder but iirc they don't appear as well + * gg0 rebooting again + need to wait it finishes fsck'ing + (i should resoldering my serial cable to get back to lazily c&p) + -ing + many Give root password messages then + Give root password for maintenance + (or type Control-d to continue): + INIT: Id "z6" respawning too fast: disabled for 5 minutes + INIT: no more processes left in this runlevel + i'll wait 5 mins to see what happen + ok another dozen of Give root password and same couple of INIT above + no, just the first INIT + so z6 doesn't work + i.e. /sbin/sulogin (see /etc/inittab) + check out why that is + +[[hurd/translator/mtab/discussion]], *IRC, freenode, #hurd, 2013-06-25*, +*coreutils' `df`*. + + [...] 
depends on coreutils actually building + which depends on putting back a login package from the shadow + source package + are someone on that task? + no idea + IIRC I've mentioned the issue on the lists like months ago + but probably nobody took the tas + k + basically it means fixing any bug that login or su from the login + package would have + and then properly handle the migration from hurd-provided versions + to login-provided versions + and then we would be able to build coreutils + which BTS report is this? + I don't know if any report has been written about it + perhaps simplest would be to build the login package, but not its + bin/login + it seems hurd's getty uses special options of hurd'slogin + that's probably the easiest way to go + + sulogin seems to work fine but it shouldn't even be called: + # Normally not reached, but fallthrough in case of emergency. + z6:6:respawn:/sbin/sulogin + + I suspect a good fix is to provide a new init.d script in the hurd + package adding the symlink for hurd. + + umountfs gets stuck at "Will now umount local filesystem:settrans + -apgf /lib/rc/init.d" + + +## IRC, freenode, #hurd, 2014-02-05 + + teythoon: Any ideas why I have to issue halt/reboot twice to make + the command succeed (from ssh login) + Is it the same issue with sysv-rc? + no + BTW: The segfault when booting came from bootlogd (wrong + parameters, Linux/~Linux), removing that one fixed it;-) + + +## IRC, freenode, #hurd, 2014-02-06 + + teythoon: we really need to find the boot issue for which you added + a sleep 0.1 in runsystem.sysv + apparently I had to move it above the mach-defpager startup, to get + a system that boots most of the time... + + did somebody look at + http://homepage.ntlworld.com/jonathan.deboynepollard/Softwares/nosh.html + ? + azeem: interesting + braunr: was mentioned here: http://lwn.net/Articles/584428/ + " Systemd won't work for them, that's for sure, but nosh as a + systemd unit file compatible alternative could.
" + "I'm also very interested in seeing a discussion where the Debian + Hurd and BSD porters weigh in for themselves" + + +## IRC, OFTC, #debian-hurd, 2014-02-06 + + on halt/reboot it can't remount readonly root because it's busy, what + makes it busy? + by keeping /lib/rc/init.d mounted (like /dev/vcs) it shuts down + properly + I don't know about such directory + so seems that failed readonly remount is not a real problem because + at the end it runs halt-hurd/reboot-hurd which umount root properly + yes + afaiu it's a tmpfs where openrc copies "itself", kind of work + directory + by removing it, it can't continue working + at boot some messages are about its creation/population + why do init.d/hurd-console depend on $all? In most cases, depending + on $all is not giving you want you expect. + because we prefer to start the console (and thus clear all the + screen) only after the boot has finished + otherwise the console output will be messed up by the end of the + boot messages + youpi: there has to be a better way + b/c the way it is now, if one spawns a getty on the mach + console, it will mess up the hurd console as well + well, we do want mach messages printed even with the hurd console, + at least + i once thought that instead of printing them the kernel could + send messages to a registered userspace daemon that could e.g. send them + to syslog + that requires syslog to be working at all + changing $all to $local_fs seem to work fine here. + when the kernel cries out, we'd better always be able to hear it :) + pere: but then you have the bootup messages in the middle of the + console, don't you? + not as far as I can tell. look just the same as before. + well, on my box it seems that it gets to start after other daemons, + by luck + ah, perhaps getty actually clears the tty? 
+ then that would be ok + youpi: i don't think it does + well, somehow something clears the output at least + i thought he hurd console does this + it does on startup, yes + but if it starts before other daemons + the damons startup output gets over it + one sees the console clear the screen, then get daemon startup + messages, and then the screen gets cleared again before the login prompt + appears + interesting, i haven't seen this happening + it seems like it happens when emitting text on /dev/tty1, the + console will then clear the screen to make the way for the new output + and since that happens on getty startup, it happens to be after all + daemon startup + yes, that's what happens + so considering this, I'm fine with starting the console earlier + getting a display glitch seems to have been acceptable on Linux for + years :) + (during boot, I mean) + ok + + anyone else tried openrc? + 15:20 < pere> yes, it did not umount properly. + 15:36 < gg0> reboot or halt? it takes few seconds to actually + reboot/halt since the last message from openrc + 15:39 < gg0> any typo adding such path? + * gg0 likes cross-channel pasting + anyone else keeps getting unclean umounts even after applying + http://paste.debian.net/plain/80386/ ? + gg0: yes, me. worked fine, it didn't shut down properly though + here works like a charm + what do you mean by properly? + i see first it can't remount root readonly but at least by not umount + path in question it continues executing scripts till actually shut it + down with something like {halt,reboot}-hurd + *not umounting + *shutting + for me it did not shut down + you mean don't you get classic press ctrl+alt+canc to reboot message? + yes + from my perspective (and from /hurd/init's), that's not shutting + down + as in it did not call reboot(2) + what are configuration not to miss besides switching runsystem to + sysv one? 
+ *configuration steps + no idea, i did nothing else but to switch to runsystem.sysv and + to install openrc thus replacing sysv-rc + can you paste shutdown messages somewhere? + sure + .o(world is failing, /me can't debug teythoon :)) + http://paste.debian.net/hidden/745071e6/ + in my case i just found out that /etc/init.d/umountfs tries to umount + /lib/rc/init.d where openrc scripts are + what if you set VERBOSE and print REG_MTPTS? something like + http://paste.debian.net/plain/80570/ + there i got "settrans -apfg /lib/rc/init.d" which vanished with first + patch + http://paste.debian.net/80573/ + ok and if you apply first patch http://paste.debian.net/plain/80386/ + i.e. adding |/lib/rc/init.d to mount point to ignore + didn't help + well output should change though + it does + but it still does not shut down + paste please then + http://paste.debian.net/80576/ + what did you expect ? + did you unapply VERBOSE & print REG_MTPTS? + yes + no + well + seems you do, if VERBOSE is set, it prints Will now unmount local + filesystems" + i restored a vm snapshot, and applied both patches + instead of "Unmounting local filesystems" + *seems you did + http://paste.debian.net/80577/ + shall i do it again ? + and what after "root@debian:/# halt" ? :p + 23:55 < teythoon> http://paste.debian.net/80576/ + and openrc shouting lots of stuff about breaking dependencies + please yes do it again + if VERBOSE is set, it prints "Will now unmount local filesystems" + instead of "Unmounting local filesystems" + yes, you are right + still, it does not work + http://paste.debian.net/80579/ + i'm curious about the new REG_MTPTS, supposing /lib/rc/init.d has + been suppressed + ok stop + 23:47 < gg0> ok and if you apply first patch + http://paste.debian.net/plain/80386/ + i did + well, i added that path + i don't believe so, it should ignore it if added + did it fix the issue for you ? + yes + any typo in addition? 
+ obviously patch is against sysvinit source but you have to apply it + to /etc/init.d/umountfs + obviously + isn't it time to tell me you are kidding me yet? + pere: thanks for the upload. I happened to realized that since it + was in collab-maint, I could as well just commit changes, I hope it's ok? + gg0: root@debian:~# fgrep '/lib/rc/init.d' /etc/init.d/umountfs + /|/proc|/dev|/.dev|/dev/pts|/dev/shm|/dev/.static/dev|/proc/*|/sys|/sys/*|/run|/run/*|/lib/rc/init.d) + /dev/vcs is missing, not the latest sysvinit version + could this affect shutdown? + i know + possibly + what if you also add /dev/vcs to path list? + what then ? + i don't mind /dev/vcs being + err, 'umounted' + i can handle that just fine + i mean what happens if you add /dev/vcs to path list in + /etc/init.d/umountfs as you did with /lib/rc/init.d? + what happens = how it shutdown + why would it be any different ? + no idea, seems the only change you don't have + i just know it fixes hurd console + i know it fixes the hurd console b/c i was the one who broke the + hurd console in the first place ... + quite sure there's something wrong on your side + if it's actually among those path to ignore, it can't be added to + REG_MTPTS + my /proc/mounts http://paste.debian.net/plain/80583 + yours? + i hope i'm not forgetting one change i did around + teythoon: /proc/mounts ? + + +## IRC, OFTC, #debian-hurd, 2014-02-07 + + teythoon: sorry for pasting reversed patches + please apply http://paste.debian.net/plain/80587, halt and paste + output + /proc/mounts + youpi: just fine. but please join us on #pkg-sysvinit and make sure + to follow the mailing lists. 
+ gg0: no, sorry, i was perfectly able to use -R on your patches, + as demonstrated by the paste i send + i think i'll rather just wait for the next sysvinit package and + try it again + teythoon: i don't doubt you are able, i'm sorry because i messed up + things + /lib/rc/init.d should not go in $REG_MTPTS + sysvinit 2.88dsf-48 just add /dev/vcs to not-to-umount paths and make + boot consider -s for single user, nothing about umounting filesystems on + halt/reboot + the /lib/rc/init.d/ change to umountfs seem to be the wrong one, as + it do not solve the problem for me. because of this, I have not applied + it to git. + pere: could you try to apply http://paste.debian.net/plain/80587, + halt and paste output? + well it applies to teythoon who doesn't have /dev/vcs + */dev/vcs change + pere: this one applies to -48 + installed. http://paste.debian.net/plain/80615/ + given /lib/rc/init.d is added to not-to-umount paths it can't go in + REG_MTPTS + http://picpaste.com/halt-hurd-DVEVoHnr.png + pere: you didn't apply it + no messages from umountfs + which is even more weird + well, patch claimed it did. + normally it says "Unmounting local filesystems..." + checked the file, patch is applied. + ok i think i got it + patch is good. it just requires booting twice _and_ removing + non-patched /etc/init.d/umountfs.* if any + patch = adding /lib/rc/init.d + so + which files do you need to remove? + /etc/init.d/umountfs.* and /lib/rc/init.d/started/umountfs.* + do you have any? + you should just have patched umountfs under both /etc/init.d/ and + /lib/rc/init.d/started/ + the latter is populate at boot, that's why i said twice to become + effective + *populated + but propably /lib/rc/init.d/started/umountfs can be fixed on the fly + from start: + why do you need to remove these files? + 1/ patch /etc/init.d/umountfs by adding /lib/rc/init.d to + not-to-umount path list + why are these files not ignored? + 2/ remove /etc/init.d/umountfs.* if any (eg. 
.orig .new .whatever) + pere: because it loads them at boot, you need it loads just the right + one + 3/ reboot twice + (3/ halt twice) + this sound very fishy to me. + or 3/ fix umountfs files under /lib/rc/init.d/started as well + that should make it shutdown properly right away + my halt still hang. + pere: you have /lib/rc/init.d in both /etc/init/umountfs and + /lib/rc/init.d/started/umountfs and there are no umountfs.* around? + problem seems to be it picks first it finds if there are more than + one + well i could have been more precise: /lib/rc/init.d/started/umountfs + is a link to /etc/init.d one + btw there must be just one and only one umountfs, patched + pere: clean /etc/init.d, reboot/halt with reboot-hurd or halt-hurd, + then next sysv reboot/halt will be good + you just need to leave patched umountfs under /etc/init.d alone + patch has always been good, it just needs 2 reboots to be appreciated + pere: do you have other /etc/init/umountfs* files besides patched + one? + my guess is it takes the first and only the first which Provides: + umountfs + 12:17 < pere> why are these files not ignored? + 12:35 < gg0> my guess is it takes the first and only the first which + Provides: umountfs + to confirm that, if you have umountfs and umountfs.orig, under + /started you'll find just umountfs.orig + pere: how goes? + teythoon: last ~40 lines + i'm assuming you have any else umountfs.* under /etc/init.d. if you + just add /lib/rc/init.d path to the only umountfs there should not be any + problem + gg0: removing the umountfs.* files did not help, as far as I can + tell. + are you telling me that openrc caches all init.d scripts in + /lib/rc/init.d/ at boot? + pere: yes, you can see them. which umountfs* do you have under + /lib/rc/init.d ? + the right one. :) + only the right one? + just scared me to know that changes on the disk do not take effect + immediately with openrc. + pere: only the right one? 
+ yes + here i screwed it up by forcing initscripts removal and reinstall to + reproduce it, then fixed it once again + i should just improving the explaination :) + pere: "removing the umountfs.* files did not help," so did you find + any? + yes, both .orig, .rej and .dpkg-old + pere: ok you should find one of them linked under + /lib/rc/init.d/started then + /lib/rc/init.d/started/umountfs.* + I removed them three boots ago. still halt hangs. + pere: and current umountfs have /lib/rc/init.d in path list? + *has + yes. + pere: can you access via ssh to it before issuing halt? + that is how I access it normally. + ok + before halt df should list /lib/rc/init.d as well + after halt it should not, do you confirm that? + (ssh connection here is kept alive) + my ssh connection went down, but /lib/rc/init.d was mounted while it + was active. + to me it look like umountfs isn't executed at all during shutdown. + oh, well. got to work on other things now. :) + it's correct getting no messages if there no filesystem to umount + as it wouldn't be run at all + pere: Hey, thanks for uploading sysv-rc -48 ! :) + you are welcome. :) + i can't reproduce it on a VM :/ http://paste.debian.net/plain/80658/ + ehm no, same machive, successive halt + http://paste.debian.net/plain/80659/ + got stuck + are there any testet sysvinit patches for hurd lingering? I plan to + upload a new version tonight or tomorrow. + + +## IRC, OFTC, #debian-hurd, 2014-02-08 + + http://paste.debian.net/plain/80854/ + expected? + do tmpfs and procfs need to be shown as types /hurd/tmpfs and + /hurd/procfs? + or can they be "normalized"? + domount mount_noupdate tmpfs shmfs /run tmpfs + -onosuid,noexec,size=10%,mode=755 + another one is why on linux options are nosuid,noexec ^, whereas on + hurd no-suid,no-exec,... ? + gg0: If they need generalising, we can add $nosuid/$noexec + etc. variables to mount-functions.sh and set them appropriately for the + currently platform. 
+ current platform rather + yeah, i ask just to understand what side people prefers modifying, in + this case hurd vs sysvinit + btw in the meanwhile i got tmpfs takes options without '-' though it + shows them with '-' in proc/mounts + rleigh: and thanks for pointing out what looking for, little hints + saves hours in my case :) + [IRC connection closed] + + +## IRC, freenode, #hurd, 2014-02-08 + + gnu_srs: the -49 version of sysvinit contains a fix for bootlogd + + +### IRC, freenode, #hurd, 2014-02-09 + + (16:31:17) : gnu_srs: the -49 version of sysvinit contains + a fix for bootlogd + Nice for kFreeBSD, for Hurd it doesn't matter if we get a + segfault or an error code saying it's not implemented :-( + segfault vs error code is really not the same + iirc bootlogd would ignore the error + Nevertheless, bootlogd is not usable on Hurd :( + then fix it + + +## IRC, OFTC, #debian-hurd, 2014-02-08 + + gg0: If the sames are set by hurd itself, then it makes sense to + adapt sysvinit to cope with that rather than altering hurd since that + would be a fairly major compatibility break. OTOH, adding support for the + Linux/FreeBSD names in addition to the hyphenated names would be good + from the point of view of better interoperability generally, not just for + sysvinit. + For now, getting sysvinit to support the Hurd names is easy + enough, and if you do add the Linux/FreeBSD names then the compatibility + stuff can be removed when that's available. + + +## IRC, freenode, #hurd, 2014-02-11 + + Hi, still problems with hurd console under openrc: console: + Console library initialization failed: Not a directory + and /dev/vcs is there + gnu_srs: but is it a directory? + the output of console -d vga -d pc_mouse --repeat=mouse -d pc_kbd + --repeat=kbd -d generic_speaker -c /dev/vcs gives the response above + looks like /dev/vcs is a file. How to recreate the directory + content? 
+ I thought it should not be removed with the latest sysvinit + package (-49) + from -48 changelog: Tell init.d/umountfs to not umount /dev/vcs, + as it break the console on Hurd. Patch from Samuel Thibault. + gnu_srs: but did your reconfigure the hurd package to remount it ? + ? + /dev/vcs won't magically be remounted by just not being unmounted + by sysvinit + dpkg-reconfigure hurd? + sure + I can start the console manually, but ENABLE='true' in + /etc/default/hurd-console does not work (at least with openrc) + does /dev/vcs becomes a mere file again with openrc? + no it's a directory with 6 entries + does the /etc/init.d/hurd-console gets to starT? + I'm afraid I'm really asking obvious questions that you should have + already asked for yourself + so you mounted it and it's not a file anymore. does it work now? + it seem like the service is not started, trying to figure out + why:-D + I can restart it but it is not visible in rc-status? + + shutdown stuck at "Asking all remaining processes to + terminate...done." (even before distupgrade btw) + seems stuck at killall5 -18 + hm, that's bad + how do you know that ? + /etc/init.d/sendsigs and /etc/init.d/killprocs + (yes, switched to sysvinit and testing openrc) + but killall5 -18 is SIGSTOP right? + and if it says ...done. then killall5 has already been run + so, how do you know it hangs at killall5 ? + teythoon: "done" is "log_action_end_msg 0" just after killall5 -15, + then we should get "Killing all remaining processes" or "All processes + ended within $seq seconds." + Asking all remaining processes to terminate...killall5 -15 -o 956 # + SIGTERM...done. + All processes ended within 1 seconds...done. + shutdown properly this time + hm + fwiw, i've also encountered hangs, haven't investigated yet + with openrc? + yes + + Is it so that with teythoons mtab translator umount -a unmounts + all passive translators, removing the translator records?? + causing pflocal (and pfinet) to disappear? 
+ +[[hurd/translator/mtab/discussion]]. + + gnu_srs: didn't he say that this is getting fixed in his latest + patchset? + yes, what about mine and gg0s currently hosed systems? + yes, but until the patch makes into the next release,** + gnu_srs: pflocal and pfinet don't appear in mtab + because they don't expose whole directories, just a trivial node + so no, they won't get umounted by umount -a + simply check the content of /proc/mounts + so how come I cannot recover my image? + and gg0 neither + no idea, I've never tried openrc + when daring new fields, you face new issues, that's no wonder + so this does not happen with sysv-rc? + I haven't seen any of this kind of issue + whether it's related to using openrc vs sysvrc, I have no idea + but at least that's a candidate for sure + well in my case hurd bootstrap is stuck after ext2fs exec and + before init + and reinstalling hurd via linux does not help + you mean the hurd package? + you can also try to reinstall the libc0.3 package + normally it should be all that is needed for boot + perhaps also some /dev entries + yes, the hurd package. I will try with libc0.3 tomorrow. Which + /dev entries, and how to create them manually? + "perhaps" implies that I don't know + you can as well just boot with an install CD, mount your disk, + chroot into it, and run dpkg-reconfigure hurd there to recreate + everything in /dev + + +## IRC, OFTC, #debian-hurd, 2014-02-13 + + pere, rleigh: which script is supposed to make /etc/mtab a symlink + to /proc/mounts already? I can't find it + youpi: see /lib/init/mount-functions.sh + + +## IRC, freenode, #hurd, 2014-02-13 + + teythoon: are the sysvinit debian packages in sid usable currently + ? + they are + nice + youpi and pere have been busy polishing it quite a bit + teythoon: and uhm, how does one enable sysvinit in debian ? :) + ah, found pere's blog + braunr: didn't you read the postinst instructions ? 
:p + update-alternatives --config runsystem + oh right + got lost in the noise + very nice + still a few glitches i see, but it does the job + although i'm not sure i like the lack of console prompt :/ + i'll keep darnassus on the old runsystem until this is fixed + braunr: cp -p /usr/share/sysvinit/inittab /etc/inittab + and kill -HUP 1 + oh + :) + teythoon: thanks + teythoon: do you know why there are three tmpfs instances after + startup (/run, and in addition, /run/shm and /run/lock) instead of one on + /run ? + sorry for being so annoying :) + braunr: dunno, but that is what Debian does + https://wiki.debian.org/ReleaseGoals/RunDirectory explains it a + bit + root@thinkbox ~src # uname -s; mount | grep /run + Linux + tmpfs on /run type tmpfs + (rw,nosuid,noexec,relatime,size=306952k,mode=755) + tmpfs on /run/lock type tmpfs + (rw,nosuid,nodev,noexec,relatime,size=5120k) + tmpfs on /run/shm type tmpfs + (rw,nosuid,nodev,noexec,relatime,size=613900k) + i like this /run directory + yep, it's nice + ah great, i can add ,sync=30 to fstab and it's added at boot time + :) + + +## IRC, freenode, #hurd, 2014-02-17 + + hi, I think we should make console server separate from + hurd-console + if DM want start, console server need be start first + congzhang: send patches + and hurd-console mark it start at the end of sysinit? + congzhang: i agree + teythoon: isn't hurd-console the console server ? + I want to check whether it is need first + braunr: yes, but congzhangs point is (as i understand it) that + the backend component should be started earlier + then again, i know little about the hurd console + no, if user enable one dispaly manager, then cycle dependence + happen + why ? 
+ i believe that is a different problem, namely that our + hurd-console init script depends on $all + pere: ^ + hurd-console Required-Start: $all + ok + yes that's a separate issue, and easier to understand + teythoon: if wdm Required-Start hurd-console, then insserv + can't generate the script order, right ? + congzhang: possibly, i don't know for sure + It doesn't work , and I rename to S??wdm to later one like + S20wdm + but insserv will regenerate the script order in /etc/rc2.d/, I + can't depend on that + congzhang: $all means after all scripts not depending on $all, and + not what the intuitive interpretation would tell you. + the current implementation order all scripts as if $all were not + present, and then move all scripts depending on $all to the last order + number+1. + because $all is misunderstood by most users, I strongly recommend to + _not_ use $all in any init.d script. + pere: so to make wdm to be number+more? + congzhang: make it depend on $all and be lexically sorted after + hurd-console. :) + wdm need start after hurd-console, if console-driver will run + when hurd-console start + not quite sure how startpar handle that case, so it might not work + the way you want anyway. + adding a dependency on hurd-console should not hurt, though. :) + how make it lexically sorted after hurd-console? + w is already after h in the alphabet. :) + that's trick! + but startpar uses the info in /etc/init.d/.depend.* (makefile style) + to order scripts, so check what the result is there too. + congzhang: no it's not + that's just cache + congzhang: ? + and generated from script head? + the right way is Adding run-time dependencies in script + congzhang: yes. insserv called from update-rc.d generate the + .depend.* files, and startpar reads the files (and ignore the headers) + when starting scripts. + if the script have cycle dependence, no one can help + congzhang: if there is a cycle, update-rc.d will reject the script. 
+ sure, because the system current have not runable one + Display Manager run before hurd-console, and never successful + for X stared failed! + what is this hurd-console stuff, btw? it sound like somthing that + should be started in rcboot.d (aka rcS.d on Debian). + if you install wdm, you will notice that wdm start failed + should it run before sulogin when booting into single user? + hurd-console mix too much thins + pere: it's the console multiplexes that provides /dev/tty? + just part of that function + pere: it's like screen or tmux a server-client architecture + the x server gets keyboard and mouse events from it iirc + right. so not needed by sulogin, I guess. because if it was, it + should start in rcS.d, not rc[2-5].d/. + and also start /bin/console to start keyboard and mouse driver + /bin/console is the frontend + and if it started in rcS.d/, it would always be started before + wdm. :) + i think it should be started in rcS.d + why not essential? + braunr: when I tried, it failed. + https://www.gnu.org/software/hurd/hurd/console.html + teythoon: i want to make one disk img with default DM, and face + these problem + pere: do you have a log of the failur e? + teythoon: I know you are working on the hurd init system, so I + ask you for help + braunr: only the boot message: Starting Hurd console multiplexer: + hurd-console failed! + braunr: how can I learn more? + i don't know any easy way + try to put the system in its early state manually + and maybe run rpctrace on the actual console command + if that is what really fails + and I found that pc_kbd may have some bug? I have high + frequence of start failed if I make it start + but I can't located the real source of these problem + pere: the console logs some messages to syslog + teythoon: looked, nothing there. :( + gah, look like I broke my hurd machine. Added rpctrace to the start + of hurd-console, and now the boot just hang there, and when I interrupt + it the kernel reboot the entire machine. 
:( + pere: use rpctrace manually, don't script it + oh yeah, seen this as well + braunr: well, no use to test it after boot when it hang during + boot... + it triggers an assertion in the proc server iirc + pere: that doesn't imply you need to script it + pere: qemu snapshot mode will be your friend:) + ideally, i'd run the init system automatically up to the point i + want to run my test, and make it spawn a shell, and use that shell then + congzhang: hah. real men do to take backups. but they weep a + lot. :) + teythoon: runsystem.sysv has work well on my machine, just some + error infomation + good + + +## IRC, freenode, #hurd, 2014-02-21 + + Hi, a general question: is ptrace available for GNU/Hurd? + yes + tks, the openrc developers are working on process supervision + using it: good/bad? (compared to cgroups) + uh + i prefer the cgroups approach + but upstart also uses ptrace to keep track of the 'main' process + of an daemon + they use ptrace to follow a daemon that double forks + teythoon: and regarding portability? + + +## IRC, freenode, #hurd, 2014-02-24 + + sysvinit doesn't seem to handle /etc/default/locale into + consideration + + +## IRC, OFTC, #debian-hurd, 2014-02-25 + + how about switching runsystem.sysv by default? 
+ now that it seems to be running fine, we could do that, yes + + # Required Interfaces In the thread starting diff --git a/open_issues/ti-rpc_then_nfs.mdwn b/open_issues/ti-rpc_then_nfs.mdwn index aa36e020..c3dd4e26 100644 --- a/open_issues/ti-rpc_then_nfs.mdwn +++ b/open_issues/ti-rpc_then_nfs.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -18,3 +18,88 @@ It needs some work on our side, [[!message-id Then, the Hurd's [[hurd/translator/nfs]] translator and [[hurd/nfsd]] can be re-enabled, [[!message-id "87hb2j7ha7.fsf@gnu.org"]]. + + +## IRC, freenode, #hurd, 2014-02-19 + + hi. I'm trying to port libtirpc to get rcpbind on hurd, and am + unable to find IPV6_PORTRANGE and IPV6_PORTRANGE_LOW. is this a known + problem with a known fix? + what are they supposed to be ? + braunr: found them described in . + "The IPV6_PORTRANGE socket option and the conflict resolution rule + are not defined in the RFCs and should be considered implementation + dependent + " + hm + if we have that, they're very probably not accessible from outside + our network stack + needed feature on hurd, in other words... + why ? + If I remember correctly, SO_PEERCRED is also missing? + yes .. + that one is important + braunr: you wonder why the IPV6_PORTRANGE socket option was created? + i wonder why it's needed + does linux have it ? + yes, linux got it. + same name ? + it make it possible for some services to work with some + firewalls. :) + yes, same name, as far I can tell. + they could merely bind ports explicitely, couldn't they ? + not always. + or is it for servers on creation of a client socket ? + see + for an example I came across. + i don't find these macros on linux :/ + how strange. libtirpc build on linux. 
+ is there a gitweb or so somewhere ? + i can't find it on sf :/ + for , you mean? + yes + no idea. + are you looking at upstream 0.2.4 or a particular debian package ? + I'm looking at the debian package. + let me take a look + http://paste.debian.net/82971/ is my first draft patch to get the + source building. + ok so + in src/bindresvport.c + if you look carefully, you'll see that these _PORTRANGE macros are + used in non linux code + not very portable but it explains why you hit the problem + try using #if defined (__linux__) || defined(__GNU__) + also, i think we intend to implement SCM_CREDS, not SO_PEERCRED + but consider we have neither for now + ah, definitely a simpler fix. + pere: btw, see + https://lists.debian.org/debian-hurd/2010/12/msg00014.html + + with patch reporte.d + + +## IRC, freenode, #hurd, 2014-02-20 + + new libtirpc with hurd fixes just uploaded to debian. should fix + the rpcbind build too. + + +## IRC, OFTC, #debian-hurd, 2014-02-20 + + hm, rpcbind built with freshly patched libtirpc fail to work on + hurd. no idea why. + running 'rpcinfo -p' show 'rpcinfo: can't contact portmapper: RPC: + Success' + o_O + I have no idea how to debug it. :( + anyway, I've found that rpcinfo is the broken part. rpcbind work, + when I test it from a remote machine. + + +## IRC, OFTC, #debian-hurd, 2014-02-21 + + failing rpcinfo -p on hurd reported as . Anyone got a clue how to debug it? 
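
The guard braunr suggests for `src/bindresvport.c` can be sketched roughly as follows. This is only an illustration under assumptions taken from the discussion above (the `_PORTRANGE` socket options are used solely in the non-Linux branch, and GNU/Hurd should take the same path as Linux); it is not the actual libtirpc code. The macro `HAVE_PORTRANGE_SOCKOPT`, the two helper functions and the 600..1023 port range are hypothetical names and values chosen for the example.

```c
/* Sketch only: shows the shape of the preprocessor guard, not the real
 * libtirpc source.  All names below are hypothetical. */
#include <assert.h>

#if defined(__linux__) || defined(__GNU__)
/* Linux and GNU/Hurd: per the discussion, IPV6_PORTRANGE is not usable
 * here, so a reserved port has to be probed by hand. */
# define HAVE_PORTRANGE_SOCKOPT 0
#else
/* Assumed BSD-style stack: the kernel can pick a reserved port when asked
 * through the IP_PORTRANGE / IPV6_PORTRANGE socket options. */
# define HAVE_PORTRANGE_SOCKOPT 1
#endif

/* Illustrative reserved-port range for the manual probing loop. */
enum { STARTPORT = 600, ENDPORT = 1023 };

/* Return the next candidate port, wrapping around at the end of the range. */
int next_reserved_port(int port)
{
    assert(port >= STARTPORT && port <= ENDPORT);
    return (port == ENDPORT) ? STARTPORT : port + 1;
}

/* Report which strategy the guard selected at compile time. */
const char *bind_strategy(void)
{
    return HAVE_PORTRANGE_SOCKOPT ? "setsockopt(IPV6_PORTRANGE)"
                                  : "manual port loop";
}
```

With a guard of this shape, a Hurd build never references the missing `IPV6_PORTRANGE` macros, which is why it is a smaller change than stubbing the options out one by one.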
diff --git a/open_issues/tmux.mdwn b/open_issues/tmux.mdwn index f71d13e1..c49a5e12 100644 --- a/open_issues/tmux.mdwn +++ b/open_issues/tmux.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,6 +10,7 @@ License|/fdl]]."]]"""]] [[!tag open_issue_porting]] + # IRC, freenode, #hurd, 2013-08-01 teythoon: can you stop tmux on darnassus please ? @@ -22,3 +23,35 @@ License|/fdl]]."]]"""]] sometimes tmux would hang on attaching or detaching though, but overall I had less problems with tmux than with screen ah, I tried to start tmux on darnassus and now it hangs + + +# IRC, freenode, #hurd, 2014-02-04 + + braunr: whoa, i can reproduce gnu_srs' hanging ssh sessions on + darnassus + here goes + run tmux, exit the shell so that tmux quits, start tmux again + (tmux hangs now on some socket stuff), log in with ssh again, pkill tmux, + rm /tmp/tmux*/default => both ssh sessions hang and time out eventually + why start tmux twice ? + dunno + that's what i just did, twice in a row + there's a bug somewhere that makes tmux hang if the socket + exists but no tmux server is running + maybe that contributes to to the other issuse, i don't know + looks like an infinite loop somewhere + teythoon: Nice to set that I'm not alone having this problem:P + teythoon: what's happening ? :) + ? + on darnassus + not sure + uh, something is very wrong o_O + help ? + :) + the msg thread of a process is blocked somewhere + preventing ps/top from completing + looks like proc is blocked now .. 
+ restarting the vm + apparently, removing buggy tmux sockets make pflocal crash + thanks for the report :) + you are welcome :) diff --git a/open_issues/translate_fd_or_port_to_file_name.mdwn b/open_issues/translate_fd_or_port_to_file_name.mdwn index 98fe0cfc..87556075 100644 --- a/open_issues/translate_fd_or_port_to_file_name.mdwn +++ b/open_issues/translate_fd_or_port_to_file_name.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -156,6 +156,91 @@ License|/fdl]]."]]"""]] see bug-hurd +## IRC, freenode, #hurd, 2013-12-05 + + braunr: no more room for vm_map_find_entry in 80220a40 + 80220a40 <- is that a task ? + or a vm_map, not sure + probably a vm_map + hm + let's fix this kind of reporting + :) + let one process register for kernel log messages + make a rich interface, say klog_thread and friends + a userspace process gets the port name, looks it up in proc, + logs nicely to syslog + if noone registered for this notifications, fall back to the old + reporting + i tend to think using internal names is probably better + how would i use them to see wich process caused the issue ? + you give the name of the task + (which means tasks have names, yes) + ok + the reason is that reporting is often used for debugging + and debugging usually means there is a bug + if the bug prevents from reporting, it's not very useful + and we're talking about the kernel here, the low level stuff + incidentally, i got myself a stuck process + ah, got it killed + braunr: so you propose to add a task rpc to set a name ? + i don't want to push such things + which is why this hasn't been done until now + but that's what i'd do in x15, yes + y not ? 
+ and instead of a process registered to gather kernel messages, i'd + use a dmesg-like interface, where the kernel manages its message buffer + itself + i didn't feel the need to + the tools i've had until now were sufficient + don't forget you still need to fix mtab :p + or is it done ? + i sometimes see tasks deallocating invalid ports + no + there is an un-acked patche series on the list + ok + so, i want to identify which process caused it + is that possible right now ? + not easily, no + so that's a valid use case + it is + good + :) + so proc would register a string describing each task and mach + would use this for printing nicer messages ? + for example, yes + one problem with that approach is that it doesn't fit well with + subhurds + *bingbingbing + but i personally wouldn't care much, they're kernel messages + in the future, we could make mach more a hypervisor, and register + names for each domains + yet unanswered proposal about hierachical proc servers on the + list... + that'd also fix subhurds, so that the parents processes won't + appear in the subhurd + making it sandboxier + and killall5 couldn't slaughter the host system if the subhurd + shuts down with sysvinit + + +## IRC, freenode, #hurd, 2014-01-20 + + i wonder if it would not be best to add a description to mach + tasks + i think it would + to aid fixing these kind of issues + in x15, i actually add descriptions (names) to all kernel objects + that's probably a good idea, yes + well, not all, but many + + +## IRC, OFTC, #debian-hurd, 2014-02-05 + + youpi: about that patch implementing task_set_name, may i merge + the amended version ? 
+ yes + + # IRC, freenode, #hurd, 2011-07-13 A related issue: diff --git a/open_issues/user-space_device_drivers.mdwn b/open_issues/user-space_device_drivers.mdwn index d6c33d30..69ec1d23 100644 --- a/open_issues/user-space_device_drivers.mdwn +++ b/open_issues/user-space_device_drivers.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2009, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2009, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -19,7 +19,7 @@ Also see [[device drivers and IO systems]]. [[!toc levels=2]] -# Issues +# Open Issues ## IRQs @@ -250,6 +250,297 @@ A similar problem is described in cool :) +#### IRC, freenode, #hurd, 2014-02-10 + + braunr: i have a question wrt memory allocation in gnumach + i made a live cd with a rather large ramdisk + it works fine in qemu, when i tried it on a real machine it + failed to allocate the buffer for the ramdisk + i was wondering why + i believe the function that failed was kmem_alloc trying to + allocate 64 megabytes + teythoon: how much memory on the real machine ? + 4 gigs + so 1.8G + yes + does it fail systematically ? + but surely enough + uh, i must admit i only tried it once + it's likely a 64M kernel allocation would fail + the kmem_map is 128M wide iirc + and likely fragmented + it doesn't take much to prevent a 64M contiguous virtual area + i see + i suggest you try my last gnumach patch + hm + surely there is a way to make this more robust, like using a + different map for the allocation ? + the more you give to the kernel, the less you have for userspace + merging maps together was actually a goal + the kernel should never try to allocate such a large region + can you trace the origin of the allocation request ? + i'm pretty sure it is for the ram disk + makes sense but still, it's huge + well... 
+ the ram disk should behave as any other mapping, i.e. pages should + be mapped in on demand + right, so the implementation could be improved ? + we need to understand why the kernel makes such big requests first + oh ? i thought i asked it to do so + ? + for the ram disk + normally, i would expect this to translate to the creation of a + 64M anonymous memory vm object + the kernel would then fill that object with zeroed pages on demand + (on page fault) + at no time would there be a single 64M congituous kernel memory + allocation + such big allocations are a sign of a serious bug + for reference, linux (which is even more demanding because + physical memory is directly mapped in kernel space) allows at most 4M + contiguous blocks on most architectures + on my systems, the largest kernel allocation is actually 128k + and there are only two such allocations + teythoon: i need you to reproduce it so we understand what happens + better + braunr: currently the ramdisk implementation kmem_allocs the + buffer in the kernel_map + hum + did you add this code ? + no + where is it ? + debian/patches + ugh + heh + ok, don't expect that to scale + it's a quick and dirty hack + teythoon: why not use tmpfs ? + i use it as root filesystem + :/ + ok so + update on what i said before + kmem_map is exclusively used for kernel object (slab) allocations + kmem_map is a submap of kernel_map + which is 192M on i386 + so a 64M allocation can't work at all + it would work on xen, where the kernel map is 224M large + teythoon: do you use xen ? + ok, thanks for the pointers :) + i don't use xen + then i can't explain how it worked in your virtual machine + unless the size was smaller + i'll look into improving the ramdisk patch if time permits + no it wasnt + :/ + and it works reliably in qemu + that's very strange + unless the kernel allocates nothing at all inside kernel_map on + qemu + + +##### IRC, freenode, #hurd, 2014-02-11 + + braunr: http://paste.debian.net/81339/ + teythoon: oO ? 
+ teythoon: you can't allocate memory from a non kernel map + what you're doing here is that you create a separate, non-kernel + address space, that overlaps kernel memory, and allocate from that area + it's like having two overlapping heaps and allocating from them + braunr: i do? o_O + so i need to map it instead ? + teythoon: what do you want to do ? + i'm currently reading up on the vm system, any pointers ? + teythoon: but what do you want to achieve here ? + 12:24 < teythoon> so i need to map it instead ? + i'm trying to do what you said the other day, create a different + map to back the ramdisk + no + no ? + i said an object, not a map + but it means a complete rework + ok + i'll head back into hurd-land then, though i'd love to see this + done properly + teythoon: what you want basically is tmpfs as a rootfs right ? + sure + i'd need a way to populate it though + how is it done currently ? + grub loads an ext2 image, then it's copied into the ramdisk + device, and used by the root translator + how is it copied ? + what makes use of the kernel ramdisk ? + in ramdisk_create, currently via memcpy + the ext2fs translator that provides / + ah so it's a kernel device like hd0 ? + yes + hm ok + then you could create an anonymous memory object in the kernel, + and map read/write requests to object operations + the object must not be mapped in the kernel though, only temporary + on reads/writes + right + so i'd not use memcpy, but one of the mach functions that copy + stuff to memory objects ? + i'm not sure + you could simply map the object, memcpy to/from it, and unmap it + what documentation should i read ? 
+ vm/vm_map.h for one + i can only find stuff describing the kernel interface to + userspace + vm/vm_kern.h may help + copyinmap and copyoutmap maybe + hm no + vm_map.h isn't overly verbose :( + vm_map_enter/vm_map_remove + ah, i actually tried vm_map_enter + look at the .c files, functions are described there + that leads to funny results + vm_map_enter == mmap basically + and vm_object.h + panic: kernel thread accessed user space! + heh :) + right, i hoped vm_map_enter to be the in-kernel equivalent of + vm_map + + braunr: uh, it worked + teythoon: ? + weird + :) + teythoon: what's happening ? + i refined the ramdisk patch, and it seems to work + not sure if i got it right though, i'll paste the patch + yes please + http://paste.debian.net/81376/ + no it can't work either + :/ + you can't map the complete object + (amusingly it does) + you have to temporarily map the pages you want to access + it does for the same obscure reason the previous code worked on + qemu + ok, i think i see + increase the size a lot more + like 512M + and see + you could also use the kernel debugger to print the kernel map + before and after mapping + how ? + hm + see show task + maybe you can call the in kernel function directly with the kernel + map as argument + which one ? + the one for "show task" + hm no it shows threads, show map + and show map crashes on darnassus .. 
+ here as well + ugh + personally i'd use something like vm_map_info in x15 + but you may not want to waste time with that + try with a bigger size and see what it does, should be quick and + simple enough + right + braunr: ok, you were right, mapping the entire object fails if + it is too big + teythoon: fyi, kmem_alloc and vm_map have some common code, namely + the allocation of an virtual area inside a vm_map + kmem_alloc requires a kernel map (kernel_map or a submap) whereas + vm_map can operate on any map + what differs is the backing store + braunr: i believe i want to use vm_object_copy_slowly to create + and populate the vm object + for that, i'd need a source vm_object + the data is provided as a multiboot_module + kmem_alloc backs the virtual range with wired down physical memory + whereas vm_map maps part of an object that is usually pageable + i see + and you probably want your object to be pageable here + yes :) + yes object copy functions could work + let me check + what would i specify as source object ? + let's assume a device write + the source object would be where the source data is + e.g. the data provided by the user + yes + trouble is, i'm not sure what the source is + it looks a bit complicated yes + i mean the boot loader put it into memory, not sure what mach + makes of that + i guess there already are device functions that look up the object + from the given address + it's anonymous memory + but that's not the problem here + so i need to create a memory object for that ? + you probably don't want to populate your ramdisk from the kernel + wire it down to the physical memory ? + don't bother with the wire property + oh ? 
+ if it can't be paged out, it won't be + ah, that's not what i meant + you probably want ext2fs to populate it, or another task loaded by + the boot loader + interesting idea + and then, this task will have a memory object somewhere + imagine a task which sole purpose is to embedd an archive to + extract into the ramdisk + sweet, my thoughts exactly :) + the data section of a program will be backed by an anonymous + memory object + the problem is the interface + the device interface passes addresses and sizes + you need to look up the object from that + but i guess there is already code doing that in the device code + somewhere + teythoon: vm_object_copy_slowly seems to create a new object + that's not exactly what we want either + why not ? + again, let's assume a device_write scenario + ah + you want to populate the ramdisk, which is merely one object + not a new object + yes + teythoon: i suggest using vm_page_alloc and vm_page_copy + and vm_page_lookup + teythoon: perhaps vm_fault_page too + although you might want wired pages initially + teythoon: but i guess you see what i mean when i say it needs to + be reworked + i do + braunr: aww, screw that, using a tmpfs is much nicer anyway + the ramdisk strikes again ... + teythoon: :) + teythoon: an extremely simple solution would be to enlarge the + kernel map + this would reduce the userspace max size to ~1.7G but allow ~64M + ramdisks + nah + or we could reduce the kmem_map + i think i'll do that anyway + the slab allocator rarely uses more than 50-60M + and the 64M remaining area in kernel_map can quickly get + fragmented + braunr: using a tmpfs as the root translator won't be straight + forward either ... damn the early boostrapping stuff ... + yes .. + that's one of the downsides of the vfs-as-namespace approach + i'm not sure + it could be simplified + hm + it could even use a temporary name server to avoid dependencies + indeed + there's even still the slot for that somewhere + braunr: hm... 
I have a vague recollection that the fixed-sized + kmem-map was supposed to be gone with the introduction of the new + allocator?... + antrik: the kalloc_map and kmem_map were merged + we could directly use kernel_map but we may still want to isolate + it to avoid fragmentation + +See also the discussion on [[gnumach_memory_management]], *IRC, freenode, +\#hurd, 2013-01-06*, *IRC, freenode, #hurd, 2014-02-11* (`KENTRY_DATA_SIZE`). + + ### IRC, freenode, #hurd, 2012-07-17 OK, here is a stupid question I have always had. If you move @@ -725,7 +1016,133 @@ A similar problem is described in * + +## The Anykernel and Rump Kernels + * [Running applications on the Xen Hypervisor](http://blog.netbsd.org/tnf/entry/running_applications_on_the_xen), Antti Kantee, 2013-09-17. [The Anykernel and Rump Kernels](http://www.netbsd.org/docs/rump/). + + +### IRC, freenode, #hurd, 2014-02-13 + + is anyone working on getting netbsd's rump kernel working under + hurd? it seems like a neat way to get audio/usb/etc with little extra + work (it might be a great complement to dde) + noone is but i do agree + although rump wasn't exactly designed to make drivers portable, + more subsystems and higher level "drivers" like file systems and network + stacks + but it's certainly possible to use it for drivers to without too + much work + cluck: I am reading about rumpkernels and his thesis. + braunr: afaiu there is (at least partial) work done on having it + run on linux, xen and genode [unless i misunderstood the fosdem'14 talks + i've watched so far] + "Generally speaking, any driver-like kernel functionality can be + offered by a rump server. Examples include file systems, networking + protocols, the audio subsystem and USB hardware device drivers. A rump + server is absolutely standalone and running one does not require for + example the creation and maintenance of a root file system." + from http://www.netbsd.org/docs/rump/sptut.html + cluck: how do they solve resource sharing problems ? 
+ braunr: some sort of lock iiuc, not sure if that's managed by the + host (haven't looked at the code yet) + cluck: no, i mean things like irq sharing ;p + bus sharing in general + netbsd has a very well defined interface for that, but i'm + wondering what rump makes of it + braunr: yes, i understood + braunr: just lacking proper terminology to express myself + braunr: at least from the talk i saw what i picked up is it behaves + like netbsd inside but there's some sort of minimum support required from + the "host" so the outside can reach down to the hw + cluck: rump is basically glue code + braunr: but as i've said, i haven't looked at the code in detail + yet + braunr: yes + but host support, at least for the hurd, is a bit more involved + we don't merely want to run standalone netbsd components + we want to make them act as real hurd servers + therefore tricky stuff like signals quickly become more + complicated + we also don't want it to use its own RPC format, but instead use + the native one + braunr: antti says required support is minimal + but again, compared to everything else, the porting effort / size + of reusable code base ratio is probably the lowest + cluck: and i say we don't merely want to run standalone netbsd + components on top of a system, we want them to be our system + braunr: argh.. i hate being unable to express myself properly + sometimes :| + ..the entry point?! + ? + dunno what to call them + i understand what you mean + the system specific layer + and *again* i'm telling you our goals are different + yes, anyways.. 
just a couple of things, the rest is just C + when you have portable code such as found in netbsd, it's not that + hard to extract it, create some transport between a client and a server, + and run it + if you want to make that hurdish, there is more than that + 1/ you don't use tcp, you use the native microkernel transport + 2/ you don't use the rump rpc code over tcp, you create native rpc + code over the microkernel transport (think mig over mach) + 3/ you need to adjust how authentication is performed (use the + auth server instead of netbsd internal auth mechanisms) + 4/ you need to take care of signals (if the server generates a + signal, it must correctly reach the client) + and those are what i think about right now, there are certainly + other details + braunr: yes, some of those might've been solved already, it seems + the next genode release already has support for rump kernels, i don't + know how they went about it + braunr: in the talk antii mentions he wanted to quickly implement + some i/o when playing on linux so he hacked a fs interface + so the requirements can't be all that big + braunr: in any case i agree with your view, that's why i found rump + kernels interesting in the first place + i went to the presentation at fosdem last year + and even then considered it the best approach for + driver/subsystems reuse on top of a microkernel + that's what i intend to use in propel, but we're far from there ;p + braunr: tbh i hadn't paid much attention to rump at first, i had + read about it before but thought it was more netbsd specific, the genode + mention piked my interest and so i went back and watched the talk, got + positively surprised at how far it has come already (in retrospect it + shouldn't have been so unexpected, netbsd has always been very small, + "modular", with clean interfaces that make porting easier) + netbsd isn't small at all + not exactly modular, well it is, but less than other systems + but yes, clean interfaces, explicitely 
because their stated goal + is portability + other projects such as minix and qnx didn't wait for rump to reuse + netbsd code + braunr: qnx and minix have had money and free academia labor done + in their favor before (sadly hurd doesn't have the luck to enjoy those + much) + :) + sure but that's not the point + resources or not, they chose the netbsd code base for a reason + and that reason is portability + yes + but it's more work their way + more work ? + with rump we'd get all those interfaces for free + i don't know + not for free, certainly not + "free" + but the cost would be close to as low as it could possibly be + considering what is done + braunr: the small list of dependencies makes me wonder if it's + possible it'd build under hurd without any mods (yes, i know, very + unlikely, just dreaming here) + cluck: i'd say it's likely + I quickly tried to build it during the talk + there are PATH_MAX everywhere + ugh + but maybe that can be #defined + since that's most probably for internal use + not interaction with the host diff --git a/open_issues/virtualization/fakeroot.mdwn b/open_issues/virtualization/fakeroot.mdwn index f9dd4756..7856e299 100644 --- a/open_issues/virtualization/fakeroot.mdwn +++ b/open_issues/virtualization/fakeroot.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -65,3 +66,1224 @@ License|/fdl]]."]]"""]] < youpi> that's why we still use fakeroot-sysv < teythoon> right < youpi> err, -tcp + + +## IRC, freenode, #hurd, 2013-11-18 + + I believe I figured out the argv[0] issue with fakeroot-hurd + but I'm not sure how to fix this + first of all, Emilios file_exec_file_name patch set works fine + but not with fakeroot + + 
http://git.sceen.net/hurd/hurd.git/blob/HEAD:/exec/hashexec.c#l300 + check_hashexec tries to locate the script file using a heuristic + Emilios patch improves the situation with just providing this + information + but then, the identity port of the "discovered" file is compared + with the id port of the script file + to verify if the heuristic found the right file + but when using fakeroot-hurd, /hurd/fakeroot proxies all + requests + but the exec server is outside of the /hurd/fakeroot + environment, so it gets the id port from the real filesystem + we could skip that test if the script name is explicitly + provided though + that test was meant to see whether a search through $PATH turned + up the right file + teythoon: nice + braunr: thanks :) + unfortunately, dpkg-buildpackaging hurd with it still fails for + some reason + but it is faster than fakeroot-tcp :) + even chown ? + or chmod ? + dunno in detail, but the whole build is faster + if you can try it, i'm interested + because chown/chmod is also slow on linux with fakeroot-tcp + i can try... + so it's probably not a hurd bug + braunr: yes, it really is + no i mean + chown/chmod being slow with fakeroot-tcp is probably not a hurd + bug + but a fakeroot-tcp bug + chowning all files in /usr/bin takes 5.930s with fakeroot-hurd + (6.09 with startup overhead) vs 26.42s (26.59s) with fakeroot-tcp + but try it on linux (fakeroot-tcp i mean) + although you may want to do it on something you don't care much + about :p) + + +## IRC, freenode, #hurd, 2013-12-03 + + * teythoon is gonna hunt a fakeroot bug ... + % fakeroot-hurd /bin/sh -c ":> /tmp/some_file" + /bin/sh: 1: cannot create /tmp/some_file: Is a directory + ah fakeroot-hurd + prevents installing stuff with /bin/install + sure fakeroot-hurd, why would i work on the slow one ? + i don't know + because it makes chmod/chown/maybe others horrenddously slow + ? + yes, fixing this involves fixing fakeroot-hurd + are you sure ? 
+ i prefer repeating just in case: i saw that problem on linux as + well + with fakeroot-sysv + so ? + i'm almost certain it's a pure fakeroot bug, not a hurd bug + so + even if this is fixed, it still has to pay the socket + communication overhead + fixing fakeroot-hurd so that i can be used instead of fakeroot-tcp + is a very good thing to do, obviously + it* + but it won't solve the chown/chmod speed + (or, probably won't) + huh, why not ? + 15:53 < braunr> i'm almost certain it's a pure fakeroot bug, not a + hurd bug + when i say it's slow, i should be more precise + it doesn't show up in top + yes, but why would fakeroot-hurd suffer from the same issue ? + the cpu is almost idle + oh right, it's a completely different tool + my bad + right, right, the proper way to implement fakeroot actually :) + yes + this will bring near-native speed + + +## IRC, freenode, #hurd, 2013-12-05 + + fakeroot-hurd just successfully built mig :) + hangs in dh_gencontrol when building gnumach or hurd though + i believe it hangs waiting for a lock + lock like in file lock that is + braunr: no more room for vm_map_find_entry in 80220a40 + 80220a40 <- is that a task ? + or a vm_map, not sure + probably a vm_map + + +## IRC, freenode, #hurd, 2013-12-06 + + well, aren't threads a source of endless entertainment ... ? + well, I found three more bugs in fakeroot-hurd + one of them requires fixing the locking used in fakeroot + ouch + the current code does some lock cycling to aquire a lock out of + order + cycling ? + in the netfs_node_norefs function + release and reaquire + i see + which imho should be better solved with a weak reference + working on it, it no longer deadlocks but i broke something else + ... + endless fun ;) + such things could have been done right in the beginning + ... + yes, I wonder + libports has weak references + but pflocal is the only user + hm + none of the lib*fs support that + didn't i add one in libdiskfs too ? 
+ anyway, irrelevant + weak references are a nice feature + teythoon: i don't see the cycling you mentioned + only netfs_node_refcnt_lock being dropped temporarily + yep, that one + line 145 + note that due to another bug this code is currently never run + how surprising .. + the note about some leak actually gave a hint about that + yeah, that leak + I think i'm actually very close + it's just so frustrating, i thought i got it last night + good luck then + thanks :) + + +## IRC, freenode, #hurd, 2013-12-09 + + sweet, i fixed fakeroot-hurd :) + /clap + what was the problem ? + lots + i see + it's amazing it actually run as well as it did + mess strikes again + i hate messy code .. + * teythoon is building half a hurd package using this ... stay tuned ;) + teythoon: is this going to make building faster as well? + most likely, yes + fakeroot-tcp is known to be slow, even on linux + teythoon: are you sure about the transparent retry patch ? + pretty sure, why ? + it's about a more general issue that we didn't fix yet + our last discussions about it lead us to agree that clients should + check the identity of a server before interacting with it + braunr: i don't understand, what's the problem here ? + teythoon: fakeroot does the lookup itself, doesn't it ? + yes + teythoon: but was that also the case before your patch ? + braunr: yes + teythoon: then ok + teythoon: i guess fakeroot handles requests only for a specific + set of calls right ? + and for others, requests are directly relayed + braunr: yes + and that still is the case, right ? + yes + ok + looks right since it only affects lookups + ok then + well, fakeroot-hurd built half a hurd package in less than 70 + minutes + a new record for my box + compared to how much before ? + (and why half of it ?) + unfortunately it hung after signing the packages... 
some perl + process with a /usr/bin/tee child + killing tee made it succeed though + braunr: i don't build the udeb package + oh ok + braunr: compared with ~75 with fakeroot-tcp and my demuxer + rework, ~80 before + teythoon: nice + + +## IRC, freenode, #hurd, 2013-12-18 + + there, i fixed the last fakeroot-hurd bug + *whee* :) + i thought so many times that i got the last fakeroot bug ... + last as in it's in a good enough shape to compile the hurd + package that is + but now it is + :) + this will make glibc and others so much faster to build + + +## IRC, freenode, #hurd, 2013-12-19 + + teythoon_: hum, you should make the behaviour of fakeroot-hurd on + the last client exiting optional + y? + fakeroot-tcp does the very same thing + fakeroot-hurd is different + it's part of the file system + yes + users may want it to stay around + and reuse it without checking it's actually there + but once the last client is gone, who is ever getting another + port to it ? + no + that cannot happen + really ? + yes + i thought it was like remap + since remap is based on it + the same thing applies to remap + only settrans has the control port + hum + and uses it once to get a protid for the working dir of the + initial process started inside the chrooted environment + you may not want to chroot inside + so ? 
+ then, you get another protid + i'll make an example + i create a myroot directory implemented by fakeroot + populate it + leave and do something else, + i might want to return to it later + ah + ok, so you are not using settrans --chroot + or maybe i'm confusing the fakeroot translator and fakeroot-hurd + 10:48 < braunr> you may not want to chroot inside + yes + hm + ok, so the patch could be changed to check whether the last + control port is gone too + i have no idea of any practical use, but i don't see a valid + reason to make a translator go away just because it has no client + except for resource usage + and if it's installed as a passive translator + although that would make fakeroot loose its state + though remap state is on the command line so it would be fine for + it + see what i mean ? + yes i do + fakeroot state could be saved in some db one day so it may apply, + if anyone feels the need + so what about checking for control ports too ? + i'm not too familiar with those + who has the control port of a passive translator ? the parent ? + that should cover the use case you described + for the parent translator + for fsys_getroot requests it has to keep it around + and for more fsys stuff too + and if active ? settrans ? who just looses it ? + if settrans is used to start an active translator, the parent + fs still gets a right to the control port + ok + i don't have a clear view of what this implies for fakeroot-hurd + we'd want fakeroot-hurd to clean all resources including the + fakeroot translator on exit + for fakeroot-hurd (or any child translator) this means that a + port from the control port class will still exists + so we do not exit + oh, you're speaking of fakeroot.sh ? the wrapper script ? + probably + for me, fakeroot-hurd is the command line too, similar to + fakeroot-sysv and fakeroot-tcp + and fakeroot is the translator + yes, agreed + fakeroot-hurd could use settrans --force --chroot ... 
to force + fakeroot to exit if the main chrooted process dies + but that'd kill anything that outlives that process + that might be legitimate, say a process daemonized + so detecting that noone uses fakeroot is the much cleaner + solution + ok + also, that's what fakeroot-tcp does + which is why i suggested an option for that + why add an option if we can do the right thing without + troubling the user ? + ah, if we can, good + i think we can + I'll rework the patch, thanks for the hint + so + just to be clear + the way you intend it to work is + wait for all clients and the control port to drop before shutting + down + the control port is dropped when dettaching the translator, right + ? + yes + but hm + what if clients spawn other processes ? + they won't find the translator any more + then, that client get's a port to fakeroot at least for it's + working dir + so another protid is created + ah yes, it's usually choorted for such uses + chrooted + so fakeroot will stick around + but clients, even from fakeroot, might simply use absolute paths + so ? + in which case they won't find fakeroot + it will hit fakeroots dir_lookup + sure + how so ? + if the path is absolute, it will trigger a magic retry of some + kind + so the client uses it's root dir port + i thought the lookup would be done straight from the root fs port + .. + which points to fakeroot of course + ah, chrooted again + that's the whole point + so this implies clients are chrooted + they are + even if you do another chroot + what i mean is + that root port also points to a fakeroot port + if we detach the translator, and clients outside the chroot spawn + processes, say shell scripts, they won't find the fakeroot tree + now, i wonder if we want to actually handle that + i'm just uncomfortable with a translator silently shutting down + because it has no client + if fakeroot is detached, how are clients outside the chroot + ever supposed to get a handle to files inside the fakerooted env ? 
+ it makes sense for fakeroot, so the expected behaviours here aer + conflicting + they had those before fakeroot being detached + then fakeroot wouldn't go away + right + unless there is a race but i don't think there is + there isn't + i call netfs_shutdown + clients get the rights before the parent has a chance to terminate + and only shutdown if it doesn't return ebusy + makes sense + ok go ahead :) + cool, thanks for the walk-through ;) + on the other hand .. + that's a complicated topic left unfinished by the original authors + one of many + having translators automatically go away when there is no client + may be a good feature + but it only makes sense for passive translators + and this should be automated + the lib*fs libraries should be able to handle it + or, we could go for proper persistence instead + stay around if active, leave after a while when no more clients if + passive + why ? + clean solution + persistence looks much more expensive to me + other benefits + i mean + persistence looks so expensive it doesn't make sense in a general + purpose system + sure, we could make our *fs libs handle this smarter at a much + lower cost + don't we get a handle to the underlying file ? + i think we do yes + if that's actually a file and not a directory, we could store + data into it + many translators are read-only + so ? + well, when we can write, we can use passive translators instead + normally + yes + depends on the fs type actually but you're right, we could use + regular files + or a special type of file, i don't know + braunr: BTW, I agree that active translators should only go away + when no ports are open anymore, while passive ones can exit when control + ports are still open but no protids + antrik: you mean as a general rule ? 
+ that leaves the question how the translator distinguishes + between having a passive translator record and not having one + I believe I already arrived at that conclusion in some design + discussion, probaly regarding namespace-based translator selection + teythoon: yeah, as a general rule + interesting + currently there are command line arguments controling timeouts, + but they don't consider control ports IIRC + i thought there are problems with shutting down translators in + general + (also, command line arguments seem inconvenient to distinguish the + passive vs. active case...) + yeah, but we disregard the timeouts in the debian flavor of hurd + teythoon: err... no we don't. at least not last time I knew. are + you confusing this with thread timeouts? + simple test: do ls -l on /dev, wait a few minutes, compare + what do you expect will happen ? + the unused translators should go away + no + that must be new then + might be, yes + + http://darnassus.sceen.net/gitweb/teythoon/packaging/hurd.git/blame/HEAD:/debian/patches/libports_stability.patch + antrik: debian currently disables both the global and thread + timeouts in libports + my work on thread destruction consists in part in reenabling + thread timeouts, and my binary packages do that well so far :) + braunr: any idea why the global timeouts were disabled? + + +## IRC, freenode, #hurd, 2013-12-20 + + antrik: not sure + but i suspect there could be races + if a message arrives while the server is going away, i'm not sure + the client can determine this and retry transparently + good point... 
not sure how that is supposed to work exactly + + +## IRC, freenode, #hurd, 2013-12-31 + + btw, we should remove the libports_stability patch and directly + change the upstream code + if you agree, i can force the global timeout to 0 (because we're + still not sure what can go wrong when a translator goes away while a + message is being delivered to it) + i didn't experience any slowdown with thread destruction however + so i'm tempted to set that to an actual reasonable timeout value + of 30-300 seconds + braunr: if you do, please introduce a macro for the default + value so it can be changed easily + teythoon: yes + i don't understand why these are left as parameters tbh + true + 30 seconds seems to be plenty enough + + +## IRC, freenode, #hurd, 2014-01-17 + + time to give fakeroot-hurd a shot + http://darnassus.sceen.net/~rbraun/darnassus_fakeroot_hurd_assert + braunr: (wrt fakeroot-hurd) well in my book that shouldn't + happen + that's why i put the assertion there ;) + i assumed so :) + then again, /me does not agree with "threads" as concurrency + model >,<, and that feeling seems to be mutual :p + ? + well, obviously, the threads do not agree with me wrt to that + assertion + the threads ? + well, fakeroot is a multithreaded server + teythoon: i'm not sure i get the point, are you saying you're not + comfortable with threads ? + that's exactly what i'm saying + ok + coroutines/functional i guess ? + csp + functional not so much + + +## IRC, freenode, #hurd, 2014-01-20 + +[[open_issues/libpthread]], +[[open_issues/libpthread/t/fix_have_kernel_resources]]. + + teythoon: it's perfectly possible that the bug i had with + fakeroot-hurd have been caused by my own glibc thread related patches + has* + ok + *phew* :p + :) + i wonder if youpi could reproduce his issue on his machine + what issue ? 
+ i must have missed something + some package failed + but he didn't give any details + he wanted to try it on his vm first + ok + + +## IRC, freenode, #hurd, 2014-01-21 + + teythoon: i still get the same assertion failure with + fakeroot-hurd + will take a look at that sometime too + braunr: hrm :/ + teythoon: don't worry, i'm sure it's nothing big + in the mean time, there are updated hurd and glibc packages on my + repository with fixed tls and thread destruction + cool :) + + +## IRC, freenode, #hurd, 2014-01-23 + + teythoon: can you briefly explain this fake reference thing in + fakeroot when you have some time please ? + braunr: fakeroot creates ports to hand out to clients + every port represents a node and references a real node + fakeroot allows one to set attributes, e.g. file permissions on + any node as if the client was root + those faked attributes are stored in the node objects + let's focus on fake_reference please + once some attribute is faked, that node has to be kept alive + otherwise, that faked information is lost + so if the last peropen object is closed and some information is + faked, a fake reference is kept + as indicated by a flag + hm + in dir lookup, if a node is looked-up that has a fake reference, + it is recycled, i.e. the flag cleared and the reference count is not + incremented + so every time fakeroot_netfs_release_protid is called b/c, the + node in question should not have the fake reference flag set + what's the relation between the number of hard links and this fake + reference ?
+ i don' + i don't think fakeroot has a notion of 'hard links' + it does + the fake reference is added on nodes with a hard link count + greater than 0 + but i guess that just means the underlying node still exists + ah yes + right + currently, if the real node is deleted, the fake node is still + kept around + let's say it's ok for now + that's what the comment is talking about, the one that indicates + that garbage collection could help here + yes + properly fixing this is difficult + agreed + it would require something like inotify anyway + b/c of the way file deletion works + let's just ignore the issue, that's not what i'm hunting + agreed + the assertion i have is telling us that we're dropping a fake + reference + are we certain this isn't possible ? + that function is called if a client dereferences a port + in order to have a port in the first place, it has to get it + from a dir_lookup + the dir lookup turns a fake reference into a real one + so i'm certain of that (barring a race condition somewhere) + ok + netfs_S_dir_lookup grabs idport_ihash_lock (line 354) but doesn't + release it if nn == NULL (lines 388-392) + hm, my file numbers are slightly different o_O + i have printfs around + sorry :) + ok + new node unlocks it + new_node + oh + how unintuitive .. + yes, don't blame me ;) that's how it was + :) + worse, the description says "if successful" .. + ah no, the node lock + ok + yes, badly worded description + i strongly doubt it's a race + how do you trigger that assertion failure ? + dpkg-buildpackage -rfakeroot-hurd -uc -us + for the hurd package + very similar to one of your test cases i think + umm :-/ + one thing that i find confusing is that fake_reference seems to + apply to nodes, whereas release_protid is about, well, protids + is there a 1:1 relationship ? 
+ since there is a peropen in the protid, i assume not + it may be a race actually + np->references must be accessed with netfs_node_refcnt_lock locked + hm no, that's not it + no, it's not a 1:1 relationship + note that the lock idport_ihash_lock serializes most operations, + despite it's name indicating that it's only for the hash table + the "interesting" operations being dir_lookup and release_protid + yes + again, that's another issue + why ? that's a pretty strong guarantee already + ah yes, i was referring to scalability + sure + the assertion is triggered from ports_port_deref in + ports_manage_port_operations_multithread + but i found it hard to reason about fakeroot, there are multiple + locks involved, two kinds of reference counting across different libs + yes + yes, that's to be expected + teythoon: do we agree that the fake reference is reused by a + protid ? + braunr: yes + why is there a ref counter for the protid as well as the peropen + then ? :/ + funny... i thought there was no refcnt for the peropen objects, + but there is + but for fakeroot-hurd that shouldn't matter, right ? + i don't know + here, one protid object is associated with one peropen object + yes + and the other way around, i.e. it's 1:1 + so the refcount for those should be identical + but i get a case where protid has a refcnt of 0 while the peropen + has 2 .. 
+ umm, that doesn't sound right + teythoon: ok, it does look like a race on np->references + node references are protected by a global lock in lib*fs libs + yes + you check it without holding it + which means another protid can be closed at the same time, setting + the flag on the underlying node + i'll make a proper patch soon + they cannot both hold the hash lock + hm + teythoon: actually, i don't see why that's relevant + one thread closes its protid, sets the fakeref flag + the other does the same, chokes on the assertion + serially + i'm always a little fuzzy when exactly the references get + decremented + but shouldn't only the second thread set the fakeref flag ? + well, that's not what i see + i'll check what happens to this ref counter + see how my release_protid function calls netfs_release_protid + just after the out label + *while holding the big hash lock + so, any refcounting should happen while the lock is being held, + no ? + perhaps + now, my logs show something new + a case where the protid being released was never printed before + i.e. not obtained from dir_lookup + or at least, not fakeroot dir_lookup + huh, where did it came from then ? + no idea + only dir_lookup hands out those + check_openmodes calls dir_lookup too + yes, but that's not our dir_lookup + that's what i mean + it bypasses fakeroot's custom dir_lookup + but i guess the reference already exists at this point + bypass ? i wouldn't call it that + you're right, wrong wording + it accesses files on other translators + yes + the netnode is already present + yes + could it be the root node ? + i do not believe so + the root node is always faked + and is handed out to the first process in the fakeroot env for + it's current directory port + so you could try something that chdirs away to test that + hypothesis + the assertion looks triggered by a chdir + how do you know that ? 
+ dh_auto_install: error: unable to chdir to build-deb + ah + well, or that is just the operation after fakeroot died and + completely unrelated + maybe + can you trigger this reliably ? + yes + i'm trying to write a shell script for that + so for you, fakeroot-hurd never succeeded in building a hurd + package ? + no + on darnassus ? + yes + b/c i stopped working on fakeroot-hurd when it was in a + good-enough shape to build the hurd package + >,< + maybe my system is not fast enough to hit this race (if it turns + out to be one) + some calls seems to decrease the refcount of the root node + call* + have you confirmed that it's the root node ? + almost + i could say yes + teythoon: actually no, it's not .. + could be .. + teythoon: on what node does fakeroot-hurd install the fakeroot + translator when used to build debian packages ? + hum + could it simply be that the check on np->references should be + moved above the assertion ? + braunr: it is not bound to any node, check settrans --chroot + oh right + teythoon: ok i mean + does it shadow / ? + looks very likely, otherwise the chroot wouldn't work + i'm not sure what you mean by shadow + settrans --chroot cmd -- / /hurd/fakeroot ? + but yes, for any process in the chroot-like env every real node + is replaced, including / + makes sense + teythoon: moving the assertion seems to fix the issue + intuitively, it seems reasonable to assume the fakeref flag can + only be set when there is only one reference, namely the fake reference + (well, the fake ref, recycled by the last open) + no, i don't follow + i'd still say, that if ...release_protid is called, then there + is no way for the fake flag to be set in the first place + that's why i put the assertion in ;) + on the other hand, you check the refcnt precisely because other + threads may have reacquired the node + but why would moving the assertion change anything ? 
+ if we would do that, we'd "lose" all threads that see + np->reference being >1 + but for those objects the fake_reference flag should never be + set anyways + i cannot see why this would help + (does it help ?) + (and if it does, it points to a serious problem imho) + i'm recreating the traces that made me think that + to get a clearer view of what's happening + the problem i have with the current code is this + there can be multiple protid referring to the same node occurring + at the same time + they are serialized by the hash table lock, ok + but there apparently are cases where the first (of two) protids + being closed sets the fakeref flag + and the following chokes because the flag is set + i assume you put this refcount check because you assumed only the + last protid being closed can set the flag, right ? + but then, why > 1 ? why not > 0 ? + yes, that's what i was trying to assert + b/c the 1 is our reference + which one exactly ? + >1 is anyone *beside* us + ? + hm + you mean the reference held by the protid being destroyed + yes + isn't that reference already dropped before calling the cleanup + function ? 
+ ah no, it's the node ref + yes + released by netfs_release_protid + exactly + which is called without the hash table lock held + hm no + it's locked + damn my brain is slow today + i actually think that it's the combination of manual reference + counting and the primitive concurrency model that makes it hard to reason + about this + well + the model is quite simple too + accesses to refcounters must be protected by the appropriate lock + this isn't done here, on the assumption that all referencing + operations are protected by another global lock all the time + even if a model is simple, this does not mean that it is a good + model for human beings to comprehend and reason about + i don't know + note that netfs_drop_node is designed to be called with + netfs_node_refcnt_lock locked + implying the refcount must remain stable between checking it and + dropping the node + netfs_make_peropen is called without the hash table lock held in + dir_lookup + and this increases the refcount + although the problem is rather that something decreases it without + the lock held + we should port libtsan and just ask gcc -fsanitize=thread + what about the netfs_nput call at the end of dir_lookup ? + the fake ref should be set by the norefs function + that should not decrease the count to 0 b/c the caller holds a + reference too + yes that's ugly + ugh + i'm unable to think clearly right now + as mentioned in the commit message, you cannot do something like + this in the norefs function + bbl ;) + bye teythoon + thanks for your time + for when you come back : + instead of maintaining this "fake" reference, why not assumeing + the hash table holds a reference, and simply count it + the same way a cache does + and drop that reference when removing a node, either to reflect + the current state of the underlying node, or because the translator is + being shut down ? 
+ why not assume* + bbl too + sure, refactoring is definitively an option + + +## IRC, freenode, #hurd, 2014-01-24 + + teythoon: ok, i'll take care of fakeroot + braunr: thanks. tbh i was a little fed up with that little + bugger >,< + i can imagine + considering the number of patches you've sent already + + teythoon: are you sure about your call to fshelp_lock_init ? + yes, why do you ask ? + (the test case is given in the commit message) + it doesn't look right to me to call "init" while the node is + potentially locked + i noticed libdiskfs peropen release function takes care of + releasing locks + it looks better to me + it's not about releasing the lock + it's about faking the file being closed which implicitly + releases the lock + the file is being close + closed + since it's in the cleanup function + yes, but we keep it b/c the file has faked attributes + did you look at the problem description in the commit message ? + we keep the node + not the peropen + so ? + the lock is in the node + why would libdiskfs do it in the peropen release then ? + there is an inconsistency somwhere + actually, the lock looks to be per open + or rather, the lock is per node, but its status is recorded per + open + allowing the implementation to track if a file descriptor was used + to install a lock and release it when that file descriptor goes away + why would the node be locked ? + locked in what way, file-locking locked ? + yes + posix explicitely says that file locks must be implicitely removed + when closing the file descriptor used to install them, so that makes + sense + isn't hat exactly what i'm doing ? 
+ no + you're initializing the file lock + init != unlock + and it's specific to fakeroot, while it looks like libnetfs should + be doing it + libnetfs would do it + but we prevent that by keeping the node alive + again, it's a per open thing + and no, libnetfs doesn't release locks implicitely in the current + version + didn't we agree that for fakeroot one peropen object is + associated with one protid object ? + yes + and don't keep those alive + so let them die peacefully, and fix libnetfs so it releases the + lock as it's supposed to + and we* don't + we don't keep those alive + why would we ? + yes that's what i wanted to say + what i mean is + since letting peropens die is already what is being done + there is no need for a special handling of locks in fakeroot + oh + on the other hand, libnetfs must be fixed + ok, that might very well be true + (we need to bring libnetfs and diskfs closer so that they can be + diff'ed easily) + i just wanted to check your reason for using lock_init in the + first place + yes .. + teythoon: also, i think we actually do have what's necessary to + deal with garbage collection + namely, dead-name notifications + i'll see if i can cook something simple enough + otherwise, merely keeping every node around is also acceptable + considering the use cases + dead-name notifications won't help if the real node disappears, + no ? 
+ teythoon: dead name notifications on the real node port :) + teythoon: at least i can reliably build the hurd package using + fakeroot-hurd now + let's try glibc :) + +## IRC, freenode, #hurd, 2014-01-25 + + braunr: awesome :) + teythoon: hm not sure :/ + darnassus got oom + teythoon: could be unrelated though + teythoon: something has apprently made /home unresponsive :( + teythoon: i suspect bots hitting apache and in particular the git + repositories to have increased memory usage + + +## IRC, freenode, #hurd, 2014-01-26 + + teythoon: btw, fakeroot interacts very very badly with other netfs + file systems + e.g., listing /proc through it creates lots of nodes + i'm not yet sure how to fix that + using a dead name notification doesn't seem appropriate (at least + not directly) because fakeroot holds a true reference that prevents the + deallocation of the target node + + +## IRC, freenode, #hurd, 2014-01-27 + + teythoon: good news (more or less): fakeroot is actually leaking a + lot when crossing file systems + which means if i fix that, there is a good chance we can use it to + build all packages with it + -with it + what do you mean exactly ? + if target nodes are from /, there is no such leak + as soon as the target nodes are from another file system, ports + rights are leaked + that's what fills the kernel allocator actually + oh, so dir_lookup leaks ports when crossing translator + boundaries ? 
+ seems so + yeah, that might very well be it + the dir_lookup logic in lib*fs is quite involved :/ + yes, my simple attempts were unsuccessful + but i'm confident i can fix it soon + that sounds good :) + i also remove the fake_ref flag and replace it with "accounting + the reference in the hash table" as soon as a node is faked + fine with me + these will be the expected leak + but they're far less in numbers than what i observe + and garbage collection can be implemented later + although i would prefer notifications a lot more + end of the news, bbl :) + found it :> + braunr: -v ;) + err = dir_lookup (...); + if (dir != dnp->nn->file) mach_port_deallocate (mach_task_self (), + dir); + in other words, deallocate ports for intermediate file system root + directories .. :) + teythoon: currently building hurd and glibc packages + but i intend to improve some more with the addition of a default + faked state + so that only nodes with modified faked states are retained + how do you mark nodes as having the default faked state ? + i don't + ok, right, makes sense :) + this sounds awesome, thanks for following up on this + i'm quite busy with other stuff so, with proper testing, it should + take me the week to get merged + teythoon: well thanks for all the fixes you've done + fakeroot was completely unusable before that + if you push your changes somewhere i'll integrate them into my + packages and test them + ok + implementing fakeroot -u could also be a good thing + and this should work easily with that default faked state strategy + + +## IRC, freenode, #hurd, 2014-01-28 + + teythoon: i should be able to test fakeroot-hurd with the default + faked attributes strategy today on glibc + braunr: very nice :) + azeem_: do you happen to know if fakeroot -u is used by debian ? + i mean when building packages + braunr: how does fakeroot-hurd perform on darnassus ? + i mean, does it yield a noticeable improvement over fakeroot-tcp + just like on my slow box ? 
+ i'm not measuring that :/ + ok, no problem + and since nodes are removed from the hash table, performance might + decrease slightly + but the number of rights is kept very low, as expected + that's good + i keep seeing leaks though + when switching cwd between file systems + humm + so i assume something is wrong with the identity of . or .. + it's so insignificant compared to the previous problems that i + won't waste time on that + teythoon: the problem with measuring on darnassus is that it's a + public machine + right + often scanned by ssh worms or http bots + +[[cannot_create__dev_null__interrupted_system_call]]. + + but it makes complete sense to get better performance with + fakeroot-hurd + that's actually one of the reasons i'm working on it + if not the main one + :) + that was my motivation too + it shows how you can get an interchangeable unix tool that + directly plugs well with the low level system + and make it work better + nicely put :) + + teythoon: i still can't manage to build glibc with fakeroot-hurd + but i'm not sure why :/ + there was no kernel memory exhaustion this time + :/ + cp: cannot create regular file `debian/libc-bin.dirs': Permission + denied + hum + youpi: do you know if building debian packages requires fakeroot + -u option ? + I don't know + braunr: man dpkg-buildpackage says it just runs "fakeroot + debian/rules " + sources confirm that + http://sources.debian.net/src/dpkg/1.17.6/scripts/dpkg-buildpackage.pl#L465 + gg0: ok + + +## IRC, freenode, #hurd, 2014-01-29 + + it seems that something sets the permissions of this + debian/libc-bin.dirs file to 000 ... + i've seen this too + oh + do you think it's a fakeroot-hurd bug ? + have i mentioned something like this in a commit message ? 
+ yes + it is + ok + i didn't see any mention of it + but i could have missed it + hm, i cannot recall it either + but i've seen this issue with fakeroot-hurd + ok + it's probably the last issue to fix to get it to work for our + packages + teythoon: i think i have a solution for that last mode bug + fakeroot doesn't relay chmod requests, unless they change an + executable bit + i don't see the point, and simply removed that condition to relay + any chmod request + braunr: did it work ? + no + fakeroot still consumes too many ports + and for each file, there are at least two ports, the faked one, + and the real one + it should be completely reworked + but i don't have time to do that + i'll see if it works when building from scratch + actually, it's not even a quantity problem but a fragmentation + problem + the function that fails is kmem_realloc .. + ipc spaces are arrays in kernel space .... + it's more like three ports per file, you forgot the identity + port + ah yes + + +## IRC, freenode, #hurd, 2014-02-03 + + teythoon: i'll commit my changes on fakeroot tonight + they do improve the tool, but not enough to build glibc with it + braunr: cool :), so how do we make it fully usable ? + teythoon: i don't know .. + i'll try re adding detection of nodes with no hard links for one + but imho, it needs a rework based on what the real fakeroot does + i won't work on it though + + teythoon: also, it looks like i've tested building glibc with a + wrong test binary of my fakeroot version :/ + so consider all test results irrelevant so far + + +## IRC, freenode, #hurd, 2014-02-04 + + fakeroot-hurd might turn out to be easily usable for our debian + packages with the fixed binary :) + + teythoon: hum, can you explain + 672005782e57e049c7c8f4d6d0b2a80c0df512b4 (trans: fix locking issue in + fakeroot) when you have time please ? 
+ it looks like it introduces a deadlock by calling new_node (which + acquires the hash table lock) while dir is locked, violating the hash + table -> node locking order + + braunr: awesome, then there still is hope for fakeroot-hurd :) + + teythoon: i've been able to build glibc packages several times + this night + so except for this deadlock i've seen once, it looks good + right + that deadlock + right, it does indeed violate the partial order of the locks :-/ + + teythoon: can you explain why you moved the lock in attempt_mkfile + please ? + + teythoon: i've just tested a fakeroot binary without the patch + introducing the deadlock, and glibc built without any problem + braunr: well, this is very good news :) + teythoon: but i still wonder why you made this patch in the first + place, i don't want to revert it blindly and reintroduce a potential + regression + braunr: i thought i was fixing the order in which locks were + taken. if the commit message does not specify that it fixes an issue, + then i was probably just wrong and you can revert it + oh ok + good + + teythoon: another successful build :) + i'll commit my changes + awesome :) + there might still be concurrency issues but it's much better + i'm curious what you did :) + so little :) + i was sick all week heh + you'll se + see + well, that's good actually ;) + yes + + teythoon: actually there was another debugging line left over, and + again, my test results are irrelevant @#! 
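The lock-order problem discussed above (calling `new_node`, which takes the hash table lock, while `dir` is still locked) can be illustrated with a small stand-alone sketch. This is not the fakeroot code: `ranked_lock`, `new_node_sketch` and the two `create_*` helpers are hypothetical names, and the rank check is only one possible way to make a "hash table -> node" ordering rule visible without actually deadlocking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical ranks: a lock with a lower rank must be taken first.
   This mirrors the "hash table -> node" order mentioned in the log. */
enum lock_rank { RANK_TABLE = 1, RANK_NODE = 2 };

struct ranked_lock
{
  pthread_mutex_t mutex;
  enum lock_rank rank;
};

/* Bit N is set while the current thread holds a lock of rank N. */
static _Thread_local unsigned held_mask;

/* Take the lock, or refuse (return false) if the caller already holds
   a lock of the same or a higher rank, i.e. if the order is violated. */
static bool
ranked_lock_acquire (struct ranked_lock *l)
{
  unsigned at_or_above = ~((1u << l->rank) - 1u);

  if (held_mask & at_or_above)
    return false;
  pthread_mutex_lock (&l->mutex);
  held_mask |= 1u << l->rank;
  return true;
}

static void
ranked_lock_release (struct ranked_lock *l)
{
  held_mask &= ~(1u << l->rank);
  pthread_mutex_unlock (&l->mutex);
}

static struct ranked_lock table_lock = { PTHREAD_MUTEX_INITIALIZER, RANK_TABLE };
static struct ranked_lock dir_lock = { PTHREAD_MUTEX_INITIALIZER, RANK_NODE };

/* Stand-in for a node constructor that registers the node in a hash
   table and therefore needs the table lock. */
static bool
new_node_sketch (void)
{
  if (!ranked_lock_acquire (&table_lock))
    return false;
  /* ... the hash table insertion would happen here ... */
  ranked_lock_release (&table_lock);
  return true;
}

/* Problematic shape: the table lock is requested while dir is locked. */
static bool
create_with_dir_locked (void)
{
  bool ok;

  assert (ranked_lock_acquire (&dir_lock));
  ok = new_node_sketch ();
  ranked_lock_release (&dir_lock);
  return ok;
}

/* Order-respecting shape: dir is released before the table lock is taken. */
static bool
create_after_unlocking_dir (void)
{
  assert (ranked_lock_acquire (&dir_lock));
  /* ... work that needs dir would happen here ... */
  ranked_lock_release (&dir_lock);
  return new_node_sketch ();
}
```

In this sketch `create_with_dir_locked ()` is refused by the rank check, while `create_after_unlocking_dir ()` succeeds. With plain mutexes and a second thread taking the locks in the documented order, the first shape is the one that can deadlock.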
+ + +## IRC, freenode, #hurd, 2014-02-05 + + teythoon: i got an assertion about nn->np->nn not being equal to + nn atfer the hash table lookup is dir_lookup + +failure + that's bad + not over yet + i had a couple of those too + i guess it's a use after free + yes + i used to poison the pointers and comment out the frees to track + them down iirc + teythoon: one of your patches stores netnodes instead of nodes in + the hash table, citing some overwriting issue + teythoon: i don't understand why using netnodes fixes this + braunr: libihash has this cookie for fast deletes + that has to be stored somewhere + the node structure has no room for it + uh + yes + it was that bad + ... + hence the uglyish back pointers + i see + looking back i cannot even say why it worked at all + well, it didn't + i believe libihash must have destroyed a linked list in the node + struct + possibly + no, it did not >,<, but for simple tests it kind of did + yes fakeroot sometimes corrupts memory badly .... + and yes, turns out the assertion is triggered on nodes with 0 refs + .. + teythoon: it looks like even the current version makes wrong usage + of the ihash interface + locp_offset is defined as "The offset of the location pointer from + the hash value" + and indeed, it's an intptr_t + teythoon: hm no, it looks ok actually, forget what i said :) + *phew + :p + + hmm, still occasional double frees in fakeroot, but it looks in + good shape for single threaded tasks like package building + + teythoon: i've just sent my fakeroot patches + braunr: sweet, i'll have a closer look tomorrow :) + teythoon: i couldn't debug the double frees though :/ + + +## IRC, freenode, #hurd, 2014-02-06 + + btw, i'm able to successfully use fakeroot-hurd to build glibc + packages, but is there a way to make sure the resulting archives contain + the right privileges and ownerships ? 
+ I don't remember whether debdiff checks permissions + + braunr: I've just got fakeroot-hurd debian/rules clean + dh_clean + fakeroot: ../../trans/fakeroot.c:161: netfs_node_norefs: Assertion + `np->nn->np == np' failed. + while building eglibc + youpi: yes, that lockup is most annoying... :/ + youpi: with the new version ? + yes + hum + i only had rare double frees, not that any more :/ + youpi: ok i got the error too + still not good enough + ok + + +## IRC, freenode, #hurd, 2014-02-07 + + youpi: debdiff seems to handle permissions + i've found the cause of the assertions + braunr: groovie :) + + +## IRC, freenode, #hurd, 2014-02-08 + + braunr: nice :) + http://darnassus.sceen.net/~rbraun/debdiff_report + + +## IRC, freenode, #hurd, 2014-02-10 + + and, on a completely different topic, here is a crash i can + reproduce when using fakeroot: + http://darnassus.sceen.net/~rbraun/fakeroot_hurd_rpctrace_o_var_tmp_out_rm_rf_dir.png + + +## IRC, freenode, #hurd, 2014-02-11 + + still working on fakeroot + there are still races (not disturbing for package building but + still ..) + there may be wrong right handling + i believe i have witnessed a fakeroot deadlock :/ + aw + not sure though, buildbot killed the build process before i + could investigate + teythoon: was it a big package ? + half of the hurd package + that's not a port right overflow then diff --git a/open_issues/wine.mdwn b/open_issues/wine.mdwn index f8bb469b..842442f1 100644 --- a/open_issues/wine.mdwn +++ b/open_issues/wine.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -78,3 +78,99 @@ allocation. There is kernel support for this,* however. and stack issues to be fixed for wine to run as braunr pointed out some months ago (IRC?) when we discussed wine. 
+ + +# IRC, freenode, #hurd, 2013-12-29 + + Hi, + http://www.gnu.org/software/hurd/open_issues/sendmsg_scm_creds.html seems + fixed in Debian GNU/Hurd 2013, do you know which patch they used? i + already asked in their channel, but well, there are only 18 people :) + Andre_H: it hasn't been fixed in Debian GNU/Hurd. Work is discussed + on the bug-hurd mailing list + youpi: thx for the info, i wonder why wine now works with some + hacks, but didn't in the past + I guess some circumvention patch was added to wine + does it actually really work, as in running applications for real? + (I've nevere tried) + youpi: i'm a wine developer and haven't seen circumventions for + hurd... i also just tried winelib apps last night, will try... let's say + powerpoint viewer today + Andre_H: How did you make wine run? I have patches for wine-1.4.1 + and 1.6.1 to build (so far unpublished), but it does not yet run + properly. + test case: wine notepad + gnu_srs: what's happening when you try that? + Andre_H: Currently it hangs at connect() (after creating the + /tmp/.wine1000/.../socket, etc, and starting again) + seems to be some problem with the HURD_DPORT_USE macro in eglibc, + investigation ongoing + gnu_srs: well, i'm using the debian distro, maybe you're on + something else? you could also pastebin your hacks, so i could have a + look. i'm about to clean mine up to send them upstream... ntdll will be + quite hard... + + +## IRC, freenode, #hurd, 2013-12-30 + + wine runs:) + It's just extremely slow.,.. + + gg0: please don't reopen #733604 , I've filed an updated one: + #7336045 + #733605 + gnu_srs: i've reassigned it from wine-1.6 (nonexistent) to wine + (correct), then to src:wine (more correct), but between such + reassignments you closed it so found command in the latter made it + reopening + then i realized you could mess up bugs on your own, without help :) + gg0: tks anyway, now it is src:wine and the title is right. Maybe + you should have noted me on IRC? 
+ + gnu_srs: what's your status about wine? i'm still about to get + things upstream... + Andre_H: see debian bug #733605 + + +## IRC, freenode, #hurd, 2013-12-31 + + gnu_srs: i didn't need the patches for + dlls/mountmgr.sys/diskarb.c, maybe due to missing headers + + +## IRC, freenode, #hurd, 2014-01-06 + + Wanted to note that + http://www.gnu.org/software/hurd/open_issues/wine.html is wrong about + socket credentials, afaik they are still not implemented but that doesn't + block Wine anymore + In fact all you need to run Wine are the patches followed by + https://source.winehq.org/patches/data/101439 (not yet upstream) or see + http://wiki.winehq.org/Hurd + + Andre_H: thanks for your report + np :) + braunr: can someone update + http://www.gnu.org/software/hurd/open_issues/wine.html please? + Andre_H: well, you can :) + log in with google -> check guidelines of your wiki -> try out + your wiki syntax -> laziness alarm :) + Andre_H: The reason why wine runs now is a bug in SCM_CREDS was + fixed, see the wine-devel ML. + + Andre_H: s/SCM_CREDS/SCM_RIGHTS/ + gnu_srs: already updated our wiki :) + gnu_srs: would you mind updating yours: + http://www.gnu.org/software/hurd/open_issues/wine.html :) + + gnu_srs: two commits for wine are in now :) + + +## IRC, freenode, #hurd, 2014-01-11 + + Andre_H: Looks like the two committed patches did not go into + wine-1.6.2:-( + Additionally, your PATH_MAX fixes was not accepted? + gnu_srs1: well, the stable branch is called stable because not + everything get's there :)7 + gnu_srs1: the PATH_MAX patch needs more thinking... 
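For readers unfamiliar with the `SCM_RIGHTS` mechanism mentioned in the log above, the following is a minimal sketch of passing a file descriptor over a UNIX-domain socket using the standard `sendmsg`/`recvmsg` ancillary-data interface. It is generic POSIX-style code, not taken from Wine or glibc, and it says nothing about what the actual fix looked like; `send_fd`, `recv_fd` and `fd_passing_roundtrip` are hypothetical helper names.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer large enough for one descriptor, suitably aligned. */
union fd_control
{
  char buf[CMSG_SPACE (sizeof (int))];
  struct cmsghdr align;
};

/* Send FD over the UNIX-domain socket SOCK.  Returns 0 on success. */
static int
send_fd (int sock, int fd)
{
  char payload = 'F';		/* at least one byte of real data */
  struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
  union fd_control control;
  struct msghdr msg;
  struct cmsghdr *cmsg;

  memset (&control, 0, sizeof control);
  memset (&msg, 0, sizeof msg);
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control.buf;
  msg.msg_controllen = sizeof control.buf;

  cmsg = CMSG_FIRSTHDR (&msg);
  cmsg->cmsg_level = SOL_SOCKET;
  cmsg->cmsg_type = SCM_RIGHTS;
  cmsg->cmsg_len = CMSG_LEN (sizeof (int));
  memcpy (CMSG_DATA (cmsg), &fd, sizeof (int));

  return sendmsg (sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from SOCK.  Returns it, or -1 on error. */
static int
recv_fd (int sock)
{
  char payload;
  struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
  union fd_control control;
  struct msghdr msg;
  struct cmsghdr *cmsg;
  int fd;

  memset (&msg, 0, sizeof msg);
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control.buf;
  msg.msg_controllen = sizeof control.buf;

  if (recvmsg (sock, &msg, 0) != 1)
    return -1;
  cmsg = CMSG_FIRSTHDR (&msg);
  if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET
      || cmsg->cmsg_type != SCM_RIGHTS
      || cmsg->cmsg_len != CMSG_LEN (sizeof (int)))
    return -1;
  memcpy (&fd, CMSG_DATA (cmsg), sizeof (int));
  return fd;
}

/* Pass the write end of a pipe across a socketpair, write through the
   received copy and read the byte back.  Returns 0 if it arrives. */
static int
fd_passing_roundtrip (void)
{
  int sv[2], pipefd[2], received;
  char c = 0;

  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) != 0 || pipe (pipefd) != 0)
    return -1;
  if (send_fd (sv[0], pipefd[1]) != 0)
    return -1;
  received = recv_fd (sv[1]);
  if (received < 0)
    return -1;
  if (write (received, "x", 1) != 1 || read (pipefd[0], &c, 1) != 1)
    return -1;
  close (received);
  close (pipefd[0]);
  close (pipefd[1]);
  close (sv[0]);
  close (sv[1]);
  return c == 'x' ? 0 : -1;
}
```

A program that depends on this path exercises it when handing descriptors between processes, so a self-contained round trip like `fd_passing_roundtrip ()` can serve as a quick check of whether descriptor passing works on a given system.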
diff --git a/open_issues/xattr.mdwn b/open_issues/xattr.mdwn index 558c93b7..c6b9d8f7 100644 --- a/open_issues/xattr.mdwn +++ b/open_issues/xattr.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -43,3 +44,12 @@ IRC, OFTC, #debian-hurd, 2012-03-18: notes to self: it seems our ext2 driver comes from linux 2.3.42 or so, and in linux 2.5.46 ext2/ext3 get xattr and acl support + + +# Test Cases + +## IRC, freenode, #hurd, 2013-12-06: + + for fakeroot t.xattr test fails, a known issue? + the test must probably be disabled + the hurd doesn't support extended attributes currently -- cgit v1.2.3