author | Thomas Schwinge <thomas@codesourcery.com> | 2014-02-26 12:32:06 +0100 |
---|---|---|
committer | Thomas Schwinge <thomas@codesourcery.com> | 2014-02-26 12:32:06 +0100 |
commit | c4ad3f73033c7e0511c3e7df961e1232cc503478 (patch) | |
tree | 16ddfd3348bfeec014a4d8bb8c1701023c63678f /open_issues | |
parent | d9079faac8940c4654912b0e085e1583358631fe (diff) |
IRC.
Diffstat (limited to 'open_issues')
53 files changed, 9878 insertions, 189 deletions
diff --git a/open_issues/64-bit_port.mdwn b/open_issues/64-bit_port.mdwn index edb2dccd..04273630 100644 --- a/open_issues/64-bit_port.mdwn +++ b/open_issues/64-bit_port.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -23,22 +23,8 @@ the [[microkernel/mach/gnumach/ports/Xen]] platform. <braunr> i guess it wouldn't be too hard to have a special mach kernel for 64 bits processors, but 32 bits userland only <youpi> well, it means tinkering with mig - <braunr> like old sparc systems :p - <youpi> to build the 32bit interface, not the 64bit one - <braunr> ah yes - <braunr> hm - <braunr> i'm not sure - <braunr> mig would assume a 32 bits kernel, like now - <youpi> and you'll have all kinds of discrepancies in vm_size_t & such - <braunr> yes - <braunr> the 64 bits type should be completely internal - <braunr> types* - <braunr> but it would be far less work than changing all the userspace bits - for 64 bit (ofc we'll do that some day but in the meanwhile ..) - <youpi> yes - <youpi> and it'd boost userland addrespace to 4GiB - <braunr> yes - <youpi> leaving time for a 64bit userland :) + +[[mig_portable_rpc_declarations]]. # IRC, freenode, #hurd, 2012-10-03 @@ -60,87 +46,7 @@ the [[microkernel/mach/gnumach/ports/Xen]] platform. <braunr> i think i'll go the second way with x15, so you'll have the two :) -# IRC, freenode, #hurd, 2012-12-12 - -In context of [[microkernel/mach/gnumach/memory_management]]. - - <tschwinge> Or with a 64-bit one? ;-P - <braunr> tschwinge: i think we all had that idea in mind :) - <pinotree> tschwinge: patches welcome :P - <youpi> tschwinge: sure, please help us settle down with the mig stuff - <youpi> what was blocking me was just deciding how to do it - <braunr> hum, what's blocking x86_64, except time to work on it ? 
- <youpi> deciding the mig types & such things - <youpi> i.e. the RPC ABI - <braunr> ok - <braunr> easy answer: keep it the same - <youpi> sorry, let me rephrase - <youpi> decide what ABI is supposed to be on a 64bit system, so as to know - which way to rewrite the types of the kernel MIG part to support 64/32 - conversion - <braunr> can't this be done in two steps ? - <youpi> well, it'd mean revamping the whole kernel twice - <youpi> as the types at stake are referenced in the whole RPC code - <braunr> the first step i imagine would simply imply having an x86_64 - kernel for 32-bits userspace, without any type change (unless restricting - to 32-bits when a type is automatically enlarged on 64-bits) - <youpi> it's not so simple - <youpi> the RPC code is tricky - <youpi> and there are alignments things that RPC code uses - <youpi> which become different when build with a 64bit compiler - <pinotree> there are also things like int[N] for io_stat_struct and so on - <braunr> i see - <youpi> making the code wrong for 32 - <youpi> thus having to change the types - <youpi> pinotree: yes - <pinotree> (doesn't mig support structs, or it is too clumsy to be used in - practice?) - <braunr> pinotree: what's the problem with that (i explcitely said changing - int to e.g. int32_t) - <youpi> that won't fly for some of the calls - <youpi> e.g. getting a thread state - <braunr> pinotree: no it doesn't support struct - <pinotree> braunr: that some types in struct stat are long, for instance - <braunr> pinotree: same thing with longs - <braunr> youpi: why wouldn't it ? 
- <youpi> that wouldn't work on a 64bit system - <youpi> so we can't make it int32_t in the interface definition - <braunr> i understand the alignment issues and that the mig code adjusts - the generated code, but not the content of what is transfered - <braunr> well of course - <braunr> i'm talking about the first step here - <braunr> which targets a 32-bits userspace only - <youpi> ok, so we agree - <youpi> the second step would have to revamp the whole RPC code again - <braunr> i imagine the first to be less costly - <braunr> well, actually no - <braunr> you're right, the mig stuff would be easy on the application side, - but more complicated on the kernel side, since it would really mean - dealing with 64-bits values there - <braunr> (unless we keep a 3/1 split instead of giving the full 4g to - applications) - -See also [[microkernel/mach/gnumach/memory_management]]. - - <youpi> (I don't see what that changes) - <braunr> if the kernel still runs with 32-bits addresses, everything it - recevies from or sends through mig can be stored with the user side - 32-bits types - <youpi> err, ok, but what's the point of the 64bit kernel then ? :) - <braunr> and it simply uses 64-bits addresses to deal with physical memory - <youpi> ok - <youpi> that could even be a 3.5/0.5 split then - <braunr> but the memory model forces us to run either at the low 2g or the - highest ones - <youpi> but linux has 3/1, so we don't need that - <braunr> otherwise we need an mcmodel=medium - <braunr> we could do with mcmodel=medium though, for a time - <braunr> hm actually no, it would require mcmodel=large - <braunr> hum, that's stupid, we can make the kernel run at -2g, and use 3g - up to the sign extension hole for the kernel map - - -# IRC, freenode, #hurd, 2013-07-02 +## IRC, freenode, #hurd, 2013-07-02 In context of [[mondriaan_memory_protection]]. @@ -157,8 +63,10 @@ In context of [[mondriaan_memory_protection]]. 
<braunr> as passed between userspace and kernel -# IRC, OFTC, #debian-hurd, 2013-10-05 +## IRC, OFTC, #debian-hurd, 2013-10-05 <dharc> and what about 64 bit support, almost done? <youpi> kernel part is done <youpi> MIG 32/64 trnaslation missing + +[[mig_portable_rpc_declarations]]. diff --git a/open_issues/anatomy_of_a_hurd_system.mdwn b/open_issues/anatomy_of_a_hurd_system.mdwn index a3c55063..33635b80 100644 --- a/open_issues/anatomy_of_a_hurd_system.mdwn +++ b/open_issues/anatomy_of_a_hurd_system.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -43,7 +43,11 @@ like Bushnell's Hurd paper. All this should be unfied and streamlined. <antrik> servers often depend on other servers for certain functionality -# IRC, freenode, #hurd, 2011-03-12 +# Bootstrap + +## [[hurd_init]] + +## IRC, freenode, #hurd, 2011-03-12 <dEhiN> when mach first starts up, does it have some basic i/o or fs functionality built into it to start up the initial hurd translators? @@ -76,6 +80,112 @@ like Bushnell's Hurd paper. All this should be unfied and streamlined. rest of the system up +## IRC, freenode, #hurd, 2014-01-03 + + <teythoon> hmpf, the hurd bootstrapping process is complicated and fragile, + maybe to the point that it is to be considered broken + <teythoon> aiui the hurd uses the filesystem for service lookup + <teythoon> older mach documentation suggests that there once existed a name + server instead for this purpose + <teythoon> the hurd approach is elegant and plan9ish + <teythoon> the problem is in the early bootstrapping + <teythoon> what if the root filesystem is r/o and there is no /servers or + /servers/exec ? + <teythoon> e. g. 
rm /servers/exec && reboot -> the rootfs dies early in the + hurd server bootstrap :/ + <braunr> well yes + <braunr> it's normal to have such constraints + <teythoon> uh no + <braunr> at the same time, the boot protocol must be improved, if only to + support userspace disk drivers + <teythoon> totally unacceptable + <braunr> why not ? + <teythoon> b/c my box just died and lost it's exec node + <braunr> so ? + <braunr> loosing the exec node is unacceptable + <youpi> well, linux dies too if you don't have /dev populated at least a + bit + <braunr> not being able to boot without the "exec" service is pretty normal + <braunr> the hurd turns the vfs into a service directory + <teythoon> the exec service is there, only the lookup mechanism is broken + <braunr> replacing the name server you mentioned earlier + <teythoon> yes + <braunr> if you don't have services, you don't have them + <braunr> i don't see the problem + <braunr> the problem is the lookup mechanism getting broken + <teythoon> ... that easily + <braunr> imagine a boot protocol based on a ramfs filled from a cpio + <teythoon> i do actually ;) + <braunr> there would be no reason at all the lookup mechanism would break + <teythoon> yes + <teythoon> but the current situation is not acceptable + <braunr> i agree + <teythoon> ^^ + <braunr> ext2fs is too unreliable for that + <braunr> but using the VFS as a directory is more than acceptable + <braunr> it's probably the main hurd feature + <teythoon> yes + <braunr> i see it rather as a circular dependency problem + <braunr> and if you have good ideas, i'm all ear for propel ... 
:> + <braunr> antrik already talked about some of them for the bootstrap + protocol + <braunr> we should sum them up somewhere if not done already + <teythoon> i've been pondering how to install a tmpfs translator as root + translator + <teythoon> braunr: we could create a special translator for /servers + <braunr> maybe + <teythoon> very much like fakeroot, it just proxies messages to a real + translator + <teythoon> but if operations like settrans fail, we handle them + transparently, like an overlay + <braunr> i consider /servers to be very close to /dev + <teythoon> yes + <braunr> so something like devfs seems obvious yes + <braunr> i don't even think there needs to be an overlay + <teythoon> y not ? + <braunr> why does /servers need real nodes ? + <teythoon> for persistence + <braunr> what for ? + <teythoon> e.g. crash server selection + <braunr> hm ok + <teythoon> network configuration + <braunr> i personally wouldn't make that persistent + <braunr> it can be configured in files and installed at boot time + <teythoon> me neither, but that's how it's currently done + <braunr> are you planning to actually work on that soon ? + <teythoon> if we need no persistence, we can just use tmpfs + <braunr> it wouldn't be a mere tmpfs + <teythoon> it could + <braunr> it's a tmpfs that performs automatic discovery and registration of + system services + <teythoon> with some special wrapper that preserves e.g. /servers/exec + <teythoon> oh + <braunr> so rather, devtmpfs + <teythoon> it is o_O :p + <braunr> ? + <braunr> what is what ? 
+ <teythoon> well, it could be a tmpfs and some utility creating the nodes + <braunr> whether the node management is merged in or separate doesn't + matter that much i guess + <braunr> i'd personally imagine it merged, and tmpfs available as a + library, so that stuff like sysfs or netstatfs can easily be written + + +## IRC, freenode, #hurd, 2014-02-12 + + <teythoon> braunr: i fixed all fsys-related receiver lookups in libdiskfs + and surely enough the bootstrap hangs with no indication whats wrong + <braunr> teythoon: use mach_print :/ + <teythoon> braunr: the hurd bootstrap is both fragile and hard to tweak in + interesting ways :/ + <braunr> teythoon: i agree with that + <braunr> teythoon: maybe this will help : + http://wiki.hurdfr.org/upload/graphviz/dot9b65733655309d059dca236f940ef37a.png + <braunr> although i guess you probably already know that + <teythoon> heh, unicode for the win >,< + <braunr> :/ + + # Source Code Documentation Provide a cross-linked sources documentation, including generated files, like @@ -311,6 +421,9 @@ Actually, the Hurd has never used an M:N model. Both libthreads (cthreads) and l <Tekk_> spiderweb: well, there's 1 advantage of minix for you :P <braunr> the main idea of mach is to make it easy to extend unix <braunr> without having hundreds of system calls + +[[/system_call]]. + <braunr> the hurd keeps that and extends it by making many operations unprivileged <braunr> you don't need special code for kernel modules any more @@ -539,6 +652,9 @@ Actually, the Hurd has never used an M:N model. Both libthreads (cthreads) and l <damo22> it must translate these system calls into ipc or something <damo22> then mach handles it? <braunr> exactly + +[[/system_call]]. + <braunr> that's why i say it's not the exokernel way of doing things <damo22> ok <damo22> so does every low level hardware access go through mach?' @@ -811,3 +927,403 @@ Actually, the Hurd has never used an M:N model. 
Both libthreads (cthreads) and l <braunr> ahungry: ctrl-c does work, you just missed something somewhere and are running a shell directly on a console, without a terminal to handle signals + + +# IRC, freenode, #hurd, 2013-11-04 + + <braunr> nalaginrut: you can't use the hurd for real embedded stuff without + a lot of work on it + <braunr> but the hurd design applies very well to embedded environments + <braunr> the fact that we're able to dynamically link practically all hurd + servers against the c library can visibly reduce the system code size + <braunr> it also reduces the TCB + <nalaginrut> what about the memory occupation? + <braunr> code size is about memory occupation + <teythoon> also, the system is composable like lego, don't need tcp - don't + include pfinet then + <braunr> the memory overheald of a capability based system like the hurd + are, well, capabilities + <braunr> teythoon: that's not an argument compared to modular kernels like + linux + <teythoon> yes it is + <braunr> why ? 
+ <braunr> if you don't need tcp in linux, you just don't load it + <braunr> same thing + <teythoon> ok, right + <braunr> on the other hand, a traditional unix kernel can never be linked + against the c library + <braunr> much less dynamically + <teythoon> right + <nalaginrut> I think the point is that it's easy to cut, since it has + better modularity than monolithic, and could be done in userland relative + easier + <braunr> modularity isn't better + <braunr> that's a big misconception + <teythoon> also, restarting components is easier on a distributed system + <braunr> on the hurd, this is a side effect + <braunr> and it doesn't apply well + <nalaginrut> braunr: oops, misconception + <braunr> many core servers such as proc, auth, exec, the root fs server + can't be restarted at all + <teythoon> not yet + <braunr> and servers like pfinet can be restarted, but at the cost of posix + servers not expecting that + <braunr> looping on errors such as EBADF because the target socket doesn't + exist any more + <teythoon> I've been working on a restartable exec server during some of my + gsoc weekends + <braunr> ah right + <braunr> linux has kexec + <braunr> and can be patched at run time + <nalaginrut> sounds like Hurd needs something similar to generalizable + continuation + <braunr> so again, it's not a real advantage + <braunr> no + <nalaginrut> sorry serilizable + <braunr> that would persistence + <braunr> personally, i don't want it at all + <teythoon> yes it is a real advantage, b/c the means of communication + (ports) is common to every IPC method on Hurd, and ports are first class + objects + <teythoon> so preserving the state is much easier on Hurd + <braunr> if a monolithic kernel can do it too, it's not a real advantage + <teythoon> yes, but it is more work + <braunr> that is one true advantage of the hurd + <braunr> but don't reuse it each time + <nalaginrut> oh, that's nice for the ports + <teythoon> why not? 
+ <braunr> what we're talking about here is resilience + <braunr> the fact that it's easier to implement doesn't mean the hurd is + better because it has resilience + <braunr> it simply means the hurd is better because it's easier to + implement things on it + <braunr> same for development in general + <braunr> debugging + <braunr> virtualization + <braunr> etc.. + <nalaginrut> yes, but why we stick to compare it to monolithic + <braunr> but it's still *one* property + <teythoon> well, minix advertises this feature a lot, even if minix can + only restart very simple things like printer servers + <braunr> minix sucks + <braunr> let them advertise what they can + <teythoon> ^^ + <nalaginrut> it has cool features, that's enough, no need to find a feature + that monolithic can never done + <braunr> no it's not enough + <braunr> minix isn't a general purpose system + <braunr> let's just not compare it to general purpose systems + + +# IRC, freenode, #hurd, 2013-11-08 + + <teythoon> and, provided you have suitable language bindings, you can + replace almost any hurd server with your own implementation in any + language + <crocket> teythoon: language bindings? + <crocket> Do you mean language bindings against C libraries? + <teythoon> either that or for the low level mach primitives + <crocket> For your information, IPC is independent of languages. + <teythoon> sure, that's the beauty + <crocket> Why is hurd best for replacing parts written in C with other + languages? 
+ <teythoon> because Hurd consists of many servers, each server managing one + kind of resource + <teythoon> so you have /hurd/proc managing posix processes + <teythoon> you could reimplement /hurd/proc in say python or go, and + replace just that component of the Hurd system + <teythoon> you cannot do this with any other (general purpose) operating + system that I know of + <teythoon> you could incrementally replace the Hurd with your own + Hurd-compatible set of servers written in X + <teythoon> use a language that you can verify, i.e. prove that a certain + specification is fulfilled, and you end up with an awesome stable and + secure operating system + <crocket> Any microkernel OS fits the description. + <crocket> teythoon, Does hurd have formal protocols for IPC communications? + <teythoon> sure, name some other general purpose and somewhat + posix-compatible microkernel based operating system please + <teythoon> what do you mean by formal protocols ? + <crocket> IPC communications need to be defined in documents. + <teythoon> the "wire" format is specified of course, the semantic not so + much + <crocket> network protocols exist. + <crocket> HTTP is a transport protocol. + <crocket> Without formal protocols, IPC communications suffer from + debugging difficulties. + <crocket> Formal protocols make it possible to develop and test each module + independently. + <teythoon> as I said, the wire format is specified, the semantics only in + written form in the source + <teythoon> this is an example of the ipc specification for the proc server + http://git.savannah.gnu.org/cgit/hurd/hurd.git/tree/hurd/process.defs + <crocket> teythoon, how file server interacts with file clients should be + defined as a formal protocol, too. + <teythoon> do you consider the ipc description a kind of formal protocol ? + <crocket> + http://git.savannah.gnu.org/cgit/hurd/hurd.git/tree/hurd/process.defs can + be considered as a formal protocol. 
+ <crocket> However, the file server protocol should be defined on top of IPC + protocol. + <teythoon> the file server protocol is in fs.defs + <teythoon> every protocol spoken is defined in that ipc description + language + <teythoon> it is used to derive code from + <braunr> crocket: not any system can be used to implement system services + in any language + <braunr> in theory, they do, but in theory only + <braunr> the main reason they don't is because most aren't posix compliant + from the ground up + <braunr> posix compliance is achieved through virtualization + <braunr> which isolates services too much for them to get useful, + notwithstanding the impacts on performance, memory, etc.. + <crocket> braunr, Do you mean it's difficult to achieve POSIX compliance + with haskell? + <braunr> crocket: i mean most l4 based systems aren't posix + <braunr> genode isn't posix + <braunr> helenos is by design not posix + <braunr> the hurd is the only multi server system providing such a good + level of posix conformance + <braunr> and with tls on the way, we'll support even more non-posix + applications that are nonetheless very common on unices because of + historical interfaces still present, such as mcontext + <braunr> and modern ones + <braunr> e.g. ruby is now working, go should be there after tls + * teythoon drools over the perspective of having go on the Hurd... + <crocket> braunr, Is posix relevant now? 
+ <braunr> it's hugely relevant + <braunr> conforming to posix and some native unix interfaces is the only + way to reuse a lot of existing production applications + <braunr> and for the matter at hand (system services not written in c), it + means almost readily getting runtimes for other languages than c + <braunr> something other microkernel based system will not have + <braunr> imagine this + <braunr> one day, one of us could create a company for a hurd-like system, + presenting this idea as the killer feature + <braunr> by supporting posix, customers could port their software with very + little effort + <braunr> *very little effort* is what makes software attractive + <crocket> + http://stackoverflow.com/questions/1806585/why-is-linux-called-a-monolithic-kernel/1806597#1806597 + says "The disadvantage to a microkernel is that asynchronous IPC + messaging can become very difficult to debug, especially if fibrils are + implemented." + <crocket> " GNU Hurd suffers from these debugging problems (reference)." + <braunr> stackoverflow is usually a nice place + <braunr> but concerning microkernel stuff, you'll read a lot of crap + anywhere + <braunr> whether it's sync or async, tracking references is a hard task + <braunr> it's a bit more difficult in distributed systems, but not that + much if the proper debugging features are provided + <braunr> we actually don't suffer from that too much + <braunr> many of us have been able to debug reference leaks in the past, + without too much trouble + <braunr> we lack some tools that would give us a better view of the system + state + <crocket> braunr, But is it more difficult with microkernel? + <braunr> crocket: it's more difficult with distributed systems + <crocket> How much more difficult? + <braunr> i don't know + <crocket> distributed systems + <braunr> not much + <crocket> braunr, How do you define distributed systems? + <braunr> crocket: not monolithic + <crocket> braunr, Hurd is distributed, then. 
+ <braunr> multiserver if you prefer + <braunr> yes it is + <crocket> braunr, So it is more difficult with hurd. + <crocket> How much more difficult? How do you debug? + <braunr> just keep in mind that a monolithic system can run on a + microkenrel + <braunr> we use tools that show us references + <crocket> braunr, like? + <braunr> like portinfo + <crocket> braunr, Does hurd use unix-socket to implement IPC? + <braunr> no + <braunr> unix-socket use mach ipc + <crocket> I'm confused + <braunr> ipc is provided by the microkernel, gnumach (a variant of mach) + <braunr> unix sockets are provided by one of the hurd servers (pflocal) + <braunr> servers and clients communicate through mach ipc + <crocket> braunr, Do you think it's feasible to build servers in haskell? + <braunr> why not ? + <crocket> ok + <teythoon> I've been thinking about that + <teythoon> in go, with cgo, you can call go functions from c code + <teythoon> so it should be possible to create bindings for say libtrivfs + <crocket> I'd like to write an OS in clojure or haskell. + <braunr> crocket: what for ? + <crocket> braunr, I want to see a better system programming language than + C. + <braunr> i don't see how clojure or haskell would be "better system + programming languages" than c + <braunr> and even assuming that, what for ? + <crocket> braunr, It's better for programmers. + <crocket> haskell + <crocket> haskell is expressive. + <braunr> personally i disagree + <braunr> it's better for some things + <braunr> not for system programming + <gnufreex> For system programming, Google Go is trying to replace C. But I + doubt it will. + <braunr> we may not be referring to the same thing here when we say "system + programming" + <crocket> braunr, What do you think is a better one? + <braunr> crocket: i don't think there is a better one currently + <crocket> braunr, Even Rust and D? + <braunr> i don't know them well enough + <braunr> certainly not D if it's what i think it is + <crocket> C is too slow. 
+ <crocket> C is too slow to develop. + <braunr> depends + <braunr> again, i disagree + <braunr> rust looks good but i don't know it well to comment + <crocket> C is a tank, and clojure is an airplane. + <crocket> A tank is reliable but slow. + <crocket> Clojure is fast but lacks some accuracy. + <braunr> c is as reliable as the developer is skilled with it + <braunr> it's clearly not a tank + <braunr> there are many traps + <gnufreex> crocket: are you suggesting to rewrite Hurd in Clojure? + <crocket> no + <crocket> Why rewrite hud? + <crocket> hurd + <crocket> I'd rather start from scratch. + <braunr> which is what a rewrite is + <gnufreex> I am not expert on Clojure, but I don't think it is made for + system programming. + <gnufreex> If you want alternate language, I thing Go is only serious + candidate other than C + <crocket> Or Rust + <crocket> However, some people wrote OSes in haskell. + <braunr> again, why ? + <braunr> if it's only for the sake of using another language, i think it's + bad reason + <crocket> Because haskell provides a high level of abstraction that helps + programmers. + <crocket> It is more secure with monads. + <gnufreex> If you want your OS to become successful Free Software project, + you have to use popular language. Haskell is not. + <gnufreex> Most Haskell programmers are not into kernels + <gnufreex> They do high level stuff. + <gnufreex> So little contributors. + <braunr> crocket: so you aim at security ? + <gnufreex> I mean, candidats for contribution + <crocket> braunr, security and higher abstraction. + <braunr> i don't understand higher abstraction + <crocket> braunr, FP can be useful to systems. + <braunr> FP ? + <neal> functional programming + <braunr> right + <braunr> but you can abstract a lot with c too, with more efforts + <crocket> braunr, like that's easy. 
+ <braunr> it's not that hard + <braunr> i'm just questioning the goals and the solution of using a + particular language + <braunr> the reason c is still the preferred language for system + programming is because it provides control over how the hardware does + stuff + <braunr> which is very important for performance + <braunr> the hurd never took off because of bad performance + <braunr> performance doesn't mean doing things faster, it means being able + to do things or not, or doing things a new way + <braunr> so ok, great, you have your amazing file system written in + haskell, and you find out it doesn't scale at all beyond some threshold + of processors or memory + <crocket> braunr, L4 is fast. + <braunr> l4 is merely an architecture abstraction + <braunr> and it's not written in haskell :p + <braunr> don't assume anything running on top of something fast will be + fast + <crocket> Hurd is slow and written in C. + <braunr> yes + <braunr> not because of c though + <crocket> Becuase it's microkernel? + <braunr> because c wasn't used well enough to make the most of the hardware + in many places + <braunr> far too many places + <crocket> A microkernel can be as fast as a monolithic kernel according to + L4. + <braunr> no + <braunr> it can't + <braunr> it can for very specific cases + <braunr> almost none of which are real world + <braunr> but that's not the problem + <braunr> again, i'm questioning your choice of another language in relation + to your goals, that's all + <braunr> c can do things you really can't do easily in other languages + <braunr> be aware of that + <crocket> braunr, "Monolithic kernel are faster than microkernel . while + The first microkernel Mach is 50% slower than Monolithic kernel while + later version like L4 only 2% or 4% slower than the Monolithic kernel ." 
+ <braunr> 14:05 < braunr> but concerning microkernel stuff, you'll read a + lot of crap anywhere + <braunr> simple counterexample : + <braunr> the measurements you're giving consider a bare l4 kernel with + nothing on top of it + <braunr> doing thread-to-thread ipc + <braunr> this model of communication is hardly used in any real world + application + <braunr> one of the huge features people look for with microkernels are + capabilities + <braunr> and that alone will bump your 4% up + <braunr> since capabilities will be used for practically every ipc + <crocket> ok + + +# Hurd From Scratch + +## IRC, freenode, #hurd, 2013-11-30 + + <hurdmaster> because I think there is no way to understand the whole pile, + you need to go step by step + <hurdmaster> for example, I'm starting with mach only, then adding one + server, then another and on each step I have working system + <hurdmaster> that's how I want to understand it + <teythoon> you are interested in the early bootstrapping of the hurd system + ? + <hurdmaster> now I'm starting debian gnu/mach, it hungs, show me black + screen and I have no idea how to fix it + <teythoon> if you are unable to fix this, why do you think you can build a + hurd system from scratch ? + <hurdmaster> not gnu/mach, gnu/hurd I mean + <teythoon> or, you could describe your problem in more detail and one of + the nice people around here might help you ;) + <hurdmaster> as I said, it will be easier to understand and fix bugs, if I + will go step by step, and I will be able to see where bugs appears + <hurdmaster> so you should help me with that + <teythoon> and I tend to disagree + <teythoon> but you could always read my blog. 
you'll learn lots of things + about bootstrapping a hurd system + <teythoon> but it's complicated + <hurdmaster> http://www.linuxfromscratch.org/ + <teythoon> also, you'll need at least four hurd servers before you'll + actually see much + <teythoon> five + <teythoon> yeah, i know lfs + <hurdmaster> if somebody is interested in creating such a project, let me + know + <teythoon> you seem to be interested + <hurdmaster> yes, but I need the a real hurd master to help me + <teythoon> become one. fix your system and get to know it + <hurdmaster> I need knowledge, somebody built the system but didn't write + documentation about it, I have to extract it from your heads + <teythoon> hurdmaster: extract something from here + http://teythoon.cryptobitch.de + <teythoon> I need my head ;) + <hurdmaster> thanks + <hurdmaster> okay, what's the smallest thing I can run? + <teythoon> life of a Hurd system starts with the root filesystem, and the + exec server is loaded but not started + <teythoon> you could get rid of the exec server and replace the root + filesystem with your own program + <teythoon> statically linked, uses no unix stuff, only mach stuff + <hurdmaster> can I get 'hello world' on pure mach? + <teythoon> you could + <teythoon> hurdmaster: actually, here it is: + http://darnassus.sceen.net/gitweb/rbraun/mach_print.git/ + <teythoon> compile it statically, put it somewhere in /boot + <teythoon> make sure you're running a debug kernel + <teythoon> load it from grub instead of /hurd/ext2fs.static + <teythoon> look at the grub config for how this is done + <teythoon> let me know if it worked ;) diff --git a/open_issues/boehm_gc.mdwn b/open_issues/boehm_gc.mdwn index 8cd2415a..2913eea8 100644 --- a/open_issues/boehm_gc.mdwn +++ b/open_issues/boehm_gc.mdwn @@ -528,6 +528,12 @@ restults of GNU/Linux and GNU/Hurd look very similar. 
<congzhang> and maybe c# hello world translate another day :) +### IRC, freenode, #hurd, 2013-12-16 + + <braunr> gnu_srs: ah, libgc + <braunr> there are signal-related problems with libgc + + ## Leak Detection ### IRC, freenode, #hurd, 2013-10-17 diff --git a/open_issues/bpf.mdwn b/open_issues/bpf.mdwn index 02dc7f87..d051c2d8 100644 --- a/open_issues/bpf.mdwn +++ b/open_issues/bpf.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2009, 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2009, 2012, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -593,3 +594,61 @@ In context of the [[select]] issue. <braunr> i understand now why my bpf translator was so buggy <braunr> the condition_timedwait i wrote at the time was .. incomplete :) + + +## IRC, freenode, #hurd, 2014-02-04 + + <teythoon> btw, why is there a bpf filter in gnumach ? + <teythoon> braunr: didn't you put it there ? + <braunr> teythoon: ah yes i did + <braunr> teythoon: i completed the work of a friend + <braunr> teythoon: the original filters in mach were netf filters + <braunr> teythoon: we added bpf so that libpcap could directly upload them + to the kernel + <braunr> in order to apply filters as close as possible to the packet + source and save copies + <teythoon> so they were used with the in-kernel network drivers ? + <braunr> only by experimental code and pfinet which sets a + receive-all-inet4/6 filter + <braunr> i also have a pcap-hurd.c file for libpcap but integration is a + bit tricky because of netdde + <braunr> maybe i could work on it again some day + <braunr> it should be easy to get into the debian package at least + <teythoon> so they can still be used with a netdde-based driver ? 
+ <braunr> i'm not sure
+ <braunr> the pcap-hurd.c file i wrote uses the libpcap bpf filter
+ <teythoon> oh, ok, i misinterpreted what you said wrt netdde
+ <braunr> the problem caused by netdde is about where to get packets from,
+ but devnode should take care of that
+ <teythoon> did you mean that the integration is tricky b/c when netdde is
+ used, a different approach is necessary and that would have to be
+ detected at runtime ?
+ <braunr> something like that
+ <teythoon> right
+ <braunr> i didn't want to detect anything
+ <teythoon> right
+ <braunr> i was waiting for things to settle but netdde is still debian only
+ <braunr> but that's ok, this oculd be a debian only patch for now
+ <teythoon> so is eth-filter the netdde equivalent or am i getting a wrong
+ picture here ?
+ <braunr> i don't know
+ <teythoon> it seems to implement bpf filters as well
+ <braunr> it could very well be
+ <braunr> whatever the driver, pfinet must be able to install a filter
+ <braunr> even if it's almost a catch-all
+ <teythoon> i guess it could start a eth-filter and use this, why not
+ <braunr> sure
+
+
+### IRC, freenode, #hurd, 2014-02-06
+
+ <antrik> teythoon: the BPF filter in Mach can also be used by
+ eth-multiplexer or eth-filter when running on in-kernel network
+ drivers... in fact the implementation was finished by the guy who created
+ eth-multiplexer; it was not fully working before
+ <antrik> it's not useful at all when using netdde I believe
+ <antrik> teythoon: IIRC eth-filted both relies on BPF being implemented by
+ the layer below it (whatever it is) to do the actual filtering, as well
+ as implements BPF itself so any layer on top of it can in turn use BPF
+ <antrik> netdde should provide BPF filters too I'd say...
but don't
+ remember for sure
diff --git a/open_issues/cannot_create__dev_null__interrupted_system_call.mdwn b/open_issues/cannot_create__dev_null__interrupted_system_call.mdwn
new file mode 100644
index 00000000..b0f14a17
--- /dev/null
+++ b/open_issues/cannot_create__dev_null__interrupted_system_call.mdwn
@@ -0,0 +1,193 @@
+[[!meta copyright="Copyright © 2013, 2014 Free Software Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+[[!tag open_issue_hurd]]
+
+
+# IRC, freenode, #hurd, 2013-12-05
+
+ <teythoon> Creating device nodes: fd fdX std vcs hdX hdXsY hdXs1Y sdX sdXsY
+ sdXs1Y cdX netdde ethX loopX ttyX ptyp ptyq/sbin/MAKEDEV: 75:
+ /sbin/MAKEDEV: cannot create /dev/null: Interrupted system call
+ <teythoon> that's new
+ <braunr> teythoon: ouch
+ <teythoon> braunr: everything works fine though
+ <braunr> teythoon: that part isn't too surprising
+ <teythoon> y?
+ <braunr> teythoon: /dev/null already existed, didn't it ?
+ <teythoon> braunr: sure, yes
+
+
+## IRC, freenode, #hurd, 2013-12-19
+
+ <braunr> hm
+ <braunr> i'm seeing those /sbin/MAKEDEV: cannot create /dev/null:
+ Interrupted system call messages too
+
+
+## IRC, freenode, #hurd, 2013-12-20
+
+ <teythoon> braunr: interesting, I've seen some of those as well
+
+
+## IRC, freenode, #hurd, 2014-01-26
+
+ <gg0> cannot create /dev/null: Interrupted system call
+ <gg0>
+ http://gnashdev.org:8010/builders/z-sid-hurd-i386/builds/26/steps/system_upgrade/logs/stdio
+
+
+## IRC, freenode, #hurd, 2014-01-27
+
+ <anatoly> gg0: I had same /dev/null error after upgrading my old image
+ (more than 6 months old) a week ago. But I got such message only on boot
+ and it didn't autostart hurd console.
+ <anatoly> Tried to upgrade current qemu image (from topic) to reproduce it
+ but it works OK after upgrade
+ <gg0> i can reproduce it with # apt-get install --reinstall python2.7 dbus
+ # for instance
+ <gg0> http://paste.debian.net/plain/78566/
+ <teythoon> gg0: i've seen those as well, but i cannot reliably reproduce it
+ to track it down
+ <teythoon> i believe it's benign though
+ <gg0> in shell scripts if -e is set, it aborts on failures like those
+ <teythoon> uh, it does? :/
+ <gg0> so if this happens in prerm/postinst scripts, package is not properly
+ installed/removed/configured and it fails
+ <gg0> redirecting stdout and strerr to /dev/null shouldn't be so
+ problematic, anything wrong in my setup?
+ <gg0> can you reproduce it?
+ <teythoon> not reliably
+ <teythoon> gg0: but i do not believe that anything is wrong with your
+ machine
+ <gg0> any way to debug it?
+ <teythoon> having a minimal test case that triggers this reliably would be
+ great
+ <teythoon> but i fear it might be a race
+
+
+## IRC, freenode, #hurd, 2014-01-28
+
+ <teythoon> have you seen the /dev/null issue ?
+ <braunr> yes
+ <teythoon> what do you make of it ?
+ <braunr> no idea
+ <teythoon> i believe it is related to the inlining work i've done
+ <braunr> just like the bogus deallocation at boot, it needs debugging :)
+ <braunr> hm i don't think so
+ <teythoon> no ?
+ <braunr> i think we saw it even before your started working on the hurd ;p
+ <teythoon> i've never seen it before my recent patches
+ <teythoon> maybe i made it worse
+ <braunr> not worse, just exposed more
+ <teythoon> right
+
+
+## IRC, freenode, #hurd, 2014-01-29
+
+ <gg0> cannot reproduce "cannot create /dev/null: Interrupted system call"
+ on a faster VM
+ <gg0> might depend on that?
+
+
+## IRC, OFTC, #debian-hurd, 2014-02-02
+
+ <pere> but now saw a strange message at the end of the boot:
+ /etc/init.dhurd-console: 55: /etc/init.d/hurd-console: cannot create
+ /dev/null: Interrupted system call
+ <gg0> oh well known on a slow VM (even old qemu/kvm btw), i can't reproduce
+ it on a faster/more recent one
+ <gg0> slow VM = gnash buildbot slave
+ http://gnashdev.org:8010/builders/z-sid-hurd-i386/builds/26/steps/system_upgrade/logs/stdio
+ <gg0> especially bad on system upgrade because it doesn't finish to run
+ prerm/postinst scripts :/
+
+
+## IRC, freenode, #hurd, 2014-02-05
+
+ <gg0> Creating device nodes: fd fdX std vcs hdX hdXsY/sbin/MAKEDEV: 75:
+ /sbin/MAKEDEV: cannot create /dev/null: Interrupted system call hdXs1Y
+ sdX sdXsY sdXs1Y cdX netdde ethX loopX ttyX ptyp ptyq lprX comX random
+ urandom kbd mouse shm.
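
Editorial sketch, not part of the logs: gg0's remark that a failed redirection aborts scripts running under `set -e` (which is why prerm/postinst scripts stop half-way) can be reproduced without the Hurd-specific EINTR. The snippet below assumes a POSIX `sh` on the PATH and uses a hypothetical unwritable path, `BAD_TARGET`, as a stand-in for the failing `/dev/null` open; the helper name `run_script` is likewise made up for illustration.

```python
import subprocess

# Hypothetical path whose parent directory does not exist, so the
# redirection fails. It only stands in for the failing /dev/null open.
BAD_TARGET = "/nonexistent-dir-for-demo/out"


def run_script(errexit):
    """Run a two-line sh script and return (exit status, stdout)."""
    script = f"echo discarded > {BAD_TARGET}\necho reached\n"
    if errexit:
        # Same option that maintainer scripts commonly enable.
        script = "set -e\n" + script
    proc = subprocess.run(["sh", "-c", script],
                          capture_output=True, text=True)
    return proc.returncode, proc.stdout
```

Calling `run_script(errexit=False)` should show the second `echo` still running after the "cannot create" error, while `run_script(errexit=True)` should stop at the failed redirection with a non-zero status, which is the behaviour that leaves packages unconfigured.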
+
+
+## IRC, freenode, #hurd, 2014-02-11
+
+ <gg0> typical dist-upgrade http://paste.debian.net/plain/81346/
+ <gg0> many fewer cannot create /dev/null: Interrupted system call
+ <gg0> on a faster machine
+ <teythoon> gg0: wow, so many interrupted system call messages
+ <teythoon> i don't get as many, but makedev produces a few every time i run
+ it as well
+
+
+## IRC, OFTC, #debian-hurd, 2014-02-16
+
+ <pere> anyone here got any idea why upgrading initscripts fail on the hurd
+ gnash autobuilder, as reported on <URL:
+ http://gnashdev.org:8010/builders/z-sid-hurd-i386/builds/28/steps/system_upgrade/logs/stdio
+ >?
+ <gg0> pere: cannot create /dev/null: Interrupted system call
+ <pere> gg0: I noticed the message, but fail to understand how this could
+ happen.
+ <gg0> 13:16 < gg0> oh well known on a slow VM (even old qemu/kvm btw), i
+ can't reproduce it on a faster/more recent one
+ <gg0> 13:17 < gg0> slow VM = gnash buildbot slave
+ http://gnashdev.org:8010/builders/z-sid-hurd-i386/builds/26/steps/system_upgrade/logs/stdio
+ <gg0> 13:18 < gg0> especially bad on system upgrade because it doesn't
+ finish to run prerm/postinst scripts :/
+ <gg0> i remember teythoon talking about something racy
+ <teythoon> gg0: the /dev/null issue is known for a long time
+ <teythoon> gg0: some of the recent work (i believe mine) has made the
+ problem more apparent
+ <teythoon> gg0: that's what braunr told me
+ <gg0> i see. it would be really nice fixing it. really annoying. i
+ workaround it by moving null away and moving it back under /dev before
+ halting/rebooting
+
+
+## IRC, freenode, #hurd, 2014-02-17
+
+ <tschwinge> Earlier today, I upgraded my Debian GNU/Hurd installation from
+ several months ago, and I'm now seeing bogus things as follows; is that a
+ known issue?
+ <tschwinge> checking for i686-unknown-gnu0.5-ar...
ar
+ <tschwinge> configure: updating cache ./config.cache
+ <tschwinge> configure: creating ./config.status
+ <tschwinge> +./config.status: 299: ./config.status: cannot create
+ /dev/null: Interrupted system call
+ <tschwinge> config.status: creating Makefile
+ <tschwinge> (The plus is from a build log diff.)
+ <azeem> 13:36 < gg0> pere: cannot create /dev/null: Interrupted system call
+ <azeem> 20:10 < teythoon> gg0: the /dev/null issue is known for a long time
+ <tschwinge> Anyone working on resolving this? I't causing build issues:
+ <tschwinge> checking for i686-unknown-gnu0.5-ranlib... (cached) ranlib
+ <tschwinge> checking command to parse nm output from gcc-4.8
+ object... [...]/opcodes/configure: 6760: ./configure.lineno: cannot
+ create /dev/null: Interrupted system call
+ <tschwinge> failed
+ <tschwinge> checking for dlfcn.h... yes
+ <tschwinge> Anyway, will go researching IRC logs.
+ <azeem> tschwinge: (that one was from #debian-hurd)
+ <azeem> I assume teythoon and/or braunr can comment once he's back
+ <azeem> they're*
+ <braunr> tschwinge: we've been seing this more often lately but noone has
+ attempted to fix it yet
+ <braunr> tschwinge: if you have a reliable way to reproduce that /dev/null:
+ Interrupted system call error, please let us know
+
+
+## IRC, freenode, #hurd, 2014-02-23
+
+ <gg0> braunr: cool.
i'd vote /dev/null one as next one in your todo
+ <gg0> still frequent on this slow vm
+ http://gnashdev.org:8010/builders/z-sid-hurd-i386/builds/30/steps/system_upgrade/logs/stdio
+ <gg0> especially during setup-translators -k
+ <braunr> yes
diff --git a/open_issues/clock_gettime.mdwn b/open_issues/clock_gettime.mdwn
index 65ab52df..baa21bbb 100644
--- a/open_issues/clock_gettime.mdwn
+++ b/open_issues/clock_gettime.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2013, 2014 Free Software Foundation,
+Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -158,6 +159,9 @@ In context of [[select]].
<braunr> my brain can't correctly compute variable sized types in mig definition files
<braunr> i wanted something that would remain correct for the 64-bit port
+
+[[64-bit_port]], [[mig_portable_rpc_declarations]].
+
<youpi> ah, you mean because tv_nsec is a long, which will not be the same type?
<braunr> and tv_sec being a time_t (thus a long too)
@@ -208,3 +212,129 @@ In context of [[select]].
# Candidate for [[vDSO]] code?
+
+
+# IRC, freenode, #hurd, 2014-02-23
+
+ <desrt> GLib (gthread-posix.c): Unexpected error from C library during
+ 'pthread_condattr_setclock': Invalid argument. Aborting.
+ <desrt> uh oh...
+ <desrt> time to go digging in glibc i guess...
+ <braunr> what are you trying to run ?
+ <desrt> glib
+ <braunr> with what ?
+ <desrt> just running glib's test suite under jhbuild
+ <desrt> i maintain glib and i made some changes recently -- i wanted to
+ make sure they didn't break the hurd
+ <desrt> and it seems they have ;/
+ <braunr> well
+ <braunr> the hurd doesn't completely comply with posix 2008
+ <desrt> long story short: we've keyed our timed waits on condition
+ variables to the monotonic clock for a long time now, but we never tested
+ that it actually worked
+ <desrt> so i just added an assert -- and indeed it fails on hurd
+ <braunr> our glibc lies about supporting timers
+ <braunr> good thinking
+ <braunr> we don't support the monotonic clock
+ <desrt> clock_gettime(CLOCK_MONOTONIC) seems to work
+ <braunr> and you should know that, even if clock selection and timers are
+ available (which posix 2008 requires), it's still optional
+ <braunr> no, glibc lies
+ <desrt> !!
+ <braunr> our "support" is a mere hack shifting CLOCK_REALTIME
+ <desrt> it should at least lie consistently :)
+ <braunr> we need to implement CLOCK_MONOTONIC properly
+ <desrt> ya... that would be very nice indeed
+ <braunr> not that hard either
+ <desrt> i agree!
+ <braunr> we just have to do it right
+ <desrt> fwiw, i plan to keep this assert in glib
+ <braunr> yes, it's good
+ <desrt> is there anywhere i can file a bug to give you guys some advance
+ warning?
+ <braunr> i don't think it's needed
+ <braunr> we know the problem
+ <desrt> k -- consider yourself warned, then :)
+ <braunr> and it's been a bigger concern recently
+ <desrt> awesome. glad i don't have to do anything :)
+ <braunr> if it's not already done, i suggest you check for the
+ CLOCK_MONOTONIC option
+ <desrt> fwiw, i'm trying to get a regular debian/gnu/hurd build of
+ glib/gtk/etc setup
+ <braunr> regular ?
+ <desrt> ya... out of git master on a daily basis
+ <braunr> from sources ?
+ <braunr> oh nice
+ <desrt> we recently set this up for freebsd as well
+ <braunr> few maintainers take the pain :)
+ <desrt> our non-linux 'problem discovery' is a bit crap before now :/
+ <braunr> i guess that's pretty normal
+ <braunr> i don't consider it the responsibility of the maintainers to test
+ every possible platform
+ <desrt> glib is a bit unique -- portability is our business
+ <braunr> taking our patches into consideration is what we ask most
+ <braunr> right
+ <desrt> and the "please take the patches" thing is something we want to
+ stop doing
+ <braunr> why ?
+ <desrt> mostly because we often look at a patch that someone sent a few
+ years ago and say "do we even still need this?"
+ <desrt> and have no way to know
+ <braunr> uh
+ <desrt> you would not believe how many patches like this we've
+ accumulated...
+ <braunr> but if we send it now ? :)
+ <desrt> braunr: new policy is roughly this:
+ https://wiki.gnome.org/Projects/GLib/SupportedPlatforms
+ <desrt> ie: fixes for issues that are general portability improvements and
+ POSIX compliance are welcome...
+ <desrt> patches that introduce platform-specific #ifdef sections are
+ rejected unless we have a regular builder to test that code
+ <braunr> i see
+ <braunr> again, regarding portability, don't consider CLOCK_MONOTONIC to be
+ readily available, check for it
+ <braunr> an #error would be enough but it has to be checked
+ <desrt> it basically comes down to: we don't want to have code in our
+ version control that we have no possible way of testing
+ <braunr> yes
+ <desrt> braunr: we do check for it
+ <braunr> ok
+ <desrt> we assert() if clock_gettime(CLOCK_MONOTONIC) fails
+ <braunr> no i mean
+ <desrt> as POSIX said it should if CLOCK_MONOTONIC is not supported
+ <desrt> if you lie to us.... well, not much we can do
+ <braunr> POSIX_MONOTONIC_CLOCK
+ <braunr> _POSIX_MONOTONIC_CLOCK
+ <desrt> this is actually defined to 0 on most platforms...
+ <desrt> which does not mean that it's unsupported -- it means that the
+ runtime must be ready to deal with it not actually existing at runtime
+ <braunr> really ?
+ <desrt> yes
+ <desrt> we used to rely on this and got a bug that we were doing it wrong
+ :)
+ <desrt> and indeed, even on linux, both with glibc and uclibc:
+ <desrt> /usr/include/bits/posix_opt.h:#define _POSIX_MONOTONIC_CLOCK
+ 0
+ <desrt> /usr/include/uClibc/bits/posix_opt.h:#define _POSIX_MONOTONIC_CLOCK
+ 0
+ <braunr> ok it's described in 2.1.6 Options
+ <braunr> so your check is appropriate
+ <desrt> so does clock_gettime(MONOTONIC) on debian/hurd get me realtime?
+ <braunr> either that, or a value shifted from it
+ <desrt> if so, i'll just hack out the condattr_setclock() check and proceed
+ trying to build past glib...
+ * desrt checks
+ <desrt> as it is, even the build of glib fails since we use some tools
+ linked against ourselves during the build process...
+ <desrt> 1393124084790000 1393124084790000
+ <desrt> those look the same....
+ <braunr> heh
+ <desrt> i also notice that your clocks are not very high precision :)
+ <braunr> that's right
+ <desrt> HZ = 100, i guess
+ <braunr> yes
+ <desrt> fair enough
+ <desrt> our mainloop doesn't support better-than-millisecond accuracy yet
+ anyway :)
+ <desrt> (although it will soon...)
+ <braunr> nice
diff --git a/open_issues/code_analysis.mdwn b/open_issues/code_analysis.mdwn
index 67798c6a..d61d5921 100644
--- a/open_issues/code_analysis.mdwn
+++ b/open_issues/code_analysis.mdwn
@@ -1,5 +1,5 @@
-[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation,
-Inc."]]
+[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software
+Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -87,8 +87,70 @@ There is a [[!FF_project 276]][[!tag bounty]] on some of these tasks.
* [Frama-C](http://frama-c.com/)
+ <teythoon> btw, I've been looking at http://frama-c.com/ lately
+ <teythoon> it's a theorem prover for c/c++
+ <braunr> oh nice
+ <teythoon> I think it's most impressive, it works on the hurd (aptitude
+ install frama-c o_O)
+ <teythoon> *and it works
+ <braunr> "Simple things should be simple,
+ <braunr> complex things should be possible."
+ <braunr> :)
+ <braunr> looks great
+ <teythoon> even the gui is awesome, allows one to browse source code in
+ a very impressive way
+ <braunr> clear separation between value changes, dependencies, side
+ effects
+ <braunr> we could have plugins for stuff like ports
+ <braunr> handles concurrency oO
+ <nalaginrut> so you want to use Frame-C to analyze the whole Hurd code
+ base?
+ <teythoon> nalaginrut: well, frama-c looks "able" to assist in
+ analyzing the Hurd, yes
+ <teythoon> nalaginrut: but theorem proving is a manual process, one
+ needs to guide the prover
+ <teythoon> nalaginrut: b/c some stuff is not decideable
+ <nalaginrut> I ask this because I can imagine how to analyze Linux
+ since all the code is in a directory. But Hurd's codes are
+ distributed to many other projects
+ <braunr> that's not a problem
+ <braunr> each server can be analyzed separately
+ <teythoon> braunr: also, each "entry point"
+ <nalaginrut> alright, but sounds a big work
+ <teythoon> it is
+ <braunr> otherwise, formal verification would be widespread :)
+ <teythoon> that, and most tools are horrible to use, frama-c is really
+ an exception in this regard
+
* [Coverity](http://www.coverity.com/) (nonfree?)
+ * IRC, OFTC, #debian-hurd, 2014-02-03
+
+ <pere> btw, did you consider adding hurd and mach to <URL:
+ https://scan.coverity.com/ > to detect bugs automatically?
+ <pere> I found lots of bugs in gnash, ipmitool and sysvinit when I
+ started scanning those projects.
:)
+ <teythoon> i did some static analysis work, i haven't used coverty
+ but free tools for that
+ <teythoon> i think thomas wanted to look into coverty though
+ <pere> quite easy to set up, but you need to download and run a
+ non-free tarball on the build host.
+ <teythoon> does that tar ball contains binary code ?
+ <teythoon> that'd be a show stopper for the hurd of course
+ <pere> did not investigate. I just put it in a contained virtual
+ machine.
+ <pere> did not want it on my laptop. :)
+ <pere> prefer free software here. :)
+ <pere> but I did not have to "accept license", at least. :)
+
+ * IRC, OFTC, #debian-hurd, 2014-02-05
+
+ <pere> ah, cool. <URL: https://scan.coverity.com/projects/1307 >
+ is now in place. :)
+
+ [[microkernel/mach/gnumach/projects/clean_up_the_code]],
+ *Code_Analysis, Coverity*.
+
* [Splint](http://www.splint.org/)
* IRC, freenode, #hurd, 2011-12-04
diff --git a/open_issues/code_analysis/discussion.mdwn b/open_issues/code_analysis/discussion.mdwn
index 4cb03293..45126b91 100644
--- a/open_issues/code_analysis/discussion.mdwn
+++ b/open_issues/code_analysis/discussion.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation,
+[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation,
Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -100,6 +100,146 @@ License|/fdl]]."]]"""]]
https://teythoon.cryptobitch.de/qa/2013-10-17/scan-build-2/
+### IRC, freenode, #hurd, 2013-11-04
+
+ <teythoon> btw, why does the nested functions stuff needs the executable
+ stack? for trampolines?
+ <braunr> yes
+ <teythoon> I didn't even realize that, that's one more reason to avoid them
+ indeed
+
+ <teythoon> braunr: kern/slab.c (1471): vm_size_t info_size = info_size;
+ <braunr> yes ?
+ <teythoon> braunr: what's up with that?
+ <braunr> that's one way to silence gcc warnings about uninitialized
+ variables
+ <braunr> this warning can easily result in false positives when gcc is
+ unable to determine dependencies
+ <braunr> e.g. if (flag & FLAG_CREATE) myvar = create(); ...; ... if (flag &
+ FLAG_CREATE) use(myvar)
+ <teythoon> well, ok, that's a shortcomming of gcc
+ <teythoon> braunr: your way of silencing that in gcc still shows up in
+ scan-build and most likely any more advanced analysis tool
+ <teythoon> as it should of course, but it is noisy
+ <braunr> teythoon: there is a gcc attribute for that
+ <braunr> __attribute__((unused))
+ <braunr> analysis tools might know that better
+ <teythoon> braunr: could you have a quick look at
+ http://darnassus.sceen.net/~teythoon/qa/gnumach/scan-build/2013-11-04/report-mXqstT.html#EndPath
+ ?
+ <braunr> nice
+ <braunr> anything else on the rbtree code ?
+ <teythoon> well
+ <teythoon>
+ http://darnassus.sceen.net/~teythoon/qa/gnumach/scan-build/2013-11-04/report-LyiOO1.html#EndPath
+ <teythoon> but this is of length 18, so it might be far-fetched
+ <braunr> ??
+ <teythoon> the length of the chain of argumentation
+ <braunr> i don't understand that issue
+ <braunr> isn't 18 the analysis step ?
+ <teythoon> well, the greater the length, the more assumption the tool
+ makes, the more likely it is that it just does not "get" some invariant
+ <braunr> probably yes
+ <braunr> the code can segfault if input parameters are invalid
+ <braunr> that's expected
+ <teythoon> right, looks like this only happens if the tree is invalid
+ <teythoon> if in line 349 brother->children[right] is NULL
+ <teythoon> this is a very good target for verification using frama-c
+ <braunr> :)
+ <teythoon> the code already has many assertions that will be picked up by
+ it automatically
+ <teythoon> so what about the dead store, is it a bug or is it harmless ?
+ <braunr> harmless probably
+ <braunr> certainly
+ <braunr> a simple overlook when polishing
+
+
+### IRC, freenode, #hurd, 2014-01-16
+
+ <mcsim> braunr: hi. Once, when I wrote a lot if inline gcc functions in
+ kernel you said me not to use them. And one of the arguments was that you
+ want to know which binary will be produced. Do you remember that?
+ <braunr> not exactly
+ <braunr> it seems likely that i advice not to use many inline functions
+ <braunr> but i don't see myself stating such a reason
+ <mcsim> braunr: ok
+ <mcsim> so, what do you think about using some high level primitives in
+ kernel
+ <mcsim> like inline-functions
+ <mcsim> ?
+ <braunr> "high level primitives" ?
+ <braunr> you mean switching big and important functions into inline code ?
+ <mcsim> braunr: something that is hard to translate in assembly directly
+ <mcsim> braunr: I mean in general
+ <braunr> i think it's bad habit
+ <mcsim> braunr: why?
+ <braunr> don't inline anything at first, then profile, then inline if
+ function calls really are a bottleneck
+ <mcsim> my argument would be that it makes code more readable
+ <braunr> https://www.kernel.org/doc/Documentation/CodingStyle <= see the
+ "inline disease"
+ <braunr> uh
+ <braunr> more readable ?
+ <braunr> the only difference is an inline keyword
+ <mcsim> sorry
+ <mcsim> i confused with functions that you declare inside functions
+ <mcsim> nested
+ <mcsim> forgot the word
+ <mcsim> sorry
+ <braunr> ah nested
+ <braunr> my main argument against nested functions is that they're not
+ standard and hard to support for non-gcc tools
+ <braunr> another argument was that it required an executable stack but
+ there is apparently a way to reliably make nested functions without this
+ requirement
+ <braunr> so, at the language level, they bring nice closures
+ <braunr> the problem for me is at the machine level
+ <braunr> i don't know them well so i'm unable to predict the kind of code
+ they generate
+ <braunr> but i guess anyone who would take the time to study their
+ internals would be able to do that
+ <mcsim> and why this last argument is important?
+ <braunr> because machine code runs on machines
+ <braunr> one shouldn't ignore the end result ..
+ <braunr> if you don't know the implications of what you're doing precisely,
+ you loose control over the result
+ <braunr> if you can trust the tool, fine
+ <kilobug> mcsim: in general, when you use something you don't really
+ understand how it works internally, you've a much higher risk of making
+ bugs or inefficient code because you just didn't realize it couldn't work
+ or would be inefficient
+ <braunr> but in the case of a kernel, it often happens that you can't, or
+ at least not in a straightforward way
+ <braunr> s/loose/lose/
+ <mcsim> kilobug: and that's why for kernel programming you try to use the
+ most straightforward primitives as possible?
+ <braunr> no
+ <kilobug> mcsim: not necessarily the most straightforward ones, but ones
+ you understand well
+ <braunr> keeping things simple is a way to keep control complexity in any
+ software
+ <braunr> as long as you understand, and decouple complicated things apart,
+ you can keep things simple
+ <braunr> nested functions doesn't have to do with complexity
+ <braunr> don't*
+ <braunr> it's just that, since they're not standard and commonly used
+ outside gnu projects, they're not well known
+ <braunr> i don't "master" them
+ <teythoon> also, they decouple the data flow from the control flow
+ <teythoon> which in my book is bad for imparative languages
+ <teythoon> and support for them in tools like gdb is poor
+ <mcsim> braunr: I remembered nested functions because now I use C++ and I
+ question myself if I may use all these C++ facilities, like lambdas,
+ complicated templates and other stuff.
+ <mcsim> kilobug: And using only things that you understand well sounds
+ straightforward and logical
+ <braunr> that's why i don't write c++ code :)
+ <braunr> it's very complicated and requires a lot of effort for the
+ developer to actually master it
+ <braunr> mcsim: you can use those features, but sparsely, when they really
+ do bring something useful
+
+
# Leak Detection
See *Leak Detection* on [[boehm_gc]].
diff --git a/open_issues/crash_server.mdwn b/open_issues/crash_server.mdwn
index 5182df6f..3d656082 100644
--- a/open_issues/crash_server.mdwn
+++ b/open_issues/crash_server.mdwn
@@ -1,5 +1,5 @@
-[[!meta copyright="Copyright © 2009, 2010, 2011, 2013 Free Software Foundation,
-Inc."]]
+[[!meta copyright="Copyright © 2009, 2010, 2011, 2013, 2014 Free Software
+Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -248,6 +248,22 @@ one...
<tschwinge> rekado: In case that's still helpful: <http://www.gnu.org/software/hurd/hurd/debugging/translator.html>.
+
+# IRC, freenode, #hurd, 2013-12-14
+
+ <gnu_srs> How to get a core dump?
+ <teythoon> either set CRASHSERVER to /servers/crash-dump-core for the
+ process you want the core file of
+ <teythoon> or make /servers/crash point to crash-dump-core to make this the
+ default for all processes
+ <gnu_srs> does it work now, it did not before?
+ <teythoon> it does for me, never had issues
+ <gnu_srs> k!
+ <teythoon> well, i believe the second option has issues
+ <teythoon> if two processes crash, both may write/create a file in the same
+ location
+
+
---
If someone is working in this area, they may want to have a look at
diff --git a/open_issues/dbus.mdwn b/open_issues/dbus.mdwn
index 4473fba0..b3bebf48 100644
--- a/open_issues/dbus.mdwn
+++ b/open_issues/dbus.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation,
+[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation,
Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -365,3 +365,138 @@ See [[glibc]], *Missing interfaces, amongst many more*, *`SOCK_CLOEXEC`*.
<braunr> anyway
<braunr> how do you plan to implement credential checking ?
<gnu_srs> I'll mail patches RSN
+
+
+# IRC, freenode, #hurd, 2013-11-03
+
+ <gnu_srs> Finally, SCM_CREDS (IDs) works:) I was on the right track all the
+ time, it was just a small misunderstanding.
+ <gnu_srs> remains to solve the PID check
+ <youpi> gnu_srs: it should be a matter of adding
+ proc_user/server_authenticate
+ <gnu_srs> there are no proc_user/server_authenticate RPCs?
+ <gnu_srs> do you mean adding them to process.defs (and implement them)?
+ <youpi> gnu_srs: I mean that, yes
+
+
+# IRC, freenode, #hurd, 2013-11-13
+
+ <gnu_srs> BTW: I have to modify the SCM_RIGHTS patch to work together with
+ SCM_CREDS, OK?
+ <youpi> probably
+ <youpi> depends on what you change of course
+
+
+# IRC, freenode, #hurd, 2013-11-15
+
+ <gnu_srs> Hi, any ideas where this originates, gdb? warning: Error setting
+ exception port for process 9070: (ipc/send) invalid destination port
+ <braunr> gnu_srs: what's process 9070 ?
+ <gnu_srs> braunr: It's a test program for sending credentials over a
+ socket. Have to create a reproducible case, it's intermittent.
+ <gnu_srs> The error happens when running through gdb and the sending
+ program is chrooted:
+ <gnu_srs> -rwsr-sr-x 1 root root 21156 Nov 15 15:12
+ scm_rights+creds_send.chroot
+
+
+## IRC, freenode, #hurd, 2013-11-16
+
+ <gnu_srs> Hi, I have a problem debugging a suid program, see
+ http://paste.debian.net/66171/
+ <gnu_srs> I think this reveals a gnumach/hurd bug, it makes things behave
+ strangely for other programs.
+ <gnu_srs> How to get further on with this?
+ <gnu_srs> Or can't I debug a suid program as non-root?
+ <pochu> gnu_srs: if gdb doesn't work for setuid programs on hurd, I suppose
+ you could chmod -s the binary you're trying to debug, login as root and
+ run it under gdb
+ <gnu_srs> pochu: When logged in as root the program works, independent of
+ the s flag setting.
+ <pochu> right, probably the setuid has no effect in that case because your
+ effective uid is already fine
+ <pochu> so you don't hit the gdb bug in that case
+ <pochu> (just guessing)
+ <gnu_srs> It doesn't work in Linux either, so it might be futile.
+ <gnu_srs> trying
+ <pochu> hmm that may be the expected behaviour. after all, gdb needs to be
+ priviledged to debug priviledged processes
+ <gnu_srs> Problem is that it was just the suid properties I wanted to
+ test:(
+ <braunr> gnu_srs: imagine if you could just alter the code or data of any
+ suid program just because you're debugging it
+
+
+## IRC, freenode, #hurd, 2013-11-18
+
+ <gnu_srs> Hi, is the code path different for a suid program compared to run
+ as root?
+ <gnu_srs> Combined with LD_PRELOAD?
+ <teythoon> gnu_srs: afaik LD_PRELOAD is ignored by suid programs for
+ obvious security reasons
+ <gnu_srs> aha, thanks:-/
+ <braunr> gnu_srs: what's your problem with suid ?
+ <gnu_srs> I made changes to libc and tried them out with
+ LD_PRELOAD=... test_progam. It worked as any user (including root),
+ <gnu_srs> but not with suid settings. Justus explained why not.
+ <braunr> well i did too
+ <braunr> but is that all ?
+ <braunr> i mean, why did you test with suid programs in the first place ?
+ <gnu_srs> to get different euid and egid numbers
+
+ <gnu_srs> hi, anybody seen this with eglibc-2.17-96: locale: relocation
+ error: locale: symbol errno,
+ <gnu_srs> version GLIBC_PRIVATE not defined in file libc.so.0.3 with link
+ time reference
+ <teythoon> yes, I have
+ <teythoon> but afaics nothing did break, so I ignored it
+
+
+## IRC, freenode, #hurd, 2013-11-23
+
+ <gnu_srs> Finally 8-)
+ <gnu_srs> Good news: soon both SCM_CREDS _and_ SCM_RIGHTS is supported
+ jointly. RFCs will be sent soon.
+
+
+## IRC, freenode, #hurd, 2013-12-05
+
+ <gnu_srs> I have a problem with the SCM_CREDS patch and dbus. gamin and my
+ test code runs fine.
+ <gnu_srs> the problem with the dbus code is that it won't work well with
+ <gnu_srs> auth_user_authenticate in sendmsg and auth_server_authenticate in
+ recvmsg.
+ <gnu_srs> Should I try to modify the dbus code to make it work?
+ <youpi> unless you manage to prove that dbus is not following the posix
+ standard, there is no reason why you should have to modify dbus
+ <gnu_srs> I think the implementation is correct,
+ <gnu_srs> but auth_user_authenticate hangs sendmsg until
+ auth_seerver_authenticate is executed in recvmsg.
+ <gnu_srs> and dbus is not doing that, so it hangs in sendmsg writing a
+ credentials byte.
+ <gnu_srs> well the credentials byte is definitely non-posix.
+ <gnu_srs> I found a bug related to the HURD_DPORT_USE macro too:-( + <youpi> ah, yes, auth_user_authenticate might be synchronous indeed, let me + think about it + <gnu_srs> Nevertheless, I think it's time to publish the code so it can be + commented on:-D + <youpi> sure + <youpi> publish early, publish often + + +# IRC, freenode, #hurd, 2014-01-17 + + <gnu_srs> youpi: as a start all our requested dbus changes are now + committed, and in Debian unstable + <youpi> good :) + + +# IRC, freenode, #hurd, 2014-01-30 + + <pochu> dbus has some known problems + <pere> known fixes too? + <pochu> http://www.gnu.org/software/hurd/open_issues/dbus.html + <gnu_srs> pochu: Maybe that page should be updated: + http://lists.nongnu.org/archive/html/bug-hurd/2013-12/msg00150.html + <youpi> gnu_srs: well, maybe you can do it : + <youpi> ) diff --git a/open_issues/dbus_in_linux_kernel.mdwn b/open_issues/dbus_in_linux_kernel.mdwn index caf47711..6f83db03 100644 --- a/open_issues/dbus_in_linux_kernel.mdwn +++ b/open_issues/dbus_in_linux_kernel.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -74,3 +75,90 @@ Might be interesting to watch how this develops. [AF_BUS, D-Bus, and the Linux kernel](http://www.kroah.com/log/linux/af_bus.html), Greg Kroah-Hartman, 2013-02-08. + + +# kdbus + + +## IRC, freenode, #hurd, 2014-01-28 + + <braunr> i would like to see things like dbus and zeromq use an optimized + microkernel transport one day + <teythoon> we could port kdbus >,< + <braunr> why not + <braunr> you port cgroups first + <teythoon> exactly + <braunr> :p + +[[systemd]]. + + +## IRC, freenode, #hurd, 2014-02-23 + +In context of [[linux_as_the_kernel]], *IRC, freenode, #hurd, 2014-02-23*. 
+ + <desrt> mach seems like this really simple thing when you first explain + what a microkernel is + <braunr> and because of that, i think it's better to start the right + solution directly + <braunr> it looks simple, it's clearly not + <desrt> but i did a bit of looking into it... it's a bit non-trivial after + all :) + <braunr> mach ipc is over complicated and error prone + <braunr> it leads to unefficient communication compared to other solutions + such as what l4 does + <desrt> ya -- i hear that this is a big part of the performance hit + <braunr> that's why i've started x15 + <desrt> i was also doing some reading about how it's based on mapping + memory segments between processes + <braunr> first, it was a mach clone, but since i've come to know mach + better, it's now a "spiritual" mach successor .. :) + <desrt> these are two issues that we've been dealing with at another + level... in the design of kdbus + <braunr> ah kdbus :) + <desrt> this is something that started with my masters thesis a long time + ago... + <braunr> ah you too + <desrt> first thing we did is make the serialisation format so that all + messages are valid and therefore never need to be checked + <desrt> (old dbus format requires checks at every step on the way) + <braunr> looks interesting + <desrt> then of course we cut the daemon out + <desrt> but some other interesting things: security is super-simple... 
it's + based enirely on endpoints + <desrt> either you're allowed to send messages between two processes or + you're not + <desrt> there is no checking for message types, for example + <braunr> yes + <desrt> and the other thing: memory mapping is usually bad + <braunr> that's what i mean when i say mach ipc is over complicated + <braunr> it depends + <desrt> the kdbus guys did some performance testing and found out that if + the message is less than ~512k then the cost of invalidating the TLB in + order to change the memory mapping is higher than the cost of just + copying the data + <braunr> yes, we know that too + <braunr> that's why zero copy isn't the normal way of passing small amounts + of data over mach either + <desrt> nice + <desrt> i got the impression in some of my reading (wikipedia, honestly) + that memory mapping was being done all the time + <braunr> well + <braunr> no it's not + <braunr> memory mapping is unfortunately a small fraction of the + performance overhead + <desrt> that's good :) + <braunr> that being said + <braunr> memory mapping can be very useful + <braunr> for example, it's hard for us to comply with posix requirements of + being able to read/write at least 2G of data in a single call + <braunr> weird bugs occur beyond 512M iirc + <braunr> you do want memory mapping for that + <desrt> ya... for things of this size.... 
you don't want to copy that + through a socket :) + <braunr> monolithic kernels have it naturally, since the kernel is mapped + everywhere + <braunr> for microkernels, it's a little more complicated + <braunr> and the problem gets worse on smp + <braunr> again, that's why i preferred starting a new kernel instead of + reusing linux diff --git a/open_issues/dde.mdwn b/open_issues/dde.mdwn index fe9fd8aa..9d8bf509 100644 --- a/open_issues/dde.mdwn +++ b/open_issues/dde.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -579,6 +579,41 @@ In context of [[libpthread]]. <braunr> (well high, 4 MiB/s or more) +## IRC, freenode, #hurd, 2013-11-20 + + <braunr> for example, netdde needs more reviewing and polishing + <braunr> it is known to deadlock sometimes + <teythoon> what deadlocks ? 
+ <braunr> i'm not sure + <teythoon> ah, netdde + <teythoon> right + <braunr> yes + <teythoon> I'm seeing that to on one of my vms + <teythoon> nasty one + <braunr> i know something is wrong with the condition_wait_timeout function + for example + <teythoon> breaks sysvinit shutdown + <braunr> because it was taken without modification from libpthread + <braunr> it might be that, or something else + <teythoon> well, dhclient hangs releasing the lease + <braunr> that's still on my todo list + <teythoon> so I'm pretty sure it's related + <braunr> hm + <braunr> maybe + <braunr> :/ + + +## IRC, freenode, #hurd, 2014-02-11 + + <braunr> teythoon: looks like a netdde/pfinet freeze/deadlock + <braunr> yes a netdde deadlock + <braunr> i really have to fix that too one day :( + <teythoon> hehe :) + <braunr> the netdde locking privimites are copies of the "old" pthread + ones, instead of reusing pthread + <braunr> primitives* + + # IRC, freenode, #hurd, 2012-08-18 <braunr> hm looks like if netdde crashes, the kernel doesn't handle it @@ -602,4 +637,15 @@ In context of [[libpthread]]. partitions/media... +## IRC, freenode, #hurd, 2013-12-03 + + <gg0> how about porting linux block device layer via dde as mcsim wanted to + do? then all linux filesystems could be brought in, right? + <braunr> gg0: that should be done, but we need to correctly deal with + multiple pci devices in userspace and arbitration + <kilobug> wouldn't adding support to passive translator into Linux + filesystems be quite some work ? 
IIRC ext2fs needs a special "owner = + hurd" mode to handle them + + # [[virtio]] diff --git a/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn b/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn index 3faa56fc..7b300ea1 100644 --- a/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn +++ b/open_issues/debugging_gnumach_startup_qemu_gdb.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -144,3 +144,61 @@ See also discussion about *multiboot* on [[arm_port]]. <kilobug> matlea01: you need something with multiboot support (like grub) to provide the various bootstrap modules to the kernel <matlea01> Ah, I see + + +## IRC, freenode, #hurd, 2014-02-24 + + <congzhang> hi, will grub load mach kernel to fix address? and which + address? + <congzhang> I want to use qemu gdb support to debug mach + <congzhang> need add-symble-file to right address + <youpi> congzhang: see objdump gnumach + <youpi> grub simply follows what's provided by the ELF format of the ELF + file + <nalaginrut> I think it's default value of _start in ELF, right? + <nalaginrut> hmm...the actual entry point should plus the size of + multi_boot header, at least 0xc... + <congzhang> youpi: I try that, but not works + <congzhang> I start qemu with -s + <congzhang> the /bin/console was very easy to cause black death, and I want + to use gdb to check whether the mach is death + <congzhang> I will try again later + <congzhang> Anyone know some tutorial to debug mach with qemu? + <nalaginrut> for better debug, I suggest bochs + <nalaginrut> although it's slower + <congzhang> nalaginrut: maybe it's my problem, I did not do the right thing + <congzhang> qemu with kvm was great. 
+ <nalaginrut> qemu with kvm is cool to run, but not so cool for debug kernel + <nalaginrut> anyway, it's personal taste + <nalaginrut> you may use gdb for that + <nalaginrut> for bochs, you don't have to use external debugger + <congzhang> thanks for explain + <congzhang> does anyone succeed boot hurd with qemu multiboot boot + function? + <congzhang> with -kernel and -initrd command line parameter + <nalaginrut> I boot it with grub, in qemu, it's fine. Then I moved to + physical machine + <congzhang> boot with grub work for me too + <congzhang> I want to know whether it is possible to boot from qemu + directly + <congzhang> qemu can directly load kernel and hurd module for linux + <congzhang> nalaginrut: can you help to test whether hurd-console service + start will cause hurd black death? + <nalaginrut> I know qemu can boot Linux without MBR, but I don't know if + it's true for Hurd too + <nalaginrut> congzhang: I'm busy for other works now ;-) + <congzhang> ok, thks:) + <youpi> qemu's multiboot options don't seem to allow providing + ext2fs.static and ld.so, so I don't think it's possible + <congzhang> I try to do this, because hurd hurd-console cause system to + death very high frequency + <youpi> (because qemu doesn't implement all of multiboot) + <congzhang> qemu help show that's possible, -initrd support multi module + and parameter + <congzhang> en, I will check with them later + <youpi> how do you pass parameters to modules? + <youpi> ah, right, it's after the file name + <youpi> well, then simply try to pass the kernel, and the two modules + <youpi> with the same option as in the grub config templates + <youpi> it's fortunate that neither ext2fs nor exec need a comma on their + command line... 
diff --git a/open_issues/default_pager.mdwn b/open_issues/default_pager.mdwn index 9a8e9412..38c9a2be 100644 --- a/open_issues/default_pager.mdwn +++ b/open_issues/default_pager.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -35,3 +36,9 @@ License|/fdl]]."]]"""]] # [[trust_the_behavior_of_translators]] + + +# IRC, freenode, #hurd, 2013-10-30 + + <braunr> it also seems that the kernel has trouble resuming processes that + have been swapped out diff --git a/open_issues/ext2fs_libports_reference_counting_assertion.mdwn b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn index 9ff43afa..2b9f28e8 100644 --- a/open_issues/ext2fs_libports_reference_counting_assertion.mdwn +++ b/open_issues/ext2fs_libports_reference_counting_assertion.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -102,3 +103,9 @@ With that patch in place, the assertion failure is seen more often. <braunr> if this erases the thread-specific area, we can expect all kinds of wreckage <braunr> i'm not sure how to fix this though + + +# IRC, freenode, #hurd, 2014-01-29 + + <gg0> ext2fs: ../../libports/port-ref.c:30: ports_port_ref: Assertion + `pi->refcnt || pi->weakrefcnt' failed. 
diff --git a/open_issues/gcc.mdwn b/open_issues/gcc.mdwn index 2b772cfc..6c14fdd4 100644 --- a/open_issues/gcc.mdwn +++ b/open_issues/gcc.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2007, 2008, 2009, 2010, 2011, 2012, 2013 Free -Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014 +Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -303,6 +303,47 @@ Last reviewed up to the [[Git mirror's 3a930d3fc68785662f5f3f4af02474cb21a62056 * [`-fsplit-stack`](http://nickclifton.livejournal.com/6889.html) + IRC, freenode, #hurd, 2014-01-10: + + <gnu_srs1> Hi, I assume gcc -fsplit-stack is not yet supported? + <braunr> gnu_srs1: + https://lists.gnu.org/archive/html/bug-hurd/2013-06/msg00100.html + <gnu_srs1> braunr: That's exactly where the problem is: + src/libgcc/generic-morestack.c:814:__morestack_load_mmap + <gnu_srs1> no return value recorded + <gnu_srs1> creating a call: page = mmap ((void*)0x0, 0, 4, 2, -1, 0);, + returning EINVAL + <braunr> lenght of 0 ? + <gnu_srs1> yes, __morestack_current_segment, is zero + <braunr> mmap is expected to return einval if the requested mapping has + a size of 0 .. 
+ <braunr> i don't know what split stack is, but i remember it's a + problem for the hurd + <gnu_srs1> sorry, the address is zero from the above, and the length in + the call is zero too + <braunr> yes that's what i understood + <braunr> and i'm telling you it's normal + <braunr> the size is invalid + <gnu_srs1> libgcc/generic-morestack.c: mmap + (__morestack_current_segment, 0, PROT_READ, MAP_ANONYMOUS, -1, 0); + <braunr> well this is wrong + <gnu_srs1> and the error code stays, not being reset in subsequent + calls + <gnu_srs1> causing an error later on + <braunr> as roland says in + https://lists.gnu.org/archive/html/bug-hurd/2013-06/msg00102.html, it + should be possible to support split-stack now that we have tls + <gnu_srs1> as thomas reported + <braunr> i don't see the relation between split-stack and the mmap + invocation + <gnu_srs1> tls s in 2.17-97, right? that's the one I tried + <braunr> tls is there, but not split stack support + <braunr> and libpthread still has bugs related to changing the stack + apparently + <braunr> fixed upstream but not yet in debian packages + <braunr> unless you want to try with the thread destruction packages + <braunr> not sure it will change much though + * Also see `libgcc/config/i386/morestack.S`: comments w.r.t `TARGET_THREAD_SPLIT_STACK_OFFSET`/`%gs:0x30` usage; likely needs porting. @@ -498,6 +539,29 @@ Last reviewed up to the [[Git mirror's 3a930d3fc68785662f5f3f4af02474cb21a62056 [[!message-id "201211061305.02565.pino@debian.org"]]. + IRC, freenode, #hurd, 2014-01-08: + + <gnu_srs> How come __GLIBC__ is defined in gcc for kFreeBSD and not + GNU? 
They sometimes use that instead of __FreeBSD_kernel__ + <pochu> it's defined by libc's /usr/include/features.h + <gnu_srs> pochu: __GLIBC__ is defined in features.h both for GNU and + kFreeBSD, but only in gcc/cpp for kFreeBSD: touch foo.h;gcc -E -dM + foo.h|grep GLIBC + <pochu> gnu_srs: #include <stdlib.h> + <gnu_srs> pochu: they both include <features.h> + <pochu> gnu_srs: I get __GLIBC__ defined if I include features.h + <pochu> with an empty file (as suggested by your `touch foo.h') I don't + get it defined, whether on hurd or linux, but I think that's expected + <gnu_srs> pochu: might be so but it is not pre-defined in CPP, as it is + for kFreeBSD. + <gnu_srs> I think it should not be defined, or it should be defined by + all three: GNU,.kFreeBSD and Linux + <gnu_srs> an anomaly, something for tschwinge + <braunr> https://lists.debian.org/debian-bsd/2012/11/msg00016.html + <gnu_srs> braunr: good finding, I assume nothing has happened since + then? + <braunr> not likely + * [low] Does `-mcpu=native` etc. work? (For example, 2ae1f0cc764e998bfc684d662aba0497e8723e52.) @@ -535,6 +599,42 @@ Last reviewed up to the [[Git mirror's 3a930d3fc68785662f5f3f4af02474cb21a62056 A lot of Linux-specific things. + * `libcilkrts` + + IRC, freenode, #hurd, 2014-01-10: + + <youpi> bwaarf, libcilkrts in gcc-4.9 + <p2-mate> libcilkrts? + <youpi> the runtime for the cilk language I guess + <tschwinge> Yes. That most likely needs disabling for us. + <tschwinge> I'll hve a look eventually. + <tschwinge> As soon as I get + <http://news.gmane.org/find-root.php?message_id=%3C87wqjjo5kx.fsf%40kepler.schwinge.homeip.net%3E> + resolved, actually. + + [[!debbug 734973]]. 
+ + * `WCONTINUED` + + IRC, OFTC, #debian-hurd, 2014-02-25: + + <gnu_srs> youpi: some gcc-4.9 packages (and source) are needed for + gnat-4.9 to build: Is it OK to propose this patch: + http://paste.debian.net/84079/ + --- a/src/gcc/lto_lto.c.orig 2014-02-14 19:22:14.000000000 +0100 + +++ b/src/gcc/lto/lto.c 2014-02-25 20:50:20.000000000 +0100 + @@ -2476,7 +2476,11 @@ + int status; + do + { + +#ifdef __GNU__ + + int w = waitpid(0, &status, WUNTRACED); + +#else + int w = waitpid(0, &status, WUNTRACED | WCONTINUED); + +#endif + if (w == -1) + fatal_error ("waitpid failed"); + <youpi> gnu_srs: rather ifndef WCONTINUED diff --git a/open_issues/gdb_catch_syscall.mdwn b/open_issues/gdb_catch_syscall.mdwn index 366c88f5..a875b211 100644 --- a/open_issues/gdb_catch_syscall.mdwn +++ b/open_issues/gdb_catch_syscall.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -8,7 +8,7 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]] -[[meta title="GDB: catch syscall"]] +[[!meta title="GDB: catch syscall"]] (gdb) catch syscall The feature 'catch syscall' is not supported on this architeture yet. 
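Note on the `WCONTINUED` item in the gcc.mdwn hunk above: youpi's reply ("rather ifndef WCONTINUED") reads as a request to test for the macro itself instead of for `__GNU__`. A minimal sketch of that idea follows. It assumes that a system without `WCONTINUED` can treat it as `0`, so that `WUNTRACED | WCONTINUED` collapses to plain `WUNTRACED`. The helper `run_and_wait` is hypothetical and exists only to make the example self-contained; it waits on one specific child, whereas the quoted `lto.c` code calls `waitpid (0, ...)`. This is not claimed to be the change that went into GCC.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: where the headers do not provide WCONTINUED, defining it
   to 0 turns `WUNTRACED | WCONTINUED` into plain WUNTRACED.  */
#ifndef WCONTINUED
# define WCONTINUED 0
#endif

/* Hypothetical helper, for illustration only: fork a child that exits
   with EXIT_CODE, wait for it using the same flags as the quoted hunk,
   and return its exit status.  Returns -1 if fork or waitpid fails, or
   if the child was killed by a signal.  */
static int
run_and_wait (int exit_code)
{
  int status;
  pid_t pid = fork ();

  if (pid == -1)
    return -1;
  if (pid == 0)
    _exit (exit_code);          /* Child: terminate immediately.  */

  for (;;)
    {
      pid_t w = waitpid (pid, &status, WUNTRACED | WCONTINUED);

      if (w == -1)
        {
          if (errno == EINTR)
            continue;           /* Interrupted by a signal: retry.  */
          return -1;
        }
      if (WIFEXITED (status))
        return WEXITSTATUS (status);
      if (WIFSIGNALED (status))
        return -1;
      /* Child was stopped or continued: keep waiting.  */
    }
}
```

The guard keeps a single call site for every platform, and it stops applying on its own once the system headers start defining `WCONTINUED`.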
diff --git a/open_issues/glibc.mdwn b/open_issues/glibc.mdwn index 5aec5139..8d18d1e2 100644 --- a/open_issues/glibc.mdwn +++ b/open_issues/glibc.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2007, 2008, 2010, 2011, 2012, 2013 Free Software -Foundation, Inc."]] +[[!meta copyright="Copyright © 2007, 2008, 2010, 2011, 2012, 2013, 2014 Free +Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -210,6 +210,14 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 * Missing interfaces, amongst many more. + IRC, freenode, #hurd, 2014-02-25: + + <tschwinge> youpi et al.: Is it a useful GSoC task to have the student + implement interfaces in glibc that we are currently missing? + <braunr> tschwinge: definitely + <braunr> posix_timers would be great + <youpi> tschwinge: probably + Many more are missing, some of which have been announced in `NEWS`, others typically haven't (like new flags to existing functions). Typically, porters will notice missing functionaly. But in case you're looking for @@ -270,6 +278,20 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 If we have all of 'em (check Linux kernel), `#define __ASSUME_ATFCTS`. + * `futimens` + + IRC, freenode, #hurd, 2014-02-09: + + <youpi> it seems apt 0.9.15.1 has troubles downloading packages + etc., as opposed to apt 0.9.15 + <youpi> ah, that version uses futimens unconditionally + <youpi> and we haven't implemented that yet + <azeem> did somebody file a bug for that apt-get issue? 
+ <youpi> I haven't + <youpi> I'll commit the fix in eglibc + <youpi> but perhaps a bug report would be good for the kfreebsd + case + * `bits/stat.h [__USE_ATFILE]`: `UTIME_NOW`, `UTIME_OMIT` * `io/fcntl.h [__USE_ATFILE]` @@ -362,6 +384,374 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 http://darnassus.sceen.net/gitweb/savannah_mirror/glibc.git/blob/refs/heads/tschwinge/Roger_Whittaker:/hurd/hurdselect.c <braunr> this is the client side implementation + IRC, freenode, #hurd, 2014-02-14: + + <desrt> also: do you know if hurd has a modern-day poll() + replacement? ala epoll, kqueue, iocp, port_create(), etc? + <pochu_> last thing I remember was that there was no epoll + equivalent, but that was a few years ago :) + <pochu_> braunr: ^ + * desrt is about to replace gmaincontext in glib with something + more modern + * desrt really very much wants not to have to write a poll() + backend.... + <desrt> it seems that absolutely every system that i care about, + except for hurd, has a new approach here :/ + <desrt> even illumos has solaris-style ports + <azeem> desrt: I suggest you bring up the question on bug-hurd + <azeem> the poll() system call there to satisfy POSIX, but there + might be a better Hurd-specific thing you could use + <azeem> is there* + <desrt> that would be ideal + <desrt> i have to assume that a system that passes to many messages + has some other facilities :) + <desrt> *so many + <desrt> the question is if they work with fds.... + <desrt> bug-hurd doesn't seem like a good place to ask open-ended + questions.... + <azeem> it's the main development lists, it's just old GNU naming + <azeem> list* + <desrt> k. thanks. + <azeem> bug-hurd@gnu.org is the address + * desrt goes to bug... hurd + <desrt> written. thanks. 
+ <braunr> desrt: the hurd has only select/poll + <braunr> it suffers from so many scalability issues there isn't + much point providing one currently + <braunr> we focus more on bug fixing and posix compliance right now + <desrt> fair answer + <braunr> you should want a poll-based backend + <braunr> it's the most portable one, and doesn't suck as much as + select + <braunr> very easy to write + <braunr> although, internally, our select/poll works just like a + bare epoll + <braunr> i.e. select requests are installed, the client waits for + one or more messages, then uninstalls the requests + + IRC, freenode, #hurd, 2014-02-23: + + <desrt> brings me to another question i asked here recently that + nobody had a great answer for: any plan to do kqueue? + <braunr> not for now + <braunr> i remember answering you about that + <desrt> ah. on IRC or the list? + <braunr> that internally, our select/poll implementation works just + like epoll + <braunr> on irc + <braunr> well "just like" is a bit far from the truth + <desrt> well... poll() doesn't really work like epoll :p + <braunr> internally, it does + <braunr> even on linux + <desrt> since both of us have to do the linear scan on the list + <desrt> which is really the entire difference + <braunr> that's the user interface part + <braunr> i'm talking about the implementation + <desrt> ya -- but it's the interface that makes it unscalable + <braunr> i know + <braunr> what i mean is + <braunr> since the implementation already works like a more modern + poll + <braunr> we could in theory add such an interface + <braunr> but epoll adds some complicated detail + <desrt> you'll have to forgive me a bit -- i wasn't around from a + time that i could imagine what a non-modern poll would look like + inside of a kernel :) + <braunr> what i mean with a modern poll is a scalable poll-like + interface + <braunr> epoll being the reference + * desrt is not super-crazy about the epoll interface.... 
+ <braunr> me neither + <desrt> kevent() is amazing -- one syscall for everything you need + <braunr> i don't know kqueue enough to talk about it + <desrt> no need to do 100 epollctls when you have a whole batch of + updates to do + <desrt> there's two main differences + <desrt> first is that instead of having a bunch of separate fds for + things like inotify, timerfd, eventfd, signalfd, etc -- they're + all built in as different 'filter' types + <desrt> second is that instead of a separate epoll_ctl() call to + update the list of monitored things, the kevent() call + (epoll_wait() equivalent) takes two lists: one is the list of + updates to make and the other is the list of events to + return.... so you only do one syscall + <braunr> well, again, that's the interface + <braunr> internally, there still are updates and waits + <braunr> and on a multiserver system like the hurd, this would mean + one system call per update per fd + <braunr> and then one per wait + <desrt> on the implementation side, i think kqueue also has a nice + feature: the kernel somehow has some magic that lets it post + events to a userspace queue.... so if you're not making updates + and you do a kevent() that would not block, you don't even enter + the kernel + <braunr> ok + <desrt> hm. that's an interesting point + <desrt> "unix" as such is just another server for you guys, right? + <braunr> no + <braunr> that's a major difference between the hurd and other + microkernel based systems + <braunr> even multiserver ones like minix + <braunr> we don't have a unix server + <braunr> we don't have a vfs server or even an "fd server" + <desrt> so mach knows about things like fds? + <braunr> no + <braunr> only glibc + <desrt> oh. weird! 
+ <braunr> yes + <braunr> that's the hurd's magic :) + <braunr> being so posix compliant despite how exotic it is + <desrt> this starts to feel like msvcrt :p + <braunr> maybe, i wouldn't know + <braunr> windows is a hybrid after all + <braunr> with multiple servers for its file system + <braunr> so why not + <braunr> anyway + <desrt> so windows doesn't have fds in the kernel either... the C + library runtime emulates them + <braunr> mach has something close to file descriptors + <desrt> which is fun when you get into dll hell -- sometimes you + have multiple copies of the C library runtime in the same program + -- and you have to take care not to use fds from one of them with + th o ther one + <braunr> yes .. + <braunr> that, i knew :) + <braunr> but back to the hurd + <braunr> since fds are a glibc thing here, and because "files" can + be implemented by multiple servers + <braunr> (sockets actually most of the time with select/poll) + <braunr> we have to make per fd requests + <braunr> the implementation uses the "port set" kernel abstraction + <desrt> right -- we could have different "fd" coming from different + places + <braunr> do you know what a mach port is ? + <desrt> not even a little bit + <braunr> hm + <desrt> i think it's what a plane does when it goes really fast, + right? + <braunr> let's say it's a kernel message queue + <braunr> no it's not a sonic boom + <desrt> :) + <braunr> ;p + <braunr> so + <braunr> ports are queues + <desrt> (aside: i did briefly run into mach ports recently on macos + where they modified their kqueue to support them...) 
+ <braunr> queues of RPC requests usually + <desrt> (but i didn't use them or look into them at all) + <braunr> they can be referenced through mach port names, which are + integers much like file descriptors + <braunr> they're also used for replies but, except for weird calls + like select/poll, you don't need to know that :) + <braunr> a port set is one object containing multiple ports + <desrt> sounds like dbus :) + <braunr> the point of a port set is to provide the ability to + perform a single operation (wait for a message) on multiple ports + <desrt> sounds like an epoll fd.... + <desrt> is the port set itself a port? + <braunr> so, when a client calls select, it translates the list of + fds into port names, creates reply ports for each of them, puts + them into a port set, send one select request for each, and does + one blocking wait on the port set + <braunr> no, but you can wait for a message on a port set the same + way you do on a port + <braunr> and that's all it does + <desrt> does that mean that you can you put a port set inside of + another port set? + <braunr> hm maybe + <desrt> i guess in some way that doesn't actually make sense + <braunr> i guess + <desrt> because i assume that the message you sent to each port in + your example is "tell me when you have some stuff" + <braunr> yes + <desrt> and you'd have to send an equivalent message to the port + set.... 
and that just doesn't make sense + <desrt> since it's not really a thing, per se + <braunr> it would + <braunr> insteaf of port -> port set, it would just be port -> port + set -> port set + <braunr> but we don't have any interface where an fd stands for a + port set + <braunr> what i'm trying to tell here is that + <braunr> considering how it's done, you can easily see that there + has to be non trivial communication + <braunr> each with the cost of a system call + <braunr> and not just any system call, a messaging one + <braunr> mach is clearly not as good as l4 when it comes to that + <desrt> hrmph + <braunr> and the fact that most pollable fds are either unix or + inet/inet6 sockets mean that there will be contention in the + socket servers anyway + <desrt> i've seen some of the crazy things you guys can do as a + result of the way mach works and way that hurd uses it, in + particular + <desrt> normal users setting up little tcp/ip universes for + themselves, and so on + <braunr> yes :) + <desrt> but i guess this all has a cost + <braunr> the cost here comes more from the implementation than the + added abstractions + <braunr> mach provides async ipc, which can partially succeed + <desrt> if i spin up a subhurd, it's using the same mach, right? + <braunr> yes + <desrt> that's neat + <braunr> we tend to call them neighbour hurds because of that + <braunr> i'm not sure it is + <desrt> it puts it half way between linux containers and outright + VMs + <desrt> because you have a new kernel.... ish... + <braunr> well, it is for the same reasons hypervisors are neat + <desrt> but the kernel exists within this construct.... + <braunr> a new kernel ? 
+ <desrt> a new hurd + <braunr> yes + <desrt> but not a new mach + <braunr> exactly + <desrt> ya -- that's very cool + <braunr> it's halfway between hypervisors and containers/jails + <braunr> what matters is that we didn't need to write much code to + make it work + <braunr> and that the design naturally guarantees strong isolation + <desrt> right. that's what i'm getting at + <braunr> unlike containers + <desrt> it shows that the interaction between mach and these set of + crazy things collectively referred to as the hurd is really + proper + <braunr> usually + <braunr> sometimes i think it's not + <braunr> but that's another story :) + <desrt> don't worry -- you can fix it when you port to L4 ;) + <braunr> eh, no :) + <desrt> btw: is this fundamentally the same mach as darwin? + <braunr> yes + <desrt> so i guess there are multiple separate implementations of a + standard set of interfaces? + <braunr> ? + * desrt has to assume that apple wouldn't be using GNU mach, for + example... + <braunr> no it's the same code base + <braunr> they couldn't + <braunr> but only because the forks have diverged a bit + <desrt> ah + <braunr> and they probably changed a lot of things in their virtual + memory implementation + <desrt> so i guess original mach was under some BSDish type thing + and GNU mach forked from that and started adding GPL code? + <braunr> something like that + <desrt> makes sense + <braunr> we have very few "non-standard" mach interfaces + <braunr> but we now rely on them so we couldn't use another mach + either + <braunr> back to the select/poll stuff + * desrt gets a lesson tonight :) + <braunr> it costs, it's not scalable + <braunr> but + <braunr> we have scalability problems in our servers + <braunr> they're old code, they use global locks + <desrt> right. this is the story i heard last time. 
+ <braunr> probably from me + <braunr> poll works good enough for us right now + <braunr> we're more interested in bug fixes than scalability + currently + <desrt> the reason this negative impacts me is because now i need + to write a bunch more code ;p + <braunr> i hope this changes but we still get weird errors that + many applications don't expect and they react badly to those + <braunr> well, poll really is the posix fallback + <desrt> every other OS that we want to support has some sort of new + scalable epoll-type interface or is Windows (which needs separate + code anyway) + <desrt> a very large number of them have kqueue... linux has + epoll... solaris/illumos is the odd one out with this weird thing + that's sort of like epoll + <braunr> i would think you want a posix fallback for such a + commonly used interface + <braunr> hm + <desrt> braunr: hurd is pretty much the only one that doesn't + already have something better.... + <braunr> linux can be built without epoll + <desrt> and the nice thing about all of these things is that every + single one of them gives me an fd that can be polled when any + event is ready + <braunr> i don't see why anyone would do that, but it's a compile + time option ;p + <braunr> yes ... + <braunr> we don't have xxxfd() :) + <desrt> and we want to expose that fd on our API... so people can + chain gmaincontext into other mainloops + <braunr> that's expected + <desrt> so for hurd this means that i will need to spin up a + separate thread doing poll() and communicating back to the main + thread when anything becomes ready + <desrt> i was looking forward to not having to do that :) + <braunr> it matches the unix "everything is a file" idea, and + windows concept of "events" + <braunr> i understand but again, it's a posix fallback + <braunr> you probably want it anyway + <desrt> probably + <braunr> it could help new systems trying to be posix like + <desrt> i honestly thought i'd get away with it, though + <desrt> this is true... 
+ <desrt> CLOCK_MONOTONIC is an easy enough requirement to implement + or fake.... "modern event polling framework" is another story... + + [[clock_gettime]]. + + <braunr> yes, but again, we do have the underlying machinery to add + it + <desrt> i appreciate if your priorities are elsewhere ;) + <braunr> it's just not worth the effort right now + <braunr> although we do have performance and latency improvements + in our patch queues currently + <braunr> if our network stack gets replaced, it would become + interesting + <braunr> we need to improve posix compliance first + <braunr> make more applications not choke on unecpected errors + <braunr> and then we can think of improving scalability + <desrt> +1 vote from me for implementing monotonic time :) + <desrt> (and also pthread_condattr_setclock()) + <braunr> and we probably won't implement the epoll interface ;p + <braunr> yes + <desrt> it's worth noting that there is also a semi-widely + available non-standard extension called + pthread_cond_timedwait_relative_np that you could implement + instead + <desrt> it takes a (relative) timeout instead of an absolute one -- + we can use that if it's available + <braunr> desrt: why would you want relative timeouts ? + <desrt> braunr: if you're willing to take the calculations into + your own hands and you don't have another way to base it on + monotonic time it starts to look like a good alternative + <desrt> and indeed, this is the case on android and macos at least + <braunr> hm + <desrt> not great as a user-facing API of course.... due to the + spurious wakeup possibility and need to retry + <braunr> so it's non standard alternative to a monotonic clock ? 
+ <desrt> no -- these systems have monotonic clocks + <desrt> what they lack is pthread_condattr_setclock() + <braunr> oh right + <desrt> which is documented in POSIX but labelled as 'optional' + <braunr> so relative is implicitely monotonic + <desrt> yes + <desrt> i imagine it would be the same 'relative' you get as the + timeout you pass to poll() + <desrt> since basing anything like this on wallclock time is + absolutely insane + <desrt> (which is exactly why we refuse to use wallclock time on + our timed waits) + <braunr> sure + <braunr> i'm surprised clock_monotonic is even optional in posix + 2008 + <braunr> but i guess that's to give some transition margin for + small embedded systems + <desrt> when you think about it, CLOCK_REALTIME really ought to + have been the optional feature + <desrt> monotonic time is so utterly basic + <braunr> yes + <braunr> and that's how it's normally implemented + <braunr> kernels provide a monotonic clock, and realtime is merely + shifted from it + * `sys/eventfd.h` * `sys/inotify.h` @@ -1129,6 +1519,82 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 <gg0> ah ok you just pushed your tls. great! <braunr> tls will fix a lot of things + IRC, OFTC, #debian-hurd, 2013-11-03: + + <youpi> gg0: + <youpi> #252 test_fork.rb:30:in `<top (required)>': core dumped + [ruby-core:28924] + <youpi> FAIL 1/949 tests failed + <youpi> with the to-be-uploaded glibc + <gg0> why does it coredump? + <gg0> that's the test i had workarounded by increasing sleep from 1 + to 3 but i don't recall it coredump'ed + <gg0> *recall if + <gg0> "sleep 1" at bootstraptest/test_fork.rb:33 + <youpi> how can I run the test alone? + + IRC, OFTC, #debian-hurd, 2013-11-04: + + <youpi> gg0: ^ + <gg0> it should not take much + <gg0> run $ make OPTS=-v test + <gg0> found out how to minimize + <gg0> mkdir _youpi && cp bootstraptest/{runner,test_fork}.rb _youpi + <gg0> then run $ ./miniruby -I./lib -I. 
-I.ext/common + ./tool/runruby.rb --extout=.ext -- --disable-gems + "./_youpi/runner.rb" --ruby="ruby2.0 -I./lib" -q -v + <gg0> youpi: that should work + <youpi> #1 test_fork.rb:1:in `<top (required)>': No such file or + directory - /usr/src/ruby1.9.1-1.9.3.448/ruby2.0 + -I/usr/src/ruby1.9.1-1.9.3.448/lib -W0 bootstraptest.tmp.rb + [ruby-dev:32404] + <gg0> seems it can't find /usr/src/ruby1.9.1-1.9.3.448/ruby2.0 + <youpi> well it's ruby1.9.1 indeed :) + <youpi> ok, got core + <gg0> replace 2.0 with 1.9, check what you have in rootdir + <gg0> k + <youpi> Mmm, no, there's no core file + <gg0> does stupidly increasing sleep time work? + <youpi> nope + <gg0> without *context it runs "make test" fine. real problems come + later with "make test-all" + <gg0> wrt test_fork, is correspondence between signals correct? i + recall i read something about USR1 not implemented + <youpi> USR1 is implemented, it's SIGRT which is not implemented + <gg0> my next wild guess is that that has something to do with + atfork, whatever that means + <gg0> it makes 2 forks: one sleeps for 1 sec then kills -USR1 + itself, the second traps USR1 in getting current time. in the + meanwhile parent sleeps for 2 secs + + IRC, OFTC, #debian-hurd, 2013-11-07: + + <gg0> ruby2.0 just built on unstable + + IRC, OFTC, #debian-hurd, 2013-11-09: + + <gg0> youpi: just found out a more "official" way to run one test + only + http://anonscm.debian.org/gitweb/?p=collab-maint/ruby1.9.1.git;a=blob;f=debian/README.porters;h=94aff7dd3ecd9f748498f2e285b4a4313b4b8f36;hb=HEAD + <gg0> btw still getting coredumps? 
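
The scenario gg0 describes in the log above (a forked child that traps USR1 and then signals itself while the parent waits) can be tried outside the Ruby test suite. What follows is a minimal sketch in Python under stated assumptions, not the contents of `bootstraptest/test_fork.rb`: the function name `usr1_roundtrip`, the delays, and the exit-code convention are all hypothetical and chosen only for illustration.

```python
import os
import signal
import time


def usr1_roundtrip(delay=0.1, timeout=2.0):
    """Fork a child that traps SIGUSR1, signals itself, and reports back.

    Returns the child's exit code: 0 if the handler ran, 1 if it did not,
    and -1 if the child did not exit normally (for example, it was killed).
    All names and values here are illustrative assumptions.
    """
    pid = os.fork()
    if pid == 0:
        # Child: install the handler before raising the signal.
        received = []
        signal.signal(
            signal.SIGUSR1,
            lambda signum, frame: received.append(time.monotonic()),
        )
        time.sleep(delay)
        os.kill(os.getpid(), signal.SIGUSR1)
        # Give the handler a bounded amount of time to run.
        deadline = time.monotonic() + timeout
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)
        # Leave without running the parent's cleanup handlers.
        os._exit(0 if received else 1)
    # Parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1


if __name__ == "__main__":
    print("child exit code:", usr1_roundtrip())
```

A non-zero result would suggest a problem in signal delivery or in the fork path rather than in Ruby itself, which might help narrow down where a failure like the one discussed comes from.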
+ + IRC, OFTC, #debian-hurd, 2013-11-13: + + <gg0> wrt the other test test_fork i suppose you made it not to + segfault anymore, it simply does fail + <youpi> I haven't taken any particular care + <youpi> didn't have any time to deal with it + + IRC, OFTC, #debian-hurd, 2013-11-14: + + <gg0> btw patches to disable *context have been backported to 1.9 + as well so next 1.9 point release should have *context disabled + <gg0> as 2.0 have + <gg0> *has + <gg0> i guess you'd like to get them reverted now + <gg0> youpi: ^ + <youpi> after testing that *context work, yes + * `sigaltstack` IRC, freenode, #hurd, 2013-10-09: @@ -1316,6 +1782,77 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 socket/socketpair, didn't we talk about them when i worked on eglibc 2.17? + * `mlock`, `munlock`, `mlockall`, `munlockall` + + IRC, freenode, #hurd, 2014-01-09: + + <gnu_srs> Hi, is mlock, mlockall et al implemented? + <braunr> i doubt it + <braunr> mlock could be, but mlockall only partially + + * [[glibc_IOCTLs]] + + * Support for `$ORIGIN` in the dynamic linker, `ld.so` + + IRC, freenode, #hurd, 2014-02-23: + + <sjamaan> + https://www.gnu.org/software/hurd/user/jkoenig/java/report.html + says $ORIGIN patches have been added to Hurd. Have those hit the + mainline codebase? + + [[user/jkoenig/java]], [[user/jkoenig/java/report]]. + + <sjamaan> It doesn't seem to work here, but perhaps I'm missing + something (I'm using the prebuilt Debian/Hurd 2014-02-11 VM + image) + <sjamaan> objdump -x says the value of RPATH is $ORIGIN + <sjamaan> But it doesn't load a library I placed in the same dir as + the binary + <braunr> sjamaan: i'm not sure + <braunr> sjamaan: what are you trying to do ? + + IRC, freenode, #hurd, 2014-02-24: + + <sjamaan> braunr: I am working on a release of the CHICKEN Scheme + compiler. Its test suite is currently failing on the stand-alone + deployment tests. 
Either it should work and use $ORIGIN, or the + test should be disabled, saying Hurd is not supported for + stand-alone deployment-directories + <sjamaan> braunr: The basic idea is to be able to create "appdirs" + like on OS X or PC-BSD, containing all the dependencies a program + needs, which can then simply be untarred + <braunr> sjamaan: ok so you do need $ORIGIN + <sjamaan> yeah + <sjamaan> iiuc, so does Java. Does Java work on Hurd? + <braunr> we had packages at the time jkoenig worked on it + <braunr> integration of patches may have been incomplete, i wasn't + there at the time and i'm not sure + <sjamaan> So it's safest to claim it's unsupported, for now? + <braunr> yes + <sjamaan> Thank you, I'll do that and revisit it later + + * `mig_reply_setup` + + IRC, freenode, #hurd, 2014-02-24: + + <teythoon> braunr: neither hurd, gnu mach or glibc provides + mig_reply_setup + <teythoon> i want to provide this function, where should i put it ? + <teythoon> i found some mach source that put it in libmach afaic + <teythoon> + ftp://ftp.sra.co.jp/.a/pub/os/mach/extracted/mach3/mk/user/libmach/mig_reply_setup.c + <braunr> teythoon: what does it do ? + <teythoon> braunr: not much, it just initializes the reply message + <teythoon> libports does this as well, in the + ports_manage_port_operations* functions + <braunr> teythoon: is it a new function you're adding ? + <teythoon> braunr: yes + <teythoon> braunr: glibc has a declaration for it, but no + implementation + <braunr> teythoon: i think it should be in glibc + <braunr> maybe in mach/ + For specific packages: * [[octave]] @@ -2115,6 +2652,15 @@ Last reviewed up to the [[Git mirror's 64a17f1adde4715bb6607f64decd73b2df9e6852 +tst-tls-atexit-lib.c:35:3: warning: implicit declaration of function '__cxa_thread_atexit_impl' [-Wimplicit-function-declaration] * a600e5cef53e10147932d910cdb2fdfc62afae4e `Consolidate Linux and POSIX libc_fatal code.` -- is `backtrace_and_maps` specific to Linux? 
+ + IRC, freenode, #hurd, 2014-02-06: + + <braunr> why wouldn't glibc double free detection code also print + the backtrace on hurd ? + <youpi> I don't see any reason why + <youpi> except missing telling glibc that it's essentially like on + linux + * 288f7d79fe2dcc8e62c539f57b25d7662a2cd5ff `Use __ehdr_start, if available, as fallback for AT_PHDR.` -- once we require Binutils 2.23, can we simplify [[glibc's process startup|glibc/process]] diff --git a/open_issues/glibc/0.4.mdwn b/open_issues/glibc/0.4.mdwn index 8991d4c0..33ef8f3a 100644 --- a/open_issues/glibc/0.4.mdwn +++ b/open_issues/glibc/0.4.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -15,6 +16,8 @@ Things to consider doing when bumping the glibc SONAME. There are some comments in the sources, for example `hurd/geteuids.c`: `XXX Remove this alias when we bump the libc soname.` +[[!toc]] + # IRC, freenode, #hurd, 2012-12-14 @@ -33,3 +36,42 @@ In context of [[packaging_libpthread]]/[[libpthread]]. [[!GNU_Savannah_bug 28934]], [[user/pochu]], [[!message-id "4BFA500A.7030502@gmail.com"]]. + + +# `time_t` -- Unix Epoch vs. 2038 + +## IRC, freenode, #hurd, 2013-12-12 + + <azeem> because it gets discussed in #debian-devel for the Linux i386 + architecture right now: what's the deal with hurd-i386 and the 32bit + epoch overflow in 2038? + <braunr> what do you mean ? + <azeem> braunr: http://lwn.net/Articles/563285/ + <braunr> ok but what do you mean ? 
+ <braunr> i don't think there is anything special with the hurd about that + <azeem> well, time_t is 64bit on amd64 AIUI + <braunr> it's a signed long + <azeem> so maybe the Hurd guys were clever from the start + <azeem> k, k + <braunr> our big advantage is that we can afford to break things a little + without too much trouble + <braunr> in a system at work, we use unsigned 32-bit words + <braunr> which overflows in 2106 + <braunr> and we already include funny comments that predict our successors, + if any, will probably fail to deal with the problem until short before + the overflow :> + <azeem> luckily, no nuclear reactors are running the Hurd sofar + <braunr> i wonder how the problem will be dealt with though + <braunr> ah, openbsd decided to break their abi + <azeem> yeah + <braunr> that's probably the simplest solution + <azeem> "just recompile" + <braunr> and they can afford it too + <azeem> yeah + <braunr> good to see people actually worry about it + <azeem> I guess people are getting worried about where Linux embedded is + being put into + <braunr> they're right about that + <azeem> "Please, don't fix the 2038 year issue. I also want to have some + job security :)" + <braunr> haha diff --git a/open_issues/glibc/debian/experimental.mdwn b/open_issues/glibc/debian/experimental.mdwn index 5168479d..273f02fd 100644 --- a/open_issues/glibc/debian/experimental.mdwn +++ b/open_issues/glibc/debian/experimental.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -130,6 +130,101 @@ Now in unstable. <pinotree> btw i saw too the segmentation fault when generating locales +## IRC, freenode, #hurd, 2014-02-04 + + <bu^> hello + <bu^> I just updated + <bu^> Setting up locales (2.17-98~0) ... 
+ <bu^> Generating locales (this might take a while)... + <bu^> en_US.UTF-8...Segmentation fault + <bu^> done + <gnu_srs> bu^: That's known, it still seems to work, though. If you have + the time please debug. I've tried but not found the solution yet:-( + <bu^> ok, just wanted to notify + + +## IRC, freenode, #hurd, 2014-02-19 + + <braunr> for info, the localedef segfault has been fixed upstream + <braunr> or rather, upstream has been written in a way that won't trigger + the segfault + <braunr> it is caused by the locale archive code that maps the locale + archive file in the address space, enlarging the mapping as needed, but + unmaps the complete reserved size of 512M on close + <braunr> munmap is implemented through vm_deallocate, but it looks like the + latter doesn't allow deallocating unmapped regions of the address space + <braunr> (to be confirmed) + <braunr> upstream code tracks the mapping size so vm_deallocate won't whine + <braunr> i expect we'll have that in eglibc 2.18 + <braunr> hm actually, posix says munmap must refer to memory obtained with + mmap :) + <braunr> (or actually, that the behaviour is undefined, which most unix + systems allow anyway, but not us) + + <braunr> also, before i leave, i have partially traced the localedef + segfault + <youpi> ah, cool + <braunr> localedef maps the locale archive, and enlarges the mapping as + needed + <braunr> but munmaps the complete 512m reserved area + <braunr> and i strongly suspect it unmaps something it shouldn't on the + hurd + <braunr> since linux mmap has different boundaries depending on the mapping + use + <braunr> while our glibc will happily maps stacks below text + <braunr> the good news is that it looks fixed upstream + <youpi> ah :) + <braunr> + https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=17db6e8d6b12f55e312fcab46faf5d332c806fb6 + <braunr> see the change about close_archive + <braunr> i haven't tested it though + + +## IRC, freenode, #hurd, 2014-02-21 + + <gg0> just upgraded to 
2.18, locales still segfaults + <braunr> ok + + +## IRC, freenode, #hurd, 2014-02-23 + + <braunr> ok, as expected, the localdef bug is because of some mmap issue + +[[glibc/mmap]]. + + <braunr> looks like our mmap doesn't like mapping files with PROT_NONE + <braunr> shouldn't be too hard to fix + <braunr> gg0: i should have a fix ready soon for localedef + + <braunr> youpi: i have a patch for glibc about the localedef segfault + <youpi> is that the backport we talked about, or something else? + <braunr> something else + <braunr> in short + <braunr> mmap() PROT_NONE on files return 0 + <youpi> ok + <youpi> seems like fixable indeed + <braunr> nothing is mapped, and the localdef code doesn't consider this an + error + <braunr> my current fix is to handle PROT_NONE like PROT_READ + <youpi> doesn't vm_protect allow to map something without giving read + right? + <braunr> it probably does + <braunr> the problem is in glibc + <youpi> ok + <braunr> when i say like PROT_READ, i mean a memory object gets a reference + <braunr> on the read port returned by io_map + <braunr> since it's not accessible anyway, it shouldn't make a difference + <braunr> but i preferred to have the memory object referenced anyway to + match what i expect is done by other systems + + +## IRC, freenode, #hurd, 2014-02-24 + + <youpi> braunr: ah ok + + <braunr> ok that mmap fix looks fine, i'll add comments and commit it soon + + # IRC, OFTC, #debian-hurd, 2013-06-20 <youpi> damn @@ -173,3 +268,62 @@ Now in unstable. <youpi> I'd warmly welcome a way to detect whether being the / translator process btw <youpi> it seems far from trivial + + +# glibc 2.18 vs. GCC 4.8 + +## IRC, freenode, #hurd, 2013-11-25 + + <youpi> grmbl, installing a glibc 2.18 rebuilt with gcc-4.8 brings an + unbootable system + + +## IRC, freenode, #hurd, 2013-11-29 + + <teythoon> so, what do I do? rebuild the glibc 2.18 package with gcc4.8 and + see what breaks ? + <teythoon> when I boot a system with that libc that is ? 
+ <teythoon> I wish youpi would have been more specific, I've never built the + libc before... + <braunr> debian/rules build in the debian package + <braunr> ctrl-c when you see gcc invocations + <braunr> cd buildir; make lib others + <braunr> although hm + <braunr> what breaks is at boot time right ? + <teythoon> yes + <braunr> heh .. + <braunr> then dpkg-buildpackage + <braunr> DEB_BUILD_OPTIONS=nocheck speeds things up + <braunr> just answer on the mailing list and ask him + <braunr> he usually answers quickly + + +## IRC, freenode, #hurd, 2013-12-18 + + <gnu_srs> teythoon: k!, any luck with eglibc-2.18? + <teythoon> tbh i didn't look into this after two unsuccessful attempts at + building the libc package + <teythoon> there was a post over at the libc-alpha list that sounded + familiar + <teythoon> http://www.cygwin.com/ml/libc-alpha/2013-12/msg00281.html + <braunr> wow + <teythoon> ? + <braunr> this looks tricky + <braunr> and why ia64 only + <teythoon> indeed + <braunr> it's rare to see aurel32 ask such questions + + +## IRC, freenode, #hurd, 2014-01-22 + + <youpi> btw, did anybody investigate the glibc-built-with-gcc-4.8 issue? + <youpi> oddly enough, a subhurd boots completely fine with it + <braunr> i didn't + <teythoon> no, sorry + <youpi> I was wondering whether the bogus deallocation at boot might have + something to do + <braunr> which one ? 
+ <braunr> ah + <braunr> yes + <braunr> maybe + <youpi> quoted earlier here diff --git a/open_issues/glibc_ioctls.mdwn b/open_issues/glibc_ioctls.mdwn index 14329d0f..3f396754 100644 --- a/open_issues/glibc_ioctls.mdwn +++ b/open_issues/glibc_ioctls.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,7 +10,8 @@ License|/fdl]]."]]"""]] [[!tag open_issue_glibc]] -IRC, unknown channel, unknown date. + +# IRC, unknown channel, unknown date <pinotree> d'oh, broken defines for ioctl()! <pinotree> http://paste.debian.net/45021/ ← any idea about this? looks like something fishy with the SIO* defines @@ -70,3 +71,101 @@ IRC, unknown channel, unknown date. <pinotree> right <youpi> which might end up in mach, other processes, other machines, etc. * pinotree s/Mach/Hurd/ :) + + +# `TIOCCONS` + +## IRC, freenode, #hurd, 2014-02-05 + + <gnu_srs> Hi, anybody have time to look at what fails with: ioctl(0, + TIOCCONS, NULL)? 
+ <gnu_srs> found a program doing the same function call as bootlogd: + http://paste.debian.net/80231/ + <gnu_srs> rpctrace: http://paste.debian.net/80232/ + <youpi> gnu_srs: it seems there is a misunderstanding between linux and + *bsd on this one + <youpi> to be able to work on *bsd (and on hurd too), the source code + should replace its NULL parameter with the address of an integer + containing 1 + <youpi> see + http://lists.freebsd.org/pipermail/freebsd-current/2011-January/022116.html + for the bsd implementation, for instance + <gnu_srs> youpi: replacing 0 with &i where int i=1 gives: TIOCCONS: + Inappropriate ioctl for device + <youpi> so be it, but that's clearly needed to be able to work on bsd + <youpi> and probably the implementation is just missing on the Hurd for now + <gnu_srs> jus to be clear: do you mean 0 or NULL in: ioctl(0, TIOCCONS, + NULL)? + <youpi> yes, for instance there is an implementation do_tiocsctty in glibc, + but no to_tioccons + <youpi> I mean NULL + <gnu_srs> OK, that's where I changed, the first argument id the FD + <youpi> well, when I wrote "NULL", I really meant "NULL" ... + <gnu_srs> yes sure, so you say that it is not yet implemented? + <youpi> yes, for instance there is an implementation do_tiocsctty in glibc, + but no to_tioccons + <gnu_srs> easy to do? + <youpi> no idea, I don't even know what that is suppsoed to do + <youpi> it's probably something like tiocsctty, but I don't really know + <gnu_srs> Redirecting console output to a pseudotty + <youpi> omg that ioctl is so ugly + <youpi> the way I can see it working is to add an RPC to the /dev/console + translator (i.e. /hurd/term) to give it the fd, and have /hurd/term write + to it whenever it gets writes, instead of writing to the console device + <youpi> gnu_srs: what do you need that for? + <gnu_srs> bootlogd in sysvinit use that for logging. + <gnu_srs> should I propose a patch to avoid the segfault when booting then? 
+ <youpi> at least, yes + <youpi> *bsd will need it anyway + <gnu_srs> youpi: btw: hurd console does not work when running openrc, + neither is halt/reboot. Maybe you should try it out? + <gnu_srs> bootlogd use ioctl(0, TIOCCONS, NULL) a Linux (only) construct + <gnu_srs> ? + <youpi> gnu_srs: I had infinite time in the day, I would be able to try it + out, yes + <braunr> heh + <youpi> giving NULL to TIOCCONS is a linux-only construct, yes + <youpi> to be compatible with *BSD, you have to pass the parameter + mentioned above + <youpi> instead of NULL + <gnu_srs> well bootlogd is from sysvinit, so it is a matter if we move to + that for init. + <gnu_srs> ***checking if bootlogd segfaults on kFreeBSD too + + +# Non-constant structures as IOCTL parameter + +[[!debbug 413734]]. + + +## IRC, OFTC, #debian-hurd, 2014-02-16 + + <gg0> https://bugs.debian.org/413734 + <gg0> patch #2 has become http://paste.debian.net/plain/82412/ + <gg0> ie. almost entirely ifdef'ing DeviceEnum + <gg0> ok final patch is http://paste.debian.net/plain/82440/ + <gg0> could anyone review it, especially last 3 oss hunks? + <azeem> gg0: well probably it would be cleaner to have autoconf check for + any of the three soundcard.h include locations? + <gg0> azeem: i think if upstream is ok with 2 it could be ok with 3 too + <gg0> my concern is about linux/ in header path (hurd is not linux) and + about ways cleaner than last 2 hunks + <azeem> well yeah, #ifdef __GNU__ #include <linux/foo.h> certainly looks + ugly + <gg0> i'll ifdef ioctls only + + +### IRC, OFTC, #debian-hurd, 2014-02-17 + + <gg0> http://paste.debian.net/plain/82446/ + <gg0> https://trac.videolan.org/vlc/ticket/10696 + + +### IRC, freenode, #hurd, 2014-02-17 + + <gg0> porting vlc with http://paste.debian.net/plain/82446/ + + http://paste.debian.net/plain/82510/ + <gg0> what's the proper way to fix ioctl instead of ifdef'ing them? 
+ <gg0> see https://bugs.debian.org/413734 + <braunr> gg0: defining them in libc + <braunr> and in servers implementing them ofc diff --git a/open_issues/gnumach_memory_management.mdwn b/open_issues/gnumach_memory_management.mdwn index 60ec7357..b36c674a 100644 --- a/open_issues/gnumach_memory_management.mdwn +++ b/open_issues/gnumach_memory_management.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -2231,6 +2231,132 @@ There is a [[!FF_project 266]][[!tag bounty]] on this task. more of them to be needed) +## IRC, freenode, #hurd, 2014-02-11 + + <braunr> youpi: what's the issue with kentry_data_size ? + <youpi> I don't know + <braunr> so back to 64pages from 256 ? + <youpi> in debian for now yes + <braunr> :/ + <braunr> from what i recall with x15, grub is indeed allowed to put modules + and command lines around as it likes + <braunr> restricted to 4G + <braunr> iirc, command lines were in the first 1M while modules could be + loaded right after the kernel or at the end of memory, depending on the + versions + <youpi> braunr: possibly VM_KERNEL_MAP_SIZE is then not big enough + <braunr> youpi: what's the size of the ramdisk ? + <braunr> youpi: or kmem_map too big + <braunr> we discussed this earlier with teythoon + +[[user-space_device_drivers]], *Open Issues*, *System Boot*, *IRC, freenode, +\#hurd, 2011-07-27*, *IRC, freenode, #hurd, 2014-02-10* + + <braunr> or maybe we want to remove kmem_map altogether and directly use + kernel_map + <youpi> it's 6.2MiB big + <braunr> hm + <youpi> err no + <braunr> looks small + <youpi> 70MiB + <braunr> ok yes + <youpi> (uncompressed) + <braunr> well + <braunr> kernel_map is supposed to have 64M on i386 ... 
+ <braunr> it's 192M large, with kmem_map taking 128M + <braunr> so at most 64M, with possible fragmentation + <teythoon> i believe the compressed initrd is stored in the ramdisk + <youpi> ah, right it's ext2fs which uncompresses it + <braunr> uncompresses it where + <braunr> ? + <teythoon> libstore does that + <youpi> module --nounzip /boot/${gtk}initrd.gz + <youpi> braunr: in userland memory + <youpi> it's not grub which uncompresses it for sure + <teythoon> braunr: so my ramdisk isn't 64 megs either + <braunr> which explains why it sometimes works + <teythoon> yes + <teythoon> mine is like 15 megs + <braunr> kentry_data_size calls pmap_steal_memory, an early allocation + function which changes virtual_space_start, which is later used to create + the first kernel map entry + <braunr> err, pmap_steal_memory is called with kentry_data_size as its + argument + <braunr> this first kernel map entry is installed inside kernel_map and + reduces the amount of available virtual memory there + <braunr> so yes, it all points to a layout problem + <braunr> i suggest reducing kmem_map down to 64M + <youpi> that's enough to get d-i back to boot + <youpi> what would be the downside? + <youpi> (why did you raise it to 128 actually? :) ) + <braunr> i merged the map used by generic kalloc allocations into kmem_map + <braunr> both were 64M + <braunr> i don't see any downside for the moment + <braunr> i rarely see more than 50M used by the slab allocator + <braunr> and with the recent code i added to collect reclaimable memory on + kernel allocation failures, it's unlikely the slab allocator will be + starved + <youpi> but then we need that patch too + <braunr> no + <braunr> it would be needed if kmem_map gets filled + <braunr> this very rarely happens + <youpi> is "very rarely" enough ? 
:) + <braunr> actualy i've never seen it happen + <braunr> i added it because i had port leaks with fakeroot + <braunr> port rights are a bit special because they're stored in a table in + kernel space + <braunr> this table is enlarged with kmem_realloc + <braunr> when an ipc space gets very large, fragmentation makes it very + difficult to successfully resize it + <braunr> that should be the only possible issue + <braunr> actually, there is another submap that steals memory from + kernel_map: device_io_map is 16M large + <braunr> so kernel_map gets down to 48M + <braunr> if the initial entry (that is, kentry_data_size + the physical + page table size) gets a bit large, kernel_map may have very little + available room + <braunr> the physical page table size obviously varies depending on the + amount of physical memory loaded, which may explain why the installer + worked on some machines + <youpi> well, it works up to 1855M + <youpi> at 1856 it doesn't work any more :) + <braunr> heh :) + <youpi> and that's about the max gnumach can handle anyway + <braunr> then reducing kmem_map down to 96M should be enough + <youpi> it works indeed + <braunr> could you check the amount of available space in kernel_map ? + <braunr> the value of kernel_map->size should do + <youpi> printing it "multiboot modules" print should be fine I guess? + + +### IRC, freenode, #hurd, 2014-02-12 + + <braunr> probably + <teythoon> ? + <braunr> i expect a bit more than 160M + <braunr> (for the value of kernel_map->size) + <braunr> teythoon: ? + <youpi> well, it's 2110210048 + <teythoon> what is multiboot modules printing ? + <youpi> almost last in gnumach bootup + <braunr> humm + <braunr> it must account directly mapped physical pages + <braunr> considering the kernel has exactly 2G, this means there is 36M + available in kernel_map + <braunr> youpi: is the ramdisk loaded at that moment ? + <youpi> what do you mean by "loaded" ? :) + <braunr> created + <youpi> where? 
+ <braunr> allocated in kernel memory + <youpi> the script hasn't started yet + <braunr> ok + <braunr> its size was 6M+ right ? + <braunr> so it leaves around 30M + <youpi> something like this yes + <braunr> and changing kmem_map from 128M to 96M gave us 32M + <braunr> so that's it + + # IRC, freenode, #hurd, 2013-04-18 <braunr> oh nice, i've found a big scalability issue with my slab allocator diff --git a/open_issues/hurd_101.mdwn b/open_issues/hurd_101.mdwn index 25822512..e55b0e8e 100644 --- a/open_issues/hurd_101.mdwn +++ b/open_issues/hurd_101.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -98,3 +99,262 @@ Not the first time that something like this is proposed... server), yes <ahungry> braunr: thanks for all the info, hittin the sack now but ill have to set up a box and try to contribute + + +# Documentation + +## IRC, freenode, #hurd, 2013-11-04 + + <stargater> i think the problem my hurd have not more developers or + contubutors is the project idears and management , eg, the most problem + is the mach kernel and documatation and the missing subsystem goals + (driver, etc) + <stargater> no i think you and other have a clue but this is not + tranzparent when i read the webpage + <teythoon> well, fwiw I agree, the documentation is lacking + <braunr> about what ? + <braunr> something that doesn't exist ? + <braunr> like smp or a generic device driver framework ? + <teythoon> no, high level concepts, design stuff + <braunr> what ? + <braunr> how come ? + <teythoon> not even the gnumach documentation is complete + <braunr> for example ? 
+ <braunr> see http://www.sceen.net/~rbraun/doc/mach/ + <braunr> which is my personal collection of docs on mach/hurd + <braunr> and it's lacking at least one paper + <braunr> well two, since i can't find the original article about the hurd + in pdf format + <braunr> project ideas are clearly listed in the project ideas page + <stargater> braunr: do you think the mach kernel decumatation a compleat? + and you think its good documentatition about "how write a drive for mach" + and you think a answare is found why dont work smp and why is have no + arm, x64 support ? + <braunr> stargater: + http://darnassus.sceen.net/~hurd-web/community/gsoc/project_ideas/ + <braunr> the page is even named "project ideas" + <braunr> the mach kernel is probably the most documented in the world + <braunr> even today + <braunr> and if there is no documentation about "how to write drivers for + mach", that's because we don't want in kernel drivers any more + <braunr> and the state of our driver framework is practically non existent + <braunr> it's basically netdde + <braunr> partial support for network drivers from linux + <braunr> that's all + <braunr> we need to improve that + <braunr> someone needs to do the job + <braunr> noone has for now + <braunr> that's all + <braunr> why would we document something that doesn't exist ? + <braunr> only stupid project managers with no clue about the real world do + that + <braunr> (or great ones who already know everything there is to know before + writing code, but that's rare) + <braunr> stargater: the answer about smp, architectures etc.. is the same + <stargater> spirit and magic are nice ;-) braunr sorry, that is only my + meanig and i will help, so i ask and say what i think. when you say, hurd + and mach are good and we on the right way, then its ok for me . i wonder + why not more developer help hurd. 
and i can read and see the project page + fro side a first time user/developer + <braunr> i didn't say they're good + <braunr> they're not, they need to be improved + <braunr> clearly + <stargater> ok, then sorry + <braunr> i wondered about that too, and my conclusion is that people aren't + interested that much in system architectures + <braunr> and those who are considered the hurd too old to be interesting, + and don't learn about it + <braunr> consider* + <braunr> stargater: why are you interested in the hurd ? + <braunr> that's a question everyone intending to work on it should ask + <stargater> the spirit of free software and new and other operation system, + with focus to make good stuff with less code and working code for ever + and everone can it used + <braunr> well, if the focus was really to produce good stuff, the hurd + wouldn't be so crappy + <braunr> it is now, but it wasn't in the past + <stargater> a good point whas more documentation in now and in the future, + eg, i like the small project http://wiki.osdev.org/ and i like to see + more how understanding mach and hurd + <nalaginrut> I love osdev much, it taught me a lot ;-D + <braunr> osdev is a great source for beginners + <braunr> teythoon: what else did you find lacking ? + <teythoon> braunr: in my opinion the learning curve of Hurd development is + quite steep at the beginning + <teythoon> yes, documentation exists, but it is distributed all over the + internets + <braunr> teythoon: hm ok + <braunr> yes the learning curve is too hard + <braunr> that's an entry barrier + + +# IRC, freenode, #hurd, 2014-02-04 + +[[!tag open_issue_documentation]] + + <bwright> Does the GNU Mach kernel have concepts of capabilities? 
+ <braunr> yes + <braunr> see ports, port rights and port names + <bwright> Does it follow the take grant approch + <bwright> approach* + <braunr> probably + <bwright> Can for example I take an endpoint that I retype from untyped + memory and mint it such that it only has read access and pass that to the + cspace of another task over ipc. + <bwright> Where that read minted cap enforces it may onnly wait on that ep. + <braunr> ep ? + <braunr> ah + <bwright> Endpoint. + <braunr> probably + <bwright> Alright cool. + <braunr> it's a bit too abstract for me to answer reliably + <braunr> ports are message queues + <braunr> port rights are capabilities to ports + <bwright> Not sure exactly how it would be implemented but essentially you + would have a guarded page table with 2 levels, 2^pow slots. + <braunr> port names are integers referring to port rights + <braunr> we don't care about the implementation of page tables + <bwright> Each slot contains a kernel object, which in itself may be more + page tabels that store more caps. + <braunr> it's not l4 :p + <braunr> mach is more of a hybrid + <bwright> It isn't a page table for memory. + <braunr> it manages virtual memory + <bwright> Ah ok. + <braunr> whatever, we don't care about the implementation + <bwright> So if I want to say port an ethernet driver over. + <braunr> whether memory or capabilities, mach manages them + <bwright> Can I forward the interrupts through to my new process? + <braunr> yes + <braunr> it has been implemented for netdde + <braunr> these are debian specific patches for the time being though + <bwright> Great, and shared memory set ups are all nice and dandy. + <braunr> yes, the mach vm takes care of that + <bwright> Can I forward page faults? + <bwright> Or does mach actually handle the faults? 
+ <bwright> (Sorry for so many questions just comparing what I know from my + microkernel knowledge to mach and gnu mach) + <braunr> mach handles them but translates them to requests to userspace + pagers + <bwright> (Still have a mach paper to read) + <bwright> Alright that sounds sane. + <bwright> Does GNU mach have benchmarks on its IPC times? + <braunr> no but expect them to suck :) + <bwright> Isn't it fixable though? + <braunr> mach ipc is known to be extremely heavy in comparison with modern + l4-like kernels + <braunr> not easily + <bwright> Yeah so I know that IPC is an issue but never dug into why it is + bad on Mach. + <bwright> So what design decision really screwed up IPC speed? + <braunr> for one because they're completely async, and also because they + were designed for network clusters, meaning data is typed inside messages + <bwright> Oh weird + <bwright> So how is type marshalled in the message? + <braunr> in its own field + <braunr> messages have their own header + <braunr> and each data field inside has its own header + <bwright> Oh ok, so I can see this being heavy. + <bwright> So the big advantage is for RPC + <bwright> It would make things nice in that case. + <bwright> Is it possible to send an IPC without the guff though? + <bwright> Or would this break the model mach is trying to achieve? + <bwright> I am assuming Mach wanted something where you couldn't tell if a + process was local or not. + <bwright> So I am assuming then that IPC is costly for system calls from a + user process. + <bwright> You have some sort of blocking wait on the call to the service + that dispatches the syscall. + <bwright> I am assuming the current variants of GNU/Hurd run on glibc. + <bwright> It would be interesting to possibly replace that with UlibC or do + a full port of the FlexSC exceptionless system calls. + <bwright> Could get rid of some of the bottlenecks in hurd assuming it is + very IPC heavy. + <bwright> And that won't break the async model. 
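Editorial note: to make the "each data field inside has its own header" point concrete, here is a simplified sketch of how per-field type descriptors add to message size. The struct layouts and names are hypothetical stand-ins chosen for illustration. They are not the actual Mach definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified message header (not the real Mach header). */
struct msg_header {
  uint32_t bits;
  uint32_t size;
  uint32_t remote_port;
  uint32_t local_port;
  uint32_t seqno;
  uint32_t id;
};

/* Hypothetical per-field type descriptor: one precedes every data item. */
struct field_type {
  uint8_t  name;       /* kind of data that follows */
  uint8_t  size_bits;  /* size of one element, in bits */
  uint16_t number;     /* number of elements */
};

/* Total bytes for a message carrying nfields inline fields,
   each with payload_per_field bytes of data. */
static inline size_t
typed_msg_size (size_t nfields, size_t payload_per_field)
{
  return sizeof (struct msg_header)
         + nfields * (sizeof (struct field_type) + payload_per_field);
}

/* Bytes that are pure metadata: the header plus one descriptor per field. */
static inline size_t
typed_msg_overhead (size_t nfields)
{
  return typed_msg_size (nfields, 0);
}
```

Under these assumed layouts, a message with three 4-byte fields carries 12 bytes of payload in a 48-byte message. Real Mach messages differ in detail, but the metadata grows with the number of fields in the same way.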
+ <bwright> Actually should be simpler if it is already designed for that. + <bwright> But would break the "distributed" vibe unless you had the faults + to those shared pages hit a page faulter that sent them over the network + on write. + <bwright> </end probably stupid ideas> + <kilobug> bwright: a lot of POSIX compatibility is handled by the glibc, + "porting" another libc to the Hurd will be a titanic task + <bwright> In theory exceptionless system calls work fine on glibc, it is + just harder to get them working. + <bwright> has not been done or was not explored in the paper. + <bwright> Something about it having a few too many annoying assumptions. + <bwright> Would be interesting to run some benchmarks on hurd and figure + out where the bottlenecks really are. + <bwright> At least for an exercise in writing good benchmarks :P + <bwright> I have a paper on the design of hurd I should read actually. + <bwright> After I get through this l4 ref man. + <braunr> the main bottleneck is scalability + <braunr> there are a lot of global locks + <braunr> and servers are prone to spawning lots of threads + <braunr> because, despite the fact mach provides async ipc, the hurd mostly + uses sync ipc + <braunr> so the way to handle async notifications is to receive messages + and spawn threads as needed + <bwright> Lets take a senario + <braunr> beyond that, core algorithms such as scanning pages in pagers, are + suboptimal + <bwright> I want to get a file and send it across the network. + <bwright> How many copies of the data occur? + <braunr> define send + <braunr> ouch :) + <braunr> disk drivers are currently in the kernel + <bwright> I read a block from disk, I pass this to my file system it passes + it to the app and it sends to the lwip or whatever interface then out the + ethernet card. 
+ <braunr> and "block device drivers" in userspace (storeio) are able to + redirect file system servers directly to those in kernel drivers + <braunr> so + <braunr> kernel -> fs -> client -> pfinet -> netdde (user space network + drivers on debian hurd) + <bwright> Alright. Hopefully each arrow is not a copy :p + <braunr> it is + <bwright> My currently multiserver does this same thing with zero copy. + <braunr> because buffers are usually small + <braunr> yes but zero copy requires some care + <bwright> Which is possible. + <braunr> and usually, posix clients don't care about that + <bwright> Yes it requires a lot of care. + <bwright> POSIX ruins this + <bwright> Absolutely. + <braunr> they assume read/write copy data, or that the kernel is directly + able to access data + <bwright> But there are some things you can take care with + <bwright> And not break posix and still have this work. + <braunr> pfinet handles ethernet packets one at a time, and 1500 isn't + worth zero copying + <bwright> This depends though right? + <braunr> i'm not saying it's not possible + <braunr> i'm saying most often, there are copies + <bwright> So if I have high throughput I can load up lots of packets and + the data section can then be sectioned with scatter gather + <braunr> again, the current interface doesn't provide that + <bwright> Alright yeah that is what I expected which is fine. + <bwright> It will be POSIX compliant which is the main goal. + <braunr> not really scatter gather here but rather segment offloading for + example + <braunr> ah you're working on something like that too :) + <bwright> Yeah I am an intern :) + <bwright> Have it mostly working, just lots of pain. + <bwright> Have you read the netmap paper? + <bwright> Really interesting. + <braunr> not sure i have + <braunr> unless it has another full name + <bwright> 14.86 million packets per second out of the ethernet card :p + <bwright> SMOKES everything else. + <bwright> Implemented in Linux and FreeBSD now. 
+ <bwright> Packets are UDP 1 byte MTU I think + <bwright> 1 byte data * + <bwright> To be correct :p + <braunr> right, i see + <bwright> Break posix again + <bwright> "More Extend" + <braunr> i've actually worked on a proprietary implementation of such a + thing where i'm currently working + <bwright> Bloody useful for high frequency trading etc. + <bwright> Final year as an undergraduate this year doing my thesis which + should be fun, going to be something OS hopefully. + <bwright> Very fun field lots of weird and crazy problems. diff --git a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn index 11bebd6e..b571b82e 100644 --- a/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn +++ b/open_issues/libmachuser_libhurduser_rpc_stubs.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -107,6 +107,34 @@ License|/fdl]]."]]"""]] <antrik> now that's a good question... no idea TBH :-) +## IRC, freenode, #hurd, 2013-02-25 + + <braunr> we should also discuss the mach_debug interface some day + <braunr> it's not exported by libc, but the kernel provides it + <braunr> slabinfo depends on it, and i'd like to include it in the hurd + <braunr> but i don't know what kind of security problems giving access to + mach_debug RPCs would create + <braunr> (imo, the mach_debug interface should be adjusted to be used with + privileged ports only) + <braunr> (well, maybe not all mach_debug RPCs) + + +## IRC, freenode, #hurd, 2013-11-20 + + <braunr> [...] 
we have to make the mach_debug interface available + <braunr> well, i never took the time to integrate slabinfo into the hurd + repository + <braunr> because it relies on the mach_debug interface + <teythoon> ah + <braunr> while enabling that interface alone can't do harm, some debugging + functions shouldn't be usable by unprivileged applications + <braunr> so it requires some discussions + <braunr> i always delayed it because of more important stuff to do + <braunr> but slabinfo is actually very useful + <braunr> the more information we have about the system state, the better + <braunr> so it's actually important + + # IRC, freenode, #hurd, 2012-07-23 <pinotree> aren't libmachuser and libhurduser supposed to be slowly faded @@ -123,18 +151,6 @@ License|/fdl]]."]]"""]] <braunr> pinotree: libc should bring them -# IRC, freenode, #hurd, 2013-02-25 - - <braunr> we should also discuss the mach_debug interface some day - <braunr> it's not exported by libc, but the kernel provides it - <braunr> slabinfo depends on it, and i'd like to include it in the hurd - <braunr> but i don't know what kind of security problems giving access to - mach_debug RPCs would create - <braunr> (imo, the mach_debug interface should be adjusted to be used with - privileged ports only) - <braunr> (well, maybe not all mach_debug RPCs) - - # `gnumach.defs` [[!message-id diff --git a/open_issues/libpthread.mdwn b/open_issues/libpthread.mdwn index 0b426884..0294b008 100644 --- a/open_issues/libpthread.mdwn +++ b/open_issues/libpthread.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -1303,6 +1303,7 @@ Most of the issues raised on this page has been resolved, a few remain. 
after the system has been alive for some time ? <braunr> (some time being at least a few hours, more probably days) + #### IRC, freenode, #hurd, 2013-07-05 <braunr> ok, found the bug about invalid ports when adjusting priorities @@ -1312,6 +1313,149 @@ Most of the issues raised on this page has been resolved, a few remain. [[libpthread/t/fix_have_kernel_resources]]. +#### IRC, freenode, #hurd, 2013-11-25 + + <braunr> youpi: btw, my last commit on the hurd repo fixes the urefs + overflow we've sometimes seen in the past in the priority adjusting code + of libports + + +#### IRC, freenode, #hurd, 2013-11-29 + +See also [[open_issues/libpthread/t/fix_have_kernel_resources]]. + + <braunr> there still are some leak ports making servers spawn threads with + non-elevated priorities :/ + <braunr> leaks* + <teythoon> issues with your thread destruction work ? + <teythoon> err, wait + <teythoon> why does a port leak cause that ? + <braunr> because it causes urefs overflows + <braunr> and the priority adjustment code does check errors :p + <teythoon> ^^ + <teythoon> ah yes, urefs... + <braunr> apparently it only affects the root file system + <teythoon> hm + <braunr> i'll spend an hour looking for it, and whatever i find, i'll + install the upstream debian packages so you can build glibc without too + much trouble + <teythoon> we need a clean build chroot on darnassus for this situation + <braunr> ah yes + <braunr> i should have time to set things up this week end + <braunr> 1: send (refs: 65534) + <braunr> i wonder what the first right is in the root file system + <teythoon> hm + <braunr> search doesn't help so i'm pretty sure it's a kernel object + <braunr> perhaps the host priv port + <teythoon> could be the thread port or something ? + <braunr> no, not the thread port + <teythoon> why would it have so many refs ? + <braunr> the task port maybe but it's fine if it overflows + <teythoon> also, some urefs are clamped at max, so maybe this is fine ? 
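Editorial note: the difference between an overflow that is reported as an error and a count that is clamped at its maximum can be sketched as follows. This assumes a 16-bit user-reference counter, which would be consistent with the `refs: 65534` value above but is an assumption here. All names are hypothetical and none of this is kernel code.

```c
#include <stdint.h>

#define UREFS_MAX 65535u  /* assumption: 16-bit user-reference counter */

enum uref_status { UREF_OK = 0, UREF_OVERFLOW = 1 };

/* Add delta references; refuse and report an error if the counter
   would exceed UREFS_MAX.  Expects *urefs <= UREFS_MAX on entry. */
static inline enum uref_status
uref_add_checked (uint32_t *urefs, uint32_t delta)
{
  if (delta > UREFS_MAX - *urefs)
    return UREF_OVERFLOW;
  *urefs += delta;
  return UREF_OK;
}

/* Add delta references; silently saturate at UREFS_MAX instead of failing.
   Expects *urefs <= UREFS_MAX on entry. */
static inline void
uref_add_clamped (uint32_t *urefs, uint32_t delta)
{
  if (delta > UREFS_MAX - *urefs)
    *urefs = UREFS_MAX;
  else
    *urefs += delta;
}
```

A caller that checks the return value, as the priority adjustment code is said to do, bails out on the checked variant once a leak has pushed the count to the maximum. With the clamped variant, the same call would succeed silently.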
+ <braunr> it may be fine yes + <braunr> err = get_privileged_ports (&host_priv, NULL); + <braunr> iirc, this function should pass copies of the name, not increment + the urefs counter + <braunr> it may behave differently if built statically + <teythoon> o_O y would it ? + <braunr> no idea + <braunr> something doesn't behave as it should :) + <braunr> i'm not asking why, i'm asking where :) + <braunr> the proc server is also affected + <braunr> so it does look like it has something to do with bootstrap + <teythoon> I'm not surprised :/ + + +#### IRC, freenode, #hurd, 2013-11-30 + + <braunr> so yes, the host_priv port gets a reference when calling + get_privileged_ports + <braunr> but only in the rootfs and proc servers, probably because others + use the code path to fetch it from proc + <teythoon> ah + <teythoon> well, it shouldn't behave differently + <braunr> ? + <teythoon> get_privileged_ports + <braunr> get_privileged_ports is explictely described to cache references + <teythoon> i don't get it + <teythoon> you said it behaved differently for proc and the rootfs + <teythoon> that's undesireable, isn't it ? + <braunr> yes + <teythoon> ok + <braunr> so it should behave differently than it does + <teythoon> yes + <teythoon> right + <braunr> teythoon: during your work this summer, have you come across the + bootstrap port of a task ? + <braunr> i wonder what the bootstrap port of the root file system is + <braunr> maybe i got the description wrong since references on host or + master are deallocated where get_privileged_ports is used .. + <teythoon> no, I do not believe i did anything bootstrap port related + <braunr> ok + <braunr> i don't need that any more fortunately + <braunr> i just wonder how someone could write a description so error-prone + .. + <braunr> and apparently, this problem should affect all servers, but for + some reason i didn't see it + <braunr> there, problem fixed + <teythoon> ? 
+ <braunr> last leak eliminated + <teythoon> cool :) + <teythoon> how ? + <braunr> i simply deallocate host_priv in addition to the others when + adjusting thread priority + <braunr> as simple as that .. + <teythoon> uh + <teythoon> sure ? + <braunr> so many system calls just for reference counting + <braunr> yes + <teythoon> i did that, and broke the rootfs + <braunr> well i'm using one right now + <teythoon> ok + <braunr> maybe i should let it run a bit :) + <teythoon> no, for me it failed on the first write + <braunr> teythoon: looks weird + <teythoon> so i figured it was wrong to deallocate that port + <braunr> i'll reboot it and see if there may be a race + <teythoon> thought i didn't get a reference after all or something + <teythoon> I believe there is a race in ext2fs + <braunr> teythoon: that's not good news for me + <teythoon> when doing fsysopts --update / (which remounts /) + <teythoon> sometimes, the system hangs + <braunr> :/ + <teythoon> might be a deadlock, or the rootfs dies and noone notices + <teythoon> with my protected payload stuff, the system would reboot instead + of just hanging + <braunr> oh + <teythoon> which might point to a segfault in ext2fs + <teythoon> maybe the exception message carries a bad payload + <braunr> makes sense + <braunr> exception handling in ext2fs is messy .. 
+ <teythoon> braunr: and, doing sleep 0.1 before remounting / makes the + problem less likely to appear + <braunr> ugh + <teythoon> and system load on my host system seems to affect this + <teythoon> but it is hard to tell + <teythoon> sometimes, this doesn't show up at all + <teythoon> sometimes several times in a row + <braunr> the system load might simply indicate very short lived processes + <braunr> (or threads) + <teythoon> system load on my host + <braunr> ah + <teythoon> this makes me believe that it is a race somewhere + <teythoon> all of this + <braunr> well, i can't get anything wrong with my patched rootfs + <teythoon> braunr: ok, maybe I messed up + <braunr> or maybe you were very unlucky + <braunr> and there is a rare race + <braunr> but i'll commit anyway + <teythoon> no, i never got it to work, always hung at the first write + <braunr> it won't be the first or last rare problem we'll have to live with + <braunr> hm + <braunr> then you probably did something wrong, yes + <braunr> that's reassuring + + ### IRC, freenode, #hurd, 2013-03-11 <braunr> youpi: oh btw, i noticed a problem with the priority adjustement @@ -1582,6 +1726,9 @@ Same issue as [[term_blocking]] perhaps? ## IRC, freenode, #hurd, 2013-01-06 <youpi> it seems fakeroot has become slow as hell + +[[pfinet_timers]]. + <braunr> fakeroot is the main source of dead name notifications <braunr> well, a very heavy one <braunr> with pthreads hurd servers, their priority is raised, precisely to @@ -2008,3 +2155,260 @@ Same issue as [[term_blocking]] perhaps? 
handling, but there are still a few bugs remaining <braunr> fyi, the related discussion was https://lists.gnu.org/archive/html/bug-hurd/2012-08/msg00057.html + + +## IRC, freenode, #hurd, 2014-01-01 + + <youpi> braunr: I have an issue with tls_thread_leak + <youpi> int main(void) { + <youpi> pthread_create(&t, NULL, foo, NULL); + <youpi> pthread_exit(0); + <youpi> } + <youpi> this fails at least with the libpthread without your libpthread + thread termination patch + <youpi> because for the main thread, tcb->self doesn't contain thread_self + <youpi> where is tcb->self supposed to be initialized for the main thread? + <youpi> there's also the case of fork()ing from main(), then calling + pthread_exit() + <youpi> (calling pthread_exit() from the child) + <youpi> the child would inherit the tcb->self value from the parent, and + thus pthread_exit() would try to kill the father + <youpi> can't we still do tcb->self = self, even if we don't keep a + reference over the name? + <youpi> (the pthread_exit() issue above should be fixed by your thread + termination patch actually) + <youpi> Mmm, it seems the thread_t port that the child inherits actually + properly references the thread of the child, and not the thread of the + father? + <youpi> “For the name we use for our own thread port, we will insert the + thread port for the child main user thread after we create it.” Oh, good + :) + <youpi> and, “Skip the name we use for any of our own thread ports.”, good + too :) + <braunr> youpi: reading + <braunr> youpi: if we do tcb->self = self, we have to keep the reference + <braunr> this is strange though, i had tests that did exactlt what you're + talking about, and they didn't fail + <youpi> why? 
+ <braunr> if you don't keep the reference, it means you deallocate self + <youpi> with the thread termination patch, tcb->self is not used for + destruction + <braunr> hum + <braunr> no it isn't + <braunr> but it must be deallocated at some point if it's not temporary + <braunr> normally, libpthread should set it for the main thread too, i + don't understand + <youpi> I don't see which code is supposed to do it + <youpi> sure it needs to be deallocated at some point + <youpi> but does tcb->self has to wear the reference? + <braunr> init_routine should do it + <braunr> it calls __pthread_create_internal + <braunr> which allocates the tcb + <braunr> i think at some point, __pthread_setup should be called for it too + <youpi> but what makes pthread->kernel_thread contain the port for the + thread? + <braunr> but i have to check that + <braunr> __pthread_thread_alloc does that + <braunr> so normally it should work + <braunr> is your libpthread up to date as well ? + <youpi> no, as I said it doesn't contain the thread destruction patch + <braunr> ah + <braunr> that may explain + <youpi> but the tcb->self uninitialized issue happens on darnassus too + <youpi> it just doesn't happen to crash because it's not used + <braunr> that's weird :/ + <youpi> see ~youpi/test.c there for instance + <braunr> humpf + <braunr> i don't see why :/ + <braunr> i'll debug that later + <braunr> youpi: did you find the problem ? + <youpi> no + <youpi> I'm working on fixing the libpthread hell in the glibc debian + package :) + <youpi> i.e. 
replace a dozen patches with a git snapshot + <braunr> ah you reverted commit + <braunr> +a + <braunr> i imagine it's hairy :) + <youpi> not too much actually + <braunr> wow :) + <youpi> with the latest commits, things have converged + <youpi> it's now about small build details + <youpi> I just take time to make sure I'm getting the same source code in + the end :) + <braunr> :) + <braunr> i hope i can determine what's going wrong tonight + <braunr> youpi: with mach_print, i do see self being set by libpthread + .. + <youpi> but to something other than 0 ? + <braunr> yes + <braunr> strangely, the other thread doesn't have the same value + <braunr> are you really sure it's self that you're displaying with the assembly ? + <braunr> oops, english + <youpi> see test2 + <youpi> so I'm positive + <braunr> well, there obviously is a bug + <braunr> but are you certain your assembly code displays the thread port + name ? + <youpi> I'm certain it displays tcb->self + <braunr> oh wait, hexadecimal, ok + <youpi> and the value happens to be what mach_thread_self returns + <braunr> ah right + <youpi> ah, right, names are usually decimals :) + <braunr> hm + <braunr> what's the problem with test2 ? + <youpi> none + <braunr> ok + <youpi> I was just checking what happens on fork from another thread + <braunr> ok i do have 0x68 now + <braunr> so the self field gets erased somehow + <braunr> 15:34 < youpi> this fails at least with the libpthread without + your libpthread thread termination patch + <braunr> how does it fail ? + <youpi> ../libpthread/sysdeps/mach/pt-thread-halt.c:44: + __pthread_thread_halt: Unexpected error: (ipc/send) invalid destination + port. + <braunr> hm + <braunr> i don't have that problem on darnassus + <youpi> with the new libc? + <braunr> the pthread destruction patch actually doesn't use the tcb->self + name if i'm right + <braunr> yes + <braunr> what is tcb->self used for ?
+ <youpi> it used to be used by pt-thread-halt + <youpi> but is darnassus using your thread destruction patch? + <youpi> as I said, since your thread destruction pathc doesn't use + tcb->self, it doesn't have the issue + <braunr> the patched libpthread merely uses the sysdeps kernel_thread + member + <braunr> ok + <youpi> it's the old libpthread against the new libc which has issues + <braunr> yes it is + <braunr> so for me, the only thing to do is make sure tcb->self remains + valid + <braunr> we could simply add a third user ref but i don't like the idea + <youpi> well, as you said the issue is rather that tcb->self gets + overwritten + <youpi> there is no reason why it should + <braunr> the value is still valid when init_routine exits, so it must be in + libc + <youpi> or perhaps for some reason tls gets initialized twice + <braunr> maybe + <youpi> and thus what libpthread's init writes to is not what's used later + <braunr> i've add a print in pthread_create, to see if self actually got + overwritten + <braunr> and it doesn't + <braunr> there is a disrepancy between the tcb member in libpthread and + what libc uses for tls + <braunr> added* + <braunr> (the print is at the very start of pthread_create, and displays + the thread name of the caller only) + <youpi> well, yes, for the main thread libpthread shouldn't be allocating a + new tcb + <youpi> and just use the existing one + <braunr> ? + <youpi> the main thread's tcb is initialized before the threading library + iirc + <braunr> hmm + <braunr> it would make sense if we actually had non-threaded programs :) + <youpi> at any rate, the address of the tcb allocated by libpthread is not + put into registers + <braunr> how does it get there for the other threads ? 
+ <youpi> __pthread_setup does it + <braunr> so + <braunr> looks like dl_main is called after init_routine + <braunr> and it then calls init_tls + <braunr> init_tls returns the tcb for the main thread, and that's what + overrides the libpthread one + <youpi> yes, _hurd_tls_init is called very early, before init_routine + <youpi> __pthread_create_internal could fetch the tcb pointer from gs:0 + when it's the main thread + <braunr> so there is something i didn't get right + <braunr> i thought _hurd_tls_init was called as part of dl_main + <youpi> well, it's not a bug of yours, it has always been bug :) + <braunr> which is called *after* init_routine + <braunr> and that explains why the libpthread tcb isn't the one installed + in the thread register + <braunr> i can actually check that quite easily + <youpi> where do you see dl_main called after init_routine? + <braunr> well no i got that wrong somehow + <braunr> or i'm unable to find it again + <braunr> let's see + <braunr> init_routine is called by init which is called by _dl_init_first + <braunr> which i can only find in the macro RTLD_START_SPECIAL_INIT + <braunr> with print traces, i see dl_main called before init_routine + <braunr> so yes, libpthread should reuse it + <braunr> the tcb isn't overriden, it's just never installed + <braunr> i'm not sure how to achieve that cleanly + <youpi> well, it is installed, by _hurd_tls_init + <youpi> it's the linker which creates the main thread's tcb + <youpi> and calls _hurd_tls_init to install it + <youpi> before the thread library enters into action + <braunr> agreed + + +### IRC, freenode, #hurd, 2014-01-14 + + <braunr> btw, are you planning to do something with regard to the main + thread tcb initialization issue ? 
+ <youpi> well, I thought you were working on it + <braunr> ok + <braunr> i wasn't sure + + +### IRC, freenode, #hurd, 2014-01-19 + + <braunr> i have some fixup code for the main thread tcb + <braunr> but it sometimes crashes on tcb deallocation + <braunr> is there anything particular that you would know about the tcb of + the main thread ? + <braunr> (that could help explaining this) + <youpi> Mmmm, I don't think there is anything particular + <braunr> doesn't look like the first tcb can be reused safely + <braunr> i think we should instead update the thread register to point to + the pthread tcb + <youpi> what do you mean by "the first tcb" exactly? + + +## IRC, freenode, #hurd, 2014-01-03 + + <gg0> braunr: hurd from your repo can't boot. restored debian one + <braunr> gg0: it does boot + <braunr> gg0: but you need everything (gnumach and glibc) in order to make + it work + <braunr> i think youpi did take care of compatibility with older kernels + <teythoon> braunr: so do we need a rebuilt libc for the latest hurd from + git ? + <braunr> teythoon: no, the hurd isn't the problem + <teythoon> ok + <teythoon> good + <braunr> the problem is the libports_stability patch + <teythoon> what about it ? + <braunr> the hurd can't work correctly without it since the switch to + pthreads + <braunr> because of subtle bugs concerning resource recycling + <teythoon> ok + <braunr> these have been fixed recently by youpi and me (youpi fixed them + exactly as i did, which made my life very easy when merging :)) + <braunr> there is also the problem of the stack sizes, which means the hurd + servers could use 2M stacks with an older glibc + <braunr> or perhaps it chokes on an error when attempting to set the stack + size because it was unsupported + <braunr> i don't know + <braunr> that may be what gg0 suffered from + <gg0> yes, both gnumach and eglibc were from debian. seems i didn't + manually upgrade eglibc from yours + <gg0> i'll reinstall them now. 
let's screw it up once again + <braunr> :) + <braunr> bbl + <gg0> ok it boots + <gg0> # apt-get install + {hurd,hurd-dev,hurd-libs0.3}=1:0.5.git20131101-1+rbraun.7 + {libc0.3,libc0.3-dev,libc0.3-dbg,libc-dev-bin}=2.17-97+hurd.0+rbraun.1+threadterm.1 + <gg0> there must a simpler way + <gg0> besides apt-pinning + <gg0> making it a real "experimental" release might help with -t option for + instance + <gg0> btw locales still segfaults + <gg0> rpctrace from teythoon gets stuck at + http://paste.debian.net/plain/74072/ + <gg0> ("rpctrace locale-gen", last 300 lines) diff --git a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn index feea7c0d..02b6ab05 100644 --- a/open_issues/libpthread/t/fix_have_kernel_resources.mdwn +++ b/open_issues/libpthread/t/fix_have_kernel_resources.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -477,3 +478,824 @@ Address problem mentioned in [[/libpthread]], *Threads' Death*. 
failing bad <braunr> i just need to polish a few things, wait for youpi to finish his work on TLS to resolve conflicts, and that will be all + + +## IRC, freenode, #hurd, 2013-10-30 + + <braunr> FYI, the packages on my repository enable actual thread + destruction, and i've altered the libports_stability.patch + <braunr> it nows only sets the global timeout to 0 + <braunr> now* + <braunr> we actually can't let translator "die" on global timeout because + of a race issue + <braunr> tested for about two weeks now and no major problem sighted + <braunr> top reports processes running for 100% of their time when + terminating threads, but i expect it's simply mach/proc aggregating their + run time to the task + <braunr> 100% of cpu time + + +## IRC, freenode, #hurd, 2013-11-08 + + <braunr> teythoon: darnassus is currently running a modified glibc with + thread destruction, yes + <teythoon> braunr: did that require any fixups in Hurd that I'd have missed + ? + <braunr> no + <braunr> well + <teythoon> b/c the resulting hurd package would not boot + <braunr> actually yes + <braunr> one + <braunr> i'll push the patch somewhere + <teythoon> iirc the mach-defpager spewed some error and /hurd/init failed + to bootstrap the system + <braunr> teythoon: + http://darnassus.sceen.net/~rbraun/0001-Prevent-diskfs-translators-from-destroying-main-thre.patch + <braunr> make sure you have the proper gnumach packages too :p + <teythoon> well, that could very well account for my trouble ;) + <teythoon> uh + <teythoon> well + <braunr> gnumach implements thread destruction, glibc uses it, hurd makes + sure it doesn't exit from main + + +## IRC, freenode, #hurd, 2013-11-12 + + <braunr> ok so, calling pthread_exit() from main isn't the same as + returning from main() + <braunr> unlike what some man pages seem to say + <braunr> so loosing task info when destroying the main thread is actually a + proc bug + <braunr> ugh + <teythoon> ^^ + <braunr> or a glibc one + <teythoon> the proc server, 
your favorite Hurd component... + <braunr> :) + <braunr> hm :/ + <braunr> looks like command line arguments are stored on the stack of the + main thread + <braunr> and proc merely receives the addresses of those in the target task + <neal> why not just keep the main thread around? + <neal> it represents a minor resource leak, true + <braunr> yes + <braunr> that's the hack i suggested + <neal> but it is relatively small + <braunr> well no + <braunr> my hack was about diskfs translators + <braunr> it should be generalized in libpthread + <braunr> seems reasonable + <braunr> let's do it >) + + +## IRC, freenode, #hurd, 2013-11-13 + + <youpi> braunr: there is a thread destruction issue in the experimental + ocaml build, worth looking at, probably + <braunr> what do you mean ? + <youpi> ... testing 'testfork.ml': ocamlcocamlrun: + ../libpthread/sysdeps/mach/pt-thread-halt.c:51: __pthread_thread_halt: + Unexpected error: (ipc/send) invalid destination port. + <youpi> during the experimental ocaml build + <braunr> well yes + <braunr> thread recycling is buggy + <braunr> i had the choice to fix it, or implement true destruction + <braunr> i'm tweaking my patch so it leaves the main thread stack untouched + on destruction + <braunr> and it should be ready + <braunr> for review at least + + +## IRC, OFTC, #debian-hurd, 2013-11-13 + + <gg0> ironforge out of memory during ruby1.9.1 rebuild. during test which + creates 10000 threads + <gg0> ironforge out of memory during ruby1.9.1 rebuild, test which creates + 10000 threads + <gg0> i guess ironforge kernel has been rebuilt against -95, correct? + <youpi> err, what kernel? + <gg0> 23:37 < youpi> hurd needs a rebuild to be able to work with the newer + eglibc + <gg0> i mean hurd + <youpi> yes, libc0.3 breaks the old packages anyway + <gg0> wrt ENOMEM, was it expected? + <gg0> wrt disk problems, aren't there on alioth only? 
+ <youpi> well 10,000 threads is a lot, especially on 32bit machine with 2M + default stack size + <youpi> that makes 2GiB stacks + <youpi> can't fit in a 2/2 split model, which gnumach uses + <gg0> well, though active thread should die right away, just after set x to + false, if i read it correctly + <youpi> perhaps the stacks are not correctly reused + <youpi> that's probably worth digging in libpthread + <youpi> by putting printfs, etc. + <youpi> it seems stacks are never reused indeed, damn + <youpi> I just wrote a small test that creates threads which just print + their stack address + <youpi> that takes just a few minutes to do + <gg0> i see. about reusage i guess you mean base address is kindof always + incremented + * gg0 likes being wrong + <youpi> that's it, yes + <youpi> gg0: take care, by keeping being wrong all the time, sometimes you + get right ;) + <youpi> and you are definitely right here :) + <youpi> Mmm, but the stack is really deallocated + <youpi> and the numbers wrap around + <youpi> I wonder how that is :) + <youpi> ok, creating 20 000 threads does work + <youpi> perhaps ruby does odd things which makes it not work + + +### IRC, OFTC, #debian-hurd, 2013-11-14 + + <gg0> UID PID PPID TH MSGI MSGO SZ RSS SC STAT TIME COMMAND + <gg0> 1012 16446 15473 720 987 509 1.89G 23.6M 1 Hu 0:00.15 + /home/gg0-guest/ruby/ruby1.9.git/ruby1.9.1 + -I/home/gg0-guest/ruby/ruby1.9.git/lib -W0 bootstraptest.tmp.rb + <gg0> 720 threads, stuck + <youpi> 2G SZ is very big :) + <gg0> 00:42 < youpi> perhaps ruby does odd things which makes it not work + <gg0> is that enough to file a ruby bug? as ruby suggests itself btw + <youpi> no, they will probably not be able to investigate + <youpi> but you can already check out how they create threads + <youpi> and try to reproduce the same with a small C program + <gg0> ehm on ruby2.0 with *context _enabled_ i can not reproduce it + +See [[/open_issues/glibc]] for `*context` functions. 
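As a rough sanity check of the stack figures quoted above, here is a minimal sketch; it is not taken from the logs. The helper names `stacks_total` and `max_threads` are made up for illustration. The 2 MiB default stack and the 2 GiB user half of the 2/2 split are the values mentioned in the discussion, and the 64 KiB figure is the Hurd server stack size mentioned elsewhere in these logs. The bound is optimistic because it ignores every other mapping in the task.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

DEFAULT_STACK = 2 * MIB   # default pthread stack size quoted in the log
USER_SPACE = 2 * GIB      # user half of the 2/2 split quoted in the log


def stacks_total(threads, stack_size=DEFAULT_STACK):
    """Address space taken by the stacks of `threads` simultaneously live threads."""
    return threads * stack_size


def max_threads(address_space=USER_SPACE, stack_size=DEFAULT_STACK):
    """Optimistic upper bound on live threads, as if stacks were the only mappings."""
    return address_space // stack_size


if __name__ == "__main__":
    for n in (720, 1000, 10000):
        print(f"{n:>6} threads: {stacks_total(n) / GIB:.2f} GiB of stacks")
    print("bound with 2 MiB stacks: ", max_threads())
    print("bound with 64 KiB stacks:", max_threads(stack_size=64 * 1024))
```

Under these assumptions, 10,000 simultaneously live threads with 2 MiB stacks would need about 19.5 GiB, far more than the 2 GiB available, so such a test can only succeed if threads terminate and their stacks are actually released along the way.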
+ + +## IRC, freenode, #hurd, 2013-11-14 + + <braunr> nice, i got glibc packages with thread destruction + <braunr> building hurd packages against it now + <braunr> everything seems fine + <braunr> hurd packages ready, let's see + + <gg0> ruby1.9.1 FTBFS due to a couple of tests + https://buildd.debian.org/status/fetch.php?pkg=ruby1.9.1&arch=hurd-i386&ver=1.9.3.448-1&stamp=1384265526 + <gg0> second one creates 10000 threads and machine got ENOMEM + <braunr> bootstraptest.tmp.rb: [BUG] [BUG] pthread_cond_init: Cannot + allocate memory (ENOMEM) ew + <gg0> few hours ago trying to reproduce it: + <gg0> 01:20 < gg0> UID PID PPID TH MSGI MSGO SZ RSS SC STAT + TIME COMMAND + <gg0> 01:20 < gg0> 1012 16446 15473 720 987 509 1.89G 23.6M 1 Hu + 0:00.15 /home/gg0-guest/ruby/ruby1.9.git/ruby1.9.1 + -I/home/gg0-guest/ruby/ruby1.9.git/lib -W0 bootstraptest.tmp.rb + <braunr> yes that's expected + <braunr> our stacks are 2M + <braunr> 10k threads means right over 2G of stacks + <braunr> userspace is restricted to 2G + <gg0> but if i read the test in question correctly, thread should just set x to + false then die + <braunr> so ? + <gg0> and ENOMEM popped up when the thread count was at 720 + <braunr> hum + <braunr> 10k threads would actually be 20G + <braunr> 1k threads is 2G + <braunr> 720 is about 1.5G + <braunr> the rest is probably the ruby runtime + <gg0> youpi tried to create 10000 threads, no problem. he guessed something + wrong on ruby side + <gg0> indeed on ruby2.0 such test succeeds + <braunr> you can't create 10k threads unless you change the stack size + <braunr> hurd servers use a stack size of 64k by default which allows them + to go up to 30k iirc + <braunr> but normal applications use the default 2M + <gg0> i guess you mean 10000 threads active at the same time. 
test in + question should make them die after simply setting x to false, i guess + youpi's test did so as well + <braunr> no + <braunr> it's about stacks + <braunr> hm + <braunr> yes at the same time but + <braunr> thread recycling is known to be buggy + <braunr> which is what i'm currently fixing btw + <neal> what's the bug? + <braunr> neal: there are several subtle issues + <braunr> for example, joining a thread that is also calling pthread_exit + can fail badly + <neal> hmm + <neal> good that you are on it then :) + <braunr> or detaching + <braunr> i don't remember the details + <braunr> but i remember such problems + <braunr> apparently, keeping the stack of the main thread isn't enough + <braunr> :( + <braunr> for now, i'll keep the entire thread + + +## IRC, freenode, #hurd, 2013-11-15 + + <gg0> i wasn't doing anything, just some single test runs. but yes, also + that one which creates hundreds of threads + <gg0> it would like creating 10000 but goes out of memory after ~720 + <gg0> btw same tests succeed on ruby2.0, so they should be fixed by + backporting some changes + <braunr> actually it looks more like a deadlock .. + <gg0> deadlock that says ENOMEM? + <braunr> ? + <braunr> ENOMEM is returned because the test task has no more virtual + memory + <braunr> this doesn't mean the rest of the system should fail + <gg0> ok i thought you were talking about such test + <braunr> no it's something else + <braunr> a deadlock in a critical server + <braunr> the root file system maybe + <gg0> braunr: htop and ps hang. 
just run the test once again + <gg0> now you should still be able to login + <braunr> htop/ps hanging means one process is unable to reply to queries + sent to the message port/thread + <braunr> procfs does that to report on what a process is waiting + <braunr> it usually mean there is a bug around signals, since the message + thread is also in charge of delivering signals + <braunr> use ps -eM + <braunr> and kill -KILL + <braunr> hum + <braunr> root 954 S<o 0:00.05 /hurd/crash --dump-core + <braunr> dumping cores is known not to work most of the time + <braunr> exodar shouldn't be configured like that + <braunr> so yes, the crash server is hanging + <braunr> gg0: i've set it to crash --kill and killed the hanging crash + instances blocking top/ps + <gg0> nice + + <braunr> my thread destruction patch and tls are indeed conflicting a bit + <braunr> i suspect the tcb is used after being freed + <braunr> i think i'll simply recycle the tcb, along with the pthread + structs + <braunr> ok i think it's fine now + <braunr> there was also a small bug in the tls code, keeping a reference on + the thread port + <braunr> mach reference counting is so counter intuitive :/ + <braunr> well, error-prone + + <braunr> argh, more bugs in libc :( + <teythoon> :/ + <teythoon> but don't worry, there is always one more bug ;) + <braunr> this one might explain crashes that are long to trigger + <braunr> _hurd_self_sigstate() is implemented like this : + _hurd_thread_sigstate (__mach_thread_self ()); + <braunr> it leaks a reference on the current thread each time it's called + <teythoon> >,< + <braunr> but glibc maintains such references, so if the maximum value is + reached, and references are dropped, the value can reach 0 + <teythoon> ouch + <braunr> at which point any call on a thread will result in an invalid send + right + <braunr> and probably an assertion + <teythoon> well it's a good thing then that you found it :) + <braunr> i think it's always been there + <braunr> but it's 
more apparent since jknoenig's patch on signal + dispositions + <braunr> the maximum number of user references in mach is 64k + <braunr> this right leak isn't easy + <braunr> tls is very tricky heh :) + <braunr> for the main thread, tls initialization happens after the thread + creation, obviously + <braunr> but for other threads, it's initialized before starting them + <braunr> the leak was probably an overlook caused by that complexity + <braunr> teythoon: actually that leak i mentioned in _hurd_self_sigstate + has only been recently added in Convert sigstate to TLS + <braunr> so it's merely tls integration polishing + <braunr> youpi: i'm currently reviewing changes related to tls and i think + there is a bug in _hurd_self_sigstate + <braunr> calls to mach_thread_self() should be paired with + mach_port_deallocate to avoid urefs overflows + <braunr> and right leaks + <braunr> _hurd_critical_section_lock is probably affected too + <braunr> hm + <braunr> mhmm + <braunr> in glibc, hurd/hurd/signal.h, _hurd_critical_section_lock + <braunr> why is the sigstate unlocked after the call to + _hurd_thread_sigstate + <braunr> _hurd_thread_sigstate doesn't seem to lock it .. + <braunr> unless __spin_lock_init does it + <braunr> yes, leak solved :) + + +## IRC, freenode, #hurd, 2013-11-16 + + <braunr> argh, _hurd_critical_section_lock is called before the send right + on the main thread is fetched in libpthread :/ + <teythoon> is that bad ? 
+ <braunr> the sigstate is supposed to be initialized after pthreads + <braunr> _hurd_critical_section_lock will create it if it sees there is + none + <braunr> creating the sigstate is currently what makes the send right leak + <teythoon> ok + <teythoon> it's bad then + <braunr> it may be due to my patch + <braunr> _hurd_critical_section_lock is called during pthreads + initializatio + <braunr> n + <braunr> before the sigstate for the main thread is created, but after the + pthread init routine is called + <braunr> it does indeed look like the code wasn't written with thread being + destroyed some day in mind :/ + <teythoon> braunr: btw, if you ever feel like benchmarking, sysbench has a + benchmark for threads contending for a lock + <braunr> yes i've used it before + <teythoon> was it useful for this purpose ? + <braunr> no :) + <teythoon> :/ + <braunr> we already know libpthread isn't optimized + <braunr> and felt it when we switched from cthreads + <braunr> humpf + <braunr> simply calling malloc implies a call to + _hurd_critical_section_lock + <braunr> on the other hand, unlike what some glibc comments say, this does + work + + +## IRC, freenode, #hurd, 2013-11-17 + + <braunr> looks like i've fixed all leak issues with thread destruction and + tls :) + <braunr> let's see if ext2fs.static works fine too + <youpi> braunr: \o/ + <youpi> sorry about introducing the tls ones :) + <braunr> no worries, it was expected + <braunr> and tls was really needed :) + <braunr> i mean, i expected to have some problems when rebasing on tls :p + <teythoon> braunr: this is good news, how is your rootfs translator holding + up? + <braunr> building hurd packages right now + <braunr> for now, only test applications and a few really multithreaded + ones (e.g. 
iceweasel) have been tested + <braunr> well, the system boots :) + <teythoon> awesome :) + <braunr> stressing the file system with git while watching youtube videos + with gnash doesn't make the system crash + <teythoon> you can actually watch yt videos on your Hurd box ? + <braunr> yes + <braunr> for a while now + <teythoon> o_O + <braunr> can't you ? + <teythoon> I never even dared to try + <braunr> hehe + <braunr> teythoon: looks stable enough to install on darnassus + + +## IRC, freenode, #hurd, 2013-11-18 + + <teythoon> braunr: wrt to your thread destruction patchset, I thought you + also had to fix the proc server ? + <braunr> teythoon: no + <braunr> the problem was in glibc + <braunr> i may have to fix proc/procfs though, because cpu time gets wrong + with the patch + <braunr> currently, it's the addition of the cpu time of all threads + <braunr> mach provides aggregate times including destroyed threads though + <teythoon> ah, I see + <braunr> one side effect is that you'll see processes sometimes taking 100% + of cpu time although the cpu is unused + <braunr> or the cpu time of a process gets reduced :) + <braunr> i guess the 100% cpu is how top sees a negative increment + <teythoon> ^^ + <braunr> gg0: do my threadterm packages help with ruby1.9 ? + <braunr> i mean, can you test with them some time ? 
:) + + +## IRC, freenode, #hurd, 2013-11-21 + + <braunr> youpi: ping about my question regarding error handling in the + proposed thread_terminate_release call + <youpi> I agree with what Neal said + <braunr> he didn't say anything about error handling + <braunr> see + http://lists.gnu.org/archive/html/bug-hurd/2013-11/msg00181.html + <braunr> i think i should make the call fail on first error + <braunr> it shouldn't happen, so it would merely serve to catch bugs + <braunr> it's not easily recoverable (if it's recoverable at all) + <youpi> uh, I thought he had + <youpi> I must have dreamt + + <braunr> i think i'll go ahead with thread destruction integration + + +## IRC, freenode, #hurd, 2013-11-25 + + <braunr> i've pushed the thread destruction patches for gnumach upstream + <braunr> and made a branch in glibc for that too + <teythoon> awesome :) + <braunr> youpi: i don't remember how glibc changes should be managed + <braunr> once those are applied, i'll commit in libpthread + <youpi> braunr: usually we create a topgit branch, and then we add the + patch from that to the debian repository + + +## IRC, freenode, #hurd, 2013-11-29 + + <braunr> youpi: i still have a leak somewhere with the thread destruction + patches + <braunr> maybe on the host priv port in bootstrap servers (root fs and proc + server) + <braunr> it prevents priority adjusting in libports and can easily bring + down a system because servers can start trashing a lot sooner, as it was + the case during the pthread migration + +See discussion about that on [[/open_issues/libpthread]]. + + <braunr> so i'll hunt it down before merging + + +## IRC, freenode, #hurd, 2013-12-19 + + <braunr> darnassus still has the libports priority adjustement leaks + <braunr> i'll apply a few more patches to my hurd packages + + <braunr> humpf, proc seems to have a problem getting the host priv port :/ + <teythoon> thats bad + <teythoon> what did you do ? 
+ <braunr> i fixed all the leaks in libports when adjusting priorities + <braunr> the last one being releasing the host priv right + <braunr> and i get errors at boot time from the proc server + <teythoon> remember when i had this problem ? + <braunr> proc doesn't get the host priv port the normal way since the + normal way is to get it from proc iirc + <teythoon> ah, thought you fixed that + <braunr> so i guess the alternate way doesn't add a reference + <braunr> well the leak is fixed + <braunr> the problem you had was due to the leak which made the host priv + port reach its max uref value + <braunr> now it's just the proc server + <braunr> the system works fine though + <teythoon> for real ? + <teythoon> the proc server needs the host priv port for getting the new + tasks + <braunr> well yes + <teythoon> how can it work w/o it ? + <braunr> i don't know .. + <braunr> i guess the problem is internal to glibc + <braunr> i mean, get_priv_ports fails, but that doesn't mean the host priv + port is lost + <teythoon> could be + <teythoon> are you running a patched rootfs translator too ? + <braunr> yes + <teythoon> ok + <teythoon> b/c i remember having trouble with that + <braunr> right, the glibc call would make proc call __proc_getprivports + <braunr> hum + <braunr> teythoon: do you remember how proc gets its host priv port ? + <teythoon> from init + <teythoon> i think + <braunr> startup_procinit ? 
+ <teythoon> possibly + <braunr> right + <braunr> so it's probably not the host priv port + <braunr> i mean, the error is about another invalid send right + <braunr> hm nope, it is on host_priv :/ + <braunr> hm ok i see, looks like a bug from a debian patch + <braunr> or rather, a bug fix not yet imported into the debian package + <braunr> teythoon: you actually fixed it in + 2c9422595f41635e2f4f7ef1afb7eece9001feae + <braunr> great :) + <teythoon> ah, that one + <braunr> i was looking at the upstream code and couldn't understand what + was going wrong + <braunr> :) + <braunr> much better + <braunr> except ps -eT doesn't work any more .. + <braunr> interestingly, with the thread destruction patch, ps -eT sometimes + work, and sometimes doesn't + <braunr> the behaviour doesn't seem to change without a reboot + <braunr> and of course, as soon as i say it, i'm proven wrong by the next + test :) + + +## IRC, freenode, #hurd, 2013-12-26 + + <braunr> __pthread_sigstate_init doesn't seem to be converted to TLS in the + upstream repository master branch + + <braunr> ah dammit, the global signal dispositions patch touches both glibc + and libpthread @#! + <braunr> what a mess + + <braunr> youpi: do you have some time to quickly review the + rbraun/thread_destruction branch in libpthread ? + <braunr> there might be conflict with some glibc patches + <braunr> or do you prefer it on the mailing list ? 
+ <braunr> (i used a branch because it's not based on master) + <youpi> rather mail the list, yes + <braunr> ok + <youpi> it'd also be useful to write the rationale + <youpi> probably to be left as comment in the source code + <braunr> yes, that branch was for personal storage :) + <youpi> so the reader knows how things are recycled or not + <braunr> hm + <braunr> that should already be the case + <youpi> ok + <braunr> the two structures that are still recycled are the pthread struct + and tls + <braunr> it's quite obvious from pthread_alloc + <braunr> and well commented there + <braunr> for tls, it's explained in pthread_exit + + <braunr> there, thread destruction finally merged in + <braunr> and now, we can remove the ugly hacks that were done for + threadvars + <braunr> :) + <braunr> change stacks at will and support all sorts of weird languages and + runtimes + <teythoon> braunr: cool :) + + +## IRC, freenode, #hurd, 2013-12-31 + + <youpi1> braunr: I've added sigstate_locking, sigstate_thread_reference and + tls_thread_leak to the debian glibc 2.18 package + <youpi1> I believe that's complete? + <youpi1> is mach_msg_uspace_options ready for being added? Does it bring + much speedup? + <youpi1> AIUI, thread_terminate_release is the union of the branches + mentioned above? + <youpi1> (I'm cleaning up branches in the glibc repo) + <braunr> youpi1: mach_msg_uspace_options can be left over, it only affects + selects and not noticeably + <braunr> yes, those three branches are the only ones needed for thread + destruction + <youpi1> ok + <youpi> does the hurd changes depend on these changes ? + <braunr> no + <youpi> good :) + <braunr> only on tls for one of them + <braunr> (it's about the default stack size of 64k for hurd servers) + <youpi> and we have had this in debian for a long time already :) + <braunr> yes + <youpi> (how big were they before?) + <youpi> (where they a couple MiB, and thus exploding to GiBs on thousands + of threads?) 
+ <braunr> 64k + <braunr> pthread stacks are 2M by default + <braunr> yes + + +## IRC, freenode, #hurd, 2014-01-14 + + <youpi> braunr: it seems your time change in libps made ps produce odd re + <youpi> results + <youpi> samy 10987 5 -514358:-18:-42.17 /hurd/firmlink tmp + <braunr> youpi: wow :) + <braunr> that change is supposed to run on a system where threads actually + get destroyed + <braunr> but i don't see what could trigger this side effect + <youpi> root 8629 664 56 years make -j 3 + <youpi> :) + <braunr> heh + <braunr> youpi: does the hurd package on darnassus include that patch ? + <youpi> yes + <braunr> i don't reproduce the problem :/ + <youpi> err + <braunr> what command are you using ? + <youpi> ps -feM on darnassus + <youpi> root 29642 473 7 months /usr/sbin/sshd -R + <braunr> hmmmm + <braunr> i don't see it with a make -j + <youpi> well, it's not systematic + <youpi> it's like once over two launches + <braunr> hhhhmmmmm + <youpi> it'd look like some random numbers get added + <braunr> strangely, the gcc processes started by a recursive make aren't + children of make .. 
+ <braunr> ps -eF hurd seems to report the correct values + <braunr> even ps -eM + <braunr> oO + <braunr> ps -ef too + <braunr> the problem seems to be with ps -efM + <youpi> too bad I'm always using that :) + <braunr> another way to see it is that it makes us spot the issue ;p + + +### IRC, freenode, #hurd, 2014-01-15 + + <braunr> ok i have an idea of what goes wrong in libps + + <braunr> youpi: for some reason, ps -efM lacks the PSTAT_TASK_BASIC flag + <braunr> my patch is wrong since it doesn't try to determine whether the + stats apply to a task or a thread, but that is easy to fix + <braunr> ps -efM should nonetheless provide basic task info, obviously + <braunr> in addition, the problems i've observed with ps -T (occasional + segfaults) seem to have existed before thread destruction + <braunr> they're just strongly exposed now that the thread list can be + shrunk + + <braunr> libps is quite complicated + <braunr> even hairy, i'd say .. + + +### IRC, freenode, #hurd, 2014-01-16 + + <braunr> youpi: i think i have a proper fix for libps + <braunr> i'll commit it soon + <youpi> ok + <braunr> basically, getting system times simply sets the PSTAT_THREAD_BASIC + flag + <braunr> whereas getting the run time of the terminated threads requires + PSTAT_TASK_BASIC + <braunr> i assumed it was always set in the function i changed when dealing + with a task and not a thread + <braunr> and well, that was a wrong assumption, -M can remove it if not + strictly needed by the format + <braunr> the default format asks for suspend_count, which forces the + retrieval of task basic info, so it works with -eM + <braunr> but -f doesn't :) + <youpi> so extremely bad lucky combination of flags :) + <braunr> indeed + <braunr> i added a pstat_times using the last (!) 
available flag bit + <braunr> looks clean to me + <braunr> i hope there is no abi issue + <braunr> (at least everything works with the unmodified ps-hurd executable + and a new libps.so) + + <braunr> hm, small bug in the thread destruction patch :/ + + +### IRC, freenode, #hurd, 2014-01-17 + + <braunr> good, i have proper fixes for tls in the main thread and thread + termination :) + <teythoon> awesome :) + <teythoon> i've been wondering, what does it take to get the thread + destruction stuff into the debian package ? + <braunr> i still have to build test packages, look for (unlikely, heh) + regressions and work some integration details with samuel + <braunr> hum the main thread tls fixup i guess + <braunr> youpi was waiting for me to fix that + <braunr> gnumach already provides the RPC + <braunr> so it will be in glibc soon + <braunr> i just have to get those last bits right + <braunr> teythoon: i'm quite slow at integrating stuff + <teythoon> and samuel then builds packages ? + <teythoon> i mean, is our libc package build linked to the other libc + packages ? + <braunr> libpthread is applied as a patch to glibc + <braunr> and loaded as a plugin + + +## IRC, freenode, #hurd, 2014-01-17 + + <braunr> uhm, did we break fakeroot-tcp ? + <teythoon> we did ? + <youpi> fakeroot-tcp just works fine on buildds + <braunr> with fakeroot-tcp, i get + <braunr> make[4]: Entering directory + `/home/rbraun/devel/debian/packages/hurd/hurd-0.5.git20140113/libdde-linux26/contrib/include' + <braunr> rm -f .general.d + <braunr> make[4]: *** [cleanall] Killed + <braunr> when cleaning the package before building .. + + +### IRC, freenode, #hurd, 2014-01-18 + + <braunr> damn, fakeroot-tcp won't work on darnassus .. + <braunr> uh, looks like my tls/thread destruction "fixes" do cause + regressions :( + <braunr> fakeroot works fine with debian glibc + <teythoon> which one ? 
+ <teythoon> which fakeroot i mean + <braunr> -tcp + <braunr> yes, it fails as soon as i use the patched glibc :/ + <braunr> at least it's easy to reproduce + + +### IRC, freenode, #hurd, 2014-01-20 + + <braunr> great, 3rd libc version installed on darnassus, let's see if i can + build hurd packages against that + + +### IRC, freenode, #hurd, 2014-01-21 + + <braunr> damn, fakeroot-tcp still crashes with my latest changes .... + + <braunr> darnassus looks in good shape + <braunr> youpi: ^ + <braunr> youpi: if you have other tests, feel free to do them now + <braunr> i feel confident about committing the changes, if you're ok with + it + <youpi> which changes ? + <youpi> I'm a bit lost in what you were talking about :) + <braunr> you can find them in 2 patches in /var/tmp on darnassus + <braunr> one is about fixing thread destruction + <braunr> i'm pretty certain about this one so i'll commit it directly + <braunr> the other is fixing the tcb of the main thread + +[[open_issues/libpthread]]. + + <braunr> where i simply do tcb->self = thread->kernel_thread :) + <braunr> with a comment explaining why i don't do something else like + deallocating the unused tcb + <youpi> braunr: ok, that looks good + <teythoon> braunr: awesome :) + <braunr> youpi: ok + + +### IRC, freenode, #hurd, 2014-01-22 + + <braunr> there, libpthread should be fine now + + +## IRC, freenode, #hurd, 2014-02-06 + + <braunr> youpi: in case you're planning to upgrade glibc (or not), the + thread destruction changes are complete + <braunr> youpi: darnassus has been running them for some weeks with no + visible regression + <youpi> braunr: ok, good + <youpi> including it in glibc was on my todo list indeed + <youpi> and Adam indeed plan for a 2.18 upload + <braunr> good :) + <youpi> braunr: this is up to 7c6dc6e28b2fc4b67934223f41cf080ffe58b230, + right? 
(Wed Jan 22, Fix up the main thread TCB) + <braunr> yes + <braunr> oh, i just saw 2.17-98~0 glibc packages on debian-ports :) + <youpi> yes, it's just to fix the dhcp crash + <braunr> ah yes, it's not 2.18 + <youpi> 2.18 is available in experimental + + <youpi> braunr: just to make sure: did you have + 983b18a6ff16f5687a9ece63a50d1831dec88609 in libc on darnassus? + <youpi> (which drops the stack size hack) + <braunr> youpi: let me check + <braunr> youpi: ah no, i don't, you're right + <youpi> well, I was just wondering, nothing makes me think that was the case + :) + <youpi> what was the issue that it was raising btw? + <braunr> threadvars + <youpi> ok, but in which case? + <youpi> (to make sure I test that before committing) + <braunr> now that we switched to tls, i would assume the transition path to + be 1/ hurd stops defining that symbol, 2/ libpthread can stop using it + <braunr> the goal was to reduce the stack size of hurd server threads + <youpi> well, that's not my question :) I'm wondering in which precise case + that was breaking things + <braunr> youpi: i don't know, it shouldn't break + <youpi> ok + <braunr> youpi: just in case, don't forget that last one line patch i + committed last night, fakeroot can't work right without it + <braunr> (i made a minor change while reviewing before committing, and + obviously got it wrong :p) + <youpi> ok + + <youpi> braunr: I've upgraded libpthread in debian's eglibc btw + + <braunr> + /home/rbraun/devel/debian/packages/eglibc/eglibc-2.17/build-tree/hurd-i386-libc/libc.so.phdr: + *** executable stack signaled + <braunr> from build-tree/hurd-i386-libc/elf/check-execstack.out + <braunr> i thought glibc didn't use those + <braunr> anyway it doesn't look to be the regression i'm having + <braunr> does this ring a bell : + <braunr> Encountered regressions that don't match expected failures + (debian/testsuite-checking/expected-results-i486-gnu-libc): + <braunr> test-stpcpy_chk.out, Error 1 + <braunr> TEST 
test-stpcpy_chk.out: __stpcpy_chk normal_stpcpy + simple_stpcpy_chk + <youpi> nope + <youpi> after what are you getting this regression? + <braunr> building glibc 2.17-97 with thread destruction patches, including + the one removing the stack size hack + <braunr> during tests + <braunr> there also are "progressions", but i'm not sure what these are + <youpi> some progressions are just luck, other seem to happen on some + platforms only + <youpi> I'm not sure you want to test 2.17 + <youpi> a lot has changed between 2.17's libpthread and 2.18's libpthread + (which is now equal to cvs's libpthread + <youpi> ) + <youpi> s/cvs/git/ + <braunr> yes + <braunr> i usually build with nocheck + + +## IRC, freenode, #hurd, 2014-02-07 + + <braunr> youpi: on a vm with hurd 1:0.5.git20140203-1, upgrading to a + patched glibc 2.17-97 that includes the patch which reverts the stack + size hack, the system reboots and works fine + <youpi> ok. I don't remember what problem I was seeing + <braunr> that version of the hurd no longer defines the symbol + <braunr> but even then, there shouldn't have been any problem + <braunr> hm, or does it + <braunr> yes, it does + <braunr> youpi: the hurd package patch mentions + <braunr> Revert this for now, will have to wait for dropping the use of + <braunr> __pthread_stack_default_size from eglibc's + libpthread_hurd_cond_wait.diff + <braunr> i wonder how it got there + <youpi> IIRC I was wondering too + <braunr> i've installed my c library on darnassus and it works fine there + too + <braunr> with older (january) hurd packages + <braunr> looks good to me + + +## IRC, freenode, #hurd, 2014-02-10 + + <teythoon> braunr: btw, do the new libc packages contain your thread + destruction work ? + <braunr> teythoon: the -98 ones on experimental ? 
+ <braunr> i don't think they do + <braunr> the -18 ones should do diff --git a/open_issues/libpthread_dlopen.mdwn b/open_issues/libpthread_dlopen.mdwn index 3c36eb26..a825fdff 100644 --- a/open_issues/libpthread_dlopen.mdwn +++ b/open_issues/libpthread_dlopen.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -125,6 +125,108 @@ IRC, freenode, #hurd, 2011-08-17 <pinotree> and yes, it's known already, just nobody worked on solving it +# IRC, freenode, #hurd, 2014-01-28 + + <gnu_srs> braunr: Is this fixed by your recent patches? test_dbi: + ./pthread/../sysdeps/generic/pt-mutex-timedlock.c:70: + <gnu_srs> __pthread_mutex_timedlock_internal: Assertion `__pthread_threads' + failed. + <youpi> faq/libpthread_dlopen.mdwn: + ./pthread/../sysdeps/generic/pt-mutex-timedlock.c:70: + __pthread_mutex_time + <gnu_srs> youpi: tks. A workaround seems to be available: + LD_PRELOAD=/lib/i386-gnu/libpthread.so.0.3 + <gnu_srs> Is that possible on a buildd? + <youpi> it would be simpler to just make the package explicitly link + libpthread + <gnu_srs> Package is libdbi-drivers, providing libdbd-sqlite3 needed by + gnucash + + +# IRC, freenode, #hurd, 2014-02-17 + + <braunr> hm ok, looks like iceweasel errors all have something to do with + the libc dns resolver + <braunr> http://darnassus.sceen.net/~rbraun/iceweasel_crash + <braunr> apparently, it's simply because the memory chunk isn't page + aligned .. + <braunr> looks like not preloading libpthread tirggers lots of tricky + issues + <braunr> anyway, apparently, the malloc/free calls in libresolv don't use + locks if libpthread isn't preloaded, which explains why the program state + looked impossible to reach and why crashes look random + <congzhang> debian linux does not have the pthread load problem. 
+ <braunr> congzhang: it had it + <braunr> maybe not debian but i've found one such report for opensuse + + <braunr> ok the bug is simple + <braunr> for some reason, our glibc still uses a global _res state for dns + resolution instead of per thread ones + <braunr> uh, apparently, it's libpthread's job to define a __res_state + function for that :( + +## IRC, freenode, #hurd, 2014-02-18 + + <braunr> usually when i say it, it crashes soon after, so let's try it : + <braunr> i've been running iceweasel 27 fine for like 10 minutes with a + patched libpthread + <braunr> still no crash ;p + <braunr> with luck this extremely lightweight patch will fix all + multithreaded applications doing concurrent name resolution .... :) + <teythoon> nice :) + <braunr> let's try gnash .... + <braunr> uh, segfault on termination + <braunr> gnash works :) + <teythoon> sweet :) + <braunr> i'm very surprised we could live so long with that resolv bug + + +## IRC, freenode, #hurd, 2014-02-19 + + <braunr> youpi: the eglibc bug is about libresolv + <braunr> it uses a global resolver state even in multithreaded applications + <youpi> libresolv is a horrible part of glibc :) + <braunr> which is obviously bad + <braunr> yes .. :) + <braunr> here is the patch : + http://darnassus.sceen.net/~rbraun/0001-libpthread-per-thread-resolver-states.patch + <braunr> it's very short, it basically allocates a resolver state per + thread in the pthread struct, and sets the TLS variable __resp when the + thread starts + <braunr> should we make that hurd-specific ? + <braunr> or enclose that assignment with #ifdef ENABLE_TLS ? 
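Editorial sketch of the approach described for the resolver patch: each thread descriptor carries its own resolver state, and a thread-local pointer is set to it when the thread starts. This is not the patch itself. `struct resolver_state`, `struct thread_desc`, and `thread_resp` are hypothetical stand-ins (the last one plays the role of `__resp`), and the `__thread` keyword assumes a GCC-compatible compiler.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the real resolver state. */
struct resolver_state {
    int retries;
};

/* Hypothetical per-thread descriptor that embeds one resolver state. */
struct thread_desc {
    struct resolver_state res;
    struct resolver_state *seen;   /* what the thread observed through TLS */
};

/* Thread-local pointer to the calling thread's state (role of __resp). */
static __thread struct resolver_state *thread_resp;

static void *thread_entry(void *arg)
{
    struct thread_desc *self = arg;

    assert(self != NULL);

    /* Done once at thread start: point TLS at this thread's own state. */
    thread_resp = &self->res;

    /* Later resolver code would only go through the TLS pointer. */
    thread_resp->retries = 3;
    self->seen = thread_resp;
    return NULL;
}

/* Returns 1 if two threads each ended up with their own state, else 0. */
int resolver_states_are_distinct(void)
{
    struct thread_desc a = { { 0 }, NULL };
    struct thread_desc b = { { 0 }, NULL };
    pthread_t ta, tb;

    if (pthread_create(&ta, NULL, thread_entry, &a) != 0)
        return 0;
    if (pthread_create(&tb, NULL, thread_entry, &b) != 0) {
        pthread_join(ta, NULL);
        return 0;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);

    return a.seen == &a.res && b.seen == &b.res && a.seen != b.seen;
}
```

With a single global state instead of the thread-local pointer, both threads would write to the same object, which is the kind of sharing the log blames for the random crashes.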
+ <youpi> well, ENABLE_TLS is now always 1, iirc :) + <braunr> for the hurd, yes + <youpi> I'm surprised linux never had the issue + <youpi> no, not for the hurd + <braunr> ah + <youpi> I *had* to implement TLS for hurd because it was always 1 for + everybody :) + <braunr> ok + <braunr> so all those ifdefs could be removed and libpthread can assume tls + is enabled + <braunr> in which case my patch looks fine + <youpi> ah, thats a libpthread patch, not glibc patch + <braunr> yes + <braunr> nptl obviously did that from the start . :) + <braunr> linuxthreads had the problem a looong time ago + <youpi> ok + <braunr> i'm surprised we overlooked it for so long + <braunr> but anyway, that's a good fix + <youpi> indeed + <youpi> it seems all good to me + <braunr> well, __resp is a __thread variable + <braunr> i could add #ifdef ENABLE_TLS, but then what of the case where TLS + isn't enabled, and do we actually care ? + <braunr> #error maybe ? + <braunr> or #warning ? + <youpi> I don't think we care about the non-TLS case any more + <braunr> ok + <braunr> topgit branch i suppose ? + <youpi> well, not, hurd libpthread repo :) + <braunr> oh right ... :) + + # libthreads vs. libpthread The same symptom appears in an odd case, for instance: diff --git a/open_issues/libpthread_set_stack_size.mdwn b/open_issues/libpthread_set_stack_size.mdwn index 68f81752..21c2f18e 100644 --- a/open_issues/libpthread_set_stack_size.mdwn +++ b/open_issues/libpthread_set_stack_size.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2012, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -23,3 +24,91 @@ IRC, freenode, #hurd, 2011-10-21: <youpi> it's simply on the so-long TODO list [[glibc/t/tls-threadvar]]. 
+ +2012-12-28: + +Hurd commit 3a3fcc811e6b50b21124a5c5a128652e788a3b67 `libports: remove the +threadvars stack size hack`. + +IRC, freenode, #hurd, 2014-01-09: + + <teythoon> braunr: i'm afraid it might be your patch 3a3fcc81 that breaks + proc + <teythoon> w/ the current debian libc that is + <teythoon> braunr: i reverted that patch and now it boots again + <gnu_srs> is alternate stack and arbitrary stack sizes supported by now, or + upcoming? + <braunr> gnu_srs: supported + <braunr> well + <braunr> considering what teythoon just said, maybe not + <gg0> need to remove __pthread_stack_default_size from + libpthread_hurd_cond_wait patch too i guess + <braunr> teythoon: i don't understand why this change has any negative + effect :/ + <braunr> or + <braunr> hm no .. + <braunr> there may be a bug in the latest glibc, where changing the stack + is allowed on the ground that threadvars have been replaced with tls, but + the libpthread stack handling code does it wrong + <braunr> see 714413a7694ff534855e9e5904899695eac6c9bb in libpthread + <braunr> which the thread destruction patches already did before it was + fixed in libpthread + <braunr> and may explain why my packages work + + +IRC, freenode, #hurd, 2014-01-14: + + <youpi> teythoon: Mmm, I tried to update to the latest hurd commits, but + init dies early at boot + <youpi> exec init proc auth, and then init crashes + <youpi> downgrading libports to previous makes the issue go away + <braunr> youpi: previous ? + <youpi> previous debian package + <braunr> which patch makes it fail ? + <youpi> I'm bisecting + <braunr> i remember teythoon saying he had failures with the patch that + removes the threadvars stack size hack + <youpi> I'll try that already, ok + <youpi> yes, boots fine without this change + <braunr> ok + <youpi> perhaps some missing patches in the current 2.17-97 glibc + <braunr> or libpthread reacting badly to new stack sizes + <braunr> is 714413a7694ff534855e9e5904899695eac6c9bb included in your glibc + ? 
+ <braunr> (714413a7694ff534855e9e5904899695eac6c9bb from libpthread) + <braunr> or maybe that's not the problem + <braunr> anyway, it's normally fixed with the thread destruction patch + <braunr> i did test it and checked the stack size were correct + <braunr> sizes* + <youpi> yes, debian's glibc has it + <youpi> ok + <youpi> so that can wait + <braunr> is 959f7365fccd1c89be9938c2655eba9122171e6a (Drop threadvars + entirely) also in your glibc ? + <youpi> yes + <braunr> that's weird :/ + <braunr> the only thing i can think of is __pthread_stack_alloc miserably + failing with 2M stacks and "many" threads for some odd reason .. + <braunr> anyway, see you tomorrow + <gg0> hurd-i386/libpthread_hurd_cond_wait.diff keeps using + __pthread_stack_default_size. isn't it the problem? + * youpi wonders what that change is doing there + <youpi> and it's there from the start of that patch... + <braunr> + if (&__pthread_stack_default_size != NULL) + <braunr> checks if the symbol is actually resolved + <braunr> that's what allows regular applications to work + <braunr> it should be the same for hurd servers + + +# sigaltstack + +Likewise, `sigaltstack` is not usable at the moment. + +IRC, freenode, #hurd, 2014-02-25: + + <gnu_srs> braunr: are the split/alternate stack etc problems solved by now + so gccgo can work properly? 
+ <braunr> i don't know + <braunr> i suspect it wouldn't require much work now that tls is well + supported + <youpi> alternate stack is supposed to be working diff --git a/open_issues/linux_as_the_kernel.mdwn b/open_issues/linux_as_the_kernel.mdwn index 1d84d777..2656b1a3 100644 --- a/open_issues/linux_as_the_kernel.mdwn +++ b/open_issues/linux_as_the_kernel.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2012 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2012, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -235,3 +235,34 @@ Richard's X-15 Mach re-implementation: <braunr> i'll have to check, it's been a long time since i've really used it <braunr> they must use a pure devfs instance now + + +# IRC, freenode, #hurd, 2014-02-23 + + <desrt> so crazy idea: would it be possible to have mach as a linux kernel + module? + <desrt> ie: some new binfmt type thing that could load mach binaries and + implement the required kernel ABI for them + <desrt> and then run the entire hurd under that.... + <braunr> desrt: that's an idea, yes + <braunr> and not a new one + * desrt did a bit of googling but didn't find any information about it + <braunr> desrt: but why are you thinking of it ? + <braunr> we talked about it here, informally + <desrt> braunr: mostly because running hurd in a VM sucks + <desrt> if we had mach-via-linux, we'd have: + <desrt> - no vm overhead + <desrt> - no device virtualisation + <desrt> - 64bit (physical at least) memory support + <desrt> - SMP + <desrt> - access to the linux drivers, natively + <desrt> and maybe some other nice things + <braunr> yes we talkbed about all this + <braunr> but i still consider that to be an incomplete solution + <desrt> i don't consider it to be running "the hurd" as your OS... 
but it + would be a nice solution for development and virtualisation + <braunr> we probably don't want to use drivers natively, since we want them + to run in their own address space, with their own namespace context + <braunr> it would, certainly + <braunr> but it would require a lot of effort anyway + <desrt> right diff --git a/open_issues/mach_migrating_threads.mdwn b/open_issues/mach_migrating_threads.mdwn index bbc6ac45..16547838 100644 --- a/open_issues/mach_migrating_threads.mdwn +++ b/open_issues/mach_migrating_threads.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -101,3 +102,17 @@ In context of [[resource_management_problems]]. <braunr> i initially downloaded osfmach sources to see an example of how thread migration was used from userspace <braunr> and they do have a special threading library for that + + +# IRC, freenode, #hurd, 2014-02-18 + + <teythoon> has anyone here ever tried to enable the thread migration bits + in gnumach to see where things break and how far that effort has been + taken ? + <braunr> without proper userspace support, i don't see how this could work + <teythoon> but is the kernel part finished or close to being finished ? 
+ <braunr> no idea + <braunr> i don't think it is + <braunr> i didn't see much code related to that feature, and practically + none that looked like what the paper described + <braunr> some structures, but not used diff --git a/open_issues/mig_portable_rpc_declarations.mdwn b/open_issues/mig_portable_rpc_declarations.mdwn index ecfa06ae..f5f18880 100644 --- a/open_issues/mig_portable_rpc_declarations.mdwn +++ b/open_issues/mig_portable_rpc_declarations.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,8 +11,35 @@ License|/fdl]]."]]"""]] [[!tag open_issue_mig]] +[[!toc]] -# IRC, freenode, #hurd, 2011-11-14 + +# 32-Bit vs. 64-Bit Interfaces + +## IRC, freenode, #hurd, 2011-10-16 + + <braunr> i guess it wouldn't be too hard to have a special mach kernel for + 64 bits processors, but 32 bits userland only + <youpi> well, it means tinkering with mig + <braunr> like old sparc systems :p + <youpi> to build the 32bit interface, not the 64bit one + <braunr> ah yes + <braunr> hm + <braunr> i'm not sure + <braunr> mig would assume a 32 bits kernel, like now + <youpi> and you'll have all kinds of discrepancies in vm_size_t & such + <braunr> yes + <braunr> the 64 bits type should be completely internal + <braunr> types* + <braunr> but it would be far less work than changing all the userspace bits + for 64 bit (ofc we'll do that some day but in the meanwhile ..) 
+ <youpi> yes + <youpi> and it'd boost userland addrespace to 4GiB + <braunr> yes + <youpi> leaving time for a 64bit userland :) + + +## IRC, freenode, #hurd, 2011-11-14 <braunr> also, what's the best way to deal with types such as <braunr> type cache_info_t = struct[23] of integer_t; @@ -58,7 +86,103 @@ License|/fdl]]."]]"""]] <antrik> (which I still need to follow up on... [sigh]) -# IRC, freenode, #hurd, 2013-06-25 +## IRC, freenode, #hurd, 2012-12-12 + +In context of [[microkernel/mach/gnumach/memory_management]]. + + <tschwinge> Or with a 64-bit one? ;-P + <braunr> tschwinge: i think we all had that idea in mind :) + <pinotree> tschwinge: patches welcome :P + <youpi> tschwinge: sure, please help us settle down with the mig stuff + <youpi> what was blocking me was just deciding how to do it + <braunr> hum, what's blocking x86_64, except time to work on it ? + <youpi> deciding the mig types & such things + <youpi> i.e. the RPC ABI + <braunr> ok + <braunr> easy answer: keep it the same + <youpi> sorry, let me rephrase + <youpi> decide what ABI is supposed to be on a 64bit system, so as to know + which way to rewrite the types of the kernel MIG part to support 64/32 + conversion + <braunr> can't this be done in two steps ? 
+ <youpi> well, it'd mean revamping the whole kernel twice + <youpi> as the types at stake are referenced in the whole RPC code + <braunr> the first step i imagine would simply imply having an x86_64 + kernel for 32-bits userspace, without any type change (unless restricting + to 32-bits when a type is automatically enlarged on 64-bits) + <youpi> it's not so simple + <youpi> the RPC code is tricky + <youpi> and there are alignments things that RPC code uses + <youpi> which become different when build with a 64bit compiler + <pinotree> there are also things like int[N] for io_stat_struct and so on + <braunr> i see + <youpi> making the code wrong for 32 + <youpi> thus having to change the types + <youpi> pinotree: yes + <pinotree> (doesn't mig support structs, or it is too clumsy to be used in + practice?) + <braunr> pinotree: what's the problem with that (i explcitely said changing + int to e.g. int32_t) + <youpi> that won't fly for some of the calls + <youpi> e.g. getting a thread state + <braunr> pinotree: no it doesn't support struct + <pinotree> braunr: that some types in struct stat are long, for instance + <braunr> pinotree: same thing with longs + <braunr> youpi: why wouldn't it ? 
+ <youpi> that wouldn't work on a 64bit system + <youpi> so we can't make it int32_t in the interface definition + <braunr> i understand the alignment issues and that the mig code adjusts + the generated code, but not the content of what is transfered + <braunr> well of course + <braunr> i'm talking about the first step here + <braunr> which targets a 32-bits userspace only + <youpi> ok, so we agree + <youpi> the second step would have to revamp the whole RPC code again + <braunr> i imagine the first to be less costly + <braunr> well, actually no + <braunr> you're right, the mig stuff would be easy on the application side, + but more complicated on the kernel side, since it would really mean + dealing with 64-bits values there + <braunr> (unless we keep a 3/1 split instead of giving the full 4g to + applications) + +See also [[microkernel/mach/gnumach/memory_management]]. + + <youpi> (I don't see what that changes) + <braunr> if the kernel still runs with 32-bits addresses, everything it + recevies from or sends through mig can be stored with the user side + 32-bits types + <youpi> err, ok, but what's the point of the 64bit kernel then ? :) + <braunr> and it simply uses 64-bits addresses to deal with physical memory + <youpi> ok + <youpi> that could even be a 3.5/0.5 split then + <braunr> but the memory model forces us to run either at the low 2g or the + highest ones + <youpi> but linux has 3/1, so we don't need that + <braunr> otherwise we need an mcmodel=medium + <braunr> we could do with mcmodel=medium though, for a time + <braunr> hm actually no, it would require mcmodel=large + <braunr> hum, that's stupid, we can make the kernel run at -2g, and use 3g + up to the sign extension hole for the kernel map + + +## IRC, freenode, #hurd, 2013-12-03 + + <azeem> I believe the main issue is redoing the RPCs in 64bit, i.e. 
the + Mach/Hurd interface + <braunr> mach has always been 64-bits capable + <braunr> the problem is both mach and the hurd + <braunr> it's at the system interface (the .defs of the RPCs) + <braunr> azeem: ah, actually that's why you also say + <braunr> but i consider it to be a hurd problem + <braunr> the hurd itself is defined as being a set of interfaces and + servers implementing them, i wouldn't exclude the interfaces + <braunr> that's what* + + +# Structured Data + +## IRC, freenode, #hurd, 2013-06-25 <teythoon> is there a nice way to get structured data through mig that I haven't found yet? diff --git a/open_issues/mig_strings.mdwn b/open_issues/mig_strings.mdwn new file mode 100644 index 00000000..3693fcc2 --- /dev/null +++ b/open_issues/mig_strings.mdwn @@ -0,0 +1,38 @@ +[[!meta copyright="Copyright © 2014 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_mig]] + +[[!toc]] + + +# IRC, freenode, #hurd, 2014-02-21 + + <teythoon> grml... migs support for variable-length c strings is broken :( + <braunr> completely .. + <teythoon> no one told me :p + <braunr> noone dares + <teythoon> to tell me ? + <braunr> or anyone else ;p + <teythoon> ^^ + <teythoon> root@debian:~# pkill mtab + <teythoon> task /hurd/procfs(19) �O� deallocating an invalid port 1049744, + most probably a bug. + <braunr> :) + <teythoon> it's still an improvement >,< + <teythoon> uh the joys... 
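Editorial sketch for the `mig_strncpy` question raised in this log. ISO C `strncpy` leaves the destination without a terminating NUL when the source holds `n` or more characters, whereas a copy that always terminates has to truncate to `n - 1` characters. `copy_terminated` and `is_terminated` are hypothetical helpers written for this note. They are not the gnumach or glibc code, and the real `mig_strncpy` return value may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most n - 1 characters from src and always
   NUL-terminate dst.  Returns the number of characters copied, not
   counting the terminator. */
size_t copy_terminated(char *dst, const char *src, size_t n)
{
    size_t i = 0;

    assert(n > 0);   /* room for at least the terminator */
    while (i < n - 1 && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
    return i;
}

/* Returns 1 if the first n bytes of buf contain a NUL byte, else 0. */
int is_terminated(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}
```

Copying `"abcdef"` into a 4-byte buffer with `strncpy` fills all four bytes with `abcd` and no terminator, while `copy_terminated` stores `abc` followed by a NUL. Callers written for one behaviour can misread the buffer when handed the other.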
+ <teythoon> gnu machs mig_strncpy behaves differently from glibcs + <teythoon> the mach version always 0-terminates the target string, the libc + variant does not + <teythoon> which one should i "fix" ? + <braunr> strncpy should behave like strncpy + <teythoon> not according to the documentation in gnumach... + <braunr> people who know it expect it not to always null terminate + <braunr> you can either fix mig_strncpy, or call it mig_strlcpy diff --git a/open_issues/mig_stub_functions.mdwn b/open_issues/mig_stub_functions.mdwn index 24a582b1..474a7675 100644 --- a/open_issues/mig_stub_functions.mdwn +++ b/open_issues/mig_stub_functions.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -39,3 +39,15 @@ License|/fdl]]."]]"""]] <teythoon> btw, is there any reason why mig couldn't generate the request and reply routines from the synchronous routines? <braunr> i guess it could + + +# Compiler Optimization + +## IRC, freenode, #hurd, 2013-12-02 + + <teythoon> braunr: inlining the mach generated x_server_procedure functions + shaved 5 minutes off my hurd package build :) + <teythoon> i guess fakeroot-tcp benefits most from this... 
I'm going to try + this w/o fakeroot and on real hardware shortly + <braunr> teythoon: nice + <teythoon> :) diff --git a/open_issues/multithreading.mdwn b/open_issues/multithreading.mdwn index 03614fae..d5c0272c 100644 --- a/open_issues/multithreading.mdwn +++ b/open_issues/multithreading.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -362,6 +362,8 @@ Tom Van Cutsem, 2009. <braunr> having servers go away when unneeded is a valuable and visible feature of modularity +[[open_issues/libpthread/t/fix_have_kernel_resources]]. + ### IRC, freenode, #hurd, 2013-04-03 @@ -381,6 +383,184 @@ Tom Van Cutsem, 2009. <braunr> ok +### IRC, freenode, #hurd, 2013-11-30 + +"Thread storms". + + <braunr> if you copy a large file for example, it is loaded in memory, each + page is touched and becomes dirty, and when the file system requests them + to be flushed, the kernel sends one message for each page + <braunr> the file system spawns a thread as soon as a message arrives and + there is no idle thread left + <braunr> if the amount of message is large and arrives very quickly, a lot + of threads are created + <braunr> and they compete for cpu time + <Gerhard> How do you plan to work around that? + <braunr> first i have to merge in some work about pagein clustering + <braunr> then i intend to implement a specific thread pool for paging + messages + <braunr> with a fixed size + <Gerhard> something compareable for a kernel scheduler? + <braunr> no + <braunr> the problem in the hurd is that it spawns threads as soon as it + needs + <braunr> the thread does both the receiving and the processing + <Gerhard> But you want to queue such threads? 
+ <braunr> what i want is to separate those tasks for paging + <braunr> and manage action queues internally + <braunr> in the past, it was attempted to limit the amount ot threads in + servers, but since receiving is bound with processing, and some actions + in libpager depend on messages not yet received, file systems would + sometimes freeze + <Gerhard> that's entirely the task of the hurd? One cannot solve that in + the microkernel itself? + <braunr> it could, but it would involve redesigning the paging interface + <braunr> and the less there is in the microkernel, the better + + +#### IRC, freenode, #hurd, 2013-12-03 + + <braunr> i think our greatest problem currently is our file system and our + paging library + <braunr> if someone can spend some time getting to know the details and + fixing the major problems they have, we would have a much more stable + system + <TimKack> braunr: The paging library because it cannot predict or keep + statistics on pages to evict or not? + <TimKack> braunr: I.e. in short - is it a stability problem or a + performance problem (or both :) ) + <braunr> it's a scalability problem + <braunr> the sclability problem makes paging so slow that paging requests + stack up until the system becomes almost completely unresponsive + <TimKack> ah + <TimKack> So one should chase defpager code then + <braunr> no + <braunr> defpager is for anonymous memory + <TimKack> vmm? + <TimKack> Ah ok ofc + <braunr> our swap has problems of its own, but we don't suffer from it as + much as from ext2fs + <TimKack> From what I have picked up from the mailing lists is the ext2fs + just because no one really have put lots of love in it? While paging is + because it is hard? + <TimKack> (and I am not at that level of wizardry!) 
+ <braunr> no + <braunr> just because it was done at a time when memory was a lot smaller, + and developers didn't anticipate the huge growth of data that came during + the 90s and after + <braunr> that's what scalability is about + <braunr> properly dealing with any kind of quantity + <teythoon> braunr: are we talking about libpager ? + <braunr> yes + <braunr> and ext2fs + <teythoon> yeah, i got that one :p + <braunr> :) + <braunr> the linear scans are in ext2fs + <braunr> the main drawback of libpager is that it doesn't restrict the + amount of concurrent paging requests + <braunr> i think we talked about that recently + <teythoon> i don't remember + <braunr> maybe with someone else then + <teythoon> that doesn't sound too hard to add, is it ? + <teythoon> what are the requirements ? + <teythoon> and more importantly, will it make the system faster ? + <braunr> it's not too hard + <braunr> well + <braunr> it's not that easy to do reliably because of the async nature of + the paging requests + <braunr> teythoon: the problem with paging on top of mach is that paging + requests are asynchronous + <teythoon> ok + <braunr> libpager uses the bare thread pool from libports to deal with + that, i.e. a thread is spawned as soon as a message arrives and all + threads are busy + <braunr> if a lot of messages arrive in a burst, a lot of threads are + created + <braunr> libports implies a lot of contention (which should hopefully be + lowered with your payload patch) + +[[community/gsoc/project_ideas/object_lookups]]. 
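Editorial sketch of the "thread pool with a fixed size" idea from the log above: receiving a request only queues it, and a fixed number of workers drain the queue, so a burst of messages cannot create a burst of threads. This is a generic pthreads illustration under simplifying assumptions, not libpager or libports code. All names (`request_queue`, `run_pool`, the two constants) are hypothetical, requests are plain integers, and the bounded queue makes the receiver block when it is full.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_WORKERS 4    /* fixed number of worker threads */
#define QUEUE_SLOTS  64   /* bounded backlog of pending requests */

struct request_queue {
    pthread_mutex_t lock;
    pthread_cond_t nonempty, nonfull;
    int items[QUEUE_SLOTS];
    size_t head, count;
    int closed;
    long processed_sum;   /* stands in for "work done" */
};

static void queue_init(struct request_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    pthread_cond_init(&q->nonfull, NULL);
    q->head = q->count = 0;
    q->closed = 0;
    q->processed_sum = 0;
}

/* Receiver side: only queues the request, never processes it. */
static void queue_push(struct request_queue *q, int item)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_SLOTS)
        pthread_cond_wait(&q->nonfull, &q->lock);
    q->items[(q->head + q->count) % QUEUE_SLOTS] = item;
    q->count++;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Tell the workers that no further requests will arrive. */
static void queue_close(struct request_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_broadcast(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Worker side: take one request at a time until closed and drained. */
static void *worker(void *arg)
{
    struct request_queue *q = arg;

    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (q->count == 0 && !q->closed)
            pthread_cond_wait(&q->nonempty, &q->lock);
        if (q->count == 0) {          /* closed and nothing left */
            pthread_mutex_unlock(&q->lock);
            return NULL;
        }
        int item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_SLOTS;
        q->count--;
        pthread_cond_signal(&q->nonfull);
        q->processed_sum += item;     /* "processing", under the lock for brevity */
        pthread_mutex_unlock(&q->lock);
    }
}

/* Queue the requests 1..nrequests and return the sum the workers processed. */
long run_pool(int nrequests)
{
    struct request_queue q;
    pthread_t threads[POOL_WORKERS];
    int started = 0;

    assert(nrequests >= 0);
    queue_init(&q);
    for (int i = 0; i < POOL_WORKERS; i++)
        if (pthread_create(&threads[started], NULL, worker, &q) == 0)
            started++;
    assert(started > 0);
    for (int i = 1; i <= nrequests; i++)
        queue_push(&q, i);
    queue_close(&q);
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return q.processed_sum;
}
```

The number of threads stays at `POOL_WORKERS` however many requests arrive. The sketch does not address requests that can only complete once a later message has been received, nor what should happen when the backlog itself grows large.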
+ + <braunr> that contention is part of the scalability problem + <braunr> a simple solution is to use a more controlled thread pool that + merely queues requests until user threads can process them + <braunr> i'll try to make it clearer : we can't simply limit the amout of + threads in libports, because some paging requests require the reception + of future paging requests in order to complete an operation + <teythoon> why would that help with the async nature of paging requests ? + <braunr> it wouldn't + <teythoon> right + <braunr> thaht's a solution to the scalability problem, not to reliability + <teythoon> well, that kind of queue could also be useful for the other hurd + servers, no ? + <braunr> i don't think so + <teythoon> why not ? + <braunr> teythoon: why would it ? + <braunr> the only other major async messages in the hurd are the no sender + and dead name notification + <braunr> notifications* + <teythoon> we could cap the number of threads + <braunr> two problems with that solution + <teythoon> does not solve the dos issue, but makes it less interruptive, + no? + <braunr> 1/ it would dynamically scale + <braunr> and 2/ it would prevent the reception of messages that allow + operations to complete + <teythoon> why would it block the reception ? 
+ <teythoon> it won't be processed, but accepting it should be possilbe + <braunr> because all worker threads would be blocked, waiting for a future + message to arrive to complete, and no thread would be available to + receive that message + <braunr> accepting, yes + <braunr> that's why i was suggesting a separate pool just for that + <braunr> 15:35 < braunr> a simple solution is to use a more controlled + thread pool that merely queues requests until user threads can process + them + <braunr> "user threads" is a poor choice + <braunr> i used that to mirror what happens in current kernels, where + threads are blocked until the system tells them they can continue + <teythoon> hm + <braunr> but user threads don't handle their own page faults on mach + <teythoon> so how would the threads be blocked exactly, mach_msg ? + phread_locks ? + <braunr> probably a pthread_hurd_cond_wait_np yes + <braunr> that's not really the problem + <teythoon> why not ? that's the point where we could yield the thread and + steal some work from our queue + <braunr> this solution (a specific thread pool of a limited number of + threads to receive messages) has the advantage that it solves one part of + the scalability issue + <braunr> if you do that, you loose the current state, and you have to use + something like continuations instead + <teythoon> indeed ;) + <braunr> this is about the same as making threads uninterruptible when + waiting for IO in unix + <braunr> it makes things simpler + <braunr> less error prone + <braunr> but then, the problem has just been moved + <braunr> instead of a large number of threads, we might have a large number + of queued requests + <braunr> actually, it's not completely asynchronous + <braunr> the pageout code in mach uses some heuristics to slow down + <braunr> it's ugly, and is the reason why the system can get extremely slow + when swap is used + <braunr> solving that probably requires a new paging interface with the + kernel + <teythoon> ok, we will 
postpone this + <teythoon> I'll have to look at libpager for the protected payload series + anyways + <braunr> 15:38 < braunr> 1/ it would dynamically scale + <braunr> + not + <teythoon> why not ? + <braunr> 15:37 < teythoon> we could cap the number of threads + <braunr> to what value ? + <teythoon> we could adjust the number of threads and the queue size based + on some magic unicorn function + <braunr> :) + <braunr> this one deserves a smiley too + <teythoon> ^^ + + ## Alternative approaches: * <http://www.concurrencykit.org/> diff --git a/open_issues/nightly_builds.mdwn b/open_issues/nightly_builds.mdwn index 96567685..f6d2c311 100644 --- a/open_issues/nightly_builds.mdwn +++ b/open_issues/nightly_builds.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -29,9 +29,25 @@ Resources: * <http://buildbot.net/> + IRC, freenode, #hurd, 2013-11-15: + + <teythoon> today I discovered buildbot, and both the master as well as + the build slave works just fine out of the box on Hurd :) + <teythoon> I'd love to set one up on darnassus + <braunr> ah nice + <braunr> we use buildbot at work too + <teythoon> even better, so you already know it + <braunr> sure we can + <braunr> no i don't + <braunr> i just know we use it :) + <teythoon> k + <braunr> but that would be a good occasion to learn + <braunr> i'm a bit busy right now, have to go soon + <braunr> we'll see the details later + <teythoon> yes :) + + [[Nightly_Builds_deb_Packages]]. + * [LAVA (Linaro Automated Validation Architecture)](http://lava.readthedocs.org/) ---- - -See also [[nightly_builds_deb_packages]]. 
diff --git a/open_issues/nightly_builds_deb_packages.mdwn b/open_issues/nightly_builds_deb_packages.mdwn index 11fc4c79..da7bdc7d 100644 --- a/open_issues/nightly_builds_deb_packages.mdwn +++ b/open_issues/nightly_builds_deb_packages.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -16,6 +17,13 @@ packages. * Need to have an automation to get from Hurd upstream Git branches to a branch usable in Debian. + IRC, freenode, #hurd, 2013-12-18: + + <teythoon> http://darnassus.sceen.net/~teythoon/hurd-ci/ has hurd and + mig and gnumach packages built directly from the upstream git + repository + + --- There is infrastructure available to test whole OS installations. @@ -29,3 +37,74 @@ There is infrastructure available to test whole OS installations. --- See also [[nightly_builds]]. + + +# Debian Jenkins Instance + +## IRC, OFTC, #debian-hurd, 2014-02-24 + + <pere> hi. can hurd be installed using d-i? If so, what about scripting + the installation on <URL: + http://jenkins.debian.net/view/g-i-installation/ >? + <gnu_srs> pere: d-i works for Hurd, yes, with full graphical interface I + dunno. Maybe you can ask about scripting in #hurd, more people are + present there? + <pere> gnu_srs: the scripts in questions are for jenkins. quite easy to + write (d-i preseed scripts and qemu boot rules). + +## IRC, OFTC, #debian-hurd, 2014-02-25 + + <pere> getting a automated test in jenkins running could show the status. + what is needed to boot the hurd d-i image with a preseed file using qemu? + <pere> git://git.debian.org/git/users/holger/jenkins.debian.net.git is the + repo with the jenkins build rules. 
+ <pere> youpi: is it possible to start the hurd d-i installer with a preseed + file from the qemu command line? --append need --kernel, which I suspect + do not make sense with hurd? + <pere> can the d-i hurd installer take a preseed file at all? my initial + try failed. :( + <teythoon> i don't know + <teythoon> there has been talk here the other day about using qemus + multiboot capabilities to directly boot the hurd + +[[debugging_gnumach_startup_qemu_gdb]], *Multiboot* + + <teythoon> i always wanted to try that out + <pere> the jenkins rules to test the install uses --kernel, --initrd and + --append in qemu to specify the preseed file. without a similar method + to boot hurd, it will be hard to automate the test. rewriting the iso + might be an option, but not a very nice one. + <teythoon> i believe that it is possible to use those options to boot a + hurd + <teythoon> i'll report back to you + <pere> I tried adding an url= option to grub when booting the installer, + but it seem to be ignored. + <pere> I suspect it did not make it to /proc/cmdline, but am not sure. + <teythoon> um + <teythoon> it should + <pere> could be. I am unable to get a shell in the installer, so I do not + know. + <teythoon> root@pluto ~ # cat /proc/cmdline + <teythoon> root=device:hd0s1 + <teythoon> oh ? select expert install, then spawn a shell or something + <pere> perhaps the preseed udeb is missing, or the network support was + enabled after preseed looked for the file? + <teythoon> uh, i don't know about that stuff, youpi creates the d-i images + <pere> ok. seem to me that the d-i images do not support preseeding at the + moment. + <teythoon> pere: when i try to use qemus multiboot support to boot the + hurd, qemu crashes :/ + <teythoon> youpi: ^ did you succeed? if so, can you share how? + <pere> teythoon: nope, I concluded it didn't work, and left it to other to + fix. 
:) + <youpi> pere, teythoon: IIRC preseeding can be put on the gnumach kernel + command line + <youpi> but I'm wondering why you can't simply modify the disk image into + doing what you want + <youpi> or you mean reinstalling the image each time? + <pere> youpi: the point is testing the installer, and that can only be done + by using the installer. :) + <youpi> ok + <pere> I would like to see something like <URL: + http://jenkins.debian.net/view/g-i-installation/job/g-i-installation_debian_sid_daily_lxde/lastBuild/ + > for hurd. diff --git a/open_issues/nptl.mdwn b/open_issues/nptl.mdwn index 3c84bfb0..be0270df 100644 --- a/open_issues/nptl.mdwn +++ b/open_issues/nptl.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -15,7 +16,8 @@ License|/fdl]]."]]"""]] # IRC, freenode, #hurd, 2010-07-31 - <tschwinge> Other question: how difficult is a NPTL port? Futexes and some kernel interfaces for scheduling stuff etc. -- what else? + <tschwinge> Other question: how difficult is a NPTL port? Futexes and some + kernel interfaces for scheduling stuff etc. -- what else? <youpi> actually NPTL doesn't _require_ futexes <youpi> it just requires low-level locks <youpi> Mmm, it seems to be so only in principle @@ -25,8 +27,10 @@ License|/fdl]]."]]"""]] <youpi> I'm not sure we really want to port NPTL <tschwinge> OK. <youpi> Drepper will keep finding things to add - <youpi> while the interface between glibc and libpthread isn't increasing _so_ much - <tschwinge> ... and even less so the interfavce that actual applications are using. + <youpi> while the interface between glibc and libpthread isn't increasing + _so_ much + <tschwinge> ... 
and even less so the interfavce that actual applications + are using. <tschwinge> We'd need to evaluate which benefits NPTL would bring. @@ -44,6 +48,63 @@ License|/fdl]]."]]"""]] <azeem> and http://lists.debian.org/debian-bsd/2013/07/msg00138.html +# IRC, freenode, #hurd, 2013-12-26 + + <nalaginrut> hm? has NPTL already supported for Hurd? + <braunr> probably won't ever be + <nalaginrut> so no plan for it? + <braunr> what for ? + <nalaginrut> no one interested in it, or no necessary adding it? + <braunr> why would you want nptl ? + <braunr> ntpl was created to overcome the defficiencies of linuxthreads + <braunr> we have our own libpthread + <braunr> (with its own defficiencies) + <braunr> supporting nptl would probably force us to implement something a + la clone + <nalaginrut> well, just inertia, now that Linux/kFreebsd has it + <braunr> are you sure kfreebsd has it ? + * teythoon thought we have clone + <nalaginrut> http://www.gnu.org/software/hurd/open_issues/nptl.html + <nalaginrut> seems someone mentioned it + <braunr> it's a "nptl-like implementation" + <nalaginrut> yes, I don't think it should be the same with Linux one, but + something like it + <braunr> but what for ? + <braunr> as mentioned in the link you just gave, "<tschwinge> We'd need to + evaluate which benefits NPTL would bring." + <nalaginrut> well, it's the note of 2010, I don't know if it's relative now + <braunr> relevant* + <nalaginrut> ah thanks + <braunr> but that still doesn't answer anything + <braunr> why are *you* talking about nptl ? + <nalaginrut> just saw pthread, then recall nptl, dunno + <nalaginrut> just asking + <braunr> :) + <nalaginrut> but you mentioned that Hurd has its own thread implementation, + is it similar or better than Linux NPTL? + <nalaginrut> or there's no benchmark yet? 
+ <braunr> it's inferior in performance + <braunr> almost everything in the hurd is inferior performance-wise because + of the lack of optimizations + <braunr> currently we care more about correctness + <nalaginrut> speak the NPTL, I ever argued with a friend since I saw + drepper mentioned NPTL should be m:n, then I thought it is...But finally + I was failed, he didn't implement it yet... + <braunr> what ? + <braunr> nptl was always 1:1 + <nalaginrut> but in nptl-design draft, I thought it's m:n + <nalaginrut> anyway, it's draft + <nalaginrut> and seems being a draft for long time + <braunr> never read anything like that + <nalaginrut> I think it's my misread + <nalaginrut> I have to go, see you guys tomorrow + <braunr> The consensus among the kernel developers was that an M-on-N + implementation + <braunr> would not fit into the Linux kernel concept. The necessary + infrastructure which would + <braunr> have to be added comes with a cost which is too high. + + --- # Resources diff --git a/open_issues/performance.mdwn b/open_issues/performance.mdwn index 772fd865..3dab6d4c 100644 --- a/open_issues/performance.mdwn +++ b/open_issues/performance.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -217,3 +217,25 @@ call|/glibc/fork]]'s case. 
<braunr> i'm only saying that the phoronix benchmark results are useless <braunr> because they didn't measure the right thing <hroi_> ok + + +# Optimizing Data Structure Layout + +## IRC, freenode, #hurd, 2014-01-02 + + <braunr> teythoon_: wow, digging into the vm code :) + <teythoon_> i discovered pahole and gnumach was a tempting target :) + <braunr> never heard of pahole :/ + <teythoon_> it's nice + <teythoon_> braunr: try pahole -C kmem_cache /boot/gnumach + <teythoon_> on linux that is. ... + <braunr> ok + <teythoon_> braunr: http://paste.debian.net/73864/ + <braunr> very nice + + +## IRC, freenode, #hurd, 2014-01-03 + + <braunr> teythoon: pahole is a very handy tool :) + <teythoon> yes + <teythoon> i especially like how general it is diff --git a/open_issues/performance/io_system/clustered_page_faults.mdwn b/open_issues/performance/io_system/clustered_page_faults.mdwn index a3baf30d..8bd6ba72 100644 --- a/open_issues/performance/io_system/clustered_page_faults.mdwn +++ b/open_issues/performance/io_system/clustered_page_faults.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -160,3 +160,6 @@ License|/fdl]]."]]"""]] immediately when he stopped attending meetings... 
<antrik> slpz: oh, you even already looked into vm_pageout_scan() back then :-) + + +# [[Read-Ahead]] diff --git a/open_issues/performance/io_system/read-ahead.mdwn b/open_issues/performance/io_system/read-ahead.mdwn index 05a58f2e..711f7691 100644 --- a/open_issues/performance/io_system/read-ahead.mdwn +++ b/open_issues/performance/io_system/read-ahead.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011, 2012, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -3041,3 +3041,26 @@ License|/fdl]]."]]"""]] still on my TODO list <braunr> it will get merged eventually, now that the large store patch has also been applied + + +## IRC, freenode, #hurd, 2013-12-31 + + <braunr> mcsim: do you think you'll have time during january to work out + your clustered pagein work again ? :) + <mcsim> braunr: hello. yes, I think. Depends how much time :) + <braunr> shouldn't be much i guess + <mcsim> what exactly should be done there? 
+ <braunr> probably a rebase, and once the review and tests have been + completed, writing the full changelogs + <mcsim> ok + <braunr> the libpager notification on eviction patch has been pushed in as + part of the merge of the ext2fs large store patch + <braunr> i have to review neal's rework patch again, and merge it + <braunr> and then i'll test your work and make debian packages for + darnassus + <braunr> play with it a bit, see how itgoes + <braunr> mcsim: i guess you could start with + 62004794b01e9e712af4943e02d889157ea9163f (Fix bugs and warnings in + mach-defpager) + <braunr> rebase it, send it as a patch on bug-hurd, it should be + straightforward and short diff --git a/open_issues/pfinet_timers.mdwn b/open_issues/pfinet_timers.mdwn index 5db192e3..244ca98b 100644 --- a/open_issues/pfinet_timers.mdwn +++ b/open_issues/pfinet_timers.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -117,3 +117,61 @@ License|/fdl]]."]]"""]] <braunr> yes, schedule_timeout could need a review <braunr> actually, fakeroot rm -rf * is a good test <braunr> and it's still damn slow + + +## IRC, freenode, #hurd, 2013-11-04 + + <braunr> i think i know why fakeroot is slow no + <braunr> now + <braunr> schedule_timeout as implemented in pfinet can only be awaken by a + timeout + <braunr> even when the expected even comes in earlier + <braunr> so yes, the proper solution is to rewrite the timers using + interruptible_sleep_on_timeout (and in turn + pthread_hurd_cond_timedwait_np) + <braunr> hm no, it's still not that straightforward :( + + +## IRC, freenode, #hurd, 2013-11-05 + + <braunr> youpi: i found the bug slowing down fakeroot-tcp + <braunr> it's actually a bug that slows down anything using the loopback + 
device + <braunr> (although there still is a problem with fakeroot chown) + <youpi> oh! + <braunr> basically + <braunr> the loopback device calls netif_rx from its xmit function + <braunr> which is perfectly fine + <braunr> except the glue code makes mark_bh (used to raise bottom halves) + broadcast a condition + <braunr> and since netif_rx is called from within xmit, which is called + from the net_bh worker thread + <braunr> the thread itself is never waiting for the condition when it is + broadcast + <braunr> it's very simple to fix, i'll send a patch later + <braunr> netcat to netcat now consumes 100% cpu + <braunr> as does fakeroot ls -Rl + <braunr> but for some reason fakeroot chown is still extremely slow + <braunr> and i've seen deadlocks in glibc (e.g. setlocale() getting the + locale lock, which is locked again in case libfakeroot fails and calls + strerror) + <braunr> so still a bit of debugging work needed + + +## IRC, freenode, #hurd, 2013-11-06 + + <braunr> chown being slow with fakeroot-tcp can also be seen on linux + + <teythoon> did your recent patch improve the performance of fakeroot-tcp ? 
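Editorial sketch, not part of the commit: the 2013-11-04 log above says that pfinet's `schedule_timeout` can only be woken by the timeout, even when the expected event arrives earlier, and suggests rebuilding the timers on a timed condition wait. The difference can be illustrated with plain POSIX `pthread_cond_timedwait`, used here as a stand-in for the Hurd-specific `pthread_hurd_cond_timedwait_np`. The `struct event`, `event_fire` and `event_wait_timeout` names are hypothetical and do not come from pfinet.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical event object; none of these names exist in pfinet. */
struct event {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int fired;                  /* predicate, protected by lock */
};

#define EVENT_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Producer side: record the event and wake every sleeper right away. */
static void event_fire(struct event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->fired = 1;
    pthread_cond_broadcast(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

/*
 * Consumer side: sleep until the event fires or timeout_ms elapses,
 * whichever comes first.  Returns 1 if the event was seen (and consumes
 * it), 0 if the timeout expired.
 */
static int event_wait_timeout(struct event *ev, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;
    int fired;

    assert(timeout_ms >= 0);

    /* pthread_cond_timedwait takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&ev->lock);
    /* Loop on the predicate: wakeups may be spurious. */
    while (!ev->fired && rc != ETIMEDOUT)
        rc = pthread_cond_timedwait(&ev->cond, &ev->lock, &deadline);
    fired = ev->fired;
    ev->fired = 0;
    pthread_mutex_unlock(&ev->lock);

    return fired;
}
```

A sleeper written this way returns as soon as `event_fire` runs in another thread, and pays the full timeout only when nothing happens. The loopback bug in the 2013-11-05 log is a different failure of the same pattern: a broadcast is lost when the only thread that would wait on the condition is the one doing the broadcasting.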
+ <braunr> yes + <teythoon> very nice :) + <braunr> but fakeroot chown is still slow + <braunr> although it's also slow on linux + <braunr> so i'm not looking into that any more for the time being + <braunr> as long as it's not used recursively on huge directories, it's + fine + + +## IRC, freenode, #hurd, 2013-11-09 + + <teythoon> braunr: fakeroot-tcp is indeed much faster now :) diff --git a/open_issues/profiling.mdwn b/open_issues/profiling.mdwn index 545edcf6..e7dde903 100644 --- a/open_issues/profiling.mdwn +++ b/open_issues/profiling.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -138,3 +138,234 @@ done for [[performance analysis|performance]] reasons. know what happen and how happen, maybe just suitable for newbie, hope more young hack like it <braunr> once it's done, everything else is just sugar candy around it + + +# IRC, freenode, #hurd, 2014-01-05 + + <teythoon> braunr: do you speak ocaml ? + <teythoon> i had this awesome idea for a universal profiling framework for + c + <teythoon> universal as in not os dependent, so it can be easily used on + hurd or in gnu mach + <teythoon> it does a source transformation, instrumenting what you are + interested in + <teythoon> for this transformation, coccinelle is used + <teythoon> i have a prototype to measure how often a field in a struct is + accessed + <teythoon> unfortunately, coccinelle hangs while processing kern/slab.c :/ + <youpi> teythoon: I do speak ocaml + <teythoon> awesome :) + <teythoon> unfortunately, i do not :/ + <teythoon> i should probably get in touch with the coccinelle devs, most + likely the problem is that coccinelle runs in circles somewhere + <youpi> it's not so complex actually + <youpi> possibly, yes + <teythoon> do you know coccinelle ? 
+ <youpi> the only really peculiar thing in ocaml is lambda calculus + <youpi> +c + <youpi> I know a bit, although I've never really written an semantic patch + myself + <teythoon> i'm okay with that + <youpi> but I can understand them + <youpi> then ocaml should be fine for you :) + <youpi> just ask the few bits that you don't understand :) + <teythoon> yeah, i haven't really made an effort yet + <youpi> writing ocaml is a bit more difficult because you need to + understand the syntax, but for putting printfs it should be easy enough + <youpi> if you get a backtrace with ocamldebug (it basically works like + gdb), I can probably explain you what might be happening + + +## IRC, freenode, #hurd, 2014-01-06 + + <teythoon> braunr: i'm not doing microoptimizations, i'm developing a + profiler :p + <braunr> teythoon: nice :) + <teythoon> i thought you might like it + <braunr> teythoon: you may want to look at + http://pdos.csail.mit.edu/multicore/dprof/ + <braunr> from the same people who brought radixvm + <teythoon> which data structure should i test it with next ? 
+ <braunr> uh, no idea :) + <braunr> the ipc ones i suppose + <teythoon> yeah, or the task related ones + <braunr> but be careful, there many "inline" versions of many ipc functions + in the fast paths + <braunr> and when they say inline, they really mean they copied it + <braunr> +are + <teythoon> but i have a microbenchmark for ipc performance + <braunr> you sure have been busy ;p + <braunr> it's funny you're working on a profiler at the same time a + collegue of mine said he was interested in writing one in x15 :) + <teythoon> i don't think inlining is a problem for my tool + <teythoon> well, you can use my tool for x15 + <braunr> i told him he could look at what you did + <braunr> so i expect he'll ask soon + <teythoon> cool :) + <teythoon> my tool uses coccinelle to instrument c code, so this works in + any environment + <teythoon> one just needs a little glue and a method to get the data + <braunr> seems reasonable + <teythoon> for gnumach, i just stuff a tiny bit of code into the kdb + + <teythoon> hm debians bigmem patch with my code transformation makes + gnumach hang early on + <teythoon> i don't even get a single message from gnumach + <braunr> ouch + <teythoon> or it is somethign else entirely + <teythoon> it didn't even work without my patches o_O + <teythoon> weird + <teythoon> uh oh, the kmem_cache array is not properly aligned + <teythoon> braunr: http://paste.debian.net/74588/ + <braunr> teythoon: do you mean, with your patch ? + <braunr> i'm not sure i understand + <braunr> are you saying gnumach doesn't start because of an alignment issue + ? + <teythoon> no, that's unrelated + <teythoon> i skipped the bigmem patch, have a running gnumach with + instrumentation + <braunr> hum, what is that aliased column ? + <teythoon> but, despite my efforts with __attribute__((align(64))), i see + lot's of accesses to kmem_cache objects which are not properly aligned + <braunr> is that reported by the performance counters ? 
+ <teythoon> no + <teythoon> http://paste.debian.net/74593/ + <braunr> aer those the previous lines accessed by other unrelated code ? + <braunr> previous bytes in the same line* + <teythoon> this is a patch generated to instrument the code + <teythoon> so i instrument field access of the form i->a + <teythoon> but if one does &i->a, my approach will no longer keep track of + any access through that pointer + <teythoon> so i do not count that as an access but as creating an alias for + that field + <braunr> ok + <teythoon> so if that aliased count is not zero, the tool might + underestimate the access count + <teythoon> hm + <teythoon> static struct kmem_cache kalloc_caches[KALLOC_NR_CACHES] + __attribute__((align(64))); + <teythoon> but + <teythoon> nm gnumach|grep kalloc_caches + <teythoon> c0226e20 b kalloc_caches + <teythoon> ah, that's fine + <braunr> yes + <teythoon> nevr mind + <braunr> don't we have a macro for the cache line size ? + <teythoon> ah, there are a great many more kmem_caches around and noone + told me ... + <braunr> teythoon: eh :) + <braunr> aren't you familiar with type-specific caches ? + <teythoon> no, i'm not familiar with anything in gnumach-land + <braunr> well, it's the regular slab allocator, carrying the same ideas + since 1994 + <braunr> it's pretty much the same in linux and other modern unices + <teythoon> ok + <braunr> the main difference is likely that we allocate our caches + statically because we have no kernel modules and know we'll never destroy + them, only reap them + <teythoon> is there a macro for the cache line size ? + <teythoon> there is one burried in the linux source + <teythoon> L1_CACHE_BYTES from linux/src/include/asm-i386/cache.h + <braunr> there is one in kern/slab.h + <teythoon> but it is out of date + <teythoon> there is ? 
+ <braunr> but it's commented out + <braunr> only used when SLAB_USE_CPU_POOLS is defined + <braunr> but the build system should give you CPU_L1_SHIFT + <teythoon> hm + <braunr> and we probably should define CPU_L1_SIZE from that + unconditionnally in config.h or a general param.h file if there is one + <braunr> the architecture-specific one perhaps + <braunr> although it's exported to userland so maybe not + + +## IRC, freenode, #hurd, 2014-01-07 + + <teythoon> braunr: linux defines ____cacheline_aligned : + http://lxr.free-electrons.com/source/include/linux/cache.h#L20 + <teythoon> where would i put a similar definition in gnumach ? + <taylanub> .oO( four underscores ?!? ) + <teythoon> heh + <teythoon> yes, four + <braunr> teythoon: yes :) + + <teythoon> are kmem_cache objects ever allocated dynamically in gnumach ? + <braunr> no + <teythoon> hm + <braunr> i figured that, since there are no kernel modules, there is no + need to allocate them dynamically, since they're never destroyed + <teythoon> so i aligned all statically declarations with + __attribute__((align(1 << CPU_L1_SHIFT))) + <teythoon> but i still see 77% of all accesses being to objects that are + not properly aligned o_O + <teythoon> ah + <teythoon> >,< + <braunr> you could add an assertion in kmem_cache_init to find out what's + wrong + <teythoon> *aligned + <braunr> eh :) + <braunr> right + <teythoon> grr + <teythoon> sweet, the kmem_caches are now all properly aligned :) + <braunr> :) + + <braunr> hm + <braunr> i guess i should change what vmstat reports as "cache" from the + cached objects to the external ones (which map files and not anonymous + memory) + <teythoon> braunr: http://paste.debian.net/74869/ + <teythoon> turned out that struct kmem_cache was actually an easy target + <teythoon> no bitfields, no embedded structs that were addressed as such + (and not aliased) + <braunr> :) + + +## IRC, freenode, #hurd, 2014-01-09 + + <teythoon> braunr: i didn't quite get what you and youpi were 
talking about + wrt to the alignment attribute + <teythoon> define a type for struct kmem_cache with the alignment attribute + ? is that possible ? + <teythoon> ah, like it's done for kmem_cpu_pool + <braunr> teythoon: that's it :) + <braunr> note that aligning a struct doesn't change what sizeof returns + <teythoon> heh, that save's one a whole lot of trouble indeed + <braunr> you have to align a member inside for that + <teythoon> why would it change the size ? + <braunr> imagine an array of such structs + <teythoon> ah + <teythoon> right + <teythoon> but it fits into two cachelines exactly + <braunr> that wouldn't be a problem with an array either + <teythoon> so an array of those will still be aligned element-wise + <teythoon> yes + <braunr> and it's often used like that, just as i did for the cpu pools + <braunr> but then one is tempted to think the size of each element has + changed too + <braunr> and then use that technique for, say, reserving a whole cache line + for one variable + <teythoon> ah, now i get that remark ;) + <braunr> :) + + <teythoon> braunr: i annotated struct kmem_cache in slab.h with + __cacheline_aligned and it did not have the desired effect + <braunr> can you show the diff please ? + <teythoon> http://paste.debian.net/75192/ + <braunr> i don't know why :/ + <teythoon> that's how it's done for kmem_cpu_pool + <braunr> i'll try it here + <teythoon> wait + <teythoon> i made a typo + <teythoon> >,< + <teythoon> __cachline_aligned + <teythoon> bad one + <braunr> uh :) + <braunr> i don't see it + <braunr> ah yes + <braunr> missing e + <teythoon> yep, works like a charme :) + <teythoon> nice, good to know :) + <braunr> :) + <teythoon> given the previous discussion, shall i send it to the list or + commit it right away ? 
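Editorial sketch, not part of the commit: the alignment discussion above can be illustrated with GCC's `aligned` attribute (note the spelling, which is the typo teythoon runs into in the log). `CPU_L1_SHIFT` is assumed to be 6 here (64-byte cache lines); in gnumach the build system is said to provide it. `CACHELINE_ALIGNED`, `struct counter` and `struct padded_counter` are hypothetical stand-ins, not gnumach definitions. As far as I know, with GCC the attribute on a single variable leaves `sizeof` of its type untouched, while the attribute on the type itself rounds `sizeof` up to the alignment, so that array elements stay aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte L1 cache lines. */
#define CPU_L1_SHIFT 6
#define CPU_L1_SIZE (1 << CPU_L1_SHIFT)

/* Hypothetical helper macro, in the spirit of Linux's ____cacheline_aligned. */
#define CACHELINE_ALIGNED __attribute__((aligned(CPU_L1_SIZE)))

/* Plain type: no alignment requirement beyond that of its members. */
struct counter {
    unsigned long hits;
    unsigned long misses;
};

/* Aligning one object: the variable starts on a cache line boundary,
   but sizeof(struct counter) is unchanged. */
static struct counter one_counter CACHELINE_ALIGNED;

/* Aligning the type: every object of this type, including each array
   element, starts on a cache line boundary. */
struct padded_counter {
    unsigned long hits;
    unsigned long misses;
} CACHELINE_ALIGNED;

static struct padded_counter per_cpu_counters[4];

/* Returns nonzero if p sits on a cache line boundary. */
static int is_cacheline_aligned(const void *p)
{
    return ((uintptr_t)p & (CPU_L1_SIZE - 1)) == 0;
}
```

A check such as `assert(is_cacheline_aligned(cache))` is the kind of assertion braunr suggests adding to `kmem_cache_init` to catch a declaration that silently lost its alignment.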
+ <braunr> i'd say go ahead and commit diff --git a/open_issues/robustness.mdwn b/open_issues/robustness.mdwn index a6b0dbfb..4b0cdc9b 100644 --- a/open_issues/robustness.mdwn +++ b/open_issues/robustness.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2011, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -12,6 +13,7 @@ License|/fdl]]."]]"""]] [[!toc]] + # IRC, freenode, #hurd, 2011-11-18 <nocturnal> I'm learning about GNU Hurd and was speculating with a friend @@ -167,3 +169,49 @@ License|/fdl]]."]]"""]] http://darnassus.sceen.net/gitweb/teythoon/reincarnation.git/blame/HEAD:/reincarnation.defshttp://darnassus.sceen.net/gitweb/teythoon/reincarnation.git/blame/HEAD:/reincarnation.defs < teythoon> uh >,< sorry, pasted twice < braunr> oh ok + + +## IRC, freenode, #hurd, 2014-02-01 + + <pere> btw, can hurd upgrade the kernel without reboot? + <teythoon> no + <teythoon> but since most functionality is not within the kernel, the more + interesting question is, what parts of the hurd can be replaced at + runtime + <pere> ok. what is the answer to that question? + <teythoon> no hurd server can be restarted transparently, i.e. w/o its + clients noticing that + <teythoon> however, if a server is not in use, it can be easily restarted + <teythoon> transparently restarting servers would be nice + <teythoon> and i believe it is even possible on mach + <braunr> teythoon: how ? + <teythoon> one has to retain two things, client-related state and the port + right + <braunr> doesn't that require persistence ? 
+ <teythoon> it does + <teythoon> but i see no reason why it should not be possible to implement + this on top of mach + <braunr> maybe + <teythoon> the most crucial thing is to preserve the receive port, and to + replace the server without race-conditions + <teythoon> receive rights can be transfered using the notification + mechanism + + <antrik> braunr: restarting servers doesn't exactly require + persistance. you only need to pass the state from the old server to the + new one, rather than serialising it for on-disk storage. it's a slightly + easier requirement... + <antrik> (most notably, you don't need any magic to keep the capabilities + around -- just pass them over using normal IPC) + <teythoon> antrik: i agree, but then again, once this is in place, adding + persistence is only a little step + <antrik> teythoon: depends. if it's implemented with persistence in mind + from the beginning, it might be a fairly small step indeed; but + otherwise, it could be two entirely different things + <antrik> this also depends on the kind of persistence you want + <antrik> I must say that for the kind of persistence *I* would like, it is + indeed quite related + <teythoon> well, please elaborate a little :) + <teythoon> what do you have in mind ? + <antrik> busy right now... 
remind me some other time if I forget :-) + <teythoon> sure diff --git a/open_issues/serial_console.mdwn b/open_issues/serial_console.mdwn index ed6358a2..827fd211 100644 --- a/open_issues/serial_console.mdwn +++ b/open_issues/serial_console.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,7 +10,8 @@ License|/fdl]]."]]"""]] [[!tag open_issue_documentation]] -IRC, #hurdfr, 2010-09-20 + +# IRC, freenode, #hurdfr, 2010-09-20 <youpi> you can compile your gnumach so that it uses the serial console, and you put the serial port on the qemu console @@ -50,3 +51,56 @@ IRC, #hurdfr, 2010-09-20 <youpi> for xen I set £ as the shortcut <manuel> that seems simpler to me in that case <youpi> a nod to the English company :) + + +# IRC, freenode, #hurd, 2014-02-20 + + <gg0> 04:06:45< gg0> ok a configuration that works w/o patching anything is + 9600 7S1 [ 7bits - parity Space - 1 stopbit ] + <gg0> 04:07:57< gg0> it displays correctly gnumach, ext2fs and following + outputs + <gg0> 04:28:05< gg0> youpi: instead if you want a patch, this one makes + gnumach default to 8N1. 
someone should still implement serial line + settings for ext2fs though + <gg0> seems something broke it later + <gg0> or it never worked on real hardware + <braunr> we definitely want it to work with 8N1 + <gg0> never had problems with _virtual_ serial consoles + <gg0> never = during last 2 years = since + http://git.savannah.gnu.org/gitweb/?p=hurd/gnumach.git;a=commitdiff;h=2a603e88f86bee88e013c2451eacf076fbcaed81 + <gg0> but i don't think i was on real hardware at that time + + +## IRC, freenode, #hurd, 2014-02-21 + + <gg0> yeah, i have one rebuilt trying to fix serial console (already give + up) + <teythoon> what were you trying to fix ? + <gg0> i didn't fix anything but it's been useful somehow :) + <gg0> this one http://paste.debian.net/plain/83292 + <gg0> initial messages from mach/hurd outputs like there was no line feed + <gg0> each line overwrites previous one + <gg0> then ext2fs outputs garbage + <gg0> then openrc start outputting fine + <gg0> minicom 9600 8N1 + <teythoon> this is from a real machine ? + <gg0> yep real machine + <teythoon> nice :) + <gg0> i fixed last line, last garbage, by switching c: from 38400 to 9600 + in inittab + <teythoon> i've a vt510 terminal connected to my hurd box, and i started to + make the serial setting in gnumach more configurable + <gg0> and disabling T0 + <teythoon> didn't finish it though + <gg0> physical vt510 connected to virtual hurd box? + <teythoon> no, it's a real box as well + <gg0> good. and does it behave as described/pasted above? + <teythoon> currently i do not put the mach console on the serial line + <teythoon> b/c it has a fixed baud rate of 9600 + <teythoon> and both grub and the getty are configured at a higher speed + <teythoon> hence my desire to improve gnumachs serial port setup + <gg0> i don't care much about speed. 
such no-line-feed behavior is quite + annoying though + <gg0> i thought it was related to CRMOD which afaiu should translate cr to + cr-lf, but i was surely missing something + <gg0> (annoying till one does ^A-A to make minicom add line feeds itself) diff --git a/open_issues/system_initialization.mdwn b/open_issues/system_initialization.mdwn index 9048b615..0df1078e 100644 --- a/open_issues/system_initialization.mdwn +++ b/open_issues/system_initialization.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,7 +10,8 @@ License|/fdl]]."]]"""]] [[!tag open_issue_hurd]] -IRC, freenode, #hurd, 2011-03-30 + +# IRC, freenode, #hurd, 2011-03-30 <kilobug> init=/bin/sh hack doesn't work for GNU/Hurd ? <antrik> kilobug: I don't think you can override init on Hurd. the init @@ -19,6 +20,23 @@ IRC, freenode, #hurd, 2011-03-30 server to *only* do that, and then pass on to standard sysv init... with that it could actually work ---- - * [[systemd]], etc. +# IRC, freenode, #hurd, 2013-11-29 + + <teythoon> we need to make the bootstrap code more robust and fix the error + handling there + <teythoon> for example, you can kill the exec server and the rootfs w/o + /hurd/init noticing it... 
+ <braunr> yes + <teythoon> there are plans in init.c to take over the exception port of the + essential processes + <teythoon> that could help + + +# [[hurd_init]] + + +# [[Anatomy_of_a_Hurd_System]] + + +# [[systemd]] diff --git a/open_issues/systemd.mdwn b/open_issues/systemd.mdwn index 1f3eea03..ca910491 100644 --- a/open_issues/systemd.mdwn +++ b/open_issues/systemd.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -27,7 +27,9 @@ Daniel Gollub, Stefan Seyfried, 2010-10-14. Likely there's also some other porting needed. -# IRC, OFTC, #debian-hurd, 2011-05-19 +# Discussion + +## IRC, OFTC, #debian-hurd, 2011-05-19 <pinotree> pochu: http://news.gmane.org/gmane.comp.gnome.desktop - the "systemd as dependency" and all the messages in it don't give me a bright @@ -172,7 +174,7 @@ Likely there's also some other porting needed. <azeem> anyway, I'll talk to the upstart guys about libnih -## IRC, OFTC, #debian-hurd, 2013-08-15 +### IRC, OFTC, #debian-hurd, 2013-08-15 <azeem> btw, I talked to vorlon about upstart and the Hurd <azeem> so the situation with libnih is that it is basically @@ -183,6 +185,16 @@ Likely there's also some other porting needed. patches +### IRC, OFTC, #debian-hurd, 2013-11-28 + + <azeem> teythoon: did you see they got libnih ported to kfreebsd? + <azeem> http://lists.debian.org/debian-devel/2013/11/msg00395.html + <azeem> "I haven't started looking into Hurd yet," sounds promising + <teythoon> saw that + <teythoon> i looked at libnih too + <teythoon> wrote a mail about that + + ## IRC, freenode, #hurd, 2013-08-26 < youpi> teythoon: I tend to agree with mbanck @@ -1035,6 +1047,2591 @@ Likely there's also some other porting needed. 
wrecks havoc on the system +## IRC, freenode, #hurd, 2014-01-03 + + <gg0> openrc on debian + https://buildd.debian.org/status/package.php?p=openrc&suite=experimental + <braunr> gg0: ah nice + + +## IRC, freenode, #hurd, 2014-01-11 + + <gnu_srs1> teythoon: is the Hurd boot now fully init compatible? I would + like to try to boot with a ported openrc in a sandbox kvm:P + + +### IRC, freenode, #hurd, 2014-01-12 + + <teythoon> gnu_srs1: yes, go ahead + <teythoon> gnu_srs1: you'll have to switch to sysvinit first + <teythoon> for that, you need patched sysvinit packages + + <gnu_srs> teythoon: do you mean the parches in #721917? + <teythoon> gnu_srs: yes, mostly, but there is one final patch missing + <gg0> uploading patched sysvinit to debian-ports? (or braunr's or + teythoon's repos) + <teythoon> gg0, gnu_srs: they are actually here + http://teythoon.cryptobitch.de/gsoc/heap/debian/ but outdated + <gnu_srs> teythoon: if the sysvinit patches are outdated, can you update + them please? and provide a package for upload to -ports (as gg0 proposed) + <teythoon> gnu_srs: i will + <gnu_srs> tks:) + + +### IRC, freenode, #hurd, 2014-01-13 + + <teythoon> gnu_srs: i updated the sysvinit patches + <teythoon> gnu_srs: for your convenience, here are packages: + http://darnassus.sceen.net/~teythoon/heap/sysvinit/ + <teythoon> gnu_srs: you have to install the sysvinit-core package first, + then the others + <teythoon> to switch to sysvinit, do update-alternatives --config runsystem + and select runsystem.sysv + <teythoon> then, do reboot-hurd and hope for the best ;) + + <gnu_srs> teythoon: thanks, will try soon. Are you submitting the updated + patches to #721917 too? + <teythoon> gnu_srs: i already did + <gnu_srs> good;-) + <gnu_srs> teythoon: rebooted with sysv:http://paste.debian.net/75925/ + <teythoon> gnu_srs: please, whenever you run into a problem, give more + context + <teythoon> which file are you talking about ? 
+ <teythoon> also, as the postinst script advised you, you need to use + {halt,reboot}-hurd *whenever* you switch the runsystem + <teythoon> not doing so won't do any harm, but it won't work + <teythoon> shutdown: /run/initctl: No such file or directory <-- that's + what happens if you run reboot (=reboot-sysv) w/o sysvinit being run + <teythoon> if you don't get a getty on the console, check /etc/inittab + <gnu_srs> I did not see a message from any postinst script about + {halt,reboot}-hurd, only LC* related messages + <gnu_srs> Ah, I missed it: You must use halt-hurd or reboot-hurd to halt or + reboot the + <gnu_srs> system whenever you change the runsystem. + <gnu_srs> I don't see anything suspicious in /etc/inittab, + eg. 1:2345:respawn:/sbin/getty 38400 tty1 is there + <teythoon> 7:2345:respawn:/sbin/getty 38400 console + <teythoon> then, you'll get a getty on the mach console, even if the + hurd-console does not start + <gnu_srs> teythoon: with 7:2345:respawn:/sbin/getty 38400 console in + /etc/inittab I get a (mach) console. + <gnu_srs> never seen that mentioned anywhere + <gnu_srs> anyway, the image is now booted with sysvinit. next to try will + be openrc:P + <teythoon> gnu_srs: you haven't heard of the inittab entry for the mach + console before b/c the inittab was not used before on the hurd + <teythoon> i should probably write that down in the wiki somewhere... + <youpi> shouldn't the upgrade of the sysvinit package do it too? + <youpi> (does it at least install a correct version on newer installs?) + <teythoon> it probably should / i'm not sure + + +## IRC, freenode, #hurd, 2014-01-13 + + <teythoon> gnu_srs: have you ported openrc already ?
+ <gnu_srs> I made it build (with temporary workarounds for PATH_MAX) but + need to change at least one file to be hurd-specific before trying to + boot + <teythoon> cool :) + <gg0> i guess not much different from http://paste.debian.net/plain/75893/ + <gg0> (i didn't say it sucks but one can find it out by taking a look) + <gnu_srs> gg0: Have you talked to zigo in #openrc? He has partial patches + (submitted to the debian repo), you do and me too. + <gnu_srs> Maybe we should align our work. + <gnu_srs> The file to make Hurd-specific is: init.sh.GNU (you start with + a copy of the Linux version, I start from a copy of the BSD version). + <gnu_srs> BTW: I don't think fstabinfo is available for GNU/Hurd! + <gnu_srs> gg0: Sorry, fstabinfo and moutinfo are parts of openrc, my bad:-D + <gnu_srs> mountinfo* + + +## IRC, freenode, #hurd, 2014-01-15 + + <gnu_srs> Hi, is there some simple way to find out the sequence of commands + executed during boot: + <gnu_srs> currently using runsystem.gnu and with sysv-rc using runsystem.sysv + <gnu_srs> I need to edit one file of OpenRC before trying to boot with + it. (mainly mounting /run/*) + <gnu_srs> Is mount functional or is settrans needed? + + +## IRC, freenode, #hurd, 2014-01-16 + + <ArneBab> gnu_srs: you are adding OpenRC? cool! + <gnu_srs> ArneBab: Working on it, will try booting when my questions here + have been answered ;-) + <teythoon> gnu_srs: mount is functional enough to boot Debian/Hurd using + sysvinit + <teythoon> gnu_srs: you could add "set -x" to runsystem.*, or add "bash" to + just drop into a shell and examine the environment interactively + <gnu_srs> teythoon: Hi, is mount a wrapper on top of settrans ...? + <teythoon> yes + <gnu_srs> how to log the boot sequence, when booting the mach console is + cleared when the hurd console starts?
+ <teythoon> you could just disable the hurd console + <gnu_srs> and the kvm console does not have scrolling functionality + <teythoon> it's actually the mach console that lacks this + <gnu_srs> copying manually is cumbersome, even if all is readable + <teythoon> but as a workaround you can use kvm .... -curses and use xterm's + backlog + <teythoon> and c&p works then :) + <gnu_srs> tks, I'll try with that:P + + +## IRC, freenode, #hurd, 2014-01-17 + + <gnu_srs> BTW: zigo successfully booted openrc on Hurd, I haven't tried + yet, you know things coming in between. He used my patches to create + updated ones:) + <gnu_srs> that version is now in experimental (I still have to operate away + all those PATH_MAX issues, and fins at least one sh file). + <teythoon> :/ + + +## IRC, freenode, #hurd, 2014-01-21 + + <gnu_srs> teythoon: I don't get a scrollable output when using -curses in + kvm, to be able to see all startup messages. Any other ideas? + <teythoon> gnu_srs: are you sure ? i just tested this, and it works nicely + for me + <teythoon> gnu_srs: that's how i created all the "screenshots" for my blog + posts + <gnu_srs> teythoon: kvm -m 1024 -net nic,model=rtl8139 -net + user,hostfwd=tcp::5564-:22 -curses -hda debian-hurd-20140115.img + <teythoon> ah, my bad + <teythoon> gnu_srs: try -nographic + <teythoon> oh, and maybe you need to add console=com0 to the gnumach + command line + <teythoon> b/c with -nographic, the first serial port is connected to qemu's + stdio + <teythoon> sorry, i mixed this up + <gnu_srs> and how to add console=com0 to the qemu start options?
-kernel + and -append are Linux only + <teythoon> # grep console /etc/default/grub + <teythoon> GRUB_CMDLINE_GNUMACH="console=com0 --crash-debug" + <teythoon> and if you want grub on the serial port: + <teythoon> # grep serial /etc/default/grub + <teythoon> GRUB_TERMINAL=serial + <teythoon> GRUB_SERIAL_COMMAND="serial --speed=9600 --unit=0 --word=8 + --parity=no --stop=1" + <gnu_srs> teythoon: with -nographic I don't get any output at all? + <teythoon> did you run update-grub ? + <gnu_srs> aha, will do + <gnu_srs> still no scrollbar with gnome-terminal, will try with xterm and + rxvt + <gnu_srs> it works: with rxvt, tks:-D + <teythoon> good :) + <teythoon> i found -nographic to be quite handy + <gnu_srs> in /etc/default/grub: GRUB_CMDLINE_LINUX_DEFAULT="quiet" and + GRUB_CMDLINE_LINUX="" #GRUB_DISABLE_LINUX_UUID=true + <gnu_srs> linux configuration parameters in a gnumach boot setup? + <teythoon> those won't be used + <teythoon> unless the grub scripts find a linux kernel in /boot + <teythoon> it's just the stock debian configuration file + <gnu_srs> nevertheless:-( + <teythoon> what ? + <gnu_srs> there could be OS-specific files: Linux, kFreeBSD, Hurd? + <teythoon> or, preferebly, one that works on every os ? like it is now ;) + <gnu_srs> OK, one that works on every OS, with a common part and + OS-specific parts? + <teythoon> that's how it is now + <teythoon> stuff with LINUX in it is used for linux + <teythoon> stuff with GNUMACH in it is used for gnumach + + +## IRC, freenode, #hurd, 2014-01-22 + + <gnu_srs> teythoon: A boot message segfault: (syv-rc specific?) + <gnu_srs> + exec /sbin/init -a + <gnu_srs> INIT: version 2.88 booting + <gnu_srs> Using makefile-style concurrent boot in runlevel S. + <gnu_srs> end_request: I/O error, dev 02:00, sector 0 + <gnu_srs> Segmentation fault + <gnu_srs> Activating swap...done. 
+ <gnu_srs> Checking root file system...fsck from util-linux 2.20.1 + <gnu_srs> another: mount: cannot remount /proc: Invalid argument + <gnu_srs> ... + <gnu_srs> df: Warning: cannot read table of mounted file systems: No such + file or directory + <gnu_srs> openrc boots on Hurd, login (user,root) works, read-only mode so + far, have to tweak some scripts:) + <braunr> not bad + <ArneBab> gnu_srs: woah! + <ArneBab> very cool! + + +## IRC, freenode, #hurd, 2014-01-22 + + <ArneBab> I think with that you are doing the most useful thing to avoid + OpenRC: If it provides almost the same as systemd and runs on the Hurd, + then there is no technical reason for using systemd, but many against it. + <ArneBab> s/avoid OpenRC/avoid systemd/ + <ArneBab> (gah, brain is jumbled) + <Shentino> I hate systemd because it monopolizes cgroups + <Shentino> which is SUPPOSED to be a generic interface open to anyone + <Shentino> I do not want an unholy alliance in a kernel-user api + <azeem_> ArneBab: the openrc maintainer will take care it will get + communicated + <azeem_> ArneBab: also, not sure what you mean about systemd, the question + isn't so much between openrc vs. systemd, but upstart vs. systemd + <azeem_> at least for the Technical Committee decision, none of the + tech-ctte members seems to consider openrc as n realistic contender + <azeem_> s/as n/as a/ + <gnu_srs> azeem_: seem like it is so:-( + <gnu_srs> maybe in a future, if openrc gets some attention and developers, + it could become a one-for-all solution;-) + <teythoon> gnu_srs: nice :) + <teythoon> ignore the proc related message + <teythoon> gnu_srs: there is no way to associate the segfault with a + process for me, can you shed some light on which process dies ? 
+ <teythoon> as for df complaining, you could fix this up like youpi did: + <teythoon> grep ln /etc/hurd/rc + <teythoon> ln -s /proc/mounts /var/run/mtab + <teythoon> the proper way is to fix our libc of course + <gnu_srs> teythoon: I was just coping the boot messages, I don't know + either which process segfaults + <teythoon> hm, maybe you can make openrc more verbose about what it starts + <gnu_srs> All I wrote earlier was from sysv-rc + <teythoon> ah + <teythoon> i've never seen that then + <ArneBab> azeem_: actually I think OpenRC is the only sane choice: It is + the only choice which supports other kernels. + <ArneBab> Shentino: I can’t stand systemd, because it establishes a tight + control over the init process by encouraging developers to add + dependencies to libraries which are so tightly coupled with others, that + they cannot be adapted without affecting the whole system. + <ArneBab> Shentino: But I wrote about that in much more details: + http://draketo.de/light/english/top-5-systemd-troubles TL;DR: + distributions become completely dependent on a small group and they throw + away the skills their maintainers already have (shell scripting) + <ArneBab> And systemd is Linux-only… + <ArneBab> …with no intention of changing that. + <braunr> why would debian strive to support other kernels ? + <braunr> instead of other kernels adjusting ? + <braunr> if posix introduces new apis, are we going to say no, or are we + going to try and support them ? + <braunr> the issue of multi-kernel support is completely irrelevant + <braunr> what you're saying about tight coupling is actually the only real + issue of systemd + <ArneBab> braunr: I see a difference between providing a stable API which + others can easily replicate and a running target with no intention to + become cross-kernel usable (my experience with udev suggests that they + won’t really try to keep anything stable for long). 
+ <ArneBab> braunr: but the tight coupling is the main issue for me, too: + that creates a vulnerability for the free software community. + <braunr> no, the free software community doesn't risk much here + <braunr> it's a technical problem + <braunr> ok, yes, posix as a point of convergence is clearly not the same + as linux as an implementation that diverges + <braunr> agreed + <ArneBab> if the systemd people decide to go a certain direction which + makes it impossible to provide a certain feature while using their new + tech, then there is a problem. + <braunr> but it still implies we have to adapt + <braunr> from my point of view, multi-kernel distributions are a technical + heresy + <braunr> if you want something really efficient, you want it very well + integrated + <teythoon> i'm concerned by the linux kernel making up interfaces w/o + proper considerations + <ArneBab> braunr: in Gentoo we had all the hassle with /usr on a separate + partition. There are usecases for that, and Gentoo wanted to provide + them, but udev (now systemd) made that impossible. + <braunr> teythoon: yes i'm concerned about that too + <teythoon> we will never be able to implement the cgroup interface for + example b/c it is too badly designed + <braunr> badly ? + <braunr> it's system specific + <ArneBab> braunr: also the systemd folks could essentially hold Linus at + ransom: “We couple userspace tightly to implementation details in the + kernel, so when you break the implementation in a way which we don’t + like, you’ll break userspace in the worst possible way” + <braunr> it's very hard to design an interface without properly + understanding what it would internally imply in the implementation + <braunr> ArneBab: that's already the case + <teythoon> system specific in a way that it will be impossible to implement + on non-monolithic kernels + <braunr> teythoon: exactly + <braunr> they didn't think of that because they don't care + <braunr> and why would they ? 
+ <braunr> it doesn't make the interface bad per se + <ArneBab> it is the case in systemd, but not in sysVinit + <braunr> well it is too + <braunr> but sysvint is less demanding + <braunr> again, the coupling is the problem + <ArneBab> yes + <braunr> systemd comes from people with other goals and interests + <ArneBab> I think everything I wrote comes down to that. + <braunr> they're very technical, very business oriented + <braunr> they want to get up to speed with competitors quickly + <braunr> they're not wrong in doing that + <braunr> it just helps understand why they get with such results + <ArneBab> A distribution would be foolish to let other people take over a + crucial part of the system when those other people have a track record of + coupling more and more parts of the system with their product. + <braunr> and i agree, i don't want it either + <braunr> but please, stop with the nonsense + <braunr> don't say openrc is the only sane one because it's the only + multikernel one + <braunr> personally, i consider that very argument almost insane itself + <braunr> considering distributions that are hardly used can really have any + weight in the decision is absurd + <ArneBab> openrc is the only sane one, because it keeps already aquired + skills useful. + <braunr> s/distributions/kernels/ + <ArneBab> (that’s my opinion) + <braunr> we have to make progress + <braunr> the init system is clearly obsolete and lacking features + <braunr> so "acquired" skills here are irrelevant too + <braunr> if it takes acquiring new skills to operate a better init system, + i'm all for it + <braunr> after all, it makes a lot more sense to me than all those fancy + languages/technologies like C# and ruby that have gained so much + popularity in so little time + <ArneBab> If you can get a similarly good init system wiothut forcing + people to learn new skills, that’s a big win. 
+ <braunr> you probably can't + <ArneBab> OpenRC is pretty close in features to systemd + <teythoon> err + <teythoon> not even close + <braunr> teythoon is right + <braunr> openrc is just sysvinit++ + <teythoon> no + <teythoon> openrc replaces the sysv rc, not sysvinit + <braunr> ok + <teythoon> it complements it + <braunr> i wasn"'t being pedantic here + <teythoon> nicely in my opinion + <braunr> yes i like it too + <braunr> but i'm afraid it's not a complete solution + <ArneBab> I think I need to be more pedantic in what I say: A system-boot + with OpenRC is pretty close in features to a system-boot using systemd. + <braunr> on the other hand, when i see discussions about event driven + systems and handling of dependencies, it sounds like something like + openrc could do the job, and something else, system-specific, would + handle the rest + <braunr> ArneBab: i disagree + <teythoon> me too + <teythoon> ArneBab: have you actually used systemd? + <ArneBab> I have read about what it provides. + <ArneBab> My udev experience burned me pretty badly. + <braunr> udev is only one part + <braunr> but actually, coupling is both a problem and a great feature + <teythoon> yes + <braunr> it's precisely the integration of many services previously + organized in a very messy way that makes it better + <braunr> and cgroups, by accurately tracking resources, allow even better + control + <teythoon> heh, i watched lennarts recent talk about kdbus + <ArneBab> but it does so by pulling in more and more parts instead of + providing a clean interface which separate projects can use. 
+ <braunr> again, the coupling is too tight + <braunr> it's hard to hook in between + <ArneBab> teythoon: I watched lennart troll a talk pretty badly… + <ArneBab> braunr: yes + <teythoon> he cites mach and hurd for having an nice ipc mechanism, and + linux lacking such a system + <braunr> haha + <braunr> i was expecting such comparisons :) + <ArneBab> that’s why he writes an init-system which does not run on the + Hurd… + <teythoon> ArneBab: that's trolling on your part ;) + <braunr> :) + <ArneBab> somehow yes… + <braunr> what i personally get out of this is that, in the end, proper + messaging at the kernel level is something people do want + <braunr> and if you make stuff like x use it, why not things like the + network stack and the file system + <teythoon> i wish the linux kernel would allow the kernel devs to write + nicer interfaces + <ArneBab> yes + <braunr> they're almost in the process of acknowledging the merits of + multiserver architectures :) + <teythoon> b/c they lack a proper ipc mechanism, they do stuff like ad-hoc + filesystem-based interfaces that are crappy to support on the hurd :-/ + * ArneBab has been out of the loop for too long… + <braunr> teythoon: what file system do you consider "crappy to support on + the hurd" ? + <teythoon> braunr: cgroupfs in particular + <teythoon> not crappy, but impossible + <braunr> well, that's probably because we need realy resource containers + first + <braunr> real* + <teythoon> no, we'll never be able to implement the current interface + <braunr> i didn't study it as you did so i trust you + <teythoon> braunr: + http://teythoon.cryptobitch.de/posts/cgroupfs-is-as-cgroupy-as-it-gets/ + <braunr> ok this would require proper support at the client side + <teythoon> yes + <braunr> i wouldn't say impossible but definitely not as clean as we would + want it + <braunr> far from it + <teythoon> how would you ever implement it w/o fixing the client + (i.e. fixing the interface first) ? 
+ <braunr> the client would translate the request + <teythoon> magical write retries ? + <braunr> probably + <teythoon> uh + <braunr> clients are the only entities which know what their file + desctiptors refer to + <braunr> descriptors* + <teythoon> yes + <braunr> so writing such a request would make the client get a magic retry, + and use the proper rpc, passing the proper rights instead + <teythoon> yeah, i can see how that could work + <teythoon> but i'm not sure that we should go down this path ... + <braunr> we probably really do'nt want to :) + <braunr> i'd personally be fine if debian would allow two init systems + <teythoon> me too + <braunr> with the powerful linux-specific one still allowing sysvinit + scripts + <teythoon> in particular b/c the sysvinit scripts are already there + <braunr> from what i've read, they all provide some decent backward + compatibility with sysvinit + <teythoon> yes + <braunr> and i think we can count on the linux community to riot if, + assuming systemd was chosen, it becomes too hard to use and tweak + <braunr> again, these people want their software to be used + <braunr> so they'll probably manage something decent in the long run, + whatever is chosen + <braunr> i don't care much + <braunr> :) + <kilobug> AFAIK Debian is planning to let users chose the init system, the + discussion is only on what should be the main/default one; but I might + have misunderstood it + <braunr> that was one of the possibilities, yes + <braunr> maybe we could help the debate by agreeing on whether or not we + consider supporting ports is that important, as port maintainers, + considering we'll probably keep the ability to use sysvinit scripts + anyway + <braunr> and making that decision known + <teythoon> and stating that we consider openrc an worthwile incremental + improvement, whatever debian decides to do wrt to the default init system + <braunr> for example, yes + <braunr> we should discuss that with youpi and thomas + <braunr> tschwinge: 
^ + <braunr> when they have some time later :) + + +## IRC, freenode, #hurd, 2014-01-24 + + <gnu_srs> Good news, a successful boot of Hurd with OpenRC: + http://paste.debian.net/78119/ :-) + <gnu_srs> ramains to fix the false negative for checkpath -W + <gnu_srs> remains* + <braunr> not bad + + <gnu_srs> teythoon: btw, the segfault happens when starting the bootlogd + service: + <gnu_srs> end_request: I/O error, dev 02:00, sector 0 + <gnu_srs> Segmentation fault + <teythoon> gnu_srs: nice progress :) + <teythoon> i've never seen bootlogd crash like that, though i + <teythoon> i'm not sure it is installed + <gnu_srs> how can I check / ? it is mounted RW and even if cd to /run which + is on tmpfs, fsysopts --readonly fails: + <gnu_srs> :fsysopts: /: --readonly: Device or resource busy + <gnu_srs> I don't have bootlogd installed the segfault is at: + <gnu_srs> checkroot.sh: hwclock.sh mountdevsubfs.sh hostname.sh hdparm + keyboard-setup + <gnu_srs> called by /etc/rcS.d/S06checkroot.sh + <teythoon> you should probably create this directory that it fails to + create early in the boot process + + +## IRC, freenode, #hurd, 2014-01-25 + + <antrik> braunr: being Linux-only is *part* of the "tight coupling" + strategy of the systemd cabal + <antrik> of course you could implement all the Linux-specific interfaces on + other systems; as you could implement any other interfaces relied upon or + provided by systemd components... + <antrik> (this is in fact Lennart's favourit cop-out argument whenever + someone raises concern about this) + <antrik> the problem however is that such alternative implementations + usually have prohibitive costs + <braunr> yes i know + <antrik> (and Lennart knows that perfectly well... 
he doesn't exactly take + pains to conceal the fact that it's a cop-out) + <antrik> their whole point is to create a tightly integrated stack of + monopolistic components, giving a shit about any possible alternatives + <antrik> this does have an obvious appeal: it *significantly* reduces the + cost of innovation within their stack + <antrik> at the same time however it kills the traditional innovation + driver in the free software eco-system, which is competition among + interchangable components + <antrik> quite frankly, it makes little sense that other distributions are + embracing systemd in droves: the tight coupling pretty much turns them + all into Fedora look-alikes, questioning the point of their very + existence... + <zacts> what is dmd? + <antrik> as for Debian considering fringe kernels in their decision, I + think it makes *perfect* sense: the real value of Debian is precisely the + fact that it supports so many different things, making it a good base to + build upon + <antrik> (it's just unfortunate that many Debian developers do not realise + this, and instead try to compete with user-oriented distributions...) + <antrik> zacts: daemon managing daemon? yet another new init system... + <zacts> yeah + <zacts> didn't know if you have an opinion on it vs systemd + <zacts> and whether or not hurd will use it.. + <antrik> hm... not sure whether I do ;-) + <braunr> antrik: one could argue an init system is hard to make + interchangeable without also making it quite poor in functionality + <antrik> the GNU system uses it, right? when using the GNU system with the + Hurd (as it's really meant to be), that would obviously mean using DMD + with Hurd. 
though I'm not sure whether anyone has actually tried that + combination ;-) + <braunr> just to make it clear, i'm totally not in favor of systemd + <braunr> i'm just trying to measure the value of an interchangeable init + system here + <braunr> value versus cost + <braunr> why is it bad to try to compete with user oriented distros ? + <antrik> braunr: I suspect most of the really good things about systemd + could be kept while making it somewhat more open at fairly little cost... + <antrik> braunr: because that's not Debian's strength -- and never will be + <antrik> trying to compete in this space too hard is bound to fail, at only + bears the risk of loosing the actual strengths + <braunr> antrik: sounds true + <antrik> hm... thinking about it, I'd say it actually makes more sense for + the init system to be distribution-specific than kernel-specific... + <braunr> that makes sense + <braunr> but systemd isn't just an init system + <antrik> it's really the distribution's job to create a well-integrated + system. and basically, that's what the systemd cabal is doing for + Fedora... + <antrik> it's just problematic that they have so much influence in + important upstream projects, that they are basically killing any chance + for others to integrate things in different ways + <braunr> antrik: agreed + <braunr> the tight coupling i refer to is about the init system and the + upstream projects you mention such as udev, acpid, console-kit, etc.. + <antrik> yeah... and GNOME + <braunr> is it really that coupled now ? + <antrik> don't really know; but judging from remarks people make, it must + be pretty bad + <braunr> this reminds me of the talk on gnome 3 last year at fosdem + <braunr> it would have been hilarious if gnome wasn't such an important + project + <antrik> (specifically, GNOME is now pretty much tied to logind AIUI, which + is not entirely inseparable from systemd -- but again, the cost is + prohibitive...) 
+ <teythoon> i don't get what all the hate here is about ... + <antrik> in fact, certain people used that as an argument why Debian must + switch to systemd as init, as they are already pretty much forced to use + various of the other coupled components anyways, and trying to decouple + them is too costly for Debian... + <braunr> teythoon: hate ? here ? + <teythoon> i mean they don't do this for fun, they actually provide + something of value, right ? + <braunr> some value + <antrik> teythoon: they? + <braunr> but they remove the kind of value that made free software evolve + the way it did, as antrik said + <teythoon> the evil cabal around systemd ;) + <antrik> I didn't say "evil"... not explicitly at least ;-) + <teythoon> then again, if you are runnign linux/gnome3 and plug in a second + monitor, that one is automatically activated + <braunr> yes, that's what they want to achieve + <teythoon> that's what they achieved + <braunr> i mean, they targetted that, it's not a side effect + <teythoon> and anyone not happy with how they did that can surely provide a + nicer solution ;) + <antrik> teythoon: as I said, there are clearly good aspects to what they + are doing -- but at the same time it's very dangerous to the free + software eco-system... + <braunr> teythoon: not easily + <teythoon> antrik: i don't buy that + <braunr> i do + <teythoon> braunr: yes, not easily. that is kind of the point, right ? + <braunr> pulling projects such as gnome into a category of kernel specific + applications is dangerous + <braunr> teythoon: well, considering who they are and the means they have, + they could have spent the time to do it right for everyone + <teythoon> maybe + <antrik> err... activating a second monitor is not in any way tied to + systemd or related compontents... I think you are talking about a second + seat + <teythoon> that's another killer feature they achieved, yes + <antrik> (which is nice, but quite frankly, a niche use case in my book...) 
+ <teythoon> maybe you're not the typical user + <antrik> I'm not. but the *typical* user definitely doesn't care about + multi-seat + <teythoon> if you say so + <teythoon> antrik: when you say it's dangerous what 'they' are doing, what + do you mean exactly ? + <teythoon> dangerous for whom ? + <antrik> asides from schools in developing countries, who try everything to + save on IT costs, I really can't think of many users for multi-seat... + <teythoon> (maybe schools all around the world trying to cut down their + costs?) + <teythoon> or like everyone, here, a $30 dongle that gives you an extra + workstation, how awesome is that ? + <antrik> teythoon: see above: they are killing the ability to combine + interchangable components, which has always been a core asset of the free + software ecosystem + <teythoon> antrik: so gnome is going for systemd, and gnome loses the + ability to be used w/o systemd + <teythoon> why do you care ? how does this affect the whole ecosystem ? + <teythoon> i really don't get why everyone is getting so upset about this + <antrik> teythoon: who cares about a dongle giving an extra workstation? + the remaining users of workstations are either corporate -- who prefer + dedicated boxes for organisational reasons -- or gamers, who want all the + power to themselves... + <braunr> teythoon: well gnome is kind of one of the major destkop software + in the free software world + <antrik> s/one of// + <teythoon> antrik: you stated that you havent used gnome3, yet you have an + opinion how tightly it should be coupled with systemd or linux + <teythoon> people who haven't used systemd or upstart have an opinion about + which one should be preferred + <braunr> teythoon: why do you think people shouldn't think about systems as + a whole ? + <antrik> teythoon: actually, I am using it (for some value of "use") -- + though in legacy mode, as my hardware can't run the new bling... 
+ <braunr> in that case, people shouldn't be allowed to vote, because that + would require them to be politicians .. + <teythoon> it's okay to think about that + <braunr> i don't think it is + <antrik> teythoon: but seriously, whether *I* have used it is quite beside + the point. I have no illusions about being a niche user + <braunr> people don't need to use something to actually understand it + <teythoon> but i cannot stand all the whining lately in the free software + world... + <braunr> whining isn't fair + <braunr> i mean, the word + <teythoon> y ? + <braunr> it's a big problem and complaining to force a debate is important + <teythoon> yes, but "they" are solving problems, and everyone is + complaining for one reason or the other + <braunr> they are also creating problems + <braunr> and not everyone is complaining + <teythoon> as opposed to offering alternatives + <braunr> that's a major issue, a lot of people are favorable to these + changes + <teythoon> and if you don't like what "they" are building, you are free not + to use it, no ? that's a freedom too ;) + <braunr> no + <braunr> you aren't + <teythoon> what ? + <braunr> that's precisely the point + <braunr> you'll be de facto forced to use it if you want to keep using the + rest + <teythoon> i'm free not to use gnome3 + <braunr> you won't be free from using linux if you want gnome3 + <teythoon> what kind of argument is that ? 
+ <braunr> i'm abusing the word freedom + <braunr> because it has no clear meaning in practice + <braunr> as antrik said, it's about interchangeability and portability + <braunr> and alternatives + <braunr> accepting the way systemd is designed is a major shift towards + making linux its own standard, away from the rest + <braunr> and the way it's done isn't thought to easily allow the + alternatives to keep up with the changes + <teythoon> we agreed the other day that they shouldn't create ad-hoc + interfaces like they do, yes + <braunr> well that's the whole point + <teythoon> you just talked "about the way systemd is designed" + <braunr> they could invest some more effort to make well designed + interfaces that allow changing both the dependencies and the services + provided + <teythoon> how is that related to bad interface design ? + <braunr> for me, it's almost a synonym + <braunr> and we discussed it + <teythoon> aren't tightness of coupling and quality of interfaces + completely orthogonal ? + <braunr> it is designed with a narrow set of apparently company directed + interested towards a single system, a single distribution even, and + nothing else + <braunr> no + <braunr> absolutely not, when it's about something that should be + interchangeable + <braunr> an interface that forces tight coupling is of low quality to me + <antrik> braunr: they claim it's not actually company-directed... and I + tend to believe them on *that* point TBH + <braunr> antrik: this would have been a valid reason at least + <antrik> teythoon: it's just not right that some people can no longer use + major pieces of free software just because a tiny but highly vocal cabal + decides to disrupt the whole ecosystem + <teythoon> what are you talking about ? 
you are free to use older versions + of the software + <braunr> i's not technically feasible + <braunr> or it would require forking to maintain + <braunr> again, it's the start of a rift + <teythoon> but, if the gnome people want to go into that direction, who are + you to say that they shouldn't ?? that's what i get the least about this + kind of argument... + <braunr> i'm part of the free software community + <braunr> more accurately, the free unix-like community + <teythoon> and you are actively developing gnome... ? + <braunr> if they want to get out of this community, they'll hurt it, and + themselves + <braunr> do you understand what a rift is ? + <teythoon> but that's their choice, no ? + <braunr> a major division ? + <braunr> so what ? + <braunr> it doesn't mean it's a good one + <teythoon> you pick the desktop environment you like next best and be done + with it ? + <braunr> it's almost public service at this point + <braunr> what if they all do the same thing ? + <teythoon> err + <teythoon> they don't + <braunr> you won't be free to do what you want because the technical + possibility will have disappeared + <braunr> kde might + <braunr> if only to compete with gnome + <teythoon> well, if you don't like hte direction a project is taking, you + fork it + <teythoon> that's what happened + <braunr> exactly .. + <teythoon> why the long faces ? + <braunr> forks increase complexity and reduce manpower + <braunr> fork == division + <braunr> forking in the free software community is normally a last resort + <teythoon> huh ? since when is this considered a bad thing ? + <braunr> it's not a bad thing per se + <braunr> it usually implies a bad situation + <teythoon> < braunr> fork == division + <teythoon> and division == rift + <braunr> think of these situations that were caused by stupid drama and + lead to the duplication of a lot of effort + <braunr> openbsd, eglibc, jenkins, to name a few + <teythoon> i don't + <teythoon> why would i ? 
i never created these forks + <braunr> it affects the community as a whole + <teythoon> but the people who did thought it was necessary + <braunr> the fact they could do it is good, the fact they had to do it + isn't + <braunr> they were usually forced by the situation + <braunr> and often by the stupidity of other people + <teythoon> someone forced someone else to fork a project ? with a gun or + something like this ? + <teythoon> i don't buy this ;) + <braunr> of course not .. + <braunr> eglibc was forced by the inability of drepper to accept a whole + class of patches + <braunr> openbsd because theo de raadt has some huge ego + <braunr> for jenkins, it was a licensing issue iirc + <braunr> nothing technical at all + <braunr> nothing in the interest of the community + <teythoon> err + <teythoon> it brings diversity + <braunr> no + <braunr> netbsd versus freebsd brings diversity + <teythoon> i thought that was a good thing + <braunr> openbsd was just agotistic crap + <braunr> ego* + <teythoon> if there is no diversity, why should stuff be interchangeable if + there are no alternatives? 
+ <braunr> and netbsd and freebsd aren't exactly forks, they're both bsd + based but had different goals from the start + <braunr> that's not what i'm talking about + <braunr> eglibc isn't exactly a new libc + <braunr> it's glibc+the stuff that should have gone into it + <antrik> teythoon: the stuff the systemd cabal does builds on the work of + thousands of projects and people; yet they act as if they don't owe anyone + anything, and it's fine to boot out large parts of the community whose + work they are building on + <braunr> iceweasel isn't a whole new firefox + <braunr> most often, alternatives aren't forks of one another + <braunr> if they are, they have diverged a lot + <teythoon> antrik: that is your interpretation, and i respectfully disagree + with it;) + <braunr> and usually have different goals + <braunr> that's diversity, and i'm very ok with it + <braunr> (being a hurd guy and all) + <braunr> but forking because of decisions that prevent alternatives is a + very bad reason to fork + <teythoon> again, who are you to tell a project (say gnome) what they + should do or not ? + <braunr> that question makes no sense + <braunr> we're trying to think objectively + <braunr> forget who we are + <braunr> think about what should be done + <teythoon> no such thing ;) + <braunr> ok well, in that case, i'm a very smart person who knows a lot of + things, and people had better do what i tell them ;p + <braunr> satisfied ? :) + <teythoon> yes + <teythoon> that's much better actually + <braunr> not really .. + <teythoon> it's more honest + <braunr> no it was sarcasm + <braunr> what was honest are the arguments i explained + <braunr> why care about who says them ? + <teythoon> i do + <antrik> teythoon: there is not much interpretation in there really. some + of their own statements are quite explicit... + <braunr> damn non scalable kernel .. + <teythoon> who is "their"? what statements ? 
+ <braunr> teythoon: when building glibc, there are so many nodes to fake + that ext2fs+fakeroot allocate enough ports to starve kernel memory ... + <teythoon> if i were mr. gnome3 and you would tell me that i should cuddle + with systemd b/c that's bad for one reason or another, the first thing + i'd like to know is who is telling me that + <braunr> teythoon: why not solely consider the argument ? + <teythoon> braunr: yes, i can imagine fakeroot doing that + <antrik> teythoon: Lennart and his friends. not sure how much of these + statements I have seen written down -- part of it I heard myself from + their own mouths + <teythoon> braunr: b/c maybe i like to develop my project in the direction + i want + <braunr> that's unrelated + <teythoon> and if anyone disagrees, she may fork + <braunr> this is a debate + <teythoon> why ? + <teythoon> so now we are debating what i may develop or not ? you lost me + ;) + <braunr> a way to reach consensus + <braunr> many people are discussing so that projects like debian and gnome3 + make the best decisions + <braunr> a naive way to explain it is that the result is the sum of what + everyone likes and how louds he speaks for it + <teythoon> sure but you are not a gnome developer, no ? + <braunr> no, but again, i'm a free software community member + <braunr> and this affects the whole community + <braunr> because gnome3 is a major software component used by a lot of + people + <braunr> well, gnome at least + <teythoon> so the gnome project needs to seek consensus with everyone of + the free software community ? + <braunr> no + <braunr> that would be unanimity + <teythoon> but wrt to the systemd integration ? 
+ <braunr> siding with systemd is starting to get away from the free software + community + <braunr> or, by bringing a lot of people along, dividing it + <teythoon> that's your interpretation + <braunr> yes + <braunr> always + <braunr> you don't have to say it, we're not doing raw science here + <braunr> it's implicit + <teythoon> i think it's important to point that out and make it explicit + <braunr> you made it several times + <braunr> we got the point + <braunr> what matters in the current discussion is whether you agree or not + and why + <braunr> and this will be your interpretation too + <braunr> and we'll see if it's convincing + <braunr> but, from experience, i expect noone will be convinced ;p + <teythoon> ^^ + <braunr> the issue is too tied with the core goals we have in mind + <teythoon> but why does it matter whether i agree or not + <teythoon> that's my point actually + <braunr> you seem to have a problem understanding the issue, i was trying + to convince you there is one + <braunr> so, if i want to achieve that, it matters + <teythoon> what core goals ? 
+ <braunr> basic dialectic + <braunr> well, for example, for me, i want people to think of the system as + a whole + <braunr> i want something effective, technically very good, and that + respects user freedoms + <braunr> i also want alternatives, i won't explain why, let's say it's + obvious + <teythoon> i agree + <braunr> well, systemd people don't think of the system as a whole + <braunr> here, what i call "system" is very large + <braunr> it would almost equal society + <braunr> i understand why they do that + <braunr> they have the right to do that + <braunr> but then i could say i understand why people make proprietary + software, and they also have the right to do it, i still won't approve it + <braunr> it contradicts my personal goals, my personal view of how things + should be + <teythoon> i completely agree + <teythoon> but then again, what you said now and the way you said it was + very different + <braunr> maybe, it's 3am, i'm sick and exhausted :) + <teythoon> more abstract + <braunr> when i give an opinion + <braunr> actually, when anyone gives an opinion + <braunr> i consider it implicit that it's their point of view alone + <braunr> they're not enforcing anything + <braunr> merely speaking out + <teythoon> people tend to overestimate the importance of their own opinion + <braunr> hm i wouldn't say so + <braunr> and that's probably why the "who" doesn't matter a lot to me + <braunr> it would matter if the person in question had real power + <braunr> and his opinion could have a strong influence + <braunr> in which case it wouldn't be overestimated + <braunr> i could say what i think to systemd people + <antrik> teythoon: quite frankly, I'm not sure what you are complaining + about. the systemd followers are trying to impose their opinions on + various projects. other people (including braunr and me, among many + others) are voicing counter-opinions. what's wrong with that? 
+ <braunr> but i'm pretty certain the weight they'll associate to what i tell + them will be very low :) + <braunr> antrik: he called it "annoying whining" + <braunr> i think it's the only problem + <antrik> braunr: I don't think the systemd people associate much weight to + *anything* others say... ;-) + <braunr> heh :) + <braunr> to make an historic analogy + <braunr> it seems to me they're repeating the same mistakes others made + during the unix wars + <teythoon> antrik: but when you say "the systemd followers are trying to + impose their opinion on various projects", don't you dismiss the + possibility that the gnome3 people just want to make external displays + hot-pluggable? + <braunr> of course they do + <braunr> don't you dismiss that proprietary software authors just want to + make money ? + <teythoon> no + <braunr> well, if that's the only thing you keep in mind to make your + opinion, you'll miss important points + <teythoon> that is an example of course + <braunr> they're sacrificing interchangeability and starting a possibly + major rift in the community for hot pluggable displays + <braunr> it may not be worth it + <teythoon> not supporting stuff like that might make the whole ecosystem + obsolete + <braunr> i'm not saying it shouldn't be done + <braunr> i'm saying it should be done without sacrificing other important + things + <braunr> it would just take a little more effort + <braunr> and even if it wasn't done + <teythoon> that's what i meant by "whining" + <teythoon> no offense + <braunr> what is the problem of it being "obsolete" ? + <teythoon> but talk is cheap, offering alternative solutions is hard + <braunr> isn't unix obsolete ? isn't xorg obsolete ? 
+ <braunr> hum no + <teythoon> no one did, so they implemented their nice features + <braunr> the point isn't to offer alternative solutions + <braunr> it's to make them possible + <braunr> or at least, not deny their technical feasibility because they + don't care + <braunr> teythoon: see, "interchangeability and starting a possibly major + rift" don't look to conflict with your personal goals + <braunr> that's the point where i think i can no longer do anything to + convince you + <braunr> so i'll head to bed :) + <teythoon> heh, me too :) + <braunr> honestly, i don't care a lot + <braunr> i mean + <braunr> it won't change much for me + <braunr> but again, my brain is wired to think of things as a whole + <braunr> on that note, good night :) + <teythoon> good night :) + <antrik> teythoon: again, IT'S NOT ABOUT DISPLAYS + <antrik> believe me, I do have some understanding how display hotplugging + works + <antrik> also, the problem is not that gnome3 supports logind. the problem + is that gnome3 works *only* with logind now AIUI + <antrik> there is yet another way to state the fundamental problem + <antrik> there is a kind of social contract among free software projects: + every maintainer takes a reasonable amount of extra effort to support use + cases beyond his own. in return, his use cases are supported by other + maintainers + <antrik> the systemd guys are breaking this contract, by explicitly + refusing, up front, to take *any* effort to accomodate other projects' + needs + + +## IRC, freenode, #hurd, 2014-01-28 + + <azeem_> teythoon: + https://plus.google.com/+LennartPoetteringTheOneAndOnly/posts/EgKwQV8te7s + <teythoon> azeem_: pffff :) + <braunr> heh + <teythoon> which reminds me + <teythoon> if we want to state our position wrt the default init system + debate we should probably do it right now + <braunr> yes + <teythoon> ml or collaborative editor ? 
+ <azeem_> well, tech-ctte chair called the vote only for the default init + system for the Linux-ports + <azeem_> the vote got shot down on technicalities, but that might stand + <azeem_> I think that is a good thing, cause it implies that not one init + system has to be adopted across all ports + <teythoon> we talked the other day that it might make sense just to state + our view and our needs + <azeem_> sure. + <azeem_> I think what's needed is (i) an init-system agnostic system to set + the enable/disable state of services (ii) possibly mandating a .ini-style + config file along the style of whatever init system gets chosen as + default for Linux, to be used by non-Linux init systems as inut + <azeem_> input* + <azeem_> just my 0.02 EUR + <teythoon> uh + <braunr> looks overkill + <teythoon> i was thinking more along the lines of 1) we have never used the + default debian init system and are cool with not using the default in the + future, 2) we intend to use sysvinit in the future, 3) to that end, we + ask the init script machinery to be left in place + <braunr> but then, people managed to write stuff like libvirt + <braunr> so who knows + <teythoon> 4) we will help maintaining it as part of our porter effort + <braunr> i agree with teythoon + <teythoon> 5) we look forward to using openrc as incremental improvement, + complementing our sysvinit boot solution + <braunr> yes that would be nice + <teythoon> i'll write a draft to debian-hurd, ok ? + <gnu_srs> openrc now has a dependency loop resolver, so parallel would + work:) + <teythoon> so is insserv, isn't it ? 
+ <gnu_srs> there were complaints on openrc + https://bugs.gentoo.org/show_bug.cgi?id=391945 in the tech-ctte + discussions, now fixed + <azeem_> gnu_srs: please accept the fact that openrc will not be picked by + the tech-ctte for the Linux ports + <gnu_srs> azeem_: I do, I'm referring to arguments during the discussion + (history) + <azeem_> sure, just checking + <ArneBab> teythoon: your post is being used to portray systemd cgroups + treatment as the right way… + <teythoon> ArneBab: so ? + <braunr> it probably is the right way + <braunr> that's not the problem + <ArneBab> do you want to clear that up? (do I remember correctly that you + did not like that way?) + <braunr> we don't like the cgroups interface + <teythoon> i will + <braunr> not the feature + <ArneBab> braunr: that’s what I meant + <teythoon> exactly + <braunr> the feature amounts to resource containers in the hurd critique + ... + <braunr> we do want that too :) + <braunr> anatoly: you want them to rewrite cgroups ? + <braunr> err + <braunr> ArneBab: ^ + +[[dbus_in_linux_kernel]]. + + <teythoon> i've been thinking + <teythoon> maybe the magic write stuff isn't that bad after all + <braunr> :) + <braunr> i was thinking the same thing actually + <teythoon> i mean, it's not the nicest thing, but it shows how flexible our + solution is + <braunr> the hurd is a lot about glue code already so why not + <teythoon> the problem is that there is no way to test cgroupfs + <teythoon> the main user is systemd, and it requires tons of other stuff + <braunr> right + <teythoon> any other user of cgroups is also probably using other + linux-interfaces too + + +## IRC, freenode, #hurd, 2014-01-29 + + <gnu_srs> About openrc having a dependency loop resolver: <teythoon>: so is + insserv, isn't it ? 
+ <gnu_srs> I found is_loop_detected() in insserv/listing.c but that one just + exits without telling where the loop is + + +## IRC, OFTC, #debian-hurd, 2014-01-29 + + * youpi trying the new sysvinit + <youpi> hopefully we'll then be able to at last use the proper ifup/ifdown + debian way for networking :) + <youpi> teythoon: why leaving hurd's runsystem by default rather than + sysvinit's? + <youpi> ah, another issue, too, now that /dev/vcs appears in /proc/mounts, + umountfs would umount it + <youpi> ideally umountfs would not umount passive translators + <youpi> we could blacklist /dev/vcs in umountfs, but the same issue would + happen for user-defined translators in their own home, for instance + + +## IRC, freenode, #hurd, 2014-01-30 + + <gnu_srs> booting with the new sysvinit and openrc versions: works:), but + only in recovery mode:-( Hangs before INIT: version 2.88 booting + <gnu_srs> after start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1] + exec init proc authtask c1120dc8 deallocating an invalid port 134517370, + most probably a bug. + <gnu_srs> related or an openrc problem? will test with sysv-rc + <youpi> I don't have such issue with sysv-rc + <gnu_srs> k! + <gnu_srs> shouldn't recovery mode mean starting in runlevel 1, I get + runlevel 2? + <youpi> it should + <pere> gnu_srs: recovery mode normally mean single user, which is between + rcS and rc2 + <gnu_srs> I get INIT: Entering runlevel: 2 + <pere> rcS.d should really have been named rcboot.d, as that is really what + it is. + <youpi> ah, right, recovery is not single + <youpi> (single as in init 1) + <pere> runlevel 1 is not single user either. it is more a gateway into + single user. see /etc/init.d/single to see what happen at the end of + runlevel 1. + <gnu_srs> init 1 and init 2 seems to work + <gnu_srs> well, the openrc dependency loop detector has found an init + script loop, maybe it has to be fixed? 
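A minimal sketch, not taken from insserv or openrc, of how a dependency loop detector could report where the loop is instead of only reporting that one exists. The `deps` mapping (script name to the list of scripts it depends on) is an assumed input format for illustration, and the service names used below are hypothetical.

```python
def find_cycle(deps):
    """Return one dependency loop as a list of names, or None if there is none.

    deps maps each script name to the names it depends on (assumed format).
    The returned list starts and ends with the same name, e.g. [a, b, c, a].
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    path = []  # scripts on the current depth-first search path

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for dep in deps.get(node, ()):
            dep_state = state.get(dep, UNSEEN)
            if dep_state == IN_PROGRESS:
                # dep is already on the current path: that slice is the loop.
                return path[path.index(dep):] + [dep]
            if dep_state == UNSEEN:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    # Sorted start order keeps the reported loop deterministic.
    for node in sorted(deps):
        if state.get(node, UNSEEN) == UNSEEN:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    # Hypothetical init scripts, not the real Debian GNU/Hurd dependencies.
    example = {
        "svc-a": ["svc-b"],
        "svc-b": ["svc-c"],
        "svc-c": ["svc-a"],
        "svc-d": [],
    }
    loop = find_cycle(example)
    print(" -> ".join(loop) if loop else "no loop")
```

For the hypothetical graph above the function returns the path from `svc-a` through `svc-b` and `svc-c` back to `svc-a`, which is the kind of information needed to decide which dependency to drop.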
+ <gnu_srs> disabling the hurd console solved the dependency loop problems, + thanks openrc;-) + <gnu_srs> (have to dig deeper to see where the loop is, and how to solve + it) + + +## IRC, freenode, #hurd, 2014-01-31 + + <gnu_srs> Hi, does the hurd console work with sysv-rc: In openrc I get with + #console -d vga -d pc_mouse --repeat=mouse -d pc_kbd --repeat=kbd -d + generic_speaker -c /dev/vcs + <gnu_srs> console: Console library initialization failed: Not a directory + <teythoon> gnu_srs: yes, it works with sysvrc + <teythoon> gnu_srs: check that /dev/vcs has the appropriate translator + record + <gnu_srs> showtrans /dev/vcs: empty; on another box: /hurd/console + <teythoon> yes, fix that and your console will be fine + <gnu_srs> settrans /dev/vcs /hurd/console? + <gnu_srs> or should it be active? + <teythoon> no, set a passive translator record so that this will be + persistent + <gnu_srs> something is wrong: when starting the hurd console screen is + blanked (and hangs) + <gnu_srs> can I get the hurd console when running with the serial console + (to see boot messages)? + <teythoon> gnu_srs: yes, you can + <gnu_srs> will try that image then, tks:) + <gnu_srs> teythoon: how to create all underlying directories? ls /dev/vcs: + 1 2 3 4 5 6 + <teythoon> don't, /hurd/console takes care of that + <gnu_srs> is settrans /dev/vcs /hurd/console correct? + <teythoon> yes + <sjbalaji> What are those underlying directories representing ? + <teythoon> the hurd console is a console multiplexer + <teythoon> bringing multiple virtual consoles to the hurd + <teythoon> # showtrans /dev/tty1 + <teythoon> /hurd/term /dev/tty1 hurdio /dev/vcs/1/console + <gnu_srs> aha: console -d vga -d pc_mouse --repeat=mouse -d pc_kbd + --repeat=kbd -d generic_speaker -c /dev/vcs + <gnu_srs> task c1120e70 deallocating an invalid port 1782, most probably a + bug. + <sjbalaji> teythoon: Is it that /dev/tty1 has multiple translators ? 
+ <teythoon> no + <teythoon> exactly one translator is bound to any given node in the vfs + <gnu_srs> something is strange with the hurd console: booting with it + enabled still runs the mach console, halting: + http://paste.debian.net/79438/ + <teythoon> what is strange about taht ? + <gnu_srs> when starting the hurd console: task c1120e70 deallocating an + invalid port 1782, most probably a bug. + <teythoon> so ? + <gnu_srs> and the paste when halting: twice + <teythoon> that is a known issue + <gnu_srs> with the hurd console? + <teythoon> how do you know it's the hurd console ? + <teythoon> that message comes from the kernel + <teythoon> currently, it is not possible to tell which process is + responsible + <teythoon> b/c the task is given as a pointer to the kernel task structure + <teythoon> not as a pid + <gnu_srs> I don't ,it is triggered by it at least + <teythoon> currently there is no way to map the former to the latter + <teythoon> why do you think it's a problem ? is something not working as + expected ? + <gnu_srs> maybe a reproducible way to hunt that bug! + <teythoon> we have one already + <teythoon> it happens every time the hurd boots + <gnu_srs> yes, hurd console does not start, even when enabled:-( + <teythoon> then please say so ;) + <gnu_srs> I did: (11:23:30) srs: something is strange with the hurd + console: booting with it enabled still runs the mach console, halting: + http://paste.debian.net/79438/ + <teythoon> where do you say that the hurd console did not start ? 
+ <gnu_srs> maybe it is easier to hunt the bug in an already booted system + <teythoon> you just said that the mach console is still active, wich it is + even if the hurd console starts + <teythoon> yes + <teythoon> please start the hurd console by hand + <teythoon> -d current_vcs -c /dev/vcs -d vga -d pc_kbd --keymap us + --repeat=kbd -d pc_mouse --protocol=ps/2 --repeat=mouse + <teythoon> err + <teythoon> /bin/console -d current_vcs -c /dev/vcs -d vga -d pc_kbd + --keymap us --repeat=kbd -d pc_mouse --protocol=ps/2 --repeat=mouse + <gnu_srs> when I log in I have the mach console not the hurd console + <teythoon> yes, log in as root, then run that command + <gnu_srs> I've done that: (11:10:27) srs: aha: console -d vga -d pc_mouse + --repeat=mouse -d pc_kbd --repeat=kbd -d generic_speaker -c /dev/vcs + <gnu_srs> please read? + <teythoon> and you discovered in that process that /dev/vcs lacked a + translator record + <teythoon> did you run it again after fixing that ? + <gnu_srs> the reply was: (11:10:27) srs: task c1120e70 deallocating an + invalid port 1782, most probably a bug. + <teythoon> well, if you are feeling that what i ask you to do is + unreasonable, i'm not sure how i can help you + <gnu_srs> yes, the translator was running! + <teythoon> you could hunt down the port deallocation bug, that'd be awesome + and most welcomed + <teythoon> but i don't believe it is causing your console malfunction + <gnu_srs> I did what you asked for?? + <gnu_srs> I'll do it again! + <gnu_srs> ok, now I don't get that error, but still no hurd console? the + process is running, logging out and then in, no hurd console. + <gnu_srs> not possible in serial console? 
+ <teythoon> no, the hurd console is displayed using the graphics card + <teythoon> you asked for that with -d vga ;) + <teythoon> not sure if there are any other display drivers + <teythoon> when you asked whether you can use the serial line, i assumed + you used both qemu's graphic terminal and a serial console + <teythoon> try kvm ... -serial telnet::1236,server,nowait, then use telnet + localhost 1236 to connect to the serial console + <teythoon> then, you can start the hurd console over the serial console and + see whether that worked + <gnu_srs> OK; that's what I asked before. I tried with the graphic one, + I'll try again + <gnu_srs> telnet output is empty + <gnu_srs> frozen + <teythoon> did you start a getty there ? + <gnu_srs> in hurd? + <teythoon> b/c if you dropped the console=com0 argument from your gnumach + command line, the mach console will be put on the vga screen, not on the + serial console + <gnu_srs> I dropped console=com0 from grub.cfg, yes + <teythoon> ok + <teythoon> so simply no one is talking to the serial port anymore + <teythoon> did you try to start the hurd console ? + <gnu_srs> I did before, can do it again + <gnu_srs> starting the HC blanks the screen, and freezes the vga output:-( + ssh still working + <teythoon> hm + <teythoon> try ps Ax | grep tty, are there any term servers running for + /dev/tty1..6 ? + <gnu_srs> plenty of them: http://paste.debian.net/79442/ + <teythoon> good, even gettys are there + <gnu_srs> and the console translator runs + <teythoon> hm + <gnu_srs> root 1224 5 7 months /hurd/console + <gnu_srs> root 1227 1226 7 months /bin/console -d vga -d pc_mouse + pc_mouse -d pc_kb... + <teythoon> yes, everything looks good + <teythoon> just to be sure, you are currently using qemu's graphical + frontend, right ? + <gnu_srs> yes + <teythoon> hm :/ + <teythoon> gnu_srs: do you see loginpr processes ? 
+ <gnu_srs> nope + <teythoon> hum + <teythoon> this strikes me as odd + <teythoon> on my system, i see no gettys but only loginpr processes + <teythoon> this is b/c the hurd getty does little other than to print some + text and run the login program + <teythoon> but on your system the getty sticks around + <teythoon> is /sbin/getty really the hurd getty? it's easily recognized by + its crappiness: + <teythoon> /sbin/getty --help || echo $? + <teythoon> 1 + <gnu_srs> 1 + <teythoon> hm + <teythoon> still funny though + <teythoon> you could try to run the hurd console, then run a getty manually + <teythoon> e.g. /sbin/getty 38400 tty1 + <gnu_srs> from the ssh login? + <teythoon> yes + <gnu_srs> then the graphic display is back showing the login prompt:P + <teythoon> weird + <teythoon> well, so most things work + <teythoon> that's a good thing + <teythoon> funny that hurd's getty should get stuck like this + <gnu_srs> and the terminal is hurd:-) + <teythoon> any chance you can produce a stack trace of one of your getty + processes ? + <gnu_srs> how? + <teythoon> gdb --pid=the_pid /sbin/getty + <teythoon> then, do bt like usual + <gnu_srs> so you mean tty2-6 are broken? + <teythoon> no + <teythoon> it's just for some reason your gettys do not behave nicely when + run from init + <gnu_srs> from running tty2: bt #0 0x01087b09 in ?? () + <gnu_srs> #1 0x00000000 in ?? () + <gnu_srs> not much + <teythoon> hm :/ + <teythoon> indeed + <teythoon> our getty logs to syslog, can you see anything of interest here + ? 
+ <gnu_srs> Jan 31 12:00:46 debian-openrc-20140123 rsyslogd-2066: could not + load module '/usr/lib/rsyslog/imklog.so', dlopen: + /usr/lib/rsyslog/imklog.so: undefined symbol: klogAfterRun + <gnu_srs> [try http://www.rsyslog.com/e/2066 ] + <gnu_srs> nothing tty releated + <teythoon> gnu_srs: oh, i just noticed, please look into auth.log, the + getty stuff ends up there + <gnu_srs> teythoon: http://paste.debian.net/79465/ + <teythoon> well, that is interesting :) + <gnu_srs> /dev/tty1 not a directory? + <teythoon> for instance, yes + <teythoon> it says bad syntax if it was invoked in the wrong way, i.e. not + with exactly two arguments + <teythoon> that might have been you yourself, right ? + <teythoon> with getty --help i mean + <teythoon> for the not a directory message, please verify that + <teythoon> # showtrans /dev//tty1 + <teythoon> /hurd/term /dev/tty1 hurdio /dev/vcs/1/console + <teythoon> and stat /dev/vcs/1/console says it's a character special file + <gnu_srs> I used exactly: /sbin/getty --help || echo $? + <teythoon> yes, that accounts for that bad syntax message + <gnu_srs> what so bad about that? + <gnu_srs> showtrans /dev//tty1 + <gnu_srs> /hurd/term /dev/tty1 hurdio /dev/vcs/1/console + <teythoon> getty is so simple minded that it doesn't really parse its + arguments + <gnu_srs> stat: http://paste.debian.net/79469/ + <teythoon> looks nice + <teythoon> everything looks nice, i'm at my wits end here + <gnu_srs> and everything works OK with sysv-rc? + <teythoon> yes + <teythoon> by the way, are you using the sysvinit init scripts or something + openrc related ? + <gnu_srs> openrc use all the scripts in /etc/init.d + <teythoon> actually, could you try to kill -HUP 1 ? 
+ <gnu_srs> BTW: the dependency loop detector has found many loops in those + scripts + <gnu_srs> kill -HUP 1: nothing happens + <teythoon> ok, try to kill one of those gettys and see if the one that + respawns works + <teythoon> then again, the getty should try to reopen the device every + minute until it succeeds + <gnu_srs> getty tty1 and tty2 disappeared? kill -HUP tty3 respawns + immediately + <gnu_srs> now no getty processes are left? + <gnu_srs> /dev//tty4: Not a directory etc? + <teythoon> sorry, i should have expressed myself more clearly + <teythoon> kill -HUP 1 sends a SIGHUP to sysvinit, this makes it reload + it's configuration + <teythoon> when i said kill some getty, i meant just kill some_pid + <teythoon> when you said 'kill -HUP tty3 respawns immediately', did you + mean you killed the getty that was listening on /dev/tty3, and then a new + one appeared and you got a login prompt at tty3 ? + <gnu_srs> a new pid appeared, the login prompt is on tty1 + <gnu_srs> this one? /hurd/term /dev/tty1 hurdio /dev/vcs/1/console + <teythoon> i'd like to invite you to look at daemons/getty.c + <gnu_srs> not a big piece of code: anything specific? + <teythoon> no, just look what it roughly does + <gnu_srs> not a directory is not coming from that code + <teythoon> correct + <gnu_srs> it execl-s login + <teythoon> yes + <teythoon> inevitably + <teythoon> but you do not observe this + <gnu_srs> how come when they are running? + <teythoon> this is the question that you will have to answer in order to + make any progress + <gnu_srs> I killed only one of them: kill -HUP 1031 and they all + disappeared + <teythoon> i thought along these lines: the most obvious way to stall getty + is if it never exits that loop + <teythoon> so i guessed it might be failing to open the device + <teythoon> we already observed that getty works fine if invoked by you + manually + <teythoon> the question thus is, what is different when getty is invoked by + init ? 
+ <teythoon> if a process started by init in this way is killed, init will + restart it + <teythoon> please note, that if anyone says kill that process, she means + send a signal that results in process termination + <teythoon> and while sighup causes processes to die if the signal is not + handled, it is not the ideal signal to kill processes + <teythoon> b/c some processes handle sighup + <teythoon> like sysvinit, which reloads its configuration + <teythoon> many daemons do this + <teythoon> see 'man 7 signal' for how signals affect processes + <gnu_srs> sorry, have to leave for now, bbl and thanks a LOT so far:) + <teythoon> ok :) + <teythoon> you are welcome :) + <gnu_srs> teythoon: I'm back but cannot spend to much time on this + tonight. Maybe you should try it yourself, do you want another image on + my box? + <teythoon> it'd be nice if you put your packages somewhere + <gnu_srs> there are no special packages sysvinit (-46) and openrc (-8) + <teythoon> surely openrc with some patches ? + <gnu_srs> from #openrc: (17:37:41) srs: start with sysvinit and make it + work first! + <gnu_srs> (17:28:43) srs: zigo: Then I copied that working image to + another, and changing hostname, and continued from there. + <gnu_srs> openrc with the hurd patches for /lib/rc/sh/init.sh (v8 should be + available from experimental by now) + <teythoon> sweet :) + <teythoon> gnu_srs: maybe it was just some weird issue with your system + <teythoon> i just switched to openrc and everything seems to just work + <teythoon> i'll redo what i just did more cleanly to get a clean test vm... + <gnu_srs> nice:) + <gnu_srs> teythoon: And you got the hurd console? + <teythoon> heh, i believe so >,< + <teythoon> i didn't see it b/c i was using --nographic + <teythoon> but ps Ax looked alright + <teythoon> hrm + <teythoon> gnu_srs: i can reproduce your trouble, umount still strips the + translator record from /dev/vcs + <teythoon> at system shutdown time + <gnu_srs> so that's the reason. 
Additionally I have to issue halt twice + from a ssh login, see http://paste.debian.net/79517/ + <teythoon> funny indeed + <teythoon> gnu_srs: i can reliably recover the hurd console by doing + <teythoon> settrans /dev/vcs /hurd/console && service hurd-console restart + && pkill getty ; sleep 5 ; pkill getty + <teythoon> humm, as you say, halt doesn't work + + +## IRC, OFTC, #debian-hurd, 2014-02-01 + + <pere> I've just uploaded a new new sysvinit package to experimental, with + all the latest hurd fixes. + + +## IRC, freenode, #hurd, 2014-02-01 + + <gnu_srs> 17:53:28< teythoon> settrans /dev/vcs /hurd/console && service + hurd-console restart && pkill getty ; sleep 5 ; pkill getty + <gnu_srs> teythoon: Any ideas on how to solve this? + <teythoon> gnu_srs: yes, i have that on my todo list + <gnu_srs> so it is not an openrc problem? + <teythoon> gnu_srs: no + + +## IRC, freenode, #hurd, 2014-02-01 + + <teythoon> start ext2fs: Hurd server bootstrap: ext2fs[gunzip:device:rd0] + exec init proc au + <teythoon> thtask with pid 6 deallocating an invalid port 134517370, most + probably a bug. + <teythoon> :) + <teythoon> pid 6 is exec o_O + <gnu_srs> teythoon: Nice to see that you added pid numbers for error + print-outs:) + <gnu_srs> so the boot error comes from the exec sever? + <teythoon> so it seems + <gnu_srs> server* + <gnu_srs> have you found where? + <teythoon> no + + +## IRC, OFTC, #debian-hurd, 2014-02-02 + + <pere> but when I install the new packages, and run update-alternatives + --config runsystem to select sysv, the boot fail with: start ext2fs: Hurd + server bootstrap: ext2fs[device:hd0s1] exec init proc authtask c1128dc8 + deallocationg and invalid port 134517370, most probably a bug. + <pere> was that the wrong approach? + <pere> is there some way to recover when hurd fail to boot with sysvinit? + <pere> I was able to boot in recovery mode. :) + <pere> and this time sysvinit booted. 
saw a segfault message just after + sysvinit started, no idea what caused it. + <pere> looks like it is startpar that segfaults. + <pere> looks like the invalid port message come every time, no matter if + the boot hang or not. + <pere> I was wrong. it isn't startpar segfaulting, it is something in + rcS.d/. + <pere> bootlogd is the process segfaulting at boot. + <pere> looks like the boot success rate is 30% or so. + <pere> reported bootlogd problem as <URL: http://bugs.debian.org/737375 >. + I really miss valgrind. :) + <teythoon> pere: yes, the invalid port message is from the exec server + <teythoon> pere: i see the hurd boot process hang sometimes, no matter if i + use sysvinit or not + <teythoon> i believe it's a race condition in the ext2fs, not sure though + <pere> teythoon: but did the frequency of the hang go up with sysvinit or + not? to me it seem like that. + <teythoon> pere: yes, i believe it got worse + <teythoon> what hangs is fsysopts --update / + <teythoon> runsystem.sysv does that quite early + <pere> able to debug it? + <pere> I like the fact that runsystem.sysv set up ip at boot time, while + with .gnu, I have to run dhclient /dev/eth0 manually + <pere> it is quite confusing that hurd got two init processes with + sysvinit. one as pid 1, and another that seem to be the parent of all + internal stuff. perhaps the latter could be renamed to hurd-system or + something like that? + <pere> "sleep 0.2 # Work around a race condition (probably in the root + translator)." do not look too good... + <pere> (I increased from 0.1 to see if it help me. :) + <teythoon> did it ? + <teythoon> i plan to rename /hurd/init to /hurd/startup + +[[hurd_init]]. + + <pere> nope. :) + <pere> five boots in a row hung. :( + <pere> still no go... + <teythoon> are you using a vm or real hardware ? + <pere> vm + <pere> kvm, via virt-manager, to be exact. + <teythoon> me too + <pere> on the sixt boot, after waiting a long time between try 5 and 6 + (gave up a bit), it booted. 
+ <pere> sleep 1 did not help either. + <teythoon> :( + <teythoon> well, it's not *that* bad for me + <teythoon> in fact recently it has been a lot better + <teythoon> you might try my packages + <teythoon> pere: here http://darnassus.sceen.net/~teythoon/hurd-ci/ + <pere> teythoon: tested it, and it seem to solve the problem. + <pere> is also rid of the strange error at the start. + <pere> teythoon: your packages even work without the sleep 0.1, at least + some of the time. :) + <pere> hm, but the success rate without sleep 0.1 is very low. I was able + to boot once, and never again. :( + <teythoon> pere: yes, i fixed the spurious port allocation today :) + <teythoon> pere: nice to hear that the sleep 0.1 i put in does increase + your chance to boot as well + + +## IRC, freenode, #hurd, 2014-02-02 + + <teythoon> gnu_srs: i found the spurious port deallocation :) + <gnu_srs> Cangrats:-D + <teythoon> trouble is, i introduced it >,< + <gnu_srs> Congrats* + <gnu_srs> Ah, you did? + <teythoon> gnu_srs: yes, in debian/patches/exec_filename_fix.patch + <teythoon> + http://darnassus.sceen.net/gitweb/teythoon/packaging/hurd.git/commitdiff/6da3e0be8fde0594bd84a13536d9d93048186790 + * teythoon . o O (diffs of diffs are trippy :) + + +### IRC, freenode, #hurd, 2014-02-03 + + <braunr> teythoon: oh nice, you found that bug :) + <teythoon> braunr: yes, once i knew where to look it was easy to fix ;) + + +### IRC, freenode, #hurd, 2014-02-05 + + <teythoon> i wonder why the port deallocation bug made the system hang when + the libc was compiled with the newer gcc + <braunr> teythoon: so it was indeed the problem ? + <teythoon> braunr: youpi said so, yes + <braunr> oh right + +[[glibc/debian/experimental]], *glibc 2.18 vs. GCC 4.8*? 
+ + +## IRC, OFTC, #debian-hurd, 2014-02-03 + + <pere> + http://people.skolelinux.org/pere/blog/Testing_sysvinit_from_experimental_in_Debian_Hurd.html + <teythoon> :) + <teythoon> pere: sounds like your hurd-console isn't running and there is + no getty on the mach console + <teythoon> pere: you could add sth like 8:2345:respawn:/sbin/getty 38400 + console to your inittab + <pere> I'd rather wait until the hurd porters get it right in the debs. :) + <pere> I suspect upgrading the downloadable image to use the latest + packages also would help a lot. + <pere> with upgraded packages, /proc is working and pstree, pkill, top, etc + is working out of the box. :) + + +## IRC, OFTC, #debian-hurd, 2014-02-04 + + <pere> I just uploaded sysvinit with hurd support to unstable. :) + + +## IRC, freenode, #hurd, 2014-02-04 + + <gnu_srs> teythoon: Hi, the segfault during boot is coming from bootlogd, + see bug #737375 + <gnu_srs> also the output on the console is from there: end_request: I/O + error, dev 02:00, sector 0 + <teythoon> gnu_srs: interesting :) + <teythoon> gnu_srs: i believe the end_request message comes from gnumach + <youpi> yes, that's just a floppy disk access attempt + <gnu_srs> might be so yes + <youpi> it's not a "might", it's sure :) + <youpi> dev 02:00 is the flopy + <gnu_srs> k! + + +## [[glibc_IOCTLs]], `TIOCCONS` + + +## IRC, OFTC, #debian-hurd, 2014-02-04 + + <zigo> Each time I upgrade my hurd box, I cannot login into it ... + <zigo> No login prompt. + <zigo> WTF is going on? + <zigo> How to fix? + <teythoon> zigo: most likely your hurd console is not running and there is no getty started for the mach console + <zigo> teythoon: How to fix? (note: I already have the partition mounted in a loopback) + <zigo> Or maybe go in recovery mode? + <teythoon> depends + <teythoon> do you use sysvinit ? + <teythoon> do you use the hurd packages from hurd-ci ? + + +## IRC, OFTC, #debian-hurd, 2014-02-05 + + <zigo> teythoon: Sorry, didn't see your reply. 
I just used the Hurd image, + untar it, and apt-get update / dist-upgrade. That's it, nothing more or + less. + <zigo> teythoon: I obviously would like to install sysvinit, and later + OpenRC. That's the reason why I'm running Hurd: to make sure OpenRC works + with it without issues. + <zigo> teythoon: It seems it "sometimes work" or what??? + <zigo> I was able to repair it using the recovery mode, it seems. + <zigo> grrr... + <zigo> I got this issue again, again and again ... + <zigo> Sometimes, got the tty1, sometimes, it doesn't appear. + <zigo> That's REALLY frustrating. + <pere> zigo: and yes, the success rate for boot is not 100%. it increases + a bit by using the packages teythoon created at hurd-ci. + <pere> apparently some race condition somewhere. + <zigo> pere: So, I should just try and reboot again and again ? + <zigo> pere: Is it improving after switching to sysvinit? + <pere> once I had to boot six times before I got it running... + <pere> I was told that the race involves a call to fsysopts, and that the + success rate with sysvinit was smaller because fsysopts command was + called earlier. I can not confirm nor deny this. + <pere> with the latest packages from hurd-ci the success rate is almost + 100% again. + <zigo> pere: Where do get that? + <pere> zigo: see <URL: + http://people.skolelinux.org/pere/blog/Testing_sysvinit_from_experimental_in_Debian_Hurd.html + > + <zigo> pere: What's the "update-alternatives --config runsystem" for? + <pere> to switch to sysvinit + <zigo> Right, that's what I was missing then! :) + <pere> the new sysvinit version in unstable was built for hurd one and a + half hour ago. so soon hurd users can skip experimental for that. + <zigo> pere: I've just succeeded in booting with OpenRC! :) + <zigo> Though this console pb is REAAAALLLYYYY getting on my nerves! :) + <zigo> Also, any idea why we don't get the nice colorfull output when + booting? 
+ <zigo> When booting with OpenRC, I've noticed that the dependency loop + detects some loops with the hurd-console thing. + <teythoon> zigo: good to hear that you got it working + <teythoon> the console problem is the following + <teythoon> when you shutdown using sysvinit, the system will run umount -a + <teythoon> it will then mistake some translators (like the one on /dev/vcs) + for file systems and remove their passive translator records + <teythoon> you can fix this by running '/usr/lib/hurd/setup-translators -k + -p' + <teythoon> you can avoid it for the time being by using reboot-hurd or + halt-hurd + <pere> teythoon: btw, how often is the hurd boot image available for + download updated? + <teythoon> not very often + <zigo> teythoon: Can I run '/usr/lib/hurd/setup-translators -k -p' + mounting my hurd image in a chroot? + <zigo> Hum... + <zigo> Probably better to do that in the recovery mode, no? :) + <youpi> dpkg-reconfigure hurd + <youpi> would be easier to type :) + <youpi> but we really need to fix that /dev/vcs unmounting + <pere> missing working getty and missing symlink from /run/mtab to + /proc/mount are the most serious problems I still see. + <zigo> The recovery mode doesn't work with OpenRC ! :( + <zigo> (it does in kFreeBSD and Linux, not with hurd ...) + <zigo> What happens is that it continues to runlevel 2. + <zigo> How can I fix then? + <youpi> pere: missing working getty? + <youpi> I don't see what issue you are referring to + <youpi> about the missing symlink, I'm wondering what is supposed to add it + <youpi> zigo: I don't know if anybody investigated it yet + <pere> youpi: yes, after boot there is no login prompt. + * pere have no idea, suspect a script in initscripts. + <zigo> youpi: I'm reffering to the fact that I have no login prompt after + boot, and that I don't know how to fix, since I don't have a recovery + mode to my disposal anymore. + <youpi> pere: but is the console started? 
+ <youpi> (I mean the hurd console) + <zigo> pere: I suspect a wrong dependency, which OpenRC by the way, prints. + <youpi> pere: otherwise, unless you have a /dev/console getty in + /etc/inittab, it's expected you don't have a prompt + <youpi> zigo: add + <youpi> c:23:respawn:/sbin/getty 38400 console + <youpi> to your /etc/inittab + <teythoon> youpi: yes, we need to get that fixed + <youpi> grrrr + * youpi wanted to change the image file on people.d.o + <youpi> but I can't do that without downloading it on my laptop, to be able + to modify it + <youpi> I would have been, if people was a hurd system :) + <teythoon> the proper way to fix this is to implement the get_source stuff + and get rid of the heuristic in mtab.c + <pere> youpi: nope, no console process running. + <youpi> then that's why, /dev/vcs got unmounted + <pere> I already have a console getty in inittab. got it from the last + sysvinit package + * youpi should have brown-bag-fixed these bugs before this week-end + actually :) + <youpi> pere: but you don't get a getty prompt on the mach console? I don't + understand why + <youpi> it does work for me + <teythoon> brown-bag-fixed ? + <zigo> youpi: Adding that in /etc/inittab didn't fix anything. + <youpi> yes, ugly hacks uploaded to debian-ports + <youpi> zigo: even with rebooting? + <youpi> could you snapshot your screen so we can make sure what you are + actually getting? + <zigo> youpi: I did it mounting my partition in a loopback... + <zigo> Then booted up, and still couldn't see the console prompt. + <youpi> ok, but please take a snapshot, so we are sure what is actually + happening + <youpi> whether the console starts, etc. + <pere> that info passed out of the screen and is not shown after my boot, + at least. + <youpi> which info? 
+ <youpi> again, please take a snapshot of the screen + <youpi> otherwise we are just guessing, and that's never good for debugging + <zigo> Maybe you'll find this interesting: http://paste.debian.net/80246/ + <zigo> This is the output of OpenRC booting and detecting dependency loops + in the LSB header scripts. + <pere> youpi: the info about the console being started or not. I'll show + you, give me a minute. + <youpi> zigo: well, that shouldn't be more problems than the dependency + loop already existing between rc.local and rmnologin + <pere> youpi: any loop is a fatal problem. + <youpi> how come the rc.local vs rmnologin is not a problem ? + <zigo> With sysv-rc in Debian, there's all sorts of loops that are just + silent. + <pere> I have not seen that loop on my linux system, so I am unsure what + you talk about. + <youpi> (the actual issues is simply that all three use Required-start: + $all, and thus all depend on each other) + <zigo> That's a huge pb IMO. + <youpi> pere: well, + <pere> zigo: show me one? + <youpi> rc.local:# Required-Start: $all + <youpi> rmnologin:# Required-Start: $remote_fs $all + <zigo> Yeah, the $all is just *bad*. + <pere> that is no loop. + <zigo> I do believe we should implement a lintian warning about it. + <pere> sure, $all do not behave the way most people expect, and should be + avoided as much as possible. + <pere> any other loops? + <youpi> no + <youpi> (not that I know of) + <pere> youpi: sending you the screenshot via irc. + <youpi> uh, long time no use dcc send, I don't even know where it sent it + to :o) + <pere> ok. aborting and trying another approach. + <pere> http://www.picpaste.com/booted-herd.png + <youpi> ok, so boot didn't actually finish + <youpi> that's why you don't get gettys or hurd-console (which is last) + <youpi> there must be some init script hanging in the meanwhile + <pere> logging in via ssh show no running startpar process, so I doubt that + is the case. 
+ <pere> syslog contain this: Feb 5 10:10:27 hurdtest console[808]: Console + library initialization failed: Not a directory + <youpi> that is due to /dev/vcs not mounted + <youpi> but that should have not prevented the boot from completing... + <pere> the boot is completed, as far as I can tell. + <youpi> you can disable the hurd console in /etc/defaults/hurd-console + <youpi> do you have gettys running? + <pere> no such file. + <youpi> oops, -s + <pere> http://paste.debian.net/80251/ + <teythoon> pere: check your /etc/inittab, is there a getty for the mach + console ? + <youpi> he said yes earlier + <teythoon> oh ok + <teythoon> i wonder why it doesn't show up then + <youpi> same for me + <teythoon> if the getty cannot open the device, it will loop + <pere> ah, I was wrong. the inittab is not the one I thought. the current + one is after a reinstall, while I checked the content before that. + <teythoon> pere: check /var/log/auth.log + <pere> there is indeed no console entry in /etc/inittab. I thought it + would be copied into place during upgrades? + <teythoon> not if it exists + <teythoon> iirc + <youpi> indeed + <pere> ah, great. "cp /usr/share/sysvinit/inittab /etc/inittab" and a + reboot fixed it. :) + <youpi> phew :) + <pere> it really should try harder to update the inittab on hurd to a + working one. + <teythoon> didn't i do something like this to fix the getty path ? + <pere> yes. that was the code I expected to solve this. + <teythoon> it didn't work ? + <pere> well, I had the wrong inittab file... + <pere> btw, do hurd have the needed syscalls for bootlogd to work? + <teythoon> i haven't looked at bootlogd yet + <pere> would be nice to have a text dump of the boot when trying to figure + out what went wrong. + <teythoon> yes, that'd be nice + + <youpi> pere: could you blacklist /dev/vcs in umountfs, just like already + done for /proc|/dev|/.dev etc. ? 
+ <youpi> so at least that case, which is really problematic, gets fixed now, + and not have to wait for another, more hurdish solution + <pere> youpi: just send patches to bts, and I'll pick it up from there. + <teythoon> nice. i'll work on the proper solution. bbl + <rleigh> teythoon: Can we add those translators to the exclusion lists in + umount[nfs]? + <rleigh> Sorry, I just noticed youpi's comment. I'm a bit behind. + <heroxbd> rleigh: good to see you! are you back to the keyboard? fully + recovered? + <rleigh> Not quite fully, but on the mend, thanks! + <heroxbd> :] + <pere> rleigh: yeah, good to see you again. I got a burst of energy and + brushed a bit on sysvinit in your absence. :) Even revitalized the + #pkg-sysvinit channel. :) + <rleigh> pere: Yes, I saw all the commit emails flying by! + <rleigh> I realistically won't be doing much for several weeks at least + though, I'm afraid. + <pere> no worries. spend your time getting well. :) it would be great to + have you on #pkg-sysvinit, though. :) + <rleigh> I'll join, no worries. I should add it to my irssi config so I + can't forget! + <heroxbd> teythoon: serial console always works, right? no matter how + hurd-console behaves. + <teythoon> heroxbd: yes + <teythoon> but you need a getty on it + <youpi> well, just like on linux :) + <teythoon> yes + <teythoon> almost + <teythoon> on mach, we have the mach console. by default that is put on the + vga screen, but you can make mach put it on a serial port using the + gnumach command line flag console=comX + <youpi> well, just like on linux :) + <heroxbd> understood, thanks! + <teythoon> oh, i didn't realize linux has this as well + <heroxbd> teythoon: you'll use it a lot on a embedded system + <heroxbd> an* + <teythoon> ok + + <gg0> plus, seems it can't cleanly umount /, at boot it fsck's it, fixes it + and auto-reboot + <youpi> it's odd that / doesn't get unmounted, don't you get a message at + "notifying ext2fs device:hd0s1 of shutown" ? 
+ <gg0> on console last 3 lines on halt are + <gg0> Deactivating swap...swapoff: /dev/hd0s5: 4193208k swap space + <gg0> done. + <gg0> Unmounting local filesystems...done. + <gg0> INIT: no more processes left in this runlevel + <youpi> is this on reboot or on halt? + <gg0> halt + <youpi> then you should also be getting the "notifying" messages, as well + as "In tight loop: hit ctl-alt-del to reboot" message + <gg0> it umounts uncleanly on reboot too + <youpi> if you don't wait for these, there's little wonder it's not + properly unmounted + <gg0> i waited many seconds, time to rewrite 3 lines above for you for + instance (not a fast typist) + <gg0> on reboot it's harder but iirc they don't appear as well + * gg0 rebooting again + <gg0> need to wait it finishes fsck'ing + <gg0> (i should resoldering my serial cable to get back to lazily c&p) + <gg0> -ing + <gg0> many Give root password messages then + <gg0> Give root password for maintenance + <gg0> (or type Control-d to continue): + <gg0> INIT: Id "z6" respawning too fast: disabled for 5 minutes + <gg0> INIT: no more processes left in this runlevel + <gg0> i'll wait 5 mins to see what happen + <gg0> ok another dozen of Give root password and same couple of INIT above + <gg0> no, just the first INIT + <youpi> so z6 doesn't work + <youpi> i.e. /sbin/sulogin (see /etc/inittab) + <youpi> check out why that is + +[[hurd/translator/mtab/discussion]], *IRC, freenode, #hurd, 2013-06-25*, +*coreutils' `df`*. + + <youpi> [...] depends on coreutils actually building + <youpi> which depends on putting back a login package from the shadow + source package + <pere> are someone on that task? 
+ <youpi> no idea + <youpi> IIRC I've mentioned the issue on the lists like months ago + <youpi> but probably nobody took the tas + <youpi> k + <youpi> basically it means fixing any bug that login or su from the login + package would have + <youpi> and then properly handle the migration from hurd-provided versions + to login-provided versions + <youpi> and then we would be able to build coreutils + <pere> which BTS report is this? + <youpi> I don't know if any report has been written about it + <youpi> perhaps simplest would be to build the login package, but not its + bin/login + <youpi> it seems hurd's getty uses special options of hurd'slogin + <youpi> that's probably the easiest way to go + + <gg0> sulogin seems to work fine but it shouldn't even called: + <gg0> # Normally not reached, but fallthrough in case of emergency. + <gg0> z6:6:respawn:/sbin/sulogin + <gg0> +be + <pere> I suspect a good fix is to provide a new init.d script in the hurd + package adding the symlink for hurd. + + <gg0> umountfs gets stuck at "Will now umount local filesystem:settrans + -apgf /lib/rc/init.d" + + +## IRC, freenode, #hurd, 2014-02-05 + + <gnu_srs> teythoon: Any ideas why I have to issue halt/reboot twice to make + the command succeed (from ssh login) + <gnu_srs> Is it the same issue with sysv-rc? + <teythoon> no + <gnu_srs> BTW: The segfault when booting came from bootlogd (wrong + parameters, Linux/~Linux), removing that one fixed it;-) + + +## IRC, freenode, #hurd, 2014-02-06 + + <youpi> teythoon: we really need to find the boot issue for which you added + a sleep 0.1 in runsystem.sysv + <youpi> apparently I had to move it above the mach-defpager startup, to get + a system that boots most of the time... + + <azeem> did somebody look at + http://homepage.ntlworld.com/jonathan.deboynepollard/Softwares/nosh.html + ? 
+ <braunr> azeem: interesting + <azeem> braunr: was mentioned here: http://lwn.net/Articles/584428/ + <azeem> " Systemd won't work for them, that's for sure, but nosh as a + systemd unit file compatible alternative could. " + <braunr> "I'm also very interested in seeing a discussion where the Debian + Hurd and BSD porters weigh in for themselves" + + +## IRC, OFTC, #debian-hurd, 2014-02-06 + + <gg0> on halt/reboot it can't remount readonly root because it's busy, what + makes it busy? + <gg0> by keeping /lib/rc/init.d mounted (like /dev/vcs) it shuts down + properly + <youpi> I don't know about such directory + <gg0> so seems that failed readonly remount is not a real problem because + at the end it runs halt-hurd/reboot-hurd which umount root properly + <youpi> yes + <gg0> afaiu it's a tmpfs where openrc copies "itself", kind of work + directory + <gg0> by removing it, it can't continue working + <gg0> at boot some messages are about its creation/population + <pere> why do init.d/hurd-console depend on $all? In most cases, depending + on $all is not giving you want you expect. + <youpi> because we prefer to start the console (and thus clear all the + screen) only after the boot has finished + <youpi> otherwise the console output will be messed up by the end of the + boot messages + <teythoon> youpi: there has to be a better way + <teythoon> b/c the way it is now, if one spawns a getty on the mach + console, it will mess up the hurd console as well + <youpi> well, we do want mach messages printed even with the hurd console, + at least + <teythoon> i once thought that instead of printing them the kernel could + send messages to a registered userspace daemon that could e.g. send them + to syslog + <youpi> that requires syslog to be working at all + <pere> changing $all to $local_fs seem to work fine here. 
+ <youpi> when the kernel cries out, we'd better always be able to hear it :) + <youpi> pere: but then you have the bootup messages in the middle of the + console, don't you? + <pere> not as far as I can tell. look just the same as before. + <youpi> well, on my box it seems that it gets to start after other daemons, + by luck + <youpi> ah, perhaps getty actually clears the tty? + <youpi> then that would be ok + <teythoon> youpi: i don't think it does + <youpi> well, somehow something clears the output at least + <teythoon> i thought he hurd console does this + <youpi> it does on startup, yes + <youpi> but if it starts before other daemons + <youpi> the damons startup output gets over it + <youpi> one sees the console clear the screen, then get daemon startup + messages, and then the screen gets cleared again before the login prompt + appears + <teythoon> interesting, i haven't seen this happening + <youpi> it seems like it happens when emitting text on /dev/tty1, the + console will then clear the screen to make the way for the new output + <youpi> and since that happens on getty startup, it happens to be after all + daemon startup + <youpi> yes, that's what happens + <youpi> so considering this, I'm fine with starting the console earlier + <youpi> getting a display glitch seems to have been acceptable on Linux for + years :) + <youpi> (during boot, I mean) + <teythoon> ok + + <gg0> anyone else tried openrc? + <gg0> 15:20 < pere> yes, it did not umount properly. + <gg0> 15:36 < gg0> reboot or halt? it takes few seconds to actually + reboot/halt since the last message from openrc + <gg0> 15:39 < gg0> any typo adding such path? + * gg0 likes cross-channel pasting + <gg0> anyone else keeps getting unclean umounts even after applying + http://paste.debian.net/plain/80386/ ? + <teythoon> gg0: yes, me. worked fine, it didn't shut down properly though + <gg0> here works like a charm + <gg0> what do you mean by properly? 
+ <gg0> i see first it can't remount root readonly but at least by not umount + path in question it continues executing scripts till actually shut it + down with something like {halt,reboot}-hurd + <gg0> *not umounting + <gg0> *shutting + <teythoon> for me it did not shut down + <gg0> you mean don't you get classic press ctrl+alt+canc to reboot message? + <teythoon> yes + <teythoon> from my perspective (and from /hurd/init's), that's not shutting + down + <teythoon> as in it did not call reboot(2) + <gg0> what are configuration not to miss besides switching runsystem to + sysv one? + <gg0> *configuration steps + <teythoon> no idea, i did nothing else but to switch to runsystem.sysv and + to install openrc thus replacing sysv-rc + <gg0> can you paste shutdown messages somewhere? + <teythoon> sure + <gg0> .o(world is failing, /me can't debug teythoon :)) + <teythoon> http://paste.debian.net/hidden/745071e6/ + <gg0> in my case i just found out that /etc/init.d/umountfs tries to umount + /lib/rc/init.d where openrc scripts are + <gg0> what if you set VERBOSE and print REG_MTPTS? something like + http://paste.debian.net/plain/80570/ + <gg0> there i got "settrans -apfg /lib/rc/init.d" which vanished with first + patch + <teythoon> http://paste.debian.net/80573/ + <gg0> ok and if you apply first patch http://paste.debian.net/plain/80386/ + <gg0> i.e. adding |/lib/rc/init.d to mount point to ignore + <teythoon> didn't help + <gg0> well output should change though + <teythoon> it does + <teythoon> but it still does not shut down + <gg0> paste please then + <teythoon> http://paste.debian.net/80576/ + <teythoon> what did you expect ? + <gg0> did you unapply VERBOSE & print REG_MTPTS? 
+ <teythoon> yes + <teythoon> no + <teythoon> well + <gg0> seems you do, if VERBOSE is set, it prints Will now unmount local + filesystems" + <teythoon> i restored a vm snapshot, and applied both patches + <gg0> instead of "Unmounting local filesystems" + <gg0> *seems you did + <teythoon> http://paste.debian.net/80577/ + <teythoon> shall i do it again ? + <gg0> and what after "root@debian:/# halt" ? :p + <teythoon> 23:55 < teythoon> http://paste.debian.net/80576/ + <teythoon> and openrc shouting lots of stuff about breaking dependencies + <gg0> please yes do it again + <gg0> if VERBOSE is set, it prints "Will now unmount local filesystems" + instead of "Unmounting local filesystems" + <teythoon> yes, you are right + <teythoon> still, it does not work + <teythoon> http://paste.debian.net/80579/ + <gg0> i'm curious about the new REG_MTPTS, supposing /lib/rc/init.d has + been suppressed + <gg0> ok stop + <gg0> 23:47 < gg0> ok and if you apply first patch + http://paste.debian.net/plain/80386/ + <teythoon> i did + <teythoon> well, i added that path + <gg0> i don't believe so, it should ignore it if added + <teythoon> did it fix the issue for you ? + <gg0> yes + <gg0> any typo in addition? + <gg0> obviously patch is against sysvinit source but you have to apply it + to /etc/init.d/umountfs + <teythoon> obviously + <gg0> isn't it time to tell me you are kidding me yet? + <youpi> pere: thanks for the upload. I happened to realized that since it + was in collab-maint, I could as well just commit changes, I hope it's ok? + <teythoon> gg0: root@debian:~# fgrep '/lib/rc/init.d' /etc/init.d/umountfs + /|/proc|/dev|/.dev|/dev/pts|/dev/shm|/dev/.static/dev|/proc/*|/sys|/sys/*|/run|/run/*|/lib/rc/init.d) + <gg0> /dev/vcs is missing, not the latest sysvinit version + <gg0> could this affect shutdown? + <teythoon> i know + <teythoon> possibly + <gg0> what if you also add /dev/vcs to path list? + <teythoon> what then ? 
+ <teythoon> i don't mind /dev/vcs being + <teythoon> err, 'umounted' + <teythoon> i can handle that just fine + <gg0> i mean what happens if you add /dev/vcs to path list in + /etc/init.d/umountfs as you did with /lib/rc/init.d? + <gg0> what happens = how it shutdown + <teythoon> why would it be any different ? + <gg0> no idea, seems the only change you don't have + <gg0> i just know it fixes hurd console + <teythoon> i know it fixes the hurd console b/c i was the one who broke the + hurd console in the first place ... + <gg0> quite sure there's something wrong on your side + <gg0> if it's actually among those path to ignore, it can't be added to + REG_MTPTS + <gg0> my /proc/mounts http://paste.debian.net/plain/80583 + <gg0> yours? + <gg0> i hope i'm not forgetting one change i did around + <gg0> teythoon: /proc/mounts ? + + +## IRC, OFTC, #debian-hurd, 2014-02-07 + + <gg0> teythoon: sorry for pasting reversed patches + <gg0> please apply http://paste.debian.net/plain/80587, halt and paste + output + /proc/mounts + <pere> youpi: just fine. but please join us on #pkg-sysvinit and make sure + to follow the mailing lists. + <teythoon> gg0: no, sorry, i was perfectly able to use -R on your patches, + as demonstrated by the paste i send + <teythoon> i think i'll rather just wait for the next sysvinit package and + try it again + <gg0> teythoon: i don't doubt you are able, i'm sorry because i messed up + things + <gg0> /lib/rc/init.d should not go in $REG_MTPTS + <gg0> sysvinit 2.88dsf-48 just add /dev/vcs to not-to-umount paths and make + boot consider -s for single user, nothing about umounting filesystems on + halt/reboot + <pere> the /lib/rc/init.d/ change to umountfs seem to be the wrong one, as + it do not solve the problem for me. because of this, I have not applied + it to git. + <gg0> pere: could you try to apply http://paste.debian.net/plain/80587, + halt and paste output? 
+ <gg0> well it applies to teythoon who doesn't have /dev/vcs + <gg0> */dev/vcs change + <gg0> pere: this one applies to -48 + installed. http://paste.debian.net/plain/80615/ + <gg0> given /lib/rc/init.d is added to not-to-umount paths it can't go in + REG_MTPTS + <pere> http://picpaste.com/halt-hurd-DVEVoHnr.png + <gg0> pere: you didn't apply it + <gg0> no messages from umountfs + <gg0> which is even more weird + <pere> well, patch claimed it did. + <gg0> normally it says "Unmounting local filesystems..." + <pere> checked the file, patch is applied. + <gg0> ok i think i got it + <gg0> patch is good. it just requires booting twice _and_ removing + non-patched /etc/init.d/umountfs.* if any + <gg0> patch = adding /lib/rc/init.d + <gg0> so + <pere> which files do you need to remove? + <gg0> /etc/init.d/umountfs.* and /lib/rc/init.d/started/umountfs.* + <gg0> do you have any? + <gg0> you should just have patched umountfs under both /etc/init.d/ and + /lib/rc/init.d/started/ + <gg0> the latter is populate at boot, that's why i said twice to become + effective + <gg0> *populated + <gg0> but propably /lib/rc/init.d/started/umountfs can be fixed on the fly + <gg0> from start: + <pere> why do you need to remove these files? + <gg0> 1/ patch /etc/init.d/umountfs by adding /lib/rc/init.d to + not-to-umount path list + <pere> why are these files not ignored? + <gg0> 2/ remove /etc/init.d/umountfs.* if any (eg. .orig .new .whatever) + <gg0> pere: because it loads them at boot, you need it loads just the right + one + <gg0> 3/ reboot twice + <gg0> (3/ halt twice) + <pere> this sound very fishy to me. + <gg0> or 3/ fix umountfs files under /lib/rc/init.d/started as well + <gg0> that should make it shutdown properly right away + <pere> my halt still hang. + <gg0> pere: you have /lib/rc/init.d in both /etc/init/umountfs and + /lib/rc/init.d/started/umountfs and there are no umountfs.* around? 
+ <gg0> problem seems to be it picks first it finds if there are more than + one + <gg0> well i could have been more precise: /lib/rc/init.d/started/umountfs + is a link to /etc/init.d one + <gg0> btw there must be just one and only one umountfs, patched + <gg0> pere: clean /etc/init.d, reboot/halt with reboot-hurd or halt-hurd, + then next sysv reboot/halt will be good + <gg0> you just need to leave patched umountfs under /etc/init.d alone + <gg0> patch has always been good, it just needs 2 reboots to be appreciated + <gg0> pere: do you have other /etc/init/umountfs* files besides patched + one? + <gg0> my guess is it takes the first and only the first which Provides: + umountfs + <gg0> 12:17 < pere> why are these files not ignored? + <gg0> 12:35 < gg0> my guess is it takes the first and only the first which + Provides: umountfs + <gg0> to confirm that, if you have umountfs and umountfs.orig, under + /started you'll find just umountfs.orig + <gg0> pere: how goes? + <gg0> teythoon: last ~40 lines + <gg0> i'm assuming you have any else umountfs.* under /etc/init.d. if you + just add /lib/rc/init.d path to the only umountfs there should not be any + problem + <pere> gg0: removing the umountfs.* files did not help, as far as I can + tell. + <pere> are you telling me that openrc caches all init.d scripts in + /lib/rc/init.d/ at boot? + <gg0> pere: yes, you can see them. which umountfs* do you have under + /lib/rc/init.d ? + <pere> the right one. :) + <gg0> only the right one? + <pere> just scared me to know that changes on the disk do not take effect + immediately with openrc. + <gg0> pere: only the right one? + <pere> yes + <gg0> here i screwed it up by forcing initscripts removal and reinstall to + reproduce it, then fixed it once again + <gg0> i should just improving the explaination :) + <gg0> pere: "removing the umountfs.* files did not help," so did you find + any? 
+ <pere> yes, both .orig, .rej and .dpkg-old + <gg0> pere: ok you should find one of them linked under + /lib/rc/init.d/started then + <gg0> /lib/rc/init.d/started/umountfs.* + <pere> I removed them three boots ago. still halt hangs. + <gg0> pere: and current umountfs have /lib/rc/init.d in path list? + <gg0> *has + <pere> yes. + <gg0> pere: can you access via ssh to it before issuing halt? + <pere> that is how I access it normally. + <gg0> ok + <gg0> before halt df should list /lib/rc/init.d as well + <gg0> after halt it should not, do you confirm that? + <gg0> (ssh connection here is kept alive) + <pere> my ssh connection went down, but /lib/rc/init.d was mounted while it + was active. + <pere> to me it look like umountfs isn't executed at all during shutdown. + <pere> oh, well. got to work on other things now. :) + <gg0> it's correct getting no messages if there no filesystem to umount + <gg0> as it wouldn't be run at all + <zigo> pere: Hey, thanks for uploading sysv-rc -48 ! :) + <pere> you are welcome. :) + <gg0> i can't reproduce it on a VM :/ http://paste.debian.net/plain/80658/ + <gg0> ehm no, same machive, successive halt + http://paste.debian.net/plain/80659/ + <gg0> got stuck + <pere> are there any testet sysvinit patches for hurd lingering? I plan to + upload a new version tonight or tomorrow. + + +## IRC, OFTC, #debian-hurd, 2014-02-08 + + <gg0> http://paste.debian.net/plain/80854/ + <gg0> expected? + <gg0> do tmpfs and procfs need to be shown as types /hurd/tmpfs and + /hurd/procfs? + <gg0> or can they be "normalized"? + <gg0> domount mount_noupdate tmpfs shmfs /run tmpfs + -onosuid,noexec,size=10%,mode=755 + <gg0> another one is why on linux options are nosuid,noexec ^, whereas on + hurd no-suid,no-exec,... ? + <rleigh> gg0: If they need generalising, we can add $nosuid/$noexec + etc. variables to mount-functions.sh and set them appropriately for the + currently platform. 
+ <rleigh> current platform rather + <gg0> yeah, i ask just to understand what side people prefers modifying, in + this case hurd vs sysvinit + <gg0> btw in the meanwhile i got tmpfs takes options without '-' though it + shows them with '-' in proc/mounts + <gg0> rleigh: and thanks for pointing out what looking for, little hints + saves hours in my case :) + + +## IRC, freenode, #hurd, 2014-02-08 + + <youpi> gnu_srs: the -49 version of sysvinit contains a fix for bootlogd + + +### IRC, freenode, #hurd, 2014-02-09 + + <gnu_srs> (16:31:17) <youpi>: gnu_srs: the -49 version of sysvinit contains + a fix for bootlogd + <gnu_srs> Nice for kFreeBSD, for Hurd it doesn't matter if we get a + segfault or an error code saying it's not implemented :-( + <youpi> segfault vs error code is really not the same + <youpi> iirc bootlogd would ignore the error + <gnu_srs> Nevertheless, bootlogd is not usable on Hurd :( + <youpi> then fix it + + +## IRC, OFTC, #debian-hurd, 2014-02-08 + + <rleigh> gg0: If the names are set by hurd itself, then it makes sense to + adapt sysvinit to cope with that rather than altering hurd since that + would be a fairly major compatibility break. OTOH, adding support for the + Linux/FreeBSD names in addition to the hyphenated names would be good + from the point of view of better interoperability generally, not just for + sysvinit. + <rleigh> For now, getting sysvinit to support the Hurd names is easy + enough, and if you do add the Linux/FreeBSD names then the compatibility + stuff can be removed when that's available. + + +## IRC, freenode, #hurd, 2014-02-11 + + <gnu_srs> Hi, still problems with hurd console under openrc: console: + Console library initialization failed: Not a directory + <gnu_srs> and /dev/vcs is there + <youpi> gnu_srs: but is it a directory? 
+ <gnu_srs> the output of console -d vga -d pc_mouse --repeat=mouse -d pc_kbd + --repeat=kbd -d generic_speaker -c /dev/vcs gives the response above + <gnu_srs> looks like /dev/vcs is a file. How to recreate the directory + content? + <gnu_srs> I thought it should not be removed with the latest sysvinit + package (-49) + <gnu_srs> from -48 changelog: Tell init.d/umountfs to not umount /dev/vcs, + as it break the console on Hurd. Patch from Samuel Thibault. + <youpi> gnu_srs: but did your reconfigure the hurd package to remount it ? + <gnu_srs> ? + <youpi> /dev/vcs won't magically be remounted by just not being unmounted + by sysvinit + <gnu_srs> dpkg-reconfigure hurd? + <youpi> sure + <gnu_srs> I can start the console manually, but ENABLE='true' in + /etc/default/hurd-console does not work (at least with openrc) + <youpi> does /dev/vcs becomes a mere file again with openrc? + <gnu_srs> no it's a directory with 6 entries + <youpi> does the /etc/init.d/hurd-console gets to starT? + <youpi> I'm afraid I'm really asking obvious questions that you should have + already asked for yourself + <gg0> so you mounted it and it's not a file anymore. does it work now? + <gnu_srs> it seem like the service is not started, trying to figure out + why:-D + <gnu_srs> I can restart it but it is not visible in rc-status? + + <gg0> shutdown stuck at "Asking all remaining processes to + terminate...done." (even before distupgrade btw) + <gg0> seems stuck at killall5 -18 + <teythoon> hm, that's bad + <teythoon> how do you know that ? + <gg0> /etc/init.d/sendsigs and /etc/init.d/killprocs + <gg0> (yes, switched to sysvinit and testing openrc) + <teythoon> but killall5 -18 is SIGSTOP right? + <teythoon> and if it says ...done. then killall5 has already been run + <teythoon> so, how do you know it hangs at killall5 ? + <gg0> teythoon: "done" is "log_action_end_msg 0" just after killall5 -15, + then we should get "Killing all remaining processes" or "All processes + ended within $seq seconds." 
+ <gg0> Asking all remaining processes to terminate...killall5 -15 -o 956 # + SIGTERM...done. + <gg0> All processes ended within 1 seconds...done. + <gg0> shutdown properly this time + <teythoon> hm + <teythoon> fwiw, i've also encountered hangs, haven't investigated yet + <gg0> with openrc? + <teythoon> yes + + <gnu_srs> Is it so that with teythoons mtab translator umount -a unmounts + all passive translators, removing the translator records?? + <gnu_srs> causing pflocal (and pfinet) to disappear? + +[[hurd/translator/mtab/discussion]]. + + <azeem> gnu_srs: didn't he say that this is getting fixed in his latest + patchset? + <gnu_srs> yes, what about mine and gg0s currently hosed systems? + <gnu_srs> yes, but until the patch makes into the next release,** + <youpi> gnu_srs: pflocal and pfinet don't appear in mtab + <youpi> because they don't expose whole directories, just a trivial node + <youpi> so no, they won't get umounted by umount -a + <youpi> simply check the content of /proc/mounts + <gnu_srs> so how come I cannot recover my image? + <gnu_srs> and gg0 neither + <youpi> no idea, I've never tried openrc + <youpi> when daring new fields, you face new issues, that's no wonder + <gnu_srs> so this does not happen with sysv-rc? + <youpi> I haven't seen any of this kind of issue + <youpi> whether it's related to using openrc vs sysvrc, I have no idea + <youpi> but at least that's a candidate for sure + <gnu_srs> well in my case hurd bootstrap is stuck after ext2fs exec and + before init + <gnu_srs> ant reinstalling hurd via linux does not help + <youpi> you mean the hurd package? + <youpi> you can also try to reinstall the libc0.3 package + <youpi> normally it should be all that is needed for boot + <youpi> perhaps also some /dev entries + <gnu_srs> yes, the hurd package. I will try with libc0.3 tomorrow. Which + /dev entries, and how to create them manually? 
+ <youpi> "perhaps" implies that I don't know + <youpi> you can as well just boot with an install CD, mount your disk, + chroot into it, and run dpkg-reconfigure hurd there to recreate + everything in /dv + <youpi> +e + + +## IRC, OFTC, #debian-hurd, 2014-02-13 + + <youpi> pere, rleigh: which script is supposed to make /etc/mtab a symlink + to /proc/mounts already? I can't find it + <pere> youpi: see /lib/init/mount-functions.sh + + +## IRC, freenode, #hurd, 2014-02-13 + + <braunr> teythoon: are the sysvinit debian packages in sid usable currently + ? + <teythoon> they are + <braunr> nice + <teythoon> youpi and pere have been busy polishing it quite a bit + <braunr> teythoon: and uhm, how does one enable sysvinit in debian ? :) + <braunr> ah, found pere's blog + <teythoon> braunr: didn't you read the postinst instructions ? :p + <teythoon> update-alternatives --config runsystem + <braunr> oh right + <braunr> got lost in the noise + <braunr> very nice + <braunr> still a few glitches i see, but it does the job + <braunr> although i'm not sure i like the lack of console prompt :/ + <braunr> i'll keep darnassus on the old runsystem until this is fixed + <teythoon> braunr: cp -p /usr/share/sysvinit/inittab /etc/inittab + <teythoon> and kill -HUP 1 + <braunr> oh + <braunr> :) + <braunr> teythoon: thanks + <braunr> teythoon: do you know why there are three tmpfs instances after + startup (/run, and in addition, /run/shm and /run/lock) instead of one on + /run ? 
+ <braunr> sorry for being so annoying :) + <teythoon> braunr: dunno, but that is what Debian does + <braunr> https://wiki.debian.org/ReleaseGoals/RunDirectory explains it a + bit + <teythoon> root@thinkbox ~src # uname -s; mount | grep /run + <teythoon> Linux + <teythoon> tmpfs on /run type tmpfs + (rw,nosuid,noexec,relatime,size=306952k,mode=755) + <teythoon> tmpfs on /run/lock type tmpfs + (rw,nosuid,nodev,noexec,relatime,size=5120k) + <teythoon> tmpfs on /run/shm type tmpfs + (rw,nosuid,nodev,noexec,relatime,size=613900k) + <braunr> i like this /run directory + <teythoon> yep, it's nice + <braunr> ah great, i can add ,sync=30 to fstab and it's added at boot time + :) + + +## IRC, freenode, #hurd, 2014-02-17 + + <congzhang> hi, I think we should make console server separate from + hurd-console + <congzhang> if DM want start, console server need be start first + <braunr> congzhang: send patches + <congzhang> and hurd-console mark it start at the end of sysinit? + <teythoon> congzhang: i agree + <braunr> teythoon: isn't hurd-console the console server ? + <congzhang> I want to check whether it is need first + <teythoon> braunr: yes, but congzhangs point is (as i understand it) that + the backend component should be started earlier + <teythoon> then again, i know little about the hurd console + <congzhang> no, if user enable one dispaly manager, then cycle dependence + happen + <braunr> why ? + <teythoon> i believe that is a different problem, namely that our + hurd-console init script depends on $all + <teythoon> pere: ^ + <congzhang> hurd-console Required-Start: $all + <braunr> ok + <braunr> yes that's a separate issue, and easier to understand + <congzhang> teythoon: if wdm Required-Start hurd-console, then insserv + can't generate the script order, right ? 
+ <teythoon> congzhang: possibly, i don't know for sure + <congzhang> It doesn't work , and I rename to S??wdm to later one like + S20wdm + <congzhang> but insserv will regenerate the script order in /etc/rc2.d/, I + can't depend on that + <pere> congzhang: $all means after all scripts not depending on $all, and + not what the intuitive interpretation would tell you. + <pere> the current implementation order all scripts as if $all were not + present, and then move all scripts depending on $all to the last order + number+1. + <pere> because $all is misunderstood by most users, I strongly recommend to + _not_ use $all in any init.d script. + <congzhang> pere: so to make wdm to be number+more? + <pere> congzhang: make it depend on $all and be lexically sorted after + hurd-console. :) + <congzhang> wdm need start after hurd-console, if console-driver will run + when hurd-console start + <pere> not quite sure how startpar handle that case, so it might not work + the way you want anyway. + <pere> adding a dependency on hurd-console should not hurt, though. :) + <congzhang> how make it lexically sorted after hurd-console? + <pere> w is already after h in the alphabet. :) + <congzhang> that's trick! + <pere> but startpar uses the info in /etc/init.d/.depend.* (makefile style) + to order scripts, so check what the result is there too. + <braunr> congzhang: no it's not + <congzhang> that's just cache + <braunr> congzhang: ? + <congzhang> and generated from script head? + <congzhang> the right way is Adding run-time dependencies in script + <pere> congzhang: yes. insserv called from update-rc.d generate the + .depend.* files, and startpar reads the files (and ignore the headers) + when starting scripts. + <congzhang> if the script have cycle dependence, no one can help + <pere> congzhang: if there is a cycle, update-rc.d will reject the script. 
+ <congzhang> sure, because the system current have not runable one + <congzhang> Display Manager run before hurd-console, and never successful + for X stared failed! + <pere> what is this hurd-console stuff, btw? it sound like somthing that + should be started in rcboot.d (aka rcS.d on Debian). + <congzhang> if you install wdm, you will notice that wdm start failed + <pere> should it run before sulogin when booting into single user? + <congzhang> hurd-console mix too much thins + <teythoon> pere: it's the console multiplexes that provides /dev/tty? + <congzhang> just part of that function + <teythoon> pere: it's like screen or tmux a server-client architecture + <teythoon> the x server gets keyboard and mouse events from it iirc + <pere> right. so not needed by sulogin, I guess. because if it was, it + should start in rcS.d, not rc[2-5].d/. + <congzhang> and also start /bin/console to start keyboard and mouse driver + <teythoon> /bin/console is the frontend + <pere> and if it started in rcS.d/, it would always be started before + wdm. :) + <braunr> i think it should be started in rcS.d + <congzhang> why not essential? + <pere> braunr: when I tried, it failed. + <congzhang> https://www.gnu.org/software/hurd/hurd/console.html + <congzhang> teythoon: i want to make one disk img with default DM, and face + these problem + <braunr> pere: do you have a log of the failur e? + <congzhang> teythoon: I know you are working on the hurd init system, so I + ask you for help + <pere> braunr: only the boot message: Starting Hurd console multiplexer: + hurd-console failed! + <pere> braunr: how can I learn more? + <braunr> i don't know any easy way + <braunr> try to put the system in its early state manually + <braunr> and maybe run rpctrace on the actual console command + <braunr> if that is what really fails + <congzhang> and I found that pc_kbd may have some bug? 
I have high + frequence of start failed if I make it start + <congzhang> but I can't located the real source of these problem + <teythoon> pere: the console logs some messages to syslog + <pere> teythoon: looked, nothing there. :( + <pere> gah, look like I broke my hurd machine. Added rpctrace to the start + of hurd-console, and now the boot just hang there, and when I interrupt + it the kernel reboot the entire machine. :( + <braunr> pere: use rpctrace manually, don't script it + <teythoon> oh yeah, seen this as well + <pere> braunr: well, no use to test it after boot when it hang during + boot... + <teythoon> it triggers an assertion in the proc server iirc + <braunr> pere: that doesn't imply you need to script it + <congzhang> pere: qemu snapshot mode will be your friend:) + <braunr> ideally, i'd run the init system automatically up to the point i + want to run my test, and make it spawn a shell, and use that shell then + <pere> congzhang: hah. real men do to take backups. but they weep a + lot. :) + <congzhang> teythoon: runsystem.sysv has work well on my machine, just some + error infomation + <teythoon> good + + +## IRC, freenode, #hurd, 2014-02-21 + + <gnu_srs1> Hi, a general question: is ptrace available for GNU/Hurd? + <teythoon> yes + <gnu_srs1> tks, the openrc developers are working on process supervision + using it: good/bad? (compared to cgroups) + <teythoon> uh + <teythoon> i prefer the cgroups approach + <teythoon> but upstart also uses ptrace to keep track of the 'main' process + of an daemon + <teythoon> they use ptrace to follow a daemon that double forks + <gnu_srs1> teythoon: and regarding portability? + + +## IRC, freenode, #hurd, 2014-02-24 + + <braunr> sysvinit doesn't seem to handle /etc/default/locale into + consideration + + +## IRC, OFTC, #debian-hurd, 2014-02-25 + + <gg0> how about switching runsystem.sysv by default? 
+ <youpi> now that it seems to be running fine, we could do that, yes + + # Required Interfaces In the thread starting diff --git a/open_issues/ti-rpc_then_nfs.mdwn b/open_issues/ti-rpc_then_nfs.mdwn index aa36e020..c3dd4e26 100644 --- a/open_issues/ti-rpc_then_nfs.mdwn +++ b/open_issues/ti-rpc_then_nfs.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2011, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -18,3 +18,88 @@ It needs some work on our side, [[!message-id Then, the Hurd's [[hurd/translator/nfs]] translator and [[hurd/nfsd]] can be re-enabled, [[!message-id "87hb2j7ha7.fsf@gnu.org"]]. + + +## IRC, freenode, #hurd, 2014-02-19 + + <pere> hi. I'm trying to port libtirpc to get rcpbind on hurd, and am + unable to find IPV6_PORTRANGE and IPV6_PORTRANGE_LOW. is this a known + problem with a known fix? + <braunr> what are they supposed to be ? + <pere> braunr: found them described in <URL: + http://www.daemon-systems.org/man/ip6.4.html >. + <braunr> "The IPV6_PORTRANGE socket option and the conflict resolution rule + are not defined in the RFCs and should be considered implementation + dependent + <braunr> " + <braunr> hm + <braunr> if we have that, they're very probably not accessible from outside + our network stack + <pere> needed feature on hurd, in other words... + <braunr> why ? + <pere> If I remember correctly, SO_PEERCRED is also missing? + <braunr> yes .. + <braunr> that one is important + <pere> braunr: you wonder why the IPV6_PORTRANGE socket option was created? + <braunr> i wonder why it's needed + <braunr> does linux have it ? + <pere> yes, linux got it. + <braunr> same name ? + <pere> it make it possible for some services to work with some + firewalls. :) + <pere> yes, same name, as far I can tell. 
+ <braunr> they could merely bind ports explicitely, couldn't they ? + <pere> not always. + <braunr> or is it for servers on creation of a client socket ? + <pere> see <URL: + http://www.stacken.kth.se/lists/heimdal-discuss/2000-11/msg00002.html > + for an example I came across. + <braunr> i don't find these macros on linux :/ + <pere> how strange. libtirpc build on linux. + <braunr> is there a gitweb or so somewhere ? + <braunr> i can't find it on sf :/ + <pere> for <URL: http://sourceforge.net/projects/libtirpc >, you mean? + <braunr> yes + <pere> no idea. + <braunr> are you looking at upstream 0.2.4 or a particular debian package ? + <pere> I'm looking at the debian package. + <braunr> let me take a look + <pere> http://paste.debian.net/82971/ is my first draft patch to get the + source building. + <braunr> ok so + <braunr> in src/bindresvport.c + <braunr> if you look carefully, you'll see that these _PORTRANGE macros are + used in non linux code + <braunr> not very portable but it explains why you hit the problem + <braunr> try using #if defined (__linux__) || defined(__GNU__) + <braunr> also, i think we intend to implement SCM_CREDS, not SO_PEERCRED + <braunr> but consider we have neither for now + <pere> ah, definitely a simpler fix. + <braunr> pere: btw, see + https://lists.debian.org/debian-hurd/2010/12/msg00014.html + + <pere> <URL: https://bugs.debian.org/739557 > with patch reporte.d + + +## IRC, freenode, #hurd, 2014-02-20 + + <pere> new libtirpc with hurd fixes just uploaded to debian. should fix + the rpcbind build too. + + +## IRC, OFTC, #debian-hurd, 2014-02-20 + + <pere> hm, rpcbind built with freshly patched libtirpc fail to work on + hurd. no idea why. + <pere> running 'rpcinfo -p' show 'rpcinfo: can't contact portmapper: RPC: + Success' + <teythoon> o_O + <pere> I have no idea how to debug it. :( + <pere> anyway, I've found that rpcinfo is the broken part. rpcbind work, + when I test it from a remote machine. 
+ + +## IRC, OFTC, #debian-hurd, 2014-02-21 + + <pere> failing rpcinfo -p on hurd reported as <URL: + http://bugs.debian.org/739674 >. Anyone got a clue how to debug it? diff --git a/open_issues/tmux.mdwn b/open_issues/tmux.mdwn index f71d13e1..c49a5e12 100644 --- a/open_issues/tmux.mdwn +++ b/open_issues/tmux.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -10,6 +10,7 @@ License|/fdl]]."]]"""]] [[!tag open_issue_porting]] + # IRC, freenode, #hurd, 2013-08-01 <braunr> teythoon: can you stop tmux on darnassus please ? @@ -22,3 +23,35 @@ License|/fdl]]."]]"""]] <teythoon> sometimes tmux would hang on attaching or detaching though, but overall I had less problems with tmux than with screen <teythoon> ah, I tried to start tmux on darnassus and now it hangs + + +# IRC, freenode, #hurd, 2014-02-04 + + <teythoon> braunr: whoa, i can reproduce gnu_srs' hanging ssh sessions on + darnassus + <teythoon> here goes + <teythoon> run tmux, exit the shell so that tmux quits, start tmux again + (tmux hangs now on some socket stuff), log in with ssh again, pkill tmux, + rm /tmp/tmux*/default => both ssh sessions hang and time out eventually + <braunr> why start tmux twice ? + <teythoon> dunno + <teythoon> that's what i just did, twice in a row + <teythoon> there's a bug somewhere that makes tmux hang if the socket + exists but no tmux server is running + <teythoon> maybe that contributes to to the other issuse, i don't know + <braunr> looks like an infinite loop somewhere + <gnu_srs> teythoon: Nice to set that I'm not alone having this problem:P + <braunr> teythoon: what's happening ? :) + <teythoon> ? 
+ <braunr> on darnassus + <teythoon> not sure + <teythoon> uh, something is very wrong o_O + <teythoon> help ? + <braunr> :) + <braunr> the msg thread of a process is blocked somewhere + <braunr> preventing ps/top from completing + <braunr> looks like proc is blocked now .. + <braunr> restarting the vm + <braunr> apparently, removing buggy tmux sockets make pflocal crash + <braunr> thanks for the report :) + <teythoon> you are welcome :) diff --git a/open_issues/translate_fd_or_port_to_file_name.mdwn b/open_issues/translate_fd_or_port_to_file_name.mdwn index 98fe0cfc..87556075 100644 --- a/open_issues/translate_fd_or_port_to_file_name.mdwn +++ b/open_issues/translate_fd_or_port_to_file_name.mdwn @@ -1,4 +1,4 @@ -[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation, +[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable @@ -156,6 +156,91 @@ License|/fdl]]."]]"""]] <braunr> see bug-hurd +## IRC, freenode, #hurd, 2013-12-05 + + <teythoon> braunr: no more room for vm_map_find_entry in 80220a40 + <teythoon> 80220a40 <- is that a task ? + <braunr> or a vm_map, not sure + <braunr> probably a vm_map + <teythoon> hm + <teythoon> let's fix this kind of reporting + <braunr> :) + <teythoon> let one process register for kernel log messages + <teythoon> make a rich interface, say klog_thread and friends + <teythoon> a userspace process gets the port name, looks it up in proc, + logs nicely to syslog + <teythoon> if noone registered for this notifications, fall back to the old + reporting + <braunr> i tend to think using internal names is probably better + <teythoon> how would i use them to see wich process caused the issue ? 
+ <braunr> you give the name of the task + <braunr> (which means tasks have names, yes) + <teythoon> ok + <braunr> the reason is that reporting is often used for debugging + <braunr> and debugging usually means there is a bug + <braunr> if the bug prevents from reporting, it's not very useful + <braunr> and we're talking about the kernel here, the low level stuff + <teythoon> incidentally, i got myself a stuck process + <teythoon> ah, got it killed + <teythoon> braunr: so you propose to add a task rpc to set a name ? + <braunr> i don't want to push such things + <braunr> which is why this hasn't been done until now + <braunr> but that's what i'd do in x15, yes + <teythoon> y not ? + <braunr> and instead of a process registered to gather kernel messages, i'd + use a dmesg-like interface, where the kernel manages its message buffer + itself + <braunr> i didn't feel the need to + <braunr> the tools i've had until now were sufficient + <braunr> don't forget you still need to fix mtab :p + <braunr> or is it done ? + <teythoon> i sometimes see tasks deallocating invalid ports + <teythoon> no + <teythoon> there is an un-acked patche series on the list + <braunr> ok + <teythoon> so, i want to identify which process caused it + <teythoon> is that possible right now ? + <braunr> not easily, no + <teythoon> so that's a valid use case + <braunr> it is + <teythoon> good + <teythoon> :) + <teythoon> so proc would register a string describing each task and mach + would use this for printing nicer messages ? + <braunr> for example, yes + <braunr> one problem with that approach is that it doesn't fit well with + subhurds + <teythoon> *bingbingbing + <braunr> but i personally wouldn't care much, they're kernel messages + <braunr> in the future, we could make mach more a hypervisor, and register + names for each domains + <teythoon> yet unanswered proposal about hierachical proc servers on the + list... 
+ <teythoon> that'd also fix subhurds, so that the parents processes won't + appear in the subhurd + <teythoon> making it sandboxier + <teythoon> and killall5 couldn't slaughter the host system if the subhurd + shuts down with sysvinit + + +## IRC, freenode, #hurd, 2014-01-20 + + <teythoon> i wonder if it would not be best to add a description to mach + tasks + <braunr> i think it would + <teythoon> to aid fixing these kind of issues + <braunr> in x15, i actually add descriptions (names) to all kernel objects + <teythoon> that's probably a good idea, yes + <braunr> well, not all, but many + + +## IRC, OFTC, #debian-hurd, 2014-02-05 + + <teythoon> youpi: about that patch implementing task_set_name, may i merge + the amended version ? + <youpi> yes + + # IRC, freenode, #hurd, 2011-07-13 A related issue: diff --git a/open_issues/user-space_device_drivers.mdwn b/open_issues/user-space_device_drivers.mdwn index d6c33d30..69ec1d23 100644 --- a/open_issues/user-space_device_drivers.mdwn +++ b/open_issues/user-space_device_drivers.mdwn @@ -1,5 +1,5 @@ -[[!meta copyright="Copyright © 2009, 2011, 2012, 2013 Free Software Foundation, -Inc."]] +[[!meta copyright="Copyright © 2009, 2011, 2012, 2013, 2014 Free Software +Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -19,7 +19,7 @@ Also see [[device drivers and IO systems]]. 
[[!toc levels=2]] -# Issues +# Open Issues ## IRQs @@ -250,6 +250,297 @@ A similar problem is described in <teythoon> cool :) +#### IRC, freenode, #hurd, 2014-02-10 + + <teythoon> braunr: i have a question wrt memory allocation in gnumach + <teythoon> i made a live cd with a rather large ramdisk + <teythoon> it works fine in qemu, when i tried it on a real machine it + failed to allocate the buffer for the ramdisk + <teythoon> i was wondering why + <teythoon> i believe the function that failed was kmem_alloc trying to + allocate 64 megabytes + <braunr> teythoon: how much memory on the real machine ? + <teythoon> 4 gigs + <braunr> so 1.8G + <teythoon> yes + <braunr> does it fail systematically ? + <teythoon> but surely enough + <teythoon> uh, i must admit i only tried it once + <braunr> it's likely a 64M kernel allocation would fail + <braunr> the kmem_map is 128M wide iirc + <braunr> and likely fragmented + <braunr> it doesn't take much to prevent a 64M contiguous virtual area + <teythoon> i see + <braunr> i suggest you try my last gnumach patch + <teythoon> hm + <teythoon> surely there is a way to make this more robust, like using a + different map for the allocation ? + <braunr> the more you give to the kernel, the less you have for userspace + <braunr> merging maps together was actually a goal + <braunr> the kernel should never try to allocate such a large region + <braunr> can you trace the origin of the allocation request ? + <teythoon> i'm pretty sure it is for the ram disk + <braunr> makes sense but still, it's huge + <teythoon> well... + <braunr> the ram disk should behave as any other mapping, i.e. pages should + be mapped in on demand + <teythoon> right, so the implementation could be improved ? + <braunr> we need to understand why the kernel makes such big requests first + <teythoon> oh ? i thought i asked it to do so + <braunr> ? 
+ <teythoon> for the ram disk + <braunr> normally, i would expect this to translate to the creation of a + 64M anonymous memory vm object + <braunr> the kernel would then fill that object with zeroed pages on demand + (on page fault) + <braunr> at no time would there be a single 64M congituous kernel memory + allocation + <braunr> such big allocations are a sign of a serious bug + <braunr> for reference, linux (which is even more demanding because + physical memory is directly mapped in kernel space) allows at most 4M + contiguous blocks on most architectures + <braunr> on my systems, the largest kernel allocation is actually 128k + <braunr> and there are only two such allocations + <braunr> teythoon: i need you to reproduce it so we understand what happens + better + <teythoon> braunr: currently the ramdisk implementation kmem_allocs the + buffer in the kernel_map + <braunr> hum + <braunr> did you add this code ? + <teythoon> no + <braunr> where is it ? + <teythoon> debian/patches + <braunr> ugh + <teythoon> heh + <braunr> ok, don't expect that to scale + <braunr> it's a quick and dirty hack + <braunr> teythoon: why not use tmpfs ? + <teythoon> i use it as root filesystem + <braunr> :/ + <braunr> ok so + <braunr> update on what i said before + <braunr> kmem_map is exclusively used for kernel object (slab) allocations + <braunr> kmem_map is a submap of kernel_map + <braunr> which is 192M on i386 + <braunr> so a 64M allocation can't work at all + <braunr> it would work on xen, where the kernel map is 224M large + <braunr> teythoon: do you use xen ? 
+ <teythoon> ok, thanks for the pointers :) + <teythoon> i don't use xen + <braunr> then i can't explain how it worked in your virtual machine + <braunr> unless the size was smaller + <teythoon> i'll look into improving the ramdisk patch if time permits + <teythoon> no it wasnt + <braunr> :/ + <teythoon> and it works reliably in qemu + <braunr> that's very strange + <braunr> unless the kernel allocates nothing at all inside kernel_map on + qemu + + +##### IRC, freenode, #hurd, 2014-02-11 + + <teythoon> braunr: http://paste.debian.net/81339/ + <braunr> teythoon: oO ? + <braunr> teythoon: you can't allocate memory from a non kernel map + <braunr> what you're doing here is that you create a separate, non-kernel + address space, that overlaps kernel memory, and allocate from that area + <braunr> it's like having two overlapping heaps and allocating from them + <teythoon> braunr: i do? o_O + <teythoon> so i need to map it instead ? + <braunr> teythoon: what do you want to do ? + <teythoon> i'm currently reading up on the vm system, any pointers ? + <braunr> teythoon: but what do you want to achieve here ? + <braunr> 12:24 < teythoon> so i need to map it instead ? + <teythoon> i'm trying to do what you said the other day, create a different + map to back the ramdisk + <braunr> no + <teythoon> no ? + <braunr> i said an object, not a map + <braunr> but it means a complete rework + <teythoon> ok + <teythoon> i'll head back into hurd-land then, though i'd love to see this + done properly + <braunr> teythoon: what you want basically is tmpfs as a rootfs right ? + <teythoon> sure + <teythoon> i'd need a way to populate it though + <braunr> how is it done currently ? + <teythoon> grub loads an ext2 image, then it's copied into the ramdisk + device, and used by the root translator + <braunr> how is it copied ? + <braunr> what makes use of the kernel ramdisk ? 
+ <teythoon> in ramdisk_create, currently via memcpy + <teythoon> the ext2fs translator that provides / + <braunr> ah so it's a kernel device like hd0 ? + <teythoon> yes + <braunr> hm ok + <braunr> then you could create an anonymous memory object in the kernel, + and map read/write requests to object operations + <braunr> the object must not be mapped in the kernel though, only temporary + on reads/writes + <teythoon> right + <teythoon> so i'd not use memcpy, but one of the mach functions that copy + stuff to memory objects ? + <braunr> i'm not sure + <braunr> you could simply map the object, memcpy to/from it, and unmap it + <teythoon> what documentation should i read ? + <braunr> vm/vm_map.h for one + <teythoon> i can only find stuff describing the kernel interface to + userspace + <braunr> vm/vm_kern.h may help + <braunr> copyinmap and copyoutmap maybe + <braunr> hm no + <teythoon> vm_map.h isn't overly verbose :( + <braunr> vm_map_enter/vm_map_remove + <teythoon> ah, i actually tried vm_map_enter + <braunr> look at the .c files, functions are described there + <teythoon> that leads to funny results + <braunr> vm_map_enter == mmap basically + <braunr> and vm_object.h + <teythoon> panic: kernel thread accessed user space! + <braunr> heh :) + <teythoon> right, i hoped vm_map_enter to be the in-kernel equivalent of + vm_map + + <teythoon> braunr: uh, it worked + <braunr> teythoon: ? + <teythoon> weird + <teythoon> :) + <braunr> teythoon: what's happening ? 
+ <teythoon> i refined the ramdisk patch, and it seems to work + <teythoon> not sure if i got it right though, i'll paste the patch + <braunr> yes please + <teythoon> http://paste.debian.net/81376/ + <braunr> no it can't work either + <teythoon> :/ + <braunr> you can't map the complete object + <teythoon> (amusingly it does) + <braunr> you have to temporarily map the pages you want to access + <braunr> it does for the same obscure reason the previous code worked on + qemu + <teythoon> ok, i think i see + <braunr> increase the size a lot more + <braunr> like 512M + <braunr> and see + <braunr> you could also use the kernel debugger to print the kernel map + before and after mapping + <teythoon> how ? + <braunr> hm + <braunr> see show task + <braunr> maybe you can call the in kernel function directly with the kernel + map as argument + <teythoon> which one ? + <braunr> the one for "show task" + <braunr> hm no it shows threads, show map + <braunr> and show map crashes on darnassus .. + <teythoon> here as well + <braunr> ugh + <braunr> personally i'd use something like vm_map_info in x15 + <braunr> but you may not want to waste time with that + <braunr> try with a bigger size and see what it does, should be quick and + simple enough + <teythoon> right + <teythoon> braunr: ok, you were right, mapping the entire object fails if + it is too big + <braunr> teythoon: fyi, kmem_alloc and vm_map have some common code, namely + the allocation of an virtual area inside a vm_map + <braunr> kmem_alloc requires a kernel map (kernel_map or a submap) whereas + vm_map can operate on any map + <braunr> what differs is the backing store + <teythoon> braunr: i believe i want to use vm_object_copy_slowly to create + and populate the vm object + <teythoon> for that, i'd need a source vm_object + <teythoon> the data is provided as a multiboot_module + <braunr> kmem_alloc backs the virtual range with wired down physical memory + <braunr> whereas vm_map maps part of an object that is usually 
pageable + <teythoon> i see + <braunr> and you probably want your object to be pageable here + <teythoon> yes :) + <braunr> yes object copy functions could work + <braunr> let me check + <teythoon> what would i specify as source object ? + <braunr> let's assume a device write + <braunr> the source object would be where the source data is + <braunr> e.g. the data provided by the user + <teythoon> yes + <teythoon> trouble is, i'm not sure what the source is + <braunr> it looks a bit complicated yes + <teythoon> i mean the boot loader put it into memory, not sure what mach + makes of that + <braunr> i guess there already are device functions that look up the object + from the given address + <braunr> it's anonymous memory + <braunr> but that's not the problem here + <teythoon> so i need to create a memory object for that ? + <braunr> you probably don't want to populate your ramdisk from the kernel + <teythoon> wire it down to the physical memory ? + <braunr> don't bother with the wire property + <teythoon> oh ? + <braunr> if it can't be paged out, it won't be + <teythoon> ah, that's not what i meant + <braunr> you probably want ext2fs to populate it, or another task loaded by + the boot loader + <teythoon> interesting idea + <braunr> and then, this task will have a memory object somewhere + <braunr> imagine a task which sole purpose is to embedd an archive to + extract into the ramdisk + <teythoon> sweet, my thoughts exactly :) + <braunr> the data section of a program will be backed by an anonymous + memory object + <braunr> the problem is the interface + <braunr> the device interface passes addresses and sizes + <braunr> you need to look up the object from that + <braunr> but i guess there is already code doing that in the device code + somewhere + <braunr> teythoon: vm_object_copy_slowly seems to create a new object + <braunr> that's not exactly what we want either + <teythoon> why not ? 
+ <braunr> again, let's assume a device_write scenario + <teythoon> ah + <braunr> you want to populate the ramdisk, which is merely one object + <braunr> not a new object + <teythoon> yes + <braunr> teythoon: i suggest using vm_page_alloc and vm_page_copy + <braunr> and vm_page_lookup + <braunr> teythoon: perhaps vm_fault_page too + <braunr> although you might want wired pages initially + <braunr> teythoon: but i guess you see what i mean when i say it needs to + be reworked + <teythoon> i do + <teythoon> braunr: aww, screw that, using a tmpfs is much nicer anyway + <teythoon> the ramdisk strikes again ... + <braunr> teythoon: :) + <braunr> teythoon: an extremely simple solution would be to enlarge the + kernel map + <braunr> this would reduce the userspace max size to ~1.7G but allow ~64M + ramdisks + <teythoon> nah + <braunr> or we could reduce the kmem_map + <braunr> i think i'll do that anyway + <braunr> the slab allocator rarely uses more than 50-60M + <braunr> and the 64M remaining area in kernel_map can quickly get + fragmented + <teythoon> braunr: using a tmpfs as the root translator won't be straight + forward either ... damn the early boostrapping stuff ... + <braunr> yes .. + <teythoon> that's one of the downsides of the vfs-as-namespace approach + <braunr> i'm not sure + <braunr> it could be simplified + <teythoon> hm + <braunr> it could even use a temporary name server to avoid dependencies + <teythoon> indeed + <teythoon> there's even still the slot for that somewhere + <antrik> braunr: hm... I have a vague recollection that the fixed-sized + kmem-map was supposed to be gone with the introduction of the new + allocator?... + <braunr> antrik: the kalloc_map and kmem_map were merged + <braunr> we could directly use kernel_map but we may still want to isolate + it to avoid fragmentation + +See also the discussion on [[gnumach_memory_management]], *IRC, freenode, +\#hurd, 2013-01-06*, *IRC, freenode, #hurd, 2014-02-11* (`KENTRY_DATA_SIZE`). 
+ + ### IRC, freenode, #hurd, 2012-07-17 <bddebian> OK, here is a stupid question I have always had. If you move @@ -725,7 +1016,133 @@ A similar problem is described in * <http://gelato.unsw.edu.au/IA64wiki/UserLevelDrivers> + +## The Anykernel and Rump Kernels + * [Running applications on the Xen Hypervisor](http://blog.netbsd.org/tnf/entry/running_applications_on_the_xen), Antti Kantee, 2013-09-17. [The Anykernel and Rump Kernels](http://www.netbsd.org/docs/rump/). + + +### IRC, freenode, #hurd, 2014-02-13 + + <cluck> is anyone working on getting netbsd's rump kernel working under + hurd? it seems like a neat way to get audio/usb/etc with little extra + work (it might be a great complement to dde) + <braunr> noone is but i do agree + <braunr> although rump wasn't exactly designed to make drivers portable, + more subsystems and higher level "drivers" like file systems and network + stacks + <braunr> but it's certainly possible to use it for drivers to without too + much work + <curious_troll> cluck: I am reading about rumpkernels and his thesis. + <cluck> braunr: afaiu there is (at least partial) work done on having it + run on linux, xen and genode [unless i misunderstood the fosdem'14 talks + i've watched so far] + <cluck> "Generally speaking, any driver-like kernel functionality can be + offered by a rump server. Examples include file systems, networking + protocols, the audio subsystem and USB hardware device drivers. A rump + server is absolutely standalone and running one does not require for + example the creation and maintenance of a root file system." + <cluck> from http://www.netbsd.org/docs/rump/sptut.html + <braunr> cluck: how do they solve resource sharing problems ? 
+ <cluck> braunr: some sort of lock iiuc, not sure if that's managed by the + host (haven't looked at the code yet) + <braunr> cluck: no, i mean things like irq sharing ;p + <braunr> bus sharing in general + <braunr> netbsd has a very well defined interface for that, but i'm + wondering what rump makes of it + <cluck> braunr: yes, i understood + <cluck> braunr: just lacking proper terminology to express myself + <cluck> braunr: at least from the talk i saw what i picked up is it behaves + like netbsd inside but there's some sort of minimum support required from + the "host" so the outside can reach down to the hw + <braunr> cluck: rump is basically glue code + <cluck> braunr: but as i've said, i haven't looked at the code in detail + yet + <cluck> braunr: yes + <braunr> but host support, at least for the hurd, is a bit more involved + <braunr> we don't merely want to run standalone netbsd components + <braunr> we want to make them act as real hurd servers + <braunr> therefore tricky stuff like signals quickly become more + complicated + <braunr> we also don't want it to use its own RPC format, but instead use + the native one + <cluck> braunr: antti says required support is minimal + <braunr> but again, compared to everything else, the porting effort / size + of reusable code base ratio is probably the lowest + <braunr> cluck: and i say we don't merely want to run standalone netbsd + components on top of a system, we want them to be our system + <cluck> braunr: argh.. i hate being unable to express myself properly + sometimes :| + <cluck> ..the entry point?! + <braunr> ? + <cluck> dunno what to call them + <braunr> i understand what you mean + <braunr> the system specific layer + <braunr> and *againù i'm telling you our goals are different + <cluck> yes, anyways.. 
just a couple of things, the rest is just C + <braunr> when you have portable code such as found in netbsd, it's not that + hard to extract it, create some transport between a client and a server, + and run it + <braunr> if you want to make that hurdish, there is more than that + <braunr> 1/ you don't use tcp, you use the native microkernel transport + <braunr> 2/ you don't use the rump rpc code over tcp, you create native rpc + code over the microkernel transport (think mig over mach) + <braunr> 3/ you need to adjust how authentication is performed (use the + auth server instead of netbsd internal auth mechanisms) + <braunr> 4/ you need to take care of signals (if the server generates a + signal, it must correctly reach the client) + <braunr> and those are what i think about right now, there are certainly + other details + <cluck> braunr: yes, some of those might've been solved already, it seems + the next genode release already has support for rump kernels, i don't + know how they went about it + <cluck> braunr: in the talk antii mentions he wanted to quickly implement + some i/o when playing on linux so he hacked a fs interface + <cluck> so the requirements can't be all that big + <cluck> braunr: in any case i agree with your view, that's why i found rump + kernels interesting in the first place + <braunr> i went to the presentation at fosdem last year + <braunr> and even then considered it the best approach for + driver/subsystems reuse on top of a microkernel + <braunr> that's what i intend to use in propel, but we're far from there ;p + <cluck> braunr: tbh i hadn't paid much attention to rump at first, i had + read about it before but thought it was more netbsd specific, the genode + mention piked my interest and so i went back and watched the talk, got + positively surprised at how far it has come already (in retrospect it + shouldn't have been so unexpected, netbsd has always been very small, + "modular", with clean interfaces that make porting easier) + 
<braunr> netbsd isn't small at all + <braunr> not exactly modular, well it is, but less than other systems + <braunr> but yes, clean interfaces, explicitely because their stated goal + is portability + <braunr> other projects such as minix and qnx didn't wait for rump to reuse + netbsd code + <cluck> braunr: qnx and minix have had money and free academia labor done + in their favor before (sadly hurd doesn't have the luck to enjoy those + much) + <cluck> :) + <braunr> sure but that's not the point + <braunr> resources or not, they chose the netbsd code base for a reason + <braunr> and that reason is portability + <cluck> yes + <cluck> but it's more work their way + <braunr> more work ? + <cluck> with rump we'd get all those interfaces for free + <braunr> i don't know + <braunr> not for free, certainly not + <cluck> "free" + <braunr> but the cost would be close to as low as it could possibly be + considering what is done + <cluck> braunr: the small list of dependencies makes me wonder if it's + possible it'd build under hurd without any mods (yes, i know, very + unlikely, just dreaming here) + <braunr> cluck: i'd say it's likely + <youpi> I quickly tried to build it during the talk + <youpi> there are PATH_MAX everywhere + <braunr> ugh + <youpi> but maybe that can be #defined + <youpi> since that's most probably for internal use + <youpi> not interaction with the host diff --git a/open_issues/virtualization/fakeroot.mdwn b/open_issues/virtualization/fakeroot.mdwn index f9dd4756..7856e299 100644 --- a/open_issues/virtualization/fakeroot.mdwn +++ b/open_issues/virtualization/fakeroot.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -65,3 +66,1224 @@ License|/fdl]]."]]"""]] < youpi> 
that's why we still use fakeroot-sysv < teythoon> right < youpi> err, -tcp + + +## IRC, freenode, #hurd, 2013-11-18 + + <teythoon> I believe I figured out the argv[0] issue with fakeroot-hurd + <teythoon> but I'm not sure how to fix this + <teythoon> first of all, Emilios file_exec_file_name patch set works fine + <teythoon> but not with fakeroot + <teythoon> + http://git.sceen.net/hurd/hurd.git/blob/HEAD:/exec/hashexec.c#l300 + <teythoon> check_hashexec tries to locate the script file using a heuristic + <teythoon> Emilios patch improves the situation with just providing this + information + <teythoon> but then, the identity port of the "discovered" file is compared + with the id port of the script file + <teythoon> to verify if the heuristic found the right file + <teythoon> but when using fakeroot-hurd, /hurd/fakeroot proxies all + requests + <teythoon> but the exec server is outside of the /hurd/fakeroot + environment, so it gets the id port from the real filesystem + <teythoon> we could skip that test if the script name is explicitly + provided though + <teythoon> that test was meant to see whether a search through $PATH turned + up the right file + <braunr> teythoon: nice + <teythoon> braunr: thanks :) + <teythoon> unfortunately, dpkg-buildpackaging hurd with it still fails for + some reason + <teythoon> but it is faster than fakeroot-tcp :) + <braunr> even chown ? + <braunr> or chmod ? + <teythoon> dunno in detail, but the whole build is faster + <braunr> if you can try it, i'm interested + <braunr> because chown/chmod is also slow on linux with fakeroot-tcp + <teythoon> i can try... 
+ <braunr> so it's probably not a hurd bug + <teythoon> braunr: yes, it really is + <braunr> no i mean + <braunr> chown/chmod being slow with fakeroot-tcp is probably not a hurd + bug + <braunr> but a fakeroot-tcp bug + <teythoon> chowning all files in /usr/bin takes 5.930s with fakeroot-hurd + (6.09 with startup overhead) vs 26.42s (26.59s) with fakeroot-tcp + <braunr> but try it on linux (fakeroot-tcp i mean) + <braunr> although you may want to do it on something you don't care much + about :p) + + +## IRC, freenode, #hurd, 2013-12-03 + + * teythoon is gonna hunt a fakeroot bug ... + <teythoon> % fakeroot-hurd /bin/sh -c ":> /tmp/some_file" + <teythoon> /bin/sh: 1: cannot create /tmp/some_file: Is a directory + <braunr> ah fakeroot-hurd + <teythoon> prevents installing stuff with /bin/install + <teythoon> sure fakeroot-hurd, why would i work on the slow one ? + <braunr> i don't know + <braunr> because it makes chmod/chown/maybe others horrenddously slow + <braunr> ? + <teythoon> yes, fixing this involves fixing fakeroot-hurd + <braunr> are you sure ? + <braunr> i prefer repeating just in case: i saw that problem on linux as + well + <braunr> with fakeroot-sysv + <teythoon> so ? + <braunr> i'm almost certain it's a pure fakeroot bug, not a hurd bug + <braunr> so + <teythoon> even if this is fixed, it still has to pay the socket + communication overhead + <braunr> fixing fakeroot-hurd so that i can be used instead of fakeroot-tcp + is a very good thing to do, obviously + <braunr> it* + <braunr> but it won't solve the chown/chmod speed + <braunr> (or, probably won't) + <teythoon> huh, why not ? + <braunr> 15:53 < braunr> i'm almost certain it's a pure fakeroot bug, not a + hurd bug + <braunr> when i say it's slow, i should be more precise + <braunr> it doesn't show up in top + <teythoon> yes, but why would fakeroot-hurd suffer from the same issue ? 
+ <braunr> the cpu is almost idle + <braunr> oh right, it's a completely different tool + <braunr> my bad + <braunr> right, right, the proper way to implement fakeroot actually :) + <teythoon> yes + <teythoon> this will bring near-native speed + + +## IRC, freenode, #hurd, 2013-12-05 + + <teythoon> fakeroot-hurd just successfully built mig :) + <teythoon> hangs in dh_gencontrol when building gnumach or hurd though + <teythoon> i believe it hangs waiting for a lock + <teythoon> lock like in file lock that is + <teythoon> braunr: no more room for vm_map_find_entry in 80220a40 + <teythoon> 80220a40 <- is that a task ? + <braunr> or a vm_map, not sure + <braunr> probably a vm_map + + +## IRC, freenode, #hurd, 2013-12-06 + + <teythoon> well, aren't threads a source of endless entertainment ... ? + <teythoon> well, I found three more bugs in fakeroot-hurd + <teythoon> one of them requires fixing the locking used in fakeroot + <braunr> ouch + <teythoon> the current code does some lock cycling to aquire a lock out of + order + <braunr> cycling ? + <teythoon> in the netfs_node_norefs function + <teythoon> release and reaquire + <braunr> i see + <teythoon> which imho should be better solved with a weak reference + <teythoon> working on it, it no longer deadlocks but i broke something else + ... + <teythoon> endless fun ;) + <braunr> such things could have been done right in the beginning + <braunr> ... + <teythoon> yes, I wonder + <teythoon> libports has weak references + <teythoon> but pflocal is the only user + <braunr> hm + <teythoon> none of the lib*fs support that + <braunr> didn't i add one in libdiskfs too ? + <braunr> anyway, irrelevant + <braunr> weak references are a nice feature + <braunr> teythoon: i don't see the cycling you mentioned + <braunr> only netfs_node_refcnt_lock being dropped temporarily + <teythoon> yep, that one + <teythoon> line 145 + <teythoon> note that due to another bug this code is currently never run + <braunr> how surprising .. 
+ <braunr> the note about some leak actually gave a hint about that + <teythoon> yeah, that leak + <teythoon> I think i'm actually very close + <teythoon> it's just so frustrating, i thought i got it last night + <braunr> good luck then + <teythoon> thanks :) + + +## IRC, freenode, #hurd, 2013-12-09 + + <teythoon> sweet, i fixed fakeroot-hurd :) + <braunr> /clap + <braunr> what was the problem ? + <teythoon> lots + <braunr> i see + <teythoon> it's amazing it actually run as well as it did + <braunr> mess strikes again + <braunr> i hate messy code .. + * teythoon is building half a hurd package using this ... stay tuned ;) + <azeem> teythoon: is this going to make building faster as well? + <teythoon> most likely, yes + <teythoon> fakeroot-tcp is known to be slow, even on linux + <braunr> teythoon: are you sure about the transparent retry patch ? + <teythoon> pretty sure, why ? + <braunr> it's about a more general issue that we didn't fix yet + <braunr> our last discussions about it lead us to agree that clients should + check the identity of a server before interacting with it + <teythoon> braunr: i don't understand, what's the problem here ? + <braunr> teythoon: fakeroot does the lookup itself, doesn't it ? + <teythoon> yes + <braunr> teythoon: but was that also the case before your patch ? + <teythoon> braunr: yes + <braunr> teythoon: then ok + <braunr> teythoon: i guess fakeroot handles requests only for a specific + set of calls right ? + <braunr> and for others, requests are directly relayed + <teythoon> braunr: yes + <braunr> and that still is the case, right ? + <teythoon> yes + <braunr> ok + <braunr> looks right since it only affects lookups + <braunr> ok then + <teythoon> well, fakeroot-hurd built half a hurd package in less than 70 + minutes + <teythoon> a new record for my box + <braunr> compared to how much before ? + <braunr> (and why half of it ?) + <teythoon> unfortunately it hung after signing the packages... 
some perl + process with a /usr/bin/tee child + <teythoon> killing tee made it succeed though + <teythoon> braunr: i don't build the udeb package + <braunr> oh ok + <teythoon> braunr: compared with ~75 with fakeroot-tcp and my demuxer + rework, ~80 before + <braunr> teythoon: nice + + +## IRC, freenode, #hurd, 2013-12-18 + + <teythoon> there, i fixed the last fakeroot-hurd bug + <teythoon> *whee* :) + <teythoon> i thought so many times that i got the last fakeroot bug ... + <teythoon> last as in it's in a good enough shape to compile the hurd + package that is + <teythoon> but now it is + <braunr> :) + <braunr> this will make glibc and others so much faster to build + + +## IRC, freenode, #hurd, 2013-12-19 + + <braunr> teythoon_: hum, you should make the behaviour of fakeroot-hurd on + the last client exiting optional + <teythoon_> y? + <teythoon_> fakeroot-tcp does the very same thing + <braunr> fakeroot-hurd is different + <braunr> it's part of the file system + <teythoon_> yes + <braunr> users may want it to stay around + <braunr> and reuse it without checking it's actually there + <teythoon_> but once the last client is gone, who is ever getting another + port to it ? + <teythoon_> no + <teythoon_> that cannot happen + <braunr> really ? + <teythoon_> yes + <braunr> i thought it was like remap + <braunr> since remap is based on it + <teythoon_> the same thing applies to remap + <teythoon_> only settrans has the control port + <braunr> hum + <teythoon_> and uses it once to get a protid for the working dir of the + initial process started inside the chrooted environment + <braunr> you may not want to chroot inside + <teythoon_> so ? 
+ <teythoon_> then, you get another protid + <braunr> i'll make an example + <braunr> i create a myroot directory implemented by fakeroot + <braunr> populate it + <braunr> leave and do something else, + <braunr> i might want to return to it later + <teythoon_> ah + <teythoon_> ok, so you are not using settrans --chroot + <braunr> or maybe i'm confusing the fakeroot translator and fakeroot-hurd + <braunr> 10:48 < braunr> you may not want to chroot inside + <braunr> yes + <teythoon_> hm + <teythoon_> ok, so the patch could be changed to check whether the last + control port is gone too + <braunr> i have no idea of any practical use, but i don't see a valid + reason to make a translator go away just because it has no client + <braunr> except for resource usage + <braunr> and if it's installed as a passive translator + <braunr> although that would make fakeroot loose its state + <braunr> though remap state is on the command line so it would be fine for + it + <braunr> see what i mean ? + <teythoon_> yes i do + <braunr> fakeroot state could be saved in some db one day so it may apply, + if anyone feels the need + <teythoon_> so what about checking for control ports too ? + <braunr> i'm not too familiar with those + <braunr> who has the control port of a passive translator ? the parent ? + <teythoon_> that should cover the use case you described + <teythoon_> for the parent translator + <teythoon_> for fsys_getroot requests it has to keep it around + <teythoon_> and for more fsys stuff too + <braunr> and if active ? settrans ? who just looses it ? 
+ <teythoon_> if settrans is used to start an active translator, the parent + fs still gets a right to the control port + <braunr> ok + <braunr> i don't have a clear view of what this implies for fakeroot-hurd + <braunr> we'd want fakeroot-hurd to clean all resources including the + fakeroot translator on exit + <teythoon_> for fakeroot-hurd (or any child translator) this means that a + port from the control port class will still exists + <teythoon_> so we do not exit + <teythoon_> oh, you're speaking of fakeroot.sh ? the wrapper script ? + <braunr> probably + <braunr> for me, fakeroot-hurd is the command line too, similar to + fakeroot-sysv and fakeroot-tcp + <braunr> and fakeroot is the translator + <teythoon_> yes, agreed + <teythoon_> fakeroot-hurd could use settrans --force --chroot ... to force + fakeroot to exit if the main chrooted process dies + <teythoon_> but that'd kill anything that outlives that process + <teythoon_> that might be legitimate, say a process daemonized + <teythoon_> so detecting that noone uses fakeroot is the much cleaner + solution + <braunr> ok + <teythoon_> also, that's what fakeroot-tcp does + <braunr> which is why i suggested an option for that + <teythoon_> why add an option if we can do the right thing without + troubling the user ? + <braunr> ah, if we can, good + <teythoon_> i think we can + <teythoon_> I'll rework the patch, thanks for the hint + <braunr> so + <braunr> just to be clear + <braunr> the way you intend it to work is + <braunr> wait for all clients and the control port to drop before shutting + down + <braunr> the control port is dropped when dettaching the translator, right + ? + <teythoon_> yes + <braunr> but hm + <braunr> what if clients spawn other processes ? 
+ <braunr> they won't find the translator any more + <teythoon_> then, that client get's a port to fakeroot at least for it's + working dir + <teythoon_> so another protid is created + <braunr> ah yes, it's usually choorted for such uses + <braunr> chrooted + <teythoon_> so fakeroot will stick around + <braunr> but clients, even from fakeroot, might simply use absolute paths + <teythoon_> so ? + <braunr> in which case they won't find fakeroot + <teythoon_> it will hit fakeroots dir_lookup + <teythoon_> sure + <braunr> how so ? + <teythoon_> if the path is absolute, it will trigger a magic retry of some + kind + <teythoon_> so the client uses it's root dir port + <braunr> i thought the lookup would be done straight from the root fs port + .. + <teythoon_> which points to fakeroot of course + <braunr> ah, chrooted again + <teythoon_> that's the whole point + <braunr> so this implies clients are chrooted + <teythoon_> they are + <teythoon_> even if you do another chroot + <braunr> what i mean is + <teythoon_> that root port also points to a fakeroot port + <braunr> if we detach the translator, and clients outside the chroot spawn + processes, say shell scripts, they won't find the fakeroot tree + <braunr> now, i wonder if we want to actually handle that + <braunr> i'm just uncomfortable with a translator silently shutting down + because it has no client + <teythoon_> if fakeroot is detached, how are clients outside the chroot + ever supposed to get a handle to files inside the fakerooted env ? 
+ <braunr> it makes sense for fakeroot, so the expected behaviours here aer + conflicting + <braunr> they had those before fakeroot being detached + <teythoon_> then fakeroot wouldn't go away + <braunr> right + <braunr> unless there is a race but i don't think there is + <teythoon_> there isn't + <teythoon_> i call netfs_shutdown + <braunr> clients get the rights before the parent has a chance to terminate + <teythoon_> and only shutdown if it doesn't return ebusy + <braunr> makes sense + <braunr> ok go ahead :) + <teythoon_> cool, thanks for the walk-through ;) + <braunr> on the other hand .. + <braunr> that's a complicated topic left unfinished by the original authors + <teythoon_> one of many + <braunr> having translators automatically go away when there is no client + may be a good feature + <braunr> but it only makes sense for passive translators + <braunr> and this should be automated + <braunr> the lib*fs libraries should be able to handle it + <teythoon_> or, we could go for proper persistence instead + <braunr> stay around if active, leave after a while when no more clients if + passive + <braunr> why ? + <teythoon_> clean solution + <braunr> persistence looks much more expensive to me + <teythoon_> other benefits + <braunr> i mean + <braunr> persistence looks so expensive it doesn't make sense in a general + purpose system + <teythoon_> sure, we could make our *fs libs handle this smarter at a much + lower cost + <teythoon_> don't we get a handle to the underlying file ? + <braunr> i think we do yes + <teythoon_> if that's actually a file and not a directory, we could store + data into it + <braunr> many translators are read-only + <teythoon_> so ? 
+ <braunr> well, when we can write, we can use passive translators instead + <braunr> normally + <teythoon_> yes + <braunr> depends on the fs type actually but you're right, we could use + regular files + <braunr> or a special type of file, i don't know + <antrik> braunr: BTW, I agree that active translators should only go away + when no ports are open anymore, while passive ones can exit when control + ports are still open but no protids + <teythoon> antrik: you mean as a general rule ? + <teythoon> that leaves the question how the translator distinguishes + between having a passive translator record and not having one + <antrik> I believe I already arrived at that conclusion in some design + discussion, probaly regarding namespace-based translator selection + <antrik> teythoon: yeah, as a general rule + <teythoon> interesting + <antrik> currently there are command line arguments controling timeouts, + but they don't consider control ports IIRC + <teythoon> i thought there are problems with shutting down translators in + general + <antrik> (also, command line arguments seem inconvenient to distinguish the + passive vs. active case...) + <teythoon> yeah, but we disregard the timeouts in the debian flavor of hurd + <antrik> teythoon: err... no we don't. at least not last time I knew. are + you confusing this with thread timeouts? + <antrik> simple test: do ls -l on /dev, wait a few minutes, compare + <teythoon> what do you expect will happen ? 
+    <antrik> the unused translators should go away
+    <teythoon> no
+    <antrik> that must be new then
+    <teythoon> might be, yes
+    <teythoon>
+      http://darnassus.sceen.net/gitweb/teythoon/packaging/hurd.git/blame/HEAD:/debian/patches/libports_stability.patch
+    <braunr> antrik: debian currently disables both the global and thread
+      timeouts in libports
+    <braunr> my work on thread destruction consists in part in reenabling
+      thread timeouts, and my binary packages do that well so far :)
+    <antrik> braunr: any idea why the global timeouts were disabled?
+
+
+## IRC, freenode, #hurd, 2013-12-20
+
+    <braunr> antrik: not sure
+    <braunr> but i suspect there could be races
+    <braunr> if a message arrives while the server is going away, i'm not sure
+      the client can determine this and retry transparently
+    <antrik> good point... not sure how that is supposed to work exactly
+
+
+## IRC, freenode, #hurd, 2013-12-31
+
+    <braunr> btw, we should remove the libports_stability patch and directly
+      change the upstream code
+    <braunr> if you agree, i can force the global timeout to 0 (because we're
+      still not sure what can go wrong when a translator goes away while a
+      message is being delivered to it)
+    <braunr> i didn't experience any slowdown with thread destruction however
+    <braunr> so i'm tempted to set that to an actual reasonable timeout value
+      of 30-300 seconds
+    <teythoon> braunr: if you do, please introduce a macro for the default
+      value so it can be changed easily
+    <braunr> teythoon: yes
+    <braunr> i don't understand why these are left as parameters tbh
+    <teythoon> true
+    <braunr> 30 seconds seems to be plenty enough
+
+
+## IRC, freenode, #hurd, 2014-01-17
+
+    <braunr> time to give fakeroot-hurd a shot
+    <braunr> http://darnassus.sceen.net/~rbraun/darnassus_fakeroot_hurd_assert
+    <teythoon> braunr: (wrt fakeroot-hurd) well in my book that shouldn't
+      happen
+    <teythoon> that's why i put the assertion there ;)
+    <braunr> i assumed so :)
+    <teythoon> then again, /me does not agree with "threads" as concurrency
+      model >,<, and that feeling seems to be mutual :p
+    <braunr> ?
+    <teythoon> well, obviously, the threads do not agree with me wrt to that
+      assertion
+    <braunr> the threads ?
+    <teythoon> well, fakeroot is a multithreaded server
+    <braunr> teythoon: i'm not sure i get the point, are you saying you're not
+      comfortable with threads ?
+    <teythoon> that's exactly what i'm saying
+    <braunr> ok
+    <braunr> coroutines/functional i guess ?
+    <teythoon> csp
+    <teythoon> functional not so much
+
+
+## IRC, freenode, #hurd, 2014-01-20
+
+[[open_issues/libpthread]],
+[[open_issues/libpthread/t/fix_have_kernel_resources]].
+
+    <braunr> teythoon: it's perfectly possible that the bug i had with
+      fakeroot-hurd have been caused by my own glibc thread related patches
+    <braunr> has*
+    <teythoon> ok
+    <teythoon> *phew* :p
+    <braunr> :)
+    <teythoon> i wonder if youpi could reproduce his issue on his machine
+    <braunr> what issue ?
+    <braunr> i must have missed something
+    <teythoon> some package failed
+    <teythoon> but he didn't gave any details
+    <teythoon> he wanted to try it on his vm first
+    <braunr> ok
+
+
+## IRC, freenode, #hurd, 2014-01-21
+
+    <braunr> teythoon: i still get the same assertion failure with
+      fakeroot-hurd
+    <braunr> will take a look at that sometimes too
+    <teythoon> braunr: hrm :/
+    <braunr> teythoon: don't worry, i'm sure it's nothing big
+    <braunr> in the mean time, there are updated hurd and glibc packages on my
+      repository with fixed tls and thread destruction
+    <teythoon> cool :)
+
+
+## IRC, freenode, #hurd, 2014-01-23
+
+    <braunr> teythoon: can you briefly explain this fake reference thing in
+      fakeroot when you have some time please ?
+    <teythoon> braunr: fakeroot creates ports to hand out to clients
+    <teythoon> every port represents a node and references a real node
+    <teythoon> fakeroot allows one to set attributes, e.g.
file permissions on + any node as if the client was root + <teythoon> those faked attributes are stored in the node objects + <braunr> let's focus on fake_reference please + <teythoon> once some attribute is faked, that node has to be kept alive + <teythoon> otherwise, that faked information is lost + <teythoon> so if the last peropen object is closed and some information is + faked, a fake reference is kept + <teythoon> as indicated by a flag + <braunr> hm + <teythoon> in dir lookup, if a node is looked-up that has a fake reference, + it is recycled, i.e. the flag cleared and the referecne count is not + incremented + <teythoon> so every time fakeroot_netfs_release_protid is called b/c, the + node in question should not have the fake reference flag set + <braunr> what's the relation between the number of hard links and this fake + reference ? + <teythoon> i don' + <teythoon> i don't think fakeroot has a notion of 'hard links' + <braunr> it does + <braunr> the fake reference is added on nodes with a hard link count + greater than 0 + <braunr> but i guess that just means the underlying node still exists + <teythoon> ah yes + <teythoon> right + <teythoon> currently, if the real node is deleted, the fake node is still + kept around + <braunr> let's say it's ok for now + <teythoon> that's what the comment is talking about, the one that indicates + that garbage collection could help here + <teythoon> yes + <teythoon> properly fixing this is difficult + <braunr> agreed + <braunr> it would require something like inotify anyway + <teythoon> b/c of the way file deletion works + <braunr> let's just ignore the issue, that's not what i'm hunting + <teythoon> agreed + <braunr> the assertion i have is telling us that we're dropping a fake + reference + <braunr> are we certain this isn't possible ? 
+ <teythoon> that function is called if a client dereferences a port + <teythoon> in order to have a port in the first place, it has to get it + from a dir_lookup + <teythoon> the dir lookup turns a fake reference into a real one + <teythoon> so i'm certain of that (barring a race condition somewhere) + <braunr> ok + <braunr> netfs_S_dir_lookup grabs idport_ihash_lock (line 354) but doesn't + release it if nn == NULL (lines 388-392) + <teythoon> hm, my file numbers are slightly different o_O + <braunr> i have printfs around + <braunr> sorry :) + <teythoon> ok + <teythoon> new node unlocks it + <teythoon> new_node + <braunr> oh + <braunr> how unintuitive .. + <teythoon> yes, don't blame me ;) that's how it was + <braunr> :) + <braunr> worse, the description says "if successful" .. + <braunr> ah no, the node lock + <braunr> ok + <teythoon> yes, badly worded description + <braunr> i strongly doubt it's a race + <teythoon> how do you trigger that assertion failure ? + <braunr> dpkg-buildpackage -rfakeroot-hurd -uc -us + <braunr> for the hurd package + <braunr> very similar to one of your test cases i think + <teythoon> umm :-/ + <braunr> one thing that i find confusing is that fake_reference seems to + apply to nodes, whereas release_protid is about, well, protids + <braunr> is there a 1:1 relationship ? + <braunr> since there is a peropen in the protid, i assume not + <braunr> it may be a race actually + <braunr> np->references must be accessed with netfs_node_refcnt_lock locked + <braunr> hm no, that's not it + <teythoon> no, it's not a 1:1 relationship + <teythoon> note that the lock idport_ihash_lock serializes most operations, + despite it's name indicating that it's only for the hash table + <teythoon> the "interesting" operations being dir_lookup and release_protid + <braunr> yes + <braunr> again, that's another issue + <teythoon> why ? 
that's a pretty strong guarantee already + <braunr> ah yes, i was referring to scalability + <teythoon> sure + <braunr> the assertion is triggered from ports_port_deref in + ports_manage_port_operations_multithread + <teythoon> but i found it hard to reason about fakeroot, there are multiple + locks involved, two kinds of reference counting across different libs + <braunr> yes + <teythoon> yes, that's to be expected + <braunr> teythoon: do we agree that the fake reference is reused by a + protid ? + <teythoon> braunr: yes + <braunr> why is there a ref counter for the protid as well as the peropen + then ? :/ + <teythoon> funny... i thought there was no refcnt for the peropen objects, + but there is + <teythoon> but for fakeroot-hurd that shouldn't matter, right ? + <braunr> i don't know + <teythoon> here, one protid object is associated with one peropen object + <braunr> yes + <teythoon> and the other way around, i.e. it's 1:1 + <teythoon> so the refcount for those should be identical + <braunr> but i get a case where protid has a refcnt of 0 while the peropen + has 2 .. + <teythoon> umm, that doesn't sound right + <braunr> teythoon: ok, it does look like a race on np->references + <braunr> node references are protected by a global lock in lib*fs libs + <teythoon> yes + <braunr> you check it without holding it + <braunr> which means another protid can be closed at the same time, setting + the flag on the underlying node + <braunr> i'll make a proper patch soon + <teythoon> they cannot both hold the hash lock + <braunr> hm + <braunr> teythoon: actually, i don't see why that's relevant + <braunr> one thread closes its protid, sets the fakeref flag + <braunr> the other does the same, chokes on the assertion + <braunr> serially + <teythoon> i'm always a little fuzzy when exactly the references get + decremented + <teythoon> but shouldn't only the second thread set the fakeref flag ? 
+ <braunr> well, that's not what i see + <braunr> i'll check what happens to this ref counter + <teythoon> see how my release_protid function calls netfs_release_protid + just after the out label + <teythoon> *while holding the big hash lock + <teythoon> so, any refcounting should happen while the lock is being held, + no ? + <braunr> perhaps + <braunr> now, my logs show something new + <braunr> a case where the protid being released was never printed before + <braunr> i.e. not obtained from dir_lookup + <braunr> or at least, not fakeroot dir_lookup + <teythoon> huh, where did it came from then ? + <braunr> no idea + <teythoon> only dir_lookup hands out those + <braunr> check_openmodes calls dir_lookup too + <teythoon> yes, but that's not our dir_lookup + <braunr> that's what i mean + <braunr> it bypasses fakeroot's custom dir_lookup + <braunr> but i guess the reference already exists at this point + <teythoon> bypass ? i wouldn't call it that + <braunr> you're right, wrong wording + <teythoon> it accesses files on other translators + <braunr> yes + <braunr> the netnode is already present + <teythoon> yes + <braunr> could it be the root node ? + <teythoon> i do not believe so + <teythoon> the root node is always faked + <teythoon> and is handed out to the first process in the fakeroot env for + it's current directory port + <teythoon> so you could try something that chdirs away to test that + hypothesis + <braunr> the assertion looks triggered by a chdir + <teythoon> how do you know that ? + <braunr> dh_auto_install: error: unable to chdir to build-deb + <teythoon> ah + <teythoon> well, or that is just the operation after fakeroot died and + completely unrelated + <braunr> maybe + <teythoon> can you trigger this reliably ? + <braunr> yes + <braunr> i'm trying to write a shell script for that + <teythoon> so for you, fakeroot-hurd never succeeded in building a hurd + package ? + <braunr> no + <teythoon> on darnassus ? 
+ <braunr> yes + <teythoon> b/c i stopped working on fakeroot-hurd when it was in a + good-enough shape to build the hurd package + <teythoon> >,< + <teythoon> maybe my system is not fast enough to hit this race (if it turns + out to be one) + <braunr> some calls seems to decrease the refcount of the root node + <braunr> call* + <teythoon> have you confirmed that it's the root node ? + <braunr> almost + <braunr> i could say yes + <braunr> teythoon: actually no, it's not .. + <braunr> could be .. + <braunr> teythoon: on what node does fakeroot-hurd install the fakeroot + translator when used to build debian packages ? + <braunr> hum + <braunr> could it simply be that the check on np->references should be + moved above the assertion ? + <teythoon> braunr: it is not bound to any node, check settrans --chroot + <braunr> oh right + <braunr> teythoon: ok i mean + <braunr> does it shadow / ? + <braunr> looks very likely, otherwise the chroot wouldn't work + <teythoon> i'm not sure what you mean by shadow + <braunr> settrans --chroot cmd -- / /hurd/fakeroot ? + <teythoon> but yes, for any process in the chroot-like env every real node + is replaced, including / + <braunr> makes sense + <braunr> teythoon: moving the assertion seems to fix the issue + <braunr> intuitively, it seems reasonable to assume the fakeref flag can + only be set when there is only one reference, namely the fake reference + <braunr> (well, the fake ref, recycled by the last open) + <teythoon> no, i don't follow + <teythoon> i'd still say, that if ...release_protid is called, then there + is no way for the fake flag to be set in the first place + <teythoon> that's why i put the assertion in ;) + <braunr> on the other hand, you check the refcnt precisely because other + threads may have reacquired the node + <teythoon> but why would moving the assertion change anything ? 
+ <teythoon> if we would do that, we'd "lose" all threads that see + np->reference being >1 + <teythoon> but for those objects the fake_reference flag should never be + set anyways + <teythoon> i cannot see why this would help + <teythoon> (does it help ?) + <teythoon> (and if it does, it points to a serious problem imho) + <braunr> i'm recreating the traces that made me think that + <braunr> to get a clearer view of what's happening + <braunr> the problem i have with the current code is this + <braunr> there can be multiple protid referring to the same node occurring + at the same time + <braunr> they are serialized by the hash table lock, ok + <braunr> but there apparently are cases where the first (of two) protids + being closed sets the fakeref flag + <braunr> and the following chokes because the flag is set + <braunr> i assume you put this refcount check because you assumed only the + last protid being closed can set the flag, right ? + <braunr> but then, why > 1 ? why not > 0 ? + <teythoon> yes, that's what i was trying to assert + <teythoon> b/c the 1 is our reference + <braunr> which one exactly ? + <teythoon> >1 is anyone *beside* us + <teythoon> ? + <braunr> hm + <braunr> you mean the reference held by the protid being destroyed + <teythoon> yes + <braunr> isn't that reference already dropped before calling the cleanup + function ? 
+ <braunr> ah no, it's the node ref + <teythoon> yes + <braunr> released by netfs_release_protid + <teythoon> exactly + <braunr> which is called without the hash table lock held + <braunr> hm no + <braunr> it's locked + <braunr> damn my brain is slow today + <teythoon> i actually think that it's the combination of manual reference + counting and the primitive concurrency model that makes it hard to reason + about this + <braunr> well + <braunr> the model is quite simple too + <braunr> accesses to refcounters must be protected by the appropriate lock + <braunr> this isn't done here, on the assumption that all referencing + operations are protected by another global lock all the time + <teythoon> even if a model is simple, this does not mean that it is a good + model for human beings to comprehend and reason about + <braunr> i don't know + <braunr> note that netfs_drop_node is designed to be called with + netfs_node_refcnt_lock locked + <braunr> implying the refcount must remain stable between checking it and + dropping the node + <braunr> netfs_make_peropen is called without the hash table lock held in + dir_lookup + <braunr> and this increases the refcount + <braunr> although the problem is rather that something decreases it without + the lock held + <teythoon> we should port libtsan and just ask gcc -fsanitize=thread + <braunr> what about the netfs_nput call at the end of dir_lookup ? 
+    <braunr> the fake ref should be set by the norefs function
+    <teythoon> that should not decrease the count to 0 b/c the caller holds a
+      reference too
+    <braunr> yes that's ugly
+    <braunr> ugh
+    <braunr> i'm unable to think clearly right now
+    <teythoon> as mentioned in the commit message, you cannot do something like
+      this in the norefs function
+    <teythoon> bbl ;)
+    <braunr> bye teythoon
+    <braunr> thanks for your time
+    <braunr> for when you come back :
+    <braunr> instead of maintaining this "fake" reference, why not assumeing
+      the hash table holds a reference, and simply count it
+    <braunr> the same way a cache does
+    <braunr> and drop that reference when removing a node, either to reflect
+      the current state of the underlying node, or because the translator is
+      being shut down ?
+    <braunr> why not assume*
+    <braunr> bbl too
+    <teythoon> sure, refactoring is definitively an option
+
+
+## IRC, freenode, #hurd, 2014-01-24
+
+    <braunr> teythoon: ok, i'll take care of fakeroot
+    <teythoon> braunr: thanks. tbh i was a little fed up with that little
+      bugger >,<
+    <braunr> i can imagine
+    <braunr> considering the number of patches you've sent already
+
+    <braunr> teythoon: are you sure about your call to fshelp_lock_init ?
+    <teythoon> yes, why do you ask ?
+    <teythoon> (the test case is given in the commit message)
+    <braunr> it doesn't look right to me to call "init" while the node is
+      potentially locked
+    <braunr> i noticed libdiskfs peropen release function takes care of
+      releasing locks
+    <braunr> it looks better to me
+    <teythoon> it's not about releasing the lock
+    <teythoon> it's about faking the file being closed which implicitly
+      releases the lock
+    <braunr> the file is being close
+    <braunr> closed
+    <braunr> since it's in the cleanup function
+    <teythoon> yes, but we keep it b/c the file has faked attributes
+    <teythoon> did you look at the problem description in the commit message ?
+ <braunr> we keep the node + <braunr> not the peropen + <teythoon> so ? + <teythoon> the lock is in the node + <braunr> why would libdiskfs do it in the peropen release then ? + <braunr> there is an inconsistency somwhere + <braunr> actually, the lock looks to be per open + <braunr> or rather, the lock is per node, but its status is recorded per + open + <braunr> allowing the implementation to track if a file descriptor was used + to install a lock and release it when that file descriptor goes away + <teythoon> why would the node be locked ? + <teythoon> locked in what way, file-locking locked ? + <braunr> yes + <braunr> posix explicitely says that file locks must be implicitely removed + when closing the file descriptor used to install them, so that makes + sense + <teythoon> isn't hat exactly what i'm doing ? + <braunr> no + <braunr> you're initializing the file lock + <braunr> init != unlock + <braunr> and it's specific to fakeroot, while it looks like libnetfs should + be doing it + <teythoon> libnetfs would do it + <teythoon> but we prevent that by keeping the node alive + <braunr> again, it's a per open thing + <braunr> and no, libnetfs doesn't release locks implicitely in the current + version + <teythoon> didn't we agree that for fakeroot one peropen object is + associated with one protid object ? + <braunr> yes + <braunr> and don't keep those alive + <braunr> so let them die peacefully, and fix libnetfs so it releases the + lock as it's supposed to + <braunr> and we* don't + <teythoon> we don't keep those alive + <teythoon> why would we ? 
+    <braunr> yes that's what i wanted to say
+    <braunr> what i mean is
+    <braunr> since letting peropens die is already what is being done
+    <braunr> there is no need for a special handling of locks in fakeroot
+    <teythoon> oh
+    <braunr> on the other hand, libnetfs must be fixed
+    <teythoon> ok, that might very well be true
+    <teythoon> (we need to bring libnetfs and diskfs closer so that they can be
+      diff'ed easily)
+    <braunr> i just wanted to check your reason for using lock_init in the
+      first place
+    <braunr> yes ..
+    <braunr> teythoon: also, i think we actually do have what's necessary to
+      deal with garbage collection
+    <braunr> namely, dead-name notifications
+    <braunr> i'll see if i can cook something simple enough
+    <braunr> otherwise, merely keeping every node around is also acceptable
+      considering the use cases
+    <teythoon> dead-name notifications won't help if the real node disappears,
+      no ?
+    <braunr> teythoon: dead name notifications on the real node port :)
+    <braunr> teythoon: at least i can reliably build the hurd package using
+      fakeroot-hurd now
+    <braunr> let's try glibc :)
+
+## IRC, freenode, #hurd, 2014-01-25
+
+    <teythoon> braunr: awesome :)
+    <braunr> teythoon: hm not sure :/
+    <braunr> darnassus got oom
+    <braunr> teythoon: could be unrelated though
+    <braunr> teythoon: something has apprently made /home unresponsive :(
+    <braunr> teythoon: i suspect bots hitting apache and in particular the git
+      repositories to have increased memory usage
+
+
+## IRC, freenode, #hurd, 2014-01-26
+
+    <braunr> teythoon: btw, fakeroot interacts very very badly with other netfs
+      file systems
+    <braunr> e.g., listing /proc through it creates lots of nodes
+    <braunr> i'm not yet sure how to fix that
+    <braunr> using a dead name notification doesn't seem appropriate (at least
+      not directly) because fakeroot holds a true reference that prevents the
+      deallocation of the target node
+
+
+## IRC, freenode, #hurd, 2014-01-27
+
+    <braunr> teythoon: good news
(more or less): fakeroot is actually leaking a + lot when crossing file systems + <braunr> which means if i fix that, there is a good chance we can use it to + build all packages with it + <braunr> -with it + <teythoon> what do you mean exactly ? + <braunr> if target nodes are from /, there is no such leak + <braunr> as soon as the target nodes are from another file system, ports + rights are leaked + <braunr> that's what fills the kernel allocator actually + <teythoon> oh, so dir_lookup leaks ports when crossing translator + boundaries ? + <braunr> seems so + <teythoon> yeah, that might very well be it + <teythoon> the dir_lookup logic in lib*fs is quite involved :/ + <braunr> yes, my simple attempts were unsuccessful + <braunr> but i'm confident i can fix it soon + <teythoon> that sounds good :) + <braunr> i also remove the fake_ref flag and replace it with "accounting + the reference in the hash table" as soon as a node is faked + <teythoon> fine with me + <braunr> these will be the expected leak + <braunr> but they're far less in numbers than what i observe + <braunr> and garbage collection can be implemented later + <braunr> although i would prefer notifications a lot more + <braunr> end of the news, bbl :) + <braunr> found it :> + <teythoon> braunr: -v ;) + <braunr> err = dir_lookup (...); + <braunr> if (dir != dnp->nn->file) mach_port_deallocate (mach_task_self (), + dir); + <braunr> in other words, deallocate ports for intermediate file system root + directories .. :) + <braunr> teythoon: currently building hurd and glibc packages + <braunr> but i intend to improve some more with the addition of a default + faked state + <braunr> so that only nodes with modified faked states are retained + <teythoon> how do you mark nodes as having the default faked state ? 
+    <braunr> i don't
+    <teythoon> ok, right, makes sense :)
+    <teythoon> this sounds awesome, thanks for following up on this
+    <braunr> i'm quite busy with other stuff so, with proper testing, it should
+      take me the week to get merged
+    <braunr> teythoon: well thanks for all the fixes you've done
+    <braunr> fakeroot was completely unusable before that
+    <teythoon> if you push your changes somewhere i'll integrate them into my
+      packages and test them
+    <braunr> ok
+    <braunr> implementing fakeroot -u could also be a good thing
+    <braunr> and this should work easily with that default faked state strategy
+
+
+## IRC, freenode, #hurd, 2014-01-28
+
+    <braunr> teythoon: i should be able to test fakeroot-hurd with the default
+      faked attributes strategy today on glibc
+    <teythoon> braunr: very nice :)
+    <braunr> azeem_: do you happen to know if fakeroot -u is used by debian ?
+    <braunr> i mean when building packages
+    <teythoon> braunr: how does fakeroot-hurd perform on darnassus ?
+    <teythoon> i mean, does it yield a noticeable improvement over fakeroot-tcp
+      just like on my slow box ?
+    <braunr> i'm not measuring that :/
+    <teythoon> ok, no problem
+    <braunr> and since nodes are removed from the hash table, performance might
+      decrease slightly
+    <braunr> but the number of rights is kept very low, as expected
+    <teythoon> that's good
+    <braunr> i keep seeing leaks though
+    <braunr> when switching cwd between file systems
+    <teythoon> humm
+    <braunr> so i assume something is wrong with the identity of . or ..
+    <braunr> it's so insignificant compared to the previous problems that i
+      won't waste time on that
+    <braunr> teythoon: the problem with measuring on darnassus is that it's a
+      public machine
+    <teythoon> right
+    <braunr> often scanned by ssh worms or http bots
+
+[[cannot_create__dev_null__interrupted_system_call]].
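The "default faked state" strategy from the 2014-01-27 and 2014-01-28 discussion (count the hash table's reference explicitly, and retain only nodes whose faked attributes were actually modified) can be sketched roughly as follows. This is a minimal model, not fakeroot's code: `struct node`, `node_get`, `node_put`, `node_fake_owner`, `TABLE_SIZE` and `live_nodes` are all hypothetical names, the table is a plain array instead of libihash, there is no locking, and the default state is assumed to be uid 0 / gid 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TABLE_SIZE 8            /* hypothetical; tiny on purpose */

struct attrs { unsigned uid, gid; };

struct node {
  int id;                       /* stands in for the real node's identity */
  int refs;                     /* client refs, plus one while in the table */
  bool in_table;                /* true only while faked state is non-default */
  struct attrs faked;
};

static struct node *table[TABLE_SIZE];
static int live_nodes;          /* allocated nodes, for inspection only */

/* Assumption: the default faked state is "owned by root". */
static bool attrs_are_default (const struct attrs *a)
{
  return a->uid == 0 && a->gid == 0;
}

/* Return a referenced node for ID: the retained one if its state was
   faked, otherwise a fresh node carrying the default state.  */
static struct node *node_get (int id)
{
  for (size_t i = 0; i < TABLE_SIZE; i++)
    if (table[i] != NULL && table[i]->id == id)
      {
        table[i]->refs++;
        return table[i];
      }
  struct node *n = calloc (1, sizeof *n);   /* zeroed == default state */
  if (n == NULL)
    return NULL;
  n->id = id;
  n->refs = 1;
  live_nodes++;
  return n;
}

/* Drop one reference; the node is freed only when nobody, including
   the table, references it any more.  */
static void node_put (struct node *n)
{
  assert (n->refs > 0);
  if (--n->refs == 0)
    {
      assert (!n->in_table);
      free (n);
      live_nodes--;
    }
}

/* Fake the owner.  The table takes a counted reference as soon as the
   state stops being the default, and gives it back when the state
   returns to the default.  Returns -1 if the table is full.  */
static int node_fake_owner (struct node *n, unsigned uid, unsigned gid)
{
  struct attrs wanted = { uid, gid };
  bool keep = !attrs_are_default (&wanted);

  if (keep && !n->in_table)
    {
      size_t i = 0;
      while (i < TABLE_SIZE && table[i] != NULL)
        i++;
      if (i == TABLE_SIZE)
        return -1;
      table[i] = n;
      n->in_table = true;
      n->refs++;                /* the table's own reference */
    }
  else if (!keep && n->in_table)
    {
      for (size_t i = 0; i < TABLE_SIZE; i++)
        if (table[i] == n)
          table[i] = NULL;
      n->in_table = false;
      n->refs--;                /* caller still holds its own reference */
    }
  n->faked = wanted;
  return 0;
}
```

In this model a node that was looked up but never modified disappears with its last client reference, while a node with a faked owner survives until its state is reset. That is the same trade-off mentioned in the log: fewer retained nodes and port rights, at the cost of recreating default nodes on each lookup.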
+
+    <braunr> but it makes complete sense to get better performance with
+      fakeroot-hurd
+    <braunr> that's actually one of the reasons i'm working on it
+    <braunr> if not the main one
+    <teythoon> :)
+    <teythoon> that was my motivation too
+    <braunr> it shows how you can get an interchangeable unix tool that
+      directly plugs well with the low level system
+    <braunr> and make it work better
+    <teythoon> nicely put :)
+
+    <braunr> teythoon: i still can't manage to build glibc with fakeroot-hurd
+    <braunr> but i'm not sure why :/
+    <braunr> there was no kernel memory exhaustion this time
+    <teythoon> :/
+    <braunr> cp: cannot create regular file `debian/libc-bin.dirs': Permission
+      denied
+    <braunr> hum
+    <braunr> youpi: do you know if building debian packages requires fakeroot
+      -u option ?
+    <youpi> I don't know
+    <gg0> braunr: man dpkg-buildpackage says it just runs "fakeroot
+      debian/rules <target>"
+    <gg0> sources confirm that
+      http://sources.debian.net/src/dpkg/1.17.6/scripts/dpkg-buildpackage.pl#L465
+    <braunr> gg0: ok
+
+
+## IRC, freenode, #hurd, 2014-01-29
+
+    <braunr> it seems that something sets the permissions of this
+      debian/libc-bin.dirs file to 000 ...
+    <teythoon> i've seen this too
+    <braunr> oh
+    <braunr> do you think it's a fakeroot-hurd bug ?
+    <teythoon> have i mentioned something like this in a commit message ?
+    <teythoon> yes
+    <teythoon> it is
+    <braunr> ok
+    <braunr> i didn't see any mention of it
+    <braunr> but i could have missed it
+    <teythoon> hm, i cannot recall it either
+    <teythoon> but i've seen this issue with fakeroot-hurd
+    <braunr> ok
+    <braunr> it's probably the last issue to fix to get it to work for our
+      packages
+    <braunr> teythoon: i think i have a solution for that last mode bug
+    <braunr> fakeroot doesn't relay chmod requests, unless they change an
+      executable bit
+    <braunr> i don't see the point, and simply removed that condition to relay
+      any chmod request
+    <teythoon> braunr: did it work ?
+    <braunr> no
+    <braunr> fakeroot still consumes too many ports
+    <braunr> and for each file, there are at least two ports, the faked one,
+      and the real one
+    <braunr> it should be completely reworked
+    <braunr> but i don't have time to do that
+    <braunr> i'll see if it works when building from scratch
+    <braunr> actually, it's not even a quantity problem but a fragmentation
+      problem
+    <braunr> the function that fails is kmem_realloc ..
+    <braunr> ipc spaces are arrays in kernel space ....
+    <teythoon> it's more like three ports per file, you forgot the identity
+      port
+    <braunr> ah yes
+
+
+## IRC, freenode, #hurd, 2014-02-03
+
+    <braunr> teythoon: i'll commit my changes on fakeroot tonight
+    <braunr> they do improve the tool, but not enough to build glibc with it
+    <teythoon> braunr: cool :), so how do we make it fully usable ?
+    <braunr> teythoon: i don't know ..
+    <braunr> i'll try re adding detection of nodes with no hard links for one
+    <braunr> but imho, it needs a rework based on what the real fakeroot does
+    <braunr> i won't work on it though
+
+    <braunr> teythoon: also, it looks like i've tested building glibc with a
+      wrong test binary of my fakeroot version :/
+    <braunr> so consider all test results irrelevant so far
+
+
+## IRC, freenode, #hurd, 2014-02-04
+
+    <braunr> fakeroot-hurd might turn out to be easily usable for our debian
+      packages with the fixed binary :)
+
+    <braunr> teythoon: hum, can you explain
+      672005782e57e049c7c8f4d6d0b2a80c0df512b4 (trans: fix locking issue in
+      fakeroot) when you have time please ?
+    <braunr> it looks like it introduces a deadlock by calling new_node (which
+      acquires the hash table lock) while dir is locked, violating the hash
+      table -> node locking order
+
+    <teythoon> braunr: awesome, then there still is hope for fakeroot-hurd :)
+
+    <braunr> teythoon: i've been able to build glibc packages several times
+      this night
+    <braunr> so except for this deadlock i've seen once, it looks good
+    <teythoon> right
+    <teythoon> that deadlock
+    <teythoon> right, it does indeed violate the partial order of the locks :-/
+
+    <braunr> teythoon: can you explain why you moved the lock in attempt_mkfile
+      please ?
+
+    <braunr> teythoon: i've just tested a fakeroot binary without the patch
+      introducing the deadlock, and glibc built without any problem
+    <teythoon> braunr: well, this is very good news :)
+    <braunr> teythoon: but i still wonder why you made this patch in the first
+      place, i don't want to revert it blindly and reintroduce a potential
+      regression
+    <teythoon> braunr: i thought i was fixing the order in which locks were
+      taken. if the commit message does not specify that it fixes an issue,
+      then i was probably just wrong and you can revert it
+    <braunr> oh ok
+    <braunr> good
+
+    <braunr> teythoon: another successful build :)
+    <braunr> i'll commit my changes
+    <teythoon> awesome :)
+    <braunr> there might still be concurrency issues but it's much better
+    <teythoon> i'm curious what you did :)
+    <braunr> so little :)
+    <braunr> i was sick all week heh
+    <braunr> you'll se
+    <braunr> see
+    <teythoon> well, that's good actually ;)
+    <braunr> yes
+
+    <braunr> teythoon: actually there was another debugging line left over, and
+      again, my test results are irrelevant @#!
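
The locking-order problem discussed in the 2014-02-04 log can be illustrated with a small model. This is only a sketch in Python, not the C code of `trans/fakeroot.c`: the names `RankedLock`, `LockOrderError`, `new_node_model`, `lookup_wrong_order` and `lookup_right_order` are hypothetical, and the only thing taken from the log is the rule that the hash table lock comes before a node lock. The wrapper raises an error where the real code would risk a deadlock.

```python
import threading

# Hypothetical ranks: a lock with a lower rank must be taken first.
# This models the "hash table -> node" order from the discussion.
RANK_HASH_TABLE = 0
RANK_NODE = 1


class LockOrderError(RuntimeError):
    """Raised when a thread requests a lock against the agreed order."""


class RankedLock:
    """A mutex that refuses to be acquired out of rank order."""

    _held = threading.local()  # per-thread list of ranks currently held

    def __init__(self, name, rank):
        self.name = name
        self.rank = rank
        self._lock = threading.Lock()

    def __enter__(self):
        ranks = getattr(RankedLock._held, "ranks", None)
        if ranks is None:
            ranks = RankedLock._held.ranks = []
        # Refuse the request if this thread already holds a higher-ranked lock.
        if ranks and max(ranks) > self.rank:
            raise LockOrderError(
                f"{self.name} (rank {self.rank}) requested while holding "
                f"a lock of rank {max(ranks)}"
            )
        self._lock.acquire()
        ranks.append(self.rank)
        return self

    def __exit__(self, exc_type, exc, tb):
        RankedLock._held.ranks.remove(self.rank)
        self._lock.release()
        return False  # never swallow exceptions


hash_table_lock = RankedLock("hash table", RANK_HASH_TABLE)


def new_node_model():
    """Stand-in for a function that takes the hash table lock internally."""
    with hash_table_lock:
        return "node"


def lookup_wrong_order(dir_lock):
    # The directory node is locked first, then the hash table lock is
    # requested: this is the order violation described in the log.
    with dir_lock:
        return new_node_model()


def lookup_right_order(dir_lock):
    # The hash table lock is taken and released before the node lock.
    node = new_node_model()
    with dir_lock:
        return node
```

With a node-ranked lock such as `RankedLock("dir", RANK_NODE)`, `lookup_right_order` returns normally, while `lookup_wrong_order` raises `LockOrderError`. A check of this kind is a debugging aid under the stated assumptions; it does not describe how the fakeroot translator was actually fixed.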
+
+
+## IRC, freenode, #hurd, 2014-02-05
+
+    <braunr> teythoon: i got an assertion about nn->np->nn not being equal to
+      nn after the hash table lookup is dir_lookup
+    <braunr> +failure
+    <teythoon> that's bad
+    <braunr> not over yet
+    <teythoon> i had a couple of those too
+    <teythoon> i guess it's a use after free
+    <braunr> yes
+    <teythoon> i used to poison the pointers and comment out the frees to track
+      them down iirc
+    <braunr> teythoon: one of your patches stores netnodes instead of nodes in
+      the hash table, citing some overwriting issue
+    <braunr> teythoon: i don't understand why using netnodes fixes this
+    <teythoon> braunr: libihash has this cookie for fast deletes
+    <teythoon> that has to be stored somewhere
+    <teythoon> the node structure has no room for it
+    <braunr> uh
+    <teythoon> yes
+    <teythoon> it was that bad
+    <braunr> ...
+    <teythoon> hence the uglyish back pointers
+    <braunr> i see
+    <teythoon> looking back i cannot even say why it worked at all
+    <braunr> well, it didn't
+    <teythoon> i believe libihash must have destroyed a linked list in the node
+      struct
+    <braunr> possibly
+    <teythoon> no, it did not >,<, but for simple tests it kind of did
+    <braunr> yes fakeroot sometimes corrupts memory badly ....
+    <braunr> and yes, turns out the assertion is triggered on nodes with 0 refs
+      ..
+    <braunr> teythoon: it looks like even the current version makes wrong usage
+      of the ihash interface
+    <braunr> locp_offset is defined as "The offset of the location pointer from
+      the hash value"
+    <braunr> and indeed, it's an intptr_t
+    <braunr> teythoon: hm no, it looks ok actually, forget what i said :)
+    <teythoon> *phew
+    <teythoon> :p
+
+    <braunr> hmm, still occasional double frees in fakeroot, but it looks in
+      good shape for single threaded tasks like package building
+
+    <braunr> teythoon: i've just sent my fakeroot patches
+    <teythoon> braunr: sweet, i'll have a closer look tomorrow :)
+    <braunr> teythoon: i couldn't debug the double frees though :/
+
+
+## IRC, freenode, #hurd, 2014-02-06
+
+    <braunr> btw, i'm able to successfully use fakeroot-hurd to build glibc
+      packages, but is there a way to make sure the resulting archives contain
+      the right privileges and ownerships ?
+    <youpi> I don't remember whether debdiff checks permissions
+
+    <youpi> braunr: I've just got fakeroot-hurd debian/rules clean
+    <youpi> dh_clean
+    <youpi> fakeroot: ../../trans/fakeroot.c:161: netfs_node_norefs: Assertion
+      `np->nn->np == np' failed.
+    <youpi> while building eglibc
+    <teythoon> youpi: yes, that lockup is most annoying... :/
+    <braunr> youpi: with the new version ?
+    <youpi> yes
+    <braunr> hum
+    <braunr> i only had rare double frees, not that any more :/
+    <braunr> youpi: ok i got the error too
+    <braunr> still not good enough
+    <youpi> ok
+
+
+## IRC, freenode, #hurd, 2014-02-07
+
+    <braunr> youpi: debdiff seems to handle permissions
+    <braunr> i've found the cause of the assertions
+    <youpi> braunr: groovie :)
+
+
+## IRC, freenode, #hurd, 2014-02-08
+
+    <teythoon> braunr: nice :)
+    <braunr> http://darnassus.sceen.net/~rbraun/debdiff_report
+
+
+## IRC, freenode, #hurd, 2014-02-10
+
+    <braunr> and, on a completely different topic, here is a crash i can
+      reproduce when using fakeroot:
+      http://darnassus.sceen.net/~rbraun/fakeroot_hurd_rpctrace_o_var_tmp_out_rm_rf_dir.png
+
+
+## IRC, freenode, #hurd, 2014-02-11
+
+    <braunr> still working on fakeroot
+    <braunr> there are still races (not disturbing for package building but
+      still ..)
+    <braunr> there may be wrong right handling
+    <teythoon> i believe i have witnessed a fakeroot deadlock :/
+    <braunr> aw
+    <teythoon> not sure though, buildbot killed the build process before i
+      could investigate
+    <braunr> teythoon: was it a big package ?
+    <teythoon> half of the hurd package
+    <braunr> that's not a port right overflow then
diff --git a/open_issues/wine.mdwn b/open_issues/wine.mdwn
index f8bb469b..842442f1 100644
--- a/open_issues/wine.mdwn
+++ b/open_issues/wine.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2010, 2011, 2013 Free Software Foundation,
+[[!meta copyright="Copyright © 2010, 2011, 2013, 2014 Free Software Foundation,
 Inc."]]
 
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
@@ -78,3 +78,99 @@ allocation. There is kernel support for this,* however.
       and stack issues to be
     <gnu_srs> fixed for wine to run as braunr pointed out some months ago
       (IRC?) when we discussed wine.
+
+
+# IRC, freenode, #hurd, 2013-12-29
+
+    <Andre_H> Hi,
+      http://www.gnu.org/software/hurd/open_issues/sendmsg_scm_creds.html seems
+      fixed in Debian GNU/Hurd 2013, do you know which patch they used? i
+      already asked in their channel, but well, there are only 18 people :)
+    <youpi> Andre_H: it hasn't been fixed in Debian GNU/Hurd. Work is discussed
+      on the bug-hurd mailing list
+    <Andre_H> youpi: thx for the info, i wonder why wine now works with some
+      hacks, but didn't in the past
+    <youpi> I guess some circumvention patch was added to wine
+    <youpi> does it actually really work, as in running applications for real?
+    <youpi> (I've never tried)
+    <Andre_H> youpi: i'm a wine developer and haven't seen circumventions for
+      hurd... i also just tried winelib apps last night, will try... let's say
+      powerpoint viewer today
+    <gnu_srs> Andre_H: How did you make wine run? I have patches for wine-1.4.1
+      and 1.6.1 to build (so far unpublished), but it does not yet run
+      properly.
+    <gnu_srs> test case: wine notepad
+    <Andre_H> gnu_srs: what's happening when you try that?
+    <gnu_srs> Andre_H: Currently it hangs at connect() (after creating the
+      /tmp/.wine1000/.../socket, etc, and starting again)
+    <gnu_srs> seems to be some problem with the HURD_DPORT_USE macro in eglibc,
+      investigation ongoing
+    <Andre_H> gnu_srs: well, i'm using the debian distro, maybe you're on
+      something else? you could also pastebin your hacks, so i could have a
+      look. i'm about to clean mine up to send them upstream... ntdll will be
+      quite hard...
+
+
+## IRC, freenode, #hurd, 2013-12-30
+
+    <gnu_srs> wine runs:)
+    <gnu_srs> It's just extremely slow.,..
+
+    <gnu_srs> gg0: please don't reopen #733604 , I've filed an updated one:
+      #7336045
+    <gnu_srs> #733605
+    <gg0> gnu_srs: i've reassigned it from wine-1.6 (nonexistent) to wine
+      (correct), then to src:wine (more correct), but between such
+      reassignments you closed it so found command in the latter made it
+      reopening
+    <gg0> then i realized you could mess up bugs on your own, without help :)
+    <gnu_srs> gg0: tks anyway, now it is src:wine and the title is right. Maybe
+      you should have noted me on IRC?
+
+    <Andre_H> gnu_srs: what's your status about wine? i'm still about to get
+      things upstream...
+    <gnu_srs> Andre_H: see debian bug #733605
+
+
+## IRC, freenode, #hurd, 2013-12-31
+
+    <Andre_H> gnu_srs: i didn't need the patches for
+      dlls/mountmgr.sys/diskarb.c, maybe due to missing headers
+
+
+## IRC, freenode, #hurd, 2014-01-06
+
+    <Andre_H> Wanted to note that
+      http://www.gnu.org/software/hurd/open_issues/wine.html is wrong about
+      socket credentials, afaik they are still not implemented but that doesn't
+      block Wine anymore
+    <Andre_H> In fact all you need to run Wine are the patches followed by
+      https://source.winehq.org/patches/data/101439 (not yet upstream) or see
+      http://wiki.winehq.org/Hurd
+
+    <braunr> Andre_H: thanks for your report
+    <Andre_H> np :)
+    <Andre_H> braunr: can someone update
+      http://www.gnu.org/software/hurd/open_issues/wine.html please?
+    <braunr> Andre_H: well, you can :)
+    <Andre_H> log in with google -> check guidelines of your wiki -> try out
+      your wiki syntax -> laziness alarm :)
+    <gnu_srs> Andre_H: The reason why wine runs now is a bug in SCM_CREDS was
+      fixed, see the wine-devel ML.
+
+    <gnu_srs> Andre_H: s/SCM_CREDS/SCM_RIGHTS/
+    <Andre_H> gnu_srs: already updated our wiki :)
+    <Andre_H> gnu_srs: would you mind updating yours:
+      http://www.gnu.org/software/hurd/open_issues/wine.html :)
+
+    <Andre_H> gnu_srs: two commits for wine are in now :)
+
+
+## IRC, freenode, #hurd, 2014-01-11
+
+    <gnu_srs1> Andre_H: Looks like the two committed patches did not go into
+      wine-1.6.2:-(
+    <gnu_srs1> Additionally, your PATH_MAX fixes was not accepted?
+    <Andre_H> gnu_srs1: well, the stable branch is called stable because not
+      everything gets there :)
+    <Andre_H> gnu_srs1: the PATH_MAX patch needs more thinking...
diff --git a/open_issues/xattr.mdwn b/open_issues/xattr.mdwn
index 558c93b7..c6b9d8f7 100644
--- a/open_issues/xattr.mdwn
+++ b/open_issues/xattr.mdwn
@@ -1,4 +1,5 @@
-[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012, 2013, 2014 Free Software Foundation,
+Inc."]]
 
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
 id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -43,3 +44,12 @@ IRC, OFTC, #debian-hurd, 2012-03-18:
 
     <pinotree> notes to self: it seems our ext2 driver comes from linux 2.3.42
       or so, and in linux 2.5.46 ext2/ext3 get xattr and acl support
+
+
+# Test Cases
+
+## IRC, freenode, #hurd, 2013-12-06:
+
+    <gnu_srs> for fakeroot t.xattr test fails, a known issue?
+    <braunr> the test must probably be disabled
+    <braunr> the hurd doesn't support extended attributes currently