From c4ad3f73033c7e0511c3e7df961e1232cc503478 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Wed, 26 Feb 2014 12:32:06 +0100 Subject: IRC. --- open_issues/virtualization/fakeroot.mdwn | 1224 +++++++++++++++++++++++++++++- 1 file changed, 1223 insertions(+), 1 deletion(-) (limited to 'open_issues/virtualization/fakeroot.mdwn') diff --git a/open_issues/virtualization/fakeroot.mdwn b/open_issues/virtualization/fakeroot.mdwn index f9dd4756..7856e299 100644 --- a/open_issues/virtualization/fakeroot.mdwn +++ b/open_issues/virtualization/fakeroot.mdwn @@ -1,4 +1,5 @@ -[[!meta copyright="Copyright © 2010, 2013 Free Software Foundation, Inc."]] +[[!meta copyright="Copyright © 2010, 2013, 2014 Free Software Foundation, +Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -65,3 +66,1224 @@ License|/fdl]]."]]"""]] < youpi> that's why we still use fakeroot-sysv < teythoon> right < youpi> err, -tcp + + +## IRC, freenode, #hurd, 2013-11-18 + + I believe I figured out the argv[0] issue with fakeroot-hurd + but I'm not sure how to fix this + first of all, Emilios file_exec_file_name patch set works fine + but not with fakeroot + + http://git.sceen.net/hurd/hurd.git/blob/HEAD:/exec/hashexec.c#l300 + check_hashexec tries to locate the script file using a heuristic + Emilios patch improves the situation with just providing this + information + but then, the identity port of the "discovered" file is compared + with the id port of the script file + to verify if the heuristic found the right file + but when using fakeroot-hurd, /hurd/fakeroot proxies all + requests + but the exec server is outside of the /hurd/fakeroot + environment, so it gets the id port from the real filesystem + we could skip that test if the script name is explicitly + provided though + that test was meant to see whether a search through $PATH turned + up the right file + teythoon: nice + braunr: thanks :) + unfortunately, dpkg-buildpackaging hurd with it still fails for + some reason + but it is faster than fakeroot-tcp :) + even chown ? + or chmod ? + dunno in detail, but the whole build is faster + if you can try it, i'm interested + because chown/chmod is also slow on linux with fakeroot-tcp + i can try... + so it's probably not a hurd bug + braunr: yes, it really is + no i mean + chown/chmod being slow with fakeroot-tcp is probably not a hurd + bug + but a fakeroot-tcp bug + chowning all files in /usr/bin takes 5.930s with fakeroot-hurd + (6.09 with startup overhead) vs 26.42s (26.59s) with fakeroot-tcp + but try it on linux (fakeroot-tcp i mean) + although you may want to do it on something you don't care much + about :p) + + +## IRC, freenode, #hurd, 2013-12-03 + + * teythoon is gonna hunt a fakeroot bug ... + % fakeroot-hurd /bin/sh -c ":> /tmp/some_file" + /bin/sh: 1: cannot create /tmp/some_file: Is a directory + ah fakeroot-hurd + prevents installing stuff with /bin/install + sure fakeroot-hurd, why would i work on the slow one ? + i don't know + because it makes chmod/chown/maybe others horrenddously slow + ? + yes, fixing this involves fixing fakeroot-hurd + are you sure ? + i prefer repeating just in case: i saw that problem on linux as + well + with fakeroot-sysv + so ? + i'm almost certain it's a pure fakeroot bug, not a hurd bug + so + even if this is fixed, it still has to pay the socket + communication overhead + fixing fakeroot-hurd so that i can be used instead of fakeroot-tcp + is a very good thing to do, obviously + it* + but it won't solve the chown/chmod speed + (or, probably won't) + huh, why not ? + 15:53 < braunr> i'm almost certain it's a pure fakeroot bug, not a + hurd bug + when i say it's slow, i should be more precise + it doesn't show up in top + yes, but why would fakeroot-hurd suffer from the same issue ? + the cpu is almost idle + oh right, it's a completely different tool + my bad + right, right, the proper way to implement fakeroot actually :) + yes + this will bring near-native speed + + +## IRC, freenode, #hurd, 2013-12-05 + + fakeroot-hurd just successfully built mig :) + hangs in dh_gencontrol when building gnumach or hurd though + i believe it hangs waiting for a lock + lock like in file lock that is + braunr: no more room for vm_map_find_entry in 80220a40 + 80220a40 <- is that a task ? + or a vm_map, not sure + probably a vm_map + + +## IRC, freenode, #hurd, 2013-12-06 + + well, aren't threads a source of endless entertainment ... ? + well, I found three more bugs in fakeroot-hurd + one of them requires fixing the locking used in fakeroot + ouch + the current code does some lock cycling to aquire a lock out of + order + cycling ? + in the netfs_node_norefs function + release and reaquire + i see + which imho should be better solved with a weak reference + working on it, it no longer deadlocks but i broke something else + ... + endless fun ;) + such things could have been done right in the beginning + ... + yes, I wonder + libports has weak references + but pflocal is the only user + hm + none of the lib*fs support that + didn't i add one in libdiskfs too ? + anyway, irrelevant + weak references are a nice feature + teythoon: i don't see the cycling you mentioned + only netfs_node_refcnt_lock being dropped temporarily + yep, that one + line 145 + note that due to another bug this code is currently never run + how surprising .. + the note about some leak actually gave a hint about that + yeah, that leak + I think i'm actually very close + it's just so frustrating, i thought i got it last night + good luck then + thanks :) + + +## IRC, freenode, #hurd, 2013-12-09 + + sweet, i fixed fakeroot-hurd :) + /clap + what was the problem ? + lots + i see + it's amazing it actually run as well as it did + mess strikes again + i hate messy code .. + * teythoon is building half a hurd package using this ... stay tuned ;) + teythoon: is this going to make building faster as well? + most likely, yes + fakeroot-tcp is known to be slow, even on linux + teythoon: are you sure about the transparent retry patch ? + pretty sure, why ? + it's about a more general issue that we didn't fix yet + our last discussions about it lead us to agree that clients should + check the identity of a server before interacting with it + braunr: i don't understand, what's the problem here ? + teythoon: fakeroot does the lookup itself, doesn't it ? + yes + teythoon: but was that also the case before your patch ? + braunr: yes + teythoon: then ok + teythoon: i guess fakeroot handles requests only for a specific + set of calls right ? + and for others, requests are directly relayed + braunr: yes + and that still is the case, right ? + yes + ok + looks right since it only affects lookups + ok then + well, fakeroot-hurd built half a hurd package in less than 70 + minutes + a new record for my box + compared to how much before ? + (and why half of it ?) + unfortunately it hung after signing the packages... some perl + process with a /usr/bin/tee child + killing tee made it succeed though + braunr: i don't build the udeb package + oh ok + braunr: compared with ~75 with fakeroot-tcp and my demuxer + rework, ~80 before + teythoon: nice + + +## IRC, freenode, #hurd, 2013-12-18 + + there, i fixed the last fakeroot-hurd bug + *whee* :) + i thought so many times that i got the last fakeroot bug ... + last as in it's in a good enough shape to compile the hurd + package that is + but now it is + :) + this will make glibc and others so much faster to build + + +## IRC, freenode, #hurd, 2013-12-19 + + teythoon_: hum, you should make the behaviour of fakeroot-hurd on + the last client exiting optional + y? + fakeroot-tcp does the very same thing + fakeroot-hurd is different + it's part of the file system + yes + users may want it to stay around + and reuse it without checking it's actually there + but once the last client is gone, who is ever getting another + port to it ? + no + that cannot happen + really ? + yes + i thought it was like remap + since remap is based on it + the same thing applies to remap + only settrans has the control port + hum + and uses it once to get a protid for the working dir of the + initial process started inside the chrooted environment + you may not want to chroot inside + so ? + then, you get another protid + i'll make an example + i create a myroot directory implemented by fakeroot + populate it + leave and do something else, + i might want to return to it later + ah + ok, so you are not using settrans --chroot + or maybe i'm confusing the fakeroot translator and fakeroot-hurd + 10:48 < braunr> you may not want to chroot inside + yes + hm + ok, so the patch could be changed to check whether the last + control port is gone too + i have no idea of any practical use, but i don't see a valid + reason to make a translator go away just because it has no client + except for resource usage + and if it's installed as a passive translator + although that would make fakeroot loose its state + though remap state is on the command line so it would be fine for + it + see what i mean ? + yes i do + fakeroot state could be saved in some db one day so it may apply, + if anyone feels the need + so what about checking for control ports too ? + i'm not too familiar with those + who has the control port of a passive translator ? the parent ? + that should cover the use case you described + for the parent translator + for fsys_getroot requests it has to keep it around + and for more fsys stuff too + and if active ? settrans ? who just looses it ? + if settrans is used to start an active translator, the parent + fs still gets a right to the control port + ok + i don't have a clear view of what this implies for fakeroot-hurd + we'd want fakeroot-hurd to clean all resources including the + fakeroot translator on exit + for fakeroot-hurd (or any child translator) this means that a + port from the control port class will still exists + so we do not exit + oh, you're speaking of fakeroot.sh ? the wrapper script ? + probably + for me, fakeroot-hurd is the command line too, similar to + fakeroot-sysv and fakeroot-tcp + and fakeroot is the translator + yes, agreed + fakeroot-hurd could use settrans --force --chroot ... to force + fakeroot to exit if the main chrooted process dies + but that'd kill anything that outlives that process + that might be legitimate, say a process daemonized + so detecting that noone uses fakeroot is the much cleaner + solution + ok + also, that's what fakeroot-tcp does + which is why i suggested an option for that + why add an option if we can do the right thing without + troubling the user ? + ah, if we can, good + i think we can + I'll rework the patch, thanks for the hint + so + just to be clear + the way you intend it to work is + wait for all clients and the control port to drop before shutting + down + the control port is dropped when dettaching the translator, right + ? + yes + but hm + what if clients spawn other processes ? + they won't find the translator any more + then, that client get's a port to fakeroot at least for it's + working dir + so another protid is created + ah yes, it's usually choorted for such uses + chrooted + so fakeroot will stick around + but clients, even from fakeroot, might simply use absolute paths + so ? + in which case they won't find fakeroot + it will hit fakeroots dir_lookup + sure + how so ? + if the path is absolute, it will trigger a magic retry of some + kind + so the client uses it's root dir port + i thought the lookup would be done straight from the root fs port + .. + which points to fakeroot of course + ah, chrooted again + that's the whole point + so this implies clients are chrooted + they are + even if you do another chroot + what i mean is + that root port also points to a fakeroot port + if we detach the translator, and clients outside the chroot spawn + processes, say shell scripts, they won't find the fakeroot tree + now, i wonder if we want to actually handle that + i'm just uncomfortable with a translator silently shutting down + because it has no client + if fakeroot is detached, how are clients outside the chroot + ever supposed to get a handle to files inside the fakerooted env ? + it makes sense for fakeroot, so the expected behaviours here aer + conflicting + they had those before fakeroot being detached + then fakeroot wouldn't go away + right + unless there is a race but i don't think there is + there isn't + i call netfs_shutdown + clients get the rights before the parent has a chance to terminate + and only shutdown if it doesn't return ebusy + makes sense + ok go ahead :) + cool, thanks for the walk-through ;) + on the other hand .. + that's a complicated topic left unfinished by the original authors + one of many + having translators automatically go away when there is no client + may be a good feature + but it only makes sense for passive translators + and this should be automated + the lib*fs libraries should be able to handle it + or, we could go for proper persistence instead + stay around if active, leave after a while when no more clients if + passive + why ? + clean solution + persistence looks much more expensive to me + other benefits + i mean + persistence looks so expensive it doesn't make sense in a general + purpose system + sure, we could make our *fs libs handle this smarter at a much + lower cost + don't we get a handle to the underlying file ? + i think we do yes + if that's actually a file and not a directory, we could store + data into it + many translators are read-only + so ? + well, when we can write, we can use passive translators instead + normally + yes + depends on the fs type actually but you're right, we could use + regular files + or a special type of file, i don't know + braunr: BTW, I agree that active translators should only go away + when no ports are open anymore, while passive ones can exit when control + ports are still open but no protids + antrik: you mean as a general rule ? + that leaves the question how the translator distinguishes + between having a passive translator record and not having one + I believe I already arrived at that conclusion in some design + discussion, probaly regarding namespace-based translator selection + teythoon: yeah, as a general rule + interesting + currently there are command line arguments controling timeouts, + but they don't consider control ports IIRC + i thought there are problems with shutting down translators in + general + (also, command line arguments seem inconvenient to distinguish the + passive vs. active case...) + yeah, but we disregard the timeouts in the debian flavor of hurd + teythoon: err... no we don't. at least not last time I knew. are + you confusing this with thread timeouts? + simple test: do ls -l on /dev, wait a few minutes, compare + what do you expect will happen ? + the unused translators should go away + no + that must be new then + might be, yes + + http://darnassus.sceen.net/gitweb/teythoon/packaging/hurd.git/blame/HEAD:/debian/patches/libports_stability.patch + antrik: debian currently disables both the global and thread + timeouts in libports + my work on thread destruction consists in part in reenabling + thread timeouts, and my binary packages do that well so far :) + braunr: any idea why the global timeouts were disabled? + + +## IRC, freenode, #hurd, 2013-12-20 + + antrik: not sure + but i suspect there could be races + if a message arrives while the server is going away, i'm not sure + the client can determine this and retry transparently + good point... not sure how that is supposed to work exactly + + +## IRC, freenode, #hurd, 2013-12-31 + + btw, we should remove the libports_stability patch and directly + change the upstream code + if you agree, i can force the global timeout to 0 (because we're + still not sure what can go wrong when a translator goes away while a + message is being delivered to it) + i didn't experience any slowdown with thread destruction however + so i'm tempted to set that to an actual reasonable timeout value + of 30-300 seconds + braunr: if you do, please introduce a macro for the default + value so it can be changed easily + teythoon: yes + i don't understand why these are left as parameters tbh + true + 30 seconds seems to be plenty enough + + +## IRC, freenode, #hurd, 2014-01-17 + + time to give fakeroot-hurd a shot + http://darnassus.sceen.net/~rbraun/darnassus_fakeroot_hurd_assert + braunr: (wrt fakeroot-hurd) well in my book that shouldn't + happen + that's why i put the assertion there ;) + i assumed so :) + then again, /me does not agree with "threads" as concurrency + model >,<, and that feeling seems to be mutual :p + ? + well, obviously, the threads do not agree with me wrt to that + assertion + the threads ? + well, fakeroot is a multithreaded server + teythoon: i'm not sure i get the point, are you saying you're not + comfortable with threads ? + that's exactly what i'm saying + ok + coroutines/functional i guess ? + csp + functional not so much + + +## IRC, freenode, #hurd, 2014-01-20 + +[[open_issues/libpthread]], +[[open_issues/libpthread/t/fix_have_kernel_resources]]. + + teythoon: it's perfectly possible that the bug i had with + fakeroot-hurd have been caused by my own glibc thread related patches + has* + ok + *phew* :p + :) + i wonder if youpi could reproduce his issue on his machine + what issue ? + i must have missed something + some package failed + but he didn't gave any details + he wanted to try it on his vm first + ok + + +## IRC, freenode, #hurd, 2014-01-21 + + teythoon: i still get the same assertion failure with + fakeroot-hurd + will take a look at that sometimes too + braunr: hrm :/ + teythoon: don't worry, i'm sure it's nothing big + in the mean time, there are updated hurd and glibc packages on my + repository with fixed tls and thread destruction + cool :) + + +## IRC, freenode, #hurd, 2014-01-23 + + teythoon: can you briefly explain this fake reference thing in + fakeroot when you have some time please ? + braunr: fakeroot creates ports to hand out to clients + every port represents a node and references a real node + fakeroot allows one to set attributes, e.g. file permissions on + any node as if the client was root + those faked attributes are stored in the node objects + let's focus on fake_reference please + once some attribute is faked, that node has to be kept alive + otherwise, that faked information is lost + so if the last peropen object is closed and some information is + faked, a fake reference is kept + as indicated by a flag + hm + in dir lookup, if a node is looked-up that has a fake reference, + it is recycled, i.e. the flag cleared and the referecne count is not + incremented + so every time fakeroot_netfs_release_protid is called b/c, the + node in question should not have the fake reference flag set + what's the relation between the number of hard links and this fake + reference ? + i don' + i don't think fakeroot has a notion of 'hard links' + it does + the fake reference is added on nodes with a hard link count + greater than 0 + but i guess that just means the underlying node still exists + ah yes + right + currently, if the real node is deleted, the fake node is still + kept around + let's say it's ok for now + that's what the comment is talking about, the one that indicates + that garbage collection could help here + yes + properly fixing this is difficult + agreed + it would require something like inotify anyway + b/c of the way file deletion works + let's just ignore the issue, that's not what i'm hunting + agreed + the assertion i have is telling us that we're dropping a fake + reference + are we certain this isn't possible ? + that function is called if a client dereferences a port + in order to have a port in the first place, it has to get it + from a dir_lookup + the dir lookup turns a fake reference into a real one + so i'm certain of that (barring a race condition somewhere) + ok + netfs_S_dir_lookup grabs idport_ihash_lock (line 354) but doesn't + release it if nn == NULL (lines 388-392) + hm, my file numbers are slightly different o_O + i have printfs around + sorry :) + ok + new node unlocks it + new_node + oh + how unintuitive .. + yes, don't blame me ;) that's how it was + :) + worse, the description says "if successful" .. + ah no, the node lock + ok + yes, badly worded description + i strongly doubt it's a race + how do you trigger that assertion failure ? + dpkg-buildpackage -rfakeroot-hurd -uc -us + for the hurd package + very similar to one of your test cases i think + umm :-/ + one thing that i find confusing is that fake_reference seems to + apply to nodes, whereas release_protid is about, well, protids + is there a 1:1 relationship ? + since there is a peropen in the protid, i assume not + it may be a race actually + np->references must be accessed with netfs_node_refcnt_lock locked + hm no, that's not it + no, it's not a 1:1 relationship + note that the lock idport_ihash_lock serializes most operations, + despite it's name indicating that it's only for the hash table + the "interesting" operations being dir_lookup and release_protid + yes + again, that's another issue + why ? that's a pretty strong guarantee already + ah yes, i was referring to scalability + sure + the assertion is triggered from ports_port_deref in + ports_manage_port_operations_multithread + but i found it hard to reason about fakeroot, there are multiple + locks involved, two kinds of reference counting across different libs + yes + yes, that's to be expected + teythoon: do we agree that the fake reference is reused by a + protid ? + braunr: yes + why is there a ref counter for the protid as well as the peropen + then ? :/ + funny... i thought there was no refcnt for the peropen objects, + but there is + but for fakeroot-hurd that shouldn't matter, right ? + i don't know + here, one protid object is associated with one peropen object + yes + and the other way around, i.e. it's 1:1 + so the refcount for those should be identical + but i get a case where protid has a refcnt of 0 while the peropen + has 2 .. + umm, that doesn't sound right + teythoon: ok, it does look like a race on np->references + node references are protected by a global lock in lib*fs libs + yes + you check it without holding it + which means another protid can be closed at the same time, setting + the flag on the underlying node + i'll make a proper patch soon + they cannot both hold the hash lock + hm + teythoon: actually, i don't see why that's relevant + one thread closes its protid, sets the fakeref flag + the other does the same, chokes on the assertion + serially + i'm always a little fuzzy when exactly the references get + decremented + but shouldn't only the second thread set the fakeref flag ? + well, that's not what i see + i'll check what happens to this ref counter + see how my release_protid function calls netfs_release_protid + just after the out label + *while holding the big hash lock + so, any refcounting should happen while the lock is being held, + no ? + perhaps + now, my logs show something new + a case where the protid being released was never printed before + i.e. not obtained from dir_lookup + or at least, not fakeroot dir_lookup + huh, where did it came from then ? + no idea + only dir_lookup hands out those + check_openmodes calls dir_lookup too + yes, but that's not our dir_lookup + that's what i mean + it bypasses fakeroot's custom dir_lookup + but i guess the reference already exists at this point + bypass ? i wouldn't call it that + you're right, wrong wording + it accesses files on other translators + yes + the netnode is already present + yes + could it be the root node ? + i do not believe so + the root node is always faked + and is handed out to the first process in the fakeroot env for + it's current directory port + so you could try something that chdirs away to test that + hypothesis + the assertion looks triggered by a chdir + how do you know that ? + dh_auto_install: error: unable to chdir to build-deb + ah + well, or that is just the operation after fakeroot died and + completely unrelated + maybe + can you trigger this reliably ? + yes + i'm trying to write a shell script for that + so for you, fakeroot-hurd never succeeded in building a hurd + package ? + no + on darnassus ? + yes + b/c i stopped working on fakeroot-hurd when it was in a + good-enough shape to build the hurd package + >,< + maybe my system is not fast enough to hit this race (if it turns + out to be one) + some calls seems to decrease the refcount of the root node + call* + have you confirmed that it's the root node ? + almost + i could say yes + teythoon: actually no, it's not .. + could be .. + teythoon: on what node does fakeroot-hurd install the fakeroot + translator when used to build debian packages ? + hum + could it simply be that the check on np->references should be + moved above the assertion ? + braunr: it is not bound to any node, check settrans --chroot + oh right + teythoon: ok i mean + does it shadow / ? + looks very likely, otherwise the chroot wouldn't work + i'm not sure what you mean by shadow + settrans --chroot cmd -- / /hurd/fakeroot ? + but yes, for any process in the chroot-like env every real node + is replaced, including / + makes sense + teythoon: moving the assertion seems to fix the issue + intuitively, it seems reasonable to assume the fakeref flag can + only be set when there is only one reference, namely the fake reference + (well, the fake ref, recycled by the last open) + no, i don't follow + i'd still say, that if ...release_protid is called, then there + is no way for the fake flag to be set in the first place + that's why i put the assertion in ;) + on the other hand, you check the refcnt precisely because other + threads may have reacquired the node + but why would moving the assertion change anything ? + if we would do that, we'd "lose" all threads that see + np->reference being >1 + but for those objects the fake_reference flag should never be + set anyways + i cannot see why this would help + (does it help ?) + (and if it does, it points to a serious problem imho) + i'm recreating the traces that made me think that + to get a clearer view of what's happening + the problem i have with the current code is this + there can be multiple protid referring to the same node occurring + at the same time + they are serialized by the hash table lock, ok + but there apparently are cases where the first (of two) protids + being closed sets the fakeref flag + and the following chokes because the flag is set + i assume you put this refcount check because you assumed only the + last protid being closed can set the flag, right ? + but then, why > 1 ? why not > 0 ? + yes, that's what i was trying to assert + b/c the 1 is our reference + which one exactly ? + >1 is anyone *beside* us + ? + hm + you mean the reference held by the protid being destroyed + yes + isn't that reference already dropped before calling the cleanup + function ? + ah no, it's the node ref + yes + released by netfs_release_protid + exactly + which is called without the hash table lock held + hm no + it's locked + damn my brain is slow today + i actually think that it's the combination of manual reference + counting and the primitive concurrency model that makes it hard to reason + about this + well + the model is quite simple too + accesses to refcounters must be protected by the appropriate lock + this isn't done here, on the assumption that all referencing + operations are protected by another global lock all the time + even if a model is simple, this does not mean that it is a good + model for human beings to comprehend and reason about + i don't know + note that netfs_drop_node is designed to be called with + netfs_node_refcnt_lock locked + implying the refcount must remain stable between checking it and + dropping the node + netfs_make_peropen is called without the hash table lock held in + dir_lookup + and this increases the refcount + although the problem is rather that something decreases it without + the lock held + we should port libtsan and just ask gcc -fsanitize=thread + what about the netfs_nput call at the end of dir_lookup ? + the fake ref should be set by the norefs function + that should not decrease the count to 0 b/c the caller holds a + reference too + yes that's ugly + ugh + i'm unable to think clearly right now + as mentioned in the commit message, you cannot do something like + this in the norefs function + bbl ;) + bye teythoon + thanks for your time + for when you come back : + instead of maintaining this "fake" reference, why not assumeing + the hash table holds a reference, and simply count it + the same way a cache does + and drop that reference when removing a node, either to reflect + the current state of the underlying node, or because the translator is + being shut down ? + why not assume* + bbl too + sure, refactoring is definitively an option + + +## IRC, freenode, #hurd, 2014-01-24 + + teythoon: ok, i'll take care of fakeroot + braunr: thanks. tbh i was a little fed up with that little + bugger >,< + i can imagine + considering the number of patches you've sent already + + teythoon: are you sure about your call to fshelp_lock_init ? + yes, why do you ask ? + (the test case is given in the commit message) + it doesn't look right to me to call "init" while the node is + potentially locked + i noticed libdiskfs peropen release function takes care of + releasing locks + it looks better to me + it's not about releasing the lock + it's about faking the file being closed which implicitly + releases the lock + the file is being close + closed + since it's in the cleanup function + yes, but we keep it b/c the file has faked attributes + did you look at the problem description in the commit message ? + we keep the node + not the peropen + so ? + the lock is in the node + why would libdiskfs do it in the peropen release then ? + there is an inconsistency somwhere + actually, the lock looks to be per open + or rather, the lock is per node, but its status is recorded per + open + allowing the implementation to track if a file descriptor was used + to install a lock and release it when that file descriptor goes away + why would the node be locked ? + locked in what way, file-locking locked ? + yes + posix explicitely says that file locks must be implicitely removed + when closing the file descriptor used to install them, so that makes + sense + isn't hat exactly what i'm doing ? + no + you're initializing the file lock + init != unlock + and it's specific to fakeroot, while it looks like libnetfs should + be doing it + libnetfs would do it + but we prevent that by keeping the node alive + again, it's a per open thing + and no, libnetfs doesn't release locks implicitely in the current + version + didn't we agree that for fakeroot one peropen object is + associated with one protid object ? + yes + and don't keep those alive + so let them die peacefully, and fix libnetfs so it releases the + lock as it's supposed to + and we* don't + we don't keep those alive + why would we ? + yes that's what i wanted to say + what i mean is + since letting peropens die is already what is being done + there is no need for a special handling of locks in fakeroot + oh + on the other hand, libnetfs must be fixed + ok, that might very well be true + (we need to bring libnetfs and diskfs closer so that they can be + diff'ed easily) + i just wanted to check your reason for using lock_init in the + first place + yes .. + teythoon: also, i think we actually do have what's necessary to + deal with garbage collection + namely, dead-name notifications + i'll see if i can cook something simple enough + otherwise, merely keeping every node around is also acceptable + considering the use cases + dead-name notifications won't help if the real node disappears, + no ? + teythoon: dead name notifications on the real node port :) + teythoon: at least i can reliably build the hurd package using + fakeroot-hurd now + let's try glibc :) + +## IRC, freenode, #hurd, 2014-01-25 + + braunr: awesome :) + teythoon: hm not sure :/ + darnassus got oom + teythoon: could be unrelated though + teythoon: something has apprently made /home unresponsive :( + teythoon: i suspect bots hitting apache and in particular the git + repositories to have increased memory usage + + +## IRC, freenode, #hurd, 2014-01-26 + + teythoon: btw, fakeroot interacts very very badly with other netfs + file systems + e.g., listing /proc through it creates lots of nodes + i'm not yet sure how to fix that + using a dead name notification doesn't seem appropriate (at least + not directly) because fakeroot holds a true reference that prevents the + deallocation of the target node + + +## IRC, freenode, #hurd, 2014-01-27 + + teythoon: good news (more or less): fakeroot is actually leaking a + lot when crossing file systems + which means if i fix that, there is a good chance we can use it to + build all packages with it + -with it + what do you mean exactly ? + if target nodes are from /, there is no such leak + as soon as the target nodes are from another file system, ports + rights are leaked + that's what fills the kernel allocator actually + oh, so dir_lookup leaks ports when crossing translator + boundaries ? + seems so + yeah, that might very well be it + the dir_lookup logic in lib*fs is quite involved :/ + yes, my simple attempts were unsuccessful + but i'm confident i can fix it soon + that sounds good :) + i also remove the fake_ref flag and replace it with "accounting + the reference in the hash table" as soon as a node is faked + fine with me + these will be the expected leak + but they're far less in numbers than what i observe + and garbage collection can be implemented later + although i would prefer notifications a lot more + end of the news, bbl :) + found it :> + braunr: -v ;) + err = dir_lookup (...); + if (dir != dnp->nn->file) mach_port_deallocate (mach_task_self (), + dir); + in other words, deallocate ports for intermediate file system root + directories .. :) + teythoon: currently building hurd and glibc packages + but i intend to improve some more with the addition of a default + faked state + so that only nodes with modified faked states are retained + how do you mark nodes as having the default faked state ? + i don't + ok, right, makes sense :) + this sounds awesome, thanks for following up on this + i'm quite busy with other stuff so, with proper testing, it should + take me the week to get merged + teythoon: well thanks for all the fixes you've done + fakeroot was completely unusable before that + if you push your changes somewhere i'll integrate them into my + packages and test them + ok + implementing fakeroot -u could also be a good thing + and this should work easily with that default faked state strategy + + +## IRC, freenode, #hurd, 2014-01-28 + + teythoon: i should be able to test fakeroot-hurd with the default + faked attributes strategy today on glibc + braunr: very nice :) + azeem_: do you happen to know if fakeroot -u is used by debian ? + i mean when building packages + braunr: how does fakeroot-hurd perform on darnassus ? + i mean, does it yield a noticeable improvement over fakeroot-tcp + just like on my slow box ? + i'm not measuring that :/ + ok, no problem + and since nodes are removed from the hash table, performance might + decrease slightly + but the number of rights is kept very low, as expected + that's good + i keep seeing leaks though + when switching cwd between file systems + humm + so i assume something is wrong with the identity of . or .. + it's so insignificant compared to the previous problems that i + won't waste time on that + teythoon: the problem with measuring on darnassus is that it's a + public machine + right + often scanned by ssh worms or http bots + +[[cannot_create__dev_null__interrupted_system_call]]. + + but it makes complete sense to get better performance with + fakeroot-hurd + that's actually one of the reasons i'm working on it + if not the main one + :) + that was my motivation too + it shows how you can get an interchangeable unix tool that + directly plugs well with the low level system + and make it work better + nicely put :) + + teythoon: i still can't manage to build glibc with fakeroot-hurd + but i'm not sure why :/ + there was no kernel memory exhaustion this time + :/ + cp: cannot create regular file `debian/libc-bin.dirs': Permission + denied + hum + youpi: do you know if building debian packages requires fakeroot + -u option ? + I don't know + braunr: man dpkg-buildpackage says it just runs "fakeroot + debian/rules " + sources confirm that + http://sources.debian.net/src/dpkg/1.17.6/scripts/dpkg-buildpackage.pl#L465 + gg0: ok + + +## IRC, freenode, #hurd, 2014-01-29 + + it seems that something sets the permissions of this + debian/libc-bin.dirs file to 000 ... + i've seen this too + oh + do you think it's a fakeroot-hurd bug ? + have i mentioned something like this in a commit message ? + yes + it is + ok + i didn't see any mention of it + but i could have missed it + hm, i cannot recall it either + but i've seen this issue with fakeroot-hurd + ok + it's probably the last issue to fix to get it to work for our + packages + teythoon: i think i have a solution for that last mode bug + fakeroot doesn't relay chmod requests, unless they change an + executable bit + i don't see the point, and simply removed that condition to relay + any chmod request + braunr: did it work ? + no + fakeroot still consumes too many ports + and for each file, there are at least two ports, the faked one, + and the real one + it should be completely reworked + but i don't have time to do that + i'll see if it works when building from scratch + actually, it's not even a quantity problem but a fragmentation + problem + the function that fails is kmem_realloc .. + ipc spaces are arrays in kernel space .... + it's more like three ports per file, you forgot the identity + port + ah yes + + +## IRC, freenode, #hurd, 2014-02-03 + + teythoon: i'll commit my changes on fakeroot tonight + they do improve the tool, but not enough to build glibc with it + braunr: cool :), so how do we make it fully usable ? + teythoon: i don't know .. + i'll try re adding detection of nodes with no hard links for one + but imho, it needs a rework based on what the real fakeroot does + i won't work on it though + + teythoon: also, it looks like i've tested building glibc with a + wrong test binary of my fakeroot version :/ + so consider all test results irrelevant so far + + +## IRC, freenode, #hurd, 2014-02-04 + + fakeroot-hurd might turn out to be easily usable for our debian + packages with the fixed binary :) + + teythoon: hum, can you explain + 672005782e57e049c7c8f4d6d0b2a80c0df512b4 (trans: fix locking issue in + fakeroot) when you have time please ? + it looks like it introduces a deadlock by calling new_node (which + acquires the hash table lock) while dir is locked, violating the hash + table -> node locking order + + braunr: awesome, then there still is hope for fakeroot-hurd :) + + teythoon: i've been able to build glibc packages several times + this night + so except for this deadlock i've seen once, it looks good + right + that deadlock + right, it does indeed violate the partial order of the locks :-/ + + teythoon: can you explain why you moved the lock in attempt_mkfile + please ? + + teythoon: i've just tested a fakeroot binary without the patch + introducing the deadlock, and glibc built without any problem + braunr: well, this is very good news :) + teythoon: but i still wonder why you made this patch in the first + place, i don't want to revert it blindly and reintroduce a potential + regression + braunr: i thought i was fixing the order in which locks were + taken. if the commit message does not specify that it fixes an issue, + then i was probably just wrong and you can revert it + oh ok + good + + teythoon: another successful build :) + i'll commit my changes + awesome :) + there might still be concurrency issues but it's much better + i'm curious what you did :) + so little :) + i was sick all week heh + you'll se + see + well, that's good actually ;) + yes + + teythoon: actually there was another debugging line left over, and + again, my test results are irrelevant @#! + + +## IRC, freenode, #hurd, 2014-02-05 + + teythoon: i got an assertion about nn->np->nn not being equal to + nn atfer the hash table lookup is dir_lookup + +failure + that's bad + not over yet + i had a couple of those too + i guess it's a use after free + yes + i used to poison the pointers and comment out the frees to track + them down iirc + teythoon: one of your patches stores netnodes instead of nodes in + the hash table, citing some overwriting issue + teythoon: i don't understand why using netnodes fixes this + braunr: libihash has this cookie for fast deletes + that has to be stored somewhere + the node structure has no room for it + uh + yes + it was that bad + ... + hence the uglyish back pointers + i see + looking back i cannot even say why it worked at all + well, it didn't + i believe libihash must have destroyed a linked list in the node + struct + possibly + no, it did not >,<, but for simple tests it kind of did + yes fakeroot sometimes corrupts memory badly .... + and yes, turns out the assertion is triggered on nodes with 0 refs + .. + teythoon: it looks like even the current version makes wrong usage + of the ihash interface + locp_offset is defined as "The offset of the location pointer from + the hash value" + and indeed, it's an intptr_t + teythoon: hm no, it looks ok actually, forget what i said :) + *phew + :p + + hmm, still occasional double frees in fakeroot, but it looks in + good shape for single threaded tasks like package building + + teythoon: i've just sent my fakeroot patches + braunr: sweet, i'll have a closer look tomorrow :) + teythoon: i couldn't debug the double frees though :/ + + +## IRC, freenode, #hurd, 2014-02-06 + + btw, i'm able to successfully use fakeroot-hurd to build glibc + packages, but is there a way to make sure the resulting archives contain + the right privileges and ownerships ? + I don't remember whether debdiff checks permissions + + braunr: I've just got fakeroot-hurd debian/rules clean + dh_clean + fakeroot: ../../trans/fakeroot.c:161: netfs_node_norefs: Assertion + `np->nn->np == np' failed. + while building eglibc + youpi: yes, that lockup is most annoying... :/ + youpi: with the new version ? + yes + hum + i only had rare double frees, not that any more :/ + youpi: ok i got the error too + still not good enough + ok + + +## IRC, freenode, #hurd, 2014-02-07 + + youpi: debdiff seems to handle permissions + i've found the cause of the assertions + braunr: groovie :) + + +## IRC, freenode, #hurd, 2014-02-08 + + braunr: nice :) + http://darnassus.sceen.net/~rbraun/debdiff_report + + +## IRC, freenode, #hurd, 2014-02-10 + + and, on a completely different topic, here is a crash i can + reproduce when using fakeroot: + http://darnassus.sceen.net/~rbraun/fakeroot_hurd_rpctrace_o_var_tmp_out_rm_rf_dir.png + + +## IRC, freenode, #hurd, 2014-02-11 + + still working on fakeroot + there are still races (not disturbing for package building but + still ..) + there may be wrong right handling + i believe i have witnessed a fakeroot deadlock :/ + aw + not sure though, buildbot killed the build process before i + could investigate + teythoon: was it a big package ? + half of the hurd package + that's not a port right overflow then -- cgit v1.2.3