From eccdd13dd3c812b8f0b3d046ef9d8738df00562a Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Wed, 25 Sep 2013 21:45:38 +0200 Subject: IRC. --- community/gsoc/2013/nlightnfotis.mdwn | 2587 +++++++++++++++++++++++++++++++++ 1 file changed, 2587 insertions(+) (limited to 'community/gsoc/2013/nlightnfotis.mdwn') diff --git a/community/gsoc/2013/nlightnfotis.mdwn b/community/gsoc/2013/nlightnfotis.mdwn index 43f9b14c..a9176f51 100644 --- a/community/gsoc/2013/nlightnfotis.mdwn +++ b/community/gsoc/2013/nlightnfotis.mdwn @@ -448,3 +448,2590 @@ License|/fdl]]."]]"""]] nlightnfotis: OK, so probably waiting at the FSF office to be processed. Let's allow for some more time. After all, this is not critical for your progress. + + +# IRC, freenode, #hurd, 2013-07-10 + + tschwinge: I have run the diff of the GCC repo on the Hurd + against the one on my host linux os, and there was nothing relevant to + fixcontext and initcontext that are the ones that fail the + compilation. In any case I did recheck out the branch, and I have + attempted a build with it. It fails at the same point. Now I am + attempting a build with the -w (inhibit warnings) flag enabled + nlightnfotis: Have there been any differences in the diff? + There should be none at all. + tschwinge: there were some small changes due to the repo's + being checked out at different times. It was a large diff however. I + inspected it and didn't find anythign that was of much use. Here it is in + case you might want to see it: + https://www.dropbox.com/s/ilgc3skmhst7lpv/diffs_in_git.txt + nlightnfotis: Well, the idea of this exercise precisely was to + use the same Git revisions on both sides of the diff -- to show that + there are no spurious differences -- which can't be shown from your + 124486 lines diff. (Even though indeed there is no difference in + libgo/configure that would explain the mis-match, but who knows what else + might be relevant for that. + Would you please repeat that? + tschwinge: I will do so. 
It was wrong from me to not diff + against the same revisions, but going through the diff results grepping + for the problematic code didn't yield any results, so I thought that + might not be the issue. + I will perform the diff again tomorrow morning and report on + the results. + nlightnfotis: Anyway, if you checked out again, the latest + revision, and it still fails in exactly the same way, there is something + wrong. + nlightnfotis: And -w won't help, as there is a hard error + involved. + nlightnfotis: Are yous till working on GSoC things today? + tschwinge: yeah I am here. I decided to do the diff today + instead of tomorrow. + It finished now btw + let me tell you + ah and this time, the gits were checked out at the same time + from the same source + and are at the same branch + nlightnfotis: Coulod you upload the + gccbuild/i686-unknown-gnu0.3/libgo/config.log of the build that failed? + tschwinge: sure. give me a minute + tschwinge: there is something strange going on. The two + repos are at the exact same state (or at least should be, and the logs + indicate them to be) but still the diff output is 4.4 mb + but no presence of initcontext of fixcontext + tschwinge: the config.log file --> + http://pastebin.com/bSCW1JfF + wow! I can see several errors in the config.log file + but I am not so sure about their fatality. Config returns 0 + at the end of the log + nlightnfotis: As the configure scripts probe for all kings of + features on all kings of strange systems, it's to be expected that some + of these fail on GNU/Hurd. + What is not expected, however, is: + configure:15046: checking whether setcontext clobbers TLS + variables + [...] + configure:15172: ./conftest + /root/gcc_new/gcc/libgo/configure: line 1740: 1015 Aborted + ./conftest$ac_exeext + Hmm. apt-cache policy libc0.3 + nlightnfotis: ^ + tschwinge: Installed 2.13-39+hurd.3 + Candidate: 2.1-6 + *2.17 + Bummer. 
+ nlightnfotis: As indicated in + + and thereabouts, you need 2.17-3+hurd.4 or later... + Well. + At least that now explains what is going on. + tschwinge: i see. I am in the process of updating my hurd + vm. I saw that libc has also been updated to 2.17 + I will confirm when updating is done + nlightnfotis: Anyway, is the diff between the two repositories + empty now or are there still differences? + there are differences + and they were checked out at the same time + from the same source + (the official git mirror) + and they are both at the same branch + and still diff output is 4.4 MB + but quick grepping into it and there is not mention of + initcontext or fixcontext + That's... unexpected. + may be a mistake I am making + but considering that diff run for some time before + completing + In both Git repositories, »git rev-parse HEAD« shows the same + thing? + Could you please upload the diff again? + tschwinge: confirmed. libc is now version 2.17-1 + tschwinge: http://pastebin.com/bSCW1JfF + for the rev-parse give me a second + nlightnfotis: Where is libc0.3 2.17-1 coming from? You need + 2.17-3+hurd.4 or later. + it is 2.17-7+hurd.1 + OK, good. + The URL you just have is the config.log file, not the diff. + s%have%gave + oh my mistake + wait a minute + the two repos have different output to rev-parse + Phew. + That explains. + So the Git branches are at different revisions. + that confused me... when I run git pull -a the branches that + were changed were all updated to the same revision + unless... there were some automatic merges in the *host* GCC + repo required during some pulls + but that was some time ago + would it have messed my local history that much? + that's the only thing that may be different between the two + repos + they checkout from the same source + nlightnfotis: At which revisions are the two + repositories/branches? + I have never used »put pull -a«. What does that do? 
+ tschwinge: from what I know it does an automatic git fetch + followed by git merge. The -a flag must signal to pull all branches (I + think it's possible to pull only one branch) + That's the --all option. -a is something different (that I + don't understand off-hand). + Well, --all means to pull all remotes. + But you just want the GCC upstream, I guess. + I always use git fetch and git merge manually. + oh my god! You are write. -a is equivallent to --append + + https://www.kernel.org/pub/software/scm/git/docs/git-pull.html + git pull must be safe though + + http://stackoverflow.com/questions/292357/whats-the-difference-between-git-pull-and-git-fetch + without the -a + *right + why did I even write "right" as "write" above I don't + even... + what did I write in the sentence above + oh my god... + tschwinge: they are indeed on different revisions: The host + repo's last commit was made by me apparently, to merge master into + tschwinge/t/hurd/go, whereas the last commit of the Hurd repo was by you + and it reverted commit 2eb51ea + and that should also explain the large diff file + with master merged into the tschwinge/t/hurd/go branch + I will purge the debian repo and redownload it + *reclone it + that should bring it to a safe state I suppose. + + +# IRC, freenode, #hurd, 2013-07-11 + + nlightnfotis: how's your build going? + I tried one earlier and it seemed to build without any + issues, something that was...strange. I am repeating the build now, but I + am saving the compilation output this time to study it. + it was strange that the build succeeded? that sounds sad :/ + teythoon: considering that 3 weeks now I failed to build it + without errors, it sure seems weird that it builds without errors now :) + what did you change ? + braunr: not many things apparently. To be honest the change + that seemed to do the trick was (under thomas' guidance) update of libc + from 2.13 to 2.17 + well that can explain + tschwinge: Big update! 
GCC-go not compiles without errors + under the Hurd. I have done 2 compilations so far, none of which had + issues. Time needed for full build (without bootstrap) is 45 minutes +- 1 + minute. I also run the test suite, and I can confirm your results + s/not/now/, perhaps? + pinotree yeah. I don't know how it came up with not there. I + meant now + tschwinge: link for the go.sum is here --> + https://www.dropbox.com/s/7qze9znhv96t1wj/go.sum + + +# IRC, freenode, #hurd, 2013-07-12 + + nlightnfotis: Great! So you finally reproduced my results. + :-) + tschwinge: Yep! I am now building a blog, so that I can move + my reports there, so that they are more detailed, to allow for greater + transparency of my actions + nlightnfotis: Did you recently (in email, I think?) indicate + that there is another Go testsuite, for libgo? + nlightnfotis: As you prefer. + tschwinge: there seemed to be one, at least in linux. I + think I saw one in the Hurd too. + Oh indeed there is a libgo testsuite, too. + as a matter of fact, make check-go + did check for the lib + but lib was failing + yeah + So please have a look at that testsuite's results, too, and + compare to the GNU/Linux ones. + sure. I can do that now. + And for the go.sum you posted, please have a look at the tests + that do not pass (»grep -v ^PASS: < go.sum«), assuming they do pass on + GNU/Linux. + I suggest you add a list of the differences between GNU/Linux + and GNU/Hurd testresults to the wiki page, + , at the end of + the Part I section. + I'm on it. + For now, please ignore any failing tests that have »select« in + their name -- that is, do file them, but do not spend a lot of time + figuring out what might be wrong there. + The Hurd's select implementation is a bit of a beast, and I + don't want you -- at this time -- spend a lot of time on that. We + already know there are some deficiencies, so we should postpone that to + later. + tschwinge: noted. 
+ So what I would like at the moment, is a list of the testresult + differences to GNU/Linux, then from the go.log file any useful + information about the failing test (which perhaps already explains) + what's going wrong, and then a analysis of the failure. + nlightnfotis: I assume you must be really happy that you + finally got it build fine, and reproduced my results. :-) + tschwinge: yeah! I can not hide from you the fact that + failing all those builds made me really nervous about me missing my + schedule. Having finally built that and revisiting my application I can + see I am on schedule, but I have to intensify my work to compensate for + any potential unforeseen obstacles + , in the futute + *future + + +# IRC, freenode, #hurd, 2013-07-15 + + nlightnfotis: btw, do you have a weekly progress report? + youpi: not yet. Will write it shortly and post it here. I + made a new blog to keep track of my progress. + Will report much more frequently now via my blog + did you add your blog url to the hurd iwki? + currently I am running gcc tests on both gcc go and libgo to + see what the differences are with Linux + I believe I have done so, let me see + youpi: gccgo passes most of its tests (it fails a small + number, and I am looking into those tests) but libgo fails 130/131 tests + (on the Hurd that is) + ok + + guys I wrote my report. This time I made it available on my + personal blog. You can find it here: + www.fotiskoutoulakis.com/blog/2013/07/15/gsoc-week-4-report/ As always, + open to (and encouraging) criticism, suggestions, anything that might + help me. + I also have to mention that now that my personal website is + online, I will report much more frequently, to the scale of reporting day + by day, or every 2-3 days. 
+ nlightnfotis: without spending time on select, it'd be good to have + an idea of what is going wrong + eh, go having trouble with select + select is a beast, but we do have fixed things lately and we don't + currently know any issue still pending + youpi: are you suggesting to not skip the select tests too? + select is kind of critical .. + as youpi said, if you can determine what's wrong, at the interface + level (not the implementation), it would be a good thing to do + so we know what's wrong + we're not asking to fix it, though + braunr: youpi: noted. Thanks for the feedback. Is there + something else you might want me to improve? Something with the report + itself? Something you were expecting to see but I failed to provide? + no it's ok + it's short, readable, and readily answers the questions i might + have had so it's good + as you say, now you have to work on the core of your task :) + note: the "select" word in the testsuite is not strictly bound to + the C "select" + so it is probably really worth digging a bit at least on the go + side + but it's really worth doing in the end, as it will probably reveal + some nasty bugs on the way + I appreciate your input. I will start working on it asap + (today) and will report on Wednesday perhaps (or Thursday at worst). + + +# IRC, freenode, #hurd, 2013-07-18 + + braunr: I found out what was causing the fails in the tests + in both libgo and gccgo + it's a assertion: mach_port_t ktid = __mach_thread_self (); + int ok = thread->kernel_thread == ktid; __mach_port_deallocate + ((__mach_task_self_ + 0), ktid); ok; }) + is all that the assertion ? + yes + please paste the code somewhere + or is it in libpthread ? + http://pastebin.com/G2w9d474 + nonblock.x: ./pthread/pt-create.c:167: __pthread_create_internal: Assertion `({ mach_port_t ktid = __mach_thread_self (); int ok = thread->kernel_thread == ktid; __mach_port_deallocate ((__mach_task_self_ + 0), ktid); ok; })' failed. 
+ 9 FAIL: go.test/test/chan/nonblock.go execution, -O2 -g + yes + that's related to my current work on thread destruction + +[[open_issues/libpthread/t/fix_have_kernel_resources]]. + + thread resources recycling is buggy + i suggest you make your own thread pool if you can + I will look into it further and let you know. Thanks for + that. + + +# IRC, freenode, #hurd, 2013-07-22 + + tschwinge, I have found what is failing both libgo and gccgo + tests, but for the life of me, I can not really find the offending code + on any repository. + not even the eglibc-source debian package. it's driving me + insane. + nlightnfotis: If this is driving you insane, we should quickly + have a look at that! + thanks tschwinge: I have found that the offending code is an + assertion: { mach_port_t ktid = __mach_thread_self (); int ok = + thread->kernel_thread == ktid; __mach_port_deallocate ((__mach_task_self_ + 0), ktid); ok; } on a file called pt-create.c under the + libpthread on line 167 + but for the life of me, I can not find that piece of code + anywhere. And when I mean anywhere, I mean anywhere. I have looked for it + on all of the branches of glibc, libpthread and the source code of + eglibc. + that's why if you don't mind I would like to write my report + in a day or two, when (hopefully) I will have more progress to report on. + nlightnfotis: isn't that libpthread/sysdeps/mach/pt-thread-start.c + ? + or rather, ./sysdeps/mach/hurd/pt-sysdep.h + youpi: let me check this out. If that's it I'm gonna cry. + which unfortunately is inlined in a lot of places + nlightnfotis: does the assertion not tell you the file & line? + youpi: holy smokes! That's the code I was looking for! Oh + boy. Yeah the logs do tell me, but it was very misleading. So misleading, + that I was actually looking at the wrong place. All logs suggest that + this piece of code is at libpthread/pthread/pt-create.c in line 167 + what is that line in your tree? + a call to _pthread_self(), isn't it? 
+ then it's not actually misleading, this is indeed where the + pt-sysdep.h definition gets inlined + it seems so, yeah. it's err = __pthread_sigstate + (_pthread_self (), 0, 0, &sigset, 0); + nlightnfotis: and what is the backtrace? + youpi: _pthread_create_internal: Assertion failed. + The assertion is the one above + nlightnfotis: sure, but what is the backtrace? + I don't have the full backtrace. These are the logs from the + compiler. All I can get is: reports like this: nonblock.x: + ./pthread/pt-create.c:167: __pthread_create_internal: Assertion `({ + mach_port_t ktid = __mach_thread_self (); int ok = thread->kernel_thread + == ktid; __mach_port_deallocate ((__mach_task_self_ + 0), ktid); + ok; })' failed. + nlightnfotis: you should probably have a look at running the tests + by hand + so you can run them in a debugger, and get backtraces etc. + nlightnfotis: did i answer that ? + braunr: which one? + the problems you're seeing are the pthread resources leaks i've + been trying to fix lately + they're not only leaks + creation and destruction are buggy + I have read so in + http://www.gnu.org/software/hurd/libpthread.html. I believe it's under + Thread's Death right? + nlightnfotis: yes but it's buggy + and the description doesn't describe the bugs + so we will either have to find a temporary workaround, or + better yet work on a fix, right? + nlightnfotis: i also told you the work around + nlightnfotis: create a thread pool + braunr: since thread creation is also buggy, wouldn't the + thread pool be buggy too? + nlightnfotis: creation *and* destruction is buggy + nlightnfotis: i.e. recycling is buggy + nlightnfotis: the hurd servers aren't affected much because the + worker threads are actually never destroyed on debian (because of a + debian specific patch) + + youpi, nlightnfotis, hacklu_: btw, what about the copyright + assignment process + nlightnfotis just got his on file, so there is progress. 
+ I have email from Donald R Robertson III + about that -- but it is not yet present in the + FSF copyright.list file... + I think I received that email because I was CCed on + nlightnfotis' submission. + tschwinge: I have got the papers, and they were signed by + the FSF. They stated delivery date 11 of July, but the documents were + signed on the 10th of July :P + Ah, no, I received it via hurd-maintainers@gnu.org -- and the + strange thing is that not all assignments that got processed got sent + there... + At the recent GNU Tools Cauldron we also discussed this in the + GCC context; and their experience was the very same. Emails get lost, + and/or take ages to be processed, etc. + It seems the FSF is undermanned. + + +# IRC, freenode, #hurd, 2013-07-27 + + I have one question about the Mach sources: I can see it + uses its own scheduler (more like, initializes) and also does the same + for the linux scheduler. Which one does it use? + it doesn't use the linux scheduler + the linux glue just glues linux scheduling concepts onto the mach + scheduler + ohh I see now. Thanks for that youpi. + + +# IRC, freenode, #hurd, 2013-07-28 + + In the mach kernel source code, does the (void) before a + function call have a semantic meaning, or is it just remnants of the past + (or even documentation) + for example? + pinotree: (void) thread_create (kernel_task, + &startup_thread); + I read on stack overflow that there is only one case where + it has a semantic meaning, most of the times it doesn't + + http://stackoverflow.com/questions/13954517/use-of-void-before-a-function-call + most probably thread_create has a non-void return value, and + this way you're explicitly suppressing its return value (usually because + you don't want/need to care about it) + isn't the value discarded if the (void) is not there? 
+ yes, but depending on extra attributes and/or compiler warning + flags the compiler might warn that the return value is not used while it + ought to + the cast to void should suppress that + oh, okay, thanks for that pinotree + and yes you are right that thread_create actually does + return something + even if there would be no compiler message about that, adding + the explicit cast could mean "yes, i know the function does return + something, but i don't care about it" + ... as hint to other code readers + as a form of documentation then + also + + oh well, I am gonna ask and I hope someone will answer it: + In the Mach's dmesg (/var/log/dmesg) I can see that the version string + along with initial memory mapping information are printed twice, when in + fact they are supposed to be called only once. Is this a bug, or some + buffering error, or are they actually called twice for some reason? + + +# IRC, freenode, #hurd, 2013-07-29 + + guys is the evaluation today? + yes + right + where can we find the evaluation papers on melange? + wait untill 12pm UTC. + yeah, I just noticed thanks hacklu_ + nlightnfotis:) + + tschwinge: I only have one question regarding my project. If + I make some changes to libpthread, what's the best way to test them in + the hurd? Rebuild glibc with the updated libpthread? + NlightNFotis: Yes, you'll have to rebuild glibc. I have a + cheat sheet for that: + http://darnassus.sceen.net/~hurd-web/open_issues/glibc/debian/ + It may be that the »Run debian/rules patch to apply patches« + step is no longer encessary with the 2.17 glibc packages. + thanks for that tschwinge. :) + NlightNFotis: Sure. :-) + + NlightNFotis: Where's your weekly status? + I will write it today at the noon. I have written all the + other ones, and they are available at www.fotiskoutoulakis.com + the next one will be available there as well, later in the + day + Ack. But please try to finish your report before the meeting, + as discussed. + oh, forgive me for that. 
I thought it was ok to write my + report a day or so later. Sorry. + NlightNFotis: Please write your report as soon as possible -- + otherwise there's no useful way for me to know what your status is. + I will. This week I have been mostly going through the + various sources (the Hurd, Mach and libpthread, especially the last two) + in my attempt to get a better understanding for how libpthread + works. Since yesterday I have attempted some small changes on my + libpthread repo that I plan on testing and reporting on them. That's why + I still have not written my report. + NlightNFotis: Things don't need to be finished before you + report about them. It's often more useful to discuss issues *before* you + spend time on implementing them. + #hurd + NlightNFotis: what kind of changes do you want to add to + libpthread ? + Have a look at the asseriton failure, I would hope. :-) + well no + again, i did that + and it's not easy to fix + braunr: I was looking into ways that I could create the + thread pool you suggested into libpthread + no, don't + create it in your application + not in libpthread + well, this may not be an acceptable solution either .. + Before doing that we have to understand what exactly the Go + runtime is doing. It may just be a weird itneraction with the setcontext + et al. functions that I failed to think about when implementing these? + the other possibility is the go runtime libraries. But I + thought that libpthread might be a better idea, since you told me that + creation *and* destruction are buggy + braunr: you are right, the signal thread is always exist. I have + got a wrong understand before. + tschwinge: I can look into that, now. I will also include + that in my report. + NlightNFotis: i don't see how this is a relevant argument .. 
+ tschwinge: i'd suggest he first try with a custom pool in the go + runtime, so we exclude what you're suspecting + if this pool actually works around the issues NlightNFotis is + having, it will confirm the offending problem comes from libpthread + So, as a very first step make any thread + distruction/deallocation a no-op. + yes + braunr: I originally understood that a thread pool might + skip the thread's destruction, so that we escape the buggy part with the + thread's destruction. Since that was a problem with libpthread, it sure + affects other threads (instead of go's ) too. So I assumed that building + the thread pool into libpthread might help eliminate bugs that may affect + other code too. + no, it's not a proper fix + it's a work around + and i'm working on a proper fix in parallel + (when i have the time, that is :/) + oh, I see. So for the time, I had better not touch + libpthread, and take a look at the go run time aye? + NlightNFotis: Remember: one thing after the other. First + identify what is wrong exactly. Then think and discuss how to solve the + very specific issue. Then implement it. + as tschwinge said, make thread destruction a nop in go + see if that helps + NlightNFotis: For example, you surely have noticed (per your + last report), that basically all Go language test pass (aside from the + handful of those testing select, etc.) -- but all those of the libgo + runtime library fail, literally all of them. + You noticed they basically all fail with the same assertion + failure. But why do all the Go language ones work fine? + Don't they execute the program they built, for example? + (I haven't looked.) + they do execute the program. the language ones that fail + too, fail due to the assertion failure + Or, what else is different for them? How are they built, which + flags, how are they invoked. + how many goroutines ? + :p + Do you also get the assertion failure when you built a small Go + program yourself and run that one. 
+ Don't get the assertion failure? Then add some more complex + stuff that are likely to invole adding/re-using new threads, such as + goroutines. + I didn't get the assertion failure on a small test program, + but now that you suggest it it might be a good idea to build a custom + test suite + Etc. That way you'll eventually get an understanding what + triggers the assertion failure. + And that exeactly is the kind of analysis I'd like to read in + your weekly report. + A list of things what you have done, which assuptions you've + made, how that directed your further analysis, what results that gave, + etc. + I will do it. I will try to rush to finish it today before + you leave, so that you can inspect it. God I feel like all that time I + spent this week studying the particular source code (libpthread, and the + Mach) were in vain... + on second thoughts, it was not in vain. I got a pretty good + understanding of how these pieces of software work, but now I will have + to do something completely different. + Studying code is never in vain. + Exactly. + You must have had some motivation to study the code, so that + was surely a valid thing to do. + But we'd link to understand your reasoning, so that we can + support you and direct you accordingly. + but it's better to focus on your goals and determine an + appropriate course of actions, usually starting with good analysis + Yes. + s/link/like/? + pinotree: Indeed, thanks. + makes me remember when i implemented radix trees to replace splay + trees, only to realize splay trees were barely used .. + braunr: Yes. It has happened to all of us. ;-P + NlightNFotis: So, don't worry -- but learn from such things. + :-) + anyway, I will start right away with the courses of action + you suggested, and will try to have finished them by noon. Thanks for + your help, it really means a lot. 
+ In software generally, it is never a good idea to let you be + distracted, and don't follow your focus goal, because there are always so + many different things that could be improved/learned/fixed/etc. + tschwinge, I am only nervous about one thing: the fact that + I have not submitted yet any patch or some piece of code in general. Then + again, the summer of code for me so far has been 70-80% reading about + stuff I didn't know about and 30-20% doing the stuff I should know + about... + NlightNFotis: That's why we're here, to teach you something. + Which we're happy to do, but we all need to cooperate for that (and I'm + well aware that this is difficult if one is not in the same rooms, and + I'm also aware that my time is pretty limited). + NlightNFotis: We're also very aware that the Hurd system, as + any operating system project (if you're not just doing "superficial" + things) is difficult, and takes lots of time to learn, and have concepts + and things sink into your brain. + i wouldn't worry too much + We're also still learning every day. + go doesn't require a lot from the underlying system, but what is + required is critical + once you identify it, coding will be quick + tschwinge: braunr: thanks. I shall begin working following + the directions you gave to me. + NlightNFotis: So yes, because Google wants us to grade you + based on that, you'll eventually have to write some code, but for + example, a patch to disable thread distruction/deallocation in libgo + would definitely count as such code. And that seems like one of your + next steps. + tschwinge: i need to deliver that instantly, right? seeing + as the evaluation is today. + NlightNFotis: No. Deliver it when you have something to + deliver. :-) + tschwinge: I am nervous about the evaluation today. I have + not submitted a single piece of code, only some reports. How negatively + does this influence my performance report? 
+ NlightNFotis: If I can say so, in the evaluation today, Google + basically asks us mentors whether we want to fail our students right now. + Which I don'T plan to do, knowing about the complexity of the Hurd + system, and the learning required before you can do useful code changes. + tschwinge: that really means a lot to me, and it got a + weight of my chest. + uh ok, i have to be the rude guy again + NlightNFotis: the gsoc is also a way for the student to prepare + for working in software development communities + whether free software/open source and/or in companies + people involved care a lot less about pathos than actual results + (or to prepare students to be hired by google, but that's + another story) + NlightNFotis: in other words, stop apologizing that much, stop + focusing so much on that, and just work as you can + + +# IRC, freenode, #hurd, 2013-07-31 + + teythoon: both samuel and thomas would be missing for the + week right? + nlightnfotis: they do, why? + nlightnfotis: err, they do?? why? + + +# IRC, freenode, #hurd, 2013-08-01 + + braunr: I checked out what you (and Thomas) suggested and + did some research on go on the Hurd. I have found out that go works, + until you need to use anything that has to do with a goroutine. I am now + playing with the go runtime and checking to see if turning thread + destruction to noop will have any difference. + + +# IRC, freenode, #hurd, 2013-08-05 + + youpi: whenever you have time, I would like to report my + progress as well. + nlightnfotis: sure, go ahead + but again, you should report before the meeting + so we can read it before coming to the discussion + I have written my report + ah + nlightnfotis: I have read your report, these days you have make a + great progress. + where is it? + it was available since yesterday + + http://www.fotiskoutoulakis.com/blog/2013/08/05/gsoc-partial-week-7-report/ + thanks hacklu. 
The particular piece of code I was studying + was very very interesting :) + nlightnfotis: I think you should show your link in here or email + next time. I have spend a bit more time to find that :) + youpi: for a tldr, at the last time I was told to check + gccgo's runtime for clues regarding the go routine failures. + hacklu: will keep that in mind, thanks. + youpi: thing is, gccgo operates on two different thread + types: G's (the goroutines, lightweight threads that are managed by the + runtime) and M's (the "real" kernel threads") + none of which are really "destroyed" + ok, makes sense + G's are put in a pool of available goroutines when their + status is changed to "Gdead" so that they can be reused + M's also don't seem to go away. There is always at least one + M (the bootstrap one) and all other M's that get created are also stashed + in a pool of available working threads. + you could put some debugging printfs in libpthread, to make sure + whether threads do die or not + I am studying this further as we speak, but they both don't + seem to get "destroyed", so that we can be sure that bugs are triggered + by thread destruction + I was beginning to believe that maybe I was looking in the + wrong direction + but then I looked at my past findings, and I noticed + something else + if you take a look at the first failed go routine, it failed + at the time.sleep function, which puts a goroutine to sleep for ns + nanoseconds. That made me think if it was something that had to do with + the context functions and not the goroutines' creation. + nlightnfotis: that's possible + nlightnfotis: I'd say you can focus on this very simple example: a + mere sleep + that's one of the simplest things a thread scheduler has to do, but + it has to do it right + fixing that should fix a lot of other issues + if I have understood correctly, there is at least one G + (Goroutine) and at least one M (kernel thread) running. 
Sleep does put + that goroutine at a hold, and restarting it might be an issue + talking about thread scheduling ? :) + nlightnfotis: go's runtime doesn't actually destroy kernel threads, + apparently + youpi: yeah, that's what I have understood so far. And it + neither does destroy goroutines. If there was an issue with thread + creation, then I guess it should be triggered in the beginning of the + program too (seeing as both M's and G's are created there) + the fact that it is triggered when a goroutine goes to sleep + makes me suspect the context functions + yes + again I am studying it the last days, in search of + clues. Will keep you all updated. + braunr: I have written my report and it is available here + http://www.fotiskoutoulakis.com/blog/2013/08/05/gsoc-partial-week-7-report/ + If you could read it and tell me if you notice something weird tell me + so. + nlightnfotis: ok + nlightnfotis: quite busy here so don't worry if i suddenly + disappear + nlightnfotis: hum, does go implement its own threads ?? + braunr: yeah. It has 2 threads. Runtime managed (the + goroutines) and "real" (kernel managed) ones. + i mean, does it still use libpthread ? + thing is none of them "disappear" so as to explain the bug + with "thread creation **and** destruction) + it must use libpthread for kernel threads as far as creation + goes. + ok, good + then, it schedules its own threads inside one pthread, right ? + using the pthread as a virtual cpu + yes. It matches kernel threads and runtime threads and runs + the kernel threads in reality + the scheduler decides which goroutine will run on each + kernel thread. + ew + this is pretty much non portable + and you're right to suspect context switching functions + yeah my thought for it was the following: thread creation, + if it was buggy, should be triggered as soon as a program starts, seeing + as at least one kernel thread and at least one go routine starts. 
My + sleep experiment crashes when the goroutine is put on hold + did you find the code putting on hold ? + I will give you the exact link, wait a moment + braunr: + https://github.com/NlightNFotis/gcc/blob/master/libgo/runtime/time.goc?source=c#L59 + that is the exact location is line 26, which calls the one I + pointed you at + ahah, tsleep + old ghost from the past + nlightnfotis: the real location is probably runtime_park + I will check this out. + + may I ask something non-technical but relevant to summer of + code? + sure + would it be okay if I took the day off tomorrow? + nlightnfotis: ask tschwinge but i guess it's ok + + have you found runtime_park ? + i'm downloading your repository from github but it's slow :/ + braunr: not yet. Grepping through the files didn't produce + any meaningful results and github's search is not working + braunr: there is that strange thing with th gccgo sources, + where I can find a function's declaration but not it's definition. Funny + thing is those functions are not really extern, so I am playing a hide + and seek game, in which I am not always successful. + runtime_park is declared in runtime.h. I have looked nearly + everywhere for it. There is only one last place I have not looked at. + braunr: I found runtime_park. It's here: + https://github.com/NlightNFotis/gcc/blob/master/libgo/runtime/proc.c?source=c#L1372 + + nlightnfotis: Taking the day off is fine. Have fun! + tschwinge: I am still here; Thanks for that tschwinge. I + will be for the next half hour or something if you would like to ask me + anything + nlightnfotis: I have no immediate questions (first have to read + your report and discussion in here) -- so feel free to log out and enjoy + the sun outside. :-) + + nlightnfotis, tschwinge: btw, have you seen + http://morsmachine.dk/go-scheduler ? + teythoon: thanks for the link. It's really interesting. + + +# IRC, freenode, #hurd, 2013-08-12 + + teythoon did you manage to build the Hurd successfuly? 
+ ah yes, the Hurd is relatively easy + the libc is hard + debian glibc or hurd upstream libc? + but my build on darnassus was successful + *debian eglibc + well, I rebuilt the debian package with two tweaks + do you build on linux and rsync on hurd or ...? + I built it on Hurd, though I thought about setting up a cross + compiler + I see. The process was build Mach, build Hurd, and then + build glibc and it's ready or it needed more? + no, I never built Mach + I must admit I'm not sure about the "proper" procedure + if I change one of Hurds RPC definitions, I think the proper way + is to rebuild the libc against the new definitions and then the Hurd + but I found no way to do that, so everyone seems to build the + Hurd, install it, build the libc and then rebuild the Hurd again + I see. Thanks for that :) + + tschwinge, I have also written my report! It's available + here + http://www.fotiskoutoulakis.com/blog/2013/08/12/gsoc-week-8-partial-report/ + I can sum it up if you want me to. + nlightnfotis: I already read it! :-D + Oh, I didn't. I read the week 7 one. Let me read week 8. ;-) + ok. I am currently going through the assembly generated for + the sample program I have embedded my report. + the weird thing is that the assembly generated is pretty + much the same for the program with 1 and 2 goroutine functions (with the + obvious difference that the one with 2 goroutine functions has 1 more + goroutine in it's assembly code) + I can not understand why it is that when I have 1 goroutine, + an exception is triggered, but when I am having two (which are 99% + identical) it seems to be executed. + and I do not understand why the exception is triggered when + I manually use a goroutine. + To my understanding so far, there is at least 1 (kernel) + thread created at program startup to run main. The same thread gets + created to run a new goroutine (goroutines get associated with kernel + threads) + and it's obvious from the assembly generated. 
+ go_init_main (the main function for go programs) starts with + a .cfi_startproc + the same piece of code (.cfi_startproc) starts a new kernel + thread (on which a goroutine runs) + nlightnfotis: Re your two-goroutines example: in that case I + assume you're directly returning from the main function and the program + terminates normally. ;-) + nlightnfotis: Studying the assembly code for this will be too + verbose, too low-level. What we need is a trace of steps that happen + until the error. + tschwinge, that must be it, but it should trigger the bug, + since it still has at least one goroutine (and one is known to trigger + the bug) + nlightnfotis: I guess the program exits before the first + goroutine would be scheduled for execution. + the assembly for the goroutines is identical. You can't tell + one from the other. The only change is that it has 2 of these sections + instead of one + actually it's the same for the first one + nlightnfotis: I very much assume that the issue is not due to + the code generated by the Go compiler (which you're seeing in the + assembly code), but rather due to the runtime code in the libgo library. + I didn't think of it this way. + ... that improperly interacts with our libpthread. + so my research should focus on the runtime from now on? + Improperly may well imply that our libpthread is at fault, of + course, as we discussed. + Back to the one-goroutine case (that shows the assertion + failure). Simple case: one goroutine, plus the "main" thread. + We need to get an understanding of the steps that happen until + the error happens. + As this is a parallel problem, and it is involving "advanced" + things (such as setcontext), I would not trust GDB too much when used on + this code. + I will have to manually step through the source myself, + right? + What I would do, is add printf's (or similar) into the code at + critical points, to get an understanding of what's going on. 
+ Such critical points are: pthread_create, setcontext, + swapcontext. + It sounds like a good idea. Anything else to note? + That way, you can isolate the steps required to trigger the + assertion failure. + For example, it could be something like: makecontext, + swapcontext, pthread_create, boom. + pthread_create_internal is failing at an assertion. I wonder + what would happen if I remove that assertion. + Not without understanding what the error is, and why it is + happening (which steps lead to it). We don't usually do »voodoo + computing and programming by coincidence«. + tschwinge, I also figured out something. If it is a + libpthread issue, it should also get triggered when a simple C program + creates a thread (assuming _pthread_create is causing the issue) + so maybe I should write a C program to test that + functionality and see if it provides any further clues? + nlightnfotis: That's precisely what the goal of »isolate the + steps required to trigger the assertion failure« is about: reduce the big + libgo code to a few function calls required to reproduce the problem. + nlightnfotis: A simple C program just doing pthread_create + evidently does not fail. + nlightnfotis: I assume you have a Go program dynamically linked + to the libgo you built? + yes. To the latest go build from the source (4.9) + *gccgo build from source + removing an assertion is usually extremely bad practice + Then you can just do something like make target-libgo (IIRC) + (or instead: cd i686-pc-gnu/libgo/ && make) to rebuild your changed + libgo, and then re-run the Go program. + the thought of randomly removing assertions shouldn't even reach + your mind ! + braunr: even if it is not permanent, but an experiment? + yes + can you explain to me why? + nlightnfotis: Not without understanding what the + error is, and why it is happening (which steps lead to it). We don't + usually do »voodoo computing and programming by coincidence«. 
+ an assertion exists to make sure something that should *never* + happen never happens + removing it allows such events to silently occur + braunr: that's the theory, yes, to check invariants + i dont' know what you mean by using assertions for "an experiment" + unfortunately some people use assert for error handling :/ + that's wrong + and i dont't remember it to be the case in libpthread + nlightnfotis: can you point the faulting assertion again there + please ? + braunr: sure: Assertion `({ mach_port_t ktid = + __mach_thread_self (); int ok = thread->kernel_thread == ktid; + __mach_port_deallocate ((__mach_task_self + 0), ktid); ok; + })' failed. + so basically, thread->kernel_thread != __mach_thread_self() + this code is run only for num_threads == 1 + but has there been any thread destruction before ? + no. To my understanding kernel threads in the go runtime + never get destroyed (comments seem to support that) + IOW: is it certain the only thread left *is* the main thread ? + hm + intuitively, i'd say this is wrong + i'd say go doesn't destroy threads in most cases, but something in + the go runtime must have done it already + i'm not even sure the main thread still exists + check that + where is the go code you're working on ? + there are 3 files of interest + i'd like the whole sources please + I will find it in a moment + braunr: GCC Git clone, tschwinge/t/hurd/go branch. + it is /libgo/runtime/runtime.h + it is /libgo/runtime/proc.c + tschwinge: thanks + braunr: git://gcc.gnu.org/git/gcc.git + I will provide links on github + nlightnfotis: i sayd the whole sources, why do you insist on + giving me separate files ? + for checking it out quickly + oh I misunderstood that sorry + thought you wanted to check out thread creation and + destruction and that you were interested only in those specific files + tschwinge: is it completely contained there or are there external + libraries ? + braunr: You mean libgo? 
+ tschwinge: possibly + tschwinge, I just made sure that yeah programs are + dynamically linked against the compiler's libgo + libgo.so.3 + does libgo come from gcc sources ? + yeah + ok + go files on gcc sources are split under two directories: go, + which contains the frontend go, and libgo which contains the libraries + and the runtime code + braunr: darnassus:~tschwinge/tmp/gcc/go.build/ is a recent + build, with sources in $PWD/../go/. + braunr: libgo is in i686-unknown-gnu0.3/libgo/.libs/ + so tschwinge to roundup for this week I should print debug + around the "hotspots" and see if I can extract more information about + where the specific problem is triggered right? + nlightnfotis: Yes, for a start. + nlightnfotis: identify the main thread, make sure it doesn't exit + noted. + braunr: do you have an idea about the issue I described + earlier? The one with the 1 goroutine triggering the bug, but the 2 + exiting successfully but with no output? + nlightnfotis: i didn't read + do you have 2 mins to read my report? I describe the issue + something messed up in the context i suppose + nlightnfotis: Uhm, I already explained that issue? + you did ? + tschwinge, I know, don't worry. I am trying to get all the + insight I can get. + you mentioned that the scheduler might have an issue and + that the main thread returns before the goroutines execu + *execute + right? + It is the normal thing for a process to terminate normally when + the main function returns. I would expect Go to behave the same way. + "Now, if we change one of the say functions inside main to a + goroutine, this happens" + how do you change it ? + Or am I confused? + tschwinge: i don't remember exactly + braunr: from say("world") to go say("world") + tschwinge, yeah I get that. What I still have not understood + is what is it specifically about the 2 goroutines that doesn't trigger + the issu when 1 goroutine does. 
+ You said that it might have something to do with the + scheduler; it does seem like a good explanation to me + nlightnfotis: My understanding still is that the goroutines + don't get executed before the main thread exits. + which scheduler ? + braunr: the runtime (go) scheduler. + tschwinge, Yeah, they don't. But still, with 1 goroutine: + you get into main, attempt to execute it, and bam! With two, it should be + the same, but strangely it seems to exit main without an issue + (attempt to execute the goroutine) + why should it be the same ? + braunr: seeing as one goroutine has problems, I can't see + why two wouldn't. At least one of the two should result in an exception. + nlightnfotis: why ? + nlightnfotis: they do have the problem + they don't run + they just don't run into that assertion, probably because there is + more than one thread + wait a minute. You imply that they fail silently? But still + end up in the same situation + yes + in which case it does look like a go scheduler problem + if I understood it correctly, that assertion fails when there + is only 1 thread? + yes + and since the main thread is always correct, i expect the main + thread has exited + and this happens because the one thread left is *not* the main + thread + (which is a libpthread bug) + but it's a bug we've not seen because we don't have applications + creating threads while exiting + I think I got it now. + try to put something like getchar() in your go program + something that introduces a break + so that the main thread doesn't exit + oh right. Thanks for that. And sorry tschwinge I reread what + you said, it seems I had misinterpreted what you suggested. + braunr: If you're interested: for a Go program triggering the + assertion, I don't see any thread exiting (see + darnassus:~tschwinge/tmp/gcc/a.go, run: cd ~tschwinge/tmp/gcc/go.build/ + && ./a.out) -- but perhaps I've been looking for the wrong things in l_. + File l is without a goroutine. Have to leave now, sorry. 
+ braunr: If you want to rebuild: gcc/gccgo -B gcc -B + i686-unknown-gnu0.3/libgo ../a.go -Li686-unknown-gnu0.3/libgo/.libs + -Wl,-rpath,i686-unknown-gnu0.3/libgo/.libs + tschwinge: no i won't touch anything + but thanks + + +# IRC, freenode, #hurd, 2013-08-19 + + nlightnfotis: how are you going with gcc go? + I was print debugging all the week. + I can tell you I haven't noticed anything weird so far. + But I feel I am close to the solution + I have not written my report yet. + I will write it maximum until wednesday + I hope I will have figured it all out until then + a report is not for writing solutions, but for the progress + yes + it's completely fine to be saying "I've been debugging, not found + anything yet" + results or not, always write your reports on time, so your + mentor(s) know what you are doing + I see. Would you like me to write it right now, or is it + okay to write it a day or two later? + nlightnfotis: FYI. this week my report is not finished. just + state some problem I face now. + nlightnfotis: I'd say better write it now + youpi: Ok I will write it and tell you when I am done with + it. + youpi: here is my partial report describing what my course + of action looked like this + week. http://www.fotiskoutoulakis.com/blog/2013/08/19/gsoc-week-9-partial-report/ + of course, I will write in a day or two (hopefully having + figured out the whole situation) an exhaustive report describing + everything I did in detail + youpi: I have written my (partial) report describing how I + went about this week + http://www.fotiskoutoulakis.com/blog/2013/08/19/gsoc-week-9-partial-report/ + nlightnfotis: good, thanks! + youpi: please note that this is not an exhaustive link of my + findings or course of action, it merely acts as an example to demonstrate + the way I think and how I go about every day. + I will write an exhaustive report of everything I did so + far, when I figure out what the issue is, and I feel I am close. 
+ well, you don't need to explain all bits in details + this is fine to show an example of how you went + but please also provide a summary of your other findings + oh okay, I will keep this in mind. :) + + +# IRC, freenode, #hurd, 2013-08-22 + + < nlightnfotis> if I want to rebuild libpthread, I have to embed it into + eglibc's source, then build? + < pinotree> or pick the debian sources, patch libpthread there and rebuild + < nlightnfotis> that's most likely what I am going to do. Thanks pinotree. + < pinotree> yw + < braunr> nlightnfotis: i usually add my patches on top of the debian glibc + ones, yes + < braunr> it requires some tweaking + < braunr> but it's probably the easiest way + < nlightnfotis> braunr: I was studying my issues with gcc, and everyday I + was getting more and more confident it must be a libpthread issue + < nlightnfotis> and I figured out, that I might wanna play with libpthread + this time + < braunr> it probably is but + < braunr> i'm not so sure you should dive there + < nlightnfotis> why not? + < braunr> because it can be worked around in go + < braunr> i had a test for you last time + < braunr> do you remember what it was ? + < nlightnfotis> nope :/ care to remind it? + < braunr> iirc, it was running the go test you did but with an additional + instruction in the main function, that pauses + < braunr> something like getchar() in c + < braunr> to make sure main doesn't exit while the goroutines are still + running + < braunr> i'm almost positive that the bug you're seeing is main returning + and libpthread beleiving it's acting on the main thread because there is + only one left + < nlightnfotis> oh that's easy, I can do it now. But it's probably what + thomas had suggested: go routines may not be running at all. 
+ < braunr> they probably aren't + < braunr> and that's a context bug + < braunr> not a libpthread bug + < braunr> and that's what you should focus on + < braunr> the libpthread bug is minor + < nlightnfotis> which is strange, because I had studied the assembly code + and it the code for the goroutine was there + < nlightnfotis> anyway I will proceed with what you suggested + < braunr> yes please + < braunr> that's becoming important + < nlightnfotis> would you mind me dumping some of my findings for you to + evaluate/ post on opinion on? + < braunr> no + < braunr> please do so + < nlightnfotis> I have found that the go runtime starts with a total number + of threads == 1 + < braunr> nlightnfotis: as all processes + < nlightnfotis> I would guess that's because of using fork () + < nlightnfotis> oh so it's ok + < braunr> there always is a main thread + < braunr> even for non-threaded applications + < nlightnfotis> yeah, that I know. The runtime proceeds to create + immediately one more. + < braunr> then it's 2 + < nlightnfotis> and that's ok, it doesn't have an issue with that + < nlightnfotis> yep + < nlightnfotis> the issue begins when it tries to create the 3rd one + < braunr> hum + < braunr> from what i remember + < nlightnfotis> it happily goes through the go runtime's kernel thread + allocation function (runtime_newm()) + < braunr> you also had an issue with the first goroutine + < nlightnfotis> that's with 1 go routine + < braunr> ok + < braunr> so 1 goroutine == 3 threads + < nlightnfotis> it seems so yes. 
+ < braunr> depending on how the go scheduler is able to assign goroutines to + kernel threads i suppose + < nlightnfotis> mind you, (disclaimer: I am not so sure about that) that go + must be using one extra thread for the runtime scheduler and garbage + collector + < braunr> that's ok + < nlightnfotis> so that's where the two come from + < braunr> and expected from a modern runtime + < nlightnfotis> the third must be the go routime + < nlightnfotis> routine + < braunr> hum have to go + < braunr> brb in a few minutes + < braunr> keep posting + < nlightnfotis> it's ok take your time + < nlightnfotis> I will be here + < braunr> but i may not ;p + < braunr> in fact i will not + < braunr> i have like 15 mins ;) + < braunr> nlightnfotis: ^ + < nlightnfotis> I am trying what you told me to do with go + < nlightnfotis> it's ok if you have to go, I will continue investigating + and be back tomorrow + < braunr> ok + < nlightnfotis> braunr: I tried what you asked me to do, both we waiting to + read a string from stdin and with waiting to read an int from stdin + < nlightnfotis> it never waits, it still aborts with the assertion failure + < nlightnfotis> both with one and two go routines + < nlightnfotis> dumping it here just for the log, running the same code + without waiting for input results in two threads created (1 for main and + 1 for runtime, most likely) and "normal" execution. + < nlightnfotis> normal as in no assertion failure, + < nlightnfotis> it seems to skip the goroutines altogether + + +# IRC, freenode, #hurd, 2013-08-23 + + < braunr> nlightnfotis: can i see your last go test code please ? 
the one + with the read at the end of main + < nlightnfotis> braunr sure + < nlightnfotis> sorry I had gone to the toilet, now I am back + < nlightnfotis> I will send it right now + < nlightnfotis> braunr: http://pastebin.com/DVg3FipE + < nlightnfotis> it crashes when it attempts to create the 3rd thread (the + 1st goroutine), with the assertion fail + < nlightnfotis> if you remove the Scanf it will not fail, return 0, but + only create 2 threads (skip the goroutines alltogether) + < braunr> can you add a print right before main exits please ? + < braunr> so we know when it does + < nlightnfotis> doing it now + < nlightnfotis> braunr: If I enter a print statement right before main + exits, the assertion failure is triggered. If I remove it, it still runs + and creates only 2 threads. + < braunr> i don't understand + < braunr> 14:42 < nlightnfotis> it crashes when it attempts to create the + 3rd thread (the 1st goroutine), with the assertion fail + < braunr> why don't you get that ? + < nlightnfotis> This seems like having to do with the runtime. I mean, I + have seen the emitted assembly from the compiler, and the goroutines are + there. Something in the runtime must be skipping them + < braunr> context switching seems buggy + < nlightnfotis> if it's only goroutines in main + < nlightnfotis> if there's also something else in main, the assertion + failure is triggered. + < braunr> i want you to add a printf right before main exits, from the code + you pasted + < nlightnfotis> I did. It acts the same as before. + < braunr> do you see that last printf ? + < nlightnfotis> no. 
It aborts before that + < nlightnfotis> :q + < braunr> find a way to make sure the output buffer is flushed + < braunr> i don't know how it's done in go + < nlightnfotis> mistype the :q, was supposed to do it vim + < nlightnfotis> braunr will do right away + < nlightnfotis> there is one thing I still can not understand: Why is it + that two threads are ok, but when the next is going to get created, the + assertion is triggered. + < braunr> nlightnfotis: the assertion is triggered because a thread is + being created while there is only one thread left, and this thread isn't + the main thread + < braunr> so basically, the main thread has exited, and another (the last + one) is trying to create one + < nlightnfotis> the other one might be the runtime I guess. Let me check + out quickly what you suggested + < braunr> the main thread shouldn't exit at all + < braunr> so something with context switching is wrong + < nlightnfotis> the thing is: it doesn't seem to exit when this happens. My + debug statements (in the runtime) suggest that there are at least 2 + threads active, kernel threads don't get destroyed in gccgo + < braunr> 14:52 < braunr> so something with context switching is wrong + < braunr> how well have the context switching functions been tested ? + < nlightnfotis> to be honest I have not tested them; up until this point I + trusted they worked. Should I also take a look at them? + < braunr> how can you trust them ? + < braunr> they've never been used .. + < braunr> thomas added them recently if i'm right + < braunr> nothing has been using them except go + < braunr> piece of advice: don't trust anything + < nlightnfotis> I think they were in before, and thomas recently patched + them! 
+ < braunr> they were in, but didn't work + < braunr> (if i'm right) + < braunr> nlightnfotis: you could patch libpthread to monitor the number of + threads + < braunr> or the go runtime, idk + < nlightnfotis> I have done so on the go runtime + < nlightnfotis> that's where I am getting the number of threads I + report. That's straight out from the scheduler's count. + < braunr> threads can exit by calling pthread_exit() or returning from the + thread routine + < braunr> make sure you catch both + < braunr> also check for pthread_cancel(), although i don't expect any in + go + < nlightnfotis> braunr: Should I really do that? I mean, from what I can + see in gccgo's comments, Kernel threads (m) never go away. They are added + to a pool of m's waiting for work if there is no goroutine running on + them + < nlightnfotis> I mean, I am not so sure they exit at all + < braunr> be sure + < braunr> point me the code please + < nlightnfotis> + https://github.com/NlightNFotis/gcc/blob/master/libgo/runtime/proc.c#L224 + < nlightnfotis> this is where it get's stated that m's never go away + < nlightnfotis> and at line 257 you can see the pool + < nlightnfotis> and wait for me to find the code that actually releases an + and places into the pool + < nlightnfotis> yep found it + < nlightnfotis> line 817 mput + < nlightnfotis> puts a kernel thread given as parameter to the pool + < nlightnfotis> another proof of the theory is at line 1177. It states: + "This point is never reached, because scheduler does not release os + threads at the moment." + < braunr> fetching git repository, bit busy, i'll have a look in 5-10 mins + < nlightnfotis> oh it's ok, I had pointed you to the file directly on + github to check it out instantly, but never mind, the file is + /libgo/runtime/proc.c + < braunr> damn github is so slow .. + < braunr> nlightnfotis: i much prefer my own text interface :) + < nlightnfotis> braunr: just out of curiosity what's your setup? 
I use vim + mainly (not that I am a vim expert or anything, I only know the basics, + but I love it) + < braunr> same + < braunr> nlightnfotis: add a trace at that comment to make SURE threads do + not exit + < braunr> you *cannot* get the libpthread assertion with more than 1 thread + < braunr> grep for pthread_exit() too + < nlightnfotis> will do it now. It will take about an hour to compile + though. + < braunr> i don't understand the stack trick at the start of runtime_mstart + < braunr> ah splitstack .. + < nlightnfotis> I think I should try cross compiling gcc, and then move + files on the hurd. It would be so much faster I believe. + < braunr> than what ? + < nlightnfotis> building gcc on the hurd + < nlightnfotis> I remember it taking about 10minutes with make -j4 on the + host + < nlightnfotis> it takes 45-50 minutes on the vm (kvm enabled) + < braunr> but you can merely rebuild the files you've changed + < nlightnfotis> I feel stupid now... + < braunr> nlightnfotis: have you tried setting GOMAXPROCS to 1 ? + < nlightnfotis> not really, but from what I know GOMAXPROCS defaults to 1 + if not set + < braunr> again, check that + < braunr> take the habit of checking things + < nlightnfotis> braunr: yeah sorry for that. 
I have checked these things + out before they don't come out of my head I just don't remember exactly + where I had seen this + < braunr> what you can also do is use gdb to catch the assertion and check + the number of threads at that time, as well as the number of threads as + seen by libpthread + < nlightnfotis> braunr: line 492 file proc.c: runtime_gomaxprocs = 1; + < braunr> also see runtime.LockOSThread + < braunr> to make sure the main thread is locked to its own pthread + < nlightnfotis> I can see in line 529 of the same file that the first + thread is getting locked + < nlightnfotis> the new threads that get initialised are non main threads + < braunr> if(!runtime_sched.lockmain) runtime_UnlockOSThread(); + < braunr> i'm suggesting you set runtime_sched.lockmain + < braunr> so it remains true for the whole execution + < braunr> this code looks like a revamp of plan9 lol + < nlightnfotis> it is + < nlightnfotis> in the paper from Ian Lance Taylor describing gccgo he + states somewhere that the original go compilers (the 3gs) are a modified + version of plan9's C compiler, and that gccgo tries to follow them + < nlightnfotis> they differ in a lot of ways though + < nlightnfotis> the 3gs generate a lot of code during link time + < nlightnfotis> gccgo follows the standard gcc procedures + < braunr> eh :D + < nlightnfotis> go -> gogo -> generic -> gimple -> rtl -> object + < nlightnfotis> that's how it flows as far as I recall + < nlightnfotis> gogo is an internal representation of go's structure inside + the gccgo frontend + < nlightnfotis> that's why you see many functions with gogo in their name + < nlightnfotis> I just revisited the paper: gogo is there to make it easy + to implement whatever analysis might seem desirable. It mirrors however + the Go source code read from the input files + < braunr> nlightnfotis: what are you trying now ? 
+ < nlightnfotis> I am basically studying the runtime's source code while + waiting for gccgo to compile on the Hurd + < nlightnfotis> yes I did the stupid whole recompilation again. :/ + < braunr> nlightnfotis: compile for what ? + < braunr> what test ? + < nlightnfotis> to check out to see if M's really are added to the pool + instead of getting deleted + < braunr> nlightnfotis: but how ? + < nlightnfotis> braunr: I have added a statement in mput if we get there + first, and secondly the number of threads that the runtime scheduler + knows that are waiting (are in the pool of m's waiting for work) + < braunr> ok + < braunr> when you can, i'd really like you to do this test : + < braunr> 15:55 < braunr> what you can also do is use gdb to catch the + assertion and check the number of threads at that time, as well as the + number of threads as seen by libpthread + < nlightnfotis> the number of threads required by libpthread is gonna need + me to recompile the whole eglibc right? + < braunr> no + < braunr> just print it with gdb + < nlightnfotis> oh, ok + < braunr> it's __pthread_num_threads + < nlightnfotis> is gdb reliable? I remember thomas telling me that I can't + trust gdb at this point in time + < braunr> and also __pthread_total + < braunr> really ? + < braunr> i don't see why not :/ + < braunr> youpi: any idea about what nlightnfotis is speaking of ? + < nlightnfotis> I may have misunderstood it; don't take it by heart + < nlightnfotis> I don't wanna put words in other people's mouths because I + misunderstood something + < braunr> sure + < braunr> that's my habit to check things + < youpi> braunr: nope + < braunr> youpi: and am i right when i say we don't use context functions + on the hurd, and they're likely to be incomplete, even with the recent + changes from thomas ? 
+ < braunr> (mcontext, ucontext) + < nlightnfotis> braunr: this is what had been said: 08:46:30< tschwinge> As + this is a parallel problem, and it is involving "advanced" things (such + as setcontext), I would not trust GDB too much when used on this code. + < pinotree> if thomas' changes were complete and polished, i guess he would + have sent them upstream already + < braunr> i see but + < braunr> you can normally trust gdb for global variables + < nlightnfotis> Didn't post it as an objection; I posted it because I felt + bad putting the wrong words in other people's mouths, as I said + before. So I posted his original comment which was more authoritative + than my interpretation of it + < braunr> i wonder if there is a tunable to strictly map one thread to one + goroutine + < braunr> nlightnfotis: more focus on the work, less on the rest please + < nlightnfotis> Did I do something wrong? + < braunr> you waste too much time apologizing + < braunr> for no reason + < braunr> nlightnfotis: i suppose you don't use splitstack, right ? + < nlightnfotis> no I didn't + < nlightnfotis> and here's something interesting: The code I just added, in + mput, to see if threads are added in the pool. It's not there, no matter + what I run + < nlightnfotis> So it seems that the runtime is not reaching mput. + < nlightnfotis> Could this be normal behavior? I mean, on process + termination just release the resources so mput is skipped? + < braunr> i don't know the code well enough to answer that + < braunr> check closer to the lower interface + + +# IRC, freenode, #hurd, 2013-08-25 + + < nlightnfotis> braunr: what is initcontext supposed to be doing? + < braunr> nlightnfotis: didn't look + < braunr> i'll take a look later + < nlightnfotis> braunr: I am baffled by it. It seems to be doing nothing on + the Hurd branch and nothing in the Linux branch either. Why call a + function that does nothing? 
(it doesn't only seem to do nothing, I have + confirmed it) + < nlightnfotis> youpi: I was wondering if you could explain me + something. What is the initcontext function supposed to be doing? + < youpi> you mean initcontext ? + < nlightnfotis> yes + < youpi> ergl + < youpi> you mean makecontext? + < nlightnfotis> no initcontext. I am faced with this in the goruntime. It's + called in it, but it is doing nothing. Neither in the Hurd tree, nor in + the Linux one + < youpi> I don't know what initcontext is + < youpi> where do you read it? + < nlightnfotis> youpi: let me show you + < nlightnfotis> + https://github.com/NlightNFotis/gcc/blob/fotisk/goruntime_hurd/libgo/runtime/proc.c#L80 + < nlightnfotis> and it is called in quite a few places + < youpi> it's not doing nothing, see other implementations + < pinotree> if SETCONTEXT_CLOBBERS_TLS is not defined, initcontext and + fixcontext do nothing + < pinotree> otherwise (presuming if setcontext clobbers tls) there are two + implementations for solaris/x86_64 and netbsd + < youpi> I don't think we have the tls clobber bug + < youpi> so these functions being empty is completely fine + < nlightnfotis> pinotree: oh, you mean it's used as a workaround for these + two systems only? + < youpi> yes + < pinotree> yes + < nlightnfotis> That makes sense. Thanks both of you for the help :) + < nlightnfotis> youpi: if this counts as some progress, I have traced the + exact bootstrapping sequence of a new go process. I know a good deal of + what is done from it's spawn to it's end. There are some things I wanna + sort out, and later tonight I will write my report for it to be ready for + tomorrow. 
+ < youpi> good + + +# IRC, freenode, #hurd, 2013-08-26 + + < nlightnfotis> Hi everyone, my report is here + http://www.fotiskoutoulakis.com/blog/2013/08/26/gsoc-week-10-report/ + < youpi> nlightnfotis: you should clearly put printfs inside libpthread + < youpi> to check what is happening with the ktids + < nlightnfotis> youpi: yep, that's my next course of action. I just want to + spend some more time in the go runtime to make sure that I understand the + flow perfectly, and to make sure that it is not the runtime's fault + < braunr> nlightnfotis: did you try gdb to print the number of threads ? + < youpi> nlightnfotis: to build it, the easiest way is to start building + eglibc, and when you see it compiling C files (i.e. run i486-gnu-gcc-4.7 + etc.) + < youpi> stop it + < youpi> and go into build/hurd-i386-libc, and run "make others" from there + < nlightnfotis> braunr: that was my plan for today or tomorrow :) + < braunr> start building *debian* glibc + < youpi> there's perhaps some way to only build libpthread, but I don't + remember + < braunr> nlightnfotis: ok + < braunr> youpi: i suggested he tried gdb first + < youpi> why not + < braunr> if you need quick glibc builds, you can use darnassus + < nlightnfotis> braunr: how much time on average should I expect it to + take? + < youpi> it highly depends on the machine + < youpi> it can be hours + < youpi> or a few minutes + < youpi> depending you already have a built tree, a fast disk, etc. + < braunr> make lib others on darnassus takes around 30 minutes + < braunr> a complete dpkg-buildpackage from fresh sources takes 5-6 hours + < braunr> make others from a built tree is very quick + < braunr> a few minutes at most + < braunr> nlightnfotis: i don't see any trace of thread exiting in your + report, is that normal ? 
+ < nlightnfotis> yeah, I guess, since they don't exit prematurely, they are + released along with other resources at the process' exit + < braunr> i'll rephrase + < braunr> you said last time that you saw a function never got called + < braunr> i assumed it was because a thread exited prematurely + < nlightnfotis> oh I sorted it out with the help of youpi and pinotree + yesterday + < braunr> that's different + < braunr> i'm not talking about the function that does nothing + < braunr> i'm talking about the one never called + < nlightnfotis> oh, go on then, + < braunr> i don't remember its name + < braunr> anyway + < nlightnfotis> abort()? + < braunr> i hope abort doesn't get called :) + < nlightnfotis> it doesn't + < braunr> i thought it was the one right before + < braunr> what i mean is + < nlightnfotis> oh runtime_mstart, it does get called + < braunr> add traces at thread exit points + < nlightnfotis> I sorted it out too + < braunr> make *sure* threads don't exit + < nlightnfotis> it get's called to start the kernel thread created at + process spawn at the runtime_schedinit + < braunr> if they really don't, it's probably a context/tls issue + < nlightnfotis> I will do this right now. + < nlightnfotis> braunr: if it's a context/tls issue it's libpthread's + problem? + + +# IRC, freenode, #hurd, 2013-09-02 + + Hello! My report for this week is online: + http://www.fotiskoutoulakis.com/blog/2013/09/02/gsoc-week-11-report/ + nlightnfotis: there always is a signal thread in every hurd + program + nlightnfotis: i also pointed out that there are two variables + involved in counting threads in libpthread, the other one being + __pthread_num_threads + again, more attention to work and details, less showmanship + i'm tired of repeating it + nlightnfotis: doesn't backtrace work in gdb to tell you what + 0x01da48ec is? + also, do you have libc0.3-dbg installed? + braunr: __pthread_num_threads reports is 4. + then why isn't it in your report ? 
+ it's acceptable that you overlook it + and youpi: yeah I have got the backtrace, but 0x01da48ec is + ?? () from /lib/i386-gnu/libc.so.0.3 + it's NOT when someone else has previously mentioned it to you + nlightnfotis: only that line, no other line? + it has 8 more youpi, the one after ?? is mach_msg () + from /lib/i386-gnu/libc.so.0.3 + yes mach_msg + almost everything ends up in mach_msg + you should probably pastebin somewhere the output of thread apply + all bt + what's before that ? + braunr: I don't know how I even missed it. I skimmed through + the code and only found __pthread_total and assumed that it was the total + number of threads + nlightnfotis: i don't know either + take notes + before mach_msg is __pthread_timedblock () from + /lib/i386-gnu/libpthread.so.0.3 + I will add it to pastebin in a second + i find it very disappointing that after several weeks blocking on + this, despite all the pointers you've been given, you still haven't made + enough progress to reach the context switching functions + last week, most progress was made when we talked together + then nothing + it seems that you disappear, apparently searching on your own + but for far too long + braunr: I do search on my own, yes, + almost like exploiting being blocked not to make progress on + purpose ... + but too much + braunr: I am not doing this on purpose, I believe you are + unfair to me. I am trying to make as much progress as I can alone, and + reach out only when I can't do much more alone + then why is it only now that we get replies to questions such as + "how much is __pthread_num_threads" ? + why do you stop discussions for almost a week, just to find + yourself blocked again ? 
+ I was working on gcc, going through the runtime making sure + about assumptions and going through various other goroutine or not + programs through gdb + that doesn't take a week + clearly not + last time we talked was + 10:40 < nlightnfotis> braunr: if it's a context/tls issue it's + libpthread's problem? + it did for me... honestly, what is it you believe I am doing + wrong? I too am frustrated by my lack of progress, but I am doing my best + august 26 + yeah, I wanted to make sure about certain assumptions on the + gcc side. I don't want to start hacking on libpthread only to see that it + might have been something I missed on the gcc side + i told you + it's probably not a libpthread issue + the assertion is + but it's minor + it's not the real problem, only a side effect + i told you about __pthread_num_threads, why didn't you look at it + ? + i told you about context switching functions, why nothing about it + ? + doing a few printfs to check numbers and using gdb to check them + at break points should be quick + when we talked, we had the results in a few minutes + yeah, because I was guided, and that helped me target my + research. On my own things are quite different. I find out something + about gcc's behavior, then find out I need tons more information, and I + have a lot of things that I need to research to confirm any assumptions + from my side + how did you miss the signal thread ? + we even talked about it right here with hacklu + i'll say it again + if blocked more than one day, ask for help + 2 days minimum each time is just too long + I'm sorry. I will be online every day from now on and report + every 10 minutes, on my course of actions. + I recognise that time is of the essence at this point in + time + it's also NO + NO + *SIGH* + nlightnfotis: calm down. braunr just wants to help you solve + the problem quickly. 
+ 10 minutes is the other extreme + nlightnfotis: in my experience, if something blocks me, I will + keep asking him until I solve the problem. + it's also very frustrating to see you answer questions quickly + when you're here, then wait days for unanswered questions that could have + taken little time if you kept being here + this just gives the impression that you're doing something else in + parallel that keeps you busy + and comforts me in believing you're not being serious enough + about it + yeah, I understand that it gives that impression. The only + thing I can tell you now, is that I am *not* doing something else in + parallel. I am only trying to demonstrate some progress alone, and when + working alone things for me take quite some more time than when I am + guided + hacklu: i'm actually the nervous one here + braunr: ok, I understand I have disappointed you. What would + you suggest I do from now on? + braunr: :) + manage your time correctly or you'll fail + i'm not the main mentor of this project so it's not for me to + decide + but if i were, and if i had to wait again for several days before + any notice of progress or blocking, i wouldn't even wait for the end of + the gsoc + you're confronted with difficult issues + tls, context switching, thread + ing + they're all complicated + unless you're very experienced and/or gifted, don't assume you can + solve it on your own + and the biggest concern for me is that it's not even the main + focus of your project + you should be working on go + on porting + any side issues should be solved as quickly as possible + and we're now in september ... + go is working quite alright. It's goroutines that have + issues. + nlightnfotis: same thing + goroutines are part of go as far as i'm concerned + and they're working too, something in the hurd isn't + so it's a side issue + you're very much entitled to ask as much help as you need for side + issues + and i strongly feel you didn't + yeah, you're right. 
I failed on that aspect, mainly because + of the way I work. I wanted to show some progress on my own, and not be + here and spam all day. I felt that spamming questions all day would + demonstrate incompetence from my side + and I wanted to show that I am capable of solving my + problems on my own. + well, in a sense it does, but that's not the skills we were + expecting from you so it's perfectly ok + nlightnfotis: no development group, even in companies, in their + right mind, would expect you to grasp the low level dark details of an + operating system implementation in a few weeks ... + braunr: ok, may I ask what you suggest to me that my next + course of action is? + let me see + nlightnfotis: your report mentions runtime_malg + yes, I runtime malg always returns a new goroutine + nlightnfotis: what's the problem ? + a new m created is assigned a new goroutine via runtime_malg + what happens to that goroutine? Is it destroyed? Because it + seems to be a bogus goroutine. Why isn't the kernel thread instantly + picking the one goroutine available at the global goroutine pool? + let's see if it's that hard to figure out + seeing as m's and g's have a 1:1 (in gccgo) relationship, + and a new kernel thread is created everytime there is a new goroutine + there to run. + are you sure about that 1:1 relationship ? + i hardly doubt it + highly* + yeah, that's what I thought too, but then again, my research + so far shows that when a new goroutine is created, a new kernel thread + creation follows suit + what I have mentioned of course, happens in runtime_newm + nlightnfotis: that's when you create a new m, not a new g + yes, a new m is created when you create a new g. My issue is + that during m's creation, a new (bogus) g is created and assigned to the + m. I am looking into what happens to that. + nlightnfotis: "a new m is created when you create a new g", can + you point me to the code ? + braunr: matchmg line 1280 or close to that. 
Creates new m's + to run new g's up to (mcpumax) + "Kick off new m's as needed (up to mcpumax)." + so basically you have at most mcpumax m + yeah. but for a small number of goroutines (as for example + in my experiments), a new m is created in order to run a new g. + runtime_newm is called only if mget(gp)) == nil + be rigorous please + when i ask + 11:01 < braunr> are you sure about that 1:1 relationship ? + this conclusively proves it's *false* + so don't answer yes to that + it's true for a small number of goroutines, ok + and at startup + because then, mget returns an existing m + nlightnfotis: this g0 goroutine is described in the struct as + G runtime_g0; // idle goroutine for m0 + runtime_malg builds it with just a stack + apparently, that's the goroutine an m runs when there are no g + left + so yes, the idle one + it's not bogus + I thought m0 and g0 where the bootstrap m and g for the + scheduler. + *correction: runtime_m0 and runtime_g0 + hm i got a bit fast + G* g0; // goroutine with scheduling stack + braunr: scheduling stack with stacksize = -1? + unless it's not used as a parameter + let me investigate that + yeah now that I am seeing it, it might make sense, if it + using a default stack size, #defined as StackMin + g0 looks like a placeholder + i think it's used to reuse switching code when there is only one + goroutine involved + e.g. when starting + anyway i don't think we should waste too much time with it + nlightnfotis: try to make a real 1:1 mapping + that's something else i suggested last time + braunr: ok. Where do you suspect the problem lies? + context switching + inside the goruntime? + in glibc + try to use runtime.LockOSThread + http://code.google.com/p/go-wiki/wiki/LockOSThread + nlightnfotis: http://golang.org/pkg/runtime/ is probably better + what exactly do you mean by `use runtime.LockOSThread`? 
+ LockOSThread locks the very first m and goroutine as the main threads + during process initialisation + in proc.c line 565 or something + i'm not sure it will help, because the problem is likely to occur + before even switching to the goroutine that locks its m, but worth trying + 11:28 < braunr> nlightnfotis: http://golang.org/pkg/runtime/ is + probably better + the first example is specific to GUIs that have requirements on + the main thread + whereas i want every goroutine to run in its own thread + I have also noticed that some context switching happens in + the goruntime even with a low number of goroutines and kernel threads + that's expected + goroutines must be viewed as works, and ms as worker threads + everytime a goroutine sleeps, its m should be switching to useful + work + nlightnfotis: i'd make prints (probably using mach_print) of + contexts when saved and restored + and try to see if it makes any sense + that's not simple to setup but not overly complicated either + don't hesitate to ask for help + from inside glibc, right? + yes + well + no from go + don't touch glibc from now + put these prints near calls to makecontext/swapcontext + and setcontext/getcontext + wel + you'll be using getcontext i think + noted it all. I also have the gdb output you asked me for + http://pastebin.com/LdnMQDh1 + i don't see main + some notes first: The main thread is the one with id 4, and + the output on the top is its backtrace. + and main.main is run in thread 6 + Remember that main when it comes to go is in the file + go-main.c + so main becomes runtime_MHeap_Scavenger + yeah, main.main is the code of the program, (the one the + user wrote, not the runtime) + yeah, it becomes a gc thread + seeing as runtime_starttheworld reports that there is + already one gc thread + and how much are __pthread_total and __pthread_num_threads for + that trace ? 
+ they were: __pthread_total = 2, and __pthread_num_threads = + 4 + can you paste the assertion again please, just to make sure + a.out: ./pthread/pt-create.c:167: __pthread_create_internal: + Assertion `({ mach_port_t ktid = __mach_thread_self (); int ok = + thread->kernel_thread == ktid; + __mach_port_deallocate ((__mach_task_self + 0), ktid); ok; + })' failed. + btw, install the -dbg packages too + dbg for which one? gccgo? + libc0.3 + pthread/pt-create.c:167 is __pthread_sigstate (_pthread_self (), + 0, 0, &sigset, 0); here :/ + that assertion should be in __pthread_thread_start + let's just say gdb is confused + braunr: apt-get source eglibc ; cd eglibc-* ; debian/rules patch + pinotree: i have + and that assertion can only trigger if __pthread_total is 1 + so let's say it just got to 2 + it does from very early on in process initialisation + let me check this out again + hm + actually, both __pthread_total and __pthread_num_threads must be 1 + the context functions might be fine actually + braunr: __pthread_num_threads = 2 right from the start of + the program + 0x01da48ec is in mach_msg_trap + something happened with libpthreads recently .. + i can't even start iceweasel + braunr: what's the error? + iceweasel: ./pthread/../sysdeps/generic/pt-mutex-timedlock.c:70: + __pthread_mutex_timedlock_internal: Assertion `__pthread_threads' failed. + +But not the [[open_issues/libpthread_dlopen]] issue? + + considering __pthread_threads is a global variable, this is tough + i wonder if that's the issue with nlightnfotis's work + wrong symbol resolution, leading libpthread to consider there is + only one thread running + try with LD_PRELOAD=/lib/i386-gnu/libpthread.so.0 iceweasel + same + maybe the switch to glibc 2.17 + this assertion is triggered by __pthread_self, assert + (__pthread_threads); + __pthread_threads being the array of thread pointers + so either corrupted (but we hardly changed anything ...) 
or wrong + resolution + __pthread_num_threads includes the signal thread, __pthread_total + doesn't + braunr: I recompiled with the libc debugging symbols and I + have new information + the threads block at mach_msg_trap + again, almost everything blocks there + mach_msg is mach ipc, the way hurd system calls are implemented + and the next calls (if it didn't block, from what I can see + from eip) are mach_reply_port and mach_thread_self + please paste it + yes give me 2 mins plz, brb + pinotree: looks different for firefox + it seems it calls pthread_key_create before pthread_create + something our libpthread doesn't handle correctly + braunr: http://pastebin.com/yNbT7nLn + braunr: what do you mean? + pinotree: i mean libpthread needs to be fixed so thread-specific + data can be set even without a call to pthread_create + nlightnfotis: hum, we already knew it was blocking in a semaphore + nlightnfotis: ok forget the other things i told you to test + nlightnfotis: track __pthread_total and __pthread_num_threads + add prints (again, with mach_print) to see when (and why) they + change and go back to 1 + braunr: i see that pthread_key_create uses a mutex which in + turns needs _pthread_self(), but shouldn't at least one pthread_create be + done (directly by libc for the main thread)? + pinotree: no :) + well + it should have been for the signal thread indeed + and the signal thread exists + and the main thread? + not the main, no + how so? + a simple test program shows it does indeed work .. + so this is again another problem in firefox too + braunr: I don't think I understand this. I mean how can + pthread_total and __pthread_num_thread turn to 1, when , right before and + right after the crash they have numbers between 2, 3, and 4? + how did you get their values "right before" the crash ? 
+ I have set a breakpoint to a printing function right before + the go statement + (right before in this context, in the application code, not + the runtime code, but then again, I don't really think they are too far + each other) + well, that's the mystery + I am not challenging what you said, I will of course do, + just asking to understand some things + they may either turn to 1, or there is some mess with symbol + resolution leading threads to see a value of 1 + *do it + there* + braunr: ping + just ask ;) + teythoon: have you used mach_print? + no + I have some questions about it + ask them + I was told to use them inside go's runtime, to print the + values of __pthread_total and __pthread_num_threads. The thing is, these + values (I believe) are unknown to the runtime, they are only known to the + executable (linking time and later) + so? if the requested information is bound to a symbol that is + resolved at link time, you can print it from within the runtime + the same way any function from the libc is not known to the + executable until linking against it, but you can still "use" it in your + executable + yeah, ok I understand that, but these are references that + are resolved at link time. The values I want to print are totally unknown + to the runtime (0 references to them) + if the value you are interested in is bound to the symbol + __pthread_total at link time, then you've got a reference you can use + doesn't printing __pthread_total work? did you try that? + no, whenever I printed these values I did it from gdb. I am + trying to do what you suggested atm + nlightnfotis: im here + printing those values from libgo will tell us what value libgo + actually sees + I am trying to use mach_print. Could you give me some + pointers on its usage (inside the goruntime?) 
(I have already read your + document here + http://www.gnu.org/software/hurd/microkernel/mach/gnumach/interface/syscall/mach_print.html + and the example code) + and symbol resolution may depend on where it's done from + nlightnfotis: first, it only work with -dbg kernels + so make sure you're running one + actually, i'll write you a patch + including a mach_printf function with argument parsing + isn't it on by default? I read that on the document you are + discussing mach_printf + ahh ok + it's on by default on -dbg kernels + i'll make a repository on darnassus too + better store it there + nlightnfotis: + http://darnassus.sceen.net/gitweb/rbraun/mach_print.git/ + nlightnfotis: i suggest you implement mach_print with inline asm + statement in a C file, so that you don't need to alter the build system + configuration + i'll make an example of that too + braunr: that wasn't a problem. My only real problem atm is + that __atomic_t isn't recognised as a type, and I can not find the header + file for it on Hurd + it was pt-internal.h in libpthread + ah + nlightnfotis: just in case, i updated the repository with an + inline assembly version + let's see about __atomic_t + sysdeps/i386/bits/pt-atomic.h:typedef __volatile int __atomic_t; + nlightnfotis: just redeclare it as this locally + nlightnfotis: ok ? + I am working on it, because I still haven't found what + __atomic_t is typedefed from. Thinking of typedefing an int to it and see + how it goes + braunr: found it just now: __volatile int + "just now" ? + 14:19 < braunr> sysdeps/i386/bits/pt-atomic.h:typedef __volatile + int __atomic_t; + I was using cscope all this time + why use cscope at all when i tell you where it is ? 
+ because I didn't notice it: your discussion was between + pino's and srs' and I wasn't tagged and thought it had something to do + with their discussion + (sorry) + no it was my bad + ok + pinotree: there is indeed a special call to + __pthread_create_internal for the main thread + yeah + braunr: if there wouldn't be that libc→pthread bridge, things + like pthread_self() or so wouldn't work for the main thread + pinotree: right + braunr: weird thing is that the error you got is usually a sign + that pthread is not linked in explicitly + pinotree: yes + pinotree: with firefox, gdb can't locate pthread symbols before a + call to a pthread function + so yes, libpthread is loaded after main is called + nlightnfotis: can you give me a quick procedure to build gcc with + go support from your repository, and then test a go program please ? + to i can have a better look at it myself + so* + braunr: sure you want access to my go repo? If you already + have gcc repo add my github repo as a remote and checkout + fotisk/goruntime_hurd + i have your github repo + git checkout fotisk/goruntime_hurd (You may need to revert a + commit or two, because of my latest endeavour with mach_print + braunr: check it out now, I reverted some messy commits for + you to rebuild + nlightnfotis: i won't work on it right now, i'm building glibc to + check some things in libpthread + since it seems to be the source of your problems and many others + oh ok then. btw, it compiles ok, but when I try to compile + another program with gccgo collect2 cries about undefined references to + __pthread_num_threads and __pthread_total + Oo + another program ? + braunr: will I get the same result if I slowly go through it + with gdb + yep + i don't understand + what compiles ok, what fails ? + gccgo compiles without errors (which is strange) but when I + use it to compile goroutine.go it fails with the errors I reported + (missing linking to pthread?) + since when ? 
+ pinotree: perhaps braunr: since I made the changes with + mach_print + pinotree: but what could be missing the link? GCC compiled + programs are getting linked automatically to the shared objects of the + headers they include right? + (assuming it's not a huge program, only a tiny 10 liner for + instance) + uh + did you declare them as extern + ? + yes + do you see -lpthread on the link line ? + during gcc's compilation? I will have to rerun it again and + see. + log the compilation output somewhere once + nlightnfotis: why did you remove volatile from the definition of + __atomic_t ?? + just for testing purposes, because I thought that the GNU + version is volatile with no __ in front of it and that might cause some + issues. + i don't understand + it was just an experiment gone wrong + nlightnfotis: keep volatile there + just did + braunr: there is -lpthread on some lines. For instance when + libtool is invoked. + braunr: the pthread assertion usually happens when libpthread gets + loaded from a plugin, I guess mozilla got rid of libpthread in the main + application recently, simply + youpi: he said that the LD_PRELOAD trick (which used to + workaround the issue in older iceweasel) does not work, though + ah? it does work for me + dunno then... + youpi: aouch, ok + nlightnfotis: what about the specific gcc invocation that fails ? + pinotree: /lib/i386-gnu/libpthread.so.0: ERROR: cannot open + `/lib/i386-gnu/libpthread.so.0' (No such file or directory) + trying with a working path this time + better + sorry, i typed it by hand :p + Segmentation fault + but no assertion + braunr: gccgo hello.go + nlightnfotis: ? + nlightnfotis: what about the specific gcc invocation + that fails ? + nlightnfotis: i'm asking if -lpthread is present when you have + these undefined reference errors + it is. 
it seems so + I wrote above that it is present when libtool is called + I don't know what libtool is doing sadly + you said some lines + but I from what I've seen I believe it does some kind of + linking + paste it somewhere please + yeah it doesn't fail though + that's far too vague ... + it doesn't fail ? + give me a second + i thought it did + no it doesn't + 14:53 < nlightnfotis> gccgo compiles without errors (which is + strange) but when I use it to compile goroutine.go it fails with the + errors I reported + yeah gccgo compiles. + when I use the compiler, it fails + so it fails running + is gccgo built with -lpthread itself ? + http://pastebin.com/1TkFrDcG + check it out + I think it does, but I would take an extra opinion + line 782 + and 784 + (are you building as root ?) + yes. for now + baaad :p + I never had any particular problems...except that one time + that I rm -rf the source tree :P + I know it's bad d/w + braunr: I found something interesting (I don't know if it's + expected or not; probably not): If I set GOMAXPROCS to 2, and run the + goroutine program, it seems to be running for a while (with the + goroutines!) and then it segfaults. Will look more into it + it's interesting, yes + nlightnfotis: have you tried the preload trick too ? + ldpreload? no. Could you tell me how to do it? export + LDPRELOAD and a path to libpthread? + nlightnfotis: LD_PRELOAD=/lib/i386-gnu/libpthread.so.0.3 ... + braunr: it also produces a very different backtrace. This + one heavily involves mig functions + braunr, nlightnfotis: Thanks for working together, and sorry + for my lack of time. + nlightnfotis: paste please + tschwinge, Hello. It's ok, I am sorry for not showing good + amounts of progress from my part. + braunr: http://pastebin.com/J4q2NN9p + nlightnfotis: thread apply all bt full please + braunr: http://pastebin.com/tbRkNzjw + looks like an infinite loop of + __mach_port_mod_refs/__mig_dealloc_reply_port + ... + yes that's what I got from it too. 
Keep in mind these + results are with GOMAXPROCS=2 and they result in segmentation fault + and I also can not understand the corrupted stack at the + beginning of the backtrace + no please + ? + test LD_PRELOAD=/lib/i386-gnu/libpthread.so.0.3 without + GOMAXPROCS=2 + braunr: LD_PRELOAD without GOMAXPROCS results in the usual + assertion failure and abortion of execution after it + nlightnfotis: ok + nlightnfotis: im sorry, i thought you couldn't launch a test since + you added mach_print + I am not using mach_print, I couldn't fix the issue with the + references and thought I was losing time, so I went back to debugging + with gdb until I can't get anything more out of it + braunr: should I focuse on mach_print? Will it produce very + different results than gdb? + *focus + (btw I didn't delete mach print or anything, it's still + there, in another branch) + braunr: Now I stepped through the program in gdb, and got + something really really weird. Some close to a full execution + Number of gorountines and machine threads according to + runtime was 3, __pthread_num_threads was 4 + it did get SIGILL (illegal instruction some times though) + and it exited with code 02 + uh + nlightnfotis: try with mach_print yes, it will show the values + from the real execution context, and be as close as what we can get + i'm not sure about how gdb finds the values + braunr: ok, will spend the rest of the day to find a way to + make mach_print and the other values work. Did you see my last messages, + with the goroutines that worked under gdb? + yes + it seemed to run. Didn't get the expected output, but also + didn't get any errors other than illegal instruction either + braunr: I still have not found an easy way to do what you + asked me to from go's runtime. Would it be ok if I do it from inside + libpthread? + nlightnfotis: do what ? + print the values of __pthread_total and + __pthread_num_threads with mach_print. + how ? 
+ oh wait + well yes ofc, they're not exported :/ + nlightnfotis: have you been able to use mach_print ? + braunr: not really because of the problems I shared + earlier. I can try to use with in-gcc structures if you want me to, it's + nothing hard to do + actually I will. Hang on + proceed with debugging inside libpthread instead + using mach_print to avoid deadlocks this time + (mach_print was purposely built for debugging such low level code + parts) + ok, I will patch this, but can I build it tomorrow? + yes + just keep us informed + ok, thanks, and sorry for everything I have done. I want you + to know that I really appreciate that you are helping me. + remember: the goal here is to understand why __pthread_total and + __pthread_num_threads have inconsistent values + braunr: whenever you see it, mach_print works as expected + inside gcc. + + +# IRC, freenode, #hurd, 2013-09-03 + + braunr: I have made the changes I want to glibc. After I + build it, how do I install it? make install or is it more involved? + nlightnfotis: use LD_LIBRARY_PATH + never install an experimental glibc unless you have backups or are + certain of what you're doing + nlightnfotis: i didn't understand what you meant about mach_print + yesterday + it works in gcc. + what do you mean "in gcc" ? + why would you put mach_print in gcc ? + we want it in go programs .. + yes, I understand it. gcc was the fastest way to test it's + usage at that moment (for me) and I just wanted to confirm it works. I + only had to change its signature to const char * because gcc wouldn't + accept it otherwise + doesn't my example include const ? + nlightnfotis: why did you rebuild glibc ? + braunr: I have not started yet, will do now, to apply the + changes to libpthread + you mean add the print calls there ? 
+ yes + ok + use debian/rules build, interrupt when you see gcc invocations + then switch to the build directory (hurd-libc-i386 iirc), and make + others + nlightnfotis: did you send me the instructions to build and test + your work ? + so i can reproduce these weird threading problems at my side + braunr: sorry, I was in the toilet, where would you like me + to send the instructions? + nlightnfotis: i should be fine i guess, let's check here + nlightnfotis: i simply used configure + --enable-languages=c,c++,go,lto + and i'll see how it goes + I configure with --enable-languages=go (it automatically + builds c and c++ for that as go depends on them), --disable-bootstrap, + and use a custom prefix to install at a custom location + yes + ok + nlightnfotis: how long does it take you ? + complete non-bootstrap build about 45 minutes. With a build + tree ready and only simple changes, about 2-3 minutes + braunr: In an hour I will go offline for 2-3 hours, I am + gonna move back to my other home in the other city. It won't take long, + the whole process will be about 4 hours, and I will compensate for the + time lost by staying up late up until 3 o clock in the morning + i'd prefer you didn't "compensate" + ? + work if you want to + noone if forcing you to work late at night for gsoc, unless you + want to + no, I do it because I want to. I **really** really want to + succeed, and time is off the essence for me at this point + then ok + nlok i have a gccgo compiler + nlok? + nl being nlightnfotis but he's gone + oh + * pinotree was trying to parse that as "now" or "look" or the like + braunr: 08:19:56< braunr> use debian/rules build, interrupt + when you see gcc invocations: Are gcc invocations related to + i486-gnu-gcc-4.7? + nvm I'm good now :) + of course not, that's only for compiling applications using the + newly built libc + gnu_srs: I didn't exactly understand what you said? Care to + elaborate? which one is for compiling applications using the newly build + libc? 
-486-gnu-gcc-4.7? + when you see gcc ... -llibc.so you know libc.so is built, and + that is sufficient to use it. + with LD_PRELOAD or LD_LIBRARY_PATH (after cding and building + others) + gnu_srs: thanks for the tip :) + :-D + is anyone else getting glibc build problems? (from apt-get + source glibc, at cxa-finalize.c)? + apt-get source eglibc; apt-get build-dep eglibc (as root); + dpkg-buildpackage -b ... + nlightnfotis: just debian/rules build + to start the glibc build + braunr: oh I have now, it's building without issues so far + when you see gcc processes, it means the build process has + switched from configuring to making + then interrupt (ctrl-c) + cd build-tree/hurd-i386-libc + make others + or make lib others + lib is glibc, others is some addons which include our libpthread + thanks for the tip braunr. + braunr: I have managed to get a working version of glibc and + libpthread with mach_print working. I have also run 2 test programs and + it works as expected. Will continue researching tomorrow if that's ok + with you, I am too tired to keep on now. + for the record compilation of glibc right from the start was + about 1 hour and 20 - 30 minutes + + +# IRC, freenode, #hurd, 2013-09-04 + + i've taken a deeper look at this assertion failure + and ... + it has nothing to do with pthread_create + i assumed it was the one in sysdeps/mach/pt-thread-start.c + pthread_self ()? + but it's actually from sysdeps/mach/hurd/pt-sysdep.h, in + _pthread_self() + and looking there : + thread = *(struct __pthread **)__hurd_threadvar_location + (_HURD_THREADVAR_THREAD); + so simply put, context switching doesn't fix up thread specific + data ... + it's that simple + wow + today I was running programs all day long with mach_print on + to print __pthread_total and __pthread_num_threads to see when both + become 1 and couldn't find anything + I was nearly desperate. You just made my day! 
:) + now the problem is + thread specific data is highly dependent on the stack + it's illegal to make a thread switch stack and expect it to keep + working on the hurd + unless split stack is activated? + no wait + split stack is completely unsupported on the hurd + uh, why would that be? + teythoon: about split stack ? + yes + i'm not sure + at least now we do know what the problem is and I can start + working on a solution. + braunr: we should tell tschwinge and youpi about it. + nlightnfotis: sure but + nlightnfotis: you can also start looking at a workaround + nlightnfotis: also, let's makre sure that's the reason first + nlightnfotis: use mach_print to display the stack pointer when + switching + nlightnfotis: + http://stackoverflow.com/questions/1880262/go-forcing-goroutines-into-the-same-thread + " I believe runtime.LockOSThread() is necessary if you are + creating a library binding from C code which uses thread-local storage" + oh, a paper about the go runtime scheduler + let's have a look .. + braunr: have you seen the high level overview presented in that + blog post I once posted here? + no + braunr, just came back, and read the log. Which paper are + you reading? The one from columbia university? + but i need to know about details here, specifically, if threads do + change stack + nlightnfotis: yes + braunr: ok + this could be caused either by true stack switching, or by "stack + segmentation" as implemented by go + it is interesting that there are stack related members per + goroutine + nlightnfotis: in particular, pthread_attr_setstacksize() doesn't + work on the hurd + it is interesting that there are stack related + members per goroutine -> I think that's go's policy. 
All goroutines run + on a shared address space (that is the kernel thread's address space) + nlightnfotis: that's obvious + and not the problem + and yes, it's "stack segmentation" + and on linux, and probably other archs, switching stack may be + perfectly legit + on the hurd, we still have threadvars + which are the hurd specific thread local storage mechanism + it means 1/ all stacks in a process must have the same size + 2/ stack size must be a power of two + 3/ threads can't switch stack + this hardly prevents goroutines from being run by just any thread + i see there already hard hurd specific changes about stack + handling + so we should only make changes to the specific gccgo + scheduler as a workaround under the Hurd right? + i don't know + this might also push the switch to tls + this sounds better as a long term fix + but it must also involve a great amount of work, right? + most of it has already been done + by youpi and tschwinge + with the changes to tls early in the summer? + maybe + 14:36 < braunr> nlightnfotis: also, let's makre sure that's the + reason first + 14:36 < braunr> nlightnfotis: use mach_print to display the stack + pointer when switching + check what goes wrong with the stack + then we'll see + as a very simple workaround, i expect locking g's on m's to be a + good first step + braunr: noted everything. that's my work for tonight. I + expect myself to stay up late like yesterday and have this all figured + out by tomorrow. + nlightnfotis: why not now ? + I am starting from now, but I expect myself to stop about 6 + o clock here (2 hours) because I have an appointment with a doctor. + and keep on when I come back home + well adding a few printfs to track the stack should be doable + before 2 hours + braunr: I am doing it now. Will report as soon as I have + results :) + braunr: have I messed up with the way I read esp's value? 
+ https://github.com/NlightNFotis/glibc/commit/fdab1f5d45a43db5c5c288c4579b3d8251ee0f64#L1R67 + nlightnfotis: +unsigned + nlightnfotis: using gdb : + (gdb) info registers + esp 0x203ff7c0 0x203ff7c0 + (gdb) print thread->stackaddr + $2 = (void *) 0x2000000 + oh yes, I know about gdb, I thought you wanted me to use + mach_print + nlightnfotis: yes + this is just my own attempt + and it does show the stack pointer is completely outside the + thread stack + nlightnfotis: in your code, i suggest using + __builtin_frame_address() + well __builtin_frame_address(0) + see + http://gcc.gnu.org/onlinedocs/gcc-4.7.3/gcc/Return-Address.html#Return-Address + it's not exactly the stack pointer but close enough, unless of + course the stack is changed in the middle of the function + I see. I am gonna try one more time with esp the way I + worked it and if it fails to work, I am gonna use return address + nlightnfotis: be very careful about signed/unsigned and type + widths + not return address, frame address + return address is code, frame address is data (stack) + ah, I see, thanks for the correction. + youpi: not sure you catched it earlier, the problem fotis has been + having with goroutines is about threadvars + simply put, threads use setcontext functions to save/restore + goroutines state, which make them switch stack, rendering the location of + threadvars invalid, and making _pthread_self() choke + + +# IRC, freenode, #hurd, 2013-09-05 + + I am having very weird behavior with my code, something that + I can not explain and seems likely to be a bug, could someone else take a + look? + pinotree are you available at the moment to take a look at + something? + nlightnfotis: dont ask to ask, just ask + I have made some modifications to pthread_self as also + suggested by braunr to see if the stack pointer is within the bounds of + the frame address after context switching. 
I can get the values of both + esp and frame_address to be shown before the context switch, but I can + only get the value of esp to be shown after the context switch, and it + always results to the program getting killed + + https://github.com/NlightNFotis/glibc/blob/7e72da09a42b1518865f6f4882d68689e681f25b/libpthread/sysdeps/mach/hurd/pt-sysdep.h#L97 + thing is a dummy print value I have right after the code + that was supposed to print the frame_address after the context switching + is executing without any issues. + oh assembler... cannot help, sorry :/ + oh no, I am not asking for assembler help, that part works + quite alright. I am asking why from the 4 identical pieces of code that + print debugging values the last one doesn't work. I am on it all day, and + still have not found an answer + nlightnfotis: i can + hello braunr, + nlightnfotis: do you have a backtrace ? + uh + nope, it crashes right after I execute something. Let me + compile glibc once again and see if a fix I attempted works + malloc and free use locks + so they probably use _pthread_self + don't use them + for debugging, a simple statically allocated buffer on the stack + will do + nlightnfotis: so ? + Ι got past my original problem, but now I am trying to get + past the sigkills that kill the program at the beginning + i remember not having this problem, so I am compiling my + master branch to see if it is reproducible. If it is, it means something + is very wrong. If it's not, it means I screwed up somewhere + i don't understand, how do you know if you get past the problem if + you still have trouble reaching that code ? + braunr: I fixed all my problems now. I can see that both esp + and the frame_address are the same after context switching though? + always ? + for all goroutines ? + for all kernel threads, not go routines. 
We are in + libpthread + if they're the same after a context switch, it usually means the + scheduler didn't switch + well obviously + but what i asked you was to trace calls to setcontext functions + I will run some tests again. May I show you my code to see + if there is anything wrong with it? + what address do you have ? + not yet + i'm not sure you understand what i want to check + do you see how threadvars work basically ? + I think so yes, they keep in the stack the local variables + of a thread right? + and the globals + or + wait a minute... + yes but do you see how the thread specific data are fetched ? + with __hurd_threadvar_location_from_sp? + yes but "basically", what does it do ? + it get's a stack pointer as a parameter, and returns the + location of that specific data based on that stack pointer, right? + and how ? + I believe it must compare the base value of the stack and + the value of the end of the stack, and if the results are consistent, it + returns a pointer to the data? + and how does it determine the start and end of the stack ? + stack_pointer must be pointing at the base of the + stack. That + stack_size must be the stack limit I guess. + so you're saying the caller of __hurd_threadvar_location_from_sp + knows the stack base ? + I am not so sure I understand this question. 
+ i want to know if you understand how threadvars work + apparently you don't + the caller only has its current stack pointer + which does *not* point to the stack base + threadvars work by assuming a *fixed* stack size, power of two, + aligned (obviously) + in our case, 2MiB (except in hurd servers where a kludge reduces + that to 64k) + this is why stack size can't be changed + this is also why the stack pointer can't ever point outside the + initial stack + i want you to make sure go violates this last assumption + so 1/ show the initial stack boundaries of your threads, then show + that, after loading a goroutine, the stack pointer is outside + which is what, if i'm right, triggers the assertion + ask if there is anything confusing + this is important, it should already have been done + ok, I noted it all, I am starting to work on it right now. I + only have one question. My results, the ones with the stack pointer and + the frame address, are expected or unexpected? + i don't know + show me the code again please + and explain your intent + + https://github.com/NlightNFotis/glibc/blob/7fe202317db4c3947f8ae1d1a4e52f7f0642e9ed/libpthread/sysdeps/mach/hurd/pt-sysdep.h + At first I print the value of esp and the frame_address + before the context switching and after the context switching. + The different variables were introduced as part of a test to + see if my results were consistent, + what context switch ? + in hurd_threadvar_location + what makes you think this is a context switch ? + in threadvar.h, it calls __hurd_threadvar_location_from_sp. + the full path for it is glibc/hurd/hurd/threadvar.h + i don't see how giving me the path will explain why it's a context + switch + and i can tell you right away it's not + hurd_threadvar_location is basically a lookup returning the + address of the thread specific data + wait a minute...does this mean that + hurd_threadvar_location_from_sp is also a lookup function for the same + reason + ? 
+ yes + isn't the name meaningful enough ? + "location of the threadvars from stack pointer" + I guess I made wrong deductions from when you originally + shared your findings... + thread = *(struct __pthread + **)__hurd_threadvar_location (_HURD_THREADVAR_THREAD); + so simply put, context switching doesn't fix up + thread specific data ... + I thought that hurd_threadvar_location was doing the context + switching + nlightnfotis: by context switching, i mean setcontext functions + braunr: You mean the one in sysdeps/mach/hurd/i386? + yes + but + do you understand what i want you to check now ? + I think I got this time: Let me explain it: + You suggested that stack sizes are fixed. That is the main + reason that the stack pointer should not be able to point outside of it. + no + locating threadvars is done by applying a mask, computed from the + stack size, on the stack pointer, to determine its base + yeah, what __hurd_threadvar_location_from_sp is doing + if size is a power of two, size - 1 is a mask that, if + complemented, aligns the address + yes + so, threadvars expect the stack pointer to always point to the + initial stack + and we wanna prove that go violates this rule right? That + the stack pointer is not pointing at the initial stack + yes