From 6f3a380f3c1bc602b1b86dec307abf27f71bfef4 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge
Date: Sat, 28 Jan 2012 15:04:40 +0100
Subject: IRC.

---
 hurd/translator/tmpfs/discussion.mdwn | 266 +++++++++++++++++++++++++++++++++-
 1 file changed, 265 insertions(+), 1 deletion(-)

diff --git a/hurd/translator/tmpfs/discussion.mdwn b/hurd/translator/tmpfs/discussion.mdwn
index 486206e3..0409f046 100644
--- a/hurd/translator/tmpfs/discussion.mdwn
+++ b/hurd/translator/tmpfs/discussion.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
 [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -19,3 +19,267 @@ License|/fdl]]."]]"""]]
 * [[!GNU_Savannah_bug 26751]]
 * [[!GNU_Savannah_bug 32755]]
+
+
+# [[Maksym_Planeta]]
+
+## IRC, freenode, #hurd, 2011-11-29
+
+ Hello. In seqno_memory_object_data_request I call
+ memory_object_data_supply and supply one zero filled page, but seems that
+ kernel ignores this call because this page stays filled in specified
+ memory object. In what cases kernel may ignore this call? It is written
+ in documentation that "kernel prohibits the overwriting of live data
+ pages". But when I called memory_object_lock_request on this page with
+ should flush and MEMORY_OBJECT_RETURN_ALL nothing change
+ what are you trying to do ?
+ I think that memory object holds wrong data, so I'm trying to
+ replace them. This happens when file is truncated, so I should notify
+ memory object that there is no some data. But since gnumach works only
+ with sizes that are multiple of vm_page_size, I should manually correct
+ last page for case when file size isn't multiple of vm_page_size. This is
+ needed for case when file grows again and that tail of last page, which
+ wasn't part of file should be filled wit
+ I've put some printf's in kernel and it seems that page that holds
+ data which I want replace both absent and busy:
+ m = vm_page_lookup(object,offset);
+ ...
+ if (m->absent && m->busy) { <-- Condition is true
+ in vm/memory_object.c:169
+ mcsim: Receiving m_o_data_request means there's no page in the
+ memory object at that offset, so m_o_data_supply should work
+ are you sure that page is not being installed into the memory
+ object?
+ it seems normal it's both absent and busy
+ absent because, as sergio said, the page is missing, and busy
+ because the kernel starts a transfer for its content
+ i don't understand how you determine the kernel ignores your
+ data_supply
+ "because this page stays filled in specified memory object"
+ please explain this with more detail
+ mcsim: anyway, when truncating a file to a non page-aligned length,
+ you can just zero fill the rest of the page by mapping the object and
+ writing to it with memset/bzero
+ (avoid bzero, it's obsolete)
+ slpz: I'll try it now.
+ slpz: i think that's what he's trying to do
+ I don't vm_map it
+ how do you zero it then ?
+ "I call memory_object_data_supply and supply one zero filled page"
+ First I call mo_lock_request and ask to return this page, then I
+ memset tail and try to mo_data_supply
+ I use this function when I try to replace kr =
+ memory_object_data_supply(reply_to, offset, addr, vm_page_size, FALSE,
+ VM_PROT_NONE, FALSE, MACH_PORT_NULL);
+ where addr points to new data, offset points to old data in
+ object. and reply_to is memory_control which I get as parameter in
+ mo_data_request
+ why would you want to vm_map it then ?
+ because mo_data_supply doesn't work.
+ mcsim: i still don't see why you want to vm_map
+ I just want to try it.
+ but what do you think will happen ?
+ But seems that it doesn't work too, because I can't vm_map
+ memory_object from memory_manager of this object.
+
+
+## IRC, freenode, #hurd, 2012-01-05
+
+ Seems tmpfs works now. The code really needs cleaning, but the main
+ thing is that it works. So in nearest future it will be ready for merging to
+ master branch. BTW, anyone knows good tutorial about refactoring using
+ git (I have a lot of pointless commits and I want to gather and scatter
+ them to sensible ones).
+ I wonder whether he actually got the "proper" tmpfs with the
+ default pager working? or only the hack with a private pager?
+ antrik: with default pager
+ mcsim: wow, that's great :-)
+ how did you fix it?
+ antrik: The main code I wrote before December, so I forgot some of
+ what exactly I was doing. So I have to look through my code :)
+ antrik: old defpager was using old functions like m_o_data_write
+ instead of m_o_data_return etc. I changed it, mostly because of my
+ misunderstanding. But I hope that this is not a problem.
+
+
+## IRC, freenode, #hurd, 2012-01-18
+
+ mcsim: did you publish your in-progress work?
+ there is a branch with working tmpfs in git repository:
+ http://git.savannah.gnu.org/cgit/hurd/hurd.git/log/?h=mplaneta/tmpfs/defpager
+ sorry for interrupting the meeting but i wonder what is a
+ lazyfs?
+ jd823592: lazyfs is tmpfs which uses own pager
+ mcsim: ah, nice :-)
+ BTW, what exactly did you need to fix to make it work?
+ most fixes were in defpager in default_pager_object_set_size. Also,
+ as i said earlier, I switched to new functions (m_o_data_return instead
+ of m_o_data_write and so on). I said that this was mostly because of my
+ misunderstanding, but it turned out that new function provide work with
+ precious attribute of page.
+ Also there were some small errors like this:
+ pager->map = (dp_map_t) kalloc (PAGEMAP_SIZE (new_size));
+ memcpy (pager->map, old_mapptr, PAGEMAP_SIZE (old_size));
+ where in second line should be new_size too
+ I removed all warnings in compiling defpager (and this helped to
+ find an error).
+ great work :-)
+ tmpfs is nice thing to have :), are there other recent
+ improvements that were not yet published in previous month?
+ BTW, i measured tmpfs speed and it is up to 6 times faster than
+ ramdisk+ext2fs
+ mcsim: whow, that's quite a difference... didn't expect that
+
+
+## IRC, freenode, #hurd, 2012-01-24
+
+ braunr: I'm just wondering is there any messages before hurd
+ breaks. I have quite strange message: memory_object_data_request(0x0,
+ 0x0, 0xf000, 0x1000, 0x1) failed, 10000003
+ hm i don't think so
+ usually it either freezes completely, or it panics because of an
+ exhausted resource
+ where first and second 0x0 are pager and pager_request for memory
+ object in vm_fault_page from gnumach/vm_fault.c
+ if you're using the code you're currently working on (which i
+ assume), then look for a bug there first
+ mcsim: Maybe you're running out of swap?
+ tschwinge: no
+ also, translate the error code
+ AFAIR that's MACH_INVALID_DEST
+ and what does it mean in this situation ?
+ I've run fsx as long as possible several times. It runs quite long
+ but it breaks in different ways.
+ MACH_SEND_INVALID_DEST
+ this means that kernel tries to call rpc with pager 0x0
+ this is invalid destination
+ null port
+ ok
+ did the pager die ?
+ When I get this message pager dies, but also computer can suddenly
+ reboot
+ i guess the pager crashing makes mach print this error
+ but then you may have a dead port instead of a null port, i don't
+ remember the details
+ braunr: thank you.
+ btw, for big file sizes fsx breaks on ext2fs
+ could you identify the threshold ?
+ and what's fsx exactly ?
+ fsx is a testing utility for filesystems
+ see http://codemonkey.org.uk/projects/fsx/
+ ah, written by tevanian
+ threshold seems to be 8Mb
+ fyi, avadis tevanian is the main author of the mach 3 core
+ services and VM parts
+ well, ext2fs is bugged, we already know that
+ old code maintained as well as possible, but still
+ hmm, with 6mb it breaks too
+ i guess that it may break on anything larger than a page actually
+ :p
+ When I tested with size of 256kb, fsx worked quite long and didn't
+ break
+ mcsim: without knowing exactly what the test actually does, it's
+ hard to tell
+ I see, I just wanted to tell that there are bugs in ext2fs too. But
+ I didn't debug it.
+ fsx performs different operations, like read, write, truncate file,
+ grow file in random order.
+ in parellel too ?
+ parellel
+ parallel*
+ no
+ I run several fsx's parallel on tmpfs, but they break on file with
+ size 8mb.
+ that must match something in mach
+ s/must/could/ :)
+ braunr: I've pushed my commits to mplaneta/tmpfs/master branch in
+ hurd repository, so you could review it.
+ you shouldn't do that just for me :p
+ you should do that regularly, and ask for reviews after
+ (e.g. during the meetings)
+ everyone could do that :)
+ i'm quite busy currently unfortunately
+ i'll try when i have time, but the best would be to ask very
+ specific questions
+ these are usually the faster to answer for people who have the
+ necessary expertise to help you
+ fastest*
+ ok.
+ braunr: probably, I was doing something wrong, because now parallel
+ works only for small sizes. Sorry, for disinformation.
+
+
+### IRC, freenode, #hurd, 2012-01-25
+
+ braunr: actually, the paging errors are *precisely* the way my
+ system tends to die...
+ (it's after about a month of uptime usually though, not a week...)
+ tschwinge: in my case at least, I have still plenty of swap when
+ this happens. swap usage is generally at about the amount of physical
+ memory -- I have no idea though whether there is an actual connection, or
+ it's just coincidence
+ antrik: ok, your hurd dies because of memory issues, my virtual
+ machines die because of something else (though idk what)
+ before I acquired the habit of running my box 24/7 and thus hitting
+ this issue, most of the hangs I experienced were also of a different
+ nature... but very rare in general, except when doing specific
+ problematic actions
+ antrik: yes. Do you get messages like that I posted?
+ here is it: memory_object_data_request(0x0, 0x0, 0xf000, 0x1000,
+ 0x1) failed, 10000003
+ mcsim: I can't tell for sure (never noted them down, silly me...)
+ but I definitely get paging errors right before it hangs
+ I guess that was unclear... what I'm trying to say is: I do get
+ memory_object_data_request() failed; but I'm not sure about the
+ parameters
+ antrik: ok. Thank you.
+ I'll try to find something in defpager, but there should be errors
+ in mach too. At least because sometimes computer suddenly reboots during
+ test.
+ mcsim: I don't get sudden reboots
+ might be a different error
+ do you have debugging mode activated in Mach? otherwise it reboots
+ on kernel panics...
+ antrik: no. But usually on kernel panics mach waits for some time
+ showing the error message and only then reboots.
+ OK
+ how can I know that tmpfs is stable enough? Correcting errors in
+ kernel to make fsx test work seems to be very complex.
+ *If errors are in kernel.
+ well, it seems that you tested it already much more thoroughly
+ than any other code in the Hurd was ever tested :-)
+ of course it would be great if you could pinpoint some of the
+ problems you see nevertheless :-)
+ but that's not really necessary before declaring tmpfs good enough
+ I'd say
+ ok. I'll describe every error I meet on my userpage
+ but it will take some time, not before weekend.
+ don't worry, it's not urgent
+ the reason I'd really love to see those errors investigated is
+ that most likely they are the same ones that cause stability problems in
+ actual use...
+ having an easy method for reproducing them is already a good start
+ no. they are not the same
+ every time i get different one
+ especially when i just start one process fsx and wait error
+ mcsim: have you watched memory stats while running it? if it's
+ related to the problems I'm experiencing, you will probably see rising
+ memory use while the test is running
+ it could be reboot, message, I posted and also fsx could stop
+ telling that something wrong with data
+ you get all of these also on ext2?
+ i've done it only once. Here is the log:
+ http://paste.debian.net/153511/
+ I saved "free" output every 30 seconds
+ no. I'll do it now
+ would be better to log with "vmstat 1"
+ ok.
+ as you can see, there isn't any leak during work. But near end
+ free memory suddenly decreases
+ yeah... it's a bit odd, as there is a single large drop, but seems
+ stable again afterwards...
+ a more detailed log might shed some light
+ drop at the beginning was when I started translator.
+ what kind of log do you mean?
+ vmstat 1 I mean
+ ah...
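
Editorial note, not part of the commit: the 2011-11-29 discussion turns on zeroing the tail of the last page when a file is truncated to a length that is not a multiple of `vm_page_size`. The arithmetic can be sketched in plain C as below. This is only a sketch under stated assumptions: `zero_page_tail` and its parameters are hypothetical names that do not come from the tmpfs or defpager sources, and the page is assumed to be already available as an ordinary writable buffer. How that buffer is obtained and handed back to the kernel (`vm_map`, `memory_object_data_supply`) is exactly the open question in the log and is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the tmpfs or defpager sources.
   Zero the part of the last page that lies beyond the end of the file.
   `page` points to a writable copy of the file's last page, `page_size`
   is the page size in bytes (vm_page_size on Mach), and `file_size` is
   the new file length in bytes.  Returns the number of bytes zeroed.  */
static size_t
zero_page_tail (unsigned char *page, size_t page_size, size_t file_size)
{
  assert (page != NULL && page_size > 0);

  /* Bytes of the last page that still belong to the file.  */
  size_t used = file_size % page_size;

  /* A page-aligned size leaves no partial page to fix up.  */
  if (used == 0)
    return 0;

  /* Clear everything after the last valid byte, so that stale data
     cannot reappear if the file grows again later.  */
  memset (page + used, 0, page_size - used);
  return page_size - used;
}
```

For a 4096-byte page and a file truncated to 4196 bytes, the helper would leave the first 100 bytes of the second page untouched and clear the remaining 3996.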