[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]]

[[!tag open_issue_hurd]]

* [[notes_bing]]
* [[notes_various]]
* [[tmpfs_vs_defpager]]
* [[!GNU_Savannah_bug 26751]]
* [[!GNU_Savannah_bug 32755]]


# [[Maksym_Planeta]]


## IRC, freenode, #hurd, 2011-11-29

    Hello. In seqno_memory_object_data_request I call memory_object_data_supply
    and supply one zero-filled page, but it seems that the kernel ignores this
    call, because this page stays filled in the specified memory object. In
    what cases may the kernel ignore this call?
    It is written in the documentation that the "kernel prohibits the
    overwriting of live data pages". But when I called
    memory_object_lock_request on this page with should_flush and
    MEMORY_OBJECT_RETURN_ALL, nothing changed.
    what are you trying to do ?
    I think that the memory object holds wrong data, so I'm trying to replace
    them.
    This happens when a file is truncated, so I should notify the memory object
    that some of the data is gone.
    But since gnumach works only with sizes that are a multiple of
    vm_page_size, I have to manually correct the last page for the case when
    the file size isn't a multiple of vm_page_size.
    This is needed for the case when the file grows again: the tail of the last
    page, which wasn't part of the file, should be filled with zeroes.
    I've put some printf's in the kernel and it seems that the page that holds
    the data I want to replace is both absent and busy:
    m = vm_page_lookup(object, offset);
    ...
    if (m->absent && m->busy) {   <-- Condition is true
    in vm/memory_object.c:169
    mcsim: Receiving m_o_data_request means there's no page in the memory
    object at that offset, so m_o_data_supply should work
    are you sure that page is not being installed into the memory object?
    it seems normal it's both absent and busy
    absent because, as sergio said, the page is missing, and busy because the
    kernel starts a transfer for its content
    i don't understand how you determine the kernel ignores your data_supply
    "because this page stays filled in specified memory object"
    please explain this with more detail
    mcsim: anyway, when truncating a file to a non page-aligned length, you can
    just zero fill the rest of the page by mapping the object and writing to it
    with memset/bzero
    (avoid bzero, it's obsolete)
    slpz: I'll try it now.
    slpz: i think that's what he's trying to do
    I don't vm_map it
    how do you zero it then ?
    "I call memory_object_data_supply and supply one zero filled page"
    First I call mo_lock_request and ask to return this page, then I memset the
    tail and try mo_data_supply
    I use this function when I try to replace it:
    kr = memory_object_data_supply(reply_to, offset, addr, vm_page_size, FALSE,
    VM_PROT_NONE, FALSE, MACH_PORT_NULL);
    where addr points to the new data, offset points to the old data in the
    object, and reply_to is the memory_control which I get as a parameter in
    mo_data_request
    why would you want to vm_map it then ?
    because mo_data_supply doesn't work.
    mcsim: i still don't see why you want to vm_map
    I just want to try it.
    but what do you think will happen ?
    But it seems that it doesn't work either, because I can't vm_map a memory
    object from the memory manager of this object.
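
The sequence described above -- flush the last page, clear everything past the
new end of file, and hand the page back to the kernel -- could look roughly
like the sketch below. This is only an illustration, not tmpfs' actual code:
the function and parameter names are hypothetical, and the
`memory_object_data_supply` argument order simply mirrors the call quoted in
the log.

    /* Sketch only: zero-fill the tail of the last page after a truncate,
       then supply the page back to the kernel.  Function and variable names
       are hypothetical.  */
    #include <mach.h>
    #include <string.h>

    kern_return_t
    supply_truncated_page (memory_object_control_t control,
                           vm_offset_t page_offset, /* page-aligned offset of the last page */
                           vm_address_t page_buf,   /* page-sized buffer with the old contents */
                           vm_size_t new_size)      /* new (truncated) file size */
    {
      /* Bytes of the last page that are still part of the file.  */
      vm_size_t tail = new_size & (vm_page_size - 1);

      /* Everything past the new end of file must read back as zeroes if the
         file grows again, so clear it before supplying the page.  */
      if (tail != 0)
        memset ((void *) (page_buf + tail), 0, vm_page_size - tail);

      return memory_object_data_supply (control, page_offset,
                                        page_buf, vm_page_size,
                                        FALSE,        /* do not deallocate the buffer */
                                        VM_PROT_NONE, /* no access restriction */
                                        FALSE,        /* not precious */
                                        MACH_PORT_NULL);
    }
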

## IRC, freenode, #hurd, 2012-01-05

    It seems tmpfs works now. The code really needs cleaning, but the main
    thing is that it works. So in the near future it will be ready for merging
    into the master branch.
    BTW, does anyone know a good tutorial about rewriting history with git? (I
    have a lot of pointless commits and I want to gather and scatter them into
    sensible ones.)
    I wonder whether he actually got the "proper" tmpfs with the default pager
    working? or only the hack with a private pager?
    antrik: with the default pager
    mcsim: wow, that's great :-)
    how did you fix it?
    antrik: The main code I wrote before December, so I forgot some of what
    exactly I was doing. So I have to look through my code :)
    antrik: the old defpager was using old functions like m_o_data_write
    instead of m_o_data_return etc. I changed it, mostly because of my
    misunderstanding. But I hope that this is not a problem.


## IRC, freenode, #hurd, 2012-01-18

    mcsim: did you publish your in-progress work?
    there is a branch with a working tmpfs in the git repository:
    http://git.savannah.gnu.org/cgit/hurd/hurd.git/log/?h=mplaneta/tmpfs/defpager
    sorry for interrupting the meeting but i wonder what is a lazyfs?
    jd823592: lazyfs is a tmpfs which uses its own pager
    mcsim: ah, nice :-)
    BTW, what exactly did you need to fix to make it work?
    most fixes were in defpager, in default_pager_object_set_size. Also, as i
    said earlier, I switched to the new functions (m_o_data_return instead of
    m_o_data_write and so on). I said that this was mostly because of my
    misunderstanding, but it turned out that the new functions provide support
    for the precious attribute of pages.
    Also there were some small errors like this:
    pager->map = (dp_map_t) kalloc (PAGEMAP_SIZE (new_size));
    memcpy (pager->map, old_mapptr, PAGEMAP_SIZE (old_size));
    where the second line should use new_size too
    I removed all warnings when compiling defpager (and this helped to find an
    error).
    great work :-)
    tmpfs is a nice thing to have :), are there other recent improvements that
    were not yet published in the previous month?
    BTW, i measured tmpfs speed and it is up to 6 times faster than
    ramdisk+ext2fs
    mcsim: wow, that's quite a difference... didn't expect that
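
For reference, the pagemap fix described above would read roughly as follows.
Everything outside the two quoted lines is elided; `dp_map_t`, `kalloc`,
`PAGEMAP_SIZE`, `old_mapptr` and `new_size` are the defpager's own names.

    /* Sketch of the quoted snippet with the described fix applied.  */
    pager->map = (dp_map_t) kalloc (PAGEMAP_SIZE (new_size));
    /* Copy no more than the new map can hold; using old_size here overruns
       the freshly allocated map when the object is being shrunk in
       default_pager_object_set_size.  */
    memcpy (pager->map, old_mapptr, PAGEMAP_SIZE (new_size));
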

## IRC, freenode, #hurd, 2012-01-24

    braunr: I'm just wondering whether there are any messages before the Hurd
    breaks. I have a quite strange message:
    memory_object_data_request(0x0, 0x0, 0xf000, 0x1000, 0x1) failed, 10000003
    hm i don't think so
    usually it either freezes completely, or it panics because of an exhausted
    resource
    where the first and second 0x0 are pager and pager_request for the memory
    object, in vm_fault_page from gnumach's vm/vm_fault.c
    if you're using the code you're currently working on (which i assume), then
    look for a bug there first
    mcsim: Maybe you're running out of swap?
    tschwinge: no
    also, translate the error code
    AFAIR that's MACH_INVALID_DEST
    and what does it mean in this situation ?
    I've run fsx as long as possible several times. It runs quite long, but it
    breaks in different ways.
    MACH_SEND_INVALID_DEST
    this means that the kernel tries to call an RPC with pager 0x0
    this is an invalid destination
    null port
    ok
    did the pager die ?
    When I get this message the pager dies, but the computer can also suddenly
    reboot
    i guess the pager crashing makes mach print this error
    but then you may have a dead port instead of a null port, i don't remember
    the details
    braunr: thank you.
    btw, for big file sizes fsx breaks on ext2fs
    could you identify the threshold ?
    and what's fsx exactly ?
    fsx is a testing utility for filesystems
    see http://codemonkey.org.uk/projects/fsx/
    ah, written by tevanian
    the threshold seems to be 8 MB
    fyi, avadis tevanian is the main author of the mach 3 core services and VM
    parts
    well, ext2fs is bugged, we already know that
    old code maintained as well as possible, but still
    hmm, with 6 MB it breaks too
    i guess that it may break on anything larger than a page actually :p
    When I tested with a size of 256 KB, fsx worked quite long and didn't break
    mcsim: without knowing exactly what the test actually does, it's hard to
    tell
    I see, I just wanted to say that there are bugs in ext2fs too. But I didn't
    debug it.
    fsx performs different operations, like read, write, truncate file, grow
    file, in random order.
    in parallel too ?
    no
    I ran several fsx instances in parallel on tmpfs, but they break on a file
    with a size of 8 MB.
    that must match something in mach
    s/must/could/ :)
    braunr: I've pushed my commits to the mplaneta/tmpfs/master branch in the
    hurd repository, so you could review it.
    you shouldn't do that just for me :p
    you should do that regularly, and ask for reviews after (e.g. during the
    meetings)
    everyone could do that :)
    i'm quite busy currently unfortunately
    i'll try when i have time, but the best would be to ask very specific
    questions
    these are usually the fastest to answer for people who have the necessary
    expertise to help you
    ok.
    braunr: probably I was doing something wrong, because now parallel only
    works for small sizes. Sorry for the misinformation.
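
The numeric code in the message above is hexadecimal: 0x10000003 is
MACH_SEND_INVALID_DEST from <mach/message.h>, matching braunr's reading. A tiny
helper like the following can translate such codes; this is a sketch, assuming
the mach_error_string() routine and the <mach_error.h> header that glibc
provides on the Hurd, and the printed message text is only illustrative.

    /* Translate a Mach error code given in hex on the command line, e.g.
       "./macherr 10000003".  Sketch only.  */
    #include <stdio.h>
    #include <stdlib.h>
    #include <mach.h>
    #include <mach_error.h>

    int
    main (int argc, char **argv)
    {
      mach_error_t err;

      /* The kernel prints the code without a "0x" prefix, so parse base 16.  */
      err = (mach_error_t) strtoul (argc > 1 ? argv[1] : "10000003", NULL, 16);
      printf ("0x%x: %s\n", (unsigned int) err, mach_error_string (err));
      return 0;
    }
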

### IRC, freenode, #hurd, 2012-01-25

    braunr: actually, the paging errors are *precisely* the way my system tends
    to die... (it's after about a month of uptime usually though, not a
    week...)
    tschwinge: in my case at least, I still have plenty of swap when this
    happens. swap usage is generally at about the amount of physical memory --
    I have no idea though whether there is an actual connection, or it's just
    coincidence
    antrik: ok, your hurd dies because of memory issues, my virtual machines
    die because of something else (though idk what)
    before I acquired the habit of running my box 24/7 and thus hitting this
    issue, most of the hangs I experienced were also of a different nature...
    but very rare in general, except when doing specific problematic actions
    antrik: yes. Do you get messages like the one I posted? Here it is:
    memory_object_data_request(0x0, 0x0, 0xf000, 0x1000, 0x1) failed, 10000003
    mcsim: I can't tell for sure (never noted them down, silly me...) but I
    definitely get paging errors right before it hangs
    I guess that was unclear... what I'm trying to say is: I do get
    memory_object_data_request() failed; but I'm not sure about the parameters
    antrik: ok. Thank you. I'll try to find something in defpager, but there
    should be errors in mach too. At least because sometimes the computer
    suddenly reboots during the test.
    mcsim: I don't get sudden reboots
    might be a different error
    do you have debugging mode activated in Mach? otherwise it reboots on
    kernel panics...
    antrik: no. But usually on kernel panics mach waits for some time showing
    the error message and only then reboots.
    OK
    how can I know that tmpfs is stable enough? Correcting errors in the kernel
    to make the fsx test work seems to be very complex. *If the errors are in
    the kernel.
    well, it seems that you tested it already much more thoroughly than any
    other code in the Hurd was ever tested :-)
    of course it would be great if you could pinpoint some of the problems you
    see nevertheless :-)
    but that's not really necessary before declaring tmpfs good enough I'd say
    ok.
    I'll describe every error I meet on my user page, but it will take some
    time, not before the weekend.
    don't worry, it's not urgent
    the reason I'd really love to see those errors investigated is that most
    likely they are the same ones that cause stability problems in actual
    use... having an easy method for reproducing them is already a good start
    no. they are not the same
    every time i get a different one
    especially when i just start one fsx process and wait for an error
    mcsim: have you watched memory stats while running it? if it's related to
    the problems I'm experiencing, you will probably see rising memory use
    while the test is running
    it could be a reboot, the message I posted, and also fsx could stop,
    telling that something is wrong with the data
    you get all of these also on ext2?
    i've done it only once. Here is the log: http://paste.debian.net/153511/
    I saved "free" output every 30 seconds
    no. I'll do it now
    would be better to log with "vmstat 1"
    ok.
    as you can see, there is no leak during the run. But near the end, free
    memory suddenly decreases
    yeah... it's a bit odd, as there is a single large drop, but it seems
    stable again afterwards... a more detailed log might shed some light
    the drop at the beginning was when I started the translator.
    what kind of log do you mean?
    vmstat 1 I mean
    ah...


## IRC, freenode, #hurd, 2012-02-01

    I run fsx with this command: fsx -N3000 foo/bar -S4 -l$((1024*1024*8)). And
    after 70 commands it breaks. The strange thing is that at address 0xc000
    there is text which was printed in fsx with vfprintf
    I've lost the log. Wait a bit while I generate a new one
    mcsim, what's fsx / where can I find it ?
    fsx is a filesystem exerciser
    http://codemonkey.org.uk/projects/fsx/
    ok thanks
    i use it to test tmpfs
    here is an fsx that compiles on linux: http://paste.debian.net/154390/
    and a Makefile for it: http://paste.debian.net/154392/
    mcsim, hmm, I get a failure with ext2fs too, is it expected?
    yes
    i'll show you the logs with tmpfs. They differ slightly
    here: http://paste.debian.net/154399/
    the second-to-last operation is a truncate and the last one is a read;
    during the second-to-last (or last), starting from address 0xa000, text
    appears every 0x1000 bytes
    skipping zero size read
    skipping zero size read
    truncating to largest ever: 0x705f4b
    signal 2
    testcalls = 38
    this text is printed by fsx, by the function prt
    I was mistaken: this text appears even from the very beginning
    I know that this text appears exactly at this moment, because I added a
    check of the whole file after every step. And this error appeared only
    after the last truncation.
    I think that the problem is in defpager (I'm fixing it), but I don't
    understand where defpager could get this text
    wow, I get java code and debconf templates
    So, my question is: is it possible for defpager to somehow get this text?
    possibly recycled, non-zeroed pages?
    hmmm... probably you're right
    0x1000 bytes is consistent with the page size
    Should I clean these pages in tmpfs? or in defpager? What is the proper
    way?
    mcsim, I'd say defpager should do it, to avoid leaking information; I'm not
    sure though. maybe tmpfs should also not assume the pages have been blanked
    out.
    if i do it in both, it could have a big influence on performance. i'll do
    it only in defpager for now.
    jkoenig_: Thank you a lot
    mcsim, no problem.
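
The stale text discussed above comes from handing out pages whose previous
contents were never cleared. One way a pager can guard against that leak is to
supply freshly allocated memory, which vm_allocate guarantees to be
zero-filled. The sketch below makes the same assumptions as the earlier
example: the function name is hypothetical and the data_supply argument order
mirrors the call quoted in the log.

    /* Sketch only: supply a guaranteed-zero page for an offset that has never
       been written or paged out, instead of recycling an old buffer.  */
    #include <mach.h>

    kern_return_t
    supply_zero_page (memory_object_control_t control, vm_offset_t offset)
    {
      vm_address_t page = 0;
      kern_return_t kr;

      /* vm_allocate returns zero-filled anonymous memory.  */
      kr = vm_allocate (mach_task_self (), &page, vm_page_size, TRUE);
      if (kr != KERN_SUCCESS)
        return kr;

      return memory_object_data_supply (control, offset, page, vm_page_size,
                                        TRUE,         /* deallocate our copy */
                                        VM_PROT_NONE, /* no access restriction */
                                        FALSE,        /* not precious */
                                        MACH_PORT_NULL);
    }
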

## IRC, freenode, #hurd, 2012-02-08

    mcsim: You pushed another branch with cleaned-up patches?
    yes.
    mcsim: Anyway, any data from your report that we could be interested in?
    (Though it's not in English.)
    It's completely in Ukrainian and mostly describes some aspects of the
    Hurd's operation.
    mcsim: OK. So you ran out of time to do the benchmarking, etc.?
    Comparing tmpfs to ext2fs with a RAM backend, etc., I mean.
    tschwinge: I did the benchmarking and it turned out that tmpfs is up to 6
    times faster than ext2fs
    tschwinge: is it possible to have a review of the work I've already done,
    even if parallel writing doesn't work?
    mcsim: Do you need this for university or just a general review for
    inclusion in the Git master branch?
    general review
    Will need to find someone who feels competent to do that...
    the branch that should be checked is tmpfs-final
    cool, i guess you tested also special types of files like sockets and
    pipes? (they are used in e.g. /run, /var/run or similar)
    Oh. I accidentally created this branch. It is my private branch. I'll
    delete it now and merge everything into mplaneta/tmpfs/master
    pinotree: Completely forgot about them :( I'll do it by all means
    mcsim: no worries :)
    tschwinge: Ready. The right branch is mplaneta/tmpfs/master


## IRC, freenode, #hurd, 2012-03-07

    did you test it with sockets and pipes?
    pinotree: pipes work, and sockets seem to work too (I've created a new
    pfinet device for them and pinged it).
    try with simple C apps
    Anyway, all these are just translators, so there shouldn't be any problems.
    pinotree: works
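
A "simple C app" in the sense suggested above could be as small as the
following sketch, which creates a FIFO and binds a Unix-domain socket on a
tmpfs mount. The mount point /tmp/tmpfs is an assumption; the test only checks
that both kinds of special file can be created at all.

    /* Sketch: create a named pipe and a Unix-domain socket on a tmpfs mount
       (hypothetical mount point /tmp/tmpfs) and report any failure.  */
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int
    main (void)
    {
      struct sockaddr_un sun;
      int fd;

      /* Named pipe.  */
      if (mkfifo ("/tmp/tmpfs/testfifo", 0600) < 0)
        perror ("mkfifo");

      /* Unix-domain socket bound to a file on the tmpfs.  */
      fd = socket (AF_UNIX, SOCK_STREAM, 0);
      if (fd < 0)
        {
          perror ("socket");
          return 1;
        }

      memset (&sun, 0, sizeof sun);
      sun.sun_family = AF_UNIX;
      strcpy (sun.sun_path, "/tmp/tmpfs/testsock");
      if (bind (fd, (struct sockaddr *) &sun, sizeof sun) < 0)
        perror ("bind");

      close (fd);
      unlink ("/tmp/tmpfs/testsock");
      unlink ("/tmp/tmpfs/testfifo");
      return 0;
    }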