summaryrefslogtreecommitdiff
path: root/hurd/translator/tmpfs/discussion.mdwn
diff options
context:
space:
mode:
Diffstat (limited to 'hurd/translator/tmpfs/discussion.mdwn')
-rw-r--r--hurd/translator/tmpfs/discussion.mdwn266
1 files changed, 265 insertions, 1 deletions
diff --git a/hurd/translator/tmpfs/discussion.mdwn b/hurd/translator/tmpfs/discussion.mdwn
index 486206e3..0409f046 100644
--- a/hurd/translator/tmpfs/discussion.mdwn
+++ b/hurd/translator/tmpfs/discussion.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2011, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -19,3 +19,267 @@ License|/fdl]]."]]"""]]
* [[!GNU_Savannah_bug 26751]]
* [[!GNU_Savannah_bug 32755]]
+
+
+# [[Maksym_Planeta]]
+
+## IRC, freenode, #hurd, 2011-11-29
+
+ <mcsim> Hello. In seqno_memory_object_data_request I call
+ memory_object_data_supply and supply one zero filled page, but seems that
+ kernel ignores this call because this page stays filled in specified
+ memory object. In what cases kernel may ignore this call? It is written
+ in documentation that "kernel prohibits the overwriting of live data
+ pages". But when I called memory_object_lock_request on this page with
+ should flush and MEMORY_OBJECT_RETURN_ALL nothing change
+ <braunr> what are you trying to do ?
+ <mcsim> I think that memory object holds wrong data, so I'm trying to
+ replace them. This happens when file is truncated, so I should notify
+ memory object that there is no some data. But since gnumach works only
+ with sizes that are multiple of vm_page_size, I should manually correct
+ last page for case when file size isn't multiple of vm_page_size. This is
+ needed for case when file grows again and that tail of last page, which
+ wasn't part of file should be filled wit
+ <mcsim> I've put some printf's in kernel and it seems that page that holds
+ data which I want replace both absent and busy:
+ <mcsim> m = vm_page_lookup(object,offset);
+ <mcsim> ...
+ <mcsim> if (m->absent && m->busy) { <-- Condition is true
+ <mcsim> in vm/memory_object.c:169
+ <slpz> mcsim: Receiving m_o_data_request means there's no page in the
+ memory object at that offset, so m_o_data_supply should work
+ <slpz> are you sure that page is not being installed into the memory
+ object?
+ <braunr> it seems normal it's both absent and busy
+ <braunr> absent because, as sergio said, the page is missing, and busy
+ because the kernel starts a transfer for its content
+ <braunr> i don't understand how you determine the kernel ignores your
+ data_supply
+ <braunr> "because this page stays filled in specified memory object"
+ <braunr> please explain this with more detail
+ <slpz> mcsim: anyway, when truncating a file to a non page-aligned length,
+ you can just zero fill the rest of the page by mapping the object and
+ writing to it with memset/bzero
+ <braunr> (avoid bzero, it's obsolete)
+ <mcsim> slpz: I'll try try it now.
+ <braunr> slpz: i think that's what he's trying to do
+ <mcsim> I don't vm_map it
+ <braunr> how do you zero it then ?
+ <braunr> "I call memory_object_data_supply and supply one zero filled page"
+ <mcsim> First I call mo_lock_request and ask to return this page, than I
+ memset tail and try to mo_data_supply
+ <mcsim> I use this function when I try to replace kr =
+ memory_object_data_supply(reply_to, offset, addr, vm_page_size, FALSE,
+ VM_PROT_NONE, FALSE, MACH_PORT_NULL);
+ <mcsim> where addr points to new data, offset points to old data in
+ object. and reply_to is memory_control which I get as parameter in
+ mo_data_request
+ <braunr> why would you want to vm_map it then ?
+ <mcsim> because mo_data_supply doesn't work.
+ <braunr> mcsim: i still don't see why you want to vm_map
+ <mcsim> I just want to try it.
+ <braunr> but what do you think will happen ?
+ <mcsim> But seems that it doesn't work too, because I can't vm_map
+ memory_object from memory_manager of this object.
+
+
+## IRC, freenode, #hurd, 2012-01-05
+
+ <mcsim> Seems tmpfs works now. The code really needs cleaning, but the main
+ is that it works. So in nearest future it will be ready for merging to
+ master branch. BTW, anyone knows good tutorial about refactoring using
+ git (I have a lot of pointless commits and I want to gather and scatter
+ them to sensible ones).
+ <antrik> I wonder whether he actually got the "proper" tmpfs with the
+ defaul pager working? or only the hack with a private pager?
+ <mcsim> antrik: with default pager
+ <antrik> mcsim: wow, that's great :-)
+ <antrik> how did you fix it?
+ <mcsim> antrik: The main code I wrote before December, so I forgot some of
+ what exactly I were doing. So I have to look through my code :)
+ <mcsim> antrik: old defpager was using old functions like m_o_data_write
+ instead of m_o_data_return etc. I changed it, mostly because of my
+ misunderstanding. But I hope that this is not a problem.
+
+
+## IRC, freenode, #hurd, 2012-01-18
+
+ <antrik> mcsim: did you publish your in-progress work?
+ <mcsim> there is a branch with working tmpfs in git repository:
+ http://git.savannah.gnu.org/cgit/hurd/hurd.git/log/?h=mplaneta/tmpfs/defpager
+ <jd823592> sorry for interrupting the meeting but i wonder what is a
+ lazyfs?
+ <mcsim> jd823592: lazyfs is tmpfs which uses own pager
+ <antrik> mcsim: ah, nice :-)
+ <antrik> BTW, what exactly did you need to fix to make it work?
+ <mcsim> most fixes wore in defpager in default_pager_object_set_size. Also,
+ as i said earlier, I switched to new functions (m_o_data_return instead
+ of m_o_data_write and so on). I said that this was mostly because of my
+ misunderstanding, but it turned out that new function provide work with
+ precious attribute of page.
+ <mcsim> Also there were some small errors like this:
+ <mcsim> pager->map = (dp_map_t) kalloc (PAGEMAP_SIZE (new_size));
+ <mcsim> memcpy (pager->map, old_mapptr, PAGEMAP_SIZE (old_size));
+ <mcsim> where in second line should be new_size too
+ <mcsim> I removed all warnings in compiling defpager (and this helped to
+ find an error).
+ <antrik> great work :-)
+ <jd823592> tmpfs is nice thing to have :), are there other recent
+ improvements that were not yet published in previous moth?
+ <mcsim> BTW, i measured tmpfs speed in it is up to 6 times faster than
+ ramdisk+ext2fs
+ <antrik> mcsim: whow, that's quite a difference... didn't expect that
+
+
+## IRC, freenode, #hurd, 2012-01-24
+
+ <mcsim> braunr: I'm just wondering is there any messages before hurd
+ breaks. I have quite strange message: memory_object_data_request(0x0,
+ 0x0, 0xf000, 0x1000, 0x1) failed, 10000003
+ <braunr> hm i don't think so
+ <braunr> usually it either freezes completely, or it panics because of an
+ exhausted resource
+ <mcsim> where first and second 0x0 are pager and pager_request for memory
+ object in vm_fault_page from gnumach/vm_fault.c
+ <braunr> if you're using the code you're currently working on (which i
+ assume), then look for a bug there first
+ <tschwinge> mcsim: Maybe you're running out of swap?
+ <mcsim> tschwinge: no
+ <braunr> also, translate the error code
+ <mcsim> AFAIR that's MACH_INVALID_DEST
+ <braunr> and what does it mean in this situation ?
+ <mcsim> I've run fsx as long as possible several times. It runs quite long
+ but it breaks in different ways.
+ <mcsim> MACH_SEND_INVALID_DEST
+ <mcsim> this means that kernel tries to call rpc with pager 0x0
+ <mcsim> this is invalid destiantion
+ <braunr> null port
+ <braunr> ok
+ <braunr> did the pager die ?
+ <mcsim> When I get this message pager dies, but also computer can suddenly
+ reboot
+ <braunr> i guess the pager crashing makes mach print this error
+ <braunr> but then you may have a dead port instead of a null port, i don't
+ remember the details
+ <mcsim> braunr: thank you.
+ <mcsim> btw, for big file sizes fsx breaks on ext2fs
+ <braunr> could you identify the threshold ?
+ <braunr> and what's fsx exactly ?
+ <mcsim> fsx is a testing utility for filesystems
+ <mcsim> see http://codemonkey.org.uk/projects/fsx/
+ <braunr> ah, written by tevanian
+ <mcsim> threshold seems to be 8Mb
+ <braunr> fyi, avadis tevanian is the main author of the mach 3 core
+ services and VM parts
+ <braunr> well, ext2fs is bugged, we already know that
+ <braunr> old code maintained as well as possible, but still
+ <mcsim> hmm, with 6mb it breaks too
+ <braunr> i guess that it may break on anything larger than a page actually
+ :p
+ <mcsim> When I tested with size of 256kb, fsx worked quite long and didn't
+ break
+ <braunr> mcsim: without knowing exactly what the test actually does, it's
+ hard to tell
+ <mcsim> I see, I just wanted to tell that there are bugs in ext2fs too. But
+ I didn't debugged it.
+ <mcsim> fsx performs different operations, like read, write, truncate file,
+ grow file in random order.
+ <braunr> in parellel too ?
+ <braunr> parellel
+ <braunr> parallel*
+ <mcsim> no
+ <mcsim> I run several fsx's parallel on tmpfs, but they break on file with
+ size 8mb.
+ <braunr> that must match something in mach
+ <braunr> s/must/could/ :)
+ <mcsim> braunr: I've pushed my commits to mplaneta/tmpfs/master branch in
+ hurd repository, so you could review it.
+ <braunr> you shouldn't do that just for me :p
+ <braunr> you should do that regularly, and ask for reviews after
+ (e.g. during the meetings)
+ <mcsim> everyone could do that :)
+ <braunr> i'm quite busy currently unfortunately
+ <braunr> i'll try when i have time, but the best would be to ask very
+ specific questions
+ <braunr> these are usually the faster to answer for people ho have the
+ necessary expertise to help you
+ <braunr> fastest*
+ <mcsim> ok.
+ <mcsim> braunr: probably, I was doing something wrong, because now parallel
+ works only for small sizes. Sorry, for disinformation.
+
+
+### IRC, freenode, #hurd, 2012-01-25
+
+ <antrik> braunr: actually, the paging errors are *precisely* the way my
+ system tends to die...
+ <antrik> (it's after about a month of uptime usually though, not a week...)
+ <antrik> tschwinge: in my case at least, I have still plenty of swap when
+ this happens. swap usage is generally at about the amount of physical
+ memory -- I have no idea though whether there is an actual connection, or
+ it's just coincidence
+ <braunr> antrik: ok, your hurd dies because of memory issues, my virtual
+ machines die because of something else (though idk what)
+ <antrik> before I aquired the habit of running my box 24/7 and thus hitting
+ this issue, most of the hangs I experienced were also of a different
+ nature... but very rare in general, except when doing specific
+ problematic actions
+ <mcsim> antrik: yes. Do you get messages like that I posted?
+ <mcsim> here is it: memory_object_data_request(0x0, 0x0, 0xf000, 0x1000,
+ 0x1) failed, 10000003
+ <antrik> mcsim: I can't tell for sure (never noted them down, silly me...)
+ <antrik> but I definitely get paging errors right before it hangs
+ <antrik> I guess that was unclear... what I'm trying to say is: I do get
+ memory_object_data_request() failed; but I'm not sure about the
+ parameters
+ <mcsim> antrik: ok. Thank you.
+ <mcsim> I'll try to find something in defpager, but there should be errors
+ in mach too. At least because sometimes computer suddenly reboots during
+ test.
+ <antrik> mcsim: I don't get sudden reboots
+ <antrik> might be a different error
+ <antrik> do you have debugging mode activated in Mach? otherwise it reboots
+ on kernel panics...
+ <mcsim> antrik: no. But usually on kernel panics mach waits for some time
+ showing the error message and only than reboots.
+ <antrik> OK
+ <mcsim> how can I know that tmpfs is stable enough? Correcting errors in
+ kernel to make fsx test work seems to be very complex.
+ <mcsim> *If errors are in kernel.
+ <antrik> well, it seems that you tested it already much more thoroughly
+ than any other code in the Hurd was ever tested :-)
+ <antrik> of course it would be great if you could pinpoint some of the
+ problems you see nevertheless :-)
+ <antrik> but that's not really necessary before declaring tmpfs good enough
+ I'd say
+ <mcsim> ok. I'll describe every error I meet on my userpage
+ <mcsim> but it will take some time, not before weekend.
+ <antrik> don't worry, it's not urgent
+ <antrik> the reason I'd really love to see those errors investigated is
+ that most likely they are the same ones that cause stability problems in
+ actual use...
+ <antrik> having an easy method for reproducing them is already a good start
+ <mcsim> no. they are not the same
+ <mcsim> every time i get different one
+ <mcsim> especially when i just start one process fsx and wait error
+ <antrik> mcsim: have you watched memory stats while running it? if it's
+ related to the problems I'm experiencing, you will probably see rising
+ memory use while the test is running
+ <mcsim> it could be reboot, message, I posted and also fsx could stop
+ telling that something wrong with data
+ <antrik> you get all of these also on ext2?
+ <mcsim> i've done it only once. Here is the log:
+ http://paste.debian.net/153511/
+ <mcsim> I saved "free" output every 30 seconds
+ <mcsim> no. I'll do it now
+ <antrik> would be better to log with "vmstat 1"
+ <mcsim> ok.
+ <mcsim> as you can see, there is now any leek during work. But near end
+ free memory suddenly decreases
+ <antrik> yeah... it's a bit odd, as there is a single large drop, but seems
+ stable again afterwards...
+ <antrik> a more detailed log might shed some light
+ <mcsim> drop at the beginning was when I started translator.
+ <mcsim> what kind of log do you mean?
+ <antrik> vmstat 1 I mean
+ <mcsim> ah...