[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
[[!toc]]
#Notes on tmpfs
## mach-defpager
[[defpager|http://www.bddebian.com:8888/~hurd-web/user/Maksym_Planeta/#defpager81111]]
[[http://www.mail-archive.com/bug-hurd@gnu.org/msg18859.html]]
[[My patch that fixes tmpfs and defpager|http://lists.gnu.org/archive/html/bug-hurd/2011-11/msg00086.html]]
My vision of the problem in short: when user tries to access to the memory backed by tmpfs first time, kernel asks defpager to initialize memory object and sets it's request port to some value. But sometimes user directly accesses to defpager and supplies it own request port. So defpager should handle different request port, what causes errors.
TODO: pager_extend always frees old pagemap and allocates new. Consider if it is necessary.
## Steps
### Find out what causes crashes in tmpfs with defpager
[[http://www.gnu.org/s/hurd/hurd/translator/tmpfs/notes_various.html]]
TODO: Consider deleting of parameter "port" in function mach-defpager/default_pager.c:pager_port_list_insert
since this parameter is unused
Probably pager_request shouldn't be stored because request may arrive from different kernels (or from kernel and translator), so this parameter doesn't have any sense.
22.11.11 Reading/writing for any size works, [[this|http://lists.gnu.org/archive/html/bug-hurd/2011-11/msg00127.html]] works, but fsx test fails ([[see|http://www.bddebian.com:8888/~hurd-web/user/Maksym_Planeta/#fsx_fail2211]]).
### Write own pager
6.11.11 Reading/writing for files that fit in vm_page_size works
7.11.11 Works for any size.
TODO: During execution tmpfs hangs in random places. The most possible is variant is deadlocks,
because nothing was undertaken for thread safety.
TODO: Make tmpfs use not more space than it was allowed.
### Make links work
Symlinks behavior: [[links|http://www.bddebian.com:8888/~hurd-web/user/Maksym_Planeta/#links81111]]
8.11.11 Symlinks work.
[[Patch by Ben Asselstine.|http://thread.gmane.org/gmane.os.hurd.bugs/11829/focus=12098]]
### After sometime of inactivity tmpfs exits.
TODO: Find out why and correct this.
> This may perhaps be the standard translator
> shutdown-after-a-period-of-inactivity functionality? This is of course not
> valid in the tmpfs case, as all its state (file system content) is not stored
> on any non-volatile backend. Can this auto-shutdown functionality be
> disabled (in [[hurd/libdiskfs]], perhaps? See `libdiskfs/init-first.c`, the
> call to `ports_manage_port_operations_multithread`, the `global_timeout`
> paramenter (see the [[hurd/reference_manual]]) (here: `server_timeout`).
> Probably this should be set to `0` in tmpfs to disable timing out?
> --[[tschwinge]]
### Passive translator doesn't work
> Must be some bug: `diskfs_set_translator` is implemented in `node.c`, so this
> is supposed to work. --[[tschwinge]]
#Chalanges
Translators vs FUSE:
[[What can a translator do that FUSE can't?|http://lists.gnu.org/archive/html/bug-hurd/2010-07/msg00061.html]]
[[Re: Hurd translators on FUSE|http://lists.gnu.org/archive/html/l4-hurd/2009-09/msg00146.html]]
[[Example of sane utilization of filesystem stored in RAM|http://habrahabr.ru/blogs/gdev/131043/]] (Russian). Author of this article copied some resources of game "World of Tanks" to RAM-drive and game started load much faster. Although he used Windows in this article, this could be good example of benefits, which filesystem, stored in RAM, could give.
#Debugging
To debug tmpfs, using libraries from "$PWD"/lib and trace rpc:
settrans -ca foo /usr/bin/env LD_LIBRARY_PATH="$PWD"/lib utils/rpctrace -I /usr/share/msgids/ tmpfs/tmpfs 1M
LD_LIBRARY_PATH="$PWD"/lib gdb tmpfs/tmpfs `pidof tmpfs`
For debugging ext2fs:
settrans --create --active ramdisk0 /hurd/storeio -T copy zero:32M && \
/sbin/mkfs.ext2 -F -b 4096 ramdisk0 && \
settrans --active --orphan ramdisk0 /usr/bin/env LD_LIBRARY_PATH="$PWD"/lib utils/rpctrace -I /usr/share/msgids/ \
ext2fs/ext2fs.static ramdisk0
#Questions
1. What are sequence numbers? What are they used for?
> See [[microkernel/mach/ipc/sequence_numbering]]. --[[tschwinge]]
2. Is there any way to debug mach-defpager? When I set breakpoint to any function in it, pager never breaks.
> Is that still an unresolved problem? If not, what was the problem? --[[tschwinge]]
3. Is it normal that defpager panics and breaks on any wrong data?
> No server must ever panic (or reach any other invalid state), whichever input
> data it receives from untrusted sources (which may include every possible
> source). All input data must be checked for validity before use; anything
> else is a bug. --[[tschwinge]]
#Links
1. [[Cthreads manuals|http://www.cc.gatech.edu/classes/cs6432_99_winter/threads_man/]]
#Conversations
## 8.11.11
### links
(10:29:11) braunr: mcsim: ln -s foo/bar foo/baz means the link name is baz in the foo directory,
and its target (relative to its directory) is foo/bar (which would mean /tmp/foo/foo/bar in canonical form)
(10:29:42) braunr: youpi: tschwinge: what did ludovic achieve ?
(10:30:06) tschwinge: mcsim: As Richard says, symlink targets are always relative to the directory they're contained in.
(10:31:26) braunr: oh ok
(10:31:27) mcsim: so, if I want to create link in cd, first I need to cd there?
(10:31:36) mcsim: in foo*
(10:31:36) braunr: mcsim: just provide the right paths
(10:32:11) braunr: $ touch foo/bar
(10:32:14) braunr: $ ln -s bar foo/baz
(10:32:32) braunr: bar
(10:32:35) braunr: baz -> bar
### defpager
earlier:
: 1. On every system there is a ``default pager'' (mach-defpager). That one is responsible
for all ``anonymous memory''. For example, when you do malloc(10 MiB), and then there is memory pressure,
this 10 MiB memory region is backed by the default pager, whose job then is it to provide the backing store for this.
: This is what commonly would be known as a swap partition.
: And this is also the way tmpfs works (as I understand it).
: malloc(10 MiB) can also be mmap(MAP_ANONYMOUS, 10 MIB); that's the same, essentially.
: Now, for ext2fs or any other disk-based file system, this is different:
: The ext2fs translator implements its own backing store, namely it accesses the disk for storing
changed file content, or to read in data from disk if a new file is opened.
...
(10:36:14) mcsim: who else uses defpager besides tmpfs and kernel?
(10:36:27) braunr: normally, nothing directly
(10:37:04) mcsim: than why tmpfs should use defpager?
(10:37:22) braunr: it's its backend
(10:37:28) braunr: backign store rather
(10:37:38) braunr: the backing store of most file systems are partitions
(10:37:44) braunr: tmpfs has none, it uses the swap space
(10:39:31) mcsim: if we allocate memory for tmpfs using vm_allocate, will it be able to use swap partition?
(10:39:56) braunr: it should
(10:40:27) braunr: vm_allocate just maps anonymous memory
(10:41:27) braunr: anonymous memory uses swap space as its backing store too
(10:43:47) braunr: but be aware that this part of the vm system is known to have deficiencies
(10:44:14) braunr: which is why all mach based implementations have rewritten their default pager
(10:45:11) mcsim: what kind of deficiencies?
(10:45:16) braunr: bugs
(10:45:39) braunr: and design issues, making anonymous memory fragmentation horrible
...
(15:23:33) antrik: mcsim: vm_allocate doesn't return a memory object; so it can't be passed to clients for mmap()
(15:50:37) mcsim: antrik: I use vm_allocate in pager_read_page
(15:54:43) antrik: mcsim: well, that means that you have to actually implement a pager yourself
(15:56:10) antrik: also, when the kernel asks the pager to write back some pages, it expects the memory to become free.
if you are "paging" to ordinary anonymous memory, this doesn't happen; so I expect it to have a very bad effect
on system performance
(15:56:54) antrik: both can be avoided by just passing a real anonymous memory object, i.e. one provided by the defpager
(15:57:07) antrik: only problem is that the current defpager implementation can't really handle that...
#Pastes
## fsx test on 22.11.11
$ ~/src/fsx/fsx -W -R -t 4096 -w 4096 -r 4096 bar
mapped writes DISABLED
truncating to largest ever: 0x32000
truncating to largest ever: 0x39000
READ BAD DATA: offset = 0x16000, size = 0xd9a0
OFFSET GOOD BAD RANGE
0x1f000 0x0000 0x01a0 0x 2d14
operation# (mod 256) for the bad data may be 1
LOG DUMP (16 total operations):
1(1 mod 256): WRITE 0x1f000 thru 0x21d28 (0x2d29 bytes) HOLE ***WWWW
2(2 mod 256): WRITE 0x10000 thru 0x15bfe (0x5bff bytes)
3(3 mod 256): READ 0x1d000 thru 0x21668 (0x4669 bytes)
4(4 mod 256): READ 0xa000 thru 0x16a59 (0xca5a bytes)
5(5 mod 256): READ 0x8000 thru 0x8f2c (0xf2d bytes)
6(6 mod 256): READ 0xa000 thru 0x17fe8 (0xdfe9 bytes)
7(7 mod 256): READ 0x1b000 thru 0x20f33 (0x5f34 bytes)
8(8 mod 256): READ 0x15000 thru 0x1c05b (0x705c bytes)
9(9 mod 256): TRUNCATE UP from 0x21d29 to 0x32000
10(10 mod 256): READ 0x3000 thru 0x5431 (0x2432 bytes)
11(11 mod 256): WRITE 0x29000 thru 0x34745 (0xb746 bytes) EXTEND
12(12 mod 256): TRUNCATE DOWN from 0x34746 to 0x19000 ******WWWW
13(13 mod 256): READ 0x14000 thru 0x186d8 (0x46d9 bytes)
14(14 mod 256): TRUNCATE UP from 0x19000 to 0x39000 ******WWWW
15(15 mod 256): WRITE 0x28000 thru 0x3548c (0xd48d bytes)
16(16 mod 256): READ 0x16000 thru 0x2399f (0xd9a0 bytes) ***RRRR***
Correct content saved for comparison
(maybe hexdump "bar" vs "bar.fsxgood")