[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
[[!tag open_issue_gnumach open_issue_hurd]]
# IRC, freenode, #hurd, 2010
<slpz> humm... why does tmpfs try to use the default pager? that's a bad
idea, and probably will never work correctly...
* slpz is thinking about old issues
<slpz> tmpfs should create its own pagers, just like ext2fs, storeio...
<slpz> slopez@slp-hurd:~$ settrans -a tmp /hurd/tmpfs 10M
<slpz> slopez@slp-hurd:~$ echo "foo" > tmp/bar
<slpz> slopez@slp-hurd:~$ cat tmp/bar
<slpz> foo
<slpz> slopez@slp-hurd:~$
<slpz> :-)
<pochu> slpz: woo you fixed it?
<slpz> pochu: well, it's WIP, but reading/writing works...
<slpz> I've replaced the use of the default pager with the standard pager
creation mechanism
<antrik> slpz: err... how is it supposed to use swap space if not using the
default pager?
<antrik> slpz: or do you mean that it should act as a proxy, just
allocating anonymous memory (backed by the default pager) itself?
<youpi> antrik: the kernel uses the default pager if the application pager
isn't responsive enough
<slpz> antrik: it will just create memory objects and provide zerofilled
pages when requested by the kernel (after a page fault)
<antrik> youpi: that makes sense I guess... but how is that relevant to the
question at hand?...
<slpz> antrik: memory objects will contain the data by themselves
<slpz> antrik: as youpi said, when memory is scarce, GNU Mach will start
paging out data from memory objects to the default pager
<slpz> antrik: that's the way in which pages will get into swap space
<slpz> (if needed)
<youpi> the thing being that the tmpfs pager has a chance to select pages
he doesn't care any more about
<antrik> slpz: well, the point is that instead of writing the pages to a
backing store, tmpfs will just keep them in anonymous memory, and let the
default pager write them out when there is pressure, right?
<antrik> youpi: no idea what you are talking about. apparently I still
don't really understand this stuff :-(
<youpi> ah, but tmpfs doesn't have pages he doesn't care about, does it?
<slpz> antrik: yes, but the term "anonymous memory" could be a bit
confusing.
<slpz> antrik: in GNU Mach, anonymous memory is backed by a memory object
without a pager. In tmpfs, nodes will be allocated in memory objects, and
the pager for those memory objects will be tmpfs itself
<antrik> slpz: hm... I thought anonymous memory is backed by memory objects
created from the default pager?
<antrik> yes, I understand that tmpfs is supposed to be the pager for the
objects it provides. they are obviously not anonymous -- they have
inodes in the tmpfs name space
<antrik> but my understanding so far was that when Mach returns pages to
the pager, they end up in anonymous memory allocated to the pager
process; and then this pager is responsible for writing them back to the
actual backing store
<antrik> am I totally off there?...
<antrik> (i.e. in my understanding the returned pages do not reside in the
actual memory object the pager provides, but in an anonymous memory
object)
<slpz> antrik: you're right. The trick here is, when does Mach return the
pages?
<slpz> antrik: if we set the attribute "can_persist" in a memory object,
Mach will keep it until object cache is full or memory is scarce
<slpz> or we change the attributes so it can no longer persist, of course
<slpz> without a backing store, if Mach starts sending us pages to be
written, we're in trouble
<slpz> so we must do something about it. One option, could be creating
another pager and copying the contents between objects.
<antrik> another pager? not sure what you mean
<antrik> BTW, you didn't really say why we can't use the default pager for
tmpfs objects :-)
<slpz> well, there're two problems when using the default pager as backing
store for translators
<slpz> 1) Mach relies on it to do swapping tasks, so meddling with it is
not a good idea
<slpz> 2) There're problems with seqnos when trying to work with the
default pager from tasks other than the kernel itself
<slpz> (probably, the latter could be fixed)
<slpz> antrik: pager's terminology is a bit confusing. One can also say
creating another memory object (though the function in libpager is
"pager_create")
<antrik> not sure why "meddling" with it would be a problem...
<antrik> and yeah, I was vaguely aware that there is some seqno problem
with tmpfs... though so far I didn't really understand what it was about
:-)
<antrik> makes sense now
<antrik> anyways, AIUI now you are trying to come up with a mechanism where
the default pager is not used for tmpfs objects directly, but without
making it inefficient?
<antrik> slpz: still don't understand what you mean by creating another
memory object/pager...
<antrik> (and yeah, the terminology is pretty mixed up even in Mach itself)
<slpz> antrik: I meant creating another pager, in terms of calling again to
libpager's pager_create
<antrik> slpz: well, I understand what "create another pager" means... I
just don't understand what this other pager would be, when you would
create it, and what for...
<slpz> antrik: oh, ok, sorry
<slpz> antrik: creating another pager is just a trick to avoid losing
information when Mach's objects cache is full, and it decides to purge
one of our objects
<slpz> anyway, IMHO object caching mechanism is obsolete and should be
replaced
<slpz> I'm writing a comment to bug #28730 which says something about this
<slpz> antrik: just one more thing :-)
<slpz> if you look at the code, for most time of their lives, anonymous
memory objects don't have a pager
<slpz> not even the default one
<slpz> only the pageout thread, when the system is running really low on
memory, gives them a reference to the default pager by calling
vm_object_pager_create
<slpz> this is not really important, but worth noting ;-)
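The scheme slpz describes in this log can be pictured with a toy model. The sketch below is not Hurd or Mach code: `ToyKernel`, `TmpfsPager` and `DefaultPager` are hypothetical names invented for illustration, and the eviction policy (least recently used, fixed page budget) is an assumption. It only shows the division of labour: the tmpfs pager supplies zero-filled pages on a fault and has no backing store, while the kernel hands pages to the default pager when memory is scarce.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size, for illustration only


class DefaultPager:
    """Stands in for swap space: keeps pages the kernel evicted."""

    def __init__(self):
        self.swap = {}

    def has(self, key):
        return key in self.swap

    def page_out(self, key, data):
        self.swap[key] = bytes(data)

    def page_in(self, key):
        return bytearray(self.swap.pop(key))


class TmpfsPager:
    """Toy tmpfs pager: no backing store, only zero-filled pages on first touch."""

    def supply_page(self, node, index):
        return bytearray(PAGE_SIZE)


class ToyKernel:
    """Keeps at most max_resident pages in memory (hypothetical LRU policy)."""

    def __init__(self, max_resident, pager, default_pager):
        self.max_resident = max_resident
        self.pager = pager
        self.default_pager = default_pager
        self.resident = OrderedDict()  # (node, index) -> bytearray

    def _fault(self, key):
        if key in self.resident:
            self.resident.move_to_end(key)
            return self.resident[key]
        if self.default_pager.has(key):
            page = self.default_pager.page_in(key)  # page was swapped out earlier
        else:
            page = self.pager.supply_page(*key)     # first touch: zero-filled
        self.resident[key] = page
        # Under memory pressure, evicted pages go to the default pager,
        # not back to the tmpfs pager.
        while len(self.resident) > self.max_resident:
            victim, data = self.resident.popitem(last=False)
            self.default_pager.page_out(victim, data)
        return page

    def write(self, node, index, data):
        page = self._fault((node, index))
        page[:len(data)] = data

    def read(self, node, index, length):
        return bytes(self._fault((node, index))[:length])


swap = DefaultPager()
kernel = ToyKernel(max_resident=2, pager=TmpfsPager(), default_pager=swap)
kernel.write("bar", 0, b"foo\n")
kernel.write("bar", 1, b"x")
kernel.write("bar", 2, b"y")          # exceeds the budget, page 0 is evicted
swapped_before_read = swap.has(("bar", 0))
data = kernel.read("bar", 0, 4)       # faults page 0 back in from the default pager
```

In this model the tmpfs pager never receives a page to write, which is the situation slpz wants to preserve; the open question in the log is what happens when Mach returns pages to tmpfs itself anyway.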
# IRC, freenode, #hurd, 2011-09-28
<slpz> mcsim: "Fix tmpfs" task should be called "Fix default pager" :-)
<slpz> mcsim: I've been thinking about modifying tmpfs to actually have
its own storeio based backend, even if a tmpfs with storage sounds a bit
stupid.
<slpz> mcsim: but I don't like the idea of having translators messing up
with the default pager...
<antrik> slpz: messing up?...
<slpz> antrik: in the sense of creating a number of arbitrarily sized
objects
<antrik> slpz: well, it doesn't really matter much whether a process
indirectly eats up arbitrary amounts of swap through tmpfs, or directly
through vm_allocate()...
<antrik> though admittedly it's harder to implement resource limits with
tmpfs
<slpz> antrik: but I've talked about having its own storeio device as
backend. This way Mach can pageout memory to tmpfs if it's needed.
<mcsim> Do I understand correctly that the goal of tmpfs task is to create
tmpfs in RAM?
<slpz> mcsim: It is. But it also needs some kind of backend, just in case
it's ordered to page out data to free some system's memory.
<slpz> mcsim: Nowadays, this backend is another translator that acts as
default pager for the whole system
<antrik> slpz: pageout memory to tmpfs? not sure what you mean
<slpz> antrik: I mean tmpfs acting as its own pager
<antrik> slpz: you mean tmpfs not using the swap partition, but some other
backing store?
<slpz> antrik: Yes.
See also: [[open_issues/resource_management_problems/pagers]].
<antrik> slpz: I don't think an extra backing store for tmpfs is a good
idea. the whole point of tmpfs is not having a backing store... TBH, I'd
even like to see a single backing store for anonymous memory and named
files
<slpz> antrik: But you need a backing store, even if it's the default pager
:-)
<slpz> antrik: The question is, Should users share the same backing store
(swap space) or provide their own?
<antrik> slpz: not sure what you mean by "users" in this context :-)
<slpz> antrik: Real users with the ability of setting tmpfs translators
<antrik> essentially, I'd like to have a single partition that contains
both swap space and the main filesystem (at least /tmp, but probably also
all of /run, and possibly even /home...)
<antrik> but that's a bit off-topic :-)
<antrik> well, ideally all storage should be accounted to a user,
regardless whether it's swapped out anonymous storage, temporary named
files, or permanent files
<slpz> antrik: you could use a file as backend for tmpfs
<antrik> slpz: what's the point of using tmpfs then? :-)
<pinotree> (and then store the file in another tmpfs)
<slpz> antrik: mach-defpager could be modified to use storeio instead of
Mach's device_* operations, but by the way things work right now, that
could be dangerous, IMHO
<antrik> pinotree: hehe
<pinotree> .. recursive tmpfs'es ;)
<antrik> slpz: hm, sounds interesting
<slpz> antrik: tmpfs would try to keep data in memory whenever it's possible
(not calling m_o_lock_request would do the trick), but if memory is
scarce and Mach starts paging out, it would write it to that
file/device/whatever
<antrik> ideally, all storage used by system tasks for swapped out
anonymous memory as well as temporary named files would end up on the
/run partition; while all storage used by users would end up in /home/*
<antrik> if users share a partition, some explicit storage accounting would
be useful too...
<antrik> slpz: is that any different from what "normal" filesystems do?...
<antrik> (and *should* it be different?...)
<slpz> antrik: Yes, as most FS try to synchronize to disk at a reasonable
rate, to prevent data losses.
<slpz> antrik: tmpfs would be a FS that wouldn't synchronize until it's
forced to do that (which, by the way, it's what's currently happening
with everyone that uses the default pager).
<antrik> slpz: hm, good point...
<slpz> antrik: Also, metadata is never written to disk, only kept in memory
(which saves a lot of I/O, too).
<slpz> antrik: In fact, we would be doing the same as every other kernel
does, but doing it explicitly :-)
<antrik> I see the use in separating precious data (in permanent named
files) from temporary state (anonymous memory and temporary named files)
-- but I'm not sure whether having a completely separate FS for the
temporary data is the right approach for that...
<slpz> antrik: And giving the user the option to specify its own storage,
so we don't limit him to the size established for swap by the super-user.
<antrik> either way, that would be a rather radical change... still would
be good to fix tmpfs as it is first if possible
<antrik> as for limited swap, that's precisely why I'd prefer not to have
an extra swap partition at all...
<slpz> antrik: It's not much of a change, it's how it works right now, with
the exception of replacing the default pager with its own.
<slpz> antrik: I think it's just a matter of 10-20 hours, at
most. Including testing.
<slpz> antrik: It could be forked with another name, though :-)
<antrik> slpz: I don't mean radical change in the implementation... but a
radical change in the way it would be used
<slpz> antrik: I suggest "almosttmpfs" as the name for the forked one :-P
<antrik> hehe
<antrik> how about lazyfs?
<slpz> antrik: That sounds good to me, but probably we should use a more
descriptive name :-)
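The difference slpz draws in this log between a regular filesystem (synchronizes at a reasonable rate) and the proposed tmpfs variant (synchronizes only when forced) can be sketched as two write-back policies. This is a toy model under stated assumptions, not Hurd code: `BackingStore`, `WriteBackCache`, `sync_interval` and `run` are hypothetical names, and time is a plain integer tick.

```python
class BackingStore:
    """Counts writes so the two policies can be compared."""

    def __init__(self):
        self.blocks = {}
        self.writes = 0

    def write(self, key, data):
        self.blocks[key] = data
        self.writes += 1


class WriteBackCache:
    """Dirty data stays in memory until a flush.

    sync_interval=None models the lazy policy: nothing is flushed on a
    timer, only when flush() is called explicitly (e.g. on memory pressure).
    """

    def __init__(self, store, sync_interval=None):
        self.store = store
        self.sync_interval = sync_interval
        self.dirty = {}
        self.last_sync = 0

    def write(self, key, data, now):
        self.dirty[key] = data
        if self.sync_interval is not None and now - self.last_sync >= self.sync_interval:
            self.flush(now)

    def flush(self, now):
        for key, data in self.dirty.items():
            self.store.write(key, data)
        self.dirty.clear()
        self.last_sync = now


def run(cache):
    """Rewrite the same file ten times, then force one flush."""
    for tick in range(1, 11):
        cache.write("bar", f"version {tick}", now=tick)
    cache.flush(now=11)  # forced, standing in for a pageout request
    return cache.store.writes


regular_store, lazy_store = BackingStore(), BackingStore()
regular_writes = run(WriteBackCache(regular_store, sync_interval=5))
lazy_writes = run(WriteBackCache(lazy_store))
```

Both stores end up with the same final contents; the lazy policy simply issues fewer writes, at the cost of losing more data if the system goes down before a forced flush.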
## 2011-09-29
<tschwinge> slpz, antrik: There is a defpager in the Hurd code. It is not
currently being used, and likely incomplete. It is backed by libstore.
I have never looked at it.