[[!meta copyright="Copyright © 2011, 2014 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!tag open_issue_gnumach open_issue_hurd]]

[[community/gsoc/project_ideas/disk_io_performance]].

[[!toc]]


# IRC, freenode, #hurd, 2011-02-16

    <braunr> except for the kernel, everything in an address space is
      represented with a VM object
    <braunr> those objects can represent anonymous memory (from malloc() or
      because of a copy-on-write)
    <braunr> or files
    <braunr> on classic Unix systems, these are files
    <braunr> on the Hurd, these are memory objects, backed by external pagers
      (like ext2fs)
    <braunr> so when you read a file
    <braunr> the kernel maps it from ext2fs in your address space
    <braunr> and when you access the memory, a fault occurs
    <braunr> the kernel determines it's a region backed by ext2fs
    <braunr> so it asks ext2fs to provide the data
    <braunr> when the fault is resolved, your process goes on
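
The request/response cycle braunr describes can be sketched in code. The following is only an illustration of the external pager interface as documented in the GNU Mach reference manual (see the link in the 2011-07-22 log below); `read_from_disk` is a hypothetical placeholder for the translator's backing-store access, and all error handling and bookkeeping are omitted.

    #include <mach.h>

    /* Hypothetical helper: read LENGTH bytes starting at OFFSET from
       the backing store into a fresh page-aligned buffer.  */
    extern vm_offset_t read_from_disk (memory_object_t object,
                                       vm_offset_t offset,
                                       vm_size_t length);

    /* Invoked (via MIG dispatch) when the kernel lacks pages of a
       memory object: this is the kernel-to-pager half of a page fault
       on a file mapping.  */
    kern_return_t
    memory_object_data_request (memory_object_t object,
                                memory_object_control_t control,
                                vm_offset_t offset,
                                vm_size_t length,
                                vm_prot_t desired_access)
    {
      vm_offset_t data = read_from_disk (object, offset, length);

      /* Hand the pages back to the kernel; they are installed directly
         in the faulting task's address space, with no intermediate
         buffer.  */
      return memory_object_data_supply (control, offset, data, length,
                                        VM_PROT_NONE, FALSE,
                                        MACH_PORT_NULL);
    }

On the Hurd this dispatch is normally hidden behind libpager, but the shape of the exchange is the same: one fault, one data request, one supply.
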
    <etenil> does the fault occur because Mach doesn't know how to access the
      memory?
    <braunr> it occurs because Mach intentionally didn't back the region with
      physical memory
    <braunr> the MMU is programmed not to know what is present in the memory
      region
    <braunr> or because it's read only
    <braunr> (which is the case for COW faults)
    <etenil> so that means this bit of memory is a buffer that ext2fs loads the
      file into and then it is remapped to the application that asked for it
    <braunr> more or less, yes
    <braunr> ideally, it's directly written into the right pages
    <braunr> there is no intermediate buffer
    <etenil> I see
    <etenil> and as you told me before, currently the page faults are handled
      one at a time
    <etenil> which wastes a lot of time
    <braunr> a certain amount of time
    <etenil> enough to bother the user :)
    <etenil> I've seen pages have a fixed size
    <braunr> yes
    <braunr> use the PAGE_SIZE macro
    <etenil> and when allocating memory, the size that's asked for is rounded
      up to the page size
    <etenil> so if I have this correctly, it means that a file ext2fs provides
      could be split into a lot of pages
    <braunr> yes
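
The rounding etenil describes is easy to check with the `PAGE_SIZE` macro braunr mentions. A trivial sketch, assuming `PAGE_SIZE` is available through `<mach/vm_param.h>` (the exact header layout may vary between systems):

    #include <stdio.h>
    #include <mach/vm_param.h>  /* PAGE_SIZE */

    int
    main (void)
    {
      /* Sizes are rounded up to whole pages, so an 8000-byte file
         needs two 4 KiB pages, each of which can fault in separately.
         Mach headers also provide a round_page() macro for this.  */
      unsigned long size = (8000 + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
      printf ("%lu bytes -> %lu pages\n", size, size / PAGE_SIZE);
      return 0;
    }

On a 4 KiB-page machine this prints `8192 bytes -> 2 pages`.
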
    <braunr> once in memory, it is managed by the page cache
    <braunr> so that pages more actively used are kept longer than others
    <braunr> in order to minimize I/O
    <etenil> ok
    <braunr> so a better page cache code would also improve overall performance
    <braunr> and more RAM would help a lot, since we are strongly limited by
      the 768 MiB limit
    <braunr> which reduces the page cache size a lot
    <etenil> but the problem is that reading a whole file in means triggering
      many page faults just for one file
    <braunr> if you want to stick to the page clustering thing, yes
    <braunr> you want fewer page faults, so that there is less IPC between
      the kernel and the pager
    <etenil> so either I make pages bigger
    <etenil> or I modify Mach so it can check up on a range of pages for faults
      before actually processing
    <braunr> you *don't* change the page size
    <etenil> ah
    <etenil> that's hardware isn't it?
    <braunr> in Mach, yes
    <etenil> ok
    <braunr> and usually, you want the page size to be the CPU page size
    <etenil> I see
    <braunr> current CPUs can support multiple page sizes, but they become
      quite hard to handle correctly
    <braunr> and bigger page sizes mean more fragmentation, so it only suits
      machines with large amounts of RAM, which isn't the case for us
    <etenil> ok
    <etenil> so I'll try the second approach then
    <braunr> that's what i'd recommend
    <etenil> ok
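
The "second approach" settled on above can be pictured roughly as follows. This is illustrative pseudocode, not the actual GNU Mach fault path: `CLUSTER_PAGES`, `lookup_resident_page` and the `vm_object` fields shown are made-up names, and all locking is ignored.

    #include <mach.h>
    #include <mach/vm_param.h>  /* PAGE_SIZE, trunc_page() */

    #define CLUSTER_PAGES 8  /* hypothetical fixed cluster size */

    /* Minimal stand-ins for kernel-internal state; the real GNU Mach
       structures differ.  */
    struct vm_object
    {
      memory_object_t pager;                  /* port to the pager */
      memory_object_control_t pager_control;  /* kernel's control port */
    };

    extern boolean_t lookup_resident_page (struct vm_object *obj,
                                           vm_offset_t offset);

    /* On a fault, ask the pager for a whole cluster of pages starting
       at the faulting address, so that a single IPC round trip
       resolves many neighbouring faults in advance.  */
    void
    handle_fault (struct vm_object *object, vm_offset_t fault_offset)
    {
      vm_offset_t start = trunc_page (fault_offset);
      vm_size_t length = CLUSTER_PAGES * PAGE_SIZE;

      /* Shrink the cluster so it does not re-request pages that are
         already resident.  */
      while (length > PAGE_SIZE
             && lookup_resident_page (object, start + length - PAGE_SIZE))
        length -= PAGE_SIZE;

      /* One request for the whole range instead of one per page; the
         pager interface already carries a length argument.  */
      memory_object_data_request (object->pager, object->pager_control,
                                  start, length, VM_PROT_READ);
    }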


# IRC, freenode, #hurd, 2011-02-16

    <antrik> etenil: OSF Mach does have clustered paging BTW; so that's one
      place to start looking...
    <antrik> (KAM ported the OSF code to gnumach IIRC)
    <antrik> there is also an existing patch for clustered paging in libpager,
      which needs some adaptation
    <antrik> the biggest part of the task is probably modifying the Hurd
      servers to use the new interface
    <antrik> but as I said, KAM's code should be available through google, and
      can serve as a starting point

<http://lists.gnu.org/archive/html/bug-hurd/2010-06/msg00023.html>


# IRC, freenode, #hurd, 2011-07-22

    <braunr> but concerning clustered pageins/outs, i'm not sure it's a mach
      interface limitation
    <braunr> the external memory pager interface does allow multiple pages to
      be transferred
    <braunr> isn't it an internal Mach VM problem ?
    <braunr> isn't it simply the page fault handler ?
    <antrik> braunr: are you sure? I was under the impression that changing the
      pager interface was among the requirements...
    <antrik> hm... I wonder whether for pageins, it could actually be handled
      in the pagers instead of Mach... though this wouldn't work for pageouts,
      so probably not very helpful
    <braunr> antrik: i'm almost sure
    <braunr> but i've been proven wrong many times, so ..
    <braunr> there are two main facts that lead me to think this
    <braunr> 1/
      http://www.gnu.org/software/hurd/gnumach-doc/Memory-Objects-and-Data.html#Memory-Objects-and-Data
      says lengths are provided and doesn't mention the limitation
    <braunr> 2/ when reading about UVM, one of the major improvements (between
      10 and 30% of overall performance depending on the benchmarks) was
      implementing the madvise semantics
    <braunr> and this didn't involve a new pager interface, but rather a new
      page fault handler
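
For reference, the madvise semantics braunr refers to are those of the BSD/POSIX `madvise` call. From userland the hint looks like this, for an existing mapping of `len` bytes at `addr`:

    #include <stddef.h>
    #include <sys/mman.h>

    /* Declare a sequential access pattern for a mapping, so that a
       fault handler honouring madvise may read ahead aggressively and
       drop pages that have already been passed over.  */
    static void
    advise_sequential (void *addr, size_t len)
    {
      (void) madvise (addr, len, MADV_SEQUENTIAL);
    }

UVM's improvement came from honouring such hints in a new page fault handler, not from changing the pager interface.
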
    <antrik> braunr: hm... the interface indeed looks like it can handle
      multiple pages in both directions... perhaps it was at the Hurd level
      where the pager interface needs to be modified, not the Mach one?...
    <braunr> antrik: would be nice wouldn't it ? :)
    <braunr> antrik: more probably the page fault handler
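
To make point 1/ concrete: nothing in the interface limits a transfer to a single page, so a pager could in principle answer a clustered request with one reply. A hedged sketch, assuming `buf` is a page-aligned buffer of `length` bytes (a multiple of the page size) and `control` comes from the kernel's request:

    #include <mach.h>

    /* Answer a multi-page memory_object_data_request with a single
       memory_object_data_supply: one IPC delivers the whole cluster.  */
    kern_return_t
    supply_cluster (memory_object_control_t control, vm_offset_t offset,
                    vm_offset_t buf, vm_size_t length)
    {
      return memory_object_data_supply (control, offset, buf, length,
                                        VM_PROT_NONE, /* precious */ FALSE,
                                        MACH_PORT_NULL);
    }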


# IRC, freenode, #hurd, 2011-09-28

    <slpz> antrik: I've just recovered part of my old multipage I/O work
    <slpz> antrik: I intend to clean it up and submit it after finishing the
      changes to the pageout system.
    <antrik> slpz: oh, great!
    <antrik> didn't know you worked on multipage I/O
    <antrik> slpz: BTW, have you checked whether any of the work done for GSoC
      last year is any good?...
    <antrik> (apart from missing copyright assignments, which would be a
      serious problem for the Hurd parts...)
    <slpz> antrik: It was seven years ago, but I did:
      http://www.mail-archive.com/bug-hurd@gnu.org/msg10285.html :-)
    <slpz> antrik: Honestly, I don't think the quality of that code is good
      enough to be considered... but I think it was my fault as his mentor for
      not correcting him soon enough...
    <antrik> slpz: I see
    <antrik> TBH, I feel guilty myself, for not asking about the situation
      immediately when he stopped attending meetings...
    <antrik> slpz: oh, you even already looked into vm_pageout_scan() back then
      :-)


# [[Read-Ahead]]