[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
[[!toc]]
## Current task
Write a clusterized pagein (prefetching) mechanism in Mach.
- - -
## General information on system architecture
In order to implement the pagein properly, I first needed a general idea of the I/O path that data follows in the Hurd/Mach. To get one, I investigated top-down from the [[hurd/translator/ext2fs]] translator to Mach. This section describes the main components that data passes through.
This section is probably not strictly necessary for implementing the prefetcher in Mach; however, it is always useful to understand how things work, so that we notice when they break.
This is based on my understanding of the system and is probably imprecise. Refer to the manuals of both Hurd and Mach for more detailed information.
### Pagers
Pagers are implemented in libpager and provide abstracted access to Mach's [[microkernel/mach/virtual address space]]. A pager is backed by a set of callback functions supplied by the translator; these are used to actually access the storage. In the case of filesystem translators like ext2fs, the pager uses libstore to access the underlying hardware.
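As a concrete illustration, here is a rough sketch of a read callback in the style of libpager; the exact names and signatures should be checked against libpager/pager.h, and *our_store* is a hypothetical global I made up for the example.

    /* Rough sketch of a libpager read callback, modeled on what
       filesystem translators do; see libpager/pager.h for the real
       declarations.  */

    #include <errno.h>
    #include <mach.h>
    #include <hurd/pager.h>
    #include <hurd/store.h>

    extern struct store *our_store;   /* Hypothetical backing store.  */

    error_t
    pager_read_page (struct user_pager_info *pager, vm_offset_t page,
                     vm_address_t *buf, int *write_lock)
    {
      size_t read = 0;
      /* store_read takes the address in store blocks and the amount in
         bytes; it may return the data in a freshly allocated buffer.  */
      error_t err = store_read (our_store,
                                page >> our_store->log2_block_size,
                                vm_page_size, (void **) buf, &read);
      if (!err && read != vm_page_size)
        err = EIO;
      *write_lock = 0;   /* Don't write-protect the page.  */
      return err;
    }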
### Libstore
Libstore provides abstracted access to Mach's storage devices (disks, partitions, files, and so on).

I am currently looking at the way stores call into Mach, especially for memory allocation. My intuition is that memory is allocated in Mach when *store_create()* is called. I am investigating this to see how the allocation happens in practice.
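To see what this looks like from the client side, here is a minimal sketch that wraps a file port in a store and reads one page; */dev/hd0* is just an arbitrary example path.

    /* Minimal sketch: create a store from a file port and read the
       first page.  /dev/hd0 is just an example.  */

    #include <errno.h>
    #include <error.h>
    #include <fcntl.h>
    #include <hurd.h>
    #include <hurd/store.h>
    #include <stdio.h>

    int
    main (void)
    {
      struct store *store;
      void *buf = 0;
      size_t len = 0;

      file_t node = file_name_lookup ("/dev/hd0", O_READ, 0);
      if (node == MACH_PORT_NULL)
        error (1, errno, "file_name_lookup");

      error_t err = store_create (node, STORE_READONLY, 0, &store);
      if (err)
        error (1, err, "store_create");

      /* The address is given in store blocks, the amount in bytes.  */
      err = store_read (store, 0, vm_page_size, &buf, &len);
      if (err)
        error (1, err, "store_read");

      printf ("read %zu bytes (block size %zu)\n", len, store->block_size);
      return 0;
    }

Where the buffer returned by *store_read()* actually gets allocated (libstore, the device, or Mach's VM) is exactly what I am trying to pin down.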
### Mach
VM allocation happens with a call to:
    kern_return_t vm_allocate (vm_task_t target_task, vm_address_t *address,
                               vm_size_t size, boolean_t anywhere)
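For example, a task can allocate fresh zero-filled pages in its own address space like this:

    #include <mach.h>
    #include <stdio.h>

    int
    main (void)
    {
      vm_address_t addr = 0;
      /* anywhere == TRUE lets the kernel pick the address; the region
         comes back zero-filled and page-aligned.  */
      kern_return_t kr = vm_allocate (mach_task_self (), &addr,
                                      2 * vm_page_size, TRUE);
      if (kr != KERN_SUCCESS)
        {
          printf ("vm_allocate failed: %d\n", kr);
          return 1;
        }
      printf ("got 2 pages at 0x%lx\n", (unsigned long) addr);
      vm_deallocate (mach_task_self (), addr, 2 * vm_page_size);
      return 0;
    }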
*vm_allocate()* looks more and more like a red herring, though: what I'm trying to prefetch is data on hard drives. I will look at the device interface in Mach instead.
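For reading raw device data, GNU Mach provides *device_open()* and *device_read()*. Below is a privileged sketch; the device name "hd0" is just an example, and the exact user-side prototypes should be verified against GNU Mach's device headers.

    /* Sketch: read two pages straight from a Mach device.  Needs the
       privileged device master port, so it must run as root.  */

    #include <device/device.h>
    #include <error.h>
    #include <hurd.h>
    #include <mach.h>
    #include <stdio.h>

    int
    main (void)
    {
      mach_port_t host_priv, dev_master, dev;
      io_buf_ptr_t data;
      mach_msg_type_number_t count;

      error_t err = get_privileged_ports (&host_priv, &dev_master);
      if (err)
        error (1, err, "get_privileged_ports");

      err = device_open (dev_master, D_READ, "hd0", &dev);
      if (err)
        error (1, err, "device_open");

      /* recnum is the record (block) number; the amount is in bytes.
         The data comes back as out-of-line VM, which is exactly the
         allocation path I want to trace.  */
      err = device_read (dev, 0, 0, 2 * vm_page_size, &data, &count);
      if (err)
        error (1, err, "device_read");

      printf ("read %u bytes\n", (unsigned) count);
      vm_deallocate (mach_task_self (), (vm_address_t) data, count);
      return 0;
    }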
- - -
## Implementation plan
### Ideas
To start off with, I will toy with the VM (even if it breaks stuff). My initial intent is to systematically allocate more memory than requested, in the hope that the excess will be used by the task in the near future, thus saving on future I/O requests.
I'd also need to keep track of the pre-allocated memory, so that I can pass it on to the task on demand and prefetch even more. I could also put a timer on prefetched data and deallocate it if it isn't requested after a while, but that's just an idea.
The tricky part is to understand how memory allocation works in Mach and to create an additional struct for the prefetched data.
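To make the tracking idea concrete, here is a purely hypothetical sketch; none of these names exist in Mach, and the real version would have to live alongside the kernel's VM object structures.

    /* Hypothetical bookkeeping for prefetched runs of pages.  All
       names are invented; this only illustrates the idea.  */

    #include <mach.h>

    struct prefetched_run
    {
      vm_offset_t offset;           /* Start of the run in the object.  */
      vm_size_t length;             /* Page-aligned length in bytes.  */
      unsigned long last_touch;     /* Tick of last access, for eviction.  */
      struct prefetched_run *next;  /* Per-object singly linked list.  */
    };

    /* Return the run covering OFFSET, or 0 if it was never prefetched.
       A later fault that hits a run can be satisfied without I/O; runs
       whose last_touch is too old get deallocated.  */
    static struct prefetched_run *
    find_run (struct prefetched_run *runs, vm_offset_t offset)
    {
      for (; runs; runs = runs->next)
        if (offset >= runs->offset && offset - runs->offset < runs->length)
          return runs;
      return 0;
    }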
### Foreseeable difficulties
* Tracking the prefetched memory
* Deallocating prefetched memory along with the requested memory
* Shared prefetched memory (e.g. one task requested memory, more was prefetched, and a second task used the prefetched memory)
* Page faults