diff options
-rw-r--r-- | microkernel/mach/external_pager_mechanism.mdwn | 157 |
1 files changed, 82 insertions, 75 deletions
diff --git a/microkernel/mach/external_pager_mechanism.mdwn b/microkernel/mach/external_pager_mechanism.mdwn index 169745fb..67c10713 100644 --- a/microkernel/mach/external_pager_mechanism.mdwn +++ b/microkernel/mach/external_pager_mechanism.mdwn @@ -17,79 +17,79 @@ redistribute your contributions. Mach provides a so-called external pager [[mechanism]]. This mechanism serves to separate *managing memory* from *managing -content*. Mach does the former while user space tasks do the +content*. Mach does the former while [[user_space]] [[task]]s do the latter. # Introduction -In Mach, a task's [[Mach/AddressSpace]] consists of references -to [[Mach/MemoryObjects]]. A memory object is designated using +In Mach, a [[task]]'s [[address_space]] consists of references +to [[memory_object]]s. A memory object is [[designated|designation]] using a [[port]] (a port is just a [[capability]]) and -implemented by a normal process. +implemented by a normal [[process]]. To associate a memory object with a portion of a task's -address space, vm\_map is invoked a capability designating +address space, `vm_map` is invoked on a capability designating the task and passing a reference to the memory object and the offset at which to install it. (The first time a task maps an object, Mach sends an initialization message to the server including a control capability, which it uses to supply pages to the kernel.) This is essentially the same as mapping a file into an address space on Unix -using mmap. +using `mmap`. -When a task faults, Mach checks to see if there is a memory +When a task [[faults|page_fault]], Mach checks to see if there is a memory object associated with the fault address. If not, the task -is sent an exception, which is normally further propagated +is sent an [[exception]], which is normally further propagated as a segmentation fault. If there is an associated memory -object, Mach checks whether the corresponding page is in core. -If it is, it installs the page and resumes the task. Mach -then invokes the memory object with the memory\_object\_request +object, Mach checks whether the corresponding [[page]] is in core. +If it is, it installs the page and resumes the task. Mach +then invokes the memory object with the `memory_object_request` method and the page to read. The memory manager then fetches or creates the content as appropriate and supplies it to -Mach using the memory\_object\_supply method. +Mach using the `memory_object_supply` method. # Creating and Mapping a Memory Object The following illustrates the basic idea: -> ________ -> / \ -> | Mach | -> \________/ -> /| / |\ \ -> (C) vm_map / / m_o_ready (E)\ \ (D) memory_object_init -> / |/ (F) return \ \| -> ________ ________ -> / \ -----> / \ -> | Client | (A) open | Server | -> \________/ <----- \________/ -> (B) memory_object - -(A) The client sends an "open" rpc to the server. + ________ + / \ + | Mach | + \________/ + /| / |\ \ + (C) vm_map / / m_o_ready (E)\ \ (D) memory_object_init + / |/ (F) return \ \| + ________ ________ + / \ -----> / \ + | Client | (A) open | Server | + \________/ <----- \________/ + (B) memory_object + +(A) The client sends an `open` [[RPC]] to the server. (B) The server creates a memory object (i.e., a port receive right), adds it to the port set that it is listening on and returns a capability (a port send right) to the client. (C) The client attempts to map the object into its address space using -the vm\_map rpc. It passes a reference to the port that the server gave +the `vm_map` RPC. It passes a reference to the port that the server gave it to the vm server (typically Mach). (D) Since Mach has never seen the object before, it queues a -memory\_object\_init on the given port along with a send right (the +`memory_object_init` on the given port along with a send right (the memory control port) for the manager to use to send messages to the kernel and also as an authentication mechanism for future interactions: the port is supplied so that the manager will be able to -identify from which kernel a given memory\_object\_* IPC is from. +identify from which kernel a given `memory_object_*` IPC is from. (E) The server dequeues the message, initializes internal data structures to manage the mapping and then invokes the -memory\_object\_ready method on the control object. +`memory_object_ready` method on the control object. (F) The kernel sees that the manager is ready, sets up the appropriate -mappings in the client and then replies to the vm\_map rpc indicating +mappings in the client's address space and then replies to the `vm_map` RPC indicating success. There is nothing stopping others from playing "the kernel." This is @@ -102,37 +102,37 @@ mappings etc. # Resolving Page Faults -> (G) Client ________ -> resumed / \ -> | Mach | -> (A) Fault +----|------+ | \ (B) m_o_request (C) store_read -> ____|___ \_____|__/ |\ \| ________ _________ -> / +---\-------+ \ / \ / \ -> | Client | (F) | Server |<===>| storeio | -> \________/ m_o_supply \________/ \_________/ -> (E) return data | ^ -> | | (D) device_read -> v | -> ________ -> / Device \ -> | Driver | -> \________/ -> | ^ -> | | -> v -> ____________ -> / Hardware \ - -(A) The client does a memory access and faults. The kernel catches + (G) Client ________ + resumed / \ + | Mach | + (A) Fault +----|------+ | \ (B) m_o_request (C) store_read + ____|___ \_____|__/ |\ \| ________ _________ + / +---\-------+ \ / \ / \ + | Client | (F) | Server |<===>| storeio | + \________/ m_o_supply \________/ \_________/ + (E) return data | ^ + | | (D) device_read + v | + ________ + / Device \ + | Driver | + \________/ + | ^ + | | + v + ____________ + / Hardware \ + +(A) The client does a memory access and [[faults|page_fault]]. The kernel catches the fault and maps the address to the appropriate memory object. It -then invokes the memory\_object\_request method on the associated +then invokes the `memory_object_request` method on the associated capability. (In addition to the page to supply, it also supplies the control port so that the server can determine which kernel sent the message.) -(B) The manager dequeues the message. On the Hurd, this is translated -into a store\_read: a function in the libstore library which is used to -transparently manage block devices. The storeio server starts off as +(B) The manager dequeues the message. On the [[Hurd]], this is translated +into a `store_read`: a function in the [[hurd/libstore]] library which is used to +transparently manage block devices. The [[hurd/storeio]] server starts off as a separate process, however, if the server has the appropriate permission, the backing object can be contacted directly by the server. This layer of indirection is desirable when, for instance, a @@ -140,37 +140,37 @@ storeio running as root may want to only permit read only access to a resource, yet it cannot safely transfer its handle to the client. In this case, it would proxy the requests. -(C) The storeio server contacts, for instance, a device driver to do +(C) The storeio server contacts, for instance, a [[device_driver]] to do the read. This could also be a network block device (the NBD server in GNU/Linux), a file, a memory object, etc. -(D) The device driver allocates an anonymous page from the default -pager and reads the data into it. Once all of the operations are +(D) The device driver allocates an [[anonymous_page]] from the +[[default_pager]] and reads the data into it. Once all of the operations are complete, the device returns the data to the client unmapping it from its own address space at the same time. -(E) The storeio transfers the page to the server. The page is still +(E) The storeio server transfers the page to the server. The page is still anonymous. -(F) The manager does a memory\_object\_supply transferring the page to +(F) The manager does a `memory_object_supply` transferring the page to the kernel. Only now is the page not considered to be anonymous but managed. (G) The kernel caches the page, installs it in the client's virtual -address space and finally, resumes the client. +[[address_space]] and finally, resumes the client. # Paging Data Out -> Change manager Pager m_o_return store_write -> \ _________ (B) __(A)__ (C) ________ (D) _______ -> S | / Default \ / \ / \ / \ -> W |<=>| Pager |<=>| Mach |==>| server |<=>| storeio |<=> -> A | \_________/ \________/ \________/ \_______/ -> P | -> / + Change manager Pager m_o_return store_write + \ _________ (B) __(A)__ (C) ________ (D) _______ + S | / Default \ / \ / \ / \ + W |<=>| Pager |<=>| Mach |==>| server |<=>| storeio |<=> + A | \_________/ \________/ \________/ \_______/ + P | + / -(A) The paging [[policy]] is implemented by Mach: servers just implement +(A) The [[paging]] [[policy]] is implemented by Mach: servers just implement the [[mechanism]]. (B) Once the kernel has selected a page that it would like to evict, it @@ -179,10 +179,17 @@ if the server does not deallocate the page quickly enough, it cannot cause a denial of service: the kernel will just later double page it to swap (the default pager is part of the [[tcb]]). -(C) Mach then invokes memory\_object\_return method on the control -object. The server is expected to save the page free it in a timely +(C) Mach then invokes `memory_object_return` <!-- doesn't exist --> method on the control +object. The server is expected to save the page free <!-- ? --> it in a timely fashion. The server is not required to send a response to the kernel. -(D) The manager then transfers the data to the storeio which +(D) The manager then transfers the data to the storeio server which eventually sends it to disk. The device driver consumes the memory -doing the equivalent of a vm\_deallocate. +doing the equivalent of a `vm_deallocate`. + + +# Sources + +This text is based on a [June 2002 +email](http://lists.gnu.org/archive/html/l4-hurd/2002-06/msg00001.html) by +[[NealWalfield]]. |