[[!meta copyright="Copyright © 2020 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]] [[!meta title="System bootstrap"]] [[!tag open_issue_documentation]] [[!toc]] # State at the beginning of the bootstrap After initializing itself, GNU Mach sets up tasks for the various bootstrap translators (which were loader by the GRUB bootloader). It notably makes variables replacement on their command lines and boot script function calls (see the details in `gnumach/kern/boot_script.c`). For instance, if the GRUB bootloader has the following configuration: multiboot /boot/gnumach-1.8-486-dbg.gz root=device:hd1 console=com0 module /hurd/ext2fs.static ext2fs --readonly \ --multiboot-command-line='${kernel-command-line}' \ --host-priv-port='${host-port}' \ --device-master-port='${device-port}' \ --exec-server-task='${exec-task}' -T typed '${root}' \ '$(task-create)' '$(task-resume)' module /lib/ld.so.1 exec /hurd/exec '$(exec-task=task-create)' GNU Mach will first make the `$(task-create)` function calls, and thus create a task for the ext2fs module and a task for the exec module (and store a port on that task in the `exec-task` variable). It will then replace the variables (`${foo}`), i.e. * `${kernel-command-line}` with its own command line (`root=device:hd1 console=com0`), * `${host-port}` with a reference to the GNU Mach host port, * `${device-port}` with a reference to the GNU Mach device port, * `${exec-task}` with a reference to the exec task port. * `${root}` with `device:hd1` This typically results in: task loaded: ext2fs --readonly --multiboot-command-line=root="device:hd1 console=com0" --host-priv-port=1 --device-master-port=2 --exec-server-task=3 -T typed device:hd1 task loaded: exec /hurd/exec (You will have noticed that `/hurd/exec` is not run directly, but through `ld.so.1`: Mach only knows to run statically-linked ELF binaries, so we could either load `/hurd/exec.static`, or load the dynamic loader `ld.so.1` and tell it to load `/hurd/exec`) GNU Mach will eventually make the `$(task-resume)` function calls, and thus resume the ext2fs task only. This starts a dance between the bootstrap processes: `ext2fs`, `exec`, `startup`, `proc`, and `auth`. Indeed, there are a few dependencies between them: `exec` needs `ext2fs` working to be able to start `startup`, `proc` and `auth`, and `ext2fs` needs to register itself to `startup`, `proc` and `auth` so as to appear as a normal process, running under uid 0. The base principle is that `ext2fs` has a nul bootstrap port set to while other translators will have a non-nul bootstrap port, with which they will discuss. We thus have a hierarchy between the bootstrap processes. `ext2fs` is at the root, `exec` and `startup` are its direct children, and `auth` and `port` are direct children of `startup`. Usually the bootstrap port is used when starting a translator, see `fshelp_start_translator_long`: the parent translator starts the child and sets its bootstrap port. The parent then waits for the child to call `fsys_startup` on the bootstrap port, for the child to provide its control port, and for the parent to provide the FS node that the child is translator for. For the bootstrap translators, the story is extended: * Between `ext2fs` and `startup`: `startup` additionally calls `fsys_init`, to provide `ext2fs` with `proc` and `auth` ports. * Between `startup` and `proc`: `proc` just calls `startup_procinit` to hand over a `proc` port and get `auth` and `priv` ports. * Between `startup` and `auth`: `auth` calls `startup_authinit` to hand over an `auth` port and get a `proc` port, then calls `startup_essential_task` to notify `startup` that the boot can proceed. * For translators before `ext2fs`, the child calls `fsys_startup` to pass over the control port of `ext2fs` (instead of its own control port, which is useless for its parent). This is typically done in the `S_fsys_startup` stub, simply forwarding it. The child also calls `fsys_init` to pass over the `proc` and `auth` ports. Again, this is typically done in the `S_fsys_init` stub, simply forwarding them. With that in mind, the dance between the bootstrap translators is happening as described in the next sections. # Initialization of translators before ext2fs We plan to start pci-arbiter and rumpdisk translators before ext2fs. pci-arbiter's `main` function parses the arguments, and since it is given a disk task port, it knows it is a bootstrap translator and thus initializes the machdev library. `machdev_trivfs_init` resumes the disk task. rumpdisk's `main` function parses the arguments, and since it is given a FS task port, it knows it is a bootstrap translator, and thus `machdev_trivfs_init` resumes the FS task. # ext2fs initialization ext2fs's `main` function starts by calling `diskfs_init_main`. `diskfs_init_main` parses the ext2fs command line with `argp_parse`, to record the parameters set up by the kernel. It makes sure to have a working stdout by opening the Mach console. Since the multiboot command line is available, `diskfs_init_main` sets the ext2fs bootstrap port to `MACH_PORT_NULL`: it is the bootstrap filesystem which will be in charge of dancing with the exec and startup translator. `diskfs_init_main` then initializes the libdiskfs library and spawns a thread to manage libdiskfs RPCs. ext2fs continues its initialization: creating a pager, opening the hypermetadata, opening the root inode to be set as root by libdiskfs. ext2fs then calls `diskfs_startup_diskfs` to really run the startup, implemented by the libdiskfs library. # libdiskfs bootstrap Since the bootstrap port is `MACH_PORT_NULL`, `diskfs_startup_diskfs` calls `diskfs_start_bootstrap`. `diskfs_start_bootstrap` starts by creating a open port on itself for the current and root directory, all other processes will inherit it. `diskfs_start_bootstrap` does have a port on the exec task, so it can dance with it. It calls `start_execserver` that sets the bootstrap port of the exec task to a port of the `diskfs_execboot_class`, and resumes the exec task. `diskfs_start_bootstrap` then waits for `execstarted`. # exec bootstrap exec's `main` function starts and calls `task_get_bootstrap_port` to get its bootstrap port and `getproc` to get a port on the proc translator (thus `MACH_PORT_NULL` at this point since the proc translator is not started yet). exec initializes the trivfs library, and eventually calls `trivfs_startup` on its bootstrap port. `trivfs_startup` creates a control port for the exec translator, and calls `fsys_startup` on the bootstrap port to notify ext2fs that it is ready, give it its exec control port, and get back a port on the underlying node for the exec translator (we want to make it show up on `/servers/exec`). # libdiskfs taking back control `diskfs_execboot_fsys_startup` is thus called. It calls `dir_lookup` on `/servers/exec` to return the underlying node for the exec translator, and stores the `exec` control port in `diskfs_exec_ctl`. It can then signal `execstarted`. `diskfs_start_bootstrap` thus takes back control, It calls `fsys_getroot` on the control port of exec, and uses `dir_lookup` and `file_set_translator` to attach it to `/servers/exec`. `diskfs_start_bootstrap` then looks for which startup process to run. It may be specified on the multiboot command line, but otherwise it will default to `/hurd/startup`. Now that exec is up and running, the `startup` process can be created with `exec_exec`. `diskfs_start_bootstrap` takes a lot of care in this: this is the first unix-looking process, it notably inherits the root directory and current working directory initialized above, it gets stdin/out/err on the mach console. It is passed as bootstrap port a port from the `diskfs_control_class`. `diskfs_start_bootstrap` is complete, we are back to `diskfs_startup_diskfs`, which checks whether ext2fs was given a bootstrap port, i.e. whether the rumpdisk translator was started before ext2fs. If so, it calls `diskfs_call_fsys_startup` which creates a new control port and passes it do a call to `fsys_startup` on the bootstrap port, so rumpdisk gets access to the ext2fs filesystem. Rumpdisk however does not return any `realnode` port, since we are not exposing the ext2fs filesystem in rumpdisk, but rather the converse. TODO: Rumpdisk forwards this `fsys_startup` call to pci-arbiter, so the latter also gets access to the ext2fs filesystem. # startup startup's `main` function starts and calls `task_get_bootstrap_port` to get its bootstrap port, i.e. the control port of ext2fs, and `fsys_getpriv` on it to get the host privileged port and device master port. It clears the bootstrap port so children do not inherit it. It sets itself up with output on the Mach console, and wires itself against swapping. It requests notification for ext2fs translator dying to detect it and print a warning in case that happens during boot. It creates a `startup` port which it will get RPCs on. startup can then complete the unixish initialization, and run `/hurd/proc` and `/hurd/auth`, giving them as bootstrap port the `startup` port. # proc proc's `main` function starts. It initializes itself, and calls `task_get_bootstrap_port` to get a port on startup. It can then call `startup_procinit` on it to pass it the proc port that will represent the startup task, and get ports on the auth server, the host privileged port, and device master port. Eventually, proc calls `startup_essential_task` to tell startup that it is ready. # auth auth's `main` function starts. It creates the initial root auth handle (all permissions allowed). It calls `task_get_bootstrap_port` to get a port on startup. It can then call `startup_authinit` on it to pass the initial root auth handle, and get a port on the proc server. It can then register itself to proc. Eventually, auth calls `startup_essential_task` to tell startup that it is ready. # startup getting back control startup notices initialization of auth and proc from `S_startup_procinit` and `S_startup_authinit`. Once both have been called, it calls `launch_core_servers`. `launch_core_servers` starts by calling `startup_procinit_reply` to actually reply to the `startup_procinit` RPC with a port on auth. `launch_core_servers` then tells proc that the bootstrap processes are important, and how they relate to each other. `launch_core_servers` then calls `install_as_translator` to show up in the filesystem on `/servers/startup`. `launch_core_servers` calls `startup_authinit_reply` to actually reply to the `startup_authinit` RPC with a port on proc. `launch_core_servers` eventually calls `fsys_init` on its bootstrap port, to give ext2fs the proc and auth ports. diskfs' `diskfs_S_fsys_init` thus gets called. It first replies to startup, so startup is not stuck in its `fsys_init` call and not able to reply to RPCs. From then on, startup will be watching for `startup_essential_task` calls from the various bootstrap processes. # libdiskfs taking back control In diskfs' `diskfs_S_fsys_init`, diskfs now knows that proc and auth are ready, and can call `exec_init` on the exec port. # exec getting initialized exec's `S_exec_init` gets called from the `exec_init` call from ext2fs. Exec can register itself with proc, and eventually call `startup_essential_task` to tell startup that it is ready. # back to libdiskfs initialization It also calls `fsys_init` on its bootstrap port, i.e. rumpdisk. # rumpdisk getting initialized rumpdisk's `trivfs_S_fsys_init` gets called from the `fsys_init` call from ext2fs. It calls `fsys_init` on its bootstrap port. # pci-arbiter getting initialized pci-arbiter's `trivfs_S_fsys_init` gets called from the `fsys_init` call from rumpdisk. It gets the root node of ext2fs, sets all common ports, and install pci-arbiter in the ext2fs FS as translator for `/servers/bus/pci`. It eventually calls `startup_essential_task` to tell startup that it is ready, and requests shutdown notifications. # back to rumpdisk initialization It gets the root node of ext2fs, sets all common ports, and install rumpdisk in the ext2fs FS as translator for `/dev/disk`. It eventually calls `startup_essential_task` to tell startup that it is ready, and requests shutdown notifications. # back to libdiskfs initialization It initializes the default proc and auth ports to be given to processes. It calls `startup_essential_task` on the startup port to tell startup that it is ready. Eventually, it calls `_diskfs_init_completed` to finish its initialization, and notably call `startup_request_notification` to get notified by startup when the system is shutting down. # startup monitoring bootstrap progress As mentioned above, the different essential tasks (pci-arbiter, rumpdisk, ext2fs, proc, auth, exec) call `startup_essential_task` when they are ready. startup's `S_startup_essential_task` function thus gets called each time, and startup records each of them as essential, monitoring their death to crash the whole system. Once all of proc, auth, exec have called `startup_essential_task`, startup replies to their respective RPCs, so they actually start working altogether. It also calls `init_stdarrays` which sets the initial values of the standard exec data, and `frob_kernel_process` to plug the kernel task into the picture. It eventually calls `launch_something`, which "launches something", which by default is `/libexec/runsystem`, but if that can not be found, launches a shell instead, so the user can fix it.