[[!meta copyright="Copyright © 2012, 2013 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]]

[[!meta title="SMP"]]

[[!tag open_issue_glibc open_issue_gnumach open_issue_hurd]]

SMP stands for symmetric multiprocessing: a computer architecture in which several identical processors are connected to a single shared main memory. All processors are controlled by one single operating system, and each processor can access all devices.

An operating system with SMP support can deliver more performance, but achieving that is not trivial. It is a little like having a packed board meeting. More people in the room potentially means more can get done, but only one person can speak at a time. Scheduling everyone to speak can be quite an involved task.

NOTE: Many of the veteran Hurd developers consider this task too large for a Google Summer of Code project.

[[!img images/smp.svg size="700x"]]

# Current Status

As of September 2024, SMP support is implemented for i386 with working network access. This works because the kernel boots with only one CPU in the default processor set; the slave processors are temporarily disabled until they can be safely used per task. We are unable to turn on the full set at boot time due to race bugs. Debian Hurd provides SMP-enabled GNU Mach kernels.

# How to test the current SMP support

The easiest way to test the SMP support is via the [[qemu guide|https://www.gnu.org/software/hurd/hurd/running/qemu.html]]. Once you have the Hurd running, you can build an SMP-enabled GNU Mach.


    $ git clone git://git.savannah.gnu.org/hurd/gnumach.git
    $ cd gnumach
    $ autoreconf -i
    $ mkdir build
    $ cd build
    $ ../configure --enable-ncpus=4 --enable-apic --enable-kdb --disable-linux-groups
    $ make gnumach.gz
    $ su
    # mv /boot/gnumach-1.8-486.gz /boot/gnumach-1.8-486.gz.bak
    # cp gnumach.gz /boot/gnumach-1.8-486.gz

You may optionally update `/boot/grub/grub.cfg`: change `hd0` to `wd0` and add `console=com0`.

    /boot/gnumach-1.8-486.gz root=part:2:device:wd0 console=com0

Then update `/etc/fstab` to use `wd0` instead of `hd0`.

    /dev/wd0s2      /               ext2    defaults 0 1
    /dev/wd0s1      none            swap    sw 0 0

You can shut down via `/sbin/poweroff`.

Start qemu with `-smp 4`, and add `-nographic` if you want to use `com0`.

    $ qemu-system-i386 -M q35,accel=kvm -smp 4 -m 2G -net \
        user,hostfwd=tcp::2223-:22 -net nic -hda debian-hurd-VERSION.img \
        -nographic

To test SMP:

    $ sudo /path/to/smp /bin/bash
    $ stress -c 7

The source of `smp.c` can be found [[here|https://lists.gnu.org/archive/html/bug-hurd/2024-02/msg00088.html]].

# What was done to get the 32-bit SMP support

The GNU Mach source code includes many special cases for multiprocessor systems, guarded by `#if NCPUS > 1` preprocessor conditionals.

But this support is very limited:

- GNU Mach doesn't detect CPUs at runtime: the number of CPUs must be hardcoded at compilation time.
  The number of CPUs is set in the `mach_ncpus` configuration variable, which defaults to 1, in the `configfrag.ac` file.
  This variable generates the `NCPUS` macro, which gnumach uses to control the special cases for multiprocessor.
  If `NCPUS` > 1, gnumach enables multiprocessor support with the number of CPUs set by the user in the `mach_ncpus` variable.
  Otherwise, SMP is disabled.

- The multicore special cases in the gnumach source code have never been tested, so they can contain many errors.
  Furthermore, these special cases are incomplete: many functions, such as `cpu_number()` or `intel_startCPU()`, aren't written.

- GNU Mach doesn't initialize the processor with the proper options for multiprocessing.
For this reason, the current support is only multithreaded and not real multiprocessor support.

- Many drivers included in the Hurd aren't thread-safe, and these could crash in an SMP environment.
  So it's necessary to isolate these drivers to avoid concurrency problems.

### Solution

To solve this, we need to implement some routines to detect the number of processors, assign an identifier to each processor, and configure the lapic and IPI support. These routines must be executed during Mach boot.

> "Really, all the support you want to get from the hardware is just getting the
> number of processors, initializing them, and support for interprocessor
> interrupts (IPI) for signaling." - Samuel Thibault
> [link](https://lists.gnu.org/archive/html/bug-hurd/2018-08/msg00071.html)

> "The process scheduler probably already has the support. What is missing is
> the hardware driver for SMP: enumeration and initialization." - Samuel Thibault
> [link](https://lists.gnu.org/archive/html/bug-hurd/2018-08/msg00083.html)

The functions currently needed are `cpu_number()` (in kern/cpu_number.h) and `intel_startCPU()`. Another, non-critical function is `cpu_control()` [*Reference*](https://www.gnu.org/software/hurd/gnumach-doc/Processor-Control.html#Processor-Control)

Other interesting files are `pmap.c` and `sched_prim.c`.

We also have to build an isolated environment to execute the non-thread-safe drivers.

> "Yes, this is a real concern. For the Linux drivers, the long-term goal is to
> move them to userland anyway. For Mach drivers, quite often they are not
> performance-sensitive, so big locks would be enough." - Samuel Thibault
> [link](https://lists.gnu.org/archive/html/bug-hurd/2018-08/msg00073.html)

### Task list

1. DONE Implement a routine to detect and identify the processors

This routine must check the number of processors, initialize the lapic of the BSP (the master processor), and assign a kernel ID to each processor.

This kernel ID does not have to be equal to the APIC ID.
The relation between kernel ID and APIC ID can be stored in an array, where the kernel ID is the index and the APIC ID is the stored value.

GNU Mach can obtain the list of processors from memory, reading either the ACPI tables or the MP table. However, the MP table is deprecated on most modern systems, so it is preferable to use the ACPI tables for this.

The tasks to do for this are:

- Detect the number of processors
- Create an array indexed by kernel ID, which maps each kernel ID to an APIC ID
- Initialize the lapic of the BSP
- Initialize the IOAPIC

This routine could be called from `i386at_init()` (i386/i386at/model_dep.c). This function will call the functions which initialize the lapic and the ioapic.

**NOTE**: This routine must be executed before `intel_startCPU()` or other routines.

- **How to find the APIC table**

  To find the APIC table (the ACPI MADT, whose signature is `APIC`), we can read the RSDT table [RSDT reference](http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf#G10.1358180).
  To get the address of the RSDT, we need to read the RSDP table. We can locate the RSDP table as described in this [RSDP reference](http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf#G10.1357698)

  Once we have the RSDT table, we need to read its *Entry* field and search the array referenced by this field for the pointer to the APIC table.

  We can find an example of reading the ACPI tables in X15 OS: [Reference](https://github.com/richardbraun/x15/blob/0c0e2a02a42a8161e1b8dc1e1943fe5057ecb3a3/arch/x86/machine/acpi.c#L576)

- We need to initialize the `machine_slot` of each processor (currently only cpu0 is initialized).
  The `machine_slot` has this structure [Reference](https://github.com/AlmuHS/GNUMach_SMP/blob/0d490ef21c156907f3f26a6cdc00842f462a877a/include/mach/machine.h#L68):

> `struct machine_slot {`
> `/*boolean_t*/ integer_t is_cpu; /* is there a cpu in this slot? */`
> `cpu_type_t cpu_type; /* type of cpu */`
> `cpu_subtype_t cpu_subtype; /* subtype of cpu */`
> `/*boolean_t*/ integer_t running; /* is cpu running */`
> `integer_t cpu_ticks[CPU_STATE_MAX];`
> `integer_t clock_freq; /* clock interrupt frequency */`
> `};`

We can find an example of initialization in this [link](https://github.com/AlmuHS/GNUMach_SMP/blob/0d490ef21c156907f3f26a6cdc00842f462a877a/i386/i386at/model_dep.c#L612).

This modification also involves the redefinition of `NCPUS`, which must be set to the maximum **possible** number of processors. We can do this by modifying `configfrag.ac` like this:

> `# Multiprocessor support is still broken.`
> `AH_TEMPLATE([MULTIPROCESSOR], [set things up for a uniprocessor])`
> `mach_ncpus=2`
> `AC_DEFINE_UNQUOTED([NCPUS], [$mach_ncpus], [number of CPUs])`
> `[if [ $mach_ncpus -gt 1 ]; then]`
> `AC_DEFINE([MULTIPROCESSOR], [1], [set things up for a multiprocessor])`
> `AC_DEFINE_UNQUOTED([NCPUS], [256], [number of CPUs])`
> `[fi]`

- Interesting files and functions
    - `machine.c` [Link](https://github.com/AlmuHS/GNUMach_SMP/blob/master/kern/machine.c)
    - `c_boot_entry()` [Link](https://github.com/AlmuHS/GNUMach_SMP/blob/0d490ef21c156907f3f26a6cdc00842f462a877a/i386/i386at/model_dep.c#L529)
    - Example, in X15 OS: [Link](https://github.com/richardbraun/x15/blob/d6d90a3276a09da65690b019e985392bf77b53b0/arch/x86/machine/cpu.c#L1114)

1.1. Implement a `cpu_number()` function.

This function must return the kernel ID of the processor which is executing it.

To get this, we have to read the local APIC memory space, which exposes the lapic of the current CPU. Reading the lapic, we can get its APIC ID.
Once we have the APIC ID of the current CPU, the function searches the kernel/APIC array until it finds the same APIC ID, then returns the index (kernel ID) of that position.

2. DONE Implement a routine to initialize the processors

This routine will initialize the lapic of each processor and other structures needed to run the kernel.

We can find an example of lapic initialization here: [reference](https://github.com/mit-pdos/xv6-public/blob/b818915f793cd20c5d1e24f668534a9d690f3cc8/lapic.c#L55)

Also, we can get more information in Chapters 8.4 and 8.11 of the Intel Developer Guide, Volume 3. [link](https://software.intel.com/sites/default/files/managed/a4/60/325384-sdm-vol-3abcd.pdf)

3. DONE Implement `intel_startCPU()`

This function will initialize the descriptor tables of the processor specified by the parameter, and send the startup IPI to this processor. This function will be executed during the boot of the kernel (process 0).

The tasks to do in this function are:

- Initialize the processor descriptor tables
- Send the startup IPI to this processor

We have a current implementation of `intel_startCPU()` in this [link](https://github.com/AlmuHS/GNUMach_SMP/blob/smp/i386/i386/mp_desc.c).
This implementation is based on XNU's `intel_startCPU()` [function](https://github.com/nneonneo/osx-10.9-opensource/blob/f5a0b24e4d98574462c8f5e5dfcf0d37ef7e0764/xnu-2422.1.72/osfmk/i386/mp.c#L423)

We can find explanations of how to raise an IPI on these pages: [*Reference 1*](https://www.cs.usfca.edu/~cruse/cs630f08/lesson22.ppt), [*Reference 2*](https://www.cheesecake.org/sac/smp.html), [*Reference 3*](http://www.dis.uniroma1.it/pub/quaglia/AOSV-traps-interrupts.pdf)

We can also find information about how to raise an IPI in the Intel Developer Guide, Volume 3, Chapter 10.6.

4. Implement another routine to start the processors

This routine calls `processor_start()` for each processor, which will start the processor using this sequence of calls:

[`processor_start(processor_t processor)`](https://github.com/AlmuHS/GNUMach_SMP/blob/5d527f532dfba9f2da54555d5fbe585dd458579b/kern/processor.c#L447) -> [`cpu_start(processor->slot_num)`](https://github.com/AlmuHS/GNUMach_SMP/blob/5d527f532dfba9f2da54555d5fbe585dd458579b/i386/i386/mp_desc.c#L335) -> [`intel_startCPU(cpu)`](https://github.com/AlmuHS/GNUMach_SMP/blob/5d527f532dfba9f2da54555d5fbe585dd458579b/i386/i386/mp_desc.c#L180)

These articles contain some notes about how to do the AP startup:

- [Reference1](https://wiki.osdev.org/Symmetric_Multiprocessing#AP_startup)
- [Reference2](https://stackoverflow.com/a/16368043/7077301)

(...)

After implementing IPI support, it's recommended to reimplement `machine_idle()`, `machine_relax()`, `halt_cpu()` and `halt_all_cpus()` using IPI.

- [reference](https://github.com/AlmuHS/GNUMach_SMP/blob/0d490ef21c156907f3f26a6cdc00842f462a877a/i386/i386at/model_dep.c#L201)

- Also, in `ast_check.c`, we have to implement both functions using IPI [Reference](https://github.com/AlmuHS/GNUMach_SMP/blob/master/i386/i386/ast_check.c)

  These functions must force the processors to check whether there is any pending AST signal. We ought to keep in mind the following IRC chat:

> what is the use of AST in gnumach?
> this file what do?
> https://github.com/AlmuHS/GNUMach_SMP/blob/master/i386/i386/ast_check.c
> I don't know
> but look at what calls that
> see e.g. the call in thread.c
> This
> function is called during the sequence of cpu_up(), in machine.c
> but only if NCPUS > 1
> it seems like it's to trigger an AST check on another
> processor
> i.e. a processor tells another to run ast_check
> (see the comment in thread.c)
>
> https://github.com/AlmuHS/GNUMach_SMP/blob/master/kern/machine.c
> well, the initialization part is not necessarily what's important to
> think about at first
> i.e. until you know what you'll have
> to do during execution, you don't know what you'll need to intialize at
> initialization
> you might even not need to initialize anything
> then, this is the reason because all functions
> in ast_check.c are empty
> cause_ast_check being empty is really probably a TODO
> but I'm not clear what I need to write in this functions
> what the comment said: make another processor run ast_check()
> which probably means raising an inter-processor interrupt
> (aka IPI)
> to get ast_check() called by the other processor
> then, this funcions must raise an IPI in the processor?
> that's the idea
> the IPI probably needs some setup

We can use the [XV6 source code](https://pdos.csail.mit.edu/6.828/2018/xv6.html) as a model to implement the functions and routines. Some interesting files are [`lapic.c`](https://github.com/mit-pdos/xv6-public/blob/master/lapic.c), [`proc.c`](https://github.com/mit-pdos/xv6-public/blob/master/proc.c) and [`main.c`](https://github.com/mit-pdos/xv6-public/blob/master/main.c).

## References

- [Comments about the project on the bug-hurd mailing list](https://lists.gnu.org/archive/html/bug-hurd/2018-08/msg00048.html)
- [Initial thread on the bug-hurd mailing list](https://lists.gnu.org/archive/html/bug-hurd/2018-06/msg00048.html)
- [SMP in GNU/Hurd FAQ](https://www.gnu.org/software/hurd/faq/smp.html)
- [GNU Mach git repository](https://git.savannah.gnu.org/cgit/hurd/gnumach.git)
- [GNU Mach reference manual](https://www.gnu.org/software/hurd/gnumach-doc/)
- [**MultiProcessor Specification**](https://pdos.csail.mit.edu/6.828/2011/readings/ia32/MPspec.pdf)
- [**ACPI Specification**](http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf)
- [Mach boot trace](https://www.gnu.org/software/hurd/microkernel/mach/gnumach/boot_trace.html)
- [SPL man page](https://man.openbsd.org/spl)
- [Book: The Mach System](http://codex.cs.yale.edu/avi/os-book/OS9/appendices-dir/b.pdf)
- [Book: Mach3 Mysteries](http://www.nv50.0fees.net/Doc/Mach3Mysteries.pdf)
- [X15 operating system](https://www.sceen.net/x15)
- [Symmetric Multiprocessing - OSDev Wiki](https://wiki.osdev.org/Symmetric_Multiprocessing)
- [**Intel® 64 and IA-32 Architectures Software Developer’s Manuals**](https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf)
- [**Intel Developer Guide, Volume 3: System Programming Guide**](https://software.intel.com/sites/default/files/managed/a4/60/325384-sdm-vol-3abcd.pdf)

See also the [[FAQ entry|faq/smp]].