SMP stands for symmetric multiprocessing: a computer architecture in which several identical processors are connected to a single shared main memory. All processors are controlled by a single operating system, and each processor can access all devices. Operating systems with SMP support can deliver more performance, but doing so is not trivial. It is a little like a packed board meeting: more people in the room potentially means more can get done, but only one person can speak at a time, and scheduling everyone to speak can be quite an involved task.
NOTE: Many of the veteran Hurd developers consider this task too large for a Google Summer of Code project.
Current Status
As of September 2024, SMP support is implemented for i386, with working networking, because the kernel boots with only one CPU in the default processor set. The slave processors are temporarily disabled until they can be safely used per task. The full set cannot be turned on at boot time because of race bugs. Debian Hurd provides SMP-enabled GNU Mach kernels.
How to test the current SMP support
The easiest way to test the SMP support is via the qemu guide. Once you have the Hurd running, you can build an SMP-enabled GNU Mach:
$ git clone git://git.savannah.gnu.org/hurd/gnumach.git
$ cd gnumach
$ autoreconf -i
$ mkdir build
$ cd build
$ ../configure --enable-ncpus=4 --enable-apic --enable-kdb --disable-linux-groups
$ make gnumach.gz
$ su
# mv /boot/gnumach-1.8-486.gz /boot/gnumach-1.8-486.gz.bak
# cp gnumach.gz /boot/gnumach-1.8-486.gz
You may optionally update /boot/grub/grub.cfg: change hd0 to wd0 and add console=com0:
/boot/gnumach-1.8-486.gz root=part:2:device:wd0 console=com0
Also update /etc/fstab to use wd0 instead of hd0:
/dev/wd0s2 / ext2 defaults 0 1
/dev/wd0s1 none swap sw 0 0
You can shut down via /sbin/poweroff.
Start qemu with -smp 4, and add -nographic if you want to use com0:
$ qemu-system-i386 -M q35,accel=kvm -smp 4 -m 2G -net \
user,hostfwd=tcp::2223-:22 -net nic -hda debian-hurd-VERSION.img \
-nographic
To test SMP:
$ sudo /path/to/smp /bin/bash
$ stress -c 7
smp.c source can be found here.
What was done to get the 32-bit SMP support
The GNU Mach source code includes many special cases for multiprocessor systems, controlled by the #if NCPUS > 1 preprocessor conditional.
But this support is very limited:
- GNU Mach doesn't detect CPUs at runtime: the number of CPUs must be hardcoded at compile time. The number of CPUs is set in the mach_ncpus configuration variable, which is set to 1 by default, in the configfrag.ac file. This variable generates the NCPUS macro, which gnumach uses to control the special cases for multiprocessor. If NCPUS > 1, gnumach enables multiprocessor support, with the number of CPUs set by the user in the mach_ncpus variable. Otherwise, SMP is disabled.
- The special cases for multicore in the gnumach source code have never been tested, so they may contain many errors. Furthermore, these special cases are incomplete: many functions, such as cpu_number() or intel_startCPU(), aren't written.
- GNU Mach doesn't initialize the processor with the proper options for multiprocessing. For this reason, the current support is only multithread and not real multiprocessor support.
- Many drivers included in Hurd aren't thread-safe, and these could crash in an SMP environment. So it's necessary to isolate these drivers to avoid concurrency problems.
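To make the compile-time switch concrete, here is a minimal standalone sketch of the NCPUS pattern described above. It is not taken from the GNU Mach tree: the fallback value of NCPUS, the current_cpu_stub placeholder, and the tick_count/count_tick names are assumptions made for illustration.

```c
/* NCPUS normally comes from the build system (mach_ncpus in configfrag.ac).
 * The fallback below is an assumption so the sketch compiles on its own. */
#ifndef NCPUS
#define NCPUS 1
#endif

#if NCPUS > 1
/* Multiprocessor build: a real kernel would ask the hardware which CPU
 * is running. current_cpu_stub is a hypothetical placeholder for that. */
static int current_cpu_stub = 0;
static int cpu_number(void) { return current_cpu_stub; }
#else
/* Uniprocessor build: the CPU number is always 0. */
#define cpu_number() (0)
#endif

/* Per-CPU data is sized by NCPUS, so it shrinks to one entry when NCPUS is 1. */
static unsigned long tick_count[NCPUS];

/* Account one tick to whichever CPU is executing this code. */
static void count_tick(void)
{
    tick_count[cpu_number()]++;
}
```

With the default NCPUS of 1, every special case collapses at compile time and cpu_number() costs nothing, which is why the multiprocessor paths can stay untested for a long time.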
Solution
To solve this, we need to implement some routines to detect the number of processors, assign an identifier to each processor, and configure the lapic and IPI support. These routines must be executed during Mach boot.
"Really, all the support you want to get from the hardware is just getting the number of processors, initializing them, and support for interprocessor interrupts (IPI) for signaling." - Samuel Thibault link
"The process scheduler probably already has the support. What is missing is the hardware driver for SMP: enumeration and initialization." - Samuel Thibault link
The currently necessary functions are cpu_number() (in kern/cpu_number.h) and intel_startCPU(). Another, non-critical function is cpu_control(). Reference
Other interesting files are pmap.c and sched_prim.c.
We also have to build an isolated environment to execute the non-thread-safe drivers.
"Yes, this is a real concern. For the Linux drivers, the long-term goal is to move them to userland anyway. For Mach drivers, quite often they are not performance-sensitive, so big locks would be enough." - Samuel Thibault link
Task list
DONE Implement a routine to detect and identify the processors
This routine must check the number of processors, initialize the lapic of the BSP (the master processor), and assign a kernel ID to each processor. This kernel ID does not have to be equal to the APIC ID. The kernel ID/APIC ID relation can be kept in an array, where the kernel ID is the index and the APIC ID is the stored value. GNU Mach can derive the list of processors from memory, reading either the ACPI tables or the MP table. However, the MP table is deprecated on most modern systems, so it is preferable to use the ACPI tables for this.
The tasks to do for this are:
- Detect the number of processors
  - Create an array indexed by kernel ID, which sets a relation with the APIC ID.
- Initialize the lapic of the BSP
- Initialize the IOAPIC
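The kernel ID/APIC ID array mentioned in the task list could look roughly like the following sketch. All names here (MAX_CPUS, apic_id_of_cpu, cpu_register, cpu_kernel_id) are hypothetical and chosen only for illustration; the enumeration code that would call cpu_register() for each processor found in the ACPI tables is not shown.

```c
#include <stdint.h>

#define MAX_CPUS 256   /* assumption: upper bound, playing the role of NCPUS */

/* Index = kernel ID, stored value = APIC ID. */
static uint16_t apic_id_of_cpu[MAX_CPUS];
static int detected_cpus = 0;

/* Record one processor found during enumeration.
 * Returns the kernel ID assigned to it, or -1 if the table is full. */
static int cpu_register(uint16_t apic_id)
{
    if (detected_cpus >= MAX_CPUS)
        return -1;
    apic_id_of_cpu[detected_cpus] = apic_id;
    return detected_cpus++;
}

/* Reverse lookup: kernel ID for a given APIC ID, or -1 if it is unknown. */
static int cpu_kernel_id(uint16_t apic_id)
{
    for (int i = 0; i < detected_cpus; i++)
        if (apic_id_of_cpu[i] == apic_id)
            return i;
    return -1;
}
```

Kernel IDs are handed out densely in discovery order, so they stay usable as array indexes even when the hardware reports sparse APIC IDs such as 0, 2, 4.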
This routine could be called from i386at_init() (i386/i386at/model_dep.c). This function will call the functions which initialize the lapic and the ioapic.
NOTE: This routine must be executed before intel_startCPU() or other routines.
How to find APIC table
To find the APIC table, we can read the RSDT table (RSDT reference). To get the address of the RSDT, we need to read the RSDP table, which we can locate as described in the RSDP reference. Once we have the RSDT table, we need to read its Entry field and search the array referenced in this field for the pointer to the APIC table.
We can find an example of reading the ACPI tables in the X15 OS: Reference
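As a rough illustration of the first step, the sketch below scans a memory region for a valid RSDP and checks a table signature. It assumes the ACPI 1.0 RSDP layout (20 bytes, signature "RSD PTR ", checksum over all 20 bytes) and that the caller has already mapped the BIOS area to scan. The function names are hypothetical, and walking the RSDT entries themselves is left out because it needs physical memory access.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ACPI 1.0 RSDP layout (20 bytes). */
struct acpi_rsdp {
    char     signature[8];   /* "RSD PTR " */
    uint8_t  checksum;       /* makes the 20 bytes sum to 0 modulo 256 */
    char     oem_id[6];
    uint8_t  revision;
    uint32_t rsdt_addr;      /* physical address of the RSDT */
} __attribute__((packed));

/* Byte-wise sum modulo 256, as used by the ACPI table checksums. */
static uint8_t acpi_sum(const void *p, size_t len)
{
    const uint8_t *b = p;
    uint8_t sum = 0;
    while (len--)
        sum = (uint8_t)(sum + *b++);
    return sum;
}

/* Scan [start, start + len) on 16-byte boundaries for a valid RSDP.
 * Returns NULL if none is found. */
static const struct acpi_rsdp *acpi_find_rsdp(const uint8_t *start, size_t len)
{
    for (size_t off = 0; off + sizeof(struct acpi_rsdp) <= len; off += 16) {
        const uint8_t *p = start + off;
        if (memcmp(p, "RSD PTR ", 8) == 0 &&
            acpi_sum(p, sizeof(struct acpi_rsdp)) == 0)
            return (const struct acpi_rsdp *)p;
    }
    return NULL;
}

/* Every ACPI table starts with a 4-byte signature; the APIC table uses "APIC". */
static int acpi_is_apic_table(const void *table)
{
    return memcmp(table, "APIC", 4) == 0;
}
```

Once the RSDP is found, rsdt_addr gives the RSDT, and each pointer in the RSDT entry array can be tested with acpi_is_apic_table() until the APIC table turns up.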
We need to initialize the machine_slot of each processor (currently only cpu0 is initialized). The machine_slot has this structure. Reference:
struct machine_slot {
    /*boolean_t*/ integer_t is_cpu;     /* is there a cpu in this slot? */
    cpu_type_t cpu_type;                /* type of cpu */
    cpu_subtype_t cpu_subtype;          /* subtype of cpu */
    /*boolean_t*/ integer_t running;    /* is cpu running */
    integer_t cpu_ticks[CPU_STATE_MAX];
    integer_t clock_freq;               /* clock interrupt frequency */
};
We can find an example of initialization in this link:
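Separately from the linked example, the following is a minimal sketch of what filling a machine_slot for every detected processor might look like. The typedefs, the values of NCPUS and CPU_STATE_MAX, and the machine_slots_init() helper are stand-ins chosen so the sketch compiles on its own; they are assumptions, not the definitions used in the GNU Mach tree.

```c
#define NCPUS 4          /* assumption: maximum number of CPUs for this sketch */
#define CPU_STATE_MAX 3  /* assumption: user, system and idle states */

/* Stand-ins for the Mach typedefs. */
typedef int integer_t;
typedef integer_t cpu_type_t;
typedef integer_t cpu_subtype_t;

struct machine_slot {
    /*boolean_t*/ integer_t is_cpu;   /* is there a cpu in this slot? */
    cpu_type_t    cpu_type;           /* type of cpu */
    cpu_subtype_t cpu_subtype;        /* subtype of cpu */
    /*boolean_t*/ integer_t running;  /* is cpu running */
    integer_t     cpu_ticks[CPU_STATE_MAX];
    integer_t     clock_freq;         /* clock interrupt frequency */
};

static struct machine_slot machine_slot[NCPUS];

/* Hypothetical helper: describe the first `detected` CPUs as present
 * and leave the remaining slots empty. */
static void machine_slots_init(int detected, cpu_type_t type,
                               cpu_subtype_t subtype, integer_t hz)
{
    for (int cpu = 0; cpu < NCPUS; cpu++) {
        struct machine_slot *slot = &machine_slot[cpu];
        int present = cpu < detected;

        slot->is_cpu = present;
        slot->cpu_type = present ? type : 0;
        slot->cpu_subtype = present ? subtype : 0;
        slot->running = 0;   /* set later, once the CPU has actually started */
        for (int i = 0; i < CPU_STATE_MAX; i++)
            slot->cpu_ticks[i] = 0;
        slot->clock_freq = present ? hz : 0;
    }
}
```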
This modification also involves the redefinition of NCPUS, which must be set to the maximum possible number of processors. We can do this by modifying configfrag.ac, with this:
# Multiprocessor support is still broken.
AH_TEMPLATE([MULTIPROCESSOR], [set things up for a uniprocessor])
mach_ncpus=2
AC_DEFINE_UNQUOTED([NCPUS], [$mach_ncpus], [number of CPUs])
[if [ $mach_ncpus -gt 1 ]; then]
  AC_DEFINE([MULTIPROCESSOR], [1], [set things up for a multiprocessor])
  AC_DEFINE_UNQUOTED([NCPUS], [256], [number of CPUs])
[fi]
1.1. Implement a cpu_number() function.
This function must return the kernel ID of the processor which is executing the function. To get this, we have to read the local APIC memory space, which will show the lapic of the current CPU. Reading the lapic, we can get its APIC ID. Once we have the APIC ID of the current CPU, the function will search in the Kernel/APIC array until it finds the same APIC ID. Then it will return the index (Kernel ID) of this position.
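The lookup described above can be sketched as follows. This is an illustration under several assumptions: the local APIC is in xAPIC mode, where the ID register sits at offset 0x20 and holds the APIC ID in bits 24 to 31; the memory-mapped lapic is replaced by a plain array so the code can run in user space; and the example contents of apic_id_of_cpu are made up.

```c
#include <stdint.h>

#define MAX_CPUS 8
#define LAPIC_ID_REG (0x20 / 4)   /* ID register offset, as a uint32_t index */

/* In the kernel, `lapic` would point at the memory-mapped local APIC of the
 * running CPU. Here it points at an ordinary array (assumption for the sketch). */
static uint32_t fake_lapic[0x400 / 4];
static volatile uint32_t *lapic = fake_lapic;

/* Index = kernel ID, value = APIC ID; filled during processor detection.
 * The values below are invented example data. */
static uint8_t apic_id_of_cpu[MAX_CPUS] = { 0, 2, 4, 6 };
static int detected_cpus = 4;

/* Return the kernel ID of the executing CPU, or -1 if its APIC ID
 * is not in the table. */
static int cpu_number(void)
{
    uint8_t apic_id = (uint8_t)(lapic[LAPIC_ID_REG] >> 24);

    for (int i = 0; i < detected_cpus; i++)
        if (apic_id_of_cpu[i] == apic_id)
            return i;
    return -1;
}
```

A linear search is fine for a sketch, but cpu_number() is called very often, so a real implementation would probably want a cheaper path, such as a reverse table indexed by APIC ID.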
DONE Implement a routine to initialize the processors
This routine will initialize the lapic of each processor and other structures needed to run the kernel. We can find an example of lapic initialization here: reference. We can also get more information in Chapters 8.4 and 8.11 of the Intel Developer Guide, Volume 3. link
DONE Implement intel_startCPU()
This function will initialize the descriptor tables of the processor specified by the parameter, and launch the startup IPI to this processor. This function will be executed during the boot of the kernel (process 0). The tasks to do in this function are:
- Initialize the processor descriptor tables
- Launch Startup IPI to this processor
We have a current implementation of intel_startCPU() in this link. This implementation is based on XNU's intel_startCPU() function.
We can find explanations about how to raise an IPI in these pages: Reference 1, Reference 2, Reference 3. We can also get information about how to raise an IPI in the Intel Developer Guide, Volume 3, Chapter 10.6.
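To make the IPI step more tangible, here is a small sketch of how the Interrupt Command Register (ICR) values for the INIT and Startup IPIs can be composed in xAPIC mode. The register offsets and bit positions follow the Intel manual cited above; the helper names and the idea of passing the lapic base as a pointer are assumptions for illustration, and the delays that the startup sequence requires between the IPIs are not shown.

```c
#include <stdint.h>

/* Local APIC register offsets (xAPIC, memory-mapped). */
#define LAPIC_ICR_LOW   0x300
#define LAPIC_ICR_HIGH  0x310

/* ICR low-word fields. */
#define ICR_DELIVERY_INIT     (5u << 8)    /* delivery mode 101b */
#define ICR_DELIVERY_STARTUP  (6u << 8)    /* delivery mode 110b */
#define ICR_LEVEL_ASSERT      (1u << 14)

/* ICR high word: the destination APIC ID lives in bits 24 to 31. */
static uint32_t icr_high(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}

/* ICR low word for an INIT IPI. */
static uint32_t icr_low_init(void)
{
    return ICR_DELIVERY_INIT | ICR_LEVEL_ASSERT;
}

/* ICR low word for a Startup IPI. The vector is the page number of the
 * real-mode entry code, which must be 4 KiB aligned and below 1 MiB.
 * Returns 0 if the address cannot be encoded. */
static uint32_t icr_low_startup(uint32_t entry_phys)
{
    if ((entry_phys & 0xFFFu) != 0 || entry_phys >= 0x100000u)
        return 0;
    return ICR_DELIVERY_STARTUP | ICR_LEVEL_ASSERT | (entry_phys >> 12);
}

/* Send an IPI: the high word is written first, because writing the
 * low word is what triggers the delivery. */
static void lapic_send_ipi(volatile uint32_t *lapic, uint8_t apic_id,
                           uint32_t low)
{
    lapic[LAPIC_ICR_HIGH / 4] = icr_high(apic_id);
    lapic[LAPIC_ICR_LOW / 4]  = low;
}
```

For example, an entry point at physical address 0x8000 gives a Startup IPI vector of 0x08, so the low word becomes 0x4608.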
Implement another routine to start the processors
This routine calls processor_start() for each processor, which will start the processor using this sequence of calls:
processor_start(processor_t processor)
-> cpu_start(processor->slot_num)
-> intel_startCPU(cpu)
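The call chain above can be sketched with stubs as follows. This is only an outline of the control flow: struct processor is reduced to the single field used here, the return types are simplified to int, intel_startCPU() is a stub that merely records the call, and the start_secondary_cpus() name and its parameters are hypothetical.

```c
#define NCPUS 4   /* assumption: maximum number of CPUs for this sketch */

/* Reduced stand-in for the kernel's processor structure. */
struct processor {
    int slot_num;
};
typedef struct processor *processor_t;

static struct processor processor_array[NCPUS];
static int start_requested[NCPUS];   /* records which CPUs the stub was asked to start */

/* Stub: the real function sets up descriptor tables and sends the startup IPI. */
static int intel_startCPU(int cpu)
{
    start_requested[cpu] = 1;
    return 0;
}

static int cpu_start(int cpu)
{
    return intel_startCPU(cpu);
}

static int processor_start(processor_t processor)
{
    return cpu_start(processor->slot_num);
}

/* Hypothetical driver loop: start every detected CPU except the BSP,
 * which is already running this code. Returns the number of CPUs started. */
static int start_secondary_cpus(int detected, int bsp)
{
    int started = 0;

    for (int cpu = 0; cpu < detected && cpu < NCPUS; cpu++) {
        if (cpu == bsp)
            continue;
        processor_array[cpu].slot_num = cpu;
        if (processor_start(&processor_array[cpu]) == 0)
            started++;
    }
    return started;
}
```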
These articles show some notes on how to do the AP startup:
- Reference1, Reference2 (...)
After implementing IPI support, it is recommended to reimplement machine_idle(), machine_relax(), halt_cpu() and halt_all_cpus() using IPI.
- reference
- Also, in ast_check.c, we have to implement both functions using IPI. Reference
These functions must force the processors to check whether there are any AST signals, and we ought to keep in mind the following IRC chat:
<AlmuHS> what is the use of AST in gnumach?
<AlmuHS> this file what do? https://github.com/AlmuHS/GNUMach_SMP/blob/master/i386/i386/ast_check.c
<youpi> I don't know
<youpi> but look at what calls that
<youpi> see e.g. the call in thread.c
<AlmuHS> This function is called during the sequence of cpu_up(), in machine.c
<AlmuHS> but only if NCPUS > 1
<youpi> it seems like it's to trigger an AST check on another processor
<youpi> i.e. a processor tells another to run ast_check
<youpi> (see the comment in thread.c)
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/blob/master/kern/machine.c
<youpi> well, the initialization part is not necessarily what's important to think about at first
<youpi> i.e. until you know what you'll have to do during execution, you don't know what you'll need to intialize at initialization
<youpi> you might even not need to initialize anything
<AlmuHS> then, this is the reason because all functions in ast_check.c are empty
<youpi> cause_ast_check being empty is really probably a TODO
<AlmuHS> but I'm not clear what I need to write in this functions
<youpi> what the comment said: make another processor run ast_check()
<youpi> which probably means raising an inter-processor interrupt
<youpi> (aka IPI)
<youpi> to get ast_check() called by the other processor
<AlmuHS> then, this funcions must raise an IPI in the processor?
<youpi> that's the idea
<youpi> the IPI probably needs some setup
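The idea from the chat, one processor making another run ast_check() through an IPI, can be simulated in user space as in the sketch below. Everything here is an assumption made for illustration: the vector number, the simplified cause_ast_check() signature taking CPU numbers instead of a processor_t, and the fact that "delivery" is modelled by calling the handler directly rather than by real hardware.

```c
#include <stdint.h>

#define NCPUS 4
#define AST_CHECK_VECTOR 0xF0u   /* hypothetical interrupt vector */

/* Index = kernel ID, value = APIC ID (invented example data). */
static uint8_t apic_id_of_cpu[NCPUS] = { 0, 1, 2, 3 };
static int ast_checks_run[NCPUS];

/* Last values that would have been written to the ICR registers. */
static uint32_t last_icr_high, last_icr_low;

/* Stand-in for the interrupt handler that would run on the target CPU. */
static void ast_check_ipi_handler(int cpu)
{
    ast_checks_run[cpu]++;   /* the real handler would call ast_check() */
}

/* Compose a fixed-delivery IPI and "deliver" it to the target CPU. */
static void send_fixed_ipi(int cpu, uint8_t vector)
{
    last_icr_high = (uint32_t)apic_id_of_cpu[cpu] << 24;
    last_icr_low  = (1u << 14) | vector;   /* delivery mode 000b, level assert */

    /* Simulation only: real hardware would interrupt the target CPU. */
    if (vector == AST_CHECK_VECTOR)
        ast_check_ipi_handler(cpu);
}

/* Ask another CPU to check for pending ASTs. The running CPU needs no IPI. */
static void cause_ast_check(int target_cpu, int current_cpu)
{
    if (target_cpu == current_cpu)
        return;
    send_fixed_ipi(target_cpu, (uint8_t)AST_CHECK_VECTOR);
}
```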
We can use the XV6 source code as a model to implement the functions and routines. Some interesting files are lapic.c, proc.c and main.c.
References
- Comments about the project on the bug-hurd mailing list
- Initial thread on the bug-hurd mailing list
- SMP in GNU/Hurd FAQ
- GNU Mach git repository
- GNU Mach reference manual
- MultiProcessor Specification
- ACPI Specification
- Mach boot trace
- SPL man page
- Book: The Mach System
- Book: Mach3 Mysteries
- X15 operating system
- Symmetric Multiprocessing - OSDev Wiki
- Intel® 64 and IA-32 Architectures Software Developer’s Manuals
- Intel Developer Guide, Volume 3: System Programming Guide
See also the FAQ entry.