author | Marcus Brinkmann <marcus@gnu.org> | 2001-09-11 04:51:41 +0000 |
---|---|---|
committer | Marcus Brinkmann <marcus@gnu.org> | 2001-09-11 04:51:41 +0000 |
commit | a3cd3171ac1cd08447a7f300f402f86ea77518d5 (patch) | |
tree | aa705e15caccef8a32a025817011d466463a8e0d | |
parent | b0bc71482c3354017ab2d06a406faa28cedebc87 (diff) |
Adding my talk about the Hurd.
-rw-r--r-- | hurd-talk.html | 968 |
1 files changed, 968 insertions, 0 deletions
diff --git a/hurd-talk.html b/hurd-talk.html new file mode 100644 index 00000000..559a8456 --- /dev/null +++ b/hurd-talk.html @@ -0,0 +1,968 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" + "http://www.w3.org/TR/REC-html40/strict.dtd"> +<HTML> +<HEAD> +<TITLE>The GNU Hurd - GNU Project - Free Software Foundation (FSF)</TITLE> +<LINK REV="made" HREF="mailto:web-hurd@gnu.org"> +<META NAME="keywords" CONTENT="hurd"> +</HEAD> +<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#1F00FF" ALINK="#FF0000" VLINK="#9900DD"> +<TABLE width="100%" border="0" cellspacing="5" cellpadding="15"> +<TR> +<TD COLSPAN="2"> +<IMAGE SRC="/graphics/hurd_sm_mf.jpg" ALT=" [image of the Hurd logo] "> +[ + <A HREF="/software/hurd/hurd-talk.html">English</A> +] +</TD> +</TR> +<TR> +<TD ALIGN="LEFT" VALIGN="TOP" BGCOLOR="#eeeeee"> +<A HREF="/software/hurd/hurd.html"><STRONG>The GNU Hurd</STRONG></A></BR> + <BR> +<A HREF="/software/hurd/docs.html">Documentation</A><BR> +<A HREF="/software/hurd/install.html">Installation</A><BR> +<A HREF="/software/hurd/help.html">Getting Help</A><BR> +<A HREF="/software/hurd/download.html">Download</A><BR> +<A HREF="/software/hurd/devel.html">Development</A><BR> +<A HREF="/software/hurd/history.html">History</A> +</TD> +<TD ALIGN="LEFT" VALIGN="TOP"> +<HR> +<H4><A NAME="contents">Table of Contents</A></H4> +<UL> + <LI><A HREF="#int" NAME="TOCint">Introduction</A> + <LI><A HREF="#ove" NAME="TOCove">Overview</A> + <LI><A HREF="#his" NAME="TOChis">Historicals</A> + <LI><A HREF="#ker" NAME="TOCker">Kernel Architectures</A> + <LI><A HREF="#mic" NAME="TOCmic">Micro vs Monolithic</A> + <LI><A HREF="#sin" NAME="TOCsin">Single Server vs Multi Server</A> + <LI><A HREF="#mul" NAME="TOCmul">Multi Server is superior, ...</A> + <LI><A HREF="#the" NAME="TOCthe">The Hurd even more so.</A> + <LI><A HREF="#mac" NAME="TOCmac">Mach Inter Process Communication</A> + <LI><A HREF="#how" NAME="TOChow">How to get a port?</A> + <LI><A HREF="#exa" NAME="TOCexa">Example of 
<SAMP>hurd_file_name_lookup</SAMP></A> + <LI><A HREF="#pat" NAME="TOCpat">Pathname resolution example</A> + <LI><A HREF="#map" NAME="TOCmap">Mapping the POSIX Interface</A> + <LI><A HREF="#filser" NAME="TOCfilser">File System Servers</A> + <LI><A HREF="#act" NAME="TOCact">Active vs Passive</A> + <LI><A HREF="#aut" NAME="TOCaut">Authentication</A> + <LI><A HREF="#ope" NAME="TOCope">Operations on authentication ports</A> + <LI><A HREF="#est" NAME="TOCest">Establishing trusted connections</A> + <LI><A HREF="#pas" NAME="TOCpas">Password Server</A> + <LI><A HREF="#pro" NAME="TOCpro">Process Server</A> + <LI><A HREF="#filsys" NAME="TOCfilsys">Filesystems</A> + <LI><A HREF="#dev" NAME="TOCdev">Developing the Hurd</A> + <LI><A HREF="#sto" NAME="TOCsto">Store Abstraction</A> + <LI><A HREF="#deb" NAME="TOCdeb">Debian GNU/Hurd</A> + <LI><A HREF="#stabin" NAME="TOCstabin">Status of the Debian GNU/Hurd binary archive</A> + <LI><A HREF="#stainf" NAME="TOCstainf">Status of the Debian infrastructure</A> + <LI><A HREF="#staarc" NAME="TOCstaarc">Status of the Debian Source archive</A> + <LI><A HREF="#debide" NAME="TOCdebide">Debian GNU/Hurd: Good idea, bad idea?</A> + <LI><A HREF="#end" NAME="TOCend">End</A> +</UL> +<HR> +<H3>Talk about the Hurd</H3> +<P> +This talk about the Hurd was written by Marcus Brinkmann for +<UL> +<LI>OSDEM, Brussels, 4. Feb 2001, +<LI>Frühjahrsfachgespräche, Cologne, 2. Mar 2001 and +<LI>Libre Software Meeting, Bordeaux, 4. Jul 2001. +</UL> + +<H4><A HREF="#TOCint" NAME="int">Introduction</A></H4> +<P> +When we talk about free software, we usually refer to the free +software licenses. We also need relief from software patents, so our +freedom is not restricted by them. But there is a third type of +freedom we need, and that's user freedom. + +<P> +Expert users don't take a system as it is. They like to change the +configuration, and they want to run the software that works best for +them. 
That includes window managers as well as your favourite text +editor. But even on a GNU/Linux system consisting only of free +software, you cannot easily use the filesystem format, network +protocol or binary format you want without special privileges. In +traditional unix systems, user freedom is severely restricted by the +system administrator. + +<P> +The Hurd removes these restrictions from the user. It provides a +user-extensible system framework without giving up POSIX compatibility +and the unix security model. Throughout this talk, we will see that +this brings further advantages besides freedom. + +<H4><A HREF="#TOCove" NAME="ove">Overview</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +The Hurd is a POSIX compatible multi-server +system operating on top of the GNU Mach microkernel. + +Topics: +GNU Mach +The Hurd +Development +Debian GNU/Hurd +</PRE></TD></TR></TABLE> +<P> +The Hurd is a POSIX compatible multi-server system operating on top of +the GNU Mach microkernel. + +<P> +I will have to explain what GNU Mach is, so we start with that. Then +I will talk about the Hurd's architecture. After that, I will give a +short overview of the Hurd libraries. Finally, I will tell you how +the Debian project is related to the Hurd. + +<H4><A HREF="#TOChis" NAME="his">Historicals</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +1983: Richard Stallman founds the GNU project. +1988: Decision is made to use Mach 3.0 as the kernel. +1991: Mach 3.0 is released under compatible license. +1991: Thomas Bushnell, BSG, founds the Hurd project. +1994: The Hurd boots the first time. +1997: Version 0.2 of the Hurd is released. + +1998: Debian hurd-i386 archive is created. +2001: Debian GNU/Hurd snapshot fills three CD images. 
+</PRE></TD></TR></TABLE> +<P> +When Richard Stallman founded the GNU project in 1983, he wanted to +write an operating system consisting only of free software. Very +soon, a lot of the essential tools were implemented, and released +under the GPL. However, one critical piece was missing: The kernel. +<P> +After considering several alternatives, it was decided not to write a +new kernel from scratch, but to start with the Mach microkernel. This +was in 1988, and it was not before 1991 that Mach was released under a +license allowing the GNU project to distribute it as a part of the +system. +<P> +In 1998, I started the Debian GNU/Hurd project, and in 2001 the number +of available CDs with Hurd packages fills three CD images. + +<H4><A HREF="#TOCker" NAME="ker">Kernel Architectures</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Microkernel: +Enforces resource managament (paging, scheduling) +Manages tasks +Implements message passing for IPC +Provides basic hardware support + +Monolithic kernel: +No message passing necessary. +Rich set of features (filesystems, authentication, network sockets, POSIX interface, ...) +</PRE></TD></TR></TABLE> +<P> +Microkernels were very popular in the scientific world around that +time. They don't implement a full operating system, but only the +infrastructure needed to enable other tasks to implement most +features. In contrast, monolithical kernels like Linux contain +program code of device drivers, network protocols, process management, +authentication, file systems, POSIX compatible interfaces and much +more. +<P> +So what are the basic facilities a microkernel provides? In general, +this is resource management and message passing. Resource management, +because the kernel task needs to run in a special privileged mode of +the processor, to be able to manipulate the memory management unit and +perform context switches (also to manage interrupts). 
Message +passing, because without a basic communication facility the other +tasks could not interact to provide the system services. Some +rudimentary hardware device support is often necessary to bootstrap +the system. So the basic jobs of a microkernel are enforcing the +paging policy (the actual paging can be done by an external pager +task), scheduling, message passing and probably basic hardware device +support. +<P> +Mach was the obvious choice back then, as it provides a rich set of +interfaces to get the job done. Beside a rather brain-dead device +interface, it provides tasks and threads, a messaging system allowing +synchronous and asynchronous operation and a complex interface for +external pagers. It's certainly not one of the sexiest microkernels +that exist today, but more like a big old mama. The GNU project +maintains its own version of Mach, called GNU Mach, which is based on +Mach 3.0. In addition to the features contained in Mach 3.0, the GNU +version contains many of the Linux 2.0 block device and network card +drivers. +<P> +A complete treatment of the differences between a microkernel and +monolithical kernel design can not be provided here. But a couple of +advantages of a microkernel design are fairly obvious. + +<H4><A HREF="#TOCmic" NAME="mic">Micro vs Monolithic</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Microkernel +Clear cut responsibilities +Flexibility in operating system design, easier debugging +More stability (less code to break) +New features are not added to the kernel + +Monolithic kernel +Intolerance or creeping featuritis +Danger of spaghetti code +Small changes can have far reaching side effects +</PRE></TD></TR></TABLE> +<P> +Because the system is split up into several components, clean +interfaces have to be developed, and the responsibilities of each part +of the system must be clear. 
+<P> +Once a microkernel is written, it can be used as the base for several +different operating systems. Those can even run in parallel which +makes debugging easier. When porting, most of the hardware dependant +code is in the kernel. +<P> +Much of the code that doesn't need to run in the special kernel mode +of the processor is not part of the kernel, so stability increases +because there is simply less code to break. +<P> +New features are not added to the kernel, so there is no need to hold +the barrier high for new operating system features. +<P> +Compare this to a monolithical kernel, where you either suffer from +creeping featuritis or you are intolerant of new features (we see both +in the Linux kernel). +<P> +Because in a monolithical kernel, all parts of the kernel can access +all data structures in other parts, it is more likely that short cuts +are used to avoid the overhead of a clean interface. This leads to a +simple speed up of the kernel, but also makes it less comprehensible +and more error prone. A small change in one part of the kernel can +break remote other parts. + +<H4><A HREF="#TOCsin" NAME="sin">Single Server vs Multi Server</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Single Server +A single task implements the functionality of the operating system. +Multi Server +Many tasks cooperate to provide the system's functionality. +One server provides only a small but well-defined part of the whole system. +The responsibilities are distributed logically among the servers. + +A single-server system is comparable to a monolithic kernel system. It has similar +advantages and disadvantages. +</PRE></TD></TR></TABLE> +<P> +There exist a couple of operating systems based on Mach, but they all +have the same disadvantages as a monolithical kernel, because those +operating systems are implemented in one single process running on top +of the kernel. 
This process provides all the services a monolithical +kernel would provide. This doesn't make a whole lot of sense (the +only advantage is that you can probably run several of such isolated +single servers on the same machine). Those systems are also called +single-server systems. The Hurd is the only usable multi-server +system on top of Mach. In the Hurd, there are many server programs, +each one responsible for a unique service provided by the operating +system. These servers run as Mach tasks, and communicate using the +Mach message passing facilities. One of them does only provide a +small part of the functionality of the system, but together they build +up a complete and functional POSIX compatible operating system. + +<H4><A HREF="#TOCmul" NAME="mul">Multi Server is superior, ...</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Any multi-server has advantages over single-server: + +Clear cut responsibilities +More stability: If one server dies, all others remain +Easier development cycle: Testing without reboot (or replacing running servers), debugging with gdb +Easier to make changes and add new features +</PRE></TD></TR></TABLE> +<P> +Using several servers has many advantages, if done right. If a file +system server for a mounted partition crashes, it doesn't take down +the whole system. Instead the partition is "unmounted", and you can +try to start the server again, probably debugging it this time with +gdb. The system is less prone to errors in individual components, and +over-all stability increases. The functionality of the system can be +extended by writing and starting new servers dynamically. (Developing +these new servers is easier for the reasons just mentioned.) +<P> +But even in a multi-server system the barrier between the system and +the users remains, and special privileges are needed to cross it. We +have not achieved user freedom yet. 
+ +<H4><A HREF="#TOCthe" NAME="the">The Hurd even more so.</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +The Hurd goes beyond all this, and allows users to write and run their servers, too! + +Users can replace system servers dynamically with their own implementations. +Users can decide what parts of the remainder of the system they want to use. +Users can extend the functionality of the system. +No mutual trust necessary to make use of other users services. +Security of the system is not harmed by trusting users services. +</PRE></TD></TR></TABLE> +<P> +To quote Thomas Bushnell, BSG, from his paper +<A HREF="/software/hurd/hurd-paper.html">``A new strategy towards OS +design'' (1996)</A>: +<BLOCKQUOTE> +The GNU Hurd, by contrast, is designed to make the area of system code +as limited as possible. Programs are required to communicate only +with a few essential parts of the kernel; the rest of the system is +replaceable dynamically. Users can use whatever parts of the +remainder of the system they want, and can easily add components +themselves for other users to take advantage of. No mutual trust need +exist in advance for users to use each other's services, nor does the +system become vulnerable by trusting the services of arbitrary users. +</BLOCKQUOTE> + +<EM> +So the Hurd is a set of servers running on top of the Mach +micro-kernel, providing a POSIX compatible and extensible operating +system. What servers are there? What functionality do they provide, +and how do they cooperate? +</EM> + +<H4><A HREF="#TOCmac" NAME="mac">Mach Inter Process Communication</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Ports are message queues which can be used as one-way communication channels. + +Port rights are receive, send or send-once +Exactly one receiver +Potentially many senders + +MiG provides remote procedure calls on top of Mach IPC. 
RPCs look like function calls to the +user. +</PRE></TD></TR></TABLE> +<P> +Inter-process communication in Mach is based on the ports concept. A +port is a message queue, used as a one-way communication channel. In +addition to a port, you need a port right, which can be a send right, +receive right, or send-once right. Depending on the port right, you +are allowed to send messages to the port, receive messages from it, +or send just one single message. +<P> +For every port, there exists exactly one task holding the receive +right, but there can be zero or many senders. The send-once right is +useful for clients expecting a response message. They can give a +send-once right to the reply port along with the message. The kernel +guarantees that at some point, a message will be received on the reply +port (this can be a notification that the server destroyed the +send-once right). +<P> +You don't need to know much about the format a message takes to be +able to use the Mach IPC. The Mach interface generator mig hides the +details of composing and sending a message, as well as receiving the +reply message. To the user, it just looks like a function call, but +in truth the message could be sent over a network to a server running +on a different computer. The set of remote procedure calls a server +provides is the public interface of this server. + + +<H4><A HREF="#TOChow" NAME="how">How to get a port?</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Traditional Mach: + +Nameserver provides ports to all registered servers. +The nameserver port itself is provided by Mach. +Like a phone book: One list. + +The Hurd: + +The filesystem is used as the server namespace. +Root directory port is inserted into each task. +The C library finds other ports with hurd_file_name_lookup, performing a pathname resolution. +Like a tree of phone books. +</PRE></TD></TR></TABLE> +<P> +So how does one get a port to a server? 
You need something like a +phone book for server ports, or otherwise you can only talk to +yourself. In the original Mach system, a special nameserver is +dedicated to that job. A task could get a port to the nameserver from +the Mach kernel and ask it for a port (with send right) to a server +that registered itself with the nameserver at some earlier time. +<P> +In the Hurd, there is no nameserver. Instead, the filesystem is used +as the server namespace. This works because there is always a root +filesystem in the Hurd (remember that the Hurd is a POSIX compatible +system); this is an assumption the people who developed Mach couldn't +make, so they had to choose a different strategy. You can use the +function hurd_file_name_lookup, which is part of the C library, to get +a port to the server belonging to a filename. Then you can start to +send messages to the server in the usual way. + +<H4><A HREF="#TOCexa" NAME="exa">Example of <SAMP>hurd_file_name_lookup</SAMP></A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +mach_port_t identity; +mach_port_t pwserver; +kern_return_t err; + +pwserver = hurd_file_name_lookup + ("/servers/password"); + +err = password_check_user (pwserver, + 0 /* root */, "supass", + &identity); +</PRE></TD></TR></TABLE> +<P> +As a concrete example, the special filename +<SAMP>/servers/password</SAMP> can be used to request a port to the +Hurd password server, which is responsible to check user provided +passwords. +<P> +(explanation of the example) + +<H4><A HREF="#TOCpat" NAME="pat">Pathname resolution example</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Task: Lookup /mnt/readme.txt where /mnt has a mounted filesystem. + +The C library asks the root filesystem server about /mnt/readme.txt. +The root filesystem returns a port to the mnt filesystem server (matching /mnt) and the retry name +/readme.txt. 
+The C library asks the mnt filesystem server about /readme.txt. +The mnt filesystem server returns a port to itself and records that this port refers to the regular + file /readme.txt. +</PRE></TD></TR></TABLE> +<P> +The C library itself does not have a full list of all available +servers. Instead pathname resolution is used to traverse through a +tree of servers. In fact, filesystems themselves are implemented by +servers (let us ignore the chicken and egg problem here). So all the +C library can do is to ask the root filesystem server about the +filename provided by the user (assuming that the user wants to resolve +an absolute path), using the <SAMP>dir_lookup</SAMP> RPC. If the +filename refers to a regular file or directory on the filesystem, the +root filesystem server just returns a port to itself and records that +this port corresponds to the file or directory in question. But if a +prefix of the full path matches the path of a server the root +filesystem knows about, it returns to the C library a port to this +server and the remaining part of the pathname that couldn't be +resolved. The C library than has to retry and query the other server +about the remaining path component. Eventually, the C library will +either know that the remaining path can't be resolved by the last +server in the list, or get a valid port to the server in question. + +<H4><A HREF="#TOCmap" NAME="map">Mapping the POSIX Interface</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Filedescriptor Port to server + providing the file + +fd = open(name,...) dir_lookup(..,name,..,&port) + [pathname resolution] + +read(fd, ...) io_read(port, ...) + +write(fd, ...) io_write(port, ...) + +fstat(fd, ...) io_stat(port, ...) + +... 
+</PRE></TD></TR></TABLE> +<P> +It should by now be obvious that the port returned by the server can +be used to query the files status, content and other information from +the server, if good remote procedure calls to do that are defined and +implemented by it. This is exactly what happens. Whenever a file is +opened using the C libraries <SAMP>open()</SAMP> call, the C library +uses the above pathname resolution to get a port to a server providing +the file. Then it wraps a file descriptor around it. So in the Hurd, +for every open file descriptor there is a port to a server providing +this file. Many other C library calls like <SAMP>read()</SAMP> and +<SAMP>write()</SAMP> just call a corresponding RPC using the port +associated with the file descriptor. + +<H4><A HREF="#TOCfilser" NAME="filser">File System Servers</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Provide file and directory services for ports (and more). +These ports are returned by a directory lookup. +Translate filesystem accesses through their root path (hence the name translator). +The C library maps the POSIX file and directory interface (and more) to RPCs to +the filesystem servers ports, but also does work on its own. +Any user can install file system servers on inodes they own. +</PRE></TD></TR></TABLE> +<P> +So we don't have a single phone book listing all servers, but rather a +tree of servers keeping track of each other. That's really like +calling your friend and asking for the phone number of the blond girl +at the party yesterday. He might refer you to a friend who hopefully +knows more about it. Then you have to retry. +<P> +This mechanism has huge advantages over a single nameserver. First, +note that standard unix permissions on directories can be used to +restrict access to a server (this requires that the filesystems +providing those directories behave). 
You just have to set the +permissions of a parent directory accordingly and provide no other way +to get a server port. +<P> +But there are much deeper implications. Most of all, a pathname never +directly refers to a file, it refers to a port of a server. That +means that providing a regular file with static data is just one of +the many options the server has to service requests on the file port. +A server can also create the data dynamically. For example, a server +associated with /dev/random can provide new random data on every +io_read() on the port to it. A server associated with /dev/fortune +can provide a new fortune cookie on every open(). +<P> +While a regular filesystem server will just serve the data as stored +in a filesystem on disk, there are servers providing purely virtual +information, or a mixture of both. It is up to the server to behave +and provide consistent and useful data on each remote procedure call. +If it does not, the results may not match the expectations of the user +and confuse him. +<P> +A footnote from the Hurd info manual: +<BLOCKQUOTE> +(1) You are lost in a maze of twisty little filesystems, all +alike.... +</BLOCKQUOTE> +<P> +Because a server installed in the filesystem namespace translates all +filesystem operations that go through its root path, such a server is +also called "active translator". You can install translators using +the settrans command with the -a option. + +<H4><A HREF="#TOCact" NAME="act">Active vs Passive</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Active Translators: + +"settrans -a /cdrom /hurd/isofs /dev/hd2" +Are running filesystem servers. +Are attached to the root node they translate. +Run as a normal process. +Go away with every reboot, or even time out. +</PRE></TD></TR></TABLE> +<P> +Many translator settings remain constant for a long time. 
It would be +very lame to always repeat the same couple of dozen settrans calls +manually or at boot time. So the Hurd provides a filesystem extension +that allows translator settings to be stored inside the filesystem and lets +the filesystem servers do the work of starting those servers on demand. +Such translator settings are called "passive translators". A passive +translator is really just a command line string stored in an inode of +the filesystem. If during a pathname resolution a server encounters +such a passive translator, and no active translator exists already +(for this node), it will use this string to start up a new translator +for this inode, and then let the C library continue with the path +resolution as described above. Passive translators are installed with +settrans using the -p option (which is already the default). + +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Passive Translators: + +"settrans /mnt /hurd/ext2fs /dev/hd1s1" +Are stored as command strings into an inode. +Are used to start a new active translator if there isn't one. +Startup is transparent to the user. +Startup happens the first time the server is needed. +Are permanent across reboots (like file data). +</PRE></TD></TR></TABLE> +<P> +So passive translators also serve as a sort of automounting feature, +because no manual interaction is required. The server startup is +deferred until the service is needed, and it is transparent to the user. +<P> +When a passive translator is started, it runs as a normal process +with the same user and group id as those of the underlying inode. Any +user is allowed to install passive and active translators on inodes +that he owns. 
This way the user can install new servers into the +global namespace (for example, in his home or tmp directory) and thus +extend the functionality of the system (recall that servers can +implement other remote procedure calls beside those used for files and +directories). A careful design of the trusted system servers makes +sure that no permissions leak out. +<P> +In addition, users can provide their own implementations of some of +the system servers instead the system default. For example, they can +use their own exec server to start processes. The user specific exec +server could for example start java programs transparently (without +invoking the interpreter manually). This is done by setting the +environment variable EXECSERVERS. The systems default exec server +will evaluate this environment variable and forward the RPC to each of +the servers listed in turn, until some server accepts it and takes +over. The system default exec server will only do this if there are +no security implications. (XXX There are other ways to start new +programs than by using the system exec server. Those are still +available.) +<P> +Let's take a closer look at some of the Hurd servers. It was already +mentioned that only few system servers are mandatory for users. To +establish your identity within the Hurd system, you have to +communicate with the trusted systems authentication server auth. To +put the system administrator into control over the system components, +the process server does some global bookkeeping. +<P> +But even these servers can be ignored. However, registration with the +authentication server is the only way to establish your identity +towards other system servers. Likewise, only tasks registered as +processes with the process server can make use of its services. 
+ +<H4><A HREF="#TOCaut" NAME="aut">Authentication</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +A user identity is just a port to an authserver. The auth server stores four set of ids for it: +effective user ids +effective group ids +available user ids +available group ids +Basic properties: +Any of these can be empty. +A 0 among the user ids identifies the superuser. +Effective ids are used to check if the user has the permission. +Available ids can be turned into effective ids on user request. +</PRE></TD></TR></TABLE> +<P> +<P> +The Hurd auth server is used to establish the identity of a user for a +server. Such an identity (which is just a port to the auth server) +consists of a set of effective user ids, a set of effective group ids, +a set of available user ids and a set of available group ids. Any of +these sets can be empty. + +<H4><A HREF="#TOCope" NAME="ope">Operations on authentication ports</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +The auth server provides the following operations on ports: + +Merge the ids of two ports into a new one. + +Return a new port containing a subset of the ids in a port. + +Create a new port with arbitrary ids (superuser only). + +Establish a trusted connection between users and servers. +</PRE></TD></TR></TABLE> +<P> +If you have two identities, you can merge them and request an identity +consisting of the unions of the sets from the auth server. You can +also create a new identity consisting only of subsets of an identity +you already have. What you can't do is extending your sets, unless +you are the superuser which is denoted by having the user id 0. 
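<P>
The merge and subset rules described above can be pictured with plain set operations. The following is only a minimal sketch in C, not the auth server's actual code: <SAMP>struct idset</SAMP>, <SAMP>MAX_IDS</SAMP> and all function names are made up for illustration, and it models a single id set rather than the four sets an identity carries.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_IDS 16  /* arbitrary limit for this toy model */

/* Toy model of one id set, e.g. the effective user ids (hypothetical type). */
struct idset {
    unsigned ids[MAX_IDS];
    size_t n;
};

static bool idset_contains(const struct idset *s, unsigned id)
{
    for (size_t i = 0; i < s->n; i++)
        if (s->ids[i] == id)
            return true;
    return false;
}

/* Merge: store the union of a and b in out. Returns false if it does not fit. */
static bool idset_merge(const struct idset *a, const struct idset *b,
                        struct idset *out)
{
    struct idset r = { .n = 0 };
    const struct idset *src[2] = { a, b };

    for (int k = 0; k < 2; k++) {
        for (size_t i = 0; i < src[k]->n; i++) {
            unsigned id = src[k]->ids[i];
            if (idset_contains(&r, id))
                continue;               /* skip duplicates */
            if (r.n == MAX_IDS)
                return false;
            r.ids[r.n++] = id;
        }
    }
    *out = r;
    return true;
}

/* True if every id in sub also appears in super. */
static bool idset_is_subset(const struct idset *sub, const struct idset *super)
{
    for (size_t i = 0; i < sub->n; i++)
        if (!idset_contains(super, sub->ids[i]))
            return false;
    return true;
}

/* A new identity may only shrink the held set, unless user id 0 is held. */
static bool may_create(const struct idset *held, const struct idset *wanted)
{
    return idset_contains(held, 0) || idset_is_subset(wanted, held);
}
```

In this model, merging {1000, 100} with {100, 2000} gives a three-element set, and a request for user id 0 is refused unless 0 is already among the ids held.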
+ +<H4><A HREF="#TOCest" NAME="est">Establishing trusted connections</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +User provides a rendezvous port to the server (with io_reauthenticate). +User calls auth_user_authenticate on the authentication port (his identity), passing the rendezvous +port. +Server calls auth_server_authenticate on its authentication port (to a trusted auth server), passing +the rendezvous port and the server port. +If both authentication servers are the same, the auth server can match the rendezvous ports and return the server +port to the user and the user ids to the server. +</PRE></TD></TR></TABLE> +<P> +Finally, the auth server can establish the identity of a user for a +server. This is done by exchanging a server port and a user identity +if both match the same rendezvous port. The server port will be +returned to the user, while the server is informed about the id sets +of the user. The server can then serve or reject subsequent RPCs by +the user on the server port, based on the identity it received from +the auth server. +<P> +Anyone can write a server conforming to the auth protocol, but of +course all system servers use a trusted system auth server to +establish the identity of a user. If the user is not using the system +auth server, matching the rendezvous port will fail and no server port +will be returned to the user. Because this practically requires all +programs to use the same auth server, the system auth server is +minimal in every respect, and additional functionality is moved +elsewhere, so user freedom is not unnecessarily restricted. + +<H4><A HREF="#TOCpas" NAME="pas">Password Server</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +The password server `/servers/password' runs as +root and returns a new authentication port in +exchange for a unix password. 
+ +The ids corresponding to the authentication +port match the unix user and group ids. + +Support for shadow passwords is implemented here. +</PRE></TD></TR></TABLE> +<P> +The password server sits at /servers/password and runs as root. It +can hand out ports to the auth server in exchange for a unix password, +matching it against the password or shadow file. Several utilities +make use of this server, so they don't need to be setuid root. + +<H4><A HREF="#TOCpro" NAME="pro">Process Server</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +The superuser must retain control over user tasks, so: +All Mach tasks are associated with a PID in the system default proc server. +Optionally, user tasks can store: +Their environment variables. +Their argument vector. +A port, which others can request based on the PID (like a nameserver). +Also implemented in the proc server: +Sessions and process groups. +Global configuration not in Mach, like hostname, hostid, system version. +</PRE></TD></TR></TABLE> +<P> +The process server is responsible for some global bookkeeping. As +such it has to be trusted and is not replaceable by the user. +However, a user is not required to use any of its services. In that +case the user will not be able to take advantage of the POSIXish +appearance of the Hurd. +<P> +Mach tasks are not as heavy as POSIX processes. For example, +there is no concept of process groups or sessions in Mach. The proc +server fills in the gap. It provides a PID for all Mach tasks, and +also stores the argument line, environment variables and other +information about a process (if the Mach tasks provide them, which is +usually the case if you start a process with the default +fork()/exec()). A process can also register a message port with the +proc server, which can then be requested by anyone. So the proc +server also functions as a nameserver using the process id as the +name. 
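<P>
The nameserver role can be pictured as a table keyed by PID. The sketch below is a toy model in Python under stated assumptions: the class ProcTable and its method names are hypothetical and are not the proc server's RPC interface, and ports are represented by arbitrary Python objects.

```python
import itertools


class ProcTable:
    """Toy model of the proc server's PID bookkeeping (names are hypothetical)."""

    def __init__(self):
        self._next_pid = itertools.count(1)
        self._msgports = {}  # pid -> registered message port, or None

    def task_created(self):
        # Every task gets a PID, whether or not it cooperates with proc.
        pid = next(self._next_pid)
        self._msgports[pid] = None
        return pid

    def register_msgport(self, pid, port):
        # Optional step: a task volunteers a port that others may ask for.
        if pid not in self._msgports:
            raise KeyError(pid)
        self._msgports[pid] = port

    def lookup_msgport(self, pid):
        # Anyone can ask; the PID acts as the name. Returns None when the
        # task never registered a port.
        return self._msgports[pid]
```

<P>
A task that never registers anything still appears in the table with its PID, which is what lets the system administrator keep control over it.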
+<P> +The proc server also stores some other miscellaneous information not +provided by Mach, like the hostname, hostid and system version. +Finally, it provides facilities to group processes and their ports +together, as well as to convert between pids, process server ports and +Mach task ports. +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +User tasks not registering themselves with proc only have a PID assigned. + +Users can run their own proc server in addition +to the system default, at least for those parts +of the interface that don't require superuser privileges. +</PRE></TD></TR></TABLE> +<P> +Although the system default proc server can't be avoided (all Mach +tasks spawned by users will get a pid assigned, so the system +administrator can control them), users can run their own additional +process servers if they want, implementing the features not requiring +superuser privileges. + +<H4><A HREF="#TOCfilsys" NAME="filsys">Filesystems</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Store based filesystems +ext2fs +ufs +isofs (iso9660, RockRidge, GNU extensions) +fatfs (under development) +Network file systems +nfs +ftpfs +Miscellaneous +hostmux +usermux +tmpfs (under development) +</PRE></TD></TR></TABLE> +<P> +We already talked about translators and the file system service they +provide. Currently, we have translators for the ext2, ufs and iso9660 +filesystems. We also have an nfs client and an ftp filesystem. +The latter is especially intriguing, as it provides transparent access +to ftp servers in the filesystem. Programs can start to move away +from implementing a plethora of network protocols, as the files are +directly available in the filesystem through the standard POSIX file +interface. 
+ + +<H4><A HREF="#TOCdev" NAME="dev">Developing the Hurd</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Over a dozen libraries support the development of new servers. + +For special server types, highly specialized +libraries require only the implementation of a +number of callback functions. + +Use libdiskfs for store based filesystems. +Use libnetfs for network filesystems, also for virtual filesystems. +Use libtrivfs for simple filesystems providing only a single file or directory. +</PRE></TD></TR></TABLE> +<P> +The Hurd server protocols are complex enough to allow for the +implementation of a POSIX compatible system with GNU extensions. +However, a lot of code can be shared by all or at least similar +servers. For example, all storage based filesystems need to be able to +read and write to a store medium split into blocks. The Hurd comes +with several libraries which make it easy to implement new servers. +Also, there are already a lot of examples of different server types in +the Hurd. This makes writing a new server easier. +<P> +libdiskfs is a library that supports writing store based filesystems +like ext2fs or ufs. It is not very useful for filesystems which are +purely virtual, like /proc or files in /dev. +<P> +libnetfs is intended for filesystems which provide a rich directory +hierarchy, but don't use a backing store (for example ftpfs, nfs). +<P> +libtrivfs is intended for filesystems which just provide a single +inode or directory. Most servers which are not intended to provide a +filesystem but other services (like /servers/password) use it to +provide a dummy file, so that file operations on the server's node will +not return errors. But it can also be used to provide meaningful data +in a single file, like a device store or a character device. 
+ +<H4><A HREF="#TOCsto" NAME="sto">Store Abstraction</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Another very useful library is libstore, which is used by all store based filesystems. +It provides a store media abstraction. +A store consists of a store class and a name +(which itself can sometimes contain stores). + +Primitive store classes: +device store like device:hd2, device:hd0s1, device:fd0 +file store like file:/tmp/disk_image +task store like task:PID +zero store like zero:4m (like /dev/zero, of size 4 MB) +</PRE></TD></TR></TABLE> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Composed store classes: +copy store like copy:zero:4m +gunzip/bunzip2 store like gunzip:device:fd0 +concat store like concat:device:hd0s2:device:hd1s5 +ileave store (RAID-0(2)) +remap store like remap:10+20,50+:file:/tmp/blocks +... + +Wanted: A similar abstraction for streams (based on channels), which can be used by +network and character device servers. +</PRE></TD></TR></TABLE> +<P> +<P> +libstore provides a store abstraction, which is used by all store +based filesystems. The store is determined by a type and a name, but +some store types modify another store rather than providing a new +store, and thus stores can be stacked. For example, the device store +type expects a Mach device, but the remap store expects a list of +blocks to pick from another store, like remap:1+:device:hd2, which +would pick all blocks from hd2 but the first one, which is skipped. +Because this functionality is provided in a library, all filesystems +using libstore support many different store kinds, and adding a new +store type is enough to make all store based filesystems support it. 
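<P>
To illustrate what a remap store does with its block list, here is a small Python sketch. It is not libstore code. It assumes, going by the examples above, that each comma-separated entry has the form START+LENGTH and that an omitted LENGTH means "up to the end of the underlying store". The function names parse_remap and remap_blocks are invented for this example.

```python
def parse_remap(spec):
    """Parse a block list such as '10+20,50+' into (start, length) runs.

    A length of None stands for 'to the end of the underlying store'.
    """
    runs = []
    for part in spec.split(","):
        start, _, length = part.partition("+")
        runs.append((int(start), int(length) if length else None))
    return runs


def remap_blocks(spec, store_size):
    """List the underlying block numbers the remapped store exposes, in order."""
    blocks = []
    for start, length in parse_remap(spec):
        end = store_size if length is None else start + length
        # Clamp to the size of the underlying store.
        blocks.extend(range(start, min(end, store_size)))
    return blocks
```

<P>
With an underlying store of five blocks, remap_blocks("1+", 5) yields blocks 1 to 4, which matches the description of remap:1+:device:hd2 skipping only the first block.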
+ +<H4><A HREF="#TOCdeb" NAME="deb">Debian GNU/Hurd</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Goal: +Provide a binary distribution of the Hurd that is easy to install. +Constraints: +Use the same source packages as Debian GNU/Linux. +Use the same infrastructure: +Policy +Archive +Bug tracking system +Release process +Side Goal: +Prepare Debian for the future: +More flexibility in the base system +Identify dependencies on the Linux kernel +</PRE></TD></TR></TABLE> +<P> +The Debian distribution of the GNU Hurd that I started in 1998 is +supposed to become a complete binary distribution of the Hurd that is +easy to install. + +<H4><A HREF="#TOCstabin" NAME="stabin">Status of the Debian GNU/Hurd binary archive</A></H4> +See +<A HREF="http://buildd.debian.org/stats/graph.png">http://buildd.debian.org/stats/graph.png</A> +for the most current version of the statistics. + +<H4><A HREF="#TOCstainf" NAME="stainf">Status of the Debian infrastructure</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Plus: +Source packages can identify build and host OS using dpkg-architecture. + +Minus: +The binary architecture field is insufficient. +The BTS has no architecture tag. +The policy/FHS need (small) Hurd specific extensions. +</PRE></TD></TR></TABLE> +<P> +While good compatibility can be achieved at the source level, +the binary packages cannot always express their relationship +to the available architectures sufficiently. +<P> +For example, the Linux version of makedev is binary-all, where +a binary-all-linux relationship would be more appropriate. +<P> +More work has to be done here to fix the tools. + +<H4><A HREF="#TOCstaarc" NAME="staarc">Status of the Debian Source archive</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Most packages just work. 
+Maintainers are usually responsive and cooperative. +Turtle, the autobuilder, crunches through the whole list right now. +Common pitfalls are POSIX incompatibilities: +Upstream: +Unconditional use of PATH_MAX (MAXPATHLEN), MAXHOSTNAMELEN. +Unguarded use of Linux kernel features. +Use of legacy interfaces (sys_errlist, termio). +Debian: +Unguarded activation of extensions available with Linux. +Low quality patches. +Assuming GNU/Linux in package scripts. +</PRE></TD></TR></TABLE> +<P> +Most packages are POSIX compatible and can be compiled without +changes on the Hurd. The maintainers of the Debian source packages +are usually very kind, responsive and helpful. +<P> +The Turtle autobuilder software (http://turtle.sourceforge.net) +builds the Debian packages on the Hurd automatically. + +<H4><A HREF="#TOCdebide" NAME="debide">Debian GNU/Hurd: Good idea, bad idea?</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Upstream benefits: +Software packages become more portable. +Debian benefits: +Debian becomes more portable. +Maintainers learn about portability and other systems. +Debian gets a lot of public recognition. + +GNU/Hurd benefits: +Large software base. +Great infrastructure. +Nice community to partner with. +</PRE></TD></TR></TABLE> +<P> +The sheet lists the advantages of all groups involved. + +<H4><A HREF="#TOCend" NAME="end">End</A></H4> +<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +Join us at +http://hurd.gnu.org/ +http://www.debian.org/ports/hurd +http://www.hurd-fr.org +</PRE></TD></TR></TABLE> +<P> +List of contacts. + + + + +<P> +<EM>Some of these links are at other web sites not maintained by the +FSF. 
The FSF is not responsible for the content of these other web sites.</EM> + +</TD> +</TR> +</TABLE> + +<HR> + +[ + <A HREF="/software/hurd/hurd-talk.html">English</A> +] + +<HR> + +<P> +Return to <A HREF="/home.html">GNU's home page</A>. +<P> + +Please send FSF &amp; GNU inquiries &amp; questions to + +<A HREF="mailto:gnu@gnu.org"><EM>gnu@gnu.org</EM></A>. +There are also <A HREF="/home.html#ContactInfo">other ways to +contact</A> the FSF. +<P> + +Please send comments on these web pages to + +<A HREF="mailto:web-hurd@gnu.org"><EM>web-hurd@gnu.org</EM></A>, +send other questions to +<A HREF="mailto:gnu@gnu.org"><EM>gnu@gnu.org</EM></A>. +<P> +Copyright (C) 2001 Marcus Brinkmann <A HREF="mailto:marcus@gnu.org">&lt;marcus@gnu.org&gt;</A> +<P> +Verbatim copying and distribution of this entire article is +permitted in any medium, provided this notice is preserved. +<P> +Updated: +<!-- timestamp start --> +$Date$ $Author$ +<!-- timestamp end --> +<HR> +</BODY> +</HTML>