Adding my talk about the Hurd.

author: Marcus Brinkmann <marcus@gnu.org> 2001-09-11 04:51:41 +0000
committer: Marcus Brinkmann <marcus@gnu.org> 2001-09-11 04:51:41 +0000
commit: a3cd3171ac1cd08447a7f300f402f86ea77518d5 (patch)
tree: aa705e15caccef8a32a025817011d466463a8e0d /hurd-talk.html
parent: b0bc71482c3354017ab2d06a406faa28cedebc87 (diff)
1 files changed, 968 insertions, 0 deletions
diff --git a/hurd-talk.html b/hurd-talk.html
new file mode 100644
index 00000000..559a8456
--- /dev/null
+++ b/hurd-talk.html
@@ -0,0 +1,968 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
+	"http://www.w3.org/TR/REC-html40/strict.dtd">
+<HTML>
+<HEAD>
+<TITLE>The GNU Hurd - GNU Project - Free Software Foundation (FSF)</TITLE>
+<LINK REV="made" HREF="mailto:web-hurd@gnu.org">
+<META NAME="keywords" CONTENT="hurd">
+</HEAD>
+<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#1F00FF" ALINK="#FF0000" VLINK="#9900DD">
+<TABLE width="100%" border="0" cellspacing="5" cellpadding="15">
+<TR>
+<TD COLSPAN="2">
+<IMG SRC="/graphics/hurd_sm_mf.jpg" ALT=" [image of the Hurd logo] ">
+[
+  <A HREF="/software/hurd/hurd-talk.html">English</A>
+]
+</TD>
+</TR>
+<TR>
+<TD ALIGN="LEFT" VALIGN="TOP" BGCOLOR="#eeeeee">
+<A HREF="/software/hurd/hurd.html"><STRONG>The&nbsp;GNU&nbsp;Hurd</STRONG></A><BR>
+&nbsp;<BR>
+<A HREF="/software/hurd/docs.html">Documentation</A><BR>
+<A HREF="/software/hurd/install.html">Installation</A><BR>
+<A HREF="/software/hurd/help.html">Getting&nbsp;Help</A><BR>
+<A HREF="/software/hurd/download.html">Download</A><BR>
+<A HREF="/software/hurd/devel.html">Development</A><BR>
+<A HREF="/software/hurd/history.html">History</A>
+</TD>
+<TD ALIGN="LEFT" VALIGN="TOP">
+<HR>
+<H4><A NAME="contents">Table of Contents</A></H4>
+<UL>
+  <LI><A HREF="#int" NAME="TOCint">Introduction</A>
+  <LI><A HREF="#ove" NAME="TOCove">Overview</A>
+  <LI><A HREF="#his" NAME="TOChis">Historicals</A>
+  <LI><A HREF="#ker" NAME="TOCker">Kernel Architectures</A>
+  <LI><A HREF="#mic" NAME="TOCmic">Micro vs Monolithic</A>
+  <LI><A HREF="#sin" NAME="TOCsin">Single Server vs Multi Server</A>
+  <LI><A HREF="#mul" NAME="TOCmul">Multi Server is superior, ...</A>
+  <LI><A HREF="#the" NAME="TOCthe">The Hurd even more so.</A>
+  <LI><A HREF="#mac" NAME="TOCmac">Mach Inter Process Communication</A>
+  <LI><A HREF="#how" NAME="TOChow">How to get a port?</A>
+  <LI><A HREF="#exa" NAME="TOCexa">Example of <SAMP>hurd_file_name_lookup</SAMP></A>
+  <LI><A HREF="#pat" NAME="TOCpat">Pathname resolution example</A>
+  <LI><A HREF="#map" NAME="TOCmap">Mapping the POSIX Interface</A>
+  <LI><A HREF="#filser" NAME="TOCfilser">File System Servers</A>
+  <LI><A HREF="#act" NAME="TOCact">Active vs Passive</A>
+  <LI><A HREF="#aut" NAME="TOCaut">Authentication</A>
+  <LI><A HREF="#ope" NAME="TOCope">Operations on authentication ports</A>
+  <LI><A HREF="#est" NAME="TOCest">Establishing trusted connections</A>
+  <LI><A HREF="#pas" NAME="TOCpas">Password Server</A>
+  <LI><A HREF="#pro" NAME="TOCpro">Process Server</A>
+  <LI><A HREF="#filsys" NAME="TOCfilsys">Filesystems</A>
+  <LI><A HREF="#dev" NAME="TOCdev">Developing the Hurd</A>
+  <LI><A HREF="#sto" NAME="TOCsto">Store Abstraction</A>
+  <LI><A HREF="#deb" NAME="TOCdeb">Debian GNU/Hurd</A>
+  <LI><A HREF="#stabin" NAME="TOCstabin">Status of the Debian GNU/Hurd binary archive</A>
+  <LI><A HREF="#stainf" NAME="TOCstainf">Status of the Debian infrastructure</A>
+  <LI><A HREF="#staarc" NAME="TOCstaarc">Status of the Debian Source archive</A>
+  <LI><A HREF="#debide" NAME="TOCdebide">Debian GNU/Hurd: Good idea, bad idea?</A>
+  <LI><A HREF="#end" NAME="TOCend">End</A>
+</UL>
+<HR>
+<H3>Talk about the Hurd</H3>
+<P>
+This talk about the Hurd was written by Marcus Brinkmann for
+<UL>
+<LI>OSDEM, Brussels, 4. Feb 2001,
+<LI>Frühjahrsfachgespräche, Cologne, 2. Mar 2001 and
+<LI>Libre Software Meeting, Bordeaux, 4. Jul 2001.
+</UL>
+
+<H4><A HREF="#TOCint" NAME="int">Introduction</A></H4>
+<P>
+When we talk about free software, we usually refer to the free
+software licenses.  We also need relief from software patents, so our
+freedom is not restricted by them.  But there is a third type of
+freedom we need, and that's user freedom.
+
+<P>
+Expert users don't take a system as it is.  They like to change the
+configuration, and they want to run the software that works best for
+them.  That includes window managers as well as your favourite text
+editor.  But even on a GNU/Linux system consisting only of free
+software, you can not easily use the filesystem format, network
+protocol or binary format you want without special privileges.  In
+traditional unix systems, user freedom is severely restricted by the
+system administrator.
+
+<P>
+The Hurd removes these restrictions from the user.  It provides a
+user-extensible system framework without giving up POSIX compatibility
+and the unix security model.  Throughout this talk, we will see that
+this brings further advantages besides freedom.
+
+<H4><A HREF="#TOCove" NAME="ove">Overview</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+The Hurd is a POSIX compatible multi-server
+system operating on top of the GNU Mach microkernel.
+
+Topics:
+GNU Mach
+The Hurd
+Development
+Debian GNU/Hurd
+</PRE></TD></TR></TABLE>
+<P>
+The Hurd is a POSIX compatible multi-server system operating on top of
+the GNU Mach Microkernel.
+
+<P>
+I will have to explain what GNU Mach is, so we start with that.  Then
+I will talk about the Hurd's architecture.  After that, I will give a
+short overview on the Hurd libraries.  Finally, I will tell you how
+the Debian project is related to the Hurd.
+
+<H4><A HREF="#TOChis" NAME="his">Historicals</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+1983: Richard Stallman founds the GNU project.
+1988: Decision is made to use Mach 3.0 as the kernel.
+1991: Mach 3.0 is released under compatible license.
+1991: Thomas Bushnell, BSG, founds the Hurd project.
+1994: The Hurd boots the first time.
+1997: Version 0.2 of the Hurd is released.
+
+1998: Debian hurd-i386 archive is created.
+2001: Debian GNU/Hurd snapshot fills three CD images.
+</PRE></TD></TR></TABLE>
+<P>
+When Richard Stallman founded the GNU project in 1983, he wanted to
+write an operating system consisting only of free software.  Very
+soon, a lot of the essential tools were implemented, and released
+under the GPL.  However, one critical piece was missing: The kernel.
+<P>
+After considering several alternatives, it was decided not to write a
+new kernel from scratch, but to start with the Mach microkernel.  This
+was in 1988, and it was not until 1991 that Mach was released under a
+license allowing the GNU project to distribute it as a part of the
+system.
+<P>
+In 1998, I started the Debian GNU/Hurd project, and in 2001 the
+available Hurd packages fill three CD images.
+
+<H4><A HREF="#TOCker" NAME="ker">Kernel Architectures</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Microkernel:
+Enforces resource management (paging, scheduling)
+Manages tasks
+Implements message passing for IPC
+Provides basic hardware support
+
+Monolithic kernel:
+No message passing necessary.
+Rich set of features (filesystems, authentication, network sockets, POSIX interface, ...)
+</PRE></TD></TR></TABLE>
+<P>
+Microkernels were very popular in the scientific world around that
+time.  They don't implement a full operating system, but only the
+infrastructure needed to enable other tasks to implement most
+features.  In contrast, monolithic kernels like Linux contain
+program code of device drivers, network protocols, process management,
+authentication, file systems, POSIX compatible interfaces and much
+more.
+<P>
+So what are the basic facilities a microkernel provides?  In general,
+this is resource management and message passing.  Resource management,
+because the kernel task needs to run in a special privileged mode of
+the processor, to be able to manipulate the memory management unit and
+perform context switches (also to manage interrupts).  Message
+passing, because without a basic communication facility the other
+tasks could not interact to provide the system services. Some
+rudimentary hardware device support is often necessary to bootstrap
+the system.  So the basic jobs of a microkernel are enforcing the
+paging policy (the actual paging can be done by an external pager
+task), scheduling, message passing and probably basic hardware device
+support.
+<P>
+Mach was the obvious choice back then, as it provides a rich set of
+interfaces to get the job done.  Besides a rather brain-dead device
+interface, it provides tasks and threads, a messaging system allowing
+synchronous and asynchronous operation and a complex interface for
+external pagers.  It's certainly not one of the sexiest microkernels
+that exist today, but more like a big old mama.  The GNU project
+maintains its own version of Mach, called GNU Mach, which is based on
+Mach 3.0.  In addition to the features contained in Mach 3.0, the GNU
+version contains many of the Linux 2.0 block device and network card
+drivers.
+<P>
+A complete treatment of the differences between a microkernel and
+monolithic kernel design can not be provided here.  But a couple of
+advantages of a microkernel design are fairly obvious.
+
+<H4><A HREF="#TOCmic" NAME="mic">Micro vs Monolithic</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Microkernel
+Clear cut responsibilities
+Flexibility in operating system design, easier debugging
+More stability (less code to break)
+New features are not added to the kernel
+
+Monolithic kernel
+Intolerance or creeping featuritis
+Danger of spaghetti code
+Small changes can have far reaching side effects
+</PRE></TD></TR></TABLE>
+<P>
+Because the system is split up into several components, clean
+interfaces have to be developed, and the responsibilities of each part
+of the system must be clear.
+<P>
+Once a microkernel is written, it can be used as the base for several
+different operating systems.  Those can even run in parallel which
+makes debugging easier.  When porting, most of the hardware dependent
+code is in the kernel.
+<P>
+Much of the code that doesn't need to run in the special kernel mode
+of the processor is not part of the kernel, so stability increases
+because there is simply less code to break.
+<P>
+New features are not added to the kernel, so there is no need to hold
+the barrier high for new operating system features.
+<P>
+Compare this to a monolithic kernel, where you either suffer from
+creeping featuritis or you are intolerant of new features (we see both
+in the Linux kernel).
+<P>
+Because in a monolithic kernel, all parts of the kernel can access
+all data structures in other parts, it is more likely that short cuts
+are used to avoid the overhead of a clean interface.  This leads to a
+simple speed up of the kernel, but also makes it less comprehensible
+and more error prone.  A small change in one part of the kernel can
+break other, remote parts.
+
+<H4><A HREF="#TOCsin" NAME="sin">Single Server vs Multi Server</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Single Server
+A single task implements the functionality of the operating system.
+Multi Server
+Many tasks cooperate to provide the system's functionality.
+One server provides only a small but well-defined part of the whole system.
+The responsibilities are distributed logically among the servers.
+
+A single-server system is comparable to a monolithic kernel system. It has similar advantages and disadvantages.
+</PRE></TD></TR></TABLE>
+<P>
+There exist a couple of operating systems based on Mach, but they all
+have the same disadvantages as a monolithic kernel, because those
+operating systems are implemented in one single process running on top
+of the kernel.  This process provides all the services a monolithic
+kernel would provide.  This doesn't make a whole lot of sense (the
+only advantage is that you can probably run several of such isolated
+single servers on the same machine).  Those systems are also called
+single-server systems.  The Hurd is the only usable multi-server
+system on top of Mach.  In the Hurd, there are many server programs,
+each one responsible for a unique service provided by the operating
+system.  These servers run as Mach tasks, and communicate using the
+Mach message passing facilities.  Each of them provides only a
+small part of the functionality of the system, but together they build
+up a complete and functional POSIX compatible operating system.
+
+<H4><A HREF="#TOCmul" NAME="mul">Multi Server is superior, ...</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Any multi-server has advantages over single-server:
+
+Clear cut responsibilities
+More stability: If one server dies, all others remain
+Easier development cycle: Testing without reboot (or replacing running servers), debugging with gdb
+Easier to make changes and add new features
+</PRE></TD></TR></TABLE>
+<P>
+Using several servers has many advantages, if done right.  If a file
+system server for a mounted partition crashes, it doesn't take down
+the whole system.  Instead the partition is "unmounted", and you can
+try to start the server again, probably debugging it this time with
+gdb.  The system is less prone to errors in individual components, and
+over-all stability increases.  The functionality of the system can be
+extended by writing and starting new servers dynamically.  (Developing
+these new servers is easier for the reasons just mentioned.)
+<P>
+But even in a multi-server system the barrier between the system and
+the users remains, and special privileges are needed to cross it.  We
+have not achieved user freedom yet.
+
+<H4><A HREF="#TOCthe" NAME="the">The Hurd even more so.</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+The Hurd goes beyond all this, and allows users to write and run their servers, too!
+
+Users can replace system servers dynamically with their own implementations.
+Users can decide what parts of the remainder of the system they want to use.
+Users can extend the functionality of the system.
+No mutual trust necessary to make use of other users' services.
+Security of the system is not harmed by trusting users' services.
+</PRE></TD></TR></TABLE>
+<P>
+To quote Thomas Bushnell, BSG, from his paper
+<A HREF="/software/hurd/hurd-paper.html">``A new strategy towards OS
+design'' (1996)</A>:
+<BLOCKQUOTE>
+The GNU Hurd, by contrast, is designed to make the area of system code
+as limited as possible.  Programs are required to communicate only
+with a few essential parts of the kernel; the rest of the system is
+replaceable dynamically.  Users can use whatever parts of the
+remainder of the system they want, and can easily add components
+themselves for other users to take advantage of.  No mutual trust need
+exist in advance for users to use each other's services, nor does the
+system become vulnerable by trusting the services of arbitrary users.
+</BLOCKQUOTE>
+
+<EM>
+So the Hurd is a set of servers running on top of the Mach
+micro-kernel, providing a POSIX compatible and extensible operating
+system.  What servers are there?  What functionality do they provide,
+and how do they cooperate?
+</EM>
+
+<H4><A HREF="#TOCmac" NAME="mac">Mach Inter Process Communication</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Ports are message queues which can be used as one-way communication channels.
+
+Port rights are receive, send or send-once
+Exactly one receiver
+Potentially many senders
+
+MiG provides remote procedure calls on top of Mach IPC. RPCs look like function calls to the user.
+</PRE></TD></TR></TABLE>
+<P>
+Inter-process communication in Mach is based on the ports concept.  A
+port is a message queue, used as a one-way communication channel.  In
+addition to a port, you need a port right, which can be a send right,
+receive right, or send-once right.  Depending on the port right, you
+are allowed to send messages to the server, receive messages from it,
+or send just one single message.
+<P>
+For every port, there exists exactly one task holding the receive
+right, but there can be no or many senders.  The send-once right is
+useful for clients expecting a response message.  They can give a
+send-once right to the reply port along with the message.  The kernel
+guarantees that at some point, a message will be received on the reply
+port (this can be a notification that the server destroyed the
+send-once right).
+<P>
+You don't need to know much about the format a message takes to be
+able to use the Mach IPC.  The Mach interface generator MiG hides the
+details of composing and sending a message, as well as receiving the
+reply message.  To the user, it just looks like a function call, but
+in truth the message could be sent over a network to a server running
+on a different computer.  The set of remote procedure calls a server
+provides is the public interface of this server.
+
+
+<H4><A HREF="#TOChow" NAME="how">How to get a port?</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Traditional Mach:
+
+Nameserver provides ports to all registered servers.
+The nameserver port itself is provided by Mach.
+Like a phone book: One list.
+
+The Hurd:
+
+The filesystem is used as the server namespace.
+Root directory port is inserted into each task.
+The C library finds other ports with hurd_file_name_lookup, performing a pathname resolution.
+Like a tree of phone books.
+</PRE></TD></TR></TABLE>
+<P>
+So how does one get a port to a server?  You need something like a
+phone book for server ports, or otherwise you can only talk to
+yourself.  In the original Mach system, a special nameserver is
+dedicated to that job.  A task could get a port to the nameserver from
+the Mach kernel and ask it for a port (with send right) to a server
+that registered itself with the nameserver at some earlier time.
+<P>
+In the Hurd, there is no nameserver.  Instead, the filesystem is used
+as the server namespace.  This works because there is always a root
+filesystem in the Hurd (remember that the Hurd is a POSIX compatible
+system); this is an assumption the people who developed Mach couldn't
+make, so they had to choose a different strategy.  You can use the
+function hurd_file_name_lookup, which is part of the C library, to get
+a port to the server belonging to a filename.  Then you can start to
+send messages to the server in the usual way.
+
+<H4><A HREF="#TOCexa" NAME="exa">Example of <SAMP>hurd_file_name_lookup</SAMP></A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+mach_port_t identity;
+mach_port_t pwserver;
+kern_return_t err;
+
+pwserver = hurd_file_name_lookup
+                ("/servers/password");
+
+err = password_check_user (pwserver,
+                           0 /* root */, "supass",
+                           &identity);
+</PRE></TD></TR></TABLE>
+<P>
+As a concrete example, the special filename
+<SAMP>/servers/password</SAMP> can be used to request a port to the
+Hurd password server, which is responsible for checking user provided
+passwords.
+<P>
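+The example first looks up <SAMP>/servers/password</SAMP> to obtain a
+port to the password server.  It then calls the
+<SAMP>password_check_user</SAMP> RPC on that port, passing user id 0
+(root) and the password.  If the password is correct, the server
+returns a new authentication port in <SAMP>identity</SAMP>.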
+
+<H4><A HREF="#TOCpat" NAME="pat">Pathname resolution example</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Task: Lookup /mnt/readme.txt where /mnt has a mounted filesystem.
+
+The C library asks the root filesystem server about /mnt/readme.txt.
+The root filesystem returns a port to the mnt filesystem server (matching /mnt) and the retry name /readme.txt.
+The C library asks the mnt filesystem server about /readme.txt.
+The mnt filesystem server returns a port to itself and records that this port refers to the regular file /readme.txt.
+</PRE></TD></TR></TABLE>
+<P>
+The C library itself does not have a full list of all available
+servers.  Instead pathname resolution is used to traverse through a
+tree of servers.  In fact, filesystems themselves are implemented by
+servers (let us ignore the chicken and egg problem here).  So all the
+C library can do is to ask the root filesystem server about the
+filename provided by the user (assuming that the user wants to resolve
+an absolute path), using the <SAMP>dir_lookup</SAMP> RPC.  If the
+filename refers to a regular file or directory on the filesystem, the
+root filesystem server just returns a port to itself and records that
+this port corresponds to the file or directory in question.  But if a
+prefix of the full path matches the path of a server the root
+filesystem knows about, it returns to the C library a port to this
+server and the remaining part of the pathname that couldn't be
+resolved.  The C library then has to retry and query the other server
+about the remaining path component.  Eventually, the C library will
+either know that the remaining path can't be resolved by the last
+server in the list, or get a valid port to the server in question.
+
+<H4><A HREF="#TOCmap" NAME="map">Mapping the POSIX Interface</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Filedescriptor          Port to server
+                        providing the file
+
+fd = open(name,...)     dir_lookup(..,name,..,&port)
+                        [pathname resolution]
+
+read(fd, ...)           io_read(port, ...)
+
+write(fd, ...)          io_write(port, ...)
+
+fstat(fd, ...)          io_stat(port, ...)
+
+...
+</PRE></TD></TR></TABLE>
+<P>
+It should by now be obvious that the port returned by the server can
+be used to query the file's status, content and other information from
+the server, if good remote procedure calls to do that are defined and
+implemented by it.  This is exactly what happens.  Whenever a file is
+opened using the C library's <SAMP>open()</SAMP> call, the C library
+uses the above pathname resolution to get a port to a server providing
+the file.  Then it wraps a file descriptor around it.  So in the Hurd,
+for every open file descriptor there is a port to a server providing
+this file.  Many other C library calls like <SAMP>read()</SAMP> and
+<SAMP>write()</SAMP> just call a corresponding RPC using the port
+associated with the file descriptor.
+
+<H4><A HREF="#TOCfilser" NAME="filser">File System Servers</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Provide file and directory services for ports (and more).
+These ports are returned by a directory lookup.
+Translate filesystem accesses through their root path (hence the name translator).
+The C library maps the POSIX file and directory interface (and more) to RPCs to the filesystem servers' ports, but also does work on its own.
+Any user can install file system servers on inodes they own.
+</PRE></TD></TR></TABLE>
+<P>
+So we don't have a single phone book listing all servers, but rather a
+tree of servers keeping track of each other.  That's really like
+calling your friend and asking for the phone number of the blond girl
+at the party yesterday.  He might refer you to a friend who hopefully
+knows more about it. Then you have to retry.
+<P>
+This mechanism has huge advantages over a single nameserver.  First,
+note that standard unix permissions on directories can be used to
+restrict access to a server (this requires that the filesystems
+providing those directories behave).  You just have to set the
+permissions of a parent directory accordingly and provide no other way
+to get a server port.
+<P>
+But there are much deeper implications.  Most of all, a pathname never
+directly refers to a file, it refers to a port of a server.  That
+means that providing a regular file with static data is just one of
+the many options the server has to service requests on the file port.
+A server can also create the data dynamically.  For example, a server
+associated with /dev/random can provide new random data on every
+io_read() on the port to it.  A server associated with /dev/fortune
+can provide a new fortune cookie on every open().
+<P>
+While a regular filesystem server will just serve the data as stored
+in a filesystem on disk, there are servers providing purely virtual
+information, or a mixture of both.  It is up to the server to behave
+and provide consistent and useful data on each remote procedure call.
+If it does not, the results may not match the expectations of the user
+and confuse him.
+<P>
+A footnote from the Hurd info manual:
+<BLOCKQUOTE>
+(1) You are lost in a maze of twisty little filesystems, all
+alike....
+</BLOCKQUOTE>
+<P>
+Because a server installed in the filesystem namespace translates all
+filesystem operations that go through its root path, such a server is
+also called an "active translator".  You can install translators using
+the settrans command with the -a option.
+
+<H4><A HREF="#TOCact" NAME="act">Active vs Passive</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Active Translators:
+
+"settrans -a /cdrom /hurd/isofs /dev/hd2"
+Are running filesystem servers.
+Are attached to the root node they translate.
+Run as a normal process.
+Go away with every reboot, or even time out.
+</PRE></TD></TR></TABLE>
+<P>
+Many translator settings remain constant for a long time.  It would be
+very lame to always repeat the same couple of dozen settrans calls
+manually or at boot time.  So the Hurd provides a filesystem extension
+that allows translator settings to be stored inside the filesystem,
+letting the filesystem servers do the work of starting those servers on demand.
+Such translator settings are called "passive translators".  A passive
+translator is really just a command line string stored in an inode of
+the filesystem.  If during a pathname resolution a server encounters
+such a passive translator, and no active translator does exist already
+(for this node), it will use this string to start up a new translator
+for this inode, and then let the C library continue with the path
+resolution as described above.  Passive translators are installed with
+settrans using the -p option (which is already the default).
+
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Passive Translators:
+
+"settrans /mnt /hurd/ext2fs /dev/hd1s1"
+Are stored as command strings into an inode.
+Are used to start a new active translator if there isn't one.
+Startup is transparent to the user.
+Startup happens the first time the server is needed.
+Are permanent across reboots (like file data).
+</PRE></TD></TR></TABLE>
+<P>
+So passive translators also serve as a sort of automounting feature,
+because no manual interaction is required.  The server start up is
+deferred until the service is needed, and it is transparent to the user.
+<P>
+When starting up a passive translator, it will run as a normal process
+with the same user and group id as those of the underlying inode.  Any
+user is allowed to install passive and active translators on inodes
+that he owns.  This way the user can install new servers into the
+global namespace (for example, in his home or tmp directory) and thus
+extend the functionality of the system (recall that servers can
+implement other remote procedure calls besides those used for files and
+directories).  A careful design of the trusted system servers makes
+sure that no permissions leak out.
+<P>
+In addition, users can provide their own implementations of some of
+the system servers instead of the system default.  For example, they can
+use their own exec server to start processes.  The user specific exec
+server could for example start java programs transparently (without
+invoking the interpreter manually).  This is done by setting the
+environment variable EXECSERVERS.  The system's default exec server
+will evaluate this environment variable and forward the RPC to each of
+the servers listed in turn, until some server accepts it and takes
+over.  The system default exec server will only do this if there are
+no security implications.  (XXX There are other ways to start new
+programs than by using the system exec server.  Those are still
+available.)
+<P>
+Let's take a closer look at some of the Hurd servers.  It was already
+mentioned that only few system servers are mandatory for users.  To
+establish your identity within the Hurd system, you have to
+communicate with the trusted system authentication server auth.  To
+put the system administrator in control of the system components,
+the process server does some global bookkeeping.
+<P>
+But even these servers can be ignored.  However, registration with the
+authentication server is the only way to establish your identity
+towards other system servers.  Likewise, only tasks registered as
+processes with the process server can make use of its services.
+
+<H4><A HREF="#TOCaut" NAME="aut">Authentication</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+A user identity is just a port to an auth server. The auth server stores four sets of ids for it:
+effective user ids
+effective group ids
+available user ids
+available group ids
+Basic properties:
+Any of these can be empty.
+A 0 among the user ids identifies the superuser.
+Effective ids are used to check if the user has the permission.
+Available ids can be turned into effective ids on user request.
+</PRE></TD></TR></TABLE>
+<P>
+The Hurd auth server is used to establish the identity of a user for a
+server.  Such an identity (which is just a port to the auth server)
+consists of a set of effective user ids, a set of effective group ids,
+a set of available user ids and a set of available group ids.  Any of
+these sets can be empty.
+
+<H4><A HREF="#TOCope" NAME="ope">Operations on authentication ports</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+The auth server provides the following operations on ports:
+
+Merge the ids of two ports into a new one.
+
+Return a new port containing a subset of the ids in a port.
+
+Create a new port with arbitrary ids (superuser only).
+
+Establish a trusted connection between users and servers.
+</PRE></TD></TR></TABLE>
+<P>
+If you have two identities, you can merge them and request an identity
+consisting of the unions of the sets from the auth server.  You can
+also create a new identity consisting only of subsets of an identity
+you already have.  What you can't do is extend your sets, unless
+you are the superuser which is denoted by having the user id 0.
+
+<H4><A HREF="#TOCest" NAME="est">Establishing trusted connections</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+User provides a rendezvous port to the server (with io_reauthenticate).
+User calls auth_user_authenticate on the authentication port (his identity), passing the rendezvous port.
+Server calls auth_server_authenticate on its authentication port (to a trusted auth server), passing the rendezvous port and the server port.
+If both authentication servers are the same, it can match the rendezvous ports and return the server port to the user and the user ids to the server.
+</PRE></TD></TR></TABLE>
+<P>
+Finally, the auth server can establish the identity of a user for a
+server.  This is done by exchanging a server port and a user identity
+if both match the same rendezvous port.  The server port will be
+returned to the user, while the server is informed about the id sets
+of the user.  The server can then serve or reject subsequent RPCs by
+the user on the server port, based on the identity it received from
+the auth server.
+<P>
+Anyone can write a server conforming to the auth protocol, but of
+course all system servers use a trusted system auth server to
+establish the identity of a user.  If the user is not using the system
+auth server, matching the rendezvous port will fail and no server port
+will be returned to the user.  Because this practically requires all
+programs to use the same auth server, the system auth server is
+minimal in every respect, and additional functionality is moved
+elsewhere, so user freedom is not unnecessarily restricted.
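+<P>
+The matching step can be sketched as follows.  This is a toy,
+single-threaded model in Python under stated assumptions: ToyAuthServer
+and its attributes are hypothetical names, plain Python objects stand in
+for Mach ports, and each call just records its half of the exchange
+instead of waiting for the other party as a real RPC would.

```python
class ToyAuthServer:
    """Pairs a user call and a server call that name the same rendezvous port."""

    def __init__(self):
        self._waiting_users = {}       # rendezvous -> user identity
        self._waiting_servers = {}     # rendezvous -> server port
        self.delivered_to_user = {}    # rendezvous -> server port
        self.delivered_to_server = {}  # rendezvous -> user identity

    def user_authenticate(self, identity, rendezvous):
        self._waiting_users[rendezvous] = identity
        self._match(rendezvous)

    def server_authenticate(self, rendezvous, server_port):
        self._waiting_servers[rendezvous] = server_port
        self._match(rendezvous)

    def _match(self, rendezvous):
        # Only when both halves reached this same auth server is anything
        # exchanged: the ids go to the server, the server port to the user.
        if rendezvous in self._waiting_users and rendezvous in self._waiting_servers:
            self.delivered_to_server[rendezvous] = self._waiting_users.pop(rendezvous)
            self.delivered_to_user[rendezvous] = self._waiting_servers.pop(rendezvous)
```

+<P>
+If the user and the server talk to two different ToyAuthServer
+instances, neither instance ever sees both halves, so no server port is
+delivered.  This mirrors the failure case described above.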
+
+<H4><A HREF="#TOCpas" NAME="pas">Password Server</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+The password server `/servers/password' runs as
+root and returns a new authentication port in
+exchange for a unix password.
+
+The ids corresponding to the authentication
+port match the unix user and group ids.
+
+Support for shadow passwords is implemented here.
+</PRE></TD></TR></TABLE>
+<P>
+The password server sits at /servers/password and runs as root.  It
+can hand out ports to the auth server in exchange for a unix password,
+matching it against the password or shadow file.  Several utilities
+make use of this server, so they don't need to be setuid root.
+
+<H4><A HREF="#TOCpro" NAME="pro">Process Server</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+The superuser must retain control over user tasks, so:
+All mach tasks are associated with a PID in the system default proc server.
+Optionally, user tasks can store:
+Their environment variables.
+Their argument vector.
+A port, which others can request based on the PID (like a nameserver).
+Also implemented in the proc server:
+Sessions and process groups.
+Global configuration not in Mach, like hostname, hostid, system version.
+</PRE></TD></TR></TABLE>
+<P>
+The process server is responsible for some global bookkeeping.  As
+such it has to be trusted and is not replaceable by the user.
+However, a user is not required to use any of its services.  In that
+case the user will not be able to take advantage of the POSIXish
+appearance of the Hurd.
+<P>
+Mach tasks are not as heavyweight as POSIX processes.  For example,
+there is no concept of process groups or sessions in Mach.  The proc
+server fills in the gap.  It provides a PID for all Mach tasks, and
+also stores the argument line, environment variables and other
+information about a process (if the mach tasks provide them, which is
+usually the case if you start a process with the default
+fork()/exec()).  A process can also register a message port with the
+proc server, which can then be requested by anyone.  So the proc
+server also functions as a nameserver using the process id as the
+name.
+<P>
+The proc server also stores some other miscellaneous information not
+provided by Mach, like the hostname, hostid and system version.
+Finally, it provides facilities to group processes and their ports
+together, as well as to convert between pids, process server ports and
+mach task ports.
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+User tasks not registering themselves with proc only have a PID assigned.
+
+Users can run their own proc server in addition
+to the system default, at least for those parts
+of the interface that don't require superuser privileges.
+</PRE></TD></TR></TABLE>
+<P>
+Although the system default proc server can't be avoided (all mach
+tasks spawned by users will get a pid assigned, so the system
+administrator can control them), users can run their own additional
+process servers if they want, implementing the features not requiring
+superuser privileges.
+
+<H4><A HREF="#TOCfilsys" NAME="filsys">Filesystems</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Store based filesystems
+ext2fs
+ufs
+isofs (iso9660, RockRidge, GNU extensions)
+fatfs (under development)
+Network file systems
+nfs
+ftpfs
+Miscellaneous
+hostmux
+usermux
+tmpfs (under development)
+</PRE></TD></TR></TABLE>
+<P>
+We already talked about translators and the file system service they
+provide.  Currently, we have translators for the ext2, ufs and iso9660
+filesystems.  We also have an nfs client and an ftp filesystem.
+The latter is especially intriguing, as it provides transparent access
+to ftp servers in the filesystem.  Programs can start to move away
+from implementing a plethora of network protocols, as the files are
+directly available in the filesystem through the standard POSIX file
+interface.
+
+
+<H4><A HREF="#TOCdev" NAME="dev">Developing the Hurd</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Over a dozen libraries support the development of new servers.
+
+For special server types highly specialized
+libraries require only the implementation of a
+number of callback functions.
+
+Use libdiskfs for store based filesystems.
+Use libnetfs for network filesystems, also for virtual filesystems.
+Use libtrivfs for simple filesystems providing only a single file or directory.
+</PRE></TD></TR></TABLE>
+<P>
+The Hurd server protocols are complex enough to allow for the
+implementation of a POSIX compatible system with GNU extensions.
+However, a lot of code can be shared by all or at least similar
+servers. For example, all storage based filesystems need to be able to
+read and write to a store medium split into blocks. The Hurd comes
+with several libraries which make it easy to implement new servers.
+Also, there are already a lot of examples of different server types in
+the Hurd.  This makes writing a new server easier.
+<P>
+libdiskfs is a library that supports writing store based filesystems
+like ext2fs or ufs. It is not very useful for filesystems which are
+purely virtual, like /proc or files in /dev.
+<P>
+libnetfs is intended for filesystems which provide a rich directory
+hierarchy, but don't use a backing store (for example ftpfs, nfs).
+<P>
+libtrivfs is intended for filesystems which just provide a single
+inode or directory. Most servers which are not intended to provide a
+filesystem but other services (like /servers/password) use it to
+provide a dummy file, so that file operations on the server's node will
+not return errors. But it can also be used to provide meaningful data
+in a single file, like a device store or a character device.
+
+<H4><A HREF="#TOCsto" NAME="sto">Store Abstraction</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Another very useful library is libstore, which is used by all store based filesystems.
+It provides a store media abstraction.
+A store consists of a store class and a name
+(which itself can sometimes contain stores).
+
+Primitive store classes:
+device store like device:hd2, device:hd0s1, device:fd0
+file store like file:/tmp/disk_image
+task store like task:PID
+zero store like zero:4m (like /dev/zero, of size 4 MB)
+</PRE></TD></TR></TABLE>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Composed store classes:
+copy store like copy:zero:4m
+gunzip/bunzip2 store like gunzip:device:fd0
+concat store like concat:device:hd0s2:device:hd1s5
+ileave store (RAID-0(2))
+remap store like remap:10+20,50+:file:/tmp/blocks
+...
+
+Wanted: A similar abstraction for streams (based on channels), which can be used by
+network and character device servers.
+</PRE></TD></TR></TABLE>
+<P>
+<P>
+libstore provides a store abstraction, which is used by all store
+based filesystems. The store is determined by a type and a name, but
+some store types modify another store rather than providing a new
+store, and thus stores can be stacked. For example, the device store
+type expects a Mach device, but the remap store expects a list of
+blocks to pick from another store, like remap:1+:device:hd2, which
+would pick all blocks from hd2 but the first one, which is skipped.
+Because this functionality is provided in a library, all libstore
+using filesystems support many different store kinds, and adding a new
+store type is enough to make all store based filesystems support it.
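+<P>
+To make the remap example concrete, here is a minimal sketch in Python.
+It is not libstore code: parse_ranges and remap are hypothetical
+helpers, a Python list stands in for the underlying store, and the
+range syntax is assumed to be START+LENGTH, with "START+" meaning
+"from START to the end" and a bare START meaning a single block.

```python
def parse_ranges(spec, total_blocks):
    """Turn a spec such as '10+20,50+' into (start, length) pairs."""
    runs = []
    for part in spec.split(","):
        start_text, plus, length_text = part.partition("+")
        start = int(start_text) if start_text else 0
        if not plus:
            length = 1                     # bare START: one block
        elif length_text:
            length = int(length_text)      # START+LENGTH
        else:
            length = total_blocks - start  # START+: through the end
        runs.append((start, length))
    return runs


def remap(spec, blocks):
    """Build a new block list by picking the given runs from `blocks`."""
    picked = []
    for start, length in parse_ranges(spec, len(blocks)):
        picked.extend(blocks[start:start + length])
    return picked
```

+<P>
+With an eight-block list standing in for device:hd2, remap("1+", hd2)
+yields every block except the first, as in the example above.  Because
+the result is again a list of blocks, it could be passed to another
+such function, which is the stacking idea in miniature.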
+
+<H4><A HREF="#TOCdeb" NAME="deb">Debian GNU/Hurd</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Goal:
+Provide a binary distribution of the Hurd that is easy to install.
+Constraints:
+Use the same source packages as Debian GNU/Linux.
+Use the same infrastructure:
+Policy
+Archive
+Bug tracking system
+Release process
+Side Goal:
+Prepare Debian for the future:
+More flexibility in the base system
+Identify dependencies on the Linux kernel
+</PRE></TD></TR></TABLE>
+<P>
+The Debian distribution of the GNU Hurd that I started in 1998 is
+supposed to become a complete binary distribution of the Hurd that is
+easy to install.
+
+<H4><A HREF="#TOCstabin" NAME="stabin">Status of the Debian GNU/Hurd binary archive</A></H4>
+See
+<A HREF="http://buildd.debian.org/stats/graph.png">http://buildd.debian.org/stats/graph.png</A>
+for the most current version of the statistics.
+
+<H4><A HREF="#TOCstainf" NAME="stainf">Status of the Debian infrastructure</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Plus:
+Source packages can identify build and host OS using dpkg-architecture.
+
+Minus:
+The binary architecture field is insufficient.
+The BTS has no architecture tag.
+The policy/FHS need (small) Hurd specific extensions.
+</PRE></TD></TR></TABLE>
+<P>
+While good compatibility can be achieved at the source level,
+the binary packages cannot always express their relationship
+to the available architectures sufficiently.
+<P>
+For example, the Linux version of makedev is binary-all, where
+a binary-all-linux relationship would be more appropriate.
+<P>
+More work has to be done here to fix the tools.
+
+<H4><A HREF="#TOCstaarc" NAME="staarc">Status of the Debian Source archive</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Most packages just work.
+Maintainers are usually responsive and cooperative.
+Turtle, the autobuilder, crunches through the whole list right now.
+Common pitfalls are POSIX incompatibilities:
+Upstream:
+Unconditional use of PATH_MAX (MAXPATHLEN), MAXHOSTNAMELEN.
+Unguarded use of Linux kernel features.
+Use of legacy interfaces (sys_errlist, termio).
+Debian:
+Unguarded activation of extensions available with Linux.
+Low quality patches.
+Assuming GNU/Linux in package scripts.
+</PRE></TD></TR></TABLE>
+<P>
+Most packages are POSIX compatible and can be compiled without
+changes on the Hurd.  The maintainers of the Debian source packages
+are usually very kind, responsive and helpful.
+<P>
+The Turtle autobuilder software (http://turtle.sourceforge.net)
+builds the Debian packages on the Hurd automatically.
+
+<H4><A HREF="#TOCdebide" NAME="debide">Debian GNU/Hurd: Good idea, bad idea?</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Upstream benefits:
+Software packages become more portable.
+Debian benefits:
+Debian becomes more portable.
+Maintainers learn about portability and other systems.
+Debian gets a lot of public recognition.
+
+GNU/Hurd benefits:
+Large software base.
+Great infrastructure.
+Nice community to partner with.
+</PRE></TD></TR></TABLE>
+<P>
+The sheet lists the advantages of all groups involved.
+
+<H4><A HREF="#TOCend" NAME="end">End</A></H4>
+<TABLE BORDER="1" CELLPADDING="5" ALIGN="CENTER" WIDTH="50%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE>
+Join us at
+http://hurd.gnu.org/
+http://www.debian.org/ports/hurd
+http://www.hurd-fr.org
+</PRE></TD></TR></TABLE>
+<P>
+List of contacts.
+
+
+
+
+<P>
+<EM>Some of these links are at other web sites not maintained by the
+FSF. The FSF is not responsible for the content of these other web sites.</EM>
+
+</TD>
+</TR>
+</TABLE>
+
+<HR>
+
+[
+  <A HREF="/software/hurd/hurd-talk.html">English</A>
+]
+
+<HR>
+
+<P>
+Return to <A HREF="/home.html">GNU's home page</A>.
+<P>
+
+Please send FSF &amp; GNU inquiries &amp; questions to
+
+<A HREF="mailto:gnu@gnu.org"><EM>gnu@gnu.org</EM></A>.
+There are also <A HREF="/home.html#ContactInfo">other ways to
+contact</A> the FSF.
+<P>
+
+Please send comments on these web pages to
+
+<A HREF="mailto:web-hurd@gnu.org"><EM>web-hurd@gnu.org</EM></A>,
+send other questions to
+<A HREF="mailto:gnu@gnu.org"><EM>gnu@gnu.org</EM></A>.
+<P>
+Copyright (C) 2001 Marcus Brinkmann <A HREF="mailto:marcus@gnu.org">&lt;marcus@gnu.org&gt;</A>
+<P>
+Verbatim copying and distribution of this entire article is
+permitted in any medium, provided this notice is preserved.
+<P>
+Updated:
+<!-- timestamp start -->
+$Date$ $Author$
+<!-- timestamp end -->
+<HR>
+</BODY>
+</HTML>