author | Thomas Schwinge <tschwinge@gnu.org> | 2008-11-06 11:37:33 +0100 |
---|---|---|
committer | Thomas Schwinge <tschwinge@gnu.org> | 2008-11-06 11:37:33 +0100 |
commit | 24e7f01d6e6322e9d98412076dcf3f4d98196bb0 (patch) | |
tree | 6747bd440175ac9fb30ab30adfa83791df68f326 /hurd | |
parent | 155b3fde577d1f6e06ee1df2204ebcaea7c81935 (diff) |
Integrate auth.html, hurd-paper.html, hurd-talk.html. Move content from devel.html, docs.html.
Diffstat (limited to 'hurd')
-rw-r--r-- | hurd/authentication.mdwn | 2 |
-rw-r--r-- | hurd/critique.mdwn | 4 |
-rw-r--r-- | hurd/documentation.mdwn | 48 |
-rw-r--r-- | hurd/documentation/auth.html | 168 |
-rw-r--r-- | hurd/documentation/hurd-paper.html | 760 |
-rw-r--r-- | hurd/documentation/hurd-talk.html | 1061 |
-rw-r--r-- | hurd/hurd_hacking_guide.mdwn | 16 |
-rw-r--r-- | hurd/ng/position_paper.mdwn | 9 |
-rw-r--r-- | hurd/reference_manual.mdwn | 18 |
-rw-r--r-- | hurd/running/distrib.mdwn | 2 |
-rw-r--r-- | hurd/running/gnu/universal_package_manager.mdwn | 2 |
-rw-r--r-- | hurd/translator.mdwn | 1 |
-rw-r--r-- | hurd/translator/auth.mdwn | 13 |
13 files changed, 2090 insertions, 14 deletions
diff --git a/hurd/authentication.mdwn b/hurd/authentication.mdwn index cbb164c8..14144d8e 100644 --- a/hurd/authentication.mdwn +++ b/hurd/authentication.mdwn @@ -10,7 +10,7 @@ is included in the section entitled UIDs on the Hurd are separate from processes. A process has [[capabilities|capability]] designating so-called UID vectors that -are implemented by an [[auth]] server. This +are implemented by an [[translator/auth]] server. This makes them easily [[virtualizable|virtualization]]. When a process wishes to gain access to a resource provided by a third diff --git a/hurd/critique.mdwn b/hurd/critique.mdwn index 9770138e..dacd7bb8 100644 --- a/hurd/critique.mdwn +++ b/hurd/critique.mdwn @@ -8,8 +8,8 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU_Free_Documentation_License|/fdl]]."]]"""]] -[[NealWalfield]] and [[MarcusBrinkmann]] wrote a paper titled [*A Critique of -the GNU Hurd Multi-Server Operating +Neal Walfield and Marcus Brinkmann wrote a paper titled [*A Critique of +the GNU Hurd Multi-server Operating System*](http://walfield.org/papers/200707-walfield-critique-of-the-GNU-Hurd.pdf). This was published in ACM SIGOPS Operating Systems Review in July 2007. This is sometimes referred to as *the critique*. diff --git a/hurd/documentation.mdwn b/hurd/documentation.mdwn index bb37a8be..a8c3a988 100644 --- a/hurd/documentation.mdwn +++ b/hurd/documentation.mdwn @@ -1,4 +1,5 @@ -[[meta copyright="Copyright © 2008 Free Software Foundation, Inc."]] +[[meta copyright="Copyright © 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 +Free Software Foundation, Inc."]] [[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -8,10 +9,53 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license is included in the section entitled [[GNU_Free_Documentation_License|/fdl]]."]]"""]] +# Introductory Material + * [[What_Is_the_GNU_Hurd]] * [[Advantages]] * [[FAQ]] - * <http://www.gnu.org/software/hurd/docs.html> + * [[*Towards_a_New_Strategy_of_OS_Design*|hurd-paper]], an architectural + overview by Thomas Bushnell, BSG. + + * [[*The_Hurd*|hurd-talk]], a presentation by Marcus Brinkmann. + + * [[*A_Critique_of_the_GNU_Hurd_Multi-server_Operating_System*|critique]], an + analysis of the GNU Hurd on GNU Mach system, written by Neal Walfield and + Marcus Brinkmann. + +## External + + * [*Examining the Legendary HURD + Kernel*](http://www.informit.com/articles/printerfriendly.aspx?p=1180992), + an article by David Chisnall. + + Also covers a bit of GNU's and the Hurd's history, fundamental techniques + applied, comparisons to other systems. + + +# Development + + * *[[The_GNU_Hurd_Reference_Manual|reference_manual]]*. + + * The *[[Hurd_Hacking_Guide]]*, an introduction to GNU Hurd and Mach + programming by Wolfgang Jährling. + + * [*Manually Bootstrapping a + Translator*](http://walfield.org/pub/people/neal/papers/hurd-misc/manual-bootstrap.txt), + a text by Neal Walfield about how to *manually connect the translator to + the filesystem*. + + * [[*The_Authentication_Server*|auth]], the transcript of a talk about the + details of the authentication mechanisms in the Hurd by Wolfgang Jährling. + + * [*The Mach Paging Interface as Used by the + Hurd*](http://lists.gnu.org/archive/html/l4-hurd/2002-06/msg00001.html), a + text by Neal Walfield. + + * In the + [[Position_paper_*Improving_Usability_via_Access_Decomposition_and_Policy*|ng/position_paper]] + Neal Walfield and Marcus Brinkmann give an overview of how a future, + subsequent system may be architected. 
diff --git a/hurd/documentation/auth.html b/hurd/documentation/auth.html new file mode 100644 index 00000000..487fc1fe --- /dev/null +++ b/hurd/documentation/auth.html @@ -0,0 +1,168 @@ +[[meta copyright="Copyright © 2002, 2008 Free Software Foundation, Inc."]] + +[[meta license="Verbatim copying and distribution of this entire article is +permitted in any medium, provided this notice is preserved."]] + +[[meta title="The Authentication Server, the transcript of a talk about the +details of the authentication mechanisms in the Hurd by Wolfgang Jährling"]] + +<H3><A NAME="contents">Table of Contents</A></H3> +<UL> + <LI><A HREF="#intro" NAME="TOCintro">Introduction</A> + <LI><A HREF="#ids" NAME="TOCids">How IDs are represented and used</A> + <LI><A HREF="#posix" NAME="TOCposix">POSIX and beyond</A> + <LI><A HREF="#servers" NAME="TOCservers">Related servers</A> +</UL> +<HR> + +<H3><A HREF="#TOCintro" NAME="intro">Introduction</A></H3> +<P> +In this text, which mostly resembles the talk I gave at Libre Software +Meeting 2002 in Bordeaux, I will describe what the auth server does, +why it is so important and which cool things you can do with it, both +on the programming and the user side. I will also describe related +programs like the password and fakeauth servers. Note that this text +is targeted at programmers who want to understand the auth mechanism +in detail and are already familiar with concepts like Remote Procedure +Calls (RPCs) as well as the way User- and Group-IDs are used in the +POSIX world. + +<P> +The auth server is a very small server, so it makes a useful +example when you want to learn what a server typically looks like. One +reason why it is so small is that the auth interface, which it +implements, consists of only four RPCs. You can find the interface in +hurd/hurd/auth.defs and the server itself in hurd/auth/. 
+ +<H3><A HREF="#TOCids" NAME="ids">How IDs are represented and used</A></H3> +<P> +Each process holds (usually) one port to auth (an auth_t in C source, +which actually is a mach_port_t, of course). The purpose of auth is +to manage User-IDs and Group-IDs, which is the reason why users often +will have no choice but to make use of the system's main auth server, +which does not listen on /servers/auth; instead you inherit a port to +auth from your parent process. Each such port is (internally in the +auth server) associated with a set of effective User- and Group-IDs as +well as a set of available User- and Group-IDs. So we have four sets +of IDs in total. The available IDs can be turned into corresponding +effective IDs at any time. + +<P> +When you send an auth_getids RPC on the port you hold, you will get +information about which IDs are associated with it, so you can figure +out which permissions you have. But how will a server know that you +have these permissions and therefore know which actions (e.g. writing +into file "foo") it is supposed to do on your behalf and which not? +Establishing a trusted connection to a server works as follows: + +<P><OL> +<LI>A user wants a server to know its IDs</LI> +<LI>The user requests a reauthentication from the server</LI> +<LI>In this request the user will include a port</LI> +<LI>Both will hand this port to auth</LI> +<LI>The user uses auth_user_authenticate</LI> +<LI>The server uses auth_server_authenticate</LI> +<LI>The server also passes a new port to auth</LI> +<LI>auth matches these two requests</LI> +<LI>The user gets the new port from auth</LI> +<LI>The server learns about the IDs of the user</LI> +<LI>The user uses the new port for further communication</LI> +</OL> + +<P> +We have different RPCs for users and servers because what we pass and +what we get back differs for them: Users get a port, and servers get +the sets of IDs, and have to specify the port which the user will get. 
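The reauthentication steps listed above can be sketched as a small toy model. This is only an illustration under stated assumptions, not the Hurd's implementation: the class names `IdSets` and `ToyAuth` and the explicit `match` method are hypothetical, ports are modelled as plain integers, and the real RPCs block until the other side arrives instead of being polled.

```python
"""Toy model of the auth rendezvous; hypothetical names, not the Hurd API."""
from dataclasses import dataclass


@dataclass(frozen=True)
class IdSets:
    """The four ID sets that auth associates with one auth port."""
    euids: tuple = ()  # effective User-IDs
    egids: tuple = ()  # effective Group-IDs
    auids: tuple = ()  # available User-IDs
    agids: tuple = ()  # available Group-IDs


class ToyAuth:
    """Pairs a user request and a server request naming the same port."""

    def __init__(self):
        self._user_side = {}    # rendezvous port -> IdSets of the user
        self._server_side = {}  # rendezvous port -> new port from the server

    def user_authenticate(self, user_ids, rendezvous):
        """User half: hand the rendezvous port (an integer here) to auth."""
        self._user_side[rendezvous] = user_ids

    def server_authenticate(self, rendezvous, new_port):
        """Server half: hand over the same rendezvous port plus a new port."""
        self._server_side[rendezvous] = new_port

    def match(self, rendezvous):
        """Return (new port for the user, IDs for the server), or None.

        Matching is a plain integer comparison on the rendezvous port.
        """
        if rendezvous not in self._user_side:
            return None
        if rendezvous not in self._server_side:
            return None
        new_port = self._server_side.pop(rendezvous)
        user_ids = self._user_side.pop(rendezvous)
        return new_port, user_ids


if __name__ == "__main__":
    auth = ToyAuth()
    user = IdSets(euids=(1000,), egids=(100,),
                  auids=(1000, 1000), agids=(100, 100))
    auth.user_authenticate(user, rendezvous=42)
    auth.server_authenticate(rendezvous=42, new_port=7)
    print(auth.match(42))
```

The two result halves mirror the asymmetry of the two RPCs: the user only ever learns the new port, and the server only ever learns the ID sets.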
+ +<P> +It is interesting to note that auth can match the requests by +comparing two integers, because when you get the same port from two +people, you will have the same mach_port_t (which is nothing but an +integer). + +<P> +All of this of course only works if they use the same auth server, +which is why I said often you have no choice other than to use the +one main auth server. But this is no serious restriction, as the auth server has +almost no functionality one might want to replace. In fact, there is +one replacement for the default auth implementation, but more on that +later. + +<H3><A HREF="#TOCposix" NAME="posix">POSIX and beyond</A></H3> +<P> +Before we examine what is possible with this design, let us take a +short look at how the POSIX semantics are implemented on top of this +design. When a program that comes out of POSIX-land asks for its own +effective User- or Group-ID, we will tell it about the first of the +effective IDs. In the same sense, the POSIX real User- or Group-ID is +the first available ID and the POSIX saved User- or Group-ID is the +second available ID, which is why you have the same ID two times in +the available IDs when you log into your GNU/Hurd machine (you can +figure out which IDs you have with the program "ids", which basically +just does an auth_getids RPC). When you lack one of those IDs (for +example when you have no effective Group-ID), a POSIX program asking +for this particular information will get "-1" as the ID. + +<P> +But as you can imagine, we can do more than what POSIX specifies. For +example, we can modify our permissions. This is always done with the +auth_makeauth RPC. In this RPC, you specify the IDs that should be +associated with the new port. All of these IDs must be associated +with either the port the RPC is sent to or one of the additional +ports you can specify; an exception is the superuser root, which is +allowed to create ports that are associated with arbitrary IDs. 
+Hereby you can convert available into effective IDs. + +<P> +This opens the door to a bunch of nice features. For example, we have +the addauth program in the Hurd, which makes it possible to add an ID +to either a single process or a group of processes if you hold the ID or know the +appropriate password, and there is a corresponding rmauth program that +removes an ID. So when you are working on your computer with GNU +Emacs and want to edit a system configuration file, you switch to +Emacs' shell-mode, do an "addauth root", enter the password, edit the +file, and when you are done switch back to shell-mode and do "rmauth +root". These programs have some interesting options, and there are +various other programs, for setting the complete list of IDs (setauth) +and so on. + +<H3><A HREF="#TOCservers" NAME="servers">Related servers</A></H3> +<P> +Finally, I want to explain two servers which are related to auth. The +first is the password server, which listens on /servers/password. If +you pass to it a User- or Group-ID and the correct password for it, it +will return a port to auth to you which is associated with the ID you +passed to it. It can create such a port because it is running as +root. So let us assume you are an FTP server process. You will start +as root, because you want to use port 21 (in this case, "port" does +not refer to a mach_port_t, of course). But then, you can drop all +your permissions so that you run without any ID. This makes it far +less dangerous to communicate with yet unknown users over the +network. But when someone now hands a username and password to you, +you can ask the password server for a new auth port. The password +server will check the data you pass to it, for example by looking into +/etc/shadow, and if it is valid, it will ask the auth server for a new +port. It receives this port from auth and then passes it on to you. +So you have raised your permissions. 
(And for the very curious: Yes, +we are well aware of the differences between this concept and +capabilities; and we also do have some kinds of capabilities in +various parts of the Hurd.) + +<P> +My second example is the fakeauth server. It also implements the auth +protocol. It is the part of the fakeroot implementation that gives a +process the impression that it runs as root, even if it doesn't. So +when the process asks fakeauth about its own IDs, fakeauth will tell +the process that it runs as root. But when the process wants to make +use of the authentication protocol described earlier in this text, +fakeauth will forward the request to its own auth server, which will +usually be the system's main auth server, which will then be able to +match the auth_*_authenticate requests. So what fakeauth does is +act as a proxy auth server that gives a process the impression of running +as root, while not modifying what that process is allowed to do. + +<P> +At this point, I have said at least most of what can be said about the +auth server and the protocol it implements, so I will finish by saying +that it might be an interesting task (for you) to modify some existing +software to take advantage of the features I described here. diff --git a/hurd/documentation/hurd-paper.html b/hurd/documentation/hurd-paper.html new file mode 100644 index 00000000..15d2daec --- /dev/null +++ b/hurd/documentation/hurd-paper.html @@ -0,0 +1,760 @@ +[[meta copyright="Copyright © 1996, 1997, 1998, 2007, 2008 Free Software +Foundation, Inc."]] + +[[meta license="Verbatim copying and distribution of this entire article is +permitted in any medium, provided this notice is preserved."]] + +[[meta title="Towards a New Strategy of OS Design, an architectural overview by +Thomas Bushnell, BSG."]] + + +This article explains why FSF is developing a new operating system named the +Hurd, which will be a foundation of the whole GNU system. 
+The Hurd is built +on top of CMU's Mach 3.0 kernel and uses Mach's virtual memory management and +message-passing facilities. +The GNU C Library will provide the Unix system +call interface, and will call the Hurd for needed services it can't provide +itself. +The design and implementation of the Hurd are being led by Michael +Bushnell, with assistance from Richard Stallman, Roland McGrath, +Jan Brittenson, and others. + +<H2>Part 1: A More Usable Approach to OS Design</H2> +<P> +The fundamental purpose of an operating system (OS) is to enable a variety of +programs to share a single computer efficiently and productively. +This +demands memory protection, preemptively scheduled timesharing, coordinated +access to I/O peripherals, and other services. +In addition, an OS can allow +several users to share a computer. +In this case, efficiency demands services +that protect users from harming each other, enable them to share without +prior arrangement, and mediate access to physical devices. +<P> +On today's computer systems, programmers usually implement these goals +through a large program called the kernel. +Since this program must be +accessible to all user programs, it is the natural place to add functionality +to the system. +Since the only model for process interaction is that of +specific, individual services provided by the kernel, no one creates other +places to add functionality. +As time goes by, more and more is added to the +kernel. +<P> +A traditional system allows users to add components to a kernel only if they +both understand most of it and have a privileged status within the system. +Testing new components requires a much more painful edit-compile-debug cycle +than testing other programs. +It cannot be done while others are using the +system. +Bugs usually cause fatal system crashes, further disrupting others' +use of the system. +The entire kernel is usually non-pageable. 
+(There are +systems with pageable kernels, but deciding what can be paged is difficult +and error prone. +Usually the mechanisms are complex, making them difficult +to use even when adding simple extensions.) +<P> +Because of these restrictions, functionality which properly belongs +<STRONG>behind</STRONG> +the wall of a traditional kernel is usually left out of systems unless it is +absolutely mandatory. +Many good ideas, best done with an open/read/write +interface cannot be implemented because of the problems inherent in the +monolithic nature of a traditional system. +Further, even among those with +the endurance to implement new ideas, only those who are privileged users of +their computers can do so. +The software copyright system darkens the mire by +preventing unlicensed people from even reading the kernel source. +<P> +Some systems have tried to address these difficulties. +Smalltalk-80 and +the Lisp Machine both represented one method of getting around the problem. +System code is not distinguished from user code; all of the system is +accessible to the user and can be changed as need be. +Both systems were +built around languages that facilitated such easy replacement and extension, +and were moderately successful. +But they both were fairly poor at insulating +users and programs from each other, failing one of the principal goals of OS +design. +<P> +Most projects that use the Mach 3.0 kernel carry on the hard-to-change +tradition of OS design. +The internal structure is different, but the same +heavy barrier between user and system remains. +The single-servers, while +fairly easy to construct, inherit all the deficiencies of the monolithic +kernels. +<P> +A multi-server divides the kernel functionality up into logical blocks with +well-defined interfaces. +Properly done, it is easier to make changes and add +functionality. +So most multi-server projects do somewhat better. +Much more +of the system is pageable. +You can debug the system more easily. 
+You can +test new system components without interfering with other users. +But the +wall between user and system remains; no user can cross it without special +privilege. +<P> +The GNU Hurd, by contrast, is designed to make the area of +<STRONG>system</STRONG> +code as +limited as possible. +Programs are required to communicate only with a few +essential parts of the kernel; the rest of the system is replaceable +dynamically. +Users can use whatever parts of the remainder of the system +they want, and can easily add components themselves for other users to take +advantage of. +No mutual trust need exist in advance for users to use each +other's services, nor does the system become vulnerable by trusting the +services of arbitrary users. +<P> +This has been done by identifying those system components which users +<STRONG>must</STRONG> +use in order to communicate with each other. +One of these is responsible for +identifying users' identities and is called the +<DFN> +authentication server. +</DFN> +In +order to establish each other's identities, programs must communicate, each +with an authentication server they trust. +Another component establishes +control over system components by the superuser, provides global bookkeeping +operations, and is called the +<DFN> +process server. +</DFN> +<P> +Not all user programs need to communicate with the process server; it is only +necessary for programs which require its services. +Likewise, the +authentication server is only necessary for programs that wish to communicate +their identity to another. +None of the remaining services carry any special +status; not the network implementation, the filesystems, the program +execution mechanism (including setuid), or any others. + +<H3>The Translator Mechanism</H3> +<P> +The Hurd uses Mach ports primarily as methods for communicating between users +and servers. +(A Mach port is a communication point on a Mach task where +messages are sent and received.) 
Each port implements a particular set of +protocols, representing operations that can be undertaken on the underlying +object represented by the port. +Some of the protocols specified by the Hurd +are the I/O protocol, used for generic I/O operations; the file protocol, +used for filesystem operations; the socket protocol, used for network +operations; and the process protocol, used for manipulating processes et al. +<P> +Most servers are accessed by opening files. +Normally, when you open a file, +you create a port associated with that file that is owned by the server +that owns the directory containing the file. +For example, a disk-based +filesystem will normally serve a large number of ports, each of which +represents an open file or directory. +When a file is opened, the server +creates a new port, associates it with the file, and returns the port to the +calling program. +<P> +However, a file can have a +<DFN>translator</DFN> +associated with it. +In this case, +rather than return its own port which refers to the contents of the file, the +server executes a translator program associated with that file. +This +translator is given a port to the actual contents of the file, and is then +asked to return a port to the original user to complete the open operation. +<P> +This mechanism is used for +<CODE>mount</CODE> +by having a translator associated with +each mount point. +When a program opens the mount point, the translator (in +this case, a program which understands the disk format of the mounted +filesystem) is executed and returns a port to the program. +After the +translator is started, it need not be run again unless it dies; the parent +filesystem retains a port to the translator to use in further requests. +<P> +The owner of a file can associate a translator with it without special +permission. +This means that any program can be specified as a translator. 
+Obviously the system will not work properly if the translator does not +implement the file protocol correctly. +However, the Hurd is constructed so +that the worst possible consequence is an interruptible hang. +<P> +One way to use translators is to access hierarchically structured data using +the file protocol. +For example, all the complexity of the user interface to +the +<CODE>ftp</CODE> +program is removed. +Users need only know that a particular +directory represents FTP and can use all the standard file manipulation +commands (e.g +<CODE>ls</CODE> +or +<CODE>cp</CODE>) +to access the remote system, rather than learning +a new set. +Similarly, a simple translator could ease the complexity of +<CODE>tar</CODE> +or +<CODE>gzip</CODE>. +(Such transparent access would have some added cost, but it would +be convenient.) + +<H3>Generic Services</H3> +<P> +With translators, the filesystem can act as a rendezvous for interfaces which +are not similar to files. +Consider a service which implements some version +of the X protocol, using Mach messages as an underlying transport. +For each +X display, a file can be created with the appropriate program as its +translator. +X clients would open that file. +At that point, few file +operations would be useful (read and write, for example, would be useless), +but new operations ( +<CODE>XCreateWindow</CODE> +or +<CODE>XDrawText</CODE>) +might become meaningful. +In this case, the filesystem protocol is used only to manipulate +characteristics of the node used for the rendezvous. +The node need not +support I/O operations, though it should reply to any such messages with a +<CODE>message_not_understood</CODE> +return code. +<P> +This translator technique is used to contact most of the services in the Hurd +that are not structured like hierarchical filesystems. +For example, the +password server, which hands out authorization tags in exchange for +passwords, is contacted this way. 
+Network protocol servers are also +contacted in this fashion. +Roland McGrath thought up this use of translators. + +<H3>Clever Filesystem Pictures</H3> +<P> +In the Hurd, translators can also be used to present a filesystem-like view +of another part of the filesystem, with some semantics changed. +For example, +it would be nice to have a filesystem that cannot itself be changed, but +nonetheless records changed versions of its files elsewhere. +(This could be +useful for source code management.) +<P> +The Hurd will have a translator which creates a directory which is a +conceptual union of other directories, with collision resolution rules of +various sorts. +This can be used to present a single directory to users that +contains all the programs they would want to execute. +There are other useful +variations on this theme. + +<H3>What The User Can Do</H3> +<P> +No translator gains extra privilege by virtue of being hooked into the +filesystem. +Translators run with the uid of the owner of the file being +translated, and can only be set or changed by that owner. +The I/O and +filesystem protocols are carefully designed to allow their use by mutually +untrusting clients and servers. +Indeed, translators are just ordinary +programs. +The GNU C library has a variety of facilities to make common sorts +of translators easier to write. +<P> +Some translators may need special privileges, such as the password server or +translators which allow setuid execution. +These translators could be run by +anyone, but only if they are set on a root-owned node would they be able to +provide all their services successfully. +This is analogous to letting any +user call the +<CODE>reboot</CODE> +system call, but only honoring it if that user is root. + +<H3>Why This Is So Different</H3> +<P> +What this design provides is completely novel to the Unix world. 
+Until now, +OSs have kept huge portions of their functionality in the realm of system +code, thus preventing its modification and extension except in extreme need. +Users cannot replace parts of the system in their programs no matter how much +easier that would make their task, and system managers are loath to install +random tweaks off the net into their kernels. +<P> +In the Hurd, users can change almost all of the things that are decided for +them in advance by traditional systems. +In combination with the tremendous +control given by the Mach kernel over task address spaces and properties, the +Hurd provides a system in which users will, for the first time, be able to +replace parts of the system they dislike, without disrupting other users. +<P> +Most Mach-based OSs to date have mostly implemented a wider set of the +<STRONG> +same old +</STRONG> +Unix semantics in a new environment. +In contrast, GNU is extending +those semantics to allow users to improve, bypass, or replace them. + + +<H2>Part 2: A Look at Some of the Hurd's Beasts</H2> +<H3>The Authentication Server</H3> +<P> +One of the Hurd's more central servers is the authentication server. +Each +port to this server identifies a user and is associated by this server with +an +<DFN>id block</DFN>. +Each id block contains sets of user and group ids. +Either +set may be empty. +This server is not the same as the password server +referred to above. +<P> +The authentication server exports three services. +First, it provides simple +boolean operations on authentication ports: given two authentication ports, +this server will provide a third port representing the union of the two sets +of uids and gids. +Second, this server allows any user with a uid of zero to +create an arbitrary authentication port. 
+Finally, this server provides RPCs +(Remote Procedure Calls between different programs and possibly different +hosts) which allow mutually untrusting clients and servers to establish their +identities and pass initial information on each other. +This is crucial to +the security of the filesystem and I/O protocols. +<P> +Any user could write a program which implements the authentication protocol; +this does not violate the system's security. +When a service needs to +authenticate a user, it communicates with its trusted authentication server. +If that user is using a different authentication server, the transaction will +fail and the server can refuse to communicate further. +Because, in effect, +this forces all programs on the system to use the same authentication server, +we have designed its interface to make any safe operation possible, and to +include no extraneous operations. +(This is why there is a separate password +server.) +<H3>The Process Server</H3> +<P> +The process server acts as an information categorization repository. +There +are four main services supported by this server. +First, the process server +keeps track of generic host-level information not handled by the Mach kernel. +For example, the hostname, the hostid, and the system version are maintained +by the process server. +Second, this server maintains the Posix notions of +sessions and process groups, to help out programs that wish to use Posix +features. +<P> +Third, the process server maintains a one-to-one mapping between Mach tasks +and Hurd processes. +Every task is assigned a pid. +Processes can register a +message port with this server, which can then be given out to any program +which requests it. +This server makes no attempt to keep these message ports +private, so user programs are expected to implement whatever security they +need themselves. +(The GNU C Library provides convenient functions for all +this.) 
Processes can tell the process server their current `argv' and `envp' +values; this server will then provide, on request, these vectors of arguments +and environment. +This is useful for writing +<CODE>ps</CODE>-like +programs and also +makes it easier to hide or change this information. +None of these features +are mandatory. +Programs are free to disregard all of this and never register +themselves with the process server at all. +They will, however, still have a +pid assigned. +<P> +Finally, the process server implements +<DFN>process collections</DFN>, +which are used +to collect a number of process message ports at the same time. +Also, +facilities are provided for converting between pids, process server ports, +and Mach task ports, while ensuring the security of the ports managed. +<P> +It is important to stress that the process server is optional. +Because of +restrictions in Mach, programs must run as root in order to identify all the +tasks in the system. +But given that, multiple process servers could +co-exist, each with their own clients, giving their own model of the +universe. +Those process server features which do not require root privileges +to be implemented could be done as per-user servers. +The user's hands are +not tied. +<H3>Transparent FTP</H3> +<P> +Transparent FTP is an intriguing idea whose time has come. +The popular +<CODE>ange-ftp</CODE> +package available for GNU Emacs makes access to FTP files +virtually transparent to all the Emacs file manipulation functions. +Transparent FTP does the same thing, but in a system wide fashion. +This +server is not yet written; the details remain to be fleshed out, and will +doubtless change with experience. +<P> +In a BSD kernel, a transparent FTP filesystem would be no harder to write +than in the Hurd. +But mention the idea to a BSD kernel hacker, and the +response is that ``such a thing doesn't belong in the kernel''. +In a sense, +this is correct. 
+It violates all the layering principles of such systems to +place such things in the kernel. +The unfortunate side effect, however, is +that the design methodology (which is based on preventing users from changing +things they don't like) is being used to prevent system designers from making +things better. +(Recent BSD kernels make it possible to write a user program +that provides transparent FTP. +An example is +<CODE>alex</CODE>, +but it needs to run +with full root privileges.) +<P> +In the Hurd, there are no obstacles to doing transparent FTP. +A translator +will be provided for the node +<CODE>/ftp</CODE>. +The contents of +<CODE>/ftp</CODE> +will probably +not be directly listable, though further subdirectories will be. +There will +be a variety of possible formats. +For example, to access files on uunet, one +could +<CODE> +cd /ftp/ftp.uu.net:anonymous:mib@gnu. +</CODE> +Or to access files on a remote +account, one might +<CODE> +cd /ftp/gnu.org:mib:passwd. +</CODE> +Parts of this +command could be left out and the transparent FTP program would read them +from a user's +<CODE>.netrc</CODE> +file. +In the last case, one might just +<CODE> +cd /ftp/gnu.org; +</CODE> +when the rest of the data is already in +<CODE>.netrc</CODE>. +<P> +There is no need to do a +<CODE>cd</CODE> +first--use any file command. +To find out about +RFC 1097 (the Telnet Subliminal Message Option), just type +<CODE> +more /ftp/ftp.uu.net/inet/rfc/rfc1097. +</CODE> +A copy command to a local disk +could be used if the RFC would be read frequently. +<H3>Filesystems</H3> +<P> +Ordinary filesystems are also being implemented. +The initial release of the +Hurd will contain a filesystem upwardly compatible with the BSD 4.4 Fast File +System. +In addition to the ordinary semantics, it will provide means to +record translators, offer thirty-two bit user ids and group ids, and supply a +new id per file, called the +<DFN>author</DFN> +of the file, which can be set by the +owner arbitrarily. 
+In addition, because users in the Hurd can have multiple +uids (or even none), there is an additional set of permission bits providing +access control for +<DFN> +unknown user +</DFN> +(no uids) as distinct from +<DFN> +known but arbitrary user +</DFN> +(some uids: the existing +<DFN>world</DFN> +category of file +permissions). +<P> +The Network File System protocol will be implemented using 4.4 BSD as a +starting point. +A log-structured filesystem will also be implemented using +the same ideas as in Sprite, but probably not the same format. +A GNU network +file protocol may be designed in time, or NFS may be extended to remove its +deficiencies. +There will also be various ``little'' filesystems, such as the +MS-DOS filesystem, to help people move files between GNU and other OSs. + +<H3>Terminals</H3> +<P> +An I/O server will provide the terminal semantics of Posix. +The GNU C +Library has features for keeping track of the controlling terminal and for +arranging to have proper job control signals sent at the proper times, as +well as features for obeying keyboard and hangup signals. +<P> +Programs will be able to insert a terminal driver into communications +channels in a variety of ways. +Servers like +<CODE>rlogind</CODE> +will be able to insert +the terminal protocol onto their network communication port. +Pseudo-terminals will not be necessary, though they will be provided for +backward compatibility with older programs. +No programs in GNU will depend +on them. +<P> +Nothing about a terminal driver is forced upon users. +A terminal driver +allows a user to get at the underlying communications channel easily, to +bypass itself on an as-needed basis or altogether, or to substitute a +different terminal driver-like program. +In the last case, provided the +alternate program implements the necessary interfaces, it will be used by the +C Library exactly as if it were the ordinary terminal driver. 
+<P> +Because of this flexibility, the original terminal driver will not provide +complex line editing features, restricting itself to the behavior found in +Posix and BSD. +In time, there will be a +<CODE>readline</CODE>-based +terminal driver, +which will provide complex line-editing features for those users who want +them. +<P> +The terminal driver will probably not provide good support for the +high-volume, rapid data transmission required by UUCP or SLIP. +Those +programs do not need any of its features. +Instead they will be using the +underlying Mach device ports for terminals, which support moving large +amounts of data efficiently. + +<H3>Executing Programs</H3> +<P> +The implementation of the +<CODE>execve</CODE> +call is spread across three programs. +The +library marshals the argument and environment vectors. +It then sends a +message to the file server that holds the file to be executed. +The file +server checks execute permissions and makes whatever changes it desires in +the exec call. +For example, if the file is marked setuid and the fileserver +has the ability, it will change the user identification of the new image. +The file server also decides if programs which had access to the old task +should continue to have access to the new task. +If the file server is +augmenting permissions, or executing an unreadable image, then the exec needs +to take place in a new Mach task to maintain security. +<P> +After deciding the policy associated with the new image, the filesystem calls +the exec server to load the task. +This server, using the BFD (Binary File +Descriptor) library, loads the image. +BFD supports a large number of object +file formats; almost any supported format will be executable. +This server +also handles scripts starting with +<CODE>#!</CODE>, +running them through the indicated +program. 
+<P> +The standard exec server also looks at the environment of the new image; if +it contains a variable +<CODE>EXECSERVERS</CODE> +then it uses the programs specified +there as exec servers instead of the system default. +(This is, of course, +not done for execs that the file server has requested be kept secure.) +<P> +The new image starts running in the GNU C Library, which sends a message to +the exec server to get the arguments, environment, umask, current directory, +etc. +None of this additional state is special to the file or exec servers; +if programs wish, they can use it in a different manner than the Library. + +<H3>New Processes</H3> +<P> +The +<CODE>fork</CODE> +call is implemented almost entirely in the GNU C Library. +The new +task is created by Mach kernel calls. +The C Library arranges to have its +image inherited properly. +The new task is registered with the process server +(though this is not mandatory). +The C Library provides vectors of functions +to be called at fork time: one vector to be called before the fork, one after +in the parent, and one after in the child. +(These features should not be +used to replace the normal fork-calling sequence; it is intended for +libraries which need to close ports or clean up before a fork occurs.) +The C +library will implement both fork calls specified by the draft Posix.4a (the +proposed standard dealing with the threads extension to the real-time +extension). +<P> +Nothing forces the user to create new tasks this way. +If a program wants to +use almost the normal fork, but with some special characteristics, then it +can do so. +Hooks will be provided by the C Library, or the function can even +be completely replaced. +None of this is possible in a traditional Unix +system. + +<H3>Asynchronous Messages</H3> +<P> +As mentioned above, the process server maintains a +<DFN> +message port +</DFN> +for each +task registered with it. 
+These ports are public, and are used to send +asynchronous messages to the task. +Signals, for example, are sent to the +message port. +The signal message also provides a port as an indication that +the sender should be trusted to send the signal. +The GNU C Library lists a +variety of ports in a table, each of which identifies a set of signals that +can be sent by anyone who possesses that port. +For example, if the user +possesses the task's kernel port, it is allowed to send any signal. +If the +user possesses a special +<DFN> +terminal id +</DFN> +port, it is allowed to send the +keyboard and hangup signals. +Users can add arbitrary new entries into the C +library's signal permissions table. +<P> +When a process's process group changes, the process server will send it a +message indicating the new process group. +In this case, the process server +proves its authority by providing the task's kernel port. +<P> +The C library also has messages to add and delete uids currently used by the +process. +If new uids are sent to the program, the library adds them to its +current set, and then exchanges messages with all the I/O servers it knows +about, proving to them its new authorization. +Similarly, a message can +delete uids. +In the latter case, the caller must provide the process's task +port. +(You can't harm a process by giving it extra permission, but you can +harm it by taking permission away.) The Hurd will provide user programs to +send these messages to processes. +For example, the +<CODE>su</CODE> +command will be able +to cause all the programs in your current login session, to gain a new uid, +rather than spawn a subshell. +<P> +The C library will allow programs to add asynchronous messages they wish to +recognize, as well as prevent recognition of the standard set. +<H3>Making It Look Like Unix</H3> +<P> +The C Library will implement all of the calls from BSD and Posix as well as +some obvious extensions to them. 
+This enables users to replace those calls +they dislike or bypass them entirely, whereas in Unix the calls must be used +``as they come'' with no alternatives possible. +<P> +In some environments binary compatibility will also be supported. +This works +by building a special version of the library which is then loaded somewhere +in the address space of the process. +(For example, on a VAX, it would be +tucked in above the stack.) A feature of Mach, called system call +redirection, is then used to trap Unix system calls and turn them into jumps +into this special version of the library. +(On almost all machines, the cost +of such a redirection is very small; this is a highly optimized path in Mach. +On a 386 it's about two dozen instructions. +This is little worse than a +simple procedure call.) +<P> +Many features of Unix, such as signal masks and vectors, are handled +completely by the library. +This makes such features significantly cheaper +than in Unix. +It is now reasonable to use +<CODE>sigblock</CODE> +extensively to protect +critical sections, rather than seeking out some other, less expensive method. + +<H3>Network Protocols</H3> +<P> +The Hurd will have a library that will make it very easy to port 4.4 BSD +protocol stacks into the Hurd. +This will enable operation, virtually for +free, of all the protocols supported by BSD. +Currently, this includes the +CCITT protocols, the TCP/IP protocols, the Xerox NS protocols, and the ISO +protocols. +<P> +For optimal performance some work would be necessary to take advantage of +Hurd features that provide for very high speed I/O. +For most protocols this +will require some thought, but not too much time. +The Hurd will run the +TCP/IP protocols as efficiently as possible. +<P> +As an interesting example of the flexibility of the Hurd design, consider the +case of IP trailers, used extensively in BSD for performance. 
+While the Hurd +will be willing to send and receive trailers, it will gain fairly little +advantage in doing so because there is no requirement that data be copied and +avoiding copies for page-aligned data is irrelevant. diff --git a/hurd/documentation/hurd-talk.html b/hurd/documentation/hurd-talk.html new file mode 100644 index 00000000..d608e12a --- /dev/null +++ b/hurd/documentation/hurd-talk.html @@ -0,0 +1,1061 @@ +[[meta copyright="Copyright © 2001 Marcus Brinkmann"]] + +[[meta license="Verbatim copying and distribution of this entire article is +permitted in any medium, provided this notice is preserved."]] + +[[meta title="The Hurd, a presentation by Marcus Brinkmann"]] + + +<H4><A NAME="contents">Table of Contents</A></H4> +<UL> + <LI><A HREF="#int" NAME="TOCint">Introduction</A> + <LI><A HREF="#ove" NAME="TOCove">Overview</A> + <LI><A HREF="#his" NAME="TOChis">Historicals</A> + <LI><A HREF="#ker" NAME="TOCker">Kernel Architectures</A> + <LI><A HREF="#mic" NAME="TOCmic">Micro vs Monolithic</A> + <LI><A HREF="#sin" NAME="TOCsin">Single Server vs Multi Server</A> + <LI><A HREF="#mul" NAME="TOCmul">Multi Server is superior, ...</A> + <LI><A HREF="#the" NAME="TOCthe">The Hurd even more so.</A> + <LI><A HREF="#mac" NAME="TOCmac">Mach Inter Process Communication</A> + <LI><A HREF="#how" NAME="TOChow">How to get a port?</A> + <LI><A HREF="#exa" NAME="TOCexa">Example of <SAMP>hurd_file_name_lookup</SAMP></A> + <LI><A HREF="#pat" NAME="TOCpat">Pathname resolution example</A> + <LI><A HREF="#map" NAME="TOCmap">Mapping the POSIX Interface</A> + <LI><A HREF="#filser" NAME="TOCfilser">File System Servers</A> + <LI><A HREF="#act" NAME="TOCact">Active vs Passive</A> + <LI><A HREF="#aut" NAME="TOCaut">Authentication</A> + <LI><A HREF="#ope" NAME="TOCope">Operations on authentication ports</A> + <LI><A HREF="#est" NAME="TOCest">Establishing trusted connections</A> + <LI><A HREF="#pas" NAME="TOCpas">Password Server</A> + <LI><A HREF="#pro" NAME="TOCpro">Process Server</A> 
+ <LI><A HREF="#filsys" NAME="TOCfilsys">Filesystems</A> + <LI><A HREF="#dev" NAME="TOCdev">Developing the Hurd</A> + <LI><A HREF="#sto" NAME="TOCsto">Store Abstraction</A> + <LI><A HREF="#deb" NAME="TOCdeb">Debian GNU/Hurd</A> + <LI><A HREF="#stabin" NAME="TOCstabin">Status of the Debian GNU/Hurd binary archive</A> + <LI><A HREF="#stainf" NAME="TOCstainf">Status of the Debian infrastructure</A> + <LI><A HREF="#staarc" NAME="TOCstaarc">Status of the Debian Source archive</A> + <LI><A HREF="#debide" NAME="TOCdebide">Debian GNU/Hurd: Good idea, bad idea?</A> + <LI><A HREF="#end" NAME="TOCend">End</A> +</UL> +<HR> +<H3>Talk about the Hurd</H3> +<P> +This talk about the Hurd was written by Marcus Brinkmann for +<UL> +<LI>OSDEM, Brussels, 4. Feb 2001, +<LI>Frühjahrsfachgespräche, Cologne, 2. Mar 2001 and +<LI>Libre Software Meeting, Bordeaux, 4. Jul 2001. +</UL> + +<H4><A HREF="#TOCint" NAME="int">Introduction</A></H4> +<P> +When we talk about free software, we usually refer to the free +software licenses. We also need relief from software patents, so our +freedom is not restricted by them. But there is a third type of +freedom we need, and that's user freedom. + +<P> +Expert users don't take a system as it is. They like to change the +configuration, and they want to run the software that works best for +them. That includes window managers as well as your favourite text +editor. But even on a GNU/Linux system consisting only of free +software, you cannot easily use the filesystem format, network +protocol or binary format you want without special privileges. In +traditional unix systems, user freedom is severely restricted by the +system administrator. + +<P> +The Hurd removes these restrictions from the user. It provides a +user-extensible system framework without giving up POSIX compatibility +and the unix security model. Throughout this talk, we will see that +this brings further advantages besides freedom. 
+ +<H4><A HREF="#TOCove" NAME="ove">Overview</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> + +<P> +The Hurd is a POSIX compatible multi-server +system operating on top of the GNU Mach microkernel. + +<P> +Topics: +<UL> + <LI>GNU Mach</LI> + <LI>The Hurd</LI> + <LI>Development</LI> + <LI>Debian GNU/Hurd</LI> +</UL> +</TD></TR></TABLE> + +<P> +The Hurd is a POSIX compatible multi-server system operating on top of +the GNU Mach Microkernel. + +<P> +I will have to explain what GNU Mach is, so we start with that. Then +I will talk about the Hurd's architecture. After that, I will give a +short overview on the Hurd libraries. Finally, I will tell you how +the Debian project is related to the Hurd. + +<H4><A HREF="#TOChis" NAME="his">Historicals</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"> +<TR><TD VALIGN="TOP" ALIGN="LEFT"> +<UL> + <LI>1983: Richard Stallman founds the GNU project.</LI> + <LI>1988: Decision is made to use Mach 3.0 as the kernel.</LI> + <LI>1991: Mach 3.0 is released under compatible license.</LI> + <LI>1991: Thomas Bushnell, BSG, founds the Hurd project.</LI> + <LI>1994: The Hurd boots the first time.</LI> + <LI>1997: Version 0.2 of the Hurd is released.<BR><BR></LI> + <LI>1998: Debian hurd-i386 archive is created.</LI> + <LI>2001: Debian GNU/Hurd snapshot fills three CD images.</LI> +</UL> +</TD></TR></TABLE> + +<P> +When Richard Stallman founded the GNU project in 1983, he wanted to +write an operating system consisting only of free software. Very +soon, a lot of the essential tools were implemented, and released +under the GPL. However, one critical piece was missing: The kernel. +<P> +After considering several alternatives, it was decided not to write a +new kernel from scratch, but to start with the Mach microkernel. This +was in 1988, and it was not before 1991 that Mach was released under a +license allowing the GNU project to distribute it as a part of the +system. 
+<P> +In 1998, I started the Debian GNU/Hurd project, and in 2001 the number +of available GNU/Hurd packages fills three CD images. + +<H4><A HREF="#TOCker" NAME="ker">Kernel Architectures</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Microkernel: +<UL> + <LI>Enforces resource management (paging, scheduling)</LI> + <LI>Manages tasks</LI> + <LI>Implements message passing for IPC</LI> + <LI>Provides basic hardware support</LI> +</UL> +<P> +Monolithic kernel: +<UL> + <LI>No message passing necessary</LI> + <LI>Rich set of features (filesystems, authentication, network + sockets, POSIX interface, ...)</LI> +</UL> +</TD></TR></TABLE> +<P> +Microkernels were very popular in the scientific world around that +time. They don't implement a full operating system, but only the +infrastructure needed to enable other tasks to implement most +features. In contrast, monolithic kernels like Linux contain +program code for device drivers, network protocols, process management, +authentication, file systems, POSIX compatible interfaces and much +more. +<P> +So what are the basic facilities a microkernel provides? In general, +this is resource management and message passing. Resource management, +because the kernel task needs to run in a special privileged mode of +the processor, to be able to manipulate the memory management unit and +perform context switches (also to manage interrupts). Message +passing, because without a basic communication facility the other +tasks could not interact to provide the system services. Some +rudimentary hardware device support is often necessary to bootstrap +the system. So the basic jobs of a microkernel are enforcing the +paging policy (the actual paging can be done by an external pager +task), scheduling, message passing and probably basic hardware device +support. +<P> +Mach was the obvious choice back then, as it provides a rich set of +interfaces to get the job done. 
Besides a rather brain-dead device +interface, it provides tasks and threads, a messaging system allowing +synchronous and asynchronous operation and a complex interface for +external pagers. It's certainly not one of the sexiest microkernels +that exist today, but more like a big old mama. The GNU project +maintains its own version of Mach, called GNU Mach, which is based on +Mach 4.0. In addition to the features contained in Mach 4.0, the GNU +version contains many of the Linux 2.0 block device and network card +drivers. +<P> +A complete treatment of the differences between a microkernel and +monolithic kernel design cannot be provided here. But a couple of +advantages of a microkernel design are fairly obvious. + +<H4><A HREF="#TOCmic" NAME="mic">Micro vs Monolithic</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Microkernel +<UL> + <LI>Clear cut responsibilities</LI> + <LI>Flexibility in operating system design, easier debugging</LI> + <LI>More stability (less code to break)</LI> + <LI>New features are not added to the kernel</LI> +</UL> +<P> +Monolithic kernel +<UL> + <LI>Intolerance or creeping featuritis</LI> + <LI>Danger of spaghetti code</LI> + <LI>Small changes can have far reaching side effects</LI> +</UL> +</TD></TR></TABLE> +<P> +Because the system is split up into several components, clean +interfaces have to be developed, and the responsibilities of each part +of the system must be clear. +<P> +Once a microkernel is written, it can be used as the base for several +different operating systems. Those can even run in parallel, which +makes debugging easier. When porting, most of the hardware-dependent +code is in the kernel. +<P> +Much of the code that doesn't need to run in the special kernel mode +of the processor is not part of the kernel, so stability increases +because there is simply less code to break. 
+<P> +New features are not added to the kernel, so there is no need to hold +the barrier high for new operating system features. +<P> +Compare this to a monolithic kernel, where you either suffer from +creeping featuritis or you are intolerant of new features (we see both +in the Linux kernel). +<P> +Because in a monolithic kernel, all parts of the kernel can access +all data structures in other parts, it is more likely that shortcuts +are used to avoid the overhead of a clean interface. This leads to a +simple speedup of the kernel, but also makes it less comprehensible +and more error prone. A small change in one part of the kernel can +break other, remote parts. + +<H4><A HREF="#TOCsin" NAME="sin">Single Server vs Multi Server</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Single Server +<UL> + <LI>A single task implements the functionality of the operating system.</LI> +</UL> +<P> +Multi Server +<UL> + <LI>Many tasks cooperate to provide the system's functionality.</LI> + <LI>One server provides only a small but well-defined part of the + whole system.</LI> + <LI>The responsibilities are distributed logically among the servers.</LI> +</UL> +<P> +A single-server system is comparable to a monolithic kernel system. It +has similar +advantages and disadvantages. +</TD></TR></TABLE> +<P> +There exist a couple of operating systems based on Mach, but they all +have the same disadvantages as a monolithic kernel, because those +operating systems are implemented in one single process running on top +of the kernel. This process provides all the services a monolithic +kernel would provide. This doesn't make a whole lot of sense (the +only advantage is that you can probably run several of such isolated +single servers on the same machine). Those systems are also called +single-server systems. The Hurd is the only usable multi-server +system on top of Mach. 
In the Hurd, there are many server programs, +each one responsible for a unique service provided by the operating +system. These servers run as Mach tasks, and communicate using the +Mach message passing facilities. Each of them provides only a +small part of the functionality of the system, but together they build +up a complete and functional POSIX compatible operating system. + +<H4><A HREF="#TOCmul" NAME="mul">Multi Server is superior, ...</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Any multi-server has advantages over single-server: +<UL> + <LI>Clear cut responsibilities</LI> + <LI>More stability: If one server dies, all others remain</LI> + <LI>Easier development cycle: Testing without reboot (or replacing + running servers), debugging with gdb</LI> + <LI>Easier to make changes and add new features</LI> +</UL> +</TD></TR></TABLE> +<P> +Using several servers has many advantages, if done right. If a file +system server for a mounted partition crashes, it doesn't take down +the whole system. Instead the partition is "unmounted", and +you can try to start the server again, probably debugging it this time +with gdb. The system is less prone to errors in individual +components, and overall stability increases. The functionality of +the system can be extended by writing and starting new servers +dynamically. (Developing these new servers is easier for the reasons +just mentioned.) +<P> +But even in a multi-server system the barrier between the system and +the users remains, and special privileges are needed to cross it. We +have not achieved user freedom yet. + +<H4><A HREF="#TOCthe" NAME="the">The Hurd even more so.</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +The Hurd goes beyond all this, and allows users to write and run their +servers, too! 
+<UL> + <LI>Users can replace system servers dynamically with their own + implementations.</LI> + <LI>Users can decide what parts of the remainder of the system they + want to use.</LI> + <LI>Users can extend the functionality of the system.</LI> + <LI>No mutual trust necessary to make use of other users' + services.</LI> + <LI>Security of the system is not harmed by trusting users' + services.</LI> +</UL> +</TD></TR></TABLE> +<P> +To quote Thomas Bushnell, BSG, from his paper +[[``Towards_a_New_Strategy_of_OS_design''_(1996)|hurd-paper]]: +<BLOCKQUOTE> +The GNU Hurd, by contrast, is designed to make the area of system code +as limited as possible. Programs are required to communicate only +with a few essential parts of the kernel; the rest of the system is +replaceable dynamically. Users can use whatever parts of the +remainder of the system they want, and can easily add components +themselves for other users to take advantage of. No mutual trust need +exist in advance for users to use each other's services, nor does the +system become vulnerable by trusting the services of arbitrary users. +</BLOCKQUOTE> + +<P> +<EM>So the Hurd is a set of servers running on top of the Mach +micro-kernel, providing a POSIX compatible and extensible operating +system. What servers are there? What functionality do they provide, +and how do they cooperate?</EM> + +<H4><A HREF="#TOCmac" NAME="mac">Mach Inter Process Communication</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Ports are message queues which can be used as one-way communication +channels. +<UL> + <LI>Port rights are receive, send or send-once</LI> + <LI>Exactly one receiver</LI> + <LI>Potentially many senders</LI> +</UL> +<P> +MIG provides remote procedure calls on top of Mach IPC. RPCs look like +function calls to the user. +</TD></TR></TABLE> +<P> +Inter-process communication in Mach is based on the ports concept. 
A +port is a message queue, used as a one-way communication channel. In +addition to a port, you need a port right, which can be a send right, +receive right, or send-once right. Depending on the port right, you +are allowed to send messages to the server, receive messages from it, +or send just one single message. +<P> +For every port, there exists exactly one task holding the receive +right, but there can be no or many senders. The send-once right is +useful for clients expecting a response message. They can give a +send-once right to the reply port along with the message. The kernel +guarantees that at some point, a message will be received on the reply +port (this can be a notification that the server destroyed the +send-once right). +<P> +You don't need to know much about the format a message takes to be +able to use the Mach IPC. The Mach interface generator mig hides the +details of composing and sending a message, as well as receiving the +reply message. To the user, it just looks like a function call, but +in truth the message could be sent over a network to a server running +on a different computer. The set of remote procedure calls a server +provides is the public interface of this server. + + +<H4><A HREF="#TOChow" NAME="how">How to get a port?</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Traditional Mach: +<UL> + <LI>Nameserver provides ports to all registered servers.</LI> + <LI>The nameserver port itself is provided by Mach.</LI> + <LI>Like a phone book: One list.</LI> +</UL> +<P> +The Hurd: +<UL> + <LI>The filesystem is used as the server namespace.</LI> + <LI>Root directory port is inserted into each task.</LI> + <LI>The C library finds other ports with hurd_file_name_lookup, + performing a pathname resolution.</LI> + <LI>Like a tree of phone books.</LI> +</UL> +</TD></TR></TABLE> +<P> +So how does one get a port to a server? 
You need something like a +phone book for server ports, or otherwise you can only talk to +yourself. In the original Mach system, a special nameserver is +dedicated to that job. A task could get a port to the nameserver from +the Mach kernel and ask it for a port (with send right) to a server +that registered itself with the nameserver at some earlier time. +<P> +In the Hurd, there is no nameserver. Instead, the filesystem is used +as the server namespace. This works because there is always a root +filesystem in the Hurd (remember that the Hurd is a POSIX compatible +system); this is an assumption the people who developed Mach couldn't +make, so they had to choose a different strategy. You can use the +function hurd_file_name_lookup, which is part of the C library, to get +a port to the server belonging to a filename. Then you can start to +send messages to the server in the usual way. + +<H4><A HREF="#TOCexa" NAME="exa">Example of <SAMP>hurd_file_name_lookup</SAMP></A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"><PRE> +mach_port_t identity; +mach_port_t pwserver; +kern_return_t err; + +pwserver = hurd_file_name_lookup + ("/servers/password"); + +err = password_check_user (pwserver, + 0 /* root */, "supass", + &identity); +</PRE></TD></TR></TABLE> +<P> +As a concrete example, the special filename +<SAMP>/servers/password</SAMP> can be used to request a port to the +Hurd password server, which is responsible for checking user-provided +passwords. +<P> +(explanation of the example) + +<H4><A HREF="#TOCpat" NAME="pat">Pathname resolution example</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Task: Lookup /mnt/readme.txt where /mnt has a mounted filesystem. 
+<UL> + <LI>The C library asks the root filesystem server about + <SAMP>/mnt/readme.txt</SAMP>.</LI> + <LI>The root filesystem returns a port to the mnt filesystem server + (matching <SAMP>/mnt</SAMP>) and the retry name + <SAMP>/readme.txt</SAMP>.</LI> + <LI>The C library asks the mnt filesystem server about + <SAMP>/readme.txt</SAMP>.</LI> + <LI>The mnt filesystem server returns a port to itself and records + that this port refers to the regular file + <SAMP>/readme.txt</SAMP>.</LI> +</UL> +</TD></TR></TABLE> +<P> +The C library itself does not have a full list of all available +servers. Instead pathname resolution is used to traverse a +tree of servers. In fact, filesystems themselves are implemented by +servers (let us ignore the chicken and egg problem here). So all the +C library can do is to ask the root filesystem server about the +filename provided by the user (assuming that the user wants to resolve +an absolute path), using the <SAMP>dir_lookup</SAMP> RPC. If the +filename refers to a regular file or directory on the filesystem, the +root filesystem server just returns a port to itself and records that +this port corresponds to the file or directory in question. But if a +prefix of the full path matches the path of a server the root +filesystem knows about, it returns to the C library a port to this +server and the remaining part of the pathname that couldn't be +resolved. The C library then has to retry and query the other server +about the remaining path component. Eventually, the C library will +either know that the remaining path can't be resolved by the last +server in the list, or get a valid port to the server in question. 
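<P>
The retry loop described above can be sketched as a small, self-contained toy model in C. This is not the Hurd API: <SAMP>toy_dir_lookup</SAMP>, <SAMP>toy_resolve</SAMP>, the two structures and the sample namespace are all hypothetical names invented for illustration, and a "port" is modelled as a plain pointer to a server record.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names exist in the Hurd or in glibc. */
struct server;

/* Another server attached below a path of this server. */
struct mount {
    const char *prefix;
    const struct server *target;
};

struct server {
    const char *name;
    const char *const *files;    /* NULL-terminated paths served directly */
    const struct mount *mounts;  /* terminated by prefix == NULL, or NULL */
};

/* Stand-in for one dir_lookup round trip.  Returns the server to talk to
   next and stores the unresolved remainder in *retry ("" when finished).
   Returns NULL if this server cannot resolve the path at all. */
const struct server *
toy_dir_lookup(const struct server *srv, const char *path, const char **retry)
{
    for (const struct mount *m = srv->mounts; m && m->prefix; m++) {
        size_t n = strlen(m->prefix);
        if (strncmp(path, m->prefix, n) == 0
            && (path[n] == '/' || path[n] == '\0')) {
            *retry = path + n;      /* hand the rest to the other server */
            return m->target;
        }
    }
    for (const char *const *f = srv->files; f && *f; f++) {
        if (strcmp(path, *f) == 0) {
            *retry = "";            /* fully resolved on this server */
            return srv;
        }
    }
    return NULL;
}

/* The client-side loop: keep asking until no retry name is left.
   *hops counts how many lookups were needed. */
const struct server *
toy_resolve(const struct server *root, const char *path, int *hops)
{
    const struct server *srv = root;

    assert(root != NULL && path != NULL && hops != NULL);
    *hops = 0;
    while (srv != NULL && *path != '\0') {
        const char *retry = NULL;
        srv = toy_dir_lookup(srv, path, &retry);
        if (srv != NULL)
            path = retry;
        ++*hops;
    }
    return srv;
}

/* Sample namespace matching the slide: /mnt is served by a second server. */
static const char *const mnt_files[] = { "/readme.txt", NULL };
static const struct server mnt_fs = { "mnt", mnt_files, NULL };

static const char *const root_files[] = { "/etc/motd", NULL };
static const struct mount root_mounts[] = { { "/mnt", &mnt_fs }, { NULL, NULL } };
static const struct server root_fs = { "root", root_files, root_mounts };
```

With this namespace, resolving <SAMP>/mnt/readme.txt</SAMP> from <SAMP>root_fs</SAMP> takes two lookups and ends at <SAMP>mnt_fs</SAMP>, while <SAMP>/etc/motd</SAMP> is answered by the root server in one. The real RPC carries more state (flags, authentication, retry types) that the sketch leaves out.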
+ +<H4><A HREF="#TOCmap" NAME="map">Mapping the POSIX Interface</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<TABLE BORDER="0" CELLPADDING="10"> +<TR> +<TH>Filedescriptor</TH> +<TH>Port to server providing the file</TH> +</TR><TR> +<TD VALIGN="TOP" ALIGN="LEFT"><SAMP>fd = open(name,...)</SAMP></TD> +<TD VALIGN="TOP" +ALIGN="LEFT"><SAMP>dir_lookup(..,name,..,&port)</SAMP><BR> +[pathname resolution]</TD> +</TR><TR> +<TD VALIGN="TOP" ALIGN="LEFT"><SAMP>read(fd, ...)</SAMP></TD> +<TD VALIGN="TOP" ALIGN="LEFT"><SAMP>io_read(port, ...)</SAMP></TD> +</TR><TR> +<TD VALIGN="TOP" ALIGN="LEFT"><SAMP>write(fd, ...)</SAMP></TD> +<TD VALIGN="TOP" ALIGN="LEFT"><SAMP>io_write(port, ...)</SAMP></TD> +</TR><TR> +<TD VALIGN="TOP" ALIGN="LEFT"><SAMP>fstat(fd, ...)</SAMP></TD> +<TD VALIGN="TOP" ALIGN="LEFT"><SAMP>io_stat(port, ...)</SAMP></TD> +</TR><TR> +<TD VALIGN="TOP" ALIGN="LEFT">...</TD><TD></TD> +</TR> +</TABLE> +</TD></TR></TABLE> +<P> +It should by now be obvious that the port returned by the server can +be used to query the file's status, content and other information from +the server, if suitable remote procedure calls to do that are defined and +implemented by it. This is exactly what happens. Whenever a file is +opened using the C library's <SAMP>open()</SAMP> call, the C library +uses the above pathname resolution to get a port to a server providing +the file. Then it wraps a file descriptor around it. So in the Hurd, +for every open file descriptor there is a port to a server providing +this file. Many other C library calls like <SAMP>read()</SAMP> and +<SAMP>write()</SAMP> just call a corresponding RPC using the port +associated with the file descriptor. 
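<P>
A minimal sketch of that wrapping, again as a toy model in C rather than the real C library code: a descriptor is just an index into a table of ports, and <SAMP>read()</SAMP> forwards to the RPC on the port it finds there. All names here (<SAMP>toy_port</SAMP>, <SAMP>toy_open</SAMP>, <SAMP>toy_read</SAMP>, <SAMP>toy_io_read</SAMP>) are hypothetical, and the "RPC" is an ordinary function call standing in for a message to the server.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a port: it simply names the server-side state of one
   open file.  Keeping the read position on the server side is a choice
   made for this sketch. */
struct toy_port {
    const char *data;    /* contents the "server" provides */
    size_t offset;       /* read position for this open */
};

/* Stand-in for the io_read RPC: copy out at most n bytes. */
size_t toy_io_read(struct toy_port *port, char *buf, size_t n)
{
    size_t left = strlen(port->data) - port->offset;

    if (n > left)
        n = left;
    memcpy(buf, port->data + port->offset, n);
    port->offset += n;
    return n;
}

#define TOY_MAX_FD 8
static struct toy_port *toy_fd_table[TOY_MAX_FD];

/* open(): pathname resolution has already produced a port; wrap it in
   the lowest free descriptor.  Returns -1 if the table is full. */
int toy_open(struct toy_port *port)
{
    assert(port != NULL);
    for (int fd = 0; fd < TOY_MAX_FD; fd++) {
        if (toy_fd_table[fd] == NULL) {
            toy_fd_table[fd] = port;
            return fd;
        }
    }
    return -1;
}

/* read(): find the port behind the descriptor and forward the call. */
long toy_read(int fd, char *buf, size_t n)
{
    if (fd < 0 || fd >= TOY_MAX_FD || toy_fd_table[fd] == NULL)
        return -1;
    return (long) toy_io_read(toy_fd_table[fd], buf, n);
}
```

The point of the sketch is only the indirection: the descriptor carries no file state of its own, so whatever the server behind the port decides to return is what <SAMP>read()</SAMP> delivers.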
+ +<H4><A HREF="#TOCfilser" NAME="filser">File System Servers</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<UL> + <LI>Provide file and directory services for ports (and more).</LI> + <LI>These ports are returned by a directory lookup.</LI> + <LI>Translate filesystem accesses through their root path (hence the + name translator).</LI> + <LI>The C library maps the POSIX file and directory interface (and + more) to RPCs to the filesystem servers ports, but also does work on + its own.</LI> + <LI>Any user can install file system servers on inodes they own.</LI> +</UL> +</TD></TR></TABLE> +<P> +So we don't have a single phone book listing all servers, but rather a +tree of servers keeping track of each other. That's really like +calling your friend and asking for the phone number of the blond girl +at the party yesterday. He might refer you to a friend who hopefully +knows more about it. Then you have to retry. +<P> +This mechanism has huge advantages over a single nameserver. First, +note that standard unix permissions on directories can be used to +restrict access to a server (this requires that the filesystems +providing those directories behave). You just have to set the +permissions of a parent directory accordingly and provide no other way +to get a server port. +<P> +But there are much deeper implications. Most of all, a pathname never +directly refers to a file, it refers to a port of a server. That +means that providing a regular file with static data is just one of +the many options the server has to service requests on the file port. +A server can also create the data dynamically. For example, a server +associated with <SAMP>/dev/random</SAMP> can provide new random data +on every <SAMP>io_read()</SAMP> on the port to it. A server +associated with <SAMP>/dev/fortune</SAMP> can provide a new fortune +cookie on every <SAMP>open()</SAMP>. 
+<P> +While a regular filesystem server will just serve the data as stored +in a filesystem on disk, there are servers providing purely virtual +information, or a mixture of both. It is up to the server to behave +and provide consistent and useful data on each remote procedure call. +If it does not, the results may not match the expectations of the user +and confuse him. +<P> +A footnote from the Hurd info manual: +<BLOCKQUOTE> +(1) You are lost in a maze of twisty little filesystems, all +alike.... +</BLOCKQUOTE> +<P> +Because a server installed in the filesystem namespace translates all +filesystem operations that go through its root path, such a server is +also called an "active translator". You can install translators using +the settrans command with the <SAMP>-a</SAMP> option. + +<H4><A HREF="#TOCact" NAME="act">Active vs Passive</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Active Translators: +<UL> + <LI>"<SAMP>settrans -a /cdrom /hurd/isofs /dev/hd2</SAMP>"</LI> + <LI>Are running filesystem servers.</LI> + <LI>Are attached to the root node they translate.</LI> + <LI>Run as a normal process.</LI> + <LI>Go away with every reboot, or even time out.</LI> +</UL> +</TD></TR></TABLE> +<P> +Many translator settings remain constant for a long time. It would be +very lame to always repeat the same couple of dozen settrans calls +manually or at boot time. So the Hurd provides a filesystem extension +that allows translator settings to be stored inside the filesystem and lets +the filesystem servers do the work of starting those servers on demand. +Such translator settings are called "passive translators". A passive +translator is really just a command line string stored in an inode of +the filesystem. 
If during a pathname resolution a server encounters +such a passive translator, and no active translator exists already +(for this node), it will use this string to start up a new translator +for this inode, and then let the C library continue with the path +resolution as described above. Passive translators are installed with +settrans using the <SAMP>-p</SAMP> option (which is already the +default). + +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Passive Translators: +<UL> + <LI>"<SAMP>settrans /mnt /hurd/ext2fs /dev/hd1s1</SAMP>"</LI> + <LI>Are stored as command strings into an inode.</LI> + <LI>Are used to start a new active translator if there isn't + one.</LI> + <LI>Startup is transparent to the user.</LI> + <LI>Startup happens the first time the server is needed.</LI> + <LI>Are permanent across reboots (like file data).</LI> +</UL> +</TD></TR></TABLE> +<P> +So passive translators also serve as a sort of automounting feature, +because no manual interaction is required. The server startup is +deferred until the service is needed, and it is transparent to the user. +<P> +When a passive translator is started, it will run as a normal process +with the same user and group id as those of the underlying inode. Any +user is allowed to install passive and active translators on inodes +that he owns. This way the user can install new servers into the +global namespace (for example, in his home or tmp directory) and thus +extend the functionality of the system (recall that servers can +implement other remote procedure calls besides those used for files and +directories). A careful design of the trusted system servers makes +sure that no permissions leak out. +<P> +In addition, users can provide their own implementations of some of +the system servers instead of the system default. For example, they can +use their own exec server to start processes. 
The user specific exec +server could for example start Java programs transparently (without +invoking the interpreter manually). This is done by setting the +environment variable <SAMP>EXECSERVERS</SAMP>. The system's default +exec server will evaluate this environment variable and forward the +RPC to each of the servers listed in turn, until some server accepts +it and takes over. The system default exec server will only do this +if there are no security implications. (XXX There are other ways to +start new programs than by using the system exec server. Those are +still available.) +<P> +Let's take a closer look at some of the Hurd servers. It was already +mentioned that only a few system servers are mandatory for users. To +establish your identity within the Hurd system, you have to +communicate with the trusted system authentication server +<SAMP>auth</SAMP>. To put the system administrator in control of +the system components, the process server does some global +bookkeeping. +<P> +But even these servers can be ignored. However, registration with the +authentication server is the only way to establish your identity +towards other system servers. Likewise, only tasks registered as +processes with the process server can make use of its services. + +<H4><A HREF="#TOCaut" NAME="aut">Authentication</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +A user identity is just a port to an auth server. 
The auth server +stores four sets of ids for it: +<UL> + <LI>effective user ids</LI> + <LI>effective group ids</LI> + <LI>available user ids</LI> + <LI>available group ids</LI> +</UL> +<P> +Basic properties: +<UL> + <LI>Any of these can be empty.</LI> + <LI>A 0 among the user ids identifies the superuser.</LI> + <LI>Effective ids are used to check if the user has the + permission.</LI> + <LI>Available ids can be turned into effective ids on user + request.</LI> +</UL> +</TD></TR></TABLE> +<P> +The Hurd auth server is used to establish the identity of a user for a +server. Such an identity (which is just a port to the auth server) +consists of a set of effective user ids, a set of effective group ids, +a set of available user ids and a set of available group ids. Any of +these sets can be empty. + +<H4><A HREF="#TOCope" NAME="ope">Operations on authentication ports</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +The auth server provides the following operations on ports: +<UL> + <LI>Merge the ids of two ports into a new one.</LI> + <LI>Return a new port containing a subset of the ids in a port.</LI> + <LI>Create a new port with arbitrary ids (superuser only).</LI> + <LI>Establish a trusted connection between users and servers.</LI> +</UL> +</TD></TR></TABLE> +<P> +If you have two identities, you can merge them and request an identity +consisting of the unions of the sets from the auth server. You can +also create a new identity consisting only of subsets of an identity +you already have. What you can't do is extend your sets, unless +you are the superuser, which is denoted by having the user id 0. 
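The id-set operations described above can be sketched with plain Python sets. This is a toy model, not the auth server's interface: `Identity`, `merge` and `derive` are hypothetical names, and two details are assumptions of this sketch, namely that a 0 in either the effective or the available user ids counts as superuser, and that a derived identity may draw its ids from the union of the holder's effective and available sets (which is how available ids become effective).

```python
from dataclasses import dataclass

EMPTY = frozenset()


@dataclass(frozen=True)
class Identity:
    """The four id sets the auth server stores for one authentication port."""
    eff_uids: frozenset = EMPTY
    eff_gids: frozenset = EMPTY
    avail_uids: frozenset = EMPTY
    avail_gids: frozenset = EMPTY

    def is_superuser(self):
        # Assumption: a 0 in either user id set identifies the superuser.
        return 0 in (self.eff_uids | self.avail_uids)


def merge(a, b):
    """New identity made of the set-wise unions of two identities."""
    return Identity(a.eff_uids | b.eff_uids, a.eff_gids | b.eff_gids,
                    a.avail_uids | b.avail_uids, a.avail_gids | b.avail_gids)


def derive(holder, eff_uids=(), eff_gids=(), avail_uids=(), avail_gids=()):
    """New identity from ids the holder already has; arbitrary ids for root."""
    wanted = Identity(frozenset(eff_uids), frozenset(eff_gids),
                      frozenset(avail_uids), frozenset(avail_gids))
    if not holder.is_superuser():
        held_uids = holder.eff_uids | holder.avail_uids
        held_gids = holder.eff_gids | holder.avail_gids
        if not (wanted.eff_uids | wanted.avail_uids) <= held_uids:
            raise PermissionError("user ids not held by the requester")
        if not (wanted.eff_gids | wanted.avail_gids) <= held_gids:
            raise PermissionError("group ids not held by the requester")
    return wanted


alice = Identity(eff_uids=frozenset({1000}), avail_gids=frozenset({27}))
bob = Identity(eff_uids=frozenset({1001}))
root = Identity(eff_uids=frozenset({0}))
```

With these definitions, `merge(alice, bob)` holds both user ids, `derive(alice, eff_gids={27})` turns an available group id into an effective one, and `derive(alice, eff_uids={0})` raises `PermissionError` because only `root` may ask for ids it does not hold.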
+ +<H4><A HREF="#TOCest" NAME="est">Establishing trusted connections</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<UL> + <LI>User provides a rendezvous port to the server (with + <SAMP>io_reauthenticate</SAMP>).</LI> + <LI>User calls <SAMP>auth_user_authenticate</SAMP> on the + authentication port (his identity), passing the rendezvous port.</LI> + <LI>Server calls <SAMP>auth_server_authenticate</SAMP> on its + authentication port (to a trusted auth server), passing the + rendezvous port and the server port.</LI> + <LI>If both authentication servers are the same, it can match the + rendezvous ports and return the server port to the user and the user + ids to the server.</LI> +</UL> +</TD></TR></TABLE> +<P> +Finally, the auth server can establish the identity of a user for a +server. This is done by exchanging a server port and a user identity +if both match the same rendezvous port. The server port will be +returned to the user, while the server is informed about the id sets +of the user. The server can then serve or reject subsequent RPCs by +the user on the server port, based on the identity it received from +the auth server. +<P> +Anyone can write a server conforming to the auth protocol, but of +course all system servers use a trusted system auth server to +establish the identity of a user. If the user is not using the system +auth server, matching the rendezvous port will fail and no server port +will be returned to the user. Because this practically requires all +programs to use the same auth server, the system auth server is +minimal in every respect, and additional functionality is moved +elsewhere, so user freedom is not unnecessarily restricted. 
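The rendezvous matching described above can be sketched as follows. This is a simplified model under stated assumptions: `AuthServer`, `complete` and `handshake` are hypothetical names, the two method names only echo `auth_user_authenticate` and `auth_server_authenticate`, and where the real RPCs would block until both sides have arrived, this sketch records each side and matches them in a separate step.

```python
class AuthServer:
    """Toy auth server that pairs users and servers by rendezvous port."""

    def __init__(self, name):
        self.name = name
        self._users = {}     # rendezvous -> id sets offered by a user
        self._servers = {}   # rendezvous -> port offered by a server

    def user_authenticate(self, ids, rendezvous):
        self._users[rendezvous] = ids

    def server_authenticate(self, rendezvous, server_port):
        self._servers[rendezvous] = server_port

    def complete(self, rendezvous):
        """Return (port for the user, ids for the server), or None if unmatched."""
        if rendezvous in self._users and rendezvous in self._servers:
            return self._servers.pop(rendezvous), self._users.pop(rendezvous)
        return None


def handshake(user_auth, server_auth, ids, server_port="server-port"):
    """Run one authentication attempt; both sides name the same rendezvous."""
    rendezvous = object()  # stands in for the rendezvous port
    user_auth.user_authenticate(ids, rendezvous)
    server_auth.server_authenticate(rendezvous, server_port)
    return server_auth.complete(rendezvous)


system_auth = AuthServer("system")
private_auth = AuthServer("private")
```

When user and server both talk to `system_auth`, the handshake yields the server port and the user's ids. When the user talks to `private_auth` instead, the trusted server never sees the user's half of the rendezvous, so the result is `None` and no server port is handed out.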
+ +<H4><A HREF="#TOCpas" NAME="pas">Password Server</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +The password server <SAMP>/servers/password</SAMP> runs as root and +returns a new authentication port in exchange for a unix password. +<P> +The ids corresponding to the authentication port match the unix user +and group ids. +<P> +Support for shadow passwords is implemented here. +</TD></TR></TABLE> +<P> +The password server sits at <SAMP>/servers/password</SAMP> and runs as +root. It can hand out ports to the auth server in exchange for a unix +password, matching it against the password or shadow file. Several +utilities make use of this server, so they don't need to be setuid +root. + +<H4><A HREF="#TOCpro" NAME="pro">Process Server</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +The superuser must retain control over user tasks, so: +<UL> + <LI>All Mach tasks are associated with a PID in the system default + proc server.</LI> +</UL> +<P> +Optionally, user tasks can store: +<UL> + <LI>Their environment variables.</LI> + <LI>Their argument vector.</LI> + <LI>A port, which others can request based on the PID (like a + nameserver).</LI> +</UL> +<P> +Also implemented in the proc server: +<UL> + <LI>Sessions and process groups.</LI> + <LI>Global configuration not in Mach, like hostname, hostid, system + version.</LI> +</UL> +</TD></TR></TABLE> +<P> +The process server is responsible for some global bookkeeping. As +such it has to be trusted and is not replaceable by the user. +However, a user is not required to use any of its services. In that +case the user will not be able to take advantage of the POSIXish +appearance of the Hurd. +<P> +Mach tasks are not as heavy as POSIX processes. For example, +there is no concept of process groups or sessions in Mach. The proc +server fills in the gap. 
It provides a PID for all Mach tasks, and +also stores the argument line, environment variables and other +information about a process (if the Mach tasks provide them, which is +usually the case if you start a process with the default +<SAMP>fork()</SAMP>/<SAMP>exec()</SAMP>). A process can also register +a message port with the proc server, which can then be requested by +anyone. So the proc server also functions as a nameserver using the +process id as the name. +<P> +The proc server also stores some other miscellaneous information not +provided by Mach, like the hostname, hostid and system version. +Finally, it provides facilities to group processes and their ports +together, as well as to convert between pids, process server ports and +Mach task ports. +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +User tasks not registering themselves with proc only have a PID assigned. +<P> +Users can run their own proc server in addition to the system default, +at least for those parts of the interface that don't require superuser +privileges. +</TD></TR></TABLE> +<P> +Although the system default proc server can't be avoided (all Mach +tasks spawned by users will get a pid assigned, so the system +administrator can control them), users can run their own additional +process servers if they want, implementing the features not requiring +superuser privileges. 
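The bookkeeping role of the proc server can be pictured with a short sketch. It is only a toy model: `ProcServer` and all of its method names are hypothetical and are not the actual proc RPC names, and strings stand in for Mach task and message ports.

```python
import itertools


class ProcServer:
    """Toy process server: every task gets a PID, the rest is optional."""

    def __init__(self):
        self._pids = itertools.count(1)
        self._pid_of = {}    # task -> pid
        self._task_of = {}   # pid -> task
        self._info = {}      # pid -> optional data supplied by the task

    def task_created(self, task):
        """Assign a PID to a new task, whether or not it ever registers."""
        pid = next(self._pids)
        self._pid_of[task] = pid
        self._task_of[pid] = task
        self._info[pid] = {"argv": None, "env": None, "msgport": None}
        return pid

    def register(self, pid, argv=None, env=None, msgport=None):
        """Optional step: a task stores its arguments, environment and port."""
        self._info[pid].update(argv=argv, env=env, msgport=msgport)

    def task2pid(self, task):
        return self._pid_of[task]

    def pid2task(self, pid):
        return self._task_of[pid]

    def get_msgport(self, pid):
        """Nameserver role: look up a message port using the PID as the name."""
        return self._info[pid]["msgport"]


proc = ProcServer()
shell_pid = proc.task_created("task-shell")
anon_pid = proc.task_created("task-anonymous")
proc.register(shell_pid, argv=["sh"], env={"HOME": "/root"}, msgport="port-shell")
```

Here the second task never registers, so it has a PID and nothing else, which matches the case of user tasks that do not register themselves with proc.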
+ +<H4><A HREF="#TOCfilsys" NAME="filsys">Filesystems</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Store based filesystems +<UL> + <LI><SAMP>ext2fs</SAMP></LI> + <LI><SAMP>ufs</SAMP></LI> + <LI><SAMP>isofs</SAMP> (iso9660, RockRidge, GNU extensions)</LI> + <LI><SAMP>fatfs</SAMP> (under development)</LI> +</UL> +<P> +Network file systems +<UL> + <LI><SAMP>nfs</SAMP></LI> + <LI><SAMP>ftpfs</SAMP></LI> +</UL> +<P> +Miscellaneous +<UL> + <LI><SAMP>hostmux</SAMP></LI> + <LI><SAMP>usermux</SAMP></LI> + <LI><SAMP>tmpfs</SAMP> (under development)</LI> +</UL> +</TD></TR></TABLE> +<P> +We already talked about translators and the file system service they +provide. Currently, we have translators for the ext2, ufs and iso9660 +filesystems. We also have an nfs client and an ftp filesystem. +Especially the latter is intriguing, as it provides transparent access +to ftp servers in the filesystem. Programs can start to move away +from implementing a plethora of network protocols, as the files are +directly available in the filesystem through the standard POSIX file +interface. + + +<H4><A HREF="#TOCdev" NAME="dev">Developing the Hurd</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Over a dozen libraries support the development of new servers. +<P> +For special server types highly specialized +libraries require only the implementation of a +number of callback functions. +<UL> + <LI>Use <SAMP>libdiskfs</SAMP> for store based filesystems.</LI> + <LI>Use <SAMP>libnetfs</SAMP> for network filesystems, also for + virtual filesystems.</LI> + <LI>Use <SAMP>libtrivfs</SAMP> for simple filesystems providing only + a single file or directory.</LI> +</UL> +</TD></TR></TABLE> +<P> +The Hurd server protocols are complex enough to allow for the +implementation of a POSIX compatible system with GNU extensions. +However, a lot of code can be shared by all or at least similar +servers. 
For example, all storage based filesystems need to be able to +read and write to a store medium split into blocks. The Hurd comes +with several libraries which make it easy to implement new servers. +Also, there are already a lot of examples of different server types in +the Hurd. This makes writing a new server easier. +<P> +<SAMP>libdiskfs</SAMP> is a library that supports writing store based +filesystems like ext2fs or ufs. It is not very useful for filesystems +which are purely virtual, like <SAMP>/proc</SAMP> or files in +<SAMP>/dev</SAMP>. +<P> +<SAMP>libnetfs</SAMP> is intended for filesystems which provide a rich +directory hierarchy, but don't use a backing store (for example ftpfs, +nfs). +<P> +<SAMP>libtrivfs</SAMP> is intended for filesystems which just provide +a single inode or directory. Most servers which are not intended to +provide a filesystem but other services (like +<SAMP>/servers/password</SAMP>) use it to provide a dummy file, so +that file operations on the server's node will not return errors. But +it can also be used to provide meaningful data in a single file, like +a device store or a character device. + +<H4><A HREF="#TOCsto" NAME="sto">Store Abstraction</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Another very useful library is libstore, which is used by all store +based filesystems. It provides a store media abstraction. A store +consists of a store class and a name (which itself can sometimes +contain stores). 
+<P> +Primitive store classes: +<UL> + <LI>device store like device:hd2, device:hd0s1, device:fd0</LI> + <LI>file store like file:/tmp/disk_image</LI> + <LI>task store like task:PID</LI> + <LI>zero store like zero:4m (like /dev/zero, of size 4 MB)</LI> +</UL> +</TD></TR></TABLE> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Composed store classes: +<UL> + <LI>copy store like copy:zero:4m</LI> + <LI>gunzip/bunzip2 store like gunzip:device:fd0</LI> + <LI>concat store like concat:device:hd0s2:device:hd1s5</LI> + <LI>ileave store (RAID-0(2))</LI> + <LI>remap store like remap:10+20,50+:file:/tmp/blocks</LI> + <LI>...</LI> +</UL> +<P> +Wanted: A similar abstraction for streams (based on channels), which +can be used by network and character device servers. +</TD></TR></TABLE> +<P> +<SAMP>libstore</SAMP> provides a store abstraction, which is used by +all store based filesystems. The store is determined by a type and a +name, but some store types modify another store rather than providing +a new store, and thus stores can be stacked. For example, the device +store type expects a Mach device, but the remap store expects a list +of blocks to pick from another store, like remap:1+:device:hd2, which +would pick all blocks from hd2 but the first one, which is skipped. +Because this functionality is provided in a library, all libstore +using filesystems support many different store kinds, and adding a new +store type is enough to make all store based filesystems support it. 
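Store stacking can be illustrated with a minimal sketch. It is not libstore's API: `MemoryStore`, `RemapStore` and `ConcatStore` are hypothetical classes, a Python list stands in for a device, and the remap syntax handled here covers only the `start+length` and `start+` forms as read from the examples above.

```python
class MemoryStore:
    """Primitive store: blocks held in a list (stand-in for a device)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)

    def __len__(self):
        return len(self.blocks)

    def read(self, index):
        return self.blocks[index]


class RemapStore:
    """Composed store: exposes selected runs of blocks from another store."""

    def __init__(self, spec, base):
        self.base = base
        self.index = []
        for run in spec.split(","):
            start, plus, length = run.partition("+")
            if not plus:
                raise ValueError("expected start+length or start+: " + run)
            first = int(start)
            # "start+" with no length means: everything up to the end.
            last = first + int(length) if length else len(base)
            self.index.extend(range(first, last))

    def __len__(self):
        return len(self.index)

    def read(self, index):
        return self.base.read(self.index[index])


class ConcatStore:
    """Composed store: several stores laid end to end."""

    def __init__(self, *parts):
        self.parts = parts

    def __len__(self):
        return sum(len(part) for part in self.parts)

    def read(self, index):
        for part in self.parts:
            if index < len(part):
                return part.read(index)
            index -= len(part)
        raise IndexError(index)


def read_all(store):
    return [store.read(i) for i in range(len(store))]


disk = MemoryStore(["b0", "b1", "b2", "b3"])
skip_first = RemapStore("1+", disk)  # models remap:1+:device:hd2
```

Because every class offers the same `read` and length interface, a filesystem written against that interface works unchanged on a primitive store, a remapped store, or any stack of them, which is the design benefit the paragraph above attributes to libstore.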
+ +<H4><A HREF="#TOCdeb" NAME="deb">Debian GNU/Hurd</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Goal: +<UL> + <LI>Provide a binary distribution of the Hurd that is easy to + install.</LI> +</UL> +<P> +Constraints: +<UL> + <LI>Use the same source packages as Debian GNU/Linux.</LI> + <LI>Use the same infrastructure: + <UL> + <LI>Policy</LI> + <LI>Archive</LI> + <LI>Bug tracking system</LI> + <LI>Release process</LI> + </UL></LI> +</UL> +<P> +Side Goal: +<UL> + <LI>Prepare Debian for the future: + <UL> + <LI>More flexibility in the base system</LI> + <LI>Identify dependencies on the Linux kernel</LI> + </UL></LI> +</UL> +</TD></TR></TABLE> +<P> +The Debian distribution of the GNU Hurd that I started in 1998 is +supposed to become a complete binary distribution of the Hurd that is +easy to install. + +<H4><A HREF="#TOCstabin" NAME="stabin">Status of the Debian GNU/Hurd binary archive</A></H4> +<P> +See +<A HREF="http://buildd.debian.org/stats/graph.png">http://buildd.debian.org/stats/graph.png</A> +for the most current version of the statistics. + +<H4><A HREF="#TOCstainf" NAME="stainf">Status of the Debian infrastructure</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Plus: +<UL> + <LI>Source packages can identify build and host OS using + dpkg-architecture.</LI> +</UL> +<P> +Minus: +<UL> + <LI>The binary architecture field is insufficient.</LI> + <LI>The BTS has no architecture tag.</LI> + <LI>The policy/FHS need (small) Hurd specific extensions.</LI> +</UL> +</TD></TR></TABLE> +<P> +While good compatibility can be achieved at the source level, +the binary packages cannot always express their relationship +to the available architectures sufficiently. +<P> +For example, the Linux version of makedev is binary-all, where +a binary-all-linux relationship would be more appropriate. +<P> +More work has to be done here to fix the tools. 
+ +<H4><A HREF="#TOCstaarc" NAME="staarc">Status of the Debian Source archive</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<UL> + <LI>Most packages just work.</LI> + <LI>Maintainers are usually responsive and cooperative.</LI> + <LI>Turtle, the autobuilder, crunches through the whole list right + now.</LI> +</UL> +<P> +Common pitfalls are POSIX incompatibilities: +<UL> + <LI>Upstream: + <UL> + <LI>Unconditional use of <SAMP>PATH_MAX</SAMP> + (<SAMP>MAXPATHLEN</SAMP>), <SAMP>MAXHOSTNAMELEN</SAMP>.</LI> + <LI>Unguarded use of Linux kernel features.</LI> + <LI>Use of legacy interfaces (<SAMP>sys_errlist</SAMP>, + <SAMP>termio</SAMP>).</LI> + </UL></LI> + <LI>Debian: + <UL> + <LI>Unguarded activation of extensions available with Linux.</LI> + <LI>Low quality patches.</LI> + <LI>Assuming GNU/Linux in package scripts.</LI> + </UL></LI> +</UL> +</TD></TR></TABLE> +<P> +Most packages are POSIX compatible and can be compiled without +changes on the Hurd. The maintainers of the Debian source packages +are usually very kind, responsive and helpful. +<P> +The Turtle autobuilder software (<A +HREF="http://turtle.sourceforge.net" >http://turtle.sourceforge.net</A>) +builds the Debian packages on the Hurd automatically. + +<H4><A HREF="#TOCdebide" NAME="debide">Debian GNU/Hurd: Good idea, bad idea?</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Upstream benefits: +<UL> + <LI>Software packages become more portable.</LI> +</UL> +<P> +Debian benefits: +<UL> + <LI>Debian becomes more portable.</LI> + <LI>Maintainers learn about portability and other systems.</LI> + <LI>Debian gets a lot of public recognition.</LI> +</UL> +<P> +GNU/Hurd benefits: +<UL> + <LI>Large software base.</LI> + <LI>Great infrastructure.</LI> + <LI>Nice community to partner with.</LI> +</UL> +</TD></TR></TABLE> +<P> +The sheet lists the advantages of all groups involved. 
+ +<H4><A HREF="#TOCend" NAME="end">End</A></H4> +<TABLE BORDER="1" CELLPADDING="5" WIDTH="100%"><TR><TD VALIGN="TOP" ALIGN="LEFT"> +<P> +Join us at +<UL> + <LI><A HREF="http://hurd.gnu.org/" >http://hurd.gnu.org/</A></LI> + <LI><A HREF="http://www.debian.org/ports/hurd" + >http://www.debian.org/ports/hurd</A></LI> + <LI><A HREF="http://www.hurdfr.org" + >http://www.hurdfr.org</A></LI> +</UL> +</TD></TR></TABLE> +<P> +List of contacts. diff --git a/hurd/hurd_hacking_guide.mdwn b/hurd/hurd_hacking_guide.mdwn index 0cb96f32..2ef08f8a 100644 --- a/hurd/hurd_hacking_guide.mdwn +++ b/hurd/hurd_hacking_guide.mdwn @@ -8,6 +8,16 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU_Free_Documentation_License|/fdl]]."]]"""]] -Originally written by Wolfgang Jährling, the [Hurd Hacking Guide](http://www.gnu.org/software/hurd/hacking-guide/hhg.html) -contains an overview of some of the Hurd's features. -Also contains a tutorial on writing your own [[translator]]. +Originally written by Wolfgang Jährling, the *Hurd Hacking Guide* contains an +introduction to GNU Hurd and GNU Mach programming, an overview of some of the +Hurd's features. It also contains a tutorial on writing your own +[[translator]]. + + * [HTML version](http://www.gnu.org/software/hurd/hacking-guide/hhg.html) for + browsing online, + * [PostScript version](http://www.gnu.org/software/hurd/hacking-guide/hhg.ps) + [187kB, 37 pages], + * [ASCII text + version](http://www.gnu.org/software/hurd/hacking-guide/hhg.txt) [59kB], + * [Texinfo source](http://www.gnu.org/software/hurd/hacking-guide/hhg.texi) + [60kB]. diff --git a/hurd/ng/position_paper.mdwn b/hurd/ng/position_paper.mdwn index 3240a41d..e0f4bf60 100644 --- a/hurd/ng/position_paper.mdwn +++ b/hurd/ng/position_paper.mdwn @@ -8,7 +8,8 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license is included in the section entitled [[GNU_Free_Documentation_License|/fdl]]."]]"""]] -[[NealWalfield]] and [[MarcusBrinkmann]] wrote a paper titled [*Improving -Usability via Access Decomposition and Policy -Refinement*](http://walfield.org/papers/20070104-walfield-access-decomposition-policy-refinement.pdf). -This is sometimes referred to as *the position paper*. +Neal Walfield and Marcus Brinkmann wrote a paper titled [*Improving Usability +via Access Decomposition and Policy +Refinement*](http://walfield.org/papers/20070104-walfield-access-decomposition-policy-refinement.pdf) +where they give an overview about how a future, subsequent system may be +architected. This is sometimes referred to as *the position paper*. diff --git a/hurd/reference_manual.mdwn b/hurd/reference_manual.mdwn new file mode 100644 index 00000000..5b5bff2d --- /dev/null +++ b/hurd/reference_manual.mdwn @@ -0,0 +1,18 @@ +[[meta copyright="Copyright © 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 +Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +*The GNU Hurd Reference Manual* documents the architecture, the usage and the +programming of the GNU Hurd. At the moment, the manual is quite incomplete. + + * [HTML version](http://www.gnu.org/software/hurd/doc/hurd_toc.html) for + browsing online, + * [PostScript version](http://www.gnu.org/software/hurd/doc/hurd.ps) + [1020KiB, 91 pages]. 
diff --git a/hurd/running/distrib.mdwn b/hurd/running/distrib.mdwn index fc42e862..b0a6badd 100644 --- a/hurd/running/distrib.mdwn +++ b/hurd/running/distrib.mdwn @@ -94,7 +94,7 @@ about getting applications to work (if possible). * GNU [Coding Standards](http://www.gnu.org/prep/standards.html) * [[TestSuites]] - Posix, Perl, results feedback, etc. -* [docs and papers](http://www.gnu.org/software/hurd/docs.html) +* [[Documentation]] * [[SystemAPILimits]] * [[CodeAnnouncements]] - Recent coding projects related to the Hurd diff --git a/hurd/running/gnu/universal_package_manager.mdwn b/hurd/running/gnu/universal_package_manager.mdwn index 009b26bf..440f1122 100644 --- a/hurd/running/gnu/universal_package_manager.mdwn +++ b/hurd/running/gnu/universal_package_manager.mdwn @@ -127,7 +127,7 @@ OK. I will give you steps. i. Install a GNU System by folowing [[these_instructions|setup]] -ii. Read about GNU Design <http://www.gnu.org/software/hurd/hurd-paper.html> +ii. Read about GNU Design: [[Towards_a_New_Strategy_of_OS_Design|documentation/hurd-paper]] iii. Read about translators <http://www.debian.org/ports/hurd/hurd-doc-translator> diff --git a/hurd/translator.mdwn b/hurd/translator.mdwn index b9952931..889f02a6 100644 --- a/hurd/translator.mdwn +++ b/hurd/translator.mdwn @@ -43,6 +43,7 @@ See some [[examples]] about how to use translators. 
# Existing Translators +* [[auth]] * [[pfinet]] * [[pflocal]] * [[hostmux]] diff --git a/hurd/translator/auth.mdwn b/hurd/translator/auth.mdwn new file mode 100644 index 00000000..73e7e025 --- /dev/null +++ b/hurd/translator/auth.mdwn @@ -0,0 +1,13 @@ +[[meta copyright="Copyright © 2008 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[*The_Authentication_Server*|documentation/auth]], the transcript of a talk +about the details of the authentication mechanisms in the Hurd by Wolfgang +Jährling. |