From 24e7f01d6e6322e9d98412076dcf3f4d98196bb0 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Thu, 6 Nov 2008 11:37:33 +0100 Subject: Integrate auth.html, hurd-paper.html, hurd-talk.html. Move content from devel.html, docs.html. --- auth.html | 248 ----- contributing.mdwn | 10 +- devel.html | 159 ---- docs.html | 302 ------ documentation.mdwn | 16 +- hurd-paper.html | 812 ---------------- hurd-talk.html | 1151 ----------------------- hurd.mdwn | 2 - hurd/authentication.mdwn | 2 +- hurd/critique.mdwn | 4 +- hurd/documentation.mdwn | 48 +- hurd/documentation/auth.html | 168 ++++ hurd/documentation/hurd-paper.html | 760 +++++++++++++++ hurd/documentation/hurd-talk.html | 1061 +++++++++++++++++++++ hurd/hurd_hacking_guide.mdwn | 16 +- hurd/ng/position_paper.mdwn | 9 +- hurd/reference_manual.mdwn | 18 + hurd/running/distrib.mdwn | 2 +- hurd/running/gnu/universal_package_manager.mdwn | 2 +- hurd/translator.mdwn | 1 + hurd/translator/auth.mdwn | 13 + microkernel/faq/multiserver_microkernel.mdwn | 4 +- microkernel/mach/documentation.mdwn | 18 +- microkernel/mach/gnu_mach.mdwn | 1 + microkernel/mach/gnu_mach/reference_manual.mdwn | 26 + microkernel/mach/mig/documentation.mdwn | 3 +- 26 files changed, 2146 insertions(+), 2710 deletions(-) delete mode 100644 auth.html delete mode 100644 devel.html delete mode 100644 docs.html delete mode 100644 hurd-paper.html delete mode 100644 hurd-talk.html create mode 100644 hurd/documentation/auth.html create mode 100644 hurd/documentation/hurd-paper.html create mode 100644 hurd/documentation/hurd-talk.html create mode 100644 hurd/reference_manual.mdwn create mode 100644 hurd/translator/auth.mdwn create mode 100644 microkernel/mach/gnu_mach/reference_manual.mdwn diff --git a/auth.html b/auth.html deleted file mode 100644 index 676442ee..00000000 --- a/auth.html +++ /dev/null @@ -1,248 +0,0 @@ - - - -The GNU Hurd - GNU Project - Free Software Foundation (FSF) - - - - - - - - - - - - -
- [image of the Hurd logo] -[ - English -] -
-What's New

-ChangeLogs

-Documentation
-

-GNU Hurd

-Installation
-Getting Help
-Source Code
-Development
-History

-GNU Mach

-Installation
-Source Code

-GNU MIG

-Source Code
-

-
-

Table of Contents

- -
- -

Introduction

-

-In this text, which mostly resembles the talk I gave at Libre Software -Meeting 2002 in Bordeaux, I will describe what the auth server does, -why it is so important and which cool things you can do with it, both -on the programming and the user side. I will also describe related -programs like the password and fakeauth servers. Note that this text -is targeted at programmers who want to understand the auth mechanism -in detail and are already familiar with concepts like Remote Procedure -Calls (RPCs) as well as the way User- and Group-IDs are used in the -POSIX world. - -

-The auth server is a very small server, therefore it gives a useful -example when you want to learn how a server typically looks like. One -reason why it is so small is that the auth interface, which it -implements, consists of only four RPCs. You can find the interface in -hurd/hurd/auth.defs and the server itself in hurd/auth/. - -

How IDs are represented and used

-

-Each process holds (usually) one port to auth (an auth_t in C source, -which actually is a mach_port_t, of course). The purpose of auth is -to manage User-IDs and Group-IDs, which is the reason why users often -will have no choice but to make use of the systems main auth server, -which does not listen on /servers/auth; instead you inherit a port to -auth from your parent process. Each such port is (internally in the -auth server) associated with a set of effective User- and Group-IDs as -well as a set of available User- and Group-IDs. So we have four sets -of IDs in total. The available IDs can be turned into corresponding -effective IDs at any time. - -

-When you send an auth_getids RPC on the port you hold, you will get -information about which IDs are associated with it, so you can figure -out which permissions you have. But how will a server know that you -have these permissions and therefore know which actions (e.g. writing -into file "foo") it is supposed to do on your behalf and which not? -The establishing of a trusted connection to a server works as follows: - -

    -
  1. A user wants a server to know its IDs
  2. -
  3. The user requests a reauthentication from the server
  4. -
  5. In this request the user will include a port
  6. -
  7. Both will hand this port to auth
  8. -
  9. The user uses auth_user_authenticate
  10. -
  11. The server uses auth_server_authenticate
  12. -
  13. The server also passes a new port to auth
  14. -
  15. auth matches these two requests
  16. -
  17. The user gets the new port from auth
  18. -
  19. The server learns about the IDs of the user
  20. -
  21. The user uses the new port for further communication
  22. -
- -

-We have different RPCs for users and servers because what we pass and -what we get back differs for them: Users get a port, and servers get -the sets of IDs, and have to specify the port which the user will get. - -

-It is interesting to note that auth can match the requests by -comparing two integers, because when you get the same port from two -people, you will have the same mach_port_t (which is nothing but an -integer). - -

-All of this of course only works if they use the same auth server, -which is why I said often you have no choice other than to use the -one main auth server. But this is no serious restriction, as the auth server has -almost no functionality one might want to replace. In fact, there is -one replacement for the default auth implementation, but more on that -later. - -

POSIX and beyond

-

-Before we examine what is possible with this design, let us take a -short look at how the POSIX semantics are implemented on top of this -design. When a program that comes out of POSIX-land asks for its own -effective User- or Group-ID, we will tell it about the first of the -effective IDs. In the same sense, the POSIX real User- or Group-ID is -the first available ID and the POSIX saved User- or Group-ID is the -second available ID, which is why you have the same ID two times in -the available IDs when you log into your GNU/Hurd machine (you can -figure out which IDs you have with the program "ids", that basically -just does an auth_getauth RPC). When you lack one of those IDs (for -example when you have no effective Group-ID), a POSIX program asking -for this particular information will get "-1" as the ID. - -

-But as you can imagine, we can do more than what POSIX specifies. Fox -example, we can modify our permissions. This is always done with the -auth_makeauth RPC. In this RPC, you specify the IDs that should be -associated with the new port. All of these IDs must be associated -with either the port where the RPC is sent to or one of the additional -ports you can specify; an exception is the superuser root, which is -allowed to creat ports that are associated with arbitrary IDs. -Hereby you can convert available into effective IDs. - -

-This opens the door to a bunch of nice features. For example, we have -the addauth program in the Hurd, which makes it possible to add an ID -to either a single process or a group of processes if you hold the ID or know the -appropriate password, and there is a corresponding rmauth program that -removes an ID. So when you are working on your computer with GNU -Emacs and want to edit a system configuration file, you switch to -Emacs' shell-mode, do an "addauth root", enter the password, edit the -file, and when you are done switch back to shell-mode and do "rmauth -root". These programs have some interesting options, and there are -various other programs, for setting the complete list of IDs (setauth) -and so on. - -

Related servers

-

-Finally, I want to explain two servers which are related to auth. The -first is the password server, which listens on /servers/password. If -you pass to it a User- or Group-ID and the correct password for it, it -will return a port to auth to you which is associated with the ID you -passed to it. It can create such a port because it is running as -root. So let us assume you are an FTP server process. You will start -as root, because you want to use port 21 (in this case, "port" does -not refer to a mach_port_t, of course). But then, you can drop all -your permissions so that you run without any ID. This makes it far -less dangerous to communicate with yet unknown users over the -network. But when someone now hands a username and password to you, -you can ask the password server for a new auth port. The password -server will check the data you pass to it, for example by looking into -/etc/shadow, and if it is valid, it will ask the auth server for a new -port. It receives this port from auth and then passes it on to you. -So you have raised your permissions. (And for the very curious: Yes, -we are well aware of the differences between this concept and -capabilities; and we also do have some kinds of capabilities in -various parts of the Hurd.) - -

-My second example is the fakeauth server. It also implements the auth -protocol. It is the part of the fakeroot implementation that gives a -process the impression that it runs as root, even if it doesn't. So -when the process asks fakeauth about its own IDs, fakeauth will tell -the process that it runs as root. But when the process wants to make -use of the authentication protocol described earlier in this text, -fakeauth will forward the request to its own auth server, which will -usually be the systems main auth server, which will then be able to -match the auth_*_authenticate requests. So what fakeauth does is -acting as a proxy auth server that gives someone the impression to run -as root, while not modifying what that one is allowed to do. - -

-At this point, I have said at least most of what can be said about the -auth server and the protocol it implements, so I will finish by saying -that it might be an interesting task (for you) to modify some existing -software to take advantage of the features I described here. - -

- -
- -[ - English -] - -
- -

-Return to GNU's home page. -

- -Please send FSF & GNU inquiries & questions to - -gnu@gnu.org. -There are also other ways to -contact the FSF. -

- -Please send comments on these web pages to - -web-hurd@gnu.org, -send other questions to -gnu@gnu.org. -

-Copyright (C) 2002 Free Software Foundation, Inc., -59 Temple Place - Suite 330, Boston, MA 02111, USA -

-Verbatim copying and distribution of this entire article is -permitted in any medium, provided this notice is preserved. -

-Updated: - -$Date$ $Author$ - -


- - diff --git a/contributing.mdwn b/contributing.mdwn index 698d03b3..f1003389 100644 --- a/contributing.mdwn +++ b/contributing.mdwn @@ -52,7 +52,7 @@ is another possibility, see the page about [[running_a_Hurd_system|hurd/running]] for the full story. Then you can either play around and eventually strive to do something -useful or -- if you want -- ask us to assign something to you, depending +useful or -- if you want -- [[ask_us|contact_us]] to assign something to you, depending on the skills you have and the resources you intend to invest. Please spend some time with thinking about the items in this [[questionnaire]]. @@ -62,10 +62,10 @@ system, e.g., [[microkernels_for_beginners|microkernel/for_beginners]]. Until you can do the basic exercises listed there, you won't be able to significantly contribute to the Hurd. -For more reading resources, please see these web pages, and also -, - for links to a bunch of documents, -and in general. +For more reading resources, please see these web pages, for example, +[[Hurd_documentation|hurd/documentation]] and +[[Mach_documentation|microkernel/mach/documentation]] for links to a bunch of +documents. ### Porting Packages diff --git a/devel.html b/devel.html deleted file mode 100644 index 2a31f959..00000000 --- a/devel.html +++ /dev/null @@ -1,159 +0,0 @@ - - - -The GNU Hurd - GNU Project - Free Software Foundation (FSF) - - - - - - - - - - - - -
- [image of the Hurd logo] -[ - - - en -| eo -| es -] -
-What's New

-ChangeLogs

-Documentation
-

-GNU Hurd

-Installation
-Getting Help
-Source Code
-Development
-History

-GNU Mach

-Installation
-Source Code

-GNU MIG

-Source Code

-Related Projects -

-
-

Table of Contents

- -
- -

Contributing

-

-If you want to contribute to the Hurd, you should first install and -use it for a while, to become familiar with its features and design. -To join the development team, subscribe to the -Bug-Hurd -<bug-hurd@gnu.org> -mailing list, which is also the place where you can announce your -intentions, make your proposals and send in your patches. -(You can also send mail there without being subscribed to the list.) -

-There is also the -Hurd-devel-readers -mailing list. It is the read-only version of Hurd-devel, an internal -low-volume list restricted to the core developers of the Hurd. If you -want to follow up on the discussion of the Hurd experts, please reply -to the Bug-hurd mailing list. You can also follow the Hurd-devel -mailing list by browsing the web-based archive of -Hurd-devel. - -

Machinery: getting access to a -system

-

-There are essentially two possibilities: either you install the GNU/Hurd on a -system (see here) or if you don't -have a system to install it on, you can be provided with a shell account. See -this wiki page -for details. - -

Tasks

-

-Developing an operating system is a huge job, with a lot of different -things to do. To be able to keep track of issues, we use a -

-There is also an older (but still valid) list of specific items in the -task file -and in the -TODO file -of the Hurd source repository. - -
- -
- -[ - - - en -| eo -| es -] - -
- -

-Return to GNU's home page. -

- -Please send FSF & GNU inquiries & questions to - -gnu@gnu.org. -There are also other ways to -contact the FSF. -

- -Please send comments on these web pages to - -web-hurd@gnu.org, -send other questions to -gnu@gnu.org. -

-Copyright (C) 2001, 2002, 2006, 2007 -Free Software Foundation, Inc., -59 Temple Place - Suite 330, Boston, MA 02111, USA -

-Verbatim copying and distribution of this entire article is -permitted in any medium, provided this notice is preserved. -

-Updated: - -$Date$ $Author$ - -


- - diff --git a/docs.html b/docs.html deleted file mode 100644 index 9c3a43f1..00000000 --- a/docs.html +++ /dev/null @@ -1,302 +0,0 @@ - - - - -The GNU Hurd - GNU Project - Free Software Foundation (FSF) - - - - - - - - - - - - - -
- [image of the Hurd logo] -[ - - - English -| Esperanto -| Spanish -] -
-What's New

-ChangeLogs

-Documentation
-

-GNU Hurd

-Installation
-Getting Help
-Source Code
-Development
-History

-GNU Mach

-Installation
-Source Code

-GNU MIG

-Source Code

-Related Projects -

-
-

Table of Contents

- -
- -

Introductory material, papers and other -informational documents

-

-

- -

Frequently asked questions

-

-Please check out the -Frequently -Asked Questions about the GNU Hurd (33k characters) and their -answers, which cover most issues a new user will be confronted with. -

-This document is available in several languages: -

- -

Wiki

-

A wiki is available for -collecting ideas and reciepes. Fell free -to contribute! - -

Some topics: - -

- - -

Reference manuals

- -
    - -
  • -

    -The GNU Mach Reference Manual documents the architecture, the usage and -the programming of the GNU Mach microkernel. At the moment, the manual -documents the interface completely, but is not very useful as a tutorial or -introduction into the Mach architecture. -

    -Available Formats: -

    -

    -If you want to work on the manual, you're advised to make a checkout of the source tree. Be sure to get the -GNU Mach 1 branch when you intend to work on the manual. You -can then find the manual's sources in the doc/ directory. Please -submit any modifications to <bug-hurd@gnu.org> (if possible in -unidiff format, as produced by diff -u). - -

  • - -
  • -

    -The GNU Hurd Reference Manual documents the architecture, the usage -and the programming of the GNU Hurd. At the moment, the manual is -quite incomplete. -

    -Available Formats: -

    -

    -If you want to work on the manual, you're advised to make a checkout of the source tree. You can then find the manual's -sources in the doc/ directory. Please submit any modifications to -<bug-hurd@gnu.org> (if possible in -unidiff format, as produced by diff -u). - -

  • - -
- -
- -
- -[ - - - English -| Esperanto -| Spanish -] - -
- -

-Return to GNU's home page. -

- -Please send FSF & GNU inquiries & questions to - -gnu@gnu.org. -There are also other ways to -contact the FSF. -

- -Please send comments on these web pages to - -web-hurd@gnu.org, -send other questions to -gnu@gnu.org. -

-Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007 -Free Software Foundation, Inc., -59 Temple Place - Suite 330, Boston, MA 02111, USA -

-Verbatim copying and distribution of this entire article is -permitted in any medium, provided this notice is preserved. -

-Updated: - -$Date$ $Author$ - -


- - diff --git a/documentation.mdwn b/documentation.mdwn index 45bb5ffc..4e4b4b23 100644 --- a/documentation.mdwn +++ b/documentation.mdwn @@ -8,21 +8,13 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU_Free_Documentation_License|/fdl]]."]]"""]] -# General +Documentation for... - * [GNU Hurd Documentation](http://www.gnu.org/software/hurd/docs.html) + * [[GNU_Hurd|hurd/documentation]] + * [[Mach|microkernel/mach/documentation]] -# Introductory Material - -## External - - * [*Examining the Legendary HURD - Kernel*](http://www.informit.com/articles/printerfriendly.aspx?p=1180992), - an article by David Chisnall. - - Also covers a bit of GNU's and the Hurd's history, fundamental techniques - applied, comparisions to other systems. + * [[MIG|microkernel/mach/mig/documentation]] # [[Unix]] Programming diff --git a/hurd-paper.html b/hurd-paper.html deleted file mode 100644 index bb49829c..00000000 --- a/hurd-paper.html +++ /dev/null @@ -1,812 +0,0 @@ - - - -Towards a New Strategy of OS Design - - - -

Towards a New Strategy of OS Design

- [image of a Hurd Metafont Logo] (jpeg 10k) -(jpeg 20k) -no gifs due to patent problems -
-
-[ - - - English -| Hebrew -| Turkish -] - -

-This article explains why FSF is developing a new operating system named the -Hurd, which will be a foundation of the whole GNU system. -The Hurd is built -on top of CMU's Mach 3.0 kernel and uses Mach's virtual memory management and -message-passing facilities. -The GNU C Library will provide the Unix system -call interface, and will call the Hurd for needed services it can't provide -itself. -The design and implementation of the Hurd is being lead by Michael -Bushnell, with assistance from Richard Stallman, Roland McGrath, -Jan Brittenson, and others. - -

Part 1: A More Usable Approach to OS Design

-

-The fundamental purpose of an operating system (OS) is to enable a variety of -programs to share a single computer efficiently and productively. -This -demands memory protection, preemptively scheduled timesharing, coordinated -access to I/O peripherals, and other services. -In addition, an OS can allow -several users to share a computer. -In this case, efficiency demands services -that protect users from harming each other, enable them to share without -prior arrangement, and mediate access to physical devices. -

-On today's computer systems, programmers usually implement these goals -through a large program called the kernel. -Since this program must be -accessible to all user programs, it is the natural place to add functionality -to the system. -Since the only model for process interaction is that of -specific, individual services provided by the kernel, no one creates other -places to add functionality. -As time goes by, more and more is added to the -kernel. -

-A traditional system allows users to add components to a kernel only if they -both understand most of it and have a privileged status within the system. -Testing new components requires a much more painful edit-compile-debug cycle -than testing other programs. -It cannot be done while others are using the -system. -Bugs usually cause fatal system crashes, further disrupting others' -use of the system. -The entire kernel is usually non-pageable. -(There are -systems with pageable kernels, but deciding what can be paged is difficult -and error prone. -Usually the mechanisms are complex, making them difficult -to use even when adding simple extensions.) -

-Because of these restrictions, functionality which properly belongs -behind -the wall of a traditional kernel is usually left out of systems unless it is -absolutely mandatory. -Many good ideas, best done with an open/read/write -interface cannot be implemented because of the problems inherent in the -monolithic nature of a traditional system. -Further, even among those with -the endurance to implement new ideas, only those who are privileged users of -their computers can do so. -The software copyright system darkens the mire by -preventing unlicensed people from even reading the kernel source. -

-Some systems have tried to address these difficulties. -Smalltalk-80 and -the Lisp Machine both represented one method of getting around the problem. -System code is not distinguished from user code; all of the system is -accessible to the user and can be changed as need be. -Both systems were -built around languages that facilitated such easy replacement and extension, -and were moderately successful. -But they both were fairly poor at insulating -users and programs from each other, failing one of the principal goals of OS -design. -

-Most projects that use the Mach 3.0 kernel carry on the hard-to-change -tradition of OS design. -The internal structure is different, but the same -heavy barrier between user and system remains. -The single-servers, while -fairly easy to construct, inherit all the deficiencies of the monolithic -kernels. -

-A multi-server divides the kernel functionality up into logical blocks with -well-defined interfaces. -Properly done, it is easier to make changes and add -functionality. -So most multi-server projects do somewhat better. -Much more -of the system is pageable. -You can debug the system more easily. -You can -test new system components without interfering with other users. -But the -wall between user and system remains; no user can cross it without special -privilege. -

-The GNU Hurd, by contrast, is designed to make the area of -system -code as -limited as possible. -Programs are required to communicate only with a few -essential parts of the kernel; the rest of the system is replaceable -dynamically. -Users can use whatever parts of the remainder of the system -they want, and can easily add components themselves for other users to take -advantage of. -No mutual trust need exist in advance for users to use each -other's services, nor does the system become vulnerable by trusting the -services of arbitrary users. -

-This has been done by identifying those system components which users -must -use in order to communicate with each other. -One of these is responsible for -identifying users' identities and is called the - -authentication server. - -In -order to establish each other's identities, programs must communicate, each -with an authentication server they trust. -Another component establishes -control over system components by the superuser, provides global bookkeeping -operations, and is called the - -process server. - -

-Not all user programs need to communicate with the process server; it is only -necessary for programs which require its services. -Likewise, the -authentication server is only necessary for programs that wish to communicate -their identity to another. -None of the remaining services carry any special -status; not the network implementation, the filesystems, the program -execution mechanism (including setuid), or any others. - -

The Translator Mechanism

-

-The Hurd uses Mach ports primarily as methods for communicating between users -and servers. -(A Mach port is a communication point on a Mach task where -messages are sent and received.) Each port implements a particular set of -protocols, representing operations that can be undertaken on the underlying -object represented by the port. -Some of the protocols specified by the Hurd -are the I/O protocol, used for generic I/O operations; the file protocol, -used for filesystem operations; the socket protocol, used for network -operations; and the process protocol, used for manipulating processes et al. -

-Most servers are accessed by opening files. -Normally, when you open a file, -you create a port associated with that file that is owned by the server -that owns the directory containing the file. -For example, a disk-based -filesystem will normally serve a large number of ports, each of which -represents an open file or directory. -When a file is opened, the server -creates a new port, associates it with the file, and returns the port to the -calling program. -

-However, a file can have a -translator -associated with it. -In this case, -rather than return its own port which refers to the contents of the file, the -server executes a translator program associated with that file. -This -translator is given a port to the actual contents of the file, and is then -asked to return a port to the original user to complete the open operation. -

-This mechanism is used for -mount -by having a translator associated with -each mount point. -When a program opens the mount point, the translator (in -this case, a program which understands the disk format of the mounted -filesystem) is executed and returns a port to the program. -After the -translator is started, it need not be run again unless it dies; the parent -filesystem retains a port to the translator to use in further requests. -

-The owner of a file can associate a translator with it without special -permission. -This means that any program can be specified as a translator. -Obviously the system will not work properly if the translator does not -implement the file protocol correctly. -However, the Hurd is constructed so -that the worst possible consequence is an interruptible hang. -

-One way to use translators is to access hierarchically structured data using -the file protocol. -For example, all the complexity of the user interface to -the -ftp -program is removed. -Users need only know that a particular -directory represents FTP and can use all the standard file manipulation -commands (e.g -ls -or -cp) -to access the remote system, rather than learning -a new set. -Similarly, a simple translator could ease the complexity of -tar -or -gzip. -(Such transparent access would have some added cost, but it would -be convenient.) - -

Generic Services

-

-With translators, the filesystem can act as a rendezvous for interfaces which -are not similar to files. -Consider a service which implements some version -of the X protocol, using Mach messages as an underlying transport. -For each -X display, a file can be created with the appropriate program as its -translator. -X clients would open that file. -At that point, few file -operations would be useful (read and write, for example, would be useless), -but new operations ( -XCreateWindow -or -XDrawText) -might become meaningful. -In this case, the filesystem protocol is used only to manipulate -characteristics of the node used for the rendezvous. -The node need not -support I/O operations, though it should reply to any such messages with a -message_not_understood -return code. -

-This translator technique is used to contact most of the services in the Hurd -that are not structured like hierarchical filesystems. -For example, the -password server, which hands out authorization tags in exchange for -passwords, is contacted this way. -Network protocol servers are also -contacted in this fashion. -Roland McGrath thought up this use of translators. - -

Clever Filesystem Pictures

-

-In the Hurd, translators can also be used to present a filesystem-like view -of another part of the filesystem, with some semantics changed. -For example, -it would be nice to have a filesystem that cannot itself be changed, but -nonetheless records changed versions of its files elsewhere. -(This could be -useful for source code management.) -

-The Hurd will have a translator which creates a directory which is a -conceptual union of other directories, with collision resolution rules of -various sorts. -This can be used to present a single directory to users that -contains all the programs they would want to execute. -There are other useful -variations on this theme. - -

What The User Can Do

-

-No translator gains extra privilege by virtue of being hooked into the -filesystem. -Translators run with the uid of the owner of the file being -translated, and can only be set or changed by that owner. -The I/O and -filesystem protocols are carefully designed to allow their use by mutually -untrusting clients and servers. -Indeed, translators are just ordinary -programs. -The GNU C library has a variety of facilities to make common sorts -of translators easier to write. -

-Some translators may need special privileges, such as the password server or -translators which allow setuid execution. -These translators could be run by -anyone, but only if they are set on a root-owned node would they be able to -provide all their services successfully. -This is analogous to letting any -user call the -reboot -system call, but only honoring it if that user is root. - -

Why This Is So Different

-

-What this design provides is completely novel to the Unix world. -Until now, -OSs have kept huge portions of their functionality in the realm of system -code, thus preventing its modification and extension except in extreme need. -Users cannot replace parts of the system in their programs no matter how much -easier that would make their task, and system managers are loath to install -random tweaks off the net into their kernels. -

-In the Hurd, users can change almost all of the things that are decided for -them in advance by traditional systems. -In combination with the tremendous -control given by the Mach kernel over task address spaces and properties, the -Hurd provides a system in which users will, for the first time, be able to -replace parts of the system they dislike, without disrupting other users. -

-Most Mach-based OSs to date have mostly implemented a wider set of the - -same old - -Unix semantics in a new environment. -In contrast, GNU is extending -those semantics to allow users to improve, bypass, or replace them. - - -

Part 2: A Look at Some of the Hurd's Beasts

-

The Authentication Server

-

-One of the Hurd's more central servers is the authentication server. -Each -port to this server identifies a user and is associated by this server with -an -id block. -Each id block contains sets of user and group ids. -Either -set may be empty. -This server is not the same as the password server -referred to above. -

-The authentication server exports three services. -First, it provides simple -boolean operations on authentication ports: given two authentication ports, -this server will provide a third port representing the union of the two sets -of uids and gids. -Second, this server allows any user with a uid of zero to -create an arbitrary authentication port. -Finally, this server provides RPCs -(Remote Procedure Calls between different programs and possibly different -hosts) which allow mutually untrusting clients and servers to establish their -identities and pass initial information on each other. -This is crucial to -the security of the filesystem and I/O protocols. -

-Any user could write a program which implements the authentication protocol; -this does not violate the system's security. -When a service needs to -authenticate a user, it communicates with its trusted authentication server. -If that user is using a different authentication server, the transaction will -fail and the server can refuse to communicate further. -Because, in effect, -this forces all programs on the system to use the same authentication server, -we have designed its interface to make any safe operation possible, and to -include no extraneous operations. -(This is why there is a separate password -server.) -

The Process Server

-

-The process server acts as an information categorization repository. -There -are four main services supported by this server. -First, the process server -keeps track of generic host-level information not handled by the Mach kernel. -For example, the hostname, the hostid, and the system version are maintained -by the process server. -Second, this server maintains the Posix notions of -sessions and process groups, to help out programs that wish to use Posix -features. -

-Third, the process server maintains a one-to-one mapping between Mach tasks -and Hurd processes. -Every task is assigned a pid. -Processes can register a -message port with this server, which can then be given out to any program -which requests it. -This server makes no attempt to keep these message ports -private, so user programs are expected to implement whatever security they -need themselves. -(The GNU C Library provides convenient functions for all -this.) Processes can tell the process server their current `argv' and `envp' -values; this server will then provide, on request, these vectors of arguments -and environment. -This is useful for writing -ps-like -programs and also -makes it easier to hide or change this information. -None of these features -are mandatory. -Programs are free to disregard all of this and never register -themselves with the process server at all. -They will, however, still have a -pid assigned. -

-Finally, the process server implements -process collections, -which are used -to collect a number of process message ports at the same time. -Also, -facilities are provided for converting between pids, process server ports, -and Mach task ports, while ensuring the security of the ports managed. -

-It is important to stress that the process server is optional. -Because of -restrictions in Mach, programs must run as root in order to identify all the -tasks in the system. -But given that, multiple process servers could -co-exist, each with their own clients, giving their own model of the -universe. -Those process server features which do not require root privileges -to be implemented could be done as per-user servers. -The user's hands are -not tied. -

Transparent FTP

-

-Transparent FTP is an intriguing idea whose time has come. -The popular -ange-ftp -package available for GNU Emacs makes access to FTP files -virtually transparent to all the Emacs file manipulation functions. -Transparent FTP does the same thing, but in a system wide fashion. -This -server is not yet written; the details remain to be fleshed out, and will -doubtless change with experience. -

-In a BSD kernel, a transparent FTP filesystem would be no harder to write -than in the Hurd. -But mention the idea to a BSD kernel hacker, and the -response is that ``such a thing doesn't belong in the kernel''. -In a sense, -this is correct. -It violates all the layering principles of such systems to -place such things in the kernel. -The unfortunate side effect, however, is -that the design methodology (which is based on preventing users from changing -things they don't like) is being used to prevent system designers from making -things better. -(Recent BSD kernels make it possible to write a user program -that provides transparent FTP. -An example is -alex, -but it needs to run -with full root privileges.) -

-In the Hurd, there are no obstacles to doing transparent FTP. -A translator -will be provided for the node -/ftp. -The contents of -/ftp -will probably -not be directly listable, though further subdirectories will be. -There will -be a variety of possible formats. -For example, to access files on uunet, one -could - -cd /ftp/ftp.uu.net:anonymous:mib@gnu. - -Or to access files on a remote -account, one might - -cd /ftp/gnu.org:mib:passwd. - -Parts of this -command could be left out and the transparent FTP program would read them -from a user's -.netrc -file. -In the last case, one might just - -cd /ftp/gnu.org; - -when the rest of the data is already in -.netrc. -

-There is no need to do a -cd -first--use any file command. -To find out about -RFC 1097 (the Telnet Subliminal Message Option), just type - -more /ftp/ftp.uu.net/inet/rfc/rfc1097. - -A copy command to a local disk -could be used if the RFC would be read frequently. -

Filesystems

-

-Ordinary filesystems are also being implemented. -The initial release of the -Hurd will contain a filesystem upwardly compatible with the BSD 4.4 Fast File -System. -In addition to the ordinary semantics, it will provide means to -record translators, offer thirty-two bit user ids and group ids, and supply a -new id per file, called the -author -of the file, which can be set by the -owner arbitrarily. -In addition, because users in the Hurd can have multiple -uids (or even none), there is an additional set of permission bits providing -access control for - -unknown user - -(no uids) as distinct from - -known but arbitrary user - -(some uids: the existing -world -category of file -permissions). -

-The Network File System protocol will be implemented using 4.4 BSD as a -starting point. -A log-structured filesystem will also be implemented using -the same ideas as in Sprite, but probably not the same format. -A GNU network -file protocol may be designed in time, or NFS may be extended to remove its -deficiencies. -There will also be various ``little'' filesystems, such as the -MS-DOS filesystem, to help people move files between GNU and other OSs. - -

Terminals

-

-An I/O server will provide the terminal semantics of Posix. -The GNU C -Library has features for keeping track of the controlling terminal and for -arranging to have proper job control signals sent at the proper times, as -well as features for obeying keyboard and hangup signals. -

-Programs will be able to insert a terminal driver into communications -channels in a variety of ways. -Servers like -rlogind -will be able to insert -the terminal protocol onto their network communication port. -Pseudo-terminals will not be necessary, though they will be provided for -backward compatibility with older programs. -No programs in GNU will depend -on them. -

-Nothing about a terminal driver is forced upon users. -A terminal driver -allows a user to get at the underlying communications channel easily, to -bypass itself on an as-needed basis or altogether, or to substitute a -different terminal driver-like program. -In the last case, provided the -alternate program implements the necessary interfaces, it will be used by the -C Library exactly as if it were the ordinary terminal driver. -

-Because of this flexibility, the original terminal driver will not provide -complex line editing features, restricting itself to the behavior found in -Posix and BSD. -In time, there will be a -readline-based -terminal driver, -which will provide complex line-editing features for those users who want -them. -

-The terminal driver will probably not provide good support for the -high-volume, rapid data transmission required by UUCP or SLIP. -Those -programs do not need any of its features. -Instead they will be using the -underlying Mach device ports for terminals, which support moving large -amounts of data efficiently. - -

Executing Programs

-

-The implementation of the -execve -call is spread across three programs. -The -library marshals the argument and environment vectors. -It then sends a -message to the file server that holds the file to be executed. -The file -server checks execute permissions and makes whatever changes it desires in -the exec call. -For example, if the file is marked setuid and the fileserver -has the ability, it will change the user identification of the new image. -The file server also decides if programs which had access to the old task -should continue to have access to the new task. -If the file server is -augmenting permissions, or executing an unreadable image, then the exec needs -to take place in a new Mach task to maintain security. -

-After deciding the policy associated with the new image, the filesystem calls -the exec server to load the task. -This server, using the BFD (Binary File -Descriptor) library, loads the image. -BFD supports a large number of object -file formats; almost any supported format will be executable. -This server -also handles scripts starting with -#!, -running them through the indicated -program. -

-The standard exec server also looks at the environment of the new image; if -it contains a variable -EXECSERVERS -then it uses the programs specified -there as exec servers instead of the system default. -(This is, of course, -not done for execs that the file server has requested be kept secure.) -

-The new image starts running in the GNU C Library, which sends a message to -the exec server to get the arguments, environment, umask, current directory, -etc. -None of this additional state is special to the file or exec servers; -if programs wish, they can use it in a different manner than the Library. - -

New Processes

-

-The -fork -call is implemented almost entirely in the GNU C Library. -The new -task is created by Mach kernel calls. -The C Library arranges to have its -image inherited properly. -The new task is registered with the process server -(though this is not mandatory). -The C Library provides vectors of functions -to be called at fork time: one vector to be called before the fork, one after -in the parent, and one after in the child. -(These features should not be -used to replace the normal fork-calling sequence; it is intended for -libraries which need to close ports or clean up before a fork occurs.) -The C -library will implement both fork calls specified by the draft Posix.4a (the -proposed standard dealing with the threads extension to the real-time -extension). -

-Nothing forces the user to create new tasks this way. -If a program wants to -use almost the normal fork, but with some special characteristics, then it -can do so. -Hooks will be provided by the C Library, or the function can even -be completely replaced. -None of this is possible in a traditional Unix -system. - -

Asynchronous Messages

-

-As mentioned above, the process server maintains a - -message port - -for each -task registered with it. -These ports are public, and are used to send -asynchronous messages to the task. -Signals, for example, are sent to the -message port. -The signal message also provides a port as an indication that -the sender should be trusted to send the signal. -The GNU C Library lists a -variety of ports in a table, each of which identifies a set of signals that -can be sent by anyone who possesses that port. -For example, if the user -possesses the task's kernel port, it is allowed to send any signal. -If the -user possesses a special - -terminal id - -port, it is allowed to send the -keyboard and hangup signals. -Users can add arbitrary new entries into the C -library's signal permissions table. -

-When a process's process group changes, the process server will send it a -message indicating the new process group. -In this case, the process server -proves its authority by providing the task's kernel port. -

-The C library also has messages to add and delete uids currently used by the -process. -If new uids are sent to the program, the library adds them to its -current set, and then exchanges messages with all the I/O servers it knows -about, proving to them its new authorization. -Similarly, a message can -delete uids. -In the latter case, the caller must provide the process's task -port. -(You can't harm a process by giving it extra permission, but you can -harm it by taking permission away.) The Hurd will provide user programs to -send these messages to processes. -For example, the -su -command will be able -to cause all the programs in your current login session, to gain a new uid, -rather than spawn a subshell. -

-The C library will allow programs to add asynchronous messages they wish to -recognize, as well as prevent recognition of the standard set. -

Making It Look Like Unix

-

-The C Library will implement all of the calls from BSD and Posix as well as -some obvious extensions to them. -This enables users to replace those calls -they dislike or bypass them entirely, whereas in Unix the calls must be used -``as they come'' with no alternatives possible. -

-In some environments binary compatibility will also be supported. -This works -by building a special version of the library which is then loaded somewhere -in the address space of the process. -(For example, on a VAX, it would be -tucked in above the stack.) A feature of Mach, called system call -redirection, is then used to trap Unix system calls and turn them into jumps -into this special version of the library. -(On almost all machines, the cost -of such a redirection is very small; this is a highly optimized path in Mach. -On a 386 it's about two dozen instructions. -This is little worse than a -simple procedure call.) -

-Many features of Unix, such as signal masks and vectors, are handled -completely by the library. -This makes such features significantly cheaper -than in Unix. -It is now reasonable to use -sigblock -extensively to protect -critical sections, rather than seeking out some other, less expensive method. - -

Network Protocols

-

-The Hurd will have a library that will make it very easy to port 4.4 BSD -protocol stacks into the Hurd. -This will enable operation, virtually for -free, of all the protocols supported by BSD. -Currently, this includes the -CCITT protocols, the TCP/IP protocols, the Xerox NS protocols, and the ISO -protocols. -

-For optimal performance some work would be necessary to take advantage of -Hurd features that provide for very high speed I/O. -For most protocols this -will require some thought, but not too much time. -The Hurd will run the -TCP/IP protocols as efficiently as possible. -

-As an interesting example of the flexibility of the Hurd design, consider the -case of IP trailers, used extensively in BSD for performance. -While the Hurd -will be willing to send and receive trailers, it will gain fairly little -advantage in doing so because there is no requirement that data be copied and -avoiding copies for page-aligned data is irrelevant. - -


- -[ - - - English -| Hebrew -| Turkish -] - -
- -Return to GNU's home page. -

-FSF & GNU inquiries & questions to -gnu@gnu.org. -Other ways to contact the FSF. -

-Comments on these web pages to -web-hurd@gnu.org, -send other questions to -gnu@gnu.org. -

-Copyright (C) 1996 Trent Fisher -
-Copyright (C) 1996, 1997, 1998, 2007 Free Software Foundation, Inc., -51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. -

-Verbatim copying and distribution of this entire article is -permitted in any medium, provided this notice is preserved.

-Updated: - -19 Dec 1998 jonas - -


- - diff --git a/hurd-talk.html b/hurd-talk.html deleted file mode 100644 index 630bbc7d..00000000 --- a/hurd-talk.html +++ /dev/null @@ -1,1151 +0,0 @@ - - - -The GNU Hurd - GNU Project - Free Software Foundation (FSF) - - - - - - - - - - - - -
- [image of the Hurd logo] -[ - - - English -] -
-What's New

-ChangeLogs

-Documentation
-

-GNU Hurd

-Installation
-Getting Help
-Source Code
-Development
-History

-GNU Mach

-Installation
-Source Code

-GNU MIG

-Source Code

-Related Projects -

-
-

Table of Contents

- -
-

Talk about the Hurd

-

-This talk about the Hurd was written by Marcus Brinkmann for -

    -
  • OSDEM, Brussels, 4. Feb 2001, -
  • Frühjahrsfachgespräche, Cologne, 2. Mar 2001 and -
  • Libre Software Meeting, Bordeaux, 4. Jul 2001. -
- -

Introduction

-

-When we talk about free software, we usually refer to the free -software licenses. We also need relief from software patents, so our -freedom is not restricted by them. But there is a third type of -freedom we need, and that's user freedom. - -

-Expert users don't take a system as it is. They like to change the -configuration, and they want to run the software that works best for -them. That includes window managers as well as your favourite text -editor. But even on a GNU/Linux system consisting only of free -software, you can not easily use the filesystem format, network -protocol or binary format you want without special privileges. In -traditional unix systems, user freedom is severly restricted by the -system administrator. - -

-The Hurd removes these restrictions from the user. It provides an -user extensible system framework without giving up POSIX compatibility -and the unix security model. Throughout this talk, we will see that -this brings further advantages beside freedom. - -

Overview

-
- -

-The Hurd is a POSIX compatible multi-server -system operating on top of the GNU Mach microkernel. - -

-Topics: -

    -
  • GNU Mach
  • -
  • The Hurd
  • -
  • Development
  • -
  • Debian GNU/Hurd
  • -
-
- -

-The Hurd is a POSIX compatible multi-server system operating on top of -the GNU Mach Microkernel. - -

-I will have to explain what GNU Mach is, so we start with that. Then -I will talk about the Hurd's architecture. After that, I will give a -short overview on the Hurd libraries. Finally, I will tell you how -the Debian project is related to the Hurd. - -

Historicals

- -
-
    -
  • 1983: Richard Stallman founds the GNU project.
  • -
  • 1988: Decision is made to use Mach 3.0 as the kernel.
  • -
  • 1991: Mach 3.0 is released under compatible license.
  • -
  • 1991: Thomas Bushnell, BSG, founds the Hurd project.
  • -
  • 1994: The Hurd boots the first time.
  • -
  • 1997: Version 0.2 of the Hurd is released.

  • -
  • 1998: Debian hurd-i386 archive is created.
  • -
  • 2001: Debian GNU/Hurd snapshot fills three CD images.
  • -
-
- -

-When Richard Stallman founded the GNU project in 1983, he wanted to -write an operating system consisting only of free software. Very -soon, a lot of the essential tools were implemented, and released -under the GPL. However, one critical piece was missing: The kernel. -

-After considering several alternatives, it was decided not to write a -new kernel from scratch, but to start with the Mach microkernel. This -was in 1988, and it was not before 1991 that Mach was released under a -license allowing the GNU project to distribute it as a part of the -system. -

-In 1998, I started the Debian GNU/Hurd project, and in 2001 the number -of available GNU/Hurd packages fills three CD images. - -

Kernel Architectures

-
-

-Microkernel: -

    -
  • Enforces resource management (paging, scheduling)
  • -
  • Manages tasks
  • -
  • Implements message passing for IPC
  • -
  • Provides basic hardware support
  • -
-

-Monolithic kernel: -

    -
  • No message passing necessary
  • -
  • Rich set of features (filesystems, authentication, network - sockets, POSIX interface, ...)
  • -
-
-

-Microkernels were very popular in the scientific world around that -time. They don't implement a full operating system, but only the -infrastructure needed to enable other tasks to implement most -features. In contrast, monolithical kernels like Linux contain -program code of device drivers, network protocols, process management, -authentication, file systems, POSIX compatible interfaces and much -more. -

-So what are the basic facilities a microkernel provides? In general, -this is resource management and message passing. Resource management, -because the kernel task needs to run in a special privileged mode of -the processor, to be able to manipulate the memory management unit and -perform context switches (also to manage interrupts). Message -passing, because without a basic communication facility the other -tasks could not interact to provide the system services. Some -rudimentary hardware device support is often necessary to bootstrap -the system. So the basic jobs of a microkernel are enforcing the -paging policy (the actual paging can be done by an external pager -task), scheduling, message passing and probably basic hardware device -support. -

-Mach was the obvious choice back then, as it provides a rich set of -interfaces to get the job done. Beside a rather brain-dead device -interface, it provides tasks and threads, a messaging system allowing -synchronous and asynchronous operation and a complex interface for -external pagers. It's certainly not one of the sexiest microkernels -that exist today, but more like a big old mama. The GNU project -maintains its own version of Mach, called GNU Mach, which is based on -Mach 4.0. In addition to the features contained in Mach 4.0, the GNU -version contains many of the Linux 2.0 block device and network card -drivers. -

-A complete treatment of the differences between a microkernel and -monolithical kernel design can not be provided here. But a couple of -advantages of a microkernel design are fairly obvious. - -

Micro vs Monolithic

-
-

-Microkernel -

    -
  • Clear cut responsibilities -
  • Flexibility in operating system design, easier debugging
  • -
  • More stability (less code to break)
  • -
  • New features are not added to the kernel
  • -
-

-Monolithic kernel -

    -
  • Intolerance or creeping featuritis
  • -
  • Danger of spaghetti code
  • -
  • Small changes can have far reaching side effects
  • -
-
-

-Because the system is split up into several components, clean -interfaces have to be developed, and the responsibilities of each part -of the system must be clear. -

-Once a microkernel is written, it can be used as the base for several -different operating systems. Those can even run in parallel which -makes debugging easier. When porting, most of the hardware dependant -code is in the kernel. -

-Much of the code that doesn't need to run in the special kernel mode -of the processor is not part of the kernel, so stability increases -because there is simply less code to break. -

-New features are not added to the kernel, so there is no need to hold -the barrier high for new operating system features. -

-Compare this to a monolithical kernel, where you either suffer from -creeping featuritis or you are intolerant of new features (we see both -in the Linux kernel). -

-Because in a monolithical kernel, all parts of the kernel can access -all data structures in other parts, it is more likely that short cuts -are used to avoid the overhead of a clean interface. This leads to a -simple speed up of the kernel, but also makes it less comprehensible -and more error prone. A small change in one part of the kernel can -break remote other parts. - -

Single Server vs Multi Server

-
-

-Single Server -

    -
  • A single task implements the functionality of the operating system.
  • -
-

-Multi Server -

    -
  • Many tasks cooperate to provide the system's functionality.
  • -
  • One server provides only a small but well-defined part of the - whole system.
  • -
  • The responsibilities are distributed logically among the servers.
  • -
-

-A single-server system is comparable to a monolithic kernel system. It -has similar -advantages and disadvantages. -

-

-There exist a couple of operating systems based on Mach, but they all -have the same disadvantages as a monolithical kernel, because those -operating systems are implemented in one single process running on top -of the kernel. This process provides all the services a monolithical -kernel would provide. This doesn't make a whole lot of sense (the -only advantage is that you can probably run several of such isolated -single servers on the same machine). Those systems are also called -single-server systems. The Hurd is the only usable multi-server -system on top of Mach. In the Hurd, there are many server programs, -each one responsible for a unique service provided by the operating -system. These servers run as Mach tasks, and communicate using the -Mach message passing facilities. One of them does only provide a -small part of the functionality of the system, but together they build -up a complete and functional POSIX compatible operating system. - -

Multi Server is superior, ...

-
-

-Any multi-server has advantages over single-server: -

    -
  • Clear cut responsibilities
  • -
  • More stability: If one server dies, all others remain
  • -
  • Easier development cycle: Testing without reboot (or replacing - running servers), debugging with gdb
  • -
  • Easier to make changes and add new features -
-
-

-Using several servers has many advantages, if done right. If a file -system server for a mounted partition crashes, it doesn't take down -the whole system. Instead the partition is "unmounted", and -you can try to start the server again, probably debugging it this time -with gdb. The system is less prone to errors in individual -components, and over-all stability increases. The functionality of -the system can be extended by writing and starting new servers -dynamically. (Developing these new servers is easier for the reasons -just mentioned.) -

-But even in a multi-server system the barrier between the system and -the users remains, and special privileges are needed to cross it. We -have not achieved user freedom yet. - -

The Hurd even more so.

-
-

-The Hurd goes beyond all this, and allows users to write and run their -servers, too! -

    -
  • Users can replace system servers dynamically with their own - implementations.
  • -
  • Users can decide what parts of the remainder of the system they - want to use.
  • -
  • Users can extend the functionality of the system.
  • -
  • No mutual trust necessary to make use of other users - services.
  • -
  • Security of the system is not harmed by trusting users - services.
  • -
-
-

-To quote Thomas Bushnell, BSG, from his paper -``A new strategy towards OS -design'' (1996): -

-The GNU Hurd, by contrast, is designed to make the area of system code -as limited as possible. Programs are required to communicate only -with a few essential parts of the kernel; the rest of the system is -replaceable dynamically. Users can use whatever parts of the -remainder of the system they want, and can easily add components -themselves for other users to take advantage of. No mutual trust need -exist in advance for users to use each other's services, nor does the -system become vulnerable by trusting the services of arbitrary users. -
- -

-So the Hurd is a set of servers running on top of the Mach -micro-kernel, providing a POSIX compatible and extensible operating -system. What servers are there? What functionality do they provide, -and how do they cooperate? - -

Mach Inter Process Communication

-
-

-Ports are message queues which can be used as one-way communication -channels. -

    -
  • Port rights are receive, send or send-once
  • -
  • Exactly one receiver
  • -
  • Potentially many senders
  • -
-

-MIG provides remote procedure calls on top of Mach IPC. RPCs look like -function calls to the user. -

-

-Inter-process communication in Mach is based on the ports concept. A -port is a message queue, used as a one-way communication channel. In -addition to a port, you need a port right, which can be a send right, -receive right, or send-once right. Depending on the port right, you -are allowed to send messages to the server, receive messages from it, -or send just one single message. -

-For every port, there exists exactly one task holding the receive -right, but there can be no or many senders. The send-once right is -useful for clients expecting a response message. They can give a -send-once right to the reply port along with the message. The kernel -guarantees that at some point, a message will be received on the reply -port (this can be a notification that the server destroyed the -send-once right). -

-You don't need to know much about the format a message takes to be -able to use the Mach IPC. The Mach interface generator mig hides the -details of composing and sending a message, as well as receiving the -reply message. To the user, it just looks like a function call, but -in truth the message could be sent over a network to a server running -on a different computer. The set of remote procedure calls a server -provides is the public interface of this server. - - -

How to get a port?

-
-

-Traditional Mach: -

    -
  • Nameserver provides ports to all registered servers.
  • -
  • The nameserver port itself is provided by Mach.
  • -
  • Like a phone book: One list.
  • -
-

-The Hurd: -

    -
  • The filesystem is used as the server namespace.
  • -
  • Root directory port is inserted into each task.
  • -
  • The C library finds other ports with hurd_file_name_lookup, - performing a pathname resolution.
  • -
  • Like a tree of phone books.
  • -
-
-

-So how does one get a port to a server? You need something like a -phone book for server ports, or otherwise you can only talk to -yourself. In the original Mach system, a special nameserver is -dedicated to that job. A task could get a port to the nameserver from -the Mach kernel and ask it for a port (with send right) to a server -that registered itself with the nameserver at some earlier time. -

-In the Hurd, there is no nameserver. Instead, the filesystem is used -as the server namespace. This works because there is always a root -filesystem in the Hurd (remember that the Hurd is a POSIX compatible -system); this is an assumption the people who developed Mach couldn't -make, so they had to choose a different strategy. You can use the -function hurd_file_name_lookup, which is part of the C library, to get -a port to the server belonging to a filename. Then you can start to -send messages to the server in the usual way. - -

Example of hurd_file_name_lookup

-
-mach_port_t identity;
-mach_port_t pwserver;
-kern_return_t err;
-
-pwserver = hurd_file_name_lookup
-                ("/servers/password");
-
-err = password_check_user (pwserver,
-                           0 /* root */, "supass",
-                           &identity);
-
-

-As a concrete example, the special filename -/servers/password can be used to request a port to the -Hurd password server, which is responsible to check user provided -passwords. -

-(explanation of the example) - -

Pathname resolution example

-
-

-Task: Lookup /mnt/readme.txt where /mnt has a mounted filesystem. -

    -
  • The C library asks the root filesystem server about - /mnt/readme.txt.
  • -
  • The root filesystem returns a port to the mnt filesystem server - (matching /mnt) and the retry name - /readme.txt.
  • -
  • The C library asks the mnt filesystem server about - /readme.txt.
  • -
  • The mnt filesystem server returns a port to itself and records - that this port refers to the regular file - /readme.txt.
  • -
-
-

-The C library itself does not have a full list of all available -servers. Instead pathname resolution is used to traverse through a -tree of servers. In fact, filesystems themselves are implemented by -servers (let us ignore the chicken and egg problem here). So all the -C library can do is to ask the root filesystem server about the -filename provided by the user (assuming that the user wants to resolve -an absolute path), using the dir_lookup RPC. If the -filename refers to a regular file or directory on the filesystem, the -root filesystem server just returns a port to itself and records that -this port corresponds to the file or directory in question. But if a -prefix of the full path matches the path of a server the root -filesystem knows about, it returns to the C library a port to this -server and the remaining part of the pathname that couldn't be -resolved. The C library than has to retry and query the other server -about the remaining path component. Eventually, the C library will -either know that the remaining path can't be resolved by the last -server in the list, or get a valid port to the server in question. - -

Mapping the POSIX Interface

-
- - - - - - - - - - - - - - - - - - - -
FiledescriptorPort to server providing the file
fd = open(name,...)dir_lookup(..,name,..,&port)
-[pathname resolution]
read(fd, ...)io_read(port, ...)
write(fd, ...)io_write(port, ...)
fstat(fd, ...)io_stat(port, ...)
...
-
-

-It should by now be obvious that the port returned by the server can -be used to query the files status, content and other information from -the server, if good remote procedure calls to do that are defined and -implemented by it. This is exactly what happens. Whenever a file is -opened using the C libraries open() call, the C library -uses the above pathname resolution to get a port to a server providing -the file. Then it wraps a file descriptor around it. So in the Hurd, -for every open file descriptor there is a port to a server providing -this file. Many other C library calls like read() and -write() just call a corresponding RPC using the port -associated with the file descriptor. - -

File System Servers

-
-
    -
  • Provide file and directory services for ports (and more).
  • -
  • These ports are returned by a directory lookup.
  • -
  • Translate filesystem accesses through their root path (hence the - name translator).
  • -
  • The C library maps the POSIX file and directory interface (and - more) to RPCs to the filesystem servers ports, but also does work on - its own.
  • -
  • Any user can install file system servers on inodes they own.
  • -
-
-

-So we don't have a single phone book listing all servers, but rather a -tree of servers keeping track of each other. That's really like -calling your friend and asking for the phone number of the blond girl -at the party yesterday. He might refer you to a friend who hopefully -knows more about it. Then you have to retry. -

-This mechanism has huge advantages over a single nameserver. First, -note that standard unix permissions on directories can be used to -restrict access to a server (this requires that the filesystems -providing those directories behave). You just have to set the -permissions of a parent directory accordingly and provide no other way -to get a server port. -

-But there are much deeper implications. Most of all, a pathname never -directly refers to a file, it refers to a port of a server. That -means that providing a regular file with static data is just one of -the many options the server has to service requests on the file port. -A server can also create the data dynamically. For example, a server -associated with /dev/random can provide new random data -on every io_read() on the port to it. A server -associated with /dev/fortune can provide a new fortune -cookie on every open(). -

-While a regular filesystem server will just serve the data as stored -in a filesystem on disk, there are servers providing purely virtual -information, or a mixture of both. It is up to the server to behave -and provide consistent and useful data on each remote procedure call. -If it does not, the results may not match the expectations of the user -and confuse him. -

-A footnote from the Hurd info manual: -

-(1) You are lost in a maze of twisty little filesystems, all -alike.... -
-

-Because a server installed in the filesystem namespace translates all -filesystem operations that go through its root path, such a server is -also called "active translator". You can install translators using -the settrans command with the -a option. - -

Active vs Passive

-
-

-Active Translators: -

    -
  • "settrans -a /cdrom /hurd/isofs /dev/hd2"
  • -
  • Are running filesystem servers.
  • -
  • Are attached to the root node they translate.
  • -
  • Run as a normal process.
  • -
  • Go away with every reboot, or even time out.
  • -
-
-

-Many translator settings remain constant for a long time. It would be -very lame to always repeat the same couple of dozens settrans calls -manually or at boot time. So the Hurd provides a filesystem extension -that allows to store translator settings inside the filesystem and let -the filesystem servers do the work to start those servers on demand. -Such translator settings are called "passive translators". A passive -translator is really just a command line string stored in an inode of -the filesystem. If during a pathname resolution a server encounters -such a passive translator, and no active translator does exist already -(for this node), it will use this string to start up a new translator -for this inode, and then let the C library continue with the path -resolution as described above. Passive translators are installed with -settrans using the -p option (which is already the -default). - -
-

-Passive Translators: -

    -
  • "settrans /mnt /hurd/ext2fs /dev/hd1s1"
  • -
  • Are stored as command strings into an inode.
  • -
  • Are used to start a new active translator if there isn't - one.
  • -
  • Startup is transparent to the user.
  • -
  • Startup happens the first time the server is needed.
  • -
  • Are permanent across reboots (like file data).
  • -
-
-

-So passive translators also serve as a sort of automounting feature, -because no manual interaction is required. The server start up is -deferred until the service is need, and it is transparent to the user. -

-When starting up a passive translator, it will run as a normal process -with the same user and group id as those of the underlying inode. Any -user is allowed to install passive and active translators on inodes -that he owns. This way the user can install new servers into the -global namespace (for example, in his home or tmp directory) and thus -extend the functionality of the system (recall that servers can -implement other remote procedure calls beside those used for files and -directories). A careful design of the trusted system servers makes -sure that no permissions leak out. -

-In addition, users can provide their own implementations of some of -the system servers instead the system default. For example, they can -use their own exec server to start processes. The user specific exec -server could for example start java programs transparently (without -invoking the interpreter manually). This is done by setting the -environment variable EXECSERVERS. The systems default -exec server will evaluate this environment variable and forward the -RPC to each of the servers listed in turn, until some server accepts -it and takes over. The system default exec server will only do this -if there are no security implications. (XXX There are other ways to -start new programs than by using the system exec server. Those are -still available.) -

-Let's take a closer look at some of the Hurd servers. It was already -mentioned that only few system servers are mandatory for users. To -establish your identity within the Hurd system, you have to -communicate with the trusted systems authentication server -auth. To put the system administrator into control over -the system components, the process server does some global -bookkeeping. -

-But even these servers can be ignored. However, registration with the -authentication server is the only way to establish your identity -towards other system servers. Likewise, only tasks registered as -processes with the process server can make use of its services. - -

Authentication

-
-

-A user identity is just a port to an authserver. The auth server -stores four set of ids for it: -

    -
  • effective user ids
  • -
  • effective group ids
  • -
  • available user ids
  • -
  • available group ids
  • -
-

-Basic properties: -

    -
  • Any of these can be empty.
  • -
  • A 0 among the user ids identifies the superuser.
  • -
  • Effective ids are used to check if the user has the - permission.
  • -
  • Available ids can be turned into effective ids on user - request.
  • -
-
-

-The Hurd auth server is used to establish the identity of a user for a -server. Such an identity (which is just a port to the auth server) -consists of a set of effective user ids, a set of effective group ids, -a set of available user ids and a set of available group ids. Any of -these sets can be empty. - -

Operations on authentication ports

-
-

-The auth server provides the following operations on ports: -

    -
  • Merge the ids of two ports into a new one.
  • -
  • Return a new port containing a subset of the ids in a port.
  • -
  • Create a new port with arbitrary ids (superuser only).
  • -
  • Establish a trusted connection between users and servers.
  • -
-
-

-If you have two identities, you can merge them and request an identity -consisting of the unions of the sets from the auth server. You can -also create a new identity consisting only of subsets of an identity -you already have. What you can't do is extending your sets, unless -you are the superuser which is denoted by having the user id 0. - -

Establishing trusted connections

-
-
    -
  • User provides a rendezvous port to the server (with - io_reauthenticate).
  • -
  • User calls auth_user_authenticate on the - authentication port (his identity), passing the rendezvous port.
  • -
  • Server calls auth_server_authenticate on its - authentication port (to a trusted auth server), passing the - rendezvous port and the server port.
  • -
  • If both authentication servers are the same, it can match the - rendezvous ports and return the server port to the user and the user - ids to the server.
  • -
-
-

-Finally, the auth server can establish the identity of a user for a -server. This is done by exchanging a server port and a user identity -if both match the same rendezvous port. The server port will be -returned to the user, while the server is informed about the id sets -of the user. The server can then serve or reject subsequent RPCs by -the user on the server port, based on the identity it received from -the auth server. -

-Anyone can write a server conforming to the auth protocol, but of -course all system servers use a trusted system auth server to -establish the identity of a user. If the user is not using the system -auth server, matching the rendezvous port will fail and no server port -will be returned to the user. Because this practically requires all -programs to use the same auth server, the system auth server is -minimal in every respect, and additional functionality is moved -elsewhere, so user freedom is not unnecessarily restricted. - -

Password Server

-
-

-The password server /servers/password runs as root and -returns a new authentication port in exchange for a unix password. -

-The ids corresponding to the authentication port match the unix user -and group ids. -

-Support for shadow passwords is implemented here. -

-

-The password server sits at /servers/password and runs as -root. It can hand out ports to the auth server in exchange for a unix -password, matching it against the password or shadow file. Several -utilities make use of this server, so they don't need to be setuid -root. - -

Process Server

-
-

-The superuser must remain control over user tasks, so: -

    -
  • All mach tasks are associated with a PID in the system default - proc server.
  • -
-

-Optionally, user tasks can store: -

    -
  • Their environment variables.
  • -
  • Their argument vector.
  • -
  • A port, which others can request based on the PID (like a - nameserver).
  • -
-

-Also implemented in the proc server: -

    -
  • Sessions and process groups.
  • -
  • Global configuration not in Mach, like hostname, hostid, system - version.
  • -
-
-

-The process server is responsible for some global bookkeeping. As -such it has to be trusted and is not replaceable by the user. -However, a user is not required to use any of its service. In that -case the user will not be able to take advantage of the POSIXish -appearance of the Hurd. -

-The Mach Tasks are not as heavy as POSIX processes. For example, -there is no concept of process groups or sessions in Mach. The proc -server fills in the gap. It provides a PID for all Mach tasks, and -also stores the argument line, environment variables and other -information about a process (if the mach tasks provide them, which is -usually the case if you start a process with the default -fork()/exec()). A process can also register -a message port with the proc server, which can then be requested by -anyone. So the proc server also functions as a nameserver using the -process id as the name. -

-The proc server also stores some other miscellaneous information not -provided by Mach, like the hostname, hostid and system version. -Finally, it provides facilities to group processes and their ports -together, as well as to convert between pids, process server ports and -mach task ports. -
-

-User tasks not registering themselve with proc only have a PID assigned. -

-Users can run their own proc server in addition to the system default, -at least for those parts of the interface that don't require superuser -privileges. -

-

-Although the system default proc server can't be avoided (all Mach -tasks spawned by users will get a pid assigned, so the system -administrator can control them), users can run their own additional -process servers if they want, implementing the features not requiring -superuser privileges. - -

Filesystems

-
-

-Store based filesystems -

    -
  • ext2fs
  • -
  • ufs
  • -
  • isofs (iso9660, RockRidge, GNU extensions)
  • -
  • fatfs (under development)
  • -
-

-Network file systems -

    -
  • nfs
  • -
  • ftpfs
  • -
-

-Miscellaneous -

    -
  • hostmux
  • -
  • usermux
  • -
  • tmpfs (under development)
  • -
-
-

-We already talked about translators and the file system service they -provide. Currently, we have translators for the ext2, ufs and iso9660 -filesystems. We also have an nfs client and an ftp filesystem. -Especially the latter is intriguing, as it provides transparent access -to ftp servers in the filesystem. Programs can start to move away -from implementing a plethora of network protocols, as the files are -directly available in the filesystem through the standard POSIX file -interface. - - -

Developing the Hurd

-
-

-Over a dozen libraries support the development of new servers. -

-For special server types highly specialized -libraries require only the implementation of a -number of callback functions. -

    -
  • Use libdiskfs for store based filesystems.
  • -
  • Use libnetfs for network filesystems, also for - virtual filesystems.
  • -
  • Use libtrivfs for simple filesystems providing only - a single file or directory.
  • -
-
-

-The Hurd server protocols are complex enough to allow for the -implementation of a POSIX compatible system with GNU extensions. -However, a lot of code can be shared by all or at least similar -servers. For example, all storage based filesystems need to be able to -read and write to a store medium splitted in blocks. The Hurd comes -with several libraries which make it easy to implement new servers. -Also, there are already a lot of examples of different server types in -the Hurd. This makes writing a new server easier. -

-libdiskfs is a library that supports writing store based -filesystems like ext2fs or ufs. It is not very useful for filesystems -which are purely virtual, like /proc or files in -/dev. -

-libnetfs is intended for filesystems which provide a rich -directory hierarchy, but don't use a backing store (for example ftpfs, -nfs). -

-libtrivfs is intended for filesystems which just provide -a single inode or directory. Most servers which are not intended to -provide a filesystem but other services (like -/servers/password) use it to provide a dummy file, so -that file operations on the servers node will not return errors. But -it can also be used to provide meaningful data in a single file, like -a device store or a character device. - -

Store Abstraction

-
-

-Another very useful library is libstore, which is used by all store -based filesystems. It provides a store media abstraction. A store -consists of a store class and a name (which itself can sometimes -contain stores). -

-Primitive store classes: -

    -
  • device store like device:hd2, device:hd0s1, device:fd0
  • -
  • file store like file:/tmp/disk_image
  • -
  • task store like task:PID
  • -
  • zero store like zero:4m (like /dev/zero, of size 4 MB)
  • -
-
-
-

-Composed store classes: -

    -
  • copy store like copy:zero:4m
  • -
  • gunzip/bunzip2 store like gunzip:device:fd0
  • -
  • concat store like concat:device:hd0s2:device:hd1s5
  • -
  • ileave store (RAID-0(2))
  • -
  • remap store like remap:10+20,50+:file:/tmp/blocks
  • -
  • ...
  • -
-

-Wanted: A similar abstraction for streams (based on channels), which -can be used by network and character device servers. -

-

-libstore provides a store abstraction, which is used by -all store based filesystems. The store is determined by a type and a -name, but some store types modify another store rather than providing -a new store, and thus stores can be stacked. For example, the device -store type expects a Mach device, but the remap store expects a list -of blocks to pick from another store, like remap:1+:device:hd2, which -would pick all blocks from hd2 but the first one, which skipped. -Because this functionality is provided in a library, all libstore -using filesystems support many different store kinds, and adding a new -store type is enough to make all store based filesystems support it. - -

Debian GNU/Hurd

-
-

-Goal: -

    -
  • Provide a binary distribution of the Hurd that is easy to - install.
  • -
-

-Constraints: -

    -
  • Use the same source packages as Debian GNU/Linux.
  • -
  • Use the same infrastructure: -
      -
    • Policy
    • -
    • Archive
    • -
    • Bug tracking system
    • -
    • Release process
    • -
  • -
-

-Side Goal: -

    -
  • Prepare Debian for the future: -
      -
    • More flexibility in the base system
    • -
    • Identify dependencies on the Linux kernel
    • -
  • -
-
-

-The Debian distribution of the GNU Hurd that I started in 1998 is -supposed to become a complete binary distribution of the Hurd that is -easy to install. - -

Status of the Debian GNU/Hurd binary archive

-

-See -http://buildd.debian.org/stats/graph.png -for the most current version of the statistic. - -

Status of the Debian infrastructure

-
-

-Plus: -

    -
  • Source packages can identify build and host OS using - dpkg-architecture.
  • -
-

-Minus: -

    -
  • The binary architecture field is insufficient.
  • -
  • The BTS has no architecture tag.
  • -
  • The policy/FHS need (small) Hurd specific extensions.
  • -
-
-

-While good compatibiity can be achieved at the source level, -the binary packages can not always express their relationship -to the available architectures sufficiently. -

-For example, the Linux version of makedev is binary-all, where -a binary-all-linux relationship would be more appropriate. -

-More work has to be done here to fix the tools. - -

Status of the Debian Source archive

-
-
    -
  • Most packages just work.
  • -
  • Maintainers are usually responsive and cooperative.
  • -
  • Turtle, the autobuilder, crunches through the whole list right - now.
  • -
-

-Common pitfalls are POSIX incompatibilities: -

    -
  • Upstream: -
      -
    • Unconditional use of PATH_MAX - (MAXPATHLEN), MAXHOSTNAMELEN.
    • -
    • Unguarded use of Linux kernel features.
    • -
    • Use of legacy interfaces (sys_errlist, - termio).
    • -
  • -
  • Debian: -
      -
    • Unguarded activation of extensions available with Linux.
    • -
    • Low quality patches.
    • -
    • Assuming GNU/Linux in package scripts.
    • -
  • -
-
-

-Most packages are POSIX compatible and can be compiled without -changes on the Hurd. The maintainers of the Debian source packages -are usually very kind, responsiver and helpful. -

-The Turtle autobuilder software (http://turtle.sourceforge.net) -builds the Debian packages on the Hurd automatically. - -

Debian GNU/Hurd: Good idea, bad idea?

-
-

-Upstream benefits: -

    -
  • Software packages become more portable.
  • -
-

-Debian benefits: -

    -
  • Debian becomes more portable.
  • -
  • Maintainers learn about portability and other systems.
  • -
  • Debian gets a lot of public recognition.
  • -
-

-GNU/Hurd benefits: -

    -
  • Large software base.
  • -
  • Great infrastructure.
  • -
  • Nice community to partner with.
  • -
-
-

-The sheet lists the advantages of all groups involved. - -

End

-
-

-Join us at -

-
-

-List of contacts. - -

-Some of these links are at other web sites not maintained by the -FSF. The FSF is not responsible for the content of these other web sites. - -

- -
- -[ - - - English -] - -
- -

-Return to GNU's home page. -

- -Please send FSF & GNU inquiries & questions to - -gnu@gnu.org. -There are also other ways to -contact the FSF. -

- -Please send comments on these web pages to - -web-hurd@gnu.org, -send other questions to -gnu@gnu.org. -

-Copyright (C) 2001 Marcus Brinkmann <marcus@gnu.org> -

-Verbatim copying and distribution of this entire article is -permitted in any medium, provided this notice is preserved. -

-Updated: - -$Date$ $Author$ - -


- - diff --git a/hurd.mdwn b/hurd.mdwn index 61e83d21..3c65bb4e 100644 --- a/hurd.mdwn +++ b/hurd.mdwn @@ -53,8 +53,6 @@ in the *unstable* branch of the Debian archive. * [[Critique]] - Analysis * [[Hurd_Hacking_Guide]] * [[Concepts]] -* Other resources - * [Docs at gnu.org](http://www.gnu.org/software/hurd/docs.html) # Using diff --git a/hurd/authentication.mdwn b/hurd/authentication.mdwn index cbb164c8..14144d8e 100644 --- a/hurd/authentication.mdwn +++ b/hurd/authentication.mdwn @@ -10,7 +10,7 @@ is included in the section entitled UIDs on the Hurd are separate from processes. A process has [[capabilities|capability]] designating so-called UID vectors that -are implemented by an [[auth]] server. This +are implemented by an [[translator/auth]] server. This makes them easily [[virtualizable|virtualization]]. When a process wishes to gain access to a resource provided by a third diff --git a/hurd/critique.mdwn b/hurd/critique.mdwn index 9770138e..dacd7bb8 100644 --- a/hurd/critique.mdwn +++ b/hurd/critique.mdwn @@ -8,8 +8,8 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU_Free_Documentation_License|/fdl]]."]]"""]] -[[NealWalfield]] and [[MarcusBrinkmann]] wrote a paper titled [*A Critique of -the GNU Hurd Multi-Server Operating +Neal Walfield and Marcus Brinkmann wrote a paper titled [*A Critique of +the GNU Hurd Multi-server Operating System*](http://walfield.org/papers/200707-walfield-critique-of-the-GNU-Hurd.pdf). This was published in ACM SIGOPS Operating Systems Review in July 2007. This is sometimes referred to as *the critique*. diff --git a/hurd/documentation.mdwn b/hurd/documentation.mdwn index bb37a8be..a8c3a988 100644 --- a/hurd/documentation.mdwn +++ b/hurd/documentation.mdwn @@ -1,4 +1,5 @@ -[[meta copyright="Copyright © 2008 Free Software Foundation, Inc."]] +[[meta copyright="Copyright © 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 +Free Software Foundation, Inc."]] [[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -8,10 +9,53 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU_Free_Documentation_License|/fdl]]."]]"""]] +# Introductory Material + * [[What_Is_the_GNU_Hurd]] * [[Advantages]] * [[FAQ]] - * + * [[*Towards_a_New_Strategy_of_OS_Design*|hurd-paper]], an architectural + overview by Thomas Bushnell, BSG. + + * [[*The_Hurd*|hurd-talk]], a presentation by Marcus Brinkmann. + + * [[*A_Critique_of_the_GNU_Hurd_Multi-server_Operating_System*|critique]], an + analysis of the GNU Hurd on GNU Mach system, written by Neal Walfield and + Marcus Brinkmann. + +## External + + * [*Examining the Legendary HURD + Kernel*](http://www.informit.com/articles/printerfriendly.aspx?p=1180992), + an article by David Chisnall. + + Also covers a bit of GNU's and the Hurd's history, fundamental techniques + applied, comparisions to other systems. + + +# Development + + * *[[The_GNU_Hurd_Reference_Manual|reference_manual]]*. + + * The *[[Hurd_Hacking_Guide]]*, an introduction to GNU Hurd and Mach + programming by Wolfgang Jährling. + + * [*Manually Bootstrapping a + Translator*](http://walfield.org/pub/people/neal/papers/hurd-misc/manual-bootstrap.txt), + a text by Neal Walfield about how to *manually connect the translator to + the filesystem*. + + * [[*The_Authentication_Server*|auth]], the transcript of a talk about the + details of the authentication mechanisms in the Hurd by Wolfgang Jährling. + + * [*The Mach Paging Interface as Used by the + Hurd*](http://lists.gnu.org/archive/html/l4-hurd/2002-06/msg00001.html), a + text by Neal Walfield. + + * In the + [[Position_paper_*Improving_Usability_via_Access_Decomposition_and_Policy*|ng/position_paper]] + Neal Walfield and Marcus Brinkmann give an overview about how a future, + subsequent system may be architected. diff --git a/hurd/documentation/auth.html b/hurd/documentation/auth.html new file mode 100644 index 00000000..487fc1fe --- /dev/null +++ b/hurd/documentation/auth.html @@ -0,0 +1,168 @@ +[[meta copyright="Copyright © 2002, 2008 Free Software Foundation, Inc."]] + +[[meta license="Verbatim copying and distribution of this entire article is +permitted in any medium, provided this notice is preserved."]] + +[[meta title="The Authentication Server, the transcript of a talk about the +details of the authentication mechanisms in the Hurd by Wolfgang Jährling"]] + +

Table of Contents

+ +
+ +

Introduction

+

+In this text, which mostly resembles the talk I gave at Libre Software +Meeting 2002 in Bordeaux, I will describe what the auth server does, +why it is so important and which cool things you can do with it, both +on the programming and the user side. I will also describe related +programs like the password and fakeauth servers. Note that this text +is targeted at programmers who want to understand the auth mechanism +in detail and are already familiar with concepts like Remote Procedure +Calls (RPCs) as well as the way User- and Group-IDs are used in the +POSIX world. + +

+The auth server is a very small server, therefore it gives a useful +example when you want to learn how a server typically looks like. One +reason why it is so small is that the auth interface, which it +implements, consists of only four RPCs. You can find the interface in +hurd/hurd/auth.defs and the server itself in hurd/auth/. + +

How IDs are represented and used

+

+Each process holds (usually) one port to auth (an auth_t in C source, +which actually is a mach_port_t, of course). The purpose of auth is +to manage User-IDs and Group-IDs, which is the reason why users often +will have no choice but to make use of the systems main auth server, +which does not listen on /servers/auth; instead you inherit a port to +auth from your parent process. Each such port is (internally in the +auth server) associated with a set of effective User- and Group-IDs as +well as a set of available User- and Group-IDs. So we have four sets +of IDs in total. The available IDs can be turned into corresponding +effective IDs at any time. + +

+When you send an auth_getids RPC on the port you hold, you will get +information about which IDs are associated with it, so you can figure +out which permissions you have. But how will a server know that you +have these permissions and therefore know which actions (e.g. writing +into file "foo") it is supposed to do on your behalf and which not? +The establishing of a trusted connection to a server works as follows: + +

    +
  1. A user wants a server to know its IDs
  2. +
  3. The user requests a reauthentication from the server
  4. +
  5. In this request the user will include a port
  6. +
  7. Both will hand this port to auth
  8. +
  9. The user uses auth_user_authenticate
  10. +
  11. The server uses auth_server_authenticate
  12. +
  13. The server also passes a new port to auth
  14. +
  15. auth matches these two requests
  16. +
  17. The user gets the new port from auth
  18. +
  19. The server learns about the IDs of the user
  20. +
  21. The user uses the new port for further communication
  22. +
+ +

+We have different RPCs for users and servers because what we pass and +what we get back differs for them: Users get a port, and servers get +the sets of IDs, and have to specify the port which the user will get. + +

+It is interesting to note that auth can match the requests by +comparing two integers, because when you get the same port from two +people, you will have the same mach_port_t (which is nothing but an +integer). + +

+All of this of course only works if they use the same auth server, +which is why I said often you have no choice other than to use the +one main auth server. But this is no serious restriction, as the auth server has +almost no functionality one might want to replace. In fact, there is +one replacement for the default auth implementation, but more on that +later. + +

POSIX and beyond

+

+Before we examine what is possible with this design, let us take a +short look at how the POSIX semantics are implemented on top of this +design. When a program that comes out of POSIX-land asks for its own +effective User- or Group-ID, we will tell it about the first of the +effective IDs. In the same sense, the POSIX real User- or Group-ID is +the first available ID and the POSIX saved User- or Group-ID is the +second available ID, which is why you have the same ID two times in +the available IDs when you log into your GNU/Hurd machine (you can +figure out which IDs you have with the program "ids", that basically +just does an auth_getauth RPC). When you lack one of those IDs (for +example when you have no effective Group-ID), a POSIX program asking +for this particular information will get "-1" as the ID. + +

+But as you can imagine, we can do more than what POSIX specifies. Fox +example, we can modify our permissions. This is always done with the +auth_makeauth RPC. In this RPC, you specify the IDs that should be +associated with the new port. All of these IDs must be associated +with either the port where the RPC is sent to or one of the additional +ports you can specify; an exception is the superuser root, which is +allowed to creat ports that are associated with arbitrary IDs. +Hereby you can convert available into effective IDs. + +

+This opens the door to a bunch of nice features. For example, we have +the addauth program in the Hurd, which makes it possible to add an ID +to either a single process or a group of processes if you hold the ID or know the +appropriate password, and there is a corresponding rmauth program that +removes an ID. So when you are working on your computer with GNU +Emacs and want to edit a system configuration file, you switch to +Emacs' shell-mode, do an "addauth root", enter the password, edit the +file, and when you are done switch back to shell-mode and do "rmauth +root". These programs have some interesting options, and there are +various other programs, for setting the complete list of IDs (setauth) +and so on. + +

Related servers

+

+Finally, I want to explain two servers which are related to auth. The +first is the password server, which listens on /servers/password. If +you pass to it a User- or Group-ID and the correct password for it, it +will return a port to auth to you which is associated with the ID you +passed to it. It can create such a port because it is running as +root. So let us assume you are an FTP server process. You will start +as root, because you want to use port 21 (in this case, "port" does +not refer to a mach_port_t, of course). But then, you can drop all +your permissions so that you run without any ID. This makes it far +less dangerous to communicate with yet unknown users over the +network. But when someone now hands a username and password to you, +you can ask the password server for a new auth port. The password +server will check the data you pass to it, for example by looking into +/etc/shadow, and if it is valid, it will ask the auth server for a new +port. It receives this port from auth and then passes it on to you. +So you have raised your permissions. (And for the very curious: Yes, +we are well aware of the differences between this concept and +capabilities; and we also do have some kinds of capabilities in +various parts of the Hurd.) + +

+My second example is the fakeauth server. It also implements the auth +protocol. It is the part of the fakeroot implementation that gives a +process the impression that it runs as root, even if it doesn't. So +when the process asks fakeauth about its own IDs, fakeauth will tell +the process that it runs as root. But when the process wants to make +use of the authentication protocol described earlier in this text, +fakeauth will forward the request to its own auth server, which will +usually be the systems main auth server, which will then be able to +match the auth_*_authenticate requests. So what fakeauth does is +acting as a proxy auth server that gives someone the impression to run +as root, while not modifying what that one is allowed to do. + +

+At this point, I have said at least most of what can be said about the +auth server and the protocol it implements, so I will finish by saying +that it might be an interesting task (for you) to modify some existing +software to take advantage of the features I described here. diff --git a/hurd/documentation/hurd-paper.html b/hurd/documentation/hurd-paper.html new file mode 100644 index 00000000..15d2daec --- /dev/null +++ b/hurd/documentation/hurd-paper.html @@ -0,0 +1,760 @@ +[[meta copyright="Copyright © 1996, 1997, 1998, 2007, 2008 Free Software +Foundation, Inc."]] + +[[meta license="Verbatim copying and distribution of this entire article is +permitted in any medium, provided this notice is preserved."]] + +[[meta title="Towards a New Strategy of OS Design, an architectural overview by +Thomas Bushnell, BSG."]] + + +This article explains why FSF is developing a new operating system named the +Hurd, which will be a foundation of the whole GNU system. +The Hurd is built +on top of CMU's Mach 3.0 kernel and uses Mach's virtual memory management and +message-passing facilities. +The GNU C Library will provide the Unix system +call interface, and will call the Hurd for needed services it can't provide +itself. +The design and implementation of the Hurd is being lead by Michael +Bushnell, with assistance from Richard Stallman, Roland McGrath, +Jan Brittenson, and others. + +

Part 1: A More Usable Approach to OS Design

+

+The fundamental purpose of an operating system (OS) is to enable a variety of +programs to share a single computer efficiently and productively. +This +demands memory protection, preemptively scheduled timesharing, coordinated +access to I/O peripherals, and other services. +In addition, an OS can allow +several users to share a computer. +In this case, efficiency demands services +that protect users from harming each other, enable them to share without +prior arrangement, and mediate access to physical devices. +

+On today's computer systems, programmers usually implement these goals +through a large program called the kernel. +Since this program must be +accessible to all user programs, it is the natural place to add functionality +to the system. +Since the only model for process interaction is that of +specific, individual services provided by the kernel, no one creates other +places to add functionality. +As time goes by, more and more is added to the +kernel. +

+A traditional system allows users to add components to a kernel only if they +both understand most of it and have a privileged status within the system. +Testing new components requires a much more painful edit-compile-debug cycle +than testing other programs. +It cannot be done while others are using the +system. +Bugs usually cause fatal system crashes, further disrupting others' +use of the system. +The entire kernel is usually non-pageable. +(There are +systems with pageable kernels, but deciding what can be paged is difficult +and error prone. +Usually the mechanisms are complex, making them difficult +to use even when adding simple extensions.) +

+Because of these restrictions, functionality which properly belongs +behind +the wall of a traditional kernel is usually left out of systems unless it is +absolutely mandatory. +Many good ideas, best done with an open/read/write +interface cannot be implemented because of the problems inherent in the +monolithic nature of a traditional system. +Further, even among those with +the endurance to implement new ideas, only those who are privileged users of +their computers can do so. +The software copyright system darkens the mire by +preventing unlicensed people from even reading the kernel source. +

+Some systems have tried to address these difficulties. +Smalltalk-80 and +the Lisp Machine both represented one method of getting around the problem. +System code is not distinguished from user code; all of the system is +accessible to the user and can be changed as need be. +Both systems were +built around languages that facilitated such easy replacement and extension, +and were moderately successful. +But they both were fairly poor at insulating +users and programs from each other, failing one of the principal goals of OS +design. +

+Most projects that use the Mach 3.0 kernel carry on the hard-to-change +tradition of OS design. +The internal structure is different, but the same +heavy barrier between user and system remains. +The single-servers, while +fairly easy to construct, inherit all the deficiencies of the monolithic +kernels. +

+A multi-server divides the kernel functionality up into logical blocks with +well-defined interfaces. +Properly done, it is easier to make changes and add +functionality. +So most multi-server projects do somewhat better. +Much more +of the system is pageable. +You can debug the system more easily. +You can +test new system components without interfering with other users. +But the +wall between user and system remains; no user can cross it without special +privilege. +

+The GNU Hurd, by contrast, is designed to make the area of +system +code as +limited as possible. +Programs are required to communicate only with a few +essential parts of the kernel; the rest of the system is replaceable +dynamically. +Users can use whatever parts of the remainder of the system +they want, and can easily add components themselves for other users to take +advantage of. +No mutual trust need exist in advance for users to use each +other's services, nor does the system become vulnerable by trusting the +services of arbitrary users. +

+This has been done by identifying those system components which users +must +use in order to communicate with each other. +One of these is responsible for +identifying users' identities and is called the + +authentication server. + +In +order to establish each other's identities, programs must communicate, each +with an authentication server they trust. +Another component establishes +control over system components by the superuser, provides global bookkeeping +operations, and is called the + +process server. + +

+Not all user programs need to communicate with the process server; it is only +necessary for programs which require its services. +Likewise, the +authentication server is only necessary for programs that wish to communicate +their identity to another. +None of the remaining services carry any special +status; not the network implementation, the filesystems, the program +execution mechanism (including setuid), or any others. + +

The Translator Mechanism

+

+The Hurd uses Mach ports primarily as methods for communicating between users +and servers. +(A Mach port is a communication point on a Mach task where +messages are sent and received.) Each port implements a particular set of +protocols, representing operations that can be undertaken on the underlying +object represented by the port. +Some of the protocols specified by the Hurd +are the I/O protocol, used for generic I/O operations; the file protocol, +used for filesystem operations; the socket protocol, used for network +operations; and the process protocol, used for manipulating processes et al. +

+Most servers are accessed by opening files. +Normally, when you open a file, +you create a port associated with that file that is owned by the server +that owns the directory containing the file. +For example, a disk-based +filesystem will normally serve a large number of ports, each of which +represents an open file or directory. +When a file is opened, the server +creates a new port, associates it with the file, and returns the port to the +calling program. +

+However, a file can have a +translator +associated with it. +In this case, +rather than return its own port which refers to the contents of the file, the +server executes a translator program associated with that file. +This +translator is given a port to the actual contents of the file, and is then +asked to return a port to the original user to complete the open operation. +

+This mechanism is used for +mount +by having a translator associated with +each mount point. +When a program opens the mount point, the translator (in +this case, a program which understands the disk format of the mounted +filesystem) is executed and returns a port to the program. +After the +translator is started, it need not be run again unless it dies; the parent +filesystem retains a port to the translator to use in further requests. +

+The owner of a file can associate a translator with it without special +permission. +This means that any program can be specified as a translator. +Obviously the system will not work properly if the translator does not +implement the file protocol correctly. +However, the Hurd is constructed so +that the worst possible consequence is an interruptible hang. +

+One way to use translators is to access hierarchically structured data using +the file protocol. +For example, all the complexity of the user interface to +the +ftp +program is removed. +Users need only know that a particular +directory represents FTP and can use all the standard file manipulation +commands (e.g +ls +or +cp) +to access the remote system, rather than learning +a new set. +Similarly, a simple translator could ease the complexity of +tar +or +gzip. +(Such transparent access would have some added cost, but it would +be convenient.) + +

Generic Services

+

+With translators, the filesystem can act as a rendezvous for interfaces which +are not similar to files. +Consider a service which implements some version +of the X protocol, using Mach messages as an underlying transport. +For each +X display, a file can be created with the appropriate program as its +translator. +X clients would open that file. +At that point, few file +operations would be useful (read and write, for example, would be useless), +but new operations ( +XCreateWindow +or +XDrawText) +might become meaningful. +In this case, the filesystem protocol is used only to manipulate +characteristics of the node used for the rendezvous. +The node need not +support I/O operations, though it should reply to any such messages with a +message_not_understood +return code. +

+This translator technique is used to contact most of the services in the Hurd +that are not structured like hierarchical filesystems. +For example, the +password server, which hands out authorization tags in exchange for +passwords, is contacted this way. +Network protocol servers are also +contacted in this fashion. +Roland McGrath thought up this use of translators. + +

Clever Filesystem Pictures

+

+In the Hurd, translators can also be used to present a filesystem-like view +of another part of the filesystem, with some semantics changed. +For example, +it would be nice to have a filesystem that cannot itself be changed, but +nonetheless records changed versions of its files elsewhere. +(This could be +useful for source code management.) +

+The Hurd will have a translator which creates a directory which is a +conceptual union of other directories, with collision resolution rules of +various sorts. +This can be used to present a single directory to users that +contains all the programs they would want to execute. +There are other useful +variations on this theme. + +

What The User Can Do

+

+No translator gains extra privilege by virtue of being hooked into the +filesystem. +Translators run with the uid of the owner of the file being +translated, and can only be set or changed by that owner. +The I/O and +filesystem protocols are carefully designed to allow their use by mutually +untrusting clients and servers. +Indeed, translators are just ordinary +programs. +The GNU C library has a variety of facilities to make common sorts +of translators easier to write. +

+Some translators may need special privileges, such as the password server or +translators which allow setuid execution. +These translators could be run by +anyone, but only if they are set on a root-owned node would they be able to +provide all their services successfully. +This is analogous to letting any +user call the +reboot +system call, but only honoring it if that user is root. + +

Why This Is So Different

+

+What this design provides is completely novel to the Unix world. +Until now, +OSs have kept huge portions of their functionality in the realm of system +code, thus preventing its modification and extension except in extreme need. +Users cannot replace parts of the system in their programs no matter how much +easier that would make their task, and system managers are loath to install +random tweaks off the net into their kernels. +

+In the Hurd, users can change almost all of the things that are decided for +them in advance by traditional systems. +In combination with the tremendous +control given by the Mach kernel over task address spaces and properties, the +Hurd provides a system in which users will, for the first time, be able to +replace parts of the system they dislike, without disrupting other users. +

+Most Mach-based OSs to date have mostly implemented a wider set of the + +same old + +Unix semantics in a new environment. +In contrast, GNU is extending +those semantics to allow users to improve, bypass, or replace them. + + +

Part 2: A Look at Some of the Hurd's Beasts

+

The Authentication Server

+

+One of the Hurd's more central servers is the authentication server. +Each +port to this server identifies a user and is associated by this server with +an +id block. +Each id block contains sets of user and group ids. +Either +set may be empty. +This server is not the same as the password server +referred to above. +

+The authentication server exports three services. +First, it provides simple +boolean operations on authentication ports: given two authentication ports, +this server will provide a third port representing the union of the two sets +of uids and gids. +Second, this server allows any user with a uid of zero to +create an arbitrary authentication port. +Finally, this server provides RPCs +(Remote Procedure Calls between different programs and possibly different +hosts) which allow mutually untrusting clients and servers to establish their +identities and pass initial information on each other. +This is crucial to +the security of the filesystem and I/O protocols. +

+Any user could write a program which implements the authentication protocol; +this does not violate the system's security. +When a service needs to +authenticate a user, it communicates with its trusted authentication server. +If that user is using a different authentication server, the transaction will +fail and the server can refuse to communicate further. +Because, in effect, +this forces all programs on the system to use the same authentication server, +we have designed its interface to make any safe operation possible, and to +include no extraneous operations. +(This is why there is a separate password +server.) +

The Process Server

+

+The process server acts as an information categorization repository. +There +are four main services supported by this server. +First, the process server +keeps track of generic host-level information not handled by the Mach kernel. +For example, the hostname, the hostid, and the system version are maintained +by the process server. +Second, this server maintains the Posix notions of +sessions and process groups, to help out programs that wish to use Posix +features. +

+Third, the process server maintains a one-to-one mapping between Mach tasks +and Hurd processes. +Every task is assigned a pid. +Processes can register a +message port with this server, which can then be given out to any program +which requests it. +This server makes no attempt to keep these message ports +private, so user programs are expected to implement whatever security they +need themselves. +(The GNU C Library provides convenient functions for all +this.) Processes can tell the process server their current `argv' and `envp' +values; this server will then provide, on request, these vectors of arguments +and environment. +This is useful for writing +ps-like +programs and also +makes it easier to hide or change this information. +None of these features +are mandatory. +Programs are free to disregard all of this and never register +themselves with the process server at all. +They will, however, still have a +pid assigned. +

+Finally, the process server implements +process collections, +which are used +to collect a number of process message ports at the same time. +Also, +facilities are provided for converting between pids, process server ports, +and Mach task ports, while ensuring the security of the ports managed. +

+It is important to stress that the process server is optional. +Because of +restrictions in Mach, programs must run as root in order to identify all the +tasks in the system. +But given that, multiple process servers could +co-exist, each with their own clients, giving their own model of the +universe. +Those process server features which do not require root privileges +to be implemented could be done as per-user servers. +The user's hands are +not tied. +

Transparent FTP

+

+Transparent FTP is an intriguing idea whose time has come. +The popular +ange-ftp +package available for GNU Emacs makes access to FTP files +virtually transparent to all the Emacs file manipulation functions. +Transparent FTP does the same thing, but in a system wide fashion. +This +server is not yet written; the details remain to be fleshed out, and will +doubtless change with experience. +

+In a BSD kernel, a transparent FTP filesystem would be no harder to write +than in the Hurd. +But mention the idea to a BSD kernel hacker, and the +response is that ``such a thing doesn't belong in the kernel''. +In a sense, +this is correct. +It violates all the layering principles of such systems to +place such things in the kernel. +The unfortunate side effect, however, is +that the design methodology (which is based on preventing users from changing +things they don't like) is being used to prevent system designers from making +things better. +(Recent BSD kernels make it possible to write a user program +that provides transparent FTP. +An example is +alex, +but it needs to run +with full root privileges.) +

+In the Hurd, there are no obstacles to doing transparent FTP. +A translator +will be provided for the node +/ftp. +The contents of +/ftp +will probably +not be directly listable, though further subdirectories will be. +There will +be a variety of possible formats. +For example, to access files on uunet, one +could + +cd /ftp/ftp.uu.net:anonymous:mib@gnu. + +Or to access files on a remote +account, one might + +cd /ftp/gnu.org:mib:passwd. + +Parts of this +command could be left out and the transparent FTP program would read them +from a user's +.netrc +file. +In the last case, one might just + +cd /ftp/gnu.org; + +when the rest of the data is already in +.netrc. +

+There is no need to do a +cd +first--use any file command. +To find out about +RFC 1097 (the Telnet Subliminal Message Option), just type + +more /ftp/ftp.uu.net/inet/rfc/rfc1097. + +A copy command to a local disk +could be used if the RFC would be read frequently. +

Filesystems

+

+Ordinary filesystems are also being implemented. +The initial release of the +Hurd will contain a filesystem upwardly compatible with the BSD 4.4 Fast File +System. +In addition to the ordinary semantics, it will provide means to +record translators, offer thirty-two bit user ids and group ids, and supply a +new id per file, called the +author +of the file, which can be set by the +owner arbitrarily. +In addition, because users in the Hurd can have multiple +uids (or even none), there is an additional set of permission bits providing +access control for + +unknown user + +(no uids) as distinct from + +known but arbitrary user + +(some uids: the existing +world +category of file +permissions). +

+The Network File System protocol will be implemented using 4.4 BSD as a +starting point. +A log-structured filesystem will also be implemented using +the same ideas as in Sprite, but probably not the same format. +A GNU network +file protocol may be designed in time, or NFS may be extended to remove its +deficiencies. +There will also be various ``little'' filesystems, such as the +MS-DOS filesystem, to help people move files between GNU and other OSs. + +

Terminals

+

+An I/O server will provide the terminal semantics of Posix. +The GNU C +Library has features for keeping track of the controlling terminal and for +arranging to have proper job control signals sent at the proper times, as +well as features for obeying keyboard and hangup signals. +

+Programs will be able to insert a terminal driver into communications +channels in a variety of ways. +Servers like +rlogind +will be able to insert +the terminal protocol onto their network communication port. +Pseudo-terminals will not be necessary, though they will be provided for +backward compatibility with older programs. +No programs in GNU will depend +on them. +

+Nothing about a terminal driver is forced upon users. +A terminal driver +allows a user to get at the underlying communications channel easily, to +bypass itself on an as-needed basis or altogether, or to substitute a +different terminal driver-like program. +In the last case, provided the +alternate program implements the necessary interfaces, it will be used by the +C Library exactly as if it were the ordinary terminal driver. +

+Because of this flexibility, the original terminal driver will not provide +complex line editing features, restricting itself to the behavior found in +Posix and BSD. +In time, there will be a +readline-based +terminal driver, +which will provide complex line-editing features for those users who want +them. +

+The terminal driver will probably not provide good support for the +high-volume, rapid data transmission required by UUCP or SLIP. +Those +programs do not need any of its features. +Instead they will be using the +underlying Mach device ports for terminals, which support moving large +amounts of data efficiently. + +

Executing Programs

+

+The implementation of the +execve +call is spread across three programs. +The +library marshals the argument and environment vectors. +It then sends a +message to the file server that holds the file to be executed. +The file +server checks execute permissions and makes whatever changes it desires in +the exec call. +For example, if the file is marked setuid and the fileserver +has the ability, it will change the user identification of the new image. +The file server also decides if programs which had access to the old task +should continue to have access to the new task. +If the file server is +augmenting permissions, or executing an unreadable image, then the exec needs +to take place in a new Mach task to maintain security. +

+After deciding the policy associated with the new image, the filesystem calls +the exec server to load the task. +This server, using the BFD (Binary File +Descriptor) library, loads the image. +BFD supports a large number of object +file formats; almost any supported format will be executable. +This server +also handles scripts starting with +#!, +running them through the indicated +program. +

+The standard exec server also looks at the environment of the new image; if +it contains a variable +EXECSERVERS +then it uses the programs specified +there as exec servers instead of the system default. +(This is, of course, +not done for execs that the file server has requested be kept secure.) +

+The new image starts running in the GNU C Library, which sends a message to +the exec server to get the arguments, environment, umask, current directory, +etc. +None of this additional state is special to the file or exec servers; +if programs wish, they can use it in a different manner than the Library. + +

New Processes

+

+The +fork +call is implemented almost entirely in the GNU C Library. +The new +task is created by Mach kernel calls. +The C Library arranges to have its +image inherited properly. +The new task is registered with the process server +(though this is not mandatory). +The C Library provides vectors of functions +to be called at fork time: one vector to be called before the fork, one after +in the parent, and one after in the child. +(These features should not be +used to replace the normal fork-calling sequence; it is intended for +libraries which need to close ports or clean up before a fork occurs.) +The C +library will implement both fork calls specified by the draft Posix.4a (the +proposed standard dealing with the threads extension to the real-time +extension). +

+Nothing forces the user to create new tasks this way. +If a program wants to +use almost the normal fork, but with some special characteristics, then it +can do so. +Hooks will be provided by the C Library, or the function can even +be completely replaced. +None of this is possible in a traditional Unix +system. + +

Asynchronous Messages

+

+As mentioned above, the process server maintains a + +message port + +for each +task registered with it. +These ports are public, and are used to send +asynchronous messages to the task. +Signals, for example, are sent to the +message port. +The signal message also provides a port as an indication that +the sender should be trusted to send the signal. +The GNU C Library lists a +variety of ports in a table, each of which identifies a set of signals that +can be sent by anyone who possesses that port. +For example, if the user +possesses the task's kernel port, it is allowed to send any signal. +If the +user possesses a special + +terminal id + +port, it is allowed to send the +keyboard and hangup signals. +Users can add arbitrary new entries into the C +library's signal permissions table. +

+When a process's process group changes, the process server will send it a +message indicating the new process group. +In this case, the process server +proves its authority by providing the task's kernel port. +

+The C library also has messages to add and delete uids currently used by the +process. +If new uids are sent to the program, the library adds them to its +current set, and then exchanges messages with all the I/O servers it knows +about, proving to them its new authorization. +Similarly, a message can +delete uids. +In the latter case, the caller must provide the process's task +port. +(You can't harm a process by giving it extra permission, but you can +harm it by taking permission away.) The Hurd will provide user programs to +send these messages to processes. +For example, the +su +command will be able +to cause all the programs in your current login session, to gain a new uid, +rather than spawn a subshell. +

+The C library will allow programs to add asynchronous messages they wish to +recognize, as well as prevent recognition of the standard set. +

Making It Look Like Unix

+

+The C Library will implement all of the calls from BSD and Posix as well as +some obvious extensions to them. +This enables users to replace those calls +they dislike or bypass them entirely, whereas in Unix the calls must be used +``as they come'' with no alternatives possible. +

+In some environments binary compatibility will also be supported. +This works +by building a special version of the library which is then loaded somewhere +in the address space of the process. +(For example, on a VAX, it would be +tucked in above the stack.) A feature of Mach, called system call +redirection, is then used to trap Unix system calls and turn them into jumps +into this special version of the library. +(On almost all machines, the cost +of such a redirection is very small; this is a highly optimized path in Mach. +On a 386 it's about two dozen instructions. +This is little worse than a +simple procedure call.) +

+Many features of Unix, such as signal masks and vectors, are handled +completely by the library. +This makes such features significantly cheaper +than in Unix. +It is now reasonable to use +sigblock +extensively to protect +critical sections, rather than seeking out some other, less expensive method. + +

Network Protocols

+

+The Hurd will have a library that will make it very easy to port 4.4 BSD +protocol stacks into the Hurd. +This will enable operation, virtually for +free, of all the protocols supported by BSD. +Currently, this includes the +CCITT protocols, the TCP/IP protocols, the Xerox NS protocols, and the ISO +protocols. +

+For optimal performance some work would be necessary to take advantage of +Hurd features that provide for very high speed I/O. +For most protocols this +will require some thought, but not too much time. +The Hurd will run the +TCP/IP protocols as efficiently as possible. +

+As an interesting example of the flexibility of the Hurd design, consider the +case of IP trailers, used extensively in BSD for performance. +While the Hurd +will be willing to send and receive trailers, it will gain fairly little +advantage in doing so because there is no requirement that data be copied and +avoiding copies for page-aligned data is irrelevant. diff --git a/hurd/documentation/hurd-talk.html b/hurd/documentation/hurd-talk.html new file mode 100644 index 00000000..d608e12a --- /dev/null +++ b/hurd/documentation/hurd-talk.html @@ -0,0 +1,1061 @@ +[[meta copyright="Copyright © 2001 Marcus Brinkmann"]] + +[[meta license="Verbatim copying and distribution of this entire article is +permitted in any medium, provided this notice is preserved."]] + +[[meta title="The Hurd, a presentation by Marcus Brinkmann"]] + + +

Table of Contents

+ +
+

Talk about the Hurd

+

+This talk about the Hurd was written by Marcus Brinkmann for +

    +
  • OSDEM, Brussels, 4. Feb 2001, +
  • Frühjahrsfachgespräche, Cologne, 2. Mar 2001 and +
  • Libre Software Meeting, Bordeaux, 4. Jul 2001. +
+ +

Introduction

+

+When we talk about free software, we usually refer to the free +software licenses. We also need relief from software patents, so our +freedom is not restricted by them. But there is a third type of +freedom we need, and that's user freedom. + +

+Expert users don't take a system as it is. They like to change the +configuration, and they want to run the software that works best for +them. That includes window managers as well as your favourite text +editor. But even on a GNU/Linux system consisting only of free +software, you can not easily use the filesystem format, network +protocol or binary format you want without special privileges. In +traditional unix systems, user freedom is severly restricted by the +system administrator. + +

+The Hurd removes these restrictions from the user. It provides an +user extensible system framework without giving up POSIX compatibility +and the unix security model. Throughout this talk, we will see that +this brings further advantages beside freedom. + +

Overview

+
+ +

+The Hurd is a POSIX compatible multi-server +system operating on top of the GNU Mach microkernel. + +

+Topics: +

    +
  • GNU Mach
  • +
  • The Hurd
  • +
  • Development
  • +
  • Debian GNU/Hurd
  • +
+
+ +

+The Hurd is a POSIX compatible multi-server system operating on top of +the GNU Mach Microkernel. + +

+I will have to explain what GNU Mach is, so we start with that. Then +I will talk about the Hurd's architecture. After that, I will give a +short overview on the Hurd libraries. Finally, I will tell you how +the Debian project is related to the Hurd. + +

Historicals

+ +
+
    +
  • 1983: Richard Stallman founds the GNU project.
  • +
  • 1988: Decision is made to use Mach 3.0 as the kernel.
  • +
  • 1991: Mach 3.0 is released under compatible license.
  • +
  • 1991: Thomas Bushnell, BSG, founds the Hurd project.
  • +
  • 1994: The Hurd boots the first time.
  • +
  • 1997: Version 0.2 of the Hurd is released.

  • +
  • 1998: Debian hurd-i386 archive is created.
  • +
  • 2001: Debian GNU/Hurd snapshot fills three CD images.
  • +
+
+ +

+When Richard Stallman founded the GNU project in 1983, he wanted to +write an operating system consisting only of free software. Very +soon, a lot of the essential tools were implemented, and released +under the GPL. However, one critical piece was missing: The kernel. +

+After considering several alternatives, it was decided not to write a +new kernel from scratch, but to start with the Mach microkernel. This +was in 1988, and it was not before 1991 that Mach was released under a +license allowing the GNU project to distribute it as a part of the +system. +

+In 1998, I started the Debian GNU/Hurd project, and in 2001 the number +of available GNU/Hurd packages fills three CD images. + +

Kernel Architectures

+
+

+Microkernel: +

    +
  • Enforces resource management (paging, scheduling)
  • +
  • Manages tasks
  • +
  • Implements message passing for IPC
  • +
  • Provides basic hardware support
  • +
+

+Monolithic kernel: +

    +
  • No message passing necessary
  • +
  • Rich set of features (filesystems, authentication, network + sockets, POSIX interface, ...)
  • +
+
+

+Microkernels were very popular in the scientific world around that +time. They don't implement a full operating system, but only the +infrastructure needed to enable other tasks to implement most +features. In contrast, monolithical kernels like Linux contain +program code of device drivers, network protocols, process management, +authentication, file systems, POSIX compatible interfaces and much +more. +

+So what are the basic facilities a microkernel provides? In general, +this is resource management and message passing. Resource management, +because the kernel task needs to run in a special privileged mode of +the processor, to be able to manipulate the memory management unit and +perform context switches (also to manage interrupts). Message +passing, because without a basic communication facility the other +tasks could not interact to provide the system services. Some +rudimentary hardware device support is often necessary to bootstrap +the system. So the basic jobs of a microkernel are enforcing the +paging policy (the actual paging can be done by an external pager +task), scheduling, message passing and probably basic hardware device +support. +

+Mach was the obvious choice back then, as it provides a rich set of +interfaces to get the job done. Beside a rather brain-dead device +interface, it provides tasks and threads, a messaging system allowing +synchronous and asynchronous operation and a complex interface for +external pagers. It's certainly not one of the sexiest microkernels +that exist today, but more like a big old mama. The GNU project +maintains its own version of Mach, called GNU Mach, which is based on +Mach 4.0. In addition to the features contained in Mach 4.0, the GNU +version contains many of the Linux 2.0 block device and network card +drivers. +

+A complete treatment of the differences between a microkernel and +monolithical kernel design can not be provided here. But a couple of +advantages of a microkernel design are fairly obvious. + +

Micro vs Monolithic

+
+

+Microkernel +

    +
  • Clear cut responsibilities +
  • Flexibility in operating system design, easier debugging
  • +
  • More stability (less code to break)
  • +
  • New features are not added to the kernel
  • +
+

+Monolithic kernel +

    +
  • Intolerance or creeping featuritis
  • +
  • Danger of spaghetti code
  • +
  • Small changes can have far reaching side effects
  • +
+
+

+Because the system is split up into several components, clean +interfaces have to be developed, and the responsibilities of each part +of the system must be clear. +

+Once a microkernel is written, it can be used as the base for several +different operating systems. Those can even run in parallel which +makes debugging easier. When porting, most of the hardware dependant +code is in the kernel. +

+Much of the code that doesn't need to run in the special kernel mode +of the processor is not part of the kernel, so stability increases +because there is simply less code to break. +

+New features are not added to the kernel, so there is no need to hold +the barrier high for new operating system features. +

+Compare this to a monolithical kernel, where you either suffer from +creeping featuritis or you are intolerant of new features (we see both +in the Linux kernel). +

+Because in a monolithical kernel, all parts of the kernel can access +all data structures in other parts, it is more likely that short cuts +are used to avoid the overhead of a clean interface. This leads to a +simple speed up of the kernel, but also makes it less comprehensible +and more error prone. A small change in one part of the kernel can +break remote other parts. + +

Single Server vs Multi Server

+
+

+Single Server +

    +
  • A single task implements the functionality of the operating system.
  • +
+

+Multi Server +

    +
  • Many tasks cooperate to provide the system's functionality.
  • +
  • One server provides only a small but well-defined part of the + whole system.
  • +
  • The responsibilities are distributed logically among the servers.
  • +
+

+A single-server system is comparable to a monolithic kernel system. It +has similar +advantages and disadvantages. +

+

+There exist a couple of operating systems based on Mach, but they all +have the same disadvantages as a monolithical kernel, because those +operating systems are implemented in one single process running on top +of the kernel. This process provides all the services a monolithical +kernel would provide. This doesn't make a whole lot of sense (the +only advantage is that you can probably run several of such isolated +single servers on the same machine). Those systems are also called +single-server systems. The Hurd is the only usable multi-server +system on top of Mach. In the Hurd, there are many server programs, +each one responsible for a unique service provided by the operating +system. These servers run as Mach tasks, and communicate using the +Mach message passing facilities. One of them does only provide a +small part of the functionality of the system, but together they build +up a complete and functional POSIX compatible operating system. + +

Multi Server is superior, ...

+
+

+Any multi-server has advantages over single-server: +

    +
  • Clear cut responsibilities
  • +
  • More stability: If one server dies, all others remain
  • +
  • Easier development cycle: Testing without reboot (or replacing + running servers), debugging with gdb
  • +
  • Easier to make changes and add new features +
+
+

+Using several servers has many advantages, if done right. If a file +system server for a mounted partition crashes, it doesn't take down +the whole system. Instead the partition is "unmounted", and +you can try to start the server again, probably debugging it this time +with gdb. The system is less prone to errors in individual +components, and over-all stability increases. The functionality of +the system can be extended by writing and starting new servers +dynamically. (Developing these new servers is easier for the reasons +just mentioned.) +

+But even in a multi-server system the barrier between the system and +the users remains, and special privileges are needed to cross it. We +have not achieved user freedom yet. + +

The Hurd even more so.

+
+

+The Hurd goes beyond all this, and allows users to write and run their +servers, too! +

    +
  • Users can replace system servers dynamically with their own + implementations.
  • +
  • Users can decide what parts of the remainder of the system they + want to use.
  • +
  • Users can extend the functionality of the system.
  • +
  • No mutual trust necessary to make use of other users + services.
  • +
  • Security of the system is not harmed by trusting users + services.
  • +
+
+

+To quote Thomas Bushnell, BSG, from his paper +[[``Towards_a_New_Strategy_of_OS_design''_(1996)|hurd-paper]]: +

+The GNU Hurd, by contrast, is designed to make the area of system code +as limited as possible. Programs are required to communicate only +with a few essential parts of the kernel; the rest of the system is +replaceable dynamically. Users can use whatever parts of the +remainder of the system they want, and can easily add components +themselves for other users to take advantage of. No mutual trust need +exist in advance for users to use each other's services, nor does the +system become vulnerable by trusting the services of arbitrary users. +
+ +

+So the Hurd is a set of servers running on top of the Mach +micro-kernel, providing a POSIX compatible and extensible operating +system. What servers are there? What functionality do they provide, +and how do they cooperate? + +

Mach Inter Process Communication

+
+

+Ports are message queues which can be used as one-way communication +channels. +

    +
  • Port rights are receive, send or send-once
  • +
  • Exactly one receiver
  • +
  • Potentially many senders
  • +
+

+MIG provides remote procedure calls on top of Mach IPC. RPCs look like +function calls to the user. +

+

+Inter-process communication in Mach is based on the ports concept. A +port is a message queue, used as a one-way communication channel. In +addition to a port, you need a port right, which can be a send right, +receive right, or send-once right. Depending on the port right, you +are allowed to send messages to the server, receive messages from it, +or send just one single message. +

+For every port, there exists exactly one task holding the receive +right, but there can be no or many senders. The send-once right is +useful for clients expecting a response message. They can give a +send-once right to the reply port along with the message. The kernel +guarantees that at some point, a message will be received on the reply +port (this can be a notification that the server destroyed the +send-once right). +

+You don't need to know much about the format a message takes to be +able to use the Mach IPC. The Mach interface generator mig hides the +details of composing and sending a message, as well as receiving the +reply message. To the user, it just looks like a function call, but +in truth the message could be sent over a network to a server running +on a different computer. The set of remote procedure calls a server +provides is the public interface of this server. + + +

How to get a port?

+
+

+Traditional Mach: +

    +
  • Nameserver provides ports to all registered servers.
  • +
  • The nameserver port itself is provided by Mach.
  • +
  • Like a phone book: One list.
  • +
+

+The Hurd: +

    +
  • The filesystem is used as the server namespace.
  • +
  • Root directory port is inserted into each task.
  • +
  • The C library finds other ports with hurd_file_name_lookup, + performing a pathname resolution.
  • +
  • Like a tree of phone books.
  • +
+
+

+So how does one get a port to a server? You need something like a +phone book for server ports, or otherwise you can only talk to +yourself. In the original Mach system, a special nameserver is +dedicated to that job. A task could get a port to the nameserver from +the Mach kernel and ask it for a port (with send right) to a server +that registered itself with the nameserver at some earlier time. +

+In the Hurd, there is no nameserver. Instead, the filesystem is used +as the server namespace. This works because there is always a root +filesystem in the Hurd (remember that the Hurd is a POSIX compatible +system); this is an assumption the people who developed Mach couldn't +make, so they had to choose a different strategy. You can use the +function hurd_file_name_lookup, which is part of the C library, to get +a port to the server belonging to a filename. Then you can start to +send messages to the server in the usual way. + +

Example of hurd_file_name_lookup

+
+mach_port_t identity;
+mach_port_t pwserver;
+kern_return_t err;
+
+pwserver = hurd_file_name_lookup
+                ("/servers/password");
+
+err = password_check_user (pwserver,
+                           0 /* root */, "supass",
+                           &identity);
+
+

+As a concrete example, the special filename +/servers/password can be used to request a port to the +Hurd password server, which is responsible to check user provided +passwords. +

+(explanation of the example) + +

Pathname resolution example

+
+

+Task: Lookup /mnt/readme.txt where /mnt has a mounted filesystem. +

    +
  • The C library asks the root filesystem server about + /mnt/readme.txt.
  • +
  • The root filesystem returns a port to the mnt filesystem server + (matching /mnt) and the retry name + /readme.txt.
  • +
  • The C library asks the mnt filesystem server about + /readme.txt.
  • +
  • The mnt filesystem server returns a port to itself and records + that this port refers to the regular file + /readme.txt.
  • +
+
+

+The C library itself does not have a full list of all available +servers. Instead pathname resolution is used to traverse through a +tree of servers. In fact, filesystems themselves are implemented by +servers (let us ignore the chicken and egg problem here). So all the +C library can do is to ask the root filesystem server about the +filename provided by the user (assuming that the user wants to resolve +an absolute path), using the dir_lookup RPC. If the +filename refers to a regular file or directory on the filesystem, the +root filesystem server just returns a port to itself and records that +this port corresponds to the file or directory in question. But if a +prefix of the full path matches the path of a server the root +filesystem knows about, it returns to the C library a port to this +server and the remaining part of the pathname that couldn't be +resolved. The C library than has to retry and query the other server +about the remaining path component. Eventually, the C library will +either know that the remaining path can't be resolved by the last +server in the list, or get a valid port to the server in question. + +

Mapping the POSIX Interface

+
+ + + + + + + + + + + + + + + + + + + +
FiledescriptorPort to server providing the file
fd = open(name,...)dir_lookup(..,name,..,&port)
+[pathname resolution]
read(fd, ...)io_read(port, ...)
write(fd, ...)io_write(port, ...)
fstat(fd, ...)io_stat(port, ...)
...
+
+

+It should by now be obvious that the port returned by the server can +be used to query the files status, content and other information from +the server, if good remote procedure calls to do that are defined and +implemented by it. This is exactly what happens. Whenever a file is +opened using the C libraries open() call, the C library +uses the above pathname resolution to get a port to a server providing +the file. Then it wraps a file descriptor around it. So in the Hurd, +for every open file descriptor there is a port to a server providing +this file. Many other C library calls like read() and +write() just call a corresponding RPC using the port +associated with the file descriptor. + +

File System Servers

+
+
    +
  • Provide file and directory services for ports (and more).
  • +
  • These ports are returned by a directory lookup.
  • +
  • Translate filesystem accesses through their root path (hence the + name translator).
  • +
  • The C library maps the POSIX file and directory interface (and + more) to RPCs to the filesystem servers ports, but also does work on + its own.
  • +
  • Any user can install file system servers on inodes they own.
  • +
+
+

+So we don't have a single phone book listing all servers, but rather a +tree of servers keeping track of each other. That's really like +calling your friend and asking for the phone number of the blond girl +at the party yesterday. He might refer you to a friend who hopefully +knows more about it. Then you have to retry. +

+This mechanism has huge advantages over a single nameserver. First, +note that standard unix permissions on directories can be used to +restrict access to a server (this requires that the filesystems +providing those directories behave). You just have to set the +permissions of a parent directory accordingly and provide no other way +to get a server port. +

+But there are much deeper implications. Most of all, a pathname never +directly refers to a file, it refers to a port of a server. That +means that providing a regular file with static data is just one of +the many options the server has to service requests on the file port. +A server can also create the data dynamically. For example, a server +associated with /dev/random can provide new random data +on every io_read() on the port to it. A server +associated with /dev/fortune can provide a new fortune +cookie on every open(). +

+While a regular filesystem server will just serve the data as stored +in a filesystem on disk, there are servers providing purely virtual +information, or a mixture of both. It is up to the server to behave +and provide consistent and useful data on each remote procedure call. +If it does not, the results may not match the expectations of the user +and confuse him. +

+A footnote from the Hurd info manual: +

+(1) You are lost in a maze of twisty little filesystems, all +alike.... +
+

+Because a server installed in the filesystem namespace translates all +filesystem operations that go through its root path, such a server is +also called "active translator". You can install translators using +the settrans command with the -a option. + +

Active vs Passive

+
+

+Active Translators: +

    +
  • "settrans -a /cdrom /hurd/isofs /dev/hd2"
  • +
  • Are running filesystem servers.
  • +
  • Are attached to the root node they translate.
  • +
  • Run as a normal process.
  • +
  • Go away with every reboot, or even time out.
  • +
+
+

+Many translator settings remain constant for a long time. It would be +very lame to always repeat the same couple of dozens settrans calls +manually or at boot time. So the Hurd provides a filesystem extension +that allows to store translator settings inside the filesystem and let +the filesystem servers do the work to start those servers on demand. +Such translator settings are called "passive translators". A passive +translator is really just a command line string stored in an inode of +the filesystem. If during a pathname resolution a server encounters +such a passive translator, and no active translator does exist already +(for this node), it will use this string to start up a new translator +for this inode, and then let the C library continue with the path +resolution as described above. Passive translators are installed with +settrans using the -p option (which is already the +default). + +
+

+Passive Translators: +

    +
  • "settrans /mnt /hurd/ext2fs /dev/hd1s1"
  • +
  • Are stored as command strings into an inode.
  • +
  • Are used to start a new active translator if there isn't + one.
  • +
  • Startup is transparent to the user.
  • +
  • Startup happens the first time the server is needed.
  • +
  • Are permanent across reboots (like file data).
  • +
+
+

+So passive translators also serve as a sort of automounting feature, +because no manual interaction is required. The server start up is +deferred until the service is need, and it is transparent to the user. +

+When starting up a passive translator, it will run as a normal process +with the same user and group id as those of the underlying inode. Any +user is allowed to install passive and active translators on inodes +that he owns. This way the user can install new servers into the +global namespace (for example, in his home or tmp directory) and thus +extend the functionality of the system (recall that servers can +implement other remote procedure calls beside those used for files and +directories). A careful design of the trusted system servers makes +sure that no permissions leak out. +

+In addition, users can provide their own implementations of some of +the system servers instead the system default. For example, they can +use their own exec server to start processes. The user specific exec +server could for example start java programs transparently (without +invoking the interpreter manually). This is done by setting the +environment variable EXECSERVERS. The systems default +exec server will evaluate this environment variable and forward the +RPC to each of the servers listed in turn, until some server accepts +it and takes over. The system default exec server will only do this +if there are no security implications. (XXX There are other ways to +start new programs than by using the system exec server. Those are +still available.) +

+Let's take a closer look at some of the Hurd servers. It was already +mentioned that only few system servers are mandatory for users. To +establish your identity within the Hurd system, you have to +communicate with the trusted systems authentication server +auth. To put the system administrator into control over +the system components, the process server does some global +bookkeeping. +

+But even these servers can be ignored. However, registration with the +authentication server is the only way to establish your identity +towards other system servers. Likewise, only tasks registered as +processes with the process server can make use of its services. + +

Authentication

+
+

+A user identity is just a port to an authserver. The auth server +stores four set of ids for it: +

    +
  • effective user ids
  • +
  • effective group ids
  • +
  • available user ids
  • +
  • available group ids
  • +
+

+Basic properties: +

    +
  • Any of these can be empty.
  • +
  • A 0 among the user ids identifies the superuser.
  • +
  • Effective ids are used to check if the user has the + permission.
  • +
  • Available ids can be turned into effective ids on user + request.
  • +
+
+

+The Hurd auth server is used to establish the identity of a user for a +server. Such an identity (which is just a port to the auth server) +consists of a set of effective user ids, a set of effective group ids, +a set of available user ids and a set of available group ids. Any of +these sets can be empty. + +

Operations on authentication ports

+
+

+The auth server provides the following operations on ports: +

    +
  • Merge the ids of two ports into a new one.
  • +
  • Return a new port containing a subset of the ids in a port.
  • +
  • Create a new port with arbitrary ids (superuser only).
  • +
  • Establish a trusted connection between users and servers.
  • +
+
+

+If you have two identities, you can merge them and request an identity +consisting of the unions of the sets from the auth server. You can +also create a new identity consisting only of subsets of an identity +you already have. What you can't do is extending your sets, unless +you are the superuser which is denoted by having the user id 0. + +

Establishing trusted connections

+
+
    +
  • User provides a rendezvous port to the server (with + io_reauthenticate).
  • +
  • User calls auth_user_authenticate on the + authentication port (his identity), passing the rendezvous port.
  • +
  • Server calls auth_server_authenticate on its + authentication port (to a trusted auth server), passing the + rendezvous port and the server port.
  • +
  • If both authentication servers are the same, it can match the + rendezvous ports and return the server port to the user and the user + ids to the server.
  • +
+
+

+Finally, the auth server can establish the identity of a user for a +server. This is done by exchanging a server port and a user identity +if both match the same rendezvous port. The server port will be +returned to the user, while the server is informed about the id sets +of the user. The server can then serve or reject subsequent RPCs by +the user on the server port, based on the identity it received from +the auth server. +

+Anyone can write a server conforming to the auth protocol, but of +course all system servers use a trusted system auth server to +establish the identity of a user. If the user is not using the system +auth server, matching the rendezvous port will fail and no server port +will be returned to the user. Because this practically requires all +programs to use the same auth server, the system auth server is +minimal in every respect, and additional functionality is moved +elsewhere, so user freedom is not unnecessarily restricted. + +

Password Server

+
+

+The password server /servers/password runs as root and +returns a new authentication port in exchange for a unix password. +

+The ids corresponding to the authentication port match the unix user +and group ids. +

+Support for shadow passwords is implemented here. +

+

+The password server sits at /servers/password and runs as +root. It can hand out ports to the auth server in exchange for a unix +password, matching it against the password or shadow file. Several +utilities make use of this server, so they don't need to be setuid +root. + +

Process Server

+
+

+The superuser must remain control over user tasks, so: +

    +
  • All mach tasks are associated with a PID in the system default + proc server.
  • +
+

+Optionally, user tasks can store: +

    +
  • Their environment variables.
  • +
  • Their argument vector.
  • +
  • A port, which others can request based on the PID (like a + nameserver).
  • +
+

+Also implemented in the proc server: +

    +
  • Sessions and process groups.
  • +
  • Global configuration not in Mach, like hostname, hostid, system + version.
  • +
+
+

+The process server is responsible for some global bookkeeping. As +such it has to be trusted and is not replaceable by the user. +However, a user is not required to use any of its service. In that +case the user will not be able to take advantage of the POSIXish +appearance of the Hurd. +

+The Mach Tasks are not as heavy as POSIX processes. For example, +there is no concept of process groups or sessions in Mach. The proc +server fills in the gap. It provides a PID for all Mach tasks, and +also stores the argument line, environment variables and other +information about a process (if the mach tasks provide them, which is +usually the case if you start a process with the default +fork()/exec()). A process can also register +a message port with the proc server, which can then be requested by +anyone. So the proc server also functions as a nameserver using the +process id as the name. +

+The proc server also stores some other miscellaneous information not +provided by Mach, like the hostname, hostid and system version. +Finally, it provides facilities to group processes and their ports +together, as well as to convert between pids, process server ports and +mach task ports. +
+

+User tasks not registering themselve with proc only have a PID assigned. +

+Users can run their own proc server in addition to the system default, +at least for those parts of the interface that don't require superuser +privileges. +

+

+Although the system default proc server can't be avoided (all Mach +tasks spawned by users will get a pid assigned, so the system +administrator can control them), users can run their own additional +process servers if they want, implementing the features not requiring +superuser privileges. + +

Filesystems

+
+

+Store based filesystems +

    +
  • ext2fs
  • +
  • ufs
  • +
  • isofs (iso9660, RockRidge, GNU extensions)
  • +
  • fatfs (under development)
  • +
+

+Network file systems +

    +
  • nfs
  • +
  • ftpfs
  • +
+

+Miscellaneous +

    +
  • hostmux
  • +
  • usermux
  • +
  • tmpfs (under development)
  • +
+
+

+We already talked about translators and the file system service they +provide. Currently, we have translators for the ext2, ufs and iso9660 +filesystems. We also have an nfs client and an ftp filesystem. +Especially the latter is intriguing, as it provides transparent access +to ftp servers in the filesystem. Programs can start to move away +from implementing a plethora of network protocols, as the files are +directly available in the filesystem through the standard POSIX file +interface. + + +

Developing the Hurd

+
+

+Over a dozen libraries support the development of new servers. +

+For special server types highly specialized +libraries require only the implementation of a +number of callback functions. +

    +
  • Use libdiskfs for store based filesystems.
  • +
  • Use libnetfs for network filesystems, also for + virtual filesystems.
  • +
  • Use libtrivfs for simple filesystems providing only + a single file or directory.
  • +
+
+

+The Hurd server protocols are complex enough to allow for the +implementation of a POSIX compatible system with GNU extensions. +However, a lot of code can be shared by all or at least similar +servers. For example, all storage based filesystems need to be able to +read and write to a store medium splitted in blocks. The Hurd comes +with several libraries which make it easy to implement new servers. +Also, there are already a lot of examples of different server types in +the Hurd. This makes writing a new server easier. +

+libdiskfs is a library that supports writing store based +filesystems like ext2fs or ufs. It is not very useful for filesystems +which are purely virtual, like /proc or files in +/dev. +

+libnetfs is intended for filesystems which provide a rich +directory hierarchy, but don't use a backing store (for example ftpfs, +nfs). +

+libtrivfs is intended for filesystems which just provide +a single inode or directory. Most servers which are not intended to +provide a filesystem but other services (like +/servers/password) use it to provide a dummy file, so +that file operations on the servers node will not return errors. But +it can also be used to provide meaningful data in a single file, like +a device store or a character device. + +

Store Abstraction

+
+

+Another very useful library is libstore, which is used by all store +based filesystems. It provides a store media abstraction. A store +consists of a store class and a name (which itself can sometimes +contain stores). +

+Primitive store classes: +

    +
  • device store like device:hd2, device:hd0s1, device:fd0
  • +
  • file store like file:/tmp/disk_image
  • +
  • task store like task:PID
  • +
  • zero store like zero:4m (like /dev/zero, of size 4 MB)
  • +
+
+
+

+Composed store classes: +

    +
  • copy store like copy:zero:4m
  • +
  • gunzip/bunzip2 store like gunzip:device:fd0
  • +
  • concat store like concat:device:hd0s2:device:hd1s5
  • +
  • ileave store (RAID-0(2))
  • +
  • remap store like remap:10+20,50+:file:/tmp/blocks
  • +
  • ...
  • +
+

+Wanted: A similar abstraction for streams (based on channels), which +can be used by network and character device servers. +

+

+libstore provides a store abstraction, which is used by +all store based filesystems. The store is determined by a type and a +name, but some store types modify another store rather than providing +a new store, and thus stores can be stacked. For example, the device +store type expects a Mach device, but the remap store expects a list +of blocks to pick from another store, like remap:1+:device:hd2, which +would pick all blocks from hd2 but the first one, which skipped. +Because this functionality is provided in a library, all libstore +using filesystems support many different store kinds, and adding a new +store type is enough to make all store based filesystems support it. + +

Debian GNU/Hurd

+
+

+Goal: +

    +
  • Provide a binary distribution of the Hurd that is easy to + install.
  • +
+

+Constraints: +

    +
  • Use the same source packages as Debian GNU/Linux.
  • +
  • Use the same infrastructure: +
      +
    • Policy
    • +
    • Archive
    • +
    • Bug tracking system
    • +
    • Release process
    • +
  • +
+

+Side Goal: +

    +
  • Prepare Debian for the future: +
      +
    • More flexibility in the base system
    • +
    • Identify dependencies on the Linux kernel
    • +
  • +
+
+

+The Debian distribution of the GNU Hurd that I started in 1998 is +supposed to become a complete binary distribution of the Hurd that is +easy to install. + +

Status of the Debian GNU/Hurd binary archive

+

+See +http://buildd.debian.org/stats/graph.png +for the most current version of the statistic. + +

Status of the Debian infrastructure

+
+

+Plus: +

    +
  • Source packages can identify build and host OS using + dpkg-architecture.
  • +
+

+Minus: +

    +
  • The binary architecture field is insufficient.
  • +
  • The BTS has no architecture tag.
  • +
  • The policy/FHS need (small) Hurd specific extensions.
  • +
+
+

+While good compatibiity can be achieved at the source level, +the binary packages can not always express their relationship +to the available architectures sufficiently. +

+For example, the Linux version of makedev is binary-all, where +a binary-all-linux relationship would be more appropriate. +

+More work has to be done here to fix the tools. + +

Status of the Debian Source archive

+
+
    +
  • Most packages just work.
  • +
  • Maintainers are usually responsive and cooperative.
  • +
  • Turtle, the autobuilder, crunches through the whole list right + now.
  • +
+

+Common pitfalls are POSIX incompatibilities: +

    +
  • Upstream: +
      +
    • Unconditional use of PATH_MAX + (MAXPATHLEN), MAXHOSTNAMELEN.
    • +
    • Unguarded use of Linux kernel features.
    • +
    • Use of legacy interfaces (sys_errlist, + termio).
    • +
  • +
  • Debian: +
      +
    • Unguarded activation of extensions available with Linux.
    • +
    • Low quality patches.
    • +
    • Assuming GNU/Linux in package scripts.
    • +
  • +
+
+

+Most packages are POSIX compatible and can be compiled without +changes on the Hurd. The maintainers of the Debian source packages +are usually very kind, responsiver and helpful. +

+The Turtle autobuilder software (http://turtle.sourceforge.net) +builds the Debian packages on the Hurd automatically. + +

Debian GNU/Hurd: Good idea, bad idea?

+
+

+Upstream benefits: +

    +
  • Software packages become more portable.
  • +
+

+Debian benefits: +

    +
  • Debian becomes more portable.
  • +
  • Maintainers learn about portability and other systems.
  • +
  • Debian gets a lot of public recognition.
  • +
+

+GNU/Hurd benefits: +

    +
  • Large software base.
  • +
  • Great infrastructure.
  • +
  • Nice community to partner with.
  • +
+
+

+The sheet lists the advantages of all groups involved. + +

End

+
+

+Join us at +

+
+

+List of contacts. diff --git a/hurd/hurd_hacking_guide.mdwn b/hurd/hurd_hacking_guide.mdwn index 0cb96f32..2ef08f8a 100644 --- a/hurd/hurd_hacking_guide.mdwn +++ b/hurd/hurd_hacking_guide.mdwn @@ -8,6 +8,16 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU_Free_Documentation_License|/fdl]]."]]"""]] -Originally written by Wolfgang Jährling, the [Hurd Hacking Guide](http://www.gnu.org/software/hurd/hacking-guide/hhg.html) -contains an overview of some of the Hurd's features. -Also contains a tutorial on writing your own [[translator]]. +Originally written by Wolfgang Jährling, the *Hurd Hacking Guide* contains an +introduction to GNU Hurd and GNU Mach programming, an overview of some of the +Hurd's features. It also contains a tutorial on writing your own +[[translator]]. + + * [HTML version](http://www.gnu.org/software/hurd/hacking-guide/hhg.html) for + browsing online, + * [PostScript version](http://www.gnu.org/software/hurd/hacking-guide/hhg.ps) + [187kB, 37 pages], + * [ASCII text + version](http://www.gnu.org/software/hurd/hacking-guide/hhg.txt) [59kB], + * [Texinfo source](http://www.gnu.org/software/hurd/hacking-guide/hhg.texi) + [60kB]. diff --git a/hurd/ng/position_paper.mdwn b/hurd/ng/position_paper.mdwn index 3240a41d..e0f4bf60 100644 --- a/hurd/ng/position_paper.mdwn +++ b/hurd/ng/position_paper.mdwn @@ -8,7 +8,8 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU_Free_Documentation_License|/fdl]]."]]"""]] -[[NealWalfield]] and [[MarcusBrinkmann]] wrote a paper titled [*Improving -Usability via Access Decomposition and Policy -Refinement*](http://walfield.org/papers/20070104-walfield-access-decomposition-policy-refinement.pdf). -This is sometimes referred to as *the position paper*. +Neal Walfield and Marcus Brinkmann wrote a paper titled [*Improving Usability +via Access Decomposition and Policy +Refinement*](http://walfield.org/papers/20070104-walfield-access-decomposition-policy-refinement.pdf) +where they give an overview about how a future, subsequent system may be +architected. This is sometimes referred to as *the position paper*. diff --git a/hurd/reference_manual.mdwn b/hurd/reference_manual.mdwn new file mode 100644 index 00000000..5b5bff2d --- /dev/null +++ b/hurd/reference_manual.mdwn @@ -0,0 +1,18 @@ +[[meta copyright="Copyright © 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 +Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +*The GNU Hurd Reference Manual* documents the architecture, the usage and the +programming of the GNU Hurd. At the moment, the manual is quite incomplete. + + * [HTML version](http://www.gnu.org/software/hurd/doc/hurd_toc.html) for + browsing online, + * [PostScript version](http://www.gnu.org/software/hurd/doc/hurd.ps) + [1020KiB, 91 pages]. diff --git a/hurd/running/distrib.mdwn b/hurd/running/distrib.mdwn index fc42e862..b0a6badd 100644 --- a/hurd/running/distrib.mdwn +++ b/hurd/running/distrib.mdwn @@ -94,7 +94,7 @@ about getting applications to work (if possible). * GNU [Coding Standards](http://www.gnu.org/prep/standards.html) * [[TestSuites]] - Posix, Perl, results feedback, etc. -* [docs and papers](http://www.gnu.org/software/hurd/docs.html) +* [[Documentation]] * [[SystemAPILimits]] * [[CodeAnnouncements]] - Recent coding projects related to the Hurd diff --git a/hurd/running/gnu/universal_package_manager.mdwn b/hurd/running/gnu/universal_package_manager.mdwn index 009b26bf..440f1122 100644 --- a/hurd/running/gnu/universal_package_manager.mdwn +++ b/hurd/running/gnu/universal_package_manager.mdwn @@ -127,7 +127,7 @@ OK. I will give you steps. i. Install a GNU System by folowing [[these_instructions|setup]] -ii. Read about GNU Design +ii. Read about GNU Design: [[Towards_a_New_Strategy_of_OS_Design|documentation/hurd-paper]] iii. Read about translators diff --git a/hurd/translator.mdwn b/hurd/translator.mdwn index b9952931..889f02a6 100644 --- a/hurd/translator.mdwn +++ b/hurd/translator.mdwn @@ -43,6 +43,7 @@ See some [[examples]] about how to use translators. # Existing Translators +* [[auth]] * [[pfinet]] * [[pflocal]] * [[hostmux]] diff --git a/hurd/translator/auth.mdwn b/hurd/translator/auth.mdwn new file mode 100644 index 00000000..73e7e025 --- /dev/null +++ b/hurd/translator/auth.mdwn @@ -0,0 +1,13 @@ +[[meta copyright="Copyright © 2008 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[*The_Authentication_Server*|documentation/auth]], the transcript of a talk +about the details of the authentication mechanisms in the Hurd by Wolfgang +Jährling. diff --git a/microkernel/faq/multiserver_microkernel.mdwn b/microkernel/faq/multiserver_microkernel.mdwn index da690425..68ed332c 100644 --- a/microkernel/faq/multiserver_microkernel.mdwn +++ b/microkernel/faq/multiserver_microkernel.mdwn @@ -22,5 +22,5 @@ use, but now, because the server runs in user space as the user that started it, they may, for instance, mount an FTP filesystem in their home directory. For more information about the design of the Hurd, read the paper by Thomas -Bushnell, BSG: [Towards a new strategy on OS -design](http://www.gnu.org/software/hurd/hurd-paper.html). +Bushnell, BSG: +[[Towards_a_New_Strategy_of_OS_Design|hurd/documentation/hurd-paper]]. diff --git a/microkernel/mach/documentation.mdwn b/microkernel/mach/documentation.mdwn index fe870386..f6f2eb79 100644 --- a/microkernel/mach/documentation.mdwn +++ b/microkernel/mach/documentation.mdwn @@ -1,4 +1,5 @@ -[[meta copyright="Copyright © 2007, 2008 Free Software Foundation, Inc."]] +[[meta copyright="Copyright © 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 +Free Software Foundation, Inc."]] [[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable id="license" text="Permission is granted to copy, distribute and/or modify this @@ -11,12 +12,27 @@ is included in the section entitled - [Meet Mach](http://www.stepwise.com/Articles/Technical/MeetMach.html), a summary of Mach's history and main concepts. + * *[[The_GNU_Mach_Reference_Manual|gnu_mach/reference_manual]]*. + - OSF's [Kernel Interface (ps)](ftp://ftp.cs.cmu.edu/afs/cs/project/mach/public/doc/osf/kernel_interface.ps) [Kernel Interface (pdf)](http://shakthimaan.com/downloads/hurd/kernel_interface.pdf) - OSF's [Kernel Principles (ps)](ftp://ftp.cs.cmu.edu/afs/cs/project/mach/public/doc/osf/kernel_principles.ps) [Kernel Principles (pdf)](http://shakthimaan.com/downloads/hurd/kernel_principles.pdf) + * [*The Unofficial GNU Mach IPC beginner's + guide*](http://hurdextras.nongnu.org/ipc_guide/), an easy introduction to + Inter Process Comunication in the Mach microkernel by Manuel Pavón + Valderrama. + + * [*Mach IPC without + MIG*](http://walfield.org/pub/people/neal/papers/hurd-misc/mach-ipc-without-mig.txt), + an exercise by Neal Walfield *to understand Mach IPC at one of its lowest + application levels*. + + * [*ipc-hello.c*](http://walfield.org/pub/people/neal/papers/hurd-misc/ipc-hello.c): + *Hello world à la mach ipc*. + - [Porting and Modifying the Mach 3.0 Microkernel](http://shakthimaan.com/downloads/hurd/Porting%20and%20Modifying%20the%20Mach%203.0%20Microkernel.pdf) - [An IO System for Mach](http://shakthimaan.com/downloads/hurd/An%20IO%20System%20for%20Mach.pdf) diff --git a/microkernel/mach/gnu_mach.mdwn b/microkernel/mach/gnu_mach.mdwn index cdb43a6f..19e1ea53 100644 --- a/microkernel/mach/gnu_mach.mdwn +++ b/microkernel/mach/gnu_mach.mdwn @@ -71,6 +71,7 @@ GNU/Hurd. # Development + * [[Reference_Manual]] * [[Building]] * [[Debugging]] * [[Boot_Trace]] diff --git a/microkernel/mach/gnu_mach/reference_manual.mdwn b/microkernel/mach/gnu_mach/reference_manual.mdwn new file mode 100644 index 00000000..225ab176 --- /dev/null +++ b/microkernel/mach/gnu_mach/reference_manual.mdwn @@ -0,0 +1,26 @@ +[[meta copyright="Copyright © 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 +Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +*The GNU Mach Reference Manual* documents the architecture, the usage and the +programming of the GNU Mach microkernel. At the moment, the manual documents +the interface completely, but is not very useful as a tutorial or introduction +into the Mach architecture. + + * [HTML version](http://www.gnu.org/software/hurd/gnumach-doc/index.html) + for browsing online, + * [PostScript + version](http://www.gnu.org/software/hurd/gnumach-doc/mach.ps) [around + 900KiB], + * [gzipped PostScript + version](http://www.gnu.org/software/hurd/gnumach-doc/mach.ps.gz) + [around 300KiB], + * [PDF version](http://www.gnu.org/software/hurd/gnumach-doc/mach.pdf) + [around 700KiB]. diff --git a/microkernel/mach/mig/documentation.mdwn b/microkernel/mach/mig/documentation.mdwn index 8c977e55..a0bbbe14 100644 --- a/microkernel/mach/mig/documentation.mdwn +++ b/microkernel/mach/mig/documentation.mdwn @@ -66,8 +66,7 @@ pp. 67--77." # Further Relevant Documentation - * The [GNU Mach Reference - Manual](http://www.gnu.org/software/hurd/docs.html#manuals), espacially + * The [[GNU_Mach_Reference_Manual|gnu_mach/reference_manual]], espacially [Chapter 4, Inter Process Communication](http://www.gnu.org/software/hurd/gnumach-doc/Inter-Process-Communication.html), which, for example, explains how the `dealloc` flag -- cgit v1.2.3