25 files changed, 1176 insertions, 978 deletions
diff --git a/community/gsoc/project_ideas.mdwn b/community/gsoc/project_ideas.mdwn
index bfd03ba6..00d3a702 100644
--- a/community/gsoc/project_ideas.mdwn
+++ b/community/gsoc/project_ideas.mdwn
@@ -1,4 +1,4 @@
-[[meta copyright="Copyright © 2008 Free Software Foundation, Inc."]]
+[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]]
 
 [[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable
 id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -36,980 +36,8 @@ you submit your first proposal, the sooner we can give feedback!
 Take a look at our [[application_template|student_application_form]] to get an
 idea what your application should contain.
 
-
-## Bindings to Other Programming Languages
-
-The main idea of the Hurd design is giving users the ability to easily
-modify/extend the system's functionality ([[extensible_system|extensibility]]).
-This is done by creating [[filesystem_translators|hurd/translator]] and other
-kinds of Hurd servers.
-
-However, in practice this is not as easy as it should, because creating
-translators and other servers is quite involved -- the interfaces for doing
-that are not exactly simple, and available only for C programs. Being able to
-easily create simple translators in RAD languages is highly desirable, to
-really be able to reap the advantages of the Hurd architecture.
-
-Originally Lisp was meant to be the second system language besides C in the GNU
-system; but that doesn't mean we are bound to Lisp. Bindings for any popular
-high-level language, that helps quickly creating simple programs, are highly
-welcome.
-
-Several approaches are possible when creating such bindings. One way is simply
-to provide wrappers to all the available C libraries ([[hurd/libtrivfs]], [[hurd/libnetfs]]
-etc.). While this is easy (it requires relatively little consideration), it may
-not be the optimal solution. It is preferable to hook in at a lower level, thus
-being able te create interfaces that are specially adapted to make good use of
-the features available in the respective language.
-
-These more specialised bindings could hook in at some of the lower level
-library interfaces ([[hurd/libports]], [[hurd/glibc]], etc.); use the
-[[microkernel/mach/MIG]]-provided [[microkernel/mach/RPC]] stubs directly; or
-even create native stubs directly from the interface definitions.
-
-The task is to create easy to use Hurd bindings for a language of the student's
-choice, and some example servers to prove that it works well in practice. This
-project will require gaining a very good understanding of the various Hurd
-interfaces. Skills in designing nice programming interfaces are a must.
-
-There has already been some [earlier work on Python
-bindings](http://www.sigill.org/files/pytrivfs-20060724-ro-test1.tar.bz2), that
-perhaps can be re-used. Also some work on [Perl
-bindings](http://www.nongnu.org/hurdextras/#pith) is availabled.
-
-### Lisp
-
-Most Lisp implementations provide a Foreign Function Interface (FFI) that
-enables the Lisp code to call functions written in another language.
-Specifically, most implementations provide an FFI to the C ABI (hence giving
-access to C, Fortran and possibly C++).
-
-Common Lisp has even a portability layer for such FFI,
-[CFFI](http://common-lisp.net/project/cffi/), so that you can write bindings
-purely in Lisp and use the same binding code on any implementation supported by
-CFFI.
-
-Many Scheme implementation also provide an FFI. [Scheme48](http://www.s48.org/)
-is even the implementation used to run scsh, a Scheme shell designed to provide
-instant access to POSIX functions.
-[Guile](http://www.gnu.org/software/guile/guile.html) is the GNU project's
-Scheme implementation, meant to be embeddable and provide access to C. At least
-[Gambit](http://dynamo.iro.umontreal.ca/~gambit/),
-[Chicken](http://www.call-with-current-continuation.org/),
-[Bigloo](http://www-sop.inria.fr/mimosa/fp/Bigloo/) and
-[PLT](http://www.plt-scheme.org/) are known to provide an FFI too.
-
-With respect to the packaging and dependencies, the good news is that Debian
-comes handy: 5 Common Lisp implementations are packaged, one of which has
-already been ported to Hurd (ECL), and CFFI is also packaged. As far as Scheme
-is concerned, 14 [R5RS](http://www.schemers.org/Documents/Standards/R5RS/)
-implementations are provided and 1 [R6RS](http://www.r6rs.org/).
-
-Possible mentors: Pierre THIERRY (nowhere_man) for Common Lisp or Scheme, and perhaps Python
-
-Exercise: Write some simple program(s) using Hurd-specific interfaces in the
-language you intend to work on. For a start, you could try printing the system
-uptime. A more advanced task is writing a simple variant of the hello
-translator (you can use the existing C imlementation as reference),
-implementing only open() and read() calls. Don't only write an implementations
-using the existing C libraries (libps, libtrivfs), but also try to work with
-the MiG-generated stubs directly. If you are ambitious, you could even try to
-write your own stubs...
-
-*Status*: Flavio Cruz has completed [[Lisp_bindings|flaviocruz]] for GSoC 2008!
-
-
-## Virtualization Using Hurd Mechanisms
-
-The main idea behind the Hurd design is to allow users to replace almost any
-system functionality ([[extensible_system|extensibility]]). Any user can easily
-create a subenvironment using some custom [[servers|hurd/translator]] instead
-of the default system servers. This can be seen as an
-[[advanced_lightweight_virtualization|hurd/virtualization]] mechanism, which
-allows implementing all kinds of standard and nonstandard virtualization
-scenarios.
-
-However, though the basic mechanisms are there, currently it's not easy to make
-use of these possibilities, because we lack tools to automatically launch the
-desired constellations.
-
-The goal is to create a set of powerful tools for managing at least one
-desirable virtualization scenario. One possible starting point could be the
-[[hurd/subhurd]]/[[hurd/neighborhurd]] mechanism, which allows a second almost totally
-independant instance of the Hurd in parallel to the main one. The current
-implementation has serious limitations though. A subhurd can only be started by
-root. There are no communication channels between the subhurd and the main one.
-There is no mechanism for safe sharing of hardware devices. Fixing this issues
-could turn subhurds into a very powerful solution for lightweight
-virtualization using so-called logical partitions. (Similar to Linux-vserver,
-OpenVZ etc.)
-
-While subhurd allow creating a complete second system instance, with an own set
-of Hurd servers and [[UNIX]] daemons and all, there are also situations where it is
-desirable to have a smaller subenvironment, living withing the main system and
-using most of its facilities -- similar to a chroot environment. A simple way
-to create such a subenvironment with a single command would be very helpful.
-
-It might be possible to implement (perhaps as a prototype) a wrapper using
-existing tools (chroot and [[hurd/translator/unionfs]]); or it might require more specific tools,
-like some kind of unionfs-like filesytem proxy that mirrors other parts of the
-filesystem, but allows overriding individual locations, in conjuction with
-either chroot or some similar mechanism to create a subenvironment with a
-different root filesystem.
-
-It's also desirable to have a mechanism allowing a user to set up such a custom
-environment in a way that it will automatically get launched on login --
-practically allowing the user to run a customized operating system in his own
-account.
-
-Yet another interesting scenario would be a subenvironment -- using some kind
-of special filesystem proxy again -- in which the user serves as root, being
-able to create local sub-users and/or sub-groups.
-
-This would allow the user to run "dangerous" applications (webbrowser, chat
-client etc.) in a confined fashion, allowing it access to only a subset of the
-user's files and other resources. (This could be done either using a lot of
-groups for individual resources, and lots of users for individual applications;
-adding a user to a group would give the corresponding application access to the
-corresponding resource -- an advanced [[ACL]] mechanism. Or leave out the groups,
-assigning the resources to users instead, and use the Hurd's ability for a
-process to have multiple user IDs, to equip individual applications with sets
-of user IDs giving them access to the necessary resources -- basically a
-[[capability]] mechanism.)
-
-The student will have to pick (at least) one of the described scenarios -- or
-come up with some other one in a similar spirit -- and implement all the tools
-(scripts, translators) necessary to make it available to users in an
-easy-to-use fashion. While the Hurd by default already offers the necessary
-mechanisms for that, these are not perfect and could be further refined for
-even better virtualization capabilities. Should need or desire for specific
-improvements in that regard come up in the course of this project, implementing
-these improvements can be considered part of the task.
-
-Completing this project will require gaining a very good understanding of the
-Hurd architecture and spirit. Previous experience with other virtualization
-solutions would be very helpful.
-
-Possible mentors: Olaf Buddenhagen (antrik)
-
-Exercise: Make some modification to the "boot" programm used to start subhurds.
-(More specific suggestions welcome... :-) )
-
-*Status*: Zheng da has has implemented [[network_virtualization|zhengda]] (an
-important prerequisite for unprivileged subhurds) for GSoC 2008, along with
-various other interesting bits, including a mechanism to override socket
-servers; a proc proxy that allows running processes/subenvironments with a
-pseudo device master port; and a mechanism to pass arbitrary virtual devices to
-a subhurd. He is still working on running subhurds by normal users.
-
-
-## Namspace-based Translator Selection
-
-The main idea behind the Hurd is to make (almost) all system functionality
-user-modifiable ([[extensible_system|extensibility]]). This includes a
-user-modifiable filesystem: the whole filesystem is implemented decentrally, by
-a set of filesystem servers forming the directory tree together, a
-[[hurd/virtual_file_system]]. These filesystem servers are called
-[[translators|hurd/translator]], and are the most visible feature of the Hurd.
-
-The reason they are called translators is because when you set a translator on
-a filesystem node, the underlying node(s) are hidden by the translator, but the
-translator itself can access them, and present their contents in a different
-format -- translate them. A simple example is a
-[[gunzip_translator|hurd/translator/storeio]], which can be set on a gzipped
-file, and presents a virtual file with the uncompressed contents. Or the other
-way around. Or a translator that presents an
-[[XML_file_as_a_directory_tree|hurd/translator/xmlfs]]. Or an mbox as a set of
-individual files for each mail ([[hurd/translator/mboxfs]]); or ever further
-breaking it down into headers, body, attachements...
-
-This gets even more powerful when translators are used as building blocks for
-larger applications: A mail reader for example doesn't need backends for
-understanding various mailbox formats anymore. All formats can be parsed by
-special translators, and the mail reader gets the data as a uniform, directly
-usable filesystem structure. Translators can also be stacked: If you have a
-compressed mailbox for example, first apply a gunzip translator, and then an
-mbox translator on top of that.
-
-There are a few problems with the way translators are set, though. For one,
-once a translator is set on a node, you always see the translated content. If
-you need the untranslated contents again, to do a backup for example, you first
-need to remove the translator again. Also, having to set a translator
-explicitely before accessing the contents is pretty cumbersome, making this
-feature almost useless.
-
-A possible solution is implementing a mechanism for selecting translators
-through special filename attributes. For example you could use
-`index.html.gz,,+` and `index.html.gz,,-` to choose between translated and
-untranslated versions of a file. Or you could use `index.html.gz,,u` to get
-the contents of the file with a gunzip translator applied automatically. You
-could also use attributes on whole directory trees: `.,,0/` would give you a
-directory tree corresponding to the current directory, but with any translators
-disabled, for doing a backup. And `site,,u/*.html.gz` would present a whole
-directory tree of compressed HTML files as uncompressed files.
-
-One benefit of the Hurd's flexibility is that it should be possible to
-implement such a mechanism without touching the existing Hurd components:
-Rather, just implement a special proxy, that mirrors the normal filesystem, but
-is able to interpret the special extensions and present transformed files in
-place of the original ones.
-
-In the long run it's probably desirable to have the mechanism implemented in
-the standard name lookup mechanism, so it will be available globally, and avoid
-the overhead of a proxy; but for the beginnig the proxy solution is much more
-flexible.
-
-The goal of this project is implementing a prototype proxy; perhaps also a
-first version of the global variant as proof of concept, if time permits. It
-requires good understanding of the name lookup mechanism, and translator
-programming; but the implementation should not be too hard. Perhaps the hardest
-part is finding a convenient, flexible, elegant, hurdish method for mapping the
-special extensions to actual translators...
-
-Possible mentors: Olaf Buddenhagen (antrik)
-
-Exercise: Try to make some modification to the existing unionfs and/or firmlink
-translators. (More specific suggestions welcome... :-) )
-
-*Status*: Sergiu Ivanov has been working *voluntarily* on
-[[namespace-based_translator_selection|scolobb]], as an inofficial GSoC 2008
-participant! Not all the desired functionality is in place yet; work is
-ongoing.
-
-
-## Fix File Locking
-
-Over the years, [[UNIX]] has aquired a host of different file locking mechanisms.
-Some of them work on the Hurd, while others are buggy or only partially
-implemented. This breaks many applications.
-
-The goal is to make all file locking mechanisms work properly. This requires
-finding all existing shortcomings (through systematic testing and/or checking
-for known issues in the bug tracker and mailing list archives), and fixing
-them.
-
-This task will require digging into parts of the code to understand how file
-locking works on the Hurd. Only general programming skills are required.
-
-Possible mentors: Samuel Thibault (youpi)
-
-Exercise: Find one of the existing issues, either by looking at the task/bug
-trackers on savannah, or by trying things out yourself; and take a go at it.
-Probably you wont' be able to fix the problem in a limited amount of time, but
-you should be able to do a detailed analysis of the issue at least.
-
-
-## `procfs`
-
-Although there is no standard (POSIX or other) for the layout of the `/proc`
-pseudo-filesystem, it turned out a very useful facility in GNU/Linux and other
-systems, and many tools concerned with process management use it. (`ps`, `top`,
-`htop`, `gtop`, `killall`, `pkill`, ...)
-
-Instead of porting all these tools to use [[hurd/libps]] (Hurd's official method for
-accessing process information), they could be made to run out of the box, by
-implementing a Linux-compatible `/proc` filesystem for the Hurd.
-
-The goal is to implement all `/proc` functionality needed for the various process
-management tools to work. (On Linux, the `/proc` filesystem is used also for
-debugging purposes; but this is highly system-specific anyways, so there is
-probably no point in trying to duplicate this functionality as well...)
-
-The [[existing_partially_working_procfs_implementation|hurd/translator/procfs]]
-can serve as a starting point, but needs to be largely rewritten. (It should
-use [[hurd/libnetfs]] rather than [[hurd/libtrivfs]]; the data format needs to
-change to be more Linux-compatible; and it needs adaptation to newer system
-interfaces.)
-
-This project requires learning [[hurd/translator]] programming, and
-understanding some of the internals of process management in the Hurd. It
-should not be too hard coding-wise; and the task is very nicely defined by the
-exising Linux `/proc` interface -- no design considerations necessary.
-
-**Note**: We already have several applications for this task.
-
-Possible mentors: Olaf Buddenhagen (antrik)
-
-Exercise: Add or fix one piece in the existing procfs translator.
-
-*Status*: Madhusudan.C.S has implemented a new, fully functional [[procfs|madhusudancs]] for
-GSoC 2008. He is still working on some outstanding issues.
-
-
-## New Driver Glue Code
-
-Although a driver framework in userspace would be desirable, presently the Hurd
-uses kernel drivers in the microkernel,
-[[GNU_Mach|microkernel/mach/gnumach]]. (And changing this would be far beyond a
-GSoC project...)
-
-The problem is that the drivers in GNU Mach are presently old Linux drivers
-(mostly from 2.0.x) accessed through a glue code layer. This is not an ideal
-solution, but works quite OK, except that the drivers are very old. The goal of
-this project is to redo the glue code, so we can use drivers from current Linux
-versions, or from one of the free BSD variants.
-
-Using [ddekit](http://demo.tudos.org/dsweeper_tutorial.html) instead of our
-own glue code can be explored as a possible alternative approach.
-
-This is a doable, but pretty involved project. Experience with driver
-programming under Linux (or BSD) is a must. (No Hurd-specific knowledge is
-required, though.)
-
-This is [[GNU_Savannah_task 5488]].
-
-Possible mentors: Samuel Thibault (youpi)
-
-Exercise: Try porting one driver from Linux 2.6 to run in the old framework.
-The port needn't be elegant or complete; but it would be nice if you could get
-it to work at least partially...
-
-
-## Server Overriding Mechanism
-
-The main idea of the Hurd is that every user can influence almost all system
-functionality ([[extensible_system|extensibility]]), by running private Hurd
-servers that replace or proxy the global default implementations.
-
-However, running such a cumstomized subenvironment presently is not easy,
-because there is no standard mechanism to easily replace an individual standard
-server, keeping everything else. (Presently there is only the [[hurd/subhurd]]
-method, which creates a completely new system instance with a completely
-independent set of servers.)
-
-The goal of this project is to provide a simple method for overriding
-individual standard servers, using environment variables, or a special
-subshell, or something like that.
-
-Various approaches for such a mechanism has been discussed before.
-Probably the easiest (1) would be to modify the Hurd-specific parts of [[hurd/glibc]],
-which are contacting various standard servers to implement certain system
-calls, so that instead of always looking for the servers in default locations,
-they first check for overrides in environment variables, and use these instead
-if present.
-
-A somewhat more generic solution (2) could use some mechanism for arbitrary
-client-side namespace overrides. The client-side part of the filename lookup
-mechanism would have to check an override table on each lookup, and apply the
-desired replacement whenever a match is found.
-
-Another approach would be server-side overrides. Again there are various
-variants. The actual servers themself could provide a mechanism to redirect to
-other servers on request. (3) Or we could use some more generic server-side
-namespace overrides: Either all filesystem servers could provide a mechanism to
-modify the namespace they export to certain clients (4), or proxies could be
-used that mirror the default namespace but override certain locations. (5)
-
-Variants (4) and (5) are the most powerful. They are intimately related to
-chroots: (4) is like the current chroot implementation works in the Hurd, and
-(5) has been proposed as an alternative. The generic overriding mechanism could
-be implemented on top of chroot, or chroot could be implemented on top of the
-generic overriding mechanism. But this is out of scope for this project...
-
-In practice, probably a mix of the different approaches would prove most useful
-for various servers and use cases. It is strongly recommended that the student
-starts with (1) as the simplest approach, perhaps augmenting it with (3) for
-certain servers that don't work with (1) because of indirect invocation.
-
-This tasks requires some understanding of the Hurd internals, especially a good
-understanding of the file name lookup mechanism. It's probably not too heavy on
-the coding side.
-
-This is [[GNU_Savannah_task 6612]]. Also there are quite a bit of emails
-discussing this topic, from a last year's GSoC application -- see
-<http://lists.gnu.org/archive/html/bug-hurd/2007-03/msg00050.html>,
-<http://lists.gnu.org/archive/html/bug-hurd/2007-03/msg00114.html>,
-<http://lists.gnu.org/archive/html/bug-hurd/2007-06/msg00082.html>,
-<http://lists.gnu.org/archive/html/bug-hurd/2008-03/msg00039.html>.
-
-Possible mentors: Olaf Buddenhagen (antrik)
-
-Exercise: Come up with a glibc patch that allows overriding one specific
-standard server using method (1).
-
-*Status*: Overriding of socket servers through environment variables has been
-implemented by Zheng Da for GSoC 2008, as part of his
-[[network_virtualization|zhengda]] project.
-
-
-## `dtrace` Support
-
-One of the main problems of the current Hurd implementation is very poor
-performance. While we have a bunch of ideas what could cause the performance
-problems, these are mostly just guesses. Better understanding what really
-causes bad performance is necessary to improve the situation.
-
-For that, we need tools for performance measurements. While all kinds of more
-or less specific profiling tools could be convieved, the most promising and
-generic approach seems to be a framework for logging certain events in the
-running system (both in the microkernel and in the Hurd servers). This would
-allow checking how much time is spent in certain modules, how often certain
-situations occur, how things interact, etc. It could also prove helpful in
-debugging some issues that are otherwise hard to find because of complex
-interactions.
-
-The most popular framework for that is Sun's dtrace; but there might be others.
-The student has to evaluate the existing options, deciding which makes most
-sense for the Hurd; and implement that one. (Apple's implementation of dtrace
-in their Mach-based kernel might be helpful here...)
-
-This project requires ability to evaluate possible solutions, and experience
-with integrating existing components as well as low-level programming.
-
-Possible mentors: Samuel Thibault (youpi)
-
-Exercise: In lack of a good exercise directly related to this taks, just pick
-one of the kernel-related or generally low-level tasks from the bug/task
-trackers on savannah, and make a go at it. You might not be able to finish the
-task in a limited amount of time, but you should at least be able to make a
-detailed analysis of the issue.
-
-*Status*: Andei Barbu was working on
-[SystemTap](http://csclub.uwaterloo.ca/~abarbu/hurd/) for GSoC 2008, but it
-turned out too Linux-specific. He implemented kernel probes, but there is no
-nice frontend yet.
-
-
-## Hurdish TCP/IP Stack
-
-The Hurd presently uses a [[TCP/IP_stack|hurd/translator/pfinet]] based on code from an old Linux version.
-This works, but lacks some rather important features (like PPP/PPPoE), and the
-design is not hurdish at all.
-
-A true hurdish network stack will use a set of stack of [[hurd/translator]] processes,
-each implementing a different protocol layer. This way not only the
-implementation gets more modular, but also the network stack can be used way
-more flexibly. Rather than just having the standard socket interface, plus some
-lower-level hooks for special needs, there are explicit (perhaps
-filesystem-based) interfaces at all the individual levels; special application
-can just directly access the desired layer. All kinds of packet filtering,
-routing, tunneling etc. can be easily achieved by stacking compononts in the
-desired constellation.
-
-While the general architecture is pretty much given by the various network
-layers, it's up to the student to design and implement the various interfaces
-at each layer. This task requires understanding the Hurd philosophy and
-translator programming, as well as good knowledge of TCP/IP.
-
-This is [[GNU_Savannah_task 5469]].
-
-Possible mentors: ?
-
-Exercise: Make some modification to the existing pfinet implementation. (More
-specific suggestions welcome... :-) )
-
-
-## Improved NFS Implementation
-
-The Hurd has both NFS server and client implementations, which work, but not
-very well: File locking doesn't work properly (at least in conjuction with a
-GNU/Linux server), and performance is extremely poor. Part of the problems
-could be owed to the fact that only NFSv2 is supported so far.
-
-This project encompasses implementing NFSv3 support, fixing bugs and
-performance problems -- the goal is to have good NFS support. The work done in
-a previous unfinished GSoC project can serve as a starting point.
-
-Both client and server parts need work, though the client is probably much more
-important for now, and shall be the major focus of this project.
-
-This task, [[GNU_Savannah_task 5497]], has no special prerequisites besides general programming skills, and
-an interest in file systems and network protocols.
-
-Possible mentors: ?
-
-Exercise: Make a go at one of the known issues in the NFS client. You might not
-be able to finish this in the limited amount of time, but you should at least
-be able to make a detailed analysis of the issue.
-
-
-## Fix `libdiskfs` Locking Issues
-
-Nowadays the most often encountered cause of Hurd crashes seems to be lockups
-in the [[hurd/translator/ext2fs]] server. One of these could be traced
-recently, and turned out to be a lock inside [[hurd/libdiskfs]] that was taken
-and not released in some cases. There is reason to believe that there are more
-faulty paths causing these lockups.
-
-The task is systematically checking the [[hurd/libdiskfs]] code for this kind of locking
-issues. To achieve this, some kind of test harness has to be implemented: For
-exmple instrumenting the code to check locking correctness constantly at
-runtime. Or implementing a unit testing framework that explicitely checks
-locking in various code paths. (The latter could serve as a template for
-implementing unit checks in other parts of the Hurd codebase...)
-
-This task requires experience with debugging locking issues in multithreaded
-applications.
-
-Possible mentors: ?
-
-Exercise: Hack libdiskfs to keep count of the number of locks currently held.
-
-
-## Convert Hurd Libraries and Servers to pthreads
-
-The Hurd was originally created at a time when the [pthreads
-standard](http://www.opengroup.org/onlinepubs/009695399/basedefs/pthread.h.html)
-didn't exist yet. Thus all Hurd servers and libraries are using the old
-[[cthreads|hurd/libcthreads]] package that came with [[microkernel/Mach]],
-which is not compatible with [[pthreads|hurd/libpthread]].
-
-Not only does that mean that people hacking on Hurd internals have to deal with
-a non-standard thread package, which nobody is familiar with. Although a
-pthreads implementation for the Hurd was created in the meantime, it's not
-possible to use both cthreads and pthreads in the same program. Consequently,
-pthreads can't presently be used in any Hurd servers -- including translators.
-
-Some work already has been done once on converting the Hurd servers and
-libraries to use pthreads, but that work hasn't been finished. It is available
-as [[GNU_Savannah_task 5487]] and can of course be used to base the new work
-upon.
-
-The goal of this project is to have all the Hurd code use pthreads. Should any
-limitations in the existing pthreads implementation turn up that hinder this
-transition, they will have to be fixed as well.
-
-One possible option is creating a wrapper that implements the cthreads
-interfaces on top of pthreads, to ease the transition -- but it might very well
-turn out that it's easier to just change all the existing code to use pthreads
-directly. This is up to the student. Such a wrapper has been proposed as
-[[GNU_Savannah_task 7895]] and its implementation would be a useful
-starting-point.
-
-This project requires relatively little Hurd-specific knowledge. Experience
-with multithreaded programming in general and pthreads in particular is
-required, though.
-
-Possible mentors: Samuel Thibault (youpi)
-
-Exercise: Take some small piece of code using ctreads and convert it to
-pthreads.
-
-
-## Sound Support
-
-The Hurd presently has no sound support. Fixing this, [[GNU_Savannah_task
-5485]], requires two steps: the first is to port some other kernel's drivers to
-[[GNU_Mach|microkernel/mach/gnumach]] so we can get access to actual sound
-hardware. The second is to implement a userspace server ([[hurd/translator]]),
-that implements an interface on top of the kernel device that can be used by
-applications -- probably OSS or maybe ALSA.
-
-Completing this task requires porting at least one driver (e.g. from Linux) for
-a popular piece of sound hardware, and the basic userspace server. For the
-driver part, previous experience with programming kernel drivers is strongly
-advisable. The userspace part requires some knowledge about programming Hurd
-translators, but shouldn't be too hard.
-
-Once the basic support is working, it's up to the student to use the remaining
-time for porting more drivers, or implementing a more sophisticated userspace
-infrastructure. The latter requires good understanding of the Hurd philosophy,
-to come up with an appropriate design.
-
-Another option would be to evaluate whether a driver that is completely running
-in user-space is feasible. <!-- TODO. Elaborate. -->
-
-Possible mentors: ?
-
-Exercise: Take a newer driver for a device in one of the subsystems we already
-implement (disk or network) from a newer Linux version, or some other operating
-system, and try to port it so that it runs in the existing driver framework.
-The port needn't be elegant or complete; but it would be nice if you could get
-it to work at least partially...
-
-
-## Disk I/O Performance Tuning
-
-The most obvious reason for the Hurd feeling slow compared to mainstream
-systems like GNU/Linux, is very slow harddisk access.
-
-The reason for this slowness is lack and/or bad implementation of common
-optimisation techniques, like scheduling reads and writes to minimalize head
-movement; effective block caching; effective reads/writes to partial blocks;
-reading/writing multiple blocks at once; and read-ahead. The
-[[ext2_filesystem_server|hurd/translator/ext2fs]] might also need some
-optimisations at a higher logical level.
-
-The goal of this project is to analyze the current situation, and implement/fix
-various optimisations, to achieve significantly better disk performance. It
-requires understanding the data flow through the various layers involved in
-disk acces on the Hurd ([[filesystem|hurd/virtual_file_system]],
-[[pager|hurd/libpager]], driver), and general experience with
-optimising complex systems. That said, the killing feature we are definitely
-missing is the read-ahead, and even a very simple implementation would bring
-very big performance speedups.
-
-Possible mentors: ?
-
-Exercise: Make some modification in at least one of the components involved in
-disk I/O. (More specific suggestions welcome... :-) )
-
-
-## VM Tuning
-
-Hurd/[[microkernel/Mach]] presently make very bad use of the available physical memory in the
-system. Some of the problems are inherent to the system design (the kernel
-can't distinguish between important application data and discardable disk
-buffers for example), and can't be fixed without fundamental changes. Other
-problems however are an ordinary lack of optimisation, like extremely crude
-heuristics when to start paging. (See <http://lists.gnu.org/archive/html/bug-hurd/2007-08/msg00034.html> for example.)
-Many parameters are based on assumptions from
-a time when typical machines had like 16 MiB of RAM, or simply have been set to
-arbitrary values and never tuned for actual use.
-
-The goal of this project is to bring the virtual memory management in Hurd/Mach
-closer to that of modern mainstream kernels (Linux, FreeBSD), by comparing the
-implementation to other systems, implementing any worthwhile improvements, and
-general optimisation/tuning. It requires very good understanding of the Mach
-VM, and virtual memory in general.
-
-This project is related to [[GNU_Savannah_task 5489]].
-
-Possible mentors: ?
-
-Exercise: Make some modification to the existing VM code. You could try to find
-a piece of code that can be improved with simple code optimization, for
-example.
-
-
-## `mtab`
-
-In traditional monolithic system, the kernel keeps track of all mounts; the
-information is available through `/proc/mounts` (on Linux at least), and in a
-very similar form in `/etc/mtab`.
-
-The Hurd on the other hand has a totally
-[[decentralized_file_system|hurd/virtual_file_system]]. There is no single
-entity involved in all mounts. Rather, only the parent file system to which a
-mountpoint ([[hurd/translator]]) is attached is involved. As a result, there
-is no central place keeping track of mounts.
-
-As a consequence, there is currently no easy way to obtain a listing of all
-mounted file systems. This also means that commands like `df` can only work on
-explicitely specified mountpoints, instead of displaying the usual listing.
-
-One possible solution to this would be for the translator startup mechanism to
-update the `mtab` on any `mount`/`unmount`, like in traditional systems.
-However, there are same problems with this approach. Most notably: what to do
-with passive translators, i.e., translators that are not presently running, but
-set up to be started automatically whenever the node is accessed? Probably
-these should be counted an among the mounted filesystems; but how to handle the
-`mtab` updates for a translator that is not started yet? Generally, being
-centralized and event-based, this is a pretty unelegant, non-hurdish solution.
-
-A more promising approach is to have `mtab` exported by a special translator,
-which gathers the necessary information on demand. This could work by
-traversing the tree of translators, asking each one for mount points attached
-to it. (Theoretically, it could also be done by just traversing *all* nodes,
-checking each one for attached translators. That would be very inefficient,
-though. Thus a special interface is probably required, that allows asking a
-translator to list mount points only.)
-
-There are also some other issues to keep in mind. Traversing arbitrary
-translators set by other users can be quite dangerous -- and it's probably not
-very interesting anyways what private filesystems some other user has mounted.
-But what about the global `/etc/mtab`? Should it list only root-owned
-filesystems? Or should it create different listings depending on what user
-contacts it?...
-
-That leads to a more generic question: which translators should be actually
-listed? There are different kinds of translators: ranging from traditional
-filesystems ([[disks|hurd/libdiskfs]] and other actual
-[[stores|hurd/translator/storeio]]), but also purely virtual filesystems like
-[[hurd/translator/ftpfs]] or [[hurd/translator/unionfs]], and even things that
-have very little to do with a traditional filesystem, like a
-[[gzip_translator|hurd/translator/storeio]],
-[[mbox_translator|hurd/translator/mboxfs]],
-[[xml_translator|hurd/translator/xmlfs]], or various device file translators...
-Listing all of these in `/etc/mtab` would be pretty pointless, so some kind of
-classification mechanism is necessary.
By default it probably should list only -translators that claim to be real filesystems, though alternative views with -other filtering rules might be desirable. - -After taking decisions on the outstanding design questions, the student will -implement both the actual [[mtab_translator|hurd/translator/mtabfs]], and the -necessery interface(s) for gathering the data. It requires getting a good -understanding of the translator mechanism and Hurd interfaces in general. - -Possible mentors: Olaf Buddenhagen (antrik) - -Exercise: Create a simple translator using libnetfs, that only allows creating -directories and attaching other translators. - - -## GNU Mach Code Cleanup - -Although there are some attempts to move to a more modern microkernel -alltogether, the current Hurd implementation is based on -[[GNU_Mach|microkernel/mach/gnumach]], which is only a slightly modified -variant of the original CMU [[microkernel/Mach]]. - -Unfortunately, Mach was created about two decades ago, and is in turn based on -even older BSD code. Parts of the BSD kernel -- file systems, [[UNIX]] [[mechanism]]s -like processes and signals, etc. -- were ripped out (to be implemented in -[[userspace_servers|hurd/translator]] instead); while other mechanisms were -added to allow implementing stuff in userspace. -([[Pager_interface|microkernel/mach/external_pager_mechanism]], -[[microkernel/mach/IPC]], etc.) - -Also, Mach being a research project, many things were tried, adding lots of -optional features not really needed. - -The result of all this is that the current code base is in a pretty bad shape. -It's rather hard to make modifications -- to make better use of modern hardware -for example, or even to fix bugs. The goal of this project is to improve the -situation. - -The task starts out easy, with fixing compiler warnings. Later it moves on to -more tricky things: removing dead or unneeded code paths; restructuring code -for readability and maintainability. 
- -This task requires good knowledge of C, and experience with working on a large -existing code base. Previous kernel hacking experience is an advantage, but -not really necessary. - -Possible mentors: Samuel Thibault (youpi) - -Exercise: Create a few simple patches that fix some of the compiler warnings, -rework a piece of ugly code etc. - - -## `xmlfs` - -Hurd [[translators|hurd/translator]] allow presenting underlying data in a -different format. This is a very powerful ability: it allows using standard -tools on all kinds of data, and combining existing components in new ways, once -you have the necessary translators. - -A typical example for such a translator would be xmlfs: a translator that -presents the contents of an underlying XML file in the form of a directory -tree, so it can be studied and edited with standard filesystem tools, or using -a graphical file manager, or to easily extract data from an XML file in a -script etc. - -The exported directory tree should represent the DOM structure of the document, -or implement XPath, or both, or some combination thereof (perhaps XPath could -be implemented as a second translator working on top of the DOM one) -- -whatever works well, while sticking to XML standards as much as possible. - -Ideally, the translation should be reversible, so that another, complementary -translator applied on the expanded directory tree would yield the original XML -file again; and also the other way round, applying the complementary translator -on top of some directory tree and xmlfs on top of that would yield the original -directory again. However, with the different semantics of directory trees and -XML files, it might not be possible to create such a universal mapping. Thus -it is a desirable goal, but not a strict requirement. - -The goal of this project is to create a fully usable XML translator, that -allows both reading and writing any XML file. 
Implementing the complementary -translator also would be nice if time permits, but is not mandatory part of the -task. - -The [[existing_partial_(read-only)_xmlfs_implementation|hurd/translator/xmlfs]] -can serve as a starting point. - -This task requires pretty good designing skills. Good knowledge of XML is also -necessary. Learning translator programming will obviously be necessary to -complete the task. - -Possible mentors: Olaf Buddenhagen (antrik) - -Exercise: Make some modification to the existing xmlfs translator, and write a -shell script that uses xmlfs to extract some interesting information from an -.odt document. (More specific suggestions welcome... :-) ) - - -## Allow Using `unionfs` Early at Boot - -In [[UNIX]] systems, traditionally most software is installed in a common directory -hierachy, where files from various packages live beside each other, grouped by -function: user-invokable executables in `/bin`, system-wide configuration files -in `/etc`, architecture specific static files in `/lib`, variable data in -`/var`, and so on. To allow clean installation, deinstallation, and upgrade of -software packages, GNU/Linux distributions usually come with a package manager, -which keeps track of all files upon installation/removal in some kind of -central database. - -An alternative approach is the one implemented by GNU Stow: each package is -actually installed in a private directory tree. The actual standard directory -structure is then created by collecting the individual files from all the -packages, and presenting them in the common `/bin`, `/lib`, etc. locations. 
- -While the normal Stow package (for traditional UNIX systems) uses symlinks to -the actual files, updated on installation/deinstallation events, the Hurd -[[hurd/translator]] mechanism allows a much more elegant solution: -[[hurd/translator/stowfs]] (which is actually a special mode of -[[hurd/translator/unionfs]]) creates virtual directories on the fly, composed -of all the files from the individual package directories. - -The problem with this approach is that unionfs presently can be launched only -once the system is booted up, meaning the virtual directories are not available -at boot time. But the boot process itself already needs access to files from -various packages. So to make this design actually usable, it is necessary to -come up with a way to launch unionfs very early at boot time, along with the -root filesystem. - -Completing this task will require gaining a very good understanding of the Hurd -boot process and other parts of the design. It requires some design skills -also to come up with a working mechanism. - -Possible mentors: ? - -Exercise: Try to write a dummy server that is started instead of ext2fs on -system boot, and starts the actual ext2fs in turn. - - -## Fix `tmpfs` - -In some situations it is desirable to have a file system that is not backed by -actual disk storage, but only by anonymous memory, i.e. lives in the RAM (and -possibly swap space). - -A simplistic way to implement such a memory filesystem is literally creating a -ramdisk, i.e. simply allocating a big chunck of RAM (called a memory store in -Hurd terminology), and create a normal filesystem like ext2 on that. However, -this is not very efficient, and not very convenient either (the filesystem -needs to be recreated each time the ramdisk is invoked). A nicer solution is -having a real [[hurd/translator/tmpfs]], which creates all filesystem -structures directly in RAM, allocating memory on demand. - -The Hurd has had such a tmpfs for a long time. 
However, the existing -implementation doesn't work anymore -- it got broken by changes in other parts -of the Hurd design. - -There are several issues. The most serious known problem seems to be that for -technical reasons it receives [[microkernel/mach/RPC]]s from two different -sources on one [[microkernel/mach/port]], and gets mixed up with them. Fixing -this is non-trivial, and requires a good understanding of the involved -mechanisms. - -The goal of this project to get a fully working, full featured tmpfs -implementation. It requires digging into some parts of the Hurd, incuding the -[[pager_interface|hurd/libpager]] and [[hurd/translator]] programming. This -task probably doesn't require any design work, only good debugging skills. - -Possible mentors: ? - -Exercise: Take a go at one of the existing issues in tmpfs. You may not be able -to finish this in the limited amount of time, but you should at least be able -to do a detailed analysis of the problem. - - -## Lexical `..` Resolution - -For historical reasons, [[UNIX]] filesystems have a real (hard) `..` link from each -directory pointing to its parent. However, this is problematic, because the -meaning of "parent" really depends on context. If you have a symlink for -example, you can reach a certain node in the filesystem by a different path. If -you go to `..` from there, UNIX will traditionally take you to the hard-coded -parent node -- but this is usually not what you want. Usually you want to go -back to the logical parent from which you came. That is called "lexical" -resolution. - -Some application already use lexical resolution internally for that reason. It -is generally agreed that many problems could be avoided if the standard -filesystem lookup calls used lexical resolution as well. The compatibility -problems probably would be negligable. 
- -The goal of this project is to modify the filename lookup mechanism in the Hurd -to use lexical resolution, and to check that the system is still fully -functional afterwards. This task requires understanding the filename resolution -mechanism. It's probably a relatively easy task. - -See also [[GNU_Savannah_bug 17133]]. - -Possible mentors: ? - -Exercise: Make some modification to the name lookup mechanism. (More specific -suggestions welcome... :-) ) - - -## Secure `chroot` implementation - -As the Hurd attempts to be (almost) fully [[UNIX]]-compatible, it also implements a -`chroot()` system call. However, the current implementation is not really -good, as it allows easily escaping the `chroot`, for example by use of -[[passive_translators|hurd/translator]]. - -Many solutions have been suggested for this problem -- ranging from simple -workaround changing the behaviour of passive translators in a `chroot`; -changing the context in which passive translators are exectuted; changing the -interpretation of filenames in a chroot; to reworking the whole passive -translator mechanism. Some involving a completely different approch to -`chroot` implementation, using a proxy instead of a special system call in the -filesystem servers. - -The task is to pick and implement one approach for fixing chroot. - -This task is pretty heavy: it requires a very good understanding of file name -lookup and the translator mechanism, as well as of security concerns in general --- the student must prove that he really understands security implications of -the UNIX namespace approach, and how they are affected by the introduction of -new mechanisms. (Translators.) More important than the acualy code is the -documentation of what he did: he must be able to defend why he chose a certain -approach, and explain why he believes this approach really secure. - -Possible mentors: ? - -Exercise: Make some modification to the chroot mechanism. 
(More specific -suggestions welcome :-) ) - - -## Hurdish Package Manager for the GNU System - -Most GNU/Linux systems use pretty sophisticated package managers, to ease the -management of installed software. These keep track of all installed files, and -various kinds of other necessary information, in special databases. On package -installation, deinstallation, and upgrade, scripts are used that make all kinds -of modifications to other parts of the system, making sure the packages get -properly integrated. - -This approach creates various problems. For one, *all* management has to be -done with the distribution package management tools, or otherwise they would -loose track of the system state. This is reinforced by the fact that the state -information is stored in special databases, that only the special package -management tools can work with. - -Also, as changes to various parts of the system are made on certain events -(installation/deinstallation/update), managing the various possible state -transitions becomes very complex and bug-prone. - -For the official (Hurd-based) GNU system, a different approach is intended: -making use of Hurd [[translators|hurd/translator]] -- more specifically their -ability to present existing data in a different form -- the whole system state -will be created on the fly, directly from the information provided by the -individual packages. The visible system state is always a reflection of the -sum of packages installed at a certain moment; it doesn't matter how this state -came about. There are no global databases of any kind. (Some things might -require caching for better performance, but this must happen transparently.) - -The core of this approach is formed by [[hurd/translator/stowfs]], which -creates a traditional unix directory structure from all the files in the -individual package directories. But this only handles the lowest level of -package management. 
Additional mechanisms are necessary to handle stuff like -dependencies on other packages. - -The goal of this task is to create these mechanisms. - -Possible mentors: Ben Asselstine (bing) - -Exercise: Write a translator that observes a directory tree using -dir_notify_changes(), and presents a file with a log of changes. - - -## Port the Debian Installer to the Hurd - -The primary means of distributing the Hurd is through Debian GNU/Hurd. -However, the installation CDs presently use an ancient, non-native installer. -The situation could be much improved by making sure that the newer *Debian -Installer* works on the Hurd. - -Some preliminary work has been done, see -<http://wiki.debian.org/DebianInstaller/Hurd>. - -The goal is to have the Debian Installer fully working on the Hurd. It -requires relatively little Hurd-specific knowledge. - -Possible mentors: Samuel Thibault (youpi) - -Exercise: Try to get one piece of the installer running on Hurd. +[[inline +pages="community/gsoc/project_ideas/* and !*/discussion" +show=0 +feeds=no +actions=yes]] diff --git a/community/gsoc/project_ideas/debian_installer.mdwn b/community/gsoc/project_ideas/debian_installer.mdwn new file mode 100644 index 00000000..cac85df2 --- /dev/null +++ b/community/gsoc/project_ideas/debian_installer.mdwn @@ -0,0 +1,26 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Port the Debian Installer to the Hurd"]] + +The primary means of distributing the Hurd is through Debian GNU/Hurd. +However, the installation CDs presently use an ancient, non-native installer. +The situation could be much improved by making sure that the newer *Debian +Installer* works on the Hurd. + +Some preliminary work has been done, see +<http://wiki.debian.org/DebianInstaller/Hurd>. + +The goal is to have the Debian Installer fully working on the Hurd. It +requires relatively little Hurd-specific knowledge. + +Possible mentors: Samuel Thibault (youpi) + +Exercise: Try to get one piece of the installer running on Hurd. diff --git a/community/gsoc/project_ideas/disk_io_performance.mdwn b/community/gsoc/project_ideas/disk_io_performance.mdwn new file mode 100644 index 00000000..02e0b675 --- /dev/null +++ b/community/gsoc/project_ideas/disk_io_performance.mdwn @@ -0,0 +1,35 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Disk I/O Performance Tuning"]] + +The most obvious reason for the Hurd feeling slow compared to mainstream +systems like GNU/Linux, is very slow harddisk access. 
+ +The reason for this slowness is lack and/or bad implementation of common +optimisation techniques, like scheduling reads and writes to minimalize head +movement; effective block caching; effective reads/writes to partial blocks; +reading/writing multiple blocks at once; and read-ahead. The +[[ext2_filesystem_server|hurd/translator/ext2fs]] might also need some +optimisations at a higher logical level. + +The goal of this project is to analyze the current situation, and implement/fix +various optimisations, to achieve significantly better disk performance. It +requires understanding the data flow through the various layers involved in +disk acces on the Hurd ([[filesystem|hurd/virtual_file_system]], +[[pager|hurd/libpager]], driver), and general experience with +optimising complex systems. That said, the killing feature we are definitely +missing is the read-ahead, and even a very simple implementation would bring +very big performance speedups. + +Possible mentors: ? + +Exercise: Make some modification in at least one of the components involved in +disk I/O. (More specific suggestions welcome... :-) ) diff --git a/community/gsoc/project_ideas/driver_glue_code.mdwn b/community/gsoc/project_ideas/driver_glue_code.mdwn new file mode 100644 index 00000000..2f0a0b59 --- /dev/null +++ b/community/gsoc/project_ideas/driver_glue_code.mdwn @@ -0,0 +1,37 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="New Driver Glue Code"]] + +Although a driver framework in userspace would be desirable, presently the Hurd +uses kernel drivers in the microkernel, +[[GNU_Mach|microkernel/mach/gnumach]]. (And changing this would be far beyond a +GSoC project...) + +The problem is that the drivers in GNU Mach are presently old Linux drivers +(mostly from 2.0.x) accessed through a glue code layer. This is not an ideal +solution, but works quite OK, except that the drivers are very old. The goal of +this project is to redo the glue code, so we can use drivers from current Linux +versions, or from one of the free BSD variants. + +Using [ddekit](http://demo.tudos.org/dsweeper_tutorial.html) instead of our +own glue code can be explored as a possible alternative approach. + +This is a doable, but pretty involved project. Experience with driver +programming under Linux (or BSD) is a must. (No Hurd-specific knowledge is +required, though.) + +This is [[GNU_Savannah_task 5488]]. + +Possible mentors: Samuel Thibault (youpi) + +Exercise: Try porting one driver from Linux 2.6 to run in the old framework. +The port needn't be elegant or complete; but it would be nice if you could get +it to work at least partially... 
diff --git a/community/gsoc/project_ideas/dtrace.mdwn b/community/gsoc/project_ideas/dtrace.mdwn new file mode 100644 index 00000000..f0c6f07a --- /dev/null +++ b/community/gsoc/project_ideas/dtrace.mdwn @@ -0,0 +1,46 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="dtrace Support"]] + +One of the main problems of the current Hurd implementation is very poor +performance. While we have a bunch of ideas what could cause the performance +problems, these are mostly just guesses. Better understanding what really +causes bad performance is necessary to improve the situation. + +For that, we need tools for performance measurements. While all kinds of more +or less specific profiling tools could be convieved, the most promising and +generic approach seems to be a framework for logging certain events in the +running system (both in the microkernel and in the Hurd servers). This would +allow checking how much time is spent in certain modules, how often certain +situations occur, how things interact, etc. It could also prove helpful in +debugging some issues that are otherwise hard to find because of complex +interactions. + +The most popular framework for that is Sun's dtrace; but there might be others. +The student has to evaluate the existing options, deciding which makes most +sense for the Hurd; and implement that one. (Apple's implementation of dtrace +in their Mach-based kernel might be helpful here...) 
+ +This project requires ability to evaluate possible solutions, and experience +with integrating existing components as well as low-level programming. + +Possible mentors: Samuel Thibault (youpi) + +Exercise: In lack of a good exercise directly related to this taks, just pick +one of the kernel-related or generally low-level tasks from the bug/task +trackers on savannah, and make a go at it. You might not be able to finish the +task in a limited amount of time, but you should at least be able to make a +detailed analysis of the issue. + +*Status*: Andei Barbu was working on +[SystemTap](http://csclub.uwaterloo.ca/~abarbu/hurd/) for GSoC 2008, but it +turned out too Linux-specific. He implemented kernel probes, but there is no +nice frontend yet. diff --git a/community/gsoc/project_ideas/file_locking.mdwn b/community/gsoc/project_ideas/file_locking.mdwn new file mode 100644 index 00000000..ca3c28ed --- /dev/null +++ b/community/gsoc/project_ideas/file_locking.mdwn @@ -0,0 +1,30 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Fix File Locking"]] + +Over the years, [[UNIX]] has aquired a host of different file locking mechanisms. +Some of them work on the Hurd, while others are buggy or only partially +implemented. This breaks many applications. + +The goal is to make all file locking mechanisms work properly. 
This requires +finding all existing shortcomings (through systematic testing and/or checking +for known issues in the bug tracker and mailing list archives), and fixing +them. + +This task will require digging into parts of the code to understand how file +locking works on the Hurd. Only general programming skills are required. + +Possible mentors: Samuel Thibault (youpi) + +Exercise: Find one of the existing issues, either by looking at the task/bug +trackers on savannah, or by trying things out yourself; and take a go at it. +Probably you wont' be able to fix the problem in a limited amount of time, but +you should be able to do a detailed analysis of the issue at least. diff --git a/community/gsoc/project_ideas/gnumach_cleanup.mdwn b/community/gsoc/project_ideas/gnumach_cleanup.mdwn new file mode 100644 index 00000000..c11defe5 --- /dev/null +++ b/community/gsoc/project_ideas/gnumach_cleanup.mdwn @@ -0,0 +1,45 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="GNU Mach Code Cleanup"]] + +Although there are some attempts to move to a more modern microkernel +alltogether, the current Hurd implementation is based on +[[GNU_Mach|microkernel/mach/gnumach]], which is only a slightly modified +variant of the original CMU [[microkernel/Mach]]. + +Unfortunately, Mach was created about two decades ago, and is in turn based on +even older BSD code. Parts of the BSD kernel -- file systems, [[UNIX]] [[mechanism]]s +like processes and signals, etc. 
-- were ripped out (to be implemented in +[[userspace_servers|hurd/translator]] instead); while other mechanisms were +added to allow implementing stuff in userspace. +([[Pager_interface|microkernel/mach/external_pager_mechanism]], +[[microkernel/mach/IPC]], etc.) + +Also, Mach being a research project, many things were tried, adding lots of +optional features not really needed. + +The result of all this is that the current code base is in a pretty bad shape. +It's rather hard to make modifications -- to make better use of modern hardware +for example, or even to fix bugs. The goal of this project is to improve the +situation. + +The task starts out easy, with fixing compiler warnings. Later it moves on to +more tricky things: removing dead or unneeded code paths; restructuring code +for readability and maintainability. + +This task requires good knowledge of C, and experience with working on a large +existing code base. Previous kernel hacking experience is an advantage, but +not really necessary. + +Possible mentors: Samuel Thibault (youpi) + +Exercise: Create a few simple patches that fix some of the compiler warnings, +rework a piece of ugly code etc. diff --git a/community/gsoc/project_ideas/language_bindings.mdwn b/community/gsoc/project_ideas/language_bindings.mdwn new file mode 100644 index 00000000..a96f4569 --- /dev/null +++ b/community/gsoc/project_ideas/language_bindings.mdwn @@ -0,0 +1,90 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Bindings to Other Programming Languages"]] + +The main idea of the Hurd design is giving users the ability to easily +modify/extend the system's functionality ([[extensible_system|extensibility]]). +This is done by creating [[filesystem_translators|hurd/translator]] and other +kinds of Hurd servers. + +However, in practice this is not as easy as it should, because creating +translators and other servers is quite involved -- the interfaces for doing +that are not exactly simple, and available only for C programs. Being able to +easily create simple translators in RAD languages is highly desirable, to +really be able to reap the advantages of the Hurd architecture. + +Originally Lisp was meant to be the second system language besides C in the GNU +system; but that doesn't mean we are bound to Lisp. Bindings for any popular +high-level language, that helps quickly creating simple programs, are highly +welcome. + +Several approaches are possible when creating such bindings. One way is simply +to provide wrappers to all the available C libraries ([[hurd/libtrivfs]], [[hurd/libnetfs]] +etc.). While this is easy (it requires relatively little consideration), it may +not be the optimal solution. It is preferable to hook in at a lower level, thus +being able te create interfaces that are specially adapted to make good use of +the features available in the respective language. + +These more specialised bindings could hook in at some of the lower level +library interfaces ([[hurd/libports]], [[hurd/glibc]], etc.); use the +[[microkernel/mach/MIG]]-provided [[microkernel/mach/RPC]] stubs directly; or +even create native stubs directly from the interface definitions. + +The task is to create easy to use Hurd bindings for a language of the student's +choice, and some example servers to prove that it works well in practice. 
This +project will require gaining a very good understanding of the various Hurd +interfaces. Skills in designing nice programming interfaces are a must. + +There has already been some [earlier work on Python +bindings](http://www.sigill.org/files/pytrivfs-20060724-ro-test1.tar.bz2), that +perhaps can be re-used. Also some work on [Perl +bindings](http://www.nongnu.org/hurdextras/#pith) is available. + +# Lisp + +Most Lisp implementations provide a Foreign Function Interface (FFI) that +enables the Lisp code to call functions written in another language. +Specifically, most implementations provide an FFI to the C ABI (hence giving +access to C, Fortran and possibly C++). + +Common Lisp even has a portability layer for such FFI, +[CFFI](http://common-lisp.net/project/cffi/), so that you can write bindings +purely in Lisp and use the same binding code on any implementation supported by +CFFI. + +Many Scheme implementations also provide an FFI. [Scheme48](http://www.s48.org/) +is even the implementation used to run scsh, a Scheme shell designed to provide +instant access to POSIX functions. +[Guile](http://www.gnu.org/software/guile/guile.html) is the GNU project's +Scheme implementation, meant to be embeddable and provide access to C. At least +[Gambit](http://dynamo.iro.umontreal.ca/~gambit/), +[Chicken](http://www.call-with-current-continuation.org/), +[Bigloo](http://www-sop.inria.fr/mimosa/fp/Bigloo/) and +[PLT](http://www.plt-scheme.org/) are known to provide an FFI too. + +With respect to the packaging and dependencies, the good news is that Debian +comes in handy: 5 Common Lisp implementations are packaged, one of which has +already been ported to Hurd (ECL), and CFFI is also packaged. As far as Scheme +is concerned, 14 [R5RS](http://www.schemers.org/Documents/Standards/R5RS/) +implementations are provided and 1 [R6RS](http://www.r6rs.org/).
+ +Possible mentors: Pierre THIERRY (nowhere_man) for Common Lisp or Scheme, and perhaps Python + +Exercise: Write some simple program(s) using Hurd-specific interfaces in the +language you intend to work on. For a start, you could try printing the system +uptime. A more advanced task is writing a simple variant of the hello +translator (you can use the existing C implementation as reference), +implementing only open() and read() calls. Don't only write an implementation +using the existing C libraries (libps, libtrivfs), but also try to work with +the MiG-generated stubs directly. If you are ambitious, you could even try to +write your own stubs... + +*Status*: Flavio Cruz has completed [[Lisp_bindings|flaviocruz]] for GSoC 2008! diff --git a/community/gsoc/project_ideas/lexical_dot-dot.mdwn b/community/gsoc/project_ideas/lexical_dot-dot.mdwn new file mode 100644 index 00000000..c4591df5 --- /dev/null +++ b/community/gsoc/project_ideas/lexical_dot-dot.mdwn @@ -0,0 +1,37 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Lexical .. Resolution"]] + +For historical reasons, [[UNIX]] filesystems have a real (hard) `..` link from each +directory pointing to its parent. However, this is problematic, because the +meaning of "parent" really depends on context. If you have a symlink for +example, you can reach a certain node in the filesystem by a different path.
If +you go to `..` from there, UNIX will traditionally take you to the hard-coded +parent node -- but this is usually not what you want. Usually you want to go +back to the logical parent from which you came. That is called "lexical" +resolution. + +Some applications already use lexical resolution internally for that reason. It +is generally agreed that many problems could be avoided if the standard +filesystem lookup calls used lexical resolution as well. The compatibility +problems probably would be negligible. + +The goal of this project is to modify the filename lookup mechanism in the Hurd +to use lexical resolution, and to check that the system is still fully +functional afterwards. This task requires understanding the filename resolution +mechanism. It's probably a relatively easy task. + +See also [[GNU_Savannah_bug 17133]]. + +Possible mentors: ? + +Exercise: Make some modification to the name lookup mechanism. (More specific +suggestions welcome... :-) ) diff --git a/community/gsoc/project_ideas/libdiskfs_locking.mdwn b/community/gsoc/project_ideas/libdiskfs_locking.mdwn new file mode 100644 index 00000000..c9d55bb7 --- /dev/null +++ b/community/gsoc/project_ideas/libdiskfs_locking.mdwn @@ -0,0 +1,31 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Fix libdiskfs Locking Issues"]] + +Nowadays the most often encountered cause of Hurd crashes seems to be lockups +in the [[hurd/translator/ext2fs]] server.
One of these could be traced +recently, and turned out to be a lock inside [[hurd/libdiskfs]] that was taken +and not released in some cases. There is reason to believe that there are more +faulty paths causing these lockups. + +The task is systematically checking the [[hurd/libdiskfs]] code for this kind of locking +issue. To achieve this, some kind of test harness has to be implemented: for +example, instrumenting the code to check locking correctness constantly at +runtime, or implementing a unit testing framework that explicitly checks +locking in various code paths. (The latter could serve as a template for +implementing unit checks in other parts of the Hurd codebase...) + +This task requires experience with debugging locking issues in multithreaded +applications. + +Possible mentors: ? + +Exercise: Hack libdiskfs to keep count of the number of locks currently held. diff --git a/community/gsoc/project_ideas/mtab.mdwn b/community/gsoc/project_ideas/mtab.mdwn new file mode 100644 index 00000000..056ed042 --- /dev/null +++ b/community/gsoc/project_ideas/mtab.mdwn @@ -0,0 +1,73 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="mtab"]] + +In traditional monolithic systems, the kernel keeps track of all mounts; the +information is available through `/proc/mounts` (on Linux at least), and in a +very similar form in `/etc/mtab`. + +The Hurd on the other hand has a totally +[[decentralized_file_system|hurd/virtual_file_system]].
There is no single +entity involved in all mounts. Rather, only the parent file system to which a +mountpoint ([[hurd/translator]]) is attached is involved. As a result, there +is no central place keeping track of mounts. + +As a consequence, there is currently no easy way to obtain a listing of all +mounted file systems. This also means that commands like `df` can only work on +explicitly specified mountpoints, instead of displaying the usual listing. + +One possible solution to this would be for the translator startup mechanism to +update the `mtab` on any `mount`/`unmount`, like in traditional systems. +However, there are some problems with this approach. Most notably: what to do +with passive translators, i.e., translators that are not presently running, but +set up to be started automatically whenever the node is accessed? Probably +these should be counted among the mounted filesystems; but how to handle the +`mtab` updates for a translator that is not started yet? Generally, being +centralized and event-based, this is a pretty inelegant, non-hurdish solution. + +A more promising approach is to have `mtab` exported by a special translator, +which gathers the necessary information on demand. This could work by +traversing the tree of translators, asking each one for mount points attached +to it. (Theoretically, it could also be done by just traversing *all* nodes, +checking each one for attached translators. That would be very inefficient, +though. Thus a special interface is probably required, that allows asking a +translator to list mount points only.) + +There are also some other issues to keep in mind. Traversing arbitrary +translators set by other users can be quite dangerous -- and it's probably not +very interesting anyways what private filesystems some other user has mounted. +But what about the global `/etc/mtab`? Should it list only root-owned +filesystems? Or should it create different listings depending on what user +contacts it?...
+ +That leads to a more generic question: which translators should be actually +listed? There are different kinds of translators, ranging from traditional +filesystems ([[disks|hurd/libdiskfs]] and other actual +[[stores|hurd/translator/storeio]]), through purely virtual filesystems like +[[hurd/translator/ftpfs]] or [[hurd/translator/unionfs]], to things that +have very little to do with a traditional filesystem, like a +[[gzip_translator|hurd/translator/storeio]], +[[mbox_translator|hurd/translator/mboxfs]], +[[xml_translator|hurd/translator/xmlfs]], or various device file translators... +Listing all of these in `/etc/mtab` would be pretty pointless, so some kind of +classification mechanism is necessary. By default it probably should list only +translators that claim to be real filesystems, though alternative views with +other filtering rules might be desirable. + +After taking decisions on the outstanding design questions, the student will +implement both the actual [[mtab_translator|hurd/translator/mtabfs]], and the +necessary interface(s) for gathering the data. It requires getting a good +understanding of the translator mechanism and Hurd interfaces in general. + +Possible mentors: Olaf Buddenhagen (antrik) + +Exercise: Create a simple translator using libnetfs, that only allows creating +directories and attaching other translators.
diff --git a/community/gsoc/project_ideas/namespace-based_translator_selection.mdwn b/community/gsoc/project_ideas/namespace-based_translator_selection.mdwn new file mode 100644 index 00000000..6bb643fa --- /dev/null +++ b/community/gsoc/project_ideas/namespace-based_translator_selection.mdwn @@ -0,0 +1,82 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Namespace-based Translator Selection"]] + +The main idea behind the Hurd is to make (almost) all system functionality +user-modifiable ([[extensible_system|extensibility]]). This includes a +user-modifiable filesystem: the whole filesystem is implemented decentrally, by +a set of filesystem servers forming the directory tree together, a +[[hurd/virtual_file_system]]. These filesystem servers are called +[[translators|hurd/translator]], and are the most visible feature of the Hurd. + +The reason they are called translators is because when you set a translator on +a filesystem node, the underlying node(s) are hidden by the translator, but the +translator itself can access them, and present their contents in a different +format -- translate them. A simple example is a +[[gunzip_translator|hurd/translator/storeio]], which can be set on a gzipped +file, and presents a virtual file with the uncompressed contents. Or the other +way around. Or a translator that presents an +[[XML_file_as_a_directory_tree|hurd/translator/xmlfs]].
Or an mbox as a set of +individual files for each mail ([[hurd/translator/mboxfs]]); or even further +breaking it down into headers, body, attachments... + +This gets even more powerful when translators are used as building blocks for +larger applications: A mail reader for example doesn't need backends for +understanding various mailbox formats anymore. All formats can be parsed by +special translators, and the mail reader gets the data as a uniform, directly +usable filesystem structure. Translators can also be stacked: If you have a +compressed mailbox for example, first apply a gunzip translator, and then an +mbox translator on top of that. + +There are a few problems with the way translators are set, though. For one, +once a translator is set on a node, you always see the translated content. If +you need the untranslated contents again, to do a backup for example, you first +need to remove the translator again. Also, having to set a translator +explicitly before accessing the contents is pretty cumbersome, making this +feature almost useless. + +A possible solution is implementing a mechanism for selecting translators +through special filename attributes. For example you could use +`index.html.gz,,+` and `index.html.gz,,-` to choose between translated and +untranslated versions of a file. Or you could use `index.html.gz,,u` to get +the contents of the file with a gunzip translator applied automatically. You +could also use attributes on whole directory trees: `.,,0/` would give you a +directory tree corresponding to the current directory, but with any translators +disabled, for doing a backup. And `site,,u/*.html.gz` would present a whole +directory tree of compressed HTML files as uncompressed files.
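The `,,` filename attributes described above could be recognised with a very small amount of parsing. The following is only a sketch in Python, not part of any existing Hurd component: the function names `split_selector` and `parse_path` are hypothetical, and it assumes that the last `,,` in a path component separates the real name from the attribute.

```python
def split_selector(name, marker=",,"):
    """Split one path component into (base, attribute).

    Hypothetical helper: assumes the last occurrence of the marker
    separates the real file name from the translator attribute.
    Returns (name, None) when no attribute is present.
    """
    base, sep, attr = name.rpartition(marker)
    if not sep or not base:
        # No marker found, or nothing in front of it: treat as a plain name.
        return name, None
    return base, attr


def parse_path(path):
    """Apply split_selector to every non-empty component of a path."""
    return [split_selector(component) for component in path.split("/") if component]


if __name__ == "__main__":
    for example in ("index.html.gz,,u", "index.html.gz,,-", "site,,u/index.html.gz"):
        print(example, "->", parse_path(example))
```

A real proxy would still have to map each attribute (`u`, `+`, `-`, `0`) to an actual translator or to "no translator"; the sketch leaves that mapping open.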
+ +One benefit of the Hurd's flexibility is that it should be possible to +implement such a mechanism without touching the existing Hurd components: +Rather, just implement a special proxy, that mirrors the normal filesystem, but +is able to interpret the special extensions and present transformed files in +place of the original ones. + +In the long run it's probably desirable to have the mechanism implemented in +the standard name lookup mechanism, so it will be available globally, and avoid +the overhead of a proxy; but for the beginning the proxy solution is much more +flexible. + +The goal of this project is implementing a prototype proxy; perhaps also a +first version of the global variant as proof of concept, if time permits. It +requires good understanding of the name lookup mechanism, and translator +programming; but the implementation should not be too hard. Perhaps the hardest +part is finding a convenient, flexible, elegant, hurdish method for mapping the +special extensions to actual translators... + +Possible mentors: Olaf Buddenhagen (antrik) + +Exercise: Try to make some modification to the existing unionfs and/or firmlink +translators. (More specific suggestions welcome... :-) ) + +*Status*: Sergiu Ivanov has been working *voluntarily* on +[[namespace-based_translator_selection|scolobb]], as an unofficial GSoC 2008 +participant! Not all the desired functionality is in place yet; work is +ongoing.
diff --git a/community/gsoc/project_ideas/nfs.mdwn b/community/gsoc/project_ideas/nfs.mdwn new file mode 100644 index 00000000..a643fab4 --- /dev/null +++ b/community/gsoc/project_ideas/nfs.mdwn @@ -0,0 +1,32 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Improved NFS Implementation"]] + +The Hurd has both NFS server and client implementations, which work, but not +very well: File locking doesn't work properly (at least in conjunction with a +GNU/Linux server), and performance is extremely poor. Part of the problems +could be owed to the fact that only NFSv2 is supported so far. + +This project encompasses implementing NFSv3 support, fixing bugs and +performance problems -- the goal is to have good NFS support. The work done in +a previous unfinished GSoC project can serve as a starting point. + +Both client and server parts need work, though the client is probably much more +important for now, and shall be the major focus of this project. + +This task, [[GNU_Savannah_task 5497]], has no special prerequisites besides general programming skills, and +an interest in file systems and network protocols. + +Possible mentors: ? + +Exercise: Make a go at one of the known issues in the NFS client. You might not +be able to finish this in the limited amount of time, but you should at least +be able to make a detailed analysis of the issue.
diff --git a/community/gsoc/project_ideas/package_manager.mdwn b/community/gsoc/project_ideas/package_manager.mdwn new file mode 100644 index 00000000..0734cea0 --- /dev/null +++ b/community/gsoc/project_ideas/package_manager.mdwn @@ -0,0 +1,50 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Hurdish Package Manager for the GNU System"]] + +Most GNU/Linux systems use pretty sophisticated package managers, to ease the +management of installed software. These keep track of all installed files, and +various kinds of other necessary information, in special databases. On package +installation, deinstallation, and upgrade, scripts are used that make all kinds +of modifications to other parts of the system, making sure the packages get +properly integrated. + +This approach creates various problems. For one, *all* management has to be +done with the distribution package management tools, or otherwise they would +lose track of the system state. This is reinforced by the fact that the state +information is stored in special databases, that only the special package +management tools can work with. + +Also, as changes to various parts of the system are made on certain events +(installation/deinstallation/update), managing the various possible state +transitions becomes very complex and bug-prone.
+ +For the official (Hurd-based) GNU system, a different approach is intended: +making use of Hurd [[translators|hurd/translator]] -- more specifically their +ability to present existing data in a different form -- the whole system state +will be created on the fly, directly from the information provided by the +individual packages. The visible system state is always a reflection of the +sum of packages installed at a certain moment; it doesn't matter how this state +came about. There are no global databases of any kind. (Some things might +require caching for better performance, but this must happen transparently.) + +The core of this approach is formed by [[hurd/translator/stowfs]], which +creates a traditional unix directory structure from all the files in the +individual package directories. But this only handles the lowest level of +package management. Additional mechanisms are necessary to handle stuff like +dependencies on other packages. + +The goal of this task is to create these mechanisms. + +Possible mentors: Ben Asselstine (bing) + +Exercise: Write a translator that observes a directory tree using +dir_notify_changes(), and presents a file with a log of changes. diff --git a/community/gsoc/project_ideas/procfs.mdwn b/community/gsoc/project_ideas/procfs.mdwn new file mode 100644 index 00000000..55556b02 --- /dev/null +++ b/community/gsoc/project_ideas/procfs.mdwn @@ -0,0 +1,45 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="procfs"]] + +Although there is no standard (POSIX or other) for the layout of the `/proc` +pseudo-filesystem, it has turned out to be a very useful facility in GNU/Linux and other +systems, and many tools concerned with process management use it. (`ps`, `top`, +`htop`, `gtop`, `killall`, `pkill`, ...) + +Instead of porting all these tools to use [[hurd/libps]] (Hurd's official method for +accessing process information), they could be made to run out of the box, by +implementing a Linux-compatible `/proc` filesystem for the Hurd. + +The goal is to implement all `/proc` functionality needed for the various process +management tools to work. (On Linux, the `/proc` filesystem is used also for +debugging purposes; but this is highly system-specific anyways, so there is +probably no point in trying to duplicate this functionality as well...) + +The [[existing_partially_working_procfs_implementation|hurd/translator/procfs]] +can serve as a starting point, but needs to be largely rewritten. (It should +use [[hurd/libnetfs]] rather than [[hurd/libtrivfs]]; the data format needs to +change to be more Linux-compatible; and it needs adaptation to newer system +interfaces.) + +This project requires learning [[hurd/translator]] programming, and +understanding some of the internals of process management in the Hurd. It +should not be too hard coding-wise; and the task is very nicely defined by the +existing Linux `/proc` interface -- no design considerations necessary. + +**Note**: We already have several applications for this task. + +Possible mentors: Olaf Buddenhagen (antrik) + +Exercise: Add or fix one piece in the existing procfs translator. + +*Status*: Madhusudan.C.S has implemented a new, fully functional [[procfs|madhusudancs]] for +GSoC 2008. He is still working on some outstanding issues.
diff --git a/community/gsoc/project_ideas/pthreads.mdwn b/community/gsoc/project_ideas/pthreads.mdwn new file mode 100644 index 00000000..4ac20b45 --- /dev/null +++ b/community/gsoc/project_ideas/pthreads.mdwn @@ -0,0 +1,48 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Convert Hurd Libraries and Servers to pthreads"]] + +The Hurd was originally created at a time when the [pthreads +standard](http://www.opengroup.org/onlinepubs/009695399/basedefs/pthread.h.html) +didn't exist yet. Thus all Hurd servers and libraries are using the old +[[cthreads|hurd/libcthreads]] package that came with [[microkernel/Mach]], +which is not compatible with [[pthreads|hurd/libpthread]]. + +Not only does that mean that people hacking on Hurd internals have to deal with +a non-standard thread package, which nobody is familiar with. Although a +pthreads implementation for the Hurd was created in the meantime, it's not +possible to use both cthreads and pthreads in the same program. Consequently, +pthreads can't presently be used in any Hurd servers -- including translators. + +Some work already has been done once on converting the Hurd servers and +libraries to use pthreads, but that work hasn't been finished. It is available +as [[GNU_Savannah_task 5487]] and can of course be used to base the new work +upon. + +The goal of this project is to have all the Hurd code use pthreads. 
Should any +limitations in the existing pthreads implementation turn up that hinder this +transition, they will have to be fixed as well. + +One possible option is creating a wrapper that implements the cthreads +interfaces on top of pthreads, to ease the transition -- but it might very well +turn out that it's easier to just change all the existing code to use pthreads +directly. This is up to the student. Such a wrapper has been proposed as +[[GNU_Savannah_task 7895]] and its implementation would be a useful +starting-point. + +This project requires relatively little Hurd-specific knowledge. Experience +with multithreaded programming in general and pthreads in particular is +required, though. + +Possible mentors: Samuel Thibault (youpi) + +Exercise: Take some small piece of code using cthreads and convert it to +pthreads. diff --git a/community/gsoc/project_ideas/secure_chroot.mdwn b/community/gsoc/project_ideas/secure_chroot.mdwn new file mode 100644 index 00000000..a47bd5db --- /dev/null +++ b/community/gsoc/project_ideas/secure_chroot.mdwn @@ -0,0 +1,39 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Secure chroot Implementation"]] + +As the Hurd attempts to be (almost) fully [[UNIX]]-compatible, it also implements a +`chroot()` system call. However, the current implementation is not really +good, as it allows easily escaping the `chroot`, for example by use of +[[passive_translators|hurd/translator]].
+ +Many solutions have been suggested for this problem -- ranging from simple +workarounds changing the behaviour of passive translators in a `chroot`; +changing the context in which passive translators are executed; changing the +interpretation of filenames in a chroot; to reworking the whole passive +translator mechanism. Some involve a completely different approach to +`chroot` implementation, using a proxy instead of a special system call in the +filesystem servers. + +The task is to pick and implement one approach for fixing chroot. + +This task is pretty heavy: it requires a very good understanding of file name +lookup and the translator mechanism, as well as of security concerns in general +-- the student must prove that he really understands security implications of +the UNIX namespace approach, and how they are affected by the introduction of +new mechanisms. (Translators.) More important than the actual code is the +documentation of what he did: he must be able to defend why he chose a certain +approach, and explain why he believes this approach is really secure. + +Possible mentors: ? + +Exercise: Make some modification to the chroot mechanism. (More specific +suggestions welcome :-) ) diff --git a/community/gsoc/project_ideas/server_overriding.mdwn b/community/gsoc/project_ideas/server_overriding.mdwn new file mode 100644 index 00000000..c9aab792 --- /dev/null +++ b/community/gsoc/project_ideas/server_overriding.mdwn @@ -0,0 +1,75 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Server Overriding Mechanism"]] + +The main idea of the Hurd is that every user can influence almost all system +functionality ([[extensible_system|extensibility]]), by running private Hurd +servers that replace or proxy the global default implementations. + +However, running such a customized subenvironment presently is not easy, +because there is no standard mechanism to easily replace an individual standard +server, keeping everything else. (Presently there is only the [[hurd/subhurd]] +method, which creates a completely new system instance with a completely +independent set of servers.) + +The goal of this project is to provide a simple method for overriding +individual standard servers, using environment variables, or a special +subshell, or something like that. + +Various approaches for such a mechanism have been discussed before. +Probably the easiest (1) would be to modify the Hurd-specific parts of [[hurd/glibc]], +which are contacting various standard servers to implement certain system +calls, so that instead of always looking for the servers in default locations, +they first check for overrides in environment variables, and use these instead +if present. + +A somewhat more generic solution (2) could use some mechanism for arbitrary +client-side namespace overrides. The client-side part of the filename lookup +mechanism would have to check an override table on each lookup, and apply the +desired replacement whenever a match is found. + +Another approach would be server-side overrides. Again there are various +variants. The actual servers themselves could provide a mechanism to redirect to +other servers on request.
(3) Or we could use some more generic server-side +namespace overrides: Either all filesystem servers could provide a mechanism to +modify the namespace they export to certain clients (4), or proxies could be +used that mirror the default namespace but override certain locations. (5) + +Variants (4) and (5) are the most powerful. They are intimately related to +chroots: (4) is how the current chroot implementation works in the Hurd, and +(5) has been proposed as an alternative. The generic overriding mechanism could +be implemented on top of chroot, or chroot could be implemented on top of the +generic overriding mechanism. But this is out of scope for this project... + +In practice, probably a mix of the different approaches would prove most useful +for various servers and use cases. It is strongly recommended that the student +starts with (1) as the simplest approach, perhaps augmenting it with (3) for +certain servers that don't work with (1) because of indirect invocation. + +This task requires some understanding of the Hurd internals, especially a good +understanding of the file name lookup mechanism. It's probably not too heavy on +the coding side. + +This is [[GNU_Savannah_task 6612]]. Also there are quite a few emails +discussing this topic, from last year's GSoC application -- see +<http://lists.gnu.org/archive/html/bug-hurd/2007-03/msg00050.html>, +<http://lists.gnu.org/archive/html/bug-hurd/2007-03/msg00114.html>, +<http://lists.gnu.org/archive/html/bug-hurd/2007-06/msg00082.html>, +<http://lists.gnu.org/archive/html/bug-hurd/2008-03/msg00039.html>. + +Possible mentors: Olaf Buddenhagen (antrik) + +Exercise: Come up with a glibc patch that allows overriding one specific +standard server using method (1). + +*Status*: Overriding of socket servers through environment variables has been +implemented by Zheng Da for GSoC 2008, as part of his +[[network_virtualization|zhengda]] project.
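Approach (1) from the server overriding proposal, checking an environment variable before falling back to the default server location, can be illustrated in a few lines. This is only a sketch of the lookup logic in Python, not glibc code: the variable name `HURD_OVERRIDE_SOCKET_INET` is made up for illustration, and the default path is passed in by the caller rather than taken from any real configuration.

```python
import os


def resolve_server(default_path, override_var, environ=None):
    """Return the path of the server to contact.

    Hypothetical helper: if the environment variable named by
    override_var is set to a non-empty value, use it; otherwise
    fall back to the default location.
    """
    env = os.environ if environ is None else environ
    override = env.get(override_var)
    return override if override else default_path


if __name__ == "__main__":
    # Illustrative only: the variable name is an assumption, not an existing interface.
    print(resolve_server("/servers/socket/2", "HURD_OVERRIDE_SOCKET_INET"))
```

The point of the sketch is that the override is purely client-side: nothing about the default server changes, and a process without the variable set behaves exactly as before.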
diff --git a/community/gsoc/project_ideas/sound.mdwn b/community/gsoc/project_ideas/sound.mdwn new file mode 100644 index 00000000..e22a7e19 --- /dev/null +++ b/community/gsoc/project_ideas/sound.mdwn @@ -0,0 +1,40 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Sound Support"]] + +The Hurd presently has no sound support. Fixing this, [[GNU_Savannah_task +5485]], requires two steps: the first is to port some other kernel's drivers to +[[GNU_Mach|microkernel/mach/gnumach]] so we can get access to actual sound +hardware. The second is to implement a userspace server ([[hurd/translator]]), +that implements an interface on top of the kernel device that can be used by +applications -- probably OSS or maybe ALSA. + +Completing this task requires porting at least one driver (e.g. from Linux) for +a popular piece of sound hardware, and the basic userspace server. For the +driver part, previous experience with programming kernel drivers is strongly +advisable. The userspace part requires some knowledge about programming Hurd +translators, but shouldn't be too hard. + +Once the basic support is working, it's up to the student to use the remaining +time for porting more drivers, or implementing a more sophisticated userspace +infrastructure. The latter requires good understanding of the Hurd philosophy, +to come up with an appropriate design. + +Another option would be to evaluate whether a driver that is completely running +in user-space is feasible. 
<!-- TODO. Elaborate. --> + +Possible mentors: ? + +Exercise: Take a newer driver for a device in one of the subsystems we already +implement (disk or network) from a newer Linux version, or some other operating +system, and try to port it so that it runs in the existing driver framework. +The port needn't be elegant or complete; but it would be nice if you could get +it to work at least partially... diff --git a/community/gsoc/project_ideas/tcp_ip_stack.mdwn b/community/gsoc/project_ideas/tcp_ip_stack.mdwn new file mode 100644 index 00000000..b8fb76df --- /dev/null +++ b/community/gsoc/project_ideas/tcp_ip_stack.mdwn @@ -0,0 +1,37 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Hurdish TCP/IP Stack"]] + +The Hurd presently uses a [[TCP/IP_stack|hurd/translator/pfinet]] based on code from an old Linux version. +This works, but lacks some rather important features (like PPP/PPPoE), and the +design is not hurdish at all. + +A true hurdish network stack will use a stack of [[hurd/translator]] processes, +each implementing a different protocol layer. This way not only the +implementation gets more modular, but also the network stack can be used way +more flexibly. Rather than just having the standard socket interface, plus some +lower-level hooks for special needs, there are explicit (perhaps +filesystem-based) interfaces at all the individual levels; special applications +can just directly access the desired layer.
All kinds of packet filtering, +routing, tunneling etc. can be easily achieved by stacking components in the +desired constellation. + +While the general architecture is pretty much given by the various network +layers, it's up to the student to design and implement the various interfaces +at each layer. This task requires understanding the Hurd philosophy and +translator programming, as well as good knowledge of TCP/IP. + +This is [[GNU_Savannah_task 5469]]. + +Possible mentors: ? + +Exercise: Make some modification to the existing pfinet implementation. (More +specific suggestions welcome... :-) ) diff --git a/community/gsoc/project_ideas/tmpfs.mdwn b/community/gsoc/project_ideas/tmpfs.mdwn new file mode 100644 index 00000000..7c9cf67b --- /dev/null +++ b/community/gsoc/project_ideas/tmpfs.mdwn @@ -0,0 +1,44 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Fix tmpfs"]] + +In some situations it is desirable to have a file system that is not backed by +actual disk storage, but only by anonymous memory, i.e. lives in the RAM (and +possibly swap space). + +A simplistic way to implement such a memory filesystem is literally creating a +ramdisk, i.e. simply allocating a big chunk of RAM (called a memory store in +Hurd terminology), and creating a normal filesystem like ext2 on that. However, +this is not very efficient, and not very convenient either (the filesystem +needs to be recreated each time the ramdisk is invoked).
A nicer solution is +having a real [[hurd/translator/tmpfs]], which creates all filesystem +structures directly in RAM, allocating memory on demand. + +The Hurd has had such a tmpfs for a long time. However, the existing +implementation doesn't work anymore -- it got broken by changes in other parts +of the Hurd design. + +There are several issues. The most serious known problem seems to be that for +technical reasons it receives [[microkernel/mach/RPC]]s from two different +sources on one [[microkernel/mach/port]], and gets mixed up with them. Fixing +this is non-trivial, and requires a good understanding of the involved +mechanisms. + +The goal of this project is to get a fully working, full-featured tmpfs +implementation. It requires digging into some parts of the Hurd, including the +[[pager_interface|hurd/libpager]] and [[hurd/translator]] programming. This +task probably doesn't require any design work, only good debugging skills. + +Possible mentors: ? + +Exercise: Have a go at one of the existing issues in tmpfs. You may not be able +to finish this in the limited amount of time, but you should at least be able +to do a detailed analysis of the problem. diff --git a/community/gsoc/project_ideas/unionfs_boot.mdwn b/community/gsoc/project_ideas/unionfs_boot.mdwn new file mode 100644 index 00000000..77fc357f --- /dev/null +++ b/community/gsoc/project_ideas/unionfs_boot.mdwn @@ -0,0 +1,48 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Allow Using unionfs Early at Boot"]] + +In [[UNIX]] systems, traditionally most software is installed in a common directory +hierarchy, where files from various packages live beside each other, grouped by +function: user-invokable executables in `/bin`, system-wide configuration files +in `/etc`, architecture specific static files in `/lib`, variable data in +`/var`, and so on. To allow clean installation, deinstallation, and upgrade of +software packages, GNU/Linux distributions usually come with a package manager, +which keeps track of all files upon installation/removal in some kind of +central database. + +An alternative approach is the one implemented by GNU Stow: each package is +actually installed in a private directory tree. The actual standard directory +structure is then created by collecting the individual files from all the +packages, and presenting them in the common `/bin`, `/lib`, etc. locations. + +While the normal Stow package (for traditional UNIX systems) uses symlinks to +the actual files, updated on installation/deinstallation events, the Hurd +[[hurd/translator]] mechanism allows a much more elegant solution: +[[hurd/translator/stowfs]] (which is actually a special mode of +[[hurd/translator/unionfs]]) creates virtual directories on the fly, composed +of all the files from the individual package directories. + +The problem with this approach is that unionfs presently can be launched only +once the system is booted up, meaning the virtual directories are not available +at boot time. But the boot process itself already needs access to files from +various packages. So to make this design actually usable, it is necessary to +come up with a way to launch unionfs very early at boot time, along with the +root filesystem.
+ +Completing this task will require gaining a very good understanding of the Hurd +boot process and other parts of the design. It requires some design skills +also to come up with a working mechanism. + +Possible mentors: ? + +Exercise: Try to write a dummy server that is started instead of ext2fs on +system boot, and starts the actual ext2fs in turn. diff --git a/community/gsoc/project_ideas/virtualization.mdwn b/community/gsoc/project_ideas/virtualization.mdwn new file mode 100644 index 00000000..52b1f48d --- /dev/null +++ b/community/gsoc/project_ideas/virtualization.mdwn @@ -0,0 +1,92 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="Virtualization Using Hurd Mechanisms"]] + +The main idea behind the Hurd design is to allow users to replace almost any +system functionality ([[extensible_system|extensibility]]). Any user can easily +create a subenvironment using some custom [[servers|hurd/translator]] instead +of the default system servers. This can be seen as an +[[advanced_lightweight_virtualization|hurd/virtualization]] mechanism, which +allows implementing all kinds of standard and nonstandard virtualization +scenarios. + +However, though the basic mechanisms are there, currently it's not easy to make +use of these possibilities, because we lack tools to automatically launch the +desired constellations. + +The goal is to create a set of powerful tools for managing at least one +desirable virtualization scenario. 
One possible starting point could be the +[[hurd/subhurd]]/[[hurd/neighborhurd]] mechanism, which allows running a second almost totally +independent instance of the Hurd in parallel to the main one. The current +implementation has serious limitations though. A subhurd can only be started by +root. There are no communication channels between the subhurd and the main one. +There is no mechanism for safe sharing of hardware devices. Fixing these issues +could turn subhurds into a very powerful solution for lightweight +virtualization using so-called logical partitions. (Similar to Linux-vserver, +OpenVZ etc.) + +While subhurds allow creating a complete second system instance, with its own set +of Hurd servers and [[UNIX]] daemons and all, there are also situations where it is +desirable to have a smaller subenvironment, living within the main system and +using most of its facilities -- similar to a chroot environment. A simple way +to create such a subenvironment with a single command would be very helpful. + +It might be possible to implement (perhaps as a prototype) a wrapper using +existing tools (chroot and [[hurd/translator/unionfs]]); or it might require more specific tools, +like some kind of unionfs-like filesystem proxy that mirrors other parts of the +filesystem, but allows overriding individual locations, in conjunction with +either chroot or some similar mechanism to create a subenvironment with a +different root filesystem. + +It's also desirable to have a mechanism allowing a user to set up such a custom +environment in a way that it will automatically get launched on login -- +practically allowing the user to run a customized operating system in his own +account. + +Yet another interesting scenario would be a subenvironment -- using some kind +of special filesystem proxy again -- in which the user serves as root, being +able to create local sub-users and/or sub-groups. + +This would allow the user to run "dangerous" applications (web browser, chat +client etc.)
in a confined fashion, allowing them access to only a subset of the +user's files and other resources. (This could be done either using a lot of +groups for individual resources, and lots of users for individual applications; +adding a user to a group would give the corresponding application access to the +corresponding resource -- an advanced [[ACL]] mechanism. Or leave out the groups, +assigning the resources to users instead, and use the Hurd's ability for a +process to have multiple user IDs, to equip individual applications with sets +of user IDs giving them access to the necessary resources -- basically a +[[capability]] mechanism.) + +The student will have to pick (at least) one of the described scenarios -- or +come up with some other one in a similar spirit -- and implement all the tools +(scripts, translators) necessary to make it available to users in an +easy-to-use fashion. While the Hurd by default already offers the necessary +mechanisms for that, these are not perfect and could be further refined for +even better virtualization capabilities. Should need or desire for specific +improvements in that regard come up in the course of this project, implementing +these improvements can be considered part of the task. + +Completing this project will require gaining a very good understanding of the +Hurd architecture and spirit. Previous experience with other virtualization +solutions would be very helpful. + +Possible mentors: Olaf Buddenhagen (antrik) + +Exercise: Make some modification to the "boot" program used to start subhurds. +(More specific suggestions welcome...
:-) ) + +*Status*: Zheng Da has implemented [[network_virtualization|zhengda]] (an +important prerequisite for unprivileged subhurds) for GSoC 2008, along with +various other interesting bits, including a mechanism to override socket +servers; a proc proxy that allows running processes/subenvironments with a +pseudo device master port; and a mechanism to pass arbitrary virtual devices to +a subhurd. He is still working on running subhurds by normal users. diff --git a/community/gsoc/project_ideas/vm_tuning.mdwn b/community/gsoc/project_ideas/vm_tuning.mdwn new file mode 100644 index 00000000..79361189 --- /dev/null +++ b/community/gsoc/project_ideas/vm_tuning.mdwn @@ -0,0 +1,35 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="VM Tuning"]] + +Hurd/[[microkernel/Mach]] presently make very bad use of the available physical memory in the +system. Some of the problems are inherent to the system design (the kernel +can't distinguish between important application data and discardable disk +buffers for example), and can't be fixed without fundamental changes. Other +problems however are an ordinary lack of optimisation, like extremely crude +heuristics for when to start paging. (See <http://lists.gnu.org/archive/html/bug-hurd/2007-08/msg00034.html> for example.) +Many parameters are based on assumptions from +a time when typical machines had around 16 MiB of RAM, or simply have been set to +arbitrary values and never tuned for actual use.
+ +The goal of this project is to bring the virtual memory management in Hurd/Mach +closer to that of modern mainstream kernels (Linux, FreeBSD), by comparing the +implementation to other systems, implementing any worthwhile improvements, and +general optimisation/tuning. It requires very good understanding of the Mach +VM, and virtual memory in general. + +This project is related to [[GNU_Savannah_task 5489]]. + +Possible mentors: ? + +Exercise: Make some modification to the existing VM code. You could try to find +a piece of code that can be improved with simple code optimization, for +example. diff --git a/community/gsoc/project_ideas/xmlfs.mdwn b/community/gsoc/project_ideas/xmlfs.mdwn new file mode 100644 index 00000000..cfdfb4c7 --- /dev/null +++ b/community/gsoc/project_ideas/xmlfs.mdwn @@ -0,0 +1,53 @@ +[[meta copyright="Copyright © 2008, 2009 Free Software Foundation, Inc."]] + +[[meta license="""[[toggle id="license" text="GFDL 1.2+"]][[toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled +[[GNU_Free_Documentation_License|/fdl]]."]]"""]] + +[[meta title="xmlfs"]] + +Hurd [[translators|hurd/translator]] allow presenting underlying data in a +different format. This is a very powerful ability: it allows using standard +tools on all kinds of data, and combining existing components in new ways, once +you have the necessary translators. + +A typical example for such a translator would be xmlfs: a translator that +presents the contents of an underlying XML file in the form of a directory +tree, so it can be studied and edited with standard filesystem tools, or using +a graphical file manager, or to easily extract data from an XML file in a +script etc. 
+ +The exported directory tree should represent the DOM structure of the document, +or implement XPath, or both, or some combination thereof (perhaps XPath could +be implemented as a second translator working on top of the DOM one) -- +whatever works well, while sticking to XML standards as much as possible. + +Ideally, the translation should be reversible, so that another, complementary +translator applied on the expanded directory tree would yield the original XML +file again; and also the other way round, applying the complementary translator +on top of some directory tree and xmlfs on top of that would yield the original +directory again. However, with the different semantics of directory trees and +XML files, it might not be possible to create such a universal mapping. Thus +it is a desirable goal, but not a strict requirement. + +The goal of this project is to create a fully usable XML translator, that +allows both reading and writing any XML file. Implementing the complementary +translator also would be nice if time permits, but is not a mandatory part of the +task. + +The [[existing_partial_(read-only)_xmlfs_implementation|hurd/translator/xmlfs]] +can serve as a starting point. + +This task requires pretty good design skills. Good knowledge of XML is also +necessary. Learning translator programming will obviously be necessary to +complete the task. + +Possible mentors: Olaf Buddenhagen (antrik) + +Exercise: Make some modification to the existing xmlfs translator, and write a +shell script that uses xmlfs to extract some interesting information from an +.odt document. (More specific suggestions welcome... :-) )
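
To make the DOM-as-directory idea from the xmlfs page more concrete, here is a small sketch in Python of one possible mapping from an XML document to filesystem-like paths. The layout is an assumption made for this illustration and is not the layout of the existing xmlfs translator: child elements become directories named `<index>-<tag>`, attributes become `@name` entries, and element text becomes a `#text` entry. The function `xml_to_paths` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def xml_to_paths(xml_text):
    """Flatten an XML document into a {path: value} dictionary.

    Assumed layout (illustrative only): each child element becomes a
    directory named '<index>-<tag>' so that sibling order and repeated
    tags survive; attributes become '@name' entries and non-empty element
    text becomes a '#text' entry.
    """
    entries = {}

    def walk(element, prefix):
        for name, value in element.attrib.items():
            entries[f"{prefix}/@{name}"] = value
        text = (element.text or "").strip()
        if text:
            entries[f"{prefix}/#text"] = text
        for index, child in enumerate(element):
            walk(child, f"{prefix}/{index}-{child.tag}")

    root = ET.fromstring(xml_text)
    walk(root, "/" + root.tag)
    return entries
```

For a document such as `<book id="b1"><title>Hurd</title></book>`, this yields the entries `/book/@id` and `/book/0-title/#text`. The sketch ignores text that follows a child element (mixed content), comments and namespaces, which hints at why a fully reversible mapping between XML files and directory trees may be hard to achieve.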