summaryrefslogtreecommitdiff
path: root/community/gsoc/project_ideas/namespace-based_translator_selection.mdwn
blob: e41059b7d1dd708ffb8007f6f8e7734febee5641 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
[[!meta copyright="Copyright © 2008, 2009, 2012, 2018 Free Software Foundation,
Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!meta title="Namespace-based Translator Selection"]]

[[!template id=highlight text="""/!\ Obsolete? /!\

---

This is probably no longer valid as a Google Summer of Code project.
[[Sergiu Ivanov|scolobb]] has been working *voluntarily* on this task an
inofficial GSoC 2008 participant.  Not all the desired functionality is in
place yet, though, but the status needs to be evaluated."""]]


The main idea behind the Hurd is to make (almost) all system functionality
user-modifiable ([[extensible_system|extensibility]]).  This includes a
user-modifiable filesystem: the whole filesystem is implemented decentrally, by
a set of filesystem servers forming the directory tree together, a
[[hurd/virtual_file_system]].  These filesystem servers are called
[[translators|hurd/translator]], and are the most visible feature of the Hurd.

The reason they are called translators is because when you set a translator on
a filesystem node, the underlying node(s) are hidden by the translator, but the
translator itself can access them, and present their contents in a different
format -- translate them.  A simple example is a
[[gunzip_translator|hurd/translator/storeio]], which can be set on a gzipped
file, and presents a virtual file with the uncompressed contents.  Or the other
way around.  Or a translator that presents an
[[XML_file_as_a_directory_tree|hurd/translator/xmlfs]].  Or an mbox as a set of
individual files for each mail ([[hurd/translator/mboxfs]]); or ever further
breaking it down into headers, body, attachments...

This gets even more powerful when translators are used as building blocks for
larger applications: A mail reader for example doesn't need backends for
understanding various mailbox formats anymore. All formats can be parsed by
special translators, and the mail reader gets the data as a uniform, directly
usable filesystem structure. Translators can also be stacked: If you have a
compressed mailbox for example, first apply a gunzip translator, and then an
mbox translator on top of that.

There are a few problems with the way translators are set, though. For one,
once a translator is set on a node, you always see the translated content. If
you need the untranslated contents again, to do a backup for example, you first
need to remove the translator again. Also, having to set a translator
explicitly before accessing the contents is pretty cumbersome, making this
feature almost useless.

A possible solution is implementing a mechanism for selecting translators
through special filename attributes.  For example you could use
`index.html.gz,,+` and `index.html.gz,,-` to choose between translated and
untranslated versions of a file.  Or you could use `index.html.gz,,u` to get
the contents of the file with a gunzip translator applied automatically.  You
could also use attributes on whole directory trees: `.,,0/` would give you a
directory tree corresponding to the current directory, but with any translators
disabled, for doing a backup.  And `site,,u/*.html.gz` would present a whole
directory tree of compressed HTML files as uncompressed files.

One benefit of the Hurd's flexibility is that it should be possible to
implement such a mechanism without touching the existing Hurd components:
Rather, just implement a special proxy, that mirrors the normal filesystem, but
is able to interpret the special extensions and present transformed files in
place of the original ones.

In the long run it's probably desirable to have the mechanism implemented in
the standard name lookup mechanism, so it will be available globally, and avoid
the overhead of a proxy; but for the beginning the proxy solution is much more
flexible.

The goal of this project is implementing a prototype proxy; perhaps also a
first version of the global variant as proof of concept, if time permits. It
requires good understanding of the name lookup mechanism, and translator
programming; but the implementation should not be too hard. Perhaps the hardest
part is finding a convenient, flexible, elegant, hurdish method for mapping the
special extensions to actual translators...

Possible mentors: Olaf Buddenhagen (antrik)

Exercise: Try to make some modification to the existing unionfs and/or firmlink
translators. (More specific suggestions welcome... :-) )