author | Samuel Thibault <samuel.thibault@ens-lyon.org> | 2011-07-16 18:09:44 +0200 |
---|---|---|
committer | Samuel Thibault <samuel.thibault@ens-lyon.org> | 2011-07-16 18:09:44 +0200 |
commit | 856c182cd0d76db0ec2444cec96e9c53714ec930 (patch) | |
tree | 70eb5719020b548fb722c3ad6f6e253aabc89312 /user/jkoenig | |
parent | 5d5d6f01b3e0e700a33de02f9ece38557bb2af13 (diff) | |
parent | 89f33677640b8a6ff0bb2b7b4cb2b6c24670bde9 (diff) |
Merge branch 'master' of flubber:~hurd-web/hurd-web
Diffstat (limited to 'user/jkoenig')
-rw-r--r-- | user/jkoenig/d-i.mdwn | 358 | ||||
-rw-r--r-- | user/jkoenig/gsoc2011_proposal.mdwn | 634 | ||||
-rw-r--r-- | user/jkoenig/gsoc2011_proposal/discussion.mdwn | 180 | ||||
-rw-r--r-- | user/jkoenig/java.mdwn | 321 | ||||
-rw-r--r-- | user/jkoenig/java/discussion.mdwn | 526 | ||||
-rw-r--r-- | user/jkoenig/java/java-access-bridge.mdwn | 78 | ||||
-rw-r--r-- | user/jkoenig/java/proposal.mdwn | 628 |
7 files changed, 1920 insertions, 805 deletions
diff --git a/user/jkoenig/d-i.mdwn b/user/jkoenig/d-i.mdwn new file mode 100644 index 00000000..0b9f9f7d --- /dev/null +++ b/user/jkoenig/d-i.mdwn @@ -0,0 +1,358 @@ +[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +## Hurd Debian-Installer + +My [proposal](http://wiki.debian.org/SummerOfCode2010/HurdDebianInstaller/JeremieKoenig) +to work on porting d-i on Hurd +as a [Google Summer of Code](http://code.google.com/soc/) student +has been accepted by the Debian project. + +I will be keeping track of my progress on this page. 
+ +### Links + + * [Modified packages](http://jk.fr.eu.org/debian/unstable) + * [Latest images](http://jk.fr.eu.org/debian/hurd-installer) + * [Debian bugs](http://bugs.debian.org/cgi-bin/pkgreport.cgi?users=jk@jk.fr.eu.org&tag=gsoc2010) + * [BusyBox port](http://lists.debian.org/debian-bsd/2010/05/msg00048.html) + * [GNU Mach initrd](http://lists.gnu.org/archive/html/bug-hurd/2010-06/msg00047.html) + +### Roadmap + +* **mach**: initrd support + * (./) preliminary patch posted and self-built (2010-06-12) + * adjustments will be needed (postponed) + * consider the alternatives discussed on bug-hurd (postponed) + +* **glibc**: fix `mkdir("/")` which returned `EINVAL` + * (./) eglibc 2.11.2-1 includes a quick fix by youpi (2010-06-15) + * (./) more complete patch posted to bug-hurd, + since other calls return incorrect errors under some circumstances + (2010-06-16) + * more work on it will be needed to make it fix the whole thing + (postponed) + +* (./) **partman** (2010-06-23) + * (./) add hurd-i386 to + `partman-partitioning/lib/disk-label.sh` + (2010-06-16, commited by youpi on 2010-06-23) + * (./) short-circuit + `partman-basicfilesystems/init.d/kernelmodules_basicfilesystems` + (2010-06-16) + * (./) partman-auto recipes: + make the default filesystem os-dependent + when it has not been preseeded (ie. 
the *seen* flag is clear) + * (./) force 4k blocks and 128 bytes inodes + * (./) submit patches to bugs.debian.org + ([[!debbug 586870]] and [[!debbug 586871]]) + * (./) rebuild with responsible version numbers and upload to my repository + +* (./) **libparted** (2010-06-23) + * (./) fix device paths ([[!debbug 586696]]) + * (./) fix crash on exit for part:* stores ([[!debbug 586682]]) + +* **hurd-udeb** (2010-06-23) + * (./) rebuild with the hack suggested by youpi for qemu network configuration + * (./) fix mount to accept `-o defaults` + * cleanup, ask youpi to commit + +* reloading the partition table (2010-06-25) + * User-space part stores + * (./) hurd-udeb now uses `part:N:device:X` for partition devices + (2010-06-23) + * (./) it also provides /lib/partman/commit.d/??hurd\_reloadpart, + which basically does `settrans -ag /dev/[hs]d*`. + (2010-06-24) + * Kernel-based partition devices + * (./) Mach's drivers from Linux support reloading partitions. + With help from youpi this has been made available through a + device\_set\_status() call. + * (./) make libparted use it + * Reminder: + I should file a bug against libparted with the patch sometime. + +* (./) The `/servers/exec` issue (2010-06-26) + * Due to /servers being inexistant, + the bootstrap ext2fs could not register the initial exec server, + meaning that non-bootstrap filesystems used a different one + (started from the passive translator), + which for some reason died on shell scripts, making them stall. + * Adding the `/servers` directory to hurd-udeb fixed it, + as well as the /hurd/proc issue + (failed to be run by init the first time around). + * Reminder: report the non-bootstrap exec servers failure on scripts. + +* (./) **base-installer**: (2010-06-26) + * Work around non-existant /proc/mounts. + * Firmlink /servers into /target after debootstrap + to make the network available. 
+ +* (./) **grub-installer** + * (./) add hurd support (2010-06-27) + * /!\ grub-legacy still needs to be tested + * submit changes as a Debian bug + +**Milestone (2010-06-28): +installer kindof works, with documented manual intervention required** + +* (./) Sort out the situation with dev node creation (2010-07-07): + * Devices and servers used to be set up by debootstrap; + the hurd package would add some missing nodes. + * New strategy implemented in hurd and debootstrap: + * debootstrap uses active firmlinks into the host system + for the target system's /dev and /servers. + * the hurd package now include a `setup-translators` script, + which is used to register the passive translators by the installer's + `/libexec/runsystem` and hurd's postinst script. + +* **busybox**: submit upstream and to [[!debbug 323670]] + (waiting for upstream to review) + * (./) I have mentioned my work on the upstream mailing list, + * (./) merge the recent changes from upstream, + notably to the build system. + (2010-06-23) + * (./) ask upstream for review and merge + (2010-06-25) + * (./) sent as patches as requested + (2010-07-08) + * (./) backport any additional changes onto the debian branch + * (./) hijack [[!debbug 323670]] and submit my patches + +* **aptitude**: + * Currently broken on hurd-i386: + [[!debpkg gtest]] fails to build because of a segfault in one of the test + cases, [[!debpkg google-mock]] and hence [[!debpkg aptitude]] are missing + it as a build-dep. + The older package is not installable anymore because it's linked against + an older version of libept, which has been removed. + * (./) I bypassed the tests and uploaded the 3 packages to my repository + (2010-07-08) + * The segfault will have to be sorted out. (postponed) + +* (./) "Fix" the swap situation. (2010-07-08) + * The device\_close() libstore patch + had the unfortunate effect of making swapon fail, + since the device it activates has to be kept open. 
+ * add options for MAKEDEV and setup-devices + to use the libparted stores + * disable youpi's patch + * make partman-basicfilesystems re-create the device + as a kernel partition, which is needed for swapon + +* (./) netcfg-static: port to hurd (2010-07-09) + * There was some amount of hurd support already + (namely, activating the interface by replacing the socket translator) + * However, this code started an active translator with + di\_exec\_shell\_log("settrans -a ...), + which stalled as a consequence of it capturing libdi's pipe + as its standard output. + * Network devices must be probed by trying to open Mach devices + with predetermined names (currently eth%d, wl%d), + because getifaddrs() does not seem to work on Hurd. + * /!\ netcfg, and configuring the installed system, postponed. + +* **procps** 3.2.7-11 (current hurd-i386 version) has [[!debbug 546888]] + * (./) Submit [[!debbug 588677]] and upload the result to my repository. + (2010-07-11) + +* (./) Set up a Debian mirror with modified packages for installation + * the [mirror](http://jk.fr.eu.org/debian/hurd-install/mirror) + is now up and running (2010-07-06) + * hacked the image build script to include its public key in + debian-archive-keyring at image build time (2010-07-08) + * Apparently debootstrap does not handle multiple versions very well. + Fix by using dpkg-scan{package,sources} rather than apt-ftparchive + to create index files. + (2010-07-10) + * Use the override files from ftp.debian.org, + to avoid debootstrap grabbing inappropriate packages. + * Changed them to make [[!debpkg ifupdown]], + [[!debpkg dhcp3-client]] and [[!debpkg dhcp3-client]] priority extra, + because they're uninstallable at the moment. + (2010-07-12) + +* (./) Put together a "jk-archive-keyring" package, + so that the mirror is authenticated in the target system as well. 
+ (2010-07-12) + +* (./) Fix grub for user-space partitions (2010-07-16) + * grub-probe detects the whole device rather than the partition + as the device behind /boot/grub. + Consequently, grub-install fails. + * One approach would be to replace /dev/hd* by kernel devices + for file systems as well as for swap partitions. + > {X} this makes the installer crash, + > possibly due to cache coherency issue between hdX and hdXsY. + + * (./) GRUB2 kern/emu/getroot.c + [patched](http://lists.gnu.org/archive/html/bug-hurd/2010-07/msg00059.html) + to support part stores. + +* (./) Fix finish-install to skip `finish-install.d/90console` on Hurd + (2010-07-17) + +* (./) Avoid starting unnecessary /dev translators in a burst (2010-07-20) + * Use debootstrap use the extracted /usr/lib/hurd/setup-translators + to create device and server nodes in /target, + then firmlink the whole /target/dev and /target/servers + to the outer system. + * Make hurd.postinst not touch them on initial install. + +* (./) Fix mach-defpager for file and part stores on larger devices + * Use DEVICE\_GET\_RECORDS instead of DEVICE\_GET\_SIZE, which overflows an int + (2010-07-22) + +**Milestone (2010-07-22): +installer works but it's still somewhat ugly and broken** + +* (./) Ship the UTF-8 font for the hurd console + (2010-07-22) + * Upload a version of bogl with youpi's patch for Hurd. + (see [[!debbug 589987]]) + * Fix the hurd console for fonts with 16 pixels wide glyphs + (ie. handle the 8-wide glyph in there correclty) + * Support double-width glyphs (2010-07-24) + * (./) However the reduced font can't be loaded yet, + so make installer/build/Makefile + ship the whole `/usr/src/unifont.bgf` + as `/usr/share/hurd/vga-system.bgf`. + +* (./) Make the installer used the extended capabilities of the Hurd console + (2010-07-23) + * Set an UTF-8 locale in `/lib/debian-installer.d/S41term-hurd`. + * localechooser: set the language display level to 3 + when using the hurd console. 
+ +* (./) **busybox**: cross-platform package uploaded to experimental + (2010-08-03?) + * Aurelien Jarno updated the packaging to busybox 1.17.1, + fixed a whole lot of bugs, + and uploaded a new package with both our changes; + * most patches adopted upstream, and included in the new package; + * (u)mount/swaponoff ported to kFreeBSD; + * per-OS configuration overrides. + +* (./) Update custom packages to the latest versions + and send updated patches to the BTS + (2010-08-11) + * updated partman-base to choose a default filesystem in debian/rules + rather than at runtime, + as suggested by Aurelien Jarno in [[!debbug 586870]] + * patch submitted for debian-installer-utils + ([[!debbug 592684]]). + * patch submitted for locale-chooser + ([[!debbug 592690]]). + * debootstrap, grub-installer and finish-install not yet submitted, + since the details may still change. + +* (./) **partman-target**: fix fstab creation + (2010-08-11) + * See [[!debbug 592671]] + * debian/rules: set `partman/mount_style` to `traditional` on Hurd. + * finish.d/create\_fstab\_header: add a Hurd case. + +* (./) **rootskel**: FTBFS on Hurd and other quirks + (to be fixed very soon) + +* **d-i/installer/build**: (expected soon) + * publish the patch I use + * sort out the changes suitable for inclusion + and ask youpi and/or debian-boot@l.d.o to commit them + +* call for testing and fix the bugs + +* Bug in setup-translators/MAKEDEV: + permissions are broken for nodes re-created through `MAKEDEV -k`, + because MAKEDEV's chmod/chown reaches the pre-existing translator + * Maybe settrans could be made to accept -o/--owner and + -p/--perm, to set the permissions for the underlying node? + +* (./) Silence the "no kernel" warning somehow. + +* Investigate the wget/libc/pfinet/whatever bug which corrupts Packages.gz, + see the IRC log for 2010-07-23, around 1am UTC+0200 + +* Try to resolve problems with udebs which are uninstallable on hurd-i386, + such as installation-locale and partman-whatever. 
+ +* Provide `/proc/cmdline -> 2/cmdline`, or something. + +* Prepare a NMU for genext2fs (which is orphaned), + and ask youpi to sponsor the upload. + +* **busybox**: port + * fix stty/stat/ipcs on kFreeBSD, + * generally port more stuff, + * *ip* is needed (maybe) for network configuration, + * *mount*, *swaponoff* can be from hurd-udeb for now, + though the kFreeBSD people will need them + +* **partman**: further adjustments + * partman-base: handle /dev/hd?s* in lib/base.h + * hide irrelevant mount options? (sync, relatime) + +* Network configuration on the installed system. + This includes porting ifupdown and isc-dhcp-client, + which are currently uninstallable on hurd-i386. +* Also, better DHCP support during and after installation + +* improve the [initrd situation](FIXME: link to bug-hurd post): + ajust the ramdisk support in Mach, + use tmpfs if possible. + +* mklibs{,-copy}: + test library reduction, + make it copy the ld.so -> ld.so.1 symlink. + +* (./) hurd console fonts + +**Milestone (expected 2010-07-19): +it works great and it's beautiful** + +* test, fix, document +* support more types of installation images +* give a shot at the graphical installer if time permits +* integrate wireless drivers with netcfg +* see how [[zhengda]]'s work on DDE could be integrated +* etc.. + +### Mostly done + +#### Week 1 (2010-05-24) + +* genext2fs: patches submitted, [[!debbug 562999]] + which add support for all block sizes and choosing them at runtime. +* busybox: started porting the upstream and Debian package to Hurd and FreeBSD +* rebuilding hurd-udeb from the pkg-hurd version + and adding a ld.so link to the initrd + fixes the exec translator crashing on startup. + (BTW would there be a mean to detect this from the libdiskfs bootstrap code + and report it ?) + +#### Week 2 (2010-05-31 to 2010-06-06) + +* *busybox*: patches [posted](http://lists.debian.org/debian-bsd/2010/05/msg00048.html). 
+* *libdebian-installer4*: [[!debbug 584538]] +* started working on mach initrd support +* the installation images could boot into the main-menu + with the following changes: + * rebuild hurd-udeb from with the latest pkg-hurd patches + * use busybox from my osports-debian branch (see link above) + * tweak the d-i image build scripts + * the symlink /lib/ld.so -> ld.so.1 needs to be created somehow + (youpi mentionned it being the job of libc0.3-udeb I think) + * fix the poll() issue in libdebian-installer + (patch to be submitted soon), + also there is some hurd doxygen short-circuiting stuff + there which does not apply any more and prevents is to build. + * feed the initrd as a hard drive in qemu + (with some more space added to avoid it from becoming full) + diff --git a/user/jkoenig/gsoc2011_proposal.mdwn b/user/jkoenig/gsoc2011_proposal.mdwn index 4052f455..9840f14f 100644 --- a/user/jkoenig/gsoc2011_proposal.mdwn +++ b/user/jkoenig/gsoc2011_proposal.mdwn @@ -1,628 +1,12 @@ +[[!meta copyright="Copyright © 2010 Free Software Foundation, Inc."]] -# Java for Hurd (and vice versa) +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] -Contact information: - - * Full name: Jérémie Koenig - * Email: jk@jk.fr.eu.org - * IRC: jkoenig on Freenode and OFTC - -## Introductions - -I am a first year M.Sc. student -in Computer Science at University of Strasbourg (France). -My interests include capability-based security, -programming languages and formal methods -(in particular, object-capability languages and proof-carrying code). 
- -### Proposal summary - -This project would consist in improving Java support on Hurd. -The first part would consist in -fixing bugs and porting Java-related packages. -The second part would consist in -creating low-level Java bindings for the Hurd interfaces, -as well as libraries to make translator development easier. - -### Previous involvement - -I started contributing to Hurd last summer, -during which I participated to Google Summer of Code -as a student for the Debian project. -I worked on porting Debian-Installer to Hurd. -This project was mostly a success, -although we still have to use a special mirror for installation -with a few modified packages -and tweaked priorities -to work around some uninstallable packages -with Priority: standard. - -Shortly afterwards, -I rewrote the procfs translator -to fix some issues with memory leaks, -make it more reliable, -and improve compatibility with Linux-based tools -such as `procps` or `htop`. - -Although I have not had as much time -as I would have liked to dedicate to the Hurd -since that time, -I have continued to maintain the mirror in question, -and I have started to work -on implementing POSIX threads signal semantics in glibc. - -### Project-related skills and interests - -I have used Java mostly for university assignments. -This includes non-trivial projects -using threads and distributed programming frameworks -such as Java RMI or CORBA. -I have also used it to experiment with -Google App Engine -(web applications) -and Google Web Toolkit -(a compiler from Java to Javascript which helps with AJAX code), -and I have some limited experience with JNI -(the Java Native Interface, to link Java with C code). - -My knowledge of the Hurd and Debian GNU/Hurd is reasonable, -as the Debian-Installer and procfs projects -gave me the opportunity to fiddle with many parts of the system. 
- -Initially, -I started working on this project because I wanted to use -[Joe-E](http://code.google.com/p/joe-e/) -(a subset of Java) -to investigate the potential -[[applications of object-capability languages|objcap]] -in a Hurd context. -I also believe that improving Java support on Hurd -would be an important milestone. - -### Organisational matters - -I am subscribed to bug-hurd@g.o and -I do have a permanent internet connexion. - -I would be able to attend the regular IRC meetings, -and otherwise communicate with my mentor -through any means they would prefer -(though I expect email and IRC would be the most practical). -Since I'm already familiar with the Hurd, -I don't expect I would require too much time from them. - -My exams end on May 20 so I would be able to start coding -right at the beginning of the GSoC period. -Next year's term would probably begin around September 15, -so that would not be an issue either. -I expect I would work around 40 hours per week, -and my waking hours would be flexible. - -I don't have any other plans for the summer -and would not make any if my project were to be accepted. - -Full disclosure: -I also submitted a proposal to the Jikes RVM project -(which is a research-oriented Java Virtual Machine, -itself written in Java) -for implementing a new garbage collector into the MMTk subsystem. - -## Improve Java support - -### Justification - -Java is a popular language and platform used by many desktop and web -applications (mostly on the server side). As a consequence, competitive Java -support is important for any general-purpose operating system. -Better Java support would also be a prerequisite -for the second part of my proposal. 
- -### Current situation - -Java is currently supported on Hurd with the GNU Java suite: - - * [GCJ](http://gcc.gnu.org/java/), - the GNU Compiler for Java, is part of GCC and can compile Java - source code to Java bytecode, and both source code and bytecode to - native code; - * libgcj is the implementation of the Java runtime which GCJ uses. - It is based on [GNU Classpath](http://www.gnu.org/software/classpath/). - It includes a bytecode interpreter which enables - Java applications compiled to native code to dynamically load and execute - Java bytecode from class files. - * The gij command is a wrapper around the above-mentioned virtual machine - functionality of libgcj and can be used as a replacement for the java - command. - -However, GCJ does not work flawlessly on Hurd.r -For instance, some parts of libgcj relies on -the POSIX threads signal semantics, which are not yet implemented. -In particular, this makes ant hang waiting for child processes, -which makes some packages fail to build on Hurd -(“ant” is the “make” of the Java world). - -### Tasks - - * **Finish implementing POSIX thread semantics** in glibc (high priority). - According to POSIX, signal dispositions should be global to a process, - while signal blocking masks should be thread-specific. Signals sent to the - process as a whole are to be delivered to any thread which does not block - them. By contrast, Hurd has per-thread signal dispositions and signals - sent to a process are delivered to the main thread only. I have been - working on refactoring the glibc signal code and implementing the POSIX - semantics as a per-thread option. However, due to lack of time I have not - yet been able to test and debug my code properly. Finishing this work - would be my first task. - * **Fix further problems with GCJ on Hurd** (high priority). While I’m not - aware of any other problems with GCJ at the moment, I suspect some might - turn up as I progress with the other tasks. 
Fixing these problems would - also be a high-priority task. - * **Port OpenJDK 6** (medium priority). While GCJ is fine, it is not yet - 100% complete. It is also slower than OpenJDK on architectures where a - just-in-time compiler is available. Porting OpenJDK would therefore - improve Java support on Hurd in scope and quality. Besides, it would also - be a good way to test GCJ, which is used for bootstrapping by the Debian - OpenJDK packages. Also note that OpenJDK 6 is now the default Java - Runtime Environment on all released Linux-based Debian architectures; - bringing Hurd in line with this would probably be a good thing. - * **Port Eclipse and other Java applications** (low priority). Eclipse is a - popular, state-of-the-art IDE and tool suite used for Java and other - languages. It is a dependency of the Joe-E verifier (see part 3 of this - proposal). Porting Eclipse would be a good opportunity to test GCJ and - OpenJDK. - -### Deliverables - - * The glibc pthreads patch and any other fixes on the Hurd side - would be submitted upstream - * Patches against Debian source packages - required to make them build on Hurd would be submitted - to the [Debian bug tracking system](http://bugs.debian.org/). - - -## Create Java bindings for the Hurd interfaces - -### Justification - -Java is used for many applications and often taught to -introduce object-oriented programming. The fact that Java is a -garbage-collected language makes it easier to use, especially for the less -experienced programmers. Besides, its object-oriented nature is a -natural fit for the capability-based design of Hurd. -The JVM is also used as a target for many other languages, -all of which would benefit from the access provided by these bindings. 
- -Advantages over other garbage-collected, object-oriented languages include -performance, type safety and the possibility to compile a Java translator to -native code and -[link it statically](http://gcc.gnu.org/wiki/Statically_linking_libgcj) -using GCJ, should anyone want to use a -translator written in Java for booting. -Note that Java is -[being](http://www.linuxjournal.com/article/8757) -[used](http://oss.readytalk.com/avian/) -in this manner for embedded development. -Since GCJ can take bytecode as its input, -this expect this possibility would apply to any JVM-based language. - -Java bindings would lower the bar for newcomers -to begin experimenting with what makes Hurd unique -without being faced right away with the complexity of -low-level systems programming. - -### Tasks summary - - * Implement Java bindings for Mach - * Implement a libports-like library for Java - * Modify MIG to output Java code - * Implement libfoofs-like Java libraries - -### Design principles - -The principles I would use to guide the design -of these Java bindings would be the following ones: - - * The system should be hooked into at a low level, - to ensure that Java is a "first class citizen" - as far as the access to the Hurd's interfaces is concerned. - * At the same time, the memory safety of Java should be maintained - and extended to Mach primitives such as port names and - out-of-line memory regions. - * Higher-level interfaces should be provided as well - in order to make translator development - as easy as possible. - * A minimum amount of JNI code (ie. C code) should be used. - Most of the system should be built using Java itself - on top of a few low-level primitives. - * Hurd objects would map to Java objects. - * Using the same interfaces, - objects corresponding to local ports would be accessed directly, - and remote objects would be accessed over IPC. 
- -One approach used previously to interface programming languages with the Hurd -has been to create bindings for helper libraries such as libtrivfs. Instead, -for Java I would like to take a lower-level approach by providing access to -Mach primitives and extending MIG to generate Java code from the interface -description files. - -This approach would be initially more involved, and would introduces several -issues related to overcoming the "impedance mismatch" between Java and Mach. -However, once an initial implementation is done it would be easier to maintain -in the long run and we would be able to provide Java bindings for a large -percentage of the Hurd’s interfaces. - -### Bindings for Mach system calls - -In this low-level approach, my intention is to enable Java code to use Mach -system calls (in particular, mach_msg) more or less directly. This would -ensure full access to the system from Java code, but it raises a number of -issues: - - * the Java code must be able to manipulate Mach-level entities, such as port - rights or page-aligned buffers mapped outside of the garbage-collected - heap (for out-of-line transfers); - * putting together IPC messages requires control of the low-level - representation of data. - -In order to address these concerns, classes would be encapsulating these -low-level entities so that they can be referenced through normal, safe objects -from standard Java code. Bindings for Mach system calls can then be provided -in terms of these classes. Their implementation would use C code through the -Java Native Interface (JNI). - -More specifically, this functionality would be provided by the `org.gnu.mach` -package, which would contain at least the following classes: - - * `MachPort` would encapsulate a `mach_port_t`. (Some of) its constructors - would act as an interface for the `mach_port_allocate()` system call. 
- `MachPort` objects would also be instantiated from other parts of the JNI - C code to represent port rights received through IPC. The `deallocate()` - method would call `mach_port_deallocate()` and replace the encapsulated - port name with `MACH_PORT_DEAD`. We would recommend that users call it - when a port is no longer used, but the finalizer would also deallocate the - port when the `MachPort` object is garbage collected. - * `Buffer` would represent a page-aligned buffer allocated outside of the - Java heap, to be transferred (or having been received) as out-of-line - memory. The JNI code would would provide methods to read and write data at - an arbitrary offset (but within bounds) and would use `vm_allocate()` and - `vm_deallocate()` in the same spirit as for `MachPort` objects. - * `Message` would allow Java code to put together Mach messages. The - constructor would allocate a `byte[]` member array of a given size. - Additional methods would be provided to fill in or query the information - in the message header and additional data items, including `MachPort` and - `Buffer` objects which would be translated to the corresponding port names - and out-of-line pointers. - A global map from port names to the corresponding `MachPort` object - would probably be needed to ensure that there is a one-to-one - correspondence. - * `Syscall` would provide static JNI methods for performing system calls not - covered by the above classes, such as `mach_msg()` or - `mach_thread_self()`. These methods would accept or return `MachPort`, - `Buffer` and `Message` objects when appropriate. The associated C code - would access the contents of such objects directly in order to perform the - required unsafe operations, such as constructing `MachPort` and `Buffer` - objects directly from port names and C pointers. 
- -Note that careful consideration should be given to the interfaces of these -classes to avoid “safety leaks” which would compromise the safety guarantees -provided by Java. Potential problematic scenarios include the following -examples: - - * It must not be possible to write an integer at some position in a - `Message` object, and to read it back as a `MachPort` or `Buffer` object, - since this would allow unsafe access to arbitrary memory addresses and - mach port names. - * Providing the `mach_task_self()` system call would also provide access to - arbitrary addresses and ports by using the `vm_*` family of RPC operations - with the returned `MachPort` object. This means that the relevant task - operations should be provided by the `Syscall` class instead. - -Finally, access should be provided to the initial ports and file descriptors -in `_hurd_ports` and provided by the `getdport()` function, -for instance through static methods such as -`getCRDir()`, `getCWDir()`, `getProc()`, ... in a dedicated class such as -`org.gnu.hurd.InitPorts`. 
- -A realistic example of code based on such interfaces would be: - - import org.gnu.mach.MsgType; - import org.gnu.mach.MachPort; - import org.gnu.mach.Buffer; - import org.gnu.mach.Message; - import org.gnu.mach.Syscall; - import org.gnu.hurd.InitPorts; - - public class Hello - { - public static main(String argv[]) - /* Parent class for all Mach-related exceptions */ - throws org.gnu.mach.MachException - { - /* Allocate a reply port */ - MachPort reply = new MachPort(); - - /* Allocate an out-of-line buffer */ - Buffer data = new Buffer(MsgType.CHAR, 13); - data.writeString(0, "Hello, World!"); - - /* Craft an io_write message */ - Message msg = new Message(1024); - msg.setRemotePort(InitPorts.getdport(1)); - msg.setLocalPort(reply, Message.Type.MAKE_SEND_ONCE); - msg.setId(21000); - msg.addBuffer(data); - - /* Make the call, MACH_MSG_SEND | MACH_MSG_RECEIVE */ - Syscall.machMsg(msg, true, true, reply); - - /* Extract the returned value */ - msg.assertId(21100); - int retCode = msg.readInt(0); - int amount = msg.readInt(1); - } - } - -Should this paradigm prove insufficient, -more ideas could be borrowed from the -[`org.vmmagic`](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.151.5253&rep=rep1&type=pdf) -package used by [Jikes RVM](http://jikesrvm.org/), -a research Java virtual machine itself written in Java. - -### Generating Java stubs with MIG - -Once the basic machinery is in place to interface with Mach, Java programs -have more or less equal access to the system functionality without resorting -to more JNI code. However, as illustrated above, this access is far from -convenient. - -As a solution I would modify MIG to add the option to output Java code. MIG -would emit a Java interface, a client class able to implement the interface -given a Mach port send right, an a server class which would be able to handle -incoming messages. 
The class diagram below, although it is by no means -complete or exempt of any problem, illustrates the general idea: - -[[gsoc2011_classes.png]] - -This structure is somewhat reminiscent of -[Java RMI](http://en.wikipedia.org/wiki/Java_remote_method_invocation) -or similar systems, -which aim to provide more or less transparent access to remote objects. -The exact way the Java code would be generated still needs to be determined, -but basically: - - * An interface, corresponding to the header files generated by MIG, would - enumerate the operations listed in a given .defs files. Method names would - be transformed to adhere to Java conventions (for instance, - `some_random_identifier` would become `someRandomIdentifier`). - * A user class, corresponding to the `*User.c` files, - would implement this interface by doing RPC over a given MachPort object. - * A server class, corresponding to `*Server.c`, would be able to handle - incoming messages using a user-provided implementation of the interface. - (Possibly, a skeleton class providing methods which would raise - `NotImplementedException`s would be provided as well. - Users would derive from this class and override the relevant methods. - This would allow them not to implement some operations, - and would avoid pre-existing code from breaking when new operations are - introduced.) - -In order to help with the implementation of servers, some kind of library -would be needed to associate Mach receive rights with server objects and to -handle incoming messages on dedicated threads, in the spirit of libports. -This would probably require support for port sets at the level of the Mach -primitives described in the previous section. - -When possible, operations involving the transmission of send rights -of some kind would be expressed in terms of the MIG-generated interfaces -instead of `MachPort` objects. 
-Upon reception of a send right, -a `FooUser` object would be created -and associated with the corresponding `MachPort` object. -If the received send right corresponds to a local port -to which a server object has been associated, -this object would be used instead. -This way, -subsequent operations on the received send right -would be handled as direct method calls -instead of going through RPC mechanisms. - -Some issues will still need to be solved regarding how MIG will convert -interface description files to Java interfaces. For instance: - - * `.defs` files are not explicitly associated with a type. For instance in - the example above, MIG would have to somehow infer that `io_t` corresponds - to `this` in the `Io` interface. - * More generally, a correspondence between MIG and Java types would have - to be determined. Ideally this would be automated and not hardcoded - too much. - * Initially, reply port parameters would be ignored. However they may be - needed for some applications. - -So the details would need to be fleshed out during the community bonding -period and as the implementation progresses. However I’m confident that a -satisfactory solution can be designed. - -Using these new features, the example above could be rewritten as: - - import org.gnu.hurd.InitPorts; - import org.gnu.hurd.Io; - import org.gnu.hurd.IoUser; - - class Hello { - public static void main(String argv[]) throws ... - { - Io stdout = new IoUser(InitPorts.getdport(1)); - String hello = "Hello, World!\n"; - - int amount = stdout.write(hello.getBytes(), -1); - - /* (A retCode corresponding to an error - would be signalled as an exception.) 
*/ - } - } - -An example of server implementation would be: - - import org.gnu.hurd.Io; - import java.util.Arrays; - - class HelloIo implements Io { - final byte[] contents = "Hello, World!\n".getBytes(); - - public int write(byte[] data, int offset) { - return SOME_ERROR_CODE; - } - - public byte[] read(int offset, int amount) { - /* The upper bound of copyOfRange is exclusive */ - return Arrays.copyOfRange(contents, offset, - offset + amount); - } - - /* ... */ - } - -A new server object could then be created with `new IoServer(new HelloIo())`, -and associated with some receive right at the level of the ports management -library. - -### Base classes for common types of translators - -Once MIG can target Java code, and a libports equivalent is available, -creating new translators in Java would be greatly facilitated. However, -we would probably want to introduce basic implementations of file system -translators in the spirit of libtrivfs or libnetfs. They could take the form -of base classes implementing the relevant MIG-generated interfaces which -would then be derived by users, -or could define a simpler interface -which would then be used by adapter classes -to implement the required ones. - -I would draw inspiration from libtrivfs and libnetfs -to design and implement similar solutions for Java. - -### Deliverables - - * A hurd-java package would contain the Java code developed - in the context of this project. - * The Java code would be documented using javadoc - and a tutorial for writing translators would be written as well. - * Modifications to MIG would be submitted upstream, - or a patched MIG package would be made available. - -The Java libraries resulting from this work, -including any MIG support classes -as well as the class files built from the MIG-generated code -for the Mach and Hurd interface definition files, -would be provided as a single `hurd-java` package for -Debian GNU/Hurd. -This package would be separate from both Hurd and Mach, -so as not to impose unreasonable build dependencies on them. 
- -I expect I would be able to act as its maintainer in the foreseeable future, -either as an individual or as a part of the Hurd team. -Hopefully, -my code would be claimed by the Hurd project as their own, -and consequently the modifications to MIG -(which would at least conceptually depend on the Mach Java package) -could be integrated upstream. - -Since by design, -the Java code would use only a small number of stable interfaces, -it would not be subject to excessive amounts of bitrot. -Consequently, -maintenance would primarily consist in -fixing bugs as they are reported, -and adding new features as they are requested. -A large number of such requests -would mean the package is useful, -so I expect that the overall amount of work -would be correlated with the willingness of more people -to help with maintenance -should I become overwhelmed or get hit by a bus. - - -## Timeline - -The dates listed are deadlines for the associated tasks. - - * *Community bonding period.* - Discuss, refine and complete the design of the Java bindings - (in particular the MIG and "libports" parts) - * *May 23.* - Coding starts. - * *May 30.* - Finish implementing pthread signal semantics. - * *June 5.* - Port OpenJDK - * *June 12.* - Fix the remaining problems with GCJ and/or OpenJDK, - possibly port Eclipse or other big Java packages. - * *June 19.* - Create the bindings for Mach. - * *June 26.* - Work on some kind of basic Java libports - to handle receive rights. - * *July 3.* - Test, write some documentation and examples. - * *July 17 (two weeks).* - Add the Java target to MIG. - * *July 24.* - Test, write some documentation and examples. - * *August 7 (two weeks).* - Implement a modular libfoofs to help with translator development. - Try to write a basic but non-trivial translator - to evaluate the performance and ease of use of the result, - rectify any rough edges this would uncover. - * *August 22. 
(last two weeks)* - Polish the code and packaging, - finish writing the documentation. - - -## Conclusion - -This project is arguably ambitious. -However, I have been thinking about it for some time now -and I'm confident I would be able to accomplish most of it. - -In the event multiple language bindings projects -would be accepted, -some work could probably be done in common. -In particular, -[ArneBab](http://www.bddebian.com/~hurd-web/community/weblogs/ArneBab/2011-04-06-application-pyhurd/) -seems to favor a low-level approach for his Python bindings as I do for Java, -and I would be happy to discuss API design and coordinate MIG changes with him. -I would also have an extra month after the end of the GSoC period -before I go back to school, -which I would be able to use to finish the project -if there is some remaining work. -(Last year's rewrite of procfs was done during this period.) - -As for the project's benefits, -I believe that good support for Java -is a must-have for the Hurd. -Java bindings would also further the Hurd's agenda -of user freedom by extending this freedom to more people: -I expect the set of developers -who would be able to write Java code against a well-written libfoofs -is much larger than -those who master the intricacies of low-level systems C programming. -From a more strategic point of view, -this would also help recruit new contributors -by providing an easier path to learning the inner workings of the Hurd. - -Further developments -which would build on the results of this project -include my planned [[experiment with Joe-E|objcap]] -(which I would possibly take on as a university project next year). -Another possibility would be to reimplement some parts -of the Java standard library -directly in terms of the Hurd interfaces -instead of using the POSIX ones through glibc. 
-This would possibly improve the performance -of some Java applications (though probably not by much), -and would otherwise be a good project -for someone trying to get acquainted with Hurd. - -Overall, I believe this project would be fun, interesting and useful. -I hope that you will share this sentiment -and give me the opportunity to spend another summer working on Hurd. +This page has moved [[here|java]]. diff --git a/user/jkoenig/gsoc2011_proposal/discussion.mdwn b/user/jkoenig/gsoc2011_proposal/discussion.mdwn deleted file mode 100644 index 0131d8d5..00000000 --- a/user/jkoenig/gsoc2011_proposal/discussion.mdwn +++ /dev/null @@ -1,180 +0,0 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] - -[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable -id="license" text="Permission is granted to copy, distribute and/or modify this -document under the terms of the GNU Free Documentation License, Version 1.2 or -any later version published by the Free Software Foundation; with no Invariant -Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license -is included in the section entitled [[GNU Free Documentation -License|/fdl]]."]]"""]] - -Some [[tschwinge]] comments regarding your proposal. Which is very good, if I -may say so again! :-) - -Of course, everyone is invited to contribute here! - -I want to give the following methodology a try, instead of only having -email/IRC discussions -- for the latter are again and again showing a tendency -to be dumped and deposited into their respective archives, and be forgotten -there. Of course, email/IRC discussions have their usefulness too, so we're -not going to replace them totally. For example, for conducting discussions -with a bunch of people (who may not even be following these pages here), email -(or, as applicable, the even more interactive IRC) will still be the medium of -choice. 
(And then, the executive summary should be posted here, or -incorporated into your proposal.) - -Also, if you disagree with this suggested procedure right away, or at some -later point begin to feel that this thing doesn't work out, or simply takes too -much time (I don't think so: writing emails takes time, too), just say so, and -we can reconsider. - -Of course, as this wiki is a passive medium rather than an active one as IRC -and email are, it is fine to send notices like: *I have updated the wiki page, -please have a look*. - -One idea is that your proposal evolves alongside with the ongoing work, and -represents (in more or less detail) what has been done and what will be done. -Also, we can hopefully use parts of it for documentation purposes, or as -recipes for similar work (enabling other programming languages on the Hurd, for -example). - -For this, I suggest the following procedure: as applicable, you can either -address any comments in here (for example, if they're wrong :-), or if they -require further discussion; think: *email discussion*), or you can address them -directly in your propoal and remove the comments from here at the same time -(think: *bug fix*). - -Generally, you can assume that for things I didn't comment on (within some -reasonable timeframe/upon asking me again) that I'm fine with them. Otherwise, -I might say: *I don't like this as is, but I'll need more time to think about -it.* - -There is also a possibility that parts of your proposal will be split off; in -cases where we think they're valuable to follow, but not at this time. (As you -know, your proposal is not really a trivial one, so it may just be too much for -one person's summer.) Such bits could be moved to [[open_issues]] pages, -either new ones or existing ones, as applicable. - - -# POSIX Threads Signal Semantics - - * Great! [[tschwinge]] had a brief look, and should have a deeper one. 
- - * If [[jkoenig]] thinks it's mature enough: should ask Samuel to test this - (that is, only the refactoring patches for starters?) on the buildds. - - * Then: should ask Roland to review. - - * Documentations bits should probably be moved to [[glibc/signal]]. - - -## libthreads (cthreads) Integration - - * [[tschwinge]] suggests to leave them as-is? - - -## [[libpthread]] integration - - * To be done. - - -# Java - - * [[tschwinge]] has to read about RMI and CORBA. - - -# Joe-E - - * For later. - - -# GCJ - - * [[tschwinge]] has the feeling that Java in GCC (that is, GCJ) is mostly - dead? (True?) - - * Thus perhaps not too much effort should be spent with it. - - If the POSIX threads signal semantics makes it going, then great, otherwise - we should get a feeling what else is missing. - - -# OpenJDK - - * All in all, [[tschwinge]] has the feeling that a working OpenJDK will be - more useful/powerful than GCJ. - - * We need to get a feeling how difficult such an OS port will be. - - * [[jkoenig]] suggests OpenJDK 6 -- should we directly go for version 7 - instead? - - * What are the differences (regarding the OS port) between the two - versions? Or this there something even more recent to be worked upon, - for new OS ports? - - * Perhaps the different versions' OS port specific stuff is not at - all very different, so that both v6 and v7 could be done? - - * They seem to have a rather heavy-weight process for such projects: confer - <http://mail.openjdk.java.net/pipermail/announce/2011-January/000092.html>, - for example. Do we need this, too? - - -# Eclipse - -OK for testing -- but I'd very much hope that it *just works* as soon as we -provide the required Java platform. - - -# Java Bindings - - -## Design Principles - - * Generally ack. - - -### MIG - - * Hacking [[microkernel/mach/MIG]] shouldn't be too difficult. - - * (Unless you want to make MIG's own code (that is, not the generated - code, but MIG itself) look a bit more nice, too.) 
;-) - - * There are also alternatives to MIG. If there is interest, the following - could be considered: - - * FLICK ([[!GNU_Savannah_task 5723]]). [[tschwinge]] has no idea yet if - there would be any benefits over MIG, like better modularity (for the - backends)? If we feel like it, we could spend a little bit of time on - this. - - * For [[microkernel/Viengoos]], Neal has written a RPC stub generator - entirely in C Preprocessor macros. While this is obviously not - directly applicable, perhaps we can get some ideas from it. - - * Anything else that would be worth having a look at? (What are other - microkernels using?) - - -### `mach_msg` - - * Seems like the right approach to [[tschwinge]], but hasn't digested all the - pecularities yet. Will definitely need more time. - - -# GSoC Site Discussion - - * Discussion items from - <http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/jkoenig/1> - should be copied here: - - * technical bits (obviously); - - * also the *why do we want Java bindings* reasoning; - - * CLISP findings should also be documented somewhere permanently. - - * We should probaby open up a *languages for Hurd* section on the web - pages ([[!taglink open_issue_documentation]]). diff --git a/user/jkoenig/java.mdwn b/user/jkoenig/java.mdwn new file mode 100644 index 00000000..700f9c4e --- /dev/null +++ b/user/jkoenig/java.mdwn @@ -0,0 +1,321 @@ +[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +# Improve Java on Hurd (GSoC 2011) + + +## Description + +The project consists of improving Java support on Hurd. +This includes porting OpenJDK, +creating low-level Java bindings for Mach and Hurd, +as well as creating Java libraries to help with translator development. + +For details, see my original [[proposal]]. + + +## Current status + +Feeling slightly behind schedule; but the project is very ambitious, which has been +known from the beginning, and there is great progress, so there is no problem. +--[[tschwinge]], 2011-06-29. + +[[tschwinge]] will be on vacation in China starting July 26th, will have +Internet access intermittently, but not regularly. We'll have to figure out +some scheme. + + +### Apt repository + +Modified Debian packages are available in this repository: + + deb http://jk.fr.eu.org/debian experimental/ + deb-src http://jk.fr.eu.org/debian experimental/ + + +### Glibc signal code improvements + +2011-06-29: +Patches were submitted to `libc-alpha` +which implement global signal dispositions and `SA_SIGINFO`. +My latest code is available on +[github](http://github.com/jeremie-koenig/glibc/commits/master-beware-rebase), +and modified Debian packages +are available in my apt repository. + +One question is how the new symbols introduced by my patches +should be handled. +Weak symbols turned out to be impractical, +so I'm currently considering using a Debian-specific +symbol version in the interim period (`GLIBC_2.13_DEBIAN_8` so far). +The ultimate symbol version to be used will depend on +the time at which the patches get integrated upstream +(most likely `GLIBC_2.15`), +at which point we will alias the interim version +to the new one in Debian packages. 
+ +I have modified libc0.3 to include a `deb-symbols(5)` file +(alternatively see <http://wiki.debian.org/Projects/ImprovedDpkgShlibdeps>) +so that we get an accurate libc dependency in `hurd` and other packages +when the symbols in question are pulled in. + +[[hurd/libthreads]] (cthreads library) will not be changed. There's no reason +why its behavior should change, whereas for [[libpthread]] it's needed for +conformance. Patches posted on 2011-05-25, but there's a more recent one in +the modified hurd package (adds `_hurd_sigstate_delete` and removes the weak +symbols). + +Another issue which came up with OpenJDK is the expansion +by the dynamic linker of `$ORIGIN` in the `RPATH` header, +see below. + +#### Plans + +The patches are pending review and inclusion upstream. +As soon as we reach an agreement wrt. the new interfaces +(in particular wrt. the value of `SA_SIGINFO`), +the patches will be applied to the Debian libc packages +for broader testing. + + +##### Open Items + + * Test patches: in progress, [[jkoenig]], Svante. More volunteers welcome, + of course. + + > There's an issue with gdb, + > namely signals lose their "untracedness" when they go + > through the global sigstate's pending mask, + > so gdb spins intercepting a signal and trying to deliver it. + > [Patch](http://github.com/jeremie-koenig/glibc/commit/3ecb990e9d08d5f75adc40b738b35a1802cc0943). + + * If [[jkoenig]] thinks it's mature enough: should ask + [[Samuel|samuelthibault]] to test these patches on the buildds. + + > There's a risk that a dependency on my patched libc + > might be pulled in while building packages + > (in particular hurd) + > --[[jkoenig]] 2011-06-22 + + * Waiting on ABI finalization ([!] Roland). + + * Which numeric values to use for `SA_SIGINFO` (and `SA_NOCLDWAIT`)? + + > Staying in sync with BSD seems the most logical approach, + > so I have defined it to 0x40. --[[jkoenig]] 2011-06-29 + + * Get patches reviewed (Roland?), and integrated into official sources: [!] 
+ [[tschwinge]]. + + > [[samuelthibault]] reviewed the patches and pointed out a couple of + > issues which I'm currently working on: + > + > * Slight behaviour change with respect to forgetting blocked ignored + > signals. POSIX is flexible in this regard but I guess we could retain + > them instead of the current behaviour. + > * Sigstate accessors could be made extern inline functions. + > I suggest we postpone this. + > * Incorrect changes for `msg_{get,set}_init_int(INIT_SIGMASK)` + > * Some comments which can be improved. + > + > Once these are fixed we can probably test the patches in Debian. + > + > --[[jk]] 2011-07-06 + + * Documentations bits (from here, the initial [[proposal]], and elsewhere) + should probably be + moved either into the appropriate glibc or Hurd documentation + files/reference manuals, or to [[glibc/signal]]. + + * `SA_SIGINFO` patch is based on [[Samuel|samuelthibault]]'s earlier work. + Thus, have him review the new patch? + + * `SA_SIGINFO` patch has a few TODOs w.r.t. protocol changes for missing + information, and for FPU state. Providing even incomplete information is + an improvement on the current status. The question is, whether + applications rely on this information in any hard way if `SA_SIGINFO` is + available? + + * We could possibly rename certain fields in `struct siginfo`, say + `si_pid_not_implemented`, to ensure compilation failures for programs + which use them. Or perhaps a linker warning is possible. + + * The FPU state is not included in the `ucontext_t` passed to the signal + handler. On the other hand, `ucontext_t` is actually being somewhat + deprecated: the functions to restore it are no longer in POSIX. + `thread_get_state`() should return this information, in case we decide + to fill the gap, and there might be existing glibc wrappers, too. + + * Perhaps have a look at `SA_NOCLDWAIT`. + + +### Port OpenJDK + +As suggested by [[tschwinge]], I have targeted OpenJDK 7 at first. 
+I don't expect it will be too hard to backport my patches to OpenJDK 6. +I have succeeded in building a working JIT-less ("zero") version, +although the dynamic linker issue must be worked around. +Porting Hotspot (the original just-in-time compiler of OpenJDK) +should not be too hard. +If that fails we can fall back on Shark +(a portable alternative JIT which uses LLVM). + +Complexity of porting HotSpot: probably low. The complex things should be +arch- rather than OS-specific. Not many Linux-specific interfaces used. +Garbage collection/memory management, etc. and/or most of the other Linux-specific +interfaces are already dealt with for the zero build. + +The dynamic linker issue is as follows. +An executable-specific search path can be provided in the ELF RPATH header. +RPATH directories can include the special string `$ORIGIN`, +which is to be expanded to the directory the executable was loaded from. +OpenJDK's `java` command uses this feature to locate +the right `libjli.so` at runtime. +However, +on Hurd this information is not available to the dynamic linker +and as a consequence RPATH components which include `$ORIGIN` +are silently discarded. + +This can be worked around by defining +the `LD_ORIGIN_PATH` environment variable +(which I have used to build and test OpenJDK so far). + +#### Plans + +I intend to fix the RPATH issue +by building on [[pochu]]'s `file_exec_file_name()` +[patches](http://lists.gnu.org/archive/html/bug-hurd/2010-08/msg00023.html). + +I have succeeded in building a Hotspot-enabled `libjvm.so`, +although the current toolchain issues +([[toolchain/ELFOSABI_GNU]]; 2011-07-03: fix committed in binutils) +have so far prevented me from testing it. + +> It turns out the build fails later on in `hotspot/agent` +> because Hurd lacks a `libthread_db.so`. +> Also, a Shark version builds, but the result does not work so far. 
+> +> In other news, Damien Raude-Morvan is +> [working on a kFreeBSD version](http://lists.debian.org/debian-java/2011/06/msg00124.html), +> so I intend to merge my current patches with his. +> +> --[[jkoenig]] 2011-06-29 + +##### Upstream Submission + +On 2011-07-15, *gnu_andrew* talked to us in the #hurd channel (freenode IRC), +who is a maintainer of IcedTea. He's supportive of the porting approach, and +is willing to review and integrate small patches for individual issues (rather +than some huge patchset). Send patches to <distro-pkg-dev@openjdk.java.net>. + +##### Open Items + + * [!] [[tschwinge]] to have a look at [[pochu]]'s `file_exec_file_name()` + patches, whether it's generally the right idea. + + * Assuming it is, continue with getting `$ORIGIN` working. + + * `libthread_db.so` issue. Likely, the Serviceability Agent is used by jdb + and the like only, so for now the goal should be to lose some functionality + by removing/avoiding this dependency. + + * [[java-access-bridge]] (not critical; JVM appears to work without) + + * They seem to have a rather heavy-weight process for such projects: confer + <http://mail.openjdk.java.net/pipermail/announce/2011-January/000092.html>, + for example. Do we need this, too? + + > Probably not. + > My current approach (and Damien's wrt. the kFreeBSD patches) + > is to add preprocessor directives in the Linux code + > to make it more portable. + > --[[jkoenig]] 2011-06-29 + + * Eclipse + + OK for testing -- but I'd very much hope that it *just works* as soon as we + provide the required Java platform. But it may perhaps have some + Linux-specifics (needlessly?) in its basement. Is it available for Debian + GNU/kFreeBSD already? + + +### Java bindings for Mach + +The code is at <http://github.com/jeremie-koenig/hurd-java>. + +[[tschwinge]]'s notes for building with... 
+ + * GCJ installed (due to the current Debian multilib confusion): + + $ tmp1=/usr/lib/gcc/i486-gnu/4.6 tmp2=/usr/lib/i386-gnu/gcc/i486-gnu/4.6 LIBRARY_PATH=$tmp2 COMPILER_PATH=$tmp1:$tmp2 C_INCLUDE_PATH=$tmp1/include make + + * OpenJDK installed (to have it find the shared library, and the jni.h header + file): + + $ jdk=/usr/lib/jvm/java-7-openjdk LD_LIBRARY_PATH=$jdk/jre/lib/i386/jli C_INCLUDE_PATH=$jdk/include make + +Doxygen-generated documentation is available at +<http://jk.fr.eu.org/hurd-java/doc/html/>; or run `make doc` yourself. + + +#### Plans + +(just started.) + + +##### Open Items + + * [[tschwinge]] has to read about RMI and CORBA. + + * MIG + + * Hacking [[microkernel/mach/MIG]] shouldn't be too difficult. + + * (Unless you want to make MIG's own code (that is, not the generated + code, but MIG itself) look a bit more nice, too.) ;-) + + * There are also alternatives to MIG. If there is interest, the following + could be considered: + + * FLICK ([[!GNU_Savannah_task 5723]]). [[tschwinge]] has no idea yet if + there would be any benefits over MIG, like better modularity (for the + backends)? If we feel like it, we could spend a little bit of time on + this. + + * For [[microkernel/Viengoos]], Neal has written a RPC stub generator + entirely in C Preprocessor macros. While this is obviously not + directly applicable, perhaps we can get some ideas from it. + + * Anything else that would be worth having a look at? (What are other + microkernels using?) + + * `mach_msg` + + * Seems like the right approach to [[tschwinge]], but he hasn't digested + all the pecularities yet. Will definitely need more time. + + +## Postponed + +Might get back to these as time/interest permits. + + +### GCJ + + * [[tschwinge]] has the feeling that Java in GCC (that is, GCJ) is mostly + dead? (True?) + + * Thus perhaps not too much effort should be spent with it. 
+ + If the POSIX threads signal semantics makes it going, then great, otherwise + we should get a feeling what else is missing. + + +### Joe-E. diff --git a/user/jkoenig/java/discussion.mdwn b/user/jkoenig/java/discussion.mdwn new file mode 100644 index 00000000..f16d7678 --- /dev/null +++ b/user/jkoenig/java/discussion.mdwn @@ -0,0 +1,526 @@ +[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!toc]] + + +# General + +Some [[tschwinge]] comments regarding your proposal. Which is very good, if I +may say so again! :-) + +Of course, everyone is invited to contribute here! + +I want to give the following methodology a try, instead of only having +email/IRC discussions -- for the latter are again and again showing a tendency +to be dumped and deposited into their respective archives, and be forgotten +there. Of course, email/IRC discussions have their usefulness too, so we're +not going to replace them totally. For example, for conducting discussions +with a bunch of people (who may not even be following these pages here), email +(or, as applicable, the even more interactive IRC) will still be the medium of +choice. (And then, the executive summary should be posted here, or +incorporated into your proposal.) + +Also, if you disagree with this suggested procedure right away, or at some +later point begin to feel that this thing doesn't work out, or simply takes too +much time (I don't think so: writing emails takes time, too), just say so, and +we can reconsider. 
+ +Of course, as this wiki is a passive medium rather than an active one as IRC +and email are, it is fine to send notices like: *I have updated the wiki page, +please have a look*. + +One idea is that your proposal evolves alongside the ongoing work, and +represents (in more or less detail) what has been done and what will be done. +Also, we can hopefully use parts of it for documentation purposes, or as +recipes for similar work (enabling other programming languages on the Hurd, for +example). + +For this, I suggest the following procedure: as applicable, you can either +address any comments in here (for example, if they're wrong :-), or if they +require further discussion; think: *email discussion*), or you can address them +directly in your proposal and remove the comments from here at the same time +(think: *bug fix*). + +Generally, you can assume that for things I didn't comment on (within some +reasonable timeframe/upon asking me again) that I'm fine with them. Otherwise, +I might say: *I don't like this as is, but I'll need more time to think about +it.* + +There is also a possibility that parts of your proposal will be split off; in +cases where we think they're valuable to follow, but not at this time. (As you +know, your proposal is not really a trivial one, so it may just be too much for +one person's summer.) Such bits could be moved to [[open_issues]] pages, +either new ones or existing ones, as applicable. + + +# GSoC Site Discussion + + * Discussion items from + <http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/jkoenig/1> + should be copied here: + + * technical bits (obviously); + + * also the *why do we want Java bindings* reasoning; + + * CLISP findings should also be documented somewhere permanently. + + * We should probably open up a *languages for Hurd* section on the web + pages ([[!taglink open_issue_documentation]]). 
+ + +# IRC, freenode, #hurd, 2011-07-13 + +[[!tag open_issue_documentation]] + + <jkoenig> Yes, I guess so. Maybe start investigating mig because it may + have repercussions on what the best approach would be for some aspects of + the Mach bindings. + <tschwinge> I still think that making MIG emit Java code is not too + difficult, once you have the required Java infrastructure (like what + you're writing at the moment). + <tschwinge> On the other hand, if there's another approach that you'd like + to use, I'm not trying to force using MIG. + <braunr> i still have a problem understanding your approach + <braunr> at which level are your bindings located ? + <jkoenig> I expect mig it will be the easiest route, but of course possibly + it won't. + <tschwinge> jkoenig: Yeah, be give some high-level to low-level overview? + <jkoenig> ok, so + <jkoenig> at the very core, low-level, we have a very thin amount of JNI + code to access (proper) system calls. + <jkoenig> by "proper" I mean things like mach_task_self, mach_msg and + mach_reply_port, which are actually system calls rather than RPCs to the + kernel. + <braunr> right + <jkoenig> at this level, we manipulate port names as integers, and the + message buffers for mach_msg are raw ByteBuffers (from the java.nio + package) + <jkoenig> actually, so-called /direct/ ByteBuffers, which are backed by + memory allocated outside of the Java heap, rather than as a byte[] array + <jkoenig> we can retreive the pointer from the JNI code and use the buffer + directly. + <jkoenig> (so, good for performance and it's also portable.) + <braunr> ok + <braunr> i'm more interested in the higher level bindings :) + <jkoenig> ok so, higher up. 
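As an illustration of the low-level layer described above, here is a minimal sketch of a direct `ByteBuffer` used as a raw message buffer. It relies only on the standard `java.nio` API. The class name `DirectBufferSketch`, the helper `newMessageBuffer`, and the value written are hypothetical and are not taken from the hurd-java code. Little-endian byte order is an assumption that matches the 32-bit little-endian layout mentioned later in this log. On the native side, JNI code can obtain the address of such a buffer with `GetDirectBufferAddress`.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class DirectBufferSketch {
    /**
     * Allocates a buffer backed by memory outside the Java heap.
     * Hypothetical helper: the size and byte order are illustrative choices.
     */
    static ByteBuffer newMessageBuffer(int size) {
        ByteBuffer buf = ByteBuffer.allocateDirect(size);
        buf.order(ByteOrder.LITTLE_ENDIAN); // assumption: little-endian target
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = newMessageBuffer(64);
        buf.putInt(42);   // stand-in for an integer port name
        buf.flip();       // switch from writing to reading
        System.out.println("direct: " + buf.isDirect());
        System.out.println("value:  " + buf.getInt());
    }
}
```

Because the buffer is direct, native code can read and write the same memory without copying, which is the performance argument made in the discussion.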
+ <jkoenig> design goal from my proposal: "the memory safety of Java should + be maintained and extended to Mach primitives such as port names and + out-of-line memory regions" + <jkoenig> so integer port names are not "safe" in the sense that they can + be forged and misused in all kinds of way + <jkoenig> which is why I have a layer of Java code whose job is to wrap + this kind of low-level Mach stuff into safe abstractions + <jkoenig> and ideally the user should only use these safe abstractions. + <tschwinge> (Not to restrict the programmer, but to help him write correct + code.) + <jkoenig> right. + <braunr> so you can't use mach RPCs directly + <jkoenig> tschwinge, also to actually restrict them, in a Joe-E / + object-capability context, but that's not the primary concern right now + ;-) + <braunr> or you force your wrappers to have these abstractions as input + <jkoenig> braunr, well, actually at this level you still have Mach RPC + <jkoenig> but for instance, port names are encapsulated into "MachPort" + objects which ensure they are handled correcly + <tschwinge> As I understand it, you use these abstractions to prepare a + usual mach_msg message, and then invoke mach_msg. + <braunr> ok + <jkoenig> and message buffers are wrapped into "MachMsg" objects which both + help you write the messages into the ByteBuffer and prevent you from + doing funky stuff + <jkoenig> and ensure the ports which you send/receive/pseudo-receive after + an error/... are deallocated as required, etc. + <braunr> what's the interface to use IPC ? + <tschwinge> Is MIG doing that, too, I think? (And antrik once found some + error there, which is still to be reviewed...) + <jkoenig> braunr, so basically as a user you would be free to use either + one of these layers, or to use MIG-generated classes which would + construct and exchange messages for you using the second (safe) layer. 
+ <braunr> ok, let's just finish with the low level layer before going + further please + <jkoenig> tschwinge, MIG does some type checking on the received message + and saves you the trouble of constructing/parsing them yourself, but I'm + not sure about how mach_msg errors are handled + <braunr> what are the main methods of MachMsg for example ? + <jkoenig> braunr, you may want to have a look at + http://jk.fr.eu.org/hurd-java/doc/html/classorg_1_1gnu_1_1mach_1_1MachMsg.html + <braunr> right, sorry + <braunr> grabbed the code at work and forgot here + <jkoenig> and also + https://github.com/jeremie-koenig/hurd-java/blob/master/HelloMach.java + which uses it + <jkoenig> but roughly, you'd use setRemotePort, setLocalPort, setId to + write your message's header + <jkoenig> then use one of the putFoo() methods to add data items to the + message + <braunr> ok, the mapping with the low level C interface is very clear + <braunr> that's good for me + <jkoenig> the putFoo() methods would write the appropriate type + descriptors, then the actual data. + <braunr> we can go on with the MiG part if you want :) + <jkoenig> right, + <jkoenig> so here you may want to look at the UML class diagram from + http://www.bddebian.com/~hurd-web/user/jkoenig/java/proposal/ + <jkoenig> so in the C case, mig generates 3 files + <jkoenig> a header file which has the prototypes of the mig-generated + stubs, + <jkoenig> a *User.c which has their actual implementation + <jkoenig> and a *Server.c which handles demultiplexing the incoming + messages and helps with implementing servers. + <jkoenig> so we would do something along these lines, more or less: + <jkoenig> mig would generate the code for a Java interface in lieu of the + *.h file. 
+ <jkoenig> a generated FooUser class would implement this interface by doing + RPC + <jkoenig> (so basically you would pass a MachPort object to the + constructor, and then you could use the resulting object to do RPC with + whatever is on the other end) + <jkoenig> and the generated FooServer class would do the opposite, + <braunr> ok + <braunr> issues with threads ? + <jkoenig> you would pass an object implementing the Foo interface to the + constructor, + <braunr> i'm guessing the demux part may have to create threads, right ? + <jkoenig> and the resulting object would handle messages by using the + object you passed. + <jkoenig> braunr, right, so that would be more a libports kind of code, + <braunr> the libports-like library, i see + <jkoenig> to which you could pass Server objects (for instance the + FooServer above), and it would handle incoming messages. + <braunr> how is message content mapped to a java interface ? + <jkoenig> this would be determined from the .defs files and MIG would + generate the appropriate code, hopefully. + <braunr> so the demux part would handle rpc integer identifiers ? + <jkoenig> right. + <braunr> but hm + <jkoenig> also mapping .defs files to Java interfaces might prove to be + tricky. data types conversion and all + <antrik> tschwinge: my mamory is rather hazy. IIRC the issue was that the + MIG-generated stubs deallocate out-of-line port arrays after the + implementation returns, before returning to the dispatcher + <braunr> i'll just overlook this specific implementation detail + <jkoenig> but we could use some annotation-based system if we need to + provide more information to generate the java code. 
+ <antrik> but the Hurd (or rather glibc) RPC handling also automatically + deallocates everything if an error occurs + <antrik> so I changed the MIG code to deallocate only when no error occurs + <braunr> jkoenig: ok, we'll talk about that when there is more progress and + you have a better view of the problem + <antrik> at that time I was pretty sure that this is a correctly working + solution, but it always seemed questionable conceptually... however, I + wasn't able to come up with a better one, and nobody else commented on it + <braunr> antrik: shouldn't the hurd be changed not to deallocate something + it didn't allocate in the first place ? + <antrik> braunr: no, the server has to deallocate stuff before returning to + the client. the request message is destroyed before returning the reply. + <tschwinge> jkoenig, braunr: That's what I had in mind where MIG might be a + bit awkward. Then we can indeed either add annotations to the .defs + files, or reproduce them in some other format. That's some work, but + it's mostly a one-time work. + <tschwinge> After all, the RPC interface is a binary one, and there may be + more than one API for creating these messages, etc. + <antrik> jkoenig: actually, at least in the Hurd, server-side and + client-side headers are separate -- so MIG actually creates four files + <jkoenig> tschwinge, wrt to annotations I was more thinking about Java + ones, such as: @MIGDefsFile("mach/task.defs") @MIGCType("task_t") public + interface Task { } + <jkoenig> antrik, oh, ok, it makes sense. + <braunr> jkoenig: anything else ? + <jkoenig> braunr, nothing that I can think of + <braunr> ok + <antrik> tschwinge: I think it would be a *very* bad idea to introduce + redundancy regarding RPC definitions + <braunr> thanks for the tour :) + <antrik> (the _request.defs/_reply.defs mess is bad enough...) + <jkoenig> did I speak about the "Unsafe" pseudo-exception? that's + interesting :-) + <tschwinge> jkoenig: Also, virtual memory abstractions? 
+ <braunr> jkoenig: you didn't + <tschwinge> antrik: Well, then we could create some other super-format. + But that's just a detail IMO. + <jkoenig> ok, so wrt virtual memory, a page we received can be wrapped with + some JNI help into a (direct) ByteBuffer object. + <jkoenig> deallocating sent pages will be tricky, though. + <tschwinge> antrik: To put it this way: for me the .defs files are just one + way of expressing the RPC interfaces' contracts. (At the same time, they + happen to be the actual reference for these, too. But the specification + itself could just as well be a textual one.) + <jkoenig> on approach I've been thinking about would be to "wrap" the + ByteBuffer object into an object which has the sole reference to it, so + that when it's deallocated the reference can be replaced with "null", and + further attempts to access the buffer would throw exceptions. + <braunr> sounds reasonable + <jkoenig> but that's still in flux in my head, we may end up needing our + own implementation of ByteBuffer-like objects. + <tschwinge> The problem being that there is no mechanism to ``revoke'' an + object once a reference to it has been shared. + <jkoenig> right. + <tschwinge> A wrapper is one possibility indeed. + <antrik> tschwinge: they are called interface *definitions* for a reason + :-) + <tschwinge> This is a very similar problem as with capabilities when there + is no revoke operation for these, too. + <tschwinge> antrik: Yes, because they define MIG's input. :-P + <tschwinge> Isn't that what is called a membrane in the capability world? + <antrik> I do not say that we have to consider the format of the .defs to + be set in stone; but I do insist on using a canonical machine-parsable + source for all language bindings + <tschwinge> attenuation + <jkoenig> tschwinge, you mean the revokable proxy contruct ? 
(It's the same + principle indeed) + <tschwinge> A common design pattern in object-capability systems: given + one reference of an object, create another reference for a proxy object + with certain security restrictions, such as only permitting read-only + access or allowing revocation. The proxy object performs security checks + on messages that it receives and passes on any that are allowed. Deep + attenuation refers to the case where the same attenuation is applied + transitively to any + <tschwinge> objects obtained via the original attenuated object, + typically by use of a "membrane". + <tschwinge> http://en.wikipedia.org/wiki/Object-capability_model + <tschwinge> Yes. + <tschwinge> Good. I understood something. ;-) + <tschwinge> antrik: OKAY! :-P + <tschwinge> jkoenig: And hopefully the JVM will optimize away all the + additional indirection... :-D + <tschwinge> jkoenig: Is there anything more to say about the VM layer? + <jkoenig> tschwinge, "hopefully", yes :-) + <tschwinge> Like, the data that I'm sharing -- is it untyped, isn't it? + <jkoenig> tschwinge, you mean that within the received/sent pages ? + <tschwinge> Yes. + <tschwinge> But that'S how it is, indeed. + <jkoenig> well actually the type descriptor should indicate what they + contain. + <tschwinge> I cannot trust anything I receive from externally. + <jkoenig> it's most often used for MACH_MSG_TYPE_CHAR items I guess, and it + will be type checked when retreive + <tschwinge> Yeah, and that then just *is* arbitrary data, like a block read + from a disk file. + <jkoenig> you would have something like: ByteBuffer + MachMsg.getBuffer(MachMsg.Type expected), and MachMsg would check the + type descriptor against that which you specified + <tschwinge> Or a packet transmitted over the network. + <tschwinge> OK, yes. + <antrik> jkoenig: in theory ints should be used quite often too. 
the whole + purpose of the type descriptors is to allow byte order swapping when + messages are passed between hosts with different architecture... + <jkoenig> tschwinge, right, except for out-of-line port arrays, which need + to be handled differently obviously. + <antrik> (which is totally irrelevat for our purposes -- especially since + the actual network IPC code doesn't exist anymore ;-) ) + <jkoenig> antrik, oh, interesting + <tschwinge> Yes, that was one original idea. + <jkoenig> actually my litmus test for what the bindings should be, is you + should be able to implement such a proxy in Java :-) + <tschwinge> antrik: And hey, you now have processors that can switch + between different modes during runtime... :-) + <jkoenig> (although arguably that's a little bit ambitious) + <braunr> tschwinge: there should be bits in page tables to indicate the + endianness to use on a page .. :) + <tschwinge> Hehe! + <tschwinge> jkoenig: Don't worry -- you're already known for ambitious + projects. One more can't hurt. + <jkoenig> Also, actually the word size is not something that I've been able + to abstract so far, so I'll be hardcoding little-endian 32 bits for now. + <braunr> why is that ? + <antrik> some of the Hurd RPC break the idea anyways BTW + <jkoenig> the org.vmmagic package (from Jikes RVM and JNode) could help + with that, but GCJ does not support it unfortunately (not sure about + OpenJDK) + <jkoenig> braunr, Java does not allow us to define new unboxed types + <braunr> jkoenig: does it have its own definition of the word size ? + <jkoenig> braunr, nope. + <jkoenig> (although, maybe, and also we could use JNI to query it) + <braunr> even if virtual, i'd expect a machine to have such a defnition + <jkoenig> braunr, maybe it has, but basically in Java nothing depends on + the word size + <jkoenig> 'int' is 32 bits, 'long' is 64 and that's it. + <braunr> oh right, i remember most types are fixed size, right ? + <jkoenig> right. 
+ <braunr> if not all + <jkoenig> now Jikes RVM's "org.vmmagic" provides an interface to defined + new unboxed types which can depend on the actual word size, but Jikes RVM + is its own JVM so obviously they can use and provide whatever extensions + they need :-) + <jkoenig> (but maybe they've implemented them in OpenJDK for bootstrap + purposes, I'm not sure) + <tschwinge> I'm missing this detail: where does the word size come into + play here? + <jkoenig> anyway, I _could_ indiscriminately use 'long' for port names, and + sparkle the code with word size tests but that would be very clumsy + <braunr> jkoenig: port names are actually ints :/ + <jkoenig> tschwinge, the actual format of the message header and type + descriptors, for instance. + <braunr> jkoenig: ok, got your point + <jkoenig> braunr, by 'long' I mean 64-bits integers (which they are on + 64-bits machines I think?) + <braunr> :) + <braunr> jkoenig: port names are as large as the word size + <braunr> but in C at least, they're int, not long + <braunr> it doesn't change many things, but you get lots of warnings if you + try with a long :) + <tschwinge> What is the reason that port names are an + architecture-dependent word size's width, and not simply 32 bit? + <jkoenig> "4 billions of port names should be enough for everyone" :-) + <braunr> tschwinge: an optimization is to use them as pointers in the + kernel + <antrik> tschwinge: the machine's native word size is what it can process + most efficiently, and what should be used for most normal + operations... it makes sense to define stuff as int, except for network + communication + <tschwinge> jkoenig: Well, yeah, but if you want to communicate with a + peer, you have to agree on the maximum number anyway (not for port names, + though, which are local). 
+ <braunr> antrik: int isn't the word size everywhere + <braunr> antrik: the most common type matching the word size is long, at + least on ILP32/LP64 data models + <antrik> braunr: that's just because some idiots assumed int would always + be 32 bits, and consequently when 64 architectures came up the compiler + guys chickened out ;-) + <braunr> without int, you wouldn't have a 32 bits type + <antrik> that's not true for all architectures and/or operating systems + though AFAIK + <braunr> or a 16 bits one + <braunr> antrik: windows guys got even more scared, so windows 64 is LLP64 + <antrik> BTW, I haven't checked, but it's quite possible that 32 bit + numbers are actually preferable even on AMD64... + <tschwinge> jkoenig: So, back on track. :-) + <tschwinge> jkoenig: You didn't find anything yet in Mach's VM interfaces + as well a MemoryObject, etc., that can't be used/implemented in the Java + world? + <braunr> antrik: they consume less memory, but don't have much effect on + performance + <jkoenig> tschwinge, once we have the basic system calls and the + corresponding abstractions in place, I don't think anything else + fundamentally problematic could possibly show up + <antrik> braunr: if you really *need* a type of a certain bit size, you + should use stdint types. so not having a 16 or 32 bit type in the + short/int/long canon is *not* an excuse + <tschwinge> jkoenig: That speaks for the Mach designers! + <braunr> antrik: right + <jkoenig> tschwinge, on trick is that for instance, mach_task_self would + still be unsafe even if it returned a nicely wrapped Task object, because + you could still wreck your own address space and threads with it. So we + would need the "attenuation" pattern mentionned above to provide a safe + one. 
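The attenuation pattern referred to in this part of the log can be sketched as follows. This is only an illustration under stated assumptions: the `Task` interface, its two methods, and the `RealTask` and `AttenuatedTask` classes are all hypothetical and do not reflect the actual hurd-java classes. The proxy holds the sole reference to the powerful object, forwards only the harmless call, and can be revoked by dropping that reference.

```java
public class AttenuationSketch {
    /** Hypothetical full-power interface, invented for this example. */
    interface Task {
        String info();
        void terminate();
    }

    /** Hypothetical powerful object that records whether it was terminated. */
    static final class RealTask implements Task {
        boolean terminated = false;
        public String info() { return "task"; }
        public void terminate() { terminated = true; }
    }

    /** Proxy that forwards only harmless calls and supports revocation. */
    static final class AttenuatedTask implements Task {
        private Task target; // sole reference to the wrapped object

        AttenuatedTask(Task target) { this.target = target; }

        private Task target() {
            if (target == null) throw new IllegalStateException("revoked");
            return target;
        }

        public String info() { return target().info(); }

        /** Dangerous operation: refused instead of forwarded. */
        public void terminate() { throw new SecurityException("not permitted"); }

        /** After this call every forwarded method fails. Not thread-safe. */
        void revoke() { target = null; }
    }

    public static void main(String[] args) {
        AttenuatedTask safe = new AttenuatedTask(new RealTask());
        System.out.println(safe.info());
        safe.revoke();
        try {
            safe.info();
        } catch (IllegalStateException e) {
            System.out.println("access after revocation refused");
        }
    }
}
```

In a real design the `revoke` method would be reachable only by the code that created the proxy. It is left package-visible here to keep the sketch short.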
+ <jkoenig> (which would disallow thinks such as the port/thread/vm calls) + <braunr> jkoenig: you mentioned the unsafe pseudo exception earlier + <jkoenig> braunr, right, so the issue is with distinguishing safe from + unsafe methods + <antrik> braunr: BTW, the Windows guys actually broke a lot of stuff by + fixing long at 32 bits -- this way long doesn't match size_t and pointer + types anymore, which was an assumption that was true for pretty much any + system so far... + <tschwinge> jkoenig: Yes. (And again hope for the JVM to optim...) + <braunr> antrik: that's right :) + <braunr> antrik: that's LLP64 + <braunr> antrik: long long and pointers + <jkoenig> braunr, so basically the idea is that unsafe methods are declared + as "throws Unsafe" + <jkoenig> the effect is that if you use such a method you must either + "throw Unsafe" yourself, + <jkoenig> or if you're building a safe abstraction on top of Unsafe + methods, you'll "catch" the "exception" in question to tell the compiler + that it's okay. + <jkoenig> it's more or less inspired from the "semantic regimes" idea from + the org.vmmagic paper which is referenced in my original proposal, + <jkoenig> only implementing by hijacking the exception checking machinery, + which has a behaviour similar to what we want. + <braunr> ok + <braunr> but hmm this seems pretty normal, what's the tricky part ? :) + <tschwinge> braunr: The idea is that the programmer explicitly has to + acknowledge if he'S using an unsafe interface. 
+ <braunr> tschwinge: sounds pretty normal too + <jkoenig> braunr, the trick is that you would not usually declare + exceptions which are never actually thrown (and actually since the + compiler does not know it's never thrown, I need to work around it in a + few places) + <braunr> oh, ok + <braunr> jkoenig: that's interesting indeed + <jkoenig> braunr, the org.vmmagic paper provides an example which uses some + annotations called @UncheckedMemoryAccess and @AssertSafe to the same + effect (which is kind of cleaner), but it would be a headache to + implement without help from the compiler I think (as far as I can tell + the annotation processor would have to inspect the bytecode) + <braunr> but hm + <braunr> what's the true problem about this ? + <jkoenig> (the paper advocates "high-level low-level programming" and is a + very interesting read I think, + http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.151.5253&rep=rep1&type=pdf, + for what it's worth) + <braunr> what's wrong if you just declare your methods unsafe and don't + alter anything else ? + <tschwinge> Yes, I read it and it is interesting. Unfortunately, it seems + I forgot most of it again... + <jkoenig> braunr, declare? alter? + <jkoenig> you mean just tag them with an annotation? + <braunr> just stating a method "throws Unsafe" + <jkoenig> braunr, well some compiler will output a warning because they can + tell there's no way the method is going to throw such an exception. + <jkoenig> and then some other compiler will complain that my + @SuppressWarnings("unused") does not serve any purpose to them :-) + <jkoenig> also, when initializing final fields, I need to work around the + fact that the compiler thinks "Unsafe" might be thrown. + <jkoenig> see for instance MachPort.DEAD + <braunr> jkoenig: ok + <jkoenig> braunr, but I'm more than willing to accept this in exchange for + a clear, compiler-enforced materialization of the border between safe an + unsafe code. 
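The `throws Unsafe` idea discussed above can be sketched in a few lines. This is a minimal illustration, not the hurd-java implementation: the class `UnsafeSketch` and the methods `rawPortName` and `isNullName` are hypothetical, and only the name `Unsafe` comes from the discussion. The checked exception is never thrown. It exists so that the compiler forces every caller either to propagate `throws Unsafe` or to catch it explicitly, which marks the border between unsafe and safe code.

```java
public class UnsafeSketch {
    /** Checked marker exception; the private constructor keeps it unthrowable elsewhere. */
    static final class Unsafe extends Exception {
        private Unsafe() {}
    }

    /**
     * Hypothetical unsafe primitive: it exposes a raw integer port name.
     * It never throws, but callers must still acknowledge "Unsafe".
     */
    static int rawPortName(int name) throws Unsafe {
        return name;
    }

    /**
     * Safe abstraction built on the unsafe primitive. Catching Unsafe tells
     * the compiler that this method takes responsibility for the unsafe call.
     */
    static boolean isNullName(int name) {
        try {
            return rawPortName(name) == 0;
        } catch (Unsafe e) {
            throw new AssertionError("Unsafe is never actually thrown", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isNullName(0));
        System.out.println(isNullName(5));
    }
}
```

A caller that uses `rawPortName` without a `try` block or a `throws Unsafe` clause fails to compile, which is the compiler-enforced acknowledgement the discussion is after.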
+ <jkoenig> actually another question I have is the amount of static typing I + should add to the safe version, for instance should I subclass MachPort + into MachSendRight, MachReceiveRight and so on. I don't want to depart + from the C inteface too much but it could be useful. + <braunr> jkoenig: can't answer that :) + <braunr> jkoenig: keep them in mind for later i think + <tschwinge> jkoenig: What's the safety concern w.r.t. having MachPort (not) + final? + <jkoenig> tschwinge, actually I'm partly wrong in that we only need name() + and a couple other methods to be final + <tschwinge> jkoenig: That's what I was thinking. :-) + <tschwinge> I though I'm missing something here. + <jkoenig> tschwinge, the idea is that the user (ie., the adversary :-) + could extend MachPort and inject their own fake port name into messages + <jkoenig> by overriding name() or clear() + <tschwinge> Yeah, but if these are final, that's not possible. + <jkoenig> right. + <tschwinge> And that *should* be enough, I think. + <tschwinge> Unless I'm missing something. + <jkoenig> I don't think so. Also I hope it is, because as mentionned above + there might be some value in subclassing MachPort. + <tschwinge> Yep. + <jkoenig> incidentally, declaring the class or the method final will allow + the JVM to inline them I think. + <tschwinge> It will help the JVM, yes. It can also figure that out without + final, though. (And may have to de-optimize the code again in case there + are additional classes loaded during run-time.) + <tschwinge> jkoenig: The reference counting in MachPort. I think I'm + beginning to understand this. + <jkoenig> oh ok + <jkoenig> tschwinge, yes the javadoc is maybe a bit obscure so far. + <jkoenig> but basically you don't want the port name you acquire to become + invalid before you're done using it. + <tschwinge> But how is this different from the C world? 
+ <jkoenig> here my goal is to provide some guarantees if you use only safe + methods + <jkoenig> like, you can't forge a port name and things like that + <jkoenig> so basically it should never be possible to include an invalid + port name in a message if you use only safe methods. + <tschwinge> Ah, I see! + <tschwinge> Now that does make sense. + <jkoenig> but the mechanism in itself is similar to the Hurd port cells and + user_link structures + <tschwinge> It's again ``only'' helping the programmer. + <jkoenig> right, no object-capability ulterior motives :-) + <jkoenig> another assumption which the javadoc does not state yet it that + basically there should be exactly one MachPort object for each mach-level + port name reference (in the sense of mach_port_mod_refs) + <tschwinge> Yes, I figured out that bit. diff --git a/user/jkoenig/java/java-access-bridge.mdwn b/user/jkoenig/java/java-access-bridge.mdwn new file mode 100644 index 00000000..6f860709 --- /dev/null +++ b/user/jkoenig/java/java-access-bridge.mdwn @@ -0,0 +1,78 @@ +[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] + +[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable +id="license" text="Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, Version 1.2 or +any later version published by the Free Software Foundation; with no Invariant +Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license +is included in the section entitled [[GNU Free Documentation +License|/fdl]]."]]"""]] + +[[!tag open_issue_porting]] + +Debian's *openjdk-7-jre* package depends on *libaccess-bridge-java-jni* (source +package: *java-access-bridge*). 
+ +The latter one has *openjdk-6-jdk* as a build dependency, but that can be +hacked around: + + # ln -s java-7-openjdk /usr/lib/jvm/java-6-openjdk + +Trying to build it: + + $ LD_LIBRARY_PATH=/usr/lib/jvm/java-7-openjdk/jre/lib/i386/jli dpkg-buildpackage -b -uc -d + [...] + make[3]: Entering directory `/media/erich/home/thomas/tmp/libaccess-bridge-java-jni/java-access-bridge-1.26.2/idlgen' + /usr/lib/jvm/java-6-openjdk/bin/idlj \ + -pkgPrefix Bonobo org.GNOME \ + -pkgPrefix Accessibility org.GNOME \ + -emitAll -i /usr/share/idl/bonobo-activation-2.0 -i /usr/share/idl/at-spi-1.0 -i /usr/share/idl/bonobo-2.0 \ + -fallTie /usr/share/idl/at-spi-1.0/Accessibility.idl + /usr/share/idl/at-spi-1.0/Accessibility_Collection.idl (line 66): WARNING: Identifier `object' collides with a keyword; use an escaped identifier to ensure future compatibility. + boolean isAncestorOf (in Accessible object); + ^ + /usr/share/idl/at-spi-1.0/Accessibility_Component.idl (line 83): WARNING: Identifier `Component' collides with a keyword; use an escaped identifier to ensure future compatibility. 
+ interface Component : Bonobo::Unknown { + ^ + Exception in thread "main" java.lang.AssertionError: Platform not recognized + at sun.nio.fs.DefaultFileSystemProvider.create(DefaultFileSystemProvider.java:71) + at java.nio.file.FileSystems$DefaultFileSystemHolder.getDefaultProvider(FileSystems.java:108) + at java.nio.file.FileSystems$DefaultFileSystemHolder.access$000(FileSystems.java:89) + at java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(FileSystems.java:98) + at java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(FileSystems.java:96) + at java.security.AccessController.doPrivileged(Native Method) + at java.nio.file.FileSystems$DefaultFileSystemHolder.defaultFileSystem(FileSystems.java:95) + at java.nio.file.FileSystems$DefaultFileSystemHolder.<clinit>(FileSystems.java:90) + at java.nio.file.FileSystems.getDefault(FileSystems.java:176) + at sun.util.calendar.ZoneInfoFile$1.run(ZoneInfoFile.java:489) + at sun.util.calendar.ZoneInfoFile$1.run(ZoneInfoFile.java:480) + at java.security.AccessController.doPrivileged(Native Method) + at sun.util.calendar.ZoneInfoFile.<clinit>(ZoneInfoFile.java:479) + at sun.util.calendar.ZoneInfo.getTimeZone(ZoneInfo.java:658) + at java.util.TimeZone.getTimeZone(TimeZone.java:559) + at java.util.TimeZone.setDefaultZone(TimeZone.java:656) + at java.util.TimeZone.getDefaultRef(TimeZone.java:623) + at java.util.TimeZone.getDefault(TimeZone.java:610) + at java.text.SimpleDateFormat.initializeCalendar(SimpleDateFormat.java:682) + at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:619) + at java.text.DateFormat.get(DateFormat.java:772) + at java.text.DateFormat.getDateTimeInstance(DateFormat.java:547) + at com.sun.tools.corba.se.idl.toJavaPortable.Util.writeProlog(Util.java:1139) + at com.sun.tools.corba.se.idl.toJavaPortable.Skeleton.writeHeading(Skeleton.java:145) + at com.sun.tools.corba.se.idl.toJavaPortable.Skeleton.generate(Skeleton.java:102) + at 
com.sun.tools.corba.se.idl.toJavaPortable.InterfaceGen.generateSkeleton(InterfaceGen.java:159) + at com.sun.tools.corba.se.idl.toJavaPortable.InterfaceGen.generate(InterfaceGen.java:108) + at com.sun.tools.corba.se.idl.InterfaceEntry.generate(InterfaceEntry.java:110) + at com.sun.tools.corba.se.idl.toJavaPortable.ModuleGen.generate(ModuleGen.java:75) + at com.sun.tools.corba.se.idl.ModuleEntry.generate(ModuleEntry.java:83) + at com.sun.tools.corba.se.idl.Compile.generate(Compile.java:324) + at com.sun.tools.corba.se.idl.toJavaPortable.Compile.start(Compile.java:169) + at com.sun.tools.corba.se.idl.toJavaPortable.Compile.main(Compile.java:146) + make[3]: *** [org/GNOME/Accessibility/Accessible.java] Error 1 + make[3]: Leaving directory `/media/erich/home/thomas/tmp/libaccess-bridge-java-jni/java-access-bridge-1.26.2/idlgen' + make[2]: *** [all-recursive] Error 1 + make[2]: Leaving directory `/media/erich/home/thomas/tmp/libaccess-bridge-java-jni/java-access-bridge-1.26.2/idlgen' + make[1]: *** [all-recursive] Error 1 + make[1]: Leaving directory `/media/erich/home/thomas/tmp/libaccess-bridge-java-jni/java-access-bridge-1.26.2' + make: *** [debian/stamp-makefile-build] Error 2 + dpkg-buildpackage: error: debian/rules build gave error exit status 2 diff --git a/user/jkoenig/java/proposal.mdwn b/user/jkoenig/java/proposal.mdwn new file mode 100644 index 00000000..4052f455 --- /dev/null +++ b/user/jkoenig/java/proposal.mdwn @@ -0,0 +1,628 @@ + +# Java for Hurd (and vice versa) + +Contact information: + + * Full name: Jérémie Koenig + * Email: jk@jk.fr.eu.org + * IRC: jkoenig on Freenode and OFTC + +## Introductions + +I am a first year M.Sc. student +in Computer Science at University of Strasbourg (France). +My interests include capability-based security, +programming languages and formal methods +(in particular, object-capability languages and proof-carrying code). + +### Proposal summary + +This project would consist in improving Java support on Hurd. 
+The first part would consist in +fixing bugs and porting Java-related packages. +The second part would consist in +creating low-level Java bindings for the Hurd interfaces, +as well as libraries to make translator development easier. + +### Previous involvement + +I started contributing to Hurd last summer, +during which I participated in Google Summer of Code +as a student for the Debian project. +I worked on porting Debian-Installer to Hurd. +This project was mostly a success, +although we still have to use a special mirror for installation +with a few modified packages +and tweaked priorities +to work around some uninstallable packages +with Priority: standard. + +Shortly afterwards, +I rewrote the procfs translator +to fix some issues with memory leaks, +make it more reliable, +and improve compatibility with Linux-based tools +such as `procps` or `htop`. + +Although I have not had as much time +as I would have liked to dedicate to the Hurd +since that time, +I have continued to maintain the mirror in question, +and I have started to work +on implementing POSIX threads signal semantics in glibc. + +### Project-related skills and interests + +I have used Java mostly for university assignments. +This includes non-trivial projects +using threads and distributed programming frameworks +such as Java RMI or CORBA. +I have also used it to experiment with +Google App Engine +(web applications) +and Google Web Toolkit +(a compiler from Java to JavaScript which helps with AJAX code), +and I have some limited experience with JNI +(the Java Native Interface, to link Java with C code). + +My knowledge of the Hurd and Debian GNU/Hurd is reasonable, +as the Debian-Installer and procfs projects +gave me the opportunity to fiddle with many parts of the system.
+ +Initially, +I started working on this project because I wanted to use +[Joe-E](http://code.google.com/p/joe-e/) +(a subset of Java) +to investigate the potential +[[applications of object-capability languages|objcap]] +in a Hurd context. +I also believe that improving Java support on Hurd +would be an important milestone. + +### Organisational matters + +I am subscribed to bug-hurd@g.o and +I do have a permanent internet connexion. + +I would be able to attend the regular IRC meetings, +and otherwise communicate with my mentor +through any means they would prefer +(though I expect email and IRC would be the most practical). +Since I'm already familiar with the Hurd, +I don't expect I would require too much time from them. + +My exams end on May 20 so I would be able to start coding +right at the beginning of the GSoC period. +Next year's term would probably begin around September 15, +so that would not be an issue either. +I expect I would work around 40 hours per week, +and my waking hours would be flexible. + +I don't have any other plans for the summer +and would not make any if my project were to be accepted. + +Full disclosure: +I also submitted a proposal to the Jikes RVM project +(which is a research-oriented Java Virtual Machine, +itself written in Java) +for implementing a new garbage collector into the MMTk subsystem. + +## Improve Java support + +### Justification + +Java is a popular language and platform used by many desktop and web +applications (mostly on the server side). As a consequence, competitive Java +support is important for any general-purpose operating system. +Better Java support would also be a prerequisite +for the second part of my proposal. 
+ +### Current situation + +Java is currently supported on Hurd with the GNU Java suite: + + * [GCJ](http://gcc.gnu.org/java/), + the GNU Compiler for Java, is part of GCC and can compile Java + source code to Java bytecode, and both source code and bytecode to + native code; + * libgcj is the implementation of the Java runtime which GCJ uses. + It is based on [GNU Classpath](http://www.gnu.org/software/classpath/). + It includes a bytecode interpreter which enables + Java applications compiled to native code to dynamically load and execute + Java bytecode from class files. + * The gij command is a wrapper around the above-mentioned virtual machine + functionality of libgcj and can be used as a replacement for the java + command. + +However, GCJ does not work flawlessly on Hurd. +For instance, some parts of libgcj rely on +the POSIX threads signal semantics, which are not yet implemented. +In particular, this makes ant hang waiting for child processes, +which makes some packages fail to build on Hurd +(“ant” is the “make” of the Java world). + +### Tasks + + * **Finish implementing POSIX thread semantics** in glibc (high priority). + According to POSIX, signal dispositions should be global to a process, + while signal blocking masks should be thread-specific. Signals sent to the + process as a whole are to be delivered to any thread which does not block + them. By contrast, Hurd has per-thread signal dispositions and signals + sent to a process are delivered to the main thread only. I have been + working on refactoring the glibc signal code and implementing the POSIX + semantics as a per-thread option. However, due to lack of time I have not + yet been able to test and debug my code properly. Finishing this work + would be my first task. + * **Fix further problems with GCJ on Hurd** (high priority). While I’m not + aware of any other problems with GCJ at the moment, I suspect some might + turn up as I progress with the other tasks.
Fixing these problems would + also be a high-priority task. + * **Port OpenJDK 6** (medium priority). While GCJ is fine, it is not yet + 100% complete. It is also slower than OpenJDK on architectures where a + just-in-time compiler is available. Porting OpenJDK would therefore + improve Java support on Hurd in scope and quality. Besides, it would also + be a good way to test GCJ, which is used for bootstrapping by the Debian + OpenJDK packages. Also note that OpenJDK 6 is now the default Java + Runtime Environment on all released Linux-based Debian architectures; + bringing Hurd in line with this would probably be a good thing. + * **Port Eclipse and other Java applications** (low priority). Eclipse is a + popular, state-of-the-art IDE and tool suite used for Java and other + languages. It is a dependency of the Joe-E verifier (see part 3 of this + proposal). Porting Eclipse would be a good opportunity to test GCJ and + OpenJDK. + +### Deliverables + + * The glibc pthreads patch and any other fixes on the Hurd side + would be submitted upstream + * Patches against Debian source packages + required to make them build on Hurd would be submitted + to the [Debian bug tracking system](http://bugs.debian.org/). + + +## Create Java bindings for the Hurd interfaces + +### Justification + +Java is used for many applications and often taught to +introduce object-oriented programming. The fact that Java is a +garbage-collected language makes it easier to use, especially for the less +experienced programmers. Besides, its object-oriented nature is a +natural fit for the capability-based design of Hurd. +The JVM is also used as a target for many other languages, +all of which would benefit from the access provided by these bindings. 
+ +Advantages over other garbage-collected, object-oriented languages include +performance, type safety and the possibility of compiling a Java translator to +native code and +[linking it statically](http://gcc.gnu.org/wiki/Statically_linking_libgcj) +using GCJ, should anyone want to use a +translator written in Java for booting. +Note that Java is +[being](http://www.linuxjournal.com/article/8757) +[used](http://oss.readytalk.com/avian/) +in this manner for embedded development. +Since GCJ can take bytecode as its input, +I expect this possibility would apply to any JVM-based language. + +Java bindings would lower the bar for newcomers +to begin experimenting with what makes Hurd unique +without being faced right away with the complexity of +low-level systems programming. + +### Tasks summary + + * Implement Java bindings for Mach + * Implement a libports-like library for Java + * Modify MIG to output Java code + * Implement libfoofs-like Java libraries + +### Design principles + +The principles I would use to guide the design +of these Java bindings would be the following ones: + + * The system should be hooked into at a low level, + to ensure that Java is a "first class citizen" + as far as the access to the Hurd's interfaces is concerned. + * At the same time, the memory safety of Java should be maintained + and extended to Mach primitives such as port names and + out-of-line memory regions. + * Higher-level interfaces should be provided as well + in order to make translator development + as easy as possible. + * A minimum amount of JNI code (i.e. C code) should be used. + Most of the system should be built using Java itself + on top of a few low-level primitives. + * Hurd objects would map to Java objects. + * Using the same interfaces, + objects corresponding to local ports would be accessed directly, + and remote objects would be accessed over IPC.
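The last two principles can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: `Greeter` stands in for a MIG-generated interface, `PortRegistry` for whatever would associate port names with local objects, and `RemoteGreeter` only counts calls instead of performing any IPC. The point is merely that callers would see a single interface whether the object is local or remote.

```java
import java.util.HashMap;
import java.util.Map;

/* Hypothetical interface, standing in for a MIG-generated one. */
interface Greeter {
    String greet(String name);
}

/* Local implementation: calls are plain method calls. */
class LocalGreeter implements Greeter {
    public String greet(String name) {
        return "Hello, " + name + "!";
    }
}

/* Stand-in for a proxy that would marshal the call into a message.
   No IPC happens here: the call is only counted and answered locally. */
class RemoteGreeter implements Greeter {
    static int simulatedRpcCount = 0;
    private final int portName;

    RemoteGreeter(int portName) {
        this.portName = portName;
    }

    public String greet(String name) {
        simulatedRpcCount++;
        return "Hello from port " + portName + ", " + name + "!";
    }
}

/* Maps port names to the local objects serving them (assumed design). */
class PortRegistry {
    private final Map<Integer, Greeter> local = new HashMap<>();

    void serve(int portName, Greeter object) {
        local.put(portName, object);
    }

    /* Returns the local object when there is one, a proxy otherwise. */
    Greeter resolve(int portName) {
        Greeter object = local.get(portName);
        return (object != null) ? object : new RemoteGreeter(portName);
    }
}
```

With this shape, code that holds a `Greeter` never needs to know which case it is in, and a locally served object is reached without any message being built.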
+ +One approach used previously to interface programming languages with the Hurd +has been to create bindings for helper libraries such as libtrivfs. Instead, +for Java I would like to take a lower-level approach by providing access to +Mach primitives and extending MIG to generate Java code from the interface +description files. + +This approach would be initially more involved, and would introduce several +issues related to overcoming the "impedance mismatch" between Java and Mach. +However, once an initial implementation is done it would be easier to maintain +in the long run and we would be able to provide Java bindings for a large +percentage of the Hurd’s interfaces. + +### Bindings for Mach system calls + +In this low-level approach, my intention is to enable Java code to use Mach +system calls (in particular, mach_msg) more or less directly. This would +ensure full access to the system from Java code, but it raises a number of +issues: + + * the Java code must be able to manipulate Mach-level entities, such as port + rights or page-aligned buffers mapped outside of the garbage-collected + heap (for out-of-line transfers); + * putting together IPC messages requires control of the low-level + representation of data. + +In order to address these concerns, classes would encapsulate these +low-level entities so that they can be referenced through normal, safe objects +from standard Java code. Bindings for Mach system calls can then be provided +in terms of these classes. Their implementation would use C code through the +Java Native Interface (JNI). + +More specifically, this functionality would be provided by the `org.gnu.mach` +package, which would contain at least the following classes: + + * `MachPort` would encapsulate a `mach_port_t`. (Some of) its constructors + would act as an interface for the `mach_port_allocate()` system call.
+ `MachPort` objects would also be instantiated from other parts of the JNI + C code to represent port rights received through IPC. The `deallocate()` + method would call `mach_port_deallocate()` and replace the encapsulated + port name with `MACH_PORT_DEAD`. We would recommend that users call it + when a port is no longer used, but the finalizer would also deallocate the + port when the `MachPort` object is garbage collected. + * `Buffer` would represent a page-aligned buffer allocated outside of the + Java heap, to be transferred (or having been received) as out-of-line + memory. The JNI code would provide methods to read and write data at + an arbitrary offset (but within bounds) and would use `vm_allocate()` and + `vm_deallocate()` in the same spirit as for `MachPort` objects. + * `Message` would allow Java code to put together Mach messages. The + constructor would allocate a `byte[]` member array of a given size. + Additional methods would be provided to fill in or query the information + in the message header and additional data items, including `MachPort` and + `Buffer` objects which would be translated to the corresponding port names + and out-of-line pointers. + A global map from port names to the corresponding `MachPort` object + would probably be needed to ensure that there is a one-to-one + correspondence. + * `Syscall` would provide static JNI methods for performing system calls not + covered by the above classes, such as `mach_msg()` or + `mach_thread_self()`. These methods would accept or return `MachPort`, + `Buffer` and `Message` objects when appropriate. The associated C code + would access the contents of such objects directly in order to perform the + required unsafe operations, such as constructing `MachPort` and `Buffer` + objects directly from port names and C pointers.
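As a rough illustration of the `MachPort` bookkeeping described above, here is a pure-Java sketch with no JNI, under stated assumptions: `fromName()` is a hypothetical entry point for the native code that receives port names, the global map is a plain `HashMap`, and the places where the real class would call into Mach are only marked by comments.

```java
import java.util.HashMap;
import java.util.Map;

/* Sketch only: the real class would call into JNI where noted. */
final class MachPort {
    /* Same bit pattern as MACH_PORT_DEAD, i.e. ~0 as a 32-bit name. */
    static final int MACH_PORT_DEAD = ~0;

    /* Global map keeping one MachPort object per port name. */
    private static final Map<Integer, MachPort> PORTS = new HashMap<>();

    private int name;

    private MachPort(int name) {
        this.name = name;
    }

    /* Hypothetical entry point for native code that received a port name:
       returns the existing object for that name, or creates one. */
    static MachPort fromName(int name) {
        synchronized (MachPort.class) {
            MachPort port = PORTS.get(name);
            if (port == null) {
                port = new MachPort(name);
                PORTS.put(name, port);
            }
            return port;
        }
    }

    boolean isDead() {
        synchronized (MachPort.class) {
            return name == MACH_PORT_DEAD;
        }
    }

    /* Idempotent: a second call finds the port already dead. */
    void deallocate() {
        synchronized (MachPort.class) {
            if (name == MACH_PORT_DEAD) {
                return;
            }
            /* The real version would call mach_port_deallocate() here. */
            PORTS.remove(name);
            name = MACH_PORT_DEAD;
        }
    }
}
```

A map holding strong references, as here, would keep every `MachPort` reachable and so defeat the finalizer-based cleanup; an actual implementation would presumably store weak references instead.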
+ +Note that careful consideration should be given to the interfaces of these +classes to avoid “safety leaks” which would compromise the safety guarantees +provided by Java. Potentially problematic scenarios include the following +examples: + + * It must not be possible to write an integer at some position in a + `Message` object, and to read it back as a `MachPort` or `Buffer` object, + since this would allow unsafe access to arbitrary memory addresses and + Mach port names. + * Providing the `mach_task_self()` system call would also provide access to + arbitrary addresses and ports by using the `vm_*` family of RPC operations + with the returned `MachPort` object. This means that the relevant task + operations should be provided by the `Syscall` class instead. + +Finally, access should be provided to the initial ports and file descriptors +in `_hurd_ports` and provided by the `getdport()` function, +for instance through static methods such as +`getCRDir()`, `getCWDir()`, `getProc()`, ... in a dedicated class such as +`org.gnu.hurd.InitPorts`.
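The first safety scenario above (an integer read back as a port) can be made concrete with a small sketch. It deliberately departs from the `byte[]`-backed `Message` described earlier: items are kept as tagged Java objects so that each read can check what was written. `TypedMessage` and `PortHandle` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/* Opaque stand-in for the MachPort class; only its identity matters here. */
final class PortHandle {
}

/* Sketch of a message body whose items remember their type. */
final class TypedMessage {
    private final List<Object> items = new ArrayList<>();

    /* Appends an integer item and returns its index. */
    int addInt(int value) {
        items.add(Integer.valueOf(value));
        return items.size() - 1;
    }

    /* Appends a port item and returns its index. */
    int addPort(PortHandle port) {
        if (port == null) {
            throw new NullPointerException("port");
        }
        items.add(port);
        return items.size() - 1;
    }

    int readInt(int index) {
        Object item = items.get(index);
        if (!(item instanceof Integer)) {
            throw new IllegalStateException("item " + index + " is not an integer");
        }
        return (Integer) item;
    }

    /* Refuses to reinterpret an integer item as a port. */
    PortHandle readPort(int index) {
        Object item = items.get(index);
        if (!(item instanceof PortHandle)) {
            throw new IllegalStateException("item " + index + " is not a port");
        }
        return (PortHandle) item;
    }
}
```

Whether the real `Message` class tracks types this way or by inspecting the Mach type descriptors stored in its byte array is an open design choice; the sketch only shows the invariant that would have to hold.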
+ +A realistic example of code based on such interfaces would be: + + import org.gnu.mach.MsgType; + import org.gnu.mach.MachPort; + import org.gnu.mach.Buffer; + import org.gnu.mach.Message; + import org.gnu.mach.Syscall; + import org.gnu.hurd.InitPorts; + + public class Hello + { + public static void main(String argv[]) + /* Parent class for all Mach-related exceptions */ + throws org.gnu.mach.MachException + { + /* Allocate a reply port */ + MachPort reply = new MachPort(); + + /* Allocate an out-of-line buffer */ + Buffer data = new Buffer(MsgType.CHAR, 13); + data.writeString(0, "Hello, World!"); + + /* Craft an io_write message */ + Message msg = new Message(1024); + msg.setRemotePort(InitPorts.getdport(1)); + msg.setLocalPort(reply, Message.Type.MAKE_SEND_ONCE); + msg.setId(21000); + msg.addBuffer(data); + + /* Make the call, MACH_SEND_MSG | MACH_RCV_MSG */ + Syscall.machMsg(msg, true, true, reply); + + /* Extract the returned value */ + msg.assertId(21100); + int retCode = msg.readInt(0); + int amount = msg.readInt(1); + } + } + +Should this paradigm prove insufficient, +more ideas could be borrowed from the +[`org.vmmagic`](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.151.5253&rep=rep1&type=pdf) +package used by [Jikes RVM](http://jikesrvm.org/), +a research Java virtual machine itself written in Java. + +### Generating Java stubs with MIG + +Once the basic machinery is in place to interface with Mach, Java programs +have more or less equal access to the system functionality without resorting +to more JNI code. However, as illustrated above, this access is far from +convenient. + +As a solution I would modify MIG to add the option to output Java code. MIG +would emit a Java interface, a client class able to implement the interface +given a Mach port send right, and a server class which would be able to handle +incoming messages.
The class diagram below, although it is by no means +complete or free of problems, illustrates the general idea: + +[[gsoc2011_classes.png]] + +This structure is somewhat reminiscent of +[Java RMI](http://en.wikipedia.org/wiki/Java_remote_method_invocation) +or similar systems, +which aim to provide more or less transparent access to remote objects. +The exact way the Java code would be generated still needs to be determined, +but basically: + + * An interface, corresponding to the header files generated by MIG, would + enumerate the operations listed in a given `.defs` file. Method names would + be transformed to adhere to Java conventions (for instance, + `some_random_identifier` would become `someRandomIdentifier`). + * A user class, corresponding to the `*User.c` files, + would implement this interface by doing RPC over a given MachPort object. + * A server class, corresponding to `*Server.c`, would be able to handle + incoming messages using a user-provided implementation of the interface. + (Possibly, a skeleton class providing methods which would raise + `NotImplementedException`s would be provided as well. + Users would derive from this class and override the relevant methods. + This would allow them not to implement some operations, + and would keep pre-existing code from breaking when new operations are + introduced.) + +In order to help with the implementation of servers, some kind of library +would be needed to associate Mach receive rights with server objects and to +handle incoming messages on dedicated threads, in the spirit of libports. +This would probably require support for port sets at the level of the Mach +primitives described in the previous section. + +When possible, operations involving the transmission of send rights +of some kind would be expressed in terms of the MIG-generated interfaces +instead of `MachPort` objects.
+Upon reception of a send right, +a `FooUser` object would be created +and associated with the corresponding `MachPort` object. +If the received send right corresponds to a local port +to which a server object has been associated, +this object would be used instead. +This way, +subsequent operations on the received send right +would be handled as direct method calls +instead of going through RPC mechanisms. + +Some issues will still need to be solved regarding how MIG will convert +interface description files to Java interfaces. For instance: + + * `.defs` files are not explicitly associated with a type. For instance in + the example above, MIG would have to somehow infer that io_t corresponds + to `this` in the `Io` interface. + * More generally, a correspondence between MIG and Java types would have + to be determined. Ideally this would be automated and not hardcoded + too much. + * Initially, reply port parameters would be ignored. However they may be + needed for some applications. + +So the details would need to be fleshed out during the community bonding +period and as the implementation progresses. However I’m confident that a +satisfactory solution can be designed. + +Using these new features, the example above could be rewritten as: + + import org.gnu.hurd.InitPorts; + import org.gnu.hurd.Io; + import org.gnu.hurd.IoUser; + + class Hello { + public static void main(String argv[]) throws ... + { + Io stdout = new IoUser(InitPorts.getdport(1)); + String hello = "Hello, World!\n"; + + int amount = stdout.write(hello.getBytes(), -1); + + /* (A retCode corresponding to an error + would be signalled as an exception.)
*/ + } + } + +An example of a server implementation would be: + + import org.gnu.hurd.Io; + import java.util.Arrays; + + class HelloIo implements Io { + final byte[] contents = "Hello, World!\n".getBytes(); + + public int write(byte[] data, int offset) { + return SOME_ERROR_CODE; + } + + public byte[] read(int offset, int amount) { + /* The upper bound of copyOfRange is exclusive. */ + return Arrays.copyOfRange(contents, offset, + offset + amount); + } + + /* ... */ + } + +A new server object could then be created with `new IoServer(new HelloIo())`, +and associated with some receive right at the level of the ports management +library. + +### Base classes for common types of translators + +Once MIG can target Java code, and a libports equivalent is available, +creating new translators in Java would be greatly facilitated. However, +we would probably want to introduce basic implementations of file system +translators in the spirit of libtrivfs or libnetfs. They could take the form +of base classes implementing the relevant MIG-generated interfaces which +would then be derived by users, +or could define a simpler interface +which would then be used by adapter classes +to implement the required ones. + +I would draw inspiration from libtrivfs and libnetfs +to design and implement similar solutions for Java. + +### Deliverables + + * A hurd-java package would contain the Java code developed + in the context of this project. + * The Java code would be documented using javadoc + and a tutorial for writing translators would be written as well. + * Modifications to MIG would be submitted upstream, + or a patched MIG package would be made available. + +The Java libraries resulting from this work, +including any MIG support classes +as well as the class files built from the MIG-generated code +for the Mach and Hurd interface definition files, +would be provided as a single `hurd-java` package for +Debian GNU/Hurd. +This package would be separate from both Hurd and Mach, +so as not to impose unreasonable build dependencies on them.
+ +I expect I would be able to act as its maintainer in the foreseeable future, +either as an individual or as a part of the Hurd team. +Hopefully, +my code would be claimed by the Hurd project as their own, +and consequently the modifications to MIG +(which would at least conceptually depend on the Mach Java package) +could be integrated upstream. + +Since by design, +the Java code would use only a small number of stable interfaces, +it would not be subject to excessive amounts of bitrot. +Consequently, +maintenance would primarily consist in +fixing bugs as they are reported, +and adding new features as they are requested. +A large number of such requests +would mean the package is useful, +so I expect that the overall amount of work +would be correlated with the willingness of more people +to help with maintenance +should I become overwhelmed or get hit by a bus. + + +## Timeline + +The dates listed are deadlines for the associated tasks. + + * *Community bonding period.* + Discuss, refine and complete the design of the Java bindings + (in particular the MIG and "libports" parts) + * *May 23.* + Coding starts. + * *May 30.* + Finish implementing pthread signal semantics. + * *June 5.* + Port OpenJDK + * *June 12.* + Fix the remaining problems with GCJ and/or OpenJDK, + possibly port Eclipse or other big Java packages. + * *June 19.* + Create the bindings for Mach. + * *June 26.* + Work on some kind of basic Java libports + to handle receive rights. + * *July 3.* + Test, write some documentation and examples. + * *July 17 (two weeks).* + Add the Java target to MIG. + * *July 24.* + Test, write some documentation and examples. + * *August 7 (two weeks).* + Implement a modular libfoofs to help with translator development. + Try to write a basic but non-trivial translator + to evaluate the performance and ease of use of the result, + rectify any rough edges this would uncover. + * *August 22. 
(last two weeks)* + Polish the code and packaging, + finish writing the documentation. + + +## Conclusion + +This project is arguably ambitious. +However, I have been thinking about it for some time now +and I'm confident I would be able to accomplish most of it. + +In the event multiple language bindings projects +would be accepted, +some work could probably be done in common. +In particular, +[ArneBab](http://www.bddebian.com/~hurd-web/community/weblogs/ArneBab/2011-04-06-application-pyhurd/) +seems to favor a low-level approach for his Python bindings as I do for Java, +and I would be happy to discuss API design and coordinate MIG changes with him. +I would also have an extra month after the end of the GSoC period +before I go back to school, +which I would be able to use to finish the project +if there is some remaining work. +(Last year's rewrite of procfs was done during this period.) + +As for the project's benefits, +I believe that good support for Java +is a must-have for the Hurd. +Java bindings would also further the Hurd's agenda +of user freedom by extending this freedom to more people: +I expect the set of developers +who would be able to write Java code against a well-written libfoofs +is much larger than +those who master the intricacies of low-level systems C programming. +From a more strategic point of view, +this would also help recruit new contributors +by providing an easier path to learning the inner workings of the Hurd. + +Further developments +which would build on the results of this project +include my planned [[experiment with Joe-E|objcap]] +(which I would possibly take on as a university project next year). +Another possibility would be to reimplement some parts +of the Java standard library +directly in terms of the Hurd interfaces +instead of using the POSIX ones through glibc. 
+This would possibly improve the performance +of some Java applications (though probably not by much), +and would otherwise be a good project +for someone trying to get acquainted with Hurd. + +Overall, I believe this project would be fun, interesting and useful. +I hope that you will share this sentiment +and give me the opportunity to spend another summer working on Hurd. + |