[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
# GSoC 2011 final report (Java on Hurd)
This is my final report regarding my work on Java for Hurd
as a Google Summer of Code student for the GNU project.
The work is ongoing;
for recent status updates, see my [[java]] page.
## Global signal dispositions and SA_SIGINFO
Signal delivery was implemented in Hurd before POSIX threads were
defined. As a consequence the current semantics differ from the POSIX
prescriptions, which libgcj relies on.

On the Hurd, each thread has its own signal dispositions and
process-wide signals are always received by the main thread.

In contrast, POSIX specifies signal dispositions to be global to the
process (although there is still a per-thread blocking mask), and a
global signal can be delivered to any thread which does not block it.
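
The POSIX model described above can be illustrated with a minimal sketch.
It is not taken from the patch series; the names `run_demo`, `worker` and
`on_usr1` are hypothetical, and the code assumes a system that already
implements the POSIX semantics. The handler is installed once for the whole
process, the main thread blocks `SIGUSR1` in its own mask, and a
process-directed signal stays pending until a thread that does not block it
exists.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t handled = 0;

/* Installed once with sigaction(); under POSIX it applies to every thread. */
static void on_usr1(int signo)
{
    (void)signo;
    handled = 1;
}

/* The worker inherits the creator's mask, then unblocks SIGUSR1 for itself. */
static void *worker(void *arg)
{
    (void)arg;
    sigset_t usr1;
    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);
    pthread_sigmask(SIG_UNBLOCK, &usr1, NULL);

    /* The pending signal is normally delivered right away; poll briefly. */
    struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 };
    for (int i = 0; i < 1000 && !handled; i++)
        nanosleep(&pause, NULL);
    return NULL;
}

/* Returns 1 if the signal was handled only after the worker unblocked it. */
int run_demo(void)
{
    handled = 0;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGUSR1, &sa, NULL);   /* process-wide disposition */
    assert(rc == 0);

    sigset_t usr1, old;
    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);
    rc = pthread_sigmask(SIG_BLOCK, &usr1, &old);   /* per-thread mask */
    assert(rc == 0);

    rc = kill(getpid(), SIGUSR1);   /* process-directed: every thread blocks it */
    assert(rc == 0);
    int before = handled;           /* still pending, so not handled yet */

    pthread_t thread;
    rc = pthread_create(&thread, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_join(thread, NULL);
    assert(rc == 0);
    (void)rc;
    int after = handled;

    pthread_sigmask(SIG_SETMASK, &old, NULL);   /* restore the caller's mask */
    return before == 0 && after == 1;
}
```

Calling `run_demo()` from a `main` function is expected to return 1 on a
POSIX-conforming system. With the historical Hurd semantics the outcome may
differ, since process-wide signals go to the main thread.
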

To further complicate the matter, the Hurd currently has two options for
threads: the cthread library, still used by most of the Hurd code, and
libpthread which was introduced later for compatibility purposes. To
avoid breaking existing code, cthread programs should continue to run
with the historical Hurd signal semantics whereas pthread programs are
expected to rely on the POSIX behavior.

To address this, the patch series I wrote allows selecting a per-thread
behavior: by default, newly created threads provide historical
semantics, but they can be marked by libpthread as global signal
receivers using the new function `_hurd_sigstate_set_global_rcv()`.

In addition, I refactored some of the signal code to improve
readability, and fixed a couple of bugs I came across in the process.

Another improvement which was required by OpenJDK was the implementation
of the `SA_SIGINFO` flag for signal handlers. My signal patch series
provides the basic infrastructure. However, it is not yet complete, as
some of the information provided by `siginfo_t` structures is not
available to glibc. Making this information available would require a
change in the `msg_sig_post()` RPC.
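
For readers unfamiliar with `SA_SIGINFO`, here is a small sketch of the
three-argument handler form that the flag enables. It is an illustration
only, not code from the patch series: `siginfo_demo` and `on_usr2` are
hypothetical names, and the check on `si_code` and `si_pid` assumes a system
that fills in those fields, which, as noted above, may not yet be the case
on the Hurd.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t seen_signo = 0;
static volatile sig_atomic_t seen_code = 0;
static volatile sig_atomic_t sender_is_self = 0;

/* With SA_SIGINFO the handler receives a siginfo_t describing the signal. */
static void on_usr2(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    seen_code = info->si_code;
    sender_is_self = (info->si_pid == getpid());
    seen_signo = info->si_signo;   /* written last: the caller polls on it */
}

/* Returns 1 if the handler saw SIGUSR2 sent by this process via kill(). */
int siginfo_demo(void)
{
    seen_signo = 0;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_usr2;     /* sa_sigaction, not sa_handler */
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGUSR2, &sa, NULL);
    assert(rc == 0);

    rc = kill(getpid(), SIGUSR2);
    assert(rc == 0);
    (void)rc;

    /* Delivery is normally immediate; poll briefly to be safe. */
    struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 };
    for (int i = 0; i < 1000 && seen_signo == 0; i++)
        nanosleep(&pause, NULL);

    return seen_signo == SIGUSR2 && seen_code == SI_USER && sender_is_self;
}
```

A runtime such as a JVM typically relies on the extra fields (for example the
faulting address for `SIGSEGV`), which is why partial `siginfo_t` support
matters.
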
### Related Debian changes
In Debian GNU/Hurd, libpthread is provided by the `hurd` package. Hurd also
uses extern inline functions from glibc which are affected by the new
symbols. This means that newer Hurd packages which take advantage of
glibc's support for global signal dispositions cannot run on older C
libraries and some thought had to be given to the way we could ensure
smooth upgrades.

An early attempt at using weak symbols proved to be impractical. As a
consequence I modified the eglibc source package to enable
dpkg-gensymbols on hurd-i386. This means that packages which are built
against a newer libc and make use of the new symbols will automatically
get an appropriately versioned dependency on libc0.3.
### Status as of 2012-01-28
The patch series has not yet been merged upstream. However, it is now
being used for the Debian package of glibc.
## $ORIGIN substitution in RPATH
Another feature used by OpenJDK which was not implemented in Hurd is the
substitution of the special string `$ORIGIN` within the ELF `RPATH`
header. `RPATH` is a per-executable library search path, within which
`$ORIGIN` should be substituted by the directory part of the binary's
canonical file name.
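
The substitution itself can be sketched as follows. This is a simplified
illustration and not the code from `dl-origin.c`: the helper name
`expand_origin` is hypothetical, the executable's file name is assumed to be
already canonical, and the braced `${ORIGIN}` spelling is not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd copy of rpath in which every "$ORIGIN" is replaced by
   the directory part of exe_path, or NULL if allocation fails. */
char *expand_origin(const char *rpath, const char *exe_path)
{
    static const char token[] = "$ORIGIN";
    const size_t token_len = sizeof token - 1;

    assert(rpath != NULL && exe_path != NULL);

    /* Directory part: everything before the last '/'. */
    const char *slash = strrchr(exe_path, '/');
    const char *dir;
    size_t dir_len;
    if (slash == NULL) {            /* no directory component at all */
        dir = ".";
        dir_len = 1;
    } else if (slash == exe_path) { /* file directly under the root */
        dir = "/";
        dir_len = 1;
    } else {
        dir = exe_path;
        dir_len = (size_t)(slash - exe_path);
    }

    /* Count occurrences so the result can be allocated in one go. */
    size_t count = 0;
    for (const char *p = strstr(rpath, token); p != NULL;
         p = strstr(p + token_len, token))
        count++;

    size_t out_len = strlen(rpath) - count * token_len + count * dir_len;
    char *out = malloc(out_len + 1);
    if (out == NULL)
        return NULL;

    /* Copy the text between tokens, inserting the directory each time. */
    char *w = out;
    const char *r = rpath;
    for (const char *p; (p = strstr(r, token)) != NULL; r = p + token_len) {
        memcpy(w, r, (size_t)(p - r));
        w += p - r;
        memcpy(w, dir, dir_len);
        w += dir_len;
    }
    strcpy(w, r);                   /* remaining tail, including the NUL */
    return out;
}
```

For instance, with a binary at `/opt/jdk/bin/java` (a made-up path), the
search path `$ORIGIN/../lib:/usr/lib` would expand to
`/opt/jdk/bin/../lib:/usr/lib`. The caller owns the returned string and
releases it with `free()`.
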

Currently, a newly executed program has no way of figuring out which
binary it was created from. Actually, even the `_hurd_exec()` function,
which is used in glibc to implement the `exec*()` family, is never
passed the file name of the executable, but only a port to it.

Likewise, the `file_exec()`, `exec_exec()` and `exec_startup_get_info()`
RPCs do not provide a path to transmit the file name from the shell to
the file system server, to the exec server, to the executed program.

Last year, Emilio Pozuelo Monfort submitted a patch series which fixes
this problem, up to the exec server. The series' original purpose was to
replace the guesswork done by `exec` when running shell scripts. It
provides new versions of `file_exec()` and `exec_exec()` which allow for
passing the file name. I extended Emilio's patches to add the missing
link, namely a new `exec_startup_get_info_2()` RPC. New code in glibc
takes advantage of it to retrieve the file name and use it in a
Hurd-specific `dl-origin.c` to allow for `RPATH` `$ORIGIN` substitution.
### Status as of 2012-01-28
The (hurd and glibc) patch series for `$ORIGIN` are mostly complete.
However, there is still an issue related to the canonicalization of the
executable's file name. Doing it in the dynamic linker (where `$ORIGIN`
is expanded) is complicated because only a limited set of functions is
available there (`realpath()` is not among them). Unfortunately, canonicalizing in
`_hurd_exec_file_name()` is not an option either because many shell
scripts use `argv[0]` to alter their behavior, but `argv[0]` is replaced
by the shell with the file name it's passed.

Another issue is that the patches use a fixed-length string buffer to
transmit the file name through RPC.
## OpenJDK 7
With the groundwork above being taken care of, I was able to build
OpenJDK 7 on Hurd, although heavy portability patching was also
necessary. A similar effort for Debian GNU/kFreeBSD was undertaken
around the same time by Damien Raude-Morvan, so I intend to submit a
more general set of "non-Linux" patches.

Due to the lack of a `libpthread_db` library on the Hurd, I was only
able to build a Zero (interpreter only) virtual machine so far. However,
it should be possible to disable the debugging agent and build Hotspot.
### Status as of 2012-01-28
I have put together generic `nonlinux-*.diff` patches for the `openjdk7`
Debian package; however, I have not yet tested them on Linux and kFreeBSD.
## Java bindings
Besides improving Java support on Hurd, my original proposal also
included the creation of Java bindings for the Hurd interfaces.

My progress on this front has not been as fast as I would have liked.
However, I have started some of the work required to provide safe Java
bindings for Mach system calls.

See https://github.com/jeremie-koenig/hurd-java.