[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!tag stable_URL]]


# Improve Java on Hurd (GSoC 2011)


## Description

The project consists in improving Java support on Hurd.
This includes porting OpenJDK,
creating low-level Java bindings for Mach and Hurd,
as well as creating Java libraries to help with translator development.

For details, see my original [[proposal]].


## Current status

The project feels slightly behind schedule, but it was known from the beginning
to be very ambitious, and progress is good, so this is not a problem.
--[[tschwinge]], 2011-06-29.


### Apt repository

Modified Debian packages are available in this repository:

    deb http://jk.fr.eu.org/debian experimental/
    deb-src http://jk.fr.eu.org/debian experimental/


### Glibc signal code improvements

2011-06-29:
Patches were submitted to `libc-alpha`
which implement global signal dispositions and `SA_SIGINFO`.
My latest code is available on
[github](http://github.com/jeremie-koenig/glibc/commits/master-beware-rebase),
and modified Debian packages
are available in my apt repository.
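
For readers unfamiliar with the interface, here is a minimal sketch of what
`SA_SIGINFO` offers an application on a POSIX system: a three-argument handler
that receives a `siginfo_t`.  It uses only the standard `sigaction()` and
`raise()` calls.  The helper name `deliver_with_siginfo` is made up for this
illustration and is not part of the patches.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Written by the handler.  volatile sig_atomic_t is the type that may be
   safely assigned from within a signal handler.  */
static volatile sig_atomic_t seen_signo = 0;

/* Three-argument handler, selected by the SA_SIGINFO flag.  */
static void
on_signal (int signo, siginfo_t *info, void *ucontext)
{
  (void) signo;
  (void) ucontext;              /* Points to a ucontext_t; unused here.  */
  seen_signo = info->si_signo;
}

/* Hypothetical helper: install the handler for SIGNO, raise the signal once,
   and return the si_signo value the handler observed (-1 on error).  */
int
deliver_with_siginfo (int signo)
{
  struct sigaction sa;

  assert (signo > 0);

  memset (&sa, 0, sizeof sa);
  sa.sa_sigaction = on_signal;  /* Use sa_sigaction, not sa_handler.  */
  sa.sa_flags = SA_SIGINFO;
  sigemptyset (&sa.sa_mask);

  seen_signo = 0;
  if (sigaction (signo, &sa, NULL) != 0)
    return -1;
  /* If the signal is not blocked, raise() returns only after the handler
     has run.  */
  if (raise (signo) != 0)
    return -1;
  return seen_signo;
}
```

Calling `deliver_with_siginfo (SIGUSR1)` should return `SIGUSR1` wherever
`SA_SIGINFO` is supported.  Which `siginfo_t` fields beyond `si_signo` carry
meaningful values depends on the implementation.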

2011-07-20:
The patches were reviewed by Samuel Thibault.
Samuel pointed out a couple of issues
and I believe I have addressed all of them (fixes posted).
I'm in the process of publishing updated libc and hurd packages;
provided those work as expected,
the next step would be to get these changes into Debian.

One question is how the new symbols introduced by my patches
should be handled.
Weak symbols turned out to be impractical,
so I'm currently considering using a Debian-specific
symbol version in the interim period (`GLIBC_2.13_DEBIAN_8` so far).
The ultimate symbol version to be used will depend on
the time at which the patches get integrated upstream
(most likely `GLIBC_2.15`),
at which point we will alias the interim version
to the new one in Debian packages.

I have modified libc0.3 to include a `deb-symbols(5)` file
(alternatively see <http://wiki.debian.org/Projects/ImprovedDpkgShlibdeps>)
so that we get an accurate libc dependency in `hurd` and other packages
when the symbols in question are pulled in.

[[hurd/libthreads]] (cthreads library) will not be changed.  There's no reason
why its behavior should change, whereas for [[libpthread]] it's needed for
conformance.  Patches posted on 2011-05-25, but there's a more recent one in
the modified hurd package (adds `_hurd_sigstate_delete` and removes the weak
symbols).

IRC, freenode, #hurd, 2011-07-27: 

    < jkoenig> the glibc patches are pending review and inclusion in Debian (I
      think youpi wants to check my latest additions before we go ahead with
      that)
    < jkoenig> when it's in Debian and the sky does not fall, I intend to
      resubmit a full series to libc-alpha for inclusion upstream.

IRC, freenode, #hurd, 2011-08-24: 

    < youpi> jkoenig: I'll probably commit your siginfo/globalsig patches soon
    < youpi> I'm building the ant package atm, seems to proceed great
    < jkoenig> youpi, great!

Another issue which came up with OpenJDK is the expansion
by the dynamic linker of `$ORIGIN` in the `RPATH` header,
see below.

#### Plans

The patches are pending review and inclusion upstream.
As soon as we reach an agreement wrt. the new interfaces
(in particular wrt. the value of `SA_SIGINFO`),
the patches will be applied to the Debian libc packages
for broader testing.


##### Open Items

  * Test patches: in progress, [[jkoenig]], Svante.  More volunteers welcome,
    of course.

    > There's an issue with gdb,
    > namely signals lose their "untracedness" when they go
    > through the global sigstate's pending mask,
    > so gdb spins intercepting a signal and trying to deliver it.
    > [Patch](http://github.com/jeremie-koenig/glibc/commit/3ecb990e9d08d5f75adc40b738b35a1802cc0943).

  * If [[jkoenig]] thinks it's mature enough: should ask
    [[Samuel|samuelthibault]] to test these patches on the buildds.

    > There's a risk that a dependency on my patched libc
    > might be pulled in while building packages
    > (in particular hurd)
    > --[[jkoenig]] 2011-06-22

      * Waiting on ABI finalization ([!] Roland).

          * Which numeric values to use for `SA_SIGINFO` (and `SA_NOCLDWAIT`)?

            > Staying in sync with BSD seems the most logical approach,
            > so I have defined it to 0x40. --[[jkoenig]] 2011-06-29

  * Get patches reviewed (Roland?), and integrated into official sources: [!]
    [[tschwinge]].

    > [[samuelthibault]] reviewed the patches and pointed out a couple of
    > issues which I'm currently working on:
    >
    >   * Slight behaviour change with respect to forgetting blocked ignored
    >     signals. POSIX is flexible in this regard but I guess we could retain
    >     them instead of the current behaviour.
    >   * Sigstate accessors could be made extern inline functions.
    >     I suggest we postpone this.
    >   * Incorrect changes for `msg_{get,set}_init_int(INIT_SIGMASK)`
    >   * Some comments which can be improved.
    >
    > Once these are fixed we can probably test the patches in Debian.
    >
    > --[[jk]] 2011-07-06

  * Documentation bits (from here, the initial [[proposal]], and elsewhere)
    should probably be
    moved either into the appropriate glibc or Hurd documentation
    files/reference manuals, or to [[glibc/signal]].

  * `SA_SIGINFO` patch is based on [[Samuel|samuelthibault]]'s earlier work.
    Thus, have him review the new patch?

  * `SA_SIGINFO` patch has a few TODOs w.r.t. protocol changes for missing
    information, and for FPU state.  Providing even incomplete information is
    an improvement on the current status.  The question is, whether
    applications rely on this information in any hard way if `SA_SIGINFO` is
    available?

      * We could possibly rename certain fields in `struct siginfo`, say
        `si_pid_not_implemented`, to ensure compilation failures for programs
        which use them.  Or perhaps a linker warning is possible.

        IRC, freenode, #hurd, 2011-08-20:

            < youpi> jkoenig: I was considering renaming the fields of siginfo
            < youpi> to catch applications which need those which we haven't
              yet
            < jkoenig> youpi, makes sense AFAICT
            < youpi> one issue we'll get is some application which previously
              built without SA_SIGINFO, and will now want some information
              we're not yet able to provide
            < youpi> but at least we'll know
            < jkoenig> youpi, yes it would still be better than having them
              crash at runtime because of it

        IRC, freenode, #hurd, 2011-08-21:

            < youpi> jkoenig: actually we need the fields for waitid

      * The FPU state is not included in the `ucontext_t` passed to the signal
        handler.  On the other hand, `ucontext_t` is actually being somewhat
        deprecated: the functions to restore it are no longer in POSIX.
        `thread_get_state()` should return this information, in case we decide
        to fill the gap, and there might be existing glibc wrappers, too.

  * Perhaps have a look at `SA_NOCLDWAIT`.


### Port OpenJDK

As suggested by [[tschwinge]], I have targeted OpenJDK 7 at first.
I don't expect it will be too hard to backport my patches to OpenJDK 6.
I have succeeded in building a working JIT-less ("zero") version,
although the dynamic linker issue must be worked around.
Porting HotSpot (the original just-in-time compiler of OpenJDK)
should not be too hard.
If that fails we can fall back on Shark
(a portable alternative JIT which uses LLVM).

Complexity of porting HotSpot: probably low.  The complex parts should be
architecture-specific rather than OS-specific, and not many Linux-specific
interfaces are used.  Garbage collection, memory management, and most of the
other Linux-specific interfaces have already been dealt with for the zero build.

The dynamic linker issue is as follows.
An executable-specific search path can be provided in the ELF RPATH header.
RPATH directories can include the special string `$ORIGIN`,
which is to be expanded to the directory the executable was loaded from.
OpenJDK's `java` command uses this feature to locate
the right `libjli.so` at runtime.
However,
on Hurd this information is not available to the dynamic linker
and as a consequence RPATH components which include `$ORIGIN`
are silently discarded.

This can be worked around by defining
the `LD_ORIGIN_PATH` environment variable,
which I have used to build and test OpenJDK so far.
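
To make the substitution concrete, here is a minimal user-space sketch of
`$ORIGIN` expansion for a single RPATH entry.  It is not the dynamic linker's
code: the function name `expand_origin` and the example paths are assumptions
made for illustration, and only the plain `$ORIGIN` spelling is handled.
When no origin directory is known, the entry is rejected, which mirrors the
"silently discarded" behaviour described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of RPATH_ENTRY in which every
   "$ORIGIN" is replaced by ORIGIN_DIR.  Returns NULL if memory runs out, or
   if the entry needs an origin but ORIGIN_DIR is NULL.  */
char *
expand_origin (const char *rpath_entry, const char *origin_dir)
{
  static const char token[] = "$ORIGIN";
  const size_t token_len = sizeof token - 1;
  size_t count = 0;

  assert (rpath_entry != NULL);

  /* First pass: count the occurrences so the result can be sized exactly.  */
  for (const char *p = rpath_entry; (p = strstr (p, token)) != NULL;
       p += token_len)
    count++;

  if (count > 0 && origin_dir == NULL)
    return NULL;                /* Origin unknown: drop this entry.  */

  size_t origin_len = origin_dir != NULL ? strlen (origin_dir) : 0;
  size_t out_len = strlen (rpath_entry) - count * token_len
                   + count * origin_len;
  char *out = malloc (out_len + 1);
  if (out == NULL)
    return NULL;

  /* Second pass: copy literal text and substitute each token.  */
  char *dst = out;
  const char *src = rpath_entry;
  const char *hit;
  while ((hit = strstr (src, token)) != NULL)
    {
      size_t chunk = (size_t) (hit - src);
      memcpy (dst, src, chunk);
      dst += chunk;
      memcpy (dst, origin_dir, origin_len);
      dst += origin_len;
      src = hit + token_len;
    }
  strcpy (dst, src);            /* Remaining tail, including the NUL.  */
  return out;
}
```

For example, `expand_origin ("$ORIGIN/../lib", "/opt/app/bin")` yields
`/opt/app/bin/../lib`.  A caller emulating the workaround could pass the value
of `getenv ("LD_ORIGIN_PATH")` as `origin_dir`.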

IRC, freenode, #hurd, 2011-07-27:

    < jkoenig> if you have the latest hurd/libc in my repository, you should be
      able to run /usr/lib/jvm/java-7-openjdk/bin/java without defining
      LD_ORIGIN_PATH manually
    < braunr> java: error while loading shared libraries: libjli.so: cannot
      open shared object file: No such file or directory
    < jkoenig> braunr, this one is expected, it's the symlink problem.
    < braunr> oh ok
    < jkoenig> (ie. thus far, if java is accessed as /usr/bin/java, the ld
      origin ends up as /usr/bin)

    < jkoenig> *sigh*... it seems I'm going to have to reimplement realpath()
      in elf/dl-origin.c.
    < braunr> why ?
    < jkoenig> using it from there results in duplicate symbols when linking
      elf/librtld.map.o
    < braunr> from where ?
    < braunr> dl-origin ?
    < jkoenig> apparently this part of the code uses a different allocator
      (elf/dl-minimal.c)
    < braunr> oh
    < braunr> depndency issues ?
    < braunr> or bootstrapping ones ?
    < jkoenig> http://paste.debian.net/124310/
    < jkoenig> dl-origin is what provides the $ORIGIN value for RPATH (now
      sysdeps/mach/hurd/dl-origin.c, in our case)
    < braunr> but what's the problem ?
    < braunr> what prevents you from using the existing implementation ?
    < jkoenig> you mean copy-and-paste the code ? Well I'll end up doing that I
      guess... not that it feels right.
    < braunr> not really
    < braunr> link against what provides it
    < braunr> i'm really not familiar with glibc :/
    < jkoenig> also I'd like to understand what's happening precisely before I
      resort to such blasphemy :-)
    < braunr> :)
    < jkoenig> maybe I could make {file,exec,_hurd}_exec_file_name()
      canonicalize it instead.
    < jkoenig> for some reason it does not feel right, though.
    < braunr> why ?
    < jkoenig> I'm not sure, loss of information maybe?
    < jkoenig> (that I ran /usr/bin/java as opposed to /usr/lib/jvm/...)
    < braunr> i guess you should explain the issue more clearly, i feel like
      there is something i'm really missing :/
    < braunr> but it can wait
    < jkoenig> that ld.so actually needs the canonical file name to substitute
      $ORIGIN is its own problem, not that of exec or _hurd_exec_file_name..
    < jkoenig> Ok, so.. Initially the shell (indirectly) runs
      _hurd_exec_file_name(..., "/usr/bin/java", ...), which then calls
      file_exec_file_name() on the file in question, passing it its own
      filename
    < jkoenig> which is transmitted to exec_exec_file_name()
    < jkoenig> (until now it's all pochu's patch)
    < jkoenig> which then makes it available to the newly created process
      through exec_startup_get_info_2() (my own addition)
    < braunr> oh
    < braunr> wasn't it available before oO ?
    < jkoenig> no, exec only has access to a port to the executable file.
    < braunr> how was argv[0] handled then ?
    < jkoenig> argv[0] is handled like any other argument
    < braunr> ok, so the file path is duplicated ?
    < jkoenig> the shell (or whomever calls _hurd_exec) provide whatever they
      want.
    < braunr> ok
    < jkoenig> well argv[0] is not necessarily the file path (at least not the
      full path)
    < braunr> right
    < jkoenig> so exec() does some guesswork with $PATH but obviously that's
      limited.
    < braunr> so what you changed is that get_info_2 now receives a canonical
      path ?
    < jkoenig> right
    < jkoenig> (or whatever was specified to _hurd_exec_file_name(), for this
      reason and others we shouldn't use it for setuid programs.)
    < jkoenig> well, not a canonical path. A path. (hence the problem)
    < braunr> ok
    < jkoenig> now both the filesystem and exec might run under another root so
      they're not an option for canonicalization
    < jkoenig> _hurd_exec_file_name (in libc) might be a better spot.
    < braunr> resolution from the client, yes
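
The symlink problem discussed above can be illustrated outside of glibc.
The sketch below canonicalizes the executable's path with the standard
`realpath()` before taking its directory, so that a symlink such as
`/usr/bin/java` would yield the directory of its target rather than
`/usr/bin`.  The function name `canonical_origin` is hypothetical, and this is
ordinary application code: as the log explains, `ld.so` itself cannot simply
call `realpath()` this way.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd string naming the directory that
   contains the file EXEC_PATH refers to, after resolving symlinks.
   Returns NULL if the path cannot be resolved or memory runs out.  */
char *
canonical_origin (const char *exec_path)
{
  assert (exec_path != NULL);

  /* With a NULL buffer, POSIX.1-2008 realpath() allocates the result.  */
  char *resolved = realpath (exec_path, NULL);
  if (resolved == NULL)
    return NULL;

  /* dirname() may modify its argument and may return a pointer into it,
     so copy the result before freeing the buffer.  */
  char *dir = strdup (dirname (resolved));
  free (resolved);
  return dir;
}
```

The caller owns the returned string and should `free()` it.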

IRC, freenode, #hurd, 2011-08-03:

    < jkoenig> so my RPATH patches are polished and built, and I'll post them
      soon, is the good news

IRC, freenode, #hurd, 2011-08-17:

    < jkoenig> also fixed a fakeroot-induced deadlock in my dl-origin patches
      (namely, under fakeroot, realpath() uses a socket (through stat), so we
      need to use it when _hurd_dtable_lock is not held)
    < jkoenig> also I'll post my dl-origin patches shortly
    < youpi> dl-origin is about the environment variable that java needs,
      right?
    < jkoenig> about the environment variable it shouldn't need, yes :-)
    < youpi> ah :)
    < youpi> but ok, I vaguely remember what that refers to
    < jkoenig> $LD_ORIGIN_PATH is used as an override (much like
      LD_LIBRARY_PATH), but ideally ld.so uses whatever directory the loaded
      binary is from.
    < youpi> ok
    < jkoenig> (as a substitution for $ORIGIN in RPATH)


#### Plans

I intend to fix the RPATH issue
by building on [[pochu]]'s `file_exec_file_name()`
[patches](http://lists.gnu.org/archive/html/bug-hurd/2010-08/msg00023.html).

I have succeeded in building a HotSpot-enabled `libjvm.so`,
although the current toolchain issues
([[toolchain/ELFOSABI_GNU]]; 2011-07-03: fix committed in binutils)
have so far prevented me from testing it.

> It turns out the build fails later on in `hotspot/agent`
> because Hurd lacks a `libthread_db.so`.
> Also, a Shark version builds, but the result does not work so far.
>
> In other news, Damien Raude-Morvan is
> [working on a kFreeBSD version](http://lists.debian.org/debian-java/2011/06/msg00124.html),
> so I intend to merge my current patches with his.
>
> --[[jkoenig]] 2011-06-29

IRC, freenode, #hurd, 2011-08-03:

    < jkoenig> and I'm battleing to update my OpenJDK patches to b147, and
      merge the with the kFreeBSD ones.
    < braunr> b147 ?
    < jkoenig> but that thing is seriously huge and touches about everything,
      so it's taking more time than I'd have hoped
    < jkoenig> braunr, the latest release of IcedTea / OpenJDK 7 and the
      current Debian version (in experimental of course)
    < braunr> ok
    < jkoenig> I'm trying to make this clean so that hopefully we can get them
      integrated at some level of upstream (probably IcedTea, at least at
      first)

IRC, freenode, #hurd, 2011-08-10:

    < jkoenig> well actually I've finished merging my patches with the freebsd
      ones, and updating them to the new openjdk-7,
    < jkoenig> but now a new version of both is out :-P


##### Upstream Submission

On 2011-07-15, *gnu_andrew*, a maintainer of IcedTea, talked to us in the #hurd
channel (freenode IRC).  He's supportive of the porting approach, and
is willing to review and integrate small patches for individual issues (rather
than some huge patchset).  Send patches to <distro-pkg-dev@openjdk.java.net>.

##### Open Items

  * [!] [[tschwinge]] to have a look at [[pochu]]'s `file_exec_file_name()`
    patches, whether it's generally the right idea.

      * Assuming it is, continue with getting `$ORIGIN` working.

  * `libthread_db.so` issue.  Likely, the Serviceability Agent is used by jdb
    and the like only, so for now the goal should be to lose some functionality
    by removing/avoiding this dependency.

  * [[java-access-bridge]] (not critical; JVM appears to work without)

  * IRC, freenode, #hurd, 2011-07-27:

        < jkoenig> there's a bug with java.nio when running javadoc, you might
          run into it.

  * [[`SCM_CREDENTIALS`|open_issues/sendmsg/scm_creds]]

    IRC, freenode, #hurd, 2011-08-03:

        < jkoenig> wrt. peer credentials, openjdk also uses file modes for
          security, and my guess is that it's sufficient, at least on Hurd, so
          I've reduced my priority for this at least in the meantime

  * They seem to have a rather heavy-weight process for such projects: confer
    <http://mail.openjdk.java.net/pipermail/announce/2011-January/000092.html>,
    for example.  Do we need this, too?

    > Probably not.
    > My current approach (and Damien's wrt. the kFreeBSD patches)
    > is to add preprocessor directives in the Linux code
    > to make it more portable.
    > --[[jkoenig]] 2011-06-29

  * Eclipse

    OK for testing -- but I'd very much hope that it *just works* as soon as we
    provide the required Java platform.  But it may perhaps have some
    Linux-specifics (needlessly?) in its basement.  Is it available for Debian
    GNU/kFreeBSD already?


### Java bindings for Mach

The code is at <http://github.com/jeremie-koenig/hurd-java>.

[[tschwinge]]'s notes for building with...

  * GCJ installed (due to the current Debian multilib confusion):

        $ tmp1=/usr/lib/gcc/i486-gnu/4.6 tmp2=/usr/lib/i386-gnu/gcc/i486-gnu/4.6 LIBRARY_PATH=$tmp2 COMPILER_PATH=$tmp1:$tmp2 C_INCLUDE_PATH=$tmp1/include make

  * OpenJDK installed (to have it find the shared library, and the jni.h header
    file):

        $ jdk=/usr/lib/jvm/java-7-openjdk LD_LIBRARY_PATH=$jdk/jre/lib/i386/jli C_INCLUDE_PATH=$jdk/include make

Doxygen-generated documentation is available at
<http://jk.fr.eu.org/hurd-java/doc/html/>; or run `make doc` yourself.

IRC, freenode, #hurd, 2011-07-27:

    < jkoenig> I need to be able to read/write individual data items from
      messages, in order to implement deallocation correctly, so I'm working on
      that when I'm waiting for things to build, but it's not my primary focus
      right now.

IRC, freenode, #hurd, 2011-08-17:

    < jkoenig> so, weekly status report: I have made some progress on the java
      bindings, I hope to have a safe version mach_msg soon, after which I can
      begin experimenting with mig.


#### Plans

(just started.)


##### Open Items

  * [[tschwinge]] has to read about RMI and CORBA.

  * MIG

      * Hacking [[microkernel/mach/MIG]] shouldn't be too difficult.

          * (Unless you want to make MIG's own code (that is, not the generated
            code, but MIG itself) look a bit more nice, too.)  ;-)

      * There are also alternatives to MIG.  If there is interest, the following
        could be considered:

          * FLICK ([[!GNU_Savannah_task 5723]]).  [[tschwinge]] has no idea yet if
            there would be any benefits over MIG, like better modularity (for the
            backends)?  If we feel like it, we could spend a little bit of time on
            this.

          * For [[microkernel/Viengoos]], Neal has written a RPC stub generator
            entirely in C Preprocessor macros.  While this is obviously not
            directly applicable, perhaps we can get some ideas from it.

          * Anything else that would be worth having a look at?  (What are other
            microkernels using?)

  * `mach_msg`

      * Seems like the right approach to [[tschwinge]], but he hasn't digested
        all the peculiarities yet.  Will definitely need more time.


## Postponed

Might get back to these as time/interest permits.


### GCJ

  * [[tschwinge]] has the feeling that Java in GCC (that is, GCJ) is mostly
    dead?  (True?)

  * Thus perhaps not too much effort should be spent with it.

    If the POSIX threads signal semantics get it going, then great; otherwise
    we should get a feeling for what else is missing.


### Joe-E.