Diffstat (limited to 'user/jkoenig/java')
-rw-r--r-- | user/jkoenig/java/discussion.mdwn | 559
-rw-r--r-- | user/jkoenig/java/java-access-bridge.mdwn | 92
-rw-r--r-- | user/jkoenig/java/proposal.mdwn | 629
-rw-r--r-- | user/jkoenig/java/report.mdwn | 136
4 files changed, 0 insertions, 1416 deletions
diff --git a/user/jkoenig/java/discussion.mdwn b/user/jkoenig/java/discussion.mdwn deleted file mode 100644 index 352f6d62..00000000 --- a/user/jkoenig/java/discussion.mdwn +++ /dev/null @@ -1,559 +0,0 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] - -[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable -id="license" text="Permission is granted to copy, distribute and/or modify this -document under the terms of the GNU Free Documentation License, Version 1.2 or -any later version published by the Free Software Foundation; with no Invariant -Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license -is included in the section entitled [[GNU Free Documentation -License|/fdl]]."]]"""]] - -[[!toc]] - - -# General - -Some [[tschwinge]] comments regarding your proposal. Which is very good, if I -may say so again! :-) - -Of course, everyone is invited to contribute here! - -I want to give the following methodology a try, instead of only having -email/IRC discussions -- for the latter are again and again showing a tendency -to be dumped and deposited into their respective archives, and be forgotten -there. Of course, email/IRC discussions have their usefulness too, so we're -not going to replace them totally. For example, for conducting discussions -with a bunch of people (who may not even be following these pages here), email -(or, as applicable, the even more interactive IRC) will still be the medium of -choice. (And then, the executive summary should be posted here, or -incorporated into your proposal.) - -Also, if you disagree with this suggested procedure right away, or at some -later point begin to feel that this thing doesn't work out, or simply takes too -much time (I don't think so: writing emails takes time, too), just say so, and -we can reconsider. 
- -Of course, as this wiki is a passive medium rather than an active one as IRC -and email are, it is fine to send notices like: *I have updated the wiki page, -please have a look*. - -One idea is that your proposal evolves alongside with the ongoing work, and -represents (in more or less detail) what has been done and what will be done. -Also, we can hopefully use parts of it for documentation purposes, or as -recipes for similar work (enabling other programming languages on the Hurd, for -example). - -For this, I suggest the following procedure: as applicable, you can either -address any comments in here (for example, if they're wrong :-), or if they -require further discussion; think: *email discussion*), or you can address them -directly in your propoal and remove the comments from here at the same time -(think: *bug fix*). - -Generally, you can assume that for things I didn't comment on (within some -reasonable timeframe/upon asking me again) that I'm fine with them. Otherwise, -I might say: *I don't like this as is, but I'll need more time to think about -it.* - -There is also a possibility that parts of your proposal will be split off; in -cases where we think they're valuable to follow, but not at this time. (As you -know, your proposal is not really a trivial one, so it may just be too much for -one person's summer.) Such bits could be moved to [[open_issues]] pages, -either new ones or existing ones, as applicable. - - -# GSoC Site Discussion - - * Discussion items from - <http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/jkoenig/1> - should be copied here: - - * technical bits (obviously); - - * also the *why do we want Java bindings* reasoning; - - * CLISP findings should also be documented somewhere permanently. - - * We should probaby open up a *languages for Hurd* section on the web - pages ([[!taglink open_issue_documentation]]). 
- - -# Java Native Interface (JNI) - - * <http://en.wikipedia.org/wiki/Java_Native_Interface> - * <http://download.oracle.com/javase/7/docs/technotes/guides/jni/> - * <http://java.sun.com/products/jdk/faq/jnifaq.html> - * <http://java.sun.com/docs/books/jni/> - - -## Java Native Access (JNA) - - * <http://jna.java.net/> - * <https://github.com/twall/jna#readme> - -This is a different approach, and *while some attention is paid to performance, -correctness and ease of use take priority*. - -As we plan on only having a few native methods (for invoking `mach_msg`, -essentially), JNA is probably the wrong approach: portability and ease of use -is not important, but performance is. - -## Compiled Native Interface (CNI) - - * <http://gcc.gnu.org/onlinedocs/gcj/About-CNI.html> - * <http://per.bothner.com/papers/UsenixJVM01/CNI01.pdf> - -Probably faster than JNI, but only usable with GCJ. - -> Given that we have very few JNI calls, -> it might be interesting to take a "dual" approach -> if CNI actually improves performance -> when compiling to native code. -> --[[jkoenig]] 2011-07-20 - -# IRC, freenode, #hurd, 2011-07-13 - -[[!tag open_issue_documentation]] - - <jkoenig> Yes, I guess so. Maybe start investigating mig because it may - have repercussions on what the best approach would be for some aspects of - the Mach bindings. - <tschwinge> I still think that making MIG emit Java code is not too - difficult, once you have the required Java infrastructure (like what - you're writing at the moment). - <tschwinge> On the other hand, if there's another approach that you'd like - to use, I'm not trying to force using MIG. - <braunr> i still have a problem understanding your approach - <braunr> at which level are your bindings located ? - <jkoenig> I expect mig it will be the easiest route, but of course possibly - it won't. - <tschwinge> jkoenig: Yeah, be give some high-level to low-level overview? 
- <jkoenig> ok, so - <jkoenig> at the very core, low-level, we have a very thin amount of JNI - code to access (proper) system calls. - <jkoenig> by "proper" I mean things like mach_task_self, mach_msg and - mach_reply_port, which are actually system calls rather than RPCs to the - kernel. - <braunr> right - <jkoenig> at this level, we manipulate port names as integers, and the - message buffers for mach_msg are raw ByteBuffers (from the java.nio - package) - <jkoenig> actually, so-called /direct/ ByteBuffers, which are backed by - memory allocated outside of the Java heap, rather than as a byte[] array - <jkoenig> we can retreive the pointer from the JNI code and use the buffer - directly. - <jkoenig> (so, good for performance and it's also portable.) - <braunr> ok - <braunr> i'm more interested in the higher level bindings :) - <jkoenig> ok so, higher up. - <jkoenig> design goal from my proposal: "the memory safety of Java should - be maintained and extended to Mach primitives such as port names and - out-of-line memory regions" - <jkoenig> so integer port names are not "safe" in the sense that they can - be forged and misused in all kinds of way - <jkoenig> which is why I have a layer of Java code whose job is to wrap - this kind of low-level Mach stuff into safe abstractions - <jkoenig> and ideally the user should only use these safe abstractions. - <tschwinge> (Not to restrict the programmer, but to help him write correct - code.) - <jkoenig> right. 
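Editorial sketch, not part of the deleted page: the low-level layer described above keeps message buffers in direct `ByteBuffer`s so that native code can reach the memory without copying. The snippet below only illustrates that idea. The class name `DirectBufferDemo`, the method `buildBuffer`, the two-field layout, and the little-endian byte order are all assumptions made for illustration; this is not the real Mach message header layout and not code from the hurd-java repository.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class DirectBufferDemo {
    /**
     * Builds a direct buffer holding two 32-bit fields.
     * The layout is hypothetical; it is NOT the real Mach message header.
     */
    static ByteBuffer buildBuffer(int portName, int msgId) {
        // Direct buffers live outside the Java heap; JNI code can obtain
        // their address with GetDirectBufferAddress and use it in place.
        ByteBuffer buf = ByteBuffer.allocateDirect(8);
        buf.order(ByteOrder.LITTLE_ENDIAN); // assumed byte order
        buf.putInt(portName);
        buf.putInt(msgId);
        buf.flip(); // make the written bytes readable from position 0
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = buildBuffer(7, 1234);
        System.out.println(buf.isDirect() + " " + buf.remaining()
                + " " + buf.getInt(0) + " " + buf.getInt(4));
    }
}
```

A safe wrapper in the spirit of the `MachMsg` class mentioned in the log would presumably hide such a buffer behind typed accessors, so that callers never write raw integers themselves.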
- <braunr> so you can't use mach RPCs directly - <jkoenig> tschwinge, also to actually restrict them, in a Joe-E / - object-capability context, but that's not the primary concern right now - ;-) - <braunr> or you force your wrappers to have these abstractions as input - <jkoenig> braunr, well, actually at this level you still have Mach RPC - <jkoenig> but for instance, port names are encapsulated into "MachPort" - objects which ensure they are handled correcly - <tschwinge> As I understand it, you use these abstractions to prepare a - usual mach_msg message, and then invoke mach_msg. - <braunr> ok - <jkoenig> and message buffers are wrapped into "MachMsg" objects which both - help you write the messages into the ByteBuffer and prevent you from - doing funky stuff - <jkoenig> and ensure the ports which you send/receive/pseudo-receive after - an error/... are deallocated as required, etc. - <braunr> what's the interface to use IPC ? - <tschwinge> Is MIG doing that, too, I think? (And antrik once found some - error there, which is still to be reviewed...) - <jkoenig> braunr, so basically as a user you would be free to use either - one of these layers, or to use MIG-generated classes which would - construct and exchange messages for you using the second (safe) layer. - <braunr> ok, let's just finish with the low level layer before going - further please - <jkoenig> tschwinge, MIG does some type checking on the received message - and saves you the trouble of constructing/parsing them yourself, but I'm - not sure about how mach_msg errors are handled - <braunr> what are the main methods of MachMsg for example ? 
- <jkoenig> braunr, you may want to have a look at - http://jk.fr.eu.org/hurd-java/doc/html/classorg_1_1gnu_1_1mach_1_1MachMsg.html - <braunr> right, sorry - <braunr> grabbed the code at work and forgot here - <jkoenig> and also - https://github.com/jeremie-koenig/hurd-java/blob/master/HelloMach.java - which uses it - <jkoenig> but roughly, you'd use setRemotePort, setLocalPort, setId to - write your message's header - <jkoenig> then use one of the putFoo() methods to add data items to the - message - <braunr> ok, the mapping with the low level C interface is very clear - <braunr> that's good for me - <jkoenig> the putFoo() methods would write the appropriate type - descriptors, then the actual data. - <braunr> we can go on with the MiG part if you want :) - <jkoenig> right, - <jkoenig> so here you may want to look at the UML class diagram from - http://www.bddebian.com/~hurd-web/user/jkoenig/java/proposal/ - <jkoenig> so in the C case, mig generates 3 files - <jkoenig> a header file which has the prototypes of the mig-generated - stubs, - <jkoenig> a *User.c which has their actual implementation - <jkoenig> and a *Server.c which handles demultiplexing the incoming - messages and helps with implementing servers. - <jkoenig> so we would do something along these lines, more or less: - <jkoenig> mig would generate the code for a Java interface in lieu of the - *.h file. - <jkoenig> a generated FooUser class would implement this interface by doing - RPC - <jkoenig> (so basically you would pass a MachPort object to the - constructor, and then you could use the resulting object to do RPC with - whatever is on the other end) - <jkoenig> and the generated FooServer class would do the opposite, - <braunr> ok - <braunr> issues with threads ? - <jkoenig> you would pass an object implementing the Foo interface to the - constructor, - <braunr> i'm guessing the demux part may have to create threads, right ? 
- <jkoenig> and the resulting object would handle messages by using the - object you passed. - <jkoenig> braunr, right, so that would be more a libports kind of code, - <braunr> the libports-like library, i see - <jkoenig> to which you could pass Server objects (for instance the - FooServer above), and it would handle incoming messages. - <braunr> how is message content mapped to a java interface ? - <jkoenig> this would be determined from the .defs files and MIG would - generate the appropriate code, hopefully. - <braunr> so the demux part would handle rpc integer identifiers ? - <jkoenig> right. - <braunr> but hm - <jkoenig> also mapping .defs files to Java interfaces might prove to be - tricky. data types conversion and all - <antrik> tschwinge: my mamory is rather hazy. IIRC the issue was that the - MIG-generated stubs deallocate out-of-line port arrays after the - implementation returns, before returning to the dispatcher - <braunr> i'll just overlook this specific implementation detail - <jkoenig> but we could use some annotation-based system if we need to - provide more information to generate the java code. - <antrik> but the Hurd (or rather glibc) RPC handling also automatically - deallocates everything if an error occurs - <antrik> so I changed the MIG code to deallocate only when no error occurs - <braunr> jkoenig: ok, we'll talk about that when there is more progress and - you have a better view of the problem - <antrik> at that time I was pretty sure that this is a correctly working - solution, but it always seemed questionable conceptually... however, I - wasn't able to come up with a better one, and nobody else commented on it - <braunr> antrik: shouldn't the hurd be changed not to deallocate something - it didn't allocate in the first place ? - <antrik> braunr: no, the server has to deallocate stuff before returning to - the client. the request message is destroyed before returning the reply. 
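Editorial sketch, not part of the deleted page: the interface / User / Server split discussed above can be pictured roughly as follows. Everything here is hypothetical: `Foo`, `FooUser`, `FooServer`, the `Transport` interface, and the RPC identifier `ADD_ID` are stand-ins chosen for illustration, and an in-memory call replaces the actual `mach_msg` exchange. It shows the shape of the idea, not what MIG would really generate.

```java
final class MigSketch {
    /** Stand-in for the Java interface MIG would emit in lieu of the C header. */
    interface Foo {
        int add(int a, int b);
    }

    /** Stand-in for message delivery: (RPC id, arguments) to reply value. */
    interface Transport {
        int call(int rpcId, int[] args);
    }

    /** Hypothetical RPC identifier; real ones would come from the .defs file. */
    static final int ADD_ID = 1000;

    /** Client-side stub: implements Foo by sending a request over the transport. */
    static final class FooUser implements Foo {
        private final Transport port;

        FooUser(Transport port) {
            this.port = port;
        }

        @Override
        public int add(int a, int b) {
            return port.call(ADD_ID, new int[] {a, b});
        }
    }

    /** Server-side demultiplexer: maps RPC ids onto methods of an implementation. */
    static final class FooServer implements Transport {
        private final Foo impl;

        FooServer(Foo impl) {
            this.impl = impl;
        }

        @Override
        public int call(int rpcId, int[] args) {
            switch (rpcId) {
                case ADD_ID:
                    return impl.add(args[0], args[1]);
                default:
                    throw new IllegalArgumentException("unknown RPC id " + rpcId);
            }
        }
    }

    /** Wires a client stub directly to a server wrapping a trivial implementation. */
    static int roundTrip(int a, int b) {
        Foo impl = (x, y) -> x + y;
        Foo client = new FooUser(new FooServer(impl));
        return client.add(a, b);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(2, 3));
    }
}
```

In the design described in the log, the stub would be constructed from a `MachPort` rather than a `Transport`, and a libports-like component would feed incoming messages to the server object.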
- <tschwinge> jkoenig, braunr: That's what I had in mind where MIG might be a - bit awkward. Then we can indeed either add annotations to the .defs - files, or reproduce them in some other format. That's some work, but - it's mostly a one-time work. - <tschwinge> After all, the RPC interface is a binary one, and there may be - more than one API for creating these messages, etc. - <antrik> jkoenig: actually, at least in the Hurd, server-side and - client-side headers are separate -- so MIG actually creates four files - <jkoenig> tschwinge, wrt to annotations I was more thinking about Java - ones, such as: @MIGDefsFile("mach/task.defs") @MIGCType("task_t") public - interface Task { } - <jkoenig> antrik, oh, ok, it makes sense. - <braunr> jkoenig: anything else ? - <jkoenig> braunr, nothing that I can think of - <braunr> ok - <antrik> tschwinge: I think it would be a *very* bad idea to introduce - redundancy regarding RPC definitions - <braunr> thanks for the tour :) - <antrik> (the _request.defs/_reply.defs mess is bad enough...) - <jkoenig> did I speak about the "Unsafe" pseudo-exception? that's - interesting :-) - <tschwinge> jkoenig: Also, virtual memory abstractions? - <braunr> jkoenig: you didn't - <tschwinge> antrik: Well, then we could create some other super-format. - But that's just a detail IMO. - <jkoenig> ok, so wrt virtual memory, a page we received can be wrapped with - some JNI help into a (direct) ByteBuffer object. - <jkoenig> deallocating sent pages will be tricky, though. - <tschwinge> antrik: To put it this way: for me the .defs files are just one - way of expressing the RPC interfaces' contracts. (At the same time, they - happen to be the actual reference for these, too. But the specification - itself could just as well be a textual one.) 
- <jkoenig> on approach I've been thinking about would be to "wrap" the - ByteBuffer object into an object which has the sole reference to it, so - that when it's deallocated the reference can be replaced with "null", and - further attempts to access the buffer would throw exceptions. - <braunr> sounds reasonable - <jkoenig> but that's still in flux in my head, we may end up needing our - own implementation of ByteBuffer-like objects. - <tschwinge> The problem being that there is no mechanism to ``revoke'' an - object once a reference to it has been shared. - <jkoenig> right. - <tschwinge> A wrapper is one possibility indeed. - <antrik> tschwinge: they are called interface *definitions* for a reason - :-) - <tschwinge> This is a very similar problem as with capabilities when there - is no revoke operation for these, too. - <tschwinge> antrik: Yes, because they define MIG's input. :-P - <tschwinge> Isn't that what is called a membrane in the capability world? - <antrik> I do not say that we have to consider the format of the .defs to - be set in stone; but I do insist on using a canonical machine-parsable - source for all language bindings - <tschwinge> attenuation - <jkoenig> tschwinge, you mean the revokable proxy contruct ? (It's the same - principle indeed) - <tschwinge> A common design pattern in object-capability systems: given - one reference of an object, create another reference for a proxy object - with certain security restrictions, such as only permitting read-only - access or allowing revocation. The proxy object performs security checks - on messages that it receives and passes on any that are allowed. Deep - attenuation refers to the case where the same attenuation is applied - transitively to any - <tschwinge> objects obtained via the original attenuated object, - typically by use of a "membrane". - <tschwinge> http://en.wikipedia.org/wiki/Object-capability_model - <tschwinge> Yes. - <tschwinge> Good. I understood something. 
;-) - <tschwinge> antrik: OKAY! :-P - <tschwinge> jkoenig: And hopefully the JVM will optimize away all the - additional indirection... :-D - <tschwinge> jkoenig: Is there anything more to say about the VM layer? - <jkoenig> tschwinge, "hopefully", yes :-) - <tschwinge> Like, the data that I'm sharing -- is it untyped, isn't it? - <jkoenig> tschwinge, you mean that within the received/sent pages ? - <tschwinge> Yes. - <tschwinge> But that'S how it is, indeed. - <jkoenig> well actually the type descriptor should indicate what they - contain. - <tschwinge> I cannot trust anything I receive from externally. - <jkoenig> it's most often used for MACH_MSG_TYPE_CHAR items I guess, and it - will be type checked when retreive - <tschwinge> Yeah, and that then just *is* arbitrary data, like a block read - from a disk file. - <jkoenig> you would have something like: ByteBuffer - MachMsg.getBuffer(MachMsg.Type expected), and MachMsg would check the - type descriptor against that which you specified - <tschwinge> Or a packet transmitted over the network. - <tschwinge> OK, yes. - <antrik> jkoenig: in theory ints should be used quite often too. the whole - purpose of the type descriptors is to allow byte order swapping when - messages are passed between hosts with different architecture... - <jkoenig> tschwinge, right, except for out-of-line port arrays, which need - to be handled differently obviously. - <antrik> (which is totally irrelevat for our purposes -- especially since - the actual network IPC code doesn't exist anymore ;-) ) - <jkoenig> antrik, oh, interesting - <tschwinge> Yes, that was one original idea. - <jkoenig> actually my litmus test for what the bindings should be, is you - should be able to implement such a proxy in Java :-) - <tschwinge> antrik: And hey, you now have processors that can switch - between different modes during runtime... 
:-) - <jkoenig> (although arguably that's a little bit ambitious) - <braunr> tschwinge: there should be bits in page tables to indicate the - endianness to use on a page .. :) - <tschwinge> Hehe! - <tschwinge> jkoenig: Don't worry -- you're already known for ambitious - projects. One more can't hurt. - <jkoenig> Also, actually the word size is not something that I've been able - to abstract so far, so I'll be hardcoding little-endian 32 bits for now. - <braunr> why is that ? - <antrik> some of the Hurd RPC break the idea anyways BTW - <jkoenig> the org.vmmagic package (from Jikes RVM and JNode) could help - with that, but GCJ does not support it unfortunately (not sure about - OpenJDK) - <jkoenig> braunr, Java does not allow us to define new unboxed types - <braunr> jkoenig: does it have its own definition of the word size ? - <jkoenig> braunr, nope. - <jkoenig> (although, maybe, and also we could use JNI to query it) - <braunr> even if virtual, i'd expect a machine to have such a defnition - <jkoenig> braunr, maybe it has, but basically in Java nothing depends on - the word size - <jkoenig> 'int' is 32 bits, 'long' is 64 and that's it. - <braunr> oh right, i remember most types are fixed size, right ? - <jkoenig> right. - <braunr> if not all - <jkoenig> now Jikes RVM's "org.vmmagic" provides an interface to defined - new unboxed types which can depend on the actual word size, but Jikes RVM - is its own JVM so obviously they can use and provide whatever extensions - they need :-) - <jkoenig> (but maybe they've implemented them in OpenJDK for bootstrap - purposes, I'm not sure) - <tschwinge> I'm missing this detail: where does the word size come into - play here? - <jkoenig> anyway, I _could_ indiscriminately use 'long' for port names, and - sparkle the code with word size tests but that would be very clumsy - <braunr> jkoenig: port names are actually ints :/ - <jkoenig> tschwinge, the actual format of the message header and type - descriptors, for instance. 
- <braunr> jkoenig: ok, got your point - <jkoenig> braunr, by 'long' I mean 64-bits integers (which they are on - 64-bits machines I think?) - <braunr> :) - <braunr> jkoenig: port names are as large as the word size - <braunr> but in C at least, they're int, not long - <braunr> it doesn't change many things, but you get lots of warnings if you - try with a long :) - <tschwinge> What is the reason that port names are an - architecture-dependent word size's width, and not simply 32 bit? - <jkoenig> "4 billions of port names should be enough for everyone" :-) - <braunr> tschwinge: an optimization is to use them as pointers in the - kernel - <antrik> tschwinge: the machine's native word size is what it can process - most efficiently, and what should be used for most normal - operations... it makes sense to define stuff as int, except for network - communication - <tschwinge> jkoenig: Well, yeah, but if you want to communicate with a - peer, you have to agree on the maximum number anyway (not for port names, - though, which are local). - <braunr> antrik: int isn't the word size everywhere - <braunr> antrik: the most common type matching the word size is long, at - least on ILP32/LP64 data models - <antrik> braunr: that's just because some idiots assumed int would always - be 32 bits, and consequently when 64 architectures came up the compiler - guys chickened out ;-) - <braunr> without int, you wouldn't have a 32 bits type - <antrik> that's not true for all architectures and/or operating systems - though AFAIK - <braunr> or a 16 bits one - <braunr> antrik: windows guys got even more scared, so windows 64 is LLP64 - <antrik> BTW, I haven't checked, but it's quite possible that 32 bit - numbers are actually preferable even on AMD64... - <tschwinge> jkoenig: So, back on track. :-) - <tschwinge> jkoenig: You didn't find anything yet in Mach's VM interfaces - as well a MemoryObject, etc., that can't be used/implemented in the Java - world? 
- <braunr> antrik: they consume less memory, but don't have much effect on - performance - <jkoenig> tschwinge, once we have the basic system calls and the - corresponding abstractions in place, I don't think anything else - fundamentally problematic could possibly show up - <antrik> braunr: if you really *need* a type of a certain bit size, you - should use stdint types. so not having a 16 or 32 bit type in the - short/int/long canon is *not* an excuse - <tschwinge> jkoenig: That speaks for the Mach designers! - <braunr> antrik: right - <jkoenig> tschwinge, on trick is that for instance, mach_task_self would - still be unsafe even if it returned a nicely wrapped Task object, because - you could still wreck your own address space and threads with it. So we - would need the "attenuation" pattern mentionned above to provide a safe - one. - <jkoenig> (which would disallow thinks such as the port/thread/vm calls) - <braunr> jkoenig: you mentioned the unsafe pseudo exception earlier - <jkoenig> braunr, right, so the issue is with distinguishing safe from - unsafe methods - <antrik> braunr: BTW, the Windows guys actually broke a lot of stuff by - fixing long at 32 bits -- this way long doesn't match size_t and pointer - types anymore, which was an assumption that was true for pretty much any - system so far... - <tschwinge> jkoenig: Yes. (And again hope for the JVM to optim...) - <braunr> antrik: that's right :) - <braunr> antrik: that's LLP64 - <braunr> antrik: long long and pointers - <jkoenig> braunr, so basically the idea is that unsafe methods are declared - as "throws Unsafe" - <jkoenig> the effect is that if you use such a method you must either - "throw Unsafe" yourself, - <jkoenig> or if you're building a safe abstraction on top of Unsafe - methods, you'll "catch" the "exception" in question to tell the compiler - that it's okay. 
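Editorial sketch, not part of the deleted page: the `throws Unsafe` convention described above reuses Java's checked-exception rules as a marker. The names below (`UnsafeMarkerDemo`, `describeRawName`, `describeTwice`, `describeChecked`) are hypothetical and only illustrate the mechanism; the actual hurd-java classes may well look different.

```java
final class UnsafeMarkerDemo {
    /** Checked marker "exception": it is declared by unsafe methods but never thrown. */
    static final class Unsafe extends Exception {
        private static final long serialVersionUID = 1L;

        private Unsafe() {
        }
    }

    /** Hypothetical unsafe primitive: accepts any integer as a port name. */
    static String describeRawName(int name) throws Unsafe {
        return "port name " + name;
    }

    /** Unsafe code calling unsafe code simply propagates the marker. */
    static String describeTwice(int name) throws Unsafe {
        return describeRawName(name) + ", " + describeRawName(name);
    }

    /**
     * Safe wrapper: validates its input, then "catches" the marker to tell
     * the compiler that crossing the safe/unsafe border here is intentional.
     */
    static String describeChecked(int name) {
        if (name <= 0) {
            throw new IllegalArgumentException("not a valid port name: " + name);
        }
        try {
            return describeRawName(name);
        } catch (Unsafe e) {
            // Unreachable in practice: Unsafe is only a compile-time marker.
            throw new AssertionError("Unsafe is never actually thrown", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeChecked(42));
    }
}
```

The compiler rejects a call to `describeRawName` from a method that neither declares `throws Unsafe` nor catches it, which is the enforcement the scheme relies on.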
- <jkoenig> it's more or less inspired from the "semantic regimes" idea from - the org.vmmagic paper which is referenced in my original proposal, - <jkoenig> only implementing by hijacking the exception checking machinery, - which has a behaviour similar to what we want. - <braunr> ok - <braunr> but hmm this seems pretty normal, what's the tricky part ? :) - <tschwinge> braunr: The idea is that the programmer explicitly has to - acknowledge if he'S using an unsafe interface. - <braunr> tschwinge: sounds pretty normal too - <jkoenig> braunr, the trick is that you would not usually declare - exceptions which are never actually thrown (and actually since the - compiler does not know it's never thrown, I need to work around it in a - few places) - <braunr> oh, ok - <braunr> jkoenig: that's interesting indeed - <jkoenig> braunr, the org.vmmagic paper provides an example which uses some - annotations called @UncheckedMemoryAccess and @AssertSafe to the same - effect (which is kind of cleaner), but it would be a headache to - implement without help from the compiler I think (as far as I can tell - the annotation processor would have to inspect the bytecode) - <braunr> but hm - <braunr> what's the true problem about this ? - <jkoenig> (the paper advocates "high-level low-level programming" and is a - very interesting read I think, - http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.151.5253&rep=rep1&type=pdf, - for what it's worth) - <braunr> what's wrong if you just declare your methods unsafe and don't - alter anything else ? - <tschwinge> Yes, I read it and it is interesting. Unfortunately, it seems - I forgot most of it again... - <jkoenig> braunr, declare? alter? - <jkoenig> you mean just tag them with an annotation? - <braunr> just stating a method "throws Unsafe" - <jkoenig> braunr, well some compiler will output a warning because they can - tell there's no way the method is going to throw such an exception. 
- <jkoenig> and then some other compiler will complain that my - @SuppressWarnings("unused") does not serve any purpose to them :-) - <jkoenig> also, when initializing final fields, I need to work around the - fact that the compiler thinks "Unsafe" might be thrown. - <jkoenig> see for instance MachPort.DEAD - <braunr> jkoenig: ok - <jkoenig> braunr, but I'm more than willing to accept this in exchange for - a clear, compiler-enforced materialization of the border between safe an - unsafe code. - <jkoenig> actually another question I have is the amount of static typing I - should add to the safe version, for instance should I subclass MachPort - into MachSendRight, MachReceiveRight and so on. I don't want to depart - from the C inteface too much but it could be useful. - <braunr> jkoenig: can't answer that :) - <braunr> jkoenig: keep them in mind for later i think - <tschwinge> jkoenig: What's the safety concern w.r.t. having MachPort (not) - final? - <jkoenig> tschwinge, actually I'm partly wrong in that we only need name() - and a couple other methods to be final - <tschwinge> jkoenig: That's what I was thinking. :-) - <tschwinge> I though I'm missing something here. - <jkoenig> tschwinge, the idea is that the user (ie., the adversary :-) - could extend MachPort and inject their own fake port name into messages - <jkoenig> by overriding name() or clear() - <tschwinge> Yeah, but if these are final, that's not possible. - <jkoenig> right. - <tschwinge> And that *should* be enough, I think. - <tschwinge> Unless I'm missing something. - <jkoenig> I don't think so. Also I hope it is, because as mentionned above - there might be some value in subclassing MachPort. - <tschwinge> Yep. - <jkoenig> incidentally, declaring the class or the method final will allow - the JVM to inline them I think. - <tschwinge> It will help the JVM, yes. It can also figure that out without - final, though. 
(And may have to de-optimize the code again in case there - are additional classes loaded during run-time.) - <tschwinge> jkoenig: The reference counting in MachPort. I think I'm - beginning to understand this. - <jkoenig> oh ok - <jkoenig> tschwinge, yes the javadoc is maybe a bit obscure so far. - <jkoenig> but basically you don't want the port name you acquire to become - invalid before you're done using it. - <tschwinge> But how is this different from the C world? - <jkoenig> here my goal is to provide some guarantees if you use only safe - methods - <jkoenig> like, you can't forge a port name and things like that - <jkoenig> so basically it should never be possible to include an invalid - port name in a message if you use only safe methods. - <tschwinge> Ah, I see! - <tschwinge> Now that does make sense. - <jkoenig> but the mechanism in itself is similar to the Hurd port cells and - user_link structures - <tschwinge> It's again ``only'' helping the programmer. - <jkoenig> right, no object-capability ulterior motives :-) - <jkoenig> another assumption which the javadoc does not state yet it that - basically there should be exactly one MachPort object for each mach-level - port name reference (in the sense of mach_port_mod_refs) - <tschwinge> Yes, I figured out that bit. diff --git a/user/jkoenig/java/java-access-bridge.mdwn b/user/jkoenig/java/java-access-bridge.mdwn deleted file mode 100644 index 57c87068..00000000 --- a/user/jkoenig/java/java-access-bridge.mdwn +++ /dev/null @@ -1,92 +0,0 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] - -[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable -id="license" text="Permission is granted to copy, distribute and/or modify this -document under the terms of the GNU Free Documentation License, Version 1.2 or -any later version published by the Free Software Foundation; with no Invariant -Sections, no Front-Cover Texts, and no Back-Cover Texts. 
A copy of the license -is included in the section entitled [[GNU Free Documentation -License|/fdl]]."]]"""]] - -[[!tag open_issue_porting]] - -Debian's *openjdk-7-jre* package depends on *libaccess-bridge-java-jni* (source -package: *java-access-bridge*). - -The latter one has *openjdk-6-jdk* as a build dependency, but that can be -hacked around: - - # ln -s java-7-openjdk /usr/lib/jvm/java-6-openjdk - -Trying to build it: - - $ LD_LIBRARY_PATH=/usr/lib/jvm/java-7-openjdk/jre/lib/i386/jli dpkg-buildpackage -b -uc -d - [...] - make[3]: Entering directory `/media/erich/home/thomas/tmp/libaccess-bridge-java-jni/java-access-bridge-1.26.2/idlgen' - /usr/lib/jvm/java-6-openjdk/bin/idlj \ - -pkgPrefix Bonobo org.GNOME \ - -pkgPrefix Accessibility org.GNOME \ - -emitAll -i /usr/share/idl/bonobo-activation-2.0 -i /usr/share/idl/at-spi-1.0 -i /usr/share/idl/bonobo-2.0 \ - -fallTie /usr/share/idl/at-spi-1.0/Accessibility.idl - /usr/share/idl/at-spi-1.0/Accessibility_Collection.idl (line 66): WARNING: Identifier `object' collides with a keyword; use an escaped identifier to ensure future compatibility. - boolean isAncestorOf (in Accessible object); - ^ - /usr/share/idl/at-spi-1.0/Accessibility_Component.idl (line 83): WARNING: Identifier `Component' collides with a keyword; use an escaped identifier to ensure future compatibility. 
- interface Component : Bonobo::Unknown { - ^ - Exception in thread "main" java.lang.AssertionError: Platform not recognized - at sun.nio.fs.DefaultFileSystemProvider.create(DefaultFileSystemProvider.java:71) - at java.nio.file.FileSystems$DefaultFileSystemHolder.getDefaultProvider(FileSystems.java:108) - at java.nio.file.FileSystems$DefaultFileSystemHolder.access$000(FileSystems.java:89) - at java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(FileSystems.java:98) - at java.nio.file.FileSystems$DefaultFileSystemHolder$1.run(FileSystems.java:96) - at java.security.AccessController.doPrivileged(Native Method) - at java.nio.file.FileSystems$DefaultFileSystemHolder.defaultFileSystem(FileSystems.java:95) - at java.nio.file.FileSystems$DefaultFileSystemHolder.<clinit>(FileSystems.java:90) - at java.nio.file.FileSystems.getDefault(FileSystems.java:176) - at sun.util.calendar.ZoneInfoFile$1.run(ZoneInfoFile.java:489) - at sun.util.calendar.ZoneInfoFile$1.run(ZoneInfoFile.java:480) - at java.security.AccessController.doPrivileged(Native Method) - at sun.util.calendar.ZoneInfoFile.<clinit>(ZoneInfoFile.java:479) - at sun.util.calendar.ZoneInfo.getTimeZone(ZoneInfo.java:658) - at java.util.TimeZone.getTimeZone(TimeZone.java:559) - at java.util.TimeZone.setDefaultZone(TimeZone.java:656) - at java.util.TimeZone.getDefaultRef(TimeZone.java:623) - at java.util.TimeZone.getDefault(TimeZone.java:610) - at java.text.SimpleDateFormat.initializeCalendar(SimpleDateFormat.java:682) - at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:619) - at java.text.DateFormat.get(DateFormat.java:772) - at java.text.DateFormat.getDateTimeInstance(DateFormat.java:547) - at com.sun.tools.corba.se.idl.toJavaPortable.Util.writeProlog(Util.java:1139) - at com.sun.tools.corba.se.idl.toJavaPortable.Skeleton.writeHeading(Skeleton.java:145) - at com.sun.tools.corba.se.idl.toJavaPortable.Skeleton.generate(Skeleton.java:102) - at 
com.sun.tools.corba.se.idl.toJavaPortable.InterfaceGen.generateSkeleton(InterfaceGen.java:159) - at com.sun.tools.corba.se.idl.toJavaPortable.InterfaceGen.generate(InterfaceGen.java:108) - at com.sun.tools.corba.se.idl.InterfaceEntry.generate(InterfaceEntry.java:110) - at com.sun.tools.corba.se.idl.toJavaPortable.ModuleGen.generate(ModuleGen.java:75) - at com.sun.tools.corba.se.idl.ModuleEntry.generate(ModuleEntry.java:83) - at com.sun.tools.corba.se.idl.Compile.generate(Compile.java:324) - at com.sun.tools.corba.se.idl.toJavaPortable.Compile.start(Compile.java:169) - at com.sun.tools.corba.se.idl.toJavaPortable.Compile.main(Compile.java:146) - make[3]: *** [org/GNOME/Accessibility/Accessible.java] Error 1 - make[3]: Leaving directory `/media/erich/home/thomas/tmp/libaccess-bridge-java-jni/java-access-bridge-1.26.2/idlgen' - make[2]: *** [all-recursive] Error 1 - make[2]: Leaving directory `/media/erich/home/thomas/tmp/libaccess-bridge-java-jni/java-access-bridge-1.26.2/idlgen' - make[1]: *** [all-recursive] Error 1 - make[1]: Leaving directory `/media/erich/home/thomas/tmp/libaccess-bridge-java-jni/java-access-bridge-1.26.2' - make: *** [debian/stamp-makefile-build] Error 2 - dpkg-buildpackage: error: debian/rules build gave error exit status 2 - - -IRC, freenode, #hurd, 2011-08-10: - - < jkoenig> and with my latest fix (hardwire os.name as "Linux"), - java-access-bridge actually built \o/ - < youpi> I wouldn't call it a "fix" :) - < jkoenig> true, but pretty much everything assumes we're either solaris, - linux or windows :-/ - < jkoenig> also we're actually using the Linux code which it is used to - select throughout the JDK - < jkoenig> if it's any consolation, os.version stays "GNU-Mach - 1.3.99/Hurd-0.3" :-) - < youpi> ideally it should simply be changed to "GNU" diff --git a/user/jkoenig/java/proposal.mdwn b/user/jkoenig/java/proposal.mdwn deleted file mode 100644 index feb7e9dc..00000000 --- a/user/jkoenig/java/proposal.mdwn +++ /dev/null @@ -1,629 +0,0 @@ 
-[[!tag stable_URL]] - -# Java for Hurd (and vice versa) - -Contact information: - - * Full name: Jérémie Koenig - * Email: jk@jk.fr.eu.org - * IRC: jkoenig on Freenode and OFTC - -## Introductions - -I am a first year M.Sc. student -in Computer Science at University of Strasbourg (France). -My interests include capability-based security, -programming languages and formal methods -(in particular, object-capability languages and proof-carrying code). - -### Proposal summary - -This project would consist in improving Java support on Hurd. -The first part would consist in -fixing bugs and porting Java-related packages. -The second part would consist in -creating low-level Java bindings for the Hurd interfaces, -as well as libraries to make translator development easier. - -### Previous involvement - -I started contributing to Hurd last summer, -during which I participated in Google Summer of Code -as a student for the Debian project. -I worked on porting Debian-Installer to Hurd. -This project was mostly a success, -although we still have to use a special mirror for installation -with a few modified packages -and tweaked priorities -to work around some uninstallable packages -with Priority: standard. - -Shortly afterwards, -I rewrote the procfs translator -to fix some issues with memory leaks, -make it more reliable, -and improve compatibility with Linux-based tools -such as `procps` or `htop`. - -Although I have not had as much time -as I would have liked to dedicate to the Hurd -since that time, -I have continued to maintain the mirror in question, -and I have started to work -on implementing POSIX threads signal semantics in glibc. - -### Project-related skills and interests - -I have used Java mostly for university assignments. -This includes non-trivial projects -using threads and distributed programming frameworks -such as Java RMI or CORBA. 
-I have also used it to experiment with -Google App Engine -(web applications) -and Google Web Toolkit -(a compiler from Java to Javascript which helps with AJAX code), -and I have some limited experience with JNI -(the Java Native Interface, to link Java with C code). - -My knowledge of the Hurd and Debian GNU/Hurd is reasonable, -as the Debian-Installer and procfs projects -gave me the opportunity to fiddle with many parts of the system. - -Initially, -I started working on this project because I wanted to use -[Joe-E](http://code.google.com/p/joe-e/) -(a subset of Java) -to investigate the potential -[[applications of object-capability languages|objcap]] -in a Hurd context. -I also believe that improving Java support on Hurd -would be an important milestone. - -### Organisational matters - -I am subscribed to bug-hurd@g.o and -I do have a permanent internet connexion. - -I would be able to attend the regular IRC meetings, -and otherwise communicate with my mentor -through any means they would prefer -(though I expect email and IRC would be the most practical). -Since I'm already familiar with the Hurd, -I don't expect I would require too much time from them. - -My exams end on May 20 so I would be able to start coding -right at the beginning of the GSoC period. -Next year's term would probably begin around September 15, -so that would not be an issue either. -I expect I would work around 40 hours per week, -and my waking hours would be flexible. - -I don't have any other plans for the summer -and would not make any if my project were to be accepted. - -Full disclosure: -I also submitted a proposal to the Jikes RVM project -(which is a research-oriented Java Virtual Machine, -itself written in Java) -for implementing a new garbage collector into the MMTk subsystem. - -## Improve Java support - -### Justification - -Java is a popular language and platform used by many desktop and web -applications (mostly on the server side). 
As a consequence, competitive Java -support is important for any general-purpose operating system. -Better Java support would also be a prerequisite -for the second part of my proposal. - -### Current situation - -Java is currently supported on Hurd with the GNU Java suite: - - * [GCJ](http://gcc.gnu.org/java/), - the GNU Compiler for Java, is part of GCC and can compile Java - source code to Java bytecode, and both source code and bytecode to - native code; - * libgcj is the implementation of the Java runtime which GCJ uses. - It is based on [GNU Classpath](http://www.gnu.org/software/classpath/). - It includes a bytecode interpreter which enables - Java applications compiled to native code to dynamically load and execute - Java bytecode from class files. - * The gij command is a wrapper around the above-mentioned virtual machine - functionality of libgcj and can be used as a replacement for the java - command. - -However, GCJ does not work flawlessly on Hurd. -For instance, some parts of libgcj rely on -the POSIX threads signal semantics, which are not yet implemented. -In particular, this makes ant hang waiting for child processes, -which makes some packages fail to build on Hurd -(“ant” is the “make” of the Java world). - -### Tasks - - * **Finish implementing POSIX thread semantics** in glibc (high priority). - According to POSIX, signal dispositions should be global to a process, - while signal blocking masks should be thread-specific. Signals sent to the - process as a whole are to be delivered to any thread which does not block - them. By contrast, Hurd has per-thread signal dispositions and signals - sent to a process are delivered to the main thread only. I have been - working on refactoring the glibc signal code and implementing the POSIX - semantics as a per-thread option. However, due to lack of time I have not - yet been able to test and debug my code properly. Finishing this work - would be my first task. 
- * **Fix further problems with GCJ on Hurd** (high priority). While I’m not - aware of any other problems with GCJ at the moment, I suspect some might - turn up as I progress with the other tasks. Fixing these problems would - also be a high-priority task. - * **Port OpenJDK 6** (medium priority). While GCJ is fine, it is not yet - 100% complete. It is also slower than OpenJDK on architectures where a - just-in-time compiler is available. Porting OpenJDK would therefore - improve Java support on Hurd in scope and quality. Besides, it would also - be a good way to test GCJ, which is used for bootstrapping by the Debian - OpenJDK packages. Also note that OpenJDK 6 is now the default Java - Runtime Environment on all released Linux-based Debian architectures; - bringing Hurd in line with this would probably be a good thing. - * **Port Eclipse and other Java applications** (low priority). Eclipse is a - popular, state-of-the-art IDE and tool suite used for Java and other - languages. It is a dependency of the Joe-E verifier (see part 3 of this - proposal). Porting Eclipse would be a good opportunity to test GCJ and - OpenJDK. - -### Deliverables - - * The glibc pthreads patch and any other fixes on the Hurd side - would be submitted upstream - * Patches against Debian source packages - required to make them build on Hurd would be submitted - to the [Debian bug tracking system](http://bugs.debian.org/). - - -## Create Java bindings for the Hurd interfaces - -### Justification - -Java is used for many applications and often taught to -introduce object-oriented programming. The fact that Java is a -garbage-collected language makes it easier to use, especially for the less -experienced programmers. Besides, its object-oriented nature is a -natural fit for the capability-based design of Hurd. -The JVM is also used as a target for many other languages, -all of which would benefit from the access provided by these bindings. 
- -Advantages over other garbage-collected, object-oriented languages include -performance, type safety and the possibility to compile a Java translator to -native code and -[link it statically](http://gcc.gnu.org/wiki/Statically_linking_libgcj) -using GCJ, should anyone want to use a -translator written in Java for booting. -Note that Java is -[being](http://www.linuxjournal.com/article/8757) -[used](http://oss.readytalk.com/avian/) -in this manner for embedded development. -Since GCJ can take bytecode as its input, -I expect this possibility would apply to any JVM-based language. - -Java bindings would lower the bar for newcomers -to begin experimenting with what makes Hurd unique -without being faced right away with the complexity of -low-level systems programming. - -### Tasks summary - - * Implement Java bindings for Mach - * Implement a libports-like library for Java - * Modify MIG to output Java code - * Implement libfoofs-like Java libraries - -### Design principles - -The principles I would use to guide the design -of these Java bindings would be the following ones: - - * The system should be hooked into at a low level, - to ensure that Java is a "first class citizen" - as far as the access to the Hurd's interfaces is concerned. - * At the same time, the memory safety of Java should be maintained - and extended to Mach primitives such as port names and - out-of-line memory regions. - * Higher-level interfaces should be provided as well - in order to make translator development - as easy as possible. - * A minimum amount of JNI code (i.e. C code) should be used. - Most of the system should be built using Java itself - on top of a few low-level primitives. - * Hurd objects would map to Java objects. - * Using the same interfaces, - objects corresponding to local ports would be accessed directly, - and remote objects would be accessed over IPC. 
- -One approach used previously to interface programming languages with the Hurd -has been to create bindings for helper libraries such as libtrivfs. Instead, -for Java I would like to take a lower-level approach by providing access to -Mach primitives and extending MIG to generate Java code from the interface -description files. - -This approach would be initially more involved, and would introduce several -issues related to overcoming the "impedance mismatch" between Java and Mach. -However, once an initial implementation is done it would be easier to maintain -in the long run and we would be able to provide Java bindings for a large -percentage of the Hurd’s interfaces. - -### Bindings for Mach system calls - -In this low-level approach, my intention is to enable Java code to use Mach -system calls (in particular, mach_msg) more or less directly. This would -ensure full access to the system from Java code, but it raises a number of -issues: - - * the Java code must be able to manipulate Mach-level entities, such as port - rights or page-aligned buffers mapped outside of the garbage-collected - heap (for out-of-line transfers); - * putting together IPC messages requires control of the low-level - representation of data. - -In order to address these concerns, classes would be encapsulating these -low-level entities so that they can be referenced through normal, safe objects -from standard Java code. Bindings for Mach system calls can then be provided -in terms of these classes. Their implementation would use C code through the -Java Native Interface (JNI). - -More specifically, this functionality would be provided by the `org.gnu.mach` -package, which would contain at least the following classes: - - * `MachPort` would encapsulate a `mach_port_t`. (Some of) its constructors - would act as an interface for the `mach_port_allocate()` system call. 
- `MachPort` objects would also be instantiated from other parts of the JNI - C code to represent port rights received through IPC. The `deallocate()` - method would call `mach_port_deallocate()` and replace the encapsulated - port name with `MACH_PORT_DEAD`. We would recommend that users call it - when a port is no longer used, but the finalizer would also deallocate the - port when the `MachPort` object is garbage collected. - * `Buffer` would represent a page-aligned buffer allocated outside of the - Java heap, to be transferred (or having been received) as out-of-line - memory. The JNI code would provide methods to read and write data at - an arbitrary offset (but within bounds) and would use `vm_allocate()` and - `vm_deallocate()` in the same spirit as for `MachPort` objects. - * `Message` would allow Java code to put together Mach messages. The - constructor would allocate a `byte[]` member array of a given size. - Additional methods would be provided to fill in or query the information - in the message header and additional data items, including `MachPort` and - `Buffer` objects which would be translated to the corresponding port names - and out-of-line pointers. - A global map from port names to the corresponding `MachPort` object - would probably be needed to ensure that there is a one-to-one - correspondence. - * `Syscall` would provide static JNI methods for performing system calls not - covered by the above classes, such as `mach_msg()` or - `mach_thread_self()`. These methods would accept or return `MachPort`, - `Buffer` and `Message` objects when appropriate. The associated C code - would access the contents of such objects directly in order to perform the - required unsafe operations, such as constructing `MachPort` and `Buffer` - objects directly from port names and C pointers. 
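The global map from port names to `MachPort` objects mentioned in the list above could look roughly like the following. This is only a minimal sketch in plain Java with no JNI: `SketchPort`, `forName()` and `isDead()` are hypothetical names chosen for illustration and are not part of the proposed `org.gnu.mach` API, and the `DEAD` constant is assumed to mirror `MACH_PORT_DEAD`.

```java
import java.util.HashMap;
import java.util.Map;

/* Hypothetical stand-in for the proposed MachPort class; no JNI involved. */
final class SketchPort {
    /* Assumed to mirror MACH_PORT_DEAD (all bits set). */
    static final int DEAD = ~0;

    /* Global map enforcing one wrapper object per live port name. */
    private static final Map<Integer, SketchPort> BY_NAME = new HashMap<>();

    private int name;

    private SketchPort(int name) {
        this.name = name;
    }

    /* Return the unique wrapper for a port name, creating it on first use. */
    static SketchPort forName(int name) {
        if (name == DEAD)
            throw new IllegalArgumentException("dead port name");
        synchronized (SketchPort.class) {
            return BY_NAME.computeIfAbsent(name, SketchPort::new);
        }
    }

    /* Drop the name from the map and mark this wrapper dead.
       A real implementation would call mach_port_deallocate() here. */
    void deallocate() {
        synchronized (SketchPort.class) {
            if (name != DEAD) {
                BY_NAME.remove(name);
                name = DEAD;
            }
        }
    }

    boolean isDead() {
        synchronized (SketchPort.class) {
            return name == DEAD;
        }
    }
}
```

With this shape, two lookups of the same name yield the same object, and a wrapper that has been deallocated is never handed out again for a name the kernel later reuses.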
- -Note that careful consideration should be given to the interfaces of these -classes to avoid “safety leaks” which would compromise the safety guarantees -provided by Java. Potential problematic scenarios include the following -examples: - - * It must not be possible to write an integer at some position in a - `Message` object, and to read it back as a `MachPort` or `Buffer` object, - since this would allow unsafe access to arbitrary memory addresses and - Mach port names. - * Providing the `mach_task_self()` system call would also provide access to - arbitrary addresses and ports by using the `vm_*` family of RPC operations - with the returned `MachPort` object. This means that the relevant task - operations should be provided by the `Syscall` class instead. - -Finally, access should be provided to the initial ports and file descriptors -in `_hurd_ports` and provided by the `getdport()` function, -for instance through static methods such as -`getCRDir()`, `getCWDir()`, `getProc()`, ... in a dedicated class such as -`org.gnu.hurd.InitPorts`. 
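The first "safety leak" listed above (writing an integer and reading it back as a port) can be ruled out by having the message remember the kind of each item. The following is a minimal sketch of that idea under stated assumptions: `SketchMessage` and `PortRef` are hypothetical names, items are kept as Java objects rather than in the `byte[]` layout the proposal describes, and no real Mach message is built.

```java
import java.util.ArrayList;
import java.util.List;

/* Hypothetical port wrapper; stands in for the proposed MachPort. */
final class PortRef {
    final int name;

    PortRef(int name) {
        this.name = name;
    }
}

/* Hypothetical message whose items keep their kind, so that an int
   can never be read back as a port reference. */
final class SketchMessage {
    private final List<Object> items = new ArrayList<>();

    /* Append an integer item and return its index. */
    int addInt(int value) {
        items.add(Integer.valueOf(value));
        return items.size() - 1;
    }

    /* Append a port item and return its index. */
    int addPort(PortRef port) {
        if (port == null)
            throw new NullPointerException("port");
        items.add(port);
        return items.size() - 1;
    }

    /* Read an integer item; fails if the item is of another kind. */
    int readInt(int index) {
        Object item = items.get(index);
        if (!(item instanceof Integer))
            throw new IllegalStateException("item " + index + " is not an int");
        return (Integer) item;
    }

    /* Read a port item; fails if the item is of another kind. */
    PortRef readPort(int index) {
        Object item = items.get(index);
        if (!(item instanceof PortRef))
            throw new IllegalStateException("item " + index + " is not a port");
        return (PortRef) item;
    }
}
```

A real `Message` would have to enforce the same rule on top of a raw byte representation, for instance by recording a type descriptor for each item alongside the data.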
- -A realistic example of code based on such interfaces would be: - - import org.gnu.mach.MsgType; - import org.gnu.mach.MachPort; - import org.gnu.mach.Buffer; - import org.gnu.mach.Message; - import org.gnu.mach.Syscall; - import org.gnu.hurd.InitPorts; - - public class Hello - { - public static void main(String argv[]) - /* Parent class for all Mach-related exceptions */ - throws org.gnu.mach.MachException - { - /* Allocate a reply port */ - MachPort reply = new MachPort(); - - /* Allocate an out-of-line buffer */ - Buffer data = new Buffer(MsgType.CHAR, 13); - data.writeString(0, "Hello, World!"); - - /* Craft an io_write message */ - Message msg = new Message(1024); - msg.setRemotePort(InitPorts.getdport(1)); - msg.setLocalPort(reply, Message.Type.MAKE_SEND_ONCE); - msg.setId(21000); - msg.addBuffer(data); - - /* Make the call, MACH_SEND_MSG | MACH_RCV_MSG */ - Syscall.machMsg(msg, true, true, reply); - - /* Extract the returned value */ - msg.assertId(21100); - int retCode = msg.readInt(0); - int amount = msg.readInt(1); - } - } - -Should this paradigm prove insufficient, -more ideas could be borrowed from the -[`org.vmmagic`](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.151.5253&rep=rep1&type=pdf) -package used by [Jikes RVM](http://jikesrvm.org/), -a research Java virtual machine itself written in Java. - -### Generating Java stubs with MIG - -Once the basic machinery is in place to interface with Mach, Java programs -have more or less equal access to the system functionality without resorting -to more JNI code. However, as illustrated above, this access is far from -convenient. - -As a solution I would modify MIG to add the option to output Java code. MIG -would emit a Java interface, a client class able to implement the interface -given a Mach port send right, and a server class which would be able to handle -incoming messages. 
The class diagram below, although it is by no means -complete or free of problems, illustrates the general idea: - -[[gsoc2011_classes.png]] - -This structure is somewhat reminiscent of -[Java RMI](http://en.wikipedia.org/wiki/Java_remote_method_invocation) -or similar systems, -which aim to provide more or less transparent access to remote objects. -The exact way the Java code would be generated still needs to be determined, -but basically: - - * An interface, corresponding to the header files generated by MIG, would - enumerate the operations listed in a given .defs file. Method names would - be transformed to adhere to Java conventions (for instance, - `some_random_identifier` would become `someRandomIdentifier`). - * A user class, corresponding to the `*User.c` files, - would implement this interface by doing RPC over a given MachPort object. - * A server class, corresponding to `*Server.c`, would be able to handle - incoming messages using a user-provided implementation of the interface. - (Possibly, a skeleton class providing methods which would raise - `NotImplementedException`s would be provided as well. - Users would derive from this class and override the relevant methods. - This would allow them not to implement some operations, - and would prevent pre-existing code from breaking when new operations are - introduced.) - -In order to help with the implementation of servers, some kind of library -would be needed to associate Mach receive rights with server objects and to -handle incoming messages on dedicated threads, in the spirit of libports. -This would probably require support for port sets at the level of the Mach -primitives described in the previous section. - -When possible, operations involving the transmission of send rights -of some kind would be expressed in terms of the MIG-generated interfaces -instead of `MachPort` objects. 
-Upon reception of a send right, -a `FooUser` object would be created -and associated with the corresponding `MachPort` object. -If the received send right corresponds to a local port -to which a server object has been associated, -this object would be used instead. -This way, -subsequent operations on the received send right -would be handled as direct method calls -instead of going through RPC mechanisms. - -Some issues will still need to be solved regarding how MIG will convert -interface description files to Java interfaces. For instance: - - * `.defs` files are not explicitly associated with a type. For instance in - the example above, MIG would have to somehow infer that io_t corresponds - to `this` in the `Io` interface. - * More generally, a correspondence between MIG and Java types would have - to be determined. Ideally this would be automated and not hardcoded - too much. - * Initially, reply port parameters would be ignored. However they may be - needed for some applications. - -So the details would need to be fleshed out during the community bonding -period and as the implementation progresses. However I’m confident that a -satisfactory solution can be designed. - -Using these new features, the example above could be rewritten as: - - import org.gnu.hurd.InitPorts; - import org.gnu.hurd.Io; - import org.gnu.hurd.IoUser; - - class Hello { - public static void main(String argv[]) throws ... - { - Io stdout = new IoUser(InitPorts.getdport(1)); - String hello = "Hello, World!\n"; - - int amount = stdout.write(hello.getBytes(), -1); - - /* (A retCode corresponding to an error - would be signalled as an exception.) 
*/ - } - } - -An example of server implementation would be: - - import org.gnu.hurd.Io; - import java.util.Arrays; - - class HelloIo implements Io { - final byte[] contents = "Hello, World!\n".getBytes(); - - public int write(byte[] data, int offset) { - return SOME_ERROR_CODE; - } - - public byte[] read(int offset, int amount) { - return Arrays.copyOfRange(contents, offset, - offset + amount); - } - - /* ... */ - } - -A new server object could then be created with `new IoServer(new HelloIo())`, -and associated with some receive right at the level of the ports management -library. - -### Base classes for common types of translators - -Once MIG can target Java code, and a libports equivalent is available, -creating new translators in Java would be greatly facilitated. However, -we would probably want to introduce basic implementations of file system -translators in the spirit of libtrivfs or libnetfs. They could take the form -of base classes implementing the relevant MIG-generated interfaces which -would then be derived by users, -or could define a simpler interface -which would then be used by adapter classes -to implement the required ones. - -I would draw inspiration from libtrivfs and libnetfs -to design and implement similar solutions for Java. - -### Deliverables - - * A hurd-java package would contain the Java code developed - in the context of this project. - * The Java code would be documented using javadoc - and a tutorial for writing translators would be written as well. - * Modifications to MIG would be submitted upstream, - or a patched MIG package would be made available. - -The Java libraries resulting from this work, -including any MIG support classes -as well as the class files built from the MIG-generated code -for the Mach and Hurd interface definition files, -would be provided as a single `hurd-java` package for -Debian GNU/Hurd. -This package would be separate from both Hurd and Mach, -so as not to impose unreasonable build dependencies on them. 
- -I expect I would be able to act as its maintainer in the foreseeable future, -either as an individual or as a part of the Hurd team. -Hopefully, -my code would be claimed by the Hurd project as their own, -and consequently the modifications to MIG -(which would at least conceptually depend on the Mach Java package) -could be integrated upstream. - -Since by design, -the Java code would use only a small number of stable interfaces, -it would not be subject to excessive amounts of bitrot. -Consequently, -maintenance would primarily consist in -fixing bugs as they are reported, -and adding new features as they are requested. -A large number of such requests -would mean the package is useful, -so I expect that the overall amount of work -would be correlated with the willingness of more people -to help with maintenance -should I become overwhelmed or get hit by a bus. - - -## Timeline - -The dates listed are deadlines for the associated tasks. - - * *Community bonding period.* - Discuss, refine and complete the design of the Java bindings - (in particular the MIG and "libports" parts) - * *May 23.* - Coding starts. - * *May 30.* - Finish implementing pthread signal semantics. - * *June 5.* - Port OpenJDK - * *June 12.* - Fix the remaining problems with GCJ and/or OpenJDK, - possibly port Eclipse or other big Java packages. - * *June 19.* - Create the bindings for Mach. - * *June 26.* - Work on some kind of basic Java libports - to handle receive rights. - * *July 3.* - Test, write some documentation and examples. - * *July 17 (two weeks).* - Add the Java target to MIG. - * *July 24.* - Test, write some documentation and examples. - * *August 7 (two weeks).* - Implement a modular libfoofs to help with translator development. - Try to write a basic but non-trivial translator - to evaluate the performance and ease of use of the result, - rectify any rough edges this would uncover. - * *August 22. 
(last two weeks)* - Polish the code and packaging, - finish writing the documentation. - - -## Conclusion - -This project is arguably ambitious. -However, I have been thinking about it for some time now -and I'm confident I would be able to accomplish most of it. - -In the event multiple language bindings projects -would be accepted, -some work could probably be done in common. -In particular, -[ArneBab](http://www.bddebian.com/~hurd-web/community/weblogs/ArneBab/2011-04-06-application-pyhurd/) -seems to favor a low-level approach for his Python bindings as I do for Java, -and I would be happy to discuss API design and coordinate MIG changes with him. -I would also have an extra month after the end of the GSoC period -before I go back to school, -which I would be able to use to finish the project -if there is some remaining work. -(Last year's rewrite of procfs was done during this period.) - -As for the project's benefits, -I believe that good support for Java -is a must-have for the Hurd. -Java bindings would also further the Hurd's agenda -of user freedom by extending this freedom to more people: -I expect the set of developers -who would be able to write Java code against a well-written libfoofs -is much larger than -those who master the intricacies of low-level systems C programming. -From a more strategic point of view, -this would also help recruit new contributors -by providing an easier path to learning the inner workings of the Hurd. - -Further developments -which would build on the results of this project -include my planned [[experiment with Joe-E|objcap]] -(which I would possibly take on as a university project next year). -Another possibility would be to reimplement some parts -of the Java standard library -directly in terms of the Hurd interfaces -instead of using the POSIX ones through glibc. 
-This would possibly improve the performance -of some Java applications (though probably not by much), -and would otherwise be a good project -for someone trying to get acquainted with Hurd. - -Overall, I believe this project would be fun, interesting and useful. -I hope that you will share this sentiment -and give me the opportunity to spend another summer working on Hurd. - diff --git a/user/jkoenig/java/report.mdwn b/user/jkoenig/java/report.mdwn deleted file mode 100644 index cb1acda9..00000000 --- a/user/jkoenig/java/report.mdwn +++ /dev/null @@ -1,136 +0,0 @@ -[[!meta copyright="Copyright © 2011 Free Software Foundation, Inc."]] - -[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable -id="license" text="Permission is granted to copy, distribute and/or modify this -document under the terms of the GNU Free Documentation License, Version 1.2 or -any later version published by the Free Software Foundation; with no Invariant -Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license -is included in the section entitled [[GNU Free Documentation -License|/fdl]]."]]"""]] - -# GSoC 2011 final report (Java on Hurd) - -This is my final report regarding my work on Java for Hurd -as a Google Summer of Code student for the GNU project. -The work is ongoing; -for recent status updates, see my [[java]] page. - -## Global signal dispositions and SA_SIGINFO - -Signal delivery was implemented in Hurd before POSIX threads were -defined. As a consequence the current semantics differ from the POSIX -prescriptions, which libgcj relies on. - -On the Hurd, each thread has its own signal dispositions and -process-wide signals are always received by the main thread. -In contrast, POSIX specifies signal dispositions to be global to the -process (although there is still a per-thread blocking mask), and a -global signal can be delivered to any thread which does not block it. 
- -To further complicate the matter, the Hurd currently has two options for -threads: the cthread library, still used by most of the Hurd code, and -libpthread which was introduced later for compatibility purposes. To -avoid breaking existing code, cthread programs should continue to run -with the historical Hurd signal semantics whereas pthread programs are -expected to rely on the POSIX behavior. - -To address this, the patch series I wrote allows selecting a per-thread -behavior: by default, newly created threads provide historical -semantics, but they can be marked by libpthread as global signal -receivers using the new function `_hurd_sigstate_set_global_rcv()`. -In addition, I refactored some of the signal code to improve -readability, and fixed a couple of bugs I came across in the process. - -Another improvement which was required by OpenJDK was the implementation -of the `SA_SIGINFO` flag for signal handlers. My signal patch series -provides the basic infrastructure. However it is not yet complete, as -some of the information provided by `siginfo_t` structures is not -available to glibc. Making this information available would require a -change in the `msg_sig_post()` RPC. - -### Related Debian changes - -In Debian GNU/Hurd, libpthread is provided by the `hurd` package. Hurd also -uses extern inline functions from glibc which are affected by the new -symbols. This means that newer Hurd packages which take advantage of -glibc's support for global signal dispositions cannot run on older C -libraries and some thought had to be given to the way we could ensure -smooth upgrades. - -An early attempt at using weak symbols proved to be impractical. As a -consequence I modified the eglibc source package to enable -dpkg-gensymbols on hurd-i386. This means that packages which are built -against a newer libc and make use of the new symbols will automatically -get an appropriately versioned dependency on libc0.3. 
- -### Status as of 2012-01-28 - -The patch series has not yet been merged upstream. However, it is now -being used for the Debian package of glibc. - -## $ORIGIN substitution in RPATH - -Another feature used by OpenJDK which was not implemented in Hurd is the -substitution of the special string `$ORIGIN` within the ELF `RPATH` -header. `RPATH` is a per-executable library search path, within which -`$ORIGIN` should be substituted by the directory part of the binary's -canonical file name. - -Currently, a newly executed program has no way of figuring out which -binary it was created from. Actually, even the `_hurd_exec()` function, -which is used in glibc to implement the `exec*()` family, is never -passed the file name of the executable, but only a port to it. -Likewise, the `file_exec()`, `exec_exec()` and `exec_startup_get_info()` -RPCs do not provide a path to transmit the file name from the shell to -the file system server, to the exec server, to the executed program. - -Last year, Emilio Pozuelo Monfort submitted a patch series which fixes -this problem, up to the exec server. The series' original purpose was to -replace the guesswork done by `exec` when running shell scripts. It -provides new versions of `file_exec()` and `exec_exec()` which allow for -passing the file name. I extended Emilio's patches to add the missing -link, namely a new `exec_startup_get_info_2()` RPC. New code in glibc -takes advantage of it to retrieve the file name and use it in a -Hurd-specific `dl-origin.c` to allow for `RPATH` `$ORIGIN` substitution. - -### Status as of 2012-01-28 - -The (hurd and glibc) patch series for `$ORIGIN` are mostly complete. -However, there is still an issue related to the canonicalization of the -executable's file name. Doing it in the dynamic linker (where `$ORIGIN` -is expanded) is complicated due to the limited set of available -functions (`realpath()` is not). 
Unfortunately canonicalizing in -`_hurd_exec_file_name()` is not an option either because many shell -scripts use `argv[0]` to alter their behavior, but `argv[0]` is replaced -by the shell with the file name it's passed. - -Another issue is that the patches use a fixed-length string buffer to -transmit the file name through RPC. - -## OpenJDK 7 - -With the groundwork above being taken care of, I was able to build -OpenJDK 7 on Hurd, although heavy portability patching was also -necessary. A similar effort for Debian GNU/kFreeBSD was undertaken -around the same time by Damien Raude-Morvan, so I intend to submit a -more general set of "non-Linux" patches. - -Due to the lack of a `libpthread_db` library on the Hurd, I was only -able to build a Zero (interpreter only) virtual machine so far. However, -it should be possible to disable the debugging agent and build Hotspot. - -### Status as of 2012-01-28 - -I have put together generic `nonlinux-*.diff` patches for the `openjdk7` -Debian package, however I have not yet tested them on Linux and kFreeBSD. - -## Java bindings - -Besides improving Java support on Hurd, my original proposal also -included the creation of Java bindings for the Hurd interfaces. -My progress on this front has not been as fast as I would have liked. -However I have started some of the work required to provide safe Java -bindings for Mach system calls. - -See https://github.com/jeremie-koenig/hurd-java. - |