[[!meta copyright="Copyright © 2007, 2008, 2009, 2011, 2013, 2014, 2015 Free
Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!toc]]


# Xen dom0, hypervisor

/!\ Now that GNU Mach handles PAE you can use a PAE-enabled hypervisor.

You can either get binaries at <http://youpibouh.thefreecat.org/hurd-xen/> or build them yourself.

- Copy `gnumach-xen-pae` and `hurd-modules` to your dom0 `/boot`. If you still have a non-PAE hypervisor, use `gnumach-xen-nonpae` instead.
- Copy `hurd` into `/etc/xen` and edit it so that it points at your Hurd `/` and swap partitions.

# GNU/Hurd system

/!\ You need an already installed [[GNU/Hurd_system|hurd/running]].

If you have a free partition, use `fdisk` to set its type to 0x83, then create a filesystem with:

    sudo mke2fs -b 4096 -I 128 -o hurd /dev/sda4 

Replace `/dev/sda4` with your partition. Install and use crosshurd to set up a GNU/Hurd system on this partition.


# /etc/xen/hurd configuration

There are two ways to boot a Hurd system: either boot gnumach directly, or boot it through `pv-grub`. The former is a bit more complex.

## Directly booting gnumach

Here is a sample `/etc/xen/hurd` configuration:

    kernel = "/boot/gnumach-xen-pae"
    memory = 512
    disk = ['phy:sda4,hda,w']
    extra = "root=device:hd0"
    vif = [ '' ]
    ramdisk = "/boot/hurd-modules"
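
Since `xm` configuration files use Python syntax, a file like the sample above can be sanity-checked before it is handed to Xen. The following is only a sketch: the helper names `load_config` and `parse_disk` are made up for illustration, and it assumes the file contains nothing but plain assignments, as in the sample.

```python
def load_config(text):
    """Evaluate the assignments of an xm-style config and return them as a dict."""
    namespace = {}
    # Builtins are disabled: the sample only needs literals and assignments.
    exec(text, {"__builtins__": {}}, namespace)
    return namespace


def parse_disk(spec):
    """Split a disk entry such as 'phy:sda4,hda,w' into its fields."""
    backend, frontend, mode = spec.split(",")
    kind, _, device = backend.partition(":")
    return {"type": kind, "backend": device, "frontend": frontend, "mode": mode}


SAMPLE = '''
kernel = "/boot/gnumach-xen-pae"
memory = 512
disk = ['phy:sda4,hda,w']
extra = "root=device:hd0"
vif = [ '' ]
ramdisk = "/boot/hurd-modules"
'''

config = load_config(SAMPLE)
for entry in config["disk"]:
    print(parse_disk(entry))
```

This catches syntax errors and malformed `disk` entries early; it does not check that the kernel or the block device actually exist.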

`hurd-modules` from <http://youpibouh.thefreecat.org/hurd-xen/> was built against a specific libc version.

/!\ This means that when using this image, your GNU/Hurd system needs to have the same libc version!

It is preferable to rebuild your own `hurd-modules` against your own libc version, using the
<http://youpibouh.thefreecat.org/hurd-xen/build_hurd-modules> script.

Suggestions about [[networking_configuration]] are available.

If you need stable MAC addresses, use a syntax like `vif = [
'mac=00:16:3e:XX:XX:XX, bridge=br0' ]`.
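
One way to fill in the `XX:XX:XX` part reproducibly is to derive it from the domain name, so the same guest always gets the same address. This is a minimal sketch, not something the Xen tools do for you: `stable_xen_mac` is a hypothetical helper, and the `00:16:3e` prefix and `br0` bridge are simply taken from the example above.

```python
import hashlib

XEN_PREFIX = "00:16:3e"  # prefix taken from the vif example above


def stable_xen_mac(domain_name):
    """Derive a repeatable MAC address from a domain name (illustrative only)."""
    digest = hashlib.sha256(domain_name.encode("utf-8")).digest()
    suffix = ":".join(f"{byte:02x}" for byte in digest[:3])
    return f"{XEN_PREFIX}:{suffix}"


mac = stable_xen_mac("hurd")
print(f"vif = [ 'mac={mac}, bridge=br0' ]")
```

Two different domain names could in principle hash to the same three bytes, so check for duplicates if you manage many guests.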

## Booting through `pv-grub`

Starting from Xen 4.0, you can run the GNU Hurd using `pv-grub`.

Download <http://youpibouh.thefreecat.org/hurd-xen/pv-grub.gz> into `/boot`, and use the following, for instance:

    kernel = "/boot/pv-grub.gz"
    memory = 512
    disk = ['phy:sda4,hda,w']
    extra = "(hd0)/boot/grub/menu.lst"
    vif = [ '' ]

`extra` is now the path to the GRUB configuration file, which must contain the usual GRUB
commands to boot a Hurd system.

# Running Hurd with Xen

To run Hurd with Xen, use:

    xm create -c hurd

gnumach should then start. Proceed with `native-install`:

    export TERM=mach
    ./native-install

- If `xm` complains about networking (`vif could not be connected`), the Xen scripts are at fault; see the Xen documentation for how to configure the network.  The simplest way is network-bridge with fixed IPs (note that you need the bridge-utils package for this).  You can also just disable networking by commenting out the `vif` line in the config.
- If `xm` complains `Error: (2, 'Invalid kernel', 'xc_dom_compat_check: guest type xen-3.0-x86_32 not supported by xen kernel, sorry\n')`, you most probably have a PAE-enabled hypervisor and a non-PAE gnumach. Either install and boot a non-PAE hypervisor and kernel, or rebuild gnumach in PAE mode.

# Partitions

You will need the following notation for the gnumach root= parameter:

    root=part:2:device:hd0

to access the second partition of hd0, for instance.
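
If the Xen config is generated by a script, the `extra` line can be built from a device name and an optional partition number. The sketch below only reproduces the two notations shown on this page; `gnumach_extra` is a hypothetical helper, and it assumes that partition numbers start at 1.

```python
def gnumach_extra(device, partition=None):
    """Return the config line carrying the gnumach root= parameter."""
    if partition is None:
        root = f"device:{device}"
    elif partition >= 1:
        root = f"part:{partition}:device:{device}"
    else:
        # Assumption: partitions are numbered from 1.
        raise ValueError("partition numbers start at 1")
    return f'extra = "root={root}"'


print(gnumach_extra("hd0"))     # whole disk, as in the sample config
print(gnumach_extra("hd0", 2))  # second partition of hd0
```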

You will also need to use the parted storeio module for the /dev entries, for instance:

    settrans -fgap /dev/hd0s1 /hurd/storeio -T typed part:1:device:hd0
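
With several partitions, the same command has to be repeated once per `/dev` entry. As a sketch, the following prints the commands rather than running them, so that they can be reviewed first; `settrans_command` is a hypothetical helper that follows the pattern of the line above, and the partition count of three is only an example.

```python
def settrans_command(device, partition):
    """Build the settrans invocation for one partition of a Xen disk."""
    node = f"/dev/{device}s{partition}"
    store = f"part:{partition}:device:{device}"
    return f"settrans -fgap {node} /hurd/storeio -T typed {store}"


# Example: entries for the first three partitions of hd0.
for number in range(1, 4):
    print(settrans_command("hd0", number))
```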

# Miscellaneous

[[Internals]].

[[!GNU_Savannah_task 5468]], [[!GNU_Savannah_task 6584]].

# Building from sources

If you want to generate your own gnumach kernel, see [[microkernel/mach/gnumach/building]], and use

    ./configure --enable-platform=xen
    make

## IRC, freenode, #hurd, 2013-11-09

    <phcoder> youpi: would a limitation of 32 modules to hurd in pvgrub2 be a
      problem?
    <phcoder> *31
    <youpi> phcoder: probably not
    <phcoder> youpi: ok

    <phcoder> youpi: gnumach goes into infinite loop with "warning: nsec
      0x000096dae65d2697 < lastnsec 0x000096db11dee20d". Second value stays
      constant, first value loops from 0x000096da14968a59 to
      0x000096db08bf359e. Not sure if the problem is on GRUB or gnumach ide
    <youpi> loops?!
    <youpi> that's the time coming from the hypervisor
    <youpi> not a problem from GRUB anyway
    <phcoder> Yes, loops in steps of around 0x40 and comes back regularly.
    <youpi> Mmm, maybe it could be grub not properly setting up
      hyp_shared_info.vcpu_info[], actually
    <youpi> i.e. the mfn in boot_info.shared_info
    <phcoder> I don't think we write to shared page at all
    <phcoder> could gnumach suffer from overflow on fast CPU?
    <phcoder>   next_start.shared_info = grub_xen_start_page_addr->shared_info;
    <phcoder> And shared_info is machine address, so no need to adjust it
    <phcoder> tsc_shift can be negative. Does gnumach handle this?
    <youpi> yes
    <youpi> here it's the base which doesn't change, actually
    <phcoder> Do you mean this: 		system_time	=
      time->system_time; ?
    <phcoder> But wait: ((delta * (unsigned64_t) mul) >> 32)
    <phcoder> this overflows after 2^32 nanoseconds
    <phcoder> which is about 4 seconds
    <phcoder> I think this is the mistake
    <phcoder> which is more or less what I see
    <phcoder> Let me make a patch
    <youpi> does xen have some tickless  feature now?
    <youpi> I'd expect the clock to get updated at least sometimes during 4
      seconds :)
    <phcoder> Hm, can't compile master:
    <phcoder> ./include/mach/xen.h:52:18: error: ‘MACH2PHYS_VIRT_START_PAE’
      undeclared (first use in this function)
    <phcoder>  #define PFN_LIST MACH2PHYS_VIRT_START_PAE
    <phcoder> Here is the patch: http://paste.debian.net/64857/
    <youpi> it's defined in xen/public/arch-x86/xen-x86_32.h
    <phcoder> yes it is. Let's see why it's not included
    <phcoder> Hm, for some reason it pulls 64-bit headers in
    <youpi> how do you cross-compile?
    <youpi> I use
    <youpi> ./configure --host=i686-gnu CC='gcc -m32' LD='ld -melf_i386' 
    <phcoder> Yes. GRUB adds those itself
    <phcoder> youpi: confirmed: my patch solves the problem
    <phcoder> any yes: I tried with unpatched master and it fails
    <youpi> ok
    <youpi> phcoder: thanks!

    <phcoder> Now I get plenty of "getcwd: cannot access parent directories:
      Inappropriate file type or format". But I don't think it's grub-related
    <youpi> what do you get before that?
    <youpi> perhaps ext2fs doesn't get properly initialized
    <youpi> which module commande line do you get in the boot log?
    <youpi> perhaps it's simply a typo in there
    <phcoder> http://paste.debian.net/64865/
    <youpi>  $(task-create) $(task-resume) is missing at the end of the ext2fs
      line indeed
    <youpi> in your paste it stops at $(
    <phcoder> this is at the end of my console. I believe it to be
      cosmetic. Let me reset console to some sane state
    <youpi> ok
    <youpi> the spurious event at the start is probably worth checking up
    <youpi> it looks like events that pvgrub2 should have eaten
    <youpi> (in its own drivers, before finishing shutting them down)
    <phcoder> when redirecting console to file: http://paste.debian.net/64868/
    <phcoder> could swapon have sth to do with it?
    <youpi> I'd be surprised
    <phcoder> my guess it's because I use older userland (debian about May) and
      new kernel (fresh from master)
    <youpi> the kernel hasn't really changed
    <youpi> you could rebuild the may-debian kernel with your patch to make
      sure
    <youpi> but probably better trying to fix swapon first, at least
    <youpi> (even if that'd surprise me)
    <youpi> "trying fixiing* swapon", actually
    <youpi> it makes a difference :)
    <phcoder> We actually never eat event on evtchn, we look into buffers to
      check for response
    <youpi> ah, that's why
    <youpi> you should really eat the events too
    <youpi> in principle it wouldn't hurt not to, but you'd probably get
      surprises

    <phcoder> youpi: would doing EVTCHNOP_reset at the end be enough?
    <youpi> possibly, I don't know that one
    <youpi> looks like a good thing to do before handing control, indeed
    <youpi>     /* Clear pending event to avoid unexpected behavior on
      re-bind. */
    <youpi>     evtchn_port_clear_pending(d1, chn1);
    <youpi> yes, it does clear the pending events
    <phcoder> http://paste.debian.net/64870/
    <phcoder> I did this: http://paste.debian.net/64871/
    <youpi> well, closing the event channels would be a good idea too
    <youpi> (reset does not only clear pending events, it also closes the event
      channels)
    <phcoder> well we can't close console one. So it leave to close disk ones
      (the ones we allocated)
    <phcoder> http://paste.debian.net/64875/
    <phcoder> New log: http://paste.debian.net/64876/ (swapon fixed, given 1G
      of memory)
    <youpi> ok, so it really is something else
    <phcoder> looks like there is a space after $(task-resume) but can't tell
      if it's real or comes from message
    <phcoder> tottally artefact

    <phcoder> youpi: this happens when booted in qemu with old kernel now. Now
      my bet is on weird fs corruption or because I accessed it with Linux in
      rw. In any case I feel like it's time to call it a port and commit
    <youpi> I'd say so, yes
    <phcoder> Let's look what's remaining: vfb, vkbd and vif: don't need them
      for first port commit. Hm, there is an issue of default configfile. What
      is pvgrub default behaviour?
    <youpi> iirc it just enters the shell
    <youpi> I had implemented vfb and vkbd to get the graphical support, but
      that's optional indeed
    <youpi> vif is useful for netboot only
    <youpi> ah, no, by default it runs dhcp --with-configfile
    <phcoder> youpi: port committed to trunk
    <youpi> \o/
    <youpi> I was lamenting for 5 years that that wasn't happening :)
    <youpi> Citrix could have asked one of his engineers to work on it, really
    <phcoder> documentation on using the port is still missing though
    <youpi> amazon EC2 users will be happy to upgrade from pv-grub to pv-grub2
      :)
    <youpi> I asked some amazon guy at SuperComputing whether he knew how many
      people were using pv-grub, but he told me that was customer private
      information
    <phcoder> Another interesting idea would be to switch between 64-bit and
      32-bit domains somehow
    <youpi> yes, we were discussing about it at XenSource when I implemented
      pv-grub
    <youpi> that's not really an easy thing
    <youpi> pvh would probably help there, again
    <youpi> in the end, we considered that it was usually not hard to select a
      32bit or 64bit pv-grub depending on the userland bitness
    <youpi> we considered adding a hypercall to change the bitness of a domU,
      but that's really involved
    <phcoder> Well when you discussed i386 domains were still around
    <phcoder> now it's only PAE and amd64 and they are very similar. Only few
      gdt differ
    <youpi> well, switching from 32-PAE to 64 is not *so* hard
    <youpi> since a 32bit-loaded OS can fit in 64bit
    <youpi> the converse is more questionable of course
    <phcoder> yes
    <youpi> still, it's really not easy for the hypervisor
    <youpi> it'd mean converting stuff here and there
    <youpi> most probably missing things here and there :)
    <phcoder> Ok, not that important anyway
    <youpi> we felt it was too dangerous to promise the feature as working :)
    <youpi> heh, 5000 lines patch, just like my patch adding support to Mach :)
    <phcoder> BTW do you know how to check if kernel supports dom0 ? Apparently
      there is feature "privilegied" and dom0 kernels are supposed to have it
      in notes but my linux one doesn't even though I'm in xen now
    <youpi> it's XENFEAT_dom0
    <youpi> called dom0 in the notes
    <phcoder> http://paste.debian.net/64894/
    <youpi> well, maybe the hypervisor doesn't actually check it's there
    <youpi> phcoder: what does grub-mkstandalone?
    <phcoder> puts all modules in memdisk which is embed into core
    <youpi> ah, ok
    <youpi> we didn't have to care about that in grub1 indeed :)
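
The wraparound phcoder describes in the log above ("overflows after 2^32 nanoseconds") can be reproduced with plain integer arithmetic. The sketch below is neither the gnumach code nor the patch from the paste; it only emulates a 64-bit multiplication followed by `>> 32`, using a hypothetical multiplier of `2**31` (that is, 0.5 ns per tick).

```python
MASK64 = (1 << 64) - 1


def scaled_ns_exact(delta, mul):
    """Scale a tick count to nanoseconds with unbounded integers."""
    return (delta * mul) >> 32


def scaled_ns_64bit(delta, mul):
    """Same computation, but with the product truncated to 64 bits."""
    return ((delta * mul) & MASK64) >> 32


mul = 1 << 31  # hypothetical: 0.5 ns per tick
for seconds in (1, 4, 5):
    delta = seconds * 2_000_000_000  # ticks elapsed at 0.5 ns per tick
    print(seconds, scaled_ns_exact(delta, mul), scaled_ns_64bit(delta, mul))

# Elapsed time, in seconds, at which the truncated result wraps around.
print((1 << 32) / 1e9)
```

The two functions agree as long as the scaled result stays below 2^32 ns (about 4.29 seconds); past that point the truncated version restarts from zero, which matches the "about 4 seconds" figure in the log.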


## IRC, freenode, #hurd, 2013-11-09

    <phcoder> youpi: now I get "hd0: dom0's VBD 768
      (/home/phcoder/diskimg/debian-hurd-20130504.img,w) 3001MB"
    <phcoder> but "start ext2fs: ext2fs: device:hd0s1: No such device or
      address"
    <phcoder> disk = [
      'file:/home/phcoder/diskimg/debian-hurd-20130504.img,hda,w' ]
    <phcoder> Hm, using "disk = [ 'phy:loop0,hda1,w' ]" instead worked (loop0
      is an offset loop)
    <youpi> yes, xen disks don't support partitioning
    <youpi> and we haven't migrated userland to userland partitioning yet

[[hurd/libstore/part]].


# Host-side Writeback Caching

Is this optimization possible, as it is with
[[QEMU|hurd/running/qemu/writeback_caching]]?

IRC, freenode, #hurd, 2011-06-08

    <braunr> youpi: does xen provide disk caching options ?
    <youpi> through a blktap, probably
    <braunr> ok


# IRC, freenode, #hurd, 2013-11-09

    <phcoder> youpi: debian-hurd-20130504.img apparently has a kernel without
      xen note. Do I have to do sth special to get xen kernel?
    <youpi> phcoder: there's the -xen package for that
    <youpi> I haven't made the kernel hybrid
    <phcoder> youpi: easiest way is probably to have different entry
      points. You could even just link both of them at different addresses and
      then glue together though it's not very efficient
    <youpi> it's also about all the privileged operations that have to be
      replaced with PV operations
    <youpi> PVH will help with that regard

    <phcoder> youpi: btw, I recommend compiling xen kernel for 686 and drop
      non-pae