summaryrefslogtreecommitdiff
path: root/hurd/porting/guidelines.mdwn
blob: aabf0345d2ebe26724c7b3acd081cd22c128b473 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
[[!meta copyright="Copyright © 2002, 2003, 2005, 2007, 2008, 2009, 2010, 2011,
2012 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled
[[GNU Free Documentation License|/fdl]]."]]"""]]

This is a compilation of common porting problems and their solutions.


Additionally to this page, also see the section *General Porting Issues* of
<http://www.debian.org/ports/hurd/hurd-devel-debian><!-- TODO: merge these two
pages.  -->, as well as further Debian-specific [[running/debian/porting]]
information.

There is a separate page about [[System_API_Limitations]].

You may ask on the [[mailing lists/bug-hurd]] mailing list for details or
questions about fixing bugs.

## <a name="GNU build system"> GNU build system </a>

For a good overview of the components in the GNU build system, see
<http://en.wikipedia.org/wiki/GNU_build_system> and
<http://www.gnu.org/s/hello/manual/autoconf/index.html>.

The GNU build system distinguishes between 'build', 'host' and 'target' machines.
The 'build' machine is where compilers are run, the 'host' machine where the package 
being built will run, and for cross compiling the 'target' machine, on which the compiler
built will generate code for.

When using GNU autotools to configure a package config.guess and config.sub from autotools-dev
are used to find out the build machine identity: CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM.
For GNU/Hurd config.guess gives 'i686-unknown-gnu0.3'. Sometimes a quadruple is used
adding KERNEL, e.g. for Linux on an amd64: 'x86_64-unknown-linux-gnu'. This
is however actually a triple, it just happens that the operating system part
unfortunately contains a '-'. config.sub is used to 
canonicalize on these triplets, e.g. config.sub i686-gnu gives 'i686-pc-gnu'.

On Debian systems the build Makefile is debian/rules and some Debian packages will set $host to 
'i486-pc-gnu'. This is accomplished with the 'dpkg-architecture -qDEB_HOST_GNU_TYPE' construct
forwarded  to configure in debian/rules, e.g. configure --host=$DEB_HOST_GNU_TYPE. 
Another way to set $build, $host etc is via the Debian dh_auto_configure script from the debhelper 
package which uses the Perl code autoconf.pm to find out these variables.

## <a name="autoconf"> Fixing configure.{ac,in} </a>

The GNU/Hurd (and GNU/kFreeBSD) toolchain is extremely close to the GNU/Linux toolchain.
configure.ac thus very often just needs to be fixed by using the same cases as Linux, that is, turn

    switch "$host_os" in
        case linux*)

into

    switch "$host_os" in
       case linux*|k*bsd-gnu*|gnu*)

for a host_os case statement, or

    switch "$host" in
        case *-linux*)

into

    switch "$host" in
       case *-linux*|*-k*bsd-gnu*|*-gnu*)

If separate case is needed, make sure to put *-gnu* *after* *-linux*: 

    switch "$host" in
       case *-linux*|*-k*bsd-gnu*)
           something;;

       case *-gnu*)
           something else;;

because else *-gnu* would catch i386-pc-linux-gnu for instance...

Note: some of such statements are not from the source package itself, but from aclocal.m4 which is actually from libtool. In such case, the package simply needs to be re-libtoolize-d.

## <a name="Undefined_bits_confname_h_tt_mac"> Undefined `bits/confname.h` macros (`PIPE_BUF`, ...) </a>

If macro `XXX` is undefined, but macro `_SC_XXX` or `_PC_XXX` is defined in `bits/confname.h`, you probably need to use `sysconf`, `pathconf` or `fpathconf` to obtain it dynamicaly.

The following macros have been found in this offending situation (add more if you find them): `PIPE_BUF`

An example with `sysconf`: (when you find a `sysconf` offending macro, put a better example)

    #ifndef XXX
    #define XXX sysconf(_SC_XXX)
    #endif
    /* offending code using XXX follows */

An example with `fpathconf`:

    #ifdef PIPE_BUF
       read(fd, buff, PIPE_BUF - 1);
    #else
       read(fd, buff, fpathconf(fd, _PC_PIPE_BUF) - 1);
    #endif
    /* note we can't #define PIPE_BUF, because it depends
       on the "fd" variable */

## <a name="Bad_File_Descriptor"> Bad File Descriptor </a>

If you get Bad File Descriptor error when trying to read from a file (or accessing it at all), check the `open()` invocation. The second argument is the access method. If it is a hard coded number instead of a symbol defined in the standard header files, the code is screwed and should be fixed to either use `O_RDONLY`, `O_WRONLY` or `O_RDWR`. This bug was observed in the `fortunes` and `mtools` packages for example.

## <a name="PATH_MAX_tt_MAX_PATH_tt_MAXPATHL">`PATH_MAX`, `MAX_PATH`, `MAXPATHLEN`, `_POSIX_PATH_MAX`</a>

<http://pubs.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html>

Every unconditionalized use of `PATH_MAX`, `MAX_PATH` or `MAXPATHLEN` is a POSIX incompatibility. If there is no upper limit on the length of a path (as its the case for GNU), this symbol is not defined in any header file. Instead, you need to either use a different implementation that does not rely on the length of a string or use `sysconf()` to query the length at runtime. If `sysconf()` returns -1, you have to use `realloc()` to allocate the needed memory dynamically. Usually it is thus simpler to just use dynamic allocation. Sometimes the amount is actually known. Else, a geometrically growing loop can be used: for instance, see [Alioth patch](http://alioth.debian.org/tracker/download.php/30628/410472/303735/1575/cpulimit-path-max-fix.patch) or [Pulseaudio patch](http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=patch-pulse;att=1;bug=522100). Note that in some cases there are GNU extensions that just work fine: when the `__GLIBC__` macro is defined, `getcwd()` calls can be just replaced by `get_current_dir_name()` calls.

Note: constants such as `_POSIX_PATH_MAX` are only the minimum required value
for a potential corresponding `PATH_MAX` macro.  They are not a replacement for
`PATH_MAX`, just the minimum value that one can assume.

Note 2: Yes, some POSIX functions such as `realpath()` actually assume that
`PATH_MAX` is defined.  This is a bug of the POSIX standard, which got fixed in
the latest revisions, in which one can simply pass `NULL` to get a dynamically
allocated buffer.

## <a name="ARG_MAX"> `ARG_MAX` </a>

Same rationale as `PATH_MAX`. There is no limit on the number of arguments.

## <a name="IOV_MAX"> `IOV_MAX` </a>

Same rationale as `PATH_MAX`. There is no limit on the number of iovec items.

## <a name="MAXHOSTNAMELEN_tt_"> `MAXHOSTNAMELEN` </a>

Same as `PATH_MAX`. When you find a `gethostname()` function, which acts on a static buffer, you can replace it with Neal's [xgethostname function](http://walfield.org/pub/people/neal/xgethostname/) which returns the hostname as a dynamic buffer. For example:

Buggy code:

    char localhost[MAXHOSTNAMELEN];
    ...
    gethostname(localhost, sizeof(localhost));

Fixed code:

    #include "xgethostname.h"
    ...
    char *localhost;
    ...
    localhost = xgethostname();
    if (! localhost)
      {
        perror ("xgethostname");
        return ERROR;
      }
    ...
    /* use LOCALHOST.  */
    free (localhost);

## <a name="NOFILE_tt_"> `NOFILE` </a>

Replace with `getrlimit(RLIMIT_NOFILE,...)`

## <a name="GNU_specific_define_tt_"> </a> GNU specific `#define`

If you need to include specific code for GNU/Hurd using `#if` ... `#endif`, then you can use the `__GNU__` symbol to do so. But think (at least) thrice! before doing so. In most situations, this is completely unnecessary and will create more problems than it may solve. Better ask on the mailing list how to do it right if you can't think of a better solution.

## <a name="sys_errlist_tt_vs_strerror_tt_"> `sys_errlist[]` vs. `strerror()` </a>

If a program has only support for `sys_errlist[]` you will have to do some work to make it compile on GNU, which has dropped support for it and does only provide `strerror()`. Steinar Hamre writes about `strerror()`:

`strerror()` should be used because:

* It is the modern, POSIX way.
* It is localized.
* It handles invalid signals/numbers out of range. (better errorhandling and not a buffer-overflow-candidate/security risk)

`strerror()` should always be used if it is available. Unfortunaly there are still some old non-POSIX systems that do not have `strerror()`, only `sys_errlist[]`.

Today, only supporting `strerror()` is far better than only supporting `sys_errlist[]`. The best (from a portability viewpoint), however is supporting both. For configure.in, you will need:

    AC_CHECK_FUNCS(strerror)

To `config.h.in`, you need to add:

    #undef HAVE_STRERROR

Then something like:

            #ifndef HAVE_STRERROR
            static char *
            private_strerror (errnum)
                 int errnum;
            {
              extern char *sys_errlist[];
              extern int sys_nerr;

              if (errnum > 0 && errnum <= sys_nerr)
                return sys_errlist[errnum];

              return "Unknown system error";
            }
            #define strerror private_strerror
            #endif /* HAVE_STRERROR */

You can for example look in the latest coreutils (the above is a simplified version of what I found there.) Patches should of course be sent to upstream maintainers, this is very useful even for systems with a working `sys_errlist[]`.

Of course, if you don't care about broken systems (like MS-DOG) not supporting `strerror()` you can just replace `sys_errlist[]` directly (upstream might not accept your patch, but debian should have no problem)

## <a name="C++_error_t"> C++, `error_t` and `E*` </a>

On the Hurd, `error_t` is an enumeration of the `E*` constants. However, C++
does not like `E*` integer macros being directly assigned to that enumeration. In short, replace

        error_t err = EINTR;

by

        error_t err = error_t(EINTR);

## <a name="Missing_termio_h_tt_"> Missing `termio.h` </a>

Change it to use `termios.h` (check for it properly with autoconf `HAVE_TERMIOS_H` or the `__GLIBC__` macro)

Also, change calls to `ioctl(fd, TCGETS, ...)` and `ioctl(fd, TCSETS, ...)` with `tcgetattr(fd, ...)` and `tcsetattr(fd, ...)`.

## <a name="AC_HEADER_TERMIO_tt_"> `AC_HEADER_TERMIO` </a>

The autoconf check for `AC_HEADER_TERMIO` tryes to check for termios, but it's only really checking for termio in `termios.h`. It is better to use `AC_CHECK_HEADERS(termio.h termios.h)`

## <a name="_IOT"> missing `_IOT` </a>

This comes from ioctls.  Fixing this is easy if the structure members can be expressed by using the _IOT() macro, else it's simply impossible... See `bits/termios.h` for an instance:

`#define _IOT_termios /* Hurd ioctl type field.  */ \
  _IOT (_IOTS (tcflag_t), 4, _IOTS (cc_t), NCCS, _IOTS (speed_t), 2)`

The rationale behind is that on the Hurd ioctl numbers actually encode how the
data should be transferred via RPC: here `struct termios` holds 4 members of
type `tcflag_ts`, then `NCCS` members of type `cc_tsi` and finaly 2 members of
type `speed_ts`, so the RPC mecanism will know how to transfer them.

As you can see, this limits the number of contiguous kinds of members to 3, and
in addition to that (see the bitfield described in `ioctls.h`), the third kind
of member is limited to 3 members.  This is a design limitation, there is no way
to overcome it at the moment.

Note: if a field member is a pointer, then the ioctl can't be expressed
this way, and that makes sense, since the server you're talking to
doesn't have direct access to your memory. Ways other than ioctls must
then be found.

## <a name="SA_SIGINFO"> `SA_SIGINFO` </a>

Implemented by Jérémie Koenig, pending upload in Debian eglibc 2.13-19.

## <a name="SA_NOCLDWAIT"> `SA_NOCLDWAIT` </a>

Not implemented yet.

## <a name="SOL_IP"> `SOL_IP` </a>

Not implemented yet.

## <a name="HZ"> `HZ` </a>

Linuxish and doesn't even make sense since the value may vary according to the running kernel.  Should use `sysconf(_SC_CLK_TCK)` or `CLK_TCK` instead.

## <a name="SIOCDEVPRIVATE"> `SIOCDEVPRIVATE` </a>

Oh, we should probably provide it.

## <a name="MAP_NORESERVE"> `MAP_NORESERVE` </a>

Not POSIX, but we could implement it.
See [[glibc/mmap]].

## <a name="O_DIRECT"> `O_DIRECT` </a>

Long story to implement.

## <a name="PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP"> `PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP` </a>

We could easily provide it;

## <a name="PTHREAD_STACK_MIN"> `PTHREAD_STACK_MIN` </a>

We should actually provide it.

## <a name="types"> `linux/types.h` or `asm/types.h` </a>

These are not POSIX, `sys/types.h` and `stdint.h` should be used instead.

## <a name="iopl"> `iopl` </a>

Not supported and actually very dangerous (permits userland to completely disable interruptions...). Replace with `ioperm(0, 65536, 1)`.

## <a name="iopl"> `semget`, `sem_open` </a>

Not implemented, will always fail.  Use `sem_init()` instead if possible.

## <a name="net/..."> `net/if_arp.h`, `net/ethernet.h`, etc. </a>

Not implemented, not POSIX.  Try to disable the feature in the package.

## <a name="parport"> <linux/parport.h> <linux/ppdev.h> </a>

There is no programming interface for the parallel port on GNU/Hurd yet.

## <a name="cdrom"> <linux/cdrom.h> </a>

Use <sys/cdrom.h> instead.

## <a name="baud"> CBAUD </a>

This is not actually standard; cfsetspeed, cfsetispeed, or cfsetospeed should be used instead.

## <a name="errno"> `errno` values </a>

When dealing with `errno`, you should always use the predefined error codes defined with the `E*` constants, instead of manually comparing/assigning/etc with their values.

For example (C/C++):

    /* check whether it does not exist */
    if (errno == 2)
      ...

or Python:

    # check whether it does not exist
    try:
      ...
    except OSError, err:
      err.errno == 2:
        ...

This is wrong, as [the actual values of the `E*` are unspecified (per POSIX)](http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_03.html#tag_02_03). You must always use the predefined constants for the possible errors.

For example (C/C++):

    /* check whether it does not exist */
    if (errno == ENOENT)
      ...

With Python, you can use the [`errno` module](http://docs.python.org/library/errno.html) for the various constants:

    # check whether it does not exist
    try:
      ...
    except OSError, err:
      import errno
      err.errno == errno.ENOENT:
        ...

## <a name="libdl"> undefined reference to `dlopen`, `dlsym`, `dlclose` </a>

Configure script often hardcode the library that contains dlopen & such (`-ldl'), and only for Linux. Simply add the other GNU OS cases: replace `linux*' with `linux*|gnu*|k*bsd*-gnu`

## <a name="linux_headers"> Missing `linux/types.h`, `asm/types.h`, `linux/limits.h`, `asm/byteorder.h`, `sys/endian.h`, `asm/ioctl.h`, `asm/ioctls.h`, `linux/soundcard.h` </a>

These are often used (from lame rgrep results) instead of their standard equivalents: `sys/types.h` (or `stdint.h` for fixed-size types), `limits.h`, `endian.h`, `sys/ioctl.h`, `sys/soundcard.h`

## <a name="linux_features"> Missing `sys/*.h`, `linux/*.h`</a>

These are linuxish things, they may not have Hurd equivalents yet, better disable the code.