summaryrefslogtreecommitdiff
path: root/open_issues/boehm_gc.mdwn
blob: 7f860bba95325615a76f0d7610f920ff00ec3fb5 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
[[!meta copyright="Copyright © 2010, 2012 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

Here's what's to be done for maintaining Boehm GC.

This one does need Hurd-specific configuration.

It is, for example, used by [[/GCC]] (which has its own fork), so any changes
committed upstream should very like also be made there.

[[!toc levels=2]]


# [[General information|/boehm_gc]]


# Configuration

<!--

git checkout reviewed
git log --reverse --pretty=fuller --stat=$COLUMNS,$COLUMNS -w -p -C --cc ..upstream/master
-i
/^commit |^---$|hurd|linux|glibc

-->

Last reviewed up to the 5f492b98dd131bdd6c67eb56c31024420c1e7dab (2012-06-08)
sources, and for `libatomic_ops` to the
6a0afde033f105c6320f1409162e3765a1395bfd (2012-05-15) sources.

  * `configure.ac`

      * `PARALLEL_MARK` is not enabled; doesn't make sense so far.

      * `*-*-kfreebsd*-gnu` defines `USE_COMPILER_TLS`.  What's this, and
        why does not other config?

      * TODO

            [ if test "$enable_gc_debug" = "yes"; then
                AC_MSG_WARN("Should define GC_DEBUG and use debug alloc. in clients.")
                AC_DEFINE([KEEP_BACK_PTRS], 1,
                          [Define to save back-pointers in debugging headers.])
                keep_back_ptrs=true
                AC_DEFINE([DBG_HDRS_ALL], 1,
                          [Define to force debug headers on all objects.])
                case $host in
                  x86-*-linux* | i586-*-linux* | i686-*-linux* | x86_64-*-linux* )
                    AC_DEFINE(MAKE_BACK_GRAPH)
                    AC_MSG_WARN("Client must not use -fomit-frame-pointer.")
                    AC_DEFINE(SAVE_CALL_COUNT, 8)
                  ;;
            AM_CONDITIONAL([KEEP_BACK_PTRS], [test x"$keep_back_ptrs" = xtrue])

  * `configure.host`

    Nothing.

  * `Makefile.am`, `include/include.am`, `cord/cord.am`, `doc/doc.am`,
    `tests/tests.am`

    Nothing.

  * `include/gc_config_macros.h`

    Should be OK.

  * `include/private/gcconfig.h`

    Hairy.  But should be OK.  Search for *HURD*, compare to *LINUX*,
    *I386* case.

    See `doc/porting.html` and `doc/README.macros` (and others) for
    documentation.

    *LINUX* has:

      * `#define LINUX_STACKBOTTOM`

        Defined instead of `STACKBOTTOM` to have the value read from `/proc/`.

      * `#define HEAP_START (ptr_t)0x1000`

        May want to define it for us, too?

      * `#ifdef USE_I686_PREFETCH`, `USE_3DNOW_PREFETCH` --- [...]

        Apparently these are optimization that we also could use.  Have a
        look at *LINUX* for *X86_64*, which uses `__builtin_prefetch`
        (which Linux x86 could use, too?).

      * TODO

            #if defined(LINUX) && defined(USE_MMAP)
                /* The kernel may do a somewhat better job merging mappings etc.    */
                /* with anonymous mappings.                                         */
            #   define USE_MMAP_ANON
            #endif

      * TODO

            #if defined(GC_LINUX_THREADS) && defined(REDIRECT_MALLOC)
                /* Nptl allocates thread stacks with mmap, which is fine.  But it   */
                /* keeps a cache of thread stacks.  Thread stacks contain the       */
                /* thread control blocks.  These in turn contain a pointer to       */
                /* (sizeof (void *) from the beginning of) the dtv for thread-local */
                /* storage, which is calloc allocated.  If we don't scan the cached */
                /* thread stacks, we appear to lose the dtv.  This tends to         */
                /* result in something that looks like a bogus dtv count, which     */
                /* tends to result in a memset call on a block that is way too      */
                /* large.  Sometimes we're lucky and the process just dies ...      */
                /* There seems to be a similar issue with some other memory         */
                /* allocated by the dynamic loader.                                 */
                /* This should be avoidable by either:                              */
                /* - Defining USE_PROC_FOR_LIBRARIES here.                          */
                /*   That performs very poorly, precisely because we end up         */
                /*   scanning cached stacks.                                        */
                /* - Have calloc look at its callers.                               */
                /*   In spite of the fact that it is gross and disgusting.          */
                /* In fact neither seems to suffice, probably in part because       */
                /* even with USE_PROC_FOR_LIBRARIES, we don't scan parts of stack   */
                /* segments that appear to be out of bounds.  Thus we actually      */
                /* do both, which seems to yield the best results.                  */
            
            #   define USE_PROC_FOR_LIBRARIES
            #endif

      * TODO

            # if defined(GC_LINUX_THREADS) && defined(REDIRECT_MALLOC) \
                 && !defined(INCLUDE_LINUX_THREAD_DESCR)
                /* Will not work, since libc and the dynamic loader use thread      */
                /* locals, sometimes as the only reference.                         */
            #   define INCLUDE_LINUX_THREAD_DESCR
            # endif

      * TODO

            # if defined(UNIX_LIKE) && defined(THREADS) && !defined(NO_CANCEL_SAFE) \
                 && !defined(PLATFORM_ANDROID)
                /* Make the code cancellation-safe.  This basically means that we   */
                /* ensure that cancellation requests are ignored while we are in    */
                /* the collector.  This applies only to Posix deferred cancellation;*/
                /* we don't handle Posix asynchronous cancellation.                 */
                /* Note that this only works if pthread_setcancelstate is           */
                /* async-signal-safe, at least in the absence of asynchronous       */
                /* cancellation.  This appears to be true for the glibc version,    */
                /* though it is not documented.  Without that assumption, there     */
                /* seems to be no way to safely wait in a signal handler, which     */
                /* we need to do for thread suspension.                             */
                /* Also note that little other code appears to be cancellation-safe.*/
                /* Hence it may make sense to turn this off for performance.        */
            #   define CANCEL_SAFE
            # endif

      * `CAN_SAVE_CALL_ARGS` vs. -fomit-frame-pointer now being on by
        default for Linux x86 IIRC?  (Which is an [[!taglink
        open_issue_gcc]] for not including us.)

      * TODO

            # if defined(REDIRECT_MALLOC) && defined(THREADS) && !defined(LINUX)
            #   error "REDIRECT_MALLOC with THREADS works at most on Linux."
            # endif


    *HURD* has:

      * `#define STACK_GROWS_DOWN`

      * `#define HEURISTIC2`

        Defined instead of `STACKBOTTOM` to have the value probed.

        Linux also has this:

            #if defined(LINUX_STACKBOTTOM) && defined(NO_PROC_STAT) \
                && !defined(USE_LIBC_PRIVATES)
                /* This combination will fail, since we have no way to get  */
                /* the stack base.  Use HEURISTIC2 instead.                 */
            #   undef LINUX_STACKBOTTOM
            #   define HEURISTIC2
                /* This may still fail on some architectures like IA64.     */
                /* We tried ...                                             */
            #endif

        Being on [[glibc]], we could perhaps do similar as `USE_LIBC_PRIVATES`
        instead of `HEURISTIC2`.  Pro: avoid `SIGSEGV` (and general fragility)
        during probing at startup (if I'm understanding this correctly).  Con:
        rely on glibc internals.  Or we instead add support to parse
        [[`/proc/`|hurd/translator/procfs]] (can even use the same as Linux?),
        or use some other interface.  [[!tag open_issue_glibc]]

      * `#define SIG_SUSPEND SIGUSR1`, `#define SIG_THR_RESTART SIGUSR2`

      * We don't `#define MPROTECT_VDB` (WIP comment); but Linux neither.

      * Where does our `GETPAGESIZE` come from?  Should we `#include
        <unistd.h>` like it is done for *LINUX*?

  * `include/gc_pthread_redirects.h`

      * TODO

        Cancellation stuff is Linux-only.  In other places, too.

  * `mach_dep.c`

      * `#define NO_GETCONTEXT`

        [[!taglink open_issue_glibc]], but this is not a real problem here,
        because we can use the following GCC internal function without much
        overhead:

      * `GC_with_callee_saves_pushed`

        The `HAVE_BUILTIN_UNWIND_INIT` case is ours.

  * `os_dep.c`

      * `read`

        Sure that it doesn't internally (in [[glibc]]) use `malloc`.  Probably
        only / mostly (?) a problem for `--enable-redirect-malloc`
        configurations?  Linux with threads uses `readv`.

      * TODO.

  * `dyn_load.c`

    For `DYNAMIC_LOADING`.  TODO.

  * `pthread_support.c`, `pthread_stop_world.c`

    TODO.

  * TODO.

    Other files also contain *LINUX* and other conditionals.

  * `libatomic_ops/`

      * `configure.ac`

        Nothing.

      * `Makefile`, `src/Makefile`, `src/atomic_ops/Makefile`,
        `src/atomic_ops/sysdeps/Makefile`, `doc/Makefile`, `tests/Makefile`

        Nothing.

      * `src/atomic_ops/sysdeps/gcc/x86.h`

        Nothing.

  * b8b65e8a5c2c4896728cd00d008168a6293f55b1 configure.ac probably not all
    correct.

  * `mmap`, b64dd3bc1e5a23e677c96b478d55648a0730ab75

  * `parallel mark`, 07c2b8e455c9e70d1f173475bbf1196320812154, pass
    `--disable-parallel-mark` or enable for us, too?

  * `HANDLE_FORK`, e9b11b6655c45ad3ab3326707aa31567a767134b,
    806d656802a1e3c2b55cd9e4530c6420340886c9,
    1e882b98c2cf9479a9cd08a67439dab7f9622924

  * Check `include/private/thread_local_alloc.h` re
    `USE_COMPILER_TLS`/`USE_PTHREAD_SPECIFIC`.


# Build

Here's a log of a binutils build run; this is from the
5f492b98dd131bdd6c67eb56c31024420c1e7dab (2012-06-08) sources, and for
`libatomic_ops` for the 6a0afde033f105c6320f1409162e3765a1395bfd (2012-05-15)
sources, run on kepler.SCHWINGE and coulomb.SCHWINGE.

    $ export LC_ALL=C
    $ (cd ../master/ && ln -sfn ../libatomic_ops/master libatomic_ops)
    $ (cd ../master/ && autoreconf -vfi)
    $ ../master/configure --prefix="$PWD".install SHELL=/bin/bash CC=gcc-4.6 CXX=g++-4.6 --enable-cplusplus --enable-gc-debug --enable-gc-assertions --enable-assertions 2>&1 | tee log_build
    [...]
    $ make 2>&1 | tee log_build_
    [...]

Different hosts may default to different shells and compiler versions; thus
harmonized.  Using bash instead of dash as otherwise libtool explodes.

This takes up around X MiB, and needs roughly X min on kepler.SCHWINGE and
X min on coulomb.SCHWINGE.

<!--

    $ (make && touch .go-install) 2>&1 | tee log_build_ && test -f .go-install && (make install && touch .go-check) 2>&1 | tee log_install && test -f .go-check && { make -k check 2>&1 | tee log_check; (cd libatomic_ops/ && make -k check) 2>&1 | tee log_check_; }

-->

## Analysis

    $ ssh kepler.SCHWINGE 'cd tmp/source/boehm-gc/ && cat master.build/log_build* | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/linux/log_build
    $ ssh coulomb.SCHWINGE 'cd tmp/boehm-gc/ && cat master.build/log_build* | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/hurd/log_build
    $ diff -wu <(sed -f toolchain/logs/boehm-gc/linux/log_build.sed < toolchain/logs/boehm-gc/linux/log_build) <(sed -f toolchain/logs/boehm-gc/hurd/log_build.sed < toolchain/logs/boehm-gc/hurd/log_build) > toolchain/logs/boehm-gc/log_build.diff

  * only GNU/Linux: `configure: WARNING: "Explicit GC_INIT() calls may be
    required."`

  * only GNU/Linux: `configure: WARNING: "Client must not use
    -fomit-frame-pointer."`


# Install

    $ make install 2>&1 | tee log_install
    [...]

This takes up around X MiB, and needs roughly X min on kepler.SCHWINGE and X
min on coulomb.SCHWINGE.


## Analysis

    $ ssh kepler.SCHWINGE 'cd tmp/source/boehm-gc/ && cat master.build/log_install | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/linux/log_install
    $ ssh coulomb.SCHWINGE 'cd tmp/boehm-gc/ && cat master.build/log_install | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/hurd/log_install
    $ diff -wu toolchain/logs/boehm-gc/linux/log_install toolchain/logs/boehm-gc/hurd/log_install > toolchain/logs/boehm-gc/log_install.diff


# Testsuite

    $ make -k check
    [...]
    $ (cd libatomic_ops/ && make -k check)
    [...]

This needs roughly X min on kepler.SCHWINGE and X min on coulomb.SCHWINGE.


## Analysis

    $ ssh kepler.SCHWINGE 'cd tmp/source/boehm-gc/ && cat master.build/log_check* | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/linux/log_check
    $ ssh coulomb.SCHWINGE 'cd tmp/boehm-gc/ && cat master.build/log_check* | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/hurd/log_check
    $ diff -wu <(sed -f toolchain/logs/boehm-gc/linux/log_check.sed < toolchain/logs/boehm-gc/linux/log_check) <(sed -f toolchain/logs/boehm-gc/hurd/log_check.sed < toolchain/logs/boehm-gc/hurd/log_check) > toolchain/logs/boehm-gc/log_check.diff

There are different configurations possible, but in general, the testsuite
restults of GNU/Linux and GNU/Hurd look very similar.

  * GNU/Hurd is missing `Call chain at allocation: [...]` output.

    `os_dep.c`:`GC_print_callers`


# TODO

  * What are other applications to test Boehm GC?  Also especially in
    combination with [[/libpthread]] and dynamic loading of shared libraries?

      * There are patches (apparently not committed) that GCC itself can use
        it, too: <http://gcc.gnu.org/wiki/Garbage_collection_tuning>.

      * There's been some talking about it on GNU guile mailing lists, and two
        Git branches (2010-12-15: last change 2009-09).

      * <http://www.hpl.hp.com/personal/Hans_Boehm/gc/#users>


## IRC, OFTC, #debian-hurd, 2012-02-05

[[!tag open_issue_porting]]

    <pinotree> youpi: i think i found out the possible cause of the ecl and
      mono issuess
    <pinotree> -s
    <youpi> oh
    <pinotree> basically, we don't have the realtime signals (so no
      SIGRTMIN/SIGRTMAX defined), hence things use either SIGUSR1 or
      SIGUSR2... which are used in libgc to resp. stop/resume threads when
      "collecting"
    <pinotree> i just patched ecl to use SIGINFO instead of SIGUSR1 (used when
      no SIGRTMIN+2 is available), and it seems going on for a while
    <youpi> uh, why would SIGINFO work better than SIGUSR1?
    <pinotree> it was a test, i tried the first "not common" signal i saw
    <pinotree> my test was, use any signal different than USR1/2
    <youpi> ah, sorry, I hadn't understood
    <youpi> you mean there's a conflict between ecl and mono using SIGUSR1, as
      well as libgc?
    <pinotree> yes
    <pinotree> for example, in ecl sources see src/c/unixint.d,
      install_process_interrupt_handler()
    <youpi> SIGINFO seems a sane choice
    <youpi> SIGPWR could have been a better choice if it was available :)
    <pinotree> i would have chose an "unassigned" number, say SIGLOST (the
      bigger one) + 10, but it would be greater than _NSIG (and thus discarded)
    <youpi> not a good idea indeed
    <pinotree> it seems that linux, beside the range for rt signals, has some
      "free space"
    <pinotree> i'll start now another ecl build, from scratch this time, with
      s/SIGUSR1/SIGINFO/ (making sure ctags won't bother), and if it works i'll
      update svante's bug

    <pinotree> mmap(...PROT_NONE...) failed
    <pinotree> hmm...
    <pinotree> apparently enabling MMAP_ANON in mono's libgc copy was a good
      step, let's see


### IRC, OFTC, #debian-hurd, 2012-03-18

    <pinotree> youpi: mono is afflicted by the SIGUSR1/2 conflict with libgc
    <youpi> pinotree: didn't we have a solution for that?
    <pinotree> well, it works just for one signal
    <pinotree> the ideal solution would be having a range for RT signals, and
      make libgc use RTMIN+5/6, like done on most of other OSes
    <youpi> but we don't have RT signals, do we?
    <pinotree> right :(


### IRC, freenode, #hurd, 2012-03-21

    <pinotree> civodul: given we have to realtime signals (so no range of
      signals for them), libgc uses SIGUSR1/2 instead of using SIGRTMIN+5/6 for
      its thread synchronization stuff
    <pinotree> civodul: which means that if an application using libgc then
      sets its own handlers for either of SIGUSR1/2, hell breaks
    <civodul> pinotree: ok
    <civodul> pinotree: is it a Debian-specific change, or included upstream?
    <pinotree> libgc using SIGUSR1/2? upstream
    <civodul> ok