summaryrefslogtreecommitdiff
path: root/glibc/mmap.mdwn
diff options
context:
space:
mode:
authorThomas Schwinge <thomas@codesourcery.com>2012-12-20 22:20:05 +0100
committerThomas Schwinge <thomas@codesourcery.com>2012-12-20 22:20:05 +0100
commitfc4d1650f3e35a1cff0111ae3808c61d44346f1f (patch)
tree6be97c865d896e3e49411ede9870410e0c4f6479 /glibc/mmap.mdwn
parentc77f17cfb827c17de7f1d5318cbbbeea03286715 (diff)
glibc/mmap: Extend.
Diffstat (limited to 'glibc/mmap.mdwn')
-rw-r--r--glibc/mmap.mdwn402
1 files changed, 345 insertions, 57 deletions
diff --git a/glibc/mmap.mdwn b/glibc/mmap.mdwn
index 09b0b65d..cddd0584 100644
--- a/glibc/mmap.mdwn
+++ b/glibc/mmap.mdwn
@@ -8,91 +8,379 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
-There are two implementations of `mmap` for GNU Hurd:
-`sysdeps/mach/hurd/mmap.c` (main implementation) and
-`sysdeps/mach/hurd/dl-sysdep.c` (*Minimal mmap implementation sufficient for
-initial loading of shared libraries.*).
+The `mmap` call is generally supported on GNU Hurd, as indicated by
+`_POSIX_MAPPED_FILES` (`sysconf (_SC_MAPPED_FILES)`).
- * `MAP_COPY`
- What exactly is that? `elf/dl-load.c` has some explanation.
- <http://lkml.indiana.edu/hypermail/linux/kernel/0110.1/1506.html>
+# Flags
- It is only handled in `dl-sysdep.c`, when `flags & (MAP_COPY|MAP_PRIVATE)`
- is used for `vm_map`'s `copy` parameter, and `mmap.c` uses `! (flags &
- MAP_SHARED)` instead, which seems inconsistent?
+*Flags contain mapping type, sharing type and options.*
+ * *Mapping type (must choose one and only one of these).*
-# `io_map` Failure
+ * `MAP_FILE` (*Mapped from a file or device.*)
-This is the [[libnetfs: `io_map`|open_issues/libnetfs_io_map]] issue.
+ * `MAP_ANON`/`MAP_ANONYMOUS` (*Allocated from anonymous virtual memory.*)
-[[!tag open_issue_glibc]]
+ Even though it is not defined to zero (it is for the Linux kernel; why not
+ for us?), `MAP_FILE` is the default and can be omitted.
-Review of `mmap` usage in generic bits of glibc, based on
-a1bcbd4035ac2483dc10da150d4db46f3e1744f8 (2012-03-11), listing these cases
-where failure (due to `io_map` failing; that is, invocations where a `fd` is
-passed) is not properly handled.
+ * *Sharing types (must choose one and only one of these).*
-`catgets/open_catalog.c`, `iconv/gconv_cache.c`, `intl/loadmsgcat.c`,
-`locale/loadlocale.c` have fallback code for the `MAP_FAILED` case.
+ * `MAP_SHARED` (*Share changes.*)
-[[tschwinge]]'s current plan is to make the following cases do the same (if
-that is possible); probably by introducing a generic `mmap_or_read` function,
-that first tries `mmap` (and that will succeed on Linux-based systems and also
-on Hurd-based, if it's backed by [[hurd/libdiskfs]]), and if that fails tries
-`mmap` on anonymous memory and then fills it by `read`ing the required data.
-This is also what the [[hurd/exec]] server is doing (and is the reason that the
-`./true` invocation on [[libnetfs: `io_map`|open_issues/libnetfs_io_map]]
-works, to my understanding): see `exec.c:prepare`, if `io_map` fails,
-`e->filemap == MACH_PORT_NULL`; then `exec.c:map` (as invoked from
-`exec.c:load_section`, `exec.c:check_elf`, `exec.c:do_exec`, or
-`hashexec.c:check_hashbang`) will use `io_read` instead.
+ * `MAP_PRIVATE` (*Changes private; copy pages on write.*)
-Doing so potentially means reading in a lot of unused data -- but we probably
-can't do any better?
+ * `MAP_COPY` (*Virtual copy of region at mapping time.*)
-In parallel (or even alternatively?), it should be researched how Linux (or any
-other kernel) implements `mmap` on NFS and similar file systems, and then
-implement the same in [[hurd/libnetfs]] and/or [[hurd/translator/nfs]], etc.
+ For us, `MAP_PRIVATE` is the default (is defined to zero), for the Linux
+ kernel, one of `MAP_SHARED` or `MAP_PRIVATE` has to be specified
+ explicitly.
+
+ The Linux kernel does not support `MAP_COPY`, and as per the comment in
+ `elf/dl-load.c`, `MAP_PRIVATE | MAP_DENYWRITE` is Linux' replacement for
+ `MAP_COPY`. However, `MAP_DENYWRITE` is defunct (`mmap` manpage).
+
+ In contrast to `MAP_COPY`, for `MAP_PRIVATE` *it is unspecified whether
+ changes made to the file after the `mmap` call are visible in the mapped
+ region* (`mmap` manpage).
+
+ `MAP_COPY`:
+
+ What exactly is that? `elf/dl-load.c` has some explanation.
+ <http://lkml.indiana.edu/hypermail/linux/kernel/0110.1/1506.html>
+
+ It is only handled in `dl-sysdep.c`, when `flags &
+ (MAP_COPY|MAP_PRIVATE)` is used for
+ [[`vm_map`|microkernel/mach/interface/vm_map]]'s `copy` parameter, and
+ `mmap.c` uses `! (flags & MAP_SHARED)` instead, which seems
+ inconsistent?
+
+ Usage in glibc:
+
+ * `catgets/open_catalog.c:__open_catalog`,
+ `locale/loadlocale.c:_nl_load_locale`: *Linux seems to lack read-only
+ copy-on-write.*
+
+ * `MAP_TYPE` (*Mask for type field.*/*Mask for type of mapping.*)
+
+ [[!tag open_issue_glibc]]In `bits/mman.h` this is described and defined to
+ be a mask for the *mapping* type, in the `bits/mman.h` files corresponding
+ to Linux kernel it is described an defined to be a mask for the *sharing*
+ type.
+
+ * *Other flags.*
+
+ * `MAP_FIXED` (*Map address must be exactly as requested.*)
+
+ If the memory region is already in use, an unmap is attempted before
+ (re-)mapping it.
+
+ [[!tag open_issue_glibc]]The following text should be improved:
+
+ `[glibc]/llio.texi` says:
+
+ @var{address} gives a preferred starting address for the mapping.
+ @code{NULL} expresses no preference. Any previous mapping at that
+ address is automatically removed. [...]
+
+ The comments in `misc/sys/mman.h`, `misc/mmap.c`, `misc/mmap64.c`,
+ `ports/sysdeps/unix/sysv/linux/hppa/mmap.c`, and
+ `sysdeps/mach/hurd/mmap.c` have a better wording:
+
+ A successful `mmap' call
+ deallocates any previous mapping for the affected region.
+
+ This is correct insofar that for `MAP_FIXED` indeed it is first
+ unmapped if already in use, and for the regular cases, an address will
+ be chosen that has no previous mapping.
+
+ * `MAP_NOEXTEND` (*For `MAP_FILE`, don't change file size.*)
+
+ Referenced in `[hurd]/TODO` as unimplemented.
+
+ * `MAP_HASSEMPHORE` (*Region may contain semaphores.*)
+
+ * `MAP_INHERIT` (*Region is retained after exec.*)
+
+ * Linux-specific flags
+
+ * `MAP_GROWSDOWN` (*Stack-like segment.*), `MAP_GROWSUP` (*Register
+ stack-like segment.*)
+
+ See `mmap` manpage.
+
+ * `MAP_DENYWRITE` (*`ETXTBSY`*)
+
+ As per the comment in `elf/dl-load.c`, `MAP_PRIVATE | MAP_DENYWRITE` is
+ Linux' replacement for `MAP_COPY`. However, `MAP_DENYWRITE` is defunct
+ (`mmap` manpage).
+
+ * `MAP_EXECUTABLE` (*Mark it as an executable.*)
+
+ * `MAP_LOCKED` (*Lock the mapping.*)
+
+ ... à la `mlock`. Not implemented for us, but probably
+ could[[open_issue_glibc]].
+
+ * `MAP_NORESERVE` (*Don't check for reservations.*)
+
+ See `mmap` manpage.
+
+ From [[hurd/porting/guidelines]]: *Not POSIX, but we could implement
+ it.*
+
+ * `MAP_POPULATE` (*Populate (prefault) pagetables.*)
+
+ From the `mmap` manpage:
+
+ Populate (prefault) page tables for a mapping. For a file mapping,
+ this causes read-ahead on the file. Later accesses to the mapping
+ will not be blocked by page faults. MAP_POPULATE is only supported
+ for private mappings since Linux 2.6.23.
+
+ Unknown Linux kernel version, `mm/mmap.c`:
+
+ if (vm_flags & VM_LOCKED) {
+ if (!mlock_vma_pages_range(vma, addr, addr + len))
+ mm->locked_vm += (len >> PAGE_SHIFT);
+ } else if ((flags & MAP_POPULATE) && !(flags & MAP_NONBLOCK))
+ make_pages_present(addr, addr + len);
+ return addr;
+
+ Is only advisory, so can worked around with `#define MAP_POPULATE 0`,
+ 8069478040336a7de3461be275432493cc7e4c91.
+
+ * `MAP_NONBLOCK` (*Do not block on IO.*)
+
+ From the `mmap` manpage:
+
+ Only meaningful in conjunction with MAP_POPULATE. Don't perform
+ read-ahead: only create page tables entries for pages that are
+ already present in RAM. Since Linux 2.6.23, this flag causes
+ MAP_POPULATE to do nothing. One day the combination of
+ MAP_POPULATE and MAP_NONBLOCK may be reimplemented.
+
+ * `MAP_STACK` (*Allocation is for a stack.*)
-Here, also probably the whole mapping region [has to be
-read](http://lists.gnu.org/archive/html/bug-hurd/2001-10/msg00306.html) at
-`mmap` time.
+ See `mmap` manpage.
-List of files without fallback code for the *`MAP_FAILED` due to `io_map`
-failed* case:
+ * `MAP_HUGETLB` (*Create huge page mapping.*)
- * `elf/cache.c`
+ See `mmap` manpage.
- * `elf/dl-load.c`
+ * `MAP_32BIT` (*Only give out 32-bit addresses.*)
- * `elf/dl-misc.c`
+ See `mmap` manpage.
- * `elf/dl-profile.c`
- * `elf/readlib.c`
+# Implementation
- * `elf/sprof.c`
+Essentially, `mmap` is implemented by means of
+[[`io_map`|hurd/interface/io_map]] (not for `MAP_ANON`) followed by
+[[`vm_map`|microkernel/mach/interface/vm_map]].
- * `locale/loadarchive.c`
+There are two implementations: `sysdeps/mach/hurd/mmap.c` (main implementation)
+and `sysdeps/mach/hurd/dl-sysdep.c` (*Minimal mmap implementation sufficient
+for initial loading of shared libraries.*).
- * `locale/programs/locale.c`
- * `locale/programs/locarchive.c`
+## `mmap ("/dev/zero")`
- * `nscd/connections.c`
+[[!tag open_issue_glibc open_issue_hurd]]Do we implement that (equivalently to
+`MAP_ANON`)?
- * `nscd/nscd_helper.c`
- * `nss/makedb.c`
+## Mapping Size
- * `nss/nss_db/db-open.c`
+From the `mmap` manpage:
- * Omitted:
+ A file is mapped in multiples of the page size. For a file that is not a
+ multiple of the page size, the remaining memory is zeroed when mapped, and
+ writes to that region are not written out to the file. The effect of
+ changing the size of the underlying file of a mapping on the pages that
+ correspond to added or removed regions of the file is unspecified.
- * `nptl/`
+[[!tag open_issue_glibc]]Do we implement that?
- * `sysdeps/unix/sparc/`
- * `sysdepts/unix/sysv/linux/`
+## Use of a Mapped Region
+
+From the `mmap` manpage:
+
+ Use of a mapped region can result in these signals:
+
+ SIGSEGV Attempted write into a region mapped as read-only.
+
+ SIGBUS Attempted access to a portion of the buffer that does not
+ correspond to the file (for example, beyond the end of the file,
+ including the case where another process has truncated the file).
+
+[[!tag open_issue_glibc]]Do we implement that?
+
+
+# Usage in glibc itself
+
+Review of `mmap` usage in generic bits of glibc (omitted: `nptl/`,
+`sysdeps/unix/sparc/`, `sysdepts/unix/sysv/linux/`), based on
+a1bcbd4035ac2483dc10da150d4db46f3e1744f8 (2012-03-11). `MAP_FILE` is the
+interesting case; `MAP_ANON` is generally fine. Some of the `mmap` usages in
+glibc have fallback code for the `MAP_FAILED` case, some do not.
+
+ catgets/open_catalog.c: (struct catalog_obj *) __mmap (NULL, st.st_size, PROT_READ,
+ catgets/open_catalog.c- MAP_FILE|MAP_COPY, fd, 0);
+
+Has fallback for `MAP_FAILED`.
+
+ elf/cache.c: = mmap (NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
+ elf/cache.c: = mmap (NULL, aux_cache_size, PROT_READ, MAP_PRIVATE, fd, 0);
+
+No fallback for `MAP_FAILED`.
+
+ elf/dl-load.c: l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
+ elf/dl-load.c- c->prot,
+ elf/dl-load.c- MAP_COPY|MAP_FILE,
+ elf/dl-load.c- fd, c->mapoff);
+ elf/dl-load.c: && (__mmap ((void *) (l->l_addr + c->mapstart),
+ elf/dl-load.c- c->mapend - c->mapstart, c->prot,
+ elf/dl-load.c- MAP_FIXED|MAP_COPY|MAP_FILE,
+ elf/dl-load.c- fd, c->mapoff)
+
+No fallback for `MAP_FAILED`.
+
+ elf/dl-misc.c: result = __mmap (NULL, *sizep, prot,
+ elf/dl-misc.c-#ifdef MAP_COPY
+ elf/dl-misc.c- MAP_COPY
+ elf/dl-misc.c-#else
+ elf/dl-misc.c- MAP_PRIVATE
+ elf/dl-misc.c-#endif
+ elf/dl-misc.c-#ifdef MAP_FILE
+ elf/dl-misc.c- | MAP_FILE
+ elf/dl-misc.c-#endif
+ elf/dl-misc.c- , fd, 0);
+
+No fallback for `MAP_FAILED`.
+
+ elf/dl-profile.c: addr = (struct gmon_hdr *) __mmap (NULL, expected_size, PROT_READ|PROT_WRITE,
+ elf/dl-profile.c- MAP_SHARED|MAP_FILE, fd, 0);
+
+No fallback for `MAP_FAILED`.
+
+ elf/readlib.c: file_contents = mmap (0, statbuf.st_size, PROT_READ, MAP_SHARED,
+ elf/readlib.c- fileno (file), 0);
+
+No fallback for `MAP_FAILED`.
+
+ elf/sprof.c: result->symbol_map = mmap (NULL, max_offset - min_offset,
+ elf/sprof.c- PROT_READ, MAP_SHARED|MAP_FILE, symfd,
+ elf/sprof.c- min_offset);
+ elf/sprof.c: addr = mmap (NULL, st.st_size, PROT_READ, MAP_SHARED|MAP_FILE, fd, 0);
+
+No fallback for `MAP_FAILED`.
+
+ iconv/gconv_cache.c: gconv_cache = __mmap (NULL, cache_size, PROT_READ, MAP_SHARED, fd, 0);
+ iconv/iconv_charmap.c: && ((addr = mmap (NULL, st.st_size, PROT_READ, MAP_PRIVATE,
+ iconv/iconv_charmap.c- fd, 0)) != MAP_FAILED))
+ iconv/iconv_prog.c: && ((addr = mmap (NULL, st.st_size, PROT_READ, MAP_PRIVATE,
+ iconv/iconv_prog.c- fd, 0)) != MAP_FAILED))
+
+Have fallback for `MAP_FAILED`.
+
+ intl/loadmsgcat.c: data = (struct mo_file_header *) mmap (NULL, size, PROT_READ,
+ intl/loadmsgcat.c- MAP_PRIVATE, fd, 0);
+
+Has fallback for `MAP_FAILED`.
+
+ libio/fileops.c: p = __mmap (NULL, st.st_size, PROT_READ, MAP_SHARED,
+ libio/fileops.c- fp->_fileno, 0);
+ libio/fileops.c: p = __mmap (NULL, st.st_size, PROT_READ, MAP_SHARED, fp->_fileno, 0);
+
+Has fallback for `MAP_FAILED`.
+
+ locale/loadarchive.c: result = __mmap64 (NULL, mapsize, PROT_READ, MAP_FILE|MAP_COPY, fd, 0);
+ locale/loadarchive.c: result = __mmap64 (NULL, mapsize, PROT_READ, MAP_FILE|MAP_COPY,
+ locale/loadarchive.c- fd, 0);
+ locale/loadarchive.c: addr = __mmap64 (NULL, to - from, PROT_READ, MAP_FILE|MAP_COPY,
+ locale/loadarchive.c- fd, from);
+
+Some have fallback for `MAP_FAILED`.
+
+ locale/programs/locale.c: void *mapped = mmap64 (NULL, st.st_size, PROT_READ,
+ locale/programs/locale.c- MAP_SHARED, fd, 0);
+ locale/programs/locale.c: && ((mapped = mmap64 (NULL, st.st_size, PROT_READ,
+ locale/programs/locale.c- MAP_SHARED, fd, 0))
+ locale/programs/locale.c: addr = mmap64 (NULL, len, PROT_READ, MAP_SHARED, fd, 0);
+ locale/programs/locarchive.c: void *p = mmap64 (NULL, RESERVE_MMAP_SIZE, PROT_NONE, MAP_SHARED, fd, 0);
+ locale/programs/locarchive.c: p = mmap64 (p, total, PROT_READ | PROT_WRITE, MAP_SHARED | xflags, fd, 0);
+ locale/programs/locarchive.c: void *p = mmap64 (ah->addr + start, st.st_size - start,
+ locale/programs/locarchive.c- PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,
+ locale/programs/locarchive.c- ah->fd, start);
+ locale/programs/locarchive.c: ah->addr = mmap64 (ah->addr, st.st_size, PROT_READ | PROT_WRITE,
+ locale/programs/locarchive.c- MAP_SHARED | MAP_FIXED, ah->fd, 0);
+ locale/programs/locarchive.c: ah->addr = mmap64 (NULL, st.st_size, PROT_READ | PROT_WRITE,
+ locale/programs/locarchive.c- MAP_SHARED, ah->fd, 0);
+ locale/programs/locarchive.c: p = mmap64 (p, total, PROT_READ | PROT_WRITE, MAP_SHARED | xflags, fd, 0);
+ locale/programs/locarchive.c: ah->addr = mmap64 (p, st.st_size, PROT_READ | (readonly ? 0 : PROT_WRITE),
+ locale/programs/locarchive.c- MAP_SHARED | xflags, fd, 0);
+ locale/programs/locarchive.c: data[cnt].addr = mmap64 (NULL, st.st_size, PROT_READ, MAP_SHARED,
+ locale/programs/locarchive.c- fd, 0);
+
+No fallback for `MAP_FAILED`.
+
+ nscd/connections.c: else if ((mem = mmap (NULL, dbs[cnt].max_db_size,
+ nscd/connections.c- PROT_READ | PROT_WRITE,
+ nscd/connections.c- MAP_SHARED, fd, 0))
+ nscd/connections.c: || (mem = mmap (NULL, dbs[cnt].max_db_size,
+ nscd/connections.c- PROT_READ | PROT_WRITE,
+ nscd/connections.c- MAP_SHARED, fd, 0)) == MAP_FAILED)
+ nscd/nscd_helper.c: void *mapping = __mmap (NULL, mapsize, PROT_READ, MAP_SHARED, mapfd, 0);
+
+No fallback for `MAP_FAILED`.
+
+ nss/makedb.c: const struct nss_db_header *header = mmap (NULL, st.st_size, PROT_READ,
+ nss/makedb.c- MAP_PRIVATE|MAP_POPULATE, fd, 0);
+ nss/nss_db/db-open.c: mapping->header = mmap (NULL, header.allocate, PROT_READ,
+ nss/nss_db/db-open.c- MAP_PRIVATE, fd, 0);
+
+No fallback for `MAP_FAILED`.
+
+ posix/tst-mmap.c: ptr = mmap (NULL, 1000, PROT_READ, MAP_SHARED, fd, ps);
+ posix/tst-mmap.c: ptr = mmap64 (NULL, 1000, PROT_READ, MAP_SHARED, fd, ps);
+ rt/tst-mqueue3.c: void *mem = mmap (NULL, ps, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ rt/tst-mqueue5.c: void *mem = mmap (NULL, ps, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ rt/tst-shm.c: mem = mmap (NULL, 4000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ stdio-common/tst-fmemopen.c: if ((mmap_data = (char *) mmap (NULL, fs.st_size, PROT_READ,
+ stdio-common/tst-fmemopen.c- MAP_SHARED, fd, 0)) == MAP_FAILED)
+
+No fallback for `MAP_FAILED`.
+
+
+## `io_map` Failure
+
+This is the [[libnetfs: `io_map`|open_issues/libnetfs_io_map]] issue.
+
+[[!tag open_issue_glibc open_issue_hurd]]
+[[tschwinge]]'s current plan is to make the following cases do the same (if
+that is possible); probably by introducing a generic `mmap_or_read` function,
+that first tries `mmap` (and that will succeed on Linux-based systems and also
+on Hurd-based, if it's backed by [[hurd/libdiskfs]]), and if that fails tries
+`mmap` on anonymous memory and then fills it by `read`ing the required data.
+This is also what the [[hurd/exec]] server is doing (and is the reason that the
+`./true` invocation on [[libnetfs: `io_map`|open_issues/libnetfs_io_map]]
+works, to my understanding): see `exec.c:prepare`, if `io_map` fails,
+`e->filemap == MACH_PORT_NULL`; then `exec.c:map` (as invoked from
+`exec.c:load_section`, `exec.c:check_elf`, `exec.c:do_exec`, or
+`hashexec.c:check_hashbang`) will use `io_read` instead.
+
+Doing so potentially means reading in a lot of unused data -- but we probably
+can't do any better?
+
+In parallel (or even alternatively?), it should be researched how Linux (or any
+other kernel) implements `mmap` on NFS and similar file systems, and then
+implement the same in [[hurd/libnetfs]] and/or [[hurd/translator/nfs]], etc.
+
+Here, also probably the whole mapping region [[!message-id desc="has to be
+read" "871yjkl50c.fsf@becket.becket.net"]] ([bug-hurd list
+archive](http://lists.gnu.org/archive/html/bug-hurd/2001-10/msg00306.html)) at
+`mmap` time. Then, only `MAP_PRIVATE` (or rather: `MAP_COPY`) is possible, but
+not `MAP_SHARED`.