|
On some systems, ext2fs.static would regularly hang at startup, as a
race condition meant it would process paging requests while remounting.
To fix this, libpager has been altered to allow inhibiting and resuming
its worker threads, and ext2fs uses this to inhibit paging while
remounting.
* console/pager.c (pager_requests): New variable.
(user_pager_init): Updated call to pager_start_workers to use new
pager_requests variable.
* daemons/runsystem.sh: Removed artificial delay working around the race
condition.
* ext2fs/ext2fs.c (diskfs_reload_global_state): Call new
inhibit_ext2_pager and resume_ext2_pager functions, and leave sblock as
non-NULL so it will be munmapped.
* ext2fs/ext2fs.h (inhibit_ext2_pager,resume_ext2_pager): New functions.
* ext2fs/pager.c (file_pager_requests): New variable.
(create_disk_pager): Updated call to pager_start_workers to use new
file_pager_requests variable.
(inhibit_ext2_pager,resume_ext2_pager): New functions.
* fatfs/fatfs.h (inhibit_fat_pager,resume_fat_pager): New functions.
* fatfs/pager.c (file_pager_requests): New variable.
(create_fat_pager): Updated call to pager_start_workers to use new
file_pager_requests variable.
(inhibit_fat_pager,resume_fat_pager): New functions.
* libdiskfs/disk-pager.c (diskfs_disk_pager_requests): New variable.
(diskfs_start_disk_pager): Updated call to pager_start_workers to use
new diskfs_disk_pager_requests variable.
* libdiskfs/diskfs-pager.h (diskfs_disk_pager_requests): New variable.
* libpager/demuxer.c (struct pager_requests): Renamed struct requests to
struct pager_requests. Replaced queue with queue_in and queue_out
pointers. Added inhibit_wakeup field.
(pager_demuxer): Updated to use new queue_in/queue_out pointers. Only
wake up workers if not inhibited.
(worker_func): Updated to use new queue_in/queue_out pointers. Final
worker thread to sleep notifies the inhibit_wakeup condition variable.
(pager_start_workers): Added out parameter for the requests instance.
Allocate heap space shared by both queues. Initialise new inhibit_wakeup
condition.
(pager_inhibit_workers,pager_resume_workers): New functions.
* libpager/pager.h (struct pager_requests): Public forward definition.
(pager_start_workers): Added out parameter for the requests instance.
(pager_inhibit_workers,pager_resume_workers): New functions.
* libpager/queue.h (queue_empty): New function.
* storeio/pager.c (pager_requests): New variable.
(init_dev_paging): Updated call to pager_start_workers to use new
pager_requests variable.
|
|
Previously, libpager used an unbounded number of threads to receive
messages from the pager bucket. It used sequence barriers to execute
the requests to order requests to each object.
The sequence barriers are implemented in seqnos.c. A server function
uses _pager_wait_for_seqno to wait for its sequence number and
_pager_release_seqno to release it, or it uses _pager_update_seqno to
do both operations in one step.
These sequence barriers divide each server function in three parts: A,
B, and C. A_i happens "before" the sequence barrier i, B_i happens
"in order", C_i happens "after" the sequence barrier. This partial
order < has the following properties:
* This order is *per object*. Requests to different objects are not
ordered.
* A_i < B_i, B_i < C_i (due to the structure of the code)
* B_i < B_{i+1} (this is due to the sequence barriers)
* Note that only the B parts are ordered by the sequence numbers, we
are free to execute C_i and C_{i+1} in any possible order. The same
argument applies to the A parts.
The sequence barriers are implemented using a very simple ticket
algorithm. Every request, even the invalid ones, is processed by a
thread, and waits until the ticket count reaches its seqno, does some
work in-order, then increments the ticket and awakes all threads that
have piled up up to this moment. All of them except one will then
discover that it's not their turn yet and go to sleep again.
Creating one thread per request has proven to be problematic as
memory_object requests often arrive in large batches.
This patch does two things:
* Use a single thread to receive messages from the port bucket. All
incoming request are put into a queue.
* Use a fixed-number of threads (though even one is actually enough)
to execute the the server functions. If multiple threads are used,
a work-delegation mechanism ensures that the per object order < is
preserved.
For reference, I used the following command to create workloads that
highlight the problem this patch is addressing:
% settrans t .../ext2fs --sync=30 /dev/sd2s1
...
% /usr/bin/time zsh -c 'for ((i=0; i < 1500; i++)); do
dd if=/dev/zero of=t/src/$i bs=4k count=290 2>/dev/null
echo -n .
if ((i % 100 == 0)) ; then echo -n $i; fi
done'
* libpager/queue.h: New file.
* libpager/demuxer.c: Manage a queue of requests received from the
port bucket.
(pager_demuxer): Just decode the server function and enqueue the
request.
(worker_func): New function that consumes and executes the requests
from the queue.
(service_paging_requests): New function.
(pager_start_workers): Likewise.
* libpager/data-request.c: Remove the seqno barriers.
* libpager/data-return.c: Likewise.
* libpager/data-unlock.c: Likewise.
* libpager/chg-compl.c: Likewise.
* libpager/lock-completed.c: Likewise.
* libpager/no-senders.c: Likewise.
* libpager/notify-stubs.c: Likewise.
* libpager/object-init.c: Likewise.
* libpager/object-terminate.c: Likewise.
* libpager/seqnos.c: Remove file.
* libpager/stubs.c: Likewise.
* libpager/pager.h (pager_demuxer): Drop declaration.
(pager_start_workers): New declaration.
* libpager/priv.h: Remove the _pager_seqno declarations.
* libpager/Makefile (SRCS): Drop seqnos.c.
* console/pager.c (user_pager_init): Call pager_start_workers.
* libdiskfs/disk-pager.c: Likewise.
* storeio/pager.c: Likewise.
* ext2fs/pager.c (service_paging_requests): Remove function.
(create_disk_pager): Start separate file pager using
`pager_start_workers'.
* fatfs/pager.c (service_paging_requests): Remove function.
(create_fat_pager): Start separate file pager using
`pager_start_workers'.
|