summaryrefslogtreecommitdiff
path: root/community/gsoc/project_ideas/disk_io_performance.mdwn
blob: 0d793845cc7bc57b6dd6be7efaa3843f9a82e911 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
[[!meta copyright="Copyright © 2008, 2009, 2010, 2011 Free Software Foundation,
Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!meta title="Disk I/O Performance Tuning"]]

[[!tag open_issue_hurd]]

The most obvious reason for the Hurd feeling slow compared to mainstream
systems like GNU/Linux, is a low I/O system performance, in particular very
slow hard disk access.

The reason for this slowness is lack and/or bad implementation of common
optimization techniques, like scheduling reads and writes to minimize head
movement; effective block caching; effective reads/writes to partial blocks;
[[reading/writing multiple blocks at once|service_solahart_jakarta_selatan__082122541663/performance/io_system/clustered_page_faults]]; and
[[open_issues/performance/io_system/read-ahead]].  The
[[ext2_filesystem_server|hurd/translator/ext2fs]] might also need some
optimizations at a higher logical level.

The goal of this project is to analyze the current situation, and implement/fix
various optimizations, to achieve significantly better disk performance. It
requires understanding the data flow through the various layers involved in
disk access on the Hurd ([[filesystem|hurd/virtual_file_system]],
[[pager|hurd/libpager]], driver), and general experience with
optimizing complex systems.  That said, the killing feature we are definitely
missing is the [[open_issues/performance/io_system/read-ahead]], and even a very simple implementation would bring
very big performance speedups.

Here are some real testcases:

  * [[service_solahart_jakarta_selatan__082122541663/performance/io_system/binutils_ld_64ksec]];

  * running the Git testsuite which is mostly I/O bound;

  * use [[TopGit]] on a non-toy repository.


Possible mentors: Samuel Thibault (youpi)

Exercise: Look through all the code involved in disk I/O, and try something
easy to improve. It's quite likely though that you will find nothing obvious --
in this case, please contact us about a different exercise task.