[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software Foundation, Inc."]] [[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]] *Performance analysis* ([[!wikipedia Performance_analysis desc="Wikipedia article"]]) deals with analyzing how computing resources are used for completing a specified task. [[service_solahart_jakarta_selatan__082122541663/Profiling]] is one relevant tool. In [[microkernel]]-based systems, there is generally a considerable [[RPC]] overhead. In a multi-server system, it is non-trivial to implement a high-performance [[I/O System|community/gsoc/project_ideas/disk_io_performance]]. When providing [[faq/POSIX_compatibility]] (and similar interfaces) in an environemnt that doesn't natively implement these interfaces, there may be a severe performance degradation. For example, in this [[`fork` system call|/glibc/fork]]'s case. [[service_solahart_jakarta_selatan__082122541663/Unit_testing]] can be used for tracking performance regressions. --- * [[Degradation]] * [[service_solahart_jakarta_selatan__082122541663/performance/fork]] * [[service_solahart_jakarta_selatan__082122541663/performance/Ipc_virtual_copy]] * [[service_solahart_jakarta_selatan__082122541663/performance/microbenchmarks]] * [[microkernel_multi-server]] * [[gnumach_page_cache_policy]] * [[metadata_caching]] * [[community/gsoc/project_ideas/object_lookups]] --- # IRC, freenode, #hurd, 2012-07-05 the more i study the code, the more i think a lot of time is wasted on cpu, unlike the common belief of the lack of performance being only due to I/O ## IRC, freenode, #hurd, 2012-07-23 there are several kinds of scalability issues iirc, i found some big locks in core libraries like libpager and libdiskfs but anyway we can live with those in the case i observed, ext2fs, relying on libdiskfs and libpager, scans the entire file list to ask for writebacks, as it can't know if the pages are dirty or not the mistake here is moving part of the pageout policy out of the kernel so it would require the kernel to handle periodic synces of the page cache braunr: as for big locks: considering that we don't have any SMP so far, does it really matter?... antrik: yes we have multithreading there is no reason to block many threads while if most of them could continue -while so that's more about latency than throughput? considering sleeping/waking is expensive, it's also about throughput currently, everything that deals with sleepable locks (both gnumach and the hurd) just wake every thread waiting for an event when the event occurs (there are a few exceptions, but not many) ouch ## [[!message-id "20121202101508.GA30541@mail.sceen.net"]] ## IRC, freenode, #hurd, 2012-12-04 why do some people think hurd is slow? 
    i find it works well even under heavy load inside a virtual machine
    damo22: the virtual machine actually assists the hurd a lot :p
    but even with that, the hurd is a slow system
    i would have thought it would have the potential to be very fast, considering the model of the kernel
    the design implies by definition more overhead, but the true cause is more than 15 years without optimization on the core components
    how so ?
    since there are less layers of code between the hardware bare metal and the application that users run
    how so ? :)
    it's the contrary actually
    VFS -> IPC -> scheduler -> device drivers -> hardware
    that is monolithic
    well, it's not really meaningful
    and i'd say the same applies for a microkernel system
    if the application can talk directly to hardware through the kernel its almost like plugging directly into the hardware
    you never talk directly to hardware
    you talk to servers instead of the kernel
    ah
    consider monolithic kernel systems like systems with one big server, the kernel
    whereas a multiserver system is a kernel and many servers
    you still need the VFS to identify your service (and thus your server)
    you need much more IPC, since system calls are "replaced" with RPC
    the scheduler is basically the same
    okay
    device drivers are similar too, except they run in thread context (which is usually a bit heavier)
    but you can do cool things like report when an interrupt line is blocked
    and there are many context switches between all that
    you can do all that in a monolithic kernel too, and faster
    but it's far more elegant, and (when well done) easy to do on a microkernel based system
    yes i like elegant, makes coding easier if you know the basics
    there are only two major differences between a monolithic kernel and a multiserver microkernel system
    * damo22 listens
    1/ independence of location (your resources could be anywhere)
    2/ separation of address spaces (your servers have their own addresses)
    wow
    these both imply additional layers of indirection, making the system as a whole slower
    but it would be far more secure though i suspect
    yes
    and reliable
    that's why systems like qnx were usually adopted for critical tasks
    security and reliability are very important, i would switch to the hurd if it supported all the hardware i use
    so would i :)
    but performance matters too
    not to me
    it should :p
    it really does matter a lot in practice
    i mean, a 2x slowdown compared to linux would not affect me if it had all the benefits we mentioned above
    but the hurd is really slow for other reasons than its additional layers of indirection, unfortunately
    is it because of lack of optimisation in the core code?
    we're working on these issues, but it's not easy and takes a lot of time :p
    like you said
    yes
    and also because of some fundamental design choices related to the microkernel back in the 80s
    what about the darwin system
    it uses a mach kernel?
    yes
    what is stopping someone taking the MIT code from darwin and creating a monster free OS
    what for ?
    because it already has hardware support
    and a mach kernel
    in kernel drivers ?
    it has kernel extensions
    you can do things like kextload module
    first, being a mach kernel doesn't make it compatible or even easily usable with the hurd, the interfaces have evolved independently
    and second, we really do want more stuff out of the kernel
    drivers in particular
    may i ask why you are very keen to have drivers out of kernel?
    for the same reason we want other system services out of the kernel
    security, reliability, etc..
    ease of debugging
    the ability to restart drivers separately, without restarting the kernel
    i see


# IRC, freenode, #hurd, 2012-09-13

{{$news/2011-q2#phoronix-3}}.

    the phoronix benchmarks don't actually test the operating system ..
    braunr: well, at least it tests its ability to run programs for those particular tasks
    exactly, it tests how programs that don't make much use of the operating system run
    well yes, we can run programs :)
    those are just cpu-taking tasks
    ok
    if you do a benchmark with also i/o, you can see how it is (quite) slower on hurd
    perhaps they should have run 10 of those programs in parallel, that would test the kernel multitasking
    I suppose not even I/O, simply system calls
    no, multitasking is ok on the hurd
    and it's very similar to what is done on other systems, which hasn't changed much for a long time (except for multiprocessor)
    true OS benchmarks measure system calls
    ok, so i'm sensing the view that the actual OS kernel architecture doesn't really make that much difference, good software does
    not at all
    i'm only saying that the phoronix benchmark results are useless because they didn't measure the right thing
    ok


# Optimizing Data Structure Layout

## IRC, freenode, #hurd, 2014-01-02

    teythoon_: wow, digging into the vm code :)
    i discovered pahole and gnumach was a tempting target :)
    never heard of pahole :/
    it's nice
    braunr: try pahole -C kmem_cache /boot/gnumach
    on linux that is.
    ...
    ok
    braunr: http://paste.debian.net/73864/
    very nice


## IRC, freenode, #hurd, 2014-01-03

    teythoon: pahole is a very handy tool :)
    yes
    i especially like how general it is


# Measurement

## coulomb

### [[!message-id "87wqghouoc.fsf@schwinge.name"]]

## IRC, freenode, #hurd, 2014-02-27

    tschwinge: about your concern with regard to performance measurements, you could run kvm with hugetlbfs and cpuset on a machine that provides nested page tables, this makes the virtualization overhead as small as it could be considering the implementation
    hugetlbfs reduces the overhead of page faults, and also implies locked memory, while cpuset isolates the vm from global scheduling
    Thanks, will look into that.
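
A minimal sketch of such a measurement setup on a Linux/KVM host, assuming an x86 machine with 2 MiB huge pages and nested page tables (EPT/NPT); the core numbers, page count and the `hurd.img` disk image are placeholders, and `taskset` is used here as a simpler stand-in for the fuller isolation that cpusets (or the `isolcpus` boot parameter) can provide:

    # Reserve 2 MiB huge pages and expose them through hugetlbfs (as root).
    echo 512 > /proc/sys/vm/nr_hugepages
    mkdir -p /dev/hugepages
    mount -t hugetlbfs none /dev/hugepages

    # Pin the guest to dedicated cores and back its RAM with huge pages,
    # so guest memory is locked and page fault overhead is reduced.
    # KVM uses nested page tables automatically when the CPU supports them.
    taskset -c 2,3 qemu-system-i386 -enable-kvm -m 1024 \
        -mem-path /dev/hugepages -mem-prealloc \
        -drive file=hurd.img,cache=writeback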