Diffstat (limited to 'open_issues/code_analysis.mdwn')
-rw-r--r-- open_issues/code_analysis.mdwn 277
1 files changed, 277 insertions, 0 deletions
diff --git a/open_issues/code_analysis.mdwn b/open_issues/code_analysis.mdwn
new file mode 100644
index 00000000..df434b76
--- /dev/null
+++ b/open_issues/code_analysis.mdwn
@@ -0,0 +1,277 @@
+[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software
+Foundation, Inc."]]
+
+[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
+id="license" text="Permission is granted to copy, distribute and/or modify this
+document under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no Invariant
+Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
+is included in the section entitled [[GNU Free Documentation
+License|/fdl]]."]]"""]]
+
+The topic of *code analysis* or *program analysis* ([[!wikipedia
+Program_analysis_(computer_science) desc="Wikipedia article"]]) covers
+static code analysis ([[!wikipedia Static_code_analysis desc="Wikipedia
+article"]]) and dynamic program analysis ([[!wikipedia Dynamic_program_analysis
+desc="Wikipedia article"]]). It overlaps with [[performance
+analysis|performance]], [[formal_verification]], and general
+[[debugging]].
+
+[[!toc]]
+
+
+# Bounty
+
+There is a [[!FF_project 276]][[!tag bounty]] on some of these tasks.
+
+
+# Static
+
+ * [[GCC]]'s warnings. Yes, really.
+
+ * GCC plugins can be used for additional semantic analysis. For example,
+ <http://lwn.net/Articles/457543/>, and search for *kernel context* in
+ the comments.
+
+ * Have GCC make use of [[RPC]]/[[microkernel/mach/MIG]] *in*/*out*
+ specifiers, and have it emit useful warnings when an *in* parameter
+ points to uninitialized data.
+
+ * [[!wikipedia List_of_tools_for_static_code_analysis]]
+
+ * [Engineering zero-defect software](http://esr.ibiblio.org/?p=4340), Eric
+ S. Raymond, 2012-05-13
+
+ * [Static Source Code Analysis Tools for C](http://spinroot.com/static/)
+
+ * [Cppcheck](http://sourceforge.net/apps/mediawiki/cppcheck/)
+
+ For example, [Debian's hurd_20110319-2
+ package](http://qa.debian.org/daca/cppcheck/sid/hurd_20110319-2.html)
+ (Samuel Thibault, 2011-08-05: *I had a look at those, some are spurious;
+ the realloc issues are for real*).
+
+ * Coccinelle
+
+ * <http://lwn.net/Articles/315686/>
+
+ * <http://www.google.com/search?q=coccinelle+analysis>
+
+ Coccinelle has already been used to find and fix [[!message-id desc="double
+ mutex unlocking issues"
+ "1355701890-29227-1-git-send-email-tipecaml@gmail.com"]].
+
+ * [clang](http://www.google.com/search?q=clang+analysis)
+
+ * <http://darnassus.sceen.net/~teythoon/qa/gnumach/scan-build>
+
+ * <http://darnassus.sceen.net/~teythoon/qa/hurd/scan-build>
+
+ * [Linux' sparse](https://sparse.wiki.kernel.org/)
+
+ * <http://klee.llvm.org/>
+
+ * <http://blog.llvm.org/2010/04/whats-wrong-with-this-code.html>
+
+ * [Smatch](http://smatch.sourceforge.net/)
+
+ * [Parfait](http://labs.oracle.com/projects/parfait/)
+
+ * <http://lwn.net/Articles/344003/>
+
+ * [Saturn](http://saturn.stanford.edu/)
+
+ * [Flawfinder](http://www.dwheeler.com/flawfinder/)
+
+ * [sixgill](http://sixgill.org/)
+
+ * [s-spider](http://code.google.com/p/s-spider/)
+
+ * [CIL (C Intermediate Language)](http://kerneis.github.com/cil/)
+
+ * [Frama-C](http://frama-c.com/)
+
+ <teythoon> btw, I've been looking at http://frama-c.com/ lately
+ <teythoon> it's a theorem prover for c/c++
+ <braunr> oh nice
+ <teythoon> I think it's most impressive, it works on the hurd (aptitude
+ install frama-c o_O)
+ <teythoon> *and it works
+ <braunr> "Simple things should be simple,
+ <braunr> complex things should be possible."
+ <braunr> :)
+ <braunr> looks great
+ <teythoon> even the gui is awesome, allows one to browse source code in
+ a very impressive way
+ <braunr> clear separation between value changes, dependencies, side
+ effects
+ <braunr> we could have plugins for stuff like ports
+ <braunr> handles concurrency oO
+ <nalaginrut> so you want to use Frame-C to analyze the whole Hurd code
+ base?
+ <teythoon> nalaginrut: well, frama-c looks "able" to assist in
+ analyzing the Hurd, yes
+ <teythoon> nalaginrut: but theorem proving is a manual process, one
+ needs to guide the prover
+ <teythoon> nalaginrut: b/c some stuff is not decideable
+ <nalaginrut> I ask this because I can imagine how to analyze Linux
+ since all the code is in a directory. But Hurd's codes are
+ distributed to many other projects
+ <braunr> that's not a problem
+ <braunr> each server can be analyzed separately
+ <teythoon> braunr: also, each "entry point"
+ <nalaginrut> alright, but sounds a big work
+ <teythoon> it is
+ <braunr> otherwise, formal verification would be widespread :)
+ <teythoon> that, and most tools are horrible to use, frama-c is really
+ an exception in this regard
+
+ * [Coverity](http://www.coverity.com/) (nonfree)
+
+ * <https://scan.coverity.com/projects/1307> If you want access, speak up in #hurd or on the mailing list.
+
+ * IRC, OFTC, #debian-hurd, 2014-02-03
+
+ <pere> btw, did you consider adding hurd and mach to <URL:
+ https://scan.coverity.com/ > to detect bugs automatically?
+ <pere> I found lots of bugs in gnash, ipmitool and sysvinit when I
+ started scanning those projects. :)
+ <teythoon> i did some static analysis work, i haven't used coverty
+ but free tools for that
+ <teythoon> i think thomas wanted to look into coverty though
+ <pere> quite easy to set up, but you need to download and run a
+ non-free tarball on the build host.
+ <teythoon> does that tar ball contains binary code ?
+ <teythoon> that'd be a show stopper for the hurd of course
+ <pere> did not investigate. I just put it in a contained virtual
+ machine.
+ <pere> did not want it on my laptop. :)
+ <pere> prefer free software here. :)
+ <pere> but I did not have to "accept license", at least. :)
+
+ * IRC, OFTC, #debian-hurd, 2014-02-05
+
+ <pere> ah, cool. <URL: https://scan.coverity.com/projects/1307 >
+ is now in place. :)
+
+ [[microkernel/mach/gnumach/projects/clean_up_the_code]],
+ *Code_Analysis, Coverity*.
+
+ * [Splint](http://www.splint.org/)
+
+ * IRC, freenode, #hurd, 2011-12-04
+
+ <mcsim> has anyone used splint on hurd?
+ <mcsim> this is tool for statically checking C programs
+ <mcsim> seems I made it work
+
+
+## Hurd-specific Applications
+
+ * [[Port Sequence Numbers|microkernel/mach/ipc/sequence_numbering]]. If
+ these are used, care must be taken to update them reliably, [[!message-id
+ "1123688017.3905.22.camel@buko.sinrega.org"]]. This could be checked by a
+ static analysis tool.
+
+ * [[glibc]]'s [[glibc/critical_section]]s.
+
+
+# Dynamic
+
+ * [[community/gsoc/project_ideas/Valgrind]]
+
+ * glibc's `libmcheck`
+
+ * Used by GDB, for example.
+
+ * Is not thread-safe, [[!sourceware_PR 6547]], [[!sourceware_PR 9939]],
+ [[!sourceware_PR 12751]], [[!stackoverflow_question 314931]].
+
+ * <http://en.wikipedia.org/wiki/Electric_Fence>
+
+ * <http://sourceforge.net/projects/duma/>
+
+ * <http://wiki.debian.org/Hardening>
+
+ * <https://wiki.ubuntu.com/CompilerFlags>
+
+ * `MALLOC_CHECK_`/`MALLOC_PERTURB_`
+
+ * IRC, freenode, #glibc, 2011-09-28
+
+ <vsrinivas> two things you can do -- there is an environment
+ variable (DEBUG_MALLOC_ iirc?) that can be set to 2 to make
+ ptmalloc (glibc's allocator) more forceful and verbose wrt error
+ checking
+ <vsrinivas> another is to grab a copy of Tor's source tree and copy
+ out OpenBSD's allocator (its a clearly-identifyable file in the
+ tree); LD_PRELOAD it or link it into your app, it is even more
+ aggressive about detecting memory misuse.
+ <vsrinivas> third, Red hat has a gdb python plugin that can
+ instrument glibc's heap structure. its kinda handy, might help?
+ <vsrinivas> MALLOC_CHECK_ was the envvar you want, sorry.
+
+ * [`MALLOC_PERTURB_`](http://udrepper.livejournal.com/11429.html)
+
+ * <http://git.fedorahosted.org/cgit/initscripts.git/diff/?id=deb0df0124fbe9b645755a0a44c7cb8044f24719>
+
+ * In context of [[!message-id
+ "1341350006-2499-1-git-send-email-rbraun@sceen.net"]]/the `alloca` issue
+ mentioned in [[gnumach_page_cache_policy]]:
+
+ IRC, freenode, #hurd, 2012-07-08:
+
+ <youpi> braunr: there's actually already an ifdef REDZONE in libthreads
+
+ It's `RED_ZONE`.
+
+ <youpi> except it seems clumsy :)
+ <youpi> ah, no, the libthreads code properly sets the guard, just for
+ grow-up stacks
+
+ * GCC, LLVM/clang: [[Address Sanitizer (asan), Memory Sanitizer (msan),
+ Thread Sanitizer (tsan), Undefined Behavior Sanitizer (ubsan), ...|_san]]
+
+ * [GCC plugins](http://gcc.gnu.org/wiki/plugins)
+
+ * [CTraps](https://github.com/blucia0a/CTraps-gcc)
+
+ > CTraps is a gcc plugin and runtime library that inserts calls to runtime
+ > library functions just before shared memory accesses in parallel/concurrent
+ > code.
+ >
+ > The purpose of this plugin is to expose information about when and how threads
+ > communicate with one another to programmers for the purpose of debugging and
+ > performance tuning. The overhead of the instrumentation and runtime code is
+ > very low -- often low enough for always-on use in production code. In a series
+ > of initial experiments the overhead was 0-10% in many important cases.
+
+ * Input fuzzing
+
+ Not a new topic; has been used (and papers published?) for early [[UNIX]]
+ tools. What about some [[RPC]] fuzzing?
+
+ * <http://caca.zoy.org/wiki/zzuf>
+
+ * <http://www.ece.cmu.edu/~koopman/ballista/>
+
+ * [Jones: system call abuse](http://lwn.net/Articles/414273/), Dave
+ Jones, 2010.
+
+ * [Trinity: A Linux kernel fuzz tester (and then
+ some)](http://www.socallinuxexpo.org/scale11x/presentations/trinity-linux-kernel-fuzz-tester-and-then-some),
+ Dave Jones, The Eleventh Annual Southern California Linux Expo, 2013.
+
+ * Mayhem, *an automatic bug finding system*
+
+ IRC, freenode, #hurd, 2013-06-29:
+
+ <teythoon> started reading the mayhem paper referenced here
+ http://lists.debian.org/debian-devel/2013/06/msg00720.html
+ <teythoon> that's nice work, they are doing symbolic execution of x86
+ binary code, that's effectively model checking with some specialized
+ formulas
+ <teythoon> (too bad the mayhem code isn't available, damn those
+ academic people keeping the good stuff to themselvs...)
+ <teythoon> (and I really think that's bad practice, how should anyone
+ reproduce their results? that's not how science works imho...)
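
A minimal sketch of the *Input fuzzing* idea listed above, in the spirit of bit-flipping fuzzers such as zzuf: it flips pseudo-random bits in a buffer, reproducibly for a given seed. The function name `fuzz_buffer`, its parameters, and the choice of an xorshift PRNG are assumptions made for illustration. This is not zzuf's actual algorithm, and how the mutated buffer would be delivered to a Hurd server as an RPC message is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small deterministic PRNG (xorshift32), so that a given seed always
   yields the same mutations and a crashing input can be regenerated.  */
static uint32_t
xorshift32 (uint32_t *state)
{
  uint32_t x = *state;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  *state = x;
  return x;
}

/* Hypothetical helper: flip roughly one in every ONE_IN bits of BUF.
   Returns the number of bits that were flipped.  */
size_t
fuzz_buffer (unsigned char *buf, size_t len, uint32_t seed, uint32_t one_in)
{
  assert (one_in > 0);
  uint32_t state = seed ? seed : 1;   /* xorshift state must be non-zero */
  size_t flipped = 0;

  for (size_t i = 0; i < len; i++)
    for (unsigned bit = 0; bit < 8; bit++)
      if (xorshift32 (&state) % one_in == 0)
        {
          buf[i] ^= (unsigned char) (1u << bit);
          flipped++;
        }

  return flipped;
}
```

Calling `fuzz_buffer (buf, len, 42, 16)` on two copies of the same input gives identical mutated buffers, so a failure can be replayed from the seed and the original input alone. A real RPC fuzzer would probably mutate typed message fields rather than raw bytes; that part is not sketched here.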