[[!meta copyright="Copyright © 2010, 2011, 2012, 2013, 2014 Free Software
Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

In the topic of *code analysis* or *program analysis* ([[!wikipedia
Program_analysis_(computer_science) desc="Wikipedia article"]]), there is
static code analysis ([[!wikipedia Static_code_analysis desc="Wikipedia
article"]]) and dynamic program analysis ([[!wikipedia Dynamic_program_analysis
desc="Wikipedia article"]]).  This topic overlaps with [[performance
analysis|performance]], [[formal_verification]], as well as general
[[debugging]].

[[!toc]]


# Bounty

There is a [[!FF_project 276]][[!tag bounty]] on some of these tasks.


# Static

  * [[GCC]]'s warnings.  Yes, really.

      * GCC plugins can be used for additional semantic analysis.  For example,
        <http://lwn.net/Articles/457543/>, and search for *kernel context* in
        the comments.

      * Have GCC make use of [[RPC]]/[[microkernel/mach/MIG]] *in*/*out*
        specifiers, and have it emit useful warnings in case these are pointing
        to uninitialized data (for *in* only).

  * [[!wikipedia List_of_tools_for_static_code_analysis]]

  * [Engineering zero-defect software](http://esr.ibiblio.org/?p=4340), Eric
    S. Raymond, 2012-05-13

  * [Static Source Code Analysis Tools for C](http://spinroot.com/static/)

  * [Cppcheck](http://sourceforge.net/apps/mediawiki/cppcheck/)

    For example, [Debian's hurd_20110319-2
    package](http://qa.debian.org/daca/cppcheck/sid/hurd_20110319-2.html)
    (Samuel Thibault, 2011-08-05: *I had a look at those, some are spurious;
    the realloc issues are for real*).

  * Coccinelle

      * <http://lwn.net/Articles/315686/>

      * <http://www.google.com/search?q=coccinelle+analysis>

    Has already been used for finding and fixing [[!message-id desc="double
    mutex unlocking issues"
    "1355701890-29227-1-git-send-email-tipecaml@gmail.com"]].

  * [clang](http://www.google.com/search?q=clang+analysis)

      * <http://darnassus.sceen.net/~teythoon/qa/gnumach/scan-build>

      * <http://darnassus.sceen.net/~teythoon/qa/hurd/scan-build>

  * [Linux' sparse](https://sparse.wiki.kernel.org/)

  * <http://klee.llvm.org/>

      * <http://blog.llvm.org/2010/04/whats-wrong-with-this-code.html>

  * [Smatch](http://smatch.sourceforge.net/)

  * [Parfait](http://labs.oracle.com/projects/parfait/)

      * <http://lwn.net/Articles/344003/>

  * [Saturn](http://saturn.stanford.edu/)

  * [Flawfinder](http://www.dwheeler.com/flawfinder/)

  * [sixgill](http://sixgill.org/)

  * [s-spider](http://code.google.com/p/s-spider/)

  * [CIL (C Intermediate Language)](http://kerneis.github.com/cil/)

  * [Frama-C](http://frama-c.com/)

        <teythoon> btw, I've been looking at http://frama-c.com/ lately
        <teythoon> it's a theorem prover for c/c++
        <braunr> oh nice
        <teythoon> I think it's most impressive, it works on the hurd (aptitude
          install frama-c o_O)
        <teythoon> *and it works
        <braunr> "Simple things should be simple,
        <braunr> complex things should be possible."
        <braunr> :)
        <braunr> looks great
        <teythoon> even the gui is awesome, allows one to browse source code in
          a very impressive way
        <braunr> clear separation between value changes, dependencies, side
          effects
        <braunr> we could have plugins for stuff like ports
        <braunr> handles concurrency oO
        <nalaginrut> so you want to use Frame-C to analyze the whole Hurd code
          base?
        <teythoon> nalaginrut: well, frama-c looks "able" to assist in
          analyzing the Hurd, yes
        <teythoon> nalaginrut: but theorem proving is a manual process, one
          needs to guide the prover
        <teythoon> nalaginrut: b/c some stuff is not decideable
        <nalaginrut> I ask this because I can imagine how to analyze Linux
          since all the code is in a directory. But Hurd's codes are
          distributed to many other projects
        <braunr> that's not a problem
        <braunr> each server can be analyzed separately
        <teythoon> braunr: also, each "entry point"
        <nalaginrut> alright, but sounds a big work
        <teythoon> it is
        <braunr> otherwise, formal verification would be widespread :)
        <teythoon> that, and most tools are horrible to use, frama-c is really
          an exception in this regard

  * [Coverity](http://www.coverity.com/) (nonfree)

      * <https://scan.coverity.com/projects/1307> If you want access, speak up in #hurd or on the mailing list.

      * IRC, OFTC, #debian-hurd, 2014-02-03

            <pere> btw, did you consider adding hurd and mach to <URL:
              https://scan.coverity.com/ > to detect bugs automatically?
            <pere> I found lots of bugs in gnash, ipmitool and sysvinit when I
              started scanning those projects. :)
            <teythoon> i did some static analysis work, i haven't used coverty
              but free tools for that
            <teythoon> i think thomas wanted to look into coverty though
            <pere> quite easy to set up, but you need to download and run a
              non-free tarball on the build host.
            <teythoon> does that tar ball contains binary code ?
            <teythoon> that'd be a show stopper for the hurd of course
            <pere> did not investigate.  I just put it in a contained virtual
              machine.
            <pere> did not want it on my laptop. :)
            <pere> prefer free software here. :)
            <pere> but I did not have to "accept license", at least. :)

      * IRC, OFTC, #debian-hurd, 2014-02-05

            <pere> ah, cool.  <URL: https://scan.coverity.com/projects/1307 >
              is now in place. :)

        [[microkernel/mach/gnumach/projects/clean_up_the_code]],
        *Code_Analysis, Coverity*.

  * [Splint](http://www.splint.org/)

      * IRC, freenode, #hurd, 2011-12-04

            <mcsim> has anyone used splint on hurd?
            <mcsim> this is tool for statically checking C programs
            <mcsim> seems I made it work


## Hurd-specific Applications

  * [[Port Sequence Numbers|microkernel/mach/ipc/sequence_numbering]].  If
    these are used, care must be taken to update them reliably, [[!message-id
    "1123688017.3905.22.camel@buko.sinrega.org"]].  This could be checked by a
    static analysis tool.

  * [[glibc]]'s [[glibc/critical_section]]s.


# Dynamic

  * [[community/gsoc/project_ideas/Valgrind]]

  * glibc's `libmcheck`

      * Used by GDB, for example.

      * Is not thread-safe, [[!sourceware_PR 6547]], [[!sourceware_PR 9939]],
        [[!sourceware_PR 12751]], [[!stackoverflow_question 314931]].

  * <http://en.wikipedia.org/wiki/Electric_Fence>

      * <http://sourceforge.net/projects/duma/>

  * <http://wiki.debian.org/Hardening>

  * <https://wiki.ubuntu.com/CompilerFlags>

  * `MALLOC_CHECK_`/`MALLOC_PERTURB_`

      * IRC, freenode, #glibc, 2011-09-28

            <vsrinivas> two things you can do -- there is an environment
              variable (DEBUG_MALLOC_ iirc?) that can be set to 2 to make
              ptmalloc (glibc's allocator) more forceful and verbose wrt error
              checking
            <vsrinivas> another is to grab a copy of Tor's source tree and copy
              out OpenBSD's allocator (its a clearly-identifyable file in the
              tree); LD_PRELOAD it or link it into your app, it is even more
              aggressive about detecting memory misuse.
            <vsrinivas> third, Red hat has a gdb python plugin that can
              instrument glibc's heap structure. its kinda handy, might help?
            <vsrinivas> MALLOC_CHECK_ was the envvar you want, sorry.

      * [`MALLOC_PERTURB_`](http://udrepper.livejournal.com/11429.html)

      * <http://git.fedorahosted.org/cgit/initscripts.git/diff/?id=deb0df0124fbe9b645755a0a44c7cb8044f24719>

  * In context of [[!message-id
    "1341350006-2499-1-git-send-email-rbraun@sceen.net"]]/the `alloca` issue
    mentioned in [[gnumach_page_cache_policy]]:

    IRC, freenode, #hurd, 2012-07-08:

        <youpi> braunr: there's actually already an ifdef REDZONE in libthreads

    It's `RED_ZONE`.

        <youpi> except it seems clumsy :)
        <youpi> ah, no, the libthreads code properly sets the guard, just for
          grow-up stacks

  * GCC, LLVM/clang: [[Address Sanitizer (asan), Memory Sanitizer (msan),
    Thread Sanitizer (tasn), Undefined Behavor Sanitizer (ubsan), ...|_san]]

  * [GCC plugins](http://gcc.gnu.org/wiki/plugins)

      * [CTraps](https://github.com/blucia0a/CTraps-gcc)

        > CTraps is a gcc plugin and runtime library that inserts calls to runtime
        > library functions just before shared memory accesses in parallel/concurrent
        > code.
        > 
        > The purpose of this plugin is to expose information about when and how threads
        > communicate with one another to programmers for the purpose of debugging and
        > performance tuning.  The overhead of the instrumentation and runtime code is
        > very low -- often low enough for always-on use in production code.  In a series
        > of initial experiments the overhead was 0-10% in many important cases.

  * Input fuzzing

    Not a new topic; has been used (and papers published?) for early [[UNIX]]
    tools.  What about some [[RPC]] fuzzing?

      * <http://caca.zoy.org/wiki/zzuf>

      * <http://www.ece.cmu.edu/~koopman/ballista/>

      * [Jones: system call abuse](http://lwn.net/Articles/414273/), Dave
        Jones, 2010.

          * [Trinity: A Linux system call
            fuzzer]http://codemonkey.org.uk/projects/trinity/().
            [Trinity: A Linux kernel fuzz tester (and then
            some)](http://www.socallinuxexpo.org/scale11x/presentations/trinity-linux-kernel-fuzz-tester-and-then-some),
            Dave Jones, The Eleventh Annual Southern California Linux Expo, 2013.

      * [American fuzzy lop](https://code.google.com/p/american-fuzzy-lop/), *a
        practical, instrumentation-driven fuzzer for binary formats*.

      * [Melkor - An ELF File Format
        Fuzzer](https://www.blackhat.com/us-14/arsenal.html#Hernandez),
        Alejandro Hernández.

          * Can use this to find bugs in our [[hurd/translator/exec]] server,
            for example?  See also the discussion in [[!message-id
            "5452389B.502@samsung.com"]].

  * Mayhem, *an automatic bug finding system*

    IRC, freenode, #hurd, 2013-06-29:

        <teythoon> started reading the mayhem paper referenced here
          http://lists.debian.org/debian-devel/2013/06/msg00720.html
        <teythoon> that's nice work, they are doing symbolic execution of x86
          binary code, that's effectively model checking with some specialized
          formulas
        <teythoon> (too bad the mayhem code isn't available, damn those
          academic people keeping the good stuff to themselvs...)
        <teythoon> (and I really think that's bad practice, how should anyone
          reproduce their results? that's not how science works imho...)