


[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

A collection of thoughts with respect to unit testing.

We definitely want to add unit test suites to our code base.

We should select a tool that we like using and that is actively maintained
(not abandoned).

  * [SC
    Test](http://web.archive.org/web/20021204193607/sc-archive.codesourcery.com/sc_test)

  * [DejaGnu](http://www.gnu.org/software/dejagnu/) /
    [Expect](http://expect.nist.gov/)

      * used by the [[GCC testsuite|gcc]], [[GDB_testsuite]],
        [[binutils testsuite|binutils]], etc.

  * The [[glibc_testsuite]] has a home-grown system (Makefile-based), as does
    the [[Open_POSIX_Test_Suite]].

  * [check](http://check.sourceforge.net/)

      * used by some GNU packages, for example GNU PDF (Jose E. Marchesi)

  * CodeSourcery's [QMTest](http://www.codesourcery.com/qmtest)

      * used by?

      * documentation:

          * <http://www.codesourcery.com/public/qmtest/whitepaper.pdf>

          * <http://www.python.org/workshops/2002-02/papers/01/index.htm>

          * <http://gcc.gnu.org/ml/gcc/2002-05/msg01978.html>

          * <http://www.codesourcery.com/public/qmtest/qmtest-snapshot/share/doc/qmtest/html/tutorial/index.html>

          * <http://www.codesourcery.com/public/qmtest/qmtest-snapshot/share/doc/qmtest/html/manual/index.html>

  * [Git](http://git-scm.com/) has an elaborate unit testsuite, which is also
    used in [Notmuch](http://notmuchmail.org/).

  * [*[ANNOUNCE] ktest.pl: Easy and flexible testing script for Linux Kernel
    Developers*](http://lwn.net/Articles/412302/) by Steven Rostedt,
    2010-10-28.  [v2](http://lwn.net/Articles/414064/), 2010-11-08.


# Related

  * [[nightly_builds]]

      * [[nightly_builds_deb_packages]]

  * <http://www.phoronix-test-suite.com/> -- "comprehensive testing and
    benchmarking platform".  This one might be useful for [[performance]]
    testing, too?

  * <http://ltp.sourceforge.net/>

  * [LaBrea](https://github.com/dustin/labrea/wiki) or similar tools can be
    used for modelling certain aspects of system behavior (long response times,
    for example).


# Discussion

freenode, #hurd channel, 2011-03-05:

    <nixness> what about testing though?
    <nixness> like sort of "what's missing? lets write tests for it so that
      when someone gets to implementing it, he knows what to do. Repeat"
      project
    <antrik> you mean creating an automated testing framework?
    <antrik> this is actually a task I want to add for this year, if I get
      around to it :-)
    <nixness> yeah I'd very much want to apply for that one
    <nixness> cuz I've never done Kernel hacking before, but I know that with
      the right tasks like "test VM functionality", I would be able to write up
      the automated tests and hopefully learn more about what breaks/makes the
      kernel
    <nixness> (and it would make implementing the feature much less hand-wavy
      and more correct)
    <nixness> antrik, I believe the framework(CUnit right?) exists, but we just
      need to write the tests.
    <antrik> do you have prior experience implementing automated tests?
    <nixness> lots of tests!
    <nixness> yes, in Java mostly, but I've played around with CUnit
    <antrik> ah, great :-)
    <nixness> here's what I know from experience: 1) write basic tests. 2)
      write ones that test multiple features 3) stress test [option 4)
      benchmark and record to DB]
    <youpi> well, what we'd rather need is to fix the issues we already know
      from the existing testsuites :)

[[GSoC project proposal|community/gsoc/project_ideas/testsuites]].

    <nixness> youpi, true, and I think I should check what's available in way
      of tests, but if the tests are "all or nothing" then it becomes really
      hard to fix your mistakes
    <youpi> they're not all or nothing
    <antrik> youpi: the existing testsuites don't test specific system features
    <youpi> libc ones do
    <youpi> we could also check posixtestsuite which does too

[[open_issues/open_posix_test_suite]].

    <antrik> AFAIK libc has very few failing tests

[[open_issues/glibc_testsuite]].

    <youpi> err, like twenty?
    <youpi> € grep -v '^#' expected-results-i486-gnu-libc | wc -l
    <youpi> 67
    <youpi> nope, even more
    <antrik> oh, sorry, I confused it with coreutils
    <pinotree> plus the binutils ones, i guess
    <youpi> yes

[[open_issues/binutils#weak]].

    <antrik> anyways, I don't think relying on external testsuites for
      regression testing is a good plan
    <antrik> also, that doesn't cover unit testing at all
    <youpi> why ?
    <youpi> sure there can be unit testing at the translator etc. level
    <antrik> if we want to implement test-driven development, and/or do serious
      refactoring without too much risk, we need a test harness where we can
      add specific tests as needed
    <youpi> but more than often, the issues are at the libc / etc. level
      because of a combination o fthings at the translator level, which unit
      testing won't find out
    * nixness yewzzz!
    <nixness> sure unit testing can find them out. if they're good "unit" tests
    <youpi> the problem is that you don't necessarily know what "good" means
    <youpi> e.g. for posix correctness
    <youpi> since it's not posix
    <nixness> but if they're composite clever tests, then you lose that
      granularity
    <nixness> youpi, is that a blackbox test intended to be run at the very end
      to check if you're posix compliant?
    <antrik> also, if we have our own test harness, we can run tests
      automatically as part of the build process, which is a great plus IMHO
    <youpi> nixness: "that" = ?
    <nixness> oh nvm, I thought there was a test stuie called "posix
      correctness"
    <youpi> there's the posixtestsuite yes
    <youpi> it's an external one however
    <youpi> antrik: being external doesn't mean we can't invoke it
      automatically as part of the build process when it's available
    <nixness> youpi, but being internal means it can test the inner workings of
      certain modules that you are unsure of, and not just the interface
    <youpi> sure, that's why I said it'd be useful too
    <youpi> but as I said too, most bugs I've seen were not easy to find out at
      the unit level
    <youpi> but rather at the libc level
    <antrik> of course we can integrate external tests if they exist and are
      suitable. but that that doesn't preclude adding our own ones too. in
      either case, that integration work has to be done too
    <youpi> again, I've never said I was against internal testsuite
    <antrik> also, the major purpose of a test suite is checking known
      behaviour. a low-level test won't directly point to a POSIX violation;
      but it will catch any changes in behaviour that would lead to one
    <youpi> what I said is that it will be hard to write them tight enough to
      find bugs
    <youpi> again, the problem is knowing what will  lead to a POSIX violation
    <youpi> it's long work
    <youpi> while libc / posixtestsuite / etc. already do that
    <antrik> *any* unexpected change in behaviour is likely to cause bugs
      somewher
    <youpi> but WHAT is "expected" ?
    <youpi> THAT is the issue
    <youpi> and libc/posixtessuite do know that
    <youpi> at the translator level we don't really
    <youpi> see the recent post about link()

[link(dir,name) should fail with
EPERM](http://lists.gnu.org/archive/html/bug-hurd/2011-03/msg00007.html)
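
A regression test for this case could look roughly like the following sketch.
It is written as a plain C function rather than against any particular
framework, since no tool has been selected yet.  The function name
`test_link_dir_eperm`, the scratch location under `/tmp`, and the
0-for-pass return convention are assumptions made for illustration only.

```c
#define _POSIX_C_SOURCE 200809L  /* for mkdtemp() */

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical test case: link() on a directory is expected to fail
   with EPERM.  Returns 0 if it does, 1 otherwise.  */
int
test_link_dir_eperm (void)
{
  char dir[] = "/tmp/link-test-XXXXXX";   /* assumed scratch location */
  char name[sizeof dir + sizeof ".link"];
  int n, ret, err;

  if (mkdtemp (dir) == NULL)
    {
      perror ("mkdtemp");
      return 1;
    }

  n = snprintf (name, sizeof name, "%s.link", dir);
  assert (n > 0 && (size_t) n < sizeof name);

  errno = 0;
  ret = link (dir, name);
  err = errno;                  /* save before cleanup can clobber it */

  /* Best-effort cleanup; the unlink is only needed if link() wrongly
     succeeded.  */
  if (ret == 0)
    unlink (name);
  rmdir (dir);

  if (ret == -1 && err == EPERM)
    return 0;

  fprintf (stderr,
           "link (dir, name): ret=%d, errno=%d (%s), expected EPERM\n",
           ret, err, strerror (err));
  return 1;
}
```

Whatever harness is eventually chosen would call this function and record its
result; on a system that still shows the reported behaviour it returns 1 and
prints the error status it actually got.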

    <youpi> in my memory jkoenig pointed it out for a lot of such calls
    <youpi> and that issue is clearly not known at the translator level
    <nixness> so you're saying that the tests have to be really really
      low-level, and work our way up?
    <youpi> I'm saying that the translator level tests will be difficult to
      write
    <antrik> why isn't it known at the translator level? if it's a translator
      (not libc) bug, than obviously the translator must return something wrong
      at some point, and that's something we can check
    <youpi> because you'll have to know all the details of the combinations
      used in libc, to know whether they'll lead to posix issues
    <youpi> antrik: sure, but how did we detect that that was unexpected
      behavior?
    <youpi> because of a glib test
    <youpi> at the translator level we didn't know it was an unexpected
      behavior
    <antrik> gnulib actually
    <youpi> and if you had asked me, I wouldn't have known
    <antrik> again, we do *not* write a test suite to find existing bugs
    <youpi> right, took one for the other
    <youpi> doesn't really matter actually
    <youpi> antrik: ok, I don't care then
    <antrik> we write a test suite to prevent future bugs, or track status of
      known bugs
    <youpi> (don't care *enough* for now, I mean)
    <nixness> hmm, so write something up antrik for GSoC :) and I'll be sure to
      apply
    <antrik> now that we know some translators return a wrong error status in a
      particular situation, we can write a test that checks exactly this error
      status. that way we know when it is fixed, and also when it breaks again
    <antrik> nixness: great :-)
    <nixness> sweet. that kind of thing would also need a db backend
    <antrik> nixness: BTW, if you have a good idea, you can send an application
      for it even if it's not listed among the proposed tasks...
    <antrik> so you don't strictly need a writeup from me to apply for this :-)
    <nixness> antrik, I'll keep that in mind, but I'll also be checking your
      draft page
    <nixness> oh cool :)
    <antrik> (and it's a well known fact that those projects which students
      proposed themselfs tend to be the most successful ones :-) )
    * nixness draft initiated
    <antrik> youpi: BTW, I'm surprised that you didn't mention libc testsuite
      before... working up from there is probably a more effective plan than
      starting with high-level test suites like Python etc...
    <youpi> wasn't it already in the gsoc proposal?
    <youpi> bummer
    <antrik> nope
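
The idea of tracking the status of known bugs, together with the
expected-results list counted with `grep` in the log, could be sketched as
follows.  Each test result is classified against a list of expected failures,
so that a new failure and a newly fixed bug both stand out.  The names
`classify` and `is_expected_failure`, the four outcome labels, and the
one-test-name-per-entry layout are assumptions for illustration; only the `#`
comment convention is taken from the log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Outcome labels are an assumption; XFAIL is a known failure, XPASS a
   test that was expected to fail but passed.  */
enum outcome { PASS, FAIL, XFAIL, XPASS };

/* Return nonzero if NAME appears in LIST.  Entries starting with '#'
   are treated as comments and never match.  */
static int
is_expected_failure (const char *name, const char *const *list, size_t n)
{
  assert (name != NULL);
  for (size_t i = 0; i < n; i++)
    if (list[i][0] != '#' && strcmp (list[i], name) == 0)
      return 1;
  return 0;
}

/* Classify one test result against the expected-failure list.  */
enum outcome
classify (const char *name, int passed,
          const char *const *list, size_t n)
{
  int expected = is_expected_failure (name, list, n);

  if (passed)
    return expected ? XPASS : PASS;
  return expected ? XFAIL : FAIL;
}
```

With such a scheme, a run is clean as long as it reports neither `FAIL` (a
regression) nor `XPASS` (a known bug that got fixed, so the list needs
updating).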