Diffstat (limited to 'open_issues')
-rw-r--r--  open_issues/unit_testing.mdwn  266
1 file changed, 5 insertions(+), 261 deletions(-)
diff --git a/open_issues/unit_testing.mdwn b/open_issues/unit_testing.mdwn
index feda3be4..1378be85 100644
--- a/open_issues/unit_testing.mdwn
+++ b/open_issues/unit_testing.mdwn
@@ -8,7 +8,8 @@ Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]
-This task may be suitable for [[community/GSoC]][[!tag gsoc-task]].
+This task may be suitable for [[community/GSoC]]:
+[[community/gsoc/project_ideas/testing_framework]]
---
@@ -80,263 +81,6 @@ abandoned).
# Discussion
-freenode, #hurd channel, 2011-03-05:
-
- <nixness> what about testing though?
- <nixness> like sort of "what's missing? let's write tests for it so that
- when someone gets to implementing it, he knows what to do. Repeat"
- project
- <antrik> you mean creating an automated testing framework?
- <antrik> this is actually a task I want to add for this year, if I get
- around to it :-)
- <nixness> yeah I'd very much want to apply for that one
- <nixness> cuz I've never done Kernel hacking before, but I know that with
- the right tasks like "test VM functionality", I would be able to write up
- the automated tests and hopefully learn more about what breaks/makes the
- kernel
- <nixness> (and it would make implementing the feature much less hand-wavy
- and more correct)
- <nixness> antrik, I believe the framework(CUnit right?) exists, but we just
- need to write the tests.
- <antrik> do you have prior experience implementing automated tests?
- <nixness> lots of tests!
- <nixness> yes, in Java mostly, but I've played around with CUnit
- <antrik> ah, great :-)
- <nixness> here's what I know from experience: 1) write basic tests. 2)
- write ones that test multiple features 3) stress test [option 4)
- benchmark and record to DB]
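-
-For illustration, a minimal sketch of what such a CUnit-based test might look
-like, using CUnit's Basic interface (the test itself, a sanity check on the
-reported page size, is made up for the example):
-
-    #include <CUnit/Basic.h>
-    #include <unistd.h>
-
-    /* Hypothetical example test: the page size reported to user space
-       is sane.  */
-    static void
-    test_pagesize (void)
-    {
-      CU_ASSERT (getpagesize () > 0);
-    }
-
-    int
-    main (void)
-    {
-      CU_pSuite suite;
-
-      if (CU_initialize_registry () != CUE_SUCCESS)
-        return CU_get_error ();
-      suite = CU_add_suite ("vm", NULL, NULL);
-      if (suite == NULL
-          || CU_add_test (suite, "pagesize", test_pagesize) == NULL)
-        {
-          CU_cleanup_registry ();
-          return CU_get_error ();
-        }
-      CU_basic_run_tests ();
-      CU_cleanup_registry ();
-      return CU_get_error ();
-    }
-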
- <youpi> well, what we'd rather need is to fix the issues we already know
- from the existing testsuites :)
-
-[[GSoC project proposal|community/gsoc/project_ideas/testsuites]].
-
- <nixness> youpi, true, and I think I should check what's available in way
- of tests, but if the tests are "all or nothing" then it becomes really
- hard to fix your mistakes
- <youpi> they're not all or nothing
- <antrik> youpi: the existing testsuites don't test specific system features
- <youpi> libc ones do
- <youpi> we could also check posixtestsuite which does too
-
-[[open_issues/open_posix_test_suite]].
-
- <antrik> AFAIK libc has very few failing tests
-
-[[open_issues/glibc_testsuite]].
-
- <youpi> err, like twenty?
- <youpi> € grep -v '^#' expected-results-i486-gnu-libc | wc -l
- <youpi> 67
- <youpi> nope, even more
- <antrik> oh, sorry, I confused it with coreutils
- <pinotree> plus the binutils ones, i guess
- <youpi> yes
-
-[[open_issues/binutils#weak]].
-
- <antrik> anyways, I don't think relying on external testsuites for
- regression testing is a good plan
- <antrik> also, that doesn't cover unit testing at all
- <youpi> why ?
- <youpi> sure there can be unit testing at the translator etc. level
- <antrik> if we want to implement test-driven development, and/or do serious
- refactoring without too much risk, we need a test harness where we can
- add specific tests as needed
- <youpi> but more than often, the issues are at the libc / etc. level
- because of a combination of things at the translator level, which unit
- testing won't find out
- * nixness yewzzz!
- <nixness> sure unit testing can find them out. if they're good "unit" tests
- <youpi> the problem is that you don't necessarily know what "good" means
- <youpi> e.g. for posix correctness
- <youpi> since it's not posix
- <nixness> but if they're composite clever tests, then you lose that
- granularity
- <nixness> youpi, is that a blackbox test intended to be run at the very end
- to check if you're posix compliant?
- <antrik> also, if we have our own test harness, we can run tests
- automatically as part of the build process, which is a great plus IMHO
- <youpi> nixness: "that" = ?
- <nixness> oh nvm, I thought there was a test suite called "posix
- correctness"
- <youpi> there's the posixtestsuite yes
- <youpi> it's an external one however
- <youpi> antrik: being external doesn't mean we can't invoke it
- automatically as part of the build process when it's available
- <nixness> youpi, but being internal means it can test the inner workings of
- certain modules that you are unsure of, and not just the interface
- <youpi> sure, that's why I said it'd be useful too
- <youpi> but as I said too, most bugs I've seen were not easy to find out at
- the unit level
- <youpi> but rather at the libc level
- <antrik> of course we can integrate external tests if they exist and are
- suitable. but that that doesn't preclude adding our own ones too. in
- either case, that integration work has to be done too
- <youpi> again, I've never said I was against internal testsuite
- <antrik> also, the major purpose of a test suite is checking known
- behaviour. a low-level test won't directly point to a POSIX violation;
- but it will catch any changes in behaviour that would lead to one
- <youpi> what I said is that it will be hard to write them tight enough to
- find bugs
- <youpi> again, the problem is knowing what will lead to a POSIX violation
- <youpi> it's long work
- <youpi> while libc / posixtestsuite / etc. already do that
- <antrik> *any* unexpected change in behaviour is likely to cause bugs
- somewhere
- <youpi> but WHAT is "expected" ?
- <youpi> THAT is the issue
- <youpi> and libc/posixtestsuite do know that
- <youpi> at the translator level we don't really
- <youpi> see the recent post about link()
-
-[link(dir,name) should fail with
-EPERM](http://lists.gnu.org/archive/html/bug-hurd/2011-03/msg00007.html)
-
- <youpi> in my memory jkoenig pointed it out for a lot of such calls
- <youpi> and that issue is clearly not known at the translator level
- <nixness> so you're saying that the tests have to be really really
- low-level, and work our way up?
- <youpi> I'm saying that the translator level tests will be difficult to
- write
- <antrik> why isn't it known at the translator level? if it's a translator
- (not libc) bug, than obviously the translator must return something wrong
- at some point, and that's something we can check
- <youpi> because you'll have to know all the details of the combinations
- used in libc, to know whether they'll lead to posix issues
- <youpi> antrik: sure, but how did we detect that that was unexpected
- behavior?
- <youpi> because of a glib test
- <youpi> at the translator level we didn't know it was an unexpected
- behavior
- <antrik> gnulib actually
- <youpi> and if you had asked me, I wouldn't have known
- <antrik> again, we do *not* write a test suite to find existing bugs
- <youpi> right, took one for the other
- <youpi> doesn't really matter actually
- <youpi> antrik: ok, I don't care then
- <antrik> we write a test suite to prevent future bugs, or track status of
- known bugs
- <youpi> (don't care *enough* for now, I mean)
- <nixness> hmm, so write something up antrik for GSoC :) and I'll be sure to
- apply
- <antrik> now that we know some translators return a wrong error status in a
- particular situation, we can write a test that checks exactly this error
- status. that way we know when it is fixed, and also when it breaks again
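-
-A sketch of such a check for the link() case mentioned above (the file names
-are made up for the example; POSIX specifies EPERM for link() on a
-directory):
-
-    #include <errno.h>
-    #include <stdio.h>
-    #include <sys/stat.h>
-    #include <unistd.h>
-
-    /* Regression check: link() on a directory must fail with EPERM.  */
-    int
-    main (void)
-    {
-      if (mkdir ("testdir", 0700) != 0 && errno != EEXIST)
-        {
-          perror ("mkdir");
-          return 2;
-        }
-      if (link ("testdir", "testlink") == 0)
-        {
-          unlink ("testlink");
-          rmdir ("testdir");
-          puts ("FAIL: link() on a directory succeeded");
-          return 1;
-        }
-      if (errno != EPERM)
-        {
-          printf ("FAIL: link() set errno %d, expected EPERM\n", errno);
-          rmdir ("testdir");
-          return 1;
-        }
-      puts ("PASS");
-      rmdir ("testdir");
-      return 0;
-    }
-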
- <antrik> nixness: great :-)
- <nixness> sweet. that kind of thing would also need a db backend
- <antrik> nixness: BTW, if you have a good idea, you can send an application
- for it even if it's not listed among the proposed tasks...
- <antrik> so you don't strictly need a writeup from me to apply for this :-)
- <nixness> antrik, I'll keep that in mind, but I'll also be checking your
- draft page
- <nixness> oh cool :)
- <antrik> (and it's a well known fact that those projects which students
- proposed themselves tend to be the most successful ones :-) )
- * nixness draft initiated
- <antrik> youpi: BTW, I'm surprised that you didn't mention libc testsuite
- before... working up from there is probably a more effective plan than
- starting with high-level test suites like Python etc...
- <youpi> wasn't it already in the gsoc proposal?
- <youpi> bummer
- <antrik> nope
-
-freenode, #hurd channel, 2011-03-06:
-
- <nixness> how's the hurd coding workflow, typically?
-
-*nixness* -> *foocraft*.
-
- <foocraft> we're discussing how TDD can work with Hurd (or general kernel
- development) on #osdev
- <foocraft> so what I wanted to know, since I haven't dealt much with GNU
- Hurd, is how do you guys go about coding, in this case
- <tschwinge> Our current workflow scheme is... well... is...
- <tschwinge> Someone wants to work on something, or spots a bug, then works
- on it, submits a patch, and 0 to 10 years later it is applied.
- <tschwinge> Roughly.
- <foocraft> hmm so there's no indicator of whether things broke with that
- patch
- <foocraft> and how low do you think we can get with tests? A friend of mine
- was telling me that with kernel dev, you really don't know whether, for
- instance, the stack even exists, and a lot of things that I, as a
- programmer, can assume while writing code break when it comes to writing
- kernel code
- <foocraft> Autotest seems promising
-
-See autotest link given above.
-
- <foocraft> but in any case, coming up with the testing framework that
- doesn't break when the OS itself breaks is hard, it seems
- <foocraft> not sure if autotest isolates the mistakes in the os from
- finding their way in the validity of the tests themselves
- <youpi> it could be interesting to have scripts that automatically start a
- sub-hurd to do the tests
-
-[[hurd/subhurd#unit_testing]].
-
- <tschwinge> foocraft: To answer one of your earlier questions: you can do
- really low-level testing. Like testing Mach's message passing. A
- million times. The question is whether that makes sense. And / or if
- it makes sense to do that as part of a unit testing framework. Or rather
- do such things manually once you suspect an error somewhere.
- <tschwinge> The reason for the latter may be that Mach IPC is already
- heavily tested during normal system operation.
- <tschwinge> And yet, there still may be (there are, surely) bugs.
- <tschwinge> But I guess that you have to stop at some (arbitrary?) level.
- <foocraft> so we'd assume it works, and test from there
- <tschwinge> Otherwise you'd be implementing the exact counter-part of what
- you're testing.
- <tschwinge> Which may be what you want, or may be not. Or it may just not
- be feasible.
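-
-As a rough illustration of the low-level end, a message-passing smoke test
-might look like the following sketch: it sends a minimal message to a port
-the task itself owns and receives it back in a loop (the iteration count and
-message id are arbitrary):
-
-    #include <mach.h>
-    #include <stdio.h>
-
-    int
-    main (void)
-    {
-      mach_port_t port;
-      /* Leave some extra room in the buffer for the receive side.  */
-      struct
-      {
-        mach_msg_header_t head;
-        char space[64];
-      } msg;
-      int i;
-
-      if (mach_port_allocate (mach_task_self (), MACH_PORT_RIGHT_RECEIVE,
-                              &port) != KERN_SUCCESS)
-        {
-          fputs ("FAIL: mach_port_allocate\n", stderr);
-          return 1;
-        }
-
-      for (i = 0; i < 1000000; i++)
-        {
-          /* Build a header-only message addressed to our own port.  */
-          msg.head.msgh_bits = MACH_MSGH_BITS (MACH_MSG_TYPE_MAKE_SEND, 0);
-          msg.head.msgh_size = sizeof msg.head;
-          msg.head.msgh_remote_port = port;
-          msg.head.msgh_local_port = MACH_PORT_NULL;
-          msg.head.msgh_id = 42;
-          /* Send it and immediately receive it back.  */
-          if (mach_msg (&msg.head, MACH_SEND_MSG | MACH_RCV_MSG,
-                        sizeof msg.head, sizeof msg, port,
-                        MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL)
-              != MACH_MSG_SUCCESS)
-            {
-              fputs ("FAIL: mach_msg\n", stderr);
-              return 1;
-            }
-        }
-      puts ("PASS");
-      return 0;
-    }
-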
- <foocraft> maybe the testing framework should have dependencies
- <foocraft> which we can automate using make, and phony targets that run
- tests
- <foocraft> so everyone writes a test suite and says that it depends on A
- and B working correctly
- <foocraft> then it'd go try to run the tests for A etc.
- <tschwinge> Hmm, isn't that -- on a high level -- what you have by
- different packages? For example, the perl testsuite depends (inherently)
- on glibc working properly. A perl program's testsuite depends on perl
- working properly.
- <foocraft> yeah, but afaik, the ordering is done by hand
-
-freenode, #hurd channel, 2011-03-07:
-
- <antrik> actually, I think for most tests it would be better not to use a
- subhurd... that leads precisely to the problem that if something is
- broken, you might have a hard time running the tests at all :-)
- <antrik> foocraft: most of the Hurd code isn't really low-level. you can
- use normal debugging and testing methods
- <antrik> gnumach of course *does* have some low-level stuff, so if you add
- unit tests to gnumach too, you might run into issues :-)
- <antrik> tschwinge: I think testing IPC is a good thing. as I already said,
- automated testing is *not* to discover existing but unknown bugs, but to
- prevent new ones creeping in, and tracking progress on known bugs
- <antrik> tschwinge: I think you are confusing unit testing and regression
- testing. http://www.bddebian.com/~hurd-web/open_issues/unit_testing/
- talks about unit testing, but a lot (most?) of it is actually about
- regression tests...
- <tschwinge> antrik: That may certainly be -- I'm not at all an expert in
- this, and just generally though that some sort of automated testing is
- needed, and thus started collecting ideas.
- <tschwinge> antrik: You're of course invited to fix that.
-
-freenode, #hurd channel, 2011-03-08:
-
-(After discussing the [[anatomy_of_a_hurd_system]].)
-
- <antrik> so that's what your question is actually about?
- <foocraft> so what I would imagine is a set of only-this-server tests for
- each server, and then we can have fun adding composite tests
- <foocraft> thus making debugging the composite scenarios a bit less tricky
- <antrik> indeed
- <foocraft> and if you were trying to pass a composite test, it would also
- help knowing that you still didn't break the server-only test
- <antrik> there are so many different things that can be tested... the
- summer will only suffice to dip into this really :-)
- <foocraft> yeah, I'm designing my proposal to focus on 1) make/use a
- testing framework that fits the Hurd case very well 2) write some tests
- and docs on how to write good tests
- <antrik> well, doesn't have to be *one* framework... unit testing and
- regression testing are quite different things, which can be covered by
- different frameworks
+See the [[GSoC project idea|community/gsoc/project_ideas/testing_framework]]'s
+[[discussion
+subpage|community/gsoc/project_ideas/testing_framework/discussion]].