Merge lp:~mars/launchpad/fix-test_on_merge-578886 into lp:launchpad
Status: | Merged |
---|---|
Approved by: | Māris Fogels |
Approved revision: | no longer in the source branch. |
Merged at revision: | 11372 |
Proposed branch: | lp:~mars/launchpad/fix-test_on_merge-578886 |
Merge into: | lp:launchpad |
Diff against target: |
254 lines (+108/-73) 2 files modified
buildout-templates/bin/test.in (+0/-9) test_on_merge.py (+108/-64) |
To merge this branch: | bzr merge lp:~mars/launchpad/fix-test_on_merge-578886 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Michael Hudson-Doyle | Approve | ||
Robert Collins | Pending | ||
Māris Fogels | Pending | ||
Review via email: mp+31207@code.launchpad.net |
This proposal supersedes a proposal from 2010-05-25.
Commit message
Clean up the test_on_merge.py code to make it handle suite errors correctly. Removed the tabnanny checker.
Description of the change
Hello,
This branch contains simplified code for playing process group shenanigans, and it simplifies the nice_killpg() function. After discussion with Robert, this is a best effort approach.
I am particularly keen to see that the cleanup_
I tested this code as best I could with harnesses that reproduce test suite hangs. However, I think the process group shenanigans alone should be enough to keep the watchdog from dying, and therefore to keep the suite sane.
I have included the original proposal text below. It still holds true for this patch.
Maris
=====
Hi,
This branch fixes test_on_merge.py so that it now terminates the test suite properly after 10 minutes. This means EC2 instances will no longer hang when they kill the test suite.
This branch changes how the testrunner watchdog handles the process groups of itself and its children. The diff explains why the process group fiddling takes place. In order for the testrunner watchdog to control the process groups I had to remove process group handling from bin/test.
I changed the code a fair bit by introducing many comments and some new functions to encapsulate the existing code. I found these changes necessary to understand what the code was doing, as I only had basic knowledge of Unix process handling before this change. This code should now be easier for others to understand.
I dropped the testrunner timeout to 10 minutes. No test should run that long without output.
I cleaned up the print statements a bit, personal preference.
I added some detailed output to the testrunner kill procedure. This is failing loudly, with the hope that the script's behaviour will be less opaque when things go wrong in the code that handles things going wrong.
This branch was tested with a hand-crafted windmill testrunner harness. The harness forces an hour-long sleep() in one of the windmill tests. I used this harness to test both a local test suite timeout and an ec2 test suite timeout, and to ensure that they can kill off an entire process group.
I will complete a full "make check" run and a full "ec2 test" run with this branch before landing.
Pre-implementation call with: no one
Test command: see above
Maris
Gary asked me to have a look at this.
I think it broadly looks reasonable -- certainly it terms of readability it's way better than what went before -- and it's great that you've tested that it works. The proof is in the pudding here!
Just to be clear, the reason this stopped working is because there's now a "xvfb-run" process 'between' the test_on_merge.py script and the bin/test script?
I have a little apprehension about simply dying if we're the process leader. For a start, it means you can't just run "./test_ on_merge. py" from the shell! It also might not work on buildbot -- I think it actually will, unless we change our configuration to specify usePTY=True, but that seems rather fragile. Did you consider re-execing the process if we find we're the process group leader? I've pushed an implementation of this idea to lp:~mwhudson/launchpad/fix-test_on_merge-578886 and it seems to work.
If you merge my change in or do something similar, I think this is good to land.