Merge lp:~michael.nelson/launchpad/fix-buildd-slave-test into lp:launchpad

Proposed by Michael Nelson
Status: Merged
Approved by: Michael Nelson
Approved revision: no longer in the source branch.
Merged at revision: not available
Proposed branch: lp:~michael.nelson/launchpad/fix-buildd-slave-test
Merge into: lp:launchpad
Diff against target: 17 lines (+4/-3)
1 file modified
lib/canonical/launchpad/daemons/tachandler.py (+4/-3)
To merge this branch: bzr merge lp:~michael.nelson/launchpad/fix-buildd-slave-test
Reviewer                     Review Type   Date Requested   Status
Eleanor Berger (community)   code                           Approve
Review via email: mp+21847@code.launchpad.net

Commit message

Don't swallow all OSError exceptions during two_stage_kill.

Description of the change

I recently saw a failure during an ec2 test run (so my branch didn't land), and today the same failure also appeared on buildbot:

https://lpbuildbot.canonical.com/builders/lp/builds/703/steps/shell_7/logs/summary

It seems that the recently updated two_stage_kill returns before the process has terminated. The process then terminates during the s.info() call instead, which triggers the connection reset.

The recent changes to two_stage_kill can be seen here:
https://code.edge.launchpad.net/~jml/launchpad/sftp-poppy/+merge/21627
although I don't see any behaviour change.

I can reproduce the error from the test failure by doing:
http://pastebin.ubuntu.com/399226/

Pre-implementation discussion with jml:
http://pastebin.ubuntu.com/399240/

As I don't know which exception is being raised, I'm not sure how this can be tested (other than checking that the current test suite still passes).
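
One way to narrow down which errno is involved, as a rough sketch only: on POSIX, a non-blocking waitpid on a pid that has already been reaped should fail with ECHILD. The helper names waitpid_errno and errno_after_reaping are hypothetical and not part of tachandler.py, and the sketch uses Python 3 syntax rather than the Python 2 used in the branch.

```python
import errno
import os
import subprocess
import sys


def waitpid_errno(pid):
    """Return the errno raised by a non-blocking waitpid on `pid`, or None.

    Hypothetical helper for illustration; not part of tachandler.py.
    """
    try:
        os.waitpid(pid, os.WNOHANG)
    except OSError as e:
        return e.errno
    return None


def errno_after_reaping():
    """Report what waitpid raises for a child that was already reaped."""
    # Start a short-lived child and let subprocess reap it.
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()
    # The pid is no longer a child of this process, so waitpid fails.
    return waitpid_errno(child.pid)
```

Comparing the returned value against errno.ECHILD and errno.ESRCH would show whether the "process already gone" case is covered by the new check. It would not, by itself, identify whatever other OSError was being swallowed in the failing run.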

Eleanor Berger (intellectronica) :
review: Approve (code)

Preview Diff

=== modified file 'lib/canonical/launchpad/daemons/tachandler.py'
--- lib/canonical/launchpad/daemons/tachandler.py 2010-03-18 19:18:34 +0000
+++ lib/canonical/launchpad/daemons/tachandler.py 2010-03-22 11:33:25 +0000
@@ -57,9 +57,10 @@
                 return result
             time.sleep(poll_interval)
         except OSError, e:
-            # Raised if the process is gone by the time we try to get the
-            # return value.
-            return
+            if e.errno in (errno.ESRCH, errno.ECHILD):
+                # Raised if the process is gone by the time we try to get the
+                # return value.
+                return
 
     # The process is still around, so terminate it violently.
     try:
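
For context, here is a rough Python 3 sketch of how the patched polling loop might sit inside a complete function. Only the errno check inside the loop comes from the diff. The initial SIGTERM, the default arguments, the body of the final try block (assumed here to send SIGKILL), and the demo helper are assumptions added for illustration, not the actual tachandler.py code.

```python
import errno
import os
import signal
import subprocess
import sys
import time


def two_stage_kill(pid, poll_interval=0.1, num_polls=50):
    """Ask child `pid` to exit with SIGTERM; fall back to SIGKILL.

    Returns the wait status if the child was reaped here, else None.
    """
    # Assumption: the first stage is a plain SIGTERM.
    os.kill(pid, signal.SIGTERM)

    for _ in range(num_polls):
        try:
            # Reap the child without blocking; new_pid is 0 while it runs.
            new_pid, result = os.waitpid(pid, os.WNOHANG)
            if new_pid:
                return result
            time.sleep(poll_interval)
        except OSError as e:
            if e.errno in (errno.ESRCH, errno.ECHILD):
                # The process is gone by the time we ask for its status.
                return None
            # As in the diff, any other errno no longer causes an early
            # return: the loop simply moves on to the next poll.

    # The process is still around, so terminate it violently.
    # Assumption: the truncated try block in the diff sends SIGKILL.
    try:
        os.kill(pid, signal.SIGKILL)
    except OSError:
        pass
    return None


def demo():
    """Start a sleeping child and stop it with two_stage_kill."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    return two_stage_kill(child.pid, poll_interval=0.05)
```

Note that, read this way, an unexpected OSError is still not re-raised; it only stops short-circuiting the function, so the caller no longer gets an early return while the process may still be alive.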