Merge lp:~wgrant/launchpad/rescueiflost-and-updatestatus-to-model into lp:launchpad

Proposed by William Grant
Status: Merged
Approved by: Michael Nelson
Approved revision: no longer in the source branch.
Merged at revision: not available
Proposed branch: lp:~wgrant/launchpad/rescueiflost-and-updatestatus-to-model
Merge into: lp:launchpad
Prerequisite: lp:~wgrant/launchpad/refactor-slave-architecture-check
Diff against target: 326 lines (+105/-87)
5 files modified
lib/lp/buildmaster/buildergroup.py (+1/-76)
lib/lp/buildmaster/interfaces/builder.py (+12/-0)
lib/lp/buildmaster/model/builder.py (+76/-1)
lib/lp/soyuz/doc/buildd-slavescanner.txt (+8/-10)
lib/lp/soyuz/tests/soyuzbuilddhelpers.py (+8/-0)
To merge this branch: bzr merge lp:~wgrant/launchpad/rescueiflost-and-updatestatus-to-model
Reviewer Review Type Date Requested Status
Michael Nelson (community) code Approve
Review via email: mp+22016@code.launchpad.net

Commit message

Move BuilderGroup.{rescueBuilderifLost,updateBuilderStatus} to Builder.

Description of the change

BuilderGroup.rescueBuilderIfLost and updateBuilderStatus act only on a context builder. They belong on IBuilder. This branch moves them, just about emptying out BuilderGroup.

To post a comment you must log in.
Revision history for this message
Michael Nelson (michael.nelson) wrote :

Thanks for more cleanups and simplifications William!

It'd be nice if we didn't need to define the functions externally to be able to re-use them, but you clarified why that's necessary. Although I wonder if MockBuilder could just inherit from Builder and stub out the relevant methods as required? (leaving rescuiIfLost/updateStatus)? I'll leave it up to you.

Just a small suggested update for a comment below.

> === modified file 'lib/lp/buildmaster/model/builder.py'
> --- lib/lp/buildmaster/model/builder.py 2010-03-24 12:02:21 +0000
> +++ lib/lp/buildmaster/model/builder.py 2010-03-24 12:02:22 +0000
> @@ -8,6 +8,8 @@
> __all__ = [
> 'Builder',
> 'BuilderSet',
> + 'rescueBuilderIfLost',
> + 'updateBuilderStatus',
> ]
>
> import httplib
> @@ -40,7 +42,8 @@
> from lp.buildmaster.interfaces.buildbase import BuildStatus
> from lp.buildmaster.interfaces.builder import (
> BuildDaemonError, BuildSlaveFailure, CannotBuild, CannotFetchFile,
> - CannotResumeHost, IBuilder, IBuilderSet, ProtocolVersionMismatch)
> + CannotResumeHost, CorruptBuildID, IBuilder, IBuilderSet,
> + ProtocolVersionMismatch)
> from lp.buildmaster.interfaces.buildfarmjobbehavior import (
> BuildBehaviorMismatch)
> from lp.buildmaster.interfaces.buildqueue import IBuildQueueSet
> @@ -154,6 +157,71 @@
> raise BuildSlaveFailure(info)
>
>
> +# This is a separate function since MockBuilder needs to use it too.
> +# Do not use it -- (Mock)Builder.rescueIfLost should be used instead.

Perhaps:

These functions are defined here so that both Builder and MockBuilder
can share the implementation for the rescueIfLost/updateStatus methods.

> +def rescueBuilderIfLost(builder, logger=None):
> + """See `IBuilder`."""
> + status_sentence = builder.slaveStatusSentence()

{{{
10:10 < noodles775> wgrant: with your rescueiflost/updatestatus branch, couldn't we do something like rescueIfLost = Builder.rescueIfLost on the MockBuilder
                    if the code is the same? (and not have to define the functions?)
10:10 < wgrant> noodles775: Bound methods check the type of the first argument.
10:10 < wgrant> So no, it's not quite that easy :(
10:10 < wgrant> Er, unbound methods.
10:10 < noodles775> Ok.
10:22 < noodles775> wgrant: in which case, is there an obvious reason that I'm missing for why MockBuilder couldn't inherit from Builder and just overload all its methods as needed?
10:27 < wgrant> noodles775: It looks like a few other mocks do that. So maybe I could.
10:27 < noodles775> wgrant: OK, I'll leave it up to you - I personally think it would improve the code.
10:27 < noodles775> r=me

}}}

review: Approve (code)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'lib/lp/buildmaster/buildergroup.py'
--- lib/lp/buildmaster/buildergroup.py 2010-03-24 12:02:21 +0000
+++ lib/lp/buildmaster/buildergroup.py 2010-03-24 12:02:22 +0000
@@ -15,7 +15,6 @@
1515
16from lp.buildmaster.interfaces.builder import (16from lp.buildmaster.interfaces.builder import (
17 BuildDaemonError, IBuilderSet)17 BuildDaemonError, IBuilderSet)
18from lp.buildmaster.interfaces.builder import CorruptBuildID
1918
2019
21class BuilderGroup:20class BuilderGroup:
@@ -46,81 +45,7 @@
46 # 'ok' are not worth rechecking here for some currently45 # 'ok' are not worth rechecking here for some currently
47 # undocumented reason. This also relates to bug #30633.46 # undocumented reason. This also relates to bug #30633.
48 if builder.builderok:47 if builder.builderok:
49 self.updateBuilderStatus(builder, arch)48 builder.updateStatus(self.logger)
5049
51 # Commit the updates made to the builders.50 # Commit the updates made to the builders.
52 self.commit()51 self.commit()
53
54 def updateBuilderStatus(self, builder, arch):
55 """Update the status for a builder by probing it.
56
57 :param builder: A builder object.
58 :param arch: The expected architecture family of the builder.
59 """
60 self.logger.debug('Checking %s' % builder.name)
61 try:
62 builder.checkSlaveAlive()
63 builder.checkSlaveArchitecture()
64 self.rescueBuilderIfLost(builder)
65 # Catch only known exceptions.
66 # XXX cprov 2007-06-15 bug=120571: ValueError & TypeError catching is
67 # disturbing in this context. We should spend sometime sanitizing the
68 # exceptions raised in the Builder API since we already started the
69 # main refactoring of this area.
70 except (ValueError, TypeError, xmlrpclib.Fault,
71 BuildDaemonError), reason:
72 builder.failBuilder(str(reason))
73 self.logger.warn(
74 "%s (%s) marked as failed due to: %s",
75 builder.name, builder.url, builder.failnotes, exc_info=True)
76 except socket.error, reason:
77 error_message = str(reason)
78 builder.handleTimeout(self.logger, error_message)
79
80 def rescueBuilderIfLost(self, builder):
81 """Reset Builder slave if job information doesn't match with DB.
82
83 If builder is BUILDING or WAITING but has an information record
84 that doesn't match what is stored in the DB, we have to dismiss
85 its current actions and let the slave free for another job,
86 assuming the XMLRPC is working properly at this point.
87 """
88 status_sentence = builder.slaveStatusSentence()
89
90 # 'ident_position' dict relates the position of the job identifier
91 # token in the sentence received from status(), according the
92 # two status we care about. See see lib/canonical/buildd/slave.py
93 # for further information about sentence format.
94 ident_position = {
95 'BuilderStatus.BUILDING': 1,
96 'BuilderStatus.WAITING': 2
97 }
98
99 # Isolate the BuilderStatus string, always the first token in
100 # see lib/canonical/buildd/slave.py and
101 # IBuilder.slaveStatusSentence().
102 status = status_sentence[0]
103
104 # If slave is not building nor waiting, it's not in need of rescuing.
105 if status not in ident_position.keys():
106 return
107
108 slave_build_id = status_sentence[ident_position[status]]
109
110 try:
111 builder.verifySlaveBuildID(slave_build_id)
112 except CorruptBuildID, reason:
113 if status == 'BuilderStatus.WAITING':
114 builder.cleanSlave()
115 else:
116 builder.requestAbort()
117 self.logger.warn("Builder '%s' rescued from '%s': '%s'" % (
118 builder.name, slave_build_id, reason))
119
120 def updateBuild(self, queueItem):
121 """Verify the current build job status.
122
123 Perform the required actions for each state.
124 """
125 queueItem.builder.updateBuild(queueItem)
126 self.commit()
12752
=== modified file 'lib/lp/buildmaster/interfaces/builder.py'
--- lib/lp/buildmaster/interfaces/builder.py 2010-03-24 12:02:21 +0000
+++ lib/lp/buildmaster/interfaces/builder.py 2010-03-24 12:02:22 +0000
@@ -173,6 +173,18 @@
173 :raises BuildDaemonError: When the slave is down.173 :raises BuildDaemonError: When the slave is down.
174 """174 """
175175
176 def rescueIfLost(logger=None):
177 """Reset the slave if its job information doesn't match the DB.
178
179 If the builder is BUILDING or WAITING but has a build ID string
180 that doesn't match what is stored in the DB, we have to dismiss
181 its current actions and clean the slave for another job, assuming
182 the XMLRPC is working properly at this point.
183 """
184
185 def updateStatus(logger=None):
186 """Update the builder's status by probing it."""
187
176 def cleanSlave():188 def cleanSlave():
177 """Clean any temporary files from the slave."""189 """Clean any temporary files from the slave."""
178190
179191
=== modified file 'lib/lp/buildmaster/model/builder.py'
--- lib/lp/buildmaster/model/builder.py 2010-03-24 12:02:21 +0000
+++ lib/lp/buildmaster/model/builder.py 2010-03-24 12:02:22 +0000
@@ -8,6 +8,8 @@
8__all__ = [8__all__ = [
9 'Builder',9 'Builder',
10 'BuilderSet',10 'BuilderSet',
11 'rescueBuilderIfLost',
12 'updateBuilderStatus',
11 ]13 ]
1214
13import httplib15import httplib
@@ -40,7 +42,8 @@
40from lp.buildmaster.interfaces.buildbase import BuildStatus42from lp.buildmaster.interfaces.buildbase import BuildStatus
41from lp.buildmaster.interfaces.builder import (43from lp.buildmaster.interfaces.builder import (
42 BuildDaemonError, BuildSlaveFailure, CannotBuild, CannotFetchFile,44 BuildDaemonError, BuildSlaveFailure, CannotBuild, CannotFetchFile,
43 CannotResumeHost, IBuilder, IBuilderSet, ProtocolVersionMismatch)45 CannotResumeHost, CorruptBuildID, IBuilder, IBuilderSet,
46 ProtocolVersionMismatch)
44from lp.buildmaster.interfaces.buildfarmjobbehavior import (47from lp.buildmaster.interfaces.buildfarmjobbehavior import (
45 BuildBehaviorMismatch)48 BuildBehaviorMismatch)
46from lp.buildmaster.interfaces.buildqueue import IBuildQueueSet49from lp.buildmaster.interfaces.buildqueue import IBuildQueueSet
@@ -154,6 +157,71 @@
154 raise BuildSlaveFailure(info)157 raise BuildSlaveFailure(info)
155158
156159
160# This is a separate function since MockBuilder needs to use it too.
161# Do not use it -- (Mock)Builder.rescueIfLost should be used instead.
162def rescueBuilderIfLost(builder, logger=None):
163 """See `IBuilder`."""
164 status_sentence = builder.slaveStatusSentence()
165
166 # 'ident_position' dict relates the position of the job identifier
167 # token in the sentence received from status(), according the
168 # two status we care about. See see lib/canonical/buildd/slave.py
169 # for further information about sentence format.
170 ident_position = {
171 'BuilderStatus.BUILDING': 1,
172 'BuilderStatus.WAITING': 2
173 }
174
175 # Isolate the BuilderStatus string, always the first token in
176 # see lib/canonical/buildd/slave.py and
177 # IBuilder.slaveStatusSentence().
178 status = status_sentence[0]
179
180 # If slave is not building nor waiting, it's not in need of rescuing.
181 if status not in ident_position.keys():
182 return
183
184 slave_build_id = status_sentence[ident_position[status]]
185
186 try:
187 builder.verifySlaveBuildID(slave_build_id)
188 except CorruptBuildID, reason:
189 if status == 'BuilderStatus.WAITING':
190 builder.cleanSlave()
191 else:
192 builder.requestAbort()
193 if logger:
194 logger.warn(
195 "Builder '%s' rescued from '%s': '%s'" %
196 (builder.name, slave_build_id, reason))
197
198
199def updateBuilderStatus(builder, logger=None):
200 """See `IBuilder`."""
201 if logger:
202 logger.debug('Checking %s' % builder.name)
203
204 try:
205 builder.checkSlaveAlive()
206 builder.checkSlaveArchitecture()
207 builder.rescueIfLost(logger)
208 # Catch only known exceptions.
209 # XXX cprov 2007-06-15 bug=120571: ValueError & TypeError catching is
210 # disturbing in this context. We should spend sometime sanitizing the
211 # exceptions raised in the Builder API since we already started the
212 # main refactoring of this area.
213 except (ValueError, TypeError, xmlrpclib.Fault,
214 BuildDaemonError), reason:
215 builder.failBuilder(str(reason))
216 if logger:
217 logger.warn(
218 "%s (%s) marked as failed due to: %s",
219 builder.name, builder.url, builder.failnotes, exc_info=True)
220 except socket.error, reason:
221 error_message = str(reason)
222 builder.handleTimeout(logger, error_message)
223
224
157class Builder(SQLBase):225class Builder(SQLBase):
158226
159 implements(IBuilder, IHasBuildRecords)227 implements(IBuilder, IHasBuildRecords)
@@ -256,6 +324,13 @@
256 if self.slave.echo("Test")[0] != "Test":324 if self.slave.echo("Test")[0] != "Test":
257 raise BuildDaemonError("Failed to echo OK")325 raise BuildDaemonError("Failed to echo OK")
258326
327 def rescueIfLost(self, logger=None):
328 """See `IBuilder`."""
329 rescueBuilderIfLost(self, logger)
330
331 def updateStatus(self, logger=None):
332 updateBuilderStatus(self, logger)
333
259 def cleanSlave(self):334 def cleanSlave(self):
260 """See IBuilder."""335 """See IBuilder."""
261 return self.slave.clean()336 return self.slave.clean()
262337
=== modified file 'lib/lp/soyuz/doc/buildd-slavescanner.txt'
--- lib/lp/soyuz/doc/buildd-slavescanner.txt 2010-03-13 02:18:57 +0000
+++ lib/lp/soyuz/doc/buildd-slavescanner.txt 2010-03-24 12:02:22 +0000
@@ -69,21 +69,21 @@
6969
70 >>> sanebuilding_builder = MockBuilder(70 >>> sanebuilding_builder = MockBuilder(
71 ... 'Sane Building Slave', SaneBuildingSlave())71 ... 'Sane Building Slave', SaneBuildingSlave())
72 >>> buildergroup.rescueBuilderIfLost(sanebuilding_builder) is None72 >>> sanebuilding_builder.rescueIfLost(buildergroup.logger) is None
73 True73 True
7474
75A sane WAITING slave:75A sane WAITING slave:
7676
77 >>> sanewaiting_builder = MockBuilder(77 >>> sanewaiting_builder = MockBuilder(
78 ... 'Sane Waiting Slave', SaneWaitingSlave())78 ... 'Sane Waiting Slave', SaneWaitingSlave())
79 >>> buildergroup.rescueBuilderIfLost(sanewaiting_builder) is None79 >>> sanewaiting_builder.rescueIfLost(buildergroup.logger) is None
80 True80 True
8181
82A insane WAITING slave, with wrong BuildQueue/Build relation:82A insane WAITING slave, with wrong BuildQueue/Build relation:
8383
84 >>> insanewaiting_builder = MockBuilder(84 >>> insanewaiting_builder = MockBuilder(
85 ... 'Insane Waiting Slave', InsaneWaitingSlave())85 ... 'Insane Waiting Slave', InsaneWaitingSlave())
86 >>> buildergroup.rescueBuilderIfLost(insanewaiting_builder)86 >>> insanewaiting_builder.rescueIfLost(buildergroup.logger)
87 WARNING:root:Builder 'Insane Waiting Slave' rescued from '7-1': 'Job build entry mismatch'87 WARNING:root:Builder 'Insane Waiting Slave' rescued from '7-1': 'Job build entry mismatch'
8888
89It was rescued because the BuildQueue.id == 1 isn't related to89It was rescued because the BuildQueue.id == 1 isn't related to
@@ -94,7 +94,7 @@
9494
95 >>> lostbuilding_builder = MockBuilder(95 >>> lostbuilding_builder = MockBuilder(
96 ... 'Lost Building Slave', LostBuildingSlave())96 ... 'Lost Building Slave', LostBuildingSlave())
97 >>> buildergroup.rescueBuilderIfLost(lostbuilding_builder)97 >>> lostbuilding_builder.rescueIfLost(buildergroup.logger)
98 WARNING:root:Builder 'Lost Building Slave' rescued from '1000-10000':98 WARNING:root:Builder 'Lost Building Slave' rescued from '1000-10000':
99 'Build 1000 is not available: 'Object not found''99 'Build 1000 is not available: 'Object not found''
100100
@@ -102,7 +102,7 @@
102102
103 >>> lostwaiting_builder = MockBuilder(103 >>> lostwaiting_builder = MockBuilder(
104 ... 'Lost Waiting Slave', LostWaitingSlave())104 ... 'Lost Waiting Slave', LostWaitingSlave())
105 >>> buildergroup.rescueBuilderIfLost(lostwaiting_builder)105 >>> lostwaiting_builder.rescueIfLost(buildergroup.logger)
106 WARNING:root:Builder 'Lost Waiting Slave' rescued from '1000-10000':106 WARNING:root:Builder 'Lost Waiting Slave' rescued from '1000-10000':
107 'Build 1000 is not available: 'Object not found''107 'Build 1000 is not available: 'Object not found''
108108
@@ -115,13 +115,13 @@
115 >>> lostbuilding_builder = MockBuilder(115 >>> lostbuilding_builder = MockBuilder(
116 ... 'Lost Building Slave', LostBuildingSlave())116 ... 'Lost Building Slave', LostBuildingSlave())
117 >>> lostbuilding_builder.current_build_behavior = IdleBuildBehavior()117 >>> lostbuilding_builder.current_build_behavior = IdleBuildBehavior()
118 >>> buildergroup.rescueBuilderIfLost(lostbuilding_builder)118 >>> lostbuilding_builder.rescueIfLost(buildergroup.logger)
119 WARNING:root:Builder 'Lost Building Slave' rescued from '1000-10000': 'No job assigned to builder'119 WARNING:root:Builder 'Lost Building Slave' rescued from '1000-10000': 'No job assigned to builder'
120120
121 >>> lostwaiting_builder = MockBuilder(121 >>> lostwaiting_builder = MockBuilder(
122 ... 'Lost Waiting Slave', LostWaitingSlave())122 ... 'Lost Waiting Slave', LostWaitingSlave())
123 >>> lostwaiting_builder.current_build_behavior = IdleBuildBehavior()123 >>> lostwaiting_builder.current_build_behavior = IdleBuildBehavior()
124 >>> buildergroup.rescueBuilderIfLost(lostwaiting_builder)124 >>> lostwaiting_builder.rescueIfLost(buildergroup.logger)
125 WARNING:root:Builder 'Lost Waiting Slave' rescued from '1000-10000': 'No job assigned to builder'125 WARNING:root:Builder 'Lost Waiting Slave' rescued from '1000-10000': 'No job assigned to builder'
126126
127Slave-scanner will deactivate a 'lost-building' builder that could not127Slave-scanner will deactivate a 'lost-building' builder that could not
@@ -130,9 +130,7 @@
130 >>> lostbuilding_builder = MockBuilder(130 >>> lostbuilding_builder = MockBuilder(
131 ... 'Lost Building Broken Slave', LostBuildingBrokenSlave())131 ... 'Lost Building Broken Slave', LostBuildingBrokenSlave())
132132
133 >>> from canonical.launchpad.interfaces import IDistributionSet133 >>> lostbuilding_builder.updateStatus(buildergroup.logger)
134 >>> hoary_i386 = getUtility(IDistributionSet)['ubuntu']['hoary']['i386']
135 >>> buildergroup.updateBuilderStatus(lostbuilding_builder, hoary_i386)
136 WARNING:root:Lost Building Broken Slave (http://fake:0000) marked as failed due to: <Fault 8002: 'Could not abort'>134 WARNING:root:Lost Building Broken Slave (http://fake:0000) marked as failed due to: <Fault 8002: 'Could not abort'>
137 Traceback (most recent call last):135 Traceback (most recent call last):
138 ...136 ...
139137
=== modified file 'lib/lp/soyuz/tests/soyuzbuilddhelpers.py'
--- lib/lp/soyuz/tests/soyuzbuilddhelpers.py 2010-03-24 12:02:21 +0000
+++ lib/lp/soyuz/tests/soyuzbuilddhelpers.py 2010-03-24 12:02:22 +0000
@@ -27,6 +27,8 @@
2727
28from canonical.config import config28from canonical.config import config
29from lp.buildmaster.interfaces.builder import CannotFetchFile29from lp.buildmaster.interfaces.builder import CannotFetchFile
30from lp.buildmaster.model.builder import (rescueBuilderIfLost,
31 updateBuilderStatus)
30from lp.soyuz.model.binarypackagebuildbehavior import (32from lp.soyuz.model.binarypackagebuildbehavior import (
31 BinaryPackageBuildBehavior)33 BinaryPackageBuildBehavior)
3234
@@ -70,6 +72,12 @@
70 def checkSlaveArchitecture(self):72 def checkSlaveArchitecture(self):
71 pass73 pass
7274
75 def rescueIfLost(self, logger=None):
76 rescueBuilderIfLost(self, logger)
77
78 def updateStatus(self, logger=None):
79 updateBuilderStatus(self, logger)
80
7381
74class SaneBuildingSlave:82class SaneBuildingSlave:
75 """A mock slave that is currently building build 8 and buildqueue 1."""83 """A mock slave that is currently building build 8 and buildqueue 1."""