Merge lp:~wgrant/launchpad/rescueiflost-and-updatestatus-to-model into lp:launchpad

Proposed by William Grant
Status: Merged
Approved by: Michael Nelson
Approved revision: no longer in the source branch.
Merged at revision: not available
Proposed branch: lp:~wgrant/launchpad/rescueiflost-and-updatestatus-to-model
Merge into: lp:launchpad
Prerequisite: lp:~wgrant/launchpad/refactor-slave-architecture-check
Diff against target: 326 lines (+105/-87)
5 files modified
lib/lp/buildmaster/buildergroup.py (+1/-76)
lib/lp/buildmaster/interfaces/builder.py (+12/-0)
lib/lp/buildmaster/model/builder.py (+76/-1)
lib/lp/soyuz/doc/buildd-slavescanner.txt (+8/-10)
lib/lp/soyuz/tests/soyuzbuilddhelpers.py (+8/-0)
To merge this branch: bzr merge lp:~wgrant/launchpad/rescueiflost-and-updatestatus-to-model
Reviewer Review Type Date Requested Status
Michael Nelson (community) code Approve
Review via email: mp+22016@code.launchpad.net

Commit message

Move BuilderGroup.{rescueBuilderifLost,updateBuilderStatus} to Builder.

Description of the change

BuilderGroup.rescueBuilderIfLost and updateBuilderStatus act only on a context builder. They belong on IBuilder. This branch moves them, just about emptying out BuilderGroup.

To post a comment you must log in.
Revision history for this message
Michael Nelson (michael.nelson) wrote :

Thanks for more cleanups and simplifications William!

It'd be nice if we didn't need to define the functions externally to be able to re-use them, but you clarified why that's necessary. Although I wonder if MockBuilder could just inherit from Builder and stub out the relevant methods as required? (leaving rescuiIfLost/updateStatus)? I'll leave it up to you.

Just a small suggested update for a comment below.

> === modified file 'lib/lp/buildmaster/model/builder.py'
> --- lib/lp/buildmaster/model/builder.py 2010-03-24 12:02:21 +0000
> +++ lib/lp/buildmaster/model/builder.py 2010-03-24 12:02:22 +0000
> @@ -8,6 +8,8 @@
> __all__ = [
> 'Builder',
> 'BuilderSet',
> + 'rescueBuilderIfLost',
> + 'updateBuilderStatus',
> ]
>
> import httplib
> @@ -40,7 +42,8 @@
> from lp.buildmaster.interfaces.buildbase import BuildStatus
> from lp.buildmaster.interfaces.builder import (
> BuildDaemonError, BuildSlaveFailure, CannotBuild, CannotFetchFile,
> - CannotResumeHost, IBuilder, IBuilderSet, ProtocolVersionMismatch)
> + CannotResumeHost, CorruptBuildID, IBuilder, IBuilderSet,
> + ProtocolVersionMismatch)
> from lp.buildmaster.interfaces.buildfarmjobbehavior import (
> BuildBehaviorMismatch)
> from lp.buildmaster.interfaces.buildqueue import IBuildQueueSet
> @@ -154,6 +157,71 @@
> raise BuildSlaveFailure(info)
>
>
> +# This is a separate function since MockBuilder needs to use it too.
> +# Do not use it -- (Mock)Builder.rescueIfLost should be used instead.

Perhaps:

These functions are defined here so that both Builder and MockBuilder
can share the implementation for the rescueIfLost/updateStatus methods.

> +def rescueBuilderIfLost(builder, logger=None):
> + """See `IBuilder`."""
> + status_sentence = builder.slaveStatusSentence()

{{{
10:10 < noodles775> wgrant: with your rescueiflost/updatestatus branch, couldn't we do something like rescueIfLost = Builder.rescueIfLost on the MockBuilder
                    if the code is the same? (and not have to define the functions?)
10:10 < wgrant> noodles775: Bound methods check the type of the first argument.
10:10 < wgrant> So no, it's not quite that easy :(
10:10 < wgrant> Er, unbound methods.
10:10 < noodles775> Ok.
10:22 < noodles775> wgrant: in which case, is there an obvious reason that I'm missing for why MockBuilder couldn't inherit from Builder and just overload all its methods as needed?
10:27 < wgrant> noodles775: It looks like a few other mocks do that. So maybe I could.
10:27 < noodles775> wgrant: OK, I'll leave it up to you - I personally think it would improve the code.
10:27 < noodles775> r=me

}}}

review: Approve (code)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/lp/buildmaster/buildergroup.py'
2--- lib/lp/buildmaster/buildergroup.py 2010-03-24 12:02:21 +0000
3+++ lib/lp/buildmaster/buildergroup.py 2010-03-24 12:02:22 +0000
4@@ -15,7 +15,6 @@
5
6 from lp.buildmaster.interfaces.builder import (
7 BuildDaemonError, IBuilderSet)
8-from lp.buildmaster.interfaces.builder import CorruptBuildID
9
10
11 class BuilderGroup:
12@@ -46,81 +45,7 @@
13 # 'ok' are not worth rechecking here for some currently
14 # undocumented reason. This also relates to bug #30633.
15 if builder.builderok:
16- self.updateBuilderStatus(builder, arch)
17+ builder.updateStatus(self.logger)
18
19 # Commit the updates made to the builders.
20 self.commit()
21-
22- def updateBuilderStatus(self, builder, arch):
23- """Update the status for a builder by probing it.
24-
25- :param builder: A builder object.
26- :param arch: The expected architecture family of the builder.
27- """
28- self.logger.debug('Checking %s' % builder.name)
29- try:
30- builder.checkSlaveAlive()
31- builder.checkSlaveArchitecture()
32- self.rescueBuilderIfLost(builder)
33- # Catch only known exceptions.
34- # XXX cprov 2007-06-15 bug=120571: ValueError & TypeError catching is
35- # disturbing in this context. We should spend sometime sanitizing the
36- # exceptions raised in the Builder API since we already started the
37- # main refactoring of this area.
38- except (ValueError, TypeError, xmlrpclib.Fault,
39- BuildDaemonError), reason:
40- builder.failBuilder(str(reason))
41- self.logger.warn(
42- "%s (%s) marked as failed due to: %s",
43- builder.name, builder.url, builder.failnotes, exc_info=True)
44- except socket.error, reason:
45- error_message = str(reason)
46- builder.handleTimeout(self.logger, error_message)
47-
48- def rescueBuilderIfLost(self, builder):
49- """Reset Builder slave if job information doesn't match with DB.
50-
51- If builder is BUILDING or WAITING but has an information record
52- that doesn't match what is stored in the DB, we have to dismiss
53- its current actions and let the slave free for another job,
54- assuming the XMLRPC is working properly at this point.
55- """
56- status_sentence = builder.slaveStatusSentence()
57-
58- # 'ident_position' dict relates the position of the job identifier
59- # token in the sentence received from status(), according the
60- # two status we care about. See see lib/canonical/buildd/slave.py
61- # for further information about sentence format.
62- ident_position = {
63- 'BuilderStatus.BUILDING': 1,
64- 'BuilderStatus.WAITING': 2
65- }
66-
67- # Isolate the BuilderStatus string, always the first token in
68- # see lib/canonical/buildd/slave.py and
69- # IBuilder.slaveStatusSentence().
70- status = status_sentence[0]
71-
72- # If slave is not building nor waiting, it's not in need of rescuing.
73- if status not in ident_position.keys():
74- return
75-
76- slave_build_id = status_sentence[ident_position[status]]
77-
78- try:
79- builder.verifySlaveBuildID(slave_build_id)
80- except CorruptBuildID, reason:
81- if status == 'BuilderStatus.WAITING':
82- builder.cleanSlave()
83- else:
84- builder.requestAbort()
85- self.logger.warn("Builder '%s' rescued from '%s': '%s'" % (
86- builder.name, slave_build_id, reason))
87-
88- def updateBuild(self, queueItem):
89- """Verify the current build job status.
90-
91- Perform the required actions for each state.
92- """
93- queueItem.builder.updateBuild(queueItem)
94- self.commit()
95
96=== modified file 'lib/lp/buildmaster/interfaces/builder.py'
97--- lib/lp/buildmaster/interfaces/builder.py 2010-03-24 12:02:21 +0000
98+++ lib/lp/buildmaster/interfaces/builder.py 2010-03-24 12:02:22 +0000
99@@ -173,6 +173,18 @@
100 :raises BuildDaemonError: When the slave is down.
101 """
102
103+ def rescueIfLost(logger=None):
104+ """Reset the slave if its job information doesn't match the DB.
105+
106+ If the builder is BUILDING or WAITING but has a build ID string
107+ that doesn't match what is stored in the DB, we have to dismiss
108+ its current actions and clean the slave for another job, assuming
109+ the XMLRPC is working properly at this point.
110+ """
111+
112+ def updateStatus(logger=None):
113+ """Update the builder's status by probing it."""
114+
115 def cleanSlave():
116 """Clean any temporary files from the slave."""
117
118
119=== modified file 'lib/lp/buildmaster/model/builder.py'
120--- lib/lp/buildmaster/model/builder.py 2010-03-24 12:02:21 +0000
121+++ lib/lp/buildmaster/model/builder.py 2010-03-24 12:02:22 +0000
122@@ -8,6 +8,8 @@
123 __all__ = [
124 'Builder',
125 'BuilderSet',
126+ 'rescueBuilderIfLost',
127+ 'updateBuilderStatus',
128 ]
129
130 import httplib
131@@ -40,7 +42,8 @@
132 from lp.buildmaster.interfaces.buildbase import BuildStatus
133 from lp.buildmaster.interfaces.builder import (
134 BuildDaemonError, BuildSlaveFailure, CannotBuild, CannotFetchFile,
135- CannotResumeHost, IBuilder, IBuilderSet, ProtocolVersionMismatch)
136+ CannotResumeHost, CorruptBuildID, IBuilder, IBuilderSet,
137+ ProtocolVersionMismatch)
138 from lp.buildmaster.interfaces.buildfarmjobbehavior import (
139 BuildBehaviorMismatch)
140 from lp.buildmaster.interfaces.buildqueue import IBuildQueueSet
141@@ -154,6 +157,71 @@
142 raise BuildSlaveFailure(info)
143
144
145+# This is a separate function since MockBuilder needs to use it too.
146+# Do not use it -- (Mock)Builder.rescueIfLost should be used instead.
147+def rescueBuilderIfLost(builder, logger=None):
148+ """See `IBuilder`."""
149+ status_sentence = builder.slaveStatusSentence()
150+
151+ # 'ident_position' dict relates the position of the job identifier
152+ # token in the sentence received from status(), according the
153+ # two status we care about. See see lib/canonical/buildd/slave.py
154+ # for further information about sentence format.
155+ ident_position = {
156+ 'BuilderStatus.BUILDING': 1,
157+ 'BuilderStatus.WAITING': 2
158+ }
159+
160+ # Isolate the BuilderStatus string, always the first token in
161+ # see lib/canonical/buildd/slave.py and
162+ # IBuilder.slaveStatusSentence().
163+ status = status_sentence[0]
164+
165+ # If slave is not building nor waiting, it's not in need of rescuing.
166+ if status not in ident_position.keys():
167+ return
168+
169+ slave_build_id = status_sentence[ident_position[status]]
170+
171+ try:
172+ builder.verifySlaveBuildID(slave_build_id)
173+ except CorruptBuildID, reason:
174+ if status == 'BuilderStatus.WAITING':
175+ builder.cleanSlave()
176+ else:
177+ builder.requestAbort()
178+ if logger:
179+ logger.warn(
180+ "Builder '%s' rescued from '%s': '%s'" %
181+ (builder.name, slave_build_id, reason))
182+
183+
184+def updateBuilderStatus(builder, logger=None):
185+ """See `IBuilder`."""
186+ if logger:
187+ logger.debug('Checking %s' % builder.name)
188+
189+ try:
190+ builder.checkSlaveAlive()
191+ builder.checkSlaveArchitecture()
192+ builder.rescueIfLost(logger)
193+ # Catch only known exceptions.
194+ # XXX cprov 2007-06-15 bug=120571: ValueError & TypeError catching is
195+ # disturbing in this context. We should spend sometime sanitizing the
196+ # exceptions raised in the Builder API since we already started the
197+ # main refactoring of this area.
198+ except (ValueError, TypeError, xmlrpclib.Fault,
199+ BuildDaemonError), reason:
200+ builder.failBuilder(str(reason))
201+ if logger:
202+ logger.warn(
203+ "%s (%s) marked as failed due to: %s",
204+ builder.name, builder.url, builder.failnotes, exc_info=True)
205+ except socket.error, reason:
206+ error_message = str(reason)
207+ builder.handleTimeout(logger, error_message)
208+
209+
210 class Builder(SQLBase):
211
212 implements(IBuilder, IHasBuildRecords)
213@@ -256,6 +324,13 @@
214 if self.slave.echo("Test")[0] != "Test":
215 raise BuildDaemonError("Failed to echo OK")
216
217+ def rescueIfLost(self, logger=None):
218+ """See `IBuilder`."""
219+ rescueBuilderIfLost(self, logger)
220+
221+ def updateStatus(self, logger=None):
222+ updateBuilderStatus(self, logger)
223+
224 def cleanSlave(self):
225 """See IBuilder."""
226 return self.slave.clean()
227
228=== modified file 'lib/lp/soyuz/doc/buildd-slavescanner.txt'
229--- lib/lp/soyuz/doc/buildd-slavescanner.txt 2010-03-13 02:18:57 +0000
230+++ lib/lp/soyuz/doc/buildd-slavescanner.txt 2010-03-24 12:02:22 +0000
231@@ -69,21 +69,21 @@
232
233 >>> sanebuilding_builder = MockBuilder(
234 ... 'Sane Building Slave', SaneBuildingSlave())
235- >>> buildergroup.rescueBuilderIfLost(sanebuilding_builder) is None
236+ >>> sanebuilding_builder.rescueIfLost(buildergroup.logger) is None
237 True
238
239 A sane WAITING slave:
240
241 >>> sanewaiting_builder = MockBuilder(
242 ... 'Sane Waiting Slave', SaneWaitingSlave())
243- >>> buildergroup.rescueBuilderIfLost(sanewaiting_builder) is None
244+ >>> sanewaiting_builder.rescueIfLost(buildergroup.logger) is None
245 True
246
247 A insane WAITING slave, with wrong BuildQueue/Build relation:
248
249 >>> insanewaiting_builder = MockBuilder(
250 ... 'Insane Waiting Slave', InsaneWaitingSlave())
251- >>> buildergroup.rescueBuilderIfLost(insanewaiting_builder)
252+ >>> insanewaiting_builder.rescueIfLost(buildergroup.logger)
253 WARNING:root:Builder 'Insane Waiting Slave' rescued from '7-1': 'Job build entry mismatch'
254
255 It was rescued because the BuildQueue.id == 1 isn't related to
256@@ -94,7 +94,7 @@
257
258 >>> lostbuilding_builder = MockBuilder(
259 ... 'Lost Building Slave', LostBuildingSlave())
260- >>> buildergroup.rescueBuilderIfLost(lostbuilding_builder)
261+ >>> lostbuilding_builder.rescueIfLost(buildergroup.logger)
262 WARNING:root:Builder 'Lost Building Slave' rescued from '1000-10000':
263 'Build 1000 is not available: 'Object not found''
264
265@@ -102,7 +102,7 @@
266
267 >>> lostwaiting_builder = MockBuilder(
268 ... 'Lost Waiting Slave', LostWaitingSlave())
269- >>> buildergroup.rescueBuilderIfLost(lostwaiting_builder)
270+ >>> lostwaiting_builder.rescueIfLost(buildergroup.logger)
271 WARNING:root:Builder 'Lost Waiting Slave' rescued from '1000-10000':
272 'Build 1000 is not available: 'Object not found''
273
274@@ -115,13 +115,13 @@
275 >>> lostbuilding_builder = MockBuilder(
276 ... 'Lost Building Slave', LostBuildingSlave())
277 >>> lostbuilding_builder.current_build_behavior = IdleBuildBehavior()
278- >>> buildergroup.rescueBuilderIfLost(lostbuilding_builder)
279+ >>> lostbuilding_builder.rescueIfLost(buildergroup.logger)
280 WARNING:root:Builder 'Lost Building Slave' rescued from '1000-10000': 'No job assigned to builder'
281
282 >>> lostwaiting_builder = MockBuilder(
283 ... 'Lost Waiting Slave', LostWaitingSlave())
284 >>> lostwaiting_builder.current_build_behavior = IdleBuildBehavior()
285- >>> buildergroup.rescueBuilderIfLost(lostwaiting_builder)
286+ >>> lostwaiting_builder.rescueIfLost(buildergroup.logger)
287 WARNING:root:Builder 'Lost Waiting Slave' rescued from '1000-10000': 'No job assigned to builder'
288
289 Slave-scanner will deactivate a 'lost-building' builder that could not
290@@ -130,9 +130,7 @@
291 >>> lostbuilding_builder = MockBuilder(
292 ... 'Lost Building Broken Slave', LostBuildingBrokenSlave())
293
294- >>> from canonical.launchpad.interfaces import IDistributionSet
295- >>> hoary_i386 = getUtility(IDistributionSet)['ubuntu']['hoary']['i386']
296- >>> buildergroup.updateBuilderStatus(lostbuilding_builder, hoary_i386)
297+ >>> lostbuilding_builder.updateStatus(buildergroup.logger)
298 WARNING:root:Lost Building Broken Slave (http://fake:0000) marked as failed due to: <Fault 8002: 'Could not abort'>
299 Traceback (most recent call last):
300 ...
301
302=== modified file 'lib/lp/soyuz/tests/soyuzbuilddhelpers.py'
303--- lib/lp/soyuz/tests/soyuzbuilddhelpers.py 2010-03-24 12:02:21 +0000
304+++ lib/lp/soyuz/tests/soyuzbuilddhelpers.py 2010-03-24 12:02:22 +0000
305@@ -27,6 +27,8 @@
306
307 from canonical.config import config
308 from lp.buildmaster.interfaces.builder import CannotFetchFile
309+from lp.buildmaster.model.builder import (rescueBuilderIfLost,
310+ updateBuilderStatus)
311 from lp.soyuz.model.binarypackagebuildbehavior import (
312 BinaryPackageBuildBehavior)
313
314@@ -70,6 +72,12 @@
315 def checkSlaveArchitecture(self):
316 pass
317
318+ def rescueIfLost(self, logger=None):
319+ rescueBuilderIfLost(self, logger)
320+
321+ def updateStatus(self, logger=None):
322+ updateBuilderStatus(self, logger)
323+
324
325 class SaneBuildingSlave:
326 """A mock slave that is currently building build 8 and buildqueue 1."""