Merge lp:~wgrant/launchpad/refactor-slave-architecture-check into lp:launchpad

Proposed by William Grant
Status: Merged
Approved by: Julian Edwards
Approved revision: no longer in the source branch.
Merged at revision: not available
Proposed branch: lp:~wgrant/launchpad/refactor-slave-architecture-check
Merge into: lp:launchpad
Diff against target: 195 lines (+71/-18)
5 files modified
lib/lp/buildmaster/buildergroup.py (+1/-1)
lib/lp/buildmaster/doc/builder.txt (+40/-3)
lib/lp/buildmaster/interfaces/builder.py (+5/-4)
lib/lp/buildmaster/model/builder.py (+16/-7)
lib/lp/soyuz/tests/soyuzbuilddhelpers.py (+9/-3)
To merge this branch: bzr merge lp:~wgrant/launchpad/refactor-slave-architecture-check
Reviewer                    Review Type  Date Requested  Status
Michael Nelson (community)  code                         Approve
Review via email: mp+22010@code.launchpad.net

Commit message

Builder.checkCanBuildForDistroArchSeries is now checkSlaveArchitecture, and finds the DAS itself.

Description of the change

This step in my buildd-manager cleanup regime moves the DistroArchSeries<->slavearchtag matching hack into the Builder model, and renames the victim from checkCanBuildForDistroArchSeries to checkSlaveArchitecture.

The archtag matching hack is currently somewhat concealed in the buildd-manager mess. It goes something like this:

 - buildd-manager finds all the DASes.
 - buildd-manager collects all builders with a processor matching each DAS's family.
 - buildd-manager iterates through the builders attached to each DAS, checking their arch tag against the DAS arch tag, and processing the builders in other ways.

So it's checking a DAS<->builder relationship, when there is in fact no such unique relationship. It's terribly fragile and will break in very obscure ways if a DAS happens to have a different archtag from others in its processor family.
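
To make the fragility concrete, here is a minimal sketch of the old flow in plain Python. The `DAS` and `Builder` tuples, the `old_style_check` function, and the `i386-alt` tag are hypothetical stand-ins for illustration only; they are not Launchpad's classes or data.

```python
from collections import namedtuple

# Hypothetical stand-ins for illustration; not Launchpad's real classes.
DAS = namedtuple("DAS", "series architecturetag family")
Builder = namedtuple("Builder", "name family slave_arch_tag")


def old_style_check(all_das, all_builders):
    """Mimic the old flow: pair every builder with every DAS in its family."""
    failures = []
    for das in all_das:
        # Collect the builders whose processor family matches this DAS.
        candidates = [b for b in all_builders if b.family == das.family]
        for builder in candidates:
            # The check is made against *this* DAS, although the builder
            # has no unique relationship with it.
            if builder.slave_arch_tag != das.architecturetag:
                failures.append(
                    (builder.name, das.series, das.architecturetag))
    return failures


das_list = [
    DAS("series-a", "i386", "x86"),
    # Assumed scenario: a second series in the same family, different tag.
    DAS("series-b", "i386-alt", "x86"),
]
builders = [Builder("bob", "x86", "i386")]

for name, series, tag in old_style_check(das_list, builders):
    print("%s failed against %s (expected tag %s)" % (name, series, tag))
```

In this sketch the builder is perfectly healthy, yet it is reported as failed as soon as a sibling DAS in the same family carries a different tag.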

By distilling the hack into this method it becomes a lot more obvious, and buildd-manager can be cleaned up to just iterate straight through all of the registered builders, forgetting about the DAS mess entirely.
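
As a rough illustration of where this leads, the sketch below models the new check with plain Python objects. The `DAS` tuple, the `check_slave_architecture` function, the local `BuildDaemonError` class, and the builder named `frog` are assumptions made for the example; the real method looks the DistroArchSeries up in the database instead of scanning a list.

```python
from collections import namedtuple

# Hypothetical stand-in; the real code queries the database instead.
DAS = namedtuple("DAS", "architecturetag family")


class BuildDaemonError(Exception):
    """Local stand-in for Launchpad's exception of the same name."""


def check_slave_architecture(slave_arch_tag, builder_family, all_das):
    """Pass if any DAS has both the slave's tag and the builder's family."""
    for das in all_das:
        if (das.architecturetag == slave_arch_tag
                and das.family == builder_family):
            return
    raise BuildDaemonError(
        "Bad slave architecture tag: %s (registered family: %s)"
        % (slave_arch_tag, builder_family))


all_das = [DAS("i386", "x86"), DAS("hppa", "hppa")]
# (builder name, registered family, tag reported by the slave)
builders = [("bob", "x86", "i386"), ("frog", "x86", "hppa")]

# The manager can now walk the builders directly, with no outer DAS loop.
results = {}
for name, family, tag in builders:
    try:
        check_slave_architecture(tag, family, all_das)
        results[name] = "ok"
    except BuildDaemonError as error:
        results[name] = str(error)
print(results)
```

The caller no longer needs to know which DAS a builder "belongs" to; it only asks each builder whether its slave is plausible.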

In addition, multi-arch builder support requires a rethink of the slave identification mechanism, and this hack can evaporate during that. I hope to get to it soon, so it might be short-lived.

Michael Nelson (michael.nelson) wrote :

For what it's worth, r=me

09:53 < noodles775> wgrant: I don't see what's worse about your check, than the one you removed? Both check whether the slave's builder_arch matches a das.architecturetag (only difference being that the das used to be passed in)?
09:53 < noodles775> So it seems like a good refactoring to me.
09:53 < wgrant> Right.
09:53 < wgrant> Both are making the best of a terrible situation.
09:53 * wgrant will argue with bigjools when he returns.
09:54 < noodles775> wgrant: You're probably planning to anyway, but pls add a bug to that XXX reference at some point :)
09:56 < noodles775> wgrant: Why not include the processorfamily in the original query rather than as a separate check?
09:57 < wgrant> noodles775: This lets us give a slightly better error.
09:57 < noodles775> That would make your new code almost identical in functionality?
09:57 < noodles775> Ah, ok.
09:57 < wgrant> I suppose I could just say 'Mismatched slave architecture tag: foo'
09:58 < wgrant> But the old one gave the correct value too..
09:59 < wgrant> noodles775: Can you do a quick search for that bug? Lots of that sort of thing are still private.
09:59 < wgrant> There surely is one.
09:59 < noodles775> OK, I would have thought it would have been included in the XXX if there was one. Checking now.
10:01 < noodles775> So just back to your raised exception, you could raise "Mismatched slave architecture, Architecture tag: %s, Processor family: %s"?
10:01 < wgrant> noodles775: I could, true.
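
The trade-off discussed in this log can be sketched as follows, under stated assumptions. Looking up by tag first and comparing the family as a second step lets the two failure modes be reported differently, whereas a single combined query can only say "no match". The `DAS` tuple, the function names, and the "Unknown slave architecture tag" message are hypothetical; only the "Mismatched slave architecture" wording comes from the suggestion above.

```python
from collections import namedtuple

# Hypothetical stand-in for illustration; not Launchpad's real class.
DAS = namedtuple("DAS", "architecturetag family")


class BuildDaemonError(Exception):
    """Local stand-in for Launchpad's exception of the same name."""


def check_with_separate_family_test(slave_arch_tag, builder_family, all_das):
    """Look up by tag first, then compare families as a second step."""
    families = {das.family for das in all_das
                if das.architecturetag == slave_arch_tag}
    if not families:
        # Hypothetical message: no DAS carries this tag at all.
        raise BuildDaemonError(
            "Unknown slave architecture tag: %s" % slave_arch_tag)
    if builder_family not in families:
        # Wording taken from the suggestion in the log above.
        raise BuildDaemonError(
            "Mismatched slave architecture, Architecture tag: %s, "
            "Processor family: %s" % (slave_arch_tag, builder_family))


def describe(slave_arch_tag, builder_family, all_das):
    """Return 'ok' or the error text, for easy comparison."""
    try:
        check_with_separate_family_test(
            slave_arch_tag, builder_family, all_das)
    except BuildDaemonError as error:
        return str(error)
    return "ok"


all_das = [DAS("i386", "x86"), DAS("hppa", "hppa")]
for tag in ("i386", "i387", "hppa"):
    print(tag, "->", describe(tag, "x86", all_das))
```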

review: Approve (code)

Preview Diff

=== modified file 'lib/lp/buildmaster/buildergroup.py'
--- lib/lp/buildmaster/buildergroup.py 2010-02-15 11:55:52 +0000
+++ lib/lp/buildmaster/buildergroup.py 2010-03-24 10:34:31 +0000
@@ -60,7 +60,7 @@
         self.logger.debug('Checking %s' % builder.name)
         try:
             builder.checkSlaveAlive()
-            builder.checkCanBuildForDistroArchSeries(arch)
+            builder.checkSlaveArchitecture()
             self.rescueBuilderIfLost(builder)
         # Catch only known exceptions.
         # XXX cprov 2007-06-15 bug=120571: ValueError & TypeError catching is

=== modified file 'lib/lp/buildmaster/doc/builder.txt'
--- lib/lp/buildmaster/doc/builder.txt 2010-01-13 20:19:43 +0000
+++ lib/lp/buildmaster/doc/builder.txt 2010-03-24 10:34:31 +0000
@@ -1,4 +1,6 @@
-= Builder Class =
+=============
+Builder Class
+=============

 This test aims to meet the requirements of
 <https://launchpad.canonical.com/BasicTestCoverage> for the Builder class,
@@ -54,7 +56,8 @@
     True


-== BuilderSet ==
+BuilderSet
+==========

 Now perform the tests for the Builder ContentSet class, BuilderSet.

@@ -171,7 +174,8 @@
     (1, datetime.timedelta(0, 60))


-== Resuming buildd slaves ==
+Resuming buildd slaves
+======================

 Virtual slaves are resumed using a command specified in the
 configuration profile. Production configuration uses a SSH trigger
@@ -230,3 +234,36 @@

     >>> config_data = config.pop('vm_resume_command')

+
+Slave architecture checks
+=========================
+
+Builder.checkSlaveArchitecture() asks the slave for its version and tries
+to match it against a DistroArchSeries with a ProcessorFamily containing
+the Builder's Processor. If it fails, it will raise an exception so the
+builder can be marked as failed.
+
+A fictitious i386 variant is rejected, since there are no DASes with that
+tag.
+
+    >>> from lp.soyuz.tests.soyuzbuilddhelpers import OkSlave
+    >>> bob.setSlaveForTesting(OkSlave('i387'))
+    >>> bob.checkSlaveArchitecture()
+    Traceback (most recent call last):
+    ...
+    BuildDaemonError: Bad slave architecture tag: i387 (registered family: x86)
+
+hppa isn't in the x86 family, so it too is rejected.
+
+    >>> from lp.soyuz.tests.soyuzbuilddhelpers import OkSlave
+    >>> bob.setSlaveForTesting(OkSlave('hppa'))
+    >>> bob.checkSlaveArchitecture()
+    Traceback (most recent call last):
+    ...
+    BuildDaemonError: Bad slave architecture tag: hppa (registered family: x86)
+
+But i386, a real x86 variant, passes without objection.
+
+    >>> from lp.soyuz.tests.soyuzbuilddhelpers import OkSlave
+    >>> bob.setSlaveForTesting(OkSlave('i386'))
+    >>> bob.checkSlaveArchitecture()

=== modified file 'lib/lp/buildmaster/interfaces/builder.py'
--- lib/lp/buildmaster/interfaces/builder.py 2010-01-22 04:01:17 +0000
+++ lib/lp/buildmaster/interfaces/builder.py 2010-03-24 10:34:31 +0000
@@ -150,13 +150,14 @@
         title=u"The current behavior of the builder for the current job.",
         required=False)

-    def checkCanBuildForDistroArchSeries(distro_arch_series):
-        """Check that the slave can compile for the given distro_arch_release.
+    def checkSlaveArchitecture():
+        """Check that the slave can compile for its nominated processor.

         This will query the builder to determine its actual architecture (as
-        opposed to what we expect it to be).
+        opposed to what we expect it to be). It will then look for a
+        DistroArchSeries with the returned architecture tag, and confirm that
+        the processor type matches.

-        :param distro_arch_release: The distro_arch_release to check against.
         :raises BuildDaemonError: When the builder is down or of the wrong
             architecture.
         :raises ProtocolVersionMismatch: When the builder returns an

=== modified file 'lib/lp/buildmaster/model/builder.py'
--- lib/lp/buildmaster/model/builder.py 2010-03-08 12:53:59 +0000
+++ lib/lp/buildmaster/model/builder.py 2010-03-24 10:34:31 +0000
@@ -215,9 +215,9 @@
     current_build_behavior = property(
         _getCurrentBuildBehavior, _setCurrentBuildBehavior)

-    def checkCanBuildForDistroArchSeries(self, distro_arch_series):
-        """See IBuilder."""
-        # XXX cprov 2007-06-15:
+    def checkSlaveArchitecture(self):
+        """See `IBuilder`."""
+        # XXX cprov 2007-06-15 bug=545839:
         # This function currently depends on the operating system specific
         # details of the build slave to return a processor-family-name (the
         # architecturetag) which matches the distro_arch_series. In reality,
@@ -226,17 +226,26 @@
         # distro specific and potentially different for radically different
         # distributions - its not the right thing to be comparing.

+        from lp.soyuz.model.distroarchseries import DistroArchSeries
+
         # query the slave for its active details.
         # XXX cprov 2007-06-15: Why is 'mechanisms' ignored?
         builder_vers, builder_arch, mechanisms = self.slave.info()
         # we can only understand one version of slave today:
         if builder_vers != '1.0':
             raise ProtocolVersionMismatch("Protocol version mismatch")
-        # check the slave arch-tag against the distro_arch_series.
-        if builder_arch != distro_arch_series.architecturetag:
+
+        # Find a distroarchseries with the returned arch tag.
+        # This is ugly, sick and wrong, but so is the whole concept. See the
+        # XXX above and its bug for details.
+        das = Store.of(self).find(
+            DistroArchSeries, architecturetag=builder_arch,
+            processorfamily=self.processor.family).any()
+
+        if das is None:
             raise BuildDaemonError(
-                "Architecture tag mismatch: %s != %s"
-                % (builder_arch, distro_arch_series.architecturetag))
+                "Bad slave architecture tag: %s (registered family: %s)" %
+                (builder_arch, self.processor.family.name))

     def checkSlaveAlive(self):
         """See IBuilder."""

=== modified file 'lib/lp/soyuz/tests/soyuzbuilddhelpers.py'
--- lib/lp/soyuz/tests/soyuzbuilddhelpers.py 2010-01-22 04:01:17 +0000
+++ lib/lp/soyuz/tests/soyuzbuilddhelpers.py 2010-03-24 10:34:31 +0000
@@ -67,7 +67,7 @@
     def checkSlaveAlive(self):
         pass

-    def checkCanBuildForDistroArchSeries(self, distro_arch_series):
+    def checkSlaveArchitecture(self):
         pass


@@ -154,7 +154,12 @@


 class OkSlave:
-    """An idle mock slave that prints information about itself."""
+    """An idle mock slave that prints information about itself.
+
+    The architecture tag can be customised during initialisation."""
+
+    def __init__(self, arch_tag='i386'):
+        self.arch_tag = arch_tag

     def status(self):
         return ('BuilderStatus.IDLE', '')
@@ -187,7 +192,7 @@
         pass

     def info(self):
-        return ('1.0', 'i386', 'debian')
+        return ('1.0', self.arch_tag, 'debian')

     def resume(self):
         resume_argv = config.builddmaster.vm_resume_command.split()
@@ -232,6 +237,7 @@
     """A mock slave that looks like it's currently waiting."""

     def __init__(self, state, dependencies=None):
+        super(WaitingSlave, self).__init__()
         self.state = state
         self.dependencies = dependencies
