Merge lp:~jml/launchpad/better-sourcecode-update into lp:launchpad
- better-sourcecode-update
- Merge into devel
Proposed by
Jonathan Lange
Status: | Merged |
---|---|
Merged at revision: | not available |
Proposed branch: | lp:~jml/launchpad/better-sourcecode-update |
Merge into: | lp:launchpad |
Diff against target: | None lines |
To merge this branch: | bzr merge lp:~jml/launchpad/better-sourcecode-update |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Michael Hudson-Doyle | Approve | ||
Review via email: mp+11580@code.launchpad.net |
This proposal supersedes a proposal from 2009-09-10.
Commit message
Description of the change
To post a comment you must log in.
Revision history for this message
Jonathan Lange (jml) wrote : Posted in a previous version of this proposal | # |
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote : | # |
I think you should probably add a test that the utilities/
review:
Approve
Preview Diff
[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1 | === modified file '.bzrignore' |
2 | --- .bzrignore 2009-08-31 01:16:03 +0000 |
3 | +++ .bzrignore 2009-09-10 06:11:21 +0000 |
4 | @@ -50,3 +50,4 @@ |
5 | ./_pythonpath.py |
6 | ./production-configs |
7 | bzr.dev |
8 | +_trial_temp |
9 | |
10 | === modified file 'buildout-templates/bin/test.in' |
11 | --- buildout-templates/bin/test.in 2009-08-10 22:08:05 +0000 |
12 | +++ buildout-templates/bin/test.in 2009-09-10 00:17:39 +0000 |
13 | @@ -140,6 +140,7 @@ |
14 | '--test-path=${buildout:directory}/lib', |
15 | '--package=canonical', |
16 | '--package=lp', |
17 | + '--package=devscripts', |
18 | '--layer=!MailmanLayer', |
19 | ] |
20 | |
21 | |
22 | === added directory 'lib/devscripts' |
23 | === added file 'lib/devscripts/__init__.py' |
24 | --- lib/devscripts/__init__.py 1970-01-01 00:00:00 +0000 |
25 | +++ lib/devscripts/__init__.py 2009-09-09 03:27:47 +0000 |
26 | @@ -0,0 +1,4 @@ |
27 | +# Copyright 2009 Canonical Ltd. This software is licensed under the |
28 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
29 | + |
30 | +"""Scripts that are used in developing Launchpad.""" |
31 | |
32 | === added file 'lib/devscripts/sourcecode.py' |
33 | --- lib/devscripts/sourcecode.py 1970-01-01 00:00:00 +0000 |
34 | +++ lib/devscripts/sourcecode.py 2009-09-11 05:01:06 +0000 |
35 | @@ -0,0 +1,199 @@ |
36 | +# Copyright 2009 Canonical Ltd. This software is licensed under the |
37 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
38 | + |
39 | +"""Tools for maintaining the Launchpad source code.""" |
40 | + |
41 | +__metaclass__ = type |
42 | +__all__ = [ |
43 | + 'interpret_config', |
44 | + 'parse_config_file', |
45 | + 'plan_update', |
46 | + ] |
47 | + |
48 | +import os |
49 | +import shutil |
50 | +import sys |
51 | + |
52 | +from bzrlib.branch import Branch |
53 | +from bzrlib.bzrdir import BzrDir |
54 | +from bzrlib.errors import BzrError |
55 | +from bzrlib.plugin import load_plugins |
56 | +from bzrlib.trace import report_exception |
57 | +from bzrlib.transport import get_transport |
58 | +from bzrlib.workingtree import WorkingTree |
59 | + |
60 | + |
61 | +def parse_config_file(file_handle): |
62 | + """Parse the source code config file 'file_handle'. |
63 | + |
64 | + :param file_handle: A file-like object containing sourcecode |
65 | + configuration. |
66 | + :return: A sequence of lines of either '[key, value]' or |
67 | + '[key, value, optional]'. |
68 | + """ |
69 | + for line in file_handle: |
70 | + if line.startswith('#'): |
71 | + continue |
72 | + yield [token.strip() for token in line.split('=')] |
73 | + |
74 | + |
75 | +def interpret_config_entry(entry): |
76 | + """Interpret a single parsed line from the config file.""" |
77 | + return (entry[0], (entry[1], len(entry) > 2)) |
78 | + |
79 | + |
80 | +def interpret_config(config_entries): |
81 | + """Interpret a configuration stream, as parsed by 'parse_config_file'. |
82 | + |
83 | + :param configuration: A sequence of parsed configuration entries. |
84 | + :return: A dict mapping the names of the sourcecode dependencies to a |
85 | + 2-tuple of their branches and whether or not they are optional. |
86 | + """ |
87 | + return dict(map(interpret_config_entry, config_entries)) |
88 | + |
89 | + |
90 | +def _subset_dict(d, keys): |
91 | + """Return a dict that's a subset of 'd', based on the keys in 'keys'.""" |
92 | + return dict((key, d[key]) for key in keys) |
93 | + |
94 | + |
95 | +def plan_update(existing_branches, configuration): |
96 | + """Plan the update to existing branches based on 'configuration'. |
97 | + |
98 | + :param existing_branches: A sequence of branches that already exist. |
99 | + :param configuration: A dictionary of sourcecode configuration, such as is |
100 | + returned by `interpret_config`. |
101 | + :return: (new_branches, update_branches, removed_branches), where |
102 | + 'new_branches' are the branches in the configuration that don't exist |
103 | + yet, 'update_branches' are the branches in the configuration that do |
104 | + exist, and 'removed_branches' are the branches that exist locally, but |
105 | + not in the configuration. 'new_branches' and 'update_branches' are |
106 | + dicts of the same form as 'configuration', 'removed_branches' is a |
107 | + set of the same form as 'existing_branches'. |
108 | + """ |
109 | + existing_branches = set(existing_branches) |
110 | + config_branches = set(configuration.keys()) |
111 | + new_branches = config_branches - existing_branches |
112 | + removed_branches = existing_branches - config_branches |
113 | + update_branches = config_branches.intersection(existing_branches) |
114 | + return ( |
115 | + _subset_dict(configuration, new_branches), |
116 | + _subset_dict(configuration, update_branches), |
117 | + removed_branches) |
118 | + |
119 | + |
120 | +def find_branches(directory): |
121 | + """List the directory names in 'directory' that are branches.""" |
122 | + transport = get_transport(directory) |
123 | + return ( |
124 | + os.path.basename(branch.base.rstrip('/')) |
125 | + for branch in BzrDir.find_branches(transport)) |
126 | + |
127 | + |
128 | +def get_branches(sourcecode_directory, new_branches, |
129 | + possible_transports=None): |
130 | + """Get the new branches into sourcecode.""" |
131 | + for project, (branch_url, optional) in new_branches.iteritems(): |
132 | + destination = os.path.join(sourcecode_directory, project) |
133 | + remote_branch = Branch.open( |
134 | + branch_url, possible_transports=possible_transports) |
135 | + possible_transports.append( |
136 | + remote_branch.bzrdir.root_transport) |
137 | + print 'Getting %s from %s' % (project, branch_url) |
138 | + # If the 'optional' flag is set, then it's a branch that shares |
139 | + # history with Launchpad, so we should share repositories. Otherwise, |
140 | + # we should avoid sharing repositories to avoid format |
141 | + # incompatibilities. |
142 | + force_new_repo = not optional |
143 | + try: |
144 | + remote_branch.bzrdir.sprout( |
145 | + destination, create_tree_if_local=True, |
146 | + source_branch=remote_branch, force_new_repo=force_new_repo, |
147 | + possible_transports=possible_transports) |
148 | + except BzrError: |
149 | + if optional: |
150 | + report_exception(sys.exc_info(), sys.stderr) |
151 | + else: |
152 | + raise |
153 | + |
154 | + |
155 | +def update_branches(sourcecode_directory, update_branches, |
156 | + possible_transports=None): |
157 | + """Update the existing branches in sourcecode.""" |
158 | + if possible_transports is None: |
159 | + possible_transports = [] |
160 | + # XXX: JonathanLange 2009-11-09: Rather than updating one branch after |
161 | + # another, we could instead try to get them in parallel. |
162 | + for project, (branch_url, optional) in update_branches.iteritems(): |
163 | + destination = os.path.join(sourcecode_directory, project) |
164 | + print 'Updating %s' % (project,) |
165 | + local_tree = WorkingTree.open(destination) |
166 | + remote_branch = Branch.open( |
167 | + branch_url, possible_transports=possible_transports) |
168 | + possible_transports.append( |
169 | + remote_branch.bzrdir.root_transport) |
170 | + try: |
171 | + local_tree.pull( |
172 | + remote_branch, |
173 | + possible_transports=possible_transports) |
174 | + except BzrError: |
175 | + if optional: |
176 | + report_exception(sys.exc_info(), sys.stderr) |
177 | + else: |
178 | + raise |
179 | + |
180 | + |
181 | +def remove_branches(sourcecode_directory, removed_branches): |
182 | + """Remove sourcecode that's no longer there.""" |
183 | + for project in removed_branches: |
184 | + destination = os.path.join(sourcecode_directory, project) |
185 | + print 'Removing %s' % project |
186 | + try: |
187 | + shutil.rmtree(destination) |
188 | + except OSError: |
189 | + os.unlink(destination) |
190 | + |
191 | + |
192 | +def update_sourcecode(sourcecode_directory, config_filename): |
193 | + """Update the sourcecode.""" |
194 | + config_file = open(config_filename) |
195 | + config = interpret_config(parse_config_file(config_file)) |
196 | + config_file.close() |
197 | + branches = find_branches(sourcecode_directory) |
198 | + new, updated, removed = plan_update(branches, config) |
199 | + possible_transports = [] |
200 | + get_branches(sourcecode_directory, new, possible_transports) |
201 | + update_branches(sourcecode_directory, updated, possible_transports) |
202 | + remove_branches(sourcecode_directory, removed) |
203 | + |
204 | + |
205 | +def get_launchpad_root(): |
206 | + return os.path.dirname(os.path.dirname(os.path.dirname(__file__))) |
207 | + |
208 | + |
209 | +# XXX: JonathanLange 2009-09-11: By default, the script will operate on the |
210 | +# current checkout. Most people only have symlinks to sourcecode in their |
211 | +# checkouts. This is fine for updating, but breaks for removing (you can't |
212 | +# shutil.rmtree a symlink) and breaks for adding, since it adds the new branch |
213 | +# to the checkout, rather than to the shared sourcecode area. Ideally, the |
214 | +# script would see that the sourcecode directory is full of symlinks and then |
215 | +# follow these symlinks to find the shared source directory. If the symlinks |
216 | +# differ from each other (because of developers fiddling with things), we can |
217 | +# take a survey of all of them, and choose the most popular. |
218 | + |
219 | + |
220 | +def main(args): |
221 | + root = get_launchpad_root() |
222 | + if len(args) > 1: |
223 | + sourcecode_directory = args[1] |
224 | + else: |
225 | + sourcecode_directory = os.path.join(root, 'sourcecode') |
226 | + if len(args) > 2: |
227 | + config_filename = args[2] |
228 | + else: |
229 | + config_filename = os.path.join(root, 'utilities', 'sourcedeps.conf') |
230 | + print 'Sourcecode: %s' % (sourcecode_directory,) |
231 | + print 'Config: %s' % (config_filename,) |
232 | + load_plugins() |
233 | + update_sourcecode(sourcecode_directory, config_filename) |
234 | + return 0 |
235 | |
236 | === added directory 'lib/devscripts/tests' |
237 | === added file 'lib/devscripts/tests/__init__.py' |
238 | --- lib/devscripts/tests/__init__.py 1970-01-01 00:00:00 +0000 |
239 | +++ lib/devscripts/tests/__init__.py 2009-09-09 03:27:47 +0000 |
240 | @@ -0,0 +1,4 @@ |
241 | +# Copyright 2009 Canonical Ltd. This software is licensed under the |
242 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
243 | + |
244 | +"""Tests for devscripts.""" |
245 | |
246 | === added file 'lib/devscripts/tests/test_sourcecode.py' |
247 | --- lib/devscripts/tests/test_sourcecode.py 1970-01-01 00:00:00 +0000 |
248 | +++ lib/devscripts/tests/test_sourcecode.py 2009-09-11 04:57:50 +0000 |
249 | @@ -0,0 +1,156 @@ |
250 | +# Copyright 2009 Canonical Ltd. This software is licensed under the |
251 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
252 | + |
253 | +"""Module docstring goes here.""" |
254 | + |
255 | +__metaclass__ = type |
256 | + |
257 | +import shutil |
258 | +from StringIO import StringIO |
259 | +import tempfile |
260 | +import unittest |
261 | + |
262 | +from bzrlib.bzrdir import BzrDir |
263 | +from bzrlib.tests import TestCase |
264 | +from bzrlib.transport import get_transport |
265 | + |
266 | +from devscripts.sourcecode import ( |
267 | + find_branches, interpret_config, parse_config_file, plan_update) |
268 | + |
269 | + |
270 | +class TestParseConfigFile(unittest.TestCase): |
271 | + """Tests for the config file parser.""" |
272 | + |
273 | + def makeFile(self, contents): |
274 | + return StringIO(contents) |
275 | + |
276 | + def test_empty(self): |
277 | + # Parsing an empty config file returns an empty sequence. |
278 | + empty_file = self.makeFile("") |
279 | + self.assertEqual([], list(parse_config_file(empty_file))) |
280 | + |
281 | + def test_single_value(self): |
282 | + # Parsing a file containing a single key=value pair returns a sequence |
283 | + # containing the (key, value) as a list. |
284 | + config_file = self.makeFile("key=value") |
285 | + self.assertEqual( |
286 | + [['key', 'value']], list(parse_config_file(config_file))) |
287 | + |
288 | + def test_comment_ignored(self): |
289 | + # If a line begins with a '#', then its a comment. |
290 | + comment_only = self.makeFile('# foo') |
291 | + self.assertEqual([], list(parse_config_file(comment_only))) |
292 | + |
293 | + def test_optional_value(self): |
294 | + # Lines in the config file can have a third optional entry. |
295 | + config_file = self.makeFile('key=value=optional') |
296 | + self.assertEqual( |
297 | + [['key', 'value', 'optional']], |
298 | + list(parse_config_file(config_file))) |
299 | + |
300 | + def test_whitespace_stripped(self): |
301 | + # Any whitespace around any of the tokens in the config file are |
302 | + # stripped out. |
303 | + config_file = self.makeFile(' key = value = optional ') |
304 | + self.assertEqual( |
305 | + [['key', 'value', 'optional']], |
306 | + list(parse_config_file(config_file))) |
307 | + |
308 | + |
309 | +class TestInterpretConfiguration(unittest.TestCase): |
310 | + """Tests for the configuration interpreter.""" |
311 | + |
312 | + def test_empty(self): |
313 | + # An empty configuration stream means no configuration. |
314 | + config = interpret_config([]) |
315 | + self.assertEqual({}, config) |
316 | + |
317 | + def test_key_value(self): |
318 | + # A (key, value) pair without a third optional value is returned in |
319 | + # the configuration as a dictionary entry under 'key' with '(value, |
320 | + # False)' as its value. |
321 | + config = interpret_config([['key', 'value']]) |
322 | + self.assertEqual({'key': ('value', False)}, config) |
323 | + |
324 | + def test_key_value_optional(self): |
325 | + # A (key, value, optional) entry is returned in the configuration as a |
326 | + # dictionary entry under 'key' with '(value, True)' as its value. |
327 | + config = interpret_config([['key', 'value', 'optional']]) |
328 | + self.assertEqual({'key': ('value', True)}, config) |
329 | + |
330 | + |
331 | +class TestPlanUpdate(unittest.TestCase): |
332 | + """Tests for how to plan the update.""" |
333 | + |
334 | + def test_trivial(self): |
335 | + # In the trivial case, there are no existing branches and no |
336 | + # configured branches, so there are no branches to add, none to |
337 | + # update, and none to remove. |
338 | + new, existing, removed = plan_update([], {}) |
339 | + self.assertEqual({}, new) |
340 | + self.assertEqual({}, existing) |
341 | + self.assertEqual(set(), removed) |
342 | + |
343 | + def test_all_new(self): |
344 | + # If there are no existing branches, then the all of the configured |
345 | + # branches are new, none are existing and none have been removed. |
346 | + new, existing, removed = plan_update([], {'a': ('b', False)}) |
347 | + self.assertEqual({'a': ('b', False)}, new) |
348 | + self.assertEqual({}, existing) |
349 | + self.assertEqual(set(), removed) |
350 | + |
351 | + def test_all_old(self): |
352 | + # If there configuration is now empty, but there are existing |
353 | + # branches, then that means all the branches have been removed from |
354 | + # the configuration, none are new and none are updated. |
355 | + new, existing, removed = plan_update(['a', 'b', 'c'], {}) |
356 | + self.assertEqual({}, new) |
357 | + self.assertEqual({}, existing) |
358 | + self.assertEqual(set(['a', 'b', 'c']), removed) |
359 | + |
360 | + def test_all_same(self): |
361 | + # If the set of existing branches is the same as the set of |
362 | + # non-existing branches, then they all need to be updated. |
363 | + config = {'a': ('b', False), 'c': ('d', True)} |
364 | + new, existing, removed = plan_update(config.keys(), config) |
365 | + self.assertEqual({}, new) |
366 | + self.assertEqual(config, existing) |
367 | + self.assertEqual(set(), removed) |
368 | + |
369 | + |
370 | +class TestFindBranches(TestCase): |
371 | + """Tests the way that we find branches.""" |
372 | + |
373 | + def makeBranch(self, path): |
374 | + transport = get_transport(path) |
375 | + transport.ensure_base() |
376 | + BzrDir.create_branch_convenience( |
377 | + transport.base, possible_transports=[transport]) |
378 | + |
379 | + def makeDirectory(self): |
380 | + directory = tempfile.mkdtemp() |
381 | + self.addCleanup(shutil.rmtree, directory) |
382 | + return directory |
383 | + |
384 | + def test_empty_directory_has_no_branches(self): |
385 | + # An empty directory has no branches. |
386 | + empty = self.makeDirectory() |
387 | + self.assertEqual([], list(find_branches(empty))) |
388 | + |
389 | + def test_directory_with_branches(self): |
390 | + # find_branches finds branches in the directory. |
391 | + directory = self.makeDirectory() |
392 | + self.makeBranch('%s/a' % directory) |
393 | + self.assertEqual(['a'], list(find_branches(directory))) |
394 | + |
395 | + def test_ignores_files(self): |
396 | + # find_branches ignores any files in the directory. |
397 | + directory = self.makeDirectory() |
398 | + some_file = open('%s/a' % directory, 'w') |
399 | + some_file.write('hello\n') |
400 | + some_file.close() |
401 | + self.assertEqual([], list(find_branches(directory))) |
402 | + |
403 | + |
404 | +def test_suite(): |
405 | + return unittest.TestLoader().loadTestsFromName(__name__) |
406 | |
407 | === modified file 'utilities/rocketfuel-get' |
408 | --- utilities/rocketfuel-get 2009-07-24 15:37:03 +0000 |
409 | +++ utilities/rocketfuel-get 2009-09-11 05:01:06 +0000 |
410 | @@ -64,45 +64,9 @@ |
411 | echo " See https://bugs.edge.launchpad.net/bzr/+bug/401617 for why." |
412 | echo "" |
413 | echo " Sourcedeps: $LP_SOURCEDEPS_PATH" |
414 | -while IFS="=" read package branch optional |
415 | -do |
416 | - package_path="$LP_SOURCEDEPS_PATH/$package" |
417 | - echo " Checking $package_path" |
418 | - if [ "$optional" ] |
419 | - then |
420 | - # The 'set +e' below is so that we don't exit when the 'bzr log' call |
421 | - # returns an error. |
422 | - set +e |
423 | - bzr log -l 1 -q "$branch" &> /dev/null |
424 | - retval=$? |
425 | - set -e |
426 | - if [ $retval -ne 0 ] |
427 | - then |
428 | - # We can't reach this branch, but it's optional anyway, so we'll |
429 | - # just move on. |
430 | - echo "Couldn't find $branch, skipping it." |
431 | - continue |
432 | - fi |
433 | - if [ -d "$package_path" ] |
434 | - then |
435 | - run-child bzr pull --remember "$branch" --directory "$package_path" |
436 | - else |
437 | - # This is either shipit or the openid-provider, and they share the |
438 | - # history with our main tree, so we'll do this hack to stack them |
439 | - # onto LP's trunk branch. |
440 | - run-child bzr branch --no-tree --stacked "$LP_TRUNK_PATH" "$package_path" |
441 | - run-child bzr pull --overwrite "$branch" -d "$package_path" |
442 | - run-child bzr checkout "$package_path" "$package_path" |
443 | - fi |
444 | - else |
445 | - if [ -d "$package_path" ] |
446 | - then |
447 | - run-child bzr pull --remember "$branch" --directory "$package_path" |
448 | - else |
449 | - run-child bzr branch --standalone "$branch" "$package_path" |
450 | - fi |
451 | - fi |
452 | -done < "$sourcedeps_conf" |
453 | + |
454 | +run-child $LP_TRUNK_PATH/utilities/update-sourcecode $LP_SOURCEDEPS_PATH $sourcedeps_conf |
455 | + |
456 | |
457 | # Update the current trees in the repo. |
458 | echo "Updating sourcecode dependencies in current local branches:" |
459 | |
460 | === added file 'utilities/update-sourcecode' |
461 | --- utilities/update-sourcecode 1970-01-01 00:00:00 +0000 |
462 | +++ utilities/update-sourcecode 2009-09-10 04:31:12 +0000 |
463 | @@ -0,0 +1,16 @@ |
464 | +#!/usr/bin/python2.4 |
465 | +# |
466 | +# Copyright 2009 Canonical Ltd. This software is licensed under the |
467 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
468 | + |
469 | +"""Update the sourcecode managed in sourcecode/.""" |
470 | + |
471 | +import _pythonpath |
472 | + |
473 | +import sys |
474 | + |
475 | +from devscripts import sourcecode |
476 | + |
477 | + |
478 | +if __name__ == '__main__': |
479 | + sys.exit(sourcecode.main(sys.argv)) |
This branch adds a script that updates the branches in sourcecode, using Bazaar itself. Although rocketfuel-get already does this, the facility for doing so is untested and grouped together with a lot of other code.
By extracting the behaviour out into its own script, and indeed its owned tested functions, this patch opens up opportunities for ec2 image generation, and for getting us closer to being able to get started with Launchpad development without running rocketfuel-setup.
The tests are reasonably complete, in terms of logic. However, they begin to fall down at the point of actually doing stuff with Bazaar. This is mostly because I got stuck there, not able to figure out how to test that bit sanely.
For the utility to be useful, it really should have a nice command-line UI, be well-documented and provide good errors. It doesn't do that right now, and I'd appreciate it if you pushed me into making it do so, along with some specific notes.
One interesting thing about this branch is the way it makes a new top-level package called 'devutils'. It adds this package to the bin/test template so we can actually run the tests ("./bin/test -cvv devutils", btw). The script itself is put in utilities/ and deliberately kept as empty as possible.
The script takes about 2m to run when there's nothing to actually do. rsync takes about 12s to do the same. Although I have no way of actually measuring this, I'm confident that the script will be faster than rsync when there actually are updates. I don't really know where that extra 6s per branch is going.
There are two things I can think of to take this work further. One is to have the script be symlink-aware, so that it can be run on the sourcecode/ directory of a branch, but actually make new branches in the _real_ sourcecode directory. The second is to have rocketfuel-get use this script.
That's about all I can think of. Your opinions are greatly valued.
jml