Merge lp:~jml/launchpad/better-sourcecode-update into lp:launchpad

Proposed by Jonathan Lange
Status: Merged
Merged at revision: not available
Proposed branch: lp:~jml/launchpad/better-sourcecode-update
Merge into: lp:launchpad
Diff against target: None lines
To merge this branch: bzr merge lp:~jml/launchpad/better-sourcecode-update
Reviewer Review Type Date Requested Status
Michael Hudson-Doyle Approve
Review via email: mp+11580@code.launchpad.net

This proposal supersedes a proposal from 2009-09-10.

To post a comment you must log in.
Revision history for this message
Jonathan Lange (jml) wrote : Posted in a previous version of this proposal

This branch adds a script that updates the branches in sourcecode, using Bazaar itself. Although rocketfuel-get already does this, the facility for doing so is untested and grouped together with a lot of other code.

By extracting the behaviour out into its own script, and indeed its owned tested functions, this patch opens up opportunities for ec2 image generation, and for getting us closer to being able to get started with Launchpad development without running rocketfuel-setup.

The tests are reasonably complete, in terms of logic. However, they begin to fall down at the point of actually doing stuff with Bazaar. This is mostly because I got stuck there, not able to figure out how to test that bit sanely.

For the utility to be useful, it really should have a nice command-line UI, be well-documented and provide good errors. It doesn't do that right now, and I'd appreciate it if you pushed me into making it do so, along with some specific notes.

One interesting thing about this branch is the way it makes a new top-level package called 'devutils'. It adds this package to the bin/test template so we can actually run the tests ("./bin/test -cvv devutils", btw). The script itself is put in utilities/ and deliberately kept as empty as possible.

The script takes about 2m to run when there's nothing to actually do. rsync takes about 12s to do the same. Although I have no way of actually measuring this, I'm confident that the script will be faster than rsync when there actually are updates. I don't really know where that extra 6s per branch is going.

There are two things I can think of to take this work further. One is to have the script be symlink-aware, so that it can be run on the sourcecode/ directory of a branch, but actually make new branches in the _real_ sourcecode directory. The second is to have rocketfuel-get use this script.

That's about all I can think of. Your opinions are greatly valued.

jml

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

I think you should probably add a test that the utilities/sourcedeps.conf file can be parsed by your code. Otherwise, I think it's fine, land it :)

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file '.bzrignore'
--- .bzrignore 2009-08-31 01:16:03 +0000
+++ .bzrignore 2009-09-10 06:11:21 +0000
@@ -50,3 +50,4 @@
50./_pythonpath.py50./_pythonpath.py
51./production-configs51./production-configs
52bzr.dev52bzr.dev
53_trial_temp
5354
=== modified file 'buildout-templates/bin/test.in'
--- buildout-templates/bin/test.in 2009-08-10 22:08:05 +0000
+++ buildout-templates/bin/test.in 2009-09-10 00:17:39 +0000
@@ -140,6 +140,7 @@
140 '--test-path=${buildout:directory}/lib',140 '--test-path=${buildout:directory}/lib',
141 '--package=canonical',141 '--package=canonical',
142 '--package=lp',142 '--package=lp',
143 '--package=devscripts',
143 '--layer=!MailmanLayer',144 '--layer=!MailmanLayer',
144 ]145 ]
145146
146147
=== added directory 'lib/devscripts'
=== added file 'lib/devscripts/__init__.py'
--- lib/devscripts/__init__.py 1970-01-01 00:00:00 +0000
+++ lib/devscripts/__init__.py 2009-09-09 03:27:47 +0000
@@ -0,0 +1,4 @@
1# Copyright 2009 Canonical Ltd. This software is licensed under the
2# GNU Affero General Public License version 3 (see the file LICENSE).
3
4"""Scripts that are used in developing Launchpad."""
05
=== added file 'lib/devscripts/sourcecode.py'
--- lib/devscripts/sourcecode.py 1970-01-01 00:00:00 +0000
+++ lib/devscripts/sourcecode.py 2009-09-11 05:01:06 +0000
@@ -0,0 +1,199 @@
1# Copyright 2009 Canonical Ltd. This software is licensed under the
2# GNU Affero General Public License version 3 (see the file LICENSE).
3
4"""Tools for maintaining the Launchpad source code."""
5
6__metaclass__ = type
7__all__ = [
8 'interpret_config',
9 'parse_config_file',
10 'plan_update',
11 ]
12
13import os
14import shutil
15import sys
16
17from bzrlib.branch import Branch
18from bzrlib.bzrdir import BzrDir
19from bzrlib.errors import BzrError
20from bzrlib.plugin import load_plugins
21from bzrlib.trace import report_exception
22from bzrlib.transport import get_transport
23from bzrlib.workingtree import WorkingTree
24
25
26def parse_config_file(file_handle):
27 """Parse the source code config file 'file_handle'.
28
29 :param file_handle: A file-like object containing sourcecode
30 configuration.
31 :return: A sequence of lines of either '[key, value]' or
32 '[key, value, optional]'.
33 """
34 for line in file_handle:
35 if line.startswith('#'):
36 continue
37 yield [token.strip() for token in line.split('=')]
38
39
40def interpret_config_entry(entry):
41 """Interpret a single parsed line from the config file."""
42 return (entry[0], (entry[1], len(entry) > 2))
43
44
45def interpret_config(config_entries):
46 """Interpret a configuration stream, as parsed by 'parse_config_file'.
47
48 :param configuration: A sequence of parsed configuration entries.
49 :return: A dict mapping the names of the sourcecode dependencies to a
50 2-tuple of their branches and whether or not they are optional.
51 """
52 return dict(map(interpret_config_entry, config_entries))
53
54
55def _subset_dict(d, keys):
56 """Return a dict that's a subset of 'd', based on the keys in 'keys'."""
57 return dict((key, d[key]) for key in keys)
58
59
60def plan_update(existing_branches, configuration):
61 """Plan the update to existing branches based on 'configuration'.
62
63 :param existing_branches: A sequence of branches that already exist.
64 :param configuration: A dictionary of sourcecode configuration, such as is
65 returned by `interpret_config`.
66 :return: (new_branches, update_branches, removed_branches), where
67 'new_branches' are the branches in the configuration that don't exist
68 yet, 'update_branches' are the branches in the configuration that do
69 exist, and 'removed_branches' are the branches that exist locally, but
70 not in the configuration. 'new_branches' and 'update_branches' are
71 dicts of the same form as 'configuration', 'removed_branches' is a
72 set of the same form as 'existing_branches'.
73 """
74 existing_branches = set(existing_branches)
75 config_branches = set(configuration.keys())
76 new_branches = config_branches - existing_branches
77 removed_branches = existing_branches - config_branches
78 update_branches = config_branches.intersection(existing_branches)
79 return (
80 _subset_dict(configuration, new_branches),
81 _subset_dict(configuration, update_branches),
82 removed_branches)
83
84
85def find_branches(directory):
86 """List the directory names in 'directory' that are branches."""
87 transport = get_transport(directory)
88 return (
89 os.path.basename(branch.base.rstrip('/'))
90 for branch in BzrDir.find_branches(transport))
91
92
93def get_branches(sourcecode_directory, new_branches,
94 possible_transports=None):
95 """Get the new branches into sourcecode."""
96 for project, (branch_url, optional) in new_branches.iteritems():
97 destination = os.path.join(sourcecode_directory, project)
98 remote_branch = Branch.open(
99 branch_url, possible_transports=possible_transports)
100 possible_transports.append(
101 remote_branch.bzrdir.root_transport)
102 print 'Getting %s from %s' % (project, branch_url)
103 # If the 'optional' flag is set, then it's a branch that shares
104 # history with Launchpad, so we should share repositories. Otherwise,
105 # we should avoid sharing repositories to avoid format
106 # incompatibilities.
107 force_new_repo = not optional
108 try:
109 remote_branch.bzrdir.sprout(
110 destination, create_tree_if_local=True,
111 source_branch=remote_branch, force_new_repo=force_new_repo,
112 possible_transports=possible_transports)
113 except BzrError:
114 if optional:
115 report_exception(sys.exc_info(), sys.stderr)
116 else:
117 raise
118
119
120def update_branches(sourcecode_directory, update_branches,
121 possible_transports=None):
122 """Update the existing branches in sourcecode."""
123 if possible_transports is None:
124 possible_transports = []
125 # XXX: JonathanLange 2009-11-09: Rather than updating one branch after
126 # another, we could instead try to get them in parallel.
127 for project, (branch_url, optional) in update_branches.iteritems():
128 destination = os.path.join(sourcecode_directory, project)
129 print 'Updating %s' % (project,)
130 local_tree = WorkingTree.open(destination)
131 remote_branch = Branch.open(
132 branch_url, possible_transports=possible_transports)
133 possible_transports.append(
134 remote_branch.bzrdir.root_transport)
135 try:
136 local_tree.pull(
137 remote_branch,
138 possible_transports=possible_transports)
139 except BzrError:
140 if optional:
141 report_exception(sys.exc_info(), sys.stderr)
142 else:
143 raise
144
145
146def remove_branches(sourcecode_directory, removed_branches):
147 """Remove sourcecode that's no longer there."""
148 for project in removed_branches:
149 destination = os.path.join(sourcecode_directory, project)
150 print 'Removing %s' % project
151 try:
152 shutil.rmtree(destination)
153 except OSError:
154 os.unlink(destination)
155
156
157def update_sourcecode(sourcecode_directory, config_filename):
158 """Update the sourcecode."""
159 config_file = open(config_filename)
160 config = interpret_config(parse_config_file(config_file))
161 config_file.close()
162 branches = find_branches(sourcecode_directory)
163 new, updated, removed = plan_update(branches, config)
164 possible_transports = []
165 get_branches(sourcecode_directory, new, possible_transports)
166 update_branches(sourcecode_directory, updated, possible_transports)
167 remove_branches(sourcecode_directory, removed)
168
169
170def get_launchpad_root():
171 return os.path.dirname(os.path.dirname(os.path.dirname(__file__)))
172
173
174# XXX: JonathanLange 2009-09-11: By default, the script will operate on the
175# current checkout. Most people only have symlinks to sourcecode in their
176# checkouts. This is fine for updating, but breaks for removing (you can't
177# shutil.rmtree a symlink) and breaks for adding, since it adds the new branch
178# to the checkout, rather than to the shared sourcecode area. Ideally, the
179# script would see that the sourcecode directory is full of symlinks and then
180# follow these symlinks to find the shared source directory. If the symlinks
181# differ from each other (because of developers fiddling with things), we can
182# take a survey of all of them, and choose the most popular.
183
184
185def main(args):
186 root = get_launchpad_root()
187 if len(args) > 1:
188 sourcecode_directory = args[1]
189 else:
190 sourcecode_directory = os.path.join(root, 'sourcecode')
191 if len(args) > 2:
192 config_filename = args[2]
193 else:
194 config_filename = os.path.join(root, 'utilities', 'sourcedeps.conf')
195 print 'Sourcecode: %s' % (sourcecode_directory,)
196 print 'Config: %s' % (config_filename,)
197 load_plugins()
198 update_sourcecode(sourcecode_directory, config_filename)
199 return 0
0200
=== added directory 'lib/devscripts/tests'
=== added file 'lib/devscripts/tests/__init__.py'
--- lib/devscripts/tests/__init__.py 1970-01-01 00:00:00 +0000
+++ lib/devscripts/tests/__init__.py 2009-09-09 03:27:47 +0000
@@ -0,0 +1,4 @@
1# Copyright 2009 Canonical Ltd. This software is licensed under the
2# GNU Affero General Public License version 3 (see the file LICENSE).
3
4"""Tests for devscripts."""
05
=== added file 'lib/devscripts/tests/test_sourcecode.py'
--- lib/devscripts/tests/test_sourcecode.py 1970-01-01 00:00:00 +0000
+++ lib/devscripts/tests/test_sourcecode.py 2009-09-11 04:57:50 +0000
@@ -0,0 +1,156 @@
1# Copyright 2009 Canonical Ltd. This software is licensed under the
2# GNU Affero General Public License version 3 (see the file LICENSE).
3
4"""Module docstring goes here."""
5
6__metaclass__ = type
7
8import shutil
9from StringIO import StringIO
10import tempfile
11import unittest
12
13from bzrlib.bzrdir import BzrDir
14from bzrlib.tests import TestCase
15from bzrlib.transport import get_transport
16
17from devscripts.sourcecode import (
18 find_branches, interpret_config, parse_config_file, plan_update)
19
20
21class TestParseConfigFile(unittest.TestCase):
22 """Tests for the config file parser."""
23
24 def makeFile(self, contents):
25 return StringIO(contents)
26
27 def test_empty(self):
28 # Parsing an empty config file returns an empty sequence.
29 empty_file = self.makeFile("")
30 self.assertEqual([], list(parse_config_file(empty_file)))
31
32 def test_single_value(self):
33 # Parsing a file containing a single key=value pair returns a sequence
34 # containing the (key, value) as a list.
35 config_file = self.makeFile("key=value")
36 self.assertEqual(
37 [['key', 'value']], list(parse_config_file(config_file)))
38
39 def test_comment_ignored(self):
40 # If a line begins with a '#', then its a comment.
41 comment_only = self.makeFile('# foo')
42 self.assertEqual([], list(parse_config_file(comment_only)))
43
44 def test_optional_value(self):
45 # Lines in the config file can have a third optional entry.
46 config_file = self.makeFile('key=value=optional')
47 self.assertEqual(
48 [['key', 'value', 'optional']],
49 list(parse_config_file(config_file)))
50
51 def test_whitespace_stripped(self):
52 # Any whitespace around any of the tokens in the config file are
53 # stripped out.
54 config_file = self.makeFile(' key = value = optional ')
55 self.assertEqual(
56 [['key', 'value', 'optional']],
57 list(parse_config_file(config_file)))
58
59
60class TestInterpretConfiguration(unittest.TestCase):
61 """Tests for the configuration interpreter."""
62
63 def test_empty(self):
64 # An empty configuration stream means no configuration.
65 config = interpret_config([])
66 self.assertEqual({}, config)
67
68 def test_key_value(self):
69 # A (key, value) pair without a third optional value is returned in
70 # the configuration as a dictionary entry under 'key' with '(value,
71 # False)' as its value.
72 config = interpret_config([['key', 'value']])
73 self.assertEqual({'key': ('value', False)}, config)
74
75 def test_key_value_optional(self):
76 # A (key, value, optional) entry is returned in the configuration as a
77 # dictionary entry under 'key' with '(value, True)' as its value.
78 config = interpret_config([['key', 'value', 'optional']])
79 self.assertEqual({'key': ('value', True)}, config)
80
81
82class TestPlanUpdate(unittest.TestCase):
83 """Tests for how to plan the update."""
84
85 def test_trivial(self):
86 # In the trivial case, there are no existing branches and no
87 # configured branches, so there are no branches to add, none to
88 # update, and none to remove.
89 new, existing, removed = plan_update([], {})
90 self.assertEqual({}, new)
91 self.assertEqual({}, existing)
92 self.assertEqual(set(), removed)
93
94 def test_all_new(self):
95 # If there are no existing branches, then the all of the configured
96 # branches are new, none are existing and none have been removed.
97 new, existing, removed = plan_update([], {'a': ('b', False)})
98 self.assertEqual({'a': ('b', False)}, new)
99 self.assertEqual({}, existing)
100 self.assertEqual(set(), removed)
101
102 def test_all_old(self):
103 # If there configuration is now empty, but there are existing
104 # branches, then that means all the branches have been removed from
105 # the configuration, none are new and none are updated.
106 new, existing, removed = plan_update(['a', 'b', 'c'], {})
107 self.assertEqual({}, new)
108 self.assertEqual({}, existing)
109 self.assertEqual(set(['a', 'b', 'c']), removed)
110
111 def test_all_same(self):
112 # If the set of existing branches is the same as the set of
113 # non-existing branches, then they all need to be updated.
114 config = {'a': ('b', False), 'c': ('d', True)}
115 new, existing, removed = plan_update(config.keys(), config)
116 self.assertEqual({}, new)
117 self.assertEqual(config, existing)
118 self.assertEqual(set(), removed)
119
120
121class TestFindBranches(TestCase):
122 """Tests the way that we find branches."""
123
124 def makeBranch(self, path):
125 transport = get_transport(path)
126 transport.ensure_base()
127 BzrDir.create_branch_convenience(
128 transport.base, possible_transports=[transport])
129
130 def makeDirectory(self):
131 directory = tempfile.mkdtemp()
132 self.addCleanup(shutil.rmtree, directory)
133 return directory
134
135 def test_empty_directory_has_no_branches(self):
136 # An empty directory has no branches.
137 empty = self.makeDirectory()
138 self.assertEqual([], list(find_branches(empty)))
139
140 def test_directory_with_branches(self):
141 # find_branches finds branches in the directory.
142 directory = self.makeDirectory()
143 self.makeBranch('%s/a' % directory)
144 self.assertEqual(['a'], list(find_branches(directory)))
145
146 def test_ignores_files(self):
147 # find_branches ignores any files in the directory.
148 directory = self.makeDirectory()
149 some_file = open('%s/a' % directory, 'w')
150 some_file.write('hello\n')
151 some_file.close()
152 self.assertEqual([], list(find_branches(directory)))
153
154
155def test_suite():
156 return unittest.TestLoader().loadTestsFromName(__name__)
0157
=== modified file 'utilities/rocketfuel-get'
--- utilities/rocketfuel-get 2009-07-24 15:37:03 +0000
+++ utilities/rocketfuel-get 2009-09-11 05:01:06 +0000
@@ -64,45 +64,9 @@
64echo " See https://bugs.edge.launchpad.net/bzr/+bug/401617 for why."64echo " See https://bugs.edge.launchpad.net/bzr/+bug/401617 for why."
65echo ""65echo ""
66echo " Sourcedeps: $LP_SOURCEDEPS_PATH"66echo " Sourcedeps: $LP_SOURCEDEPS_PATH"
67while IFS="=" read package branch optional67
68do68run-child $LP_TRUNK_PATH/utilities/update-sourcecode $LP_SOURCEDEPS_PATH $sourcedeps_conf
69 package_path="$LP_SOURCEDEPS_PATH/$package"69
70 echo " Checking $package_path"
71 if [ "$optional" ]
72 then
73 # The 'set +e' below is so that we don't exit when the 'bzr log' call
74 # returns an error.
75 set +e
76 bzr log -l 1 -q "$branch" &> /dev/null
77 retval=$?
78 set -e
79 if [ $retval -ne 0 ]
80 then
81 # We can't reach this branch, but it's optional anyway, so we'll
82 # just move on.
83 echo "Couldn't find $branch, skipping it."
84 continue
85 fi
86 if [ -d "$package_path" ]
87 then
88 run-child bzr pull --remember "$branch" --directory "$package_path"
89 else
90 # This is either shipit or the openid-provider, and they share the
91 # history with our main tree, so we'll do this hack to stack them
92 # onto LP's trunk branch.
93 run-child bzr branch --no-tree --stacked "$LP_TRUNK_PATH" "$package_path"
94 run-child bzr pull --overwrite "$branch" -d "$package_path"
95 run-child bzr checkout "$package_path" "$package_path"
96 fi
97 else
98 if [ -d "$package_path" ]
99 then
100 run-child bzr pull --remember "$branch" --directory "$package_path"
101 else
102 run-child bzr branch --standalone "$branch" "$package_path"
103 fi
104 fi
105done < "$sourcedeps_conf"
10670
107# Update the current trees in the repo.71# Update the current trees in the repo.
108echo "Updating sourcecode dependencies in current local branches:"72echo "Updating sourcecode dependencies in current local branches:"
10973
=== added file 'utilities/update-sourcecode'
--- utilities/update-sourcecode 1970-01-01 00:00:00 +0000
+++ utilities/update-sourcecode 2009-09-10 04:31:12 +0000
@@ -0,0 +1,16 @@
1#!/usr/bin/python2.4
2#
3# Copyright 2009 Canonical Ltd. This software is licensed under the
4# GNU Affero General Public License version 3 (see the file LICENSE).
5
6"""Update the sourcecode managed in sourcecode/."""
7
8import _pythonpath
9
10import sys
11
12from devscripts import sourcecode
13
14
15if __name__ == '__main__':
16 sys.exit(sourcecode.main(sys.argv))