Merge lp:~spiv/bzr/ids-only-necessary-texts into lp:~bzr/bzr/trunk-old

Proposed by Andrew Bennetts
Status: Merged
Merged at revision: not available
Proposed branch: lp:~spiv/bzr/ids-only-necessary-texts
Merge into: lp:~bzr/bzr/trunk-old
Diff against target: 341 lines
To merge this branch: bzr merge lp:~spiv/bzr/ids-only-necessary-texts
Reviewer Review Type Date Requested Status
Robert Collins (community) Approve
Review via email: mp+10444@code.launchpad.net
Andrew Bennetts (spiv) wrote:

This fixes bug 399140. As explained in the bug discussion, the issue is that InterDifferingSerializer doesn't behave correctly with stacked sources. It should copy any parent inventories that are present for the revisions it is copying, and it should copy only the texts those revisions actually need. Because it did neither, it was generating an inventory delta against the null revision for the first revision of a stacked repository during upgrade, and then expecting to find every text named in that delta.
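
(Illustrative sketch only, not bzrlib code: a minimal outline of the intended behaviour using plain dicts and sets in place of the real repository API; all names here are made up.)

def parent_inventories_to_copy(revision_ids, parent_map,
                               present_revisions, present_inventories):
    """Return parent ids that are absent as revisions but whose
    inventories are present in the source, so a stacked fetch can copy
    those inventories across the stacking boundary."""
    parents = set()
    for rev_id in revision_ids:
        parents.update(parent_map.get(rev_id, ()))
    absent = parents - set(present_revisions)
    return sorted(p for p in absent if p in present_inventories)

# For example, fetching 'merge' whose parents 'left' and 'right' exist
# only as inventories at the stacking boundary:
#   parent_inventories_to_copy(['merge'], {'merge': ('left', 'right')},
#                              set(), {'left', 'right'})
#   => ['left', 'right']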

The test is largely a copy-and-paste variant of the other stacking tests in per_interrepository/test_fetch.py, so it's both long and somewhat repetitive compared to the existing tests. We should refactor those tests at some point... Anyway, the test does fail with bzr.dev's copy of repository.py, and passes with the fix, of course :)

I expect this will have some performance impact on InterDifferingSerializer, but I don't expect it to be large. Correctness beats performance, anyway.

Robert Collins (lifeless) wrote:

218 + if not self.repository_format.supports_external_lookups:
219 + raise TestNotApplicable("Need stacking support in the source.")
220 + if not self.repository_format_to.supports_external_lookups:
221 + raise TestNotApplicable("Need stacking support in the target.")

It would be good to ensure that at least one compatible pair is actually being tested. Otherwise we'll get TestNotApplicable everywhere but never run the test.
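
(Illustrative sketch only, not the existing test infrastructure: one way to fail loudly if every parameterised (source, target) pair would raise TestNotApplicable, so the scenario list always exercises the test at least once. The helper name and arguments are invented.)

def assert_some_scenario_applies(format_pairs, supports_stacking):
    """format_pairs: iterable of (source_format, target_format) pairs."""
    if not any(supports_stacking(src) and supports_stacking(tgt)
               for src, tgt in format_pairs):
        raise AssertionError(
            "every scenario raises TestNotApplicable; the test never runs")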

Also, 'get the parents of the edge of a graph subset' is a repeated idiom now. I think it's time to put that into graph.
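
(A minimal sketch of what such a graph helper might look like; the name and signature are invented here, not an existing bzrlib.graph API.)

def find_subset_edge_parents(parent_map, subset):
    """Return the parents of `subset` that lie outside `subset` itself.

    parent_map maps each key in the subset to a tuple of parent keys.
    """
    edge_parents = set()
    for key in subset:
        edge_parents.update(parent_map.get(key, ()))
    return edge_parents - set(subset)

# e.g. find_subset_edge_parents({'merge': ('left', 'right')}, set(['merge']))
# => set(['left', 'right'])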

review: Approve

Preview Diff

1=== modified file 'NEWS'
2--- NEWS 2009-08-21 03:17:13 +0000
3+++ NEWS 2009-08-24 03:35:14 +0000
4@@ -37,6 +37,10 @@
5 * Fix IndexError printing CannotBindAddress errors.
6 (Martin Pool, #286871)
7
8+* Fix "Revision ... not present" errors when upgrading stacked branches,
9+ or when doing fetches from a stacked source to a stacked target.
10+ (Andrew Bennetts, #399140)
11+
12 Improvements
13 ************
14
15
16=== modified file 'bzrlib/bzrdir.py'
17--- bzrlib/bzrdir.py 2009-08-18 05:18:52 +0000
18+++ bzrlib/bzrdir.py 2009-08-24 03:35:14 +0000
19@@ -129,8 +129,16 @@
20 return True
21
22 def check_conversion_target(self, target_format):
23+ """Check that a bzrdir as a whole can be converted to a new format."""
24+ # The only current restriction is that the repository content can be
25+ # fetched compatibly with the target.
26 target_repo_format = target_format.repository_format
27- self.open_repository()._format.check_conversion_target(target_repo_format)
28+ try:
29+ self.open_repository()._format.check_conversion_target(
30+ target_repo_format)
31+ except errors.NoRepositoryPresent:
32+ # No repo, no problem.
33+ pass
34
35 @staticmethod
36 def _check_supported(format, allow_unsupported,
37
38=== modified file 'bzrlib/repository.py'
39--- bzrlib/repository.py 2009-08-19 18:04:49 +0000
40+++ bzrlib/repository.py 2009-08-24 03:35:14 +0000
41@@ -3732,25 +3732,80 @@
42 return False
43 return True
44
45- def _get_delta_for_revision(self, tree, parent_ids, basis_id, cache):
46+ def _get_trees(self, revision_ids, cache):
47+ possible_trees = []
48+ for rev_id in revision_ids:
49+ if rev_id in cache:
50+ possible_trees.append((rev_id, cache[rev_id]))
51+ else:
52+ # Not cached, but inventory might be present anyway.
53+ try:
54+ tree = self.source.revision_tree(rev_id)
55+ except errors.NoSuchRevision:
56+ # Nope, parent is ghost.
57+ pass
58+ else:
59+ cache[rev_id] = tree
60+ possible_trees.append((rev_id, tree))
61+ return possible_trees
62+
63+ def _get_delta_for_revision(self, tree, parent_ids, possible_trees):
64 """Get the best delta and base for this revision.
65
66 :return: (basis_id, delta)
67 """
68- possible_trees = [(parent_id, cache[parent_id])
69- for parent_id in parent_ids
70- if parent_id in cache]
71- if len(possible_trees) == 0:
72- # There either aren't any parents, or the parents aren't in the
73- # cache, so just use the last converted tree
74- possible_trees.append((basis_id, cache[basis_id]))
75 deltas = []
76+ # Generate deltas against each tree, to find the shortest.
77+ texts_possibly_new_in_tree = set()
78 for basis_id, basis_tree in possible_trees:
79 delta = tree.inventory._make_delta(basis_tree.inventory)
80+ for old_path, new_path, file_id, new_entry in delta:
81+ if new_path is None:
82+ # This file_id isn't present in the new rev, so we don't
83+ # care about it.
84+ continue
85+ if not new_path:
86+ # Rich roots are handled elsewhere...
87+ continue
88+ kind = new_entry.kind
89+ if kind != 'directory' and kind != 'file':
90+ # No text record associated with this inventory entry.
91+ continue
92+ # This is a directory or file that has changed somehow.
93+ texts_possibly_new_in_tree.add((file_id, new_entry.revision))
94 deltas.append((len(delta), basis_id, delta))
95 deltas.sort()
96 return deltas[0][1:]
97
98+ def _fetch_parent_invs_for_stacking(self, parent_map, cache):
99+ """Find all parent revisions that are absent, but for which the
100+ inventory is present, and copy those inventories.
101+
102+ This is necessary to preserve correctness when the source is stacked
103+ without fallbacks configured. (Note that in cases like upgrade the
104+ source may not have _fallback_repositories even though it is
105+ stacked.)
106+ """
107+ parent_revs = set()
108+ for parents in parent_map.values():
109+ parent_revs.update(parents)
110+ present_parents = self.source.get_parent_map(parent_revs)
111+ absent_parents = set(parent_revs).difference(present_parents)
112+ parent_invs_keys_for_stacking = self.source.inventories.get_parent_map(
113+ (rev_id,) for rev_id in absent_parents)
114+ parent_inv_ids = [key[-1] for key in parent_invs_keys_for_stacking]
115+ for parent_tree in self.source.revision_trees(parent_inv_ids):
116+ current_revision_id = parent_tree.get_revision_id()
117+ parents_parents_keys = parent_invs_keys_for_stacking[
118+ (current_revision_id,)]
119+ parents_parents = [key[-1] for key in parents_parents_keys]
120+ basis_id = _mod_revision.NULL_REVISION
121+ basis_tree = self.source.revision_tree(basis_id)
122+ delta = parent_tree.inventory._make_delta(basis_tree.inventory)
123+ self.target.add_inventory_by_delta(
124+ basis_id, delta, current_revision_id, parents_parents)
125+ cache[current_revision_id] = parent_tree
126+
127 def _fetch_batch(self, revision_ids, basis_id, cache):
128 """Fetch across a few revisions.
129
130@@ -3769,29 +3824,54 @@
131 pending_deltas = []
132 pending_revisions = []
133 parent_map = self.source.get_parent_map(revision_ids)
134+ self._fetch_parent_invs_for_stacking(parent_map, cache)
135 for tree in self.source.revision_trees(revision_ids):
136+ # Find an inventory delta for this revision.
137+ # Find text entries that need to be copied, too.
138 current_revision_id = tree.get_revision_id()
139 parent_ids = parent_map.get(current_revision_id, ())
140+ parent_trees = self._get_trees(parent_ids, cache)
141+ possible_trees = list(parent_trees)
142+ if len(possible_trees) == 0:
143+ # There either aren't any parents, or the parents are ghosts,
144+ # so just use the last converted tree.
145+ possible_trees.append((basis_id, cache[basis_id]))
146 basis_id, delta = self._get_delta_for_revision(tree, parent_ids,
147- basis_id, cache)
148+ possible_trees)
149 if self._converting_to_rich_root:
150 self._revision_id_to_root_id[current_revision_id] = \
151 tree.get_root_id()
152- # Find text entries that need to be copied
153+ # Determine which texts are present in this revision but not in
154+ # any of the available parents.
155+ texts_possibly_new_in_tree = set()
156 for old_path, new_path, file_id, entry in delta:
157- if new_path is not None:
158- if not new_path:
159- # This is the root
160- if not self.target.supports_rich_root():
161- # The target doesn't support rich root, so we don't
162- # copy
163- continue
164- if self._converting_to_rich_root:
165- # This can't be copied normally, we have to insert
166- # it specially
167- root_keys_to_create.add((file_id, entry.revision))
168- continue
169- text_keys.add((file_id, entry.revision))
170+ if new_path is None:
171+ # This file_id isn't present in the new rev
172+ continue
173+ if not new_path:
174+ # This is the root
175+ if not self.target.supports_rich_root():
176+ # The target doesn't support rich root, so we don't
177+ # copy
178+ continue
179+ if self._converting_to_rich_root:
180+ # This can't be copied normally, we have to insert
181+ # it specially
182+ root_keys_to_create.add((file_id, entry.revision))
183+ continue
184+ kind = entry.kind
185+ texts_possibly_new_in_tree.add((file_id, entry.revision))
186+ for basis_id, basis_tree in possible_trees:
187+ basis_inv = basis_tree.inventory
188+ for file_key in list(texts_possibly_new_in_tree):
189+ file_id, file_revision = file_key
190+ try:
191+ entry = basis_inv[file_id]
192+ except errors.NoSuchId:
193+ continue
194+ if entry.revision == file_revision:
195+ texts_possibly_new_in_tree.remove(file_key)
196+ text_keys.update(texts_possibly_new_in_tree)
197 revision = self.source.get_revision(current_revision_id)
198 pending_deltas.append((basis_id, delta,
199 current_revision_id, revision.parent_ids))
200@@ -3833,8 +3913,13 @@
201 for parent_tree in self.source.revision_trees(parent_map):
202 current_revision_id = parent_tree.get_revision_id()
203 parents_parents = parent_map[current_revision_id]
204+ possible_trees = self._get_trees(parents_parents, cache)
205+ if len(possible_trees) == 0:
206+ # There either aren't any parents, or the parents are
207+ # ghosts, so just use the last converted tree.
208+ possible_trees.append((basis_id, cache[basis_id]))
209 basis_id, delta = self._get_delta_for_revision(parent_tree,
210- parents_parents, basis_id, cache)
211+ parents_parents, possible_trees)
212 self.target.add_inventory_by_delta(
213 basis_id, delta, current_revision_id, parents_parents)
214 # insert signatures and revisions
215
216=== modified file 'bzrlib/tests/blackbox/test_upgrade.py'
217--- bzrlib/tests/blackbox/test_upgrade.py 2009-08-14 08:57:57 +0000
218+++ bzrlib/tests/blackbox/test_upgrade.py 2009-08-24 03:35:14 +0000
219@@ -184,7 +184,7 @@
220 self.assertEqual('', err)
221
222
223-class UpgradeRecommendedTests(TestCaseInTempDir):
224+class UpgradeRecommendedTests(TestCaseWithTransport):
225
226 def test_recommend_upgrade_wt4(self):
227 # using a deprecated format gives a warning
228@@ -200,3 +200,9 @@
229 out, err = self.run_bzr('revno a')
230 if err.find('upgrade') > -1:
231 self.fail("message shouldn't suggest upgrade:\n%s" % err)
232+
233+ def test_upgrade_shared_repo(self):
234+ repo = self.make_repository('repo', format='2a', shared=True)
235+ branch = self.make_branch_and_tree('repo/branch', format="pack-0.92")
236+ self.get_transport('repo/branch/.bzr/repository').delete_tree('.')
237+ out, err = self.run_bzr(['upgrade'], working_dir='repo/branch')
238
239=== modified file 'bzrlib/tests/per_interrepository/test_fetch.py'
240--- bzrlib/tests/per_interrepository/test_fetch.py 2009-08-14 00:55:42 +0000
241+++ bzrlib/tests/per_interrepository/test_fetch.py 2009-08-24 03:35:14 +0000
242@@ -17,7 +17,6 @@
243
244 import sys
245
246-import bzrlib
247 from bzrlib import (
248 errors,
249 inventory,
250@@ -266,6 +265,91 @@
251 self.assertEqual(expected_texts, unstacked_repo.texts.keys())
252 self.assertCanStreamRevision(unstacked_repo, 'third')
253
254+ def test_fetch_from_stacked_to_stacked_copies_parent_inventories(self):
255+ """Fetch from a stacked branch copies inventories for parents of
256+ revisions at the stacking boundary.
257+
258+ Specifically, fetch will copy the parent inventories from the
259+ source for which the corresponding revisions are not present. This
260+ will happen even when the source repository has no fallbacks configured
261+ (as is the case during upgrade).
262+ """
263+ if not self.repository_format.supports_external_lookups:
264+ raise TestNotApplicable("Need stacking support in the source.")
265+ if not self.repository_format_to.supports_external_lookups:
266+ raise TestNotApplicable("Need stacking support in the target.")
267+ builder = self.make_branch_builder('branch')
268+ builder.start_series()
269+ builder.build_snapshot('base', None, [
270+ ('add', ('', 'root-id', 'directory', '')),
271+ ('add', ('file', 'file-id', 'file', 'content\n'))])
272+ builder.build_snapshot('left', ['base'], [
273+ ('modify', ('file-id', 'left content\n'))])
274+ builder.build_snapshot('right', ['base'], [
275+ ('modify', ('file-id', 'right content\n'))])
276+ builder.build_snapshot('merge', ['left', 'right'], [
277+ ('modify', ('file-id', 'left and right content\n'))])
278+ builder.finish_series()
279+ branch = builder.get_branch()
280+ repo = self.make_repository('old-trunk')
281+ # Make a pair of equivalent trunk repos in the from and to formats.
282+ old_trunk = repo.bzrdir.create_branch()
283+ old_trunk.repository.fetch(branch.repository, 'left')
284+ old_trunk.repository.fetch(branch.repository, 'right')
285+ repo = self.make_to_repository('new-trunk')
286+ new_trunk = repo.bzrdir.create_branch()
287+ new_trunk.repository.fetch(branch.repository, 'left')
288+ new_trunk.repository.fetch(branch.repository, 'right')
289+ # Make the source; a repo stacked on old_trunk containing just the data
290+ # for 'merge'.
291+ repo = self.make_repository('old-stacked')
292+ old_stacked_branch = repo.bzrdir.create_branch()
293+ old_stacked_branch.set_stacked_on_url(old_trunk.base)
294+ old_stacked_branch.repository.fetch(branch.repository, 'merge')
295+ # Make the target, a repo stacked on new_trunk.
296+ repo = self.make_to_repository('new-stacked')
297+ new_stacked_branch = repo.bzrdir.create_branch()
298+ new_stacked_branch.set_stacked_on_url(new_trunk.base)
299+ old_unstacked_repo = old_stacked_branch.bzrdir.open_repository()
300+ new_unstacked_repo = new_stacked_branch.bzrdir.open_repository()
301+ # Reopen the source and target repos without any fallbacks, and fetch
302+ # 'merge'.
303+ new_unstacked_repo.fetch(old_unstacked_repo, 'merge')
304+ # Now check the results. new_unstacked_repo should contain all the
305+ # data necessary to stream 'merge' (i.e. the parent inventories).
306+ new_unstacked_repo.lock_read()
307+ self.addCleanup(new_unstacked_repo.unlock)
308+ self.assertFalse(new_unstacked_repo.has_revision('left'))
309+ self.assertFalse(new_unstacked_repo.has_revision('right'))
310+ self.assertEqual(
311+ set([('left',), ('right',), ('merge',)]),
312+ new_unstacked_repo.inventories.keys())
313+ # And the basis inventories have been copied correctly
314+ new_trunk.lock_read()
315+ self.addCleanup(new_trunk.unlock)
316+ left_tree, right_tree = new_trunk.repository.revision_trees(
317+ ['left', 'right'])
318+ new_stacked_branch.lock_read()
319+ self.addCleanup(new_stacked_branch.unlock)
320+ (stacked_left_tree,
321+ stacked_right_tree) = new_stacked_branch.repository.revision_trees(
322+ ['left', 'right'])
323+ self.assertEqual(left_tree.inventory, stacked_left_tree.inventory)
324+ self.assertEqual(right_tree.inventory, stacked_right_tree.inventory)
325+ # Finally, it's not enough to see that the basis inventories are
326+ # present. The texts introduced in merge (and only those) should be
327+ # present, and also generating a stream should succeed without blowing
328+ # up.
329+ self.assertTrue(new_unstacked_repo.has_revision('merge'))
330+ expected_texts = set([('file-id', 'merge')])
331+ if new_stacked_branch.repository.texts.get_parent_map([('root-id',
332+ 'merge')]):
333+ # If a (root-id,merge) text exists, it should be in the stacked
334+ # repo.
335+ expected_texts.add(('root-id', 'merge'))
336+ self.assertEqual(expected_texts, new_unstacked_repo.texts.keys())
337+ self.assertCanStreamRevision(new_unstacked_repo, 'merge')
338+
339 def test_fetch_missing_basis_text(self):
340 """If fetching a delta, we should die if a basis is not present."""
341 tree = self.make_branch_and_tree('tree')