Merge lp:~jameinel/bzr/1.15-gc-stacking into lp:~bzr/bzr/trunk-old

Proposed by John A Meinel
Status: Merged
Merged at revision: not available
Proposed branch: lp:~jameinel/bzr/1.15-gc-stacking
Merge into: lp:~bzr/bzr/trunk-old
Diff against target: 2014 lines
To merge this branch: bzr merge lp:~jameinel/bzr/1.15-gc-stacking
Reviewer: Andrew Bennetts, Status: Approve
Review via email: mp+6880@code.launchpad.net

This proposal supersedes a proposal from 2009-05-28.

John A Meinel (jameinel) wrote: Posted in a previous version of this proposal

This change enables --development6-rich-root to stack. It ends up including the Repository fallback locking fixes, and a few other code cleanups that we encountered along the way.

It unfortunately adds a somewhat more direct coupling between PackRepository and PackRepository.revisions._index._key_dependencies.

We already had an explicit connection, because get_missing_parent_inventories() was directly accessing that variable. What we added is a reset() of that cache whenever a write group is committed, aborted, or suspended. We felt that this was the 'right thing', and it was also required to fix a test about ghosts.

(We had a test that ghosts aren't filled in, but without resetting the key dependencies, an earlier commit that introduced the ghost would still record that the ghost is missing. Existing Pack fetching would suffer from this as well if it used the Stream code for fetching rather than Pack => Pack.)
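
For reference, the reset amounts to clearing the tracked references at
each write-group state transition. A simplified excerpt of what the
diff below adds to the pack repository (context trimmed, comments added
here for explanation):

    def _abort_write_group(self):
        # Discard tracked key references so that stale 'missing ghost'
        # entries cannot leak into later operations.
        self.revisions._index._key_dependencies.refs.clear()
        self._pack_collection._abort_write_group()

    def _commit_write_group(self):
        self.revisions._index._key_dependencies.refs.clear()
        return self._pack_collection._commit_write_group()

    def suspend_write_group(self):
        tokens = self._pack_collection._suspend_write_group()
        self.revisions._index._key_dependencies.refs.clear()
        self._write_group = None
        return tokens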

This is potentially up for backporting to a 1.15.1 release.

Andrew Bennetts (spiv) wrote: Posted in a previous version of this proposal

It's a shame that you add both "_find_present_inventory_ids" and "_find_present_inventories" to groupcompress_repo.py, but it's not trivial to factor out that duplication. Similarly, around line 950 of that file you have a duplication of the logic of find_parent_ids_of_revisions, but again, reusing that code isn't trivial. Something to clean up in the future, I guess...

In test_sprout_from_stacked_with_short_history in bzrlib/tests/per_repository_reference/test_fetch.py you start with a comment saying "Now copy this ...", which is a bit weird as the first thing in a test. Probably this comment hasn't been updated after you refactored the test? Anyway, please update it.

for record in stream:
    records.append(record.key)
    if record.key == ('a-id', 'A-id'):
        self.assertEqual(''.join(content[:-2]),
                         record.get_bytes_as('fulltext'))
    elif record.key == ('a-id', 'B-id'):
        self.assertEqual(''.join(content[:-1]),
                         record.get_bytes_as('fulltext'))
    elif record.key == ('a-id', 'C-id'):
        self.assertEqual(''.join(content),
                         record.get_bytes_as('fulltext'))
    else:
        self.fail('Unexpected record: %s' % (record.key,))

This is ok, but I think I'd rather:

for record in stream:
    records.append((record.key, record.get_bytes_as('fulltext')))
records.sort()
self.assertEqual(
    [(('a-id', 'A-id'), ''.join(content[:-2])),
     (('a-id', 'B-id'), ''.join(content[:-1])),
     (('a-id', 'C-id'), ''.join(content))],
    records)

This is more compact, avoids any need for conditionals in the test, and will probably give more informative failures.

bzrlib/tests/per_repository_reference/test_initialize.py adds a test with no assert* calls. Is that intentional?

In bzrlib/tests/test_pack_repository.py, test_resume_chk_bytes has a line of unreachable code after a raise statement.

In bzrlib/tests/test_repository.py, is the typo in 'abcdefghijklmnopqrstuvwxzy123456789' meant to be a test to see how attentive your reviewer is? ;)

Other than those, this seems fine to me though.

review: Needs Fixing
John A Meinel (jameinel) wrote: Posted in a previous version of this proposal


Andrew Bennetts wrote:
> Review: Needs Fixing
> It's a shame that you add both "_find_present_inventory_ids" and "_find_present_inventories" to groupcompress_repo.py, but it's not trivial to factor out that duplication. Similarly, around line 950 of that file you have a duplication of the logic of find_parent_ids_of_revisions, but again reusing that code isn't trivial. Something to cleanup in the future I guess...
>

_find_present_inventory_ids and _find_present_inventories are actually
interchangeable; the only difference is calling
self.from_repository._find_present_inventory_ids
rather than
self._find_present_inventories.

I'm glad you caught the duplication.

And for "_find_parent_ids_of_revisions()" it also is available as
self.from_repository....

Mostly because this is GroupCHKStreamSource, which can assume that its
.from_repository is a RepositoryCHK1.

Ultimately, we should probably move those functions onto Repository,
and potentially make them public. I don't really like widening the
Repository API, but since the default implementation works just fine
for all other implementations, it doesn't really impose a burden on
something like SVNRepository.
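
For reference, here is the helper as it lands in groupcompress_repo.py
(adapted from the diff below, with the TODO comment replaced by a
description); hoisting it onto Repository would be these same few lines:

    def _find_parent_ids_of_revisions(self, revision_ids):
        # Collect every parent of the given revisions, then drop the
        # revisions themselves and the NULL_REVISION sentinel, leaving
        # only the parents at the edge of the set.
        parent_map = self.get_parent_map(revision_ids)
        parents = set()
        map(parents.update, parent_map.itervalues())
        parents.difference_update(revision_ids)
        parents.discard(_mod_revision.NULL_REVISION)
        return parents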

> In test_sprout_from_stacked_with_short_history in bzrlib/tests/per_repository_reference/test_fetch.py you start with a comment saying "Now copy this ...", which is a bit weird as the first thing in a test. Probably this comment hasn't been updated after you refactored the test? Anyway, please update it.

Done.

...

> for record in stream:
>     records.append((record.key, record.get_bytes_as('fulltext')))
> records.sort()
> self.assertEqual(
>     [(('a-id', 'A-id'), ''.join(content[:-2])),
>      (('a-id', 'B-id'), ''.join(content[:-1])),
>      (('a-id', 'C-id'), ''.join(content))],
>     records)
>
> Which is more compact and doesn't have any need for conditionals in the test, and will probably give more informative failures.

Done.

>
> bzrlib/tests/per_repository_reference/test_initialize.py adds a test with no assert* calls. Is that intentional?
>

It exercises the code path that was previously broken (as in, the call
would raise an exception).
I can add arbitrary assertions, but the point of the test was to have a
simple call to "initialize_on_transport_ex()" across all repository
formats, remote requests, etc.

I'll add some basic bits, just to make it look like a real test. I'll
even add one that tests that we can initialize all formats over the
smart server.
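
Something along these lines, perhaps. A rough sketch only: the exact
transport setup and assertions in what I land may differ, and the
keyword arguments are from memory of the initialize_on_transport_ex()
API rather than checked against it:

    def test_initialize_on_transport_ex(self):
        # The call itself is the point of the test; it used to raise
        # for some formats. The assertions just keep it from being a
        # bare smoke test.
        t = self.get_transport('repo')
        self.bzrdir_format.initialize_on_transport_ex(t,
            create_prefix=True,
            repo_format_name=self.repository_format.get_format_string())
        # Basic sanity check: we can open what we just created.
        # (Assumes 'from bzrlib import bzrdir' at module scope.)
        made = bzrdir.BzrDir.open_from_transport(t)
        made.open_repository()  # raises NoRepositoryPresent on failure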

> In bzrlib/tests/test_pack_repository.py, test_resume_chk_bytes has a line of unreachable code after a raise statement.
>
> In bzrlib/tests/test_repository.py, is the typo in 'abcdefghijklmnopqrstuvwxzy123456789' meant to be a test to see how attentive your reviewer is? ;)
>
> Other than those, this seems fine to me though.

Fixed.
John
=:->

Andrew Bennetts (spiv) wrote:

Looks good to me now.

review: Approve

Preview Diff

1=== modified file 'NEWS'
2--- NEWS 2009-05-28 18:56:55 +0000
3+++ NEWS 2009-05-29 10:35:20 +0000
4@@ -1,6 +1,6 @@
5-====================
6+####################
7 Bazaar Release Notes
8-====================
9+####################
10
11
12 .. contents:: List of Releases
13@@ -25,12 +25,22 @@
14
15 * ``bzr diff`` is now faster on large trees. (Ian Clatworthy)
16
17+* ``--development6-rich-root`` can now stack. (Modulo some smart-server
18+ bugs with stacking and non default formats.)
19+ (John Arbash Meinel, #373455)
20+
21+
22 Bug Fixes
23 *********
24
25 * Better message in ``bzr add`` output suggesting using ``bzr ignored`` to
26 see which files can also be added. (Jason Spashett, #76616)
27
28+* Clarify the rules for locking and fallback repositories. Fix bugs in how
29+ ``RemoteRepository`` was handling fallbacks along with the
30+ ``_real_repository``. (Andrew Bennetts, John Arbash Meinel, #375496)
31+
32+
33 Documentation
34 *************
35
36@@ -76,6 +86,15 @@
37 New Features
38 ************
39
40+* New command ``bzr dpush`` that can push changes to foreign
41+ branches (svn, git) without setting custom bzr-specific metadata.
42+ (Jelmer Vernooij)
43+
44+* The new development format ``--development6-rich-root`` now supports
45+ stacking. We chose not to use a new format marker, since old clients
46+ will just fail to open stacked branches, the same as if we used a new
47+ format flag. (John Arbash Meinel, #373455)
48+
49 * Plugins can now define their own annotation tie-breaker when two revisions
50 introduce the exact same line. See ``bzrlib.annotate._break_annotation_tie``
51 Be aware though that this is temporary, private (as indicated by the leading
52
53=== modified file 'bzrlib/branch.py'
54--- bzrlib/branch.py 2009-05-26 20:32:34 +0000
55+++ bzrlib/branch.py 2009-05-29 10:35:20 +0000
56@@ -101,13 +101,9 @@
57 def _open_hook(self):
58 """Called by init to allow simpler extension of the base class."""
59
60- def _activate_fallback_location(self, url, lock_style):
61+ def _activate_fallback_location(self, url):
62 """Activate the branch/repository from url as a fallback repository."""
63 repo = self._get_fallback_repository(url)
64- if lock_style == 'write':
65- repo.lock_write()
66- elif lock_style == 'read':
67- repo.lock_read()
68 self.repository.add_fallback_repository(repo)
69
70 def break_lock(self):
71@@ -656,7 +652,7 @@
72 self.repository.fetch(source_repository, revision_id,
73 find_ghosts=True)
74 else:
75- self._activate_fallback_location(url, 'write')
76+ self._activate_fallback_location(url)
77 # write this out after the repository is stacked to avoid setting a
78 # stacked config that doesn't work.
79 self._set_config_location('stacked_on_location', url)
80@@ -2370,7 +2366,7 @@
81 raise AssertionError(
82 "'transform_fallback_location' hook %s returned "
83 "None, not a URL." % hook_name)
84- self._activate_fallback_location(url, None)
85+ self._activate_fallback_location(url)
86
87 def __init__(self, *args, **kwargs):
88 self._ignore_fallbacks = kwargs.get('ignore_fallbacks', False)
89
90=== modified file 'bzrlib/groupcompress.py'
91--- bzrlib/groupcompress.py 2009-05-25 19:04:59 +0000
92+++ bzrlib/groupcompress.py 2009-05-29 10:35:20 +0000
93@@ -31,13 +31,13 @@
94 diff,
95 errors,
96 graph as _mod_graph,
97+ knit,
98 osutils,
99 pack,
100 patiencediff,
101 trace,
102 )
103 from bzrlib.graph import Graph
104-from bzrlib.knit import _DirectPackAccess
105 from bzrlib.btree_index import BTreeBuilder
106 from bzrlib.lru_cache import LRUSizeCache
107 from bzrlib.tsort import topo_sort
108@@ -911,7 +911,7 @@
109 writer.begin()
110 index = _GCGraphIndex(graph_index, lambda:True, parents=parents,
111 add_callback=graph_index.add_nodes)
112- access = _DirectPackAccess({})
113+ access = knit._DirectPackAccess({})
114 access.set_writer(writer, graph_index, (transport, 'newpack'))
115 result = GroupCompressVersionedFiles(index, access, delta)
116 result.stream = stream
117@@ -1547,7 +1547,7 @@
118 """Mapper from GroupCompressVersionedFiles needs into GraphIndex storage."""
119
120 def __init__(self, graph_index, is_locked, parents=True,
121- add_callback=None):
122+ add_callback=None, track_external_parent_refs=False):
123 """Construct a _GCGraphIndex on a graph_index.
124
125 :param graph_index: An implementation of bzrlib.index.GraphIndex.
126@@ -1558,12 +1558,19 @@
127 :param add_callback: If not None, allow additions to the index and call
128 this callback with a list of added GraphIndex nodes:
129 [(node, value, node_refs), ...]
130+ :param track_external_parent_refs: As keys are added, keep track of the
131+ keys they reference, so that we can query get_missing_parents(),
132+ etc.
133 """
134 self._add_callback = add_callback
135 self._graph_index = graph_index
136 self._parents = parents
137 self.has_graph = parents
138 self._is_locked = is_locked
139+ if track_external_parent_refs:
140+ self._key_dependencies = knit._KeyRefs()
141+ else:
142+ self._key_dependencies = None
143
144 def add_records(self, records, random_id=False):
145 """Add multiple records to the index.
146@@ -1614,6 +1621,11 @@
147 for key, (value, node_refs) in keys.iteritems():
148 result.append((key, value))
149 records = result
150+ key_dependencies = self._key_dependencies
151+ if key_dependencies is not None and self._parents:
152+ for key, value, refs in records:
153+ parents = refs[0]
154+ key_dependencies.add_references(key, parents)
155 self._add_callback(records)
156
157 def _check_read(self):
158@@ -1668,6 +1680,14 @@
159 result[node[1]] = None
160 return result
161
162+ def get_missing_parents(self):
163+ """Return the keys of missing parents."""
164+ # Copied from _KnitGraphIndex.get_missing_parents
165+ # We may have false positives, so filter those out.
166+ self._key_dependencies.add_keys(
167+ self.get_parent_map(self._key_dependencies.get_unsatisfied_refs()))
168+ return frozenset(self._key_dependencies.get_unsatisfied_refs())
169+
170 def get_build_details(self, keys):
171 """Get the various build details for keys.
172
173@@ -1719,6 +1739,23 @@
174 delta_end = int(bits[3])
175 return node[0], start, stop, basis_end, delta_end
176
177+ def scan_unvalidated_index(self, graph_index):
178+ """Inform this _GCGraphIndex that there is an unvalidated index.
179+
180+ This allows this _GCGraphIndex to keep track of any missing
181+ compression parents we may want to have filled in to make those
182+ indices valid.
183+
184+ :param graph_index: A GraphIndex
185+ """
186+ if self._key_dependencies is not None:
187+ # Add parent refs from graph_index (and discard parent refs that
188+ # the graph_index has).
189+ add_refs = self._key_dependencies.add_references
190+ for node in graph_index.iter_all_entries():
191+ add_refs(node[1], node[3][0])
192+
193+
194
195 from bzrlib._groupcompress_py import (
196 apply_delta,
197
198=== modified file 'bzrlib/inventory.py'
199--- bzrlib/inventory.py 2009-04-10 12:11:58 +0000
200+++ bzrlib/inventory.py 2009-05-29 10:35:20 +0000
201@@ -1547,11 +1547,9 @@
202 def _get_mutable_inventory(self):
203 """See CommonInventory._get_mutable_inventory."""
204 entries = self.iter_entries()
205- if self.root_id is not None:
206- entries.next()
207- inv = Inventory(self.root_id, self.revision_id)
208+ inv = Inventory(None, self.revision_id)
209 for path, inv_entry in entries:
210- inv.add(inv_entry)
211+ inv.add(inv_entry.copy())
212 return inv
213
214 def create_by_apply_delta(self, inventory_delta, new_revision_id,
215
216=== modified file 'bzrlib/knit.py'
217--- bzrlib/knit.py 2009-05-25 19:04:59 +0000
218+++ bzrlib/knit.py 2009-05-29 10:35:20 +0000
219@@ -2882,6 +2882,8 @@
220
221 def get_missing_parents(self):
222 """Return the keys of missing parents."""
223+ # If updating this, you should also update
224+ # groupcompress._GCGraphIndex.get_missing_parents
225 # We may have false positives, so filter those out.
226 self._key_dependencies.add_keys(
227 self.get_parent_map(self._key_dependencies.get_unsatisfied_refs()))
228
229=== modified file 'bzrlib/remote.py'
230--- bzrlib/remote.py 2009-05-10 23:45:33 +0000
231+++ bzrlib/remote.py 2009-05-29 10:35:21 +0000
232@@ -670,9 +670,10 @@
233 self._ensure_real()
234 return self._real_repository.suspend_write_group()
235
236- def get_missing_parent_inventories(self):
237+ def get_missing_parent_inventories(self, check_for_missing_texts=True):
238 self._ensure_real()
239- return self._real_repository.get_missing_parent_inventories()
240+ return self._real_repository.get_missing_parent_inventories(
241+ check_for_missing_texts=check_for_missing_texts)
242
243 def _ensure_real(self):
244 """Ensure that there is a _real_repository set.
245@@ -860,10 +861,10 @@
246 self._unstacked_provider.enable_cache(cache_misses=True)
247 if self._real_repository is not None:
248 self._real_repository.lock_read()
249+ for repo in self._fallback_repositories:
250+ repo.lock_read()
251 else:
252 self._lock_count += 1
253- for repo in self._fallback_repositories:
254- repo.lock_read()
255
256 def _remote_lock_write(self, token):
257 path = self.bzrdir._path_for_remote_call(self._client)
258@@ -901,13 +902,13 @@
259 self._lock_count = 1
260 cache_misses = self._real_repository is None
261 self._unstacked_provider.enable_cache(cache_misses=cache_misses)
262+ for repo in self._fallback_repositories:
263+ # Writes don't affect fallback repos
264+ repo.lock_read()
265 elif self._lock_mode == 'r':
266 raise errors.ReadOnlyError(self)
267 else:
268 self._lock_count += 1
269- for repo in self._fallback_repositories:
270- # Writes don't affect fallback repos
271- repo.lock_read()
272 return self._lock_token or None
273
274 def leave_lock_in_place(self):
275@@ -1015,6 +1016,10 @@
276 self._lock_token = None
277 if not self._leave_lock:
278 self._unlock(old_token)
279+ # Fallbacks are always 'lock_read()' so we don't pay attention to
280+ # self._leave_lock
281+ for repo in self._fallback_repositories:
282+ repo.unlock()
283
284 def break_lock(self):
285 # should hand off to the network
286@@ -1084,6 +1089,11 @@
287 # We need to accumulate additional repositories here, to pass them in
288 # on various RPC's.
289 #
290+ if self.is_locked():
291+ # We will call fallback.unlock() when we transition to the unlocked
292+ # state, so always add a lock here. If a caller passes us a locked
293+ # repository, they are responsible for unlocking it later.
294+ repository.lock_read()
295 self._fallback_repositories.append(repository)
296 # If self._real_repository was parameterised already (e.g. because a
297 # _real_branch had its get_stacked_on_url method called), then the
298@@ -1971,7 +1981,7 @@
299 except (errors.NotStacked, errors.UnstackableBranchFormat,
300 errors.UnstackableRepositoryFormat), e:
301 return
302- self._activate_fallback_location(fallback_url, None)
303+ self._activate_fallback_location(fallback_url)
304
305 def _get_config(self):
306 return RemoteBranchConfig(self)
307
308=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
309--- bzrlib/repofmt/groupcompress_repo.py 2009-05-26 13:12:59 +0000
310+++ bzrlib/repofmt/groupcompress_repo.py 2009-05-29 10:35:21 +0000
311@@ -51,6 +51,7 @@
312 PackRootCommitBuilder,
313 RepositoryPackCollection,
314 RepositoryFormatPack,
315+ ResumedPack,
316 Packer,
317 )
318
319@@ -163,7 +164,21 @@
320 have deltas based on a fallback repository.
321 (See <https://bugs.launchpad.net/bzr/+bug/288751>)
322 """
323- # Groupcompress packs don't have any external references
324+ # Groupcompress packs don't have any external references, arguably CHK
325+ # pages have external references, but we cannot 'cheaply' determine
326+ # them without actually walking all of the chk pages.
327+
328+
329+class ResumedGCPack(ResumedPack):
330+
331+ def _check_references(self):
332+ """Make sure our external compression parents are present."""
333+ # See GCPack._check_references for why this is empty
334+
335+ def _get_external_refs(self, index):
336+ # GC repositories don't have compression parents external to a given
337+ # pack file
338+ return set()
339
340
341 class GCCHKPacker(Packer):
342@@ -540,6 +555,7 @@
343 class GCRepositoryPackCollection(RepositoryPackCollection):
344
345 pack_factory = GCPack
346+ resumed_pack_factory = ResumedGCPack
347
348 def _already_packed(self):
349 """Is the collection already packed?"""
350@@ -609,7 +625,8 @@
351 self.revisions = GroupCompressVersionedFiles(
352 _GCGraphIndex(self._pack_collection.revision_index.combined_index,
353 add_callback=self._pack_collection.revision_index.add_callback,
354- parents=True, is_locked=self.is_locked),
355+ parents=True, is_locked=self.is_locked,
356+ track_external_parent_refs=True),
357 access=self._pack_collection.revision_index.data_access,
358 delta=False)
359 self.signatures = GroupCompressVersionedFiles(
360@@ -719,52 +736,21 @@
361 # make it raise to trap naughty direct users.
362 raise NotImplementedError(self._iter_inventory_xmls)
363
364- def _find_revision_outside_set(self, revision_ids):
365- revision_set = frozenset(revision_ids)
366- for revid in revision_ids:
367- parent_ids = self.get_parent_map([revid]).get(revid, ())
368- for parent in parent_ids:
369- if parent in revision_set:
370- # Parent is not outside the set
371- continue
372- if parent not in self.get_parent_map([parent]):
373- # Parent is a ghost
374- continue
375- return parent
376- return _mod_revision.NULL_REVISION
377+ def _find_parent_ids_of_revisions(self, revision_ids):
378+ # TODO: we probably want to make this a helper that other code can get
379+ # at
380+ parent_map = self.get_parent_map(revision_ids)
381+ parents = set()
382+ map(parents.update, parent_map.itervalues())
383+ parents.difference_update(revision_ids)
384+ parents.discard(_mod_revision.NULL_REVISION)
385+ return parents
386
387- def _find_file_keys_to_fetch(self, revision_ids, pb):
388- rich_root = self.supports_rich_root()
389- revision_outside_set = self._find_revision_outside_set(revision_ids)
390- if revision_outside_set == _mod_revision.NULL_REVISION:
391- uninteresting_root_keys = set()
392- else:
393- uninteresting_inv = self.get_inventory(revision_outside_set)
394- uninteresting_root_keys = set([uninteresting_inv.id_to_entry.key()])
395- interesting_root_keys = set()
396- for idx, inv in enumerate(self.iter_inventories(revision_ids)):
397- interesting_root_keys.add(inv.id_to_entry.key())
398- revision_ids = frozenset(revision_ids)
399- file_id_revisions = {}
400- bytes_to_info = inventory.CHKInventory._bytes_to_utf8name_key
401- for record, items in chk_map.iter_interesting_nodes(self.chk_bytes,
402- interesting_root_keys, uninteresting_root_keys,
403- pb=pb):
404- # This is cheating a bit to use the last grabbed 'inv', but it
405- # works
406- for name, bytes in items:
407- (name_utf8, file_id, revision_id) = bytes_to_info(bytes)
408- if not rich_root and name_utf8 == '':
409- continue
410- if revision_id in revision_ids:
411- # Would we rather build this up into file_id => revision
412- # maps?
413- try:
414- file_id_revisions[file_id].add(revision_id)
415- except KeyError:
416- file_id_revisions[file_id] = set([revision_id])
417- for file_id, revisions in file_id_revisions.iteritems():
418- yield ('file', file_id, revisions)
419+ def _find_present_inventory_ids(self, revision_ids):
420+ keys = [(r,) for r in revision_ids]
421+ parent_map = self.inventories.get_parent_map(keys)
422+ present_inventory_ids = set(k[-1] for k in parent_map)
423+ return present_inventory_ids
424
425 def fileids_altered_by_revision_ids(self, revision_ids, _inv_weave=None):
426 """Find the file ids and versions affected by revisions.
427@@ -776,23 +762,39 @@
428 revision_ids. Each altered file-ids has the exact revision_ids that
429 altered it listed explicitly.
430 """
431- rich_roots = self.supports_rich_root()
432- result = {}
433+ rich_root = self.supports_rich_root()
434+ bytes_to_info = inventory.CHKInventory._bytes_to_utf8name_key
435+ file_id_revisions = {}
436 pb = ui.ui_factory.nested_progress_bar()
437 try:
438- total = len(revision_ids)
439- for pos, inv in enumerate(self.iter_inventories(revision_ids)):
440- pb.update("Finding text references", pos, total)
441- for entry in inv.iter_just_entries():
442- if entry.revision != inv.revision_id:
443- continue
444- if not rich_roots and entry.file_id == inv.root_id:
445- continue
446- alterations = result.setdefault(entry.file_id, set([]))
447- alterations.add(entry.revision)
448- return result
449+ parent_ids = self._find_parent_ids_of_revisions(revision_ids)
450+ present_parent_inv_ids = self._find_present_inventory_ids(parent_ids)
451+ uninteresting_root_keys = set()
452+ interesting_root_keys = set()
453+ inventories_to_read = set(present_parent_inv_ids)
454+ inventories_to_read.update(revision_ids)
455+ for inv in self.iter_inventories(inventories_to_read):
456+ entry_chk_root_key = inv.id_to_entry.key()
457+ if inv.revision_id in present_parent_inv_ids:
458+ uninteresting_root_keys.add(entry_chk_root_key)
459+ else:
460+ interesting_root_keys.add(entry_chk_root_key)
461+
462+ chk_bytes = self.chk_bytes
463+ for record, items in chk_map.iter_interesting_nodes(chk_bytes,
464+ interesting_root_keys, uninteresting_root_keys,
465+ pb=pb):
466+ for name, bytes in items:
467+ (name_utf8, file_id, revision_id) = bytes_to_info(bytes)
468+ if not rich_root and name_utf8 == '':
469+ continue
470+ try:
471+ file_id_revisions[file_id].add(revision_id)
472+ except KeyError:
473+ file_id_revisions[file_id] = set([revision_id])
474 finally:
475 pb.finished()
476+ return file_id_revisions
477
478 def find_text_key_references(self):
479 """Find the text key references within the repository.
480@@ -843,12 +845,6 @@
481 return GroupCHKStreamSource(self, to_format)
482 return super(CHKInventoryRepository, self)._get_source(to_format)
483
484- def suspend_write_group(self):
485- raise errors.UnsuspendableWriteGroup(self)
486-
487- def _resume_write_group(self, tokens):
488- raise errors.UnsuspendableWriteGroup(self)
489-
490
491 class GroupCHKStreamSource(repository.StreamSource):
492 """Used when both the source and target repo are GroupCHK repos."""
493@@ -861,7 +857,7 @@
494 self._chk_id_roots = None
495 self._chk_p_id_roots = None
496
497- def _get_filtered_inv_stream(self):
498+ def _get_inventory_stream(self, inventory_keys):
499 """Get a stream of inventory texts.
500
501 When this function returns, self._chk_id_roots and self._chk_p_id_roots
502@@ -873,7 +869,7 @@
503 id_roots_set = set()
504 p_id_roots_set = set()
505 source_vf = self.from_repository.inventories
506- stream = source_vf.get_record_stream(self._revision_keys,
507+ stream = source_vf.get_record_stream(inventory_keys,
508 'groupcompress', True)
509 for record in stream:
510 bytes = record.get_bytes_as('fulltext')
511@@ -897,16 +893,29 @@
512 p_id_roots_set.clear()
513 return ('inventories', _filtered_inv_stream())
514
515- def _get_filtered_chk_streams(self, excluded_keys):
516+ def _find_present_inventories(self, revision_ids):
517+ revision_keys = [(r,) for r in revision_ids]
518+ inventories = self.from_repository.inventories
519+ present_inventories = inventories.get_parent_map(revision_keys)
520+ return [p[-1] for p in present_inventories]
521+
522+ def _get_filtered_chk_streams(self, excluded_revision_ids):
523 self._text_keys = set()
524- excluded_keys.discard(_mod_revision.NULL_REVISION)
525- if not excluded_keys:
526+ excluded_revision_ids.discard(_mod_revision.NULL_REVISION)
527+ if not excluded_revision_ids:
528 uninteresting_root_keys = set()
529 uninteresting_pid_root_keys = set()
530 else:
531+ # filter out any excluded revisions whose inventories are not
532+ # actually present
533+ # TODO: Update Repository.iter_inventories() to add
534+ # ignore_missing=True
535+ present_ids = self.from_repository._find_present_inventory_ids(
536+ excluded_revision_ids)
537+ present_ids = self._find_present_inventories(excluded_revision_ids)
538 uninteresting_root_keys = set()
539 uninteresting_pid_root_keys = set()
540- for inv in self.from_repository.iter_inventories(excluded_keys):
541+ for inv in self.from_repository.iter_inventories(present_ids):
542 uninteresting_root_keys.add(inv.id_to_entry.key())
543 uninteresting_pid_root_keys.add(
544 inv.parent_id_basename_to_file_id.key())
545@@ -922,12 +931,16 @@
546 self._text_keys.add((file_id, revision_id))
547 if record is not None:
548 yield record
549+ # Consumed
550+ self._chk_id_roots = None
551 yield 'chk_bytes', _filter_id_to_entry()
552 def _get_parent_id_basename_to_file_id_pages():
553 for record, items in chk_map.iter_interesting_nodes(chk_bytes,
554 self._chk_p_id_roots, uninteresting_pid_root_keys):
555 if record is not None:
556 yield record
557+ # Consumed
558+ self._chk_p_id_roots = None
559 yield 'chk_bytes', _get_parent_id_basename_to_file_id_pages()
560
561 def _get_text_stream(self):
562@@ -943,18 +956,43 @@
563 for stream_info in self._fetch_revision_texts(revision_ids):
564 yield stream_info
565 self._revision_keys = [(rev_id,) for rev_id in revision_ids]
566- yield self._get_filtered_inv_stream()
567- # The keys to exclude are part of the search recipe
568- _, _, exclude_keys, _ = search.get_recipe()
569- for stream_info in self._get_filtered_chk_streams(exclude_keys):
570+ yield self._get_inventory_stream(self._revision_keys)
571+ # TODO: The keys to exclude might be part of the search recipe
572+ # For now, exclude all parents that are at the edge of ancestry, for
573+ # which we have inventories
574+ from_repo = self.from_repository
575+ parent_ids = from_repo._find_parent_ids_of_revisions(revision_ids)
576+ for stream_info in self._get_filtered_chk_streams(parent_ids):
577 yield stream_info
578 yield self._get_text_stream()
579
580+ def get_stream_for_missing_keys(self, missing_keys):
581+ # missing keys can only occur when we are byte copying and not
582+ # translating (because translation means we don't send
583+ # unreconstructable deltas ever).
584+ missing_inventory_keys = set()
585+ for key in missing_keys:
586+ if key[0] != 'inventories':
587+ raise AssertionError('The only missing keys we should'
588+ ' be filling in are inventory keys, not %s'
589+ % (key[0],))
590+ missing_inventory_keys.add(key[1:])
591+ if self._chk_id_roots or self._chk_p_id_roots:
592+ raise AssertionError('Cannot call get_stream_for_missing_keys'
593+ ' untill all of get_stream() has been consumed.')
594+ # Yield the inventory stream, so we can find the chk stream
595+ yield self._get_inventory_stream(missing_inventory_keys)
596+ # We use the empty set for excluded_revision_ids, to make it clear that
597+ # we want to transmit all referenced chk pages.
598+ for stream_info in self._get_filtered_chk_streams(set()):
599+ yield stream_info
600+
601
602 class RepositoryFormatCHK1(RepositoryFormatPack):
603 """A hashed CHK+group compress pack repository."""
604
605 repository_class = CHKInventoryRepository
606+ supports_external_lookups = True
607 supports_chks = True
608 # For right now, setting this to True gives us InterModel1And2 rather
609 # than InterDifferingSerializer
610
611=== modified file 'bzrlib/repofmt/pack_repo.py'
612--- bzrlib/repofmt/pack_repo.py 2009-04-27 23:14:00 +0000
613+++ bzrlib/repofmt/pack_repo.py 2009-05-29 10:35:21 +0000
614@@ -268,10 +268,11 @@
615
616 def __init__(self, name, revision_index, inventory_index, text_index,
617 signature_index, upload_transport, pack_transport, index_transport,
618- pack_collection):
619+ pack_collection, chk_index=None):
620 """Create a ResumedPack object."""
621 ExistingPack.__init__(self, pack_transport, name, revision_index,
622- inventory_index, text_index, signature_index)
623+ inventory_index, text_index, signature_index,
624+ chk_index=chk_index)
625 self.upload_transport = upload_transport
626 self.index_transport = index_transport
627 self.index_sizes = [None, None, None, None]
628@@ -281,6 +282,9 @@
629 ('text', text_index),
630 ('signature', signature_index),
631 ]
632+ if chk_index is not None:
633+ indices.append(('chk', chk_index))
634+ self.index_sizes.append(None)
635 for index_type, index in indices:
636 offset = self.index_offset(index_type)
637 self.index_sizes[offset] = index._size
638@@ -301,6 +305,8 @@
639 self.upload_transport.delete(self.file_name())
640 indices = [self.revision_index, self.inventory_index, self.text_index,
641 self.signature_index]
642+ if self.chk_index is not None:
643+ indices.append(self.chk_index)
644 for index in indices:
645 index._transport.delete(index._name)
646
647@@ -308,7 +314,10 @@
648 self._check_references()
649 new_name = '../packs/' + self.file_name()
650 self.upload_transport.rename(self.file_name(), new_name)
651- for index_type in ['revision', 'inventory', 'text', 'signature']:
652+ index_types = ['revision', 'inventory', 'text', 'signature']
653+ if self.chk_index is not None:
654+ index_types.append('chk')
655+ for index_type in index_types:
656 old_name = self.index_name(index_type, self.name)
657 new_name = '../indices/' + old_name
658 self.upload_transport.rename(old_name, new_name)
659@@ -316,6 +325,11 @@
660 self._state = 'finished'
661
662 def _get_external_refs(self, index):
663+ """Return compression parents for this index that are not present.
664+
665+ This returns any compression parents that are referenced by this index,
666+ which are not contained *in* this index. They may be present elsewhere.
667+ """
668 return index.external_references(1)
669
670
671@@ -1352,6 +1366,7 @@
672 """
673
674 pack_factory = NewPack
675+ resumed_pack_factory = ResumedPack
676
677 def __init__(self, repo, transport, index_transport, upload_transport,
678 pack_transport, index_builder_class, index_class,
679@@ -1680,9 +1695,14 @@
680 inv_index = self._make_index(name, '.iix', resume=True)
681 txt_index = self._make_index(name, '.tix', resume=True)
682 sig_index = self._make_index(name, '.six', resume=True)
683- result = ResumedPack(name, rev_index, inv_index, txt_index,
684- sig_index, self._upload_transport, self._pack_transport,
685- self._index_transport, self)
686+ if self.chk_index is not None:
687+ chk_index = self._make_index(name, '.cix', resume=True)
688+ else:
689+ chk_index = None
690+ result = self.resumed_pack_factory(name, rev_index, inv_index,
691+ txt_index, sig_index, self._upload_transport,
692+ self._pack_transport, self._index_transport, self,
693+ chk_index=chk_index)
694 except errors.NoSuchFile, e:
695 raise errors.UnresumableWriteGroup(self.repo, [name], str(e))
696 self.add_pack_to_memory(result)
697@@ -1809,14 +1829,11 @@
698 def reset(self):
699 """Clear all cached data."""
700 # cached revision data
701- self.repo._revision_knit = None
702 self.revision_index.clear()
703 # cached signature data
704- self.repo._signature_knit = None
705 self.signature_index.clear()
706 # cached file text data
707 self.text_index.clear()
708- self.repo._text_knit = None
709 # cached inventory data
710 self.inventory_index.clear()
711 # cached chk data
712@@ -2035,7 +2052,6 @@
713 except KeyError:
714 pass
715 del self._resumed_packs[:]
716- self.repo._text_knit = None
717
718 def _remove_resumed_pack_indices(self):
719 for resumed_pack in self._resumed_packs:
720@@ -2081,7 +2097,6 @@
721 # when autopack takes no steps, the names list is still
722 # unsaved.
723 self._save_pack_names()
724- self.repo._text_knit = None
725
726 def _suspend_write_group(self):
727 tokens = [pack.name for pack in self._resumed_packs]
728@@ -2095,7 +2110,6 @@
729 self._new_pack.abort()
730 self._new_pack = None
731 self._remove_resumed_pack_indices()
732- self.repo._text_knit = None
733 return tokens
734
735 def _resume_write_group(self, tokens):
736@@ -2202,6 +2216,7 @@
737 % (self._format, self.bzrdir.transport.base))
738
739 def _abort_write_group(self):
740+ self.revisions._index._key_dependencies.refs.clear()
741 self._pack_collection._abort_write_group()
742
743 def _find_inconsistent_revision_parents(self):
744@@ -2262,11 +2277,13 @@
745 self._pack_collection._start_write_group()
746
747 def _commit_write_group(self):
748+ self.revisions._index._key_dependencies.refs.clear()
749 return self._pack_collection._commit_write_group()
750
751 def suspend_write_group(self):
752 # XXX check self._write_group is self.get_transaction()?
753 tokens = self._pack_collection._suspend_write_group()
754+ self.revisions._index._key_dependencies.refs.clear()
755 self._write_group = None
756 return tokens
757
758@@ -2295,10 +2312,10 @@
759 self._write_lock_count += 1
760 if self._write_lock_count == 1:
761 self._transaction = transactions.WriteTransaction()
762+ if not locked:
763 for repo in self._fallback_repositories:
764 # Writes don't affect fallback repos
765 repo.lock_read()
766- if not locked:
767 self._refresh_data()
768
769 def lock_read(self):
770@@ -2307,10 +2324,9 @@
771 self._write_lock_count += 1
772 else:
773 self.control_files.lock_read()
774+ if not locked:
775 for repo in self._fallback_repositories:
776- # Writes don't affect fallback repos
777 repo.lock_read()
778- if not locked:
779 self._refresh_data()
780
781 def leave_lock_in_place(self):
782@@ -2356,10 +2372,10 @@
783 transaction = self._transaction
784 self._transaction = None
785 transaction.finish()
786- for repo in self._fallback_repositories:
787- repo.unlock()
788 else:
789 self.control_files.unlock()
790+
791+ if not self.is_locked():
792 for repo in self._fallback_repositories:
793 repo.unlock()
794
795
796=== modified file 'bzrlib/repository.py'
797--- bzrlib/repository.py 2009-05-12 04:54:04 +0000
798+++ bzrlib/repository.py 2009-05-29 10:35:21 +0000
799@@ -969,6 +969,10 @@
800 """
801 if not self._format.supports_external_lookups:
802 raise errors.UnstackableRepositoryFormat(self._format, self.base)
803+ if self.is_locked():
804+ # This repository will call fallback.unlock() when we transition to
805+ # the unlocked state, so we make sure to increment the lock count
806+ repository.lock_read()
807 self._check_fallback_repository(repository)
808 self._fallback_repositories.append(repository)
809 self.texts.add_fallback_versioned_files(repository.texts)
810@@ -1240,19 +1244,19 @@
811 """
812 locked = self.is_locked()
813 result = self.control_files.lock_write(token=token)
814- for repo in self._fallback_repositories:
815- # Writes don't affect fallback repos
816- repo.lock_read()
817 if not locked:
818+ for repo in self._fallback_repositories:
819+ # Writes don't affect fallback repos
820+ repo.lock_read()
821 self._refresh_data()
822 return result
823
824 def lock_read(self):
825 locked = self.is_locked()
826 self.control_files.lock_read()
827- for repo in self._fallback_repositories:
828- repo.lock_read()
829 if not locked:
830+ for repo in self._fallback_repositories:
831+ repo.lock_read()
832 self._refresh_data()
833
834 def get_physical_lock_status(self):
835@@ -1424,7 +1428,7 @@
836 def suspend_write_group(self):
837 raise errors.UnsuspendableWriteGroup(self)
838
839- def get_missing_parent_inventories(self):
840+ def get_missing_parent_inventories(self, check_for_missing_texts=True):
841 """Return the keys of missing inventory parents for revisions added in
842 this write group.
843
844@@ -1439,7 +1443,7 @@
845 return set()
846 if not self.is_in_write_group():
847 raise AssertionError('not in a write group')
848-
849+
850 # XXX: We assume that every added revision already has its
851 # corresponding inventory, so we only check for parent inventories that
852 # might be missing, rather than all inventories.
853@@ -1448,9 +1452,12 @@
854 unstacked_inventories = self.inventories._index
855 present_inventories = unstacked_inventories.get_parent_map(
856 key[-1:] for key in parents)
857- if len(parents.difference(present_inventories)) == 0:
858+ parents.difference_update(present_inventories)
859+ if len(parents) == 0:
860 # No missing parent inventories.
861 return set()
862+ if not check_for_missing_texts:
863+ return set(('inventories', rev_id) for (rev_id,) in parents)
864 # Ok, now we have a list of missing inventories. But these only matter
865 # if the inventories that reference them are missing some texts they
866 # appear to introduce.
867@@ -1577,8 +1584,8 @@
868 self.control_files.unlock()
869 if self.control_files._lock_count == 0:
870 self._inventory_entry_cache.clear()
871- for repo in self._fallback_repositories:
872- repo.unlock()
873+ for repo in self._fallback_repositories:
874+ repo.unlock()
875
876 @needs_read_lock
877 def clone(self, a_bzrdir, revision_id=None):
878@@ -4003,18 +4010,20 @@
879 try:
880 if resume_tokens:
881 self.target_repo.resume_write_group(resume_tokens)
882+ is_resume = True
883 else:
884 self.target_repo.start_write_group()
885+ is_resume = False
886 try:
887 # locked_insert_stream performs a commit|suspend.
888- return self._locked_insert_stream(stream, src_format)
889+ return self._locked_insert_stream(stream, src_format, is_resume)
890 except:
891 self.target_repo.abort_write_group(suppress_errors=True)
892 raise
893 finally:
894 self.target_repo.unlock()
895
896- def _locked_insert_stream(self, stream, src_format):
897+ def _locked_insert_stream(self, stream, src_format, is_resume):
898 to_serializer = self.target_repo._format._serializer
899 src_serializer = src_format._serializer
900 new_pack = None
901@@ -4070,14 +4079,18 @@
902 if new_pack is not None:
903 new_pack._write_data('', flush=True)
904 # Find all the new revisions (including ones from resume_tokens)
905- missing_keys = self.target_repo.get_missing_parent_inventories()
906+ missing_keys = self.target_repo.get_missing_parent_inventories(
907+ check_for_missing_texts=is_resume)
908 try:
909 for prefix, versioned_file in (
910 ('texts', self.target_repo.texts),
911 ('inventories', self.target_repo.inventories),
912 ('revisions', self.target_repo.revisions),
913 ('signatures', self.target_repo.signatures),
914+ ('chk_bytes', self.target_repo.chk_bytes),
915 ):
916+ if versioned_file is None:
917+ continue
918 missing_keys.update((prefix,) + key for key in
919 versioned_file.get_missing_compression_parent_keys())
920 except NotImplementedError:
921@@ -4230,6 +4243,7 @@
922 keys['texts'] = set()
923 keys['revisions'] = set()
924 keys['inventories'] = set()
925+ keys['chk_bytes'] = set()
926 keys['signatures'] = set()
927 for key in missing_keys:
928 keys[key[0]].add(key[1:])
929@@ -4242,6 +4256,13 @@
930 keys['revisions'],))
931 for substream_kind, keys in keys.iteritems():
932 vf = getattr(self.from_repository, substream_kind)
933+ if vf is None and keys:
934+ raise AssertionError(
935+ "cannot fill in keys for a versioned file we don't"
936+ " have: %s needs %s" % (substream_kind, keys))
937+ if not keys:
938+ # No need to stream something we don't have
939+ continue
940 # Ask for full texts always so that we don't need more round trips
941 # after this stream.
942 stream = vf.get_record_stream(keys,
943
944=== modified file 'bzrlib/tests/per_repository/test_fileid_involved.py'
945--- bzrlib/tests/per_repository/test_fileid_involved.py 2009-03-23 14:59:43 +0000
946+++ bzrlib/tests/per_repository/test_fileid_involved.py 2009-05-29 10:35:21 +0000
947@@ -1,4 +1,4 @@
948-# Copyright (C) 2005 Canonical Ltd
949+# Copyright (C) 2005, 2009 Canonical Ltd
950 #
951 # This program is free software; you can redistribute it and/or modify
952 # it under the terms of the GNU General Public License as published by
953@@ -16,7 +16,12 @@
954
955 import os
956 import sys
957+import time
958
959+from bzrlib import (
960+ revision as _mod_revision,
961+ tests,
962+ )
963 from bzrlib.errors import IllegalPath, NonAsciiRevisionId
964 from bzrlib.tests import TestSkipped
965 from bzrlib.tests.per_repository.test_repository import TestCaseWithRepository
966@@ -49,11 +54,11 @@
967 super(TestFileIdInvolved, self).setUp()
968 # create three branches, and merge it
969 #
970- # /-->J ------>K (branch2)
971- # / \
972- # A ---> B --->C ---->D->G (main)
973- # \ / /
974- # \---> E---/----> F (branch1)
975+ # ,-->J------>K (branch2)
976+ # / \
977+ # A --->B --->C---->D-->G (main)
978+ # \ / /
979+ # '--->E---+---->F (branch1)
980
981 # A changes:
982 # B changes: 'a-file-id-2006-01-01-abcd'
983@@ -137,8 +142,6 @@
984 self.branch = main_branch
985
986 def test_fileids_altered_between_two_revs(self):
987- def foo(old, new):
988- print set(self.branch.repository.get_ancestry(new)).difference(set(self.branch.repository.get_ancestry(old)))
989 self.branch.lock_read()
990 self.addCleanup(self.branch.unlock)
991 self.branch.repository.fileids_altered_by_revision_ids(["rev-J","rev-K"])
992@@ -295,7 +298,7 @@
993 self.branch = main_branch
994
995 def test_fileid_involved_full_compare2(self):
996- # this tests that fileids_alteted_by_revision_ids returns
997+ # this tests that fileids_altered_by_revision_ids returns
998 # more information than compare_tree can, because it
999 # sees each change rather than the aggregate delta.
1000 self.branch.lock_read()
1001@@ -315,6 +318,73 @@
1002 self.assertSubset(l2, l1)
1003
1004
1005+class FileIdInvolvedWGhosts(TestCaseWithRepository):
1006+
1007+ def create_branch_with_ghost_text(self):
1008+ builder = self.make_branch_builder('ghost')
1009+ builder.build_snapshot('A-id', None, [
1010+ ('add', ('', 'root-id', 'directory', None)),
1011+ ('add', ('a', 'a-file-id', 'file', 'some content\n'))])
1012+ b = builder.get_branch()
1013+ old_rt = b.repository.revision_tree('A-id')
1014+ new_inv = old_rt.inventory._get_mutable_inventory()
1015+ new_inv.revision_id = 'B-id'
1016+ new_inv['a-file-id'].revision = 'ghost-id'
1017+ new_rev = _mod_revision.Revision('B-id',
1018+ timestamp=time.time(),
1019+ timezone=0,
1020+ message='Committing against a ghost',
1021+ committer='Joe Foo <joe@foo.com>',
1022+ properties={},
1023+ parent_ids=('A-id', 'ghost-id'),
1024+ )
1025+ b.lock_write()
1026+ self.addCleanup(b.unlock)
1027+ b.repository.start_write_group()
1028+ b.repository.add_revision('B-id', new_rev, new_inv)
1029+ b.repository.commit_write_group()
1030+ return b
1031+
1032+ def test_file_ids_include_ghosts(self):
1033+ b = self.create_branch_with_ghost_text()
1034+ repo = b.repository
1035+ self.assertEqual(
1036+ {'a-file-id':set(['ghost-id'])},
1037+ repo.fileids_altered_by_revision_ids(['B-id']))
1038+
1039+ def test_file_ids_uses_fallbacks(self):
1040+ builder = self.make_branch_builder('source',
1041+ format=self.bzrdir_format)
1042+ repo = builder.get_branch().repository
1043+ if not repo._format.supports_external_lookups:
1044+ raise tests.TestNotApplicable('format does not support stacking')
1045+ builder.start_series()
1046+ builder.build_snapshot('A-id', None, [
1047+ ('add', ('', 'root-id', 'directory', None)),
1048+ ('add', ('file', 'file-id', 'file', 'contents\n'))])
1049+ builder.build_snapshot('B-id', ['A-id'], [
1050+ ('modify', ('file-id', 'new-content\n'))])
1051+ builder.build_snapshot('C-id', ['B-id'], [
1052+ ('modify', ('file-id', 'yet more content\n'))])
1053+ builder.finish_series()
1054+ source_b = builder.get_branch()
1055+ source_b.lock_read()
1056+ self.addCleanup(source_b.unlock)
1057+ base = self.make_branch('base')
1058+ base.pull(source_b, stop_revision='B-id')
1059+ stacked = self.make_branch('stacked')
1060+ stacked.set_stacked_on_url('../base')
1061+ stacked.pull(source_b, stop_revision='C-id')
1062+
1063+ stacked.lock_read()
1064+ self.addCleanup(stacked.unlock)
1065+ repo = stacked.repository
1066+ keys = {'file-id': set(['A-id'])}
1067+ if stacked.repository.supports_rich_root():
1068+ keys['root-id'] = set(['A-id'])
1069+ self.assertEqual(keys, repo.fileids_altered_by_revision_ids(['A-id']))
1070+
1071+
1072 def set_executability(wt, path, executable=True):
1073 """Set the executable bit for the file at path in the working tree
1074
1075
1076=== modified file 'bzrlib/tests/per_repository/test_write_group.py'
1077--- bzrlib/tests/per_repository/test_write_group.py 2009-05-12 09:05:30 +0000
1078+++ bzrlib/tests/per_repository/test_write_group.py 2009-05-29 10:35:21 +0000
1079@@ -18,7 +18,15 @@
1080
1081 import sys
1082
1083-from bzrlib import bzrdir, errors, graph, memorytree, remote
1084+from bzrlib import (
1085+ bzrdir,
1086+ errors,
1087+ graph,
1088+ memorytree,
1089+ osutils,
1090+ remote,
1091+ versionedfile,
1092+ )
1093 from bzrlib.branch import BzrBranchFormat7
1094 from bzrlib.inventory import InventoryDirectory
1095 from bzrlib.transport import local, memory
1096@@ -240,9 +248,9 @@
1097 inventory) in it must have all the texts in its inventory (even if not
1098 changed w.r.t. to the absent parent), otherwise it will report missing
1099 texts/parent inventory.
1100-
1101+
1102 The core of this test is that a file was changed in rev-1, but in a
1103- stacked repo that only has rev-2
1104+ stacked repo that only has rev-2
1105 """
1106 # Make a trunk with one commit.
1107 trunk_repo = self.make_stackable_repo()
1108@@ -284,6 +292,69 @@
1109 set(), reopened_repo.get_missing_parent_inventories())
1110 reopened_repo.abort_write_group()
1111
1112+ def test_get_missing_parent_inventories_check(self):
1113+ builder = self.make_branch_builder('test')
1114+ builder.build_snapshot('A-id', ['ghost-parent-id'], [
1115+ ('add', ('', 'root-id', 'directory', None)),
1116+ ('add', ('file', 'file-id', 'file', 'content\n'))],
1117+ allow_leftmost_as_ghost=True)
1118+ b = builder.get_branch()
1119+ b.lock_read()
1120+ self.addCleanup(b.unlock)
1121+ repo = self.make_repository('test-repo')
1122+ repo.lock_write()
1123+ self.addCleanup(repo.unlock)
1124+ repo.start_write_group()
1125+ self.addCleanup(repo.abort_write_group)
1126+ # Now, add the objects manually
1127+ text_keys = [('file-id', 'A-id')]
1128+ if repo.supports_rich_root():
1129+ text_keys.append(('root-id', 'A-id'))
1130+ # Directly add the texts, inventory, and revision object for 'A-id'
1131+ repo.texts.insert_record_stream(b.repository.texts.get_record_stream(
1132+ text_keys, 'unordered', True))
1133+ repo.add_revision('A-id', b.repository.get_revision('A-id'),
1134+ b.repository.get_inventory('A-id'))
1135+ get_missing = repo.get_missing_parent_inventories
1136+ if repo._format.supports_external_lookups:
1137+ self.assertEqual(set([('inventories', 'ghost-parent-id')]),
1138+ get_missing(check_for_missing_texts=False))
1139+ self.assertEqual(set(), get_missing(check_for_missing_texts=True))
1140+ self.assertEqual(set(), get_missing())
1141+ else:
1142+ # If we don't support external lookups, we always return empty
1143+ self.assertEqual(set(), get_missing(check_for_missing_texts=False))
1144+ self.assertEqual(set(), get_missing(check_for_missing_texts=True))
1145+ self.assertEqual(set(), get_missing())
1146+
1147+ def test_insert_stream_passes_resume_info(self):
1148+ repo = self.make_repository('test-repo')
1149+ if not repo._format.supports_external_lookups:
1150+ raise TestNotApplicable('only valid in resumable repos')
1151+ # log calls to get_missing_parent_inventories, so that we can assert it
1152+ # is called with the correct parameters
1153+ call_log = []
1154+ orig = repo.get_missing_parent_inventories
1155+ def get_missing(check_for_missing_texts=True):
1156+ call_log.append(check_for_missing_texts)
1157+ return orig(check_for_missing_texts=check_for_missing_texts)
1158+ repo.get_missing_parent_inventories = get_missing
1159+ repo.lock_write()
1160+ self.addCleanup(repo.unlock)
1161+ sink = repo._get_sink()
1162+ sink.insert_stream((), repo._format, [])
1163+ self.assertEqual([False], call_log)
1164+ del call_log[:]
1165+ repo.start_write_group()
1166+ # We need to insert something, or suspend_write_group won't actually
1167+ # create a token
1168+ repo.texts.insert_record_stream([versionedfile.FulltextContentFactory(
1169+ ('file-id', 'rev-id'), (), None, 'lines\n')])
1170+ tokens = repo.suspend_write_group()
1171+ self.assertNotEqual([], tokens)
1172+ sink.insert_stream((), repo._format, tokens)
1173+ self.assertEqual([True], call_log)
1174+
1175
1176 class TestResumeableWriteGroup(TestCaseWithRepository):
1177
1178@@ -518,9 +589,12 @@
1179 source_repo.start_write_group()
1180 key_base = ('file-id', 'base')
1181 key_delta = ('file-id', 'delta')
1182- source_repo.texts.add_lines(key_base, (), ['lines\n'])
1183- source_repo.texts.add_lines(
1184- key_delta, (key_base,), ['more\n', 'lines\n'])
1185+ def text_stream():
1186+ yield versionedfile.FulltextContentFactory(
1187+ key_base, (), None, 'lines\n')
1188+ yield versionedfile.FulltextContentFactory(
1189+ key_delta, (key_base,), None, 'more\nlines\n')
1190+ source_repo.texts.insert_record_stream(text_stream())
1191 source_repo.commit_write_group()
1192 return source_repo
1193
1194@@ -536,9 +610,20 @@
1195 stream = source_repo.texts.get_record_stream(
1196 [key_delta], 'unordered', False)
1197 repo.texts.insert_record_stream(stream)
1198- # It's not commitable due to the missing compression parent.
1199- self.assertRaises(
1200- errors.BzrCheckError, repo.commit_write_group)
1201+ # It's either not commitable due to the missing compression parent, or
1202+ # the stacked location has already filled in the fulltext.
1203+ try:
1204+ repo.commit_write_group()
1205+ except errors.BzrCheckError:
1206+ # It refused to commit because we have a missing parent
1207+ pass
1208+ else:
1209+ same_repo = self.reopen_repo(repo)
1210+ same_repo.lock_read()
1211+ record = same_repo.texts.get_record_stream([key_delta],
1212+ 'unordered', True).next()
1213+ self.assertEqual('more\nlines\n', record.get_bytes_as('fulltext'))
1214+ return
1215 # Merely suspending and resuming doesn't make it commitable either.
1216 wg_tokens = repo.suspend_write_group()
1217 same_repo = self.reopen_repo(repo)
1218@@ -570,8 +655,19 @@
1219 same_repo.texts.insert_record_stream(stream)
1220 # Just like if we'd added that record without a suspend/resume cycle,
1221 # commit_write_group fails.
1222- self.assertRaises(
1223- errors.BzrCheckError, same_repo.commit_write_group)
1224+ try:
1225+ same_repo.commit_write_group()
1226+ except errors.BzrCheckError:
1227+ pass
1228+ else:
1229+ # If the commit_write_group didn't fail, that is because the
1230+ # insert_record_stream already gave it a fulltext.
1231+ same_repo = self.reopen_repo(repo)
1232+ same_repo.lock_read()
1233+ record = same_repo.texts.get_record_stream([key_delta],
1234+ 'unordered', True).next()
1235+ self.assertEqual('more\nlines\n', record.get_bytes_as('fulltext'))
1236+ return
1237 same_repo.abort_write_group()
1238
1239 def test_add_missing_parent_after_resume(self):
1240
1241=== modified file 'bzrlib/tests/per_repository_reference/__init__.py'
1242--- bzrlib/tests/per_repository_reference/__init__.py 2009-03-23 14:59:43 +0000
1243+++ bzrlib/tests/per_repository_reference/__init__.py 2009-05-29 10:35:21 +0000
1244@@ -97,6 +97,9 @@
1245 'bzrlib.tests.per_repository_reference.test_break_lock',
1246 'bzrlib.tests.per_repository_reference.test_check',
1247 'bzrlib.tests.per_repository_reference.test_default_stacking',
1248+ 'bzrlib.tests.per_repository_reference.test_fetch',
1249+ 'bzrlib.tests.per_repository_reference.test_initialize',
1250+ 'bzrlib.tests.per_repository_reference.test_unlock',
1251 ]
1252 # Parameterize per_repository_reference test modules by format.
1253 standard_tests.addTests(loader.loadTestsFromModuleNames(module_list))
1254
1255=== modified file 'bzrlib/tests/per_repository_reference/test_default_stacking.py'
1256--- bzrlib/tests/per_repository_reference/test_default_stacking.py 2009-03-23 14:59:43 +0000
1257+++ bzrlib/tests/per_repository_reference/test_default_stacking.py 2009-05-29 10:35:21 +0000
1258@@ -21,19 +21,13 @@
1259
1260 class TestDefaultStackingPolicy(TestCaseWithRepository):
1261
1262- # XXX: this helper probably belongs on TestCaseWithTransport
1263- def make_smart_server(self, path):
1264- smart_server = server.SmartTCPServer_for_testing()
1265- smart_server.setUp(self.get_server())
1266- return smart_server.get_url() + path
1267-
1268 def test_sprout_to_smart_server_stacking_policy_handling(self):
1269 """Obey policy where possible, ignore otherwise."""
1270 stack_on = self.make_branch('stack-on')
1271 parent_bzrdir = self.make_bzrdir('.', format='default')
1272 parent_bzrdir.get_config().set_default_stack_on('stack-on')
1273 source = self.make_branch('source')
1274- url = self.make_smart_server('target')
1275+ url = self.make_smart_server('target').abspath('')
1276 target = source.bzrdir.sprout(url).open_branch()
1277 self.assertEqual('../stack-on', target.get_stacked_on_url())
1278 self.assertEqual(
1279
1280=== added file 'bzrlib/tests/per_repository_reference/test_fetch.py'
1281--- bzrlib/tests/per_repository_reference/test_fetch.py 1970-01-01 00:00:00 +0000
1282+++ bzrlib/tests/per_repository_reference/test_fetch.py 2009-05-29 10:35:21 +0000
1283@@ -0,0 +1,101 @@
1284+# Copyright (C) 2009 Canonical Ltd
1285+#
1286+# This program is free software; you can redistribute it and/or modify
1287+# it under the terms of the GNU General Public License as published by
1288+# the Free Software Foundation; either version 2 of the License, or
1289+# (at your option) any later version.
1290+#
1291+# This program is distributed in the hope that it will be useful,
1292+# but WITHOUT ANY WARRANTY; without even the implied warranty of
1293+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1294+# GNU General Public License for more details.
1295+#
1296+# You should have received a copy of the GNU General Public License
1297+# along with this program; if not, write to the Free Software
1298+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
1299+
1300+
1301+from bzrlib.smart import server
1302+from bzrlib.tests.per_repository import TestCaseWithRepository
1303+
1304+
1305+class TestFetch(TestCaseWithRepository):
1306+
1307+ def make_source_branch(self):
1308+ # It would be nice if there were a way to force this to be memory-only
1309+ builder = self.make_branch_builder('source')
1310+ content = ['content lines\n'
1311+ 'for the first revision\n'
1312+ 'which is a marginal amount of content\n'
1313+ ]
1314+ builder.start_series()
1315+ builder.build_snapshot('A-id', None, [
1316+ ('add', ('', 'root-id', 'directory', None)),
1317+ ('add', ('a', 'a-id', 'file', ''.join(content))),
1318+ ])
1319+ content.append('and some more lines for B\n')
1320+ builder.build_snapshot('B-id', ['A-id'], [
1321+ ('modify', ('a-id', ''.join(content)))])
1322+ content.append('and yet even more content for C\n')
1323+ builder.build_snapshot('C-id', ['B-id'], [
1324+ ('modify', ('a-id', ''.join(content)))])
1325+ builder.finish_series()
1326+ source_b = builder.get_branch()
1327+ source_b.lock_read()
1328+ self.addCleanup(source_b.unlock)
1329+ return content, source_b
1330+
1331+ def test_sprout_from_stacked_with_short_history(self):
1332+ content, source_b = self.make_source_branch()
1333+ # Split the generated content into a base branch, and a stacked branch
1334+ # Use 'make_branch' which gives us a bzr:// branch when appropriate,
1335+ # rather than creating a branch-on-disk
1336+ stack_b = self.make_branch('stack-on')
1337+ stack_b.pull(source_b, stop_revision='B-id')
1338+ target_b = self.make_branch('target')
1339+ target_b.set_stacked_on_url('../stack-on')
1340+ target_b.pull(source_b, stop_revision='C-id')
1341+ # At this point, 'target' should hold exactly one revision (C-id),
1342+ # stacked on top of the revisions in 'stack-on'.
1343+ final_b = self.make_branch('final')
1344+ final_b.pull(target_b)
1345+ final_b.lock_read()
1346+ self.addCleanup(final_b.unlock)
1347+ self.assertEqual('C-id', final_b.last_revision())
1348+ text_keys = [('a-id', 'A-id'), ('a-id', 'B-id'), ('a-id', 'C-id')]
1349+ stream = final_b.repository.texts.get_record_stream(text_keys,
1350+ 'unordered', True)
1351+ records = sorted([(r.key, r.get_bytes_as('fulltext')) for r in stream])
1352+ self.assertEqual([
1353+ (('a-id', 'A-id'), ''.join(content[:-2])),
1354+ (('a-id', 'B-id'), ''.join(content[:-1])),
1355+ (('a-id', 'C-id'), ''.join(content)),
1356+ ], records)
1357+
1358+ def test_sprout_from_smart_stacked_with_short_history(self):
1359+ content, source_b = self.make_source_branch()
1360+ transport = self.make_smart_server('server')
1361+ transport.ensure_base()
1362+ url = transport.abspath('')
1363+ stack_b = source_b.bzrdir.sprout(url + '/stack-on', revision_id='B-id')
1364+ # self.make_branch only takes relative paths, so we do it the 'hard'
1365+ # way
1366+ target_transport = transport.clone('target')
1367+ target_transport.ensure_base()
1368+ target_bzrdir = self.bzrdir_format.initialize_on_transport(
1369+ target_transport)
1370+ target_bzrdir.create_repository()
1371+ target_b = target_bzrdir.create_branch()
1372+ target_b.set_stacked_on_url('../stack-on')
1373+ target_b.pull(source_b, stop_revision='C-id')
1374+ # Now we should be able to branch from the remote location to a local
1375+ # location
1376+ final_b = target_b.bzrdir.sprout('final').open_branch()
1377+ self.assertEqual('C-id', final_b.last_revision())
1378+
1379+ # bzrdir.sprout() has slightly different code paths if you supply a
1380+ # revision_id versus not. If you supply revision_id, then you get a
1381+ # PendingAncestryResult for the search, versus a SearchResult...
1382+ final2_b = target_b.bzrdir.sprout('final2',
1383+ revision_id='C-id').open_branch()
1384+ self.assertEqual('C-id', final2_b.last_revision())
1385
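
The stacking setup both tests exercise reduces to a small pattern; a minimal
sketch using only the calls that appear above (branch names are illustrative):

    # 'stack-on' holds the shared history up to B-id; 'target' is
    # stacked on it and physically holds only the C-id delta.
    stack_b = self.make_branch('stack-on')
    stack_b.pull(source_b, stop_revision='B-id')
    target_b = self.make_branch('target')
    target_b.set_stacked_on_url('../stack-on')  # relative, resolved on open
    target_b.pull(source_b, stop_revision='C-id')
    # The point of the tests: fetching from 'target' must reconstruct
    # 'fulltext' records even for texts whose compression parents live
    # only in the fallback ('stack-on').
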
1386=== added file 'bzrlib/tests/per_repository_reference/test_initialize.py'
1387--- bzrlib/tests/per_repository_reference/test_initialize.py 1970-01-01 00:00:00 +0000
1388+++ bzrlib/tests/per_repository_reference/test_initialize.py 2009-05-29 10:35:21 +0000
1389@@ -0,0 +1,59 @@
1390+# Copyright (C) 2009 Canonical Ltd
1391+#
1392+# This program is free software; you can redistribute it and/or modify
1393+# it under the terms of the GNU General Public License as published by
1394+# the Free Software Foundation; either version 2 of the License, or
1395+# (at your option) any later version.
1396+#
1397+# This program is distributed in the hope that it will be useful,
1398+# but WITHOUT ANY WARRANTY; without even the implied warranty of
1399+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1400+# GNU General Public License for more details.
1401+#
1402+# You should have received a copy of the GNU General Public License
1403+# along with this program; if not, write to the Free Software
1404+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
1405+
1406+"""Tests for initializing a repository with external references."""
1407+
1408+
1409+from bzrlib import (
1410+ errors,
1411+ tests,
1412+ )
1413+from bzrlib.tests.per_repository_reference import (
1414+ TestCaseWithExternalReferenceRepository,
1415+ )
1416+
1417+
1418+class TestInitialize(TestCaseWithExternalReferenceRepository):
1419+
1420+ def initialize_and_check_on_transport(self, base, trans):
1421+ network_name = base.repository._format.network_name()
1422+ result = self.bzrdir_format.initialize_on_transport_ex(
1423+ trans, use_existing_dir=False, create_prefix=False,
1424+ stacked_on='../base', stack_on_pwd=base.base,
1425+ repo_format_name=network_name)
1426+ result_repo, a_bzrdir, require_stacking, repo_policy = result
1427+ self.addCleanup(result_repo.unlock)
1428+ self.assertEqual(1, len(result_repo._fallback_repositories))
1429+ return result_repo
1430+
1431+ def test_initialize_on_transport_ex(self):
1432+ base = self.make_branch('base')
1433+ trans = self.get_transport('stacked')
1434+ repo = self.initialize_and_check_on_transport(base, trans)
1435+ self.assertEqual(base.repository._format.network_name(),
1436+ repo._format.network_name())
1437+
1438+ def test_remote_initialize_on_transport_ex(self):
1439+ # All formats can be initialized appropriately over bzr://
1440+ base = self.make_branch('base')
1441+ trans = self.make_smart_server('stacked')
1442+ repo = self.initialize_and_check_on_transport(base, trans)
1443+ network_name = base.repository._format.network_name()
1444+ if network_name != repo._format.network_name():
1445+ raise tests.KnownFailure('Remote initialize_on_transport_ex()'
1446+ ' tries to "upgrade" the format because it doesn\'t have a'
1447+ ' branch format, and hard-codes the new repository format.')
1448+ self.assertEqual(network_name, repo._format.network_name())
1449
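
One non-obvious detail in initialize_and_check_on_transport() above:
initialize_on_transport_ex() hands the new repository back already
write-locked, which is why the helper registers addCleanup(result_repo.unlock).
The unpacking, as the helper does it:

    result = self.bzrdir_format.initialize_on_transport_ex(
        trans, use_existing_dir=False, create_prefix=False,
        stacked_on='../base', stack_on_pwd=base.base,
        repo_format_name=network_name)
    result_repo, a_bzrdir, require_stacking, repo_policy = result
    # result_repo comes back locked; the caller owns the unlock.
    self.addCleanup(result_repo.unlock)
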
1450=== added file 'bzrlib/tests/per_repository_reference/test_unlock.py'
1451--- bzrlib/tests/per_repository_reference/test_unlock.py 1970-01-01 00:00:00 +0000
1452+++ bzrlib/tests/per_repository_reference/test_unlock.py 2009-05-29 10:35:21 +0000
1453@@ -0,0 +1,76 @@
1454+# Copyright (C) 2009 Canonical Ltd
1455+#
1456+# This program is free software; you can redistribute it and/or modify
1457+# it under the terms of the GNU General Public License as published by
1458+# the Free Software Foundation; either version 2 of the License, or
1459+# (at your option) any later version.
1460+#
1461+# This program is distributed in the hope that it will be useful,
1462+# but WITHOUT ANY WARRANTY; without even the implied warranty of
1463+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1464+# GNU General Public License for more details.
1465+#
1466+# You should have received a copy of the GNU General Public License
1467+# along with this program; if not, write to the Free Software
1468+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
1469+
1470+"""Tests for locking/unlocking a repository with external references."""
1471+
1472+from bzrlib import (
1473+ branch,
1474+ tests,
1475+ )
1476+from bzrlib.tests.per_repository_reference import (
1477+ TestCaseWithExternalReferenceRepository,
1478+ )
1479+
1480+
1481+class TestUnlock(TestCaseWithExternalReferenceRepository):
1482+
1483+ def create_stacked_branch(self):
1484+ builder = self.make_branch_builder('source',
1485+ format=self.bzrdir_format)
1486+ builder.start_series()
1487+ repo = builder.get_branch().repository
1488+ if not repo._format.supports_external_lookups:
1489+ raise tests.TestNotApplicable('format does not support stacking')
1490+ builder.build_snapshot('A-id', None, [
1491+ ('add', ('', 'root-id', 'directory', None)),
1492+ ('add', ('file', 'file-id', 'file', 'contents\n'))])
1493+ builder.build_snapshot('B-id', ['A-id'], [
1494+ ('modify', ('file-id', 'new-content\n'))])
1495+ builder.build_snapshot('C-id', ['B-id'], [
1496+ ('modify', ('file-id', 'yet more content\n'))])
1497+ builder.finish_series()
1498+ source_b = builder.get_branch()
1499+ source_b.lock_read()
1500+ self.addCleanup(source_b.unlock)
1501+ base = self.make_branch('base')
1502+ base.pull(source_b, stop_revision='B-id')
1503+ stacked = self.make_branch('stacked')
1504+ stacked.set_stacked_on_url('../base')
1505+ stacked.pull(source_b, stop_revision='C-id')
1506+
1507+ return base, stacked
1508+
1509+ def test_unlock_unlocks_fallback(self):
1510+ base = self.make_branch('base')
1511+ stacked = self.make_branch('stacked')
1512+ repo = stacked.repository
1513+ stacked.set_stacked_on_url('../base')
1514+ self.assertEqual(1, len(repo._fallback_repositories))
1515+ fallback_repo = repo._fallback_repositories[0]
1516+ self.assertFalse(repo.is_locked())
1517+ self.assertFalse(fallback_repo.is_locked())
1518+ repo.lock_read()
1519+ self.assertTrue(repo.is_locked())
1520+ self.assertTrue(fallback_repo.is_locked())
1521+ repo.unlock()
1522+ self.assertFalse(repo.is_locked())
1523+ self.assertFalse(fallback_repo.is_locked())
1524+ repo.lock_write()
1525+ self.assertTrue(repo.is_locked())
1526+ self.assertTrue(fallback_repo.is_locked())
1527+ repo.unlock()
1528+ self.assertFalse(repo.is_locked())
1529+ self.assertFalse(fallback_repo.is_locked())
1530
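
Stated directly, the invariant test_unlock_unlocks_fallback pins down is
(illustrative assertions, not the Repository implementation):

    repo.lock_read()
    assert all(f.is_locked() for f in repo._fallback_repositories)
    repo.unlock()
    assert not any(f.is_locked() for f in repo._fallback_repositories)
    # The same pairing must hold for lock_write()/unlock().
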
1531=== modified file 'bzrlib/tests/test_graph.py'
1532--- bzrlib/tests/test_graph.py 2009-03-24 23:19:12 +0000
1533+++ bzrlib/tests/test_graph.py 2009-05-29 10:35:21 +0000
1534@@ -1558,6 +1558,19 @@
1535 result = _mod_graph.PendingAncestryResult(['rev-2'], repo)
1536 self.assertEqual(set(['rev-1', 'rev-2']), set(result.get_keys()))
1537
1538+ def test_get_keys_excludes_ghosts(self):
1539+ builder = self.make_branch_builder('b')
1540+ builder.start_series()
1541+ builder.build_snapshot('rev-1', None, [
1542+ ('add', ('', 'root-id', 'directory', ''))])
1543+ builder.build_snapshot('rev-2', ['rev-1', 'ghost'], [])
1544+ builder.finish_series()
1545+ repo = builder.get_branch().repository
1546+ repo.lock_read()
1547+ self.addCleanup(repo.unlock)
1548+ result = _mod_graph.PendingAncestryResult(['rev-2'], repo)
1549+ self.assertEqual(sorted(['rev-1', 'rev-2']), sorted(result.get_keys()))
1550+
1551 def test_get_keys_excludes_null(self):
1552 # Make a 'graph' with an iter_ancestry that returns NULL_REVISION
1553 # somewhere other than the last element, which can happen in real
1554
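
For the new ghost test: Graph.iter_ancestry() reports a ghost as a (key, None)
pair, so a filter along these lines is presumably all get_keys() needs (a
sketch, not necessarily the exact implementation):

    def get_keys(self):
        self.repo.lock_read()
        try:
            g = self.repo.get_graph()
            # Drop ghosts (parents is None) as well as NULL_REVISION.
            return [key for key, parents in g.iter_ancestry(self.heads)
                    if parents is not None
                    and key != _mod_revision.NULL_REVISION]
        finally:
            self.repo.unlock()
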
1555=== modified file 'bzrlib/tests/test_groupcompress.py'
1556--- bzrlib/tests/test_groupcompress.py 2009-04-22 17:18:45 +0000
1557+++ bzrlib/tests/test_groupcompress.py 2009-05-29 10:35:21 +0000
1558@@ -19,8 +19,10 @@
1559 import zlib
1560
1561 from bzrlib import (
1562+ btree_index,
1563 groupcompress,
1564 errors,
1565+ index as _mod_index,
1566 osutils,
1567 tests,
1568 versionedfile,
1569@@ -475,6 +477,23 @@
1570
1571 class TestGroupCompressVersionedFiles(TestCaseWithGroupCompressVersionedFiles):
1572
1573+ def make_g_index(self, name, ref_lists=0, nodes=[]):
1574+ builder = btree_index.BTreeBuilder(ref_lists)
1575+ for node, references, value in nodes:
1576+ builder.add_node(node, references, value)
1577+ stream = builder.finish()
1578+ trans = self.get_transport()
1579+ size = trans.put_file(name, stream)
1580+ return btree_index.BTreeGraphIndex(trans, name, size)
1581+
1582+ def make_g_index_missing_parent(self):
1583+ graph_index = self.make_g_index('missing_parent', 1,
1584+ [(('parent', ), '2 78 2 10', ([],)),
1585+ (('tip', ), '2 78 2 10',
1586+ ([('parent', ), ('missing-parent', )],)),
1587+ ])
1588+ return graph_index
1589+
1590 def test_get_record_stream_as_requested(self):
1591 # Consider promoting 'as-requested' to general availability, and
1592 # make this a VF interface test
1593@@ -606,6 +625,30 @@
1594 else:
1595 self.assertIs(block, record._manager._block)
1596
1597+ def test_add_missing_noncompression_parent_unvalidated_index(self):
1598+ unvalidated = self.make_g_index_missing_parent()
1599+ combined = _mod_index.CombinedGraphIndex([unvalidated])
1600+ index = groupcompress._GCGraphIndex(combined,
1601+ is_locked=lambda: True, parents=True,
1602+ track_external_parent_refs=True)
1603+ index.scan_unvalidated_index(unvalidated)
1604+ self.assertEqual(
1605+ frozenset([('missing-parent',)]), index.get_missing_parents())
1606+
1607+ def test_track_external_parent_refs(self):
1608+ g_index = self.make_g_index('empty', 1, [])
1609+ mod_index = btree_index.BTreeBuilder(1, 1)
1610+ combined = _mod_index.CombinedGraphIndex([g_index, mod_index])
1611+ index = groupcompress._GCGraphIndex(combined,
1612+ is_locked=lambda: True, parents=True,
1613+ add_callback=mod_index.add_nodes,
1614+ track_external_parent_refs=True)
1615+ index.add_records([
1616+ (('new-key',), '2 10 2 10', [(('parent-1',), ('parent-2',))])])
1617+ self.assertEqual(
1618+ frozenset([('parent-1',), ('parent-2',)]),
1619+ index.get_missing_parents())
1620+
1621
1622 class TestLazyGroupCompress(tests.TestCaseWithTransport):
1623
1624
1625=== modified file 'bzrlib/tests/test_pack_repository.py'
1626--- bzrlib/tests/test_pack_repository.py 2009-05-11 15:30:40 +0000
1627+++ bzrlib/tests/test_pack_repository.py 2009-05-29 10:35:21 +0000
1628@@ -620,7 +620,7 @@
1629 Also requires that the exception is logged.
1630 """
1631 self.vfs_transport_factory = memory.MemoryServer
1632- repo = self.make_repository('repo')
1633+ repo = self.make_repository('repo', format=self.get_format())
1634 token = repo.lock_write()
1635 self.addCleanup(repo.unlock)
1636 repo.start_write_group()
1637@@ -637,7 +637,7 @@
1638
1639 def test_abort_write_group_does_raise_when_not_suppressed(self):
1640 self.vfs_transport_factory = memory.MemoryServer
1641- repo = self.make_repository('repo')
1642+ repo = self.make_repository('repo', format=self.get_format())
1643 token = repo.lock_write()
1644 self.addCleanup(repo.unlock)
1645 repo.start_write_group()
1646@@ -650,23 +650,51 @@
1647
1648 def test_suspend_write_group(self):
1649 self.vfs_transport_factory = memory.MemoryServer
1650- repo = self.make_repository('repo')
1651+ repo = self.make_repository('repo', format=self.get_format())
1652 token = repo.lock_write()
1653 self.addCleanup(repo.unlock)
1654 repo.start_write_group()
1655 repo.texts.add_lines(('file-id', 'revid'), (), ['lines'])
1656 wg_tokens = repo.suspend_write_group()
1657 expected_pack_name = wg_tokens[0] + '.pack'
1658+ expected_names = [wg_tokens[0] + ext for ext in
1659+ ('.rix', '.iix', '.tix', '.six')]
1660+ if repo.chk_bytes is not None:
1661+ expected_names.append(wg_tokens[0] + '.cix')
1662+ expected_names.append(expected_pack_name)
1663 upload_transport = repo._pack_collection._upload_transport
1664 limbo_files = upload_transport.list_dir('')
1665- self.assertTrue(expected_pack_name in limbo_files, limbo_files)
1666+ self.assertEqual(sorted(expected_names), sorted(limbo_files))
1667 md5 = osutils.md5(upload_transport.get_bytes(expected_pack_name))
1668 self.assertEqual(wg_tokens[0], md5.hexdigest())
1669
1670+ def test_resume_chk_bytes(self):
1671+ self.vfs_transport_factory = memory.MemoryServer
1672+ repo = self.make_repository('repo', format=self.get_format())
1673+ if repo.chk_bytes is None:
1674+ raise TestNotApplicable('no chk_bytes for this repository')
1675+ token = repo.lock_write()
1676+ self.addCleanup(repo.unlock)
1677+ repo.start_write_group()
1678+ text = 'a bit of text\n'
1679+ key = ('sha1:' + osutils.sha_string(text),)
1680+ repo.chk_bytes.add_lines(key, (), [text])
1681+ wg_tokens = repo.suspend_write_group()
1682+ same_repo = repo.bzrdir.open_repository()
1683+ same_repo.lock_write()
1684+ self.addCleanup(same_repo.unlock)
1685+ same_repo.resume_write_group(wg_tokens)
1686+ self.assertEqual([key], list(same_repo.chk_bytes.keys()))
1687+ self.assertEqual(
1688+ text, same_repo.chk_bytes.get_record_stream([key],
1689+ 'unordered', True).next().get_bytes_as('fulltext'))
1690+ same_repo.abort_write_group()
1691+ self.assertEqual([], list(same_repo.chk_bytes.keys()))
1692+
1693 def test_resume_write_group_then_abort(self):
1694 # Create a repo, start a write group, insert some data, suspend.
1695 self.vfs_transport_factory = memory.MemoryServer
1696- repo = self.make_repository('repo')
1697+ repo = self.make_repository('repo', format=self.get_format())
1698 token = repo.lock_write()
1699 self.addCleanup(repo.unlock)
1700 repo.start_write_group()
1701@@ -685,10 +713,38 @@
1702 self.assertEqual(
1703 [], same_repo._pack_collection._pack_transport.list_dir(''))
1704
1705+ def test_commit_resumed_write_group(self):
1706+ self.vfs_transport_factory = memory.MemoryServer
1707+ repo = self.make_repository('repo', format=self.get_format())
1708+ token = repo.lock_write()
1709+ self.addCleanup(repo.unlock)
1710+ repo.start_write_group()
1711+ text_key = ('file-id', 'revid')
1712+ repo.texts.add_lines(text_key, (), ['lines'])
1713+ wg_tokens = repo.suspend_write_group()
1714+ # Get a fresh repository object for the repo on the filesystem.
1715+ same_repo = repo.bzrdir.open_repository()
1716+ # Resume
1717+ same_repo.lock_write()
1718+ self.addCleanup(same_repo.unlock)
1719+ same_repo.resume_write_group(wg_tokens)
1720+ same_repo.commit_write_group()
1721+ expected_pack_name = wg_tokens[0] + '.pack'
1722+ expected_names = [wg_tokens[0] + ext for ext in
1723+ ('.rix', '.iix', '.tix', '.six')]
1724+ if repo.chk_bytes is not None:
1725+ expected_names.append(wg_tokens[0] + '.cix')
1726+ self.assertEqual(
1727+ [], same_repo._pack_collection._upload_transport.list_dir(''))
1728+ index_names = repo._pack_collection._index_transport.list_dir('')
1729+ self.assertEqual(sorted(expected_names), sorted(index_names))
1730+ pack_names = repo._pack_collection._pack_transport.list_dir('')
1731+ self.assertEqual([expected_pack_name], pack_names)
1732+
1733 def test_resume_malformed_token(self):
1734 self.vfs_transport_factory = memory.MemoryServer
1735 # Make a repository with a suspended write group
1736- repo = self.make_repository('repo')
1737+ repo = self.make_repository('repo', format=self.get_format())
1738 token = repo.lock_write()
1739 self.addCleanup(repo.unlock)
1740 repo.start_write_group()
1741@@ -696,7 +752,7 @@
1742 repo.texts.add_lines(text_key, (), ['lines'])
1743 wg_tokens = repo.suspend_write_group()
1744 # Make a new repository
1745- new_repo = self.make_repository('new_repo')
1746+ new_repo = self.make_repository('new_repo', format=self.get_format())
1747 token = new_repo.lock_write()
1748 self.addCleanup(new_repo.unlock)
1749 hacked_wg_token = (
1750@@ -732,12 +788,12 @@
1751 # can only stack on repositories that have compatible internal
1752 # metadata
1753 if getattr(repo._format, 'supports_tree_reference', False):
1754+ matching_format_name = 'pack-0.92-subtree'
1755+ else:
1756 if repo._format.supports_chks:
1757 matching_format_name = 'development6-rich-root'
1758 else:
1759- matching_format_name = 'pack-0.92-subtree'
1760- else:
1761- matching_format_name = 'rich-root-pack'
1762+ matching_format_name = 'rich-root-pack'
1763 mismatching_format_name = 'pack-0.92'
1764 else:
1765 # We don't have a non-rich-root CHK format.
1766@@ -763,15 +819,14 @@
1767 if getattr(repo._format, 'supports_tree_reference', False):
1768 # can only stack on repositories that have compatible internal
1769 # metadata
1770- if repo._format.supports_chks:
1771- # No CHK subtree formats in bzr.dev, so this doesn't execute.
1772- matching_format_name = 'development6-subtree'
1773- else:
1774- matching_format_name = 'pack-0.92-subtree'
1775+ matching_format_name = 'pack-0.92-subtree'
1776 mismatching_format_name = 'rich-root-pack'
1777 else:
1778 if repo.supports_rich_root():
1779- matching_format_name = 'rich-root-pack'
1780+ if repo._format.supports_chks:
1781+ matching_format_name = 'development6-rich-root'
1782+ else:
1783+ matching_format_name = 'rich-root-pack'
1784 mismatching_format_name = 'pack-0.92-subtree'
1785 else:
1786 raise TestNotApplicable('No formats use non-v5 serializer'
1787@@ -844,6 +899,66 @@
1788 self.assertTrue(large_pack_name in pack_names)
1789
1790
1791+class TestKeyDependencies(TestCaseWithTransport):
1792+
1793+ def get_format(self):
1794+ return bzrdir.format_registry.make_bzrdir(self.format_name)
1795+
1796+ def create_source_and_target(self):
1797+ builder = self.make_branch_builder('source', format=self.get_format())
1798+ builder.start_series()
1799+ builder.build_snapshot('A-id', None, [
1800+ ('add', ('', 'root-id', 'directory', None))])
1801+ builder.build_snapshot('B-id', ['A-id', 'ghost-id'], [])
1802+ builder.finish_series()
1803+ repo = self.make_repository('target', format=self.get_format())
1804+ b = builder.get_branch()
1805+ b.lock_read()
1806+ self.addCleanup(b.unlock)
1807+ repo.lock_write()
1808+ self.addCleanup(repo.unlock)
1809+ return b.repository, repo
1810+
1811+ def test_key_dependencies_cleared_on_abort(self):
1812+ source_repo, target_repo = self.create_source_and_target()
1813+ target_repo.start_write_group()
1814+ try:
1815+ stream = source_repo.revisions.get_record_stream([('B-id',)],
1816+ 'unordered', True)
1817+ target_repo.revisions.insert_record_stream(stream)
1818+ key_refs = target_repo.revisions._index._key_dependencies
1819+ self.assertEqual([('B-id',)], sorted(key_refs.get_referrers()))
1820+ finally:
1821+ target_repo.abort_write_group()
1822+ self.assertEqual([], sorted(key_refs.get_referrers()))
1823+
1824+ def test_key_dependencies_cleared_on_suspend(self):
1825+ source_repo, target_repo = self.create_source_and_target()
1826+ target_repo.start_write_group()
1827+ try:
1828+ stream = source_repo.revisions.get_record_stream([('B-id',)],
1829+ 'unordered', True)
1830+ target_repo.revisions.insert_record_stream(stream)
1831+ key_refs = target_repo.revisions._index._key_dependencies
1832+ self.assertEqual([('B-id',)], sorted(key_refs.get_referrers()))
1833+ finally:
1834+ target_repo.suspend_write_group()
1835+ self.assertEqual([], sorted(key_refs.get_referrers()))
1836+
1837+ def test_key_dependencies_cleared_on_commit(self):
1838+ source_repo, target_repo = self.create_source_and_target()
1839+ target_repo.start_write_group()
1840+ try:
1841+ stream = source_repo.revisions.get_record_stream([('B-id',)],
1842+ 'unordered', True)
1843+ target_repo.revisions.insert_record_stream(stream)
1844+ key_refs = target_repo.revisions._index._key_dependencies
1845+ self.assertEqual([('B-id',)], sorted(key_refs.get_referrers()))
1846+ finally:
1847+ target_repo.commit_write_group()
1848+ self.assertEqual([], sorted(key_refs.get_referrers()))
1849+
1850+
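
These three tests pin the same behaviour down at each exit from a write group:
the _key_dependencies cache must be emptied on abort, suspend and commit alike.
A rough sketch of the bookkeeping the assertions imply (illustrative only, not
bzrlib's actual _KeyRefs class):

    class KeyDepsSketch(object):

        def __init__(self):
            # parent key -> set of keys that referenced it
            self._refs = {}

        def add_references(self, key, parent_keys):
            for parent in parent_keys:
                self._refs.setdefault(parent, set()).add(key)

        def get_referrers(self):
            referrers = set()
            for children in self._refs.itervalues():
                referrers.update(children)
            return referrers

        def reset(self):
            # Run when the write group is committed, aborted or
            # suspended, so the ghost parent recorded for B-id above
            # cannot leak 'missing parent' state into a later group.
            self._refs.clear()
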
1851 class TestSmartServerAutopack(TestCaseWithTransport):
1852
1853 def setUp(self):
1854@@ -931,7 +1046,7 @@
1855 dict(format_name='development6-rich-root',
1856 format_string='Bazaar development format - group compression '
1857 'and chk inventory (needs bzr.dev from 1.14)\n',
1858- format_supports_external_lookups=False,
1859+ format_supports_external_lookups=True,
1860 index_class=BTreeGraphIndex),
1861 ]
1862 # name of the scenario is the format name
1863
1864=== modified file 'bzrlib/tests/test_repository.py'
1865--- bzrlib/tests/test_repository.py 2009-04-09 20:23:07 +0000
1866+++ bzrlib/tests/test_repository.py 2009-05-29 10:35:21 +0000
1867@@ -686,11 +686,11 @@
1868 inv.parent_id_basename_to_file_id._root_node.maximum_size)
1869
1870
1871-class TestDevelopment6FindRevisionOutsideSet(TestCaseWithTransport):
1872- """Tests for _find_revision_outside_set."""
1873+class TestDevelopment6FindParentIdsOfRevisions(TestCaseWithTransport):
1874+ """Tests for _find_parent_ids_of_revisions."""
1875
1876 def setUp(self):
1877- super(TestDevelopment6FindRevisionOutsideSet, self).setUp()
1878+ super(TestDevelopment6FindParentIdsOfRevisions, self).setUp()
1879 self.builder = self.make_branch_builder('source',
1880 format='development6-rich-root')
1881 self.builder.start_series()
1882@@ -699,42 +699,42 @@
1883 self.repo = self.builder.get_branch().repository
1884 self.addCleanup(self.builder.finish_series)
1885
1886- def assertRevisionOutsideSet(self, expected_result, rev_set):
1887- self.assertEqual(
1888- expected_result, self.repo._find_revision_outside_set(rev_set))
1889+ def assertParentIds(self, expected_result, rev_set):
1890+ self.assertEqual(sorted(expected_result),
1891+ sorted(self.repo._find_parent_ids_of_revisions(rev_set)))
1892
1893 def test_simple(self):
1894 self.builder.build_snapshot('revid1', None, [])
1895- self.builder.build_snapshot('revid2', None, [])
1896+ self.builder.build_snapshot('revid2', ['revid1'], [])
1897 rev_set = ['revid2']
1898- self.assertRevisionOutsideSet('revid1', rev_set)
1899+ self.assertParentIds(['revid1'], rev_set)
1900
1901 def test_not_first_parent(self):
1902 self.builder.build_snapshot('revid1', None, [])
1903- self.builder.build_snapshot('revid2', None, [])
1904- self.builder.build_snapshot('revid3', None, [])
1905+ self.builder.build_snapshot('revid2', ['revid1'], [])
1906+ self.builder.build_snapshot('revid3', ['revid2'], [])
1907 rev_set = ['revid3', 'revid2']
1908- self.assertRevisionOutsideSet('revid1', rev_set)
1909+ self.assertParentIds(['revid1'], rev_set)
1910
1911 def test_not_null(self):
1912 rev_set = ['initial']
1913- self.assertRevisionOutsideSet(_mod_revision.NULL_REVISION, rev_set)
1914+ self.assertParentIds([], rev_set)
1915
1916 def test_not_null_set(self):
1917 self.builder.build_snapshot('revid1', None, [])
1918 rev_set = [_mod_revision.NULL_REVISION]
1919- self.assertRevisionOutsideSet(_mod_revision.NULL_REVISION, rev_set)
1920+ self.assertParentIds([], rev_set)
1921
1922 def test_ghost(self):
1923 self.builder.build_snapshot('revid1', None, [])
1924 rev_set = ['ghost', 'revid1']
1925- self.assertRevisionOutsideSet('initial', rev_set)
1926+ self.assertParentIds(['initial'], rev_set)
1927
1928 def test_ghost_parent(self):
1929 self.builder.build_snapshot('revid1', None, [])
1930 self.builder.build_snapshot('revid2', ['revid1', 'ghost'], [])
1931 rev_set = ['revid2', 'revid1']
1932- self.assertRevisionOutsideSet('initial', rev_set)
1933+ self.assertParentIds(['ghost', 'initial'], rev_set)
1934
1935 def test_righthand_parent(self):
1936 self.builder.build_snapshot('revid1', None, [])
1937@@ -742,7 +742,7 @@
1938 self.builder.build_snapshot('revid2b', ['revid1'], [])
1939 self.builder.build_snapshot('revid3', ['revid2a', 'revid2b'], [])
1940 rev_set = ['revid3', 'revid2a']
1941- self.assertRevisionOutsideSet('revid2b', rev_set)
1942+ self.assertParentIds(['revid1', 'revid2b'], rev_set)
1943
1944
1945 class TestWithBrokenRepo(TestCaseWithTransport):
1946@@ -1220,3 +1220,68 @@
1947 stream = source._get_source(target._format)
1948 # We don't want the child GroupCHKStreamSource
1949 self.assertIs(type(stream), repository.StreamSource)
1950+
1951+ def test_get_stream_for_missing_keys_includes_all_chk_refs(self):
1952+ source_builder = self.make_branch_builder('source',
1953+ format='development6-rich-root')
1954+ # We have to build a fairly large tree, so that we are sure the chk
1955+ # pages will have split into multiple pages.
1956+ entries = [('add', ('', 'a-root-id', 'directory', None))]
1957+ for i in 'abcdefghijklmnopqrstuvwxyz123456789':
1958+ for j in 'abcdefghijklmnopqrstuvwxyz123456789':
1959+ fname = i + j
1960+ fid = fname + '-id'
1961+ content = 'content for %s\n' % (fname,)
1962+ entries.append(('add', (fname, fid, 'file', content)))
1963+ source_builder.start_series()
1964+ source_builder.build_snapshot('rev-1', None, entries)
1965+ # Now change a few of them, so we get a few new pages for the second
1966+ # revision
1967+ source_builder.build_snapshot('rev-2', ['rev-1'], [
1968+ ('modify', ('aa-id', 'new content for aa-id\n')),
1969+ ('modify', ('cc-id', 'new content for cc-id\n')),
1970+ ('modify', ('zz-id', 'new content for zz-id\n')),
1971+ ])
1972+ source_builder.finish_series()
1973+ source_branch = source_builder.get_branch()
1974+ source_branch.lock_read()
1975+ self.addCleanup(source_branch.unlock)
1976+ target = self.make_repository('target', format='development6-rich-root')
1977+ source = source_branch.repository._get_source(target._format)
1978+ self.assertIsInstance(source, groupcompress_repo.GroupCHKStreamSource)
1979+
1980+ # On a regular pass, getting the inventories and chk pages for rev-2
1981+ # would only get the newly created chk pages
1982+ search = graph.SearchResult(set(['rev-2']), set(['rev-1']), 1,
1983+ set(['rev-2']))
1984+ simple_chk_records = []
1985+ for vf_name, substream in source.get_stream(search):
1986+ if vf_name == 'chk_bytes':
1987+ for record in substream:
1988+ simple_chk_records.append(record.key)
1989+ else:
1990+ for _ in substream:
1991+ continue
1992+ # 3 pages, the root (InternalNode), + 2 pages which actually changed
1993+ self.assertEqual([('sha1:91481f539e802c76542ea5e4c83ad416bf219f73',),
1994+ ('sha1:4ff91971043668583985aec83f4f0ab10a907d3f',),
1995+ ('sha1:81e7324507c5ca132eedaf2d8414ee4bb2226187',),
1996+ ('sha1:b101b7da280596c71a4540e9a1eeba8045985ee0',)],
1997+ simple_chk_records)
1998+ # Now, when we do a similar call using 'get_stream_for_missing_keys'
1999+ # we should get a much larger set of pages.
2000+ missing = [('inventories', 'rev-2')]
2001+ full_chk_records = []
2002+ for vf_name, substream in source.get_stream_for_missing_keys(missing):
2003+ if vf_name == 'inventories':
2004+ for record in substream:
2005+ self.assertEqual(('rev-2',), record.key)
2006+ elif vf_name == 'chk_bytes':
2007+ for record in substream:
2008+ full_chk_records.append(record.key)
2009+ else:
2010+ self.fail('Should not be getting a stream of %s' % (vf_name,))
2011+ # We have 257 records now. This is because we have 1 root page, and 256
2012+ # leaf pages in a complete listing.
2013+ self.assertEqual(257, len(full_chk_records))
2014+ self.assertSubset(simple_chk_records, full_chk_records)
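
The 257 in the final assertion is easy to sanity-check; a back-of-the-envelope
sketch (the fanout reasoning is an assumption, the test itself only asserts the
totals):

    len('abcdefghijklmnopqrstuvwxyz123456789') ** 2  # 35 * 35 = 1225 files
    # 1225 inventory entries overflow a single chk leaf, so the
    # id_to_entry map splits; a root node fanning out over a
    # two-hex-digit key prefix gives 16 * 16 = 256 leaves, plus the
    # root itself:
    1 + 16 * 16  # => 257
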