Merge lp:~spiv/bzr/insert-stream-check-chks-part-2 into lp:bzr/2.0

Proposed by Andrew Bennetts
Status: Merged
Merged at revision: not available
Proposed branch: lp:~spiv/bzr/insert-stream-check-chks-part-2
Merge into: lp:bzr/2.0
Diff against target: None lines
To merge this branch: bzr merge lp:~spiv/bzr/insert-stream-check-chks-part-2
Reviewer Review Type Date Requested Status
Martin Pool Needs Fixing
Review via email: mp+11290@code.launchpad.net
Revision history for this message
Andrew Bennetts (spiv) wrote :

This completes the fix for bug 406687.

We could go much further, I think, in terms of reusing code. I've taken a nibble at that in this branch, but there's room to do much better. We now have three code paths in groupcompress_repo.py that implement essentially the same logic of finding text references from a set of revisions:

 - GroupCHKStreamSource,
 - fileids_altered_by_revision_ids, and
 - _check_new_inventories (the new checks in this patch).

However I think the incremental improvement in this patch is landable without tackling all the duplication at once. (But I'm happy to do any more low-hanging fruit that a reviewer considers appropriate.)
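(For context, the shared step those three code paths need is "classify each inventory's chk root key as new or parent-only". The preview diff factors this out as `_InterestingKeyInfo`/`_build_interesting_key_sets`; stripped of the real bzrlib inventory objects, the shape of that helper is roughly the following — the tuple-based inventory stand-ins here are illustrative, not the real API:)

```python
class InterestingKeyInfo:
    """Simplified stand-in for _InterestingKeyInfo: holds root keys split
    into 'interesting' (new revisions) and 'uninteresting' (parent-only)."""

    def __init__(self):
        self.interesting_root_keys = set()
        self.uninteresting_root_keys = set()

    def all_keys(self):
        return self.interesting_root_keys | self.uninteresting_root_keys


def build_interesting_key_sets(inventories, parent_only_inv_ids):
    """Classify each (inv_id, root_key) pair: parent-only inventories are
    uninteresting, everything else is interesting (i.e. newly added)."""
    result = InterestingKeyInfo()
    for inv_id, root_key in inventories:
        if inv_id in parent_only_inv_ids:
            result.uninteresting_root_keys.add(root_key)
        else:
            result.interesting_root_keys.add(root_key)
    return result
```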

Revision history for this message
Robert Collins (lifeless) wrote :

So a few things.

Firstly, check is trying to move away from raising a single error at the first sign of trouble; raising BzrCheckError in code that check uses takes that back a few steps. You could pass down a flag asking for an exception, or something. This occurs at least twice. I may be misreading though - the code in question might not be used by check at all.
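(One way to read the flag suggestion — a hypothetical sketch with invented names, not bzrlib code: collect every problem first, then either raise once or hand the list back to a caller such as check for aggregate reporting.)

```python
class BzrCheckError(Exception):
    """Stand-in for bzrlib.errors.BzrCheckError."""


def check_missing_keys(found_missing, raise_on_error=True):
    # Accumulate all problems rather than raising at the first one; the
    # raise_on_error flag lets commit_write_group keep its fail-fast
    # behaviour while check gets the full list back.
    problems = sorted(found_missing)
    if problems and raise_on_error:
        raise BzrCheckError(
            "missing keys for new revisions %r" % (problems,))
    return problems
```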

+        repo = self.make_repository('damaged-repo')
+        if not repo._format.supports_chks:
+            raise TestNotApplicable('requires repository with chk_bytes')

Please move these tests to per_repository_chks

Other than that it looks plausible to me.

Revision history for this message
Martin Pool (mbp) wrote :

It seems like you should have a news entry before you merge.

=== modified file 'bzrlib/groupcompress.py'
--- bzrlib/groupcompress.py 2009-09-08 02:09:09 +0000
+++ bzrlib/groupcompress.py 2009-09-08 02:30:52 +0000
@@ -1182,6 +1182,12 @@
         self._group_cache = LRUSizeCache(max_size=50*1024*1024)
         self._fallback_vfs = []

+    def without_fallbacks(self):

OK, it's fairly obvious, but a docstring would still be good.

+        gcvf = GroupCompressVersionedFiles(
+            self._index, self._access, self._delta)
+        gcvf._unadded_refs = dict(self._unadded_refs)
+        return gcvf
+

It seems a bit potentially problematic to poke this in post
construction, if someone for instance changes the constructor so that it
does actually start opening things. How about instead adding a private
parameter to the constructor to pass this in?

     def add_lines(self, key, parents, lines, parent_texts=None,
         left_matching_blocks=None, nostore_sha=None, random_id=False,
         check_content=True):
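(The private-constructor-parameter approach suggested above could look roughly like this — a simplified stand-in for GroupCompressVersionedFiles, not the real class, which takes more arguments and does more work:)

```python
class GroupCompressVersionedFiles:
    """Simplified stand-in; only the parts relevant to the review comment."""

    def __init__(self, index, access, delta, _unadded_refs=None):
        self._index = index
        self._access = access
        self._delta = delta
        # Taking the refs as a private constructor argument means the copy
        # is fully initialised in one step, so nothing breaks if the
        # constructor later grows side effects (e.g. opening files).
        self._unadded_refs = dict(_unadded_refs or {})
        self._fallback_vfs = []

    def without_fallbacks(self):
        """Return a copy of this object without any fallbacks configured."""
        return GroupCompressVersionedFiles(
            self._index, self._access, self._delta,
            _unadded_refs=self._unadded_refs)
```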

=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- bzrlib/repofmt/groupcompress_repo.py 2009-09-08 02:09:09 +0000
+++ bzrlib/repofmt/groupcompress_repo.py 2009-09-08 02:30:52 +0000
@@ -591,8 +591,6 @@
         :returns: set of missing keys. Note that not every missing key is
             guaranteed to be reported.
         """
-        if getattr(self.repo, 'chk_bytes', None) is None:
-            return set()
         # Ensure that all revisions added in this write group have:
         # - corresponding inventories,
         # - chk root entries for those inventories,
@@ -603,6 +601,7 @@
         new_revisions_keys = key_deps.get_new_keys()
         no_fallback_inv_index = self.repo.inventories._index
         no_fallback_chk_bytes_index = self.repo.chk_bytes._index
+        no_fallback_texts_index = self.repo.texts._index
         inv_parent_map = no_fallback_inv_index.get_parent_map(
             new_revisions_keys)
         # Are any inventories for corresponding to the new revisions missing?
@@ -610,7 +609,9 @@
         missing_corresponding = set(new_revisions_keys)
         missing_corresponding.difference_update(corresponding_invs)
         if missing_corresponding:
-            return [('inventories', key) for key in missing_corresponding]
+            raise errors.BzrCheckError(
+                "Repository %s missing inventories for new revisions %r "
+                % (self.repo, sorted(missing_corresponding)))

As Robert says, it seems like a step backwards to be raising an
exception directly rather than returning a list of problems - is there a
reason why you have to?

         # Are any chk root entries missing for any inventories? This includes
         # any present parent inventories, which may be used when calculating
         # deltas for streaming.
@@ -620,17 +621,57 @@
         # Filter out ghost parents.
         all_inv_keys.intersection_update(
             no_fallback_inv_index.get_parent_map(all_inv_keys))
+        parent_invs_only_keys = all_inv_keys.symmetric_difference(
+            corresponding_invs)
         all_missing = set()
         inv_ids = [key[-1] for key in all_inv_keys]
...

review: Needs Fixing
Revision history for this message
Martin Pool (mbp) wrote :

that status is meant to mean 'tweak', but feel free to call me to talk about it.

Preview Diff

=== modified file 'bzrlib/groupcompress.py'
--- bzrlib/groupcompress.py 2009-09-04 07:39:08 +0000
+++ bzrlib/groupcompress.py 2009-09-07 05:52:04 +0000
@@ -1182,6 +1182,12 @@
         self._group_cache = LRUSizeCache(max_size=50*1024*1024)
         self._fallback_vfs = []
 
+    def without_fallbacks(self):
+        gcvf = GroupCompressVersionedFiles(
+            self._index, self._access, self._delta)
+        gcvf._unadded_refs = dict(self._unadded_refs)
+        return gcvf
+
     def add_lines(self, key, parents, lines, parent_texts=None,
         left_matching_blocks=None, nostore_sha=None, random_id=False,
         check_content=True):
 
=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- bzrlib/repofmt/groupcompress_repo.py 2009-09-07 03:00:23 +0000
+++ bzrlib/repofmt/groupcompress_repo.py 2009-09-07 08:39:36 +0000
@@ -591,8 +591,6 @@
         :returns: set of missing keys. Note that not every missing key is
             guaranteed to be reported.
         """
-        if getattr(self.repo, 'chk_bytes', None) is None:
-            return set()
         # Ensure that all revisions added in this write group have:
         # - corresponding inventories,
         # - chk root entries for those inventories,
@@ -603,6 +601,7 @@
         new_revisions_keys = key_deps.get_new_keys()
         no_fallback_inv_index = self.repo.inventories._index
         no_fallback_chk_bytes_index = self.repo.chk_bytes._index
+        no_fallback_texts_index = self.repo.texts._index
         inv_parent_map = no_fallback_inv_index.get_parent_map(
             new_revisions_keys)
         # Are any inventories for corresponding to the new revisions missing?
@@ -610,7 +609,9 @@
         missing_corresponding = set(new_revisions_keys)
         missing_corresponding.difference_update(corresponding_invs)
         if missing_corresponding:
-            return [('inventories', key) for key in missing_corresponding]
+            raise errors.BzrCheckError(
+                "Repository %s missing inventories for new revisions %r "
+                % (self.repo, sorted(missing_corresponding)))
         # Are any chk root entries missing for any inventories? This includes
         # any present parent inventories, which may be used when calculating
         # deltas for streaming.
@@ -620,17 +621,57 @@
         # Filter out ghost parents.
         all_inv_keys.intersection_update(
             no_fallback_inv_index.get_parent_map(all_inv_keys))
+        parent_invs_only_keys = all_inv_keys.symmetric_difference(
+            corresponding_invs)
         all_missing = set()
         inv_ids = [key[-1] for key in all_inv_keys]
-        for inv in self.repo.iter_inventories(inv_ids, 'unordered'):
-            root_keys = set([inv.id_to_entry.key()])
-            if inv.parent_id_basename_to_file_id is not None:
-                root_keys.add(inv.parent_id_basename_to_file_id.key())
-            present = no_fallback_chk_bytes_index.get_parent_map(root_keys)
-            missing = root_keys.difference(present)
-            all_missing.update([('chk_bytes',) + key for key in missing])
-        return all_missing
+        parent_invs_only_ids = [key[-1] for key in parent_invs_only_keys]
+        root_key_info = _build_interesting_key_sets(
+            self.repo, inv_ids, parent_invs_only_ids)
+        expected_chk_roots = root_key_info.all_keys()
+        present_chk_roots = no_fallback_chk_bytes_index.get_parent_map(
+            expected_chk_roots)
+        missing_chk_roots = expected_chk_roots.difference(present_chk_roots)
+        if missing_chk_roots:
+            # Don't bother checking any further.
+            raise errors.BzrCheckError(
+                "Repository %s missing chk root keys %r for new revisions"
+                % (self.repo, sorted(missing_chk_roots)))
+        # Find all interesting chk_bytes records, and make sure they are
+        # present, as well as the text keys they reference.
+        chk_bytes_no_fallbacks = self.repo.chk_bytes.without_fallbacks()
+        chk_bytes_no_fallbacks._search_key_func = \
+            self.repo.chk_bytes._search_key_func
+        chk_diff = chk_map.iter_interesting_nodes(
+            chk_bytes_no_fallbacks, root_key_info.interesting_root_keys,
+            root_key_info.uninteresting_root_keys)
+        bytes_to_info = inventory.CHKInventory._bytes_to_utf8name_key
+        text_keys = set()
+        try:
+            for record in _filter_text_keys(chk_diff, text_keys, bytes_to_info):
+                pass
+        except errors.NoSuchRevision, e:
+            # XXX: It would be nice if we could give a more precise error here.
+            raise errors.BzrCheckError(
+                "Repository %s missing chk node(s) for new revisions."
+                % (self.repo,))
+        chk_diff = chk_map.iter_interesting_nodes(
+            chk_bytes_no_fallbacks, root_key_info.interesting_pid_root_keys,
+            root_key_info.uninteresting_pid_root_keys)
+        try:
+            for interesting_rec, interesting_map in chk_diff:
+                pass
+        except errors.NoSuchRevision, e:
+            raise errors.BzrCheckError(
+                "Repository %s missing chk node(s) for new revisions."
+                % (self.repo,))
+        present_text_keys = no_fallback_texts_index.get_parent_map(text_keys)
+        missing_text_keys = text_keys.difference(present_text_keys)
+        if missing_text_keys:
+            raise errors.BzrCheckError(
+                "Repository %s missing text keys %r for new revisions"
+                % (self.repo, sorted(missing_text_keys)))
+
     def _execute_pack_operations(self, pack_operations,
                                  _packer_class=GCCHKPacker,
                                  reload_func=None):
@@ -898,17 +939,12 @@
             parent_keys)
         present_parent_inv_ids = set(
             [k[-1] for k in present_parent_inv_keys])
-        uninteresting_root_keys = set()
-        interesting_root_keys = set()
         inventories_to_read = set(revision_ids)
         inventories_to_read.update(present_parent_inv_ids)
-        for inv in self.iter_inventories(inventories_to_read):
-            entry_chk_root_key = inv.id_to_entry.key()
-            if inv.revision_id in present_parent_inv_ids:
-                uninteresting_root_keys.add(entry_chk_root_key)
-            else:
-                interesting_root_keys.add(entry_chk_root_key)
-
+        root_key_info = _build_interesting_key_sets(
+            self, inventories_to_read, present_parent_inv_ids)
+        interesting_root_keys = root_key_info.interesting_root_keys
+        uninteresting_root_keys = root_key_info.uninteresting_root_keys
         chk_bytes = self.chk_bytes
         for record, items in chk_map.iter_interesting_nodes(chk_bytes,
             interesting_root_keys, uninteresting_root_keys,
@@ -1048,13 +1084,10 @@
         bytes_to_info = inventory.CHKInventory._bytes_to_utf8name_key
         chk_bytes = self.from_repository.chk_bytes
         def _filter_id_to_entry():
-            for record, items in chk_map.iter_interesting_nodes(chk_bytes,
-                        self._chk_id_roots, uninteresting_root_keys):
-                for name, bytes in items:
-                    # Note: we don't care about name_utf8, because we are always
-                    # rich-root = True
-                    _, file_id, revision_id = bytes_to_info(bytes)
-                    self._text_keys.add((file_id, revision_id))
+            interesting_nodes = chk_map.iter_interesting_nodes(chk_bytes,
+                    self._chk_id_roots, uninteresting_root_keys)
+            for record in _filter_text_keys(interesting_nodes, self._text_keys,
+                    bytes_to_info):
                 if record is not None:
                     yield record
             # Consumed
@@ -1098,7 +1131,7 @@
                 missing_inventory_keys.add(key[1:])
         if self._chk_id_roots or self._chk_p_id_roots:
             raise AssertionError('Cannot call get_stream_for_missing_keys'
-                ' untill all of get_stream() has been consumed.')
+                ' until all of get_stream() has been consumed.')
         # Yield the inventory stream, so we can find the chk stream
         # Some of the missing_keys will be missing because they are ghosts.
         # As such, we can ignore them. The Sink is required to verify there are
@@ -1111,6 +1144,54 @@
             yield stream_info
 
 
+class _InterestingKeyInfo(object):
+    def __init__(self):
+        self.interesting_root_keys = set()
+        self.interesting_pid_root_keys = set()
+        self.uninteresting_root_keys = set()
+        self.uninteresting_pid_root_keys = set()
+
+    def all_interesting(self):
+        return self.interesting_root_keys.union(self.interesting_pid_root_keys)
+
+    def all_uninteresting(self):
+        return self.uninteresting_root_keys.union(
+            self.uninteresting_pid_root_keys)
+
+    def all_keys(self):
+        return self.all_interesting().union(self.all_uninteresting())
+
+
+def _build_interesting_key_sets(repo, inventory_ids, parent_only_inv_ids):
+    result = _InterestingKeyInfo()
+    for inv in repo.iter_inventories(inventory_ids, 'unordered'):
+        root_key = inv.id_to_entry.key()
+        pid_root_key = inv.parent_id_basename_to_file_id.key()
+        if inv.revision_id in parent_only_inv_ids:
+            result.uninteresting_root_keys.add(root_key)
+            result.uninteresting_pid_root_keys.add(pid_root_key)
+        else:
+            result.interesting_root_keys.add(root_key)
+            result.interesting_pid_root_keys.add(pid_root_key)
+    return result
+
+
+def _filter_text_keys(interesting_nodes_iterable, text_keys, bytes_to_info):
+    """Iterate the result of iter_interesting_nodes, yielding the records
+    and adding to text_keys.
+    """
+    for record, items in interesting_nodes_iterable:
+        for name, bytes in items:
+            # Note: we don't care about name_utf8, because groupcompress repos
+            # are always rich-root, so there are no synthesised root records to
+            # ignore.
+            _, file_id, revision_id = bytes_to_info(bytes)
+            text_keys.add((file_id, revision_id))
+        yield record
+
+
 class RepositoryFormatCHK1(RepositoryFormatPack):
     """A hashed CHK+group compress pack repository."""
 
 
=== modified file 'bzrlib/repofmt/pack_repo.py'
--- bzrlib/repofmt/pack_repo.py 2009-09-07 03:00:23 +0000
+++ bzrlib/repofmt/pack_repo.py 2009-09-07 06:01:48 +0000
@@ -2071,7 +2071,7 @@
         """
         # The base implementation does no checks. GCRepositoryPackCollection
         # overrides this.
-        return set()
+        pass
 
     def _commit_write_group(self):
         all_missing = set()
@@ -2087,11 +2087,7 @@
             raise errors.BzrCheckError(
                 "Repository %s has missing compression parent(s) %r "
                 % (self.repo, sorted(all_missing)))
-        all_missing = self._check_new_inventories()
-        if all_missing:
-            raise errors.BzrCheckError(
-                "Repository %s missing keys for new revisions %r "
-                % (self.repo, sorted(all_missing)))
+        self._check_new_inventories()
         self._remove_pack_indices(self._new_pack)
         any_new_content = False
         if self._new_pack.data_inserted():
 
=== modified file 'bzrlib/tests/per_repository/test_write_group.py'
--- bzrlib/tests/per_repository/test_write_group.py 2009-09-02 03:07:23 +0000
+++ bzrlib/tests/per_repository/test_write_group.py 2009-09-07 08:06:25 +0000
@@ -365,16 +365,16 @@
         """commit_write_group fails with BzrCheckError when the chk root record
         for a new inventory is missing.
         """
+        repo = self.make_repository('damaged-repo')
+        if not repo._format.supports_chks:
+            raise TestNotApplicable('requires repository with chk_bytes')
         builder = self.make_branch_builder('simple-branch')
         builder.build_snapshot('A-id', None, [
             ('add', ('', 'root-id', 'directory', None)),
             ('add', ('file', 'file-id', 'file', 'content\n'))])
         b = builder.get_branch()
-        if not b.repository._format.supports_chks:
-            raise TestNotApplicable('requires repository with chk_bytes')
         b.lock_read()
         self.addCleanup(b.unlock)
-        repo = self.make_repository('damaged-repo')
         repo.lock_write()
         repo.start_write_group()
         # Now, add the objects manually
@@ -411,6 +411,9 @@
         (In principle the chk records are unnecessary in this case, but in
         practice bzr 2.0rc1 (at least) expects to find them.)
         """
+        repo = self.make_repository('damaged-repo')
+        if not repo._format.supports_chks:
+            raise TestNotApplicable('requires repository with chk_bytes')
         # Make a branch where the last two revisions have identical
         # inventories.
         builder = self.make_branch_builder('simple-branch')
@@ -420,8 +423,6 @@
         builder.build_snapshot('B-id', None, [])
         builder.build_snapshot('C-id', None, [])
         b = builder.get_branch()
-        if not b.repository._format.supports_chks:
-            raise TestNotApplicable('requires repository with chk_bytes')
         b.lock_read()
         self.addCleanup(b.unlock)
         # check our setup: B-id and C-id should have identical chk root keys.
@@ -433,10 +434,71 @@
         # We need ('revisions', 'C-id'), ('inventories', 'C-id'),
         # ('inventories', 'B-id'), and the corresponding chk roots for those
         # inventories.
+        repo.lock_write()
+        repo.start_write_group()
+        src_repo = b.repository
+        repo.inventories.insert_record_stream(
+            src_repo.inventories.get_record_stream(
+                [('B-id',), ('C-id',)], 'unordered', True))
+        repo.revisions.insert_record_stream(
+            src_repo.revisions.get_record_stream(
+                [('C-id',)], 'unordered', True))
+        # Make sure the presence of the missing data in a fallback does not
+        # avoid the error.
+        repo.add_fallback_repository(b.repository)
+        self.assertRaises(errors.BzrCheckError, repo.commit_write_group)
+        reopened_repo = self.reopen_repo_and_resume_write_group(repo)
+        self.assertRaises(
+            errors.BzrCheckError, reopened_repo.commit_write_group)
+        reopened_repo.abort_write_group()
+
+    def test_missing_chk_leaf_for_inventory(self):
+        """commit_write_group fails with BzrCheckError when the chk root record
+        for a parent inventory of a new revision is missing.
+        """
         repo = self.make_repository('damaged-repo')
+        if not repo._format.supports_chks:
+            raise TestNotApplicable('requires repository with chk_bytes')
+        builder = self.make_branch_builder('simple-branch')
+        # add and modify files with very long file-ids, so that the chk map
+        # will need more than just a root node.
+        file_adds = []
+        file_modifies = []
+        for char in 'abc':
+            name = char * 10000
+            file_adds.append(
+                ('add', ('file-' + name, 'file-%s-id' % name, 'file',
+                         'content %s\n' % name)))
+            file_modifies.append(
+                ('modify', ('file-%s-id' % name, 'new content %s\n' % name)))
+        builder.build_snapshot('A-id', None, [
+            ('add', ('', 'root-id', 'directory', None))] +
+            file_adds)
+        builder.build_snapshot('B-id', None, [])
+        builder.build_snapshot('C-id', None, file_modifies)
+        b = builder.get_branch()
+        src_repo = b.repository
+        src_repo.lock_read()
+        self.addCleanup(src_repo.unlock)
+        # Now, manually insert objects for a stacked repo with only revision
+        # C-id, *except* drop the non-root chk records.
+        inv_b = src_repo.get_inventory('B-id')
+        inv_c = src_repo.get_inventory('C-id')
+        chk_root_keys_only = [
+            inv_b.id_to_entry.key(), inv_b.parent_id_basename_to_file_id.key(),
+            inv_c.id_to_entry.key(), inv_c.parent_id_basename_to_file_id.key()]
+        all_chks = src_repo.chk_bytes.keys()
+        # Pick a non-root key to drop
+        key_to_drop = all_chks.difference(chk_root_keys_only).pop()
+        all_chks.discard(key_to_drop)
         repo.lock_write()
         repo.start_write_group()
-        src_repo = b.repository
+        repo.chk_bytes.insert_record_stream(
+            src_repo.chk_bytes.get_record_stream(
+                all_chks, 'unordered', True))
+        repo.texts.insert_record_stream(
+            src_repo.texts.get_record_stream(
+                src_repo.texts.keys(), 'unordered', True))
         repo.inventories.insert_record_stream(
             src_repo.inventories.get_record_stream(
                 [('B-id',), ('C-id',)], 'unordered', True))
@@ -456,16 +518,10 @@
         """commit_write_group fails with BzrCheckError when the chk root record
         for a parent inventory of a new revision is missing.
         """
-        builder = self.make_branch_builder('simple-branch')
-        builder.build_snapshot('A-id', None, [
-            ('add', ('', 'root-id', 'directory', None)),
-            ('add', ('file', 'file-id', 'file', 'content\n'))])
-        builder.build_snapshot('B-id', None, [])
-        builder.build_snapshot('C-id', None, [
-            ('modify', ('file-id', 'new-content'))])
-        b = builder.get_branch()
-        if not b.repository._format.supports_chks:
+        repo = self.make_repository('damaged-repo')
+        if not repo._format.supports_chks:
             raise TestNotApplicable('requires repository with chk_bytes')
+        b = self.make_branch_with_multiple_chk_nodes()
         b.lock_read()
         self.addCleanup(b.unlock)
         # Now, manually insert objects for a stacked repo with only revision
@@ -476,7 +532,6 @@
         inv_c = b.repository.get_inventory('C-id')
         chk_keys_for_c_only = [
             inv_c.id_to_entry.key(), inv_c.parent_id_basename_to_file_id.key()]
-        repo = self.make_repository('damaged-repo')
         repo.lock_write()
         repo.start_write_group()
         src_repo = b.repository
@@ -498,6 +553,63 @@
             errors.BzrCheckError, reopened_repo.commit_write_group)
         reopened_repo.abort_write_group()
 
+    def make_branch_with_multiple_chk_nodes(self):
+        # add and modify files with very long file-ids, so that the chk map
+        # will need more than just a root node.
+        builder = self.make_branch_builder('simple-branch')
+        file_adds = []
+        file_modifies = []
+        for char in 'abc':
+            name = char * 10000
+            file_adds.append(
+                ('add', ('file-' + name, 'file-%s-id' % name, 'file',
+                         'content %s\n' % name)))
+            file_modifies.append(
+                ('modify', ('file-%s-id' % name, 'new content %s\n' % name)))
+        builder.build_snapshot('A-id', None, [
+            ('add', ('', 'root-id', 'directory', None))] +
+            file_adds)
+        builder.build_snapshot('B-id', None, [])
+        builder.build_snapshot('C-id', None, file_modifies)
+        return builder.get_branch()
+
+    def test_missing_text_record(self):
+        """commit_write_group fails with BzrCheckError when a text is missing.
+        """
+        repo = self.make_repository('damaged-repo')
+        if not repo._format.supports_chks:
+            raise TestNotApplicable('requires repository with chk_bytes')
+        b = self.make_branch_with_multiple_chk_nodes()
+        src_repo = b.repository
+        src_repo.lock_read()
+        self.addCleanup(src_repo.unlock)
+        # Now, manually insert objects for a stacked repo with only revision
+        # C-id, *except* drop one changed text.
+        all_texts = src_repo.texts.keys()
+        all_texts.remove(('file-%s-id' % ('c'*10000,), 'C-id'))
+        repo.lock_write()
+        repo.start_write_group()
+        repo.chk_bytes.insert_record_stream(
+            src_repo.chk_bytes.get_record_stream(
+                src_repo.chk_bytes.keys(), 'unordered', True))
+        repo.texts.insert_record_stream(
+            src_repo.texts.get_record_stream(
+                all_texts, 'unordered', True))
+        repo.inventories.insert_record_stream(
+            src_repo.inventories.get_record_stream(
+                [('B-id',), ('C-id',)], 'unordered', True))
+        repo.revisions.insert_record_stream(
+            src_repo.revisions.get_record_stream(
+                [('C-id',)], 'unordered', True))
+        # Make sure the presence of the missing data in a fallback does not
+        # avoid the error.
+        repo.add_fallback_repository(b.repository)
+        self.assertRaises(errors.BzrCheckError, repo.commit_write_group)
+        reopened_repo = self.reopen_repo_and_resume_write_group(repo)
+        self.assertRaises(
+            errors.BzrCheckError, reopened_repo.commit_write_group)
+        reopened_repo.abort_write_group()
+
 
 class TestResumeableWriteGroup(TestCaseWithRepository):
 
