Merge lp:~lifeless/bzr/check-1 into lp:~bzr/bzr/trunk-old

Proposed by Robert Collins
Status: Merged
Merged at revision: not available
Proposed branch: lp:~lifeless/bzr/check-1
Merge into: lp:~bzr/bzr/trunk-old
Diff against target: None lines
To merge this branch: bzr merge lp:~lifeless/bzr/check-1
Reviewer          Review Type   Date Requested   Status
Vincent Ladeuil                                  Approve
bzr-core                                         Pending
Review via email: mp+7489@code.launchpad.net
Revision history for this message
Robert Collins (lifeless) wrote :

This gives check a single-pass progress bar, makes it take 1/2 the
time on 1.9 flavour formats, and makes no change on CHK formats. More work
to come, but this seems unqualifiedly better to me.

(NB: tiny projects show this code as slower, but testing on things the
size of bzr or larger gives the better timings I'm seeing).

--

Ian Clatworthy (ian-clatworthy) wrote :

Robert Collins wrote:
> Robert Collins has proposed merging lp:~lifeless/bzr/check-1 into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
>
> This gives check a single-pass progress bar, makes it take 1/2 the
> time on 1.9 flavour formats, and makes no change on CHK formats. More work
> to come, but this seems unqualifiedly better to me.
>
>
Thanks for working on check. This command will get quite a workout from
lots of users as they migrate to 2a so I think getting it performing
well and reliably is quite a high priority.

This is a pretty big patch and I'm finding it difficult to review it all
at once. Here's a partial review with numerous clean-ups outside the
guts of your new code. To begin with, I'm happy with the -Dprogress
changes. These are arguably completely independent from check so it
would have been nice to submit them as a separate patch. In any case,
the changes related to that are good to go so you can land them immediately.

> +API Changes
> +***********
> +
> +* ``WorkingTree._check`` now requires a references dict with keys matching
> + those returned by ``WorkingTree._get_check_refs``. (Robert Collins)
> +
>
And more importantly, you've changed the public API Branch.check() to
have a new mandatory parameter and InventoryEntry.check() to not have a
tree parameter. You need to document these as API breaks.

> + def find_lefthand_distances(self, keys):
> + """Find the distance to null for all the keys in keys.
> +
> + :param keys: keys to lookup.
> + :return: A dict key->distance for all of keys.
> + """
> + # Optimisable by concurrent searching, but a random spread should get
> + # some sort of hit rate.
> + result = {}
>

result isn't used in this routine.
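
As background for this routine, the "distance to null" in the docstring can be illustrated in standalone Python. This is a minimal sketch, not bzrlib's implementation: it assumes a plain dict `parent_map` standing in for the repository graph, an acyclic history, and a result of -1 for keys whose lefthand ancestry cannot be followed (ghosts). The function body and the caching strategy are illustrative assumptions.

```python
# Sentinel id for "no revision"; assumed here to mirror bzrlib's NULL_REVISION.
NULL_REVISION = 'null:'


def find_lefthand_distances(parent_map, keys):
    """Return {key: distance to NULL_REVISION along first parents}.

    parent_map is a hypothetical dict of revid -> tuple of parent revids.
    Keys whose lefthand ancestry runs into a missing revision map to -1
    (an assumed convention for ghosts). The history must be acyclic.
    """
    known = {NULL_REVISION: 0}
    result = {}
    for key in keys:
        path = []
        cursor = key
        # Walk first parents until we reach a revision with a known distance.
        while cursor not in known:
            parents = parent_map.get(cursor)
            if parents is None:
                break  # ghost: the walk cannot continue
            path.append(cursor)
            cursor = parents[0] if parents else NULL_REVISION
        if cursor in known:
            base = known[cursor]
            # Cache every revision on the path so later keys can reuse it.
            for offset, revid in enumerate(reversed(path), 1):
                known[revid] = base + offset
            result[key] = known[key]
        else:
            result[key] = -1
    return result


history = {'A': (), 'B': ('A',), 'C': ('B', 'X'), 'D': ('ghost',)}
distances = find_lefthand_distances(history, ['C', 'A', 'D'])
```

Only the first (lefthand) parent is followed, so the merged-in parent `'X'` never affects the distance of `'C'`.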

> + else:
> + # At the moment, check does not extra work over get_record_stream
> + return self.get_record_stream(keys, 'unordered', True)
>

This comment is confusing and needs rewording.

> + for revid, revision in revisions_iterator:
> + if revision is None:
> + pass
> + parent_map = vf.get_parent_map([(revid,)])
>
"pass" does nothing here so I'm assuming it's wrong. "continue" maybe?
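
The difference between the two statements can be shown with a small standalone loop. This is a hedged sketch only: what the patch intends to do with missing revisions is not visible in the quoted hunk, so skipping them is an assumption, and `present_parent_maps` and the `get_parent_map` callable are hypothetical names used for illustration.

```python
def present_parent_maps(revisions_iterator, get_parent_map):
    """Yield (revid, parent_map) for revisions that are actually present.

    revisions_iterator yields (revid, revision-or-None) pairs.
    get_parent_map is a hypothetical callable taking a list of key tuples.
    """
    for revid, revision in revisions_iterator:
        if revision is None:
            # With 'pass' here, execution would fall through and the lookup
            # below would still run for the missing revision.
            continue
        yield revid, get_parent_map([(revid,)])


calls = []


def fake_get_parent_map(keys):
    # Record each lookup so the skipped revision is observable.
    calls.append(keys)
    return {keys[0]: ()}


revisions = [('a', object()), ('b', None)]
found = [revid for revid, _ in
         present_parent_maps(revisions, fake_get_parent_map)]
```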

> === modified file 'bzrlib/smart/medium.py'
> --- bzrlib/smart/medium.py 2009-06-10 03:56:49 +0000
> +++ bzrlib/smart/medium.py 2009-06-16 05:29:42 +0000
> @@ -37,7 +37,6 @@
> from bzrlib import (
> debug,
> errors,
> - osutils,
> symbol_versioning,
> trace,
> ui,
> @@ -46,7 +45,8 @@
> from bzrlib.smart import client, protocol, request, vfs
> from bzrlib.transport import ssh
> """)
> -
> +#usually already imported, and getting IllegalScoperReplacer on it here.
> +from bzrlib import osutils
>

Was this a transient issue? If not, it looks like a bug completely separate
from the check work, and maybe ought to be submitted as such.

> def check(self, progress_bar=None):
> - """Check this object for integrity."""
> + """Check this object for integrity.
> +
> + :param progress_bar: A progress bar to output as the check progresses.
> + :param keys: Specific keys within the Versione...


Robert Collins (lifeless) wrote :

I've stopped working on check while the current progress is reviewed.

I'll go through and look at the issues you've noted.

With regard to the API changes, I can document Branch.check, as it is
conceivable that people have used it. WT._check is private;
InventoryEntry.check is also private by virtue of requiring a check
object - it can only be used from within check.

-Rob

Vincent Ladeuil (vila) wrote :

Huge patch to review here... for which the main intent is not
obvious.

The overall feeling is that you stopped your effort at an
intermediate point, with enough gains to land. Yet many things
seem unfinished.

If you could add comments as a brain dump before landing, I'm sure
the future maintainer (who may or may not be yourself ;-) will
waste a lot less energy.

Some remarks ask for changes; they are small enough to be
considered tweaks. If you disagree, an explanation will be
welcome.

My main concern here is that you're changing check without adding
test facilities to create broken repo/branch/tree...and while I
understand the desire to not check for every conceivable bug, I'd
feel more confident in the code if some tests were easier to
write...

Other than that, it looks good, so with as many tweaks as you can
take into account:

Review: approve

>>>>> "robert" == Robert Collins <email address hidden> writes:
...
    robert> === modified file 'bzrlib/branch.py'
    robert> --- bzrlib/branch.py 2009-06-10 03:56:49 +0000
    robert> +++ bzrlib/branch.py 2009-06-16 05:29:42 +0000
    robert> @@ -125,6 +125,14 @@
    robert> raise errors.UnstackableRepositoryFormat(self.repository._format,
    robert> self.repository.base)

    robert> + def _get_check_refs(self):
    robert> + """Get the references needed for check().
    robert> +
    robert> + See bzrlib.check.
    robert> + """
    robert> + revid = self.last_revision()
    robert> + return [('revision-existence', revid), ('lefthand-distance', revid)]

'revision-existence' is self-explanatory, but 'lefthand-distance'
is not. Add a comment, if only to point the reader to the right
place to understand it.

I find the way to specify the different checks quite pleasant at
that point :)

I'm less sure about the way the results are stored. AIUI 'refs' (as
used in Check.check()) will be a dict whose keys are (kind, id)
tuples.

Why didn't you use a first-level dict keyed by 'kind', holding a
second dict keyed by 'id', instead?

Is there some reason other than letting the definition and the
accesses use the same key?
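
The two layouts under discussion can be compared side by side. The sketch below is illustrative only: `collect_needed_refs` mirrors the `setdefault(ref, []).append(...)` accumulation visible in the patch, while `nest_by_kind` and `FakeBranch` are hypothetical names invented for this example and are not part of bzrlib.

```python
def collect_needed_refs(objects):
    """Map each (kind, value) ref to the list of objects requesting it."""
    needed_refs = {}
    for obj in objects:
        for ref in obj._get_check_refs():
            needed_refs.setdefault(ref, []).append(obj)
    return needed_refs


def nest_by_kind(flat_refs):
    """Regroup {(kind, value): result} as {kind: {value: result}}."""
    nested = {}
    for (kind, value), result in flat_refs.items():
        nested.setdefault(kind, {})[value] = result
    return nested


class FakeBranch(object):
    """Hypothetical stand-in exposing only _get_check_refs()."""

    def __init__(self, revid):
        self.revid = revid

    def _get_check_refs(self):
        return [('revision-existence', self.revid),
                ('lefthand-distance', self.revid)]


trunk, feature = FakeBranch('rev-2'), FakeBranch('rev-2')
# Both branches share a tip, so each ref is requested by two objects.
needed = collect_needed_refs([trunk, feature])

# Flat layout: the tuple an object returns is the key it later looks up.
flat = {('revision-existence', 'rev-2'): True,
        ('lefthand-distance', 'rev-2'): 2}
# Nested layout: one dict per kind, keyed by revision id.
nested = nest_by_kind(flat)
```

With the flat layout, an object can reuse the exact tuples it returned from `_get_check_refs()` as lookup keys; the nested layout makes it cheaper to iterate over all results of one kind.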

<snip/>

    robert> === modified file 'bzrlib/check.py'
...
    robert> - def check(self):
    robert> + def check(self, callback_refs=None, check_repo=True):
    robert> + if callback_refs is None:
...
    robert> + self.progress.update('check', 0, 4)
    robert> + if self.check_repo:
    robert> + self.progress.update('checking revisions', 0)
    robert> + self.check_revisions()
    robert> + self.progress.update('checking commit contents', 1)
    robert> + self.repository._check_inventories(self)
    robert> + self.progress.update('checking file graphs', 2)
    robert> + # check_weaves is done after the revision scan so that
    robert> + # revision index is known to be valid.
    robert> + self.check_weaves()
    robert> + self.progress.update('checking branches and trees', 3)

The above is clear, but put the following if block in a method by
itse...


Vincent Ladeuil (vila) wrote :

Bah, forgot to sign my mail :-/

review: Approve
Revision history for this message
Andrew Bennetts (spiv) wrote :

Vincent Ladeuil wrote:
[...]
> robert> + def _check_text(self, record, checker, item_data):
> robert> + """Check a single text."""
> robert> + # Check it is extractable.
> robert> + # TODO: check length.
>
> Add '-- lifeless <aaaammjj>' please.

I think you mean yyyymmdd? :)

-Andrew.

Preview Diff

=== modified file 'NEWS'
--- NEWS 2009-06-15 15:20:24 +0000
+++ NEWS 2009-06-16 05:29:42 +0000
@@ -17,6 +17,12 @@
   diverged-branches`` when a push fails because the branches have
   diverged. (Neil Martinsen-Burrell, #269477)
 
+API Changes
+***********
+
+* ``WorkingTree._check`` now requires a references dict with keys matching
+  those returned by ``WorkingTree._get_check_refs``. (Robert Collins)
+
 Internals
 *********
 
@@ -402,6 +408,9 @@
   cause mismatched physical locks to cause test errors rather than just
   reporting to the screen. (Robert Collins)
 
+* -Dprogress will cause pdb to start up if a progress view jumps
+  backwards. (Robert Collins)
+
 * Fallback ``CredentialStore`` instances registered with ``fallback=True``
   are now be able to provide credentials if obtaining credentials
   via ~/.bazaar/authentication.conf fails. (Jelmer Vernooij,
=== modified file 'bzrlib/branch.py'
--- bzrlib/branch.py 2009-06-10 03:56:49 +0000
+++ bzrlib/branch.py 2009-06-16 05:29:42 +0000
@@ -125,6 +125,14 @@
             raise errors.UnstackableRepositoryFormat(self.repository._format,
                 self.repository.base)
 
+    def _get_check_refs(self):
+        """Get the references needed for check().
+
+        See bzrlib.check.
+        """
+        revid = self.last_revision()
+        return [('revision-existence', revid), ('lefthand-distance', revid)]
+
     @staticmethod
     def open(base, _unsupported=False, possible_transports=None):
         """Open the branch rooted at base.
@@ -1132,7 +1140,7 @@
         target._set_all_reference_info(target_reference_dict)
 
     @needs_read_lock
-    def check(self):
+    def check(self, refs):
         """Check consistency of the branch.
 
         In particular this checks that revisions given in the revision-history
@@ -1141,42 +1149,23 @@
 
         Callers will typically also want to check the repository.
 
+        :param refs: Calculated refs for this branch as specified by
+            branch._get_check_refs()
         :return: A BranchCheckResult.
         """
-        ret = BranchCheckResult(self)
-        mainline_parent_id = None
+        result = BranchCheckResult(self)
         last_revno, last_revision_id = self.last_revision_info()
-        real_rev_history = []
-        try:
-            for revid in self.repository.iter_reverse_revision_history(
-                    last_revision_id):
-                real_rev_history.append(revid)
-        except errors.RevisionNotPresent:
-            ret.ghosts_in_mainline = True
-        else:
-            ret.ghosts_in_mainline = False
-        real_rev_history.reverse()
-        if len(real_rev_history) != last_revno:
-            raise errors.BzrCheckError('revno does not match len(mainline)'
-                ' %s != %s' % (last_revno, len(real_rev_history)))
-        # TODO: We should probably also check that real_rev_history actually
-        # matches self.revision_history()
-        for revision_id in real_rev_history:
-            try:
-                revision = self.repository.get_revision(revision_id)
-            except errors.NoSuchRevision, e:
-                raise errors.BzrCheckError("mainline revision {%s} not in repository"
-                    % revision_id)
-            # In general the first entry on the revision history has no parents.
-            # But it's not illegal for it to have parents listed; this can happen
-            # in imports from Arch when the parents weren't reachable.
-            if mainline_parent_id is not None:
-                if mainline_parent_id not in revision.parent_ids:
-                    raise errors.BzrCheckError("previous revision {%s} not listed among "
-                        "parents of {%s}"
-                        % (mainline_parent_id, revision_id))
-            mainline_parent_id = revision_id
-        return ret
+        actual_revno = refs[('lefthand-distance', last_revision_id)]
+        if actual_revno != last_revno:
+            result.errors.append(errors.BzrCheckError(
+                'revno does not match len(mainline) %s != %s' % (
+                last_revno, actual_revno)))
+        # TODO: We should probably also check that self.revision_history
+        # matches the repository for older branch formats.
+        # If looking for the code that cross-checks repository parents against
+        # the iter_reverse_revision_history output, that is now a repository
+        # specific check.
+        return result
 
     def _get_checkout_format(self):
         """Return the most suitable metadir for a checkout of this branch.
@@ -2784,7 +2773,7 @@
 
     def __init__(self, branch):
         self.branch = branch
-        self.ghosts_in_mainline = False
+        self.errors = []
 
     def report_results(self, verbose):
         """Report the check results via trace.note.
2790 """Report the check results via trace.note.2779 """Report the check results via trace.note.
@@ -2792,11 +2781,10 @@
2792 :param verbose: Requests more detailed display of what was checked,2781 :param verbose: Requests more detailed display of what was checked,
2793 if any.2782 if any.
2794 """2783 """
2795 note('checked branch %s format %s',2784 note('checked branch %s format %s', self.branch.base,
2796 self.branch.base,2785 self.branch._format)
2797 self.branch._format)2786 for error in self.errors:
2798 if self.ghosts_in_mainline:2787 note('found error:%s', error)
2799 note('branch contains ghosts in mainline')
28002788
28012789
2802class Converter5to6(object):2790class Converter5to6(object):
28032791
=== modified file 'bzrlib/check.py'
--- bzrlib/check.py 2009-03-23 14:59:43 +0000
+++ bzrlib/check.py 2009-06-16 00:53:41 +0000
@@ -32,6 +32,20 @@
 # raising them. If there's more than one exception it'd be good to see them
 # all.
 
+"""Checking of bzr objects.
+
+check_refs is a concept used for optimising check. Objects that depend on other
+objects (e.g. tree on repository) can list the objects they would be requesting
+so that when the dependent object is checked, matches can be pulled out and
+evaluated in-line rather than re-reading the same data many times.
+check_refs are tuples (kind, value). Currently defined kinds are:
+* 'trees', where value is a revid and the looked up objects are revision trees.
+* 'lefthand-distance', where value is a revid and the looked up objects are the
+  distance along the lefthand path to NULL for that revid.
+* 'revision-existence', where value is a revid, and the result is True or False
+  indicating that the revision was found/not found.
+"""
+
 from bzrlib import errors, osutils
 from bzrlib import repository as _mod_repository
 from bzrlib import revision
@@ -39,6 +53,7 @@
 from bzrlib.bzrdir import BzrDir
 from bzrlib.errors import BzrCheckError
 from bzrlib.repository import Repository
+from bzrlib.revision import NULL_REVISION
 from bzrlib.symbol_versioning import deprecated_function, deprecated_in
 from bzrlib.trace import log_error, note
 import bzrlib.ui
@@ -49,76 +64,145 @@
 
     # The Check object interacts with InventoryEntry.check, etc.
 
-    def __init__(self, repository):
+    def __init__(self, repository, check_repo=True):
         self.repository = repository
-        self.checked_text_cnt = 0
         self.checked_rev_cnt = 0
-        self.ghosts = []
-        self.repeated_text_cnt = 0
+        self.ghosts = set()
         self.missing_parent_links = {}
         self.missing_inventory_sha_cnt = 0
         self.missing_revision_cnt = 0
-        # maps (file-id, version) -> sha1; used by InventoryFile._check
-        self.checked_texts = {}
         self.checked_weaves = set()
         self.unreferenced_versions = set()
         self.inconsistent_parents = []
         self.rich_roots = repository.supports_rich_root()
         self.text_key_references = {}
+        self.check_repo = check_repo
+        self.other_results = []
+        # Plain text lines to include in the report
+        self._report_items = []
+        # Keys we are looking for; may be large and need spilling to disk.
+        # key->(type(revision/inventory/text/signature/map), sha1, first-referer)
+        self.pending_keys = {}
+        # Ancestors map for all of revisions being checked; while large helper
+        # functions we call would create it anyway, so better to have once and
+        # keep.
+        self.ancestors = {}
 
-    def check(self):
+    def check(self, callback_refs=None, check_repo=True):
+        if callback_refs is None:
+            callback_refs = {}
         self.repository.lock_read()
         self.progress = bzrlib.ui.ui_factory.nested_progress_bar()
         try:
-            self.progress.update('retrieving inventory', 0, 2)
-            # do not put in init, as it should be done with progess,
-            # and inside the lock.
-            self.inventory_weave = self.repository.inventories
-            self.progress.update('checking revision graph', 1)
-            self.check_revision_graph()
-            self.plan_revisions()
-            revno = 0
-            while revno < len(self.planned_revisions):
-                rev_id = self.planned_revisions[revno]
-                self.progress.update('checking revision', revno,
-                    len(self.planned_revisions))
-                revno += 1
-                self.check_one_rev(rev_id)
-            # check_weaves is done after the revision scan so that
-            # revision index is known to be valid.
-            self.check_weaves()
+            self.progress.update('check', 0, 4)
+            if self.check_repo:
+                self.progress.update('checking revisions', 0)
+                self.check_revisions()
+                self.progress.update('checking commit contents', 1)
+                self.repository._check_inventories(self)
+                self.progress.update('checking file graphs', 2)
+                # check_weaves is done after the revision scan so that
+                # revision index is known to be valid.
+                self.check_weaves()
+            self.progress.update('checking branches and trees', 3)
+            if callback_refs:
+                repo = self.repository
+                # calculate all refs, and callback the objects requesting them.
+                refs = {}
+                wanting_items = set()
+                # Current crude version calculates everything and calls
+                # everything at once. Doing a queue and popping as things are
+                # satisfied would be cheaper on memory [but few people have
+                # huge numbers of working trees today. TODO: fix before
+                # landing].
+                distances = set()
+                existences = set()
+                for ref, wantlist in callback_refs.iteritems():
+                    wanting_items.update(wantlist)
+                    kind, value = ref
+                    if kind == 'trees':
+                        refs[ref] = repo.revision_tree(value)
+                    elif kind == 'lefthand-distance':
+                        distances.add(value)
+                    elif kind == 'revision-existence':
+                        existences.add(value)
+                    else:
+                        raise AssertionError(
+                            'unknown ref kind for ref %s' % ref)
+                node_distances = repo.get_graph().find_lefthand_distances(distances)
+                for key, distance in node_distances.iteritems():
+                    refs[('lefthand-distance', key)] = distance
+                    if key in existences and distance > 0:
+                        refs[('revision-existence', key)] = True
+                        existences.remove(key)
+                parent_map = repo.get_graph().get_parent_map(existences)
+                for key in parent_map:
+                    refs[('revision-existence', key)] = True
+                    existences.remove(key)
+                for key in existences:
+                    refs[('revision-existence', key)] = False
+                for item in wanting_items:
+                    if isinstance(item, WorkingTree):
+                        item._check(refs)
+                    if isinstance(item, Branch):
+                        self.other_results.append(item.check(refs))
         finally:
             self.progress.finished()
             self.repository.unlock()
 
-    def check_revision_graph(self):
+    def _check_revisions(self, revisions_iterator):
+        """Check revision objects by decorating a generator.
+
+        :param revisions_iterator: An iterator of(revid, Revision-or-None).
+        :return: A generator of the contents of revisions_iterator.
+        """
+        self.planned_revisions = set()
+        for revid, revision in revisions_iterator:
+            yield revid, revision
+            self._check_one_rev(revid, revision)
+        # Flatten the revisions we found to guarantee consistent later
+        # iteration.
+        self.planned_revisions = list(self.planned_revisions)
+        # TODO: extract digital signatures as items to callback on too.
+
+    def check_revisions(self):
+        """Scan revisions, checking data directly available as we go."""
+        revision_iterator = self.repository._iter_revisions(None)
+        revision_iterator = self._check_revisions(revision_iterator)
+        # We read the all revisions here:
+        # - doing this allows later code to depend on the revision index.
+        # - we can fill out existence flags at this point
+        # - we can read the revision inventory sha at this point
+        # - we can check properties and serialisers etc.
         if not self.repository.revision_graph_can_have_wrong_parents():
-            # This check is not necessary.
+            # The check against the index isn't needed.
             self.revs_with_bad_parents_in_index = None
-            return
-        bad_revisions = self.repository._find_inconsistent_revision_parents()
-        self.revs_with_bad_parents_in_index = list(bad_revisions)
-
-    def plan_revisions(self):
-        repository = self.repository
-        self.planned_revisions = repository.all_revision_ids()
-        self.progress.clear()
-        inventoried = set(key[-1] for key in self.inventory_weave.keys())
-        awol = set(self.planned_revisions) - inventoried
-        if len(awol) > 0:
-            raise BzrCheckError('Stored revisions missing from inventory'
-                '{%s}' % ','.join([f for f in awol]))
+            for thing in revision_iterator:
+                pass
+        else:
+            bad_revisions = self.repository._find_inconsistent_revision_parents(
+                revision_iterator)
+            self.revs_with_bad_parents_in_index = list(bad_revisions)
 
     def report_results(self, verbose):
+        if self.check_repo:
+            self._report_repo_results(verbose)
+        for result in self.other_results:
+            result.report_results(verbose)
+
+    def _report_repo_results(self, verbose):
         note('checked repository %s format %s',
              self.repository.bzrdir.root_transport,
              self.repository._format)
         note('%6d revisions', self.checked_rev_cnt)
         note('%6d file-ids', len(self.checked_weaves))
-        note('%6d unique file texts', self.checked_text_cnt)
-        note('%6d repeated file texts', self.repeated_text_cnt)
-        note('%6d unreferenced text versions',
-             len(self.unreferenced_versions))
+        if verbose:
+            note('%6d unreferenced text versions',
+                len(self.unreferenced_versions))
+        if verbose and len(self.unreferenced_versions):
+            for file_id, revision_id in self.unreferenced_versions:
+                log_error('unreferenced version: {%s} in %s', revision_id,
+                    file_id)
         if self.missing_inventory_sha_cnt:
             note('%6d revisions are missing inventory_sha1',
                  self.missing_inventory_sha_cnt)
@@ -138,10 +222,6 @@
                     note(' %s should be in the ancestry for:', link)
                     for linker in linkers:
                         note(' * %s', linker)
-            if verbose:
-                for file_id, revision_id in self.unreferenced_versions:
-                    log_error('unreferenced version: {%s} in %s', revision_id,
-                        file_id)
         if len(self.inconsistent_parents):
             note('%6d inconsistent parents', len(self.inconsistent_parents))
             if verbose:
@@ -161,60 +241,75 @@
                         ' %s has wrong parents in index: '
                         '%r should be %r',
                         revision_id, index_parents, actual_parents)
-
-    def check_one_rev(self, rev_id):
-        """Check one revision.
-
-        rev_id - the one to check
+        for item in self._report_items:
+            note(item)
+
+    def _check_one_rev(self, rev_id, rev):
+        """Cross-check one revision.
+
+        :param rev_id: A revision id to check.
+        :param rev: A revision or None to indicate a missing revision.
         """
-        rev = self.repository.get_revision(rev_id)
-
         if rev.revision_id != rev_id:
-            raise BzrCheckError('wrong internal revision id in revision {%s}'
-                % rev_id)
-
+            self._report_items.append(
+                'Mismatched internal revid {%s} and index revid {%s}' % (
+                rev.revision_id, rev_id))
+            rev_id = rev.revision_id
+        # Check this revision tree etc, and count as seen when we encounter a
+        # reference to it.
+        self.planned_revisions.add(rev_id)
+        # It is not a ghost
+        self.ghosts.discard(rev_id)
+        # Count all parents as ghosts if we haven't seen them yet.
         for parent in rev.parent_ids:
             if not parent in self.planned_revisions:
-                # rev has a parent we didn't know about.
-                missing_links = self.missing_parent_links.get(parent, [])
-                missing_links.append(rev_id)
-                self.missing_parent_links[parent] = missing_links
-                # list based so somewhat slow,
-                # TODO have a planned_revisions list and set.
-                if self.repository.has_revision(parent):
-                    missing_ancestry = self.repository.get_ancestry(parent)
-                    for missing in missing_ancestry:
-                        if (missing is not None
-                            and missing not in self.planned_revisions):
-                            self.planned_revisions.append(missing)
-                else:
-                    self.ghosts.append(rev_id)
-
-        if rev.inventory_sha1:
-            # Loopback - this is currently circular logic as the
-            # knit get_inventory_sha1 call returns rev.inventory_sha1.
-            # Repository.py's get_inventory_sha1 should instead return
-            # inventories.get_record_stream([(revid,)]).next().sha1 or
-            # similar.
-            inv_sha1 = self.repository.get_inventory_sha1(rev_id)
-            if inv_sha1 != rev.inventory_sha1:
-                raise BzrCheckError('Inventory sha1 hash doesn\'t match'
-                    ' value in revision {%s}' % rev_id)
-        self._check_revision_tree(rev_id)
+                self.ghosts.add(parent)
+
+        self.ancestors[rev_id] = tuple(rev.parent_ids) or (NULL_REVISION,)
+        self.add_pending_item(rev_id, ('inventories', rev_id), 'inventory',
+            rev.inventory_sha1)
         self.checked_rev_cnt += 1
 
+    def add_pending_item(self, referer, key, kind, sha1):
+        """Add a reference to a sha1 to be cross checked against a key.
+
+        :param referer: The referer that expects key to have sha1.
+        :param key: A storage key e.g. ('texts', 'foo@bar-20040504-1234')
+        :param kind: revision/inventory/text/map/signature
+        :param sha1: A hex sha1 or None if no sha1 is known.
+        """
+        existing = self.pending_keys.get(key)
+        if existing:
+            if sha1 != existing[1]:
+                self._report_items.append('Multiple expected sha1s for %s. {%s}'
+                    ' expects {%s}, {%s} expects {%s}', (
+                    key, referer, sha1, existing[1], existing[0]))
+        else:
+            self.pending_keys[key] = (kind, sha1, referer)
+
     def check_weaves(self):
         """Check all the weaves we can get our hands on.
         """
         weave_ids = []
-        self.progress.update('checking inventory', 0, 2)
-        self.inventory_weave.check(progress_bar=self.progress)
-        self.progress.update('checking text storage', 1, 2)
-        self.repository.texts.check(progress_bar=self.progress)
-        weave_checker = self.repository._get_versioned_file_checker(
-            text_key_references=self.text_key_references)
+        storebar = bzrlib.ui.ui_factory.nested_progress_bar()
+        try:
+            self._check_weaves(storebar)
+        finally:
+            storebar.finished()
+
+    def _check_weaves(self, storebar):
+        storebar.update('text-index', 0, 2)
+        if self.repository._format.fast_deltas:
+            # We haven't considered every fileid instance so far.
+            weave_checker = self.repository._get_versioned_file_checker(
+                ancestors=self.ancestors)
+        else:
+            weave_checker = self.repository._get_versioned_file_checker(
+                text_key_references=self.text_key_references,
+                ancestors=self.ancestors)
+        storebar.update('file-graph', 1)
         result = weave_checker.check_file_version_parents(
-            self.repository.texts, progress_bar=self.progress)
+            self.repository.texts)
         self.checked_weaves = weave_checker.file_ids
         bad_parents, unused_versions = result
         bad_parents = bad_parents.items()
@@ -228,28 +323,8 @@
                 (revision_id, weave_id, weave_parents, correct_parents))
         self.unreferenced_versions.update(unused_versions)
 
-    def _check_revision_tree(self, rev_id):
-        tree = self.repository.revision_tree(rev_id)
-        inv = tree.inventory
-        seen_ids = set()
-        seen_names = set()
-        for path, ie in inv.iter_entries():
-            self._add_entry_to_text_key_references(inv, ie)
-            file_id = ie.file_id
-            if file_id in seen_ids:
-                raise BzrCheckError('duplicated file_id {%s} '
-                    'in inventory for revision {%s}'
-                    % (file_id, rev_id))
-            seen_ids.add(file_id)
-            ie.check(self, rev_id, inv, tree)
-            if path in seen_names:
-                raise BzrCheckError('duplicated path %s '
-                    'in inventory for revision {%s}'
-                    % (path, rev_id))
-            seen_names.add(path)
-
     def _add_entry_to_text_key_references(self, inv, entry):
-        if not self.rich_roots and entry == inv.root:
+        if not self.rich_roots and entry.name == '':
             return
         key = (entry.file_id, entry.revision)
         self.text_key_references.setdefault(key, False)
@@ -263,13 +338,14 @@
 
     Results are reported through logging.
 
-    Deprecated in 1.6. Please use check_branch instead.
+    Deprecated in 1.6. Please use check_dwim instead.
 
     :raise BzrCheckError: if there's a consistency error.
     """
     check_branch(branch, verbose)
 
 
+@deprecated_function(deprecated_in((1,16,0)))
 def check_branch(branch, verbose):
     """Run consistency checks on a branch.
 
@@ -279,56 +355,108 @@
279 """355 """
280 branch.lock_read()356 branch.lock_read()
281 try:357 try:
282 branch_result = branch.check()358 needed_refs = {}
359 for ref in branch._get_check_refs():
360 needed_refs.setdefault(ref, []).append(branch)
361 result = branch.repository.check([branch.last_revision()], needed_refs)
362 branch_result = result.other_results[0]
283 finally:363 finally:
284 branch.unlock()364 branch.unlock()
285 branch_result.report_results(verbose)365 branch_result.report_results(verbose)
286366
287367
368def scan_branch(branch, needed_refs, to_unlock):
369 """Scan a branch for refs.
370
371 :param branch: The branch to schedule for checking.
372 :param needed_refs: Refs we are accumulating.
373 :param to_unlock: The unlock list accumulating.
374 """
375 note("Checking branch at '%s'." % (branch.base,))
376 branch.lock_read()
377 to_unlock.append(branch)
378 branch_refs = branch._get_check_refs()
379 for ref in branch_refs:
380 reflist = needed_refs.setdefault(ref, [])
381 reflist.append(branch)
382
383
384def scan_tree(base_tree, tree, needed_refs, to_unlock):
385 """Scan a tree for refs.
386
387 :param base_tree: The original tree check opened, used to detect duplicate
388 tree checks.
389 :param tree: The tree to schedule for checking.
390 :param needed_refs: Refs we are accumulating.
391 :param to_unlock: The unlock list accumulating.
392 """
393 if base_tree is not None and tree.basedir == base_tree.basedir:
394 return
395 note("Checking working tree at '%s'." % (tree.basedir,))
396 tree.lock_read()
397 to_unlock.append(tree)
398 tree_refs = tree._get_check_refs()
399 for ref in tree_refs:
400 reflist = needed_refs.setdefault(ref, [])
401 reflist.append(tree)
402
403
288def check_dwim(path, verbose, do_branch=False, do_repo=False, do_tree=False):404def check_dwim(path, verbose, do_branch=False, do_repo=False, do_tree=False):
289 try:405 try:
290 tree, branch, repo, relpath = \406 base_tree, branch, repo, relpath = \
291 BzrDir.open_containing_tree_branch_or_repository(path)407 BzrDir.open_containing_tree_branch_or_repository(path)
292 except errors.NotBranchError:408 except errors.NotBranchError:
293 tree = branch = repo = None409 base_tree = branch = repo = None
294410
295 if do_tree:411 to_unlock = []
296 if tree is not None:412 needed_refs = {}
297 note("Checking working tree at '%s'."413 try:
298 % (tree.bzrdir.root_transport.base,))414 if base_tree is not None:
299 tree._check()415 # If the tree is a lightweight checkout we won't see it in
300 else:416 # repo.find_branches - add now.
301 log_error("No working tree found at specified location.")417 if do_tree:
302418 scan_tree(None, base_tree, needed_refs, to_unlock)
303 if branch is not None:419 branch = base_tree.branch
304 # We have a branch420 if branch is not None:
305 if repo is None:421 # We have a branch
306 # The branch is in a shared repository422 if repo is None:
307 repo = branch.repository423 # The branch is in a shared repository
308 branches = [branch]424 repo = branch.repository
309 elif repo is not None:425 if repo is not None:
310 branches = repo.find_branches(using=True)426 repo.lock_read()
311427 to_unlock.append(repo)
312 if repo is not None:428 branches = repo.find_branches(using=True)
313 repo.lock_read()429 saw_tree = False
314 try:430 if do_branch or do_tree:
315 if do_repo:431 for branch in branches:
316 note("Checking repository at '%s'."432 if do_tree:
317 % (repo.bzrdir.root_transport.base,))433 try:
318 result = repo.check()434 tree = branch.bzrdir.open_workingtree()
435 saw_tree = True
436 except (errors.NotLocalUrl, errors.NoWorkingTree):
437 pass
438 else:
439 scan_tree(base_tree, tree, needed_refs, to_unlock)
440 if do_branch:
441 scan_branch(branch, needed_refs, to_unlock)
442 if do_branch and not branches:
443 log_error("No branch found at specified location.")
444 if do_tree and base_tree is None and not saw_tree:
445 log_error("No working tree found at specified location.")
446 if do_repo or do_branch or do_tree:
447 if do_repo:
448 note("Checking repository at '%s'."
449 % (repo.bzrdir.root_transport.base,))
450 result = repo.check(None, callback_refs=needed_refs,
451 check_repo=do_repo)
319 result.report_results(verbose)452 result.report_results(verbose)
453 else:
454 if do_tree:
455 log_error("No working tree found at specified location.")
320 if do_branch:456 if do_branch:
321 if branches == []:457 log_error("No branch found at specified location.")
322 log_error("No branch found at specified location.")458 if do_repo:
323 else:459 log_error("No repository found at specified location.")
324 for branch in branches:460 finally:
325 note("Checking branch at '%s'."461 for thing in to_unlock:
326 % (branch.bzrdir.root_transport.base,))462 thing.unlock()
327 check_branch(branch, verbose)
328 finally:
329 repo.unlock()
330 else:
331 if do_branch:
332 log_error("No branch found at specified location.")
333 if do_repo:
334 log_error("No repository found at specified location.")
335463
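
The accumulation pattern shared by scan_branch and scan_tree above can be sketched in isolation. This is a minimal sketch that assumes nothing about bzrlib itself: `FakeLockable`, `scan` and `collect_refs` are hypothetical names, the ref tuples are made up for illustration, and only the `lock_read`/`unlock`/`_get_check_refs` method names come from the diff. Each scanned object is read-locked, remembered for later unlocking, and appended to the list of objects waiting on each ref, so a ref wanted by several objects only needs resolving once.

```python
class FakeLockable(object):
    """Hypothetical stand-in for a branch or working tree."""

    def __init__(self, name, refs):
        self.name = name
        self.refs = refs
        self.locked = False

    def lock_read(self):
        self.locked = True

    def unlock(self):
        self.locked = False

    def _get_check_refs(self):
        return self.refs


def scan(thing, needed_refs, to_unlock):
    """Lock thing and record which refs it wants resolved."""
    thing.lock_read()
    to_unlock.append(thing)
    for ref in thing._get_check_refs():
        needed_refs.setdefault(ref, []).append(thing)


def collect_refs(things):
    """Return ({ref: [names wanting it]}, lock states seen while scanning)."""
    to_unlock = []
    needed_refs = {}
    try:
        for thing in things:
            scan(thing, needed_refs, to_unlock)
        locked_during_scan = [thing.locked for thing in things]
        summary = dict((ref, [t.name for t in wanting])
                       for ref, wanting in needed_refs.items())
    finally:
        # Mirrors the finally block in check_dwim: everything that was
        # locked is unlocked, even if scanning raised part-way through.
        for thing in to_unlock:
            thing.unlock()
    return summary, locked_during_scan
```

Keeping a single `to_unlock` list means the caller needs only one `try`/`finally`, however many branches and trees end up being scanned.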
=== modified file 'bzrlib/graph.py'
--- bzrlib/graph.py 2009-06-10 03:56:49 +0000
+++ bzrlib/graph.py 2009-06-16 05:29:42 +0000
@@ -311,6 +311,27 @@
311 # get there.311 # get there.
312 return known_revnos[cur_tip] + num_steps312 return known_revnos[cur_tip] + num_steps
313313
314 def find_lefthand_distances(self, keys):
315 """Find the distance to null for all the keys in keys.
316
317 :param keys: keys to lookup.
318 :return: A dict key->distance for all of keys.
319 """
320 # Optimisable by concurrent searching, but a random spread should get
321 # some sort of hit rate.
322 result = {}
323 known_revnos = []
324 ghosts = []
325 for key in keys:
326 try:
327 known_revnos.append(
328 (key, self.find_distance_to_null(key, known_revnos)))
329 except errors.GhostRevisionsHaveNoRevno:
330 ghosts.append(key)
331 for key in ghosts:
332 known_revnos.append((key, -1))
333 return dict(known_revnos)
334
314 def find_unique_ancestors(self, unique_revision, common_revisions):335 def find_unique_ancestors(self, unique_revision, common_revisions):
315 """Find the unique ancestors for a revision versus others.336 """Find the unique ancestors for a revision versus others.
316337
317338
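
To illustrate what find_lefthand_distances returns, here is a minimal sketch that walks a plain dict instead of a real Graph. `lefthand_distances`, `NULL` and the dict-based `parent_map` are assumptions made for this example; the real method delegates to find_distance_to_null and reuses already-known revnos. The convention taken from the diff is that a key whose left-hand ancestry cannot be followed (a ghost) is reported as -1.

```python
NULL = 'null:'  # assumed sentinel for "no revision" in this sketch


def lefthand_distances(parent_map, keys):
    """Map each key to its left-hand distance from NULL, or -1 for ghosts.

    parent_map maps a revision id to a tuple of parent ids; an id missing
    from parent_map is treated as a ghost.
    """
    result = {}
    for key in keys:
        distance = 0
        current = key
        while current != NULL:
            if current not in parent_map:
                # The key itself, or a left-hand ancestor, is a ghost.
                distance = -1
                break
            parents = parent_map[current]
            distance += 1
            # Only the first (left-hand) parent is followed.
            current = parents[0] if parents else NULL
        result[key] = distance
    return result
```

With `{'A': (), 'B': ('A',), 'C': ('B', 'X')}`, revision `C` is three steps from null because only the first parent of each revision is followed; the missing `X` on the right-hand side does not matter.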
=== modified file 'bzrlib/groupcompress.py'
--- bzrlib/groupcompress.py 2009-06-10 03:56:49 +0000
+++ bzrlib/groupcompress.py 2009-06-16 05:29:42 +0000
@@ -1049,11 +1049,14 @@
1049 reannotate(parent_lines, lines, key, None, head_cache))1049 reannotate(parent_lines, lines, key, None, head_cache))
1050 return parent_cache[key]1050 return parent_cache[key]
10511051
1052 def check(self, progress_bar=None):1052 def check(self, progress_bar=None, keys=None):
1053 """See VersionedFiles.check()."""1053 """See VersionedFiles.check()."""
1054 keys = self.keys()1054 if keys is None:
1055 for record in self.get_record_stream(keys, 'unordered', True):1055 keys = self.keys()
1056 record.get_bytes_as('fulltext')1056 for record in self.get_record_stream(keys, 'unordered', True):
1057 record.get_bytes_as('fulltext')
1058 else:
1059 return self.get_record_stream(keys, 'unordered', True)
10571060
1058 def _check_add(self, key, lines, random_id, check_content):1061 def _check_add(self, key, lines, random_id, check_content):
1059 """check that version_id and lines are safe to add."""1062 """check that version_id and lines are safe to add."""
10601063
=== modified file 'bzrlib/help_topics/en/debug-flags.txt'
--- bzrlib/help_topics/en/debug-flags.txt 2009-03-18 23:43:51 +0000
+++ bzrlib/help_topics/en/debug-flags.txt 2009-05-12 06:30:56 +0000
@@ -19,6 +19,7 @@
19-Dindex Trace major index operations.19-Dindex Trace major index operations.
20-Dknit Trace knit operations.20-Dknit Trace knit operations.
21-Dlock Trace when lockdir locks are taken or released.21-Dlock Trace when lockdir locks are taken or released.
22-Dprogress Trace progress bar operations.
22-Dmerge Emit information for debugging merges.23-Dmerge Emit information for debugging merges.
23-Dpack Emit information about pack operations.24-Dpack Emit information about pack operations.
24-Dsftp Trace SFTP internals.25-Dsftp Trace SFTP internals.
2526
=== modified file 'bzrlib/inventory.py'
--- bzrlib/inventory.py 2009-06-10 03:56:49 +0000
+++ bzrlib/inventory.py 2009-06-16 05:29:42 +0000
@@ -262,7 +262,7 @@
262 def versionable_kind(kind):262 def versionable_kind(kind):
263 return (kind in ('file', 'directory', 'symlink', 'tree-reference'))263 return (kind in ('file', 'directory', 'symlink', 'tree-reference'))
264264
265 def check(self, checker, rev_id, inv, tree):265 def check(self, checker, rev_id, inv):
266 """Check this inventory entry is intact.266 """Check this inventory entry is intact.
267267
268 This is a template method, override _check for kind specific268 This is a template method, override _check for kind specific
@@ -274,18 +274,18 @@
274 :param rev_id: Revision id from which this InventoryEntry was loaded.274 :param rev_id: Revision id from which this InventoryEntry was loaded.
275 Not necessarily the last-changed revision for this file.275 Not necessarily the last-changed revision for this file.
276 :param inv: Inventory from which the entry was loaded.276 :param inv: Inventory from which the entry was loaded.
277 :param tree: RevisionTree for this entry.
278 """277 """
279 if self.parent_id is not None:278 if self.parent_id is not None:
280 if not inv.has_id(self.parent_id):279 if not inv.has_id(self.parent_id):
281 raise BzrCheckError('missing parent {%s} in inventory for revision {%s}'280 raise BzrCheckError('missing parent {%s} in inventory for revision {%s}'
282 % (self.parent_id, rev_id))281 % (self.parent_id, rev_id))
283 self._check(checker, rev_id, tree)282 checker._add_entry_to_text_key_references(inv, self)
283 self._check(checker, rev_id)
284284
285 def _check(self, checker, rev_id, tree):285 def _check(self, checker, rev_id):
286 """Check this inventory entry for kind specific errors."""286 """Check this inventory entry for kind specific errors."""
287 raise BzrCheckError('unknown entry kind %r in revision {%s}' %287 checker._report_items.append(
288 (self.kind, rev_id))288 'unknown entry kind %r in revision {%s}' % (self.kind, rev_id))
289289
290 def copy(self):290 def copy(self):
291 """Clone this inventory entry."""291 """Clone this inventory entry."""
@@ -404,7 +404,7 @@
404 'text_id', 'parent_id', 'children', 'executable',404 'text_id', 'parent_id', 'children', 'executable',
405 'revision', 'symlink_target', 'reference_revision']405 'revision', 'symlink_target', 'reference_revision']
406406
407 def _check(self, checker, rev_id, tree):407 def _check(self, checker, rev_id):
408 """See InventoryEntry._check"""408 """See InventoryEntry._check"""
409409
410 def __init__(self, file_id):410 def __init__(self, file_id):
@@ -433,11 +433,16 @@
433 'text_id', 'parent_id', 'children', 'executable',433 'text_id', 'parent_id', 'children', 'executable',
434 'revision', 'symlink_target', 'reference_revision']434 'revision', 'symlink_target', 'reference_revision']
435435
436 def _check(self, checker, rev_id, tree):436 def _check(self, checker, rev_id):
437 """See InventoryEntry._check"""437 """See InventoryEntry._check"""
438 if self.text_sha1 is not None or self.text_size is not None or self.text_id is not None:438 if (self.text_sha1 is not None or self.text_size is not None or
439 raise BzrCheckError('directory {%s} has text in revision {%s}'439 self.text_id is not None):
440 checker._report_items.append('directory {%s} has text in revision {%s}'
440 % (self.file_id, rev_id))441 % (self.file_id, rev_id))
442 # Directories are stored as ''.
443 checker.add_pending_item(rev_id,
444 ('texts', self.file_id, self.revision), 'text',
445 'da39a3ee5e6b4b0d3255bfef95601890afd80709')
441446
442 def copy(self):447 def copy(self):
443 other = InventoryDirectory(self.file_id, self.name, self.parent_id)448 other = InventoryDirectory(self.file_id, self.name, self.parent_id)
@@ -476,27 +481,16 @@
476 'text_id', 'parent_id', 'children', 'executable',481 'text_id', 'parent_id', 'children', 'executable',
477 'revision', 'symlink_target', 'reference_revision']482 'revision', 'symlink_target', 'reference_revision']
478483
479 def _check(self, checker, tree_revision_id, tree):484 def _check(self, checker, tree_revision_id):
480 """See InventoryEntry._check"""485 """See InventoryEntry._check"""
481 key = (self.file_id, self.revision)486 # TODO: check size too.
482 if key in checker.checked_texts:487 checker.add_pending_item(tree_revision_id,
483 prev_sha = checker.checked_texts[key]488 ('texts', self.file_id, self.revision), 'text',
484 if prev_sha != self.text_sha1:489 self.text_sha1)
485 raise BzrCheckError(490 if self.text_size is None:
486 'mismatched sha1 on {%s} in {%s} (%s != %s) %r' %491 checker._report_items.append(
487 (self.file_id, tree_revision_id, prev_sha, self.text_sha1,492 'fileid {%s} in {%s} has None for text_size' % (self.file_id,
488 t))493 tree_revision_id))
489 else:
490 checker.repeated_text_cnt += 1
491 return
492
493 checker.checked_text_cnt += 1
494 # We can't check the length, because Weave doesn't store that
495 # information, and the whole point of looking at the weave's
496 # sha1sum is that we don't have to extract the text.
497 if (self.text_sha1 != tree._repository.texts.get_sha1s([key])[key]):
498 raise BzrCheckError('text {%s} version {%s} wrong sha1' % key)
499 checker.checked_texts[key] = self.text_sha1
500494
501 def copy(self):495 def copy(self):
502 other = InventoryFile(self.file_id, self.name, self.parent_id)496 other = InventoryFile(self.file_id, self.name, self.parent_id)
@@ -600,14 +594,20 @@
600 'text_id', 'parent_id', 'children', 'executable',594 'text_id', 'parent_id', 'children', 'executable',
601 'revision', 'symlink_target', 'reference_revision']595 'revision', 'symlink_target', 'reference_revision']
602596
603 def _check(self, checker, rev_id, tree):597 def _check(self, checker, rev_id):
604 """See InventoryEntry._check"""598 """See InventoryEntry._check"""
605 if self.text_sha1 is not None or self.text_size is not None or self.text_id is not None:599 if self.text_sha1 is not None or self.text_size is not None or self.text_id is not None:
606 raise BzrCheckError('symlink {%s} has text in revision {%s}'600 checker._report_items.append(
601 'symlink {%s} has text in revision {%s}'
607 % (self.file_id, rev_id))602 % (self.file_id, rev_id))
608 if self.symlink_target is None:603 if self.symlink_target is None:
609 raise BzrCheckError('symlink {%s} has no target in revision {%s}'604 checker._report_items.append(
605 'symlink {%s} has no target in revision {%s}'
610 % (self.file_id, rev_id))606 % (self.file_id, rev_id))
607 # Symlinks are stored as ''
608 checker.add_pending_item(rev_id,
609 ('texts', self.file_id, self.revision), 'text',
610 'da39a3ee5e6b4b0d3255bfef95601890afd80709')
611611
612 def copy(self):612 def copy(self):
613 other = InventoryLink(self.file_id, self.name, self.parent_id)613 other = InventoryLink(self.file_id, self.name, self.parent_id)
614614
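
The literal 'da39a3ee5e6b4b0d3255bfef95601890afd80709' used for directories and symlinks above is the SHA-1 of the empty string, which matches the "stored as ''" comments. A short sketch to confirm it, using only the standard library (`EMPTY_TEXT_SHA1` is a name invented for this example, not something the patch defines):

```python
import hashlib

# SHA-1 of zero bytes: the value expected for any text stored as ''.
EMPTY_TEXT_SHA1 = hashlib.sha1(b'').hexdigest()

# Any non-empty text hashes to something else, so a directory or symlink
# whose stored text is not empty would not match the expected value.
NON_EMPTY_SHA1 = hashlib.sha1(b'not empty\n').hexdigest()
```

A named constant along these lines might be easier to read than repeating the hex digest in both `_check` methods, though that is a style suggestion rather than something the patch does.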
=== modified file 'bzrlib/knit.py'
--- bzrlib/knit.py 2009-06-10 03:56:49 +0000
+++ bzrlib/knit.py 2009-06-16 05:29:42 +0000
@@ -1005,8 +1005,15 @@
1005 """See VersionedFiles.annotate."""1005 """See VersionedFiles.annotate."""
1006 return self._factory.annotate(self, key)1006 return self._factory.annotate(self, key)
10071007
1008 def check(self, progress_bar=None):1008 def check(self, progress_bar=None, keys=None):
1009 """See VersionedFiles.check()."""1009 """See VersionedFiles.check()."""
1010 if keys is None:
1011 return self._logical_check()
1012 else:
1013 # At the moment, check does no extra work over get_record_stream
1014 return self.get_record_stream(keys, 'unordered', True)
1015
1016 def _logical_check(self):
1010 # This doesn't actually test extraction of everything, but that will1017 # This doesn't actually test extraction of everything, but that will
1011 # impact 'bzr check' substantially, and needs to be integrated with1018 # impact 'bzr check' substantially, and needs to be integrated with
1012 # care. However, it does check for the obvious problem of a delta with1019 # care. However, it does check for the obvious problem of a delta with
10131020
=== modified file 'bzrlib/remote.py'
--- bzrlib/remote.py 2009-06-11 09:11:21 +0000
+++ bzrlib/remote.py 2009-06-16 05:29:42 +0000
@@ -1414,9 +1414,10 @@
1414 return self._real_repository.get_revision_reconcile(revision_id)1414 return self._real_repository.get_revision_reconcile(revision_id)
14151415
1416 @needs_read_lock1416 @needs_read_lock
1417 def check(self, revision_ids=None):1417 def check(self, revision_ids=None, callback_refs=None):
1418 self._ensure_real()1418 self._ensure_real()
1419 return self._real_repository.check(revision_ids=revision_ids)1419 return self._real_repository.check(revision_ids=revision_ids,
1420 callback_refs=callback_refs)
14201421
1421 def copy_content_into(self, destination, revision_id=None):1422 def copy_content_into(self, destination, revision_id=None):
1422 self._ensure_real()1423 self._ensure_real()
14231424
=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- bzrlib/repofmt/groupcompress_repo.py 2009-06-12 01:11:00 +0000
+++ bzrlib/repofmt/groupcompress_repo.py 2009-06-16 05:29:42 +0000
@@ -718,6 +718,10 @@
718 finally:718 finally:
719 basis_tree.unlock()719 basis_tree.unlock()
720720
721 def deserialise_inventory(self, revision_id, bytes):
722 return inventory.CHKInventory.deserialise(self.chk_bytes, bytes,
723 (revision_id,))
724
721 def _iter_inventories(self, revision_ids):725 def _iter_inventories(self, revision_ids):
722 """Iterate over many inventory objects."""726 """Iterate over many inventory objects."""
723 keys = [(revision_id,) for revision_id in revision_ids]727 keys = [(revision_id,) for revision_id in revision_ids]
724728
=== modified file 'bzrlib/repofmt/knitrepo.py'
--- bzrlib/repofmt/knitrepo.py 2009-06-10 03:56:49 +0000
+++ bzrlib/repofmt/knitrepo.py 2009-06-16 05:29:42 +0000
@@ -229,24 +229,29 @@
229 def _make_parents_provider(self):229 def _make_parents_provider(self):
230 return _KnitsParentsProvider(self.revisions)230 return _KnitsParentsProvider(self.revisions)
231231
232 def _find_inconsistent_revision_parents(self):232 def _find_inconsistent_revision_parents(self, revisions_iterator=None):
233 """Find revisions with different parent lists in the revision object233 """Find revisions with different parent lists in the revision object
234 and in the index graph.234 and in the index graph.
235235
236 :param revisions_iterator: None, or an iterator of (revid,
237 Revision-or-None). This iterator controls the revisions checked.
236 :returns: an iterator yielding tuples of (revision-id, parents-in-index,238 :returns: an iterator yielding tuples of (revision-id, parents-in-index,
237 parents-in-revision).239 parents-in-revision).
238 """240 """
239 if not self.is_locked():241 if not self.is_locked():
240 raise AssertionError()242 raise AssertionError()
241 vf = self.revisions243 vf = self.revisions
242 for index_version in vf.keys():244 if revisions_iterator is None:
243 parent_map = vf.get_parent_map([index_version])245 revisions_iterator = self._iter_revisions(None)
246 for revid, revision in revisions_iterator:
247 if revision is None:
248 continue
249 parent_map = vf.get_parent_map([(revid,)])
244 parents_according_to_index = tuple(parent[-1] for parent in250 parents_according_to_index = tuple(parent[-1] for parent in
245 parent_map[index_version])251 parent_map[(revid,)])
246 revision = self.get_revision(index_version[-1])
247 parents_according_to_revision = tuple(revision.parent_ids)252 parents_according_to_revision = tuple(revision.parent_ids)
248 if parents_according_to_index != parents_according_to_revision:253 if parents_according_to_index != parents_according_to_revision:
249 yield (index_version[-1], parents_according_to_index,254 yield (revid, parents_according_to_index,
250 parents_according_to_revision)255 parents_according_to_revision)
251256
252 def _check_for_inconsistent_revision_parents(self):257 def _check_for_inconsistent_revision_parents(self):
253258
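
The comparison performed by _find_inconsistent_revision_parents can be sketched without a repository. This is an illustration under stated assumptions: `find_inconsistent_parents` is a hypothetical helper, the index is modelled as a plain dict, each revision is reduced to its parent list, and absent revisions (given as None) are assumed to be skipped rather than reported.

```python
def find_inconsistent_parents(index_parents, revisions):
    """Yield (revid, parents-in-index, parents-in-revision) on mismatch.

    index_parents: dict of revid -> sequence of parent ids from the index.
    revisions: iterable of (revid, parent_ids or None); None means the
        revision text is absent, in which case there is nothing to compare.
    """
    for revid, rev_parents in revisions:
        if rev_parents is None:
            continue
        from_index = tuple(index_parents[revid])
        from_revision = tuple(rev_parents)
        if from_index != from_revision:
            yield (revid, from_index, from_revision)
```

Passing the revisions in as an iterator, as the new `revisions_iterator` parameter does, lets a caller that is already streaming every revision reuse that pass instead of reading them a second time.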
=== modified file 'bzrlib/repofmt/pack_repo.py'
--- bzrlib/repofmt/pack_repo.py 2009-06-10 03:56:49 +0000
+++ bzrlib/repofmt/pack_repo.py 2009-06-16 05:29:42 +0000
@@ -2219,52 +2219,6 @@
2219 self.revisions._index._key_dependencies.refs.clear()2219 self.revisions._index._key_dependencies.refs.clear()
2220 self._pack_collection._abort_write_group()2220 self._pack_collection._abort_write_group()
22212221
2222 def _find_inconsistent_revision_parents(self):
2223 """Find revisions with incorrectly cached parents.
2224
2225 :returns: an iterator yielding tuples of (revision-id, parents-in-index,
2226 parents-in-revision).
2227 """
2228 if not self.is_locked():
2229 raise errors.ObjectNotLocked(self)
2230 pb = ui.ui_factory.nested_progress_bar()
2231 result = []
2232 try:
2233 revision_nodes = self._pack_collection.revision_index \
2234 .combined_index.iter_all_entries()
2235 index_positions = []
2236 # Get the cached index values for all revisions, and also the
2237 # location in each index of the revision text so we can perform
2238 # linear IO.
2239 for index, key, value, refs in revision_nodes:
2240 node = (index, key, value, refs)
2241 index_memo = self.revisions._index._node_to_position(node)
2242 if index_memo[0] != index:
2243 raise AssertionError('%r != %r' % (index_memo[0], index))
2244 index_positions.append((index_memo, key[0],
2245 tuple(parent[0] for parent in refs[0])))
2246 pb.update("Reading revision index", 0, 0)
2247 index_positions.sort()
2248 batch_size = 1000
2249 pb.update("Checking cached revision graph", 0,
2250 len(index_positions))
2251 for offset in xrange(0, len(index_positions), 1000):
2252 pb.update("Checking cached revision graph", offset)
2253 to_query = index_positions[offset:offset + batch_size]
2254 if not to_query:
2255 break
2256 rev_ids = [item[1] for item in to_query]
2257 revs = self.get_revisions(rev_ids)
2258 for revision, item in zip(revs, to_query):
2259 index_parents = item[2]
2260 rev_parents = tuple(revision.parent_ids)
2261 if index_parents != rev_parents:
2262 result.append((revision.revision_id, index_parents,
2263 rev_parents))
2264 finally:
2265 pb.finished()
2266 return result
2267
2268 def _make_parents_provider(self):2222 def _make_parents_provider(self):
2269 return graph.CachingParentsProvider(self)2223 return graph.CachingParentsProvider(self)
22702224
22712225
=== modified file 'bzrlib/repository.py'
--- bzrlib/repository.py 2009-06-12 01:11:00 +0000
+++ bzrlib/repository.py 2009-06-16 05:29:42 +0000
@@ -1154,6 +1154,119 @@
1154 # The old API returned a list, should this actually be a set?1154 # The old API returned a list, should this actually be a set?
1155 return parent_map.keys()1155 return parent_map.keys()
11561156
1157 def _check_inventories(self, checker):
1158 """Check the inventories found from the revision scan.
1159
1160 This is responsible for verifying the sha1 of inventories and
1161 creating a pending_keys set that covers data referenced by inventories.
1162 """
1163 bar = ui.ui_factory.nested_progress_bar()
1164 try:
1165 self._do_check_inventories(checker, bar)
1166 finally:
1167 bar.finished()
1168
1169 def _do_check_inventories(self, checker, bar):
1170 """Helper for _check_inventories."""
1171 revno = 0
1172 keys = {'chk_bytes':set(), 'inventories':set(), 'texts':set()}
1173 kinds = ['chk_bytes', 'texts']
1174 count = len(checker.pending_keys)
1175 bar.update("inventories", 0, 2)
1176 current_keys = checker.pending_keys
1177 checker.pending_keys = {}
1178 # Accumulate current checks.
1179 for key in current_keys:
1180 if key[0] != 'inventories' and key[0] not in kinds:
1181 checker._report_items.append('unknown key type %r' % (key,))
1182 keys[key[0]].add(key[1:])
1183 if keys['inventories']:
1184 # NB: output order *should* be roughly sorted - topo or
1185 # inverse topo depending on repository - either way decent
1186 # to just delta against. However, pre-CHK formats didn't
1187 # try to optimise inventory layout on disk. As such the
1188 # pre-CHK code path does not use inventory deltas.
1189 last_object = None
1190 for record in self.inventories.check(keys=keys['inventories']):
1191 if record.storage_kind == 'absent':
1192 checker._report_items.append(
1193 'Missing inventory {%s}' % (record.key,))
1194 else:
1195 last_object = self._check_record('inventories', record,
1196 checker, last_object,
1197 current_keys[('inventories',) + record.key])
1198 del keys['inventories']
1199 else:
1200 return
1201 bar.update("texts", 1)
1202 while (checker.pending_keys or keys['chk_bytes']
1203 or keys['texts']):
1204 # Something to check.
1205 current_keys = checker.pending_keys
1206 checker.pending_keys = {}
1207 # Accumulate current checks.
1208 for key in current_keys:
1209 if key[0] not in kinds:
1210 checker._report_items.append('unknown key type %r' % (key,))
1211 keys[key[0]].add(key[1:])
1212 # Check the outermost kind only - inventories || chk_bytes || texts
1213 for kind in kinds:
1214 if keys[kind]:
1215 last_object = None
1216 for record in getattr(self, kind).check(keys=keys[kind]):
1217 if record.storage_kind == 'absent':
1218 checker._report_items.append(
1219 'Missing %s {%s}' % (kind, record.key))
1220 else:
1221 last_object = self._check_record(kind, record,
1222 checker, last_object, current_keys[(kind,) + record.key])
1223 keys[kind] = set()
1224 break
1225
1226 def _check_record(self, kind, record, checker, last_object, item_data):
1227 """Check a single text from this repository."""
1228 if kind == 'inventories':
1229 rev_id = record.key[0]
1230 inv = self.deserialise_inventory(rev_id,
1231 record.get_bytes_as('fulltext'))
1232 if last_object is not None:
1233 delta = inv._make_delta(last_object)
1234 for old_path, path, file_id, ie in delta:
1235 if ie is None:
1236 continue
1237 ie.check(checker, rev_id, inv)
1238 else:
1239 for path, ie in inv.iter_entries():
1240 ie.check(checker, rev_id, inv)
1241 if self._format.fast_deltas:
1242 return inv
1243 elif kind == 'chk_bytes':
1244 # No code written to check chk_bytes for this repo format.
1245 checker._report_items.append(
1246 'unsupported key type chk_bytes for %s' % (record.key,))
1247 elif kind == 'texts':
1248 self._check_text(record, checker, item_data)
1249 else:
1250 checker._report_items.append(
1251 'unknown key type %s for %s' % (kind, record.key))
1252
1253 def _check_text(self, record, checker, item_data):
1254 """Check a single text."""
1255 # Check it is extractable.
1256 # TODO: check length.
1257 if record.storage_kind == 'chunked':
1258 chunks = record.get_bytes_as(record.storage_kind)
1259 sha1 = osutils.sha_strings(chunks)
1260 length = sum(map(len, chunks))
1261 else:
1262 content = record.get_bytes_as('fulltext')
1263 sha1 = osutils.sha_string(content)
1264 length = len(content)
1265 if item_data and sha1 != item_data[1]:
1266 checker._report_items.append(
1267 'sha1 mismatch: %s has sha1 %s expected %s referenced by %s' %
1268 (record.key, sha1, item_data[1], item_data[2]))
1269
1157 @staticmethod1270 @staticmethod
1158 def create(a_bzrdir):1271 def create(a_bzrdir):
1159 """Construct the current default format repository in a_bzrdir."""1272 """Construct the current default format repository in a_bzrdir."""
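
The text check in _check_text above hashes either a list of chunks or a single fulltext and compares the result with the sha1 recorded by the referring inventory entry. A minimal sketch of that comparison follows; `sha_strings` here is a local stand-in written with hashlib rather than the bzrlib.osutils function, and `check_text` with its parameters is hypothetical.

```python
import hashlib


def sha_strings(chunks):
    """Return the hex SHA-1 of the concatenation of chunks."""
    digest = hashlib.sha1()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def check_text(key, chunks, expected_sha1, referer):
    """Return (length, problems) for one text given as a list of chunks."""
    problems = []
    sha1 = sha_strings(chunks)
    # The length is computed but not yet compared, as in the patch (see
    # its "TODO: check length").
    length = sum(map(len, chunks))
    if expected_sha1 is not None and sha1 != expected_sha1:
        problems.append(
            'sha1 mismatch: %s has sha1 %s expected %s referenced by %s'
            % (key, sha1, expected_sha1, referer))
    return length, problems
```

Hashing chunk by chunk gives the same digest as hashing the joined text, so the chunked branch avoids building one large string just to verify it.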
@@ -1700,25 +1813,49 @@
17001813
1701 @needs_read_lock1814 @needs_read_lock
1702 def get_revisions(self, revision_ids):1815 def get_revisions(self, revision_ids):
1703 """Get many revisions at once."""1816 """Get many revisions at once.
1817
1818 Repositories that need to check data on every revision read should
1819 override this method.
1820 """
1704 return self._get_revisions(revision_ids)1821 return self._get_revisions(revision_ids)
17051822
1706 @needs_read_lock1823 @needs_read_lock
1707 def _get_revisions(self, revision_ids):1824 def _get_revisions(self, revision_ids):
1708 """Core work logic to get many revisions without sanity checks."""1825 """Core work logic to get many revisions without sanity checks."""
1709 for rev_id in revision_ids:1826 revs = {}
1710 if not rev_id or not isinstance(rev_id, basestring):1827 for revid, rev in self._iter_revisions(revision_ids):
1711 raise errors.InvalidRevisionId(revision_id=rev_id, branch=self)1828 if rev is None:
1829 raise errors.NoSuchRevision(self, revid)
1830 revs[revid] = rev
1831 return [revs[revid] for revid in revision_ids]
1832
1833 def _iter_revisions(self, revision_ids):
1834 """Iterate over revision objects.
1835
1836 :param revision_ids: An iterable of revisions to examine. None may be
1837 passed to request all revisions known to the repository. Note that
1838 not all repositories can find unreferenced revisions; for those
1839 repositories only referenced ones will be returned.
1840 :return: An iterator of (revid, revision) tuples. Absent revisions (
1841 those asked for but not available) are returned as (revid, None).
1842 """
1843 if revision_ids is None:
1844 revision_ids = self.all_revision_ids()
1845 else:
1846 for rev_id in revision_ids:
1847 if not rev_id or not isinstance(rev_id, basestring):
1848 raise errors.InvalidRevisionId(revision_id=rev_id, branch=self)
1712 keys = [(key,) for key in revision_ids]1849 keys = [(key,) for key in revision_ids]
1713 stream = self.revisions.get_record_stream(keys, 'unordered', True)1850 stream = self.revisions.get_record_stream(keys, 'unordered', True)
1714 revs = {}
1715 for record in stream:1851 for record in stream:
1852 revid = record.key[0]
1716 if record.storage_kind == 'absent':1853 if record.storage_kind == 'absent':
1717 raise errors.NoSuchRevision(self, record.key[0])1854 yield (revid, None)
1718 text = record.get_bytes_as('fulltext')1855 else:
1719 rev = self._serializer.read_revision_from_string(text)1856 text = record.get_bytes_as('fulltext')
1720 revs[record.key[0]] = rev1857 rev = self._serializer.read_revision_from_string(text)
1721 return [revs[revid] for revid in revision_ids]1858 yield (revid, rev)
17221859
1723 @needs_read_lock1860 @needs_read_lock
1724 def get_revision_xml(self, revision_id):1861 def get_revision_xml(self, revision_id):
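
The split between _iter_revisions and _get_revisions above separates "stream what exists, flag what is absent" from "fail on anything absent". A minimal sketch of that division of labour, using a dict as a hypothetical revision store and KeyError in place of errors.NoSuchRevision:

```python
def iter_revisions(store, revision_ids):
    """Yield (revid, revision) pairs; absent revisions come back as None."""
    for revid in revision_ids:
        yield (revid, store.get(revid))


def get_revisions(store, revision_ids):
    """Return revisions in the order requested, raising if any is absent."""
    revs = {}
    for revid, rev in iter_revisions(store, revision_ids):
        if rev is None:
            raise KeyError(revid)
        revs[revid] = rev
    # The stream may arrive in any order, so re-order by the request.
    return [revs[revid] for revid in revision_ids]
```

A checker can consume the iterator directly and report absent revisions as problems, while ordinary callers keep the strict behaviour through the wrapper.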
@@ -2056,8 +2193,7 @@
2056 batch_size]2193 batch_size]
2057 if not to_query:2194 if not to_query:
2058 break2195 break
2059 for rev_tree in self.revision_trees(to_query):2196 for revision_id in to_query:
2060 revision_id = rev_tree.get_revision_id()
2061 parent_ids = ancestors[revision_id]2197 parent_ids = ancestors[revision_id]
2062 for text_key in revision_keys[revision_id]:2198 for text_key in revision_keys[revision_id]:
2063 pb.update("Calculating text parents", processed_texts)2199 pb.update("Calculating text parents", processed_texts)
@@ -2417,7 +2553,8 @@
2417 [parents_provider, other_repository._make_parents_provider()])2553 [parents_provider, other_repository._make_parents_provider()])
2418 return graph.Graph(parents_provider)2554 return graph.Graph(parents_provider)
24192555
2420 def _get_versioned_file_checker(self, text_key_references=None):2556 def _get_versioned_file_checker(self, text_key_references=None,
2557 ancestors=None):
2421 """Return an object suitable for checking versioned files.2558 """Return an object suitable for checking versioned files.
2422 2559
2423 :param text_key_references: if non-None, an already built2560 :param text_key_references: if non-None, an already built
@@ -2425,9 +2562,12 @@
2425 to whether they were referred to by the inventory of the2562 to whether they were referred to by the inventory of the
2426 revision_id that they contain. If None, this will be2563 revision_id that they contain. If None, this will be
2427 calculated.2564 calculated.
2565 :param ancestors: Optional result from
2566 self.get_graph().get_parent_map(self.all_revision_ids()) if already
2567 available.
2428 """2568 """
2429 return _VersionedFileChecker(self,2569 return _VersionedFileChecker(self,
2430 text_key_references=text_key_references)2570 text_key_references=text_key_references, ancestors=ancestors)
24312571
2432 def revision_ids_to_search_result(self, result_set):2572 def revision_ids_to_search_result(self, result_set):
2433 """Convert a set of revision ids to a graph SearchResult."""2573 """Convert a set of revision ids to a graph SearchResult."""
@@ -2483,19 +2623,25 @@
2483 return record.get_bytes_as('fulltext')2623 return record.get_bytes_as('fulltext')
24842624
2485 @needs_read_lock2625 @needs_read_lock
2486 def check(self, revision_ids=None):2626 def check(self, revision_ids=None, callback_refs=None, check_repo=True):
2487 """Check consistency of all history of given revision_ids.2627 """Check consistency of all history of given revision_ids.
24882628
2489 Different repository implementations should override _check().2629 Different repository implementations should override _check().
24902630
2491 :param revision_ids: A non-empty list of revision_ids whose ancestry2631 :param revision_ids: A non-empty list of revision_ids whose ancestry
2492 will be checked. Typically the last revision_id of a branch.2632 will be checked. Typically the last revision_id of a branch.
2633 :param callback_refs: A dict of check-refs to resolve and callback
2634 the check/_check method on the items listed as wanting the ref.
2635 See bzrlib.check.
2636 :param check_repo: If False do not check the repository contents, just
2637 calculate the data callback_refs requires and call them back.
2493 """2638 """
2494 return self._check(revision_ids)2639 return self._check(revision_ids, callback_refs=callback_refs,
2640 check_repo=check_repo)
24952641
2496 def _check(self, revision_ids):2642 def _check(self, revision_ids, callback_refs, check_repo):
2497 result = check.Check(self)2643 result = check.Check(self, check_repo=check_repo)
2498 result.check()2644 result.check(callback_refs)
2499 return result2645 return result
25002646
2501 def _warn_if_deprecated(self):2647 def _warn_if_deprecated(self):
@@ -3923,10 +4069,10 @@
39234069
3924class _VersionedFileChecker(object):4070class _VersionedFileChecker(object):
39254071
3926 def __init__(self, repository, text_key_references=None):4072 def __init__(self, repository, text_key_references=None, ancestors=None):
3927 self.repository = repository4073 self.repository = repository
3928 self.text_index = self.repository._generate_text_key_index(4074 self.text_index = self.repository._generate_text_key_index(
3929 text_key_references=text_key_references)4075 text_key_references=text_key_references, ancestors=ancestors)
39304076
3931 def calculate_file_version_parents(self, text_key):4077 def calculate_file_version_parents(self, text_key):
3932 """Calculate the correct parents for a file version according to4078 """Calculate the correct parents for a file version according to
@@ -3950,6 +4096,18 @@
3950 revision_id) tuples for versions that are present in this versioned4096 revision_id) tuples for versions that are present in this versioned
3951 file, but not used by the corresponding inventory.4097 file, but not used by the corresponding inventory.
3952 """4098 """
4099 local_progress = None
4100 if progress_bar is None:
4101 local_progress = ui.ui_factory.nested_progress_bar()
4102 progress_bar = local_progress
4103 try:
4104 return self._check_file_version_parents(texts, progress_bar)
4105 finally:
4106 if local_progress:
4107 local_progress.finished()
4108
4109 def _check_file_version_parents(self, texts, progress_bar):
4110 """See check_file_version_parents."""
3953 wrong_parents = {}4111 wrong_parents = {}
3954 self.file_ids = set([file_id for file_id, _ in4112 self.file_ids = set([file_id for file_id, _ in
3955 self.text_index.iterkeys()])4113 self.text_index.iterkeys()])
@@ -3964,8 +4122,7 @@
3964 text_keys = self.repository.texts.keys()4122 text_keys = self.repository.texts.keys()
3965 unused_keys = frozenset(text_keys) - set(self.text_index)4123 unused_keys = frozenset(text_keys) - set(self.text_index)
3966 for num, key in enumerate(self.text_index.iterkeys()):4124 for num, key in enumerate(self.text_index.iterkeys()):
3967 if progress_bar is not None:4125 progress_bar.update('checking text graph', num, n_versions)
3968 progress_bar.update('checking text graph', num, n_versions)
3969 correct_parents = self.calculate_file_version_parents(key)4126 correct_parents = self.calculate_file_version_parents(key)
3970 try:4127 try:
3971 knit_parents = parent_map[key]4128 knit_parents = parent_map[key]
39724129
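Review note: the check_file_version_parents wrapper above creates a progress bar only when the caller did not pass one, and finishes only the bar it created. A minimal standalone sketch of that pattern follows. RecordingProgress, check_items and progress_factory are hypothetical names used for illustration and are not part of bzrlib.

```python
class RecordingProgress(object):
    """Hypothetical stand-in for a progress bar (not a bzrlib class)."""

    def __init__(self):
        self.updates = []
        self.is_finished = False

    def update(self, message, current=None, total=None):
        self.updates.append((message, current, total))

    def finished(self):
        self.is_finished = True


def check_items(items, progress_bar=None, progress_factory=RecordingProgress):
    """Check items, creating a progress bar only if the caller gave none."""
    local_progress = None
    if progress_bar is None:
        local_progress = progress_factory()
        progress_bar = local_progress
    try:
        return _check_items(items, progress_bar)
    finally:
        # Only finish a bar we created; a caller-supplied bar stays open.
        if local_progress is not None:
            local_progress.finished()


def _check_items(items, progress_bar):
    """Inner worker: may assume progress_bar is always usable."""
    bad = []
    for num, item in enumerate(items):
        progress_bar.update('checking', num, len(items))
        if item is None:
            bad.append(num)
    return bad
```

Splitting the work into a wrapper and an inner worker is what lets the inner loop call progress_bar.update() unconditionally, as the second hunk above does.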
=== modified file 'bzrlib/smart/medium.py'
--- bzrlib/smart/medium.py 2009-06-10 03:56:49 +0000
+++ bzrlib/smart/medium.py 2009-06-16 05:29:42 +0000
@@ -37,7 +37,6 @@
37from bzrlib import (37from bzrlib import (
38 debug,38 debug,
39 errors,39 errors,
40 osutils,
41 symbol_versioning,40 symbol_versioning,
42 trace,41 trace,
43 ui,42 ui,
@@ -46,7 +45,8 @@
46from bzrlib.smart import client, protocol, request, vfs45from bzrlib.smart import client, protocol, request, vfs
47from bzrlib.transport import ssh46from bzrlib.transport import ssh
48""")47""")
4948# usually already imported, and getting IllegalUseOfScopeReplacer on it here.
49from bzrlib import osutils
5050
51# We must not read any more than 64k at a time so we don't risk "no buffer51# We must not read any more than 64k at a time so we don't risk "no buffer
52# space available" errors on some platforms. Windows in particular is likely52# space available" errors on some platforms. Windows in particular is likely
5353
=== modified file 'bzrlib/tests/blackbox/test_check.py'
--- bzrlib/tests/blackbox/test_check.py 2009-03-23 14:59:43 +0000
+++ bzrlib/tests/blackbox/test_check.py 2009-05-12 07:27:33 +0000
@@ -34,23 +34,21 @@
34 tree = self.make_branch_and_tree('.')34 tree = self.make_branch_and_tree('.')
35 tree.commit('hallelujah')35 tree.commit('hallelujah')
36 out, err = self.run_bzr('check')36 out, err = self.run_bzr('check')
37 self.assertContainsRe(err, r"^Checking working tree at '.*'\.\n"37 self.assertContainsRe(err, r"Checking working tree at '.*'\.\n")
38 r"Checking repository at '.*'\.\n"38 self.assertContainsRe(err, r"Checking repository at '.*'\.\n")
39 r"checked repository.*\n"39 self.assertContainsRe(err, r"checked repository.*\n"
40 r" 1 revisions\n"40 r" 1 revisions\n"
41 r" 0 file-ids\n"41 r" 0 file-ids\n"
42 r" 0 unique file texts\n"42 )
43 r" 0 repeated file texts\n"43 self.assertContainsRe(err, r"Checking branch at '.*'\.\n")
44 r" 0 unreferenced text versions\n"44 self.assertContainsRe(err, r"checked branch.*")
45 r"Checking branch at '.*'\.\n"
46 r"checked branch.*\n$")
4745
48 def test_check_branch(self):46 def test_check_branch(self):
49 tree = self.make_branch_and_tree('.')47 tree = self.make_branch_and_tree('.')
50 tree.commit('foo')48 tree.commit('foo')
51 out, err = self.run_bzr('check --branch')49 out, err = self.run_bzr('check --branch')
52 self.assertContainsRe(err, r"^Checking branch at '.*'\.\n"50 self.assertContainsRe(err, r"^Checking branch at '.*'\.\n"
53 r"checked branch.*\n$")51 r"checked branch.*")
5452
55 def test_check_repository(self):53 def test_check_repository(self):
56 tree = self.make_branch_and_tree('.')54 tree = self.make_branch_and_tree('.')
@@ -60,9 +58,7 @@
60 r"checked repository.*\n"58 r"checked repository.*\n"
61 r" 1 revisions\n"59 r" 1 revisions\n"
62 r" 0 file-ids\n"60 r" 0 file-ids\n"
63 r" 0 unique file texts\n"61 )
64 r" 0 repeated file texts\n"
65 r" 0 unreferenced text versions$")
6662
67 def test_check_tree(self):63 def test_check_tree(self):
68 tree = self.make_branch_and_tree('.')64 tree = self.make_branch_and_tree('.')
@@ -76,7 +72,7 @@
76 out, err = self.run_bzr('check --tree --branch')72 out, err = self.run_bzr('check --tree --branch')
77 self.assertContainsRe(err, r"^Checking working tree at '.*'\.\n"73 self.assertContainsRe(err, r"^Checking working tree at '.*'\.\n"
78 r"Checking branch at '.*'\.\n"74 r"Checking branch at '.*'\.\n"
79 r"checked branch.*\n$")75 r"checked branch.*")
8076
81 def test_check_missing_tree(self):77 def test_check_missing_tree(self):
82 branch = self.make_branch('.')78 branch = self.make_branch('.')
@@ -87,9 +83,9 @@
87 branch = self.make_branch('.')83 branch = self.make_branch('.')
88 out, err = self.run_bzr('check --tree --branch')84 out, err = self.run_bzr('check --tree --branch')
89 self.assertContainsRe(err,85 self.assertContainsRe(err,
90 r"^No working tree found at specified location\.\n"
91 r"Checking branch at '.*'\.\n"86 r"Checking branch at '.*'\.\n"
92 r"checked branch.*\n$")87 r"No working tree found at specified location\.\n"
88 r"checked branch.*")
9389
94 def test_check_missing_branch_in_shared_repo(self):90 def test_check_missing_branch_in_shared_repo(self):
95 self.make_repository('shared', shared=True)91 self.make_repository('shared', shared=True)
9692
=== modified file 'bzrlib/tests/branch_implementations/test_check.py'
--- bzrlib/tests/branch_implementations/test_check.py 2009-06-10 03:56:49 +0000
+++ bzrlib/tests/branch_implementations/test_check.py 2009-06-16 05:29:42 +0000
@@ -16,7 +16,9 @@
1616
17"""Tests for branch implementations - test check() functionality"""17"""Tests for branch implementations - test check() functionality"""
1818
19from bzrlib import errors19from StringIO import StringIO
20
21from bzrlib import errors, tests, ui
20from bzrlib.tests.branch_implementations import TestCaseWithBranch22from bzrlib.tests.branch_implementations import TestCaseWithBranch
2123
2224
@@ -54,24 +56,57 @@
54 # with set_last_revision_info56 # with set_last_revision_info
55 tree.branch.set_last_revision_info(3, r5)57 tree.branch.set_last_revision_info(3, r5)
5658
57 e = self.assertRaises(errors.BzrCheckError,59 tree.lock_read()
58 tree.branch.check)60 self.addCleanup(tree.unlock)
59 self.assertEqual('Internal check failed:'61 refs = self.make_refs(tree.branch)
60 ' revno does not match len(mainline) 3 != 5', str(e))62 result = tree.branch.check(refs)
63 ui.ui_factory = tests.TestUIFactory(stdout=StringIO())
64 result.report_results(True)
65 self.assertContainsRe('revno does not match len',
66 ui.ui_factory.stdout.getvalue())
6167
62 def test_check_branch_report_results(self):68 def test_check_branch_report_results(self):
63 """Checking a branch produces results which can be printed"""69 """Checking a branch produces results which can be printed"""
64 branch = self.make_branch('.')70 branch = self.make_branch('.')
65 result = branch.check()71 branch.lock_read()
72 self.addCleanup(branch.unlock)
73 result = branch.check(self.make_refs(branch))
66 # reports results through logging74 # reports results through logging
67 result.report_results(verbose=True)75 result.report_results(verbose=True)
68 result.report_results(verbose=False)76 result.report_results(verbose=False)
6977
70 def test_check_detects_ghosts_in_mainline(self):78 def test__get_check_refs(self):
71 tree = self.make_branch_and_tree('test')79 tree = self.make_branch_and_tree('.')
72 tree.set_parent_ids(['thisisaghost'], allow_leftmost_as_ghost=True)80 revid = tree.commit('foo')
73 r1 = tree.commit('one')81 self.assertEqual(
74 r2 = tree.commit('two')82 set([('revision-existence', revid), ('lefthand-distance', revid)]),
75 result = tree.branch.check()83 set(tree.branch._get_check_refs()))
76 self.assertEquals(True, result.ghosts_in_mainline)
7784
85 def make_refs(self, branch):
86 needed_refs = branch._get_check_refs()
87 refs = {}
88 distances = set()
89 existences = set()
90 for ref in needed_refs:
91 kind, value = ref
92 if kind == 'lefthand-distance':
93 distances.add(value)
94 elif kind == 'revision-existence':
95 existences.add(value)
96 else:
97 raise AssertionError(
98 'unknown ref kind for ref %s' % ref)
99 node_distances = branch.repository.get_graph().find_lefthand_distances(
100 distances)
101 for key, distance in node_distances.iteritems():
102 refs[('lefthand-distance', key)] = distance
103 if key in existences and distance > 0:
104 refs[('revision-existence', key)] = True
105 existences.remove(key)
106 parent_map = branch.repository.get_graph().get_parent_map(existences)
107 for key in parent_map:
108 refs[('revision-existence', key)] = True
109 existences.remove(key)
110 for key in existences:
111 refs[('revision-existence', key)] = False
112 return refs
78113
=== modified file 'bzrlib/tests/per_repository/test_check.py'
--- bzrlib/tests/per_repository/test_check.py 2009-04-09 20:23:07 +0000
+++ bzrlib/tests/per_repository/test_check.py 2009-05-12 05:34:15 +0000
@@ -36,18 +36,14 @@
36 tree = self.make_branch_and_tree('.')36 tree = self.make_branch_and_tree('.')
37 self.build_tree(['foo'])37 self.build_tree(['foo'])
38 tree.smart_add(['.'])38 tree.smart_add(['.'])
39 tree.commit('1')39 revid1 = tree.commit('1')
40 self.build_tree(['bar'])40 self.build_tree(['bar'])
41 tree.smart_add(['.'])41 tree.smart_add(['.'])
42 tree.commit('2')42 revid2 = tree.commit('2')
43 # XXX: check requires a non-empty revision IDs list, but it ignores the43 check_object = tree.branch.repository.check([revid1, revid2])
44 # contents of it!
45 check_object = tree.branch.repository.check(['ignored'])
46 check_object.report_results(verbose=True)44 check_object.report_results(verbose=True)
47 log = self._get_log(keep_log_file=True)45 log = self._get_log(keep_log_file=True)
48 self.assertContainsRe(46 self.assertContainsRe(log, "0 unreferenced text versions")
49 log,
50 "0 unreferenced text versions")
5147
5248
53class TestFindInconsistentRevisionParents(TestCaseWithBrokenRevisionIndex):49class TestFindInconsistentRevisionParents(TestCaseWithBrokenRevisionIndex):
@@ -100,3 +96,32 @@
100 "revision-id has wrong parents in index: "96 "revision-id has wrong parents in index: "
101 r"\('incorrect-parent',\) should be \(\)")97 r"\('incorrect-parent',\) should be \(\)")
10298
99
100class TestCallbacks(TestCaseWithRepository):
101
102 def test_callback_tree_and_branch(self):
103 # use a real tree to get actual refs that will work
104 tree = self.make_branch_and_tree('foo')
105 revid = tree.commit('foo')
106 tree.lock_read()
107 self.addCleanup(tree.unlock)
108 needed_refs = {}
109 for ref in tree._get_check_refs():
110 needed_refs.setdefault(ref, []).append(tree)
111 for ref in tree.branch._get_check_refs():
112 needed_refs.setdefault(ref, []).append(tree.branch)
113 self.tree_check = tree._check
114 self.branch_check = tree.branch.check
115 tree._check = self.tree_callback
116 tree.branch.check = self.branch_callback
117 self.callbacks = []
118 tree.branch.repository.check([revid], callback_refs=needed_refs)
119 self.assertNotEqual([], self.callbacks)
120
121 def tree_callback(self, refs):
122 self.callbacks.append(('tree', refs))
123 return self.tree_check(refs)
124
125 def branch_callback(self, refs):
126 self.callbacks.append(('branch', refs))
127 return self.branch_check(refs)
103128
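Review note: TestCallbacks above builds callback_refs as a dict mapping each ref to the list of objects that want it. A rough sketch of how such a dict could be consumed follows, assuming each ref is resolved once and every interested object is then called back with just the refs it asked for. Recorder, run_callbacks and resolve are hypothetical names; this is not the code in bzrlib.check.

```python
class Recorder(object):
    """Hypothetical stand-in for a tree or branch being checked."""

    def __init__(self, name):
        self.name = name
        self.seen = None

    def check(self, refs):
        # Remember which refs we were handed.
        self.seen = dict(refs)


def run_callbacks(callback_refs, resolve):
    """Resolve every ref once, then call back each interested object.

    callback_refs: dict of ref tuple -> list of objects wanting that ref.
    resolve: callable turning a ref into its value (an assumption here).
    """
    per_object = {}  # id(obj) -> (obj, {ref: value})
    for ref, wanters in callback_refs.items():
        value = resolve(ref)  # one lookup per ref, however many wanters
        for obj in wanters:
            _, refs = per_object.setdefault(id(obj), (obj, {}))
            refs[ref] = value
    for obj, refs in per_object.values():
        obj.check(refs)
```

The point of the shape is that a ref shared by a tree and its branch costs one repository lookup rather than two.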
=== modified file 'bzrlib/tests/test_graph.py'
--- bzrlib/tests/test_graph.py 2009-06-10 03:56:49 +0000
+++ bzrlib/tests/test_graph.py 2009-06-16 05:29:42 +0000
@@ -526,6 +526,19 @@
526 graph = self.make_graph(history_shortcut)526 graph = self.make_graph(history_shortcut)
527 self.assertEqual(set(['rev2b']), graph.find_lca('rev3a', 'rev3b'))527 self.assertEqual(set(['rev2b']), graph.find_lca('rev3a', 'rev3b'))
528528
529 def test_lefthand_distance_smoke(self):
530 """A simple does-it-work test for graph.find_lefthand_distances(keys)."""
531 graph = self.make_graph(history_shortcut)
532 distance_graph = graph.find_lefthand_distances(['rev3b', 'rev2a'])
533 self.assertEqual({'rev2a': 2, 'rev3b': 3}, distance_graph)
534
535 def test_lefthand_distance_ghosts(self):
536 """Lefthand ancestry that reaches a ghost gives a distance of -1."""
537 nodes = {'nonghost':[NULL_REVISION], 'toghost':['ghost']}
538 graph = self.make_graph(nodes)
539 distance_graph = graph.find_lefthand_distances(['nonghost', 'toghost'])
540 self.assertEqual({'nonghost': 1, 'toghost': -1}, distance_graph)
541
529 def test_recursive_unique_lca(self):542 def test_recursive_unique_lca(self):
530 """Test finding a unique least common ancestor.543 """Test finding a unique least common ancestor.
531544
532545
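Review note: the two tests above pin down what find_lefthand_distances returns: the number of steps along first parents to the null revision, or -1 when that walk reaches a ghost. A toy version over a plain parent dict is sketched below. It is an illustration of the expected results only, not the bzrlib.graph implementation, and its treatment of a key that is itself missing from the map (also -1) is an assumption.

```python
NULL_REVISION = 'null:'


def find_lefthand_distances(parent_map, keys):
    """Toy lefthand-distance calculation.

    parent_map: dict of revision id -> list of parent ids (leftmost first).
    Returns a dict of key -> distance to NULL_REVISION along first parents,
    or -1 if the walk leaves the known graph (a ghost).
    """
    result = {}
    for key in keys:
        distance = 0
        current = key
        while current != NULL_REVISION:
            parents = parent_map.get(current)
            if parents is None:
                # Not in the map: a ghost somewhere in the lefthand ancestry.
                distance = -1
                break
            distance += 1
            if not parents:
                # No parents recorded: treat as a root revision.
                break
            current = parents[0]
        result[key] = distance
    return result
```

A real implementation would presumably cache distances for shared ancestry rather than re-walk from every key; this sketch trades that for brevity.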
=== modified file 'bzrlib/tests/test_versionedfile.py'
--- bzrlib/tests/test_versionedfile.py 2009-05-01 18:09:24 +0000
+++ bzrlib/tests/test_versionedfile.py 2009-06-15 00:24:04 +0000
@@ -30,6 +30,7 @@
30 knit as _mod_knit,30 knit as _mod_knit,
31 osutils,31 osutils,
32 progress,32 progress,
33 ui,
33 )34 )
34from bzrlib.errors import (35from bzrlib.errors import (
35 RevisionNotPresent,36 RevisionNotPresent,
@@ -1510,6 +1511,28 @@
1510 self.assertRaises(RevisionNotPresent,1511 self.assertRaises(RevisionNotPresent,
1511 files.annotate, prefix + ('missing-key',))1512 files.annotate, prefix + ('missing-key',))
15121513
1514 def test_check_no_parameters(self):
1515 files = self.get_versionedfiles()
1516
1517 def test_check_progressbar_parameter(self):
1518 """A progress bar can be supplied because check can be a generator."""
1519 pb = ui.ui_factory.nested_progress_bar()
1520 self.addCleanup(pb.finished)
1521 files = self.get_versionedfiles()
1522 files.check(progress_bar=pb)
1523
1524 def test_check_with_keys_becomes_generator(self):
1525 files = self.get_versionedfiles()
1526 self.get_diamond_files(files)
1527 keys = files.keys()
1528 entries = files.check(keys=keys)
1529 seen = set()
1530 # Texts output should be fulltexts.
1531 self.capture_stream(files, entries, seen.add,
1532 files.get_parent_map(keys), require_fulltext=True)
1533 # All texts should be output.
1534 self.assertEqual(set(keys), seen)
1535
1513 def test_construct(self):1536 def test_construct(self):
1514 """Each parameterised test can be constructed on a transport."""1537 """Each parameterised test can be constructed on a transport."""
1515 files = self.get_versionedfiles()1538 files = self.get_versionedfiles()
@@ -1669,7 +1692,8 @@
1669 'knit-delta-closure', 'knit-delta-closure-ref',1692 'knit-delta-closure', 'knit-delta-closure-ref',
1670 'groupcompress-block', 'groupcompress-block-ref'])1693 'groupcompress-block', 'groupcompress-block-ref'])
16711694
1672 def capture_stream(self, f, entries, on_seen, parents):1695 def capture_stream(self, f, entries, on_seen, parents,
1696 require_fulltext=False):
1673 """Capture a stream for testing."""1697 """Capture a stream for testing."""
1674 for factory in entries:1698 for factory in entries:
1675 on_seen(factory.key)1699 on_seen(factory.key)
@@ -1680,6 +1704,8 @@
1680 self.assertEqual(parents[factory.key], factory.parents)1704 self.assertEqual(parents[factory.key], factory.parents)
1681 self.assertIsInstance(factory.get_bytes_as(factory.storage_kind),1705 self.assertIsInstance(factory.get_bytes_as(factory.storage_kind),
1682 str)1706 str)
1707 if require_fulltext:
1708 factory.get_bytes_as('fulltext')
16831709
1684 def test_get_record_stream_interface(self):1710 def test_get_record_stream_interface(self):
1685 """each item in a stream has to provide a regular interface."""1711 """each item in a stream has to provide a regular interface."""
@@ -2547,8 +2573,8 @@
2547 self.assertRaises(NotImplementedError,2573 self.assertRaises(NotImplementedError,
2548 self.texts.add_mpdiffs, [])2574 self.texts.add_mpdiffs, [])
25492575
2550 def test_check(self):2576 def test_check_noerrors(self):
2551 self.assertTrue(self.texts.check())2577 self.texts.check()
25522578
2553 def test_insert_record_stream(self):2579 def test_insert_record_stream(self):
2554 self.assertRaises(NotImplementedError, self.texts.insert_record_stream,2580 self.assertRaises(NotImplementedError, self.texts.insert_record_stream,
25552581
=== modified file 'bzrlib/tests/workingtree_implementations/__init__.py'
--- bzrlib/tests/workingtree_implementations/__init__.py 2009-06-10 03:56:49 +0000
+++ bzrlib/tests/workingtree_implementations/__init__.py 2009-06-16 05:29:42 +0000
@@ -60,45 +60,49 @@
6060
6161
62def load_tests(standard_tests, module, loader):62def load_tests(standard_tests, module, loader):
63 test_names = [
64 'add_reference',
65 'add',
66 'basis_inventory',
67 'basis_tree',
68 'break_lock',
69 'changes_from',
70 'check',
71 'content_filters',
72 'commit',
73 'eol_conversion',
74 'executable',
75 'flush',
76 'get_file_mtime',
77 'get_parent_ids',
78 'inv',
79 'is_control_filename',
80 'is_ignored',
81 'locking',
82 'merge_from_branch',
83 'mkdir',
84 'move',
85 'nested_specifics',
86 'parents',
87 'paths2ids',
88 'pull',
89 'put_file',
90 'readonly',
91 'read_working_inventory',
92 'remove',
93 'rename_one',
94 'revision_tree',
95 'set_root_id',
96 'smart_add',
97 'uncommit',
98 'unversion',
99 'views',
100 'walkdirs',
101 'workingtree',
102 ]
63 test_workingtree_implementations = [103 test_workingtree_implementations = [
64 'bzrlib.tests.workingtree_implementations.test_add_reference',104 'bzrlib.tests.workingtree_implementations.test_' + name for
65 'bzrlib.tests.workingtree_implementations.test_add',105 name in test_names]
66 'bzrlib.tests.workingtree_implementations.test_basis_inventory',
67 'bzrlib.tests.workingtree_implementations.test_basis_tree',
68 'bzrlib.tests.workingtree_implementations.test_break_lock',
69 'bzrlib.tests.workingtree_implementations.test_changes_from',
70 'bzrlib.tests.workingtree_implementations.test_content_filters',
71 'bzrlib.tests.workingtree_implementations.test_commit',
72 'bzrlib.tests.workingtree_implementations.test_eol_conversion',
73 'bzrlib.tests.workingtree_implementations.test_executable',
74 'bzrlib.tests.workingtree_implementations.test_flush',
75 'bzrlib.tests.workingtree_implementations.test_get_file_mtime',
76 'bzrlib.tests.workingtree_implementations.test_get_parent_ids',
77 'bzrlib.tests.workingtree_implementations.test_inv',
78 'bzrlib.tests.workingtree_implementations.test_is_control_filename',
79 'bzrlib.tests.workingtree_implementations.test_is_ignored',
80 'bzrlib.tests.workingtree_implementations.test_locking',
81 'bzrlib.tests.workingtree_implementations.test_merge_from_branch',
82 'bzrlib.tests.workingtree_implementations.test_mkdir',
83 'bzrlib.tests.workingtree_implementations.test_move',
84 'bzrlib.tests.workingtree_implementations.test_nested_specifics',
85 'bzrlib.tests.workingtree_implementations.test_parents',
86 'bzrlib.tests.workingtree_implementations.test_paths2ids',
87 'bzrlib.tests.workingtree_implementations.test_pull',
88 'bzrlib.tests.workingtree_implementations.test_put_file',
89 'bzrlib.tests.workingtree_implementations.test_readonly',
90 'bzrlib.tests.workingtree_implementations.test_read_working_inventory',
91 'bzrlib.tests.workingtree_implementations.test_remove',
92 'bzrlib.tests.workingtree_implementations.test_rename_one',
93 'bzrlib.tests.workingtree_implementations.test_revision_tree',
94 'bzrlib.tests.workingtree_implementations.test_set_root_id',
95 'bzrlib.tests.workingtree_implementations.test_smart_add',
96 'bzrlib.tests.workingtree_implementations.test_uncommit',
97 'bzrlib.tests.workingtree_implementations.test_unversion',
98 'bzrlib.tests.workingtree_implementations.test_views',
99 'bzrlib.tests.workingtree_implementations.test_walkdirs',
100 'bzrlib.tests.workingtree_implementations.test_workingtree',
101 ]
102106
103 scenarios = make_scenarios(107 scenarios = make_scenarios(
104 tests.default_transport,108 tests.default_transport,
105109
=== added file 'bzrlib/tests/workingtree_implementations/test_check.py'
--- bzrlib/tests/workingtree_implementations/test_check.py 1970-01-01 00:00:00 +0000
+++ bzrlib/tests/workingtree_implementations/test_check.py 2009-05-08 02:06:36 +0000
@@ -0,0 +1,55 @@
1# Copyright (C) 2009 Canonical Ltd
2#
3# This program is free software; you can redistribute it and/or modify
4# it under the terms of the GNU General Public License as published by
5# the Free Software Foundation; either version 2 of the License, or
6# (at your option) any later version.
7#
8# This program is distributed in the hope that it will be useful,
9# but WITHOUT ANY WARRANTY; without even the implied warranty of
10# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
11# GNU General Public License for more details.
12#
13# You should have received a copy of the GNU General Public License
14# along with this program; if not, write to the Free Software
15# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
16
17"""Tests for checking of trees."""
18
19from bzrlib import (
20 tests,
21 )
22from bzrlib.tests.workingtree_implementations import TestCaseWithWorkingTree
23
24
25class TestCheck(TestCaseWithWorkingTree):
26
27 def test__get_check_refs_new(self):
28 tree = self.make_branch_and_tree('tree')
29 self.assertEqual(set([('trees', 'null:')]),
30 set(tree._get_check_refs()))
31
32 def test__get_check_refs_basis(self):
33 # with a basis, all current bzr trees cache it and so need the
34 # inventory to cross-check.
35 tree = self.make_branch_and_tree('tree')
36 revid = tree.commit('first post')
37 self.assertEqual(set([('trees', revid)]),
38 set(tree._get_check_refs()))
39
40 def test__check_with_refs(self):
41 # _check can be called with a dict of the things required.
42 tree = self.make_branch_and_tree('tree')
43 tree.lock_write()
44 self.addCleanup(tree.unlock)
45 revid = tree.commit('first post')
46 needed_refs = tree._get_check_refs()
47 repo = tree.branch.repository
48 for ref in needed_refs:
49 kind, revid = ref
50 refs = {}
51 if kind == 'trees':
52 refs[ref] = repo.revision_tree(revid)
53 else:
54 self.fail('unknown ref kind')
55 tree._check(refs)
056
=== modified file 'bzrlib/ui/text.py'
--- bzrlib/ui/text.py 2009-04-10 19:37:20 +0000
+++ bzrlib/ui/text.py 2009-05-12 06:30:56 +0000
@@ -26,6 +26,7 @@
26from bzrlib.lazy_import import lazy_import26from bzrlib.lazy_import import lazy_import
27lazy_import(globals(), """27lazy_import(globals(), """
28from bzrlib import (28from bzrlib import (
29 debug,
29 progress,30 progress,
30 osutils,31 osutils,
31 symbol_versioning,32 symbol_versioning,
@@ -128,6 +129,7 @@
128 self._last_task = None129 self._last_task = None
129 self._total_byte_count = 0130 self._total_byte_count = 0
130 self._bytes_since_update = 0131 self._bytes_since_update = 0
132 self._fraction = 0
131133
132 def _show_line(self, s):134 def _show_line(self, s):
133 n = self._width - 1135 n = self._width - 1
@@ -151,9 +153,14 @@
151 cols = 20153 cols = 20
152 if self._last_task is None:154 if self._last_task is None:
153 completion_fraction = 0155 completion_fraction = 0
156 self._fraction = 0
154 else:157 else:
155 completion_fraction = \158 completion_fraction = \
156 self._last_task._overall_completion_fraction() or 0159 self._last_task._overall_completion_fraction() or 0
160 if (completion_fraction < self._fraction and 'progress' in
161 debug.debug_flags):
162 import pdb;pdb.set_trace()
163 self._fraction = completion_fraction
157 markers = int(round(float(cols) * completion_fraction)) - 1164 markers = int(round(float(cols) * completion_fraction)) - 1
158 bar_str = '[' + ('#' * markers + spin_str).ljust(cols) + '] '165 bar_str = '[' + ('#' * markers + spin_str).ljust(cols) + '] '
159 return bar_str166 return bar_str
160167
=== modified file 'bzrlib/versionedfile.py'
--- bzrlib/versionedfile.py 2009-06-10 03:56:49 +0000
+++ bzrlib/versionedfile.py 2009-06-16 05:29:42 +0000
@@ -876,7 +876,16 @@
876 raise NotImplementedError(self.annotate)876 raise NotImplementedError(self.annotate)
877877
878 def check(self, progress_bar=None):878 def check(self, progress_bar=None):
879 """Check this object for integrity."""879 """Check this object for integrity.
880
881 :param progress_bar: A progress bar to output as the check progresses.
882 :param keys: Specific keys within the VersionedFiles to check. When
883 this parameter is not None, check() becomes a generator as per
884 get_record_stream. The difference to get_record_stream is that
885 more or deeper checks will be performed.
886 :return: None, or if keys was supplied a generator as per
887 get_record_stream.
888 """
880 raise NotImplementedError(self.check)889 raise NotImplementedError(self.check)
881890
882 @staticmethod891 @staticmethod
@@ -1092,10 +1101,15 @@
1092 result.append((prefix + (origin,), line))1101 result.append((prefix + (origin,), line))
1093 return result1102 return result
10941103
1095 def check(self, progress_bar=None):1104 def check(self, progress_bar=None, keys=None):
1096 """See VersionedFiles.check()."""1105 """See VersionedFiles.check()."""
1106 # XXX: This is over-enthusiastic but as we only thunk for Weaves today
1107 # this is tolerable. Ideally we'd pass keys down to check() and
1108 # have the older VersionedFile interface updated too.
1097 for prefix, vf in self._iter_all_components():1109 for prefix, vf in self._iter_all_components():
1098 vf.check()1110 vf.check()
1111 if keys is not None:
1112 return self.get_record_stream(keys, 'unordered', True)
10991113
1100 def get_parent_map(self, keys):1114 def get_parent_map(self, keys):
1101 """Get a map of the parents of keys.1115 """Get a map of the parents of keys.
11021116
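Review note: the new check() contract above returns None by default and a record stream when keys is supplied. A small sketch of that shape on a toy store follows. ToyStore and its methods are hypothetical and are not the VersionedFiles API; the sha1 comparison merely stands in for whatever integrity checks a real implementation performs.

```python
import hashlib


class ToyStore(object):
    """Hypothetical key -> (sha1, text) store used only for illustration."""

    def __init__(self):
        self._data = {}

    def add(self, key, text):
        self._data[key] = (hashlib.sha1(text).hexdigest(), text)

    def _stream(self, keys):
        # Generator yielding (key, fulltext) for the requested keys.
        for key in keys:
            yield key, self._data[key][1]

    def check(self, keys=None):
        """Verify every text; if keys is given, also return a stream."""
        for key, (sha, text) in self._data.items():
            if hashlib.sha1(text).hexdigest() != sha:
                raise ValueError('corrupt text for %r' % (key,))
        if keys is not None:
            return self._stream(keys)
        return None
```

Because check() itself is an ordinary function and only _stream() is a generator, the integrity pass runs eagerly even when the caller never consumes the returned stream. That mirrors the thunk in the hunk above, which checks every component before returning get_record_stream().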
=== modified file 'bzrlib/workingtree.py'
--- bzrlib/workingtree.py 2009-06-15 15:47:45 +0000
+++ bzrlib/workingtree.py 2009-06-16 05:29:42 +0000
@@ -290,6 +290,16 @@
290 self._control_files.break_lock()290 self._control_files.break_lock()
291 self.branch.break_lock()291 self.branch.break_lock()
292292
293 def _get_check_refs(self):
294 """Return the references needed to perform a check of this tree.
295
296 The default implementation returns no refs, and is only suitable for
297 trees that have no local caching and can commit on ghosts at any time.
298
299 :seealso: bzrlib.check for details about check_refs.
300 """
301 return []
302
293 def requires_rich_root(self):303 def requires_rich_root(self):
294 return self._format.requires_rich_root304 return self._format.requires_rich_root
295305
@@ -2515,12 +2525,17 @@
2515 return un_resolved, resolved2525 return un_resolved, resolved
25162526
2517 @needs_read_lock2527 @needs_read_lock
2518 def _check(self):2528 def _check(self, references):
2529 """Check the tree for consistency.
2530
2531 :param references: A dict with keys matching the items returned by
2532 self._get_check_refs(), and values from looking those keys up in
2533 the repository.
2534 """
2519 tree_basis = self.basis_tree()2535 tree_basis = self.basis_tree()
2520 tree_basis.lock_read()2536 tree_basis.lock_read()
2521 try:2537 try:
2522 repo_basis = self.branch.repository.revision_tree(2538 repo_basis = references[('trees', self.last_revision())]
2523 self.last_revision())
2524 if len(list(repo_basis.iter_changes(tree_basis))) > 0:2539 if len(list(repo_basis.iter_changes(tree_basis))) > 0:
2525 raise errors.BzrCheckError(2540 raise errors.BzrCheckError(
2526 "Mismatched basis inventory content.")2541 "Mismatched basis inventory content.")
@@ -2572,6 +2587,10 @@
2572 if self._inventory is None:2587 if self._inventory is None:
2573 self.read_working_inventory()2588 self.read_working_inventory()
25742589
2590 def _get_check_refs(self):
2591 """Return the references needed to perform a check of this tree."""
2592 return [('trees', self.last_revision())]
2593
2575 def lock_tree_write(self):2594 def lock_tree_write(self):
2576 """See WorkingTree.lock_tree_write().2595 """See WorkingTree.lock_tree_write().
25772596
@@ -2634,6 +2653,10 @@
2634 mode=self.bzrdir._get_file_mode())2653 mode=self.bzrdir._get_file_mode())
2635 return True2654 return True
26362655
2656 def _get_check_refs(self):
2657 """Return the references needed to perform a check of this tree."""
2658 return [('trees', self.last_revision())]
2659
2637 @needs_tree_write_lock2660 @needs_tree_write_lock
2638 def set_conflicts(self, conflicts):2661 def set_conflicts(self, conflicts):
2639 self._put_rio('conflicts', conflicts.to_stanzas(),2662 self._put_rio('conflicts', conflicts.to_stanzas(),
26402663
=== added file 'doc/developers/check.txt'
--- doc/developers/check.txt 1970-01-01 00:00:00 +0000
+++ doc/developers/check.txt 2009-05-12 03:50:39 +0000
@@ -0,0 +1,63 @@
+Check Notes
+===========
+
+.. contents:: :local:
+
+Overview
+--------
+
+Check has multiple responsibilities:
+
+* Ensure that the data as recorded on disk is accessible intact and unaltered.
+* Ensure that a branch/repository/tree/whatever is ready for upgrade.
+* Look for and report on recorded-data issues where previous versions of bzr or
+  changing situations have led to some form of inconsistency.
+* Report sufficient information for a user to either fix the issue themselves
+  or report a bug that will hopefully be sufficiently detailed that we can fix
+  it based on the initial report.
+* Not scare users when run if everything is okey-dokey.
+
+Ideally one check invocation can do all these things.
+
+Repository
+----------
+
+Things that can go wrong:
+* Bit errors or alterations may occur in raw data.
+* Data that is referenced may be missing.
+* There could be a lot of garbage in upload etc.
+* File graphs may be inconsistent with inventories and parents.
+* The revision graph cache can be inconsistent with the revision data.
+
+Branch
+------
+
+Things that can go wrong:
+* Tag or tip revision ids may be missing from the repo.
+* The revno tip cache may be wrong.
+* Various URLs could be problematic (not inaccessible, just invalid).
+* Stacked-on branch could be inaccessible.
+
+Tree
+----
+
+Things that can go wrong:
+* Bit errors in dirstate.
+* Corrupt or invalid shelves.
+* Corrupt dirstates written to disk.
+* Cached inventories might not match repository.
+
+Duplicate work
+--------------
+
+If we check every branch in a repo separately we will encounter duplicate
+effort in assessing things like missing tags/tips, revno cache etc.
+
+Outline of approach
+-------------------
+
+To check a repository, we scan for branches, open their trees and generate
+summary data. We then collect all the summary data in as compact a form as
+possible and do a detailed check on the repository, calling back out to
+branches and trees as we encounter the actual data that each tree or branch
+requires to perform its check.
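
Editorial sketch (not part of the patch): the ``_get_check_refs`` /
``_check(references)`` pairing in the working tree hunks, together with the
"Outline of approach" notes, suggests a collect-then-resolve pattern. The
following is a minimal illustration of that pattern only. ``StubRepository``,
``StubTree``, ``CheckError`` and ``check_trees`` are hypothetical stand-ins
invented for this example; they are not bzrlib classes, and the real check
code in this branch may be organised quite differently.

```python
# Illustrative stand-ins only: none of these are bzrlib classes.
class CheckError(Exception):
    """Stand-in for errors.BzrCheckError."""


class StubRepository:
    """Hypothetical repository that counts how often a tree is loaded."""

    def __init__(self, trees):
        self._trees = trees  # maps revision id -> tree content
        self.lookups = 0

    def revision_tree(self, revision_id):
        self.lookups += 1
        return self._trees[revision_id]


class StubTree:
    """Hypothetical tree following the _get_check_refs/_check shape."""

    def __init__(self, last_revision, basis_content):
        self._last_revision = last_revision
        self._basis_content = basis_content

    def last_revision(self):
        return self._last_revision

    def _get_check_refs(self):
        # Same key shape as in the patch: ('trees', revision_id).
        return [('trees', self.last_revision())]

    def _check(self, references):
        # The tree no longer asks the repository itself; it is handed
        # the already-resolved reference by the caller.
        repo_basis = references[('trees', self.last_revision())]
        if repo_basis != self._basis_content:
            raise CheckError("Mismatched basis inventory content.")


def check_trees(repository, trees):
    """Resolve every needed reference once, then check each tree."""
    needed = set()
    for tree in trees:
        needed.update(tree._get_check_refs())
    references = {}
    for kind, value in sorted(needed):
        if kind == 'trees':
            references[(kind, value)] = repository.revision_tree(value)
    for tree in trees:
        tree._check(references)
    return references


repo = StubRepository({'rev-1': {'README': 'hello'}})
trees = [StubTree('rev-1', {'README': 'hello'}),
         StubTree('rev-1', {'README': 'hello'})]
refs = check_trees(repo, trees)
```

Under these assumptions, two trees that share a basis revision cause a single
repository lookup, which is the kind of duplicate work the notes aim to avoid.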