Code review comment for lp:~lifeless/bzr/check-1

Revision history for this message
Ian Clatworthy (ian-clatworthy) wrote :

Robert Collins wrote:
> Robert Collins has proposed merging lp:~lifeless/bzr/check-1 into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
>
> This gives a single-pass progress bar to check, makes it talk 1/2 the
> time on 1.9 flavour formats and nochange on CHK formats. More work to
> come but this seems unqualified better to me.
>
>
Thanks for working on check. This command will get quite a workout from
lots of users as they migrate to 2a so I think getting it performing
well and reliably is quite a high priority.

This is a pretty big patch and I'm finding it difficult to review it all
at once. Here's a partial review with numerous clean-ups outside the
guts of your new code. To begin with, I'm happy with the -Dprogress
changes. These are arguably completely independent from check so it
would have been nice to submit them as a separate patch. In any case,
the changes related to that are good to go so you can land them immediately.

> +API Changes
> +***********
> +
> +* ``WorkingTree._check`` now requires a references dict with keys matching
> + those returned by ``WorkingTree._get_check_refs``. (Robert Collins)
> +
>
And more importantly, you've changed the public API Branch.check() to
have a new mandatory parameter and InventoryEntry.check() to not have a
tree parameter. You need to document these as API breaks.

> + def find_lefthand_distances(self, keys):
> + """Find the distance to null for all the keys in keys.
> +
> + :param keys: keys to lookup.
> + :return: A dict key->distance for all of keys.
> + """
> + # Optimisable by concurrent searching, but a random spread should get
> + # some sort of hit rate.
> + result = {}
>

result isn't used in this routine.

> + else:
> + # At the moment, check does not extra work over get_record_stream
> + return self.get_record_stream(keys, 'unordered', True)
>

This comment is confusing and needs rewording.

> + for revid, revision in revisions_iterator:
> + if revision is None:
> + pass
> + parent_map = vf.get_parent_map([(revid,)])
>
"pass" does nothing here so I'm assuming it's wrong. "continue" maybe?

> === modified file 'bzrlib/smart/medium.py'
> --- bzrlib/smart/medium.py 2009-06-10 03:56:49 +0000
> +++ bzrlib/smart/medium.py 2009-06-16 05:29:42 +0000
> @@ -37,7 +37,6 @@
> from bzrlib import (
> debug,
> errors,
> - osutils,
> symbol_versioning,
> trace,
> ui,
> @@ -46,7 +45,8 @@
> from bzrlib.smart import client, protocol, request, vfs
> from bzrlib.transport import ssh
> """)
> -
> +#usually already imported, and getting IllegalScoperReplacer on it here.
> +from bzrlib import osutils
>

Was this a transient issue? If not, it looks like a completely separate
bug to the check stuff and maybe ought to be submitted as such.
> def check(self, progress_bar=None):
> - """Check this object for integrity."""
> + """Check this object for integrity.
> +
> + :param progress_bar: A progress bar to output as the check progresses.
> + :param keys: Specific keys within the VersionedFiles to check. When
> + this parameter is not None, check() becomes a generator as per
> + get_record_stream. The difference to get_record_stream is that
> + more or deeper checks will be performed.
> + :return: None, or if keys was supplied a generator as per
> + get_record_stream.
> + """
>
This is a really good docstring thanks. You need to add keys to the
parameter list though.

> +Repository
> +----------
> +
> +Things that can go wrong:
> +* Bit errors or alterations may occur in raw data.
> +* Data that is referenced may be missing
> +* There could be a lot of garbage in upload etc.
> +* File graphs may be inconsistent with inventories and parents.
> +* The revision graph cache can be inconsistent with the revision data.
> +
> +Branch
> +------
> +
> +Things that can go wrong:
> +* Tag or tip revision ids may be missing from the repo.
>

ReST doesn't correctly format these lists. You need a blank line above them.

So, beyond the -Dprogress stuff, I'm not sure how best to break this up
to make review easier. I certainly don't want to see you spending much
time doing that but a patch with 24 files changed is stretching my small
brain's working set. If you can see an easy split into two parts - e.g.
land the tests first or land the VF changes first - please consider it.

Ian C.

« Back to merge proposal