Code review comment for lp:~spiv/bzr/smarter-index-search

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Bennetts wrote:
> Andrew Bennetts has proposed merging lp:~spiv/bzr/smarter-index-search into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
>
>
> Optimise index lookups in repositories with many pack files.
>
> First, the headline: this greatly improves "bzr pull" of one new revision of grub from savannah by HTTP (as reported on the mailing list, which has further analysis):
>
> bzr.dev: 2424kB (50.2kB/s r:2395kB w:30kB)
> this patch: 1034kB (43.3kB/s r:1022kB w:12kB)
>
> Given that the pack data transferred is 701266 bytes (which itself seems quite large for such a small change...), that brings the index-searching overhead from 2.42x to 0.45x of the bytes read. It also halves the wall-clock time :)
>
> That repo has I think 14 packs, and bzr.dev tries 11 indices for each of rix, iix, etc before finding the data it needs for that fetch.
>
> There are two parts to this change:
>
> 1) when a CombinedGraphIndex performs a lookup, it shuffles the index or indices that contained the records to the front of self._indices on the assumption that future lookups should try those first.
> 2) propagates that reordering to the other CombinedGraphIndex objects from the same pack collection. This is done by a) associating a name (the pack name) with the elements of CombinedGraphIndex, and b) linking the revisions/inventories/etc CombinedGraphIndex objects belonging to a single pack collection via setting a _sibling_indices attribute on them, c) using those links and names to apply the same reordering to those sibling indices.
>
> I've been pretty conservative with API changes: the new behaviour is only activated by optional keyword arguments, so existing uses of CombinedGraphIndex should see no change of behaviour (including no improvement). This is to make it as easy as possible to backport this change to 2.1 and 2.0 if we choose to.
>
> I think this change needs some tests before it's truly ready to merge, but it's getting to the end of my work week and I think this code is ready for feedback, so here it is!
>

I forgot to vote.

  review: needs_fixing

Because it certainly should have some amount of testing. If only to
prevent future regressions (like we already have :). Though you don't
have to test it at the "get_stream()" level.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkubrdwACgkQJdeBCYSNAAPiOgCcDS74mDZFRHc6TOjROAUcFsjj
0ykAoM7AAlpwfv/uNire7vqtLaBtkohG
=90G5
-----END PGP SIGNATURE-----

review: Needs Fixing

« Back to merge proposal