This adds 'pack-on-the-fly' support for gc streaming.
1) It restores 'groupcompress' sorting for the requested inventories and texts.
2) It uses a heuristic that is approximately:
if a given block is less than 75% of the size of a 'fully utilized' block, then don't re-use the
content directly, but schedule it to be packed into a new block.
The specifics are in '_LazyGroupContentManager.check_is_well_utilized()'.
3) I did some real-world testing, and the results seem pretty good.
To start with, the copy of bzr.dev on Launchpad is currently very poorly packed, taking up >90MB of disk space for a single pack file. After branching that using bzr.dev, I get a 101MB repository locally. If I run 'bzr pack', I end up with 39MB (30MB in .pack, and 8.8MB in indices):
101MB poorly-packed-from-lp
101MB post 'bzr.dev branch new-repo' (takes 1m0s locally)
39MB post 'bzr pack' (takes 2m0s locally)
I then tested the results of using pack-on-the-fly:
41MB post 'bzr-pack branch new-repo' (takes 1m43s locally)
41MB post 'bzr-pack branch new-repo new-repo2' (takes 1m0s)
Which means that pack-on-the-fly is working as we hoped it would. It:
a) Gives pack results almost as good as if we had issued 'bzr pack' (41MB vs 39MB)
b) Takes a bit of extra time when the source is poorly packed (1m0s => 1m43s)
c) Takes no extra time when the source is already properly packed (1m0s => 1m0s)
4) Unfortunately this was built on top of bzr.dev, but we can land it there, and then cherry-pick it back to 2.0. I'll still submit a merge request for 2.0.
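
For reviewers who want the heuristic from point 2 in concrete terms, here is a minimal sketch of the idea. It is not the code in '_LazyGroupContentManager.check_is_well_utilized()', which has more specifics. The names FULL_BLOCK_SIZE, is_well_utilized and plan_stream are hypothetical, and the 4MB 'fully utilized' size is an assumption chosen only for illustration.

```python
# Hypothetical sketch of the "75% of a fully utilized block" heuristic.
# None of these names come from bzrlib; the real logic lives in
# _LazyGroupContentManager.check_is_well_utilized().

FULL_BLOCK_SIZE = 4 * 1024 * 1024  # assumed size of a 'fully utilized' block
MIN_UTILIZATION = 0.75             # threshold described in point 2


def is_well_utilized(block_size, full_block_size=FULL_BLOCK_SIZE,
                     threshold=MIN_UTILIZATION):
    """Return True if the block is large enough to be re-used directly."""
    return block_size >= threshold * full_block_size


def plan_stream(block_sizes):
    """Split block indices into (re-use as-is, repack into new blocks)."""
    reuse, repack = [], []
    for index, size in enumerate(block_sizes):
        if is_well_utilized(size):
            reuse.append(index)
        else:
            repack.append(index)
    return reuse, repack


if __name__ == "__main__":
    mb = 1024 * 1024
    reuse, repack = plan_stream([4 * mb, 1 * mb, 3 * mb])
    print("re-use directly:", reuse)
    print("schedule for repacking:", repack)
```

Under these assumptions, a well-packed source yields mostly re-usable blocks, so little or no repacking work is needed, which would be consistent with the 1m0s timing for the already-packed case.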