Merge lp:~jameinel/bzr/2.0.6-peak-commit-mem into lp:bzr/2.0

Proposed by John A Meinel
Status: Merged
Approved by: bzr PQM
Approved revision: not available
Merged at revision: not available
Proposed branch: lp:~jameinel/bzr/2.0.6-peak-commit-mem
Merge into: lp:bzr/2.0
Diff against target: 53 lines (+9/-5)
2 files modified
NEWS (+7/-3)
bzrlib/groupcompress.py (+2/-2)
To merge this branch: bzr merge lp:~jameinel/bzr/2.0.6-peak-commit-mem
Reviewer Review Type Date Requested Status
Robert Collins (community) Approve
Review via email: mp+23718@code.launchpad.net

Commit message

Reduce peak memory usage by one compressed text copy when compressing in 2a formats.

Description of the change

I'm proposing this for 2.0 (and thus by extension for 2.1 and trunk). I chose 2.0 because it really is a rather trivial change that has a nice positive impact on peak memory when working with large files.

It is related to bug #566940, but this is really only 1 step out of many.

To test this, I created an 80MB file filled with incompressible data and ran 'bzr commit'. I also expanded the data with 'hexlify', which makes it 160MB but should compress about 2:1.

With the 80mb dataset, peak memory went from:
  349MB => 267MB (82MB, aka 1 copy)
With the 160mb dataset, the peak memory went from:
  465MB => 371MB (94MB, ~1 compressed size)
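The test data described above is easy to reproduce. A small-scale sketch (1 MiB instead of 80 MiB, to keep it quick): random bytes are essentially incompressible, while their hexlified form doubles in size but compresses at roughly 2:1, since each hex byte carries only 4 bits of entropy.

```python
import binascii
import os
import zlib

# Random bytes: essentially incompressible.
raw = os.urandom(1024 * 1024)
# Hexlified copy: twice the size, but ~2:1 compressible.
hexed = binascii.hexlify(raw)

assert len(hexed) == 2 * len(raw)
# zlib barely shrinks random data...
assert len(zlib.compress(raw)) > 0.95 * len(raw)
# ...but gets close to 2:1 on the hex form (4 bits of entropy per byte).
assert len(zlib.compress(hexed)) < 0.6 * len(hexed)
```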

As near as I can tell, the saving comes from dropping self._z_content earlier. We still hold the fulltext in memory, because FulltextContentFactory holds on to it, and some places still do ''.join(extra_header + zcontent).

Still, saving one copy of the compressed content for a one-line change is worthwhile.
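The shape of the fix is simply to reset the compressor at the top of flush(), before the flushed bytes are handed to storage, so the old compressor (and its _z_content buffer) becomes garbage while the write happens. A toy sketch of that ordering, with hypothetical stand-in names (not bzrlib's real API):

```python
class GroupCompressorSketch:
    """Toy stand-in for bzrlib's GroupCompressor (not the real class):
    it only models the _z_content buffer holding compressed bytes."""

    def __init__(self, z_content=b""):
        self._z_content = z_content

    def flush_to_bytes(self):
        return self._z_content


class PackerSketch:
    """Minimal holder mirroring the shape of the patched flush()."""

    def __init__(self):
        self._compressor = GroupCompressorSketch()
        self.stored = []

    def flush(self):
        data = self._compressor.flush_to_bytes()
        # Patched ordering: reset the compressor *before* storing the
        # bytes.  The old compressor and its _z_content buffer become
        # collectable here, so only `data` keeps the compressed text
        # alive during the write.  The original code reset it after the
        # write, holding two copies at peak.
        self._compressor = GroupCompressorSketch()
        self.stored.append(data)
```

Usage: after `flush()`, the stored bytes are the only surviving copy, and the packer holds a fresh, empty compressor:

```python
p = PackerSketch()
p._compressor = GroupCompressorSketch(b"zdata")
p.flush()
assert p.stored == [b"zdata"]
assert p._compressor.flush_to_bytes() == b""
```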

Revision history for this message
Robert Collins (lifeless) wrote :

Looks great to me.

review: Approve
Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> The proposal to merge lp:~jameinel/bzr/2.0.6-peak-commit-mem into lp:bzr/2.0 has been updated.
>
> Status: Approved => Queued
> Queue Position: (not set) => 73

So this was marked as Queued to be merged into lp:bzr/2.0, but as of
today, I don't see it actually merged, and I don't see an error or
success message for this.

I'm going to just submit it myself, but I figured I'd make a comment to
figure out what was going on.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkvONbkACgkQJdeBCYSNAAOSpwCfSploUE4AVVzetdNzp0yat/8h
gdIAmQFGlycq4UTw33HB24YjAXe6tAw5
=s3Au
-----END PGP SIGNATURE-----

Revision history for this message
Robert Collins (lifeless) wrote :

You can see it in the queue at http://pqm.bazaar-vcs.org/ - the fix I wrote yesterday for one of the causes of pqm not completing properly wasn't deployed till this morning. There is also https://bugs.edge.launchpad.net/launchpadlib/+bug/567180, which causes pqm to fail to change the status on a failing merge, with the result that it stays in the queue. I'm considering changing the order to work around that (though in general I think it's better not to silently bounce things out of the queued state). You can manually remove things from the queue just by visiting the merge proposal and editing the status from Queued back to Approved.

Revision history for this message
Vincent Ladeuil (vila) wrote :

Status set back to Approved instead of Queued, as pqm is not able to process it (or rather, it's so happy to process it that it handles it again and again; see bug #567333 for a similar case).

Revision history for this message
bzr PQM (bzr-pqm) wrote :

Successful steps
Failure output:
All lines of log output:
Executing star-merge lp:~jameinel/bzr/2.0.6-peak-commit-mem at Thu Apr 29 04:25:59 2010
['Nothing to merge.']

Preview Diff

=== modified file 'NEWS'
--- NEWS 2010-04-01 14:10:47 +0000
+++ NEWS 2010-04-19 22:12:18 +0000
@@ -14,6 +14,10 @@
 Bug Fixes
 *********
 
+* Additional merges after an unrelated branch has been merged with its
+  history no longer crash when deleted files are involved.
+  (Vincent Ladeuil, John Arbash Meinel, #375898)
+
 * ``bzr revert`` now only takes write lock on working tree, instead of on
   both working tree and branch.
   (Danny van Heumen, #498409)
@@ -22,9 +26,9 @@
   permissions as ``.bzr`` directory on a POSIX OS.
   (Parth Malwankar, #262450)
 
-* Additional merges after an unrelated branch has been merged with its
-  history no longer crash when deleted files are involved.
-  (Vincent Ladeuil, John Arbash Meinel, #375898)
+* Reduce peak memory by one copy of compressed text.
+  (John Arbash Meinel, #566940)
+
 
 bzr 2.0.5
 #########
 

=== modified file 'bzrlib/groupcompress.py'
--- bzrlib/groupcompress.py 2009-09-08 06:25:26 +0000
+++ bzrlib/groupcompress.py 2010-04-19 22:12:18 +0000
@@ -1,4 +1,4 @@
-# Copyright (C) 2008, 2009 Canonical Ltd
+# Copyright (C) 2008, 2009, 2010 Canonical Ltd
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -1641,6 +1641,7 @@
         keys_to_add = []
         def flush():
             bytes = self._compressor.flush().to_bytes()
+            self._compressor = GroupCompressor()
             index, start, length = self._access.add_raw_records(
                 [(None, len(bytes))], bytes)[0]
             nodes = []
@@ -1649,7 +1650,6 @@
             self._index.add_records(nodes, random_id=random_id)
             self._unadded_refs = {}
             del keys_to_add[:]
-            self._compressor = GroupCompressor()
 
         last_prefix = None
         max_fulltext_len = 0

Subscribers

People subscribed via source and target branches