Merge lp:~jameinel/bzr/2.0.6-peak-commit-mem into lp:bzr/2.0

Proposed by John A Meinel
Status: Merged
Approved by: bzr PQM
Approved revision: not available
Merged at revision: not available
Proposed branch: lp:~jameinel/bzr/2.0.6-peak-commit-mem
Merge into: lp:bzr/2.0
Diff against target: 53 lines (+9/-5)
2 files modified
NEWS (+7/-3)
bzrlib/groupcompress.py (+2/-2)
To merge this branch: bzr merge lp:~jameinel/bzr/2.0.6-peak-commit-mem
Reviewer Review Type Date Requested Status
Robert Collins (community) Approve
Review via email: mp+23718@code.launchpad.net

Commit message

Reduce peak memory usage by 1 compressed text copy when compressing in 2a foramts.

Description of the change

I'm proposing this for 2.0 (and thus by extension for 2.1 and trunk). I chose 2.0 because it really is a rather trivial change that has a nice positive impact on peak memory when working with large files.

It is related to bug #566940, but this is really only 1 step out of many.

To test this, I created an 80mb file filled with incompressible data, and did 'bzr commit'. I also expanded the data with 'hexlify', which makes it 160mb, but should compress 2:1.

With the 80mb dataset, peak memory went from:
  349MB => 267MB (82MB, aka 1 copy)
With the 160mb dataset, the peak memory went from:
  465MB => 371MB (94MB, ~1 compressed size)

As near as I can tell, it is the removal of self._z_content which is saved. We still hold the fulltext in memory, because FulltextContentFactory holds on to it. And you still have some places that are doing ''.join(extra_header + zcontent).

Still, saving 1 copy of compressed content for a 1 line change is worthwhile.

To post a comment you must log in.
Revision history for this message
Robert Collins (lifeless) wrote :

Looks great to me.

review: Approve
Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> The proposal to merge lp:~jameinel/bzr/2.0.6-peak-commit-mem into lp:bzr/2.0 has been updated.
>
> Status: Approved => Queued
> Queue Position: (not set) => 73

So this was marked as Queued to be merged into lp:bzr/2.0, but as of
today, I don't see it actually merged, and I don't see an error or
success message for this.

I'm going to just submit it myself, but I figured I'd make a comment to
figure out what was going on.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkvONbkACgkQJdeBCYSNAAOSpwCfSploUE4AVVzetdNzp0yat/8h
gdIAmQFGlycq4UTw33HB24YjAXe6tAw5
=s3Au
-----END PGP SIGNATURE-----

Revision history for this message
Robert Collins (lifeless) wrote :

you can see it in the queue at http://pqm.bazaar-vcs.org/ - the fix I wrote
yesterday for one of the causes of pqm not completing properly wasn't
deployed till this morning; there is also
https://bugs.edge.launchpad.net/launchpadlib/+bug/567180 causing pqm to fail
to change the status on a failing merge, with the result it stays in the
queue. I'm considering changing the order to work around that (though in
general I think its better to not silently bounce things out of queued
state. You can manually remove things from queued just by visiting the merge
proposal and clicking on the edit to change from queued to approved.

Revision history for this message
Vincent Ladeuil (vila) wrote :

Status set to Approved instead of Queued as pqm is not able to process it (or rather
it's so happy to process it that it handles it again and again, see bug #567333 for a similar
case).

Revision history for this message
bzr PQM (bzr-pqm) wrote :

Successful steps
Failure output:
All lines of log output:
Executing star-merge lp:~jameinel/bzr/2.0.6-peak-commit-mem at Thu Apr 29 04:25:59 2010
['Nothing to merge.']

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2010-04-01 14:10:47 +0000
3+++ NEWS 2010-04-19 22:12:18 +0000
4@@ -14,6 +14,10 @@
5 Bug Fixes
6 *********
7
8+* Additional merges after an unrelated branch has been merged with its
9+ history no longer crash when deleted files are involved.
10+ (Vincent Ladeuil, John Arbash Meinel, #375898)
11+
12 * ``bzr revert`` now only takes write lock on working tree, instead of on
13 both working tree and branch.
14 (Danny van Heumen, #498409)
15@@ -22,9 +26,9 @@
16 permissions as ``.bzr`` directory on a POSIX OS.
17 (Parth Malwankar, #262450)
18
19-* Additional merges after an unrelated branch has been merged with its
20- history no longer crash when deleted files are involved.
21- (Vincent Ladeuil, John Arbash Meinel, #375898)
22+* Reduce peak memory by one copy of compressed text.
23+ (John Arbash Meinel, #566940)
24+
25
26 bzr 2.0.5
27 #########
28
29=== modified file 'bzrlib/groupcompress.py'
30--- bzrlib/groupcompress.py 2009-09-08 06:25:26 +0000
31+++ bzrlib/groupcompress.py 2010-04-19 22:12:18 +0000
32@@ -1,4 +1,4 @@
33-# Copyright (C) 2008, 2009 Canonical Ltd
34+# Copyright (C) 2008, 2009, 2010 Canonical Ltd
35 #
36 # This program is free software; you can redistribute it and/or modify
37 # it under the terms of the GNU General Public License as published by
38@@ -1641,6 +1641,7 @@
39 keys_to_add = []
40 def flush():
41 bytes = self._compressor.flush().to_bytes()
42+ self._compressor = GroupCompressor()
43 index, start, length = self._access.add_raw_records(
44 [(None, len(bytes))], bytes)[0]
45 nodes = []
46@@ -1649,7 +1650,6 @@
47 self._index.add_records(nodes, random_id=random_id)
48 self._unadded_refs = {}
49 del keys_to_add[:]
50- self._compressor = GroupCompressor()
51
52 last_prefix = None
53 max_fulltext_len = 0

Subscribers

People subscribed via source and target branches