Bazaar

Merge lp:~jameinel/bzr/2.0.6-peak-commit-mem into lp:bzr/2.0

2.0.6-peak-commit-mem
Merge into 2.0

Proposed by John A Meinel on 2010-04-19

Status:	Merged
Approved by:	bzr PQM on 2010-04-29
Approved revision:	not available
Merged at revision:	not available
Proposed branch:	lp:~jameinel/bzr/2.0.6-peak-commit-mem
Merge into:	lp:bzr/2.0
Diff against target:	53 lines (+9/-5) 2 files modified NEWS (+7/-3) bzrlib/groupcompress.py (+2/-2)
To merge this branch:	bzr merge lp:~jameinel/bzr/2.0.6-peak-commit-mem
Related bugs:	Link a bug report

Reviewer	Review Type	Date Requested	Status
Robert Collins (community)		2010-04-19	Approve on 2010-04-19
Review via email: mp+23718@code.launchpad.net

Commit message

Reduce peak memory usage by 1 compressed text copy when compressing in 2a foramts.

Description of the change

I'm proposing this for 2.0 (and thus by extension for 2.1 and trunk). I chose 2.0 because it really is a rather trivial change that has a nice positive impact on peak memory when working with large files.

It is related to bug #566940, but this is really only 1 step out of many.

To test this, I created an 80mb file filled with incompressible data, and did 'bzr commit'. I also expanded the data with 'hexlify', which makes it 160mb, but should compress 2:1.

With the 80mb dataset, peak memory went from:
349MB => 267MB (82MB, aka 1 copy)
With the 160mb dataset, the peak memory went from:
465MB => 371MB (94MB, ~1 compressed size)

As near as I can tell, it is the removal of self._z_content which is saved. We still hold the fulltext in memory, because FulltextContentFactory holds on to it. And you still have some places that are doing ''.join(extra_header + zcontent).

Still, saving 1 copy of compressed content for a 1 line change is worthwhile.

Revision history for this message

Robert Collins (lifeless) wrote on 2010-04-19:

Looks great to me.

review: Approve

Revision history for this message

John A Meinel (jameinel) wrote on 2010-04-20:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> The proposal to merge lp:~jameinel/bzr/2.0.6-peak-commit-mem into lp:bzr/2.0 has been updated.
>
> Status: Approved => Queued
> Queue Position: (not set) => 73

So this was marked as Queued to be merged into lp:bzr/2.0, but as of
today, I don't see it actually merged, and I don't see an error or
success message for this.

I'm going to just submit it myself, but I figured I'd make a comment to
figure out what was going on.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkvONbkACgkQJdeBCYSNAAOSpwCfSploUE4AVVzetdNzp0yat/8h
gdIAmQFGlycq4UTw33HB24YjAXe6tAw5
=s3Au
-----END PGP SIGNATURE-----

Revision history for this message

Robert Collins (lifeless) wrote on 2010-04-20:

you can see it in the queue at http://pqm.bazaar-vcs.org/ - the fix I wrote
yesterday for one of the causes of pqm not completing properly wasn't
deployed till this morning; there is also
https://bugs.edge.launchpad.net/launchpadlib/+bug/567180 causing pqm to fail
to change the status on a failing merge, with the result it stays in the
queue. I'm considering changing the order to work around that (though in
general I think its better to not silently bounce things out of queued
state. You can manually remove things from queued just by visiting the merge
proposal and clicking on the edit to change from queued to approved.

Revision history for this message

Vincent Ladeuil (vila) wrote on 2010-04-21:

Status set to Approved instead of Queued as pqm is not able to process it (or rather
it's so happy to process it that it handles it again and again, see bug #567333 for a similar
case).

Revision history for this message

bzr PQM (bzr-pqm) wrote on 2010-04-29:

Successful steps
Failure output:
All lines of log output:
Executing star-merge lp:~jameinel/bzr/2.0.6-peak-commit-mem at Thu Apr 29 04:25:59 2010
['Nothing to merge.']

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk

Subscribers

People subscribed via source and target branches

to all changes:

Alexander Belchenko

Bazaar Codereview Subscribers

Benoit Pierre

John A Meinel

Martin Pool

Matt Nordhoff

bzr PQM

pascalprost

1	=== modified file 'NEWS'
2	--- NEWS 2010-04-01 14:10:47 +0000
3	+++ NEWS 2010-04-19 22:12:18 +0000
4	@@ -14,6 +14,10 @@
5	Bug Fixes
6	*********
7
8	+* Additional merges after an unrelated branch has been merged with its
9	+ history no longer crash when deleted files are involved.
10	+ (Vincent Ladeuil, John Arbash Meinel, #375898)
11	+
12	* ``bzr revert`` now only takes write lock on working tree, instead of on
13	both working tree and branch.
14	(Danny van Heumen, #498409)
15	@@ -22,9 +26,9 @@
16	permissions as ``.bzr`` directory on a POSIX OS.
17	(Parth Malwankar, #262450)
18
19	-* Additional merges after an unrelated branch has been merged with its
20	- history no longer crash when deleted files are involved.
21	- (Vincent Ladeuil, John Arbash Meinel, #375898)
22	+* Reduce peak memory by one copy of compressed text.
23	+ (John Arbash Meinel, #566940)
24	+
25
26	bzr 2.0.5
27	#########
28
29	=== modified file 'bzrlib/groupcompress.py'
30	--- bzrlib/groupcompress.py 2009-09-08 06:25:26 +0000
31	+++ bzrlib/groupcompress.py 2010-04-19 22:12:18 +0000
32	@@ -1,4 +1,4 @@
33	-# Copyright (C) 2008, 2009 Canonical Ltd
34	+# Copyright (C) 2008, 2009, 2010 Canonical Ltd
35	#
36	# This program is free software; you can redistribute it and/or modify
37	# it under the terms of the GNU General Public License as published by
38	@@ -1641,6 +1641,7 @@
39	keys_to_add = []
40	def flush():
41	bytes = self._compressor.flush().to_bytes()
42	+ self._compressor = GroupCompressor()
43	index, start, length = self._access.add_raw_records(
44	[(None, len(bytes))], bytes)[0]
45	nodes = []
46	@@ -1649,7 +1650,6 @@
47	self._index.add_records(nodes, random_id=random_id)
48	self._unadded_refs = {}
49	del keys_to_add[:]
50	- self._compressor = GroupCompressor()
51
52	last_prefix = None
53	max_fulltext_len = 0