Merge lp:~kbielefe/bzr/551391-log-memory-usage into lp:bzr

Proposed by Karl Bielefeldt
Status: Merged
Approved by: Vincent Ladeuil
Approved revision: no longer in the source branch.
Merged at revision: 5484
Proposed branch: lp:~kbielefe/bzr/551391-log-memory-usage
Merge into: lp:bzr
Diff against target: 106 lines (+41/-1)
4 files modified
NEWS (+4/-0)
bzrlib/help_topics/en/debug-flags.txt (+1/-0)
bzrlib/tests/test_trace.py (+12/-1)
bzrlib/trace.py (+24/-0)
To merge this branch: bzr merge lp:~kbielefe/bzr/551391-log-memory-usage
Reviewer Review Type Date Requested Status
Vincent Ladeuil Needs Information
Martin Pool Needs Fixing
John A Meinel Needs Fixing
Review via email: mp+37086@code.launchpad.net

Commit message

Dump memory profile to log if -Dmem_dump is set.

Description of the change

Implements bug 551391 by using meliae to dump all memory to a file in the current directory when there is a MemoryError exception. Gracefully exits if meliae is not installed or if there isn't enough memory to perform the dump or other exceptions occur during the dump. Logs whether it succeeded or not to ~/.bzr.log.

Test using ulimit -m and ulimit -v to set memory relatively low and run something that can potentially use a lot of memory such as bzr check.

To post a comment you must log in.
Revision history for this message
John A Meinel (jameinel) wrote :

We should probably have a flag to enable/disable this. "-Ddump_memory" comes to mind. I don't know which should be the default, sort of like core-dumping. I think the default for most people is to have it off, so I would probably go with that. So we could change the error message to mention "use -Dmemory_dump to get a memory dump" (or whatever).

I like the idea, the code seems fine otherwise.

review: Needs Fixing
Revision history for this message
Martin Pool (mbp) wrote :

Thanks very much for that.

Beyond John's comments I'd say:

 * probably the dump should go into /tmp (ie use tempfile.mkstemp), so
that it doesn't clutter up or cause trouble in people's working
directory; then print the name

 * if you get an ImportError or other error, please log the exception
so we have a chance of debugging it

 * I would print the filename, not just logging it - or return it, and
let the higher-level code print it

--
Martin

Revision history for this message
Karl Bielefeldt (kbielefe) wrote :

Okay, I incorporated both of your suggestions. I'm not sure what the process is from this point.

Revision history for this message
Martin Pool (mbp) wrote :

That looks good to me. I might let John have a look again, but we can merge it from here.

review: Approve
Revision history for this message
John A Meinel (jameinel) wrote :

One thought... dump_all_objects builds a big set to make sure it gets everything, but only once.

If we just ran out of memory, maybe meliae.dump_gc_objects() would be more appropriate? It approximates dumping everything by dumping everything in gc + 1 level of references, it tends to be about 15-50% larger on disk, because of duplicate dumps, and it does miss some stuff, but it avoids needing to keep track of every object.

If we really-really wanted to be evil, we could set the high bit on all refcount values, to indicate this has been dumped, and then walk it again to remove it. But I don't really like mutating important state like that.

Revision history for this message
John A Meinel (jameinel) wrote :

sent to pqm by email

Revision history for this message
John A Meinel (jameinel) wrote :

I failed to catch this, but PQM did.

try:
except:
finally:

Doesn't work in python 2.4, (I think it is python 2.5).

Also, if something early raises an exception, there is no 'dump_file' for finally use.

So it should be:

fd, name = ...
dump_file = ...
try:
  try:
  except:
    return
finally:
  dump_file.close()

16 +def _dump_memory_usage(err_file):
17 + try:
18 + fd, name = tempfile.mkstemp(prefix="bzr_memdump", suffix=".json")
19 + dump_file = os.fdopen(fd, 'w')
20 + from meliae import scanner
21 + scanner.dump_all_objects(dump_file)
22 + err_file.write("Memory dumped to %s\n" % name)
23 + except ImportError:
24 + err_file.write("Dumping memory requires meliae module.\n")
25 + log_exception_quietly()
26 + except:
27 + err_file.write("Exception while dumping memory.\n")
28 + log_exception_quietly()
29 + finally:
30 + dump_file.close()

review: Needs Fixing
Revision history for this message
Karl Bielefeldt (kbielefe) wrote :

> If we just ran out of memory, maybe meliae.dump_gc_objects() would be more
> appropriate?

You're probably right. The way I tested it, it failed allocating one last big chunk, so it still had enough memory to build the set. However, I think the allocation that caused it to fail wouldn't show up in the dump anyway because it never existed. The stack trace would be more useful in that case.

Where a memory dump on MemoryError is really useful is when you have a slower accumulation of too many smaller objects, and an approximation is probably good enough for those purposes. Maybe a fallback from dump_all_objects to dump_gc_objects would be useful?

When you want the precision is while doing memory profiling for development, and then developers can use whichever method works best for their scenario.

It just occurred to me that since I'm fixing the PQM bugs anyway, if it would be useful to add a hidden command like "bzr decode-memdump /tmp/dumpfile.json" for printing the summary information. That would make it a lot easier for an end user to file a bug report.

Revision history for this message
Karl Bielefeldt (kbielefe) wrote :

Fixed PQM errors and ready for resubmit.

Revision history for this message
Martin Pool (mbp) wrote :

That looks good, thanks very much. We should also mention this in news, and in 'help debug-flags', which comes from bzrlib/help_topics/en

review: Needs Fixing
Revision history for this message
Karl Bielefeldt (kbielefe) wrote :

Okay, updated help and news. Thanks for your patience in walking me through this.

Revision history for this message
Martin Pool (mbp) wrote :

No problem, thanks for your persistence.

review: Approve
Revision history for this message
Martin Pool (mbp) wrote :

sent to pqm by email

Revision history for this message
Martin Pool (mbp) wrote :

This actually causes one correct test failure:

======================================================================
FAIL: bzrlib.tests.test_trace.TestTrace.test_format_memory_error
----------------------------------------------------------------------
_StringException: Text attachment: log
------------
1966.897 Traceback (most recent call last):
 File "/home/pqm/bzr-pqm-workdir/home/+trunk/bzrlib/tests/test_trace.py", line 79, in test_format_memory_error
   raise MemoryError()
MemoryError

------------
Text attachment: traceback
------------
Traceback (most recent call last):
 File "/usr/lib/python2.4/site-packages/testtools/runtest.py", line 128, in _run_user
   return fn(*args)
 File "/usr/lib/python2.4/site-packages/testtools/testcase.py", line 368, in _run_test_method
   testMethod()
 File "/home/pqm/bzr-pqm-workdir/home/+trunk/bzrlib/tests/test_trace.py", line 84, in test_format_memory_error
   "bzr: out of memory\n")
AssertionError: not equal:
a = 'bzr: out of memory\nUse -Dmem_dump to dump memory to a file.\n'
b = 'bzr: out of memory\n'

------------

That's nice because we can check it is actually being printed.

review: Needs Fixing
Revision history for this message
Karl Bielefeldt (kbielefe) wrote :

Okay, I fixed that test and added another test for if -Dmem_dump is specified. On the latter, I'm getting the following output, which I'm not sure is a problem or not. Also I'm not sure if the test should assume the meliae module is present.

Output of bzr selftest --verbose test_format_mem_dump:

test_trace.TestTrace.test_format_mem_dump /usr/lib/pymodules/python2.6/dbus/_dbus.py:251: DeprecationWarning: The dbus_bindings module is not public API and will go away soon.

Most uses of dbus_bindings are applications catching the exception
dbus.dbus_bindings.DBusException. You should use dbus.DBusException
instead (this is compatible with all dbus-python versions since 0.40.2).

If you need additional public API, please contact the maintainers via
<email address hidden>.

  import dbus.dbus_bindings as m
/usr/lib/python2.6/dist-packages/meliae/scanner.py:94: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  recurse_depth=recurse_depth)
   OK 3602ms
----------------------------------------------------------------------
Ran 1 test in 3.684s

OK

Revision history for this message
Martin Pool (mbp) wrote :

The dbus warnings are probably caused by bzr-gtk; running with
--no-plugins should let your tests pass.

You can use ModuleAvailableFeature to make the tests check that a file
is written if meliae is installed and that it does not crash if it's
not.

--
Martin

Revision history for this message
Vincent Ladeuil (vila) wrote :

@Karl: I went ahead and created https://code.edge.launchpad.net/~vila/bzr/551391-log-memory-usage/+merge/38103 so we can land this patch (see there for details).

review: Approve
Revision history for this message
Karl Bielefeldt (kbielefe) wrote :

> @Karl: I went ahead and created
> https://code.edge.launchpad.net/~vila/bzr/551391-log-memory-usage/+merge/38103
> so we can land this patch (see there for details).

Thanks for that. My time for open source work comes in bursts. For future reference, should I make it a practice to give bzr-core or someone access to my branch, so you wouldn't need to create a new one?

Revision history for this message
Vincent Ladeuil (vila) wrote :

>>>>> Karl Bielefeldt <email address hidden> writes:

    >> @Karl: I went ahead and created
    >> https://code.edge.launchpad.net/~vila/bzr/551391-log-memory-usage/+merge/38103
    >> so we can land this patch (see there for details).

    > Thanks for that. My time for open source work comes in bursts.
    > For future reference, should I make it a practice to give bzr-core
    > or someone access to my branch, so you wouldn't need to create a
    > new one?

Hmm, not really.

I was reviewing your submission in a local branch, all I had to do was:

bzr push lp:~vila/bzr/`bzr nick`
bzr lp-open
and add your branch as a pre-requisite when proposing mine for merge.

If I had made a controversial change and thought you may agree or not
with it, I would also have 'bzr push' my branch and told you: 'If these
changes suit you, you could:

 bzr pull lp:~vila/bzr/551391-log-memory-usage

From there, you would have continue working on your branch, then 'bzr
push' again and add a comment in the ongoing merge proposal explaining
the new status.

There are a lot of variations around this theme, but in practive we
rarely use shared branches (in the ~bzr namespace).

Revision history for this message
Vincent Ladeuil (vila) wrote :

Meh, I was sure you signed the contributor agreement but you're not part of https://launchpad.net/~contributor-agreement-canonical

Did I miss something ?

review: Needs Information
Revision history for this message
Karl Bielefeldt (kbielefe) wrote :

> Meh, I was sure you signed the contributor agreement but you're not part of
> https://launchpad.net/~contributor-agreement-canonical
>
> Did I miss something ?

Taken care of. I thought I had signed it as well, but it turns out I was thinking of the Code of Conduct.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2010-10-05 21:15:13 +0000
3+++ NEWS 2010-10-06 13:36:52 +0000
4@@ -31,6 +31,10 @@
5 Also, bzr will recognize if the 'ssh' comand is a symlink to lsh.
6 (Matthew Gordon, #374700)
7
8+* Add ``-Dmem_dump`` debug flag, which uses meliae to dump memory to
9+ a file upon an out of memory error.
10+ (Karl Bielefeldt, #551391)
11+
12 Bug Fixes
13 *********
14
15
16=== modified file 'bzrlib/help_topics/en/debug-flags.txt'
17--- bzrlib/help_topics/en/debug-flags.txt 2010-01-05 04:30:07 +0000
18+++ bzrlib/help_topics/en/debug-flags.txt 2010-10-06 13:36:52 +0000
19@@ -25,6 +25,7 @@
20 -Dknit Trace knit operations.
21 -Dlock Trace when lockdir locks are taken or released.
22 -Dprogress Trace progress bar operations.
23+-Dmem_dump Dump memory to a file upon an out of memory error.
24 -Dmerge Emit information for debugging merges.
25 -Dno_apport Don't use apport to report crashes.
26 -Dno_activity Don't show transport activity indicator in progress bar.
27
28=== modified file 'bzrlib/tests/test_trace.py'
29--- bzrlib/tests/test_trace.py 2010-06-26 01:07:16 +0000
30+++ bzrlib/tests/test_trace.py 2010-10-06 13:36:52 +0000
31@@ -26,6 +26,7 @@
32 import tempfile
33
34 from bzrlib import (
35+ debug,
36 errors,
37 trace,
38 )
39@@ -81,7 +82,17 @@
40 pass
41 msg = _format_exception()
42 self.assertEquals(msg,
43- "bzr: out of memory\n")
44+ "bzr: out of memory\nUse -Dmem_dump to dump memory to a file.\n")
45+
46+ def test_format_mem_dump(self):
47+ debug.debug_flags.add('mem_dump')
48+ try:
49+ raise MemoryError()
50+ except MemoryError:
51+ pass
52+ msg = _format_exception()
53+ self.assertStartsWith(msg,
54+ "bzr: out of memory\nMemory dumped to ")
55
56 def test_format_os_error(self):
57 try:
58
59=== modified file 'bzrlib/trace.py'
60--- bzrlib/trace.py 2010-06-28 02:41:22 +0000
61+++ bzrlib/trace.py 2010-10-06 13:36:52 +0000
62@@ -59,6 +59,7 @@
63 import os
64 import sys
65 import time
66+import tempfile
67
68 from bzrlib.lazy_import import lazy_import
69 lazy_import(globals(), """
70@@ -469,6 +470,25 @@
71 note(line)
72 break
73
74+def _dump_memory_usage(err_file):
75+ try:
76+ try:
77+ fd, name = tempfile.mkstemp(prefix="bzr_memdump", suffix=".json")
78+ dump_file = os.fdopen(fd, 'w')
79+ from meliae import scanner
80+ scanner.dump_gc_objects(dump_file)
81+ err_file.write("Memory dumped to %s\n" % name)
82+ except ImportError:
83+ err_file.write("Dumping memory requires meliae module.\n")
84+ log_exception_quietly()
85+ except:
86+ err_file.write("Exception while dumping memory.\n")
87+ log_exception_quietly()
88+ finally:
89+ if dump_file is not None:
90+ dump_file.close()
91+ elif fd is not None:
92+ os.close(fd)
93
94 def report_exception(exc_info, err_file):
95 """Report an exception to err_file (typically stderr) and to .bzr.log.
96@@ -492,6 +512,10 @@
97 return errors.EXIT_ERROR
98 elif isinstance(exc_object, MemoryError):
99 err_file.write("bzr: out of memory\n")
100+ if 'mem_dump' in debug.debug_flags:
101+ _dump_memory_usage(err_file)
102+ else:
103+ err_file.write("Use -Dmem_dump to dump memory to a file.\n")
104 return errors.EXIT_ERROR
105 elif isinstance(exc_object, ImportError) \
106 and str(exc_object).startswith("No module named "):