Merge lp:~mnordhoff/loggerhead/statictuples into lp:loggerhead

Proposed by Matt Nordhoff
Status: Work in progress
Proposed branch: lp:~mnordhoff/loggerhead/statictuples
Merge into: lp:loggerhead
Diff against target: 253 lines (+134/-12)
6 files modified
loggerhead/_static_tuple_py.py (+80/-0)
loggerhead/changecache.py (+11/-2)
loggerhead/controllers/filediff_ui.py (+3/-0)
loggerhead/history.py (+3/-2)
loggerhead/util.py (+7/-0)
loggerhead/wholehistory.py (+30/-8)
To merge this branch: bzr merge lp:~mnordhoff/loggerhead/statictuples
Reviewer Review Type Date Requested Status
Loggerhead Team Pending
Review via email: mp+13706@code.launchpad.net
Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

This totally isn't ready to merge (though it does work 100%, AFAICT), but I wanted to bring it to your attention.

This branch changes Loggerhead to make use of StaticTuples, jam's recent addition to bzrlib. (If you don't know, they're very similar to tuples, but they are faster and save memory by only holding certain types of objects that can't get in reference cycles, so they can avoid the GC overhead.)

I'm bundling a copy of the Python version of StaticTuple for compatibility with older bzr versions. This has a performance cost, but hopefully it's minor, and worth it.
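
The compatibility fallback described here amounts to a tiered import, something like the sketch below (module paths are bzrlib's of that era; the final `tuple` assignment is only so this self-contained sketch runs anywhere, and is not part of the branch):

```python
# Hedged sketch of the StaticTuple compatibility shim: prefer the compiled
# extension, then bzr's pure-Python copy, then (for this sketch only) a
# plain tuple stand-in.
try:
    from bzrlib._static_tuple_c import StaticTuple       # compiled extension
except ImportError:
    try:
        from bzrlib._static_tuple_py import StaticTuple  # bzr's pure-Python copy
    except ImportError:
        # bzr too old to have StaticTuple at all: the branch bundles its own
        # copy (loggerhead/_static_tuple_py.py).  Plain `tuple` here only so
        # the sketch is runnable without bzrlib installed.
        StaticTuple = tuple
```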

I don't really have information on how much of a savings this is; I haven't been running it long enough for memory usage to stabilize. But I figure it should reduce the number of tuples by 50-75%, which is 100,000-300,000 for me.

BTW, I fixed a missing import since the last time Launchpad mirrored this branch.

Also, you need to run it against lp:~jameinel/bzr/2.1-st-concat, which has a proposal up for merging to bzr.dev.

What's left to do:

1.) It's not possible to marshal StaticTuples, so RevInfoDiskCache currently uses listcomps to convert to/from regular tuples when serializing/deserializing. This probably has a performance impact, but all I cared about at the time was getting it working.

    It's not possible to pickle them either, so we can't switch to that. Besides, pickling uses more disk space than marshal. (As a quick test, I took a random row from revinfo.sql. Without changing from tuple to StaticTuple, changing it from marshal+zlib to pickle+zlib (using pickle protocol version 2) went from 39 KB to 54 KB.)

2.) Currently I work around StaticTuple.from_sequence() only supporting sequences, not generators. I'll poke jam about that next time I see him.

3.) There weren't very many of them, but I saw some tuples like ('replace', 0, 1, 0, 1) in memory, and want to StaticTupleify them too, but don't know where they come from.

4.) I convert the results of iter_ancestry to StaticTuples. I should probably send a patch to bzr.dev, or maybe just leave them as normal tuples, to save the cost of converting them (though it's probably very minor).

5.) Maybe some things should be interned? I dunno.

6.) Some of the things I StaticTupled probably have a negligible impact on memory usage, but hey, why not?

7.) After this, the only other interesting tuples I see in Dozer are related to Subversion exceptions (bug #436406) and a few like ('sha1:...',), which I haven't investigated yet. Plus a couple ('$some_revid', [$graph_cache]).

8.) It might be possible for some of Loggerhead's many lists to be turned into StaticTuples. In particular, the graph cache is a list of lists of 3-tuples. (They actually are sometimes modified where they're built, wholehistory.compute_whole_history_data(), but it may be worth the inconvenience of turning them into StaticTuples anyway.)
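
The marshal round-trip from item 1 — casting StaticTuples to plain tuples before serializing and back afterwards — can be sketched like this. It is a self-contained approximation: the `StaticTuple` class here is a minimal pure-Python stand-in for bzrlib's real type, and `serialize`/`deserialize` are illustrative names, not RevInfoDiskCache's actual API.

```python
import marshal
import zlib

# Minimal stand-in for bzrlib's StaticTuple so this sketch is self-contained;
# the real type is a C extension with a pure-Python fallback.
class StaticTuple(tuple):
    __slots__ = ()
    def __new__(cls, *args):
        return tuple.__new__(cls, args)
    @classmethod
    def from_sequence(cls, seq):
        return tuple.__new__(cls, seq)

def serialize(rows):
    # marshal only understands built-in types, so cast every StaticTuple
    # back to a plain tuple before dumping.
    plain = [tuple(tuple(item) for item in row) for row in rows]
    return zlib.compress(marshal.dumps(plain))

def deserialize(blob):
    # Bind from_sequence to a local name: saves an attribute lookup per call.
    as_st = StaticTuple.from_sequence
    return [StaticTuple(*[as_st(item) for item in row])
            for row in marshal.loads(zlib.decompress(blob))]
```

This is exactly the conversion cost item 1 worries about: every cache read and write walks the whole structure twice.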

Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

> It's not possible to pickle them either, so we can't switch to that.

If we decide to go that way, it is, of course, possible to add pickling
support to StaticTuple. I just meant that isn't supported now.

Turns out it's really trivial to implement, too. (Well, as trivial as
anything can be in C. It's a one-liner in Python. So I can probably hack
a C version together in...a few weeks? :-P )

401. By Matt Nordhoff

Merge trunk, fixing conflicts.

402. By Matt Nordhoff

Merge trunk, fixing conflict

Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

Matt Nordhoff wrote:
> Also, you need to run it against lp:~jameinel/bzr/2.1-st-concat,
> which has a proposal up for merging to bzr.dev.

FYI, this has landed now. Pickle support has been approved, so it will
hopefully land soon as well.

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Matt Nordhoff wrote:
> Matt Nordhoff has proposed merging lp:~mnordhoff/loggerhead/statictuples into lp:loggerhead.
>
> Requested reviews:
> Loggerhead-team (loggerhead-team)
>
>
> This totally isn't ready to merge (though it does work 100%, AFAICT), but I wanted to bring it to your attention.
>
> This branch changes Loggerhead to make use of StaticTuples, jam's recent addition to bzrlib. (If you don't know, they're very similar to tuples, but they are faster and save memory by only holding certain types of objects that can't get in reference cycles, so they can avoid the GC overhead.)

Thanks for poking at this.

> I'm bundling a copy of the Python version of StaticTuple for compatibility with older bzr versions. This has a performance cost, but hopefully it's minor, and worth it.
>
> I don't really have information on how much of a savings this is; I haven't been running it long enough for memory usage to stabilize. But I figure it should reduce the number of tuples by 50-75%, which is 100,000-300,000 for me.
>
> BTW, I fixed a missing import since the last time Launchpad mirrored this branch.
>
> Also, you need to run it against lp:~jameinel/bzr/2.1-st-concat, which has a proposal up for merging to bzr.dev.
>
> What's left to do:
>
> 1.) It's not possible to marshal StaticTuples, so RevInfoDiskCache currently uses listcomps to convert to/from regular tuples when serializing/deserializing. This probably has a performance impact, but all I cared about at the time was getting it working.

I'd definitely want to measure the performance impact of this on a
launchpad-sized history before landing this.

> It's not possible to pickle them either, so we can't switch to that. Besides, pickling uses more disk space than marshal. (As a quick test, I took a random row from revinfo.sql. Without changing from tuple to StaticTuple, changing it from marshal+zlib to pickle+zlib (using pickle protocol version 2) went from 39 KB to 54 KB.)

FWIW, the reason I went with marshal is that it's so much quicker than
anything else I tried. I have this recollection that cPickle was a
bunch slower.

Really though, we need to find a tech that allows us to not deserialize
the whole graph all the time -- a more sophisticated schema or
something. I have other reasons to want to use a cache that can work
over the network -- maybe couch? -- but I don't know when I'll get time
to work on it :(

> 2.) Currently I work around StaticTuple.from_sequence() only supporting sequences, not generators. I'll poke jam about that next time I see him.
>
> 3.) There weren't very many of them, but I saw some tuples like ('replace', 0, 1, 0, 1) in memory, and want to StaticTupleify them too, but don't know where they come from.

I'd guess that's to do with diff generation.

> 4.) I convert the results of iter_ancestry to StaticTuples. I should probably send a patch to bzr.dev, or maybe just leave them as normal tuples, to save the cost of converting them (though it's probably very minor).
>
> 4.) Maybe some things should be interned? I dunno.

I guess this might help if you are looking at branches that share ancestry.

> 5.) Some of the things I StaticTupled probably...

Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

Michael Hudson wrote:
> Matt Nordhoff wrote:
>> Matt Nordhoff has proposed merging lp:~mnordhoff/loggerhead/statictuples into lp:loggerhead.
>>
>> Requested reviews:
>> Loggerhead-team (loggerhead-team)
>>
>>
>> This totally isn't ready to merge (though it does work 100%, AFAICT), but I wanted to bring it to your attention.
>>
>> This branch changes Loggerhead to make use of StaticTuples, jam's recent addition to bzrlib. (If you don't know, they're very similar to tuples, but they are faster and save memory by only holding certain types of objects that can't get in reference cycles, so they can avoid the GC overhead.)
>
> Thanks for poking at this.
>
>> I'm bundling a copy of the Python version of StaticTuple for compatibility with older bzr versions. This has a performance cost, but hopefully it's minor, and worth it.
>>
>> I don't really have information on how much of a savings this is; I haven't been running it long enough for memory usage to stabilize. But I figure it should reduce the number of tuples by 50-75%, which is 100,000-300,000 for me.
>>
>> BTW, I fixed a missing import since the last time Launchpad mirrored this branch.
>>
>> Also, you need to run it against lp:~jameinel/bzr/2.1-st-concat, which has a proposal up for merging to bzr.dev.
>>
>> What's left to do:
>>
>> 1.) It's not possible to marshal StaticTuples, so RevInfoDiskCache currently uses listcomps to convert to/from regular tuples when serializing/deserializing. This probably has a performance impact, but all I cared about at the time was getting it working.
>
> I'd definitely want to measure the performance impact of this on a
> launchpad-sized history before landing this.

I don't have a copy of Launchpad, but I'd be happy to do some sort of
simple, mostly-good-enough test (:-P) on something less scary-big like
bzr.dev or Python.

>> It's not possible to pickle them either, so we can't switch to that. Besides, pickling uses more disk space than marshal. (As a quick test, I took a random row from revinfo.sql. Without changing from tuple to StaticTuple, changing it from marshal+zlib to pickle+zlib (using pickle protocol version 2) went from 39 KB to 54 KB.)
>
> FWIW, the reason I went with marshal is that it's so much quicker than
> anything else I tried. I have this recollection that cPickle was a
> bunch slower.

Oh, thanks for the explanation.

> Really though, we need to find a tech that allows us to not deserialize
> the whole graph all the time -- a more sophisticated schema or
> something. I have other reasons to want to use a cache that can work
> over the network -- maybe couch? -- but I don't know when I'll get time
> to work on it :(

That sounds good, but it's over my head. It's definitely not going to be
a part of this patch. :-P

>> 2.) Currently I work around StaticTuple.from_sequence() only supporting sequences, not generators. I'll poke jam about that next time I see him.
>>
>> 3.) There weren't very many of them, but I saw some tuples like ('replace', 0, 1, 0, 1) in memory, and want to StaticTupleify them too, but don't know where they come from.
>
> I'd guess that's to do with diff generation.

I just tracked it down. It comes f...

403. By Matt Nordhoff

Update _static_tuple_py.py to lp:~mnordhoff/bzr/statictuple-pickling r4779, and update the TODO that says pickling is not supported.

(revid: <email address hidden>)

Revision history for this message
John A Meinel (jameinel) wrote :

I'm obviously not an official reviewer, but I figured I'd poke my head in here.

I'm a bit surprised at some of the changes that seem pretty unrelated. Such as:
195 if (base < 100) and (dot != 0):
196 - out += '.%d' % (dot,)
197 + out += '.%d' % (dot)

(I realize that the latter probably works, but if 'dot' ever ends up as a tuple, you'll get a rather cryptic failure mode.)

Also, I would be quite skeptical about this change:
 @classmethod
223 - def is_installed(cls):
224 - return os.environ.get(cls._reloader_environ_key)
225 + def is_installed(self):
226 + return os.environ.get(self._reloader_environ_key)
227

^- You are changing a "classmethod" to use the "self" attribute, which is, at best, confusing. You don't have an instance here, you just have the Class object, hence why the original author used "cls" to clarify that.
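
For reference, the original form John is defending looks like this (the environ key name below is illustrative, not loggerhead's actual one):

```python
import os

class Reloader(object):
    # Illustrative key name; the real one lives in loggerhead's reloader code.
    _reloader_environ_key = 'LOGGERHEAD_RELOADER_DEMO'

    @classmethod
    def is_installed(cls):
        # There is no instance here: the method is invoked on the class, so
        # the first argument is the class itself, conventionally named 'cls'.
        return os.environ.get(cls._reloader_environ_key)
```

Naming it `self` would still run, but it misleadingly suggests an instance exists.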

279 - revision_graph[key] = tuple(parent for parent in parents if parent
280 - in revision_graph)
281 + # TODO: It'd be nice if from_sequence supported generators.
282 + revision_graph[key] = StaticTuple.from_sequence(tuple(
283 + parent for parent in parents if parent in revision_graph))

^- It probably wouldn't be hard to add this sort of thing to FromSequence. Namely change these lines:
    if (!PySequence_Check(sequence)) {
        PyErr_Format(PyExc_TypeError, "Type %s is not a sequence type",
                     Py_TYPE(sequence)->tp_name);
        return NULL;
    }

to

PyObject *as_tuple = NULL;

if (!PySequence_Check(sequence)) {
  as_tuple = PySequence_Tuple(sequence);
  sequence = as_tuple;
}

...

Py_XDECREF(as_tuple);

Here are some performance numbers for a Launchpad tree. This is ~66k revisions.

 61 msec TIMEIT "zlib.decompress(open('test.data', 'rb').read())"
139 msec TIMEIT "o = marshal.loads(zlib.decompress(open('test.data', 'rb').read()))"
  (78 msec to marshal.loads())

 76 msec TIMEIT "zlib.decompress(open('test.pickle2.z', 'rb').read())"
273 msec TIMEIT "o = cPickle.loads(....)"
  (196 msec to cPickle.loads())

So marshal is *almost* as fast as zlib on the same data.

 97 msec TIMEIT "zlib.decompress(...'test.json.z')"
356 msec TIMEIT "simplejson.loads(...'test.json.z')"

(the raw json is 16MB)

 94 msec TIMEIT "zlib.decompress(...'test.bencode.z')"
271 msec TIMEIT "o = _bencode_pyx.bdecode(...'test.bencode.z')"

So bdecode is as fast as cPickle, and we can probably teach bdecode(as_tuples=True) to return StaticTuple objects. Though:

306 msec TIMEIT "o = _bencode_pyx.bdecode_as_tuple(...)"

 92 msec TIMEIT "zlib.decompress(...'test.pickle2-st.z')"
335 msec TIMEIT "o = cPickle.loads(...'test.pickle2-st.z')"

So the StaticTuple form is slightly slower than the tuple form. My guess is that cPickle has a bit of extra overhead when you don't use built-in types. (Like it reads in as a list/tuple, and then passes the data to the constructor.)

Now, what about using marshal plus casting to StaticTuples:

o = marshal.loads(zlib.decompress(...))
st = static_tuple.StaticTuple
as_st = static_tuple.StaticTuple.from_sequence
o2 = [st(as_st(x[0]), as_st(x[1]), as_st(x[2])) for x in o]

256 msec TIMEIT "o2 = ..."

So it turns out that using 'marshal' + casting the data into StaticTuple instances is faste...

Revision history for this message
John A Meinel (jameinel) wrote :

I should also note. I tried hard to make a pure-python StaticTuple just inherit from a regular tuple, so that they would hopefully not have any overhead versus a normal tuple. However, note that we're currently working out some details, and it may have a small amount of overhead.

Revision history for this message
John A Meinel (jameinel) wrote :

John A Meinel wrote:
> I should also note. I tried hard to make a pure-python StaticTuple just inherit from a regular tuple, so that they would hopefully not have any overhead versus a normal tuple. However, note that we're currently working out some details, and it may have a small amount of overhead.

I just added this proposal:

https://code.launchpad.net/~jameinel/bzr/2.1-pure-static-tuple/+merge/14006

I'm still seeing weird results on Windows. But Andrew Bennetts clarified
that it does, indeed, save him 8 bytes per object on Linux w/ python
2.6.4rc2. (for me, they are always 8-bytes bigger on python 2.6.2 and
2.6.4.)

John
=:->

Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

John A Meinel wrote:
> I'm obviously not an official reviewer, but I figured I'd poke my head in here.

Thank you! :-)

> I'm a bit surprised at some of the changes that seem pretty unrelated. Such as:
> 195 if (base < 100) and (dot != 0):
> 196 - out += '.%d' % (dot,)
> 197 + out += '.%d' % (dot)
>
> (I realize that the latter probably works, but if 'dot' ever ends up as a tuple, you'll get a rather cryptic failure mode.)
>
> Also, I would be quite skeptical about this change:
> @classmethod
> 223 - def is_installed(cls):
> 224 - return os.environ.get(cls._reloader_environ_key)
> 225 + def is_installed(self):
> 226 + return os.environ.get(self._reloader_environ_key)
> 227
>
> ^- You are changing a "classmethod" to use the "self" attribute, which is, at best, confusing. You don't have an instance here, you just have the Class object, hence why the original author used "cls" to clarify that.

I didn't make those changes intentionally. Looks like I screwed up when
resolving merge conflicts in that file and accidentally reversed some
changes. I'll fix it now.

Thanks for mentioning it; I thought it was just LP being funny so I
hadn't checked it out.

> 279 - revision_graph[key] = tuple(parent for parent in parents if parent
> 280 - in revision_graph)
> 281 + # TODO: It'd be nice if from_sequence supported generators.
> 282 + revision_graph[key] = StaticTuple.from_sequence(tuple(
> 283 + parent for parent in parents if parent in revision_graph))
>
> ^- It probably wouldn't be hard to add this sort of thing to FromSequence. Namely change these lines:
> if (!PySequence_Check(sequence)) {
> PyErr_Format(PyExc_TypeError, "Type %s is not a sequence type",
> Py_TYPE(sequence)->tp_name);
> return NULL;
> }
>
> to
>
> PyObject *as_tuple = NULL;
>
> if (!PySequence_Check(sequence)) {
> as_tuple = PySequence_Tuple(sequence);
> sequence = as_tuple;
> }
>
> ...
>
> Py_XDECREF(as_tuple);

Oh, cool!

(Are you suggesting you'll do it or I do it? Cuz I don't mind, but if
you were going to...)

(If it supports arbitrary iterables, it could be renamed to
"from_iterable", though I guess it makes sense to keep it
"from_sequence" for backwards compatibility and to discourage passing
arbitrary iterables since it's less efficient...)

> Here are some performance numbers for a Launchpad tree. This is ~66k revisions.

[snip lots of numbers]

> So it turns out that using 'marshal' + casting the data into StaticTuple instances is faster than using cPickle.

Thanks so much for doing this analysis!

I'm not awake enough to read it yet, but I *can* understand your summary
at the end. :-P

> As for whether things should be interned... I would say that you probably only benefit if you are sharing between branches. Namely, the entries up there are:
> [(merge-info, where-merged, parents)].
> `merge-info` is (seq, revid, merge_depth, revno_str, end_of_merge) --
> like a merged sorted list, but the revno is stringified.
> `where-merged` is a tuple of revisions that have this revision as a
> non-lefthand parent. Finally, `parents` is just the usual list of
> parents of this revis...

404. By Matt Nordhoff

Fix some changes I accidentally reverted when merging the trunk.

405. By Matt Nordhoff

Bind StaticTuple.from_sequence to "as_st" when using it in loops.

406. By Matt Nordhoff

Make each...item in the revgraph cache a StaticTuple instead of a list.

This is a little ugly because the items actually get mutated while the cache is being generated, but it's not a big deal.

Even if we decide to back this out later for performance, we should keep the changes to changecache.py: There's no reason they have to stay lists after being roundtripped through marshal.

I benchmarked compute_whole_history_data(bzr.dev) and any difference was within the normal variations, which was only 3%. So it's probably okay.

I didn't actually check what the memory difference is.

Revision history for this message
John A Meinel (jameinel) wrote :

...

>

>>
>> PyObject *as_tuple = NULL;
>>
>> if (!PySequence_Check(sequence)) {
>> as_tuple = PySequence_Tuple(sequence);
>> sequence = as_tuple;
>> }
>>
>> ...
>>
>> Py_XDECREF(as_tuple);
>
> Oh, cool!
>
> (Are you suggesting you'll do it or I do it? Cuz I don't mind, but if
> you were going to...)
>
> (If it supports arbitrary iterables, it could be renamed to
> "from_iterable", though I guess it makes sense to keep it
> "from_sequence" for backwards compatibility and to discourage passing
> arbitrary iterables since it's less efficient...)

So I should comment a bit more. The code you have is something like:

for r, parents in graph.iteritems():
  val = tuple(p for p in parents if p in graph)

However, it turns out that list comprehensions are faster than
generators. Mostly because a generator has a 'context' that list
comprehensions don't have.

In essence if you use a generator, it adds the overhead of 1 function
call. So doing the above as

for r, parents in graph.iteritems():
  val = tuple([p for p in parents if p in graph])

Should actually be faster. And has the distinct advantage that it
generates a sequence rather than an iterable

for r, parents in graph.iteritems():
  val = StaticTuple.from_sequence([p for p in parents if p in graph])

just works :)

The only reason not to use a list comprehension over a generator is when
the number of items would cause significant memory overhead. That
certainly isn't true in the case of parent graphs, where you only have a
small handful of keys.
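
John's claim is easy to check with a micro-benchmark (plain tuples here, so it is self-contained; exact timings will vary by machine):

```python
import timeit

parents = ['rev-%d' % i for i in range(8)]
graph = dict.fromkeys(parents[:5])  # pretend only some parents are known

# Generator expression: tuple() pulls each item through an extra
# generator frame, one function-call's worth of overhead.
t_gen = timeit.timeit(lambda: tuple(p for p in parents if p in graph),
                      number=20000)

# List comprehension: builds a real list in one pass; a list is also a
# sequence, so StaticTuple.from_sequence() would accept it unchanged.
t_list = timeit.timeit(lambda: tuple([p for p in parents if p in graph]),
                       number=20000)

print('genexp: %.4fs  listcomp: %.4fs' % (t_gen, t_list))
```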
...

> If we intern things, it probably makes sense to intern the whole 3-tuple.
>
> Interning things bothers me, though, since _static_tuple_py interns
> forever and I don't want to increase memory usage that much for people
> with older bzrlibs.
>
> (Also, interning hurts my brain a bit. :-P )

It is up to you, but it saves *tons* of memory otherwise. You could use
a flag based on whether you have the compiled extensions...

>
>> I should also note that here:
>> 104 + data_statictuples = [[StaticTuple.from_sequence(t) for t in l] for
>> 105 + l in data_tuples]
>>
>> ^- You also almost definitely want to use a StaticTuple instead of that inner list. The cost of the list will otherwise be most of your savings from using ST. If it is always a 3-sequence list, you can use my form, or you could use another ".from_sequence()".
>
> Yeah, it uses the inner list because `where-merged` is populated in a
> second loop. I pretty much don't understand that code
> (loggerhead/wholehistory.py) or the performance implications of changing
> it, so I haven't looked into it yet.
>
> Hmm, a quick hack is easier than I thought. I'll check it out right now.
>
>> I'll also note that you really do want to create a local variable for "StaticTuple.from_sequence" otherwise you pay an attribute lookup for every call.
>
> Ehh, didn't figure that was significant. OK, will do.
>
>> In short, using StaticTuple.from_sequence(marshal.loads()) costs (on my machine, with Launchpad's repo) 256ms up from 138ms for just marshal, up from 61ms to just zlib.decompress.
>>
>> The bigger question is whether it is worthwhile or no...

407. By Matt Nordhoff

Update _static_tuple_py.py to r4772 of bzr.dev

408. By Matt Nordhoff

Update/remove TODOs.

409. By Matt Nordhoff

Ah, iter_ancestry does return StaticTuples now. No need to convert them.

(Well, it likely doesn't on pre-CHK repo formats, but I don't care much.)

Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

(Oh, forgot about replying to this again.)

John A Meinel wrote:
> So I should comment a bit more. The code you have is something like:
>
> for r, parents in graph.iteritems():
> val = tuple(p for p in parents if p in graph)
>
> However, it turns out that list comprehensions are faster than
> generators. Mostly because a generator has a 'context' that list
> comprehensions don't have.
>
> In essence if you use a generator, it adds the overhead of 1 function
> call. So doing the above as
>
> for r, parents in graph.iteritems():
> val = tuple([p for p in parents if p in graph])
>
> Should actually be faster. And has the distinct advantage that it
> generates a sequence rather than an iterable
>
> for r, parents in graph.iteritems():
> val = StaticTuple.from_sequence([p for p in parents if p in graph])
>
> just works :)
>
> The only reason not to use a list comprehension over a generator is when
> the number of items would cause significant memory overhead. That
> certainly isn't true in the case of parent graphs, where you only have a
> small handful of keys.
> ...

OK. Done.

Although I didn't benchmark it; my system is rather overloaded right
now. And I'm lazy. :-P

>> If we intern things, it probably makes sense to intern the whole 3-tuple.
>
>> Interning things bothers me, though, since _static_tuple_py interns
>> forever and I don't want to increase memory usage that much for people
>> with older bzrlibs.
>
>> (Also, interning hurts my brain a bit. :-P )
>
> It is up to you, but it saves *tons* of memory otherwise. You could use
> a flag based on whether you have the compiled extensions...

Eh. I'll work on interning next.

I actually wrote the code for using a flag based on whether the compiled
extensions are available, but I'm not sure I'll use it yet. I don't like
the overhead. :-P

I wonder how much of an impact interning the stringified revnos would
have? They're tiny, but they add up, and most of them are duplicates...

>> Keeping a bunch of free tuples around is probably acceptable, since they
>> will be reused next time a cache gets loaded, but it's still rather
>> unfortunate. Huh.
>
> It is only 2000, though. Versus the 60,000 or so that are created. So it
> isn't a huge overhead.

2000 tuples is _nothing_. There are already 50-60k tuples in memory
before Loggerhead gets a single request, from all of the libraries and
everything.

> John
> =:->
--

Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

Matt Nordhoff wrote:
> I wonder how much of an impact interning the stringified revnos would
> have? They're tiny, but they add up, and most of them are duplicates...

Although interning them when they're unmarshalled will lead to one ugly
list comprehension...
--

Revision history for this message
John A Meinel (jameinel) wrote :

Matt Nordhoff wrote:
> Matt Nordhoff wrote:
>> I wonder how much of an impact interning the stringified revnos would
>> have? They're tiny, but they add up, and most of them are duplicates...
>
> Although interning them when they're unmarshalled will lead to one ugly
> list comprehension...

as_st(x).intern()

isn't all that bad.

st(as_st(x[0]).intern(), as_st(x[1]).intern(),
   as_st(x[2]).intern()).intern()

Isn't great, but it isn't terrible, either.

John
=:->

Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

John A Meinel wrote:
> Matt Nordhoff wrote:
>> Matt Nordhoff wrote:
>>> I wonder how much of an impact interning the stringified revnos would
>>> have? They're tiny, but they add up, and most of them are duplicates...
>> Although interning them when they're unmarshalled will lead to one ugly
>> list comprehension...
>
> as_st(x).intern()
>
> isn't all that bad.
>
> st(as_st(x[0]).intern(), as_st(x[1]).intern(),
> as_st(x[2]).intern()).intern()
>
> Isn't great, but it isn't terrible, either.
>
> John
> =:->

Yeah, but interning the individual items in the tuples would be.
--

410. By Matt Nordhoff

Replace some genexps with listcomps. Hopefully faster.

I haven't tested it (my system is way too bogged down for accurate benchmarks right now), but I have it lying around, and constantly (un)shelving it is inconvenient. :P

411. By Matt Nordhoff

Go back to converting iter_ancestry to ST to be safe.

412. By Matt Nordhoff

Ehh, convert the iter_ancestry thing back to a generator expression.

It's a lot of data, and besides, one little function call is nothing compared to the rest of what iter_ancestry is doing.

413. By Matt Nordhoff

Fix some 3-space indentation

414. By Matt Nordhoff

StaticTuple interning!

I also made some of the code longer but hopefully more readable.

Testing compute_whole_history_data(bzr.dev), this does help, but it's not as much as I was expecting.

OTOH, a list and dict each 28k items long is not free.

Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

Comparing the speed of RevInfoDiskCache on lp:launchpad/devel in the trunk and r415 of my branch, in seconds:

        trunk   statictuples
get     0.28    1.33
set     1.25    1.40

That's unfortunate. :\

Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

Same thing, on lp:bzr:

        trunk   statictuples
get     0.106   0.486
set     0.491   1.4

Sigh.

415. By Matt Nordhoff

Style tweak.

This either has a negligible impact on performance or a very slight positive impact.

Curiously, making the same change to set() has a slight negative impact (~1.39 -> ~1.58 secs for LP).

(Boy, I've had this sitting on my hard drive since 2009-11-05T21:01:27Z without fully testing it...)

416. By Matt Nordhoff

Pull in _static_tuple_py.py from lp:bzr r4842.

417. By Matt Nordhoff

Merge lp:loggerhead

418. By Matt Nordhoff

Pull in _static_tuple_py.py from lp:bzr r5055

419. By Matt Nordhoff

Merge lp:loggerhead

Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

So, I haven't worked on this in a long time. In my opinion, this is
what's blocking it (I'm probably forgetting some things, though):

* The disk caching code has to convert between StaticTuples and normal
tuples, which is ugly and quite slow.

* For people without C extensions, or with an old version of bzr,
_static_tuple_py adds some overhead. It's not much, but Loggerhead has a
frightening number of tuples, so it could be a problem.

* I'm not 100% sure StaticTuple's intern cache (a SimpleSet or dict) is
entirely thread-safe. IIRC the answer was "the GIL should handle it",
but I _did_ once have a crash from some sort of bizarre corruption.

* The interning. Honestly, interning hurts my brain. I'm not sure what
all of the consequences are. In particular, _static_tuple_py's cache
never, ever removes things, so it could get quite large. Plus, some
other things could probably be interned (e.g. revno strings).

* There are some TODO comments about minor things that should either be
removed or actually fixed.
--
Matt Nordhoff

420. By Matt Nordhoff

Merge lp:loggerhead

421. By Matt Nordhoff

Merge lp:loggerhead

Unmerged revisions

421. By Matt Nordhoff

Merge lp:loggerhead

420. By Matt Nordhoff

Merge lp:loggerhead

419. By Matt Nordhoff

Merge lp:loggerhead

418. By Matt Nordhoff

Pull in _static_tuple_py.py from lp:bzr r5055

417. By Matt Nordhoff

Merge lp:loggerhead

416. By Matt Nordhoff

Pull in _static_tuple_py.py from lp:bzr r4842.

415. By Matt Nordhoff

Style tweak.

This either has a negligible impact on performance or a very slight positive impact.

Curiously, making the same change to set() has a slight negative impact (~1.39 -> ~1.58 secs for LP).

(Boy, I've had this sitting on my hard drive since 2009-11-05T21:01:27Z without fully testing it...)

414. By Matt Nordhoff

StaticTuple interning!

I also made some of the code longer but hopefully more readable.

Testing compute_whole_history_data(bzr.dev), this does help, but it's not as much as I was expecting.

OTOH, a list and dict each 28k items long is not free.

413. By Matt Nordhoff

Fix some 3-space indentation

412. By Matt Nordhoff

Ehh, convert the iter_ancestry thing back to a generator expression.

It's a lot of data, and besides, one little function call is nothing compared to the rest of what iter_ancestry is doing.


Preview Diff

=== added file 'loggerhead/_static_tuple_py.py'
--- loggerhead/_static_tuple_py.py 1970-01-01 00:00:00 +0000
+++ loggerhead/_static_tuple_py.py 2010-05-06 10:33:25 +0000
@@ -0,0 +1,80 @@
+# Copyright (C) 2009, 2010 Canonical Ltd
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+
+"""The pure-python implementation of the StaticTuple type.
+
+Note that it is generally just implemented as using tuples of tuples of
+strings.
+"""
+
+
+class StaticTuple(tuple):
+    """A static type, similar to a tuple of strings."""
+
+    __slots__ = ()
+
+    def __new__(cls, *args):
+        # Make the empty StaticTuple a singleton
+        if not args and _empty_tuple is not None:
+            return _empty_tuple
+        return tuple.__new__(cls, args)
+
+    def __init__(self, *args):
+        """Create a new 'StaticTuple'"""
+        num_keys = len(args)
+        if num_keys < 0 or num_keys > 255:
+            raise TypeError('StaticTuple(...) takes from 0 to 255 items')
+        for bit in args:
+            if type(bit) not in (str, StaticTuple, unicode, int, long, float,
+                                 None.__class__, bool):
+                raise TypeError('StaticTuple can only point to'
+                    ' StaticTuple, str, unicode, int, long, float, bool, or'
+                    ' None not %s' % (type(bit),))
+        # We don't need to pass args to tuple.__init__, because that was
+        # already handled in __new__.
+        tuple.__init__(self)
+
+    def __repr__(self):
+        return '%s%s' % (self.__class__.__name__, tuple.__repr__(self))
+
+    def __reduce__(self):
+        return (StaticTuple, tuple(self))
+
+    def __add__(self, other):
+        """Concatenate self with other"""
+        return StaticTuple.from_sequence(tuple.__add__(self,other))
+
+    def as_tuple(self):
+        return tuple(self)
+
+    def intern(self):
+        return _interned_tuples.setdefault(self, self)
+
+    @staticmethod
+    def from_sequence(seq):
+        """Convert a sequence object into a StaticTuple instance."""
+        if isinstance(seq, StaticTuple):
+            # it already is
+            return seq
+        return StaticTuple(*seq)
+
+
+
+# Have to set it to None first, so that __new__ can determine whether
+# the _empty_tuple singleton has been created yet or not.
+_empty_tuple = None
+_empty_tuple = StaticTuple()
+_interned_tuples = {}

=== modified file 'loggerhead/changecache.py'
--- loggerhead/changecache.py 2009-05-02 14:01:05 +0000
+++ loggerhead/changecache.py 2010-05-06 10:33:25 +0000
@@ -37,6 +37,8 @@
 except ImportError:
     from pysqlite2 import dbapi2
 
+from loggerhead.util import StaticTuple
+
 # We take an optimistic approach to concurrency here: we might do work twice
 # in the case of races, but not crash or corrupt data.
 
@@ -141,13 +143,20 @@
         elif str(row[0]) != revid:
             return None
         else:
-            return marshal.loads(zlib.decompress(row[1]))
+            data_tuples = marshal.loads(zlib.decompress(row[1]))
+            as_st = StaticTuple.from_sequence
+            data_statictuples = [StaticTuple(
+                as_st(a).intern(), as_st(b).intern(),
+                as_st(c).intern()).intern() for (a, b, c) in data_tuples]
+            return data_statictuples
 
     def set(self, key, revid, data):
+        data_tuples = [
+            (x[0].as_tuple(), x[1].as_tuple(), x[2].as_tuple()) for x in data]
         try:
             self.cursor.execute(
                 'delete from data where key = ?', (dbapi2.Binary(key), ))
-            blob = zlib.compress(marshal.dumps(data))
+            blob = zlib.compress(marshal.dumps(data_tuples))
             self.cursor.execute(
                 "insert into data (key, revid, data) values (?, ?, ?)",
                 map(dbapi2.Binary, [key, revid, blob]))

=== modified file 'loggerhead/controllers/filediff_ui.py'
--- loggerhead/controllers/filediff_ui.py 2009-07-07 23:52:24 +0000
+++ loggerhead/controllers/filediff_ui.py 2010-05-06 10:33:25 +0000
@@ -8,6 +8,9 @@
 from loggerhead import util
 from loggerhead.controllers import TemplatedBranchView
 
+# TODO: There are some tuples created by the sequence matcher in
+# bzrlib.patiencediff that it would be nice to StaticTuplize.
+
 def _process_diff(difftext):
     chunks = []
     chunk = None

=== modified file 'loggerhead/history.py'
--- loggerhead/history.py 2010-04-27 15:45:52 +0000
+++ loggerhead/history.py 2010-05-06 10:33:25 +0000
@@ -586,9 +586,9 @@
             if revnos in d:
                 m = d[revnos][0]
                 if revnolast < m:
-                    d[revnos] = (revnolast, revid)
+                    d[revnos] = util.StaticTuple(revnolast, revid)
             else:
-                d[revnos] = (revnolast, revid)
+                d[revnos] = util.StaticTuple(revnolast, revid)
 
         return [revid for (_, revid) in d.itervalues()]
 
@@ -603,6 +603,7 @@
             for p in change.merge_points:
                 fetch_set.add(p.revid)
         p_changes = self.get_changes(list(fetch_set))
+        # TODO: Can't use StaticTuple here because of the Containers.
         p_change_dict = dict([(c.revid, c) for c in p_changes])
         for p in change.parents:
             if p.revid in p_change_dict:

=== modified file 'loggerhead/util.py'
--- loggerhead/util.py 2010-04-24 12:40:17 +0000
+++ loggerhead/util.py 2010-05-06 10:33:25 +0000
@@ -40,6 +40,13 @@
 
 from simpletal.simpleTALUtils import HTMLStructureCleaner
 
+# StaticTuple was added in bzr 2.1.0. We bundle a copy of
+# bzrlib/_static_tuple_py.py in case it's not available.
+try:
+    from bzrlib.static_tuple import StaticTuple
+except ImportError:
+    from loggerhead._static_tuple_py import StaticTuple
+
 log = logging.getLogger("loggerhead.controllers")
 
 

=== modified file 'loggerhead/wholehistory.py'
--- loggerhead/wholehistory.py 2009-10-21 14:40:23 +0000
+++ loggerhead/wholehistory.py 2010-05-06 10:33:25 +0000
@@ -23,6 +23,8 @@
 from bzrlib.revision import is_null, NULL_REVISION
 from bzrlib.tsort import merge_sort
 
+from loggerhead.util import StaticTuple
+
 
 def _strip_NULL_ghosts(revision_graph):
     """
@@ -32,9 +34,10 @@
     # Filter ghosts, and null:
     if NULL_REVISION in revision_graph:
         del revision_graph[NULL_REVISION]
+    as_st = StaticTuple.from_sequence
     for key, parents in revision_graph.iteritems():
-        revision_graph[key] = tuple(parent for parent in parents if parent
-                                    in revision_graph)
+        revision_graph[key] = as_st([
+            parent for parent in parents if parent in revision_graph])
     return revision_graph
 
 
@@ -51,7 +54,11 @@
             (branch.get_config().get_nickname(),))
 
     graph = branch.repository.get_graph()
-    parent_map = dict((key, value) for key, value in
+    # Note: iter_ancestry returns StaticTuples in 2.1.0, but I don't know if
+    # it's only on CHK repos, and we have to be compatible with older bzrlibs
+    # anyway.
+    as_st = StaticTuple.from_sequence
+    parent_map = dict((key, as_st(value).intern()) for key, value in
                       graph.iter_ancestry([last_revid]) if value is not None)
 
     _revision_graph = _strip_NULL_ghosts(parent_map)
@@ -67,18 +74,33 @@
 
     for info in _merge_sort:
         seq, revid, merge_depth, revno, end_of_merge = info
-        revno_str = '.'.join(str(n) for n in revno)
+        revno_str = '.'.join([str(n) for n in revno])
         parents = _revision_graph[revid]
         _rev_indices[revid] = len(_rev_info)
-        _rev_info.append([(seq, revid, merge_depth, revno_str, end_of_merge), (), parents])
+        # TODO: Do something about interning the outer tuple. The problem is,
+        # it may/will be modified by the loop below, so it's a waste to intern
+        # it before that happens.
+        # TODO: I'm 99% sure "parents" is the same StaticTuple that was
+        # interned above, but I'd like to verify it.
+        _rev_info.append(StaticTuple(
+            StaticTuple(
+                seq, revid, merge_depth, revno_str, end_of_merge).intern(),
+            StaticTuple(), parents.intern()))
 
     for revid in _revision_graph.iterkeys():
-        if _rev_info[_rev_indices[revid]][0][2] == 0:
+        revid_index = _rev_indices[revid]
+        if _rev_info[revid_index][0][2] == 0:
             continue
         for parent in _revision_graph[revid]:
-            c = _rev_info[_rev_indices[parent]]
+            # TODO: Rebuilding 'c' is a bit ugly.
+            parent_index = _rev_indices[parent]
+            c = _rev_info[parent_index]
             if revid not in c[1]:
-                c[1] = c[1] + (revid,)
+                _rev_info[parent_index] = StaticTuple(
+                    # TODO: Interning c[1] and StaticTuple(revid) is not ideal,
+                    # since the loop might modify them again, but hopefully
+                    # it's a net benefit.
+                    c[0], (c[1] + StaticTuple(revid).intern()).intern(), c[2])
 
     log.info('built revision graph cache: %r secs' % (time.time() - z,))

