Code review comment for lp:~mnordhoff/loggerhead/statictuples

Matt Nordhoff (mnordhoff) wrote:

John A Meinel wrote:
> I'm obviously not an official reviewer, but I figured I'd poke my head in here.

Thank you! :-)

> I'm a bit surprised at some of the changes that seem pretty unrelated. Such as:
> 195 if (base < 100) and (dot != 0):
> 196 - out += '.%d' % (dot,)
> 197 + out += '.%d' % (dot)
>
> (I realize that the latter probably works, but if 'dot' ever ends up as a tuple, you'll get a rather cryptic failure mode.)
>
> Also, I would be quite skeptical about this change:
> @classmethod
> 223 - def is_installed(cls):
> 224 - return os.environ.get(cls._reloader_environ_key)
> 225 + def is_installed(self):
> 226 + return os.environ.get(self._reloader_environ_key)
> 227
>
> ^- You are changing a "classmethod" to use the "self" attribute, which is, at best, confusing. You don't have an instance here, you just have the Class object, hence why the original author used "cls" to clarify that.

I didn't make those changes intentionally. Looks like I screwed up when
resolving merge conflicts in that file and accidentally reversed some
changes. I'll fix it now.

Thanks for mentioning it; I thought it was just LP being funny so I
hadn't checked it out.
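
(For the record, a quick sketch of the failure mode you mean, with
made-up values; the exact TypeError wording varies by Python version:)

    >>> dot = 5
    >>> '.%d' % (dot,)
    '.5'
    >>> '.%d' % (dot)      # the parentheses are redundant, so this still works
    '.5'
    >>> dot = (5,)         # now suppose dot unexpectedly ends up as a tuple
    >>> '.%d' % (dot,)     # the trailing-comma form fails loudly
    Traceback (most recent call last):
      ...
    TypeError: %d format: a number is required, not tuple
    >>> '.%d' % (dot)      # this form silently unpacks the tuple, hiding the bug
    '.5'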

> 279 - revision_graph[key] = tuple(parent for parent in parents if parent
> 280 - in revision_graph)
> 281 + # TODO: It'd be nice if from_sequence supported generators.
> 282 + revision_graph[key] = StaticTuple.from_sequence(tuple(
> 283 + parent for parent in parents if parent in revision_graph))
>
> ^- It probably wouldn't be hard to add this sort of thing to FromSequence. Namely change these lines:
> if (!PySequence_Check(sequence)) {
>     PyErr_Format(PyExc_TypeError, "Type %s is not a sequence type",
>                  Py_TYPE(sequence)->tp_name);
>     return NULL;
> }
>
> to
>
> PyObject *as_tuple = NULL;
>
> if (!PySequence_Check(sequence)) {
>     as_tuple = PySequence_Tuple(sequence);
>     sequence = as_tuple;
> }
>
> ...
>
> Py_XDECREF(as_tuple);

Oh, cool!

(Are you suggesting you'll do it, or that I should? 'Cause I don't mind,
but if you were going to...)

(If it supports arbitrary iterables, it could be renamed to
"from_iterable", though I guess it makes sense to keep it
"from_sequence" for backwards compatibility and to discourage passing
arbitrary iterables since it's less efficient...)
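
In the meantime the branch just materializes the generator first.
Roughly this (a sketch, assuming a bzrlib new enough to ship the
static_tuple module; "st_from_iterable" is an illustrative name, not
anything real):

    from bzrlib.static_tuple import StaticTuple

    def st_from_iterable(iterable):
        # Stopgap until from_sequence itself accepts arbitrary iterables:
        # build a real tuple first, then convert that.
        return StaticTuple.from_sequence(tuple(iterable))

    # e.g. the line from the diff above would become:
    # revision_graph[key] = st_from_iterable(
    #     parent for parent in parents if parent in revision_graph)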

> Here are some performance numbers for a Launchpad tree. This is ~66k revisions.

[snip lots of numbers]

> So it turns out that using 'marshal' + casting the data into StaticTuple instances is faster than using cPickle.

Thanks so much for doing this analysis!

I'm not awake enough to read it yet, but I *can* understand your summary
at the end. :-P

> As for whether things should be interned... I would say that you probably only benefit if you are sharing between branches. Namely, the entries up there are:
> [(merge-info, where-merged, parents)].
> `merge-info` is (seq, revid, merge_depth, revno_str, end_of_merge) --
> like a merged sorted list, but the revno is stringified.
> `where-merged` is a tuple of revisions that have this revision as a
> non-lefthand parent. Finally, `parents` is just the usual list of
> parents of this revision.
>
> However, btree indexes use a tuple of parent *keys* (tuples), not a tuple of parent-ids (strings). So you don't actually share the data there. Though graph.iter_ancestry() is probably using the same parents lists, and thus could share with this revision-info cache. (But I would have guessed that you would only have *one* of them at a time.)
>
> That said, if you do load 2 branches of the same project at the same time, you will save *tremendously* by interning pretty much everything. You could probably intern at the "outer" tuple if you wanted. Since I think the inner data isn't going to be repeated much outside of that.

Loggerhead will keep multiple revgraph caches around at the same time,
and they very well might be from the same project. Hm.

If we intern things, it probably makes sense to intern the whole 3-tuple.

Interning things bothers me, though, since _static_tuple_py interns
forever and I don't want to increase memory usage that much for people
with older bzrlibs.

(Also, interning hurts my brain a bit. :-P )
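
(If we did go that route, interning the whole 3-tuple would look roughly
like this; just a sketch, with an illustrative helper name:)

    from bzrlib.static_tuple import StaticTuple

    def make_entry(merge_info, where_merged, parents):
        # Freeze each piece, then intern the outer 3-tuple so identical
        # entries from two branches of the same project share one object.
        entry = StaticTuple(StaticTuple.from_sequence(merge_info),
                            StaticTuple.from_sequence(where_merged),
                            StaticTuple.from_sequence(parents))
        return entry.intern()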

> I should also note that here:
> 104 + data_statictuples = [[StaticTuple.from_sequence(t) for t in l] for
> 105 + l in data_tuples]
>
> ^- You also almost definitely want to use a StaticTuple instead of that inner list. The cost of the list will otherwise be most of your savings from using ST. If it is always a 3-sequence list, you can use my form, or you could use another ".from_sequence()".

Yeah, it uses the inner list because `where-merged` is populated in a
second loop. I pretty much don't understand that code
(loggerhead/wholehistory.py) or the performance implications of changing
it, so I haven't looked into it yet.

Hmm, a quick hack is easier than I thought. I'll check it out right now.
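
(Roughly the shape I have in mind, heavily simplified; the names here are
illustrative, not the actual wholehistory.py code:)

    from bzrlib.static_tuple import StaticTuple

    # Keep plain lists while the second loop is still filling in
    # where_merged, then make one last pass to freeze every entry.
    data = [StaticTuple(StaticTuple.from_sequence(merge_info),
                        StaticTuple.from_sequence(where_merged),
                        StaticTuple.from_sequence(parents))
            for merge_info, where_merged, parents in data]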

> I'll also note that you really do want to create a local variable for "StaticTuple.from_sequence" otherwise you pay an attribute lookup for every call.

Ehh, didn't figure that was significant. OK, will do.
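
Something like this, I guess (adapting the comprehension from the diff
above):

    from bzrlib.static_tuple import StaticTuple

    from_sequence = StaticTuple.from_sequence  # one lookup, outside the loops
    data_statictuples = [[from_sequence(t) for t in l]
                         for l in data_tuples]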

> In short, using StaticTuple.from_sequence(marshal.loads()) costs (on my machine, with Launchpad's repo) 256ms up from 138ms for just marshal, up from 61ms to just zlib.decompress.
>
> The bigger question is whether it is worthwhile or not. My initial testing doesn't show a significant memory savings *just* from the marshal change. In fact, it may even show an increase. However, this was on a single branch. Also, tuples don't free memory. They use a free list of up to 2k tuples (for every size from 1-20). Which means that doing:
>
> x = (a, b, c)
> x = StaticTuple.from_sequence(x)
>
> Will actually keep the memory allocated as both a StaticTuple *and* a Tuple.
>
> However, eventually that tuple is likely to be re-used. But specifically 'marshal.loads()' is going to create all 60k+ (*3) tuples, and then you cast them into StaticTuple, leaving a lot of unused tuples around.

Keeping a bunch of free tuples around is probably acceptable, since they
will be reused next time a cache gets loaded, but it's still rather
unfortunate. Huh.
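
(Just to make sure I follow, the path in question is roughly this,
simplified, with "load_cache" as an illustrative name:)

    import marshal
    import zlib

    from bzrlib.static_tuple import StaticTuple

    def load_cache(compressed):
        # zlib.decompress then marshal.loads hands back plain tuples;
        # casting each one to a StaticTuple leaves the originals parked on
        # CPython's per-size tuple free lists instead of being freed right away.
        data_tuples = marshal.loads(zlib.decompress(compressed))
        from_sequence = StaticTuple.from_sequence
        return [[from_sequence(t) for t in l] for l in data_tuples]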