Code review comment for lp:~mkbosmans/bzr/topo_sort

Revision history for this message
Maarten Bosmans (mkbosmans) wrote:

I benchmarked all the suggested changes and got the same results as you did. That is, mainly the foo_append change was a significant improvement; the rest not so much.

> ^- since you just populated the dictionary with 0 for every node, the only
> times when a parent isn't in the nchildren dict is when it is a ghost.
> So I wouldn't do a "x in y" check for each one, but just catch a potential
> KeyError.

Indeed, I added this check after the first version failed the unit test for a ghost node in the graph. As you mentioned, removing the check wasn't a big win, so I'm inclined to keep the simpler version as is, perhaps with a comment added explaining that the check is there for ghost parents.
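To make the ghost-parent point concrete, here is a minimal sketch (the graph and names are invented for illustration, not bzrlib's actual code) showing the membership-test version kept in the branch, with the EAFP alternative in comments:

```python
# Hypothetical child-counting step: parents map each node to its parents.
# 'ghost' is referenced as a parent but is not itself a node in the graph.
graph = {
    'rev1': [],
    'rev2': ['rev1'],
    'rev3': ['rev2', 'ghost'],
}

nchildren = {node: 0 for node in graph}
for node, parents in graph.items():
    for parent in parents:
        # Membership-test version: the check exists purely for ghost parents.
        if parent in nchildren:
            nchildren[parent] += 1
        # The try/except KeyError alternative suggested in the review:
        #   try:
        #       nchildren[parent] += 1
        #   except KeyError:
        #       pass  # parent is a ghost

print(nchildren)  # the ghost is silently ignored
```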

> 1) You are calling "foo.append()". It turns out to usually be a bit faster to
> do:

This is indeed faster. I'll change it.
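For reference, the foo_append trick is simply binding the bound method to a local name once, so the attribute lookup is not repeated on every loop iteration (a generic sketch, not the actual patch):

```python
# Hypothetical micro-optimization sketch: one attribute lookup up front
# instead of one per iteration inside a hot loop.
result = []
result_append = result.append  # bind the method once
for i in range(5):
    result_append(i * i)       # plain name lookup, no attribute lookup
print(result)
```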

> 2) You seem to be popping an item off the list at the beginning, and probably
> are often pushing (append) one back on at the end. In my tests with this sort
> of code in KnownGraph it turns out to often be a lot better to peek at the
> last item, and then replace it the first time you append, rather than pop +
> append. For example:
>
> ...
>
> I would be curious what results you would get, but I wouldn't be surprised if
> it was significant. Given that you are at 260ms for 20k+ nodes, you are
> already in the sub ms per node, which is where stuff like attribute lookup,
> append/pop really start to add up.
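For readers following along, the peek-and-replace idea quoted above might look roughly like this hypothetical DFS sketch (dfs_order and the toy graph are invented for illustration, not the KnownGraph code):

```python
def dfs_order(graph, root):
    # graph: node -> list of children (illustrative only)
    stack = [root]
    order = []
    while stack:
        node = stack[-1]            # peek at the top instead of pop()
        order.append(node)
        replaced = False
        for child in graph.get(node, ()):
            if not replaced:
                stack[-1] = child   # reuse the top slot on the first push
                replaced = True
            else:
                stack.append(child)
        if not replaced:
            stack.pop()             # no child replaced it; discard the node
    return order

print(dfs_order({'A': ['B', 'C'], 'B': [], 'C': []}, 'A'))
```

This saves a pop/append pair per node at the cost of some clarity, which is the trade-off discussed below.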

So perhaps Python needs a more efficient stack implementation? I'm not very keen on having such hacks in the otherwise quite clean algorithm. And as you already mentioned, it's not a big performance win. A similar speedup can be achieved by using a set instead of a stack; this requires only an added set() on initialization and using .add instead of .append.
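A rough sketch of what the stack->set variant could look like (topo_sort and all the names here are illustrative, not the actual bzrlib implementation):

```python
def topo_sort(graph):
    # graph maps node -> list of parents; ghost parents are absent keys.
    nchildren = {node: 0 for node in graph}
    for parents in graph.values():
        for parent in parents:
            if parent in nchildren:  # skip ghost parents
                nchildren[parent] += 1
    # A set of ready nodes replaces the list-based stack:
    # set.pop()/set.add() stand in for list.pop()/list.append().
    pending = {node for node, n in nchildren.items() if n == 0}
    result = []
    while pending:
        node = pending.pop()
        result.append(node)
        for parent in graph[node]:
            if parent in nchildren:
                nchildren[parent] -= 1
                if nchildren[parent] == 0:
                    pending.add(parent)
    result.reverse()  # emit parents before children
    return result
```

Since pop order within the set is arbitrary, this only works because topological order doesn't constrain which ready node is processed next.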

> Of course, if we really think this code is a bottleneck and worth optimizing,
> then we may want to keep a reasonable and fairly simple implementation in
> python, and look into writing a pyrex version.
>
> Could you look at some of the suggestions here and give benchmark results?

If your pyrex code is a lot faster, maybe it is better to add that and scrap all these optimizations. The topo_sorter function can then simply remain

  return TopoSorter(graph).sorted()

and we don't have two Python algorithms implemented.

If you still think this new algorithm is worth merging, I will make a new version with the foo_append and stack->set changes. I propose we stop at those two optimizations and, instead of micro-optimizing further, focus our attention on merge_sort, which I believe is much more heavily used.

BTW, how do I make these extra changes? Should I add a commit to the branch that makes only these few changes, or should I go back to the revision that added the whole algorithm, recommit it with the changes, and push the new branch over the current one in lp?
