Sorry, I didn't mean to mark it approve prematurely.
To give some numbers using topo_sort on 26k revs of bzr.dev loaded into a dict:
188.0ms bzr.dev
75.8ms mkbosmans code
74.0ms getting rid of the "if x in y" lookup (not a big win)
64.2ms changing "foo.append" to "foo_append"
62.5ms using nochildren[-1] rather than always pop+append
63.0ms using nochildren[last_tip] (probably a small win on more linear histories)
65.6ms reversed(node_name_stack) # I was surprised that reversed(x) is slower than x.reverse() when the list has 26k items
37.4ms Pyrex version that mostly matches the optimized version I put together
49.9ms k = KnownGraph(parent_map); k.topo_sort()
9.5ms k.topo_sort()
So building the KnownGraph object costs approximately as much as the optimized C topo_sort (which makes sense, because to find_gdfo we basically topo_sort and update the counters.)
However, once we *have* a KnownGraph (if we decided we wanted one for some other reason), then we can generate the topological sorted output in <10ms. Or about 18x faster than the original python TopoSorted implementation.
The new implementations in _known_graph.pyx don't have test cases on them, and we would need to implement topo_sort in _known_graph.py, etc. But I honestly think that is the area to work in to get the best performance.
I believe you were also considering updating "tsort.merge_sort", and again this would be the place to look into writing some really optimized code.
Sorry, I didn't mean to mark it approve prematurely.
To give some numbers using topo_sort on 26k revs of bzr.dev loaded into a dict:
188.0ms bzr.dev last_tip] (probably a small win on more linear histories) node_name_ stack) # I was surprised that reversed(x) is slower than
x.reverse( ) when the list has 26k items
75.8ms mkbosmans code
74.0ms getting rid of the "if x in y" lookup (not a big win)
64.2ms changing "foo.append" to "foo_append"
62.5ms using nochildren[-1] rather than always pop+append
63.0ms using nochildren[
65.6ms reversed(
37.4ms Pyrex version that mostly matches the optimized version I put together
49.9ms k = KnownGraph( parent_ map); k.topo_sort()
9.5ms k.topo_sort()
So building the KnownGraph object costs approximately as much as the optimized C topo_sort (which makes sense, because to find_gdfo we basically topo_sort and update the counters.)
However, once we *have* a KnownGraph (if we decided we wanted one for some other reason), then we can generate the topological sorted output in <10ms. Or about 18x faster than the original python TopoSorted implementation.
My test code is available at lp:~jameinel/bzr/1.18-topo-sort
The new implementations in _known_graph.pyx don't have test cases on them, and we would need to implement topo_sort in _known_graph.py, etc. But I honestly think that is the area to work in to get the best performance.
I believe you were also considering updating "tsort.merge_sort", and again this would be the place to look into writing some really optimized code.