Code review comment for lp:~jameinel/bzr/1.16-chkmap-updates

Revision history for this message
John A Meinel (jameinel) wrote :

This is a slight rework of InternalNode._iter_nodes() to provide some shortcuts to finding nodes to return.

This changes the "initial commit" time for a mysql tree (7.8k files, 140MB) from 15.2s down to 13.7s on my machine (about 10% improvement.)

The big win here is that "CHKMap.apply_delta()" is adding all of the keys by calling .map() one at a time. For each of these, we have to look up what page it fits in, and possibly create the page. The lookup *was* being done by iterating all of the children of this InternalNode, and seeing if each child matched the search key.

This changes the search in 2 ways.

1) If we are searching for a single key, and that results in a _node_width search key, then we can do a single dict lookup for that item, and know if we have it or not, without building up *any* other structures.

2) If we are looking for multiple keys, we can aggregate their search keys. If at that point, all search keys are "_node_width", then again, we can do simple dict lookups for each key.

The main win here, is that root nodes are going to be 255-entries wide. So we avoid iterating across 255 items and doing a comparison, in exchange for doing a dict lookup for each item.

Part (1) is probably what makes the biggest improvement for 'bzr commit' time, but I think both are still useful improvments. (I'm guessing the latter will help stuff like 'bzr ls' more.)

Note that this, coupled with my fulltext commit changes bring initial commit time down to 12.7s, though --1.9 format commit is still around 11.7s. (But that is down from ~15s without those changes.)
So there are still bits of work to be done, but it is getting closer.

« Back to merge proposal