Code review comment for lp:~jameinel/bzr/2.1-static-tuple-chk-map

John A Meinel (jameinel) wrote :

Ok, so I'm resubmitting this one again, only this time, it's even better.

In addition to the changes to CHKMap to only use StaticTuple internally, I also found that _filter_text_keys was creating a whole lot of file_key tuples, and not interning the various attributes.

I also found that we create *millions* of integer objects, and most of them are redundant because of the identical 'group' start and end information. (IIRC, we create 1.4M integers at peak of parsing the chk stream, and only 300k of them are unique.)

The _filter_text_keys fix saved around 40MB peak, and interning the integers saves another 7MB.
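
To illustrate the interning idea described above, here is a minimal sketch. It is not the code from this patch: the helper names (make_interner, parse_records) and the sample offsets are hypothetical, and a plain dict stands in for whatever structure bzrlib actually uses. It only shows why equal integers parsed from many records can share one object.

```python
# Hypothetical sketch of value interning; not bzrlib code.

def make_interner():
    """Return (intern_value, table). intern_value maps equal values to one shared object."""
    table = {}

    def intern_value(value):
        # setdefault stores the value the first time it is seen and
        # returns the already-stored object on every later call.
        # Caveat: dict keys compare by equality, so 1, 1.0 and True
        # would collapse together; only feed it one type.
        return table.setdefault(value, value)

    return intern_value, table


def parse_records(raw_records, intern_value):
    """Parse (start, end) text pairs, interning the resulting integers."""
    parsed = []
    for start, end in raw_records:
        parsed.append((intern_value(int(start)), intern_value(int(end))))
    return parsed


# Assumed sample data: many records that point at the same two groups,
# where one group's end offset equals the next group's start offset.
raw = [("100000", "250000")] * 1000 + [("250000", "400000")] * 1000

intern_int, int_table = make_interner()
records = parse_records(raw, intern_int)

# Count distinct integer objects (by identity) referenced by the records.
unique_objects = len({id(n) for rec in records for n in rec})
```

Here 4000 integer references end up sharing three objects. Without the interner, CPython would normally allocate a fresh object for each parsed value of this size.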

Overall, with this patch, I'm now down to 457MB peak when branching all of Launchpad, which is very close to my 50% goal. I also know a way to save another ~10MB or so, but it requires using SimpleSet, which I'm not sure I want to do yet.

Anyway, versus bzr.dev, this patch drops me from 548MB => 457MB peak memory.

Also, I've focused a bit on 'streaming' data out of a repository (versus the insert on the other side). In that scenario, the numbers are:
  583MB bzr 2.0.1
  422MB bzr.dev
  338MB this patch

So not quite 50% savings, but I expect it to still be fairly noticeable on Launchpad's code hosting.
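
For reference, the relative savings implied by the figures quoted above can be worked out as follows. This is just arithmetic on the numbers in this comment; the function name percent_saved is made up for the example.

```python
def percent_saved(before_mb, after_mb):
    """Percentage reduction in peak memory going from before_mb to after_mb."""
    return 100.0 * (before_mb - after_mb) / before_mb


# Streaming data out of a repository (figures from this comment, in MB).
stream_vs_release = percent_saved(583, 338)  # bzr 2.0.1 -> this patch
stream_vs_dev = percent_saved(422, 338)      # bzr.dev   -> this patch

# Branching all of Launchpad (figures from this comment, in MB).
branch_vs_dev = percent_saved(548, 457)      # bzr.dev   -> this patch

print("streaming vs bzr 2.0.1: %.1f%%" % stream_vs_release)
print("streaming vs bzr.dev:   %.1f%%" % stream_vs_dev)
print("branching vs bzr.dev:   %.1f%%" % branch_vs_dev)
```

On the streaming side this comes to roughly 42% saved relative to bzr 2.0.1 and roughly 20% relative to bzr.dev.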
