lp:~jameinel/+junk/pybloom

Created by John A Meinel and last modified
Get this branch:
bzr branch lp:~jameinel/+junk/pybloom

Branch information

Owner:
John A Meinel
Status:
Development

Recent revisions

42. By John A Meinel

Remove a != comparison, which also caught an accidental access to the original Python object
instead of the character buffer.

41. By John A Meinel

Play around with some tweaks. When benchmarking murmur, use 20x the number of keys.

40. By John A Meinel

Lots of little tweaks.
Include a larger ancestry set. Now using 24k bzr revs instead of 8k.
Spend some time optimizing the insert side of blooms.
Improves insert time from 200+ms down to 50ms.
Add BloomMurmur to the benchmark suite, so it is actually being benchmarked.
Overall BloomMurmur is about 200 => 30ms for check times (not quite 10x faster).
Insert times are approx 250 => 50ms (about 5x).
Using the multi-way & aligned path shaves about 10% off. 56ms => 49ms.
We really need a longer-running test, ideally ~1s rather than 0.05s.
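
One way to get a run that lasts about a second is to keep doubling the loop count until the timed region is long enough. The sketch below is only an illustration of that idea: time_per_call and its min_seconds parameter are hypothetical names, not part of this branch's benchmark suite.

```python
import time


def time_per_call(func, min_seconds=1.0):
    """Time func() with a doubling loop count until one timed run lasts
    at least min_seconds. Returns (seconds_per_call, loops_used).

    Hypothetical helper for illustration; not taken from the branch.
    """
    loops = 1
    while True:
        start = time.perf_counter()
        for _ in range(loops):
            func()
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return elapsed / loops, loops
        # Too short to trust: double the work and measure again.
        loops *= 2


if __name__ == "__main__":
    per_call, loops = time_per_call(lambda: sum(range(1000)), min_seconds=0.2)
    print("%d loops, %.3f us per call" % (loops, per_call * 1e6))
```

Only the final, longest run is reported, so timer resolution and startup noise matter less than in a single 0.05s measurement.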

39. By John A Meinel

Implement the parallel form of multi hash.

38. By John A Meinel

Add the framework for a multi-hash version.

The main win here is that we can factor out some of the common work
rather than making multiple passes over the data.
Also, we allow for using the fully optimized code when on a
little-endian device with aligned input.
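
The "factor out common work" idea can be sketched in Python, assuming the hash is a MurmurHash2-style 32-bit hash (the exact variant used in this branch is not stated here). In that scheme the mixing of each input word does not depend on the seed, so several seeded hashes can share it in a single pass. The names murmur2 and murmur2_multi are hypothetical, and the aligned little-endian fast path is a C-level concern that this sketch does not model.

```python
import struct

M = 0x5BD1E995
R = 24
MASK = 0xFFFFFFFF


def murmur2(data, seed):
    """Straightforward single-seed MurmurHash2-style 32-bit hash of bytes."""
    length = len(data)
    h = (seed ^ length) & MASK
    nwords = length // 4
    for k in struct.unpack_from("<%dI" % nwords, data):
        k = (k * M) & MASK
        k ^= k >> R
        k = (k * M) & MASK
        h = (h * M) & MASK
        h ^= k
    tail = data[nwords * 4:]
    if tail:
        h ^= int.from_bytes(tail, "little")
        h = (h * M) & MASK
    h ^= h >> 13
    h = (h * M) & MASK
    h ^= h >> 15
    return h


def murmur2_multi(data, seeds):
    """Compute one hash per seed in a single pass over data."""
    length = len(data)
    states = [(seed ^ length) & MASK for seed in seeds]
    nwords = length // 4
    for k in struct.unpack_from("<%dI" % nwords, data):
        # Seed-independent word mixing: done once, shared by every state.
        k = (k * M) & MASK
        k ^= k >> R
        k = (k * M) & MASK
        states = [((h * M) & MASK) ^ k for h in states]
    tail = data[nwords * 4:]
    if tail:
        t = int.from_bytes(tail, "little")
        states = [((h ^ t) * M) & MASK for h in states]
    results = []
    for h in states:
        h ^= h >> 13
        h = (h * M) & MASK
        h ^= h >> 15
        results.append(h)
    return results


if __name__ == "__main__":
    print(murmur2_multi(b"revision-id-1234", [0, 1, 2]))
```

The multi-seed version decodes and mixes each word once instead of once per seed, which is where the saving over repeated single-seed calls would come from.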

37. By John A Meinel

Minor optimization of the Python implementation of murmur hash.
The big change is to just use struct.unpack() across everything in one pass,
rather than lots of ord() calls and bit shifts.
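
As an illustration of the change described above, the sketch below decodes little-endian 32-bit words both ways. The function names are hypothetical and the code targets Python 3, where indexing a bytes object already yields integers (the original Python 2 code would have needed ord() on each character).

```python
import struct


def words_by_shifts(data):
    """Decode little-endian 32-bit words one byte at a time."""
    words = []
    usable = len(data) - len(data) % 4  # ignore any trailing partial word
    for i in range(0, usable, 4):
        words.append(data[i]
                     | data[i + 1] << 8
                     | data[i + 2] << 16
                     | data[i + 3] << 24)
    return words


def words_by_unpack(data):
    """Decode the same words with a single struct call."""
    nwords = len(data) // 4
    # "<" forces little-endian; "%dI" asks for nwords unsigned 32-bit ints.
    return list(struct.unpack_from("<%dI" % nwords, data))


if __name__ == "__main__":
    sample = b"\x01\x02\x03\x04\x05\x06\x07\x08\x09"
    print(words_by_shifts(sample))
    print(words_by_unpack(sample))
```

Both functions return the same list; the second moves the per-byte work out of the Python loop and into the struct module.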

36. By John A Meinel

A bit more info about paging thoughts.

35. By John A Meinel

Push the __contains__ check down into Pyrex.
Gives us another ~10% on indexbench.
For some reason get_components_positions still loses.

34. By John A Meinel

Add a filter_by_presence function, to allow filtering multiple nodes in one go.
Customize the BloomMurmur.__contains__ so that it doesn't have to compute all hashes
when it can know right away that something doesn't hit.
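
A minimal sketch of the early-exit membership test and a bulk filter follows. It is not the branch's code: TinyBloom is a hypothetical stand-in for BloomMurmur, it uses salted hashlib.blake2b digests instead of murmur, and the filter_by_presence signature shown here is an assumption.

```python
import hashlib


class TinyBloom:
    """Small Bloom filter used only to illustrate the early exit."""

    def __init__(self, num_bits=1024, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Generator: each hash is computed only when the caller asks for it.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(item, digest_size=8,
                                     salt=bytes([i])).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        for pos in self._positions(item):
            if not self.bits[pos >> 3] & (1 << (pos & 7)):
                # First clear bit proves absence; later hashes are skipped.
                return False
        return True


def filter_by_presence(bloom, items):
    """Return the items that might be present (assumed signature)."""
    return [item for item in items if item in bloom]


if __name__ == "__main__":
    bloom = TinyBloom()
    for key in (b"rev-1", b"rev-2"):
        bloom.add(key)
    print(filter_by_presence(bloom, [b"rev-1", b"rev-2", b"rev-3"]))
```

A miss usually fails on one of the first bits checked, so most absent keys cost fewer than num_hashes hash computations. Items that were added always pass the filter; an absent item may occasionally pass as a false positive.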

33. By John A Meinel

If you are going to write a compiled form... *use it*

Branch metadata

Branch format:
Branch format 5
Repository format:
Bazaar pack repository format 1 (needs bzr 0.92)