lp:~jameinel/+junk/godirstate

Created by John A Meinel and last modified
Get this branch:
bzr branch lp:~jameinel/+junk/godirstate

Branch information

Owner:
John A Meinel
Status:
Development

Recent revisions

24. By John A Meinel

Remove the error returns in favor of panic().

The immediate impression is that the errors carry less state, because we
no longer add extra state info to the error message. That is probably a
worthy trade for getting a real traceback that has *even more* error info
(at the expense that Parse() suppresses getting a traceback).
The code is much clearer without all the 'err' checking statements.

Possibly marginally faster (183ms): there are probably fewer 'if' checks,
traded for panic handling, which probably has to be runtime checked anyway.
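
A minimal sketch of the panic/recover pattern this revision describes,
assuming Parse() recovers its own panics at the API boundary; the
parseError, fail, and getDetails names are illustrative guesses, not the
branch's actual code:

    package dirstate

    import "fmt"

    // parseError marks panics raised by this package, so Parse only
    // recovers the ones it owns.
    type parseError struct{ msg string }

    // fail replaces the old `return err` paths: deep helpers just panic.
    func fail(format string, args ...interface{}) {
        panic(parseError{fmt.Sprintf(format, args...)})
    }

    // getDetails stands in for a low-level helper that used to return
    // (value, error) and now simply panics on malformed input.
    func getDetails(field []byte) []byte {
        if len(field) == 0 {
            fail("empty field in dirstate entry")
        }
        return field
    }

    // Parse converts the panic back into an ordinary error at the API
    // boundary, which is what suppresses the traceback mentioned above.
    func Parse(content []byte) (err error) {
        defer func() {
            if r := recover(); r != nil {
                pe, ok := r.(parseError)
                if !ok {
                    panic(r) // not ours: re-panic
                }
                err = fmt.Errorf("dirstate parse error: %s", pe.msg)
            }
        }()
        _ = getDetails(content) // real parsing would go here
        return nil
    }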

23. By John A Meinel

Simplify a bit. If we aren't going to compute the hash, don't pretend we are.

22. By John A Meinel

Setting the buffer to 50MB should allow us to read the whole file in one go, but it actually makes things slower: 200ms.

21. By John A Meinel

Increasing the buffer size to 1MB helps a tiny bit, but not dramatically:
down to about 182ms from 187ms, so about 2-3%.
10 loops 70133 entries in 184.621ms
185 samples (avg 1 threads)
 13.51%bytes.IndexByte
  8.11%runtime.mcpy
  5.41%MHeap_AllocLocked
  5.41%runtime.mallocgc
  5.41%scanblock
  4.86%dirstate.*entryParser·getDetails
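
A minimal sketch of the buffer-size experiment in revisions 22 and 21
above, assuming the parser reads the dirstate file through a bufio.Reader;
the file path and the field-counting loop are placeholders, not the
branch's actual code:

    package main

    import (
        "bufio"
        "fmt"
        "log"
        "os"
    )

    func main() {
        f, err := os.Open(".bzr/checkout/dirstate")
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()

        // 1MB buffer: revision 21 found this shaves a few percent,
        // while a 50MB buffer (revision 22) was measurably slower.
        r := bufio.NewReaderSize(f, 1<<20)

        // Count NUL-terminated fields as a stand-in for real parsing.
        fields := 0
        for {
            if _, err := r.ReadBytes('\x00'); err != nil {
                break
            }
            fields++
        }
        fmt.Println("fields:", fields)
    }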

20. By John A Meinel

Played with removing the 'extra' field, but it doesn't seem very expensive
in cases where we don't use it: 187ms vs 185ms. The effect of not tracking
the extra content at all is more apparent, at 255ms vs 289ms, but
parsing-and-not-keeping data is cheating, so it doesn't really count.
With just a working+basis tree, we're at 187ms vs 131ms, which is pretty good.
6prof gives this profile:
10 loops 70133 entries in 186.195ms
186 samples (avg 1 threads)
 19.35%bytes.IndexByte
  8.06%runtime.mcpy
  7.53%syscall.Syscall
  4.84%runtime.memclr
  4.84%scanblock
  4.84%sweep
  4.30%bytes.Equal
  3.76%dirstate.*entryParser·getNext
So we still spend most of the time finding null bytes, followed by time
copying bytes into new buffers. I guess the 8% syscall time is for
reading the content, since we don't read everything up front like we
do in the Python code.
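
A minimal sketch (in present-day Go) of the read-everything-up-front
alternative mentioned above, i.e. what the Python code does; os.ReadFile
and the path are assumptions, not what this branch actually does:

    package main

    import (
        "bytes"
        "fmt"
        "log"
        "os"
    )

    func main() {
        // One read of the whole file replaces many buffered Read syscalls.
        content, err := os.ReadFile(".bzr/checkout/dirstate")
        if err != nil {
            log.Fatal(err)
        }

        // Parsing can then walk a single in-memory buffer; here we just
        // count the NUL separators.
        fields := bytes.Count(content, []byte{0})
        fmt.Println("fields:", fields)
    }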

19. By John A Meinel

It appears the Go authors are no strangers to assembly optimizations.
Changing from bytes.Index() to a custom 'memchr' function dropped the time
from 478ms down to ~324ms. However, bytes.IndexByte() is assembly code that
even uses SSE instructions to chunk through memory quickly, and it drops the
time all the way down to 283ms.
That is still 283ms vs Pyrex's 164ms, but I didn't expect to shave 195ms off
the Go time by a simple switch from bytes.Index to bytes.IndexByte.
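
A minimal sketch of a bytes.IndexByte-based field scanner; the entryParser
and getNext names echo the profile output above, but the body here is a
guess, not the branch's actual implementation:

    package dirstate

    import "bytes"

    type entryParser struct {
        content []byte // remaining unparsed bytes
    }

    // getNext returns the next NUL-terminated field. bytes.IndexByte is
    // the assembly (SSE) scan that took the parse from ~478ms with
    // bytes.Index and ~324ms with a hand-written memchr down to ~283ms.
    func (p *entryParser) getNext() []byte {
        i := bytes.IndexByte(p.content, '\x00')
        if i < 0 {
            field := p.content
            p.content = nil
            return field
        }
        field := p.content[:i]
        p.content = p.content[i+1:]
        return field
    }

Because getNext only re-slices the shared buffer, the scan in
bytes.IndexByte stays the dominant cost, which matches the profiles above.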

18. By John A Meinel

In a gcc tree with a merge (2 parents), the pattern is still consistent:
go: 10 loops 70134 entries in 482.877ms
py: 10 loops, best of 3: 163 msec per loop
The ratio is 2.96:1 in favor of Pyrex.
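
A minimal sketch of the kind of timing loop that could produce the
"N loops M entries in Xms" go lines quoted in these revisions; the path
and the countEntries placeholder are assumptions, not the real benchmark
harness:

    package main

    import (
        "fmt"
        "log"
        "os"
        "time"
    )

    // countEntries is a placeholder for the real parser; it just counts
    // newline-terminated records.
    func countEntries(content []byte) int {
        n := 0
        for _, b := range content {
            if b == '\n' {
                n++
            }
        }
        return n
    }

    func main() {
        content, err := os.ReadFile(".bzr/checkout/dirstate")
        if err != nil {
            log.Fatal(err)
        }
        loops, entries := 10, 0
        start := time.Now()
        for i := 0; i < loops; i++ {
            entries = countEntries(content)
        }
        ms := float64(time.Since(start)) / float64(time.Millisecond)
        fmt.Printf("%d loops %d entries in %.3fms\n", loops, entries, ms)
    }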

17. By John A Meinel

The relative pattern holds for a 'gcc'-sized tree:
go: 10 loops 70134 entries in 336.742ms
pyrex: 10 loops, best of 3: 133 msec per loop
That is 2.5:1 in favor of Pyrex; for bzr.dev it was 2.8:1.

16. By John A Meinel

For posterity, the timing without sharing the dirname string was:
go: 1000 loops 1415 entries in 5.708ms
pyrex: 1000 loops, best of 3: 1.84 msec per loop
So sharing seems to help (5.7ms became 5.5ms), but it is not a tremendous win.

15. By John A Meinel

Try to share the dirname entries.
Current timing is:
go: 1000 loops 1415 entries in 5.588ms
pyrex: 1000 loops, best of 3: 1.98 msec per loop
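
A minimal sketch of the dirname-sharing idea from revisions 16 and 15,
assuming dirnames are interned in a map so entries in the same directory
reuse one string; the Entry type and internDirname helper are hypothetical,
not the branch's actual code:

    package dirstate

    // Entry is a hypothetical parsed dirstate record.
    type Entry struct {
        Dirname string // shared between entries in the same directory
        Name    string
    }

    // internDirname returns a single shared string per distinct directory
    // name, so thousands of entries in one directory do not each hold
    // their own copy of the bytes.
    func internDirname(seen map[string]string, dirname []byte) string {
        if s, ok := seen[string(dirname)]; ok {
            return s
        }
        s := string(dirname) // one copy per distinct directory name
        seen[s] = s
        return s
    }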

Branch metadata

Branch format:
Branch format 7
Repository format:
Bazaar repository format 2a (needs bzr 1.16 or later)