lp:~jameinel/+junk/godirstate
- Get this branch:
- bzr branch lp:~jameinel/+junk/godirstate
Branch information
- Owner:
- John A Meinel
- Status:
- Development
Recent revisions
- 24. By John A Meinel
-
Remove the error returns in favor of panic().
The immediate feeling is of less clear state, because we no longer add
extra state info to the error message. Probably a worthy trade for getting
a real traceback that has *even more* error info. (At the expense that
Parse() suppresses getting a traceback.)
Much clearer code without all the 'err' checking statements. Possibly
marginally faster (183ms): probably fewer if checks in trade for panic,
which probably has to be runtime checked anyway.
- 22. By John A Meinel
-
Setting it to 50MB should allow us to read the whole file in one go, but it actually makes it slower: 200ms.
- 21. By John A Meinel
-
Increasing the buffer size to 1MB helps a tiny bit, but not particularly dramatically.
Down to about 182ms from 187ms, so about 2-3%.
10 loops 70133 entries in 184.621ms
185 samples (avg 1 threads)
13.51% bytes.IndexByte
 8.11% runtime.mcpy
 5.41% MHeap_AllocLocked
 5.41% runtime.mallocgc
 5.41% scanblock
 4.86% dirstate.*entryParser·getDetails
- 20. By John A Meinel
-
Play with removing the 'extra' field. But it doesn't seem very expensive
for cases where we don't use it. 187ms to 185ms. Not tracking the extra
content is more apparent, at 255ms vs 289ms. But parsing-and-not-keeping
data is cheating, so it doesn't really count.
With just a working+basis, we're at 187ms vs 131ms, which is pretty good.
6prof gives this layout:
10 loops 70133 entries in 186.195ms
186 samples (avg 1 threads)
19.35% bytes.IndexByte
 8.06% runtime.mcpy
 7.53% syscall.Syscall
 4.84% runtime.memclr
 4.84% scanblock
 4.84% sweep
 4.30% bytes.Equal
 3.76% dirstate.*entryParser·getNext
So we still spend most of the time finding null bytes, followed by time
copying bytes into new buffers. I guess the 8% syscall time is for
reading the content, since we don't read everything up front like we
do in the python code.
- 19. By John A Meinel
-
It appears the Go authors are no strangers to assembly optimizations.
Changing from bytes.Index() to a custom 'memchr' function dropped the time from
478ms down to ~324ms. However, bytes.IndexByte() is assembly code that even uses
SSE instructions to chunk through memory quickly, and drops it all the way down
to 283ms.
Still 283ms vs Pyrex 164ms, but I didn't expect to shave 195ms off the go time
by a simple switch of bytes.Index to bytes.IndexByte.
- 18. By John A Meinel
-
In a gcc tree with a merge (2 parents), still consistent.
go: 10 loops 70134 entries in 482.877ms
py: 10 loops, best of 3: 163 msec per loop
The ratio is 2.96:1 in favor of Pyrex.
- 17. By John A Meinel
-
Relative pattern holds for 'gcc' sized tree:
go: 10 loops 70134 entries in 336.742ms
pyrex: 10 loops, best of 3: 133 msec per loop
Which is 2.5:1 in favor of Pyrex; for bzr.dev it was 2.8:1.
- 16. By John A Meinel
-
For posterity, not sharing the dirname string was:
go: 1000 loops 1415 entries in 5.708ms
pyrex: 1000 loops, best of 3: 1.84 msec per loop
So it seems to help (5.7ms became 5.5ms), but not a tremendous win.
- 15. By John A Meinel
-
Try to share the dirname entries.
Current timing is:
go: 1000 loops 1415 entries in 5.588ms
pyrex: 1000 loops, best of 3: 1.98 msec per loop
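A minimal sketch of what sharing dirname entries can look like. The map-based `intern` helper is an assumption for illustration, not the branch's actual code; the idea is that consecutive entries in a sorted dirstate file repeat the same directory, so reusing one string per dirname avoids a fresh allocation per entry:

```go
package main

import "fmt"

// intern returns a shared string for b, allocating only the first time
// a given dirname is seen. (Hypothetical helper, not the branch's API.)
func intern(seen map[string]string, b []byte) string {
	// Go optimizes map lookups keyed by string(b) to avoid allocating.
	if s, ok := seen[string(b)]; ok {
		return s
	}
	s := string(b) // single allocation per distinct dirname
	seen[s] = s
	return s
}

func main() {
	seen := make(map[string]string)
	a := intern(seen, []byte("src/util"))
	b := intern(seen, []byte("src/util"))
	fmt.Println(a == b, len(seen)) // true 1
}
```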
Branch metadata
- Branch format:
- Branch format 7
- Repository format:
- Bazaar repository format 2a (needs bzr 1.16 or later)