Merge lp:~jameinel/bzr/2.1-static-tuple-btree-string-intern into lp:bzr

Proposed by John A Meinel
Status: Merged
Approved by: Andrew Bennetts
Approved revision: no longer in the source branch.
Merged at revision: not available
Proposed branch: lp:~jameinel/bzr/2.1-static-tuple-btree-string-intern
Merge into: lp:bzr
Diff against target: 553 lines
12 files modified
NEWS (+13/-7)
bzrlib/_bencode_pyx.pyx (+9/-1)
bzrlib/_btree_serializer_pyx.pyx (+80/-38)
bzrlib/_static_tuple_c.c (+8/-2)
bzrlib/btree_index.py (+2/-0)
bzrlib/builtins.py (+4/-1)
bzrlib/index.py (+4/-1)
bzrlib/repository.py (+7/-0)
bzrlib/static_tuple.py (+25/-0)
bzrlib/tests/test__static_tuple.py (+21/-0)
bzrlib/util/_bencode_py.py (+7/-0)
setup.py (+2/-1)
To merge this branch: bzr merge lp:~jameinel/bzr/2.1-static-tuple-btree-string-intern
Reviewer Review Type Date Requested Status
Andrew Bennetts Approve
bzr-core Pending
Review via email: mp+13296@code.launchpad.net
To post a comment you must log in.
Revision history for this message
John A Meinel (jameinel) wrote :

This is the same as an earlier patch for using StaticTuple as part of the btree code. It has a couple small additions.

1) Small fix for 'bzr dump-btree' that casts the objects back to tuples for nicer formatting.
2) Add 'StaticTuple' as a type that 'bencode' knows how to deal with (just treats it as another
   Tuple/List object.)
   Arguably we probably want to end up with 'decode_as_tuples=True' to return StaticTuple
   instances. For now, though, this was all that was necessary to get the test suite to pass on my
   machine. (Though a lot of the tests that were failing on PQM weren't failing here anyway...)

Revision history for this message
Andrew Bennetts (spiv) wrote :

384 + refs_as_tuples = tuple([tuple([tuple(ref) for ref in ref_list])
385 + for ref_list in node[3]])

I wonder if it would be worth adding a convenience method, perhaps StaticTuple.as_tuples(), that recursively does this conversion. That would make this ugliness unnecessary.

431 + # I don't believe we can define a method by which
432 + # (prefix,) + StaticTuple will work, though we could

In plain Python you could define an __radd__ for this, so surely there's a way to do this in C?

class T(object):
    def __radd__(self, other):
        return 'haha!'
t = T()

print ('tuple',) + t # prints 'haha!'

You may need to do something odd like provide the nb_add slot, even though this isn't really a numeric type, but I think that's ok. (All pure python classes would have that I think, even the non-numeric ones, so presumably having tp_as_number filled doesn't automatically make Python do dumb things.)

I think we can live without this, but it would be nice.

488 + k1 = stuple(stuple('<email address hidden>',),
489 + stuple('<email address hidden>',))
490 + k2 = stuple(stuple(stuple('<email address hidden>',),
491 + stuple('<email address hidden>',)),
492 + stuple('<email address hidden>',))

This test data is needlessly complex and hard to read. Why not e.g.:

    k1 = stuple(stuple('a',), stuple('b',))
    k2 = stuple(stuple(stuple('c',), stuple('d',)), stuple('a',))

Which is structurally the same and much easier to follow.

review: Approve
Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Bennetts wrote:
> Review: Approve
> 384 + refs_as_tuples = tuple([tuple([tuple(ref) for ref in ref_list])
> 385 + for ref_list in node[3]])
>
> I wonder if it would be worth adding a convenience method, perhaps StaticTuple.as_tuples(), that recursively does this conversion. That would make this ugliness unnecessary.

As for "as_tuples()" I would be fine just extending ".as_tuple()" to do
exactly that. The main restriction is that we may not always have tuples
at this point.

At least so far, tuple is interchangeable w/ StaticTuple.

>
> 431 + # I don't believe we can define a method by which
> 432 + # (prefix,) + StaticTuple will work, though we could
>
> In plain Python you could define an __radd__ for this, so surely there's a way to do this in C?
>
> class T(object):
> def __radd__(self, other):
> return 'haha!'
> t = T()
>
> print ('tuple',) + t # prints 'haha!'
>

Tuple uses "tp_as_sequence.sq_concat" to handle ('tuple',) + t, which I
didn't think worked the other way. But thanks for pointing me to this,
I'll look into it.

> You may need to do something odd like provide the nb_add slot, even though this isn't really a numeric type, but I think that's ok. (All pure python classes would have that I think, even the non-numeric ones, so presumably having tp_as_number filled doesn't automatically make Python do dumb things.)
>
> I think we can live without this, but it would be nice.

Actually, the main reason I added the comment is because I expect things
to fail at that point, but I haven't gotten a test case to trigger it,
and it also won't trigger with --2a formats... (They don't have missing
compression parents.)

>
> 488 + k1 = stuple(stuple('<email address hidden>',),
> 489 + stuple('<email address hidden>',))
> 490 + k2 = stuple(stuple(stuple('<email address hidden>',),
> 491 + stuple('<email address hidden>',)),
> 492 + stuple('<email address hidden>',))
>
> This test data is needlessly complex and hard to read. Why not e.g.:
>
> k1 = stuple(stuple('a',), stuple('b',))
> k2 = stuple(stuple(stuple('c',), stuple('d',)), stuple('a',))
>
> Which is structurally the same and much easier to follow.

Sure. I did the above because that was the actual data I was getting. Of
course, I've since narrowed it down to a bug in interning....

Anyway, I'm happy to simplify it, and should have done so before submitting.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkrVNP8ACgkQJdeBCYSNAAN61QCbBceibTUybQ2cwzsABrC2rcPc
RwcAni/o9YUyAE/7ShfvcoeHZFGUMwDw
=ZhaX
-----END PGP SIGNATURE-----

Revision history for this message
Andrew Bennetts (spiv) wrote :

[...]
> > In plain Python you could define an __radd__ for this, so surely there's a
> > way to do this in C?
[...]
> > You may need to do something odd like provide the nb_add slot, even though
> > this isn't really a numeric type, but I think that's ok. (All pure python
> > classes would have that I think, even the non-numeric ones, so presumably
> > having tp_as_number filled doesn't automatically make Python do dumb
> > things.)

By the way, because lack of support for tuple + StaticTuple caused all the
babune builders to go red, I took a look at this. (specifically,
bzrlib.tests.per_repository.test_write_group.TestResumeableWriteGroup.test_commit_resumed_write_group_with_missing_parents
was failing)

You actually need nb_coerce as well as nb_add. Here's a rough patch that does
this. The error it gives when you try to add a plain tuple with incompatible
elements (e.g. ints) is probably not ideal, but it works.

-Andrew.

1=== modified file 'bzrlib/_static_tuple_c.c'
2--- bzrlib/_static_tuple_c.c 2009-10-15 18:18:44 +0000
3+++ bzrlib/_static_tuple_c.c 2009-10-16 08:24:36 +0000
4@@ -513,6 +513,78 @@
5 "Check to see if this tuple has been interned.\n";
6
7
8+int
9+StaticTuple_coerce(PyObject **v, PyObject **w)
10+{
11+ StaticTuple *st;
12+ if (PyTuple_Check(*v)) {
13+ st = (StaticTuple*) StaticTuple_new_constructor(
14+ &StaticTuple_Type, *v, NULL);
15+ if (!st)
16+ return -1;
17+ Py_INCREF(st);
18+ *v = (PyObject*)st;
19+ } else if (StaticTuple_CheckExact(*v))
20+ Py_INCREF(*v);
21+ else
22+ return 1;
23+
24+ if (PyTuple_Check(*w)) {
25+ st = (StaticTuple*) StaticTuple_new_constructor(
26+ &StaticTuple_Type, *w, NULL);
27+ if (!st)
28+ return -1;
29+ Py_INCREF(st);
30+ *w = (PyObject*)st;
31+ } else if (StaticTuple_CheckExact(*w))
32+ Py_INCREF(*w);
33+ else
34+ return 1;
35+ return 0;
36+}
37+
38+static PyObject *
39+StaticTuple_add(PyObject *v, PyObject *w)
40+{
41+ PyObject *v_t = NULL, *w_t = NULL;
42+ PyObject *tmp_tuple, *result;
43+ /* StaticTuples and plain tuples may be added (concatenated) to
44+ * StaticTuples.
45+ */
46+ if (StaticTuple_CheckExact(v)) {
47+ v_t = StaticTuple_as_tuple((StaticTuple*)v);
48+ if (!v_t)
49+ goto fail;
50+ } else if (PyTuple_Check(v))
51+ v_t = v;
52+ else
53+ goto not_imp;
54+
55+ if (StaticTuple_CheckExact(w)) {
56+ w_t = StaticTuple_as_tuple((StaticTuple*)w);
57+ if (!w_t)
58+ goto fail;
59+ } else if (PyTuple_Check(w))
60+ w_t = w;
61+ else
62+ goto not_imp;
63+
64+ tmp_tuple = PySequence_Concat(v_t, w_t);
65+ result = StaticTuple_new_constructor(&StaticTuple_Type, tmp_tuple, NULL);
66+ Py_DECREF(tmp_tuple);
67+ Py_INCREF(result);
68+ return result;
69+
70+not_imp:
71+ Py_XDECREF(v_t);
72+ Py_XDECREF(w_t);
73+ return Py_NotImplemented;
74+fail:
75+ Py_XDECREF(v_t);
76+ Py_XDECREF(w_t);
77+ return NULL;
78+}
79+
80 static PyObject *
81 StaticTuple_item(StaticTuple *self, Py_ssize_t offset)
82 {
83@@ -574,6 +646,29 @@
84 {NULL, NULL} /* sentinel */
85 };
86
87+
88+static PyNumberMethods StaticTuple_as_number = {
89+ (binaryfunc) StaticTuple_add, /* nb_add */
90+ 0, /* nb_subtract */
91+ 0, /* nb_multiply */
92+ 0, /* nb_divide */
93+ 0, /* nb_remainder */
94+ 0, /* nb_divmod */
95+ 0, /* nb_power */
96+ 0, /* nb_negative */
97+ 0, /* nb_positive */
98+ 0, /* nb_absolute */
99+ 0, /* nb_nonzero */
100+ 0, /* nb_invert */
101+ 0, /* nb_lshift */
102+ 0, /* nb_rshift */
103+ 0, /* nb_and */
104+ 0, /* nb_xor */
105+ 0, /* nb_or */
106+ StaticTuple_coerce, /* nb_coerce */
107+};
108+
109+
110 static PySequenceMethods StaticTuple_as_sequence = {
111 (lenfunc)StaticTuple_length, /* sq_length */
112 0, /* sq_concat */
113@@ -604,7 +699,7 @@
114 0, /* tp_setattr */
115 0, /* tp_compare */
116 (reprfunc)StaticTuple_repr, /* tp_repr */
117- 0, /* tp_as_number */
118+ &StaticTuple_as_number, /* tp_as_number */
119 &StaticTuple_as_sequence, /* tp_as_sequence */
120 0, /* tp_as_mapping */
121 (hashfunc)StaticTuple_hash, /* tp_hash */

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2009-10-15 04:06:32 +0000
3+++ NEWS 2009-10-15 18:31:17 +0000
4@@ -25,6 +25,11 @@
5 Improvements
6 ************
7
8+* When reading index files, we now use a ``StaticTuple`` rather than a
9+ plain ``tuple`` object. This generally gives a 20% decrease in peak
10+ memory, and can give a performance boost up to 40% on large projects.
11+ (John Arbash Meinel)
12+
13 Documentation
14 *************
15
16@@ -45,13 +50,14 @@
17 used as the interning structure for StaticTuple objects.
18 (John Arbash Meinel)
19
20-* ``bzrlib._static_tuple_pyx.StaticTuple`` is now available. This class
21- functions similarly to ``tuple`` objects. However, it can only point at
22- other ``StaticTuple`` instances or strings. This allows us to remove it
23- from the garbage collector (it cannot be in a cycle), it also allows us
24- to intern the objects. In testing, this can reduce peak memory by
25- 20-40%, and significantly improve performance by removing objects from
26- being inspected by the garbage collector. (John Arbash Meinel)
27+* ``bzrlib._static_tuple_pyx.StaticTuple`` is now available and used by
28+ the btree index parser. This class functions similarly to ``tuple``
29+ objects. However, it can only point at other ``StaticTuple`` instances
30+ or strings. This allows us to remove it from the garbage collector (it
31+ cannot be in a cycle), it also allows us to intern the objects. In
32+ testing, this can reduce peak memory by 20-40%, and significantly
33+ improve performance by removing objects from being inspected by the
34+ garbage collector. (John Arbash Meinel)
35
36 Testing
37 *******
38
39=== modified file 'bzrlib/_bencode_pyx.pyx'
40--- bzrlib/_bencode_pyx.pyx 2009-06-05 01:48:32 +0000
41+++ bzrlib/_bencode_pyx.pyx 2009-10-15 18:31:17 +0000
42@@ -58,6 +58,13 @@
43 void D_UPDATE_TAIL(Decoder, int n)
44 void E_UPDATE_TAIL(Encoder, int n)
45
46+# To maintain compatibility with older versions of pyrex, we have to use the
47+# relative import here, rather than 'bzrlib._static_tuple_c'
48+from _static_tuple_c cimport StaticTuple, StaticTuple_CheckExact, \
49+ import_static_tuple_c
50+
51+import_static_tuple_c()
52+
53
54 cdef class Decoder:
55 """Bencode decoder"""
56@@ -371,7 +378,8 @@
57 self._encode_int(x)
58 elif PyLong_CheckExact(x):
59 self._encode_long(x)
60- elif PyList_CheckExact(x) or PyTuple_CheckExact(x):
61+ elif (PyList_CheckExact(x) or PyTuple_CheckExact(x)
62+ or StaticTuple_CheckExact(x)):
63 self._encode_list(x)
64 elif PyDict_CheckExact(x):
65 self._encode_dict(x)
66
67=== modified file 'bzrlib/_btree_serializer_pyx.pyx'
68--- bzrlib/_btree_serializer_pyx.pyx 2009-10-08 05:12:01 +0000
69+++ bzrlib/_btree_serializer_pyx.pyx 2009-10-15 18:31:17 +0000
70@@ -38,6 +38,8 @@
71 Py_ssize_t PyString_Size(object p)
72 Py_ssize_t PyString_GET_SIZE_ptr "PyString_GET_SIZE" (PyObject *)
73 char * PyString_AS_STRING_ptr "PyString_AS_STRING" (PyObject *)
74+ char * PyString_AS_STRING(object)
75+ Py_ssize_t PyString_GET_SIZE(object)
76 int PyString_AsStringAndSize_ptr(PyObject *, char **buf, Py_ssize_t *len)
77 void PyString_InternInPlace(PyObject **)
78 int PyTuple_CheckExact(object t)
79@@ -55,6 +57,12 @@
80 # void *memrchr(void *s, int c, size_t n)
81 int strncmp(char *s1, char *s2, size_t n)
82
83+# It seems we need to import the definitions so that the pyrex compiler has
84+# local names to access them.
85+from _static_tuple_c cimport StaticTuple, \
86+ import_static_tuple_c, StaticTuple_New, \
87+ StaticTuple_Intern, StaticTuple_SET_ITEM, StaticTuple_CheckExact
88+
89
90 # TODO: Find some way to import this from _dirstate_helpers
91 cdef void* _my_memrchr(void *s, int c, size_t n):
92@@ -71,6 +79,7 @@
93 pos = pos - 1
94 return NULL
95
96+
97 # TODO: Import this from _dirstate_helpers when it is merged
98 cdef object safe_string_from_size(char *s, Py_ssize_t size):
99 if size < 0:
100@@ -94,6 +103,10 @@
101 Py_DECREF_ptr(py_str)
102 return result
103
104+from bzrlib import _static_tuple_c
105+# This sets up the StaticTuple C_API functionality
106+import_static_tuple_c()
107+
108
109 cdef class BTreeLeafParser:
110 """Parse the leaf nodes of a BTree index.
111@@ -133,6 +146,7 @@
112 self._cur_str = NULL
113 self._end_str = NULL
114 self._header_found = 0
115+ # keys are tuples
116
117 cdef extract_key(self, char * last):
118 """Extract a key.
119@@ -142,8 +156,9 @@
120 """
121 cdef char *temp_ptr
122 cdef int loop_counter
123- # keys are tuples
124- key = PyTuple_New(self.key_length)
125+ cdef StaticTuple key
126+
127+ key = StaticTuple_New(self.key_length)
128 for loop_counter from 0 <= loop_counter < self.key_length:
129 # grab a key segment
130 temp_ptr = <char*>memchr(self._start, c'\0', last - self._start)
131@@ -158,15 +173,19 @@
132 last - self._start)))
133 raise AssertionError(failure_string)
134 # capture the key string
135- # TODO: Consider using PyIntern_FromString, the only caveat is that
136- # it assumes a NULL-terminated string, so we have to check if
137- # temp_ptr[0] == c'\0' or some other char.
138- key_element = safe_interned_string_from_size(self._start,
139+ if (self.key_length == 1
140+ and (temp_ptr - self._start) == 45
141+ and strncmp(self._start, 'sha1:', 5) == 0):
142+ key_element = safe_string_from_size(self._start,
143+ temp_ptr - self._start)
144+ else:
145+ key_element = safe_interned_string_from_size(self._start,
146 temp_ptr - self._start)
147 # advance our pointer
148 self._start = temp_ptr + 1
149 Py_INCREF(key_element)
150- PyTuple_SET_ITEM(key, loop_counter, key_element)
151+ StaticTuple_SET_ITEM(key, loop_counter, key_element)
152+ key = StaticTuple_Intern(key)
153 return key
154
155 cdef int process_line(self) except -1:
156@@ -176,6 +195,7 @@
157 cdef char *ref_ptr
158 cdef char *next_start
159 cdef int loop_counter
160+ cdef Py_ssize_t str_len
161
162 self._start = self._cur_str
163 # Find the next newline
164@@ -211,12 +231,25 @@
165 # Invalid line
166 raise AssertionError("Failed to find the value area")
167 else:
168- # capture the value string
169- value = safe_string_from_size(temp_ptr + 1, last - temp_ptr - 1)
170+ # Because of how conversions were done, we ended up with *lots* of
171+ # values that are identical. These are all of the 0-length nodes
172+ # that are referred to by the TREE_ROOT (and likely some other
173+ # directory nodes.) For example, bzr has 25k references to
174+ # something like '12607215 328306 0 0', which ends up consuming 1MB
175+ # of memory, just for those strings.
176+ str_len = last - temp_ptr - 1
177+ if (str_len > 4
178+ and strncmp(" 0 0", last - 4, 4) == 0):
179+ # This drops peak mem for bzr.dev from 87.4MB => 86.2MB
180+ # For Launchpad 236MB => 232MB
181+ value = safe_interned_string_from_size(temp_ptr + 1, str_len)
182+ else:
183+ value = safe_string_from_size(temp_ptr + 1, str_len)
184 # shrink the references end point
185 last = temp_ptr
186+
187 if self.ref_list_length:
188- ref_lists = []
189+ ref_lists = StaticTuple_New(self.ref_list_length)
190 loop_counter = 0
191 while loop_counter < self.ref_list_length:
192 ref_list = []
193@@ -248,18 +281,20 @@
194 if temp_ptr == NULL:
195 # key runs to the end
196 temp_ptr = ref_ptr
197+
198 PyList_Append(ref_list, self.extract_key(temp_ptr))
199- PyList_Append(ref_lists, tuple(ref_list))
200+ ref_list = StaticTuple_Intern(StaticTuple(*ref_list))
201+ Py_INCREF(ref_list)
202+ StaticTuple_SET_ITEM(ref_lists, loop_counter - 1, ref_list)
203 # prepare for the next reference list
204 self._start = next_start
205- ref_lists = tuple(ref_lists)
206- node_value = (value, ref_lists)
207+ node_value = StaticTuple(value, ref_lists)
208 else:
209 if last != self._start:
210 # unexpected reference data present
211 raise AssertionError("unexpected reference data present")
212- node_value = (value, ())
213- PyList_Append(self.keys, (key, node_value))
214+ node_value = StaticTuple(value, StaticTuple())
215+ PyList_Append(self.keys, StaticTuple(key, node_value))
216 return 0
217
218 def parse(self):
219@@ -294,7 +329,6 @@
220 cdef Py_ssize_t flat_len
221 cdef Py_ssize_t key_len
222 cdef Py_ssize_t node_len
223- cdef PyObject * val
224 cdef char * value
225 cdef Py_ssize_t value_len
226 cdef char * out
227@@ -303,13 +337,12 @@
228 cdef int first_ref_list
229 cdef int first_reference
230 cdef int i
231- cdef PyObject *ref_bit
232 cdef Py_ssize_t ref_bit_len
233
234- if not PyTuple_CheckExact(node):
235- raise TypeError('We expected a tuple() for node not: %s'
236+ if not PyTuple_CheckExact(node) and not StaticTuple_CheckExact(node):
237+ raise TypeError('We expected a tuple() or StaticTuple() for node not: %s'
238 % type(node))
239- node_len = PyTuple_GET_SIZE(node)
240+ node_len = len(node)
241 have_reference_lists = reference_lists
242 if have_reference_lists:
243 if node_len != 4:
244@@ -318,8 +351,17 @@
245 elif node_len < 3:
246 raise ValueError('Without ref_lists, we need at least 3 entries not: %s'
247 % len(node))
248- # I don't expect that we can do faster than string.join()
249- string_key = '\0'.join(<object>PyTuple_GET_ITEM_ptr_object(node, 1))
250+ # TODO: We can probably do better than string.join(), namely
251+ # when key has only 1 item, we can just grab that string
252+ # And when there are 2 items, we could do a single malloc + len() + 1
253+ # also, doing .join() requires a PyObject_GetAttrString call, which
254+ # we could also avoid.
255+ # TODO: Note that pyrex 0.9.6 generates fairly crummy code here, using the
256+ # python object interface, versus 0.9.8+ which uses a helper that
257+ # checks if this supports the sequence interface.
258+ # We *could* do more work on our own, and grab the actual items
259+ # lists. For now, just ask people to use a better compiler. :)
260+ string_key = '\0'.join(node[1])
261
262 # TODO: instead of using string joins, precompute the final string length,
263 # and then malloc a single string and copy everything in.
264@@ -336,7 +378,7 @@
265 refs_len = 0
266 if have_reference_lists:
267 # Figure out how many bytes it will take to store the references
268- ref_lists = <object>PyTuple_GET_ITEM_ptr_object(node, 3)
269+ ref_lists = node[3]
270 next_len = len(ref_lists) # TODO: use a Py function
271 if next_len > 0:
272 # If there are no nodes, we don't need to do any work
273@@ -350,31 +392,31 @@
274 # references
275 refs_len = refs_len + (next_len - 1)
276 for reference in ref_list:
277- if not PyTuple_CheckExact(reference):
278+ if (not PyTuple_CheckExact(reference)
279+ and not StaticTuple_CheckExact(reference)):
280 raise TypeError(
281 'We expect references to be tuples not: %s'
282 % type(reference))
283- next_len = PyTuple_GET_SIZE(reference)
284+ next_len = len(reference)
285 if next_len > 0:
286 # We will need (len - 1) '\x00' characters to
287 # separate the reference key
288 refs_len = refs_len + (next_len - 1)
289- for i from 0 <= i < next_len:
290- ref_bit = PyTuple_GET_ITEM_ptr_object(reference, i)
291- if not PyString_CheckExact_ptr(ref_bit):
292+ for ref_bit in reference:
293+ if not PyString_CheckExact(ref_bit):
294 raise TypeError('We expect reference bits'
295 ' to be strings not: %s'
296 % type(<object>ref_bit))
297- refs_len = refs_len + PyString_GET_SIZE_ptr(ref_bit)
298+ refs_len = refs_len + PyString_GET_SIZE(ref_bit)
299
300 # So we have the (key NULL refs NULL value LF)
301 key_len = PyString_Size(string_key)
302- val = PyTuple_GET_ITEM_ptr_object(node, 2)
303- if not PyString_CheckExact_ptr(val):
304+ val = node[2]
305+ if not PyString_CheckExact(val):
306 raise TypeError('Expected a plain str for value not: %s'
307- % type(<object>val))
308- value = PyString_AS_STRING_ptr(val)
309- value_len = PyString_GET_SIZE_ptr(val)
310+ % type(val))
311+ value = PyString_AS_STRING(val)
312+ value_len = PyString_GET_SIZE(val)
313 flat_len = (key_len + 1 + refs_len + 1 + value_len + 1)
314 line = PyString_FromStringAndSize(NULL, flat_len)
315 # Get a pointer to the new buffer
316@@ -396,14 +438,14 @@
317 out[0] = c'\r'
318 out = out + 1
319 first_reference = 0
320- next_len = PyTuple_GET_SIZE(reference)
321+ next_len = len(reference)
322 for i from 0 <= i < next_len:
323 if i != 0:
324 out[0] = c'\x00'
325 out = out + 1
326- ref_bit = PyTuple_GET_ITEM_ptr_object(reference, i)
327- ref_bit_len = PyString_GET_SIZE_ptr(ref_bit)
328- memcpy(out, PyString_AS_STRING_ptr(ref_bit), ref_bit_len)
329+ ref_bit = reference[i]
330+ ref_bit_len = PyString_GET_SIZE(ref_bit)
331+ memcpy(out, PyString_AS_STRING(ref_bit), ref_bit_len)
332 out = out + ref_bit_len
333 out[0] = c'\0'
334 out = out + 1
335
336=== modified file 'bzrlib/_static_tuple_c.c'
337--- bzrlib/_static_tuple_c.c 2009-10-15 16:18:47 +0000
338+++ bzrlib/_static_tuple_c.c 2009-10-15 18:31:17 +0000
339@@ -418,9 +418,15 @@
340 return NULL; /* There seems to be an error */
341 }
342 if (result == Py_NotImplemented) {
343- PyErr_BadInternalCall();
344 Py_DECREF(result);
345- return NULL;
346+ /* One side must have had a string and the other a StaticTuple.
347+ * This clearly means that they are not equal.
348+ */
349+ if (op == Py_EQ) {
350+ Py_INCREF(Py_False);
351+ return Py_False;
352+ }
353+ result = PyObject_RichCompare(v_obj, w_obj, Py_EQ);
354 }
355 if (result == Py_False) {
356 /* This entry is not identical
357
358=== modified file 'bzrlib/btree_index.py'
359--- bzrlib/btree_index.py 2009-10-15 04:01:26 +0000
360+++ bzrlib/btree_index.py 2009-10-15 18:31:17 +0000
361@@ -163,6 +163,7 @@
362 node_refs, _ = self._check_key_ref_value(key, references, value)
363 if key in self._nodes:
364 raise errors.BadIndexDuplicateKey(key, self)
365+ # TODO: StaticTuple
366 self._nodes[key] = (node_refs, value)
367 self._keys.add(key)
368 if self._nodes_by_key is not None and self._key_length > 1:
369@@ -625,6 +626,7 @@
370 for line in lines[2:]:
371 if line == '':
372 break
373+ # TODO: Switch to StaticTuple here.
374 nodes.append(tuple(map(intern, line.split('\0'))))
375 return nodes
376
377
378=== modified file 'bzrlib/builtins.py'
379--- bzrlib/builtins.py 2009-10-08 16:32:43 +0000
380+++ bzrlib/builtins.py 2009-10-15 18:31:17 +0000
381@@ -431,7 +431,10 @@
382 for node in bt.iter_all_entries():
383 # Node is made up of:
384 # (index, key, value, [references])
385- self.outf.write('%s\n' % (node[1:],))
386+ refs_as_tuples = tuple([tuple([tuple(ref) for ref in ref_list])
387+ for ref_list in node[3]])
388+ as_tuple = (tuple(node[1]), node[2], refs_as_tuples)
389+ self.outf.write('%s\n' % (as_tuple,))
390
391
392 class cmd_remove_tree(Command):
393
394=== modified file 'bzrlib/index.py'
395--- bzrlib/index.py 2009-10-13 05:20:50 +0000
396+++ bzrlib/index.py 2009-10-15 18:31:17 +0000
397@@ -40,6 +40,7 @@
398 debug,
399 errors,
400 )
401+from bzrlib.static_tuple import StaticTuple
402
403 _HEADER_READV = (0, 200)
404 _OPTION_KEY_ELEMENTS = "key_elements="
405@@ -102,7 +103,7 @@
406
407 def _check_key(self, key):
408 """Raise BadIndexKey if key is not a valid key for this index."""
409- if type(key) != tuple:
410+ if type(key) not in (tuple, StaticTuple):
411 raise errors.BadIndexKey(key)
412 if self._key_length != len(key):
413 raise errors.BadIndexKey(key)
414@@ -202,7 +203,9 @@
415 if reference not in self._nodes:
416 self._check_key(reference)
417 absent_references.append(reference)
418+ # TODO: StaticTuple
419 node_refs.append(tuple(reference_list))
420+ # TODO: StaticTuple
421 return tuple(node_refs), absent_references
422
423 def add_node(self, key, value, references=()):
424
425=== modified file 'bzrlib/repository.py'
426--- bzrlib/repository.py 2009-10-08 22:53:13 +0000
427+++ bzrlib/repository.py 2009-10-15 18:31:17 +0000
428@@ -4319,6 +4319,13 @@
429 ):
430 if versioned_file is None:
431 continue
432+ # TODO: key is often going to be a StaticTuple object
433+ # I don't believe we can define a method by which
434+ # (prefix,) + StaticTuple will work, though we could
435+ # define a StaticTuple.sq_concat that would allow you to
436+ # pass in either a tuple or a StaticTuple as the second
437+ # object, so instead we could have:
438+ # StaticTuple(prefix) + key here...
439 missing_keys.update((prefix,) + key for key in
440 versioned_file.get_missing_compression_parent_keys())
441 except NotImplementedError:
442
443=== added file 'bzrlib/static_tuple.py'
444--- bzrlib/static_tuple.py 1970-01-01 00:00:00 +0000
445+++ bzrlib/static_tuple.py 2009-10-15 18:31:17 +0000
446@@ -0,0 +1,25 @@
447+# Copyright (C) 2009 Canonical Ltd
448+#
449+# This program is free software; you can redistribute it and/or modify
450+# it under the terms of the GNU General Public License as published by
451+# the Free Software Foundation; either version 2 of the License, or
452+# (at your option) any later version.
453+#
454+# This program is distributed in the hope that it will be useful,
455+# but WITHOUT ANY WARRANTY; without even the implied warranty of
456+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
457+# GNU General Public License for more details.
458+#
459+# You should have received a copy of the GNU General Public License
460+# along with this program; if not, write to the Free Software
461+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
462+
463+"""Interface thunk for a StaticTuple implementation."""
464+
465+try:
466+ from bzrlib._static_tuple_c import StaticTuple
467+except ImportError, e:
468+ from bzrlib import osutils
469+ osutils.failed_to_load_extension(e)
470+ from bzrlib._static_tuple_py import StaticTuple
471+
472
473=== modified file 'bzrlib/tests/test__static_tuple.py'
474--- bzrlib/tests/test__static_tuple.py 2009-10-12 18:10:24 +0000
475+++ bzrlib/tests/test__static_tuple.py 2009-10-15 18:31:17 +0000
476@@ -23,6 +23,7 @@
477 _static_tuple_py,
478 errors,
479 osutils,
480+ static_tuple,
481 tests,
482 )
483
484@@ -278,6 +279,16 @@
485 self.assertCompareEqual(k3, (k1, ('foo', 'bar')))
486 self.assertCompareEqual((k1, ('foo', 'bar')), k3)
487
488+ def test_compare_mixed_depths(self):
489+ stuple = self.module.StaticTuple
490+ k1 = stuple(stuple('a',), stuple('b',))
491+ k2 = stuple(stuple(stuple('c',), stuple('d',)),
492+ stuple('b',))
493+ # This requires comparing a StaticTuple to a 'string', and then
494+ # interpreting that value in the next higher StaticTuple. This used to
495+ # generate a PyErr_BadIternalCall. We now fall back to *something*.
496+ self.assertCompareNoRelation(k1, k2)
497+
498 def test_hash(self):
499 k = self.module.StaticTuple('foo')
500 self.assertEqual(hash(k), hash(('foo',)))
501@@ -416,3 +427,13 @@
502 if self.module is _static_tuple_py:
503 return
504 self.assertIsNot(None, self.module._C_API)
505+
506+ def test_static_tuple_thunk(self):
507+ # Make sure the right implementation is available from
508+ # bzrlib.static_tuple.StaticTuple.
509+ if self.module is _static_tuple_py:
510+ if CompiledStaticTuple.available():
511+ # We will be using the C version
512+ return
513+ self.assertIs(static_tuple.StaticTuple,
514+ self.module.StaticTuple)
515
516=== modified file 'bzrlib/util/_bencode_py.py'
517--- bzrlib/util/_bencode_py.py 2009-06-10 03:56:49 +0000
518+++ bzrlib/util/_bencode_py.py 2009-10-15 18:31:17 +0000
519@@ -154,6 +154,13 @@
520 encode_int(int(x), r)
521 encode_func[BooleanType] = encode_bool
522
523+try:
524+ from bzrlib._static_tuple_c import StaticTuple
525+except ImportError:
526+ pass
527+else:
528+ encode_func[StaticTuple] = encode_list
529+
530
531 def bencode(x):
532 r = []
533
534=== modified file 'setup.py'
535--- setup.py 2009-10-12 17:03:40 +0000
536+++ setup.py 2009-10-15 18:31:17 +0000
537@@ -270,7 +270,6 @@
538
539 add_pyrex_extension('bzrlib._annotator_pyx')
540 add_pyrex_extension('bzrlib._bencode_pyx')
541-add_pyrex_extension('bzrlib._btree_serializer_pyx')
542 add_pyrex_extension('bzrlib._chunks_to_lines_pyx')
543 add_pyrex_extension('bzrlib._groupcompress_pyx',
544 extra_source=['bzrlib/diff-delta.c'])
545@@ -303,6 +302,8 @@
546 add_pyrex_extension('bzrlib._simple_set_pyx')
547 ext_modules.append(Extension('bzrlib._static_tuple_c',
548 ['bzrlib/_static_tuple_c.c']))
549+add_pyrex_extension('bzrlib._btree_serializer_pyx')
550+
551
552 if unavailable_files:
553 print 'C extension(s) not found:'