Merge lp:~jameinel/bzr/1.17-rework-annotate into lp:~bzr/bzr/trunk-old

Proposed by John A Meinel
Status: Merged
Merged at revision: not available
Proposed branch: lp:~jameinel/bzr/1.17-rework-annotate
Merge into: lp:~bzr/bzr/trunk-old
Diff against target: 2746 lines
To merge this branch: bzr merge lp:~jameinel/bzr/1.17-rework-annotate
Reviewer           Review Type    Date Requested    Status
Vincent Ladeuil                                     Pending
bzr-core                                            Pending
Review via email: mp+8281@code.launchpad.net
John A Meinel (jameinel) wrote :

This is a fairly major overhaul of the annotate code, with an eye on improving annotations overall. In the short term, it just makes it faster (~9s => ~7s for NEWS)

Overview of changes
1) Some small tweaks to BranchBuilder that I needed to write some test cases.
2) Changing from a bunch of loose functions in 'bzrlib.annotate.reannotate*' to a class Annotator.
3) Re-implement _KnitAnnotator as an implementation of this class. I didn't change much about how the texts were extracted and then compared, but there is a much better test suite against it now. It also vetted the design a bit, to ensure that the Annotator could be properly subclassed to do specialized extraction. (Knits and knitpacks give us hints as to what our delta should be, so that we don't have to re-delta every text versus all of its parents. Timing shows this to be rather significant.)
4) Implement a pyrex version of Annotator, to handle some inner-loop functionality. (Nicely enough, you can still subclass from it.)
5) This also includes a fairly fundamental change to how the annotation is produced.
   a) Switch from [(ann1, line1), (ann2, line2)] to ([ann1, ann2], [line1, line2]) style of tracking annotations. This means fewer times when we have to re-cast the data from a list of annotated lines into a list of plain lines.
   b) When computing the delta, compare all plain lines first. The prior code used to compare annotated lines, because it made computing overall annotations faster. However, that introduced bug #387294.
   c) Start tracking *multiple* sources for a given line. This means that rather than resolving 'heads()' and collisions at every revision, we wait to do the resolution on the *final* text. This removes a lot of heads() calls for stuff like NEWS (and is the primary performance improvement). I don't think this gets us a lot when dealing with a command line interface, but it has lots of potential for a GUI (which could then show all sources that introduced a line, etc.)
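
To make 5a and 5c concrete, here is a small illustrative sketch (made-up
data, not actual bzrlib code):

# Old style: one (annotation, line) pair per line, one winner per line:
old_style = [(('rev-1',), 'line one\n'), (('rev-2',), 'line two\n')]

# New style: two parallel lists, where each line's annotation is a tuple
# of *every* possible source key, left unresolved until the end:
annotations = [('rev-1',), ('rev-2', 'rev-3')]  # line two has 2 candidates
lines = ['line one\n', 'line two\n']

# heads() is only consulted once, when flattening the final text:
# annotate_flat() then picks a single winner per line from each tuple.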

Still to do:
1) I'd like to break it up a bit more, and allow you to pass some sort of Policy object into the code. That would let you do things like ignore whitespace changes, only annotate based on mainline changes, etc.

2) Support _break_annotation_tie again. This is really something I'd like to fold into Policy, but I guess MySQL wanted a custom implementation. I don't really like the current api, as it is probably fixed at 2 revisions, and it passes in the lines along with the revisions. But I can certainly adapt what I need to the old api. Note that technically the api supports > 2 revisions, though I doubt that is actually used anywhere; I haven't seen the MySQL implementation as a comparison point.

3) GUI support. I don't really know how to expose the Annotator functionality to a GUI, or if the api really works well there until I've actually written some code. However, this patch has gotten too big already, so I'd like to get it reviewed as is.

Vincent Ladeuil (vila) wrote :

Review: Approve

Good to see more tests in that area !

There is little to comment on given the detailed cover letter, I
like the cleanup (to come ? :) in annotate and the introduction
of the Annotator class, but since you intend to build policy
classes as front-end, why not make the class private until you
feel more confident about the overall API ? Like you, I'm not
sure the GUIs will really need to access that class...

>>>>> "jam" == John A Meinel <email address hidden> writes:

    jam> You have been requested to review the proposed merge of
    jam> lp:~jameinel/bzr/1.17-rework-annotate into lp:bzr.

    jam> This is a fairly major overhaul of the annotate code,
    jam> with an eye on improving annotations overall. In the
    jam> short term, it just makes it faster (~9s => ~7s for
    jam> NEWS)

Which is always good to take (for interested readers, that's still
measured with --show-ids).

    jam> Overview of changes

    jam> 1) Some small tweaks to BranchBuilder that I needed to
    jam> write some test cases.

Good, obviously the tests you modified are clearer too.

    jam> 2) Changing from a bunch of loose functions in
    jam> 'bzrlib.annotate.reannotate*' to a class Annotator.

Good step forward.

A couple of comments on the class API:

- _update_needed_children() and _get_needed_keys() sound like
  good candidates for Graph() or some specialization of it.

- _update_from_one_parent() the doc string says first parent, why
  not call it _update_from_first_parent() then ? Unless you envision
  some other possible usage...

- add_special_text(), hmm, what's that ? The doc string doesn't
  help a lot :-) Does that need to be public ?

    jam> 4) Implement a pyrex version of Annotator, to handle some
    jam> inner-loop functionality. (Nicely enough, you can still subclass
    jam> from it.)

It would be nice to define in pyrex *only* those inner loops;
not a requirement to land this patch though.

<snip/>

    jam> Still to do:
    jam> 1) I'd like to break it up a bit more, and allow you to pass some
    jam> sort of Policy object into the code. That would let you do things
    jam> like ignore whitespace changes, only annotate based on mainline
    jam> changes, etc.

    jam> 2) Support _break_annotation_tie again. This is really something
    jam> I'd like to fold into Policy, but I guess MySQL wanted a custom
    jam> implementation. I don't really like the current api, as it is
    jam> probably fixed at 2 revisions, and it passes in the lines along
    jam> with the revisions. But I can certainly adapt what I need to the
    jam> old api. Note that technically the api supports > 2 revisions,
    jam> though I doubt that is actually used anywhere; I haven't seen the
    jam> MySQL implementation as a comparison point.

Pretty much:

- extract the date from the revids of the annotations (only the
  first two),
- return the oldest

It would be much appreciated not to break the current result, or at
the very least to provide a way to get the same functionality before
1.17 is out.
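
For reference, a rough sketch of that tie-breaker (this assumes the old
api's input shape, [((file_id, rev_id), line), ...], and revids of the
usual 'user@host-YYYYMMDDhhmmss-unique' form; not the actual MySQL code):

def _date_based_tie(annotated_lines):
    # Consider only the first two candidates, and return the oldest one.
    def embedded_date(annotated_line):
        (file_id, rev_id), line = annotated_line
        return rev_id.rsplit('-', 2)[1]  # the embedded timestamp field
    return min(annotated_lines[:2], key=embedded_date)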

    jam> 3) GUI support. I don't really know how to expose the Annotator
    jam> functionality to a GUI, or if the api really works well there
    jam...


John A Meinel (jameinel) wrote :

> There is little to comment on given the detailed cover letter, I
> like the cleanup (to come ? :) in annotate and the introduction
> of the Annotator class, but since you intend to build policy
> classes as front-end, why not make the class private until you
> feel more confident about the overall API ? Like you, I'm not
> sure the GUIs will really need to access that class...

I'm not sure if you understood me correctly. Annotator is *meant* to be
the final api for getting an annotation for various versions of a file.

An AnnotationPolicy is meant to be the way to say "I want to ignore
whitespace", etc.

I can make it hidden, but VF.get_annotator() is meant to be the public
api that qannotate/gannotate will use...
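
As a minimal sketch of that entry point (assuming a versioned-file object
'vf' and a text key 'key' already exist in the caller):

annotator = vf.get_annotator()
annotations, lines = annotator.annotate(key)
for sources, line in zip(annotations, lines):
    # 'sources' is a tuple of every key that may have introduced 'line';
    # a GUI could show all of them instead of picking a single winner.
    print sources, line,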

...
>
> Pretty much:
>
> - extract the date from the revids of the annotations (only the
> first two ones),
> - return the oldest
>
> It would be very appreciated to not break the actual result or at
> the very least provides a way to get the same functionality
> before 1.17 is out.

So my idea was to do:

_break_annotation_tie = None

if _break_annotation_tie is not None:
  # mutate the data to fit the old api
else:
  # do it my way
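
Fleshed out, that shim looks roughly like this (helper name hypothetical;
the landed version is _resolve_annotation_tie in the diff below):

def _pick_head(the_heads, line, tiebreaker=None):
    if tiebreaker is None:
        # do it my way: deterministically take the sorted smallest head
        return sorted(the_heads)[0]
    # mutate the data to fit the old api: feed (annotation, line) pairs
    # to the old two-way tiebreaker, one pair at a time
    heads_iter = iter(the_heads)
    head = heads_iter.next()
    for possible_head in heads_iter:
        annotated_lines = ((head, line), (possible_head, line))
        head = tiebreaker(annotated_lines)[0]
    return head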

...

> jam> + the_heads = heads(annotation)
> jam> + if len(the_heads) == 1:
> jam> + for head in the_heads:
> jam> + break
>
> That's a really funny way to write head = heads[0]... Am I missing
> something ?

heads is a set, you can't do set()[0], you have to:

list(heads)[0]

iter(heads).next()

heads.pop() # though I believe it is a frozenset and this is illegal

[head for head in heads][0]

for head in heads:
  continue

for head in heads:
  pass

for head in heads:
  break

Take your pick. The last one is the fastest, because it evaluates the
iter() inline and doesn't have a function call, nor does it build an
intermediate list. And for whatever reason, timeit says that 'break' is
faster than the others.
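
For the curious, a quick harness to check such micro-timings (a sketch;
absolute numbers will vary by machine and Python version):

import timeit
setup = "heads = frozenset(['rev-a', 'rev-b'])"
for stmt in ("list(heads)[0]",
             "iter(heads).next()",
             "[h for h in heads][0]",
             "for h in heads: break"):
    # Each statement extracts one element from the frozenset.
    print '%-24s' % stmt, timeit.Timer(stmt, setup).timeit()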

>
> <snip/>
>
> jam> === added file 'bzrlib/_annotator_pyx.pyx'
> ...
> jam> +class Annotator:
> ...
> jam> + def _update_needed_children(self, key, parent_keys):
> jam> + for parent_key in parent_keys:
> jam> + if parent_key in self._num_needed_children:
> jam> + self._num_needed_children[parent_key] += 1
>
> +=1 ? Hmm, I like that pyrex version you're using, send me some :)

Actually for pyrex 0.9.8 you can even do:

cdef list foo

and then *it* will translate

foo.append(...)

into

PyList_Append(foo, ...)

It would be *really* nice to depend on 0.9.8.5 as it would clean up
certain bits. (Note it only really supports lists, it allows:
  cdef dict foo
  cdef tuple foo

and will do runtime checking, etc., but it doesn't have any smarts about
set_item/get_item/append, etc.
)

John
=:->

John A Meinel (jameinel) wrote :

...
>
> A couple of comments on the class API:
>
> - _update_needed_children() and _get_needed_keys() sounds like
> good candidates for Graph() or some specialization of it.

True, though they also have side effects like removing texts when there
are no more needed children, etc.

>
> - _update_from_one_parent() the doc string says first parent, why
> not call it _update_from_first_parent() then ? Unless you envision
> some other possible usage...

Sure.

>
> - add_special_text(), hmm, what's that ? The doc string doesn't
> help a lot :-) Does that need to be public ?

It does, as it is used by WorkingTree to add the 'current:' text to be
annotated. (One other benefit of this new code is that 'bzr annotate
NEWS' after a merge doesn't annotate both parents independently... \o/)
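
(As an illustrative sketch of that use, with made-up keys: the
uncommitted text gets a fake 'current:' key on top of its basis
revision, then gets annotated like any other version.)

annotator = vf.get_annotator()
current_key = ('file-id', 'current:')
basis_parents = [('file-id', 'basis-rev-id')]
annotator.add_special_text(current_key, basis_parents,
                           'edited line\nanother line\n')
annotations, lines = annotator.annotate(current_key)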

>
> jam> 4) Implement a pyrex version of Annotator, to handle some
> jam> inner-loop functionality. (Nicely enough, you can still subclass
> jam> from it.)
>
> It would be nice to define in pyrex *only* those inner loops;
> not a requirement to land this patch though.
>

John
=:->

Vincent Ladeuil (vila) wrote :

>>>>> "jam" == John A Meinel <email address hidden> writes:

    >> There is little to comment on given the detailed cover letter, I
    >> like the cleanup (to come ? :) in annotate and the introduction
    >> of the Annotator class, but since you intend to build policy
    >> classes as front-end, why not make the class private until you
    >> feel more confident about the overall API ? Like you, I'm not
    >> sure the GUIs will really need to access that class...

    jam> I'm not sure if you understood me correctly. Annotator
    jam> is *meant* to be the final api for getting an annotation
    jam> for various versions of a file.

Oh ! Indeed, I understood you wanted trees to be the primary interface.

<snip/>

    jam> So my idea was to do:

    jam> _break_annotation_tie = None

    jam> if _break_annotation_tie is not None:
    jam> # mutate the data to fit the old api
    jam> else:
    jam> # do it my way

Fine.

    jam> heads is a set, you can't do set()[0], you have to:

    jam> list(heads)[0]

    jam> iter(heads).next()

    jam> heads.pop() # though I believe it is a frozenset and this is illegal

    jam> [head for head in heads][0]

    jam> for head in heads:
    jam> continue

    jam> for head in heads:
    jam> pass

    jam> for head in heads:
    jam> break

Then I'd go with:

  for head in heads: break # Get head from the set

on a single line to make it more obvious.

That's the first time I've seen that idiom; I will not be surprised
next time (hopefully, but others may be).

<snip/>

    >> +=1 ? Hmm, I like that pyrex version you're using, send me some :)

    jam> Actually for pyrex 0.9.8 you can even do:

Oh ! Yes, jaunty is still at 0.9.7.2 ...

Reading the NEWS about it, I can only agree here.

What is needed to have the package updated ? Host it in the bzr
PPAs ?

     Vincent

Vincent Ladeuil (vila) wrote :

>>>>> "jam" == John A Meinel <email address hidden> writes:

    jam> ...
    >>
    >> A couple of comments on the class API:
    >>
    >> - _update_needed_children() and _get_needed_keys() sound like
    >> good candidates for Graph() or some specialization of it.

    jam> True, though they also have side effects like removing
    jam> texts when there are no more needed children, etc.

I see.

<snip/>

    >> - add_special_text(), hmm, what's that ? The doc string doesn't
    >> help a lot :-) Does that need to be public ?

    jam> It does, as it is used by WorkingTree to add the 'current:' text to be
    jam> annotated.

Haaa ! Worth mentioning then :)

    jam> (One other benefit of this new code is that 'bzr
    jam> annotate NEWS' after a merge doesn't annotate both
    jam> parents independently... \o/)

Hurrah !

       Vincent

John A Meinel (jameinel) wrote :

Vincent Ladeuil wrote:
>>>>>> "jam" == John A Meinel <email address hidden> writes:
>
> >> There is little to comment on given the detailed cover letter, I
> >> like the cleanup (to come ? :) in annotate and the introduction
> >> of the Annotator class, but since you intend to build policy
> >> classes as front-end, why not make the class private until you
> >> feel more confident about the overall API ? Like you, I'm not
> >> sure the GUIs will really need to access that class...
>
> jam> I'm not sure if you understood me correctly. Annotator
> jam> is *meant* to be the final api for getting an annotation
> jam> for various versions of a file.
>
> Oh ! Indeed, I understood you wanted trees to be the primary interface.

So I'm trying to resolve the two issues. But a *tree* doesn't talk about
the history of a file. I might add:

 tree.get_annotator(file_id)

That would say something like: "give me an object that can talk about
the history of the file you have."

I need some way for the WT to inject the 'current' version, and have it
default to annotating that tip. I haven't worked out the details yet.

You *could* do this by having a bunch of revision trees for each
revision you want to annotate, and then having the annotator cached at
the VF layer. But it seems better to control the caching lifetime and
parameters in the GUI layer, rather than underneath trees inside VF.
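
Purely as a hypothetical shape for that idea (nothing here is settled,
and the helper is made up; repository.texts is the backing VF):

def get_annotator(tree, file_id):
    # Build an annotator over the file's history, then inject the
    # working tree's current text under a fake 'current:' key.
    annotator = tree.branch.repository.texts.get_annotator()
    annotator.add_special_text(
        (file_id, 'current:'),
        [(file_id, rev_id) for rev_id in tree.get_parent_ids()],
        tree.get_file_text(file_id))
    return annotator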

...

>
> Then I'd go with:
>
> for head in heads: break # Get head from the set
>
> on a single line to make it more obvious.
>
> That's the first time I see that idiom, I will not be surprised
> next time (hopefully, but others can).
>
> <snip/>
>
> >> +=1 ? Hmm, I like that pyrex version you're using, send me some :)
>
> jam> Actually for pyrex 0.9.8 you can even do:
>
> Oh ! Yes, jaunty is still at 0.9.7.2 ...
>
> Reading the NEWS about it, I can only agree here.
>
> What is needed to have the package updated ? Host it in the bzr
> PPAs ?
>
> Vincent

I don't really know.

John
=:->

Preview Diff

=== modified file '.bzrignore'
--- .bzrignore 2009-06-22 12:52:39 +0000
+++ .bzrignore 2009-07-08 23:35:26 +0000
@@ -38,6 +38,7 @@
 ./api
 doc/**/*.html
 doc/developers/performance.png
+bzrlib/_annotator_pyx.c
 bzrlib/_bencode_pyx.c
 bzrlib/_btree_serializer_pyx.c
 bzrlib/_chk_map_pyx.c
=== modified file 'NEWS'
--- NEWS 2009-07-08 18:05:38 +0000
+++ NEWS 2009-07-08 23:35:27 +0000
@@ -48,6 +48,9 @@
     diverged-branches`` when a push fails because the branches have
     diverged. (Neil Martinsen-Burrell, #269477)
 
+* Annotate would sometimes 'latch on' to trivial lines, causing important
+  lines to be incorrectly annotated. (John Arbash Meinel, #387952)
+
 * Automatic format upgrades triggered by default stacking policies on a
   1.16rc1 (or later) smart server work again.
   (Andrew Bennetts, #388675)
@@ -164,7 +167,12 @@
 Improvements
 ************
 
-``bzr ls`` is now faster. On OpenOffice.org, the time drops from 2.4
+* ``bzr annotate`` can now be significantly faster. The time for
+  ``bzr annotate NEWS`` is down to 7s from 22s in 1.16. Files with long
+  histories and lots of 'duplicate insertions' will be improved more than
+  others. (John Arbash Meinel, Vincent Ladeuil)
+
+* ``bzr ls`` is now faster. On OpenOffice.org, the time drops from 2.4
   to 1.1 seconds. The improvement for ``bzr ls -r-1`` is more
   substantial dropping from 54.3 to 1.1 seconds. (Ian Clatworthy)
 
=== added file 'bzrlib/_annotator_py.py'
--- bzrlib/_annotator_py.py 1970-01-01 00:00:00 +0000
+++ bzrlib/_annotator_py.py 2009-07-08 23:35:27 +0000
@@ -0,0 +1,309 @@
+# Copyright (C) 2009 Canonical Ltd
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+
+"""Functionality for doing annotations in the 'optimal' way"""
+
+from bzrlib.lazy_import import lazy_import
+lazy_import(globals(), """
+from bzrlib import annotate # Must be lazy to avoid circular importing
+""")
+from bzrlib import (
+    errors,
+    graph as _mod_graph,
+    osutils,
+    patiencediff,
+    ui,
+    )
+
+
+class Annotator(object):
+    """Class that drives performing annotations."""
+
+    def __init__(self, vf):
+        """Create a new Annotator from a VersionedFile."""
+        self._vf = vf
+        self._parent_map = {}
+        self._text_cache = {}
+        # Map from key => number of nexts that will be built from this key
+        self._num_needed_children = {}
+        self._annotations_cache = {}
+        self._heads_provider = None
+        self._ann_tuple_cache = {}
+
+    def _update_needed_children(self, key, parent_keys):
+        for parent_key in parent_keys:
+            if parent_key in self._num_needed_children:
+                self._num_needed_children[parent_key] += 1
+            else:
+                self._num_needed_children[parent_key] = 1
+
+    def _get_needed_keys(self, key):
+        """Determine the texts we need to get from the backing vf.
+
+        :return: (vf_keys_needed, ann_keys_needed)
+            vf_keys_needed  These are keys that we need to get from the vf
+            ann_keys_needed Texts which we have in self._text_cache but we
+                            don't have annotations for. We need to yield these
+                            in the proper order so that we can get proper
+                            annotations.
+        """
+        parent_map = self._parent_map
+        # We need 1 extra copy of the node we will be looking at when we are
+        # done
+        self._num_needed_children[key] = 1
+        vf_keys_needed = set()
+        ann_keys_needed = set()
+        needed_keys = set([key])
+        while needed_keys:
+            parent_lookup = []
+            next_parent_map = {}
+            for key in needed_keys:
+                if key in self._parent_map:
+                    # We don't need to lookup this key in the vf
+                    if key not in self._text_cache:
+                        # Extract this text from the vf
+                        vf_keys_needed.add(key)
+                    elif key not in self._annotations_cache:
+                        # We do need to annotate
+                        ann_keys_needed.add(key)
+                        next_parent_map[key] = self._parent_map[key]
+                else:
+                    parent_lookup.append(key)
+                    vf_keys_needed.add(key)
+            needed_keys = set()
+            next_parent_map.update(self._vf.get_parent_map(parent_lookup))
+            for key, parent_keys in next_parent_map.iteritems():
+                if parent_keys is None: # No graph versionedfile
+                    parent_keys = ()
+                    next_parent_map[key] = ()
+                self._update_needed_children(key, parent_keys)
+                needed_keys.update([key for key in parent_keys
+                                    if key not in parent_map])
+            parent_map.update(next_parent_map)
+            # _heads_provider does some graph caching, so it is only valid while
+            # self._parent_map hasn't changed
+            self._heads_provider = None
+        return vf_keys_needed, ann_keys_needed
+
+    def _get_needed_texts(self, key, pb=None):
+        """Get the texts we need to properly annotate key.
+
+        :param key: A Key that is present in self._vf
+        :return: Yield (this_key, text, num_lines)
+            'text' is an opaque object that just has to work with whatever
+            matcher object we are using. Currently it is always 'lines' but
+            future improvements may change this to a simple text string.
+        """
+        keys, ann_keys = self._get_needed_keys(key)
+        if pb is not None:
+            pb.update('getting stream', 0, len(keys))
+        stream = self._vf.get_record_stream(keys, 'topological', True)
+        for idx, record in enumerate(stream):
+            if pb is not None:
+                pb.update('extracting', 0, len(keys))
+            if record.storage_kind == 'absent':
+                raise errors.RevisionNotPresent(record.key, self._vf)
+            this_key = record.key
+            lines = osutils.chunks_to_lines(record.get_bytes_as('chunked'))
+            num_lines = len(lines)
+            self._text_cache[this_key] = lines
+            yield this_key, lines, num_lines
+        for key in ann_keys:
+            lines = self._text_cache[key]
+            num_lines = len(lines)
+            yield key, lines, num_lines
+
+    def _get_parent_annotations_and_matches(self, key, text, parent_key):
+        """Get the list of annotations for the parent, and the matching lines.
+
+        :param text: The opaque value given by _get_needed_texts
+        :param parent_key: The key for the parent text
+        :return: (parent_annotations, matching_blocks)
+            parent_annotations is a list as long as the number of lines in
+                parent
+            matching_blocks is a list of (parent_idx, text_idx, len) tuples
+                indicating which lines match between the two texts
+        """
+        parent_lines = self._text_cache[parent_key]
+        parent_annotations = self._annotations_cache[parent_key]
+        # PatienceSequenceMatcher should probably be part of Policy
+        matcher = patiencediff.PatienceSequenceMatcher(None,
+            parent_lines, text)
+        matching_blocks = matcher.get_matching_blocks()
+        return parent_annotations, matching_blocks
+
+    def _update_from_first_parent(self, key, annotations, lines, parent_key):
+        """Reannotate this text relative to its first parent."""
+        (parent_annotations,
+         matching_blocks) = self._get_parent_annotations_and_matches(
+                                key, lines, parent_key)
+
+        for parent_idx, lines_idx, match_len in matching_blocks:
+            # For all matching regions we copy across the parent annotations
+            annotations[lines_idx:lines_idx + match_len] = \
+                parent_annotations[parent_idx:parent_idx + match_len]
+
+    def _update_from_other_parents(self, key, annotations, lines,
+                                   this_annotation, parent_key):
+        """Reannotate this text relative to a second (or more) parent."""
+        (parent_annotations,
+         matching_blocks) = self._get_parent_annotations_and_matches(
+                                key, lines, parent_key)
+
+        last_ann = None
+        last_parent = None
+        last_res = None
+        # TODO: consider making all annotations unique and then using 'is'
+        #       everywhere. Current results claim that isn't any faster,
+        #       because of the time spent deduping
+        #       deduping also saves a bit of memory. For NEWS it saves ~1MB,
+        #       but that is out of 200-300MB for extracting everything, so a
+        #       fairly trivial amount
+        for parent_idx, lines_idx, match_len in matching_blocks:
+            # For lines which match this parent, we will now resolve whether
+            # this parent wins over the current annotation
+            ann_sub = annotations[lines_idx:lines_idx + match_len]
+            par_sub = parent_annotations[parent_idx:parent_idx + match_len]
+            if ann_sub == par_sub:
+                continue
+            for idx in xrange(match_len):
+                ann = ann_sub[idx]
+                par_ann = par_sub[idx]
+                ann_idx = lines_idx + idx
+                if ann == par_ann:
+                    # Nothing to change
+                    continue
+                if ann == this_annotation:
+                    # Originally claimed 'this', but it was really in this
+                    # parent
+                    annotations[ann_idx] = par_ann
+                    continue
+                # Resolve the fact that both sides have a different value for
+                # last modified
+                if ann == last_ann and par_ann == last_parent:
+                    annotations[ann_idx] = last_res
+                else:
+                    new_ann = set(ann)
+                    new_ann.update(par_ann)
+                    new_ann = tuple(sorted(new_ann))
+                    annotations[ann_idx] = new_ann
+                    last_ann = ann
+                    last_parent = par_ann
+                    last_res = new_ann
+
+    def _record_annotation(self, key, parent_keys, annotations):
+        self._annotations_cache[key] = annotations
+        for parent_key in parent_keys:
+            num = self._num_needed_children[parent_key]
+            num -= 1
+            if num == 0:
+                del self._text_cache[parent_key]
+                del self._annotations_cache[parent_key]
+                # Do we want to clean up _num_needed_children at this point as
+                # well?
+            self._num_needed_children[parent_key] = num
+
+    def _annotate_one(self, key, text, num_lines):
+        this_annotation = (key,)
+        # Note: annotations will be mutated by calls to _update_from*
+        annotations = [this_annotation] * num_lines
+        parent_keys = self._parent_map[key]
+        if parent_keys:
+            self._update_from_first_parent(key, annotations, text,
+                                           parent_keys[0])
+            for parent in parent_keys[1:]:
+                self._update_from_other_parents(key, annotations, text,
+                                                this_annotation, parent)
+        self._record_annotation(key, parent_keys, annotations)
+
+    def add_special_text(self, key, parent_keys, text):
+        """Add a specific text to the graph.
+
+        This is used to add a text which is not otherwise present in the
+        versioned file. (eg. a WorkingTree injecting 'current:' into the
+        graph to annotate the edited content.)
+
+        :param key: The key to use to request this text be annotated
+        :param parent_keys: The parents of this text
+        :param text: A string containing the content of the text
+        """
+        self._parent_map[key] = parent_keys
+        self._text_cache[key] = osutils.split_lines(text)
+        self._heads_provider = None
+
+    def annotate(self, key):
+        """Return annotated fulltext for the given key.
+
+        :param key: A tuple defining the text to annotate
+        :return: ([annotations], [lines])
+            annotations is a list of tuples of keys, one for each line in lines
+                each key is a possible source for the given line.
+            lines the text of "key" as a list of lines
+        """
+        pb = ui.ui_factory.nested_progress_bar()
+        try:
+            for text_key, text, num_lines in self._get_needed_texts(key, pb=pb):
+                self._annotate_one(text_key, text, num_lines)
+        finally:
+            pb.finished()
+        try:
+            annotations = self._annotations_cache[key]
+        except KeyError:
+            raise errors.RevisionNotPresent(key, self._vf)
+        return annotations, self._text_cache[key]
+
+    def _get_heads_provider(self):
+        if self._heads_provider is None:
+            self._heads_provider = _mod_graph.KnownGraph(self._parent_map)
+        return self._heads_provider
+
+    def _resolve_annotation_tie(self, the_heads, line, tiebreaker):
+        if tiebreaker is None:
+            head = sorted(the_heads)[0]
+        else:
+            # Backwards compatibility, break up the heads into pairs and
+            # resolve the result
+            next_head = iter(the_heads)
+            head = next_head.next()
+            for possible_head in next_head:
+                annotated_lines = ((head, line), (possible_head, line))
+                head = tiebreaker(annotated_lines)[0]
+        return head
+
+    def annotate_flat(self, key):
+        """Determine the single-best-revision to source for each line.
+
+        This is meant as a compatibility thunk to how annotate() used to work.
+        :return: [(ann_key, line)]
+            A list of tuples with a single annotation key for each line.
+        """
+        custom_tiebreaker = annotate._break_annotation_tie
+        annotations, lines = self.annotate(key)
+        out = []
+        heads = self._get_heads_provider().heads
+        append = out.append
+        for annotation, line in zip(annotations, lines):
+            if len(annotation) == 1:
+                head = annotation[0]
+            else:
+                the_heads = heads(annotation)
+                if len(the_heads) == 1:
+                    for head in the_heads: break # get the item out of the set
+                else:
+                    head = self._resolve_annotation_tie(the_heads, line,
+                                                        custom_tiebreaker)
+            append((head, line))
+        return out
=== added file 'bzrlib/_annotator_pyx.pyx'
--- bzrlib/_annotator_pyx.pyx 1970-01-01 00:00:00 +0000
+++ bzrlib/_annotator_pyx.pyx 2009-07-08 23:35:27 +0000
@@ -0,0 +1,287 @@
+# Copyright (C) 2009 Canonical Ltd
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+
+"""Functionality for doing annotations in the 'optimal' way"""
+
+cdef extern from "python-compat.h":
+    pass
+
+cdef extern from "Python.h":
+    ctypedef int Py_ssize_t
+    ctypedef struct PyObject:
+        pass
+    ctypedef struct PyListObject:
+        PyObject **ob_item
+    int PyList_CheckExact(object)
+    PyObject *PyList_GET_ITEM(object, Py_ssize_t o)
+    Py_ssize_t PyList_GET_SIZE(object)
+    int PyList_Append(object, object) except -1
+    int PyList_SetItem(object, Py_ssize_t o, object) except -1
+    int PyList_Sort(object) except -1
+
+    int PyTuple_CheckExact(object)
+    object PyTuple_New(Py_ssize_t len)
+    void PyTuple_SET_ITEM(object, Py_ssize_t pos, object)
+    void PyTuple_SET_ITEM_ptr "PyTuple_SET_ITEM" (object, Py_ssize_t,
+                                                  PyObject *)
+    int PyTuple_Resize(PyObject **, Py_ssize_t newlen)
+    PyObject *PyTuple_GET_ITEM(object, Py_ssize_t o)
+    Py_ssize_t PyTuple_GET_SIZE(object)
+
+    PyObject *PyDict_GetItem(object d, object k)
+    int PyDict_SetItem(object d, object k, object v) except -1
+
+    void Py_INCREF(object)
+    void Py_INCREF_ptr "Py_INCREF" (PyObject *)
+    void Py_DECREF_ptr "Py_DECREF" (PyObject *)
+
+    int Py_EQ
+    int Py_LT
+    int PyObject_RichCompareBool(object, object, int opid) except -1
+    int PyObject_RichCompareBool_ptr "PyObject_RichCompareBool" (
+        PyObject *, PyObject *, int opid)
+
+
+from bzrlib import _annotator_py
+
+
+cdef int _check_annotations_are_lists(annotations,
+                                      parent_annotations) except -1:
+    if not PyList_CheckExact(annotations):
+        raise TypeError('annotations must be a list')
+    if not PyList_CheckExact(parent_annotations):
+        raise TypeError('parent_annotations must be a list')
+    return 0
+
+
+cdef int _check_match_ranges(parent_annotations, annotations,
+                             Py_ssize_t parent_idx, Py_ssize_t lines_idx,
+                             Py_ssize_t match_len) except -1:
+    if parent_idx + match_len > PyList_GET_SIZE(parent_annotations):
+        raise ValueError('Match length exceeds len of'
+                         ' parent_annotations %s > %s'
+                         % (parent_idx + match_len,
+                            PyList_GET_SIZE(parent_annotations)))
+    if lines_idx + match_len > PyList_GET_SIZE(annotations):
+        raise ValueError('Match length exceeds len of'
+                         ' annotations %s > %s'
+                         % (lines_idx + match_len,
+                            PyList_GET_SIZE(annotations)))
+    return 0
+
+
+cdef PyObject *_next_tuple_entry(object tpl, Py_ssize_t *pos):
+    pos[0] = pos[0] + 1
+    if pos[0] >= PyTuple_GET_SIZE(tpl):
+        return NULL
+    return PyTuple_GET_ITEM(tpl, pos[0])
+
+
+cdef object _combine_annotations(ann_one, ann_two, cache):
+    """Combine the annotations from both sides."""
+    cdef Py_ssize_t pos_one, pos_two, len_one, len_two
+    cdef Py_ssize_t out_pos
+    cdef PyObject *temp, *left, *right
+
+    if (PyObject_RichCompareBool(ann_one, ann_two, Py_LT)):
+        cache_key = (ann_one, ann_two)
+    else:
+        cache_key = (ann_two, ann_one)
+    temp = PyDict_GetItem(cache, cache_key)
+    if temp != NULL:
+        return <object>temp
+
+    if not PyTuple_CheckExact(ann_one) or not PyTuple_CheckExact(ann_two):
+        raise TypeError('annotations must be tuples')
+    # We know that annotations are tuples, and that both sides are already
+    # sorted, so we can just walk and update a new list.
+    pos_one = -1
+    pos_two = -1
+    out_pos = 0
+    left = _next_tuple_entry(ann_one, &pos_one)
+    right = _next_tuple_entry(ann_two, &pos_two)
+    new_ann = PyTuple_New(PyTuple_GET_SIZE(ann_one)
+                          + PyTuple_GET_SIZE(ann_two))
+    while left != NULL and right != NULL:
+        # left == right is done by PyObject_RichCompareBool_ptr, however it
+        # avoids a function call for a very common case. Drops 'time bzr
+        # annotate NEWS' from 7.25s to 7.16s, so it *is* a visible impact.
+        if (left == right
+            or PyObject_RichCompareBool_ptr(left, right, Py_EQ)):
+            # Identical values, step both
+            Py_INCREF_ptr(left)
+            PyTuple_SET_ITEM_ptr(new_ann, out_pos, left)
+            left = _next_tuple_entry(ann_one, &pos_one)
+            right = _next_tuple_entry(ann_two, &pos_two)
+        elif (PyObject_RichCompareBool_ptr(left, right, Py_LT)):
+            # left < right or right == NULL
+            Py_INCREF_ptr(left)
+            PyTuple_SET_ITEM_ptr(new_ann, out_pos, left)
+            left = _next_tuple_entry(ann_one, &pos_one)
+        else: # right < left or left == NULL
+            Py_INCREF_ptr(right)
+            PyTuple_SET_ITEM_ptr(new_ann, out_pos, right)
+            right = _next_tuple_entry(ann_two, &pos_two)
+        out_pos = out_pos + 1
+    while left != NULL:
+        Py_INCREF_ptr(left)
+        PyTuple_SET_ITEM_ptr(new_ann, out_pos, left)
+        left = _next_tuple_entry(ann_one, &pos_one)
+        out_pos = out_pos + 1
+    while right != NULL:
+        Py_INCREF_ptr(right)
+        PyTuple_SET_ITEM_ptr(new_ann, out_pos, right)
+        right = _next_tuple_entry(ann_two, &pos_two)
+        out_pos = out_pos + 1
+    if out_pos != PyTuple_GET_SIZE(new_ann):
+        # Timing _PyTuple_Resize was not significantly faster that slicing
+        # PyTuple_Resize((<PyObject **>new_ann), out_pos)
+        new_ann = new_ann[0:out_pos]
+    PyDict_SetItem(cache, cache_key, new_ann)
+    return new_ann
+
+
+cdef int _apply_parent_annotations(annotations, parent_annotations,
+                                   matching_blocks) except -1:
+    """Apply the annotations from parent_annotations into annotations.
+
+    matching_blocks defines the ranges that match.
+    """
+    cdef Py_ssize_t parent_idx, lines_idx, match_len, idx
+    cdef PyListObject *par_list, *ann_list
+    cdef PyObject **par_temp, **ann_temp
+
+    _check_annotations_are_lists(annotations, parent_annotations)
+    par_list = <PyListObject *>parent_annotations
+    ann_list = <PyListObject *>annotations
+    # For NEWS and bzrlib/builtins.py, over 99% of the lines are simply copied
+    # across from the parent entry. So this routine is heavily optimized for
+    # that. Would be interesting if we could use memcpy() but we have to incref
+    # and decref
+    for parent_idx, lines_idx, match_len in matching_blocks:
+        _check_match_ranges(parent_annotations, annotations,
+                            parent_idx, lines_idx, match_len)
+        par_temp = par_list.ob_item + parent_idx
+        ann_temp = ann_list.ob_item + lines_idx
+        for idx from 0 <= idx < match_len:
+            Py_INCREF_ptr(par_temp[idx])
+            Py_DECREF_ptr(ann_temp[idx])
+            ann_temp[idx] = par_temp[idx]
+    return 0
+
+
+cdef int _merge_annotations(this_annotation, annotations, parent_annotations,
+                            matching_blocks, ann_cache) except -1:
+    cdef Py_ssize_t parent_idx, ann_idx, lines_idx, match_len, idx
+    cdef Py_ssize_t pos
+    cdef PyObject *ann_temp, *par_temp
+
+    _check_annotations_are_lists(annotations, parent_annotations)
+    last_ann = None
+    last_parent = None
+    last_res = None
+    for parent_idx, lines_idx, match_len in matching_blocks:
+        _check_match_ranges(parent_annotations, annotations,
+                            parent_idx, lines_idx, match_len)
+        # For lines which match this parent, we will now resolve whether
+        # this parent wins over the current annotation
+        for idx from 0 <= idx < match_len:
+            ann_idx = lines_idx + idx
+            ann_temp = PyList_GET_ITEM(annotations, ann_idx)
+            par_temp = PyList_GET_ITEM(parent_annotations, parent_idx + idx)
+            if (ann_temp == par_temp):
+                # This is parent, do nothing
+                # Pointer comparison is fine here. Value comparison would
+                # be ok, but it will be handled in the final if clause by
+                # merging the two tuples into the same tuple
+                # Avoiding the Py_INCREF and function call to
+                # PyObject_RichCompareBool using pointer comparison drops
+                # timing from 215ms => 125ms
+                continue
+            par_ann = <object>par_temp
+            ann = <object>ann_temp
+            if (ann is this_annotation):
+                # Originally claimed 'this', but it was really in this
+                # parent
+                Py_INCREF(par_ann)
+                PyList_SetItem(annotations, ann_idx, par_ann)
+                continue
+            # Resolve the fact that both sides have a different value for
+            # last modified
+            if (ann is last_ann and par_ann is last_parent):
+                Py_INCREF(last_res)
+                PyList_SetItem(annotations, ann_idx, last_res)
+            else:
+                new_ann = _combine_annotations(ann, par_ann, ann_cache)
+                Py_INCREF(new_ann)
+                PyList_SetItem(annotations, ann_idx, new_ann)
+                last_ann = ann
+                last_parent = par_ann
+                last_res = new_ann
+    return 0
+
+
+class Annotator(_annotator_py.Annotator):
+    """Class that drives performing annotations."""
+
+    def _update_from_first_parent(self, key, annotations, lines, parent_key):
+        """Reannotate this text relative to its first parent."""
+        (parent_annotations,
+         matching_blocks) = self._get_parent_annotations_and_matches(
+                                key, lines, parent_key)
+
+        _apply_parent_annotations(annotations, parent_annotations,
+                                  matching_blocks)
+
+    def _update_from_other_parents(self, key, annotations, lines,
+                                   this_annotation, parent_key):
+        """Reannotate this text relative to a second (or more) parent."""
+        (parent_annotations,
+         matching_blocks) = self._get_parent_annotations_and_matches(
+                                key, lines, parent_key)
+        _merge_annotations(this_annotation, annotations, parent_annotations,
+                           matching_blocks, self._ann_tuple_cache)
+
+    def annotate_flat(self, key):
+        """Determine the single-best-revision to source for each line.
+
+        This is meant as a compatibility thunk to how annotate() used to work.
+        """
+        cdef Py_ssize_t pos, num_lines
+
+        from bzrlib import annotate
+
+        custom_tiebreaker = annotate._break_annotation_tie
+        annotations, lines = self.annotate(key)
+        num_lines = len(lines)
+        out = []
+        heads = self._get_heads_provider().heads
+        for pos from 0 <= pos < num_lines:
+            annotation = annotations[pos]
+            line = lines[pos]
+            if len(annotation) == 1:
+                head = annotation[0]
+            else:
+                the_heads = heads(annotation)
+                if len(the_heads) == 1:
+                    for head in the_heads: break # get the item out of the set
+                else:
+                    # We need to resolve the ambiguity, for now just pick the
+                    # sorted smallest
+                    head = self._resolve_annotation_tie(the_heads, line,
+                                                        custom_tiebreaker)
+            PyList_Append(out, (head, line))
+        return out
=== modified file 'bzrlib/_known_graph_py.py'
--- bzrlib/_known_graph_py.py 2009-06-19 17:40:59 +0000
+++ bzrlib/_known_graph_py.py 2009-07-08 23:35:27 +0000
@@ -63,18 +63,12 @@
         - ghosts will have a parent_keys = None,
         - all nodes found will also have .child_keys populated with all known
           child_keys,
-        - self._tails will list all the nodes without parents.
         """
-        tails = self._tails = set()
         nodes = self._nodes
         for key, parent_keys in parent_map.iteritems():
            if key in nodes:
                node = nodes[key]
                node.parent_keys = parent_keys
-                if parent_keys:
-                    # This node has been added before being seen in parent_map
-                    # (see below)
-                    tails.remove(node)
            else:
                node = _KnownGraphNode(key, parent_keys)
                nodes[key] = node
@@ -84,17 +78,18 @@
             except KeyError:
                 parent_node = _KnownGraphNode(parent_key, None)
                 nodes[parent_key] = parent_node
-                # Potentially a tail, if we're wrong we'll remove it later
-                # (see above)
-                tails.add(parent_node)
             parent_node.child_keys.append(key)
 
+    def _find_tails(self):
+        return [node for node in self._nodes.itervalues()
+                if not node.parent_keys]
+
     def _find_gdfo(self):
         nodes = self._nodes
         known_parent_gdfos = {}
         pending = []
 
-        for node in self._tails:
+        for node in self._find_tails():
             node.gdfo = 1
             pending.append(node)
 
@@ -144,9 +139,6 @@
         # No or only one candidate
         return frozenset(candidate_nodes)
         heads_key = frozenset(candidate_nodes)
-        if heads_key != frozenset(keys):
-            # Mention duplicates
-            note('%s != %s', heads_key, frozenset(keys))
         # Do we have a cached result ?
         try:
             heads = self._known_heads[heads_key]
 
=== modified file 'bzrlib/annotate.py'
--- bzrlib/annotate.py 2009-04-08 13:13:30 +0000
+++ bzrlib/annotate.py 2009-07-08 23:35:27 +0000
@@ -1,4 +1,4 @@
-# Copyright (C) 2004, 2005, 2006, 2007 Canonical Ltd
+# Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009 Canonical Ltd
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -313,7 +313,9 @@
     return matcher.get_matching_blocks()
 
 
-def _break_annotation_tie(annotated_lines):
+_break_annotation_tie = None
+
+def _old_break_annotation_tie(annotated_lines):
     """Chose an attribution between several possible ones.
 
     :param annotated_lines: A list of tuples ((file_id, rev_id), line) where
@@ -394,7 +396,11 @@
             # If the result is not stable, there is a risk a
             # performance degradation as criss-cross merges will
             # flip-flop the attribution.
-            output_append(_break_annotation_tie([left, right]))
+            if _break_annotation_tie is None:
+                output_append(
+                    _old_break_annotation_tie([left, right]))
+            else:
+                output_append(_break_annotation_tie([left, right]))
             last_child_idx = child_idx + match_len
 
 
@@ -444,3 +450,9 @@
     # If left and right agree on a range, just push that into the output
     lines_extend(annotated_lines[left_idx:left_idx + match_len])
     return lines
+
+
+try:
+    from bzrlib._annotator_pyx import Annotator
+except ImportError:
+    from bzrlib._annotator_py import Annotator
=== modified file 'bzrlib/branchbuilder.py'
--- bzrlib/branchbuilder.py 2009-05-07 05:08:46 +0000
+++ bzrlib/branchbuilder.py 2009-07-08 23:35:27 +0000
@@ -161,7 +161,8 @@
         self._tree = None
 
     def build_snapshot(self, revision_id, parent_ids, actions,
-        message=None, timestamp=None, allow_leftmost_as_ghost=False):
+        message=None, timestamp=None, allow_leftmost_as_ghost=False,
+        committer=None):
         """Build a commit, shaped in a specific way.
 
         :param revision_id: The handle for the new commit, can be None
@@ -176,6 +177,7 @@
             commit message will be written.
         :param timestamp: If non-None, set the timestamp of the commit to this
             value.
+        :param committer: An optional username to use for commit
         :param allow_leftmost_as_ghost: True if the leftmost parent should be
             permitted to be a ghost.
         :return: The revision_id of the new commit
@@ -241,7 +243,7 @@
             for file_id, content in new_contents.iteritems():
                 tree.put_file_bytes_non_atomic(file_id, content)
             return self._do_commit(tree, message=message, rev_id=revision_id,
-                timestamp=timestamp)
+                timestamp=timestamp, committer=committer)
         finally:
             tree.unlock()
 
=== modified file 'bzrlib/groupcompress.py'
--- bzrlib/groupcompress.py 2009-07-01 10:47:37 +0000
+++ bzrlib/groupcompress.py 2009-07-08 23:35:27 +0000
@@ -1069,29 +1069,11 @@
 
     def annotate(self, key):
         """See VersionedFiles.annotate."""
-        graph = Graph(self)
-        parent_map = self.get_parent_map([key])
-        if not parent_map:
-            raise errors.RevisionNotPresent(key, self)
-        if parent_map[key] is not None:
-            parent_map = dict((k, v) for k, v in graph.iter_ancestry([key])
-                              if v is not None)
-            keys = parent_map.keys()
-        else:
-            keys = [key]
-            parent_map = {key:()}
-        # We used Graph(self) to load the parent_map, but now that we have it,
-        # we can just query the parent map directly, so create a KnownGraph
-        heads_provider = _mod_graph.KnownGraph(parent_map)
-        parent_cache = {}
-        reannotate = annotate.reannotate
-        for record in self.get_record_stream(keys, 'topological', True):
-            key = record.key
-            lines = osutils.chunks_to_lines(record.get_bytes_as('chunked'))
-            parent_lines = [parent_cache[parent] for parent in parent_map[key]]
-            parent_cache[key] = list(
-                reannotate(parent_lines, lines, key, None, heads_provider))
-        return parent_cache[key]
+        ann = annotate.Annotator(self)
+        return ann.annotate_flat(key)
+
+    def get_annotator(self):
+        return annotate.Annotator(self)
 
     def check(self, progress_bar=None):
         """See VersionedFiles.check()."""
 
=== modified file 'bzrlib/knit.py'
--- bzrlib/knit.py 2009-06-23 15:27:50 +0000
+++ bzrlib/knit.py 2009-07-08 23:35:27 +0000
@@ -664,8 +664,6 @@
664664
665 see parse_fulltext which this inverts.665 see parse_fulltext which this inverts.
666 """666 """
667 # TODO: jam 20070209 We only do the caching thing to make sure that
668 # the origin is a valid utf-8 line, eventually we could remove it
669 return ['%s %s' % (o, t) for o, t in content._lines]667 return ['%s %s' % (o, t) for o, t in content._lines]
670668
671 def lower_line_delta(self, delta):669 def lower_line_delta(self, delta):
@@ -758,7 +756,7 @@
758756
759 def annotate(self, knit, key):757 def annotate(self, knit, key):
760 annotator = _KnitAnnotator(knit)758 annotator = _KnitAnnotator(knit)
761 return annotator.annotate(key)759 return annotator.annotate_flat(key)
762760
763761
764762
@@ -1044,6 +1042,9 @@
1044 """See VersionedFiles.annotate."""1042 """See VersionedFiles.annotate."""
1045 return self._factory.annotate(self, key)1043 return self._factory.annotate(self, key)
10461044
1045 def get_annotator(self):
1046 return _KnitAnnotator(self)
1047
1047 def check(self, progress_bar=None):1048 def check(self, progress_bar=None):
1048 """See VersionedFiles.check()."""1049 """See VersionedFiles.check()."""
1049 # This doesn't actually test extraction of everything, but that will1050 # This doesn't actually test extraction of everything, but that will
@@ -3336,103 +3337,33 @@
3336 recommended.3337 recommended.
3337 """3338 """
3338 annotator = _KnitAnnotator(knit)3339 annotator = _KnitAnnotator(knit)
3339 return iter(annotator.annotate(revision_id))3340 return iter(annotator.annotate_flat(revision_id))
33403341
33413342
3342class _KnitAnnotator(object):3343class _KnitAnnotator(annotate.Annotator):
3343 """Build up the annotations for a text."""3344 """Build up the annotations for a text."""
33443345
3345 def __init__(self, knit):3346 def __init__(self, vf):
3346 self._knit = knit3347 annotate.Annotator.__init__(self, vf)
33473348
3348 # Content objects, differs from fulltexts because of how final newlines3349 # TODO: handle Nodes which cannot be extracted
3349 # are treated by knits. the content objects here will always have a3350 # self._ghosts = set()
3350 # final newline3351
3351 self._fulltext_contents = {}3352 # Map from (key, parent_key) => matching_blocks, should be 'use once'
33523353 self._matching_blocks = {}
3353 # Annotated lines of specific revisions3354
3354 self._annotated_lines = {}3355 # KnitContent objects
33553356 self._content_objects = {}
3356 # Track the raw data for nodes that we could not process yet.3357 # The number of children that depend on this fulltext content object
3357 # This maps the revision_id of the base to a list of children that will3358 self._num_compression_children = {}
3358 # annotated from it.3359 # Delta records that need their compression parent before they can be
3359 self._pending_children = {}3360 # expanded
33603361 self._pending_deltas = {}
3361 # Nodes which cannot be extracted3362 # Fulltext records that are waiting for their parents fulltexts before
3362 self._ghosts = set()3363 # they can be yielded for annotation
33633364 self._pending_annotation = {}
3364 # Track how many children this node has, so we know if we need to keep
3365 # it
3366 self._annotate_children = {}
3367 self._compression_children = {}
33683365
3369 self._all_build_details = {}3366 self._all_build_details = {}
3370 # The children => parent revision_id graph
3371 self._revision_id_graph = {}
3372
3373 self._heads_provider = None
3374
3375 self._nodes_to_keep_annotations = set()
3376 self._generations_until_keep = 100
3377
3378 def set_generations_until_keep(self, value):
3379 """Set the number of generations before caching a node.
3380
3381 Setting this to -1 will cache every merge node, setting this higher
3382 will cache fewer nodes.
3383 """
3384 self._generations_until_keep = value
3385
3386 def _add_fulltext_content(self, revision_id, content_obj):
3387 self._fulltext_contents[revision_id] = content_obj
3388 # TODO: jam 20080305 It might be good to check the sha1digest here
3389 return content_obj.text()
3390
3391 def _check_parents(self, child, nodes_to_annotate):
3392 """Check if all parents have been processed.
3393
3394 :param child: A tuple of (rev_id, parents, raw_content)
3395 :param nodes_to_annotate: If child is ready, add it to
3396 nodes_to_annotate, otherwise put it back in self._pending_children
3397 """
3398 for parent_id in child[1]:
3399 if (parent_id not in self._annotated_lines):
3400 # This parent is present, but another parent is missing
3401 self._pending_children.setdefault(parent_id,
3402 []).append(child)
3403 break
3404 else:
3405 # This one is ready to be processed
3406 nodes_to_annotate.append(child)
3407
3408 def _add_annotation(self, revision_id, fulltext, parent_ids,
3409 left_matching_blocks=None):
3410 """Add an annotation entry.
3411
3412 All parents should already have been annotated.
3413 :return: A list of children that now have their parents satisfied.
3414 """
3415 a = self._annotated_lines
3416 annotated_parent_lines = [a[p] for p in parent_ids]
3417 annotated_lines = list(annotate.reannotate(annotated_parent_lines,
3418 fulltext, revision_id, left_matching_blocks,
3419 heads_provider=self._get_heads_provider()))
3420 self._annotated_lines[revision_id] = annotated_lines
3421 for p in parent_ids:
3422 ann_children = self._annotate_children[p]
3423 ann_children.remove(revision_id)
3424 if (not ann_children
3425 and p not in self._nodes_to_keep_annotations):
3426 del self._annotated_lines[p]
3427 del self._all_build_details[p]
3428 if p in self._fulltext_contents:
3429 del self._fulltext_contents[p]
3430 # Now that we've added this one, see if there are any pending
3431 # deltas to be done, certainly this parent is finished
3432 nodes_to_annotate = []
3433 for child in self._pending_children.pop(revision_id, []):
3434 self._check_parents(child, nodes_to_annotate)
3435 return nodes_to_annotate
34363367
3437 def _get_build_graph(self, key):3368 def _get_build_graph(self, key):
3438 """Get the graphs for building texts and annotations.3369 """Get the graphs for building texts and annotations.
@@ -3446,202 +3377,243 @@
3446 passing to read_records_iter to start reading in the raw data from3377 passing to read_records_iter to start reading in the raw data from
3447 the pack file.3378 the pack file.
3448 """3379 """
3449 if key in self._annotated_lines:
3450 # Nothing to do
3451 return []
3452 pending = set([key])3380 pending = set([key])
3453 records = []3381 records = []
3454 generation = 03382 ann_keys = set()
3455 kept_generation = 03383 self._num_needed_children[key] = 1
3456 while pending:3384 while pending:
3457 # get all pending nodes3385 # get all pending nodes
3458 generation += 1
3459 this_iteration = pending
3460 build_details = self._knit._index.get_build_details(this_iteration)
3387 build_details = self._vf._index.get_build_details(this_iteration)
3461 self._all_build_details.update(build_details)
3462 # new_nodes = self._knit._index._get_entries(this_iteration)
3389 # new_nodes = self._vf._index._get_entries(this_iteration)
3463 pending = set()
3464 for key, details in build_details.iteritems():
3465 (index_memo, compression_parent, parents,
3392 (index_memo, compression_parent, parent_keys,
3466 record_details) = details
3467 self._revision_id_graph[key] = parents
3394 self._parent_map[key] = parent_keys
3395 self._heads_provider = None
3468 records.append((key, index_memo))
3469 # Do we actually need to check _annotated_lines?
3470 pending.update(p for p in parents
3471 if p not in self._all_build_details)
3398 pending.update([p for p in parent_keys
3399 if p not in self._all_build_details])
3400 if parent_keys:
3401 for parent_key in parent_keys:
3402 if parent_key in self._num_needed_children:
3403 self._num_needed_children[parent_key] += 1
3404 else:
3405 self._num_needed_children[parent_key] = 1
3406 if compression_parent:
3407 if compression_parent in self._num_compression_children:
3408 self._num_compression_children[compression_parent] += 1
3409 else:
3410 self._num_compression_children[compression_parent] = 1
3473 self._compression_children.setdefault(compression_parent,
3474 []).append(key)
3475 if parents:
3476 for parent in parents:
3477 self._annotate_children.setdefault(parent,
3478 []).append(key)
3479 num_gens = generation - kept_generation
3480 if ((num_gens >= self._generations_until_keep)
3481 and len(parents) > 1):
3482 kept_generation = generation
3483 self._nodes_to_keep_annotations.add(key)
3484
3485 missing_versions = this_iteration.difference(build_details.keys())
3486 self._ghosts.update(missing_versions)
3487 for missing_version in missing_versions:
3488 # add a key, no parents
3489 self._revision_id_graph[missing_version] = ()
3490 pending.discard(missing_version) # don't look for it
3491 if self._ghosts.intersection(self._compression_children):
3492 raise KnitCorrupt(
3493 "We cannot have nodes which have a ghost compression parent:\n"
3494 "ghosts: %r\n"
3495 "compression children: %r"
3496 % (self._ghosts, self._compression_children))
3497 # Cleanout anything that depends on a ghost so that we don't wait for
3498 # the ghost to show up
3499 for node in self._ghosts:
3500 if node in self._annotate_children:
3501 # We won't be building this node
3502 del self._annotate_children[node]
3413 if missing_versions:
3414 for key in missing_versions:
3415 if key in self._parent_map and key in self._text_cache:
3416 # We already have this text ready, we just need to
3417 # yield it later so we get it annotated
3418 ann_keys.add(key)
3419 parent_keys = self._parent_map[key]
3420 for parent_key in parent_keys:
3421 if parent_key in self._num_needed_children:
3422 self._num_needed_children[parent_key] += 1
3423 else:
3424 self._num_needed_children[parent_key] = 1
3425 pending.update([p for p in parent_keys
3426 if p not in self._all_build_details])
3427 else:
3428 raise errors.RevisionNotPresent(key, self._vf)
3429 # Generally we will want to read the records in reverse order, because
3430 # we find the parent nodes after the children
3431 records.reverse()
3506 return records
3432 return records, ann_keys
3433
3508 def _annotate_records(self, records):
3509 """Build the annotations for the listed records."""
3434 def _get_needed_texts(self, key, pb=None):
3435 # if True or len(self._vf._fallback_vfs) > 0:
3436 if len(self._vf._fallback_vfs) > 0:
3437 # If we have fallbacks, go to the generic path
3438 for v in annotate.Annotator._get_needed_texts(self, key, pb=pb):
3439 yield v
3440 return
3441 while True:
3442 try:
3443 records, ann_keys = self._get_build_graph(key)
3444 for idx, (sub_key, text, num_lines) in enumerate(
3445 self._extract_texts(records)):
3446 if pb is not None:
3447 pb.update('annotating', idx, len(records))
3448 yield sub_key, text, num_lines
3449 for sub_key in ann_keys:
3450 text = self._text_cache[sub_key]
3451 num_lines = len(text) # bad assumption
3452 yield sub_key, text, num_lines
3453 return
3454 except errors.RetryWithNewPacks, e:
3455 self._vf._access.reload_or_raise(e)
3456 # The cached build_details are no longer valid
3457 self._all_build_details.clear()
3458
3459 def _cache_delta_blocks(self, key, compression_parent, delta, lines):
3460 parent_lines = self._text_cache[compression_parent]
3461 blocks = list(KnitContent.get_line_delta_blocks(delta, parent_lines, lines))
3462 self._matching_blocks[(key, compression_parent)] = blocks
3463
3464 def _expand_record(self, key, parent_keys, compression_parent, record,
3465 record_details):
3466 delta = None
3467 if compression_parent:
3468 if compression_parent not in self._content_objects:
3469 # Waiting for the parent
3470 self._pending_deltas.setdefault(compression_parent, []).append(
3471 (key, parent_keys, record, record_details))
3472 return None
3473 # We have the basis parent, so expand the delta
3474 num = self._num_compression_children[compression_parent]
3475 num -= 1
3476 if num == 0:
3477 base_content = self._content_objects.pop(compression_parent)
3478 self._num_compression_children.pop(compression_parent)
3479 else:
3480 self._num_compression_children[compression_parent] = num
3481 base_content = self._content_objects[compression_parent]
3482 # It is tempting to want to copy_base_content=False for the last
3483 # child object. However, whenever noeol=False,
3484 # self._text_cache[parent_key] is content._lines. So mutating it
3485 # gives very bad results.
3486 # The alternative is to copy the lines into text cache, but then we
3487 # are copying anyway, so just do it here.
3488 content, delta = self._vf._factory.parse_record(
3489 key, record, record_details, base_content,
3490 copy_base_content=True)
3491 else:
3492 # Fulltext record
3493 content, _ = self._vf._factory.parse_record(
3494 key, record, record_details, None)
3495 if self._num_compression_children.get(key, 0) > 0:
3496 self._content_objects[key] = content
3497 lines = content.text()
3498 self._text_cache[key] = lines
3499 if delta is not None:
3500 self._cache_delta_blocks(key, compression_parent, delta, lines)
3501 return lines
3502
3503 def _get_parent_annotations_and_matches(self, key, text, parent_key):
3504 """Get the list of annotations for the parent, and the matching lines.
3505
3506 :param text: The opaque value given by _get_needed_texts
3507 :param parent_key: The key for the parent text
3508 :return: (parent_annotations, matching_blocks)
3509 parent_annotations is a list as long as the number of lines in
3510 parent
3511 matching_blocks is a list of (parent_idx, text_idx, len) tuples
3512 indicating which lines match between the two texts
3513 """
3514 block_key = (key, parent_key)
3515 if block_key in self._matching_blocks:
3516 blocks = self._matching_blocks.pop(block_key)
3517 parent_annotations = self._annotations_cache[parent_key]
3518 return parent_annotations, blocks
3519 return annotate.Annotator._get_parent_annotations_and_matches(self,
3520 key, text, parent_key)
3521
3522 def _process_pending(self, key):
3523 """The content for 'key' was just processed.
3524
3525 Determine if there is any more pending work to be processed.
3526 """
3527 to_return = []
3528 if key in self._pending_deltas:
3529 compression_parent = key
3530 children = self._pending_deltas.pop(key)
3531 for child_key, parent_keys, record, record_details in children:
3532 lines = self._expand_record(child_key, parent_keys,
3533 compression_parent,
3534 record, record_details)
3535 if self._check_ready_for_annotations(child_key, parent_keys):
3536 to_return.append(child_key)
3537 # Also check any children that are waiting for this parent to be
3538 # annotation ready
3539 if key in self._pending_annotation:
3540 children = self._pending_annotation.pop(key)
3541 to_return.extend([c for c, p_keys in children
3542 if self._check_ready_for_annotations(c, p_keys)])
3543 return to_return
3544
3545 def _check_ready_for_annotations(self, key, parent_keys):
3546 """return true if this text is ready to be yielded.
3547
3548 Otherwise, this will return False, and queue the text into
3549 self._pending_annotation
3550 """
3551 for parent_key in parent_keys:
3552 if parent_key not in self._annotations_cache:
3553 # still waiting on at least one parent text, so queue it up
3554 # Note that if there are multiple parents, we need to wait
3555 # for all of them.
3556 self._pending_annotation.setdefault(parent_key,
3557 []).append((key, parent_keys))
3558 return False
3559 return True
3560
3561 def _extract_texts(self, records):
3562 """Extract the various texts needed based on records"""
3563 # We iterate in the order read, rather than a strict order requested
3564 # However, process what we can, and put off to the side things that
3565 # still need parents, cleaning them up when those parents are
3566 # processed.
3514 for (rev_id, record,
3515 digest) in self._knit._read_records_iter(records):
3516 if rev_id in self._annotated_lines:
3517 continue
3567 # Basic data flow:
3568 # 1) As 'records' are read, see if we can expand these records into
3569 # Content objects (and thus lines)
3570 # 2) If a given line-delta is waiting on its compression parent, it
3571 # gets queued up into self._pending_deltas, otherwise we expand
3572 # it, and put it into self._text_cache and self._content_objects
3573 # 3) If we expanded the text, we will then check to see if all
3574 # parents have also been processed. If so, this text gets yielded,
3575 # else this record gets set aside into pending_annotation
3576 # 4) Further, if we expanded the text in (2), we will then check to
3577 # see if there are any children in self._pending_deltas waiting to
3578 # also be processed. If so, we go back to (2) for those
3579 # 5) Further again, if we yielded the text, we can then check if that
3580 # 'unlocks' any of the texts in pending_annotations, which should
3581 # then get yielded as well
3582 # Note that both steps 4 and 5 are 'recursive' in that unlocking one
3583 # compression child could unlock yet another, and yielding a fulltext
3584 # will also 'unlock' the children that are waiting on that annotation.
3585 # (Though also, unlocking 1 parent's fulltext, does not unlock a child
3586 # if other parents are also waiting.)
3587 # We want to yield content before expanding child content objects, so
3588 # that we know when we can re-use the content lines, and the annotation
3589 # code can know when it can stop caching fulltexts, as well.
3590
3591 # Children that are missing their compression parent
3592 pending_deltas = {}
3593 for (key, record, digest) in self._vf._read_records_iter(records):
3594 # ghosts?
3595 details = self._all_build_details[key]
3596 (_, compression_parent, parent_keys, record_details) = details
3597 lines = self._expand_record(key, parent_keys, compression_parent,
3598 record, record_details)
3599 if lines is None:
3600 # Pending delta should be queued up
3601 continue
3602 # At this point, we may be able to yield this content, if all
3603 # parents are also finished
3604 yield_this_text = self._check_ready_for_annotations(key,
3605 parent_keys)
3606 if yield_this_text:
3607 # All parents present
3608 yield key, lines, len(lines)
3609 to_process = self._process_pending(key)
3610 while to_process:
3611 this_process = to_process
3612 to_process = []
3613 for key in this_process:
3614 lines = self._text_cache[key]
3615 yield key, lines, len(lines)
3616 to_process.extend(self._process_pending(key))
3518 parent_ids = self._revision_id_graph[rev_id]
3519 parent_ids = [p for p in parent_ids if p not in self._ghosts]
3520 details = self._all_build_details[rev_id]
3521 (index_memo, compression_parent, parents,
3522 record_details) = details
3523 nodes_to_annotate = []
3524 # TODO: Remove the punning between compression parents, and
3525 # parent_ids, we should be able to do this without assuming
3526 # the build order
3527 if len(parent_ids) == 0:
3528 # There are no parents for this node, so just add it
3529 # TODO: This probably needs to be decoupled
3530 fulltext_content, delta = self._knit._factory.parse_record(
3531 rev_id, record, record_details, None)
3532 fulltext = self._add_fulltext_content(rev_id, fulltext_content)
3533 nodes_to_annotate.extend(self._add_annotation(rev_id, fulltext,
3534 parent_ids, left_matching_blocks=None))
3535 else:
3536 child = (rev_id, parent_ids, record)
3537 # Check if all the parents are present
3538 self._check_parents(child, nodes_to_annotate)
3539 while nodes_to_annotate:
3540 # Should we use a queue here instead of a stack?
3541 (rev_id, parent_ids, record) = nodes_to_annotate.pop()
3542 (index_memo, compression_parent, parents,
3543 record_details) = self._all_build_details[rev_id]
3544 blocks = None
3545 if compression_parent is not None:
3546 comp_children = self._compression_children[compression_parent]
3547 if rev_id not in comp_children:
3548 raise AssertionError("%r not in compression children %r"
3549 % (rev_id, comp_children))
3550 # If there is only 1 child, it is safe to reuse this
3551 # content
3552 reuse_content = (len(comp_children) == 1
3553 and compression_parent not in
3554 self._nodes_to_keep_annotations)
3555 if reuse_content:
3556 # Remove it from the cache since it will be changing
3557 parent_fulltext_content = self._fulltext_contents.pop(compression_parent)
3558 # Make sure to copy the fulltext since it might be
3559 # modified
3560 parent_fulltext = list(parent_fulltext_content.text())
3561 else:
3562 parent_fulltext_content = self._fulltext_contents[compression_parent]
3563 parent_fulltext = parent_fulltext_content.text()
3564 comp_children.remove(rev_id)
3565 fulltext_content, delta = self._knit._factory.parse_record(
3566 rev_id, record, record_details,
3567 parent_fulltext_content,
3568 copy_base_content=(not reuse_content))
3569 fulltext = self._add_fulltext_content(rev_id,
3570 fulltext_content)
3571 if compression_parent == parent_ids[0]:
3572 # the compression_parent is the left parent, so we can
3573 # re-use the delta
3574 blocks = KnitContent.get_line_delta_blocks(delta,
3575 parent_fulltext, fulltext)
3576 else:
3577 fulltext_content = self._knit._factory.parse_fulltext(
3578 record, rev_id)
3579 fulltext = self._add_fulltext_content(rev_id,
3580 fulltext_content)
3581 nodes_to_annotate.extend(
3582 self._add_annotation(rev_id, fulltext, parent_ids,
3583 left_matching_blocks=blocks))
3584
3585 def _get_heads_provider(self):
3586 """Create a heads provider for resolving ancestry issues."""
3587 if self._heads_provider is not None:
3588 return self._heads_provider
3589 self._heads_provider = _mod_graph.KnownGraph(self._revision_id_graph)
3590 return self._heads_provider
3591
3592 def annotate(self, key):
3593 """Return the annotated fulltext at the given key.
3594
3595 :param key: The key to annotate.
3596 """
3597 if len(self._knit._fallback_vfs) > 0:
3598 # stacked knits can't use the fast path at present.
3599 return self._simple_annotate(key)
3600 while True:
3601 try:
3602 records = self._get_build_graph(key)
3603 if key in self._ghosts:
3604 raise errors.RevisionNotPresent(key, self._knit)
3605 self._annotate_records(records)
3606 return self._annotated_lines[key]
3607 except errors.RetryWithNewPacks, e:
3608 self._knit._access.reload_or_raise(e)
3609 # The cached build_details are no longer valid
3610 self._all_build_details.clear()
3611
3612 def _simple_annotate(self, key):
3613 """Return annotated fulltext, rediffing from the full texts.
3614
3615 This is slow but makes no assumptions about the repository
3616 being able to produce line deltas.
3617 """
3618 # TODO: this code generates a parent maps of present ancestors; it
3619 # could be split out into a separate method
3620 # -- mbp and robertc 20080704
3621 graph = _mod_graph.Graph(self._knit)
3622 parent_map = dict((k, v) for k, v in graph.iter_ancestry([key])
3623 if v is not None)
3624 if not parent_map:
3625 raise errors.RevisionNotPresent(key, self)
3626 keys = parent_map.keys()
3627 heads_provider = _mod_graph.KnownGraph(parent_map)
3628 parent_cache = {}
3629 reannotate = annotate.reannotate
3630 for record in self._knit.get_record_stream(keys, 'topological', True):
3631 key = record.key
3632 fulltext = osutils.chunks_to_lines(record.get_bytes_as('chunked'))
3633 parents = parent_map[key]
3634 if parents is not None:
3635 parent_lines = [parent_cache[parent] for parent in parent_map[key]]
3636 else:
3637 parent_lines = []
3638 parent_cache[key] = list(
3639 reannotate(parent_lines, fulltext, key, None, heads_provider))
3640 try:
3641 return parent_cache[key]
3642 except KeyError, e:
3643 raise errors.RevisionNotPresent(key, self._knit)
3644
3645
3646try:
3647 from bzrlib._knit_load_data_c import _load_data_c as _load_data
3648
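
The data-flow comments in _extract_texts above boil down to a dependency-driven scheduler: a text is yielded as soon as all of its parents are annotated, and anything that arrives early is parked until the parent it is waiting for completes. Below is a minimal, self-contained sketch of that scheduling idea; all names are illustrative, not bzrlib API.

def schedule_annotations(parent_map, arrival_order):
    """Yield keys so that parents are always yielded before children.

    :param parent_map: {key: (parent_key, ...)}
    :param arrival_order: keys in the order their texts were read
    """
    done = set()
    pending = {}  # parent_key -> [child_key, ...] waiting on that parent

    def check_ready(key):
        # Park `key` under its first unfinished parent, mirroring the
        # behaviour of _check_ready_for_annotations; a key is re-parked
        # each time it is checked and found not ready.
        for p in parent_map[key]:
            if p not in done:
                pending.setdefault(p, []).append(key)
                return False
        return True

    for key in arrival_order:
        if key in done or not check_ready(key):
            continue
        stack = [key]
        while stack:
            cur = stack.pop()
            done.add(cur)
            yield cur
            # Completing `cur` may unlock children parked on it (step 5 in
            # the comments above); unlocking is transitive via the stack.
            stack.extend(c for c in pending.pop(cur, []) if check_ready(c))

parents = {'a': (), 'b': ('a',), 'c': ('a',), 'd': ('b', 'c')}
print(list(schedule_annotations(parents, ['a', 'b', 'd', 'c'])))
# 'd' arrives before 'c', so it is parked and yielded last: a, b, c, d
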
=== modified file 'bzrlib/revisiontree.py'
--- bzrlib/revisiontree.py 2009-06-17 03:41:33 +0000
+++ bzrlib/revisiontree.py 2009-07-08 23:35:27 +0000
@@ -87,7 +87,8 @@
87 default_revision=revision.CURRENT_REVISION):
88 """See Tree.annotate_iter"""
89 text_key = (file_id, self.inventory[file_id].revision)
90 annotations = self._repository.texts.annotate(text_key)
90 annotator = self._repository.texts.get_annotator()
91 annotations = annotator.annotate_flat(text_key)
92 return [(key[-1], line) for key, line in annotations]
93
94 def get_file_size(self, file_id):
95
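
The annotate_flat() call used here is where the new multi-source annotations get collapsed back to the single (revision, line) shape the old API returned: each line may carry several candidate origin keys, and a single winner is picked per line only on the final text. A hedged sketch of that resolution step follows; the heads() callable and the min() tie-break are assumptions standing in for the real heads provider and _break_annotation_tie.

def flatten_annotations(annotations, lines, heads):
    """Collapse ([keys_for_line, ...], [line, ...]) to [(key, line), ...].

    :param heads: callable taking an iterable of keys and returning the
        subset that are not ancestors of another candidate.
    """
    out = []
    for origins, line in zip(annotations, lines):
        if len(origins) == 1:
            # Only one possible source for this line.
            out.append((origins[0], line))
            continue
        the_heads = heads(origins)
        if len(the_heads) == 1:
            out.append((next(iter(the_heads)), line))
        else:
            # No unique dominator; fall back to the lexicographically
            # smallest key, the behaviour the annotate tests below expect.
            out.append((min(the_heads), line))
    return out

anns = [(('f', 'a'),), (('f', 'b'), ('f', 'c'))]
lines = ['simple\n', 'new content\n']
print(flatten_annotations(anns, lines, heads=lambda keys: set(keys)))
# -> [(('f', 'a'), 'simple\n'), (('f', 'b'), 'new content\n')]
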
=== modified file 'bzrlib/tests/__init__.py'
--- bzrlib/tests/__init__.py 2009-07-02 11:37:38 +0000
+++ bzrlib/tests/__init__.py 2009-07-08 23:35:28 +0000
@@ -3344,6 +3344,7 @@
3344 'bzrlib.tests.per_repository',
3345 'bzrlib.tests.per_repository_chk',
3346 'bzrlib.tests.per_repository_reference',
3347 'bzrlib.tests.test__annotator',
3348 'bzrlib.tests.test__chk_map',
3349 'bzrlib.tests.test__dirstate_helpers',
3350 'bzrlib.tests.test__groupcompress',
3351
=== added file 'bzrlib/tests/test__annotator.py'
--- bzrlib/tests/test__annotator.py 1970-01-01 00:00:00 +0000
+++ bzrlib/tests/test__annotator.py 2009-07-08 23:35:28 +0000
@@ -0,0 +1,403 @@
1# Copyright (C) 2009 Canonical Ltd
2#
3# This program is free software; you can redistribute it and/or modify
4# it under the terms of the GNU General Public License as published by
5# the Free Software Foundation; either version 2 of the License, or
6# (at your option) any later version.
7#
8# This program is distributed in the hope that it will be useful,
9# but WITHOUT ANY WARRANTY; without even the implied warranty of
10# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
11# GNU General Public License for more details.
12#
13# You should have received a copy of the GNU General Public License
14# along with this program; if not, write to the Free Software
15# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
16
17"""Tests for Annotators."""
18
19from bzrlib import (
20 annotate,
21 _annotator_py,
22 errors,
23 knit,
24 revision,
25 tests,
26 )
27
28
29def load_tests(standard_tests, module, loader):
30 """Parameterize tests for all versions of groupcompress."""
31 scenarios = [
32 ('python', {'module': _annotator_py}),
33 ]
34 suite = loader.suiteClass()
35 if CompiledAnnotator.available():
36 from bzrlib import _annotator_pyx
37 scenarios.append(('C', {'module': _annotator_pyx}))
38 else:
39 # the compiled module isn't available, so we add a failing test
40 class FailWithoutFeature(tests.TestCase):
41 def test_fail(self):
42 self.requireFeature(CompiledAnnotator)
43 suite.addTest(loader.loadTestsFromTestCase(FailWithoutFeature))
44 result = tests.multiply_tests(standard_tests, scenarios, suite)
45 return result
46
47
48class _CompiledAnnotator(tests.Feature):
49
50 def _probe(self):
51 try:
52 import bzrlib._annotator_pyx
53 except ImportError:
54 return False
55 return True
56
57 def feature_name(self):
58 return 'bzrlib._annotator_pyx'
59
60CompiledAnnotator = _CompiledAnnotator()
61
62
63class TestAnnotator(tests.TestCaseWithMemoryTransport):
64
65 module = None # Set by load_tests
66
67 fa_key = ('f-id', 'a-id')
68 fb_key = ('f-id', 'b-id')
69 fc_key = ('f-id', 'c-id')
70 fd_key = ('f-id', 'd-id')
71 fe_key = ('f-id', 'e-id')
72 ff_key = ('f-id', 'f-id')
73
74 def make_no_graph_texts(self):
75 factory = knit.make_pack_factory(False, False, 2)
76 self.vf = factory(self.get_transport())
77 self.ann = self.module.Annotator(self.vf)
78 self.vf.add_lines(self.fa_key, (), ['simple\n', 'content\n'])
79 self.vf.add_lines(self.fb_key, (), ['simple\n', 'new content\n'])
80
81 def make_simple_text(self):
82 # TODO: all we really need is a VersionedFile instance, we'd like to
83 # avoid creating all the intermediate stuff
84 factory = knit.make_pack_factory(True, True, 2)
85 self.vf = factory(self.get_transport())
86 # This assumes nothing special happens during __init__, which may be
87 # valid
88 self.ann = self.module.Annotator(self.vf)
89 # A 'simple|content|'
90 # |
91 # B 'simple|new content|'
92 self.vf.add_lines(self.fa_key, [], ['simple\n', 'content\n'])
93 self.vf.add_lines(self.fb_key, [self.fa_key],
94 ['simple\n', 'new content\n'])
95
96 def make_merge_text(self):
97 self.make_simple_text()
98 # A 'simple|content|'
99 # |\
100 # B | 'simple|new content|'
101 # | |
102 # | C 'simple|from c|content|'
103 # |/
104 # D 'simple|from c|new content|introduced in merge|'
105 self.vf.add_lines(self.fc_key, [self.fa_key],
106 ['simple\n', 'from c\n', 'content\n'])
107 self.vf.add_lines(self.fd_key, [self.fb_key, self.fc_key],
108 ['simple\n', 'from c\n', 'new content\n',
109 'introduced in merge\n'])
110
111 def make_common_merge_text(self):
112 """Both sides of the merge will have introduced a line."""
113 self.make_simple_text()
114 # A 'simple|content|'
115 # |\
116 # B | 'simple|new content|'
117 # | |
118 # | C 'simple|new content|'
119 # |/
120 # D 'simple|new content|'
121 self.vf.add_lines(self.fc_key, [self.fa_key],
122 ['simple\n', 'new content\n'])
123 self.vf.add_lines(self.fd_key, [self.fb_key, self.fc_key],
124 ['simple\n', 'new content\n'])
125
126 def make_many_way_common_merge_text(self):
127 self.make_simple_text()
128 # A-. 'simple|content|'
129 # |\ \
130 # B | | 'simple|new content|'
131 # | | |
132 # | C | 'simple|new content|'
133 # |/ |
134 # D | 'simple|new content|'
135 # | |
136 # | E 'simple|new content|'
137 # | /
138 # F-' 'simple|new content|'
139 self.vf.add_lines(self.fc_key, [self.fa_key],
140 ['simple\n', 'new content\n'])
141 self.vf.add_lines(self.fd_key, [self.fb_key, self.fc_key],
142 ['simple\n', 'new content\n'])
143 self.vf.add_lines(self.fe_key, [self.fa_key],
144 ['simple\n', 'new content\n'])
145 self.vf.add_lines(self.ff_key, [self.fd_key, self.fe_key],
146 ['simple\n', 'new content\n'])
147
148 def make_merge_and_restored_text(self):
149 self.make_simple_text()
150 # A 'simple|content|'
151 # |\
152 # B | 'simple|new content|'
153 # | |
154 # C | 'simple|content|' # reverted to A
155 # \|
156 # D 'simple|content|'
157 # c reverts back to 'a' for the new content line
158 self.vf.add_lines(self.fc_key, [self.fb_key],
159 ['simple\n', 'content\n'])
160 # d merges 'a' and 'c', to find both claim last modified
161 self.vf.add_lines(self.fd_key, [self.fa_key, self.fc_key],
162 ['simple\n', 'content\n'])
163
164 def assertAnnotateEqual(self, expected_annotation, key, exp_text=None):
165 annotation, lines = self.ann.annotate(key)
166 self.assertEqual(expected_annotation, annotation)
167 if exp_text is None:
168 record = self.vf.get_record_stream([key], 'unordered', True).next()
169 exp_text = record.get_bytes_as('fulltext')
170 self.assertEqualDiff(exp_text, ''.join(lines))
171
172 def test_annotate_missing(self):
173 self.make_simple_text()
174 self.assertRaises(errors.RevisionNotPresent,
175 self.ann.annotate, ('not', 'present'))
176
177 def test_annotate_simple(self):
178 self.make_simple_text()
179 self.assertAnnotateEqual([(self.fa_key,)]*2, self.fa_key)
180 self.assertAnnotateEqual([(self.fa_key,), (self.fb_key,)], self.fb_key)
181
182 def test_annotate_merge_text(self):
183 self.make_merge_text()
184 self.assertAnnotateEqual([(self.fa_key,), (self.fc_key,),
185 (self.fb_key,), (self.fd_key,)],
186 self.fd_key)
187
188 def test_annotate_common_merge_text(self):
189 self.make_common_merge_text()
190 self.assertAnnotateEqual([(self.fa_key,), (self.fb_key, self.fc_key)],
191 self.fd_key)
192
193 def test_annotate_many_way_common_merge_text(self):
194 self.make_many_way_common_merge_text()
195 self.assertAnnotateEqual([(self.fa_key,),
196 (self.fb_key, self.fc_key, self.fe_key)],
197 self.ff_key)
198
199 def test_annotate_merge_and_restored(self):
200 self.make_merge_and_restored_text()
201 self.assertAnnotateEqual([(self.fa_key,), (self.fa_key, self.fc_key)],
202 self.fd_key)
203
204 def test_annotate_flat_simple(self):
205 self.make_simple_text()
206 self.assertEqual([(self.fa_key, 'simple\n'),
207 (self.fa_key, 'content\n'),
208 ], self.ann.annotate_flat(self.fa_key))
209 self.assertEqual([(self.fa_key, 'simple\n'),
210 (self.fb_key, 'new content\n'),
211 ], self.ann.annotate_flat(self.fb_key))
212
213 def test_annotate_flat_merge_and_restored_text(self):
214 self.make_merge_and_restored_text()
215 # fc is a simple dominator of fa
216 self.assertEqual([(self.fa_key, 'simple\n'),
217 (self.fc_key, 'content\n'),
218 ], self.ann.annotate_flat(self.fd_key))
219
220 def test_annotate_flat_common_merge_text(self):
221 self.make_common_merge_text()
222 # there is no common point, so we just pick the lexicographically lowest
223 # and 'b-id' comes before 'c-id'
224 self.assertEqual([(self.fa_key, 'simple\n'),
225 (self.fb_key, 'new content\n'),
226 ], self.ann.annotate_flat(self.fd_key))
227
228 def test_annotate_flat_many_way_common_merge_text(self):
229 self.make_many_way_common_merge_text()
230 self.assertEqual([(self.fa_key, 'simple\n'),
231 (self.fb_key, 'new content\n')],
232 self.ann.annotate_flat(self.ff_key))
233
234 def test_annotate_flat_respects_break_ann_tie(self):
235 tiebreaker = annotate._break_annotation_tie
236 try:
237 calls = []
238 def custom_tiebreaker(annotated_lines):
239 self.assertEqual(2, len(annotated_lines))
240 left = annotated_lines[0]
241 self.assertEqual(2, len(left))
242 self.assertEqual('new content\n', left[1])
243 right = annotated_lines[1]
244 self.assertEqual(2, len(right))
245 self.assertEqual('new content\n', right[1])
246 calls.append((left[0], right[0]))
247 # Our custom tiebreaker takes the *largest* value, rather than
248 # the *smallest* value
249 if left[0] < right[0]:
250 return right
251 else:
252 return left
253 annotate._break_annotation_tie = custom_tiebreaker
254 self.make_many_way_common_merge_text()
255 self.assertEqual([(self.fa_key, 'simple\n'),
256 (self.fe_key, 'new content\n')],
257 self.ann.annotate_flat(self.ff_key))
258 self.assertEqual([(self.fe_key, self.fc_key),
259 (self.fe_key, self.fb_key)], calls)
260 finally:
261 annotate._break_annotation_tie = tiebreaker
262
263
264 def test_needed_keys_simple(self):
265 self.make_simple_text()
266 keys, ann_keys = self.ann._get_needed_keys(self.fb_key)
267 self.assertEqual([self.fa_key, self.fb_key], sorted(keys))
268 self.assertEqual({self.fa_key: 1, self.fb_key: 1},
269 self.ann._num_needed_children)
270 self.assertEqual(set(), ann_keys)
271
272 def test_needed_keys_many(self):
273 self.make_many_way_common_merge_text()
274 keys, ann_keys = self.ann._get_needed_keys(self.ff_key)
275 self.assertEqual([self.fa_key, self.fb_key, self.fc_key,
276 self.fd_key, self.fe_key, self.ff_key,
277 ], sorted(keys))
278 self.assertEqual({self.fa_key: 3,
279 self.fb_key: 1,
280 self.fc_key: 1,
281 self.fd_key: 1,
282 self.fe_key: 1,
283 self.ff_key: 1,
284 }, self.ann._num_needed_children)
285 self.assertEqual(set(), ann_keys)
286
287 def test_needed_keys_with_special_text(self):
288 self.make_many_way_common_merge_text()
289 spec_key = ('f-id', revision.CURRENT_REVISION)
290 spec_text = 'simple\nnew content\nlocally modified\n'
291 self.ann.add_special_text(spec_key, [self.fd_key, self.fe_key],
292 spec_text)
293 keys, ann_keys = self.ann._get_needed_keys(spec_key)
294 self.assertEqual([self.fa_key, self.fb_key, self.fc_key,
295 self.fd_key, self.fe_key,
296 ], sorted(keys))
297 self.assertEqual([spec_key], sorted(ann_keys))
298
299 def test_needed_keys_with_parent_texts(self):
300 self.make_many_way_common_merge_text()
301 # If 'D' and 'E' are already annotated, we don't need to extract all
302 # the texts
303 # D | 'simple|new content|'
304 # | |
305 # | E 'simple|new content|'
306 # | /
307 # F-' 'simple|new content|'
308 self.ann._parent_map[self.fd_key] = (self.fb_key, self.fc_key)
309 self.ann._text_cache[self.fd_key] = ['simple\n', 'new content\n']
310 self.ann._annotations_cache[self.fd_key] = [
311 (self.fa_key,),
312 (self.fb_key, self.fc_key),
313 ]
314 self.ann._parent_map[self.fe_key] = (self.fa_key,)
315 self.ann._text_cache[self.fe_key] = ['simple\n', 'new content\n']
316 self.ann._annotations_cache[self.fe_key] = [
317 (self.fa_key,),
318 (self.fe_key,),
319 ]
320 keys, ann_keys = self.ann._get_needed_keys(self.ff_key)
321 self.assertEqual([self.ff_key], sorted(keys))
322 self.assertEqual({self.fd_key: 1,
323 self.fe_key: 1,
324 self.ff_key: 1,
325 }, self.ann._num_needed_children)
326 self.assertEqual([], sorted(ann_keys))
327
328 def test_record_annotation_removes_texts(self):
329 self.make_many_way_common_merge_text()
330 # Populate the caches
331 for x in self.ann._get_needed_texts(self.ff_key):
332 continue
333 self.assertEqual({self.fa_key: 3,
334 self.fb_key: 1,
335 self.fc_key: 1,
336 self.fd_key: 1,
337 self.fe_key: 1,
338 self.ff_key: 1,
339 }, self.ann._num_needed_children)
340 self.assertEqual([self.fa_key, self.fb_key, self.fc_key,
341 self.fd_key, self.fe_key, self.ff_key,
342 ], sorted(self.ann._text_cache.keys()))
343 self.ann._record_annotation(self.fa_key, [], [])
344 self.ann._record_annotation(self.fb_key, [self.fa_key], [])
345 self.assertEqual({self.fa_key: 2,
346 self.fb_key: 1,
347 self.fc_key: 1,
348 self.fd_key: 1,
349 self.fe_key: 1,
350 self.ff_key: 1,
351 }, self.ann._num_needed_children)
352 self.assertTrue(self.fa_key in self.ann._text_cache)
353 self.assertTrue(self.fa_key in self.ann._annotations_cache)
354 self.ann._record_annotation(self.fc_key, [self.fa_key], [])
355 self.ann._record_annotation(self.fd_key, [self.fb_key, self.fc_key], [])
356 self.assertEqual({self.fa_key: 1,
357 self.fb_key: 0,
358 self.fc_key: 0,
359 self.fd_key: 1,
360 self.fe_key: 1,
361 self.ff_key: 1,
362 }, self.ann._num_needed_children)
363 self.assertTrue(self.fa_key in self.ann._text_cache)
364 self.assertTrue(self.fa_key in self.ann._annotations_cache)
365 self.assertFalse(self.fb_key in self.ann._text_cache)
366 self.assertFalse(self.fb_key in self.ann._annotations_cache)
367 self.assertFalse(self.fc_key in self.ann._text_cache)
368 self.assertFalse(self.fc_key in self.ann._annotations_cache)
369
370 def test_annotate_special_text(self):
371 # Things like WT and PreviewTree want to annotate an arbitrary text
372 # ('current:') so we need a way to add that to the group of files to be
373 # annotated.
374 self.make_many_way_common_merge_text()
375 # A-. 'simple|content|'
376 # |\ \
377 # B | | 'simple|new content|'
378 # | | |
379 # | C | 'simple|new content|'
380 # |/ |
381 # D | 'simple|new content|'
382 # | |
383 # | E 'simple|new content|'
384 # | /
385 # SPEC 'simple|new content|locally modified|'
386 spec_key = ('f-id', revision.CURRENT_REVISION)
387 spec_text = 'simple\nnew content\nlocally modified\n'
388 self.ann.add_special_text(spec_key, [self.fd_key, self.fe_key],
389 spec_text)
390 self.assertAnnotateEqual([(self.fa_key,),
391 (self.fb_key, self.fc_key, self.fe_key),
392 (spec_key,),
393 ], spec_key,
394 exp_text=spec_text)
395
396 def test_no_graph(self):
397 self.make_no_graph_texts()
398 self.assertAnnotateEqual([(self.fa_key,),
399 (self.fa_key,),
400 ], self.fa_key)
401 self.assertAnnotateEqual([(self.fb_key,),
402 (self.fb_key,),
403 ], self.fb_key)
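
The _num_needed_children bookkeeping exercised by test_needed_keys_many and test_record_annotation_removes_texts above is a reference count used to bound memory: every key needed to annotate the tip starts with one reference for its own annotation plus one per child that still has to consume it, and _record_annotation releases a parent's cached text and annotations once its count reaches zero. A standalone sketch of the counting pass (names are illustrative, not the bzrlib implementation):

def count_needed(parent_map, tip):
    """Return {key: refcount} for every ancestor needed to annotate tip."""
    counts = {tip: 1}  # the tip is needed once, for its own annotation
    todo = [tip]
    while todo:
        key = todo.pop()
        for parent in parent_map.get(key, ()):
            if parent in counts:
                counts[parent] += 1
            else:
                counts[parent] = 1
                todo.append(parent)
    return counts

parents = {'a': (), 'b': ('a',), 'c': ('a',), 'd': ('b', 'c'),
           'e': ('a',), 'f': ('d', 'e')}
print(count_needed(parents, 'f')['a'])
# -> 3: three children still need 'a', matching the {fa_key: 3, ...}
# expectations in test_needed_keys_many above.
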
=== modified file 'bzrlib/tests/test__known_graph.py'
--- bzrlib/tests/test__known_graph.py 2009-06-18 19:45:24 +0000
+++ bzrlib/tests/test__known_graph.py 2009-07-08 23:35:28 +0000
@@ -14,7 +14,7 @@
14# along with this program; if not, write to the Free Software
15# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
16
17"""Tests for the python and pyrex extensions of groupcompress"""
17"""Tests for the python and pyrex extensions of KnownGraph"""
18
19from bzrlib import (
20 errors,
@@ -63,6 +63,16 @@
63CompiledKnownGraphFeature = _CompiledKnownGraphFeature()
64
65
66# a
67# |\
68# b |
69# | |
70# c |
71# \|
72# d
73alt_merge = {'a': [], 'b': ['a'], 'c': ['b'], 'd': ['a', 'c']}
74
75
76class TestKnownGraph(tests.TestCase):
77
78 module = None # Set by load_tests
@@ -203,6 +213,10 @@
213 self.assertEqual(set(['w', 'q']), graph.heads(['w', 's', 'q']))
214 self.assertEqual(set(['z']), graph.heads(['s', 'z']))
215
216 def test_heads_alt_merge(self):
217 graph = self.make_known_graph(alt_merge)
218 self.assertEqual(set(['c']), graph.heads(['a', 'c']))
219
220 def test_heads_with_ghost(self):
221 graph = self.make_known_graph(test_graph.with_ghost)
222 self.assertEqual(set(['e', 'g']), graph.heads(['e', 'g']))
223
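
The alt_merge case added above pins down the heads() semantics the annotator relies on: 'a' is an ancestor of 'c', so heads(['a', 'c']) must be {'c'}, and a line whose reverted form came from 'a' is attributed to the dominating 'c'. For reference, a naive heads() over a parent map like alt_merge; this is an O(candidates * graph) illustration, not the KnownGraph code.

def naive_heads(parent_map, candidates):
    """Return the candidates that are not ancestors of another candidate."""
    candidates = set(candidates)
    ancestors = set()
    for key in candidates:
        # Walk every ancestor reachable from this candidate.
        todo = list(parent_map[key])
        while todo:
            p = todo.pop()
            if p not in ancestors:
                ancestors.add(p)
                todo.extend(parent_map.get(p, ()))
    return candidates - ancestors

alt_merge = {'a': [], 'b': ['a'], 'c': ['b'], 'd': ['a', 'c']}
print(sorted(naive_heads(alt_merge, ['a', 'c'])))  # -> ['c']
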
=== modified file 'bzrlib/tests/test_annotate.py'
--- bzrlib/tests/test_annotate.py 2009-03-23 14:59:43 +0000
+++ bzrlib/tests/test_annotate.py 2009-07-08 23:35:28 +0000
@@ -176,38 +176,23 @@
176 |
177 rev-3
178 """
179
180 tree1 = self.make_branch_and_tree('tree1')
181 self.build_tree_contents([('tree1/a', 'first\n')])
182 tree1.add(['a'], ['a-id'])
183 tree1.commit('a', rev_id='rev-1',
184 committer="joe@foo.com",
185 timestamp=1166046000.00, timezone=0)
186
187 tree2 = tree1.bzrdir.sprout('tree2').open_workingtree()
188
189 self.build_tree_contents([('tree1/a', 'first\nsecond\n')])
190 tree1.commit('b', rev_id='rev-2',
191 committer='joe@foo.com',
192 timestamp=1166046001.00, timezone=0)
193
194 self.build_tree_contents([('tree2/a', 'first\nthird\n')])
195 tree2.commit('c', rev_id='rev-1_1_1',
196 committer="barry@foo.com",
197 timestamp=1166046002.00, timezone=0)
198
199 num_conflicts = tree1.merge_from_branch(tree2.branch)
200 self.assertEqual(1, num_conflicts)
201
202 self.build_tree_contents([('tree1/a',
203 'first\nsecond\nthird\n')])
204 tree1.set_conflicts(conflicts.ConflictList())
205 tree1.commit('merge 2', rev_id='rev-3',
206 committer='sal@foo.com',
207 timestamp=1166046003.00, timezone=0)
208 tree1.lock_read()
209 self.addCleanup(tree1.unlock)
210 return tree1, tree2
179 builder = self.make_branch_builder('branch')
180 builder.start_series()
181 self.addCleanup(builder.finish_series)
182 builder.build_snapshot('rev-1', None, [
183 ('add', ('', 'root-id', 'directory', None)),
184 ('add', ('a', 'a-id', 'file', 'first\n')),
185 ], timestamp=1166046000.00, committer="joe@foo.com")
186 builder.build_snapshot('rev-2', ['rev-1'], [
187 ('modify', ('a-id', 'first\nsecond\n')),
188 ], timestamp=1166046001.00, committer="joe@foo.com")
189 builder.build_snapshot('rev-1_1_1', ['rev-1'], [
190 ('modify', ('a-id', 'first\nthird\n')),
191 ], timestamp=1166046002.00, committer="barry@foo.com")
192 builder.build_snapshot('rev-3', ['rev-2', 'rev-1_1_1'], [
193 ('modify', ('a-id', 'first\nsecond\nthird\n')),
194 ], timestamp=1166046003.00, committer="sal@foo.com")
195 return builder
196
197 def create_deeply_merged_trees(self):
198 """Create some trees with a more complex merge history.
@@ -232,69 +217,51 @@
217 |
218 rev-6
219 """
235 tree1, tree2 = self.create_merged_trees()
236 tree1.unlock()
237
238 tree3 = tree2.bzrdir.sprout('tree3').open_workingtree()
239
240 tree2.commit('noop', rev_id='rev-1_1_2')
241 self.assertEqual(0, tree1.merge_from_branch(tree2.branch))
242 tree1.commit('noop merge', rev_id='rev-4')
243
244 self.build_tree_contents([('tree3/a', 'first\nthird\nfourth\n')])
245 tree3.commit('four', rev_id='rev-1_2_1',
246 committer='jerry@foo.com',
247 timestamp=1166046003.00, timezone=0)
248
249 tree4 = tree3.bzrdir.sprout('tree4').open_workingtree()
250
251 tree3.commit('noop', rev_id='rev-1_2_2',
252 committer='jerry@foo.com',
253 timestamp=1166046004.00, timezone=0)
254 self.assertEqual(0, tree1.merge_from_branch(tree3.branch))
255 tree1.commit('merge four', rev_id='rev-5')
256
257 self.build_tree_contents([('tree4/a',
258 'first\nthird\nfourth\nfifth\nsixth\n')])
259 tree4.commit('five and six', rev_id='rev-1_3_1',
260 committer='george@foo.com',
261 timestamp=1166046005.00, timezone=0)
262 self.assertEqual(0, tree1.merge_from_branch(tree4.branch))
263 tree1.commit('merge five and six', rev_id='rev-6')
264 tree1.lock_read()
265 return tree1
220 builder = self.create_merged_trees()
221 builder.build_snapshot('rev-1_1_2', ['rev-1_1_1'], [])
222 builder.build_snapshot('rev-4', ['rev-3', 'rev-1_1_2'], [])
223 builder.build_snapshot('rev-1_2_1', ['rev-1_1_1'], [
224 ('modify', ('a-id', 'first\nthird\nfourth\n')),
225 ], timestamp=1166046003.00, committer="jerry@foo.com")
226 builder.build_snapshot('rev-1_2_2', ['rev-1_2_1'], [],
227 timestamp=1166046004.00, committer="jerry@foo.com")
228 builder.build_snapshot('rev-5', ['rev-4', 'rev-1_2_2'], [
229 ('modify', ('a-id', 'first\nsecond\nthird\nfourth\n')),
230 ], timestamp=1166046004.00, committer="jerry@foo.com")
231 builder.build_snapshot('rev-1_3_1', ['rev-1_2_1'], [
232 ('modify', ('a-id', 'first\nthird\nfourth\nfifth\nsixth\n')),
233 ], timestamp=1166046005.00, committer="george@foo.com")
234 builder.build_snapshot('rev-6', ['rev-5', 'rev-1_3_1'], [
235 ('modify', ('a-id',
236 'first\nsecond\nthird\nfourth\nfifth\nsixth\n')),
237 ])
238 return builder
239
240 def create_duplicate_lines_tree(self):
268 tree1 = self.make_branch_and_tree('tree1')
241 builder = self.make_branch_builder('branch')
242 builder.start_series()
243 self.addCleanup(builder.finish_series)
244 base_text = ''.join(l for r, l in duplicate_base)
245 a_text = ''.join(l for r, l in duplicate_A)
246 b_text = ''.join(l for r, l in duplicate_B)
247 c_text = ''.join(l for r, l in duplicate_C)
248 d_text = ''.join(l for r, l in duplicate_D)
249 e_text = ''.join(l for r, l in duplicate_E)
275 self.build_tree_contents([('tree1/file', base_text)])
276 tree1.add(['file'], ['file-id'])
277 tree1.commit('base', rev_id='rev-base')
278 tree2 = tree1.bzrdir.sprout('tree2').open_workingtree()
279
280 self.build_tree_contents([('tree1/file', a_text),
281 ('tree2/file', b_text)])
282 tree1.commit('A', rev_id='rev-A')
283 tree2.commit('B', rev_id='rev-B')
284
285 tree2.merge_from_branch(tree1.branch)
286 conflicts.resolve(tree2, None) # Resolve the conflicts
287 self.build_tree_contents([('tree2/file', d_text)])
288 tree2.commit('D', rev_id='rev-D')
289
290 self.build_tree_contents([('tree1/file', c_text)])
291 tree1.commit('C', rev_id='rev-C')
292
293 tree1.merge_from_branch(tree2.branch)
294 conflicts.resolve(tree1, None) # Resolve the conflicts
295 self.build_tree_contents([('tree1/file', e_text)])
296 tree1.commit('E', rev_id='rev-E')
297 return tree1
250 builder.build_snapshot('rev-base', None, [
251 ('add', ('', 'root-id', 'directory', None)),
252 ('add', ('file', 'file-id', 'file', base_text)),
253 ])
254 builder.build_snapshot('rev-A', ['rev-base'], [
255 ('modify', ('file-id', a_text))])
256 builder.build_snapshot('rev-B', ['rev-base'], [
257 ('modify', ('file-id', b_text))])
258 builder.build_snapshot('rev-C', ['rev-A'], [
259 ('modify', ('file-id', c_text))])
260 builder.build_snapshot('rev-D', ['rev-B', 'rev-A'], [
261 ('modify', ('file-id', d_text))])
262 builder.build_snapshot('rev-E', ['rev-C', 'rev-D'], [
263 ('modify', ('file-id', e_text))])
264 return builder
265
266 def assertRepoAnnotate(self, expected, repo, file_id, revision_id):
267 """Assert that the revision is properly annotated."""
@@ -307,8 +274,8 @@
274
275 def test_annotate_duplicate_lines(self):
276 # XXX: Should this be a per_repository test?
310 tree1 = self.create_duplicate_lines_tree()
311 repo = tree1.branch.repository
277 builder = self.create_duplicate_lines_tree()
278 repo = builder.get_branch().repository
279 repo.lock_read()
280 self.addCleanup(repo.unlock)
281 self.assertRepoAnnotate(duplicate_base, repo, 'file-id', 'rev-base')
@@ -319,10 +286,10 @@
286 self.assertRepoAnnotate(duplicate_E, repo, 'file-id', 'rev-E')
287
288 def test_annotate_shows_dotted_revnos(self):
322 tree1, tree2 = self.create_merged_trees()
289 builder = self.create_merged_trees()
290
291 sio = StringIO()
325 annotate.annotate_file(tree1.branch, 'rev-3', 'a-id',
292 annotate.annotate_file(builder.get_branch(), 'rev-3', 'a-id',
293 to_file=sio)
294 self.assertEqualDiff('1 joe@foo | first\n'
295 '2 joe@foo | second\n'
@@ -331,10 +298,10 @@
298
299 def test_annotate_limits_dotted_revnos(self):
300 """Annotate should limit dotted revnos to a depth of 12"""
334 tree1 = self.create_deeply_merged_trees()
301 builder = self.create_deeply_merged_trees()
302
303 sio = StringIO()
337 annotate.annotate_file(tree1.branch, 'rev-6', 'a-id',
304 annotate.annotate_file(builder.get_branch(), 'rev-6', 'a-id',
305 to_file=sio, verbose=False, full=False)
306 self.assertEqualDiff('1 joe@foo | first\n'
307 '2 joe@foo | second\n'
@@ -345,7 +312,7 @@
312 sio.getvalue())
313
314 sio = StringIO()
348 annotate.annotate_file(tree1.branch, 'rev-6', 'a-id',
315 annotate.annotate_file(builder.get_branch(), 'rev-6', 'a-id',
316 to_file=sio, verbose=False, full=True)
317 self.assertEqualDiff('1 joe@foo | first\n'
318 '2 joe@foo | second\n'
@@ -357,7 +324,7 @@
324
325 # verbose=True shows everything, the full revno, user id, and date
326 sio = StringIO()
360 annotate.annotate_file(tree1.branch, 'rev-6', 'a-id',
327 annotate.annotate_file(builder.get_branch(), 'rev-6', 'a-id',
328 to_file=sio, verbose=True, full=False)
329 self.assertEqualDiff('1 joe@foo.com 20061213 | first\n'
330 '2 joe@foo.com 20061213 | second\n'
@@ -368,7 +335,7 @@
335 sio.getvalue())
336
337 sio = StringIO()
371 annotate.annotate_file(tree1.branch, 'rev-6', 'a-id',
338 annotate.annotate_file(builder.get_branch(), 'rev-6', 'a-id',
339 to_file=sio, verbose=True, full=True)
340 self.assertEqualDiff('1 joe@foo.com 20061213 | first\n'
341 '2 joe@foo.com 20061213 | second\n'
@@ -384,10 +351,10 @@
351 When annotating a non-mainline revision, the annotation should still
352 use dotted revnos from the mainline.
353 """
387 tree1 = self.create_deeply_merged_trees()
354 builder = self.create_deeply_merged_trees()
355
356 sio = StringIO()
390 annotate.annotate_file(tree1.branch, 'rev-1_3_1', 'a-id',
357 annotate.annotate_file(builder.get_branch(), 'rev-1_3_1', 'a-id',
358 to_file=sio, verbose=False, full=False)
359 self.assertEqualDiff('1 joe@foo | first\n'
360 '1.1.1 barry@f | third\n'
@@ -397,10 +364,10 @@
364 sio.getvalue())
365
366 def test_annotate_show_ids(self):
400 tree1 = self.create_deeply_merged_trees()
367 builder = self.create_deeply_merged_trees()
368
369 sio = StringIO()
403 annotate.annotate_file(tree1.branch, 'rev-6', 'a-id',
370 annotate.annotate_file(builder.get_branch(), 'rev-6', 'a-id',
371 to_file=sio, show_ids=True, full=False)
372
373 # It looks better with real revision ids :)
@@ -413,7 +380,7 @@
380 sio.getvalue())
381
382 sio = StringIO()
416 annotate.annotate_file(tree1.branch, 'rev-6', 'a-id',
383 annotate.annotate_file(builder.get_branch(), 'rev-6', 'a-id',
384 to_file=sio, show_ids=True, full=True)
385
386 self.assertEqualDiff(' rev-1 | first\n'
387
=== modified file 'bzrlib/tests/test_knit.py'
--- bzrlib/tests/test_knit.py 2009-06-16 13:57:14 +0000
+++ bzrlib/tests/test_knit.py 2009-07-08 23:35:28 +0000
@@ -366,16 +366,25 @@
366 :return: (versioned_file, reload_counter)
367 versioned_file a KnitVersionedFiles using the packs for access
368 """
369 tree = self.make_branch_and_memory_tree('tree')
370 tree.lock_write()
371 self.addCleanup(tree.unlock)
372 tree.add([''], ['root-id'])
373 tree.commit('one', rev_id='rev-1')
374 tree.commit('two', rev_id='rev-2')
375 tree.commit('three', rev_id='rev-3')
369 builder = self.make_branch_builder('.')
370 builder.start_series()
371 builder.build_snapshot('rev-1', None, [
372 ('add', ('', 'root-id', 'directory', None)),
373 ('add', ('file', 'file-id', 'file', 'content\nrev 1\n')),
374 ])
375 builder.build_snapshot('rev-2', ['rev-1'], [
376 ('modify', ('file-id', 'content\nrev 2\n')),
377 ])
378 builder.build_snapshot('rev-3', ['rev-2'], [
379 ('modify', ('file-id', 'content\nrev 3\n')),
380 ])
381 builder.finish_series()
382 b = builder.get_branch()
383 b.lock_write()
384 self.addCleanup(b.unlock)
385 # Pack these three revisions into another pack file, but don't remove
386 # the originals
378 repo = tree.branch.repository
387 repo = b.repository
388 collection = repo._pack_collection
389 collection.ensure_loaded()
390 orig_packs = collection.packs
@@ -384,7 +393,7 @@
393 # forget about the new pack
394 collection.reset()
395 repo.refresh_data()
387 vf = tree.branch.repository.revisions
396 vf = repo.revisions
397 # Set up a reload() function that switches to using the new pack file
398 new_index = new_pack.revision_index
399 access_tuple = new_pack.access_tuple()
@@ -1313,6 +1322,168 @@
1322 return _KndxIndex(transport, mapper, lambda:None, allow_writes, lambda:True)
1323
1324
1325class Test_KnitAnnotator(TestCaseWithMemoryTransport):
1326
1327 def make_annotator(self):
1328 factory = knit.make_pack_factory(True, True, 1)
1329 vf = factory(self.get_transport())
1330 return knit._KnitAnnotator(vf)
1331
1332 def test__expand_fulltext(self):
1333 ann = self.make_annotator()
1334 rev_key = ('rev-id',)
1335 ann._num_compression_children[rev_key] = 1
1336 res = ann._expand_record(rev_key, (('parent-id',),), None,
1337 ['line1\n', 'line2\n'], ('fulltext', True))
1338 # The content object and text lines should be cached appropriately
1339 self.assertEqual(['line1\n', 'line2'], res)
1340 content_obj = ann._content_objects[rev_key]
1341 self.assertEqual(['line1\n', 'line2\n'], content_obj._lines)
1342 self.assertEqual(res, content_obj.text())
1343 self.assertEqual(res, ann._text_cache[rev_key])
1344
1345 def test__expand_delta_comp_parent_not_available(self):
1346 # Parent isn't available yet, so we return nothing, but queue up this
1347 # node for later processing
1348 ann = self.make_annotator()
1349 rev_key = ('rev-id',)
1350 parent_key = ('parent-id',)
1351 record = ['0,1,1\n', 'new-line\n']
1352 details = ('line-delta', False)
1353 res = ann._expand_record(rev_key, (parent_key,), parent_key,
1354 record, details)
1355 self.assertEqual(None, res)
1356 self.assertTrue(parent_key in ann._pending_deltas)
1357 pending = ann._pending_deltas[parent_key]
1358 self.assertEqual(1, len(pending))
1359 self.assertEqual((rev_key, (parent_key,), record, details), pending[0])
1360
1361 def test__expand_record_tracks_num_children(self):
1362 ann = self.make_annotator()
1363 rev_key = ('rev-id',)
1364 rev2_key = ('rev2-id',)
1365 parent_key = ('parent-id',)
1366 record = ['0,1,1\n', 'new-line\n']
1367 details = ('line-delta', False)
1368 ann._num_compression_children[parent_key] = 2
1369 ann._expand_record(parent_key, (), None, ['line1\n', 'line2\n'],
1370 ('fulltext', False))
1371 res = ann._expand_record(rev_key, (parent_key,), parent_key,
1372 record, details)
1373 self.assertEqual({parent_key: 1}, ann._num_compression_children)
1374 # Expanding the second child should remove the content object, and the
1375 # num_compression_children entry
1376 res = ann._expand_record(rev2_key, (parent_key,), parent_key,
1377 record, details)
1378 self.assertFalse(parent_key in ann._content_objects)
1379 self.assertEqual({}, ann._num_compression_children)
1380 # We should not cache the content_objects for rev2 and rev, because
1381 # they do not have compression children of their own.
1382 self.assertEqual({}, ann._content_objects)
1383
1384 def test__expand_delta_records_blocks(self):
1385 ann = self.make_annotator()
1386 rev_key = ('rev-id',)
1387 parent_key = ('parent-id',)
1388 record = ['0,1,1\n', 'new-line\n']
1389 details = ('line-delta', True)
1390 ann._num_compression_children[parent_key] = 2
1391 ann._expand_record(parent_key, (), None,
1392 ['line1\n', 'line2\n', 'line3\n'],
1393 ('fulltext', False))
1394 ann._expand_record(rev_key, (parent_key,), parent_key, record, details)
1395 self.assertEqual({(rev_key, parent_key): [(1, 1, 1), (3, 3, 0)]},
1396 ann._matching_blocks)
1397 rev2_key = ('rev2-id',)
1398 record = ['0,1,1\n', 'new-line\n']
1399 details = ('line-delta', False)
1400 ann._expand_record(rev2_key, (parent_key,), parent_key, record, details)
1401 self.assertEqual([(1, 1, 2), (3, 3, 0)],
1402 ann._matching_blocks[(rev2_key, parent_key)])
1403
1404 def test__get_parent_ann_uses_matching_blocks(self):
1405 ann = self.make_annotator()
1406 rev_key = ('rev-id',)
1407 parent_key = ('parent-id',)
1408 parent_ann = [(parent_key,)]*3
1409 block_key = (rev_key, parent_key)
1410 ann._annotations_cache[parent_key] = parent_ann
1411 ann._matching_blocks[block_key] = [(0, 1, 1), (3, 3, 0)]
1412 # We should not try to access any parent_lines content, because we know
1413 # we already have the matching blocks
1414 par_ann, blocks = ann._get_parent_annotations_and_matches(rev_key,
1415 ['1\n', '2\n', '3\n'], parent_key)
1416 self.assertEqual(parent_ann, par_ann)
1417 self.assertEqual([(0, 1, 1), (3, 3, 0)], blocks)
1418 self.assertEqual({}, ann._matching_blocks)
1419
1420 def test__process_pending(self):
1421 ann = self.make_annotator()
1422 rev_key = ('rev-id',)
1423 p1_key = ('p1-id',)
1424 p2_key = ('p2-id',)
1425 record = ['0,1,1\n', 'new-line\n']
1426 details = ('line-delta', False)
1427 p1_record = ['line1\n', 'line2\n']
1428 ann._num_compression_children[p1_key] = 1
1429 res = ann._expand_record(rev_key, (p1_key,p2_key), p1_key,
1430 record, details)
1431 self.assertEqual(None, res)
1432 # self.assertTrue(p1_key in ann._pending_deltas)
1433 self.assertEqual({}, ann._pending_annotation)
1434 # Now insert p1, and we should be able to expand the delta
1435 res = ann._expand_record(p1_key, (), None, p1_record,
1436 ('fulltext', False))
1437 self.assertEqual(p1_record, res)
1438 ann._annotations_cache[p1_key] = [(p1_key,)]*2
1439 res = ann._process_pending(p1_key)
1440 self.assertEqual([], res)
1441 self.assertFalse(p1_key in ann._pending_deltas)
1442 self.assertTrue(p2_key in ann._pending_annotation)
1443 self.assertEqual({p2_key: [(rev_key, (p1_key, p2_key))]},
1444 ann._pending_annotation)
1445 # Now fill in parent 2, and pending annotation should be satisfied
1446 res = ann._expand_record(p2_key, (), None, [], ('fulltext', False))
1447 ann._annotations_cache[p2_key] = []
1448 res = ann._process_pending(p2_key)
1449 self.assertEqual([rev_key], res)
1450 self.assertEqual({}, ann._pending_annotation)
1451 self.assertEqual({}, ann._pending_deltas)
1452
1453 def test_record_delta_removes_basis(self):
1454 ann = self.make_annotator()
1455 ann._expand_record(('parent-id',), (), None,
1456 ['line1\n', 'line2\n'], ('fulltext', False))
1457 ann._num_compression_children['parent-id'] = 2
1458
1459 def test_annotate_special_text(self):
1460 ann = self.make_annotator()
1461 vf = ann._vf
1462 rev1_key = ('rev-1',)
1463 rev2_key = ('rev-2',)
1464 rev3_key = ('rev-3',)
1465 spec_key = ('special:',)
1466 vf.add_lines(rev1_key, [], ['initial content\n'])
1467 vf.add_lines(rev2_key, [rev1_key], ['initial content\n',
1468 'common content\n',
1469 'content in 2\n'])
1470 vf.add_lines(rev3_key, [rev1_key], ['initial content\n',
1471 'common content\n',
1472 'content in 3\n'])
1473 spec_text = ('initial content\n'
1474 'common content\n'
1475 'content in 2\n'
1476 'content in 3\n')
1477 ann.add_special_text(spec_key, [rev2_key, rev3_key], spec_text)
1478 anns, lines = ann.annotate(spec_key)
1479 self.assertEqual([(rev1_key,),
1480 (rev2_key, rev3_key),
1481 (rev2_key,),
1482 (rev3_key,),
1483 ], anns)
1484 self.assertEqualDiff(spec_text, ''.join(lines))
1485
1486
1487class KnitTests(TestCaseWithTransport):
1488 """Class containing knit test helper routines."""
1489
1490
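
The _matching_blocks caching tested above hinges on converting a knit line-delta, a list of (start, end, count, lines) hunks that replace parent lines [start:end] with `count` new lines, into SequenceMatcher-style (parent_idx, child_idx, length) blocks, so annotation never re-diffs a child against its compression parent. A sketch under that assumed hunk layout; the real code uses KnitContent.get_line_delta_blocks, and this is only an illustrative reimplementation.

def delta_to_matching_blocks(delta, parent_len, child_len):
    blocks = []
    p_pos = c_pos = 0
    for start, end, count, _lines in delta:
        if start > p_pos:
            # Lines between the previous hunk and this one are unchanged.
            blocks.append((p_pos, c_pos, start - p_pos))
        c_pos += (start - p_pos) + count
        p_pos = end
    if parent_len > p_pos:
        # Unchanged tail after the last hunk.
        blocks.append((p_pos, c_pos, parent_len - p_pos))
    blocks.append((parent_len, child_len, 0))  # terminating empty block
    return blocks

# Replace parent line 0 with one new line in a three-line text:
print(delta_to_matching_blocks([(0, 1, 1, ['new-line\n'])], 3, 3))
# -> [(1, 1, 2), (3, 3, 0)], as in test__expand_delta_records_blocks above
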
=== modified file 'bzrlib/tests/test_versionedfile.py'
--- bzrlib/tests/test_versionedfile.py 2009-06-22 15:37:06 +0000
+++ bzrlib/tests/test_versionedfile.py 2009-07-08 23:35:28 +0000
@@ -1557,6 +1557,42 @@
1557 self.assertRaises(RevisionNotPresent,
1558 files.annotate, prefix + ('missing-key',))
1559
1560 def test_get_annotator(self):
1561 files = self.get_versionedfiles()
1562 self.get_diamond_files(files)
1563 origin_key = self.get_simple_key('origin')
1564 base_key = self.get_simple_key('base')
1565 left_key = self.get_simple_key('left')
1566 right_key = self.get_simple_key('right')
1567 merged_key = self.get_simple_key('merged')
1568 # annotator = files.get_annotator()
1569 # introduced full text
1570 origins, lines = files.get_annotator().annotate(origin_key)
1571 self.assertEqual([(origin_key,)], origins)
1572 self.assertEqual(['origin\n'], lines)
1573 # a delta
1574 origins, lines = files.get_annotator().annotate(base_key)
1575 self.assertEqual([(base_key,)], origins)
1576 # a merge
1577 origins, lines = files.get_annotator().annotate(merged_key)
1578 if self.graph:
1579 self.assertEqual([
1580 (base_key,),
1581 (left_key,),
1582 (right_key,),
1583 (merged_key,),
1584 ], origins)
1585 else:
1586 # Without a graph everything is new.
1587 self.assertEqual([
1588 (merged_key,),
1589 (merged_key,),
1590 (merged_key,),
1591 (merged_key,),
1592 ], origins)
1593 self.assertRaises(RevisionNotPresent,
1594 files.get_annotator().annotate, self.get_simple_key('missing-key'))
1595
1560 def test_construct(self):1596 def test_construct(self):
1561 """Each parameterised test can be constructed on a transport."""1597 """Each parameterised test can be constructed on a transport."""
1562 files = self.get_versionedfiles()1598 files = self.get_versionedfiles()
15631599
=== modified file 'bzrlib/tests/workingtree_implementations/__init__.py'
--- bzrlib/tests/workingtree_implementations/__init__.py 2009-06-10 03:56:49 +0000
+++ bzrlib/tests/workingtree_implementations/__init__.py 2009-07-08 23:35:28 +0000
@@ -63,6 +63,7 @@
     test_workingtree_implementations = [
         'bzrlib.tests.workingtree_implementations.test_add_reference',
         'bzrlib.tests.workingtree_implementations.test_add',
+        'bzrlib.tests.workingtree_implementations.test_annotate_iter',
         'bzrlib.tests.workingtree_implementations.test_basis_inventory',
         'bzrlib.tests.workingtree_implementations.test_basis_tree',
         'bzrlib.tests.workingtree_implementations.test_break_lock',

=== added file 'bzrlib/tests/workingtree_implementations/test_annotate_iter.py'
--- bzrlib/tests/workingtree_implementations/test_annotate_iter.py 1970-01-01 00:00:00 +0000
+++ bzrlib/tests/workingtree_implementations/test_annotate_iter.py 2009-07-08 23:35:28 +0000
@@ -0,0 +1,189 @@
+# Copyright (C) 2009 Canonical Ltd
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+
+"""Tests for interface conformance of 'WorkingTree.annotate_iter'"""
+
+import os
+
+from bzrlib import (
+    errors,
+    inventory,
+    osutils,
+    tests,
+    )
+from bzrlib.tests.workingtree_implementations import TestCaseWithWorkingTree
+
+
+class TestAnnotateIter(TestCaseWithWorkingTree):
+
+    def make_single_rev_tree(self):
+        builder = self.make_branch_builder('branch')
+        builder.build_snapshot('rev-1', None, [
+            ('add', ('', 'TREE_ROOT', 'directory', None)),
+            ('add', ('file', 'file-id', 'file', 'initial content\n')),
+            ])
+        b = builder.get_branch()
+        tree = b.create_checkout('tree', lightweight=True)
+        tree.lock_read()
+        self.addCleanup(tree.unlock)
+        return tree
+
+    def test_annotate_same_as_parent(self):
+        tree = self.make_single_rev_tree()
+        annotations = tree.annotate_iter('file-id')
+        self.assertEqual([('rev-1', 'initial content\n')],
+                         annotations)
+
+    def test_annotate_mod_from_parent(self):
+        tree = self.make_single_rev_tree()
+        self.build_tree_contents([('tree/file',
+                                   'initial content\nnew content\n')])
+        annotations = tree.annotate_iter('file-id')
+        self.assertEqual([('rev-1', 'initial content\n'),
+                          ('current:', 'new content\n'),
+                         ], annotations)
+
+    def test_annotate_merge_parents(self):
+        builder = self.make_branch_builder('branch')
+        builder.start_series()
+        builder.build_snapshot('rev-1', None, [
+            ('add', ('', 'TREE_ROOT', 'directory', None)),
+            ('add', ('file', 'file-id', 'file', 'initial content\n')),
+            ])
+        builder.build_snapshot('rev-2', ['rev-1'], [
+            ('modify', ('file-id', 'initial content\ncontent in 2\n')),
+            ])
+        builder.build_snapshot('rev-3', ['rev-1'], [
+            ('modify', ('file-id', 'initial content\ncontent in 3\n')),
+            ])
+        builder.finish_series()
+        b = builder.get_branch()
+        tree = b.create_checkout('tree', revision_id='rev-2', lightweight=True)
+        tree.lock_write()
+        self.addCleanup(tree.unlock)
+        tree.set_parent_ids(['rev-2', 'rev-3'])
+        self.build_tree_contents([('tree/file',
+                                   'initial content\ncontent in 2\n'
+                                   'content in 3\nnew content\n')])
+        annotations = tree.annotate_iter('file-id')
+        self.assertEqual([('rev-1', 'initial content\n'),
+                          ('rev-2', 'content in 2\n'),
+                          ('rev-3', 'content in 3\n'),
+                          ('current:', 'new content\n'),
+                         ], annotations)
+
+    def test_annotate_merge_parent_no_file(self):
+        builder = self.make_branch_builder('branch')
+        builder.start_series()
+        builder.build_snapshot('rev-1', None, [
+            ('add', ('', 'TREE_ROOT', 'directory', None)),
+            ])
+        builder.build_snapshot('rev-2', ['rev-1'], [
+            ('add', ('file', 'file-id', 'file', 'initial content\n')),
+            ])
+        builder.build_snapshot('rev-3', ['rev-1'], [])
+        builder.finish_series()
+        b = builder.get_branch()
+        tree = b.create_checkout('tree', revision_id='rev-2', lightweight=True)
+        tree.lock_write()
+        self.addCleanup(tree.unlock)
+        tree.set_parent_ids(['rev-2', 'rev-3'])
+        self.build_tree_contents([('tree/file',
+                                   'initial content\nnew content\n')])
+        annotations = tree.annotate_iter('file-id')
+        self.assertEqual([('rev-2', 'initial content\n'),
+                          ('current:', 'new content\n'),
+                         ], annotations)
+
+    def test_annotate_merge_parent_was_directory(self):
+        builder = self.make_branch_builder('branch')
+        builder.start_series()
+        builder.build_snapshot('rev-1', None, [
+            ('add', ('', 'TREE_ROOT', 'directory', None)),
+            ])
+        builder.build_snapshot('rev-2', ['rev-1'], [
+            ('add', ('file', 'file-id', 'file', 'initial content\n')),
+            ])
+        builder.build_snapshot('rev-3', ['rev-1'], [
+            ('add', ('a_dir', 'file-id', 'directory', None)),
+            ])
+        builder.finish_series()
+        b = builder.get_branch()
+        tree = b.create_checkout('tree', revision_id='rev-2', lightweight=True)
+        tree.lock_write()
+        self.addCleanup(tree.unlock)
+        tree.set_parent_ids(['rev-2', 'rev-3'])
+        self.build_tree_contents([('tree/file',
+                                   'initial content\nnew content\n')])
+        annotations = tree.annotate_iter('file-id')
+        self.assertEqual([('rev-2', 'initial content\n'),
+                          ('current:', 'new content\n'),
+                         ], annotations)
+
+    def test_annotate_same_as_merge_parent(self):
+        builder = self.make_branch_builder('branch')
+        builder.start_series()
+        builder.build_snapshot('rev-1', None, [
+            ('add', ('', 'TREE_ROOT', 'directory', None)),
+            ('add', ('file', 'file-id', 'file', 'initial content\n')),
+            ])
+        builder.build_snapshot('rev-2', ['rev-1'], [
+            ])
+        builder.build_snapshot('rev-3', ['rev-1'], [
+            ('modify', ('file-id', 'initial content\ncontent in 3\n')),
+            ])
+        builder.finish_series()
+        b = builder.get_branch()
+        tree = b.create_checkout('tree', revision_id='rev-2', lightweight=True)
+        tree.lock_write()
+        self.addCleanup(tree.unlock)
+        tree.set_parent_ids(['rev-2', 'rev-3'])
+        self.build_tree_contents([('tree/file',
+                                   'initial content\ncontent in 3\n')])
+        annotations = tree.annotate_iter('file-id')
+        self.assertEqual([('rev-1', 'initial content\n'),
+                          ('rev-3', 'content in 3\n'),
+                         ], annotations)
+
+    def test_annotate_same_as_merge_parent_supersedes(self):
+        builder = self.make_branch_builder('branch')
+        builder.start_series()
+        builder.build_snapshot('rev-1', None, [
+            ('add', ('', 'TREE_ROOT', 'directory', None)),
+            ('add', ('file', 'file-id', 'file', 'initial content\n')),
+            ])
+        builder.build_snapshot('rev-2', ['rev-1'], [
+            ('modify', ('file-id', 'initial content\nnew content\n')),
+            ])
+        builder.build_snapshot('rev-3', ['rev-2'], [
+            ('modify', ('file-id', 'initial content\ncontent in 3\n')),
+            ])
+        builder.build_snapshot('rev-4', ['rev-3'], [
+            ('modify', ('file-id', 'initial content\nnew content\n')),
+            ])
+        # In this case, the content locally is the same as content in basis
+        # tree, but the merge revision states that *it* should win
+        builder.finish_series()
+        b = builder.get_branch()
+        tree = b.create_checkout('tree', revision_id='rev-2', lightweight=True)
+        tree.lock_write()
+        self.addCleanup(tree.unlock)
+        tree.set_parent_ids(['rev-2', 'rev-4'])
+        annotations = tree.annotate_iter('file-id')
+        self.assertEqual([('rev-1', 'initial content\n'),
+                          ('rev-4', 'new content\n'),
+                         ], annotations)
+
=== modified file 'bzrlib/transform.py'
--- bzrlib/transform.py 2009-06-30 04:08:12 +0000
+++ bzrlib/transform.py 2009-07-08 23:35:27 +0000
@@ -1,4 +1,4 @@
-# Copyright (C) 2006, 2007, 2008 Canonical Ltd
+# Copyright (C) 2006, 2007, 2008, 2009 Canonical Ltd
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -1962,6 +1962,13 @@
             return old_annotation
         if not changed_content:
             return old_annotation
+        # TODO: This is doing something similar to what WT.annotate_iter is
+        #       doing, however it fails slightly because it doesn't know what
+        #       the *other* revision_id is, so it doesn't know how to give the
+        #       other as the origin for some lines, they all get
+        #       'default_revision'
+        #       It would be nice to be able to use the new Annotator based
+        #       approach, as well.
         return annotate.reannotate([old_annotation],
             self.get_file(file_id).readlines(),
             default_revision)

=== modified file 'bzrlib/versionedfile.py'
--- bzrlib/versionedfile.py 2009-06-22 15:47:25 +0000
+++ bzrlib/versionedfile.py 2009-07-08 23:35:28 +0000
@@ -30,6 +30,7 @@
 import urllib

 from bzrlib import (
+    annotate,
     errors,
     groupcompress,
     index,
@@ -1122,6 +1123,9 @@
             result.append((prefix + (origin,), line))
         return result

+    def get_annotator(self):
+        return annotate.Annotator(self)
+
     def check(self, progress_bar=None):
         """See VersionedFiles.check()."""
         for prefix, vf in self._iter_all_components():

=== modified file 'bzrlib/workingtree.py'
--- bzrlib/workingtree.py 2009-07-01 10:40:07 +0000
+++ bzrlib/workingtree.py 2009-07-08 23:35:28 +0000
@@ -1,4 +1,4 @@
-# Copyright (C) 2005, 2006, 2007, 2008 Canonical Ltd
+# Copyright (C) 2005, 2006, 2007, 2008, 2009 Canonical Ltd
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -58,6 +58,7 @@
     errors,
     generate_ids,
     globbing,
+    graph as _mod_graph,
     hashcache,
     ignores,
     inventory,
@@ -477,31 +478,42 @@
         incorrectly attributed to CURRENT_REVISION (but after committing, the
         attribution will be correct).
         """
-        basis = self.basis_tree()
-        basis.lock_read()
-        try:
-            changes = self.iter_changes(basis, True, [self.id2path(file_id)],
-                require_versioned=True).next()
-            changed_content, kind = changes[2], changes[6]
-            if not changed_content:
-                return basis.annotate_iter(file_id)
-            if kind[1] is None:
-                return None
-            import annotate
-            if kind[0] != 'file':
-                old_lines = []
-            else:
-                old_lines = list(basis.annotate_iter(file_id))
-            old = [old_lines]
-            for tree in self.branch.repository.revision_trees(
-                self.get_parent_ids()[1:]):
-                if file_id not in tree:
-                    continue
-                old.append(list(tree.annotate_iter(file_id)))
-            return annotate.reannotate(old, self.get_file(file_id).readlines(),
-                default_revision)
-        finally:
-            basis.unlock()
+        maybe_file_parent_keys = []
+        for parent_id in self.get_parent_ids():
+            try:
+                parent_tree = self.revision_tree(parent_id)
+            except errors.NoSuchRevisionInTree:
+                parent_tree = self.branch.repository.revision_tree(parent_id)
+            parent_tree.lock_read()
+            try:
+                if file_id not in parent_tree:
+                    continue
+                ie = parent_tree.inventory[file_id]
+                if ie.kind != 'file':
+                    # Note: this is slightly unnecessary, because symlinks and
+                    # directories have a "text" which is the empty text, and we
+                    # know that won't mess up annotations. But it seems cleaner
+                    continue
+                parent_text_key = (file_id, ie.revision)
+                if parent_text_key not in maybe_file_parent_keys:
+                    maybe_file_parent_keys.append(parent_text_key)
+            finally:
+                parent_tree.unlock()
+        graph = _mod_graph.Graph(self.branch.repository.texts)
+        heads = graph.heads(maybe_file_parent_keys)
+        file_parent_keys = []
+        for key in maybe_file_parent_keys:
+            if key in heads:
+                file_parent_keys.append(key)
+
+        # Now we have the parents of this content
+        annotator = self.branch.repository.texts.get_annotator()
+        text = self.get_file(file_id).read()
+        this_key = (file_id, default_revision)
+        annotator.add_special_text(this_key, file_parent_keys, text)
+        annotations = [(key[-1], line)
+                       for key, line in annotator.annotate_flat(this_key)]
+        return annotations

     def _get_ancestors(self, default_revision):
         ancestors = set([default_revision])

=== modified file 'setup.py'
--- setup.py 2009-06-23 07:10:03 +0000
+++ setup.py 2009-07-08 23:35:27 +0000
@@ -259,6 +259,7 @@
                   define_macros=define_macros, libraries=libraries))


+add_pyrex_extension('bzrlib._annotator_pyx')
 add_pyrex_extension('bzrlib._bencode_pyx')
 add_pyrex_extension('bzrlib._btree_serializer_pyx')
 add_pyrex_extension('bzrlib._chunks_to_lines_pyx')
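
For readers skimming the diff, a minimal sketch (not part of the patch) of how the pieces added above fit together. It assumes `vf` is an empty VersionedFiles instance, however obtained; every call used here (get_annotator, annotate, add_special_text, annotate_flat) appears in the diff itself:

    # Sketch only: `vf` is an assumed VersionedFiles instance.
    vf.add_lines(('rev-1',), [], ['initial content\n'])
    vf.add_lines(('rev-2',), [('rev-1',)],
                 ['initial content\n', 'content in 2\n'])

    # Backends may return a specialised Annotator subclass via the
    # get_annotator() hook added above (e.g. the knit implementation).
    ann = vf.get_annotator()

    # annotate() returns parallel lists: per-line source keys, then lines.
    annotations, lines = ann.annotate(('rev-2',))
    # annotations == [(('rev-1',),), (('rev-2',),)]
    # lines == ['initial content\n', 'content in 2\n']

    # Uncommitted text (e.g. a dirty working tree) enters via a special
    # key; annotate_flat() then resolves each line to a single source,
    # which is how the new WorkingTree.annotate_iter drives it.
    ann.add_special_text(('current:',), [('rev-2',)],
                         'initial content\ncontent in 2\nnew line\n')
    flat = list(ann.annotate_flat(('current:',)))
    # flat[-1] == (('current:',), 'new line\n')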