Merge lp:~spiv/bzr/per-file-merge-hook-491711 into lp:bzr
- per-file-merge-hook-491711
- Merge into bzr.dev
Status: | Merged | ||||
---|---|---|---|---|---|
Merged at revision: | not available | ||||
Proposed branch: | lp:~spiv/bzr/per-file-merge-hook-491711 | ||||
Merge into: | lp:bzr | ||||
Prerequisite: | lp:~spiv/bzr/merge-minor-refactoring | ||||
Diff against target: |
960 lines (+672/-74) 10 files modified
NEWS (+16/-0) bzrlib/decorators.py (+76/-0) bzrlib/hooks.py (+2/-0) bzrlib/merge.py (+176/-74) bzrlib/plugins/news_merge/README (+9/-0) bzrlib/plugins/news_merge/__init__.py (+73/-0) bzrlib/plugins/news_merge/news_merge.py (+88/-0) bzrlib/plugins/news_merge/parser.py (+62/-0) bzrlib/tests/__init__.py (+1/-0) bzrlib/tests/per_merger.py (+169/-0) |
||||
To merge this branch: | bzr merge lp:~spiv/bzr/per-file-merge-hook-491711 | ||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
John A Meinel | Approve | ||
Vincent Ladeuil | Approve | ||
Review via email: mp+17279@code.launchpad.net |
This proposal supersedes a proposal from 2009-12-18.
Commit message
Description of the change
Andrew Bennetts (spiv) wrote : Posted in a previous version of this proposal | # |
Robert Collins (lifeless) wrote : Posted in a previous version of this proposal | # |
On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
>
> * it is invoked for all changes where one side has changed the
> content, and the other side has changed the content or deleted the fil
Why not 'all changes' ? I don't see why we should prevent hooks merging
even one-side changes in a more sophisticated manner.
> * should a hook function be able to have some control over the
> emission of THIS/BASE/OTHER files when it results in a conflict, or
> over their contents of those files?
I think yes, but that can come later.
> * what helpers can we provide to help people write hook functions?
> * what more can we do to help multiple hook registrations co-exist
> peacefully? e.g. These functions already chain, but if multiple
> plugins register with this hook the ordering is not defined.
I don't think the ordering matters, except in the case that someone is
writing a new 'fallback' hook, to replace our three-way merge attempt.
Perhaps put a priority into the hook or something? I'd wait for signs we
need it.
> I think ideally this hook point would be marked experimental until we
> get some practical experience proving that it works well in practice
> (e.g. that someone can write a plugin to merge NEWS files), but that's
> what beta series are for ;)
You can just mark it experimental in the docs ;)
-Rob
James Westby (james-w) wrote : Posted in a previous version of this proposal | # |
On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
Yay, thanks for working on this.
> The tests hopefully make the capabilities fairly clear. In short:
>
> * hook functions are passed the merger, file_id and trans_id (so can
> get access to filenames and file contents in the various
> this/other/base trees)
Is there any other information that is readily available?
I am a little concerned about the performance issues. bzr-builddeb is
going to be shipping one of these hooks, and I don't want that to
destroy the performance of merge.
At minimum on every conflict it has to look up the path names. If
we do as Rob suggests and call the hook on every modification that
is going to be O(tree) lookups.
Some way to short-circuit this might be good. Either passing in the
paths if they are already known (and are likely to be known even
with refactoring), or even allowing the hook to declare what paths
it is interested in (though that could be limiting).
(e.g. if path_filter is not None and basename == path_filter:)
Have you written a very simple implementation of e.g. a NEWS merger
and measured the impact?
> Some open questions (could be addressed in future patches perhaps):
>
> * should merge of a file rename vs. a text change use this hook
> point?
I imagine we would want to make it possible.
Some helpers would be useful for splitting out the cases, e.g. I imagine
many will just want to act on two text changes.
> * should a hook function be able to have some control over the
> emission of THIS/BASE/OTHER files when it results in a conflict, or
> over their contents of those files?
I'm not sure I see a use case for this.
> * what helpers can we provide to help people write hook functions?
Helpers for getting the paths, and other meta information.
Helpers for determining the cause of the conflict, text changes vs.
path changes etc.
> * what more can we do to help multiple hook registrations co-exist
> peacefully? e.g. These functions already chain, but if multiple
> plugins register with this hook the ordering is not defined.
I think using a known ordering would be good. Even using sorted() on
the hook names. This would allow you to place something at the end
by calling it zzfallback for instance.
Otherwise changing the API to allow a priority could be good. However,
these APIs are always tricky to deal with, as you are playing a game
of rock-paper-scissors with and unbounded number of choices.
I bring it up now, as it may require API changes in the registration
to do this.
Thanks,
James
Aaron Bentley (abentley) wrote : Posted in a previous version of this proposal | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Robert Collins wrote:
> On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
>> * it is invoked for all changes where one side has changed the
>> content, and the other side has changed the content or deleted the fil
>
> Why not 'all changes' ? I don't see why we should prevent hooks merging
> even one-side changes in a more sophisticated manner.
That would require us to pay attention to THIS-only changes, and that
would be a performance regression.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
XUUAni6xZaHfi47
=uew/
-----END PGP SIGNATURE-----
Aaron Bentley (abentley) wrote : Posted in a previous version of this proposal | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
James Westby wrote:
> On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
>> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
>> * should a hook function be able to have some control over the
>> emission of THIS/BASE/OTHER files when it results in a conflict, or
>> over their contents of those files?
>
> I'm not sure I see a use case for this.
merge3 and weave merge emit different things as their BASE, so if we
hooked merge to force a weave merge for certain files, we might want to
also use that style of BASE.
Can't see any use for controlling THIS or OTHER, though.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
3EgAn08cMSUKDne
=QYTu
-----END PGP SIGNATURE-----
John A Meinel (jameinel) wrote : Posted in a previous version of this proposal | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Robert Collins wrote:
> On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
>> * it is invoked for all changes where one side has changed the
>> content, and the other side has changed the content or deleted the fil
>
> Why not 'all changes' ? I don't see why we should prevent hooks merging
> even one-side changes in a more sophisticated manner.
I assume this is because the merge code itself splits these cases. Files
with clear winners don't get passed into the Merger code. Also, there is
bidirectional issues.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
9fsAoMnNoCWak28
=lRgj
-----END PGP SIGNATURE-----
John A Meinel (jameinel) wrote : Posted in a previous version of this proposal | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
...
>
> Is there any other information that is readily available?
>
> I am a little concerned about the performance issues. bzr-builddeb is
> going to be shipping one of these hooks, and I don't want that to
> destroy the performance of merge.
>
> At minimum on every conflict it has to look up the path names. If
> we do as Rob suggests and call the hook on every modification that
> is going to be O(tree) lookups.
Still O(changes), I don't see how you get O(tree).
>
> Some way to short-circuit this might be good. Either passing in the
> paths if they are already known (and are likely to be known even
> with refactoring), or even allowing the hook to declare what paths
> it is interested in (though that could be limiting).
I think passing in the "path_in_this_tree, path_in_other_tree, and
path_in_final_tree" as determined by the current merge logic would be
reasonable.
>
> (e.g. if path_filter is not None and basename == path_filter:)
>
> Have you written a very simple implementation of e.g. a NEWS merger
> and measured the impact?
>
>> Some open questions (could be addressed in future patches perhaps):
>>
>> * should merge of a file rename vs. a text change use this hook
>> point?
>
> I imagine we would want to make it possible.
>
> Some helpers would be useful for splitting out the cases, e.g. I imagine
> many will just want to act on two text changes.
>
>> * should a hook function be able to have some control over the
>> emission of THIS/BASE/OTHER files when it results in a conflict, or
>> over their contents of those files?
>
> I'm not sure I see a use case for this.
>
>> * what helpers can we provide to help people write hook functions?
>
> Helpers for getting the paths, and other meta information.
>
> Helpers for determining the cause of the conflict, text changes vs.
> path changes etc.
>
>> * what more can we do to help multiple hook registrations co-exist
>> peacefully? e.g. These functions already chain, but if multiple
>> plugins register with this hook the ordering is not defined.
>
> I think using a known ordering would be good. Even using sorted() on
> the hook names. This would allow you to place something at the end
> by calling it zzfallback for instance.
>
> Otherwise changing the API to allow a priority could be good. However,
> these APIs are always tricky to deal with, as you are playing a game
> of rock-paper-scissors with and unbounded number of choices.
>
> I bring it up now, as it may require API changes in the registration
> to do this.
>
> Thanks,
>
> James
I think explicitly making them unordered forces people to think about
whether they can do what they need in an unordered fashion.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
/3oAn1TyJC5/
=Gavq
-----END PGP SIGNATURE-----
John A Meinel (jameinel) wrote : Posted in a previous version of this proposal | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Aaron Bentley wrote:
> James Westby wrote:
>> On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
>>> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
>>> * should a hook function be able to have some control over the
>>> emission of THIS/BASE/OTHER files when it results in a conflict, or
>>> over their contents of those files?
>> I'm not sure I see a use case for this.
>
> merge3 and weave merge emit different things as their BASE, so if we
> hooked merge to force a weave merge for certain files, we might want to
> also use that style of BASE.
>
> Can't see any use for controlling THIS or OTHER, though.
>
> Aaron
Well, potentially a whitespace ignoring merger may want to pass that
ability on to the output files?
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
6UIAn30zdkrAz6h
=omK9
-----END PGP SIGNATURE-----
Robert Collins (lifeless) wrote : Posted in a previous version of this proposal | # |
On Fri, 2009-12-18 at 17:09 +0000, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Robert Collins wrote:
> > On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
> >> * it is invoked for all changes where one side has changed the
> >> content, and the other side has changed the content or deleted the fil
> >
> > Why not 'all changes' ? I don't see why we should prevent hooks merging
> > even one-side changes in a more sophisticated manner.
>
> That would require us to pay attention to THIS-only changes, and that
> would be a performance regression.
So there is a middle ground, where OTHER-only changes are examined.
There are a number of useful things one can do where OTHER-only changes
are examined (such as better merges of NEWS items by accounting for
changes after a release) which the current definition won't help with. I
see that checking for THIS-only changes would have an impact, and while
it might be nice, I don't see a need to do it at this point.
-Rob
John A Meinel (jameinel) wrote : Posted in a previous version of this proposal | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andrew Bennetts wrote:
> Andrew Bennetts has proposed merging lp:~spiv/bzr/per-file-merge-hook-491711 into lp:bzr with lp:~spiv/bzr/merge-minor-refactoring as a prerequisite.
>
> Requested reviews:
> bzr-core (bzr-core)
> Related bugs:
> #491711 hook to permit per file merges
> https:/
>
>
> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
>
> The tests hopefully make the capabilities fairly clear. In short:
>
> * it is invoked for all changes where one side has changed the content, and the other side has changed the content or deleted the file
> * hook functions are passed the merger, file_id and trans_id (so can get access to filenames and file contents in the various this/other/base trees)
> * hook functions can explicitly do nothing (allowing the next registered function to try), or return a conflict, a successful merge, or a deletion of the file.
> * hook functions can set the contents of a file in both the success and conflict case
>
> Some open questions (could be addressed in future patches perhaps):
>
> * should merge of a file rename vs. a text change use this hook point?
> * should a hook function be able to have some control over the emission of THIS/BASE/OTHER files when it results in a conflict, or over their contents of those files?
> * what helpers can we provide to help people write hook functions?
> * what more can we do to help multiple hook registrations co-exist peacefully? e.g. These functions already chain, but if multiple plugins register with this hook the ordering is not defined.
>
> I think ideally this hook point would be marked experimental until we get some practical experience proving that it works well in practice (e.g. that someone can write a plugin to merge NEWS files), but that's what beta series are for ;)
>
>
+ self.create_
+ "Called when file content needs to be merged (including
when one "
+ "side has deleted the file and the other has changed it)."
+ "merge_file_content is called with a "
+ "bzrlib.
tuple "
+ "of (status, lines), where status is one of 'not_applicable', "
+ "'success', 'conflicted', or 'delete'. If status is
success or "
+ "conflicted, then lines should be an iterable of the new
lines "
+ "for the file.",
+ (2, 1), None))
^- I'm wondering if we wouldn't want something smarter than a tuple of
(status, lines) being returned. It might give us a way to move to
supporting returning THIS/BASE/OTHER more directly. It may also make it
easier to handle path conflicts / updating the path as part of the hook.
I think that since the transform is being passed in (via the Merger
object), you actually can do any renaming you want. You just tell the
transform that the path for 'trans_id' should really be 'X'.
As james_w mentioned, we may want to make the api a little bit more
verbose, and have some helper functions o...
Andrew Bennetts (spiv) wrote : Posted in a previous version of this proposal | # |
First, thanks everyone for the deluge of thoughtful comments! It's
great to have so much enthusiasm about a patch :)
I'm going to reply to all the comments in this one jumbo reply, rather
than try to detangle the conversation threads this proposal has spawned.
I've skipped over some comments that are repeats of others, but I think
I've replied to every point. Do let me know if I've inadvertantly
forgotten something you'd like me to consider.
Robert Collins wrote:
> On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
> >
> > * it is invoked for all changes where one side has changed the
> > content, and the other side has changed the content or deleted the fil
>
> Why not 'all changes' ? I don't see why we should prevent hooks merging
> even one-side changes in a more sophisticated manner.
Hmm. I guess because if there's a clear one-side winner then I don't
think of that as a “merge” (from the point of view of that file), and I
didn't have a use-case for it. I was also trying to be mindful of
keeping the performance cost minimal, as Aaron mentioned. You have
given a use-case in a follow-up (“better merges of NEWS items by
accounting for changes after a release”), and that would definitely be
nice to have. I'll have a play and see what it would take to make
OTHER-only changes go via the hook, as your followup suggests. It
sounds like no-one is strongly interested in THIS-only at this point, so
I'm happy to ignore that for this patch.
Also, I know it's a bit of a cop-out answer, but a plugin could always
implement a whole new Merger if the hook isn't powerful enough...
John's reply to you said:
] I assume this is because the merge code itself splits these cases. Files
] with clear winners don't get passed into the Merger code. Also, there is
] bidirectional issues.
Right, I definitely did want to keep the changes as minimal as possible
because I'm not deeply familiar with the structure of the merge code,
but I do get the sense that it would be easy to accidentally introduce
bugs if I rearranged things.
I'm not certain what you mean by bidirectional issues? I think I can
make some educated guesses, but I'd rather you explained for me :)
> > * should a hook function be able to have some control over the
> > emission of THIS/BASE/OTHER files when it results in a conflict, or
> > over their contents of those files?
>
> I think yes, but that can come later.
Cool.
James Westby wrote:
> On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
[...]
> > * hook functions are passed the merger, file_id and trans_id (so can
> > get access to filenames and file contents in the various
> > this/other/base trees)
>
> Is there any other information that is readily available?
As John says, the path in the various tree versions are fairly cheaply
available via the merger object. I think it'll be worth adding
convenience methods to get at them. I imagine some plugins might want
to just check the basename, others would like to check the full path,
some might just use the file_id.
> I am a little concerned about the performance issues. bzr-builddeb is
> going to be shipping one of these hooks, ...
James Westby (james-w) wrote : Posted in a previous version of this proposal | # |
On Mon, 21 Dec 2009 02:39:14 -0000, Andrew Bennetts <email address hidden> wrote:
> I'm interested in the “on every conflict it [bzr-builddeb?] has to look
> up the path names” remark though — I don't think this hook as-is
> provides a convenient way to implement “try the regular logic, if that
> conflicts then try my custom plugin logic”. Is that something you want?
> I suspect the way to provide that nicely may be via making the regular
> logic interact via the same interfaces as the hook (i.e. receiving the
> same params object and returning the same values), and then providing a
> convenient way to call that via the hook params.
The way I imagine it working is that a hook will be registered by
bzr-builddeb that will merge debian/changelog files. The question
is to know what files to act on. We can try and parse the files
and just skip everything that doesn't parse as one of those files.
However, that will be both quite slow, plus hugely pessimistic.
Therefore I would want to short-circuit, and only bother trying
to parse files where the basename is "changelog". It would be
great if the basename was easily accesible to the hook so
that it could do this check without doing something like
trans_id-
> Yeah, that's a good point. Even many those that want to handle more than
> plain text vs text changes might still want to punt on delete vs text,
> for example. So I need to take the knowledge about what sort of changes
> are involved that's implicit in the code path now and turn that into
> some data that the hook function can inspect.
Yep, I'm mainly concerned in these comments about the API that is
presented to plugin writers. Being able simply separate out the cases
without each plugin having to cargo cult code to detect what situation
it is being called for would be good.
I think this will partially come from APIs, which I imagine are there,
I just haven't looked, and from docs. We should docuement the things
people commonly want to do so that it is obvious to plugin writers.
My concern would be that you write a hook that assumes you are always
dealing with text vs. text, and then it crashes when you get text vs.
delete. Making it clear that you can detect the cases at least points
you towards considering which you want to handle, even if the API
doesn't force you to think about them all.
> Well what sort of priority system do you have in mind that would be
> better than ordering by name?
Nothing in particular. I was just making a plea for a defined order.
If e.g. the order was whatever dict.keys() returns then there is no
way to get your hook to run last if you need to.
Thanks,
James
Robert Collins (lifeless) wrote : Posted in a previous version of this proposal | # |
On Mon, 2009-12-21 at 02:57 +0000, James Westby wrote:
> Nothing in particular. I was just making a plea for a defined order.
> If e.g. the order was whatever dict.keys() returns then there is no
> way to get your hook to run last if you need to.
You don't with names either.
'Run last' is very very hard to pin down.
'Topological order' is a bit easier, 'numeric order' is very simple.
Hooks don't have names so much as descriptions, I strongly urge /not/ to
use the labels as a sort order; ugh. We don't have orders elsewhere in
hook land and we haven't needed to - even though this discussion happens
every time.
Re: merging changelog, I'd lookup the full path and compare to
'debian/changelog' and 'changelog' - its simple and should be fast.
Andrew: are kind changes able to invoke this hook? If not please make
that clear.
I think passing in the Merger is fine; we pass in Branch in other hooks
and trust folk to use public bits and not make a mess. If someone does
fiddle, the conflict resolution phases will detect that pretty robustly
I think.
-Rob
Andrew Bennetts (spiv) wrote : Posted in a previous version of this proposal | # |
Robert Collins wrote:
> On Mon, 2009-12-21 at 02:57 +0000, James Westby wrote:
>
> > Nothing in particular. I was just making a plea for a defined order.
> > If e.g. the order was whatever dict.keys() returns then there is no
> > way to get your hook to run last if you need to.
>
> You don't with names either.
>
> 'Run last' is very very hard to pin down.
>
> 'Topological order' is a bit easier, 'numeric order' is very simple.
>
> Hooks don't have names so much as descriptions, I strongly urge /not/ to
> use the labels as a sort order; ugh. We don't have orders elsewhere in
> hook land and we haven't needed to - even though this discussion happens
> every time.
Yeah, I don't much like punning that. I'd like to make the order
defined and stable somehow rather than dependent on the somewhat
unstable order that plugins will get loaded, though.
> Re: merging changelog, I'd lookup the full path and compare to
> 'debian/changelog' and 'changelog' - its simple and should be fast.
I definitely think that's what I'd try to start with. Maybe that will
be a significant performance hit in which case we can rethink it, but
let's not prematurely optimise.
> Andrew: are kind changes able to invoke this hook? If not please make
> that clear.
I don't know off the top of my head. I'll find out and make it
explicit. My guess is no, but I'm not very confident about that.
> I think passing in the Merger is fine; we pass in Branch in other hooks
> and trust folk to use public bits and not make a mess. If someone does
> fiddle, the conflict resolution phases will detect that pretty robustly
> I think.
I'm not worried about people accessing non-public bits (they should
already know that _foo attributes are off-limits unless they're ready to
deal with the likely breakage), apart from ensuring that the public bits
are complete enough for hooks to do what they need.
However, I think it's important to decide if we pass the Merger with the
intent that the hook will use it only read-only (to retrieve file names,
file contents, maybe details about other things in the tree, etc), or if
we actually want to support the hook mutating things via the Merger. It
seems the answer is the latter, which is fine, but it's worth being
clear about that. That answer suggests that my original proposal that
the return value from the hook makes changes isn't quite right when the
hook can just do change(s) directly using the Merger.
-Andrew.
Robert Collins (lifeless) wrote : Posted in a previous version of this proposal | # |
I'd suggest one of:
return value indicates 'handled' or 'not handled'
return value is parameter to pass to next hook or None-to-stop
and let changes that require calling fn's on Merger be done directly.
-Rob
Andrew Bennetts (spiv) wrote : | # |
This addresses most (all?) of the points raised in the original reviews.
Major changes:
* Added 'news_merge' plugin to contrib/! This is an example plugin using the merge hook function so that you can see it in action, and so that users will have some inspiration for writing their own.
* The built-in merge logic is now invoked the same way as user-registered hook functions. So in theory anything the built-in logic can do, a plugin can do too.
* the hook point is invoked a bit earlier, so it can have a say in more kinds of conflicts by inspecting the 'winner' attribute.
* added some convenience methods/properties to the params object.
* generally simpler and more linear code path than lp:bzr, thanks to refactorings for above changes.
I think looking at the news_merge plugin the API is still a bit cumbersome (note the repetition of logic vs. the built-in text merge to see if this is an ordinary regular file, both sides changed merge). But having an example proves it works, and gives us good data for thinking of improvements.
Martin Pool (mbp) wrote : | # |
2010/1/13 Andrew Bennetts <email address hidden>:
> This addresses most (all?) of the points raised in the original reviews.
>
> Major changes:
>
> Â * Added 'news_merge' plugin to contrib/! Â This is an example plugin using the merge hook function so that you can see it in action, and so that users will have some inspiration for writing their own.
> Â * The built-in merge logic is now invoked the same way as user-registered hook functions. Â So in theory anything the built-in logic can do, a plugin can do too.
> Â * the hook point is invoked a bit earlier, so it can have a say in more kinds of conflicts by inspecting the 'winner' attribute.
> Â * added some convenience methods/properties to the params object.
> Â * generally simpler and more linear code path than lp:bzr, thanks to refactorings for above changes.
>
> I think looking at the news_merge plugin the API is still a bit cumbersome (note the repetition of logic vs. the built-in text merge to see if this is an ordinary regular file, both sides changed merge). Â But having an example proves it works, and gives us good data for thinking of improvements.
(Without reading the diff) that sounds like a nice list of changes.
I don't want to add latency to these changes but I would say that
putting things in contrib/ is not a fabulous way to get it out to
people: it relies on them discovering that it's there and how to use
it. It is a bit confusing for packagers whether they should install
the things from in there or not: if they don't, it's more
inaccessible, and if they do then the packaged version is quite
different from the standard one. Empirically, the other things in
there have tended to rot. They are not tested.
Therefore maybe you should put this in bzrlib.plugins but arrange that
it has no effect (and little load time overhead) unless it's
configured on?
--
Martin <http://
Robert Collins (lifeless) wrote : | # |
On Wed, 2010-01-13 at 07:54 +0000, Martin Pool wrote:
>
> Therefore maybe you should put this in bzrlib.plugins but arrange that
> it has no effect (and little load time overhead) unless it's
> configured on?
+1000
-Rob
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andrew Bennetts wrote:
> Andrew Bennetts has proposed merging lp:~spiv/bzr/per-file-merge-hook-491711 into lp:bzr with lp:~spiv/bzr/merge-minor-refactoring as a prerequisite.
>
> Requested reviews:
> John A Meinel (jameinel)
> Related bugs:
> #491711 hook to permit per file merges
> https:/
>
>
I think overall this is looking good. Looking over the actual diff
brought up a lot of questions about how various things hook together. I
don't know that everything needs to be addressed before merging, but I
think it would be good to at least understand what is going on.
review: needsfixing
+class MergeHookParams
+ """Object holding parameters passed to merge_file_content hooks.
+
+ There are 3 fields hooks can access:
+
What about "this_pair", "other_pair", etc. I don't really understand
what the "pair" is. this_pair[0] looks to be a Kind. Also, you should
probably document "base_lines", "this_lines", and "other_lines" here as
well. The current documentation assumes you'll go to the merger, but
you've already provided helpers for that.
base_lines, this_lines and other_lines doesn't map very cleanly into
WeaveMerge... But maybe it is good enough.
+ if hook_status == 'not_applicable':
+ # This is a contents conflict, because none of the available
+ # functions could merge it.
+ result = None
+ name = self.tt.
+ parent_id = self.tt.
+ if file_id in self.this_
+ self.tt.
+ file_group = self._dump_
+ set_version=True)
+ self._raw_
^- Are we sure this is correct? If there is a contents_conflict we mark
the file as unversioned?
Also, you are using "self.this_
in self.this_tree".
I'd probably rather use "self.this_
Trees shouldn't really have a __contains__ method, and I'd like to avoid
using ".inventory" if we can. (Most Dirstate actions can be resolved
without building an inventory at all.)
...
+ elif hook_status == 'conflicted':
+ # XXX: perhaps the hook should be able to provide
+ # the BASE/THIS/OTHER files?
+ self.tt.
+ self._raw_
+ name = self.tt.
+ parent_id = self.tt.
+ file_group = self._dump_
+ file_group.
^- Do you know why we are appending the trans_id here? file_group isn't
mentioned again. So is this mutating an internal structure behind the
scenes? (_dump_conflicts is creating a list and storing it in state
somewhere, and we mutate it to add the new id.) If so, a comment would
be nice. (Probably not your code, but certainly a "what is going on?"
moment.)
...
+ try:...
Martin Pool (mbp) wrote : | # |
As a follow on (after merging) it would be good to
* announce this on udd
* see if there is some existing code for merging debian changelogs and merge that
Andrew Bennetts (spiv) wrote : | # |
John A Meinel wrote:
> Review: Needs Fixing
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Andrew Bennetts wrote:
> > Andrew Bennetts has proposed merging lp:~spiv/bzr/per-file-merge-hook-491711 into lp:bzr with lp:~spiv/bzr/merge-minor-refactoring as a prerequisite.
> >
> > Requested reviews:
> > John A Meinel (jameinel)
> > Related bugs:
> > #491711 hook to permit per file merges
> > https:/
> >
> >
>
> I think overall this is looking good. Looking over the actual diff
> brought up a lot of questions about how various things hook together. I
> don't know that everything needs to be addressed before merging, but I
> think it would be good to at least understand what is going on.
Fair enough. There are parts I don't fully understand, but those parts
I'm pretty sure are working the same as in trunk (and are equally
confusing in trunk).
Thanks for the review!
> +class MergeHookParams
> + """Object holding parameters passed to merge_file_content hooks.
> +
> + There are 3 fields hooks can access:
> +
>
> What about "this_pair", "other_pair", etc. I don't really understand
> what the "pair" is. this_pair[0] looks to be a Kind. Also, you should
Yeah, it is a kind. This was a refactoring that I should have taken a
little further by just passing the kinds rather than the “pairs”. I've
done that now, and added this_kind and other_kind to the class
docstring.
> probably document "base_lines", "this_lines", and "other_lines" here as
> well. The current documentation assumes you'll go to the merger, but
> you've already provided helpers for that.
I've removed the outdated comment that suggests using the merger for
this.
The *_lines properties are already documented, though, they have their
own docstrings. Unless you think they should be mentioned in the class
docstring too, but I didn't think that was our usual practice?
> base_lines, this_lines and other_lines doesn't map very cleanly into
> WeaveMerge... But maybe it is good enough.
Yeah, I admit to being unsure about what to do with the non-default
Merger classes. I guess for now we leave it up to the plugins to check
the type of Merger if they are sensitive to it? I would like to give
plugins the chance to use their own hooks even in the case of
WeaveMerge.
> + if hook_status == 'not_applicable':
> + # This is a contents conflict, because none of the available
> + # functions could merge it.
> + result = None
> + name = self.tt.
> + parent_id = self.tt.
> + if file_id in self.this_
> + self.tt.
> + file_group = self._dump_
> + set_version=True)
> + self._raw_
>
> ^- Are we sure this is correct? If there is a contents_conflict we mark
> the file as unversioned?
I'm not really sure why, but this is what lp:bzr does in these cases.
So this isn't an issue introduced by this branch.
> Also, you are u...
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
...
>> + try:
>> + self.tt.
>> + self.tt.
>> + except errors.NoSuchFile:
>> + pass
>>
>> ^- I don't quite understand the unconditional "delete_contents" if the
>> contents exist. Especially given that:
>>
>> + elif hook_status == 'success':
>> + self.tt.
>>
>> Is called *before* this.
>
> Yes, this certainly puzzled me when I came across it in trunk! (Again,
> this oddity is definitely present in trunk, just that my patch has
> perhaps made it more visible.)
>
>> Is it that TreeTransform.
>> the content in the current tree, and "create_file()" is creating
>> contents in the "next" tree?
>
> Exactly how TreeTransform isn't clearly documented that I could find.
> If someone that knows precisely what the operations on it means and
> could document that it would be a great help, I think. AFAICS there
> isn't even a description of what a trans_id is... is an abbreviation of
> “transform id” or “transaction id”, are they file_ids, or something else
> but like file_ids? What's the difference between unversion_file()
> and delete_contents()? etc.
I can help with this.
A trans_id is a handle for the TreeTransform. Similar in concept to a
file_id mapping to a path in an inventory. However, TreeTransform can
talk about files that aren't versioned (have no trans_id), and multiple
copies of the same logical file (two people introduce the same file_id
at different locations, etc). As such, trans_id are an abstraction of a
specific file/directory content during the transaction. AFAICT they are
always of the form "new-%d".
unversion_file is saying that we need to remove the file_id from the
inventory.
delete_contents says we want to call 'os.remove()' on the object.
Note that you don't get a new trans_id for content being removed versus
content being added. I checked and _removed_contents indicates files
that are to be deleted from the wt, but you can *also* have the same
trans_id in _new_contents. So I think trans_id is like file_id, in that
it is how TT maps this file should be replacing that file.
I think part of it is that TT doesn't think about replacements. It only
has a remove phase and an add phase.
1) All new content is staged into .bzr/checkout/
names like 'new-1'. Though we do try to create files in their final
directories with their final names if we know in advance. (So we don't
rename all files, just the containing dirs for 'bzr co'.)
2) All old files are renamed into .bzr/checkout/
3) All new files are renamed from .bzr/checkout/limbo into the working tree
4) All old files in .bzr/checkout/
That is how we get atomicity of updates, so that if something fails in
(3) we can still revert back to the original tree.
>
> So, for this specific puzzle...
>
> The code I'm guessing is there to ensure that the file is marked as
> versioned, but that the temporary file used for the new contents is
> removed? Staring again at the treetransform.py...
Aaron Bentley (abentley) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andrew Bennetts wrote:
>> ^- Are we sure this is correct? If there is a contents_conflict we mark
>> the file as unversioned?
That's correct. For a contents conflict that is not a text conflict,
bzr does not leave a file in place with the original filename, therefore
we also unversion it, and version the .OTHER, .THIS or .BASE--
whatever's there.
>> + elif hook_status == 'conflicted':
>> + # XXX: perhaps the hook should be able to provide
>> + # the BASE/THIS/OTHER files?
>> + self.tt.
>> + self._raw_
>> + name = self.tt.
>> + parent_id = self.tt.
>> + file_group = self._dump_
>> + file_group.
>>
>> ^- Do you know why we are appending the trans_id here? file_group isn't
>> mentioned again.
> Hmm. Again, this is something that lp:bzr does that I've
> preserved/copied. On closer inspection, I don't think it has any effect
> (either in lp:bzr or in my branch), so I'll just delete that.
This code doesn't appear to have ever made sense. It was introduced in
<email address hidden>. I
suspect it was meant to help with treating a file and its conflict files
as a group for conflict handling.
>> + try:
>> + self.tt.
>> + self.tt.
>> + except errors.NoSuchFile:
>> + pass
>>
>> ^- I don't quite understand the unconditional "delete_contents" if the
>> contents exist. Especially given that:
>>
>> + elif hook_status == 'success':
>> + self.tt.
>>
>> Is called *before* this.
Operations on a TreeTransform simply schedule things to happen when you
call 'apply'. Before and after are irrelevant (except when you cancel
an operation you had previously scheduled). All that matters is that
the tree transform is consistent when you call apply.
>> Is it that TreeTransform.
>> the content in the current tree, and "create_file()" is creating
>> contents in the "next" tree?
delete_contents means that applying the transform should delete these
contents, and create_file means that applying the transform should
create a file with specified contents.
> Exactly how TreeTransform isn't clearly documented that I could find.
How it what? Works?
> If someone that knows precisely what the operations on it means and
> could document that it would be a great help, I think.
I've certainly tried in the past, and I'm happy to enhance the
documentation. Could you start by explaining what is confusing or
missing from the TreeTransform docstring?
> AFAICS there
> isn't even a description of what a trans_id is...
There are three paragraphs of description:
> Transform/
> -------
> trans_ids are temporary ids assigned to all files involved in a transform.
> It's possible, even common, that not all ...
Andrew Bennetts (spiv) wrote : | # |
Aaron Bentley wrote:
[...]
> > Exactly how TreeTransform isn't clearly documented that I could find.
>
> How it what? Works?
Yes, that's the missing word. I'm glad your telepathy is functioning
well :)
> > If someone that knows precisely what the operations on it means and
> > could document that it would be a great help, I think.
>
> I've certainly tried in the past, and I'm happy to enhance the
> documentation. Could you start by explaining what is confusing or
> missing from the TreeTransform docstring?
Simply that I didn't find it when I looked for it, believe or not! I
looked for a class docstring on the class that implemented these methods
and instance variables (TreeTransformB
docstring, and then tried grepping doc/developers/. It didn't occur to
me to try looking a couple of classes (and 1000 lines) further down
transform.py (and a couple of classes further down the inheritance
hierarchy too) to see if one of those had introductory documentation.
This is the sort of situation where I miss dedicated “interface”
documentation, I guess.
Thanks for pointing me to that, I thought I'd seen it before but several
minutes of solid searching failed to find it for me! I expect I'll
remember it for next time.
[...]
> is an abbreviation of
> > “transform id” or “transaction id”
>
> Both, as you can see above.
[Eep! Ambiguity of names doesn't help clarity. I realise it's basically
an arbitrary choice, but I suggest picking one. Actually, it's also
that neither name is particuarly satisfying: one of these IDs doesn't
correspond to a single tree transform (which is done as a single
transaction), it corresponds to an element of that transform. So, I'd
probably find “tentry_id” (short for “transform entry ID”) clear and
mnemonic... but it's a bit late to change now :)
]
> >, are they file_ids, or something else
> > but like file_ids?
>
>
> > What's the difference between unversion_file()
> > and delete_contents()?
>
> One schedules the deletion of the contents of a file. The other
> schedules removes the removal of file id of the file. The docstring for
> delete_contents is:
>
> > """Schedule the contents of a path entry for deletion"""
>
> The docstring for unversion_file is:
> > """Schedule a path entry to become unversioned"""
>
> If this is unclear, could you say what about it is unclear?
First I should say I had an intuition about the meanings of these, and I
think my intuition was basically correct. So it's not so bad.
It's unclear for a few reasons:
* the docstrings don't really add information not already in the method
name,
* my confidence in my intuition about the names was low because the
terms were so close to synonymous, and
* the terminology is not memorably clear.
By “memorably clear”, I mean that terms like “delete” and “unversion”
are close to synonymous, so the distinction is hard to notice and
recall.
I don't mean say you necessarily should have found better names, maybe
there are no better names for these concepts... none spring to mind for
me! But (assuming I had found the TreeTransform docstring in the first
place) some explicit descri...
Andrew Bennetts (spiv) wrote : | # |
John A Meinel wrote:
[...]
> Though it also means that calling:
> self.tt.tree_kind()
> self.tt.
>
> is redundant and double statting the file. So we should clean that up
Yeah, I'd noticed that, but wasn't sure if it was tasteful to clean that
up. Seeing as you think we should I've gone ahead and removed that line
:)
[...]
> So the comment looks correct to my understanding.
I've added a comment based on Aaron's wording.
[...]
> If I set up location.conf to say:
> [/home/
> news_merger_files = NEWS
> news_merge_
(Thanks for this, this is exactly the syntax I've used for the
news_merge plugin configuration)
> Will that match when 'bzr merge --preview' is invoked (what Branch is
> given to the PreviewTree, etc.) I'm not 100% sure that PreviewTree *has*
> a .branch property. (Given that RevisionTree does not, but has a
> _repository attribute. Even better it is passed in as 'branch'... :)
It turned out I had to pass the branch a little deeper in all cases, but
having done that 'merge --preview' works just as well as regular 'merge'
:)
[...]
> It is probably reasonable to merge soon. I'd probably like to see the
> news plugin in 'plugins' first, but that is not critical.
I've done that anyway. Please try it out! :)
-Andrew.
Vincent Ladeuil (vila) wrote : | # |
I reviewed this chronologically (reading oldest revision first)
and I found it was far easier to understand what was going
on. I'm not sure that trick could be used for all reviews but it
definetly made that one a nice tour :)
Keep that in mind when reading the comments below as they may
seem out of context otherwise.
I had a moderately good understanding of the merge code and I
find that your refactoring makes it far clearer, thanks for that
!
32 === modified file 'bzrlib/
33 --- bzrlib/
34 +++ bzrlib/
35 @@ -253,3 +253,19 @@
36 global needs_read_lock, needs_write_lock
37 needs_read_lock = _pretty_
38 needs_write_lock = _pretty_
39 +
40 +
41 +def cachedproperty(
42 + """Like property(
43 +
44 + Note: there's no way to clear or reset the cache.
45 + """
46 + memo = []
AIUI, this is gc'ed when no more references exist on the cached
value. It's not obvious and may be worth documenting.
Also, this can be used in a lot of places in the codebase so
that's worth a NEWS entry isn't it ?
And last remark on that subject, can't use some unique object()
to initialize the cache instead of a list for each property ?
Using a list here is cute but a bit hackish, using a dedicated
object to represent the 'not cached yet' could make the code more
obvious to the casual reader.
The hook scope is a bit unclear regarding the tt updates (spotted
when reading revno 4887). Why don't you let the hooks do them (in
merge.merge_
in the hook docstring ?
I had to read the code to verify that is_file_merge() was indeed
is_text_merge(), is it just me or should it be renamed ? Ha,
fixed in revno 4890 :) Never mind then.
248 + params = MergeHookParams
249 + other_pair[0], winner)
Using this_kind= this_pair[0] and other_kind=
add a final touch of clarity to revno 4894 :)
Overall, nice job and reading the history of how it was put together
was really a big help to understand it.
I'll be happy to land this for 2.1 unless John or Aaron have more objections
until tomorrow.
Addressing the points I raised above with be nice but not mandatory for landing.
Andrew Bennetts (spiv) wrote : | # |
Vincent Ladeuil wrote:
[...]
> 41 +def cachedproperty(
> 42 + """Like property(
> 43 +
> 44 + Note: there's no way to clear or reset the cache.
> 45 + """
> 46 + memo = []
>
>
> AIUI, this is gc'ed when no more references exist on the cached
> value. It's not obvious and may be worth documenting.
Erk, this is totally bogus. I'm not sure what I was thinking! I was
trying for a limited quick & dirty cachedproperty decorator without the
complexity of the one in Launchpad, but I think I should just borrow
Launchpad's, so I've now done that (including arranging for its doctests
to be run).
> Also, this can be used in a lot of places in the codebase so
> that's worth a NEWS entry isn't it ?
Good idea. Done.
> The hook scope is a bit unclear regarding the tt updates (spotted
> when reading revno 4887). Why don't you let the hooks do them (in
> merge.merge_
> in the hook docstring ?
I'm not sure what you mean here? You mean let the hook do the
self.tt.
the end of merge_contents?
I guess I'm still not certain exactly how much the hook should be
allowed to do... the more plugins people try to write for it, the more
we'll know :)
[...]
> I'll be happy to land this for 2.1 unless John or Aaron have more objections
> until tomorrow.
>
> Addressing the points I raised above with be nice but not mandatory for landing.
That's very much for the extra review.
-Andrew.
Vincent Ladeuil (vila) wrote : | # |
>>>>> "Andrew" == Andrew Bennetts <email address hidden> writes:
Andrew> Vincent Ladeuil wrote:
Andrew> [...]
>> 41 +def cachedproperty(
>> 42 + """Like property(
>> 43 +
>> 44 + Note: there's no way to clear or reset the cache.
>> 45 + """
>> 46 + memo = []
>>
>>
>> AIUI, this is gc'ed when no more references exist on the cached
>> value. It's not obvious and may be worth documenting.
Andrew> Erk, this is totally bogus.
Really ? How so ?
<snip/>
>> The hook scope is a bit unclear regarding the tt updates
>> (spotted when reading revno 4887). Why don't you let the
>> hooks do them (in merge.merge_
>> that be at least documented in the hook docstring ?
Andrew> I'm not sure what you mean here? You mean let the
Andrew> hook do the self.tt.
Andrew> self.tt.
Andrew> of merge_contents?
Yes or let the caller do it but document it anyway.
Andrew> I guess I'm still not certain exactly how much the
Andrew> hook should be allowed to do... the more plugins
Andrew> people try to write for it, the more we'll know :)
Ok, no problem, I'm not sure either. It's just that it seems odd
to *forbid* the hook to do some stuff a bit different if
needed. But there is no point forcing people to do it
unconditionally either. Time will tell.
I'm happy enough with what is possible today anyway.
John A Meinel (jameinel) wrote : | # |
One problem I potentially see with the plugin is this:
585 +def filename_
586 + config = params.
587 + affected_files = config.
588 + if affected_files:
589 + filename = params.
590 + if filename in affected_files:
591 + return True
592 + return False
The issue is that "branch.
I wonder if we should have a place on MergeHookParams to store hook data...
Otherwise I like this revision.
Robert Collins (lifeless) wrote : | # |
On Mon, 2010-01-18 at 11:06 +0000, Andrew Bennetts wrote:
> > AIUI, this is gc'ed when no more references exist on the cached
> > value. It's not obvious and may be worth documenting.
>
> Erk, this is totally bogus. I'm not sure what I was thinking! I was
> trying for a limited quick & dirty cachedproperty decorator without the
> complexity of the one in Launchpad, but I think I should just borrow
> Launchpad's, so I've now done that (including arranging for its doctests
> to be run).
You'll need to get an ack to rerelease the LP code to do this: Launchpad
is AGPL, which is incompatible with GPL.
I agree with John that we need to avoid processing branch.conf per-file.
-Rob
Vincent Ladeuil (vila) wrote : | # |
> On Mon, 2010-01-18 at 11:06 +0000, Andrew Bennetts wrote:
>
> You'll need to get an ack to rerelease the LP code to do this: Launchpad
> is AGPL, which is incompatible with GPL.
I've got the ack from flacoste.
>
> I agree with John that we need to avoid processing branch.conf per-file.
spiv proposed to cache the config in the plugin, I'll tweak and merge.
We may want to discuss caching the config in the branch itself
(I can't find scenarios where a locked branch could have trouble with that)
but that's outside the scope of this plugin.
Preview Diff
1 | === modified file 'NEWS' |
2 | --- NEWS 2010-01-17 20:06:41 +0000 |
3 | +++ NEWS 2010-01-18 11:01:23 +0000 |
4 | @@ -20,6 +20,15 @@ |
5 | * Add bug information to log output when available. |
6 | (Neil Martinsen-Burrell, Guillermo Gonzalez, #251729) |
7 | |
8 | +* Added ``merge_file_content`` hook point to ``Merger``, allowing plugins |
9 | + to register custom merge logic, e.g. to provide smarter merging for |
10 | + particular files. |
11 | + |
12 | +* Bazaar now includes the ``news_merge`` plugin. It is disabled by |
13 | + default, to enable it add a ``news_merge_files`` option to your |
14 | + configuration. Consult ``bzr help news_merge`` for more information. |
15 | + (Andrew Bennetts) |
16 | + |
17 | * ``bzr branch`` now takes a ``--bind`` option. This lets you |
18 | branch and bind all in one command. (Ian Clatworthy) |
19 | |
20 | @@ -154,11 +163,18 @@ |
21 | API Changes |
22 | *********** |
23 | |
24 | +* Added ``cachedproperty`` decorator to ``bzrlib.decorators``. |
25 | + (Andrew Bennetts) |
26 | + |
27 | * Many test features were renamed from ``FooFeature`` to ``foo_feature`` |
28 | to be consistent with instances being lower case and classes being |
29 | CamelCase. For the features that were more likely to be used, we added a |
30 | deprecation thunk, but not all. (John Arbash Meinel) |
31 | |
32 | +* Merger classes (such as ``Merge3Merger``) now expect a ``this_branch`` |
33 | + parameter in their constructora, and provide ``this_branch`` as an |
34 | + attribute. (Andrew Bennetts) |
35 | + |
36 | * The Branch hooks pre_change_branch_tip no longer masks exceptions raised |
37 | by plugins - the original exceptions are now preserved. (Robert Collins) |
38 | |
39 | |
40 | === modified file 'bzrlib/decorators.py' |
41 | --- bzrlib/decorators.py 2009-10-15 02:19:43 +0000 |
42 | +++ bzrlib/decorators.py 2010-01-18 11:01:23 +0000 |
43 | @@ -253,3 +253,79 @@ |
44 | global needs_read_lock, needs_write_lock |
45 | needs_read_lock = _pretty_needs_read_lock |
46 | needs_write_lock = _pretty_needs_write_lock |
47 | + |
48 | + |
49 | +# This implementation of cachedproperty is copied from Launchpad's |
50 | +# canonical.launchpad.cachedproperty module. |
51 | +def cachedproperty(attrname_or_fn): |
52 | + """A decorator for methods that makes them properties with their return |
53 | + value cached. |
54 | + |
55 | + The value is cached on the instance, using the attribute name provided. |
56 | + |
57 | + If you don't provide a name, the mangled name of the property is used. |
58 | + |
59 | + >>> class CachedPropertyTest(object): |
60 | + ... |
61 | + ... @cachedproperty('_foo_cache') |
62 | + ... def foo(self): |
63 | + ... print 'foo computed' |
64 | + ... return 23 |
65 | + ... |
66 | + ... @cachedproperty |
67 | + ... def bar(self): |
68 | + ... print 'bar computed' |
69 | + ... return 69 |
70 | + |
71 | + >>> cpt = CachedPropertyTest() |
72 | + >>> getattr(cpt, '_foo_cache', None) is None |
73 | + True |
74 | + >>> cpt.foo |
75 | + foo computed |
76 | + 23 |
77 | + >>> cpt.foo |
78 | + 23 |
79 | + >>> cpt._foo_cache |
80 | + 23 |
81 | + >>> cpt.bar |
82 | + bar computed |
83 | + 69 |
84 | + >>> cpt._bar_cached_value |
85 | + 69 |
86 | + |
87 | + """ |
88 | + if isinstance(attrname_or_fn, basestring): |
89 | + attrname = attrname_or_fn |
90 | + return _CachedPropertyForAttr(attrname) |
91 | + else: |
92 | + fn = attrname_or_fn |
93 | + attrname = '_%s_cached_value' % fn.__name__ |
94 | + return _CachedProperty(attrname, fn) |
95 | + |
96 | + |
97 | +class _CachedPropertyForAttr(object): |
98 | + |
99 | + def __init__(self, attrname): |
100 | + self.attrname = attrname |
101 | + |
102 | + def __call__(self, fn): |
103 | + return _CachedProperty(self.attrname, fn) |
104 | + |
105 | + |
106 | +class _CachedProperty(object): |
107 | + |
108 | + def __init__(self, attrname, fn): |
109 | + self.fn = fn |
110 | + self.attrname = attrname |
111 | + self.marker = object() |
112 | + |
113 | + def __get__(self, inst, cls=None): |
114 | + if inst is None: |
115 | + return self |
116 | + cachedresult = getattr(inst, self.attrname, self.marker) |
117 | + if cachedresult is self.marker: |
118 | + result = self.fn(inst) |
119 | + setattr(inst, self.attrname, result) |
120 | + return result |
121 | + else: |
122 | + return cachedresult |
123 | |
124 | === modified file 'bzrlib/hooks.py' |
125 | --- bzrlib/hooks.py 2010-01-03 03:49:07 +0000 |
126 | +++ bzrlib/hooks.py 2010-01-18 11:01:23 +0000 |
127 | @@ -45,6 +45,8 @@ |
128 | 'bzrlib.info', 'InfoHooks') |
129 | known_hooks.register_lazy(('bzrlib.lock', 'Lock.hooks'), 'bzrlib.lock', |
130 | 'LockHooks') |
131 | +known_hooks.register_lazy(('bzrlib.merge', 'Merger.hooks'), 'bzrlib.merge', |
132 | + 'MergeHooks') |
133 | known_hooks.register_lazy(('bzrlib.msgeditor', 'hooks'), 'bzrlib.msgeditor', |
134 | 'MessageEditorHooks') |
135 | known_hooks.register_lazy(('bzrlib.mutabletree', 'MutableTree.hooks'), |
136 | |
137 | === modified file 'bzrlib/merge.py' |
138 | --- bzrlib/merge.py 2009-12-18 08:22:42 +0000 |
139 | +++ bzrlib/merge.py 2010-01-18 11:01:23 +0000 |
140 | @@ -19,8 +19,10 @@ |
141 | branch as _mod_branch, |
142 | conflicts as _mod_conflicts, |
143 | debug, |
144 | + decorators, |
145 | errors, |
146 | graph as _mod_graph, |
147 | + hooks, |
148 | merge3, |
149 | osutils, |
150 | patiencediff, |
151 | @@ -50,7 +52,68 @@ |
152 | from_tree.unlock() |
153 | |
154 | |
155 | +class MergeHooks(hooks.Hooks): |
156 | + |
157 | + def __init__(self): |
158 | + hooks.Hooks.__init__(self) |
159 | + self.create_hook(hooks.HookPoint('merge_file_content', |
160 | + "Called when file content needs to be merged (including when one " |
161 | + "side has deleted the file and the other has changed it)." |
162 | + "merge_file_content is called with a " |
163 | + "bzrlib.merge.MergeHookParams. The function should return a tuple " |
164 | + "of (status, lines), where status is one of 'not_applicable', " |
165 | + "'success', 'conflicted', or 'delete'. If status is success or " |
166 | + "conflicted, then lines should be an iterable of strings of the " |
167 | + "new file contents.", |
168 | + (2, 1), None)) |
169 | + |
170 | + |
171 | +class MergeHookParams(object): |
172 | + """Object holding parameters passed to merge_file_content hooks. |
173 | + |
174 | + There are 3 fields hooks can access: |
175 | + |
176 | + :ivar merger: the Merger object |
177 | + :ivar file_id: the file ID of the file being merged |
178 | + :ivar trans_id: the transform ID for the merge of this file |
179 | + :ivar this_kind: kind of file_id in 'this' tree |
180 | + :ivar other_kind: kind of file_id in 'other' tree |
181 | + :ivar winner: one of 'this', 'other', 'conflict' |
182 | + """ |
183 | + |
184 | + def __init__(self, merger, file_id, trans_id, this_kind, other_kind, |
185 | + winner): |
186 | + self.merger = merger |
187 | + self.file_id = file_id |
188 | + self.trans_id = trans_id |
189 | + self.this_kind = this_kind |
190 | + self.other_kind = other_kind |
191 | + self.winner = winner |
192 | + |
193 | + def is_file_merge(self): |
194 | + """True if this_kind and other_kind are both 'file'.""" |
195 | + return self.this_kind == 'file' and self.other_kind == 'file' |
196 | + |
197 | + @decorators.cachedproperty |
198 | + def base_lines(self): |
199 | + """The lines of the 'base' version of the file.""" |
200 | + return self.merger.get_lines(self.merger.base_tree, self.file_id) |
201 | + |
202 | + @decorators.cachedproperty |
203 | + def this_lines(self): |
204 | + """The lines of the 'this' version of the file.""" |
205 | + return self.merger.get_lines(self.merger.this_tree, self.file_id) |
206 | + |
207 | + @decorators.cachedproperty |
208 | + def other_lines(self): |
209 | + """The lines of the 'other' version of the file.""" |
210 | + return self.merger.get_lines(self.merger.other_tree, self.file_id) |
211 | + |
212 | + |
213 | class Merger(object): |
214 | + |
215 | + hooks = MergeHooks() |
216 | + |
217 | def __init__(self, this_branch, other_tree=None, base_tree=None, |
218 | this_tree=None, pb=None, change_reporter=None, |
219 | recurse='down', revision_graph=None): |
220 | @@ -432,7 +495,7 @@ |
221 | 'other_tree': self.other_tree, |
222 | 'interesting_ids': self.interesting_ids, |
223 | 'interesting_files': self.interesting_files, |
224 | - 'pp': self.pp, |
225 | + 'pp': self.pp, 'this_branch': self.this_branch, |
226 | 'do_merge': False} |
227 | if self.merge_type.requires_base: |
228 | kwargs['base_tree'] = self.base_tree |
229 | @@ -542,13 +605,14 @@ |
230 | interesting_ids=None, reprocess=False, show_base=False, |
231 | pb=progress.DummyProgress(), pp=None, change_reporter=None, |
232 | interesting_files=None, do_merge=True, |
233 | - cherrypick=False, lca_trees=None): |
234 | + cherrypick=False, lca_trees=None, this_branch=None): |
235 | """Initialize the merger object and perform the merge. |
236 | |
237 | :param working_tree: The working tree to apply the merge to |
238 | :param this_tree: The local tree in the merge operation |
239 | :param base_tree: The common tree in the merge operation |
240 | :param other_tree: The other tree to merge changes from |
241 | + :param this_branch: The branch associated with this_tree |
242 | :param interesting_ids: The file_ids of files that should be |
243 | participate in the merge. May not be combined with |
244 | interesting_files. |
245 | @@ -577,6 +641,7 @@ |
246 | self.this_tree = working_tree |
247 | self.base_tree = base_tree |
248 | self.other_tree = other_tree |
249 | + self.this_branch = this_branch |
250 | self._raw_conflicts = [] |
251 | self.cooked_conflicts = [] |
252 | self.reprocess = reprocess |
253 | @@ -892,7 +957,7 @@ |
254 | self.tt.final_kind(other_root) |
255 | except errors.NoSuchFile: |
256 | return |
257 | - if self.other_tree.inventory.root.file_id in self.this_tree.inventory: |
258 | + if self.this_tree.has_id(self.other_tree.inventory.root.file_id): |
259 | # the other tree's root is a non-root in the current tree |
260 | return |
261 | self.reparent_children(self.other_tree.inventory.root, self.tt.root) |
262 | @@ -940,7 +1005,7 @@ |
263 | @staticmethod |
264 | def executable(tree, file_id): |
265 | """Determine the executability of a file-id (used as a key method).""" |
266 | - if file_id not in tree: |
267 | + if not tree.has_id(file_id): |
268 | return None |
269 | if tree.kind(file_id) != "file": |
270 | return False |
271 | @@ -949,7 +1014,7 @@ |
272 | @staticmethod |
273 | def kind(tree, file_id): |
274 | """Determine the kind of a file-id (used as a key method).""" |
275 | - if file_id not in tree: |
276 | + if not tree.has_id(file_id): |
277 | return None |
278 | return tree.kind(file_id) |
279 | |
280 | @@ -1038,7 +1103,7 @@ |
281 | |
282 | def merge_names(self, file_id): |
283 | def get_entry(tree): |
284 | - if file_id in tree.inventory: |
285 | + if tree.has_id(file_id): |
286 | return tree.inventory[file_id] |
287 | else: |
288 | return None |
289 | @@ -1107,18 +1172,6 @@ |
290 | contents = None |
291 | return kind, contents |
292 | |
293 | - def contents_conflict(): |
294 | - trans_id = self.tt.trans_id_file_id(file_id) |
295 | - name = self.tt.final_name(trans_id) |
296 | - parent_id = self.tt.final_parent(trans_id) |
297 | - if file_id in self.this_tree.inventory: |
298 | - self.tt.unversion_file(trans_id) |
299 | - if file_id in self.this_tree: |
300 | - self.tt.delete_contents(trans_id) |
301 | - file_group = self._dump_conflicts(name, parent_id, file_id, |
302 | - set_version=True) |
303 | - self._raw_conflicts.append(('contents conflict', file_group)) |
304 | - |
305 | # See SPOT run. run, SPOT, run. |
306 | # So we're not QUITE repeating ourselves; we do tricky things with |
307 | # file kind... |
308 | @@ -1140,63 +1193,112 @@ |
309 | if winner == 'this': |
310 | # No interesting changes introduced by OTHER |
311 | return "unmodified" |
312 | + # We have a hypothetical conflict, but if we have files, then we |
313 | + # can try to merge the content |
314 | trans_id = self.tt.trans_id_file_id(file_id) |
315 | - if winner == 'other': |
316 | + params = MergeHookParams(self, file_id, trans_id, this_pair[0], |
317 | + other_pair[0], winner) |
318 | + hooks = Merger.hooks['merge_file_content'] |
319 | + hooks = list(hooks) + [self.default_text_merge] |
320 | + hook_status = 'not_applicable' |
321 | + for hook in hooks: |
322 | + hook_status, lines = hook(params) |
323 | + if hook_status != 'not_applicable': |
324 | + # Don't try any more hooks, this one applies. |
325 | + break |
326 | + result = "modified" |
327 | + if hook_status == 'not_applicable': |
328 | + # This is a contents conflict, because none of the available |
329 | + # functions could merge it. |
330 | + result = None |
331 | + name = self.tt.final_name(trans_id) |
332 | + parent_id = self.tt.final_parent(trans_id) |
333 | + if self.this_tree.has_id(file_id): |
334 | + self.tt.unversion_file(trans_id) |
335 | + file_group = self._dump_conflicts(name, parent_id, file_id, |
336 | + set_version=True) |
337 | + self._raw_conflicts.append(('contents conflict', file_group)) |
338 | + elif hook_status == 'success': |
339 | + self.tt.create_file(lines, trans_id) |
340 | + elif hook_status == 'conflicted': |
341 | + # XXX: perhaps the hook should be able to provide |
342 | + # the BASE/THIS/OTHER files? |
343 | + self.tt.create_file(lines, trans_id) |
344 | + self._raw_conflicts.append(('text conflict', trans_id)) |
345 | + name = self.tt.final_name(trans_id) |
346 | + parent_id = self.tt.final_parent(trans_id) |
347 | + self._dump_conflicts(name, parent_id, file_id) |
348 | + elif hook_status == 'delete': |
349 | + self.tt.unversion_file(trans_id) |
350 | + result = "deleted" |
351 | + elif hook_status == 'done': |
352 | + # The hook function did whatever it needs to do directly, no |
353 | + # further action needed here. |
354 | + pass |
355 | + else: |
356 | + raise AssertionError('unknown hook_status: %r' % (hook_status,)) |
357 | + if not self.this_tree.has_id(file_id) and result == "modified": |
358 | + self.tt.version_file(file_id, trans_id) |
359 | + # The merge has been performed, so the old contents should not be |
360 | + # retained. |
361 | + try: |
362 | + self.tt.delete_contents(trans_id) |
363 | + except errors.NoSuchFile: |
364 | + pass |
365 | + return result |
366 | + |
367 | + def _default_other_winner_merge(self, merge_hook_params): |
368 | + """Replace this contents with other.""" |
369 | + file_id = merge_hook_params.file_id |
370 | + trans_id = merge_hook_params.trans_id |
371 | + file_in_this = self.this_tree.has_id(file_id) |
372 | + if self.other_tree.has_id(file_id): |
373 | + # OTHER changed the file |
374 | + wt = self.this_tree |
375 | + if wt.supports_content_filtering(): |
376 | + # We get the path from the working tree if it exists. |
377 | + # That fails though when OTHER is adding a file, so |
378 | + # we fall back to the other tree to find the path if |
379 | + # it doesn't exist locally. |
380 | + try: |
381 | + filter_tree_path = wt.id2path(file_id) |
382 | + except errors.NoSuchId: |
383 | + filter_tree_path = self.other_tree.id2path(file_id) |
384 | + else: |
385 | + # Skip the id2path lookup for older formats |
386 | + filter_tree_path = None |
387 | + transform.create_from_tree(self.tt, trans_id, |
388 | + self.other_tree, file_id, |
389 | + filter_tree_path=filter_tree_path) |
390 | + return 'done', None |
391 | + elif file_in_this: |
392 | + # OTHER deleted the file |
393 | + return 'delete', None |
394 | + else: |
395 | + raise AssertionError( |
396 | + 'winner is OTHER, but file_id %r not in THIS or OTHER tree' |
397 | + % (file_id,)) |
398 | + |
399 | + def default_text_merge(self, merge_hook_params): |
400 | + if merge_hook_params.winner == 'other': |
401 | # OTHER is a straight winner, so replace this contents with other |
402 | - file_in_this = file_id in self.this_tree |
403 | - if file_in_this: |
404 | - # Remove any existing contents |
405 | - self.tt.delete_contents(trans_id) |
406 | - if file_id in self.other_tree: |
407 | - # OTHER changed the file |
408 | - wt = self.this_tree |
409 | - if wt.supports_content_filtering(): |
410 | - # We get the path from the working tree if it exists. |
411 | - # That fails though when OTHER is adding a file, so |
412 | - # we fall back to the other tree to find the path if |
413 | - # it doesn't exist locally. |
414 | - try: |
415 | - filter_tree_path = wt.id2path(file_id) |
416 | - except errors.NoSuchId: |
417 | - filter_tree_path = self.other_tree.id2path(file_id) |
418 | - else: |
419 | - # Skip the id2path lookup for older formats |
420 | - filter_tree_path = None |
421 | - transform.create_from_tree(self.tt, trans_id, |
422 | - self.other_tree, file_id, |
423 | - filter_tree_path=filter_tree_path) |
424 | - if not file_in_this: |
425 | - self.tt.version_file(file_id, trans_id) |
426 | - return "modified" |
427 | - elif file_in_this: |
428 | - # OTHER deleted the file |
429 | - self.tt.unversion_file(trans_id) |
430 | - return "deleted" |
431 | + return self._default_other_winner_merge(merge_hook_params) |
432 | + elif merge_hook_params.is_file_merge(): |
433 | + # THIS and OTHER are both files, so text merge. Either |
434 | + # BASE is a file, or both converted to files, so at least we |
435 | + # have agreement that output should be a file. |
436 | + try: |
437 | + self.text_merge(merge_hook_params.file_id, |
438 | + merge_hook_params.trans_id) |
439 | + except errors.BinaryFile: |
440 | + return 'not_applicable', None |
441 | + return 'done', None |
442 | else: |
443 | - # We have a hypothetical conflict, but if we have files, then we |
444 | - # can try to merge the content |
445 | - if this_pair[0] == 'file' and other_pair[0] == 'file': |
446 | - # THIS and OTHER are both files, so text merge. Either |
447 | - # BASE is a file, or both converted to files, so at least we |
448 | - # have agreement that output should be a file. |
449 | - try: |
450 | - self.text_merge(file_id, trans_id) |
451 | - except errors.BinaryFile: |
452 | - return contents_conflict() |
453 | - if file_id not in self.this_tree: |
454 | - self.tt.version_file(file_id, trans_id) |
455 | - try: |
456 | - self.tt.tree_kind(trans_id) |
457 | - self.tt.delete_contents(trans_id) |
458 | - except errors.NoSuchFile: |
459 | - pass |
460 | - return "modified" |
461 | - else: |
462 | - return contents_conflict() |
463 | + return 'not_applicable', None |
464 | |
465 | def get_lines(self, tree, file_id): |
466 | """Return the lines in a file, or an empty list.""" |
467 | - if file_id in tree: |
468 | + if tree.has_id(file_id): |
469 | return tree.get_file(file_id).readlines() |
470 | else: |
471 | return [] |
472 | @@ -1205,7 +1307,7 @@ |
473 | """Perform a three-way text merge on a file_id""" |
474 | # it's possible that we got here with base as a different type. |
475 | # if so, we just want two-way text conflicts. |
476 | - if file_id in self.base_tree and \ |
477 | + if self.base_tree.has_id(file_id) and \ |
478 | self.base_tree.kind(file_id) == "file": |
479 | base_lines = self.get_lines(self.base_tree, file_id) |
480 | else: |
481 | @@ -1274,7 +1376,7 @@ |
482 | versioned = False |
483 | file_group = [] |
484 | for suffix, tree, lines in data: |
485 | - if file_id in tree: |
486 | + if tree.has_id(file_id): |
487 | trans_id = self._conflict_file(name, parent_id, tree, file_id, |
488 | suffix, lines, filter_tree_path) |
489 | file_group.append(trans_id) |
490 | @@ -1324,11 +1426,11 @@ |
491 | if winner == "this": |
492 | executability = this_executable |
493 | else: |
494 | - if file_id in self.other_tree: |
495 | + if self.other_tree.has_id(file_id): |
496 | executability = other_executable |
497 | - elif file_id in self.this_tree: |
498 | + elif self.this_tree.has_id(file_id): |
499 | executability = this_executable |
500 | - elif file_id in self.base_tree: |
501 | + elif self.base_tree_has_id(file_id): |
502 | executability = base_executable |
503 | if executability is not None: |
504 | trans_id = self.tt.trans_id_file_id(file_id) |
505 | |
506 | === added directory 'bzrlib/plugins/news_merge' |
507 | === added file 'bzrlib/plugins/news_merge/README' |
508 | --- bzrlib/plugins/news_merge/README 1970-01-01 00:00:00 +0000 |
509 | +++ bzrlib/plugins/news_merge/README 2010-01-18 11:01:23 +0000 |
510 | @@ -0,0 +1,9 @@ |
511 | +A plugin for merging bzr's NEWS file. |
512 | + |
513 | +Install as usual (e.g. symlink this directory into ~/.bazaar/plugins), and any |
514 | +file with a path of NEWS that is in bzr's NEWS format will be merged with this |
515 | +hook. |
516 | + |
517 | +This hook can resolve conflicts where both sides add entries at the same place. |
518 | +If it encounters a more difficult conflict it gives up and bzr will fallback to |
519 | +the default merge algorithm. |
520 | |
521 | === added file 'bzrlib/plugins/news_merge/__init__.py' |
522 | --- bzrlib/plugins/news_merge/__init__.py 1970-01-01 00:00:00 +0000 |
523 | +++ bzrlib/plugins/news_merge/__init__.py 2010-01-18 11:01:23 +0000 |
524 | @@ -0,0 +1,73 @@ |
525 | +# Copyright (C) 2010 Canonical Ltd |
526 | +# |
527 | +# This program is free software; you can redistribute it and/or modify |
528 | +# it under the terms of the GNU General Public License as published by |
529 | +# the Free Software Foundation; either version 2 of the License, or |
530 | +# (at your option) any later version. |
531 | +# |
532 | +# This program is distributed in the hope that it will be useful, |
533 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
534 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
535 | +# GNU General Public License for more details. |
536 | +# |
537 | +# You should have received a copy of the GNU General Public License |
538 | +# along with this program; if not, write to the Free Software |
539 | +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA |
540 | + |
541 | +"""Merge hook for bzr's NEWS file. |
542 | + |
543 | +To enable this plugin, add a section to your branch.conf or location.conf |
544 | +like:: |
545 | + |
546 | + [/home/user/code/bzr] |
547 | + news_merge_files = NEWS |
548 | + news_merge_files:policy = recurse |
549 | + |
550 | +The news_merge_files config option takes a list of file paths, separated by |
551 | +commas. |
552 | + |
553 | +Limitations: |
554 | + |
555 | +* if there's a conflict in more than just bullet points, this doesn't yet know |
556 | + how to resolve that, so bzr will fallback to the default line-based merge. |
557 | +""" |
558 | + |
559 | +# Put most of the code in a separate module that we lazy-import to keep the |
560 | +# overhead of this plugin as minimal as possible. |
561 | +from bzrlib.lazy_import import lazy_import |
562 | +lazy_import(globals(), """ |
563 | +from bzrlib.plugins.news_merge.news_merge import news_merger |
564 | +""") |
565 | + |
566 | +from bzrlib.merge import Merger |
567 | + |
568 | + |
569 | +def news_merge_hook(params): |
570 | + """Merger.merge_file_content hook function for bzr-format NEWS files.""" |
571 | + # First, check whether this custom merge logic should be used. We expect |
572 | + # most files should not be merged by this file. |
573 | + if params.winner == 'other': |
574 | + # OTHER is a straight winner, rely on default merge. |
575 | + return 'not_applicable', None |
576 | + elif not params.is_file_merge(): |
577 | + # THIS and OTHER aren't both files. |
578 | + return 'not_applicable', None |
579 | + elif not filename_matches_config(params): |
580 | + # The filename isn't listed in the 'news_merge_files' config option. |
581 | + return 'not_applicable', None |
582 | + return news_merger(params) |
583 | + |
584 | + |
585 | +def filename_matches_config(params): |
586 | + config = params.merger.this_branch.get_config() |
587 | + affected_files = config.get_user_option_as_list('news_merge_files') |
588 | + if affected_files: |
589 | + filename = params.merger.this_tree.id2path(params.file_id) |
590 | + if filename in affected_files: |
591 | + return True |
592 | + return False |
593 | + |
594 | + |
595 | +Merger.hooks.install_named_hook( |
596 | + 'merge_file_content', news_merge_hook, 'NEWS file merge') |
597 | + |
598 | |
599 | === added file 'bzrlib/plugins/news_merge/news_merge.py' |
600 | --- bzrlib/plugins/news_merge/news_merge.py 1970-01-01 00:00:00 +0000 |
601 | +++ bzrlib/plugins/news_merge/news_merge.py 2010-01-18 11:01:23 +0000 |
602 | @@ -0,0 +1,88 @@ |
603 | +# Copyright (C) 2010 Canonical Ltd |
604 | +# |
605 | +# This program is free software; you can redistribute it and/or modify |
606 | +# it under the terms of the GNU General Public License as published by |
607 | +# the Free Software Foundation; either version 2 of the License, or |
608 | +# (at your option) any later version. |
609 | +# |
610 | +# This program is distributed in the hope that it will be useful, |
611 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
612 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
613 | +# GNU General Public License for more details. |
614 | +# |
615 | +# You should have received a copy of the GNU General Public License |
616 | +# along with this program; if not, write to the Free Software |
617 | +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA |
618 | + |
619 | +"""Merge logic for news_merge plugin.""" |
620 | + |
621 | + |
622 | +from bzrlib.plugins.news_merge.parser import simple_parse |
623 | +from bzrlib import merge3 |
624 | + |
625 | + |
626 | +magic_marker = '|NEWS-MERGE-MAGIC-MARKER|' |
627 | + |
628 | + |
629 | +def news_merger(params): |
630 | + """Perform a simple 3-way merge of a bzr NEWS file. |
631 | + |
632 | + Each section of a bzr NEWS file is essentially an ordered set of bullet |
633 | + points, so we can simply take a set of bullet points, determine which |
634 | + bullets to add and which to remove, sort, and reserialize. |
635 | + """ |
636 | + # Transform the different versions of the NEWS file into a bunch of text |
637 | + # lines where each line matches one part of the overall structure, e.g. a |
638 | + # heading or bullet. |
639 | + def munge(lines): |
640 | + return list(blocks_to_fakelines(simple_parse(''.join(lines)))) |
641 | + this_lines = munge(params.this_lines) |
642 | + other_lines = munge(params.other_lines) |
643 | + base_lines = munge(params.base_lines) |
644 | + m3 = merge3.Merge3(base_lines, this_lines, other_lines) |
645 | + result_lines = [] |
646 | + for group in m3.merge_groups(): |
647 | + if group[0] == 'conflict': |
648 | + _, base, a, b = group |
649 | + # Are all the conflicting lines bullets? If so, we can merge this. |
650 | + for line_set in [base, a, b]: |
651 | + for line in line_set: |
652 | + if not line.startswith('bullet'): |
653 | + # Something else :( |
654 | + # Maybe the default merge can cope. |
655 | + return 'not_applicable', None |
656 | + # Calculate additions and deletions. |
657 | + new_in_a = set(a).difference(base) |
658 | + new_in_b = set(b).difference(base) |
659 | + all_new = new_in_a.union(new_in_b) |
660 | + deleted_in_a = set(base).difference(a) |
661 | + deleted_in_b = set(base).difference(b) |
662 | + # Combine into the final set of bullet points. |
663 | + final = all_new.difference(deleted_in_a).difference(deleted_in_b) |
664 | + # Sort, and emit. |
665 | + final = sorted(final, key=sort_key) |
666 | + result_lines.extend(final) |
667 | + else: |
668 | + result_lines.extend(group[1]) |
669 | + # Transform the merged elements back into real blocks of lines. |
670 | + return 'success', list(fakelines_to_blocks(result_lines)) |
671 | + |
672 | + |
673 | +def blocks_to_fakelines(blocks): |
674 | + for kind, text in blocks: |
675 | + yield '%s%s%s' % (kind, magic_marker, text) |
676 | + |
677 | + |
678 | +def fakelines_to_blocks(fakelines): |
679 | + fakelines = list(fakelines) |
680 | + # Strip out the magic_marker, and reinstate the \n\n between blocks |
681 | + for fakeline in fakelines[:-1]: |
682 | + yield fakeline.split(magic_marker, 1)[1] + '\n\n' |
683 | + # The final block doesn't have a trailing \n\n. |
684 | + for fakeline in fakelines[-1:]: |
685 | + yield fakeline.split(magic_marker, 1)[1] |
686 | + |
687 | + |
688 | +def sort_key(s): |
689 | + return s.replace('`', '').lower() |
690 | + |
691 | |
692 | === added file 'bzrlib/plugins/news_merge/parser.py' |
693 | --- bzrlib/plugins/news_merge/parser.py 1970-01-01 00:00:00 +0000 |
694 | +++ bzrlib/plugins/news_merge/parser.py 2010-01-18 11:01:23 +0000 |
695 | @@ -0,0 +1,62 @@ |
696 | +# Copyright (C) 2010 Canonical Ltd |
697 | +# |
698 | +# This program is free software; you can redistribute it and/or modify |
699 | +# it under the terms of the GNU General Public License as published by |
700 | +# the Free Software Foundation; either version 2 of the License, or |
701 | +# (at your option) any later version. |
702 | +# |
703 | +# This program is distributed in the hope that it will be useful, |
704 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
705 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
706 | +# GNU General Public License for more details. |
707 | +# |
708 | +# You should have received a copy of the GNU General Public License |
709 | +# along with this program; if not, write to the Free Software |
710 | +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA |
711 | + |
712 | +"""Simple parser for bzr's NEWS file. |
713 | + |
714 | +Simple as this is, it's a bit over-powered for news_merge's needs, which only |
715 | +cares about 'bullet' and 'everything else'. |
716 | + |
717 | +This module can be run as a standalone Python program; pass it a filename and |
718 | +it will print the parsed form of a file (a series of 2-tuples, see |
719 | +simple_parse's docstring). |
720 | +""" |
721 | + |
722 | + |
723 | +def simple_parse(content): |
724 | + """Returns blocks, where each block is a 2-tuple (kind, text). |
725 | + |
726 | + :kind: one of 'heading', 'release', 'section', 'empty' or 'text'. |
727 | + :text: a str, including newlines. |
728 | + """ |
729 | + blocks = content.split('\n\n') |
730 | + for block in blocks: |
731 | + if block.startswith('###'): |
732 | + # First line is ###...: Top heading |
733 | + yield 'heading', block |
734 | + continue |
735 | + last_line = block.rsplit('\n', 1)[-1] |
736 | + if last_line.startswith('###'): |
737 | + # last line is ###...: 2nd-level heading |
738 | + yield 'release', block |
739 | + elif last_line.startswith('***'): |
740 | + # last line is ***...: 3rd-level heading |
741 | + yield 'section', block |
742 | + elif block.startswith('* '): |
743 | + # bullet |
744 | + yield 'bullet', block |
745 | + elif block.strip() == '': |
746 | + # empty |
747 | + yield 'empty', block |
748 | + else: |
749 | + # plain text |
750 | + yield 'text', block |
751 | + |
752 | + |
753 | +if __name__ == '__main__': |
754 | + import sys |
755 | + content = open(sys.argv[1], 'rb').read() |
756 | + for result in simple_parse(content): |
757 | + print result |
758 | |
759 | === modified file 'bzrlib/tests/__init__.py' |
760 | --- bzrlib/tests/__init__.py 2010-01-17 20:06:41 +0000 |
761 | +++ bzrlib/tests/__init__.py 2010-01-18 11:01:23 +0000 |
762 | @@ -3752,6 +3752,7 @@ |
763 | return [ |
764 | 'bzrlib', |
765 | 'bzrlib.branchbuilder', |
766 | + 'bzrlib.decorators', |
767 | 'bzrlib.export', |
768 | 'bzrlib.inventory', |
769 | 'bzrlib.iterablefile', |
770 | |
771 | === modified file 'bzrlib/tests/per_merger.py' |
772 | --- bzrlib/tests/per_merger.py 2009-12-18 06:54:20 +0000 |
773 | +++ bzrlib/tests/per_merger.py 2010-01-18 11:01:23 +0000 |
774 | @@ -18,6 +18,7 @@ |
775 | |
776 | import os |
777 | |
778 | +from bzrlib.conflicts import TextConflict |
779 | from bzrlib import ( |
780 | errors, |
781 | merge as _mod_merge, |
782 | @@ -28,6 +29,7 @@ |
783 | multiply_tests, |
784 | TestCaseWithTransport, |
785 | ) |
786 | +from bzrlib.tests.test_merge_core import MergeBuilder |
787 | from bzrlib.transform import TreeTransform |
788 | |
789 | |
790 | @@ -188,4 +190,171 @@ |
791 | self.assertRaises(errors.LockError, wt.unlock) |
792 | |
793 | |
794 | +class TestHookMergeFileContent(TestCaseWithTransport): |
795 | + """Tests that the 'merge_file_content' hook is invoked.""" |
796 | + |
797 | + def setUp(self): |
798 | + TestCaseWithTransport.setUp(self) |
799 | + self.hook_log = [] |
800 | + |
801 | + def install_hook_noop(self): |
802 | + def hook_na(merge_params): |
803 | + # This hook unconditionally does nothing. |
804 | + self.hook_log.append(('no-op',)) |
805 | + return 'not_applicable', None |
806 | + _mod_merge.Merger.hooks.install_named_hook( |
807 | + 'merge_file_content', hook_na, 'test hook (no-op)') |
808 | + |
809 | + def install_hook_success(self): |
810 | + def hook_success(merge_params): |
811 | + self.hook_log.append(('success',)) |
812 | + if merge_params.file_id == '1': |
813 | + return 'success', ['text-merged-by-hook'] |
814 | + return 'not_applicable', None |
815 | + _mod_merge.Merger.hooks.install_named_hook( |
816 | + 'merge_file_content', hook_success, 'test hook (success)') |
817 | + |
818 | + def install_hook_conflict(self): |
819 | + def hook_conflict(merge_params): |
820 | + self.hook_log.append(('conflict',)) |
821 | + if merge_params.file_id == '1': |
822 | + return 'conflicted', ['text-with-conflict-markers-from-hook'] |
823 | + return 'not_applicable', None |
824 | + _mod_merge.Merger.hooks.install_named_hook( |
825 | + 'merge_file_content', hook_conflict, 'test hook (delete)') |
826 | + |
827 | + def install_hook_delete(self): |
828 | + def hook_delete(merge_params): |
829 | + self.hook_log.append(('delete',)) |
830 | + if merge_params.file_id == '1': |
831 | + return 'delete', None |
832 | + return 'not_applicable', None |
833 | + _mod_merge.Merger.hooks.install_named_hook( |
834 | + 'merge_file_content', hook_delete, 'test hook (delete)') |
835 | + |
836 | + def install_hook_log_lines(self): |
837 | + """Install a hook that saves the get_lines for the this, base and other |
838 | + versions of the file. |
839 | + """ |
840 | + def hook_log_lines(merge_params): |
841 | + self.hook_log.append(( |
842 | + 'log_lines', |
843 | + merge_params.this_lines, |
844 | + merge_params.other_lines, |
845 | + merge_params.base_lines, |
846 | + )) |
847 | + return 'not_applicable', None |
848 | + _mod_merge.Merger.hooks.install_named_hook( |
849 | + 'merge_file_content', hook_log_lines, 'test hook (log_lines)') |
850 | + |
851 | + def make_merge_builder(self): |
852 | + builder = MergeBuilder(self.test_base_dir) |
853 | + self.addCleanup(builder.cleanup) |
854 | + return builder |
855 | + |
856 | + def create_file_needing_contents_merge(self, builder, file_id): |
857 | + builder.add_file(file_id, builder.tree_root, "name1", "text1", True) |
858 | + builder.change_contents(file_id, other="text4", this="text3") |
859 | + |
860 | + def test_change_vs_change(self): |
861 | + """Hook is used for (changed, changed)""" |
862 | + self.install_hook_success() |
863 | + builder = self.make_merge_builder() |
864 | + builder.add_file("1", builder.tree_root, "name1", "text1", True) |
865 | + builder.change_contents("1", other="text4", this="text3") |
866 | + conflicts = builder.merge(self.merge_type) |
867 | + self.assertEqual(conflicts, []) |
868 | + self.assertEqual( |
869 | + builder.this.get_file('1').read(), 'text-merged-by-hook') |
870 | + |
871 | + def test_change_vs_deleted(self): |
872 | + """Hook is used for (changed, deleted)""" |
873 | + self.install_hook_success() |
874 | + builder = self.make_merge_builder() |
875 | + builder.add_file("1", builder.tree_root, "name1", "text1", True) |
876 | + builder.change_contents("1", this="text2") |
877 | + builder.remove_file("1", other=True) |
878 | + conflicts = builder.merge(self.merge_type) |
879 | + self.assertEqual(conflicts, []) |
880 | + self.assertEqual( |
881 | + builder.this.get_file('1').read(), 'text-merged-by-hook') |
882 | + |
883 | + def test_result_can_be_delete(self): |
884 | + """A hook's result can be the deletion of a file.""" |
885 | + self.install_hook_delete() |
886 | + builder = self.make_merge_builder() |
887 | + self.create_file_needing_contents_merge(builder, "1") |
888 | + conflicts = builder.merge(self.merge_type) |
889 | + self.assertEqual(conflicts, []) |
890 | + self.assertRaises(errors.NoSuchId, builder.this.id2path, '1') |
891 | + self.assertEqual([], list(builder.this.list_files())) |
892 | + |
893 | + def test_result_can_be_conflict(self): |
894 | + """A hook's result can be a conflict.""" |
895 | + self.install_hook_conflict() |
896 | + builder = self.make_merge_builder() |
897 | + self.create_file_needing_contents_merge(builder, "1") |
898 | + conflicts = builder.merge(self.merge_type) |
899 | + self.assertEqual(conflicts, [TextConflict('name1', file_id='1')]) |
900 | + # The hook still gets to set the file contents in this case, so that it |
901 | + # can insert custom conflict markers. |
902 | + self.assertEqual( |
903 | + builder.this.get_file('1').read(), |
904 | + 'text-with-conflict-markers-from-hook') |
905 | + |
906 | + def test_can_access_this_other_and_base_versions(self): |
907 | + """The hook function can call params.merger.get_lines to access the |
908 | + THIS/OTHER/BASE versions of the file. |
909 | + """ |
910 | + self.install_hook_log_lines() |
911 | + builder = self.make_merge_builder() |
912 | + builder.add_file("1", builder.tree_root, "name1", "text1", True) |
913 | + builder.change_contents("1", this="text2", other="text3") |
914 | + conflicts = builder.merge(self.merge_type) |
915 | + self.assertEqual( |
916 | + [('log_lines', ['text2'], ['text3'], ['text1'])], self.hook_log) |
917 | + |
918 | + def test_chain_when_not_applicable(self): |
919 | + """When a hook function returns not_applicable, the next function is |
920 | + tried (when one exists). |
921 | + """ |
922 | + self.install_hook_noop() |
923 | + self.install_hook_success() |
924 | + builder = self.make_merge_builder() |
925 | + self.create_file_needing_contents_merge(builder, "1") |
926 | + conflicts = builder.merge(self.merge_type) |
927 | + self.assertEqual(conflicts, []) |
928 | + self.assertEqual( |
929 | + builder.this.get_file('1').read(), 'text-merged-by-hook') |
930 | + self.assertEqual([('no-op',), ('success',)], self.hook_log) |
931 | + |
932 | + def test_chain_stops_after_success(self): |
933 | + """When a hook function returns success, no later functions are tried. |
934 | + """ |
935 | + self.install_hook_success() |
936 | + self.install_hook_noop() |
937 | + builder = self.make_merge_builder() |
938 | + self.create_file_needing_contents_merge(builder, "1") |
939 | + conflicts = builder.merge(self.merge_type) |
940 | + self.assertEqual([('success',)], self.hook_log) |
941 | + |
942 | + def test_chain_stops_after_conflict(self): |
943 | + """When a hook function returns conflict, no later functions are tried. |
944 | + """ |
945 | + self.install_hook_conflict() |
946 | + self.install_hook_noop() |
947 | + builder = self.make_merge_builder() |
948 | + self.create_file_needing_contents_merge(builder, "1") |
949 | + conflicts = builder.merge(self.merge_type) |
950 | + self.assertEqual([('conflict',)], self.hook_log) |
951 | + |
952 | + def test_chain_stops_after_delete(self): |
953 | + """When a hook function returns delete, no later functions are tried. |
954 | + """ |
955 | + self.install_hook_delete() |
956 | + self.install_hook_noop() |
957 | + builder = self.make_merge_builder() |
958 | + self.create_file_needing_contents_merge(builder, "1") |
959 | + conflicts = builder.merge(self.merge_type) |
960 | + self.assertEqual([('delete',)], self.hook_log) |
961 |
This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
The tests hopefully make the capabilities fairly clear. In short:
* it is invoked for all changes where one side has changed the content, and the other side has changed the content or deleted the file
* hook functions are passed the merger, file_id and trans_id (so can get access to filenames and file contents in the various this/other/base trees)
* hook functions can explicitly do nothing (allowing the next registered function to try), or return a conflict, a successful merge, or a deletion of the file.
* hook functions can set the contents of a file in both the success and conflict case
Some open questions (could be addressed in future patches perhaps):
* should merge of a file rename vs. a text change use this hook point?
* should a hook function be able to have some control over the emission of THIS/BASE/OTHER files when it results in a conflict, or over their contents of those files?
* what helpers can we provide to help people write hook functions?
* what more can we do to help multiple hook registrations co-exist peacefully? e.g. These functions already chain, but if multiple plugins register with this hook the ordering is not defined.
I think ideally this hook point would be marked experimental until we get some practical experience proving that it works well in practice (e.g. that someone can write a plugin to merge NEWS files), but that's what beta series are for ;)