Merge lp:~spiv/bzr/per-file-merge-hook-491711 into lp:bzr
- per-file-merge-hook-491711
- Merge into bzr.dev
Status: | Superseded | ||||
---|---|---|---|---|---|
Proposed branch: | lp:~spiv/bzr/per-file-merge-hook-491711 | ||||
Merge into: | lp:bzr | ||||
Prerequisite: | lp:~spiv/bzr/merge-minor-refactoring | ||||
Diff against target: |
639 lines (+482/-62) 7 files modified
bzrlib/decorators.py (+16/-0) bzrlib/hooks.py (+2/-0) bzrlib/merge.py (+163/-62) bzrlib/tests/per_merger.py (+169/-0) contrib/news_merge/README (+9/-0) contrib/news_merge/__init__.py (+90/-0) contrib/news_merge/parser.py (+33/-0) |
||||
To merge this branch: | bzr merge lp:~spiv/bzr/per-file-merge-hook-491711 | ||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
John A Meinel | Needs Information | ||
Review via email: mp+16327@code.launchpad.net |
This proposal has been superseded by a proposal from 2010-01-13.
Commit message
Description of the change
Andrew Bennetts (spiv) wrote : | # |
Robert Collins (lifeless) wrote : | # |
On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
>
> * it is invoked for all changes where one side has changed the
> content, and the other side has changed the content or deleted the fil
Why not 'all changes' ? I don't see why we should prevent hooks merging
even one-side changes in a more sophisticated manner.
> * should a hook function be able to have some control over the
> emission of THIS/BASE/OTHER files when it results in a conflict, or
> over their contents of those files?
I think yes, but that can come later.
> * what helpers can we provide to help people write hook functions?
> * what more can we do to help multiple hook registrations co-exist
> peacefully? e.g. These functions already chain, but if multiple
> plugins register with this hook the ordering is not defined.
I don't think the ordering matters, except in the case that someone is
writing a new 'fallback' hook, to replace our three-way merge attempt.
Perhaps put a priority into the hook or something? I'd wait for signs we
need it.
> I think ideally this hook point would be marked experimental until we
> get some practical experience proving that it works well in practice
> (e.g. that someone can write a plugin to merge NEWS files), but that's
> what beta series are for ;)
You can just mark it experimental in the docs ;)
-Rob
James Westby (james-w) wrote : | # |
On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
Yay, thanks for working on this.
> The tests hopefully make the capabilities fairly clear. In short:
>
> * hook functions are passed the merger, file_id and trans_id (so can
> get access to filenames and file contents in the various
> this/other/base trees)
Is there any other information that is readily available?
I am a little concerned about the performance issues. bzr-builddeb is
going to be shipping one of these hooks, and I don't want that to
destroy the performance of merge.
At minimum on every conflict it has to look up the path names. If
we do as Rob suggests and call the hook on every modification that
is going to be O(tree) lookups.
Some way to short-circuit this might be good. Either passing in the
paths if they are already known (and are likely to be known even
with refactoring), or even allowing the hook to declare what paths
it is interested in (though that could be limiting).
(e.g. if path_filter is not None and basename == path_filter:)
Have you written a very simple implementation of e.g. a NEWS merger
and measured the impact?
> Some open questions (could be addressed in future patches perhaps):
>
> * should merge of a file rename vs. a text change use this hook
> point?
I imagine we would want to make it possible.
Some helpers would be useful for splitting out the cases, e.g. I imagine
many will just want to act on two text changes.
> * should a hook function be able to have some control over the
> emission of THIS/BASE/OTHER files when it results in a conflict, or
> over their contents of those files?
I'm not sure I see a use case for this.
> * what helpers can we provide to help people write hook functions?
Helpers for getting the paths, and other meta information.
Helpers for determining the cause of the conflict, text changes vs.
path changes etc.
> * what more can we do to help multiple hook registrations co-exist
> peacefully? e.g. These functions already chain, but if multiple
> plugins register with this hook the ordering is not defined.
I think using a known ordering would be good. Even using sorted() on
the hook names. This would allow you to place something at the end
by calling it zzfallback for instance.
Otherwise changing the API to allow a priority could be good. However,
these APIs are always tricky to deal with, as you are playing a game
of rock-paper-scissors with and unbounded number of choices.
I bring it up now, as it may require API changes in the registration
to do this.
Thanks,
James
Aaron Bentley (abentley) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Robert Collins wrote:
> On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
>> * it is invoked for all changes where one side has changed the
>> content, and the other side has changed the content or deleted the fil
>
> Why not 'all changes' ? I don't see why we should prevent hooks merging
> even one-side changes in a more sophisticated manner.
That would require us to pay attention to THIS-only changes, and that
would be a performance regression.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
XUUAni6xZaHfi47
=uew/
-----END PGP SIGNATURE-----
Aaron Bentley (abentley) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
James Westby wrote:
> On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
>> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
>> * should a hook function be able to have some control over the
>> emission of THIS/BASE/OTHER files when it results in a conflict, or
>> over their contents of those files?
>
> I'm not sure I see a use case for this.
merge3 and weave merge emit different things as their BASE, so if we
hooked merge to force a weave merge for certain files, we might want to
also use that style of BASE.
Can't see any use for controlling THIS or OTHER, though.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
3EgAn08cMSUKDne
=QYTu
-----END PGP SIGNATURE-----
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Robert Collins wrote:
> On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
>> * it is invoked for all changes where one side has changed the
>> content, and the other side has changed the content or deleted the fil
>
> Why not 'all changes' ? I don't see why we should prevent hooks merging
> even one-side changes in a more sophisticated manner.
I assume this is because the merge code itself splits these cases. Files
with clear winners don't get passed into the Merger code. Also, there is
bidirectional issues.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
9fsAoMnNoCWak28
=lRgj
-----END PGP SIGNATURE-----
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
...
>
> Is there any other information that is readily available?
>
> I am a little concerned about the performance issues. bzr-builddeb is
> going to be shipping one of these hooks, and I don't want that to
> destroy the performance of merge.
>
> At minimum on every conflict it has to look up the path names. If
> we do as Rob suggests and call the hook on every modification that
> is going to be O(tree) lookups.
Still O(changes), I don't see how you get O(tree).
>
> Some way to short-circuit this might be good. Either passing in the
> paths if they are already known (and are likely to be known even
> with refactoring), or even allowing the hook to declare what paths
> it is interested in (though that could be limiting).
I think passing in the "path_in_this_tree, path_in_other_tree, and
path_in_final_tree" as determined by the current merge logic would be
reasonable.
>
> (e.g. if path_filter is not None and basename == path_filter:)
>
> Have you written a very simple implementation of e.g. a NEWS merger
> and measured the impact?
>
>> Some open questions (could be addressed in future patches perhaps):
>>
>> * should merge of a file rename vs. a text change use this hook
>> point?
>
> I imagine we would want to make it possible.
>
> Some helpers would be useful for splitting out the cases, e.g. I imagine
> many will just want to act on two text changes.
>
>> * should a hook function be able to have some control over the
>> emission of THIS/BASE/OTHER files when it results in a conflict, or
>> over their contents of those files?
>
> I'm not sure I see a use case for this.
>
>> * what helpers can we provide to help people write hook functions?
>
> Helpers for getting the paths, and other meta information.
>
> Helpers for determining the cause of the conflict, text changes vs.
> path changes etc.
>
>> * what more can we do to help multiple hook registrations co-exist
>> peacefully? e.g. These functions already chain, but if multiple
>> plugins register with this hook the ordering is not defined.
>
> I think using a known ordering would be good. Even using sorted() on
> the hook names. This would allow you to place something at the end
> by calling it zzfallback for instance.
>
> Otherwise changing the API to allow a priority could be good. However,
> these APIs are always tricky to deal with, as you are playing a game
> of rock-paper-scissors with and unbounded number of choices.
>
> I bring it up now, as it may require API changes in the registration
> to do this.
>
> Thanks,
>
> James
I think explicitly making them unordered forces people to think about
whether they can do what they need in an unordered fashion.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
/3oAn1TyJC5/
=Gavq
-----END PGP SIGNATURE-----
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Aaron Bentley wrote:
> James Westby wrote:
>> On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
>>> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
>>> * should a hook function be able to have some control over the
>>> emission of THIS/BASE/OTHER files when it results in a conflict, or
>>> over their contents of those files?
>> I'm not sure I see a use case for this.
>
> merge3 and weave merge emit different things as their BASE, so if we
> hooked merge to force a weave merge for certain files, we might want to
> also use that style of BASE.
>
> Can't see any use for controlling THIS or OTHER, though.
>
> Aaron
Well, potentially a whitespace ignoring merger may want to pass that
ability on to the output files?
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
6UIAn30zdkrAz6h
=omK9
-----END PGP SIGNATURE-----
Robert Collins (lifeless) wrote : | # |
On Fri, 2009-12-18 at 17:09 +0000, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Robert Collins wrote:
> > On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
> >> * it is invoked for all changes where one side has changed the
> >> content, and the other side has changed the content or deleted the fil
> >
> > Why not 'all changes' ? I don't see why we should prevent hooks merging
> > even one-side changes in a more sophisticated manner.
>
> That would require us to pay attention to THIS-only changes, and that
> would be a performance regression.
So there is a middle ground, where OTHER-only changes are examined.
There are a number of useful things one can do where OTHER-only changes
are examined (such as better merges of NEWS items by accounting for
changes after a release) which the current definition won't help with. I
see that checking for THIS-only changes would have an impact, and while
it might be nice, I don't see a need to do it at this point.
-Rob
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andrew Bennetts wrote:
> Andrew Bennetts has proposed merging lp:~spiv/bzr/per-file-merge-hook-491711 into lp:bzr with lp:~spiv/bzr/merge-minor-refactoring as a prerequisite.
>
> Requested reviews:
> bzr-core (bzr-core)
> Related bugs:
> #491711 hook to permit per file merges
> https:/
>
>
> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
>
> The tests hopefully make the capabilities fairly clear. In short:
>
> * it is invoked for all changes where one side has changed the content, and the other side has changed the content or deleted the file
> * hook functions are passed the merger, file_id and trans_id (so can get access to filenames and file contents in the various this/other/base trees)
> * hook functions can explicitly do nothing (allowing the next registered function to try), or return a conflict, a successful merge, or a deletion of the file.
> * hook functions can set the contents of a file in both the success and conflict case
>
> Some open questions (could be addressed in future patches perhaps):
>
> * should merge of a file rename vs. a text change use this hook point?
> * should a hook function be able to have some control over the emission of THIS/BASE/OTHER files when it results in a conflict, or over their contents of those files?
> * what helpers can we provide to help people write hook functions?
> * what more can we do to help multiple hook registrations co-exist peacefully? e.g. These functions already chain, but if multiple plugins register with this hook the ordering is not defined.
>
> I think ideally this hook point would be marked experimental until we get some practical experience proving that it works well in practice (e.g. that someone can write a plugin to merge NEWS files), but that's what beta series are for ;)
>
>
+ self.create_
+ "Called when file content needs to be merged (including
when one "
+ "side has deleted the file and the other has changed it)."
+ "merge_file_content is called with a "
+ "bzrlib.
tuple "
+ "of (status, lines), where status is one of 'not_applicable', "
+ "'success', 'conflicted', or 'delete'. If status is
success or "
+ "conflicted, then lines should be an iterable of the new
lines "
+ "for the file.",
+ (2, 1), None))
^- I'm wondering if we wouldn't want something smarter than a tuple of
(status, lines) being returned. It might give us a way to move to
supporting returning THIS/BASE/OTHER more directly. It may also make it
easier to handle path conflicts / updating the path as part of the hook.
I think that since the transform is being passed in (via the Merger
object), you actually can do any renaming you want. You just tell the
transform that the path for 'trans_id' should really be 'X'.
As james_w mentioned, we may want to make the api a little bit more
verbose, and have some helper functions o...
Andrew Bennetts (spiv) wrote : | # |
First, thanks everyone for the deluge of thoughtful comments! It's
great to have so much enthusiasm about a patch :)
I'm going to reply to all the comments in this one jumbo reply, rather
than try to detangle the conversation threads this proposal has spawned.
I've skipped over some comments that are repeats of others, but I think
I've replied to every point. Do let me know if I've inadvertantly
forgotten something you'd like me to consider.
Robert Collins wrote:
> On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
> >
> > * it is invoked for all changes where one side has changed the
> > content, and the other side has changed the content or deleted the fil
>
> Why not 'all changes' ? I don't see why we should prevent hooks merging
> even one-side changes in a more sophisticated manner.
Hmm. I guess because if there's a clear one-side winner then I don't
think of that as a “merge” (from the point of view of that file), and I
didn't have a use-case for it. I was also trying to be mindful of
keeping the performance cost minimal, as Aaron mentioned. You have
given a use-case in a follow-up (“better merges of NEWS items by
accounting for changes after a release”), and that would definitely be
nice to have. I'll have a play and see what it would take to make
OTHER-only changes go via the hook, as your followup suggests. It
sounds like no-one is strongly interested in THIS-only at this point, so
I'm happy to ignore that for this patch.
Also, I know it's a bit of a cop-out answer, but a plugin could always
implement a whole new Merger if the hook isn't powerful enough...
John's reply to you said:
] I assume this is because the merge code itself splits these cases. Files
] with clear winners don't get passed into the Merger code. Also, there is
] bidirectional issues.
Right, I definitely did want to keep the changes as minimal as possible
because I'm not deeply familiar with the structure of the merge code,
but I do get the sense that it would be easy to accidentally introduce
bugs if I rearranged things.
I'm not certain what you mean by bidirectional issues? I think I can
make some educated guesses, but I'd rather you explained for me :)
> > * should a hook function be able to have some control over the
> > emission of THIS/BASE/OTHER files when it results in a conflict, or
> > over their contents of those files?
>
> I think yes, but that can come later.
Cool.
James Westby wrote:
> On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
[...]
> > * hook functions are passed the merger, file_id and trans_id (so can
> > get access to filenames and file contents in the various
> > this/other/base trees)
>
> Is there any other information that is readily available?
As John says, the path in the various tree versions are fairly cheaply
available via the merger object. I think it'll be worth adding
convenience methods to get at them. I imagine some plugins might want
to just check the basename, others would like to check the full path,
some might just use the file_id.
> I am a little concerned about the performance issues. bzr-builddeb is
> going to be shipping one of these hooks, ...
James Westby (james-w) wrote : | # |
On Mon, 21 Dec 2009 02:39:14 -0000, Andrew Bennetts <email address hidden> wrote:
> I'm interested in the “on every conflict it [bzr-builddeb?] has to look
> up the path names” remark though — I don't think this hook as-is
> provides a convenient way to implement “try the regular logic, if that
> conflicts then try my custom plugin logic”. Is that something you want?
> I suspect the way to provide that nicely may be via making the regular
> logic interact via the same interfaces as the hook (i.e. receiving the
> same params object and returning the same values), and then providing a
> convenient way to call that via the hook params.
The way I imagine it working is that a hook will be registered by
bzr-builddeb that will merge debian/changelog files. The question
is to know what files to act on. We can try and parse the files
and just skip everything that doesn't parse as one of those files.
However, that will be both quite slow, plus hugely pessimistic.
Therefore I would want to short-circuit, and only bother trying
to parse files where the basename is "changelog". It would be
great if the basename was easily accesible to the hook so
that it could do this check without doing something like
trans_id-
> Yeah, that's a good point. Even many those that want to handle more than
> plain text vs text changes might still want to punt on delete vs text,
> for example. So I need to take the knowledge about what sort of changes
> are involved that's implicit in the code path now and turn that into
> some data that the hook function can inspect.
Yep, I'm mainly concerned in these comments about the API that is
presented to plugin writers. Being able simply separate out the cases
without each plugin having to cargo cult code to detect what situation
it is being called for would be good.
I think this will partially come from APIs, which I imagine are there,
I just haven't looked, and from docs. We should docuement the things
people commonly want to do so that it is obvious to plugin writers.
My concern would be that you write a hook that assumes you are always
dealing with text vs. text, and then it crashes when you get text vs.
delete. Making it clear that you can detect the cases at least points
you towards considering which you want to handle, even if the API
doesn't force you to think about them all.
> Well what sort of priority system do you have in mind that would be
> better than ordering by name?
Nothing in particular. I was just making a plea for a defined order.
If e.g. the order was whatever dict.keys() returns then there is no
way to get your hook to run last if you need to.
Thanks,
James
Robert Collins (lifeless) wrote : | # |
On Mon, 2009-12-21 at 02:57 +0000, James Westby wrote:
> Nothing in particular. I was just making a plea for a defined order.
> If e.g. the order was whatever dict.keys() returns then there is no
> way to get your hook to run last if you need to.
You don't with names either.
'Run last' is very very hard to pin down.
'Topological order' is a bit easier, 'numeric order' is very simple.
Hooks don't have names so much as descriptions, I strongly urge /not/ to
use the labels as a sort order; ugh. We don't have orders elsewhere in
hook land and we haven't needed to - even though this discussion happens
every time.
Re: merging changelog, I'd lookup the full path and compare to
'debian/changelog' and 'changelog' - its simple and should be fast.
Andrew: are kind changes able to invoke this hook? If not please make
that clear.
I think passing in the Merger is fine; we pass in Branch in other hooks
and trust folk to use public bits and not make a mess. If someone does
fiddle, the conflict resolution phases will detect that pretty robustly
I think.
-Rob
Andrew Bennetts (spiv) wrote : | # |
Robert Collins wrote:
> On Mon, 2009-12-21 at 02:57 +0000, James Westby wrote:
>
> > Nothing in particular. I was just making a plea for a defined order.
> > If e.g. the order was whatever dict.keys() returns then there is no
> > way to get your hook to run last if you need to.
>
> You don't with names either.
>
> 'Run last' is very very hard to pin down.
>
> 'Topological order' is a bit easier, 'numeric order' is very simple.
>
> Hooks don't have names so much as descriptions, I strongly urge /not/ to
> use the labels as a sort order; ugh. We don't have orders elsewhere in
> hook land and we haven't needed to - even though this discussion happens
> every time.
Yeah, I don't much like punning that. I'd like to make the order
defined and stable somehow rather than dependent on the somewhat
unstable order that plugins will get loaded, though.
> Re: merging changelog, I'd lookup the full path and compare to
> 'debian/changelog' and 'changelog' - its simple and should be fast.
I definitely think that's what I'd try to start with. Maybe that will
be a significant performance hit in which case we can rethink it, but
let's not prematurely optimise.
> Andrew: are kind changes able to invoke this hook? If not please make
> that clear.
I don't know off the top of my head. I'll find out and make it
explicit. My guess is no, but I'm not very confident about that.
> I think passing in the Merger is fine; we pass in Branch in other hooks
> and trust folk to use public bits and not make a mess. If someone does
> fiddle, the conflict resolution phases will detect that pretty robustly
> I think.
I'm not worried about people accessing non-public bits (they should
already know that _foo attributes are off-limits unless they're ready to
deal with the likely breakage), apart from ensuring that the public bits
are complete enough for hooks to do what they need.
However, I think it's important to decide if we pass the Merger with the
intent that the hook will use it only read-only (to retrieve file names,
file contents, maybe details about other things in the tree, etc), or if
we actually want to support the hook mutating things via the Merger. It
seems the answer is the latter, which is fine, but it's worth being
clear about that. That answer suggests that my original proposal that
the return value from the hook makes changes isn't quite right when the
hook can just do change(s) directly using the Merger.
-Andrew.
Robert Collins (lifeless) wrote : | # |
I'd suggest one of:
return value indicates 'handled' or 'not handled'
return value is parameter to pass to next hook or None-to-stop
and let changes that require calling fn's on Merger be done directly.
-Rob
Preview Diff
1 | === modified file 'bzrlib/decorators.py' |
2 | --- bzrlib/decorators.py 2009-10-15 02:19:43 +0000 |
3 | +++ bzrlib/decorators.py 2010-01-13 07:31:30 +0000 |
4 | @@ -253,3 +253,19 @@ |
5 | global needs_read_lock, needs_write_lock |
6 | needs_read_lock = _pretty_needs_read_lock |
7 | needs_write_lock = _pretty_needs_write_lock |
8 | + |
9 | + |
10 | +def cachedproperty(getter_func): |
11 | + """Like property(getter_func), but caches the value. |
12 | + |
13 | + Note: there's no way to clear or reset the cache. |
14 | + """ |
15 | + memo = [] |
16 | + def f(self): |
17 | + if not memo: |
18 | + memo.append(getter_func(self)) |
19 | + return memo[0] |
20 | + f.__doc__ = getter_func.__doc__ |
21 | + getter_func.__name__ = getter_func.__name__ |
22 | + return property(f) |
23 | + |
24 | |
25 | === modified file 'bzrlib/hooks.py' |
26 | --- bzrlib/hooks.py 2010-01-03 03:49:07 +0000 |
27 | +++ bzrlib/hooks.py 2010-01-13 07:31:30 +0000 |
28 | @@ -45,6 +45,8 @@ |
29 | 'bzrlib.info', 'InfoHooks') |
30 | known_hooks.register_lazy(('bzrlib.lock', 'Lock.hooks'), 'bzrlib.lock', |
31 | 'LockHooks') |
32 | +known_hooks.register_lazy(('bzrlib.merge', 'Merger.hooks'), 'bzrlib.merge', |
33 | + 'MergeHooks') |
34 | known_hooks.register_lazy(('bzrlib.msgeditor', 'hooks'), 'bzrlib.msgeditor', |
35 | 'MessageEditorHooks') |
36 | known_hooks.register_lazy(('bzrlib.mutabletree', 'MutableTree.hooks'), |
37 | |
38 | === modified file 'bzrlib/merge.py' |
39 | --- bzrlib/merge.py 2009-12-18 08:22:42 +0000 |
40 | +++ bzrlib/merge.py 2010-01-13 07:31:30 +0000 |
41 | @@ -19,8 +19,10 @@ |
42 | branch as _mod_branch, |
43 | conflicts as _mod_conflicts, |
44 | debug, |
45 | + decorators, |
46 | errors, |
47 | graph as _mod_graph, |
48 | + hooks, |
49 | merge3, |
50 | osutils, |
51 | patiencediff, |
52 | @@ -50,7 +52,69 @@ |
53 | from_tree.unlock() |
54 | |
55 | |
56 | +class MergeHooks(hooks.Hooks): |
57 | + |
58 | + def __init__(self): |
59 | + hooks.Hooks.__init__(self) |
60 | + self.create_hook(hooks.HookPoint('merge_file_content', |
61 | + "Called when file content needs to be merged (including when one " |
62 | + "side has deleted the file and the other has changed it)." |
63 | + "merge_file_content is called with a " |
64 | + "bzrlib.merge.MergeHookParams. The function should return a tuple " |
65 | + "of (status, lines), where status is one of 'not_applicable', " |
66 | + "'success', 'conflicted', or 'delete'. If status is success or " |
67 | + "conflicted, then lines should be an iterable of the new lines " |
68 | + "for the file.", |
69 | + (2, 1), None)) |
70 | + |
71 | + |
72 | +class MergeHookParams(object): |
73 | + """Object holding parameters passed to merge_file_content hooks. |
74 | + |
75 | + There are 3 fields hooks can access: |
76 | + |
77 | + :ivar merger: the Merger object |
78 | + :ivar file_id: the file ID of the file being merged |
79 | + :ivar trans_id: the transform ID for the merge of this file. |
80 | + |
81 | + The lines of versions of the file being merged can be retrieved from the |
82 | + merger, e.g.:: |
83 | + |
84 | + params.merger.get_lines(params.merger.this_tree, params.file_id) |
85 | + """ |
86 | + |
87 | + def __init__(self, merger, file_id, trans_id, this_pair, other_pair, |
88 | + winner): |
89 | + self.merger = merger |
90 | + self.file_id = file_id |
91 | + self.trans_id = trans_id |
92 | + self.this_pair = this_pair |
93 | + self.other_pair = other_pair |
94 | + self.winner = winner |
95 | + |
96 | + def is_file_merge(self): |
97 | + return self.this_pair[0] == 'file' and self.other_pair[0] == 'file' |
98 | + |
99 | + @decorators.cachedproperty |
100 | + def base_lines(self): |
101 | + """The lines of the 'base' version of the file.""" |
102 | + return self.merger.get_lines(self.merger.base_tree, self.file_id) |
103 | + |
104 | + @decorators.cachedproperty |
105 | + def this_lines(self): |
106 | + """The lines of the 'this' version of the file.""" |
107 | + return self.merger.get_lines(self.merger.this_tree, self.file_id) |
108 | + |
109 | + @decorators.cachedproperty |
110 | + def other_lines(self): |
111 | + """The lines of the 'other' version of the file.""" |
112 | + return self.merger.get_lines(self.merger.other_tree, self.file_id) |
113 | + |
114 | + |
115 | class Merger(object): |
116 | + |
117 | + hooks = MergeHooks() |
118 | + |
119 | def __init__(self, this_branch, other_tree=None, base_tree=None, |
120 | this_tree=None, pb=None, change_reporter=None, |
121 | recurse='down', revision_graph=None): |
122 | @@ -1107,18 +1171,6 @@ |
123 | contents = None |
124 | return kind, contents |
125 | |
126 | - def contents_conflict(): |
127 | - trans_id = self.tt.trans_id_file_id(file_id) |
128 | - name = self.tt.final_name(trans_id) |
129 | - parent_id = self.tt.final_parent(trans_id) |
130 | - if file_id in self.this_tree.inventory: |
131 | - self.tt.unversion_file(trans_id) |
132 | - if file_id in self.this_tree: |
133 | - self.tt.delete_contents(trans_id) |
134 | - file_group = self._dump_conflicts(name, parent_id, file_id, |
135 | - set_version=True) |
136 | - self._raw_conflicts.append(('contents conflict', file_group)) |
137 | - |
138 | # See SPOT run. run, SPOT, run. |
139 | # So we're not QUITE repeating ourselves; we do tricky things with |
140 | # file kind... |
141 | @@ -1140,59 +1192,108 @@ |
142 | if winner == 'this': |
143 | # No interesting changes introduced by OTHER |
144 | return "unmodified" |
145 | + # We have a hypothetical conflict, but if we have files, then we |
146 | + # can try to merge the content |
147 | trans_id = self.tt.trans_id_file_id(file_id) |
148 | - if winner == 'other': |
149 | + params = MergeHookParams(self, file_id, trans_id, this_pair, |
150 | + other_pair, winner) |
151 | + hooks = Merger.hooks['merge_file_content'] |
152 | + hooks = list(hooks) + [self.default_text_merge] |
153 | + hook_status = 'not_applicable' |
154 | + for hook in hooks: |
155 | + hook_status, lines = hook(params) |
156 | + if hook_status != 'not_applicable': |
157 | + # Don't try any more hooks, this one applies. |
158 | + break |
159 | + result = "modified" |
160 | + if hook_status == 'not_applicable': |
161 | + # This is a contents conflict, because none of the available |
162 | + # functions could merge it. |
163 | + result = None |
164 | + name = self.tt.final_name(trans_id) |
165 | + parent_id = self.tt.final_parent(trans_id) |
166 | + if file_id in self.this_tree.inventory: |
167 | + self.tt.unversion_file(trans_id) |
168 | + file_group = self._dump_conflicts(name, parent_id, file_id, |
169 | + set_version=True) |
170 | + self._raw_conflicts.append(('contents conflict', file_group)) |
171 | + elif hook_status == 'success': |
172 | + self.tt.create_file(lines, trans_id) |
173 | + elif hook_status == 'conflicted': |
174 | + # XXX: perhaps the hook should be able to provide |
175 | + # the BASE/THIS/OTHER files? |
176 | + self.tt.create_file(lines, trans_id) |
177 | + self._raw_conflicts.append(('text conflict', trans_id)) |
178 | + name = self.tt.final_name(trans_id) |
179 | + parent_id = self.tt.final_parent(trans_id) |
180 | + file_group = self._dump_conflicts(name, parent_id, file_id) |
181 | + file_group.append(trans_id) |
182 | + elif hook_status == 'delete': |
183 | + self.tt.unversion_file(trans_id) |
184 | + result = "deleted" |
185 | + elif hook_status == 'done': |
186 | + # The hook function did whatever it needs to do directly, no |
187 | + # further action needed here. |
188 | + pass |
189 | + else: |
190 | + raise AssertionError('unknown hook_status: %r' % (hook_status,)) |
191 | + if file_id not in self.this_tree and result == "modified": |
192 | + self.tt.version_file(file_id, trans_id) |
193 | + try: |
194 | + self.tt.tree_kind(trans_id) |
195 | + self.tt.delete_contents(trans_id) |
196 | + except errors.NoSuchFile: |
197 | + pass |
198 | + return result |
199 | + |
200 | + def _default_other_winner_merge(self, merge_hook_params): |
201 | + """Replace this contents with other.""" |
202 | + file_id = merge_hook_params.file_id |
203 | + trans_id = merge_hook_params.trans_id |
204 | + file_in_this = file_id in self.this_tree |
205 | + if file_id in self.other_tree: |
206 | + # OTHER changed the file |
207 | + wt = self.this_tree |
208 | + if wt.supports_content_filtering(): |
209 | + # We get the path from the working tree if it exists. |
210 | + # That fails though when OTHER is adding a file, so |
211 | + # we fall back to the other tree to find the path if |
212 | + # it doesn't exist locally. |
213 | + try: |
214 | + filter_tree_path = wt.id2path(file_id) |
215 | + except errors.NoSuchId: |
216 | + filter_tree_path = self.other_tree.id2path(file_id) |
217 | + else: |
218 | + # Skip the id2path lookup for older formats |
219 | + filter_tree_path = None |
220 | + transform.create_from_tree(self.tt, trans_id, |
221 | + self.other_tree, file_id, |
222 | + filter_tree_path=filter_tree_path) |
223 | + return 'done', None |
224 | + elif file_in_this: |
225 | + # OTHER deleted the file |
226 | + return 'delete', None |
227 | + else: |
228 | + raise AssertionError( |
229 | + 'winner is OTHER, but file_id %r not in THIS or OTHER tree' |
230 | + % (file_id,)) |
231 | + |
232 | + def default_text_merge(self, merge_hook_params): |
233 | + if merge_hook_params.winner == 'other': |
234 | # OTHER is a straight winner, so replace this contents with other |
235 | - file_in_this = file_id in self.this_tree |
236 | - if file_in_this: |
237 | - # Remove any existing contents |
238 | - self.tt.delete_contents(trans_id) |
239 | - if file_id in self.other_tree: |
240 | - # OTHER changed the file |
241 | - wt = self.this_tree |
242 | - if wt.supports_content_filtering(): |
243 | - # We get the path from the working tree if it exists. |
244 | - # That fails though when OTHER is adding a file, so |
245 | - # we fall back to the other tree to find the path if |
246 | - # it doesn't exist locally. |
247 | - try: |
248 | - filter_tree_path = wt.id2path(file_id) |
249 | - except errors.NoSuchId: |
250 | - filter_tree_path = self.other_tree.id2path(file_id) |
251 | - else: |
252 | - # Skip the id2path lookup for older formats |
253 | - filter_tree_path = None |
254 | - transform.create_from_tree(self.tt, trans_id, |
255 | - self.other_tree, file_id, |
256 | - filter_tree_path=filter_tree_path) |
257 | - if not file_in_this: |
258 | - self.tt.version_file(file_id, trans_id) |
259 | - return "modified" |
260 | - elif file_in_this: |
261 | - # OTHER deleted the file |
262 | - self.tt.unversion_file(trans_id) |
263 | - return "deleted" |
264 | + return self._default_other_winner_merge(merge_hook_params) |
265 | + elif merge_hook_params.is_file_merge(): |
266 | + # THIS and OTHER are both files, so text merge. Either |
267 | + # BASE is a file, or both converted to files, so at least we |
268 | + # have agreement that output should be a file. |
269 | + try: |
270 | + self.text_merge(merge_hook_params.file_id, |
271 | + merge_hook_params.trans_id) |
272 | + except errors.BinaryFile: |
273 | + return 'not_applicable', None |
274 | + return 'done', None |
275 | else: |
276 | - # We have a hypothetical conflict, but if we have files, then we |
277 | - # can try to merge the content |
278 | - if this_pair[0] == 'file' and other_pair[0] == 'file': |
279 | - # THIS and OTHER are both files, so text merge. Either |
280 | - # BASE is a file, or both converted to files, so at least we |
281 | - # have agreement that output should be a file. |
282 | - try: |
283 | - self.text_merge(file_id, trans_id) |
284 | - except errors.BinaryFile: |
285 | - return contents_conflict() |
286 | - if file_id not in self.this_tree: |
287 | - self.tt.version_file(file_id, trans_id) |
288 | - try: |
289 | - self.tt.tree_kind(trans_id) |
290 | - self.tt.delete_contents(trans_id) |
291 | - except errors.NoSuchFile: |
292 | - pass |
293 | - return "modified" |
294 | - else: |
295 | - return contents_conflict() |
296 | + return 'not_applicable', None |
297 | |
298 | def get_lines(self, tree, file_id): |
299 | """Return the lines in a file, or an empty list.""" |
300 | |
301 | === modified file 'bzrlib/tests/per_merger.py' |
302 | --- bzrlib/tests/per_merger.py 2009-12-18 06:54:20 +0000 |
303 | +++ bzrlib/tests/per_merger.py 2010-01-13 07:31:30 +0000 |
304 | @@ -18,6 +18,7 @@ |
305 | |
306 | import os |
307 | |
308 | +from bzrlib.conflicts import TextConflict |
309 | from bzrlib import ( |
310 | errors, |
311 | merge as _mod_merge, |
312 | @@ -28,6 +29,7 @@ |
313 | multiply_tests, |
314 | TestCaseWithTransport, |
315 | ) |
316 | +from bzrlib.tests.test_merge_core import MergeBuilder |
317 | from bzrlib.transform import TreeTransform |
318 | |
319 | |
320 | @@ -188,4 +190,171 @@ |
321 | self.assertRaises(errors.LockError, wt.unlock) |
322 | |
323 | |
324 | +class TestHookMergeFileContent(TestCaseWithTransport): |
325 | + """Tests that the 'merge_file_content' hook is invoked.""" |
326 | + |
327 | + def setUp(self): |
328 | + TestCaseWithTransport.setUp(self) |
329 | + self.hook_log = [] |
330 | + |
331 | + def install_hook_noop(self): |
332 | + def hook_na(merge_params): |
333 | + # This hook unconditionally does nothing. |
334 | + self.hook_log.append(('no-op',)) |
335 | + return 'not_applicable', None |
336 | + _mod_merge.Merger.hooks.install_named_hook( |
337 | + 'merge_file_content', hook_na, 'test hook (no-op)') |
338 | + |
339 | + def install_hook_success(self): |
340 | + def hook_success(merge_params): |
341 | + self.hook_log.append(('success',)) |
342 | + if merge_params.file_id == '1': |
343 | + return 'success', ['text-merged-by-hook'] |
344 | + return 'not_applicable', None |
345 | + _mod_merge.Merger.hooks.install_named_hook( |
346 | + 'merge_file_content', hook_success, 'test hook (success)') |
347 | + |
348 | + def install_hook_conflict(self): |
349 | + def hook_conflict(merge_params): |
350 | + self.hook_log.append(('conflict',)) |
351 | + if merge_params.file_id == '1': |
352 | + return 'conflicted', ['text-with-conflict-markers-from-hook'] |
353 | + return 'not_applicable', None |
354 | + _mod_merge.Merger.hooks.install_named_hook( |
355 | + 'merge_file_content', hook_conflict, 'test hook (delete)') |
356 | + |
357 | + def install_hook_delete(self): |
358 | + def hook_delete(merge_params): |
359 | + self.hook_log.append(('delete',)) |
360 | + if merge_params.file_id == '1': |
361 | + return 'delete', None |
362 | + return 'not_applicable', None |
363 | + _mod_merge.Merger.hooks.install_named_hook( |
364 | + 'merge_file_content', hook_delete, 'test hook (delete)') |
365 | + |
366 | + def install_hook_log_lines(self): |
367 | + """Install a hook that saves the get_lines for the this, base and other |
368 | + versions of the file. |
369 | + """ |
370 | + def hook_log_lines(merge_params): |
371 | + self.hook_log.append(( |
372 | + 'log_lines', |
373 | + merge_params.this_lines, |
374 | + merge_params.other_lines, |
375 | + merge_params.base_lines, |
376 | + )) |
377 | + return 'not_applicable', None |
378 | + _mod_merge.Merger.hooks.install_named_hook( |
379 | + 'merge_file_content', hook_log_lines, 'test hook (log_lines)') |
380 | + |
381 | + def make_merge_builder(self): |
382 | + builder = MergeBuilder(self.test_base_dir) |
383 | + self.addCleanup(builder.cleanup) |
384 | + return builder |
385 | + |
386 | + def create_file_needing_contents_merge(self, builder, file_id): |
387 | + builder.add_file(file_id, builder.tree_root, "name1", "text1", True) |
388 | + builder.change_contents(file_id, other="text4", this="text3") |
389 | + |
390 | + def test_change_vs_change(self): |
391 | + """Hook is used for (changed, changed)""" |
392 | + self.install_hook_success() |
393 | + builder = self.make_merge_builder() |
394 | + builder.add_file("1", builder.tree_root, "name1", "text1", True) |
395 | + builder.change_contents("1", other="text4", this="text3") |
396 | + conflicts = builder.merge(self.merge_type) |
397 | + self.assertEqual(conflicts, []) |
398 | + self.assertEqual( |
399 | + builder.this.get_file('1').read(), 'text-merged-by-hook') |
400 | + |
401 | + def test_change_vs_deleted(self): |
402 | + """Hook is used for (changed, deleted)""" |
403 | + self.install_hook_success() |
404 | + builder = self.make_merge_builder() |
405 | + builder.add_file("1", builder.tree_root, "name1", "text1", True) |
406 | + builder.change_contents("1", this="text2") |
407 | + builder.remove_file("1", other=True) |
408 | + conflicts = builder.merge(self.merge_type) |
409 | + self.assertEqual(conflicts, []) |
410 | + self.assertEqual( |
411 | + builder.this.get_file('1').read(), 'text-merged-by-hook') |
412 | + |
413 | + def test_result_can_be_delete(self): |
414 | + """A hook's result can be the deletion of a file.""" |
415 | + self.install_hook_delete() |
416 | + builder = self.make_merge_builder() |
417 | + self.create_file_needing_contents_merge(builder, "1") |
418 | + conflicts = builder.merge(self.merge_type) |
419 | + self.assertEqual(conflicts, []) |
420 | + self.assertRaises(errors.NoSuchId, builder.this.id2path, '1') |
421 | + self.assertEqual([], list(builder.this.list_files())) |
422 | + |
423 | + def test_result_can_be_conflict(self): |
424 | + """A hook's result can be a conflict.""" |
425 | + self.install_hook_conflict() |
426 | + builder = self.make_merge_builder() |
427 | + self.create_file_needing_contents_merge(builder, "1") |
428 | + conflicts = builder.merge(self.merge_type) |
429 | + self.assertEqual(conflicts, [TextConflict('name1', file_id='1')]) |
430 | + # The hook still gets to set the file contents in this case, so that it |
431 | + # can insert custom conflict markers. |
432 | + self.assertEqual( |
433 | + builder.this.get_file('1').read(), |
434 | + 'text-with-conflict-markers-from-hook') |
435 | + |
436 | + def test_can_access_this_other_and_base_versions(self): |
437 | + """The hook function can call params.merger.get_lines to access the |
438 | + THIS/OTHER/BASE versions of the file. |
439 | + """ |
440 | + self.install_hook_log_lines() |
441 | + builder = self.make_merge_builder() |
442 | + builder.add_file("1", builder.tree_root, "name1", "text1", True) |
443 | + builder.change_contents("1", this="text2", other="text3") |
444 | + conflicts = builder.merge(self.merge_type) |
445 | + self.assertEqual( |
446 | + [('log_lines', ['text2'], ['text3'], ['text1'])], self.hook_log) |
447 | + |
448 | + def test_chain_when_not_applicable(self): |
449 | + """When a hook function returns not_applicable, the next function is |
450 | + tried (when one exists). |
451 | + """ |
452 | + self.install_hook_noop() |
453 | + self.install_hook_success() |
454 | + builder = self.make_merge_builder() |
455 | + self.create_file_needing_contents_merge(builder, "1") |
456 | + conflicts = builder.merge(self.merge_type) |
457 | + self.assertEqual(conflicts, []) |
458 | + self.assertEqual( |
459 | + builder.this.get_file('1').read(), 'text-merged-by-hook') |
460 | + self.assertEqual([('no-op',), ('success',)], self.hook_log) |
461 | + |
462 | + def test_chain_stops_after_success(self): |
463 | + """When a hook function returns success, no later functions are tried. |
464 | + """ |
465 | + self.install_hook_success() |
466 | + self.install_hook_noop() |
467 | + builder = self.make_merge_builder() |
468 | + self.create_file_needing_contents_merge(builder, "1") |
469 | + conflicts = builder.merge(self.merge_type) |
470 | + self.assertEqual([('success',)], self.hook_log) |
471 | + |
472 | + def test_chain_stops_after_conflict(self): |
473 | + """When a hook function returns conflict, no later functions are tried. |
474 | + """ |
475 | + self.install_hook_conflict() |
476 | + self.install_hook_noop() |
477 | + builder = self.make_merge_builder() |
478 | + self.create_file_needing_contents_merge(builder, "1") |
479 | + conflicts = builder.merge(self.merge_type) |
480 | + self.assertEqual([('conflict',)], self.hook_log) |
481 | + |
482 | + def test_chain_stops_after_delete(self): |
483 | + """When a hook function returns delete, no later functions are tried. |
484 | + """ |
485 | + self.install_hook_delete() |
486 | + self.install_hook_noop() |
487 | + builder = self.make_merge_builder() |
488 | + self.create_file_needing_contents_merge(builder, "1") |
489 | + conflicts = builder.merge(self.merge_type) |
490 | + self.assertEqual([('delete',)], self.hook_log) |
491 | |
492 | |
493 | === added directory 'contrib/news_merge' |
494 | === added file 'contrib/news_merge/README' |
495 | --- contrib/news_merge/README 1970-01-01 00:00:00 +0000 |
496 | +++ contrib/news_merge/README 2010-01-13 07:31:30 +0000 |
497 | @@ -0,0 +1,9 @@ |
498 | +A plugin for merging bzr's NEWS file. |
499 | + |
500 | +Install as usual (e.g. symlink this directory into ~/.bazaar/plugins), and any |
501 | +file with a path of NEWS that is in bzr's NEWS format will be merged with this |
502 | +hook. |
503 | + |
504 | +This hook can resolve conflicts where both sides add entries at the same place. |
505 | +If it encounters a more difficult conflict it gives up and bzr will fallback to |
506 | +the default merge algorithm. |
507 | |
508 | === added file 'contrib/news_merge/__init__.py' |
509 | --- contrib/news_merge/__init__.py 1970-01-01 00:00:00 +0000 |
510 | +++ contrib/news_merge/__init__.py 2010-01-13 07:31:30 +0000 |
511 | @@ -0,0 +1,90 @@ |
512 | +# Copyright (C) 2010 Canonical Ltd |
513 | +# |
514 | +# This program is free software; you can redistribute it and/or modify |
515 | +# it under the terms of the GNU General Public License as published by |
516 | +# the Free Software Foundation; either version 2 of the License, or |
517 | +# (at your option) any later version. |
518 | +# |
519 | +# This program is distributed in the hope that it will be useful, |
520 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
521 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
522 | +# GNU General Public License for more details. |
523 | +# |
524 | +# You should have received a copy of the GNU General Public License |
525 | +# along with this program; if not, write to the Free Software |
526 | +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA |
527 | + |
528 | +"""Merge hook for bzr's NEWS file. |
529 | + |
530 | +Install this as a plugin, e.g: |
531 | + |
532 | + cp contrib/news-file-merge-hook.py ~/.bazaar/plugins/news_merge.py |
533 | +""" |
534 | + |
535 | +from .parser import simple_parse |
536 | + |
537 | +from bzrlib.merge import Merger |
538 | +from bzrlib.merge3 import Merge3 |
539 | + |
540 | + |
541 | +def news_merge_hook(params): |
542 | + if params.winner == 'other': |
543 | + # OTHER is a straight winner, rely on default merge. |
544 | + return 'not_applicable', None |
545 | + elif params.is_file_merge(): |
546 | + # THIS and OTHER are both files. That's a good start. |
547 | + filename = params.merger.this_tree.id2path(params.file_id) |
548 | + if filename != 'NEWS': |
549 | + return 'not_applicable', None |
550 | + return news_merger(params) |
551 | + else: |
552 | + return 'not_applicable', None |
553 | + |
554 | +magic_marker = '|NEWS-MERGE-MAGIC-MARKER|' |
555 | + |
556 | +def blocks_to_fakelines(blocks): |
557 | + for kind, text in blocks: |
558 | + yield '%s%s%s' % (kind, magic_marker, text) |
559 | + |
560 | +def fakelines_to_lines(fakelines): |
561 | + for fakeline in fakelines: |
562 | + yield fakeline.split(magic_marker, 1)[1] + '\n' |
563 | + yield '\n' |
564 | + |
565 | +def sort_key(s): |
566 | + return s.replace('`', '').lower() |
567 | + |
568 | +def news_merger(params): |
569 | + def munge(lines): |
570 | + return list(blocks_to_fakelines(simple_parse(''.join(lines)))) |
571 | + this_lines = munge(params.this_lines) |
572 | + other_lines = munge(params.other_lines) |
573 | + base_lines = munge(params.base_lines) |
574 | + m3 = Merge3(base_lines, this_lines, other_lines) |
575 | + result_lines = [] |
576 | + for group in m3.merge_groups(): |
577 | + if group[0] == 'conflict': |
578 | + _, base, a, b = group |
579 | + # are all the conflicting lines bullets? If so, we can merge this. |
580 | + for line_set in [base, a, b]: |
581 | + for line in line_set: |
582 | + if not line.startswith('bullet'): |
583 | + # Something else :( |
584 | + # Maybe the default merge can cope. |
585 | + return 'not_applicable', None |
586 | + new_in_a = set(a).difference(base) |
587 | + new_in_b = set(b).difference(base) |
588 | + all_new = new_in_a.union(new_in_b) |
589 | + deleted_in_a = set(base).difference(a) |
590 | + deleted_in_b = set(base).difference(b) |
591 | + final = all_new.difference(deleted_in_a).difference(deleted_in_b) |
592 | + final = sorted(final, key=sort_key) |
593 | + result_lines.extend(final) |
594 | + else: |
595 | + result_lines.extend(group[1]) |
596 | + return 'success', list(fakelines_to_lines(result_lines)) |
597 | + |
598 | + |
599 | +Merger.hooks.install_named_hook( |
600 | + 'merge_file_content', news_merge_hook, 'NEWS file merge') |
601 | + |
602 | |
603 | === added file 'contrib/news_merge/parser.py' |
604 | --- contrib/news_merge/parser.py 1970-01-01 00:00:00 +0000 |
605 | +++ contrib/news_merge/parser.py 2010-01-13 07:31:30 +0000 |
606 | @@ -0,0 +1,33 @@ |
607 | + |
608 | + |
609 | +def simple_parse(content): |
610 | + """Returns blocks, where each block is a 2-tuple (kind, text).""" |
611 | + blocks = content.split('\n\n') |
612 | + for block in blocks: |
613 | + if block.startswith('###'): |
614 | + # First line is ###...: Top heading |
615 | + yield 'heading', block |
616 | + continue |
617 | + last_line = block.rsplit('\n', 1)[-1] |
618 | + if last_line.startswith('###'): |
619 | + # last line is ###...: 2nd-level heading |
620 | + yield 'release', block |
621 | + elif last_line.startswith('***'): |
622 | + # last line is ***...: 3rd-level heading |
623 | + yield 'section', block |
624 | + elif block.startswith('* '): |
625 | + # bullet |
626 | + yield 'bullet', block |
627 | + elif block.strip() == '': |
628 | + # empty |
629 | + yield 'empty', block |
630 | + else: |
631 | + # plain text |
632 | + yield 'text', block |
633 | + |
634 | + |
635 | +if __name__ == '__main__': |
636 | + import sys |
637 | + content = open(sys.argv[1], 'rb').read() |
638 | + for result in simple_parse(content): |
639 | + print result |
This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
The tests hopefully make the capabilities fairly clear. In short:
* it is invoked for all changes where one side has changed the content, and the other side has changed the content or deleted the file
* hook functions are passed the merger, file_id and trans_id (so can get access to filenames and file contents in the various this/other/base trees)
* hook functions can explicitly do nothing (allowing the next registered function to try), or return a conflict, a successful merge, or a deletion of the file.
* hook functions can set the contents of a file in both the success and conflict case
Some open questions (could be addressed in future patches perhaps):
* should merge of a file rename vs. a text change use this hook point?
* should a hook function be able to have some control over the emission of THIS/BASE/OTHER files when it results in a conflict, or over their contents of those files?
* what helpers can we provide to help people write hook functions?
* what more can we do to help multiple hook registrations co-exist peacefully? e.g. These functions already chain, but if multiple plugins register with this hook the ordering is not defined.
I think ideally this hook point would be marked experimental until we get some practical experience proving that it works well in practice (e.g. that someone can write a plugin to merge NEWS files), but that's what beta series are for ;)