Merge lp:~spiv/bzr/per-file-merge-hook-491711 into lp:bzr

Proposed by Andrew Bennetts
Status: Superseded
Proposed branch: lp:~spiv/bzr/per-file-merge-hook-491711
Merge into: lp:bzr
Prerequisite: lp:~spiv/bzr/merge-minor-refactoring
Diff against target: 639 lines (+482/-62)
7 files modified
bzrlib/decorators.py (+16/-0)
bzrlib/hooks.py (+2/-0)
bzrlib/merge.py (+163/-62)
bzrlib/tests/per_merger.py (+169/-0)
contrib/news_merge/README (+9/-0)
contrib/news_merge/__init__.py (+90/-0)
contrib/news_merge/parser.py (+33/-0)
To merge this branch: bzr merge lp:~spiv/bzr/per-file-merge-hook-491711
Reviewer Review Type Date Requested Status
John A Meinel Needs Information
Review via email: mp+16327@code.launchpad.net

This proposal has been superseded by a proposal from 2010-01-13.

To post a comment you must log in.
Revision history for this message
Andrew Bennetts (spiv) wrote :

This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).

The tests hopefully make the capabilities fairly clear. In short:

 * it is invoked for all changes where one side has changed the content, and the other side has changed the content or deleted the file
 * hook functions are passed the merger, file_id and trans_id (so can get access to filenames and file contents in the various this/other/base trees)
 * hook functions can explicitly do nothing (allowing the next registered function to try), or return a conflict, a successful merge, or a deletion of the file.
 * hook functions can set the contents of a file in both the success and conflict case

Some open questions (could be addressed in future patches perhaps):

 * should merge of a file rename vs. a text change use this hook point?
 * should a hook function be able to have some control over the emission of THIS/BASE/OTHER files when it results in a conflict, or over their contents of those files?
 * what helpers can we provide to help people write hook functions?
 * what more can we do to help multiple hook registrations co-exist peacefully? e.g. These functions already chain, but if multiple plugins register with this hook the ordering is not defined.

I think ideally this hook point would be marked experimental until we get some practical experience proving that it works well in practice (e.g. that someone can write a plugin to merge NEWS files), but that's what beta series are for ;)

Revision history for this message
Robert Collins (lifeless) wrote :

On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
>
> * it is invoked for all changes where one side has changed the
> content, and the other side has changed the content or deleted the fil

Why not 'all changes' ? I don't see why we should prevent hooks merging
even one-side changes in a more sophisticated manner.

> * should a hook function be able to have some control over the
> emission of THIS/BASE/OTHER files when it results in a conflict, or
> over their contents of those files?

I think yes, but that can come later.

> * what helpers can we provide to help people write hook functions?
> * what more can we do to help multiple hook registrations co-exist
> peacefully? e.g. These functions already chain, but if multiple
> plugins register with this hook the ordering is not defined.

I don't think the ordering matters, except in the case that someone is
writing a new 'fallback' hook, to replace our three-way merge attempt.
Perhaps put a priority into the hook or something? I'd wait for signs we
need it.

> I think ideally this hook point would be marked experimental until we
> get some practical experience proving that it works well in practice
> (e.g. that someone can write a plugin to merge NEWS files), but that's
> what beta series are for ;)

You can just mark it experimental in the docs ;)

-Rob

Revision history for this message
James Westby (james-w) wrote :

On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).

Yay, thanks for working on this.

> The tests hopefully make the capabilities fairly clear. In short:
>
> * hook functions are passed the merger, file_id and trans_id (so can
> get access to filenames and file contents in the various
> this/other/base trees)

Is there any other information that is readily available?

I am a little concerned about the performance issues. bzr-builddeb is
going to be shipping one of these hooks, and I don't want that to
destroy the performance of merge.

At minimum on every conflict it has to look up the path names. If
we do as Rob suggests and call the hook on every modification that
is going to be O(tree) lookups.

Some way to short-circuit this might be good. Either passing in the
paths if they are already known (and are likely to be known even
with refactoring), or even allowing the hook to declare what paths
it is interested in (though that could be limiting).

  (e.g. if path_filter is not None and basename == path_filter:)

Have you written a very simple implementation of e.g. a NEWS merger
and measured the impact?

> Some open questions (could be addressed in future patches perhaps):
>
> * should merge of a file rename vs. a text change use this hook
> point?

I imagine we would want to make it possible.

Some helpers would be useful for splitting out the cases, e.g. I imagine
many will just want to act on two text changes.

> * should a hook function be able to have some control over the
> emission of THIS/BASE/OTHER files when it results in a conflict, or
> over their contents of those files?

I'm not sure I see a use case for this.

> * what helpers can we provide to help people write hook functions?

Helpers for getting the paths, and other meta information.

Helpers for determining the cause of the conflict, text changes vs.
path changes etc.

> * what more can we do to help multiple hook registrations co-exist
> peacefully? e.g. These functions already chain, but if multiple
> plugins register with this hook the ordering is not defined.

I think using a known ordering would be good. Even using sorted() on
the hook names. This would allow you to place something at the end
by calling it zzfallback for instance.

Otherwise changing the API to allow a priority could be good. However,
these APIs are always tricky to deal with, as you are playing a game
of rock-paper-scissors with and unbounded number of choices.

I bring it up now, as it may require API changes in the registration
to do this.

Thanks,

James

Revision history for this message
Aaron Bentley (abentley) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
>> * it is invoked for all changes where one side has changed the
>> content, and the other side has changed the content or deleted the fil
>
> Why not 'all changes' ? I don't see why we should prevent hooks merging
> even one-side changes in a more sophisticated manner.

That would require us to pay attention to THIS-only changes, and that
would be a performance regression.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAksrtrIACgkQ0F+nu1YWqI3a6wCeNdmME7ZY/xSZ7ma6snOd30fE
XUUAni6xZaHfi47GfpkTivlXF8uFDa5w
=uew/
-----END PGP SIGNATURE-----

Revision history for this message
Aaron Bentley (abentley) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

James Westby wrote:
> On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
>> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
>> * should a hook function be able to have some control over the
>> emission of THIS/BASE/OTHER files when it results in a conflict, or
>> over their contents of those files?
>
> I'm not sure I see a use case for this.

merge3 and weave merge emit different things as their BASE, so if we
hooked merge to force a weave merge for certain files, we might want to
also use that style of BASE.

Can't see any use for controlling THIS or OTHER, though.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAksrt6oACgkQ0F+nu1YWqI26pgCdF24SIHw4I4pnYV1DpikUoPwT
3EgAn08cMSUKDneD39c0rBt4aKK+zYCh
=QYTu
-----END PGP SIGNATURE-----

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
>> * it is invoked for all changes where one side has changed the
>> content, and the other side has changed the content or deleted the fil
>
> Why not 'all changes' ? I don't see why we should prevent hooks merging
> even one-side changes in a more sophisticated manner.

I assume this is because the merge code itself splits these cases. Files
with clear winners don't get passed into the Merger code. Also, there is
bidirectional issues.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksrw+oACgkQJdeBCYSNAANPTQCeLRj5ST7fT58nGOkIJldtyNis
9fsAoMnNoCWak28nr8shdECpIyLRbxtY
=lRgj
-----END PGP SIGNATURE-----

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

...
>
> Is there any other information that is readily available?
>
> I am a little concerned about the performance issues. bzr-builddeb is
> going to be shipping one of these hooks, and I don't want that to
> destroy the performance of merge.
>
> At minimum on every conflict it has to look up the path names. If
> we do as Rob suggests and call the hook on every modification that
> is going to be O(tree) lookups.

Still O(changes), I don't see how you get O(tree).

>
> Some way to short-circuit this might be good. Either passing in the
> paths if they are already known (and are likely to be known even
> with refactoring), or even allowing the hook to declare what paths
> it is interested in (though that could be limiting).

I think passing in the "path_in_this_tree, path_in_other_tree, and
path_in_final_tree" as determined by the current merge logic would be
reasonable.

>
> (e.g. if path_filter is not None and basename == path_filter:)
>
> Have you written a very simple implementation of e.g. a NEWS merger
> and measured the impact?
>
>> Some open questions (could be addressed in future patches perhaps):
>>
>> * should merge of a file rename vs. a text change use this hook
>> point?
>
> I imagine we would want to make it possible.
>
> Some helpers would be useful for splitting out the cases, e.g. I imagine
> many will just want to act on two text changes.
>
>> * should a hook function be able to have some control over the
>> emission of THIS/BASE/OTHER files when it results in a conflict, or
>> over their contents of those files?
>
> I'm not sure I see a use case for this.
>
>> * what helpers can we provide to help people write hook functions?
>
> Helpers for getting the paths, and other meta information.
>
> Helpers for determining the cause of the conflict, text changes vs.
> path changes etc.
>
>> * what more can we do to help multiple hook registrations co-exist
>> peacefully? e.g. These functions already chain, but if multiple
>> plugins register with this hook the ordering is not defined.
>
> I think using a known ordering would be good. Even using sorted() on
> the hook names. This would allow you to place something at the end
> by calling it zzfallback for instance.
>
> Otherwise changing the API to allow a priority could be good. However,
> these APIs are always tricky to deal with, as you are playing a game
> of rock-paper-scissors with and unbounded number of choices.
>
> I bring it up now, as it may require API changes in the registration
> to do this.
>
> Thanks,
>
> James

I think explicitly making them unordered forces people to think about
whether they can do what they need in an unordered fashion.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksrxLwACgkQJdeBCYSNAANiYwCgpaTb3KqL65jqOHou/mpzrQwt
/3oAn1TyJC5/tS1+Xc0+3//PRxoai0lI
=Gavq
-----END PGP SIGNATURE-----

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> James Westby wrote:
>> On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
>>> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
>>> * should a hook function be able to have some control over the
>>> emission of THIS/BASE/OTHER files when it results in a conflict, or
>>> over their contents of those files?
>> I'm not sure I see a use case for this.
>
> merge3 and weave merge emit different things as their BASE, so if we
> hooked merge to force a weave merge for certain files, we might want to
> also use that style of BASE.
>
> Can't see any use for controlling THIS or OTHER, though.
>
> Aaron

Well, potentially a whitespace ignoring merger may want to pass that
ability on to the output files?

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksrxN0ACgkQJdeBCYSNAAMJ9gCgxngkhtNINZPw7iNuWPe8JUs3
6UIAn30zdkrAz6hhN2PoLlhgxsfy0vCm
=omK9
-----END PGP SIGNATURE-----

Revision history for this message
Robert Collins (lifeless) wrote :

On Fri, 2009-12-18 at 17:09 +0000, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Robert Collins wrote:
> > On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
> >> * it is invoked for all changes where one side has changed the
> >> content, and the other side has changed the content or deleted the fil
> >
> > Why not 'all changes' ? I don't see why we should prevent hooks merging
> > even one-side changes in a more sophisticated manner.
>
> That would require us to pay attention to THIS-only changes, and that
> would be a performance regression.

So there is a middle ground, where OTHER-only changes are examined.
There are a number of useful things one can do where OTHER-only changes
are examined (such as better merges of NEWS items by accounting for
changes after a release) which the current definition won't help with. I
see that checking for THIS-only changes would have an impact, and while
it might be nice, I don't see a need to do it at this point.

-Rob

Revision history for this message
John A Meinel (jameinel) wrote :
Download full text (6.4 KiB)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Bennetts wrote:
> Andrew Bennetts has proposed merging lp:~spiv/bzr/per-file-merge-hook-491711 into lp:bzr with lp:~spiv/bzr/merge-minor-refactoring as a prerequisite.
>
> Requested reviews:
> bzr-core (bzr-core)
> Related bugs:
> #491711 hook to permit per file merges
> https://bugs.launchpad.net/bugs/491711
>
>
> This branch adds a hook point that allows a plugin to perform the merging of the contents of specific files (see bug 491711).
>
> The tests hopefully make the capabilities fairly clear. In short:
>
> * it is invoked for all changes where one side has changed the content, and the other side has changed the content or deleted the file
> * hook functions are passed the merger, file_id and trans_id (so can get access to filenames and file contents in the various this/other/base trees)
> * hook functions can explicitly do nothing (allowing the next registered function to try), or return a conflict, a successful merge, or a deletion of the file.
> * hook functions can set the contents of a file in both the success and conflict case
>
> Some open questions (could be addressed in future patches perhaps):
>
> * should merge of a file rename vs. a text change use this hook point?
> * should a hook function be able to have some control over the emission of THIS/BASE/OTHER files when it results in a conflict, or over their contents of those files?
> * what helpers can we provide to help people write hook functions?
> * what more can we do to help multiple hook registrations co-exist peacefully? e.g. These functions already chain, but if multiple plugins register with this hook the ordering is not defined.
>
> I think ideally this hook point would be marked experimental until we get some practical experience proving that it works well in practice (e.g. that someone can write a plugin to merge NEWS files), but that's what beta series are for ;)
>
>

+ self.create_hook(hooks.HookPoint('merge_file_content',
+ "Called when file content needs to be merged (including
when one "
+ "side has deleted the file and the other has changed it)."
+ "merge_file_content is called with a "
+ "bzrlib.merge.MergeHookParams. The function should return a
tuple "
+ "of (status, lines), where status is one of 'not_applicable', "
+ "'success', 'conflicted', or 'delete'. If status is
success or "
+ "conflicted, then lines should be an iterable of the new
lines "
+ "for the file.",
+ (2, 1), None))

^- I'm wondering if we wouldn't want something smarter than a tuple of
(status, lines) being returned. It might give us a way to move to
supporting returning THIS/BASE/OTHER more directly. It may also make it
easier to handle path conflicts / updating the path as part of the hook.

I think that since the transform is being passed in (via the Merger
object), you actually can do any renaming you want. You just tell the
transform that the path for 'trans_id' should really be 'X'.

As james_w mentioned, we may want to make the api a little bit more
verbose, and have some helper functions o...

Read more...

review: Needs Information
Revision history for this message
Andrew Bennetts (spiv) wrote :
Download full text (13.7 KiB)

First, thanks everyone for the deluge of thoughtful comments! It's
great to have so much enthusiasm about a patch :)

I'm going to reply to all the comments in this one jumbo reply, rather
than try to detangle the conversation threads this proposal has spawned.
I've skipped over some comments that are repeats of others, but I think
I've replied to every point. Do let me know if I've inadvertantly
forgotten something you'd like me to consider.

Robert Collins wrote:
> On Fri, 2009-12-18 at 08:08 +0000, Andrew Bennetts wrote:
> >
> > * it is invoked for all changes where one side has changed the
> > content, and the other side has changed the content or deleted the fil
>
> Why not 'all changes' ? I don't see why we should prevent hooks merging
> even one-side changes in a more sophisticated manner.

Hmm. I guess because if there's a clear one-side winner then I don't
think of that as a “merge” (from the point of view of that file), and I
didn't have a use-case for it. I was also trying to be mindful of
keeping the performance cost minimal, as Aaron mentioned. You have
given a use-case in a follow-up (“better merges of NEWS items by
accounting for changes after a release”), and that would definitely be
nice to have. I'll have a play and see what it would take to make
OTHER-only changes go via the hook, as your followup suggests. It
sounds like no-one is strongly interested in THIS-only at this point, so
I'm happy to ignore that for this patch.

Also, I know it's a bit of a cop-out answer, but a plugin could always
implement a whole new Merger if the hook isn't powerful enough...

John's reply to you said:

] I assume this is because the merge code itself splits these cases. Files
] with clear winners don't get passed into the Merger code. Also, there is
] bidirectional issues.

Right, I definitely did want to keep the changes as minimal as possible
because I'm not deeply familiar with the structure of the merge code,
but I do get the sense that it would be easy to accidentally introduce
bugs if I rearranged things.

I'm not certain what you mean by bidirectional issues? I think I can
make some educated guesses, but I'd rather you explained for me :)

> > * should a hook function be able to have some control over the
> > emission of THIS/BASE/OTHER files when it results in a conflict, or
> > over their contents of those files?
>
> I think yes, but that can come later.

Cool.

James Westby wrote:
> On Fri, 18 Dec 2009 08:08:34 -0000, Andrew Bennetts <email address hidden> wrote:
[...]
> > * hook functions are passed the merger, file_id and trans_id (so can
> > get access to filenames and file contents in the various
> > this/other/base trees)
>
> Is there any other information that is readily available?

As John says, the path in the various tree versions are fairly cheaply
available via the merger object. I think it'll be worth adding
convenience methods to get at them. I imagine some plugins might want
to just check the basename, others would like to check the full path,
some might just use the file_id.

> I am a little concerned about the performance issues. bzr-builddeb is
> going to be shipping one of these hooks, ...

Revision history for this message
James Westby (james-w) wrote :

On Mon, 21 Dec 2009 02:39:14 -0000, Andrew Bennetts <email address hidden> wrote:
> I'm interested in the “on every conflict it [bzr-builddeb?] has to look
> up the path names” remark though — I don't think this hook as-is
> provides a convenient way to implement “try the regular logic, if that
> conflicts then try my custom plugin logic”. Is that something you want?
> I suspect the way to provide that nicely may be via making the regular
> logic interact via the same interfaces as the hook (i.e. receiving the
> same params object and returning the same values), and then providing a
> convenient way to call that via the hook params.

The way I imagine it working is that a hook will be registered by
bzr-builddeb that will merge debian/changelog files. The question
is to know what files to act on. We can try and parse the files
and just skip everything that doesn't parse as one of those files.
However, that will be both quite slow, plus hugely pessimistic.

Therefore I would want to short-circuit, and only bother trying
to parse files where the basename is "changelog". It would be
great if the basename was easily accesible to the hook so
that it could do this check without doing something like
trans_id->file_id->path.

> Yeah, that's a good point. Even many those that want to handle more than
> plain text vs text changes might still want to punt on delete vs text,
> for example. So I need to take the knowledge about what sort of changes
> are involved that's implicit in the code path now and turn that into
> some data that the hook function can inspect.

Yep, I'm mainly concerned in these comments about the API that is
presented to plugin writers. Being able simply separate out the cases
without each plugin having to cargo cult code to detect what situation
it is being called for would be good.

I think this will partially come from APIs, which I imagine are there,
I just haven't looked, and from docs. We should docuement the things
people commonly want to do so that it is obvious to plugin writers.
My concern would be that you write a hook that assumes you are always
dealing with text vs. text, and then it crashes when you get text vs.
delete. Making it clear that you can detect the cases at least points
you towards considering which you want to handle, even if the API
doesn't force you to think about them all.

> Well what sort of priority system do you have in mind that would be
> better than ordering by name?

Nothing in particular. I was just making a plea for a defined order.
If e.g. the order was whatever dict.keys() returns then there is no
way to get your hook to run last if you need to.

Thanks,

James

Revision history for this message
Robert Collins (lifeless) wrote :

On Mon, 2009-12-21 at 02:57 +0000, James Westby wrote:

> Nothing in particular. I was just making a plea for a defined order.
> If e.g. the order was whatever dict.keys() returns then there is no
> way to get your hook to run last if you need to.

You don't with names either.

'Run last' is very very hard to pin down.

'Topological order' is a bit easier, 'numeric order' is very simple.

Hooks don't have names so much as descriptions, I strongly urge /not/ to
use the labels as a sort order; ugh. We don't have orders elsewhere in
hook land and we haven't needed to - even though this discussion happens
every time.

Re: merging changelog, I'd lookup the full path and compare to
'debian/changelog' and 'changelog' - its simple and should be fast.

Andrew: are kind changes able to invoke this hook? If not please make
that clear.

I think passing in the Merger is fine; we pass in Branch in other hooks
and trust folk to use public bits and not make a mess. If someone does
fiddle, the conflict resolution phases will detect that pretty robustly
I think.

-Rob

Revision history for this message
Andrew Bennetts (spiv) wrote :

Robert Collins wrote:
> On Mon, 2009-12-21 at 02:57 +0000, James Westby wrote:
>
> > Nothing in particular. I was just making a plea for a defined order.
> > If e.g. the order was whatever dict.keys() returns then there is no
> > way to get your hook to run last if you need to.
>
> You don't with names either.
>
> 'Run last' is very very hard to pin down.
>
> 'Topological order' is a bit easier, 'numeric order' is very simple.
>
> Hooks don't have names so much as descriptions, I strongly urge /not/ to
> use the labels as a sort order; ugh. We don't have orders elsewhere in
> hook land and we haven't needed to - even though this discussion happens
> every time.

Yeah, I don't much like punning that. I'd like to make the order
defined and stable somehow rather than dependent on the somewhat
unstable order that plugins will get loaded, though.

> Re: merging changelog, I'd lookup the full path and compare to
> 'debian/changelog' and 'changelog' - its simple and should be fast.

I definitely think that's what I'd try to start with. Maybe that will
be a significant performance hit in which case we can rethink it, but
let's not prematurely optimise.

> Andrew: are kind changes able to invoke this hook? If not please make
> that clear.

I don't know off the top of my head. I'll find out and make it
explicit. My guess is no, but I'm not very confident about that.

> I think passing in the Merger is fine; we pass in Branch in other hooks
> and trust folk to use public bits and not make a mess. If someone does
> fiddle, the conflict resolution phases will detect that pretty robustly
> I think.

I'm not worried about people accessing non-public bits (they should
already know that _foo attributes are off-limits unless they're ready to
deal with the likely breakage), apart from ensuring that the public bits
are complete enough for hooks to do what they need.

However, I think it's important to decide if we pass the Merger with the
intent that the hook will use it only read-only (to retrieve file names,
file contents, maybe details about other things in the tree, etc), or if
we actually want to support the hook mutating things via the Merger. It
seems the answer is the latter, which is fine, but it's worth being
clear about that. That answer suggests that my original proposal that
the return value from the hook makes changes isn't quite right when the
hook can just do change(s) directly using the Merger.

-Andrew.

Revision history for this message
Robert Collins (lifeless) wrote :

I'd suggest one of:
 return value indicates 'handled' or 'not handled'
 return value is parameter to pass to next hook or None-to-stop

and let changes that require calling fn's on Merger be done directly.

-Rob

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/decorators.py'
2--- bzrlib/decorators.py 2009-10-15 02:19:43 +0000
3+++ bzrlib/decorators.py 2010-01-13 07:31:30 +0000
4@@ -253,3 +253,19 @@
5 global needs_read_lock, needs_write_lock
6 needs_read_lock = _pretty_needs_read_lock
7 needs_write_lock = _pretty_needs_write_lock
8+
9+
10+def cachedproperty(getter_func):
11+ """Like property(getter_func), but caches the value.
12+
13+ Note: there's no way to clear or reset the cache.
14+ """
15+ memo = []
16+ def f(self):
17+ if not memo:
18+ memo.append(getter_func(self))
19+ return memo[0]
20+ f.__doc__ = getter_func.__doc__
21+ getter_func.__name__ = getter_func.__name__
22+ return property(f)
23+
24
25=== modified file 'bzrlib/hooks.py'
26--- bzrlib/hooks.py 2010-01-03 03:49:07 +0000
27+++ bzrlib/hooks.py 2010-01-13 07:31:30 +0000
28@@ -45,6 +45,8 @@
29 'bzrlib.info', 'InfoHooks')
30 known_hooks.register_lazy(('bzrlib.lock', 'Lock.hooks'), 'bzrlib.lock',
31 'LockHooks')
32+known_hooks.register_lazy(('bzrlib.merge', 'Merger.hooks'), 'bzrlib.merge',
33+ 'MergeHooks')
34 known_hooks.register_lazy(('bzrlib.msgeditor', 'hooks'), 'bzrlib.msgeditor',
35 'MessageEditorHooks')
36 known_hooks.register_lazy(('bzrlib.mutabletree', 'MutableTree.hooks'),
37
38=== modified file 'bzrlib/merge.py'
39--- bzrlib/merge.py 2009-12-18 08:22:42 +0000
40+++ bzrlib/merge.py 2010-01-13 07:31:30 +0000
41@@ -19,8 +19,10 @@
42 branch as _mod_branch,
43 conflicts as _mod_conflicts,
44 debug,
45+ decorators,
46 errors,
47 graph as _mod_graph,
48+ hooks,
49 merge3,
50 osutils,
51 patiencediff,
52@@ -50,7 +52,69 @@
53 from_tree.unlock()
54
55
56+class MergeHooks(hooks.Hooks):
57+
58+ def __init__(self):
59+ hooks.Hooks.__init__(self)
60+ self.create_hook(hooks.HookPoint('merge_file_content',
61+ "Called when file content needs to be merged (including when one "
62+ "side has deleted the file and the other has changed it)."
63+ "merge_file_content is called with a "
64+ "bzrlib.merge.MergeHookParams. The function should return a tuple "
65+ "of (status, lines), where status is one of 'not_applicable', "
66+ "'success', 'conflicted', or 'delete'. If status is success or "
67+ "conflicted, then lines should be an iterable of the new lines "
68+ "for the file.",
69+ (2, 1), None))
70+
71+
72+class MergeHookParams(object):
73+ """Object holding parameters passed to merge_file_content hooks.
74+
75+ There are 3 fields hooks can access:
76+
77+ :ivar merger: the Merger object
78+ :ivar file_id: the file ID of the file being merged
79+ :ivar trans_id: the transform ID for the merge of this file.
80+
81+ The lines of versions of the file being merged can be retrieved from the
82+ merger, e.g.::
83+
84+ params.merger.get_lines(params.merger.this_tree, params.file_id)
85+ """
86+
87+ def __init__(self, merger, file_id, trans_id, this_pair, other_pair,
88+ winner):
89+ self.merger = merger
90+ self.file_id = file_id
91+ self.trans_id = trans_id
92+ self.this_pair = this_pair
93+ self.other_pair = other_pair
94+ self.winner = winner
95+
96+ def is_file_merge(self):
97+ return self.this_pair[0] == 'file' and self.other_pair[0] == 'file'
98+
99+ @decorators.cachedproperty
100+ def base_lines(self):
101+ """The lines of the 'base' version of the file."""
102+ return self.merger.get_lines(self.merger.base_tree, self.file_id)
103+
104+ @decorators.cachedproperty
105+ def this_lines(self):
106+ """The lines of the 'this' version of the file."""
107+ return self.merger.get_lines(self.merger.this_tree, self.file_id)
108+
109+ @decorators.cachedproperty
110+ def other_lines(self):
111+ """The lines of the 'other' version of the file."""
112+ return self.merger.get_lines(self.merger.other_tree, self.file_id)
113+
114+
115 class Merger(object):
116+
117+ hooks = MergeHooks()
118+
119 def __init__(self, this_branch, other_tree=None, base_tree=None,
120 this_tree=None, pb=None, change_reporter=None,
121 recurse='down', revision_graph=None):
122@@ -1107,18 +1171,6 @@
123 contents = None
124 return kind, contents
125
126- def contents_conflict():
127- trans_id = self.tt.trans_id_file_id(file_id)
128- name = self.tt.final_name(trans_id)
129- parent_id = self.tt.final_parent(trans_id)
130- if file_id in self.this_tree.inventory:
131- self.tt.unversion_file(trans_id)
132- if file_id in self.this_tree:
133- self.tt.delete_contents(trans_id)
134- file_group = self._dump_conflicts(name, parent_id, file_id,
135- set_version=True)
136- self._raw_conflicts.append(('contents conflict', file_group))
137-
138 # See SPOT run. run, SPOT, run.
139 # So we're not QUITE repeating ourselves; we do tricky things with
140 # file kind...
141@@ -1140,59 +1192,108 @@
142 if winner == 'this':
143 # No interesting changes introduced by OTHER
144 return "unmodified"
145+ # We have a hypothetical conflict, but if we have files, then we
146+ # can try to merge the content
147 trans_id = self.tt.trans_id_file_id(file_id)
148- if winner == 'other':
149+ params = MergeHookParams(self, file_id, trans_id, this_pair,
150+ other_pair, winner)
151+ hooks = Merger.hooks['merge_file_content']
152+ hooks = list(hooks) + [self.default_text_merge]
153+ hook_status = 'not_applicable'
154+ for hook in hooks:
155+ hook_status, lines = hook(params)
156+ if hook_status != 'not_applicable':
157+ # Don't try any more hooks, this one applies.
158+ break
159+ result = "modified"
160+ if hook_status == 'not_applicable':
161+ # This is a contents conflict, because none of the available
162+ # functions could merge it.
163+ result = None
164+ name = self.tt.final_name(trans_id)
165+ parent_id = self.tt.final_parent(trans_id)
166+ if file_id in self.this_tree.inventory:
167+ self.tt.unversion_file(trans_id)
168+ file_group = self._dump_conflicts(name, parent_id, file_id,
169+ set_version=True)
170+ self._raw_conflicts.append(('contents conflict', file_group))
171+ elif hook_status == 'success':
172+ self.tt.create_file(lines, trans_id)
173+ elif hook_status == 'conflicted':
174+ # XXX: perhaps the hook should be able to provide
175+ # the BASE/THIS/OTHER files?
176+ self.tt.create_file(lines, trans_id)
177+ self._raw_conflicts.append(('text conflict', trans_id))
178+ name = self.tt.final_name(trans_id)
179+ parent_id = self.tt.final_parent(trans_id)
180+ file_group = self._dump_conflicts(name, parent_id, file_id)
181+ file_group.append(trans_id)
182+ elif hook_status == 'delete':
183+ self.tt.unversion_file(trans_id)
184+ result = "deleted"
185+ elif hook_status == 'done':
186+ # The hook function did whatever it needs to do directly, no
187+ # further action needed here.
188+ pass
189+ else:
190+ raise AssertionError('unknown hook_status: %r' % (hook_status,))
191+ if file_id not in self.this_tree and result == "modified":
192+ self.tt.version_file(file_id, trans_id)
193+ try:
194+ self.tt.tree_kind(trans_id)
195+ self.tt.delete_contents(trans_id)
196+ except errors.NoSuchFile:
197+ pass
198+ return result
199+
200+ def _default_other_winner_merge(self, merge_hook_params):
201+ """Replace this contents with other."""
202+ file_id = merge_hook_params.file_id
203+ trans_id = merge_hook_params.trans_id
204+ file_in_this = file_id in self.this_tree
205+ if file_id in self.other_tree:
206+ # OTHER changed the file
207+ wt = self.this_tree
208+ if wt.supports_content_filtering():
209+ # We get the path from the working tree if it exists.
210+ # That fails though when OTHER is adding a file, so
211+ # we fall back to the other tree to find the path if
212+ # it doesn't exist locally.
213+ try:
214+ filter_tree_path = wt.id2path(file_id)
215+ except errors.NoSuchId:
216+ filter_tree_path = self.other_tree.id2path(file_id)
217+ else:
218+ # Skip the id2path lookup for older formats
219+ filter_tree_path = None
220+ transform.create_from_tree(self.tt, trans_id,
221+ self.other_tree, file_id,
222+ filter_tree_path=filter_tree_path)
223+ return 'done', None
224+ elif file_in_this:
225+ # OTHER deleted the file
226+ return 'delete', None
227+ else:
228+ raise AssertionError(
229+ 'winner is OTHER, but file_id %r not in THIS or OTHER tree'
230+ % (file_id,))
231+
232+ def default_text_merge(self, merge_hook_params):
233+ if merge_hook_params.winner == 'other':
234 # OTHER is a straight winner, so replace this contents with other
235- file_in_this = file_id in self.this_tree
236- if file_in_this:
237- # Remove any existing contents
238- self.tt.delete_contents(trans_id)
239- if file_id in self.other_tree:
240- # OTHER changed the file
241- wt = self.this_tree
242- if wt.supports_content_filtering():
243- # We get the path from the working tree if it exists.
244- # That fails though when OTHER is adding a file, so
245- # we fall back to the other tree to find the path if
246- # it doesn't exist locally.
247- try:
248- filter_tree_path = wt.id2path(file_id)
249- except errors.NoSuchId:
250- filter_tree_path = self.other_tree.id2path(file_id)
251- else:
252- # Skip the id2path lookup for older formats
253- filter_tree_path = None
254- transform.create_from_tree(self.tt, trans_id,
255- self.other_tree, file_id,
256- filter_tree_path=filter_tree_path)
257- if not file_in_this:
258- self.tt.version_file(file_id, trans_id)
259- return "modified"
260- elif file_in_this:
261- # OTHER deleted the file
262- self.tt.unversion_file(trans_id)
263- return "deleted"
264+ return self._default_other_winner_merge(merge_hook_params)
265+ elif merge_hook_params.is_file_merge():
266+ # THIS and OTHER are both files, so text merge. Either
267+ # BASE is a file, or both converted to files, so at least we
268+ # have agreement that output should be a file.
269+ try:
270+ self.text_merge(merge_hook_params.file_id,
271+ merge_hook_params.trans_id)
272+ except errors.BinaryFile:
273+ return 'not_applicable', None
274+ return 'done', None
275 else:
276- # We have a hypothetical conflict, but if we have files, then we
277- # can try to merge the content
278- if this_pair[0] == 'file' and other_pair[0] == 'file':
279- # THIS and OTHER are both files, so text merge. Either
280- # BASE is a file, or both converted to files, so at least we
281- # have agreement that output should be a file.
282- try:
283- self.text_merge(file_id, trans_id)
284- except errors.BinaryFile:
285- return contents_conflict()
286- if file_id not in self.this_tree:
287- self.tt.version_file(file_id, trans_id)
288- try:
289- self.tt.tree_kind(trans_id)
290- self.tt.delete_contents(trans_id)
291- except errors.NoSuchFile:
292- pass
293- return "modified"
294- else:
295- return contents_conflict()
296+ return 'not_applicable', None
297
298 def get_lines(self, tree, file_id):
299 """Return the lines in a file, or an empty list."""
300
301=== modified file 'bzrlib/tests/per_merger.py'
302--- bzrlib/tests/per_merger.py 2009-12-18 06:54:20 +0000
303+++ bzrlib/tests/per_merger.py 2010-01-13 07:31:30 +0000
304@@ -18,6 +18,7 @@
305
306 import os
307
308+from bzrlib.conflicts import TextConflict
309 from bzrlib import (
310 errors,
311 merge as _mod_merge,
312@@ -28,6 +29,7 @@
313 multiply_tests,
314 TestCaseWithTransport,
315 )
316+from bzrlib.tests.test_merge_core import MergeBuilder
317 from bzrlib.transform import TreeTransform
318
319
320@@ -188,4 +190,171 @@
321 self.assertRaises(errors.LockError, wt.unlock)
322
323
324+class TestHookMergeFileContent(TestCaseWithTransport):
325+ """Tests that the 'merge_file_content' hook is invoked."""
326+
327+ def setUp(self):
328+ TestCaseWithTransport.setUp(self)
329+ self.hook_log = []
330+
331+ def install_hook_noop(self):
332+ def hook_na(merge_params):
333+ # This hook unconditionally does nothing.
334+ self.hook_log.append(('no-op',))
335+ return 'not_applicable', None
336+ _mod_merge.Merger.hooks.install_named_hook(
337+ 'merge_file_content', hook_na, 'test hook (no-op)')
338+
339+ def install_hook_success(self):
340+ def hook_success(merge_params):
341+ self.hook_log.append(('success',))
342+ if merge_params.file_id == '1':
343+ return 'success', ['text-merged-by-hook']
344+ return 'not_applicable', None
345+ _mod_merge.Merger.hooks.install_named_hook(
346+ 'merge_file_content', hook_success, 'test hook (success)')
347+
348+ def install_hook_conflict(self):
349+ def hook_conflict(merge_params):
350+ self.hook_log.append(('conflict',))
351+ if merge_params.file_id == '1':
352+ return 'conflicted', ['text-with-conflict-markers-from-hook']
353+ return 'not_applicable', None
354+ _mod_merge.Merger.hooks.install_named_hook(
355+ 'merge_file_content', hook_conflict, 'test hook (delete)')
356+
357+ def install_hook_delete(self):
358+ def hook_delete(merge_params):
359+ self.hook_log.append(('delete',))
360+ if merge_params.file_id == '1':
361+ return 'delete', None
362+ return 'not_applicable', None
363+ _mod_merge.Merger.hooks.install_named_hook(
364+ 'merge_file_content', hook_delete, 'test hook (delete)')
365+
366+ def install_hook_log_lines(self):
367+ """Install a hook that saves the get_lines for the this, base and other
368+ versions of the file.
369+ """
370+ def hook_log_lines(merge_params):
371+ self.hook_log.append((
372+ 'log_lines',
373+ merge_params.this_lines,
374+ merge_params.other_lines,
375+ merge_params.base_lines,
376+ ))
377+ return 'not_applicable', None
378+ _mod_merge.Merger.hooks.install_named_hook(
379+ 'merge_file_content', hook_log_lines, 'test hook (log_lines)')
380+
381+ def make_merge_builder(self):
382+ builder = MergeBuilder(self.test_base_dir)
383+ self.addCleanup(builder.cleanup)
384+ return builder
385+
386+ def create_file_needing_contents_merge(self, builder, file_id):
387+ builder.add_file(file_id, builder.tree_root, "name1", "text1", True)
388+ builder.change_contents(file_id, other="text4", this="text3")
389+
390+ def test_change_vs_change(self):
391+ """Hook is used for (changed, changed)"""
392+ self.install_hook_success()
393+ builder = self.make_merge_builder()
394+ builder.add_file("1", builder.tree_root, "name1", "text1", True)
395+ builder.change_contents("1", other="text4", this="text3")
396+ conflicts = builder.merge(self.merge_type)
397+ self.assertEqual(conflicts, [])
398+ self.assertEqual(
399+ builder.this.get_file('1').read(), 'text-merged-by-hook')
400+
401+ def test_change_vs_deleted(self):
402+ """Hook is used for (changed, deleted)"""
403+ self.install_hook_success()
404+ builder = self.make_merge_builder()
405+ builder.add_file("1", builder.tree_root, "name1", "text1", True)
406+ builder.change_contents("1", this="text2")
407+ builder.remove_file("1", other=True)
408+ conflicts = builder.merge(self.merge_type)
409+ self.assertEqual(conflicts, [])
410+ self.assertEqual(
411+ builder.this.get_file('1').read(), 'text-merged-by-hook')
412+
413+ def test_result_can_be_delete(self):
414+ """A hook's result can be the deletion of a file."""
415+ self.install_hook_delete()
416+ builder = self.make_merge_builder()
417+ self.create_file_needing_contents_merge(builder, "1")
418+ conflicts = builder.merge(self.merge_type)
419+ self.assertEqual(conflicts, [])
420+ self.assertRaises(errors.NoSuchId, builder.this.id2path, '1')
421+ self.assertEqual([], list(builder.this.list_files()))
422+
423+ def test_result_can_be_conflict(self):
424+ """A hook's result can be a conflict."""
425+ self.install_hook_conflict()
426+ builder = self.make_merge_builder()
427+ self.create_file_needing_contents_merge(builder, "1")
428+ conflicts = builder.merge(self.merge_type)
429+ self.assertEqual(conflicts, [TextConflict('name1', file_id='1')])
430+ # The hook still gets to set the file contents in this case, so that it
431+ # can insert custom conflict markers.
432+ self.assertEqual(
433+ builder.this.get_file('1').read(),
434+ 'text-with-conflict-markers-from-hook')
435+
436+ def test_can_access_this_other_and_base_versions(self):
437+ """The hook function can call params.merger.get_lines to access the
438+ THIS/OTHER/BASE versions of the file.
439+ """
440+ self.install_hook_log_lines()
441+ builder = self.make_merge_builder()
442+ builder.add_file("1", builder.tree_root, "name1", "text1", True)
443+ builder.change_contents("1", this="text2", other="text3")
444+ conflicts = builder.merge(self.merge_type)
445+ self.assertEqual(
446+ [('log_lines', ['text2'], ['text3'], ['text1'])], self.hook_log)
447+
448+ def test_chain_when_not_applicable(self):
449+ """When a hook function returns not_applicable, the next function is
450+ tried (when one exists).
451+ """
452+ self.install_hook_noop()
453+ self.install_hook_success()
454+ builder = self.make_merge_builder()
455+ self.create_file_needing_contents_merge(builder, "1")
456+ conflicts = builder.merge(self.merge_type)
457+ self.assertEqual(conflicts, [])
458+ self.assertEqual(
459+ builder.this.get_file('1').read(), 'text-merged-by-hook')
460+ self.assertEqual([('no-op',), ('success',)], self.hook_log)
461+
462+ def test_chain_stops_after_success(self):
463+ """When a hook function returns success, no later functions are tried.
464+ """
465+ self.install_hook_success()
466+ self.install_hook_noop()
467+ builder = self.make_merge_builder()
468+ self.create_file_needing_contents_merge(builder, "1")
469+ conflicts = builder.merge(self.merge_type)
470+ self.assertEqual([('success',)], self.hook_log)
471+
472+ def test_chain_stops_after_conflict(self):
473+ """When a hook function returns conflict, no later functions are tried.
474+ """
475+ self.install_hook_conflict()
476+ self.install_hook_noop()
477+ builder = self.make_merge_builder()
478+ self.create_file_needing_contents_merge(builder, "1")
479+ conflicts = builder.merge(self.merge_type)
480+ self.assertEqual([('conflict',)], self.hook_log)
481+
482+ def test_chain_stops_after_delete(self):
483+ """When a hook function returns delete, no later functions are tried.
484+ """
485+ self.install_hook_delete()
486+ self.install_hook_noop()
487+ builder = self.make_merge_builder()
488+ self.create_file_needing_contents_merge(builder, "1")
489+ conflicts = builder.merge(self.merge_type)
490+ self.assertEqual([('delete',)], self.hook_log)
491
492
493=== added directory 'contrib/news_merge'
494=== added file 'contrib/news_merge/README'
495--- contrib/news_merge/README 1970-01-01 00:00:00 +0000
496+++ contrib/news_merge/README 2010-01-13 07:31:30 +0000
497@@ -0,0 +1,9 @@
498+A plugin for merging bzr's NEWS file.
499+
500+Install as usual (e.g. symlink this directory into ~/.bazaar/plugins), and any
501+file with a path of NEWS that is in bzr's NEWS format will be merged with this
502+hook.
503+
504+This hook can resolve conflicts where both sides add entries at the same place.
505+If it encounters a more difficult conflict it gives up and bzr will fallback to
506+the default merge algorithm.
507
508=== added file 'contrib/news_merge/__init__.py'
509--- contrib/news_merge/__init__.py 1970-01-01 00:00:00 +0000
510+++ contrib/news_merge/__init__.py 2010-01-13 07:31:30 +0000
511@@ -0,0 +1,90 @@
512+# Copyright (C) 2010 Canonical Ltd
513+#
514+# This program is free software; you can redistribute it and/or modify
515+# it under the terms of the GNU General Public License as published by
516+# the Free Software Foundation; either version 2 of the License, or
517+# (at your option) any later version.
518+#
519+# This program is distributed in the hope that it will be useful,
520+# but WITHOUT ANY WARRANTY; without even the implied warranty of
521+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
522+# GNU General Public License for more details.
523+#
524+# You should have received a copy of the GNU General Public License
525+# along with this program; if not, write to the Free Software
526+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
527+
528+"""Merge hook for bzr's NEWS file.
529+
530+Install this as a plugin, e.g:
531+
532+ cp contrib/news-file-merge-hook.py ~/.bazaar/plugins/news_merge.py
533+"""
534+
535+from .parser import simple_parse
536+
537+from bzrlib.merge import Merger
538+from bzrlib.merge3 import Merge3
539+
540+
541+def news_merge_hook(params):
542+ if params.winner == 'other':
543+ # OTHER is a straight winner, rely on default merge.
544+ return 'not_applicable', None
545+ elif params.is_file_merge():
546+ # THIS and OTHER are both files. That's a good start.
547+ filename = params.merger.this_tree.id2path(params.file_id)
548+ if filename != 'NEWS':
549+ return 'not_applicable', None
550+ return news_merger(params)
551+ else:
552+ return 'not_applicable', None
553+
554+magic_marker = '|NEWS-MERGE-MAGIC-MARKER|'
555+
556+def blocks_to_fakelines(blocks):
557+ for kind, text in blocks:
558+ yield '%s%s%s' % (kind, magic_marker, text)
559+
560+def fakelines_to_lines(fakelines):
561+ for fakeline in fakelines:
562+ yield fakeline.split(magic_marker, 1)[1] + '\n'
563+ yield '\n'
564+
565+def sort_key(s):
566+ return s.replace('`', '').lower()
567+
568+def news_merger(params):
569+ def munge(lines):
570+ return list(blocks_to_fakelines(simple_parse(''.join(lines))))
571+ this_lines = munge(params.this_lines)
572+ other_lines = munge(params.other_lines)
573+ base_lines = munge(params.base_lines)
574+ m3 = Merge3(base_lines, this_lines, other_lines)
575+ result_lines = []
576+ for group in m3.merge_groups():
577+ if group[0] == 'conflict':
578+ _, base, a, b = group
579+ # are all the conflicting lines bullets? If so, we can merge this.
580+ for line_set in [base, a, b]:
581+ for line in line_set:
582+ if not line.startswith('bullet'):
583+ # Something else :(
584+ # Maybe the default merge can cope.
585+ return 'not_applicable', None
586+ new_in_a = set(a).difference(base)
587+ new_in_b = set(b).difference(base)
588+ all_new = new_in_a.union(new_in_b)
589+ deleted_in_a = set(base).difference(a)
590+ deleted_in_b = set(base).difference(b)
591+ final = all_new.difference(deleted_in_a).difference(deleted_in_b)
592+ final = sorted(final, key=sort_key)
593+ result_lines.extend(final)
594+ else:
595+ result_lines.extend(group[1])
596+ return 'success', list(fakelines_to_lines(result_lines))
597+
598+
599+Merger.hooks.install_named_hook(
600+ 'merge_file_content', news_merge_hook, 'NEWS file merge')
601+
602
603=== added file 'contrib/news_merge/parser.py'
604--- contrib/news_merge/parser.py 1970-01-01 00:00:00 +0000
605+++ contrib/news_merge/parser.py 2010-01-13 07:31:30 +0000
606@@ -0,0 +1,33 @@
607+
608+
609+def simple_parse(content):
610+ """Returns blocks, where each block is a 2-tuple (kind, text)."""
611+ blocks = content.split('\n\n')
612+ for block in blocks:
613+ if block.startswith('###'):
614+ # First line is ###...: Top heading
615+ yield 'heading', block
616+ continue
617+ last_line = block.rsplit('\n', 1)[-1]
618+ if last_line.startswith('###'):
619+ # last line is ###...: 2nd-level heading
620+ yield 'release', block
621+ elif last_line.startswith('***'):
622+ # last line is ***...: 3rd-level heading
623+ yield 'section', block
624+ elif block.startswith('* '):
625+ # bullet
626+ yield 'bullet', block
627+ elif block.strip() == '':
628+ # empty
629+ yield 'empty', block
630+ else:
631+ # plain text
632+ yield 'text', block
633+
634+
635+if __name__ == '__main__':
636+ import sys
637+ content = open(sys.argv[1], 'rb').read()
638+ for result in simple_parse(content):
639+ print result