Bazaar

Merge lp:~toshio/bzr/allow-dirty-patches into lp:bzr

allow-dirty-patches
Merge into bzr.dev

Proposed by Toshio Kuratomi on 2010-02-08

Status: Merged
Merged at revision: not available
Proposed branch: lp:~toshio/bzr/allow-dirty-patches
Merge into: lp:bzr
Diff against target: 113 lines (+53/-8), 1 file modified
    bzrlib/patches.py (+53/-8)
To merge this branch: bzr merge lp:~toshio/bzr/allow-dirty-patches

Medium

Fix Released

Reviewer	Review Type	Date Requested	Status
Martin Pool		2010-02-08	Approve on 2010-02-25
Review via email: mp+18854@code.launchpad.net

Toshio Kuratomi (toshio) wrote on 2010-02-08:

Handle dirty patches.

Robert Collins (lifeless) wrote on 2010-02-08:

> Handle dirty patches.

Could you enlarge on what you mean a little? Thanks

-Rob

Jelmer Vernooij (jelmer) wrote on 2010-02-08:

On Mon, 2010-02-08 at 22:09 +0000, Robert Collins wrote:
> > Handle dirty patches.
>
> Could you enlarge on what you mean a little? Thanks
This was about being able to parse patches with other data in them, such
as patches generated by git.

Cheers,

Jelmer

John A Meinel (jameinel) wrote on 2010-02-09:

This looks ok to me. I'm a bit surprised that we stop as soon as we get trailing garbage, but we allow preceding garbage. What happens with a patch like:

=== modified file 'file-1'
--- file-1 2010-01-13 21:15:55 +0000
+++ file-1 2010-02-09 19:19:46 +0000
@@ -54,17 +54,21 @@
text
+one
-two

A comment about this next change
=== modified file 'file-2'
--- file-1 2010-01-13 21:15:55 +0000
+++ file-1 2010-02-09 19:19:46 +0000
@@ -54,17 +54,21 @@

Are they just considered separate 'patches', and thus the line is treated as bogus header data for the second patch?

How does 'patch' handle interleaved stuff?

Having thought about it more, we may want to be more relaxed overall about what we allow. I guess the original argument was that we wanted to make sure to understand the diff in a merge-proposal, so that we could validate that the diff you preview is the diff that would be applied once merged.

Martin Pool (mbp) wrote on 2010-02-11:

> How does 'patch' handle interleaved stuff?

I think unix patch just keeps reading through the whole stream, applying them one after the other.

>
> Having thought about it more, we may want to be more relaxed overall about
> what we allow. I guess the original argument was that we wanted to make sure
> to understand the diff in a merge-proposal, so that we could validate that the
> diff you preview is the diff that would be applied once merged.

I think so too. I don't see how this is necessarily a problem for merge proposals.

However, this patch is probably still an improvement in its own right, even if we want to accept trailing junk later.

It would be nice to see at least a basic test for this. Toshio, would you be able to add one? bzrlib/tests/test_patches is pretty clear so you just need to add one or two other cases.

review: Needs Fixing

Martin Pool (mbp) wrote on 2010-02-18:

I thought I'd add a test for this as patch pilot, so I put in this one based on bug 502076:

    def test_parse_leading_noise(self):
        """Parse a valid patch header"""
        # https://bugs.edge.launchpad.net/bzr/+bug/502076
        lines = ["diff -pruN commands.py",
            "--- orig/commands.py"
            "+++ mod/dommands.py"]
        bits = parse_patch(iter(lines), allow_dirty=True)

however this fails with

Traceback (most recent call last):
  File "/home/mbp/lib/python/testtools/runtest.py", line 128, in _run_user
    return fn(*args)
  File "/home/mbp/lib/python/testtools/testcase.py", line 369, in _run_test_method
    testMethod()
  File "/home/mbp/bzr/502076-dirty-patches/bzrlib/tests/test_patches.py", line 63, in test_parse_leading_noise
    bits = parse_patch(iter(lines), allow_dirty=True)
  File "/home/mbp/bzr/502076-dirty-patches/bzrlib/patches.py", line 364, in parse_patch
    (orig_name, mod_name) = get_patch_names(iter_lines)
  File "/home/mbp/bzr/502076-dirty-patches/bzrlib/patches.py", line 76, in get_patch_names
    raise MalformedPatchHeader("No orig name", line)
MalformedPatchHeader: Malformed patch header. No orig name
'diff -pruN commands.py'

Can someone tell me what format this is supposed to be parsing, or what's wrong with that test?

review: Needs Fixing

Toshio Kuratomi (toshio) wrote on 2010-02-21:

> I thought I'd add a test for this as patch pilot, so I put in this one based
> on bug 502076:
>
> def test_parse_leading_noise(self):
> """Parse a valid patch header"""
> # https://bugs.edge.launchpad.net/bzr/+bug/502076
> lines = ["diff -pruN commands.py",
> "--- orig/commands.py"
> "+++ mod/dommands.py"]
> bits = parse_patch(iter(lines), allow_dirty=True)
>
> however this fails with

> Can someone tell me what format this is supposed to be parsing, or what's
> wrong with that test?

Thanks for the test -- there are two problems with this test:
1) Call parse_patches() instead of parse_patch():
bits = parse_patch(iter(lines), allow_dirty=True)

2) You need a comma in your header lines list:
"--- orig/commands.py",
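
Editor's note on point 2: a minimal sketch of why the missing comma matters. Python joins adjacent string literals, so the list from the test silently ends up with two elements instead of three. The variable names `buggy` and `fixed` are only for illustration and do not appear in the branch.

```python
# Without the comma, the two adjacent string literals are concatenated
# into a single list element at compile time.
buggy = ["diff -pruN commands.py",
         "--- orig/commands.py"
         "+++ mod/dommands.py"]

# With the comma, each header line is its own element.
fixed = ["diff -pruN commands.py",
         "--- orig/commands.py",
         "+++ mod/dommands.py"]

print(len(buggy))  # 2
print(len(fixed))  # 3
print(buggy[1])    # --- orig/commands.py+++ mod/dommands.py
```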

Martin Pool (mbp) wrote on 2010-02-22:

> > Can someone tell me what format this is supposed to be parsing, or what's
> > wrong with that test?
>
> Thanks for the test -- there are two problems with this test:
> 1) Call parse_patches() instead of parse_patch():
> bits = parse_patch(iter(lines), allow_dirty=True)
>
> 2) You need a comma in your header lines list:
> "--- orig/commands.py",

Thanks for that. With those changes, this works.

Is it really intended and reasonable that parse_patch (rather than parse_patches) takes allow_dirty, but doesn't accept garbage at the start? If so, we're ok to merge. (Or we could even handle the rest as a follow-on.)
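
Editor's note: judging from the preview diff, leading junk is discarded while the stream is split into per-file chunks, before any single-patch parser sees it. That would explain why calling the single-patch function directly on input that starts with a `diff ...` line still fails. The helper below is a hypothetical, heavily simplified illustration of that split of responsibilities. It is not bzrlib code, and it ignores the binary-file and `===`/`***`/`#` cases that the real function handles.

```python
def drop_leading_junk(lines):
    """Return the lines starting at the first '--- ' header line.

    Hypothetical helper: everything before the first '--- ' line is
    treated as junk and dropped. Returns an empty list if no header
    line is found at all.
    """
    lines = list(lines)
    for index, line in enumerate(lines):
        if line.startswith("--- "):
            return lines[index:]
    return []


stream = ["diff -pruN commands.py",
          "--- orig/commands.py",
          "+++ mod/dommands.py"]
cleaned = drop_leading_junk(stream)
print(cleaned[0])  # --- orig/commands.py
```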

Martin Pool (mbp) wrote on 2010-02-25:

I'll submit this with those changes.

review: Approve

Preview Diff

=== modified file 'bzrlib/patches.py'
--- bzrlib/patches.py	2009-11-03 15:45:56 +0000
+++ bzrlib/patches.py	2010-02-08 17:05:27 +0000
@@ -250,7 +250,13 @@
         return shift
 
 
-def iter_hunks(iter_lines):
+def iter_hunks(iter_lines, allow_dirty=False):
+    '''
+    :arg iter_lines: iterable of lines to parse for hunks
+    :kwarg allow_dirty: If True, when we encounter something that is not
+        a hunk header when we're looking for one, assume the rest of the lines
+        are not part of the patch (comments or other junk).  Default False
+    '''
     hunk = None
     for line in iter_lines:
         if line == "\n":
@@ -260,7 +266,15 @@
             continue
         if hunk is not None:
             yield hunk
-        hunk = hunk_from_header(line)
+        try:
+            hunk = hunk_from_header(line)
+        except MalformedHunkHeader:
+            if allow_dirty:
+                # If the line isn't a hunk header, then we've reached the end
+                # of this patch and there's "junk" at the end.  Ignore the
+                # rest of this patch.
+                return
+            raise
         orig_size = 0
         mod_size = 0
         while orig_size < hunk.orig_range or mod_size < hunk.mod_range:
@@ -339,7 +353,12 @@
                     pos += 1
 
 
-def parse_patch(iter_lines):
+def parse_patch(iter_lines, allow_dirty=False):
+    '''
+    :arg iter_lines: iterable of lines to parse
+    :kwarg allow_dirty: If True, allow the patch to have trailing junk.
+        Default False
+    '''
     iter_lines = iter_lines_handle_nl(iter_lines)
     try:
         (orig_name, mod_name) = get_patch_names(iter_lines)
@@ -347,15 +366,29 @@
         return BinaryPatch(e.orig_name, e.mod_name)
     else:
         patch = Patch(orig_name, mod_name)
-        for hunk in iter_hunks(iter_lines):
+        for hunk in iter_hunks(iter_lines, allow_dirty):
             patch.hunks.append(hunk)
         return patch
 
 
-def iter_file_patch(iter_lines):
+def iter_file_patch(iter_lines, allow_dirty=False):
+    '''
+    :arg iter_lines: iterable of lines to parse for patches
+    :kwarg allow_dirty: If True, allow comments and other non-patch text
+        before the first patch.  Note that the algorithm here can only find
+        such text before any patches have been found.  Comments after the
+        first patch are stripped away in iter_hunks() if it is also passed
+        allow_dirty=True.  Default False.
+    '''
+    ### FIXME: Docstring is not quite true.  We allow certain comments no
+    # matter what, If they startwith '===', '***', or '#' Someone should
+    # reexamine this logic and decide if we should include those in
+    # allow_dirty or restrict those to only being before the patch is found
+    # (as allow_dirty does).
     regex = re.compile(binary_files_re)
     saved_lines = []
     orig_range = 0
+    beginning = True
     for line in iter_lines:
         if line.startswith('=== ') or line.startswith('*** '):
             continue
@@ -365,7 +398,12 @@
             if line.startswith('-') or line.startswith(' '):
                 orig_range -= 1
         elif line.startswith('--- ') or regex.match(line):
-            if len(saved_lines) > 0:
+            if allow_dirty and beginning:
+                # Patches can have "junk" at the beginning
+                # Stripping junk from the end of patches is handled when we
+                # parse the patch
+                beginning = False
+            elif len(saved_lines) > 0:
                 yield saved_lines
             saved_lines = []
         elif line.startswith('@@'):
@@ -397,8 +435,15 @@
         yield last_line
 
 
-def parse_patches(iter_lines):
-    return [parse_patch(f.__iter__()) for f in iter_file_patch(iter_lines)]
+def parse_patches(iter_lines, allow_dirty=False):
+    '''
+    :arg iter_lines: iterable of lines to parse for patches
+    :kwarg allow_dirty: If True, allow text that's not part of the patch at
+        selected places.  This includes comments before and after a patch
+        for instance.  Default False.
+    '''
+    return [parse_patch(f.__iter__(), allow_dirty) for f in
+                        iter_file_patch(iter_lines, allow_dirty)]
 
 
 def difference_index(atext, btext):
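
Editor's note: the trailing-junk rule in the iter_hunks change can be sketched in isolation. The code below is a simplified, self-contained approximation, not the bzrlib implementation. The regex, the `parse_hunk_header` helper, the local `MalformedHunkHeader` class, and the tuple-based hunk representation are all assumptions made for this illustration. The point it tries to show is the one from the diff: when a line that should be a hunk header fails to parse, `allow_dirty=True` ends iteration quietly, while the default re-raises.

```python
import re

# Assumed, simplified pattern for a unified-diff hunk header.
HUNK_HEADER_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


class MalformedHunkHeader(Exception):
    """Local stand-in for the exception of the same name in bzrlib."""


def parse_hunk_header(line):
    """Return (orig_range, mod_range) or raise MalformedHunkHeader."""
    match = HUNK_HEADER_RE.match(line)
    if match is None:
        raise MalformedHunkHeader(line)
    # A missing range in a hunk header means a range of 1.
    return int(match.group(2) or 1), int(match.group(4) or 1)


def iter_hunks_sketch(lines, allow_dirty=False):
    """Yield (header, body_lines) pairs, optionally ignoring trailing junk."""
    lines = iter(lines)
    for header in lines:
        try:
            orig_range, mod_range = parse_hunk_header(header)
        except MalformedHunkHeader:
            if allow_dirty:
                return  # treat everything from here on as junk
            raise
        body = []
        orig_seen = mod_seen = 0
        while orig_seen < orig_range or mod_seen < mod_range:
            body_line = next(lines, None)
            if body_line is None:
                break  # input ended before the hunk was complete
            body.append(body_line)
            if body_line.startswith(" "):
                orig_seen += 1
                mod_seen += 1
            elif body_line.startswith("-"):
                orig_seen += 1
            elif body_line.startswith("+"):
                mod_seen += 1
        yield header, body


sample = ["@@ -1,2 +1,2 @@\n", " a\n", "-b\n", "+c\n",
          "-- \n", "signature text\n"]
hunks = list(iter_hunks_sketch(sample, allow_dirty=True))
print(len(hunks))  # 1
```

With `allow_dirty=False`, the same input raises `MalformedHunkHeader` at the `-- ` line, which matches the strict behaviour the branch keeps as the default.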
