Merge lp:~toshio/bzr/allow-dirty-patches into lp:bzr
- allow-dirty-patches
- Merge into bzr.dev
Status: | Merged |
---|---|
Merged at revision: | not available |
Proposed branch: | lp:~toshio/bzr/allow-dirty-patches |
Merge into: | lp:bzr |
Diff against target: |
113 lines (+53/-8) 1 file modified
bzrlib/patches.py (+53/-8) |
To merge this branch: | bzr merge lp:~toshio/bzr/allow-dirty-patches |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Martin Pool | Approve | ||
Review via email: mp+18854@code.launchpad.net |
Commit message
Description of the change
Toshio Kuratomi (toshio) wrote : | # |
Robert Collins (lifeless) wrote : | # |
> Handle dirty patches.
Could you enlarge on what you mean a little? Thanks
-Rob
Jelmer Vernooij (jelmer) wrote : | # |
On Mon, 2010-02-08 at 22:09 +0000, Robert Collins wrote:
> > Handle dirty patches.
>
> Could you enlarge on what you mean a little? Thanks
This was about being able to parse patches with other data in them, such
as patches generated by git.
Cheers,
Jelmer
John A Meinel (jameinel) wrote : | # |
This looks ok to me. I'm a bit surprised that we stop as soon as we get trailing garbage, but we allow preceding garbage. What happens with a patch like:
=== modified file 'file-1'
--- file-1 2010-01-13 21:15:55 +0000
+++ file-1 2010-02-09 19:19:46 +0000
@@ -54,17 +54,21 @@
text
+one
-two
A comment about this next change
=== modified file 'file-2'
--- file-1 2010-01-13 21:15:55 +0000
+++ file-1 2010-02-09 19:19:46 +0000
@@ -54,17 +54,21 @@
Are they just considered separate 'patches', and thus the line is treated as bogus header data for the second patch?
How does 'patch' handle interleaved stuff?
Having thought about it more, we may want to be more relaxed overall about what we allow. I guess the original argument was that we wanted to make sure to understand the diff in a merge-proposal, so that we could validate that the diff you preview is the diff that would be applied once merged.
Martin Pool (mbp) wrote : | # |
> How does 'patch' handle interleaved stuff?
I think unix patch just keeps reading through the whole stream, applying them one after the other.
>
> Having thought about it more, we may want to be more relaxed overall about
> what we allow. I guess the original argument was that we wanted to make sure
> to understand the diff in a merge-proposal, so that we could validate that the
> diff you preview is the diff that would be applied once merged.
I think so too. I don't see how this is necessarily a problem for merge proposals.
However, this patch is probably still an improvement in its own right, even if we want to accept trailing junk later.
It would be nice to see at least a basic test for this. Toshio, would you be able to add one? bzrlib/
Martin Pool (mbp) wrote : | # |
I thought I'd add a test for this as patch pilot, so I put in this one based on bug 502076:
def test_parse_
"""Parse a valid patch header"""
# https:/
lines = ["diff -pruN commands.py",
"--- orig/commands.py"
"+++ mod/dommands.py"]
bits = parse_patch(
however this fails with
Traceback (most recent call last):
File "/home/
return fn(*args)
File "/home/
testMethod()
File "/home/
bits = parse_patch(
File "/home/
(orig_name, mod_name) = get_patch_
File "/home/
raise MalformedPatchH
MalformedPatchH
'diff -pruN commands.py'
Can someone tell me what format this is supposed to be parsing, or what's wrong with that test?
Toshio Kuratomi (toshio) wrote : | # |
> I thought I'd add a test for this as patch pilot, so I put in this one based
> on bug 502076:
>
> def test_parse_
> """Parse a valid patch header"""
> # https:/
> lines = ["diff -pruN commands.py",
> "--- orig/commands.py"
> "+++ mod/dommands.py"]
> bits = parse_patch(
>
> however this fails with
> Can someone tell me what format this is supposed to be parsing, or what's
> wrong with that test?
Thanks for the test -- there's two problems with this patch:
1) Call parse_patches() instead of parse_patch():
bits = parse_patch(
2) You need a comma in your header lines list:
"--- orig/commands.py",
Martin Pool (mbp) wrote : | # |
> > Can someone tell me what format this is supposed to be parsing, or what's
> > wrong with that test?
>
> Thanks for the test -- there's two problems with this patch:
> 1) Call parse_patches() instead of parse_patch():
> bits = parse_patch(
>
> 2) You need a comma in your header lines list:
> "--- orig/commands.py",
Thanks for that. With those changes, this works.
Is it really intended and reasonable that parse_patch (rather than parse_patches) takes allow_dirty, but doesn't accept garbage at the start? If so, we're ok to merge. (Or we could even handle the rest as a followon.)
Martin Pool (mbp) wrote : | # |
I'll submit this with those changes.
Preview Diff
1 | === modified file 'bzrlib/patches.py' | |||
2 | --- bzrlib/patches.py 2009-11-03 15:45:56 +0000 | |||
3 | +++ bzrlib/patches.py 2010-02-08 17:05:27 +0000 | |||
4 | @@ -250,7 +250,13 @@ | |||
5 | 250 | return shift | 250 | return shift |
6 | 251 | 251 | ||
7 | 252 | 252 | ||
9 | 253 | def iter_hunks(iter_lines): | 253 | def iter_hunks(iter_lines, allow_dirty=False): |
10 | 254 | ''' | ||
11 | 255 | :arg iter_lines: iterable of lines to parse for hunks | ||
12 | 256 | :kwarg allow_dirty: If True, when we encounter something that is not | ||
13 | 257 | a hunk header when we're looking for one, assume the rest of the lines | ||
14 | 258 | are not part of the patch (comments or other junk). Default False | ||
15 | 259 | ''' | ||
16 | 254 | hunk = None | 260 | hunk = None |
17 | 255 | for line in iter_lines: | 261 | for line in iter_lines: |
18 | 256 | if line == "\n": | 262 | if line == "\n": |
19 | @@ -260,7 +266,15 @@ | |||
20 | 260 | continue | 266 | continue |
21 | 261 | if hunk is not None: | 267 | if hunk is not None: |
22 | 262 | yield hunk | 268 | yield hunk |
24 | 263 | hunk = hunk_from_header(line) | 269 | try: |
25 | 270 | hunk = hunk_from_header(line) | ||
26 | 271 | except MalformedHunkHeader: | ||
27 | 272 | if allow_dirty: | ||
28 | 273 | # If the line isn't a hunk header, then we've reached the end | ||
29 | 274 | # of this patch and there's "junk" at the end. Ignore the | ||
30 | 275 | # rest of this patch. | ||
31 | 276 | return | ||
32 | 277 | raise | ||
33 | 264 | orig_size = 0 | 278 | orig_size = 0 |
34 | 265 | mod_size = 0 | 279 | mod_size = 0 |
35 | 266 | while orig_size < hunk.orig_range or mod_size < hunk.mod_range: | 280 | while orig_size < hunk.orig_range or mod_size < hunk.mod_range: |
36 | @@ -339,7 +353,12 @@ | |||
37 | 339 | pos += 1 | 353 | pos += 1 |
38 | 340 | 354 | ||
39 | 341 | 355 | ||
41 | 342 | def parse_patch(iter_lines): | 356 | def parse_patch(iter_lines, allow_dirty=False): |
42 | 357 | ''' | ||
43 | 358 | :arg iter_lines: iterable of lines to parse | ||
44 | 359 | :kwarg allow_dirty: If True, allow the patch to have trailing junk. | ||
45 | 360 | Default False | ||
46 | 361 | ''' | ||
47 | 343 | iter_lines = iter_lines_handle_nl(iter_lines) | 362 | iter_lines = iter_lines_handle_nl(iter_lines) |
48 | 344 | try: | 363 | try: |
49 | 345 | (orig_name, mod_name) = get_patch_names(iter_lines) | 364 | (orig_name, mod_name) = get_patch_names(iter_lines) |
50 | @@ -347,15 +366,29 @@ | |||
51 | 347 | return BinaryPatch(e.orig_name, e.mod_name) | 366 | return BinaryPatch(e.orig_name, e.mod_name) |
52 | 348 | else: | 367 | else: |
53 | 349 | patch = Patch(orig_name, mod_name) | 368 | patch = Patch(orig_name, mod_name) |
55 | 350 | for hunk in iter_hunks(iter_lines): | 369 | for hunk in iter_hunks(iter_lines, allow_dirty): |
56 | 351 | patch.hunks.append(hunk) | 370 | patch.hunks.append(hunk) |
57 | 352 | return patch | 371 | return patch |
58 | 353 | 372 | ||
59 | 354 | 373 | ||
61 | 355 | def iter_file_patch(iter_lines): | 374 | def iter_file_patch(iter_lines, allow_dirty=False): |
62 | 375 | ''' | ||
63 | 376 | :arg iter_lines: iterable of lines to parse for patches | ||
64 | 377 | :kwarg allow_dirty: If True, allow comments and other non-patch text | ||
65 | 378 | before the first patch. Note that the algorithm here can only find | ||
66 | 379 | such text before any patches have been found. Comments after the | ||
67 | 380 | first patch are stripped away in iter_hunks() if it is also passed | ||
68 | 381 | allow_dirty=True. Default False. | ||
69 | 382 | ''' | ||
70 | 383 | ### FIXME: Docstring is not quite true. We allow certain comments no | ||
71 | 384 | # matter what, If they startwith '===', '***', or '#' Someone should | ||
72 | 385 | # reexamine this logic and decide if we should include those in | ||
73 | 386 | # allow_dirty or restrict those to only being before the patch is found | ||
74 | 387 | # (as allow_dirty does). | ||
75 | 356 | regex = re.compile(binary_files_re) | 388 | regex = re.compile(binary_files_re) |
76 | 357 | saved_lines = [] | 389 | saved_lines = [] |
77 | 358 | orig_range = 0 | 390 | orig_range = 0 |
78 | 391 | beginning = True | ||
79 | 359 | for line in iter_lines: | 392 | for line in iter_lines: |
80 | 360 | if line.startswith('=== ') or line.startswith('*** '): | 393 | if line.startswith('=== ') or line.startswith('*** '): |
81 | 361 | continue | 394 | continue |
82 | @@ -365,7 +398,12 @@ | |||
83 | 365 | if line.startswith('-') or line.startswith(' '): | 398 | if line.startswith('-') or line.startswith(' '): |
84 | 366 | orig_range -= 1 | 399 | orig_range -= 1 |
85 | 367 | elif line.startswith('--- ') or regex.match(line): | 400 | elif line.startswith('--- ') or regex.match(line): |
87 | 368 | if len(saved_lines) > 0: | 401 | if allow_dirty and beginning: |
88 | 402 | # Patches can have "junk" at the beginning | ||
89 | 403 | # Stripping junk from the end of patches is handled when we | ||
90 | 404 | # parse the patch | ||
91 | 405 | beginning = False | ||
92 | 406 | elif len(saved_lines) > 0: | ||
93 | 369 | yield saved_lines | 407 | yield saved_lines |
94 | 370 | saved_lines = [] | 408 | saved_lines = [] |
95 | 371 | elif line.startswith('@@'): | 409 | elif line.startswith('@@'): |
96 | @@ -397,8 +435,15 @@ | |||
97 | 397 | yield last_line | 435 | yield last_line |
98 | 398 | 436 | ||
99 | 399 | 437 | ||
102 | 400 | def parse_patches(iter_lines): | 438 | def parse_patches(iter_lines, allow_dirty=False): |
103 | 401 | return [parse_patch(f.__iter__()) for f in iter_file_patch(iter_lines)] | 439 | ''' |
104 | 440 | :arg iter_lines: iterable of lines to parse for patches | ||
105 | 441 | :kwarg allow_dirty: If True, allow text that's not part of the patch at | ||
106 | 442 | selected places. This includes comments before and after a patch | ||
107 | 443 | for instance. Default False. | ||
108 | 444 | ''' | ||
109 | 445 | return [parse_patch(f.__iter__(), allow_dirty) for f in | ||
110 | 446 | iter_file_patch(iter_lines, allow_dirty)] | ||
111 | 402 | 447 | ||
112 | 403 | 448 | ||
113 | 404 | def difference_index(atext, btext): | 449 | def difference_index(atext, btext): |
Handle dirty patches.