Merge lp:~parthm/bzr/300062-invalid-pattern-warnings into lp:bzr

Proposed by Parth Malwankar
Status: Rejected
Rejected by: Robert Collins
Proposed branch: lp:~parthm/bzr/300062-invalid-pattern-warnings
Merge into: lp:bzr
Diff against target: 680 lines (+383/-39)
12 files modified
NEWS (+5/-0)
bzrlib/builtins.py (+10/-0)
bzrlib/errors.py (+1/-1)
bzrlib/globbing.py (+143/-28)
bzrlib/lazy_regex.py (+6/-6)
bzrlib/tests/blackbox/test_add.py (+28/-0)
bzrlib/tests/blackbox/test_ignore.py (+32/-0)
bzrlib/tests/blackbox/test_ignored.py (+15/-0)
bzrlib/tests/blackbox/test_ls.py (+23/-0)
bzrlib/tests/blackbox/test_status.py (+69/-0)
bzrlib/tests/test_globbing.py (+41/-4)
bzrlib/tests/test_ignores.py (+10/-0)
To merge this branch: bzr merge lp:~parthm/bzr/300062-invalid-pattern-warnings
Reviewer Review Type Date Requested Status
John A Meinel Needs Fixing
Robert Collins Pending
bzr-core Pending
Review via email: mp+26809@code.launchpad.net

Commit message

Better handling for invalid patterns in .bzrignore and user ignore. ignore command rejects bad patterns passed to it.

Description of the change

=== Fixed Bug #300062 ===

Based on the earlier discussion
https://code.edge.launchpad.net/~parthm/bzr/300062-better-handling-for-invalid-ignore-pattern/+merge/26400
this branch make bzr issue a warning if there are invalid patterns in any ignore file. 'ls', 'add', 'ignore', 'ignored' and 'status' work as expected. Below is a sample interaction of these commands.

[aaa]% cat .bzrignore
RE:*.cpp
foo
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr ignore xyz "RE:[[[[[" pqr
bzr: warning: Skipped invalid pattern argument(s):
  RE:[[[[[
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
[aaa]% cat .bzrignore
RE:*.cpp
foo
xyz
pqr
[aaa]%

=================================================================================
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr st -S
 M .bzrignore
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
? one.txt
? two.txt
? x
? y
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr st
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
modified:
  .bzrignore
unknown:
  one.txt
  two.txt
  x
  y

=================================================================================
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr ls -i
.bzrignore.~1~
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr ls -u
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
one.txt
two.txt
x
y

=================================================================================
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr ignored
.bzrignore.~1~ *~
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).

=================================================================================
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr add
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
adding one.txt
adding two.txt
adding x
adding y

To post a comment you must log in.
Revision history for this message
John A Meinel (jameinel) wrote :

110 + if validp:
111 + g = Globster(validp)
112 + new_regex_patterns.append(g.regex_patterns[0])

^- This seems odd. A comment here saying that we are collapsing the valid requests into a single compiled regex would be useful.

158 +def is_pattern_valid(pattern):

^- It seems like it would be nice if this could share the code that determines what translation unit to use. Otherwise it seems likely to get skew if one of them is updated. For example, the proposed 'FixedString' prefix. The fallout is that it would normally work fine, but if someone had a *different* invalid regex, then suddenly it would start causing this one to fail as well.

I don't understand why you changed variables from being private. In lazy_regex you renamed _real_re_compile to real_re_compile, and in globbing you changed _regex_patterns.

It is perfectly fine for *tests* to check private variables, it is good to keep them private so that plugins get a feeling that they may not be stable names.

...

342 + def test_invalid_pattern_in_ignore_file(self):
343 + """Existing invalid patterns should be shown.
344 + """
345 + tree = self.make_branch_and_tree('.')
346 + self.build_tree(['file', 'foo.c'])
347 + f = open('.bzrignore', 'w')
348 + try:
349 + f.write(u"*.c\nRE:[\n") # invalid pattern
350 + finally:
351 + f.close()

^- I think you can simplify this to:
self.build_tree_contents([('.bzrignore', '*.c\nRE:[\n')])

It is pretty bad form to write *unicode* strings to f.write() given that in python2 it is an 8-bit request. (It happens to be automatically casting your unicode string down to an ascii string.)

This test might be easier to read if we turned it into a "run_script" test, but that may not be easily available here.

The same comments apply to test_add_with_invalid_ignore_in_global_ignore and
test_invalid_pattern_in_ignore_file in the 2 other places.

review: Needs Fixing
Revision history for this message
Parth Malwankar (parthm) wrote :

Thanks for the review John.

> 110 + if validp:
> 111 + g = Globster(validp)
> 112 + new_regex_patterns.append(g.regex_patterns[0])
>
> ^- This seems odd. A comment here saying that we are collapsing the valid
> requests into a single compiled regex would be useful.
>

Done.

> 158 +def is_pattern_valid(pattern):
>
> ^- It seems like it would be nice if this could share the code that determines
> what translation unit to use. Otherwise it seems likely to get skew if one of
> them is updated. For example, the proposed 'FixedString' prefix. The fallout
> is that it would normally work fine, but if someone had a *different* invalid
> regex, then suddenly it would start causing this one to fail as well.
>

I have pulled out the common code into a class _Pattern. _Pattern has an
identify method that categorizes the pattern as fullpath, extension or
basename. This also provides pattern->translator and pattern->prefix
mapping as there are required in multiple places. Unit test is also added
for this.

>
> I don't understand why you changed variables from being private. In lazy_regex
> you renamed _real_re_compile to real_re_compile, and in globbing you changed
> _regex_patterns.
>

real_re_compile is imported by bzrlib.globbing as is_pattern_valid uses
real_re_compile to check if a pattern is valid. Hence I made that public.

For regex_patterns, in case of a pre-existing bad pattern Globbing.match
creates an instance of Globbing and accesses regex_patterns directly as
instance.regex_patterns. Considering that this uses is within the same
file we could allow it to be private but I thought making it public
may be cleaner.

Let me know if the above seems reasonable to you. I haven't done any
changes for this yet.

> It is perfectly fine for *tests* to check private variables, it is good to
> keep them private so that plugins get a feeling that they may not be stable
> names.
>
>
> ...
>
> 342 + def test_invalid_pattern_in_ignore_file(self):
> 343 + """Existing invalid patterns should be shown.
> 344 + """
> 345 + tree = self.make_branch_and_tree('.')
> 346 + self.build_tree(['file', 'foo.c'])
> 347 + f = open('.bzrignore', 'w')
> 348 + try:
> 349 + f.write(u"*.c\nRE:[\n") # invalid pattern
> 350 + finally:
> 351 + f.close()
>
> ^- I think you can simplify this to:
> self.build_tree_contents([('.bzrignore', '*.c\nRE:[\n')])
>

Thanks. That very handy. I wasn't aware of this method.
The tests have been updated to use this.

> It is pretty bad form to write *unicode* strings to f.write() given that in
> python2 it is an 8-bit request. (It happens to be automatically casting your
> unicode string down to an ascii string.)
>
> This test might be easier to read if we turned it into a "run_script" test, hha
> but that may not be easily available here.
>
> The same comments apply to test_add_with_invalid_ignore_in_global_ignore and
> test_invalid_pattern_in_ignore_file in the 2 other places.

Revision history for this message
Robert Collins (lifeless) wrote :

I have to say, I still think that this would be better solved by just
making the compile error clearer and having it throw an external
BZRError - no expansion of interfaces needed, the improvements can be
used by other things using regexps etc.

-Rob

Revision history for this message
Parth Malwankar (parthm) wrote :

Well, we now have both solutions in place. The warnings based (this) and the error based[1] though the latter may need some cleanup and review.

My personal preference is towards an error based approach. I feel warnings are may be ok for operations like 'status' that don't change anything, but specifically for something like 'add' an error would probably be better so that the user doesn't miss the warning message and go ahead and do the commit. We saw The user could 'revert' but it may be a bit of a pain to do that, or worse, user may miss the message and end up with multiple commits after that containing large binary files :) The 'add' operation with warnings specifically reminds me of bug #549310 where whoami was guessing and I ended up with multiple commits with a bad username[2]. I would prefer a uniform error based approach as Robert suggested.
I don't expect the rollout of an error based approach to be a problem as bzr currently crashes on bad patterns so the existing ignore files will not have bad patterns in them.

That said, I think both warning and error based approach are an improvement over the current situation where bzr shows a stack trace which can scare users. It would be great if we can align on one of these after some more discussion and go ahead.

[1] https://code.edge.launchpad.net/~parthm/bzr/300062-better-handling-for-invalid-ignore-pattern/+merge/26400
[2] https://bugs.launchpad.net/bzr/+bug/549310/comments/3

Revision history for this message
Robert Collins (lifeless) wrote :

So, lets go with an error based approach for now, for the following reasons:
 - its a very small change compared to the current code (wrap the
regex compiler to know its taking a list in and print the errorneous
pattern out)
 - it seems genuinely better for some use cases like add
 - we can add a warning based one later if this feels insufficient.

In general I do think its better to stop early and cleanly in commands
that can output lots of stuff - if we show a lot then users will often
miss the fine print.

I'll review the error based one again to move things forward.

Revision history for this message
Robert Collins (lifeless) wrote :

Marking this one as rejected for now.

Revision history for this message
Martin Pool (mbp) wrote :

On 21 June 2010 16:46, Robert Collins <email address hidden> wrote:
> So, lets go with an error based approach for now, for the following reasons:
>  - its a very small change compared to the current code (wrap the
> regex compiler to know its taking a list in and print the errorneous
> pattern out)
>  - it seems genuinely better for some use cases like add
>  - we can add a warning based one later if this feels insufficient.
>
> In general I do think its better to stop early and cleanly in commands
> that can output lots of stuff - if we show a lot then users will often
> miss the fine print.
>
> I'll review the error based one again to move things forward.

I said previously that I slightly preferred a warning if it was
equally easy to implement. But for the sake of simplicity and of
getting an error in cases where we do need it, an error is fine with
me.

--
Martin

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
> On 21 June 2010 16:46, Robert Collins <email address hidden> wrote:
>> So, lets go with an error based approach for now, for the following reasons:
>> - its a very small change compared to the current code (wrap the
>> regex compiler to know its taking a list in and print the errorneous
>> pattern out)
>> - it seems genuinely better for some use cases like add
>> - we can add a warning based one later if this feels insufficient.
>>
>> In general I do think its better to stop early and cleanly in commands
>> that can output lots of stuff - if we show a lot then users will often
>> miss the fine print.
>>
>> I'll review the error based one again to move things forward.
>
> I said previously that I slightly preferred a warning if it was
> equally easy to implement. But for the sake of simplicity and of
> getting an error in cases where we do need it, an error is fine with
> me.
>

We've had a bad habit in the past of falling over on trivial details and
preventing users from getting stuff done. I'm not saying that is the
case here, and as always there is a balance to be struck.

As mentioned, this can't be considered a regression, since we used to
raise an unhelpful exception, so raising a helpful one is strictly
better than what we had.

So certainly, land the update (I have not reviewed the code). I'm not as
convinced that failing early is always the best answer, it is probably
ok here.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwg10gACgkQJdeBCYSNAAPBLwCgokPBGQCdKuxiom9uOmK2DKmN
Q0UAoKYen2O25wLOuQRdvqyUVRV42iLn
=Qoef
-----END PGP SIGNATURE-----

Unmerged revisions

5292. By Parth Malwankar

added test for _Pattern

5291. By Parth Malwankar

cleanup tests.

5290. By Parth Malwankar

cleanup of pattern categorization code

5289. By Parth Malwankar

merged in changes from trunk

5288. By Parth Malwankar

added _PatternType class to categorize patterns.

5287. By Parth Malwankar

updated NEWS

5286. By Parth Malwankar

merged in changes from trunk.

5285. By Parth Malwankar

added invalid patterns tests for 'ignore'

5284. By Parth Malwankar

added invalid pattern test for ignored.

5283. By Parth Malwankar

added invalid pattern tests for 'add' and 'ls'

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2010-06-18 11:57:13 +0000
3+++ NEWS 2010-06-19 13:29:28 +0000
4@@ -56,6 +56,11 @@
5 test that all commands available to the test suite have help.
6 (Robert Collins, #177500)
7
8+* Invalid patterns in ignore files are now displayed as a warning to the
9+ user. Bazaar does not throw a stack trace when an invalid pattern is
10+ encountered by commands.
11+ (Parth Malwankar, #300062)
12+
13 * Raise ValueError instead of a string exception.
14 (John Arbash Meinel, #586926)
15
16
17=== modified file 'bzrlib/builtins.py'
18--- bzrlib/builtins.py 2010-06-17 08:53:15 +0000
19+++ bzrlib/builtins.py 2010-06-19 13:29:28 +0000
20@@ -2717,6 +2717,16 @@
21 (len(name_pattern) > 1 and name_pattern[1] == ':')):
22 raise errors.BzrCommandError(
23 "NAME_PATTERN should not be an absolute path")
24+ name_pattern_list, invalid_patterns = \
25+ globbing.filter_invalid_patterns(name_pattern_list)
26+ if invalid_patterns:
27+ pats = '\n '.join(invalid_patterns)
28+ ui.ui_factory.show_warning(
29+ 'Skipped invalid pattern argument(s):\n %s' % pats)
30+ if not name_pattern_list:
31+ # After removing invalid patterns, the list may be
32+ # empty. There is nothing to do in this case.
33+ return
34 tree, relpath = WorkingTree.open_containing(directory)
35 ignores.tree_ignores_add_patterns(tree, name_pattern_list)
36 ignored = globbing.Globster(name_pattern_list)
37
38=== modified file 'bzrlib/errors.py'
39--- bzrlib/errors.py 2010-06-17 11:52:13 +0000
40+++ bzrlib/errors.py 2010-06-19 13:29:28 +0000
41@@ -3149,10 +3149,10 @@
42 def __init__(self, bzrdir):
43 self.bzrdir = bzrdir
44
45+
46 class NoWhoami(BzrError):
47
48 _fmt = ('Unable to determine your name.\n'
49 "Please, set your name with the 'whoami' command.\n"
50 'E.g. bzr whoami "Your Name <name@example.com>"')
51
52-
53
54=== modified file 'bzrlib/globbing.py'
55--- bzrlib/globbing.py 2010-02-17 17:11:16 +0000
56+++ bzrlib/globbing.py 2010-06-19 13:29:28 +0000
57@@ -22,6 +22,12 @@
58
59 import re
60
61+import bzrlib
62+from bzrlib import (
63+ config,
64+ ui,
65+ )
66+from bzrlib.lazy_regex import real_re_compile
67 from bzrlib.trace import (
68 warning
69 )
70@@ -152,6 +158,47 @@
71 return _sub_basename(pattern[2:])
72
73
74+class _Pattern(object):
75+ """Wrapper for managing pattern categories.
76+ """
77+ TYPE_FULLPATH = 1
78+ TYPE_BASENAME = 2
79+ TYPE_EXTENSION = 3
80+
81+ translators = {
82+ TYPE_FULLPATH : _sub_fullpath,
83+ TYPE_BASENAME : _sub_basename,
84+ TYPE_EXTENSION : _sub_extension,
85+ }
86+
87+ # Prefixes used to combine various patterns.
88+ # See: Globster._add_patterns
89+ PREFIX_EXTENSION = r'(?:.*/)?(?!.*/)(?:.*\.)'
90+ PREFIX_BASENAME = r'(?:.*/)?(?!.*/)'
91+ PREFIX_FULLPATH = r''
92+
93+ prefixes = {
94+ TYPE_FULLPATH : PREFIX_FULLPATH,
95+ TYPE_BASENAME : PREFIX_BASENAME,
96+ TYPE_EXTENSION : PREFIX_EXTENSION,
97+ }
98+
99+ @staticmethod
100+ def identify(pattern):
101+ """Returns pattern category.
102+
103+ :param pattern: normalized pattern.
104+ Identify if a pattern is fullpath, basename or extension
105+ and returns the appropriate type.
106+ """
107+ if pattern.startswith(u'RE:') or u'/' in pattern:
108+ return _Pattern.TYPE_FULLPATH
109+ elif pattern.startswith(u'*.'):
110+ return _Pattern.TYPE_EXTENSION
111+ else:
112+ return _Pattern.TYPE_BASENAME
113+
114+
115 class Globster(object):
116 """A simple wrapper for a set of glob patterns.
117
118@@ -164,7 +211,7 @@
119 a super-regex containing groups of up to 99 patterns.
120 The 99 limitation is due to the grouping limit of the Python re module.
121 The resulting super-regex and associated patterns are stored as a list of
122- (regex,[patterns]) in _regex_patterns.
123+ (regex,[patterns]) in regex_patterns.
124
125 For performance reasons the patterns are categorised as extension patterns
126 (those that match against a file extension), basename patterns
127@@ -178,46 +225,83 @@
128 patterns.
129 """
130 def __init__(self, patterns):
131- self._regex_patterns = []
132+ self.regex_patterns = []
133 path_patterns = []
134 base_patterns = []
135 ext_patterns = []
136+ pattern_lists = {
137+ _Pattern.TYPE_FULLPATH : path_patterns,
138+ _Pattern.TYPE_EXTENSION : ext_patterns,
139+ _Pattern.TYPE_BASENAME : base_patterns,
140+ }
141 for pat in patterns:
142 pat = normalize_pattern(pat)
143- if pat.startswith(u'RE:') or u'/' in pat:
144- path_patterns.append(pat)
145- elif pat.startswith(u'*.'):
146- ext_patterns.append(pat)
147- else:
148- base_patterns.append(pat)
149- self._add_patterns(ext_patterns,_sub_extension,
150- prefix=r'(?:.*/)?(?!.*/)(?:.*\.)')
151- self._add_patterns(base_patterns,_sub_basename,
152- prefix=r'(?:.*/)?(?!.*/)')
153- self._add_patterns(path_patterns,_sub_fullpath)
154+ pattern_lists[_Pattern.identify(pat)].append(pat)
155+ for k, v in pattern_lists.iteritems():
156+ self._add_patterns(v, _Pattern.translators[k],
157+ _Pattern.prefixes[k])
158
159- def _add_patterns(self, patterns, translator, prefix=''):
160+ def _add_patterns(self, patterns, translator, prefix):
161 while patterns:
162 grouped_rules = ['(%s)' % translator(pat) for pat in patterns[:99]]
163 joined_rule = '%s(?:%s)$' % (prefix, '|'.join(grouped_rules))
164- self._regex_patterns.append((re.compile(joined_rule, re.UNICODE),
165+ self.regex_patterns.append((re.compile(joined_rule, re.UNICODE),
166 patterns[:99]))
167 patterns = patterns[99:]
168
169 def match(self, filename):
170+ """Search for a pattern that matches the given filename.
171+
172+ :return A matching pattern or None if there is no matching pattern.
173+ In case of a re.error exception,
174+ Scan self._regex_patterns.
175+ Issue a warning to the user about the bad pattern.
176+ Remove bad pattern.
177+ Retry match.
178+ """
179+ try:
180+ return self._match(filename)
181+ except re.error, e:
182+ new_regex_patterns = []
183+ for regex, patterns in self.regex_patterns:
184+ try:
185+ c = regex._compile_and_collapse()
186+ new_regex_patterns.append((regex, patterns))
187+ except re.error, e:
188+ validp, invalidp = filter_invalid_patterns(patterns)
189+ user_ignore = config.user_ignore_config_filename()
190+ bzrignore = bzrlib.IGNORE_FILENAME
191+ msg = ('Skipped invalid pattern(s):\n %s\n'
192+ 'Please fix above patterns in '
193+ '"%s" or "%s".\n'
194+ 'Bazaar commands will NOT ignore files meant to '
195+ 'match above pattern(s).' %
196+ ('\n '.join(invalidp), bzrignore, user_ignore))
197+ ui.ui_factory.show_warning(msg)
198+ if validp:
199+ # Skip invalid pattern and accumulate valid patterns
200+ # into new_regex_patterns so that they can be compiled
201+ # into a single regex for better performance.
202+ g = Globster(validp)
203+ new_regex_patterns.append(g.regex_patterns[0])
204+ self.regex_patterns = new_regex_patterns
205+ return self._match(filename)
206+
207+ def _match(self, filename):
208 """Searches for a pattern that matches the given filename.
209
210 :return A matching pattern or None if there is no matching pattern.
211 """
212- for regex, patterns in self._regex_patterns:
213+ for regex, patterns in self.regex_patterns:
214 match = regex.match(filename)
215 if match:
216 return patterns[match.lastindex -1]
217 return None
218
219+
220 class ExceptionGlobster(object):
221 """A Globster that supports exception patterns.
222-
223+
224 Exceptions are ignore patterns prefixed with '!'. Exception
225 patterns take precedence over regular patterns and cause a
226 matching filename to return None from the match() function.
227@@ -225,7 +309,7 @@
228 as regular ignores. '!!' patterns are useful to establish ignores
229 that apply under paths specified by '!' exception patterns.
230 """
231-
232+
233 def __init__(self,patterns):
234 ignores = [[], [], []]
235 for p in patterns:
236@@ -236,7 +320,7 @@
237 else:
238 ignores[0].append(p)
239 self._ignores = [Globster(i) for i in ignores]
240-
241+
242 def match(self, filename):
243 """Searches for a pattern that matches the given filename.
244
245@@ -250,6 +334,7 @@
246 else:
247 return self._ignores[0].match(filename)
248
249+
250 class _OrderedGlobster(Globster):
251 """A Globster that keeps pattern order."""
252
253@@ -259,17 +344,12 @@
254 :param patterns: sequence of glob patterns
255 """
256 # Note: This could be smarter by running like sequences together
257- self._regex_patterns = []
258+ self.regex_patterns = []
259 for pat in patterns:
260 pat = normalize_pattern(pat)
261- if pat.startswith(u'RE:') or u'/' in pat:
262- self._add_patterns([pat], _sub_fullpath)
263- elif pat.startswith(u'*.'):
264- self._add_patterns([pat], _sub_extension,
265- prefix=r'(?:.*/)?(?!.*/)(?:.*\.)')
266- else:
267- self._add_patterns([pat], _sub_basename,
268- prefix=r'(?:.*/)?(?!.*/)')
269+ pat_type = _Pattern.identify(pat)
270+ self._add_patterns([pat], _Pattern.translators[pat_type],
271+ _Pattern.prefixes[pat_type])
272
273
274 _slashes = re.compile(r'[\\/]+')
275@@ -283,3 +363,38 @@
276 if len(pattern) > 1:
277 pattern = pattern.rstrip('/')
278 return pattern
279+
280+
281+def is_pattern_valid(pattern):
282+ """Returns True if pattern is valid.
283+
284+ :param pattern: Normalized pattern.
285+ is_pattern_valid() assumes pattern to be normalized.
286+ see: globbing.normalize_pattern
287+ """
288+ result = True
289+ translator = _Pattern.translators[_Pattern.identify(pattern)]
290+ tpattern = '(%s)' % translator(pattern)
291+ try:
292+ cpattern = real_re_compile(tpattern, re.UNICODE)
293+ except re.error, e:
294+ result = False
295+ return result
296+
297+
298+def filter_invalid_patterns(patterns):
299+ """Returns valid and invalid patterns.
300+
301+ :param patterns: List of normalized patterns.
302+ filter_invalid_patterns() assumes pattern to be normalized.
303+ see: globbing.normalize_pattern
304+ """
305+ valid_patterns = []
306+ invalid_patterns = []
307+ for pat in patterns:
308+ if is_pattern_valid(pat):
309+ valid_patterns.append(pat)
310+ else:
311+ invalid_patterns.append(pat)
312+ return valid_patterns, invalid_patterns
313+
314
315=== modified file 'bzrlib/lazy_regex.py'
316--- bzrlib/lazy_regex.py 2009-03-23 14:59:43 +0000
317+++ bzrlib/lazy_regex.py 2010-06-19 13:29:28 +0000
318@@ -58,7 +58,7 @@
319
320 def _real_re_compile(self, *args, **kwargs):
321 """Thunk over to the original re.compile"""
322- return _real_re_compile(*args, **kwargs)
323+ return real_re_compile(*args, **kwargs)
324
325 def __getattr__(self, attr):
326 """Return a member from the proxied regex object.
327@@ -97,11 +97,11 @@
328 Though the first call will reset back to the original (it doesn't
329 track nesting level)
330 """
331- re.compile = _real_re_compile
332-
333-
334-_real_re_compile = re.compile
335-if _real_re_compile is lazy_compile:
336+ re.compile = real_re_compile
337+
338+
339+real_re_compile = re.compile
340+if real_re_compile is lazy_compile:
341 raise AssertionError(
342 "re.compile has already been overridden as lazy_compile, but this would" \
343 " cause infinite recursion")
344
345=== modified file 'bzrlib/tests/blackbox/test_add.py'
346--- bzrlib/tests/blackbox/test_add.py 2010-02-23 07:43:11 +0000
347+++ bzrlib/tests/blackbox/test_add.py 2010-06-19 13:29:28 +0000
348@@ -18,8 +18,10 @@
349 """Tests of the 'bzr add' command."""
350
351 import os
352+import re
353
354 from bzrlib import (
355+ ignores,
356 osutils,
357 tests,
358 )
359@@ -217,3 +219,29 @@
360 os.symlink(osutils.abspath('target'), 'tree/link')
361 out = self.run_bzr(['add', 'tree/link'])[0]
362 self.assertEquals(out, 'adding link\n')
363+
364+ def test_add_with_invalid_ignore(self):
365+ """add should handle invalid ignore pattern gracefully.
366+ """
367+ tree = self.make_branch_and_tree('.')
368+ self.build_tree(['top.txt'])
369+ self.build_tree_contents([('.bzrignore', 'RE:*.cpp\n')])
370+ out, err = self.run_bzr(['add'])
371+ self.assertEquals(out, 'adding .bzrignore\nadding top.txt\n')
372+ self.assertContainsRe(err,
373+ 'warning: Skipped invalid pattern.*RE:\*\.cpp',
374+ re.DOTALL)
375+
376+ def test_add_with_invalid_ignore_in_global_ignore(self):
377+ """add should handle invalid global ignore pattern gracefully.
378+ """
379+ tree = self.make_branch_and_tree('.')
380+ self.build_tree(['top.txt'])
381+ self.build_tree_contents([('foobar', 'foo\n')])
382+ ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
383+ out, err = self.run_bzr(['add'])
384+ self.assertEquals(out, 'adding foobar\nadding top.txt\n')
385+ self.assertContainsRe(err,
386+ 'warning: Skipped invalid pattern.*RE:\*\.cpp',
387+ re.DOTALL)
388+
389
390=== modified file 'bzrlib/tests/blackbox/test_ignore.py'
391--- bzrlib/tests/blackbox/test_ignore.py 2010-05-03 09:19:15 +0000
392+++ bzrlib/tests/blackbox/test_ignore.py 2010-06-19 13:29:28 +0000
393@@ -162,3 +162,35 @@
394 tree = self.make_branch_and_tree('a')
395 self.run_bzr(['ignore', '--directory=a', 'README'])
396 self.check_file_contents('a/.bzrignore', 'README\n')
397+
398+ def test_ignore_invalid_pattern_argument(self):
399+ """Ensure graceful handling for invalid ignore pattern.
400+
401+ Invalid pattern should show clear error message.
402+ Invalid pattern should not be added to .bzrignore file.
403+ """
404+ tree = self.make_branch_and_tree('.')
405+ out, err = self.run_bzr(['ignore', 'RE:*.cpp']) # invalid pattern
406+ self.assertEqual(out, '')
407+ self.assertContainsRe(err,
408+ 'bzr: warning: Skipped invalid.*RE:\*\.cpp', re.DOTALL)
409+ self.assertFalse(os.path.isfile('.bzrignore'))
410+
411+ def test_invalid_pattern_in_ignore_file(self):
412+ """Existing invalid patterns should be shown.
413+ """
414+ tree = self.make_branch_and_tree('.')
415+ self.build_tree(['file'])
416+ self.build_tree_contents([('.bzrignore', '*.c\nRE:[\n')])
417+ out, err = self.run_bzr(['ignore', 'RE:*.cpp', 'abc'])
418+ self.assertEqual(out, '')
419+ self.assertContainsRe(err, 'warning: Skipped invalid pattern'
420+ '.*RE:\*\.cpp.*'
421+ 'warning: Skipped invalid pattern.*RE:\[', re.DOTALL)
422+ f = open('.bzrignore', 'r')
423+ try:
424+ lines = f.readlines()
425+ self.assertTrue('abc\n' in lines)
426+ finally:
427+ f.close()
428+
429
430=== modified file 'bzrlib/tests/blackbox/test_ignored.py'
431--- bzrlib/tests/blackbox/test_ignored.py 2010-05-02 20:10:25 +0000
432+++ bzrlib/tests/blackbox/test_ignored.py 2010-06-19 13:29:28 +0000
433@@ -17,6 +17,8 @@
434
435 """Tests of the 'bzr ignored' command."""
436
437+import re
438+
439 from bzrlib.tests.blackbox import ExternalBase
440
441
442@@ -46,3 +48,16 @@
443 ('a/.bzrignore', 'README')])
444 out, err = self.run_bzr(['ignored', '--directory=a'])
445 self.assertStartsWith(out, 'README')
446+
447+ def test_invalid_pattern_in_ignore_file(self):
448+ """Existing invalid patterns should be shown.
449+ """
450+ tree = self.make_branch_and_tree('.')
451+ self.build_tree(['file', 'foo.c'])
452+ self.build_tree_contents([('.bzrignore', '*.c\nRE:[\n')])
453+ out, err = self.run_bzr(['ignored'])
454+ self.assertContainsRe(out, 'foo\.c[ ]+\*\.c')
455+ self.assertContainsRe(err,
456+ 'warning: Skipped invalid pattern.*RE:\[',
457+ re.DOTALL)
458+
459
460=== modified file 'bzrlib/tests/blackbox/test_ls.py'
461--- bzrlib/tests/blackbox/test_ls.py 2010-05-02 20:10:25 +0000
462+++ bzrlib/tests/blackbox/test_ls.py 2010-06-19 13:29:28 +0000
463@@ -16,6 +16,7 @@
464
465 """External tests of 'bzr ls'"""
466
467+import re
468 import os
469
470 from bzrlib import ignores, osutils
471@@ -244,3 +245,25 @@
472 self.wt.commit('commit')
473 self.ls_equals('sub/\nsub/file\n', '--directory=dir')
474 self.ls_equals('sub/file\n', '-d dir sub')
475+
476+ def test_ls_invalid_ignore_pattern(self):
477+ """Tests status with invalid ignore pattern"""
478+ f = open('.bzrignore', 'w')
479+ try:
480+ f.write(u"RE:*.cpp\n") # invalid pattern
481+ finally:
482+ f.close()
483+ self.build_tree(['bye.c'])
484+ out, err = self.run_bzr(['ls', '-i'])
485+ self.assertEquals(out, '')
486+ self.assertContainsRe(err,
487+ 'warning: Skipped invalid pattern.*RE:\*\.cpp',
488+ flags=re.DOTALL)
489+ out, err = self.run_bzr(['ls', '-u'])
490+ self.assertEquals(out, '.bzrignore\na\nbye.c\n')
491+ self.assertContainsRe(err,
492+ 'warning: Skipped invalid pattern.*RE:\*\.cpp',
493+ flags=re.DOTALL)
494+
495+
496+
497
498=== modified file 'bzrlib/tests/blackbox/test_status.py'
499--- bzrlib/tests/blackbox/test_status.py 2010-04-07 21:34:13 +0000
500+++ bzrlib/tests/blackbox/test_status.py 2010-06-19 13:29:28 +0000
501@@ -25,6 +25,7 @@
502 from cStringIO import StringIO
503 import codecs
504 from os import mkdir, chdir, rmdir, unlink
505+import re
506 import sys
507 from tempfile import TemporaryFile
508
509@@ -32,6 +33,7 @@
510 bzrdir,
511 conflicts,
512 errors,
513+ ignores,
514 osutils,
515 )
516 import bzrlib.branch
517@@ -701,6 +703,73 @@
518 out, err = self.run_bzr('status tree/a')
519 self.assertNotContainsRe(out, 'pending merge')
520
521+ def test_status_invalid_ignore_pattern(self):
522+ """Tests status with invalid ignore pattern"""
523+ wt = self.make_branch_and_tree('.')
524+ f = open('.bzrignore', 'w')
525+ try:
526+ f.write(u"RE:*.cpp\n") # invalid pattern
527+ finally:
528+ f.close()
529+ wt.commit('commit .bzrignore')
530+ self.build_tree(['bye.c'])
531+ out, err = self.run_bzr(['status'])
532+ self.assertEquals(out, 'unknown:\n .bzrignore\n bye.c\n')
533+ self.assertContainsRe(err,
534+ 'warning: Skipped invalid pattern.*RE:\*\.cpp',
535+ flags=re.DOTALL)
536+
537+ def test_status_invalid_ignore_pattern_in_global_ignore(self):
538+ """Tests status with invalid ignore pattern in .bazaar/ignore"""
539+ wt = self.make_branch_and_tree('.')
540+ self.build_tree(['file'])
541+ wt.add('file')
542+ wt.commit('one')
543+ # add unknown file
544+ f = open('foobar', 'w')
545+ try:
546+ f.write(u"foo\n") # invalid pattern
547+ finally:
548+ f.close()
549+ ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
550+ out, err = self.run_bzr(['status'])
551+ self.assertEquals(out, 'unknown:\n foobar\n')
552+ self.assertContainsRe(err,
553+ 'warning: Skipped invalid.*RE:\*\.cpp',
554+ flags=re.DOTALL)
555+
556+ def test_short_status_invalid_ignore_pattern(self):
557+ """Tests short status with invalid ignore pattern"""
558+ wt = self.make_branch_and_tree('.')
559+ f = open('.bzrignore', 'w')
560+ try:
561+ f.write(u"RE:*.cpp\n") # invalid pattern
562+ finally:
563+ f.close()
564+ wt.commit('commit .bzrignore')
565+ out, err = self.run_bzr(['status', '-S'])
566+ self.assertEquals(out, '? .bzrignore\n')
567+ self.assertContainsRe(err,
568+ 'warning: Skipped invalid.*RE:\*\.cpp',
569+ flags=re.DOTALL)
570+
571+ def test_specific_status_invalid_ignore_pattern(self):
572+ """Tests specific status with invalid ignore pattern"""
573+ wt = self.make_branch_and_tree('.')
574+ self.build_tree(['file'])
575+ wt.add('file')
576+ wt.commit('added file')
577+ f = open('.bzrignore', 'w')
578+ try:
579+ f.write(u"RE:*.cpp\n") # invalid pattern
580+ finally:
581+ f.close()
582+ wt.commit('commit .bzrignore')
583+ out, err = self.run_bzr(['status', 'file'])
584+ self.assertEquals(out, '')
585+ self.assertContainsRe(err,
586+ 'warning: Skipped invalid.*RE:\*\.cpp',
587+ flags=re.DOTALL)
588
589 class TestStatusEncodings(TestCaseWithTransport):
590
591
592=== modified file 'bzrlib/tests/test_globbing.py'
593--- bzrlib/tests/test_globbing.py 2010-02-17 17:11:16 +0000
594+++ bzrlib/tests/test_globbing.py 2010-06-19 13:29:28 +0000
595@@ -16,10 +16,13 @@
596 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
597
598 from bzrlib.globbing import (
599+ ExceptionGlobster,
600+ filter_invalid_patterns,
601 Globster,
602- ExceptionGlobster,
603+ is_pattern_valid,
604+ normalize_pattern,
605 _OrderedGlobster,
606- normalize_pattern
607+ _Pattern,
608 )
609 from bzrlib.tests import (
610 TestCase,
611@@ -37,11 +40,11 @@
612 for name in positive:
613 self.failUnless(globster.match(name), repr(
614 u'name "%s" does not match glob "%s" (re=%s)' %
615- (name, glob, globster._regex_patterns[0][0].pattern)))
616+ (name, glob, globster.regex_patterns[0][0].pattern)))
617 for name in negative:
618 self.failIf(globster.match(name), repr(
619 u'name "%s" does match glob "%s" (re=%s)' %
620- (name, glob, globster._regex_patterns[0][0].pattern)))
621+ (name, glob, globster.regex_patterns[0][0].pattern)))
622
623 def assertMatchBasenameAndFullpath(self, matchset):
624 # test basename matcher
625@@ -371,3 +374,37 @@
626 """tests that multiple mixed slashes are collapsed to single forward
627 slashes and trailing mixed slashes are removed"""
628 self.assertEqual(u'/foo/bar', normalize_pattern(u'\\/\\foo//\\///bar/\\\\/'))
629+
630+class TestInvalidPattern(TestCase):
631+
632+ def test_is_pattern_valid(self):
633+ """Ensure is_pattern_valid() detects invalid patterns.
634+ """
635+ self.assertTrue(is_pattern_valid(normalize_pattern(u'foo')))
636+ self.assertFalse(is_pattern_valid(normalize_pattern(u'RE:*.cpp')))
637+
638+ def test_filter_invalid_patterns(self):
639+ """Ensure filter_invalid_patterns() returns only valid patterns.
640+ """
641+ patterns = [u'abc', u'RE:*.cpp', u'[', u'def']
642+ patterns = [normalize_pattern(p) for p in patterns]
643+ ref_valid_patterns = set([u'abc', u'def'])
644+ ref_invalid_patterns = set([u'RE:*.cpp', u'['])
645+ valid_patterns, invalid_patterns = filter_invalid_patterns(patterns)
646+ self.assertEqual(ref_valid_patterns, set(valid_patterns))
647+ self.assertEqual(ref_invalid_patterns, set(invalid_patterns))
648+
649+class TestPatternClass(TestCase):
650+
651+ def test_pattern_category(self):
652+ """Ensure pattern categories are identified correctly.
653+ """
654+ fullpath0 = normalize_pattern(u'/home/jrandom')
655+ fullpath1 = normalize_pattern(u'RE:foobar')
656+ extension = normalize_pattern(u'*.py')
657+ basename = normalize_pattern(u'random_file')
658+ self.assertEqual(_Pattern.identify(fullpath0), _Pattern.TYPE_FULLPATH)
659+ self.assertEqual(_Pattern.identify(fullpath1), _Pattern.TYPE_FULLPATH)
660+ self.assertEqual(_Pattern.identify(extension), _Pattern.TYPE_EXTENSION)
661+ self.assertEqual(_Pattern.identify(basename), _Pattern.TYPE_BASENAME)
662+
663
664=== modified file 'bzrlib/tests/test_ignores.py'
665--- bzrlib/tests/test_ignores.py 2010-05-03 04:51:58 +0000
666+++ bzrlib/tests/test_ignores.py 2010-06-19 13:29:28 +0000
667@@ -217,3 +217,13 @@
668 ignores.tree_ignores_add_patterns(tree, ["myentry2", "foo"])
669 self.assertEquals(open('.bzrignore', 'rb').read(), 'myentry1\r\nmyentry2\r\nfoo\r\n')
670 self.assertPatternsEquals(["myentry1", "myentry2", "foo"])
671+
672+ def test_bad_pattern(self):
673+ """tree.is_ignored() should handle bad patterns.
674+ """
675+ tree = self.make_branch_and_tree(".")
676+ self.build_tree(['foo.c'])
677+ self.build_tree_contents([('.bzrignore', u"RE:*.cpp\nfoo.py\n")])
678+ tree.add([".bzrignore"])
679+ self.assertFalse(tree.is_ignored('foo.c'))
680+