Bazaar

Code review comment for lp:~parthm/bzr/300062-better-handling-for-invalid-ignore-pattern

300062-better-handling-for-invalid-ignore-pattern
Merge into bzr.dev

Revision history for this message

Parth Malwankar (parthm) wrote on 2010-06-28:

> 1) It might be clearer to resubmit this, since it has changed fairly
> significantly since the beginning. (So I don't think Robert's Needs Fixing
> still directly applies.)
>
> 2)
> 61 +class InvalidPattern(BzrError):
> 62 +
> 63 + _fmt = ('Invalid pattern(s) found.')
> 64 +
> 65 + def __init__(self, message):
> 66 + self.message = message
> 67 +
>
>
> ^- You set a message, but then never display it (there is nothing in _fmt
> which shows the attributes.)
>

Good catch. Will fix it and resubmit.

> I suppose in builtins you do:
> 29 + if invalid_patterns:
> 30 + pats = '\n '.join(invalid_patterns)
> 31 + ui.ui_factory.show_warning(
> 32 + 'Skipped invalid pattern(s):\n %s' % pats)
> ^- which explicitly displays the bad patterns, but then you do so in a warning
> and not an error. Which seems to disagree with your "now raises exceptions"
> comment.
>

This is for the `bzr ignore` command where if the user gives 'bzr good1 good2 bad1 good3 bad2', good1-3 get added to .bzrignore and a warning is issued for bad1-2.

> 3) It seems very strange to me that 'show_pattern_errors' is a module-level
> function that re-reads the ignore files and reports on them.
>
> Rather than, for example, having the state of the Globster used to determine
> what patterns are invalid.
>
> It means that anyone using Globster for anything that isn't *bzr* (imagine
> qbzr using it for something) is going to get strange results. (As it will tell
> them about problems with something that isn't related at all.)
>

One reason for doing this was that the invalid pattern could come either from .bazaar/ignore of .bzrignore and I wanted to display a message indicating the bad pattern along with the specific file information. E.g.

[a1234]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr st
bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/a1234/.bzrignore":
RE:*.x
bzr: error: Invalid pattern(s) found in "/home/parthm/.bazaar/ignore":
RE:[
bzr: ERROR: Invalid pattern(s) found.
[a1234]%

We could use the Globster state to display the message but the user would need to check both files. I would expect that the probability of an error in .bzrignore is higher but it just seems nicer to indicate the file.
Thoughts?

> 4) so for the construct here:
> 105 + except errors.InvalidPattern, e:
> 106 + # lazy_regex throws InvalidPattern in case of bad pattern.
> 107 + # For Globbing, the bad pattern may have come from an ignore
> 108 + # file, so we should any bad patterns that the ignore file(s)
> 109 + # may have.
> 110 + show_pattern_errors()
> 111 + raise errors.InvalidPattern(str(e))
>
> It seems better to me to do the formatting and put that on the
> errors.InvalidPattern that is getting raised, rather than calling
> show_pattern_errors() and having that display.
>
> I also find it strange to catch InvalidPattern and then raise another
> InvalidPattern(str(first_e))
>
> Especially given your formatting for InvalidPattern this would become:
> InvalidPattern('Invalid pattern(s) found.')
>
> Which would lose any sort of context that the original error tried to pass on.
> (So if a low level exception raised InvalidPattern('pattern XXX is broken')
> then you catch it and re-raise a InvalidPattern('Invalid pattern(s) found.')
> which is pretty unhelpful.)
>

You are right. I will fix this.

> 5) In this test:
> 351 + def test_add_invalid_ignore_in_global_ignore(self):
> 352 + """add should handle invalid global ignore pattern gracefully.
> 353 + """
> 354 + tree = self.make_branch_and_tree('.')
> 355 + self.build_tree(['top.txt'])
> 356 + self.build_tree_contents([('foobar', 'foo\n')])
> 357 + ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
> 358 + out, err = self.run_bzr(['add'], 3)
> 359 + self.assertEquals(out, '')
> 360 + self.assertContainsRe(err,
> 361 + ('error: Invalid pattern.*RE:\*\.cpp.*ERROR: '
> 362 + 'Invalid pattern'),
> 363 + re.DOTALL)
> 364 +
>
>
> ^- You make it so that 'ignores.add_unique_user_ignores()' allows an invalid
> pattern. It seems more logical to fault that at insert time, rather than at
> 'bzr add' time. (We probably still need to fail there, but we should first
> forbid people from adding broken entries in the first place.)
>

This is done in cmd_ignore.run. This was the "Skipped invalid pattern(s)" warning. So the bad pattern doesn't reach this point. This test case injects the bad pattern to handle the case of when the user might add the bad pattern using and external editor.

>
> ^- I would probably say that we should fail all insertions if one is invalid,
> but we would want it to be atomic. eg if the do 'bzr ignore valid valid
> invalid valid' either all valids get inserted [as you have today] or nothing
> gets inserted. You don't want to add the first two, and then fail, as it
> leaves things in an unknown state.
>
> I would probably tend towards 'validate early and fail', as long as we are
> choosing that for other commands. (I would still like to see status try even
> if it gets a bogus ignore, but I understand why Robert feels so strongly about
> 'add' even if I probably disagree.)
>
> Also, it is a little bit odd that if you *have* an invalid entry it fails with
> an error, but if you try to insert one it just gives you a warning that it
> skipped it. (maybe one of those is 'ignored' vs 'ignore'.)
>

For uniformity I could make ignore fail if it contains a bad pattern. At the moment, it filters out the bad pattern and adds the valid ones.

>
>
> 5) So overall I think this is going in the right direction, but the layering
> for 'what patterns are invalid' seems a bit incorrect. And the formatting and
> raising of the exceptions also doesn't seem correct.

> 1) It might be clearer to resubmit this, since it has changed fairly
> significantly since the beginning. (So I don't think Robert's Needs Fixing
> still directly applies.)
> 
> 2)
> 61      +class InvalidPattern(BzrError):
> 62      +
> 63      + _fmt = ('Invalid pattern(s) found.')
> 64      +
> 65      + def __init__(self, message):
> 66      + self.message = message
> 67      +
> 
> 
> ^- You set a message, but then never display it (there is nothing in _fmt
> which shows the attributes.)
>

Good catch. Will fix it and resubmit.

> I suppose in builtins you do:
> 29      + if invalid_patterns:
> 30      +   pats = '\n '.join(invalid_patterns)
> 31      +   ui.ui_factory.show_warning(
> 32      +     'Skipped invalid pattern(s):\n %s' % pats)
> ^- which explicitly displays the bad patterns, but then you do so in a warning
> and not an error. Which seems to disagree with your "now raises exceptions"
> comment.
>

This is for the `bzr ignore` command where if the user gives 'bzr good1 good2 bad1 good3 bad2', good1-3 get added to .bzrignore and a warning is issued for bad1-2.

> 3) It seems very strange to me that 'show_pattern_errors' is a module-level
> function that re-reads the ignore files and reports on them.
> 
> Rather than, for example, having the state of the Globster used to determine
> what patterns are invalid.
> 
> It means that anyone using Globster for anything that isn't *bzr* (imagine
> qbzr using it for something) is going to get strange results. (As it will tell
> them about problems with something that isn't related at all.)
>

[a1234]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr st
bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/a1234/.bzrignore":
  RE:*.x
bzr: error: Invalid pattern(s) found in "/home/parthm/.bazaar/ignore":
  RE:[
bzr: ERROR: Invalid pattern(s) found.
[a1234]%

> 4) so for the construct here:
> 105     + except errors.InvalidPattern, e:
> 106     + # lazy_regex throws InvalidPattern in case of bad pattern.
> 107     + # For Globbing, the bad pattern may have come from an ignore
> 108     + # file, so we should any bad patterns that the ignore file(s)
> 109     + # may have.
> 110     + show_pattern_errors()
> 111     + raise errors.InvalidPattern(str(e))
> 
> It seems better to me to do the formatting and put that on the
> errors.InvalidPattern that is getting raised, rather than calling
> show_pattern_errors() and having that display.
> 
> I also find it strange to catch InvalidPattern and then raise another
> InvalidPattern(str(first_e))
> 
> Especially given your formatting for InvalidPattern this would become:
>     InvalidPattern('Invalid pattern(s) found.')
> 
> Which would lose any sort of context that the original error tried to pass on.
> (So if a low level exception raised InvalidPattern('pattern XXX is broken')
> then you catch it and re-raise a InvalidPattern('Invalid pattern(s) found.')
> which is pretty unhelpful.)
>

You are right. I will fix this.

> 5) In this test:
> 351     + def test_add_invalid_ignore_in_global_ignore(self):
> 352     + """add should handle invalid global ignore pattern gracefully.
> 353     + """
> 354     + tree = self.make_branch_and_tree('.')
> 355     + self.build_tree(['top.txt'])
> 356     + self.build_tree_contents([('foobar', 'foo\n')])
> 357     + ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
> 358     + out, err = self.run_bzr(['add'], 3)
> 359     + self.assertEquals(out, '')
> 360     + self.assertContainsRe(err,
> 361     + ('error: Invalid pattern.*RE:\*\.cpp.*ERROR: '
> 362     + 'Invalid pattern'),
> 363     + re.DOTALL)
> 364     +
> 
> 
> ^- You make it so that 'ignores.add_unique_user_ignores()' allows an invalid
> pattern. It seems more logical to fault that at insert time, rather than at
> 'bzr add' time. (We probably still need to fail there, but we should first
> forbid people from adding broken entries in the first place.)
>

> 
> ^- I would probably say that we should fail all insertions if one is invalid,
> but we would want it to be atomic. eg if the do 'bzr ignore valid valid
> invalid valid' either all valids get inserted [as you have today] or nothing
> gets inserted. You don't want to add the first two, and then fail, as it
> leaves things in an unknown state.
> 
> I would probably tend towards 'validate early and fail', as long as we are
> choosing that for other commands. (I would still like to see status try even
> if it gets a bogus ignore, but I understand why Robert feels so strongly about
> 'add' even if I probably disagree.)
> 
> Also, it is a little bit odd that if you *have* an invalid entry it fails with
> an error, but if you try to insert one it just gives you a warning that it
> skipped it. (maybe one of those is 'ignored' vs 'ignore'.)
>

For uniformity I could make ignore fail if it contains a bad pattern. At the moment, it filters out the bad pattern and adds the valid ones.

> 
> 
> 5) So overall I think this is going in the right direction, but the layering
> for 'what patterns are invalid' seems a bit incorrect. And the formatting and
> raising of the exceptions also doesn't seem correct.

« Back to merge proposal