> ^- I think factoring the patterns into a dict indirected code is interesting.
> I'm a bit concerned about:
> for pattern_type, patterns in pattern_lists.iteritems():
>
> I'm pretty sure we explicitly intended the order of 'check extensions' then
> 'check basename' then 'check fullpath'. So that we can do checks on just the
> little bit of the content before we have to check against the whole content. I
> don't have any explicit proof of this, but turning the current deterministic
> ordering with a random one doesn't seem ideal.
>
Makes sense. I have updated the patch to make _add_patterns order deterministic.
>
> otherwise the change seems fine to me. I may be over-concerned, but I do
> believe that 'is_ignored()' is a fair amount of overhead in the 'initial add
> of 30k files', and I'd like us to at least be aware of it. I do wonder about
> changing the internals so that instead of using '?:.*/)?(?!.*/)) to filter out
> everything but the basename for some patterns, if we wouldn't be better off
> knowing that this is a basename-only pattern and just supply that to the
> match.
>
Filed bug #607258 to track this.
Thanks for the review.
>
> I'm certainly willing to discuss it, but the easiest way to move forward is to
> just make the order of _add_patterns deterministic by iterating over a fixed
> ordering.
> ^- I think factoring the patterns into a dict indirected code is interesting. lists.iteritems ():
> I'm a bit concerned about:
> for pattern_type, patterns in pattern_
>
> I'm pretty sure we explicitly intended the order of 'check extensions' then
> 'check basename' then 'check fullpath'. So that we can do checks on just the
> little bit of the content before we have to check against the whole content. I
> don't have any explicit proof of this, but turning the current deterministic
> ordering with a random one doesn't seem ideal.
>
Makes sense. I have updated the patch to make _add_patterns order deterministic.
>
> otherwise the change seems fine to me. I may be over-concerned, but I do
> believe that 'is_ignored()' is a fair amount of overhead in the 'initial add
> of 30k files', and I'd like us to at least be aware of it. I do wonder about
> changing the internals so that instead of using '?:.*/)?(?!.*/)) to filter out
> everything but the basename for some patterns, if we wouldn't be better off
> knowing that this is a basename-only pattern and just supply that to the
> match.
>
Filed bug #607258 to track this.
Thanks for the review.
>
> I'm certainly willing to discuss it, but the easiest way to move forward is to
> just make the order of _add_patterns deterministic by iterating over a fixed
> ordering.