Code review comment for lp:~mbp/bzr/test-errors

Martin Pool (mbp) wrote:

> What was the failure from PQM for this?
>
> I'm a little scared by some of these changes, though I think overall it's
> moving in the right direction.
>
> + self.repeat_failures_at_end = False
>
> My understanding of the proposed flag was it would be either during *or* at
> the end, never both.

This was a bit of a compromise towards the people who wanted it at the end. I think I will just pull it out. <https://bugs.launchpad.net/bzr/+bug/625597>.

>
> +class BaseTextTestResult(ExtendedTestResult):
>
> Why a new class rather than just putting the method you want to share on
> ExtendedTestResult?

My idea was that it could be good to separate "general extended test result" from "extended test result, formatted as text". If the parent class already has text-specific methods there may be no point in the split (and perhaps there isn't one even if it doesn't). I'll look.
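
Concretely, the split I have in mind looks something like this (a rough sketch; apart from the two class names and _test_description, everything here is illustrative rather than the real bzrlib code):

    import unittest

    class ExtendedTestResult(unittest.TestResult):
        """General result bookkeeping; nothing here about output format."""

        def _record_outcome(self, test, kind):
            pass    # hypothetical bookkeeping hook, for illustration only

    class BaseTextTestResult(ExtendedTestResult):
        """Adds the text-formatting helpers shared by the text reporters."""

        def _test_description(self, test):
            # One-line human-readable description; the real method is richer.
            return test.id()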

> + u'ERROR: %s\n' % self._test_description(test))
>
> Using u"" ascii literals is well intentioned but isn't really useful. This is
> something of a common misconception. All it amounts to is asking for an error
> from this interpolation rather than elsewhere in the stack. Having the file
> object reject non-ascii bytestrings is a better choice.

This was done in response to a test failure, and I think it does make sense: we want to keep the text as unicode until as late as possible, and then the file can either reject it or (what we actually want) discard unrepresentable characters.
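
Roughly what I mean, as a sketch (assuming Python 2, which is what we currently target; the test id is made up):

    import codecs
    import sys

    # Keep the message as unicode right up to the output stream, and let
    # the stream writer discard anything the encoding cannot represent.
    out = codecs.getwriter('ascii')(sys.stdout, errors='replace')

    description = u'test_caf\xe9'             # a non-ASCII test id
    out.write(u'ERROR: %s\n' % description)   # writes 'ERROR: test_caf?'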

> def _finishLogFile(self):
> ...
> + log_contents = self.get_log()
>
> Moving this logic scares me. Robert did a lot of complicated coding to make
> sure the log was available from test initialisation rather than only at the
> end, which is partly why this code is still strange. If we really don't want
> the log until _finishLogFile is called, most of the code can be removed and
> replaced with a single addDetails call here.

What behaviour do we actually want from the log file while a test is running? And how many tests count on the existing behaviour?

It seems to me that letting tests inspect the log of every message emitted since the test started is not really a good idea. I have written such tests myself, but they are fairly imprecise: if we are trying to check that a particular operation writes some debug messages, we normally want to check that it happens during a particular call, not at any point in the run to date. I believe holding the whole log in memory has also caused excess memory usage (and maybe you, gz, fixed that?)

I'll have a look into whether we can simplify this more.
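
To illustrate the kind of scoped check I mean, a hypothetical sketch using the stdlib logging module (bzrlib's trace module would need an equivalent hook, and operation_under_test is a placeholder):

    import logging
    from contextlib import contextmanager

    @contextmanager
    def capture_log(logger_name):
        # Collect only the records emitted while the context is active.
        records = []
        handler = logging.Handler()
        handler.emit = records.append          # stash records instead of writing
        logger = logging.getLogger(logger_name)
        old_level = logger.level
        logger.setLevel(logging.DEBUG)         # let debug messages through
        logger.addHandler(handler)
        try:
            yield records
        finally:
            logger.removeHandler(handler)
            logger.setLevel(old_level)

    # Usage: assert on messages from one call, not the whole run to date.
    with capture_log('bzr') as records:
        operation_under_test()                 # hypothetical operation
    assert any('expected debug message' in r.getMessage() for r in records)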

>
> + if type(codec) is tuple:
> + # Python 2.4
> + encode = codec[0]
>
> This code being copied out to a helper method can lose the compat stuff now.

Thanks.
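
For the record, with 2.4 gone the helper reduces to attribute access on the CodecInfo object that codecs.lookup now returns (a sketch):

    import codecs

    # Since Python 2.5, codecs.lookup returns a CodecInfo object, so the
    # tuple-indexing fallback can go away entirely.
    codec = codecs.lookup('utf-8')
    encode = codec.encode                   # was: codec[0] on Python 2.4
    data, consumed = encode(u'caf\xe9')     # -> ('caf\xc3\xa9', 4)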

>
> + # This isn't an expected failure in bzr, because there's nothing
> + # reasonable we can do about old buggy subunits.
>
> This was an expected failure because we want to depend on a minimum subunit
> version to avoid failures, and remove the need for the hack in the test, see
> bug 531667.

OK, I see. I think we should only have xfails for things that would be valid bugs against bzr, and ideally in trunk only for things that actually are open bugs. I suppose we could have a bzr bug saying that this ugly hack exists because subunit doesn't expose a version number, and that once it does we can clean up bzr. But I don't think that quite rises to the level of deserving a bug; an XXX comment would be enough.

I don't want us to have an ever-increasing number of known failures, because they'll just be ignored. It would be OK if every open bug had an xfail test, and the count would admittedly be pretty high, but we could at least in principle go through and fix them all.
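
That is, something like this instead of an expectedFailure decoration (a sketch; the test body is a placeholder):

    import unittest

    class TestSubunitCompat(unittest.TestCase):

        def test_old_subunit_hack(self):
            # XXX: old buggy subunit mangles this stream, and subunit
            # exposes no version number we could depend on (bug 531667).
            # Carry the workaround with this comment rather than marking
            # the test as an expected failure against bzr.
            self.assertTrue(True)   # placeholder for the real assertion

    if __name__ == '__main__':
        unittest.main()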
