>>>>> Robert Collins <email address hidden> writes:
> we did two simultaneous things a while back:
> - turned off -1, so that we get *all* failures
+1
> - turned on subunit, so that the failures are machine driven
ECANTPARSE
> The emails are meant to be compressed, so you should in principle
> be able to get it reliably.
In practice I only get mails for merge errors. For any error in the
test suite... nothing is delivered.
> If we're actually running into mailer limits, we can do a few
> things:
> - turn off subunit (will reduce some of the overhead - lots in a
> success case, but in a it-all-goes-pear-shaped failure, you'll
> still hit the limit and whatever undiagnosed and unfixed issue is
> biting you now, will bite you then.
Yup, exactly my feeling.
> - diagnose and fix the mailer path issue so that you receive the
> mails reliably
As a data point, John got an email for a failure and *I* didn't get it
for the same failure (or so very closely the same failure that the
difference doesn't matter).
This means the problem is on my side but I don't know where exactly. I
use fetchmail and there is *nothing* in the logs so that also rules out
the delivery part. That leaves my ISP mail server... and I don't control
it. That's me, but it could apply to anybody else, we just don't know.
> - stop using email to send the results
Or store the results somewhere (deleting them after a day/week/month
whatever).
Or send two mails:
- one indicating the failure
- one with the stream attached.
At least with that I will *know* there was a failure.
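
To make the two-mail idea concrete, here is a minimal sketch in Python
using only the standard library. It is not how PQM builds its mails: the
function name, subject lines, addresses and attachment filename are all
hypothetical, and actually sending the messages is left out.

```python
import gzip
from email.message import EmailMessage


def build_failure_mails(sender, recipient, branch, stream_bytes):
    """Return (notice, detail) for one failed run.

    Hypothetical helper: the notice is tiny so it should pass any size
    limit; the detail mail carries the gzip-compressed raw stream.
    """
    notice = EmailMessage()
    notice["From"] = sender
    notice["To"] = recipient
    notice["Subject"] = f"[pqm] FAILURE: {branch}"
    notice.set_content(
        f"The test run for {branch} failed.\n"
        "A second mail with the full stream attached should follow.\n"
    )

    detail = EmailMessage()
    detail["From"] = sender
    detail["To"] = recipient
    detail["Subject"] = f"[pqm] FAILURE (stream attached): {branch}"
    detail.set_content("Compressed test stream attached.\n")
    # Attaching bytes turns the message into multipart/mixed.
    detail.add_attachment(
        gzip.compress(stream_bytes),
        maintype="application",
        subtype="gzip",
        filename="tests.subunit.gz",  # assumed name, for illustration only
    )
    return notice, detail
```

If the second mail gets dropped somewhere along the path, the first one
still tells me a failure happened and that a stream went missing.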
> - filter the stream at source so success and other results that
> you consider uninteresting are not included in the 'raw'
> stream. Note that again, this will *leave the failure mode in
> place so when it goes bad you will stop getting the responses*
Another idea would be to have a different pqm instance used only for
known failures and make the output for the regular pqm include only
*failures* and get rid of *all* the rest.
> Now, I'm not directly experiencing this anymore, so you need to do
> what makes sense to you.
Thanks for your thoughts anyway.
> If subunit's API is out of date and messing up the reason
> formatting, I'll happily fix it - I'm going to go peek and see if
> its addSkip is stale right after mailing this.
Cool, any feedback ?
> To stop using email to send the results, we either need to design
> a new thing and change PQM to do it, or reenable the LP API
> support, which Tim requested we disable.
And what would that give us ? Failed run stream attached to the mp as an
attachment ?
> We previously *haven't* filtered at source as a defensive measure
> against bugs. Python 2 is riddled with default-encoding pitfalls
> and they made subunit + bzr very unreliable at one point. Possibly
> totally fixed thanks to Martin[gz].
I still expect a few (tiny but annoying) bugs to be fixed there...
> Personally, I would not take any action that leaves the problem
> fallow waiting for a serious test failure to provoke it: that just
> means that when you need it most it will fail.
+1
> However, as I say above: you're using this system, I'm rarely
> using it : do what makes sense to you. I do strongly suggest that
> you change things in PQM itself if you want filtering, rather than
> bzr. That will at least preserve the detailed output for babune if
> you do need it in future.
Babune doesn't use 'make check' and will still use subunit, that's
unrelated.
Since I administer babune, I don't have any problem with tracking
subunit and testtools trunks. The problem with PQM, as I see it, is that
we suffer a lot from not being able to control the full stack that
decides whether or not a commit should be landed.
Don't get me wrong on this, it's good that we don't tweak it endlessly
either, but I'd prefer a reliable solution and if this means making
'make check' simpler (even if less powerful), so be it.
> As for the value of logs on success, xfail, skip etc : *if* they pass
> incorrectly, I think the log will be invaluable, but you won't need it
> till you need it.
+1
Vincent