~wgrant/launchpad:buildd-manager-nicer-retries

Last commit made on 2023-10-29
Get this branch:
git clone -b buildd-manager-nicer-retries https://git.launchpad.net/~wgrant/launchpad

Branch merges

Branch information

Name:
buildd-manager-nicer-retries
Repository:
lp:~wgrant/launchpad

Recent commits

272d66a... by William Grant

Cope more gracefully with intermittent builder glitches

buildd-manager would previously immediately count any single scan
failure against the builder and job. This meant that three glitches --
say, network timeouts -- over the course of a job would result in the
build being requeued. A builder's failure count is reset on successful
dispatch, but a job's deliberately isn't since we want to fail builds
that are repeatedly killing builders. This meant that a single network
glitch in the second attempt at a build would cause it to be failed.

This added layer of failure counting substantially reduces the
likelihood of those two scenarios, by requiring five consecutive
unsuccessful scans before a single failure is counted against a builder
or job. This means that brief network interruptions, or indeed temporary
insanity on buildd-manager's part, should no longer cause builds to be
requeued or failed at all.

The only significant downside of this change is that recovery from
legitimate failures will now take a few minutes longer. But that's much
less of a concern with the very large build farm we have nowadays.
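
The counting scheme described in this commit message can be sketched roughly as follows. This is only an illustration of the idea, not the code in manager.py: the class name, attribute names, and the choice to restart the streak after a failure is counted are all assumptions made for the example. Only the threshold of five consecutive unsuccessful scans, and the fact that a builder's count is reset on successful dispatch while a job's is not, come from the message itself.

```python
from dataclasses import dataclass

# From the commit message: five consecutive unsuccessful scans are needed
# before a single failure is counted. The constant name is hypothetical.
SCAN_FAILURE_THRESHOLD = 5


@dataclass
class FailureTracker:
    """Illustrative counters only; not Launchpad's actual classes."""

    consecutive_scan_failures: int = 0
    builder_failure_count: int = 0
    job_failure_count: int = 0

    def scan_succeeded(self) -> None:
        # Any successful scan breaks the streak, so isolated glitches
        # never accumulate into a counted failure.
        self.consecutive_scan_failures = 0

    def scan_failed(self) -> bool:
        """Record one failed scan; return True if a failure was counted."""
        self.consecutive_scan_failures += 1
        if self.consecutive_scan_failures < SCAN_FAILURE_THRESHOLD:
            return False
        # Assumption: the streak restarts once a failure has been counted.
        self.consecutive_scan_failures = 0
        self.builder_failure_count += 1
        self.job_failure_count += 1
        return True

    def dispatch_succeeded(self) -> None:
        # The builder's count is reset on successful dispatch; the job's
        # deliberately is not, so repeatedly failing builds still fail.
        self.builder_failure_count = 0


tracker = FailureTracker()
for _ in range(4):
    tracker.scan_failed()   # four glitches in a row: nothing counted yet
tracker.scan_succeeded()    # one good scan clears the streak
```

Under these assumptions, four glitches followed by a good scan leave both failure counts at zero, whereas the older behaviour described above would have counted each of them immediately.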

Succeeded
[SUCCEEDED] docs:0 (build)
[SUCCEEDED] lint:0 (build)
[SUCCEEDED] mypy:0 (build)
71599ff... by William Grant

Tweak buildd-manager failure handling metrics

Merged from https://code.launchpad.net/~wgrant/launchpad/+git/launchpad/+merge/454693

0eb8ec0... by William Grant

Refactor buildd-manager job dispatch error handling

manager.py is now entirely inlineCallbacks.

Merged from https://code.launchpad.net/~wgrant/launchpad/+git/launchpad/+merge/454692

8128720... by William Grant

Rename base failure count metric to not clash with numbercruncher

Succeeded
[SUCCEEDED] docs:0 (build)
[SUCCEEDED] lint:0 (build)
[SUCCEEDED] mypy:0 (build)
bb4b250... by Guruprasad

charm/launchpad-codehosting: Simplify TLS certification configuration

We do not need any value other than 'DEFAULT' for the `crts` list
passed to the load balancer.

Merged from https://code.launchpad.net/~lgp171188/launchpad/+git/launchpad/+merge/454744

1772d8d... by Guruprasad

charm/launchpad-codehosting: Simplify TLS certification configuration

fd2bd75... by Ines Almeida

Increase timeouts of regularly failing unit tests

Merged from https://code.launchpad.net/~ines-almeida/launchpad/+git/launchpad/+merge/454723

8c6be11... by Guruprasad

charm: Update the bzr sftp port to 5022 and make it configurable

Merged from https://code.launchpad.net/~lgp171188/launchpad/+git/launchpad/+merge/454732

20e2f1b... by Guruprasad

charm: Update the bzr sftp port to 5022 and make it configurable

316a8ee... by Ines Almeida

Increase timeouts of regularly failing unit tests