Code review comment for lp:~michael.nelson/launchpad/db-changes-build-generalisation-new

Revision history for this message
Stuart Bishop (stub) wrote :

This looks great. Comments, naming standards, indexes - all pretty good.

+CREATE TABLE BinaryPackageBuild (
+ id serial PRIMARY KEY,
+ package_build integer NOT NULL CONSTRAINT binarypackagebuild__package_build__fk REFERENCES packagebuild,
+ distro_arch_series integer NOT NULL CONSTRAINT binarypackagebuild__distro_arch_series__fk REFERENCES distroarchseries,
+ source_package_release integer NOT NULL CONSTRAINT binarypackagebuild__source_package_release__fk REFERENCES sourcepackagerelease

I believe Sourcepackage is one word in Launchpad's vocabulary, so that last column should be sourcepackage_release. Confirm with Bjorn.

+CREATE TABLE BinaryPackageBuild (
+ id serial PRIMARY KEY,
+ package_build integer NOT NULL CONSTRAINT binarypackagebuild__package_build__fk REFERENCES packagebuild,
+ distro_arch_series integer NOT NULL CONSTRAINT binarypackagebuild__distro_arch_series__fk REFERENCES distroarchseries,
+ source_package_release integer NOT NULL CONSTRAINT binarypackagebuild__source_package_release__fk REFERENCES sourcepackagerelease
+);
+
+CREATE UNIQUE INDEX binarypackagebuild__package_build__idx ON binarypackagebuild(package_build);
+-- Indexes that we can no longer create:
+-- CREATE UNIQUE INDEX binarypackagebuild__distro_arch_series_uniq__idx ON binarypackagebuild(distro_arch_series, source_package_release, archive)
+-- CREATE INDEX binarypackagebuild__distro_arch_series__status__idx ON binarypackagebuild(distro_arch_series, status?)
+-- CREATE INDEX binarypackagebuild__distro_arch_series__date_finished ON binarypackagebuild(distro_arch_series, date_finished)
+CREATE INDEX binarypackagebuild__source_package_release_idx ON binarypackagebuild(source_package_release);

We might need an index on distro_arch_series too

CREATE INDEX binarypackagebuild__distro_arch_series__idx ON BinaryPackageBuild(distro_arch_series);

+-- Step 2
+-- Migrate the current data from the build table to the newly created
+-- relationships.
+CREATE OR REPLACE FUNCTION migrate_build_rows() RETURNS integer
+LANGUAGE plpgsql AS
+$$
+DECLARE
+ build_info RECORD;
+ rows_migrated integer;
+ buildfarmjob_id integer;
+ packagebuild_id integer;
+BEGIN
+ rows_migrated := 0;
+ FOR build_info IN
+ SELECT
+ build.id,
+ build.processor,
+ archive.require_virtualized AS virtualized,
+ -- Currently we do not know if a build was virtual or not? (it's
+ -- only on the archive and the builder, both of which can
+ -- change). IBuild.is_virtualized just queries the archive.
+ build.datecreated AS date_created,
+ (build.datebuilt - build.buildduration) AS date_started,
+ build.datebuilt AS date_finished,
+ build.date_first_dispatched,
+ build.builder,
+ build.buildstate AS status,
+ build.buildlog AS log,
+ build.archive,
+ build.pocket,
+ build.upload_log,
+ build.dependencies,
+ build.distroarchseries AS distro_arch_series,
+ build.sourcepackagerelease AS source_package_release,
+ build.build_warnings -- We don't seem to use this in LP code at all?
+ FROM
+ build JOIN archive ON build.archive = archive.id

We have to have 'ORDER BY Build.id' at the end here. Any data migration in a DB patch is run individually on each database replica. We have to guarantee that it will run identically everywhere, or replication breaks horribly.

I think we could have done the migration using standard SQL and a temporary table, but the stored procedure is fine too. We can revisit if data migration goes too slowly.

Approved as patch-2207-57-0.sql. The critical fix is the ORDER BY in the stored procedure.

review: Approve (db)

« Back to merge proposal