Code review comment for lp:~michael.nelson/launchpad/db-changes-build-generalisation-new

Revision history for this message
Michael Nelson (michael.nelson) wrote :

On Thu, May 20, 2010 at 4:29 PM, Stuart Bishop
<email address hidden> wrote:
> Review: Approve db
> This looks great. Comments, naming standards, indexes - all pretty good.

Thanks Stuart.

>
>
> +CREATE TABLE BinaryPackageBuild (
> + id serial PRIMARY KEY,
> + package_build integer NOT NULL CONSTRAINT binarypackagebuild__package_build__fk REFERENCES packagebuild,
> + distro_arch_series integer NOT NULL CONSTRAINT binarypackagebuild__distro_arch_series__fk REFERENCES distroarchseries,
> + source_package_release integer NOT NULL CONSTRAINT binarypackagebuild__source_package_release__fk REFERENCES sourcepackagerelease
>
> I believe Sourcepackage is one word in Launchpad's vocabulary, so that last column should be sourcepackage_release. Confirm with Bjorn.

The current devel code has sourcepackagerelease (as well as lots of
other non-underscored vars seen in the schema change - datecreated,
buildlog, etc.), and part of this change is to start using underscores
consistently.

Grepping for "source_package" in the tip of db-devel has 1113 hits
(skip over the wadl ;) ) for variable names "source_package" in
registry, as well as things like "source_package_name",
"source_package_format" etc. (admittedly, there are many more for
"sourcepackage", due to the old non-underscored variable and column
names (sourcepackagename, sourcepackagerelease etc.)

Since no one commented on the suggestion in the diagram at:

http://people.ubuntu.com/~wgrant/launchpad/buildfarm/new-build-model-again.png

I've gone ahead and code source_package_release through the pipeline
of changes. Bjorn, if you feel strongly about it, I can change the
schema and the code (or just change the schema and field definition
for the moment?), but I'd prefer to leave it for the moment, and if
you'd like the change, do a separate branch later to standardise
(we'll need to do this anyway as we've lots of other classes that
still use names without any underscores).

>
>
> +CREATE TABLE BinaryPackageBuild (
> + id serial PRIMARY KEY,
> + package_build integer NOT NULL CONSTRAINT binarypackagebuild__package_build__fk REFERENCES packagebuild,
> + distro_arch_series integer NOT NULL CONSTRAINT binarypackagebuild__distro_arch_series__fk REFERENCES distroarchseries,
> + source_package_release integer NOT NULL CONSTRAINT binarypackagebuild__source_package_release__fk REFERENCES sourcepackagerelease
> +);
> +
> +CREATE UNIQUE INDEX binarypackagebuild__package_build__idx ON binarypackagebuild(package_build);
> +-- Indexes that we can no longer create:
> +-- CREATE UNIQUE INDEX binarypackagebuild__distro_arch_series_uniq__idx ON binarypackagebuild(distro_arch_series, source_package_release, archive)
> +-- CREATE INDEX binarypackagebuild__distro_arch_series__status__idx ON binarypackagebuild(distro_arch_series, status?)
> +-- CREATE INDEX binarypackagebuild__distro_arch_series__date_finished ON binarypackagebuild(distro_arch_series, date_finished)
> +CREATE INDEX binarypackagebuild__source_package_release_idx ON binarypackagebuild(source_package_release);
>
> We might need an index on distro_arch_series too
>
> CREATE INDEX binarypackagebuild__distro_arch_series__idx ON BinaryPackageBuild(distro_arch_series);

Thanks, and done :) After realising I couldn't add the above unique
indexes, I'd forgotten to add a basic index.

>
>
>
> +-- Step 2
> +-- Migrate the current data from the build table to the newly created
> +-- relationships.
> +CREATE OR REPLACE FUNCTION migrate_build_rows() RETURNS integer
> +LANGUAGE plpgsql AS
> +$$
> +DECLARE
> + build_info RECORD;
> + rows_migrated integer;
> + buildfarmjob_id integer;
> + packagebuild_id integer;
> +BEGIN
> + rows_migrated := 0;
> + FOR build_info IN
> + SELECT
> + build.id,
> + build.processor,
> + archive.require_virtualized AS virtualized,
> + -- Currently we do not know if a build was virtual or not? (it's
> + -- only on the archive and the builder, both of which can
> + -- change). IBuild.is_virtualized just queries the archive.
> + build.datecreated AS date_created,
> + (build.datebuilt - build.buildduration) AS date_started,
> + build.datebuilt AS date_finished,
> + build.date_first_dispatched,
> + build.builder,
> + build.buildstate AS status,
> + build.buildlog AS log,
> + build.archive,
> + build.pocket,
> + build.upload_log,
> + build.dependencies,
> + build.distroarchseries AS distro_arch_series,
> + build.sourcepackagerelease AS source_package_release,
> + build.build_warnings -- We don't seem to use this in LP code at all?
> + FROM
> + build JOIN archive ON build.archive = archive.id
>
> We have to have 'ORDER BY Build.id' at the end here. Any data migration in a DB patch is run individually on each database replica. We have to guarantee that it will run identically everywhere, or replication breaks horribly.

Done.

>
>
> I think we could have done the migration using standard SQL and a temporary table, but the stored procedure is fine too. We can revisit if data migration goes too slowly.
>
>
> Approved as patch-2207-57-0.sql. The critical fix is the ORDER BY in the stored procedure.

Given that I won't (can't) land this straight away (we need to give it
a good workout on dogfood), should I wait on the patch name for the
moment? (in case other patches come through to be landed straight
away?).

Thanks!

« Back to merge proposal