On Thu, May 20, 2010 at 4:29 PM, Stuart Bishop
<email address hidden> wrote:
> Review: Approve db
> This looks great. Comments, naming standards, indexes - all pretty good.
Thanks Stuart.
>
>
> +CREATE TABLE BinaryPackageBuild (
> + id serial PRIMARY KEY,
> + package_build integer NOT NULL CONSTRAINT binarypackagebuild__package_build__fk REFERENCES packagebuild,
> + distro_arch_series integer NOT NULL CONSTRAINT binarypackagebuild__distro_arch_series__fk REFERENCES distroarchseries,
> + source_package_release integer NOT NULL CONSTRAINT binarypackagebuild__source_package_release__fk REFERENCES sourcepackagerelease
>
> I believe Sourcepackage is one word in Launchpad's vocabulary, so that last column should be sourcepackage_release. Confirm with Bjorn.
The current devel code has sourcepackagerelease (as well as lots of
other non-underscored vars seen in the schema change - datecreated,
buildlog, etc.), and part of this change is to start using underscores
consistently.
Grepping for "source_package" in the tip of db-devel has 1113 hits
(skip over the wadl ;) ) for variable names "source_package" in
registry, as well as things like "source_package_name",
"source_package_format" etc. (admittedly, there are many more for
"sourcepackage", due to the old non-underscored variable and column
names (sourcepackagename, sourcepackagerelease etc.)
Since no one commented on the suggestion in the diagram at:
I've gone ahead and code source_package_release through the pipeline
of changes. Bjorn, if you feel strongly about it, I can change the
schema and the code (or just change the schema and field definition
for the moment?), but I'd prefer to leave it for the moment, and if
you'd like the change, do a separate branch later to standardise
(we'll need to do this anyway as we've lots of other classes that
still use names without any underscores).
>
>
> +CREATE TABLE BinaryPackageBuild (
> + id serial PRIMARY KEY,
> + package_build integer NOT NULL CONSTRAINT binarypackagebuild__package_build__fk REFERENCES packagebuild,
> + distro_arch_series integer NOT NULL CONSTRAINT binarypackagebuild__distro_arch_series__fk REFERENCES distroarchseries,
> + source_package_release integer NOT NULL CONSTRAINT binarypackagebuild__source_package_release__fk REFERENCES sourcepackagerelease
> +);
> +
> +CREATE UNIQUE INDEX binarypackagebuild__package_build__idx ON binarypackagebuild(package_build);
> +-- Indexes that we can no longer create:
> +-- CREATE UNIQUE INDEX binarypackagebuild__distro_arch_series_uniq__idx ON binarypackagebuild(distro_arch_series, source_package_release, archive)
> +-- CREATE INDEX binarypackagebuild__distro_arch_series__status__idx ON binarypackagebuild(distro_arch_series, status?)
> +-- CREATE INDEX binarypackagebuild__distro_arch_series__date_finished ON binarypackagebuild(distro_arch_series, date_finished)
> +CREATE INDEX binarypackagebuild__source_package_release_idx ON binarypackagebuild(source_package_release);
>
> We might need an index on distro_arch_series too
>
> CREATE INDEX binarypackagebuild__distro_arch_series__idx ON BinaryPackageBuild(distro_arch_series);
Thanks, and done :) After realising I couldn't add the above unique
indexes, I'd forgotten to add a basic index.
>
>
>
> +-- Step 2
> +-- Migrate the current data from the build table to the newly created
> +-- relationships.
> +CREATE OR REPLACE FUNCTION migrate_build_rows() RETURNS integer
> +LANGUAGE plpgsql AS
> +$$
> +DECLARE
> + build_info RECORD;
> + rows_migrated integer;
> + buildfarmjob_id integer;
> + packagebuild_id integer;
> +BEGIN
> + rows_migrated := 0;
> + FOR build_info IN
> + SELECT
> + build.id,
> + build.processor,
> + archive.require_virtualized AS virtualized,
> + -- Currently we do not know if a build was virtual or not? (it's
> + -- only on the archive and the builder, both of which can
> + -- change). IBuild.is_virtualized just queries the archive.
> + build.datecreated AS date_created,
> + (build.datebuilt - build.buildduration) AS date_started,
> + build.datebuilt AS date_finished,
> + build.date_first_dispatched,
> + build.builder,
> + build.buildstate AS status,
> + build.buildlog AS log,
> + build.archive,
> + build.pocket,
> + build.upload_log,
> + build.dependencies,
> + build.distroarchseries AS distro_arch_series,
> + build.sourcepackagerelease AS source_package_release,
> + build.build_warnings -- We don't seem to use this in LP code at all?
> + FROM
> + build JOIN archive ON build.archive = archive.id
>
> We have to have 'ORDER BY Build.id' at the end here. Any data migration in a DB patch is run individually on each database replica. We have to guarantee that it will run identically everywhere, or replication breaks horribly.
Done.
>
>
> I think we could have done the migration using standard SQL and a temporary table, but the stored procedure is fine too. We can revisit if data migration goes too slowly.
>
>
> Approved as patch-2207-57-0.sql. The critical fix is the ORDER BY in the stored procedure.
Given that I won't (can't) land this straight away (we need to give it
a good workout on dogfood), should I wait on the patch name for the
moment? (in case other patches come through to be landed straight
away?).
On Thu, May 20, 2010 at 4:29 PM, Stuart Bishop
<email address hidden> wrote:
> Review: Approve db
> This looks great. Comments, naming standards, indexes - all pretty good.
Thanks Stuart.
> ild__package_ build__ fk REFERENCES packagebuild, ild__distro_ arch_series_ _fk REFERENCES distroarchseries, package_ release integer NOT NULL CONSTRAINT binarypackagebu ild__source_ package_ release_ _fk REFERENCES sourcepackagere lease release. Confirm with Bjorn.
>
> +CREATE TABLE BinaryPackageBuild (
> + id serial PRIMARY KEY,
> + package_build integer NOT NULL CONSTRAINT binarypackagebu
> + distro_arch_series integer NOT NULL CONSTRAINT binarypackagebu
> + source_
>
> I believe Sourcepackage is one word in Launchpad's vocabulary, so that last column should be sourcepackage_
The current devel code has sourcepackagere lease (as well as lots of
other non-underscored vars seen in the schema change - datecreated,
buildlog, etc.), and part of this change is to start using underscores
consistently.
Grepping for "source_package" in the tip of db-devel has 1113 hits package_ name", package_ format" etc. (admittedly, there are many more for lease etc.)
(skip over the wadl ;) ) for variable names "source_package" in
registry, as well as things like "source_
"source_
"sourcepackage", due to the old non-underscored variable and column
names (sourcepackagename, sourcepackagere
Since no one commented on the suggestion in the diagram at:
http:// people. ubuntu. com/~wgrant/ launchpad/ buildfarm/ new-build- model-again. png
I've gone ahead and code source_ package_ release through the pipeline
of changes. Bjorn, if you feel strongly about it, I can change the
schema and the code (or just change the schema and field definition
for the moment?), but I'd prefer to leave it for the moment, and if
you'd like the change, do a separate branch later to standardise
(we'll need to do this anyway as we've lots of other classes that
still use names without any underscores).
> ild__package_ build__ fk REFERENCES packagebuild, ild__distro_ arch_series_ _fk REFERENCES distroarchseries, package_ release integer NOT NULL CONSTRAINT binarypackagebu ild__source_ package_ release_ _fk REFERENCES sourcepackagere lease ild__package_ build__ idx ON binarypackagebu ild(package_ build); ild__distro_ arch_series_ uniq__idx ON binarypackagebu ild(distro_ arch_series, source_ package_ release, archive) ild__distro_ arch_series_ _status_ _idx ON binarypackagebu ild(distro_ arch_series, status?) ild__distro_ arch_series_ _date_finished ON binarypackagebu ild(distro_ arch_series, date_finished) ild__source_ package_ release_ idx ON binarypackagebu ild(source_ package_ release) ; ild__distro_ arch_series_ _idx ON BinaryPackageBu ild(distro_ arch_series) ;
>
> +CREATE TABLE BinaryPackageBuild (
> + id serial PRIMARY KEY,
> + package_build integer NOT NULL CONSTRAINT binarypackagebu
> + distro_arch_series integer NOT NULL CONSTRAINT binarypackagebu
> + source_
> +);
> +
> +CREATE UNIQUE INDEX binarypackagebu
> +-- Indexes that we can no longer create:
> +-- CREATE UNIQUE INDEX binarypackagebu
> +-- CREATE INDEX binarypackagebu
> +-- CREATE INDEX binarypackagebu
> +CREATE INDEX binarypackagebu
>
> We might need an index on distro_arch_series too
>
> CREATE INDEX binarypackagebu
Thanks, and done :) After realising I couldn't add the above unique
indexes, I'd forgotten to add a basic index.
> build_rows( ) RETURNS integer require_ virtualized AS virtualized, is_virtualized just queries the archive. tion) AS date_started, first_dispatche d, hseries AS distro_arch_series, kagerelease AS source_ package_ release, warnings -- We don't seem to use this in LP code at all?
>
>
> +-- Step 2
> +-- Migrate the current data from the build table to the newly created
> +-- relationships.
> +CREATE OR REPLACE FUNCTION migrate_
> +LANGUAGE plpgsql AS
> +$$
> +DECLARE
> + build_info RECORD;
> + rows_migrated integer;
> + buildfarmjob_id integer;
> + packagebuild_id integer;
> +BEGIN
> + rows_migrated := 0;
> + FOR build_info IN
> + SELECT
> + build.id,
> + build.processor,
> + archive.
> + -- Currently we do not know if a build was virtual or not? (it's
> + -- only on the archive and the builder, both of which can
> + -- change). IBuild.
> + build.datecreated AS date_created,
> + (build.datebuilt - build.builddura
> + build.datebuilt AS date_finished,
> + build.date_
> + build.builder,
> + build.buildstate AS status,
> + build.buildlog AS log,
> + build.archive,
> + build.pocket,
> + build.upload_log,
> + build.dependencies,
> + build.distroarc
> + build.sourcepac
> + build.build_
> + FROM
> + build JOIN archive ON build.archive = archive.id
>
> We have to have 'ORDER BY Build.id' at the end here. Any data migration in a DB patch is run individually on each database replica. We have to guarantee that it will run identically everywhere, or replication breaks horribly.
Done.
> 57-0.sql. The critical fix is the ORDER BY in the stored procedure.
>
> I think we could have done the migration using standard SQL and a temporary table, but the stored procedure is fine too. We can revisit if data migration goes too slowly.
>
>
> Approved as patch-2207-
Given that I won't (can't) land this straight away (we need to give it
a good workout on dogfood), should I wait on the patch name for the
moment? (in case other patches come through to be landed straight
away?).
Thanks!