Code review comment for lp:~michael.nelson/launchpad/443353-api-builds-from-private

Revision history for this message
Jonathan Lange (jml) wrote :

On Sun, Nov 15, 2009 at 2:35 PM, Michael Nelson
<email address hidden> wrote:
> Michael Nelson has proposed merging lp:~michael.nelson/launchpad/443353-api-builds-from-private into lp:launchpad.
>
>    Requested reviews:
>    Canonical Launchpad Engineering (launchpad)
> Related bugs:
>  #443353 API does not include Builds for sources that were sync'd from private PPAs
>  https://bugs.launchpad.net/bugs/443353
>
>
> Overview
> ========
>
> This branch fixes bug 443353 by ensuring that getBuildsForSources() will include builds that were originally built in a different archive context, but have since had binaries copied into the source archive context.
>
> Issues
> ======
>
> There are two main issues with this branch IMO.
>
> 1. Ensuring that the result of the union was ordered correctly is hackish, and depends on a Storm implementation detail (that columns in SQL queries for a given table are ordered alphabetically).
>
> 2. Testing the correct ordering in the doctest isn't very readable. It might be better to print the results instead, but on the other hand, I didn't want the test to depend on database ids. Previously the test simply ensured that the last item was the one expected - I've updated it to instead ensure that the complete result is sorted as expected.
>
> Any suggestions welcome!
>

Hi Michael,

Thanks for fixing this -- on a Sunday even!

I'm generally OK with this branch, but would like another opportunity
to have a look at it and maybe to talk with you face-to-face.

jml

> Testing/QA
> ==========
>
> To test, run:
> bin/test -vvt doc/publishing.txt
>
> To QA:
>
> Visit:
> https://edge.launchpad.net/ubuntu/+source/openjdk-6
>
> and expand the intrepid 6b12-0ubuntu6.6 security release. Currently this only displays two builds, whereas it should display all the builds listed at:
>
> https://edge.launchpad.net/ubuntu/+source/openjdk-6/6b12-0ubuntu6.6
>
>
>
>
> --
> https://code.launchpad.net/~michael.nelson/launchpad/443353-api-builds-from-private/+merge/14896
> Your team Launchpad code reviewers from Canonical is subscribed to branch lp:launchpad.
>
> === modified file 'lib/lp/soyuz/doc/publishing.txt'
> --- lib/lp/soyuz/doc/publishing.txt     2009-11-05 10:51:36 +0000
> +++ lib/lp/soyuz/doc/publishing.txt     2009-11-15 20:35:29 +0000
> @@ -1055,22 +1055,53 @@
>  each build found.
>
>     >>> cprov_builds.count()
> -    7
> +    8
>
>  The `ResultSet` is ordered by ascending
>  `SourcePackagePublishingHistory.id` and ascending
>  `DistroArchSeries.architecturetag`, in that order.
>
> -    >>> source_pub, build, arch = cprov_builds.last()
> -
> -    >>> print source_pub.displayname
> -    foo 666 in breezy-autotest
> -
> -    >>> print build.title
> -    i386 build of foo 666 in ubuntutest breezy-autotest RELEASE
> -
> -    >>> print arch.displayname
> -    ubuntutest Breezy Badger Autotest i386
> +    # The easiest thing we can do here (without printing ids)
> +    # is to show that sorting a list of the resulting ids+tags does not
> +    # modify the list.
> +    >>> from copy import copy
> +    >>> ids_and_tags = [(pub.id, arch.architecturetag)
> +    ...     for pub, build, arch in cprov_builds]
> +    >>> ids_and_tags_sorted = copy(ids_and_tags)
> +    >>> ids_and_tags_sorted.sort()
> +    >>> ids_and_tags == ids_and_tags_sorted
> +    True
> +

You don't have to use copy() and .sort(), you can use sorted()
instead. That would be slightly nicer.

Nicer still would be an assertSorted() method called from a unit test :P
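To spell out the sorted() suggestion, something like this sketch would do (the tuples below are made up for illustration, standing in for the real (pub.id, architecturetag) pairs from the query):

```python
def is_sorted(items):
    # True when the items are already in ascending order; sorted()
    # returns a new list, so no copy()/.sort() dance is needed.
    return list(items) == sorted(items)

# Illustrative (pub.id, architecturetag) tuples, not real data:
ids_and_tags = [(2, "hppa"), (2, "i386"), (5, "i386")]
```

Then the doctest check collapses to a single `>>> is_sorted(ids_and_tags)` printing True.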

> +If a source package is copied from another archive (including the
> +binaries), then these builds will also be included in the result (even
> +though they were built in a different archive context).
> +

Saying something about _why_ would greatly improve this as a document.
Can you please add something like that?

> +    # Create a new PPA and publish a source with some builds
> +    # and binaries.
> +    >>> other_ppa = factory.makeArchive()
> +    >>> binaries = test_publisher.getPubBinaries(archive=other_ppa)
> +    >>> build = binaries[0].binarypackagerelease.build
> +    >>> source_pub = build.sourcepackagerelease.publishings[0]
> +

Can you please add some comments here about how the context of the
build is _not_ Celso's PPA? This can be easily lost in skimming the
docs.

> +    # Copy the source into Celso's PPA, ensuring that the binaries
> +    # are also published there.
> +    >>> source_pub_cprov = source_pub.copyTo(
> +    ...     source_pub.distroseries, source_pub.pocket,
> +    ...     cprov.archive)
> +    >>> binaries_cprov = test_publisher.publishBinaryInArchive(
> +    ...     binaries[0].binarypackagerelease, cprov.archive)
> +
> +Now we will see an extra source in Celso's PPA as well as an extra
> +build - even though the build's context is not Celso's PPA.
> +
> +    >>> cprov_sources_new = cprov.archive.getPublishedSources()
> +    >>> cprov_sources_new.count()
> +    9
> +
> +    >>> cprov_builds_new = publishing_set.getBuildsForSources(
> +    ...     cprov_sources_new)
> +    >>> cprov_builds_new.count()
> +    9
>

Maybe it's worth adding a note in the text saying "(It was 8 before
we started)".

>  Next we'll create two sources with two builds each (the SoyuzTestPublisher
>  default) and show that the number of unpublished builds for these sources
>
> === modified file 'lib/lp/soyuz/model/publishing.py'
> --- lib/lp/soyuz/model/publishing.py    2009-10-29 11:55:35 +0000
> +++ lib/lp/soyuz/model/publishing.py    2009-11-15 20:35:29 +0000
> @@ -40,6 +40,7 @@
>  from canonical.database.enumcol import EnumCol
>  from lp.registry.interfaces.pocket import PackagePublishingPocket
>  from lp.soyuz.model.binarypackagename import BinaryPackageName
> +from lp.soyuz.model.binarypackagerelease import BinaryPackageRelease
>  from lp.soyuz.model.files import (
>     BinaryPackageFile, SourcePackageReleaseFile)
>  from canonical.launchpad.database.librarian import (
> @@ -1233,20 +1234,59 @@
>                 Build.buildstate.is_in(build_states))
>
>         store = getUtility(IStoreSelector).get(MAIN_STORE, DEFAULT_FLAVOR)
> -        result_set = store.find(
> -            (SourcePackagePublishingHistory, Build, DistroArchSeries),
> +        find_spec = (
> +            SourcePackagePublishingHistory, Build, DistroArchSeries)
> +
> +        # We'll be looking for builds in the same distroseries as the
> +        # SPPH for the same release.
> +        builds_for_distroseries_expr = (
>             Build.distroarchseriesID == DistroArchSeries.id,
> +            SourcePackagePublishingHistory.distroseriesID ==
> +                DistroArchSeries.distroseriesID,
> +            SourcePackagePublishingHistory.sourcepackagereleaseID ==
> +                Build.sourcepackagereleaseID,
> +            In(SourcePackagePublishingHistory.id, source_publication_ids)
> +            )
> +

I think it's better style to close parens on the same line. PEP 8
agrees with me, IIRC.
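For illustration, the two styles side by side (dummy values, not the real expressions):

```python
# Closing paren on the same line as the last element:
same_line_close = (
    "condition_one",
    "condition_two")

# Hanging close paren on its own line, as in the diff:
hanging_close = (
    "condition_one",
    "condition_two",
    )
```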

> +        # First, we'll find the builds that were built in the same
> +        # archive context as the published sources.
> +        builds_in_same_archive = store.find(
> +            find_spec,
> +            builds_for_distroseries_expr,
>             SourcePackagePublishingHistory.archiveID == Build.archiveID,
> -            SourcePackagePublishingHistory.distroseriesID ==
> -                DistroArchSeries.distroseriesID,
> -            SourcePackagePublishingHistory.sourcepackagereleaseID ==
> -                Build.sourcepackagereleaseID,
> -            In(SourcePackagePublishingHistory.id, source_publication_ids),
> -            *extra_exprs)
> -
> +            *extra_exprs)
> +
> +        # Next, get all the builds that have a binary published in
> +        # the same archive... even though the build was not built in
> +        # the same archive context.
> +        builds_copied_into_archive = store.find(
> +            find_spec,
> +            builds_for_distroseries_expr,
> +            SourcePackagePublishingHistory.archiveID != Build.archiveID,
> +            BinaryPackagePublishingHistory.archive == Build.archiveID,
> +            BinaryPackagePublishingHistory.binarypackagerelease ==
> +                BinaryPackageRelease.id,
> +            BinaryPackageRelease.build == Build.id,
> +            *extra_exprs)
> +
> +        result_set = builds_copied_into_archive.union(
> +            builds_in_same_archive).config(distinct=True)
> +
> +        # XXX 2009-11-15 Michael Nelson bug=366043. It is not possible
> +        # to sort by the name `SourcePackagePublishingHistory.id` as after
> +        # the union there are no tables. Nor can we sort by ambiguous `id`
> +        # as in this case there are 3 id columns in the result. Specifying
> +        # the column index is the only option that I can find. So we're
> +        # relying on an implementation detail of Storm that it lists
> +        # columns in alphabetical order in SQL queries.

I don't get what you mean by "ambiguous `id`". Surely the ID columns
can be referred to by their table name as before?

It might be a good idea to grab Jamu and see if he can solve your problem.

> +        sql_columns = SourcePackagePublishingHistory._storm_columns.values()
> +        sql_column_names = [column.name for column in sql_columns]
> +        sql_column_names.sort()
> +

That's pretty terrible, but if that's what you've got to do, then I'm OK.

You can use sorted() here to make things a little better.
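That is, something like this (hypothetical column names standing in for the real `_storm_columns` values):

```python
# Hypothetical names, standing in for the real Storm column names:
column_names = ["id", "archive", "pocket", "datecreated"]

# sorted() replaces the separate list-build and .sort() steps:
sql_column_names = sorted(column_names)
```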

> +        # SQL order by uses 1-based column numbers.
> +        source_pub_id_col_number = sql_column_names.index('id') + 1

Given that all of this is a workaround for a storm bug, maybe it's a
good idea to put it into a small helper function?
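Something along these lines, perhaps - a sketch only, which assumes the class exposes `_storm_columns` the way `SourcePackagePublishingHistory` does, and which still leans on the same alphabetical-ordering detail:

```python
def union_order_by_index(storm_class, column_name):
    # 1-based ORDER BY position of column_name after a union, relying
    # on the (implementation-detail) fact that Storm lists a class's
    # columns in alphabetical order in the generated SQL.
    names = sorted(col.name for col in storm_class._storm_columns.values())
    return names.index(column_name) + 1
```

Then the order_by call reads `result_set.order_by(union_order_by_index(SourcePackagePublishingHistory, 'id'), ...)`, and the workaround is named and documented in one place.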

>         result_set.order_by(
> -            SourcePackagePublishingHistory.id,
> -            DistroArchSeries.architecturetag)
> +            source_pub_id_col_number, DistroArchSeries.architecturetag)
>
>         return result_set
>
