On 03/11/2010 03:06 PM, Michael Nelson wrote:
> Review: Approve code
>
> Wow! I'm looking forward to seeing how this improves the response time
> for uploads with lots of binaries :)
Hello Michael,
thanks very much for the review! I am quite confident that this change
will have a positive impact :)
> I've only got a few suggestions (extra comments, use of MasterStore,
> possible renaming etc.). I've also attached a diff with the equivalent
> (unformatted) storm version that works too (as I know you don't like
> to work with storm ;) ). Of course, it's up to you whether you use it
> or go with the SQL.
Hrm .. the SQL version has been running on dogfood for a few days now
and I am quite confident that it actually works..
Please see my in-line responses below, as well as the attached
incremental diff.
> Thanks!
> -Michael
>
> On Thu, Mar 11, 2010 at 10:44 AM, Muharem Hrnjadovic
> <email address hidden> wrote:
>> Muharem Hrnjadovic has proposed merging lp:~al-maisan/launchpad/unembargo-oops-526645 into lp:launchpad/devel.
>>
>> Requested reviews:
>> Canonical Launchpad Engineering (launchpad)
>> Related bugs:
>> #526645 cannot unembargo due to LP API timeouts
>> https://bugs.launchpad.net/bugs/526645
>>
>>
>> Hello there!
>>
>> The 4 OOPSs attached to the related bug indicate the optimization with the
>> highest ROI (see the top 3 queries in the "Repeated SQL Statements" section):
>> we query the database for *each* binary in order to find out whether it's
>> already published in the destination archive. That's inefficient and accounts
>> for 50% of the SQL time.
>>
>> The branch at hand performs the "is the binary published in the destination
>> archive already?" check in one go.
>>
>> Tests to run:
>>
>> bin/test -vv test_uploadprocessor
>>
>> No "make lint" issues.
>>
>> --
>> https://code.launchpad.net/~al-maisan/launchpad/unembargo-oops-526645/+merge/21126
>> Your team Launchpad code reviewers from Canonical is subscribed to branch lp:launchpad/devel.
>>
>> === modified file 'lib/lp/soyuz/interfaces/queue.py'
>> --- lib/lp/soyuz/interfaces/queue.py 2009-07-19 04:41:14 +0000
>> +++ lib/lp/soyuz/interfaces/queue.py 2010-03-11 09:44:23 +0000
>> @@ -411,23 +411,6 @@
>> title=_("The related build"), required=True, readonly=False,
>> )
>>
>> - def verifyBeforeAccept():
>> - """Perform overall checks before accepting a binary upload.
>> -
>> - Ensure each uploaded binary file can be published in the targeted
>> - archive.
>> -
>> - If any of the uploaded binary files are already published a
>> - QueueInconsistentStateError is raised containing all filenames
>> - that cannot be published.
>> -
>> - This check is very similar to the one we do for source upload and
>> - was designed to prevent the creation of binary publications that
>> - will never reach the archive.
>> -
>> - See bug #227184 for further details.
>> - """
>> -
>> def publish(logger=None):
>> """Publish this queued source in the distroseries referred to by
>> the parent queue item.
>>
>> === modified file 'lib/lp/soyuz/model/queue.py'
>> --- lib/lp/soyuz/model/queue.py 2010-01-10 04:58:44 +0000
>> +++ lib/lp/soyuz/model/queue.py 2010-03-11 09:44:23 +0000
>> @@ -217,9 +217,9 @@
>> except QueueSourceAcceptError, info:
>> raise QueueInconsistentStateError(info)
>>
>> + self._checkForBinariesinDestinationArchive(
>> + [pub.build for pub in self.builds])
>
> It looks like we should rename self.builds here?
Hmm .. good idea, but probably better done in a separate clean-up branch
since self.builds is used all over the place.
>> for build in self.builds:
>> - # as before, but for QueueBuildAcceptError
>> - build.verifyBeforeAccept()
>> try:
>> build.checkComponentAndSection()
>> except QueueBuildAcceptError, info:
>> @@ -228,6 +228,62 @@
>> # if the previous checks applied and pass we do set the value
>> self.status = PassthroughStatusValue(PackageUploadStatus.ACCEPTED)
>>
>> + def _checkForBinariesinDestinationArchive(self, builds):
>> + """
>> + Check for existing binaries (in destination archive) for all binary
>> + uploads to be accepted.
>> +
>> + Before accepting binary uploads we check whether any of the binaries
>> + already exists in the destination archive and raise an exception
>> + (QueueInconsistentStateError) if this is the case.
>> +
>> + The only way to find pre-existing binaries is to match on binary
>> + package file names.
>> + """
>> + if len(builds) == 0:
>> + return
>> +
>
> It would be great to include a summary of what the query below is
> doing... something like:
> # Collect all the binary filenames for the set of builds.
Thanks for the suggestion. Done.
>> + inner_query = """
>> + SELECT DISTINCT lfa.filename
>> + FROM
>> + binarypackagefile bpf, binarypackagerelease bpr,
>> + libraryfilealias lfa
>> + WHERE
>> + bpr.build IN %s
>> + AND bpf.binarypackagerelease = bpr.id
>> + AND bpf.libraryfile = lfa.id
>> + """ % sqlvalues([build.id for build in builds])
>> +
>
> And again... let's see if I've understood this correctly, something like:
>
> # Check whether any of the binary filenames have already been
> published in this archive.
Yup, only s/this/the destination/
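Incidentally, the overall shape of the check (collect the filenames for a
set of builds, then test them against the destination archive's published
files in a single query) can be sketched with a self-contained stand-in.
The schema and table names below are simplified and hypothetical, not
Launchpad's real ones:

```python
# Stand-in demonstration of the batched check; schema is hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE build_file (build INTEGER, filename TEXT);
    CREATE TABLE published_file (archive INTEGER, filename TEXT);
""")
conn.executemany(
    "INSERT INTO build_file VALUES (?, ?)",
    [(1, "foo_1.0_i386.deb"), (1, "foo-doc_1.0_all.deb"),
     (2, "bar_1.0_i386.deb")])
conn.execute("INSERT INTO published_file VALUES (7, 'foo_1.0_i386.deb')")

def known_filenames(conn, archive, build_ids):
    """One round trip: which of the builds' filenames already exist
    in the destination archive?"""
    marks = ", ".join("?" * len(build_ids))
    rows = conn.execute(
        "SELECT DISTINCT p.filename FROM published_file p"
        " WHERE p.archive = ?"
        " AND p.filename IN ("
        "   SELECT filename FROM build_file WHERE build IN (%s))" % marks,
        [archive] + list(build_ids)).fetchall()
    return sorted(row[0] for row in rows)

print(known_filenames(conn, 7, [1, 2]))  # → ['foo_1.0_i386.deb']
```

The point is that the per-binary loop of the old verifyBeforeAccept()
collapses into a single database round trip, regardless of how many
binaries the upload contains.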
>> + query = """
>> + SELECT DISTINCT lfa.filename
>> + FROM
>> + binarypackagefile bpf, binarypackagepublishinghistory bpph,
>> + distroarchseries das, distroseries ds, libraryfilealias lfa
>> + WHERE
>> + bpph.archive = %s
>> + AND bpph.distroarchseries = das.id
>> + AND das.distroseries = ds.id
>> + AND ds.distribution = %s
>> + AND bpph.binarypackagerelease = bpf.binarypackagerelease
>> + AND bpf.libraryfile = lfa.id
>> + AND lfa.filename IN (%%s)
>> + """ % sqlvalues(self.archive, self.distroseries.distribution)
>> + # Inject the inner query.
>> + query %= inner_query
>> +
>> + store = IMasterStore(PackageUpload)
>
> Is there a reason for using the master store here? Given that we're not writing?
Not really .. changed this to a normal Store.of()
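On the query construction itself: the `%%s` in the outer query is what
makes the two-stage substitution work. The first pass (via sqlvalues())
fills the archive and distribution placeholders and turns `%%s` into a
literal `%s`; the second pass (`query %= inner_query`) then splices the
inner query in. A minimal sketch, with plain `%` standing in for
sqlvalues() and made-up table names:

```python
# Two-pass %-formatting: '%%s' survives the first substitution as a
# literal '%s', which the second pass then fills in.
outer = ("SELECT filename FROM published"
         " WHERE archive = %s AND filename IN (%%s)") % (42,)
assert "IN (%s)" in outer  # placeholder survived pass one
inner = "SELECT filename FROM build_files"
query = outer % inner
print(query)
```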
> Thanks!
>> + result_set = store.execute(query)
>> + known_filenames = [row[0] for row in result_set.get_all()]
>> +
>> + # Do any of the files to be uploaded already exist in the destination
>> + # archive?
>> + if len(known_filenames) > 0:
>> + filename_list = "\n\t%s".join(
>> + [filename for filename in known_filenames])
>> + raise QueueInconsistentStateError(
>> + 'The following files are already published in %s:\n%s' % (
>> + self.archive.displayname, filename_list))
>> +
>> def setDone(self):
>> """See `IPackageUpload`."""
>> if self.status == PackageUploadStatus.DONE:
>> @@ -1334,31 +1390,6 @@
>> # guaranteed to exist in the DB. We don't care if sections are
>> # not official.
>>
>> - def verifyBeforeAccept(self):
>> - """See `IPackageUploadBuild`."""
>> - distribution = self.packageupload.distroseries.distribution
>> - known_filenames = []
>> - # Check if the uploaded binaries are already published in the archive.
>> - for binary_package in self.build.binarypackages:
>> - for binary_file in binary_package.files:
>> - try:
>> - published_binary = distribution.getFileByName(
>> - binary_file.libraryfile.filename, source=False,
>> - archive=self.packageupload.archive)
>> - except NotFoundError:
>> - # Only unknown files are ok.
>> - continue
>> -
>> - known_filenames.append(binary_file.libraryfile.filename)
>> -
>> - # If any of the uploaded files are already present we have a problem.
>> - if len(known_filenames) > 0:
>> - filename_list = "\n\t%s".join(
>> - [filename for filename in known_filenames])
>> - raise QueueInconsistentStateError(
>> - 'The following files are already published in %s:\n%s' % (
>> - self.packageupload.archive.displayname, filename_list))
>> -
>> def publish(self, logger=None):
>> """See `IPackageUploadBuild`."""
>> # Determine the build's architecturetag
>>
>>
>> --
>> launchpad-reviews mailing list
>> <email address hidden>
>> https://lists.ubuntu.com/mailman/listinfo/launchpad-reviews
Best regards
--
Muharem Hrnjadovic <email address hidden>
Public key id : B2BBFCFC
Key fingerprint : A5A3 CC67 2B87 D641 103F 5602 219F 6B60 B2BB FCFC