Archive:+index timeouts : ArchiveView.num_pkgs_building can be very slow

Bug #709717 reported by William Grant
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
Critical
Julian Edwards

Bug Description

As seen in OOPS-1855H1087, the query generated by ArchiveView.num_pkgs_building can take nearly four seconds on a primary-sized archive.

Related branches

summary: - ArchiveView.num_pkgs_building can be very slow
+ Archive:+index timeouts : ArchiveView.num_pkgs_building can be very slow
Revision history for this message
Robert Collins (lifeless) wrote :

Its generating a nonsense-big table; changing it to this makes it take < 25% of the time:

launchpad_qastaging=> SELECT COUNT(*)
FROM (
        (SELECT DISTINCT SourcePackageRelease.id FROM BinaryPackageBuild, BuildFarmJob, PackageBuild, SourcePackageRelease
         WHERE BinaryPackageBuild.source_package_release = SourcePackageRelease.id
           AND BinaryPackageBuild.package_build = PackageBuild.id AND PackageBuild.archive = 20931
           AND PackageBuild.build_farm_job = BuildFarmJob.id
           AND BuildFarmJob.status = 0) EXCEPT
        (SELECT DISTINCT SourcePackageRelease.id
         FROM BinaryPackageBuild, BuildFarmJob, PackageBuild, SourcePackageRelease
         WHERE BinaryPackageBuild.source_package_release = SourcePackageRelease.id
           AND BinaryPackageBuild.package_build = PackageBuild.id
           AND PackageBuild.archive = 20931 AND PackageBuild.build_farm_job = BuildFarmJob.id AND BuildFarmJob.status = 6)) AS "_tmp" ;

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (4.9 KiB)

plan of this :

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=915.85..915.86 rows=1 width=0) (actual time=200.997..200.998 rows=1 loops=1)
   -> Subquery Scan _tmp (cost=695.51..915.72 rows=52 width=0) (actual time=195.451..199.516 rows=5016 loops=1)
         -> HashSetOp Except (cost=695.51..915.20 rows=52 width=4) (actual time=195.448..196.957 rows=5016 loops=1)
               -> Append (cost=695.51..915.02 rows=71 width=4) (actual time=176.432..191.642 rows=5016 loops=1)
                     -> Subquery Scan "*SELECT* 1" (cost=695.51..696.55 rows=52 width=4) (actual time=176.431..181.448 rows=5016 loops=1)
                           -> HashAggregate (cost=695.51..696.03 rows=52 width=4) (actual time=176.428..179.004 rows=5016 loops=1)
                                 -> Nested Loop (cost=61.92..695.38 rows=52 width=4) (actual time=12.070..166.402 rows=8268 loops=1)
                                       -> Nested Loop (cost=61.92..667.19 rows=52 width=4) (actual time=12.048..95.471 rows=8268 loops=1)
                                             -> Hash Join (cost=61.92..332.41 rows=52 width=4) (actual time=12.010..36.768 rows=8268 loops=1)
                                                   Hash Cond: (public.buildfarmjob.id = public.packagebuild.build_farm_job)
                                                   -> Index Scan using buildfarmjob__status__idx on buildfarmjob (cost=0.00..268.94 rows=276 width=4) (actual time=0.048..7.287 rows=8573 loops=1)
                                                         Index Cond: (status = 0)
                                                   -> Hash (cost=61.27..61.27 rows=52 width=8) (actual time=11.933..11.933 rows=8268 loops=1)
                                                         -> Index Scan using packagebuild__archive__idx on packagebuild (cost=0.00..61.27 rows=52 width=8) (actual time=0.019..6.759 rows=8268 loops=1)
                                                               Index Cond: (archive = 20931)
                                             -> Index Scan using binarypackagebuild__package_build__idx on binarypackagebuild (cost=0.00..6.43 rows=1 width=8) (actual time=0.005..0.006 rows=1 loops=8268)
                                                   Index Cond: (public.binarypackagebuild.package_build = public.packagebuild.id)
                                       -> Index Scan using sourcepackagerelease_pkey on sourcepackagerelease (cost=0.00..0.53 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=8268)
                                             Index Cond: (public.sourcepackagerelease.id = public.binarypackagebuild.source_package_release)
                     -> Subquery Scan "*SELECT* 2" (cost=218.09..218.47 rows=19 width=4) (actual time=8.269..8.269 rows=0 loops=1)
                           -> HashAggregate (cost=218.09..218.28 rows=19 width=4) (actual time=8.268..8.268 rows=0 loops=1)
                                 -> Nested Loop (cost=23.77..218.05 rows=19...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :

Slightly tweaked for minimalism: SELECT COUNT(*)
FROM (
        (SELECT DISTINCT BinaryPackageBuild.source_package_release
         FROM BinaryPackageBuild, BuildFarmJob, PackageBuild
         WHERE BinaryPackageBuild.package_build = PackageBuild.id
           AND PackageBuild.archive = 20931
           AND PackageBuild.build_farm_job = BuildFarmJob.id
           AND BuildFarmJob.status = 0) EXCEPT
        (SELECT DISTINCT BinaryPackageBuild.source_package_release
         FROM BinaryPackageBuild, BuildFarmJob, PackageBuild
         WHERE BinaryPackageBuild.package_build = PackageBuild.id
           AND PackageBuild.archive = 20931
           AND PackageBuild.build_farm_job = BuildFarmJob.id
           AND BuildFarmJob.status = 6)) AS "_tmp";

Revision history for this message
Robert Collins (lifeless) wrote :

correlated and uncorrelated subqueries are slower on this dataset; appears to be because the number of active builds (state 6) is usually going to be near-0. A better schema would let us do this much more directly.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

All I can see you've changed is the columns it's selecting. Is that where the time is all going?

Revision history for this message
Julian Edwards (julian-edwards) wrote :

Staging is running the same query from the OOPS (that takes 3.7 seconds) in 0.5ms

Ummm.... ?

Revision history for this message
Julian Edwards (julian-edwards) wrote :

Ah that was using your archive of id 20931. I used the xorg-edgers PPA (ID 2034) and it's 2.7 seconds.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

But once the data is in memory cache the same query takes 29ms.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

The attached branch takes the query down to 430ms on staging on a cold cache using the Ubuntu archive in the query. Repeating the query takes 30ms so we're avoiding a lot of cache misses on SPR I think.

Changed in launchpad:
status: Triaged → In Progress
assignee: nobody → Julian Edwards (julian-edwards)
Revision history for this message
Robert Collins (lifeless) wrote :

So the time is indeed in reading in SPR rows that are unneeded - the distinct count of the intermediary table forces the full table to materialised (to perform the DISTINCT). Dropping the output to just the unique key, which is all we actually care about, means that creating the full table is a lot cheaper.

Revision history for this message
Robert Collins (lifeless) wrote :

Oh, and you need a main sized copy archive with a big chunk of pending builds to show the poor behaviour - the one I listed on qastaging shows that up. Something the size of xorg edgers won't show as poor initial behaviour, nor such a win.

Ursula Junque (ursinha)
tags: removed: oops
Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
Changed in launchpad:
milestone: none → 11.02
tags: added: qa-needstesting
Changed in launchpad:
status: In Progress → Fix Committed
tags: added: qa-ok
removed: qa-needstesting
William Grant (wgrant)
Changed in launchpad:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.