Archive:EntryResource:getPublishedBinaries timeouts

Bug #706200 reported by Bryce Harrington
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
Critical
Julian Edwards

Bug Description

OOPS-1868E2663

From this code:

demo:
#!/usr/bin/env python
from lpltk import LaunchpadService
lp = LaunchpadService(config={'read_only':True})
edgers = lp.launchpad.people['xorg-edgers']

edgers_ppa = edgers.getPPAByName(name="ppa")
for bin in edgers_ppa.getPublishedBinaries():
    print "Binary"

https://api.launchpad.net/1.0/~xorg-edgers/+archive/ppa?ws.op=getPublishedBinaries

slow query:
SELECT BinaryPackagePublishingHistory.archive,
       BinaryPackagePublishingHistory.binarypackagerelease,
       BinaryPackagePublishingHistory.component,
       BinaryPackagePublishingHistory.datecreated,
       BinaryPackagePublishingHistory.datemadepending,
       BinaryPackagePublishingHistory.datepublished,
       BinaryPackagePublishingHistory.dateremoved,
       BinaryPackagePublishingHistory.datesuperseded,
       BinaryPackagePublishingHistory.distroarchseries,
       BinaryPackagePublishingHistory.id,
       BinaryPackagePublishingHistory.pocket,
       BinaryPackagePublishingHistory.priority,
       BinaryPackagePublishingHistory.removal_comment,
       BinaryPackagePublishingHistory.removed_by,
       BinaryPackagePublishingHistory.scheduleddeletiondate,
       BinaryPackagePublishingHistory.section,
       BinaryPackagePublishingHistory.status,
       BinaryPackagePublishingHistory.supersededby
FROM BinaryPackageName,
     BinaryPackagePublishingHistory,
     BinaryPackageRelease
WHERE BinaryPackagePublishingHistory.archive = 2034
  AND BinaryPackagePublishingHistory.binarypackagerelease = BinaryPackageRelease.id
  AND BinaryPackageRelease.binarypackagename = BinaryPackageName.id
  AND (1=1)
ORDER BY BinaryPackageName.name,
         BinaryPackageRelease.VERSION DESC, BinaryPackagePublishingHistory.id DESC LIMIT 76
OFFSET 0

This is sporadic - I think we've got cold caching effects here.

Tags: qa-ok timeout

Related branches

Changed in launchpad:
status: New → Triaged
importance: Undecided → Critical
tags: added: timeout
summary: - Timeout OOPS trying to retrieve PPA contents
+ Archive:EntryResource:getPublishedBinaries timeouts
description: updated
Revision history for this message
Robert Collins (lifeless) wrote :

observations:
 - cannot be sorted in index - name, debversion is whats needed, only name,build or debversion are available in indices.

description: updated
Revision history for this message
Robert Collins (lifeless) wrote :

               -> Hash (cost=2470.03..2470.03 rows=141503 width=29) (actual time=227.035..227.035 rows=141503 loops=1)
                     -> Seq Scan on binarypackagename (cost=0.00..2470.03 rows=141503 width=29) (actual time=0.016..75.493 rows=141503 loops=1)
 is a little bit of nuisance - 75ms spent building a hash of binary package names.

this is then used to do the join into binarypackagerelease package name, which is solely used to do the sort. I wonder if its possible to build an index on that in-table. That, or perhaps we should inline the name. However, trying without the bpn in the query at all is still 11 seconds. So thats not sufficient and possibly not necessary.

querying with neither binarypackage release or binarypackage name returns 76 rows in 6ms (but probably not the 76 rows we want :)).

Revision history for this message
Robert Collins (lifeless) wrote :

So in english this is:
 - find publications
 - in archive 2034
 - sorted by their english package name
 - and then their debian version

Datawise, there are 55K publications in total.
And (reasonably, since there are more package names than publications) postgresql is consulting every single publication, looking up its name and version, and then sorting the result.

It takes 10 seconds to calculate the debversion_sort_key for 55K rows, and because we're considering every possible version we're calculating the debversion_sort_key for every candidate row : we need an index on *version* where the sort is by *debversion* to fix this up.

 explain analyze select debversion_sort_key(version) from binarypackagepublishinghistory, binarypackagerelease where archive=2034 and BinaryPackagePublishingHistory.binarypackagerelease = BinaryPackageRelease.id;
                                                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop (cost=902.15..353690.83 rows=56248 width=21) (actual time=19.874..10344.132 rows=55469 loops=1)
   -> Bitmap Heap Scan on binarypackagepublishinghistory (cost=902.15..96928.16 rows=56248 width=4) (actual time=19.309..90.454 rows=55469 loops=1)
         Recheck Cond: (archive = 2034)
         -> Bitmap Index Scan on securebinarypackagepublishinghistory__archive__status__idx (cost=0.00..888.09 rows=56248 width=0) (actual time=15.580..15.580 rows=55469 loops=1)
               Index Cond: (archive = 2034)
   -> Index Scan using binarypackage_pkey on binarypackagerelease (cost=0.00..4.30 rows=1 width=25) (actual time=0.007..0.008 rows=1 loops=55469)
         Index Cond: (binarypackagerelease.id = binarypackagepublishinghistory.binarypackagerelease)
 Total runtime: 10356.986 ms
(8 rows)

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (3.4 KiB)

Dropping the debversion_sort_key does indeed work better: this query
SELECT BinaryPackagePublishingHistory.archive, BinaryPackagePublishingHistory.binarypackagerelease, BinaryPackagePublishingHistory.component, BinaryPackagePublishingHistory.datecreated, BinaryPackagePublishingHistory.datemadepending, BinaryPackagePublishingHistory.datepublished, BinaryPackagePublishingHistory.dateremoved, BinaryPackagePublishingHistory.datesuperseded, BinaryPackagePublishingHistory.distroarchseries, BinaryPackagePublishingHistory.id, BinaryPackagePublishingHistory.pocket, BinaryPackagePublishingHistory.priority, BinaryPackagePublishingHistory.removal_comment, BinaryPackagePublishingHistory.removed_by, BinaryPackagePublishingHistory.scheduleddeletiondate, BinaryPackagePublishingHistory.section, BinaryPackagePublishingHistory.status, BinaryPackagePublishingHistory.supersededby
FROM BinaryPackageName, BinaryPackagePublishingHistory, BinaryPackageRelease
WHERE BinaryPackagePublishingHistory.archive = 2034
  AND BinaryPackagePublishingHistory.binarypackagerelease = BinaryPackageRelease.id
  AND BinaryPackageRelease.binarypackagename = BinaryPackageName.id
  AND (1=1)
ORDER BY BinaryPackageName.name, BinaryPackageRelease.VERSION DESC, BinaryPackagePublishingHistory.id DESC LIMIT 76
OFFSET 0;

Time: 710.060 ms

                                                                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=347241.95..347242.14 rows=76 width=163) (actual time=819.017..819.063 rows=76 loops=1)
   -> Sort (cost=347241.95..347382.57 rows=56248 width=163) (actual time=819.015..819.033 rows=76 loops=1)
         Sort Key: binarypackagename.name, binarypackagerelease.version, binarypackagepublishinghistory.id
         Sort Method: top-N heapsort Memory: 44kB
         -> Hash Join (cost=5140.97..345203.54 rows=56248 width=163) (actual time=185.810..713.196 rows=55469 loops=1)
               Hash Cond: (binarypackagerelease.binarypackagename = binarypackagename.id)
               -> Nested Loop (cost=902.15..339628.83 rows=56248 width=142) (actual time=22.049..467.079 rows=55469 loops=1)
                     -> Bitmap Heap Scan on binarypackagepublishinghistory (cost=902.15..96928.16 rows=56248 width=117) (actual time=22.017..66.129 rows=55469 loops=1)
                           Recheck Cond: (archive = 2034)
                           -> Bitmap Index Scan on securebinarypackagepublishinghistory__archive__status__idx (cost=0.00..888.09 rows=56248 width=0) (actual time=17.997..17.997 rows=55469 loops=1)
                                 Index Cond: (archive = 2034)
                     -> Index Scan using binarypackage_pkey on binarypackagerelease (cost=0.00..4.30 rows=1 width=29) (actual time=0.005..0.006 rows=1 loops=55469)
                           Index Cond: (binarypackagerelease.id = binarypackagepublishinghistory.binarypackagerelease)
      ...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :

I would fix this by storing the debversion_sort_key result in binary package release

Revision history for this message
Robert Collins (lifeless) wrote :

chatted briefly with stub; this is a reasonable thing to do.

Revision history for this message
Bryce Harrington (bryce) wrote : Re: [Bug 706200] Re: Timeout OOPS trying to retrieve PPA contents

On Sat, Jan 22, 2011 at 01:42:02PM -0000, Robert Collins wrote:
> edgers_ppa = edgers.getPPAByName(name="ppa")
> for bin in edgers_ppa.getPublishedBinaries():
> +     print "Binary"

It worked if I limited it to getPublishedBinaries(status="Published")

Revision history for this message
Robert Collins (lifeless) wrote :

On Sun, Jan 23, 2011 at 9:53 AM, Bryce Harrington
<email address hidden> wrote:
> On Sat, Jan 22, 2011 at 01:42:02PM -0000, Robert Collins wrote:
>>   edgers_ppa = edgers.getPPAByName(name="ppa")
>>   for bin in edgers_ppa.getPublishedBinaries():
>> +     print "Binary"
>
> It worked if I limited it to getPublishedBinaries(status="Published")

Not surprising - there are 55000 obsolete records in that PPA. I
suspect we don't need them all ...

Changed in launchpad:
assignee: nobody → Julian Edwards (julian-edwards)
status: Triaged → In Progress
Revision history for this message
Robert Collins (lifeless) wrote :

Julians branch brings in a binary datatype which will also avoid the function call overhead and should make things considerably better.

Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
Changed in launchpad:
milestone: none → 11.02
tags: added: qa-needstesting
Changed in launchpad:
status: In Progress → Fix Committed
Ian Booth (wallyworld)
tags: added: qa-ok
removed: qa-needstesting
Revision history for this message
Robert Collins (lifeless) wrote :

With the debversion type, once hot, this is a 700ms query on qastaging. \o/

Revision history for this message
Julian Edwards (julian-edwards) wrote :

Just a liiiiitle bit quicker :)

Curtis Hovey (sinzui)
Changed in launchpad:
status: Fix Committed → Fix Released
description: updated
description: updated
Revision history for this message
Robert Collins (lifeless) wrote :

I've reopened this, because it suffers substantially from cold cache effects - we're still likely to see OOPSes on it. Its massively faster when hot, but we haven't solved the cold case yet. (By cold , I just hit the url on production and it timed out).

description: updated
Changed in launchpad:
status: Fix Released → Triaged
Revision history for this message
Julian Edwards (julian-edwards) wrote :

The only way I can think of speeding that up more is to make sure we have an index that can be used for sorting - right now the sort columns are split across 2 tables.

Essentially that means de-normalising the name :/

Revision history for this message
Robert Collins (lifeless) wrote :

We could join on a string rather than an int as an option. what do we use the normalisation for here - making the table narrower presumably, but anything else?

Revision history for this message
Julian Edwards (julian-edwards) wrote : Re: [Bug 706200] Re: Archive:EntryResource:getPublishedBinaries timeouts

On Friday 11 February 2011 22:01:04 you wrote:
> We could join on a string rather than an int as an option. what do we
> use the normalisation for here - making the table narrower presumably,
> but anything else?

I can't think of any other reason. It's basic data factoring.

Revision history for this message
Robert Collins (lifeless) wrote :

Ok, so its used by:
    TABLE "binarypackagerelease" CONSTRAINT
    TABLE "bugtask" CONSTRAINT
    TABLE "distroseriespackagecache" CONSTRAINT
    TABLE "packageselection" CONSTRAINT
    TABLE "potemplate" CONSTRAINT

for bugtask:
launchpad_qastaging=> select coalesce(binarypackagename is null,
false), count(*) from bugtask group by coalesce(binarypackagename is
null, false);
 coalesce | count
----------+--------
 f | 25407
 t | 739371

so its very much not pulling its weight; for the other cases it may be
- I'm sure its a few bytes saving though :).

Obviously for binarypackagerelease its a 1:1, I'll get stubs opinion
Monday, but I suspect we either need to:
 - delete old publications
 - move to an archive schema
 - have the package name inline
 - stop sorting by package name

Revision history for this message
Julian Edwards (julian-edwards) wrote :

I don't really understand what point you're trying to make here, sorry :(

> for binarypackagerelease its a 1:1

If I understand correctly, that's not true. There can be many versions of the same package.

Revision history for this message
Robert Collins (lifeless) wrote :

On Tue, Feb 15, 2011 at 1:36 AM, Julian Edwards
<email address hidden> wrote:
> I don't really understand what point you're trying to make here, sorry
> :(
>
>> for binarypackagerelease its a 1:1
>
> If I understand correctly, that's not true.  There can be many versions
> of the same package.

Yes, my bad, I totally confused the issue.

What I was trying to do was get a handle on the size impact of having
the name be an inline string rather than a normalised string in a
different table.

Revision history for this message
Robert Collins (lifeless) wrote :

Discussed with stub now:

18:30 < stub> I'd happily drop the *PackageName tables - never liked them.

was his only comment.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Tuesday 15 February 2011 05:33:24 you wrote:
> Discussed with stub now:
>
> 18:30 < stub> I'd happily drop the *PackageName tables - never liked
> them.
>
> was his only comment.

Heh, ok.

I'll make a branch. It won't be pretty.

Revision history for this message
Robert Collins (lifeless) wrote :

If I was working on this, I'd start with just binary package name,
focused on this one use case: rather than disrupt the whole system,
take it one step at a time. That might help with branch prettiness ;)

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Tuesday 15 February 2011 10:15:33 you wrote:
> If I was working on this, I'd start with just binary package name,
> focused on this one use case: rather than disrupt the whole system,
> take it one step at a time. That might help with branch prettiness ;)

So I was being somewhat tongue-in-cheek. :)

This is a *massive* change. There's no real way of doing it incrementally
either.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

I think we should mark this fix-released and open a new bug with more specific requirements.

Changed in launchpad:
status: Triaged → Fix Released
Revision history for this message
Julian Edwards (julian-edwards) wrote :
Revision history for this message
Gary Poster (gary) wrote :

See bug 904344 for another approach to fixing this.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.