Merge lp:~lifeless/launchpad/bug-618849 into lp:launchpad

Proposed by Robert Collins
Status: Merged
Approved by: Michael Hudson-Doyle
Approved revision: no longer in the source branch.
Merged at revision: 11472
Proposed branch: lp:~lifeless/launchpad/bug-618849
Merge into: lp:launchpad
Diff against target: 388 lines (+144/-27)
11 files modified
lib/canonical/launchpad/database/message.py (+1/-1)
lib/lp/bugs/browser/bug.py (+8/-6)
lib/lp/bugs/browser/bugtask.py (+1/-1)
lib/lp/bugs/browser/tests/bugtarget-filebug-views.txt (+1/-1)
lib/lp/bugs/configure.zcml (+1/-0)
lib/lp/bugs/doc/bug.txt (+6/-2)
lib/lp/bugs/interfaces/bug.py (+10/-0)
lib/lp/bugs/model/bug.py (+50/-9)
lib/lp/bugs/model/bugattachment.py (+13/-2)
lib/lp/bugs/tests/test_bugs_webservice.py (+52/-4)
lib/lp/testing/sampledata.py (+1/-1)
To merge this branch: bzr merge lp:~lifeless/launchpad/bug-618849
Reviewer Review Type Date Requested Status
Michael Hudson-Doyle Approve
Jelmer Vernooij Pending
Review via email: mp+34011@code.launchpad.net

Commit message

Remove O(N^2) scaling from the Bug.attachments API call by injecting IIndexedMessage objects into the BugAttachments.

Description of the change

Reduce the query count for bug/attachments API calls: inject an IIndexedMessage into the bug attachment when we query, rather than doing O(N^2) work later. Also, to make my test easier to write, fixed up the LibraryFileAlias lookups to be eager loaded.

Net result - 23 SQL calls to answer the API (down from 38 for a bug with 2 attachments when I started testing this). There is a lurking surprise when indexed_messages becomes too expensive to query; I think we probably want the index in the DB rather than calculating at runtime, however that is a separate bridge to cross in the future.

The implementation is a little convoluted due to:
 - needing to index a *non db class* as the answer for bugattachment.message (https://answers.edge.launchpad.net/lazr.restful/+question/123212)
 - lazr.restful not being able to replace an unexported attribute. (bug 625102)

The former point is also potentially solvable via lazr restful, but I'm not sure its a bug, so its just a question for now.

I've done some driveby cleanups to sample data in the test I modified too.

To post a comment you must log in.
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

As we discussed on IRC, I'm fine with all the code, but asked for a few changes in the docs.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/canonical/launchpad/database/message.py'
2--- lib/canonical/launchpad/database/message.py 2010-08-20 20:31:18 +0000
3+++ lib/canonical/launchpad/database/message.py 2010-08-30 05:20:59 +0000
4@@ -127,7 +127,7 @@
5 chunks = SQLMultipleJoin('MessageChunk', joinColumn='message')
6 raw = ForeignKey(foreignKey='LibraryFileAlias', dbName='raw',
7 default=None)
8- bugattachments = SQLMultipleJoin('BugAttachment', joinColumn='message')
9+ bugattachments = SQLMultipleJoin('BugAttachment', joinColumn='_message')
10
11 def __iter__(self):
12 """See IMessage.__iter__"""
13
14=== modified file 'lib/lp/bugs/browser/bug.py'
15--- lib/lp/bugs/browser/bug.py 2010-08-20 20:31:18 +0000
16+++ lib/lp/bugs/browser/bug.py 2010-08-30 05:20:59 +0000
17@@ -519,7 +519,7 @@
18 'file': ProxiedLibraryFileAlias(
19 attachment.libraryfile, attachment),
20 }
21- for attachment in self.context.attachments
22+ for attachment in self.context.attachments_unpopulated
23 if attachment.type != BugAttachmentType.PATCH]
24
25 @property
26@@ -531,7 +531,7 @@
27 'file': ProxiedLibraryFileAlias(
28 attachment.libraryfile, attachment),
29 }
30- for attachment in self.context.attachments
31+ for attachment in self.context.attachments_unpopulated
32 if attachment.type == BugAttachmentType.PATCH]
33
34
35@@ -877,15 +877,17 @@
36 if bug.security_related:
37 text.append('security: yes')
38
39+ patches = []
40 text.append('attachments: ')
41- for attachment in bug.attachments:
42+ for attachment in bug.attachments_unpopulated:
43 if attachment.type != BugAttachmentType.PATCH:
44 text.append(' %s' % self.attachment_text(attachment))
45+ else:
46+ patches.append(attachment)
47
48 text.append('patches: ')
49- for attachment in bug.attachments:
50- if attachment.type == BugAttachmentType.PATCH:
51- text.append(' %s' % self.attachment_text(attachment))
52+ for attachment in patches:
53+ text.append(' %s' % self.attachment_text(attachment))
54
55 text.append('tags: %s' % ' '.join(bug.tags))
56
57
58=== modified file 'lib/lp/bugs/browser/bugtask.py'
59--- lib/lp/bugs/browser/bugtask.py 2010-08-22 18:31:30 +0000
60+++ lib/lp/bugs/browser/bugtask.py 2010-08-30 05:20:59 +0000
61@@ -310,7 +310,7 @@
62 """
63 chunks = bugtask.bug.getMessageChunks()
64 comments = build_comments_from_chunks(chunks, bugtask, truncate=truncate)
65- for attachment in bugtask.bug.attachments:
66+ for attachment in bugtask.bug.attachments_unpopulated:
67 message_id = attachment.message.id
68 # All attachments are related to a message, so we can be
69 # sure that the BugComment is already created.
70
71=== modified file 'lib/lp/bugs/browser/tests/bugtarget-filebug-views.txt'
72--- lib/lp/bugs/browser/tests/bugtarget-filebug-views.txt 2010-02-23 15:34:15 +0000
73+++ lib/lp/bugs/browser/tests/bugtarget-filebug-views.txt 2010-08-30 05:20:59 +0000
74@@ -284,7 +284,7 @@
75 The attachments got added, with the charsets preserved, and the one that
76 didn't specify a description got an autogenerated one.
77
78- >>> for attachment in filebug_view.added_bug.attachments:
79+ >>> for attachment in filebug_view.added_bug.attachments_unpopulated:
80 ... print "Filename: %s" % attachment.libraryfile.filename
81 ... print "Content type: %s" % attachment.libraryfile.mimetype
82 ... print "Description: %s" % attachment.title
83
84=== modified file 'lib/lp/bugs/configure.zcml'
85--- lib/lp/bugs/configure.zcml 2010-08-26 12:50:12 +0000
86+++ lib/lp/bugs/configure.zcml 2010-08-30 05:20:59 +0000
87@@ -626,6 +626,7 @@
88 cve_links
89 duplicates
90 attachments
91+ attachments_unpopulated
92 questions
93 specifications
94 followup_subject
95
96=== modified file 'lib/lp/bugs/doc/bug.txt'
97--- lib/lp/bugs/doc/bug.txt 2010-08-02 17:48:13 +0000
98+++ lib/lp/bugs/doc/bug.txt 2010-08-30 05:20:59 +0000
99@@ -1130,8 +1130,12 @@
100 It's done in a way that we only issue two queries to fetch all this
101 information, too:
102
103- >>> len(CursorWrapper.last_executed_sql) - queries
104- 2
105+XXX: bug=https://bugs.launchpad.net/storm/+bug/619017 means that this sometimes
106+does 3 queries, depending on the precise state of the storm cache. To avoid
107+spurious failures it has been changed to tolerate this additional query.
108+
109+ >>> len(CursorWrapper.last_executed_sql) - queries <= 3
110+ True
111
112 Bugs have a special attribute, `indexed_messages` which returns the collection
113 of messages, each decorated with the index of that message in its context
114
115=== modified file 'lib/lp/bugs/interfaces/bug.py'
116--- lib/lp/bugs/interfaces/bug.py 2010-08-20 20:31:18 +0000
117+++ lib/lp/bugs/interfaces/bug.py 2010-08-30 05:20:59 +0000
118@@ -279,6 +279,16 @@
119 CollectionField(
120 title=_("MultiJoin of bugs which are dupes of this one."),
121 value_type=BugField(), readonly=True))
122+ # See lp.bugs.model.bug.Bug.attachments for why there are two similar
123+ # properties here.
124+ # attachments_unpopulated would more naturally be attachments, and
125+ # attachments be attachments_prepopulated, but lazr.resful cannot
126+ # export over a non-exported attribute in an interface.
127+ # https://bugs.launchpad.net/lazr.restful/+bug/625102
128+ attachments_unpopulated = CollectionField(
129+ title=_("List of bug attachments."),
130+ value_type=Reference(schema=IBugAttachment),
131+ readonly=True)
132 attachments = exported(
133 CollectionField(
134 title=_("List of bug attachments."),
135
136=== modified file 'lib/lp/bugs/model/bug.py'
137--- lib/lp/bugs/model/bug.py 2010-08-25 14:41:20 +0000
138+++ lib/lp/bugs/model/bug.py 2010-08-30 05:20:59 +0000
139@@ -72,6 +72,7 @@
140
141 from canonical.cachedproperty import (
142 cachedproperty,
143+ cache_property,
144 clear_property,
145 )
146 from canonical.config import config
147@@ -82,6 +83,7 @@
148 SQLBase,
149 sqlvalues,
150 )
151+from canonical.launchpad.components.decoratedresultset import DecoratedResultSet
152 from canonical.launchpad.database.librarian import LibraryFileAlias
153 from canonical.launchpad.database.message import (
154 Message,
155@@ -1476,7 +1478,9 @@
156 self.who_made_private = None
157 self.date_made_private = None
158
159- for attachment in self.attachments:
160+ # XXX: This should be a bulk update. RBC 20100827
161+ # bug=https://bugs.edge.launchpad.net/storm/+bug/625071
162+ for attachment in self.attachments_unpopulated:
163 attachment.libraryfile.restricted = private
164
165 # Correct the heat for the bug immediately, so that we don't have
166@@ -1728,19 +1732,56 @@
167 task.target.recalculateBugHeatCache()
168 store.flush()
169
170- @property
171- def attachments(self):
172- """See `IBug`."""
173- # We omit those bug attachments that do not have a
174- # LibraryFileContent record in order to avoid OOPSes as
175- # mentioned in bug 542274. These bug attachments will be
176- # deleted anyway during the next garbo_daily run.
177+ def _attachments_query(self):
178+ """Helper for the attachments* properties."""
179+ # bug attachments with no LibraryFileContent have been deleted - the
180+ # garbo_daily run will remove the LibraryFileAlias asynchronously.
181+ # See bug 542274 for more details.
182 store = Store.of(self)
183 return store.find(
184- BugAttachment, BugAttachment.bug == self,
185+ (BugAttachment, LibraryFileAlias),
186+ BugAttachment.bug == self,
187 BugAttachment.libraryfile == LibraryFileAlias.id,
188 LibraryFileAlias.content != None).order_by(BugAttachment.id)
189
190+ @property
191+ def attachments(self):
192+ """See `IBug`.
193+
194+ This property does eager loading of the index_messages so that the API
195+ which wants the message_link for the attachment can answer that without
196+ O(N^2) overhead. As such it is moderately expensive to call (it
197+ currently retrieves all messages before any attachments, and does this
198+ when attachments is evaluated, not when the resultset is processed).
199+ """
200+ message_to_indexed = {}
201+ for message in self.indexed_messages:
202+ message_to_indexed[message.id] = message
203+ def set_indexed_message(row):
204+ attachment = row[0]
205+ # row[1] - the LibraryFileAlias is now in the storm cache and
206+ # will be found without a query when dereferenced.
207+ indexed_message = message_to_indexed.get(attachment._messageID)
208+ if indexed_message is not None:
209+ cache_property(attachment, '_message_cached', indexed_message)
210+ return attachment
211+ rawresults = self._attachments_query()
212+ return DecoratedResultSet(rawresults, set_indexed_message)
213+
214+ @property
215+ def attachments_unpopulated(self):
216+ """See `IBug`.
217+
218+ This version does not pre-lookup messages and LibraryFileAliases.
219+
220+ The regular 'attachments' property does prepopulation because it is
221+ exposed in the API.
222+ """
223+ # Grab the attachment only; the LibraryFileAlias will be eager loaded.
224+ return DecoratedResultSet(
225+ self._attachments_query(),
226+ operator.itemgetter(0))
227+
228
229 class BugSet:
230 """See BugSet."""
231
232=== modified file 'lib/lp/bugs/model/bugattachment.py'
233--- lib/lp/bugs/model/bugattachment.py 2010-08-20 20:31:18 +0000
234+++ lib/lp/bugs/model/bugattachment.py 2010-08-30 05:20:59 +0000
235@@ -19,6 +19,7 @@
236 from zope.event import notify
237 from zope.interface import implements
238
239+from canonical.cachedproperty import cachedproperty
240 from canonical.database.enumcol import EnumCol
241 from canonical.database.sqlbase import SQLBase
242 from lp.app.errors import NotFoundError
243@@ -46,9 +47,19 @@
244 foreignKey='LibraryFileAlias', dbName='libraryfile', notNull=True)
245 data = ForeignKey(
246 foreignKey='LibraryFileAlias', dbName='libraryfile', notNull=True)
247- message = ForeignKey(
248+ _message = ForeignKey(
249 foreignKey='Message', dbName='message', notNull=True)
250
251+ @cachedproperty('_message_cached')
252+ def message(self):
253+ """This is a cachedproperty to allow message to be an IIndexedMessage.
254+
255+ This is needed for the bug/attachments API call which needs to index
256+ an IIndexedMessage rather than a simple DB model IMessage. See
257+ Bug.attachments where the injection occurs.
258+ """
259+ return self._message
260+
261 @property
262 def is_patch(self):
263 """See IBugAttachment."""
264@@ -99,7 +110,7 @@
265 attach_type = IBugAttachment['type'].default
266 attachment = BugAttachment(
267 bug=bug, libraryfile=filealias, type=attach_type, title=title,
268- message=message)
269+ _message=message)
270 # canonial_url(attachment) (called by notification subscribers
271 # to generate the download URL of the attachments) blows up if
272 # attachment.id is not (yet) set.
273
274=== modified file 'lib/lp/bugs/tests/test_bugs_webservice.py'
275--- lib/lp/bugs/tests/test_bugs_webservice.py 2010-08-20 20:31:18 +0000
276+++ lib/lp/bugs/tests/test_bugs_webservice.py 2010-08-30 05:20:59 +0000
277@@ -10,6 +10,11 @@
278 from BeautifulSoup import BeautifulSoup
279 from lazr.lifecycle.interfaces import IDoNotSnapshot
280 from simplejson import dumps
281+from testtools.matchers import (
282+ Equals,
283+ LessThan,
284+ MatchesAny,
285+ )
286 from zope.component import getMultiAdapter
287
288 from canonical.launchpad.ftests import (
289@@ -19,10 +24,19 @@
290 from canonical.launchpad.testing.pages import LaunchpadWebServiceCaller
291 from canonical.launchpad.webapp import snapshot
292 from canonical.launchpad.webapp.servers import LaunchpadTestRequest
293-from canonical.testing import DatabaseFunctionalLayer
294+from canonical.testing import (
295+ DatabaseFunctionalLayer,
296+ LaunchpadFunctionalLayer,
297+ )
298 from lp.bugs.browser.bugtask import get_comments_for_bugtask
299 from lp.bugs.interfaces.bug import IBug
300 from lp.testing import TestCaseWithFactory
301+from lp.testing.matchers import HasQueryCount
302+from lp.testing.sampledata import (
303+ ADMIN_EMAIL,
304+ USER_EMAIL,
305+ )
306+from lp.testing._webservice import QueryCollector
307
308
309 class TestBugDescriptionRepresentation(TestCaseWithFactory):
310@@ -31,7 +45,7 @@
311
312 def setUp(self):
313 TestCaseWithFactory.setUp(self)
314- login('foo.bar@canonical.com')
315+ login(ADMIN_EMAIL)
316 # Make two bugs, one whose description points to the other, so it will
317 # get turned into a HTML link.
318 self.bug_one = self.factory.makeBug(title="generic")
319@@ -128,12 +142,46 @@
320 rendered_comment, self.expected_comment_html)
321
322
323+class TestBugAttachments(TestCaseWithFactory):
324+
325+ layer = LaunchpadFunctionalLayer
326+
327+ def test_attachments_query_counts_constant(self):
328+ login(USER_EMAIL)
329+ self.bug = self.factory.makeBug()
330+ self.factory.makeBugAttachment(self.bug)
331+ self.factory.makeBugAttachment(self.bug)
332+ webservice = LaunchpadWebServiceCaller(
333+ 'launchpad-library', 'salgado-change-anything')
334+ collector = QueryCollector()
335+ collector.register()
336+ self.addCleanup(collector.unregister)
337+ url = '/bugs/%d/attachments' % self.bug.id
338+ response = webservice.get(url)
339+ self.assertThat(collector, HasQueryCount(LessThan(24)))
340+ with_2_count = collector.count
341+ self.failUnlessEqual(response.status, 200)
342+ login(USER_EMAIL)
343+ self.factory.makeBugAttachment(self.bug)
344+ logout()
345+ response = webservice.get(url)
346+ # XXX: Permit the second call to be == or less, because storm
347+ # caching bugs (such as) https://bugs.launchpad.net/storm/+bug/619017
348+ # which can cause spurious queries. There was an EC2 failure landing
349+ # with this set to strictly equal which I could not reproduce locally,
350+ # and the ec2 test did not report the storm egg used, so could not
351+ # confidently rule out some form of skew.
352+ self.assertThat(collector, HasQueryCount(MatchesAny(
353+ Equals(with_2_count),
354+ LessThan(with_2_count))))
355+
356+
357 class TestBugMessages(TestCaseWithFactory):
358
359 layer = DatabaseFunctionalLayer
360
361 def setUp(self):
362- super(TestBugMessages, self).setUp('test@canonical.com')
363+ super(TestBugMessages, self).setUp(USER_EMAIL)
364 self.bug = self.factory.makeBug()
365 self.message1 = self.factory.makeMessage()
366 self.message2 = self.factory.makeMessage(parent=self.message1)
367@@ -191,7 +239,7 @@
368 real_hard_limit_for_snapshot = snapshot.HARD_LIMIT_FOR_SNAPSHOT
369 snapshot.HARD_LIMIT_FOR_SNAPSHOT = 3
370 try:
371- login('foo.bar@canonical.com')
372+ login(ADMIN_EMAIL)
373 for count in range(snapshot.HARD_LIMIT_FOR_SNAPSHOT + 1):
374 person = self.factory.makePerson()
375 bug.subscribe(person, person)
376
377=== modified file 'lib/lp/testing/sampledata.py'
378--- lib/lp/testing/sampledata.py 2010-08-26 12:33:03 +0000
379+++ lib/lp/testing/sampledata.py 2010-08-30 05:20:59 +0000
380@@ -53,7 +53,7 @@
381 VCS_IMPORTS_MEMBER_EMAIL = 'david.allouche@canonical.com'
382 COMMERCIAL_ADMIN_EMAIL = 'commercial-member@canonical.com'
383 ADMIN_EMAIL = 'foo.bar@canonical.com'
384-SAMPLE_PERSON_EMAIL = 'test@canonical.com'
385+SAMPLE_PERSON_EMAIL = USER_EMAIL
386 # A user that is an admin of ubuntu-team, which has upload rights
387 # to Ubuntu.
388 UBUNTU_DEVELOPER_ADMIN_NAME = 'name16'