Merge lp:~allenap/launchpad/dynamic-batch-size-bug-546085 into lp:launchpad/db-devel

Proposed by Gavin Panella
Status: Merged
Approved by: Gavin Panella
Approved revision: no longer in the source branch.
Merged at revision: not available
Proposed branch: lp:~allenap/launchpad/dynamic-batch-size-bug-546085
Merge into: lp:launchpad/db-devel
Diff against target: 595 lines (+202/-125)
11 files modified
lib/lp/bugs/doc/checkwatches.txt (+6/-3)
lib/lp/bugs/doc/externalbugtracker-debbugs.txt (+3/-3)
lib/lp/bugs/doc/externalbugtracker.txt (+5/-2)
lib/lp/bugs/externalbugtracker/__init__.py (+4/-1)
lib/lp/bugs/externalbugtracker/base.py (+5/-1)
lib/lp/bugs/externalbugtracker/debbugs.py (+8/-7)
lib/lp/bugs/scripts/checkwatches/tests/test_scheduler.py (+110/-0)
lib/lp/bugs/scripts/checkwatches/tests/test_updater.py (+15/-0)
lib/lp/bugs/scripts/checkwatches/updater.py (+40/-16)
lib/lp/bugs/tests/externalbugtracker.py (+4/-2)
lib/lp/bugs/tests/test_bugwatch.py (+2/-90)
To merge this branch: bzr merge lp:~allenap/launchpad/dynamic-batch-size-bug-546085
Reviewer Review Type Date Requested Status
Edwin Grubbs (community) code Approve
Review via email: mp+22538@code.launchpad.net

Commit message

In checkwatches, calculate a sensible batch size at run-time based on the number of watches there are for a bug tracker.

Description of the change

When checkwatches runs it observes a batch_size attribute on the object representing the remote system. This is fine, but for bug trackers with an especially large number of bug watches the batch size is a bit too low. This branch makes batch_size equal to 2% of the number of bug watches recorded for a bug tracker, or 100, whichever is greater. For GNOME Bugs this works out at about 300 (GNOME Bugs is the largest active bug tracker registered in Launchpad, with >15000 bug watches).

I've also moved the test for BugWatchScheduler into the lp.bugs.scripts.checkwatches.tests package, and the tests for the updater too.

To post a comment you must log in.
Revision history for this message
Edwin Grubbs (edwin-grubbs) wrote :

Hi Gavin,

This is a nice branch. I just have two comments besides the lint errors.

merge-conditional

-Edwin

== Pyflakes notices ==

lib/lp/bugs/externalbugtracker/debbugs.py
    124: local variable 'severity' is assigned to but never used

lib/lp/bugs/scripts/checkwatches/tests/test_updater.py
    157: local variable 'bug' is assigned to but never used

>=== modified file 'lib/lp/bugs/externalbugtracker/debbugs.py'
>--- lib/lp/bugs/externalbugtracker/debbugs.py 2010-03-16 16:52:42 +0000
>+++ lib/lp/bugs/externalbugtracker/debbugs.py 2010-03-31 15:33:31 +0000
>@@ -61,7 +61,7 @@
> # Because we keep a local copy of debbugs, we remove the batch_size
> # limit so that all debbugs watches that need checking will be
> # checked each time checkwatches runs.
>- batch_size = None
>+ batch_size = 0

Zero is a magic number. It would be good to define it as a constant
as opposed to relying on comments to explain what it means. You might
want to define the constant as a unique object such as:
    UNLIMITED = object()
so that there is never the possiblity of someone using zero and getting
the opposite results than they expect. That would allow you to get
rid of this if-statement in _getRemoteIdsToCheck() where you change
the meaning of the magic values.
    if batch_size == 0:
        batch_size = None

>
> def __init__(self, baseurl, db_location=None):
> super(DebBugs, self).__init__(baseurl)
>
>=== modified file 'lib/lp/bugs/scripts/checkwatches/updater.py'
>--- lib/lp/bugs/scripts/checkwatches/updater.py 2010-03-26 21:03:30 +0000
>+++ lib/lp/bugs/scripts/checkwatches/updater.py 2010-03-31 15:33:31 +0000
>@@ -242,6 +247,22 @@
> yield item
>
>
>+def suggest_batch_size(remote_system, num_watches):
>+ """Suggest a value for batch_size if it's not set.
>+
>+ Givend the number of bug watches for an `remote_system`, this sets
>+ a suggested batch size on it. If `remote_system` already has a
>+ batch size set, this does not override it.
>+
>+ :param remote_system: An `ExternalBugTracker`.
>+ :param num_watches: The number of watches for `remote_system`.
>+ """
>+ if remote_system.batch_size is None:
>+ remote_system.batch_size = max(
>+ SUGGESTED_BATCH_SIZE_MIN, int(
>+ SUGGESTED_BATCH_SIZE_PROPORTION * num_watches))

I think it would be easier to read with this wrapping.
        remote_system.batch_size = max(
            SUGGESTED_BATCH_SIZE_MIN,
            int(SUGGESTED_BATCH_SIZE_PROPORTION * num_watches))

>+
> class BugWatchUpdater(object):
> """Takes responsibility for updating remote bug watches."""
>

review: Approve (code)
Revision history for this message
Gavin Panella (allenap) wrote :

Post-review diff, constructed from revisions 9177, 9178 and 9180.

1--- lib/lp/bugs/externalbugtracker/__init__.py 2009-06-25 00:40:31 +0000
2+++ lib/lp/bugs/externalbugtracker/__init__.py 2010-04-07 14:25:18 +0000
3@@ -7,7 +7,7 @@
4
5 __metaclass__ = type
6 __all__ = [
7- 'get_external_bugtracker',
8+ 'BATCH_SIZE_UNLIMITED',
9 'BugNotFound',
10 'BugTrackerConnectError',
11 'BugWatchUpdateError',
12@@ -30,6 +30,7 @@
13 'UnparseableBugData',
14 'UnparseableBugTrackerVersion',
15 'UnsupportedBugTrackerVersion',
16+ 'get_external_bugtracker',
17 ]
18
19 from lp.bugs.externalbugtracker.base import *
20@@ -41,6 +42,8 @@
21 from lp.bugs.externalbugtracker.rt import *
22 from lp.bugs.externalbugtracker.trac import *
23 from lp.bugs.interfaces.bugtracker import BugTrackerType
24+
25+
26 BUG_TRACKER_CLASSES = {
27 BugTrackerType.BUGZILLA: Bugzilla,
28 BugTrackerType.DEBBUGS: DebBugs,
29--- lib/lp/bugs/externalbugtracker/base.py 2010-03-30 12:19:11 +0000
30+++ lib/lp/bugs/externalbugtracker/base.py 2010-04-07 14:25:18 +0000
31@@ -5,6 +5,7 @@
32
33 __metaclass__ = type
34 __all__ = [
35+ 'BATCH_SIZE_UNLIMITED',
36 'BugNotFound',
37 'BugTrackerAuthenticationError',
38 'BugTrackerConnectError',
39@@ -39,6 +40,9 @@
40 # The user agent we send in our requests
41 LP_USER_AGENT = "Launchpad Bugscraper/0.2 (https://bugs.launchpad.net/)"
42
43+# To signify that all bug watches should be checked in a single run.
44+BATCH_SIZE_UNLIMITED = 0
45+
46
47 #
48 # Errors.
49--- lib/lp/bugs/externalbugtracker/debbugs.py 2010-03-26 17:32:38 +0000
50+++ lib/lp/bugs/externalbugtracker/debbugs.py 2010-04-07 14:25:18 +0000
51@@ -22,18 +22,19 @@
52
53 from canonical.config import config
54 from canonical.database.sqlbase import commit
55+from canonical.launchpad.interfaces.message import IMessageSet
56+from canonical.launchpad.mail import simple_sendmail
57+from canonical.launchpad.webapp import urlsplit
58+
59 from lp.bugs.externalbugtracker import (
60- BugNotFound, BugTrackerConnectError, ExternalBugTracker,
61- InvalidBugId, UnknownRemoteStatusError)
62-from canonical.launchpad.interfaces.message import IMessageSet
63+ BATCH_SIZE_UNLIMITED, BugNotFound, BugTrackerConnectError,
64+ ExternalBugTracker, InvalidBugId, UnknownRemoteStatusError)
65+from lp.bugs.externalbugtracker.isolation import ensure_no_transaction
66 from lp.bugs.interfaces.bugtask import BugTaskImportance, BugTaskStatus
67 from lp.bugs.interfaces.externalbugtracker import (
68 ISupportsBugImport, ISupportsCommentImport, ISupportsCommentPushing,
69 UNKNOWN_REMOTE_IMPORTANCE)
70-from canonical.launchpad.mail import simple_sendmail
71 from lp.bugs.scripts import debbugs
72-from canonical.launchpad.webapp import urlsplit
73-from lp.bugs.externalbugtracker.isolation import ensure_no_transaction
74
75
76 debbugsstatusmap = {'open': BugTaskStatus.NEW,
77@@ -61,7 +62,7 @@
78 # Because we keep a local copy of debbugs, we remove the batch_size
79 # limit so that all debbugs watches that need checking will be
80 # checked each time checkwatches runs.
81- batch_size = 0
82+ batch_size = BATCH_SIZE_UNLIMITED
83
84 def __init__(self, baseurl, db_location=None):
85 super(DebBugs, self).__init__(baseurl)
86--- lib/lp/bugs/scripts/checkwatches/updater.py 2010-03-31 10:51:17 +0000
87+++ lib/lp/bugs/scripts/checkwatches/updater.py 2010-04-07 14:25:18 +0000
88@@ -61,16 +61,15 @@
89
90 from lp.bugs import externalbugtracker
91 from lp.bugs.externalbugtracker import (
92- BugNotFound, BugTrackerConnectError, BugWatchUpdateError,
93- BugWatchUpdateWarning, InvalidBugId, PrivateRemoteBug,
94- UnknownBugTrackerTypeError, UnknownRemoteStatusError, UnparseableBugData,
95- UnparseableBugTrackerVersion, UnsupportedBugTrackerVersion)
96-from lp.bugs.externalbugtracker.bugzilla import (
97- BugzillaAPI)
98+ BATCH_SIZE_UNLIMITED, BugNotFound, BugTrackerConnectError,
99+ BugWatchUpdateError, BugWatchUpdateWarning, InvalidBugId,
100+ PrivateRemoteBug, UnknownBugTrackerTypeError, UnknownRemoteStatusError,
101+ UnparseableBugData, UnparseableBugTrackerVersion,
102+ UnsupportedBugTrackerVersion)
103+from lp.bugs.externalbugtracker.bugzilla import BugzillaAPI
104 from lp.bugs.externalbugtracker.isolation import check_no_transaction
105 from lp.bugs.interfaces.bug import IBugSet
106-from lp.bugs.interfaces.externalbugtracker import (
107- ISupportsBackLinking)
108+from lp.bugs.interfaces.externalbugtracker import ISupportsBackLinking
109 from lp.services.limitedlist import LimitedList
110 from lp.services.scripts.base import LaunchpadCronScript
111
112@@ -250,17 +249,17 @@
113 def suggest_batch_size(remote_system, num_watches):
114 """Suggest a value for batch_size if it's not set.
115
116- Givend the number of bug watches for an `remote_system`, this sets
117- a suggested batch size on it. If `remote_system` already has a
118- batch size set, this does not override it.
119+ Given the number of bug watches for a `remote_system`, this sets a
120+ suggested batch size on it. If `remote_system` already has a batch
121+ size set, this does not override it.
122
123 :param remote_system: An `ExternalBugTracker`.
124 :param num_watches: The number of watches for `remote_system`.
125 """
126 if remote_system.batch_size is None:
127 remote_system.batch_size = max(
128- SUGGESTED_BATCH_SIZE_MIN, int(
129- SUGGESTED_BATCH_SIZE_PROPORTION * num_watches))
130+ SUGGESTED_BATCH_SIZE_MIN,
131+ int(SUGGESTED_BATCH_SIZE_PROPORTION * num_watches))
132
133
134 class BugWatchUpdater(object):
135@@ -402,7 +401,12 @@
136 `BaseScheduler`. If no scheduler is given, `SerialScheduler`
137 will be used, which simply runs the jobs in order.
138 """
139- self.log.debug("Using a global batch size of %s" % batch_size)
140+ if batch_size is None:
141+ self.log.debug("No global batch size specified.")
142+ elif batch_size == BATCH_SIZE_UNLIMITED:
143+ self.log.debug("Using an unlimited global batch size.")
144+ else:
145+ self.log.debug("Using a global batch size of %s" % batch_size)
146
147 # Default to using the very simple SerialScheduler.
148 if scheduler is None:
149@@ -681,11 +685,6 @@
150 # by the ExternalBugTracker.
151 batch_size = remotesystem.batch_size
152
153- if batch_size == 0:
154- # A batch_size of 0 means that there's no batch size limit
155- # for this bug tracker.
156- batch_size = None
157-
158 with self.transaction:
159 old_bug_watches = set(
160 bug_watch for bug_watch in bug_watches
161@@ -718,7 +717,7 @@
162 # are actually some bugs that we're interested in so as to
163 # avoid unnecessary network traffic.
164 if server_time is not None and len(remote_old_ids) > 0:
165- if batch_size is None:
166+ if batch_size == BATCH_SIZE_UNLIMITED:
167 remote_old_ids_to_check = (
168 remotesystem.getModifiedRemoteBugs(
169 remote_old_ids, oldest_lastchecked))
170@@ -746,7 +745,7 @@
171 remote_ids_to_check = chain(
172 remote_ids_with_comments, remote_new_ids, remote_old_ids_to_check)
173
174- if batch_size is not None:
175+ if batch_size != BATCH_SIZE_UNLIMITED:
176 # Some remote bug IDs may appear in more than one list so
177 # we must filter the list before slicing.
178 remote_ids_to_check = islice(
179--- lib/lp/bugs/doc/checkwatches.txt 2010-03-30 17:25:52 +0000
180+++ lib/lp/bugs/doc/checkwatches.txt 2010-04-07 16:08:16 +0000
181@@ -45,7 +45,7 @@
182
183 >>> print err
184 INFO creating lockfile
185- DEBUG Using a global batch size of None
186+ DEBUG No global batch size specified.
187 DEBUG Skipping updating Ubuntu Bugzilla watches.
188 DEBUG No watches to update on http://bugs.debian.org
189 DEBUG No watches to update on mailto:bugs@example.com
190@@ -274,7 +274,7 @@
191 ... updater.log = original_log
192 ... externalbugtracker.Roundup.batch_size = batch_size
193 ... urllib2.urlopen = urlopen
194- DEBUG Using a global batch size of None
195+ DEBUG No global batch size specified.
196 INFO Updating 5 watches for 5 bugs on http://bugs.example.com
197 ERROR Connection timed out when updating ... (OOPS-...)
198
199@@ -412,7 +412,8 @@
200 >>> from lp.bugs.interfaces.externalbugtracker import (
201 ... ISupportsCommentImport, ISupportsCommentPushing,
202 ... ISupportsBackLinking)
203- >>> from lp.bugs.externalbugtracker.base import ExternalBugTracker
204+ >>> from lp.bugs.externalbugtracker.base import (
205+ ... BATCH_SIZE_UNLIMITED, ExternalBugTracker)
206
207 >>> nowish = datetime.now(utc)
208 >>> class UselessExternalBugTracker(ExternalBugTracker):
209@@ -421,6 +422,8 @@
210 ... ISupportsBackLinking, ISupportsCommentImport,
211 ... ISupportsCommentPushing)
212 ...
213+ ... batch_size = BATCH_SIZE_UNLIMITED
214+ ...
215 ... def initializeRemoteBugDB(self, bug_ids):
216 ... # This just exists to stop errors from being raised.
217 ... pass
218--- lib/lp/bugs/doc/externalbugtracker-debbugs.txt 2010-03-26 13:48:53 +0000
219+++ lib/lp/bugs/doc/externalbugtracker-debbugs.txt 2010-04-07 16:08:16 +0000
220@@ -14,7 +14,7 @@
221 You can specify the db_location explicitly:
222
223 >>> from lp.bugs.externalbugtracker import (
224- ... DebBugs)
225+ ... BATCH_SIZE_UNLIMITED, DebBugs)
226 >>> from canonical.testing import LaunchpadZopelessLayer
227 >>> txn = LaunchpadZopelessLayer.txn
228 >>> external_debbugs = DebBugs(
229@@ -42,8 +42,8 @@
230 worry about batching bug watch updates for performance reasons, so
231 DebBugs instances don't have a batch_size limit.
232
233- >>> print external_debbugs.batch_size
234- None
235+ >>> external_debbugs.batch_size == BATCH_SIZE_UNLIMITED
236+ True
237
238
239 == Retrieving bug status from the debbugs database ==
240--- lib/lp/bugs/tests/externalbugtracker.py 2009-12-23 12:14:59 +0000
241+++ lib/lp/bugs/tests/externalbugtracker.py 2010-04-07 16:08:16 +0000
242@@ -24,8 +24,8 @@
243 from canonical.config import config
244 from canonical.database.sqlbase import commit, ZopelessTransactionManager
245 from lp.bugs.externalbugtracker import (
246- BugNotFound, BugTrackerConnectError, Bugzilla, DebBugs,
247- ExternalBugTracker, Mantis, RequestTracker, Roundup, SourceForge,
248+ BATCH_SIZE_UNLIMITED, BugNotFound, BugTrackerConnectError, Bugzilla,
249+ DebBugs, ExternalBugTracker, Mantis, RequestTracker, Roundup, SourceForge,
250 Trac)
251 from lp.bugs.externalbugtracker.trac import (
252 FAULT_TICKET_NOT_FOUND, LP_PLUGIN_BUG_IDS_ONLY, LP_PLUGIN_FULL,
253@@ -152,6 +152,8 @@
254 implementation, though it doesn't actually do anything.
255 """
256
257+ batch_size = BATCH_SIZE_UNLIMITED
258+
259 def __init__(self, baseurl='http://example.com/'):
260 super(TestExternalBugTracker, self).__init__(baseurl)
261
Revision history for this message
Gavin Panella (allenap) wrote :

Conversation re. additional changes from IRC:

<EdwinGrubbs> allenap: I'm surprised that you added a line to tests/externalbugtracker.py to set the batch_size to BATCH_SIZE_UNLIMITED, but it didn't change the results of any of the tests.
<allenap> EdwinGrubbs: It kind of does the same as lines 153 to 157 used to do.
<allenap> EdwinGrubbs: ... because most calls into updateBugWatches() and _getRemoteIdsToCheck() didn't specify batch_size.
<EdwinGrubbs> allenap: ok, that makes sense. Looks good.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/lp/bugs/doc/checkwatches.txt'
2--- lib/lp/bugs/doc/checkwatches.txt 2010-04-06 21:36:16 +0000
3+++ lib/lp/bugs/doc/checkwatches.txt 2010-04-08 15:38:28 +0000
4@@ -45,7 +45,7 @@
5
6 >>> print err
7 INFO Creating lockfile: /var/lock/launchpad-checkwatches.lock
8- DEBUG Using a global batch size of None
9+ DEBUG No global batch size specified.
10 DEBUG Skipping updating Ubuntu Bugzilla watches.
11 DEBUG No watches to update on http://bugs.debian.org
12 DEBUG No watches to update on mailto:bugs@example.com
13@@ -274,7 +274,7 @@
14 ... updater.log = original_log
15 ... externalbugtracker.Roundup.batch_size = batch_size
16 ... urllib2.urlopen = urlopen
17- DEBUG Using a global batch size of None
18+ DEBUG No global batch size specified.
19 INFO Updating 5 watches for 5 bugs on http://bugs.example.com
20 ERROR Connection timed out when updating ... (OOPS-...)
21
22@@ -412,7 +412,8 @@
23 >>> from lp.bugs.interfaces.externalbugtracker import (
24 ... ISupportsCommentImport, ISupportsCommentPushing,
25 ... ISupportsBackLinking)
26- >>> from lp.bugs.externalbugtracker.base import ExternalBugTracker
27+ >>> from lp.bugs.externalbugtracker.base import (
28+ ... BATCH_SIZE_UNLIMITED, ExternalBugTracker)
29
30 >>> nowish = datetime.now(utc)
31 >>> class UselessExternalBugTracker(ExternalBugTracker):
32@@ -421,6 +422,8 @@
33 ... ISupportsBackLinking, ISupportsCommentImport,
34 ... ISupportsCommentPushing)
35 ...
36+ ... batch_size = BATCH_SIZE_UNLIMITED
37+ ...
38 ... def initializeRemoteBugDB(self, bug_ids):
39 ... # This just exists to stop errors from being raised.
40 ... pass
41
42=== modified file 'lib/lp/bugs/doc/externalbugtracker-debbugs.txt'
43--- lib/lp/bugs/doc/externalbugtracker-debbugs.txt 2010-03-26 13:48:53 +0000
44+++ lib/lp/bugs/doc/externalbugtracker-debbugs.txt 2010-04-08 15:38:28 +0000
45@@ -14,7 +14,7 @@
46 You can specify the db_location explicitly:
47
48 >>> from lp.bugs.externalbugtracker import (
49- ... DebBugs)
50+ ... BATCH_SIZE_UNLIMITED, DebBugs)
51 >>> from canonical.testing import LaunchpadZopelessLayer
52 >>> txn = LaunchpadZopelessLayer.txn
53 >>> external_debbugs = DebBugs(
54@@ -42,8 +42,8 @@
55 worry about batching bug watch updates for performance reasons, so
56 DebBugs instances don't have a batch_size limit.
57
58- >>> print external_debbugs.batch_size
59- None
60+ >>> external_debbugs.batch_size == BATCH_SIZE_UNLIMITED
61+ True
62
63
64 == Retrieving bug status from the debbugs database ==
65
66=== modified file 'lib/lp/bugs/doc/externalbugtracker.txt'
67--- lib/lp/bugs/doc/externalbugtracker.txt 2010-03-26 15:12:59 +0000
68+++ lib/lp/bugs/doc/externalbugtracker.txt 2010-04-08 15:38:28 +0000
69@@ -1042,13 +1042,16 @@
70
71 _getRemoteIdsToCheck() will interpret a batch_size parameter of 0 as an
72 instruction to ignore the batch size limitation altogether and just return all
73-the IDs that need checking.
74+the IDs that need checking. The constant BATCH_SIZE_UNLIMITED should
75+be used in place of using 0 verbatim.
76+
77+ >>> from lp.bugs.externalbugtracker import BATCH_SIZE_UNLIMITED
78
79 >>> ids = bug_watch_updater._getRemoteIdsToCheck(
80 ... external_bugtracker,
81 ... unchecked_watches + watches_with_comments + old_watches,
82 ... external_bugtracker.getCurrentDBTime(), utc_now,
83- ... batch_size=0)
84+ ... batch_size=BATCH_SIZE_UNLIMITED)
85 >>> print sorted(ids['remote_ids_to_check'])
86 [u'0', u'1', u'2', u'3', u'4', u'5', u'6', u'7', u'8', u'9']
87
88
89=== modified file 'lib/lp/bugs/externalbugtracker/__init__.py'
90--- lib/lp/bugs/externalbugtracker/__init__.py 2009-06-25 00:40:31 +0000
91+++ lib/lp/bugs/externalbugtracker/__init__.py 2010-04-08 15:38:28 +0000
92@@ -7,7 +7,7 @@
93
94 __metaclass__ = type
95 __all__ = [
96- 'get_external_bugtracker',
97+ 'BATCH_SIZE_UNLIMITED',
98 'BugNotFound',
99 'BugTrackerConnectError',
100 'BugWatchUpdateError',
101@@ -30,6 +30,7 @@
102 'UnparseableBugData',
103 'UnparseableBugTrackerVersion',
104 'UnsupportedBugTrackerVersion',
105+ 'get_external_bugtracker',
106 ]
107
108 from lp.bugs.externalbugtracker.base import *
109@@ -41,6 +42,8 @@
110 from lp.bugs.externalbugtracker.rt import *
111 from lp.bugs.externalbugtracker.trac import *
112 from lp.bugs.interfaces.bugtracker import BugTrackerType
113+
114+
115 BUG_TRACKER_CLASSES = {
116 BugTrackerType.BUGZILLA: Bugzilla,
117 BugTrackerType.DEBBUGS: DebBugs,
118
119=== modified file 'lib/lp/bugs/externalbugtracker/base.py'
120--- lib/lp/bugs/externalbugtracker/base.py 2010-03-26 15:12:59 +0000
121+++ lib/lp/bugs/externalbugtracker/base.py 2010-04-08 15:38:28 +0000
122@@ -5,6 +5,7 @@
123
124 __metaclass__ = type
125 __all__ = [
126+ 'BATCH_SIZE_UNLIMITED',
127 'BugNotFound',
128 'BugTrackerAuthenticationError',
129 'BugTrackerConnectError',
130@@ -39,6 +40,9 @@
131 # The user agent we send in our requests
132 LP_USER_AGENT = "Launchpad Bugscraper/0.2 (https://bugs.launchpad.net/)"
133
134+# To signify that all bug watches should be checked in a single run.
135+BATCH_SIZE_UNLIMITED = 0
136+
137
138 #
139 # Errors.
140@@ -133,7 +137,7 @@
141
142 implements(IExternalBugTracker)
143
144- batch_size = 100
145+ batch_size = None
146 batch_query_threshold = config.checkwatches.batch_query_threshold
147 comment_template = 'default_remotecomment_template.txt'
148
149
150=== modified file 'lib/lp/bugs/externalbugtracker/debbugs.py'
151--- lib/lp/bugs/externalbugtracker/debbugs.py 2010-03-31 20:09:55 +0000
152+++ lib/lp/bugs/externalbugtracker/debbugs.py 2010-04-08 15:38:28 +0000
153@@ -22,18 +22,19 @@
154
155 from canonical.config import config
156 from canonical.database.sqlbase import commit
157+from canonical.launchpad.interfaces.message import IMessageSet
158+from canonical.launchpad.mail import simple_sendmail
159+from canonical.launchpad.webapp import urlsplit
160+
161 from lp.bugs.externalbugtracker import (
162- BugNotFound, BugTrackerConnectError, ExternalBugTracker,
163- InvalidBugId, UnknownRemoteStatusError)
164-from canonical.launchpad.interfaces.message import IMessageSet
165+ BATCH_SIZE_UNLIMITED, BugNotFound, BugTrackerConnectError,
166+ ExternalBugTracker, InvalidBugId, UnknownRemoteStatusError)
167+from lp.bugs.externalbugtracker.isolation import ensure_no_transaction
168 from lp.bugs.interfaces.bugtask import BugTaskImportance, BugTaskStatus
169 from lp.bugs.interfaces.externalbugtracker import (
170 ISupportsBugImport, ISupportsCommentImport, ISupportsCommentPushing,
171 UNKNOWN_REMOTE_IMPORTANCE)
172-from canonical.launchpad.mail import simple_sendmail
173 from lp.bugs.scripts import debbugs
174-from canonical.launchpad.webapp import urlsplit
175-from lp.bugs.externalbugtracker.isolation import ensure_no_transaction
176
177
178 debbugsstatusmap = {'open': BugTaskStatus.NEW,
179@@ -59,7 +60,7 @@
180 # Because we keep a local copy of debbugs, we remove the batch_size
181 # limit so that all debbugs watches that need checking will be
182 # checked each time checkwatches runs.
183- batch_size = None
184+ batch_size = BATCH_SIZE_UNLIMITED
185
186 def __init__(self, baseurl, db_location=None):
187 super(DebBugs, self).__init__(baseurl)
188
189=== added directory 'lib/lp/bugs/scripts/checkwatches/tests'
190=== added file 'lib/lp/bugs/scripts/checkwatches/tests/__init__.py'
191=== added file 'lib/lp/bugs/scripts/checkwatches/tests/test_scheduler.py'
192--- lib/lp/bugs/scripts/checkwatches/tests/test_scheduler.py 1970-01-01 00:00:00 +0000
193+++ lib/lp/bugs/scripts/checkwatches/tests/test_scheduler.py 2010-04-08 15:38:28 +0000
194@@ -0,0 +1,110 @@
195+# Copyright 2010 Canonical Ltd. This software is licensed under the
196+# GNU Affero General Public License version 3 (see the file LICENSE).
197+
198+"""XXX: Module docstring goes here."""
199+
200+__metaclass__ = type
201+
202+import transaction
203+import unittest
204+
205+from datetime import datetime, timedelta
206+from pytz import utc
207+
208+from zope.component import getUtility
209+
210+from canonical.launchpad.scripts.logger import QuietFakeLogger
211+from canonical.testing import DatabaseFunctionalLayer
212+
213+from lp.bugs.interfaces.bugwatch import (
214+ BugWatchActivityStatus, IBugWatchSet)
215+from lp.bugs.scripts.checkwatches.scheduler import BugWatchScheduler
216+
217+from lp.testing import TestCaseWithFactory
218+
219+
220+class TestBugWatchScheduler(TestCaseWithFactory):
221+ """Tests for the BugWatchScheduler, which runs as part of garbo."""
222+
223+ layer = DatabaseFunctionalLayer
224+
225+ def setUp(self):
226+ super(TestBugWatchScheduler, self).setUp('foo.bar@canonical.com')
227+ # We'll make sure that all the other bug watches look like
228+ # they've been scheduled so that only our watch gets scheduled.
229+ for watch in getUtility(IBugWatchSet).search():
230+ watch.next_check = datetime.now(utc)
231+ self.bug_watch = self.factory.makeBugWatch()
232+ self.scheduler = BugWatchScheduler(QuietFakeLogger())
233+ transaction.commit()
234+
235+ def test_scheduler_schedules_unchecked_watches(self):
236+ # The BugWatchScheduler will schedule a BugWatch that has never
237+ # been checked to be checked immediately.
238+ self.bug_watch.next_check = None
239+ self.scheduler(1)
240+
241+ self.assertNotEqual(None, self.bug_watch.next_check)
242+ self.assertTrue(
243+ self.bug_watch.next_check <= datetime.now(utc))
244+
245+ def test_scheduler_schedules_working_watches(self):
246+ # If a watch has been checked and has never failed its next
247+ # check will be scheduled for 24 hours after its last check.
248+ now = datetime.now(utc)
249+ # Add some succesful activity to ensure that successful activity
250+ # is handled correctly.
251+ self.bug_watch.addActivity()
252+ self.bug_watch.lastchecked = now
253+ self.bug_watch.next_check = None
254+ transaction.commit()
255+ self.scheduler(1)
256+
257+ self.assertEqual(
258+ now + timedelta(hours=24), self.bug_watch.next_check)
259+
260+ def test_scheduler_schedules_failing_watches(self):
261+ # If a watch has failed once, it will be scheduled more than 24
262+ # hours after its last check.
263+ now = datetime.now(utc)
264+ self.bug_watch.lastchecked = now
265+
266+ # The delay depends on the number of failures that the watch has
267+ # had.
268+ for failure_count in range(1, 6):
269+ self.bug_watch.next_check = None
270+ self.bug_watch.addActivity(
271+ result=BugWatchActivityStatus.BUG_NOT_FOUND)
272+ transaction.commit()
273+ self.scheduler(1)
274+
275+ coefficient = self.scheduler.delay_coefficient * failure_count
276+ self.assertEqual(
277+ now + timedelta(days=1 + coefficient),
278+ self.bug_watch.next_check)
279+
280+ # The scheduler only looks at the last 5 activity items, so even
281+ # if there have been more failures the maximum delay will be 7
282+ # days.
283+ for count in range(10):
284+ self.bug_watch.addActivity(
285+ result=BugWatchActivityStatus.BUG_NOT_FOUND)
286+ self.bug_watch.next_check = None
287+ transaction.commit()
288+ self.scheduler(1)
289+ self.assertEqual(
290+ now + timedelta(days=7), self.bug_watch.next_check)
291+
292+ def test_scheduler_doesnt_schedule_scheduled_watches(self):
293+ # The scheduler will ignore watches whose next_check has been
294+ # set.
295+ next_check_date = datetime.now(utc) + timedelta(days=1)
296+ self.bug_watch.next_check = next_check_date
297+ transaction.commit()
298+ self.scheduler(1)
299+
300+ self.assertEqual(next_check_date, self.bug_watch.next_check)
301+
302+
303+def test_suite():
304+ return unittest.TestLoader().loadTestsFromName(__name__)
305
306=== renamed file 'lib/lp/bugs/scripts/tests/test_checkwatches.py' => 'lib/lp/bugs/scripts/checkwatches/tests/test_updater.py'
307--- lib/lp/bugs/scripts/tests/test_checkwatches.py 2010-03-25 21:59:48 +0000
308+++ lib/lp/bugs/scripts/checkwatches/tests/test_updater.py 2010-04-08 15:38:28 +0000
309@@ -167,6 +167,21 @@
310 # stop SQL statemnet logging.
311 clear_request_started()
312
313+ def test_suggest_batch_size(self):
314+ class RemoteSystem: pass
315+ remote_system = RemoteSystem()
316+ # When the batch_size is None, suggest_batch_size() will set
317+ # it accordingly.
318+ remote_system.batch_size = None
319+ checkwatches.updater.suggest_batch_size(remote_system, 1)
320+ self.failUnlessEqual(100, remote_system.batch_size)
321+ remote_system.batch_size = None
322+ checkwatches.updater.suggest_batch_size(remote_system, 12350)
323+ self.failUnlessEqual(247, remote_system.batch_size)
324+ # If the batch_size is already set, it will not be changed.
325+ checkwatches.updater.suggest_batch_size(remote_system, 99999)
326+ self.failUnlessEqual(247, remote_system.batch_size)
327+
328
329 class TestUpdateBugsWithLinkedQuestions(unittest.TestCase):
330 """Tests for updating bugs with linked questions."""
331
332=== modified file 'lib/lp/bugs/scripts/checkwatches/updater.py'
333--- lib/lp/bugs/scripts/checkwatches/updater.py 2010-03-26 21:03:30 +0000
334+++ lib/lp/bugs/scripts/checkwatches/updater.py 2010-04-08 15:38:28 +0000
335@@ -61,16 +61,15 @@
336
337 from lp.bugs import externalbugtracker
338 from lp.bugs.externalbugtracker import (
339- BugNotFound, BugTrackerConnectError, BugWatchUpdateError,
340- BugWatchUpdateWarning, InvalidBugId, PrivateRemoteBug,
341- UnknownBugTrackerTypeError, UnknownRemoteStatusError, UnparseableBugData,
342- UnparseableBugTrackerVersion, UnsupportedBugTrackerVersion)
343-from lp.bugs.externalbugtracker.bugzilla import (
344- BugzillaAPI)
345+ BATCH_SIZE_UNLIMITED, BugNotFound, BugTrackerConnectError,
346+ BugWatchUpdateError, BugWatchUpdateWarning, InvalidBugId,
347+ PrivateRemoteBug, UnknownBugTrackerTypeError, UnknownRemoteStatusError,
348+ UnparseableBugData, UnparseableBugTrackerVersion,
349+ UnsupportedBugTrackerVersion)
350+from lp.bugs.externalbugtracker.bugzilla import BugzillaAPI
351 from lp.bugs.externalbugtracker.isolation import check_no_transaction
352 from lp.bugs.interfaces.bug import IBugSet
353-from lp.bugs.interfaces.externalbugtracker import (
354- ISupportsBackLinking)
355+from lp.bugs.interfaces.externalbugtracker import ISupportsBackLinking
356 from lp.services.limitedlist import LimitedList
357 from lp.services.scripts.base import LaunchpadCronScript
358
359@@ -78,6 +77,11 @@
360 SYNCABLE_GNOME_PRODUCTS = []
361 MAX_SQL_STATEMENTS_LOGGED = 10000
362
363+# The minimum batch size to suggest to an IExternalBugTracker.
364+SUGGESTED_BATCH_SIZE_MIN = 100
365+# The proportion of all watches to suggest as a batch size.
366+SUGGESTED_BATCH_SIZE_PROPORTION = 0.02
367+
368
369 class TooMuchTimeSkew(BugWatchUpdateError):
370 """Time difference between ourselves and the remote server is too much."""
371@@ -242,6 +246,22 @@
372 yield item
373
374
375+def suggest_batch_size(remote_system, num_watches):
376+ """Suggest a value for batch_size if it's not set.
377+
378+ Given the number of bug watches for a `remote_system`, this sets a
379+ suggested batch size on it. If `remote_system` already has a batch
380+ size set, this does not override it.
381+
382+ :param remote_system: An `ExternalBugTracker`.
383+ :param num_watches: The number of watches for `remote_system`.
384+ """
385+ if remote_system.batch_size is None:
386+ remote_system.batch_size = max(
387+ SUGGESTED_BATCH_SIZE_MIN,
388+ int(SUGGESTED_BATCH_SIZE_PROPORTION * num_watches))
389+
390+
391 class BugWatchUpdater(object):
392 """Takes responsibility for updating remote bug watches."""
393
394@@ -381,7 +401,12 @@
395 `BaseScheduler`. If no scheduler is given, `SerialScheduler`
396 will be used, which simply runs the jobs in order.
397 """
398- self.log.debug("Using a global batch size of %s" % batch_size)
399+ if batch_size is None:
400+ self.log.debug("No global batch size specified.")
401+ elif batch_size == BATCH_SIZE_UNLIMITED:
402+ self.log.debug("Using an unlimited global batch size.")
403+ else:
404+ self.log.debug("Using a global batch size of %s" % batch_size)
405
406 # Default to using the very simple SerialScheduler.
407 if scheduler is None:
408@@ -497,6 +522,7 @@
409 def _getExternalBugTrackersAndWatches(self, bug_tracker, bug_watches):
410 """Return an `ExternalBugTracker` instance for `bug_tracker`."""
411 with self.transaction:
412+ num_watches = bug_tracker.watches.count()
413 remotesystem = (
414 externalbugtracker.get_external_bugtracker(bug_tracker))
415 # We special-case the Gnome Bugzilla.
416@@ -506,6 +532,9 @@
417 # Probe the remote system for additional capabilities.
418 remotesystem_to_use = remotesystem.getExternalBugTrackerToUse()
419
420+ # Try to hint at how many bug watches to check each time.
421+ suggest_batch_size(remotesystem_to_use, num_watches)
422+
423 if (is_gnome_bugzilla and
424 isinstance(remotesystem_to_use, BugzillaAPI) and
425 len(self._syncable_gnome_products) > 0):
426@@ -656,11 +685,6 @@
427 # by the ExternalBugTracker.
428 batch_size = remotesystem.batch_size
429
430- if batch_size == 0:
431- # A batch_size of 0 means that there's no batch size limit
432- # for this bug tracker.
433- batch_size = None
434-
435 with self.transaction:
436 old_bug_watches = set(
437 bug_watch for bug_watch in bug_watches
438@@ -693,7 +717,7 @@
439 # are actually some bugs that we're interested in so as to
440 # avoid unnecessary network traffic.
441 if server_time is not None and len(remote_old_ids) > 0:
442- if batch_size is None:
443+ if batch_size == BATCH_SIZE_UNLIMITED:
444 remote_old_ids_to_check = (
445 remotesystem.getModifiedRemoteBugs(
446 remote_old_ids, oldest_lastchecked))
447@@ -721,7 +745,7 @@
448 remote_ids_to_check = chain(
449 remote_ids_with_comments, remote_new_ids, remote_old_ids_to_check)
450
451- if batch_size is not None:
452+ if batch_size != BATCH_SIZE_UNLIMITED:
453 # Some remote bug IDs may appear in more than one list so
454 # we must filter the list before slicing.
455 remote_ids_to_check = islice(
456
457=== modified file 'lib/lp/bugs/tests/externalbugtracker.py'
458--- lib/lp/bugs/tests/externalbugtracker.py 2009-12-23 12:14:59 +0000
459+++ lib/lp/bugs/tests/externalbugtracker.py 2010-04-08 15:38:28 +0000
460@@ -24,8 +24,8 @@
461 from canonical.config import config
462 from canonical.database.sqlbase import commit, ZopelessTransactionManager
463 from lp.bugs.externalbugtracker import (
464- BugNotFound, BugTrackerConnectError, Bugzilla, DebBugs,
465- ExternalBugTracker, Mantis, RequestTracker, Roundup, SourceForge,
466+ BATCH_SIZE_UNLIMITED, BugNotFound, BugTrackerConnectError, Bugzilla,
467+ DebBugs, ExternalBugTracker, Mantis, RequestTracker, Roundup, SourceForge,
468 Trac)
469 from lp.bugs.externalbugtracker.trac import (
470 FAULT_TICKET_NOT_FOUND, LP_PLUGIN_BUG_IDS_ONLY, LP_PLUGIN_FULL,
471@@ -152,6 +152,8 @@
472 implementation, though it doesn't actually do anything.
473 """
474
475+ batch_size = BATCH_SIZE_UNLIMITED
476+
477 def __init__(self, baseurl='http://example.com/'):
478 super(TestExternalBugTracker, self).__init__(baseurl)
479
480
481=== modified file 'lib/lp/bugs/tests/test_bugwatch.py'
482--- lib/lp/bugs/tests/test_bugwatch.py 2010-04-07 12:46:15 +0000
483+++ lib/lp/bugs/tests/test_bugwatch.py 2010-04-08 15:38:28 +0000
484@@ -8,9 +8,6 @@
485 import transaction
486 import unittest
487
488-from datetime import datetime, timedelta
489-from pytz import utc
490-
491 from urlparse import urlunsplit
492
493 from zope.component import getUtility
494@@ -24,10 +21,8 @@
495
496 from lp.bugs.interfaces.bugtracker import BugTrackerType, IBugTrackerSet
497 from lp.bugs.interfaces.bugwatch import (
498- BugWatchActivityStatus, IBugWatchSet, NoBugTrackerFound,
499- UnrecognizedBugTrackerURL)
500-from lp.bugs.scripts.checkwatches.scheduler import (
501- BugWatchScheduler, MAX_SAMPLE_SIZE)
502+ IBugWatchSet, NoBugTrackerFound, UnrecognizedBugTrackerURL)
503+from lp.bugs.scripts.checkwatches.scheduler import MAX_SAMPLE_SIZE
504 from lp.registry.interfaces.person import IPersonSet
505
506 from lp.testing import TestCaseWithFactory
507@@ -466,88 +461,5 @@
508 self.failUnless("Activity %s" % i in messages)
509
510
511-class TestBugWatchScheduler(TestCaseWithFactory):
512- """Tests for the BugWatchScheduler, which runs as part of garbo."""
513-
514- layer = DatabaseFunctionalLayer
515-
516- def setUp(self):
517- super(TestBugWatchScheduler, self).setUp('foo.bar@canonical.com')
518- # We'll make sure that all the other bug watches look like
519- # they've been scheduled so that only our watch gets scheduled.
520- for watch in getUtility(IBugWatchSet).search():
521- watch.next_check = datetime.now(utc)
522- self.bug_watch = self.factory.makeBugWatch()
523- self.scheduler = BugWatchScheduler(QuietFakeLogger())
524- transaction.commit()
525-
526- def test_scheduler_schedules_unchecked_watches(self):
527- # The BugWatchScheduler will schedule a BugWatch that has never
528- # been checked to be checked immediately.
529- self.bug_watch.next_check = None
530- self.scheduler(1)
531-
532- self.assertNotEqual(None, self.bug_watch.next_check)
533- self.assertTrue(
534- self.bug_watch.next_check <= datetime.now(utc))
535-
536- def test_scheduler_schedules_working_watches(self):
537- # If a watch has been checked and has never failed its next
538- # check will be scheduled for 24 hours after its last check.
539- now = datetime.now(utc)
540- # Add some succesful activity to ensure that successful activity
541- # is handled correctly.
542- self.bug_watch.addActivity()
543- self.bug_watch.lastchecked = now
544- self.bug_watch.next_check = None
545- transaction.commit()
546- self.scheduler(1)
547-
548- self.assertEqual(
549- now + timedelta(hours=24), self.bug_watch.next_check)
550-
551- def test_scheduler_schedules_failing_watches(self):
552- # If a watch has failed once, it will be scheduled more than 24
553- # hours after its last check.
554- now = datetime.now(utc)
555- self.bug_watch.lastchecked = now
556-
557- # The delay depends on the number of failures that the watch has
558- # had.
559- for failure_count in range(1, 6):
560- self.bug_watch.next_check = None
561- self.bug_watch.addActivity(
562- result=BugWatchActivityStatus.BUG_NOT_FOUND)
563- transaction.commit()
564- self.scheduler(1)
565-
566- coefficient = self.scheduler.delay_coefficient * failure_count
567- self.assertEqual(
568- now + timedelta(days=1 + coefficient),
569- self.bug_watch.next_check)
570-
571- # The scheduler only looks at the last 5 activity items, so even
572- # if there have been more failures the maximum delay will be 7
573- # days.
574- for count in range(10):
575- self.bug_watch.addActivity(
576- result=BugWatchActivityStatus.BUG_NOT_FOUND)
577- self.bug_watch.next_check = None
578- transaction.commit()
579- self.scheduler(1)
580- self.assertEqual(
581- now + timedelta(days=7), self.bug_watch.next_check)
582-
583- def test_scheduler_doesnt_schedule_scheduled_watches(self):
584- # The scheduler will ignore watches whose next_check has been
585- # set.
586- next_check_date = datetime.now(utc) + timedelta(days=1)
587- self.bug_watch.next_check = next_check_date
588- transaction.commit()
589- self.scheduler(1)
590-
591- self.assertEqual(next_check_date, self.bug_watch.next_check)
592-
593-
594 def test_suite():
595 return unittest.TestLoader().loadTestsFromName(__name__)

Subscribers

People subscribed via source and target branches

to status/vote changes: