Merge lp:~gmb/launchpad/bugwatch-next_check-bug-544238 into lp:launchpad/db-devel

Proposed by Graham Binns
Status: Merged
Approved by: Graham Binns
Approved revision: no longer in the source branch.
Merged at revision: not available
Proposed branch: lp:~gmb/launchpad/bugwatch-next_check-bug-544238
Merge into: lp:launchpad/db-devel
Diff against target: 215 lines (+113/-8)
5 files modified
database/schema/comments.sql (+1/-0)
database/schema/patch-2207-42-0.sql (+7/-0)
lib/lp/bugs/doc/bug-watch-activity.txt (+83/-0)
lib/lp/bugs/model/bugwatch.py (+6/-3)
lib/lp/bugs/scripts/checkwatches.py (+16/-5)
To merge this branch: bzr merge lp:~gmb/launchpad/bugwatch-next_check-bug-544238
Reviewer Review Type Date Requested Status
Jonathan Lange (community) db Abstain
Björn Tillenius (community) db Approve
Stuart Bishop (community) db Approve
Review via email: mp+21869@code.launchpad.net

Commit message

Add a next_check column to BugWatch.

Description of the change

This branch adds a next_check field to BugWatch. We'll use this to allow us to throttle BugWatch updates where the BugWatch is continually failing.

To post a comment you must log in.
Revision history for this message
Graham Binns (gmb) wrote :

Argh. I've forgotten to add the prerequisite branch to this one. Here's the actual diff for this branch:

=== added file 'database/schema/patch-2207-99-0.sql'
--- database/schema/patch-2207-99-0.sql 1970-01-01 00:00:00 +0000
+++ database/schema/patch-2207-99-0.sql 2010-03-22 15:58:29 +0000
@@ -0,0 +1,7 @@
+SET client_min_messages=ERROR;
+
+ALTER TABLE BugWatch ADD COLUMN next_check timestamp without time zone DEFAULT timezone('UTC'::text, now());
+
+CREATE INDEX bugwatch__next_check__idx ON BugWatch(next_check);
+
+INSERT INTO LaunchpadDatabaseRevision VALUES (2207, 99, 0);

Revision history for this message
Jonathan Lange (jml) wrote :

Why aren't you using the Job table for this?

Even if you don't, consider calling the column 'scheduled_start' to at least be vaguely consistent.

Revision history for this message
Graham Binns (gmb) wrote :

On 22 March 2010 16:10, Jonathan Lange <email address hidden> wrote:
> Why aren't you using the Job table for this?
>

Is there a reason to use the Job table? We decided last week that
checkwatches shouldn't use the Jobs system; it doesn't fit the needs
of our batched processing model particularly well and any such
solution would be a kludge.

I'll happily rename the field, though.

Revision history for this message
Jonathan Lange (jml) wrote :

On Mon, Mar 22, 2010 at 4:15 PM, Graham Binns <email address hidden> wrote:
> On 22 March 2010 16:10, Jonathan Lange <email address hidden> wrote:
>> Why aren't you using the Job table for this?
>>
>
> Is there a reason to use the Job table? We decided last week that
> checkwatches shouldn't use the Jobs system; it doesn't fit the needs
> of our batched processing model particularly well and any such
> solution would be a kludge.
>

Oh right, I saw that decision mentioned on a bug report last week but
didn't see anything about the reasoning behind it. I'm surprised that
a system which bears such a strong resemblance to code imports
requires a significantly different model. Perhaps the resemblance is
only superficial.

> I'll happily rename the field, though.

Thanks. You might want to hold off until stub & Bjorn have their say though.

jml

Revision history for this message
Graham Binns (gmb) wrote :

On 22 March 2010 16:21, Jonathan Lange <email address hidden> wrote:
> Oh right, I saw that decision mentioned on a bug report last week but
> didn't see anything about the reasoning behind it. I'm surprised that
> a system which bears such a strong resemblance to code imports
> requires a significantly different model. Perhaps the resemblance is
> only superficial.

We thought the same thing to, but BugWatch is significantly different
in that it loads a lot of data from a bug tracker up front and then
does bug watch updates based on the cached data. Using the Jobs model
would mean that we wouldn't (easily) have that cache, which in turn
means that we'd be hitting bug trackers for data a lot more than we do
now. That's okay for Bugzilla 3.4 or 3.2 + plugin, where we can ask
for only what we need, but for older bugtrackers we ask for all the
data and then parse it, because there's no other option.

>> I'll happily rename the field, though.
>
> Thanks. You might want to hold off until stub & Bjorn have their say though.
>

Okay.

--
Graham Binns | PGP Key: EC66FA7D

Revision history for this message
Stuart Bishop (stub) wrote :

Fine. patch-2207-42-0.sql

I'm happy with 'next_check' as a column name. This is a repeating job. Jobs in the Jobs table run once and have a scheduled_start, date_started and date_finished timestamp.

review: Approve (db)
Revision history for this message
Björn Tillenius (bjornt) wrote :

< BjornT> gmb: sure, i'll look at it, if you explain how it will be used.
          in more detail :)
< gmb> BjornT: Cool. So, we want to be a bit more sensible with.
       scheduling bug watch checks. At the moment we check each watch.
       once every 24 hours, but if it's erroring a lot (which we now.
       record in the BugWatchActivity table) we don't want to check it so.
       often.
< gmb> BjornT: So, instead of building that logic into.
       BugTracker.getBugWatchesNeedingUpdate(), which would be complex at.
       best, we think we should have a garbo-hourly process that looks at.
       all the bug watches checked in the last hour, checks to see how.
       often they've errored and then schedules their next_check.
       accordingly.
< gmb> So BugTracker.getBugWatchesNeedingUpdate() will look at the.
       next_check field instead of lastchecked to determine whether a.
       watch needs checking.
< gmb> The advantage is that we don't have to further complicate.
       checkwatches to get the back-off for erroring watches.
< BjornT> gmb: ok, that makes sense. i'm wondering if we could find a.
          better name? if you set next_check to some time, it doesn't.
          mean that it will be checked exactly at that time, will it?
< gmb> BjornT: Hmm, true. How about check_after?
< stub> check_due ?
 * stub gets out his bike shed schematics
< gmb> It should be blue.
< BjornT> gmb: maybe next_check is good, as long as there is a comment.
          explaining the semantics. can't think of a better name, and i.
          think next_check is better than the other suggestions.
< gmb> BjornT: Right; I'll add the comment (should've added it anyway).

review: Approve (db)
Revision history for this message
Jonathan Lange (jml) wrote :

I can go either way.

review: Abstain (db)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'database/schema/comments.sql'
2--- database/schema/comments.sql 2010-03-18 17:07:11 +0000
3+++ database/schema/comments.sql 2010-03-23 15:43:33 +0000
4@@ -319,6 +319,7 @@
5 COMMENT ON COLUMN BugWatch.last_error_type IS 'The type of error which last prevented this entry from being updated. Legal values are defined by the BugWatchErrorType enumeration.';
6 COMMENT ON COLUMN BugWatch.remote_importance IS 'The importance of the bug as returned by the remote server. This will be converted into a Launchpad BugTaskImportance value.';
7 COMMENT ON COLUMN BugWatch.remote_lp_bug_id IS 'The bug in Launchpad that the remote bug is pointing at. This can be different than the BugWatch.bug column, since the same remote bug can be linked from multiple bugs in Launchpad, but the remote bug can only link to a single bug in Launchpad. The main use case for this column is to avoid having to query the remote bug tracker for this information, in order to decide whether we need to give this information to the remote bug tracker.';
8+COMMENT ON COLUMN BugWatch.next_check IS 'The time after which the watch should next be checked. Note that this does not denote an exact schedule for the next check since checkwatches only runs periodically.';
9
10
11 -- BugWatchActivity
12
13=== added file 'database/schema/patch-2207-42-0.sql'
14--- database/schema/patch-2207-42-0.sql 1970-01-01 00:00:00 +0000
15+++ database/schema/patch-2207-42-0.sql 2010-03-23 15:43:33 +0000
16@@ -0,0 +1,7 @@
17+SET client_min_messages=ERROR;
18+
19+ALTER TABLE BugWatch ADD COLUMN next_check timestamp without time zone;
20+
21+CREATE INDEX bugwatch__next_check__idx ON BugWatch(next_check);
22+
23+INSERT INTO LaunchpadDatabaseRevision VALUES (2207, 42, 0);
24
25=== modified file 'lib/lp/bugs/doc/bug-watch-activity.txt'
26--- lib/lp/bugs/doc/bug-watch-activity.txt 2010-03-19 23:28:34 +0000
27+++ lib/lp/bugs/doc/bug-watch-activity.txt 2010-03-23 15:43:33 +0000
28@@ -59,3 +59,86 @@
29 None
30 >>> print activity.oops_id
31 None
32+
33+
34+Recording BugWatch updates as BugWatchActivity
35+-----------------------------------------------
36+
37+Whenever an update is made to a BugWatch that update is recorded as a
38+BugWatchActivity entry.
39+
40+We can demonstrate this by passing our bug watch to
41+BugWatchUpdater.updateBugWatches().
42+
43+ >>> from canonical.launchpad.scripts.logger import QuietFakeLogger
44+ >>> from lp.bugs.scripts.checkwatches import BugWatchUpdater
45+ >>> from lp.bugs.tests.externalbugtracker import TestExternalBugTracker
46+ >>> updater = BugWatchUpdater(transaction, QuietFakeLogger())
47+ >>> updater.updateBugWatches(
48+ ... TestExternalBugTracker('http://example.com'), [bug_watch])
49+
50+An extra activity item will have been added to the BugWatch's activity
51+property.
52+
53+ >>> print bug_watch.activity.count()
54+ 2
55+
56+The most recent activity entry will have a result of None since it was
57+successful.
58+
59+ >>> most_recent_activity = bug_watch.activity.first()
60+ >>> print most_recent_activity.result
61+ None
62+
63+Its message will also be empty
64+
65+ >>> print most_recent_activity.message
66+ None
67+
68+As will its oops_id
69+
70+ >>> print most_recent_activity.oops_id
71+ None
72+
73+If the remote bug tracker breaks during an update the error will be
74+recorded in the activity entry for that update.
75+
76+ >>> from lp.bugs.externalbugtracker.base import BugNotFound
77+ >>> from lp.bugs.tests.externalbugtracker import (
78+ ... TestBrokenExternalBugTracker)
79+ >>> broken_bugtracker = TestBrokenExternalBugTracker('http://example.com')
80+ >>> broken_bugtracker.get_remote_status_error = BugNotFound
81+
82+ >>> updater.updateBugWatches(broken_bugtracker, [bug_watch])
83+
84+Another entry will have been added to the watch's activity property.
85+
86+ >>> print bug_watch.activity.count()
87+ 3
88+
89+And this time its result field will record that the the remote bug was
90+not found.
91+
92+ >>> most_recent_activity = bug_watch.activity.first()
93+ >>> print most_recent_activity.result.title
94+ Bug Not Found
95+
96+The OOPS ID for the error will also have been recorded.
97+
98+ >>> print most_recent_activity.oops_id
99+ OOPS...
100+
101+The BugWatchUpdater also adds BugWatchActivity entries when errors occur
102+that don't have an entry in the BugWatchActivityStatus DB Enum.
103+
104+ >>> broken_bugtracker.get_remote_status_error = Exception
105+ >>> updater.updateBugWatches(broken_bugtracker, [bug_watch])
106+ >>> most_recent_activity = bug_watch.activity.first()
107+
108+ >>> print most_recent_activity.result.title
109+ Unknown
110+
111+The OOPS ID of the error is recorded so that we can debug it.
112+
113+ >>> print most_recent_activity.oops_id
114+ OOPS...
115
116=== modified file 'lib/lp/bugs/model/bugwatch.py'
117--- lib/lp/bugs/model/bugwatch.py 2010-03-22 10:36:53 +0000
118+++ lib/lp/bugs/model/bugwatch.py 2010-03-23 15:43:33 +0000
119@@ -282,8 +282,10 @@
120 activity = BugWatchActivity()
121 activity.bug_watch = self
122 activity.result = result
123- activity.message = message
124- activity.oops_id = oops_id
125+ if message is not None:
126+ activity.message = unicode(message)
127+ if oops_id is not None:
128+ activity.oops_id = unicode(oops_id)
129 store = IStore(BugWatchActivity)
130 store.add(activity)
131
132@@ -292,7 +294,8 @@
133 store = Store.of(self)
134 return store.find(
135 BugWatchActivity,
136- BugWatchActivity.bug_watch == self).order_by('activity_date')
137+ BugWatchActivity.bug_watch == self).order_by(
138+ Desc('activity_date'))
139
140
141 class BugWatchSet(BugSetBase):
142
143=== modified file 'lib/lp/bugs/scripts/checkwatches.py'
144--- lib/lp/bugs/scripts/checkwatches.py 2010-03-23 03:14:13 +0000
145+++ lib/lp/bugs/scripts/checkwatches.py 2010-03-23 15:43:33 +0000
146@@ -826,6 +826,7 @@
147 new_remote_importance = None
148 new_malone_importance = None
149 error = None
150+ oops_id = None
151
152 # XXX: 2007-10-17 Graham Binns
153 # This nested set of try:excepts isn't really
154@@ -846,7 +847,7 @@
155 error = get_bugwatcherrortype_for_error(ex)
156 message = error_type_messages.get(
157 error, error_type_message_default)
158- self.warning(
159+ oops_id = self.warning(
160 message % {
161 'bug_id': remote_bug_id,
162 'base_url': remotesystem.baseurl,
163@@ -879,6 +880,7 @@
164 self.pushBugComments(remotesystem, bug_watch)
165 if ISupportsBackLinking.providedBy(remotesystem):
166 self.linkLaunchpadBug(remotesystem, bug_watch)
167+ bug_watch.addActivity(result=error, oops_id=oops_id)
168
169 except (KeyboardInterrupt, SystemExit):
170 # We should never catch KeyboardInterrupt or SystemExit.
171@@ -888,7 +890,7 @@
172 # Restart transaction before recording the error.
173 self.txn.abort()
174 # Send the error to the log.
175- self.error(
176+ oops_id = self.error(
177 "Failure updating bug %r on %s (local bugs: %s)." %
178 (remote_bug_id, bug_tracker_url, local_ids),
179 properties=[
180@@ -901,9 +903,10 @@
181 # their lastchecked dates so that we don't try to
182 # re-check them every time checkwatches runs.
183 errortype = get_bugwatcherrortype_for_error(error)
184- for bugwatch in bug_watches:
185- bugwatch.lastchecked = UTC_NOW
186- bugwatch.last_error_type = errortype
187+ for bug_watch in bug_watches:
188+ bug_watch.lastchecked = UTC_NOW
189+ bug_watch.last_error_type = errortype
190+ bug_watch.addActivity(result=errortype, oops_id=oops_id)
191 # We need to commit the transaction, in case the next
192 # bug fails to update as well.
193 self.txn.commit()
194@@ -1156,6 +1159,10 @@
195 # Also put it in the log.
196 self.log.warning("%s (%s)" % (message, oops_info.oopsid))
197
198+ # Return the OOPS ID so that we can use it in
199+ # BugWatchActivity.
200+ return oops_info.oopsid
201+
202 def error(self, message, properties=None, info=None):
203 """Record an error related to this external bug tracker."""
204 oops_info = report_oops(message, properties, info)
205@@ -1163,6 +1170,10 @@
206 # Also put it in the log.
207 self.log.error("%s (%s)" % (message, oops_info.oopsid))
208
209+ # Return the OOPS ID so that we can use it in
210+ # BugWatchActivity.
211+ return oops_info.oopsid
212+
213
214 class BaseScheduler:
215 """Run jobs according to a policy."""

Subscribers

People subscribed via source and target branches

to status/vote changes: