Merge lp:~kfogel/launchpad/505845-affects-me-too-from-dups into lp:launchpad/db-devel

Proposed by Karl Fogel
Status: Rejected
Rejected by: Jonathan Lange
Proposed branch: lp:~kfogel/launchpad/505845-affects-me-too-from-dups
Merge into: lp:launchpad/db-devel
Diff against target: 68 lines (+21/-2)
4 files modified
database/schema/comments.sql (+1/-0)
database/schema/patch-2207-99-0.sql (+9/-0)
lib/lp/code/browser/branchmergeproposallisting.py (+3/-1)
lib/lp/code/browser/tests/test_branchmergeproposallisting.py (+8/-1)
To merge this branch: bzr merge lp:~kfogel/launchpad/505845-affects-me-too-from-dups
Reviewer Review Type Date Requested Status
Stuart Bishop (community) db Disapprove
Jonathan Lange (community) Needs Fixing
Review via email: mp+17633@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Karl Fogel (kfogel) wrote :

Make the DB change necessary to implement the solution described in https://bugs.edge.launchpad.net/malone/+bug/505845/comments/1 . Short version:

This new field will allow Launchpad to remember when a person is affected by a bug via duplication, versus directly; and will be able to represent perverse things like people claiming to be affected by a bug but not by its dup.

The basic plan is to add a new column:

  ALTER table bugaffectsperson ADD COLUMN master_affects integer;

By default, the new column has the same value as the 'bug' column. But when a bug D is marked as a dup of bug M, D's 'master_affects' is set to M (unless there was already a record indicating M does not affect P).

We'll do most queries on the 'master_affects' column (instead of on 'bug' as currently). For example, when a person P visits the page for bug M, we figure out whether to show it as affecting them based on whether there are any rows with M in 'master_affects'. And when we're calculating total affects-person counts for bug heat, we query on the master_affects column. Thus, if D is later unduplicated from M, the affects-person count on M will automatically be decremented properly.

See the above-linked comment for more details.

NOTE: There will need to be some code updates following this patch -- we know about them. Until then, you'll see test errors like 'IntegrityError: null value in column "master_affects" violates not-null constraint'.

OTHER NOTE: There is a possibility this solution will help with the performance issues we have with the subscriber portlet on the bugs pages; however, we're not positive enough to say that with 100% certainty yet. We just wanted to point it out for warm fuzzies.

Revision history for this message
Jonathan Lange (jml) wrote :

Hey Karl,

A couple of questions:
  - why doesn't the new column have a foreign key referring to the bug table?
  - what's the test for?

Also, if I understand this correctly, this patch denormalizes the database -- information about whether the bug is a dupe is already stored somewhere. I guess this change to the model allows you to support crazy things like the example described above, but why is this desirable?

If we do keep this patch, then I think (but am not sure) that we need a uniqueness constraint such that any given bugaffecstperson.bug only has one master_affects.

Thanks,
jml

review: Needs Fixing
Revision history for this message
Karl Fogel (kfogel) wrote :

Good point on the foreign key, d'oh. Thanks.

It _looks_ like denormalization, but semantically it's not, really: this new field doesn't mean "is / is not a dup of", it means "this bug-affects-person relationship should / should not be passed through to". That doesn't map reliably to "is / is not a dup of", it's just that they happen to coincide a lot -- but not always, hence the need for a separate record. It's true that that data is already in the system, if you iterate over enough bugs and people, but the problem is performance issues.

Thoughts?

Re the uniqueness constraint: yes. Note that person may want to claim to be unaffected by M but to still be affected by D, and in that scenario the bugaffectsperson.master_affects gets set back from 'M' to 'D', even while other people may continue to claim that both D and thereby M affect them. Thus, there is a uniqueness constraint of a sort here: for a given bug D, bugaffectsperson.bug must be D *or* it can be at most one other value (say, M).

There is probably a way to express this constraint in SQL (abel theorizes): say that bugaffectsperson.master_affects can have either the value of that row's bugaffectsperson.bug *or* of bug.duplicateof, but not anything else.

Revision history for this message
Jonathan Lange (jml) wrote :

Weeeeeeeeeell, I'm not sure how we feelr about this sort of denormalization, but I'll be guided by stub's opinion here.

If you can think of how to do the constraint, that'd be good. I'll save an approve vote for an updated patch.

Revision history for this message
Stuart Bishop (stub) wrote :

This seems a very complicated way of duplicating information already available in the database. It is a denormalization, which we should avoid doing except to fix performance problems. I don't think the existing structure is a performance problem for bug heat, as bug heat is being recalculated as a background task and not on the fly when handling a request.

review: Needs Resubmitting (db)
Revision history for this message
Stuart Bishop (stub) :
review: Disapprove (db)
Revision history for this message
Karl Fogel (kfogel) wrote :

Marked branch as 'abandoned'. We had a conversation here (adeuring, gmb, kfogel) and have come around to a solution that doesn't need a schema change. You're right: the fact that bug heat is calculated as a background task makes the difference; and gmb explained that current performance problems with the subscriber portlet are not due to database queries but to over-iteration after the query (so there is no need for us to try to address those problems here).

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'database/schema/comments.sql'
2--- database/schema/comments.sql 2010-01-13 05:55:29 +0000
3+++ database/schema/comments.sql 2010-01-18 22:39:26 +0000
4@@ -311,6 +311,7 @@
5 COMMENT ON TABLE BugAffectsPerson IS 'This table maintains a mapping between bugs and users indicating that they are affected by that bug. The value is calculated and cached in the Bug.users_affected_count column.';
6 COMMENT ON COLUMN BugAffectsPerson.bug IS 'The bug affecting this person.';
7 COMMENT ON COLUMN BugAffectsPerson.person IS 'The person affected by this bug.';
8+COMMENT ON COLUMN BugAffectsPerson.master_affects IS 'The master bug affecting this person. By default, this has the same value as "bug", but if this bug is marked as a duplicate of some master bug, and there is no other record indicating that the person is not affected by the master bug, than this column takes the value of the master bug.';
9
10
11 -- CodeImport
12
13=== added file 'database/schema/patch-2207-99-0.sql'
14--- database/schema/patch-2207-99-0.sql 1970-01-01 00:00:00 +0000
15+++ database/schema/patch-2207-99-0.sql 2010-01-18 22:39:26 +0000
16@@ -0,0 +1,9 @@
17+-- Copyright 2009 Canonical Ltd. This software is licensed under the
18+-- GNU Affero General Public License version 3 (see the file LICENSE).
19+
20+SET client_min_messages TO ERROR;
21+
22+ALTER table bugaffectsperson ADD COLUMN master_affects integer NOT NULL;
23+CREATE INDEX bugaffectsperson__master_affects__idx ON bugaffectsperson USING btree (master_affects);
24+
25+INSERT INTO LaunchpadDatabaseRevision VALUES (2207, 99, 0);
26
27=== modified file 'lib/lp/code/browser/branchmergeproposallisting.py'
28--- lib/lp/code/browser/branchmergeproposallisting.py 2010-01-14 23:19:13 +0000
29+++ lib/lp/code/browser/branchmergeproposallisting.py 2010-01-18 22:39:26 +0000
30@@ -112,8 +112,10 @@
31 """
32 if self.context.date_review_requested is not None:
33 return self.context.date_review_requested
34- else:
35+ elif self.context.date_reviewed is not None:
36 return self.context.date_reviewed
37+ else:
38+ return self.context.date_created
39
40
41 class BranchMergeProposalListingBatchNavigator(TableBatchNavigator):
42
43=== modified file 'lib/lp/code/browser/tests/test_branchmergeproposallisting.py'
44--- lib/lp/code/browser/tests/test_branchmergeproposallisting.py 2010-01-14 23:39:06 +0000
45+++ lib/lp/code/browser/tests/test_branchmergeproposallisting.py 2010-01-18 22:39:26 +0000
46@@ -10,7 +10,6 @@
47
48 import pytz
49 import transaction
50-from zope.publisher.interfaces import NotFound
51 from zope.security.proxy import removeSecurityProxy
52
53 from canonical.launchpad.webapp.servers import LaunchpadTestRequest
54@@ -298,6 +297,14 @@
55 item = BranchMergeProposalListingItem(bmp, None, None)
56 self.assertEqual(review_date, item.sort_key)
57
58+ def test_sort_key_wip(self):
59+ # If the proposal is a work in progress, the date_created is used.
60+ bmp = self.factory.makeBranchMergeProposal(
61+ date_created=datetime(2009,6,1,tzinfo=pytz.UTC))
62+ login_person(bmp.target_branch.owner)
63+ item = BranchMergeProposalListingItem(bmp, None, None)
64+ self.assertEqual(bmp.date_created, item.sort_key)
65+
66
67 class ActiveReviewSortingTest(TestCaseWithFactory):
68 """Test the sorting of the active review groups."""

Subscribers

People subscribed via source and target branches

to status/vote changes: