Merge lp:~michael.nelson/launchpad/distro-series-difference-schema into lp:launchpad/db-devel

Proposed by Michael Nelson
Status: Merged
Approved by: Michael Nelson
Approved revision: no longer in the source branch.
Merged at revision: 9718
Proposed branch: lp:~michael.nelson/launchpad/distro-series-difference-schema
Merge into: lp:launchpad/db-devel
Diff against target: 82 lines (+42/-3)
3 files modified
database/schema/comments.sql (+16/-3)
database/schema/patch-2208-07-0.sql (+24/-0)
database/schema/security.cfg (+2/-0)
To merge this branch: bzr merge lp:~michael.nelson/launchpad/distro-series-difference-schema
Reviewer Review Type Date Requested Status
Brad Crittenden (community) code Approve
Stuart Bishop (community) db Approve
Robert Collins db Pending
Review via email: mp+33515@code.launchpad.net

Commit message

Add DistroSeriesDifference table and related DistroSeriesDifferenceMessage.

Description of the change

Overview
=======
This branch is a schema patch to support displaying packages with different versions between two distroseries (specifically, a derived distroseries and its parent series).

Details of the UI at:
https://dev.launchpad.net/LEP/DerivativeDistributions

Details
=====
Some details regarding the implementation and schema design were discussed on a long-ish email thread:

https://lists.launchpad.net/launchpad-dev/msg04408.html

The decisions that came out of that discussion were:

1) Lets just store the derived series which references its parent series (rather than generalising it with two references to distroseries - we can always generalise later if and when we need to).
2) Store the source_package_name, rather than two references to SPPHs, as the SPPH references can be out of date (ie. a newer version could be published before the record is updated again). We'll have properties on the model which will always look up the latest SPPHs based on the distroseries's and source package name.
3) The actual diff off the two packages is not required (nullable), and will be generated on request. Also, once its created, its not necessarily a diff of the current packages (new versions could be uploaded since the diff was generated). Hence naming it latest_package_diff rather than simply package_diff.
4) The activity_log field was initially a comment field, but it will be used and appended to (1) by scripts, when new uploads are detected that change the state, or (2) when users add comments (eg. "We're waiting for version 1.4"), hence renaming it activity_log. I don't see a reason to add a separate comment model - but you might.
5) Initially we had just one enum for the type of difference (UNIQUE_TO_DERIVED_SERIES, MISSING_FROM_DERIVED_SERIES, DIFFERENT_VERSIONS) so that the view can query for all the records of a certain type. We were planning on having an 'ignored' flag, but after the above email discussion, it became apparent that other options would be required (IGNORED_ALWAYS, RESOLVED etc.), so the status column was added to support this enum instead of the ignored bool.

No sampledata changes.

The following branch adds the basic model code:
https://code.edge.launchpad.net/~michael.nelson/launchpad/distro-series-difference-basic-model/+merge/33885

To post a comment you must log in.
Revision history for this message
Stuart Bishop (stub) wrote :

Its surprising you don't want delete permissions. How does the derived distribution sync its changes up with its parent distribution?

Having a column called derived_series is confusing. The derived distro series is the child. Having it reference its parent via a column called derived_series seems wrong (The derived distro series is derived from the derived distro series, which is silly). What is the parent distribution of a derived distribution called anyway? Parent or is there some other term?

We should add an index on last_package_diff for garbage collection purposes.

patch-2208-07-0.sql

review: Needs Information (db)
Revision history for this message
Stuart Bishop (stub) wrote :

Confused myself. This is all fine. The difference references the derived distro series it is for. The derived distro series references its parent distro series.

Add the index and you are good to go from my pov.

review: Approve (db)
Revision history for this message
Michael Nelson (michael.nelson) wrote :

On Fri, Aug 27, 2010 at 12:03 PM, Stuart Bishop
<email address hidden> wrote:
> Review: Approve db
> Confused myself. This is all fine. The difference references the derived distro series it is for. The derived distro series references its parent distro series.
>
> Add the index and you are good to go from my pov.

Thanks Stuart. Based on our conversation [1] I've updated to remove
the activity_log column and instead added a joining table for messages
for the DistroSeriesDifference (r9690, incremental attached). I've
based it on the similar BugMessage table, but please check it in case
I've missed something.

[1] 12:04 < noodles775> stub: Why would we want delete permissions? At
the moment I'm planning that these differences will end in a resolved
state, but we wouldn't want to delete them until the derived
distroseries is itself deleted (if it ever is).
12:05 < stub> The derived distribution has a patched version of
firefox. Tomorrow it decides that it should just use the parents
version of firefox.
12:07 < noodles775> Great, so assuming the parents version has a
higher revno, they upload it, this record is marked as resolved and
the activity log is updated, but the record is not deleted, as
tomorrow the parent might upload a new version, and the previous
comments/activity is still useful.
12:07 < stub> ok
12:08 < stub> How big do you think the comments and activity will grow?
12:08 < stub> It will be a pain to split out into a structured format later.
12:10 < stub> If it is append only, storing it as a single text blob
isn't ideal. If it is more a whiteboard or a copy of a text file
stored in a branch, that would be fine I guess.
12:10 < noodles775> On average 4 or 5 lines. If a particular
difference goes through resolved->needs attention-> resolved it would
be a few more.
12:10 < stub> But over time
12:10 < noodles775> Yes, it is just like a whiteboard.
12:11 < stub> ok
12:11 < noodles775> Yep, so if overtime it went through that cycle
(resolved->needs attention->resolved) *lots*, there would be 2 lines
for each change.
12:11 < noodles775> Do you think its worth adding a separate comment model?
12:12 < noodles775> (from memory the code/bugs guys made the comment
model re-usable... I'll check).
12:14 < noodles775> stub: oh, and I think I misunderstood your comment
above. I am making the activity_log append only.
12:17 < stub> So we would probably be better off with a separate table
containing the activity, one row per addition. Or even a link table
between distroseriesdifference and message same as our other comment
spools.
12:18 < noodles775> Yep, just looking at the message table... I'll do that then.
12:18 < noodles775> Thanks stub.
12:19 < stub> That gives you the infrastructure for attachments, email
interfaces etc. if we want them in the future.
12:20 < noodles775> Yeah, given that the table already exists, it
sounds crazy not to use it.

1=== modified file 'database/schema/comments.sql'
2--- database/schema/comments.sql 2010-08-27 06:50:09 +0000
3+++ database/schema/comments.sql 2010-08-27 10:45:34 +0000
4@@ -508,10 +508,14 @@
5 COMMENT ON COLUMN DistroSeriesDifference.derived_series IS 'The derived distroseries with the difference from its parent.';
6 COMMENT ON COLUMN DistroSeriesDifference.source_package_name IS 'The name of the source package which is different in the two series.';
7 COMMENT ON COLUMN DistroSeriesDifference.last_package_diff IS 'The most recent package diff that was created for this difference.';
8-COMMENT ON COLUMN DistroSeriesDifference.activity_log IS 'A log of actions and/or comments by users regarding action to be taken.';
9 COMMENT ON COLUMN DistroSeriesDifference.status IS 'A distroseries difference can be needing attention, ignored or resolved.';
10 COMMENT ON COLUMN DistroSeriesDifference.difference_type IS 'The type of difference that this record represents - a package unique to the derived series, or missing, or in both.';
11
12+-- DistroSeriesDifferenceMessage
13+COMMENT ON TABLE DistroSeriesDifferenceMessage IS 'A message/comment on a distro series difference.';
14+COMMENT ON COLUMN DistroSeriesDifferenceMessage.distro_series_difference IS 'The distro series difference for this comment.';
15+COMMENT ON COLUMN DistroSeriesDifferenceMessage.message IS 'The comment for the distro series difference.';
16+
17 -- DistroSeriesPackageCache
18
19 COMMENT ON TABLE DistroSeriesPackageCache IS 'A cache of the text associated with binary packages in the distroseries. This table allows for fast queries to find a binary packagename that matches a given text.';
20
21=== renamed file 'database/schema/patch-2208-99-0.sql' => 'database/schema/patch-2208-07-0.sql'
22--- database/schema/patch-2208-99-0.sql 2010-08-24 14:55:03 +0000
23+++ database/schema/patch-2208-07-0.sql 2010-08-27 10:52:17 +0000
24@@ -5,7 +5,6 @@
25 derived_series integer NOT NULL CONSTRAINT distroseriesdifference__derived_series__fk REFERENCES distroseries,
26 source_package_name integer NOT NULL CONSTRAINT distroseriesdifference__source_package_name__fk REFERENCES sourcepackagename,
27 last_package_diff integer CONSTRAINT distroseriesdifference__last_package_diff__fk REFERENCES packagediff,
28- activity_log text,
29 status integer NOT NULL,
30 difference_type integer NOT NULL
31 );
32@@ -13,5 +12,15 @@
33 CREATE INDEX distroseriesdifference__source_package_name__idx ON distroseriesdifference(source_package_name);
34 CREATE INDEX distroseriesdifference__status__idx ON distroseriesdifference(status);
35 CREATE INDEX distroseriesdifference__difference_type__idx ON distroseriesdifference(difference_type);
36-
37-INSERT INTO LaunchpadDatabaseRevision VALUES (2208, 99, 0);
38+CREATE INDEX distroseriesdifference__last_package_diff__idx ON distroseriesdifference(last_package_diff);
39+
40+CREATE TABLE DistroSeriesDifferenceMessage(
41+ id serial PRIMARY KEY,
42+ distro_series_difference integer NOT NULL CONSTRAINT distroseriesdifferencemessage__distro_series_difference__fk REFERENCES distroseriesdifference,
43+ message integer NOT NULL CONSTRAINT distroseriesdifferencemessage__message__fk REFERENCES message,
44+ UNIQUE (distro_series_difference, message)
45+);
46+CREATE INDEX distroseriesdifferencemessage__distroseriesdifference__idx ON distroseriesdifferencemessage(distro_series_difference);
47+CREATE INDEX distroseriesdifferencemessage__message__idx ON distroseriesdifferencemessage(message);
48+
49+INSERT INTO LaunchpadDatabaseRevision VALUES (2208, 07, 0);
50
51=== modified file 'database/schema/security.cfg'
52--- database/schema/security.cfg 2010-08-27 06:50:09 +0000
53+++ database/schema/security.cfg 2010-08-27 10:45:11 +0000
54@@ -135,6 +135,7 @@
55 public.distributionsourcepackage = SELECT, INSERT, UPDATE, DELETE
56 public.distributionsourcepackagecache = SELECT
57 public.distroseriesdifference = SELECT, INSERT, UPDATE
58+public.distroseriesdifferencemessage = SELECT, INSERT, UPDATE
59 public.distroserieslanguage = SELECT, INSERT, UPDATE
60 public.distroseriespackagecache = SELECT
61 public.emailaddress = SELECT, INSERT, UPDATE, DELETE
Revision history for this message
Brad Crittenden (bac) :
review: Approve (code)
Revision history for this message
Robert Collins (lifeless) wrote :

Michael, 'latest_package_diff' implies 'the newest' but as you say
uploads could be made that mean its not the latest diff after all.

I haven't looked closely yet, but it seems to me that when you
generate a diff you know the versions you generated it against, and
you should store those, then it *is* a package_diff, not a 'latest_'
one, and we can query easily to see if it is or isn't the latest one.

Revision history for this message
Michael Nelson (michael.nelson) wrote :

Hi Robert,

Yes, those versions are stored with the PackageDiff object itself, but do you mean that we should store the exact versions additionally with this DistroSeriesDifference? We were doing that originally, but I'd switched to store simply the source package name as a view displaying these records would need to check and update them anyway, so it I thought it saner to have properties on the model that lookup the latest published version (cheap when we already know *which* packages have differences). Here's the email msg where I brought this up:

https://lists.launchpad.net/launchpad-dev/msg04454.html

And yep, I was using 'latest' in the sense that it is the most recently generated package diff, and irrespective of the above, we could rename this simply to package_diff and automatically remove it when it no longer matches the current published versions - I'd not thought of that.

I'll update latest_package_diff -> package_diff, but let me know about the first issue (storing of exact versions within DistroSeriesDifference in addition to PackageDiff).

Cheers,
M.

Revision history for this message
Stuart Bishop (stub) wrote :

Drop the UNIQUE (distro_series_difference, message) constraint, and declare the message column UNIQUE instead.

Drop the index on the message column - one will get created implicitly when the column is declared UNIQUE.

Revision history for this message
Michael Nelson (michael.nelson) wrote :

On Mon, Aug 30, 2010 at 9:41 AM, Stuart Bishop
<email address hidden> wrote:
> Drop the UNIQUE (distro_series_difference, message) constraint, and declare the message column UNIQUE instead.
>
> Drop the index on the message column - one will get created implicitly when the column is declared UNIQUE.

Thanks Stuart. Done in the incremental (as well as renaming
last_package_diff->package_diff).

1=== modified file 'database/schema/comments.sql'
2--- database/schema/comments.sql 2010-08-27 10:59:56 +0000
3+++ database/schema/comments.sql 2010-08-30 07:48:22 +0000
4@@ -507,7 +507,7 @@
5 COMMENT ON TABLE DistroSeriesDifference IS 'A difference of versions for a package in a derived distroseries and its parent distroseries.';
6 COMMENT ON COLUMN DistroSeriesDifference.derived_series IS 'The derived distroseries with the difference from its parent.';
7 COMMENT ON COLUMN DistroSeriesDifference.source_package_name IS 'The name of the source package which is different in the two series.';
8-COMMENT ON COLUMN DistroSeriesDifference.last_package_diff IS 'The most recent package diff that was created for this difference.';
9+COMMENT ON COLUMN DistroSeriesDifference.package_diff IS 'The most recent package diff that was created for this difference.';
10 COMMENT ON COLUMN DistroSeriesDifference.status IS 'A distroseries difference can be needing attention, ignored or resolved.';
11 COMMENT ON COLUMN DistroSeriesDifference.difference_type IS 'The type of difference that this record represents - a package unique to the derived series, or missing, or in both.';
12
13
14=== modified file 'database/schema/patch-2208-07-0.sql'
15--- database/schema/patch-2208-07-0.sql 2010-08-27 10:59:56 +0000
16+++ database/schema/patch-2208-07-0.sql 2010-08-30 07:49:48 +0000
17@@ -4,7 +4,7 @@
18 id serial PRIMARY KEY,
19 derived_series integer NOT NULL CONSTRAINT distroseriesdifference__derived_series__fk REFERENCES distroseries,
20 source_package_name integer NOT NULL CONSTRAINT distroseriesdifference__source_package_name__fk REFERENCES sourcepackagename,
21- last_package_diff integer CONSTRAINT distroseriesdifference__last_package_diff__fk REFERENCES packagediff,
22+ package_diff integer CONSTRAINT distroseriesdifference__package_diff__fk REFERENCES packagediff,
23 status integer NOT NULL,
24 difference_type integer NOT NULL
25 );
26@@ -12,15 +12,13 @@
27 CREATE INDEX distroseriesdifference__source_package_name__idx ON distroseriesdifference(source_package_name);
28 CREATE INDEX distroseriesdifference__status__idx ON distroseriesdifference(status);
29 CREATE INDEX distroseriesdifference__difference_type__idx ON distroseriesdifference(difference_type);
30-CREATE INDEX distroseriesdifference__last_package_diff__idx ON distroseriesdifference(last_package_diff);
31+CREATE INDEX distroseriesdifference__package_diff__idx ON distroseriesdifference(package_diff);
32
33 CREATE TABLE DistroSeriesDifferenceMessage(
34 id serial PRIMARY KEY,
35 distro_series_difference integer NOT NULL CONSTRAINT distroseriesdifferencemessage__distro_series_difference__fk REFERENCES distroseriesdifference,
36- message integer NOT NULL CONSTRAINT distroseriesdifferencemessage__message__fk REFERENCES message,
37- UNIQUE (distro_series_difference, message)
38+ message integer NOT NULL CONSTRAINT distroseriesdifferencemessage__message__fk REFERENCES message UNIQUE
39 );
40 CREATE INDEX distroseriesdifferencemessage__distroseriesdifference__idx ON distroseriesdifferencemessage(distro_series_difference);
41-CREATE INDEX distroseriesdifferencemessage__message__idx ON distroseriesdifferencemessage(message);
42
43 INSERT INTO LaunchpadDatabaseRevision VALUES (2208, 07, 0);

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'database/schema/comments.sql'
2--- database/schema/comments.sql 2010-08-25 16:25:02 +0000
3+++ database/schema/comments.sql 2010-08-30 09:12:45 +0000
4@@ -503,6 +503,19 @@
5 COMMENT ON COLUMN DistributionSourcePackageCache.changelog IS 'A concatenation of the source package release changelogs for this source package, where the status is not REMOVED.';
6 COMMENT ON COLUMN DistributionSourcePackageCache.archive IS 'The archive where the source is published.';
7
8+-- DistroSeriesDifference
9+COMMENT ON TABLE DistroSeriesDifference IS 'A difference of versions for a package in a derived distroseries and its parent distroseries.';
10+COMMENT ON COLUMN DistroSeriesDifference.derived_series IS 'The derived distroseries with the difference from its parent.';
11+COMMENT ON COLUMN DistroSeriesDifference.source_package_name IS 'The name of the source package which is different in the two series.';
12+COMMENT ON COLUMN DistroSeriesDifference.package_diff IS 'The most recent package diff that was created for this difference.';
13+COMMENT ON COLUMN DistroSeriesDifference.status IS 'A distroseries difference can be needing attention, ignored or resolved.';
14+COMMENT ON COLUMN DistroSeriesDifference.difference_type IS 'The type of difference that this record represents - a package unique to the derived series, or missing, or in both.';
15+
16+-- DistroSeriesDifferenceMessage
17+COMMENT ON TABLE DistroSeriesDifferenceMessage IS 'A message/comment on a distro series difference.';
18+COMMENT ON COLUMN DistroSeriesDifferenceMessage.distro_series_difference IS 'The distro series difference for this comment.';
19+COMMENT ON COLUMN DistroSeriesDifferenceMessage.message IS 'The comment for the distro series difference.';
20+
21 -- DistroSeriesPackageCache
22
23 COMMENT ON TABLE DistroSeriesPackageCache IS 'A cache of the text associated with binary packages in the distroseries. This table allows for fast queries to find a binary packagename that matches a given text.';
24@@ -533,13 +546,13 @@
25 can be changed without restarting Launchpad
26 <https://dev.launchpad.net/LEP/FeatureFlags>';
27
28-COMMENT ON COLUMN FeatureFlag.scope IS
29+COMMENT ON COLUMN FeatureFlag.scope IS
30 'Scope in which this setting is active';
31
32-COMMENT ON COLUMN FeatureFlag.priority IS
33+COMMENT ON COLUMN FeatureFlag.priority IS
34 'Higher priority flags override lower';
35
36-COMMENT ON COLUMN FeatureFlag.flag IS
37+COMMENT ON COLUMN FeatureFlag.flag IS
38 'Name of the flag being controlled';
39
40 -- KarmaCategory
41
42=== added file 'database/schema/patch-2208-07-0.sql'
43--- database/schema/patch-2208-07-0.sql 1970-01-01 00:00:00 +0000
44+++ database/schema/patch-2208-07-0.sql 2010-08-30 09:12:45 +0000
45@@ -0,0 +1,24 @@
46+SET client_min_messages=ERROR;
47+
48+CREATE TABLE DistroSeriesDifference (
49+ id serial PRIMARY KEY,
50+ derived_series integer NOT NULL CONSTRAINT distroseriesdifference__derived_series__fk REFERENCES distroseries,
51+ source_package_name integer NOT NULL CONSTRAINT distroseriesdifference__source_package_name__fk REFERENCES sourcepackagename,
52+ package_diff integer CONSTRAINT distroseriesdifference__package_diff__fk REFERENCES packagediff,
53+ status integer NOT NULL,
54+ difference_type integer NOT NULL
55+);
56+CREATE INDEX distroseriesdifference__derived_series__idx ON distroseriesdifference(derived_series);
57+CREATE INDEX distroseriesdifference__source_package_name__idx ON distroseriesdifference(source_package_name);
58+CREATE INDEX distroseriesdifference__status__idx ON distroseriesdifference(status);
59+CREATE INDEX distroseriesdifference__difference_type__idx ON distroseriesdifference(difference_type);
60+CREATE INDEX distroseriesdifference__package_diff__idx ON distroseriesdifference(package_diff);
61+
62+CREATE TABLE DistroSeriesDifferenceMessage(
63+ id serial PRIMARY KEY,
64+ distro_series_difference integer NOT NULL CONSTRAINT distroseriesdifferencemessage__distro_series_difference__fk REFERENCES distroseriesdifference,
65+ message integer NOT NULL CONSTRAINT distroseriesdifferencemessage__message__fk REFERENCES message UNIQUE
66+);
67+CREATE INDEX distroseriesdifferencemessage__distroseriesdifference__idx ON distroseriesdifferencemessage(distro_series_difference);
68+
69+INSERT INTO LaunchpadDatabaseRevision VALUES (2208, 07, 0);
70
71=== modified file 'database/schema/security.cfg'
72--- database/schema/security.cfg 2010-08-27 18:49:22 +0000
73+++ database/schema/security.cfg 2010-08-30 09:12:45 +0000
74@@ -134,6 +134,8 @@
75 public.distributionmirror = SELECT, INSERT, UPDATE, DELETE
76 public.distributionsourcepackage = SELECT, INSERT, UPDATE, DELETE
77 public.distributionsourcepackagecache = SELECT
78+public.distroseriesdifference = SELECT, INSERT, UPDATE
79+public.distroseriesdifferencemessage = SELECT, INSERT, UPDATE
80 public.distroserieslanguage = SELECT, INSERT, UPDATE
81 public.distroseriespackagecache = SELECT
82 public.emailaddress = SELECT, INSERT, UPDATE, DELETE

Subscribers

People subscribed via source and target branches

to status/vote changes: