Merge lp:~stub/launchpad/replication into lp:launchpad

Proposed by Stuart Bishop
Status: Merged
Merged at revision: not available
Proposed branch: lp:~stub/launchpad/replication
Merge into: lp:launchpad
Diff against target: 190 lines
3 files modified
database/replication/Makefile (+8/-3)
database/replication/helpers.py (+11/-6)
database/replication/new-slave.py (+66/-3)
To merge this branch: bzr merge lp:~stub/launchpad/replication
Reviewer Review Type Date Requested Status
Michael Nelson (community) release-critical Approve
Henning Eggers (community) code Approve
Review via email: mp+14155@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Stuart Bishop (stub) wrote :

Fix Bug #461800

new-slave.py is a script used on production to build a new slave database. It automates a non-trivial process. This process became even less trivial when we moved the authdb replication set master to a seperate database than the lpmain replication set master. One of the first steps this script needs to do is duplicate the schema. Unfortunately, with the 1.2 series of Slony-I, backups from slaves replication sets are incomplete (missing triggers etc.). So now we need to build the schema using two databases rather than the single database used previously (the master for both replication sets).

Revision history for this message
Henning Eggers (henninge) wrote :

merge-conditional, so that you can get ahead with that when you get up ... ;-)

Thank you for improving this. I just have two improvements that I'd like to have in there.

> + # Get the connection string for masters.
> + lpmain_connection_string = get_master_connection_string(
> + source_connection, parser, 1) or source_connection_string
> + authdb_connection_string = get_master_connection_string(
> + source_connection, parser, 2) or source_connection_string

Please define constants for "1" and "2", e.g. "LPMAIN_SET_ID = 1".

> + # Duplicate the authdb schema.
> + log.info("Duplicating authdb schema from '%s' to '%s'" % (
> + authdb_connection_string, target_connection_string))
> + table_args = ["--table=%s" % table for table in _auth_store_tables]
> + # We need to restore the two cross-replication-set views that where
> + # dropped as a side effect of dropping the auth store tables.
> + table_args.append("--table=ValidPersonCache")
> + table_args.append("--table=ValidPersonOrTeamCache")

Please list these two tables in a list, close to where you define _auth_store_tables. This keeps this information from being scattered throughout the code. You can then create table_args directly from the two lists, like this:

table_args = ["--table=%s" % table for table in _auth_store_tables + whatever_you_call_them_tables]

BTW, since _auth_store_tables is imported from somewhere else, the leading _ is missleading, isn't it?

Cheers,
Henning

review: Approve (code)
Revision history for this message
Michael Nelson (michael.nelson) wrote :

Assuming this is all well tested, let's get it in devel while we can!

review: Approve (release-critical)
Revision history for this message
Stuart Bishop (stub) wrote :

On Mon, Nov 2, 2009 at 11:07 PM, Henning Eggers
<email address hidden> wrote:
> Review: Approve code
> merge-conditional, so that you can get ahead with that when you get up ... ;-)
>
> Thank you for improving this. I just have two improvements that I'd like to have in there.
>
>> + # Get the connection string for masters.
>> + lpmain_connection_string = get_master_connection_string(
>> + source_connection, parser, 1) or source_connection_string
>> + authdb_connection_string = get_master_connection_string(
>> + source_connection, parser, 2) or source_connection_string
>
> Please define constants for "1" and "2", e.g. "LPMAIN_SET_ID = 1".

Done.

>> + # Duplicate the authdb schema.
>> + log.info("Duplicating authdb schema from '%s' to '%s'" % (
>> + authdb_connection_string, target_connection_string))
>> + table_args = ["--table=%s" % table for table in _auth_store_tables]
>> + # We need to restore the two cross-replication-set views that where
>> + # dropped as a side effect of dropping the auth store tables.
>> + table_args.append("--table=ValidPersonCache")
>> + table_args.append("--table=ValidPersonOrTeamCache")
>
> Please list these two tables in a list, close to where you define _auth_store_tables. This keeps this information from being scattered throughout the code. You can then create table_args directly from the two lists, like this:
>
> table_args = ["--table=%s" % table for table in _auth_store_tables + whatever_you_call_them_tables]

I'm leaving this for next time. These two views are an atavism and
don't belong in canonical.launchpad.webapp.adapter. Hopefully next
time they won't be there at all as we split authentication and
launchpad apart further.

> BTW, since _auth_store_tables is imported from somewhere else, the leading _ is missleading, isn't it?

It is. There needs to be a better way of the database adapter
determining which Store should be used for a table than having this
list hardcoded. It probably should be an atttribute on the database
class. The _ is still appropriate as we don't want this as part of the
public API. Hopefully these chages to new-slave.py are just a
temporary hack and we can simplify things in the near future.

--
Stuart Bishop <email address hidden>
http://www.stuartbishop.net/

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'database/replication/Makefile'
2--- database/replication/Makefile 2009-06-24 21:17:33 +0000
3+++ database/replication/Makefile 2009-11-03 08:45:26 +0000
4@@ -37,6 +37,8 @@
5
6 PGMASSACRE=../../utilities/pgmassacre.py
7
8+SHHH=../../utilities/shhh.py
9+
10 default:
11 echo Usage: make [start|stop|restart]
12
13@@ -143,9 +145,12 @@
14 # Setup the slave
15 ./new-slave.py 2 "dbname=${_SLAVE}"
16 # Upgrade all databases in the cluster and reset security.
17- ../schema/upgrade.py
18- ../schema/fti.py
19- ../schema/security.py --cluster -U slony
20+ @echo Running upgrade.py
21+ ${SHHH} ../schema/upgrade.py
22+ @echo Running fti.py
23+ ${SHHH} ../schema/fti.py
24+ @echo Running security.py
25+ ${SHHH} ../schema/security.py --cluster -U slony
26 # Migrate tables to the authdb replication set, creating the set
27 # and subscribing nodes to it as necessary.
28 ./populate_auth_replication_set.py -U slony
29
30=== modified file 'database/replication/helpers.py'
31--- database/replication/helpers.py 2009-06-24 21:17:33 +0000
32+++ database/replication/helpers.py 2009-11-03 08:45:26 +0000
33@@ -26,6 +26,11 @@
34 # The namespace in the database used to contain all the Slony-I tables.
35 CLUSTER_NAMESPACE = '_%s' % CLUSTERNAME
36
37+# Replication set id constants. Don't change these without DBA help.
38+LPMAIN_SET_ID = 1
39+AUTHDB_SET_ID = 2
40+HOLDING_SET_ID = 666
41+
42 # Seed tables for the authdb replication set to be passed to
43 # calculate_replication_set().
44 AUTHDB_SEED = frozenset([
45@@ -282,10 +287,10 @@
46 cluster name = sl;
47
48 # Symbolic ids for replication sets.
49- define lpmain_set 1;
50- define authdb_set 2;
51- define holding_set 666;
52- """)]
53+ define lpmain_set %d;
54+ define authdb_set %d;
55+ define holding_set %d;
56+ """ % (LPMAIN_SET_ID, AUTHDB_SET_ID, HOLDING_SET_ID))]
57
58 if master_node is not None:
59 preamble.append(dedent("""\
60@@ -303,9 +308,9 @@
61 node.nickname, node.node_id,
62 node.nickname, node.connection_string,
63 node.nickname, node.nickname)))
64-
65+
66 return '\n\n'.join(preamble)
67-
68+
69
70 def calculate_replication_set(cur, seeds):
71 """Return the minimal set of tables and sequences needed in a
72
73=== modified file 'database/replication/new-slave.py'
74--- database/replication/new-slave.py 2009-06-24 21:17:33 +0000
75+++ database/replication/new-slave.py 2009-11-03 08:45:26 +0000
76@@ -22,8 +22,10 @@
77 from canonical.database.sqlbase import (
78 connect_string, ISOLATION_LEVEL_AUTOCOMMIT)
79 from canonical.launchpad.scripts import db_options, logger_options, logger
80+from canonical.launchpad.webapp.adapter import _auth_store_tables
81
82 import replication.helpers
83+from replication.helpers import AUTHDB_SET_ID, LPMAIN_SET_ID
84
85 def main():
86 parser = OptionParser(
87@@ -72,6 +74,12 @@
88 if node_id in [node.node_id for node in existing_nodes]:
89 parser.error("Node %d already exists in the cluster." % node_id)
90
91+ # Get the connection string for masters.
92+ lpmain_connection_string = get_master_connection_string(
93+ source_connection, parser, AUTHDB_SET_ID) or source_connection_string
94+ authdb_connection_string = get_master_connection_string(
95+ source_connection, parser, LPMAIN_SET_ID) or source_connection_string
96+
97 # Sanity check the target connection string.
98 target_connection_string = ConnectionString(raw_target_connection_string)
99 if target_connection_string.user is None:
100@@ -110,11 +118,11 @@
101 "Database at %s is not empty." % target_connection_string)
102 target_con.rollback()
103
104- # Duplicate the schema. We restore with no-privileges as required
105+ # Duplicate the full schema. We restore with no-privileges as required
106 # roles may not yet exist, so we have to run security.py on the
107 # new slave once it is built.
108- log.info("Duplicating db schema from '%s' to '%s'" % (
109- source_connection_string, target_connection_string))
110+ log.info("Duplicating full db schema from '%s' to '%s'" % (
111+ lpmain_connection_string, target_connection_string))
112 cmd = "pg_dump --schema-only --no-privileges %s | psql -1 -q %s" % (
113 source_connection_string.asPGCommandLineArgs(),
114 target_connection_string.asPGCommandLineArgs())
115@@ -122,7 +130,33 @@
116 log.error("Failed to duplicate database schema.")
117 return 1
118
119+ # Drop the authdb replication set tables we just restored, as they
120+ # will be broken if the authdb master is a seperate database to the
121+ # lpmain master.
122+ log.debug("Dropping (possibly corrupt) authdb tables.")
123+ cur = target_con.cursor()
124+ for table_name in _auth_store_tables:
125+ cur.execute("DROP TABLE IF EXISTS %s CASCADE" % table_name)
126+ target_con.commit()
127+
128+ # Duplicate the authdb schema.
129+ log.info("Duplicating authdb schema from '%s' to '%s'" % (
130+ authdb_connection_string, target_connection_string))
131+ table_args = ["--table=%s" % table for table in _auth_store_tables]
132+ # We need to restore the two cross-replication-set views that where
133+ # dropped as a side effect of dropping the auth store tables.
134+ table_args.append("--table=ValidPersonCache")
135+ table_args.append("--table=ValidPersonOrTeamCache")
136+ cmd = "pg_dump --schema-only --no-privileges %s %s | psql -1 -q %s" % (
137+ ' '.join(table_args),
138+ source_connection_string.asPGCommandLineArgs(),
139+ target_connection_string.asPGCommandLineArgs())
140+ if subprocess.call(cmd, shell=True) != 0:
141+ log.error("Failed to duplicate database schema.")
142+ return 1
143+
144 # Trash the broken Slony tables we just duplicated.
145+ log.debug("Removing slony cruft.")
146 cur = target_con.cursor()
147 cur.execute("DROP SCHEMA _sl CASCADE")
148 target_con.commit()
149@@ -136,6 +170,7 @@
150 "SELECT set_id FROM _sl.sl_set WHERE set_origin=%d"
151 % master_node.node_id)
152 set_ids = [set_id for set_id, in cur.fetchall()]
153+ log.debug("Discovered set ids %s" % repr(list(set_ids)))
154
155 # Generate and run a slonik(1) script to initialize the new node
156 # and subscribe it to our replication sets.
157@@ -185,5 +220,33 @@
158
159 return 0
160
161+
162+def get_master_connection_string(con, parser, set_id):
163+ """Return the connection string to the origin for the replication set.
164+ """
165+ cur = con.cursor()
166+ cur.execute("""
167+ SELECT pa_conninfo FROM _sl.sl_set, _sl.sl_path
168+ WHERE set_origin = pa_server AND set_id = %d
169+ LIMIT 1
170+ """ % set_id)
171+ row = cur.fetchone()
172+ if row is None:
173+ # If we have no paths stored, there is only a single node in the
174+ # cluster.
175+ return None
176+ else:
177+ connection_string = ConnectionString(row[0])
178+
179+ # Confirm we can connect from here.
180+ try:
181+ test_con = psycopg2.connect(str(connection_string))
182+ except psycopg2.Error, exception:
183+ parser.error("Failed to connect to using '%s' (%s)" % (
184+ connection_string, str(exception).strip()))
185+
186+ return connection_string
187+
188+
189 if __name__ == '__main__':
190 sys.exit(main())