Merge lp:~stub/launchpad/replication into lp:launchpad

Proposed by Stuart Bishop
Status: Merged
Merged at revision: not available
Proposed branch: lp:~stub/launchpad/replication
Merge into: lp:launchpad
Diff against target: 190 lines
3 files modified
database/replication/Makefile (+8/-3)
database/replication/helpers.py (+11/-6)
database/replication/new-slave.py (+66/-3)
To merge this branch: bzr merge lp:~stub/launchpad/replication
Reviewer Review Type Date Requested Status
Michael Nelson (community) release-critical Approve
Henning Eggers (community) code Approve
Review via email: mp+14155@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Stuart Bishop (stub) wrote :

Fix Bug #461800

new-slave.py is a script used on production to build a new slave database. It automates a non-trivial process. This process became even less trivial when we moved the authdb replication set master to a seperate database than the lpmain replication set master. One of the first steps this script needs to do is duplicate the schema. Unfortunately, with the 1.2 series of Slony-I, backups from slaves replication sets are incomplete (missing triggers etc.). So now we need to build the schema using two databases rather than the single database used previously (the master for both replication sets).

Revision history for this message
Henning Eggers (henninge) wrote :

merge-conditional, so that you can get ahead with that when you get up ... ;-)

Thank you for improving this. I just have two improvements that I'd like to have in there.

> + # Get the connection string for masters.
> + lpmain_connection_string = get_master_connection_string(
> + source_connection, parser, 1) or source_connection_string
> + authdb_connection_string = get_master_connection_string(
> + source_connection, parser, 2) or source_connection_string

Please define constants for "1" and "2", e.g. "LPMAIN_SET_ID = 1".

> + # Duplicate the authdb schema.
> + log.info("Duplicating authdb schema from '%s' to '%s'" % (
> + authdb_connection_string, target_connection_string))
> + table_args = ["--table=%s" % table for table in _auth_store_tables]
> + # We need to restore the two cross-replication-set views that where
> + # dropped as a side effect of dropping the auth store tables.
> + table_args.append("--table=ValidPersonCache")
> + table_args.append("--table=ValidPersonOrTeamCache")

Please list these two tables in a list, close to where you define _auth_store_tables. This keeps this information from being scattered throughout the code. You can then create table_args directly from the two lists, like this:

table_args = ["--table=%s" % table for table in _auth_store_tables + whatever_you_call_them_tables]

BTW, since _auth_store_tables is imported from somewhere else, the leading _ is missleading, isn't it?

Cheers,
Henning

review: Approve (code)
Revision history for this message
Michael Nelson (michael.nelson) wrote :

Assuming this is all well tested, let's get it in devel while we can!

review: Approve (release-critical)
Revision history for this message
Stuart Bishop (stub) wrote :

On Mon, Nov 2, 2009 at 11:07 PM, Henning Eggers
<email address hidden> wrote:
> Review: Approve code
> merge-conditional, so that you can get ahead with that when you get up ... ;-)
>
> Thank you for improving this. I just have two improvements that I'd like to have in there.
>
>> + # Get the connection string for masters.
>> + lpmain_connection_string = get_master_connection_string(
>> + source_connection, parser, 1) or source_connection_string
>> + authdb_connection_string = get_master_connection_string(
>> + source_connection, parser, 2) or source_connection_string
>
> Please define constants for "1" and "2", e.g. "LPMAIN_SET_ID = 1".

Done.

>> + # Duplicate the authdb schema.
>> + log.info("Duplicating authdb schema from '%s' to '%s'" % (
>> + authdb_connection_string, target_connection_string))
>> + table_args = ["--table=%s" % table for table in _auth_store_tables]
>> + # We need to restore the two cross-replication-set views that where
>> + # dropped as a side effect of dropping the auth store tables.
>> + table_args.append("--table=ValidPersonCache")
>> + table_args.append("--table=ValidPersonOrTeamCache")
>
> Please list these two tables in a list, close to where you define _auth_store_tables. This keeps this information from being scattered throughout the code. You can then create table_args directly from the two lists, like this:
>
> table_args = ["--table=%s" % table for table in _auth_store_tables + whatever_you_call_them_tables]

I'm leaving this for next time. These two views are an atavism and
don't belong in canonical.launchpad.webapp.adapter. Hopefully next
time they won't be there at all as we split authentication and
launchpad apart further.

> BTW, since _auth_store_tables is imported from somewhere else, the leading _ is missleading, isn't it?

It is. There needs to be a better way of the database adapter
determining which Store should be used for a table than having this
list hardcoded. It probably should be an atttribute on the database
class. The _ is still appropriate as we don't want this as part of the
public API. Hopefully these chages to new-slave.py are just a
temporary hack and we can simplify things in the near future.

--
Stuart Bishop <email address hidden>
http://www.stuartbishop.net/

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'database/replication/Makefile'
--- database/replication/Makefile 2009-06-24 21:17:33 +0000
+++ database/replication/Makefile 2009-11-03 08:45:26 +0000
@@ -37,6 +37,8 @@
3737
38PGMASSACRE=../../utilities/pgmassacre.py38PGMASSACRE=../../utilities/pgmassacre.py
3939
40SHHH=../../utilities/shhh.py
41
40default:42default:
41 echo Usage: make [start|stop|restart]43 echo Usage: make [start|stop|restart]
4244
@@ -143,9 +145,12 @@
143 # Setup the slave145 # Setup the slave
144 ./new-slave.py 2 "dbname=${_SLAVE}"146 ./new-slave.py 2 "dbname=${_SLAVE}"
145 # Upgrade all databases in the cluster and reset security.147 # Upgrade all databases in the cluster and reset security.
146 ../schema/upgrade.py148 @echo Running upgrade.py
147 ../schema/fti.py149 ${SHHH} ../schema/upgrade.py
148 ../schema/security.py --cluster -U slony150 @echo Running fti.py
151 ${SHHH} ../schema/fti.py
152 @echo Running security.py
153 ${SHHH} ../schema/security.py --cluster -U slony
149 # Migrate tables to the authdb replication set, creating the set154 # Migrate tables to the authdb replication set, creating the set
150 # and subscribing nodes to it as necessary.155 # and subscribing nodes to it as necessary.
151 ./populate_auth_replication_set.py -U slony156 ./populate_auth_replication_set.py -U slony
152157
=== modified file 'database/replication/helpers.py'
--- database/replication/helpers.py 2009-06-24 21:17:33 +0000
+++ database/replication/helpers.py 2009-11-03 08:45:26 +0000
@@ -26,6 +26,11 @@
26# The namespace in the database used to contain all the Slony-I tables.26# The namespace in the database used to contain all the Slony-I tables.
27CLUSTER_NAMESPACE = '_%s' % CLUSTERNAME27CLUSTER_NAMESPACE = '_%s' % CLUSTERNAME
2828
29# Replication set id constants. Don't change these without DBA help.
30LPMAIN_SET_ID = 1
31AUTHDB_SET_ID = 2
32HOLDING_SET_ID = 666
33
29# Seed tables for the authdb replication set to be passed to34# Seed tables for the authdb replication set to be passed to
30# calculate_replication_set().35# calculate_replication_set().
31AUTHDB_SEED = frozenset([36AUTHDB_SEED = frozenset([
@@ -282,10 +287,10 @@
282 cluster name = sl;287 cluster name = sl;
283288
284 # Symbolic ids for replication sets.289 # Symbolic ids for replication sets.
285 define lpmain_set 1;290 define lpmain_set %d;
286 define authdb_set 2;291 define authdb_set %d;
287 define holding_set 666;292 define holding_set %d;
288 """)]293 """ % (LPMAIN_SET_ID, AUTHDB_SET_ID, HOLDING_SET_ID))]
289294
290 if master_node is not None:295 if master_node is not None:
291 preamble.append(dedent("""\296 preamble.append(dedent("""\
@@ -303,9 +308,9 @@
303 node.nickname, node.node_id,308 node.nickname, node.node_id,
304 node.nickname, node.connection_string,309 node.nickname, node.connection_string,
305 node.nickname, node.nickname)))310 node.nickname, node.nickname)))
306 311
307 return '\n\n'.join(preamble)312 return '\n\n'.join(preamble)
308 313
309314
310def calculate_replication_set(cur, seeds):315def calculate_replication_set(cur, seeds):
311 """Return the minimal set of tables and sequences needed in a316 """Return the minimal set of tables and sequences needed in a
312317
=== modified file 'database/replication/new-slave.py'
--- database/replication/new-slave.py 2009-06-24 21:17:33 +0000
+++ database/replication/new-slave.py 2009-11-03 08:45:26 +0000
@@ -22,8 +22,10 @@
22from canonical.database.sqlbase import (22from canonical.database.sqlbase import (
23 connect_string, ISOLATION_LEVEL_AUTOCOMMIT)23 connect_string, ISOLATION_LEVEL_AUTOCOMMIT)
24from canonical.launchpad.scripts import db_options, logger_options, logger24from canonical.launchpad.scripts import db_options, logger_options, logger
25from canonical.launchpad.webapp.adapter import _auth_store_tables
2526
26import replication.helpers27import replication.helpers
28from replication.helpers import AUTHDB_SET_ID, LPMAIN_SET_ID
2729
28def main():30def main():
29 parser = OptionParser(31 parser = OptionParser(
@@ -72,6 +74,12 @@
72 if node_id in [node.node_id for node in existing_nodes]:74 if node_id in [node.node_id for node in existing_nodes]:
73 parser.error("Node %d already exists in the cluster." % node_id)75 parser.error("Node %d already exists in the cluster." % node_id)
7476
77 # Get the connection string for masters.
78 lpmain_connection_string = get_master_connection_string(
79 source_connection, parser, AUTHDB_SET_ID) or source_connection_string
80 authdb_connection_string = get_master_connection_string(
81 source_connection, parser, LPMAIN_SET_ID) or source_connection_string
82
75 # Sanity check the target connection string.83 # Sanity check the target connection string.
76 target_connection_string = ConnectionString(raw_target_connection_string)84 target_connection_string = ConnectionString(raw_target_connection_string)
77 if target_connection_string.user is None:85 if target_connection_string.user is None:
@@ -110,11 +118,11 @@
110 "Database at %s is not empty." % target_connection_string)118 "Database at %s is not empty." % target_connection_string)
111 target_con.rollback()119 target_con.rollback()
112120
113 # Duplicate the schema. We restore with no-privileges as required121 # Duplicate the full schema. We restore with no-privileges as required
114 # roles may not yet exist, so we have to run security.py on the122 # roles may not yet exist, so we have to run security.py on the
115 # new slave once it is built.123 # new slave once it is built.
116 log.info("Duplicating db schema from '%s' to '%s'" % (124 log.info("Duplicating full db schema from '%s' to '%s'" % (
117 source_connection_string, target_connection_string))125 lpmain_connection_string, target_connection_string))
118 cmd = "pg_dump --schema-only --no-privileges %s | psql -1 -q %s" % (126 cmd = "pg_dump --schema-only --no-privileges %s | psql -1 -q %s" % (
119 source_connection_string.asPGCommandLineArgs(),127 source_connection_string.asPGCommandLineArgs(),
120 target_connection_string.asPGCommandLineArgs())128 target_connection_string.asPGCommandLineArgs())
@@ -122,7 +130,33 @@
122 log.error("Failed to duplicate database schema.")130 log.error("Failed to duplicate database schema.")
123 return 1131 return 1
124132
133 # Drop the authdb replication set tables we just restored, as they
134 # will be broken if the authdb master is a seperate database to the
135 # lpmain master.
136 log.debug("Dropping (possibly corrupt) authdb tables.")
137 cur = target_con.cursor()
138 for table_name in _auth_store_tables:
139 cur.execute("DROP TABLE IF EXISTS %s CASCADE" % table_name)
140 target_con.commit()
141
142 # Duplicate the authdb schema.
143 log.info("Duplicating authdb schema from '%s' to '%s'" % (
144 authdb_connection_string, target_connection_string))
145 table_args = ["--table=%s" % table for table in _auth_store_tables]
146 # We need to restore the two cross-replication-set views that where
147 # dropped as a side effect of dropping the auth store tables.
148 table_args.append("--table=ValidPersonCache")
149 table_args.append("--table=ValidPersonOrTeamCache")
150 cmd = "pg_dump --schema-only --no-privileges %s %s | psql -1 -q %s" % (
151 ' '.join(table_args),
152 source_connection_string.asPGCommandLineArgs(),
153 target_connection_string.asPGCommandLineArgs())
154 if subprocess.call(cmd, shell=True) != 0:
155 log.error("Failed to duplicate database schema.")
156 return 1
157
125 # Trash the broken Slony tables we just duplicated.158 # Trash the broken Slony tables we just duplicated.
159 log.debug("Removing slony cruft.")
126 cur = target_con.cursor()160 cur = target_con.cursor()
127 cur.execute("DROP SCHEMA _sl CASCADE")161 cur.execute("DROP SCHEMA _sl CASCADE")
128 target_con.commit()162 target_con.commit()
@@ -136,6 +170,7 @@
136 "SELECT set_id FROM _sl.sl_set WHERE set_origin=%d"170 "SELECT set_id FROM _sl.sl_set WHERE set_origin=%d"
137 % master_node.node_id)171 % master_node.node_id)
138 set_ids = [set_id for set_id, in cur.fetchall()]172 set_ids = [set_id for set_id, in cur.fetchall()]
173 log.debug("Discovered set ids %s" % repr(list(set_ids)))
139174
140 # Generate and run a slonik(1) script to initialize the new node175 # Generate and run a slonik(1) script to initialize the new node
141 # and subscribe it to our replication sets.176 # and subscribe it to our replication sets.
@@ -185,5 +220,33 @@
185220
186 return 0221 return 0
187222
223
224def get_master_connection_string(con, parser, set_id):
225 """Return the connection string to the origin for the replication set.
226 """
227 cur = con.cursor()
228 cur.execute("""
229 SELECT pa_conninfo FROM _sl.sl_set, _sl.sl_path
230 WHERE set_origin = pa_server AND set_id = %d
231 LIMIT 1
232 """ % set_id)
233 row = cur.fetchone()
234 if row is None:
235 # If we have no paths stored, there is only a single node in the
236 # cluster.
237 return None
238 else:
239 connection_string = ConnectionString(row[0])
240
241 # Confirm we can connect from here.
242 try:
243 test_con = psycopg2.connect(str(connection_string))
244 except psycopg2.Error, exception:
245 parser.error("Failed to connect to using '%s' (%s)" % (
246 connection_string, str(exception).strip()))
247
248 return connection_string
249
250
188if __name__ == '__main__':251if __name__ == '__main__':
189 sys.exit(main())252 sys.exit(main())