Merge ~andersson123/ubuntu-qa-website:database-backup into ubuntu-qa-website:main

Proposed by Tim Andersson
Status: Merged
Merged at revision: de893e76b2a4aa3153a2b0ce683a114ca588264d
Proposed branch: ~andersson123/ubuntu-qa-website:database-backup
Merge into: ubuntu-qa-website:main
Diff against target: 167 lines (+150/-0)
2 files modified
scripts/db-backup (+122/-0)
scripts/qatracker-setup.sh (+28/-0)
Reviewer Review Type Date Requested Status
Steve Langasek Approve
Review via email: mp+459506@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Tim Andersson (andersson123) wrote :

Tested it now, it all works! moving to needs review

Revision history for this message
Tim Andersson (andersson123) wrote :

Need to amend the script to remove old backups too

Revision history for this message
Steve Langasek (vorlon) :
review: Needs Fixing
Revision history for this message
Brian Murray (brian-murray) wrote :

I think Steve is talking about the "wal_e" juju configuration options here.

Revision history for this message
Steve Langasek (vorlon) wrote :

On Fri, Jan 26, 2024 at 07:19:43PM -0000, Brian Murray wrote:
> I think Steve is talking about the "wal_e" juju configuration options here.

I'm looking at these options in 'juju config postgresql':

  backup_dir:
    default: /var/lib/postgresql/backups
    description: |
      Directory to place backups in.
    source: default
    type: string
    value: /var/lib/postgresql/backups
  backup_retention_count:
    default: 7
    description: Number of backups to retain.
    source: default
    type: int
    value: 7
  backup_schedule:
    default: 13 4 * * *
    description: Cron-formatted schedule for regular database backups.
    source: default
    type: string
    value: 13 4 * * *

Alarmingly, they also seem not to have had any effect since Dec 8, which is
concerning. I have no idea why since /etc/cron.d/juju-postgresql is still
present on the system.

Revision history for this message
Steve Langasek (vorlon) wrote :

Heh. Cron mail:

Traceback (most recent call last):
  File "/var/lib/postgresql/scripts/pgkillidle.py", line 114, in <module>
    sys.exit(main())
  File "/var/lib/postgresql/scripts/pgkillidle.py", line 55, in main
    con = psycopg2.connect(options.connect_string)
  File "/usr/lib/python3/dist-packages/psycopg2/__init__.py", line 126, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: FATAL: Peer authentication failed for user "postgres"

So needs some pg_hba amending...

Revision history for this message
Tim Andersson (andersson123) wrote :

I changed the juju config:
```
juju config postgresql extra_pg_auth="local all all md5 map=qatracker"
```

I'll amend the script to grab the backups from /var/lib/postgresql/backups instead :)

Revision history for this message
Tim Andersson (andersson123) wrote :

 I don't know if what I did was correct, but I will check for backups soon

Revision history for this message
Tim Andersson (andersson123) wrote :

well, actually, I'll try create a backup now

Revision history for this message
Tim Andersson (andersson123) wrote :

-_-

```
stg-image-testing-tracker@ubuntu-qa-bastion-ps5:~$ juju run-action postgresql/1 wal-e-backup --wait
unit-postgresql-1:
  UnitId: postgresql/1
  id: "16"
  message: Backup failed
  results:
    Stdout: |
      sudo: /snap/bin/wal-e.envdir: command not found
    backup-return-code: "1"
    wal-e-backup-cmd: /snap/bin/wal-e.envdir /etc/postgresql/12/main/wal-e.env /snap/bin/wal-e
      backup-push /var/lib/postgresql/12/main
    wal-e-prune-cmd: None
  status: failed
  timing:
    completed: 2024-01-29 11:06:16 +0000 UTC
    enqueued: 2024-01-29 11:06:13 +0000 UTC
    started: 2024-01-29 11:06:14 +0000 UTC
```

Revision history for this message
Tim Andersson (andersson123) wrote :

Looks like wal-e wasn't actually installed as a snap, so I'm doing that

Revision history for this message
Tim Andersson (andersson123) wrote :

wal-e backups go straight to cloud storage, they don't seem to have an option to backup to filesystem. I'm looking more.

Revision history for this message
Tim Andersson (andersson123) wrote :

Oh okay, I broke the qatracker...

Revision history for this message
Tim Andersson (andersson123) wrote :

I fixed it again, and there's some issues in prod that I'm going to try and amend today.

Revision history for this message
Tim Andersson (andersson123) wrote :

To sum up the issue, the pg_hba.conf has two conflicting lines:
```
local all all peer
local all all peer map=qatracker # extra_pg_auth config
```

The first one isn't sufficient to allow connections to the db, and it must be removed, with just this line present for the pg_hba.conf to be correct:
```
local all all peer map=qatracker # extra_pg_auth config
```

but the charm writes that file periodically (multiple times a minute) so one must delete the bad line and restart the postgresql service before the file is re-written.

I'm going to look into the juju config options and see if I can amend this.

Revision history for this message
Tim Andersson (andersson123) wrote :

and I broke the qatracker simply by restarting the postgresql service. My guess is someone in the past removed the bad line and restarted the service before the file could be re-written. I'm also guessing this has something to do with the backups not working. I'll fix it

Revision history for this message
Tim Andersson (andersson123) wrote :

For now, what I've done, is create
/home/ubuntu/pg_hba.conf

and modified /etc/postgresql/12/main/postgresql.conf to point to the above file, without the difficult line. postgresql.conf isn't getting periodically re-written so this is a temporary fix - @brian let's discuss in standup today.

Revision history for this message
Tim Andersson (andersson123) wrote (last edit ):

ah, and this seems to have fixed the backups :)

(really hating that the backups and killing of dead connections are cronjobs and not systemd)

Now I'll amend the script to use these backups.

Revision history for this message
Tim Andersson (andersson123) wrote :

This is fully amended and tested now.

Revision history for this message
Steve Langasek (vorlon) wrote :

On Mon, Jan 29, 2024 at 01:24:06PM -0000, Tim Andersson wrote:
> For now, what I've done, is create
> /home/ubuntu/pg_hba.conf

> and modified /etc/postgresql/12/main/postgresql.conf to point to the above
> file, without the difficult line. postgresql.conf isn't getting
> periodically re-written so this is a temporary fix - @brian let's discuss
> in standup today.

Please move this file under /etc/postgresql/12/main, even if it's under a
different name. There's a privilege escalation attack by having it in
/home/ubuntu, which is that the ubuntu user can delete the file despite it
being owned by root, then replace it with contents of its choice.

Revision history for this message
Tim Andersson (andersson123) wrote :

I moved it to:
/etc/postgresql/12/main/pg_hba_qatracker.conf

and I also pointed
/etc/postgresql/12/main/postgresql.conf

to the correct filepath

So this is ready for review!

Revision history for this message
Steve Langasek (vorlon) :
review: Approve
Revision history for this message
Tim Andersson (andersson123) wrote :

Thanks - I don't have permissions to merge though

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/scripts/db-backup b/scripts/db-backup
2new file mode 100755
3index 0000000..6892d0f
4--- /dev/null
5+++ b/scripts/db-backup
6@@ -0,0 +1,122 @@
7+#!/usr/bin/python3
8+"""
9+Script for backing up the sql database
10+swift-creds.json should look like this:
11+{
12+ "authurl": "",
13+ "user": "",
14+ "key": "",
15+ "os_options": {
16+ "region_name": "",
17+ "project_domain_name": "",
18+ "project_name": "",
19+ "user_domain_name": ""
20+ },
21+ "auth_version": ""
22+}
23+"""
24+import datetime
25+import json
26+import os
27+import subprocess
28+
29+import swiftclient
30+
31+SWIFT_CREDS_FP = "/home/ubuntu/swift-creds.json"
32+PG_DUMP_DIR = "/tmp"
33+PG_DUMP_FILE = "psql_dump.sql"
34+PSQL_USER = "qatracker"
35+PSQL_DB = "qatracker"
36+CONTAINER_NAME = "db-backup"
37+MAX_DAYS = 7
38+PG_DUMP_DIR = "/var/lib/postgresql/backups"
39+
40+
41+def init_swift_con():
42+ with open(SWIFT_CREDS_FP, "r") as f:
43+ swift_json = json.load(f)
44+ swift_con = swiftclient.Connection(**swift_json)
45+ return swift_con
46+
47+
48+def zip_dumpfiles(dumpfiles):
49+ ret_df = []
50+ for file in dumpfiles:
51+ fp = "%s/%s" % (PG_DUMP_DIR, file)
52+ gzip_args = "gzip -f %s" % fp
53+ _ = subprocess.run(gzip_args.split(" "), check=True)
54+ ret_df.append(file + ".gz")
55+ return ret_df
56+
57+
58+def check_for_backup():
59+ now = datetime.datetime.now().strftime("%Y%m%d")
60+ todays_backups = []
61+ for file in os.listdir(PG_DUMP_DIR):
62+ if now in file:
63+ print("db backup found: %s/%s" % (PG_DUMP_DIR, file))
64+ todays_backups.append(file)
65+ if len(todays_backups) != 0:
66+ return todays_backups
67+ return None
68+
69+
70+def upload_to_swift(dumpfiles):
71+ swift_con = init_swift_con()
72+ print("uploading objects to swift")
73+ try:
74+ swift_con.get_container(CONTAINER_NAME)
75+ except swiftclient.exceptions.ClientException:
76+ swift_con.put_container(
77+ CONTAINER_NAME,
78+ )
79+ for dumpfile in dumpfiles:
80+ now = datetime.datetime.now().strftime("%Y/%m/%d/%H_%M_%S")
81+ object_path = "%s/%s" % (now, dumpfile)
82+ pg_file = None
83+ with open("%s/%s" % (PG_DUMP_DIR, dumpfile), "rb") as f:
84+ pg_file = f.read()
85+ print("Uploading %s to swift..." % dumpfile)
86+ for _ in range(5):
87+ try:
88+ swift_con.put_object(
89+ CONTAINER_NAME,
90+ object_path,
91+ pg_file,
92+ content_type="text/plain; charset=UTF-8",
93+ headers={"Content-Encoding": "gzip"},
94+ )
95+ except swiftclient.exceptions.ClientException as e:
96+ print("exception: %s" % str(e))
97+ swift_con = init_swift_con()
98+
99+
100+def cleanup(zipped_files):
101+ for zipped_file in zipped_files:
102+ print("Removing %s" % zipped_file)
103+ if os.path.isfile("%s/%s" % (PG_DUMP_DIR, zipped_file)):
104+ os.remove("%s/%s" % (PG_DUMP_DIR, zipped_file))
105+
106+
107+def swift_cleanup():
108+ swift_con = init_swift_con()
109+ print("Removing old db backups...")
110+ _, objects = swift_con.get_container(CONTAINER_NAME)
111+ now = datetime.datetime.now()
112+
113+ for obj in objects:
114+ last_modified = obj["last_modified"].split(".")[0]
115+ timestamp = datetime.datetime.strptime(last_modified, "%Y-%m-%dT%H:%M:%S")
116+ diff = now - timestamp
117+ if diff > datetime.timedelta(days=MAX_DAYS):
118+ print("Deleting %s" % obj["name"])
119+ swift_con.delete_object(CONTAINER_NAME, obj["name"])
120+
121+
122+if __name__ == "__main__":
123+ backups = check_for_backup()
124+ if backups is not None:
125+ zipped_dumps = zip_dumpfiles(backups)
126+ upload_to_swift(zipped_dumps)
127+ cleanup(zipped_dumps)
128+ swift_cleanup()
129diff --git a/scripts/qatracker-setup.sh b/scripts/qatracker-setup.sh
130index 85527bb..7d703d8 100644
131--- a/scripts/qatracker-setup.sh
132+++ b/scripts/qatracker-setup.sh
133@@ -113,6 +113,34 @@ cat <<- _EOF_
134 * Uncheck the site name
135 _EOF_
136
137+
138+# Make service file for db-backup
139+sudo cat > /etc/systemd/system/db-backup.service << EOM
140+[Unit]
141+Description=Backup sql database
142+
143+[Service]
144+Type=oneshot
145+ExecStart=/home/ubuntu/ubuntu-qa-website/scripts/db-backup
146+EOM
147+
148+# Make timer file for db-backup
149+sudo cat > /tmp/"${service_name}".timer << EOM
150+[Unit]
151+Description=Backup sql database
152+
153+[Timer]
154+OnCalendar=*-*-* 04:30:00
155+
156+[Install]
157+WantedBy=multi-user.target
158+EOM
159+# daemon-reload
160+sudo systemctl daemon-reload
161+
162+# start --no-block
163+sudo systemctl start db-backup.service --no-block
164+
165 # Opening in browser
166 if which xdg-open > /dev/null
167 then

Subscribers

People subscribed via source and target branches