Merge lp:~stub/launchpad/update-storm into lp:launchpad

Proposed by Stuart Bishop
Status: Merged
Merge reported by: Stuart Bishop
Merged at revision: not available
Proposed branch: lp:~stub/launchpad/update-storm
Merge into: lp:launchpad
Diff against target: None lines
To merge this branch: bzr merge lp:~stub/launchpad/update-storm
Reviewer Review Type Date Requested Status
Jonathan Lange (community) Approve
Christian Reis release-critical Pending
Review via email: mp+9130@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Stuart Bishop (stub) wrote :

Bug #393625 was raised because one of our scripts was using an unacceptable amount of RAM. Testing showed reducing the Storm cache size fixed the problem. Most likely, this script is dealing with objects containing long text columns.

This branch changes the default Storm cache size to be much more conservative. It also adds a load of boilerplate so we can easily override the conservative default for individual components where we detect slowness due to cache thrashing and can afford to use more RAM.

The appserver cache size settings remain unchanged from the value we have been running on edge for the last few weeks.

I would like to get this branch landed release critical so we don't have to cherry pick or disable update-pkgcache.

Revision history for this message
Jonathan Lange (jml) wrote :

Hey Stuart,

Thanks for debugging this problem. The changes look good, but I've got a couple of questions:

  - is there a way to avoid repeating the settings so many times over? is it desirable to do so?

  - why is one of the items changed to a cache size of 10000?

  - can you please update the storm_cache_size comment in the schema to say what units are being used?

Thanks,
jml

review: Needs Fixing
Revision history for this message
Stuart Bishop (stub) wrote :

On Wed, Jul 22, 2009 at 2:11 PM, Jonathan Lange<email address hidden> wrote:
> Review: Needs Fixing
> Hey Stuart,
>
> Thanks for debugging this problem. The changes look good, but I've got a couple of questions:
>
>  - is there a way to avoid repeating the settings so many times over? is it desirable to do so?
>

It doesn't look like it can be avoided. It is not desirable. I'm not sure if the new config machinery was implemented this way by design or by necessity. Perhaps I'm doing something wrong.

>  - why is one of the items changed to a cache size of 10000?

That is the appserver. Its not being changed - it is just being moved up a bit to be consistent.

>  - can you please update the storm_cache_size comment in the schema to say what units are being used?

Done.

I think I need to discuss on the Storm list how we should implement a cache that is bounded by RAM usage rather than object count.

--
Stuart Bishop <email address hidden>
http://www.stuartbishop.net/

Revision history for this message
Jonathan Lange (jml) wrote :

Thanks for that.

I agree, raising that on the storm list is definitely a good plan. Raising the config issue on the launchpad-dev list might be a good idea as well.

Oh, I forgot -- do you need to change the production config branch also?

jml

review: Approve
Revision history for this message
Stuart Bishop (stub) wrote :

> >  - is there a way to avoid repeating the settings so many times over? is it
> desirable to do so?
> >
>
> It doesn't look like it can be avoided. It is not desirable. I'm not sure if
> the new config machinery was implemented this way by design or by necessity.
> Perhaps I'm doing something wrong.

I should explain this more. Without this boilerplate, the values are inherited from the [database] section but they cannot be overridden in a launchpad-lazr.conf file. I believe this is because the config parser knows nothing about how these sections inherit from the [database] section causing unknown key errors to be raised. The [database] fallback is done in lib/canonical/config and is launchpad specific.

Revision history for this message
Stuart Bishop (stub) wrote :

On Wed, Jul 22, 2009 at 2:24 PM, Jonathan Lange<email address hidden> wrote:

> Oh, I forgot -- do you need to change the production config branch also?

No - I'm changing the defaults. Production configs don't override them yet.

--
Stuart Bishop <email address hidden>
http://www.stuartbishop.net/

Revision history for this message
Christian Reis (kiko) wrote :

On Wed, Jul 22, 2009 at 06:55:23AM -0000, Stuart Bishop wrote:
> -# datatype: string
> +# datatype: string.

Don't add the period here. If this isn't going to risk keeping us all up
at 2am tonight, then rc=kiko ;-)
--
Christian Robottom Reis | [+55 16] 3376 0125 | http://launchpad.net/~kiko
                        | [+55 16] 9112 6430 | http://async.com.br/~kiko

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/canonical/config/schema-lazr.conf'
2--- lib/canonical/config/schema-lazr.conf 2009-07-16 21:12:16 +0000
3+++ lib/canonical/config/schema-lazr.conf 2009-07-22 06:43:05 +0000
4@@ -2,6 +2,8 @@
5 # The database user which will be used to expire questions.
6 # datatype: string
7 dbuser: answertracker
8+storm_cache: generational
9+storm_cache_size: 500
10
11 # The number of days of inactivity required before a question in
12 # the open or needs information state is expired.
13@@ -31,6 +33,8 @@
14 # The database user which will be used by this process.
15 # datatype: string
16 dbuser: branchscanner
17+storm_cache: generational
18+storm_cache_size: 500
19
20 # See [error_reports].
21 error_dir: none
22@@ -46,6 +50,8 @@
23 # The database user which will be used by this process.
24 # datatype: string
25 dbuser: fiera
26+storm_cache: generational
27+storm_cache_size: 500
28
29 # datatype: string
30 default_sender_address: noreply@launchpad.net
31@@ -178,6 +184,8 @@
32 # The database user to run this process as.
33 # datatype: string
34 dbuser: checkwatches
35+storm_cache: generational
36+storm_cache_size: 500
37
38 # See [error_reports].
39 error_dir: none
40@@ -477,6 +485,8 @@
41 # The database user which will be used by this process.
42 # datatype: string
43 dbuser: create-merge-proposals
44+storm_cache: generational
45+storm_cache_size: 500
46
47 # See [error_reports].
48 error_dir: none
49@@ -540,10 +550,16 @@
50 soft_request_timeout: None
51
52 # The Storm cache type to use. May be 'default', 'generational' or 'stupid'
53+# datatype: string
54 storm_cache: generational
55
56-# The size of the Storm cache.
57-storm_cache_size: 5000
58+# The size of the Storm cache. We start small, because this is the
59+# default used by everything and we have no idea if we are dealing
60+# with tiny objects or huge objects. Individual scripts can easily
61+# increase the cache size if database performance is an issue and RAM
62+# usage is not. See Bug #393625 for the issue that prompted this change.
63+# datatype: integer
64+storm_cache_size: 500
65
66
67 [debug]
68@@ -609,13 +625,16 @@
69 # The database user which will be used by this process.
70 # datatype: string
71 dbuser: teammembership
72+storm_cache: generational
73+storm_cache_size: 500
74
75
76 [gina]
77 # The database user which will be used by this process.
78 # datatype: string
79 dbuser: gina
80-
81+storm_cache: generational
82+storm_cache_size: 500
83
84 [gina_target.template]
85 # The distribution name (e.g. ubuntu) from where the packages
86@@ -803,6 +822,8 @@
87 # The database user which will be used by this process.
88 # datatype: string
89 dbuser: karma
90+storm_cache: generational
91+storm_cache_size: 500
92
93 # When calculating karma, if a categories scaling factor is
94 # larger than this it is reduced down to this maximum. This is
95@@ -821,6 +842,8 @@
96 # datatype: string
97 dbuser: launchpad_main
98 auth_dbuser: launchpad_auth
99+storm_cache: generational
100+storm_cache_size: 10000
101
102 # If true, Launchpad is running in read-only mode. Attempts to
103 # write to the Launchpad database will be denied, and an explanatory
104@@ -1019,10 +1042,6 @@
105 # ba-ws.geonames.net.
106 geonames_identity:
107
108-storm_cache: generational
109-storm_cache_size: 10000
110-
111-
112 [launchpad_session]
113 # The hostname where the session database is located.
114 # If the value is empty or None, localhost via UNIX sockets
115@@ -1051,6 +1070,8 @@
116 # The database user which will be used by this process.
117 # datatype: string
118 dbuser: librarian
119+storm_cache: generational
120+storm_cache_size: 500
121
122 isolation_level: read_committed
123
124@@ -1104,6 +1125,8 @@
125 # The database user which will be used by this process.
126 # datatype: string
127 dbuser: librariangc
128+storm_cache: generational
129+storm_cache_size: 500
130
131
132 [librarian_server]
133@@ -1304,6 +1327,8 @@
134 # The database user which will be used by this process.
135 # datatype: string
136 dbuser: mp-creation-job
137+storm_cache: generational
138+storm_cache_size: 500
139
140 # See [error_reports].
141 error_dir: none
142@@ -1323,7 +1348,6 @@
143 # datatype: integer
144 retained_days: 366
145
146-
147 [personalpackagearchive]
148 # Directory to be created to store PPAs.
149 # datatype: string
150@@ -1359,18 +1383,22 @@
151 # The database user which will be used by this process.
152 # datatype: string
153 dbuser: poimport
154-
155+storm_cache: generational
156+storm_cache_size: 500
157
158 [processmail]
159 # The database user which will be used by this process.
160 # datatype: string
161 dbuser: processmail
162-
163+storm_cache: generational
164+storm_cache_size: 500
165
166 [productreleasefinder]
167 # The database user which will be used by this process.
168 # datatype: string
169 dbuser: productreleasefinder
170+storm_cache: generational
171+storm_cache_size: 500
172
173 [profiling]
174 # When set to True, each requests will be profiled and the resulting data
175@@ -1392,6 +1420,8 @@
176 # The database user which will be used by this process.
177 # datatype: string
178 dbuser: reclaim-branch-space
179+storm_cache: generational
180+storm_cache_size: 500
181
182 # See [error_reports].
183 error_dir: none
184@@ -1434,9 +1464,14 @@
185 translate_pages_max_batch_size: 50
186
187 [rosettabranches]
188+# XXX stub 20090722 bug=402891: Scripts need to use unique
189+# database users but the rosetta branch scanner is incorrectly reusing an
190+# existing database user.
191 # The database user which will be used by the rosetta-branches cronscript.
192-# datatype: string
193+# datatype: string.
194 dbuser: branchscanner
195+storm_cache: generational
196+storm_cache_size: 500
197
198 # See [error_reports].
199 error_dir: none
200@@ -1452,6 +1487,8 @@
201 # The database user which will be used by this process.
202 # datatype: string
203 dbuser: send-branch-mail
204+storm_cache: generational
205+storm_cache_size: 500
206
207 # See [error_reports].
208 error_dir: none
209@@ -1467,6 +1504,8 @@
210 # The database user which will be used by this process.
211 # datatype: string
212 dbuser: shipit
213+storm_cache: generational
214+storm_cache_size: 500
215
216 # datatype: string
217 admins_email_address: info@shipit.ubuntu.com
218@@ -1501,7 +1540,8 @@
219 # The database user which will be used by this process.
220 # datatype: string
221 dbuser: statistician
222-
223+storm_cache: generational
224+storm_cache_size: 500
225
226 [supermirror]
227 # The longest period of time, in seconds, that the scheduler will
228@@ -1550,22 +1590,30 @@
229 # The database user which will be used by this process.
230 # datatype: string
231 dbuser: targetnamecacheupdater
232+storm_cache: generational
233+storm_cache_size: 500
234
235
236 [updateremoteproduct]
237 # The database user to run this process as.
238 # datatype: string
239 dbuser: updateremoteproduct
240+storm_cache: generational
241+storm_cache_size: 500
242
243 [updatesourceforgeremoteproduct]
244 # The database user to run this process as.
245 # datatype: string
246 dbuser: updatesourceforgeremoteproduct
247+storm_cache: generational
248+storm_cache_size: 500
249
250 [uploader]
251 # The database user which will be used by this process.
252 # datatype: string
253 dbuser: uploader
254+storm_cache: generational
255+storm_cache_size: 500
256
257 # datatype: string
258 default_recipient_name: none
259@@ -1584,13 +1632,19 @@
260 # The database user which will be used by this process.
261 # datatype: string
262 dbuser: queued
263+storm_cache: generational
264+storm_cache_size: 500
265
266
267 [binaryfile_expire]
268 dbuser: binaryfile-expire
269+storm_cache: generational
270+storm_cache_size: 500
271
272 [generateppahtaccess]
273 dbuser: generateppahtaccess
274+storm_cache: generational
275+storm_cache_size: 500
276
277 [vhosts]
278 # When true, use https URLs unless explicitly overridden.