Merge lp:~stub/launchpad/page-performance-report into lp:launchpad

Proposed by Stuart Bishop
Status: Merged
Approved by: Muharem Hrnjadovic
Approved revision: not available
Merged at revision: not available
Proposed branch: lp:~stub/launchpad/page-performance-report
Merge into: lp:launchpad
Diff against target: 512 lines (+493/-0)
4 files modified
lib/lp/scripts/helpers.py (+57/-0)
lib/lp/scripts/utilities/pageperformancereport.py (+387/-0)
utilities/page-performance-report.ini (+31/-0)
utilities/page-performance-report.py (+18/-0)
To merge this branch: bzr merge lp:~stub/launchpad/page-performance-report
Reviewer Review Type Date Requested Status
Muharem Hrnjadovic (community) Approve
Review via email: mp+18522@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Stuart Bishop (stub) wrote :

We want to make slow pages faster. At the moment we do this by optimizing test pages, but we really have no idea if we have made other pages slower. When we optimize, we need to ensure the system improves overall rather than playing whack a mole, speeding up some pages while causing others to timeout.

This hasn't been a genuine problem so far, but we want to evaluate using memcache. memcache is unusual in that it will slow down the initial page renderings due to overhead in stuffing data into the caches, and speed up repeat renderings. This branch is to provide us with tools to confirm if our experiments succeeded or failed.

The reporting tool is basic, but meets our initial needs in the current state. An example report from edge data can be found at http://people.canonical.com/~stub/lpperf.html .

To get meaningful graphs I had to use a pretty severe logarithmic scale on the y-axis and the numerical stats (mean, median, standard deviation) are pretty useless with real Launchpad data, but it is a start. The issue here is that slow requests are actually vastly outnumbered by fast requests in most of Launchpad, which is surprising to developers who spend their time worrying about the minor percentage of pages that generate timeouts and create the bulk of complaints.

Revision history for this message
Muharem Hrnjadovic (al-maisan) wrote :

The example report quoted in the merge proposal cover letter looks very good!

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== added file 'lib/lp/scripts/helpers.py'
2--- lib/lp/scripts/helpers.py 1970-01-01 00:00:00 +0000
3+++ lib/lp/scripts/helpers.py 2010-02-03 14:58:22 +0000
4@@ -0,0 +1,57 @@
5+# Copyright 2010 Canonical Ltd. This software is licensed under the
6+# GNU Affero General Public License version 3 (see the file LICENSE).
7+
8+"""Helpers for command line tools."""
9+
10+__metaclass__ = type
11+__all__ = ["LPOptionParser"]
12+
13+from copy import copy
14+from datetime import datetime
15+from optparse import Option, OptionParser, OptionValueError
16+
17+from canonical.launchpad.scripts.logger import logger_options
18+
19+
20+def _check_datetime(option, opt, value):
21+ "Type checker for optparse datetime option type."
22+ # We support 5 valid ISO8601 formats.
23+ formats = [
24+ '%Y-%m-%dT%H:%M:%S',
25+ '%Y-%m-%dT%H:%M',
26+ '%Y-%m-%d %H:%M:%S',
27+ '%Y-%m-%d %H:%M',
28+ '%Y-%m-%d',
29+ ]
30+ for format in formats:
31+ try:
32+ return datetime.strptime(value, format)
33+ except ValueError:
34+ pass
35+ raise OptionValueError(
36+ "option %s: invalid datetime value: %r" % (opt, value))
37+
38+
39+class LPOption(Option):
40+ """Extended optparse Option class.
41+
42+ Adds a 'datetime' option type.
43+ """
44+ TYPES = Option.TYPES + ("datetime",)
45+ TYPE_CHECKER = copy(Option.TYPE_CHECKER)
46+ TYPE_CHECKER["datetime"] = _check_datetime
47+
48+
49+class LPOptionParser(OptionParser):
50+ """Extended optparse OptionParser.
51+
52+ Adds a 'datetime' option type.
53+
54+ Automatically adds our standard --verbose, --quiet options that
55+ tie into our logging system.
56+ """
57+ def __init__(self, *args, **kw):
58+ kw.setdefault('option_class', LPOption)
59+ OptionParser.__init__(self, *args, **kw)
60+ logger_options(self)
61+
62
63=== added file 'lib/lp/scripts/utilities/pageperformancereport.py'
64--- lib/lp/scripts/utilities/pageperformancereport.py 1970-01-01 00:00:00 +0000
65+++ lib/lp/scripts/utilities/pageperformancereport.py 2010-02-03 14:58:22 +0000
66@@ -0,0 +1,387 @@
67+# Copyright 2010 Canonical Ltd. This software is licensed under the
68+# GNU Affero General Public License version 3 (see the file LICENSE).
69+
70+"""Page performance report generated from zserver trace logs."""
71+
72+__metaclass__ = type
73+__all__ = ['main']
74+
75+import bz2
76+from cgi import escape as html_quote
77+from ConfigParser import RawConfigParser
78+from datetime import datetime
79+import gzip
80+import re
81+import sre_constants
82+import os.path
83+from textwrap import dedent
84+import time
85+
86+import numpy
87+import simplejson as json
88+from zc.zservertracelog.tracereport import Request, parsedt
89+
90+from canonical.config import config
91+from canonical.launchpad.scripts.logger import log
92+from lp.scripts.helpers import LPOptionParser
93+
94+
95+class Category:
96+ """A Category in our report.
97+
98+ Requests belong to a Category if the URL matches a regular expression.
99+ """
100+ def __init__(self, title, regexp, timeout):
101+ self.title = title
102+ self.regexp = regexp
103+ self._compiled_regexp = re.compile(regexp, re.I | re.X)
104+ self.times = Times(timeout)
105+
106+ def add(self, request):
107+ """Add a request to a Category if it belongs.
108+
109+ Does nothing if the request does not belong in this Category.
110+ """
111+ if self._compiled_regexp.search(request.url) is not None:
112+ self.times.add(request)
113+
114+ def __cmp__(self, other):
115+ return cmp(self.title.lower(), other.title.lower())
116+
117+
118+class Times:
119+ """Collection of request times."""
120+ def __init__(self, timeout):
121+ self.request_times = []
122+ self.timeout = timeout
123+
124+ def add(self, request):
125+ """Add the application time from the request to the collection.
126+
127+ The application time is capped to our timeout.
128+ """
129+ self.request_times.append(min(request.app_seconds, self.timeout))
130+
131+ def stats(self):
132+ """Generate statistics about our request times.
133+
134+ Returns (mean, median, standard_deviation, histogram).
135+
136+ The histogram is a list of request counts per 1 second bucket.
137+ ie. histogram[0] contains the number of requests taking between 0 and
138+ 1 second, histogram[1] contains the number of requests taking between
139+ 1 and 2 seconds etc. histogram is None if there are no requests in
140+ this Category.
141+ """
142+
143+ if not self.request_times:
144+ return 0, 0, 0, None
145+ array = numpy.asarray(self.request_times, numpy.float32)
146+ mean = numpy.mean(array)
147+ median = numpy.median(array)
148+ standard_deviation = numpy.std(array)
149+ histogram = numpy.histogram(
150+ array, normed=True,
151+ range=(0, self.timeout), bins=self.timeout)
152+ histogram = zip(histogram[1], histogram[0])
153+ return mean, median, standard_deviation, histogram
154+
155+ def __str__(self):
156+ results = self.stats()
157+ mean, median, standard_deviation, histogram = results
158+ hstr = " ".join("%2d" % v for v in histogram)
159+ return "%2.2f %2.2f %2.2f %s" % (
160+ mean, median, standard_deviation, hstr)
161+
162+
163+def main():
164+ parser = LPOptionParser("%prog [args] tracelog [...]")
165+ parser.add_option(
166+ "-c", "--config", dest="config",
167+ default=os.path.join(
168+ config.root, "utilities", "page-performance-report.ini"),
169+ metavar="FILE", help="Load configuration from FILE")
170+ parser.add_option(
171+ "--timeout", dest="timeout", type="int",
172+ default=20, metavar="SECS",
173+ help="Requests taking more than SECS seconds are timeouts")
174+ parser.add_option(
175+ "--from", dest="from_ts", type="datetime",
176+ default=None, metavar="TIMESTAMP",
177+ help="Ignore log entries before TIMESTAMP")
178+ parser.add_option(
179+ "--until", dest="until_ts", type="datetime",
180+ default=None, metavar="TIMESTAMP",
181+ help="Ignore log entries after TIMESTAMP")
182+ options, args = parser.parse_args()
183+ if len(args) == 0:
184+ parser.error("At least one zserver tracelog file must be provided")
185+
186+ if options.from_ts is not None and options.until_ts is not None:
187+ if options.from_ts > options.until_ts:
188+ parser.error(
189+ "--from timestamp %s is before --until timestamp %s"
190+ % (options.from_ts, options.until_ts))
191+
192+ for filename in args:
193+ if not os.path.exists(filename):
194+ parser.error("Tracelog file %s not found." % filename)
195+
196+ if not os.path.exists(options.config):
197+ parser.error("Config file %s not found." % options.config)
198+
199+ # Need a better config mechanism as ConfigParser doesn't preserve order.
200+ script_config = RawConfigParser()
201+ script_config.optionxform = str # Make keys case sensitive.
202+ script_config.readfp(open(options.config))
203+
204+ categories = [] # A list of Category, in report order.
205+ for option in script_config.options('categories'):
206+ regexp = script_config.get('categories', option)
207+ try:
208+ categories.append(Category(option, regexp, options.timeout))
209+ except sre_constants.error, x:
210+ log.fatal("Unable to compile regexp %r (%s)" % (regexp, x))
211+ return 1
212+ categories.sort()
213+
214+ if len(categories) == 0:
215+ parser.error("No data in [categories] section of configuration.")
216+
217+ parse(args, categories, options)
218+
219+ print_html_report(categories)
220+
221+ return 0
222+
223+
224+def smart_open(filename, mode='r'):
225+ """Open a file, transparently handling compressed files.
226+
227+ Compressed files are detected by file extension.
228+ """
229+ ext = os.path.splitext(filename)[1]
230+ if ext == '.bz2':
231+ return bz2.BZ2File(filename, mode)
232+ elif ext == '.gz':
233+ return gzip.open(filename, mode)
234+ else:
235+ return open(filename, mode)
236+
237+
238+class MalformedLine(Exception):
239+ """A malformed line was found in the trace log."""
240+
241+
242+_ts_re = re.compile(
243+ '^(\d{4})-(\d\d)-(\d\d)\s(\d\d):(\d\d):(\d\d)(?:.(\d{6}))?$')
244+
245+
246+def parse_timestamp(ts_string):
247+ match = _ts_re.search(ts_string)
248+ if match is None:
249+ raise ValueError("Invalid timestamp")
250+ return datetime(
251+ *(int(elem) for elem in match.groups() if elem is not None))
252+
253+
254+def parse(tracefiles, categories, options):
255+ requests = {}
256+ total_requests = 0
257+ for tracefile in tracefiles:
258+ log.info('Processing %s', tracefile)
259+ for line in smart_open(tracefile):
260+ line = line.rstrip()
261+ try:
262+ record = line.split(' ',7)
263+ try:
264+ record_type, request_id, date, time_ = record[:4]
265+ except ValueError:
266+ raise MalformedLine()
267+
268+ if record_type == 'S':
269+ # Short circuit - we don't care about these entries.
270+ continue
271+
272+ # Parse the timestamp.
273+ ts_string = '%s %s' % (date, time_)
274+ try:
275+ dt = parse_timestamp(ts_string)
276+ except ValueError:
277+ raise MalformedLine(
278+ 'Invalid timestamp %s' % repr(ts_string))
279+
280+ # Filter entries by command line date range.
281+ if options.from_ts is not None and dt < options.from_ts:
282+ continue # Skip to next line.
283+ if options.until_ts is not None and dt > options.until_ts:
284+ break # Skip to next log file.
285+
286+ args = record[4:]
287+
288+ def require_args(count):
289+ if len(args) < count:
290+ raise MalformedLine()
291+
292+ if record_type == 'B': # Request begins.
293+ require_args(2)
294+ requests[request_id] = Request(dt, args[0], args[1])
295+ continue
296+
297+ request = requests.get(request_id, None)
298+ if request is None: # Just ignore partial records.
299+ continue
300+
301+ if record_type == '-': # Extension record from Launchpad.
302+ # Launchpad outputs the full URL to the tracelog,
303+ # including protocol & hostname. Use this in favor of
304+ # the ZServer logged path.
305+ require_args(1)
306+ request.url = args[0]
307+
308+ elif record_type == 'I': # Got request input.
309+ require_args(1)
310+ request.I(dt, args[0])
311+
312+ elif record_type == 'C': # Entered application thread.
313+ request.C(dt)
314+
315+ elif record_type == 'A': # Application done.
316+ require_args(2)
317+ request.A(dt, args[0], args[1])
318+
319+ elif record_type == 'E': # Request done.
320+ del requests[request_id]
321+ request.E(dt)
322+ total_requests += 1
323+ if total_requests % 10000 == 0:
324+ log.debug("Parsed %d requests", total_requests)
325+ for category in categories:
326+ category.add(request)
327+
328+ else:
329+ raise MalformedLine('Unknown record type %s', record_type)
330+ except MalformedLine, x:
331+ log.error(
332+ "Malformed line %s %s (%s)" % (repr(line), repr(args), x))
333+
334+
335+def print_html_report(categories):
336+
337+ print dedent('''\
338+ <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
339+ "http://www.w3.org/TR/html4/loose.dtd">
340+ <html>
341+ <head>
342+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
343+ <title>Launchpad Page Performance Report %(date)s</title>
344+ <script language="javascript" type="text/javascript"
345+ src="http://people.canonical.com/~stub/flot/jquery.min.js"
346+ ></script>
347+ <script language="javascript" type="text/javascript"
348+ src="http://people.canonical.com/~stub/flot/jquery.flot.min.js"
349+ ></script>
350+ <style type="text/css">
351+ h2 { font-weight: normal; font-size: 100%%; }
352+ thead th { padding-left: 1em; padding-right: 1em; }
353+ .category-title { text-align: right; padding-right: 2em; }
354+ .regexp { font-size: x-small; font-weight: normal; }
355+ .mean { text-align: right; padding-right: 1em; }
356+ .median { text-align: right; padding-right: 1em; }
357+ .standard-deviation { text-align: right; padding-right: 1em; }
358+ .histogram { padding: 0.5em 1em; width:400px; height:250px; }
359+ .odd-row { background-color: #eeeeff; }
360+ .even-row { background-color: #ffffee; }
361+ </style>
362+ </head>
363+ <body>
364+ <h1>Launchpad Page Performance Report</h1>
365+ <h2>%(date)s</h2>
366+
367+ <table class="launchpad-performance-report">
368+ <thead>
369+ <tr>
370+ <td></td>
371+ <th>Mean</th>
372+ <th>Median</th>
373+ <th>Standard<br/>Deviation</th>
374+ <th>Distribution</th>
375+ </tr>
376+ </thead>
377+ <tbody>
378+ ''' % {'date': time.ctime()})
379+
380+ histograms = []
381+ for i, category in enumerate(categories):
382+ row_class = "even-row" if i % 2 else "odd-row"
383+ mean, median, standard_deviation, histogram = category.times.stats()
384+ histograms.append(histogram)
385+ print dedent("""\
386+ <tr class="%s">
387+ <th class="category-title">%s <div class="regexp">%s</span></th>
388+ <td class="mean">%.2f s</td>
389+ <td class="median">%.2f s</td>
390+ <td class="standard-deviation">%.2f s</td>
391+ <td>
392+ <div class="histogram" id="histogram%d"></div>
393+ </td>
394+ </tr>
395+ """ % (
396+ row_class,
397+ html_quote(category.title), html_quote(category.regexp),
398+ mean, median, standard_deviation, i))
399+
400+ print dedent("""\
401+ <script language="javascript" type="text/javascript">
402+ $(function () {
403+ var options = {
404+ series: {
405+ bars: {show: true}
406+ },
407+ xaxis: {
408+ tickDecimals: 0,
409+ tickFormatter: function (val, axis) {
410+ return val.toFixed(axis.tickDecimals) + "s";
411+ }
412+ },
413+ yaxis: {
414+ min: 0,
415+ max: 1,
416+ transform: function (v) {
417+ return Math.pow(Math.log(v*100+1)/Math.LN2, 0.5);
418+ },
419+ inverseTransform: function (v) {
420+ return Math.pow(Math.exp(v*100+1)/Math.LN2, 2);
421+ },
422+ tickDecimals: 1,
423+ tickFormatter: function (val, axis) {
424+ return (val * 100).toFixed(axis.tickDecimals) + "%";
425+ },
426+ ticks: [0.001,0.01,0.10,0.50,1.0]
427+ },
428+ grid: {
429+ aboveData: true,
430+ labelMargin: 15
431+ }
432+ };
433+ """)
434+
435+ for i, histogram in enumerate(histograms):
436+ if histogram is None:
437+ continue
438+ print dedent("""\
439+ var d = %s;
440+
441+ $.plot(
442+ $("#histogram%d"),
443+ [{data: d}], options);
444+
445+ """ % (json.dumps(histogram), i))
446+
447+ print dedent("""\
448+ });
449+ </script>
450+ </body>
451+ </html>
452+ """)
453+
454
455=== added file 'utilities/page-performance-report.ini'
456--- utilities/page-performance-report.ini 1970-01-01 00:00:00 +0000
457+++ utilities/page-performance-report.ini 2010-02-03 14:58:22 +0000
458@@ -0,0 +1,31 @@
459+[categories]
460+All Launchpad=.
461+
462+Registry=^https?://(edge\.)?launchpad\.
463+Registry - Person Index=/(~|%7E)[^/]+$
464+Registry - Pillar Index=^https?://(edge\.)?launchpad.[^/]+/\w[^/]*$
465+
466+Answers=^https?://answers\.(edge\.)?launchpad\.
467+
468+Blueprints=^https?://blueprints\.(edge\.)?launchpad\.
469+
470+Bugs=^https?://bugs\.(edge\.)?launchpad\.
471+Bugs - Bug Page=\+bug/\d+$
472+Bugs - Pillar Index=^https?://bugs\.(edge\.)?launchpad\.[^/]+/\w[^/]*$
473+Bugs - Search=/\+bugs\?
474+
475+Code=^https?://code\.(edge\.)?launchpad\.
476+Code - PPA Index=/\+archive/[^/]+$
477+Code - Pillar Branches=^https?://code\.(edge\.)?launchpad\.[^/]+/\w[^/]*$
478+Code - Branch Page=^https?://code\.(edge\.)?launchpad\.[^/]+/(~|%7E)[^/]+/[^/]+/[^/]+$
479+Code - Merge Proposal=/\+merge/\d+$
480+
481+Translations=^https?://translations\.(edge\.)?launchpad\.
482+Translations - Overview=^https?://translations\..*/\+lang/\w+$
483+
484+API=^https?://api\.(edge\.)?launchpad\.
485+Public XML-RPC=^https?://xmlrpc\.(edge\.)launchpad\.
486+Private XML-RPC=^https?://xmlrpc-private\.(edge\.)launchpad\.
487+Shipit=^https?://shipit\.
488+Launchpad Homepage=^https?://(edge\.)?launchpad\.
489+
490
491=== added file 'utilities/page-performance-report.py'
492--- utilities/page-performance-report.py 1970-01-01 00:00:00 +0000
493+++ utilities/page-performance-report.py 2010-02-03 14:58:22 +0000
494@@ -0,0 +1,18 @@
495+#!/usr/bin/python2.5
496+#
497+# Copyright 2010 Canonical Ltd. This software is licensed under the
498+# GNU Affero General Public License version 3 (see the file LICENSE).
499+
500+"""Page performance report generated from zserver tracelogs."""
501+
502+__metaclass__ = type
503+
504+import _pythonpath
505+
506+import sys
507+
508+from lp.scripts.utilities.pageperformancereport import main
509+
510+
511+if __name__ == '__main__':
512+ sys.exit(main())