Merge lp:~flacoste/launchpad/ppr-constant-memory into lp:launchpad

Proposed by Francis J. Lacoste
Status: Merged
Approved by: Robert Collins
Approved revision: no longer in the source branch.
Merged at revision: 11795
Proposed branch: lp:~flacoste/launchpad/ppr-constant-memory
Merge into: lp:launchpad
Diff against target: 628 lines (+246/-169)
1 file modified
lib/lp/scripts/utilities/pageperformancereport.py (+246/-169)
To merge this branch: bzr merge lp:~flacoste/launchpad/ppr-constant-memory
Reviewer Review Type Date Requested Status
Robert Collins (community) Approve
Review via email: mp+39324@code.launchpad.net

Commit message

Refactor page-performance-report to use less memory by using a SQLite3 db to hold the requests and generating statistics for only one key at a time.

Description of the change

This branch changes the algorithm used by the Page Performance Report in order to
reduce its memory usage.

The current algorithm builds the statistics entirely in memory as it parses the
logs. This uses a great deal of memory because it maintains multiple arrays of
request times for all the keys (categories, page ids, urls) it wants to report on.
It currently fails to generate any weekly or monthly report and has trouble with
some daily reports too.

The new algorithm parses all the logs into a SQLite3 database and then generates
statistics for one key at a time. It still does the statistics computation in
memory, so memory usage still grows linearly with the number of requests: the
category that matches all requests needs an array holding every request time.
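
For illustration, here is a minimal sketch of the two-phase approach (hypothetical
helper names; the real implementation is the SQLiteRequestTimes class in the diff
below):

    import sqlite3

    import numpy

    def compute_stats(times):
        # Stand-in for the Stats class: summarise one key's request times.
        array = numpy.asarray(times, dtype=numpy.float32)
        return {'hits': len(array), 'total': float(array.sum()),
                'mean': float(array.mean()), 'median': float(numpy.median(array))}

    def build_db(requests, db_path='ppr.db'):
        # Phase 1: stream every (category, app_seconds) pair into SQLite
        # instead of keeping per-key arrays in memory.
        con = sqlite3.connect(db_path)
        con.execute('CREATE TABLE IF NOT EXISTS category_request '
                    '(category INTEGER, time REAL)')
        con.executemany('INSERT INTO category_request VALUES (?, ?)', requests)
        con.commit()
        return con

    def stats_per_category(con):
        # Phase 2: walk the rows ordered by key and only hold one key's
        # times in memory at a time.
        cur = con.execute(
            'SELECT category, time FROM category_request ORDER BY category')
        current, times = None, []
        for category, app_seconds in cur:
            if category != current:
                if current is not None:
                    yield current, compute_stats(times)
                current, times = category, []
            times.append(app_seconds)
        if current is not None:
            yield current, compute_stats(times)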

Other changes:

* I've dropped the variance column from the report. We already include the standard
deviation, which is its square root and more useful anyway.

* I've used numpy.clip instead of a list comprehension to cap the request times
fed to the histogram (see the small example below).
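
As a tiny illustration (made-up values, normalisation argument omitted), the
capping now reads roughly:

    import numpy

    timeout = 12
    histogram_width = int(timeout * 1.5)
    request_times = numpy.asarray([0.3, 4.2, 30.0, 7.9], dtype=numpy.float32)

    # Before: capped = [min(t, histogram_width) for t in request_times]
    capped = numpy.clip(request_times, 0, histogram_width)
    counts, bucket_edges = numpy.histogram(
        capped, range=(0, histogram_width), bins=histogram_width)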

Locally, on a 300 000 request file, here is the performance diff:

            Old     New
User time   1m33    1m52
Sys time    0m1.6   0m5
RSS         483M    229M

QA

I've compared the reports generated with the old algorithm against those generated
with the new one, and they are identical (apart from the removed variance column).

On sodium, I've been able to generate the problematic daily reports. Memory usage
peaked at 2.2G for 4 million requests. I'm still not sure whether the weekly and
monthly reports can be computed. Trying that now.

Francis J. Lacoste (flacoste) wrote:

As far as stats goes, I forgot to report that the SQLite3 DB size was 55M for 300000 requests and 776M for 4.1M.

Robert Collins (lifeless):
review: Approve

Robert Collins (lifeless) wrote:

Seems plausible; it might be better to not put the time and sql time in the same table.

If you used different tables, you could avoid all the masking stuff entirely.
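
Something along these lines, say (just a sketch of the idea, not what the branch
implements): requests with no SQL activity would simply get no row in the second
table, so the -1 sentinel and the numpy masking go away.

    import sqlite3

    con = sqlite3.connect(':memory:')
    con.execute('CREATE TABLE category_time (category INTEGER, time REAL)')
    con.execute('CREATE TABLE category_sql '
                '(category INTEGER, sql_statements INTEGER, sql_time REAL)')

    def add_request(category, app_seconds, sql_statements, sql_seconds):
        con.execute('INSERT INTO category_time VALUES (?, ?)',
                    (category, app_seconds))
        if sql_statements is not None:
            con.execute('INSERT INTO category_sql VALUES (?, ?, ?)',
                        (category, sql_statements, sql_seconds))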

Francis J. Lacoste (flacoste) wrote:

Yeah, computing one statistic at a time would also reduce the peak amount of memory used, at the cost of more processing time. I'll see how it goes for the weekly and monthly reports and assess whether another round is needed.
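
Roughly what I have in mind, for the record (just a sketch against the
category_request table from this branch):

    import numpy

    def one_column_stats(con, column, category):
        # Pull a single column per pass so only one array per key is ever
        # held in memory, trading extra queries for a lower peak RSS.
        cur = con.execute(
            'SELECT %s FROM category_request WHERE category = ?' % column,
            (category,))
        values = numpy.fromiter((row[0] for row in cur), dtype=numpy.float32)
        return values.mean(), numpy.median(values), values.std()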

Preview Diff

1=== modified file 'lib/lp/scripts/utilities/pageperformancereport.py'
2--- lib/lp/scripts/utilities/pageperformancereport.py 2010-08-20 20:31:18 +0000
3+++ lib/lp/scripts/utilities/pageperformancereport.py 2010-10-25 21:49:08 +0000
4@@ -13,7 +13,10 @@
5 import re
6 import subprocess
7 from textwrap import dedent
8+import sqlite3
9+import tempfile
10 import time
11+import warnings
12
13 import numpy
14 import simplejson as json
15@@ -24,6 +27,9 @@
16 from canonical.launchpad.scripts.logger import log
17 from lp.scripts.helpers import LPOptionParser
18
19+# We don't care about conversion to nan, they are expected.
20+warnings.filterwarnings(
21+ 'ignore', '.*converting a masked element to nan.', UserWarning)
22
23 class Request(zc.zservertracelog.tracereport.Request):
24 url = None
25@@ -52,19 +58,14 @@
26
27 Requests belong to a Category if the URL matches a regular expression.
28 """
29- def __init__(self, title, regexp, timeout):
30+ def __init__(self, title, regexp):
31 self.title = title
32 self.regexp = regexp
33 self._compiled_regexp = re.compile(regexp, re.I | re.X)
34- self.times = Times(timeout)
35-
36- def add(self, request):
37- """Add a request to a Category if it belongs.
38-
39- Does nothing if the request does not belong in this Category.
40- """
41- if self._compiled_regexp.search(request.url) is not None:
42- self.times.add(request)
43+
44+ def match(self, request):
45+ """Return true when the request match this category."""
46+ return self._compiled_regexp.search(request.url) is not None
47
48 def __cmp__(self, other):
49 return cmp(self.title.lower(), other.title.lower())
50@@ -81,7 +82,6 @@
51 mean = 0 # Mean time per hit.
52 median = 0 # Median time per hit.
53 std = 0 # Standard deviation per hit.
54- var = 0 # Variance per hit.
55 ninetyninth_percentile_time = 0
56 histogram = None # # Request times histogram.
57
58@@ -89,46 +89,16 @@
59 mean_sqltime = 0 # Mean time spend waiting for SQL to process.
60 median_sqltime = 0 # Median time spend waiting for SQL to process.
61 std_sqltime = 0 # Standard deviation of SQL time.
62- var_sqltime = 0 # Variance of SQL time
63
64 total_sqlstatements = 0 # Total number of SQL statements issued.
65 mean_sqlstatements = 0
66 median_sqlstatements = 0
67 std_sqlstatements = 0
68- var_sqlstatements = 0
69-
70-empty_stats = Stats() # Singleton.
71-
72-
73-class Times:
74- """Collection of request times."""
75- def __init__(self, timeout):
76- self.total_hits = 0
77- self.total_time = 0
78- self.request_times = []
79- self.sql_statements = []
80- self.sql_times = []
81- self.ticks = []
82- self.histogram_width = int(1.5*timeout)
83-
84- def add(self, request):
85- """Add the application time from the request to the collection."""
86- self.total_hits += 1
87- self.total_time += request.app_seconds
88- self.request_times.append(request.app_seconds)
89- if request.sql_statements is not None:
90- self.sql_statements.append(request.sql_statements)
91- if request.sql_seconds is not None:
92- self.sql_times.append(request.sql_seconds)
93- if request.ticks is not None:
94- self.ticks.append(request.ticks)
95-
96- _stats = None
97-
98- def stats(self):
99- """Generate statistics about our request times.
100-
101- Returns a `Stats` instance.
102+
103+ def __init__(self, times, timeout):
104+ """Compute the stats based on times.
105+
106+ Times is a list of (app_time, sql_statements, sql_times).
107
108 The histogram is a list of request counts per 1 second bucket.
109 ie. histogram[0] contains the number of requests taking between 0 and
110@@ -136,67 +106,201 @@
111 1 and 2 seconds etc. histogram is None if there are no requests in
112 this Category.
113 """
114- if not self.total_hits:
115- return empty_stats
116-
117- if self._stats is not None:
118- return self._stats
119-
120- stats = Stats()
121-
122- stats.total_hits = self.total_hits
123-
124- # Time stats
125- array = numpy.asarray(self.request_times, numpy.float32)
126- stats.total_time = numpy.sum(array)
127- stats.mean = numpy.mean(array)
128- stats.median = numpy.median(array)
129- stats.std = numpy.std(array)
130- stats.var = numpy.var(array)
131+ if not times:
132+ return
133+
134+ self.total_hits = len(times)
135+
136+ # Ignore missing values (-1) in computation.
137+ times_array = numpy.ma.masked_values(
138+ numpy.asarray(times, dtype=numpy.float32), -1.)
139+
140+ self.total_time, self.total_sqlstatements, self.total_sqltime = (
141+ times_array.sum(axis=0))
142+
143+ self.mean, self.mean_sqlstatements, self.mean_sqltime = (
144+ times_array.mean(axis=0))
145+
146+ self.median, self.median_sqlstatements, self.median_sqltime = (
147+ numpy.median(times_array, axis=0))
148+
149+ self.std, self.std_sqlstatements, self.std_sqltime = (
150+ numpy.std(times_array, axis=0))
151+
152 # This is an approximation which may not be true: we don't know if we
153 # have a std distribution or not. We could just find the 99th
154 # percentile by counting. Shock. Horror; however this appears pretty
155 # good based on eyeballing things so far - once we're down in the 2-3
156 # second range for everything we may want to revisit.
157- stats.ninetyninth_percentile_time = stats.mean + stats.std*3
158- capped_times = (min(a_time, self.histogram_width) for a_time in
159- self.request_times)
160- array = numpy.fromiter(capped_times, numpy.float32,
161- len(self.request_times))
162+ self.ninetyninth_percentile_time = self.mean + self.std*3
163+
164+ histogram_width = int(timeout*1.5)
165+ histogram_times = numpy.clip(times_array[:,0], 0, histogram_width)
166 histogram = numpy.histogram(
167- array, normed=True,
168- range=(0, self.histogram_width), bins=self.histogram_width)
169- stats.histogram = zip(histogram[1], histogram[0])
170-
171- # SQL time stats.
172- array = numpy.asarray(self.sql_times, numpy.float32)
173- stats.total_sqltime = numpy.sum(array)
174- stats.mean_sqltime = numpy.mean(array)
175- stats.median_sqltime = numpy.median(array)
176- stats.std_sqltime = numpy.std(array)
177- stats.var_sqltime = numpy.var(array)
178-
179- # SQL query count.
180- array = numpy.asarray(self.sql_statements, numpy.int)
181- stats.total_sqlstatements = int(numpy.sum(array))
182- stats.mean_sqlstatements = numpy.mean(array)
183- stats.median_sqlstatements = numpy.median(array)
184- stats.std_sqlstatements = numpy.std(array)
185- stats.var_sqlstatements = numpy.var(array)
186-
187- # Cache for next invocation.
188- self._stats = stats
189- return stats
190-
191- def __str__(self):
192- results = self.stats()
193- total, mean, median, std, histogram = results
194- hstr = " ".join("%2d" % v for v in histogram)
195- return "%2.2f %2.2f %2.2f %s" % (
196- total, mean, median, std, hstr)
197-
198- def __cmp__(self, b):
199- return cmp(self.total_time, b.total_time)
200+ histogram_times, normed=True, range=(0, histogram_width),
201+ bins=histogram_width)
202+ self.histogram = zip(histogram[1], histogram[0])
203+
204+
205+class SQLiteRequestTimes:
206+ """SQLite-based request times computation."""
207+
208+ def __init__(self, categories, options):
209+ if options.db_file is None:
210+ fd, self.filename = tempfile.mkstemp(suffix='.db', prefix='ppr')
211+ os.close(fd)
212+ else:
213+ self.filename = options.db_file
214+ self.con = sqlite3.connect(self.filename, isolation_level='EXCLUSIVE')
215+ log.debug('Using request database %s' % self.filename)
216+ # Some speed optimization.
217+ self.con.execute('PRAGMA synchronous = off')
218+ self.con.execute('PRAGMA journal_mode = off')
219+
220+ self.categories = categories
221+ self.store_all_request = options.pageids or options.top_urls
222+ self.timeout = options.timeout
223+ self.cur = self.con.cursor()
224+
225+ # Create the tables, ignore errors about them being already present.
226+ try:
227+ self.cur.execute('''
228+ CREATE TABLE category_request (
229+ category INTEGER,
230+ time REAL,
231+ sql_statements INTEGER,
232+ sql_time REAL)
233+ ''');
234+ except sqlite3.OperationalError, e:
235+ if 'already exists' in str(e):
236+ pass
237+ else:
238+ raise
239+
240+ if self.store_all_request:
241+ try:
242+ self.cur.execute('''
243+ CREATE TABLE request (
244+ pageid TEXT,
245+ url TEXT,
246+ time REAL,
247+ sql_statements INTEGER,
248+ sql_time REAL)
249+ ''');
250+ except sqlite3.OperationalError, e:
251+ if 'already exists' in str(e):
252+ pass
253+ else:
254+ raise
255+
256+ def add_request(self, request):
257+ """Add a request to the cache."""
258+ sql_statements = request.sql_statements
259+ sql_seconds = request.sql_seconds
260+
261+ # Store missing value as -1, as it makes dealing with those
262+ # easier with numpy.
263+ if sql_statements is None:
264+ sql_statements = -1
265+ if sql_seconds is None:
266+ sql_seconds = -1
267+ for idx, category in enumerate(self.categories):
268+ if category.match(request):
269+ self.con.execute(
270+ "INSERT INTO category_request VALUES (?,?,?,?)",
271+ (idx, request.app_seconds, sql_statements, sql_seconds))
272+
273+ if self.store_all_request:
274+ pageid = request.pageid or 'Unknown'
275+ self.con.execute(
276+ "INSERT INTO request VALUES (?,?,?,?,?)",
277+ (pageid, request.url, request.app_seconds, sql_statements,
278+ sql_seconds))
279+
280+ def commit(self):
281+ """Call commit on the underlying connection."""
282+ self.con.commit()
283+
284+ def get_category_times(self):
285+ """Return the times for each category."""
286+ category_query = 'SELECT * FROM category_request ORDER BY category'
287+
288+ empty_stats = Stats([], 0)
289+ categories = dict(self.get_times(category_query))
290+ return [
291+ (category, categories.get(idx, empty_stats))
292+ for idx, category in enumerate(self.categories)]
293+
294+ def get_top_urls_times(self, top_n):
295+ """Return the times for the Top URL by total time"""
296+ top_url_query = '''
297+ SELECT url, time, sql_statements, sql_time
298+ FROM request WHERE url IN (
299+ SELECT url FROM (SELECT url, sum(time) FROM request
300+ GROUP BY url
301+ ORDER BY sum(time) DESC
302+ LIMIT %d))
303+ ORDER BY url
304+ ''' % top_n
305+ # Sort the result by total time
306+ return sorted(
307+ self.get_times(top_url_query), key=lambda x: x[1].total_time,
308+ reverse=True)
309+
310+ def get_pageid_times(self):
311+ """Return the times for the pageids."""
312+ pageid_query = '''
313+ SELECT pageid, time, sql_statements, sql_time
314+ FROM request
315+ ORDER BY pageid
316+ '''
317+ return self.get_times(pageid_query)
318+
319+ def get_times(self, query):
320+ """Return a list of key, stats based on the query.
321+
322+ The query should return rows of the form:
323+ [key, app_time, sql_statements, sql_times]
324+
325+ And should be sorted on key.
326+ """
327+ times = []
328+ current_key = None
329+ results = []
330+ self.cur.execute(query)
331+ while True:
332+ rows = self.cur.fetchmany()
333+ if len(rows) == 0:
334+ break
335+ for row in rows:
336+ # We are encountering a new group...
337+ if row[0] != current_key:
338+ # Compute the stats of the previous group
339+ if current_key != None:
340+ results.append(
341+ (current_key, Stats(times, self.timeout)))
342+ # Initialize the new group.
343+ current_key = row[0]
344+ times = []
345+
346+ times.append(row[1:])
347+ # Compute the stats of the last group
348+ if current_key != None:
349+ results.append((current_key, Stats(times, self.timeout)))
350+
351+ return results
352+
353+ def close(self, remove=False):
354+ """Close the SQLite connection.
355+
356+ :param remove: If true, the DB file will be removed.
357+ """
358+ self.con.close()
359+ if remove:
360+ log.debug('Deleting request database.')
361+ os.unlink(self.filename)
362+ else:
363+ log.debug('Keeping request database %s.' % self.filename)
364
365
366 def main():
367@@ -235,13 +339,17 @@
368 # Default to 12: the staging timeout.
369 default=12, type="int",
370 help="The configured timeout value : determines high risk page ids.")
371+ parser.add_option(
372+ "--db-file", dest="db_file",
373+ default=None, metavar="FILE",
374+ help="Do not parse the records, generate reports from the DB file.")
375
376 options, args = parser.parse_args()
377
378 if not os.path.isdir(options.directory):
379 parser.error("Directory %s does not exist" % options.directory)
380
381- if len(args) == 0:
382+ if len(args) == 0 and options.db_file is None:
383 parser.error("At least one zserver tracelog file must be provided")
384
385 if options.from_ts is not None and options.until_ts is not None:
386@@ -266,7 +374,7 @@
387 for option in script_config.options('categories'):
388 regexp = script_config.get('categories', option)
389 try:
390- categories.append(Category(option, regexp, options.timeout))
391+ categories.append(Category(option, regexp))
392 except sre_constants.error, x:
393 log.fatal("Unable to compile regexp %r (%s)" % (regexp, x))
394 return 1
395@@ -275,18 +383,23 @@
396 if len(categories) == 0:
397 parser.error("No data in [categories] section of configuration.")
398
399- pageid_times = {}
400- url_times = {}
401-
402- parse(args, categories, pageid_times, url_times, options)
403-
404- # Truncate the URL times to the top N.
405+ times = SQLiteRequestTimes(categories, options)
406+
407+ if len(args) > 0:
408+ parse(args, times, options)
409+ times.commit()
410+
411+ log.debug('Generating category statistics...')
412+ category_times = times.get_category_times()
413+
414+ pageid_times = []
415+ url_times= []
416 if options.top_urls:
417- sorted_urls = sorted(
418- ((times, url) for url, times in url_times.items()
419- if times.total_hits > 0), reverse=True)
420- url_times = [(url, times)
421- for times, url in sorted_urls[:options.top_urls]]
422+ log.debug('Generating top %d urls statistics...' % options.top_urls)
423+ url_times = times.get_top_urls_times(options.top_urls)
424+ if options.pageids:
425+ log.debug('Generating pageid statistics...')
426+ pageid_times = times.get_pageid_times()
427
428 def _report_filename(filename):
429 return os.path.join(options.directory, filename)
430@@ -295,7 +408,7 @@
431 if options.categories:
432 report_filename = _report_filename('categories.html')
433 log.info("Generating %s", report_filename)
434- html_report(open(report_filename, 'w'), categories, None, None)
435+ html_report(open(report_filename, 'w'), category_times, None, None)
436
437 # Pageid only report.
438 if options.pageids:
439@@ -313,7 +426,8 @@
440 if options.categories and options.pageids:
441 report_filename = _report_filename('combined.html')
442 html_report(
443- open(report_filename, 'w'), categories, pageid_times, url_times)
444+ open(report_filename, 'w'),
445+ category_times, pageid_times, url_times)
446
447 # Report of likely timeout candidates
448 report_filename = _report_filename('timeout-candidates.html')
449@@ -322,6 +436,7 @@
450 open(report_filename, 'w'), None, pageid_times, None,
451 options.timeout - 2)
452
453+ times.close(options.db_file is None)
454 return 0
455
456
457@@ -363,7 +478,7 @@
458 *(int(elem) for elem in match.groups() if elem is not None))
459
460
461-def parse(tracefiles, categories, pageid_times, url_times, options):
462+def parse(tracefiles, times, options):
463 requests = {}
464 total_requests = 0
465 for tracefile in tracefiles:
466@@ -444,35 +559,7 @@
467 log.debug("Parsed %d requests", total_requests)
468
469 # Add the request to any matching categories.
470- if options.categories:
471- for category in categories:
472- category.add(request)
473-
474- # Add the request to the times for that pageid.
475- if options.pageids:
476- pageid = request.pageid
477- try:
478- times = pageid_times[pageid]
479- except KeyError:
480- times = Times(options.timeout)
481- pageid_times[pageid] = times
482- times.add(request)
483-
484- # Add the request to the times for that URL.
485- if options.top_urls:
486- url = request.url
487- # Hack to remove opstats from top N report. This
488- # should go into a config file if we end up with
489- # more pages that need to be ignored because
490- # they are just noise.
491- if not (url is None or url.endswith('+opstats')):
492- try:
493- times = url_times[url]
494- except KeyError:
495- times = Times(options.timeout)
496- url_times[url] = times
497- times.add(request)
498-
499+ times.add_request(request)
500 else:
501 raise MalformedLine('Unknown record type %s', record_type)
502 except MalformedLine, x:
503@@ -491,7 +578,6 @@
504 elif prefix == 't':
505 if len(args) != 4:
506 raise MalformedLine("Wrong number of arguments %s" % (args,))
507- request.ticks = int(args[1])
508 request.sql_statements = int(args[2])
509 request.sql_seconds = float(args[3]) / 1000
510 else:
511@@ -500,12 +586,12 @@
512
513
514 def html_report(
515- outf, categories, pageid_times, url_times,
516+ outf, category_times, pageid_times, url_times,
517 ninetyninth_percentile_threshold=None):
518 """Write an html report to outf.
519
520 :param outf: A file object to write the report to.
521- :param categories: Categories to report.
522+ :param category_times: The time statistics for categories.
523 :param pageid_times: The time statistics for pageids.
524 :param url_times: The time statistics for the top XXX urls.
525 :param ninetyninth_percentile_threshold: Lower threshold for inclusion of
526@@ -575,20 +661,17 @@
527
528 <th class="clickable">Mean Time (secs)</th>
529 <th class="clickable">Time Standard Deviation</th>
530- <th class="clickable">Time Variance</th>
531 <th class="clickable">Median Time (secs)</th>
532 <th class="sorttable_nosort">Time Distribution</th>
533
534 <th class="clickable">Total SQL Time (secs)</th>
535 <th class="clickable">Mean SQL Time (secs)</th>
536 <th class="clickable">SQL Time Standard Deviation</th>
537- <th class="clickable">SQL Time Variance</th>
538 <th class="clickable">Median SQL Time (secs)</th>
539
540 <th class="clickable">Total SQL Statements</th>
541 <th class="clickable">Mean SQL Statements</th>
542 <th class="clickable">SQL Statement Standard Deviation</th>
543- <th class="clickable">SQL Statement Variance</th>
544 <th class="clickable">Median SQL Statements</th>
545
546 </tr>
547@@ -600,8 +683,7 @@
548 # Store our generated histograms to output Javascript later.
549 histograms = []
550
551- def handle_times(html_title, times):
552- stats = times.stats()
553+ def handle_times(html_title, stats):
554 histograms.append(stats.histogram)
555 print >> outf, dedent("""\
556 <tr>
557@@ -611,7 +693,6 @@
558 <td class="numeric 99pc_under">%.2f</td>
559 <td class="numeric mean_time">%.2f</td>
560 <td class="numeric std_time">%.2f</td>
561- <td class="numeric var_time">%.2f</td>
562 <td class="numeric median_time">%.2f</td>
563 <td>
564 <div class="histogram" id="histogram%d"></div>
565@@ -619,30 +700,27 @@
566 <td class="numeric total_sqltime">%.2f</td>
567 <td class="numeric mean_sqltime">%.2f</td>
568 <td class="numeric std_sqltime">%.2f</td>
569- <td class="numeric var_sqltime">%.2f</td>
570 <td class="numeric median_sqltime">%.2f</td>
571
572- <td class="numeric total_sqlstatements">%d</td>
573+ <td class="numeric total_sqlstatements">%.f</td>
574 <td class="numeric mean_sqlstatements">%.2f</td>
575 <td class="numeric std_sqlstatements">%.2f</td>
576- <td class="numeric var_sqlstatements">%.2f</td>
577 <td class="numeric median_sqlstatements">%.2f</td>
578 </tr>
579 """ % (
580 html_title,
581 stats.total_hits, stats.total_time,
582 stats.ninetyninth_percentile_time,
583- stats.mean, stats.std, stats.var, stats.median,
584+ stats.mean, stats.std, stats.median,
585 len(histograms) - 1,
586 stats.total_sqltime, stats.mean_sqltime,
587- stats.std_sqltime, stats.var_sqltime, stats.median_sqltime,
588+ stats.std_sqltime, stats.median_sqltime,
589 stats.total_sqlstatements, stats.mean_sqlstatements,
590- stats.std_sqlstatements, stats.var_sqlstatements,
591- stats.median_sqlstatements))
592+ stats.std_sqlstatements, stats.median_sqlstatements))
593
594 # Table of contents
595 print >> outf, '<ol>'
596- if categories:
597+ if category_times:
598 print >> outf, '<li><a href="#catrep">Category Report</a></li>'
599 if pageid_times:
600 print >> outf, '<li><a href="#pageidrep">Pageid Report</a></li>'
601@@ -650,22 +728,21 @@
602 print >> outf, '<li><a href="#topurlrep">Top URL Report</a></li>'
603 print >> outf, '</ol>'
604
605- if categories:
606+ if category_times:
607 print >> outf, '<h2 id="catrep">Category Report</h2>'
608 print >> outf, table_header
609- for category in categories:
610+ for category, times in category_times:
611 html_title = '%s<br/><span class="regexp">%s</span>' % (
612 html_quote(category.title), html_quote(category.regexp))
613- handle_times(html_title, category.times)
614+ handle_times(html_title, times)
615 print >> outf, table_footer
616
617 if pageid_times:
618 print >> outf, '<h2 id="pageidrep">Pageid Report</h2>'
619 print >> outf, table_header
620- for pageid, times in sorted(pageid_times.items()):
621- pageid = pageid or 'None'
622+ for pageid, times in pageid_times:
623 if (ninetyninth_percentile_threshold is not None and
624- (times.stats().ninetyninth_percentile_time <
625+ (times.ninetyninth_percentile_time <
626 ninetyninth_percentile_threshold)):
627 continue
628 handle_times(html_quote(pageid), times)