Launchpad itself

Merge lp:~stub/launchpad/memcache into lp:launchpad

memcache
Merge into devel

Proposed by Stuart Bishop on 2010-02-26

Status:

Merged

Approved by:

Stuart Bishop on 2010-03-06

Approved revision:

no longer in the source branch.

Merged at revision:

not available

Proposed branch:

lp:~stub/launchpad/memcache

Merge into:

lp:launchpad

Diff against target:

723 lines (+606/-6)

8 files modified

lib/canonical/testing/layers.py (+5/-0)
lib/lp/app/stories/launchpad-root/xx-featuredprojects.txt (+5/-0)
lib/lp/app/templates/root-index.pt (+10/-5)
lib/lp/services/memcache/configure.zcml (+6/-0)
lib/lp/services/memcache/doc/tales-cache.txt (+214/-0)
lib/lp/services/memcache/interfaces.py (+1/-1)
lib/lp/services/memcache/tales.py (+294/-0)
lib/lp/services/memcache/tests/test_doc.py (+71/-0)

To merge this branch:

bzr merge lp:~stub/launchpad/memcache

Related bugs:

Bug #634326: memcache cache keys interact poorly with query parameters	High	Fix Released
Bug #634646: MemcachedKeyCharacterError: Control characters not allowed	High	Fix Released

Reviewer	Review Type	Date Requested	Status
Gary Poster (community)		2010-02-26	Approve on 2010-02-26
Review via email: mp+20226@code.launchpad.net

Commit message

Add syntax to page templates to cache chunks of rendered content in memcached.

Revision history for this message

Stuart Bishop (stub) wrote on 2010-02-26:

Implements the ability to cache rendered chunks of our page templates in memcached.

Readable, tested documentation included describing the syntax and functionality.

To install this new functionality, I had to resort to monkey patching. The solution to this will be to move this feature upstream into zope.tal and zope.tales. We are not worrying about this yet as we are considering switching our TAL interpreter to chameleon.

Gary Poster (gary) wrote on 2010-02-26:

merge-conditional

Hi Stuart.  This is very cool!  Thank you.

Might as well clean up the ``#level debug`` comments in launchpad.conf.

I question including the "anonymous" visibility.  You bring up a good reason for not including it.  Why don't we just exclude it this time around?  It feels like something that might be interesting for upstream but not for right now--and like something that people will use unnecessarily.  (On the other hand, if you want to push back, in the interest of saving you from waiting for me to wake up on the other side of the globe, you may merely imagine me acquiescing, and keep it, if you like.)

I liked your doctest.

Line 383 of the diff incorrectly describes the syntax, AIUI: <div tal:content="cache:1h public">.  Please correct it.

I would be tempted to try to provide a more helpful error message when someone includes too many commas (re "self.visibility, max_age = (s.strip() for s in expr.split(','))" from line 392 of the diff).  Line 403 ("value, unit = max_age.split(' ')") is similar.  If you can't be bothered, I'll look the other way.

You say that "units is one of 'seconds', 'minutes', 'hours' or 'days'."  However, you accept s*, m*, h* and d*.  You are much stricter about the visibility strings.  I'm inclined to favor enforcing readability, and enforce the full strings.  This is negotiable (that is, under the circumstances, you may hear that as "ignorable" if you wish) but I feel more strongly about this one than some others I've brought up with similar deference.
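
As an illustration of the two points above (clearer errors for malformed expressions, and accepting only the full unit names), here is a minimal sketch. The helper name `parse_cache_expr`, the tables and the error messages are hypothetical and are not taken from the branch; the accepted forms ("public", "public, 1 hour", "public,30 seconds") are the ones used in the templates and doctest in the diff.

```python
# Hypothetical helper, not the code in this branch.
VISIBILITIES = ("public", "private", "anonymous", "authenticated")
UNIT_SECONDS = {
    "second": 1, "seconds": 1,
    "minute": 60, "minutes": 60,
    "hour": 3600, "hours": 3600,
    "day": 86400, "days": 86400,
}


def parse_cache_expr(expr):
    """Parse 'visibility[, N units]' into (visibility, max_age_in_seconds).

    max_age_in_seconds is None when no timeout is given.
    """
    parts = [s.strip() for s in expr.split(",")]
    if len(parts) > 2:
        raise ValueError(
            "too many commas in %r; expected 'visibility, N units'" % expr)
    visibility = parts[0]
    if visibility not in VISIBILITIES:
        raise ValueError(
            "unknown visibility %r; expected one of %s"
            % (visibility, ", ".join(VISIBILITIES)))
    if len(parts) == 1:
        return visibility, None
    fields = parts[1].split()
    if len(fields) != 2:
        raise ValueError(
            "timeout %r must look like '42 minutes'" % parts[1])
    value, unit = fields
    if unit not in UNIT_SECONDS:
        raise ValueError(
            "unknown unit %r; use seconds, minutes, hours or days" % unit)
    try:
        count = int(value)
    except ValueError:
        raise ValueError("timeout value %r is not an integer" % value) from None
    return visibility, count * UNIT_SECONDS[unit]
```

Listing singular and plural spellings explicitly keeps "1 hour" readable without falling back to prefix matching on s*, m*, h* and d*.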

You generate _valid_key_characters but then you never use it.  Maybe delete it?

I am curious why you are not using SPACE (32) as your delineator, as opposed to the colon, which forces your logic to have to be a bit trickier here and there.  Maybe I'll see why later...

I figure you know this is OK because of the DB, but I was very mildly surprised by the confidence of "uid = str(logged_in_user.id)".  If you are sure it is safe, that's great.

The way you are handling repeats is very interesting.  It's an interesting problem.  I first thought that your approach of adding one to the counter_key in the request annotations would not work very well in the case of nested loops, because if something changed, then it would do a very odd cascade that might put an old sub-item from one top-level section into another top-level section entirely.  However, if a collection changes significantly from one repeat to the next--an item is inserted somewhere, rather than appended, in particular--you are kind of hosed anyway.  I guess I'm fine with what you have, though I'd like it if you added a warning in the doctest that caching things in a repeat is perhaps a less appealing prospect than some others.

OK, I asked why you are not using SPACE (32) as your delineator, and now I see those colons in "pt:%s:%s,%s:%s:%d,%d:%d,%s".  I see commas too, though.  Why can't we use spaces for everything?  Contrariwise, if we are separating with colons and commas, why are you not excluding commas from your valid characters? Contrariwise to both of those, or perhaps perpendicularly to them, if the url is at the end, we know that any colon or comma after the ones used in our string formatting pattern belong to the url, so why do we escape any of the delimiters?  And then, to add to the excitement, why do we not escape any of the other values in that key--how sure are you that the user id will not have a colon or comma, for instance?

If you answer those questions to your own satisfaction, you may consider me satisfied, and proceed.

base62 is fun :-)

To state the obvious, getKey needs to be very fast or else it will be difficult for this to be a performance win. It looks pretty reasonable to me, though.  ISTR that calculating an md5 hash is pretty darn fast.  I'll not worry about it.

In Python, I've gained a taste against conditional statements actually doing work ("if getUtility(IMemcacheClient).set(...):") but it's clear enough in context.  No change needed.

You have a nice __repr__ on the MemcacheMiss, while you have what I interpret to be a vestigial __unicode__ (before you monkeypatched evaluateText) on MemcacheHit.  Would it be reasonable to change MemcacheHit to drop the __unicode__ and add a __repr__?  If so, please do.

OK, that's it.  Thank you again for a very cool branch, Stuart!

Gary

review: Approve

Stuart Bishop (stub) wrote on 2010-03-02:

On Sat, Feb 27, 2010 at 5:47 AM, Gary Poster <email address hidden> wrote:

> Might as well clean up the ``#level debug`` comments in launchpad.conf.

Done.

> I question including the "anonymous" visibility. You bring up a good reason for not including it. Why don't we just exclude it this time around? It feels like something that might be interesting for upstream but not for right now--and like something that people will use unnecessarily. (On the other hand, if you want to push back, in the interest of saving you from waiting for me to wake up on the other side of the globe, you may merely imagine me acquiescing, and keep it, if you like.)

I'd like to keep it because it completes the visibility model and I'd rather not have to reimplement it later if we push this upstream or if we have real use cases where we do need it. Also, my comments are just my guess - is our squid configured to cache pages with query strings? I don't really know.
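
To make the four-way visibility model concrete, here is a minimal sketch of how a visibility and the current user could map to a "who shares this entry" fragment of the cache key. The function `cache_scope` and the fragment strings are hypothetical and are not the branch's implementation; the sharing rules follow the descriptions in the doctest in the diff.

```python
# Hypothetical helper; fragment strings are illustrative only.
VISIBILITIES = ("public", "private", "anonymous", "authenticated")


def cache_scope(visibility, user_id):
    """Return a key fragment naming who shares the cached chunk.

    user_id is None for unauthenticated requests, otherwise an integer.
    Returns None when the request should bypass the cache entirely.
    """
    if visibility not in VISIBILITIES:
        raise ValueError("unknown visibility %r" % visibility)
    if visibility == "public":
        return "public"          # one copy for everybody
    if user_id is None:
        return "anon"            # unauthenticated users always share
    if visibility == "private":
        return "user%d" % user_id  # one copy per authenticated user
    if visibility == "authenticated":
        return "auth"            # one copy for all authenticated users
    return None                  # 'anonymous': authenticated users skip the cache
```

Seen this way, "anonymous" is the only visibility that can return "no cache", which is why dropping it would leave the model incomplete.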

> Line 383 of the diff incorrectly describes the syntax, AIUI: <div tal:content="cache:1h public">. Please correct it.

Fixed.

> I would be tempted to try to provide a more helpful error message when someone includes too many commas (re "self.visibility, max_age = (s.strip() for s in expr.split(','))" from line 392 of the diff). Line 403 ("value, unit = max_age.split(' ')") is similar. If you can't be bothered, I'll look the other way.

Fixed.

> You say that "units is one of 'seconds', 'minutes', 'hours' or 'days'." However, you accept s*, m*, h* and d*. You are much stricter about the visibility strings. I'm inclined to favor enforcing readability, and enforce the full strings. This is negotiable (that is, under the circumstances, you may hear that as "ignorable" if you wish) but I feel more strongly about this one than some others I've brought up with similar deference.

Ok. I did that because it made it simpler to accept plural forms. Fixed.

> You generate _valid_key_characters but then you never use it. Maybe delete it?

Yes - that was cruft from earlier work.

> I am curious why you are not using SPACE (32) as your delineator, as opposed to the colon, which forces your logic to have to be a bit trickier here and there. Maybe I'll see why later...

Space is not a valid character in memcache keys, but colon is. I'm also using a mixture of colon and comma because I think I have seen memcache reporting tools using this to summarize memcache utilization, but it doesn't really matter provided it isn't a number or one of the magic tokens like 'p' or 'a'. These tools might be a figment of an overactive imagination.

> The way you are handling repeats is very interesting. It's an interesting problem. I first thought that your approach of adding one to the counter_key in the request annotations would not work very well in the case of nested loops, because if something changed, then it would do a very odd cascade that might put an old sub-item from one top-level section into another top-level section entirely. However, if a collection changes significantly from one repeat to the next--an item is inserted somewhere, rather than appended, in particular--you are kind of hosed anyway. I guess I'm fine with what you have, though I'd like it if you added a warning in the doctest that caching things in a repeat is perhaps a less appealing prospect than some others.

I can't see how the approach for loops would cause any weirdness or fail that isn't already an issue outside of loops. If you retrieve a list of bugs from the live database but used cached renderings, the two might not match. This is similar to any other case where you are mixing live information with cached information. If that isn't what you meant, I do not follow your reasoning.
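
The mismatch between live data and cached renderings can be sketched as follows. This is a toy model under stated assumptions: a plain dict stands in for memcached, another dict stands in for the per-request annotations, and the names `next_iteration_index`, `render_list` and the key format are hypothetical, not the branch's code.

```python
# Toy model: dicts stand in for memcached and for request annotations.
def next_iteration_index(annotations, location):
    """Count how often the chunk at `location` has rendered in this request."""
    counter_key = "cache-counter:%s" % location
    index = annotations.get(counter_key, 0)
    annotations[counter_key] = index + 1
    return index


def render_list(items, cache, annotations, location="template.pt:10"):
    """Render each item, reusing a cached rendering keyed by iteration index."""
    rendered = []
    for item in items:
        index = next_iteration_index(annotations, location)
        key = "%s#%d" % (location, index)
        if key not in cache:
            cache[key] = "<li>%s</li>" % item  # the 'expensive' rendering
        rendered.append(cache[key])
    return rendered


cache = {}
# First request populates one cache entry per iteration.
first = render_list(["a", "b"], cache, annotations={})
# Second request: an item was inserted at the front of the live collection,
# but iterations 0 and 1 still come from the cache.
second = render_list(["new", "a", "b"], cache, annotations={})
```

In this model the second request shows the stale "a" and "b" entries in the first two positions and renders only the third position fresh, so the new item does not appear until the cached entries expire.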

> OK, I asked why you are not using SPACE (32) as your delineator, and now I see those colons in "pt:%s:%s,%s:%s:%d,%d:%d,%s".  I see commas too, though.  Why can't we use spaces for everything?  Contrariwise, if we are separating with colons and commas, why are you not excluding commas from

I'm thinking we can use colon to report on memcached utilization. I'm using colon where it seemed sensible to create a subdivision, and comma where it wasn't. This was just a guess though, as these reporting tools are still science fiction.

> your valid characters? Contrariwise to both of those, or perhaps perpendicularly to them, if the url is at the end, we know that any colon or comma after the ones used in our string formatting pattern belong to the url, so why do we escape any of the delimiters?  And then, to add to the excitement, why do we not escape any of the other values in that key--how sure are you that the user id will not have a colon or comma, for instance?

I want to escape colons in the URL to avoid confusing the fictional reporting tools. Nothing else in the key can generate a character that needs sanitizing - the user id is an integer for example.

> To state the obvious, getKey needs to be very fast or else it will be difficult for this to be a performance win. It looks pretty reasonable to me, though.  ISTR that calculating an md5 hash is pretty darn fast.  I'll not worry about it.

It's probably faster than any suitable hash we come up with in-house anyway. I chose md5 over sha1 due to my assumption that md5 would be faster. I never timed it though.
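
For illustration, a minimal sketch of key construction along the lines discussed: escape characters memcached rejects (whitespace and control characters), escape colons in the URL so they cannot be mistaken for field separators, and fall back to an md5 digest when the key would exceed memcached's 250-byte key limit. The function names and the three-field layout are hypothetical; the real key in this branch has more fields, and whether it hashes only over-long keys is an assumption here.

```python
from hashlib import md5

MAX_KEY_LENGTH = 250  # memcached rejects longer keys


def sanitize(text):
    """Percent-escape bytes that are unsafe in a memcached key.

    Escapes whitespace, control and non-ASCII bytes, plus ':' (the field
    separator) and '%' (so the escaping stays unambiguous).
    """
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if 33 <= byte <= 126 and ch not in ":%":
            out.append(ch)
        else:
            out.append("%%%02X" % byte)
    return "".join(out)


def make_key(prefix, user_id, url):
    """Build 'prefix:user_id:url', hashing the key if it is too long."""
    key = "%s:%d:%s" % (prefix, user_id, sanitize(url))
    if len(key) > MAX_KEY_LENGTH:
        digest = md5(key.encode("ascii")).hexdigest()
        key = "%s:%d:md5:%s" % (prefix, user_id, digest)
    return key
```

Formatting the user id with `%d` means only the URL can contribute characters that need escaping, which matches the reasoning that nothing else in the key requires sanitizing.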

> You have a nice __repr__ on the MemcacheMiss, while you have what I interpret to be a vestigial __unicode__ (before you monkeypatched evaluateText) on MemcacheHit.  Would it be reasonable to change MemcacheHit to drop the __unicode__ and add a __repr__?  If so, please do.

I've dropped the __unicode__. I didn't add __repr__ as there isn't enough suitable information in the class to do a better job than the default __repr__ (the cached text itself isn't suitable as it might be huge). It was cruft from an earlier attempt to keep changes to TALInterpreter minimal.

-- 
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/

Preview Diff

 === modified file 'lib/canonical/testing/layers.py'
 --- lib/canonical/testing/layers.py	2010-02-12 19:34:42 +0000
 +++ lib/canonical/testing/layers.py	2010-03-06 08:12:37 +0000
@@ -537,6 +537,11 @@
      def getPidFile(cls):
          return os.path.join(config.root, '.memcache.pid')
++    @classmethod
++    def purge(cls):
++        "Purge everything from our memcached."
++        MemcachedLayer.client.flush_all() # Only do this in tests!
++
  class LibrarianLayer(BaseLayer):
      """Provides tests access to a Librarian instance.
 === modified file 'lib/lp/app/stories/launchpad-root/xx-featuredprojects.txt'
 --- lib/lp/app/stories/launchpad-root/xx-featuredprojects.txt	2010-01-08 21:23:15 +0000
 +++ lib/lp/app/stories/launchpad-root/xx-featuredprojects.txt	2010-03-06 08:12:37 +0000
@@ -78,6 +78,9 @@
  Administrators can add a project. Here Foo Bar adds apache as a featured
  project:
++    >>> from canonical.testing.layers import MemcachedLayer
++    >>> MemcachedLayer.purge() # Featured projects list is cached.
++
      >>> admin_browser.getControl('Add project').value = 'apache'
      >>> admin_browser.getControl('Update').click()
      >>> admin_browser.url
@@ -111,6 +114,8 @@
  == Removing a project ==
++    >>> MemcachedLayer.purge() # Featured projects list is cached.
++
      >>> admin_browser.getLink(MANAGE_LINK).click()
      >>> admin_browser.getControl('Apache').click()
      >>> admin_browser.getControl('Update').click()
 === modified file 'lib/lp/app/templates/root-index.pt'
 --- lib/lp/app/templates/root-index.pt	2010-02-22 17:58:40 +0000
 +++ lib/lp/app/templates/root-index.pt	2010-03-06 08:12:37 +0000
@@ -86,7 +86,7 @@
          <div class="yui-g">
            <div class="yui-u first">
              <div class="homepage-whatslaunchpad"
--                 tal:condition="not:view/user">
++                 tal:condition="not:view/user" tal:content="cache:anonymous">
                <h2><span class="launchpad-gold">Launchpad</span> is a software collaboration platform that provides:</h2>
                <ul tal:define="apphomes view/apphomes">
                <li><a tal:attributes="href apphomes/bugs"><img src="/@@/bug" alt="" /></a>
@@ -159,7 +159,7 @@
                <input id="text" type="text" name="field.text" size="25%" />
                <input id="search" type="submit" value="Search Launchpad" />
              </form>
--            <div id="homepage-stats">
++            <div id="homepage-stats" tal:content="cache:public, 1 hour">
                <strong class="registry-stat"
                  tal:content="view/project_count/fmt:intcomma">123</strong>&nbsp;projects,
                <strong class="bugs-stat"
@@ -182,7 +182,8 @@
                </tal:logged_out>You can test Launchpad's functionality
                in our sandbox environment.
                (<a href="/+help/home-page-staging-help.html" target="help">What's this?</a>)<br />
--              <tal:logged_in condition="view/user" omit-tag="">
++              <tal:logged_in condition="view/user" omit-tag=""
++                  tal:content="cache:public">
                  If you're ready, you can:
                  <ul tal:define="apphomes view/apphomes">
                    <li><a href="https://help.launchpad.net/">
@@ -207,7 +208,10 @@
                </tal:logged_in>
              </div>
--            <div id="homepage-featured" class="homepage-portlet">
++            <div id="homepage-featured" class="homepage-portlet"
++                tal:content="cache:anonymous, 1 hour">
++              <tal:cache
++                  tal:content="cache:public, 5 minutes" tal:omit-tag="">
                <h2>Featured projects</h2>
                <div class="featured-project-top"
@@ -231,9 +235,10 @@
                    </li>
                  </ul>
                </div>
++              </tal:cache>
                <ul class="horizontal">
--                <li>
++                <li tal:content="cache:public, 1 hour">
                    <strong><a href="/projects">Browse all
                      <tal:count content="view/project_count">42</tal:count>
                      projects</a>!</strong>
 === modified file 'lib/lp/services/memcache/configure.zcml'
 --- lib/lp/services/memcache/configure.zcml	2009-09-16 12:47:23 +0000
 +++ lib/lp/services/memcache/configure.zcml	2010-03-06 08:12:37 +0000
@@ -5,10 +5,16 @@
      xmlns="http://namespaces.zope.org/zope"
      xmlns:browser="http://namespaces.zope.org/browser"
      xmlns:i18n="http://namespaces.zope.org/i18n"
++    xmlns:tales="http://namespaces.zope.org/tales"
      i18n_domain="launchpad">
++
++    <!-- Main memcache interface - the IMemcacheClient Utility -->
      <utility
          provides="lp.services.memcache.interfaces.IMemcacheClient"
          factory="lp.services.memcache.client.memcache_client_factory"
          />
++
++    <!-- TALES expression letting us cache chunks of rendered templates -->
++    <tales:expressiontype name="cache" handler=".tales.MemcacheExpr" />
  </configure>
 === added directory 'lib/lp/services/memcache/doc'
 === added file 'lib/lp/services/memcache/doc/tales-cache.txt'
 --- lib/lp/services/memcache/doc/tales-cache.txt	1970-01-01 00:00:00 +0000
 +++ lib/lp/services/memcache/doc/tales-cache.txt	2010-03-06 08:12:37 +0000
@@ -0,0 +1,214 @@
++Memcache with TALES
++===================
++
++We have extended TALES with a cache: expression to allow chunks of
++rendered page templates to be cached in Memcached.
++
++
++    >>> template = TestPageTemplate(dedent("""\
++    ...     <div tal:content="cache:public">
++    ...         <span tal:content="param">placeholder</span>
++    ...     </div>"""))
++
++
++The first time we render the page template, there is no information
++in the cache. The cachable section is interpreted and stored in the cache
++for next time.
++
++    >>> print template(param='first')
++    <div>
++        <span>first</span>
++    </div>
++
++
++The second time we render the page template, the cached information
++is used. We prove this here by changing our parameters, which would
++cause this template to render differently.
++
++    >>> print template(param='second')
++    <div>
++        <span>first</span>
++    </div>
++
++
++If we clear the cache, it will be rendered as expected.
++
++    >>> MemcachedLayer.purge()
++    >>> print template(param='third')
++    <div>
++        <span>third</span>
++    </div>
++
++
++Expiry
++------
++
++We can specify how long cached information is considered valid. If
++this is not set, the information may be cached indefinitely. Note
++that memcache may evict information sooner if it runs low on storage
++space.
++
++One interesting technique is to specify a lengthy expiry, but to
++refresh the information asynchronously using an AJAX request. This
++is good enough for bots and improves the initial page load time, but
++care will be needed to avoid 'popping'.
++
++    >>> template = TestPageTemplate(dedent("""\
++    ...     <body tal:omit-tag="">
++    ...         <div tal:content="cache:public,30 seconds" tal:omit-tag="">
++    ...             This bit cached up to 30 seconds.
++    ...         </div>
++    ...         <div tal:content="cache:public,1 minute" tal:omit-tag="">
++    ...             This bit cached up to 1 minute.
++    ...         </div>
++    ...         <div tal:content="cache:public,6 hours" tal:omit-tag="">
++    ...             This bit cached up to 6 hours.
++    ...         </div>
++    ...         <tal:cached content="cache:public,3 days">
++    ...             This bit cached up to 3 days.
++    ...         </tal:cached>
++    ...     </body>"""))
++    >>> print template()
++    This bit cached up to 30 seconds.
++    This bit cached up to 1 minute.
++    This bit cached up to 6 hours.
++    This bit cached up to 3 days.
++
++
++Visibility
++----------
++
++We define 4 different types of 'visibility':
++
++    public
++
++        The cached information is shared by everyone. These sections
++        should not be personalized. They can contain private information
++        if the page itself is protected.
++
++    private
++
++        Unauthenticated users share cached information, but
++        authenticated users do not share with anyone else. A list on a
++        publicly accessible page that might contain private information
++        should use this visibility.
++
++    anonymous
++
++        Unauthenticated users share cached information, but
++        authenticated users do not use the cache at all. This can
++        be used to feed bots cached information quickly, while giving
++        authenticated users up to date information. In practice, this
++        might not make much difference as reverse proxies should
++        already be caching the entire page for unauthenticated users.
++
++    authenticated
++
++        Unauthenticated users share cached information, and all
++        authenticated users share a different cache. This is used
++        when information is being hidden from unauthenticated users,
++        for example when we hide email addresses from unauthenticated
++        users to help protect against email address harvesters.
++
++    >>> template = TestPageTemplate(dedent("""\
++    ...     <div tal:omit-tag="">
++    ...         <tal:cache content="cache:public">
++    ...             Public: <tal:x content="username" />
++    ...         </tal:cache>
++    ...         <tal:cache content="cache:private">
++    ...             Private: <tal:x content="username" />
++    ...         </tal:cache>
++    ...         <tal:cache content="cache:anonymous">
++    ...             Anonymous: <tal:x content="username" />
++    ...         </tal:cache>
++    ...         <tal:cache content="cache:authenticated">
++    ...             Authenticated: <tal:x content="username" />
++    ...         </tal:cache>
++    ...     </div>"""))
++
++Here we populate all caches.
++
++    >>> login(ANONYMOUS)
++    >>> print template(username="Anonymous")
++    Public:        Anonymous
++    Private:       Anonymous
++    Anonymous:     Anonymous
++    Authenticated: Anonymous
++
++Here we reuse the public cache, populate foo's private cache,
++and populate the authenticated cache. The anonymous section is
++uncached.
++
++    >>> login('foo.bar@canonical.com')
++    >>> print template(username='Foo Bar')
++    Public:        Anonymous
++    Private:       Foo Bar
++    Anonymous:     Foo Bar
++    Authenticated: Foo Bar
++
++Here we reuse the public cache, populate test's private cache, and
++reuse the authenticated cache. The anonymous section is uncached.
++
++    >>> login('test@canonical.com')
++    >>> print template(username='Test')
++    Public:        Anonymous
++    Private:       Test
++    Anonymous:     Test
++    Authenticated: Foo Bar
++
++
++Nesting & Loops
++---------------
++
++Cached chunks can contain other cached chunks, useful for specifying
++different timeouts of different visibilities.
++
++    >>> template = TestPageTemplate(dedent("""\
++    ...     <body tal:content="cache:private,25 seconds" tal:omit-tag="">
++    ...         This bit is private to <span tal:replace="username" />
++    ...         and cached up to 25 seconds, but contains
++    ...         <span tal:content="cache:public,3 days" tal:omit-tag="">
++    ...             this bit cached by <span tal:replace="username" />
++    ...             which is public and cached up to 3 days.
++    ...         </span>
++    ...     </body>"""))
++
++    >>> login('foo.bar@canonical.com')
++    >>> print template(username="Foo Bar")
++    This bit is private to Foo Bar and cached up to 25 seconds, but
++    contains this bit cached by Foo Bar which is public and cached up
++    to 3 days.
++
++    >>> login('test@canonical.com')
++    >>> print template(username="Test")
++    This bit is private to Test and cached up to 25 seconds, but
++    contains this bit cached by Foo Bar which is public and cached up
++    to 3 days.
++
++
++tal:repeat loops are fully supported. Each iteration of the loop gets
++a different cache.
++
++    >>> template = TestPageTemplate(dedent("""\
++    ...     <body>
++    ...         <div tal:repeat="i python:range(1,3)">
++    ...             <div tal:replace="cache:public">
++    ...                 <span tal:replace="param" />
++    ...                 <span tal:replace="repeat/i/index" />
++    ...             </div>
++    ...         </div>
++    ...     </body>"""))
++    >>> print template(param='first')
++    <body>
++        <div> first 0 </div>
++        <div> first 1 </div>
++    </body>
++
++    >>> print template(param='second')
++    <body>
++        <div> first 0 </div>
++        <div> first 1 </div>
++    </body>
++
++
++
 === modified file 'lib/lp/services/memcache/interfaces.py'
 --- lib/lp/services/memcache/interfaces.py	2009-09-16 12:47:23 +0000
 +++ lib/lp/services/memcache/interfaces.py	2010-03-06 08:12:37 +0000
@@ -4,7 +4,7 @@
  """Memcached interfaces."""
  __metaclass__ = type
--__all__ = []
++__all__ = ['IMemcacheClient']
  from zope.interface import Interface
 === added file 'lib/lp/services/memcache/tales.py'
 --- lib/lp/services/memcache/tales.py	1970-01-01 00:00:00 +0000
 +++ lib/lp/services/memcache/tales.py	2010-03-06 08:12:37 +0000
@@ -0,0 +1,294 @@
++# Copyright 2010 Canonical Ltd.  This software is licensed under the
++# GNU Affero General Public License version 3 (see the file LICENSE).
++
++"""Implementation of the cache: namespace in TALES."""
++
++__metaclass__ = type
++__all__ = []
++
++
++from hashlib import md5
++import logging
++import os.path
++
++from zope.component import getUtility
++from zope.interface import implements
++from zope.tal.talinterpreter import TALInterpreter, I18nMessageTypes
++from zope.tales.interfaces import ITALESExpression
++
++from canonical.base import base
++from canonical.config import config
++from canonical.launchpad import versioninfo
++from canonical.launchpad.webapp.interfaces import ILaunchBag
++from lp.services.memcache.interfaces import IMemcacheClient
++
++
++class MemcacheExpr:
++    """Namespace to provide memcache caching of page template chunks.
++
++    This namespace is exclusively used in tal:content directives.
++    The only sensible way of using this is the following syntax:
++
++    <div tal:content="cache:public, 1 hour">
++        [... Potentially expensive page template chunk ...]
++    </div>
++    """
++    implements(ITALESExpression)
++    def __init__(self, name, expr, engine):
++        """expr is in the format "visibility, 42 units".
++
++        visibility is one of...
++
++            public: All users see the same cached information.
++
++            private: Authenticated users see a personal copy of the cached
++                     information. Unauthenticated users share a copy of
++                     the cached information.
++
++            anonymous: Unauthenticated users use a shared copy of the
++                       cached information. Authenticated users don't
++                       use the cache. This probably isn't that useful
++                       in practice, as Anonymous requests should already
++                       be cached by reverse proxies on the production
++                       systems.
++
++            authenticated: Authenticated users share a copy of the cached
++                           information, and unauthenticated users share
++                           a separate copy. Use this when information is
++                           being hidden from unauthenticated users, e.g.
++                           for bug comments where email addresses are
++                           obfuscated for unauthenticated users.
++
++        units is one of 'seconds', 'minutes', 'hours' or 'days'.
++
++        visibility is required. If the cache timeout is not specified,
++        it defaults to 'never timeout' (memcache will still purge the
++        information in an LRU fashion when things fill up).
++        """
++        self._s = expr
++
++        if ',' in expr:
++            try:
++                self.visibility, max_age = (s.strip() for s in expr.split(','))
++            except ValueError:
++                raise SyntaxError("Too many arguments in cache: expression")
++        else:
++            self.visibility = expr.strip()
++            max_age = None
++        assert self.visibility in (
++            'anonymous', 'public', 'private', 'authenticated',
++            ), 'visibility must be anonymous, public, private or authenticated'
++
++        if max_age is None:
++            self.max_age = 0
++        else:
++            try:
++                value, unit = max_age.split(' ')
++            except ValueError:
++                raise SyntaxError(
++                    "Unparsable age %s in cache: expression"
++                    % repr(max_age))
++            value = float(value)
++            if unit[-1] == 's':
++                unit = unit[:-1]
++            if unit == 'second':
++                pass
++            elif unit == 'minute':
++                value *= 60
++            elif unit == 'hour':
++                value *= 60 * 60
++            elif unit == 'day':
++                value *= 24 * 60 * 60
++            else:
++                raise AssertionError("Unknown unit %s" % unit)
++            self.max_age = int(value)
++
++    # For use with str.translate to sanitize keys. No control characters
++    # allowed, and we skip ':' too since it is a magic separator.
++    _key_translate_map = (
++        '_'*33 + ''.join(chr(i) for i in range(33, ord(':'))) + '_'
++        + ''.join(chr(i) for i in range(ord(':')+1, 127)) + '_' * 129)
++
++    def getKey(self, econtext):
++        """We need to calculate a unique key for this cached chunk.
++
++        To ensure content is uniquely identified, we must include:
++            - a user id if this chunk is not 'public'
++            - the template source file name
++            - the position in the source file
++            - a counter to cope with cached chunks in loops
++            - the revision number of the source tree
++            - the config in use
++            - the URL and query string
++        """
++        # We include the URL and query string in the key.
++        # We use the full, unadulterated url to calculate a hash.
++        # We use a sanitized version in the human readable chunk of
++        # the key.
++        request = econtext.getValue('request')
++        url = str(request.URL) + '?' + str(request.get('QUERY_STRING', ''))
++        url = url.encode('utf8') # Ensure it is a byte string.
++        sanitized_url = url.translate(self._key_translate_map)
++
++        # We include the source file and position in the source file in
++        # the key.
++        source_file = os.path.abspath(econtext.source_file)
++        source_file = source_file[
++            len(os.path.commonprefix([source_file, config.root + '/lib']))+1:]
++
++        # We include the visibility in the key so private information
++        # is not leaked. We use 'p' for public information, 'a' for
++        # unauthenticated user information, 'l' for information shared
++        # between all authenticated users, or ${Person.id} for private
++        # information.
++        if self.visibility == 'public':
++            uid = 'p'
++        else:
++            logged_in_user = getUtility(ILaunchBag).user
++            if logged_in_user is None:
++                uid = 'a'
++            elif self.visibility == 'authenticated':
++                uid = 'l'
++            else: # private visibility
++                uid = str(logged_in_user.id)
++
++        # We include a counter in the key, reset at the start of the
++        # request, to ensure we get unique but repeatable keys inside
++        # tal:repeat loops.
++        counter_key = 'lp.services.memcache.tales.counter'
++        counter = request.annotations.get(counter_key, 0) + 1
++        request.annotations[counter_key] = counter
++
++        # We use pt: as a unique prefix to ensure no clashes with other
++        # components using the memcached servers. The order of components
++        # below only matters for human readability and memcached reporting
++        # tools - it doesn't really matter provided all the components are
++        # included and separators used.
++        key = "pt:%s:%s,%s:%s:%d,%d:%d,%s" % (
++            config.instance_name,
++            source_file, versioninfo.revno, uid,
++            econtext.position[0], econtext.position[1], counter,
++            sanitized_url,
++            )
++
++        # Memcached max key length is 250, so truncate but ensure uniqueness
++        # with a hash. A short hash is good, provided it is still unique,
++        # to preserve readability as much as possible. We include the
++        # unsanitized URL in the hash to ensure uniqueness.
++        key_hash = base(int(md5(key + url).hexdigest(), 16), 62)
++        key = key[:250-len(key_hash)] + key_hash
++
++        return key
++
++    def __call__(self, econtext):
++
++        # If we have an 'anonymous' visibility chunk and are logged in,
++        # we don't cache. Return the 'default' magic token to interpret
++        # the contents.
++        request = econtext.getValue('request')
++        if (self.visibility == 'anonymous'
++            and getUtility(ILaunchBag).user is not None):
++            return econtext.getDefault()
++
++        # Calculate a unique key so we serve the right cached information.
++        key = self.getKey(econtext)
++
++        cached_chunk = getUtility(IMemcacheClient).get(key)
++
++        if cached_chunk is None:
++            logging.debug("Memcache miss for %s", key)
++            return MemcacheMiss(key, self.max_age)
++        else:
++            logging.debug("Memcache hit for %s", key)
++            return MemcacheHit(cached_chunk)
++
++    def __str__(self):
++        return 'memcache expression (%s)' % self._s
++
++    def __repr__(self):
++        return '<MemcacheExpr %s>' % self._s
++
++
++class MemcacheMiss:
++    """Callback for the customized TALInterpreter to invoke.
++
++    If the memcache lookup missed, the TALInterpreter interprets the
++    tag contents and invokes this callback, which will store the
++    result in memcache against the key calculated by the MemcacheExpr.
++    """
++    def __init__(self, key, max_age):
++        self._key = key
++        self._max_age = max_age
++
++    def __call__(self, value):
++        if getUtility(IMemcacheClient).set(
++            self._key, value, self._max_age):
++            logging.debug("Memcache set succeeded for %s", self._key)
++        else:
++            logging.warn("Memcache set failed for %s", self._key)
++
++    def __repr__(self):
++        return "<MemcacheCallback %s %d>" % (self._key, self._max_age)
++
++
++class MemcacheHit:
++    """A prerendered chunk retrieved from cache.
++
++    We use a special object so the TALInterpreter knows that this
++    information should not be quoted.
++    """
++    def __init__(self, value):
++        self.value = value
++
++
++# Oh my bleeding eyes! Monkey patching & cargo culting seems the sanest
++# way of installing our extensions, which makes me sad.
++
++def do_insertText_tal(self, stuff):
++    text = self.engine.evaluateText(stuff[0])
++    if text is None:
++        return
++    if text is self.Default:
++        self.interpret(stuff[1])
++        return
++    # Start Launchpad customization
++    if isinstance(text, MemcacheMiss):
++        # We got a MemcacheMiss instance. This means we hit a
++        # content="cache:..." attribute but there was no valid
++        # data in memcache. So we need to interpret the enclosed
++        # chunk of template and stuff it in the cache for next time.
++        callback = text
++        self.pushStream(self.StringIO())
++        self.interpret(stuff[1])
++        text = self.stream.getvalue()
++        self.popStream()
++        # Now we have generated the chunk, cache it for next time.
++        callback(text)
++        # And output it to the currently rendered page, unquoted.
++        self.stream_write(text)
++        return
++    if isinstance(text, MemcacheHit):
++        # Got a hit. Include the contents directly into the
++        # rendered page, unquoted.
++        self.stream_write(text.value)
++        return
++    # End Launchpad customization
++    if isinstance(text, I18nMessageTypes):
++        # Translate this now.
++        text = self.translate(text)
++    self._writeText(text)
++TALInterpreter.bytecode_handlers_tal["insertText"] = do_insertText_tal
++
++
++# Just like the original, except MemcacheHit and MemcacheMiss
++# instances are also passed through unharmed.
++def evaluateText(self, expr):
++    text = self.evaluate(expr)
++    if (text is None
++        or isinstance(text, (basestring, MemcacheHit, MemcacheMiss))
++        or text is self.getDefault()):
++        return text
++    return unicode(text)
++import zope.pagetemplate.engine
++zope.pagetemplate.engine.ZopeContextBase.evaluateText = evaluateText
++
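The "visibility, 42 units" format handled by MemcacheExpr.__init__ can be illustrated in isolation. This is a sketch only: parse_cache_expr is a hypothetical standalone function, it is written for Python 3, and unlike the branch it reports every malformed input as SyntaxError rather than using assertions. It also splits the age on any whitespace, where the branch splits on a single space.

```python
# Seconds per supported unit (singular form).
UNIT_SECONDS = {'second': 1, 'minute': 60, 'hour': 60 * 60, 'day': 24 * 60 * 60}
VISIBILITIES = ('anonymous', 'public', 'private', 'authenticated')


def parse_cache_expr(expr):
    """Parse 'visibility[, N units]' into (visibility, max_age_seconds).

    A max age of 0 means 'never time out', as in the branch.
    """
    parts = [s.strip() for s in expr.split(',')]
    if len(parts) > 2:
        raise SyntaxError("Too many arguments in cache: expression")
    visibility = parts[0]
    if visibility not in VISIBILITIES:
        raise SyntaxError("Unknown visibility %r" % visibility)
    if len(parts) == 1:
        return visibility, 0
    try:
        value, unit = parts[1].split()
        value = float(value)
    except ValueError:
        raise SyntaxError(
            "Unparsable age %r in cache: expression" % parts[1])
    # Accept both '1 hour' and '2 hours'.
    if unit.endswith('s'):
        unit = unit[:-1]
    if unit not in UNIT_SECONDS:
        raise SyntaxError("Unknown unit %r" % unit)
    return visibility, int(value * UNIT_SECONDS[unit])
```

For example, parse_cache_expr("public, 1 hour") yields a 3600 second timeout, while parse_cache_expr("private") yields 0.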
 === added file 'lib/lp/services/memcache/tests/test_doc.py'
 --- lib/lp/services/memcache/tests/test_doc.py	1970-01-01 00:00:00 +0000
 +++ lib/lp/services/memcache/tests/test_doc.py	2010-03-06 08:12:37 +0000
@@ -0,0 +1,71 @@
++# Copyright 2010 Canonical Ltd.  This software is licensed under the
++# GNU Affero General Public License version 3 (see the file LICENSE).
++
++"""Run doctests."""
++
++__metaclass__ = type
++
++import os.path
++from textwrap import dedent
++import unittest
++
++from zope.component import getUtility
++import zope.pagetemplate.engine
++from zope.pagetemplate.pagetemplate import PageTemplate
++from zope.publisher.browser import TestRequest
++
++from canonical.launchpad.testing.systemdocs import (
++    LayeredDocFileSuite, setUp, tearDown)
++from canonical.testing.layers import LaunchpadFunctionalLayer, MemcachedLayer
++from lp.services.memcache.interfaces import IMemcacheClient
++from lp.services.testing import build_test_suite
++from lp.testing import TestCase
++
++
++here = os.path.dirname(os.path.realpath(__file__))
++
++
++class TestPageTemplate(PageTemplate):
++    """A cutdown PageTemplate implementation suitable for our tests."""
++
++    _num_instances = 0
++
++    def __init__(self, source):
++        super(TestPageTemplate, self).__init__()
++        TestPageTemplate._num_instances += 1
++        self._my_instance_num = TestPageTemplate._num_instances
++        self.pt_edit(source, 'text/html')
++
++    def pt_source_file(self):
++        return 'fake/test_%d.pt' % self._my_instance_num
++
++    def pt_getEngine(self):
++        # The <tales:expressiontype> ZCML only registers with this
++        # engine, not the default.
++        return zope.pagetemplate.engine.Engine
++
++    def pt_getContext(self, args=(), options={}):
++        # Build a minimal context. The cache: expression requires
++        # a request.
++        context = {'request': TestRequest()}
++        context.update(options)
++        return context
++
++
++def memcacheSetUp(test):
++    setUp(test)
++    test.globs['TestPageTemplate'] = TestPageTemplate
++    test.globs['dedent'] = dedent
++    test.globs['MemcachedLayer'] = MemcachedLayer
++
++
++special = {
++    'tales-cache.txt': LayeredDocFileSuite(
++        '../doc/tales-cache.txt',
++        setUp=memcacheSetUp, tearDown=tearDown,
++        layer=LaunchpadFunctionalLayer),
++    }
++
++
++def test_suite():
++    return build_test_suite(here, special, layer=LaunchpadFunctionalLayer)
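
The key handling in getKey() in tales.py (sanitize the URL, then truncate to memcached's 250 character key limit while appending a hash of the unsanitized URL) can be sketched on its own. This is a Python 3 illustration under stated assumptions: sanitize, to_base62 and bounded_key are hypothetical names, and the base 62 alphabet below is an assumption, since the alphabet used by canonical.base.base is not shown in this diff.

```python
import hashlib
import string

MAX_KEY_LENGTH = 250
# Assumed alphabet; the one used by canonical.base.base may differ.
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase


def sanitize(text):
    """Replace ':' and anything outside printable, non-space ASCII with '_'."""
    return ''.join(
        c if 33 <= ord(c) < 127 and c != ':' else '_' for c in text)


def to_base62(number):
    """Encode a non-negative integer using the BASE62 alphabet."""
    if number == 0:
        return BASE62[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(BASE62[remainder])
    return ''.join(reversed(digits))


def bounded_key(key, url):
    """Truncate key to fit in 250 characters, ending with a hash.

    The hash covers the readable key plus the unsanitized URL, so two
    URLs that sanitize to the same text still get different keys.
    """
    digest = hashlib.md5((key + url).encode('utf8')).hexdigest()
    key_hash = to_base62(int(digest, 16))
    return key[:MAX_KEY_LENGTH - len(key_hash)] + key_hash
```

Keeping the readable prefix and spending only the tail of the key on the hash preserves most of the key's usefulness in memcached reporting tools while still guarding against collisions after truncation.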
