Code review comment for lp:~leonardr/launchpad/test-representation-cache

Revision history for this message
Stuart Bishop (stub) wrote :

On Tue, Jun 1, 2010 at 11:12 PM, Leonard Richardson
<email address hidden> wrote:

> === modified file 'lib/canonical/database/sqlbase.py'

> +    def __storm_flushed__(self):
> +        """Invalidate the web service cache."""
> +        cache = getUtility(IRepresentationCache)
> +        cache.delete(self)

This looks like a decent way of hooking in. Unfortunatly, SQLBase is
legacy - used by objects providing the SQLObject compatibility layer.
Some of our newer classes use the Storm base class directly. If there
is no way to register an on-flush hook with Storm, you will need to
create a Launchpad specific subclass of storm.Storm that provides this
hook, and update classes using the Storm base class directly to use
LPStorm.

The cache can become stale by:

1) Raw SQL, at psql command line or executed by appservers or scripts.
2) Stored procedures
3) Store.remove()
4) Using Storm for bulk updates, inserts or deletions.
5) Scripts or appservers running with non-identical [memcache] config section.
6) Scripts or appservers unable to contact memcached servers
7) Network glitches

We should probably ignore the last three. Are the first four a problem
though for the webservice? If it doesn't really matter, fine.
Otherwise we might need some ON DELETE and ON UPDATE database triggers
to keep things in sync. Hopefully this is unnecessary, as it could
will kill our performance gain.

=== added file 'lib/canonical/launchpad/pagetests/webservice/cache.txt'

+The cache starts out empty.
+
+ >>> print cache.get_by_key(key)
+ None
+
+Retrieving a representation of an object populates the cache.
+
+ >>> ignore = webservice.get("/~salgado", api_version="devel").jsonBody()
+
+ >>> cache.get_by_key(key)
+ '{...}'

Is webservice.get making a real connection to the webservice here? I
am curious if the cache is populated when the object is retrieved,
rather than when the transaction that retrieved the object commits.

=== added file 'lib/lp/services/memcache/doc/restful-cache.txt'

> +An object's cache key is derived from its Storm metadata: its database
> +table name and its primary key.

> +    >>> cache_key = cache.key_for(
> +    ...     person, 'media/type', 'web-service-version')
> +    >>> print person.id, cache_key
> +    29 Person(29,),media/type,web-service-version

Do we need the LPCONFIG here too? Or is it ok for edge & production to mix?

Hmm... perhaps IMemcacheClient should grow a real API and
automatically prepend the LPCONFIG - not sure if have a use case for
edge and production sharing data, but it is convenient to have them
share the same physical Memcached servers.

> +When a Launchpad object is modified, its JSON representations for
> +recognized web service versions are automatically removed from the
> +cache.
> +
> +    >>> person.addressline1 = "New address"
> +    >>> from canonical.launchpad.ftests import syncUpdate
> +    >>> syncUpdate(person)
> +
> +    >>> print cache.get(person, json_type, "beta", default="missing")
> +    missing
> +
> +    >>> print cache.get(person, json_type, "1.0", default="missing")
> +    missing

Should we document the cases where this doesn't happen, or is this
irrelevant to the webservice?

store.execute(Update({Person.addressline1: 'Newer Address'},
Person.name==person.name)) should make the cache stale.

> === added file 'lib/lp/services/memcache/restful.py'

> +class MemcachedStormRepresentationCache(BaseRepresentationCache):
> +    """Caches lazr.restful representations of Storm objects in memcached."""
> +
> +    def __init__(self):
> +        self.client = memcache_client_factory()

You should be using getUtility(IMemcacheClient) here - otherwise we
end up with two clients and twice as many sockets opened to each
memcached. Or we need to refactor the factory to return a singleton
(is that still a factory?)

> +    def key_for(self, obj, media_type, version):
> +        storm_info = storm.info.get_obj_info(obj)
> +        table_name = storm_info.cls_info.table
> +        primary_key = tuple(var.get() for var in storm_info.primary_vars)
> +
> +        key = (table_name + repr(primary_key)
> +                + ',' + media_type + ',' + str(version))
> +        return key
> +
> +    def get_by_key(self, key, default=None):
> +        return self.client.get(key) or default
> +
> +    def set_by_key(self, key, value):
> +        self.client.set(key, value)
> +
> +    def delete_by_key(self, key):
> +        self.client.delete(key)

We will likely need to calculate the key using the Storm class and the
primary key value rather than an actual object instance. No need to
add it now, but it might affect your API design.

--
Stuart Bishop <email address hidden>
http://www.stuartbishop.net/

« Back to merge proposal