Merge lp:~leonardr/lazr.restful/representation-cache into lp:lazr.restful

Proposed by Leonard Richardson
Status: Merged
Merged at revision: 130
Proposed branch: lp:~leonardr/lazr.restful/representation-cache
Merge into: lp:lazr.restful
Diff against target: 861 lines (+581/-40)
10 files modified
src/lazr/restful/NEWS.txt (+10/-0)
src/lazr/restful/_operation.py (+11/-12)
src/lazr/restful/_resource.py (+88/-16)
src/lazr/restful/declarations.py (+11/-4)
src/lazr/restful/docs/webservice-declarations.txt (+6/-1)
src/lazr/restful/example/base/subscribers.py (+1/-0)
src/lazr/restful/example/base/tests/representation-cache.txt (+277/-0)
src/lazr/restful/interfaces/_rest.py (+52/-0)
src/lazr/restful/simple.py (+124/-6)
src/lazr/restful/version.txt (+1/-1)
To merge this branch: bzr merge lp:~leonardr/lazr.restful/representation-cache
Reviewer                    Review Type  Date Requested  Status
Eleanor Berger (community)  code                         Approve
Review via email: mp+25895@code.launchpad.net

Description of the change

This branch makes it possible to store preconstructed string representations of entries in a cache. If an entry is present in the cache, the preconstructed representation is used (and possibly redacted) rather than generating a new representation.
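A minimal sketch of that lookup, not the branch's actual code. It assumes a plain dict as the cache and the standard json module in place of simplejson; `get_representation`, `build`, and the string cache key are hypothetical names used only for illustration. The redaction marker is the one that appears in the doctest.

```python
import json

REDACTED = "tag:launchpad.net:2008:redacted"  # marker value seen in the doctest


def get_representation(cache, key, build, redacted_fields):
    """Return a JSON string for an entry, using `cache` (a dict) if possible.

    `build` is a hypothetical callable that generates the representation
    from scratch, already redacted for the current user.
    """
    cached = cache.get(key)
    if cached is None:
        representation = build()
        # Only complete (unredacted) representations are stored.
        if not redacted_fields:
            cache[key] = representation
        return representation
    if not redacted_fields:
        return cached
    # Cache hit, but this user may not see some fields: redact a copy.
    data = json.loads(cached)
    for name in redacted_fields:
        data[name] = REDACTED
    return json.dumps(data)
```

The cached string itself is never modified, so a later request from a user with full permissions still gets the complete representation.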

On its own, this isn't a huge performance improvement. The huge performance improvement comes from collections. If you request a page of a collection, and 50% of the entries on that page are present in the cache, 50% of the representations will come from the cache and be incorporated into the collection representation. The other half of the representations will be generated and placed in the cache, so that if you request that page again, _all_ the entry representations will come from the cache.
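A rough sketch of how a collection page can be spliced together from per-entry strings. It assumes a plain dict as the cache and the standard json module; `collection_json`, `build_entry`, and the use of bare keys for entries are hypothetical. Unlike batch() in the diff, this sketch closes the JSON hash itself.

```python
import json


def collection_json(entries, cache, build_entry, total_size, start):
    """Assemble a collection representation from per-entry JSON strings.

    `entries` is a list of cache keys and `build_entry` a hypothetical
    callable that generates one entry's JSON string from scratch.
    """
    entry_strings = []
    for key in entries:
        representation = cache.get(key)
        if representation is None:
            # Cache miss: generate the string and remember it.
            representation = build_entry(key)
            cache[key] = representation
        entry_strings.append(representation)
    # Serialize the small header, then splice the entry strings in
    # without parsing them again.
    header = json.dumps({"total_size": total_size, "start": start})
    return header[:-1] + ', "entries": [' + ", ".join(entry_strings) + "]}"
```

Because the entry strings are concatenated rather than re-encoded, a fully cached page costs one small json.dumps call plus string joins.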

In my Launchpad performance tests based on memcached and this lazr.restful branch (https://dev.launchpad.net/Foundations/Webservice/Performance#Store%20representations%20in%20memcached), I found that the operation of retrieving a fully cached collection was about five times faster than if there was no cache.

To make this performance win worth the complexity, I had to make the new 'redacted_fields' attribute very very fast. Typically we check whether the user has permission on an attribute by trying to access the attribute and catching an Unauthorized exception. But if the attribute in question is a calculated attribute, accessing it might trigger a database request or something equally slow. We need to use the Zope permission checker directly.
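To illustrate the difference, here is a sketch with stand-ins rather than real Zope objects. `Unauthorized` and `StubChecker` are hypothetical substitutes for the zope.security exception and checker, and `Recipe.rating` plays the part of a calculated attribute that would hit the database.

```python
class Unauthorized(Exception):
    """Stand-in for zope.security.interfaces.Unauthorized."""


class StubChecker:
    """Hypothetical stand-in for a Zope permission checker."""

    def __init__(self, forbidden):
        self.forbidden = set(forbidden)

    def check(self, obj, name):
        if name in self.forbidden:
            raise Unauthorized(name)


class Recipe:
    loads = 0

    @property
    def rating(self):
        # Pretend this calculated attribute hits the database.
        Recipe.loads += 1
        return 5


def visible(checker, obj, name):
    """Ask the checker about `name` without touching the attribute."""
    try:
        checker.check(obj, name)
    except Unauthorized:
        return False
    return True
```

Asking the checker never evaluates `rating`, so the permission test stays cheap no matter how expensive the attribute is to compute.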

The problem is that the web service doesn't know which field name to pass into the Zope permission checker. If a field's real name is "fooBar" but it's published as "foo_bar" on the web service, the Zope permission checker expects "fooBar" but all the web service has access to is 'foo_bar'.

To get around this problem, I changed the export() declaration to set 'original_name' every time it sets 'as'. In the example above, 'as' will be 'foo_bar', and everything in the web service will call the field 'foo_bar', except for 'redacted_fields', which will look in 'original_name' to find that it needs to pass 'fooBar' into the Zope permission checker.
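A simplified sketch of the idea, assuming a flat mapping from real field name to tag dict instead of the per-version tag stack used in declarations.py. `annotate` and `redacted_names` are hypothetical helpers, and `forbidden` stands for whatever set of real names the permission checker rejects.

```python
def annotate(fields):
    """Fill in 'as' and 'original_name' for each exported field.

    `fields` maps a field's real attribute name to its tag dict; this
    flat layout is a simplification of the per-version tag stack.
    """
    for name, tags in fields.items():
        if tags.get("exported") is not False:
            tags["original_name"] = name
            if tags.get("as") is None:
                tags["as"] = name
    return fields


def redacted_names(fields, forbidden):
    """Return published names of fields whose real name is forbidden."""
    return [tags["as"] for tags in fields.values()
            if tags.get("exported") is not False
            and tags["original_name"] in forbidden]
```

With the 'fooBar' example, the permission question is asked about 'fooBar' while the redaction is reported under 'foo_bar'.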

Leonard Richardson (leonardr) wrote :

Although we'll be using a memcached-based cache when we integrate this code into Launchpad, there's no memcached code here. For testing purposes I use a cache that's backed by a Python dict.
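A toy version of such a dict-backed cache might look like the following. This is a sketch modelled on the behaviour shown in the doctest, not the DictionaryBasedRepresentationCache in simple.py; `DictCache` and the `url_for` helper are hypothetical, and objects are assumed to be hashable.

```python
class DictCache:
    """A toy representation cache backed by a dict-like object."""

    def __init__(self, store, url_for):
        # `url_for(obj, version)` is a hypothetical helper returning the
        # object's URL under the given web service version.
        self.store = store
        self.url_for = url_for
        self.keys_by_object = {}

    def _key(self, obj, media_type, version):
        return "%s,%s" % (self.url_for(obj, version), media_type)

    def get(self, obj, media_type, version, default=None):
        return self.store.get(self._key(obj, media_type, version), default)

    def set(self, obj, media_type, version, representation):
        key = self._key(obj, media_type, version)
        self.store[key] = representation
        # Remember the key so delete() can find every representation.
        self.keys_by_object.setdefault(obj, set()).add(key)

    def delete(self, obj):
        # Drop every representation stored for this object.
        for key in self.keys_by_object.pop(obj, ()):
            self.store.pop(key, None)
```

Keying on URL plus media type means each web service version gets its own entry, because the version name is part of the URL.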

Leonard Richardson (leonardr) wrote :

Another thing I forgot to mention: although the cache interface has a hook for removing objects from the cache, lazr.restful itself will never call that hook. It's the responsibility of the application to call that hook when the cache needs invalidation.
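One way an application could wire that up, sketched with made-up classes: `RecordingCache` exposes only a delete() hook and `Store` is a hypothetical data store that notifies listeners after each write. Neither is part of lazr.restful.

```python
class RecordingCache:
    """Minimal cache exposing only the delete() hook, for illustration."""

    def __init__(self):
        self.representations = {}

    def delete(self, obj):
        self.representations.pop(obj, None)


class Store:
    """Hypothetical data store that notifies listeners after each write."""

    def __init__(self):
        self.listeners = []

    def save(self, obj):
        # A real store would write obj to the database here.
        for listener in self.listeners:
            listener(obj)


cache = RecordingCache()
store = Store()
# The application, not lazr.restful, wires invalidation to data changes.
store.listeners.append(cache.delete)
```

Hooking invalidation into the data store rather than the web service catches changes made by any code path that shares the database.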

Eleanor Berger (intellectronica) :
review: Approve (code)
Gary Poster (gary) wrote :

Summary: After we verify that this general approach gives real-world benefit, I suspect that we should always populate the cache, not only when there are no redacted fields for the current user. We could do this by stripping the security proxy for getting the initial data and populating the cache, and then doing the same logic that you do now for redacting existing JSON caches to get the actual desired end result.
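As a sketch only, that variant could look like the following. It assumes a dict as the cache and the standard json module; `get_representation_always_cached` and `build_unredacted` are hypothetical names, with `build_unredacted` standing for a callable that serializes the object after its security proxy has been removed.

```python
import json

REDACTED = "tag:launchpad.net:2008:redacted"  # marker value seen in the doctest


def get_representation_always_cached(cache, key, build_unredacted,
                                     redacted_fields):
    """Always fill the cache, then redact for the current user.

    `build_unredacted` is a hypothetical callable returning the full
    JSON string, generated from an object with its proxy removed.
    """
    cached = cache.get(key)
    if cached is None:
        cached = build_unredacted()
        cache[key] = cached
    if not redacted_fields:
        return cached
    # Redact a parsed copy; the cached string stays complete.
    data = json.loads(cached)
    for name in redacted_fields:
        data[name] = REDACTED
    return json.dumps(data)
```

Under this scheme, a request from a user who sees redactions still populates the cache for everyone else.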

IRC conversation:

[10:43am] gary_poster: leonardr: in your branch, when there is no cache, did you contemplate always generating it, even if there are redacted fields? Example: if no cache, generate dict of the entire non-redacted version; else if cache and redacted fields, parse out cache to dict; else return cache. (Now we have a non-redacted dict, if we are still here.)
[10:43am] gary_poster: Now, redact dict, turn into JSON, and return. There are variations of that, some of which might be better, but I imagine you get the drift.
[11:01am] leonardr: gary, i'm not sure what the benefit would be
[11:01am] leonardr: also, if there are redacted fields we _cannot_ calculate an unredacted cache due to the security policy
[11:05am] gary_poster: leonardr: the goal would be to create a source for further cache hits. This could be particularly important for objects that frequently have one or more fields redacted. In that case, the cache would rarely or, in the worst case, never be filled (and therefore never or rarely used). Since DB access is the main expense, you discovered, I strongly suspect that loading JSON and redacting will be significantly cheaper than simply creating the JSON.
[11:05am] gary_poster: Also, I'm skeptical of "cannot"; isn't it just a matter of doing the usual work with an unproxied object?
[11:07am] leonardr: yes, we would have to strip the proxy
[11:10am] leonardr: ok, i see what you're saying. we would cache it all the time, whether we were sending a redacted version or not
[11:10am] gary_poster: right
[11:11am] leonardr: i could certainly do that in a future branch. do you know of launchpad objects that typically have redacted fields?
[11:13am] gary_poster: bac would probably know, but he's out. My first guess: anything private, or (perhaps more interesting, perhaps not) anything referring to something provate.
[11:13am] gary_poster: private
[11:14am] leonardr: if an object's url contains private information, a link to that url would be redacted
[11:14am] gary_poster: so, that's an example?
[11:14am] leonardr: but i don't know of any specific launchpad object that does that. it's something to look for
[11:15am] gary_poster: bugs that are marked as security issues
[11:15am] gary_poster: private projects
[11:15am] gary_poster: private teams
[11:15am] gary_poster: private bugs
[11:15am] leonardr: so anything that links to those objects might end up redacted
[11:16am] gary_poster: (and there's more coming, if I understand correctly)
[11:16am] gary_poster: right
[11:17am] leonardr: ok, let's get the basic cache working, make sure it improves performance in real situations, and then i'll work on that
[11:17am] gary_poster: cool, makes sense

Preview Diff

=== modified file 'src/lazr/restful/NEWS.txt'
--- src/lazr/restful/NEWS.txt 2010-05-17 17:53:57 +0000
+++ src/lazr/restful/NEWS.txt 2010-05-24 14:15:38 +0000
@@ -2,6 +2,16 @@
 NEWS for lazr.restful
 =====================
 
+0.9.27 (Development)
+====================
+
+Added the ability to define a representation cache used to store the
+JSON representations of entry resources, rather than building them
+from scratch every time. Although the cache has hooks for
+invalidation, lazr.restful will never invalidate any part of the cache
+on its own. You need to hook lazr.restful's invalidation code into
+your ORM or other data store.
+
 0.9.26 (2010-05-18)
 ===================
 
=== modified file 'src/lazr/restful/_operation.py'
--- src/lazr/restful/_operation.py 2010-01-05 19:24:12 +0000
+++ src/lazr/restful/_operation.py 2010-05-24 14:15:38 +0000
@@ -84,22 +84,21 @@
             # If the result is a web service collection, serve only one
             # batch of the collection.
             collection = getMultiAdapter((result, self.request), ICollection)
-            result = CollectionResource(collection, self.request).batch()
+            result = CollectionResource(collection, self.request).batch() + '}'
         elif self.should_batch(result):
-            result = self.batch(result, self.request)
-
-        # Serialize the result to JSON. Any embedded entries will be
-        # automatically serialized.
-        try:
-            json_representation = simplejson.dumps(
-                result, cls=ResourceJSONEncoder)
-        except TypeError, e:
-            raise TypeError("Could not serialize object %s to JSON." %
-                            result)
+            result = self.batch(result, self.request) + '}'
+        else:
+            # Serialize the result to JSON. Any embedded entries will be
+            # automatically serialized.
+            try:
+                result = simplejson.dumps(result, cls=ResourceJSONEncoder)
+            except TypeError, e:
+                raise TypeError("Could not serialize object %s to JSON." %
+                                result)
 
         self.request.response.setStatus(200)
         self.request.response.setHeader('Content-Type', self.JSON_TYPE)
-        return json_representation
+        return result
 
     def should_batch(self, result):
         """Whether the given response data should be batched."""
=== modified file 'src/lazr/restful/_resource.py'
--- src/lazr/restful/_resource.py 2010-05-17 17:52:57 +0000
+++ src/lazr/restful/_resource.py 2010-05-24 14:15:38 +0000
@@ -70,7 +70,7 @@
 from zope.schema.interfaces import (
     ConstraintNotSatisfied, IBytes, IField, IObject, RequiredMissing)
 from zope.security.interfaces import Unauthorized
-from zope.security.proxy import removeSecurityProxy
+from zope.security.proxy import getChecker, removeSecurityProxy
 from zope.security.management import checkPermission
 from zope.traversing.browser import absoluteURL, AbsoluteURL
 from zope.traversing.browser.interfaces import IAbsoluteURL
@@ -84,7 +84,7 @@
 from lazr.restful.interfaces import (
     ICollection, ICollectionField, ICollectionResource, IEntry, IEntryField,
     IEntryFieldResource, IEntryResource, IFieldHTMLRenderer, IFieldMarshaller,
-    IHTTPResource, IJSONPublishable, IReferenceChoice,
+    IHTTPResource, IJSONPublishable, IReferenceChoice, IRepresentationCache,
     IResourceDELETEOperation, IResourceGETOperation, IResourcePOSTOperation,
     IScopedCollection, IServiceRootResource, ITopLevelEntryLink,
     IUnmarshallingDoesntNeedValue, IWebServiceClientRequest,
@@ -97,7 +97,8 @@
 WADL_SCHEMA_FILE = os.path.join(os.path.dirname(__file__),
                                 'wadl20061109.xsd')
 
-# Levels of detail to use when unmarshalling the data.
+# Constants and levels of detail to use when unmarshalling the data.
+MISSING = object()
 NORMAL_DETAIL = object()
 CLOSEUP_DETAIL = object()
 
@@ -599,7 +600,7 @@
     def batch(self, entries, request):
         """Prepare a batch from a (possibly huge) list of entries.
 
-        :return: A hash:
+        :return: A JSON string representing a hash:
         'entries' contains a list of EntryResource objects for the
         entries that actually made it into this batch
         'total_size' contains the total size of the list.
@@ -608,6 +609,11 @@
         'prev_url', if present, contains a URL to get the previous batch
         in the list.
         'start' contains the starting index of this batch
+
+        Note that the JSON string will be missing its final curly
+        brace. This is in case the caller wants to add some additional
+        keys to the JSON hash. It's the caller's responsibility to add
+        a '}' to the end of the string returned from this method.
         """
         if not hasattr(entries, '__len__'):
             entries = IFiniteSequence(entries)
@@ -617,8 +623,7 @@
         resources = [EntryResource(entry, request)
                      for entry in navigator.batch
                      if checkPermission(view_permission, entry)]
-        batch = { 'entries' : resources,
-                  'total_size' : navigator.batch.listlength,
+        batch = { 'total_size' : navigator.batch.listlength,
                   'start' : navigator.batch.start }
         if navigator.batch.start < 0:
             batch['start'] = None
@@ -628,7 +633,17 @@
         prev_url = navigator.prevBatchURL()
         if prev_url != "":
             batch['prev_collection_link'] = prev_url
-        return batch
+        json_string = simplejson.dumps(batch, cls=ResourceJSONEncoder)
+
+        # String together a bunch of entry representations, possibly
+        # obtained from a representation cache.
+        entry_strings = [
+            resource._representation(HTTPResource.JSON_TYPE)
+            for resource in resources]
+        json_string = (json_string[:-1] + ', "entries": ['
+                       + (", ".join(entry_strings) + ']'))
+        # The caller is responsible for tacking on the final curly brace.
+        return json_string
 
 
 class CustomOperationResourceMixin:
@@ -708,8 +723,6 @@
             return "DELETE not supported."
         return operation()
 
-        return operation()
-
 
 class FieldUnmarshallerMixin:
 
@@ -733,12 +746,11 @@
 
         :return: a 2-tuple (representation_name, representation_value).
         """
-        missing = object()
-        cached_value = missing
+        cached_value = MISSING
         if detail is NORMAL_DETAIL:
             cached_value = self._unmarshalled_field_cache.get(
-                field_name, missing)
-            if cached_value is not missing:
+                field_name, MISSING)
+            if cached_value is not MISSING:
                 return cached_value
 
         field = field.bind(self.context)
@@ -1442,6 +1454,29 @@
                     self.request), self.request),
             adapter.singular_type)
 
+    @property
+    def redacted_fields(self):
+        """Names the fields the current user doesn't have permission to see."""
+        failures = []
+        checker = getChecker(self.context)
+        for name, field in getFieldsInOrder(self.entry.schema):
+            try:
+                # Can we view the field's value? We check the
+                # permission directly using the Zope permission
+                # checker, because doing it indirectly by fetching the
+                # value may have very slow side effects such as
+                # database hits.
+                tagged_values = field.getTaggedValue('lazr.restful.exported')
+                original_name = tagged_values['original_name']
+                checker.check(self.context, original_name)
+            except Unauthorized:
+                # This is an expensive operation that will make this
+                # request more expensive still, but it happens
+                # relatively rarely.
+                repr_name, repr_value = self._unmarshallField(name, field)
+                failures.append(repr_name)
+        return failures
+
     def isModifiableField(self, field, is_external_client):
         """Returns true if this field's value can be changed.
 
@@ -1463,10 +1498,45 @@
 
     def _representation(self, media_type):
         """Return a representation of this entry, of the given media type."""
+
         if media_type in [self.WADL_TYPE, self.DEPRECATED_WADL_TYPE]:
             return self.toWADL().encode("utf-8")
         elif media_type == self.JSON_TYPE:
-            return simplejson.dumps(self, cls=ResourceJSONEncoder)
+            cache = None
+            try:
+                cache = getUtility(IRepresentationCache)
+                representation = cache.get(
+                    self.context, self.JSON_TYPE, self.request.version)
+            except ComponentLookupError:
+                # There's no representation cache.
+                representation = None
+
+            redacted_fields = self.redacted_fields
+            if representation is None:
+                # Either there is no cache, or the representation
+                # wasn't in the cache.
+                representation = simplejson.dumps(self, cls=ResourceJSONEncoder)
+                # If there's a cache, and this representation doesn't
+                # contain any redactions, store it in the cache.
+                if cache is not None and len(redacted_fields) == 0:
+                    cache.set(self.context, self.JSON_TYPE,
+                              self.request.version, representation)
+            else:
+                # We have a representation, but we might not be able
+                # to use it as-is.
+                if len(redacted_fields) != 0:
+                    # We can't use the representation as is. We need
+                    # to deserialize it, redact certain fields, and
+                    # reserialize it. Hopefully this is faster than
+                    # generating the representation from scratch!
+                    json = simplejson.loads(representation)
+                    for field in redacted_fields:
+                        json[field] = self.REDACTED_VALUE
+                    # There's no need to use the ResourceJSONEncoder,
+                    # because we loaded the cached representation
+                    # using the standard decoder.
+                    representation = simplejson.dumps(json)
+            return representation
         elif media_type == self.XHTML_TYPE:
             return self.toXHTML().encode("utf-8")
         else:
@@ -1516,7 +1586,7 @@
         result = self.batch(entries)
 
         self.request.response.setHeader('Content-type', self.JSON_TYPE)
-        return simplejson.dumps(result, cls=ResourceJSONEncoder)
+        return result
 
     def batch(self, entries=None):
         """Return a JSON representation of a batch of entries.
@@ -1526,7 +1596,9 @@
         if entries is None:
             entries = self.collection.find()
         result = super(CollectionResource, self).batch(entries, self.request)
-        result['resource_type_link'] = self.type_url
+        result += (
+            ', "resource_type_link" : ' + simplejson.dumps(self.type_url)
+            + '}')
         return result
 
     @property
=== modified file 'src/lazr/restful/declarations.py'
--- src/lazr/restful/declarations.py 2010-02-25 17:07:16 +0000
+++ src/lazr/restful/declarations.py 2010-05-24 14:15:38 +0000
@@ -151,10 +151,17 @@
             if tag_stack['type'] != FIELD_TYPE:
                 continue
             for version, tags in tag_stack.stack:
-                # Set 'as' for every version in which the field is published
-                # but no 'as' is specified.
-                if tags.get('as') is None and tags.get('exported') != False:
-                    tags['as'] = name
+                # Set 'as' for every version in which the field is
+                # published but no 'as' is specified. Also set
+                # 'original_name' for every version in which the field
+                # is published--this will help with performance
+                # optimizations around permission checks.
+                if tags.get('exported') != False:
+                    tags['original_name'] = name
+                    if tags.get('as') is None:
+                        tags['as'] = name
+
+
 
         annotate_exported_methods(interface)
         return interface
=== modified file 'src/lazr/restful/docs/webservice-declarations.txt'
--- src/lazr/restful/docs/webservice-declarations.txt 2010-04-14 14:56:46 +0000
+++ src/lazr/restful/docs/webservice-declarations.txt 2010-05-24 14:15:38 +0000
@@ -49,7 +49,7 @@
     ...
     ...     inventory_number = TextLine(title=u'The inventory part number.')
 
-These declarations adds tagged value to the original interface elements.
+These declarations add tagged values to the original interface elements.
 The tags are in the lazr.restful namespace and are dictionaries of
 elements.
 
@@ -74,12 +74,15 @@
     type: 'entry'
     >>> print_export_tag(IBook['title'])
     as: 'title'
+    original_name: 'title'
     type: 'field'
     >>> print_export_tag(IBook['author'])
     as: 'author'
+    original_name: 'author'
     type: 'field'
     >>> print_export_tag(IBook['base_price'])
     as: 'price'
+    original_name: 'base_price'
     type: 'field'
     >>> print_export_tag(IBook['inventory_number'])
     tag 'lazr.restful.exported' is not present
@@ -751,9 +754,11 @@
     ...     print_export_tag(IUser[name])
     == name ==
     as: 'name'
+    original_name: 'name'
     type: 'field'
     == nickname ==
     as: 'nickname'
+    original_name: 'nickname'
     type: 'field'
     == rename ==
     as: 'rename'
=== modified file 'src/lazr/restful/example/base/subscribers.py'
--- src/lazr/restful/example/base/subscribers.py 2009-09-01 14:37:41 +0000
+++ src/lazr/restful/example/base/subscribers.py 2010-05-24 14:15:38 +0000
@@ -5,6 +5,7 @@
 __metaclass__ = type
 __all__ = ['update_cookbook_revision_number']
 
+from zope.interface import Interface
 import grokcore.component
 from lazr.lifecycle.interfaces import IObjectModifiedEvent
 from lazr.restful.example.base.interfaces import ICookbook
=== added file 'src/lazr/restful/example/base/tests/representation-cache.txt'
--- src/lazr/restful/example/base/tests/representation-cache.txt 1970-01-01 00:00:00 +0000
+++ src/lazr/restful/example/base/tests/representation-cache.txt 2010-05-24 14:15:38 +0000
@@ -0,0 +1,277 @@
+**********************************
+The in-memory representation cache
+**********************************
+
+Rather than having lazr.restful calculate a representation of an entry
+every time it's requested, you can register an object as the
+representation cache. String representations of entries are generated
+once and stored in the representation cache.
+
+lazr.restful works fine when there is no representation cache
+installed; in fact, this is the only test that uses one.
+
+    >>> from zope.component import getUtility
+    >>> from lazr.restful.interfaces import IRepresentationCache
+    >>> getUtility(IRepresentationCache)
+    Traceback (most recent call last):
+    ...
+    ComponentLookupError: ...
+
+DictionaryBasedRepresentationCache
+==================================
+
+A representation cache can be any object that implements
+IRepresentationCache, but for test purposes we'll be using a simple
+DictionaryBasedRepresentationCache. This object transforms the
+IRepresentationCache operations into operations on a Python dict-like
+object.
+
+    >>> from lazr.restful.simple import DictionaryBasedRepresentationCache
+    >>> dictionary = {}
+    >>> cache = DictionaryBasedRepresentationCache(dictionary)
+
+It's not a good idea to use a normal Python dict in production,
+because there's no limit on how large the dict can become. In a real
+situation you want something with an LRU implementation. That said,
+let's see how the DictionaryBasedRepresentationCache works.
+
+All IRepresentationCache implementations will cache a representation
+under a key derived from the object whose representation it is, the
+media type of the representation, and a web service version name.
+
+    >>> from lazr.restful.example.base.root import C4 as greens_object
+    >>> json = "application/json"
+    >>> print cache.get(greens_object, json, "devel")
+    None
+    >>> print cache.get(greens_object, json, "devel", "missing")
+    missing
+
+    >>> cache.set(greens_object, json, "devel", "This is the 'devel' value.")
+    >>> print cache.get(greens_object, json, "devel")
+    This is the 'devel' value.
+    >>> sorted(dictionary.keys())
+    ['http://cookbooks.dev/devel/cookbooks/Everyday%20Greens,application/json']
+
+This allows different representations of the same object to be stored
+for different versions.
+
+    >>> cache.set(greens_object, json, "1.0", "This is the '1.0' value.")
+    >>> print cache.get(greens_object, json, "1.0")
+    This is the '1.0' value.
+    >>> sorted(dictionary.keys())
+    ['http://cookbooks.dev/1.0/cookbooks/Everyday%20Greens,application/json',
+     'http://cookbooks.dev/devel/cookbooks/Everyday%20Greens,application/json']
+
+Deleting an object from the cache will remove all its representations.
+
+    >>> cache.delete(greens_object)
+    >>> sorted(dictionary.keys())
+    []
+    >>> print cache.get(greens_object, json, "devel")
+    None
+    >>> print cache.get(greens_object, json, "1.0")
+    None
+
+A representation cache
+======================
+
+Now let's register our DictionaryBasedRepresentationCache as the
+representation cache for this web service, and see how it works within
+lazr.restful.
+
+    >>> from zope.component import getSiteManager
+    >>> sm = getSiteManager()
+    >>> sm.registerUtility(cache, IRepresentationCache)
+
+    >>> from lazr.restful.testing.webservice import WebServiceCaller
+    >>> webservice = WebServiceCaller(domain='cookbooks.dev')
+
+When we retrieve a JSON representation of an entry, that
+representation is added to the cache.
+
+    >>> ignored = webservice.get("/recipes/1")
+    >>> [the_only_key] = dictionary.keys()
+    >>> print the_only_key
+    http://cookbooks.dev/devel/recipes/1,application/json
+
+Note that the cache key incorporates the web service version name
+("devel") and the media type of the representation
+("application/json").
+
+Associated with the key is a string: the JSON representation of the object.
+
+    >>> import simplejson
+    >>> print simplejson.loads(dictionary[the_only_key])['self_link']
+    http://cookbooks.dev/devel/recipes/1
+
+If we get a representation of the same resource from a different web
+service version, that representation is stored separately.
+
+    >>> ignored = webservice.get("/recipes/1", api_version="1.0")
+    >>> for key in sorted(dictionary.keys()):
+    ...     print key
+    http://cookbooks.dev/1.0/recipes/1,application/json
+    http://cookbooks.dev/devel/recipes/1,application/json
+
+    >>> key1 = "http://cookbooks.dev/1.0/recipes/1,application/json"
+    >>> key2 = "http://cookbooks.dev/devel/recipes/1,application/json"
+    >>> dictionary[key1] == dictionary[key2]
+    False
+
+Cache invalidation
+==================
+
+lazr.restful does not automatically invalidate the representation
+cache, because it only knows about a subset of the changes that might
+invalidate the cache--the changes that happen through the web service
+itself.
+
+If you want to invalidate the cache whenever the web service changes
+an object, you can write a listener for ObjectModifiedEvent objects
+(see doc/webservice.txt for an example). But most of the time, you'll
+want to invalidate the cache when something deeper happens--something
+like a change to the objects in your ORM.
+
+Let's signal a change to recipe #1. Let's say someone changed that
+recipe, using a web application that has no connection to the web
+service except for a shared database. We can detect the database
+change, but what do we do when that change happens?
+
+Here's the recipe object.
+
+    >>> from lazr.restful.example.base.root import RECIPES
+    >>> recipe = [recipe for recipe in RECIPES if recipe.id == 1][0]
+
+To remove its representation from the cache, we pass it into the
+cache's delete() method.
+
+    >>> print cache.get(recipe, json, 'devel')
+    {...}
+    >>> cache.delete(recipe)
+
+All the relevant representations are deleted.
+
+    >>> print cache.get(recipe, json, 'devel')
+    None
+    >>> dictionary.keys()
+    []
+
+Data visibility
+===============
+
+Only full representations are added to the cache. If the
+representation you request includes a redacted field (because you
+don't have permission to see that field's true value), the
+representation is not added to the cache.
+
+    >>> from urllib import quote
+    >>> greens_url = quote("/cookbooks/Everyday Greens")
+    >>> greens = webservice.get(greens_url).jsonBody()
+    >>> print greens['confirmed']
+    tag:launchpad.net:2008:redacted
+
+    >>> dictionary.keys()
+    []
+
+This means that if your entry resources typically contain data that's
+only visible to a select few users, you won't get much benefit out of
+a representation cache.
+
+What if a full representation is in the cache, and the user requests a
+representation that must be redacted? Let's put some semi-fake data in
+the cache and find out.
+
+    >>> import simplejson
+    >>> greens['name'] = "This comes from the cache; it is not generated."
+    >>> greens['confirmed'] = True
+    >>> cache.set(greens_object, json, 'devel', simplejson.dumps(greens))
+
+When we GET the corresponding resource, we get a representation that
+definitely comes from the cache, not the original data source.
+
+    >>> cached_greens = webservice.get(greens_url).jsonBody()
+    >>> print cached_greens['name']
+    This comes from the cache; it is not generated.
+
+But the redacted value is still redacted.
+
+    >>> print cached_greens['confirmed']
+    tag:launchpad.net:2008:redacted
+
+Cleanup: clear the cache.
+
+    >>> dictionary.clear()
+
205Collections
206===========
207
208Collections are full of entries, and representations of collections
209are built from the cache if possible. We'll demonstrate this with the
210collection of recipes.
211
212First, we'll hack the cached representation of a single recipe.
213
214 >>> recipe = webservice.get("/recipes/1").jsonBody()
215 >>> recipe['instructions'] = "This representation is from the cache."
216 >>> [recipe_key] = dictionary.keys()
217 >>> dictionary[recipe_key] = simplejson.dumps(recipe)
218
219Now, we get the collection of recipes.
220
221 >>> recipes = webservice.get("/recipes").jsonBody()['entries']
222
223The fake instructions we put into an entry's cached representation are
224also present in the collection.
225
226 >>> for instructions in (
227 ... sorted(recipe['instructions'] for recipe in recipes)):
228 ... print instructions
229 A perfectly roasted chicken is...
230 Draw, singe, stuff, and truss...
231 ...
232 This representation is from the cache.
233
234To build the collection, lazr.restful had to generate representations
235of all the recipe entries. As it generated each representation, it
236populated the cache.
237
238 >>> for key in sorted(dictionary.keys()):
239 ... print key
240 http://cookbooks.dev/devel/recipes/1,application/json
241 http://cookbooks.dev/devel/recipes/2,application/json
242 http://cookbooks.dev/devel/recipes/3,application/json
243 http://cookbooks.dev/devel/recipes/4,application/json
244
245If we request the collection again, all the entry representations will
246come from the cache.
247
248 >>> for key in dictionary.keys():
249 ... value = simplejson.loads(dictionary[key])
250 ... value['instructions'] = "This representation is from the cache."
251 ... dictionary[key] = simplejson.dumps(value)
252
253 >>> recipes = webservice.get("/recipes").jsonBody()['entries']
254 >>> for instructions in (
255 ... sorted(recipe['instructions'] for recipe in recipes)):
256 ... print instructions
257 This representation is from the cache.
258 This representation is from the cache.
259 This representation is from the cache.
260 This representation is from the cache.
261
262Cleanup: de-register the cache.
263
264 >>> sm.registerUtility(None, IRepresentationCache)
265
266Of course, the hacks we made to the cached representations have no
267effect on the objects themselves. Once the hacked cache is gone, the
268representations look just as they did before.
269
270 >>> recipes = webservice.get("/recipes").jsonBody()['entries']
271 >>> for instructions in (
272 ... sorted(recipe['instructions'] for recipe in recipes)):
273 ... print instructions
274 A perfectly roasted chicken is...
275 Draw, singe, stuff, and truss...
276 Preheat oven to...
277 You can always judge...
0278
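The collection behaviour exercised by the doctest above can be sketched in isolation. This is a minimal stand-in, not lazr.restful code: `represent_collection`, `generate`, and the integer entry ids are hypothetical names chosen for illustration, and a plain dict plays the role of the representation cache.

```python
import json


def represent_collection(entry_ids, cache, generate):
    """Return one JSON string per entry, reusing cached strings when present.

    `cache` is any dict-like object; `generate` builds a fresh
    representation (a dict) for an entry that is not cached yet.
    """
    representations = []
    for entry_id in entry_ids:
        cached = cache.get(entry_id)
        if cached is None:
            # Cache miss: generate, serialize, and remember the string.
            cached = json.dumps(generate(entry_id))
            cache[entry_id] = cached
        representations.append(cached)
    return representations


generated = []


def generate(entry_id):
    # Record which entries actually had to be generated.
    generated.append(entry_id)
    return {"id": entry_id, "instructions": "Freshly generated."}


# Entry 1 is already cached (with hacked text); entries 2 and 3 are not.
cache = {1: json.dumps({"id": 1, "instructions": "From the cache."})}
first_pass = represent_collection([1, 2, 3], cache, generate)
second_pass = represent_collection([1, 2, 3], cache, generate)
```

On the first pass only the uncached entries are generated; on the second pass nothing is generated, which is the effect the doctest demonstrates by rewriting every cached value.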
=== modified file 'src/lazr/restful/interfaces/_rest.py'
--- src/lazr/restful/interfaces/_rest.py 2010-05-17 17:52:57 +0000
+++ src/lazr/restful/interfaces/_rest.py 2010-05-24 14:15:38 +0000
@@ -34,6 +34,7 @@
     'IHTTPResource',
     'IJSONPublishable',
     'IJSONRequestCache',
+    'IRepresentationCache',
     'IResourceOperation',
     'IResourceGETOperation',
     'IResourceDELETEOperation',
@@ -606,4 +607,55 @@
606 """Traverse to a sub-object."""607 """Traverse to a sub-object."""
607608
608609
610class IRepresentationCache(Interface):
611 """A cache for resource representations.
612
613 Register an object as the utility for this interface and
614 lazr.restful will use that object to cache resource
615 representations. If no object is registered as the utility,
616 representations will not be cached.
617
618 This is designed to be used with memcached, but you can plug in
619 other key-value stores. Note that this cache is intended to store
620 string representations, not deserialized JSON objects or anything
621 else.
622 """
623
624 def get(object, media_type, version, default=None):
625 """Retrieve a representation from the cache.
626
627 :param object: An IEntry--the object whose representation you want.
628 :param media_type: The media type of the representation to get.
629 :param version: The version of the web service for which to
630 fetch a representation.
631 :param default: The object to return if no representation is
632 cached for this object.
633
634 :return: A string representation, or `default`.
635 """
636 pass
637
638 def set(object, media_type, version, representation):
639 """Add a representation to the cache.
640
641 :param object: An IEntry--the object whose representation this is.
642 :param media_type: The media type of the representation.
643 :param version: The version of the web service in which this
644 representation should be stored.
645 :param representation: The string representation to store.
646 """
647 pass
648
649 def delete(object):
650 """Remove *all* of an object's representations from the cache.
651
652 This means representations for every (supported) media type
653 and every version of the web service. Currently the only
654 supported media type is 'application/json'.
655
656 :param object: An IEntry--the object being represented.
657 """
658 pass
659
660
609InvalidBatchSizeError.__lazr_webservice_error__ = 400661InvalidBatchSizeError.__lazr_webservice_error__ = 400
610662
=== modified file 'src/lazr/restful/simple.py'
--- src/lazr/restful/simple.py 2010-01-28 15:33:31 +0000
+++ src/lazr/restful/simple.py 2010-05-24 14:15:38 +0000
@@ -2,7 +2,9 @@
 
 __metaclass__ = type
 __all__ = [
+    'BaseRepresentationCache',
     'BaseWebServiceConfiguration',
+    'DictionaryBasedRepresentationCache',
     'IMultiplePathPartLocation',
     'MultiplePathPartAbsoluteURL',
     'Publication',
@@ -24,19 +26,24 @@
 from zope.publisher.publish import mapply
 from zope.proxy import sameProxiedObjects
 from zope.security.management import endInteraction, newInteraction
-from zope.traversing.browser import AbsoluteURL as ZopeAbsoluteURL
+from zope.traversing.browser import (
+    absoluteURL, AbsoluteURL as ZopeAbsoluteURL)
 from zope.traversing.browser.interfaces import IAbsoluteURL
 from zope.traversing.browser.absoluteurl import _insufficientContext, _safe
 
 import grokcore.component
 
-from lazr.restful import EntryAdapterUtility, ServiceRootResource
+from lazr.restful import (
+    EntryAdapterUtility, HTTPResource, ServiceRootResource)
 from lazr.restful.interfaces import (
-    IServiceRootResource, ITopLevelEntryLink, ITraverseWithGet,
-    IWebServiceConfiguration, IWebServiceLayer)
+    IRepresentationCache, IServiceRootResource, ITopLevelEntryLink,
+    ITraverseWithGet, IWebServiceConfiguration, IWebServiceLayer)
 from lazr.restful.publisher import (
-    WebServicePublicationMixin, WebServiceRequestTraversal)
-from lazr.restful.utils import implement_from_dict
+    browser_request_to_web_service_request, WebServicePublicationMixin,
+    WebServiceRequestTraversal)
+from lazr.restful.utils import (
+    get_current_browser_request, implement_from_dict,
+    tag_request_with_version_name)
 
 
 class PublicationMixin(object):
@@ -351,6 +358,117 @@
351 __call__ = __str__358 __call__ = __str__
352359
353360
361class BaseRepresentationCache(object):
362 """A useful base class for representation caches.
363
364 When an object is invalidated, all of its representations must be
365 removed from the cache. This means representations of every media
366 type for every version of the web service. Subclass this class and
367 you won't have to worry about removing everything. You can focus
368 on implementing key_for() and delete_by_key(), which takes the
369 return value of key_for() instead of a raw object.
370
371 You can also implement set_by_key() and get_by_key(), which also
372 take the return value of key_for(), instead of set() and get().
373 """
374 implements(IRepresentationCache)
375
376 def get(self, obj, media_type, version, default=None):
377 """See `IRepresentationCache`."""
378 key = self.key_for(obj, media_type, version)
379 return self.get_by_key(key, default)
380
381 def set(self, obj, media_type, version, representation):
382 """See `IRepresentationCache`."""
383 key = self.key_for(obj, media_type, version)
384 return self.set_by_key(key, representation)
385
386 def delete(self, object):
387 """See `IRepresentationCache`."""
388 config = getUtility(IWebServiceConfiguration)
389 for version in config.active_versions:
390 key = self.key_for(object, HTTPResource.JSON_TYPE, version)
391 self.delete_by_key(key)
392
393 def key_for(self, object, media_type, version):
394 """Generate a unique key for an object/media type/version.
395
396 :param object: An IEntry--the object whose representation you want.
397 :param media_type: The media type of the representation to get.
398 :param version: The version of the web service for which to
399 fetch a representation.
400 """
401 raise NotImplementedError()
402
403 def get_by_key(self, key, default=None):
404 """Retrieve a representation from the cache, given a key.
405
406 :key: The cache key.
407 """
408 raise NotImplementedError()
409
410 def set_by_key(self, key, representation):
411 """Store a representation in the cache, given a key.
412
413 :key: The cache key.
414 """
415 raise NotImplementedError()
416
417 def delete_by_key(self, key):
418 """Delete a representation from the cache, given a key.
419
420 :key: The cache key.
421 """
422 raise NotImplementedError()
423
424
425class DictionaryBasedRepresentationCache(BaseRepresentationCache):
426 """A representation cache that uses an in-memory dict.
427
428 This cache transforms IRepresentationCache operations into
429 operations on a dictionary.
430
431 Don't use a Python dict object in a production installation! It
432 can easily grow to take up all available memory. If you implement
433 a dict-like object that maintains a maximum size with an LRU
434 algorithm or something similar, you can use that. But this class
435 was written for testing.
436 """
437 def __init__(self, use_dict):
438 """Constructor.
439
440 :param use_dict: A dictionary to keep representations in. As
441 noted in the class docstring, in a production installation
442 it's a very bad idea to use a standard Python dict object.
443 """
444 self.dict = use_dict
445
446 def key_for(self, obj, media_type, version):
447 """See `BaseRepresentationCache`."""
448 # Create a fake web service request for the appropriate version.
449 config = getUtility(IWebServiceConfiguration)
450 web_service_request = config.createRequest("", {})
451 web_service_request.setVirtualHostRoot(
452 names=[config.path_override, version])
453 tag_request_with_version_name(web_service_request, version)
454
455 # Use that request to create a versioned URL for the object.
456 value = absoluteURL(obj, web_service_request) + ',' + media_type
457 return value
458
459 def get_by_key(self, key, default=None):
460 """See `BaseRepresentationCache`."""
461 return self.dict.get(key, default)
462
463 def set_by_key(self, key, representation):
464 """See `BaseRepresentationCache`."""
465 self.dict[key] = representation
466
467 def delete_by_key(self, key):
468 """Implementation of a `BaseRepresentationCache` method."""
469 self.dict.pop(key, None)
470
471
354BaseWebServiceConfiguration = implement_from_dict(472BaseWebServiceConfiguration = implement_from_dict(
355 "BaseWebServiceConfiguration", IWebServiceConfiguration, {}, object)473 "BaseWebServiceConfiguration", IWebServiceConfiguration, {}, object)
356474
357475
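The DictionaryBasedRepresentationCache docstring suggests that a dict-like object with a maximum size and LRU eviction could be used in place of a plain dict. The following is a minimal sketch of such an object, assuming Python 3 and `collections.OrderedDict`; the class name `LRUDict` and the `max_size` parameter are hypothetical and are not part of lazr.restful. It supports only the operations the dictionary-based cache relies on (`get`, item assignment, deletion, `keys`, `clear`).

```python
from collections import OrderedDict


class LRUDict(OrderedDict):
    """A size-bounded mapping that evicts the least recently used key."""

    def __init__(self, max_size):
        super().__init__()
        self.max_size = max_size

    def get(self, key, default=None):
        if key in self:
            # A successful lookup marks the key as most recently used.
            self.move_to_end(key)
            return super().__getitem__(key)
        return default

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.move_to_end(key)
        # Evict from the least recently used end until within bounds.
        while len(self) > self.max_size:
            self.popitem(last=False)


store = LRUDict(max_size=2)
store["recipes/1"] = "{...}"
store["recipes/2"] = "{...}"
store.get("recipes/1")        # touch recipes/1 so recipes/2 is now oldest
store["recipes/3"] = "{...}"  # exceeds max_size, evicts recipes/2
```

An instance could presumably be passed as `use_dict` to the constructor shown in the diff. It is not thread-safe and bounds the number of entries rather than their total size, so it remains a sketch rather than a production recommendation.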
=== modified file 'src/lazr/restful/version.txt'
--- src/lazr/restful/version.txt 2010-05-10 11:48:42 +0000
+++ src/lazr/restful/version.txt 2010-05-24 14:15:38 +0000
@@ -1,1 +1,1 @@
-0.9.26
+0.9.27
