Merge lp:~leonardr/lazr.restful/representation-cache into lp:lazr.restful

Proposed by Leonard Richardson
Status: Merged
Merged at revision: 130
Proposed branch: lp:~leonardr/lazr.restful/representation-cache
Merge into: lp:lazr.restful
Diff against target: 861 lines (+581/-40)
10 files modified
src/lazr/restful/NEWS.txt (+10/-0)
src/lazr/restful/_operation.py (+11/-12)
src/lazr/restful/_resource.py (+88/-16)
src/lazr/restful/declarations.py (+11/-4)
src/lazr/restful/docs/webservice-declarations.txt (+6/-1)
src/lazr/restful/example/base/subscribers.py (+1/-0)
src/lazr/restful/example/base/tests/representation-cache.txt (+277/-0)
src/lazr/restful/interfaces/_rest.py (+52/-0)
src/lazr/restful/simple.py (+124/-6)
src/lazr/restful/version.txt (+1/-1)
To merge this branch: bzr merge lp:~leonardr/lazr.restful/representation-cache
Reviewer: Eleanor Berger (community)
Review type: code
Status: Approve
Review via email: mp+25895@code.launchpad.net

Description of the change

This branch makes it possible to store preconstructed string representations of entries in a cache. If an entry is present in the cache, the preconstructed representation is used (and possibly redacted) rather than generating a new representation.

On its own, this isn't a huge performance improvement. The huge performance improvement comes from collections. If you request a page of a collection, and 50% of the entries on that page are present in the cache, 50% of the representations will come from the cache and be incorporated into the collection representation. The other half of the representations will be generated and placed in the cache, so that if you request that page again, _all_ the entry representations will come from the cache.
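
The collection behaviour described above can be illustrated with a minimal sketch. It is not the lazr.restful code: the plain dict standing in for the cache, the use of 'self_link' as the cache key, and the function names are all assumptions made for the example. It only shows the idea of splicing cached and freshly generated entry strings into one collection string.

```python
import json


def entry_representation(entry, cache):
    """Return a JSON string for `entry`, consulting `cache` first.

    `cache` is a plain dict here (an assumption for illustration);
    a cache miss generates the string and stores it for next time.
    """
    key = entry["self_link"]  # hypothetical cache key
    cached = cache.get(key)
    if cached is not None:
        return cached
    representation = json.dumps(entry, sort_keys=True)
    cache[key] = representation
    return representation


def collection_representation(entries, cache):
    """Splice entry strings into a collection string without re-encoding them."""
    entry_strings = [entry_representation(e, cache) for e in entries]
    header = json.dumps(
        {"total_size": len(entries), "start": 0}, sort_keys=True)
    # Drop the header's closing brace, append the entries, then close the hash.
    return header[:-1] + ', "entries": [' + ", ".join(entry_strings) + "]}"
```

After one request for a page, every entry on that page is in the dict, so a second request never calls json.dumps on an entry.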

In my Launchpad performance tests based on memcached and this lazr.restful branch (https://dev.launchpad.net/Foundations/Webservice/Performance#Store%20representations%20in%20memcached), I found that the operation of retrieving a fully cached collection was about five times faster than if there was no cache.

To make this performance win worth the complexity, I had to make the new 'redacted_fields' attribute very very fast. Typically we check whether the user has permission on an attribute by trying to access the attribute and catching an Unauthorized exception. But if the attribute in question is a calculated attribute, accessing it might trigger a database request or something equally slow. We need to use the Zope permission checker directly.

The problem is that the web service doesn't know which field name to pass into the Zope permission checker. If a field's real name is "fooBar" but it's published as "foo_bar" on the web service, the Zope permission checker expects "fooBar" but all the web service has access to is 'foo_bar'.

To get around this problem, I changed the export() declaration to set 'original_name' every time it sets 'as'. In the example above, 'as' will be 'foo_bar', and everything in the web service will call the field 'foo_bar', except for 'redacted_fields', which will look in 'original_name' to find that it needs to pass 'fooBar' into the Zope permission checker.
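
As a rough sketch of the name lookup described above (with stand-ins rather than the real Zope classes): StubChecker and the local Unauthorized exception are hypothetical substitutes for the Zope permission checker and zope.security.interfaces.Unauthorized, and the tag dicts mimic the 'as' and 'original_name' tags. The real redacted_fields in this branch derives the published name by unmarshalling the field; this sketch simply reads it from 'as'.

```python
class Unauthorized(Exception):
    """Stand-in for zope.security.interfaces.Unauthorized."""


class StubChecker:
    """Hypothetical checker that only knows the real attribute names."""

    def __init__(self, forbidden):
        self.forbidden = set(forbidden)

    def check(self, obj, name):
        # Raise if the current user may not read attribute `name`.
        if name in self.forbidden:
            raise Unauthorized(name)


def redacted_fields(obj, exported_tags, checker):
    """Return the published names of fields the user may not see.

    The checker is asked about 'original_name' (e.g. 'fooBar'), but the
    result uses the published name from 'as' (e.g. 'foo_bar'). The
    attribute value itself is never fetched.
    """
    redacted = []
    for tags in exported_tags:
        try:
            checker.check(obj, tags["original_name"])
        except Unauthorized:
            redacted.append(tags["as"])
    return redacted
```

Passing the published name to the checker would not raise, which is exactly the mismatch that 'original_name' is meant to avoid.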

Leonard Richardson (leonardr) wrote :

Although we'll be using a memcached-based cache when we integrate this code into Launchpad, there's no memcached code here. For testing purposes I use a cache that's backed by a Python dict.

Leonard Richardson (leonardr) wrote :

Another thing I forgot to mention: although the cache interface has a hook for removing objects from the cache, lazr.restful itself will never call that hook. It's the responsibility of the application to call that hook when the cache needs invalidation.
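
To make the division of responsibility concrete, here is a minimal sketch of a dict-backed cache plus an application-side invalidation hook. Everything in it is hypothetical: the DictCache class, its key scheme (loosely modelled on the "<versioned URL>,<media type>" keys shown in the doctest), the use of plain path strings as objects, and the on_row_changed function are illustrations, not the code in simple.py.

```python
class DictCache:
    """Hypothetical dict-backed cache with get/set/delete operations."""

    SUPPORTED_MEDIA_TYPES = ("application/json",)

    def __init__(self, storage, versions):
        self.storage = storage    # any dict-like object
        self.versions = versions  # web service version names

    def _key(self, obj, media_type, version):
        # Assumed key scheme; `obj` is a path string such as "recipes/1".
        return "%s/%s,%s" % (version, obj, media_type)

    def get(self, obj, media_type, version, default=None):
        return self.storage.get(self._key(obj, media_type, version), default)

    def set(self, obj, media_type, version, representation):
        self.storage[self._key(obj, media_type, version)] = representation

    def delete(self, obj):
        # Remove every representation: all versions, all media types.
        for version in self.versions:
            for media_type in self.SUPPORTED_MEDIA_TYPES:
                self.storage.pop(self._key(obj, media_type, version), None)


def on_row_changed(cache, obj):
    """Hypothetical data-store hook.

    The application calls this when its own storage layer reports a
    change; the web service layer never calls delete() itself.
    """
    cache.delete(obj)
```

In a real deployment the hook would be wired into whatever change notification the ORM or data store provides.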

Eleanor Berger (intellectronica) :
review: Approve (code)
Gary Poster (gary) wrote :

Summary: After we verify that this general approach gives real-world benefit, I suspect that we should always populate the cache, not only when there are no redacted fields for the current user. We could do this by stripping the security proxy for getting the initial data and populating the cache, and then doing the same logic that you do now for redacting existing JSON caches to get the actual desired end result.

IRC conversation:

[10:43am] gary_poster: leonardr: in your branch, when there is no cache, did you contemplate always generating it, even if there are redacted fields? Example: if no cache, generate dict of the entire non-redacted version; else if cache and redacted fields, parse out cache to dict; else return cache. (Now we have a non-redacted dict, if we are still here.)
[10:43am] gary_poster: Now, redact dict, turn into JSON, and return. There are variations of that, some of which might be better, but I imagine you get the drift.
[11:01am] leonardr: gary, i'm not sure what the benefit would be
[11:01am] leonardr: also, if there are redacted fields we _cannot_ calculate an unredacted cache due to the security policy
[11:05am] gary_poster: leonardr: the goal would be to create a source for further cache hits. This could be particularly important for objects that frequently have one or more fields redacted. In that case, the cache would rarely or, in the worst case, never be filled (and therefore never or rarely used). Since DB access is the main expense, you discovered, I strongly suspect that loading JSON and redacting will be significantly cheaper than simply creating the JSON.
[11:05am] gary_poster: Also, I'm skeptical of "cannot"; isn't it just a matter of doing the usual work with an unproxied object?
[11:07am] leonardr: yes, we would have to strip the proxy
[11:10am] leonardr: ok, i see what you're saying. we would cache it all the time, whether we were sending a redacted version or not
[11:10am] gary_poster: right
[11:11am] leonardr: i could certainly do that in a future branch. do you know of launchpad objects that typically have redacted fields?
[11:13am] gary_poster: bac would probably know, but he's out. My first guess: anything private, or (perhaps more interesting, perhaps not) anything referring to something provate.
[11:13am] gary_poster: private
[11:14am] leonardr: if an object's url contains private information, a link to that url would be redacted
[11:14am] gary_poster: so, that's an example?
[11:14am] leonardr: but i don't know of any specific launchpad object that does that. it's something to look for
[11:15am] gary_poster: bugs that are marked as security issues
[11:15am] gary_poster: private projects
[11:15am] gary_poster: private teams
[11:15am] gary_poster: private bugs
[11:15am] leonardr: so anything that links to those objects might end up redacted
[11:16am] gary_poster: (and there's more coming, if I understand correctly)
[11:16am] gary_poster: right
[11:17am] leonardr: ok, let's get the basic cache working, make sure it improves performance in real situations, and then i'll work on that
[11:17am] gary_poster: cool, makes sense
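
The "always populate the cache" variant discussed in this conversation might look roughly like the sketch below. It is not part of this branch: the function name, the plain dict used as a cache, and the build_full callable (standing in for "generate the representation from the unproxied object") are assumptions. The redaction marker is the value that appears in the branch's doctest.

```python
import json

REDACTED = "tag:launchpad.net:2008:redacted"


def get_representation(key, cache, build_full, redacted_fields):
    """Return a JSON string for one entry, redacted for the current user.

    :param build_full: hypothetical callable returning the complete,
        unredacted dict (i.e. built without the security proxy).
    :param redacted_fields: published names this user may not see.
    """
    cached = cache.get(key)
    if cached is None:
        # Always cache the full version, even if this user
        # will only be sent a redacted copy.
        cached = json.dumps(build_full())
        cache[key] = cached
    if not redacted_fields:
        return cached
    # Parse, redact, and re-serialize; no data-store access is needed.
    data = json.loads(cached)
    for name in redacted_fields:
        data[name] = REDACTED
    return json.dumps(data)
```

With this flow an object that is usually viewed with redactions still fills the cache on first access, which is the benefit gary_poster describes.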

Preview Diff

1=== modified file 'src/lazr/restful/NEWS.txt'
2--- src/lazr/restful/NEWS.txt 2010-05-17 17:53:57 +0000
3+++ src/lazr/restful/NEWS.txt 2010-05-24 14:15:38 +0000
4@@ -2,6 +2,16 @@
5 NEWS for lazr.restful
6 =====================
7
8+0.9.27 (Development)
9+====================
10+
11+Added the ability to define a representation cache used to store the
12+JSON representations of entry resources, rather than building them
13+from scratch every time. Although the cache has hooks for
14+invalidation, lazr.restful will never invalidate any part of the cache
15+on its own. You need to hook lazr.restful's invalidation code into
16+your ORM or other data store.
17+
18 0.9.26 (2010-05-18)
19 ===================
20
21
22=== modified file 'src/lazr/restful/_operation.py'
23--- src/lazr/restful/_operation.py 2010-01-05 19:24:12 +0000
24+++ src/lazr/restful/_operation.py 2010-05-24 14:15:38 +0000
25@@ -84,22 +84,21 @@
26 # If the result is a web service collection, serve only one
27 # batch of the collection.
28 collection = getMultiAdapter((result, self.request), ICollection)
29- result = CollectionResource(collection, self.request).batch()
30+ result = CollectionResource(collection, self.request).batch() + '}'
31 elif self.should_batch(result):
32- result = self.batch(result, self.request)
33-
34- # Serialize the result to JSON. Any embedded entries will be
35- # automatically serialized.
36- try:
37- json_representation = simplejson.dumps(
38- result, cls=ResourceJSONEncoder)
39- except TypeError, e:
40- raise TypeError("Could not serialize object %s to JSON." %
41- result)
42+ result = self.batch(result, self.request) + '}'
43+ else:
44+ # Serialize the result to JSON. Any embedded entries will be
45+ # automatically serialized.
46+ try:
47+ result = simplejson.dumps(result, cls=ResourceJSONEncoder)
48+ except TypeError, e:
49+ raise TypeError("Could not serialize object %s to JSON." %
50+ result)
51
52 self.request.response.setStatus(200)
53 self.request.response.setHeader('Content-Type', self.JSON_TYPE)
54- return json_representation
55+ return result
56
57 def should_batch(self, result):
58 """Whether the given response data should be batched."""
59
60=== modified file 'src/lazr/restful/_resource.py'
61--- src/lazr/restful/_resource.py 2010-05-17 17:52:57 +0000
62+++ src/lazr/restful/_resource.py 2010-05-24 14:15:38 +0000
63@@ -70,7 +70,7 @@
64 from zope.schema.interfaces import (
65 ConstraintNotSatisfied, IBytes, IField, IObject, RequiredMissing)
66 from zope.security.interfaces import Unauthorized
67-from zope.security.proxy import removeSecurityProxy
68+from zope.security.proxy import getChecker, removeSecurityProxy
69 from zope.security.management import checkPermission
70 from zope.traversing.browser import absoluteURL, AbsoluteURL
71 from zope.traversing.browser.interfaces import IAbsoluteURL
72@@ -84,7 +84,7 @@
73 from lazr.restful.interfaces import (
74 ICollection, ICollectionField, ICollectionResource, IEntry, IEntryField,
75 IEntryFieldResource, IEntryResource, IFieldHTMLRenderer, IFieldMarshaller,
76- IHTTPResource, IJSONPublishable, IReferenceChoice,
77+ IHTTPResource, IJSONPublishable, IReferenceChoice, IRepresentationCache,
78 IResourceDELETEOperation, IResourceGETOperation, IResourcePOSTOperation,
79 IScopedCollection, IServiceRootResource, ITopLevelEntryLink,
80 IUnmarshallingDoesntNeedValue, IWebServiceClientRequest,
81@@ -97,7 +97,8 @@
82 WADL_SCHEMA_FILE = os.path.join(os.path.dirname(__file__),
83 'wadl20061109.xsd')
84
85-# Levels of detail to use when unmarshalling the data.
86+# Constants and levels of detail to use when unmarshalling the data.
87+MISSING = object()
88 NORMAL_DETAIL = object()
89 CLOSEUP_DETAIL = object()
90
91@@ -599,7 +600,7 @@
92 def batch(self, entries, request):
93 """Prepare a batch from a (possibly huge) list of entries.
94
95- :return: A hash:
96+ :return: A JSON string representing a hash:
97 'entries' contains a list of EntryResource objects for the
98 entries that actually made it into this batch
99 'total_size' contains the total size of the list.
100@@ -608,6 +609,11 @@
101 'prev_url', if present, contains a URL to get the previous batch
102 in the list.
103 'start' contains the starting index of this batch
104+
105+ Note that the JSON string will be missing its final curly
106+ brace. This is in case the caller wants to add some additional
107+ keys to the JSON hash. It's the caller's responsibility to add
108+ a '}' to the end of the string returned from this method.
109 """
110 if not hasattr(entries, '__len__'):
111 entries = IFiniteSequence(entries)
112@@ -617,8 +623,7 @@
113 resources = [EntryResource(entry, request)
114 for entry in navigator.batch
115 if checkPermission(view_permission, entry)]
116- batch = { 'entries' : resources,
117- 'total_size' : navigator.batch.listlength,
118+ batch = { 'total_size' : navigator.batch.listlength,
119 'start' : navigator.batch.start }
120 if navigator.batch.start < 0:
121 batch['start'] = None
122@@ -628,7 +633,17 @@
123 prev_url = navigator.prevBatchURL()
124 if prev_url != "":
125 batch['prev_collection_link'] = prev_url
126- return batch
127+ json_string = simplejson.dumps(batch, cls=ResourceJSONEncoder)
128+
129+ # String together a bunch of entry representations, possibly
130+ # obtained from a representation cache.
131+ entry_strings = [
132+ resource._representation(HTTPResource.JSON_TYPE)
133+ for resource in resources]
134+ json_string = (json_string[:-1] + ', "entries": ['
135+ + (", ".join(entry_strings) + ']'))
136+ # The caller is responsible for tacking on the final curly brace.
137+ return json_string
138
139
140 class CustomOperationResourceMixin:
141@@ -708,8 +723,6 @@
142 return "DELETE not supported."
143 return operation()
144
145- return operation()
146-
147
148 class FieldUnmarshallerMixin:
149
150@@ -733,12 +746,11 @@
151
152 :return: a 2-tuple (representation_name, representation_value).
153 """
154- missing = object()
155- cached_value = missing
156+ cached_value = MISSING
157 if detail is NORMAL_DETAIL:
158 cached_value = self._unmarshalled_field_cache.get(
159- field_name, missing)
160- if cached_value is not missing:
161+ field_name, MISSING)
162+ if cached_value is not MISSING:
163 return cached_value
164
165 field = field.bind(self.context)
166@@ -1442,6 +1454,29 @@
167 self.request), self.request),
168 adapter.singular_type)
169
170+ @property
171+ def redacted_fields(self):
172+ """Names the fields the current user doesn't have permission to see."""
173+ failures = []
174+ checker = getChecker(self.context)
175+ for name, field in getFieldsInOrder(self.entry.schema):
176+ try:
177+ # Can we view the field's value? We check the
178+ # permission directly using the Zope permission
179+ # checker, because doing it indirectly by fetching the
180+ # value may have very slow side effects such as
181+ # database hits.
182+ tagged_values = field.getTaggedValue('lazr.restful.exported')
183+ original_name = tagged_values['original_name']
184+ checker.check(self.context, original_name)
185+ except Unauthorized:
186+ # This is an expensive operation that will make this
187+ # request more expensive still, but it happens
188+ # relatively rarely.
189+ repr_name, repr_value = self._unmarshallField(name, field)
190+ failures.append(repr_name)
191+ return failures
192+
193 def isModifiableField(self, field, is_external_client):
194 """Returns true if this field's value can be changed.
195
196@@ -1463,10 +1498,45 @@
197
198 def _representation(self, media_type):
199 """Return a representation of this entry, of the given media type."""
200+
201 if media_type in [self.WADL_TYPE, self.DEPRECATED_WADL_TYPE]:
202 return self.toWADL().encode("utf-8")
203 elif media_type == self.JSON_TYPE:
204- return simplejson.dumps(self, cls=ResourceJSONEncoder)
205+ cache = None
206+ try:
207+ cache = getUtility(IRepresentationCache)
208+ representation = cache.get(
209+ self.context, self.JSON_TYPE, self.request.version)
210+ except ComponentLookupError:
211+ # There's no representation cache.
212+ representation = None
213+
214+ redacted_fields = self.redacted_fields
215+ if representation is None:
216+ # Either there is no cache, or the representation
217+ # wasn't in the cache.
218+ representation = simplejson.dumps(self, cls=ResourceJSONEncoder)
219+ # If there's a cache, and this representation doesn't
220+ # contain any redactions, store it in the cache.
221+ if cache is not None and len(redacted_fields) == 0:
222+ cache.set(self.context, self.JSON_TYPE,
223+ self.request.version, representation)
224+ else:
225+ # We have a representation, but we might not be able
226+ # to use it as-is.
227+ if len(redacted_fields) != 0:
228+ # We can't use the representation as is. We need
229+ # to deserialize it, redact certain fields, and
230+ # reserialize it. Hopefully this is faster than
231+ # generating the representation from scratch!
232+ json = simplejson.loads(representation)
233+ for field in redacted_fields:
234+ json[field] = self.REDACTED_VALUE
235+ # There's no need to use the ResourceJSONEncoder,
236+ # because we loaded the cached representation
237+ # using the standard decoder.
238+ representation = simplejson.dumps(json)
239+ return representation
240 elif media_type == self.XHTML_TYPE:
241 return self.toXHTML().encode("utf-8")
242 else:
243@@ -1516,7 +1586,7 @@
244 result = self.batch(entries)
245
246 self.request.response.setHeader('Content-type', self.JSON_TYPE)
247- return simplejson.dumps(result, cls=ResourceJSONEncoder)
248+ return result
249
250 def batch(self, entries=None):
251 """Return a JSON representation of a batch of entries.
252@@ -1526,7 +1596,9 @@
253 if entries is None:
254 entries = self.collection.find()
255 result = super(CollectionResource, self).batch(entries, self.request)
256- result['resource_type_link'] = self.type_url
257+ result += (
258+ ', "resource_type_link" : ' + simplejson.dumps(self.type_url)
259+ + '}')
260 return result
261
262 @property
263
264=== modified file 'src/lazr/restful/declarations.py'
265--- src/lazr/restful/declarations.py 2010-02-25 17:07:16 +0000
266+++ src/lazr/restful/declarations.py 2010-05-24 14:15:38 +0000
267@@ -151,10 +151,17 @@
268 if tag_stack['type'] != FIELD_TYPE:
269 continue
270 for version, tags in tag_stack.stack:
271- # Set 'as' for every version in which the field is published
272- # but no 'as' is specified.
273- if tags.get('as') is None and tags.get('exported') != False:
274- tags['as'] = name
275+ # Set 'as' for every version in which the field is
276+ # published but no 'as' is specified. Also set
277+ # 'original_name' for every version in which the field
278+ # is published--this will help with performance
279+ # optimizations around permission checks.
280+ if tags.get('exported') != False:
281+ tags['original_name'] = name
282+ if tags.get('as') is None:
283+ tags['as'] = name
284+
285+
286
287 annotate_exported_methods(interface)
288 return interface
289
290=== modified file 'src/lazr/restful/docs/webservice-declarations.txt'
291--- src/lazr/restful/docs/webservice-declarations.txt 2010-04-14 14:56:46 +0000
292+++ src/lazr/restful/docs/webservice-declarations.txt 2010-05-24 14:15:38 +0000
293@@ -49,7 +49,7 @@
294 ...
295 ... inventory_number = TextLine(title=u'The inventory part number.')
296
297-These declarations adds tagged value to the original interface elements.
298+These declarations add tagged values to the original interface elements.
299 The tags are in the lazr.restful namespace and are dictionaries of
300 elements.
301
302@@ -74,12 +74,15 @@
303 type: 'entry'
304 >>> print_export_tag(IBook['title'])
305 as: 'title'
306+ original_name: 'title'
307 type: 'field'
308 >>> print_export_tag(IBook['author'])
309 as: 'author'
310+ original_name: 'author'
311 type: 'field'
312 >>> print_export_tag(IBook['base_price'])
313 as: 'price'
314+ original_name: 'base_price'
315 type: 'field'
316 >>> print_export_tag(IBook['inventory_number'])
317 tag 'lazr.restful.exported' is not present
318@@ -751,9 +754,11 @@
319 ... print_export_tag(IUser[name])
320 == name ==
321 as: 'name'
322+ original_name: 'name'
323 type: 'field'
324 == nickname ==
325 as: 'nickname'
326+ original_name: 'nickname'
327 type: 'field'
328 == rename ==
329 as: 'rename'
330
331=== modified file 'src/lazr/restful/example/base/subscribers.py'
332--- src/lazr/restful/example/base/subscribers.py 2009-09-01 14:37:41 +0000
333+++ src/lazr/restful/example/base/subscribers.py 2010-05-24 14:15:38 +0000
334@@ -5,6 +5,7 @@
335 __metaclass__ = type
336 __all__ = ['update_cookbook_revision_number']
337
338+from zope.interface import Interface
339 import grokcore.component
340 from lazr.lifecycle.interfaces import IObjectModifiedEvent
341 from lazr.restful.example.base.interfaces import ICookbook
342
343=== added file 'src/lazr/restful/example/base/tests/representation-cache.txt'
344--- src/lazr/restful/example/base/tests/representation-cache.txt 1970-01-01 00:00:00 +0000
345+++ src/lazr/restful/example/base/tests/representation-cache.txt 2010-05-24 14:15:38 +0000
346@@ -0,0 +1,277 @@
347+**********************************
348+The in-memory representation cache
349+**********************************
350+
351+Rather than having lazr.restful calculate a representation of an entry
352+every time it's requested, you can register an object as the
353+representation cache. String representations of entries are generated
354+once and stored in the representation cache.
355+
356+lazr.restful works fine when there is no representation cache
357+installed; in fact, this is the only test that uses one.
358+
359+ >>> from zope.component import getUtility
360+ >>> from lazr.restful.interfaces import IRepresentationCache
361+ >>> getUtility(IRepresentationCache)
362+ Traceback (most recent call last):
363+ ...
364+ ComponentLookupError: ...
365+
366+DictionaryBasedRepresentationCache
367+==================================
368+
369+A representation cache can be any object that implements
370+IRepresentationCache, but for test purposes we'll be using a simple
371+DictionaryBasedRepresentationCache. This object transforms the
372+IRepresentationCache operations into operations on a Python dict-like
373+object.
374+
375+ >>> from lazr.restful.simple import DictionaryBasedRepresentationCache
376+ >>> dictionary = {}
377+ >>> cache = DictionaryBasedRepresentationCache(dictionary)
378+
379+It's not a good idea to use a normal Python dict in production,
380+because there's no limit on how large the dict can become. In a real
381+situation you want something with an LRU implementation. That said,
382+let's see how the DictionaryBasedRepresentationCache works.
383+
384+All IRepresentationCache implementations will cache a representation
385+under a key derived from the object whose representation it is, the
386+media type of the representation, and a web service version name.
387+
388+ >>> from lazr.restful.example.base.root import C4 as greens_object
389+ >>> json = "application/json"
390+ >>> print cache.get(greens_object, json, "devel")
391+ None
392+ >>> print cache.get(greens_object, json, "devel", "missing")
393+ missing
394+
395+ >>> cache.set(greens_object, json, "devel", "This is the 'devel' value.")
396+ >>> print cache.get(greens_object, json, "devel")
397+ This is the 'devel' value.
398+ >>> sorted(dictionary.keys())
399+ ['http://cookbooks.dev/devel/cookbooks/Everyday%20Greens,application/json']
400+
401+This allows different representations of the same object to be stored
402+for different versions.
403+
404+ >>> cache.set(greens_object, json, "1.0", "This is the '1.0' value.")
405+ >>> print cache.get(greens_object, json, "1.0")
406+ This is the '1.0' value.
407+ >>> sorted(dictionary.keys())
408+ ['http://cookbooks.dev/1.0/cookbooks/Everyday%20Greens,application/json',
409+ 'http://cookbooks.dev/devel/cookbooks/Everyday%20Greens,application/json']
410+
411+Deleting an object from the cache will remove all its representations.
412+
413+ >>> cache.delete(greens_object)
414+ >>> sorted(dictionary.keys())
415+ []
416+ >>> print cache.get(greens_object, json, "devel")
417+ None
418+ >>> print cache.get(greens_object, json, "1.0")
419+ None
420+
421+A representation cache
422+======================
423+
424+Now let's register our DictionaryBasedRepresentationCache as the
425+representation cache for this web service, and see how it works within
426+lazr.restful.
427+
428+ >>> from zope.component import getSiteManager
429+ >>> sm = getSiteManager()
430+ >>> sm.registerUtility(cache, IRepresentationCache)
431+
432+ >>> from lazr.restful.testing.webservice import WebServiceCaller
433+ >>> webservice = WebServiceCaller(domain='cookbooks.dev')
434+
435+When we retrieve a JSON representation of an entry, that
436+representation is added to the cache.
437+
438+ >>> ignored = webservice.get("/recipes/1")
439+ >>> [the_only_key] = dictionary.keys()
440+ >>> print the_only_key
441+ http://cookbooks.dev/devel/recipes/1,application/json
442+
443+Note that the cache key incorporates the web service version name
444+("devel") and the media type of the representation
445+("application/json").
446+
447+Associated with the key is a string: the JSON representation of the object.
448+
449+ >>> import simplejson
450+ >>> print simplejson.loads(dictionary[the_only_key])['self_link']
451+ http://cookbooks.dev/devel/recipes/1
452+
453+If we get a representation of the same resource from a different web
454+service version, that representation is stored separately.
455+
456+ >>> ignored = webservice.get("/recipes/1", api_version="1.0")
457+ >>> for key in sorted(dictionary.keys()):
458+ ... print key
459+ http://cookbooks.dev/1.0/recipes/1,application/json
460+ http://cookbooks.dev/devel/recipes/1,application/json
461+
462+ >>> key1 = "http://cookbooks.dev/1.0/recipes/1,application/json"
463+ >>> key2 = "http://cookbooks.dev/devel/recipes/1,application/json"
464+ >>> dictionary[key1] == dictionary[key2]
465+ False
466+
467+Cache invalidation
468+==================
469+
470+lazr.restful does not automatically invalidate the representation
471+cache, because it only knows about a subset of the changes that might
472+invalidate the cache--the changes that happen through the web service
473+itself.
474+
475+If you want to invalidate the cache whenever the web service changes
476+an object, you can write a listener for ObjectModifiedEvent objects
477+(see doc/webservice.txt for an example). But most of the time, you'll
478+want to invalidate the cache when something deeper happens--something
479+like a change to the objects in your ORM.
480+
481+Let's signal a change to recipe #1. Let's say someone changed that
482+recipe, using a web application that has no connection to the web
483+service except for a shared database. We can detect the database
484+change, but what do we do when that change happens?
485+
486+Here's the recipe object.
487+
488+ >>> from lazr.restful.example.base.root import RECIPES
489+ >>> recipe = [recipe for recipe in RECIPES if recipe.id == 1][0]
490+
491+To remove its representation from the cache, we pass it into the
492+cache's delete() method.
493+
494+ >>> print cache.get(recipe, json, 'devel')
495+ {...}
496+ >>> cache.delete(recipe)
497+
498+All the relevant representations are deleted.
499+
500+ >>> print cache.get(recipe, json, 'devel')
501+ None
502+ >>> dictionary.keys()
503+ []
504+
505+Data visibility
506+===============
507+
508+Only full representations are added to the cache. If the
509+representation you request includes a redacted field (because you
510+don't have permission to see that field's true value), the
511+representation is not added to the cache.
512+
513+ >>> from urllib import quote
514+ >>> greens_url = quote("/cookbooks/Everyday Greens")
515+ >>> greens = webservice.get(greens_url).jsonBody()
516+ >>> print greens['confirmed']
517+ tag:launchpad.net:2008:redacted
518+
519+ >>> dictionary.keys()
520+ []
521+
522+This means that if your entry resources typically contain data that's
523+only visible to a select few users, you won't get much benefit out of
524+a representation cache.
525+
526+What if a full representation is in the cache, and the user requests a
527+representation that must be redacted? Let's put some semi-fake data in
528+the cache and find out.
529+
530+ >>> import simplejson
531+ >>> greens['name'] = "This comes from the cache; it is not generated."
532+ >>> greens['confirmed'] = True
533+ >>> cache.set(greens_object, json, 'devel', simplejson.dumps(greens))
534+
535+When we GET the corresponding resource, we get a representation that
536+definitely comes from the cache, not the original data source.
537+
538+ >>> cached_greens = webservice.get(greens_url).jsonBody()
539+ >>> print cached_greens['name']
540+ This comes from the cache; it is not generated.
541+
542+But the redacted value is still redacted.
543+
544+ >>> print cached_greens['confirmed']
545+ tag:launchpad.net:2008:redacted
546+
547+Cleanup: clear the cache.
548+
549+ >>> dictionary.clear()
550+
551+Collections
552+===========
553+
554+Collections are full of entries, and representations of collections
555+are built from the cache if possible. We'll demonstrate this with the
556+collection of recipes.
557+
558+First, we'll hack the cached representation of a single recipe.
559+
560+ >>> recipe = webservice.get("/recipes/1").jsonBody()
561+ >>> recipe['instructions'] = "This representation is from the cache."
562+ >>> [recipe_key] = dictionary.keys()
563+ >>> dictionary[recipe_key] = simplejson.dumps(recipe)
564+
565+Now, we get the collection of recipes.
566+
567+ >>> recipes = webservice.get("/recipes").jsonBody()['entries']
568+
569+The fake instructions we put into an entry's cached representation are
570+also present in the collection.
571+
572+ >>> for instructions in (
573+ ... sorted(recipe['instructions'] for recipe in recipes)):
574+ ... print instructions
575+ A perfectly roasted chicken is...
576+ Draw, singe, stuff, and truss...
577+ ...
578+ This representation is from the cache.
579+
580+To build the collection, lazr.restful had to generate representations
581+of all the cookbook entries. As it generated each representation, it
582+populated the cache.
583+
584+ >>> for key in sorted(dictionary.keys()):
585+ ... print key
586+ http://cookbooks.dev/devel/recipes/1,application/json
587+ http://cookbooks.dev/devel/recipes/2,application/json
588+ http://cookbooks.dev/devel/recipes/3,application/json
589+ http://cookbooks.dev/devel/recipes/4,application/json
590+
591+If we request the collection again, all the entry representations will
592+come from the cache.
593+
594+ >>> for key in dictionary.keys():
595+ ... value = simplejson.loads(dictionary[key])
596+ ... value['instructions'] = "This representation is from the cache."
597+ ... dictionary[key] = simplejson.dumps(value)
598+
599+ >>> recipes = webservice.get("/recipes").jsonBody()['entries']
600+ >>> for instructions in (
601+ ... sorted(recipe['instructions'] for recipe in recipes)):
602+ ... print instructions
603+ This representation is from the cache.
604+ This representation is from the cache.
605+ This representation is from the cache.
606+ This representation is from the cache.
607+
608+Cleanup: de-register the cache.
609+
610+ >>> sm.registerUtility(None, IRepresentationCache)
611+
612+Of course, the hacks we made to the cached representations have no
613+effect on the objects themselves. Once the hacked cache is gone, the
614+representations look just as they did before.
615+
616+ >>> recipes = webservice.get("/recipes").jsonBody()['entries']
617+ >>> for instructions in (
618+ ... sorted(recipe['instructions'] for recipe in recipes)):
619+ ... print instructions
620+ A perfectly roasted chicken is...
621+ Draw, singe, stuff, and truss...
622+ Preheat oven to...
623+ You can always judge...
624
625=== modified file 'src/lazr/restful/interfaces/_rest.py'
626--- src/lazr/restful/interfaces/_rest.py 2010-05-17 17:52:57 +0000
627+++ src/lazr/restful/interfaces/_rest.py 2010-05-24 14:15:38 +0000
628@@ -34,6 +34,7 @@
629 'IHTTPResource',
630 'IJSONPublishable',
631 'IJSONRequestCache',
632+ 'IRepresentationCache',
633 'IResourceOperation',
634 'IResourceGETOperation',
635 'IResourceDELETEOperation',
636@@ -606,4 +607,55 @@
637 """Traverse to a sub-object."""
638
639
640+class IRepresentationCache(Interface):
641+ """A cache for resource representations.
642+
643+ Register an object as the utility for this interface and
644+ lazr.restful will use that object to cache resource
645+ representations. If no object is registered as the utility,
646+ representations will not be cached.
647+
648+ This is designed to be used with memcached, but you can plug in
649+ other key-value stores. Note that this cache is intended to store
650+ string representations, not deserialized JSON objects or anything
651+ else.
652+ """
653+
654+ def get(object, media_type, version, default=None):
655+ """Retrieve a representation from the cache.
656+
657+ :param object: An IEntry--the object whose representation you want.
658+ :param media_type: The media type of the representation to get.
659+ :param version: The version of the web service for which to
660+ fetch a representation.
661+ :param default: The object to return if no representation is
662+ cached for this object.
663+
664+ :return: A string representation, or `default`.
665+ """
666+ pass
667+
668+ def set(object, media_type, version, representation):
669+ """Add a representation to the cache.
670+
671+ :param object: An IEntry--the object whose representation this is.
672+ :param media_type: The media type of the representation.
673+ :param version: The version of the web service in which this
674+ representation should be stored.
675+ :param representation: The string representation to store.
676+ """
677+ pass
678+
679+ def delete(object):
680+ """Remove *all* of an object's representations from the cache.
681+
682+ This means representations for every (supported) media type
683+ and every version of the web service. Currently the only
684+ supported media type is 'application/json'.
685+
686+ :param object: An IEntry--the object being represented.
687+ """
688+ pass
689+
690+
691 InvalidBatchSizeError.__lazr_webservice_error__ = 400
692
693=== modified file 'src/lazr/restful/simple.py'
694--- src/lazr/restful/simple.py 2010-01-28 15:33:31 +0000
695+++ src/lazr/restful/simple.py 2010-05-24 14:15:38 +0000
696@@ -2,7 +2,9 @@
697
698 __metaclass__ = type
699 __all__ = [
700+ 'BaseRepresentationCache',
701 'BaseWebServiceConfiguration',
702+ 'DictionaryBasedRepresentationCache',
703 'IMultiplePathPartLocation',
704 'MultiplePathPartAbsoluteURL',
705 'Publication',
706@@ -24,19 +26,24 @@
707 from zope.publisher.publish import mapply
708 from zope.proxy import sameProxiedObjects
709 from zope.security.management import endInteraction, newInteraction
710-from zope.traversing.browser import AbsoluteURL as ZopeAbsoluteURL
711+from zope.traversing.browser import (
712+ absoluteURL, AbsoluteURL as ZopeAbsoluteURL)
713 from zope.traversing.browser.interfaces import IAbsoluteURL
714 from zope.traversing.browser.absoluteurl import _insufficientContext, _safe
715
716 import grokcore.component
717
718-from lazr.restful import EntryAdapterUtility, ServiceRootResource
719+from lazr.restful import (
720+ EntryAdapterUtility, HTTPResource, ServiceRootResource)
721 from lazr.restful.interfaces import (
722- IServiceRootResource, ITopLevelEntryLink, ITraverseWithGet,
723- IWebServiceConfiguration, IWebServiceLayer)
724+ IRepresentationCache, IServiceRootResource, ITopLevelEntryLink,
725+ ITraverseWithGet, IWebServiceConfiguration, IWebServiceLayer)
726 from lazr.restful.publisher import (
727- WebServicePublicationMixin, WebServiceRequestTraversal)
728-from lazr.restful.utils import implement_from_dict
729+ browser_request_to_web_service_request, WebServicePublicationMixin,
730+ WebServiceRequestTraversal)
731+from lazr.restful.utils import (
732+ get_current_browser_request, implement_from_dict,
733+ tag_request_with_version_name)
734
735
736 class PublicationMixin(object):
737@@ -351,6 +358,117 @@
738 __call__ = __str__
739
740
741+class BaseRepresentationCache(object):
742+ """A useful base class for representation caches.
743+
744+ When an object is invalidated, all of its representations must be
745+ removed from the cache. This means representations of every media
746+ type for every version of the web service. Subclass this class and
747+ you won't have to worry about removing everything. You can focus
748+ on implementing key_for() and delete_by_key(), which takes the
749+ return value of key_for() instead of a raw object.
750+
751+ You can also implement set_by_key() and get_by_key(), which also
752+ take the return value of key_for(), instead of set() and get().
753+ """
754+ implements(IRepresentationCache)
755+
756+ def get(self, obj, media_type, version, default=None):
757+ """See `IRepresentationCache`."""
758+ key = self.key_for(obj, media_type, version)
759+ return self.get_by_key(key, default)
760+
761+ def set(self, obj, media_type, version, representation):
762+ """See `IRepresentationCache`."""
763+ key = self.key_for(obj, media_type, version)
764+ return self.set_by_key(key, representation)
765+
766+ def delete(self, object):
767+ """See `IRepresentationCache`."""
768+ config = getUtility(IWebServiceConfiguration)
769+ for version in config.active_versions:
770+ key = self.key_for(object, HTTPResource.JSON_TYPE, version)
771+ self.delete_by_key(key)
772+
773+ def key_for(self, object, media_type, version):
774+ """Generate a unique key for an object/media type/version.
775+
776+ :param object: An IEntry--the object whose representation you want.
777+ :param media_type: The media type of the representation to get.
778+ :param version: The version of the web service for which to
779+ fetch a representation.
780+ """
781+ raise NotImplementedError()
782+
783+ def get_by_key(self, key, default=None):
784+ """Retrieve a representation from the cache, given a key.
785+
786+ :param key: The cache key.
787+ """
788+ raise NotImplementedError()
789+
790+ def set_by_key(self, key, representation):
791+ """Store a representation in the cache, given a key.
792+
793+ :param key: The cache key.
794+ """
795+ raise NotImplementedError()
796+
797+ def delete_by_key(self, key):
798+ """Delete a representation from the cache, given a key.
799+
800+ :key: The cache key.
801+ """
802+ raise NotImplementedError()
803+
804+
805+class DictionaryBasedRepresentationCache(BaseRepresentationCache):
806+ """A representation cache that uses an in-memory dict.
807+
808+ This cache transforms IRepresentationCache operations into
809+ operations on a dictionary.
810+
811+ Don't use a Python dict object in a production installation! It
812+ can easily grow to take up all available memory. If you implement
813+ a dict-like object that maintains a maximum size with an LRU
814+ algorithm or something similar, you can use that. But this class
815+ was written for testing.
816+ """
817+ def __init__(self, use_dict):
818+ """Constructor.
819+
820+ :param use_dict: A dictionary to keep representations in. As
821+ noted in the class docstring, in a production installation
822+ it's a very bad idea to use a standard Python dict object.
823+ """
824+ self.dict = use_dict
825+
826+ def key_for(self, obj, media_type, version):
827+ """See `BaseRepresentationCache`."""
828+ # Create a fake web service request for the appropriate version.
829+ config = getUtility(IWebServiceConfiguration)
830+ web_service_request = config.createRequest("", {})
831+ web_service_request.setVirtualHostRoot(
832+ names=[config.path_override, version])
833+ tag_request_with_version_name(web_service_request, version)
834+
835+ # Use that request to create a versioned URL for the object.
836+ value = absoluteURL(obj, web_service_request) + ',' + media_type
837+ return value
838+
839+ def get_by_key(self, key, default=None):
840+ """See `BaseRepresentationCache`."""
841+ return self.dict.get(key, default)
842+
843+ def set_by_key(self, key, representation):
844+ """See `BaseRepresentationCache`."""
845+ self.dict[key] = representation
846+
847+ def delete_by_key(self, key):
848+ """Implementation of a `BaseRepresentationCache` method."""
849+ self.dict.pop(key, None)
850+
851+
852 BaseWebServiceConfiguration = implement_from_dict(
853 "BaseWebServiceConfiguration", IWebServiceConfiguration, {}, object)
854
855
856=== modified file 'src/lazr/restful/version.txt'
857--- src/lazr/restful/version.txt 2010-05-10 11:48:42 +0000
858+++ src/lazr/restful/version.txt 2010-05-24 14:15:38 +0000
859@@ -1,1 +1,1 @@
860-0.9.26
861+0.9.27
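
The DictionaryBasedRepresentationCache docstring above warns against a plain dict in production and suggests a dict-like object that caps its size with an LRU policy. A minimal sketch of such an object follows. It is not part of this branch: the class name `BoundedLRUDict` and the `max_size` parameter are hypothetical, and it only provides the operations the dictionary-based cache performs on its store (`get`, item assignment, item deletion, plus `pop`).

```python
from collections import OrderedDict


class BoundedLRUDict:
    """Hypothetical dict-like store that evicts the least recently used key."""

    def __init__(self, max_size):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        self._data = OrderedDict()

    def get(self, key, default=None):
        """Return the stored value and mark the key as recently used."""
        if key not in self._data:
            return default
        self._data.move_to_end(key)
        return self._data[key]

    def __setitem__(self, key, value):
        """Store a value, evicting the oldest entries if over capacity."""
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)

    def __delitem__(self, key):
        del self._data[key]

    def pop(self, key, default=None):
        """Remove a key if present; never raises for a missing key."""
        return self._data.pop(key, default)

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)


# Illustrative keys only, shaped like "<versioned URL>,<media type>".
store = BoundedLRUDict(max_size=2)
store["/1.0/recipes/1,application/json"] = '{"id": 1}'
store["/1.0/recipes/2,application/json"] = '{"id": 2}'
store.get("/1.0/recipes/1,application/json")  # recipe 1 is now most recent
store["/1.0/recipes/3,application/json"] = '{"id": 3}'  # evicts recipe 2
```

Under these assumptions, an instance could be passed as `use_dict` to DictionaryBasedRepresentationCache in place of a plain dict. The sketch is single-threaded; a multi-threaded server would need locking around each operation.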
