Launchpad itself

Code review comment for lp:~stub/launchpad/memcache

memcache
Merge into devel

Revision history for this message

Gary Poster (gary) wrote on 2010-02-26:

merge-conditional

Hi Stuart. This is very cool! Thank you.

Might as well clean up the ``#level debug`` comments in launchpad.conf.

I question including the "anonymous" visibility. You bring up a good reason for not including it. Why don't we just exclude it this time around? It feels like something that might be interesting for upstream but not for right now--and like something that people will use unnecessarily. (On the other hand, if you want to push back, in the interest of saving you from waiting for me to wake up on the other side of the globe, you may merely imagine me acquiescing, and keep it, if you like.)

I liked your doctest.

Line 383 of the diff incorrectly describes the syntax, AIUI: <div tal:content="cache:1h public">. Please correct it.

I would be tempted to try to provide a more helpful error message when someone includes too many commas (re "self.visibility, max_age = (s.strip() for s in expr.split(','))" from line 392 of the diff. Line 403 ("value, unit = max_age.split(' ')") is similar. If you can't be bothered, I'll look the other way.

You say that "units is one of 'seconds', 'minutes', 'hours' or 'days'." However, you accept s*, m*, h* and d*. You are much stricter about the visibility strings. I'm inclined to favor enforcing readability, and enforce the full strings. This is negotiable (that is, under the circumstances, you may hear that as "ignorable" if you wish) but I feel more strongly about this one than some others I've brought up with similar deference.

You generate _valid_key_charactersbut then you never use it. Maybe delete it?

I am curious why you are not using SPACE (32) as your delineator, as opposed to the colon, which forces your logic to have to be a bit trickier here and there. Maybe I'll see why later...

I figure you know this is OK because of the DB, but I was very mildly surprised by the confidence of "uid = str(logged_in_user.id)". If you are sure it is safe, that's great.

The way you are handling repeats is very interesting. It's an interesting problem. I first thought that your approach of adding one to the counter_key in the request annotations would not work very well in the case of nested loops, because if something changed, then it would do a very odd cascade that might put an old sub-item from one top-level section into another top-level section entirely. However, if a collection changes significantly from one repeat to the next--an item is inserted somewhere, rather than appended, in particular--you are kind of hosed anyway. I guess I'm fine with what you have, though I'd like it if you added a warning in the doctest that cacheing things in a repeat is perhaps a less appealing prospect than some others.

OK, I asked why you are not using SPACE (32) as your delineator, and now I see those colons in "pt:%s:%s,%s:%s:%d,%d:%d,%s". I see commas too, though. Why can't we use spaces for everything? Contrariwise, if we are separating with colons and commas, why are you not excluding commas from your valid characters? Contrariwise to both of those, or perhaps perpendicularly to them, if the url is at the end, we know that any colon or comma after the ones used in our string formatting pattern belong to the url, so why do we escape any of the delimiters? And then, to add to the excitement, why do we not escape any of the other values in that key--how sure are you that the user id will not have a colon or comma, for instance?

If you answer those questions to your own satisfaction, you may consider me satisfied, and proceed.

base62 is fun :-)

To state the obvious, getKey needs to be very fast or else it will be difficult for this to be a performance win. It looks pretty reasonable to me, though. ISTR that calculating an md5 hash is pretty darn fast. I'll not worry about it.

In Python, I've gained a taste against conditional statements actually doing work ("if getUtility(IMemcacheClient).set(...):") but it's clear enough in context. No change needed.

You have a nice __repr__ on the MemcacheMiss, while you have what I interpret to be a vestigial __unicode__ (before you monkeypatched evaluateText) on MemcacheHit. Would it be reasonable to change MemcacheHit to drop the __unicode__ and add a __repr__? If so, please do.

OK, that's it. Thank you again for a very cool branch, Stuart!

Gary

merge-conditional

Hi Stuart.  This is very cool!  Thank you.

Might as well clean up the ``#level debug`` comments in launchpad.conf.

I question including the "anonymous" visibility.  You bring up a good reason for not including it.  Why don't we just exclude it this time around?  It feels like something that might be interesting for upstream but not for right now--and like something that people will use unnecessarily.  (On the other hand, if you want to push back, in the interest of saving you from waiting for me to wake up on the other side of the globe, you may merely imagine me acquiescing, and keep it, if you like.)

I liked your doctest.

Line 383 of the diff incorrectly describes the syntax, AIUI: <div tal:content="cache:1h public">.  Please correct it.

I would be tempted to try to provide a more helpful error message when someone includes too many commas (re "self.visibility, max_age = (s.strip() for s in expr.split(','))" from line 392 of the diff.  Line 403 ("value, unit = max_age.split(' ')") is similar.  If you can't be bothered, I'll look the other way.

You say that "units is one of 'seconds', 'minutes', 'hours' or 'days'."  However, you accept s*, m*, h* and d*.  You are much stricter about the visibility strings.  I'm inclined to favor enforcing readability, and enforce the full strings.  This is negotiable (that is, under the circumstances, you may hear that as "ignorable" if you wish) but I feel more strongly about this one than some others I've brought up with similar deference.

You generate _valid_key_charactersbut then you never use it.  Maybe delete it?

I am curious why you are not using SPACE (32) as your delineator, as opposed to the colon, which forces your logic to have to be a bit trickier here and there.  Maybe I'll see why later...

I figure you know this is OK because of the DB, but I was very mildly surprised by the confidence of "uid = str(logged_in_user.id)".  If you are sure it is safe, that's great.

The way you are handling repeats is very interesting.  It's an interesting problem.  I first thought that your approach of adding one to the counter_key in the request annotations would not work very well in the case of nested loops, because if something changed, then it would do a very odd cascade that might put an old sub-item from one top-level section into another top-level section entirely.  However, if a collection changes significantly from one repeat to the next--an item is inserted somewhere, rather than appended, in particular--you are kind of hosed anyway.  I guess I'm fine with what you have, though I'd like it if you added a warning in the doctest that cacheing things in a repeat is perhaps a less appealing prospect than some others.

OK, I asked why you are not using SPACE (32) as your delineator, and now I see those colons in "pt:%s:%s,%s:%s:%d,%d:%d,%s".  I see commas too, though.  Why can't we use spaces for everything?  Contrariwise, if we are separating with colons and commas, why are you not excluding commas from your valid characters? Contrariwise to both of those, or perhaps perpendicularly to them, if the url is at the end, we know that any colon or comma after the ones used in our string formatting pattern belong to the url, so why do we escape any of the delimiters?  And then, to add to the excitement, why do we not escape any of the other values in that key--how sure are you that the user id will not have a colon or comma, for instance?

If you answer those questions to your own satisfaction, you may consider me satisfied, and proceed.

base62 is fun :-)

To state the obvious, getKey needs to be very fast or else it will be difficult for this to be a performance win. It looks pretty reasonable to me, though.  ISTR that calculating an md5 hash is pretty darn fast.  I'll not worry about it.

In Python, I've gained a taste against conditional statements actually doing work ("if getUtility(IMemcacheClient).set(...):") but it's clear enough in context.  No change needed.

You have a nice __repr__ on the MemcacheMiss, while you have what I interpret to be a vestigial __unicode__ (before you monkeypatched evaluateText) on MemcacheHit.  Would it be reasonable to change MemcacheHit to drop the __unicode__ and add a __repr__?  If so, please do.

OK, that's it.  Thank you again for a very cool branch, Stuart!

Gary

review: Approve

« Back to merge proposal