Merge lp:~leonardr/lazr.restful/understand-mod-compress into lp:lazr.restful

Proposed by Leonard Richardson
Status: Merged
Approved by: Gary Poster
Approved revision: 130
Merged at revision: 129
Proposed branch: lp:~leonardr/lazr.restful/understand-mod-compress
Merge into: lp:lazr.restful
Diff against target: 223 lines (+127/-21)
4 files modified
src/lazr/restful/NEWS.txt (+11/-4)
src/lazr/restful/_resource.py (+17/-1)
src/lazr/restful/docs/webservice.txt (+66/-16)
src/lazr/restful/interfaces/_rest.py (+33/-0)
To merge this branch: bzr merge lp:~leonardr/lazr.restful/understand-mod-compress
Reviewer Review Type Date Requested Status
Gary Poster Approve
Review via email: mp+25454@code.launchpad.net

Description of the change

This branch creates a workaround that should resolve bug 574697. Basically, I change lazr.restful to handle the bad behavior or mod_compress. https://issues.apache.org/bugzilla/show_bug.cgi?id=39727 has the history. Basically, we should not be using mod_compress at all, but because our server setup includes multiple layers of HTTP intermediaries that don't understand transfer-encoding, we have no choice.

This branch makes it possible for a lazr.restful application to detect when an ETag was modified by mod_compress, and handle it as though mod_compress were not present. This will let us re-enable mod_compress on the 'api' vhost in Launchpad installations.

To post a comment you must log in.
Revision history for this message
Gary Poster (gary) wrote :

Thank you

review: Approve
Revision history for this message
Robert Collins (lifeless) wrote :

Isn't apache the right place to fix this? Why add cruft to the
launchpad code base, when it is probably a very small patch to
mod_compress to fix it?

Revision history for this message
Leonard Richardson (leonardr) wrote :

I believe we discussed this on IRC, but for the record: there is no "small change" to Apache that is acceptable to the Apache maintainers. https://issues.apache.org/bugzilla/attachment.cgi?id=23051 contains a small change that solves the problem for us, but because it's not a fully general solution (https://issues.apache.org/bugzilla/show_bug.cgi?id=39727#c31 explains why), it has not been applied. Solving the problem in Apache requires allowing each server in a chain of servers to append a _different_ string to the ETag. Or, it requires making transfer-encoding work in a setup with chained HTTP intermediaries, so that we don't have to do all these content-encoding hacks.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'src/lazr/restful/NEWS.txt'
--- src/lazr/restful/NEWS.txt 2010-05-10 12:11:00 +0000
+++ src/lazr/restful/NEWS.txt 2010-05-17 18:15:41 +0000
@@ -2,8 +2,15 @@
2NEWS for lazr.restful2NEWS for lazr.restful
3=====================3=====================
44
50.9.26 (Development)50.9.26 (2010-05-18)
6====================6===================
7
8Special note: This version adds a new configuration element,
9'compensate_for_mod_compress_etag_modification'. If you are running
10lazr.restful behind an Apache server, setting this configuration
11element will make mod_compress work properly with lazr.restful. This
12is not a permanent solution: a better solution will be available when
13Apache bug 39727 is fixed.
714
8Special note: This version removes the configuration element15Special note: This version removes the configuration element
9'set_hop_to_hop_headers'. You can still define this element in your16'set_hop_to_hop_headers'. You can still define this element in your
@@ -12,8 +19,8 @@
12Removed code that handles compression through hop-to-hop19Removed code that handles compression through hop-to-hop
13headers. We've never encountered a real situation in which these20headers. We've never encountered a real situation in which these
14headers were useful. Compression can and should be handled by21headers were useful. Compression can and should be handled by
15intermediaries such as mod_compress. The Apache bug that set us on22intermediaries such as mod_compress. (Unfortunately, mod_compress has
16this path in the first place has been fixed for a long time.23its own problems, which this release tries to work around.)
1724
180.9.25 (2010-04-14)250.9.25 (2010-04-14)
19===================26===================
2027
=== modified file 'src/lazr/restful/_resource.py'
--- src/lazr/restful/_resource.py 2010-05-10 11:48:42 +0000
+++ src/lazr/restful/_resource.py 2010-05-17 18:15:41 +0000
@@ -448,14 +448,30 @@
448 """Extract a list of ETags from a header and parse the list.448 """Extract a list of ETags from a header and parse the list.
449449
450 RFC2616 allows multiple comma-separated ETags.450 RFC2616 allows multiple comma-separated ETags.
451
452 If the compensate_for_mod_compress_etag_modification value is set,
453 ETags like '"foo"-gzip' and '"foo-gzip"' will be transformed into
454 '"foo"'.
451 """455 """
452 header = self.request.getHeader(header_name)456 header = self.request.getHeader(header_name)
453 if header is None:457 if header is None:
454 return []458 return []
459 utility = getUtility(IWebServiceConfiguration)
460 strip_gzip_from_etag = (
461 utility.compensate_for_mod_compress_etag_modification)
462 etags = []
455 # We're kind of cheating, because entity tags can technically463 # We're kind of cheating, because entity tags can technically
456 # have commas in them, but none of our tags contain commas, so464 # have commas in them, but none of our tags contain commas, so
457 # this will work.465 # this will work.
458 return [etag.strip() for etag in header.split(',')]466 for etag in header.split(','):
467 etag = etag.strip()
468 if strip_gzip_from_etag:
469 if etag.endswith('-gzip'):
470 etag = etag[:-5]
471 elif etag.endswith('-gzip"'):
472 etag = etag[:-6] + '"'
473 etags.append(etag)
474 return etags
459475
460 def _fieldValueIsObject(self, field):476 def _fieldValueIsObject(self, field):
461 """Does the given field expect a data model object as its value?477 """Does the given field expect a data model object as its value?
462478
=== modified file 'src/lazr/restful/docs/webservice.txt'
--- src/lazr/restful/docs/webservice.txt 2010-04-14 14:56:46 +0000
+++ src/lazr/restful/docs/webservice.txt 2010-05-17 18:15:41 +0000
@@ -1595,28 +1595,33 @@
1595sends a response code of 304 ("Not Modified") instead of sending the1595sends a response code of 304 ("Not Modified") instead of sending the
1596same representation again.1596same representation again.
15971597
1598 >>> headers = {'CONTENT_TYPE' : 'application/json',1598First, let's define a helper method to request a specific entry
1599 ... 'HTTP_IF_NONE_MATCH' : etag_original }1599resource, and gather the entity-body and the response object into an
1600 >>> get_request = create_web_service_request(julia_url, environ=headers)1600easily accessible data structure.
1601 >>> get_request.traverse(app)()1601
1602 ''1602 >>> def get_julia(etag=None):
1603 >>> get_request.response.getStatus()1603 ... headers = {'CONTENT_TYPE' : 'application/json'}
1604 ... if etag is not None:
1605 ... headers['HTTP_IF_NONE_MATCH'] = etag
1606 ... get_request = create_web_service_request(
1607 ... julia_url, environ=headers)
1608 ... entity_body = get_request.traverse(app)()
1609 ... return dict(entity_body=entity_body,
1610 ... response=get_request.response)
1611
1612 >>> print get_julia(etag_original)['response'].getStatus()
1604 3041613 304
16051614
1606If the ETags don't match, the server assumes the client has an old1615If the ETags don't match, the server assumes the client has an old
1607representation, and sends the new representation.1616representation, and sends the new representation.
16081617
1609 >>> headers['HTTP_IF_NONE_MATCH'] = 'bad etag'1618 >>> print get_julia('bad etag')['entity_body']
1610 >>> get_request = create_web_service_request(julia_url, environ=headers)1619 {...}
1611 >>> get_request.traverse(app)()
1612 '{...}'
16131620
1614Change the state of the resource, and the ETag changes.1621Change the state of the resource, and the ETag changes.
16151622
1616 >>> julia_object.favorite_recipe = C2_D21623 >>> julia_object.favorite_recipe = C2_D2
1617 >>> get_request = create_web_service_request(julia_url)1624 >>> etag_after_modification = get_julia()['response'].getHeader('ETag')
1618 >>> ignored = get_request.traverse(app)()
1619 >>> etag_after_modification = get_request.response.getHeader('ETag')
16201625
1621 >>> etag_after_modification == etag_original1626 >>> etag_after_modification == etag_original
1622 False1627 False
@@ -1625,12 +1630,57 @@
1625behind the scenes. If one of them changes, the ETag will change.1630behind the scenes. If one of them changes, the ETag will change.
16261631
1627 >>> julia_object.popularity = 51632 >>> julia_object.popularity = 5
1628 >>> get_request = create_web_service_request(julia_url)1633 >>> etag_after_readonly_change = get_julia()['response'].getHeader(
1629 >>> ignored = get_request.traverse(app)()1634 ... 'ETag')
1630 >>> etag_after_readonly_change = get_request.response.getHeader('ETag')
1631 >>> etag_after_readonly_change == etag_original1635 >>> etag_after_readonly_change == etag_original
1632 False1636 False
16331637
1638compensate_for_mod_compress_etag_modification
1639---------------------------------------------
1640
1641Apache's mod_compress transparently modifies outgoing ETags, but
1642doesn't remove the modifications when the ETags are sent back in. The
1643configuration setting 'compensate_for_mod_compress_etag_modification'
1644makes lazr.restful compensate for this behavior, so that you can use
1645mod_compress to save bandwidth.
1646
1647Different versions of mod_compress modify outgoing ETags in different
1648ways. lazr.restful handles both cases.
1649
1650 >>> etag = get_julia()['response'].getHeader('ETag')
1651 >>> modified_etag_1 = etag + '-gzip'
1652 >>> modified_etag_2 = etag[:-1] + '-gzip' + etag[-1]
1653
1654Under normal circumstances, lazr.restful won't recognize an ETag
1655modified by mod_compress.
1656
1657 >>> print get_julia(modified_etag_1)['entity_body']
1658 {...}
1659
1660When 'compensate_for_mod_compress_etag_modification' is set,
1661lazr.restful will recognize an ETag modified by mod_compress.
1662
1663 >>> c = webservice_configuration
1664 >>> print c.compensate_for_mod_compress_etag_modification
1665 False
1666 >>> c.compensate_for_mod_compress_etag_modification = True
1667
1668 >>> print get_julia(modified_etag_1)['response'].getStatus()
1669 304
1670
1671 >>> print get_julia(modified_etag_2)['response'].getStatus()
1672 304
1673
1674Of course, that doesn't mean lazr.restful will recognize any random
1675ETag.
1676
1677 >>> print get_julia(etag + "-not-gzip")['entity_body']
1678 {...}
1679
1680Cleanup.
1681
1682 >>> c.compensate_for_mod_compress_etag_modification = False
1683
1634Resource Visibility1684Resource Visibility
1635===================1685===================
16361686
16371687
=== modified file 'src/lazr/restful/interfaces/_rest.py'
--- src/lazr/restful/interfaces/_rest.py 2010-05-10 11:48:42 +0000
+++ src/lazr/restful/interfaces/_rest.py 2010-05-17 18:15:41 +0000
@@ -526,6 +526,39 @@
526 "a collection, they will not be allowed to request more entries "526 "a collection, they will not be allowed to request more entries "
527 "in the batch than this.")527 "in the batch than this.")
528528
529 compensate_for_mod_compress_etag_modification = Bool(
530 title=u"Accept incoming ETags that appear to have been modified "
531 "in transit by Apache's mod_compress.",
532 default=False,
533
534 description=u"""When mod_compress compresses an outgoing
535 representation, it (correctly) modifies the ETag. But when
536 that ETag comes back in on a conditional GET or PUT request,
537 mod_compress does _not_ transparently remove the modification,
538 because it can't be sure which HTTP intermediary was
539 responsible for modifying the ETag on the way out.
540
541 Setting this value to True will tell lazr.restful that
542 mod_compress is in use. lazr.restful knows what mod_compress
543 does to outgoing ETags, and if this value is set, it will
544 strip the modification from incoming ETags that appear to have
545 been modified by mod_compress.
546
547 Specifically: if lazr.restful generates an etag '"foo"',
548 mod_compress will change it to '"foo"-gzip' or '"foo-gzip"',
549 depending on the version. If this value is set to False,
550 lazr.restful will not recognize '"foo"-gzip' or '"foo-gzip"'
551 as being equivalent to '"foo"'. If this value is set to True,
552 lazr.restful will treat all three strings the same way.
553
554 This is a hacky solution, but until Apache issue 39727 is
555 resolved, it's the best way to get the performance
556 improvements of content-encoding in a real setup with multiple
557 layers of HTTP intermediary. (An earlier attempt to use
558 transfer-encoding instead of content-encoding failed because
559 HTTP intermediaries strip the TE header.)
560 """)
561
529 def createRequest(body_instream, environ):562 def createRequest(body_instream, environ):
530 """A factory method that creates a request for the web service.563 """A factory method that creates a request for the web service.
531564

Subscribers

People subscribed via source and target branches