Merge lp:~leonardr/lazr.restful/understand-mod-compress into lp:lazr.restful

Proposed by Leonard Richardson
Status: Merged
Approved by: Gary Poster
Approved revision: 130
Merged at revision: 129
Proposed branch: lp:~leonardr/lazr.restful/understand-mod-compress
Merge into: lp:lazr.restful
Diff against target: 223 lines (+127/-21)
4 files modified
src/lazr/restful/NEWS.txt (+11/-4)
src/lazr/restful/_resource.py (+17/-1)
src/lazr/restful/docs/webservice.txt (+66/-16)
src/lazr/restful/interfaces/_rest.py (+33/-0)
To merge this branch: bzr merge lp:~leonardr/lazr.restful/understand-mod-compress
Reviewer Review Type Date Requested Status
Gary Poster Approve
Review via email: mp+25454@code.launchpad.net

Description of the change

This branch creates a workaround that should resolve bug 574697. Basically, I change lazr.restful to handle the bad behavior or mod_compress. https://issues.apache.org/bugzilla/show_bug.cgi?id=39727 has the history. Basically, we should not be using mod_compress at all, but because our server setup includes multiple layers of HTTP intermediaries that don't understand transfer-encoding, we have no choice.

This branch makes it possible for a lazr.restful application to detect when an ETag was modified by mod_compress, and handle it as though mod_compress were not present. This will let us re-enable mod_compress on the 'api' vhost in Launchpad installations.

To post a comment you must log in.
Revision history for this message
Gary Poster (gary) wrote :

Thank you

review: Approve
Revision history for this message
Robert Collins (lifeless) wrote :

Isn't apache the right place to fix this? Why add cruft to the
launchpad code base, when it is probably a very small patch to
mod_compress to fix it?

Revision history for this message
Leonard Richardson (leonardr) wrote :

I believe we discussed this on IRC, but for the record: there is no "small change" to Apache that is acceptable to the Apache maintainers. https://issues.apache.org/bugzilla/attachment.cgi?id=23051 contains a small change that solves the problem for us, but because it's not a fully general solution (https://issues.apache.org/bugzilla/show_bug.cgi?id=39727#c31 explains why), it has not been applied. Solving the problem in Apache requires allowing each server in a chain of servers to append a _different_ string to the ETag. Or, it requires making transfer-encoding work in a setup with chained HTTP intermediaries, so that we don't have to do all these content-encoding hacks.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/lazr/restful/NEWS.txt'
2--- src/lazr/restful/NEWS.txt 2010-05-10 12:11:00 +0000
3+++ src/lazr/restful/NEWS.txt 2010-05-17 18:15:41 +0000
4@@ -2,8 +2,15 @@
5 NEWS for lazr.restful
6 =====================
7
8-0.9.26 (Development)
9-====================
10+0.9.26 (2010-05-18)
11+===================
12+
13+Special note: This version adds a new configuration element,
14+'compensate_for_mod_compress_etag_modification'. If you are running
15+lazr.restful behind an Apache server, setting this configuration
16+element will make mod_compress work properly with lazr.restful. This
17+is not a permanent solution: a better solution will be available when
18+Apache bug 39727 is fixed.
19
20 Special note: This version removes the configuration element
21 'set_hop_to_hop_headers'. You can still define this element in your
22@@ -12,8 +19,8 @@
23 Removed code that handles compression through hop-to-hop
24 headers. We've never encountered a real situation in which these
25 headers were useful. Compression can and should be handled by
26-intermediaries such as mod_compress. The Apache bug that set us on
27-this path in the first place has been fixed for a long time.
28+intermediaries such as mod_compress. (Unfortunately, mod_compress has
29+its own problems, which this release tries to work around.)
30
31 0.9.25 (2010-04-14)
32 ===================
33
34=== modified file 'src/lazr/restful/_resource.py'
35--- src/lazr/restful/_resource.py 2010-05-10 11:48:42 +0000
36+++ src/lazr/restful/_resource.py 2010-05-17 18:15:41 +0000
37@@ -448,14 +448,30 @@
38 """Extract a list of ETags from a header and parse the list.
39
40 RFC2616 allows multiple comma-separated ETags.
41+
42+ If the compensate_for_mod_compress_etag_modification value is set,
43+ ETags like '"foo"-gzip' and '"foo-gzip"' will be transformed into
44+ '"foo"'.
45 """
46 header = self.request.getHeader(header_name)
47 if header is None:
48 return []
49+ utility = getUtility(IWebServiceConfiguration)
50+ strip_gzip_from_etag = (
51+ utility.compensate_for_mod_compress_etag_modification)
52+ etags = []
53 # We're kind of cheating, because entity tags can technically
54 # have commas in them, but none of our tags contain commas, so
55 # this will work.
56- return [etag.strip() for etag in header.split(',')]
57+ for etag in header.split(','):
58+ etag = etag.strip()
59+ if strip_gzip_from_etag:
60+ if etag.endswith('-gzip'):
61+ etag = etag[:-5]
62+ elif etag.endswith('-gzip"'):
63+ etag = etag[:-6] + '"'
64+ etags.append(etag)
65+ return etags
66
67 def _fieldValueIsObject(self, field):
68 """Does the given field expect a data model object as its value?
69
70=== modified file 'src/lazr/restful/docs/webservice.txt'
71--- src/lazr/restful/docs/webservice.txt 2010-04-14 14:56:46 +0000
72+++ src/lazr/restful/docs/webservice.txt 2010-05-17 18:15:41 +0000
73@@ -1595,28 +1595,33 @@
74 sends a response code of 304 ("Not Modified") instead of sending the
75 same representation again.
76
77- >>> headers = {'CONTENT_TYPE' : 'application/json',
78- ... 'HTTP_IF_NONE_MATCH' : etag_original }
79- >>> get_request = create_web_service_request(julia_url, environ=headers)
80- >>> get_request.traverse(app)()
81- ''
82- >>> get_request.response.getStatus()
83+First, let's define a helper method to request a specific entry
84+resource, and gather the entity-body and the response object into an
85+easily accessible data structure.
86+
87+ >>> def get_julia(etag=None):
88+ ... headers = {'CONTENT_TYPE' : 'application/json'}
89+ ... if etag is not None:
90+ ... headers['HTTP_IF_NONE_MATCH'] = etag
91+ ... get_request = create_web_service_request(
92+ ... julia_url, environ=headers)
93+ ... entity_body = get_request.traverse(app)()
94+ ... return dict(entity_body=entity_body,
95+ ... response=get_request.response)
96+
97+ >>> print get_julia(etag_original)['response'].getStatus()
98 304
99
100 If the ETags don't match, the server assumes the client has an old
101 representation, and sends the new representation.
102
103- >>> headers['HTTP_IF_NONE_MATCH'] = 'bad etag'
104- >>> get_request = create_web_service_request(julia_url, environ=headers)
105- >>> get_request.traverse(app)()
106- '{...}'
107+ >>> print get_julia('bad etag')['entity_body']
108+ {...}
109
110 Change the state of the resource, and the ETag changes.
111
112 >>> julia_object.favorite_recipe = C2_D2
113- >>> get_request = create_web_service_request(julia_url)
114- >>> ignored = get_request.traverse(app)()
115- >>> etag_after_modification = get_request.response.getHeader('ETag')
116+ >>> etag_after_modification = get_julia()['response'].getHeader('ETag')
117
118 >>> etag_after_modification == etag_original
119 False
120@@ -1625,12 +1630,57 @@
121 behind the scenes. If one of them changes, the ETag will change.
122
123 >>> julia_object.popularity = 5
124- >>> get_request = create_web_service_request(julia_url)
125- >>> ignored = get_request.traverse(app)()
126- >>> etag_after_readonly_change = get_request.response.getHeader('ETag')
127+ >>> etag_after_readonly_change = get_julia()['response'].getHeader(
128+ ... 'ETag')
129 >>> etag_after_readonly_change == etag_original
130 False
131
132+compensate_for_mod_compress_etag_modification
133+---------------------------------------------
134+
135+Apache's mod_compress transparently modifies outgoing ETags, but
136+doesn't remove the modifications when the ETags are sent back in. The
137+configuration setting 'compensate_for_mod_compress_etag_modification'
138+makes lazr.restful compensate for this behavior, so that you can use
139+mod_compress to save bandwidth.
140+
141+Different versions of mod_compress modify outgoing ETags in different
142+ways. lazr.restful handles both cases.
143+
144+ >>> etag = get_julia()['response'].getHeader('ETag')
145+ >>> modified_etag_1 = etag + '-gzip'
146+ >>> modified_etag_2 = etag[:-1] + '-gzip' + etag[-1]
147+
148+Under normal circumstances, lazr.restful won't recognize an ETag
149+modified by mod_compress.
150+
151+ >>> print get_julia(modified_etag_1)['entity_body']
152+ {...}
153+
154+When 'compensate_for_mod_compress_etag_modification' is set,
155+lazr.restful will recognize an ETag modified by mod_compress.
156+
157+ >>> c = webservice_configuration
158+ >>> print c.compensate_for_mod_compress_etag_modification
159+ False
160+ >>> c.compensate_for_mod_compress_etag_modification = True
161+
162+ >>> print get_julia(modified_etag_1)['response'].getStatus()
163+ 304
164+
165+ >>> print get_julia(modified_etag_2)['response'].getStatus()
166+ 304
167+
168+Of course, that doesn't mean lazr.restful will recognize any random
169+ETag.
170+
171+ >>> print get_julia(etag + "-not-gzip")['entity_body']
172+ {...}
173+
174+Cleanup.
175+
176+ >>> c.compensate_for_mod_compress_etag_modification = False
177+
178 Resource Visibility
179 ===================
180
181
182=== modified file 'src/lazr/restful/interfaces/_rest.py'
183--- src/lazr/restful/interfaces/_rest.py 2010-05-10 11:48:42 +0000
184+++ src/lazr/restful/interfaces/_rest.py 2010-05-17 18:15:41 +0000
185@@ -526,6 +526,39 @@
186 "a collection, they will not be allowed to request more entries "
187 "in the batch than this.")
188
189+ compensate_for_mod_compress_etag_modification = Bool(
190+ title=u"Accept incoming ETags that appear to have been modified "
191+ "in transit by Apache's mod_compress.",
192+ default=False,
193+
194+ description=u"""When mod_compress compresses an outgoing
195+ representation, it (correctly) modifies the ETag. But when
196+ that ETag comes back in on a conditional GET or PUT request,
197+ mod_compress does _not_ transparently remove the modification,
198+ because it can't be sure which HTTP intermediary was
199+ responsible for modifying the ETag on the way out.
200+
201+ Setting this value to True will tell lazr.restful that
202+ mod_compress is in use. lazr.restful knows what mod_compress
203+ does to outgoing ETags, and if this value is set, it will
204+ strip the modification from incoming ETags that appear to have
205+ been modified by mod_compress.
206+
207+ Specifically: if lazr.restful generates an etag '"foo"',
208+ mod_compress will change it to '"foo"-gzip' or '"foo-gzip"',
209+ depending on the version. If this value is set to False,
210+ lazr.restful will not recognize '"foo"-gzip' or '"foo-gzip"'
211+ as being equivalent to '"foo"'. If this value is set to True,
212+ lazr.restful will treat all three strings the same way.
213+
214+ This is a hacky solution, but until Apache issue 39727 is
215+ resolved, it's the best way to get the performance
216+ improvements of content-encoding in a real setup with multiple
217+ layers of HTTP intermediary. (An earlier attempt to use
218+ transfer-encoding instead of content-encoding failed because
219+ HTTP intermediaries strip the TE header.)
220+ """)
221+
222 def createRequest(body_instream, environ):
223 """A factory method that creates a request for the web service.
224

Subscribers

People subscribed via source and target branches