Merge lp:~lifeless/bzr/bug-558343-wrong-host-with-proxy into lp:bzr

Proposed by Robert Collins
Status: Merged
Approved by: Vincent Ladeuil
Approved revision: no longer in the source branch.
Merged at revision: 5220
Proposed branch: lp:~lifeless/bzr/bug-558343-wrong-host-with-proxy
Merge into: lp:bzr
Diff against target: 70 lines (+19/-13)
2 files modified
NEWS (+3/-0)
bzrlib/transport/http/_urllib2_wrappers.py (+16/-13)
To merge this branch: bzr merge lp:~lifeless/bzr/bug-558343-wrong-host-with-proxy
Reviewer Review Type Date Requested Status
Vincent Ladeuil Approve
bzr-core Pending
Review via email: mp+24938@code.launchpad.net

Commit message

Fix lp: urls behind an https proxy.

Description of the change

This should fix using lp: urls from behind an https proxy; no tests - it appears remarkably opaque :(.

Would love vincent's opinion on it, as I think he knows urllib2 guts much better than I.

To post a comment you must log in.
Revision history for this message
Robert Collins (lifeless) wrote :

Oh, and as I'm travelling to UDS, please land if its ok (or fix and land) - this is fairly high importance IMO.

Revision history for this message
Vincent Ladeuil (vila) wrote :

At first glance, the following line:

26 + self.add_unredirected_header('Host', self.proxied_host)

looks obvious.

From memory, I'm pretty sure the Host header is added just before sending the request
if it's not there already. Yet, I also have some vague feeling that playing with *this* header
led to trouble in the past (this fix could have avoid that may be).

I'd blame the lack of test for this omission. Am I right in thinking that
the bug requires a redirection behind the proxy to trigger ?

If yes, that could explain why we didn't get more reports (and the NEWS entry may
be more precise then).

Since proxy and redirection tests are pretty hackish so far, there may be a way to test
this with pre-canned responses (I think there is some test http
server than can be used for that).

I won't have time to look at it until UDS so let's talk about it there.

Otherwise, thanks for the cleanup, we may also want to use 'http' in debug_flags instead of
debug_level, that will help diagnosing issues in the future.

Revision history for this message
Robert Collins (lifeless) wrote :

Yes, when I stepped through the code, urllib2 adds a host header if
there isn't one, and its grabbing the wrong host value because
'has_proxy' is returning False.

And yes, you have to have an https proxy defined to trigger this - see
my comments in the bug about reproducing it.

Sounds like you think this is ok? Could you land it(I'm hopping on a
plane in a minute for the next leg).

Thanks,
Rob

Revision history for this message
Vincent Ladeuil (vila) wrote :

I found time to look at the code. There is already a huge list of missing tests under a FIXME there (including the one that is missing here).

And yes, the fix is right and urllib2 didn't have the bug because it didn't implement CONNECT anyway.

review: Approve
Revision history for this message
Robert Collins (lifeless) wrote :

submitted to PQM by hand.

Revision history for this message
Robert Collins (lifeless) wrote :

submitted to PQM by hand.

Revision history for this message
Robert Collins (lifeless) wrote :

submitted to PQM by hand.

Revision history for this message
Robert Collins (lifeless) wrote :

submitted to PQM by hand.

Revision history for this message
Robert Collins (lifeless) wrote :

submitted to PQM by hand.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2010-05-06 06:51:11 +0000
3+++ NEWS 2010-05-08 06:00:46 +0000
4@@ -101,6 +101,9 @@
5 messages.
6 (Parth Malwankar, #563646)
7
8+* Using bzr with `lp:` urls behind an http proxy should work.
9+ (Robert Collins, #558343)
10+
11 Improvements
12 ************
13
14
15=== modified file 'bzrlib/transport/http/_urllib2_wrappers.py'
16--- bzrlib/transport/http/_urllib2_wrappers.py 2010-02-17 17:11:16 +0000
17+++ bzrlib/transport/http/_urllib2_wrappers.py 2010-05-08 06:00:46 +0000
18@@ -378,6 +378,11 @@
19 port = conn_class.default_port
20 self.proxied_host = '%s:%s' % (host, port)
21 urllib2.Request.set_proxy(self, proxy, type)
22+ # When urllib2 makes a https request with our wrapper code and a proxy,
23+ # it sets Host to the https proxy, not the host we want to talk to.
24+ # I'm fairly sure this is our fault, but what is the cause is an open
25+ # question. -- Robert Collins May 8 2010.
26+ self.add_unredirected_header('Host', self.proxied_host)
27
28
29 class _ConnectRequest(Request):
30@@ -711,7 +716,7 @@
31 connect = _ConnectRequest(request)
32 response = self.parent.open(connect)
33 if response.code != 200:
34- raise ConnectionError("Can't connect to %s via proxy %s" % (
35+ raise errors.ConnectionError("Can't connect to %s via proxy %s" % (
36 connect.proxied_host, self.host))
37 # Housekeeping
38 connection.cleanup_pipe()
39@@ -868,21 +873,19 @@
40 print 'Will unbind %s_open for %r' % (type, proxy)
41 delattr(self, '%s_open' % type)
42
43+ def bind_scheme_request(proxy, scheme):
44+ if proxy is None:
45+ return
46+ scheme_request = scheme + '_request'
47+ if self._debuglevel >= 3:
48+ print 'Will bind %s for %r' % (scheme_request, proxy)
49+ setattr(self, scheme_request,
50+ lambda request: self.set_proxy(request, scheme))
51 # We are interested only by the http[s] proxies
52 http_proxy = self.get_proxy_env_var('http')
53+ bind_scheme_request(http_proxy, 'http')
54 https_proxy = self.get_proxy_env_var('https')
55-
56- if http_proxy is not None:
57- if self._debuglevel >= 3:
58- print 'Will bind http_request for %r' % http_proxy
59- setattr(self, 'http_request',
60- lambda request: self.set_proxy(request, 'http'))
61-
62- if https_proxy is not None:
63- if self._debuglevel >= 3:
64- print 'Will bind http_request for %r' % https_proxy
65- setattr(self, 'https_request',
66- lambda request: self.set_proxy(request, 'https'))
67+ bind_scheme_request(https_proxy, 'https')
68
69 def get_proxy_env_var(self, name, default_to='all'):
70 """Get a proxy env var.