Merge lp:~maxb/bzr/normalize-lp into lp:bzr/2.3

Proposed by Max Bowsher
Status: Work in progress
Proposed branch: lp:~maxb/bzr/normalize-lp
Merge into: lp:bzr/2.3
Diff against target: 41 lines (+10/-14)
1 file modified
bzrlib/urlutils.py (+10/-14)
To merge this branch: bzr merge lp:~maxb/bzr/normalize-lp
Reviewer Review Type Date Requested Status
Andrew Bennetts Needs Fixing
Review via email: mp+61729@code.launchpad.net

Description of the change

Fixes the bzrlib.urlutils.normalize_url function to not mangle lp:foo URLs for lp:foolp:foo

One way to demonstrate the bug is:

bzr config -d lp:bzr wibble=blah

and then check your ~/.bazaar/locations.conf

To post a comment you must log in.
Revision history for this message
Andrew Bennetts (spiv) wrote :

 review needs-fixing

Max Bowsher wrote:
[…]
> - path = _url_hex_escapes_re.sub(_unescape_safe_chars, path)
> - return str(prefix + ''.join(path))
> + url = _url_hex_escapes_re.sub(_unescape_safe_chars, url)
> + return str(url)

Hmm. This will change what we unescape, including in the netloc part of
a URL. I'm not sure that's correct (I'm not sure it's not either). I
wonder particularly about URLs with usernames with unusual characters in
them.

I *think* what you have is correct, but it's a slightly worrying change
to make in a stable branch as it appears it might affect more cases than
<lp:bzr>.

I like that your change makes the code a little simpler. I wonder if we
should land this change on trunk, and a more conservative fix on 2.3:

=== modified file 'bzrlib/urlutils.py'
--- bzrlib/urlutils.py 2010-10-21 22:27:43 +0000
+++ bzrlib/urlutils.py 2011-05-24 03:48:53 +0000
@@ -308,7 +308,10 @@
     if scheme_end is None:
         return local_path_to_url(url)
     prefix = url[:path_start]
- path = url[path_start:]
+ if path_start is None:
+ path = ''
+ else:
+ path = url[path_start:]
     if not isinstance(url, unicode):
         for c in url:
             if c not in _url_safe_characters:

Either way, you should add at least one test to
bzrlib/tests/test_urlutils.py.

review: Needs Fixing

Unmerged revisions

5650. By Max Bowsher

Fix bzrlib.urlutils.normalize_url when processing URLs with a scheme, but no
slash in the path - before this, lp:bzr was 'normalized' to lp:bzrlp:bzr (sic).

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'bzrlib/urlutils.py'
--- bzrlib/urlutils.py 2010-10-21 22:27:43 +0000
+++ bzrlib/urlutils.py 2011-05-20 10:00:58 +0000
@@ -307,27 +307,23 @@
307 scheme_end, path_start = _find_scheme_and_separator(url)307 scheme_end, path_start = _find_scheme_and_separator(url)
308 if scheme_end is None:308 if scheme_end is None:
309 return local_path_to_url(url)309 return local_path_to_url(url)
310 prefix = url[:path_start]
311 path = url[path_start:]
312 if not isinstance(url, unicode):310 if not isinstance(url, unicode):
313 for c in url:311 for c in url:
314 if c not in _url_safe_characters:312 if c not in _url_safe_characters:
315 raise errors.InvalidURL(url, 'URLs can only contain specific'313 raise errors.InvalidURL(url, 'URLs can only contain specific'
316 ' safe characters (not %r)' % c)314 ' safe characters (not %r)' % c)
317 path = _url_hex_escapes_re.sub(_unescape_safe_chars, path)315 url = _url_hex_escapes_re.sub(_unescape_safe_chars, url)
318 return str(prefix + ''.join(path))316 return str(url)
319317
320 # We have a unicode (hybrid) url318 # We have a unicode (hybrid) url
321 path_chars = list(path)319 url_chars = list(url)
322320 for i in xrange(len(url_chars)):
323 for i in xrange(len(path_chars)):321 if url_chars[i] not in _url_safe_characters:
324 if path_chars[i] not in _url_safe_characters:322 url_chars[i] = ''.join(
325 chars = path_chars[i].encode('utf-8')323 ['%%%02X' % ord(c) for c in url_chars[i].encode('utf-8')])
326 path_chars[i] = ''.join(324 url = ''.join(url_chars)
327 ['%%%02X' % ord(c) for c in path_chars[i].encode('utf-8')])325 url = _url_hex_escapes_re.sub(_unescape_safe_chars, url)
328 path = ''.join(path_chars)326 return str(url)
329 path = _url_hex_escapes_re.sub(_unescape_safe_chars, path)
330 return str(prefix + path)
331327
332328
333def relative_url(base, other):329def relative_url(base, other):

Subscribers

People subscribed via source and target branches