Code review comment for lp:~maxb/bzr/normalize-lp

Revision history for this message
Andrew Bennetts (spiv) wrote :

 review needs-fixing

Max Bowsher wrote:
[…]
> - path = _url_hex_escapes_re.sub(_unescape_safe_chars, path)
> - return str(prefix + ''.join(path))
> + url = _url_hex_escapes_re.sub(_unescape_safe_chars, url)
> + return str(url)

Hmm. This will change what we unescape, including in the netloc part of
a URL. I'm not sure that's correct (I'm not sure it's not either). I
wonder particularly about URLs with usernames with unusual characters in
them.

I *think* what you have is correct, but it's a slightly worrying change
to make in a stable branch as it appears it might affect more cases than
<lp:bzr>.

I like that your change makes the code a little simpler. I wonder if we
should land this change on trunk, and a more conservative fix on 2.3:

=== modified file 'bzrlib/urlutils.py'
--- bzrlib/urlutils.py 2010-10-21 22:27:43 +0000
+++ bzrlib/urlutils.py 2011-05-24 03:48:53 +0000
@@ -308,7 +308,10 @@
     if scheme_end is None:
         return local_path_to_url(url)
     prefix = url[:path_start]
- path = url[path_start:]
+ if path_start is None:
+ path = ''
+ else:
+ path = url[path_start:]
     if not isinstance(url, unicode):
         for c in url:
             if c not in _url_safe_characters:

Either way, you should add at least one test to
bzrlib/tests/test_urlutils.py.

review: Needs Fixing

« Back to merge proposal