Merge lp:~maxb/bzr/normalize-lp into lp:bzr/2.3
Proposed by
Max Bowsher
Status: | Work in progress |
---|---|
Proposed branch: | lp:~maxb/bzr/normalize-lp |
Merge into: | lp:bzr/2.3 |
Diff against target: |
41 lines (+10/-14) 1 file modified
bzrlib/urlutils.py (+10/-14) |
To merge this branch: | bzr merge lp:~maxb/bzr/normalize-lp |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Andrew Bennetts | Needs Fixing | ||
Review via email: mp+61729@code.launchpad.net |
To post a comment you must log in.
Unmerged revisions
- 5650. By Max Bowsher
-
Fix bzrlib.
urlutils. normalize_ url when processing URLs with a scheme, but no
slash in the path - before this, lp:bzr was 'normalized' to lp:bzrlp:bzr (sic).
review needs-fixing
Max Bowsher wrote: escapes_ re.sub( _unescape_ safe_chars, path) escapes_ re.sub( _unescape_ safe_chars, url)
[…]
> - path = _url_hex_
> - return str(prefix + ''.join(path))
> + url = _url_hex_
> + return str(url)
Hmm. This will change what we unescape, including in the netloc part of
a URL. I'm not sure that's correct (I'm not sure it's not either). I
wonder particularly about URLs with usernames with unusual characters in
them.
I *think* what you have is correct, but it's a slightly worrying change
to make in a stable branch as it appears it might affect more cases than
<lp:bzr>.
I like that your change makes the code a little simpler. I wonder if we
should land this change on trunk, and a more conservative fix on 2.3:
=== modified file 'bzrlib/ urlutils. py' to_url( url) characters:
--- bzrlib/urlutils.py 2010-10-21 22:27:43 +0000
+++ bzrlib/urlutils.py 2011-05-24 03:48:53 +0000
@@ -308,7 +308,10 @@
if scheme_end is None:
return local_path_
prefix = url[:path_start]
- path = url[path_start:]
+ if path_start is None:
+ path = ''
+ else:
+ path = url[path_start:]
if not isinstance(url, unicode):
for c in url:
if c not in _url_safe_
Either way, you should add at least one test to tests/test_ urlutils. py.
bzrlib/