Merge lp:~songofacandy/bzr/fix-523746-dev into lp:bzr
- fix-523746-dev
- Merge into bzr.dev
Status: | Merged | ||||
---|---|---|---|---|---|
Approved by: | Martin Pool | ||||
Approved revision: | no longer in the source branch. | ||||
Merged at revision: | 5892 | ||||
Proposed branch: | lp:~songofacandy/bzr/fix-523746-dev | ||||
Merge into: | lp:bzr | ||||
Diff against target: |
156 lines (+98/-12) 2 files modified
bzrlib/diff.py (+51/-11) bzrlib/tests/test_diff.py (+47/-1) |
||||
To merge this branch: | bzr merge lp:~songofacandy/bzr/fix-523746-dev | ||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
John A Meinel | Approve | ||
Vincent Ladeuil | Abstain | ||
Alexander Belchenko | Pending | ||
Gary van der Merwe | Pending | ||
Martin Packman | Pending | ||
bzr-core | Pending | ||
Review via email: mp+20537@code.launchpad.net |
Commit message
better handling of subprocesses with non-ascii encodings and filenames (bug 523746)
Description of the change
methane (songofacandy) wrote : | # |
Vincent Ladeuil (vila) wrote : | # |
@Naoki, can you please use more meaningful branch names
(see Making a Merge Proposal in doc/developers/
Also, it helps a lot if you put an initial comment (we sometimes call it the cover letter)
when you start the merge proposal explaining the intent of the patch.
If you've fixed a bug, you desserve credit for that, so please add a NEWS entry too :)
Vincent Ladeuil (vila) wrote : | # |
Oh, silly me, I mis-read your initial comment as pointing to itself,
where in fact, you just did what I asked in the *other* mp.
Anyway, this illustrates why I find sparse branch names unhelpful, I just didn't
recognize it :)
In fact you made the comments I was waiting for on the other mp:
> This problem depends on OS, locale settings and other environment.
> So I agree with you that this patch should land to bzr.dev and be tested on
> vary environment while bzr 2.2 is released.
Fine, I'll mark the other mp rejected.
So, test against variations on locale settings exist in bzrlib/
and bzrlib/
Ask for help if you don't find the existing tests enough to write your own.
Vincent Ladeuil (vila) wrote : | # |
@Naoki, I've added tests in lp:~vila/bzr/523746-difftool-file-names and doing so I think I found a simpler fix, could you test it and give feedback ?
methane (songofacandy) wrote : | # |
This problem contains many situation.
I think your fix doesn't work sometimes because:
1. osutils.
user_encoding is not equals to filesystem encoding.
2. path.encode(enc) can raise UnicodeEncodeError.
So bzr should use "encode(enc, 'replace')"
3. _write_file() writes file with unicode path.
But if bzr can't encode the path collectly, external command can't
open the file.
So bzr should encode the path before _write_file().
To test these situation:
* Use fixed encoding in the test.
* Use filenames that contains chars that can't encode with the encoding.
* Replace subprocess.call and check arguments.
--
INADA Naoki <email address hidden>
Vincent Ladeuil (vila) wrote : | # |
> This problem contains many situation.
> I think your fix doesn't work
That's why I wanted you to test it :) Did you ? If yes, did it fail and how ?
>sometimes because:
>
> 1. osutils.
> user_encoding is not equals to filesystem encoding.
My assumption was that subprocess.Popen should receive its arguments as if
they were coming from the command line: in user encoding.
>
> 2. path.encode(enc) can raise UnicodeEncodeError.
> So bzr should use "encode(enc, 'replace')"
That will mean some path can't be encoded in user encoding which in turn means
it could not be user as a parameter to bzr no ?
>
> 3. _write_file() writes file with unicode path.
> But if bzr can't encode the path collectly, external command can't
> open the file.
We rely on python to do the right thing from unicode paths there.
> So bzr should encode the path before _write_file().
>
> To test these situation:
>
> * Use fixed encoding in the test.
That's what the tests are doing for the user encoding.
> * Use filenames that contains chars that can't encode with the encoding.
Hmm, AFAIR, when we encounter problems there it's because some setting is wrong in the environment. As mentioned below, I couldn't automate the tests enough
to reach that point, hence my call for real-life tests :)
> * Replace subprocess.call and check arguments.
I don't want to do that, if the problem is in subprocess I will not be able to detect it.
That's the main point I dislike with the current tests, they don't exercise
subprocess.Popen, but I don't have an external diff tool I can call available
on all platforms.
Vincent Ladeuil (vila) wrote : | # |
> We rely on python to do the right thing from unicode paths there.
i.e. encode using 'mbcs' as returned by getsystemfileco
and whatever is appropriate on other systems.
methane (songofacandy) wrote : | # |
>> 1. osutils.
>> user_encoding is not equals to filesystem encoding.
>
>My assumption was that subprocess.Popen should receive its arguments as if
>they were coming from the command line: in user encoding.
OK, I don't know about when get_user_encoding() returns encoding that is not
suitable for commandline arguments.
>> 2. path.encode(enc) can raise UnicodeEncodeError.
>> So bzr should use "encode(enc, 'replace')"
>
>That will mean some path can't be encoded in user encoding which in turn means
>it could not be user as a parameter to bzr no ?
No. bzr uses unicode arguments in Windows. So users can pass unicode filename to bzr.
And "bzr diff --using=
such a file.
>> 3. _write_file() writes file with unicode path.
>> But if bzr can't encode the path collectly, external command can't
>> open the file.
>
> We rely on python to do the right thing from unicode paths there.
On windows, unicode path is too perfect.
If unicode path contains chars that is not be able to encode,
we can't use the path in encoded string.
But we can create a file with such a name when using unicode.
So we should replace such a char before _write_file().
Vincent Ladeuil (vila) wrote : | # |
>>>>> "Naoki" == INADA Naoki <email address hidden> writes:
>>> 1. osutils.
>>> user_encoding is not equals to filesystem encoding.
>>
>> My assumption was that subprocess.Popen should receive its arguments as if
>> they were coming from the command line: in user encoding.
Naoki> OK, I don't know about when get_user_encoding() returns encoding that is not
Naoki> suitable for commandline arguments.
Good.
>>> 2. path.encode(enc) can raise UnicodeEncodeError.
>>> So bzr should use "encode(enc, 'replace')"
>>
>> That will mean some path can't be encoded in user encoding which
>> in turn means it could not be user as a parameter to bzr no ?
Naoki> No. bzr uses unicode arguments in Windows.
No. bzr uses unicode paths internally, they are coming from the command
line and decoded with user encoding or from the bzr db where they are
encoding into utf-8.
Naoki> So users can pass unicode filename to bzr.
Encoded appropriately yes. They are decoded into unicode and encoded by
bzr or by python when access to the filesystem is needed at which point
the file system encoding is used (which may be different from the user
encoding).
Naoki> And "bzr diff --using=
Naoki> path but shows diff on such a file.
Sure and it may decode them using the *terminal* encoding for the
content of the diff but DiffFromTool has a parameter 'path_encoding' to
control that.
Note that this is still not the encoding we need to use to create any
needed file though.
>>> 3. _write_file() writes file with unicode path.
>>> But if bzr can't encode the path collectly, external command can't
>>> open the file.
>>
>> We rely on python to do the right thing from unicode paths there.
Naoki> On windows, unicode path is too perfect.
Naoki> If unicode path contains chars that is not be able to encode,
Naoki> we can't use the path in encoded string.
If we can't use it here, we can't use it either when creating the
working tree, so I'll tend to ignore the problem here.
Naoki> But we can create a file with such a name when using unicode.
Naoki> So we should replace such a char before _write_file().
I don't think so.
Can we chat about that on IRC or can you provide me a file name that
breaks my patch (preferably as a python string using the \xnnnn
notation) ?
Vincent Ladeuil (vila) wrote : | # |
So it appears that I lack the windows knowledge (and the ability to tests)
to effectively help Naoki here.
The consensus seems to be that the problem is now to either:
- make subprocess.Popen accept unicode arguments (Naoki as a monkey-patch for that
but I find it a bit scary),
- bypass subprocess.Popen and use a direct win32 API (I'm out of my league on
this one)
John ? Alexander ? Can you give it a try ?
Vincent Ladeuil (vila) wrote : | # |
> John ? Alexander ? Can you give it a try ?
Or may be Gary since you're working in this area right now and have some way of testing it on windows ?
Alexander Belchenko (bialix) wrote : | # |
> So it appears that I lack the windows knowledge (and the ability to tests)
> to effectively help Naoki here.
>
> The consensus seems to be that the problem is now to either:
> - make subprocess.Popen accept unicode arguments (Naoki as a monkey-patch for
> that
> but I find it a bit scary),
> - bypass subprocess.Popen and use a direct win32 API (I'm out of my league on
> this one)
>
> John ? Alexander ? Can you give it a try ?
Sorry, I'm a bit busy these days, have lost this ping. I can't promise I will look into it soon. But I'll try.
Gary van der Merwe (garyvdm) wrote : | # |
Vincent Ladeuil wrote on 2010-04-21 :
> That's the main point I dislike with the current tests, they don't exercise
> subprocess.Popen, but I don't have an external diff tool I can call available
> on all platforms.
I've fixed that in this branch: lp:~garyvdm/bzr/523746-difftool-file-names
Those tests fail in wine for both Vincent's and INADA's branches.
- new_abs_path = self._prepare_
- allow_write_
- force_temp=True)[1]
- command = self._get_
- osutils.
+ old_abs_path, new_abs_path = self._prepare_
+ file_id, old_path, new_path,
+ allow_write_
+ force_temp=True)
+ command = self._get_
This is a nice simple change. Maybe we should merge this straight away, separately from the other changes.
- new_file = open(new_abs_path, 'r')
+ new_file = open(new_abs_path, 'rb')
Why is this changed?
methane (songofacandy) wrote : | # |
> Vincent Ladeuil wrote on 2010-04-21 :
> > That's the main point I dislike with the current tests, they don't exercise
> > subprocess.Popen, but I don't have an external diff tool I can call
> available
> > on all platforms.
>
> I've fixed that in this branch: lp:~garyvdm/bzr/523746-difftool-file-names
>
> Those tests fail in wine for both Vincent's and INADA's branches.
I'll see the branch later.
> - new_abs_path = self._prepare_
> - allow_write_
> - force_temp=True)[1]
> - command = self._get_
> - osutils.
> + old_abs_path, new_abs_path = self._prepare_
> + file_id, old_path, new_path,
> + allow_write_
> + force_temp=True)
> + command = self._get_
>
> This is a nice simple change. Maybe we should merge this straight away,
> separately from the other changes.
I think so.
> - new_file = open(new_abs_path, 'r')
> + new_file = open(new_abs_path, 'rb')
>
> Why is this changed?
This change is made because I think bzr should use 'binary' mode in this point.
External diff tool may have edit feature and user may fix line ending. So bzr should keep
EOL bytes.
But, this change is not relating to #523746.
methane (songofacandy) wrote : | # |
I'm ashamed to leave this so long.
> > Vincent Ladeuil wrote on 2010-04-21 :
> > > That's the main point I dislike with the current tests, they don't
> exercise
> > > subprocess.Popen, but I don't have an external diff tool I can call
> > available
> > > on all platforms.
> >
> > I've fixed that in this branch: lp:~garyvdm/bzr/523746-difftool-file-names
> >
> > Those tests fail in wine for both Vincent's and INADA's branches.
>
> I'll see the branch later.
Your test fails because my branch changes filename for it safely write
to the filename and pass the filename to subprocess.
I split filename creation from DiffFromTool.
Please see it.
I've wrote test executing DiffFromTool but it doesn't tests real problem
because overriding sys.getfilesyst
Py_FileSystemDe
Vincent Ladeuil (vila) wrote : | # |
Thanks for coming back songofacandy (If you explain me one last time which of INADA or Naoki I should use, I swear I won't forget anymore :-}) !
Once again, I'd like to defer to people with a better windows knowledge than myself in this specific area, I'll nominate explicit reviewers, please, one of you, review ! :) Or if you can't just abstain ;)
Vincent Ladeuil (vila) wrote : | # |
My understanding was that we couldn't arbitrarily try to encode/decode both paths and other arguments. I may be right in general and wrong in this particular case where mbcs may be the silver bullet. That's why I'm deferring to people knowing better than me.
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 05/06/2011 03:29 AM, INADA Naoki wrote:
> INADA Naoki has proposed merging lp:~songofacandy/bzr/fix-523746-dev into lp:bzr.
>
> Requested reviews:
> Vincent Ladeuil (vila)
> bzr-core (bzr-core)
> Related bugs:
> Bug #523746 in Bazaar: "crash with winmerge in cp932 japanese character"
> https:/
>
> For more details, see:
> https:/
+ def _safe_filename(
+ if sys.platform == 'win32':
+ fenc = 'mbcs'
+ else:
+ fenc = sys.getfilesyst
+ # encoded_
+ # So we should encode, decode, then replace(u'?', u'_')
+ relpath_tmp = relpath.
'replace')
+ relpath_tmp = relpath_
+ return osutils.
Why are we using "encode('replace')" rather than "encode('ignore')". It
doesn't seem to buy us much to be putting "_" into the filename, rather
than just writing out whatever letters we can.
I also don't understand why you do the encode/decode dance and the put a
munged Unicode string back into the path. I think I sort of understand,
but a docstring would probably help a lot at understanding why we need
to use this function.
...
+ command = [AtTemplate(
+ self.command_
+ if sys.platform == 'win32': # Popen doesn't accept unicode on win32
+ command_encoded = []
+ for c in command:
+ if isinstance(c, unicode):
+ command_
+ else:
+ command_
+ return command_encoded
+ else:
+ return command
^- I'm not sure whether c.encode('mbcs') can fail, or whether it will
return '?' for unusable characters. For example, I can create Arabic or
Japanese filenames (since the filesystem is UTF-16) even though 'mbcs'
is only able to handle ~latin-1. Any thoughts on how to handle that
cleanly? We don't have to change Popen to support Unicode (though that
would be ideal), but at least something that would give users helpful
feedback, rather than just tracebacks.
+ def test_encodable_
+ import sys
+ diffobj = diff.DiffFromTo
+ None, None, None)
On *my* system, there is no 'python' (I never add it to sys.path, as I
happen to have 2.4/5/6/7 all installed.) There is 'sys.executable', though.
diffobj = diff.DiffFromTo
Also, we probably want to skip this test if "sys.frozen" is set, since I
don't think we can run it from 'bzr.exe'. I think we already had a
feature for that.
John
=:->
review: needsfixing
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAk3
W0cAoID...
methane (songofacandy) wrote : | # |
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 05/06/2011 03:29 AM, INADA Naoki wrote:
> > INADA Naoki has proposed merging lp:~songofacandy/bzr/fix-523746-dev into
> lp:bzr.
> >
> > Requested reviews:
> > Vincent Ladeuil (vila)
> > bzr-core (bzr-core)
> > Related bugs:
> > Bug #523746 in Bazaar: "crash with winmerge in cp932 japanese character"
> > https:/
> >
> > For more details, see:
> > https:/
>
>
> + def _safe_filename(
> + if sys.platform == 'win32':
> + fenc = 'mbcs'
> + else:
> + fenc = sys.getfilesyst
> + # encoded_
> + # So we should encode, decode, then replace(u'?', u'_')
> + relpath_tmp = relpath.
> 'replace')
> + relpath_tmp = relpath_
> + return osutils.
>
> Why are we using "encode('replace')" rather than "encode('ignore')". It
> doesn't seem to buy us much to be putting "_" into the filename, rather
> than just writing out whatever letters we can.
Because "encode('ignore')" may makes null string.
Fallback to other string is also OK, but I don't have nice name for fallback.
> I also don't understand why you do the encode/decode dance and the put a
> munged Unicode string back into the path. I think I sort of understand,
> but a docstring would probably help a lot at understanding why we need
> to use this function.
OK, I'll add docstrings.
>
> ...
>
> + command = [AtTemplate(
> + self.command_
> + if sys.platform == 'win32': # Popen doesn't accept unicode on win32
> + command_encoded = []
> + for c in command:
> + if isinstance(c, unicode):
> + command_
> + else:
> + command_
> + return command_encoded
> + else:
> + return command
>
> ^- I'm not sure whether c.encode('mbcs') can fail, or whether it will
> return '?' for unusable characters. For example, I can create Arabic or
> Japanese filenames (since the filesystem is UTF-16) even though 'mbcs'
> is only able to handle ~latin-1. Any thoughts on how to handle that
> cleanly? We don't have to change Popen to support Unicode (though that
> would be ideal), but at least something that would give users helpful
> feedback, rather than just tracebacks.
c.encode('mbcs') may fail on very rare circumstance. For example, path of
diff tool or temporary directory is not be able to encode with 'mbcs'.
I don't have any idea what should we do on that circumstance.
But filepath on commandline should not be fail decoding unlsess
decode(
> + def test_encodable_
> + import sys
> + diffobj = diff.DiffFromTo
> + None, None, No...
methane (songofacandy) wrote : | # |
Done
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 5/6/2011 8:28 PM, INADA Naoki wrote:
> INADA Naoki has proposed merging lp:~songofacandy/bzr/fix-523746-dev into lp:bzr.
>
> Requested reviews:
> John A Meinel (jameinel)
> Alexander Belchenko (bialix)
> Gary van der Merwe (garyvdm)
> Martin [gz] (gz)
> Vincent Ladeuil (vila)
> bzr-core (bzr-core)
> Related bugs:
> Bug #523746 in Bazaar: "crash with winmerge in cp932 japanese character"
> https:/
>
> For more details, see:
> https:/
Can you give a little bit more background in your merge proposal? I know
there is a bug linked, but describing what you are doing makes it easier
to understand the diff.
review: needsinfo
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAk3
dgkAoJ8kAv8ws8A
=u/ie
-----END PGP SIGNATURE-----
methane (songofacandy) wrote : | # |
OK.
1. On Windows.
Some diff application doesn't receive command line argument as a UTF-16 string
but as a string encoded with current codepage.
In this situation, passing Unicode argument to child process may fail
when the file path
contains character that is not be able to encode with current codepage.
So, we should use only chars those are able to encode with current codepage as a
temporary file.
2. On Unix.
All application may receive commandline argument as just bytes. So we can pass
any path as a argument.
But, GUI diff tool may be not able to show such filepath. So we should
use string
that is able to encode with current locale as a temporary file, too.
My fix removes chars those are not able to encode with current codepage/locale.
And force to use temporary file when checkouted file contains such chars as a
path.
On Thu, May 12, 2011 at 4:23 PM, John Arbash Meinel
<email address hidden> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 5/6/2011 8:28 PM, INADA Naoki wrote:
>> INADA Naoki has proposed merging lp:~songofacandy/bzr/fix-523746-dev into lp:bzr.
>>
>> Requested reviews:
>> John A Meinel (jameinel)
>> Alexander Belchenko (bialix)
>> Gary van der Merwe (garyvdm)
>> Martin [gz] (gz)
>> Vincent Ladeuil (vila)
>> bzr-core (bzr-core)
>> Related bugs:
>> Bug #523746 in Bazaar: "crash with winmerge in cp932 japanese character"
>> https:/
>>
>> For more details, see:
>> https:/
>
> Can you give a little bit more background in your merge proposal? I know
> there is a bug linked, but describing what you are doing makes it easier
> to understand the diff.
>
> review: needsinfo
>
> John
> =:->
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (Cygwin)
> Comment: Using GnuPG with Mozilla - http://
>
> iEYEARECAAYFAk3
> dgkAoJ8kAv8ws8A
> =u/ie
> -----END PGP SIGNATURE-----
>
--
INADA Naoki <email address hidden>
John A Meinel (jameinel) wrote : | # |
Seems fine to me. I don't really like the encode().decode() stuff, but I'll live with it.
Martin Pool (mbp) wrote : | # |
sent to pqm by email
Preview Diff
1 | === modified file 'bzrlib/diff.py' |
2 | --- bzrlib/diff.py 2011-04-07 10:36:24 +0000 |
3 | +++ bzrlib/diff.py 2011-05-06 17:43:35 +0000 |
4 | @@ -745,8 +745,18 @@ |
5 | |
6 | def _get_command(self, old_path, new_path): |
7 | my_map = {'old_path': old_path, 'new_path': new_path} |
8 | - return [AtTemplate(t).substitute(my_map) for t in |
9 | - self.command_template] |
10 | + command = [AtTemplate(t).substitute(my_map) for t in |
11 | + self.command_template] |
12 | + if sys.platform == 'win32': # Popen doesn't accept unicode on win32 |
13 | + command_encoded = [] |
14 | + for c in command: |
15 | + if isinstance(c, unicode): |
16 | + command_encoded.append(c.encode('mbcs')) |
17 | + else: |
18 | + command_encoded.append(c) |
19 | + return command_encoded |
20 | + else: |
21 | + return command |
22 | |
23 | def _execute(self, old_path, new_path): |
24 | command = self._get_command(old_path, new_path) |
25 | @@ -772,12 +782,42 @@ |
26 | raise |
27 | return True |
28 | |
29 | + @staticmethod |
30 | + def _fenc(): |
31 | + """Returns safe encoding for passing file path to diff tool""" |
32 | + if sys.platform == 'win32': |
33 | + return 'mbcs' |
34 | + else: |
35 | + # Don't fallback to 'utf-8' because subprocess may not be able to |
36 | + # handle utf-8 correctly when locale is not utf-8. |
37 | + return sys.getfilesystemencoding() or 'ascii' |
38 | + |
39 | + def _is_safepath(self, path): |
40 | + """Return true if `path` may be able to pass to subprocess.""" |
41 | + fenc = self._fenc() |
42 | + try: |
43 | + return path == path.encode(fenc).decode(fenc) |
44 | + except UnicodeError: |
45 | + return False |
46 | + |
47 | + def _safe_filename(self, prefix, relpath): |
48 | + """Replace unsafe character in `relpath` then join `self._root`, |
49 | + `prefix` and `relpath`.""" |
50 | + fenc = self._fenc() |
51 | + # encoded_str.replace('?', '_') may break multibyte char. |
52 | + # So we should encode, decode, then replace(u'?', u'_') |
53 | + relpath_tmp = relpath.encode(fenc, 'replace').decode(fenc, 'replace') |
54 | + relpath_tmp = relpath_tmp.replace(u'?', u'_') |
55 | + return osutils.pathjoin(self._root, prefix, relpath_tmp) |
56 | + |
57 | def _write_file(self, file_id, tree, prefix, relpath, force_temp=False, |
58 | allow_write=False): |
59 | if not force_temp and isinstance(tree, WorkingTree): |
60 | - return tree.abspath(tree.id2path(file_id)) |
61 | - |
62 | - full_path = osutils.pathjoin(self._root, prefix, relpath) |
63 | + full_path = tree.abspath(tree.id2path(file_id)) |
64 | + if self._is_safepath(full_path): |
65 | + return full_path |
66 | + |
67 | + full_path = self._safe_filename(prefix, relpath) |
68 | if not force_temp and self._try_symlink_root(tree, prefix): |
69 | return full_path |
70 | parent_dir = osutils.dirname(full_path) |
71 | @@ -840,13 +880,13 @@ |
72 | """ |
73 | old_path = self.old_tree.id2path(file_id) |
74 | new_path = self.new_tree.id2path(file_id) |
75 | - new_abs_path = self._prepare_files(file_id, old_path, new_path, |
76 | - allow_write_new=True, |
77 | - force_temp=True)[1] |
78 | - command = self._get_command(osutils.pathjoin('old', old_path), |
79 | - osutils.pathjoin('new', new_path)) |
80 | + old_abs_path, new_abs_path = self._prepare_files( |
81 | + file_id, old_path, new_path, |
82 | + allow_write_new=True, |
83 | + force_temp=True) |
84 | + command = self._get_command(old_abs_path, new_abs_path) |
85 | subprocess.call(command, cwd=self._root) |
86 | - new_file = open(new_abs_path, 'r') |
87 | + new_file = open(new_abs_path, 'rb') |
88 | try: |
89 | return new_file.read() |
90 | finally: |
91 | |
92 | === modified file 'bzrlib/tests/test_diff.py' |
93 | --- bzrlib/tests/test_diff.py 2011-04-18 23:19:16 +0000 |
94 | +++ bzrlib/tests/test_diff.py 2011-05-06 17:43:35 +0000 |
95 | @@ -33,7 +33,7 @@ |
96 | transform, |
97 | ) |
98 | from bzrlib.symbol_versioning import deprecated_in |
99 | -from bzrlib.tests import features |
100 | +from bzrlib.tests import features, EncodingAdapter |
101 | from bzrlib.tests.blackbox.test_diff import subst_dates |
102 | |
103 | |
104 | @@ -1421,6 +1421,52 @@ |
105 | diff_obj._prepare_files('file2-id', 'oldname2', 'newname2') |
106 | |
107 | |
108 | +class TestDiffFromToolEncodedFilename(tests.TestCaseWithTransport): |
109 | + |
110 | + def test_encodable_filename(self): |
111 | + # Just checks file path for external diff tool. |
112 | + # We cannot change CPython's internal encoding used by os.exec*. |
113 | + import sys |
114 | + diffobj = diff.DiffFromTool(['dummy', '@old_path', '@new_path'], |
115 | + None, None, None) |
116 | + for _, scenario in EncodingAdapter.encoding_scenarios: |
117 | + encoding = scenario['encoding'] |
118 | + dirname = scenario['info']['directory'] |
119 | + filename = scenario['info']['filename'] |
120 | + |
121 | + self.overrideAttr(diffobj, '_fenc', lambda: encoding) |
122 | + relpath = dirname + u'/' + filename |
123 | + fullpath = diffobj._safe_filename('safe', relpath) |
124 | + self.assertEqual( |
125 | + fullpath, |
126 | + fullpath.encode(encoding).decode(encoding) |
127 | + ) |
128 | + self.assert_(fullpath.startswith(diffobj._root + '/safe')) |
129 | + |
130 | + def test_unencodable_filename(self): |
131 | + import sys |
132 | + diffobj = diff.DiffFromTool(['dummy', '@old_path', '@new_path'], |
133 | + None, None, None) |
134 | + for _, scenario in EncodingAdapter.encoding_scenarios: |
135 | + encoding = scenario['encoding'] |
136 | + dirname = scenario['info']['directory'] |
137 | + filename = scenario['info']['filename'] |
138 | + |
139 | + if encoding == 'iso-8859-1': |
140 | + encoding = 'iso-8859-2' |
141 | + else: |
142 | + encoding = 'iso-8859-1' |
143 | + |
144 | + self.overrideAttr(diffobj, '_fenc', lambda: encoding) |
145 | + relpath = dirname + u'/' + filename |
146 | + fullpath = diffobj._safe_filename('safe', relpath) |
147 | + self.assertEqual( |
148 | + fullpath, |
149 | + fullpath.encode(encoding).decode(encoding) |
150 | + ) |
151 | + self.assert_(fullpath.startswith(diffobj._root + '/safe')) |
152 | + |
153 | + |
154 | class TestGetTreesAndBranchesToDiffLocked(tests.TestCaseWithTransport): |
155 | |
156 | def call_gtabtd(self, path_list, revision_specs, old_url, new_url): |
https:/ /code.edge. launchpad. net/~songofacan dy/bzr/ fix-523746- 2/+merge/ 19938 for bzr.dev.