> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 05/06/2011 03:29 AM, INADA Naoki wrote:
> > INADA Naoki has proposed merging lp:~songofacandy/bzr/fix-523746-dev into
> lp:bzr.
> >
> > Requested reviews:
> > Vincent Ladeuil (vila)
> > bzr-core (bzr-core)
> > Related bugs:
> > Bug #523746 in Bazaar: "crash with winmerge in cp932 japanese character"
> > https://bugs.launchpad.net/bzr/+bug/523746
> >
> > For more details, see:
> > https://code.launchpad.net/~songofacandy/bzr/fix-523746-dev/+merge/20537
>
>
> + def _safe_filename(self, prefix, relpath):
> + if sys.platform == 'win32':
> + fenc = 'mbcs'
> + else:
> + fenc = sys.getfilesystemencoding() or 'ascii'
> + # encoded_str.replace('?', '_') may break multibyte char.
> + # So we should encode, decode, then replace(u'?', u'_')
> + relpath_tmp = relpath.encode(fenc, 'replace').decode(fenc,
> 'replace')
> + relpath_tmp = relpath_tmp.replace(u'?', u'_')
> + return osutils.pathjoin(self._root, prefix, relpath_tmp)
>
> Why are we using "encode('replace')" rather than "encode('ignore')". It
> doesn't seem to buy us much to be putting "_" into the filename, rather
> than just writing out whatever letters we can.
Because "encode('ignore')" may makes null string.
Fallback to other string is also OK, but I don't have nice name for fallback.
> I also don't understand why you do the encode/decode dance and the put a
> munged Unicode string back into the path. I think I sort of understand,
> but a docstring would probably help a lot at understanding why we need
> to use this function.
OK, I'll add docstrings.
>
> ...
>
> + command = [AtTemplate(t).substitute(my_map) for t in
> + self.command_template]
> + if sys.platform == 'win32': # Popen doesn't accept unicode on win32
> + command_encoded = []
> + for c in command:
> + if isinstance(c, unicode):
> + command_encoded.append(c.encode('mbcs'))
> + else:
> + command_encoded.append(c)
> + return command_encoded
> + else:
> + return command
>
> ^- I'm not sure whether c.encode('mbcs') can fail, or whether it will
> return '?' for unusable characters. For example, I can create Arabic or
> Japanese filenames (since the filesystem is UTF-16) even though 'mbcs'
> is only able to handle ~latin-1. Any thoughts on how to handle that
> cleanly? We don't have to change Popen to support Unicode (though that
> would be ideal), but at least something that would give users helpful
> feedback, rather than just tracebacks.
c.encode('mbcs') may fail on very rare circumstance. For example, path of
diff tool or temporary directory is not be able to encode with 'mbcs'.
I don't have any idea what should we do on that circumstance.
But filepath on commandline should not be fail decoding unlsess
decode('mbcs').encode('mbcs') can be fail.
> + def test_encodable_filename(self):
> + import sys
> + diffobj = diff.DiffFromTool(['python', '@old_path', '@new_path'],
> + None, None, None)
>
>
> On *my* system, there is no 'python' (I never add it to sys.path, as I
> happen to have 2.4/5/6/7 all installed.) There is 'sys.executable', though.
>
> diffobj = diff.DiffFromTool([sys.executable, ...])
>
> Also, we probably want to skip this test if "sys.frozen" is set, since I
> don't think we can run it from 'bzr.exe'. I think we already had a
> feature for that.
It doesn't execute any commands. It just builds file names.
So I'll change 'python' to 'dummy'.
> -----BEGIN PGP SIGNED MESSAGE----- /bugs.launchpad .net/bzr/ +bug/523746 /code.launchpad .net/~songofaca ndy/bzr/ fix-523746- dev/+merge/ 20537 self, prefix, relpath): emencoding( ) or 'ascii' str.replace( '?', '_') may break multibyte char. encode( fenc, 'replace' ).decode( fenc, tmp.replace( u'?', u'_') pathjoin( self._root, prefix, relpath_tmp)
> Hash: SHA1
>
> On 05/06/2011 03:29 AM, INADA Naoki wrote:
> > INADA Naoki has proposed merging lp:~songofacandy/bzr/fix-523746-dev into
> lp:bzr.
> >
> > Requested reviews:
> > Vincent Ladeuil (vila)
> > bzr-core (bzr-core)
> > Related bugs:
> > Bug #523746 in Bazaar: "crash with winmerge in cp932 japanese character"
> > https:/
> >
> > For more details, see:
> > https:/
>
>
> + def _safe_filename(
> + if sys.platform == 'win32':
> + fenc = 'mbcs'
> + else:
> + fenc = sys.getfilesyst
> + # encoded_
> + # So we should encode, decode, then replace(u'?', u'_')
> + relpath_tmp = relpath.
> 'replace')
> + relpath_tmp = relpath_
> + return osutils.
>
> Why are we using "encode('replace')" rather than "encode('ignore')". It
> doesn't seem to buy us much to be putting "_" into the filename, rather
> than just writing out whatever letters we can.
Because "encode('ignore')" may makes null string.
Fallback to other string is also OK, but I don't have nice name for fallback.
> I also don't understand why you do the encode/decode dance and the put a
> munged Unicode string back into the path. I think I sort of understand,
> but a docstring would probably help a lot at understanding why we need
> to use this function.
OK, I'll add docstrings.
> t).substitute( my_map) for t in template] encoded. append( c.encode( 'mbcs') ) encoded. append( c)
> ...
>
> + command = [AtTemplate(
> + self.command_
> + if sys.platform == 'win32': # Popen doesn't accept unicode on win32
> + command_encoded = []
> + for c in command:
> + if isinstance(c, unicode):
> + command_
> + else:
> + command_
> + return command_encoded
> + else:
> + return command
>
> ^- I'm not sure whether c.encode('mbcs') can fail, or whether it will
> return '?' for unusable characters. For example, I can create Arabic or
> Japanese filenames (since the filesystem is UTF-16) even though 'mbcs'
> is only able to handle ~latin-1. Any thoughts on how to handle that
> cleanly? We don't have to change Popen to support Unicode (though that
> would be ideal), but at least something that would give users helpful
> feedback, rather than just tracebacks.
c.encode('mbcs') may fail on very rare circumstance. For example, path of
diff tool or temporary directory is not be able to encode with 'mbcs'.
I don't have any idea what should we do on that circumstance.
But filepath on commandline should not be fail decoding unlsess 'mbcs') .encode( 'mbcs') can be fail.
decode(
> + def test_encodable_ filename( self): ol(['python' , '@old_path', '@new_path'], ol([sys. executable, ...])
> + import sys
> + diffobj = diff.DiffFromTo
> + None, None, None)
>
>
> On *my* system, there is no 'python' (I never add it to sys.path, as I
> happen to have 2.4/5/6/7 all installed.) There is 'sys.executable', though.
>
> diffobj = diff.DiffFromTo
>
> Also, we probably want to skip this test if "sys.frozen" is set, since I
> don't think we can run it from 'bzr.exe'. I think we already had a
> feature for that.
It doesn't execute any commands. It just builds file names.
So I'll change 'python' to 'dummy'.