Bazaar

Code review comment for lp:~songofacandy/bzr/fix-523746-dev

fix-523746-dev
Merge into bzr.dev

Revision history for this message

Vincent Ladeuil (vila) wrote on 2010-04-22:

>>>>> "Naoki" == INADA Naoki <email address hidden> writes:

    >>> 1. osutils.get_user_encoding() is not good in rare case.
    >>> user_encoding is not equals to filesystem encoding.
    >>
    >> My assumption was that subprocess.Popen should receive its arguments as if
    >> they were coming from the command line: in user encoding.

Naoki> OK, I don't know about when get_user_encoding() returns encoding that is not
Naoki> suitable for commandline arguments.

Good.

    >>> 2. path.encode(enc) can raise UnicodeEncodeError.
    >>> So bzr should use "encode(enc, 'replace')"
    >>
    >> That will mean some path can't be encoded in user encoding which
    >> in turn means it could not be user as a parameter to bzr no ?

Naoki> No. bzr uses unicode arguments in Windows.

No. bzr uses unicode paths internally, they are coming from the command
line and decoded with user encoding or from the bzr db where they are
encoding into utf-8.

Naoki> So users can pass unicode filename to bzr.

Encoded appropriately yes. They are decoded into unicode and encoded by
bzr or by python when access to the filesystem is needed at which point
the file system encoding is used (which may be different from the user
encoding).

Naoki> And "bzr diff --using=winmergeu.exe" doesn't requires file
Naoki> path but shows diff on such a file.

Sure and it may decode them using the *terminal* encoding for the
content of the diff but DiffFromTool has a parameter 'path_encoding' to
control that.

Note that this is still not the encoding we need to use to create any
needed file though.

    >>> 3. _write_file() writes file with unicode path.
    >>> But if bzr can't encode the path collectly, external command can't
    >>> open the file.
    >>
    >> We rely on python to do the right thing from unicode paths there.

    Naoki> On windows, unicode path is too perfect.
    Naoki> If unicode path contains chars that is not be able to encode,
    Naoki> we can't use the path in encoded string.

If we can't use it here, we can't use it either when creating the
working tree, so I'll tend to ignore the problem here.

Naoki> But we can create a file with such a name when using unicode.
Naoki> So we should replace such a char before _write_file().

I don't think so.

Can we chat about that on IRC or can you provide me a file name that
breaks my patch (preferably as a python string using the \xnnnn
notation) ?

>>>>> "Naoki" == INADA Naoki <songofacandy@gmail.com> writes:

>>> 1. osutils.get_user_encoding() is not good in rare case.
    >>> user_encoding is not equals to filesystem encoding.
    >> 
    >> My assumption was that subprocess.Popen should receive its arguments as if
    >> they were coming from the command line: in user encoding.

Naoki> OK, I don't know about when get_user_encoding() returns encoding that is not
    Naoki> suitable for commandline arguments.

Good.

>>> 2. path.encode(enc) can raise UnicodeEncodeError.
    >>> So bzr should use "encode(enc, 'replace')"
    >> 
    >> That will mean some path can't be encoded in user encoding which
    >> in turn means it could not be user as a parameter to bzr no ?

Naoki> No. bzr uses unicode arguments in Windows.

No. bzr uses unicode paths internally, they are coming from the command
line and decoded with user encoding or from the bzr db where they are
encoding into utf-8.

Naoki> So users can pass unicode filename to bzr.

Naoki> And "bzr diff --using=winmergeu.exe" doesn't requires file
    Naoki> path but shows diff on such a file.

Sure and it may decode them using the *terminal* encoding for the
content of the diff but DiffFromTool has a parameter 'path_encoding' to
control that.

Note that this is still not the encoding we need to use to create any
needed file though.

>>> 3. _write_file() writes file with unicode path.
    >>> But if bzr can't encode the path collectly, external command can't
    >>> open the file.
    >> 
    >> We rely on python to do the right thing from unicode paths there.

Naoki> On windows, unicode path is too perfect.
    Naoki> If unicode path contains chars that is not be able to encode,
    Naoki> we can't use the path in encoded string.

If we can't use it here, we can't use it either when creating the
working tree, so I'll tend to ignore the problem here.

Naoki> But we can create a file with such a name when using unicode.
    Naoki> So we should replace such a char before _write_file().

I don't think so.

Can we chat about that on IRC or can you provide me a file name that
breaks my patch (preferably as a python string using the \xnnnn
notation) ?

« Back to merge proposal