Merge lp:~gz/bzr/require_unicode_committer_614593 into lp:bzr
Status: | Merged |
---|---|
Approved by: | John A Meinel |
Approved revision: | no longer in the source branch. |
Merged at revision: | 5510 |
Proposed branch: | lp:~gz/bzr/require_unicode_committer_614593 |
Merge into: | lp:bzr |
Diff against target: |
46 lines (+13/-1) 3 files modified
bzrlib/repository.py (+2/-0)
bzrlib/tests/per_repository/test_commit_builder.py (+10/-0)
bzrlib/tests/test_testament.py (+1/-1) |
To merge this branch: | bzr merge lp:~gz/bzr/require_unicode_committer_614593 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Vincent Ladeuil | | | Approve |
John A Meinel | | | Pending |
Review via email: mp+38334@code.launchpad.net |
Commit message
Check committer values are ascii or unicode and fix a test where it was not
Description of the change
Currently commit can be passed a non-ascii str value for committer, and the value makes it all the way through to the repository serialisation code, where it can output bogus data. Because the input will usually already have been decoded to unicode from the command line or config, this is rarely fatal, but it lays a trap for test and plugin authors, who can do the wrong thing without realising. With Python 2.7 it happens that this is now caught by the xml escaping function, and a misspelt test started failing.
As well as fixing that test, I've added a guard in commit that raises UnicodeDecodeError for all non-ascii str values. I think that's better than decoding with user_encoding or similar as we don't actually have a good basis for believing a str passed is one encoding or another, so it's better to leave that up to the caller. We could trap the decode error and raise some other flavour of exception instead, but as this will mostly be for bzrlib coders rather than users that's probably no more informative.
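The guard described above can be sketched roughly as follows. This is an illustration only, not the code from the branch: the function name `normalize_committer` is hypothetical, and it is written for Python 3, where `bytes` stands in for the Python 2 `str` type that the description refers to. A strict ascii decode gives the behaviour described: ascii byte strings pass through as text, and anything else raises UnicodeDecodeError for the caller to deal with.

```python
def normalize_committer(committer):
    """Return the committer as text, rejecting non-ascii byte strings.

    Hypothetical helper for illustration; the name and shape are
    assumptions, not bzrlib's API.
    """
    if isinstance(committer, bytes):
        # Strict ascii decode: raises UnicodeDecodeError on any byte >= 0x80.
        # No guess is made about user_encoding or any other encoding.
        return committer.decode("ascii")
    # Already text (unicode), so pass it through unchanged.
    return committer


if __name__ == "__main__":
    print(normalize_committer(b"Joe Foo <joe@foo.com>"))
    try:
        normalize_committer(b"Erik B\xe5gfors <erik@example.com>")
    except UnicodeDecodeError as e:
        print("rejected:", e.reason)
```

Leaving the decode error unwrapped keeps the failure close to the caller that passed the wrong type, which matches the reasoning above that the audience is mostly bzrlib coders rather than users.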
This sounds fine to me, but I'd like John to confirm that we are indeed expecting Unicode there.