Code review comment for lp:~shawn111/bzr/lp_propose_message

Richard Wilbur (richard-wilbur) wrote :

It turns out I needed to learn more about Unicode, so I did a bit of reading online. I found useful descriptions of Unicode[0] and the UTF-8 encoding[1] on Wikipedia. Then I looked up Python's Unicode support specifically and found the Unicode HOWTO for Python 2.7.[2] An interesting excerpt from the Python documentation:
-------------------------------------------------
Tips for Writing Unicode-aware Programs

This section provides some suggestions on writing software that deals with Unicode.

The most important tip is:
    Software should only work with Unicode strings internally, converting to a particular encoding on output.
-------------------------------------------------
I appreciate your observation that the message string is a unicode object. That means we support multilingual input. It looks to me as though, in order to return the same type of string from Proposer.get_comment as before, we still want to remove leading and trailing whitespace (strip) and, from my reading, we still need to convert the result to UTF-8 for output (encode('utf-8')), as that is the expected encoding for HTML.
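
To make the suggestion concrete, here is a minimal sketch of the strip-then-encode step. The helper name normalize_comment and the sample message are hypothetical and only for illustration; this is not the actual body of Proposer.get_comment, just the pattern I have in mind, assuming the message arrives as a unicode object.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper for illustration only; not the real Proposer.get_comment.

def normalize_comment(message):
    """Strip surrounding whitespace, then encode the text as UTF-8 bytes.

    `message` is assumed to be a unicode object (a `str` on Python 3).
    The text stays Unicode while we work on it and is encoded only at the
    output boundary, as the Python HOWTO tip recommends.
    """
    stripped = message.strip()       # still a Unicode string at this point
    return stripped.encode('utf-8')  # byte string, ready for output


# Sample input with non-ASCII text and stray whitespace (made up for this sketch).
raw = u"  Fix caf\u00e9 handling \n"
encoded = normalize_comment(raw)

# The "é" (U+00E9) becomes the two bytes 0xC3 0xA9 in UTF-8.
assert encoded == b"Fix caf\xc3\xa9 handling"
```

Stripping before encoding keeps the whitespace handling in the Unicode domain, so the result does not depend on how whitespace happens to be represented in the encoded bytes.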

References:
[0] https://en.wikipedia.org/wiki/Unicode
[1] https://en.wikipedia.org/wiki/UTF-8
[2] https://docs.python.org/2/howto/unicode.html
