Title: The Python Apprentice
Authors: Robert Smallshire, Austin Bingham
Updated: 2021-07-02 22:16:57

Strings with Unicode

Strings are fully Unicode capable, so you can use them with international characters easily, even in literals, because the default source code encoding for Python 3 is UTF-8. For example, if you have access to Norwegian characters, you can simply enter this:

>>> "Vi er så glad for å høre og lære om Python!"
'Vi er så glad for å høre og lære om Python!'

Alternatively, you can use the hexadecimal representations of Unicode code points as an escape sequence prefixed by \u:

>>> "Vi er s\u00e5 glad for \u00e5 h\u00f8re og l\u00e6re om Python!"
'Vi er så glad for å høre og lære om Python!'

We're sure you'll agree, though, that this is somewhat more unwieldy.
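
As a quick check of the equivalence described above, the following minimal sketch compares the literal and escaped spellings of the same greeting, and uses the built-in ord() to recover the hexadecimal code points that follow \u. The variable names are ours, chosen only for illustration.

```python
# The same greeting written two ways: with literal characters
# and with \u escapes (four hexadecimal digits per code point).
literal = "Vi er så glad for å høre og lære om Python!"
escaped = "Vi er s\u00e5 glad for \u00e5 h\u00f8re og l\u00e6re om Python!"

# Both literals denote the same sequence of Unicode code points.
same = literal == escaped

# ord() returns a character's code point as an integer; formatting it
# as four hex digits gives the value you would write after \u.
code_points = {ch: f"{ord(ch):04x}" for ch in "åøæ"}

print(same)
print(code_points)
```

The escapes are resolved when the literal is read, so the two forms are indistinguishable once the string exists.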

Similarly, you can use the \x escape sequence followed by a two-digit hexadecimal string to include Unicode code points up to U+00FF (those that fit in one byte) in a string literal:

>>> '\xe5'
'å'
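
A small sketch of how \x relates to the \u form may help here. It also uses the eight-digit \U escape, which this section does not cover but which is part of Python's string literal syntax. The variable names are illustrative only.

```python
# Three spellings of the same single character, U+00E5.
a_ring_x = "\xe5"         # \x takes exactly two hex digits
a_ring_u = "\u00e5"       # \u takes exactly four hex digits
a_ring_U = "\U000000e5"   # \U takes exactly eight hex digits

all_equal = a_ring_x == a_ring_u == a_ring_U

# Because \x consumes exactly two hex digits, any characters that
# follow are ordinary text, even if they look like hex digits.
mixed = "\xe5ab"

print(all_equal)
print(len(mixed))
```

Here mixed is three characters long: the escaped character followed by a plain 'a' and 'b'.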

You can even use an escaped octal string, written as a single backslash followed by three digits in the range zero to seven, although we confess we've never seen this used in practice, except inadvertently as a bug:

>>> '\345'
'å'
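
To illustrate how an octal escape can creep in as a bug, here is a hedged sketch using a made-up Windows-style path, reports\2021, which is purely hypothetical. The backslash followed by the digits 202 is read as an octal escape, silently changing the string. A raw string, marked with an r prefix, is one way to avoid this.

```python
# Octal 345 is decimal 229, which is hexadecimal e5.
octal = "\345"
same_as_hex = octal == "\xe5"

# Hypothetical path: the author meant a backslash followed by "2021",
# but \202 is consumed as an octal escape (code point U+0082).
buggy = "reports\2021"

# A raw string leaves the backslash and the digits untouched.
fixed = r"reports\2021"

print(same_as_hex)
print(len(buggy), len(fixed))
```

The buggy string is three characters shorter than intended, because the four characters \202 collapse into a single control character.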

There are no such Unicode capabilities in the otherwise similar bytes type, which we'll look at next.
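
As a brief preview of that difference, the sketch below contrasts a str with a bytes object. It relies on str.encode() and bytes.decode(), which are not introduced in this section, so treat it as an illustration under that assumption rather than a full treatment.

```python
# In a str, \xe5 denotes the character with code point U+00E5.
text = "\xe5"

# In a bytes literal, \xe5 is just the single byte with value 229.
raw = b"\xe5"
first_byte = raw[0]

# Moving between the two requires an explicit encoding. In UTF-8,
# this one character is represented by two bytes.
encoded = text.encode("utf-8")
decoded = encoded.decode("utf-8")

print(first_byte)
print(encoded, len(encoded))
print(decoded == text)
```

Indexing a bytes object yields an integer, not a character, which is one visible sign that bytes carries no notion of Unicode text.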
