[Python-Dev] Unicode
Jack Jansen
Jack.Jansen@oratrix.com
Mon, 29 Apr 2002 00:05:13 +0200
On Friday, April 26, 2002, at 06:26, Guido van Rossum wrote:
> No syntactic changes, no. But the way we do things would become
> significantly different. And think of binary I/O vs. textual I/O --
> currently, file.read() returns a string. Code dealing with binary
> files will look significantly different, and old code won't work.
It could be argued that open(..., 'r').read() returns a text
string and open(..., 'rb').read() returns a binary blob.
If text strings and blobs become wholly different objects, this
shouldn't create too many problems [see below], except for code
that opens a file in binary mode and then reads (part of) it
expecting text. But that code would need revisiting anyway if
the normal text string became Unicode.
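A minimal sketch of that reading, assuming present-day Python 3
semantics (where the two modes do return different types) rather than
the interpreter under discussion here. The helper name read_both_ways
is made up for illustration.

```python
import os
import tempfile


def read_both_ways(payload: bytes):
    """Write payload to a temporary file, then read it back twice:
    once in text mode and once in binary mode.
    (Hypothetical helper, for illustration only.)"""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        # Text mode: the bytes are decoded and a text string comes back.
        with open(path, "r", encoding="utf-8") as f:
            as_text = f.read()
        # Binary mode: the raw bytes come back untouched.
        with open(path, "rb") as f:
            as_blob = f.read()
    finally:
        os.remove(path)
    return as_text, as_blob


text, blob = read_both_ways("caf\u00e9".encode("utf-8"))
print(type(text).__name__, len(text))   # str 4
print(type(blob).__name__, len(blob))   # bytes 5
```

The two lengths differ because the text string counts characters and
the blob counts bytes. Code that opens in 'rb' and then treats the
result as text is exactly the code that has to add an explicit decode
step.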
[here's below] To my surprise, I think that having blobs and
text strings be unrelated objects creates fewer problems than
having the one be a subtype of the other. At least, every time I
try to do the subtyping in my head I flip back and forth between
textstrings-are-a-subtype-of-general-binary-buffers and
binary-buffers-are-a-special-case-of-python-strings every couple
of seconds. I think having them both be subtypes of a common
base type (basestring) might work, but I'm not sure.
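One way the common-base-type idea might look, as a rough sketch only.
The class names BaseString, TextString and Blob are hypothetical, and
which operations live on the base is an assumption, not a worked-out
design.

```python
class BaseString:
    """Hypothetical common base: only what text and blobs share."""

    def __init__(self, items):
        self._items = tuple(items)

    def __len__(self):
        return len(self._items)

    def __add__(self, other):
        # Concatenation is only defined within one concrete type,
        # so text + blob is refused rather than silently coerced.
        if type(other) is not type(self):
            return NotImplemented
        return type(self)(self._items + other._items)


class TextString(BaseString):
    """Hypothetical text type: a sequence of characters."""

    def encode(self, encoding="utf-8"):
        return Blob("".join(self._items).encode(encoding))


class Blob(BaseString):
    """Hypothetical binary type: a sequence of byte values."""

    def decode(self, encoding="utf-8"):
        return TextString(bytes(self._items).decode(encoding))


t = TextString("abc")
b = t.encode()

# Both are "some kind of string", but neither is a subtype of the other.
print(isinstance(t, BaseString), isinstance(b, BaseString))
print(issubclass(TextString, Blob), issubclass(Blob, TextString))

try:
    t + b
except TypeError:
    print("mixing text and blobs needs an explicit encode/decode")
```

With siblings under one base, the back-and-forth over which type is
the subtype goes away: generic code can test against the base, and
every crossing between the two goes through encode() or decode().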
--
- Jack Jansen <Jack.Jansen@oratrix.com> http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -