[Python-Dev] Allowing non-ASCII identifiers
"Martin v. Löwis"
martin at v.loewis.de
Mon Feb 9 17:03:11 EST 2004
François Pinard wrote:
>>1. At run-time, identifiers are represented as Unicode objects unless
>>they are pure ASCII. IOW, they are converted from the source encoding
>>to Unicode objects in the process of parsing.
>
>
> This is already the case, isn't it?
Currently, all identifiers are byte strings at run-time, representing
ASCII characters. IOW, you currently won't observe Unicode strings
as identifiers.
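
A quick way to check how names are stored is to look at the type of the keys in a namespace dictionary. This is only a sketch: the class `Pupil` and the attribute `grade` are made-up names for illustration. On the 2.x interpreters discussed here, `str` is the byte-string type; on Python 3, `str` is the Unicode text type, so the same check reports a different underlying representation there.

```python
# Sketch: inspect the type of the dictionary keys that hold identifier names.
# `Pupil` and `grade` are hypothetical names used only for this illustration.
class Pupil(object):
    pass

p = Pupil()
p.grade = 3  # an ASCII identifier, stored as a key in the instance __dict__

# Collect the distinct types of all attribute names on the instance.
key_types = set(type(name) for name in vars(p))
print(key_types)
```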
>>2. As a consequence of 1), all places where identifiers appear need to
>>support Unicode objects (e.g. __dict__, __getattr__, etc.)
>
>
> I do not know the internals well, yet I suspect one more thing to
> consider is whether Unicode strings looking like non-ASCII identifiers
> should be interned or not, the same as currently done for ASCII.
Indeed; I had not thought about this.
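
For readers unfamiliar with interning, here is a minimal sketch of what it buys for ASCII names: two equal strings built separately collapse to one shared object, so name lookups can compare by identity. The `intern_name` alias is introduced here only so the snippet runs on both 2.x (builtin `intern`) and 3.x (`sys.intern`); whether and how this should extend to Unicode identifiers is exactly the open question raised above, and the sketch does not answer it.

```python
import sys

# Pick the interning function available on this interpreter.
try:
    intern_name = sys.intern   # Python 3
except AttributeError:
    intern_name = intern       # Python 2 builtin

# Build the same identifier text twice at run-time, then intern both results.
a = intern_name("".join(["gra", "de"]))
b = intern_name("".join(["gr", "ade"]))

# After interning, equal names refer to one shared object.
same_object = a is b
print(same_object)
```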
> # -*- coding: Latin-1 -*-
> élève = 3
> print élève
[...]
> So, the Python compiler is sensitive to the active locale.
Yes, that's a bug. It will use byte strings as identifiers
(without running your example, I'd expect dir() to show
that they are UTF-8).
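
To make that expectation concrete, the following sketch compares the Latin-1 and UTF-8 byte sequences for the identifier from the example. It only illustrates the two encodings and assumes, as guessed above, that the identifier ends up stored as UTF-8 bytes; it does not reproduce the compiler's behaviour.

```python
# The identifier from the example, written with escapes so this snippet
# does not depend on its own source encoding.
name = u"\u00e9l\u00e8ve"

latin1 = name.encode("latin-1")  # one byte per character
utf8 = name.encode("utf-8")      # two bytes for each accented character

print(repr(latin1), len(latin1))
print(repr(utf8), len(utf8))
```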
> This is kind of a happy bug! May we count on it being supported in the
> interim? :-) :-)
I would think so: this bug has been present for quite some time,
and nobody complained :-)
Martin