[Python-Dev] str() vs. unicode()
Walter Dörwald
walter@livinglogic.de
Tue, 25 Sep 2001 15:36:04 +0200
Guido van Rossum wrote:
>>Ok, let's remove the buffer API from unicode(). Should it still be
>>maintained for unicode(obj, encoding, errors) ?
>>
>
> I think so yes.
>
>
>>Hmm, perhaps we do need a __unicode__/tp_unicode slot after all.
>>It would certainly help clarify the communication between the
>>interpreter and the object.
>>
>
> Would you settle for a __unicode__ method but no tp_unicode slot?
> It's easy enough to define a C method named __unicode__ if the need
> arises. This should always be tried first, not just for classic
> instances. Adding a slot is a bit painful now that there are so many
> new slots already (adding it to the end means you have to add tons of
> zeros, adding it to the middle means I have to edit every file).
Hmm, what about a type object initialisation function that takes
"named arguments" via varargs:
    PyType_Initialize(&PyUnicode_Type,
        TYPE_TYPE, &PyType_Type,
        TYPE_NAME, "unicode",
        SLOT_DESTRUCTOR, _PyUnicode_Free,
        SLOT_CMP, unicode_compare,
        SLOT_REPR, unicode_repr,
        SLOT_SEQ, unicode_as_sequence,
        SLOT_HASH, unicode_hash,
        DONE
    );
The SLOT_xxx arguments would be #defines like this:
    #define DONE 0
    #define TYPE_TYPE 1
    #define TYPE_NAME 2
    #define SLOT_DESTRUCTOR 3
    #define SLOT_CMP 4
Adding a new slot would require much less work: define a new slot
*somewhere* in the struct, define a new SLOT_xxx and add
    SLOT_xxx, foo_xxx
to the call to the initializer for every type that implements this
slot. Performance shouldn't be a problem, because this function
would only be called once for every type. And we could get rid of
the problem with static initialization of ob_type with some
compilers.
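A minimal sketch of how such an initializer could work, using a toy
struct rather than the real type object. Everything here (ToyType,
ToyType_Initialize, the demo_* functions and the enum values) is
hypothetical and only illustrates the key/value varargs idea:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Toy stand-in for a type object; NOT the real PyTypeObject layout. */
typedef void (*toy_destructor)(void *);
typedef int (*toy_cmpfunc)(const void *, const void *);

typedef struct ToyType {
    struct ToyType *type;    /* plays the role of ob_type */
    const char *name;
    toy_destructor dealloc;
    toy_cmpfunc compare;
} ToyType;

/* Keys for the "named arguments"; DONE terminates the list. */
enum { DONE = 0, TYPE_TYPE, TYPE_NAME, SLOT_DESTRUCTOR, SLOT_CMP };

/* Clear every slot, then read (key, value) pairs until DONE.
   Returns 0 on success and -1 on an unknown key. An unknown key
   stops parsing because the type of its value is not known. */
int ToyType_Initialize(ToyType *type, ...)
{
    va_list ap;
    int key;
    int status = 0;

    assert(type != NULL);
    type->type = NULL;
    type->name = NULL;
    type->dealloc = NULL;
    type->compare = NULL;

    va_start(ap, type);
    while (status == 0 && (key = va_arg(ap, int)) != DONE) {
        switch (key) {
        case TYPE_TYPE:
            type->type = va_arg(ap, ToyType *);
            break;
        case TYPE_NAME:
            type->name = va_arg(ap, const char *);
            break;
        case SLOT_DESTRUCTOR:
            type->dealloc = va_arg(ap, toy_destructor);
            break;
        case SLOT_CMP:
            type->compare = va_arg(ap, toy_cmpfunc);
            break;
        default:
            status = -1;
            break;
        }
    }
    va_end(ap);
    return status;
}

/* Hypothetical slot implementations used only for illustration. */
void demo_free(void *p) { (void)p; }
int demo_compare(const void *a, const void *b) { return a != b; }
```

A type that does not pass a given key simply keeps NULL in that slot,
so adding a slot later only touches the struct, the key list and the
switch. The price is that the compiler cannot check the types of the
values passed through the varargs.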
> [...]
>
> I would add __unicode__ support without tp_unicode right away.
I like this idea. There is no need to piggyback unicode
representation of objects onto tp_str/__str__. Both PyObject_Str
and PyObject_Unicode will get much simpler.
But we will need int.__unicode__, float.__unicode__ etc.
(or a fallback to __str__).
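The lookup order with the fallback could be sketched roughly like
this. The struct and all names (DemoObject, Demo_Unicode, the demo_*
functions) are hypothetical, and plain C strings stand in for real
string and unicode objects, so the decoding step that a real __str__
fallback would need is left out:

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoObject DemoObject;
typedef const char *(*demo_textfunc)(const DemoObject *);

struct DemoObject {
    demo_textfunc unicode_method;  /* stands in for __unicode__, may be NULL */
    demo_textfunc str_method;      /* stands in for __str__, may be NULL */
};

/* Try the unicode method first; fall back to the str method.
   Returns NULL if the object provides neither. */
const char *Demo_Unicode(const DemoObject *obj)
{
    assert(obj != NULL);
    if (obj->unicode_method != NULL)
        return obj->unicode_method(obj);
    if (obj->str_method != NULL)
        return obj->str_method(obj);
    return NULL;
}

/* Hypothetical method implementations used only for illustration. */
const char *demo_int_str(const DemoObject *obj) { (void)obj; return "42"; }
const char *demo_text_unicode(const DemoObject *obj) { (void)obj; return "text"; }
```

With the fallback in place, a type such as int would only need the
str method and would still work when a unicode result is requested.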
BTW, what about __repr__? Should this be allowed to return unicode
objects? (currently it is, and uses PyUnicode_AsUnicodeEscapeString)
Bye,
Walter Dörwald