[Python-Dev] PEP 383 update: utf8b is now the error handler

Stephen J. Turnbull stephen at xemacs.org
Tue May 5 20:09:54 CEST 2009


MRAB writes:

 > [snip]
 > It might be slightly OT, but sometimes strict UTF-8 encoding is violated
 > by encoding U+0000 using 2 bytes (0xC0 0x80) so that 0x00 can be used as
 > a terminator. I think I read that Microsoft sometimes does this.

Nice hack, as long as you don't let it escape.  But if the 'strict'
error handler raises on that sequence, then PEP 383's 'utf8b' will do
the right thing, I think.
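
For concreteness, a rough sketch of that behaviour in Python 3.  It
assumes the handler called 'utf8b' in this thread is the one that later
shipped under the name "surrogateescape"; the helper decode_strict and
the sample bytes are made up for illustration.

```python
# Overlong two-byte form of U+0000 (0xC0 0x80), wrapped in ASCII letters.
overlong_nul = b"a\xc0\x80b"


def decode_strict(data):
    """Return the decoded text, or None if strict UTF-8 rejects the bytes."""
    try:
        return data.decode("utf-8", "strict")
    except UnicodeDecodeError:
        return None


# Assumption: "surrogateescape" is the released name of the PEP 383 handler.
# Each undecodable byte 0xNN becomes the lone surrogate U+DCNN.
escaped = overlong_nul.decode("utf-8", "surrogateescape")

# Encoding with the same handler turns the surrogates back into the raw bytes.
round_tripped = escaped.encode("utf-8", "surrogateescape")

print(decode_strict(overlong_nul))     # strict decoding refuses the input
print(ascii(escaped))                  # shows the escaped surrogates
print(round_tripped == overlong_nul)   # the original bytes survive the trip
```

So the two bytes never turn into a real NUL character inside the
string; they travel as escaped surrogates and come back out unchanged.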

