Everything you did not want to know about Unicode in Python 3

Mark H Harris harrismh777 at gmail.com
Mon May 12 21:58:45 EDT 2014


On 5/12/14 8:18 PM, Steven D'Aprano wrote:
> Unicode is hard, not because Unicode is hard, but because of legacy
> problems.

Yes.  To put a finer point on that, Unicode (which is only a
specification, and one that is constantly being revised) is harder to
implement when it was not part of the design from the ground up, as is
the case with Python.

Julia has had Unicode support from the ground up, and it was easier for
those guys to implement (in a beta release) than it was for the Python
crew when they undertook the Unicode work that had to be done for
Python 3.x (just an observation).

Any time there are legacy code issues, regression testing problems, and
a host of domain issues that weren't thought through from the get-go,
the hurdles are going to be harder to clear, not to mention the bugs.

Having said that, I still think Unicode is somewhat harder than you're 
admitting.
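
One small illustration of difficulty that belongs to Unicode itself
rather than to any legacy code base: the same visible character can be
spelled with different code point sequences, so a naive comparison
fails until both sides are normalized.  This is only a minimal sketch
for Python 3; the variable names are made up for the example.

```python
import unicodedata

composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# Both render as the same character, but they are different strings.
print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 1 2

# Normalizing to a common form (NFC here) makes them compare equal.
nfc = unicodedata.normalize("NFC", decomposed)
print(nfc == composed)                 # True
```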

marcus



