I’m currently re-learning UTF-8 bits and pieces, so I thought I’d document some of the URLs along the way.
This Unicode introduction by Markus Kuhn seems like a must-read. He goes on to cover UTF-8 in Linux locales, which may or may not be interesting.
Another quick primer on the same UTF-8-on-a-Linux-box topic is this piece by Ed Trager.
This simple check for UTF-8 multibyte sequences is always worth remembering.
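For reference, here is a minimal sketch of one common check of this kind. It may not be exactly what the link describes, and the function names are my own. It relies on the fact that UTF-8 continuation bytes always have the bit pattern 10xxxxxx, so masking with 0xC0 and comparing against 0x80 tells you whether a byte sits inside a multibyte sequence.

```python
def is_continuation_byte(b: int) -> bool:
    """Return True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx)."""
    return (b & 0xC0) == 0x80


def count_code_points(data: bytes) -> int:
    """Count code points by skipping continuation bytes.

    Assumes data is already valid UTF-8; this does not validate anything.
    """
    return sum(1 for b in data if not is_continuation_byte(b))


if __name__ == "__main__":
    sample = "héllo".encode("utf-8")  # 'é' is encoded as two bytes: C3 A9
    print(len(sample), count_code_points(sample))
```

The same mask works in C or C++ as `(c & 0xC0) == 0x80`, which is handy when you need to step over whole characters in a byte buffer.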
This Unicode chart is quite handy.
UTF8-CPP code is by someone who’s been there and done that.
And this is how it’s done in Python!
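As a quick reminder of the basics, here is a small sketch using only the built-in `str.encode` and `bytes.decode` methods (Python 3). The sample strings are my own and are not taken from the linked page.

```python
text = "naïve €"

# str -> bytes: 'ï' takes 2 bytes and '€' takes 3 bytes in UTF-8.
encoded = text.encode("utf-8")

# bytes -> str: a round trip gives back the original text.
decoded = encoded.decode("utf-8")

# Invalid input raises UnicodeDecodeError by default; errors="replace"
# substitutes U+FFFD (the replacement character) instead.
lossy = b"abc\xff".decode("utf-8", errors="replace")

if __name__ == "__main__":
    print(len(text), len(encoded))
    print(decoded == text)
    print(repr(lossy))
```

The point to remember is that `len()` of the string counts code points, while `len()` of the encoded bytes counts bytes, so the two differ as soon as anything non-ASCII shows up.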