
Performance of decodeUtf8 on non-ascii text #1

Open
andrewthad opened this issue Mar 7, 2018 · 5 comments

Comments

@andrewthad
Owner

The performance of decodeUtf8 is excellent on ASCII text, since the check can be vectorized to operate on a full machine word at a time, and nearly all branch predictions are correct. However, for non-ASCII text, I haven't put much effort into optimizing it. It could never compete with the decoding of ASCII text, but it could probably be much better than it is now.
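The word-at-a-time ASCII check mentioned above can be sketched roughly like this (the names here are illustrative, not the library's actual internals): a chunk of eight bytes is all ASCII exactly when no byte has its high bit set, so a single mask-and-compare on a 64-bit word tests eight bytes at once.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word64)

-- One high bit per packed byte.
highBitMask :: Word64
highBitMask = 0x8080808080808080

-- True when all eight bytes packed into the word are ASCII (< 0x80).
-- The fast path can walk the buffer a Word64 at a time with this test
-- and only fall back to byte-by-byte decoding when it fails.
allAscii64 :: Word64 -> Bool
allAscii64 w = w .&. highBitMask == 0
```

Because the branch almost always goes the same way on ASCII-heavy input, the predictor stays accurate, which is what makes the fast path cheap.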

@chessai
Collaborator

chessai commented Mar 30, 2018

Do you have any ideas for how it could be optimised? Why can you not reach the efficiency of decodeUtf8 from Data.Text?

I think it can be achieved with sheer willpower alone.

@andrewthad
Owner Author

andrewthad commented Mar 30, 2018

It could certainly match the performance of decodeUtf8 from Data.Text. Right now, I have some bounds checks that are redundant. It may be possible to eliminate them. Alternatively, it may improve things to make the code more concise. I think we could instead have a single helper function that handles two-byte, three-byte, and four-byte characters.
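The consolidation idea — one helper covering two-, three-, and four-byte sequences instead of three near-identical branches — could look something like this sketch (not the library's actual code; validation of the continuation tags is assumed to have happened already):

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Decode one multi-byte UTF-8 sequence. The caller passes the payload
-- mask for the leading byte (0x1F, 0x0F, or 0x07 for 2-, 3-, or 4-byte
-- sequences) and the continuation bytes; each continuation contributes
-- its low six bits. No validation is done here.
decodeMulti :: Word8 -> Word8 -> [Word8] -> Char
decodeMulti leadMask lead conts =
  chr (foldl step (fromIntegral (lead .&. leadMask)) conts)
  where
    step acc b = (acc `shiftL` 6) .|. fromIntegral (b .&. 0x3F)
```

For example, the two-byte sequence 0xC3 0xA9 decodes to U+00E9 ('é'), and the three-byte sequence 0xE2 0x82 0xAC decodes to U+20AC ('€'). Whether collapsing the branches this way actually helps would depend on how well GHC unrolls the fold versus the specialized per-length code.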

@chessai
Collaborator

chessai commented Mar 30, 2018

Is it possible to write something like 'isUtf8' (which we know can be made relatively efficient), and if that function returns true, make a pass over the text?

By the way, the willpower thing was a joke.

@chessai
Collaborator

chessai commented Mar 30, 2018

We have to perform the check for utf8 and then decode the character anyway, so maybe that would be an OK solution.

@andrewthad
Owner Author

Actually, that's already what it does. It just passes over the Bytes and checks to see if it is UTF-8. So, it's zero-copy unless there are disallowed code points present. In that case, we have to clean them up, which requires allocating a new bytearray.
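A simplified sketch of the single validation pass the zero-copy path relies on (names here are hypothetical, and this version skips the overlong-encoding and surrogate checks a real validator needs): if the scan succeeds, the decoded text can share the input bytes; only on failure does a cleanup copy get allocated.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- Simplified structural UTF-8 check: leading byte determines how many
-- continuation bytes (tagged 10xxxxxx) must follow.
validUtf8 :: [Word8] -> Bool
validUtf8 [] = True
validUtf8 (b : bs)
  | b < 0x80  = validUtf8 bs   -- ASCII
  | b < 0xC2  = False          -- stray continuation or overlong lead
  | b < 0xE0  = conts 1 bs     -- two-byte sequence
  | b < 0xF0  = conts 2 bs     -- three-byte sequence
  | b < 0xF5  = conts 3 bs     -- four-byte sequence
  | otherwise = False
  where
    conts n rest =
      let (cs, rest') = splitAt n rest
      in  length cs == n && all isCont cs && validUtf8 rest'
    isCont c = c .&. 0xC0 == 0x80
```

The key point is that this pass only reads; allocation happens solely on the error path, so well-formed input stays zero-copy.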
