As previously discussed on the coreutils mailing list, beginning with

  http://lists.gnu.org/archive/html/coreutils/2017-12/msg00074.html

most of the coreutils text-processing commands operate on bytes rather than characters, regardless of the user's locale, so they do not correctly handle UTF-8 in either their input text or their options.
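
To make the distinction concrete, here is a minimal Python sketch of the difference between selecting by byte and selecting by character. It is only an illustration, not the coreutils code or the proposed patch, and the function names first_n_bytes and first_n_chars are hypothetical names made up for this example.

```python
# Illustration only: byte-wise versus character-wise selection of a
# UTF-8 string. Not taken from coreutils; the function names are made up.

def first_n_bytes(data: bytes, n: int) -> bytes:
    """Keep the first n bytes, ignoring character boundaries."""
    return data[:n]

def first_n_chars(data: bytes, n: int) -> bytes:
    """Decode as UTF-8, keep the first n characters, and re-encode."""
    return data.decode("utf-8")[:n].encode("utf-8")

# "h\u00e9llo" is "hello" with an e-acute, which takes two bytes in UTF-8.
data = "h\u00e9llo".encode("utf-8")

byte_cut = first_n_bytes(data, 2)   # ends in the middle of the e-acute
char_cut = first_n_chars(data, 2)   # "h" plus the whole e-acute (3 bytes)

def is_valid_utf8(chunk: bytes) -> bool:
    """Return True if chunk decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(is_valid_utf8(byte_cut), is_valid_utf8(char_cut))
```

The byte-wise selection splits the two-byte character and leaves invalid UTF-8 behind, while the character-wise selection keeps it whole.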

I propose the changes in

  https://github.com/ericfischer/coreutils/compare/multibyte-squash

to convert sort, uniq, join, tr, cut, paste, expand, and unexpand to process characters instead of bytes, allowing them to work correctly on non-ASCII text, as specified by POSIX.

Eric Fischer