UTF-8 does not require a BOM, but for UTF-16 and UTF-32 a BOM is normally
present. Files in UTF-16 or UTF-32 without a BOM should be identified as
binary.

But why are there no plans to support UTF-16 and UTF-32?
Diff is part of Git and is used all over the world.
It is now 2018, and Unicode has solved the problems with encodings.

17.04.2018 10:37, Paul Eggert:
> Keepun wrote:
>> Files with encoding greater than 8 bits without BOM at the beginning
>> can be immediately identified as binary.
>
> No, the BOM is not required or recommended in UTF-8, so it would be a
> mistake to identify GNU/Linux text files as binary merely because they
> lack a BOM. Typically these files do not have a BOM, and when they do
> one of the first things many users do is remove the BOM because it can
> cause trouble in practice.
>
> Diffutils does not support UTF-16, where a BOM would make more sense,
> and there are no plans to add support for UTF-16 (or for UTF-32, for
> that matter).
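
For illustration, the BOM-based classification proposed in this thread could be sketched roughly as follows. This is not how Diffutils behaves. The names `sniff_bom` and `classify` are hypothetical, and the NUL-byte fallback is a simplifying assumption used only to show where BOM-less UTF-16/UTF-32 input would end up.

```python
import codecs

# The 4-byte UTF-32 marks are tested before the 2-byte UTF-16 marks, because
# the UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
# Caveat: UTF-16 LE text whose first character is U+0000 is indistinguishable
# from UTF-32 LE here.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(head: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None


def classify(head: bytes) -> str:
    """Hypothetical classifier for the first bytes of a file.

    A BOM selects the encoding. Without a BOM, a NUL byte marks the data as
    binary (an assumed heuristic), which also catches BOM-less UTF-16/UTF-32.
    Everything else is treated as 8-bit text, so BOM-less UTF-8 stays text.
    """
    encoding = sniff_bom(head)
    if encoding is not None:
        return "text:" + encoding
    if b"\x00" in head:
        return "binary"
    return "text"
```

Under these assumptions, an ordinary GNU/Linux UTF-8 file without a BOM is still classified as text, which addresses the objection quoted above, while UTF-16 data without a BOM falls through to "binary" because ASCII-range characters carry NUL bytes.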