GNU bug report logs - #31185
Why is there no full support for Unicode?

Package: diffutils

Reported by: Keepun <keepun <at> gmail.com>

Date: Mon, 16 Apr 2018 22:02:01 UTC

Severity: normal


Message #8 received at 31185 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Keepun <keepun <at> gmail.com>, 31185 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#31185: Why is there no full support for Unicode?
Date: Tue, 17 Apr 2018 00:37:18 -0700

Keepun wrote:
> Files with encoding greater than 8 bits without BOM at the beginning can be 
> immediately identified as binary.

No, the BOM is not required or recommended in UTF-8, so it would be a mistake to 
identify GNU/Linux text files as binary merely because they lack a BOM. 
Typically these files do not have a BOM, and when they do, one of the first 
things many users do is remove the BOM because it can cause trouble in practice.

Diffutils does not support UTF-16, where a BOM would make more sense, and there 
are no plans to add support for UTF-16 (or for UTF-32, for that matter).
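
To illustrate the point about BOMs, here is a minimal Python sketch (not part of diffutils, and not how diff itself classifies files) that checks whether a byte string starts with a known BOM and strips a UTF-8 BOM if present. The helper names detect_bom and strip_utf8_bom are hypothetical; the BOM constants come from Python's standard codecs module. Note that ordinary UTF-8 text without a BOM simply reports no BOM, which is why the absence of a BOM says nothing about whether a file is binary.

```python
import codecs

# Known byte order marks. UTF-32 entries come first because the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM (EF BB BF) if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

For example, detect_bom applied to typical BOM-less UTF-8 text returns (None, 0), even though the text is perfectly valid.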




This bug report was last modified 7 years and 61 days ago.
