GNU bug report logs - #16812
Eszett handling

Package: grep;

Reported by: mathstuf <at> gmail.com

Date: Wed, 19 Feb 2014 19:04:01 UTC

Severity: wishlist

Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Johannes Meixner <jsmeix <at> suse.de>
To: bug-grep <at> gnu.org
Cc: mathstuf <at> gmail.com
Subject: Re: bug#16812: Eszett handling
Date: Thu, 20 Feb 2014 11:07:34 +0100 (CET)
Hello,

On Feb 19 13:59 Ben Boeckel wrote (excerpt):
> [ I am not subscribed; please keep me on the CC. ]
...
> I had a thought about how the German eszett was handled
...
> Basically, it seems that grep doesn't support alternates when changing
> case. The uppercase of 'ß' is either 'SS' or 'ẞ' depending on the
> context

As far as I understand it you are talking about
"Unicode case folding".

As far as I know grep does not support "Unicode case folding".

Currently grep works on a pure "character by character" basis,
where each character may be encoded in UTF-8 (one possible
encoding for Unicode characters). So grep supports the UTF-8
encoding, which could be misunderstood to mean that grep
supports Unicode, but the latter is not true.
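
To illustrate the difference between a "character by character"
comparison and "Unicode case folding", here is a minimal sketch in
Python. It is not grep's code and says nothing about how grep is
implemented; the function names simple_match and folded_match are
made up for this example, and only Python's built-in str.lower()
and str.casefold() are used.

```python
def simple_match(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings one character at a time,
    lowercasing each character on its own (a 1:1 mapping)."""
    if len(a) != len(b):
        # A 1:1 comparison can never match strings of different length,
        # so 'ß' (one character) cannot match 'SS' (two characters).
        return False
    return all(x.lower() == y.lower() for x, y in zip(a, b))


def folded_match(a: str, b: str) -> bool:
    """Hypothetical helper: compare after full Unicode case folding,
    where one character may fold to several (e.g. 'ß' to 'ss')."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    for left, right in [("straße", "STRASSE"), ("ß", "ẞ")]:
        print(left, right, simple_match(left, right), folded_match(left, right))
```

With the 1:1 comparison "straße" and "STRASSE" cannot match because
the lengths differ, while full case folding maps both to the same
string.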

For more details see the various (usually very long) mail threads
regarding "grep -i", in particular together with UTF-8.

For example, see

http://lists.gnu.org/archive/html/bug-grep/2012-06/threads.html#00011

for mail threads like
"Ignore case handling of special unicode characters (case folding)"
which is
http://savannah.gnu.org/bugs/?36682
or the mail thread
"grep -i (case-insensitive) is broken with UTF8"


Kind Regards
Johannes Meixner
-- 
SUSE LINUX Products GmbH -- Maxfeldstrasse 5 -- 90409 Nuernberg -- Germany
HRB 16746 (AG Nuernberg) GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer
