GNU bug report logs - #16812: Eszett handling
Message #14 received at 16812 <at> debbugs.gnu.org:
'grep' is conforming to its specification, even though it's not as
useful as it might be when searching German text. The situation with
'ß'/'SS' is different than the situation with 'lj'/'Lj'/'LJ' because in the
latter case 'grep' is dealing only with individual characters.
There's a related issue with 'ß' versus the recently-introduced capital
sharp-S 'ẞ'. These do not match each other with 'grep --ignore-case' in
the current savannah git master. This is an unfortunate property of how
the glibc regex code behaves: the regex code uppercases both pattern and
data before comparing, but in the standard German locale 'ß' is
unchanged by uppercasing.
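
The mismatch described above can be illustrated with a small sketch. This is not the glibc regex code; it only models "uppercase both sides, one character at a time, then compare". The helper names simple_upper and match_by_uppercasing are hypothetical, and the one-to-one mapping is approximated from Python's str.upper(), which applies full case mapping ('ß' becomes 'SS') and therefore has to be restricted to single-character results here.

```python
def simple_upper(ch: str) -> str:
    """Approximate a one-character-to-one-character uppercase mapping.

    str.upper() may expand a character ('ß' -> 'SS'). A per-character
    mapping cannot do that, so the character is left unchanged whenever
    the full mapping is not exactly one character long.
    """
    up = ch.upper()
    return up if len(up) == 1 else ch


def match_by_uppercasing(pattern: str, data: str) -> bool:
    """Model of the comparison: uppercase every character, then compare."""
    return [simple_upper(c) for c in pattern] == [simple_upper(c) for c in data]


# 'ä' uppercases to 'Ä', so the two compare equal.
print(match_by_uppercasing("ä", "Ä"))

# 'ß' is unchanged by the per-character mapping and 'ẞ' is already
# uppercase, so the two stay distinct and do not compare equal.
print(match_by_uppercasing("ß", "ẞ"))
```

Under this model 'ä' and 'Ä' match while 'ß' and 'ẞ' do not, which is the behavior reported for 'grep --ignore-case'.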
I'll leave this bug open as it is an awkward situation. Fixing it would
require changing the glibc regex code, which is a big deal -- it would
have some performance implications in a lot of programs. So I'm not
optimistic about fixing it any time soon.
This bug report was last modified 11 years and 54 days ago.