GNU bug report logs - #43225
Grep treats extended Latin characters like whitespace

Package: grep

Reported by: Mayo Fark <mayofark <at> outlook.com>

Date: Sat, 5 Sep 2020 16:06:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: Mayo Fark <mayofark <at> outlook.com>
To: 43225 <at> debbugs.gnu.org
Subject: bug#43225: Grep treats extended Latin characters like whitespace
Date: Sat, 5 Sep 2020 14:27:56 +0000
What I did:
```
grep -Riw cone *
```

Expected result: lines with the word "cone" surrounded by whitespace, ignoring case.

What I got instead:
```
data/po/pt_BR.po:msgstr "Pressione o ícone de pódio para iniciar o tutorial"
```

Why this is a bug: the word "ícone" is not the same as "cone", so this line should not have been in the result set. It appears that grep treats the "í" in "ícone" as whitespace, and the same seems to happen with other extended-Latin characters.
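
A minimal sketch for narrowing this down, assuming the locale is involved (the report itself does not confirm the cause). The helper names `sample` and `count_word_matches` are hypothetical, and the sample line is written as explicit UTF-8 bytes so the test does not depend on the terminal encoding. Per the grep documentation, `-w` requires the match to be bounded by non-word characters (anything other than letters, digits, and underscore), not specifically by whitespace.

```shell
#!/bin/sh
# Emit a fragment of the reported line as explicit UTF-8 bytes
# (\303\255 is "í", \303\263 is "ó").
sample() {
    printf 'Pressione o \303\255cone de p\303\263dio\n'
}

# Count the lines in which "cone" matches as a whole word (-w)
# under the locale passed as $1. grep -c prints the count.
count_word_matches() {
    sample | LC_ALL="$1" grep -cw cone
}

# In the C locale every byte is a separate character, so the byte
# just before "cone" (\255) is not a letter, digit, or underscore.
count_word_matches C
```

If the count is nonzero under `LC_ALL=C`, it may be worth rerunning the original command under a UTF-8 locale that is installed on the system (for example `count_word_matches en_US.UTF-8`, where the locale name is an assumption) and comparing the two results.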


This bug report was last modified 4 years and 311 days ago.

