GNU bug report logs - #36718
uniq treats distinct Korean characters equal

Package: coreutils

Reported by: Felix Hamme <fhamme <at> united-internet.de>

Date: Thu, 18 Jul 2019 14:49:01 UTC

Severity: normal

From: Felix Hamme <fhamme <at> united-internet.de>
To: 36718 <at> debbugs.gnu.org
Cc: Gerhard Dittes <gerhard.dittes <at> ionos.com>
Subject: bug#36718: uniq treats distinct Korean characters equal
Date: Thu, 18 Jul 2019 16:08:57 +0200
Dear all,

I found that uniq treats some distinct Korean characters as equal and
counts them as duplicates. Specifically, this happened with the
characters 프 (U+D504) and 틀 (U+D2C0).

An example (input, expected output, and actual output) is in the
attachment.
I tested this with uniq (GNU coreutils) 8.30.

Greetings
Felix Hamme
[uniq-korean-characters-bug.tar.gz (application/gzip, attachment)]
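
The attachment's contents are not shown on this page, and the sketch below
is not taken from it. It is a minimal Python illustration of what the report
describes: the two characters are distinct code points with distinct UTF-8
encodings, so a byte-wise adjacent-duplicate filter keeps both lines. The
helper names uniq_bytes and collate_equal are hypothetical. The collation
check rests on an assumption the report does not state, namely that the
observed behavior comes from locale-aware comparison; its result depends on
the system locale, so no output is claimed for it.

```python
import itertools
import locale

PEU = chr(0xD504)   # 프, first character named in the report
TEUL = chr(0xD2C0)  # 틀, second character named in the report


def uniq_bytes(lines):
    """Drop adjacent duplicates, comparing lines byte for byte."""
    return [key for key, _group in itertools.groupby(lines)]


def collate_equal(a, b):
    """Return True if the active LC_COLLATE locale ranks a and b as equal."""
    return locale.strcoll(a, b) == 0


# Byte-wise view: the UTF-8 encodings differ, so both lines survive.
lines = [PEU.encode("utf-8"), TEUL.encode("utf-8")]
result = uniq_bytes(lines)
print(result)

# Locale-aware view (assumption: this is the kind of comparison involved).
try:
    locale.setlocale(locale.LC_COLLATE, "")  # adopt the environment's locale
except locale.Error:
    pass  # keep the default "C" locale if the environment's is unavailable
print("collation treats them as equal:", collate_equal(PEU, TEUL))
```

If locale-aware comparison is indeed the cause, running uniq with LC_ALL=C
in its environment, which selects byte-wise comparison, would be a
plausible way to check.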

This bug report was last modified 5 years and 341 days ago.
