GNU bug report logs - #79300
fold-nbsp test failure

Package: coreutils

Reported by: Bruno Haible <bruno <at> clisp.org>

Date: Sun, 24 Aug 2025 07:51:02 UTC

Severity: normal

Tags: fixed

Done: Collin Funk <collin.funk1 <at> gmail.com>

From: Pádraig Brady <P <at> draigBrady.com>
To: Collin Funk <collin.funk1 <at> gmail.com>
Cc: 79300 <at> debbugs.gnu.org, bruno <at> clisp.org
Subject: bug#79300: fold-nbsp test failure
Date: Fri, 29 Aug 2025 13:47:17 +0100

On 29/08/2025 05:23, Collin Funk wrote:
> Pádraig Brady <P <at> draigBrady.com> writes:
> 
>> Perhaps the techniques from tests/wc/wc-nbsp.sh could be used?
>> Maybe something like:
>>
>> check_space() {
>>    char="$1"
>>    # Use -L to determine whether NBSP is printable.
>>    # FreeBSD 11 and OS X treat NBSP as non printable ?
>>    test "$(env printf "=$char=" | wc -L)" = 3 &&
>>      test $(env printf "=$char=" | wc -w) = 2
>> }
>>
>> if check_space '\u2007'; then
>>    ...
>> fi
> 
> Thanks for the suggestion, but that doesn't work. Any issue with
> skipping based on $host_os for this test and for fold-spaces.sh?
> 
> I was thinking of testing "printf '\u00A0' | ./src/tr -d '[:blank:]'"
> but that won't work since 'tr' operates on bytes and U+00A0 is
> represented as 0xc2 0xa0 in UTF-8.

Oh right, sorry. wc has its own iswnbspace,
whereas fold essentially relies on the system iswblank.

That means you could correlate with uniq though. Something like:

  isblank() { test $(printf "a$1a\nb$1b\n" | uniq -f1 | wc -l) = 2; }
  if ! isblank '\u2007'; then
    # can test '\u2007' is treated as non breaking space
    :
  fi

That would be a preferable way to gate the test.
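
To spell out why the uniq correlation works, here is a minimal sketch of the same probe with the reasoning written as comments. The helper name is_blank_for_uniq is hypothetical, and the two calls at the end use plain ASCII characters so the outcome does not depend on the locale or on whether printf understands \u escapes:

```shell
# Hypothetical helper: succeeds when uniq(1) treats the given character
# as a field separator, i.e. as a blank.
is_blank_for_uniq() {
  # The two lines differ only outside the candidate character.
  # If the character is a blank, -f1 skips the first field ("a" / "b")
  # and the remainders " a" and " b" differ, so both lines survive.
  # Otherwise each line is a single field, the remainders are empty,
  # and uniq collapses the two lines into one.
  # The command substitution is left unquoted so that any padding
  # printed by wc is dropped before the comparison.
  test $(printf "a$1a\nb$1b\n" | uniq -f1 | wc -l) = 2
}

is_blank_for_uniq ' ' && echo 'space: blank'
is_blank_for_uniq 'x' || echo 'x: not blank'
```

For '\u2007' or '\u00A0' the answer depends on the system's blank classification, which is exactly what the gate is meant to detect.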

Though I'm thinking now that we should adjust fold(1) a little,
to ensure it consistently avoids breaking at nbsp across systems.
I.e. move/rename iswnbspace() from wc.c to src/system.h
and use it in fold (and wc) to give consistent behavior.
I.e. fold would use: c32isblank() && ! c32isnbspace(),
and the test would stay as is.
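
A rough way to observe the behaviour in question from the shell, assuming a fold(1) that supports -s and -w. The helper breaks_at is hypothetical and not part of the test suite; it only checks where fold wraps a 7-character line at width 6:

```shell
# Hypothetical helper: succeeds if fold -s breaks the line right after
# the given character.
breaks_at() {
  # "abc<char>def" is 7 characters, so at width 6 fold has to wrap.
  # With -s it wraps after the last blank that fits, which leaves
  # "def" on the second line. If the character is not treated as a
  # blank, fold wraps at the width limit and line 2 is something else.
  test "$(printf "abc$1def\n" | fold -s -w 6 | sed -n 2p)" = def
}

breaks_at ' ' && echo 'fold -s breaks at an ordinary space'
breaks_at 'x' || echo 'fold -s does not break at x'
```

Calling breaks_at '\u00A0' in a UTF-8 locale would show whether a given fold build breaks at nbsp; the result varies with the system, the locale, and the fold version, which is the inconsistency the proposed change aims to remove.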

cheers,
Padraig




This bug report was last modified 10 days ago.
