GNU bug report logs -
#20954
wc - linux
Reported by: tele <swojskichlopak <at> wp.pl>
Date: Thu, 2 Jul 2015 00:46:03 UTC
Severity: normal
Tags: notabug
Done: Bob Proulx <bob <at> proulx.com>
Bug is archived. No further changes may be made.
Report forwarded to bug-coreutils <at> gnu.org: bug#20954; Package coreutils. (Thu, 02 Jul 2015 00:46:03 GMT)
Acknowledgement sent to tele <swojskichlopak <at> wp.pl>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Thu, 02 Jul 2015 00:46:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Hi!
From terminal:
$ a="" ; echo $s | wc -l
1
Should be 0 , yes ?
Information forwarded to bug-coreutils <at> gnu.org: bug#20954; Package coreutils. (Thu, 02 Jul 2015 01:42:02 GMT)
Message #8 received at 20954 <at> debbugs.gnu.org:
tag 20954 + notabug
close 20954
thanks
tele wrote:
> Hi!
Hi! :-)
> From terminal:
>
> $ a="" ; echo $s | wc -l
> 1
Do you mean $a instead of $s? Either way is the same though assuming
$s is empty too.
> Should be 0 , yes ?
No. Should be 1. You have forgotten about the newline at the end of
the command. The echo will terminate with a newline. You can see
this with od.
  echo | od -tx1 -c
  0000000  0a
           \n
Since this appears to be a usage error I have closed the bug. Please
feel free to follow up with more information. We will read it. And
we appreciate additional communication! I am simply closing it to
keep the accounting straight. :-)
Bob
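The newline count described above can also be checked with wc itself. The following is a minimal sketch, assuming a POSIX shell; the helper name count_newlines is made up for illustration and is not part of coreutils.

```shell
# count_newlines is a hypothetical helper, not a coreutils command.
# It prints how many newline characters arrive on standard input.
count_newlines() {
    # $(( ... )) strips the padding that some wc implementations print.
    printf '%s\n' "$(( $(wc -l) ))"
}

echo | count_newlines           # 1: echo with no arguments writes one newline
printf '' | count_newlines      # 0: an empty format string writes nothing
printf 'j' | count_newlines     # 0: one byte, but no newline
printf 'j\n' | count_newlines   # 1: one complete line
```

The helper only wraps "wc -l", so it reports exactly what wc counts: newline characters, not visible text.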
Added tag(s) notabug. Request was from Bob Proulx <bob <at> proulx.com> to control <at> debbugs.gnu.org. (Thu, 02 Jul 2015 01:42:03 GMT)
bug closed, send any further explanations to 20954 <at> debbugs.gnu.org and tele <swojskichlopak <at> wp.pl>. Request was from Bob Proulx <bob <at> proulx.com> to control <at> debbugs.gnu.org. (Thu, 02 Jul 2015 01:42:03 GMT)
Information forwarded to bug-coreutils <at> gnu.org: bug#20954; Package coreutils. (Thu, 02 Jul 2015 13:24:02 GMT)
Message #15 received at 20954 <at> debbugs.gnu.org:
2015-07-01 19:41:00 -0600, Bob Proulx:
[...]
> > $ a="" ; echo $s | wc -l
> > 1
[...]
> No. Should be 1. You have forgotten about the newline at the end of
> the command. The echo will terminate with a newline.
[...]
Leaving a variable unquoted will also cause the shell to apply
the split+glob operator on it. echo will also do some
transformations on the string (backslash and option processing).
To count the number of bytes in a variable, you can use:
  printf %s "$var" | wc -c
Use "${#var}" or
  printf %s "$var" | wc -m
for the number of characters.
GNU wc will not count the bytes that are not part of a valid
character, while GNU bash's ${#var} will count them as one
character:
In a UTF-8 locale:
$ var=$'\x80X\x80\u00e9'
$ printf %s "$var" | hd
00000000 80 58 80 c3 a9 |.X...|
00000005
$ echo "${#var}"
4
$ printf %s "$var" | wc -c
5
$ printf %s "$var" | wc -m
2
Above $var contains the 0x80 byte that doesn't form a valid
character, "X" (0x58), then another 0x80, then é (0xc3 0xa9).
wc -c counts the 5 bytes, wc -m counts X and é, while bash
${#var} counts those plus the 0x80s.
--
Stephane
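The split+glob point and the byte count above can be sketched in a locale-independent way. This assumes a POSIX shell with the default IFS; byte_len is a hypothetical helper name, not an existing command.

```shell
# Hypothetical helper: print the number of bytes in its first argument.
byte_len() {
    # printf %s writes the argument verbatim: no trailing newline,
    # no option processing, no backslash processing.
    printf '%s\n' "$(( $(printf %s "$1" | wc -c) ))"
}

var='a   b'
set -- $var           # unquoted: split on whitespace into "a" and "b"
printf '%s\n' "$#"    # 2
set -- "$var"         # quoted: kept as a single word
printf '%s\n' "$#"    # 1

byte_len "$var"       # 5: a, three spaces, b
byte_len '-n'         # 2: not treated as an option
byte_len ''           # 0
```

The character counts from "wc -m" and "${#var}" depend on the locale, so they are left out of this sketch.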
Information forwarded to bug-coreutils <at> gnu.org: bug#20954; Package coreutils. (Thu, 02 Jul 2015 15:30:19 GMT)
Message #18 received at submit <at> debbugs.gnu.org:
tag 20954 + notabug
close 20954
thanks
> tele wrote:
> > Hi!
> Hi!
> > From terminal:
> >
> > $ a="" ; echo $s | wc -l
> > 1
> Do you mean $a instead of $s? Either way is the same though assuming
> $s is empty too.
- Yes, my mistake :-)
> > Should be 0 , yes ?
> No. Should be 1. You have forgotten about the newline at the end of
> the command. The echo will terminate with a newline. You can see
> this with od.
>   echo | od -tx1 -c
>   0000000  0a
>            \n
> Since this appears to be a usage error I have closed the bug. Please
> feel free to follow up with more information. We will read it. And
> we appreciate additional communication! I am simply closing it to
> keep the accounting straight.
> Bob
# "echo" gives in new line, "echo -n" subtracts 1 line, but "wc -l" can count only from new line,
# so if something exist inside first line "wc -l" can not count. :-(
# example:
#
# $ a="j" ; echo "$a" | wc -l
# 1
#
# $ a="" ; echo "$a" | wc -l
# 1
#
# $ a="" ; echo -n "$a" | wc -l
# 0
#
# $ a="j" ; echo -n "$a" | wc -l
# 0
So,
$ a="" ; echo "$a" | sed '/^\s*$/d' | wc -l
0
$ a="3" ; echo "$a" | sed '/^\s*$/d' | wc -l
1
Can be added option to "wc" to fix this problem without use sed in future ?
Thanks for helping :-)
Information forwarded to bug-coreutils <at> gnu.org: bug#20954; Package coreutils. (Thu, 02 Jul 2015 23:34:02 GMT)
Message #21 received at 20954 <at> debbugs.gnu.org:
tele wrote:
> "echo" gives in new line,
Yes.
> "echo -n" subtracts 1 line,
echo -n is non-portable and shouldn't be used.
echo -n suppresses emitting a trailing newline.
Note that in both of these cases you are using the shell's internal
builtin echo and not the coreutils echo. They behave the same.
> but "wc -l" can count only from new line, so if something exist
> inside first line "wc -l" can not count. :-(
"wc -l" counts newlines. That is the task that it was constructed to
do. That is exactly what it does. No more and no less.
What is a text line? A text line by definition ends with a newline.
This has been standardized to prevent different implementations from
implementing it differently and creating portability problems.
Therefore all standards compliant implementations must implement it in
the same way to prevent portability problems.
> example:
>
> $ a="j" ; echo "$a" | wc -l
> 1
I have been wondering. Why are you using a variable here? Using the
variable as you are doing is no different than not using the variable.
  echo "j" | od -tx1 -c
  0000000  6a  0a
            j  \n
There is one newline. That counts as one text line.
> $ a="" ; echo "$a" | wc -l
> 1
  echo "" | od -tx1 -c
  0000000  0a
           \n
There is one newline. That counts as one text line.
> $ a="" ; echo -n "$a" | wc -l
> 0
  echo -n "" | od -tx1 -c
  0000000
Nothing was emitted. No newlines. Counts as zero lines. But nothing
was emitted. Zero characters.
  od -tx1 -c < /dev/null
  0000000
> $ a="j" ; echo -n "$a" | wc -l
> 0
  echo -n "j" | od -tx1 -c
  0000000  6a
            j
That emits one character, the 'j' character. It emits no newlines.
Without any newlines at all that is not and cannot be a "text" line.
Without a newline that can only be interpreted as binary data. In any
case there were no newlines to count and "wc -l" counted and reported
zero newlines.
Instead of echo -n it would be better and portable to use printf
instead.
  printf "j" | od -tx1 -c
  0000000  6a
            j
Same action in a portable way using printf. Avoid using echo with
options.
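When the text comes from a variable rather than a literal, it is usually safer to keep it out of the printf format string. A minimal sketch of that pattern follows; the helper name put is made up for illustration.

```shell
# Hypothetical helper: write a string with no trailing newline.
# The data is passed as an argument, never as the format string,
# so "%" and "\" in it are written literally.
put() {
    printf '%s' "$1"
}

put 'j' | od -An -tx1      # one byte: 6a
put '100%\n' | wc -c       # 6 bytes: "\n" stays a backslash and an "n"
put '' | wc -c             # 0 bytes
```

With the data used directly as the format, as in printf "$a", a "%" or a backslash inside $a would be interpreted instead of printed.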
> So,
>
> $ a="" ; echo "$a" | sed '/^\s*$/d' | wc -l
> 0
  echo "" | sed '/^\s*$/d' | od -tx1 -c
  0000000
As we saw previously, the echo action will emit one newline character.
This is piped to the sed program which will delete that line.
Deleting the line is what the sed 'd' action does. Therefore sed does
not emit the newline. The text line is deleted.
> $ a="3" ; echo "$a" | sed '/^\s*$/d' | wc -l
> 1
  echo "3" | sed '/^\s*$/d' | od -tx1 -c
  0000000  33  0a
            3  \n
Here the echo emitted two characters, a '3' and a newline. The sed
program did not match and therefore did not delete the line. Since it
did not delete the line it passed the one text line to wc and "wc -l"
counted the one newline and reported one text line.
> Can be added option to "wc" to fix this problem without use sed in future ?
> Thanks for helping :-)
There is no problem to be fixed. And therefore this isn't something
that can be "fixed" in wc.
Bob
Information forwarded to bug-coreutils <at> gnu.org: bug#20954; Package coreutils. (Fri, 03 Jul 2015 14:19:02 GMT)
Message #24 received at submit <at> debbugs.gnu.org:
tag 20954 + notabug
close 20954
thanks
Maybe we did not understand.
I don't want change old definitions but create new option for wc or echo,
because this above examples not make logic sense,
( and it I want fix, however with sed is also fixed )
however now I understand that they work correctly in accordance with
accepted principles.
# What is a text line? A text line by definition ends with a newline.
# This has been standardized to prevent different implementations from
# implementing it differently and creating portability problems.
# Therefore all standards compliant implementations must implement it in
# the same way to prevent portability problems.
" wc -l " in most examples working correct,
because it " echo " give's " \n " and "wc -l" count correct.
I mentioned about "wc", because for me build option "wc -a" for "echo"
or "echo -m"
this is not important.
Maybe exist hope for example create option "m" to echo , " echo -m "
which not will from new line, but first line if variable is empty
and from new line if is not empty ?
example:
echo -m "" | wc -l
0
echo -m "e" | wc -l
1
Information forwarded to bug-coreutils <at> gnu.org: bug#20954; Package coreutils. (Mon, 06 Jul 2015 02:41:02 GMT)
Message #27 received at 20954 <at> debbugs.gnu.org:
tele wrote:
> Maybe we did not understand.
> I don't want change old definitions but create new option for wc or echo,
> because this above examples not make logic sense,
What would such an option do?
> ( and it I want fix, however with sed is also fixed )
Your original message asked if "echo | wc -l" should count 0 lines
instead of 1 line. But the echo is going to produce one line and
therefore it should be counted.
In a later message you wrote using sed to delete blank lines so that
only non-blank lines remained to be counted.
> $ a="" ; echo "$a" | sed '/^\s*$/d' | wc -l
> 0
>
> $ a="3" ; echo "$a" | sed '/^\s*$/d' | wc -l
> 1
>
> Can be added option to "wc" to fix this problem without use sed in future ?
This tells me that you have not understood things yet. :-(
I tried to explain this in more detail in my response to that message.
The sed command you pulled from stackoverflow.com deletes blank
lines. That is a good way to avoid counting blank lines.
If I guess at what you are suggesting then it does not make sense to
add an option to wc to count only non-blank lines. If you don't want
to count blank lines then delete them first. There are an infinite
number of possible things to count. There cannot be an infinite
number of options implemented. And using sed to delete blank lines is
the Right Way To Do Things.
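That filter-then-count approach can be wrapped in a small shell function. This is only a sketch: count_nonblank is a hypothetical name, and it uses the POSIX class [[:space:]] in place of \s, which is a GNU extension.

```shell
# Hypothetical helper: count the lines on standard input that contain
# at least one non-whitespace character.
count_nonblank() {
    # sed deletes lines that are empty or whitespace-only; wc counts the rest.
    # $(( ... )) strips the padding that some wc implementations print.
    printf '%s\n' "$(( $(sed '/^[[:space:]]*$/d' | wc -l) ))"
}

printf '\n' | count_nonblank            # 0
printf '3\n' | count_nonblank           # 1
printf 'a\n\n  \nb\n' | count_nonblank  # 2
```
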
> however now I understand that they work correctly in accordance with
> accepted principles.
Yes.
> > What is a text line? A text line by definition ends with a
> > newline. This has been standardized to prevent different
> > implementations from implementing it differently and creating
> > portability problems. Therefore all standards compliant
> > implementations must implement it in the same way to prevent
> > portability problems.
>
> " wc -l " in most examples working correct,
"most"? No. "wc -l" is working correctly in all examples. :-)
> because it " echo " give's " \n " and "wc -l" count correct.
Yes.
> I mentioned about "wc", because for me build option "wc -a" for "echo" or
> "echo -m"
> this is not important.
> Maybe exist hope for example create option "m" to echo , " echo -m "
> which not will from new line, but first line if variable is empty
> and from new line if is not empty ?
>
> example:
>
> echo -m "" | wc -l
> 0
>
> echo -m "e" | wc -l
> 1
The shell is a programming language. If not an infinite number, then a
very large number of possibilities may be implemented by programming
them in the shell. All such possibilities should not be coded into
specific options. Instead if you have a specific need it should be
programmed. Simply write the code that says explicitly what you want
to do. There are millions of lines of code written for various tasks.
All of those millions of lines should not be turned into specific
options. If you want to delete blank lines then simply delete blank
lines.
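As one possible illustration, the behavior asked for with "echo -m" can be written as a short shell function. This is a sketch under assumptions: emit_line is a made-up name, and no echo or wc option with this behavior exists.

```shell
# Hypothetical stand-in for the proposed "echo -m": write the argument
# followed by a newline, or write nothing at all when it is empty.
emit_line() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    fi
}

emit_line ''  | wc -l   # 0
emit_line 'e' | wc -l   # 1
```
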
This entire discussion feels like an XY problem. Here is a collection
of explanations of the XY problem.
http://www.perlmonks.org/?node_id=542341
The help-bash <at> gnu.org mailing list is the right place to follow up.
If you wrote there, said what you were trying to do, and asked how
to do it in the shell, people would try to help you there.
Bob
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 03 Aug 2015 11:24:05 GMT)
This bug report was last modified 9 years and 327 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.