GNU bug report logs -
#55719
29.0.50; various bindat strz bugs
Previous Next
Reported by: Richard Hansen <rhansen <at> rhansen.org>
Date: Mon, 30 May 2022 06:48:02 UTC
Severity: normal
Tags: patch
Found in version 29.0.50
Done: Richard Hansen <rhansen <at> rhansen.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 55719 in the body.
You can then email your comments to 55719 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#55719
; Package
emacs
.
(Mon, 30 May 2022 06:48:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Richard Hansen <rhansen <at> rhansen.org>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Mon, 30 May 2022 06:48:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
There are a number of bugs with strz packing and unpacking in the bindat
package:
(require 'bindat)
(require 'ert)
(let ((s (bindat-type :pack-var v (x strz :pack-val v) :unpack-val x)))
;; Bug: length doesn't count the null terminator.
(ert-deftest t1 () (should (equal (bindat-length s "x") 2)))
;; Bug: pack doesn't include the null terminator.
(ert-deftest t2 () (should (equal (bindat-pack s "x") "\170\0")))
;; Bug: unpack throws (wrong-type-argument number-or-marker-p nil)
(ert-deftest t3 () (should (equal (bindat-unpack s "\170\0") "x"))))
(let ((s (bindat-type :pack-var v (x strz 2 :pack-val v) :unpack-val x)))
;; Bug: pack doesn't always include the null terminator.
(ert-deftest t4 () (should (equal (bindat-pack s "xx") "\170\0"))))
(let ((s '((x strz 2))))
;; Bug: pack doesn't always include the null terminator.
(ert-deftest t5 () (should (equal (bindat-pack s '((x . "xx"))) "\170\0"))))
In addition to the above issues, legacy-style specs with no strz length,
such as '((x strz)), do not work. But I suspect length only became
optional when `bindat-type' was introduced, so that's probably not an
issue.
In GNU Emacs 29.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.20, cairo version 1.16.0)
of 2022-05-29 built on sprinkles
Repository revision: 7f7acf395697e4cfa470cfa2993b70c2308e95c1
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12013000
System Description: Ubuntu 20.04.4 LTS
Configured using:
'configure --build=x86_64-linux-gnu --prefix=/usr
'--includedir=${prefix}/include' '--mandir=${prefix}/share/man'
'--infodir=${prefix}/share/info' --sysconfdir=/etc --localstatedir=/var
--disable-option-checking --disable-silent-rules
'--libdir=${prefix}/lib/x86_64-linux-gnu'
'--libexecdir=${prefix}/lib/x86_64-linux-gnu' --disable-maintainer-mode
--disable-dependency-tracking --prefix=/usr --sharedstatedir=/var/lib
--with-modules=yes --with-x=yes --with-x-toolkit=gtk3
--with-xwidgets=yes --without-pgtk'
Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES NOTIFY
INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS X11 XDBE XIM XINPUT2 XPM XWIDGETS GTK3 ZLIB
Important settings:
value of $LC_MONETARY: en_US.UTF-8
value of $LC_NUMERIC: en_US.UTF-8
value of $LC_TIME: root.UTF-8
value of $LANG: en_US.UTF-8
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug message mailcap yank-media rmc puny
dired dired-loaddefs rfc822 mml mml-sec epa derived epg rfc6068
epg-config gnus-util text-property-search mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mail-utils ffap thingatpt
url-parse auth-source eieio eieio-core eieio-loaddefs password-cache
json url-vars help-fns radix-tree cl-print cl-extra ert map seq pp ewoc
debug backtrace help-mode find-func bindat byte-opt bytecomp
byte-compile cconv pcase cl-seq cl-macs gv time-date subr-x cl-loaddefs
cl-lib iso-transl tooltip eldoc paren electric uniquify ediff-hook
vc-hooks lisp-float-type elisp-mode mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode lisp-mode prog-mode register
page tab-bar menu-bar rfn-eshadow isearch easymenu timer select
scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors
frame minibuffer nadvice simple cl-generic indonesian philippine cham
georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite emoji-zwj charscript
charprop case-table epa-hook jka-cmpr-hook help abbrev obarray oclosure
cl-preloaded button loaddefs faces cus-face macroexp files window
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget keymap hashtable-print-readable backquote threads
xwidget-internal dbusbind inotify lcms2 dynamic-setting
system-font-setting font-render-setting cairo move-toolbar gtk x-toolkit
xinput2 x multi-tty make-network-process emacs)
Memory information:
((conses 16 74917 10902)
(symbols 48 8201 0)
(strings 32 23138 2251)
(string-bytes 1 732663)
(vectors 16 16128)
(vector-slots 8 211715 13201)
(floats 8 98 309)
(intervals 56 475 0)
(buffers 992 14))
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#55719
; Package
emacs
.
(Mon, 30 May 2022 19:42:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 55719 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
The attached series of patches should fix bug #55719 (https://debbugs.gnu.org/cgi/bugreport.cgi?bug=55719).
I have already signed the copyright assignment agreement.
Thanks,
Richard
[0001-bindat-tests-strz-Add-more-tests.patch (text/x-patch, attachment)]
[0002-bindat-strz-Fix-off-by-one-bug-in-computed-length.patch (text/x-patch, attachment)]
[0003-bindat-strz-Consistent-length-type-check.patch (text/x-patch, attachment)]
[0004-bindat-strz-Always-write-null-terminator.patch (text/x-patch, attachment)]
[0005-bindat-strz-Fix-wrong-type-argument-error-when-unpac.patch (text/x-patch, attachment)]
[0006-bindat-strz-Move-all-pack-logic-to-pack-function.patch (text/x-patch, attachment)]
[0007-bindat-strz-Fix-packing-of-long-strings-with-legacy-.patch (text/x-patch, attachment)]
[OpenPGP_signature (application/pgp-signature, attachment)]
Added tag(s) patch.
Request was from
Richard Hansen <rhansen <at> rhansen.org>
to
control <at> debbugs.gnu.org
.
(Tue, 31 May 2022 05:39:02 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#55719
; Package
emacs
.
(Tue, 31 May 2022 11:09:02 GMT)
Full text and
rfc822 format available.
Message #13 received at 55719 <at> debbugs.gnu.org (full text, mbox):
> Cc: 55719 <at> debbugs.gnu.org
> Date: Mon, 30 May 2022 12:53:31 -0400
> From: Richard Hansen <rhansen <at> rhansen.org>
>
> The attached series of patches should fix bug #55719 (https://debbugs.gnu.org/cgi/bugreport.cgi?bug=55719).
Aren't those all due to the issue of including the terminating null
byte in a packed string? And if so, I wonder whether indeed the null
byte should be included, since that means you cannot handle strings
that include null bytes as part of the payload, not as terminators.
Can you tell why you are convinced the null byte should be considered
as part of the string?
Thanks.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#55719
; Package
emacs
.
(Tue, 31 May 2022 20:09:02 GMT)
Full text and
rfc822 format available.
Message #16 received at 55719 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 5/31/22 07:08, Eli Zaretskii wrote:
> Can you tell why you are convinced the null byte should be considered
> as part of the string?
The null terminator is the reason one would use the strz type. If the
user doesn't want a null terminator, they should use the str type
instead. From the documentation [1]:
> str len
>
> String of bytes of length len.
>
> strz &optional len
>
> Zero-terminated string of bytes, can be of arbitrary length or
> in a fixed-size field with length len.
[1] https://www.gnu.org/software/emacs/manual/html_node/elisp/Bindat-Types.html#index-bindat_002dtype
[OpenPGP_signature (application/pgp-signature, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#55719
; Package
emacs
.
(Tue, 31 May 2022 23:02:02 GMT)
Full text and
rfc822 format available.
Message #19 received at 55719 <at> debbugs.gnu.org (full text, mbox):
> diff --git a/test/lisp/emacs-lisp/bindat-tests.el b/test/lisp/emacs-lisp/bindat-tests.el
> index 7722cf6c02..53c0c359d8 100644
> --- a/test/lisp/emacs-lisp/bindat-tests.el
> +++ b/test/lisp/emacs-lisp/bindat-tests.el
> @@ -162,4 +162,64 @@ bindat-test--recursive
> (bindat-pack bindat-test--LEB128 n))
> n)))))))
>
> +(let ((spec (bindat-type :pack-var v
> + (x strz 2 :pack-val v)
> + :unpack-val x)))
Any particular reason why you define it this way instead of just
(bindat-type strz 2)
?
> + (ert-deftest bindat-test--strz-fixedlen-len ()
> + (should (equal (bindat-length spec "") 2))
> + (should (equal (bindat-length spec "a") 2)))
> +
> + (ert-deftest bindat-test--strz-fixedlen-len-overflow ()
> + (should (equal (bindat-length spec "abc") 2)))
> +
> + (ert-deftest bindat-test--strz-fixedlen-pack ()
> + (should (equal (bindat-pack spec "") "\0\0"))
> + (should (equal (bindat-pack spec "a") "\141\0")))
LGTM.
> + (ert-deftest bindat-test--strz-fixedlen-pack-overflow ()
> + :expected-result :failed
> + (should (equal (bindat-pack spec "abc") "\141\0")))
I think this changes the intended semantics. Until now `strz N` has
meant that N bytes are used to encode the string and that it can
hold upto a string of length N (in which case there's no terminating NUL
byte). I agree that it's not the only valid semantics, but I'm not sure
we want to change it at this point.
Do you have a particular reason to make this change.
> + (ert-deftest bindat-test--strz-fixedlen-unpack ()
> + (should (equal (bindat-unpack spec "\0\0") ""))
> + (should (equal (bindat-unpack spec "a\0") "a"))))
How 'bout
(bindat-unpack spec "ab")
?
> +(let ((spec (bindat-type :pack-var v
> + (x strz :pack-val v)
> + :unpack-val x)))
Similarly here, I'd use just (bindat-type strz)
> + (ert-deftest bindat-test--strz-varlen-len ()
> + :expected-result :failed
> + (should (equal (bindat-length spec "") 1))
> + (should (equal (bindat-length spec "abc") 4)))
> +
> + (ert-deftest bindat-test--strz-varlen-pack ()
> + :expected-result :failed
> + (should (equal (bindat-pack spec "") "\0"))
> + (should (equal (bindat-pack spec "abc") "\141\142\143\0")))
> +
> + (ert-deftest bindat-test--strz-varlen-unpack ()
> + :expected-result :failed
> + (should (equal (bindat-unpack spec "\0") ""))
> + (should (equal (bindat-unpack spec "\141\142\143\0") "abc"))))
Looks good (tho I'd write "abc\0" i.s.o "\141\142\143\0").
Not sure what we should do about (bindat-unpack spec "abc")?
> diff --git a/lisp/emacs-lisp/bindat.el b/lisp/emacs-lisp/bindat.el
> index c6d64975ec..f66458296a 100644
> --- a/lisp/emacs-lisp/bindat.el
> +++ b/lisp/emacs-lisp/bindat.el
> @@ -687,10 +687,9 @@ bindat--type
> (bindat--pcase op
> ('unpack `(bindat--unpack-strz ,len))
> (`(length ,val)
> - `(cl-incf bindat-idx ,(cond
> - ((null len) `(length ,val))
> - ((numberp len) len)
> - (t `(or ,len (length ,val))))))
> + `(cl-incf bindat-idx ,(if (numberp len)
> + len
> + `(1+ (length ,val)))))
`len` is supposed to be an ELisp *expression*. E.g. it can be
(+ a 4)
in which case (numberp len) will fail yet we should return the value of
`len` rather than (1+ (length ,val)). In the original code, the cases
for (null len) and (numberp len) are *optimizations*.
I haven't yet looked at the rest of the patches. If you can update your
patches based on this feedback, that would be great, but in the worst
case, I'll get to reviewing the rest sooner or later anyway.
Stefan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#55719
; Package
emacs
.
(Wed, 01 Jun 2022 05:29:02 GMT)
Full text and
rfc822 format available.
Message #22 received at 55719 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Thank you for the review! I have attached a new set of patches.
On 5/31/22 19:00, Stefan Monnier wrote:
>> +(let ((spec (bindat-type :pack-var v
>> + (x strz 2 :pack-val v)
>> + :unpack-val x)))
>
> Any particular reason why you define it this way instead of just
>
> (bindat-type strz 2)
>
> ?
Thanks, I should have realized that was possible. Fixed.
>> + (ert-deftest bindat-test--strz-fixedlen-pack-overflow ()
>> + :expected-result :failed
>> + (should (equal (bindat-pack spec "abc") "\141\0")))
>
> I think this changes the intended semantics. Until now `strz N` has
> meant that N bytes are used to encode the string and that it can
> hold upto a string of length N (in which case there's no terminating NUL
> byte). I agree that it's not the only valid semantics, but I'm not sure
> we want to change it at this point.
>
> Do you have a particular reason to make this change.
A few:
* The documentation says that the packed output is null terminated
so that's what users expect.
* It is safer (packed output is less likely to cause some other
program to run past the end of the field).
* Without this change, there is no difference between `strz N` and
`str N`. So what would be the point of `strz N`?
* If the user selected strz, the application probably requires null
termination in all cases, not just when the string is under a
certain length.
>> + (ert-deftest bindat-test--strz-fixedlen-unpack ()
>> + (should (equal (bindat-unpack spec "\0\0") ""))
>> + (should (equal (bindat-unpack spec "a\0") "a"))))
>
> How 'bout
>
> (bindat-unpack spec "ab")
>
> ?
I added some comments explaining why cases like that aren't tested.
> (tho I'd write "abc\0" i.s.o "\141\142\143\0").
Done.
> Not sure what we should do about (bindat-unpack spec "abc")?
That is not a valid packed string, so I'm not too concerned about it.
Currently it signals an args-out-of-range error because it tries to
read past the end of the string, which seems like an acceptable
behavior to me. We could make the error message more friendly, such as
"end of input reached while searching for null terminator", but I'd
rather not make that change right now with this set of patches.
>> - `(cl-incf bindat-idx ,(cond
>> - ((null len) `(length ,val))
>> - ((numberp len) len)
>> - (t `(or ,len (length ,val))))))
>> + `(cl-incf bindat-idx ,(if (numberp len)
>> + len
>> + `(1+ (length ,val)))))
>
> `len` is supposed to be an ELisp *expression*.
Ah, I was wondering why it was written that way. Fixed.
[0001-bindat-tests-strz-Add-more-tests.patch (text/x-patch, attachment)]
[0002-bindat-strz-Fix-off-by-one-bug-in-computed-length.patch (text/x-patch, attachment)]
[0003-bindat-strz-Consistent-length-type-check.patch (text/x-patch, attachment)]
[0004-bindat-strz-Always-write-null-terminator.patch (text/x-patch, attachment)]
[0005-bindat-strz-Fix-wrong-type-argument-error-when-unpac.patch (text/x-patch, attachment)]
[0006-bindat-strz-Move-all-pack-logic-to-pack-function.patch (text/x-patch, attachment)]
[0007-bindat-strz-Fix-packing-of-long-strings-with-legacy-.patch (text/x-patch, attachment)]
[OpenPGP_signature (application/pgp-signature, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#55719
; Package
emacs
.
(Wed, 01 Jun 2022 12:06:02 GMT)
Full text and
rfc822 format available.
Message #25 received at 55719 <at> debbugs.gnu.org (full text, mbox):
>>> + (ert-deftest bindat-test--strz-fixedlen-pack-overflow ()
>>> + :expected-result :failed
>>> + (should (equal (bindat-pack spec "abc") "\141\0")))
>> I think this changes the intended semantics. Until now `strz N` has
>> meant that N bytes are used to encode the string and that it can
>> hold upto a string of length N (in which case there's no terminating NUL
>> byte). I agree that it's not the only valid semantics, but I'm not sure
>> we want to change it at this point.
>> Do you have a particular reason to make this change.
>
> A few:
> * The documentation says that the packed output is null terminated
> so that's what users expect.
> * It is safer (packed output is less likely to cause some other
> program to run past the end of the field).
> * Without this change, there is no difference between `strz N` and
> `str N`. So what would be the point of `strz N`?
> * If the user selected strz, the application probably requires null
> termination in all cases, not just when the string is under a
> certain length.
You're listing advantages of this choice, which we know about already.
The other choice also has advantages. These don't count as "particular
reasons" (e.g. a real-life concrete *need* for it, rather than a mere
preference).
The particular reason to prefer the current behavior is
backward compatibility (which we could call "inertia").
Note also that `strz` without a length (or with a nil length) behaves
the way you want.
Of course, we could add an additional (optional) arg to `strz` to
specify what should happen when unpacking a string with missing NUL byte
as well as whether to truncate to N-1 chars rather than to N chars to
make sure there is a terminating NUL byte.
>>> + (ert-deftest bindat-test--strz-fixedlen-unpack ()
>>> + (should (equal (bindat-unpack spec "\0\0") ""))
>>> + (should (equal (bindat-unpack spec "a\0") "a"))))
>> How 'bout
>> (bindat-unpack spec "ab")
>> ?
>
> I added some comments explaining why cases like that aren't tested.
The byte-string to unpack is not necessarily built from our own code
(usually bindat is used to communicate with some other application), so
whether our code can generate "ab" is not really relevant: the question
still comes up about what we should do with "ab" (where valid answers
could be to return "ab" or to return "a" or to signal an error, ...).
Of course we can also decide it's "undefined".
Stefan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#55719
; Package
emacs
.
(Wed, 01 Jun 2022 20:25:02 GMT)
Full text and
rfc822 format available.
Message #28 received at 55719 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 6/1/22 08:04, Stefan Monnier wrote:
>> * The documentation says that the packed output is null terminated
>> so that's what users expect.
>> * It is safer (packed output is less likely to cause some other
>> program to run past the end of the field).
>> * Without this change, there is no difference between `strz N` and
>> `str N`. So what would be the point of `strz N`?
>> * If the user selected strz, the application probably requires null
>> termination in all cases, not just when the string is under a
>> certain length.
>
> You're listing advantages of this choice, which we know about already.
> The other choice also has advantages. These don't count as "particular
> reasons" (e.g. a real-life concrete *need* for it, rather than a mere
> preference).
I don't have a concrete need for it at the moment. I just noticed the bug while I was fixing the other bugs. I can leave that change out of the patch series if that is the only thing impeding merge.
> The particular reason to prefer the current behavior is
> backward compatibility (which we could call "inertia").
Anyone that needs the current behavior should be using `str N`, not `strz N`. If they're using `strz N`, then I would consider that to be a bug in their code. If this change breaks someone, they can fix it easily: just change `strz N` to `str N`. I understand that we should endeavor to maintain compatibility, but keeping the current behavior would be intentionally preserving a bug to accommodate other bugs. I don't think that's a good trade-off.
> Note also that `strz` without a length (or with a nil length) behaves
> the way you want.
No -- strz without a length results in variable length encoding. Without these changes, the only way to get a fixed length output with guaranteed null termination is to do `(substring str 0 (1- N))` before packing. (Or explicitly pack a separate zero byte immediately after the `str N-1` or `strz N-1` entry.)
> Of course, we could add an additional (optional) arg to `strz` to
> specify what should happen when unpacking a string with missing NUL byte
> as well as whether to truncate to N-1 chars rather than to N chars to
> make sure there is a terminating NUL byte.
I don't think that's necessary. I think most use cases are satisfied by the current str behavior and the strz behavior with these patches.
>>> How 'bout
>>> (bindat-unpack spec "ab")
>>> ?
>>
>> I added some comments explaining why cases like that aren't tested.
>
> The byte-string to unpack is not necessarily built from our own code
> (usually bindat is used to communicate with some other application), so
> whether our code can generate "ab" is not really relevant: the question
> still comes up about what we should do with "ab" (where valid answers
> could be to return "ab" or to return "a" or to signal an error, ...).
> Of course we can also decide it's "undefined".
Regardless of what other applications produce, packed strings like that are invalid according to the spec provided to `bindat-unpack`. Attempting to do something useful with such invalid strings assumes that Postel's Law is a good thing, which I don't agree with (especially with security-sensitive protocols). I think it is safest to signal an error, which it currently does.
A real example of why it might be undesirable to attempt to process a strz string without a null terminator:
1. The sender streams multiple packed messages over a reliable, in-order protocol like TCP or pipes.
2. The sending system breaks up the messages at arbitrary places to fit in packets/buffers, so the receiving side might see an incomplete message that will be completed by a future packet (or multiple future packets).
3. The receiver attempts to extract a message from the bytes received so far. If the bytes do not form a valid message according to the bindat-unpack spec, then assume the message was truncated. Remember the bytes received so far and wait for the next packet.
4. When the next packet arrives, concatenate its bytes with the remembered bytes and try again.
If bindat-unpack did not signal an error when expecting a null terminator, then the above protocol would not work without extra effort by the user.
Either way, changing the way `bindat-unpack` handles invalid strings feels out of scope for this particular bug report.
[OpenPGP_signature (application/pgp-signature, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#55719
; Package
emacs
.
(Wed, 01 Jun 2022 20:30:03 GMT)
Full text and
rfc822 format available.
Message #31 received at 55719 <at> debbugs.gnu.org (full text, mbox):
On 6/1/22 16:23, Richard Hansen wrote:
> Anyone that needs the current behavior should be using `str N`, not
> `strz N`. If they're using `strz N`, then I would consider that to be
> a bug in their code. If this change breaks someone, they can fix it
> easily: just change `strz N` to `str N`. I understand that we should
> endeavor to maintain compatibility, but keeping the current behavior
> would be intentionally preserving a bug to accommodate other bugs. I
> don't think that's a good trade-off.
Actually I'm wrong; there is a difference between `str N` and `strz N`. With `str N`, all zero bytes in the input string are included in the unpacked value (including trailing null bytes). With `strz N`, the unpacked string ends just before the first zero byte.
I'll remove the `strz N` change from the patch series so that we can get the rest merged.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#55719
; Package
emacs
.
(Thu, 02 Jun 2022 02:53:02 GMT)
Full text and
rfc822 format available.
Message #34 received at 55719 <at> debbugs.gnu.org (full text, mbox):
> I don't have a concrete need for it at the moment. I just noticed the bug
> while I was fixing the other bugs. I can leave that change out of the patch
> series if that is the only thing impeding merge.
I've installed the rest of the patch series into `master`, thanks.
>> The particular reason to prefer the current behavior is
>> backward compatibility (which we could call "inertia").
> Anyone that needs the current behavior should be using `str N`, not
> `strz N`.
`str N` does not give the same semantics when unpacking.
> If they're using `strz N`, then I would consider that to be a bug in
> their code.
It all depends how the format was defined. "NUL-terminated N-bytes-max
strings" can be like the one currently supported or like the ones your
code supports. Which one is better is irrelevant when the format is
not of your choosing.
I'm not at all opposed to supporting the kind of NUL-terminated strings
you propose but it should be *in addition* to the ones
already supported.
> If this change breaks someone, they can fix it easily: just
> change `strz N` to `str N`.
No, because unpacking "a\0" with (str 2) give "a\0" rather than "a".
> I understand that we should endeavor to maintain compatibility, but
> keeping the current behavior would be intentionally preserving a bug
> to accommodate other bugs. I don't think that's a good trade-off.
It's only a bug if it doesn't match the format you need to use.
Your code introduces a bug when the format is defined to behave like the
current code :-(
>> Note also that `strz` without a length (or with a nil length) behaves
>> the way you want.
> No -- strz without a length results in variable length encoding. Without
> these changes, the only way to get a fixed length output with guaranteed
> null termination is to do `(substring str 0 (1- N))` before packing. (Or
> explicitly pack a separate zero byte immediately after the `str N-1` or
> `strz N-1` entry.)
...or define a new type.
>>>> (bindat-unpack spec "ab")
[...]
> Regardless of what other applications produce, packed strings like that are
> invalid according to the spec provided to `bindat-unpack`.
Where in the spec/doc of Bindat do you see that "ab" is not a valid
input to `bindat-unpack` for `strz 2`?
> A real example of why it might be undesirable to attempt to process
> a strz string without a null terminator:
You don't need to convince me that the format you propose is useful.
Stefan
Reply sent
to
Richard Hansen <rhansen <at> rhansen.org>
:
You have taken responsibility.
(Sun, 05 Jun 2022 19:31:02 GMT)
Full text and
rfc822 format available.
Notification sent
to
Richard Hansen <rhansen <at> rhansen.org>
:
bug acknowledged by developer.
(Sun, 05 Jun 2022 19:31:02 GMT)
Full text and
rfc822 format available.
Message #39 received at 55719-done <at> debbugs.gnu.org (full text, mbox):
On 6/1/22 22:52, Stefan Monnier wrote:
> I've installed the rest of the patch series into `master`, thanks.
Thanks! I'm marking this bug as done.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Mon, 04 Jul 2022 11:24:06 GMT)
Full text and
rfc822 format available.
This bug report was last modified 2 years and 351 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.