GNU bug report logs - #77762
[PATCH] web: Add JSON module.

Previous Next

Package: guile;

Reported by: "Thompson, David" <dthompson2 <at> worcester.edu>

Date: Sat, 12 Apr 2025 13:16:02 UTC

Severity: normal

Tags: patch

To reply to this bug, email your comments to 77762 AT debbugs.gnu.org.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guile <at> gnu.org:
bug#77762; Package guile. (Sat, 12 Apr 2025 13:16:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to "Thompson, David" <dthompson2 <at> worcester.edu>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Sat, 12 Apr 2025 13:16:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "Thompson, David" <dthompson2 <at> worcester.edu>
To: "bug-guile <at> gnu.org" <bug-guile <at> gnu.org>
Subject: [PATCH] web: Add JSON module.
Date: Sat, 12 Apr 2025 09:14:31 -0400
[Message part 1 (text/plain, inline)]
Attached is a patch that adds a new (web json) module. Some may
remember that I submitted a patch back in 2015 (time flies, eh?) for
an (ice-9 json) module that never made it in. Well, 10 years is a long
time and Guile still doesn't have a built-in JSON module. Third party
libraries like guile-json and guile-sjson are available, the latter
being an adaptation of my original patch and the former remaining the
go-to library used by larger Guile projects like Guix. There's also
SRFI-180 (which sounds like a cool surfing trick!) which was published
in 2020 but the API is, in my opinion, overly complicated due to
generators and other things.  Anyway, JSON continues to be *the* data
interchange format of the web and Guile really ought to have a simple
API that can read/write JSON to/from a port using only Scheme data
types that have read syntax (i.e. no hash tables like guile-json).
This minimal, practical API is what my patch provides.  I've tried my
best to make it as efficient as possible.

I've settled on the following JSON<->Scheme data type mapping which is
nearly identical to SRFI-180 with the exception of object keys:

- true and false are #t and #f
- null is the symbol 'null
- numbers are either exact integers (fixnums and bignums) or inexact
reals (flonums, NaNs and infinities excluded)
- strings are strings
- arrays are vectors
- objects are association lists with string keys (SRFI-180 chose
symbols but JSON uses strings so strings feel the most honest)

Thanks in advance for the review,

- Dave
[0001-web-Add-JSON-module.patch (text/x-patch, attachment)]

Information forwarded to bug-guile <at> gnu.org:
bug#77762; Package guile. (Sat, 12 Apr 2025 16:02:03 GMT) Full text and rfc822 format available.

Message #8 received at 77762 <at> debbugs.gnu.org (full text, mbox):

From: Tomas Volf <~@wolfsden.cz>
To: "Thompson, David" <dthompson2 <at> worcester.edu>
Cc: 77762 <at> debbugs.gnu.org
Subject: Re: bug#77762: [PATCH] web: Add JSON module.
Date: Sat, 12 Apr 2025 18:01:17 +0200
> - null is the symbol 'null

Out of curiosity, what are your thoughts about using #nil instead?

Tomas

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.




Information forwarded to bug-guile <at> gnu.org:
bug#77762; Package guile. (Sat, 12 Apr 2025 16:05:02 GMT) Full text and rfc822 format available.

Message #11 received at 77762 <at> debbugs.gnu.org (full text, mbox):

From: "Thompson, David" <dthompson2 <at> worcester.edu>
To: Tomas Volf <~@wolfsden.cz>
Cc: 77762 <at> debbugs.gnu.org
Subject: Re: bug#77762: [PATCH] web: Add JSON module.
Date: Sat, 12 Apr 2025 12:04:14 -0400
On Sat, Apr 12, 2025 at 12:01 PM Tomas Volf <~@wolfsden.cz> wrote:
>
> > - null is the symbol 'null
>
> Out of curiosity, what are your thoughts about using #nil instead?

My original patch from 10 years ago did this. Mark Weaver then
explained to me that #nil was added specifically for the purpose of
supporting Emacs Lisp on the Guile VM and shouldn't be used in Scheme
code. The symbol 'null works well because there is no symbol type in
JSON so there's no ambiguity.

- Dave




Information forwarded to bug-guile <at> gnu.org:
bug#77762; Package guile. (Sun, 13 Apr 2025 01:40:01 GMT) Full text and rfc822 format available.

Message #14 received at 77762 <at> debbugs.gnu.org (full text, mbox):

From: Arun Isaac <arunisaac <at> systemreboot.net>
To: "Thompson, David" <dthompson2 <at> worcester.edu>, 77762 <at> debbugs.gnu.org
Subject: Re: bug#77762: [PATCH] web: Add JSON module.
Date: Sun, 13 Apr 2025 02:39:25 +0100
Hi David,

Thanks for the patch. I'm just trying to understand: Why do we need a
JSON module in guile itself? Isn't guile-json, the way it is as an
external library, good enough?

> API that can read/write JSON to/from a port using only Scheme data
> types that have read syntax (i.e. no hash tables like guile-json).

guile-json does not use hash tables. guile-json uses association lists.
I believe the decision to use hash tables was reversed many years ago.

Cheers!
Arun




Information forwarded to bug-guile <at> gnu.org:
bug#77762; Package guile. (Sun, 13 Apr 2025 02:15:03 GMT) Full text and rfc822 format available.

Message #17 received at 77762 <at> debbugs.gnu.org (full text, mbox):

From: "Thompson, David" <dthompson2 <at> worcester.edu>
To: Arun Isaac <arunisaac <at> systemreboot.net>
Cc: 77762 <at> debbugs.gnu.org
Subject: Re: bug#77762: [PATCH] web: Add JSON module.
Date: Sat, 12 Apr 2025 22:13:40 -0400
Hi Arun,

On Sat, Apr 12, 2025 at 9:39 PM Arun Isaac <arunisaac <at> systemreboot.net> wrote:
>
> Thanks for the patch. I'm just trying to understand: Why do we need a
> JSON module in guile itself? Isn't guile-json, the way it is as an
> external library, good enough?

I mean, Guile doesn't *need* to include anything, but JSON is one of
the most common serialization formats around. Most language
implementations ship with a JSON library.  Guile already has a suite
of web modules and so this fits right in with those.  Over the years
I've noticed many users, especially new users, asking why Guile
doesn't ship with JSON support. I don't think you should have to
install an external library for something so common. In my own
projects I prefer to just check a JSON module into the source tree so
I don't have to add an additional dependency. guile-json is a good
library with a bunch of bells and whistles, but I think most people
just want basic read/write procedures and would be happy to take them
for granted with their Guile installation.

> > API that can read/write JSON to/from a port using only Scheme data
> > types that have read syntax (i.e. no hash tables like guile-json).
>
> guile-json does not use hash tables. guile-json uses association lists.
> I believe the decision to use hash tables was reversed many years ago.

Heh, shows how long it's been since I last used guile-json. Glad
that's changed. :)

- Dave




Information forwarded to bug-guile <at> gnu.org:
bug#77762; Package guile. (Mon, 14 Apr 2025 04:22:02 GMT) Full text and rfc822 format available.

Message #20 received at 77762 <at> debbugs.gnu.org (full text, mbox):

From: Arun Isaac <arunisaac <at> systemreboot.net>
To: "Thompson, David" <dthompson2 <at> worcester.edu>
Cc: 77762 <at> debbugs.gnu.org
Subject: Re: bug#77762: [PATCH] web: Add JSON module.
Date: Mon, 14 Apr 2025 05:21:16 +0100
Hi David,

> I mean, Guile doesn't *need* to include anything, but JSON is one of
> the most common serialization formats around. Most language
> implementations ship with a JSON library.  Guile already has a suite
> of web modules and so this fits right in with those.  Over the years
> I've noticed many users, especially new users, asking why Guile
> doesn't ship with JSON support. I don't think you should have to
> install an external library for something so common. In my own
> projects I prefer to just check a JSON module into the source tree so
> I don't have to add an additional dependency. guile-json is a good
> library with a bunch of bells and whistles, but I think most people
> just want basic read/write procedures and would be happy to take them
> for granted with their Guile installation.

Thanks for the explanation. That sounds reasonable to me. But, I am a
bit biased in favour of keeping guile small and portable. :-) Anyway,
I'll let the guile maintainers make a call here.

>> > API that can read/write JSON to/from a port using only Scheme data
>> > types that have read syntax (i.e. no hash tables like guile-json).
>>
>> guile-json does not use hash tables. guile-json uses association lists.
>> I believe the decision to use hash tables was reversed many years ago.
>
> Heh, shows how long it's been since I last used guile-json. Glad
> that's changed. :)

And, the reversal of that decision was thanks to your objection, as a
matter of fact! :-)

https://yhetil.org/guile-user/CAJ=RwfZgUg6-bCoYJR-v48ywCROEMu94NrSGrNKjzJB0foCfTQ <at> mail.gmail.com/
https://yhetil.org/guile-user/CA+XASoVmEy1Rt7w=pM21JkYL6dYJeGtW0_HjpnUa_eV5pGcvkQ <at> mail.gmail.com/

Cheers!
Arun




Information forwarded to bug-guile <at> gnu.org:
bug#77762; Package guile. (Mon, 14 Apr 2025 06:32:02 GMT) Full text and rfc822 format available.

Message #23 received at 77762 <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: "Thompson, David" <dthompson2 <at> worcester.edu>
Cc: 77762 <at> debbugs.gnu.org
Subject: Re: bug#77762: [PATCH] web: Add JSON module.
Date: Mon, 14 Apr 2025 15:30:55 +0900
Hi David,

"Thompson, David" <dthompson2 <at> worcester.edu> writes:

> Attached is a patch that adds a new (web json) module. Some may
> remember that I submitted a patch back in 2015 (time flies, eh?) for
> an (ice-9 json) module that never made it in. Well, 10 years is a long
> time and Guile still doesn't have a built-in JSON module. Third party
> libraries like guile-json and guile-sjson are available, the latter
> being an adaptation of my original patch and the former remaining the
> go-to library used by larger Guile projects like Guix. There's also
> SRFI-180 (which sounds like a cool surfing trick!) which was published
> in 2020 but the API is, in my opinion, overly complicated due to
> generators and other things.  Anyway, JSON continues to be *the* data
> interchange format of the web and Guile really ought to have a simple
> API that can read/write JSON to/from a port using only Scheme data
> types that have read syntax (i.e. no hash tables like guile-json).
> This minimal, practical API is what my patch provides.  I've tried my
> best to make it as efficient as possible.
>
> I've settled on the following JSON<->Scheme data type mapping which is
> nearly identical to SRFI-180 with the exception of object keys:
>
> - true and false are #t and #f
> - null is the symbol 'null
> - numbers are either exact integers (fixnums and bignums) or inexact
> reals (flonums, NaNs and infinities excluded)
> - strings are strings
> - arrays are vectors
> - objects are association lists with string keys (SRFI-180 chose
> symbols but JSON uses strings so strings feel the most honest)
>
> Thanks in advance for the review,

First of all, let me say thank you for working on that!  I agree that
this would be most welcome in core Guile, for the reasons you mention.

[...]

> +@example
> +@verbatim
> +{
> +  "name": "Eva Luator",
> +  "age": 24,
> +  "schemer": true,
> +  "hobbies": [
> +    "hacking",
> +    "cycling",
> +    "surfing"
> +  ]
> +}
> +@end verbatim
> +@end example
> +
> +can be represented with the following Scheme expression:
> +
> +@example
> +@verbatim
> +'(("name" . "Eva Luator")
> +  ("age" . 24)
> +  ("schemer" . #t)
> +  ("hobbies" . #("hacking" "cycling" "surfing")))
> +@end verbatim
> +@end example

Is there particular reason for using vectors instead of plain list to
represent JSON arrays?  The later would be more idiomatic unless there
are technical reasons (perhaps performance?).

> +Strings, exact integers, inexact reals (excluding NaNs and infinities),
> +@code{#t}, @code{#f}, the symbol @code{null}, vectors, and association
> +lists may be serialized as JSON.  Association lists serialize as JSON
> +objects and vectors serialize as JSON arrays.  The keys of association
> +lists @emph{must} be strings.
> +
> +@deffn {Scheme Procedure} read-json [port]
> +
> +Parse a JSON-encoded value from @var{port} and return its Scheme
> +representation.  If @var{port} is unspecified, the current input port is
> +used.
> +
> +@example
> +@verbatim
> +(call-with-input-string "[true,false,null,42,\"foo\"]" read-json)
> +;; => #(#t #f null 42 "foo")
> +
> +(call-with-input-string "{\"foo\":1,\"bar\":2}" read-json)
> +;; => (("foo" . 1) ("bar" . 2))
> +@end verbatim
> +@end example
> +
> +@end deffn
> +
> +@deftp {Exception Type} &json-read-error
> +An exception type denoting JSON read errors.
> +@end deftp
>
> +@deffn {Scheme Procedure} write-json exp [port]
> +
> +Serialize the expression @var{exp} as JSON-encoded text to @var{port}.
> +If @var{port} is unspecified, the current output port is used.
> +
> +@example
> +@verbatim
> +(with-output-to-string (lambda () (write-json #(#t #f null 42 "foo"))))
> +;; => "[true,false,null,42,\"foo\"]"
> +
> +(with-output-to-string (lambda () (write-json '(("foo" . 1) ("bar" . 2)))))
> +;; => "{\"foo\":1,\"bar\":2}"
> +@end verbatim
> +@end example
> +
> +@end deffn
> +
> +@deftp {Exception Type} &json-write-error
> +An exception type denoting JSON write errors.
> +@end deftp

I think it could be a bit nicer if the deffn of read-json and write-json
explicitly mentioned that upon error an exception of type X is raised.

> +
>  @node Web Client
>  @subsection Web Client
>  
> diff --git a/module/web/json.scm b/module/web/json.scm
> new file mode 100644
> index 000000000..41aac0e90
> --- /dev/null
> +++ b/module/web/json.scm
> @@ -0,0 +1,308 @@
> +;;;; json.scm --- JSON reader/writer (ECMA-404)
> +;;;; Copyright (C) 2025 Free Software Foundation, Inc.
> +;;;;
> +;;;; This library is free software; you can redistribute it and/or
> +;;;; modify it under the terms of the GNU Lesser General Public
> +;;;; License as published by the Free Software Foundation; either
> +;;;; version 3 of the License, or (at your option) any later version.
> +;;;;
> +;;;; This library is distributed in the hope that it will be useful,
> +;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> +;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +;;;; Lesser General Public License for more details.
> +;;;;
> +;;;; You should have received a copy of the GNU Lesser General Public
> +;;;; License along with this library; if not, write to the Free Software
> +;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

The FSF has gone office-less, so the above address is now incorrect [0].
The up-to-date template for the copyright notice (header) reads [1]:

--8<---------------cut here---------------start------------->8---
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful, but
    WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
    General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program. If not, see
    <https://www.gnu.org/licenses/>.
--8<---------------cut here---------------end--------------->8---

[0]  https://www.fsf.org/blogs/community/fsf-office-closing-party
[1]  https://www.gnu.org/licenses/gpl-howto.html

> +
> +(define-module (web json)
> +  #:use-module (ice-9 exceptions)
> +  #:use-module (ice-9 match)
> +  #:use-module (ice-9 textual-ports)
> +  #:export (&json-read-error
> +            read-json
> +
> +            &json-write-error
> +            write-json))
> +
> +(define-exception-type &json-read-error &error
> +  make-json-read-error
> +  json-read-error?)
> +
> +(define* (read-json #:optional (port (current-input-port)))
> +  "Parse a JSON-encoded value from @var{port} and return its Scheme
> +representation.  If @var{port} is unspecified, the current input port is
> +used."
> +  (define (fail message)
> +    (raise-exception
> +     (make-exception (make-json-read-error)
> +                     (make-exception-with-origin 'read-json)
> +                     (make-exception-with-message message)
> +                     (make-exception-with-irritants (list port)))))

Hm, I wonder what (list port) looks like in the irritants when the
exception is reported; is it useful?  Shouldn't it show instead the
problematic value?

> +  (define (consume-whitespace)
> +    (case (peek-char port)
> +      ((#\space #\tab #\return #\newline)

Should a match + ((? char-whitespace?)) predicate pattern be used here
instead, or similar?  Or perhaps the above is faster and more
self-contained, which can be a good thing.

> +       (read-char port)
> +       (consume-whitespace))
> +      (else (values))))
> +  (define-syntax-rule (define-keyword-reader name str val)
> +    (define (name)
> +      (if (string=? (get-string-n port (string-length str)) str)
> +          val
> +          (fail "invalid keyword"))))
> +  (define-keyword-reader read-true "true" #t)
> +  (define-keyword-reader read-false "false" #f)
> +  (define-keyword-reader read-null "null" 'null)
> +  (define (read-hex-digit)
> +    (case (peek-char port)
> +      ((#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)
> +       (- (char->integer (read-char port)) (char->integer #\0)))
> +      ((#\a #\b #\c #\d #\e #\f)
> +       (+ 10 (- (char->integer (read-char port)) (char->integer #\a))))
> +      ((#\A #\B #\C #\D #\E #\F)
> +       (+ 10 (- (char->integer (read-char port)) (char->integer #\A))))
> +      (else (fail "invalid hex digit"))))
> +  (define (read-utf16-character)
> +    (let* ((a (read-hex-digit))
> +           (b (read-hex-digit))
> +           (c (read-hex-digit))
> +           (d (read-hex-digit)))
> +      (integer->char (+ (* a (expt 16 3)) (* b (expt 16 2)) (* c 16) d))))
> +  (define (read-escape-character)
> +    (case (read-char port)
> +      ((#\") #\")
> +      ((#\\) #\\)
> +      ((#\/) #\/)
> +      ((#\b) #\backspace)
> +      ((#\f) #\page)
> +      ((#\n) #\newline)
> +      ((#\r) #\return)
> +      ((#\t) #\tab)
> +      ((#\u) (read-utf16-character))
> +      (else (fail "invalid escape character"))))
> +  (define (read-string)
> +    (read-char port)
> +    (list->string
> +     (let lp ()
> +       (match (read-char port)
> +         ((? eof-object?) (fail "EOF while reading string"))
> +         (#\" '())
> +         (#\\ (cons (read-escape-character) (lp)))
> +         (char (cons char (lp)))))))
> +  (define (read-digit-maybe)
> +    (case (peek-char port)
> +      ((#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)
> +       (- (char->integer (read-char port))
> +          (char->integer #\0)))
> +      (else #f)))
> +  (define (read-integer)
> +    (let ((x (read-digit-maybe)))
> +      (and x
> +           (let lp ((x x))
> +             (match (read-digit-maybe)
> +               (#f x)
> +               (y (lp (+ (* x 10) y))))))))

Perhaps the above should be named read-integer-maybe, since it may
return #f?

> +  (define (read-fraction)
> +    (case (peek-char port)
> +      ((#\.)
> +       (read-char port)
> +       (let lp ((mag 10))
> +         (let ((n (read-digit-maybe)))
> +           (if n (+ (/ n mag) (lp (* mag 10))) 0))))
> +      (else 0)))

Should the above be named 'read-decimal' ?  Does a decimal number in
JSON always start with '.' and not with 0. ?  I was a bit puzzled on
what 'mag' may mean here, I guess 'magnitude' although there doesn't
appear to have a clear terminology for it.

> +  (define (read-exponent)
> +    (case (peek-char port)
> +      ((#\e #\E)
> +       (read-char port)
> +       (case (peek-char port)
> +         ((#\-)
> +          (read-char port)
> +          (expt 10 (- (read-integer))))
> +         ((#\+)
> +          (read-char port)
> +          (expt 10 (read-integer)))
> +         (else
> +          (expt 10 (read-integer)))))
> +      (else 1)))
> +  (define (read-positive-number)
> +    (let ((n (read-integer)))
> +      (and n
> +           (let* ((f (read-fraction))
> +                  (e (read-exponent))
> +                  (x (* (+ n f) e)))
> +             (if (exact-integer? x) x (exact->inexact x))))))

This may return #f.  Should it fail instead, or be named
read-positive-number-maybe ?

> +  (define (read-negative-number)
> +    (read-char port)
> +    (let ((x (read-positive-number)))
> +      (if x (- x) (fail "invalid number"))))

Not symmetrical with the above: this one would fail if an integer
couldn't be read in read-positive-number.

> +  (define (read-leading-zero-number)
> +    (read-char port)
> +    (case (peek-char port)
> +      ;; Extraneous zeroes are not allowed.  A single leading zero
> +      ;; can only be followed by a decimal point.
> +      ((#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9 #\e #\E)
> +       (fail "extraneous leading zero"))

Why not check for (not #\.) explicitly?  That'd be clearer and would
cover all cases (even crazy unexpected ones).

> +      ;; Fractional number.
> +      ((#\.)
> +       (let* ((d (read-fraction))
> +              (e (read-exponent)))
> +         (exact->inexact (* d e))))
> +      ;; Just plain zero.
> +      (else 0)))
> +  (define (read-key+value-pair)
> +    (let ((key (read-string)))
> +      (consume-whitespace)
> +      (case (read-char port)
> +        ((#\:)
> +         (consume-whitespace)
> +         (cons key (read-value)))
> +        (else (fail "invalid key/value pair delimiter")))))
> +  (define (read-object)
> +    (read-char port)
> +    (consume-whitespace)
> +    (case (peek-char port)
> +      ;; Empty object.
> +      ((#\})
> +       (read-char port)
> +       '())
> +      (else
> +       ;; Read first key/value pair, then all subsequent pairs delimited
> +       ;; by commas.
> +       (cons (read-key+value-pair)
> +             (let lp ()
> +               (consume-whitespace)
> +               (case (peek-char port)
> +                 ((#\,)
> +                  (read-char port)
> +                  (consume-whitespace)
> +                  (cons (read-key+value-pair) (lp)))
> +                 ;; End of object.
> +                 ((#\})
> +                  (read-char port)
> +                  '())
> +                 (else (fail "invalid object delimiter"))))))))
> +  (define (read-array)
> +    (read-char port)
> +    (consume-whitespace)
> +    (case (peek-char port)
> +      ;; Empty array.
> +      ((#\])
> +       (read-char port)
> +       #())
> +      (else
> +       (list->vector

As mentioned above, just a plain list would be more Schemey, no?  What
does the vector type buys us?  A user wanting a vector could always call
list->vector themselves, and otherwise we save some computation.

> +        ;; Read the first element, then all subsequent elements
> +        ;; delimited by commas.
> +        (cons (read-value)
> +              (let lp ()
> +                (consume-whitespace)
> +                (case (peek-char port)
> +                  ;; Elements are comma delimited.
> +                  ((#\,)
> +                   (read-char port)
> +                   (consume-whitespace)
> +                   (cons (read-value) (lp)))
> +                  ;; End of array.
> +                  ((#\])
> +                   (read-char port)
> +                   '())
> +                  (else (fail "invalid array delimiter")))))))))
> +  (define (read-value)
> +    (consume-whitespace)
> +    (case (peek-char port)
> +      ((#\") (read-string))
> +      ((#\{) (read-object))
> +      ((#\[) (read-array))
> +      ((#\t) (read-true))
> +      ((#\f) (read-false))
> +      ((#\n) (read-null))
> +      ((#\-) (read-negative-number))
> +      ((#\0) (read-leading-zero-number))
> +      ((#\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9) (read-positive-number))
> +      (else (fail "invalid value"))))
> +  (read-value))
> +
> +(define-exception-type &json-write-error &error
> +  make-json-write-error
> +  json-write-error?)
> +
> +(define* (write-json exp #:optional (port (current-output-port)))
> +  "Serialize the expression @var{exp} as JSON-encoded text to @var{port}.
> +If @var{port} is unspecified, the current output port is used."
> +  (define (fail message x)
> +    (raise-exception
> +     (make-exception (make-json-write-error)
> +                     (make-exception-with-origin 'write-json)
> +                     (make-exception-with-message message)
> +                     (make-exception-with-irritants (list x)))))
> +  (define (write-char/escape char)
> +    (match char
> +      (#\" (put-string port "\\\""))
> +      (#\\ (put-string port "\\\\"))
> +      (#\/ (put-string port "\\/"))
> +      (#\backspace (put-string port "\\b"))
> +      (#\page (put-string port "\\f"))
> +      (#\newline (put-string port "\\n"))
> +      (#\return (put-string port "\\r"))
> +      (#\tab (put-string port "\\t"))
> +      (_ (put-char port char))))
> +  (define (write-string str)
> +    (let ((in (open-input-string str)))

Looks like the above 'in' binding is not used.

> +      (put-char port #\")
> +      (string-for-each write-char/escape str)
> +      (put-char port #\")))
> +  (define (write-pair x)
> +    (match x
> +      (((? string? key) . value)
> +       (write-string key)
> +       (put-char port #\:)
> +       (write-value value))
> +      (_ (fail "invalid key/value pair" x))))
> +  (define (write-object obj)
> +    (put-char port #\{)
> +    (match obj
> +      ((head . rest)
> +       (write-pair head)
> +       (let lp ((obj rest))
> +         (match obj
> +           (() (values))

Any reason to return (values) instead of some dummy #t to denote 'no-op'
?.

> +           ((head . rest)
> +            (put-char port #\,)
> +            (write-pair head)
> +            (lp rest))
> +           (_ (fail "invalid object" obj))))))
> +    (put-char port #\}))
> +  (define (write-array v)
> +    (put-char port #\[)
> +    (match (vector-length v)
> +      (0 (values))
> +      (n
> +       (write-value (vector-ref v 0))
> +       (do ((i 1 (1+ i)))
> +           ((= i n))
> +         (put-char port #\,)
> +         (write-value (vector-ref v i)))))

I suppose the above is more efficient than a for-each loop?  I'd be
curious to see it profiled, if you still have data.  At least now I see
than for > 100k, vector-ref is faster than list-ref, which probably
explains why you went with vectors (could still be an implementation
detail with the list->vector call left in the writer though, in my
opinion).

> +    (put-char port #\]))
> +  (define (write-number x)
> +    (if (or (exact-integer? x)
> +            (and (real? x)
> +                 (inexact? x)
> +                 ;; NaNs and infinities are not allowed.
> +                 (not (or (nan? x) (inf? x)))))
> +        ;; Scheme's string representations of exact integers and floats
> +        ;; are compatible with JSON.
> +        (put-string port (number->string x))
> +        (fail "invalid number" x)))
> +  (define (write-value x)
> +    (match x
> +      (#t (put-string port "true"))
> +      (#f (put-string port "false"))
> +      ('null (put-string port "null"))
> +      (() (put-string port "{}"))
> +      ((? pair?) (write-object x))
> +      ((? vector?) (write-array x))
> +      ((? string?) (write-string x))
> +      ((? number?) (write-number x))
> +      (_ (fail "invalid value" x))))
> +  (write-value exp))

Phew.  That's a pretty low-level parser!  I hope it's fast, otherwise it
seems it'd be more concise/fun/maintainable to devise a PEG-based one,
which appears to be doable for JSON, from what I've read.  Perhaps
sprinkle with a few performance-related comments where such concerns
impacted the design choices, so that we can remember and retest/reverify
these in the future when Guile evolves.

> diff --git a/test-suite/Makefile.am b/test-suite/Makefile.am
> index 6014b1f1f..00afea142 100644
> --- a/test-suite/Makefile.am
> +++ b/test-suite/Makefile.am
> @@ -73,6 +73,7 @@ SCM_TESTS = tests/00-initial-env.test		\
>  	    tests/iconv.test			\
>  	    tests/import.test			\
>  	    tests/interp.test			\
> +	    tests/json.test			\
>  	    tests/keywords.test			\
>  	    tests/list.test			\
>  	    tests/load.test			\
> diff --git a/test-suite/tests/json.test b/test-suite/tests/json.test
> new file mode 100644
> index 000000000..f92eeccec
> --- /dev/null
> +++ b/test-suite/tests/json.test
> @@ -0,0 +1,154 @@
> +;;;; json.test --- test JSON reader/writer     -*- scheme -*-
> +;;;;
> +;;;; Copyright (C) 2015 Free Software Foundation, Inc.
> +;;;;
> +;;;; This library is free software; you can redistribute it and/or
> +;;;; modify it under the terms of the GNU Lesser General Public
> +;;;; License as published by the Free Software Foundation; either
> +;;;; version 3 of the License, or (at your option) any later version.
> +;;;;
> +;;;; This library is distributed in the hope that it will be useful,
> +;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> +;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +;;;; Lesser General Public License for more details.
> +;;;;
> +;;;; You should have received a copy of the GNU Lesser General Public
> +;;;; License along with this library; if not, write to the Free Software
> +;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> +
> +(define-module (test-suite test-json)
> +  #:use-module (test-suite lib)
> +  #:use-module (web json))
> +
> +;;;
> +;;; Reader
> +;;;
> +
> +(define (read-json-string str)
> +  (call-with-input-string str read-json))
> +
> +(define (json-read=? str x)
> +  (= x (read-json-string str)))
> +
> +(define (json-read-eq? str x)
> +  (eq? x (read-json-string str)))
> +
> +(define (json-read-equal? str x)
> +  (equal? x (read-json-string str)))
> +
> +(define (json-read-string=? str x)
> +  (string=? x (read-json-string str)))
> +
> +(with-test-prefix "read-json"
> +  ;; Keywords
> +  (pass-if (json-read-eq? "true" #t))
> +  (pass-if (json-read-eq? "false" #f))
> +  (pass-if (json-read-eq? "null" 'null))
> +  ;; Numbers
> +  (pass-if (json-read=? "0" 0))
> +  (pass-if (json-read=? "-0" 0))
> +  (pass-if (json-read=? "0.0" 0.0))
> +  (pass-if (json-read=? "-0.0" -0.0))
> +  (pass-if (json-read=? "0.1" 0.1))
> +  (pass-if (json-read=? "1.234" 1.234))
> +  (pass-if (json-read=? "1" 1))
> +  (pass-if (json-read=? "-1" -1))
> +  (pass-if (json-read=? "1.1" 1.1))
> +  (pass-if (json-read=? "1e2" 1e2))
> +  (pass-if (json-read=? "1.1e2" 1.1e2))
> +  (pass-if (json-read=? "1.1e-2" 1.1e-2))
> +  (pass-if (json-read=? "1.1e+2" 1.1e2))
> +  ;; Extraneous zeroes in fraction
> +  (pass-if (json-read=? "1.000" 1))
> +  (pass-if (json-read=? "1.5000" 1.5))
> +  ;; Extraneous zeroes in exponent
> +  (pass-if (json-read=? "1.1e000" 1.1))
> +  (pass-if (json-read=? "1.1e-02" 1.1e-2))
> +  (pass-if (json-read=? "1.1e+02" 1.1e2))
> +  ;; Strings
> +  (pass-if (json-read-string=? "\"foo\"" "foo"))
> +  ;; Escape codes
> +  (pass-if (json-read-string=? "\"\\\"\"" "\""))
> +  (pass-if (json-read-string=? "\"\\\\\"" "\\"))
> +  (pass-if (json-read-string=? "\"\\/\"" "/"))
> +  (pass-if (json-read-string=? "\"\\b\"" "\b"))
> +  (pass-if (json-read-string=? "\"\\f\"" "\f"))
> +  (pass-if (json-read-string=? "\"\\n\"" "\n"))
> +  (pass-if (json-read-string=? "\"\\r\"" "\r"))
> +  (pass-if (json-read-string=? "\"\\t\"" "\t"))
> +  ;; Unicode in hexadecimal format
> +  (pass-if (json-read-string=? "\"\\u12ab\"" "\u12ab"))
> +  ;; Objects
> +  (pass-if (json-read-equal? "{}" '()))
> +  (pass-if (json-read-equal? "{ \"foo\": \"bar\", \"baz\": \"frob\"}"
> +                             '(("foo" . "bar") ("baz" . "frob"))))
> +  ;; Nested objects
> +  (pass-if (json-read-equal? "{\"foo\":{\"bar\":\"baz\"}}"
> +                             '(("foo" . (("bar" . "baz"))))))
> +  ;; Arrays
> +  (pass-if (json-read-equal? "[]" #()))
> +  (pass-if (json-read-equal? "[1, 2, \"foo\"]"
> +                             #(1 2 "foo")))
> +  ;; Nested arrays
> +  (pass-if (json-read-equal? "[1, 2, [\"foo\", \"bar\"]]"
> +                             #(1 2 #("foo" "bar"))))
> +  ;; Arrays and objects nested within each other
> +  (pass-if (json-read-equal? "{\"foo\":[{\"bar\":true},{\"baz\":[1,2,3]}]}"
> +                             '(("foo" . #((("bar" . #t))
> +                                          (("baz" . #(1 2 3))))))))
> +  ;; Leading whitespace
> +  (pass-if (json-read-eq? "\t\r\n true" #t)))

> +;;;
> +;;; Writer
> +;;;
> +
> +(define (write-json-string exp)
> +  (call-with-output-string
> +   (lambda (port)
> +     (write-json exp port))))
> +
> +(define (json-write-string=? exp str)
> +  (string=? str (write-json-string exp)))
> +
> +(with-test-prefix "write-json"
> +  ;; Keywords
> +  (pass-if (json-write-string=? #t "true"))
> +  (pass-if (json-write-string=? #f "false"))
> +  (pass-if (json-write-string=? 'null "null"))
> +  ;; Numbers
> +  (pass-if (json-write-string=? 0 "0"))
> +  (pass-if (json-write-string=? 0.0 "0.0"))
> +  (pass-if (json-write-string=? 0.1 "0.1"))
> +  (pass-if (json-write-string=? 1 "1"))
> +  (pass-if (json-write-string=? -1 "-1"))
> +  (pass-if (json-write-string=? 1.1 "1.1"))
> +  ;; Strings
> +  (pass-if (json-write-string=? "foo" "\"foo\""))
> +  ;; Escape codes
> +  (pass-if (json-write-string=? "\"" "\"\\\"\""))
> +  (pass-if (json-write-string=? "\\" "\"\\\\\""))
> +  (pass-if (json-write-string=? "/" "\"\\/\""))
> +  (pass-if (json-write-string=? "\b" "\"\\b\""))
> +  (pass-if (json-write-string=? "\f" "\"\\f\""))
> +  (pass-if (json-write-string=? "\n" "\"\\n\""))
> +  (pass-if (json-write-string=? "\r" "\"\\r\""))
> +  (pass-if (json-write-string=? "\t" "\"\\t\""))
> +  ;; Objects
> +  (pass-if (json-write-string=? '() "{}"))
> +  (pass-if (json-write-string=? '(("foo" . "bar") ("baz" . "frob"))
> +                                "{\"foo\":\"bar\",\"baz\":\"frob\"}"))
> +  ;; Nested objects
> +  (pass-if (json-write-string=? '(("foo" . (("bar" . "baz"))))
> +                                "{\"foo\":{\"bar\":\"baz\"}}"))
> +  ;; Arrays
> +  (pass-if (json-write-string=? #() "[]"))
> +  (pass-if (json-write-string=? #(1 2 "foo")
> +                                "[1,2,\"foo\"]"))
> +  ;; Nested arrays
> +  (pass-if (json-write-string=? #(1 2 #("foo" "bar"))
> +                                "[1,2,[\"foo\",\"bar\"]]"))
> +  ;; Arrays and objects nested in each other
> +  (pass-if (json-write-string=? '(("foo" . #((("bar" . #t))
> +                                             (("baz" . #(1 2))))))
> +                                "{\"foo\":[{\"bar\":true},{\"baz\":[1,2]}]}")))

Neat.  Nitpick: perhaps add a trailing '.' after each stand-alone
comments, to follow existing conventions.

I hope my armchair commentary is of some use :-).

Thanks again for working on a JSON parser/writer for Guile.

-- 
Thanks,
Maxim




Information forwarded to bug-guile <at> gnu.org:
bug#77762; Package guile. (Mon, 14 Apr 2025 12:17:02 GMT) Full text and rfc822 format available.

Message #26 received at 77762 <at> debbugs.gnu.org (full text, mbox):

From: "Thompson, David" <dthompson2 <at> worcester.edu>
To: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Cc: 77762 <at> debbugs.gnu.org
Subject: Re: bug#77762: [PATCH] web: Add JSON module.
Date: Mon, 14 Apr 2025 08:16:13 -0400
[Message part 1 (text/plain, inline)]
Hi Maxim,

Thanks for the review!

On Mon, Apr 14, 2025 at 2:31 AM Maxim Cournoyer
<maxim.cournoyer <at> gmail.com> wrote:
>
> > +@example
> > +@verbatim
> > +{
> > +  "name": "Eva Luator",
> > +  "age": 24,
> > +  "schemer": true,
> > +  "hobbies": [
> > +    "hacking",
> > +    "cycling",
> > +    "surfing"
> > +  ]
> > +}
> > +@end verbatim
> > +@end example
> > +
> > +can be represented with the following Scheme expression:
> > +
> > +@example
> > +@verbatim
> > +'(("name" . "Eva Luator")
> > +  ("age" . 24)
> > +  ("schemer" . #t)
> > +  ("hobbies" . #("hacking" "cycling" "surfing")))
> > +@end verbatim
> > +@end example
>
> Is there particular reason for using vectors instead of plain list to
> represent JSON arrays?  The later would be more idiomatic unless there
> are technical reasons (perhaps performance?).

Back in 2015 I chose lists out of convenience because lists are such a
fundamental data structure in Scheme. It is very tempting! However,
there are a couple of issues: 1) O(n) access time for lists is not
great when there's a perfectly suitable O(1) data structure with read
syntax sitting right there and 2) It creates ambiguity when lists are
also used to represent objects. My previous patch used the symbol '@
as a sentinel to mark objects, but it was kinda gross and I grew to
dislike it more and more over the years. SRFI-180 uses vectors,
Racket's JSON library uses vectors, etc. and I think they made the
right call.

> > +Strings, exact integers, inexact reals (excluding NaNs and infinities),
> > +@code{#t}, @code{#f}, the symbol @code{null}, vectors, and association
> > +lists may be serialized as JSON.  Association lists serialize as JSON
> > +objects and vectors serialize as JSON arrays.  The keys of association
> > +lists @emph{must} be strings.
> > +
> > +@deffn {Scheme Procedure} read-json [port]
> > +
> > +Parse a JSON-encoded value from @var{port} and return its Scheme
> > +representation.  If @var{port} is unspecified, the current input port is
> > +used.
> > +
> > +@example
> > +@verbatim
> > +(call-with-input-string "[true,false,null,42,\"foo\"]" read-json)
> > +;; => #(#t #f null 42 "foo")
> > +
> > +(call-with-input-string "{\"foo\":1,\"bar\":2}" read-json)
> > +;; => (("foo" . 1) ("bar" . 2))
> > +@end verbatim
> > +@end example
> > +
> > +@end deffn
> > +
> > +@deftp {Exception Type} &json-read-error
> > +An exception type denoting JSON read errors.
> > +@end deftp
> >
> > +@deffn {Scheme Procedure} write-json exp [port]
> > +
> > +Serialize the expression @var{exp} as JSON-encoded text to @var{port}.
> > +If @var{port} is unspecified, the current output port is used.
> > +
> > +@example
> > +@verbatim
> > +(with-output-to-string (lambda () (write-json #(#t #f null 42 "foo"))))
> > +;; => "[true,false,null,42,\"foo\"]"
> > +
> > +(with-output-to-string (lambda () (write-json '(("foo" . 1) ("bar" . 2)))))
> > +;; => "{\"foo\":1,\"bar\":2}"
> > +@end verbatim
> > +@end example
> > +
> > +@end deffn
> > +
> > +@deftp {Exception Type} &json-write-error
> > +An exception type denoting JSON write errors.
> > +@end deftp
>
> I think it could be a bit nicer if the deffn of read-json and write-json
> explicitly mentioned that upon error an exception of type X is raised.

Sure, makes sense to mention that.

> > +
> >  @node Web Client
> >  @subsection Web Client
> >
> > diff --git a/module/web/json.scm b/module/web/json.scm
> > new file mode 100644
> > index 000000000..41aac0e90
> > --- /dev/null
> > +++ b/module/web/json.scm
> > @@ -0,0 +1,308 @@
> > +;;;; json.scm --- JSON reader/writer (ECMA-404)
> > +;;;; Copyright (C) 2025 Free Software Foundation, Inc.
> > +;;;;
> > +;;;; This library is free software; you can redistribute it and/or
> > +;;;; modify it under the terms of the GNU Lesser General Public
> > +;;;; License as published by the Free Software Foundation; either
> > +;;;; version 3 of the License, or (at your option) any later version.
> > +;;;;
> > +;;;; This library is distributed in the hope that it will be useful,
> > +;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +;;;; Lesser General Public License for more details.
> > +;;;;
> > +;;;; You should have received a copy of the GNU Lesser General Public
> > +;;;; License along with this library; if not, write to the Free Software
> > +;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>
> The FSF has gone office-less, so the above address is now incorrect [0].
> The up-to-date template for the copyright notice (header) reads [1]:
>
> --8<---------------cut here---------------start------------->8---
>     This program is free software: you can redistribute it and/or modify
>     it under the terms of the GNU General Public License as published by
>     the Free Software Foundation, either version 3 of the License, or
>     (at your option) any later version.
>
>     This program is distributed in the hope that it will be useful, but
>     WITHOUT ANY WARRANTY; without even the implied warranty of
>     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>     General Public License for more details.
>
>     You should have received a copy of the GNU General Public License
>     along with this program. If not, see
>     <https://www.gnu.org/licenses/>.
> --8<---------------cut here---------------end--------------->8---
>
> [0]  https://www.fsf.org/blogs/community/fsf-office-closing-party
> [1]  https://www.gnu.org/licenses/gpl-howto.html

Ah, right. Good catch.  I knew about this change but wasn't thinking
about the copyright header at the time. I even walked through Franklin
St. recently and thought about the FSF no longer having an office.  I
always thought it was a bit silly to include a mailing address in a
license header because offices are not forever.

> > +
> > +(define-module (web json)
> > +  #:use-module (ice-9 exceptions)
> > +  #:use-module (ice-9 match)
> > +  #:use-module (ice-9 textual-ports)
> > +  #:export (&json-read-error
> > +            read-json
> > +
> > +            &json-write-error
> > +            write-json))
> > +
> > +(define-exception-type &json-read-error &error
> > +  make-json-read-error
> > +  json-read-error?)
> > +
> > +(define* (read-json #:optional (port (current-input-port)))
> > +  "Parse a JSON-encoded value from @var{port} and return its Scheme
> > +representation.  If @var{port} is unspecified, the current input port is
> > +used."
> > +  (define (fail message)
> > +    (raise-exception
> > +     (make-exception (make-json-read-error)
> > +                     (make-exception-with-origin 'read-json)
> > +                     (make-exception-with-message message)
> > +                     (make-exception-with-irritants (list port)))))
>
> Hm, I wonder what (list port) looks like in the irritants when the
> exception is reported; is it useful?  Shouldn't it show instead the
> problematic value?

By including the port in the irritants it allows the exception handler
to use 'port-line' and 'port-column' to show where in the JSON
document the error occurred (for ports with such information like file
ports).

> > +  (define (consume-whitespace)
> > +    (case (peek-char port)
> > +      ((#\space #\tab #\return #\newline)
>
> Should a match + ((? char-whitespace?)) predicate pattern be used here
> instead, or similar?  Or perhaps the above is faster and more
> self-contained, which can be a good thing.

ECMA-404 states that these 4 characters are the only acceptable
whitespace characters:

"Whitespace is any sequence of one or more of the following code
points: character tabulation (U+0009), line feed (U+000A), carriage
return (U+000D), and space (U+0020)."

> > +       (read-char port)
> > +       (consume-whitespace))
> > +      (else (values))))
> > +  (define-syntax-rule (define-keyword-reader name str val)
> > +    (define (name)
> > +      (if (string=? (get-string-n port (string-length str)) str)
> > +          val
> > +          (fail "invalid keyword"))))
> > +  (define-keyword-reader read-true "true" #t)
> > +  (define-keyword-reader read-false "false" #f)
> > +  (define-keyword-reader read-null "null" 'null)
> > +  (define (read-hex-digit)
> > +    (case (peek-char port)
> > +      ((#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)
> > +       (- (char->integer (read-char port)) (char->integer #\0)))
> > +      ((#\a #\b #\c #\d #\e #\f)
> > +       (+ 10 (- (char->integer (read-char port)) (char->integer #\a))))
> > +      ((#\A #\B #\C #\D #\E #\F)
> > +       (+ 10 (- (char->integer (read-char port)) (char->integer #\A))))
> > +      (else (fail "invalid hex digit"))))
> > +  (define (read-utf16-character)
> > +    (let* ((a (read-hex-digit))
> > +           (b (read-hex-digit))
> > +           (c (read-hex-digit))
> > +           (d (read-hex-digit)))
> > +      (integer->char (+ (* a (expt 16 3)) (* b (expt 16 2)) (* c 16) d))))
> > +  (define (read-escape-character)
> > +    (case (read-char port)
> > +      ((#\") #\")
> > +      ((#\\) #\\)
> > +      ((#\/) #\/)
> > +      ((#\b) #\backspace)
> > +      ((#\f) #\page)
> > +      ((#\n) #\newline)
> > +      ((#\r) #\return)
> > +      ((#\t) #\tab)
> > +      ((#\u) (read-utf16-character))
> > +      (else (fail "invalid escape character"))))
> > +  (define (read-string)
> > +    (read-char port)
> > +    (list->string
> > +     (let lp ()
> > +       (match (read-char port)
> > +         ((? eof-object?) (fail "EOF while reading string"))
> > +         (#\" '())
> > +         (#\\ (cons (read-escape-character) (lp)))
> > +         (char (cons char (lp)))))))
> > +  (define (read-digit-maybe)
> > +    (case (peek-char port)
> > +      ((#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)
> > +       (- (char->integer (read-char port))
> > +          (char->integer #\0)))
> > +      (else #f)))
> > +  (define (read-integer)
> > +    (let ((x (read-digit-maybe)))
> > +      (and x
> > +           (let lp ((x x))
> > +             (match (read-digit-maybe)
> > +               (#f x)
> > +               (y (lp (+ (* x 10) y))))))))
>
> Perhaps the above should be named read-integer-maybe, since it may
> return #f?

Yeah, good idea.

> > +  (define (read-fraction)
> > +    (case (peek-char port)
> > +      ((#\.)
> > +       (read-char port)
> > +       (let lp ((mag 10))
> > +         (let ((n (read-digit-maybe)))
> > +           (if n (+ (/ n mag) (lp (* mag 10))) 0))))
> > +      (else 0)))
>
> Should the above be named 'read-decimal' ?  Does a decimal number in
> JSON always start with '.' and not with 0. ?  I was a bit puzzled on
> what 'mag' may mean here, I guess 'magnitude' although there doesn't
> appear to have a clear terminology for it.

It's called 'read-fraction' because it reads the fractional part of
the number.  I considered 'read-decimal' but I think it's less
descriptive. This procedure is called from two places, after the
respective caller has already read the digits before the decimal
point. 'mag' does indeed mean 'magnitude', as each digit read
increases the order of magnitude of the resulting number. I borrowed
this name from Hoot's 'read' implementation, which means it's probably
in Guile's pure-Scheme 'read' implementation as well.

> > +  (define (read-exponent)
> > +    (case (peek-char port)
> > +      ((#\e #\E)
> > +       (read-char port)
> > +       (case (peek-char port)
> > +         ((#\-)
> > +          (read-char port)
> > +          (expt 10 (- (read-integer))))
> > +         ((#\+)
> > +          (read-char port)
> > +          (expt 10 (read-integer)))
> > +         (else
> > +          (expt 10 (read-integer)))))
> > +      (else 1)))
> > +  (define (read-positive-number)
> > +    (let ((n (read-integer)))
> > +      (and n
> > +           (let* ((f (read-fraction))
> > +                  (e (read-exponent))
> > +                  (x (* (+ n f) e)))
> > +             (if (exact-integer? x) x (exact->inexact x))))))
>
> This may return #f.  Should it fail instead, or be named
> read-positive-number-maybe ?

Adding 'maybe' to the name is good. It's called from two places. One
caller knows that it's not going to return #f because it has peeked a
valid digit from the port beforehand, the other does a check and fails
upon #f.

> > +  (define (read-negative-number)
> > +    (read-char port)
> > +    (let ((x (read-positive-number)))
> > +      (if x (- x) (fail "invalid number"))))
>
> Not symmetrical with the above: this one would fail if an integer
> couldn't be read in read-positive-number.

This is the intended behavior to ensure that '-' is not parsed as a
valid number.  There needs to be at least one digit.

> > +  (define (read-leading-zero-number)
> > +    (read-char port)
> > +    (case (peek-char port)
> > +      ;; Extraneous zeroes are not allowed.  A single leading zero
> > +      ;; can only be followed by a decimal point.
> > +      ((#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9 #\e #\E)
> > +       (fail "extraneous leading zero"))
>
> Why not check for (not #\.) explicitly?  That'd be clearer and would
> cover all cases (even crazy unexpected ones).

The following clause checks for '.'. We need to distinguish between 3
types of input here:

- invalid extraneous zeroes like '09' (ECMA-404 says this is not allowed)
- fractional notation like '0.123'
- plain '0'

If I leave out this clause, we can no longer distinguish errors from
plain zeroes.  There's likely other ways to express it but this feels
straightforward enough.

However, it was a great idea to take another look at this code because
there is a bug! '0e3' is a valid JSON number that this code rejects.
It should parse to 0.  My updated patch fixes this and adds a test
case.

> > +      ;; Fractional number.
> > +      ((#\.)
> > +       (let* ((d (read-fraction))
> > +              (e (read-exponent)))
> > +         (exact->inexact (* d e))))
> > +      ;; Just plain zero.
> > +      (else 0)))
> > +  (define (read-key+value-pair)
> > +    (let ((key (read-string)))
> > +      (consume-whitespace)
> > +      (case (read-char port)
> > +        ((#\:)
> > +         (consume-whitespace)
> > +         (cons key (read-value)))
> > +        (else (fail "invalid key/value pair delimiter")))))
> > +  (define (read-object)
> > +    (read-char port)
> > +    (consume-whitespace)
> > +    (case (peek-char port)
> > +      ;; Empty object.
> > +      ((#\})
> > +       (read-char port)
> > +       '())
> > +      (else
> > +       ;; Read first key/value pair, then all subsequent pairs delimited
> > +       ;; by commas.
> > +       (cons (read-key+value-pair)
> > +             (let lp ()
> > +               (consume-whitespace)
> > +               (case (peek-char port)
> > +                 ((#\,)
> > +                  (read-char port)
> > +                  (consume-whitespace)
> > +                  (cons (read-key+value-pair) (lp)))
> > +                 ;; End of object.
> > +                 ((#\})
> > +                  (read-char port)
> > +                  '())
> > +                 (else (fail "invalid object delimiter"))))))))
> > +  (define (read-array)
> > +    (read-char port)
> > +    (consume-whitespace)
> > +    (case (peek-char port)
> > +      ;; Empty array.
> > +      ((#\])
> > +       (read-char port)
> > +       #())
> > +      (else
> > +       (list->vector
>
> As mentioned above, just a plain list would be more Schemey, no?  What
> does the vector type buys us?  A user wanting a vector could always call
> list->vector themselves, and otherwise we save some computation.

As stated above, vectors are the more natural analog to JSON arrays
and being distinct from pairs they resolve some ambiguity in the
Scheme representation that would exist otherwise.

> > +        ;; Read the first element, then all subsequent elements
> > +        ;; delimited by commas.
> > +        (cons (read-value)
> > +              (let lp ()
> > +                (consume-whitespace)
> > +                (case (peek-char port)
> > +                  ;; Elements are comma delimited.
> > +                  ((#\,)
> > +                   (read-char port)
> > +                   (consume-whitespace)
> > +                   (cons (read-value) (lp)))
> > +                  ;; End of array.
> > +                  ((#\])
> > +                   (read-char port)
> > +                   '())
> > +                  (else (fail "invalid array delimiter")))))))))
> > +  (define (read-value)
> > +    (consume-whitespace)
> > +    (case (peek-char port)
> > +      ((#\") (read-string))
> > +      ((#\{) (read-object))
> > +      ((#\[) (read-array))
> > +      ((#\t) (read-true))
> > +      ((#\f) (read-false))
> > +      ((#\n) (read-null))
> > +      ((#\-) (read-negative-number))
> > +      ((#\0) (read-leading-zero-number))
> > +      ((#\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9) (read-positive-number))
> > +      (else (fail "invalid value"))))
> > +  (read-value))
> > +
> > +(define-exception-type &json-write-error &error
> > +  make-json-write-error
> > +  json-write-error?)
> > +
> > +(define* (write-json exp #:optional (port (current-output-port)))
> > +  "Serialize the expression @var{exp} as JSON-encoded text to @var{port}.
> > +If @var{port} is unspecified, the current output port is used."
> > +  (define (fail message x)
> > +    (raise-exception
> > +     (make-exception (make-json-write-error)
> > +                     (make-exception-with-origin 'write-json)
> > +                     (make-exception-with-message message)
> > +                     (make-exception-with-irritants (list x)))))
> > +  (define (write-char/escape char)
> > +    (match char
> > +      (#\" (put-string port "\\\""))
> > +      (#\\ (put-string port "\\\\"))
> > +      (#\/ (put-string port "\\/"))
> > +      (#\backspace (put-string port "\\b"))
> > +      (#\page (put-string port "\\f"))
> > +      (#\newline (put-string port "\\n"))
> > +      (#\return (put-string port "\\r"))
> > +      (#\tab (put-string port "\\t"))
> > +      (_ (put-char port char))))
> > +  (define (write-string str)
> > +    (let ((in (open-input-string str)))
>
> Looks like the above 'in' binding is not used.

Vestigial code, my favorite. Good eye! Removed.

> > +      (put-char port #\")
> > +      (string-for-each write-char/escape str)
> > +      (put-char port #\")))
> > +  (define (write-pair x)
> > +    (match x
> > +      (((? string? key) . value)
> > +       (write-string key)
> > +       (put-char port #\:)
> > +       (write-value value))
> > +      (_ (fail "invalid key/value pair" x))))
> > +  (define (write-object obj)
> > +    (put-char port #\{)
> > +    (match obj
> > +      ((head . rest)
> > +       (write-pair head)
> > +       (let lp ((obj rest))
> > +         (match obj
> > +           (() (values))
>
> Any reason to return (values) instead of some dummy #t to denote 'no-op'
> ?.

The loop is evaluated for effect and thus there is nothing to return
so returning 0 values makes sense. This is something I picked up from
Andy Wingo.

> > +           ((head . rest)
> > +            (put-char port #\,)
> > +            (write-pair head)
> > +            (lp rest))
> > +           (_ (fail "invalid object" obj))))))
> > +    (put-char port #\}))
> > +  (define (write-array v)
> > +    (put-char port #\[)
> > +    (match (vector-length v)
> > +      (0 (values))
> > +      (n
> > +       (write-value (vector-ref v 0))
> > +       (do ((i 1 (1+ i)))
> > +           ((= i n))
> > +         (put-char port #\,)
> > +         (write-value (vector-ref v i)))))
>
> I suppose the above is more efficient than a for-each loop?  I'd be
> curious to see it profiled, if you still have data.  At least now I see
> than for > 100k, vector-ref is faster than list-ref, which probably
> explains why you went with vectors (could still be an implementation
> detail with the list->vector call left in the writer though, in my
> opinion).

Yes, it is more efficient because it avoids closure allocation. Also,
I just don't see any reason to import (rnrs base) or (srfi srfi-43) to
get vector-for-each when looping over a vector is trivial. I didn't
profile anything.

> > +    (put-char port #\]))
> > +  (define (write-number x)
> > +    (if (or (exact-integer? x)
> > +            (and (real? x)
> > +                 (inexact? x)
> > +                 ;; NaNs and infinities are not allowed.
> > +                 (not (or (nan? x) (inf? x)))))
> > +        ;; Scheme's string representations of exact integers and floats
> > +        ;; are compatible with JSON.
> > +        (put-string port (number->string x))
> > +        (fail "invalid number" x)))
> > +  (define (write-value x)
> > +    (match x
> > +      (#t (put-string port "true"))
> > +      (#f (put-string port "false"))
> > +      ('null (put-string port "null"))
> > +      (() (put-string port "{}"))
> > +      ((? pair?) (write-object x))
> > +      ((? vector?) (write-array x))
> > +      ((? string?) (write-string x))
> > +      ((? number?) (write-number x))
> > +      (_ (fail "invalid value" x))))
> > +  (write-value exp))
>
> Phew.  That's a pretty low-level parser!  I hope it's fast, otherwise it
> seems it'd be more concise/fun/maintainable to devise a PEG-based one,
> which appears to be doable for JSON, from what I've read.  Perhaps
> sprinkle with a few performance-related comments where such concerns
> impacted the design choices, so that we can remember and retest/reverify
> these in the future when Guile evolves.

JSON is a pretty simple format and thus I think a hand-rolled parser
is appropriate.  It's much simpler than 'read', anyway.  I suppose it
would be more concise, but "PEG parser" and "fun" do not go together
for me.  At ~300 lines of quite simple code (I did not go hog wild on
macros or fancy abstractions nor did I sacrifice readable code for
performance) I don't think there is much concern regarding
maintenance.  If someone wants to experiment to see how a PEG parser
compares, though, feel free.

> > diff --git a/test-suite/Makefile.am b/test-suite/Makefile.am
> > index 6014b1f1f..00afea142 100644
> > --- a/test-suite/Makefile.am
> > +++ b/test-suite/Makefile.am
> > @@ -73,6 +73,7 @@ SCM_TESTS = tests/00-initial-env.test               \
> >           tests/iconv.test                    \
> >           tests/import.test                   \
> >           tests/interp.test                   \
> > +         tests/json.test                     \
> >           tests/keywords.test                 \
> >           tests/list.test                     \
> >           tests/load.test                     \
> > diff --git a/test-suite/tests/json.test b/test-suite/tests/json.test
> > new file mode 100644
> > index 000000000..f92eeccec
> > --- /dev/null
> > +++ b/test-suite/tests/json.test
> > @@ -0,0 +1,154 @@
> > +;;;; json.test --- test JSON reader/writer     -*- scheme -*-
> > +;;;;
> > +;;;; Copyright (C) 2015 Free Software Foundation, Inc.
> > +;;;;
> > +;;;; This library is free software; you can redistribute it and/or
> > +;;;; modify it under the terms of the GNU Lesser General Public
> > +;;;; License as published by the Free Software Foundation; either
> > +;;;; version 3 of the License, or (at your option) any later version.
> > +;;;;
> > +;;;; This library is distributed in the hope that it will be useful,
> > +;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +;;;; Lesser General Public License for more details.
> > +;;;;
> > +;;;; You should have received a copy of the GNU Lesser General Public
> > +;;;; License along with this library; if not, write to the Free Software
> > +;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> > +
> > +(define-module (test-suite test-json)
> > +  #:use-module (test-suite lib)
> > +  #:use-module (web json))
> > +
> > +;;;
> > +;;; Reader
> > +;;;
> > +
> > +(define (read-json-string str)
> > +  (call-with-input-string str read-json))
> > +
> > +(define (json-read=? str x)
> > +  (= x (read-json-string str)))
> > +
> > +(define (json-read-eq? str x)
> > +  (eq? x (read-json-string str)))
> > +
> > +(define (json-read-equal? str x)
> > +  (equal? x (read-json-string str)))
> > +
> > +(define (json-read-string=? str x)
> > +  (string=? x (read-json-string str)))
> > +
> > +(with-test-prefix "read-json"
> > +  ;; Keywords
> > +  (pass-if (json-read-eq? "true" #t))
> > +  (pass-if (json-read-eq? "false" #f))
> > +  (pass-if (json-read-eq? "null" 'null))
> > +  ;; Numbers
> > +  (pass-if (json-read=? "0" 0))
> > +  (pass-if (json-read=? "-0" 0))
> > +  (pass-if (json-read=? "0.0" 0.0))
> > +  (pass-if (json-read=? "-0.0" -0.0))
> > +  (pass-if (json-read=? "0.1" 0.1))
> > +  (pass-if (json-read=? "1.234" 1.234))
> > +  (pass-if (json-read=? "1" 1))
> > +  (pass-if (json-read=? "-1" -1))
> > +  (pass-if (json-read=? "1.1" 1.1))
> > +  (pass-if (json-read=? "1e2" 1e2))
> > +  (pass-if (json-read=? "1.1e2" 1.1e2))
> > +  (pass-if (json-read=? "1.1e-2" 1.1e-2))
> > +  (pass-if (json-read=? "1.1e+2" 1.1e2))
> > +  ;; Extraneous zeroes in fraction
> > +  (pass-if (json-read=? "1.000" 1))
> > +  (pass-if (json-read=? "1.5000" 1.5))
> > +  ;; Extraneous zeroes in exponent
> > +  (pass-if (json-read=? "1.1e000" 1.1))
> > +  (pass-if (json-read=? "1.1e-02" 1.1e-2))
> > +  (pass-if (json-read=? "1.1e+02" 1.1e2))
> > +  ;; Strings
> > +  (pass-if (json-read-string=? "\"foo\"" "foo"))
> > +  ;; Escape codes
> > +  (pass-if (json-read-string=? "\"\\\"\"" "\""))
> > +  (pass-if (json-read-string=? "\"\\\\\"" "\\"))
> > +  (pass-if (json-read-string=? "\"\\/\"" "/"))
> > +  (pass-if (json-read-string=? "\"\\b\"" "\b"))
> > +  (pass-if (json-read-string=? "\"\\f\"" "\f"))
> > +  (pass-if (json-read-string=? "\"\\n\"" "\n"))
> > +  (pass-if (json-read-string=? "\"\\r\"" "\r"))
> > +  (pass-if (json-read-string=? "\"\\t\"" "\t"))
> > +  ;; Unicode in hexadecimal format
> > +  (pass-if (json-read-string=? "\"\\u12ab\"" "\u12ab"))
> > +  ;; Objects
> > +  (pass-if (json-read-equal? "{}" '()))
> > +  (pass-if (json-read-equal? "{ \"foo\": \"bar\", \"baz\": \"frob\"}"
> > +                             '(("foo" . "bar") ("baz" . "frob"))))
> > +  ;; Nested objects
> > +  (pass-if (json-read-equal? "{\"foo\":{\"bar\":\"baz\"}}"
> > +                             '(("foo" . (("bar" . "baz"))))))
> > +  ;; Arrays
> > +  (pass-if (json-read-equal? "[]" #()))
> > +  (pass-if (json-read-equal? "[1, 2, \"foo\"]"
> > +                             #(1 2 "foo")))
> > +  ;; Nested arrays
> > +  (pass-if (json-read-equal? "[1, 2, [\"foo\", \"bar\"]]"
> > +                             #(1 2 #("foo" "bar"))))
> > +  ;; Arrays and objects nested within each other
> > +  (pass-if (json-read-equal? "{\"foo\":[{\"bar\":true},{\"baz\":[1,2,3]}]}"
> > +                             '(("foo" . #((("bar" . #t))
> > +                                          (("baz" . #(1 2 3))))))))
> > +  ;; Leading whitespace
> > +  (pass-if (json-read-eq? "\t\r\n true" #t)))
>
> > +;;;
> > +;;; Writer
> > +;;;
> > +
> > +(define (write-json-string exp)
> > +  (call-with-output-string
> > +   (lambda (port)
> > +     (write-json exp port))))
> > +
> > +(define (json-write-string=? exp str)
> > +  (string=? str (write-json-string exp)))
> > +
> > +(with-test-prefix "write-json"
> > +  ;; Keywords
> > +  (pass-if (json-write-string=? #t "true"))
> > +  (pass-if (json-write-string=? #f "false"))
> > +  (pass-if (json-write-string=? 'null "null"))
> > +  ;; Numbers
> > +  (pass-if (json-write-string=? 0 "0"))
> > +  (pass-if (json-write-string=? 0.0 "0.0"))
> > +  (pass-if (json-write-string=? 0.1 "0.1"))
> > +  (pass-if (json-write-string=? 1 "1"))
> > +  (pass-if (json-write-string=? -1 "-1"))
> > +  (pass-if (json-write-string=? 1.1 "1.1"))
> > +  ;; Strings
> > +  (pass-if (json-write-string=? "foo" "\"foo\""))
> > +  ;; Escape codes
> > +  (pass-if (json-write-string=? "\"" "\"\\\"\""))
> > +  (pass-if (json-write-string=? "\\" "\"\\\\\""))
> > +  (pass-if (json-write-string=? "/" "\"\\/\""))
> > +  (pass-if (json-write-string=? "\b" "\"\\b\""))
> > +  (pass-if (json-write-string=? "\f" "\"\\f\""))
> > +  (pass-if (json-write-string=? "\n" "\"\\n\""))
> > +  (pass-if (json-write-string=? "\r" "\"\\r\""))
> > +  (pass-if (json-write-string=? "\t" "\"\\t\""))
> > +  ;; Objects
> > +  (pass-if (json-write-string=? '() "{}"))
> > +  (pass-if (json-write-string=? '(("foo" . "bar") ("baz" . "frob"))
> > +                                "{\"foo\":\"bar\",\"baz\":\"frob\"}"))
> > +  ;; Nested objects
> > +  (pass-if (json-write-string=? '(("foo" . (("bar" . "baz"))))
> > +                                "{\"foo\":{\"bar\":\"baz\"}}"))
> > +  ;; Arrays
> > +  (pass-if (json-write-string=? #() "[]"))
> > +  (pass-if (json-write-string=? #(1 2 "foo")
> > +                                "[1,2,\"foo\"]"))
> > +  ;; Nested arrays
> > +  (pass-if (json-write-string=? #(1 2 #("foo" "bar"))
> > +                                "[1,2,[\"foo\",\"bar\"]]"))
> > +  ;; Arrays and objects nested in each other
> > +  (pass-if (json-write-string=? '(("foo" . #((("bar" . #t))
> > +                                             (("baz" . #(1 2))))))
> > +                                "{\"foo\":[{\"bar\":true},{\"baz\":[1,2]}]}")))
>
> Neat.  Nitpick: perhaps add a trailing '.' after each stand-alone
> comments, to follow existing conventions.

Sure thing.

> I hope my armchair commentary is of some use :-).

It was! Always nice to have another pair of eyes on some code you've
stared at for too long to notice all the little issues that remain.
Thank you!

> Thanks again for working on a JSON parser/writer for Guile.

:)

Updated patch attached.

- Dave
[0001-web-Add-JSON-module.patch (text/x-patch, attachment)]

Information forwarded to bug-guile <at> gnu.org:
bug#77762; Package guile. (Sun, 20 Apr 2025 13:49:05 GMT) Full text and rfc822 format available.

Message #29 received at 77762 <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: "Thompson, David" <dthompson2 <at> worcester.edu>
Cc: 77762 <at> debbugs.gnu.org
Subject: Re: bug#77762: [PATCH] web: Add JSON module.
Date: Sun, 20 Apr 2025 22:48:00 +0900
Hi David,

"Thompson, David" <dthompson2 <at> worcester.edu> writes:

> Hi Maxim,
>
> Thanks for the review!

My pleasure!  It's a small thing I can do to try to help nudge builtin
JSON support in Guile, which I agree has its place in 2025.  In general,
I'd like Guile to compete a bit more with Python in the out-of-the-box
library capabilities.  Especially since outside of Guix installing Guile
libraries is probably not that fun (I'd suspect many Guile libraries to
be missing or outdated on various distributions), so a more useful core
has even more value there.

[...]

>> Is there particular reason for using vectors instead of plain list to
>> represent JSON arrays?  The later would be more idiomatic unless there
>> are technical reasons (perhaps performance?).
>
> Back in 2015 I chose lists out of convenience because lists are such a
> fundamental data structure in Scheme. It is very tempting! However,
> there are a couple of issues: 1) O(n) access time for lists is not
> great when there's a perfectly suitable O(1) data structure with read
> syntax sitting right there and 2) It creates ambiguity when lists are
> also used to represent objects. My previous patch used the symbol '@
> as a sentinel to mark objects, but it was kinda gross and I grew to
> dislike it more and more over the years. SRFI-180 uses vectors,
> Racket's JSON library uses vectors, etc. and I think they made the
> right call.

I see.  It makes sense, thanks for explaining.

[...]

>> Hm, I wonder what (list port) looks like in the irritants when the
>> exception is reported; is it useful?  Shouldn't it show instead the
>> problematic value?
>
> By including the port in the irritants it allows the exception handler
> to use 'port-line' and 'port-column' to show where in the JSON
> document the error occurred (for ports with such information like file
> ports).

OK!

>> > +  (define (consume-whitespace)
>> > +    (case (peek-char port)
>> > +      ((#\space #\tab #\return #\newline)
>>
>> Should a match + ((? char-whitespace?)) predicate pattern be used here
>> instead, or similar?  Or perhaps the above is faster and more
>> self-contained, which can be a good thing.
>
> ECMA-404 states that these 4 characters are the only acceptable
> whitespace characters:
>
> "Whitespace is any sequence of one or more of the following code
> points: character tabulation (U+0009), line feed (U+000A), carriage
> return (U+000D), and space (U+0020)."

Makes sense.

[...]

>> > +  (define (read-fraction)
>> > +    (case (peek-char port)
>> > +      ((#\.)
>> > +       (read-char port)
>> > +       (let lp ((mag 10))
>> > +         (let ((n (read-digit-maybe)))
>> > +           (if n (+ (/ n mag) (lp (* mag 10))) 0))))
>> > +      (else 0)))
>>
>> Should the above be named 'read-decimal' ?  Does a decimal number in
>> JSON always start with '.' and not with 0. ?  I was a bit puzzled on
>> what 'mag' may mean here, I guess 'magnitude' although there doesn't
>> appear to have a clear terminology for it.
>
> It's called 'read-fraction' because it reads the fractional part of
> the number.  I considered 'read-decimal' but I think it's less
> descriptive. This procedure is called from two places, after the
> respective caller has already read the digits before the decimal
> point. 'mag' does indeed mean 'magnitude', as each digit read
> increases the order of magnitude of the resulting number. I borrowed
> this name from Hoot's 'read' implementation, which means it's probably
> in Guile's pure-Scheme 'read' implementation as well.

OK, thanks for explaining.  I realized later in my review that these are
very dependent on when they are called (they are very specific or
narrow, more than their name might suggest).  It's fine.

[...]

> The following clause checks for '.'. We need to distinguish between 3
> types of input here:
>
> - invalid extraneous zeroes like '09' (ECMA-404 says this is not allowed)
> - fractional notation like '0.123'
> - plain '0'
>
> If I leave out this clause, we can no longer distinguish errors from
> plain zeroes.  There's likely other ways to express it but this feels
> straightforward enough.
>
> However, it was a great idea to take another look at this code because
> there is a bug! '0e3' is a valid JSON number that this code rejects.
> It should parse to 0.  My updated patch fixes this and adds a test
> case.

Good catch!

[...]

>> > +  (define (write-object obj)
>> > +    (put-char port #\{)
>> > +    (match obj
>> > +      ((head . rest)
>> > +       (write-pair head)
>> > +       (let lp ((obj rest))
>> > +         (match obj
>> > +           (() (values))
>>
>> Any reason to return (values) instead of some dummy #t to denote 'no-op'
>> ?.
>
> The loop is evaluated for effect and thus there is nothing to return
> so returning 0 values makes sense. This is something I picked up from
> Andy Wingo.

Interesting.

>> > +           ((head . rest)
>> > +            (put-char port #\,)
>> > +            (write-pair head)
>> > +            (lp rest))
>> > +           (_ (fail "invalid object" obj))))))
>> > +    (put-char port #\}))
>> > +  (define (write-array v)
>> > +    (put-char port #\[)
>> > +    (match (vector-length v)
>> > +      (0 (values))
>> > +      (n
>> > +       (write-value (vector-ref v 0))
>> > +       (do ((i 1 (1+ i)))
>> > +           ((= i n))
>> > +         (put-char port #\,)
>> > +         (write-value (vector-ref v i)))))
>>
>> I suppose the above is more efficient than a for-each loop?  I'd be
>> curious to see it profiled, if you still have data.  At least now I see
>> than for > 100k, vector-ref is faster than list-ref, which probably
>> explains why you went with vectors (could still be an implementation
>> detail with the list->vector call left in the writer though, in my
>> opinion).
>
> Yes, it is more efficient because it avoids closure allocation. Also,
> I just don't see any reason to import (rnrs base) or (srfi srfi-43) to
> get vector-for-each when looping over a vector is trivial. I didn't
> profile anything.

Yeah, I was thinking in terms of plain lists, didn't know we didn't
iterators for vectors in the core (I've seldom used vectors).  Sounds
reasonable.

>> > +    (put-char port #\]))
>> > +  (define (write-number x)
>> > +    (if (or (exact-integer? x)
>> > +            (and (real? x)
>> > +                 (inexact? x)
>> > +                 ;; NaNs and infinities are not allowed.
>> > +                 (not (or (nan? x) (inf? x)))))
>> > +        ;; Scheme's string representations of exact integers and floats
>> > +        ;; are compatible with JSON.
>> > +        (put-string port (number->string x))
>> > +        (fail "invalid number" x)))
>> > +  (define (write-value x)
>> > +    (match x
>> > +      (#t (put-string port "true"))
>> > +      (#f (put-string port "false"))
>> > +      ('null (put-string port "null"))
>> > +      (() (put-string port "{}"))
>> > +      ((? pair?) (write-object x))
>> > +      ((? vector?) (write-array x))
>> > +      ((? string?) (write-string x))
>> > +      ((? number?) (write-number x))
>> > +      (_ (fail "invalid value" x))))
>> > +  (write-value exp))
>>
>> Phew.  That's a pretty low-level parser!  I hope it's fast, otherwise it
>> seems it'd be more concise/fun/maintainable to devise a PEG-based one,
>> which appears to be doable for JSON, from what I've read.  Perhaps
>> sprinkle with a few performance-related comments where such concerns
>> impacted the design choices, so that we can remember and retest/reverify
>> these in the future when Guile evolves.
>
> JSON is a pretty simple format and thus I think a hand-rolled parser
> is appropriate.  It's much simpler than 'read', anyway.  I suppose it
> would be more concise, but "PEG parser" and "fun" do not go together
> for me.  At ~300 lines of quite simple code (I did not go hog wild on
> macros or fancy abstractions nor did I sacrifice readable code for
> performance) I don't think there is much concern regarding
> maintenance.  If someone wants to experiment to see how a PEG parser
> compares, though, feel free.

Someone could always make that experiment in the future and suggest a
replacement, if the performance is as good or better and the code
simpler.  LGTM.

Reviewed-by: Maxim Cournoyer <maxim.cournoyer <at> gmail>

-- 
Thanks,
Maxim




Information forwarded to bug-guile <at> gnu.org:
bug#77762; Package guile. (Fri, 02 May 2025 12:05:02 GMT) Full text and rfc822 format available.

Message #32 received at 77762 <at> debbugs.gnu.org (full text, mbox):

From: Tomas Volf <~@wolfsden.cz>
To: "Thompson, David" <dthompson2 <at> worcester.edu>
Cc: 77762 <at> debbugs.gnu.org
Subject: Re: bug#77762: [PATCH] web: Add JSON module.
Date: Fri, 02 May 2025 14:03:56 +0200
"Thompson, David" <dthompson2 <at> worcester.edu> writes:

> On Sat, Apr 12, 2025 at 12:01 PM Tomas Volf <~@wolfsden.cz> wrote:
>>
>> > - null is the symbol 'null
>>
>> Out of curiosity, what are your thoughts about using #nil instead?
>
> My original patch from 10 years ago did this. Mark Weaver then
> explained to me that #nil was added specifically for the purpose of
> supporting Emacs Lisp on the Guile VM and shouldn't be used in Scheme
> code. The symbol 'null works well because there is no symbol type in
> JSON so there's no ambiguity.

Thanks for the explanation.  I asked not due to perceived ambiguity, but
due to how #nil interacts with conditions.  It would be handy to be able
to write

    (when (assq-ref data 'foo)
      ...)

instead of

    (when (not (eq? 'null (or (assq-ref data 'foo) 'null)))
      ...)

Tomas

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.




This bug report was last modified 41 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.