GNU bug report logs - #54705
[PATCH] reader: Add Gemtext reader.

Previous Next

Package: skribilo;

Reported by: Arun Isaac <arunisaac <at> systemreboot.net>

Date: Mon, 4 Apr 2022 09:29:01 UTC

Severity: normal

Tags: patch

Done: Arun Isaac <arunisaac <at> systemreboot.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 54705 in the body.
You can then email your comments to 54705 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to skribilo-bugs <at> nongnu.org:
bug#54705; Package skribilo. (Mon, 04 Apr 2022 09:29:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Arun Isaac <arunisaac <at> systemreboot.net>:
New bug report received and forwarded. Copy sent to skribilo-bugs <at> nongnu.org. (Mon, 04 Apr 2022 09:29:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Arun Isaac <arunisaac <at> systemreboot.net>
To: skribilo-bugs <at> nongnu.org
Cc: Arun Isaac <arunisaac <at> systemreboot.net>
Subject: [PATCH] reader: Add Gemtext reader.
Date: Mon,  4 Apr 2022 14:57:42 +0530
* src/guile/skribilo/reader/gemtext.scm: New file.
* src/guile/Makefile.am (readers): Register it.
* doc/user/syntax.skb (The Gemtext Syntax): New section.
* tests/readers/gemtext.test: New file.
* tests/Makefile.am (TESTS): Add readers/gemtext.test.
---
 doc/user/syntax.skb                   |  21 ++-
 src/guile/Makefile.am                 |   3 +-
 src/guile/skribilo/reader/gemtext.scm | 231 ++++++++++++++++++++++++++
 tests/Makefile.am                     |   3 +-
 tests/readers/gemtext.test            | 133 +++++++++++++++
 5 files changed, 388 insertions(+), 3 deletions(-)
 create mode 100644 src/guile/skribilo/reader/gemtext.scm
 create mode 100644 tests/readers/gemtext.test

diff --git a/doc/user/syntax.skb b/doc/user/syntax.skb
index 9a4070c..2de7cbd 100644
--- a/doc/user/syntax.skb
+++ b/doc/user/syntax.skb
@@ -211,7 +211,26 @@ documents that can be output in variety of formats (see ,(numref :text
 [Chapter] :ident "engines")).  The downside is that, being a very simple
 markup-less document format, there are many things that cannot be done
 using it, most notably tables, bibliographies, and cross-references.]))
-   
+
+   (section :title [The Gemtext Syntax] :ident "gemtext-syntax"
+     (p [,(ref
+:url "https://gemini.circumlunar.space/docs/gemtext.gmi"
+:text "Gemtext"), the lightweight markup language used by the ,(ref
+:url "https://gemini.circumlunar.space" :text "Gemini protocol"), is
+supported as an input syntax. To use it, just pass ,(tt
+[--reader=gemtext]) to the compiler. When used programmatically, the
+Gemtext reader can be customized using the following options.])
+
+     (doc-markup 'make-gemtext-reader
+                 '((:join-lines? [If ,(code "#t"), lines which are not
+separated by a blank line are joined into a single paragraph. This is
+a relaxation of the Gemtext standard, and is not done by default.])
+                   (:section-numbers? [If ,(code "#t"), sections are
+numbered. Else, they are not.]))
+                 :common-args '()
+                 :source "skribilo/reader/gemtext.scm"
+	         :idx *function-index*))
+
    (section :title [The RSS 2.0 Syntax]
             :ident "rss2-syntax"
       
diff --git a/src/guile/Makefile.am b/src/guile/Makefile.am
index 98f2873..0a66a88 100644
--- a/src/guile/Makefile.am
+++ b/src/guile/Makefile.am
@@ -47,7 +47,8 @@ SOURCES =								\
 SOURCES += $(readers) $(packages) $(engines)
 
 readers =							\
-  skribilo/reader/skribe.scm skribilo/reader/outline.scm
+  skribilo/reader/skribe.scm skribilo/reader/outline.scm	\
+  skribilo/reader/gemtext.scm
 
 if BUILD_RSS2_READER
 
diff --git a/src/guile/skribilo/reader/gemtext.scm b/src/guile/skribilo/reader/gemtext.scm
new file mode 100644
index 0000000..4ae403c
--- /dev/null
+++ b/src/guile/skribilo/reader/gemtext.scm
@@ -0,0 +1,231 @@
+;;; gemtext.scm  --  A reader for the Gemini protocol's Gemtext markup
+;;;
+;;; Copyright © 2022 Arun Isaac <arunisaac <at> systemreboot.net>
+;;;
+;;;
+;;; This file is part of Skribilo.
+;;;
+;;; Skribilo is free software: you can redistribute it and/or modify
+;;; it under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation, either version 3 of the License, or
+;;; (at your option) any later version.
+;;;
+;;; Skribilo is distributed in the hope that it will be useful,
+;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with Skribilo.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (skribilo reader gemtext)
+  #:use-module (rnrs io ports)
+  #:use-module (srfi srfi-26)
+  #:use-module (srfi srfi-71)
+  #:use-module (srfi srfi-171)
+  #:use-module (ice-9 match)
+  #:use-module ((ice-9 textual-ports) #:select (unget-char unget-string))
+  #:use-module (skribilo reader)
+  #:use-module (skribilo utils syntax)
+  #:export (reader-specification
+            make-gemtext-reader))
+
+(skribilo-module-syntax)
+
+;;; Author: Arun Isaac
+;;;
+;;; Commentary:
+;;;
+;;; A reader for gemtext, the lightweight markup language used by the
+;;; Gemini protocol
+;;;
+;;; Code:
+
+(define %join-lines?
+  (make-parameter #f))
+
+(define %section-numbers?
+  (make-parameter #f))
+
+(define (string-blank? str)
+  "Return #t if STR contains only whitespace characters.  Else, return
+#f."
+  (string-every char-set:whitespace str))
+
+(define (string-remove-prefix prefix str)
+  "Return STR with PREFIX removed.  If PREFIX is not a prefix of STR,
+return #f."
+  (and (string-prefix? prefix str)
+       (substring str (string-length prefix))))
+
+(define (string-partition str char-pred)
+  "Return the part of STR before and after the first occurrence of
+CHAR-PRED as two values."
+  (let ((partition-index (string-index str char-pred)))
+    (if partition-index
+        (values (substring str 0 partition-index)
+                (substring str partition-index))
+        (values str #f))))
+
+(define (unget-line port line)
+  "Place the string LINE in PORT so that subsequent read operations
+will read LINE followed by a newline character."
+  (unget-char port #\newline)
+  (unget-string port line))
+
+(define (read-preformatted-text in out)
+  "Read preformatted text from port IN and write it to port OUT."
+  (let ((line (get-line in)))
+    (unless (or (eof-object? line)
+                (string-prefix? "```" line))
+      (put-string out line)
+      (newline out)
+      (read-preformatted-text in out))))
+
+(define (heading-level line)
+  "Return the level of the heading in LINE. If LINE is not a heading,
+return #f."
+  (cond
+   ((string-prefix? "### " line) 3)
+   ((string-prefix? "## " line) 2)
+   ((string-prefix? "# " line) 1)
+   (else #f)))
+
+(define (read-section-children level port)
+  "Read section elements of LEVEL from PORT. Return as a list."
+  (let ((line (get-line port)))
+    (cond
+     ;; End of file
+     ((eof-object? line) (list))
+     ;; If another heading of same or higher level begins, unget line
+     ;; and end section.
+     ((let ((heading-level (heading-level line)))
+        (and heading-level
+             (<= heading-level level)))
+      (unget-line port line)
+      (list))
+     ;; If blank line, continue.
+     ((string-blank? line)
+      (read-section-children level port))
+     ;; Else, add element and continue.
+     (else
+      (unget-line port line)
+      (cons (read-gemtext-element port)
+            (read-section-children level port))))))
+
+(define (paragraph-line? line)
+  "Return #t if LINE is a paragraph line. Else, return #f."
+  (not (or (string-blank? line)
+           (heading-level line)
+           (string-prefix? "* " line)
+           (string-prefix? ">" line)
+           (string-prefix? "=>" line)
+           (string-prefix? "```" line))))
+
+(define (link-line->ref line)
+  "Convert link LINE to a skribilo ref expression."
+  (let* ((trimmed-line (string-trim (string-remove-prefix "=>" line)))
+         (url text (string-partition trimmed-line (char-set #\space #\tab))))
+    (if text
+        `(ref #:url ,url #:text ,(string-trim text))
+        `(ref #:url ,url))))
+
+(define (retf-unget-line port result line)
+  "Unget LINE to PORT and return RESULT. This function is used as an
+argument to ttake-while."
+  (unget-line port line)
+  result)
+
+(define (read-gemtext-element port)
+  "Read next gemtext element from PORT."
+  (let ((line (get-line port)))
+    (cond
+     ;; End of file
+     ((eof-object? line) line)
+     ;; Section
+     ((heading-level line)
+      => (lambda (level)
+           `(,(case level
+                ((1) 'section)
+                ((2) 'subsection)
+                ((3) 'subsubsection))
+             #:title ,(substring line (1+ level))
+             #:number ,(%section-numbers?)
+             ,@(read-section-children level port))))
+     ;; List
+     ((string-remove-prefix "* " line)
+      => (lambda (first-item)
+           `(itemize
+             ,@(port-transduce (compose (ttake-while (cut string-prefix? "* " <>)
+                                                     (cut retf-unget-line port <> <>))
+                                        (tmap (lambda (line)
+                                                `(item ,(string-remove-prefix "* " line)))))
+                               rcons
+                               (list `(item ,first-item))
+                               get-line
+                               port))))
+     ;; Blockquote
+     ((string-remove-prefix ">" line)
+      => (lambda (first-line)
+           (list 'blockquote
+                 (if (%join-lines?)
+                     (string-join
+                      (port-transduce (compose (ttake-while (cut string-prefix? ">" <>)
+                                                            (cut retf-unget-line port <> <>))
+                                               (tmap (cut string-remove-prefix ">" <>)))
+                                      rcons
+                                      (list first-line)
+                                      get-line
+                                      port)
+                      " ")
+                     line))))
+     ;; Link
+     ((string-prefix? "=>" line)
+      (cons 'paragraph
+            (port-transduce (compose (ttake-while (cut string-prefix? "=>" <>)
+                                                  (cut retf-unget-line port <> <>))
+                                     (tmap link-line->ref))
+                            rcons
+                            (list (link-line->ref line))
+                            get-line
+                            port)))
+     ;; Preformatted text
+     ((string-remove-prefix "```" line)
+      => (lambda (alt-text)
+           ;; We don't use the alt text.
+           `(pre ,(call-with-output-string
+                    (cut read-preformatted-text port <>)))))
+     ;; Ignore blank lines.
+     ((string-blank? line) (read-gemtext-element port))
+     ;; Paragraph
+     (else
+      (list 'paragraph
+            (if (%join-lines?)
+                (string-join
+                 (port-transduce (ttake-while paragraph-line?
+                                              (cut retf-unget-line port <> <>))
+                                 rcons
+                                 (list line)
+                                 get-line
+                                 port)
+                 " ")
+                line))))))
+
+(define* (make-gemtext-reader :key join-lines? section-numbers?)
+  "Return a gemtext reader.
+
+If JOIN-LINES? is #t, lines which are not separated by a blank line
+are joined into a single paragraph.
+
+If SECTION-NUMBERS? is #t, sections are numbered. Else, they are not."
+  (lambda (port)
+    (parameterize ((%join-lines? join-lines?)
+                   (%section-numbers? section-numbers?))
+      (match (port-transduce (tmap identity)
+                             rcons
+                             read-gemtext-element
+                             port)
+        (() (eof-object))
+        (elements `(document ,@elements))))))
+
+(define-reader gemtext "0.1" make-gemtext-reader)
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 8ba7637..16478a9 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -13,7 +13,8 @@ TESTS =						\
   ast.test					\
   resolve.test					\
   engines/info.test				\
-  location.test
+  location.test					\
+  readers/gemtext.test
 
 if BUILD_RSS2_READER
 
diff --git a/tests/readers/gemtext.test b/tests/readers/gemtext.test
new file mode 100644
index 0000000..99891c8
--- /dev/null
+++ b/tests/readers/gemtext.test
@@ -0,0 +1,133 @@
+;;; Exercise Gemtext reader.                  -*- Scheme -*-
+;;;
+;;; Copyright © 2022 Arun Isaac <arunisaac <at> systemreboot.net>
+;;;
+;;;
+;;; This file is part of Skribilo.
+;;;
+;;; Skribilo is free software: you can redistribute it and/or modify
+;;; it under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation, either version 3 of the License, or
+;;; (at your option) any later version.
+;;;
+;;; Skribilo is distributed in the hope that it will be useful,
+;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with Skribilo.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (tests gemtext)
+  #:use-module (srfi srfi-64)
+  #:use-module (ice-9 match)
+  #:use-module (skribilo reader))
+
+(define make-gemtext-reader
+  (reader:make (lookup-reader 'gemtext)))
+
+(define-syntax-rule (match? exp pattern)
+  (match exp
+    (pattern #t)
+    (_ #f)))
+
+
+
+(test-begin "gemtext")
+
+(test-assert "basic gemtext document"
+  (match? (call-with-input-string "# Heading
+* Mercury
+* Gemini
+* Apollo
+## Subheading
+
+### Subsubheading
+
+> I contend that text-based websites should not exceed in size the major works of Russian literature.
+
+# Links
+
+=>https://example.com A cool website
+=>gopher://example.com      An even cooler gopherhole
+=> gemini://example.com A supremely cool Gemini capsule
+=>   sftp://example.com
+
+```
+This is a preformatted block.
+```
+
+```alt
+This is a preformatted block with \"alt text\".
+```"
+            (make-gemtext-reader))
+          `(document
+            (section #:title "Heading" #:number #f
+                     (itemize (item "Mercury")
+                              (item "Gemini")
+                              (item "Apollo"))
+                     (subsection #:title "Subheading" #:number #f
+                                 (subsubsection #:title "Subsubheading" #:number #f
+                                                (blockquote "> I contend that text-based websites should not exceed in size the major works of Russian literature."))))
+            (section #:title "Links" #:number #f
+                     (paragraph (ref #:url "https://example.com" #:text "A cool website")
+                                (ref #:url "gopher://example.com" #:text "An even cooler gopherhole")
+                                (ref #:url "gemini://example.com" #:text "A supremely cool Gemini capsule")
+                                (ref #:url "sftp://example.com"))
+                     (pre "This is a preformatted block.\n")
+                     (pre "This is a preformatted block with \"alt text\".\n")))))
+
+(test-assert "do not join short lines into paragraph"
+  (match? (call-with-input-string "Foo
+Bar"
+            (make-gemtext-reader))
+          `(document
+            (paragraph "Foo")
+            (paragraph "Bar"))))
+
+(test-assert "join short lines into paragraphs"
+  (match? (call-with-input-string "Foo
+Bar"
+            (make-gemtext-reader #:join-lines? #t))
+          `(document
+            (paragraph "Foo Bar"))))
+
+(test-assert "do not number sections"
+  (match? (call-with-input-string "# Foo
+## Bar"
+            (make-gemtext-reader))
+          `(document
+            (section #:title "Foo" #:number #f
+                     (subsection #:title "Bar" #:number #f)))))
+
+(test-assert "number sections"
+  (match? (call-with-input-string "# Foo
+## Bar"
+            (make-gemtext-reader #:section-numbers? #t))
+          `(document
+            (section #:title "Foo" #:number #t
+                     (subsection #:title "Bar" #:number #t)))))
+
+(test-assert "break up links separated by blank lines into paragraphs"
+  (match? (call-with-input-string "=>https://example.com A cool website
+=>gopher://example.com      An even cooler gopherhole
+
+=> gemini://example.com A supremely cool Gemini capsule
+=>   sftp://example.com"
+            (make-gemtext-reader))
+          `(document
+            (paragraph (ref #:url "https://example.com" #:text "A cool website")
+                       (ref #:url "gopher://example.com" #:text "An even cooler gopherhole"))
+            (paragraph (ref #:url "gemini://example.com" #:text "A supremely cool Gemini capsule")
+                       (ref #:url "sftp://example.com")))))
+
+(test-assert "ignore blank lines that have a non-zero number of whitespace characters"
+  (match? (call-with-input-string "Foo
+  
+Bar"
+            (make-gemtext-reader))
+          `(document
+            (paragraph "Foo")
+            (paragraph "Bar"))))
+
+(test-end "gemtext")
-- 
2.34.0





Information forwarded to skribilo-bugs <at> nongnu.org:
bug#54705; Package skribilo. (Sun, 10 Apr 2022 20:12:01 GMT) Full text and rfc822 format available.

Message #8 received at 54705 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Arun Isaac <arunisaac <at> systemreboot.net>
Cc: 54705 <at> debbugs.gnu.org
Subject: Re: bug#54705: [PATCH] reader: Add Gemtext reader.
Date: Sun, 10 Apr 2022 22:11:29 +0200
Hey!

Arun Isaac <arunisaac <at> systemreboot.net> skribis:

> * src/guile/skribilo/reader/gemtext.scm: New file.
> * src/guile/Makefile.am (readers): Register it.
> * doc/user/syntax.skb (The Gemtext Syntax): New section.
> * tests/readers/gemtext.test: New file.
> * tests/Makefile.am (TESTS): Add readers/gemtext.test.

This looks great to me!

> +(define-module (skribilo reader gemtext)
> +  #:use-module (rnrs io ports)
> +  #:use-module (srfi srfi-26)
> +  #:use-module (srfi srfi-71)
> +  #:use-module (srfi srfi-171)

I’ve just realized that, currently, Skribilo supports Guile 2.x in
addition to 3.0, but neither 2.2 nor 2.0 has (srfi srfi-171).

Maybe the simplest option is to check for (srfi srfi-171) in
configure.ac and disable compilation of this module if it’s unavailable,
similar to the ‘BUILD_RSS2_READER’ conditional.

WDYT?

Minor cosmetic comments:

> +(skribilo-module-syntax)

I think you can omit this line and use regular #:keywords.

That’s all I have to say, thank you!

Ludo’.




Information forwarded to skribilo-bugs <at> nongnu.org:
bug#54705; Package skribilo. (Tue, 12 Apr 2022 19:42:01 GMT) Full text and rfc822 format available.

Message #11 received at 54705 <at> debbugs.gnu.org (full text, mbox):

From: Arun Isaac <arunisaac <at> systemreboot.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 54705 <at> debbugs.gnu.org
Subject: Re: bug#54705: [PATCH] reader: Add Gemtext reader.
Date: Wed, 13 Apr 2022 01:10:51 +0530
Hi Ludo,

> I’ve just realized that, currently, Skribilo supports Guile 2.x in
> addition to 3.0, but neither 2.2 nor 2.0 has (srfi srfi-171).
>
> Maybe the simplest option is to check for (srfi srfi-171) in
> configure.ac and disable compilation of this module if it’s unavailable,
> similar to the ‘BUILD_RSS2_READER’ conditional.
>
> WDYT?

Sure, done! A new patch follows in my next mail.

But, I have always wondered, why do we not leave all the SRFIs as
external packages outside guile? That way, older versions of guile and
other scheme implementations might use them as well. It's a pity that I
have had to make many of guile packages strictly guile 3+ just because I
use SRFI-171.

>> +(skribilo-module-syntax)
>
> I think you can omit this line and use regular #:keywords.

I'd love to do that, but if I do, doc-markup fails to read the
gemtext.scm source file properly. The backtrace follows. doc-markup will
have to be fixed. May I push this patch and work on that separately
later?

--8<---------------cut here---------------start------------->8---
Backtrace:
In ice-9/eval.scm:
   177:49 19 (lp _)
   177:32 18 (lp (#<procedure 7f782904c9c0 at ice-9/eval.scm:182:7 (env)> #<procedure 7f782904c9a0 at ice-9/eval.scm:182:7 (env)> #<procedure 7f782904c980 at ice-9/eval.scm:182:7 (env)> #<procedure 7f782904c960 at ice-9/eval.scm:182:7 (env)> #<p…> …))
In ice-9/ports.scm:
   433:17 17 (call-with-input-file _ _ #:binary _ #:encoding _ #:guess-encoding _)
    472:4 16 (_ _)
In ice-9/boot-9.scm:
   2836:4 15 (save-module-excursion _)
In skribilo/evaluator.scm:
     61:2 14 (_)
In unknown file:
          13 (eval (chapter #:title "Syntax" #:ident "syntax" (p (quasiquote ("This chapter describes the syntax or Skribilo documents---or\nrather, the available syntaxes Skribilo documents can use.  Skribilo\nactually supports several …" …))) # …) …)
In ice-9/eval.scm:
   214:21 12 (_ _)
   217:50 11 (lp _)
   217:50 10 (lp _)
   217:50  9 (lp (#<procedure 7f7828970540 at ice-9/eval.scm:212:12 (env)> #<procedure 7f7828970500 at ice-9/eval.scm:212:12 (env)> #<procedure 7f78289704c0 at ice-9/eval.scm:212:12 (env)> #<procedure 7f7828970480 at ice-9/eval.scm:212:12 (env)> # …))
   217:50  8 (lp (#<procedure 7f7828970500 at ice-9/eval.scm:212:12 (env)> #<procedure 7f78289704c0 at ice-9/eval.scm:212:12 (env)> #<procedure 7f7828970480 at ice-9/eval.scm:212:12 (env)> #<procedure 7f7828970440 at ice-9/eval.scm:212:12 (env)> #))
   217:33  7 (lp (#<procedure 7f78289704c0 at ice-9/eval.scm:212:12 (env)> #<procedure 7f7828970480 at ice-9/eval.scm:212:12 (env)> #<procedure 7f7828970440 at ice-9/eval.scm:212:12 (env)> #<procedure 7f7828970400 at ice-9/eval.scm:212:12 (env)>))
   214:21  6 (_ #f)
   217:50  5 (lp (#<procedure 7f78289a3880 at ice-9/eval.scm:282:4 (env)> #<procedure 7f78289a3800 at ice-9/eval.scm:649:6 (env)> #<procedure 7f78289a3780 at ice-9/eval.scm:649:6 (env)>))
   217:50  4 (lp (#<procedure 7f78289a3800 at ice-9/eval.scm:649:6 (env)> #<procedure 7f78289a3780 at ice-9/eval.scm:649:6 (env)>))
   217:33  3 (lp (#<procedure 7f78289a3780 at ice-9/eval.scm:649:6 (env)>))
In skribilo/documentation/api.scm:
   782:10  2 (%doc-markup-internal #<<location> 140154063808512 "./syntax.skb":224:6> make-gemtext-reader _ #:ident _ #:writer-id _ #:common-args _ #:ignore-args _ #:force-args _ #:idx _ #:idx-note _ #:idx-suffix _ #:source _ #:def _ #:see-also _ . #)
In skribilo/lib.scm:
    225:6  1 (skribe-error _ _ _)
In ice-9/boot-9.scm:
  1685:16  0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
doc-markup: make-gemtext-reader: missing descriptions (:key join-lines? section-numbers?)
--8<---------------cut here---------------end--------------->8---

Thanks,
Arun




Information forwarded to skribilo-bugs <at> nongnu.org:
bug#54705; Package skribilo. (Tue, 12 Apr 2022 19:43:02 GMT) Full text and rfc822 format available.

Message #14 received at 54705 <at> debbugs.gnu.org (full text, mbox):

From: Arun Isaac <arunisaac <at> systemreboot.net>
To: Arun Isaac <arunisaac <at> systemreboot.net>,
 Ludovic Courtès <ludo <at> gnu.org>
Cc: 54705 <at> debbugs.gnu.org
Subject: [PATCH v2] reader: Add Gemtext reader.
Date: Wed, 13 Apr 2022 01:12:32 +0530
* src/guile/skribilo/reader/gemtext.scm: New file.
* configure.ac: Set BUILD_GEMTEXT_READER automake conditional to true
if (srfi srfi-171) is found. Else, set it to false.
* src/guile/Makefile.am (readers): Add skribilo/reader/gemtext.scm
if BUILD_GEMTEXT_READER is true.
(EXTRA_DIST): Add skribilo/reader/gemtext.scm if BUILD_GEMTEXT_READER
is false.
* doc/user/syntax.skb (The Gemtext Syntax): New section.
* tests/readers/gemtext.test: New file.
* tests/Makefile.am (TESTS): Add readers/gemtext.test if
BUILD_GEMTEXT_READER is true.
(EXTRA_DIST): Add readers/gemtext.text if BUILD_GEMTEXT_READER is
false.
---
 configure.ac                          |   9 +
 doc/user/syntax.skb                   |  21 ++-
 src/guile/Makefile.am                 |  10 ++
 src/guile/skribilo/reader/gemtext.scm | 231 ++++++++++++++++++++++++++
 tests/Makefile.am                     |  10 ++
 tests/readers/gemtext.test            | 133 +++++++++++++++
 6 files changed, 413 insertions(+), 1 deletion(-)
 create mode 100644 src/guile/skribilo/reader/gemtext.scm
 create mode 100644 tests/readers/gemtext.test

diff --git a/configure.ac b/configure.ac
index 04c7eac..5ad964a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -66,6 +66,15 @@ fi
 AM_CONDITIONAL([BUILD_RSS2_READER],
   [test "x$have_sxml_simple$have_htmlprag" == "xyesyes"])
 
+# Check for SRFI-171, needed for the `gemtext' reader.
+GUILE_MODULE_AVAILABLE([have_srfi_171], [(srfi srfi-171)])
+if test "x$have_srfi_171" != "xyes"; then
+  AC_MSG_WARN([SRFI-171 needed by the `gemtext' reader is missing.])
+fi
+
+AM_CONDITIONAL([BUILD_GEMTEXT_READER],
+  [test "x$have_srfi_171" == "xyes"])
+
 # Look for `convert', from ImageMagick.
 AC_PATH_PROG([CONVERT], [convert])
 if test "x$CONVERT" == "x"; then
diff --git a/doc/user/syntax.skb b/doc/user/syntax.skb
index 9a4070c..2de7cbd 100644
--- a/doc/user/syntax.skb
+++ b/doc/user/syntax.skb
@@ -211,7 +211,26 @@ documents that can be output in variety of formats (see ,(numref :text
 [Chapter] :ident "engines")).  The downside is that, being a very simple
 markup-less document format, there are many things that cannot be done
 using it, most notably tables, bibliographies, and cross-references.]))
-   
+
+   (section :title [The Gemtext Syntax] :ident "gemtext-syntax"
+     (p [,(ref
+:url "https://gemini.circumlunar.space/docs/gemtext.gmi"
+:text "Gemtext"), the lightweight markup language used by the ,(ref
+:url "https://gemini.circumlunar.space" :text "Gemini protocol"), is
+supported as an input syntax. To use it, just pass ,(tt
+[--reader=gemtext]) to the compiler. When used programmatically, the
+Gemtext reader can be customized using the following options.])
+
+     (doc-markup 'make-gemtext-reader
+                 '((:join-lines? [If ,(code "#t"), lines which are not
+separated by a blank line are joined into a single paragraph. This is
+a relaxation of the Gemtext standard, and is not done by default.])
+                   (:section-numbers? [If ,(code "#t"), sections are
+numbered. Else, they are not.]))
+                 :common-args '()
+                 :source "skribilo/reader/gemtext.scm"
+	         :idx *function-index*))
+
    (section :title [The RSS 2.0 Syntax]
             :ident "rss2-syntax"
       
diff --git a/src/guile/Makefile.am b/src/guile/Makefile.am
index 98f2873..09bb7da 100644
--- a/src/guile/Makefile.am
+++ b/src/guile/Makefile.am
@@ -1,5 +1,6 @@
 # Copyright 2005, 2006, 2007, 2008, 2009, 2010, 2012,
 #   2015, 2018, 2020 Ludovic Courtès <ludo <at> gnu.org>
+# Copyright 2022 Arun Isaac <arunisaac <at> systemreboot.net>
 #
 # This file is part of Skribilo.
 #
@@ -59,6 +60,15 @@ EXTRA_DIST += skribilo/reader/rss-2.scm
 
 endif !BUILD_RSS2_READER
 
+if BUILD_GEMTEXT_READER
+
+readers += skribilo/reader/gemtext.scm
+
+else !BUILD_GEMTEXT_READER
+
+EXTRA_DIST += skribilo/reader/gemtext.scm
+
+endif !BUILD_GEMTEXT_READER
 
 engines =						\
   skribilo/engine/base.scm skribilo/engine/context.scm	\
diff --git a/src/guile/skribilo/reader/gemtext.scm b/src/guile/skribilo/reader/gemtext.scm
new file mode 100644
index 0000000..7f5905c
--- /dev/null
+++ b/src/guile/skribilo/reader/gemtext.scm
@@ -0,0 +1,231 @@
+;;; gemtext.scm  --  A reader for the Gemini protocol's Gemtext markup
+;;;
+;;; Copyright © 2022 Arun Isaac <arunisaac <at> systemreboot.net>
+;;;
+;;;
+;;; This file is part of Skribilo.
+;;;
+;;; Skribilo is free software: you can redistribute it and/or modify
+;;; it under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation, either version 3 of the License, or
+;;; (at your option) any later version.
+;;;
+;;; Skribilo is distributed in the hope that it will be useful,
+;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with Skribilo.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (skribilo reader gemtext)
+  #:use-module (rnrs io ports)
+  #:use-module (srfi srfi-26)
+  #:use-module (srfi srfi-71)
+  #:use-module (srfi srfi-171)
+  #:use-module (ice-9 match)
+  #:use-module ((ice-9 textual-ports) #:select (unget-char unget-string))
+  #:use-module (skribilo reader)
+  #:use-module (skribilo utils syntax)
+  #:export (reader-specification
+            make-gemtext-reader))
+
+(skribilo-module-syntax)
+
+;;; Author: Arun Isaac
+;;;
+;;; Commentary:
+;;;
+;;; A reader for gemtext, the lightweight markup language used by the
+;;; Gemini protocol
+;;;
+;;; Code:
+
+(define %join-lines?
+  (make-parameter #f))
+
+(define %section-numbers?
+  (make-parameter #f))
+
+(define (string-blank? str)
+  "Return #t if STR contains only whitespace characters.  Else, return
+#f."
+  (string-every char-set:whitespace str))
+
+(define (string-remove-prefix prefix str)
+  "Return STR with PREFIX removed.  If PREFIX is not a prefix of STR,
+return #f."
+  (and (string-prefix? prefix str)
+       (substring str (string-length prefix))))
+
+(define (string-partition str char-pred)
+  "Return the part of STR before and after the first occurrence of
+CHAR-PRED as two values."
+  (let ((partition-index (string-index str char-pred)))
+    (if partition-index
+        (values (substring str 0 partition-index)
+                (substring str partition-index))
+        (values str #f))))
+
+(define (unget-line port line)
+  "Place the string LINE in PORT so that subsequent read operations
+will read LINE followed by a newline character."
+  (unget-char port #\newline)
+  (unget-string port line))
+
+(define (read-preformatted-text in out)
+  "Read preformatted text from port IN and write it to port OUT."
+  (let ((line (get-line in)))
+    (unless (or (eof-object? line)
+                (string-prefix? "```" line))
+      (put-string out line)
+      (newline out)
+      (read-preformatted-text in out))))
+
+(define (heading-level line)
+  "Return the level of the heading in LINE. If LINE is not a heading,
+return #f."
+  (cond
+   ((string-prefix? "### " line) 3)
+   ((string-prefix? "## " line) 2)
+   ((string-prefix? "# " line) 1)
+   (else #f)))
+
+(define (read-section-children level port)
+  "Read section elements of LEVEL from PORT. Return as a list."
+  (let ((line (get-line port)))
+    (cond
+     ;; End of file
+     ((eof-object? line) (list))
+     ;; If another heading of same or higher level begins, unget line
+     ;; and end section.
+     ((let ((heading-level (heading-level line)))
+        (and heading-level
+             (<= heading-level level)))
+      (unget-line port line)
+      (list))
+     ;; If blank line, continue.
+     ((string-blank? line)
+      (read-section-children level port))
+     ;; Else, add element and continue.
+     (else
+      (unget-line port line)
+      (cons (read-gemtext-element port)
+            (read-section-children level port))))))
+
+(define (paragraph-line? line)
+  "Return #t if LINE is a paragraph line. Else, return #f."
+  (not (or (string-blank? line)
+           (heading-level line)
+           (string-prefix? "* " line)
+           (string-prefix? ">" line)
+           (string-prefix? "=>" line)
+           (string-prefix? "```" line))))
+
+(define (link-line->item line)
+  "Convert link LINE to a skribilo ref expression."
+  (let* ((trimmed-line (string-trim (string-remove-prefix "=>" line)))
+         (url text (string-partition trimmed-line (char-set #\space #\tab))))
+    (if text
+        `(item (ref #:url ,url #:text ,(string-trim text)))
+        `(item (ref #:url ,url)))))
+
+(define (retf-unget-line port result line)
+  "Unget LINE to PORT and return RESULT. This function is used as an
+argument to ttake-while."
+  (unget-line port line)
+  result)
+
+(define (read-gemtext-element port)
+  "Read next gemtext element from PORT."
+  (let ((line (get-line port)))
+    (cond
+     ;; End of file
+     ((eof-object? line) line)
+     ;; Section
+     ((heading-level line)
+      => (lambda (level)
+           `(,(case level
+                ((1) 'section)
+                ((2) 'subsection)
+                ((3) 'subsubsection))
+             #:title ,(substring line (1+ level))
+             #:number ,(%section-numbers?)
+             ,@(read-section-children level port))))
+     ;; List
+     ((string-remove-prefix "* " line)
+      => (lambda (first-item)
+           `(itemize
+             ,@(port-transduce (compose (ttake-while (cut string-prefix? "* " <>)
+                                                     (cut retf-unget-line port <> <>))
+                                        (tmap (lambda (line)
+                                                `(item ,(string-remove-prefix "* " line)))))
+                               rcons
+                               (list `(item ,first-item))
+                               get-line
+                               port))))
+     ;; Blockquote
+     ((string-remove-prefix ">" line)
+      => (lambda (first-line)
+           (list 'blockquote
+                 (if (%join-lines?)
+                     (string-join
+                      (port-transduce (compose (ttake-while (cut string-prefix? ">" <>)
+                                                            (cut retf-unget-line port <> <>))
+                                               (tmap (cut string-remove-prefix ">" <>)))
+                                      rcons
+                                      (list first-line)
+                                      get-line
+                                      port)
+                      " ")
+                     line))))
+     ;; Link
+     ((string-prefix? "=>" line)
+      (cons 'itemize
+            (port-transduce (compose (ttake-while (cut string-prefix? "=>" <>)
+                                                  (cut retf-unget-line port <> <>))
+                                     (tmap link-line->item))
+                            rcons
+                            (list (link-line->item line))
+                            get-line
+                            port)))
+     ;; Preformatted text
+     ((string-remove-prefix "```" line)
+      => (lambda (alt-text)
+           ;; We don't use the alt text.
+           `(pre ,(call-with-output-string
+                    (cut read-preformatted-text port <>)))))
+     ;; Ignore blank lines.
+     ((string-blank? line) (read-gemtext-element port))
+     ;; Paragraph
+     (else
+      (list 'paragraph
+            (if (%join-lines?)
+                (string-join
+                 (port-transduce (ttake-while paragraph-line?
+                                              (cut retf-unget-line port <> <>))
+                                 rcons
+                                 (list line)
+                                 get-line
+                                 port)
+                 " ")
+                line))))))
+
+(define* (make-gemtext-reader :key join-lines? section-numbers?)
+  "Return a gemtext reader.
+
+If JOIN-LINES? is #t, lines which are not separated by a blank line
+are joined into a single paragraph.
+
+If SECTION-NUMBERS? is #t, sections are numbered. Else, they are not."
+  (lambda (port)
+    (parameterize ((%join-lines? join-lines?)
+                   (%section-numbers? section-numbers?))
+      (match (port-transduce (tmap identity)
+                             rcons
+                             read-gemtext-element
+                             port)
+        (() (eof-object))
+        (elements `(document ,@elements))))))
+
+(define-reader gemtext "0.1" make-gemtext-reader)
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 8ba7637..26b05ad 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -26,5 +26,15 @@ EXTRA_DIST = $(TESTS) readers/rss-2.test
 
 endif !BUILD_RSS2_READER
 
+if BUILD_GEMTEXT_READER
+
+TESTS += readers/gemtext.test
+EXTRA_DIST = $(TESTS)
+
+else !BUILD_GEMTEXT_READER
+
+EXTRA_DIST = $(TESTS) readers/gemtext.test
+
+endif !BUILD_GEMTEXT_READER
 
 CLEANFILES = ast.log resolve.log rss-2.log location.log info.log
diff --git a/tests/readers/gemtext.test b/tests/readers/gemtext.test
new file mode 100644
index 0000000..2340dc0
--- /dev/null
+++ b/tests/readers/gemtext.test
@@ -0,0 +1,133 @@
+;;; Exercise Gemtext reader.                  -*- Scheme -*-
+;;;
+;;; Copyright © 2022 Arun Isaac <arunisaac <at> systemreboot.net>
+;;;
+;;;
+;;; This file is part of Skribilo.
+;;;
+;;; Skribilo is free software: you can redistribute it and/or modify
+;;; it under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation, either version 3 of the License, or
+;;; (at your option) any later version.
+;;;
+;;; Skribilo is distributed in the hope that it will be useful,
+;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with Skribilo.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (tests gemtext)
+  #:use-module (srfi srfi-64)
+  #:use-module (ice-9 match)
+  #:use-module (skribilo reader))
+
+(define make-gemtext-reader
+  (reader:make (lookup-reader 'gemtext)))
+
+(define-syntax-rule (match? exp pattern)
+  (match exp
+    (pattern #t)
+    (_ #f)))
+
+
+
+(test-begin "gemtext")
+
+(test-assert "basic gemtext document"
+  (match? (call-with-input-string "# Heading
+* Mercury
+* Gemini
+* Apollo
+## Subheading
+
+### Subsubheading
+
+> I contend that text-based websites should not exceed in size the major works of Russian literature.
+
+# Links
+
+=>https://example.com A cool website
+=>gopher://example.com      An even cooler gopherhole
+=> gemini://example.com A supremely cool Gemini capsule
+=>   sftp://example.com
+
+```
+This is a preformatted block.
+```
+
+```alt
+This is a preformatted block with \"alt text\".
+```"
+            (make-gemtext-reader))
+          `(document
+            (section #:title "Heading" #:number #f
+                     (itemize (item "Mercury")
+                              (item "Gemini")
+                              (item "Apollo"))
+                     (subsection #:title "Subheading" #:number #f
+                                 (subsubsection #:title "Subsubheading" #:number #f
+                                                (blockquote "> I contend that text-based websites should not exceed in size the major works of Russian literature."))))
+            (section #:title "Links" #:number #f
+                     (itemize (item (ref #:url "https://example.com" #:text "A cool website"))
+                              (item (ref #:url "gopher://example.com" #:text "An even cooler gopherhole"))
+                              (item (ref #:url "gemini://example.com" #:text "A supremely cool Gemini capsule"))
+                              (item (ref #:url "sftp://example.com")))
+                     (pre "This is a preformatted block.\n")
+                     (pre "This is a preformatted block with \"alt text\".\n")))))
+
+(test-assert "do not join short lines into paragraph"
+  (match? (call-with-input-string "Foo
+Bar"
+            (make-gemtext-reader))
+          `(document
+            (paragraph "Foo")
+            (paragraph "Bar"))))
+
+(test-assert "join short lines into paragraphs"
+  (match? (call-with-input-string "Foo
+Bar"
+            (make-gemtext-reader #:join-lines? #t))
+          `(document
+            (paragraph "Foo Bar"))))
+
+(test-assert "do not number sections"
+  (match? (call-with-input-string "# Foo
+## Bar"
+            (make-gemtext-reader))
+          `(document
+            (section #:title "Foo" #:number #f
+                     (subsection #:title "Bar" #:number #f)))))
+
+(test-assert "number sections"
+  (match? (call-with-input-string "# Foo
+## Bar"
+            (make-gemtext-reader #:section-numbers? #t))
+          `(document
+            (section #:title "Foo" #:number #t
+                     (subsection #:title "Bar" #:number #t)))))
+
+(test-assert "break up links separated by blank lines into separate lists"
+  (match? (call-with-input-string "=>https://example.com A cool website
+=>gopher://example.com      An even cooler gopherhole
+
+=> gemini://example.com A supremely cool Gemini capsule
+=>   sftp://example.com"
+            (make-gemtext-reader))
+          `(document
+            (itemize (item (ref #:url "https://example.com" #:text "A cool website"))
+                     (item (ref #:url "gopher://example.com" #:text "An even cooler gopherhole")))
+            (itemize (item (ref #:url "gemini://example.com" #:text "A supremely cool Gemini capsule"))
+                     (item (ref #:url "sftp://example.com"))))))
+
+(test-assert "ignore blank lines that have a non-zero number of whitespace characters"
+  (match? (call-with-input-string "Foo
+  
+Bar"
+            (make-gemtext-reader))
+          `(document
+            (paragraph "Foo")
+            (paragraph "Bar"))))
+
+(test-end "gemtext")
-- 
2.35.1





Information forwarded to skribilo-bugs <at> nongnu.org:
bug#54705; Package skribilo. (Fri, 15 Apr 2022 16:37:02 GMT) Full text and rfc822 format available.

Message #17 received at 54705 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Arun Isaac <arunisaac <at> systemreboot.net>
Cc: 54705 <at> debbugs.gnu.org
Subject: Re: [PATCH v2] reader: Add Gemtext reader.
Date: Fri, 15 Apr 2022 18:36:39 +0200
Hi!

Arun Isaac <arunisaac <at> systemreboot.net> skribis:

> * src/guile/skribilo/reader/gemtext.scm: New file.
> * configure.ac: Set BUILD_GEMTEXT_READER automake conditional to true
> if (srfi srfi-171) is found. Else, set it to false.
> * src/guile/Makefile.am (readers): Add skribilo/reader/gemtext.scm
> if BUILD_GEMTEXT_READER is true.
> (EXTRA_DIST): Add skribilo/reader/gemtext.scm if BUILD_GEMTEXT_READER
> is false.
> * doc/user/syntax.skb (The Gemtext Syntax): New section.
> * tests/readers/gemtext.test: New file.
> * tests/Makefile.am (TESTS): Add readers/gemtext.test if
> BUILD_GEMTEXT_READER is true.
> (EXTRA_DIST): Add readers/gemtext.text if BUILD_GEMTEXT_READER is
> false.

LGTM, thanks!

Ludo’.




Information forwarded to skribilo-bugs <at> nongnu.org:
bug#54705; Package skribilo. (Fri, 15 Apr 2022 16:38:01 GMT) Full text and rfc822 format available.

Message #20 received at 54705 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Arun Isaac <arunisaac <at> systemreboot.net>
Cc: 54705 <at> debbugs.gnu.org
Subject: Re: bug#54705: [PATCH] reader: Add Gemtext reader.
Date: Fri, 15 Apr 2022 18:37:11 +0200
Arun Isaac <arunisaac <at> systemreboot.net> skribis:

> I'd love to do that, but if I do, doc-markup fails to read the
> gemtext.scm source file properly. The backtrace follows. doc-markup will
> have to be fixed. May I push this patch and work on that separately
> later?

Yeah, let’s fix ‘doc-markup’ later.

Ludo’.




Reply sent to Arun Isaac <arunisaac <at> systemreboot.net>:
You have taken responsibility. (Fri, 15 Apr 2022 16:50:02 GMT) Full text and rfc822 format available.

Notification sent to Arun Isaac <arunisaac <at> systemreboot.net>:
bug acknowledged by developer. (Fri, 15 Apr 2022 16:50:02 GMT) Full text and rfc822 format available.

Message #25 received at 54705-done <at> debbugs.gnu.org (full text, mbox):

From: Arun Isaac <arunisaac <at> systemreboot.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 54705-done <at> debbugs.gnu.org
Subject: Re: [PATCH v2] reader: Add Gemtext reader.
Date: Fri, 15 Apr 2022 22:19:24 +0530
Pushed, thanks!




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 14 May 2022 11:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 33 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.