GNU bug report logs - #26059
utf16->string and utf32->string don't conform to R6RS
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 26059 in the body.
You can then email your comments to 26059 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-guile <at> gnu.org: bug#26059; Package guile. (Sat, 11 Mar 2017 16:21:02 GMT)
Acknowledgement sent to taylanbayirli <at> gmail.com ("Taylan Ulrich Bayırlı/Kammer"): New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Sat, 11 Mar 2017 16:21:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
See page 10 of the R6RS Libraries document. The differences:
- R6RS supports reading a BOM.
- R6RS mandates an endianness argument to specify the behavior in the
absence of a BOM.
- R6RS allows an optional third argument 'endianness-mandatory' to
explicitly ignore any BOM that may be present.
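
The three points above amount to a rule for choosing the endianness to decode with. The following is only a minimal sketch of that rule for UTF-16, not the patch itself: the helper names bom16-endianness and effective-endianness16 are hypothetical, and returning #f for a bytevector shorter than two bytes is an assumption about the desired behavior.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper: return 'big or 'little if BV starts with a
;; UTF-16 BOM, and #f if there is no BOM or BV is shorter than 2 bytes.
(define (bom16-endianness bv)
  (and (>= (bytevector-length bv) 2)
       (let ((b0 (bytevector-u8-ref bv 0))
             (b1 (bytevector-u8-ref bv 1)))
         (cond ((and (= b0 #xFE) (= b1 #xFF)) 'big)
               ((and (= b0 #xFF) (= b1 #xFE)) 'little)
               (else #f)))))

;; Hypothetical helper: the endianness to decode with.
;; If MANDATORY? is true, the ENDIANNESS argument always wins;
;; otherwise a BOM wins, and ENDIANNESS is only the fallback.
(define (effective-endianness16 bv endianness mandatory?)
  (if mandatory?
      endianness
      (or (bom16-endianness bv) endianness)))
```

For example, (effective-endianness16 (u8-list->bytevector '(#xFF #xFE #x68 #x00)) 'big #f) evaluates to little because the BOM overrides the argument, while passing #t as the last argument yields big.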
Here's a quick patch on top of master to implement the R6RS procedures
in terms of the Guile procedures and export them with a rename from
(rnrs bytevectors).
===File /home/taylan/src/guile/guile-master/0001-Fix-R6RS-utf16-string-and-utf32-string.patch===
From f51cd1d4884caafb1ed0072cd77c0e3145f34576 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Taylan=20Ulrich=20Bay=C4=B1rl=C4=B1/Kammer?=
<taylanbayirli <at> gmail.com>
Date: Fri, 10 Mar 2017 22:36:55 +0100
Subject: [PATCH] Fix R6RS utf16->string and utf32->string.
* module/rnrs/bytevectors.scm (read-bom16, read-bom32): New procedures.
(r6rs-utf16->string, r6rs-utf32->string): Ditto.
---
module/rnrs/bytevectors.scm | 52 ++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 51 insertions(+), 1 deletion(-)
diff --git a/module/rnrs/bytevectors.scm b/module/rnrs/bytevectors.scm
index 9744359f0..997a8c9cb 100644
--- a/module/rnrs/bytevectors.scm
+++ b/module/rnrs/bytevectors.scm
@@ -69,7 +69,9 @@
bytevector-ieee-double-native-set!
string->utf8 string->utf16 string->utf32
- utf8->string utf16->string utf32->string))
+ utf8->string
+ (r6rs-utf16->string . utf16->string)
+ (r6rs-utf32->string . utf32->string)))
(load-extension (string-append "libguile-" (effective-version))
@@ -80,4 +82,52 @@
`(quote ,sym)
(error "unsupported endianness" sym)))
+(define (read-bom16 bv)
+  (let ((c0 (bytevector-u8-ref bv 0))
+        (c1 (bytevector-u8-ref bv 1)))
+    (cond
+     ((and (= c0 #xFE) (= c1 #xFF))
+      'big)
+     ((and (= c0 #xFF) (= c1 #xFE))
+      'little)
+     (else
+      #f))))
+
+(define r6rs-utf16->string
+  (case-lambda
+    ((bv default-endianness)
+     (let ((bom-endianness (read-bom16 bv)))
+       (if (not bom-endianness)
+           (utf16->string bv default-endianness)
+           (substring/shared (utf16->string bv bom-endianness) 1))))
+    ((bv endianness endianness-mandatory?)
+     (if endianness-mandatory?
+         (utf16->string bv endianness)
+         (r6rs-utf16->string bv endianness)))))
+
+(define (read-bom32 bv)
+  (let ((c0 (bytevector-u8-ref bv 0))
+        (c1 (bytevector-u8-ref bv 1))
+        (c2 (bytevector-u8-ref bv 2))
+        (c3 (bytevector-u8-ref bv 3)))
+    (cond
+     ((and (= c0 #x00) (= c1 #x00) (= c2 #xFE) (= c3 #xFF))
+      'big)
+     ((and (= c0 #xFF) (= c1 #xFE) (= c2 #x00) (= c3 #x00))
+      'little)
+     (else
+      #f))))
+
+(define r6rs-utf32->string
+  (case-lambda
+    ((bv default-endianness)
+     (let ((bom-endianness (read-bom32 bv)))
+       (if (not bom-endianness)
+           (utf32->string bv default-endianness)
+           (substring/shared (utf32->string bv bom-endianness) 1))))
+    ((bv endianness endianness-mandatory?)
+     (if endianness-mandatory?
+         (utf32->string bv endianness)
+         (r6rs-utf32->string bv endianness)))))
+
;;; bytevector.scm ends here
--
2.11.0
============================================================
bug closed, send any further explanations to 26059 <at> debbugs.gnu.org and taylanbayirli <at> gmail.com ("Taylan Ulrich Bayırlı/Kammer"). Request was from taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) to control <at> debbugs.gnu.org. (Sat, 11 Mar 2017 18:20:01 GMT)
Information forwarded to bug-guile <at> gnu.org: bug#26059; Package guile. (Sat, 11 Mar 2017 18:25:01 GMT)
Message #10 received at 26059 <at> debbugs.gnu.org (full text, mbox):
Please ignore this, as it's a duplicate of #26058.
Information forwarded to bug-guile <at> gnu.org: bug#26059; Package guile. (Mon, 13 Mar 2017 12:59:02 GMT)
Message #13 received at 26059-close <at> debbugs.gnu.org (full text, mbox):
On Sat 11 Mar 2017 19:30, taylanbayirli <at> gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:
> Please ignore this, as it's a duplicate of #26058.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 11 Apr 2017 11:24:05 GMT)
This bug report was last modified 8 years and 69 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.