GNU bug report logs - #74994
Improve Emacs iCalendar support

Package: emacs;

Reported by: Richard Lawrence <rwl <at> recursewithless.net>

Date: Fri, 20 Dec 2024 13:08:02 UTC

Severity: wishlist

From: Richard Lawrence <rwl <at> recursewithless.net>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 74994 <at> debbugs.gnu.org
Subject: bug#74994: Improve Emacs iCalendar support
Date: Wed, 22 Jan 2025 08:43:38 +0100

Hi Stefan,

thanks for your feedback!

Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>>>   - I've used macros (see icalendar-macs.el) to create a small "DSL" for
>>>     defining iCalendar types. These macros store parsing-related information for
>>>     each type as properties of the symbols which name them. There's a lot of
>>>     dynamic dispatch in the parser based on these type symbols' properties.
>>>     This adds some complexity but (I hope) makes the parser more "atomic"/
>>>     extensible. Does this seem like a reasonable approach in general?
>
> It sounds like a reasonable design, yes.
>
> In `bindat.el` I used a similar approach except that each construct (I
> guess in your case, that means each "type") is stored as a method (in
> a generic function) instead of a property of a symbol.  I'm not sure
> it's the perfect solution, but it's nice that `C-h o` on the generic
> function can then provide a documentation of each of the constructs.

This would mean relying more heavily on cl-lib, correct? Generic
functions and methods are part of cl-lib's CLOS implementation?

C-h o already works with my code (see the describe-symbol backend at the
end of icalendar-parser.el), but maybe the generic functions approach is
cleaner. I'll think about it.
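
For concreteness, here is a minimal sketch of the two dispatch styles
being compared. All names starting with `my-ical-` are hypothetical and
are not taken from icalendar-parser.el or bindat.el. The quoted
`(eql 'SYMBOL)` specializer form assumes Emacs 28 or later.

```elisp
(require 'cl-generic)

;; Style 1: the reader is stored as a property of the type symbol
;; and looked up with `get' at run time.
(put 'my-ical-integer 'my-ical-reader #'string-to-number)

(defun my-ical-read-by-property (type string)
  "Read STRING using the reader stored on the symbol TYPE."
  (funcall (get type 'my-ical-reader) string))

;; Style 2: one generic function, with one method per type symbol.
(cl-defgeneric my-ical-read (type string)
  "Convert STRING to an Elisp value according to TYPE.")

(cl-defmethod my-ical-read ((_type (eql 'my-ical-integer)) string)
  "Read STRING as an integer."
  (string-to-number string))
```

In the second style, the docstring of each method is attached to the
generic function itself, which is the documentation benefit described
in the quoted text above.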

> Another option we use elsewhere is to use function names constructed from
> a constant prefix plus the name of the construct, so instead of
>
>     (funcall (get 'foo 'bar) ...)
>
> you might be able to macroexpand to something like
>
>     (,(intern (format "bar %s" 'foo)) ...)
>
> so you get (for free) compile-time warnings when using a construct that
> doesn't exist, and you avoid a `get` at runtime (IIRC, we use that
> approach in `peg.el`).

I hadn't thought of that. Would this prevent users of the library from
defining new types after the library is compiled, though? The iCalendar
standard allows extensions in "X-" properties and components; I don't
want to do anything that would make it difficult e.g. for Org to use
these to encode its own data structures.
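
To make the question concrete, here is a rough sketch of the
prefix-plus-name scheme. The `my-ical-` names are hypothetical and are
not how peg.el or icalendar-parser.el are actually written. The macro
fixes only the *name* of the reader at expansion time; the function is
still looked up through the symbol when the call runs. The second
function covers type names that are only known at run time, such as an
"X-" type defined by another package.

```elisp
(defmacro my-ical-read-as (type string)
  "Expand to a direct call of the reader function for TYPE.
TYPE is an unquoted symbol known at macro-expansion time."
  `(,(intern (format "my-ical-read-%s" type)) ,string))

(defun my-ical-read-integer (string)
  "Read STRING as an integer."
  (string-to-number string))

(defun my-ical-demo ()
  "Call the integer reader via the macro."
  ;; Expands to (my-ical-read-integer "42").
  (my-ical-read-as integer "42"))

(defun my-ical-read-dynamic (type string)
  "Read STRING as TYPE, where TYPE is only known at run time."
  (let ((fn (intern-soft (format "my-ical-read-%s" type))))
    (if (fboundp fn)
        (funcall fn string)
      (error "No reader defined for type %s" type))))
```

Under this scheme, a package could add a type later simply by defining
a function with the right name. Whether a run-time fallback like
`my-ical-read-dynamic` is acceptable for the parser is a separate
design question.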

>>>   - I ran into one issue that feels like a design flaw: the parser separates
>>>     "reading" (converting a string to an Elisp value) into a function
>>>     distinct from the parsing function which matches that string (see e.g.
>>>     ical:parse-property-value in icalendar-parser.el, which calls
>>>     ical:read-property-value). In simple cases this nicely factors out a pure
>>>     function from one which depends on a lot of global buffer state;
>>>     but in more complicated cases the "pure" reader function depends on
>>>     the match data and so isn't pure at all (see e.g. ical:read-dur-value).
>>>     Is there a better way to do this? (Not make the distinction? Pass
>>>     the match data explicitly? ...?)
>
> Is the separation useful to users (including internal users) of the
> parser? This kind of problem doesn't directly ring a bell, so I don't
> have a good suggestion to make.

It's certainly useful when debugging. Calling a pure function with M-:
or e in the debugger to make sure it's doing what I expect is generally
a lot easier than getting a whole buffer into the right parsing state.
If I can declare them pure, it might also have some performance
benefits.
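
One of the options mentioned in the quoted question, sketched under
heavy simplification: the reader takes only a string and does its own
matching, so it does not depend on match data left by the parser. The
function below is hypothetical, is not ical:read-dur-value, and accepts
far more than the duration grammar in the standard allows. The cost of
this approach is that the string is matched twice, once by the parser
and once by the reader.

```elisp
(defun my-ical-read-dur (string)
  "Return the number of seconds denoted by the duration STRING.
This is a lenient sketch: it sums every NUMBER-UNIT pair it finds
and does not validate the overall shape of STRING."
  (save-match-data                      ; leave the caller's match data alone
    (let ((case-fold-search nil)        ; unit letters must be upper case
          (sign (if (string-prefix-p "-" string) -1 1))
          (units '((?W . 604800) (?D . 86400) (?H . 3600) (?M . 60) (?S . 1)))
          (start 0)
          (total 0))
      (while (string-match "\\([0-9]+\\)\\([WDHMS]\\)" string start)
        (setq total (+ total
                       (* (string-to-number (match-string 1 string))
                          (alist-get (aref (match-string 2 string) 0) units)))
              start (match-end 0)))
      (* sign total))))
```

A function of this shape can be tried directly with M-:, for example
(my-ical-read-dur "P1DT2H30M"), without setting up any buffer state.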

>>>   - whether there's a better solution to the problem of needing to unfold
>>>     lines *before* a buffer containing iCalendar data is decoded
>>>     (is there anything like a hook that runs before decoding?)
>
> [ Sorry, I don't understand this question.  ]

The standard says that long lines need to be "folded" (wrapped) by
inserting a CR-LF-space sequence. It defines long lines as those longer
than 75 *bytes*, and explicitly says that implementations need to handle
the case where the line-wrapping sequence occurs in the middle of a
multi-byte character. So lines can only be unwrapped safely before the
buffer gets decoded.
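
A small sketch of why the order matters, assuming UTF-8 data and a
hypothetical function name. It works on a unibyte string of raw bytes
(as obtained, for example, via `insert-file-contents-literally`),
removes each CR-LF followed by a space or tab, and decodes only
afterwards:

```elisp
(defun my-ical-unfold-and-decode (bytes)
  "Unfold the raw unibyte string BYTES, then decode it as UTF-8.
Unfolding removes every CR-LF that is followed by a space or tab,
together with that whitespace character.  Remaining line endings
are left untouched."
  (decode-coding-string
   (replace-regexp-in-string "\r\n[ \t]" "" bytes t t)
   'utf-8-unix))

;; "é" is the two bytes \303 \251 in UTF-8; here the fold falls
;; between them.
(my-ical-unfold-and-decode "SUMMARY:caf\303\r\n \251\r\nEND:VEVENT")
;; => "SUMMARY:café\r\nEND:VEVENT"
```

If the same bytes were decoded first, the two halves of the character
would each be seen as invalid UTF-8, and unfolding afterwards would not
restore the "é".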

So far the best user interface I could come up with was to check for
long lines when icalendar-mode starts and ask the user whether they want
to unwrap them. If they do, it re-loads the raw data into a new buffer,
unwraps the lines, decodes the buffer, and then re-starts icalendar-mode
in the new buffer. But I find this pretty awkward in practice, because
you end up with two buffers containing the same data (modulo whitespace)
and visiting the same file, and I'm not sure how to improve this.

Thanks again for your thoughts!

Best,
Richard




This bug report was last modified 99 days ago.
