GNU bug report logs - #78561
[PATCH] Add semantic linefeed support for paragraph filling


Package: emacs

Reported by: Roi Martin <jroi.martin <at> gmail.com>

Date: Fri, 23 May 2025 09:59:02 UTC

Severity: normal

Tags: patch

Done: Eli Zaretskii <eliz <at> gnu.org>



Message #8 received at 78561 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Roi Martin <jroi.martin <at> gmail.com>
Cc: mbork <at> mbork.pl, 78561 <at> debbugs.gnu.org
Subject: Re: bug#78561: [PATCH] Add semantic linefeed support for paragraph filling
Date: Fri, 23 May 2025 14:11:35 +0300
> Cc: Marcin Borkowski <mbork <at> mbork.pl>
> From: Roi Martin <jroi.martin <at> gmail.com>
> Date: Fri, 23 May 2025 11:58:02 +0200
> 
> This patch adds semantic linefeed support for paragraph filling.  The
> functionality has been discussed in the emacs-devel mailing list in the
> following threads:
> 
> - Fill paragraph using semantic linefeeds: https://lists.gnu.org/archive/html/emacs-devel/2025-03/msg00035.html
> - [GNU ELPA] New package: semlf: https://lists.gnu.org/archive/html/emacs-devel/2025-03/msg00702.html
> 
> In the second thread we agreed on sending a patch to core instead of
> adding a new package to GNU ELPA.
> 
> Given that this is a first version, I have not added any reference to
> the manuals.  If you think it makes sense, please let me know and I'll
> modify the patch accordingly.
> 
> What follows is a detailed explanation of the term semantic linefeeds,
> so we have all the information in one single place.
> 
> The term "semantic linefeeds" or "semantic line breaks" refers to a set
> of conventions for using insensitive vertical whitespace to structure
> prose along semantic boundaries.
> 
> The concept was first introduced by Brian Kernighan in "UNIX for
> Beginners" [1] in October 1974.
> 
>   Hints for Preparing Documents
>   
>   Most documents go through several versions (always more than you
>   expected) before they are finally finished.  Accordingly, you should
>   do whatever possible to make the job of changing them easy.
>   
>   First, when you do the purely mechanical operations of typing, type so
>   subsequent editing will be easy.  Start each sentence on a new line.
>   Make lines short, and break lines at natural places, such as after
>   commas and semicolons, rather than randomly.  Since most people change
>   documents by rewriting phrases and adding, deleting and rearranging
>   sentences, these precautions simplify any editing you have to do
>   later.
> 
> Semantic linefeeds are usually used with markup languages that are not
> sensitive to newlines when exported to a different format (e.g. Org,
> Texinfo, Markdown).
> 
> Let's say that we have the following paragraph in an Org document:
> 
>   Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
>   tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
>   veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
>   commodo consequat.
> 
> After filling the paragraph using semantic linefeeds, the result is:
> 
>   Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
>   tempor.
>   Incididunt ut labore et dolore magna aliqua.
>   Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
>   ut aliquip ex ea commodo consequat.
> 
> However, when exported, in both cases the result is:
> 
>   Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
>   tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
>   veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
>   commodo consequat.
> 
> So, what are the benefits?
> 
> One of the greatest benefits is that semantic linefeeds are "diff
> friendly".
> 
> For example,
> 
>   -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
>   -tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
>   -veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
>   -commodo consequat.
>   +Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
>   +eiusmod tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim
>   +ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
>   +aliquip ex ea commodo consequat.
> 
> Versus,
> 
>   -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
>   -tempor.
>   +Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
>   +eiusmod tempor.
>    Incididunt ut labore et dolore magna aliqua.
>    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
>    ut aliquip ex ea commodo consequat.
> 
> Semantic linefeeds make it easier to spot that the word "XXXXX" was
> added in the first line.
> 
> Also, they are convenient during code reviews.  Shorter diffs and
> separating "ideas" with newlines allow reviewers to be more accurate
> when adding comments.
> 
> The site "Semantic Line Breaks" [2] by Mattt and the blog post "Semantic
> Linefeeds" [3] by Brandon Rhodes are both excellent references.
> 
> [1] https://web.archive.org/web/20130108163017if_/http://miffy.tom-yam.or.jp:80/2238/ref/beg.pdf
> [2] https://sembr.org/
> [3] https://rhodesmill.org/brandon/2012/one-sentence-per-line/

Thanks.

> +(defun fill-paragraph-semlf (&optional justify)
> +  "Fill paragraph at or after point using semantic linefeeds.
> +
> +This function ensures that a newline character follows every
> +sentence, as punctuated by a period (.), exclamation mark (!), or
> +question mark (?).

This explanation of what "semantic linefeeds" are is a good starting
point, but it is not enough.  For starters, "ensures" hints but
doesn't say explicitly that if there's no newline there, one is
inserted.  Also, I think a URL to at least one site explaining what
"semantic linefeeds" are should be in the doc string.

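As a rough sketch of the kind of wording meant by the doc string
comments above, something like the following might do.  The function
name is hypothetical and the body is deliberately empty; only the doc
string matters, and the exact phrasing is just one possibility.

```elisp
;; Hypothetical stub: only the doc string is of interest here.
;; The name `my-fill-paragraph-semlf-doc-sketch' is illustrative and
;; is not part of the patch.
(defun my-fill-paragraph-semlf-doc-sketch (&optional justify)
  "Fill paragraph at or after point using semantic linefeeds.

Break the paragraph so that each sentence starts on its own line:
if a sentence end, as punctuated by a period (.), exclamation
mark (!), or question mark (?), is not already followed by a
newline, insert one there.

JUSTIFY is ignored in this stub.

For an explanation of semantic linefeeds, see URL
`https://sembr.org/'."
  (ignore justify)
  nil)
```
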
> +	  (when (and (> (point) (line-beginning-position))
> +		     (< (point) (line-end-position)))
> +	    (delete-horizontal-space)
> +	    (newline)

Are you sure 'newline' is the right function to call here?  It doesn't
just insert the newline character, at least not in all the cases.
Perhaps inserting a literal newline character is better?
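
To make the difference concrete, here is a minimal sketch, not taken
from the patch.  'newline' is documented to move to the left margin of
the new line and to call 'auto-fill-function' under some conditions,
whereas (insert "\n") only inserts the character.  The helper names
'my-semlf-break-here' and 'my-semlf-demo' are hypothetical.

```elisp
;; Hypothetical helper, not part of the patch: break the line at point
;; by inserting a literal newline character instead of calling `newline'.
(defun my-semlf-break-here ()
  "Replace horizontal whitespace around point with one literal newline."
  (delete-horizontal-space)
  (insert "\n"))

;; Small demonstration in a temporary buffer.
(defun my-semlf-demo ()
  "Return sample text with a line break after its first sentence."
  (with-temp-buffer
    (insert "First sentence.  Second sentence.")
    (goto-char (point-min))
    ;; Leave point right after the first sentence-ending period.
    (search-forward "sentence.")
    (my-semlf-break-here)
    (buffer-string)))
```

Evaluating (my-semlf-demo) should return the two sentences separated
by a single newline, regardless of the auto-fill or indentation
settings in effect.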



