GNU bug report logs - #78561
[PATCH] Add semantic linefeed support for paragraph filling

Package: emacs

Reported by: Roi Martin <jroi.martin <at> gmail.com>

Date: Fri, 23 May 2025 09:59:02 UTC

Severity: normal

Tags: patch


From: Eli Zaretskii <eliz <at> gnu.org>
To: Roi Martin <jroi.martin <at> gmail.com>
Cc: mbork <at> mbork.pl, 78561 <at> debbugs.gnu.org
Subject: bug#78561: [PATCH] Add semantic linefeed support for paragraph filling
Date: Fri, 23 May 2025 14:11:35 +0300

> Cc: Marcin Borkowski <mbork <at> mbork.pl>
> From: Roi Martin <jroi.martin <at> gmail.com>
> Date: Fri, 23 May 2025 11:58:02 +0200
> 
> This patch adds semantic linefeed support for paragraph filling.  The
> functionality has been discussed in the emacs-devel mailing list in the
> following threads:
> 
> - Fill paragraph using semantic linefeeds: https://lists.gnu.org/archive/html/emacs-devel/2025-03/msg00035.html
> - [GNU ELPA] New package: semlf: https://lists.gnu.org/archive/html/emacs-devel/2025-03/msg00702.html
> 
> In the second thread we agreed on sending a patch to core instead of
> adding a new package to GNU ELPA.
> 
> Given that this is a first version, I have not added any reference to
> the manuals.  If you think it makes sense, please let me know and I'll
> modify the patch accordingly.
> 
> What follows is a detailed explanation of the term semantic linefeeds,
> so we have all the information in one single place.
> 
> The term "semantic linefeeds" or "semantic line breaks" refers to a set
> of conventions for using insensitive vertical whitespace to structure
> prose along semantic boundaries.
> 
> The concept was first introduced by Brian Kernighan in "UNIX for
> Beginners" [1] in October 1974.
> 
>   Hints for Preparing Documents
>   
>   Most documents go through several versions (always more than you
>   expected) before they are finally finished.  Accordingly, you should
>   do whatever possible to make the job of changing them easy.
>   
>   First, when you do the purely mechanical operations of typing, type so
>   subsequent editing will be easy.  Start each sentence on a new line.
>   Make lines short, and break lines at natural places, such as after
>   commas and semicolons, rather than randomly.  Since most people change
>   documents by rewriting phrases and adding, deleting and rearranging
>   sentences, these precautions simplify any editing you have to do
>   later.
> 
> Semantic linefeeds are usually used with markup languages that are not
> sensitive to newlines when exported to a different format (e.g. Org,
> Texinfo, Markdown).
> 
> Let's say that we have the following paragraph in an Org document:
> 
>   Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
>   tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
>   veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
>   commodo consequat.
> 
> After filling the paragraph using semantic linefeeds, the result is:
> 
>   Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
>   tempor.
>   Incididunt ut labore et dolore magna aliqua.
>   Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
>   ut aliquip ex ea commodo consequat.
> 
> However, when exported, in both cases the result is:
> 
>   Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
>   tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
>   veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
>   commodo consequat.
> 
> So, what are the benefits?
> 
> One of the greatest benefits is that semantic linefeeds are "diff
> friendly".
> 
> For example,
> 
>   -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
>   -tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
>   -veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
>   -commodo consequat.
>   +Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
>   +eiusmod tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim
>   +ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
>   +aliquip ex ea commodo consequat.
> 
> Versus,
> 
>   -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
>   -tempor.
>   +Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
>   +eiusmod tempor.
>    Incididunt ut labore et dolore magna aliqua.
>    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
>    ut aliquip ex ea commodo consequat.
> 
> Semantic linefeeds make it easier to spot that the word "XXXXX" was
> added in the first line.
> 
> Also, they are convenient during code reviews.  Shorter diffs and
> separating "ideas" with newlines let reviewers attach comments more
> precisely.
> 
> The site "Semantic Line Breaks" [2] by Mattt and the blog post "Semantic
> Linefeeds" [3] by Brandon Rhodes are both excellent references.
> 
> [1] https://web.archive.org/web/20130108163017if_/http://miffy.tom-yam.or.jp:80/2238/ref/beg.pdf
> [2] https://sembr.org/
> [3] https://rhodesmill.org/brandon/2012/one-sentence-per-line/

Thanks.

> +(defun fill-paragraph-semlf (&optional justify)
> +  "Fill paragraph at or after point using semantic linefeeds.
> +
> +This function ensures that a newline character follows every
> +sentence, as punctuated by a period (.), exclamation mark (!), or
> +question mark (?).

This explanation of what "semantic linefeeds" are is a good starting
point, but it is not enough.  For starters, "ensures" hints at, but
doesn't say explicitly, that if there's no newline there, one is
inserted.  Also, I think a URL to at least one site explaining what
"semantic linefeeds" are should be in the doc string.
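
As an illustration only (this is not the code from the patch), the
behavior the doc string describes, inserting a newline after each
sentence end where one is missing, could be sketched roughly as
follows.  The name my-semlf-split-sentences is hypothetical, and the
sketch assumes that a sentence ends at ".", "!" or "?" followed by
whitespace; it ignores the `sentence-end' settings, so abbreviations
such as "e.g." would be split too, and it does not wrap long sentences
at `fill-column'.

```elisp
;; Sketch only: NOT the patch's `fill-paragraph-semlf'.
;; `my-semlf-split-sentences' is a hypothetical name.
(defun my-semlf-split-sentences (text)
  "Return TEXT with each sentence on its own line.
A sentence is assumed to end at \".\", \"!\" or \"?\" followed by
whitespace.  Long sentences are not wrapped at `fill-column'."
  (with-temp-buffer
    (insert text)
    ;; Join the existing lines so that old line breaks do not survive.
    (goto-char (point-min))
    (while (re-search-forward "[ \t]*\n[ \t]*" nil t)
      (replace-match " " t t))
    ;; Replace the whitespace after each sentence end with one newline.
    (goto-char (point-min))
    (while (re-search-forward "\\([.?!]\\)[ \t]+" nil t)
      (replace-match "\\1\n" t))
    (buffer-string)))
```

For example, (my-semlf-split-sentences "One.  Two\nthree!  Four?")
returns "One.\nTwo three!\nFour?".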

> +	  (when (and (> (point) (line-beginning-position))
> +		     (< (point) (line-end-position)))
> +	    (delete-horizontal-space)
> +	    (newline)

Are you sure 'newline' is the right function to call here?  It doesn't
just insert the newline character, at least not in all cases.
Perhaps inserting a literal newline character is better?
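
A minimal sketch of that alternative, under the assumption that the
surrounding logic stays as in the quoted hunk: the guard and
`delete-horizontal-space' are kept, and only the `newline' call is
replaced by `insert'.  The helper name my-semlf-break-line is
hypothetical and not part of the patch.

```elisp
;; Sketch only: `my-semlf-break-line' is a hypothetical helper,
;; not part of the patch.
(defun my-semlf-break-line ()
  "Break the line at point with a literal newline character.
Do nothing when point is at the beginning or end of a line."
  (when (and (> (point) (line-beginning-position))
             (< (point) (line-end-position)))
    (delete-horizontal-space)
    ;; Unlike `newline', `insert' adds exactly this character: it
    ;; does not trigger auto-fill, does not move to the left margin,
    ;; and does not mark the newline as hard under
    ;; `use-hard-newlines'.
    (insert "\n")))
```

Called in a buffer containing "First.  Second." with point right after
"First.", this leaves "First.\nSecond." in the buffer.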
