GNU bug report logs - #78561
[PATCH] Add semantic linefeed support for paragraph filling

Package: emacs;

Reported by: Roi Martin <jroi.martin <at> gmail.com>

Date: Fri, 23 May 2025 09:59:02 UTC

Severity: normal

Tags: patch

From: Roi Martin <jroi.martin <at> gmail.com>
To: 78561 <at> debbugs.gnu.org
Cc: Marcin Borkowski <mbork <at> mbork.pl>
Subject: bug#78561: [PATCH] Add semantic linefeed support for paragraph filling
Date: Fri, 23 May 2025 11:58:02 +0200
Tags: patch

This patch adds semantic linefeed support for paragraph filling.  The
functionality has been discussed on the emacs-devel mailing list in the
following threads:

- Fill paragraph using semantic linefeeds: https://lists.gnu.org/archive/html/emacs-devel/2025-03/msg00035.html
- [GNU ELPA] New package: semlf: https://lists.gnu.org/archive/html/emacs-devel/2025-03/msg00702.html

In the second thread, we agreed to send a patch to core instead of
adding a new package to GNU ELPA.

Given that this is a first version, I have not added any reference to
it in the manuals.  If you think documenting it there makes sense,
please let me know and I'll modify the patch accordingly.

What follows is a detailed explanation of the term semantic linefeeds,
so we have all the information in one single place.

The term "semantic linefeeds" or "semantic line breaks" refers to a set
of conventions for using insensitive vertical whitespace to structure
prose along semantic boundaries.

The concept was first introduced by Brian Kernighan in "UNIX for
Beginners" [1] in October 1974.

  Hints for Preparing Documents
  
  Most documents go through several versions (always more than you
  expected) before they are finally finished.  Accordingly, you should
  do whatever possible to make the job of changing them easy.
  
  First, when you do the purely mechanical operations of typing, type so
  subsequent editing will be easy.  Start each sentence on a new line.
  Make lines short, and break lines at natural places, such as after
  commas and semicolons, rather than randomly.  Since most people change
  documents by rewriting phrases and adding, deleting and rearranging
  sentences, these precautions simplify any editing you have to do
  later.

Semantic linefeeds are usually used with markup languages that are not
sensitive to newlines when exported to a different format (e.g. Org,
Texinfo, Markdown).

Let's say that we have the following paragraph in an Org document:

  Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
  tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
  veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
  commodo consequat.

After filling the paragraph using semantic linefeeds, the result is:

  Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
  tempor.
  Incididunt ut labore et dolore magna aliqua.
  Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
  ut aliquip ex ea commodo consequat.
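
As a rough illustration of the idea (this is not the code in the
attached patch), the following Python sketch fills a paragraph this way.
The function name semantic_fill, the naive sentence-splitting regexp,
and the fill column of 72 are all assumptions made for the example.

```python
import re
import textwrap

FILL_COLUMN = 72  # assumed width; the message does not state one


def semantic_fill(paragraph, width=FILL_COLUMN):
    """Return PARAGRAPH refilled so every sentence starts on its own line."""
    # Collapse newlines and runs of spaces into single spaces.
    text = " ".join(paragraph.split())
    # Naive sentence split: break after ".", "?" or "!" followed by a space.
    sentences = re.split(r"(?<=[.?!])\s+", text)
    lines = []
    for sentence in sentences:
        # A sentence longer than WIDTH is wrapped like ordinary text.
        lines.extend(textwrap.wrap(sentence, width=width))
    return "\n".join(lines)


PARAGRAPH = """\
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
commodo consequat.
"""

print(semantic_fill(PARAGRAPH))
```

The naive regexp also breaks after abbreviations such as "e.g.", so a
real implementation would need a more careful notion of where a
sentence ends.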

However, when exported, in both cases the result is:

  Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
  tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
  veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
  commodo consequat.
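
The two sources export identically because a single newline inside a
paragraph is treated like a space.  A minimal sketch of that
normalization, where collapse_whitespace is a hypothetical stand-in for
what a real exporter does:

```python
def collapse_whitespace(source):
    """Treat every run of whitespace, newlines included, as one space."""
    return " ".join(source.split())


CONVENTIONAL = """\
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
commodo consequat.
"""

SEMANTIC = """\
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor.
Incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
ut aliquip ex ea commodo consequat.
"""

# Both fill styles carry the same sequence of words.
print(collapse_whitespace(CONVENTIONAL) == collapse_whitespace(SEMANTIC))
```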

So, what are the benefits?

One of the greatest benefits is that semantic linefeeds are "diff
friendly".

For example, inserting the word "XXXXX" into a conventionally filled
paragraph produces this diff:

  -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
  -tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
  -veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
  -commodo consequat.
  +Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
  +eiusmod tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim
  +ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
  +aliquip ex ea commodo consequat.

Versus the same change in a paragraph filled with semantic linefeeds:

  -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
  -tempor.
  +Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
  +eiusmod tempor.
   Incididunt ut labore et dolore magna aliqua.
   Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
   ut aliquip ex ea commodo consequat.

Semantic linefeeds make it easier to spot that the word "XXXXX" was
added in the first line.
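
The difference can also be counted.  The sketch below uses Python's
standard difflib module and a hypothetical helper named changed_lines
to count the lines removed and added by the same one-word insertion in
both fill styles.

```python
import difflib


def changed_lines(old, new):
    """Return (removed, added) line counts between two lists of lines."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1  # lines only in OLD
            added += j2 - j1    # lines only in NEW
    return removed, added


OLD_FILLED = """\
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim ad minim
veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
commodo consequat.
""".splitlines()

NEW_FILLED = """\
Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
eiusmod tempor.  Incididunt ut labore et dolore magna aliqua.  Ut enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat.
""".splitlines()

OLD_SEMANTIC = """\
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor.
Incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
ut aliquip ex ea commodo consequat.
""".splitlines()

NEW_SEMANTIC = """\
Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
eiusmod tempor.
Incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
ut aliquip ex ea commodo consequat.
""".splitlines()

print(changed_lines(OLD_FILLED, NEW_FILLED))      # (4, 4)
print(changed_lines(OLD_SEMANTIC, NEW_SEMANTIC))  # (2, 2)
```

With conventional filling the inserted word reflows every line of the
paragraph, while with semantic linefeeds only the edited sentence
changes.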

They are also convenient during code reviews.  Shorter diffs and
"ideas" separated by newlines allow reviewers to be more precise when
adding comments.

The site "Semantic Line Breaks" [2] by Mattt and the blog post "Semantic
Linefeeds" [3] by Brandon Rhodes are both excellent references.

[1] https://web.archive.org/web/20130108163017if_/http://miffy.tom-yam.or.jp:80/2238/ref/beg.pdf
[2] https://sembr.org/
[3] https://rhodesmill.org/brandon/2012/one-sentence-per-line/

[0001-Add-semantic-linefeed-support-for-paragraph-filling.patch (text/patch, attachment)]

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.