GNU bug report logs - #78561
[PATCH] Add semantic linefeed support for paragraph filling


Package: emacs;

Reported by: Roi Martin <jroi.martin <at> gmail.com>

Date: Fri, 23 May 2025 09:59:02 UTC

Severity: normal

Tags: patch

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.


From: Jordan Ellis Coppard <jc+o.emacs <at> wz.ht>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 78561 <at> debbugs.gnu.org
Subject: bug#78561: [PATCH] Add semantic linefeed support for paragraph filling
Date: Sun, 22 Jun 2025 03:49:07 +0900
My 2c here is that this seems rather basic: while the tests pass with
toy lorem-ipsum text, it fails completely when abbreviations are used.
For example, running M-x fill-paragraph-semlf over:

#+begin_example
Hi there Chris! I see you've got your M.D. now so I suppose I should 
call you Dr. Yoo. I hear you're also a newly appointed U.S. Rep., what 
is that like? Good I hope.
#+end_example

Gets split to:

#+begin_example
Hi there Chris!
I see you've got your M.D.
now so I suppose I should call you Dr.
Yoo.
I hear you're also a newly appointed U.S.
Rep., what is that like?
Good I hope.
#+end_example

This is not correct; one would expect:

#+begin_example
Hi there Chris!
I see you've got your M.D. now so I suppose I should call you Dr. Yoo.
I hear you're also a newly appointed U.S. Rep., what is that like?
Good I hope.
#+end_example

I admit the example given here is intentionally (somewhat) contrived, but
abbreviations like "Dr." are not uncommon, and double-spaced full stops,
which would probably alleviate the problem, are absent from the majority
of English text.
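
To illustrate the double-space point, here is a minimal sketch in Python (not Emacs Lisp, and not the code in the patch). It assumes the paragraph sits on one logical line and that every real sentence end is followed by two spaces, which is roughly the convention Emacs's sentence-end-double-space option expresses. The helper name split_double_spaced is hypothetical.

```python
import re

# Assumption for this sketch: a sentence ends at ".", "!" or "?" followed
# by two or more spaces. A single space after a period never ends a sentence.
_DOUBLE_SPACE_END = re.compile(r"(?<=[.!?]) {2,}")


def split_double_spaced(text: str) -> list[str]:
    """Split text into sentences at double-spaced sentence ends (hypothetical helper)."""
    return [part.strip() for part in _DOUBLE_SPACE_END.split(text) if part.strip()]


# The example paragraph, rewritten with two spaces after each real sentence end.
text = (
    "Hi there Chris!  I see you've got your M.D. now so I suppose I should "
    "call you Dr. Yoo.  I hear you're also a newly appointed U.S. Rep., what "
    "is that like?  Good I hope."
)

for sentence in split_double_spaced(text):
    print(sentence)
```

With that input the abbreviations are left alone, because the single space after "M.D.", "Dr." and "U.S." never matches. The catch is the one already mentioned: most text is not written this way.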

I understand the work of finding a sentence here is done via
forward-sentence. Perhaps inspiration from this prior art could help
find the ends of sentences better:

(1) https://github.com/neurosnap/sentences
(2) https://github.com/diasks2/pragmatic_segmenter
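
As a rough idea of the simplest possible improvement, here is a sketch in Python of a splitter that consults a list of known abbreviations. It is not taken from either project above (I have not checked how they work), and the ABBREVIATIONS set and the split_sentences name are made up for illustration.

```python
# Deliberately tiny, illustrative abbreviation list (an assumption, not a
# real lexicon); entries are lowercase and include the trailing period.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "ms.", "prof.", "m.d.", "u.s.", "rep."}


def split_sentences(text: str) -> list[str]:
    """Split single-spaced text into sentences, skipping known abbreviations."""
    sentences: list[str] = []
    current: list[str] = []
    for token in text.split():
        current.append(token)
        ends_with_terminator = token.endswith((".", "!", "?"))
        # A token such as "Dr." ends with a period but does not end the sentence.
        if ends_with_terminator and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text without a terminator
        sentences.append(" ".join(current))
    return sentences


paragraph = (
    "Hi there Chris! I see you've got your M.D. now so I suppose I should "
    "call you Dr. Yoo. I hear you're also a newly appointed U.S. Rep., what "
    "is that like? Good I hope."
)

for sentence in split_sentences(paragraph):
    print(sentence)
```

On the example paragraph this yields the four lines one would expect. It has an obvious weakness: a sentence that genuinely ends in a listed abbreviation (e.g. "I live in the U.S. It is big.") is merged with the next one, which is presumably why the projects above do something more elaborate.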

I haven't looked into the exact techniques used by those two
projects, and I am also unsure how "serious" an "issue" this is
(hence just referring to it as 2c), but I'd wager it would be an
improvement to split correctly (move forward by sentence) over natural
language most of the time.

In any case, still a good feature. One thing I had been planning to use
this kind of thing for is to semantically fill a large paragraph and
then more easily rewrite or re-arrange thoughts (since they are now
just one line per sentence). Once that's done, join the lines back into a
paragraph and voilà.


/Jordan






This bug report was last modified 18 days ago.
