GNU bug report logs - #78561: [PATCH] Add semantic linefeed support for paragraph filling
To reply to this bug, email your comments to 78561 AT debbugs.gnu.org.
Report forwarded to mbork <at> mbork.pl, bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Fri, 23 May 2025 09:59:02 GMT)
Acknowledgement sent to Roi Martin <jroi.martin <at> gmail.com>: New bug report received and forwarded. Copy sent to mbork <at> mbork.pl, bug-gnu-emacs <at> gnu.org. (Fri, 23 May 2025 09:59:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Tags: patch
This patch adds semantic linefeed support for paragraph filling. The
functionality has been discussed in the emacs-devel mailing list in the
following threads:
- Fill paragraph using semantic linefeeds: https://lists.gnu.org/archive/html/emacs-devel/2025-03/msg00035.html
- [GNU ELPA] New package: semlf: https://lists.gnu.org/archive/html/emacs-devel/2025-03/msg00702.html
In the second thread we agreed on sending a patch to core instead of
adding a new package to GNU ELPA.
Given that this is a first version, I have not added any reference to
the manuals. If you think it makes sense, please let me know and I'll
modify the patch accordingly.
What follows is a detailed explanation of the term semantic linefeeds,
so we have all the information in one single place.
The term "semantic linefeeds" or "semantic line breaks" refers to a set
of conventions for using insensitive vertical whitespace to structure
prose along semantic boundaries.
The concept was first introduced by Brian Kernighan in "UNIX for
Beginners" [1] in October 1974.
Hints for Preparing Documents
Most documents go through several versions (always more than you
expected) before they are finally finished. Accordingly, you should
do whatever possible to make the job of changing them easy.
First, when you do the purely mechanical operations of typing, type so
subsequent editing will be easy. Start each sentence on a new line.
Make lines short, and break lines at natural places, such as after
commas and semicolons, rather than randomly. Since most people change
documents by rewriting phrases and adding, deleting and rearranging
sentences, these precautions simplify any editing you have to do
later.
Semantic linefeeds are usually used with markup languages that are not
sensitive to newlines when exported to a different format (e.g. Org,
Texinfo, Markdown).
Let's say that we have the following paragraph in an Org document:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor. Incididunt ut labore et dolore magna aliqua. Ut enim ad minim
veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
commodo consequat.
After filling the paragraph using semantic linefeeds, the result is:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor.
Incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
ut aliquip ex ea commodo consequat.
However, when exported, in both cases the result is:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor. Incididunt ut labore et dolore magna aliqua. Ut enim ad minim
veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
commodo consequat.
So, what are the benefits?
One of the greatest benefits is that semantic linefeeds are "diff
friendly".
For example,
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
-tempor. Incididunt ut labore et dolore magna aliqua. Ut enim ad minim
-veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
-commodo consequat.
+Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
+eiusmod tempor. Incididunt ut labore et dolore magna aliqua. Ut enim
+ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
+aliquip ex ea commodo consequat.
Versus,
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
-tempor.
+Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
+eiusmod tempor.
Incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
ut aliquip ex ea commodo consequat.
Semantic linefeeds make it easier to spot that the word "XXXXX" was added
in the first line.
Also, they are convenient during code reviews. Shorter diffs and
separating "ideas" with newlines allow reviewers to be more accurate when
adding comments.
The site "Semantic Line Breaks" [2] by Mattt and the blog post "Semantic
Linefeeds" [3] by Brandon Rhodes are both excellent references.
[1] https://web.archive.org/web/20130108163017if_/http://miffy.tom-yam.or.jp:80/2238/ref/beg.pdf
[2] https://sembr.org/
[3] https://rhodesmill.org/brandon/2012/one-sentence-per-line/
[0001-Add-semantic-linefeed-support-for-paragraph-filling.patch (text/patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Fri, 23 May 2025 11:12:02 GMT)
Message #8 received at 78561 <at> debbugs.gnu.org:
> Cc: Marcin Borkowski <mbork <at> mbork.pl>
> From: Roi Martin <jroi.martin <at> gmail.com>
> Date: Fri, 23 May 2025 11:58:02 +0200
>
> This patch adds semantic linefeed support for paragraph filling. The
> functionality has been discussed in the emacs-devel mailing list in the
> following threads:
>
> - Fill paragraph using semantic linefeeds: https://lists.gnu.org/archive/html/emacs-devel/2025-03/msg00035.html
> - [GNU ELPA] New package: semlf: https://lists.gnu.org/archive/html/emacs-devel/2025-03/msg00702.html
>
> In the second thread we agreed on sending a patch to core instead of
> adding a new package to GNU ELPA.
>
> Given that this is a first version, I have not added any reference to
> the manuals. If you think it makes sense, please let me know and I'll
> modify the patch accordingly.
>
> What follows is a detailed explanation of the term semantic linefeeds,
> so we have all the information in one single place.
>
> The term "semantic linefeeds" or "semantic line breaks" refers to a set
> of conventions for using insensitive vertical whitespace to structure
> prose along semantic boundaries.
>
> The concept was first introduced by Brian Kernighan in "UNIX for
> Beginners" [1] in October 1974.
>
> Hints for Preparing Documents
>
> Most documents go through several versions (always more than you
> expected) before they are finally finished. Accordingly, you should
> do whatever possible to make the job of changing them easy.
>
> First, when you do the purely mechanical operations of typing, type so
> subsequent editing will be easy. Start each sentence on a new line.
> Make lines short, and break lines at natural places, such as after
> commas and semicolons, rather than randomly. Since most people change
> documents by rewriting phrases and adding, deleting and rearranging
> sentences, these precautions simplify any editing you have to do
> later.
>
> Semantic linefeeds are usually used with markup languages that are not
> sensitive to newlines when exported to a different format (e.g. Org,
> Texinfo, Markdown).
>
> Let's say that we have the following paragraph in an Org document:
>
> Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
> tempor. Incididunt ut labore et dolore magna aliqua. Ut enim ad minim
> veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
> commodo consequat.
>
> After filling the paragraph using semantic linefeeds, the result is:
>
> Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
> tempor.
> Incididunt ut labore et dolore magna aliqua.
> Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
> ut aliquip ex ea commodo consequat.
>
> However, when exported, in both cases the result is:
>
> Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
> tempor. Incididunt ut labore et dolore magna aliqua. Ut enim ad minim
> veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
> commodo consequat.
>
> So, what are the benefits?
>
> One of the greatest benefits is that semantic linefeeds are "diff
> friendly".
>
> For example,
>
> -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
> -tempor. Incididunt ut labore et dolore magna aliqua. Ut enim ad minim
> -veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
> -commodo consequat.
> +Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
> +eiusmod tempor. Incididunt ut labore et dolore magna aliqua. Ut enim
> +ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
> +aliquip ex ea commodo consequat.
>
> Versus,
>
> -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
> -tempor.
> +Lorem ipsum dolor sit amet, XXXXX consectetur adipiscing elit, sed do
> +eiusmod tempor.
> Incididunt ut labore et dolore magna aliqua.
> Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
> ut aliquip ex ea commodo consequat.
>
> Semantic linefeeds make it easier to spot that the word "XXXXX" was added
> in the first line.
>
> Also, they are convenient during code reviews. Shorter diffs and
> separating "ideas" with newlines allow reviewers to be more accurate when
> adding comments.
>
> The site "Semantic Line Breaks" [2] by Mattt and the blog post "Semantic
> Linefeeds" [3] by Brandon Rhodes are both excellent references.
>
> [1] https://web.archive.org/web/20130108163017if_/http://miffy.tom-yam.or.jp:80/2238/ref/beg.pdf
> [2] https://sembr.org/
> [3] https://rhodesmill.org/brandon/2012/one-sentence-per-line/
Thanks.
> +(defun fill-paragraph-semlf (&optional justify)
> + "Fill paragraph at or after point using semantic linefeeds.
> +
> +This function ensures that a newline character follows every
> +sentence, as punctuated by a period (.), exclamation mark (!), or
> +question mark (?).
This explanation of what is "semantic linefeeds" is a good starting
point, but it is not enough. For starters, "ensures" hints but
doesn't say explicitly that if there's no newline there, it is
inserted. Also, I think a URL to at least one site explaining what
"semantic linefeeds" are should be in the doc string.
> +          (when (and (> (point) (line-beginning-position))
> +                     (< (point) (line-end-position)))
> +            (delete-horizontal-space)
> +            (newline)
Are you sure 'newline' is the right function to call here? It doesn't
just insert the newline character, at least not in all the cases.
Perhaps inserting a literal newline character is better?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Fri, 23 May 2025 15:06:02 GMT)
Message #11 received at 78561 <at> debbugs.gnu.org:
> Given that this is a first version, I have not added any reference to
> the manuals. If you think it makes sense, please let me know and I'll
> modify the patch accordingly.
Maybe a short version of the explanation you give below would be good to
have in the manual (tho Eli suggests a URL instead, so maybe that's
good enough?).
> +(defun fill-paragraph-semlf (&optional justify)
> + "Fill paragraph at or after point using semantic linefeeds.
> +
> +This function ensures that a newline character follows every
> +sentence, as punctuated by a period (.), exclamation mark (!), or
> +question mark (?).
This seems inaccurate: it just uses whichever definition of sentence is
used by `forward-sentence`, so it may ignore some of those chars or pay
attention to others.
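To illustrate the point above, here is a minimal sketch (not part of the patch) showing that the places where `forward-sentence' stops depend on `sentence-end-double-space' rather than on a fixed set of punctuation rules. The helper name `my-sentence-ends' is hypothetical and exists only for this illustration.

```elisp
;; Hypothetical helper, for illustration only; not part of the patch.
(defun my-sentence-ends (text double-space)
  "Return the buffer positions where `forward-sentence' stops in TEXT.
DOUBLE-SPACE is the value bound to `sentence-end-double-space'."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((sentence-end-double-space double-space)
          (ends '()))
      (while (not (eobp))
        (let ((before (point)))
          (forward-sentence)
          (if (= before (point))
              ;; Guard against getting stuck, e.g. on trailing whitespace.
              (goto-char (point-max))
            (push (point) ends))))
      (nreverse ends))))

;; With double spacing required, the single space after "One." does not
;; end a sentence, so fewer stops are reported.
(my-sentence-ends "One. Two.  Three." t)

;; With double spacing not required, every period followed by
;; whitespace ends a sentence.
(my-sentence-ends "One. Two.  Three." nil)
```

Any code that breaks lines "after every sentence" by calling `forward-sentence' inherits this behavior, which is why the doc string should describe it in those terms.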
> +If JUSTIFY is non-nil (interactively, with prefix argument), justify as
> +well. If `sentence-end-double-space' is non-nil, then period followed
> +by one space does not end a sentence, so don't break a line there. The
> +variable `fill-column' controls the width for filling."
I'd move the "The" to the last line. 🙂
> +  (interactive "P")
> +  (save-excursion
> +    (let ((end (progn
> +                 (fill-forward-paragraph 1)
> +                 (backward-word)
> +                 (end-of-line)
> +                 (point)))
> +          (start (progn
> +                   (fill-forward-paragraph -1)
> +                   (forward-word)
> +                   (beginning-of-line)
> +                   (point)))
> +          pfx)
> +      (with-restriction start end
> +        (let ((fill-column (point-max)))
> +          (setq pfx (or (fill-region-as-paragraph (point-min) (point-max)) "")))
> +        (goto-char (point-min))
> +        (while (not (eobp))
> +          (let ((fill-prefix pfx))
> +            (fill-region-as-paragraph (point)
> +                                      (progn (forward-sentence) (point))
> +                                      justify))
> +          (when (and (> (point) (line-beginning-position))
> +                     (< (point) (line-end-position)))
> +            (delete-horizontal-space)
> +            (newline)
> +            (insert pfx))))))
> +  t)
Please try and separate it into a `fill-region-semlf` function and then
another one which applies it to a paragraph, so that it can also be used
to fill a specific user-specified region (or the whole buffer).
Stefan
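The split requested in the message above can be sketched as follows. This is a deliberately simplified illustration, not the code from the patch: the names `my-break-region-after-sentences' and `my-break-paragraph-after-sentences' are hypothetical, and the sketch only inserts line breaks after sentences that end mid-line. It does not join existing lines, honor `fill-column', handle fill prefixes, or justify.

```elisp
;; Simplified sketch; names are hypothetical and this is not the patch.
(defun my-break-region-after-sentences (from to)
  "Insert a newline after each sentence ending mid-line between FROM and TO."
  (save-excursion
    (save-restriction
      (narrow-to-region from to)
      (goto-char (point-min))
      (while (not (eobp))
        (let ((before (point)))
          (forward-sentence)
          ;; Guard against an endless loop if point did not move.
          (when (= before (point))
            (goto-char (point-max))))
        ;; Only break when the sentence ends in the middle of a line.
        (when (and (> (point) (line-beginning-position))
                   (< (point) (line-end-position)))
          (delete-horizontal-space)
          (insert "\n"))))))

(defun my-break-paragraph-after-sentences ()
  "Apply `my-break-region-after-sentences' to the paragraph at point."
  (interactive)
  (save-excursion
    (let* ((end (progn (forward-paragraph 1) (point)))
           (start (progn (forward-paragraph -1) (point))))
      (my-break-region-after-sentences start end))))
```

With this shape, the region function can be called on any user-specified region or on the whole buffer, while the paragraph command is only a thin wrapper that computes the bounds.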
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Sat, 24 May 2025 12:16:01 GMT)
Message #14 received at 78561 <at> debbugs.gnu.org:
Eli Zaretskii <eliz <at> gnu.org> writes:
>> +(defun fill-paragraph-semlf (&optional justify)
>> + "Fill paragraph at or after point using semantic linefeeds.
>> +
>> +This function ensures that a newline character follows every
>> +sentence, as punctuated by a period (.), exclamation mark (!), or
>> +question mark (?).
>
> This explanation of what is "semantic linefeeds" is a good starting
> point, but it is not enough. For starters, "ensures" hints but
> doesn't say explicitly that if there's no newline there, it is
> inserted. Also, I think a URL to at least one site explaining what
> "semantic linefeeds" are should be in the doc string.
I would prefer to avoid depending on external URLs to explain the
concept. I'd link to an external reference if, for instance, this
feature was backed by a standard located in a well-known site
(e.g. IETF RFCs). In this case, the concept is quite simple and I agree
with Stefan in that we can provide our own interpretation in the manual
and link to the Info node from the doc string. If you prefer to avoid
changing the manual until this is well tested, then we can provide a
more detailed explanation in the doc string itself. What do you think?
>> +          (when (and (> (point) (line-beginning-position))
>> +                     (< (point) (line-end-position)))
>> +            (delete-horizontal-space)
>> +            (newline)
>
> Are you sure 'newline' is the right function to call here? It doesn't
> just insert the newline character, at least not in all the cases.
> Perhaps inserting a literal newline character is better?
The reason behind using 'newline' is to support documents that follow
other conventions to represent newlines (e.g. '\r\n' or '\r'). Does it
make sense? Is this the right approach?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Sat, 24 May 2025 13:03:03 GMT)
Message #17 received at 78561 <at> debbugs.gnu.org:
> From: Roi Martin <jroi.martin <at> gmail.com>
> Cc: 78561 <at> debbugs.gnu.org, mbork <at> mbork.pl, monnier <at> iro.umontreal.ca
> Date: Sat, 24 May 2025 14:15:37 +0200
>
> Eli Zaretskii <eliz <at> gnu.org> writes:
>
> > This explanation of what is "semantic linefeeds" is a good starting
> > point, but it is not enough. For starters, "ensures" hints but
> > doesn't say explicitly that if there's no newline there, it is
> > inserted. Also, I think a URL to at least one site explaining what
> > "semantic linefeeds" are should be in the doc string.
>
> I would prefer to avoid depending on external URLs to explain the
> concept.
Since the concept came from outside, why not?
> I'd link to an external reference if, for instance, this
> feature was backed by a standard located in a well-known site
> (e.g. IETF RFCs). In this case, the concept is quite simple and I agree
> with Stefan in that we can provide our own interpretation in the manual
> and link to the Info node from the doc string.
There's no contradiction: we could describe this in our documentation
and also mention the external references. We do that, for example,
for Unicode-related features.
> >> +          (when (and (> (point) (line-beginning-position))
> >> +                     (< (point) (line-end-position)))
> >> +            (delete-horizontal-space)
> >> +            (newline)
> >
> > Are you sure 'newline' is the right function to call here? It doesn't
> > just insert the newline character, at least not in all the cases.
> > Perhaps inserting a literal newline character is better?
>
> The reason behind using 'newline' is to support documents that follow
> other conventions to represent newlines (e.g. '\r\n' or '\r'). Does it
> make sense? Is this the right approach?
In Emacs, there's only one "newline convention", the one that uses the
newline (LFD) character. The different end-of-line conventions are
supported during I/O: we "encode" newlines as CR-LF pair for Windows,
for example, when saving buffers to files, and "decode" CR-LF back
into a single newline when reading files into buffers.
By contrast, the 'newline' function does other things, in addition to
inserting the newline character; see its documentation for the
details. It seems to me that some of those additional actions are not
something this feature will want, because this feature is _only_ about
where to break text into physical lines.
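As a minimal sketch of the distinction described above (not code from the patch), the following breaks a line after the first sentence by inserting a literal newline character. The helper name `my-break-after-first-sentence' is hypothetical.

```elisp
;; Sketch only; the function name is made up for this illustration.
(defun my-break-after-first-sentence (text)
  "Return TEXT with a line break inserted after its first sentence."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((sentence-end-double-space t))
      (forward-sentence))
    (delete-horizontal-space)
    ;; `insert' adds exactly one LF character.  Unlike `newline', it
    ;; does not go through `self-insert-command', so it runs no
    ;; `post-self-insert-hook' and triggers no auto-fill.
    (insert "\n")
    (buffer-string)))

(my-break-after-first-sentence "First sentence.  Second sentence.")

;; Inside the buffer the break is always a bare LF.  A CR-LF pair only
;; comes into play when text is encoded with a DOS-style coding system,
;; for example when the buffer is saved to a file:
(encode-coding-string "one\ntwo" 'utf-8-dos)
```

The end-of-line convention of the file therefore never needs to be considered by the filling code itself.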
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Sat, 24 May 2025 13:39:02 GMT)
Message #20 received at 78561 <at> debbugs.gnu.org:
Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Roi Martin <jroi.martin <at> gmail.com>
>> Cc: 78561 <at> debbugs.gnu.org, mbork <at> mbork.pl, monnier <at> iro.umontreal.ca
>> Date: Sat, 24 May 2025 14:15:37 +0200
>>
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>>
>> > This explanation of what is "semantic linefeeds" is a good starting
>> > point, but it is not enough. For starters, "ensures" hints but
>> > doesn't say explicitly that if there's no newline there, it is
>> > inserted. Also, I think a URL to at least one site explaining what
>> > "semantic linefeeds" are should be in the doc string.
>>
>> I would prefer to avoid depending on external URLs to explain the
>> concept.
>
> Since the concept came from outside, why not?
>
>> I'd link to an external reference if, for instance, this
>> feature was backed by a standard located in a well-known site
>> (e.g. IETF RFCs). In this case, the concept is quite simple and I agree
>> with Stefan in that we can provide our own interpretation in the manual
>> and link to the Info node from the doc string.
>
> There's no contradiction: we could describe this in our documentation
> and also mention the external references. We do that, for example,
> for Unicode-related features.
OK. I'll update the patch accordingly.
>> >> +          (when (and (> (point) (line-beginning-position))
>> >> +                     (< (point) (line-end-position)))
>> >> +            (delete-horizontal-space)
>> >> +            (newline)
>> >
>> > Are you sure 'newline' is the right function to call here? It doesn't
>> > just insert the newline character, at least not in all the cases.
>> > Perhaps inserting a literal newline character is better?
>>
>> The reason behind using 'newline' is to support documents that follow
>> other conventions to represent newlines (e.g. '\r\n' or '\r'). Does it
>> make sense? Is this the right approach?
>
> In Emacs, there's only one "newline convention", the one that uses the
> newline (LFD) character. The different end-of-line conventions are
> supported during I/O: we "encode" newlines as CR-LF pair for Windows,
> for example, when saving buffers to files, and "decode" CR-LF back
> into a single newline when reading files into buffers.
Got it. Thanks a lot for the explanation. That simplifies things a
lot.
> By contrast, the 'newline' function does other things, in addition to
> inserting the newline character; see its documentation for the
> details. It seems to me that some of those additional actions are not
> something this feature will want, because this feature is _only_ about
> where to break text into physical lines.
You are right. I replaced it with
(insert "\n")
Thanks!
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Sun, 25 May 2025 18:48:02 GMT)
Message #23 received at 78561 <at> debbugs.gnu.org:
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>> Given that this is a first version, I have not added any reference to
>> the manuals. If you think it makes sense, please let me know and I'll
>> modify the patch accordingly.
>
> Maybe a short version of the explanation you give below would be good to
> have in the manual (tho Eli suggests a URL instead, so maybe that's
> good enough?).
>
>> +(defun fill-paragraph-semlf (&optional justify)
>> + "Fill paragraph at or after point using semantic linefeeds.
>> +
>> +This function ensures that a newline character follows every
>> +sentence, as punctuated by a period (.), exclamation mark (!), or
>> +question mark (?).
>
> This seems inaccurate: it just uses whichever definition of sentence is
> used by `forward-sentence`, so it may ignore some of those chars or pay
> attention to others.
I have updated the patch with a more precise definition. Also, I added
links to the sources I referenced for semantic linefeeds.
>> +If JUSTIFY is non-nil (interactively, with prefix argument), justify as
>> +well. If `sentence-end-double-space' is non-nil, then period followed
>> +by one space does not end a sentence, so don't break a line there. The
>> +variable `fill-column' controls the width for filling."
>
> I'd move the "The" to the last line. 🙂
Fixed :)
>> +  (interactive "P")
>> +  (save-excursion
>> +    (let ((end (progn
>> +                 (fill-forward-paragraph 1)
>> +                 (backward-word)
>> +                 (end-of-line)
>> +                 (point)))
>> +          (start (progn
>> +                   (fill-forward-paragraph -1)
>> +                   (forward-word)
>> +                   (beginning-of-line)
>> +                   (point)))
>> +          pfx)
>> +      (with-restriction start end
>> +        (let ((fill-column (point-max)))
>> +          (setq pfx (or (fill-region-as-paragraph (point-min) (point-max)) "")))
>> +        (goto-char (point-min))
>> +        (while (not (eobp))
>> +          (let ((fill-prefix pfx))
>> +            (fill-region-as-paragraph (point)
>> +                                      (progn (forward-sentence) (point))
>> +                                      justify))
>> +          (when (and (> (point) (line-beginning-position))
>> +                     (< (point) (line-end-position)))
>> +            (delete-horizontal-space)
>> +            (newline)
>> +            (insert pfx))))))
>> +  t)
>
> Please try and separate it into a `fill-region-semlf` function and then
> another one which applies it to a paragraph, so that it can also be used
> to fill a specific user-specified region (or the whole buffer).
I'm not sure about this one. The idea is that `fill-paragraph-semlf'
can be assigned as a value for `fill-paragraph-function' (I also
included a mention to this in the doc string) or it can be called
directly. The thing is that `fill-paragraph' docs say:
The REGION argument is non-nil if called interactively; in that
case, if Transient Mark mode is enabled and the mark is active,
call `fill-region' to fill each of the paragraphs in the active
region, instead of just filling the current paragraph.
And `fill-paragraph-function' docs say:
Note: This only affects ‘fill-paragraph’ and not ‘fill-region’
nor ‘auto-fill-mode’
So, if I'm not wrong, filling regions and paragraphs is different in the
current design.
I agree that it would be useful to apply `fill-paragraph-semlf' on a
region or the whole buffer. But, the same could be said about any other
`fill-paragraph-function'. So, do we really want a specific
`fill-region-semlf' function?
I attach a new version of the patch.
Thanks!
[0001-Add-semantic-linefeed-support-for-paragraph-filling.patch (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Mon, 26 May 2025 16:22:02 GMT)
Message #26 received at 78561 <at> debbugs.gnu.org:
>> Please try and separate it into a `fill-region-semlf` function and then
>> another one which applies it to a paragraph, so that it can also be used
>> to fill a specific user-specified region (or the whole buffer).
> I'm not sure about this one. The idea is that `fill-paragraph-semlf'
> can be assigned as a value for `fill-paragraph-function' (I also
> included a mention to this in the doc string) or it can be called
> directly. The thing is that `fill-paragraph' docs say:
Sadly, we don't currently have anything like a `fill-region-function`,
but that's "a bug, not a feature", so we should write new code such that
this misfeature is easier rather than harder to fix in the future.
> I agree that it would be useful to apply `fill-paragraph-semlf' on
> a region or the whole buffer. But, the same could be said about any
> other `fill-paragraph-function'.
Yes, I've had this in my TODO for many years. 🙁
> So, do we really want a specific `fill-region-semlf' function?
Actually, I think I want a `fill-region-as-paragraph-semlf'.
And maybe then some way to tell the `fill.el` code to use that instead
of the default `fill-region-as-paragraph`.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Tue, 27 May 2025 15:48:01 GMT)
Message #29 received at 78561 <at> debbugs.gnu.org:
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>>> Please try and separate it into a `fill-region-semlf` function and then
>>> another one which applies it to a paragraph, so that it can also be used
>>> to fill a specific user-specified region (or the whole buffer).
>> I'm not sure about this one. The idea is that `fill-paragraph-semlf'
>> can be assigned as a value for `fill-paragraph-function' (I also
>> included a mention to this in the doc string) or it can be called
>> directly. The thing is that `fill-paragraph' docs say:
>
> Sadly, we don't currently have anything like a `fill-region-function`,
> but that's "a bug, not a feature", so we should write new code such that
> this misfeature is easier rather than harder to fix in the future.
>
>> I agree that it would be useful to apply `fill-paragraph-semlf' on
>> a region or the whole buffer. But, the same could be said about any
>> other `fill-paragraph-function'.
>
> Yes, I've had this in my TODO for many years. 🙁
I'll try to implement it :)
>> So, do we really want a specific `fill-region-semlf' function?
>
> Actually, I think I want a `fill-region-as-paragraph-semlf'.
> And maybe then some way to tell the `fill.el` code to use that instead
> of the default `fill-region-as-paragraph`.
OK, so here is my plan:
First, I'll take a closer look at all the "fill" functions that Emacs
currently provides to understand how everything fits together and what
is and what is not already supported.
Then, I'll send a new patch to support applying a
`fill-paragraph-function' on a region "as-paragraph" (similar to the
behavior of `fill-region-as-paragraph') and "per-paragraph" (similar to
executing `fill-paragraph' for each paragraph within the region).
Finally, if needed, I can update this patch.
Does it make sense?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Tue, 27 May 2025 16:37:02 GMT)
Message #32 received at 78561 <at> debbugs.gnu.org:
>> Yes, I've had this in my TODO for many years. 🙁
> I'll try to implement it :)
For some definition of "it".
>>> So, do we really want a specific `fill-region-semlf' function?
>> Actually, I think I want a `fill-region-as-paragraph-semlf'.
>> And maybe then some way to tell the `fill.el` code to use that instead
>> of the default `fill-region-as-paragraph`.
> OK, so here is my plan:
That plan has a much larger scope than semantic linefeed, so
I recommend you first finish the current patch and take on that
plan afterwards.
What I'm asking is a fairly small refactoring of the code you sent,
I think.
> First, I'll take a closer look at all the "fill" functions that Emacs
> currently provides to understand how everything fits together and what
> is and what is not already supported.
Sounds good. I can give you a quick summary of what I know.
`fill-paragraph-function`s usually serve as either:
- Some way for the major mode to teach the fill code how to stay within
a major-mode-specific notion of paragraph (e.g. fill within comments
or within strings). This use has been made somewhat obsolete by
`fill-forward-paragraph-function` but it's still pretty common.
- Some way for the major mode to teach the fill code about special
line-wrapping conventions (e.g. add a trailing \ on the previous line
to fill C strings, add a leading SPC or TAB to fill rfc822 headers,
...) or to fine-tune (adaptive-)fill-prefix.
- Actually change where we break lines for example to fill code in
a "smart" way (like `smie-auto-fill` does).
- It can also be used to prevent filling in some parts
(e.g. titles/headlines).
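As a small illustration of the last use in the list above, here is a hedged sketch of a `fill-paragraph-function' that prevents filling on headline lines. The function names and the "* " headline syntax are assumptions made for this example; they are not taken from the patch or from any particular major mode. The sketch relies on the documented contract that `fill-paragraph' stops when the function returns non-nil and falls back to its default behavior when the function returns nil.

```elisp
;; Hypothetical example: names and the "* " headline syntax are
;; assumptions for illustration only.
(defun my-skip-headlines-fill-function (&optional _justify)
  "Return non-nil on headline lines so `fill-paragraph' leaves them alone.
Return nil elsewhere, which lets `fill-paragraph' proceed as usual."
  (save-excursion
    (beginning-of-line)
    (looking-at-p "\\*+ ")))

(defun my-demo-fill (text)
  "Fill TEXT in a temporary buffer with the function above installed."
  (with-temp-buffer
    (setq-local fill-paragraph-function #'my-skip-headlines-fill-function)
    (setq-local fill-column 20)
    (insert text)
    (goto-char (point-min))
    (fill-paragraph)
    (buffer-string)))

;; The headline is longer than `fill-column' but is left untouched.
(my-demo-fill "* aaa bbb ccc ddd eee fff ggg")

;; Ordinary text is still filled by the default code.
(my-demo-fill "aaa bbb ccc ddd eee fff ggg")
```

Because such a function can do anything before returning, generic code cannot tell which of the listed roles it plays, which is the difficulty raised in the reply that follows.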
> Then, I'll send a new patch to support applying a
> `fill-paragraph-function' on a region "as-paragraph" (similar to the
> behavior of `fill-region-as-paragraph') and "per-paragraph" (similar to
> executing `fill-paragraph' for each paragraph within the region).
I think in general these are impossible because we don't know what the
`fill-paragraph-function` actually does. My TODO is instead to provide
different hooks for the above different uses (some hooks already exist
such as `comment-line-break-function` but often they need to be
generalized) to try and make `fill-paragraph-function` effectively
obsolete (without necessarily marking it officially as obsolete).
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Wed, 28 May 2025 15:45:03 GMT)
Message #35 received at 78561 <at> debbugs.gnu.org:
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
> I recommend you first finish the current patch and take on that
> plan afterwards.
> What I'm asking is a fairly small refactoring of the code you sent,
> I think.
Please, find attached a new version of the patch.
It contains the following changes:
- Add a new function named `fill-region-as-paragraph-semlf', which fills
a region using semantic linefeeds as if it were a single paragraph.
Its behavior is analogous to `fill-region-as-paragraph'.
- Update the `fill-paragraph-semlf' function to use
`fill-region-as-paragraph-semlf' internally.
- Remove the mention about using `fill-paragraph-semlf' as
`fill-paragraph-function' from docs, given that the long term plan is
to obsolete this.
- Add test for `fill-region-as-paragraph-semlf'.
Thanks a lot for all the explanations. I hope this new version is
closer to what you had in mind.
Roi
[0001-Add-semantic-linefeed-support-for-paragraph-filling.patch (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Sat, 31 May 2025 17:41:03 GMT)
Message #38 received at 78561 <at> debbugs.gnu.org:
>> I recommend you first finish the current patch and take on that
>> plan afterwards.
>> What I'm asking is a fairly small refactoring of the code you sent,
>> I think.
>
> Please, find attached a new version of the patch.
Looks good to me. Eli?
> - Remove the mention about using `fill-paragraph-semlf' as
> `fill-paragraph-function' from docs, given that the long term plan is
> to obsolete this.
Well, it's been in my TODO for ages, so maybe we shouldn't assume this
long term plan will materialize soon.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Sun, 01 Jun 2025 04:49:06 GMT)
Message #41 received at 78561 <at> debbugs.gnu.org:
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, 78561 <at> debbugs.gnu.org, Marcin Borkowski
> <mbork <at> mbork.pl>
> Date: Sat, 31 May 2025 13:39:52 -0400
>
> >> I recommend you first finish the current patch and take on that
> >> plan afterwards.
> >> What I'm asking is a fairly small refactoring of the code you sent,
> >> I think.
> >
> > Please find attached a new version of the patch.
>
> Looks good to me. Eli?
It's okay, but please make sure there are two spaces between
sentences, not one. There are quite a few cases in the patch where
this is not so.
Also, the commit log message should mention the bug number.
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Mon, 02 Jun 2025 13:07:02 GMT)
Message #44 received at 78561 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
> It's okay, but please make sure there are two spaces between
> sentences, not one. There are quite a few cases in the patch where
> this is not so.
Fixed in doc strings. The remaining cases are test data where
we want to verify that `sentence-end-double-space' is respected.
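To illustrate why such test data needs single-spaced sentences, here is a small sketch that uses only standard Emacs functions, not the code from the patch. `my-count-sentences' is a hypothetical helper that counts sentences with `forward-sentence' under a given value of `sentence-end-double-space'.

```elisp
;; Hypothetical helper for illustration; it is not part of the patch.
(defun my-count-sentences (text double-space)
  "Count the sentences in TEXT as seen by `forward-sentence'.
DOUBLE-SPACE is the value to use for `sentence-end-double-space'."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((sentence-end-double-space double-space)
          ;; Ignore any customized `sentence-end' regexp so that the
          ;; function `sentence-end' computes one from the setting above.
          (sentence-end nil)
          (n 0))
      ;; Stop at the end of the buffer or if point fails to advance.
      (while (and (< (point) (point-max))
                  (let ((start (point)))
                    (forward-sentence)
                    (> (point) start)))
        (setq n (1+ n)))
      n)))

;; "One. Two." is separated by a single space, "Two.  Three." by two.
(my-count-sentences "One. Two.  Three." t)    ; => 2
(my-count-sentences "One. Two.  Three." nil)  ; => 3
```

With the double-space convention enabled, a period followed by a single space does not end a sentence, so the first two sentences are counted as one.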
> Also, the commit log message should mention the bug number.
Fixed.
RMS replied to me with a review, where he mentioned that this feature
would be more convenient if it were a minor mode. Eli, Stefan, what do
you think? If it makes sense, we can add it in a future patch.
Also, with regard to the paragraph:
For more details about semantic linefeeds, see `https://sembr.org/' and
`https://rhodesmill.org/brandon/2012/one-sentence-per-line/'."
He raised the following concern:
> This sort of information doesn't belong in an Emacs doc string.
> If some of it is important for using this feature right, we
> should state the point in this and other doc strings, or in
> the Emacs Manual. That way we can propagate it and update it.
> If information is "if you'd like to know more", not necessary for people
> to read to know how to use the feature, then we can refer to it on a web page.
> But we need to have reason to be confident that page will exist for many
> years, serving the same purpose, in a way we would not be ashamed to link to.
>
> Otherwise, depending on it is asking for trouble.
>
> How much material is important to link to for this purpose?
As I previously said, I think it should be fine to add a short
explanation in the doc string or in the manual instead of linking to
external sites that are subject to change or be deleted in the future.
I've attached a new version of the patch. It includes the following
changes compared to the previous version:
- Separate sentences in doc strings using two spaces.
- Add bug number to commit log message.
- Slightly reword doc strings, following a suggestion from RMS.
- Add a test to check that it is possible to "revert" semlf-filling by
refilling the paragraph, following a suggestion from RMS.
Roi
[0001-Add-semantic-linefeed-support-for-paragraph-filling.patch (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Mon, 02 Jun 2025 19:14:02 GMT)
Message #47 received at 78561 <at> debbugs.gnu.org (full text, mbox):
> RMS replied to me with a review, where he mentioned that this feature
> would be more convenient if it were a minor mode. Eli, Stefan, what do
> you think? If it makes sense, we can add it in a future patch.
Yes, it goes in the same direction as my TODO of adding more hooks to
control various aspects of filling. (One of) Those hooks should make it
easy to write such a minor mode.
> I've attached a new version of the patch. It includes the following
> changes compared to the previous version:
LGTM.
Side comment, tho: it would be good to get rid of the `with-restriction`
so that the code can see the surrounding text. In the past such
restrictions in `fill.el` have posed problems for example when we want
to refill code in a way that obeys indentation rules or that
needs to distinguish filling within strings vs filling within comments.
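The problem with restrictions can be illustrated with a small sketch that uses only standard functions and is unrelated to the actual code in `fill.el'. `my-in-string-p' and `my-restriction-demo' are hypothetical names. When the buffer is narrowed to the contents of a string, a syntactic scan can no longer see the opening quote, so it cannot tell that the text is inside a string.

```elisp
;; Hypothetical helpers for illustration; they are not part of fill.el.
(defun my-in-string-p (pos &optional beg end)
  "Return non-nil if a scan from `point-min' finds POS inside a string.
If BEG and END are given, narrow to that region first, mimicking code
that runs under a restriction."
  (save-excursion
    (save-restriction
      (when (and beg end)
        (narrow-to-region beg end))
      ;; Element 3 of the parse state is non-nil inside a string.
      (nth 3 (parse-partial-sexp (point-min) pos)))))

(defun my-restriction-demo ()
  "Return a list (WIDENED NARROWED) of string detection results."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert "(setq x \"one two three\")")
    ;; The opening quote is at position 9 and the closing quote at 23,
    ;; so position 15 is inside the string.
    (list (and (my-in-string-p 15) t)
          ;; Narrowed to the string contents only (positions 10 to 23),
          ;; the scan no longer sees the opening quote.
          (and (my-in-string-p 15 10 23) t))))

(my-restriction-demo)  ; => (t nil)
```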
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Tue, 03 Jun 2025 06:46:03 GMT)
Message #50 received at 78561 <at> debbugs.gnu.org (full text, mbox):
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
> Side comment, tho: it would be good to get rid of the `with-restriction`
> so that the code can see the surrounding text. In the past such
> restrictions in `fill.el` have posed problems for example when we want
> to refill code in a way that obeys indentation rules or that
> needs to distinguish filling within strings vs filling within comments.
The latest version is far from perfect but works well for the most
common use cases. The included tests can give you an idea of the
current state. Are you OK with tackling the `with-restriction'
improvement in a future patch?
Roi
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#78561; Package emacs. (Tue, 03 Jun 2025 17:26:02 GMT)
Message #53 received at 78561 <at> debbugs.gnu.org (full text, mbox):
> Are you OK with tackling the `with-restriction'
> improvement in a future patch?
Yes.
Stefan
This bug report was last modified 9 days ago.