Package: emacs;
Reported by: Spencer Baugh <sbaugh <at> janestreet.com>
Date: Wed, 18 Oct 2023 16:33:02 UTC
Severity: normal
Found in version 29.1.50
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
From: Spencer Baugh <sbaugh <at> janestreet.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 66614 <at> debbugs.gnu.org
Subject: bug#66614: 29.1.50; Support not capitalizing words inside symbols
Date: Wed, 18 Oct 2023 15:38:34 -0400
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Spencer Baugh <sbaugh <at> janestreet.com>
>> Date: Wed, 18 Oct 2023 13:01:43 -0400
>>
>> --- a/doc/lispref/strings.texi
>> +++ b/doc/lispref/strings.texi
>> @@ -1510,7 +1510,9 @@ Case Conversion
>>
>>  The definition of a word is any sequence of consecutive characters that
>>  are assigned to the word constituent syntax class in the current syntax
>> -table (@pxref{Syntax Class Table}).
>> +table (@pxref{Syntax Class Table}), or if @var{case-symbols-as-words} is
>> +non-nil, also characters assigned to the symbol constituent syntax
>> +class.
>>
>>  When @var{string-or-char} is a character, this function does the same
>>  thing as @code{upcase}.
>> @@ -1542,7 +1544,9 @@ Case Conversion
>>
>>  The definition of a word is any sequence of consecutive characters that
>>  are assigned to the word constituent syntax class in the current syntax
>> -table (@pxref{Syntax Class Table}).
>> +table (@pxref{Syntax Class Table}), or if @var{case-symbols-as-words} is
>> +non-nil, also characters assigned to the symbol constituent syntax
>> +class.
>
> These two hunks use @var incorrectly: case-symbols-as-words is a
> literal symbol, so it should have the @code markup.

Fixed.

>> ++++
>> +** New variable 'case-symbols-as-words' to change case behavior for symbols.
>
> "Case behavior" is confusing. I think you mean
>
>   New variable 'case-symbols-as-words' affects case operations for symbols.

Fixed.

>> +If this is set to non-nil, then case operations such as
>> +'upcase-initials' or 'replace-match' (with nil FIXEDCASE) will treat
>> +symbol constituents as if they were part of words.
>
> Don't you mean
>
>   will treat the entire symbol name as a single word
>
> ? I find the text you used confusing, FWIW.

Fixed.

>> This is useful for
>> +programming languages and style where words in the middle of symbols
>> +are never capitalized.
>
> Likewise here: instead of talking about "words in the middle of
> symbols", wouldn't it be better to say something like
>
>   ...style where only the first letter of a symbol's name is ever
>   capitalized.
>
> ?
>
> Also, please say here that the default of this new variable is nil.

Fixed.

>> +  DEFVAR_BOOL ("case-symbols-as-words", case_symbols_as_words,
>> +    doc: /* If non-nil, case functions treat symbol syntax as part of words.
>> +
>> +Functions such as `upcase-initials' and `replace-match' check or modify
>> +the case pattern of sequences of characters. Normally, these operate on
>> +sequences of characters whose syntax is word constituent. If this
>> +variable is non-nil, then they operate on sequences of characters who
>> +syntax is either word constituent or symbol constituent.
>> +
>> +This is useful for programming styles which wish to capitalize the
>> +beginning of symbols, but not capitalize individual words in a symbol.*/);
>
> Similar comments about this doc string.

Fixed.

> Also, shouldn't this variable be buffer-local? You want certain major
> modes to set it, right?

Yes, I want certain major modes to set it, although it's also possible
that some users will want to set it globally.

Are you suggesting it should be a DEFVAR_PER_BUFFER? I can do that, but
I didn't think it was worth putting another slot into struct buffer.
Plus DEFVAR_PER_BUFFER has bad performance (O(#buffers)) when you
let-bind it, which I expect users might want to do sometimes.

>> -          if (SYNTAX (prevc) != Sword)
>> +          if (SYNTAX (prevc) != Sword
>> +              && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))
>
> I think the code will be more clear if you use
>
>   && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))

Fixed.

>>        else if (uppercasep (c))
>>          {
>>            some_uppercase = 1;
>> -          if (SYNTAX (prevc) != Sword)
>> +          if (SYNTAX (prevc) != Sword
>> +              && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))
>
> Same here.

Fixed.

>>            /* If the initial is a caseless word constituent,
>>               treat that like a lowercase initial. */
>> -          if (SYNTAX (prevc) != Sword)
>> +          if (SYNTAX (prevc) != Sword
>> +              && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))
>>              some_nonuppercase_initial = 1;
>
> And here.

Fixed.
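The condition as first submitted and the form suggested in review are the same predicate by De Morgan's law, so the change affects readability only. A minimal standalone sketch that checks this exhaustively follows. The enum and the function names are hypothetical stand-ins for illustration, not Emacs internals, and the flag is passed as a parameter rather than read from a global.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the syntax class of the previous character.
   Only the three cases that matter for the condition are modelled.  */
enum prev_syntax { PREV_WORD, PREV_SYMBOL, PREV_OTHER };

/* Condition as first submitted:
   prev != Sword && (!flag || prev != Ssymbol).  */
static bool
starts_word_v1 (bool symbols_as_words, enum prev_syntax prev)
{
  return prev != PREV_WORD && (!symbols_as_words || prev != PREV_SYMBOL);
}

/* Condition as suggested in review:
   prev != Sword && !(flag && prev == Ssymbol).  */
static bool
starts_word_v2 (bool symbols_as_words, enum prev_syntax prev)
{
  return prev != PREV_WORD && !(symbols_as_words && prev == PREV_SYMBOL);
}

/* Assert that both forms agree on every combination of inputs.  */
static void
check_forms_agree (void)
{
  for (int flag = 0; flag <= 1; flag++)
    for (int s = PREV_WORD; s <= PREV_OTHER; s++)
      assert (starts_word_v1 (flag != 0, (enum prev_syntax) s)
              == starts_word_v2 (flag != 0, (enum prev_syntax) s));
}
```

Calling check_forms_agree () from a test driver exercises all six input combinations. With the flag set, a preceding symbol constituent no longer counts as a word boundary, and with it clear the result matches the old single test against Sword.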
[0001-Add-case-symbols-as-words-to-configure-symbol-case-b.patch (text/x-patch, inline)]
From 8286118c70288217badbbb2afd7863ae2ba6848c Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh <at> janestreet.com>
Date: Wed, 18 Oct 2023 12:51:37 -0400
Subject: [PATCH] Add case-symbols-as-words to configure symbol case behavior

In some programming languages and styles, a symbol (or every symbol in
a sequence of symbols) might be capitalized, but the individual words
making up the symbol should never be capitalized.

For example, in OCaml, type names Look_like_this and variable names
look_like_this, but it is basically never correct for something to
Look_Like_This. And one might have "aa_bb cc_dd ee_ff" or "Aa_bb Cc_dd
Ee_ff", but never "Aa_Bb Cc_Dd Ee_Ff".

To support this, the new variable case-symbols-as-words causes symbol
constituents to be treated as part of words only for case operations.

* src/casefiddle.c (case_ch_is_word): Add.
(case_character_impl): Use case_ch_is_word.
(case_character): Use case_ch_is_word.
(syms_of_casefiddle): Define case-symbols-as-words. (bug#66614)
* src/search.c (Freplace_match): Use case-symbols-as-words when
calculating case pattern.
* test/src/casefiddle-tests.el (casefiddle-tests--check-syms)
(casefiddle-case-symbols-as-words): Test case-symbols-as-words.
* etc/NEWS: Announce case-symbols-as-words.
* doc/lispref/strings.texi (Case Conversion): Document
case-symbols-as-words.
---
 doc/lispref/strings.texi     |  8 ++++++--
 etc/NEWS                     |  8 ++++++++
 src/casefiddle.c             | 23 +++++++++++++++++++++--
 src/search.c                 | 11 +++++++----
 test/src/casefiddle-tests.el | 12 ++++++++++++
 5 files changed, 54 insertions(+), 8 deletions(-)

diff --git a/doc/lispref/strings.texi b/doc/lispref/strings.texi
index 7d11db49def..665d4f9a8dc 100644
--- a/doc/lispref/strings.texi
+++ b/doc/lispref/strings.texi
@@ -1510,7 +1510,9 @@ Case Conversion
 
 The definition of a word is any sequence of consecutive characters that
 are assigned to the word constituent syntax class in the current syntax
-table (@pxref{Syntax Class Table}).
+table (@pxref{Syntax Class Table}), or if @code{case-symbols-as-words}
+is non-nil, also characters assigned to the symbol constituent syntax
+class.
 
 When @var{string-or-char} is a character, this function does the same
 thing as @code{upcase}.
@@ -1542,7 +1544,9 @@ Case Conversion
 
 The definition of a word is any sequence of consecutive characters that
 are assigned to the word constituent syntax class in the current syntax
-table (@pxref{Syntax Class Table}).
+table (@pxref{Syntax Class Table}), or if @code{case-symbols-as-words}
+is non-nil, also characters assigned to the symbol constituent syntax
+class.
 
 When the argument to @code{upcase-initials} is a character,
 @code{upcase-initials} has the same result as @code{upcase}.
diff --git a/etc/NEWS b/etc/NEWS
index 129017f7dbe..23867aafe6f 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1100,6 +1100,14 @@ instead of "ctags", "ebrowse", "etags", "hexl", "emacsclient", and
 "rcs2log", when starting one of these built in programs in a subprocess.
 
++++
+** New variable 'case-symbols-as-words' affects case operations for symbols.
+If non-nil, then case operations such as 'upcase-initials' or
+'replace-match' (with nil FIXEDCASE) will treat the entire symbol name
+as a single word. This is useful for programming languages and styles
+where only the first letter of a symbol's name is ever capitalized.
+It defaults to nil.
+
 +++
 ** 'x-popup-menu' now understands touch screen events.
 When a 'touchscreen-begin' or 'touchscreen-end' event is passed as the
diff --git a/src/casefiddle.c b/src/casefiddle.c
index d567a5e353a..47e8950cda6 100644
--- a/src/casefiddle.c
+++ b/src/casefiddle.c
@@ -92,6 +92,12 @@ prepare_casing_context (struct casing_context *ctx,
   SETUP_BUFFER_SYNTAX_TABLE (); /* For syntax_prefix_flag_p. */
 }
 
+static bool
+case_ch_is_word (enum syntaxcode syntax)
+{
+  return syntax == Sword || (case_symbols_as_words && syntax == Ssymbol);
+}
+
 struct casing_str_buf
 {
   unsigned char data[max (6, MAX_MULTIBYTE_LENGTH)];
@@ -115,7 +121,7 @@ case_character_impl (struct casing_str_buf *buf,
 
   /* Update inword state */
   bool was_inword = ctx->inword;
-  ctx->inword = SYNTAX (ch) == Sword &&
+  ctx->inword = case_ch_is_word (SYNTAX (ch)) &&
     (!ctx->inbuffer || was_inword || !syntax_prefix_flag_p (ch));
 
   /* Normalize flag so its one of CASE_UP, CASE_DOWN or CASE_CAPITALIZE. */
@@ -222,7 +228,7 @@ case_character (struct casing_str_buf *buf, struct casing_context *ctx,
      has a word syntax (i.e. current character is end of word), use final
      sigma. */
   if (was_inword && ch == GREEK_CAPITAL_LETTER_SIGMA && changed
-      && (!next || SYNTAX (STRING_CHAR (next)) != Sword))
+      && (!next || !case_ch_is_word (SYNTAX (STRING_CHAR (next)))))
     {
       buf->len_bytes = CHAR_STRING (GREEK_SMALL_LETTER_FINAL_SIGMA, buf->data);
       buf->len_chars = 1;
@@ -720,6 +726,19 @@ syms_of_casefiddle (void)
 3rd argument. */);
   Vregion_extract_function = Qnil; /* simple.el sets this. */
 
+  DEFVAR_BOOL ("case-symbols-as-words", case_symbols_as_words,
+    doc: /* If non-nil, case functions treat symbol syntax as part of words.
+
+Functions such as `upcase-initials' and `replace-match' check or modify
+the case pattern of sequences of characters. Normally, these operate on
+sequences of characters whose syntax is word constituent. If this
+variable is non-nil, then they operate on sequences of characters whose
+syntax is either word constituent or symbol constituent.
+
+This is useful for programming languages and styles where only the first
+letter of a symbol's name is ever capitalized.*/);
+  case_symbols_as_words = 0;
+
   defsubr (&Supcase);
   defsubr (&Sdowncase);
   defsubr (&Scapitalize);
diff --git a/src/search.c b/src/search.c
index e9b29bb7179..692d8488049 100644
--- a/src/search.c
+++ b/src/search.c
@@ -2365,7 +2365,7 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 convert NEWTEXT to all caps. Otherwise if all words are capitalized
 in the replaced text, capitalize each word in NEWTEXT. Note that
 what exactly is a word is determined by the syntax tables in effect
-in the current buffer.
+in the current buffer, and the variable `case-symbols-as-words'.
 
 If optional third arg LITERAL is non-nil, insert NEWTEXT literally.
 Otherwise treat `\\' as special:
@@ -2479,7 +2479,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
               /* Cannot be all caps if any original char is lower case */
 
               some_lowercase = 1;
-              if (SYNTAX (prevc) != Sword)
+              if (SYNTAX (prevc) != Sword
+                  && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))
                 some_nonuppercase_initial = 1;
               else
                 some_multiletter_word = 1;
@@ -2487,7 +2488,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
           else if (uppercasep (c))
             {
               some_uppercase = 1;
-              if (SYNTAX (prevc) != Sword)
+              if (SYNTAX (prevc) != Sword
+                  && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))
                 ;
               else
                 some_multiletter_word = 1;
@@ -2496,7 +2498,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
             {
               /* If the initial is a caseless word constituent,
                  treat that like a lowercase initial. */
-              if (SYNTAX (prevc) != Sword)
+              if (SYNTAX (prevc) != Sword
+                  && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))
                 some_nonuppercase_initial = 1;
             }
 
diff --git a/test/src/casefiddle-tests.el b/test/src/casefiddle-tests.el
index e7f4348b0c6..12984d898b9 100644
--- a/test/src/casefiddle-tests.el
+++ b/test/src/casefiddle-tests.el
@@ -294,4 +294,16 @@ casefiddle-turkish
     ;;(should (string-equal (capitalize "indIá") "İndıa"))
     ))
 
+(defun casefiddle-tests--check-syms (init with-words with-symbols)
+  (let ((case-symbols-as-words nil))
+    (should (string-equal (upcase-initials init) with-words)))
+  (let ((case-symbols-as-words t))
+    (should (string-equal (upcase-initials init) with-symbols))))
+
+(ert-deftest casefiddle-case-symbols-as-words ()
+  (casefiddle-tests--check-syms "Aa_bb Cc_dd" "Aa_Bb Cc_Dd" "Aa_bb Cc_dd")
+  (casefiddle-tests--check-syms "Aa_bb cc_DD" "Aa_Bb Cc_DD" "Aa_bb Cc_DD")
+  (casefiddle-tests--check-syms "aa_bb cc_dd" "Aa_Bb Cc_Dd" "Aa_bb Cc_dd")
+  (casefiddle-tests--check-syms "Aa_Bb Cc_Dd" "Aa_Bb Cc_Dd" "Aa_Bb Cc_Dd"))
+
 ;;; casefiddle-tests.el ends here
-- 
2.39.3
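To illustrate the effect of the patch outside Emacs, here is a rough standalone sketch of the word-boundary logic, assuming a toy classifier in which ASCII alphanumerics are word constituents and '_' is a symbol constituent. Real Emacs consults the buffer's syntax table and handles multibyte text. Every toy_ name below is hypothetical, and the flag is passed as an argument instead of being the global case_symbols_as_words.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified syntax classes for this sketch only.  */
enum toy_syntax { TOY_SWORD, TOY_SSYMBOL, TOY_SOTHER };

/* Toy classifier: ASCII alphanumerics are word constituents and '_'
   is a symbol constituent.  This is an assumption of the sketch, not
   how Emacs syntax tables are implemented.  */
static enum toy_syntax
toy_syntax_of (unsigned char c)
{
  if (isalnum (c))
    return TOY_SWORD;
  if (c == '_')
    return TOY_SSYMBOL;
  return TOY_SOTHER;
}

/* Same shape as case_ch_is_word in the patch, with the flag passed in.  */
static bool
toy_ch_is_word (enum toy_syntax syntax, bool symbols_as_words)
{
  return syntax == TOY_SWORD
         || (symbols_as_words && syntax == TOY_SSYMBOL);
}

/* Upcase, in place, the first character of every "word" in the
   NUL-terminated ASCII string S.  Other characters are left alone,
   in the spirit of upcase-initials.  */
static void
toy_upcase_initials (char *s, bool symbols_as_words)
{
  assert (s != NULL);
  bool inword = false;
  for (; *s != '\0'; s++)
    {
      bool was_inword = inword;
      inword = toy_ch_is_word (toy_syntax_of ((unsigned char) *s),
                               symbols_as_words);
      if (inword && !was_inword)
        *s = (char) toupper ((unsigned char) *s);
    }
}
```

Under these assumptions, applying toy_upcase_initials to "aa_bb cc_dd" gives "Aa_Bb Cc_Dd" with the flag off and "Aa_bb Cc_dd" with the flag on, which matches the third case in casefiddle-case-symbols-as-words.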