GNU bug report logs - #6345
css-mode `css-extract-keyword-list' does not actually [PATCH]
Reported by: MON KEY <monkey <at> sandpframing.com>
Date: Thu, 3 Jun 2010 18:02:02 UTC
Severity: minor
Tags: patch
Fixed in version 25.1
Done: Simen Heggestøyl <simenheg <at> gmail.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 6345 in the body.
You can then email your comments to 6345 AT debbugs.gnu.org in the normal way.
Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6345; Package emacs. (Thu, 03 Jun 2010 18:02:02 GMT)
Acknowledgement sent to MON KEY <monkey <at> sandpframing.com>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 03 Jun 2010 18:02:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
`css-extract-keyword-list' does not actually [PATCH]
In function `css-extract-keyword-list' the search for "Appendix
H. Index" fails. For example, this form:
(search-backward "Appendix H. Index")
when used to search the contents of this URL:
"http://www.w3.org/TR/REC-CSS2/css2.txt"
which is dated: W3C Candidate Recommendation 08 September 2009
signals this error:
css-extract-keyword-list: Search failed: "Appendix H. Index"
It appears this function was originally supplied to scrape CSS
keywords as per the commented code in: lisp/textmodes/css-mode.el
,----
| (css-extract-keyword-list
| '((pseudo . "^ +\\* :\\([^ \n,]+\\)")
| (at . "^ +\\* @\\([^ \n,]+\\)")
| (descriptor . "^ +\\* '\\([^ '\n]+\\)' (descriptor)")
| (media . "^ +\\* '\\([^ '\n]+\\)' media group")
| (property . "^ +\\* '\\([^ '\n]+\\)',")))
`----
However, W3C has gone behind Stefan's back and changed the Appendix
enumeration without asking his permission first :)
"Appendix H" is now "Appendix I".
Compare the version scraped (presumably):
(URL `http://www.w3.org/TR/2008/REC-CSS2-20080411/indexlist.html')
(URL `http://www.w3.org/TR/2008/REC-CSS2-20080411/css2.txt')
with the current version:
(URL `http://www.w3.org/TR/CSS2/indexlist.html')
(URL `http://www.w3.org/TR/CSS2/css2.txt')
The following regexp may be more robust. It appears to work for
either the older version or the latest version, and leaves room for W3C
to continue adding appendices J-M:
(search-backward-regexp "[_━]\\{60,79\\}\xa[[:space:]]+Appendix [A-M]\. Index")
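As a rough illustration of the idea (a sketch only, not the code in css-mode.el), the following shows a backward regexp search that tolerates a renumbered appendix. The helper name `my-find-index-heading' and the sample text are made up for this example, and the pattern is simplified to an ASCII underscore rule. It also writes the period as "\\.", since a bare "\." inside a Lisp string reads as "." and matches any character.

```elisp
;; Hypothetical helper, not part of css-mode.el.
(defun my-find-index-heading (text)
  "Return the last \"Appendix X. Index\" heading found in TEXT, or nil.
X may be any letter from A to M, so a renumbered appendix still matches."
  (with-temp-buffer
    (insert text)
    (goto-char (point-max))
    ;; A rule of 60-79 underscores, a newline, then the indented heading.
    ;; "\\." matches a literal period.
    (when (re-search-backward
           "_\\{60,79\\}\n[[:space:]]+\\(Appendix [A-M]\\. Index\\)" nil t)
      (match-string 1))))

;; Invented sample input: the same layout with two different letters.
(let ((rule (make-string 70 ?_)))
  (list (my-find-index-heading (concat rule "\n   Appendix H. Index\n"))
        (my-find-index-heading (concat rule "\n   Appendix I. Index\n"))))
;; => ("Appendix H. Index" "Appendix I. Index")
```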
This said, `css-extract-keyword-list' is now borking on regexps in
these conses:
(css-extract-keyword-list
'((pseudo . "^ +\\* :\\([^ \n,]+\\)")
(at . "^ +\\* @\\([^ \n,]+\\)")
(descriptor . "^ +\\* '\\([^ '\n]+\\)' (descriptor)")
(media . "^ +\\* '\\([^ '\n]+\\)' media group")
(property . "^ +\\* '\\([^ '\n]+\\)',")))
and seems to be failing because `url-insert-file-contents' relies on
`decode-coding-inserted-region', which frobs the asterisk `*' (char
#x2a) into a bullet `•' (char #x2022) -- at least on my system.
If we substitute occurrences of "\\*" with "[*•]" (e.g. "[\x2a\x2022]")
the following regexps now seem to work correctly:
(pp (css-extract-keyword-list
'((pseudo . "^ +[\x2a\x2022] :\\([^ \n,]+\\)")
(at . "^ +[\x2a\x2022] @\\([^ \n,]+\\)")
(descriptor . "^ +[\x2a\x2022] '\\([^ '\n]+\\)' (descriptor)")
(media . "^ +[\x2a\x2022] '\\([^ '\n]+\\)' media group")
(property . "^ +[\x2a\x2022] '\\([^ '\n]+\\)',")))
(current-buffer))
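To see the effect of the character class in isolation, here is a small sketch. The helper `my-collect-matches' and the sample index lines are invented for illustration; they are neither the real W3C text nor the body of `css-extract-keyword-list'.

```elisp
;; Hypothetical helper, not part of css-mode.el.
(defun my-collect-matches (regexp text)
  "Return a list of every group-1 match of REGEXP in TEXT, in order."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((result nil))
      (while (re-search-forward regexp nil t)
        (push (match-string 1) result))
      (nreverse result))))

;; Invented sample: one entry bulleted with "*", one with U+2022.
(let ((sample "   * :active, 1\n   \u2022 :hover, 2\n"))
  (list
   ;; Original pattern: only the "*" line matches.
   (my-collect-matches "^ +\\* :\\([^ \n,]+\\)" sample)
   ;; Character class: both bullet styles match.
   (my-collect-matches "^ +[*\u2022] :\\([^ \n,]+\\)" sample)))
;; => (("active") ("active" "hover"))
```

Inside a bracket expression the asterisk is literal, so it needs no backslash there.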
The following is diffed against Bazaar revision 100231:
;;; ==============================
*** ediff3753M5g 2010-06-03 09:43:04.000000000 -0400
--- lisp/textmodes/css-mode.el 2010-06-03 09:42:43.000000000 -0400
***************
*** 41,49 ****
(defun css-extract-keyword-list (res)
(with-temp-buffer
! (url-insert-file-contents "http://www.w3.org/TR/REC-CSS2/css2.txt")
(goto-char (point-max))
! (search-backward "Appendix H. Index")
(forward-line)
(delete-region (point-min) (point))
(let ((result nil)
--- 41,49 ----
(defun css-extract-keyword-list (res)
(with-temp-buffer
! (url-insert-file-contents "http://www.w3.org/TR/2008/REC-CSS2-20080411/css2.txt")
(goto-char (point-max))
! (search-backward-regexp "[_━]\\{60,79\\}\xa[[:space:]]+Appendix [A-M]\. Index")
(forward-line)
(delete-region (point-min) (point))
(let ((result nil)
***************
*** 115,125 ****
;; Extraction was done with:
;; (css-extract-keyword-list
! ;; '((pseudo . "^ +\\* :\\([^ \n,]+\\)")
! ;; (at . "^ +\\* @\\([^ \n,]+\\)")
! ;; (descriptor . "^ +\\* '\\([^ '\n]+\\)' (descriptor)")
! ;; (media . "^ +\\* '\\([^ '\n]+\\)' media group")
! ;; (property . "^ +\\* '\\([^ '\n]+\\)',")))
(defconst css-pseudo-ids
'("active" "after" "before" "first" "first-child" "first-letter"
"first-line"
--- 115,125 ----
;; Extraction was done with:
;; (css-extract-keyword-list
! ;; '((pseudo . "^ +[\x2a\x2022] :\\([^ \n,]+\\)")
! ;; (at . "^ +[\x2a\x2022] @\\([^ \n,]+\\)")
! ;; (descriptor . "^ +[\x2a\x2022] '\\([^ '\n]+\\)' (descriptor)")
! ;; (media . "^ +[\x2a\x2022] '\\([^ '\n]+\\)' media group")
! ;; (property . "^ +[\x2a\x2022] '\\([^ '\n]+\\)',")))
(defconst css-pseudo-ids
'("active" "after" "before" "first" "first-child" "first-letter"
"first-line"
[css-mode.diff-2010-06-03 (application/octet-stream, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6345; Package emacs. (Tue, 10 Apr 2012 11:13:02 GMT)
Message #8 received at 6345 <at> debbugs.gnu.org:
MON KEY <monkey <at> sandpframing.com> writes:
> `css-extract-keyword-list' does not actually [PATCH]
[...]
> ! (search-backward-regexp "[_━]\\{60,79\\}\xa[[:space:]]+Appendix [A-M]\. Index")
The rest of the patch seems reasonable (I think), but is there a way to
rework this? Having characters like that in the source code isn't
ideal, if it can be avoided.
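One possible rework, sketched here as an assumption rather than anything adopted in css-mode.el, is to spell the non-ASCII character with a `\u' escape so the source file stays ASCII-only. The variable name `my-index-heading-regexp' is hypothetical, and U+2501 is assumed to be the rule character in question.

```elisp
;; Hypothetical name; assumes the rule is drawn with "_" or U+2501.
(defconst my-index-heading-regexp
  "[_\u2501]\\{60,79\\}\n[[:space:]]+Appendix [A-M]\\. Index"
  "Regexp for the index heading, written with ASCII characters only.")

;; The escape denotes the same character as the literal would, so a
;; rule made of 70 U+2501 characters is matched from position 0.
(string-match-p my-index-heading-regexp
                (concat (make-string 70 #x2501) "\n  Appendix I. Index"))
;; => 0
```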
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6345; Package emacs. (Tue, 10 Apr 2012 12:08:02 GMT)
Message #11 received at 6345 <at> debbugs.gnu.org:
>> `css-extract-keyword-list' does not actually [PATCH]
> [...]
>> ! (search-backward-regexp "[_━]\\{60,79\\}\xa[[:space:]]+Appendix [A-M]\. Index")
> The rest of the patch seems reasonable (I think), but is there a way to
> rework this? Having characters like that in the source code isn't
> ideal, if it can be avoided.
Indeed: the code is only run occasionally to update the keyword-list, so
it's not super important for it to be terribly robust. In a sense, the
code is only kept as documentation to have a good starting point for the
next time I need such a thing.
Stefan
Reply sent to Simen Heggestøyl <simenheg <at> gmail.com>: You have taken responsibility. (Thu, 19 Mar 2015 22:43:02 GMT)
Notification sent to MON KEY <monkey <at> sandpframing.com>: bug acknowledged by developer. (Thu, 19 Mar 2015 22:43:02 GMT)
Message #16 received at 6345-done <at> debbugs.gnu.org:
Version: 25.1
As of commit 7ec63a3afa52213b7b3cd3ecc0717c6e6504dc43, that code is no
longer part of css-mode.
Thanks for your report!
-- Simen
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 17 Apr 2015 11:24:04 GMT)
This bug report was last modified 10 years and 71 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.