Package: emacs
Reported by: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Date: Thu, 31 May 2018 09:56:02 UTC
Severity: minor
Tags: fixed, moreinfo
Fixed in version 27.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
From: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Katsumi Yamaoka <yamaoka <at> jpl.org>, 31665 <at> debbugs.gnu.org
Subject: bug#31665: `libxml-parse-html-region' doesn't extract text in tables
Date: Mon, 30 Sep 2019 00:52:40 +0800

>>>>> "LI" == Lars Ingebrigtsen <larsi <at> gnus.org> writes:

LI> 積丹尼 Dan Jacobson <jidanni <at> jidanni.org> writes:
>>>>>>> "LI" == Lars Ingebrigtsen <larsi <at> gnus.org> writes:
>>
LI> Do you have an example table that `libxml-parse-html-region' doesn't
LI> "extract" text from?
>>
>> OK here is a mail that I cleaned off my personal phone bill from:

LI> What was it you think is missing from that table? I don't read Chinese,
LI> but there didn't seem to be any text in that table, just a bunch of
LI> images.

It should look like:

+----------------------------------------------------------------------------------------------------------------------------------------------------+
|+---------------------------------------------------------------------------------------------------------------------+ |
||+------------------------------------------------------------------------------------------------------------------+ | |
|||[banner2] | | |
|||------------------------------------------------------------------------------------------------------------------| | |
|||+---------------------------------------------------------------------------------------------------------------+ | | |
|||| |親愛的客戶,您好: | | | | |
|||| |-------------------------------------| | | | |
|||| |為保障您資料的安全,請輸入密碼開啟附 | | | | |
|||| |加檔案瀏覽您本期的帳單,密碼為『身分 | | | | |
|||| [IS1] |證號碼』(英文字母須大寫),營業人客戶 | [IS2] | | | |
|||| |不需輸入密碼即可瀏覽。 | | | | |
|||| |若無法開啟附加檔案,請先確認是否已下 | | | | |
|||| |載Acrobat Reader軟體。 | | | | |
|||| |-------------------------------------| | | | |
|||+---------------------------------------------------------------------------------------------------------------+ | | |
||+------------------------------------------------------------------------------------------------------------------+ | |
||++ | |
|||| | |
||++ | |
||+-------------------------------------------------------------------------------------------------------------------+| |
|||[new1] || |
|||+-----------------------------------------------------------------------------------------------------------------+|| |
|||| | [enf201]||| |
|||| |--------------------------------------------------------||| |
||||[end101] | [enl301]||| |
|||| |--------------------------------------------------------||| |
|||| | [enl401]||| |
|||+-----------------------------------------------------------------------------------------------------------------+|| |
||+-------------------------------------------------------------------------------------------------------------------+| |
||++ | |
|||| | |
||++ | |
||+------------------------------------------------------------------------------------------------------------------+ | |
|||[hot1] | | |
|||------------------------------------------------------------------------------------------------------------------| | |
|||+----------------------------------+ | | |
||||[hot1]|[hot2]|[hot3]|[hot4]|[hot5]| | | |
|||+----------------------------------+ | | |
||+------------------------------------------------------------------------------------------------------------------+ | |
||++ | |
|||| | |
||++ | |
||+------------------------------------------------------------------------------------------------------------------+ | |
|||[link1] | | |
|||+-----------------------------------------------------------------+ | | |
|||||| | | | | | | |
||||++------------+----------------+----------------+----------------| | | |
||||||電子帳單Q&A | 費率說明 | 客戶消費資訊 | 線上繳費 | | | |
||||++------------+----------------+----------------+----------------| | | |
|||||| 服務專線 | 貼心提醒 |不可不知行動優惠| HiNet好康優惠 | | | |
|||+-----------------------------------------------------------------+ | | |
||+------------------------------------------------------------------------------------------------------------------+ | |
||++ | |
|||| | |
||++ | |
||+------------------------------------------------------------------------------------------------------------------+ | |
||| [cht] | | |
||+------------------------------------------------------------------------------------------------------------------+ | |
|+---------------------------------------------------------------------------------------------------------------------+ |
+----------------------------------------------------------------------------------------------------------------------------------------------------+

But instead all we get is:

From: Phone Co. <p <at> cht.com.tw>
Subject: Phone Bill
To: "jidanni <at> jidanni.org" <jidanni <at> jidanni.org>
Date: Thu, 17 May 2018 12:12:06 +0800
Reply-To: x <at> cht.com.tw

[1. text/html]

中華電信電子帳單

* * * * * * * * * * * * * * * *