On Mon, 04 Mar 2024 16:43:26 +0200 Eli Zaretskii wrote:

>> From: Stephen Berman
>> Cc: 69385@debbugs.gnu.org
>> Date: Mon, 04 Mar 2024 14:28:50 +0100
>>
>> On Sun, 03 Mar 2024 17:18:31 +0200 Eli Zaretskii wrote:
[...]
>> > I hope you are editing those files with
>> > embedded Arabic frequently enough for these changes to be exercised.
>>
>> As I mentioned previously, my real files are programmatically generated
>> elisp files (so base paragraph direction LTR) not meant to be manually
>> edited or even just viewed by users of the program, and I haven't edited
>> them manually, and normally wouldn't. But I just now ran the
>> end-of-buffer benchmark on one of them (the one I described previous,
>> containing a vector of 827 lists of bidirectional strings in a single
>> line), with this result:
>>
>> (0.849369497 4 0.05337466599999996)
>>
>> This was the timing without your patch:
>>
>> (9.308704995000001 4 0.054923504999999984)
>>
>> So for this file your patch yields "only" an almost 11 times faster
>> benchmark. For navigation besides M-> and M-<, I find C-v, M-v, C-n,
>> C-p in the buffer visiting this file still very slow (noticeably more
>> than in the test buffers) and holding them down still freezes Emacs
>> (with C-n and C-p for many seconds) and uses 100% of a CPU core; though,
>> while I haven't tried timing these yet, my impression is that the
>> freezes are not as long as the ones I observed without your patch.
>> Also, there is still a marked delay when entering the minibuffer with
>> M-x or M-: or when switching to another buffer with C-x b, though
>> impressionistically no worse than the delays without your patch. I'll
>> try to do more testing.
>
> Thanks for testing. The above matches what I see on my system. C-n
> and C-p is known to be problematic in long lines, but these changes
> speed them up as well, although perhaps not as well as the other
> commands.
>
>> > If you see no problems after a week or two, I will install this.
>>
>> Thanks.
>
> So I will wait for you to report any problems, and if no problems are
> seen, will install in a week or so.

I haven't yet run into any issues concerning your patch, but I have
encountered a problem with another one of my generated files, which,
though independent of your patch (the problem also happens in
emacs-29), is an issue for bidirectional text in Emacs, so might be
worth trying to handle better. If you want, I can open a separate bug
to pursue this issue, but for now I'll summarize what I've observed so
far.

Most of the Arabic words in the problematic file are enclosed in the
bidirectional control characters POP DIRECTIONAL FORMATTING (#x202c)
and RIGHT-TO-LEFT EMBEDDING (#x202b). I did not add these characters,
but I had copy-&-pasted most of the Arabic from a PDF file I did not
create. I don't know if PDFs of Arabic text normally contain these
control characters, but the consequences for Emacs were dramatic. When
I simply visited this file in Emacs (started with -Q) there was an
immediate slowdown, and in top I could see Emacs using 100% of a CPU
thread. When I ran the end-of-buffer benchmark on this file, the
result (with your patch) was:

(27.962602113 2 0.0226042269999999977)

However, the display of that result only appeared in the echo area
after more than a minute (I timed it with a stopwatch). At that point
the mode line showed the buffer at 4% from the top, and the display
remained frozen afterwards. After several minutes during which Emacs
consumed 100% CPU, and after I had switched the focus away from the
Emacs frame, the CPU consumption stopped, but as soon as I switched
focus back to that frame, it went back to 100%. The display never
changed from showing the buffer at 4%; Emacs was apparently stuck in
some kind of infinite loop. After about 15 minutes I started gdb,
attached it to the Emacs process and produced a backtrace, which I've
attached, in the hope it helps to diagnose the problem.
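For checking a file for the two control characters named above outside
Emacs, a minimal Python sketch might look like the following. It is
only an illustration: the function names are hypothetical, and it is
not the procedure used for the results reported in this message.

```python
# Hypothetical helpers for illustration only; they are not part of Emacs
# and not the procedure used for the results in this report.
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def count_bidi_controls(text: str) -> dict:
    """Return how many times each of the two control characters occurs."""
    return {"RLE": text.count(RLE), "PDF": text.count(PDF)}


def strip_bidi_controls(text: str) -> str:
    """Return a copy of text with both control characters removed."""
    return text.translate({ord(RLE): None, ord(PDF): None})


if __name__ == "__main__":
    # An Arabic word wrapped in RLE ... PDF, followed by Latin text.
    sample = RLE + "مرحبا" + PDF + " Hello"
    print(count_bidi_controls(sample))
    print(strip_bidi_controls(sample))
```

To apply this to a real file, one could read it with
open(path, encoding="utf-8") and pass the contents to these functions.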
The problem certainly seems to be related to the bidirectional control
characters, because I made a copy of the file and removed all
occurrences of these control characters from it, and then ran the
end-of-buffer benchmark, getting this result (with your patch):

(0.716104165 4 0.04223660400000001)

And the display updated normally and CPU consumption was normal.
Nevertheless, there seems to be something else besides the control
characters involved in this issue, because as a further test, I
created a buffer consisting of more than 1000 copies of the test
string concatenating the Arabic example in etc/HELLO and "Hello", and
manually enclosed each Arabic word in the above control characters,
but the benchmark result in this buffer was not significantly
different from the result without the control characters (and similar
to the above result for the copy of the problematic file without the
control characters), and the display did not freeze.

Steve Berman