GNU bug report logs -
#46933
Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Message #26 received at 46933 <at> debbugs.gnu.org (full text, mbox):
In article <83pmzkog6x.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:
> > How about something like this method:
> > 1. Encode the buffer text line by line until we get a longer byte
> > sequence than BYTE.
> > 2. Delete the result of encoding the last line above.
> > 3. Provided that the above last line has chars C1 C2 ... Cn,
> > encode characters C1...Cn, C1...Cn-1, C1...Cn-2 until we get a shorter
> > byte sequence than BYTE.
> >
> > The first step may be optimized by encoding multiple lines instead of
> > a single line.
> Even if we do optimize, this would be very slow, I think.
Whether it is too slow or not depends on what filepos-to-bufferpos is
used for. Do you know why filepos-to-bufferpos (and
bufferpos-to-filepos) were introduced?
> And what if the buffer has no newlines?
In that case, just do step 2. Or we can use the bisection
technique.
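
As a rough illustration of the line scan followed by bisection, here is a
minimal sketch in Python rather than Emacs Lisp. The names `encoded_len` and
`filepos_to_charpos` are hypothetical, both the byte offset and the returned
character index are 0-based (Emacs buffer positions start at 1), and the
bisection assumes that the encoded length of a prefix never shrinks when a
character is appended. Every probe re-encodes from the start of the text, so
this shows the idea only and says nothing about speed.

```python
import codecs


def encoded_len(text, encoding):
    """Number of bytes produced by encoding text from the start.

    An incremental encoder is used with final=False, so the count reflects
    what has been written so far, not a finished stream.
    """
    encoder = codecs.getincrementalencoder(encoding)()
    return len(encoder.encode(text))


def filepos_to_charpos(text, byte_offset, encoding="utf-8"):
    """Return the 0-based index of the character covering byte_offset.

    Returns len(text) when byte_offset lies past the encoded text.
    """
    # Step 1: grow the prefix line by line until its encoding is
    # longer than byte_offset.
    line_start = 0
    line_end = 0
    for line in text.splitlines(keepends=True):
        line_end = line_start + len(line)
        if encoded_len(text[:line_end], encoding) > byte_offset:
            break
        line_start = line_end
    else:
        # No prefix was long enough (this also covers empty text).
        return len(text)

    # Steps 2-3, bisection variant: discard the last line and search inside it.
    # Invariant: encoded_len(text[:lo]) <= byte_offset < encoded_len(text[:hi])
    lo, hi = line_start, line_end
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if encoded_len(text[:mid], encoding) > byte_offset:
            hi = mid
        else:
            lo = mid
    return lo
```

A text without newlines is treated as a single line, so the function goes
straight to the bisection, e.g. `filepos_to_charpos("héllo", 2)` lands on the
`é`, which occupies bytes 1 and 2 in UTF-8.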
> In any case, the problem is not with encoding, the problem is with
> decoding. Encoding doesn't have this problem because we always encode
> more than enough (we use the value of BYTE as the count of
> _characters_ to encode, so for ISO-2022 encoding it is usually much
> more than needed). By contrast, when decoding, we decode exactly
> BYTE+1 bytes, which then hits the problem if that offset is inside a
> shift sequence.
Then, that implementation should be changed.
Any coding system can have :post-read-conversion and
:pre-write-conversion functions, so it is not guaranteed that the encoded
byte length is greater than the number of characters.
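
To illustrate the decoding problem quoted above, here is a small Python
sketch that uses UTF-8 multi-byte sequences as a stand-in for ISO-2022 shift
sequences (an analogy only, not the Emacs decoder). The helper name
`chars_in_first_bytes` is hypothetical. It decodes exactly the first N bytes
and counts the characters obtained; when the cut falls inside a multi-byte
sequence, the count does not advance.

```python
import codecs


def chars_in_first_bytes(data, nbytes, encoding="utf-8"):
    """Count the characters obtained by decoding only the first nbytes bytes."""
    decoder = codecs.getincrementaldecoder(encoding)()
    # final=False: an incomplete trailing sequence is buffered by the
    # decoder instead of being reported as an error.
    return len(decoder.decode(data[:nbytes]))


data = "añb".encode("utf-8")  # 4 bytes: 'a', two bytes for 'ñ', 'b'
counts = [chars_in_first_bytes(data, n) for n in range(len(data) + 1)]
print(counts)  # [0, 1, 1, 2, 3]
```

Cutting after 2 bytes ends in the middle of `ñ`, so only one character comes
out; a position computed from that count is off unless the caller accounts
for the buffered partial sequence.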
---
K. Handa
handa <at> gnu.org
This bug report was last modified 3 years and 52 days ago.