Sorry about the delay. The bugs I was talking about earlier are:

1. (read FUNCTION) assumes latin-1, as discussed earlier in this bug.
   The code in readchar() just forgets to set the multibyte flag for
   function sources.

2. (read UNIBYTE-STRING) assumes latin-1:

     (read "\"a\xff\"") -> "aÿ"

   For buffer and marker sources, readchar() does

     if (! ASCII_CHAR_P (c))
       c = BYTE8_TO_CHAR (c);

   but this is missing for string sources.

3. (print UNIBYTE-SYM) assumes latin-1:

     (prin1-to-string (make-symbol "a\xff")) -> "aÿ"

   Here the reason is that print_object() calls
   `fetch_string_char_advance` instead of
   `fetch_string_char_as_multibyte_advance`.

The above three bugs are clear omissions and were never intended behaviour; a lot happened in the switch to multibyte and bugs were bound to appear in the cracks. There should be no downside from fixing them.

We may want to ask ourselves whether it's reasonable that read sources have a multibyteness, which affects how symbols are read but not string literals. I don't think it should affect either. However, I'm leaving this concern out of the immediate discussion.

I also have a patch that improves reader performance while cleaning up some parts of the code, but it can be applied before or after fixing the three bugs above.
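
For reference, here is a minimal standalone sketch of the conversion that bug 2 says is missing for string sources. It is not the actual patch. The `SKETCH_` macros and `sketch_unibyte_to_char` are hypothetical stand-ins for `ASCII_CHAR_P` and `BYTE8_TO_CHAR`, and the constants reflect my reading of character.h, so treat them as assumptions:

```c
#include <assert.h>

/* Hypothetical stand-ins for the Emacs macros ASCII_CHAR_P and
   BYTE8_TO_CHAR.  The values are assumed from character.h, where raw
   bytes 0x80..0xFF are taken to map to chars 0x3FFF80..0x3FFFFF.  */
#define SKETCH_ASCII_CHAR_P(c) ((unsigned) (c) < 0x80)
#define SKETCH_BYTE8_TO_CHAR(byte) ((byte) + 0x3FFF00)

/* Hypothetical helper: turn one byte fetched from a unibyte source
   into a character, the way readchar() already does for buffer and
   marker sources.  ASCII passes through unchanged; any other byte
   becomes a raw-byte character instead of a latin-1 code point.  */
static int
sketch_unibyte_to_char (int byte)
{
  assert (0 <= byte && byte <= 0xFF);
  int c = byte;
  if (! SKETCH_ASCII_CHAR_P (c))
    c = SKETCH_BYTE8_TO_CHAR (c);
  return c;
}
```

With this step in place, the byte 0xFF from a unibyte string would no longer come out as the character U+00FF (ÿ) but as the raw byte it actually is.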