GNU bug report logs - #15597
bug-parted Digest, Vol 131, Issue 9


Package: parted

Reported by: Rod Smith <rodsmith <at> rodsbooks.com>

Date: Sat, 12 Oct 2013 16:34:01 UTC

Severity: normal

Merged with 15591

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.




Report forwarded to bug-parted <at> gnu.org:
bug#15597; Package parted. (Sat, 12 Oct 2013 16:34:01 GMT)

Acknowledgement sent to Rod Smith <rodsmith <at> rodsbooks.com>:
New bug report received and forwarded. Copy sent to bug-parted <at> gnu.org. (Sat, 12 Oct 2013 16:34:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Rod Smith <rodsmith <at> rodsbooks.com>
To: bug-parted <at> gnu.org
Subject: Re: bug-parted Digest, Vol 131, Issue 9
Date: Sat, 12 Oct 2013 12:33:12 -0400

On 10/12/2013 12:01 PM, Phillip Susi <psusi <at> ubuntu.com> wrote:

> The GPT partition table has 16 bit characters for the name, which I
> assume are supposed to be UTF-16, but the bloody UEFI standard is mute
> on the subject.

The standard says they're "strings," and the default for strings in UEFI 
is UTF-16LE/UCS-2.

> Currently parted simply decimates the characters,
> throwing out the upper 8 bits.  This corrupts characters that aren't
> simple ASCII, and at some later point, strlist.c calls mbsrtowcs(),
> which chokes on the corrupt name causing parted to bail out with
> "Error during translation".
>
> I think that gpt.c needs to translate the UTF-16 to the native
> multibyte encoding, but I have no idea how to do that.  The C standard
> conversion functions all seem to use the current locale and don't have
> a way to override it if you know this string is in UTF-16 ( and maybe
> the current locale is UTF-8 ).

I agree with you. I haven't studied the parted code on this score, so I 
don't have any specific suggestions for how to do it in parted. I can 
offer my experiences with doing it in GPT fdisk 
(http://www.rodsbooks.com/gdisk/), though: I used libicu 
(http://site.icu-project.org/) to do the translation. This seems to work 
pretty well -- at least, it produces results that are interoperable 
with what Apple's tools do. You can check the gdisk source code, and 
particularly the gptpart.cc file, to see how gdisk does it. Search for 
"UnicodeString" to find what it does. It's been a while since I added 
libicu support, and I haven't made many changes to it since then, so I 
don't recall every detail of what I did. I seem to recall that it wasn't 
really very hard, but I did need to change quite a few output functions 
to use the libicu calls.
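
For readers who want to see the shape of the conversion without pulling
in a library, here is a minimal sketch in plain C. It is NOT how gdisk
does it (gdisk goes through libicu's UnicodeString), and it is not taken
from parted. The function name utf16le_to_utf8 and its signature are
made up for illustration. It decodes little-endian UTF-16 code units by
hand, so it does not depend on the current locale, and it always emits
UTF-8. That only matches the "native multibyte encoding" when the locale
is a UTF-8 one, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from parted or gdisk).
 * Converts n_units UTF-16 code units, stored little-endian in src,
 * into a NUL-terminated UTF-8 string in dst.
 * Returns 0 on success, -1 on an unpaired surrogate or a too-small dst. */
static int
utf16le_to_utf8 (const uint8_t *src, size_t n_units, char *dst, size_t dst_size)
{
  size_t out = 0;

  assert (src != NULL && dst != NULL);
  for (size_t i = 0; i < n_units; i++)
    {
      /* Assemble one code unit from two bytes, low byte first. */
      uint32_t cp = (uint32_t) src[2 * i] | ((uint32_t) src[2 * i + 1] << 8);
      if (cp == 0)
        break;                  /* treat a NUL unit as end of name */

      if (cp >= 0xD800 && cp <= 0xDBFF)
        {
          /* High surrogate: must be followed by a low surrogate. */
          if (i + 1 >= n_units)
            return -1;
          uint32_t lo = (uint32_t) src[2 * (i + 1)]
                        | ((uint32_t) src[2 * (i + 1) + 1] << 8);
          if (lo < 0xDC00 || lo > 0xDFFF)
            return -1;
          cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
          i++;                  /* consumed the low surrogate too */
        }
      else if (cp >= 0xDC00 && cp <= 0xDFFF)
        return -1;              /* low surrogate with no high one before it */

      /* Number of UTF-8 bytes for this code point, plus room for NUL. */
      size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
      if (out + need + 1 > dst_size)
        return -1;

      if (need == 1)
        dst[out++] = (char) cp;
      else if (need == 2)
        {
          dst[out++] = (char) (0xC0 | (cp >> 6));
          dst[out++] = (char) (0x80 | (cp & 0x3F));
        }
      else if (need == 3)
        {
          dst[out++] = (char) (0xE0 | (cp >> 12));
          dst[out++] = (char) (0x80 | ((cp >> 6) & 0x3F));
          dst[out++] = (char) (0x80 | (cp & 0x3F));
        }
      else
        {
          dst[out++] = (char) (0xF0 | (cp >> 18));
          dst[out++] = (char) (0x80 | ((cp >> 12) & 0x3F));
          dst[out++] = (char) (0x80 | ((cp >> 6) & 0x3F));
          dst[out++] = (char) (0x80 | (cp & 0x3F));
        }
    }

  if (out >= dst_size)
    return -1;                  /* no room even for the terminator */
  dst[out] = '\0';
  return 0;
}
```

A caller would pass the raw name bytes from the partition entry and a
buffer sized for the worst case (up to 3 UTF-8 bytes per code unit, plus
the terminator). Targeting a non-UTF-8 locale would need something like
iconv or libicu on top of this.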

FWIW, when I added libicu support to gdisk, I kept the option to compile 
without libicu, in which case gdisk mangles non-ASCII characters in much 
the way parted does. Thus, you'll see both sets of code in gdisk. As a 
practical matter, libicu is a rather large library, and some developers 
of small emergency disks don't want to include it, so keeping the option 
to not use libicu is worthwhile.
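
As a rough illustration of keeping the library optional, the sketch
below puts the choice behind a build-time switch. Everything named here
is hypothetical: HAVE_UNICODE_LIB, gpt_name_to_display and its signature
are invented for the example and are not gdisk's or parted's actual
names. It also assumes the code units have already been read into
host-order uint16_t values. The fallback differs from plain truncation
in one respect, which is a design choice of this sketch rather than
anything described above: it substitutes '?' for non-ASCII units instead
of keeping the low byte, so the result is at least valid input for later
multibyte decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef HAVE_UNICODE_LIB   /* hypothetical build-time switch */

/* Built with a Unicode library: a full converter with this signature
 * is assumed to be provided elsewhere in the program. */
int gpt_name_to_display (const uint16_t *units, size_t n,
                         char *dst, size_t dst_size);

#else

/* Built without the library: lossy ASCII-only fallback.
 * Copies ASCII units unchanged and writes '?' for everything else
 * (a surrogate pair therefore becomes "??").
 * Returns 0 on success, -1 if dst is too small. */
int
gpt_name_to_display (const uint16_t *units, size_t n,
                     char *dst, size_t dst_size)
{
  size_t out = 0;

  assert (units != NULL && dst != NULL);
  if (dst_size == 0)
    return -1;
  for (size_t i = 0; i < n && units[i] != 0; i++)
    {
      if (out + 1 >= dst_size)
        return -1;              /* keep one byte for the terminator */
      dst[out++] = units[i] < 0x80 ? (char) units[i] : '?';
    }
  dst[out] = '\0';
  return 0;
}

#endif
```

Because both builds expose the same function, the output code does not
need to know which one it got.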

Note that some values are invalid even with libicu, so there's a 
possibility that you'll run into error conditions, whether using libicu 
or not. Obviously, sane error handling is better than having the code 
bail out.
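
One way to get there is to check the name before converting it and let
the caller decide what to do. The sketch below assumes that "invalid"
means ill-formed UTF-16, i.e. unpaired surrogates. That is my reading,
not something spelled out above, and there may be other invalid cases.
The function name utf16_first_invalid is hypothetical, and the units are
assumed to be in host byte order already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from parted or gdisk).
 * Scans up to n UTF-16 code units, stopping at the first NUL unit.
 * Returns the index of the first unpaired surrogate, or -1 if the
 * scanned units are well-formed UTF-16. */
static long
utf16_first_invalid (const uint16_t *units, size_t n)
{
  assert (units != NULL);
  for (size_t i = 0; i < n && units[i] != 0; i++)
    {
      uint16_t u = units[i];
      if (u >= 0xD800 && u <= 0xDBFF)
        {
          /* High surrogate: valid only if a low surrogate follows. */
          if (i + 1 < n && units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF)
            i++;                /* skip the low half of the pair */
          else
            return (long) i;
        }
      else if (u >= 0xDC00 && u <= 0xDFFF)
        return (long) i;        /* low surrogate with no high one before it */
    }
  return -1;
}
```

With a check like this, a caller can print a warning and show a
placeholder name for the affected partition, and carry on listing the
rest of the table.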

-- 
Rod Smith
rodsmith <at> rodsbooks.com
http://www.rodsbooks.com




Merged 15591 15597. Request was from Phillip Susi <psusi <at> ubuntu.com> to control <at> debbugs.gnu.org. (Mon, 14 Oct 2013 17:51:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 21 Jan 2014 12:24:03 GMT)

This bug report was last modified 11 years and 210 days ago.


