GNU bug report logs - #67926
29.1; fail to extract ZIP subfile named with [...]
Reported by: awrhygty <at> outlook.com
Date: Wed, 20 Dec 2023 11:24:02 UTC
Severity: normal
Found in version 29.1
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
Message #38 received at 67926 <at> debbugs.gnu.org:
> From: awrhygty <at> outlook.com
> Cc: 67926 <at> debbugs.gnu.org
> Date: Thu, 28 Dec 2023 22:09:23 +0900
>
> > Btw, your suggested changes required gzip and bunzip2 as external
> > programs to support the 2 most popular compression methods. Why
> > should we assume these are available more widely than unzip,
> > especially on Windows?
>
> When I installed UnxUtils years ago, it had bzip2 and gzip, but not
> unzip nor zip. Now I download it again, it has unzip and zip.

Windows systems don't come with UnxUtils installed anyway.

> My interest is how to avoid naming problems.
> There are more difficulties in Japanese.
> Japanese characters in file names are normally encoded in cp932.
> Encoded characters may have '[', '\' or ']' as a second byte.
> (encode-coding-string "ゼソゾ" 'cp932)
> => "\203[\203\\\203]"
> Subfiles of such names can not be extracted normally.

I don't think we can solve this in Emacs: non-ASCII file names in zip
archives are a mess, even before you consider the fact that zip
archives are frequently moved between systems. For starters, how can
one know in advance what is the encoding of file names in an arbitrary
zip archive? This will bite you even if we do everything in Emacs,
and even if someone does submit patches to implement all the
compression methods.
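
To make the two points above concrete, here is a minimal sketch in Python (chosen only for illustration; it is not part of Emacs or of any proposed patch). It takes the name from the quoted example, shows that the trailing byte of each cp932 character is an ASCII bracket or backslash, and then decodes the same bytes under other code pages. The candidate list is an assumption for the demo; cp437 is included because the ZIP format historically used it for file names.

```python
# Illustration only: the sample name comes from the quoted example,
# the list of candidate encodings is an assumption for this demo.
name = "ゼソゾ"

# Bytes as a cp932 (Japanese Windows) system would store the name.
raw = name.encode("cp932")

# Each character is two bytes; the trailing bytes sit in the ASCII range.
trail_bytes = [chr(b) for b in raw[1::2]]

# Nothing in the byte string itself says which encoding produced it,
# so each guess yields a different, equally "valid" file name.
candidates = {enc: raw.decode(enc) for enc in ("cp932", "cp437", "latin-1")}

print(raw)
print(trail_bytes)
for enc, text in candidates.items():
    print(enc, repr(text))
```

Only the cp932 guess recovers the original name. The other decodings also succeed without error, which is why a reader of an arbitrary archive cannot tell from the bytes alone that its guess was wrong.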
This bug report was last modified 1 year and 232 days ago.