On 12/05/2014 08:00 AM, Jim Meyering wrote:
>>
>> I deny this is desirable behavior and I doubt there is a security issue as
>> described. If any other, independent software has a security issue with
>> non-UTF-8 input, it should decide itself to filter it and use accordingly
>> stable decoding functions. It cannot be the task of any tool (grep in this
>> case) to filter output to work around possible security issues in other
>> programs in a pipe. This would be completely against the concept of pipes in
>> the Unix tradition.
>
> This is another side effect of using a multibyte locale.
> As long as there are no NUL bytes in your input, you can work
> around the issue by running grep in the C locale:
>
>   LC_ALL=C grep ...

Yes, the C locale has the nice effect of EVERY byte being a valid single
byte character, leaving only NUL bytes and a non-empty file not ending
in newline as the only reasons for a file to be marked binary.

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
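
To make the locale workaround concrete, here is a minimal sketch of the difference. The sample text, the temporary file, and the locale name en_US.UTF-8 are assumptions for illustration and are not from the thread; the exact message printed in the UTF-8 case depends on the grep version and on whether that locale is installed.

```shell
# Hypothetical sample file: \351 is the Latin-1 byte for "é", which is
# not a valid UTF-8 sequence on its own.
sample=$(mktemp)
printf 'caf\351 au lait\n' > "$sample"

# C locale: every byte is a valid single-byte character, so with no NUL
# bytes in the input grep treats the file as text and prints the line.
c_out=$(LC_ALL=C grep 'lait' "$sample")
printf '%s\n' "$c_out"

# UTF-8 locale (assumed to be installed under this name): the same byte
# is an encoding error, so grep may report a binary-file match instead
# of printing the line.
LC_ALL=en_US.UTF-8 grep 'lait' "$sample" || true

rm -f "$sample"
```

Only the second invocation is locale-sensitive here; the first is the workaround quoted above applied to a file that contains no NUL bytes.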