Oh, one further solution:

- Document more thoroughly, in the manpage and in --help, what -u really does, and especially that it may not behave as expected with other locales/collations. Perhaps even give an example, so that people understand how serious this is.
- Add a companion option, maybe -U, which removes *only* those lines that are really binary identical, i.e. whose bytes between the two newlines (\n) match exactly.

Cheers,
Chris.
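
To make the difference concrete, here is a minimal sketch in Python rather than in sort itself. It rests on loud assumptions: the collation is a made-up stand-in that simply ignores ASCII case (it is not any real locale), the function names are hypothetical, and -U is only the option name suggested in this message, not an existing flag.

```python
from itertools import groupby


def toy_collation_key(line: bytes) -> bytes:
    # Hypothetical stand-in for a locale's collation: it ignores ASCII case,
    # so b"abc" and b"ABC" compare as equal. Real locales behave differently.
    return line.lower()


def unique_by_collation(lines):
    """Rough model of -u: keep the first line of each run that compares equal."""
    ordered = sorted(lines, key=toy_collation_key)  # sorted() is stable
    return [next(group) for _, group in groupby(ordered, key=toy_collation_key)]


def unique_by_bytes(lines):
    """Rough model of the suggested -U: drop only byte-identical lines."""
    # Tie-break on the raw bytes so that identical lines end up adjacent.
    ordered = sorted(lines, key=lambda line: (toy_collation_key(line), line))
    return [line for line, _ in groupby(ordered)]


lines = [b"abc", b"ABC", b"abc", b"xyz"]
print(unique_by_collation(lines))  # b"ABC" is dropped although its bytes differ
print(unique_by_bytes(lines))      # only the second b"abc" is dropped
```

With the toy collation, the first call silently loses the line "ABC", which is the kind of surprise the documentation should warn about, while the second keeps every line that differs in at least one byte.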