tag 16944 notabug
thanks

On 03/05/2014 03:48 PM, Leslie Satenstein wrote:
> I have a problem with the sort utility that I cannot seem to do with
> sort.
>
> I have a file x (below) and I wish to sort only the first column
> according to the ascii table, in other words, a sort where the sort
> follows the
> A..Za..z and of course the other characters as well.
>
> I created this file x to illustrate the problem.
>
> This is First line of file x is a space character, the backspace char
> and the textHost=fedora20-leslie
>
> RAW Unsorted input (27 lines) filename x
>
> Host=fedora20-leslie | | scan
> from|/home/leslie/Development/scandir
> scandir.ini |20140223 1245|
> e2c713788f9492be9e61d1d0badcc8ca|/home/leslie/Development/scandir
> sha.c |20140223 1245|

Umm, your example file got corrupted by your mailer. So it's harder to
see what you are actually trying to sort, and what results you are
trying to get. Maybe you should actually attach your file 'x' instead
of pasting it inline where it gets corrupted. Also, when you say
"column 1", did you really mean "field 1" (which occupies multiple
character columns) rather than just the literal first character?

> The sort order is not correct with folding, the missing line with the x
> has returned and my header line remains in row 1,
> BUT...
> I am after an ascii sequence sort and out of place are the rows with
> DATE1 and DATE2. They should actually appears as lines 2 and 3.

Are you setting locale environment variables correctly? The only way to
guarantee ASCII collation is to use a locale that enforces it. Many
distros these days default to an en_US.UTF-8 locale (or similar) which
intentionally does NOT do ascii collation; to override that, you
probably want to try 'LC_ALL=C sort ...'

https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021

>
> How do I get the sort to respect the ascii sorting sequence? I can do
> so for later fields such as sorting any other column such as ...
> sort -fb -t '|' -k2 x to sort -fb -t '|' k4 x

This looks very suspicious (not to mention a typo - you mention 'k4'
when you probably typed '-k4' - it's in your best interest to be more
accurate when reporting difficulties you are having). You usually want
to use '-k2,2' and not the simpler '-k2' (the longer version sorts on
exactly field 2, while the shorthand treats field 2 and then on to the
end of the line all as one key). You may want to try the 'sort --debug'
flag to see exactly what sort is using during its checks, to make sure
it is choosing sort keys that line up with what you think it should.

>
> My observation is that there does not appear to be an option that allows
> me to sort by column 1 without shifting to the left of the all the
> leading whitespace characters.

I didn't parse that - if you are eliding leading whitespace, then you
are not sorting by column 1, but by the first non-whitespace character.
Oh - maybe you meant sorting by "field 1", which is spelled '-k1,1' (or
-k1,1b if you want to ignore leading blanks), and optionally with -t in
effect to force field separation to match your expectations instead of
occurring on non-blank to blank transitions.

> -
> If I have found a shortcoming, I would like to propose a new flag so
> that the sort would actually generate the first column in pure ascii
> sequence.

Sort already has a POSIX-mandated option to force pure ascii sorting:

LC_ALL=C sort ...

Therefore, I'm closing this as not a bug. But feel free to ask further
questions or provide better details of what you are trying to do, in a
way that does not get munged by your mailer.

-- 
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
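
For illustration, here is a minimal sketch of the two suggestions above: forcing byte-order collation with LC_ALL=C, and the difference between an open-ended key ('-k2') and a bounded key ('-k2,2'). The sample lines and variable names are hypothetical stand-ins, not the contents of the original file 'x', which did not survive the mailer.

```shell
# Hypothetical sample data; NOT the poster's file 'x'.
names=$(printf '%s\n' 'banana|2' 'Apple|3' 'apple|1' 'Banana|4')

# With LC_ALL=C, sort compares raw bytes, so every uppercase ASCII
# letter sorts before every lowercase one (A..Z, then a..z).
ascii_sorted=$(printf '%s\n' "$names" | LC_ALL=C sort -t '|' -k1,1)
printf '%s\n' "$ascii_sorted"

# Second hypothetical data set: field 2 is identical on both lines.
rows=$(printf '%s\n' 'x|b|1' 'y|b|0')

# '-k2' uses field 2 through end of line as the key ("b|1" vs "b|0"),
# so the trailing field decides the order.
open_key=$(printf '%s\n' "$rows" | LC_ALL=C sort -t '|' -k2)

# '-k2,2' uses only field 2 ("b" vs "b"); the keys tie, and sort falls
# back to comparing the whole line as a last resort.
closed_key=$(printf '%s\n' "$rows" | LC_ALL=C sort -t '|' -k2,2)

printf 'open key (-k2):\n%s\n' "$open_key"
printf 'bounded key (-k2,2):\n%s\n' "$closed_key"
```

With GNU sort, adding '--debug' to either of the last two commands should annotate which part of each line was used as the key, which is a quick way to confirm the difference on real data.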