From unknown Wed Jun 18 23:14:20 2025 X-Loop: help-debbugs@gnu.org Subject: bug#9236: Fwd: Join Resent-From: "David Gast" Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Thu, 04 Aug 2011 04:16:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 9236 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: 9236@debbugs.gnu.org X-Debbugs-Original-To: bug-coreutils@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.131243131230842 (code B ref -1); Thu, 04 Aug 2011 04:16:02 +0000 Received: (at submit) by debbugs.gnu.org; 4 Aug 2011 04:15:12 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QopKx-00081L-UD for submit@debbugs.gnu.org; Thu, 04 Aug 2011 00:15:12 -0400 Received: from eggs.gnu.org ([140.186.70.92]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QonzM-00069q-6M for submit@debbugs.gnu.org; Wed, 03 Aug 2011 22:48:48 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Qonyk-0001wL-P9 for submit@debbugs.gnu.org; Wed, 03 Aug 2011 22:48:11 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-3.3 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_LOW, RP_MATCHES_RCVD autolearn=unavailable version=3.3.1 Received: from lists.gnu.org ([140.186.70.17]:34053) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Qonyk-0001wH-MY for submit@debbugs.gnu.org; Wed, 03 Aug 2011 22:48:10 -0400 Received: from eggs.gnu.org ([140.186.70.92]:57195) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Qonyj-0006Hz-P6 for bug-coreutils@gnu.org; Wed, 03 Aug 2011 22:48:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Qonyi-0001vw-9T for bug-coreutils@gnu.org; Wed, 03 Aug 2011 22:48:09 -0400 Received: 
from iron2.its.csulb.edu ([134.139.1.35]:52127) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Qonyi-0001vs-2s for bug-coreutils@gnu.org; Wed, 03 Aug 2011 22:48:08 -0400
Received: from its-cgpb02.csulb.edu (HELO csulb.edu) ([134.139.16.6]) by iron2.its.csulb.edu with ESMTP; 03 Aug 2011 19:48:06 -0700
Received: from [71.160.180.217] (account dgast@csulb.edu) by its-cgpb02.csulb.edu (CommuniGate Pro WEBUSER 5.3.12) with HTTP id 7694701 for bug-coreutils@gnu.org; Wed, 03 Aug 2011 19:48:06 -0700
From: "David Gast"
X-Mailer: CommuniGate Pro WebUser v5.3.12
Date: Wed, 03 Aug 2011 19:48:06 -0700
Message-ID:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="_===7694701====its-cgpb02.csulb.edu===_"
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-Received-From: 140.186.70.17
X-Spam-Score: -6.6 (------)
X-Mailman-Approved-At: Thu, 04 Aug 2011 00:15:09 -0400
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -6.6 (------)

This is a multi-part MIME message

--_===7694701====its-cgpb02.csulb.edu===_
Content-Type: text/plain;charset=iso-8859-1; format="flowed"
Content-Transfer-Encoding: 8bit

Oops, I hit the wrong button ...

cat > /tmp/x <<!
b
a
!
ln /tmp/x /tmp/y
sort -c /tmp/x
join --check-order /tmp/x /tmp/y
# Note: The two files do not have to be the same.

Output is

sort: /tmp/x:2: disorder: a
join: file 1 is not in sorted order

--_===7694701====its-cgpb02.csulb.edu===_
Subject: Join
To: bug-coreutils@gnu.org
X-Mailer: CommuniGate Pro WebUser v5.3.12
Date: Wed, 03 Aug 2011 19:43:31 -0700
Message-ID:
MIME-Version: 1.0
Content-Type: text/plain;charset=iso-8859-1; format="flowed"
Content-Transfer-Encoding: 8bit

When there is disorder, could you please provide the line number like the command and option sort -c does?

Note: join seems to report disorder in file 2 only if there is no disorder in file 1.
You try the following code Thanks --_===7694701====its-cgpb02.csulb.edu===_-- From unknown Wed Jun 18 23:14:20 2025 X-Loop: help-debbugs@gnu.org Subject: bug#9236: Fwd: Join Resent-From: Eric Blake Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Thu, 04 Aug 2011 14:53:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 9236 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: David Gast Cc: 9236@debbugs.gnu.org Received: via spool by 9236-submit@debbugs.gnu.org id=B9236.131246953029750 (code B ref 9236); Thu, 04 Aug 2011 14:53:02 +0000 Received: (at 9236) by debbugs.gnu.org; 4 Aug 2011 14:52:10 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QozHO-0007jm-7j for submit@debbugs.gnu.org; Thu, 04 Aug 2011 10:52:10 -0400 Received: from mx1.redhat.com ([209.132.183.28]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QozHK-0007je-NC for 9236@debbugs.gnu.org; Thu, 04 Aug 2011 10:52:08 -0400 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p74EpPIi009247 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 4 Aug 2011 10:51:25 -0400 Received: from [10.3.113.47] (ovpn-113-47.phx2.redhat.com [10.3.113.47]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id p74EpP1t028344; Thu, 4 Aug 2011 10:51:25 -0400 Message-ID: <4E3AB1EC.9080605@redhat.com> Date: Thu, 04 Aug 2011 08:51:24 -0600 From: Eric Blake Organization: Red Hat User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.18) Gecko/20110621 Fedora/3.1.11-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.11 MIME-Version: 1.0 References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit 
X-Scanned-By: MIMEDefang 2.67 on 10.5.11.12
X-Spam-Score: -10.3 (----------)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -10.3 (----------)

merge 9235 9236
thanks

On 08/03/2011 08:48 PM, David Gast wrote:
> Oops, I hit the wrong button ...
>
> cat > /tmp/x <<!
> b
> a
> !
> ln /tmp/x /tmp/y
> sort -c /tmp/x
> join --check-order /tmp/x /tmp/y
> # Note: The two files do not have to be the same.
>
> Output is
>
> sort: /tmp/x:2: disorder: a
> join: file 1 is not in sorted order

This sounds like a reasonable idea! Would you like to contribute the patch?

-- 
Eric Blake eblake@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org

From unknown Wed Jun 18 23:14:20 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#9236: Fwd: Join
Resent-From: Jim Meyering
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Thu, 04 Aug 2011 17:50:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 9236
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords:
To: Eric Blake
Cc: 9236@debbugs.gnu.org, David Gast
Received: via spool by 9236-submit@debbugs.gnu.org id=B9236.131248014513310 (code B ref 9236); Thu, 04 Aug 2011 17:50:02 +0000
Received: (at 9236) by debbugs.gnu.org; 4 Aug 2011 17:49:05 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Qp22a-0003Sd-Uh for submit@debbugs.gnu.org; Thu, 04 Aug 2011 13:49:05 -0400
Received: from mx.meyering.net ([82.230.74.64]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Qp22X-0003SE-H3 for 9236@debbugs.gnu.org; Thu, 04 Aug 2011 13:49:03 -0400
Received: from rho.meyering.net (localhost.localdomain
[127.0.0.1]) by rho.meyering.net (Acme Bit-Twister) with ESMTP id 205A760098; Thu, 4 Aug 2011 19:48:20 +0200 (CEST)
From: Jim Meyering
In-Reply-To: <4E3AB1EC.9080605@redhat.com> (Eric Blake's message of "Thu, 04 Aug 2011 08:51:24 -0600")
References: <4E3AB1EC.9080605@redhat.com>
Date: Thu, 04 Aug 2011 19:48:20 +0200
Message-ID: <87k4atoyor.fsf@rho.meyering.net>
Lines: 136
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -6.1 (------)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -6.1 (------)

Eric Blake wrote:
> merge 9235 9236
> thanks
>
> On 08/03/2011 08:48 PM, David Gast wrote:
>> Oops, I hit the wrong button ...
>>
>> cat > /tmp/x <<!
>> b
>> a
>> !
>> ln /tmp/x /tmp/y
>> sort -c /tmp/x
>> join --check-order /tmp/x /tmp/y
>> # Note: The two files do not have to be the same.
>>
>> Output is
>>
>> sort: /tmp/x:2: disorder: a
>> join: file 1 is not in sorted order
>
> This sounds like a reasonable idea! Would you like to contribute the patch?

I started looking at this, and among other things saw
a diagnostic that mentioned "file 1", which would do
much better to mention the actual file name, so embarked.
Here's a preliminary patch (not even a decent ChangeLog entry and the join test still needs to be updated): $ printf '%s\n' b a c > in $ ./join --check-order in in ./join: in:2: is not sorted: a [Exit 1] >From adf709ba6a8d934e8f90cafada824221e1c6eb18 Mon Sep 17 00:00:00 2001 From: Jim Meyering Date: Thu, 4 Aug 2011 19:31:50 +0200 Subject: [PATCH] join: FIXME: check: print both file name and line number --- src/join.c | 29 +++++++++++++++++++---------- 1 files changed, 19 insertions(+), 10 deletions(-) diff --git a/src/join.c b/src/join.c index 99d918f..368d0db 100644 --- a/src/join.c +++ b/src/join.c @@ -89,6 +89,12 @@ struct seq /* The previous line read from each file. */ static struct line *prevline[2] = {NULL, NULL}; +/* The number of lines read from each file. */ +static uintmax_t line_no[2] = {0, 0}; + +/* The input file names. */ +static char *g_names[2]; + /* This provides an extra line buffer for each file. We need these if we try to read two consecutive lines into the same buffer, since we don't want to overwrite the previous buffer before we check order. */ @@ -386,7 +392,10 @@ check_order (const struct line *prev, { error ((check_input_order == CHECK_ORDER_ENABLED ? EXIT_FAILURE : 0), - 0, _("file %d is not in sorted order"), whatfile); + 0, _("%s:%ju: is not sorted: %.*s"), + g_names[whatfile], line_no[whatfile], + current->buf.length-1, /* FIXME should be int */ + current->buf.buffer); /* If we get to here, the message was just a warning, but we want only to issue it once. */ @@ -436,6 +445,7 @@ get_line (FILE *fp, struct line **linep, int which) freeline (line); return false; } + ++line_no[which]; xfields (line); @@ -980,7 +990,6 @@ main (int argc, char **argv) int prev_optc_status = MUST_BE_OPERAND; int operand_status[2]; int joption_count[2] = { 0, 0 }; - char *names[2]; FILE *fp1, *fp2; int optc; int nfiles = 0; @@ -1100,7 +1109,7 @@ main (int argc, char **argv) break; case 1: /* Non-option argument. 
*/ - add_file_name (optarg, names, operand_status, joption_count, + add_file_name (optarg, g_names, operand_status, joption_count, &nfiles, &prev_optc_status, &optc_status); break; @@ -1122,7 +1131,7 @@ main (int argc, char **argv) /* Process any operands after "--". */ prev_optc_status = MUST_BE_OPERAND; while (optind < argc) - add_file_name (argv[optind++], names, operand_status, joption_count, + add_file_name (argv[optind++], g_names, operand_status, joption_count, &nfiles, &prev_optc_status, &optc_status); if (nfiles != 2) @@ -1148,20 +1157,20 @@ main (int argc, char **argv) if (join_field_2 == SIZE_MAX) join_field_2 = 0; - fp1 = STREQ (names[0], "-") ? stdin : fopen (names[0], "r"); + fp1 = STREQ (g_names[0], "-") ? stdin : fopen (g_names[0], "r"); if (!fp1) - error (EXIT_FAILURE, errno, "%s", names[0]); - fp2 = STREQ (names[1], "-") ? stdin : fopen (names[1], "r"); + error (EXIT_FAILURE, errno, "%s", g_names[0]); + fp2 = STREQ (g_names[1], "-") ? stdin : fopen (g_names[1], "r"); if (!fp2) - error (EXIT_FAILURE, errno, "%s", names[1]); + error (EXIT_FAILURE, errno, "%s", g_names[1]); if (fp1 == fp2) error (EXIT_FAILURE, errno, _("both files cannot be standard input")); join (fp1, fp2); if (fclose (fp1) != 0) - error (EXIT_FAILURE, errno, "%s", names[0]); + error (EXIT_FAILURE, errno, "%s", g_names[0]); if (fclose (fp2) != 0) - error (EXIT_FAILURE, errno, "%s", names[1]); + error (EXIT_FAILURE, errno, "%s", g_names[1]); if (issued_disorder_warning[0] || issued_disorder_warning[1]) exit (EXIT_FAILURE); -- 1.7.4.4 From unknown Wed Jun 18 23:14:20 2025 X-Loop: help-debbugs@gnu.org Subject: bug#9236: Fwd: Join Resent-From: Jim Meyering Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Sat, 06 Aug 2011 19:42:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 9236 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: Eric Blake Cc: 
9236@debbugs.gnu.org, David Gast Received: via spool by 9236-submit@debbugs.gnu.org id=B9236.131265966616186 (code B ref 9236); Sat, 06 Aug 2011 19:42:01 +0000 Received: (at 9236) by debbugs.gnu.org; 6 Aug 2011 19:41:06 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Qpmk5-0004D0-IW for submit@debbugs.gnu.org; Sat, 06 Aug 2011 15:41:05 -0400 Received: from mx.meyering.net ([82.230.74.64]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Qpmk2-0004Ct-NA for 9236@debbugs.gnu.org; Sat, 06 Aug 2011 15:41:04 -0400 Received: from rho.meyering.net (localhost.localdomain [127.0.0.1]) by rho.meyering.net (Acme Bit-Twister) with ESMTP id 967F86005B; Sat, 6 Aug 2011 21:40:09 +0200 (CEST) From: Jim Meyering In-Reply-To: <87k4atoyor.fsf@rho.meyering.net> (Jim Meyering's message of "Thu, 04 Aug 2011 19:48:20 +0200") References: <4E3AB1EC.9080605@redhat.com> <87k4atoyor.fsf@rho.meyering.net> Date: Sat, 06 Aug 2011 21:40:09 +0200 Message-ID: <87wreql46e.fsf@rho.meyering.net> Lines: 210 MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -6.1 (------) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -6.1 (------) Jim Meyering wrote: ... > I started looking at this, and among other things saw > a diagnostic that mentioned "file 1", which would do > much better to mention the actual file name, so embarked. 
> Here's a preliminary patch (not even a decent ChangeLog entry > and the join test still needs to be updated): > > $ printf '%s\n' b a c > in > $ ./join --check-order in in > ./join: in:2: is not sorted: a > [Exit 1] > > Subject: [PATCH] join: FIXME: check: print both file name and line number > > --- > src/join.c | 29 +++++++++++++++++++---------- > 1 files changed, 19 insertions(+), 10 deletions(-) Here's a much better patch. >From 2e4ca5100dcc3229e9937c48aed3dc475bb507ea Mon Sep 17 00:00:00 2001 From: Jim Meyering Date: Thu, 4 Aug 2011 19:31:50 +0200 Subject: [PATCH] join: with --check-order print offending file name, line number and data * src/join (g_names): New global (was main's "names"). (main): Update all uses of "names". (line_no[2]): New globals. (get_line): Increment after reading each line. (check_order): Print the standard "file name:line_no: " prefix as well as the offending line when reporting disorder. Here is a sample old/new comparison: -join: file 1 is not in sorted order +join: in:4: is not sorted: contents-of-line-4 * tests/misc/join: Change the two affected tests to expect the new diagnostic. Add new tests for more coverage: mismatch in file 2, two diagnostics, zero-length out-of-order line. * NEWS (Improvements): Mention it. --- NEWS | 3 +++ src/join.c | 43 ++++++++++++++++++++++++++++++------------- tests/misc/join | 20 ++++++++++++++++++-- 3 files changed, 51 insertions(+), 15 deletions(-) diff --git a/NEWS b/NEWS index 2e48497..6e24f5c 100644 --- a/NEWS +++ b/NEWS @@ -66,6 +66,9 @@ GNU coreutils NEWS -*- outline -*- df now supports disk partitions larger than 4 TiB on MacOS X 10.5 or newer and on AIX 5.2 or newer. + join --check-order now prints "join: FILE:LINE_NUMBER: bad_line" for an + unsorted input, rather than e.g., "join: file 1 is not in sorted order". + shuf outputs small subsets of large permutations much more efficiently. For example `shuf -i1-$((2**32-1)) -n2` no longer exhausts memory. 
diff --git a/src/join.c b/src/join.c index 99d918f..694fb55 100644 --- a/src/join.c +++ b/src/join.c @@ -86,9 +86,15 @@ struct seq struct line **lines; }; -/* The previous line read from each file. */ +/* The previous line read from each file. */ static struct line *prevline[2] = {NULL, NULL}; +/* The number of lines read from each file. */ +static uintmax_t line_no[2] = {0, 0}; + +/* The input file names. */ +static char *g_names[2]; + /* This provides an extra line buffer for each file. We need these if we try to read two consecutive lines into the same buffer, since we don't want to overwrite the previous buffer before we check order. */ @@ -384,12 +390,23 @@ check_order (const struct line *prev, size_t join_field = whatfile == 1 ? join_field_1 : join_field_2; if (keycmp (prev, current, join_field, join_field) > 0) { + /* Exclude any trailing newline. */ + size_t len = current->buf.length; + if (0 < len && current->buf.buffer[len - 1] == '\n') + --len; + + /* If the offending line is longer than INT_MAX, output + only the first INT_MAX bytes in this diagnostic. */ + len = MIN (INT_MAX, len); + error ((check_input_order == CHECK_ORDER_ENABLED ? EXIT_FAILURE : 0), - 0, _("file %d is not in sorted order"), whatfile); + 0, _("%s:%ju: is not sorted: %.*s"), + g_names[whatfile - 1], line_no[whatfile - 1], + (int) len, current->buf.buffer); - /* If we get to here, the message was just a warning, but we - want only to issue it once. */ + /* If we get to here, the message was merely a warning. + Arrange to issue it only once per file. 
*/ issued_disorder_warning[whatfile-1] = true; } } @@ -436,6 +453,7 @@ get_line (FILE *fp, struct line **linep, int which) freeline (line); return false; } + ++line_no[which - 1]; xfields (line); @@ -980,7 +998,6 @@ main (int argc, char **argv) int prev_optc_status = MUST_BE_OPERAND; int operand_status[2]; int joption_count[2] = { 0, 0 }; - char *names[2]; FILE *fp1, *fp2; int optc; int nfiles = 0; @@ -1100,7 +1117,7 @@ main (int argc, char **argv) break; case 1: /* Non-option argument. */ - add_file_name (optarg, names, operand_status, joption_count, + add_file_name (optarg, g_names, operand_status, joption_count, &nfiles, &prev_optc_status, &optc_status); break; @@ -1122,7 +1139,7 @@ main (int argc, char **argv) /* Process any operands after "--". */ prev_optc_status = MUST_BE_OPERAND; while (optind < argc) - add_file_name (argv[optind++], names, operand_status, joption_count, + add_file_name (argv[optind++], g_names, operand_status, joption_count, &nfiles, &prev_optc_status, &optc_status); if (nfiles != 2) @@ -1148,20 +1165,20 @@ main (int argc, char **argv) if (join_field_2 == SIZE_MAX) join_field_2 = 0; - fp1 = STREQ (names[0], "-") ? stdin : fopen (names[0], "r"); + fp1 = STREQ (g_names[0], "-") ? stdin : fopen (g_names[0], "r"); if (!fp1) - error (EXIT_FAILURE, errno, "%s", names[0]); - fp2 = STREQ (names[1], "-") ? stdin : fopen (names[1], "r"); + error (EXIT_FAILURE, errno, "%s", g_names[0]); + fp2 = STREQ (g_names[1], "-") ? 
stdin : fopen (g_names[1], "r"); if (!fp2) - error (EXIT_FAILURE, errno, "%s", names[1]); + error (EXIT_FAILURE, errno, "%s", g_names[1]); if (fp1 == fp2) error (EXIT_FAILURE, errno, _("both files cannot be standard input")); join (fp1, fp2); if (fclose (fp1) != 0) - error (EXIT_FAILURE, errno, "%s", names[0]); + error (EXIT_FAILURE, errno, "%s", g_names[0]); if (fclose (fp2) != 0) - error (EXIT_FAILURE, errno, "%s", names[1]); + error (EXIT_FAILURE, errno, "%s", g_names[1]); if (issued_disorder_warning[0] || issued_disorder_warning[1]) exit (EXIT_FAILURE); diff --git a/tests/misc/join b/tests/misc/join index eae3f18..d6528da 100755 --- a/tests/misc/join +++ b/tests/misc/join @@ -196,7 +196,23 @@ my @tv = ( # With check, both inputs out of order (in fact, in reverse order) ['chkodr-5', '--check-order', [" b 1\n a 2\n", " b Y\n a Z\n"], "", 1, - "$prog: file 1 is not in sorted order\n"], + "$prog: chkodr-5.1:2: is not sorted: a 2\n"], + +# Similar, but with only file 2 not sorted. +['chkodr-5b', '--check-order', + [" a 2\n b 1\n", " b Y\n a Z\n"], "", 1, + "$prog: chkodr-5b.2:2: is not sorted: a Z\n"], + +# Similar, but with the offending line having length 0 (excluding newline). +['chkodr-5c', '--check-order', + [" a 2\n b 1\n", " b Y\n\n"], "", 1, + "$prog: chkodr-5c.2:2: is not sorted: \n"], + +# Similar, but elicit a warning for each input file (without --check-order). +['chkodr-5d', '', + ["a\nx\n\n", "b\ny\n\n"], "", 1, + "$prog: chkodr-5d.1:3: is not sorted: \n" . + "$prog: chkodr-5d.2:3: is not sorted: \n"], # Without order check, both inputs out of order and some lines # unpairable. This is NOT supported by the GNU extension. All that @@ -229,7 +245,7 @@ my @tv = ( # actual data out-of-order. This join should fail. 
['header-3', '--header --check-order', ["ID Name\n2 B\n1 A\n", "ID Color\n2 blue\n"], "ID Name Color\n", 1, - "$prog: file 1 is not in sorted order\n"], + "$prog: header-3.1:3: is not sorted: 1 A\n"], # '--header' with specific output format '-o'. # output header line should respect the requested format -- 1.7.6.351.gb35ac From unknown Wed Jun 18 23:14:20 2025 X-Loop: help-debbugs@gnu.org Subject: bug#9236: Fwd: Join Resent-From: Jim Meyering Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Sat, 06 Aug 2011 20:44:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 9236 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: Eric Blake Cc: 9236@debbugs.gnu.org, David Gast Received: via spool by 9236-submit@debbugs.gnu.org id=B9236.131266341421849 (code B ref 9236); Sat, 06 Aug 2011 20:44:01 +0000 Received: (at 9236) by debbugs.gnu.org; 6 Aug 2011 20:43:34 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QpniX-0005gM-Oj for submit@debbugs.gnu.org; Sat, 06 Aug 2011 16:43:34 -0400 Received: from mx.meyering.net ([82.230.74.64]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QpniV-0005gF-Fx for 9236@debbugs.gnu.org; Sat, 06 Aug 2011 16:43:32 -0400 Received: from rho.meyering.net (localhost.localdomain [127.0.0.1]) by rho.meyering.net (Acme Bit-Twister) with ESMTP id 1C7C86018D; Sat, 6 Aug 2011 22:42:38 +0200 (CEST) From: Jim Meyering In-Reply-To: <87wreql46e.fsf@rho.meyering.net> (Jim Meyering's message of "Sat, 06 Aug 2011 21:40:09 +0200") References: <4E3AB1EC.9080605@redhat.com> <87k4atoyor.fsf@rho.meyering.net> <87wreql46e.fsf@rho.meyering.net> Date: Sat, 06 Aug 2011 22:42:37 +0200 Message-ID: <87hb5ul1aa.fsf@rho.meyering.net> Lines: 66 MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -6.1 (------) X-BeenThere: debbugs-submit@debbugs.gnu.org 
X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -6.1 (------) Jim Meyering wrote: > Jim Meyering wrote: > ... >> I started looking at this, and among other things saw >> a diagnostic that mentioned "file 1", which would do >> much better to mention the actual file name, so embarked. >> Here's a preliminary patch (not even a decent ChangeLog entry >> and the join test still needs to be updated): >> >> $ printf '%s\n' b a c > in >> $ ./join --check-order in in >> ./join: in:2: is not sorted: a >> [Exit 1] >> >> Subject: [PATCH] join: FIXME: check: print both file name and line number >> >> --- >> src/join.c | 29 +++++++++++++++++++---------- >> 1 files changed, 19 insertions(+), 10 deletions(-) > > Here's a much better patch. > >>>From 2e4ca5100dcc3229e9937c48aed3dc475bb507ea Mon Sep 17 00:00:00 2001 > From: Jim Meyering > Date: Thu, 4 Aug 2011 19:31:50 +0200 > Subject: [PATCH] join: with --check-order print offending file name, line > number and data > > * src/join (g_names): New global (was main's "names"). > (main): Update all uses of "names". > (line_no[2]): New globals. > (get_line): Increment after reading each line. > (check_order): Print the standard "file name:line_no: " prefix > as well as the offending line when reporting disorder. > Here is a sample old/new comparison: > -join: file 1 is not in sorted order > +join: in:4: is not sorted: contents-of-line-4 > * tests/misc/join: Change the two affected tests to expect > the new diagnostic. > Add new tests for more coverage: mismatch in file 2, > two diagnostics, zero-length out-of-order line. > * NEWS (Improvements): Mention it. Nearly forgot. While coding, I considered the case of an offending line with no trailing newline, but hadn't tested it. 
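For anyone reading the archive, the length handling that the no-trailing-newline case exercises can be sketched on its own. This is only an illustration under stated assumptions: `diag_len` and `format_disorder` are hypothetical names that do not appear in src/join.c, and the sketch formats into a caller-supplied buffer with snprintf rather than calling error() as the patch does. It follows the same two steps as check_order in the patch: drop at most one trailing newline, then clamp the length to INT_MAX so it can serve as the precision of `%.*s`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of src/join.c): return the number of
   bytes of BUF to show in a diagnostic.  Drop at most one trailing
   newline, then clamp to INT_MAX, because the precision argument of
   "%.*s" must be an int.  */
int
diag_len (const char *buf, size_t len)
{
  assert (buf != NULL);
  if (0 < len && buf[len - 1] == '\n')
    --len;
  return len < (size_t) INT_MAX ? (int) len : INT_MAX;
}

/* Hypothetical helper: write "NAME:LINE_NO: is not sorted: TEXT" into
   OUT.  snprintf stands in for the error() call used by the patch.
   Returns snprintf's result, i.e. the length of the full message.  */
int
format_disorder (char *out, size_t out_size, const char *name,
                 uintmax_t line_no, const char *buf, size_t len)
{
  return snprintf (out, out_size, "%s:%ju: is not sorted: %.*s",
                   name, line_no, diag_len (buf, len), buf);
}
```

Called with the name "in", line number 3 and the unterminated one-byte line "o", format_disorder produces "in:3: is not sorted: o". A line consisting only of a newline produces the same prefix followed by no text, which is the shape the zero-length test cases expect.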
Just folded this in: diff --git a/tests/misc/join b/tests/misc/join index d6528da..a892a10 100755 --- a/tests/misc/join +++ b/tests/misc/join @@ -214,6 +214,12 @@ my @tv = ( "$prog: chkodr-5d.1:3: is not sorted: \n" . "$prog: chkodr-5d.2:3: is not sorted: \n"], +# Similar, but make it so each offending line has no newline. +['chkodr-5e', '', + ["a\nx\no", "b\ny\np"], "", 1, + "$prog: chkodr-5e.1:3: is not sorted: o\n" . + "$prog: chkodr-5e.2:3: is not sorted: p\n"], + # Without order check, both inputs out of order and some lines # unpairable. This is NOT supported by the GNU extension. All that # we really care about for this test is that the return status is From unknown Wed Jun 18 23:14:20 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.427 (Entity 5.427) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: "David Gast" Subject: bug#9236: closed (Re: bug#9236: Fwd: Join) Message-ID: References: <8762mal0f5.fsf@rho.meyering.net> X-Gnu-PR-Message: they-closed 9236 X-Gnu-PR-Package: coreutils Reply-To: 9236@debbugs.gnu.org Date: Sat, 06 Aug 2011 21:03:01 +0000 Content-Type: multipart/mixed; boundary="----------=_1312664581-23733-1" This is a multi-part message in MIME format... ------------=_1312664581-23733-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Your bug report #9236: Fwd: Join which was filed against the coreutils package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 9236@debbugs.gnu.org. 
-- 
9236: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=9236
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1312664581-23733-1
Content-Type: message/rfc822
Content-Disposition: inline
Content-Transfer-Encoding: 7bit

Received: (at 9236-done) by debbugs.gnu.org; 6 Aug 2011 21:02:16 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Qpo0d-00069e-QN for submit@debbugs.gnu.org; Sat, 06 Aug 2011 17:02:16 -0400
Received: from mx.meyering.net ([82.230.74.64]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Qpo0a-00069S-4t for 9236-done@debbugs.gnu.org; Sat, 06 Aug 2011 17:02:14 -0400
Received: from rho.meyering.net (localhost.localdomain [127.0.0.1]) by rho.meyering.net (Acme Bit-Twister) with ESMTP id B5BE96002D; Sat, 6 Aug 2011 23:01:18 +0200 (CEST)
From: Jim Meyering
To: Eric Blake
Subject: Re: bug#9236: Fwd: Join
In-Reply-To: <87hb5ul1aa.fsf@rho.meyering.net> (Jim Meyering's message of "Sat, 06 Aug 2011 22:42:37 +0200")
References: <4E3AB1EC.9080605@redhat.com> <87k4atoyor.fsf@rho.meyering.net> <87wreql46e.fsf@rho.meyering.net> <87hb5ul1aa.fsf@rho.meyering.net>
Date: Sat, 06 Aug 2011 23:01:18 +0200
Message-ID: <8762mal0f5.fsf@rho.meyering.net>
Lines: 233
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -6.1 (------)
X-Debbugs-Envelope-To: 9236-done
Cc: 9236-done@debbugs.gnu.org, David Gast
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -6.1 (------)

Jim Meyering wrote:
> Jim Meyering wrote:
>> Jim Meyering wrote:
>> ...
>>> I started looking at this, and among other things saw
>>> a diagnostic that mentioned "file 1", which would do
>>> much better to mention the actual file name, so embarked.
>>> Here's a preliminary patch (not even a decent ChangeLog entry >>> and the join test still needs to be updated): >>> >>> $ printf '%s\n' b a c > in >>> $ ./join --check-order in in >>> ./join: in:2: is not sorted: a >>> [Exit 1] >>> >>> Subject: [PATCH] join: FIXME: check: print both file name and line number >>> >>> --- >>> src/join.c | 29 +++++++++++++++++++---------- >>> 1 files changed, 19 insertions(+), 10 deletions(-) >> >> Here's a much better patch. >> >>>>From 2e4ca5100dcc3229e9937c48aed3dc475bb507ea Mon Sep 17 00:00:00 2001 >> From: Jim Meyering >> Date: Thu, 4 Aug 2011 19:31:50 +0200 >> Subject: [PATCH] join: with --check-order print offending file name, line >> number and data >> >> * src/join (g_names): New global (was main's "names"). ... >> * NEWS (Improvements): Mention it. Also nearly forgot to mention in the log that David Gast suggested this change. For the record, I expect to push this tomorrow or Monday: >From a0a3f339f72f4ca3ecc348ee4416c3c1e0f4765f Mon Sep 17 00:00:00 2001 From: Jim Meyering Date: Thu, 4 Aug 2011 19:31:50 +0200 Subject: [PATCH] join: with --check-order print offending file name, line number and data * src/join (g_names): New global (was main's "names"). (main): Update all uses of "names". (line_no[2]): New globals. (get_line): Increment after reading each line. (check_order): Print the standard "file name:line_no: " prefix as well as the offending line when reporting disorder. Here is a sample old/new comparison: -join: file 1 is not in sorted order +join: in:4: is not sorted: contents-of-line-4 * tests/misc/join: Change the two affected tests to expect the new diagnostic. Add new tests for more coverage: mismatch in file 2, two diagnostics, zero-length out-of-order line. * NEWS (Improvements): Mention it. 
Suggested by David Gast in http://debbugs.gnu.org/9236 --- NEWS | 3 +++ src/join.c | 43 ++++++++++++++++++++++++++++++------------- tests/misc/join | 26 ++++++++++++++++++++++++-- 3 files changed, 57 insertions(+), 15 deletions(-) diff --git a/NEWS b/NEWS index 2e48497..6e24f5c 100644 --- a/NEWS +++ b/NEWS @@ -66,6 +66,9 @@ GNU coreutils NEWS -*- outline -*- df now supports disk partitions larger than 4 TiB on MacOS X 10.5 or newer and on AIX 5.2 or newer. + join --check-order now prints "join: FILE:LINE_NUMBER: bad_line" for an + unsorted input, rather than e.g., "join: file 1 is not in sorted order". + shuf outputs small subsets of large permutations much more efficiently. For example `shuf -i1-$((2**32-1)) -n2` no longer exhausts memory. diff --git a/src/join.c b/src/join.c index 99d918f..694fb55 100644 --- a/src/join.c +++ b/src/join.c @@ -86,9 +86,15 @@ struct seq struct line **lines; }; -/* The previous line read from each file. */ +/* The previous line read from each file. */ static struct line *prevline[2] = {NULL, NULL}; +/* The number of lines read from each file. */ +static uintmax_t line_no[2] = {0, 0}; + +/* The input file names. */ +static char *g_names[2]; + /* This provides an extra line buffer for each file. We need these if we try to read two consecutive lines into the same buffer, since we don't want to overwrite the previous buffer before we check order. */ @@ -384,12 +390,23 @@ check_order (const struct line *prev, size_t join_field = whatfile == 1 ? join_field_1 : join_field_2; if (keycmp (prev, current, join_field, join_field) > 0) { + /* Exclude any trailing newline. */ + size_t len = current->buf.length; + if (0 < len && current->buf.buffer[len - 1] == '\n') + --len; + + /* If the offending line is longer than INT_MAX, output + only the first INT_MAX bytes in this diagnostic. */ + len = MIN (INT_MAX, len); + error ((check_input_order == CHECK_ORDER_ENABLED ? 
               EXIT_FAILURE : 0),
-             0, _("file %d is not in sorted order"), whatfile);
+             0, _("%s:%ju: is not sorted: %.*s"),
+             g_names[whatfile - 1], line_no[whatfile - 1],
+             (int) len, current->buf.buffer);

-      /* If we get to here, the message was just a warning, but we
-         want only to issue it once. */
+      /* If we get to here, the message was merely a warning.
+         Arrange to issue it only once per file. */
       issued_disorder_warning[whatfile-1] = true;
     }
 }
@@ -436,6 +453,7 @@ get_line (FILE *fp, struct line **linep, int which)
       freeline (line);
       return false;
     }
+  ++line_no[which - 1];

   xfields (line);

@@ -980,7 +998,6 @@ main (int argc, char **argv)
   int prev_optc_status = MUST_BE_OPERAND;
   int operand_status[2];
   int joption_count[2] = { 0, 0 };
-  char *names[2];
   FILE *fp1, *fp2;
   int optc;
   int nfiles = 0;
@@ -1100,7 +1117,7 @@ main (int argc, char **argv)
           break;

         case 1: /* Non-option argument. */
-          add_file_name (optarg, names, operand_status, joption_count,
+          add_file_name (optarg, g_names, operand_status, joption_count,
                          &nfiles, &prev_optc_status, &optc_status);
           break;

@@ -1122,7 +1139,7 @@ main (int argc, char **argv)
   /* Process any operands after "--". */
   prev_optc_status = MUST_BE_OPERAND;
   while (optind < argc)
-    add_file_name (argv[optind++], names, operand_status, joption_count,
+    add_file_name (argv[optind++], g_names, operand_status, joption_count,
                    &nfiles, &prev_optc_status, &optc_status);

   if (nfiles != 2)
@@ -1148,20 +1165,20 @@ main (int argc, char **argv)
   if (join_field_2 == SIZE_MAX)
     join_field_2 = 0;

-  fp1 = STREQ (names[0], "-") ? stdin : fopen (names[0], "r");
+  fp1 = STREQ (g_names[0], "-") ? stdin : fopen (g_names[0], "r");
   if (!fp1)
-    error (EXIT_FAILURE, errno, "%s", names[0]);
-  fp2 = STREQ (names[1], "-") ? stdin : fopen (names[1], "r");
+    error (EXIT_FAILURE, errno, "%s", g_names[0]);
+  fp2 = STREQ (g_names[1], "-") ? stdin : fopen (g_names[1], "r");
   if (!fp2)
-    error (EXIT_FAILURE, errno, "%s", names[1]);
+    error (EXIT_FAILURE, errno, "%s", g_names[1]);
   if (fp1 == fp2)
     error (EXIT_FAILURE, errno, _("both files cannot be standard input"));
   join (fp1, fp2);

   if (fclose (fp1) != 0)
-    error (EXIT_FAILURE, errno, "%s", names[0]);
+    error (EXIT_FAILURE, errno, "%s", g_names[0]);
   if (fclose (fp2) != 0)
-    error (EXIT_FAILURE, errno, "%s", names[1]);
+    error (EXIT_FAILURE, errno, "%s", g_names[1]);

   if (issued_disorder_warning[0] || issued_disorder_warning[1])
     exit (EXIT_FAILURE);
diff --git a/tests/misc/join b/tests/misc/join
index eae3f18..a892a10 100755
--- a/tests/misc/join
+++ b/tests/misc/join
@@ -196,7 +196,29 @@ my @tv = (
 # With check, both inputs out of order (in fact, in reverse order)
 ['chkodr-5', '--check-order',
   [" b 1\n a 2\n", " b Y\n a Z\n"], "", 1,
-  "$prog: file 1 is not in sorted order\n"],
+  "$prog: chkodr-5.1:2: is not sorted:  a 2\n"],
+
+# Similar, but with only file 2 not sorted.
+['chkodr-5b', '--check-order',
+  [" a 2\n b 1\n", " b Y\n a Z\n"], "", 1,
+  "$prog: chkodr-5b.2:2: is not sorted:  a Z\n"],
+
+# Similar, but with the offending line having length 0 (excluding newline).
+['chkodr-5c', '--check-order',
+  [" a 2\n b 1\n", " b Y\n\n"], "", 1,
+  "$prog: chkodr-5c.2:2: is not sorted: \n"],
+
+# Similar, but elicit a warning for each input file (without --check-order).
+['chkodr-5d', '',
+  ["a\nx\n\n", "b\ny\n\n"], "", 1,
+  "$prog: chkodr-5d.1:3: is not sorted: \n" .
+  "$prog: chkodr-5d.2:3: is not sorted: \n"],
+
+# Similar, but make it so each offending line has no newline.
+['chkodr-5e', '',
+  ["a\nx\no", "b\ny\np"], "", 1,
+  "$prog: chkodr-5e.1:3: is not sorted: o\n" .
+  "$prog: chkodr-5e.2:3: is not sorted: p\n"],

 # Without order check, both inputs out of order and some lines
 # unpairable. This is NOT supported by the GNU extension. All that
@@ -229,7 +251,7 @@ my @tv = (
 # actual data out-of-order. This join should fail.
 ['header-3', '--header --check-order',
   ["ID Name\n2 B\n1 A\n", "ID Color\n2 blue\n"], "ID Name Color\n", 1,
-  "$prog: file 1 is not in sorted order\n"],
+  "$prog: header-3.1:3: is not sorted: 1 A\n"],

 # '--header' with specific output format '-o'.
 # output header line should respect the requested format
--
1.7.6.351.gb35ac

------------=_1312664581-23733-1
Content-Type: message/rfc822
Content-Disposition: inline
Content-Transfer-Encoding: 7bit

From: "David Gast"
Subject: Fwd: Join
To: bug-coreutils@gnu.org
X-Mailer: CommuniGate Pro WebUser v5.3.12
Date: Wed, 03 Aug 2011 19:48:06 -0700
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="_===7694701====its-cgpb02.csulb.edu===_"

This is a multi-part MIME message

--_===7694701====its-cgpb02.csulb.edu===_
Content-Type: text/plain;charset=iso-8859-1; format="flowed"
Content-Transfer-Encoding: 8bit

Oops, I hit the wrong button ...

cat > /tmp/x <
Subject: Join
To: bug-coreutils@gnu.org
X-Mailer: CommuniGate Pro WebUser v5.3.12
Date: Wed, 03 Aug 2011 19:43:31 -0700
MIME-Version: 1.0
Content-Type: text/plain;charset=iso-8859-1; format="flowed"
Content-Transfer-Encoding: 8bit

When there is disorder, could you please provide the line number, as
sort -c does?

Note: join seems to report disorder in file 2 only if there is no
disorder in file 1.

You can try the following code

Thanks

--_===7694701====its-cgpb02.csulb.edu===_--

------------=_1312664581-23733-1--
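
For readers following the check_order change in the patch above, here is a
minimal standalone sketch of how the new diagnostic text is put together:
drop one trailing newline, clamp the length to INT_MAX (because "%.*s"
takes an int precision), then print the "FILE:LINE: is not sorted: DATA"
form. The helper name format_disorder and its parameters are hypothetical
and exist only for illustration; join.c itself passes these values straight
to error, which also adds the program-name prefix.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of join.c: write the part of the new
   diagnostic that follows the "join: " prefix into OUT.  BUF holds the
   offending line and LEN is its length in bytes.  Returns whatever
   snprintf returns.  */
int
format_disorder (char *out, size_t out_size, const char *file_name,
                 uintmax_t line_number, const char *buf, size_t len)
{
  /* Exclude any trailing newline, as the patch does.  */
  if (0 < len && buf[len - 1] == '\n')
    --len;

  /* "%.*s" takes an int precision, so clamp LEN to INT_MAX.  */
  if (INT_MAX < len)
    len = INT_MAX;

  return snprintf (out, out_size, "%s:%ju: is not sorted: %.*s",
                   file_name, line_number, (int) len, buf);
}
```

Called with file name "in", line number 2 and the two-byte buffer "a\n",
this fills OUT with "in:2: is not sorted: a", which is the text shown in
the sample run quoted earlier in this thread, minus the "./join: " prefix.
A buffer holding only "\n" yields an empty data field, the case the
chkodr-5c test covers.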