From unknown Sat Jun 21 10:31:26 2025
From: bug#66256 <66256@debbugs.gnu.org>
To: bug#66256 <66256@debbugs.gnu.org>
Subject: Status: sorting NAN values with "general-numeric’
Reply-To: bug#66256 <66256@debbugs.gnu.org>
Date: Sat, 21 Jun 2025 17:31:26 +0000

retitle 66256 sorting NAN values with "general-numeric’
reassign 66256 coreutils
submitter 66256 Jorge Stolfi
severity 66256 normal
thanks

From debbugs-submit-bounces@debbugs.gnu.org Thu Sep 28 07:15:35 2023
Date: Thu, 28 Sep 2023 07:43:52 -0300
Message-ID: <20230928074352.Horde.e6lnpeLVHKt8TsI8KnyEFB1@webmail2.ic.unicamp.br>
From: Jorge Stolfi
To: bug-coreutils@gnu.org
Subject: sorting NAN values with "general-numeric’
User-Agent: Horde Application Framework 5

The full documentation of the "--general-numeric-sort" option of {sort}
says that NaN values are sorted "in a consistent but machine-dependent
order".

This is not good.
The point of the IEEE floating-point standard was to make the results of
floating-point computations be independent of the platform or
implementation.

Please consider extending that goal to the handling of NaNs by {sort}.
That is, all flavors of NaN (determined by their char tails, as parsed
by {strtod}) should be treated as equal.

The fact that different flavors of NaN have distinct binary
representations is not an excuse to sort them as distinct, since the
same is true of +0 and -0, which "general-numeric" sort already treats
as equal.

As a separate suggestion, please consider having {sort} abort with an
error message if any field that is supposed to be sorted with
"general-numeric" is not a valid {double} value, or has some leftover
chars that are not parsed by {strtod}.

Whether these solutions are accepted or not, please change the manpage
explanation of "-g"/"--general-numeric-sort" to say, at least, "the
field is parsed as a double-precision (64-bit) floating-point number
and sorted by its numeric value".
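
The two suggestions above can be illustrated with a minimal sketch. It
is written in Python rather than in the C of {sort}, and it is not how
coreutils is implemented. The names general_numeric_key and
sort_general_numeric are hypothetical, and placing NaNs ahead of all
numbers is an assumption of the sketch. Python's float() also accepts a
slightly different grammar than {strtod} (no NaN char tails, no hex
floats), so it only approximates the strict parsing that is proposed.

```python
import math


def general_numeric_key(field: str) -> tuple[int, float]:
    """Hypothetical sort key: every NaN flavor maps to the same key."""
    # float() raises ValueError on empty fields and on leftover characters,
    # which stands in for the proposed "abort with an error message".
    value = float(field)
    if math.isnan(value):
        # Sign and payload are discarded, so all NaNs compare equal.
        # Assumption of this sketch: NaNs are placed before all numbers.
        return (0, 0.0)
    # +0 and -0 already compare equal as ordinary floats.
    return (1, value)


def sort_general_numeric(fields: list[str]) -> list[str]:
    """Stable sort of text fields using the key above."""
    return sorted(fields, key=general_numeric_key)
```

Because the sort is stable and all NaNs share one key, NaN fields keep
their input order relative to each other, and so do "-0" and "0". A
field such as "1.5abc" raises ValueError instead of being silently
truncated.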

Thanks, and all the best,
--jorge

-- 
Jorge Stolfi - Professor Titular/Full Professor
Instituto de Computação/Computer Science Dept
Universidade Estadual de Campinas/State University of Campinas
Campinas, SP - Brazil

From debbugs-submit-bounces@debbugs.gnu.org Thu Sep 28 07:53:10 2023
Date: Thu, 28 Sep 2023 12:52:47 +0100
User-Agent: Mozilla Thunderbird
Subject: Re: bug#66256: sorting NAN values with "general-numeric’
To: Jorge Stolfi, 66256@debbugs.gnu.org
References: <20230928074352.Horde.e6lnpeLVHKt8TsI8KnyEFB1@webmail2.ic.unicamp.br>
From: Pádraig Brady
In-Reply-To: <20230928074352.Horde.e6lnpeLVHKt8TsI8KnyEFB1@webmail2.ic.unicamp.br>

On 28/09/2023 11:43, Jorge Stolfi wrote:
>
> The full documentation of the "--general-numeric-sort" option of
> {sort} says that NaN values are sorted "in a consistent but
> machine-dependent order".
>
> This is not good. The point of the IEEE floating-point standard was to
> make the results of floating-point computations be independent of the
> platform or implementation.
>
> Please consider extending that goal to the handling of NaNs by {sort}.
> That is, all flavors of NaN (determined by their char tails, as
> parsed by {strtod}) should be treated as equal.
>
> The fact that different flavors of NaN have distinct binary
> representation is not an excuse to sort them as distinct, since the
> same is true of +0 and -0, which "general-numeric" sort already treats
> as equal.
>
> As a separate suggestion, please consider having {sort} abort with an
> error message if any field that is supposed to be sorted with
> "general-numeric" is not a valid {double} value, or has some leftover
> chars that are not parsed by {strtod}.
>
> Whether these solutions are accepted or not, please change the manpage
> explanation of "-g"/"--general-numeric-sort" to say, at least, "the
> field is parsed as a double-precision (64-bit) floating-point number
> and sorted by its numeric value".
>
> Thanks, and all the best,

No comment on the actual ordering of NaNs,
but note NaN ordering changed recently in coreutils 9.2,
as discussed at https://bugs.gnu.org/55212

cheers,
Pádraig

From debbugs-submit-bounces@debbugs.gnu.org Thu Sep 28 16:45:15 2023
Message-ID: <497eaf62-ef06-ef34-fd52-c03afdfab6a3@cs.ucla.edu>
Date: Thu, 28 Sep 2023 13:44:50 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.15.1
Subject: Re: bug#66256: sorting NAN values with "general-numeric’
To: Pádraig Brady, Jorge Stolfi, 66256@debbugs.gnu.org
References: <20230928074352.Horde.e6lnpeLVHKt8TsI8KnyEFB1@webmail2.ic.unicamp.br>
From: Paul Eggert
Organization: UCLA Computer Science Department

On my long list of things to do is to have sort -g sort more
deterministically with NaNs. This could be done with the new totalorder
and totalorderl functions in C23 Annex F.10.12.1, if available. The fix
would not be portable (as these functions are newly sort-of-standardized
and are often not available) but it should be better than nothing.

Of course the other problem is that there's no standard textual
representation of NaN payloads (i.e., their fractions).
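
For readers unfamiliar with the total-order idea mentioned above, here
is a minimal sketch of an IEEE 754 totalOrder-style comparison for
64-bit doubles. It is a hand-written stand-in built on bit patterns,
not a call to the C23 totalorder function, and it is written in Python
rather than C so that it runs without depending on library support. The
names double_bits, double_from_bits and total_order_key are
hypothetical. Since there is no standard textual form for NaN payloads,
any test NaNs would have to be built from raw bit patterns.

```python
import struct

SIGN_BIT = 1 << 63
ALL_BITS = (1 << 64) - 1


def double_bits(x: float) -> int:
    """Return the 64-bit IEEE 754 pattern of x as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def double_from_bits(bits: int) -> float:
    """Build a double from a 64-bit pattern (used to make NaNs with payloads)."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def total_order_key(x: float) -> int:
    """Map x to an integer whose ordering follows IEEE 754 totalOrder.

    Resulting order: -NaN < -inf < negative finite < -0 < +0
                     < positive finite < +inf < +NaN,
    with NaNs of the same sign ordered by their payload bits.
    """
    bits = double_bits(x)
    if bits & SIGN_BIT:
        # Negative values: flip every bit so larger magnitudes sort first.
        return bits ^ ALL_BITS
    # Non-negative values: set the top bit so they sort after all negatives.
    return bits | SIGN_BIT
```

Sorting with key=total_order_key gives one deterministic order for
every bit pattern, at the cost of treating -0 and +0, and NaNs with
different payloads, as distinct.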