GNU bug report logs - #71477
30.0.50; Lock files are not deleted on Windows 98


Package: emacs;

Reported by: Po Lu <luangruo <at> yahoo.com>

Date: Mon, 10 Jun 2024 16:41:04 UTC

Severity: normal

Found in version 30.0.50

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 71477 in the body.
You can then email your comments to 71477 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#71477; Package emacs. (Mon, 10 Jun 2024 16:41:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Po Lu <luangruo <at> yahoo.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 10 Jun 2024 16:41:04 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 30.0.50; Lock files are not deleted on Windows 98
Date: Mon, 10 Jun 2024 23:07:38 +0800
With lock files enabled, type C-x C-f C:/WINDOWS/Application Data/.emacs
RET, modify the file, and type C-x C-f, whereupon such a warning will be
displayed:

Warning (unlock-file): Invalid argument, `~/.emacs', ignored

I don't recall whether it was because lock files were disabled on the
same machine that this issue wasn't present in Emacs 28.1, or it was
because the issue was introduced in a subsequent release.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71477; Package emacs. (Mon, 10 Jun 2024 18:56:03 GMT) Full text and rfc822 format available.

Message #8 received at 71477 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 71477 <at> debbugs.gnu.org
Subject: Re: bug#71477: 30.0.50; Lock files are not deleted on Windows 98
Date: Mon, 10 Jun 2024 20:44:30 +0300
> Date: Mon, 10 Jun 2024 23:07:38 +0800
> From:  Po Lu via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> With lock files enabled, type C-x C-f C:/WINDOWS/Application Data/.emacs
> RET, modify the file, and type C-x C-f, whereupon such a warning will be
> displayed:
> 
> Warning (unlock-file): Invalid argument, `~/.emacs', ignored

I don't have access to Windows 9X anymore, and there's no such
directory on the Windows system to which I do have access.  So either
you or someone will show a recipe that can be reproduced and debugged
on a more modern system, or you dig into the EINVAL on that system and
tell more details to understand what happens, or we just dismiss this
bug alone with "moreinfo" tag (after all, this is just a warning).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71477; Package emacs. (Tue, 11 Jun 2024 15:46:02 GMT) Full text and rfc822 format available.

Message #11 received at 71477 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 71477 <at> debbugs.gnu.org
Subject: Re: bug#71477: 30.0.50; Lock files are not deleted on Windows 98
Date: Tue, 11 Jun 2024 21:34:22 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

> OK, but could you provide some additional details, so I could
> understand the issue better?  What kind of negative values do you get
> from getpid on Windows 98, and what does the system show as the PID of
> that process?  Is the value really such a large positive number that
> its MSB is set?

The value of getpid was -1859765, but I did not attempt to read the PID
manually with GetCurrentProcessID or cross-check it against the OS's
equivalent of ps, if there exists one at all.

> According to my records, _getpid just calls GetCurrentProcessId and
> returns the value as an int.  So for _getpid to return a negative
> value, GetCurrentProcessId should return a very large positive value,
> I think.

There's a screenshot on this forum that, if it is to be trusted,
demonstrates that PIDs are indeed of this scale on Windows 9X:

  https://www.vbforums.com/showthread.php?308830-Task-Manager-For-Windows-98
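
The relationship between the reported negative value and a "very large positive" PID can be checked with a few lines of C. This is only an illustrative sketch: the helper names are hypothetical, and it assumes a 32-bit PID, which matches the DWORD returned by GetCurrentProcessId. Converting a signed value to uint32_t is well defined in C (reduction modulo 2^32), so this direction of the conversion is portable.

```c
#include <stdint.h>

/* Hypothetical helper: return the unsigned 32-bit value that has the
   same bit pattern as the signed PID.  Signed-to-unsigned conversion
   is defined by the C standard as reduction modulo 2^32.  */
static uint32_t
pid_as_unsigned (int32_t pid)
{
  return (uint32_t) pid;
}

/* Hypothetical helper: return 1 if the most significant bit of a
   32-bit PID is set, 0 otherwise.  */
static int
pid_msb_is_set (uint32_t pid)
{
  return (int) ((pid >> 31) & 1u);
}
```

For the value reported above, pid_as_unsigned (-1859765) yields 4293107531 (0xFFE39F4B), whose MSB is set, which is consistent with the hypothesis that the system hands out PIDs above 2^31 and the conversion to int makes them negative.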




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71477; Package emacs. (Tue, 11 Jun 2024 20:25:04 GMT) Full text and rfc822 format available.

Message #14 received at 71477 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 71477 <at> debbugs.gnu.org
Subject: Re: bug#71477: 30.0.50; Lock files are not deleted on Windows 98
Date: Tue, 11 Jun 2024 16:03:01 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: 71477 <at> debbugs.gnu.org
> Date: Tue, 11 Jun 2024 16:43:06 +0800
> 
> > The only possible issue I see with allowing a negative PID is that the
> > code checks for "pid > 0" or "pid < 0" somewhere; if that is the case,
> > we should replace those with comparisons with -1 instead.
> >
> > Can you test the above on Windows 9X when you have a chance?  Then we
> > could install it.
> 
> If it doesn't produce any adverse effect on modern Windows, and what I
> raised is not important, let's install it now, and I will test it as
> soon as may be, or it might fall by the wayside.

OK, but could you provide some additional details, so I could
understand the issue better?  What kind of negative values do you get
from getpid on Windows 98, and what does the system show as the PID of
that process?  Is the value really such a large positive number that
its MSB is set?

According to my records, _getpid just calls GetCurrentProcessId and
returns the value as an int.  So for _getpid to return a negative
value, GetCurrentProcessId should return a very large positive value,
I think.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71477; Package emacs. (Tue, 11 Jun 2024 20:25:05 GMT) Full text and rfc822 format available.

Message #17 received at 71477 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 71477 <at> debbugs.gnu.org
Subject: Re: bug#71477: 30.0.50; Lock files are not deleted on Windows 98
Date: Tue, 11 Jun 2024 11:28:59 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: 71477 <at> debbugs.gnu.org
> Date: Tue, 11 Jun 2024 16:15:08 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> From: Po Lu <luangruo <at> yahoo.com>
> >> Cc: 71477 <at> debbugs.gnu.org
> >> Date: Tue, 11 Jun 2024 15:42:58 +0800
> >> 
> >> Eli Zaretskii <eliz <at> gnu.org> writes:
> >> 
> >> > Sorry, I don't understand the problems with negative PID values.
> >> > Where exactly in the code of filelock.c does it get in the way?
> >> 
> >> Here:
> >> 
> >>   /* The PID is everything from the last '.' to the ':' or equivalent.  */
> >>   if (! c_isdigit (dot[1])) <--------------
> >>     return EINVAL;
> >>   errno = 0;
> >> 
> >> The first character of the number after the period is `-' on Windows 98.
> >
> > But that is easy to fix without any significant effect on the rest of
> > the code.  For example:
> >
> >   if (! (c_isdigit (dot[1])
> >          || (dot[1] == '-'  && c_isdigit (dot[2]))))
> >     return EINVAL;
> >
> > Are there any problems with the above fix?
> 
> No, but won't leaving the format of the lock file string inconsistent
> with Unix create difficulties elsewhere, as, for example, on a Samba
> share to which Unix systems are also connected?

What inconsistencies, specifically?

The only possible issue I see with allowing a negative PID is that the
code checks for "pid > 0" or "pid < 0" somewhere; if that is the case,
we should replace those with comparisons with -1 instead.

Can you test the above on Windows 9X when you have a chance?  Then we
could install it.

> P.S. is debbugs.gnu.org offline or some such?

It has connectivity problems.  It was already reported.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71477; Package emacs. (Tue, 11 Jun 2024 20:25:05 GMT) Full text and rfc822 format available.

Message #20 received at 71477 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 71477 <at> debbugs.gnu.org
Subject: Re: bug#71477: 30.0.50; Lock files are not deleted on Windows 98
Date: Tue, 11 Jun 2024 10:56:50 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: 71477 <at> debbugs.gnu.org
> Date: Tue, 11 Jun 2024 15:42:58 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Sorry, I don't understand the problems with negative PID values.
> > Where exactly in the code of filelock.c does it get in the way?
> 
> Here:
> 
>   /* The PID is everything from the last '.' to the ':' or equivalent.  */
>   if (! c_isdigit (dot[1])) <--------------
>     return EINVAL;
>   errno = 0;
> 
> The first character of the number after the period is `-' on Windows 98.

But that is easy to fix without any significant effect on the rest of
the code.  For example:

  if (! (c_isdigit (dot[1])
         || (dot[1] == '-'  && c_isdigit (dot[2]))))
    return EINVAL;

Are there any problems with the above fix?

Please note: I don't want to make any significant changes in this
area, certainly not for the benefit of Windows 9X.  So if the above is
not sufficient, please tell the details, and let's discuss how to
solve what's left.

P.S. I've for now reverted the changes you made to use unsigned values
because I don't think that's TRT (pid must support negative values), and
this whole area of code is fragile enough for us to discuss changes
before installing them.
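
To see the proposed check in isolation, here is a minimal standalone sketch. It is not the code in filelock.c: parse_lock_pid and ascii_isdigit are hypothetical names (the latter stands in for Gnulib's c_isdigit), and the lock-info layout "USER@HOST.PID:BOOT_TIME" is an assumption of the sketch, based on the comment that the PID runs from the last '.' to the ':'.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for c_isdigit: true for ASCII digits only.  */
static bool
ascii_isdigit (char c)
{
  return '0' <= c && c <= '9';
}

/* Parse the PID from INFO, assumed to look like "USER@HOST.PID" or
   "USER@HOST.PID:BOOT_TIME".  On success store the PID in *PID and
   return 0; return EINVAL for malformed input and ERANGE if the
   number does not fit in intmax_t.  */
static int
parse_lock_pid (const char *info, intmax_t *pid)
{
  const char *dot = strrchr (info, '.');
  if (!dot)
    return EINVAL;

  /* Accept a digit, or a '-' that is immediately followed by a digit.
     If dot[1] is '-', dot[2] is at worst the terminating NUL, so the
     second read stays inside the string.  */
  if (! (ascii_isdigit (dot[1])
         || (dot[1] == '-' && ascii_isdigit (dot[2]))))
    return EINVAL;

  char *end;
  errno = 0;
  intmax_t value = strtoimax (dot + 1, &end, 10);
  if (errno == ERANGE)
    return ERANGE;
  if (*end != '\0' && *end != ':')
    return EINVAL;

  *pid = value;
  return 0;
}
```

With this sketch, "user@host.-1859765:1718000000" parses to -1859765, while "user@host.-x" is still rejected with EINVAL, so a lone '-' does not slip through.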




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71477; Package emacs. (Tue, 11 Jun 2024 20:42:07 GMT) Full text and rfc822 format available.

Message #23 received at 71477 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 71477 <at> debbugs.gnu.org
Subject: Re: bug#71477: 30.0.50; Lock files are not deleted on Windows 98
Date: Tue, 11 Jun 2024 09:47:26 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: 71477 <at> debbugs.gnu.org
> Date: Tue, 11 Jun 2024 09:41:54 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > I don't have access to Windows 9X anymore, and there's no such
> > directory on the Windows system to which I do have access.  So either
> > you or someone will show a recipe that can be reproduced and debugged
> > on a more modern system, or you dig into the EINVAL on that system and
> > tell more details to understand what happens, or we just dismiss this
> > bug alone with "moreinfo" tag (after all, this is just a warning).
> 
> I think I've arrived at the problem: Femacs_pid and getpid return a
> negative value, and once it is duly written to the lock file,
> current_lock_owner does not accept the sign character after the PID
> separator, consequently returning EINVAL.  Apparently, in times past
> file-locking routines cast all PIDs to unsigned long, but that behavior
> was lost in the midst of changes since installed for Unix systems, and
> as such I will attempt to restore the historical semantics on Windows
> systems.

Sorry, I don't understand the problems with negative PID values.
Where exactly in the code of filelock.c does it get in the way?

We had a similar problem on Cygwin, albeit with boot time, not PID,
and we fixed it very easily.  If filelock.c assumes PIDs are positive
somewhere, please point out that code.  In any case, please don't
install any changes in this area without posting them for discussion
first.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71477; Package emacs. (Wed, 12 Jun 2024 08:26:01 GMT) Full text and rfc822 format available.

Message #26 received at 71477 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 71477 <at> debbugs.gnu.org
Subject: Re: bug#71477: 30.0.50; Lock files are not deleted on Windows 98
Date: Wed, 12 Jun 2024 11:25:44 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: 71477 <at> debbugs.gnu.org
> Date: Tue, 11 Jun 2024 16:43:06 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > What inconsistencies, specifically?
> 
> Any older Emacs binary that encounters a lock file produced under
> Windows 9X will report an "Invalid argument" error until the user
> intervenes to delete this lock file.

I don't see how we can fix such problems retroactively.

> > The only possible issue I see with allowing a negative PID is that the
> > code checks for "pid > 0" or "pid < 0" somewhere; if that is the case,
> > we should replace those with comparisons with -1 instead.
> >
> > Can you test the above on Windows 9X when you have a chance?  Then we
> > could install it.
> 
> If it doesn't produce any adverse effect on modern Windows, and what I
> raised is not important, let's install it now, and I will test it as
> soon as may be, or it might fall by the wayside.

OK.  (It turns out we already knew about this issue, see the comments
in w32proc.c, search for "Hack for Windows 95".)

Paul, do you see any problems with the change below?  It worked for me
in some limited testing.  I intend to install it on the master branch
unless there are objections.

diff --git a/src/filelock.c b/src/filelock.c
index 050cac5..59fb47e 100644
--- a/src/filelock.c
+++ b/src/filelock.c
@@ -393,7 +393,9 @@ current_lock_owner (lock_info_type *owner, Lisp_Object lfname)
     return EINVAL;
 
   /* The PID is everything from the last '.' to the ':' or equivalent.  */
-  if (! c_isdigit (dot[1]))
+  if (! (c_isdigit (dot[1])
+	 /* Windows 9X report negative PID values.  */
+	 || (dot[1] == '-' && c_isdigit (dot[2]))))
     return EINVAL;
   errno = 0;
   pid = strtoimax (dot + 1, &owner->colon, 10);
@@ -451,7 +453,7 @@ current_lock_owner (lock_info_type *owner, Lisp_Object lfname)
     {
       if (pid == getpid ())
         return I_OWN_IT;
-      else if (0 < pid && pid <= TYPE_MAXIMUM (pid_t)
+      else if (pid != -1 && pid <= TYPE_MAXIMUM (pid_t)
                && (kill (pid, 0) >= 0 || errno == EPERM)
 	       && (boot_time == 0
 		   || (boot_time <= TYPE_MAXIMUM (time_t)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71477; Package emacs. (Wed, 12 Jun 2024 16:08:02 GMT) Full text and rfc822 format available.

Message #29 received at 71477 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>, Po Lu <luangruo <at> yahoo.com>
Cc: 71477 <at> debbugs.gnu.org
Subject: Re: bug#71477: 30.0.50; Lock files are not deleted on Windows 98
Date: Wed, 12 Jun 2024 09:07:52 -0700
On 2024-06-12 01:25, Eli Zaretskii wrote:

> -  if (! c_isdigit (dot[1]))
> +  if (! (c_isdigit (dot[1])
> +	 /* Windows 9X report negative PID values.  */
> +	 || (dot[1] == '-' && c_isdigit (dot[2]))))

Faster is "if (! c_isdigit (dot[(dot[1] == '-') + 1]))", as it avoids a
conditional branch on most platforms.


> -      else if (0 < pid && pid <= TYPE_MAXIMUM (pid_t)
> +      else if (pid != -1 && pid <= TYPE_MAXIMUM (pid_t)
>                  && (kill (pid, 0) >= 0 || errno == EPERM)

This looks dubious for most systems, where 'kill' has special behavior 
when pid < -1 or pid == 0; it tests a process group. That's not the test 
we want here, since we want to check whether Emacs can be sent a signal, 
not whether any process in its process group can be sent a signal (this 
can be valid even after Emacs has exited). The code should use calls 
like kill (-2, 0) and kill (0, 0) only on platforms where we know the 
calls do not test a process group.

Even on MS Windows 98 we should check that TYPE_MINIMUM (pid_t) <= pid. 
Also, is there a special meaning for kill (0, 0) on MS Windows 98? If 
so, we should also check that pid != 0.

Do any MS-Windows platforms support process groups, i.e., kill (-2, 0) 
operates on process group 2 rather than on an individual process with 
process ID -2? If so, these platforms should be careful too, and should 
not use kill (-2, 0) or kill (0, 0).

How about the attached patch instead? You can adjust the 
Microsoft-specific .h files to define VALID_PROCESS_ID appropriately for 
MS Windows 98, and for any other MS platform where kill (-2, 0) is known 
to check for just the individual process -2.
[0001-Start-of-a-fix-for-bug-71477.patch (text/x-patch, attachment)]
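
The concern about 'kill' comes from how POSIX interprets its first argument. The sketch below merely classifies a PID value according to those rules; the enum and function names are hypothetical and this is not the content of the attached patch, nor the definition of VALID_PROCESS_ID.

```c
#include <stdint.h>

/* What a POSIX kill (pid, sig) call addresses, by value of PID.  */
enum kill_target
{
  KILL_ONE_PROCESS,     /* pid > 0: the process with that ID */
  KILL_OWN_GROUP,       /* pid == 0: the caller's process group */
  KILL_EVERY_PROCESS,   /* pid == -1: every process the caller may signal */
  KILL_PROCESS_GROUP    /* pid < -1: the process group with ID -pid */
};

/* Hypothetical helper: classify PID as POSIX kill would.  */
static enum kill_target
posix_kill_target (intmax_t pid)
{
  if (pid > 0)
    return KILL_ONE_PROCESS;
  if (pid == 0)
    return KILL_OWN_GROUP;
  if (pid == -1)
    return KILL_EVERY_PROCESS;
  return KILL_PROCESS_GROUP;
}

/* Hypothetical helper: a kill (pid, 0) liveness probe answers "is
   this one process alive?" only when it addresses a single process.  */
static int
posix_probe_is_meaningful (intmax_t pid)
{
  return posix_kill_target (pid) == KILL_ONE_PROCESS;
}
```

Under these rules a lock file containing -1859765 would make a POSIX host probe process group 1859765 rather than a single process, which is why accepting negative values needs to be limited to platforms where 'kill' is known to treat them as individual process IDs.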

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71477; Package emacs. (Wed, 12 Jun 2024 17:12:02 GMT) Full text and rfc822 format available.

Message #32 received at 71477 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: luangruo <at> yahoo.com, 71477 <at> debbugs.gnu.org
Subject: Re: bug#71477: 30.0.50; Lock files are not deleted on Windows 98
Date: Wed, 12 Jun 2024 20:10:50 +0300
> Date: Wed, 12 Jun 2024 09:07:52 -0700
> Cc: 71477 <at> debbugs.gnu.org
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> 
> On 2024-06-12 01:25, Eli Zaretskii wrote:
> 
> > -  if (! c_isdigit (dot[1]))
> > +  if (! (c_isdigit (dot[1])
> > +	 /* Windows 9X report negative PID values.  */
> > +	 || (dot[1] == '-' && c_isdigit (dot[2]))))
> 
> Faster is "if (! c_isdigit (dot[(dot[1] == '-') + 1]))", as it avoids a
> conditional branch on most platforms.

OK.

> > -      else if (0 < pid && pid <= TYPE_MAXIMUM (pid_t)
> > +      else if (pid != -1 && pid <= TYPE_MAXIMUM (pid_t)
> >                  && (kill (pid, 0) >= 0 || errno == EPERM)
> 
> This looks dubious for most systems, where 'kill' has special behavior 
> when pid < -1 or pid == 0; it tests a process group. That's not the test 
> we want here, since we want to check whether Emacs can be sent a signal, 
> not whether any process in its process group can be sent a signal (this 
> can be valid even after Emacs has exited). The code should use calls 
> like kill (-2, 0) and kill (0, 0) only on platforms where we know the 
> calls do not test a process group.

But on all platforms except Windows 9X we shouldn't see a negative PID
here, so what you say is purely theoretical, no?

> Even on MS Windows 98 we should check that TYPE_MINIMUM (pid_t) <= pid. 

Since pid_t is typedefed as 'int', that's always true, no?

> Also, is there a special meaning for kill (0, 0) on MS Windows 98?

No.  And our emulation of 'kill' fails with EPERM when called with
both arguments zero.

> If so, we should also check that pid != 0.

There are no processes on Windows whose PID is zero, so getting zero
here is impossible.

> Do any MS-Windows platforms support process groups, i.e., kill (-2, 0) 
> operates on process group 2 rather than on an individual process with 
> process ID -2? If so, these platforms should be careful too, and should 
> not use kill (-2, 0) or kill (0, 0).

Windows does support process groups, but our emulation of 'kill'
pretends that each process is its own group.

> How about the attached patch instead? You can adjust the 
> Microsoft-specific .h files to define VALID_PROCESS_ID appropriately for 
> MS Windows 98, and for any other MS platform where kill (-2, 0) is known 
> to check for just the individual process -2.

Fine with me, please install and I will followup.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71477; Package emacs. (Wed, 12 Jun 2024 17:58:02 GMT) Full text and rfc822 format available.

Message #35 received at 71477 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: luangruo <at> yahoo.com, 71477 <at> debbugs.gnu.org
Subject: Re: bug#71477: 30.0.50; Lock files are not deleted on Windows 98
Date: Wed, 12 Jun 2024 10:57:10 -0700
On 2024-06-12 10:10, Eli Zaretskii wrote:

> But on all platforms except Windows 9X we shouldn't see a negative PID
> here, so what you say is purely theoretical, no?

No, because that value doesn't come from a pid_t that the system gave us 
as a process ID. It comes from the file system, and so could be invalid 
as a process ID. That's why the code already checks that pid <= 
TYPE_MAXIMUM (pid_t). Such a check wouldn't be needed if the pid were a 
valid process ID.


>> Even on MS Windows 98 we should check that TYPE_MINIMUM (pid_t) <= pid.
> 
> Since pid_t is typedefed as 'int', that's always true, no?

No, because the code is checking 'pid', which is of type intmax_t not 
pid_t. (And anyway pid_t need not be 'int'.)


> No.  And our emulation of 'kill' fails with EPERM when called with
> both arguments zero.

In that case there's no need to worry about pid == 0 here.


> Fine with me, please install and I will followup.

OK, installed.
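
Because 'pid' is an intmax_t read back from the lock file, both bounds of pid_t have to be checked before the value is handed to 'kill'. The following is a rough sketch of such a check under stated assumptions: sketch_pid_t, its limits, and pid_is_plausible are hypothetical names; pid_t is assumed to be 'int' (as said above for MS-Windows, and not guaranteed elsewhere); and rejecting only -1 when negative PIDs are allowed mirrors the diff proposed earlier in this thread, not necessarily the code that was installed.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption of this sketch: pid_t is 'int'.  */
typedef int sketch_pid_t;
#define SKETCH_PID_MIN INT_MIN
#define SKETCH_PID_MAX INT_MAX

/* Return true if PID, as parsed from a lock file, may be passed to a
   liveness probe.  NEGATIVE_PIDS_NAME_PROCESSES should be true only
   on platforms where a negative argument to 'kill' is known to mean
   a single process rather than a process group.  */
static bool
pid_is_plausible (intmax_t pid, bool negative_pids_name_processes)
{
  /* The value came from the file system, so it may not fit pid_t.  */
  if (pid < SKETCH_PID_MIN || SKETCH_PID_MAX < pid)
    return false;

  if (negative_pids_name_processes)
    return pid != -1;

  return 0 < pid;
}
```

Here pid_is_plausible (-1859765, true) holds and pid_is_plausible (-1859765, false) does not, so a Windows 9X lock file is treated as unowned on platforms where negative values would address process groups.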





Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Thu, 13 Jun 2024 08:09:01 GMT) Full text and rfc822 format available.

Notification sent to Po Lu <luangruo <at> yahoo.com>:
bug acknowledged by developer. (Thu, 13 Jun 2024 08:09:01 GMT) Full text and rfc822 format available.

Message #40 received at 71477-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: luangruo <at> yahoo.com, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 71477-done <at> debbugs.gnu.org
Subject: Re: bug#71477: 30.0.50; Lock files are not deleted on Windows 98
Date: Thu, 13 Jun 2024 11:06:17 +0300
> Date: Wed, 12 Jun 2024 10:57:10 -0700
> Cc: luangruo <at> yahoo.com, 71477 <at> debbugs.gnu.org
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> > Fine with me, please install and I will followup.
> 
> OK, installed.

Thanks, I've now installed the followup bits.  Po Lu, please test if
the result works on Windows 9X when you have time.

I'm closing the bug for now; please reopen with new information if the
new code still doesn't work.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 11 Jul 2024 11:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year and 39 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.