GNU bug report logs - #25756
Problems using "parted ... print" on nvme devices

Package: parted

Reported by: Douglas Miller <dougmill <at> linux.vnet.ibm.com>

Date: Thu, 16 Feb 2017 16:35:03 UTC

Severity: normal

To reply to this bug, email your comments to 25756 AT debbugs.gnu.org.

Report forwarded to bug-parted <at> gnu.org:
bug#25756; Package parted. (Thu, 16 Feb 2017 16:35:04 GMT)

Acknowledgement sent to Douglas Miller <dougmill <at> linux.vnet.ibm.com>:
New bug report received and forwarded. Copy sent to bug-parted <at> gnu.org. (Thu, 16 Feb 2017 16:35:04 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Douglas Miller <dougmill <at> linux.vnet.ibm.com>
To: bug-parted <at> gnu.org
Cc: Guilherme Piccoli <gpiccoli <at> br.ibm.com>, chavez <at> us.ibm.com,
 ruddk <at> us.ibm.com
Subject: Problems using "parted ... print" on nvme devices
Date: Thu, 16 Feb 2017 09:08:44 -0600

We have seen a problem in some test infrastructure that uses "parted ... 
print" to query partition information and then configure test cases. The 
problem shows up when using parted on nvme drives because systemd-udevd
is monitoring nvme devices for changes to the partition tables, and 
rebuilds the devices. This results in the devices disappearing for a few 
seconds after running "parted ... print" and causing failures to 
configure tests. The root cause is that parted opens the device "O_RDWR" 
regardless of whether it is actually modifying the partition table, and 
this notifies systemd-udevd causing the disruption in the block devices.

I have not worked up a patch yet, or even studied the code in-depth, but 
it seems to me that parted could be better about using open modes that 
reflect its true intentions. Does that seem like a reasonable change?

I expect that our test infrastructure will have to be modified, probably 
to use fdisk or something other than parted, but it still seems like 
something to be fixed.


Thoughts?

Thanks,

Doug
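
A minimal sketch of the change proposed in the message above: choose the open mode from what the command intends to do. The helper names `open_flags_for` and `open_device`, and the set of read-only command names, are assumptions made for illustration and are not taken from parted's source.

```python
import os

# Illustrative set of commands that only read the partition table.
# This list is an assumption for the sketch, not parted's command table.
READ_ONLY_COMMANDS = {"print", "help", "version"}


def open_flags_for(command: str) -> int:
    """Return the open(2) access mode that matches the command's intent."""
    if command in READ_ONLY_COMMANDS:
        return os.O_RDONLY
    return os.O_RDWR


def open_device(path: str, command: str) -> int:
    """Open `path` with the narrowest access mode the command needs."""
    return os.open(path, open_flags_for(command))
```

With this shape, a query such as "print" never holds a writable descriptor, so closing it should not look like a write to anything watching the device.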

Information forwarded to bug-parted <at> gnu.org:
bug#25756; Package parted. (Thu, 16 Feb 2017 17:33:02 GMT)

Message #8 received at submit <at> debbugs.gnu.org:

From: "Brian C. Lane" <bcl <at> redhat.com>
To: bug-parted <at> gnu.org
Subject: Re: bug#25756: Problems using "parted ... print" on nvme devices
Date: Thu, 16 Feb 2017 09:32:18 -0800

On Thu, Feb 16, 2017 at 09:08:44AM -0600, Douglas Miller wrote:
> We have seen a problem in some test infrastructure that uses "parted ...
> print" to query partition information and then configure test cases. The
> problem shows up when using parted on nvme drives because systemd-udevd is
> monitoring nvme devices for changes to the partition tables, and rebuilds
> the devices. This results in the devices disappearing for a few seconds
> after running "parted ... print" and causing failures to configure tests.
> The root cause is that parted opens the device "O_RDWR" regardless of
> whether it is actually modifying the partition table, and this notifies
> systemd-udevd causing the disruption in the block devices.
> 
> I have not worked up a patch yet, or even studied the code in-depth, but it
> seems to me that parted could be better about using open modes that reflect
> its true intentions. Does that seem like a reasonable change?
> 
> I expect that our test infrastructure will have to be modified, probably to
> use fdisk or something other than parted, but it still seems like something
> to be fixed.

I think the tricky part of that is going to be that when we open the
device we don't really know what commands are going to be issued so it
needs to be RDWR to allow for all the other possibilities.

There should be some way to lock out udev during your tests.

-- 
Brian C. Lane (PST8PDT)

Information forwarded to bug-parted <at> gnu.org:
bug#25756; Package parted. (Wed, 19 Apr 2017 13:01:01 GMT)

Message #11 received at 25756 <at> debbugs.gnu.org:

From: Phil Susi <psusi <at> ubuntu.com>
To: "Brian C. Lane" <bcl <at> redhat.com>, 25756 <at> debbugs.gnu.org
Cc: systemd-devel <at> lists.freedesktop.org
Subject: Re: systemd mucking with partition tables ( was: bug#25756: Problems
 using "parted ... print" on nvme devices )
Date: Wed, 19 Apr 2017 09:01:55 -0400

On 2/16/2017 12:32 PM, Brian C. Lane wrote:
> I think the tricky part of that is going to be that when we open the
> device we don't really know what commands are going to be issued so it
> needs to be RDWR to allow for all the other possibilities.

I'm sure I have seen a patch floating around somewhere and been meaning
to merge it for some time that opens the device RO at first, then
switches to RW if and when it is required.  We should do that, but...

> There should be some way to lock out udev during your tests.

Why the hell has udev started mucking with the partition tables and dev
nodes every time someone opens the block device rw?  Parted and other
partitioning tools have always manipulated the in-memory partition table
themselves after updating the disk, so why does systemd now think this
is its responsibility?

Parted takes care to only manipulate the individual partitions that have
changed, but I'm not sure that systemd doesn't just blow them all away
and recreate them all, causing significant system wide disruption.
There are some open bugs in Ubuntu for the unity desktop where drives
you have unlocked from your hotbar reappear due to them being "removed"
and reappearing due to this behavior.  systemd should stop this nonsense.
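
The "open RO at first, then switch to RW if and when it is required" approach mentioned in the message above could look roughly like the following. This is a sketch under assumptions: `LazyDevice` and its methods are hypothetical names, and it is not the patch referred to in the message.

```python
import os


class LazyDevice:
    """Open a device read-only and reopen it read-write on the first write."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.fd = os.open(path, os.O_RDONLY)
        self.writable = False

    def _ensure_writable(self) -> None:
        # Upgrade the access mode only when a write is actually requested.
        if not self.writable:
            new_fd = os.open(self.path, os.O_RDWR)
            os.close(self.fd)
            self.fd = new_fd
            self.writable = True

    def read_at(self, offset: int, length: int) -> bytes:
        return os.pread(self.fd, length, offset)

    def write_at(self, offset: int, data: bytes) -> int:
        self._ensure_writable()
        return os.pwrite(self.fd, data, offset)

    def close(self) -> None:
        os.close(self.fd)
```

A session that only calls `read_at` never opens the device for writing, which addresses the objection that the command list is not known at open time.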

Information forwarded to bug-parted <at> gnu.org:
bug#25756; Package parted. (Wed, 19 Apr 2017 16:47:04 GMT)

Message #14 received at 25756 <at> debbugs.gnu.org:

From: Lennart Poettering <lennart <at> poettering.net>
To: Phil Susi <psusi <at> ubuntu.com>
Cc: systemd-devel <at> lists.freedesktop.org, "Brian C. Lane" <bcl <at> redhat.com>,
 25756 <at> debbugs.gnu.org
Subject: Re: [systemd-devel] systemd mucking with partition tables ( was:
 bug#25756: Problems using "parted ... print" on nvme devices )
Date: Wed, 19 Apr 2017 18:17:14 +0200

On Wed, 19.04.17 09:01, Phil Susi (psusi <at> ubuntu.com) wrote:

> On 2/16/2017 12:32 PM, Brian C. Lane wrote:
> > I think the tricky part of that is going to be that when we open the
> > device we don't really know what commands are going to be issued so it
> > needs to be RDWR to allow for all the other possibilities.
> 
> I'm sure I have seen a patch floating around somewhere and been meaning
> to merge it for some time that opens the device RO at first, then
> switches to RW if and when it is required.  We should do that, but...
> 
> > There should be some way to lock out udev during your tests.
> 
> Why the hell has udev started mucking with the partition tables and dev
> nodes every time someone opens the block device rw?  Parted and other
> partitioning tools have always manipulated the in-memory partition table
> themselves after updating the disk, so why does systemd now think this
> is its responsibility?

This isn't precisely new functionality, it has been doing that for
years. It will synthesize "change" udev events when a process closes a block
device after writing, so that the changed superblock/partition
information is properly propagated to clients.

Also note that parted never was in the business of retriggering block
devices through sysfs/udev (i.e. echoing "change" into a "uevent"
file in sysfs), only udev ever did that so far, and I am pretty sure
that should stay that way.

As long as there's a BSD lock in effect on a block device, udev won't
synthesize such events. Hence: if you want to make a series of
changes, and want to close the block device fds in the process, then
make sure to keep at least one fd open with a BSD lock in effect, and
your changes won't be propagated into udev events until you release
it. 

> Parted takes care to only manipulate the individual partitions that have
> changed, but I'm not sure that systemd doesn't just blow them all away
> and recreate them all, causing significant system wide disruption.
> There are some open bugs in Ubuntu for the unity desktop where drives
> you have unlocked from your hotbar reappear due to them being "removed"
> and reappearing due to this behavior.  systemd should stop this nonsense.

My recommendation: instead of calling the stuff we do "nonsense",
first figure out what is actually implemented.

Lennart
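
The BSD-lock advice in the message above can be sketched with flock(2): keep one descriptor open with the lock held while other descriptors on the same device are opened and closed. The helper name `hold_bsd_lock` is hypothetical, and whether udev defers its events while the lock is held is taken from the message, not verified here.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def hold_bsd_lock(path: str):
    """Keep one fd open with an exclusive flock(2) lock for the block's duration."""
    lock_fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(lock_fd, fcntl.LOCK_EX)
        yield lock_fd
    finally:
        # Closing the descriptor releases the lock.
        os.close(lock_fd)
```

A caller would wrap the whole series of changes in `with hold_bsd_lock(device_path):` and open and close its working descriptors inside that block.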

Information forwarded to bug-parted <at> gnu.org:
bug#25756; Package parted. (Wed, 19 Apr 2017 17:58:01 GMT)

Message #17 received at 25756 <at> debbugs.gnu.org:

From: Phil Susi <psusi <at> ubuntu.com>
To: Lennart Poettering <lennart <at> poettering.net>
Cc: systemd-devel <at> lists.freedesktop.org, "Brian C. Lane" <bcl <at> redhat.com>,
 25756 <at> debbugs.gnu.org
Subject: Re: [systemd-devel] systemd mucking with partition tables ( was:
 bug#25756: Problems using "parted ... print" on nvme devices )
Date: Wed, 19 Apr 2017 13:59:13 -0400

On 4/19/2017 12:17 PM, Lennart Poettering wrote:
> This isn't precisely new functionality, it has been doing that for
> years. It will synthesize "change" udev events when a process closes a block
> device after writing, so that the changed superblock/partition
> information is properly propagated to clients.
> 
> Also note that parted never was in the business of retriggering block
> devices through sysfs/udev (i.e. echoing "change" into a "uevent"
> file in sysfs), only udev ever did that so far, and I am pretty sure
> that should stay that way.

What?  The kernel must generate the event as otherwise systemd has no
idea that a process on the system closed its handle to the device, and
so would not know when it should trigger them.  Or do you mean that the
kernel only triggers on the main device, and udev now synthesizes events
on the partitions?

That could explain why udevadm monitor is now showing me KERNEL change
events on the partitions as well, unless the kernel itself really is
generating those internally?  I'm fairly certain these events on the
partition devices did not used to happen, and they should not be
happening now.  Changing one partition should not cause a change event
on another partition that has not been changed in any way.

> As long as there's a BSD lock in effect on a block device, udev won't
> synthesize such events. Hence: if you want to make a series of
> changes, and want to close the block device fds in the process, then
> make sure to keep at least one fd open with a BSD lock in effect, and
> your changes won't be propagated into udev events until you release
> it. 

The timing isn't the issue, so using a lock to delay does not help.

Information forwarded to bug-parted <at> gnu.org:
bug#25756; Package parted. (Thu, 20 Apr 2017 09:58:02 GMT)

Message #20 received at 25756 <at> debbugs.gnu.org:

From: Lennart Poettering <lennart <at> poettering.net>
To: Phil Susi <psusi <at> ubuntu.com>
Cc: systemd-devel <at> lists.freedesktop.org, "Brian C. Lane" <bcl <at> redhat.com>,
 25756 <at> debbugs.gnu.org
Subject: Re: [systemd-devel] systemd mucking with partition tables ( was:
 bug#25756: Problems using "parted ... print" on nvme devices )
Date: Thu, 20 Apr 2017 11:57:01 +0200

On Wed, 19.04.17 13:59, Phil Susi (psusi <at> ubuntu.com) wrote:

> On 4/19/2017 12:17 PM, Lennart Poettering wrote:
> > This isn't precisely new functionality, it has been doing that for
> > years. It will synthesize "change" udev events when a process closes a block
> > device after writing, so that the changed superblock/partition
> > information is properly propagated to clients.
> > 
> > Also note that parted never was in the business of retriggering block
> > devices through sysfs/udev (i.e. echoing "change" into a "uevent"
> > file in sysfs), only udev ever did that so far, and I am pretty sure
> > that should stay that way.
> 
> What?  The kernel must generate the event as otherwise systemd has no
> idea that a process on the system closed its handle to the device, and
> so would not know when it should trigger them.  Or do you mean that the
> kernel only triggers on the main device, and udev now synthesizes events
> on the partitions?

The kernel generates inotify IN_CLOSE_WRITE events, and udev then
retriggers the device.

Lennart

-- 
Lennart Poettering, Red Hat
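
To illustrate the mechanism described in the last message, the sketch below watches a path for inotify IN_CLOSE_WRITE events through ctypes. The function names are hypothetical and this is not udev's code. It shows that the event fires when a descriptor opened for writing is closed, even if nothing was written, which is consistent with the original report.

```python
import ctypes
import ctypes.util
import os
import struct

IN_CLOSE_WRITE = 0x00000008  # value from <sys/inotify.h>

_libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6", use_errno=True)


def watch_close_write(path: str) -> int:
    """Return an inotify fd that reports IN_CLOSE_WRITE events for `path`."""
    fd = _libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    wd = _libc.inotify_add_watch(fd, os.fsencode(path), IN_CLOSE_WRITE)
    if wd < 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, "inotify_add_watch failed")
    return fd


def read_event_mask(inotify_fd: int) -> int:
    """Block until one event arrives and return its mask field."""
    buf = os.read(inotify_fd, 4096)
    # struct inotify_event begins with: int wd; uint32_t mask, cookie, len.
    _wd, mask, _cookie, _name_len = struct.unpack_from("iIII", buf, 0)
    return mask
```

Opening the watched path with `os.O_RDWR` and closing it immediately is enough for `read_event_mask` to return a mask with `IN_CLOSE_WRITE` set; opening it with `os.O_RDONLY` does not produce that event.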

This bug report was last modified 8 years and 60 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.