Now that the external metadata handler must provide an md-compatible
old/new geometry, sys_set_array can do all of the array set-up for
an array that is undergoing reshape.
That leaves less for reshape_array to do.
Also clean up how reshape_array tells if the reshape has started or
not.
Don't use ->reshape_active as that doesn't tell us anything consistent
at this stage, only use the 'restart' flag passed in.
Signed-off-by: NeilBrown <neilb@suse.de>
Problem: sysfs_disk_to_scsi_id() does not return the correct scsi_id value.
Reason: the sysfs format has changed.
This patch adapts sysfs_disk_to_scsi_id() to the new sysfs format.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
- When reshaping a container, ->reshape_active is already set
even though it isn't really active yet, so we need to set
the new geometry even when reshape_active is set. This is safe.
- When restarting a reshape, make sure the reshape_position is set
appropriately when external metadata is used.
Signed-off-by: NeilBrown <neilb@suse.de>
Sometimes we want to convert a devnum to a devname without allocating
memory. So provide a function to do the formatting without allocation.
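A minimal sketch of allocation-free formatting, for illustration only. The
name format_devname() and the naming convention (non-negative devnum maps to
"mdN", negative devnum maps to "md_dN") are assumptions, not taken from the
patch itself.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format the md device name for 'devnum' into the
 * caller-supplied buffer, so nothing is allocated.
 * Assumed convention: devnum >= 0 is a plain array ("md3"), devnum < 0 is
 * a partitionable array ("md_d0" for -1, "md_d1" for -2, ...).
 * Returns 0 on success, -1 if the buffer is too small. */
static int format_devname(char *buf, size_t len, int devnum)
{
	int n;

	assert(buf != NULL);
	if (devnum >= 0)
		n = snprintf(buf, len, "md%d", devnum);
	else
		n = snprintf(buf, len, "md_d%d", -1 - devnum);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The caller owns the buffer, typically a small array on the stack, so the
helper can be used in paths where allocating memory is not wanted.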
Signed-off-by: NeilBrown <neilb@suse.de>
When disk is added, set valid slot numbers (positive) only.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
In order to support reshape and atomic removal of spares from containers
we need to prevent mdmon from activating spares. In the reshape case we
additionally need to freeze sync_action while the reshape transaction is
initiated with the kernel and recorded in the metadata.
When reshaping a raid0 array we need to freeze the array *before* it is
transitioned to a redundant raid level. Since sync_action does not exist
at this point we extend the '-' prefix of a subarray string to flag
mdmon not to activate spares.
Mdadm needs to be reasonably certain that the version of mdmon in the
system honors this 'freeze' indication. If mdmon is not already active
then we assume the version that gets started is the same as the mdadm
version. Otherwise, we check the version of mdmon as returned by the
extended ping_monitor() operation. This is to catch cases where mdadm
is upgraded in the filesystem, but mdmon started in the initramfs is
from a previous release.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
...i.e. GET_DEVS == (GET_DEVS|SKIP_GONE_DEVS)
A null pointer dereference in Incremental.c can be triggered by
replugging a disk while the old name is in use. When mdadm -I is called
on the new disk we fail the call to sysfs_read(). I audited all the
locations that use GET_DEVS and it appears they can tolerate missing a
drive. So just make SKIP_GONE_DEVS the default behaviour.
Also fix up remaining unchecked usages of the sysfs_read() return value.
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Now that mdmon records periodic checkpoints, and checkpoints on every
->set_array_state() event, we no longer need to 'idle' sync_action from
--wait-clean.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Minimal changes needed to permit reassembling partially recovered
external metadata arrays. The biggest logical change is that
->container_content() can now surface partially rebuilt members rather
than omitting them from the disk list.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The patch increases the capacity of buffers used to store
sysfs path names. Originally the buffers were too small to
hold the canonical representation of sysfs path (in case
of a SAS device, especially a device installed behind an
expander).
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Reviewed-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Connect to the monitor in the old namespace and use that connection for
WaitClean requests when stopping the victim mdmon instance. This allows
ping_monitor() to work post chroot().
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When building container members with -IR, we need to ensure that
devices added to an active array preserve the 'in_sync' status so they
don't needlessly get rebuilt.
So allow sysfs_add_disk to do this (only works in kernels since
2.6.30) and pass the relevant flag down.
Signed-off-by: NeilBrown <neilb@suse.de>
For short reshapes the kernel may be done before mdadm can check that
progress has passed the critical section.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
imsm arrays round down the effective array size to the closest 1
megabyte boundary so teach get_info_super_imsm and sysfs_set_array to
set 'md/array_size' if available (and make sure ddf uses the default
size).
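The rounding itself can be sketched as below. This assumes the size is
tracked in sectors and that the boundary is 1 MiB (2048 sectors of 512
bytes); the helper name and the sector-size parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MIB_BYTES (1024u * 1024u)

/* Hypothetical helper: round a size given in sectors down to the nearest
 * 1 MiB boundary. 'sector_size' is in bytes and must divide 1 MiB. */
static uint64_t round_down_to_mib(uint64_t sectors, unsigned int sector_size)
{
	uint64_t per_mib;

	assert(sector_size > 0 && MIB_BYTES % sector_size == 0);
	per_mib = MIB_BYTES / sector_size;   /* 2048 for 512-byte sectors */
	return sectors - (sectors % per_mib);
}
```

For example, with 512-byte sectors a size of 4097 sectors rounds down to
4096 sectors, which is exactly 2 MiB.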
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If an array reshape completed within 1 second, then --grow will not
notice that it has finished and will keep waiting for the critical
section to pass.
So be more cautious in the test.
Signed-off-by: NeilBrown <neilb@suse.de>
All operations that rely on loading from an existing container (like
--add) will fail after a disk has been removed. Provide an option to
skip missing / offline disks rather than abort. We attempt to do this
in the load_super_{imsm,ddf}_all cases when mdmon is running i.e. we
already have a consistent version of the metadata running in the system.
Otherwise, we fail as normal and let the administrator fix up the
container.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When a 'container' gets started, we need udev to notice, but the
kernel has no way of knowing that a KOBJ_CHANGE event is needed. So
send one directly via the 'uevent' sysfs attribute.
Also, uevents don't get generated when md arrays are stopped (prior to
2.6.28) so send 'change' events then too.
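Sending the event amounts to writing the string "change" to the device's
'uevent' attribute. A minimal sketch, assuming the caller passes the full
attribute path (for example something like /sys/block/md0/uevent); the
function name is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write "change" to a sysfs 'uevent' attribute so
 * that udev is told to re-examine the device.
 * Returns 0 on success, -1 on any error. */
static int send_change_uevent(const char *uevent_path)
{
	static const char msg[] = "change\n";
	ssize_t n;
	int fd;

	assert(uevent_path != NULL);
	fd = open(uevent_path, O_WRONLY);
	if (fd < 0)
		return -1;
	n = write(fd, msg, strlen(msg));
	close(fd);
	return n == (ssize_t)strlen(msg) ? 0 : -1;
}
```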
Signed-off-by: NeilBrown <neilb@suse.de>
Both sysfs_init and stat2devnum try to convert stat information
into an md devnum. Combine the useful parts of both pieces of code
into stat2devnum and have sysfs_init call that.
Signed-off-by: NeilBrown <neilb@suse.de>
From 2.6.28, normal md devices will be able to have partitions. These
partitions will have a different major number. Sometimes mdadm tests
the major number and so can get confused.
Change these tests to test against get_mdp_major(). mdp does not use
extended minor numbers, so this test will always be accurate.
Also use /sys/dev links to map major/minor to devnum in sysfs.
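A sketch of the /sys/dev lookup, under stated assumptions: the link lives at
/sys/dev/block/MAJOR:MINOR, its target ends in the device name, and names map
to devnums as "mdN" -> N and "md_dN" -> -1-N. The helper names are
hypothetical, and the readlink() step between the two helpers is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build the /sys/dev/block/MAJ:MIN path. Returns 0, or -1 if 'buf' is
 * too small. */
static int sys_dev_block_path(char *buf, size_t len,
			      unsigned int maj, unsigned int min)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "/sys/dev/block/%u:%u", maj, min);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Parse the last path component of a link target into a devnum.
 * Assumed mapping: "md3" -> 3, "md_d2" -> -3. Returns 0 on success,
 * -1 if the name is not an md device name. */
static int devnum_from_link_target(const char *target, int *devnum)
{
	const char *name;
	char *end;
	long v;
	int partitionable = 0;

	assert(target != NULL && devnum != NULL);
	name = strrchr(target, '/');
	name = name ? name + 1 : target;
	if (strncmp(name, "md_d", 4) == 0) {
		partitionable = 1;
		name += 4;
	} else if (strncmp(name, "md", 2) == 0) {
		name += 2;
	} else {
		return -1;
	}
	if (*name < '0' || *name > '9')
		return -1;
	v = strtol(name, &end, 10);
	if (*end != '\0' || v > 1000000)	/* arbitrary sanity bound */
		return -1;
	*devnum = partitionable ? (int)(-1 - v) : (int)v;
	return 0;
}
```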
Signed-off-by: NeilBrown <neilb@suse.de>
Set the safemode timeout to a small value to get the array marked clean as
soon as possible. We don't write 'clean' directly as it may cause mdmon to
miss a 'write-pending' event.
Include a couple fixes to sysfs_set_safemode():
1/ 0 pad the milliseconds field
2/ workaround input truncation in the kernel
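To illustrate why fix 1/ matters: if the delay is written as
"seconds.milliseconds", an unpadded 5 ms would be printed as "0.5", which
reads as half a second. A minimal sketch of the padded formatting follows;
the helper name is hypothetical and the kernel truncation workaround from
fix 2/ is not shown.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format a delay given in milliseconds as
 * "seconds.mmm" with the fractional part zero-padded to three digits.
 * Returns 0 on success, -1 if 'buf' is too small. */
static int format_safemode_delay(char *buf, size_t len, unsigned long ms)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "%lu.%03lu", ms / 1000, ms % 1000);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```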
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When arrays do not start up correctly it would be nice to know why. Need
to move the dprintf definition to mdadm.h
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When we assemble an array, there are three different approaches
depending on whether metadata is internal or external, and on
kernel version.
Move all this to a common helper instead of duplicating in 3 places.
Signed-off-by: NeilBrown <neilb@suse.de>
The variety of approaches to 'add_disk' are factored out into
a separate function, and Incremental mode benefits by being
closer to supporting the assembly of containers.
Also move the adding-to-array-data-structure step out of sysfs_add_disk
and into add_disk.
And add some tests for --incremental mode to make sure we don't break it.
Signed-off-by: NeilBrown <neilb@suse.de>
When removing a disk there is a window where the 'slot' attribute of
md/dev-$name will return -EBUSY to read attempts. When this happens
look at the 'block' link; if it is removed then we can be sure the
device has been removed, versus some other error.
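The decision logic can be sketched as below. The helper name, the return
convention, and passing in the errno from the failed 'slot' read are all
assumptions of this sketch; 'dev_dir' stands for the md/dev-$name directory.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: decide whether a member device is gone, given the
 * errno from a failed read of its 'slot' attribute.
 * Returns 1 if the 'block' link has disappeared (device removed),
 *         0 if the link is still present (removal not finished),
 *        -1 for any other error, where nothing can be concluded. */
static int device_removed(const char *dev_dir, int slot_errno)
{
	char path[512];
	struct stat st;
	int n;

	assert(dev_dir != NULL);
	if (slot_errno != EBUSY)
		return -1;
	n = snprintf(path, sizeof(path), "%s/block", dev_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (lstat(path, &st) == 0)
		return 0;
	return errno == ENOENT ? 1 : -1;
}
```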
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
For use in distro shutdown scripts with a RAID root file system.
Returns immediately if the array is 'readonly', or not an externally
managed array. It is up to the distro's scripts to make sure no new
writes hit the device after this returns 'true'.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The cmd_filter patch merged for 2.6.27 broke retrieving the serial
number via an ioctl to /dev/sgN. In debugging this I found that other
utilities like sdparm simply run the ioctl on /dev/sdX. So just convert
to that for protection in numbers, but scream on the mailing list for
the inconvenience grr...
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This really should be done in mdadm, not mdmon.
We ensure the device won't be suddenly committed as a hot-spare
using O_EXCL, then check the 'holders' sysfs directory
to make sure it is only in use once.
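The 'holders' check can be sketched as counting the entries of that
directory. The helper name is hypothetical, the caller is assumed to pass
the full path of the device's 'holders' directory, and the O_EXCL open is
not shown.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

/* Hypothetical helper: count the entries in a sysfs 'holders' directory,
 * ignoring "." and "..". A count of exactly 1 would mean the device is
 * held once. Returns -1 if the directory cannot be opened. */
static int count_holders(const char *holders_dir)
{
	struct dirent *de;
	DIR *dir;
	int count = 0;

	assert(holders_dir != NULL);
	dir = opendir(holders_dir);
	if (dir == NULL)
		return -1;
	while ((de = readdir(dir)) != NULL) {
		if (strcmp(de->d_name, ".") == 0 ||
		    strcmp(de->d_name, "..") == 0)
			continue;
		count++;
	}
	closedir(dir);
	return count;
}
```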