If the device name "missing" is given for --re-add, then mdadm will
attempt to find any device which should be a member of the array but
currently isn't and will --re-add it to the array.
This can be useful if a device disappeared due to a cabling problem,
and was then re-connected.
The appropriate sequence would be
mdadm /dev/mdX --fail detached
mdadm /dev/mdX --remove detached
mdadm /dev/mdX --re-add missing
Signed-off-by: NeilBrown <neilb@suse.de>
Adding devices to active arrays in --incremental is a bit dubious.
Normally the array won't be activated until all expected devices are
present, so this situation would mean that the given device is not
expected, so is probably failed. In that case it should only be added
by explicit sysadmin request.
However if --run was given, then quite possibly the array was
assembled earlier when not complete, so it is less clear whether it is
wrong to add this device or not. In that case add it as that is
generally safest.
It would be nice to allow policy for this to be explicitly given by
sysadmin.
Signed-off-by: NeilBrown <neilb@suse.de>
commit 1ff9833928
broke the checking of metadata types via the 'auto' line.
Be moving 'load_super" before "conf_test_metadata" we left
tst->sb set even if conf_test_metadata fails, so the device will
actually be accepted and used.
So if we decide to reject the device, free the superblock so it is
clear that it is rejected.
Signed-off-by: NeilBrown <neilb@suse.de>
When using 0.90 metadata, devices can be renumbered when
earlier devices are removed.
So when iterating all devices looking for 'failed' or 'detached'
devices, we need to re-check the same slot we checked last time
to see if maybe it has a different device now.
Reported-by: Jim Paris <jim@jtan.com>
Resolves-Debian-Bug: 587550
Signed-off-by: NeilBrown <neilb@suse.de>
- split the rules for handling components of array to be clearly
separate from rules for handling the arrays themselves.
- add call to "-If" when removing a device
- uncomment the --incremental call when adding a device.
Signed-off-by: NeilBrown <neilb@suse.de>
This can be used for hot-unplug. When a device has been remove,
udev can call
mdadm --incremental --fail sda
and mdadm will find the array holding sda and remove sda from
the array.
Based on code from Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Allow kernel names like "sda" and "hdb1" to be used to
fail/remove devices from an array.
This is useful as after a device has been removed it can be difficult
to get the major/minor number.
Signed-off-by: NeilBrown <neilb@suse.de>
This allows finding the array which contains a given component.
Components are named using the kernel-internal string name such
as "sda1" or "hdb".
Don't return member arrays, only the contain that contains them.
Also tidy up the parsing of 'inactive' arrays in /proc/mdstat.
If we see 'inactive' we need to set 'in_devs' immediately as there
is no level coming.
Signed-off-by: NeilBrown <neilb@suse.de>
In some places it is referred to as "--rebuild", and while
that works due to getopt allowing prefixes, it could appear
confusing (rebuild means other things too) and being explicit
is some safeguard if we want to add e.g. --rebuild-foo later.
Reported-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This commit introduces DEFAULT_OLD_METADATA as a preprocessor
definition. If defined, it causes mdadm to assume metadata version 0.9
as default. If not defined, version 1.x (currently 1.2) is used as
default.
The man page mdadm.8 is also modified to reflect the chosen default.
The selftests will not work if the old default is chosen.
This patch was requested by Debian so they could distribute a current
mdadm together with boot loaders that only understand 0.90 metadata
for md-raid.
Preferred usage is simply
make DEFAULT_OLD_METADATA=yes
Signed-off-by: martin f. krafft <madduck@debian.org>
Signed-off-by: NeilBrown <neilb@suse.de>
If the metadata is an empty string, it means the array in question
does not use metadata. This comes from sysfs_read finding "none" in
"metadata_version", then super_by_fd noticing the vers == -1, and so
just using the ->text_version (which is empty).
In this case we want to use the super0 metadata handler routines
because that is what we always used to do before
commit 7d5c3964cc
And that commit was wrong because "" doesn't mean "default" and so
should not have been changed at the same time.
Reported-by: martin f. krafft <madduck@debian.org>
Signed-off-by: NeilBrown <neilb@suse.de>
update_recovery_start() assumed that the out-of-sync disk would always be
marked as IMSM_ORD_REBUILD in the disk_ord_tbl, but the segmentation
fault reported by Andy proves otherwise. This might also be explained by
an interrupted rebuild and the disk has not yet been marked missing.
https://bugzilla.redhat.com/show_bug.cgi?id=592030
Reported-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This reverts commit fdb482f99b.
Now that containers can report state for ->container_enough we can
automatically determine when the array can be started, and no longer
need the --no-degraded hammer.
Conflicts:
Incremental.c
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This is needed for imsm where:
1/ we want to report raid_disks as zero to allow mdadm -As to
incorporate all spares
2/ we can't determine stale disks by looking at the event counts.
3/ we can't see per-subarray expectations with the info returned from
the container level ->getinfo_super()
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Normally Monitor doesn't see faulty devices in active slots - they get
moved away too quickly.
But if it does, it reports the "faulty device disappeared" event (when
it finally does get moved away) as SpareActive due to insufficient
checking.
So add a better check.
Reported-by: Pierre Vignéras <pierre@vigneras.name>
A recent change move the sysfs_read call away from the check that it
succeeded. This patch moves the check back next to the sysfs_read
call.
Signed-off-by: NeilBrown <neilb@suse.de>
devnum2devname() returns pointer to memory allocated with strdup.
It must be released to prevent memory leak.
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When activating a spare we neglect to open recovery_start and as such do
not see checkpoint events. Move disk initialization to common routine
to mitigate recurrence.
Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Chunks aren't particularly big, but when you could them in bytes
and multiply them together (as we do for calculating the backup
size for 'grow') they can overflow a 32bit int.
So group the division by 512 more closely with the
chunk size so were would need 30Meg chunks to come close to
overflowing 32bits.
Signed-off-by: NeilBrown <neilb@suse.de>
These metadata are not expected on partitions, and they have
no way of differentiation whether which is correct if they
are found both on the device and on the last partition.
So if the device is a partition, refuse to read the metadata.
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm prevents creation when device names are duplicated on the command
line, but leaves the partially created array intact. Detect this case
in the error code from add_to_super() and cleanup the partially created
array. The imsm handler is updated to report this conflict in
add_to_super_imsm_volume().
Note that since neither mdmon, nor userspace for that matter, ever saw an
active array we only need to perform a subset of the cleanup actions.
So call ioctl(STOP_ARRAY) directly and arrange for Create() to cleanup
the map file rather than calling Manage_runstop().
Reported-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
I suspect commit c132678b wanted VAR_RUN and ALT_RUN to be suffixed by
'/mdadm' to match the defaults in mdadm.h.
Cc: Luca Berra <bluca@comedia.it>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
filesystem is writable).
This particularly keeps udev happy if VAR_RUN is set to /dev/md.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Default metadata type is now 1.2, and we sometimes
add extra alignment before the data section,
so adjust tests for these changes.
Signed-off-by: NeilBrown <neilb@suse.de>
Matching the functionality already in super0 and super1, when
we first create a container, remove any other recognisable metadata to
ensure it doesn't cause confusion.
Signed-off-by: NeilBrown <neilb@suse.de>
Found during testing:
- cannot check metadata for homehost before loading metadata.
- As 1.x metadata can has a state 'rebuilding' between
'spare' and 'ok', we need to include that in our calculations.
Signed-off-by: NeilBrown <neilb@suse.de>
When checking if the new chunk size fit in the component size
we were confusing sectors and K, and so getting it wrong.
Signed-off-by: NeilBrown <neilb@suse.de>
Code to check partition tables used some needless casts
and was broken, using a u8 when a u32 was wanted.
So create structure describing the tables rather than using offset,
and read into those tables instead.
Signed-off-by: NeilBrown <neilb@suse.de>
This reverts commit 9ef5dbff4a as it is
duplicating the check that is done internal to imsm_read_serial().
Signed-off-by: Dan Williams <dan.j.williams@intel.com>