This implicitly adds ddf containers to 'add' and 'remove' rules.
Signed-off-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: NeilBrown <neilb@suse.de>
When forking mdmon we need to close all other fds because we don't
use O_CLOEXEC yet.
Any approach will be fairly arbitrary, but as we can expect fds to be
fairly dense, closing until we find a set number that don't need
closing is possibly safer than only closing the first 100.
So keep closing until we find 20 that are already closed.
Signed-off-by: NeilBrown <neilb@suse.de>
When not all attributes are supported (attributes incompatibility)
function container_content_imsm returns NULL pointer.
We need to cope with a NULL list better.
Reported-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Description of the bug:
Sometimes mdmon crashes after changing RAID level from 1 to 0 (takeover).
Cause of the bug:
The managemon marks an active_array for removal from monitoring
by assigning a->container to NULL value (in the "manage_member" function).
Sometimes (during stress test) it happens right when the monitor
is in the "read_and_act" function and a->container pointer is in use.
This causes the monitor to crash.
Solution:
The active array has to be marked for removal in another way
than setting NULL pointer when it can be in use.
A new field "to_remove" was added to the "active_array" structure.
It is used in the managemon to mark an array for removal
(instead of the old assignment: a->container = NULL)
and the monitor checks it to determine if the array should be removed.
The field "to_remove" should be checked in some other places
to avoid managing of the array which is going to be removed.
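
A minimal sketch of the idea, under stated assumptions: the structures
below are simplified stand-ins (the real "active_array" has many more
fields), the function names are hypothetical, and the synchronisation
between the two threads is left out entirely.

```c
#include <stddef.h>

struct container { int id; };           /* hypothetical stand-in */

struct active_array {
    struct container *container;        /* stays valid while monitored */
    int to_remove;                      /* set by the manager */
    struct active_array *next;
};

/* Manager side: flag the array instead of clearing a->container. */
static void mark_for_removal(struct active_array *a)
{
    a->to_remove = 1;
}

/* Monitor side: unlink flagged arrays from the list and return the
 * number of arrays that are still monitored. Nothing is freed here. */
static int prune_removed(struct active_array **head)
{
    int kept = 0;

    while (*head) {
        struct active_array *a = *head;

        if (a->to_remove) {
            *head = a->next;            /* unlink, pointer untouched */
        } else {
            kept++;
            head = &a->next;
        }
    }
    return kept;
}
```

The point of the design is that a->container is never changed under the
monitor, so a read_and_act-style loop can keep dereferencing it safely
until the monitor itself drops the array.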
Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Specifying missing devices at create is very useful for array recovery.
For imsm, create dummy disk entries at init_super_imsm time, and then use
them to fill in unoccupied slots in the final array (if the container is
unpopulated).
If the container is already populated (has a subarray),
'missing' disks must refer to missing devices already recorded
in the metadata.
Also add support for --assume-clean for imsm arrays.
Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
The following test fails when the md_check_recovery() event triggered by
the ro->rw transition causes remove_and_add_spares() to run while mdmon
is attempting spare activation.
The result is that the kernel races to set the slot immediately after
sysfs_add_disk() writes new_dev. mdmon thinks the spare activation
failed and declines to send the monitor a new active_array. We show
degraded after the wait because the monitor cannot notify the metadata
that all disks are in_sync.
#!/bin/bash
i=0
false
while [ $? == 1 ]
do
	i=$((i+1))
	mdadm -Ss
	mdadm -CR /dev/md0 /dev/loop[0-2] -n 3 -e imsm
	mdadm -CR /dev/md1 /dev/loop[01] missing -n 3 -l 5
	mdadm --wait /dev/md1
	mdadm -E /dev/loop2 | grep -i degraded
done
echo "failed: $i"
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Different OROMs reserve different amounts of space for the migration area.
When activating a spare, minimize the reserved space; otherwise a valid
spare can be prevented from joining an array with a migration area smaller
than IMSM_RESERVED_SECTORS.
This may result in an array that cannot be reshaped, but that is less
surprising than not being able to rebuild a degraded array.
imsm_reserved_sectors() already reports the minimal value, which adds to
the confusion when trying to rebuild an array because mdadm -E indicates
that the device has enough space.
Cc: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Not sure how this ever worked, but now we just try to parse a directory
name that looks like <host>:<bus>:<target>:<lun>.
Array creation segfaults on Fedora 14 without this.
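
For illustration, parsing such a directory name could look like the
sketch below. The struct and function names are hypothetical and this
is not the code from the patch; it only shows one way to accept exactly
four colon-separated numbers and reject anything else.

```c
#include <stdio.h>

struct scsi_addr {              /* hypothetical holder for the result */
    int host, bus, target, lun;
};

/* Return 0 if 'name' looks like <host>:<bus>:<target>:<lun>, -1
 * otherwise. On failure 'out' may be partially written. */
static int parse_scsi_addr(const char *name, struct scsi_addr *out)
{
    char trailing;
    int n = sscanf(name, "%d:%d:%d:%d%c",
                   &out->host, &out->bus, &out->target, &out->lun,
                   &trailing);

    /* Exactly four conversions: all numbers present, no trailing junk. */
    return n == 4 ? 0 : -1;
}
```

Names such as "host2" or "target2:0:1" in the same sysfs directory fail
the check and can simply be skipped.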
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Commit 94827db3 "imsm: add spares to --examine output." may try to
display failed disks whose imsm_disk info is not up to date (due to not
being able to look itself up by serial). The same effect can be had by
just loosening the restriction in print_imsm_disk().
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
We already refrain from updating metadata on disks that are failed at
load; we need to do the same for new failures. This also reverts b4add146
as we *do* want to update other disks' view of the failed device as out of
date.
Cc: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Validate geometry is incorrectly looking at the maximum supported disk
count, which is irrelevant for md/mdadm. ->dpa (disks per array) is how
many disks the orom will allow per volume.
Also clean up an unnecessary ->orom check; is_raid_level_supported()
already does the right thing in the !orom case.
Cc: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Update to latest from http://www.gnu.org/licenses/gpl-2.0.txt
This corrects the address and some other minor details.
No change in the license.
Signed-off-by: NeilBrown <neilb@suse.de>
Some bits are not handled by mdadm, but their presence should not
cause failure.
In particular MPB_ATTRIB_NEVER_USE appears harmless.
Reported-by: Thomas Steinborn <thestonewell@googlemail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
There are some more times when we don't care that the hardware doesn't
support the metadata:
- when removing old metadata
- when reporting the metadata present before over-writing it.
So set ignore_hw_compat in these cases.
Signed-off-by: NeilBrown <neilb@suse.de>
The 'raid_disk' can be different to the 'number' and must be
the position of the device in the array, not in the container.
Normally these should not be different, but the test-suite creates a
possibility so it should work.
Signed-off-by: NeilBrown <neilb@suse.de>
- explain its use in guarding against small replacements
- clarify relationship with containers.
- include information about --grow --size not being supported by IMSM metadata.
Reported-by: maciej.naruszewicz <maciej.naruszewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
As the conditional checks, reshape to fewer devices is supported
since Linux kernel 2.6.30 not 2.6.32.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Set a valid default in the !orom case, otherwise we segfault or
fail in some other way.
Cc: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
getinfo_super_imsm_volume doesn't correctly set info.disk fields
because it doesn't know which disk to set them from.
It should be the last disk passed to add_to_super.
So add a field 'current_disk' to record this disk in add_to_super, and
use it in getinfo_super.
This allows us to remove a hack in Create.c
Signed-off-by: NeilBrown <neilb@suse.de>
This patch fixes these build issues:
super-intel.c: In function 'getinfo_super_imsm_volume':
super-intel.c:2327:4: error: format '%llu' expects argument of type 'long long
unsigned int', but argument 3 has type '__u64' [-Werror=format]
super-intel.c: In function 'imsm_reshape_super':
super-intel.c:8665:7: error: 'devnum' may be used uninitialized in this function [-Werror=uninitialized]
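
For reference, one common way to silence the first kind of warning is an
explicit cast to the type that '%llu' expects. The sketch below is an
assumption about the general technique, not the actual change: it uses
uint64_t in place of '__u64' and a hypothetical helper name.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit value portably: the cast makes the argument type
 * match '%llu' regardless of how the 64-bit typedef is defined. */
static int format_u64(char *buf, size_t len, uint64_t value)
{
    return snprintf(buf, len, "%llu", (unsigned long long)value);
}
```

The second warning is normally handled by giving the variable a
definite initial value at its declaration.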
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
The serial number is not necessarily nul terminated, so we need to be
sure to only use the allowed number of chars.
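
A bounded copy of this kind might be sketched as follows. The field
width SERIAL_LEN (16 here) and the helper name are assumptions made for
the example; the destination must have room for the terminator.

```c
#include <stddef.h>
#include <string.h>

#define SERIAL_LEN 16   /* assumed width of the on-disk serial field */

/* Copy at most SERIAL_LEN bytes from a field that may lack a nul
 * terminator. 'dst' must hold at least SERIAL_LEN + 1 bytes. */
static void copy_serial(char *dst, const unsigned char *src)
{
    size_t len = 0;

    while (len < SERIAL_LEN && src[len] != '\0')
        len++;
    memcpy(dst, src, len);
    dst[len] = '\0';
}
```

The same effect can be had when printing by using a precision, as in
"%.16s", so that printf never reads past the field.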
Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: Arvin Schnell <aschnell@novell.com>
Previously there was no proper translation of mdstat's DELAYED/PENDING
states to "--detail" output.
For example, if we have recover=DELAYED in mdstat, "--detail"
shows "State: recovering" and "Rebuild Status = 0%".
This is incorrect when the process is waiting at a checkpoint other
than 0%: the real rebuild status differs from 0% and the user is misled.
This patch fixes the problem. For the example above, "--detail" now
shows "State: recovering (DELAYED)" and no information
about the percentage.
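
The mapping can be pictured with the sketch below. The sentinel values
and both function names are assumptions made for the example (negative
"percent" values standing in for the non-numeric mdstat states); only
the "(DELAYED)" suffix comes from the description above, and "(PENDING)"
is assumed to follow the same pattern.

```c
#include <stdio.h>

/* Assumed sentinels stored in place of a percentage. */
#define RESYNC_NONE    -1
#define RESYNC_DELAYED -2
#define RESYNC_PENDING -3

/* Suffix to append to the "State:" line for a given progress value. */
static const char *sync_suffix(int percent)
{
    if (percent == RESYNC_DELAYED)
        return " (DELAYED)";
    if (percent == RESYNC_PENDING)
        return " (PENDING)";
    return "";
}

/* A percentage line is only meaningful for a real progress value. */
static int show_percent(int percent)
{
    return percent >= 0;
}
```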
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Since info->delta_disks is signed it doesn't need to be special-cased.
This allowed my 9->8 reshape to display correctly instead of as 8->7
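
As a worked example of why no special case is needed, assume (this is
an assumption for the sketch, as is the helper name) that raid_disks
holds the target device count and delta_disks the signed change. The
old count is then a plain subtraction for both growing and shrinking:

```c
#include <stdio.h>

/* Format "<old>-><new>" from the target count and the signed delta. */
static int format_reshape(char *buf, size_t len,
                          int raid_disks, int delta_disks)
{
    return snprintf(buf, len, "%d->%d",
                    raid_disks - delta_disks, raid_disks);
}
```

With raid_disks = 8 and delta_disks = -1 this gives "9->8".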
Signed-off-by: NeilBrown <neilb@suse.de>
Fix random typos and add a few missing words/macros.
Also update the RAID website URL as the old one is not accessible anymore.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
It is supposed to be under Grow mode. Since Create/Build/Grow modes
use common options and '-a' is already used for '--auto' in Create/
Build modes, describe it to avoid confusion.
Signed-off-by: NeilBrown <neilb@suse.de>
The next version of Linux might be 3.0. If it is, get_linux_version
will fail.
So make it more robust.
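
A more tolerant parser could look like the sketch below. It is not the
code from the patch: the function name is hypothetical, and the
encoding major*1000000 + minor*1000 + patch is an assumption chosen so
that versions compare as plain integers. A missing third component is
treated as 0.

```c
#include <stdlib.h>

/* Parse a kernel release string such as "2.6.32" or "3.0-rc1".
 * Returns major*1000000 + minor*1000 + patch, or -1 if the string
 * does not start with at least "<major>.<minor>". */
static int parse_linux_version(const char *release)
{
    char *end;
    long a, b, c = 0;

    a = strtol(release, &end, 10);
    if (end == release || *end != '.')
        return -1;

    release = end + 1;
    b = strtol(release, &end, 10);
    if (end == release)
        return -1;

    if (*end == '.') {              /* optional third component */
        const char *p = end + 1;

        c = strtol(p, &end, 10);
        if (end == p)
            c = 0;
    }
    if (a < 0 || b < 0 || c < 0)
        return -1;
    return (int)(a * 1000000 + b * 1000 + c);
}
```

The string would normally come from uname(2); anything after the
numeric components, such as "-rc1", is ignored.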
Reported-by: Namhyung Kim <namhyung@gmail.com>
Reported-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
The 'best' array only has 'bestcnt' entries allocated, so 'i' should
always be "< bestcnt", not "<= bestcnt".
Reported-by: "Lawrence, Joe" <Joe.Lawrence@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Looking at the GPT code in util.c, I found I did not like it at all. A
GPT partition entry is currently 128 bytes, but the spec does not say that
is a fixed value, so the code that reads into a buffer in 512-byte
chunks, expecting this to be a multiple of part_size, is IMHO incorrect.
My fix was to read each partition entry directly into a struct
GPT_part_entry. The advantage is that the code is very simple to read;
the disadvantage is that it takes 128 reads of 128 bytes each, which is
sub-optimal, but I believe readahead will mitigate this a lot.
Signed-off-by: NeilBrown <neilb@suse.de>
Now that getinfo_super clears the info structure, we need
to make sure the correct values are filled in to info->dev.
Signed-off-by: NeilBrown <neilb@suse.de>
Now that we reserve space for the migration buffer, devices are a little
bit smaller, so we need to allow for that in the test suite.
Also add a 'udevadm settle' - it seems to help.
Signed-off-by: NeilBrown <neilb@suse.de>