If a device hasn't been found yet we can still tell if it is
expected to be working, and we must to do to make sure
'working_disks' is correct.
Signed-off-by: NeilBrown <neilb@suse.de>
This is needed to remove devices from mdmon's knowledge when the
device is removed from the md container.
Now that ddf have a remove_from_super we don't need the code
that allows some personalities not to implement this.
Signed-off-by: NeilBrown <neilb@suse.de>
If a RAID1 is meant to have more than 2 device and while it doesn't
have that many, it still has more than 1, then according to the
DDF spec it is "partially optional" rather than "degraded"
So make that so.
Signed-off-by: NeilBrown <neilb@suse.de>
The DDF spec requires we have a phys disk record for every physically
attached device. But it isn't clear what that means in the case
of soft raid in a general purpose Linux computer.
So remove phys disk records for any failed device that is not
active in any array.
Signed-off-by: NeilBrown <neilb@suse.de>
This is a big fragile, but DDF has wierd rules that we aren't really
set up to handle properly.
When we add a device to a degraded array it must be a spare, so
mark it as Rebuilding.
Signed-off-by: NeilBrown <neilb@suse.de>
This is needed to remove devices from mdmon's knowledge when the
device is removed from the md container.
Now that ddf have a remove_from_super we don't need the code
that allows some personalities not to implement this.
Signed-off-by: NeilBrown <neilb@suse.de>
We weren't setting ->vcnum at all when an array was added. This
meant that a subsequent device failure could be assigned to the
wrong array.
Reported-by: Albert Pauw <albert.pauw@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When set_disk is called, we need to check if the disk has changed or
recently appeared, and update everything properly if it has.
Signed-off-by: NeilBrown <neilb@suse.de>
1/ ignore devices with "state_fd < 0" as these have been removed.
2/ Set update 'length' properly and clear 'space'.
Signed-off-by: NeilBrown <neilb@suse.de>
We were setting ->container_member twice in ddf get_info.
Once to currentconf->vcnum,
once to atoi(st->subarray).
Both should be the same.
For consistency with super-intel, use the first.
Signed-off-by: NeilBrown <neilb@suse.de>
the new code for defaulting chunksizes didn't work quite right
- default was set to late in super1/super0/ddf
- defaults would over-ride values of '0' imposed by some levels
- default value wasn't applied to size properly.
Signed-off-by: NeilBrown <neilb@suse.de>
If a device is failed, then don't include it in the reported
container_content, else it might get included in the array.
Reported-by: Albert Pauw <albert.pauw@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When chunk size is not set from command line we need to guess it
depending on metadata given on command line or found on listed devices.
Validate_geometry sets the default for it's metadata if chunk is not set.
For external metadata chunk is set only when creating in a container.
For imsm validate_geometry_imsm_orom is responsible for finding default
chunk depending on container metadata loaded. Container will already know
which controller it is attached to, and have this controllers orom
available.
do_default_chunk indicates that we need to find default chunk and
if validate_geometry fails for some metadata it tells us to reset chunk
that may have been set.
Current solution would set default chunk correctly for imsm only if
container device was given on command line. With the list of devices
chunk was always set to 512.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
We previously closed all 'fds' associated with an array in
write_init_super .. sometimes, and sometimes at bad times.
This isn't neat and free_super is a better place to close them.
So make sure free_super always closes the fds that the metadata
manager kept hold of, and stop closing them in write_init_super.
Also add a few more calls to free_super to make sure they really do
get closed.
Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Now that we check the error return of 'update_super' better, we
much make sure that ddf doesn't incorrectly report that the
superblocks are wrong during assemble.
Signed-off-by: NeilBrown <neilb@suse.de>
Support metadata specific level, layout and chunksize defaults. Kill an
uneeded superswitch methods ahead of adding more for the reshape case.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
We more clearly separate out -I on a container, and use
load_container in that case and load_super only for true members.
This removes another use of loaded_container.
Signed-off-by: NeilBrown <neilb@suse.de>
This handles the 'container' part of 'load_super', so we can
soon make them completely separate - it is just confusing to
overload these two.
Signed-off-by: NeilBrown <neilb@suse.de>
It makes more sense to test for container_dev than for subarray
for several places in Create where it then uses container_dev.
This allows us to subsequently remove subarray.
Signed-off-by: NeilBrown <neilb@suse.de>
This allows the info for a single array to be extracted,
so we don't have to write it into st->subarray.
For consistency, implement container_content for super0 and super1,
to just return the mdinfo for the single array.
Signed-off-by: NeilBrown <neilb@suse.de>
We were setting ->container_member twice in ddf get_info.
Once to currentconf->vcnum,
once to atoi(st->subarray).
Both should be the same.
For consistency with super-intel, use the first.
Signed-off-by: NeilBrown <neilb@suse.de>
To accurately detect when an array has been split and is now being
recombined, we need to track which other devices each thinks is
working.
We should never include a device in an array if it thinks that the
primary device has failed.
This patch just allows get_info_super to return a list of devices
and whether they are thought to be working or not.
Signed-off-by: NeilBrown <neilb@suse.de>
If an --update is requested by the relevant metadata doesn't
understand it, print a useful message rather than silently ignoring
the issue.
Signed-off-by: NeilBrown <neilb@suse.de>
The container_enough code change broke ddf as ddf never claimed
'enough' devices. So change it to always claim 'enough' to
restore previous behaviour.
Signed-off-by: NeilBrown <neilb@suse.de>
...i.e. GET_DEVS == (GET_DEVS|SKIP_GONE_DEVS)
A null pointer dereference in Incremental.c can be triggered by
replugging a disk while the old name is in use. When mdadm -I is called
on the new disk we fail the call to sysfs_read(). I audited all the
locations that use GET_DEVS and it appears they can tolerate missing a
drive. So just make SKIP_GONE_DEVS the default behaviour.
Also fix up remaining unchecked usages of the sysfs_read() return value.
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Support for deleting a subarray out of a container. When all subarrays
are deleted the component devices are converted back into spares, a
--zero-superblock is still needed to kill the remaining metadata at this
point. This operation is blocked when the subarray is active and may
also be blocked by the metadata handler when deleting the subarray might
change the uuid of other active subarrays. For example, with imsm,
deleting subarray 'n' may change the uuid of subarrays with indexes > n.
Deleting a subarray needs to be a container wide event to ensure
disks that record the modified subarray list perceive other disks that
did not receive this change as out of date.
Notes:
The st->subarray parsing in super-intel.c and super-ddf.c is updated to
be more strict now that we are reading user supplied subarray values.
Offline container modification shares actions that mdmon typically
handles so promote is_container_member() and version_to_superswitch()
(formerly find_metadata_methods()) to generic utility functions for the
cases where mdadm performs the operation.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This is needed for imsm where:
1/ we want to report raid_disks as zero to allow mdadm -As to
incorporate all spares
2/ we can't determine stale disks by looking at the event counts.
3/ we can't see per-subarray expectations with the info returned from
the container level ->getinfo_super()
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
These metadata are not expected on partitions, and they have
no way of differentiation whether which is correct if they
are found both on the device and on the last partition.
So if the device is a partition, refuse to read the metadata.
Signed-off-by: NeilBrown <neilb@suse.de>
Matching the functionality already in super0 and super1, when
we first create a container, remove any other recognisable metadata to
ensure it doesn't cause confusion.
Signed-off-by: NeilBrown <neilb@suse.de>
1.x metadata allows a device to be a member of the array while it
is still recoverying. So it is a working member, but is not
completely in-sync.
mdadm/assemble does not understand this distinction and assumes that a
work member is fully in-sync for the purpose of determining if there
are enough in-sync devices for the array to be functional.
So collect the 'recovery_start' value from the metadata and use it in
assemble when determining how useful a given device is.
Reported-by: Mikael Abrahamsson <swmike@swm.pp.se>
Signed-off-by: NeilBrown <neilb@suse.de>