Now that we check the error return of 'update_super' better, we
much make sure that ddf doesn't incorrectly report that the
superblocks are wrong during assemble.
Signed-off-by: NeilBrown <neilb@suse.de>
As chunk_size in mdstat_ent is never set, we shouldn't copy
it into a->info.array.
In fact, it is safest to get rid of the field altogether.
Reported-by: "Kwolek, Adam" <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
We shouldn't call remove_partitions until we have made a really firm
decision to include the device into the array.
Signed-off-by: NeilBrown <neilb@suse.de>
If adding a bitmap fails with EBUSY, then it is because the array is
currently resyncing/recovering/reshaping.
As this is non-obvious, give a message explaining the fact.
Signed-off-by: NeilBrown <neilb@suse.de>
We already allow K,M,G suffixes for --size and --array-size.
Allow it for --chunk and --bitmap-chunk as well.
Also add this info to man page, and remove the duplication of info
about --array-size.
Signed-off-by: NeilBrown <neilb@suse.de>
number of backup blocks evaluation is put in to function for code reuse.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
fd handles table creation is put in to function for code reuse.
In manage_reshape(), child_grow() function from Grow.c will be reused.
To prepare parameters for this function, code from Grow.c can be
reused also.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Assumption for spares searching was that after picking new device, it
has to be added to array before next search. This causes returning
different disk on each call.
When creating a spare list during Online Capacity Expansion, we will
first collect the devices list and then all devices are added to md.
Picked device from spares pool has to be checked against picked
devices so far. If not, the same disk will be returned all the time.
Already picked devices are stored in the list and this list is used
for new devices verification also.
So add an extra arg to imsm_add_spare to hold a list of known
spares to ignore.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Wrong number of disks during metadata update causes core dump. New
disks number based on internal mdmon information has to used for
calculation (not previously read from metadata).
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Internal IMSM procedures need to support the General Migration.
It is used during operations like:
- Online Capacity Expansion,
- migration initialization,
- finishing migration,
- apply changes to raid disks etc.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Due to fact that IMSM Windows compatibility was not tested yet,
feature has to be treated as experimental until compatibility
verification will be performed.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Until now Raid10->Raid0 takeover was possible only if all the mirrors
where removed before md starts the takeover. Now mdadm, when
performing Raid10->raid0 takeover, will remove all unwanted mirrors
from the array before actual md takeover is called.
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Also not that the leading '-' on the metadata names now
simply means that mdmon must not reconfiure the array.
Signed-off-by: NeilBrown <neilb@suse.de>
If a degraded array disappears we still have it in statelist
with active<raid but it is pointless to look for spares for it.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If /var/run/mdadm doesn't exist we can never succeed writing
so we should try to create it first. When we make sure it is there we
write pid file as before.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org, Williams, Dan J <dan.j.williams@intel.com>, Ciechanowski, Ed <ed.ciechanowski@intel.com>
Otherwise spare will be considered good anyway.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
For consistency with makedev().
int is not sufficient.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
1. If array not changed we should still report any degraded
- another array may have a new spare that we can move.
2. Array with err=1 can't give a spare.
3. We look for spares in "from" not "st" which is supertype
and has devname=NULL.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This is a series of tests checking if mdadm Monitor migrates spares
according to rules in /etc/mdadm.conf defined by POLICY lines.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If a devices - typically in a mirrored set - is assembled
independently of the other devices, and then attempted to be brought
back into the set, it could contain inconsistent data. It should not
be included.
So detect this situation by ensuring that the 'most recent' device is
believed to be active by every other device. If a device is wayward,
it will only consider fellow wayward devices to be active and will
think all others are failed or missing.
This patches fixes --incremental, --assemble was done in an earlier
patch.
Signed-off-by: NeilBrown <neilb@suse.de>
In several cases, two different long options map to the same short
option. So e.g. you could give '--brief' and it would be interpreted
as '--bitmap'. That isn't really good.
So for every shared short option, define an option number and return
that for the long option instead. Then always check for both the
short and long options.
Also give some bugs like " mode == 'G'" which should be '== GROW'.
Signed-off-by: NeilBrown <neilb@suse.de>
Rather than calling sysfs_read whenever we want data from sysfs, call
it once at the start will all the requests of interest, then just use
that,
Make sure we free it properly too.
Signed-off-by: NeilBrown <neilb@suse.de>
When metadata is managed externally - probably as a container - we
need to examine that metadata to see which devices are spares.
So use the getinfo_super_disk message and use the info returned.
Signed-off-by: NeilBrown <neilb@suse.de>
choosing a spare from a container is more complicated that
from a native array. So separate out choose_spare to make it easier
to use an alternate implementation
Signed-off-by: NeilBrown <neilb@suse.de>
When growing the number of raid disks the reshape process will promote
container-spares to subarray-spares (later the kernel promotes them to
subarray-members in raid5_start_reshape()). The automatic spare
promotion that mdmon performs upon seeing a degraded array must be
disabled until the reshape process has been initiated. Otherwise, mdmon
may start a rebuild before the reshape parameters can be specified.
In the external case we arrange for the monitor to be blocked, and
turn off the safemode delay.
Mdmon is updated to check sync_action is not frozen before initiating
recovery.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
In the native metadata case Grow_reshape() and the kernel validate what
reshapes are possible / supported and the kernel handles all the metadata
updates. In the external case the metadata format may have specific
constraints above this baseline. External formats also introduce the
constraint of only permitting some reshapes at container scope versus subarray
scope. For exmaple imsm changes to 'raiddisks' must be applied to all arrays
in the container.
This operation assumes that its 'st' parameter has been obtained from
super_by_fd() (such that st->subarray is up to date), and that a snapshot of
the metadata has been loaded from the container.
Why a new method, versus extending an existing one?
->validate_geometry: this routine assumes it is being called from Create(),
adding reshape complicates the cases that this routine needs to handle. Where
we find that checks can be shared between the two cases those routines
refactored into common code internal to the metadata handler, i.e. no need to
provide a unified external interface. ->validate_geometry() also does not
expect to update the metadata.
->update_super: this is meant to update single fields at Assembly() and only at
the container scope. Reshape potentially wants to update multiple fields at
either container or subarray scope.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Precludes needing to deduce this information later, like in Detail.c and
soon in Grow.c.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Support metadata specific level, layout and chunksize defaults. Kill an
uneeded superswitch methods ahead of adding more for the reshape case.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If the user does not specify a layout, don't skip asking about retaining
the non-standard raid6 layout which may be implicitly changed.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Incremental assembly works on such an array because the kernel sees the
disk as in-sync and that the array is reshaping. Teach Assemble() the
same assumptions.
This is only needed on kernels that do not initialize ->recovery_offset
when activating spares for reshape.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Going through the Grow api found some local routines that could be
marked static.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm --readwrite <subarray> will clear the external readonly flag ('-'
to '/'), but only for redudant arrays. Allow raid0 arrays as well so
the user has a simple helper to control this flag.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
In order to support reshape and atomic removal of spares from containers
we need to prevent mdmon from activating spares. In the reshape case we
additionally need to freeze sync_action while the reshape transaction is
initiated with the kernel and recorded in the metadata.
When reshaping a raid0 array we need to freeze the array *before* it is
transitioned to a redundant raid level. Since sync_action does not exist
at this point we extend the '-' prefix of a subarray string to flag
mdmon not to activate spares.
Mdadm needs to be reasonably certain that the version of mdmon in the
system honors this 'freeze' indication. If mdmon is not already active
then we assume the version that gets started is the same as the mdadm
version. Otherwise, we check the version of mdmon as returned by the
extended ping_monitor() operation. This is to catch cases where mdadm
is upgraded in the filesystem, but mdmon started in the initramfs is
from a previous release.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
...before introducing another open coded instace of this conversion.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When checking that a container matches the required uuid,
we need to call 'getinfo_super' before we have a 'content'
to test.
Reported-by: "Czarnowska, Anna" <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Platform (metadata) domain let the metadata handlers differentiate
disk domains based on controllers that the disk belongs to.
Platform domain is sub-domain inside user specified domain
in mdadm.conf configuration files inheriting all parameters from it.
The metadata domain name is used disk domain matching functions.
The disk with the same metadata domain name belong to the same metadata
domain.
New metadata handler is added that retrieves platform domain string based
on disk path:
const char *(*get_disk_controller_domain)(const char *path);
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When trying to move a spare, move to the container of a degraded
array, not to the array itself.
And don't try to move from a subarray, only from a native or container
array.
And don't move from a container which contains degraded subarrays.
Signed-off-by: NeilBrown <neilb@suse.de>
Rather than only migrating between arrays with the same spare_group,
we now migrate based on domains set in the policy.
In order for spare_group to continue to work, we treat it as a domain
of the destination array, and a domain of any device we might remove
from a source array.
Signed-off-by: NeilBrown <neilb@suse.de>
Checking compatibility between arrays for spare migration is going to
become a little more complicated, so split it out into a separate
function.
Signed-off-by: NeilBrown <neilb@suse.de>
Rather than passing mailaddr, mailfrom, cmd, dosyslog around in
argument lists, create a structure to hold them all.
Signed-off-by: NeilBrown <neilb@suse.de>