This allows the metadata on a device to be saved and later restored.
This can be useful before experimenting on an array that is misbehaving.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Fix bug in previous patch
"DDF: compare_super_ddf: merge local info of other superblock"
Just discovered this bug in my last patch set - unfortunately, just after
you committed it.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
The 10ddf-create test case fails sporadically because wrong meta
data is written, making the array appear inconsistent when it's
restarted. Added code to aid debugging this.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Commit c1ea5a98 caused brief_detail_super_ddf() to be called
for subarrays. But the UUID printed was always the one of the
container. This is wrong and actually worse than printing no UUID
at all, and causes the DDF test case (10ddf-create) to fail.
This patch adds code to determine the MD UUID of a subarray correctly.
The hard part is to figure out for which subarray the function is
called. Moved that to an extra function.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
It's not necessary to check for 0xffffffff, which is a valid
sequential number.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
If a match is found in compare_super_ddf, check the other SB
for local DDF information (VD config records, physical disk data)
which is not available in the current superblock, and add it
if needed.
This is important for the mdmon - when disks are added to a
auto read-only array, they must be present in the DDF structure
in order to guarantee consistent writeback of metadata to all
disks.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Besides container GUID, also check seqnum, physical and virtual
disk numbers, and check match between local and global sections.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
When writing back the DDF structure, make sure that on each disk
we write the configs that include this disk even if a secondary
RAID level is present. Otherwise the secondary RAID will not be
read correctly any more when we open the device next time.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Check for supportable secondary RAID configurations.
There is currently only one: RAID 10, if the stripe
sizes and Basic volume sizes are all equal.
With this patch, mdadm will not try to start unsupported
secondary RAID level configurations any more.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
When searching for container elements, loop over the known phys
disks rather than the elements of the current configuration.
This patch changes nothing in the logic or return value of the code.
It just prepares extended logic for handling RAID10.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Store VD config for other BVDs in the other_bvds array.
This allows handling secondary RAID levels in container_content_ddf.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
The VD config structures of different BVDs in the same SVD may be
different. This pointer stores the other BVDs.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Cleanly increase the seq number when the DDF structures are
written, instead of always setting it back to 1.
Also, make sure that the sequential number of all headers and
VD conf records is the same.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Some RAID BIOSes apparently use hard-coded LBA offsets (presumably
from the end of the disk) for the primary and secondary DDF
structure, ignoring the values given in the DDF anchor. This is
broken BIOS behavior, but it will cause any changes made by MD
(e.g. setting the init_state flag after a full initialization)
to be "forgotten" after the next reboot.
This patch fixes this by using the exiting LBA locations if
available. Verified that this fixes MD+LSI Mega Software RAID
BIOS.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
So far, mdadm only saved the header of the secondary structure.
With this patch, the full secondary DDF structure is saved
consistently, too. Some vendor DDF implementations need it.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.
So get rid of devnum (a number) and use devnm (a 32char string).
eg.
md0
md_d2
md_home
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm --create /dev/md0 .... /dev/sda1:1024 /dev/sdb1:2048 ...
The size is in K unless a suffix: K M G is given.
The suffix 's' means sectors.
Signed-off-by: NeilBrown <neilb@suse.de>
This is currently only useful for 1.x metadata and will allow an
explicit --data-offset request on command line.
Signed-off-by: NeilBrown <neilb@suse.de>
When adding a spare to a DDF there is some confusion about the
'level' of the container. It is reported by kernel as unknown
-1000000.
I don't know why this broke but until I figure out why and fix it,
this hack gets us going again.
Signed-off-by: NeilBrown <neilb@suse.de>
If a DDF has two arrays sharing devices and one device fails, then
as soon as the spare is used to recover one of the arrays it isn't
spare any more and so is not chosen for the other array.
Work around this for now by allowing a non-spare to be used if it has
enough space.
Reported-by: Albert Pauw <albert.pauw@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If we change some functions to accept 'verbose', where <0 means to be
quiet, in place of 'quiet', then we will be able to merge
'quiet' and 'verbose' together for simplicity.
Signed-off-by: NeilBrown <neilb@suse.de>
malloc should never fail, and if it does it is unlikely
that anything else useful can be done. Best approach is to
abort and let some super-daemon restart.
So define xmalloc, xcalloc, xrealloc, xstrdup which don't
fail but just print a message and exit. Then use those
removing all the tests for failure.
Also replace all "malloc;memset" sequences with 'xcalloc'.
Signed-off-by: NeilBrown <neilb@suse.de>
We need to check every disk to see if it has a particular amount of
space spare, not just every subsequent disk.
Signed-off-by: NeilBrown <neilb@suse.de>
When container is assembled while reshape is active on one of its member
whole container can be required to be blocked from monitoring.
For such purpose field recovery blocked is added to mdinfo structure.
When metadata handler finds active reshape in container it should set
recovery_blocked field to disable whole container monitoring during
reshape.
For arrays that doesn't use containers, recovery_blocked field
has the same value as reshape_active field e.g. super0/1.
In fact,recovery is blocked during reshape for such arrays.
For ddf, metadata handler doesn't set reshape_active field,
so recovery_blocked is not set also.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Now that getinfo_super clears the info structure, we need
to make sure the correct values are filled in to info->dev.
Signed-off-by: NeilBrown <neilb@suse.de>
container_content_imsm was setting info->next before calling
getinfo_super_imsm_container which now zeros everything.
So move that assignment to afterwards.
So both imsm and ddf were assuming info->disk.raid_disk means
something but it doesn't. So fix those.
Signed-off-by: NeilBrown <neilb@suse.de>
Some code currently clears 'info' before calling getinfo_super,
some code doesn't.
To be consistent, change it so no caller ever clears 'info',
but ever getinfo_super function must clear it.
Note that ->raid_disk may be meaningful if that 'map' is passed
non-NULL. In that case it is copied out before the structure
is zeroed.
Signed-off-by: NeilBrown <neilb@suse.de>
If a device hasn't been found yet we can still tell if it is
expected to be working, and we must to do to make sure
'working_disks' is correct.
Signed-off-by: NeilBrown <neilb@suse.de>
This is needed to remove devices from mdmon's knowledge when the
device is removed from the md container.
Now that ddf have a remove_from_super we don't need the code
that allows some personalities not to implement this.
Signed-off-by: NeilBrown <neilb@suse.de>
If a RAID1 is meant to have more than 2 device and while it doesn't
have that many, it still has more than 1, then according to the
DDF spec it is "partially optional" rather than "degraded"
So make that so.
Signed-off-by: NeilBrown <neilb@suse.de>
The DDF spec requires we have a phys disk record for every physically
attached device. But it isn't clear what that means in the case
of soft raid in a general purpose Linux computer.
So remove phys disk records for any failed device that is not
active in any array.
Signed-off-by: NeilBrown <neilb@suse.de>
This is a big fragile, but DDF has wierd rules that we aren't really
set up to handle properly.
When we add a device to a degraded array it must be a spare, so
mark it as Rebuilding.
Signed-off-by: NeilBrown <neilb@suse.de>
This is needed to remove devices from mdmon's knowledge when the
device is removed from the md container.
Now that ddf have a remove_from_super we don't need the code
that allows some personalities not to implement this.
Signed-off-by: NeilBrown <neilb@suse.de>
We weren't setting ->vcnum at all when an array was added. This
meant that a subsequent device failure could be assigned to the
wrong array.
Reported-by: Albert Pauw <albert.pauw@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When set_disk is called, we need to check if the disk has changed or
recently appeared, and update everything properly if it has.
Signed-off-by: NeilBrown <neilb@suse.de>