Neil rightly points out that imsm_activate_spare may skip valid free space
on a spare, fix this up.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Following the lead of 75ede16d. This incidentally fixes creation of a second
array by gating call to getinfo_super_imsm_volume with a valid ->current_vol.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Using buffered IO risks non-atomic updates to parts of the
device that we don't actually want to write to. This isn't in
general safe.
So switch to O_DIRECT for all that IO and make sure we have
properly aligned buffers.
When loading the metadata for a subarray (super_by_fd), we set
->subarray to be the name read from md/metadata_version so that
getinfo_super can return info about the correct array.
With this we can differentiate between a container and
an array within the container by looking at ->subarray[0].
Only one superswitch should be externally visible for each
general type. Others which handle different flavours
(e.g. container/data-array) should be internal only.
for development only as console output can block leading to monitor deadlocks
in low mem situations
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A spare device by definition will have raid_disk set to -1, but when
assembling the container we want this disk to by included.
Fixes a SIGSEGV when doing:
mdadm -A /dev/imsm -e imsm /dev/sd[b-e]
...where /dev/sde is marked as a global spare device
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
It is used later by container_content_imsm to determine set the
text_version of the member arrays.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Code in manager can now just call queue_metadata_update with a
(freeable) buf holding the update, and it will get passed to the
monitor and written out.
I want the metadata handler to have more control over the 'version',
particularly for arrays which are members of containers.
So discard st->text_version and instead use info->text_version
which getinfo_super can initialise.
mark_dirty is just a special case of mark_clean - with sync_pos == 0.
mark_sync is not required. We don't modify the metadata when sync
finishes. Only when the array becomes non-writeable at which point we
use mark_clean to record how far the resync progressed.
From: Dan Williams <dan.j.williams@intel.com>
Added curr_state as a parameter to set_disk. Handlers look at this to
record components failures, and set global 'degraded' or 'failed'
status.
When reading the state as faulty:
1/ mark the disk failed in the metadata
2/ write '-blocked' to the rdev state to allow the kernel's failure
mechanism to advance
3/ the kernel will take away the drive's role in remove_and_add_spares()
4/ once the disk no longer has a role writing 'remove' to the rdev state
will get the disk out of array.
There is a window after writing '-blocked' where the kernel will return
-EBUSY to remove requests. We rely on the fact that the disk will
continue to show faulty so we lazily wait until the kernel is ready to
remove the disk. If the manager thread needs to get the disk out of the
way it can ping the monitor and wait, just like the replace_array()
case.
[buglet fix: swap the parameters of attr_match in read_dev_state]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
From: Dan Williams <dan.j.williams@intel.com>
Metadata handlers set mdinfo.resync_start depending on the state of the
array. By default mdadm assumes the array is dirty and needs a full
resync.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
From: Dan Williams <dan.j.williams@intel.com>
The following now work:
--examine
--examine --brief
Signed-off-by: Dan Williams <dan.j.williams@intel.com>