Using buffered IO risks non-atomic updates to parts of the
device that we don't actually want to write to. This isn't in
general safe.
So switch to O_DIRECT for all that IO and make sure we have
properly aligned buffers.
1/ track if there are any actual updates pending, and only
write metadata when we have changed something.
2/ when writing null virtual-configs, write full blocks,
not just the first 4 bytes. This will allow O_DIRECT
writes in a subsequent patch.
1/ close a race where multiple arrays disappear at once
and monitor isn't woken up to find out that the last one
has gone.
2/ "mdadm -Ss" needs to pause briefly for mdmon to exit.
The returned value was never used, and we don't really want
this return path anyway as writing to a pipe could conceivably
block, and the monitor must not block.
This really should be done in mdadm, not mdmon.
We ensure the device won't be suddenly commited as a hot-spare
using O_EXCL, then check the 'holders' sysfs directory
to make sure it is only in use once.
When loading the metadata for a subarray (super_by_fd), we set
->subarray to be the name read from md/metadata_version so that
getinfo_super can return info about the correct array.
With this we can differentiate between a container and
an array within the container by looking at ->subarray[0].
Only one superswitch should be externally visible for each
general type. Others which handle different flavours
(e.g. container/data-array) should be internal only.
Using write_init_super to add a spare to an active array is quite
different to how it is used when creating an array.
It mostly works, but if we are adding two devices to an array,
then when we add the second, there are still traces of the first
which confuse write_init_super.
So get write_init_super to ignore those traces. Longer term, we
probably want to do this differently as for DDF, hot-adding to
an active array will have to be quite different - it will want to
write to all metadata, possibly via mdmon.
For arrays that don't have redundancy (raid0, linear etc), the
clean/dirty distinction doesn't mean anything. So always
'assume clean' for these arrays.
Two places that should have been .B or .I were .M, which of
course is an error.
So fix these and that the opportunity to be more consistent
with highlighting of "mdadm". Make it always italic except when
used as a literal example, then bold.
It is important that dup_super always returns an 'st' with the same
->ss and ->minor_version as the st that was passed.
This wasn't happening for 0.91 metadata (i.e. in the middle of a reshape).
for development only as console output can block leading to monitor deadlocks
in low mem situations
Signed-off-by: Dan Williams <dan.j.williams@intel.com>