If an array is being reshaped using backup space on a 'spare' device,
then
mdadm --grow --continue
won't find it as by the time it runs, nothing looks like a spare are
more. The spare has been added to the array, but has no data yet.
So allow reshape_prepare_fdlist to find a newly-incorporated spare and
report this so it can be used.
Reported-by: Xiao Ni <xni@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When the array is stopped during a critical section, we sometimes
erase the backup, which is bad.
This happens when 'completed' is zero.
This can happen easily when 'stop' freezes reshape.
So try to be more careful and check 'reshape_position'.
Signed-off-by: NeilBrown <neilb@suse.de>
Some old kernels set 'completed' to '0' too soon.
But modern kernels don't.
And when 'mdadm --stop' freezes and resume the grow,
'completed' goes back to zero briefly, which can confuse this
logic.
So only think '0' might be wrong from an old kernel when
the reshape has gone idle.
Signed-off-by: NeilBrown <neilb@suse.de>
EBUSY can be returned if something has recently happened
to cause md to want to check if recovery is needed, but hasn't
had a chance yet.
This can easily happen in testing.
So retry a few times in that case.
Signed-off-by: NeilBrown <neilb@suse.de>
We might be trying to set_new_data_offset() for RAID10, when it is
a necessary requirement, or for RAID5 where it is optional.
In the latter case, a message about metadata versions is no helpful.
Signed-off-by: NeilBrown <neilb@suse.de>
Kernels between
c6563a8c38fde3c1c7fc925a v3.5-rc1~110^2~53
and
b5254dd5fdd9abcacadb5101 v3.5-rc1~110^2~51
allow new_offset to be set, but don't then allow a RAID5
to be reshaped to change that offset.
Due to selective backports, this includes the SLES11-SP3 kernel.
It is quite easy to handle this case in mdadm, so we do.
Specifically: if the reshape with data-offset fails with EINVAL,
abort the data-offset change and try the "old" way.
Signed-off-by: NeilBrown <neilb@suse.de>
The buffer passed on to readlink() needs to contain space for the
terminating \0. See 'man 3 readlink' for details.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
It is best to keep strings all together so that they
are easier to search for in the source code.
If a string is so long that it looks ugly one line,
them maybe it should be broken into multiple lines
for display too.
Only strings which contain a newline can be broken
into multiple lines:
"It is OK to\n"
"break this string\n"
Signed-off-by: NeilBrown <neilb@suse.de>
make dprintf() print program name and __func__, so that
this messaging is consistent.
Also remove all __func__ messages from pr_err(). We shouldn't
leak that internal data in error message.
If we really want function name there, we new pr_XXX might
be wanted.
Signed-off-by: NeilBrown <neilb@suse.de>
Sometimes mdadm prints messages with wrong name "mdmon",
and vice versa.
This patch solves this problem by changing method of determining
process name.
Now "Name" will be set in const at start of a program,
previously was hardcoded as #define.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Commit a821c95f11
besides introducing additional message, also changed
direct return to "goto" instruction.
'goto release' will cause routine to return with '-1',
when previously '1' was returned.
Described behaviour breaks e.g. IMSM reshape process.
This patch fixes this issue by changing 'goto' to proper one -
the one that returns '1'.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If the request --size to --grow an array to is larger
than 32bits, then mdadm may make the wrong choice and
use ioctl instead of setting component_size via sysfs
and the change is ignored.
Instead of using casts to check for a 32-bit overflow,
just check for set bits outside of INT32_MAX.
Fixes: 4e9a3dd16d
Signed-off-by: NeilBrown <neilb@suse.de>
Report when the array's metadata needs updating instead of just
reporting the generic "kernel too old" message.
Signed-off-by: Andy Smith <andy@strugglers.net>
Signed-off-by: NeilBrown <neilb@suse.de>
Linear arrays don't respond to setting raid-disks, only to
adding a device.
Reported-by: mulhern
Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1122146
Signed-off-by: NeilBrown <neilb@suse.de>
If the request --size to --grow an array to is 32bits
(i.e. msb in bit 32) then mdadm make wrong choice and
uses ioctl instead of setting component_size via sysfs
and the change is ignored.
This is fixed by using correct casts.
Reported-and-tested-by: Killian De Volder <killian.de.volder@megasoft.be>
Signed-off-by: NeilBrown <neilb@suse.de>
Grow process did not check if reshape is already started
when deciding about restarting.
Sync_action should be checked in this case, and if
reshape is running - restart flag should not be set.
Otherwise, Grow process will fail to write data to
sysfs, and reshape will not be continued.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Commit 18d9bcfa33
removed wrong line (in case RAID0->RAID4).
This patch corrects this mistake
(line should be removed in case RAID4->RAID4).
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Mdadm should not run 'grow-continue' unit file for container if
'--freeze-reshape' argument is passed. Otherwise it will be ignored,
and reshape will start anyway.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Similar to commit 06e293d097
same thing should be done for reshape_container in Grow_continue
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Commit fdcad551e9
brings some changes to reshape process.
Setting 'before.layout' when reshaping from RAID4 to another RAID4 is
not really necessary.
If reshape is restarted 'before.layout' will be compared with
'info->array.layout' in reshape_array(). Changes brought by mentioned
commit will cause this comparation return as false, becouse 'array.layout'
is always set to 'ALGORITHM_PARITY_N' in analyse_change() for RAID4, so
reshape will not be continued after reboot/stop.
This patch reverts unnecessary changes.
Signed-off-by: Pawel Baldysiak pawel.baldysiak@intel.com
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Commit 5e76dce1ac changed
Grow_continue to assume a fork had already happened, so that
mdadm --grow --continue
didn't fork. This is good, but it means that if Grow_continue
is run from Assemble, then
mdadm --assemble ....
can misbehave if the array was in the middle of a reshape.
So introduce finer control. Grow_continue only assumes it has
already forked if run from "mdadm --grow --continue".
Signed-off-by: NeilBrown <neilb@suse.de>
This is a better match for reshape_array() and means that
"mdadm --grow --continue" will run in the foreground, which
makes more sense.
Signed-off-by: NeilBrown <neilb@suse.de>
If "--assemble" or "--incremental" is started by udev, then
monitoring the reshape in the background won't work.
So try asking systemd to start a grow-continue.
If that fails, just do it the old way.
Signed-off-by: NeilBrown <neilb@suse.de>
Subsequent patch will allow the background part of "mdadm --grow" to
be run from systemd. This can require the passing of a backup file
name.
To do this, store that name as a symlink in /run/mdadm (or MAP_DIR)
and look for it when appropriate.
It might be useful to also store the name across reboot, but that
would be a different patch. We would need to use the uuid to identify
it, and store it in stable storage.
Signed-off-by: NeilBrown <neilb@suse.de>
1/ when unfreezing, make sure the array is frozen first.
If it isn't we might end up interrupting a reshape.
2/ When the child finishes, don't call abort_reshape() as that
will interrupt the reshape. Just set suspend_* etc
explicitly.
Signed-off-by: NeilBrown <neilb@suse.de>
Since:
commit 84d11e6c6a
Author: NeilBrown <neilb@suse.de>
Date: Thu Aug 1 11:16:14 2013 +1000
Grow: exit background thread cleanly on SIGTERM.
removed the setting of "sync_max" from abort_reshape() we need
to do it explicitly here.
Signed-off-by: NeilBrown <neilb@suse.de>
If the mdadm thread that monitors a reshape gets SIGTERM it should
exit cleanly and clear the 'suspended' region of the array.
However it mustn't clear 'sync_max' as that would allow the
reshape to continue unmonitored.
If the thread ever does get killed, the array should really be
shutdown soon after if possible.
Signed-off-by: NeilBrown <neilb@suse.de>
Coverity discovered a possible double close(fd2) in Grow.c. Avoided by
invalidating fd2 after the first close.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If we will need to change array level when a reshape completes, a copy
of mdadm waits in the background.
Currently this copy hold the device (/dev/mdX) open. This prevents
the array from being stopped.
So close the file descriptor and re-open after the reshape completes.
Signed-off-by: NeilBrown <neilb@suse.de>
Having a fix time for a wait is clumsy and can make us
wait much too long.
So use mdstat_wait and keep the mdstat_fd open.
This requires an 'mdstat_close' so it doesn't stay open
forever.
Signed-off-by: NeilBrown <neilb@suse.de>
--stop now tries to wait for a reshape to be at just the right spot.
However for a reducing reshape, mdadm will be running in the
background watching, and might adjust sync_max and mess things up.
So teach "progress_reshape" to notice when "sync_max" is modified, and
leave it alone.
Signed-off-by: NeilBrown <neilb@suse.de>
progress_reshape() may not set reshape_completed if the reshape is
interrupted, so we need to initialize it to the current value before
hand, so the value used afterwards is credible.
Signed-off-by: NeilBrown <neilb@suse.de>
If the restarted reshape needs a backup file and we don't have one,
that should be reported before we try to start the array.
Also we shouldn't say the "Cannot grow" but "cannot complete".
Signed-off-by: NeilBrown <neilb@suse.de>
To be able to revert-reshape of raid4/5/6 which is changing
the number of devices, the reshape must has been stopped on a multiple
of the old and new stripe sizes.
The kernel only enforces the new stripe size multiple.
So we enforce the old-stripe-size multiple by careful use of
"sync_max" and monitoring "reshape_position".
Signed-off-by: NeilBrown <neilb@suse.de>
We have several places that wait for activity on a sysfs
file. Combine most of these into a single 'sysfs_wait' function.
Signed-off-by: NeilBrown <neilb@suse.de>
For RAID10, we must have head/tail space for reshape.
For RAID4/5/6 we can use a spare or a backup file.
So make that distinction.
Signed-off-by: NeilBrown <neilb@suse.de>
When changing the chunksize of an array, the new chunksize must
divide the device size.
If it doesn't we report a very brief message.
Make this message a bit longer and suggest a way forward be reducing
the size of the array.
Reported-by: Mark Knecht <markknecht@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>