When external metatdata handler supports manage_reshape()
and recover_backup() functions in super switch backup file is not required
and can be omitted. For backup purposes metadata specific mechanisms
are used.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When raid0 is takeovered to degraded raid4, metadata updates has to be
applied via mdmon (raid4 has to be monitored).
It is not possible due to no update_tail pointer initialization
in supertype structure.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
mdmon has to update checkpoint information in metadata during
general migration according to received metadata update.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm has to prepare checkpoint information update and send it to mdmon.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
There are 2 places for keeping checkpoint information:
- metadata (per volume information used during volume
initialization and rebuilding).
- migration record (per container information used during
migration/reshape)
During reshape both checkpoints has to contains the same information.
To do this mdadm will send metadta update with checkpoint information.
Note: Checkpoint information consistence is not critical. During general
migration restart, information from migration record is used only.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
imsm contains 2 check-pointing mechanism. One (per array) is used for
initialization and rebuild and second (per container) is used for general
migration (reshape). First is controlled by mdmon, second by mdadm.
To avoid conflicts disable mdmon checkpoints updating for general
migration.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Add ability to restore data backed up in General Migration Copy Area
in case of unexpected reshape interruption.
Function restores data during an array assembly and then reshape
is continues from next checkpoint.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Patch introduces support for reshape process restart for external metadata
using metadata specific data handling methods.
It introduces recover_backup() function that restores array to stable state
It is equivalent to Grow_restart() functionality for native metadata.
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
blocks_per_migr_unit() has to use information from migration record
for general migration case. This causes to pass intel_super pointer
to this function and some other interfaces changes.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Add ability to display information from migration record in examine
option.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When metadata is saved and there is no general migration in progress
/in container/ clear migration record in container.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Before reshaping every "migration unit", check if array is still usable.
In failed disks number is greater than allowed degradation level, reshape
has to be aborted.
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Before reshape is started, mdadm should check again if there is only one
array (in container) under reshape. Then function "divides" array in to
"migration units" that can fits migration copy area and enters main loop.
It checks if current "migration unit" requires to be backed up.
If necessary mdadm saves it to copy area and updates migration record.
Then MD-driver is directed to perform reshape step (by "migration unit" size)
and checkpoint is moved forward. In this way reshape is executed until
array ends.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
After each checkpoint mdadm should set new reshaped area and wait
until md finishes reshape. Function wait_for_reshape_imsm() sets
new reshape range and waits for job completion.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Just in case array.level is ever something that we don't expect, make
sure we report an error clearly rather than get confused.
Signed-off-by: NeilBrown <neilb@suse.de>
Under Windows IMSM can reshape arrays in 2 directions
(ascending and decsending).
Under Linux one (ascending) direction is supported at this moment.
Block loading metadata when decsending reshape is detected
Windows also uses optimalization area during reshaping array.
Linux does not support it.
The patch blocks this operation also.
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This patch adds methods of manipulating migration record:
init_migr_record_imsm() - initiate migration record at the beginning of
the reshape process
write_imsm_migr_rec() - saves migration record to array.
Migration record is stored on 2 first disks in array only.
save_backup_imsm() - saves critical data stripes to Migration Copy Area
and updates the current migration unit status.
Uses restore_stripes() to format a destination stripe, and to write it
to the Migration Copy Area.
save_checkpoint_imsm() - Updates the current unit status in the
migration record.
Migration record is written to 2 first array disks only (similar to reading
operation).
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Definitions are necessary to compile mdmon.
Metadata specific source code is compiled to mdmon.
Functions used for reshape check pointing:
- restore_stripes()
- save_stripes
- abort_reshape
are not used in mdmon, but they are compiled in it.
To enable mdmon compilation, dummy functions are used.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
For external metadata backup location and saving methods depends
on metadata specific implementation details. Currently restore_stripes()
function is able to restore data only from the given backup file handles
and it is used only for assembling partially reshaped arrays.
As this function will be very helpful for external metadata backup
mechanism, add the support for restoring data from the given source buffer.
Add possibility for save_stripes() to work without designation targets.
Save_stripes() can now prepare data for restore_stripes() only.
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
IMSM for securing reshape process uses special disk area outside metadata
for reshaped area backup purposes. If just reshaped array area requires
backup, bunch of array stripes prepared for reshape is stored in to
Migration Copy Area. In case of reshape interruption, Option ROM during
restart or mdadm during reshape restart (when no reboot occurs) will
restore Migration Copy Area to designation array. Reshape can be
continued from stable array stable state.
This patch adds support for IMSM migration record structure.
IMSM migration record is stored on the first two disks of IMSM volume
during the migration.
Add function for reading migration record, so mdadm can read (if present)
migration record. Migration record has to be cleared every time
MIGR_GEN_MIGR is started.
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Some code currently clears 'info' before calling getinfo_super,
some code doesn't.
To be consistent, change it so no caller ever clears 'info',
but ever getinfo_super function must clear it.
Note that ->raid_disk may be meaningful if that 'map' is passed
non-NULL. In that case it is copied out before the structure
is zeroed.
Signed-off-by: NeilBrown <neilb@suse.de>
A recent change to improve error messages make it not possible to
create an array from devices that are 'busy'. However if they are
made busy by a container, then the create should be allowed.
So move one of the error messages later.
Reported-by: "Wojcik, Krzysztof" <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
The loop over all member devices in enough_fd could easily stop
before it had found all devices. This would cause --re-add to
fail incorrectly.
So change the loop to be based on the reported number of devices
in the device - with a safe-guard limit of 1024.
Change some other loops to be more careful too.
Reported-by: "Schmidt, Annemarie" <Annemarie.Schmidt@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When an array is resized to have larger members, --assume-clean will
disable any resync if the kernel supports it (2.6.40 and later).
Signed-off-by: NeilBrown <neilb@suse.de>
RAID0 has accepted chunksizes that are not a power of 2 since 2.6.30.
So it time mdadm allowed that to be used.
Signed-off-by: NeilBrown <neilb@suse.de>
Options like --stop must come before the device that is being
stopped. If (in --misc mode) a device does not have an option,
nothing will be done to it, which can be confusing.
So report an error in this case.
Signed-off-by: NeilBrown <neilb@suse.de>
If using an old kernel we should still check if a re-add might be
intended, so we can refuse and require a '--zero' first if it is not
possible.
Signed-off-by: NeilBrown <neilb@suse.de>
As homehost defaults to the system name it is not possible to specify
a NULL homehost.
This patch restored this ability with either --homehost="" or
--homehost="<none>".
This allows the creation of v1.x arrays without a "hostname:"
prefix in the name.
Signed-off-by: NeilBrown <neilb@suse.de>
In an array is auto-readonly then a reshape will not start.
But auto-readonly is only wanted until something is explicitly
done to acknowledge that the array is really wanted.
So it is perfectly correct to switch an auto-readonly array to
'clean' if a reshape has been requested.
Signed-off-by: NeilBrown <neilb@suse.de>
If a device fails while the grow is being monitored but the array is
still functional, the Grow will appear to abort and then almost
instantly restart from where it was up to.
So if it appears to abort, wait up to 10 seconds for a restart (it
should be much much less than this.
Signed-off-by: NeilBrown <neilb@suse.de>
The big 'grow' refactor lost us the ability to configure 'faulty'
arrays through --grow.
So put that back as a special case.
Signed-off-by: NeilBrown <neilb@suse.de>
Analyse_change can give unhelpful error messages if nothing was
changed. This is particularly awkward when only changing --size.
So check and re-introduce a message that was list in commit
5da9ab9874
Signed-off-by: NeilBrown <neilb@suse.de>
If you change the size of a member of an array (e.g. it might be a
dm device that can be resized, or on a smart storage device), md
doesn't notice and so the space cannot be used without explicitly
telling md that the device is bigger.
This change causes "mdadm --grow --size=...." to make sure each
component device is making at least that much space available if it
can.
Normally usage of "--size=max" will cause all devices to make max
space available, the md will use as much as it can of that.
Signed-off-by: NeilBrown <neilb@suse.de>
Put information about new chunk size change in to migration metadata
update allowing simultaneous level change and re-striping.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Received update and prepared memory is processed to update imsm metadata.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When level is changed from raid0 to raid5 memory is required for replace device
smaller device/array object.
This memory is allocated in manager context in prepare_update()
Prepare_update() is called in manager context so memory allocation are
allowed here. This allows us to look for spare devices for meta update.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
it is not supported yet, so start such transition is improper.
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Introducing raid0->raid5 level migration metadata update structure
is prepared for future use.
Adding spare device is required to hold additional raid5 parity.
Mdadm just checks for spares, but it is not included in update.
If there are no spares available, abort. Otherwise we will create
degraded array what should be not allowed.
Mdmon will decide what spare device is used for parity.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When 2 arrays are configured in container and arrays are reassembled during
rebuild or initialization, checkpoint for one array can be reset. It depends
on arrays assembly order.
Scenario:
1. Create 2 arrays (e.g. raid5)
2. Add spare to container
3. Degrade arrays /rebuild starts on array #1 and continues to n%/
4. Reassembly arrays
5. Rebuild starts on array #2 /because of assembly order/ from 0%
6. On first checkpoint stored for array #2 (non 0 value), checkpoint
for array #1 is cleared /it is delayed rebuild in md, so progress is 0/
7. Rebuild on #1 starts from n% /it was configured before checkpoint
was cleared/.
Any next reassembly during rebuild of #2 array (after p.6) causes
checkpoint information lost for array #1.
Solution is not store checkpoint for progress == 0.
Checkpoint is set to 0 when rebuild/initialization starts.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This was #if-ed out for 3.0, but it really should go.
Gcc 4.6.0 complains that auto_update_home is set but not used
(which is true).
Reported-by: Tobias Powalowski <t.powa@gmx.de>
During reshape restart info->array.raid_disks contains new raid_disks number
It cannot be compared against old disks number. Such check will always fail.
Check raid disks array field against final disks number for restart.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When array is in reshape state raid_disks field contains final disks number.
To know how many disks were added, disk.raid_disk index has to be compared
against old disk number computed using delta_disks.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
During migration raid_disks field contains new disks number now.
It should be set old disks number first and then new disks number
to allow md to calculate e.g. delta_disks parameter.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When restarting a reshape, the value of 'raid_disks' is the *new*
value. The old value is found by subtracting delta_disks.
So before calling analyse_change we must set raid_disks to be the
old value, and then reset it afterwards.
All other fields are cleanly separated with the main field being
the 'old' value and a new_* field available.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>