Currently whole array geometry is set in sysfs_set_array(),
so none of disks (even for expansion) should fail during sysfs_add_disk()
Due to this expansion counter should be used for reshaped array when
disk slot is bigger than number of disks in array.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When array under reshape is assembled it has to be disabled from
monitoring as soon as possible. It can occur that this is i.e second
array in container and mdmon is loaded already.
Lack of blocking monitoring can cause change array state to active,
and reshape continuation will be not possible.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
st->sb is null. This is exception cause.
reshape_container() function expects that super block will be loaded.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This is a bit of a hack - probably analyse change needs to be
re-written a bit to handle this properly.
However when the metadata deduced the intermediate state for a
reshaping array, the 'new_level' it sets should not be used to
interpret the 'delta_disks' number.
So in that case, hide the new_level while calling analyse_change.
Signed-off-by: NeilBrown <neilb@suse.de>
The 'raid_disks' for a container is zero, so subtracting it
from the given raid_disks to get delta_disks doesn't make sense.
Rather set delta_disks to UnSet and set raid_disks to the requested
number of disks. This then gets passed to reshape_super() which
can use it as required.
Signed-off-by: NeilBrown <neilb@suse.de>
The check that the array info is already in 'native format' is
only relevant when restarting a growth, so only perform it then.
Signed-off-by: NeilBrown <neilb@suse.de>
Normally when reshape_array is called with restart == 0,
info->array is the same as the 'array' read from the kernel
(via ioctl) so both have the same level.
However when called from reshape_container, info->array was
generated by the metadata so it will have 'level' set to the
intermediate (or final) level already.
So to test if we need to change the level, we need to compare the
desired level with that which was loaded from the kernel (array.level)
rather than that which was read from metadata (info->array.level).
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
The check for 'enough spares' doesn't apply to RAID0 as we don't
mind it going degraded. But add a test that there are enough spares
to actually produce a working array.
Signed-off-by: NeilBrown <neilb@suse.de>
the new code for defaulting chunksizes didn't work quite right
- default was set to late in super1/super0/ddf
- defaults would over-ride values of '0' imposed by some levels
- default value wasn't applied to size properly.
Signed-off-by: NeilBrown <neilb@suse.de>
Array state has to be managed during reshape based on consistent flag.
To achieve this existing code will be reused. Currently existing code for
blocks_per_unit calculation can be removed
and existing code can be reused also.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Some grow operations must be applied to a whole container. These
are performed one array at a time, so only one array appears to
be reshaping.
When re-assembling such an array, we need to make sure that
when the reshape finished, we move on to the next array.
So require metadata to set ->reshape_active = 2 in that case,
and use reshape_container to complete the reshape.
Signed-off-by: NeilBrown <neilb@suse.de>
Variables declaration moved a little bit up,
to not mix declaration and code.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
As containers can now grow, we need to use both Grow_restart (to
replay any backup-file) and Grow_continue when assembling the content
of a container.
Note that we don't pass a backup-file when doing incremental assembly.
If such is needed in that case, the assembly will fail.
To restart such arrays, explicit assembly is required.
Signed-off-by: NeilBrown <neilb@suse.de>
For general migration, blocks per unit are required for all disks,
not for per-member.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Now that the external metadata handler must provide an md-compatible
old/new geometry, sys_set_array can do all of the array set-up for
an array that is undergoing reshape.
That leave less for reshape_array to do.
Also clean up how reshape_array tells if the reshape has started or
not.
Don't use ->reshape_active as that doesn't tell us anything consistent
at this stage, only use the 'restart' flag passed in.
Signed-off-by: NeilBrown <neilb@suse.de>
If a reshape (migration) is happening, we might need to modify the information
with provide to md so that it can cope with the reshape.
For example, if a migration from 4-device RAID0 to 5-device RAID0 is
happening, we need to tell md that it is reshape from degraded
5-device RAID4 to degraded 6-device RAID4 so md doesn't handle direct
reshape of RAID0.
There may be other migrations supported by IMSM that need special
treatment here.
Signed-off-by: NeilBrown <neilb@suse.de>
When we cannot find spare devices for grow operation we should
print error message.
This patch changes debug error message to 'stderr' print.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Problem consists of missing =sign in comparison with SUBSYSTEM and
missing new line character at the end of line. As a result incremental
for hot-plugs of bare disks does not work.
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If a device is failed, then don't include it in the reported
container_content, else it might get included in the array.
Reported-by: Albert Pauw <albert.pauw@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
assemble_container_content() cannot close mdfd handle, as it could be
required by reshape continuation.
mdfd handle is closed outside this function, when it is not longer
necessary.
Call to Grow_continue is added for reshape continuation after
assembly.
In the nearest future, simple condition:
if (content->reshape_active)
before Grow_continue() call will be replaced by check function
for support container operation /reshape/.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Put code for blocking subarray in to separate function.
This little code/function will be used for blocking arrays from mdmon
monitoring during assembly process. Arrays cannot wait for container
assembly finish, because meanwhile monitor can enable arrays for writing.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When assembling array using assemble_container_content() for external
metadata case, array is in 'readonly' state already.
There is not necessary to duplicate this operation.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When calling reshape_array() for external metadata 'container name'
parameter have to be passed.
Find and pass container name in external metadata case.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
We somehow got to version of documentation for --array-size.
So merge them it one.
Reported-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: NeilBrown <neilb@suse.de>
During expansion there is more working disks that array can have.
Disks with set raid_disk (not a spare disk) during reshape should be counted
to allow array state transition to read_only state.
Array reconfiguration to new geometry should be done before reshape will
be started.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If we create a single-drive RAID0 array on partitioned drive,
we do not lose information about disk structure after operation
(partitions are visible on created array)
Warning message:
mdadm: partition table exists on /dev/sdX but will be lost or
meaningless after creating array"
is not necessary during creation single-drive RAID0 array.
This patch removes the message.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Defect description:
When we create an redundant array in mdadm and then degrade it
by disk removing, Option ROM and Windows OS does not detect any array.
Reason:
Metadata created and updated after degrading array is not compatible
with IMSM standard.
This patch synchronizes the metadata according IMSM requirements.
Following inconsistencies have been fixed:
- reset all fields in imsm_dev during creation to avoid random values
- init dev status during creation to proper state
- not reset CONFIGURED_DISK flag when disk is missing
- add ":0" suffix to the serial number for missing/failed disks
- update medatada signature after takeover operation
- mark map state as degraded after raid0->raid10 takeover
Note:
Patch reworked after Dan Willams review.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Hi Neil,
I noticed that the -Y option, as in mdadm -D -Y /dev/md0, doesn't work
but used as --export it works.
So I made a little patch to fix it, but it is simply sticking a Y in the
list of short_options in ReadMe.c.
Signed-off-by: NeilBrown <neilb@suse.de>
When reshape is started imsm stores new size in metadata.
mdadm requires "old" size to proper initialization restarted array.
When reshape is in progress getinfo_super_imsm_volume() should report
computed array size value instead array size stored in metatda.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
reshape prodess cannot be restarted due to no checkpoint information
in mdinfo.
When metadata is read during reshape process or reshape restart,
rehape_progress (mdinfo field) has to be initialized to value
stored in metadata. This allows start reshape from stored
in metadata checkpoint.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When chunk size is not set from command line we need to guess it
depending on metadata given on command line or found on listed devices.
Validate_geometry sets the default for it's metadata if chunk is not set.
For external metadata chunk is set only when creating in a container.
For imsm validate_geometry_imsm_orom is responsible for finding default
chunk depending on container metadata loaded. Container will already know
which controller it is attached to, and have this controllers orom
available.
do_default_chunk indicates that we need to find default chunk and
if validate_geometry fails for some metadata it tells us to reset chunk
that may have been set.
Current solution would set default chunk correctly for imsm only if
container device was given on command line. With the list of devices
chunk was always set to 512.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
match_metadata_desc allocates memory for st
which is not needed after validate_geometry fails
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Loading container may fail if e.g. one of the disks in container
has been detached but udev has not realized the change.
Addition to such array will fail because reading superblock
from one of disks in array fails.
Current message is a bit confusing.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When we force-assemble an array which is in the middle of a reshape,
we should repeat the reshape of any parts that aren't recorded in
the oldest superblock.
This is unlikely to make a significant difference, but could make
a small difference, and is safer.
Signed-off-by: NeilBrown <neilb@suse.de>
Problem: sysfs_disk_to_scsi_id() not returns correct scsi_id value.
Reason: sysfs format has been changed
This patch adapt sysfs_disk_to_scsi_id() to new sysfs format.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
> I have applied some patch - with some formatting changes to make it consistent
> with the rest of the code.
>
> I don't really have time to look more deeply at it at the moment.
> Maybe someone else will?...
Hi Neil,
thanks for including this in git.
Actually I did it look at it :-) and I already found a
couple of issues, below is a small fix patch.
Signed-off-by: NeilBrown <neilb@suse.de>
Bug detected for imsm metadata.
Assembling of array using Incremental switch generate segmentation
fault if BBM log is detected.
Reason: missing return from Incremental_container if BBM is detected
and unnecessary list=NULL assignment.
This patch fix the problem and memory leak in this area.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Before resizing an array with --size or --array-size, then filesystem
should be resized. mdadm cannot do this so the user should.
Reported-by: Gavin Flower <gavinflower@yahoo.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If a request to remove all 'failed' or 'detached' devices chooses to
remove the first device, it will not actually try the removal and will
skip any following devices.
This fixes it.
Reported-by: Rémi Rérolle <rrerolle@lacie.com>
Tested-by: Rémi Rérolle <rrerolle@lacie.com>
Signed-off-by: NeilBrown <neilb@suse.de>
# mdadm --detail --export /dev/md127p1
Before:
MD_LEVEL=raid5
MD_DEVICES=4
MD_METADATA=0.90
After:
MD_LEVEL=raid5
MD_DEVICES=4
MD_CONTAINER=/dev/md0
MD_MEMBER=0
MD_UUID=55746a20:925d24a7:4f9bd7e2:9c9a411f
We parse the symlink target with a format:
../../block/mdXXX/mdXXXpYY
...and need the second '/' from the end of the string to read detect a
'md' device.
Reported-by: Krzysztof Wasilewski <krzysztof.wasilewski@intel.com>
Cc: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If a request to remove all 'failed' or 'detached' devices chooses to
remove the first device, it will not actually try the removal and will
skip any following devices.
This fixes it.
Reported-by: Rémi Rérolle <rrerolle@lacie.com>
Tested-by: Rémi Rérolle <rrerolle@lacie.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When changing level when a new number of raid disks was explicitly
specified, we much make sure that the change implied by the
change in level is properly incorporated into the final result.
So explicitly track the change in number of parity disks
(delta_parity) and use it together with delta_disks to determine
final data_disks.
Also set info->delta_disks so other code doesn't need to mirror
this analysis.
And add some errors in cases where a new number of disks was
requested but is not currently supported
Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Transition raid5 to raid0 was not covered in analyse_change()
Missing case added.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Spares that are specified on container can be used by any array in container.
this means that for every array in container they should be reported.
This let caller know how many spare devices (not used in any array)
are still available.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
For external metadata cases, information about number of spares cannot
be get via ioctl GET_ARRAY_INFO for particular array
(as info variable is initialized by). In md this information is present
in container object not array one.
This causes need to get spare disks number from external metadata.
This information is required for reshape_array() function to decide
if spare disks number satisfy operation requirements.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>