As we don't allow '-K' for '--zero-super' there is no point
using it internally. Just define a 'KillOpt' like with
other options.
Signed-off-by: NeilBrown <neilb@suse.de>
->percent sometimes stores negative values recording states
like 'pending' or 'delayed'.
The value '-2' means both 'delayed' and in Monitor, 'unknown'.
Also, '-1' has a meaning but not #define.
So change the #defines to be prefixed with "RESYNC_", instead
of "PROCESS_", add new "_NONE" and "_UNKNOWN", and use correct
value in each location.
Signed-off-by: NeilBrown <neilb@suse.de>
Now that /run seems to be a good standard, make that
the default for storing various run-time files, rather than
/var/run or /dev/.mdadm.
Signed-off-by: NeilBrown <neilb@suse.de>
mdinfo->bitmap_offset is a signed long and needs to be treated as
such when passed to the kernel.
This resolves the problem with adding internal bitmaps to a 1.0 array.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Both --detail and --monitor can report the names of member
devices on an array, and do so by searching /dev and finding
the shortest name that matches.
If
--prefer=foo
is given, they will instead prefer a name that contain /foo/.
So
mdadm --detail /dev/md0 --prefer=by-path
will list the component devices via their /dev/disk/by-path/xxx
names.
Signed-off-by: NeilBrown <neilb@suse.de>
When we can for devices using GET_DISK_INFO we currently
limit to 1024. But some arrays can have more than this.
So raise it to 4096 and make the constant a #define.
Signed-off-by: NeilBrown <neilb@suse.de>
Function reshape_super() guards metadata changes.
It is used to apply changes rollback in error case also.
As change (apply and rollback) can be not bi-directional reshape_super()
has to know if current action is metadata change that should be guarded
using metadata restrictions, or this is metadata rollback change
executed due to error occurrence.
In second case change has to be unconditional.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
As the bitmap can be before the superblock, bitmap_offset is signed.
But some of the code didn't honour that :-(
Signed-off-by: NeilBrown <neilb@suse.de>
It can easily be calculated from 'avail' and 'raid_disks', and we
will soon have a case where we don't have it easily available to pass
in.
Signed-off-by: NeilBrown <neilb@suse.de>
When --offroot is specified, mdadm will change the first character of
argv[0] to '@'. This is used to signal to systemd that mdadm was
launched from initramfs and should not be shut down before returning
to the initramfs.
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If a 'remove' fails there is no certainty that another event will
happen soon, so make sure we retry soon anyway.
Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adding a bitmap via ioctl can only add it at a fixed location.
That location is not suitable for 4K-block devices.
So allow setting the bitmap location via sysfs if kernel supports it
and aim to always use 4K alignments.
Signed-off-by: NeilBrown <neilb@suse.de>
This fields doesn't work any more as ->getinfo_super clears the info
structure at an awkward time. So get rid of it and do it differently.
The issue is that the metadata handler cannot tell if the uuid it has
was randomly generated or explicitly requested, except on the first
call.
And we don't want to accept explicit requests for IMSM.
So when it was auto-generated, make it look distinctive by having the
same int copied in all 4 positions. If someone requests a uuid like
that, I guess they get away with it.
Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm allowes to assemble 2 volumes with the same names based on the
config file. The issue is fixed by iterating over the list of md device
identifiers and comparing the names of md devices against each other,
detecting identical names and blocking the assembly should the same names
be found.
Now having detected duplicate names, mdadm terminates without assembling
the container, displaying appropriate prompt.
Signed-off-by: Lukasz Orlowski <lukasz.orlowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
search_mdstat and conf_match are almost identical.
Put all the functionality in conf_match, and remove search_mdstat.
Reported-by: Jes.Sorensen@redhat.com
Signed-off-by: NeilBrown <neilb@suse.de>
container_content retrieves volume information from disks in the
container. For unsupported volumes the function was not returning
mdinfo. When all volumes were unsupported the function was returning
NULL pointer to block actions on the volumes. Therefore, such volumes
were not activated in Incremental and Assembly. As side effect they
also could not be deleted using kill-subarray since "kill" function
requires to obtain a valid mdinfo from container_content.
This patch fixes the kill-subarray problem by allowing to obtain
mdinfo of all volumes types including unsupported and introducing new
array.status flags.
There are following changes:
1. Added MD_SB_BLOCK_VOLUME for blocking an array, other arrays in the
container can be activated.
2. Added MD_SB_BLOCK_CONTAINER_RESHAPE block container wide reshapes
(like changing disk numbers in arrays).
3. IMSM container_content handler is to load mdinfo for all volumes
and set both blocking flags in array.state field in mdinfo of
unsupported volumes. In case of some errors, all volumes can be
affected. Only blocked array is not activated (also reshaped as
result). The container wide reshapes are also blocked since by
metadata definition they require modifications of both arrays.
4. Incremental_container and Assemble functions check array.state and
do not activate volumes with blocking bits set.
5. assemble_container_content is changed to check container wide reshapes
before activating reshapes of assembled containers.
6. Grow_reshape and Grow_continue_command checks blocking bits
before starting reshapes or continueing (-G --continue) reshapes.
7. kill-subarray ignores array.state info and can remove requested array.
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When container is assembled while reshape is active on one of its member
whole container can be required to be blocked from monitoring.
For such purpose field recovery blocked is added to mdinfo structure.
When metadata handler finds active reshape in container it should set
recovery_blocked field to disable whole container monitoring during
reshape.
For arrays that doesn't use containers, recovery_blocked field
has the same value as reshape_active field e.g. super0/1.
In fact,recovery is blocked during reshape for such arrays.
For ddf, metadata handler doesn't set reshape_active field,
so recovery_blocked is not set also.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
To allow for reshape continuation '--continue' option is added
to grow command.
Function that will be executed in grow-continue case doesn't require
information about reshape geometry. All required information are read
from metadata.
For external metadata reshape can be run for monitored array/container
only. In case when array/container is not monitored run mdmon for it.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
During initrd phase continuing reshape will cause file system context
lost. This blocks ability to control reshape using checkpoints.
To avoid this, during initrd phase assemble has to be executed with
'--freeze-reshape' option. This causes that mdadm restores reshape
critical section only.
Reshape can be continued later after system full boot.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
1. Three missing map_unlock() calls were added.
2. Map file must be unlocked on fork, else child will hold lock.
Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When validate_geometry finds that we haven't committed to
a metadata yet and that the subdev is a member of 'our'
container, it needs to report any errors it finds as Create()
cannot report them effectively.
So make a slight change to the semantics of the 'verbose' flag
and allow validate_geometry to report if it printed any error
messages.
Signed-off-by: NeilBrown <neilb@suse.de>
Reshape backup should be able to be restored during reshape continuation
also. To reuse already existing code it is moved to function.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If you create a two drive raid1 array with one device writemostly, then
fail the readwrite drive, when you add a new device, it will get the
writemostly bit copied out of the remaining device's superblock into
it's own. You can then remove the new drive and readd it as readwrite,
which will work for the readd, but it leaves the stale WriteMostly1 bit
in devflags resulting in the device going back to writemostly on the
next assembly.
The fix is to make sure that A) when we readd a device and we might have
filled the st->sb info from a running device instead of the device being
readded, then clear/set the WriteMostly1 bit in the super1 struct in
addition to setting the disk state (ditto for super0, but slightly
different mechanism) and B) when adding a clean device to an array (when
we most certainly did copy the superblock info from an existing device),
then clear any writemostly bits.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
0.90 arrays can only use up to 4TB per device. So when a larger
device is added, complain a bit. Still allow it if --force is given
as there could be a valid use.
Signed-off-by: NeilBrown <neilb@suse.de>
Initially there is no proper translation mdstat's DELAYED/PENDING processes
to "--detail" output.
For example, if we have recover=DELAYED in mdstat, "--detail"
shows "State: recovering" and "Rebuild Status = 0%".
It was incorrect in case of process waiting on checkpoint different
than 0%. In fact rebuild status is differnt than 0% and user is misled.
The patch fix the problem. Current "--detail" command shows
in the exampe: "State: recovering (DELAYED)" and no information
about precentage.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Patch introduces support for reshape process restart for external metadata
using metadata specific data handling methods.
It introduces recover_backup() function that restores array to stable state
It is equivalent to Grow_restart() functionality for native metadata.
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Before reshape is started, mdadm should check again if there is only one
array (in container) under reshape. Then function "divides" array in to
"migration units" that can fits migration copy area and enters main loop.
It checks if current "migration unit" requires to be backed up.
If necessary mdadm saves it to copy area and updates migration record.
Then MD-driver is directed to perform reshape step (by "migration unit" size)
and checkpoint is moved forward. In this way reshape is executed until
array ends.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
For external metadata backup location and saving methods depends
on metadata specific implementation details. Currently restore_stripes()
function is able to restore data only from the given backup file handles
and it is used only for assembling partially reshaped arrays.
As this function will be very helpful for external metadata backup
mechanism, add the support for restoring data from the given source buffer.
Add possibility for save_stripes() to work without designation targets.
Save_stripes() can now prepare data for restore_stripes() only.
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Some code currently clears 'info' before calling getinfo_super,
some code doesn't.
To be consistent, change it so no caller ever clears 'info',
but ever getinfo_super function must clear it.
Note that ->raid_disk may be meaningful if that 'map' is passed
non-NULL. In that case it is copied out before the structure
is zeroed.
Signed-off-by: NeilBrown <neilb@suse.de>
When an array is resized to have larger members, --assume-clean will
disable any resync if the kernel supports it (2.6.40 and later).
Signed-off-by: NeilBrown <neilb@suse.de>
Allow for loading metadata from disk attached to non-metadata compliant
system. Affects mdadm --examine and guess_super.
Added ignore_hw_compat in supertype to pass information to load_super
handler. If ignore_hw_compat is set the handler should load metadata
also from disks that do not comply with metadata requirements (i.e. disk is not
attached to native controller, etc).
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If mdmon is shutting down because there are no devices
left to look at, then don't wait 5 seconds for an O_EXCL open,
and that can block progress of --grow.
Only wait for O_EXCL if we received a signal.
Signed-off-by: NeilBrown <neilb@suse.de>
If single-disk RAID0 or RAID1 array is created, user may preserve data on
disk. If array given size covers all partitions on disk, all data will be
available on created array. If array size is too small (not covers
all partitions), data will be not accessible.
This patch introduces warning message during array creation if given size
is too small. User may interrupt creation process to avoid data loss.
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This is done via conversion to RAID4 and back.
To grow the array, extra devices will be needed which cannot
already be present as spares - so allow a list of new devices
to be included in grow request which changed the number of devices.
Signed-off-by: NeilBrown <neilb@suse.de>
As containers can now grow, we need to use both Grow_restart (to
replay any backup-file) and Grow_continue when assembling the content
of a container.
Note that we don't pass a backup-file when doing incremental assembly.
If such is needed in that case, the assembly will fail.
To restart such arrays, explicit assembly is required.
Signed-off-by: NeilBrown <neilb@suse.de>
When chunk size is not set from command line we need to guess it
depending on metadata given on command line or found on listed devices.
Validate_geometry sets the default for it's metadata if chunk is not set.
For external metadata chunk is set only when creating in a container.
For imsm validate_geometry_imsm_orom is responsible for finding default
chunk depending on container metadata loaded. Container will already know
which controller it is attached to, and have this controllers orom
available.
do_default_chunk indicates that we need to find default chunk and
if validate_geometry fails for some metadata it tells us to reset chunk
that may have been set.
Current solution would set default chunk correctly for imsm only if
container device was given on command line. With the list of devices
chunk was always set to 512.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Delta_disk can be set to UnSet value.
This can a cause to pass wrong parameter to reshape_super().
To avoid such situations raid_disks and delta_disks parameters
have to be passed to reshape_super() separately.
It will be up to reshape_super() function validation
and usage of this parameters to avoid not valid values.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
For containers, it is always appropriate to include a device in the
container.
Whether it should then be included in an array is a separate question.
Signed-off-by: NeilBrown <neilb@suse.de>
Arrays on partitions are not supported for external metadata
so do not take such spare from native array.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Neil,
Please consider this patch that once was discussed and I think agreed with in general direction. It was sent a while ago
but somehow did not merged into your devel3-2. This patch enables hot-plug of so called bare devices (as understand by domain policies rules in mdadm.conf).
Without this patch we do NOT serve hot-plug of bare devices at all.
Thanks,
Marcin Labun
Subject was: FW: Autorebuild, new dynamic udev rules for hot-plugs
>>From c0aecd4dd96691e8bfa6f2dc187261ec8bb2c5a2 Mon Sep 17 00:00:00 2001
From: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Date: Thu, 23 Dec 2010 16:35:01 +0100
Subject: [PATCH] Dynamic hot-plug udev rules for policies
Cc: linux-raid@vger.kernel.org, Williams, Dan J <dan.j.williams@intel.com>, Ciechanowski, Ed <ed.ciechanowski@intel.com>
When introducing policies, new hot-plug rules were added to support
bare disks. Mdadm was started for each hot plugged block device
to determine if it could be used as spare or as a replacement member for
degraded array.
This patch introduces limitation of range of devices that are handled
by mdadm.
It limits them to the ones specified in domains associated with
the actions: spare-same-port, spare and spare-force.
In order to enable hot-plug for bare disks one must update udev rules
with command
mdadm --activate-domains[=filename]
Above command writes udev rule configuration to stdout. If 'filename'
is given output is written to the file provided as parameter. It is up
to system administrator what should be done later. To make such rule
permanent (i.e. remain after reboot) rule should be writen to
/lib/udev/rules.d directory. Other cases will just need to write it to
/dev/.udev/rules.d directory where temporary rules lies. One should be
aware of the meaning of names/priorities of the udev rules.
After mdadm.conf is changed one is obliged to re-run
"mdadm --activate-domains" command in order to bring the system
configuration up to date.
All hot-plugged disks containing metadata are still handled by existing
rules.
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When we restart an array in the middle of a reshape, we reuse a lot of
the code for starting the reshape, but it needs to know that
circumstances are slightly different.
So add a 'restart' arg which is used:
- skip checking and adding spares
- activate the array (rather than start reshape)
- allow the backup file to already exist
Signed-off-by: NeilBrown <neilb@suse.de>
Child_monitor was design to perform 'manage_reshape' for native
arrays. So change the signature for ->manage_reshape to match
child_monitor and move the all to the same place that child_monitor
is called from.
Also give super-intel a manage_reshape handler which simple calls
child_monitor.
Signed-off-by: NeilBrown <neilb@suse.de>
container_chose_spares in Monitor.c and
get_spares_for_grow in super-intel.c
do the same thing: search for spares in a container.
Another version will also be needed for Incremental
so a more general solution is presented here and
applied in two previous contexts.
Normally domlist==NULL would lead an empty list but
this is typically checked earlier so here it is interpreted
as "do not test domains".
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
uuid_match_any is replaced by uuid_zero for imsm spares.
Function fixup_container_spare_uuid not needed as it gives
unwanted uuid to spares.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Sometimes one metadata update will require allocating several
larger data structures. As 'monitor' cannot allocate, 'manager'
must, so it must be able to attach a list of allocates to the
update, and importantly it must be able to easily free them.
So add a 'space_list' element to metadata updates where each
element on the list starts with a pointer to the next.
Signed-off-by: NeilBrown <neilb@suse.de>
During Online Capacity Expansion metadata has to be updated to show
array changes and allow for future assembly of array. To do this
mdadm prepares and sends reshape_update metadata update to mdmon.
The update contains the old and new number of raid disks, and the
indices of the spare disks that will be used to fill the spaces.
This works as follows:
1. reshape_super() prepares metadata update.
2. mdadm discovers the spares and adds them to the array
3. mdadm sends the update to mdmon
4. managemon in prepare_update() allocates required memory for bigger
device object
5. monitor in process_update() updates the metadata to record the
new sizes and the newly assigned devices.
6. mdadm initiates the reshape
Based on code From: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Sometimes we want to convert a devnum to a devname without allocating
memory. So provide function to do the formatting without allocation.
Signed-off-by: NeilBrown <neilb@suse.de>
Manager thread shall pass the information to monitor thread (mdmon)
that some devices are removed from container. Otherwise, monitor
(mdmon) might use such devices (spares) to rebuild the array that has
gone degraded.
This problem happens for imsm containers, since a list of the
container disks is maintained in intel_super structure. When array
goes degraded, the list is searched to find a spare disks to start
rebuild. Without this fix the rebuild could be stared on the spare
device that was a member of the container, but has been removed from
it.
New super type function handler has been introduced to prepare
metadata format specific information about removed devices.
int (*remove_from_super)(struct supertype *st, mdu_disk_info_t *dinfo)
The message prepared in remove_from_super is later processed by
process_update handler in monitor thread.
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This is useful with 1.1 and 1.2 metadata to update the metadata if
the device size has changed.
The same functionality can be achieved by writing to the device size
in sysfs after re-adding normally, but in some cases this might be
easier.
Signed-off-by: NeilBrown <neilb@suse.de>
Growing an array when there aren't enough spares can make the array
degraded. This works but might not be what is wanted.
So warn the user in this case and require a --force to go ahead
with the reshape.
Signed-off-by: NeilBrown <neilb@suse.de>
Move opening backup file to the function for future reuse during
container reshape.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Though not having the proper backup file can cause data corruption, it
is not enough to justify not being able to start the array at all.
So allow "--invalid-backup" to be specified which says "just continue
even if a backup cannot be restored".
Signed-off-by: NeilBrown <neilb@suse.de>
As chunk_size in mdstat_ent is never set, we shouldn't copy
it into a->info.array.
In fact, it is safest to get rid of the field altogether.
Reported-by: "Kwolek, Adam" <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
number of backup blocks evaluation is put in to function for code reuse.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
fd handles table creation is put in to function for code reuse.
In manage_reshape(), child_grow() function from Grow.c will be reused.
To prepare parameters for this function, code from Grow.c can be
reused also.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Due to fact that IMSM Windows compatibility was not tested yet,
feature has to be treated as experimental until compatibility
verification will be performed.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Also not that the leading '-' on the metadata names now
simply means that mdmon must not reconfiure the array.
Signed-off-by: NeilBrown <neilb@suse.de>
For consistency with makedev().
int is not sufficient.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If a devices - typically in a mirrored set - is assembled
independently of the other devices, and then attempted to be brought
back into the set, it could contain inconsistent data. It should not
be included.
So detect this situation by ensuring that the 'most recent' device is
believed to be active by every other device. If a device is wayward,
it will only consider fellow wayward devices to be active and will
think all others are failed or missing.
This patches fixes --incremental, --assemble was done in an earlier
patch.
Signed-off-by: NeilBrown <neilb@suse.de>
In several cases, two different long options map to the same short
option. So e.g. you could give '--brief' and it would be interpreted
as '--bitmap'. That isn't really good.
So for every shared short option, define an option number and return
that for the long option instead. Then always check for both the
short and long options.
Also give some bugs like " mode == 'G'" which should be '== GROW'.
Signed-off-by: NeilBrown <neilb@suse.de>
In the native metadata case Grow_reshape() and the kernel validate what
reshapes are possible / supported and the kernel handles all the metadata
updates. In the external case the metadata format may have specific
constraints above this baseline. External formats also introduce the
constraint of only permitting some reshapes at container scope versus subarray
scope. For exmaple imsm changes to 'raiddisks' must be applied to all arrays
in the container.
This operation assumes that its 'st' parameter has been obtained from
super_by_fd() (such that st->subarray is up to date), and that a snapshot of
the metadata has been loaded from the container.
Why a new method, versus extending an existing one?
->validate_geometry: this routine assumes it is being called from Create(),
adding reshape complicates the cases that this routine needs to handle. Where
we find that checks can be shared between the two cases those routines
refactored into common code internal to the metadata handler, i.e. no need to
provide a unified external interface. ->validate_geometry() also does not
expect to update the metadata.
->update_super: this is meant to update single fields at Assembly() and only at
the container scope. Reshape potentially wants to update multiple fields at
either container or subarray scope.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Support metadata specific level, layout and chunksize defaults. Kill an
uneeded superswitch methods ahead of adding more for the reshape case.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
In order to support reshape and atomic removal of spares from containers
we need to prevent mdmon from activating spares. In the reshape case we
additionally need to freeze sync_action while the reshape transaction is
initiated with the kernel and recorded in the metadata.
When reshaping a raid0 array we need to freeze the array *before* it is
transitioned to a redundant raid level. Since sync_action does not exist
at this point we extend the '-' prefix of a subarray string to flag
mdmon not to activate spares.
Mdadm needs to be reasonably certain that the version of mdmon in the
system honors this 'freeze' indication. If mdmon is not already active
then we assume the version that gets started is the same as the mdadm
version. Otherwise, we check the version of mdmon as returned by the
extended ping_monitor() operation. This is to catch cases where mdadm
is upgraded in the filesystem, but mdmon started in the initramfs is
from a previous release.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
...before introducing another open coded instace of this conversion.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Platform (metadata) domain let the metadata handlers differentiate
disk domains based on controllers that the disk belongs to.
Platform domain is sub-domain inside user specified domain
in mdadm.conf configuration files inheriting all parameters from it.
The metadata domain name is used disk domain matching functions.
The disk with the same metadata domain name belong to the same metadata
domain.
New metadata handler is added that retrieves platform domain string based
on disk path:
const char *(*get_disk_controller_domain)(const char *path);
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Rather than only migrating between arrays with the same spare_group,
we now migrate based on domains set in the policy.
In order for spare_group to continue to work, we treat it as a domain
of the destination array, and a domain of any device we might remove
from a source array.
Signed-off-by: NeilBrown <neilb@suse.de>
If getinfo_super is called on a container supertype we only get information
on first disk. As a parameter it uses reference to preallocated
mdinfo structure. Amending getinfo_super to return full list of disks
would require ammending all previous calls and subsequently freeing memory
allocated for mdinfo list.
Function container_content that returns a mdinfo list is written
specifically for assembly, performing actions not needed to just fill
mdinfo. It also does not include spares so is unsuitable.
As an alternative a new function getinfo_super_disks is created
to obtain information about all disks states in array.
Existing function sysfs_free is used to free memory
allocated by getinfo_super_disks.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
--no-sharing option disables moving spares between arrays/containers.
Without the option spares are moved if needed according to config rules.
We only allow one process moving spares started with --scan option.
If there is such process running and another instance of Monitor
is starting without --scan, then we issue a warning but allow it
to continue.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When "mdadm -I" is given a device with no metadata, mdadm tries to add
it as a 'spare' somewhere based on policy.
This patch changes the behaviour in two ways:
1/ If the device is at a 'path' where a previous device was removed
from an array or container, then we preferentially add the spare to
that array or container.
2/ Previously only 'bare' devices were considered for adding as
spares. Now if action=spare-same-slot is active, we will add
non-bare devices, but *only* if the path was previously in use
for some array, and the device will only be added to that array.
Based on code
From: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If the disk is taken out from its port this port information is
lost. Only udev rule can provide us with this information, and then we
have to store it somehow. This patch adds writing 'cookie' file in
/dev/.mdadm/failed-slots directory in form of file named with value of
f<path-id> containing the metadata type and uuid of the array (or
container) that the device was a member of. The uuid is in exactly
the same format as in the mapfile.
FAILED_SLOTS_DIR constant has been added to hold the location of
cookie files.
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
<path-id> allows to identify the port to which given device is plugged
in. In case of hot-removal, udev can pass this information for future
use (eg. write this name as 'cookie' allowing to detect the fact of
reinserting device to the same port).
--path <path-id> parameter has been added to device removal handle
(and char *path has been added to IncrementalRemove() to pass this
value) in order to pass path-id to this handler.
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Now that the next_member loop is much smaller it is easy to
just use 'content' rather than stashing it in 'tmpdev->content'.
So we can remove the 'content' field from 'struct mddev_dev'.
Signed-off-by: NeilBrown <neilb@suse.de>
Remove the _t pointer typedef and remove the _s suffix for the
structure,
These things do not help readability.
Signed-off-by: NeilBrown <neilb@suse.de>
This handles the 'container' part of 'load_super', so we can
soon make them completely separate - it is just confusing to
overload these two.
Signed-off-by: NeilBrown <neilb@suse.de>
This allows the info for a single array to be extracted,
so we don't have to write it into st->subarray.
For consistency, implement container_content for super0 and super1,
to just return the mdinfo for the single array.
Signed-off-by: NeilBrown <neilb@suse.de>
Rather than hiding this in the 'st', return it explicitly.
In the one case we still need it, copy it into st where needed.
This will disappear in a future patch.
Signed-off-by: NeilBrown <neilb@suse.de>
Rather than hiding this arg in the 'st' structure, pass it explicitly.
This is a first step to getting rid of 'subarray' from 'supertype'.
The strcpy in open_subarray should have better error checking, but it
will disappear soon so there is little point.
Signed-off-by: NeilBrown <neilb@suse.de.
To accurately detect when an array has been split and is now being
recombined, we need to track which other devices each thinks is
working.
We should never include a device in an array if it thinks that the
primary device has failed.
This patch just allows get_info_super to return a list of devices
and whether they are thought to be working or not.
Signed-off-by: NeilBrown <neilb@suse.de>
Detail: report reshape and check as well as resync and recovery
Wait: if the resync is pending or delayed, wait for that too.
Signed-off-by: NeilBrown <neilb@suse.de>
If an --add is requested and a re-add looks promising but fails or
cannot possibly succeed, then don't try the add. This avoids
inadvertently turning devices into spares when an array is failed but
the devices seem to actually work.
Signed-off-by: NeilBrown <neilb@suse.de>
Allow disk-policy to be computed given the path and
disk type explicitly. This can be used when hunting through
/dev/disk/by-path for something interesting.
Signed-off-by: NeilBrown <neilb@suse.de>
To support incorpating a new bare device into a collection of arrays -
one partition each - mdadm needs a modest understanding of partition
tables.
The main needs to be able to recognise a partition table on one device
and copy it onto another.
This will be done using pseudo metadata types 'mbr' and 'gpt'.
Signed-off-by: NeilBrown <neilb@suse.de>
A device can be in a number of domains.
The domains of an array is the union of the domains of all devices.
A device is allowed to join an array when its set of domains is a
subset of the array's domains.
Signed-off-by: NeilBrown <neilb@suse.de>
Policy can be stated as lines in mdadm.conf like:
POLICY type=disk path=pci-0000:00:1f.2-* action=ignore domain=onboard
This defines two distinct policies which apply to any disk (but not
partition) device reached through the pci device 0000:00:1f.2.
The policies are "action=ignore" which means certain actions will
ignore the device, and "domain=onboard" which means all such devices
as treated as being united under the name 'onboard'.
This patch just adds data structures and code to read and
manipulate them. Future patches will actually use them.
Signed-off-by: NeilBrown <neilb@suse.de>
It turns out that /lib/init/rw doesn't exist in early boot
like I thought. So give up on that idea and just use
/dev/.mdadm/ for files that must persist from early-boot
to regular boot.
Signed-off-by: NeilBrown <neilb@suse.de>
We now have 3 directory definitions: mdmon directory for its pid and
sock files (compile time define, not changable at run time), mdmonitor
directory which is for the mdadm monitor mode pid file (can only be
passed in via command line at the time mdadm is invoked in monitor mode),
and the directory for the mdadm incremental assembly map file (compile
time define, not changable at run time). Only the mdadm map file still
hunts multiple locations, and the number of locations has been reduced
to /var/run and the compile time specified location. Re-use of similar
sounding defines that actually didn't denote their actual usage at
compile time made it more difficult for a person to know what affect
changing the compile time defines would have on the resulting programs.
This patch renames the various defines to clearly identify which item
the define affects. It also reduces the number of various directories
which will be searched for these files as this has lead to confusion
in mdadm and mdmon in terms of which files should take precedence when
files exist in multiple locations, etc. It's best if the person
compiling the program intentionally and with planning selects the
right directories to be used for the various purposes. Which directory
is right depends on which items you are talking about and what boot
loader your system uses and what initramfs generation program your
system uses. Because of the inter-dependency of all these items it
would typically be up to the distribution that mdadm is being integrated
into to select the correct values for these defines.
Signed-off-by: Doug Ledford <dledford@redhat.com>
GET_ARRAY_INFO always succeeds on an inactive container, so we need to
be a bit more diligent about adding a disk to an active container.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
--test can be given in Manage mode.
This can be used when there is an attempt to fail or remove 'faulty',
'failed' or 'detached' devices, or to re-add 'missing' devices.
If no devices were failed, removed, or re-added, then mdadm will
exit with status '2'.
Signed-off-by: NeilBrown <neilb@suse.de>
This can be used for hot-unplug. When a device has been remove,
udev can call
mdadm --incremental --fail sda
and mdadm will find the array holding sda and remove sda from
the array.
Based on code from Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This allows finding the array which contains a given component.
Components are named using the kernel-internal string name such
as "sda1" or "hdb".
Don't return member arrays, only the contain that contains them.
Also tidy up the parsing of 'inactive' arrays in /proc/mdstat.
If we see 'inactive' we need to set 'in_devs' immediately as there
is no level coming.
Signed-off-by: NeilBrown <neilb@suse.de>
Allow the name of the array stored in the metadata to be updated. In
some cases the metadata format may not be able to support this rename
without modifying the UUID. In these cases the request will be blocked.
Otherwise we allow the rename to take place, even for active arrays.
This assumes that the user understands the difference between the kernel
node name, the device node symlink name, and the metadata specific name.
Anticipating further need to modify subarrays in-place, introduce the
->update_subarray() superswitch method. A future potential use
case is setting storage pool (spare-group) identifiers.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
...i.e. GET_DEVS == (GET_DEVS|SKIP_GONE_DEVS)
A null pointer dereference in Incremental.c can be triggered by
replugging a disk while the old name is in use. When mdadm -I is called
on the new disk we fail the call to sysfs_read(). I audited all the
locations that use GET_DEVS and it appears they can tolerate missing a
drive. So just make SKIP_GONE_DEVS the default behaviour.
Also fix up remaining unchecked usages of the sysfs_read() return value.
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Make create check with the appropriate meta data handler and see what the
largest chunk size is supported. The current 512K default is not supported
by existing imsm OROM.
[dan.j.williams@intel.com: trim the upper limit to 512k for future oroms]
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Support for deleting a subarray out of a container. When all subarrays
are deleted the component devices are converted back into spares, a
--zero-superblock is still needed to kill the remaining metadata at this
point. This operation is blocked when the subarray is active and may
also be blocked by the metadata handler when deleting the subarray might
change the uuid of other active subarrays. For example, with imsm,
deleting subarray 'n' may change the uuid of subarrays with indexes > n.
Deleting a subarray needs to be a container wide event to ensure
disks that record the modified subarray list perceive other disks that
did not receive this change as out of date.
Notes:
The st->subarray parsing in super-intel.c and super-ddf.c is updated to
be more strict now that we are reading user supplied subarray values.
Offline container modification shares actions that mdmon typically
handles so promote is_container_member() and version_to_superswitch()
(formerly find_metadata_methods()) to generic utility functions for the
cases where mdadm performs the operation.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This is needed for imsm where:
1/ we want to report raid_disks as zero to allow mdadm -As to
incorporate all spares
2/ we can't determine stale disks by looking at the event counts.
3/ we can't see per-subarray expectations with the info returned from
the container level ->getinfo_super()
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
These metadata are not expected on partitions, and they have
no way of differentiation whether which is correct if they
are found both on the device and on the last partition.
So if the device is a partition, refuse to read the metadata.
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm prevents creation when device names are duplicated on the command
line, but leaves the partially created array intact. Detect this case
in the error code from add_to_super() and cleanup the partially created
array. The imsm handler is updated to report this conflict in
add_to_super_imsm_volume().
Note that since neither mdmon, nor userspace for that matter, ever saw an
active array we only need to perform a subset of the cleanup actions.
So call ioctl(STOP_ARRAY) directly and arrange for Create() to cleanup
the map file rather than calling Manage_runstop().
Reported-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Currently "mdadm -As" will process the entries in the config
file in order. If any array is a component or member of a preceding
array, that array will not be assembled.
So if there are any failures during assembly, retry those arrays,
and look until everything is assembled, or nothing more can
be assembled.
Signed-off-by: NeilBrown <neilb@suse.de>
Monitoring /proc/mounts and creating a .pid file as soon as /var/run
is writable is racy. Most distros clean all non-directories from
/var/run early in boot and if mdmon races with this it could
lose the files as soon as they are created.
Instead require that "mdmon --takeover" be run after /var is writable.
Signed-off-by: NeilBrown <neilb@suse.de>
/var/run probably doesn't persist from early boot.
So if necessary, store in in /lib/init/rw or somewhere else
that does persist.
Signed-off-by: NeilBrown <neilb@suse.de>
Unlike native md checkpointing some data about the geometry and type of
the migration process is coded into curr_migr_unit. Provide logic to
convert between md/{resync_start|recovery_start} and imsm/curr_migr_unit.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Minimal changes needed to permit reassembling partially recovered
external metadata arrays. The biggest logical change is that
->container_content() can now surface partially rebuilt members rather
than omitting them from the disk list.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Replace occurrences of ~0ULL to make it clear we are talking about maximal
resync/recovery position.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The patch increases the capacity of buffers used to store
sysfs path names. Originally the buffers were too small to
hold the canonical representation of sysfs path (in case
of a SAS device, especially a device installed behind an
expander).
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Reviewed-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When creating an array, check if the devices have partition
tables and print a warning if the table or the partitions might be
destroyed by array creation.
Signed-off-by: NeilBrown <neilb@suse.de>
- When --kill-superblock is used with --metadata, find every
different superblock if there are several and kill them all.
- When creating a new array, kill off any old metadata. The code
to do this was already present but has become broken over time.
Signed-off-by: NeilBrown <neilb@suse.de>
ie. the percent increments after which RebuildNN event is generated
This is particulary useful when using --program option, rather than
(only) syslog for alerts.
Signed-off-by: Zdenek Behan <rain@matfyz.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
Connect to the monitor in the old namespace and use that connection for
WaitClean requests when stopping the victim mdmon instance. This allows
ping_monitor() to work post chroot().
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The load_super() from an mdadm --detail call may race against an mdmon
update. When this happens the load_super sees an inconsistent metadata
block and returns an error. The fallback path to use the map file
contents lacks uuid reporting, so provide __fname_from_uuid for
generically printing a uuid.
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When disks have conflicting container memberships (same container ids
but incompatible member arrays) --update=uuid can be used to move
offenders to a new container id by changing 'orig_family_num'.
Note that this only supports random updates of the uuid as the actual
uuid is synthesized. We also need to communicate the new
'orig_family_num' value to all disks involved in the update. A new
field 'update_private' is added to struct mdinfo to allow this
information to be transmitted.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A previous patch moved move the '--examine --brief' reporting of
member arrays to before their containers. This breaks "mdadm -As"
assembly. So put them back, but still fix the problem addressed by
previous patch.
Signed-off-by: NeilBrown <neilb@suse.de>
The family_number field can change. The option-rom will change the
family number when it starts a rebuild process (flags a container for
rebuild). This was not seen previously as mdadm would usually start the
rebuild process, preserving the family number.
This is the mechanism that helps to prevent a prodigal array member from
being returned to its original system and cause a rebuild to go in the
wrong direction. With the change we will end up with a container that
will fail to assemble unless the device with the incompatible family
number is left out of the assembly.
So, take several actions:
1/ Convert uuid generation to use orig_family_num, being careful to
preserve the existing uuid in the case where orig_family_num is not
set (i.e. previous mdadm created imsm arrays)
2/ Set orig_family_num at Create. For arrays created by mdadm prior to
this release orig_family_num will be zero, so set it to family_num at
the first metadata write.
3/ Add checks for orig_family_num to compare_super_imsm
4/ Update the family number when initiating rebuild
5/ The option-rom mixes some random data into the family number, add
this functionality to the mdadm implementation.
Reported-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This allows the layout to be parsed after the current level of the
array is know, so that the level doesn't need to be given (otherwise
pointlessly) on the command line.
Signed-off-by: NeilBrown <neilb@suse.de>
Bother Manage_reconfig and Grow_reshape provide for changing
the 'layout' of a faulty array. This is no necessary.
So discard Manage_reconfig and just use Grow_reshape
Signed-off-by: NeilBrown <neilb@suse.de>
With 2.6.30 it is possible to tell the md driver to clip an array to a
size smaller than the real size of the array. This option gives
access to that feature. The size change does not persist
across restarts.
Signed-off-by: NeilBrown <neilb@suse.de>
When rebuilding the mapfile (mdadm -Ir), if not appropriate name is
found in /dev/md/, try to find an appropriate name, either by looking
in mdadm.conf or by using the name in the metadata.
Signed-off-by: NeilBrown <neilb@suse.de>
If mdadm.conf contains
HOMEHOST <ignore>
or commandline contains
--homehost=<ignore>
then the check that array metadata mentions the given homehost is
replace by a check that the name recorded in the metadata is not
already used by some other array mentioned in mdadm.conf.
This allows more arrays to use their native name rather than having
an _NN suffix added.
This should only be used during boot time if all arrays required for
normal boot are listed in mdadm.conf.
If auto-assembly is used to find all array during boot, then the
HOMEHOST feature should be used to ensure there is no room for
confusion in choosing array names, and so it should not be set
to <ignore>.
Signed-off-by: NeilBrown <neilb@suse.de>
Because ---examine --brief, or --detail --brief are
often used to create mdadm.conf, and because people don't want to
have to update their mdadm.conf unnecessarily, we don't want to
include information that might change.
And now that level changing is supported, that is almost everything
but UUID.
So move some more fields into the "Only print with --verbose" class.
Signed-off-by: NeilBrown <neilb@suse.de>
The line 'auto' in mdadm.conf can be used to disable assembly
of specific metadata types, or of all arrays.
This does not affect assembly of arrays listed in mdadm.conf
or on command line.
auto -all
will disable all auto-assembly.
auto -ddf
will cause mdadm to ignore ddf arrays that are not explicitly
mentioned, and auto assemble anything else it finds.
Signed-off-by: NeilBrown <neilb@suse.de>
This patch enables the --size parameter for build operations.
Without this, if you have a raid1, for instance, where the 2 disks are
not the exact same size, and you need to build the array but one of the
disks is not available right at the moment (maybe it's USB and it's
unplugged, or maybe it's a network disk and it's unavailable), then you
have to play some weird games to get the array to size correctly (that
is, to the size of the smaller of the two components or less).
There may be other uses for this too...
--
Paul
Signed-off-by: NeilBrown <neilb@suse.de>
When building container members with -IR, we need to ensure that
devices added to an active array preserve the 'in_sync' status so they
don't needlessly get rebuilt.
So allow sysfs_add_disk to do this (only works in kernels since
2.6.30) and pass the relevant flag down.
Signed-off-by: NeilBrown <neilb@suse.de>
For short reshapes the kernel may be done before mdadm can check that
progress has passed the critical section.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
imsm arrays round down the effective array size to the closest 1
megabyte boundary so teach get_info_super_imsm and sysfs_set_array to
set 'md/array_size' if available (and make sure ddf uses the default
size).
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
wait not only for the name to appear, but for it to refer to the
correct device.
Sometimes old symlinks left lying around can be confusing.
Signed-off-by: NeilBrown <neilb@suse.de>
If an array reshape completed within 1 second, then --grow will not
notice that it has finished and will keep waiting for the critical
section to pass.
So be more cautious in the test.
Signed-off-by: NeilBrown <neilb@suse.de>
DDF raid6 layouts are subtly different from the standard 'md' layouts.
From 2.6.30 the kernel knows about these.
Teach mdadm about them, and also allow 'ddf' to set an appropriate default.
Signed-off-by: NeilBrown <neilb@suse.de>
All operations that rely on loading from an existing container (like
--add) will fail after a disk has been removed. Provide an option to
skip missing / offline disks rather than abort. We attempt to do this
in the load_super_{imsm,ddf}_all cases when mdmon is running i.e. we
already have a consitent version of the metadata running in the system.
Otherwise, we fail as normal and let the administrator fix up the
container.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-12-08 Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
* Makefile (dadm.uclibc): Remove misspelled and unneeded rule.
* md5.h: Include stdint.h for uClibc.
* mdadm.h: uClibc defines __UCLIBC__. If uClibc has LFS off
then use lseek instead of lseek64.
Signed-off-by: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
If the metadata handler can not find its platform support components
then there is no way for it to verify that the raid configuration will
be supported by the option-rom. Provide a generic method for metadata
handlers to warn the user that the array they are about to create may
not work as intended with a given platform.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Let handlers specifiy their own defaults, specifically needed for the
imsm-raid5 case where mdadm defaults to 'ls' and imsm to 'la'.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Metadata formats like imsm work in concert with platform firmware and
hardware, so provide a way for mdadm to display this info to the user.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If, when creating an array, a signal target device is given which
is a container, then allow the metadata handler to choose which
devices to use.
This is currently only supported for DDF.
Signed-off-by: NeilBrown <neilb@suse.de>
Now that names in /dev are usually created (eventually) by udev,
it isn't really safe to rely in finding a name in /dev to pass to
mdmon to identify which array to monitor.
And it isn't really necessary to have a name in /dev.
So just pass the symbolic name, e.g. md127 or md123.
Change util.c to pass that name, and change mdmon to process the
name sensibly.
Signed-off-by: NeilBrown <neilb@suse.de>
It is possible for the mapfile to become wrong, and that gets
very confusing. So validate entries before returning them.
Signed-off-by: NeilBrown <neilb@suse.de>
Rather than appending the md minor number, we now append a small
sequence number to make sure name in /dev/md/ that aren't LOCAL are
unique. As the map file is locked while we do this, we are sure
of no losing any races.
Signed-off-by: NeilBrown <neilb@suse.de>
Try to treat members of containers much like other arrays for
assembly.
We still look through the list of devices for a match (it will be
the container), then find the relevant 'info' and try to assemble
the array.
Signed-off-by: NeilBrown <neilb@suse.de>
Factor out, from Incremental_container, the code for assembling an
array based on information extracted from a container. We will
shortly use this from Assemble too.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This only applies if udev isn't installed or is disabled
by MDADM_NO_UDEV
We try to remove partitions too.
We find names to remove by looking in /var/run/mdadm/map
Signed-off-by: NeilBrown <neilb@suse.de>
When a 'container' gets started, we need udev to notice, but the
kernel has no way of knowing that a KOBJ_CHANGE event is needed. So
send one directly via the 'uevent' sysfs attribute.
Also, uevents don't get generated when md arrays are stopped (prior to
2.6.28) so send 'change' events then too.
Signed-off-by: NeilBrown <neilb@suse.de>
When mdadm.conf is automatically generated, we might not know a
suitable /dev/name. But we do know the uuid of the container.
So allow that as an option.
Signed-off-by: NeilBrown <neilb@suse.de>
Change the "env_check_mdmon" function to be more generic, accepting
and environment variable name, as soon we will have a new use for it.
Signed-off-by: NeilBrown <neilb@suse.de>
We will shortly be feeding more information into the process of
creating array devices, so delay the creation. Still open them
early if the device already exists.
This involves making sure the autof flag is in the right place
so that it can be found at creation time.
Also, Assemble, Build, and Create now always close 'mdfd'.
Signed-off-by: NeilBrown <neilb@suse.de>
This reflect that fact that more often than not it is creating things
in /dev, and allows for a new open_mddev which does just that.
Signed-off-by: NeilBrown <neilb@suse.de>
Given an mdadm.conf like the following allow /dev/imsm and /dev/md/r1 to be
created by "mdadm -As".
DEVICES partitions
ARRAY /dev/imsm metadata=imsm auto=md UUID=b98f5dbe-aa859e7b-0e369b89-a80986d4
ARRAY /dev/md/r1 container=/dev/imsm member=0 auto=mdp UUID=3538e39c-b397c2e9-1aa031f9-2bc0eca4
spares=1
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Previously it was possible to set the WRITEMOSTLY flag when
adding a device to an array, but not to clear the flag when re-adding.
This is now possible with --readwrite.
Signed-off-by: NeilBrown <neilb@suse.de>