If we find we cannot add the requested bitmap file when
assembling the array, then make sure to clean up properly
and don't leave a half-configured array.
Signed-off-by: NeilBrown <neilb@suse.de>
If --assemble is given a container and some other devices to assemble
an array from, it complains with an error because that doesn't make
sense.
However it currently also complains if the list of devices was extract
from the config file rather than being given on the command line.
That is not appropriate.
So add an '&& inargv' test to ensure that we are really complaining
about the right thing.
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
1.x metadata allows a device to be a member of the array while it
is still recoverying. So it is a working member, but is not
completely in-sync.
mdadm/assemble does not understand this distinction and assumes that a
work member is fully in-sync for the purpose of determining if there
are enough in-sync devices for the array to be functional.
So collect the 'recovery_start' value from the metadata and use it in
assemble when determining how useful a given device is.
Reported-by: Mikael Abrahamsson <swmike@swm.pp.se>
Signed-off-by: NeilBrown <neilb@suse.de>
Once load_super has succeeded, it should continue to succeed. However
devices can disappear etc so it is prudent to always check the return
status of load_super.
Signed-off-by: NeilBrown <neilb@suse.de>
Previously such things did not exist: ACTIVE and SYNC were either both
set or both clear. Recent changes with reshape means that a device
can be ACTIVE but not yet fully in-sync, so they need to be handled
and included in the array as active devices.
Signed-off-by: NeilBrown <neilb@suse.de>
When disks have conflicting container memberships (same container ids
but incompatible member arrays) --update=uuid can be used to move
offenders to a new container id by changing 'orig_family_num'.
Note that this only supports random updates of the uuid as the actual
uuid is synthesized. We also need to communicate the new
'orig_family_num' value to all disks involved in the update. A new
field 'update_private' is added to struct mdinfo to allow this
information to be transmitted.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If an array name contains a "hostname:" prefix, then
--assemble will tend to leave it there, while --incremental
will strip it off (when chosing a device name during auto-assembly).
Make this more consistent: strip the name off if we decide that
the name will be treated as 'local'. Leave it on if it will be
treated as 'foreign'.
Signed-off-by: NeilBrown <neilb@suse.de>
If mdadm.conf contains
HOMEHOST <ignore>
or commandline contains
--homehost=<ignore>
then the check that array metadata mentions the given homehost is
replace by a check that the name recorded in the metadata is not
already used by some other array mentioned in mdadm.conf.
This allows more arrays to use their native name rather than having
an _NN suffix added.
This should only be used during boot time if all arrays required for
normal boot are listed in mdadm.conf.
If auto-assembly is used to find all array during boot, then the
HOMEHOST feature should be used to ensure there is no room for
confusion in choosing array names, and so it should not be set
to <ignore>.
Signed-off-by: NeilBrown <neilb@suse.de>
For container= and member= to be effective in an mdadm.conf line
they must both be present. So when checking for their absence we
need container != NULL || member != NULL.
Signed-off-by: NeilBrown <neilb@suse.de>
The line 'auto' in mdadm.conf can be used to disable assembly
of specific metadata types, or of all arrays.
This does not affect assembly of arrays listed in mdadm.conf
or on command line.
auto -all
will disable all auto-assembly.
auto -ddf
will cause mdadm to ignore ddf arrays that are not explicitly
mentioned, and auto assemble anything else it finds.
Signed-off-by: NeilBrown <neilb@suse.de>
If an array is created with --homehost=any, then --assemble and
--incremental will treat it as being local to 'this' host, no matter
what the name of this host is.
This is useful for array that will be given unique names and be
moved between machines.
This needs to be documented.
Signed-off-by: NeilBrown <neilb@suse.de>
When building container members with -IR, we need to ensure that
devices added to an active array preserve the 'in_sync' status so they
don't needlessly get rebuilt.
So allow sysfs_add_disk to do this (only works in kernels since
2.6.30) and pass the relevant flag down.
Signed-off-by: NeilBrown <neilb@suse.de>
wait not only for the name to appear, but for it to refer to the
correct device.
Sometimes old symlinks left lying around can be confusing.
Signed-off-by: NeilBrown <neilb@suse.de>
If we are assembling an array in a container and it isn't complete
enough to start yet, then
- don't start mdmon
- don't say the array is started
- don't wait for the device to appear in /dev
Signed-off-by: NeilBrown <neilb@suse.de>
Use mddev_busy() as GET_ARRAY_INFO can succeed on 'clear' arrays.
Ran into this after an encountering a case where mdadm -Ss ended in
segfault (missing check for NULL return from map_by_devnum() in
sles11:Manage.c). So, tried to stop the array by hand with echo clear >
md/array_state, after which I could not reassemble since GET_ARRAY_INFO
was succeeding.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reshape with large chunk size can require a large stripe_cache.
We make this work when starting the reshape but not when
restarting at assemble time. So fix that.
Signed-off-by: NeilBrown <neilb@suse.de>
This is only significant for --assemble --force where some old
devices might be included into the array. If anything looks like
it isn't clean, the kernel will not allow a degraded array to be started.
Signed-off-by: NeilBrown <neilb@suse.de>
If any superblocks in a confused array had an event count of 0,
"mdadm -Af" would not update the event counts to assemble the array.
I don't remember why that text is there, and it has caused at least
one situation to be difficult to recover from. So remove the
test. --force means --force!
Signed-off-by: NeilBrown <neilb@suse.de>
1/ when we choose not to use a device, must set ->used to 2, not 1.
2/ When we give up on a member, clear st and content.
Signed-off-by: NeilBrown <neilb@suse.de>
We probably should pass a flag down saying 'this is auto-assembly',
but for now, if there is no identity information set, it must
be auto-assemble.
Signed-off-by: NeilBrown <neilb@suse.de>
Try to treat members of containers much like other arrays for
assembly.
We still look through the list of devices for a match (it will be
the container), then find the relevant 'info' and try to assemble
the array.
Signed-off-by: NeilBrown <neilb@suse.de>
Attempting to open(O_EXCL) each candidate device usually filters out all
busy raid components. However, containers do not behave like components
and will return container_content that may describe active member
arrays.
This patch just adds a function that will be used to check if a
container member is busy. It will be used shortly.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Factor out, from Incremental_container, the code for assembling an
array based on information extracted from a container. We will
shortly use this from Assemble too.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
In preparation for handling the container case where we may need to handle
a list of potential member arrays.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When a 'container' gets started, we need udev to notice, but the
kernel has no way of knowing that a KOBJ_CHANGE event is needed. So
send one directly via the 'uevent' sysfs attribute.
Also, uevents don't get generated when md arrays are stopped (prior to
2.6.28) so send 'change' events then too.
Signed-off-by: NeilBrown <neilb@suse.de>
We previously only updated /var/run/mdadm/map when starting an
array with --incremental. However we now make more use of
that file (to pass the dev name to udev) so always update it.
Signed-off-by: NeilBrown <neilb@suse.de>
This delays the create_mddev call even further in the case where
an array device name is given for --assemble. It is now delayed
until the 'name' of the array is also available.
We will shortly be feeding more information into the process of
creating array devices, so delay the creation. Still open them
early if the device already exists.
This involves making sure the autof flag is in the right place
so that it can be found at creation time.
Also, Assemble, Build, and Create now always close 'mdfd'.
Signed-off-by: NeilBrown <neilb@suse.de>
This reflect that fact that more often than not it is creating things
in /dev, and allows for a new open_mddev which does just that.
Signed-off-by: NeilBrown <neilb@suse.de>
Given an mdadm.conf like the following allow /dev/imsm and /dev/md/r1 to be
created by "mdadm -As".
DEVICES partitions
ARRAY /dev/imsm metadata=imsm auto=md UUID=b98f5dbe-aa859e7b-0e369b89-a80986d4
ARRAY /dev/md/r1 container=/dev/imsm member=0 auto=mdp UUID=3538e39c-b397c2e9-1aa031f9-2bc0eca4
spares=1
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Raid disk and disk number information is not relevant at the container
level, especially for imsm. So arrange for getinfo_super_imsm() to
always publish devices as spares and report the number of spares at
Assemble() time.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Made the mistake of recompiling the F9 mdadm rpm which has a patch to
remove -Werror and add "-Wp,-D_FORTIFY_SOURCE -O2" which turns on lots
of errors:
config.c:568: warning: ignoring return value of asprintf
Assemble.c:411: warning: ignoring return value of asprintf
Assemble.c:413: warning: ignoring return value of asprintf
super0.c:549: warning: ignoring return value of posix_memalign
super0.c:742: warning: ignoring return value of posix_memalign
super0.c:812: warning: ignoring return value of posix_memalign
super1.c:692: warning: ignoring return value of posix_memalign
super1.c:1039: warning: ignoring return value of posix_memalign
super1.c:1155: warning: ignoring return value of posix_memalign
super-ddf.c:508: warning: ignoring return value of posix_memalign
super-ddf.c:645: warning: ignoring return value of posix_memalign
super-ddf.c:696: warning: ignoring return value of posix_memalign
super-ddf.c:715: warning: ignoring return value of posix_memalign
super-ddf.c:1476: warning: ignoring return value of posix_memalign
super-ddf.c:1603: warning: ignoring return value of posix_memalign
super-ddf.c:1614: warning: ignoring return value of posix_memalign
super-ddf.c:1842: warning: ignoring return value of posix_memalign
super-ddf.c:2013: warning: ignoring return value of posix_memalign
super-ddf.c:2140: warning: ignoring return value of write
super-ddf.c:2143: warning: ignoring return value of write
super-ddf.c:2147: warning: ignoring return value of write
super-ddf.c:2150: warning: ignoring return value of write
super-ddf.c:2162: warning: ignoring return value of write
super-ddf.c:2169: warning: ignoring return value of write
super-ddf.c:2172: warning: ignoring return value of write
super-ddf.c:2176: warning: ignoring return value of write
super-ddf.c:2181: warning: ignoring return value of write
super-ddf.c:2686: warning: ignoring return value of posix_memalign
super-ddf.c:2690: warning: ignoring return value of write
super-ddf.c:3070: warning: ignoring return value of posix_memalign
super-ddf.c:3254: warning: ignoring return value of posix_memalign
bitmap.c:128: warning: ignoring return value of posix_memalign
mdmon.c:94: warning: ignoring return value of write
mdmon.c:221: warning: ignoring return value of pipe
mdmon.c:327: warning: ignoring return value of write
mdmon.c:330: warning: ignoring return value of chdir
mdmon.c:335: warning: ignoring return value of dup
monitor.c:415: warning: rv may be used uninitialized in this function
...some of these like the write() ones are not so trivial so save those
fixes for the next patch.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
For now, this means that the lack of a homehost doesn't always prevent
assembly.
Soon we will allow assembly anyway, but have different messages if
homehost isn't supported.
When we assemble an array, there are three different approaches
depending on whether metadata is internal or external, and on
kernel version.
Move all this to a common helper instead of duplicating in 3 places.
Signed-off-by: NeilBrown <neilb@suse.de>
The variety of approaches to 'add_disk' are factored out into
a separate function, and Incremental mode benefits by being
closer to supporting the assembly of containers.
Also remove the adding-to-array-data-structure out of sysfs_add_disk
and into add_disk.
And add some tests for --incremental mode to make sure we don't break it.
Signed-off-by: NeilBrown <neilb@suse.de>
Since we made free_super a superswitch call, we need to be careful
that st is non NULL before calling st->ss->free_super(st).
Also when updating byteorder there is a chance of a similar NULL
deref.
I want the metadata handler to have more control over the 'version',
particularly for arrays which are members of containers.
So discard st->text_version and instead use info->text_version
which getinfo_super can initialise.
If you have stacked arrays, then
mdadm -As --homehost=fred
should work but doesn't. It gets into an infinite loop!
So write some tests, and fix the bugs.
If two drives in a raid5 disappear at the same time, then "-Af"
will add them both in rather than just one and forcing the array
to 'clean'. This is slightly safer in some cases.
If an auto-assembly attempt failes because the array cannot be
opened or because the array has already been created, then we
get into an infinite loop.
Reported-by: Dan Pascu <dan@ag-projects.com>
Fixes-debian-bug: 396582
If they are for a partition and a whole device (common case)
they old message doesn't really cover the situation. So add
the "overlap" option to the text.
Also detect whether the device list was in mdadm.conf and
act accordingly.
Make -assemble a bit more resilient to finding strange
information in superblocks.
Don't claim newly added spares are InSync!! (don't know why that
code was ever in there)
As version-0.90 superblock don't record the superblock
offset, it is possible for overlapping partitions,
or a partition that starts on a 64K boundary in the whole device
to result in mis-detection - one partition or device might
be detected where the other was intended.
To avoid this awkward possibility, we reject assembly attempts
which seem to have two devices that are different but have the
same version-0.90 superblock.
To avoid this problem altogether, switch to version-1 metadata.
Signed-off-by: Neil Brown <neilb@suse.de>