2528 Commits

Author SHA1 Message Date
mwilck@arcor.de dc9e279c13 DDF: __write_init_super_ddf: just use seq number of active header
It's not necessary to check for 0xffffffff, which is a valid
sequential number.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-23 14:55:31 +10:00
mwilck@arcor.de dacf3dc5d4 DDF: __write_ddf_structure: Fix wrong reference to ddf->primary
Should reference "header" instead here.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-23 14:55:31 +10:00
NeilBrown 2fdf559d74 Manage_runstop: call flush_mdmon if O_EXCL fails on stopping mdmon array.
When stopping an mdmon array, a reshape might be in the process of being
aborted, which inhibits O_EXCL.  So if that is possible, call flush_mdmon
to make sure mdmon isn't still busy.

Reported-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-22 17:05:33 +10:00
Przemyslaw Czarnowski 79b68f1b48 imsm: monitor: do not finish migration if there are no failed disks
The transition from "degraded" to "recovery" made in OROM is slightly different
from the same transition in mdadm. A missing disk is not removed from the list of
raid devices, but just from the map. Therefore mdadm should not end migration
based on the existence of a list of missing disks but should rely on the count of
failed disks.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Tested-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-22 16:21:17 +10:00
Pawel Baldysiak 4edb8530e8 Add updating component_size to manager thread of mdmon
Mdmon does not currently update component_size. This is wrong because,
when the size is expanded, component_size is changed by mdadm but mdmon does not
reread the new value and keeps using the stale one. As a result the metadata
is incorrect during size expansion: it contains no indication that a
resync is in progress (and there is no checkpoint either). The metadata reads
as if the resync had already finished when it has not.

Component_size will be set to match information in sysfs. This value
will be updated by manager thread in manage_member() function.
Now mdmon uses the correct, current value of component_size and the
correct metadata (containing information about resync and checkpoint)
is written.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-08 11:32:53 +10:00
NeilBrown 5e73b02409 Ensure mddev_dev struct always zeroed on allocation.
There are a number of fields which should not
be left uninitialised.  e.g. attempt_re_add can get
confused if ->writemostly is not set correctly.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-05 11:53:51 +11:00
NeilBrown 748952f73e Create: default to bitmap=internal for large arrays.
Here, "large" means components are 100G or more.  It is
usually beneficial to have write-intent bitmaps on such arrays.
They can be suppressed with --bitmap=none

Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-05 10:36:21 +11:00
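
The size rule in the commit above can be sketched as follows. This is only an illustration in Python, not mdadm's C code: the function name `default_bitmap`, the byte-based size argument, and the reading of "100G" as 100 GiB are all assumptions.

```python
from typing import Optional

# Assumption: "100G" is read as 100 GiB; the commit message does not say
# whether it means GB or GiB.
LARGE_COMPONENT_BYTES = 100 * 1024**3


def default_bitmap(component_size_bytes: int,
                   requested: Optional[str] = None) -> str:
    """Return the bitmap setting to use for a new array.

    `requested` stands for an explicit --bitmap= value; None means the
    option was not given on the command line.
    """
    if requested is not None:
        # An explicit choice such as "none" always wins over the default.
        return requested
    if component_size_bytes >= LARGE_COMPONENT_BYTES:
        return "internal"
    return "none"
```

With this sketch, a 500 GiB component gets `"internal"` unless `--bitmap=none` was requested explicitly.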
NeilBrown 8af530b07f Enhance incremental removal.
When asked to incrementally-remove a device, try marking the array
read-auto first.  That will delay recording the failure in the
metadata until it is really relevant.
This way, if the device is just unplugged when the array is not
really in use, the metadata will remain clean.

If marking the device as faulty fails because it is EBUSY, that
implies that the array would be failed without the device.  As the
device has (presumably) gone, that means the array is dead.  So try
to stop it.  If that fails because it is in use, send a uevent to
report that it is gone.  Hopefully whoever mounted it will now let go.

This means that if you plug in some devices and they are
auto-assembled, then unplugging them will auto-deassemble relatively
cleanly.

To be complete, we really need the kernel to disassemble the array
after the last close somehow.  Maybe if a REMOVE has failed and a STOP
has failed and nothing else much has happened, it could safely stop
the array on last close.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-05 09:46:34 +11:00
NeilBrown 401f095c39 mdadm.8: Detail use for IMSM_NO_PLATFORM environment variable.
Suggested-by: Marcin Tomczak <marcin.tomczak@intel.com>
2013-03-04 17:25:36 +11:00
mwilck@arcor.de c1ea5a9809 Detail.c: call load_container for container subarrays
Without calling load_container at this point, the
info structure may be missing some important information.
In particular, information about secondary DDF RAID levels
may be wrong if information is only read from a single disk.

If this fails, fall back to the previous code.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:15:51 +11:00
mwilck@arcor.de 4eefd651f0 DDF: compare_super_ddf: merge local info of other superblock
If a match is found in compare_super_ddf, check the other SB
for local DDF information (VD config records, physical disk data)
which is not available in the current superblock, and add it
if needed.

This is important for mdmon - when disks are added to an
auto read-only array, they must be present in the DDF structure
in order to guarantee consistent writeback of metadata to all
disks.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:15:06 +11:00
mwilck@arcor.de 2d21069764 DDF: add sanity checks in compare_super_ddf
Besides container GUID, also check seqnum, physical and virtual
disk numbers, and check match between local and global sections.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:14:17 +11:00
mwilck@arcor.de e3c2a365e9 DDF: __write_init_super_ddf: use correct VD conf
When writing back the DDF structure, make sure that on each disk
we write the configs that include this disk even if a secondary
RAID level is present. Otherwise the secondary RAID will not be
read correctly any more when we open the device next time.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:13:21 +11:00
mwilck@arcor.de 4e5870181a DDF: container_content_ddf: handle RAID layout for RAID10
This patch adds basic handling for the special case of RAID10.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:10:38 +11:00
mwilck@arcor.de a5c7adb310 DDF: container_content_ddf: check for secondary RAID
Check for supportable secondary RAID configurations.
There is currently only one: RAID 10, if the stripe
sizes and Basic volume sizes are all equal.

With this patch, mdadm will not try to start unsupported
secondary RAID level configurations any more.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:08:46 +11:00
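
The equality condition named in the commit above might look like the sketch below. It is a hedged Python illustration, not the DDF code itself: the `Bvd` record and its field names are hypothetical, and the real check presumably also inspects the RAID levels involved.

```python
from typing import NamedTuple, Sequence


class Bvd(NamedTuple):
    """Hypothetical summary of one basic virtual disk (BVD)."""
    stripe_size: int
    size: int


def secondary_raid_supported(bvds: Sequence[Bvd]) -> bool:
    """Return True if all BVDs share one stripe size and one volume size."""
    if not bvds:
        return False
    first = bvds[0]
    return all(b.stripe_size == first.stripe_size and b.size == first.size
               for b in bvds)
```

A configuration failing this test would simply not be started, matching the behaviour the commit describes.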
mwilck@arcor.de 8a38db8674 DDF: container_content_ddf: change array disk search loop
When searching for container elements, loop over the known phys
disks rather than the elements of the current configuration.

This patch changes nothing in the logic or return value of the code.
It just prepares extended logic for handling RAID10.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:07:09 +11:00
mwilck@arcor.de 3dc821b091 DDF: load_ddf_local: store VD conf for other BVDs
Store VD config for other BVDs in the other_bvds array.
This allows handling secondary RAID levels in container_content_ddf.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 16:03:44 +11:00
mwilck@arcor.de 8ec5d68536 DDF: added other_bvd to struct vcl
The VD config structures of different BVDs in the same SVD may be
different. This pointer stores the other BVDs.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 15:59:38 +11:00
mwilck@arcor.de 0175cbf62c DDF: increase seq number when writing meta data
Cleanly increase the seq number when the DDF structures are
written, instead of always setting it back to 1.

Also, make sure that the sequential number of all headers and
VD conf records is the same.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 14:29:59 +11:00
mwilck@arcor.de 097bcf0057 DDF: use existing locations for primary and secondary DDF structure
Some RAID BIOSes apparently use hard-coded LBA offsets (presumably
from the end of the disk) for the primary and secondary DDF
structure, ignoring the values given in the DDF anchor. This is
broken BIOS behavior, but it will cause any changes made by MD
(e.g. setting the init_state flag after a full initialization)
to be "forgotten" after the next reboot.

This patch fixes this by using the existing LBA locations if
available. Verified that this fixes MD+LSI Mega Software RAID
BIOS.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 14:19:50 +11:00
mwilck@arcor.de 7f798aca5b DDF: cleanly save the secondary DDF structure
So far, mdadm only saved the header of the secondary structure.
With this patch, the full secondary DDF structure is saved
consistently, too. Some vendor DDF implementations need it.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-04 14:19:50 +11:00
NeilBrown 4dd2df0966 Discard devnum in favour of devnm
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.

So get rid of devnum (a number) and use devnm (a 32char string).
eg.
  md0
  md_d2
  md_home

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-21 17:05:23 +11:00
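
To make the old numbering concrete, here is a small Python sketch that maps a legacy devnum to a devnm string. The encoding of negative values as `-1 - n` and the treatment of the 32-character limit are assumptions; the commit only says that md_d%d devices used negative numbers.

```python
DEVNM_MAX = 32  # the commit describes devnm as a 32-character string


def devnum_to_devnm(devnum: int) -> str:
    """Map a legacy devnum to a device name.

    Assumption: negative values encode md_d%d devices as -1 - n, so that
    -1 means md_d0; the commit only says that such devnums are negative.
    """
    if devnum >= 0:
        name = f"md{devnum}"
    else:
        name = f"md_d{-1 - devnum}"
    # Assumption: the 32 characters include a C-style terminating NUL.
    if len(name) >= DEVNM_MAX:
        raise ValueError("device name too long")
    return name
```

A name such as `md_home` has no numeric equivalent at all, which is the motivation the commit gives for switching to strings.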
NeilBrown fdcad551e9 Grow: fix problem with reshaping RAID4 to RAID0.
As 'layout' doesn't map neatly from RAID4 to RAID5, we need to
set it correctly for RAID4.
Also, when no reshape is needed we should set re->level to the final
desired level.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-21 17:02:21 +11:00
NeilBrown 3920235ea2 Grow: disallow --size changes on RAID0 and Linear.
These aren't meaningful and must be disabled.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-21 14:51:11 +11:00
Thomas Bächler 2452f13a41 udev: Fix order of execution of the md rules
Right now, the rules that run blkid on raid arrays are executed after
the assembly rules. This means incremental assembly will always fail
when raid arrays are again physical components of raid arrays.

Instead of simply reversing the order, split the rules up into two files,
one dealing with array properties and one dealing with assembly.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-11 11:15:34 +11:00
Thomas Bächler 3ab6bf304a Modernize udev rules
* $tempnode is deprecated, use $devnode
* blkid -o udev output is deprecated, use IMPORT{builtin}="blkid" instead

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-11 11:15:34 +11:00
John Spencer 0d35d5c480 mdadm.h: fix ugly glibc specific ifdeffery
The code that was exposed on anything other than dietlibc and klibc
is entirely glibc-specific and broke the build on musl libc.

Signed-off-by: John Spencer <maillist-mdadm@barfooze.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-10 15:40:53 +11:00
John Spencer e3635eee64 platform-intel: canonicalize_file_name() is not portable
this is a GLIBC specific feature and should not be used.

according to its manpage:
"The call canonicalize_file_name(path) is equivalent
to the call realpath(path, NULL)."

thus, we use realpath so it works everywhere.

Signed-off-by: John Spencer <maillist-mdadm@barfooze.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-10 15:40:47 +11:00
NeilBrown def1133297 make --update=homehost work again
Commit 1e2b276535 (Report error in --update
string is not recognised) broke homehost updating functionality because it
depended on each string comparison being done even after we already found
a match.  Make it work again by restructuring code.

Reported-by: (and original version by) Justin Maggard <jmaggard10@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-10 15:40:42 +11:00
NeilBrown 9698df15d9 Avoid using BLKFLSBUF.
Now that we use O_DIRECT for all device IO, BLKFLSBUF is not needed to
ensure we get current data, and it can impose a cost if any flush-out
is needed.  So remove it.

To be safe, add O_DIRECT to one place where it isn't currently used:
when reading a bitmap.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-05 16:00:55 +11:00
NeilBrown ec1b28fdc0 Detail: print correct size for large external-metadata arrays.
If externally managed metadata is in use, array.major_version will
be zero, so the test here to consider using get_component_size()
is wrong.  So if sra is present, use the major_version from there.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-05 16:00:50 +11:00
NeilBrown 030419821f mdmon: add --foreground option
While not strictly necessary for systemd, it is cleaner to avoid
forking when running from a management daemon.  So add a --foreground
option to mdmon.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-05 15:57:09 +11:00
Jes Sorensen 15c10423aa In case launching mdmon fails, print an error message before exiting
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-05 15:51:48 +11:00
Jes Sorensen 0f7bdf8946 Add support for launching mdmon via systemctl instead of fork/exec
If launching mdmon via systemctl fails, we fall back to the old method
of fork/exec. This allows for having mdmon launched via systemctl
which avoids problems with it getting killed by systemd due to it
ending up in the parent's cgroup (udev).

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-05 15:40:38 +11:00
Jes Sorensen 3e23ba9d7b Remove --offroot argument and default to always setting argv[0] to @
We still allow --offroot to be given - for compatibility with scripts
- but ignore it.

The whole point of --offroot is to get systemd to not auto-kill mdmon,
and we always want that.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-02-05 15:06:47 +11:00
NeilBrown e31dec228b mdadm.conf.5: clarify connection between action=re-add and bitmaps.
action=re-add will only re-add a recently removed device if a
bitmap is present.
Otherwise a force-spare is needed.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-01-21 10:12:53 +11:00
NeilBrown 9dc7d3576a dev_open - don't bother trying map_dev
map_dev can be slow, and doesn't really provide a better result
than just creating a temporary device.
So discard it and use mknod/open/unlink to open a major:minor device.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-01-10 14:04:54 +11:00
NeilBrown 9bc4ae77e4 platform-intel - cache 'intel_devices' for a few seconds.
find_intel_devices() takes a little while to run as it scans
a directory tree, and the result isn't likely to change
often.
So cache the value and only discard it after 10 seconds.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-01-07 10:34:43 +11:00
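
The caching idea in the commit above can be sketched as a small time-limited cache. This Python version is an illustration only: the class name `CachedScan` and the injectable clock are assumptions, and mdadm's C implementation may be structured differently.

```python
import time
from typing import Callable, List, Optional

CACHE_SECONDS = 10.0  # lifetime given in the commit message


class CachedScan:
    """Cache the result of a slow scan for a fixed number of seconds."""

    def __init__(self, scan: Callable[[], List[str]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._scan = scan
        self._clock = clock
        self._value: Optional[List[str]] = None
        self._stamp = 0.0

    def get(self) -> List[str]:
        """Return the cached result, rescanning once it is too old."""
        now = self._clock()
        if self._value is None or now - self._stamp > CACHE_SECONDS:
            self._value = self._scan()
            self._stamp = now
        return self._value
```

Passing the clock in as a parameter keeps the expiry logic easy to exercise without sleeping.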
NeilBrown 06d2ffc3e2 conditionally remove map_dev from find_free_devnum
map_dev can be slow so it is best to not call it when
not necessary.
The final test in "find_free_devnum" is not relevant when
udev is being used, so remove the test in that case.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-01-07 10:17:04 +11:00
NeilBrown 6d388a8816 MISC: Add --examine-badblocks option
This will list the contents of the bad-blocks log, if one is present.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-12-05 12:56:31 +11:00
NeilBrown 8cf2eb96b2 Assemble: fix spelling: report_missmatch -> report_mismatch
Signed-off-by: NeilBrown <neilb@suse.de>
2012-12-05 11:40:28 +11:00
NeilBrown 1d04e27570 Assemble: Don't auto-assemble arrays which conflict with mdadm.conf
When auto-assembling we might find an array which appears in
mdadm.conf.
This can happen if the array (based on UUID) doesn't match what is
in mdadm.conf.
For consistency we should avoid auto-assembling such an array just as
we avoid regular-assembling of the array.


Reported-by: Ross Boylan <ross@biostat.ucsf.edu>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-12-05 11:06:55 +11:00
NeilBrown 262e3b7fe8 Manage: Add support for --re-add faulty
mdadm /dev/mdXX --re-add faulty

will identify any faulty devices in the array, remove them, and
--re-add them.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-28 10:19:52 +11:00
NeilBrown 5fe7f5f7c8 Fix "--remove faulty" and similar commands.
A recent change to improve error messages for subdev management broke
all use cases where device names like %d:%d were used.
Re-arrange the code again so we use dev_open first - which understands
those names - and then only try 'stat' if that failed.
The important thing is to base the 'Cannot find' message on the result
of 'stat', not on the result of 'open'.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-28 10:12:09 +11:00
NeilBrown 66eb2c93a6 Assemble: ensure that <ignore>d arrays are not auto-assembled.
It isn't enough to simply not assemble arrays found to be called
<ignore>, as the final stage of auto-assemble doesn't check for names
in mdadm.conf.

So add a check to Assemble, similar to the check in Incremental()

Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-22 17:04:20 +11:00
NeilBrown 13f2dd6be5 conf: allow multiple arrays to be <ignore>d
We currently complain if mdadm.conf contains multiple
definitions for the same name.  Unfortunately this stops
multiple arrays from being <ignore>d.

So exclude "<ignore>" from the duplicate-names test.

Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-22 16:28:00 +11:00
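
The duplicate-names test with the "<ignore>" exception can be sketched as below. This is a minimal Python illustration; the function name `duplicate_names` and the flat list of names are assumptions, not the structure of the conf parser.

```python
from typing import Iterable, List

IGNORE = "<ignore>"


def duplicate_names(names: Iterable[str]) -> List[str]:
    """Return names that occur more than once, skipping "<ignore>"."""
    seen = set()
    dups = []
    for name in names:
        if name == IGNORE:
            # Any number of arrays may be ignored, so never report this.
            continue
        if name in seen and name not in dups:
            dups.append(name)
        seen.add(name)
    return dups
```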
NeilBrown 639c3c103a Allow --wait to wait for delayed resync.
If a resync is delayed, then e->percent will be negative but not
RESYNC_NONE.  In that case we still want to wait.

Reported-by: Ross Boylan <ross@biostat.ucsf.edu>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-22 08:58:54 +11:00
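
The condition described in the commit above can be written as a one-line predicate. The sketch below is Python rather than mdadm's C, and the numeric sentinel values and the name `RESYNC_DELAYED` are assumptions; only `RESYNC_NONE` is named in the commit message.

```python
# Assumed sentinel values; only the name RESYNC_NONE appears in the commit.
RESYNC_NONE = -1
RESYNC_DELAYED = -2  # hypothetical name for a delayed resync


def should_wait(percent: int) -> bool:
    """Decide whether --wait should keep waiting on this array."""
    if percent >= 0:
        # A non-negative percentage means a resync is in progress.
        return True
    # Negative values are sentinels: wait for anything except "no resync".
    return percent != RESYNC_NONE
```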
NeilBrown ae0dcfbdb2 Grow: fix bug when multiple arrays present.
commit 1f9b0e2845
    Grow - be careful about 'delayed' reshapes.

Introduced a bug where a list of devices longer than 1
would cause an infinite loop.  Oops.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-22 08:57:25 +11:00
NeilBrown 04463b2f58 Makefile: remove "sh" from instructions for running 'test'.
'test' is really a bash script more than an 'sh' script, so
don't say "run 'sh ./test'", just say "run './test'".

Reported-by: Gilles Espinasse <g.esp@free.fr>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-20 12:15:11 +11:00
Lukasz Dorau f3f09a520f Grow.c: fix uninitialized variables compilation-time error
It fixes the following uninitialized variables compilation-time error:
WARN  - Grow.c: In function ‘reshape_array’:
WARN  - Grow.c:2413:21: error: ‘min_space_after’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
WARN  - Grow.c:2376:39: note: ‘min_space_after’ was declared here
WARN  - Grow.c:2414:22: error: ‘min_space_before’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
WARN  - Grow.c:2376:21: note: ‘min_space_before’ was declared here
WARN  - cc1: all warnings being treated as errors
WARN  - make: *** [Grow.o] Error 1
It occurs during compilation of mdadm on Fedora 17.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-11-20 12:12:03 +11:00