These both have the same value, and have done since the
'devnm' concept was introduced.
So discard the pointless duplicate.
Signed-off-by: NeilBrown <neilb@suse.de>
This extends nodes option for assemble mode, make the num of
cluster node could be change by user.
Before that, it is necessary to ensure there are enough space
for those nodes, calc_bitmap_size is introduced to calculate
the bitmap size of each node.
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
To support change the cluster name, the commit do the followings:
1. extend original write_bitmap function for new scenario.
2. add the scenarion to handle the modification of cluster's name
in write_bitmap1.
3. let the cluster name also show in examine_super1 and detail_super1
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
A clustered disk is added by the traditional --add sequence.
However, other nodes need to acknowledge that they can "see"
the device. This is done by --cluster-confirm:
--cluster-confirm SLOTNUM:/dev/whatever (if disk is found)
or
--cluster-confirm SLOTNUM:missing (if disk is not found)
The node initiating the --add, has the disk state tagged with
MD_DISK_CLUSTER_ADD and the one confirming tag the disk with
MD_DISK_CANDIDATE.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
The home-cluster is stored in the bitmap super block of the
array. The device can be assembled on a cluster with the
cluster name same as the one recorded in the bitmap.
If home-cluster is not specified, this is auto-detected using
dlopen corosync cmap library.
neilb: allow code to compile when corosync-devel is not installed.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Specifies the maximum number of nodes in the cluster that may use
this device simultaneously. This is equivalent to the number of
bitmaps created in the internal superblock (patches to follow).
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
It is best to keep strings all together so that they
are easier to search for in the source code.
If a string is so long that it looks ugly one line,
them maybe it should be broken into multiple lines
for display too.
Only strings which contain a newline can be broken
into multiple lines:
"It is OK to\n"
"break this string\n"
Signed-off-by: NeilBrown <neilb@suse.de>
make dprintf() print program name and __func__, so that
this messaging is consistent.
Also remove all __func__ messages from pr_err(). We shouldn't
leak that internal data in error message.
If we really want function name there, we new pr_XXX might
be wanted.
Signed-off-by: NeilBrown <neilb@suse.de>
Sometimes mdadm prints messages with wrong name "mdmon",
and vice versa.
This patch solves this problem by changing method of determining
process name.
Now "Name" will be set in const at start of a program,
previously was hardcoded as #define.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
--incremental currently fails if the device name passed does not
textually match the names permitted by the DEVICE line in mdadm.conf.
This is problematic when "mdadm -I" is run by udev as the name given
can be a temp name.
This patch makes two improvements:
1/ We generate a list of all existing devices that match the names
in mdadm.conf, and allow rdev based matching
2/ We allows extra aliases to be provided on the command line, and
perform textual matching on those. This is particularly suitable
for udev usages as ${DEVLINKS} can be provided even though the links
make not yet be created.
Signed-off-by: NeilBrown <neilb@suse.de>
--add-spare is like --add, but a --re-add is never attempted.
So it is equivalent to two separate commands:
--zero-metadata
--add
Signed-off-by: NeilBrown <neilb@suse.de>
We really need to make sure assemble_container_content()
gets called to finished the assembly of these.
Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Current "mdadm --run /dev/mdX" will not handle external metadata
properly. mdmon won't be started etc.
So use the code from "mdadm -IRs" instead - that already does all
the right things.
Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Telling systemd that mdadm was started from the initrd
is often a lie and never necessary. Now that the reshape monitoring
thread handles SIGTERM gracefully it is OK for system to kill
and mdadm that it finds running.
mdmon still have a bit of a question mark over it so I won't remove
the '@' from there just yet.
Signed-off-by: NeilBrown <neilb@suse.de>
This is useful for reshaping a RAID0 to a higher level.
The recovery will happen at the same time as the reshape.
Signed-off-by: NeilBrown <neilb@suse.de>
This allows the smooth conversion of legacy 0.90 arrays
to 1.0 metadata.
Old metadata is likely to remain but will be ignored.
It can be removed with
mdadm --zero-superblock --metadata=0.90 /dev/whatever
Signed-off-by: NeilBrown <neilb@suse.de>
This allows the metadata on a device to be saved and later restored.
This can be useful before experimenting on an array that is misbehaving.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
misc_scan assumes that any device name found in the 'mapping' file
is usable. Usually it is but sometimes not, such as for inactive
devices.
Depending on it isn't really robust, when a name is found, check that
it exists. If not, fall back on map_dev.
This will allow "--detail --scan" to notice inactive devices.
Signed-off-by: NeilBrown <neilb@suse.de>
Here, "large" means components are 100G or more. It is
usually beneficial to have write-intent bitmaps on such arrays.
They can be suppressed with --bitmap=none
Signed-off-by: NeilBrown <neilb@suse.de>
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.
So get rid of devnum (a number) and use devnm (a 32char string).
eg.
md0
md_d2
md_home
Signed-off-by: NeilBrown <neilb@suse.de>
We still allow --offroot to be given - for compatibility with scripts
- but ignore it.
The whole point of --offroot is to get systemd to not auto-kill mdmon,
and we always want that.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
--replace can be used to replace a device without completely failing
it. Once the replacement completes the device will be failed.
--with can indicate which of several spares to use.
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm --create /dev/md0 .... /dev/sda1:1024 /dev/sdb1:2048 ...
The size is in K unless a suffix: K M G is given.
The suffix 's' means sectors.
Signed-off-by: NeilBrown <neilb@suse.de>
This can be used to over-ride the automatic assignment of
data offset.
For --create, it is useful to re-create old arrays where different
defaults applied.
For --grow it may be able to force a reshape in the reverse direction.
Signed-off-by: NeilBrown <neilb@suse.de>
We will shortly introduce --data-offset= which is allowed to
be zero. We will want to use parse_size() so it needs to be
able to return '0' without it being an error.
So define INVALID_SECTORS to be an impossible value (currently '1')
and return and test for it consistently.
Signed-off-by: NeilBrown <neilb@suse.de>
--update=bbl will add a bad block list to each device.
--update=no-bblk will remove the bad block list providing that it
is empty.
Signed-off-by: NeilBrown <neilb@suse.de>
Both are impossible, and '1' allows size to be unsigned,
which is neater.
Also #define MAX_SIZE to be '1' to make it all more explicit.
Signed-off-by: NeilBrown <neilb@suse.de>
The value of 'verbose' is sometimes mixed into 'brief', particularly
for Examine.
This is messy and confusing. So keep them separate.
'brief' still gets assumed when 'scan' is set, unless we are very
verbose.
Signed-off-by: NeilBrown <neilb@suse.de>
If we change some functions to accept 'verbose', where <0 means to be
quiet, in place of 'quiet', then we will be able to merge
'quiet' and 'verbose' together for simplicity.
Signed-off-by: NeilBrown <neilb@suse.de>
Rather than passing a long list of little flags etc to various
functions we will soon pass a small collection of structures.
This first step combines a collection of variables local to
'main' into a single structure.
Signed-off-by: NeilBrown <neilb@suse.de>
malloc should never fail, and if it does it is unlikely
that anything else useful can be done. Best approach is to
abort and let some super-daemon restart.
So define xmalloc, xcalloc, xrealloc, xstrdup which don't
fail but just print a message and exit. Then use those
removing all the tests for failure.
Also replace all "malloc;memset" sequences with 'xcalloc'.
Signed-off-by: NeilBrown <neilb@suse.de>
As we don't allow '-K' for '--zero-super' there is no point
using it internally. Just define a 'KillOpt' like with
other options.
Signed-off-by: NeilBrown <neilb@suse.de>
Both --detail and --monitor can report the names of member
devices on an array, and do so by searching /dev and finding
the shortest name that matches.
If
--prefer=foo
is given, they will instead prefer a name that contain /foo/.
So
mdadm --detail /dev/md0 --prefer=by-path
will list the component devices via their /dev/disk/by-path/xxx
names.
Signed-off-by: NeilBrown <neilb@suse.de>
It isn't sufficient to use '0' for 'error' as well will
later have fields that can validly be '0'.
So return "-1" on error.
Also fix parsing of --bitmap_check so that '0' is treated
as an error: we don't support 512B anyway.
Signed-off-by: NeilBrown <neilb@suse.de>
When --offroot is specified, mdadm will change the first character of
argv[0] to '@'. This is used to signal to systemd that mdadm was
launched from initramfs and should not be shut down before returning
to the initramfs.
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Providing --no-sharing is given, monitoring should be permitted.
Unfortunately the kernel current rejects GET_DISK_INFO and
GET_ARRAY_INFO for non-privileged users which is unjustified.
The info is available in sysfs and we could get it from there,
but for now, require the kernel to be fixed.
Reported-by: Raphael Hertzog <hertzog@debian.org>
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm allowes to assemble 2 volumes with the same names based on the
config file. The issue is fixed by iterating over the list of md device
identifiers and comparing the names of md devices against each other,
detecting identical names and blocking the assembly should the same names
be found.
Now having detected duplicate names, mdadm terminates without assembling
the container, displaying appropriate prompt.
Signed-off-by: Lukasz Orlowski <lukasz.orlowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Extracting the 'subarray' arg for these options was being done at the
wrong place which lead to the code being a bit confusing and looking
wrong.
So reformat that code a bit better and move the extraction of
'subarray' down to the main parsing of these options rather than the
mode setting.
Signed-off-by: NeilBrown <neilb@suse.de>
To allow for reshape continuation '--continue' option is added
to grow command.
Function that will be executed in grow-continue case doesn't require
information about reshape geometry. All required information are read
from metadata.
For external metadata reshape can be run for monitored array/container
only. In case when array/container is not monitored run mdmon for it.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
During initrd phase continuing reshape will cause file system context
lost. This blocks ability to control reshape using checkpoints.
To avoid this, during initrd phase assemble has to be executed with
'--freeze-reshape' option. This causes that mdadm restores reshape
critical section only.
Reshape can be continued later after system full boot.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Moving spares is not the purpose of '--oneshot' and without
currently --oneshot doesn't work as the check against multiple
spare-sharers prevents it.
So make --oneshot imply --no-sharing. This is sensible and allows
--oneshot to work again.
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Resolves-Debian-Bug: 641886
Signed-off-by: NeilBrown <neilb@suse.de>
0.90 arrays can only use up to 4TB per device. So when a larger
device is added, complain a bit. Still allow it if --force is given
as there could be a valid use.
Signed-off-by: NeilBrown <neilb@suse.de>
When an array is resized to have larger members, --assume-clean will
disable any resync if the kernel supports it (2.6.40 and later).
Signed-off-by: NeilBrown <neilb@suse.de>
RAID0 has accepted chunksizes that are not a power of 2 since 2.6.30.
So it time mdadm allowed that to be used.
Signed-off-by: NeilBrown <neilb@suse.de>
Options like --stop must come before the device that is being
stopped. If (in --misc mode) a device does not have an option,
nothing will be done to it, which can be confusing.
So report an error in this case.
Signed-off-by: NeilBrown <neilb@suse.de>
As homehost defaults to the system name it is not possible to specify
a NULL homehost.
This patch restored this ability with either --homehost="" or
--homehost="<none>".
This allows the creation of v1.x arrays without a "hostname:"
prefix in the name.
Signed-off-by: NeilBrown <neilb@suse.de>
This was #if-ed out for 3.0, but it really should go.
Gcc 4.6.0 complains that auto_update_home is set but not used
(which is true).
Reported-by: Tobias Powalowski <t.powa@gmx.de>
When we allowed a devlist to accompany some --grow modes - but not
--bitmap - we made --bitmap always fail, in stead of fail of a device
was given to add.
As 'devs_found' includes the md device, we need to compare against
'1'.
Signed-off-by: NeilBrown <neilb@suse.de>
This is done via conversion to RAID4 and back.
To grow the array, extra devices will be needed which cannot
already be present as spares - so allow a list of new devices
to be included in grow request which changed the number of devices.
Signed-off-by: NeilBrown <neilb@suse.de>
Neil,
Please consider this patch that once was discussed and I think agreed with in general direction. It was sent a while ago
but somehow did not merged into your devel3-2. This patch enables hot-plug of so called bare devices (as understand by domain policies rules in mdadm.conf).
Without this patch we do NOT serve hot-plug of bare devices at all.
Thanks,
Marcin Labun
Subject was: FW: Autorebuild, new dynamic udev rules for hot-plugs
>>From c0aecd4dd96691e8bfa6f2dc187261ec8bb2c5a2 Mon Sep 17 00:00:00 2001
From: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Date: Thu, 23 Dec 2010 16:35:01 +0100
Subject: [PATCH] Dynamic hot-plug udev rules for policies
Cc: linux-raid@vger.kernel.org, Williams, Dan J <dan.j.williams@intel.com>, Ciechanowski, Ed <ed.ciechanowski@intel.com>
When introducing policies, new hot-plug rules were added to support
bare disks. Mdadm was started for each hot plugged block device
to determine if it could be used as spare or as a replacement member for
degraded array.
This patch introduces limitation of range of devices that are handled
by mdadm.
It limits them to the ones specified in domains associated with
the actions: spare-same-port, spare and spare-force.
In order to enable hot-plug for bare disks one must update udev rules
with command
mdadm --activate-domains[=filename]
Above command writes udev rule configuration to stdout. If 'filename'
is given output is written to the file provided as parameter. It is up
to system administrator what should be done later. To make such rule
permanent (i.e. remain after reboot) rule should be writen to
/lib/udev/rules.d directory. Other cases will just need to write it to
/dev/.udev/rules.d directory where temporary rules lies. One should be
aware of the meaning of names/priorities of the udev rules.
After mdadm.conf is changed one is obliged to re-run
"mdadm --activate-domains" command in order to bring the system
configuration up to date.
All hot-plugged disks containing metadata are still handled by existing
rules.
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This is useful with 1.1 and 1.2 metadata to update the metadata if
the device size has changed.
The same functionality can be achieved by writing to the device size
in sysfs after re-adding normally, but in some cases this might be
easier.
Signed-off-by: NeilBrown <neilb@suse.de>
Growing an array when there aren't enough spares can make the array
degraded. This works but might not be what is wanted.
So warn the user in this case and require a --force to go ahead
with the reshape.
Signed-off-by: NeilBrown <neilb@suse.de>
Check on upper limit of number of devices was in the wrong place.
Result was could not create array with more than 27 devices without
explicitly setting metadata, even though default metadata allows more.
Fixed, and also perform check when growing an array.
Signed-off-by: NeilBrown <neilb@suse.de>
Though not having the proper backup file can cause data corruption, it
is not enough to justify not being able to start the array at all.
So allow "--invalid-backup" to be specified which says "just continue
even if a backup cannot be restored".
Signed-off-by: NeilBrown <neilb@suse.de>