When checking if the new chunk size fits in the component size
we were confusing sectors and K, and so getting it wrong.
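A minimal sketch of the unit conversion involved, assuming the component
size is held in 512-byte sectors and the chunk size in KiB. The helper
name is hypothetical and is not taken from the mdadm source.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: component_sectors is in 512-byte sectors,
 * chunk_kib is in KiB.  Convert to a common unit before comparing.
 */
static bool chunk_fits_component(unsigned long long component_sectors,
                                 unsigned long chunk_kib)
{
    /* 1 KiB == 2 sectors of 512 bytes */
    unsigned long long chunk_sectors = (unsigned long long)chunk_kib * 2;

    return chunk_sectors <= component_sectors;
}
```

Comparing the raw numbers would accept a 512K chunk on a 1023-sector
component, which is too small by one sector.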
Signed-off-by: NeilBrown <neilb@suse.de>
Code to check partition tables used some needless casts
and was broken, using a u8 when a u32 was wanted.
So create structures describing the tables rather than using offsets,
and read into those tables instead.
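As an illustration of the approach (not the actual mdadm code), a
structure for a classic MBR sector might look like the sketch below.
The struct, field and function names are assumptions; the on-disk
layout (446 bytes of boot code, four 16-byte records, 0xAA55 signature)
is the standard MBR layout. The packed attribute is GCC/Clang specific.

```c
#include <stdint.h>
#include <string.h>

#define MBR_SIGNATURE  0xAA55u
#define MBR_PARTITIONS 4

/* One 16-byte partition record; multi-byte fields are little-endian. */
struct mbr_part_record {
    uint8_t  bootable;
    uint8_t  first_chs[3];
    uint8_t  part_type;
    uint8_t  last_chs[3];
    uint32_t first_sector_lba;
    uint32_t sector_count;
} __attribute__((packed));

/* The whole 512-byte sector. */
struct mbr_sector {
    uint8_t  boot_code[446];
    struct mbr_part_record parts[MBR_PARTITIONS];
    uint16_t magic;
} __attribute__((packed));

/* Decode little-endian values byte by byte, so host order is irrelevant. */
static uint16_t le16_to_host(uint16_t v)
{
    const uint8_t *b = (const uint8_t *)&v;
    return (uint16_t)(b[0] | (b[1] << 8));
}

static uint32_t le32_to_host(uint32_t v)
{
    const uint8_t *b = (const uint8_t *)&v;
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Return the sector just past the last partition, or 0 if not an MBR. */
static uint64_t mbr_last_used_sector(const unsigned char raw[512])
{
    struct mbr_sector mbr;
    uint64_t end = 0;
    int i;

    memcpy(&mbr, raw, sizeof(mbr));
    if (le16_to_host(mbr.magic) != MBR_SIGNATURE)
        return 0;

    for (i = 0; i < MBR_PARTITIONS; i++) {
        /* Full 32-bit reads: a u8 here would drop the upper bytes. */
        uint64_t first = le32_to_host(mbr.parts[i].first_sector_lba);
        uint64_t count = le32_to_host(mbr.parts[i].sector_count);

        if (mbr.parts[i].part_type != 0 && first + count > end)
            end = first + count;
    }
    return end;
}
```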
Signed-off-by: NeilBrown <neilb@suse.de>
This reverts commit 9ef5dbff4a as it is
duplicating the check that is done internal to imsm_read_serial().
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
For 1.1 and 1.2 metadata where data_offset is not zero,
it is important to align the data_offset to underlying
block size.
We don't currently have access to the particular device
in avail_size so just try to force to a 1Meg boundary.
Also default 1.x metadata to 1.2 as documented.
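The 1Meg alignment described above can be sketched as follows. This
assumes the offset is counted in 512-byte sectors and that it is
rounded up rather than down; both the helper name and the rounding
direction are assumptions, not a statement of what avail_size does.

```c
/* 1 MiB expressed in 512-byte sectors. */
#define ONE_MIB_IN_SECTORS 2048ULL

/* Round offset up to the next multiple of align (align must be > 0). */
static unsigned long long round_up_sectors(unsigned long long offset,
                                           unsigned long long align)
{
    return (offset + align - 1) / align * align;
}
```

An offset that is already aligned is returned unchanged, so applying
the helper twice is harmless.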
Signed-off-by: NeilBrown <neilb@suse.de>
If /dev is static, a name may remain there after the
device has been detached from the system.
Using 'mdadm' to remove such a device from the array
should still work (even though "mdadm --remove detached"
might be preferred).
So when processing a device for '-r', don't insist on
being able to open the device.
Signed-off-by: NeilBrown <neilb@suse.de>
If mdmon is run without "--takeover", it should still
check both *_RUN directories for a valid 'pid' file
to guard against having two mdmons running at the same time.
Suggested-by: Luca Berra <bluca@comedia.it>
Signed-off-by: NeilBrown <neilb@suse.de>
Check that either VAR_DIR or ALT_DIR is actually writable before
selecting it.
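A rough sketch of such a check using access(2). The function name is
hypothetical, and the two arguments stand in for VAR_DIR and ALT_DIR.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Return the first of the two directories that is writable, or NULL if
 * neither is.  Note that access() tests with the real uid, not the
 * effective one.
 */
static const char *pick_writable_dir(const char *preferred,
                                     const char *fallback)
{
    if (access(preferred, W_OK) == 0)
        return preferred;
    if (access(fallback, W_OK) == 0)
        return fallback;
    return NULL;
}
```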
Signed-off-by: Luca Berra <bluca@comedia.it>
Signed-off-by: NeilBrown <neilb@suse.de>
- when we waited for the old mdmon to exit, we didn't look
for the socket in the right place
- when we failed to find a pid file, we returned the wrong
value (code expected <0, but got ==0).
Signed-off-by: Luca Berra <bluca@comedia.it>
Signed-off-by: NeilBrown <neilb@suse.de>
Running mdmon --takeover --all resulted in a segfault.
Also allow "--all" to be longer so that e.g. md_d127 can
successfully over-write it in the argument list visible
to 'ps'.
Signed-off-by: Luca Berra <bluca@comedia.it>
Signed-off-by: NeilBrown <neilb@suse.de>
When building mdadm.O2, set _FORTIFY_SOURCE to get more
warnings, and also build mdmon.O2 to find warnings in that
code too.
Then fix the warnings.
Suggested-by: Luca Berra <bluca@comedia.it>
Signed-off-by: NeilBrown <neilb@suse.de>
If we find we cannot add the requested bitmap file when
assembling the array, then make sure to clean up properly
and don't leave a half-configured array.
Signed-off-by: NeilBrown <neilb@suse.de>
Change default - again, sorry.
v1.1 causes problems as it conflicts with a boot sector, whether on
the whole device or in a partition. I guess the rule is that you
aren't allowed to use the first sector in general.
v1.2 is mostly just as good and avoids boot-sector problems so default
to that instead.
Signed-off-by: NeilBrown <neilb@suse.de>
The backup file has a timestamp which is updated quite separately
from the metadata timestamp. They should be largely in-sync but
sometimes are not.
So be more generous in the check, and allow it to be over-ridden
by an environment variable.
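One way such a relaxed check could look. This is only a sketch: the
environment variable name EXAMPLE_ALLOW_OLD_BACKUP, the function name
and the caller-supplied slack are all placeholders, not the names or
values used by mdadm.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Placeholder name, chosen for this example only. */
#define EXAMPLE_OVERRIDE_ENV "EXAMPLE_ALLOW_OLD_BACKUP"

/*
 * Accept the backup if its timestamp is within slack_seconds of the
 * metadata timestamp, or if the override variable is set at all.
 */
static bool backup_time_acceptable(time_t backup_time, time_t metadata_time,
                                   time_t slack_seconds)
{
    time_t diff = backup_time > metadata_time
                      ? backup_time - metadata_time
                      : metadata_time - backup_time;

    if (diff <= slack_seconds)
        return true;
    return getenv(EXAMPLE_OVERRIDE_ENV) != NULL;
}
```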
Signed-off-by: NeilBrown <neilb@suse.de>
If --assemble is given a container and some other devices to assemble
an array from, it complains with an error because that doesn't make
sense.
However it currently also complains if the list of devices was extracted
from the config file rather than being given on the command line.
That is not appropriate.
So add an '&& inargv' test to ensure that we are really complaining
about the right thing.
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Some devices have 4K sector sizes. So try to align the bitmap
on a 4K boundary.
This isn't possible when hot-adding a bitmap, but will be once
we add support for configuring the bitmap via sysfs.
Signed-off-by: NeilBrown <neilb@suse.de>
Currently "mdadm -As" will process the entries in the config
file in order. If any array is a component or member of a preceding
array, that array will not be assembled.
So if there are any failures during assembly, retry those arrays,
and loop until everything is assembled, or nothing more can
be assembled.
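The retry logic can be sketched like this. The structure and the
try_assemble() stand-in are invented for the example; real assembly is
of course far more involved than checking a single dependency.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a config-file entry. */
struct example_array {
    int  needs;      /* index of an array that must exist first, or -1 */
    bool assembled;
};

/* Stand-in for a real assembly attempt: fails while the dependency is missing. */
static bool try_assemble(struct example_array *arrays, size_t i)
{
    int dep = arrays[i].needs;

    if (dep >= 0 && !arrays[dep].assembled)
        return false;
    arrays[i].assembled = true;
    return true;
}

/* Keep making passes until everything is assembled or a pass makes no progress. */
static size_t assemble_all(struct example_array *arrays, size_t n)
{
    size_t total = 0;
    bool progress;

    do {
        progress = false;
        for (size_t i = 0; i < n; i++) {
            if (arrays[i].assembled)
                continue;
            if (try_assemble(arrays, i)) {
                total++;
                progress = true;
            }
        }
    } while (progress && total < n);

    return total;
}
```

Stopping when a pass makes no progress guarantees termination even if
some array can never be assembled.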
Signed-off-by: NeilBrown <neilb@suse.de>
Now that we don't "mdmon --takeover" until /var/run is writable
there is no need to continually try to create files in there.
So only create these files at startup and fail if they cannot be
made. This means that to start an array with externally managed
metadata, either /var/run or ALT_RUN (e.g. /lib/init/rw) must be
writable. To 'takeover' from a previous mdmon instance, /var/run
must be writable.
This means we don't need to worry about SIGHUP (which was once used to
tell us it was time to create .pid) and SIGALRM.
Signed-off-by: NeilBrown <neilb@suse.de>
Monitoring /proc/mounts and creating a .pid file as soon as /var/run
is writable is racy. Most distros clean all non-directories from
/var/run early in boot and if mdmon races with this it could
lose the files as soon as they are created.
Instead require that "mdmon --takeover" be run after /var is writable.
Signed-off-by: NeilBrown <neilb@suse.de>
Using switch-root and then creating files in /var/run once
that location is writable is racy, as most distros clean out
/var/run shortly after it is mounted.
This can cause the .pid and .sock files to be deleted shortly
after they are created.
This option doesn't seem to be used at all any more, and the alternative
of communicating the pid etc in some preserved directory and then
restarting mdmon once all is settled seems simpler.
So remove the code for supporting switchroot and update man page
accordingly.
Signed-off-by: NeilBrown <neilb@suse.de>
/var/run probably doesn't persist from early boot.
So if necessary, store it in /lib/init/rw or somewhere else
that does persist.
Signed-off-by: NeilBrown <neilb@suse.de>
devname2mdmon() can be used just as easily and doing so saves us from
some duplicate code.
Also tidy up some related logic.
Signed-off-by: NeilBrown <neilb@suse.de>
Creating /var/run in mdmon is really not justifiable.
If /var/run doesn't exist, then it is either deliberate and it should
be left that way to make sure the mapfile gets created in /dev, or
it is a configuration error and not our problem to fix.
Signed-off-by: NeilBrown <neilb@suse.de>
The change to get mdmon to re-exec itself from the switchroot
filesystem broke switchroot in various ways. This fixes it.
If the switchroot path is not '/', mdmon will find the pid and
socket for the monitor, chroot to the new root, and exec mdmon
passing the pid in argv[2] and the socket in stdin.
If the switchroot path is actually a number, mdmon will not chroot,
but will kill that pid before taking over the array.
Signed-off-by: NeilBrown <neilb@suse.de>
After we SIGTERM the monitor we need to wait for it to finish up.
Rather than the complexity of waiting for every md array to be clean,
we can simply read from the sock connected to the monitor.
When the monitor dies, we will get EOF. Before then we will block.
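A minimal sketch of the blocking wait, assuming fd is the descriptor
already connected to the monitor. The function name is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Block until the peer closes its end of fd.  Any data that arrives
 * first is discarded.  Returns 0 on EOF, -1 on a read error.
 */
static int wait_for_peer_exit(int fd)
{
    char buf[64];

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));

        if (n == 0)
            return 0;               /* EOF: the other side is gone */
        if (n < 0 && errno != EINTR)
            return -1;              /* real error */
        /* n > 0 or EINTR: keep waiting */
    }
}
```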
Signed-off-by: NeilBrown <neilb@suse.de>
It is redundant because, in each place that it is used, it can only
have one possible value.
Also change the related arg to mdmon() to have a more meaningful
name.
And make mdmon() static.
Signed-off-by: NeilBrown <neilb@suse.de>
1.x metadata allows a device to be a member of the array while it
is still recovering. So it is a working member, but is not
completely in-sync.
mdadm/assemble does not understand this distinction and assumes that a
working member is fully in-sync for the purpose of determining if there
are enough in-sync devices for the array to be functional.
So collect the 'recovery_start' value from the metadata and use it in
assemble when determining how useful a given device is.
Reported-by: Mikael Abrahamsson <swmike@swm.pp.se>
Signed-off-by: NeilBrown <neilb@suse.de>
Suggesting that if your boot loader doesn't support v1.x, then use
--metadata=1.0
doesn't make a lot of sense.
Also, '/' isn't really important; it is /boot that matters.
Hopefully people who have /boot on '/' realise that /boot is
on '/' ....
Signed-off-by: NeilBrown <neilb@suse.de>
Most operations require root access. Rather than ensure we generate
the right error message when something fails because we aren't root,
check early.
Note that "--examine" does not necessarily require root, so test
for that first.
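The early check might be shaped roughly as below. The function name
and message text are illustrative only; a caller would pass geteuid()
as the second argument.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return 0 if the operation may proceed, -1 if it needs root and we
 * are not root.  Examine-only requests are exempt.
 */
static int check_privilege(bool examine_only, uid_t euid)
{
    if (!examine_only && euid != 0) {
        fprintf(stderr, "example: must be super-user to perform this action\n");
        return -1;
    }
    return 0;
}
```

Typical use would be "if (check_privilege(is_examine, geteuid()) < 0)
exit(1);" before any device is opened.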
Resolves-Debian-bug: 396570
Signed-off-by: NeilBrown <neilb@suse.de>
Fix compile warning when size_t is not a long.
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Luca Berra <bluca@vodka.it>
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm should not wait indefinitely when it tries to reset the
superblock on a device which is busy (cannot be opened).
Acked-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
The current locking uses lockf, and is completely broken.
When you hold a lockf lock, any close of any fd on that file will
release the lock. So the map_read() call, which is made as soon as we
get the lock, will immediately drop the lock.
So change to flock locking which isn't so badly designed.
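A small demonstration of the difference, assuming Linux and a local
filesystem. The function name is made up for the example. It takes an
flock() lock, opens and closes a second descriptor on the same file
(the pattern that would drop a lockf lock), and then probes whether the
lock is still held.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Returns 1 if the lock survived, 0 if it was lost, -1 on setup failure. */
static int flock_survives_other_close(const char *path)
{
    int held = -1;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return -1;

    if (flock(fd, LOCK_EX) == 0) {
        /* Open and close an unrelated descriptor on the same file. */
        int other = open(path, O_RDONLY);
        if (other >= 0)
            close(other);

        /*
         * flock() locks belong to the open file description, so a
         * fresh open conflicts with our lock if it is still held.
         */
        int probe = open(path, O_RDONLY);
        if (probe >= 0) {
            if (flock(probe, LOCK_EX | LOCK_NB) == -1 &&
                errno == EWOULDBLOCK)
                held = 1;
            else
                held = 0;
            close(probe);
        }
        flock(fd, LOCK_UN);
    }
    close(fd);
    return held;
}
```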
Signed-off-by: NeilBrown <neilb@suse.de>
We already have a call to 'enough' in Detail which is the check for
"do we have enough devices". We just need to calculate the required
data a bit earlier, then use the same 'enough' call to possibly
print FAILED.
This is motivated by Debian bug 495755.
The other request in that bug is not practical.
It would be very nice if output of `mdadm' is more clear in case of a
broken array.
Currently the only hint you get from `mdadm' that your array is broken
is this:
# mdadm -A /dev/md0 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
mdadm: /dev/md0 assembled from 1 drive and 3 spares - not enough to start the
array.
It could say something like `Your array is broken, you can't use it anymore'
It is not valid to report that array as 'broken' if the user hasn't
listed all the devices, which could be the case here.
Resolves-Debian-Bug: 495755
Signed-off-by: NeilBrown <neilb@suse.de>
If a bitmap exists on an array, then current kernels cannot grow
that array.
So when we try to grow an array, test for EBUSY and if a bitmap is
present, report that the bitmap needs to be removed.
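The decision itself is simple. A sketch, with a hypothetical helper
name and message text:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Given the errno from a failed grow attempt and whether the array has
 * a bitmap, return an extra hint for the user, or NULL if there is
 * nothing specific to add.
 */
static const char *grow_failure_hint(int err, bool has_bitmap)
{
    if (err == EBUSY && has_bitmap)
        return "bitmap must be removed before the array can be grown";
    return NULL;
}
```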
Resolves-Debian-Bug: 534571
Signed-off-by: NeilBrown <neilb@suse.de>
Once load_super has succeeded, it should continue to succeed. However
devices can disappear etc so it is prudent to always check the return
status of load_super.
Signed-off-by: NeilBrown <neilb@suse.de>