This call to validate_geometry is really rather gratuitous.
It is purely about the fact that super0 cannot use more than 4TB.
So just make it an explicit test - less confusing that way.
With this, validate_geometry is only called from Create, which
makes it easier to reason about.
Also validate_geometry is now never passed NULL for the 'chunk'
parameter, so we can remove those annoying tests for NULL.
Signed-off-by: NeilBrown <neilb@suse.de>
We can only revert a reshape if the reshape_position aligns
properly for the old geometry.
If it doesn't we just fail for now.
Also fix a +/- error with updating raid_disks for super1.c
Signed-off-by: NeilBrown <neilb@suse.de>
This allows the smooth conversion of legacy 0.90 arrays
to 1.0 metadata.
Old metadata is likely to remain but will be ignored.
It can be removed with
mdadm --zero-superblock --metadata=0.90 /dev/whatever
Signed-off-by: NeilBrown <neilb@suse.de>
This allows the metadata on a device to be saved and later restored.
This can be useful before experimenting on an array that is misbehaving.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.
So get rid of devnum (a number) and use devnm (a 32char string).
eg.
md0
md_d2
md_home
Signed-off-by: NeilBrown <neilb@suse.de>
Commit 1e2b276535 (Report error in --update
string is not recognised) broke homehost updating functionality because it
depended on each string comparison being done even after we already found
a match. Make it work again by restructuring code.
Reported-by: (and original version by) Justin Maggard <jmaggard10@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Now that we use O_DIRECT for all device IO, BLKFLSBUF is not needed to
ensure we get current data, and it can impose a cost if any flush-out
is needed. So remove it.
To be safe, add O_DIRECT to one place where it isn't currently used:
when reading a bitmap.
Signed-off-by: NeilBrown <neilb@suse.de>
As 'info->size' is signed, it cannot even hold values above
2TB.
But it isn't used much. sb->size is the important value and it
is unsigned.
So use that to check for overflow of size.
Reported-by: Eugene San <eugenesan@gmail.com>
And here's another trivial bugfix, now for spelling mistakes in various
places, authred by Sergey Kirpichev (Cc'ed) and carried in debian mdadm
package.
Signed-off-by: NeilBrown <neilb@suse.de>
When --uuid is specified in the command line, even for v0.90
superblock we override last portion of uuid with data from
--homehost, which is wrong (and disagrees with the manpage).
Only use homehost in super0 if no uuid is specified.
Signed-off-By: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: NeilBrown <neilb@suse.de>
mdadm --create /dev/md0 .... /dev/sda1:1024 /dev/sdb1:2048 ...
The size is in K unless a suffix: K M G is given.
The suffix 's' means sectors.
Signed-off-by: NeilBrown <neilb@suse.de>
This is currently only useful for 1.x metadata and will allow an
explicit --data-offset request on command line.
Signed-off-by: NeilBrown <neilb@suse.de>
If we change some functions to accept 'verbose', where <0 means to be
quiet, in place of 'quiet', then we will be able to merge
'quiet' and 'verbose' together for simplicity.
Signed-off-by: NeilBrown <neilb@suse.de>
malloc should never fail, and if it does it is unlikely
that anything else useful can be done. Best approach is to
abort and let some super-daemon restart.
So define xmalloc, xcalloc, xrealloc, xstrdup which don't
fail but just print a message and exit. Then use those
removing all the tests for failure.
Also replace all "malloc;memset" sequences with 'xcalloc'.
Signed-off-by: NeilBrown <neilb@suse.de>
RAID10 arrays with an odd number of devices had the arraysize
reported wrongly by --examine due to a rounding error.
Reported-by: Chris Francy <zoredache@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This makes super[01].c properly align buffers used for the bitmap
using posix_memalign() to make sure the writes don't fail in case the
bitmap is opened using O_DIRECT.
This is based on https://bugzilla.redhat.com/show_bug.cgi?id=789898
and an initial patch by Alexander Murashkin.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adding a bitmap via ioctl can only add it at a fixed location.
That location is not suitable for 4K-block devices.
So allow setting the bitmap location via sysfs if kernel supports it
and aim to always use 4K alignments.
Signed-off-by: NeilBrown <neilb@suse.de>
We need to force multiplication to use ULL before they
get to big, else it overflows. So move the "2ULL" to the start.
Signed-off-by: NeilBrown <neilb@suse.de>
When container is assembled while reshape is active on one of its member
whole container can be required to be blocked from monitoring.
For such purpose field recovery blocked is added to mdinfo structure.
When metadata handler finds active reshape in container it should set
recovery_blocked field to disable whole container monitoring during
reshape.
For arrays that doesn't use containers, recovery_blocked field
has the same value as reshape_active field e.g. super0/1.
In fact,recovery is blocked during reshape for such arrays.
For ddf, metadata handler doesn't set reshape_active field,
so recovery_blocked is not set also.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If you create a two drive raid1 array with one device writemostly, then
fail the readwrite drive, when you add a new device, it will get the
writemostly bit copied out of the remaining device's superblock into
it's own. You can then remove the new drive and readd it as readwrite,
which will work for the readd, but it leaves the stale WriteMostly1 bit
in devflags resulting in the device going back to writemostly on the
next assembly.
The fix is to make sure that A) when we readd a device and we might have
filled the st->sb info from a running device instead of the device being
readded, then clear/set the WriteMostly1 bit in the super1 struct in
addition to setting the disk state (ditto for super0, but slightly
different mechanism) and B) when adding a clean device to an array (when
we most certainly did copy the superblock info from an existing device),
then clear any writemostly bits.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
A 0.90 array can use at most 4TB of each device - 2TB between
2.6.39 and 3.1 due to a kernel bug.
The test for this in validate_super0 is very wrong. 'size' is sectors
and the number it is compared against is just confusing.
So fix it all up and correct the spelling of terabytes and remove
a second redundant test on 'size'.
Signed-off-by: NeilBrown <neilb@suse.de>
Some code currently clears 'info' before calling getinfo_super,
some code doesn't.
To be consistent, change it so no caller ever clears 'info',
but ever getinfo_super function must clear it.
Note that ->raid_disk may be meaningful if that 'map' is passed
non-NULL. In that case it is copied out before the structure
is zeroed.
Signed-off-by: NeilBrown <neilb@suse.de>
As homehost defaults to the system name it is not possible to specify
a NULL homehost.
This patch restored this ability with either --homehost="" or
--homehost="<none>".
This allows the creation of v1.x arrays without a "hostname:"
prefix in the name.
Signed-off-by: NeilBrown <neilb@suse.de>
the new code for defaulting chunksizes didn't work quite right
- default was set to late in super1/super0/ddf
- defaults would over-ride values of '0' imposed by some levels
- default value wasn't applied to size properly.
Signed-off-by: NeilBrown <neilb@suse.de>
When chunk size is not set from command line we need to guess it
depending on metadata given on command line or found on listed devices.
Validate_geometry sets the default for it's metadata if chunk is not set.
For external metadata chunk is set only when creating in a container.
For imsm validate_geometry_imsm_orom is responsible for finding default
chunk depending on container metadata loaded. Container will already know
which controller it is attached to, and have this controllers orom
available.
do_default_chunk indicates that we need to find default chunk and
if validate_geometry fails for some metadata it tells us to reset chunk
that may have been set.
Current solution would set default chunk correctly for imsm only if
container device was given on command line. With the list of devices
chunk was always set to 512.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When we force-assemble an array which is in the middle of a reshape,
we should repeat the reshape of any parts that aren't recorded in
the oldest superblock.
This is unlikely to make a significant difference, but could make
a small difference, and is safer.
Signed-off-by: NeilBrown <neilb@suse.de>
We previously closed all 'fds' associated with an array in
write_init_super .. sometimes, and sometimes at bad times.
This isn't neat and free_super is a better place to close them.
So make sure free_super always closes the fds that the metadata
manager kept hold of, and stop closing them in write_init_super.
Also add a few more calls to free_super to make sure they really do
get closed.
Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
It makes more sense to test for container_dev than for subarray
for several places in Create where it then uses container_dev.
This allows us to subsequently remove subarray.
Signed-off-by: NeilBrown <neilb@suse.de>
This allows the info for a single array to be extracted,
so we don't have to write it into st->subarray.
For consistency, implement container_content for super0 and super1,
to just return the mdinfo for the single array.
Signed-off-by: NeilBrown <neilb@suse.de>
To accurately detect when an array has been split and is now being
recombined, we need to track which other devices each thinks is
working.
We should never include a device in an array if it thinks that the
primary device has failed.
This patch just allows get_info_super to return a list of devices
and whether they are thought to be working or not.
Signed-off-by: NeilBrown <neilb@suse.de>
If an --update is requested by the relevant metadata doesn't
understand it, print a useful message rather than silently ignoring
the issue.
Signed-off-by: NeilBrown <neilb@suse.de>
This number isn't meaningful for RAID0 as a different amount of space
might be used from each device.
It isn't meaningful for linear either, but already was not reported
for linear.
Detail doesn't report it either.
So make --examine not report it.
Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: Mario 'BitKoenig' Holbe <Mario.Holbe@TU-Ilmenau.DE>
The 4K superblock can be as close as 64K from the end
of the device. As the bitmap (with header) lives after
the superblock (with 0.90 metadata) there could be as
little as 60K of space.
So limit the bitmaps to 59.5K, and only write 60K including
the header.
The bug fixed here means that bitmaps cannot be created
on devices which are exact multiples of 64K in size
Signed-off-by: NeilBrown <neilb@suse.de>