2435 Commits

Author SHA1 Message Date
NeilBrown ee2429e0bc Grow/raid10: support reducing the devices in a RAID10.
When reducing the number of devices in a RAID10, we increase the
data offset to avoid the need for backup area.

If there is no room at the end of the device to allow this, we need
to first reduce the component size of each device.  However if there
is room, we don't want to insist on that, otherwise growing then
shrinking the array would not be idempotent.

So find the min before/after space before analysing a RAID10 for
reshape, and if the after space is insufficient, reduce the total size
of the array and the component size accordingly.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown cb19a251a5 super1: reserve at least 2 chunks for reshape headroom.
Sometimes 0.1% isn't enough, though mostly only in testing.

We need one chunk for a successful reshape, so reserve 2.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
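The headroom rule described in the commit above can be sketched as follows. This is an illustrative Python fragment, not mdadm's C code: the helper name `reshape_headroom` and the choice of sectors as the unit are assumptions made for the example.

```python
def reshape_headroom(dev_sectors: int, chunk_sectors: int) -> int:
    """Return the space to reserve for reshape, in sectors (hypothetical helper)."""
    # 0.1% of the device, the proportion mentioned in the commit message.
    proportional = dev_sectors // 1000
    # Never reserve less than two chunks: one is needed for a reshape,
    # the second leaves a margin.
    return max(proportional, 2 * chunk_sectors)
```

On a small test device the two-chunk floor wins; on a large device the 0.1% term does.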
NeilBrown 5e88ab2e2f New RESHAPE_NO_BACKUP flag to track when backup action is needed.
Some arrays (raid10) never need a backup file, so during assembly
we can avoid the whole Grow_continue check in that case.
Achieve this using a flag set by the metadata handler.

Also get "mdadm -I" to fail if a backup process would be
needed.  It currently does fail as the kernel rejects things,
but it is nicer to have this explicit.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown 19ceb16daf Grow: add raid10 reshape.
RAID10 reshape requires that data_offset be changed.
So we only allow it if the new_data_offset attribute is available,
and we compute a suitable change in data offset.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown b48e2e25c4 Split 'GCD' out into a separate function.
It is neater that way.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
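The commit above does not show the function body. A greatest-common-divisor helper is commonly written with Euclid's algorithm, sketched here in Python; whether mdadm's C version uses this exact form is an assumption.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers (Euclid's algorithm)."""
    while b:
        # Replace (a, b) with (b, a mod b) until the remainder is zero.
        a, b = b, a % b
    return a
```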
NeilBrown fe384ca0b9 Grow: set new_data_offset if appropriate 2012-10-04 16:34:21 +10:00
NeilBrown 80bf913592 Add space_before/space_after fields to mdinfo
These will be needed to guide changes to data_offset during reshape.
Only set them for super1 for now.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown 8fe1c44f82 super1: add new_offset field.
The 'new_offset' is used for reshaping to avoid the need
for a backup file.

For now we only report the value when it is set.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown 40c9a66a5c Add --data-offset flag for Create and Grow
This can be used to over-ride the automatic assignment of
data offset.
For --create, it is useful to re-create old arrays where different
   defaults applied.
For --grow it may be able to force a reshape in the reverse direction.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:21 +10:00
NeilBrown 83cd1e97cb Add data_offset arg to ->init_super and use it in super1.c
So if ->data_offset is already set, use that rather than
computing one.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
NeilBrown af4348ddd1 Add data_offset arg to ->validate_geometry.
This is needed to return correct available size.  It isn't
really used yet.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
NeilBrown 387fcd593c Add data_offset arg to ->avail_size
This is currently only useful for 1.x metadata and will allow an
explicit --data-offset request on command line.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
NeilBrown 822e393a05 Allow parse_size to return 0.
We will shortly introduce --data-offset= which is allowed to
be zero.  We will want to use parse_size() so it needs to be
able to return '0' without it being an error.

So define INVALID_SECTORS to be an impossible value (currently '1')
and return and test for it consistently.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
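The sentinel idea in the commit above can be illustrated with a small Python sketch. It is not mdadm's parser: the accepted suffixes and the assumption that a bare number means KiB (so every valid result is an even number of 512-byte sectors, and 1 can never occur) are choices made for this example only.

```python
INVALID_SECTORS = 1  # sentinel value named in the commit message


def parse_size(text: str) -> int:
    """Parse sizes such as '0', '64', '64K', '1M' or '2G' into 512-byte sectors.

    Assumption for this sketch: a bare number is in KiB, so a valid result
    is always even and can never collide with INVALID_SECTORS.
    """
    units = {"K": 2, "M": 2 * 1024, "G": 2 * 1024 * 1024}
    digits = text.rstrip("KMG")
    suffix = text[len(digits):]
    if not (digits.isascii() and digits.isdigit()) or len(suffix) > 1:
        return INVALID_SECTORS
    return int(digits) * units[suffix or "K"]
```

A caller then tests for `INVALID_SECTORS` rather than for zero, which leaves 0 available as a legitimate value.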
NeilBrown 7103b9b88d Handle spaces in array names better.
1/ When printing the "name=" entry for --brief output,
   enclose name in quotes if it contains spaces etc.
   Quotes are already supported for reading mdadm.conf

2/ When a name is used as a device name, translate spaces
   and tabs to '_', as well as the current translation of
   '/' to '-'.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:20 +10:00
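The two rules in the commit above can be sketched in Python. The function names are hypothetical, and because the message only says "spaces etc.", this sketch quotes on spaces and tabs and nothing else.

```python
def brief_name(name: str) -> str:
    """Format a name= entry, quoting the name when it contains white space."""
    if any(ch in " \t" for ch in name):
        return f'name="{name}"'
    return f"name={name}"


def device_name(name: str) -> str:
    """Make an array name usable as a device name.

    Spaces and tabs become '_', and '/' becomes '-'.
    """
    return name.translate(str.maketrans({" ": "_", "\t": "_", "/": "-"}))
```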
Maciej Naruszewicz 9eafa1de73 imsm: Allow to specify controller for --detail-platform.
Usually, 'mdadm --detail-platform -e imsm' scans all the controllers
looking for IMSM capabilities. This patch provides the possibility
to specify a controller to scan, enabling custom usage by other
processes - especially with the --export switch.

$ mdadm --detail-platform
       Platform : Intel(R) Matrix Storage Manager
        Version : 9.5.0.1037
    RAID Levels : raid0 raid1 raid10 raid5
    Chunk Sizes : 4k 8k 16k 32k 64k 128k
    2TB volumes : supported
      2TB disks : not supported
      Max Disks : 7
    Max Volumes : 2 per array, 4 per controller
 I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA)

$ mdadm --detail-platform /sys/devices/pci0000:00/0000:00:1f.2
       Platform : Intel(R) Matrix Storage Manager
        Version : 9.5.0.1037
    RAID Levels : raid0 raid1 raid10 raid5
    Chunk Sizes : 4k 8k 16k 32k 64k 128k
    2TB volumes : supported
      2TB disks : not supported
      Max Disks : 7
    Max Volumes : 2 per array, 4 per controller
 I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA)

$ mdadm --detail-platform /sys/devices/pci0000:00/0000:00:1f.2 --export
MD_FIRMWARE_TYPE=imsm
IMSM_VERSION=9.5.0.1037
IMSM_SUPPORTED_RAID_LEVELS=raid0 raid1 raid10 raid5
IMSM_SUPPORTED_CHUNK_SIZES=4k 8k 16k 32k 64k 128k
IMSM_2TB_VOLUMES=yes
IMSM_2TB_DISKS=no
IMSM_MAX_DISKS=7
IMSM_MAX_VOLUMES_PER_ARRAY=2
IMSM_MAX_VOLUMES_PER_CONTROLLER=4

$ mdadm --detail-platform /sys/devices/pci0000:00/0000:00:1f.0 # This isn't an IMSM-capable controller
mdadm: no active Intel(R) RAID controller found under /sys/devices/pci0000:00/0000:00:1f.0

Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-04 16:34:11 +10:00
Jes Sorensen 12c7b44ebe Add zlib license to crc32.c
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-03 18:17:26 +10:00
NeilBrown 88af981fa5 super1: ensure bitmap doesn't overlap bad block log.
If a bad block log already exists when adding a bitmap,
make sure the bitmap stays before the log.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-03 17:07:20 +10:00
NeilBrown 688e99a77d Allow --update to add or remove space for a bad block list.
--update=bbl will add a bad block list to each device.
--update=no-bblk will remove the bad block list providing that it
is empty.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-03 17:07:13 +10:00
NeilBrown bf95d0f38c Bad block log 2012-10-03 17:07:11 +10:00
NeilBrown 676ab3120b Grow: make warning about old metadata more explicit.
Don't print it just when --verbose is set, and explain how
to over-ride it.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-03 14:44:20 +10:00
NeilBrown 1f9b0e2845 Grow - be careful about 'delayed' reshapes.
If multiple reshapes are activated on the same devices (different
partitions) then one might be forced to wait for the other to
complete.
As reshaping suspends access to small sections of the array
at a time, this causes a region to be suspended for a long time,
which isn't good.

So try to detect this and don't start suspending until
the reshape is actually happening.

This is only effective on 3.7 and later as prior kernels
don't report when the delayed reshape can progress.  For
the earlier kernels, just give a warning.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-03 14:41:31 +10:00
NeilBrown b716f38973 Use --offroot flag when assembling md arrays via --incremental
This ensures that mdmon won't be killed by systemd.
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-03 13:54:03 +10:00
NeilBrown 787e234ddc Detail/raid10: don't report 'set' names for 'far' and 'offset' raid10.
The 'set' concept is only meaningful for 'near' arrays, so only
use it there.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-03 13:53:54 +10:00
NeilBrown b3ec716d00 Fix 'enough' function for RAID10.
The 'enough' function is written to work with 'near' arrays only
in that it implicitly assumes that the offset from one 'group' of
devices to the next is the same as the number of copies.
In reality it is the number of 'near' copies.

So change it to make this number explicit.

Reported-by: Jakub Husák <jakub@gooseman.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-03 13:53:46 +10:00
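The check described in the commit above can be sketched in Python. This is an illustration of the idea, not the C function itself. It assumes the usual md RAID10 layout encoding, with the near-copy count in the low byte and the far-copy count in the next byte; the name `raid10_enough` is hypothetical.

```python
from typing import Sequence


def raid10_enough(raid_disks: int, layout: int, avail: Sequence[bool]) -> bool:
    """Return True if every block still has at least one copy on a present device."""
    near = layout & 0xFF
    far = (layout >> 8) & 0xFF
    copies = near * far
    first = 0
    while True:
        # One of the `copies` consecutive devices starting at `first` must be present.
        if not any(avail[(first + i) % raid_disks] for i in range(copies)):
            return False
        # Step by the number of near copies, not by the total number of copies.
        first = (first + near) % raid_disks
        if first == 0:
            return True
```

With a 4-device array and 2 far copies, devices 0 and 3 are neighbours in the wrap-around sense, so losing both is fatal. A check that stepped by the total copy count would only look at the pairs (0,1) and (2,3) and miss that case.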
NeilBrown 6a1cde4d06 Replace sha1.h with slightly older version.
sha1.h claims GPL3+, while sha1.c claims GPL2+.  This is
inconsistent and technically prevents the whole from being
distributed under GPL2.
So replace sha1.h with a version from the GCC sources from before
  Tue Apr 20 08:36:39 2010
when the copyright notice was updated.

Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-03 13:34:15 +10:00
Maciej Naruszewicz 80730bae52 Add MD_ARRAY_SIZE for --examine --export
An additional pair of key=value for --examine --export.

Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-02 16:42:25 +10:00
Maciej Naruszewicz f0ec67106c Display size with human_size_brief with a chosen prefix
When using human_size_brief, only IEC prefixes were supported. Now
it's possible to specify which format we want to see - either IEC
(kibi, mebi, gibi) or JEDEC (kilo, mega, giga).

Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-02 16:41:13 +10:00
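A selectable prefix can be sketched as below. This Python fragment is only an illustration: the commit message does not state the divisors, so the sketch assumes binary multiples (1024) for the IEC names and decimal multiples (1000) for the kilo/mega/giga names, and the output format is likewise an assumption.

```python
def human_size_brief(num_bytes: int, prefix: str = "IEC") -> str:
    """Format a byte count with the chosen family of unit prefixes."""
    if prefix == "IEC":
        step, units = 1024, ["KiB", "MiB", "GiB"]
    elif prefix == "JEDEC":
        # Assumption: decimal multiples for the kilo/mega/giga family.
        step, units = 1000, ["KB", "MB", "GB"]
    else:
        raise ValueError(f"unknown prefix family: {prefix}")
    value = num_bytes / step
    unit = units[0]
    for next_unit in units[1:]:
        if value < step:
            break
        value /= step
        unit = next_unit
    return f"{value:.2f}{unit}"
```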
Maciej Naruszewicz 570abc6f38 Synchronize size calculation in human_size and human_size_brief
It would be better if two size-calculating methods had the same
calculating algorithm. The human_size way of calculation seems
more readable, so let's use it for both methods.

Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-02 16:40:11 +10:00
Maciej Naruszewicz 9590cb4276 Fix return code for --detail-platform
Variable 'err' is initially set to 1, so changing its value with
'|=' won't set it to 0 even if the operation is successful.

Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-02 16:37:48 +10:00
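The bug described above is easy to reproduce in isolation. The Python sketch below uses hypothetical function names; the commit message does not say how the fix was made, so starting from 0 is shown only as one possible correction.

```python
def combined_status_buggy(results):
    err = 1  # starts out as "failed"
    for r in results:
        err |= r  # OR can set bits but never clears the initial 1
    return err


def combined_status_fixed(results):
    err = 0  # start from "success" and let any failure set a bit
    for r in results:
        err |= r
    return err
```

With every step reporting success, the first version still returns 1 while the second returns 0.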
Maciej Naruszewicz e50cf22073 imsm: Add --export option for --detail-platform
This option will provide most of information we can get via
mdadm --detail-platform [-e format] in the key=value format.
Example output:

$ mdadm --detail-platform
       Platform : Intel(R) Matrix Storage Manager
        Version : 9.5.0.1037
    RAID Levels : raid0 raid1 raid10 raid5
    Chunk Sizes : 4k 8k 16k 32k 64k 128k
    2TB volumes : supported
      2TB disks : not supported
      Max Disks : 7
    Max Volumes : 2 per array, 4 per controller
 I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA)

$ mdadm --detail-platform --export
MD_FIRMWARE_TYPE=imsm
IMSM_VERSION=9.5.0.1037
IMSM_SUPPORTED_RAID_LEVELS=raid0 raid1 raid10 raid5
IMSM_SUPPORTED_CHUNK_SIZES=4k 8k 16k 32k 64k 128k
IMSM_2TB_VOLUMES=yes
IMSM_2TB_DISKS=no
IMSM_MAX_DISKS=7
IMSM_MAX_VOLUMES_PER_ARRAY=2
IMSM_MAX_VOLUMES_PER_CONTROLLER=4

Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-10-02 16:28:30 +10:00
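Output in key=value form is straightforward for another process to consume. The following Python sketch parses text shaped like the `--export` example above; the function name `parse_export` is hypothetical and the sample is a subset of the lines shown in the commit message.

```python
def parse_export(text: str) -> dict:
    """Turn KEY=value lines into a dictionary, ignoring lines without '='."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return values


sample = """\
MD_FIRMWARE_TYPE=imsm
IMSM_VERSION=9.5.0.1037
IMSM_SUPPORTED_RAID_LEVELS=raid0 raid1 raid10 raid5
IMSM_MAX_DISKS=7
"""

info = parse_export(sample)
levels = info["IMSM_SUPPORTED_RAID_LEVELS"].split()
```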
NeilBrown aab15415ed Manage: fix checks for removal from a container.
We must only remove from a container if the device isn't a
member of any member array.
To check we look at the 'holders' directory in sysfs.

We currently skip that check if ->devname is "detached", however
that can never be true since the change that introduced
add_detached().

Also sysfs_unique_holder returns status in 'errno' which isn't
entirely safe as e.g. closedir() is probably allowed to clear it.

So make sysfs_unique_holder return an unambiguous value, and use
it to decide what to report.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-24 12:26:03 +10:00
Lukasz Dorau 65d0b4ce2d fix: imsm: do not accept too small sizes
Inappropriate error messages (e.g. mdadm: platform does not support
raid5 with 0 disk) were displayed when too small a size was given.
This patch fixes it.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-20 12:27:17 +10:00
Lukasz Dorau 51d4261ca9 fix: adjust parse_size() to the unsigned size variable
An error in parse_size() should be reported by 0, not -1,
because -1 is changed to the max value of unsigned long long
during calculations of size (e.g. at mdadm.c:412).

A negative value of size should be reported as error
(e.g. size equal -1 has been changed to the max value of
unsigned long long so far).

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-20 12:27:17 +10:00
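The wrap-around mentioned above follows from C's conversion rules: a negative value stored in an unsigned type is reduced modulo 2^N. The Python sketch below mimics that for a 64-bit `unsigned long long` (C only guarantees at least 64 bits, so the width is an assumption); the helper name is hypothetical.

```python
ULLONG_MAX = 2 ** 64 - 1


def as_unsigned_long_long(value: int) -> int:
    """Reduce an integer modulo 2**64, as a conversion to a 64-bit unsigned type does."""
    return value & ULLONG_MAX
```

So an error code of -1 becomes indistinguishable from a request for the largest possible size.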
Lukasz Dorau d5d2c61473 fix: imsm: re-enable size expansion to the max value
Size expansion to the 'max' value has been broken since
the following patch:

    commit d04f65f48c
    Change the values for "max size" from -1 to 1.

This patch re-enables it.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-20 12:27:17 +10:00
NeilBrown 887a7a9e98 grow: fix typo : MAX_DISKS -> MAX_SIZE
oopps.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-20 11:32:39 +10:00
Lukasz Dorau db428d7a24 fix segfaults in Detail()
If a disk has been removed, 'st' and 'info' can be NULL, which causes a segfault.
'st' and 'info' should be checked for NULL before being used.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-20 11:06:25 +10:00
Robert Buchholz 8a63c73123 raid6check: Auto-repair mode
When calling raid6check in regular scanning mode, specifying
"autorepair" as the last positional parameter will cause it
to automatically repair any single slot failures it identifies.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-10 17:28:21 +10:00
Robert Buchholz 351d768026 raid6check: Extract (un)locking into functions
Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-10 17:28:03 +10:00
Robert Buchholz 696e95a1df raid6check: Repair mode used geo_map incorrectly
In repair mode, the data block indices to be repaired were calculated
using geo_map() which returns the disk slot for a data block index
and not the reverse. Now we simply store the reverse of that calculation
when we do it anyway.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-10 17:25:27 +10:00
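Storing the reverse of a forward mapping while computing it can be sketched as below. This Python fragment uses a toy mapping in place of geo_map(), whose real parameters are not shown in the commit message; all names here are hypothetical.

```python
def build_maps(raid_disks: int, block_to_slot_fn):
    """Compute block -> slot once and record slot -> block at the same time."""
    block_to_slot = []
    slot_to_block = [None] * raid_disks
    for block in range(raid_disks):
        slot = block_to_slot_fn(block)
        block_to_slot.append(slot)
        slot_to_block[slot] = block  # the reverse lookup, stored as we go
    return block_to_slot, slot_to_block


# Toy rotation standing in for the real layout function.
forward, reverse = build_maps(4, lambda block: (block + 1) % 4)
```

Looking up a failed slot in `reverse` then yields the block index directly, instead of misusing the forward function.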
Robert Buchholz b67e45b858 raid6check: Fix off-by-one in argument check
In repair mode, specifying a failed slot that is equal to the number of
devices in the raid could cause a segfault.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-10 17:23:59 +10:00
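The message above does not quote the faulty comparison. As a hedged illustration in Python, a correct bounds check for zero-based slots looks like this (the helper name is hypothetical):

```python
def valid_slot(slot: int, raid_disks: int) -> bool:
    """Slots are numbered from 0, so raid_disks itself is already out of range."""
    return 0 <= slot < raid_disks
```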
Robert Buchholz 1cc101f3f8 Move xmalloc et al into their own file
This avoids code duplication for utilities that do not link to
util.c and everything that comes with it, such as test_restripe and
raid6check.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-09-10 17:23:59 +10:00
NeilBrown a74e5731ba mapfile: fix mapfile rebuild for containers
When recreating the mapfile entry for a container we need to
use ->getinfo_super, not ->container_content, just like we
do in Detail().

Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-20 12:34:28 +10:00
NeilBrown fb52f2457a find_free_devnum: avoid auto-using names in /etc/mdadm.conf
High-number names like "/dev/md126" shouldn't be in /etc/mdadm.conf,
but if they are they should be ignored when choosing an
unused number.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-20 10:50:42 +10:00
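The selection rule above can be sketched in Python. The starting number of 127 and the downward search are assumptions for this example, as are the parameter names; the point is only that numbers named in the config file are skipped just like numbers already in use.

```python
def find_free_devnum(in_use, in_config, start=127):
    """Return the first number, counting down from start, that nobody claims."""
    for num in range(start, -1, -1):
        if num in in_use or num in in_config:
            continue
        return num
    return None  # nothing free
```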
NeilBrown a99d346971 mdmon: allow --takeover when original was started with --offroot
As --offroot causes ARGV[0] to be changed, we need to be more
lenient when checking that the mdmon we are about to kill really
is mdmon.  i.e. allow name to be "@dmon" instead.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-20 10:37:21 +10:00
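The more lenient name check can be sketched as below. This Python fragment is an illustration only: how the process name is obtained is left out, and the helper name is hypothetical.

```python
def looks_like_mdmon(argv0: str) -> bool:
    """Accept the normal name and the one used after --offroot rewrites it."""
    name = argv0.rsplit("/", 1)[-1]  # drop any leading directory
    return name in ("mdmon", "@dmon")
```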
NeilBrown bcd2b4485a mdmon: fix arg processing for -a
'-a' was not being recognised as an abbreviation for '--all'.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-20 10:33:50 +10:00
NeilBrown 542ef4ece4 ddf: hack to fix container recognition.
When adding a spare to a DDF there is some confusion about the
'level' of the container.  It is reported by kernel as unknown
 -1000000.
I don't know why this broke but until I figure out why and fix it,
this hack gets us going again.

Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-15 10:07:26 +10:00
NeilBrown e0e7aeaa18 ddf: allow a non-spare to be used to recover a missing device.
If a DDF has two arrays sharing devices and one device fails, then
as soon as the spare is used to recover one of the arrays it isn't
spare any more and so is not chosen for the other array.

Work around this for now by allowing a non-spare to be used if it has
enough space.

Reported-by: Albert Pauw <albert.pauw@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-15 09:59:55 +10:00
NeilBrown 613b0d174d ddf: fix multiple white-space issues.
Just general tidy-up

Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-15 09:55:40 +10:00
NeilBrown 9cf9a1de36 Manage: zero metadata before adding to 'external' array.
'external' arrays don't support --re-add yet so old metadata is of no
value, and 'ddf' gets confusing in mdmon if old metadata is found.
So for now, zero out any old metadata found before adding a spare to
an externally-managed array.

Reported-by: Albert Pauw <albert.pauw@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-15 09:51:20 +10:00
Lukasz Dorau 6d43efb59b Manage.c: fix make everything compilation error
This patch fixes the following make everything compilation error:
Manage.c: In function ‘Manage_add’:
Manage.c:538: error: ‘dev_st’ may be used uninitialized in this function
make: *** [mdadm.Os] Error 1

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-08-14 09:55:29 +10:00