Commit Graph

3166 Commits

Author SHA1 Message Date
Guoqing Jiang 8b2202ded1 mdadm: don't show cluster name once the bitmap is cleared
Don't show cluster name if bitmap is cleared.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-16 13:24:01 +11:00
Guoqing Jiang 37d0ca9be6 mdadm: output info more precisely when change bitmap to none
When changing the bitmap to none, the messages can be more accurate
if they are based on the existing bitmap type.

Also, s->bitmap_file is passed from the command-line option
"--bitmap=TYPE", so remove s->bitmap_file from the error message: the
message should mean that changing the bitmap to that type failed,
not that the type is already present.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-16 13:23:58 +11:00
Guoqing Jiang 41dbb4da22 mdadm: let cluster raid could also add disk within incremental mode
For cluster raid, the disc.state needs to be changed accordingly in
incremental mode.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-16 13:23:54 +11:00
Song Liu 01290056d0 recreate journal in mdadm
This patch tries to recreate a missing or faulty journal in mdadm.

Example:

./mdadm --fail /dev/md1 /dev/sdb2
mdadm: set /dev/sdb2 faulty in /dev/md1

./mdadm --stop /dev/md1
mdadm: stopped /dev/md1

./mdadm -A --scan --force
mdadm: Journal is missing or stale, starting array read only.
mdadm: /dev/md/1 has been started with 15 drives.

./mdadm --add-journal /dev/md1 /dev/sdb2
mdadm: added /dev/sdb2

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-16 12:43:56 +11:00
Song Liu 5aa644c68a add sysfs_array_state to struct mdinfo
Add sysfs_array_state to struct mdinfo, and add GET_ARRAY_STATE to
options of sysfs_read.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-16 12:43:45 +11:00
Deepa Dinamani 26714713cd mdadm: Change timestamps to unsigned data type.
32 bit signed timestamps will overflow in the year 2038.

Change the user interface mdu_array_info_s structure timestamps:
ctime and utime values used in ioctls GET_ARRAY_INFO and
SET_ARRAY_INFO to unsigned int. This will extend the field to last
until the year 2106.

Add time_after/time_before and supporting typecheck from
the kernel to take care of unsigned time wraparound.

The long term plan is to get rid of ctime and utime values in
this structure as this information can be read from the on-disk
meta data directly.

v0.90 on disk meta data uses u32 for maintaining time stamps.
So this will also last until year 2106.
Assumption is that the usage of v0.90 will be deprecated by
year 2106.

Timestamp fields in the on disk meta data for v1.0 version already
use 64 bit data types.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-16 12:43:25 +11:00
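The wraparound handling described in the timestamp commit above can be sketched as follows. This is a minimal illustration modelled on the kernel's time_after()/time_before() idea, not the code from the patch; the names ts_after and ts_before are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers, not mdadm's own names.
 * Compare two 32-bit unsigned timestamps so that the result stays
 * correct across a wrap from 0xFFFFFFFF back to 0, provided the two
 * values are less than 2^31 seconds apart. */
static inline int ts_after(uint32_t a, uint32_t b)
{
    /* b - a wraps modulo 2^32; read as a signed value it is
     * negative exactly when a is ahead of b. */
    return (int32_t)(b - a) < 0;
}

static inline int ts_before(uint32_t a, uint32_t b)
{
    return ts_after(b, a);
}
```

With these helpers, a timestamp of 2 taken shortly after the counter wraps still compares as later than 0xFFFFFFF0, which a plain `a > b` would get wrong.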
Constantine Shulyupin cd04f56212 Detail.c --test fix 2015-12-10 16:26:07 +02:00
Song Liu dbfbca4300 fix bug in assemble
In Assemble, getinfo_super() over-writes journal_clean.  To
ensure correct journal_clean, keep it in a local variable
before getinfo_super().

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-09 07:35:50 +11:00
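The pattern behind the Assemble fix above (copy a field before a call that refills the whole structure) might look like the sketch below. The struct and both functions are hypothetical stand-ins; the real struct mdinfo and getinfo_super() are more involved.

```c
/* Hypothetical, cut-down stand-in for struct mdinfo. */
struct info_sketch {
    int journal_clean;
    int events;
};

/* Stand-in for getinfo_super(): it refills the whole structure and,
 * as a side effect, overwrites journal_clean. */
static void refill_info(struct info_sketch *info)
{
    info->journal_clean = 0;
    info->events = 42;
}

/* Keep the value in a local variable before the refill so the
 * caller still sees what was recorded earlier. */
int journal_clean_after_refill(struct info_sketch *info)
{
    int journal_clean = info->journal_clean;

    refill_info(info);
    return journal_clean;
}
```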
Bas van Schaik 1158f25eae make sure 'path' buffer is large enough to fit 200 characters plus null terminator 2015-12-03 13:48:53 +00:00
Bas van Schaik fa9aca4930 avoid confusion with parameter 'devname' with same name, ensure buffer is large enough for two ints plus extras 2015-12-03 13:48:46 +00:00
Bas van Schaik a90ed30e74 ensure buffer is large enough for two ints and some extras 2015-12-03 13:48:37 +00:00
Song Liu 198d54787c add crc32c and use it for r5l checksum
In kernel space, r5l checksum will use crc32c:
http://marc.info/?l=linux-raid&m=144598970529191
mdadm needs to change too.

This patch ports a simplified crc32c algorithm from kernel code
and uses it in super1.c:write_empty_r5l_meta_block().

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-30 17:38:28 +11:00
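For readers unfamiliar with crc32c (the Castagnoli CRC), a bit-at-a-time version is sketched below. It is an assumption-laden illustration, not the code ported in the commit above: the function names are made up, and the seed and final-inversion convention shown here is the common standalone one, which may differ from how mdadm and the kernel seed the r5l checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* Feed len bytes into a running CRC-32C value.
 * 0x82F63B78 is the Castagnoli polynomial in reflected form. */
uint32_t crc32c_update(uint32_t crc, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++) {
            /* (0u - bit) is all ones when the low bit is set,
             * so the polynomial is applied only in that case. */
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc;
}

/* One-shot helper using the conventional initial value and final inversion. */
uint32_t crc32c(const void *buf, size_t len)
{
    return crc32c_update(0xFFFFFFFFu, buf, len) ^ 0xFFFFFFFFu;
}
```

The standard check input "123456789" gives 0xE3069283 with this convention. A table-driven variant is faster; the bitwise loop is used here only because it is short.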
Song Liu 356e69de79 mdadm: add test script for raid456 journal
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-22 12:20:08 +11:00
Song Liu 28f83f6d3b mdadm: Add description of write journal to md.4
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-22 12:19:11 +11:00
Song Liu 051f326550 mdadm: refactor write journal code in Assemble and Incremental
As discussed, standalone require_journal() in struct superswitch
is not a very good idea. Instead, journal related information
fits well in struct mdinfo.

This patch simplifies journal support code in Assemble and
Incremental as:

- Add journal_device_required and journal_clean to struct mdinfo;
- Remove function require_journal from struct superswitch;
- Update Assemble and Incremental to use journal_device_required
and journal_clean from struct mdinfo (instead of separate var).

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-22 12:19:09 +11:00
Guoqing Jiang e80357f825 Make cmap_* also has same policy as dlm_*
Let the libcmap library and its related functions also need only a
one-time setup while mdadm is running.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-21 11:19:35 +11:00
Guoqing Jiang d15a1f72bd Safeguard against writing to an active device of another node
Modifying an existing device's superblock or creating a new superblock
on an existing device needs to be checked because the device could be
in use by another node in another array. So, we check this by taking
all superblock locks in userspace so that we don't step onto an active
device used by another node and safeguard against accidental edits.
After the edit is complete, we release all locks and the lockspace so
that it can be used by the kernel space.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-21 11:19:05 +11:00
Song Liu 28d744468e Add help message and man entry for --write-journal
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-19 13:06:21 +11:00
Song Liu 5c6ad21150 Check write journal in incremental
If the journal device is missing, do not start the array, and show:

./mdadm -I /dev/sdf
mdadm: journal device is missing, not safe to start yet.

The array will be started when the journal device is attached with -I

./mdadm -I /dev/sdb1
mdadm: /dev/sdb1 attached to /dev/md/0_0, which has been started.

To force start without journal device:

./mdadm -I /dev/sdf --run
mdadm: Trying to run with missing journal device
mdadm: /dev/sdf attached to /dev/md/0_0, which has been started.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-19 13:06:18 +11:00
Song Liu 69a481166b Assemble array with write journal
Example output:

./mdadm --assemble /dev/md0 /dev/sd[c-f] /dev/sdb1
mdadm: /dev/md0 has been started with 4 drives and 1 journal.

mdadm checks the superblock for journal devices. If the journal device
is missing or faulty, mdadm will show a warning:

./mdadm --assemble /dev/md0 /dev/sd[c-q] /dev/sdb1
mdadm: Not safe to assemble with missing or stale journal device, consider --force.

The user can insist on starting the array (read only) with --force:

./mdadm --assemble /dev/md0 /dev/sd[c-q] /dev/sdb1 --force
mdadm: Journal is missing or stale, starting array read only.
mdadm: /dev/md0 has been started with 15 drives.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-19 13:06:15 +11:00
Song Liu cc1799c3dd Enable create array with write journal (--write-journal DEVICE).
Specify the write journal device with --write-journal DEVICE

./mdadm --create -f /dev/md0 --assume-clean -c 32 --raid-devices=4 --level=5 /dev/sd[c-f] --write-journal /dev/sdb1
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

Only one journal device is allowed. If multiple --write-journal
options are given, mdadm will use the first and ignore the others:

./mdadm --create -f /dev/md0 --assume-clean -c 32 --raid-devices=4 --level=5 /dev/sd[c-f] --write-journal /dev/sdb1 --write-journal /dev/sdx
mdadm: Please specify only one journal device for the array.
mdadm: Ignoring --write-journal /dev/sdx...
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-19 13:06:12 +11:00
Song Liu ed94976d84 Show device as journal in --detail --examine
Example output:

./mdadm --detail /dev/md127
/dev/md127:
        Version : 1.2
  Creation Time : Wed May 13 17:01:12 2015
     Raid Level : raid5
     Array Size : 11720662464 (11177.69 GiB 12001.96 GB)
  Used Dev Size : 3906887488 (3725.90 GiB 4000.65 GB)
   Raid Devices : 4
  Total Devices : 5
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Wed May 13 17:01:12 2015
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 32K

           Name : 0
           UUID : 8fb9ee05:3831d52f:e5c23825:28cd6881
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       32        0      active sync   /dev/sdc
       1       8       48        1      active sync   /dev/sdd
       2       8       64        2      active sync   /dev/sde
       3       8       80        3      active sync   /dev/sdf

       4       8       17        -      journal   /dev/sdb1

./mdadm -E /dev/sdb2
/dev/sdb2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x201
     Array UUID : 562b2334:35b9bcc1:add50892:1f30c4bd
           Name : 0
  Creation Time : Thu Aug 27 12:55:26 2015
     Raid Level : raid5
   Raid Devices : 15

 Avail Dev Size : 249796608 (119.11 GiB 127.90 GB)
     Array Size : 54696423936 (52162.57 GiB 56009.14 GB)
  Used Dev Size : 7813774848 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : active
    Device UUID : 5015e522:d39ba566:5909cf3c:9c51f2ff

Internal Bitmap : 8 sectors from superblock
    Update Time : Thu Aug 27 13:16:55 2015
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 4e6fd76d - correct
         Events : 262

         Layout : left-symmetric
     Chunk Size : 256K

   Device Role : Journal
   Array State : AAAAAAAAAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-19 13:06:07 +11:00
Song Liu fa7574f6d4 add macros for MD_DISK_ROLE_(SPARE/FAULTY)
Replace special disk roles (0xffff, 0xfffe) with macros:

define MD_DISK_ROLE_SPARE      0xffff
define MD_DISK_ROLE_FAULTY     0xfffe

Will add macro for journal device in next patch:
define MD_DISK_ROLE_JOURNAL    0xfffd

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-19 13:05:59 +11:00
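The role macros from the commit above could be used along these lines. The three values are taken from the commit message; the role_name() helper and its "active" fallback are hypothetical and simplify how mdadm actually interprets the role field.

```c
#include <stdint.h>

/* Values as listed in the commit message. */
#define MD_DISK_ROLE_SPARE   0xffff
#define MD_DISK_ROLE_FAULTY  0xfffe
#define MD_DISK_ROLE_JOURNAL 0xfffd

/* Hypothetical helper: map a 16-bit role value to a label.
 * Any other value is treated here as an ordinary raid slot number. */
const char *role_name(uint16_t role)
{
    switch (role) {
    case MD_DISK_ROLE_SPARE:
        return "spare";
    case MD_DISK_ROLE_FAULTY:
        return "faulty";
    case MD_DISK_ROLE_JOURNAL:
        return "journal";
    default:
        return "active";
    }
}
```

Naming the constants keeps comparisons such as `role == MD_DISK_ROLE_JOURNAL` readable where the bare literal 0xfffd would not be.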
Artur Paszkiewicz 2139b03c20 imsm: don't call abort_reshape() in imsm_manage_reshape()
Calling abort_reshape() in imsm_manage_reshape() is unnecessary in case
of an error because it is handled by reshape_array(). Calling it when
reshape completes successfully is also unnecessary and leads to a race
condition:
- reshape ends
- mdadm calls abort_reshape() -> sets sync_action to idle
- MD_RECOVERY_INTR is set and md_reap_sync_thread() does not finish the
  reshape

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Konrad Dabrowski <konrad.dabrowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-08 15:22:16 +11:00
Guoqing Jiang 9465f17058 re-add: make re-add try to write sysfs node first
If the sysfs node exists, we should try to write "re-add" to it.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-10-08 11:08:40 +11:00
NeilBrown 8266a36ad6 Merge branch 'fix' of git://github.com/ldzhong/mdadm 2015-10-01 08:30:58 +10:00
Guoqing Jiang bff96f7366 mdadm: make cluster raid also could support re-add
If it is a cluster raid, the disc.state needs to be
changed accordingly when doing a re-add.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-09-28 14:55:02 +10:00
Goldwyn Rodrigues 9d9202e301 Fix --incremental handling on cluster array.
Commit 06bd679317 ("Skip clustered devices in incremental")
disabled incremental completely on clustered arrays.
What we really want is that mdadm should not start or create
a clustered array but still be able to add or readd to an existing
device. This would enable udev scripts to automatically add
or re-add a device after transient errors.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-09-28 14:42:55 +10:00
NeilBrown 86a406c226 super1: Do not create bad block log for clustered devices.
We currently have no synchronization techniques for the bad
block log, so disable it for the cluster.

Reported-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-09-28 12:27:37 +10:00
Goldwyn Rodrigues 6d9c7c2551 Increment version for clustered bitmaps
Add BITMAP_MAJOR_CLUSTERED as 5, in order to prevent older kernels
from assembling a clustered device.

In order to maximize compatibility, the major version is set to
BITMAP_MAJOR_CLUSTERED *only* if the bitmap is clustered.

Also add MD_FEATURE_CLUSTERED so that older kernels, which would
otherwise assemble the MD array if the bitmap is corrupted, return
an error instead.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-09-28 11:47:04 +10:00
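The "only if the bitmap is clustered" rule from the commit above can be pictured with the sketch below. BITMAP_MAJOR_CLUSTERED = 5 comes from the commit message; the value 4 for the ordinary major version and the helper name are assumptions made for illustration.

```c
/* From the commit message. */
#define BITMAP_MAJOR_CLUSTERED 5
/* Assumed here as the regular, non-clustered major version. */
#define BITMAP_MAJOR_HI        4

/* Hypothetical helper: choose the on-disk bitmap major version.
 * Non-clustered bitmaps keep the older number so that existing
 * kernels can still assemble them. */
int bitmap_major_for(int clustered)
{
    return clustered ? BITMAP_MAJOR_CLUSTERED : BITMAP_MAJOR_HI;
}
```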
Lidong Zhong d7c6d75dcf mdadm: remove duplicate logic when c.delay is 0 2015-08-26 14:03:56 +08:00
NeilBrown ccc93b33ca Makefile: test -s flag and suppress echo when set.
Some rules do their own tracing and so aren't affected
by -s.
So add a test for -s in MAKE_FLAGS and avoid echo when present.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-05 15:10:43 +10:00
NeilBrown 4a3a795a8b tests: raid6 repair is now tested on every different layout.
Signed-off-by: NeilBrown <neilb@suse.de>
2015-08-05 14:57:08 +10:00
NeilBrown d80f7aa9a1 Assemble: correctly capture error from ->write_bitmap
else 'err' might be undefined.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-05 14:55:31 +10:00
NeilBrown 380487fdc9 main: remove use of uninitialized 'rv'.
If c.homecluster was not NULL, might get an
error anyway.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-05 14:53:33 +10:00
NeilBrown 598f8904ac raid6check: don't ignore return value from posix_memalign.
Compilers don't like that.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-08-05 14:51:25 +10:00
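Checking the result of posix_memalign(), as the raid6check commit above requires, might look like this. The wrapper name alloc_aligned is hypothetical; posix_memalign() itself returns 0 on success and an error number otherwise, without setting errno.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdlib.h>

/* Hypothetical wrapper: return an aligned buffer, or NULL on failure.
 * align must be a power of two and a multiple of sizeof(void *). */
void *alloc_aligned(size_t align, size_t size)
{
    void *p = NULL;

    /* The return value is the only reliable failure indicator. */
    if (posix_memalign(&p, align, size) != 0)
        return NULL;
    return p;
}
```

A caller would then test the pointer, for example `buf = alloc_aligned(4096, len); if (!buf) return 1;`, and release it with free().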
NeilBrown 5997585200 Merge branch 'mdadm-3.3.x' 2015-08-03 16:21:37 +10:00
NeilBrown 69818a5c75 Release mdadm-3.3.4
Important bugfix release.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-03 16:17:13 +10:00
NeilBrown 8360760457 Assemble: really don't assemble IMSM array without OROM.
The previous patch missed one case.

Also print more useful information when rejecting
a device with IMSM metadata.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-03 16:06:51 +10:00
NeilBrown 187f157bf0 mdassemble: include mapfile support.
This does make mdassemble a bit bigger, but it also means
it actually works properly with named arrays.

Ref: https://bbs.archlinux.org/viewtopic.php?id=198196
Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-03 15:42:31 +10:00
NeilBrown 7eee461e91 Assemble: don't assemble IMSM array without OROM.
If someone has an IMSM array, and disables RAID in the BIOS
and uses the devices for some other purpose, then they really don't
want mdadm to start syncing the array.

So don't assemble if OROM doesn't confirm it is OK.

There can still be problems for crash-dump not being able to find
the OROM.   Some explicit work-around might be needed for that
rather than a more general workaround that can corrupt data.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-03 15:42:16 +10:00
NeilBrown 53a087b105 mdassemble: include mapfile support.
This does make mdassemble a bit bigger, but it also means
it actually works properly with named arrays.

Ref: https://bbs.archlinux.org/viewtopic.php?id=198196
Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-03 11:54:16 +10:00
NeilBrown 7d55dca2cc mdassemble: don't try to perform cluster check.
mdassemble is meant to be small and simple, so avoid
trying to check for a cluster.
Currently it doesn't, but it still includes the code,
which doesn't build because the library isn't provided.

So just exclude the get_cluster_name code from mdassemble.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-08-03 11:53:01 +10:00
Guoqing Jiang 2cf42394f0 md-cluster: use %-64s to print cluster_name
Left alignment is better for clusters with names shorter than 64
characters. Also align the cluster name output with the other fields.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-29 17:26:12 +10:00
Guoqing Jiang d7a493695a mdadm: fix wrong condition for go to abort
When parse_cluster_confirm_arg returns 0, it means the
args were parsed successfully, so change !rv to rv.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-29 17:26:12 +10:00
NeilBrown 9f2e55a421 Assemble: don't assemble IMSM array without OROM.
If someone has an IMSM array, and disables RAID in the BIOS
and uses the devices for some other purpose, then they really don't
want mdadm to start syncing the array.

So don't assemble if OROM doesn't confirm it is OK.

There can still be problems for crash-dump not being able to find
the OROM.   Some explicit work-around might be needed for that
rather than a more general workaround that can corrupt data.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-29 14:38:37 +10:00
NeilBrown 653299b699 Merge branch 'cluster'
Now that 3.3.3 is out, it is time to include the cluster-support code.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-27 11:01:08 +10:00
NeilBrown 3cab8baec5 Release mdadm-3.3.3
Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-24 16:18:17 +10:00
NeilBrown e4fa82a858 mdassemble: add "Name" definition.
That allows it to compile again :-(

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-24 16:18:13 +10:00
NeilBrown 27aefbdb3d Don't ignore return value from read and write
New gcc sometimes complains about this.

Signed-off-by: NeilBrown <neilb@suse.com>
2015-07-24 16:11:23 +10:00
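One common way to honour the return value of write(), in the spirit of the commit above, is a loop that handles short writes. This is a generic sketch with a hypothetical name (write_all), not the change made in mdadm.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer or report failure.
 * Returns 0 when all len bytes were written, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;      /* real error: errno describes it */
        }
        p += n;             /* short write: advance and go again */
        len -= (size_t)n;
    }
    return 0;
}
```

The same shape works for read(), with the extra case that a return of 0 means end of file.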