mdadm

Commit Graph

Author	SHA1	Message	Date
NeilBrown	29a312f2f3	Assemble: really ensure stripe_cache is big enough to handle new chunk size Earlier patch: `56fcbcbb6f` calculated the proper chunk size - but didn't use it. Let's actually use it this time. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-17 13:10:25 +10:00
NeilBrown	ad1a3c2f08	raid6check fix checking of DDF layouts. Stuff probably still broken. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-16 12:07:54 +10:00
NeilBrown	76cd79d3d1	raid6check: get device ordering correct for syndrome calculation. The order of devices used for the syndrome calculation is not the same as the order of data in the array. The D block immediately after Q is first, then they continue cyclically in raid-disk order, skipping over the P disk if it is seen. This gets the 'check' right for all layouts other than DDF, which is quite different. I haven't confirmed that this doesn't break repair. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-16 11:25:40 +10:00
NeilBrown	0832fb09d9	tests: slow down --stop a bit to allow revert-inplace to work. revert-inplace would sometimes find that the original reshape had finished. So slow down the reshaping during --stop (which needs to be a little bit fast so that stop doesn't timeout waiting) and don't wait quite so long before stopping. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-16 10:50:40 +10:00
NeilBrown	7cb2815a15	tests: add 19raid6check This checks that raid6check finds no errors in newly created array with all different layouts. (it doesn't...) Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-16 08:02:52 +10:00
NeilBrown	21a1287ac9	test: clear out old metadata from loop devices. Old metadata can tempt udev to assemble things, which just gets in the way. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-16 07:49:14 +10:00
NeilBrown	108bd87457	raid6check: report role of suspect device. i.e. -2 for Q, -1 for P, 0-N for data. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-10 14:46:59 +10:00
NeilBrown	5bc29745a0	tests: save failure logs to logdir If --save-logs is given we already save all logs to --logdir If not, we should still save erroneous logs to --logdir. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-10 14:44:58 +10:00
NeilBrown	439c196491	tests: do not try to 'flushbufs' after stopping an array If the array is stopped, there is nothing to flush, and blockdev can signal an error. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-10 14:42:20 +10:00
NeilBrown	bc6ccf969e	test: add dmesg output to logs on error. This can help isolate the problem. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-06 15:32:46 +10:00
NeilBrown	a76b3a345b	test: check sync_action as well when checking for an action. Some actions only appear in /proc/mdstat after a little delay, so check in sync_action as well. This applies when checking for recovery etc, and when waiting for idle. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-06 13:58:19 +10:00
NeilBrown	602b916951	test: speed up reshape when stopping arrays. --stop needs to wait for reshape to get to a suitable spot, so having really slow resync isn't helpful. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-06 13:52:04 +10:00
NeilBrown	5c351af129	test: stop all arrays before starting test. As well as cleaning up loop devices, stop all arrays. After all, we cannot do the one without the other. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-06 13:48:59 +10:00
NeilBrown	62844a4da6	Grow: remove stray tracing message. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-06 13:47:45 +10:00
NeilBrown	e3e0d0a843	Manage/stop: don't stop during initial critical section. If the array is reshaping to more devices, then stopping during that initial critical section is a bad idea. So check for it and wait a bit. Should probably handle final critical section of a reduction too. same-size reshape should be handled correctly already. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-06 13:45:39 +10:00
NeilBrown	932be6276e	Manage/stop: improve some comments. This code always confuses me - this might help a bit. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-06 13:37:19 +10:00
NeilBrown	30ddba7de5	Manage/stop: guard against 'completed' being too large. A race can allow 'completed' to read as 2^63-1, which takes a long time to count up to. So guard against that possibility. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-06 13:33:20 +10:00
NeilBrown	d3f6cf4f9b	Monitor: don't Wait forever on a 'frozen' array. If Wait() finds the array resync is 'frozen', then wait a little while to avoid races, but don't wait forever. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-06 13:26:41 +10:00
NeilBrown	5418499ae4	sysfs: reject reads that use the whole buffer. If a read fills the whole buffer, then we possibly missed something off the end, and we definitely shouldn't put a '\0' beyond the end, so just return an error. This should never happen anyway. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-06 13:21:33 +10:00
NeilBrown	bcbb92d4ee	Remove some trailing white space It looks ugly in my editor. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-02 08:26:30 +10:00
NeilBrown	52b6ccad34	Manage: fix no-op test in Manage_stop. A 'devnm' never starts with '/', so this test is pointless. The code should use the passed-in devname unless it is clearly not usable. So fix it to do that. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-02 08:16:59 +10:00
NeilBrown	9581efb1ae	mdstat: discard 'dev' field, just use 'devnm' These both have the same value, and have done since the 'devnm' concept was introduced. So discard the pointless duplicate. Signed-off-by: NeilBrown <neilb@suse.de>	2015-07-02 08:15:10 +10:00
NeilBrown	caf9ac0ca4	Grow: fix typo in comment Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-18 15:51:45 +10:00
NeilBrown	56fcbcbb6f	Assemble: ensure stripe_cache is big enough to handle new chunk size If you reshape to a larger chunk size, and need to restart, it can have problems. Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-18 15:49:52 +10:00
Guoqing Jiang	4a3d29edce	Reuse calc_bitmap_size to reduce code size We can use the new added calc_bitmap_size func to remove some redundant lines. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:44:38 +10:00
Guoqing Jiang	7e6e839a26	mdadm: change the num of cluster node This extends the nodes option to assemble mode, so that the number of cluster nodes can be changed by the user. Before that, it is necessary to ensure there is enough space for those nodes; calc_bitmap_size is introduced to calculate the bitmap size of each node. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:43:31 +10:00
Guoqing Jiang	0aa2f15b20	mdadm: add the ability to change cluster name To support changing the cluster name, the commit does the following: 1. extend the original write_bitmap function for the new scenario. 2. add the scenario to handle the modification of the cluster's name in write_bitmap1. 3. let the cluster name also show in examine_super1 and detail_super1 Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:33:39 +10:00
Guoqing Jiang	06bd679317	Skip clustered devices in incremental We want the clustered devices to be started exclusively by a cluster resource-agent. So, avoid starting using the incremental option. This also skips a clustered md from starting during boot in inactive mode. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:33:18 +10:00
Guoqing Jiang	7c25f4d706	Convert a bitmap=none device to clustered This adds the ability to convert a regular md without bitmap (--bitmap=none) to a clustered device (--bitmap=clustered). To convert a device with --bitmap=internal or --bitmap=external, you have to convert to --bitmap=none and then re-execute the command with --bitmap=clustered. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:24:41 +10:00
Guoqing Jiang	4de9091302	Add a new clustered disk A clustered disk is added by the traditional --add sequence. However, other nodes need to acknowledge that they can "see" the device. This is done by --cluster-confirm: --cluster-confirm SLOTNUM:/dev/whatever (if disk is found) or --cluster-confirm SLOTNUM:missing (if disk is not found) The node initiating the --add, has the disk state tagged with MD_DISK_CLUSTER_ADD and the one confirming tag the disk with MD_DISK_CANDIDATE. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:21:29 +10:00
Guoqing Jiang	b98043a2f8	Show all bitmaps while examining bitmap This adds the capability of examining bitmaps corresponding to all nodes/slots on the device. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:19:25 +10:00
Guoqing Jiang	7716570e6d	Set home-cluster while creating an array The home-cluster is stored in the bitmap super block of the array. The device can be assembled on a cluster with the cluster name same as the one recorded in the bitmap. If home-cluster is not specified, this is auto-detected using dlopen corosync cmap library. neilb: allow code to compile when corosync-devel is not installed. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:06:30 +10:00
Guoqing Jiang	529e2aa573	Add nodes option while creating md Specifies the maximum number of nodes in the cluster that may use this device simultaneously. This is equivalent to the number of bitmaps created in the internal superblock (patches to follow). Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:04:16 +10:00
Guoqing Jiang	95a05b37e8	Create n bitmaps for clustered mode For a clustered MD, create bitmaps equal to number of nodes so each node has an independent bitmap. Only the first bitmap has the bits set so that the first node that assembles the device also performs the sync. The bitmaps are aligned to 4k boundaries. On-disk format (diagram columns marked 0, 4k, 8k, 12k; each row reads left to right): row 1: idle \| md super \| bm super[0] + bits; row 2: bm bits[0, contd] \| bm super[1] + bits \| bm bits[1, contd]; row 3: bm super[2] + bits \| bm bits[2, contd] \| bm super[3] + bits; row 4: bm bits[3, contd]. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 07:54:03 +10:00
NeilBrown	2a6493cfe1	Grow: fix a couple of typos. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-28 17:21:06 +10:00
NeilBrown	4a859abdc9	test: make 'check wait' more reliable. 'recover' etc doesn't appear in /proc/mdstat immediately. The "sync" thread must be started first. But 'sync_action' shows it as soon as MD_RECOVERY_NEEDED is set in the kernel. So look there too. Now maybe I can get rid of some of those silly 'sleep' calls. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-28 17:00:36 +10:00
NeilBrown	7d1dda2c55	tests/imsm-grow-template change 'wait' to 'check wait' 'wait' is a shell builtin that isn't doing anything useful. It should be calling 'check wait' I think. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-28 16:51:23 +10:00
NeilBrown	8e7ddc5f50	Grow: fix problem with --grow --continue If an array is being reshaped using backup space on a 'spare' device, then mdadm --grow --continue won't find it as by the time it runs, nothing looks like a spare any more. The spare has been added to the array, but has no data yet. So allow reshape_prepare_fdlist to find a newly-incorporated spare and report this so it can be used. Reported-by: Xiao Ni <xni@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-28 16:43:15 +10:00
NeilBrown	20c993e2e5	tests: wait a bit longer for reshape to complete. As the kernel now does less locking, 'check wait' doesn't always wait long enough. Add some pauses. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-25 16:59:19 +10:00
NeilBrown	e0cc1c8d8b	Grow: another attempt to fix stop-during-reshape race. When the array is stopped during a critical section, we sometimes erase the backup, which is bad. This happens when 'completed' is zero. This can happen easily when 'stop' freezes reshape. So try to be more careful and check 'reshape_position'. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-25 16:33:45 +10:00
Andrew Burgess	4a984120ea	Fix minor typo in mdadm manpage. Apologies if this is the wrong mailing list for this patch. This is a very small patch for the manual page for the mdadm utility. Thanks, Andrew Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-23 09:29:45 +10:00
Sergey Vidishev	1e08717f0b	mdadm: monitor: fix nullptr dereference when get_md_name() returns NULL Function add_new_arrays() expects get_md_name() to return a pointer to devname, but get_md_name() may also return NULL. So check the pointer before using it in add_new_arrays(). Signed-off-by: Sergey Vidishev <sergeyv@yandex-team.ru> Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-20 13:16:09 +10:00
NeilBrown	dd0468af57	test: forcefully clean up old loop devices. sometimes these can get left around, and udev can be looking at them at awkward times so they don't disappear. So be forceful. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-20 13:16:00 +10:00
NeilBrown	3ee556f8b6	Grow: be even more careful about handling a '0' completed value. Some old kernels set 'completed' to '0' too soon. But modern kernels don't. And when 'mdadm --stop' freezes and resumes the grow, 'completed' goes back to zero briefly, which can confuse this logic. So only think '0' might be wrong from an old kernel when the reshape has gone idle. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-15 15:11:48 +10:00
NeilBrown	2c3e39ebf9	tests/07reshape5intr : retry if writing 'check' fails. It can sometimes. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-15 15:09:08 +10:00
NeilBrown	df995e3af3	tests/19raid6repair: don't flushbufs on non-existent array. ..that triggers an error. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-15 12:34:27 +10:00
NeilBrown	e2a8e9dcf6	tests: wait for complete rebuild in integrity checks 'check wait' seems a bit racy now. Wait for the array to be fully optimal before proceeding. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-15 11:07:27 +10:00
NeilBrown	ada38ebbcb	Grow: retry when writing 'reshape' to 'sync_action' is EBUSY. EBUSY can be returned if something has recently happened to cause md to want to check if recovery is needed, but hasn't had a chance yet. This can easily happen in testing. So retry a few times in that case. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-15 11:07:25 +10:00
NeilBrown	670fe20aa0	tests/05r6tor0: minor adjustments 1/ use correct data-offset for cmp - that has changed. 2/ flushbufs on the block device before reading to avoid cache issues Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-15 11:07:25 +10:00
NeilBrown	f7f6c9f161	tests: 05r6tor0 - add some more waiting. I don't really know why this is needed, but there is a delay between the reshape finishing and the level/etc changing. So add some sleeps. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-15 11:07:25 +10:00
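
Commit 5418499ae4 in the log above ("sysfs: reject reads that use the whole buffer") describes a small defensive pattern: if a read fills the entire buffer, some data may have been cut off and there is no room left for the terminating '\0', so the read is treated as an error. The following is a minimal C sketch of that idea only, not the code from the commit. The helper name `read_attr` and the choice of `ENOSPC` as the reported error are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not mdadm's actual function): read a short text
 * attribute from fd into buf and NUL-terminate it.
 * Returns the number of bytes read, or -1 on error.
 */
static int read_attr(int fd, char *buf, size_t len)
{
	ssize_t n = read(fd, buf, len);

	if (n < 0)
		return -1;		/* read() failed; errno is already set */
	if ((size_t)n >= len) {
		/*
		 * The read used the whole buffer: data may have been
		 * truncated, and writing '\0' at buf[n] would land past
		 * the end of the buffer, so report an error instead.
		 */
		errno = ENOSPC;		/* assumption: error code is illustrative */
		return -1;
	}
	buf[n] = '\0';			/* safe: n < len */
	return (int)n;
}
```

A caller would pass a buffer comfortably larger than any expected attribute value, so that the error path signals a genuinely unexpected situation rather than a routine one.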


3106 Commits (All Branches)