mdadm

Author	SHA1	Message	Date
NeilBrown	efc67e8e9f	New function: sysfs_wait We have several places that wait for activity on a sysfs file. Combine most of these into a single 'sysfs_wait' function. Signed-off-by: NeilBrown <neilb@suse.de>	2013-07-01 13:28:13 +10:00
NeilBrown	a2836f12c4	revert-reshape: make sure reshape_position is acceptable. We can only revert a reshape if the reshape_position aligns properly for the old geometry. If it doesn't we just fail for now. Also fix a +/- error with updating raid_disks for super1.c Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-27 16:38:53 +10:00
NeilBrown	240cb5a41a	tests/raid6repair: default data offset has changed. So the test scripts must change too. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-27 14:29:18 +10:00
NeilBrown	8e4a42719c	"make test" should build "raid6check" As there are selftests for raid6check. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-27 14:09:48 +10:00
NeilBrown	9b6bf8aa54	Assemble: remove some stray tracing. Was introduced in: Assemble: when forcing a single-degraded RAID6 array, trigger a 'repair'. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-27 14:07:38 +10:00
NeilBrown	dfa4d769f0	Grow: fix crash when restarting an array. After the 'started' label it is assumed that 'sra' is set, so better set it when jumping there. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-27 13:10:44 +10:00
NeilBrown	6a23fb9d0d	Grow: lack of head/tail space not fatal for RAID5 etc. For RAID10, we must have head/tail space for reshape. For RAID4/5/6 we can use a spare or a backup file. So make that distinction. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-27 12:58:16 +10:00
NeilBrown	a73b00811c	Grow: report better message when --grow --chunk cannot work. When changing the chunksize of an array, the new chunksize must divide the device size. If it doesn't we report a very brief message. Make this message a bit longer and suggest a way forward by reducing the size of the array. Reported-by: Mark Knecht <markknecht@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-27 10:12:31 +10:00
NeilBrown	b379508eeb	Makefile/version: use version/date from .git if possible. If being built from a git tree, use the version and date information from the top commit rather than the hard-coded values. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-25 16:27:05 +10:00
NeilBrown	399e0b9709	Make wait_for and open_dev_excl faster. When we create or assemble an array, we wait for udev to create the device file in /dev so that as soon as mdadm completes, the device can be used. This waiting is performed in multiples of 200ms, which can sometimes be too long to wait. So change to an exponential backoff. Wait 1, then 2, then 4 msec etc. Once we get to 256msec, stop backing off and continue waiting 256ms at a time until we reach the limit which is now 4.608sec rather than 5sec which it was before. Ditto for open_dev_excl. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-25 15:56:22 +10:00
NeilBrown	8010806bab	tests: add device size tests when changing raid level to/from 0. There was a kernel bug that got this wrong, so better check for it. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-25 15:54:44 +10:00
NeilBrown	dea3786ae2	Grow: fix bug in raid0 -> raid5 conversion. The moment we change a RAID0 to a RAID5 it will try to recover. This will abort quite quickly as there are no spare devices, but it could confuse the attempt to freeze the array. So allow 'freeze' to work even on a recovering array. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-25 15:52:58 +10:00
NeilBrown	688eb823bc	Make: CXFLAGS should be conditionally assigned. As the Makefile encourages users to set CXFLAGS for extra flags, we should only conditionally set it. That way it can be over-ridden in the environment as well as on the command line. Suggested-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 16:59:37 +10:00
mwilck@arcor.de	41a663b267	Detail: deterministic ordering in --brief --verbose Have mdadm --Detail --brief --verbose print the list of devices in alphabetical order. This is useful for debugging purposes. E.g. the test script 10ddf-create compares the output of two mdadm -Dbv calls which may be different if the order is not deterministic. (I confess: I use a modified "test" script that always runs "mdadm --verbose" rather than "mdadm --quiet", otherwise this wouldn't happen in 10ddf-create). Signed-off-by: Martin Wilck <mwilck@arcor.de> Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 16:56:47 +10:00
NeilBrown	0ddc35beed	super1: fix space_{before,after} for RAID0. For RAID0 we need to use 'data_size', not 'size', as the latter is 0. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 16:24:08 +10:00
NeilBrown	1c8b90df15	Grow: allow "--add" with "--grow --level=??" This is useful for reshaping a RAID0 to a higher level. The recovery will happen at the same time as the reshape. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 16:13:00 +10:00
NeilBrown	e5ba75ce03	Grow: choose default layout when converting from RAID0. If we don't do this explicitly, we end up keeping the "current" layout, which is meaningless for RAID0. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 16:06:21 +10:00
NeilBrown	35698c6e91	tests: add test for converting levels to raid0 and back. Now that I have this mostly working, I should make sure it doesn't break... Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 15:57:58 +10:00
NeilBrown	9ccfd3be30	test/00names: use appropriate mdadm.conf Using non-numeric names needs an mdadm.conf setting, so make sure we have one. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 15:48:40 +10:00
NeilBrown	97e3a6a0e0	Grow: centralise level-change code. There are now 3 places which change level. And they all do it slightly differently with different messages etc. Make a single function for this and use it. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 15:27:07 +10:00
NeilBrown	6fb8746e4a	Grow: remove excess drives when converting to RAID0. When converting to RAID0, all spares and non-data drives need to be removed first. It is possible that the first HOT_REMOVE_DISK will fail because the personality hasn't let go of it yet, so retry a few times. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 14:08:41 +10:00
NeilBrown	9030d55ff2	Grow: clear new_layout when we change the level. After changing the level, the meaning of layout numbers changes, so keeping a new_layout value around can cause later confusion. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 13:08:13 +10:00
NeilBrown	ddbf2ebb0e	Grow: analyse_change needs to set new_size even if nothing much is happening. This means it will be set for a "--data-offset" only reshape so that case doesn't complain that the array is getting smaller. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 13:06:32 +10:00
NeilBrown	b397d7f3e0	Grow: fix two problems with new_data_offset 1/ ignore failed devices - obviously 2/ We need to tell the kernel which direction the reshape should progress even if we didn't choose the particular data_offset to use. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 13:04:38 +10:00
NeilBrown	a6a78630ac	Grow: Try hard to set new_offset. Setting new_offset can fail if the v1.x "data_size" is too small. So if that happens, try increasing it first by writing "0". That can fail on spare devices due to a kernel bug, so if it doesn't work, try writing the correct number of sectors. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 13:02:35 +10:00
NeilBrown	534f543296	Grow: Make sure new data-offset is well-aligned. If we choose a new data-offset, make sure it is rounded to the largest power of two possible, up to 1Meg. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-24 12:55:41 +10:00
NeilBrown	e09233d048	Grow: a data_offset should not be tested against 0. It should always be tested against INVALID_SECTORS!!! Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 16:55:35 +10:00
NeilBrown	97882bc806	tests: add test for non-numeric device names Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 16:44:18 +10:00
NeilBrown	71417de6fe	Add test for interaction of --assemble with --incr and fix the bug that it found. The refactor of start_array() missed a test. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 16:34:47 +10:00
NeilBrown	ccec2685ab	Add test for --update=metadata and fix bug it found. We were not setting device size correctly for raid0. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 16:28:05 +10:00
NeilBrown	033e098c07	tests: rearrange some test groupings. All 'update' tests in 04. More imsm tests in 09. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 13:46:53 +10:00
NeilBrown	1011e8344a	Remove lots of unnecessary white space. Now that I am using white-space mode in Emacs I can see all of this, and I don't like it :-) Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 12:31:45 +10:00
NeilBrown	e6dd89da86	Manage: allow "--stop" on kernel names. e.g. mdadm --stop md4 This works even if udev has become confused or killed. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 11:39:14 +10:00
NeilBrown	fe7e0e64b0	Manage: split Manage_runstop into Manage_run and Manage_stop The two branches have virtually nothing in common, so it is simpler if they are separate. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 11:23:44 +10:00
NeilBrown	8cde842b18	Assemble: when forcing a single-degraded RAID6 array, trigger a 'repair'. When an active/degraded RAID6 array is force-started we clear the 'active' flag, but it is still possible that some parity is not in sync. This is because there are two parity blocks. It would be nice to be able to tell the kernel "P is OK, Q maybe not". But that is not possible. So when we force-assemble such an array, trigger a 'repair' to fix up any errant Q blocks. This is not ideal as a restart during the repair will not be continued after the restart, but it is the best we can do without kernel help. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 11:09:33 +10:00
NeilBrown	54def20f8b	Detail: add device information to --detail --export We may well want more per-device information here, but this is a start. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 10:39:36 +10:00
NeilBrown	64e103fe19	sysfs_read: return devices in same order as in filesystem. When we read devices from sysfs (../md/dev-*), store them in the same order that they appear. That makes more sense when exposed to a human (as the next patch will). Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 10:33:47 +10:00
Bernd Schubert	2161adce8f	raid6check: Check return value of lseek64() If lseek64() failed it was still writing to the disks, which would introduce data corruption. Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm> Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 10:05:38 +10:00
Bernd Schubert	2c7b668df7	raid6check: Fix compiler warnings. Fix some compiler warnings appearing with optimization levels. Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm> Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 10:04:43 +10:00
Bernd Schubert	635b5861c3	raid6check: Use enums for repair type Using hard coded numbers is error prone and hard to read by humans. Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm> Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 10:04:18 +10:00
Bernd Schubert	3a89d75488	raid6check: Fix memory leaks detected by valgrind ==2389947== 24 bytes in 1 blocks are definitely lost in loss record 1 of 10 ==2389947== at 0x4C2B3F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==2389947== by 0x408067: xmalloc (xmalloc.c:36) ==2389947== by 0x401B19: check_stripes (raid6check.c:151) ==2389947== by 0x4030C6: main (raid6check.c:521) ==2389947== ==2389947== 24 bytes in 1 blocks are definitely lost in loss record 2 of 10 ==2389947== at 0x4C2B3F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==2389947== by 0x408067: xmalloc (xmalloc.c:36) ==2389947== by 0x401B67: check_stripes (raid6check.c:155) ==2389947== by 0x4030C6: main (raid6check.c:521) ==2389947== Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm> Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 10:03:44 +10:00
Bernd Schubert	f8fcf7a1c5	raid6check: Fix build of raid6check After recent git pull 'make raid6check' did not work anymore, as sysfs_read() was called with a wrong argument and as check_env() was used by use_udev(), but not defined. Replace sysfs_read(..., -1, ...) by sysfs_read(..., NULL, ...) Move check_env() from util.c to lib.c Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 10:03:12 +10:00
NeilBrown	7506f86012	Makefile: add "-O3" to WARN_UNUSED options. This finds more errors Also remove some trailing spaces. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 10:02:54 +10:00
NeilBrown	c0f0d8128a	Grow: fix up recent changes to set_new_data_offset. The second 'info2' wasn't being initialised. So don't use it. Reported by -O3 Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 09:58:02 +10:00
NeilBrown	f69bb60857	super0: set uninitialized variable. Reported by -O3 Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 09:51:01 +10:00
NeilBrown	f80057aec5	Assemble/Incr: Don't include spares with too-high event count. Some failure scenarios can leave a spare with a higher event count than an in-sync device. Assembling an array like this will confuse the kernel. So detect spares with event counts higher than the best non-spare event count and exclude them from the array. Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-17 16:55:31 +10:00
NeilBrown	e2f408a4c0	mdadm.h: add little bits of doco for 'struct superswitch'. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-17 16:04:59 +10:00
NeilBrown	a7dec3fd92	Make sure NOFILE resource limit is big enough. Some people want to create truly enormous arrays. As we sometimes need to hold one file descriptor for each device, this can hit the NOFILE limit. So raise the limit if it ever looks like it might be a problem. Signed-off-by: NeilBrown <neilb@suse.de>	2013-05-30 14:31:09 +10:00
NeilBrown	041b815f17	Incremental: allow --quiet to silence errors from "-If". -q is currently ineffective on "mdadm -If". Messages that are not usage errors should be suppressed. Signed-off-by: NeilBrown <neilb@suse.de>	2013-05-29 09:13:25 +10:00
NeilBrown	8ecf12b9f8	Grow_continue: handle RESHAPE_NO_BACKUP correctly. If the reshape does not require a backup, Grow_continue can abort early. Signed-off-by: NeilBrown <neilb@suse.de>	2013-05-28 16:58:18 +10:00

2936 Commits
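
The capped exponential backoff described in commit 399e0b9709 (wait 1, 2, 4 msec and so on, then hold at 256 msec) can be illustrated with a short Python sketch. This is not the mdadm code, which is written in C. The names `backoff_delays` and `wait_until` are hypothetical, and the attempt count of 25 is an assumption chosen so the worst-case total lands near the 4.608 sec limit the commit message mentions.

```python
import time


def backoff_delays(first_ms=1, cap_ms=256, attempts=25):
    """Yield one delay (in ms) per attempt: doubling from first_ms, then held at cap_ms.

    attempts=25 is an assumption for illustration, not a value taken from mdadm.
    """
    delay = first_ms
    for _ in range(attempts):
        yield delay
        if delay < cap_ms:
            delay *= 2


def wait_until(ready, sleep=time.sleep):
    """Poll ready() using the backoff schedule; return True as soon as it succeeds."""
    for delay_ms in backoff_delays():
        if ready():
            return True
        sleep(delay_ms / 1000.0)
    # One final check after the last sleep.
    return ready()


schedule = list(backoff_delays())
print(schedule[:10])   # [1, 2, 4, 8, 16, 32, 64, 128, 256, 256]
print(sum(schedule))   # 4607 (ms slept in the worst case)
```

Compared with sleeping in fixed 200 ms steps, a condition that becomes true after a few milliseconds is noticed almost immediately, while the worst-case wait stays close to the previous 5 sec.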