mdadm

Author	SHA1	Message	Date
Adam Kwolek	6a75c8ca79	imsm: FIX: use md position to reshape restart When reshape is broken it can occur that metadata is not saved properly. This can cause that reshape process is farther in md than metadata states. On restart save checkpoint to store current position /probably farther/ that can be read from md. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-09 12:36:42 +11:00
Adam Kwolek	f93346ef07	FIX: use md position to reshape restart When reshape is broken, it can occur that metadata is not saved properly. This can cause that reshape process is farther in md than metadata states. On reshape restart use md position as start position, if it is farther than position specified in metadata. Opposite situation treat as error. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-09 12:36:41 +11:00
Adam Kwolek	e1742195ff	imsm: FIX: Chunk size migration problem When chunk size migration occurs (e.g. 128k->4k) first checkpoint cannot be set in md due to too small step. Correct migration record initialization to allow whole copy area usage and increase migration checkpoint step. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-09 12:20:52 +11:00
Adam Kwolek	78340e26a5	Flush mdmon before next reshape step during container operation Using takeover operation for grow purposes, mdadm has to be sure that mdmon processes all updates, and if necessary it will be closed at takeover to raid0 operation. If mdmon is late, next array in container is processed and due to race condition mdmon closes itself instead to monitor next reshape operation. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-09 12:20:52 +11:00
Adam Kwolek	50927b1323	Fix: Sometimes mdmon throws core dump during reshape Problem was found during reshaping 2 volumes /raid0 and raid5/ in container. Sometimes mdmon throws core dump due to NULL pointer exception. Problem occurs in scenario: - managemon: is about spare activation (degraded raid4 volume == raid0 under takeover) - managemon: detect level change and signals monitor (manage_member() calls replace_array()) - monitor: detects transition raid4/5->raid0 and sets a->container to NULL to indicate array deactivation - managemon : continues his work and tries to activate spare (a->check_degraded is set). NULL pointer is passed to metadata handler activate_spare() Core dump is generated. To resolve this situation managemon (after monitor kick) checks again a->container pointer to learn if current array is not to be deactivated. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-09 12:20:52 +11:00
Adam Kwolek	bf5cf7c705	imsm: FIX: imsm_get_allowed_degradation() doesn't count degradation for raid1 Missing case raid1 added to function. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-09 12:20:51 +11:00
Adam Kwolek	111e9fdaa8	FIX: Array is not run when expansion disks are added When added disk is disk added by expansion and this is last disk added to array, assemble_container_content() will not even try to run such array. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-09 12:20:51 +11:00
Adam Kwolek	d2bde6d3aa	imsm: FIX: No new missing disks are allowed during general migration When during incremental assembly general migration is in progress, starting degraded array causes that no more disks (even present) can be added later as array is already started. Request all previously present disks during general migration for assembly. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-09 12:20:51 +11:00
Adam Kwolek	92d49ecfaa	FIX: NULL pointer to strdup() can be passed When result from strchr() is NULL and it is assigned to subarray, NULL pointer can be passed to strdup() function and coredump file is generated. Subarray is checked for NULL pointer, so it is assumed that it can be NULL at this moment. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-09 12:20:51 +11:00
NeilBrown	b10c663ea9	config: fix handling of 'homehost' in AUTO line. We should be looking for the policy "auto=homehost", not "auto=auto". Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-09 11:44:38 +11:00
NeilBrown	da8fe5aa9b	Assemble: fix --force assemble during reshape. If we have to --force assembly during reshape, we need to check by the 'before' and 'after' cases to make sure there are enough devices. Reported-by: Richard Herd <2001oddity@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-07 14:06:44 +11:00
NeilBrown	de5a472ea3	Remove avail_disks arg from 'enough'. It can easily be calculated from 'avail' and 'raid_disks', and we will soon have a case where we don't have it easily available to pass in. Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-07 14:04:47 +11:00
NeilBrown	6ef89052d8	super1: make aread/awrite always use an aligned buffer. A recent change to write_bitmap1 meant awrite would sometimes write from a non-aligned buffer, which of course breaks. So change awrite (and aread) to always use their own aligned buffer to ensure safety. Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-07 11:55:18 +11:00
Alexander Lyakas	d59770567c	getinfo_super1: Use MaxSector in place of sb->size when deciding whether the array is clean or dirty, compare sb->resync_offset against MaxSector and not against sb->size With RAID6 resyncing and subsequent drive failures, it is possible to reach the case in which sb->resync_offset==sb->size. This happens when resync is aborted due to drive failures, and immediately a rebuild of a spare starts. In this case, mdadm considered the array clean, while the kernel considered the array dirty. It is better for mdadm also to consider the array as dirty in this case. Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-07 10:15:20 +11:00
Labun, Marcin	f878b24226	imsm: fix, the second array need to have the whole available space on devices Fix the case with creating an array with given container in command line instead of real devices: mdadm -CR /dev/md/raid0 -l 0 -n 2 -z5G /dev/md/imsm Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 12:21:13 +11:00
Jes Sorensen	a0963a86e1	Spawn mdmon with --offroot if mdadm was launched with --offroot Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 12:11:29 +11:00
Jes Sorensen	da827518c1	Add --offroot argument to mdmon Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 12:11:21 +11:00
Jes Sorensen	08ca2adfff	Add --offroot argument to mdadm When --offroot is specified, mdadm will change the first character of argv[0] to '@'. This is used to signal to systemd that mdadm was launched from initramfs and should not be shut down before returning to the initramfs. Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 12:11:16 +11:00
Jes Sorensen	eb155f6de0	mdmon: Use getopt_long() to parse command line options This changes mdmon over to use getopt_long() for option parsing, making it easier to add new options. In addition this patch introduces a short version -t for --takeover and adds -h/--help. Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 12:11:07 +11:00
Labun, Marcin	30602f533f	imsm: display fd in error trace when store_imsm_mpb fails Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 12:00:43 +11:00
Labun, Marcin	ca9de185a3	imsm: validate the number of imsm volumes per controller IMSM OROM limits number of volumes per controller. Volumes above the limit are blocked in OROM. mdadm should follow OROM limitations in this area. Therefore we need to count number of volumes on the devices attached to SATA (ahci driver) or SAS (isci) controller. Adding a new volume must be blocked if the number of volumes on devices attached to the given controller is exceeded. Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 12:00:10 +11:00
Labun, Marcin	ec50f7b6bb	imsm: load_imsm_super_all supports loading metadata from the device list This option is going to be used to load and analyse the metadata from devices. This is needed to count the number of volumes on devices attached to particular Intel controller (SATA or SAS). It shall be done without activation of container and volumes on the devices. Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 11:57:23 +11:00
Labun, Marcin	9587c3739b	imsm: load_super_imsm_all function refactoring Prepare function for subsequent changes related to loading metadata from devices list. Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 11:56:58 +11:00
Adam Kwolek	59ab9f54a0	FIX: Typo error in fprint command Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 11:36:25 +11:00
Adam Kwolek	3c20f9899b	FIX: mdmon check in reshape_container() can cause a problem When raid0 reshape is executed mdmon can disappear due to raid level takeover operation. If this happens before mdmon check, mdadm would treat it as error condition. It is not true for this case. Remove mdmon check from reshape_container() function. Error condition check will remain using reshape_array() reentry test for the same array (line 2577). Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 11:36:25 +11:00
Adam Kwolek	5d1c7cdaca	FIX: External metadata sometimes is not updated External metadata sometimes is not updated. It can be observed during 2 raid0 arrays Capacity Expansion. New array size is not set, because metadata is not updated and on the reshape end mdadm doesn't read new array size from metadata. This happens when mdmon finishes his work (due to takeover to raid0), before all metadata updates are processed. Make sure that all updates are flushed to disk before executing takeover. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 11:36:25 +11:00
Lukasz Dorau	0c4304ca8b	fix: container creation with --incremental used. If there is no name provided for a container by the metadata it is always appropriate to use the metadata version name. create_mddev will still add a uniquifying digit to the end so there is little risk of confusion. This makes the --incremental code behave the same as the --assemble code. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-12 10:57:20 +11:00
Jes Sorensen	90fa1a2929	Work around gcc-4.7's strict aliasing checks Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-12 10:41:49 +11:00
Lukasz Dorau	8453f8d0df	fix: Monitor sometimes crashes The "char cnt [40]" buffer is sometimes too small to hold all message - in such case monitor crashes. The buffer must be larger to be able to hold all message. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-12 10:40:00 +11:00
NeilBrown	68226a8081	monitor: ensure we retry soon when 'remove' fails. If a 'remove' fails there is no certainty that another event will happen soon, so make sure we retry soon anyway. Reported-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-03 11:39:59 +11:00
NeilBrown	77b3ac8c65	monitor: make return from read_and_act more symbolic. Rather than just a number, use a named flag. This makes the code easier to understand and allows room for returning more flags later. Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-03 11:18:59 +11:00
NeilBrown	1fbc5b7a5e	Release mdadm-3.2.3 Bug fix release.	2011-12-23 14:32:44 +11:00
NeilBrown	c0c1acd691	Grow/bitmap: support adding bitmap via sysfs. Adding a bitmap via ioctl can only add it at a fixed location. That location is not suitable for 4K-block devices. So allow setting the bitmap location via sysfs if kernel supports it and aim to always use 4K alignments. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 14:10:41 +11:00
NeilBrown	b6db6fab11	super1: use awrite when writing a new bitmap. This ensures it will succeed on 4K block devices like DASD. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 14:09:56 +11:00
NeilBrown	a7322ae12f	Subject: bitmap: used 4K aligned buffers when reading the bitmap. This is needed on 4K block devices such as DASD as we do O_DIRECT reads. Do the same alignment for gpt just to be safe. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 14:09:29 +11:00
NeilBrown	adbb382b55	super1 - fix for bigendian machines. devflags is a single byte so endian conversions are not wanted. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 14:07:47 +11:00
NeilBrown	887162637f	Assemble: fix count in "assembled with .. but not started". We need to include the count of pre-existing devices here. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:49:07 +11:00
NeilBrown	576d028002	Assemble: make some plurals conditional. "1 devices" is ugly. Fix it. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:49:07 +11:00
NeilBrown	a408d66c4f	Incremental: make sure container name appears in /dev We need to send a "change" event just like we do when creating an array. This reverts commit `382afe49b1` The problem is that we need udev to create the file in /dev for us. It might be unnecessary for udev to consider assembling things in this array, but it shouldn't cause a problem. If it did that would be a different bug which we probably need locking to fix. Or maybe udev shouldn't trigger a "-I" for containers appearing. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:47:07 +11:00
NeilBrown	6c9a5fa103	bitmap: print correct percentage of bitmap in use. We were adding 1, presumably to avoid div-by-zero possibilities. It is better to only substitute '1' if the value actually is zero, else for small numbers of bits the difference is visible. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 09:07:14 +11:00
NeilBrown	7b80ad6af2	ddf: fix auto-allocation problem. We need to check every disk to see if it has a particular amount of space spare, not just every subsequent disk. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 09:04:59 +11:00
NeilBrown	ba3903d416	allow --monitor to be run by non-privileged user. Providing --no-sharing is given, monitoring should be permitted. Unfortunately the kernel currently rejects GET_DISK_INFO and GET_ARRAY_INFO for non-privileged users which is unjustified. The info is available in sysfs and we could get it from there, but for now, require the kernel to be fixed. Reported-by: Raphael Hertzog <hertzog@debian.org> Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 07:42:23 +11:00
Adam Kwolek	3e9df86add	FIX: Verify if array name doesn't exist already When e.g. array name (an) is correct and it is the same as container name (cn), file element creation /dev/md/an will replace /dev/md/cn. This can cause that user cannot access container using /dev/md/cn. Verify during array creation if chosen name is not already existing one. [Changed to use map_by_name() rather than stat() to determine prior existence - NeilBrown] Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 07:13:55 +11:00
NeilBrown	24daa16fa1	Grow.c: fix lots of white-space issues. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 06:59:51 +11:00
NeilBrown	ce4783d3d6	Grow: fix reshape-array for shrinking reshapes. The value in info->array.raid_disks is the total number of devices, which is the 'after' number when the number is increasing, and the 'before' number when the number is decreasing. The code currently assumes it is always the 'after' number - so fix that. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 06:59:48 +11:00
NeilBrown	27a1e5b5a4	Grow: fix start_reshape for shrinking arrays. When an array is being reshaped to fewer data devices the relationship between sync_max and reshape_progress is different to when the number of devices increases - we need to allow for that when setting sync_max/sync_min. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 06:59:45 +11:00
NeilBrown	0b50d3359a	tests/03r5assemV1 - slow down recovery a bit more. In this test we stop and restart an array that is being recovered, and we do that several times. Sometimes it completes the recovery before we expect and test results are wrong. So slow it down a bit more. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 06:41:52 +11:00
NeilBrown	cb0997242c	super1: getinfo_super should set write-mostly flag. Otherwise it is not preserved when you re-add a device to an array. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-20 15:01:53 +11:00
Lukasz Dorau	b307134253	imsm: make messages more understandable The printed messages should be more appropriate and understandable for user. If maxsize is equal 0, this means there is no free space left on device. If size is greater than maxsize, this means there is not enough space to create a new volume of given size. Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-20 10:37:01 +11:00
Williams, Dan J	5fe62b9455	imsm: FIX: UT '08imsm-overlap' fails Make test for all sub arrays having the same number of devices dependent on the option ROM requirements being checked. 08imsm-overlap disables the OROM check but then fails because this test causes it to. Reported-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-20 10:36:21 +11:00

2570 Commits