mdadm

Commit Graph

Author	SHA1	Message	Date
Jes Sorensen	dae131379f	sysfs: Make sysfs_init() return an error code Rather than have the caller inspect the returned content, return an error code from sysfs_init(). In addition make all callers actually check it. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-30 16:52:37 -04:00
Jes Sorensen	49948a3561	Grow: Do not shadow an existing variable Declaring 'int rv' twice within the same function is asking for trouble. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-30 10:46:01 -04:00
Jes Sorensen	758b327cf5	Grow: Remove unnecessary optimization Per explanation by Neil, this optimization of writing "size" to the attribute of each device, however when reducing the size of devices, the size change isn't permitted until the array has been shrunk, so this will fail anyway. This effectively reverts `65a9798b58` Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-30 10:44:36 -04:00
Jes Sorensen	018a488238	util: Introduce md_set_array_info() Switch from using ioctl(SET_ARRAY_INFO) to using md_set_array_info() Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 15:43:53 -04:00
Jes Sorensen	d97572f5a5	util: Introduce md_get_disk_info() This removes all the inline ioctl calls for GET_DISK_INFO, allowing us to switch to sysfs in one place, and improves type checking. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 15:23:50 -04:00
Jes Sorensen	9cd39f0155	util: Introduce md_get_array_info() Remove most direct ioctl calls for GET_ARRAY_INFO, except for one, which will be addressed in the next patch. This is the start of the effort to clean up the use of ioctl calls and introduce a more structured API, which will use sysfs and fall back to ioctl for backup. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 14:35:41 -04:00
Jes Sorensen	6ebf34e6bd	Grow: Fixup a pile of cosmetic issues No code change, simply cleanup ugliness. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 12:15:20 -04:00
Artur Paszkiewicz	860f11ed4d	Grow: support consistency policy change Extend the --consistency-policy parameter to work also in Grow mode. Using it changes the currently active consistency policy in the kernel driver and updates the metadata to make this change permanent. Currently this supports only changing between "ppl" and "resync" policies, that is enabling or disabling PPL at runtime. Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 11:35:16 -04:00
Artur Paszkiewicz	e97a7cd011	super1: PPL support Enable creating and assembling raid5 arrays with PPL for 1.x metadata. When creating, reserve enough space for PPL and store its size and location in the superblock and set MD_FEATURE_PPL bit. Write an initial empty header in the PPL area on each device. PPL is stored in the metadata region reserved for internal write-intent bitmap, so don't allow using bitmap and PPL together. While at it, fix two endianness issues in write_empty_r5l_meta_block() and write_init_super1(). Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 11:33:52 -04:00
NeilBrown	1ab9ed2afb	Add 'force' flag to hot_remove_disk(). In rare circumstances, the short period that hot_remove_disk() waits isn't long enough for IO to complete. This particularly happens when a device is failing and many retries are still happening. We don't want to increase the normal wait time for "mdadm --remove" as that might be used just to test if a device is active or not, and a delay would be problematic. So allow "--force" to mean that mdadm should try extra hard for a --remove to complete, waiting up to 5 seconds. Note that this patch fixes a comment which claimed the previous wait time was half a second, whereas it was really 50msec. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-28 14:32:35 -04:00
NeilBrown	2dd271fe70	Retry HOT_REMOVE_DISK a few times. HOT_REMOVE_DISK can fail with EBUSY if there are outstanding IO requests that have not completed yet. It can sometimes be helpful to wait a little while for these to complete. We already do this in impose_level() when reshaping a device, but not in Manage.c in response to an explicit --remove request. So create hot_remove_disk() to centralize this code, and call it wherever it makes sense to wait for a HOT_REMOVE_DISK to succeed. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-28 14:25:23 -04:00
Tomasz Majchrzak	cf52eff58a	Increase buffer for sysfs disk state Bad block support has incremented sysfs disk state reported by kernel ("external_bbl") so it became longer than 20 bytes. It causes reshape to fail as it reads truncated entry from sysfs. Increase buffer so it can accommodate the string including all state values currently implemented in kernel at the same time. Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-11-17 09:46:42 -05:00
Mariusz Dabrowski	ddab63c7de	Allow level migration only for single-array container IMSM doesn't allow to change RAID level of array in container with two arrays but array count check is being done too late (after removing disks) and in some cases (e. g. RAID 0 and RAID 1 migrated to RAID 0) both arrays become degraded. This patch adds array count check before disks are being removed. Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-10-19 11:26:49 -04:00
Xiao Ni	8800f85381	MDADM:Check mdinfo->reshape_active more times before calling Grow_continue When reshaping a 3-drive raid5 to a 4-drive raid5, there is a chance that the reshape can't start. If the disks do not have enough space for relocating the data_offset, mdadm needs to call start_reshape and then have systemd run mdadm --grow --continue. But mdadm --grow --continue fails because it finds that info->reshape_active is 0. info->reshape_active is read from the superblock of the underlying devices. The function start_reshape writes "reshape" to /sys/../sync_action. If mdadm --grow --continue is called before the latest superblock has been written to the underlying devices, info->reshape_active may still be 0. We should wait longer for the superblock update before calling Grow_continue. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-06-16 13:53:45 -04:00
Mike Lovell	13db17bd1f	Use dev_t for devnm2devid and devid2devnm Commit `4dd2df0966` added a trip through makedev(), major(), and minor() for device major and minor numbers. This would cause mdadm to fail in operating on a device with a minor number bigger than (2^19)-1 due to it changing from dev_t to a signed int and back. The problem was found when an array was created with a device specified as a name like /dev/md/raidname and there were already 128 arrays on the system. In this case, mdadm would choose 1048575 ((2^20)-1) for the array and minor number. This would cause the major and minor number to become negative when generated from devnm2devid() and passed to major() and minor() in open_dev_excl(). open_dev_excl() would then call dev_open() which would detect the negative minor number and call open() on the char * containing the major:minor pair, which isn't a valid file. Signed-off-by: Mike Lovell <mlovell@bluehost.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-06-03 15:35:26 -04:00
Jes Sorensen	6ac963cef0	Grow: Apply some more consistent formatting to Grow_addbitmap() This should be purely cosmetic and cause no functional change ... famous last words! Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-05-12 15:27:24 -04:00
Jes Sorensen	4ed129aca7	Grow: Simplify error paths in Grow_addbitmap() This gets rid of some repeated exit paths, making the code a little cleaner. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-05-12 15:27:18 -04:00
Jes Sorensen	2ec2b7e9d5	mdadm: Make add_internal_bitmap() return 0 on success add_internal_bitmap() returned 1 on success and 0 on error which is inconsistent. This changes it to return 0 on success and use more reasonable error codes on error. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-05-12 15:19:16 -04:00
Jes Sorensen	c152f3610f	Grow: Handle failure to load superblock in Grow_addbitmap() Reported-by: Gioh Kim <gi-oh.kim@profitbricks.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-05-12 14:30:10 -04:00
Jes Sorensen	dac1b1115f	Grow: Grow_addbitmap() reduce indentation This makes the code a little more readable. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-05-12 14:27:11 -04:00
Guoqing Jiang	81306e021e	Change the option from NoUpdate to NodeNumUpdate Actually, we need to use NodeNumUpdate here to ensure there are enough spaces for those nodes. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-03-24 12:33:27 -04:00
Guoqing Jiang	31dbeda730	Grow: goto release if Manage_subdevs failed If failure happened when add disk to array by grow mode, need to goto release instead of continue the reshape. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-03-22 13:53:10 -04:00
Yi Zhang	a58e0da443	Grow: analyse_change add notification about only 2-device can be convert from RAID1 to RAID5 Notify "Can only convert a 2-device array to RAID5" instead of "Impossibly level change request for RAID1" when convert from RAID1 to RAID5 if the disk num is not equal two like RAID4/5->RAID1 did. Signed-off-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-03-11 12:40:47 -05:00
Pawel Baldysiak	ad2f464602	Grow: close fd earlier to avoid "cannot get excl access" when stopping If this file descriptor is not closed here, it remains open during reshape process and stopping process will end up with "cannot get exclusive access to container". Once this file descriptor is no longer needed - it can be closed. Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-03-11 12:32:31 -05:00
Jes Sorensen	efdfcc9e95	Grow: Grow_addbitmap(): Add check to quiet down static code checkers Grow_addbitmap() is only ever called with s->bitmap_file != NULL, but not all static code checkers catch this. This adds a check to quiet down the false positive warnings. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-03-09 11:35:34 -05:00
Jes Sorensen	12add44564	Grow: Grow_continue_command() remove dead code All cases where fd2 is used are completed with a close(fd2), so there is no need to set fd2 = -1 or check for it before exiting. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-03-09 11:35:34 -05:00
Jes Sorensen	bf08f6b1ef	Grow: Add documentation to abort_reshape() for suspend_{lo,hi} setting Add documentation for quirky reset procedure for resetting suspended region range. Suggested-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-02-22 09:43:43 -05:00
Artur Paszkiewicz	10df72a080	Grow: close file descriptor earlier to avoid "still in use" when stopping Close fd2 as soon as it is no longer needed, before calling Grow_continue(). Otherwise, we won't be able to stop an array with external metadata during reshape, because mdadm running in background will be keeping it open. Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-12-24 10:00:00 +11:00
Xiao Ni	f7cf9699dc	Check and remove bitmap first when reshape to raid0 If reshape one raid device with bitmap to raid0, the reshape progress will start. But it'll fail and lose some components. So it should remove bitmap first. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-12-22 15:16:08 +11:00
Guoqing Jiang	37d0ca9be6	mdadm: output info more precisely when change bitmap to none When changing the bitmap to none, the message can be made more accurate based on the existing bitmap type. Also, s->bitmap_file is passed from the command line as "--bitmap=TYPE", so remove s->bitmap_file from the error message, since it should mean that changing the bitmap to that type failed rather than that the type is already present. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-12-16 13:23:58 +11:00
Deepa Dinamani	26714713cd	mdadm: Change timestamps to unsigned data type. 32 bit signed timestamps will overflow in the year 2038. Change the user interface mdu_array_info_s structure timestamps: ctime and utime values used in ioctls GET_ARRAY_INFO and SET_ARRAY_INFO to unsigned int. This will extend the field to last until the year 2106. Add time_after/time_before and supporting typecheck from the kernel to take care of unsigned time wraparound. The long term plan is to get rid of ctime and utime values in this structure as this information can be read from the on-disk meta data directly. v0.90 on disk meta data uses u32 for maintaining time stamps. So this will also last until year 2106. Assumption is that the usage of v0.90 will be deprecated by year 2106. Timestamp fields in the on disk meta data for v1.0 version already use 64 bit data types. Signed-off-by: NeilBrown <neilb@suse.com>	2015-12-16 12:43:25 +11:00
Goldwyn Rodrigues	6d9c7c2551	Increment version for clustered bitmaps Add BITMAP_MAJOR_CLUSTERED as 5, in order to prevent older kernels to assemble a clustered device. In order to maximize compatibility, the major version is set to BITMAP_MAJOR_CLUSTERED only if the bitmap is clustered. Also, added MD_FEATURE_CLUSTERED in order to return error for older kernels which would assemble MD in case bitmap is corrupted. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-09-28 11:47:04 +10:00
NeilBrown	653299b699	Merge branch 'cluster' Now that 3.3.3 is out, it is time to include the cluster-support code. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-27 11:01:08 +10:00
NeilBrown	62844a4da6	Grow: remove stray tracing message. Signed-off-by: NeilBrow <neilb@suse.com>	2015-07-06 13:47:45 +10:00
NeilBrown	caf9ac0ca4	Grow: fix typo in comment Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-18 15:51:45 +10:00
Guoqing Jiang	0aa2f15b20	mdadm: add the ability to change cluster name To support changing the cluster name, this commit does the following: 1. extend the original write_bitmap function for the new scenario. 2. add the scenario to handle modification of the cluster's name in write_bitmap1. 3. let the cluster name also show in examine_super1 and detail_super1 Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:33:39 +10:00
Guoqing Jiang	7c25f4d706	Convert a bitmap=none device to clustered This adds the ability to convert a regular md without bitmap (--bitmap=none) to a clustered device (--bitmap=clustered). To convert a device with --bitmap=internal or --bitmap=external, you have to convert to --bitmap=none and then re-execute the command with --bitmap=clustered. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:24:41 +10:00
NeilBrown	2a6493cfe1	Grow: fix a couple of typos. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-28 17:21:06 +10:00
NeilBrown	8e7ddc5f50	Grow: fix problem with --grow --continue If an array is being reshaped using backup space on a 'spare' device, then mdadm --grow --continue won't find it as by the time it runs, nothing looks like a spare any more. The spare has been added to the array, but has no data yet. So allow reshape_prepare_fdlist to find a newly-incorporated spare and report this so it can be used. Reported-by: Xiao Ni <xni@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-28 16:43:15 +10:00
NeilBrown	e0cc1c8d8b	Grow: another attempt to fix stop-during-reshape race. When the array is stopped during a critical section, we sometimes erase the backup, which is bad. This happens when 'completed' is zero. This can happen easily when 'stop' freezes reshape. So try to be more careful and check 'reshape_position'. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-25 16:33:45 +10:00
NeilBrown	3ee556f8b6	Grow: be even more careful about handling a '0' completed value. Some old kernels set 'completed' to '0' too soon. But modern kernels don't. And when 'mdadm --stop' freezes and resumes the grow, 'completed' goes back to zero briefly, which can confuse this logic. So only think '0' might be wrong from an old kernel when the reshape has gone idle. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-15 15:11:48 +10:00
NeilBrown	ada38ebbcb	Grow: retry when writing 'reshape' to 'sync_action' is EBUSY. EBUSY can be returned if something has recently happened to cause md to want to check if recovery is needed, but hasn't had a chance yet. This can easily happen in testing. So retry a few times in that case. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-15 11:07:25 +10:00
NeilBrown	e0184a0cd0	Grow: be more careful if array is stopped during critical section. In that case, updating 'completed' to 'max_progress' is wrong. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-15 11:07:25 +10:00
NeilBrown	a5a6a7d9fa	Grow: add missing space in message. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-15 11:07:25 +10:00
NeilBrown	dd243f561f	Grow: only warn about incompatible metadata when no fallback available. We might be trying to set_new_data_offset() for RAID10, when it is a necessary requirement, or for RAID5 where it is optional. In the latter case, a message about metadata versions is not helpful. Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-14 11:17:39 +10:00
NeilBrown	783bbc2b13	reshape: support raid5 grow on certain older kernels. Kernels between c6563a8c38fde3c1c7fc925a v3.5-rc1~110^2~53 and b5254dd5fdd9abcacadb5101 v3.5-rc1~110^2~51 allow new_offset to be set, but don't then allow a RAID5 to be reshaped to change that offset. Due to selective backports, this includes the SLES11-SP3 kernel. It is quite easy to handle this case in mdadm, so we do. Specifically: if the reshape with data-offset fails with EINVAL, abort the data-offset change and try the "old" way. Signed-off-by: NeilBrown <neilb@suse.de>	2015-03-26 10:06:26 +11:00
Jes Sorensen	9eb5ce5ae2	Grow.c: Fix classic readlink() buffer overflow The buffer passed on to readlink() needs to contain space for the terminating \0. See 'man 3 readlink' for details. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-25 08:06:45 +11:00
NeilBrown	7a862a020f	Don't break long strings onto multiple lines. It is best to keep strings all together so that they are easier to search for in the source code. If a string is so long that it looks ugly on one line, then maybe it should be broken into multiple lines for display too. Only strings which contain a newline can be broken into multiple lines: "It is OK to\n" "break this string\n" Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-12 13:46:53 +11:00
NeilBrown	1ade5cc15a	Consistently print program Name and __func__ in debug messages. Make dprintf() print program name and __func__, so that this messaging is consistent. Also remove all __func__ messages from pr_err(). We shouldn't leak that internal data in error messages. If we really want the function name there, a new pr_XXX might be wanted. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-12 13:21:17 +11:00
Pawel Baldysiak	d56dd607ba	Change way of printing name of a process Sometimes mdadm prints messages with wrong name "mdmon", and vice versa. This patch solves this problem by changing method of determining process name. Now "Name" will be set in const at start of a program, previously was hardcoded as #define. Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-12 12:11:01 +11:00
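
Commit 26714713cd above mentions adding time_after/time_before to cope with wraparound of unsigned timestamps. The following is a minimal sketch of that idea for 32-bit values, not the code from the commit: the names `ts_after`, `ts_before` and `ts_selfcheck` are hypothetical, and the typecheck helper the commit refers to is omitted. It assumes the usual two's-complement conversion from `uint32_t` to `int32_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): true if timestamp a is later
 * than timestamp b. The subtraction is done in unsigned arithmetic, which
 * wraps modulo 2^32, and the result is reinterpreted as signed. This gives
 * the right answer as long as the two values are less than 2^31 seconds
 * apart, even if the counter wrapped between them.
 */
static inline bool ts_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Hypothetical helper: true if a is earlier than b. */
static inline bool ts_before(uint32_t a, uint32_t b)
{
	return ts_after(b, a);
}

/* Small self-check showing the behaviour around the wrap point. */
static inline void ts_selfcheck(void)
{
	assert(ts_after(5u, 3u));           /* ordinary case */
	assert(!ts_after(3u, 5u));
	assert(!ts_after(7u, 7u));          /* equal is not "after" */
	assert(ts_after(2u, 0xFFFFFFF0u));  /* 2 is just past the wrap */
	assert(ts_before(0xFFFFFFF0u, 2u));
}
```

A plain `a > b` comparison would report that 2 is earlier than 0xFFFFFFF0, which is wrong once the counter has wrapped. The subtraction-based form avoids that, at the cost of only being meaningful for timestamps within half the counter range of each other.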

432 Commits