mdadm

Commit Graph

Author	SHA1	Message	Date
Zhilong Liu	9e04ac1c43	mdadm/util: unify stat checking blkdev into function Declare function stat_is_blkdev() to consolidate the repeated stat-based block device checks. It returns 'true/1' when devname is a block device, and returns 'false/0' when it isn't. The devname parameter is required; rdev is optional: if the dev_t *rdev pointer is non-NULL, the device number is stored in *rdev, and if it is NULL, it is ignored. Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-05 11:05:32 -04:00
Zhilong Liu	0a6bff09d4	mdadm/util: unify fstat checking blkdev into function Declare function fstat_is_blkdev() to consolidate the repeated fstat-based block device checks. It returns true/1 when fd refers to a block device, and returns false/0 when it doesn't. The fd and devname parameters are required; rdev is optional: if the dev_t *rdev pointer is non-NULL, the device number is stored in *rdev, and if it is NULL, it is ignored. Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-05 11:04:02 -04:00
Jes Sorensen	9db2ab4e9b	util: md_array_valid(): Introduce md_array_valid() helper Using md_get_array_info() to determine if an array is valid is broken during creation, since the ioctl() returns -ENODEV if the device is valid but not active. Where did I leave my stash of brown paper bags? Fixes: ("40b054e mdopen/open_mddev: Use md_get_array_info() to determine valid array") Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-03 16:15:16 -04:00
Jes Sorensen	44356754ec	util: Get rid of unused enough_fd() enough_fd() is no longer used, so let's get rid of it. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-04-20 11:53:30 -04:00
Jes Sorensen	3ab8f4bf33	util: Introduce md_array_active() helper Rather than querying md_get_array_info() to determine whether an array is valid, do the work in md_array_active() using sysfs, and fall back on md_get_array_info() if sysfs fails. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-04-20 00:12:34 -04:00
Jes Sorensen	32141c1765	Retire mdassemble mdassemble doesn't handle container based arrays, no support for sysfs, etc. It has not been actively maintained for years, so time to send it off to retirement. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-04-11 12:54:26 -04:00
Jes Sorensen	303949f6f0	util: Finally kill off md_get_version() Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-04-05 15:49:18 -04:00
Jes Sorensen	700483a223	util/set_array_info: Simplify code since md_get_version returns a constant md_get_version() always returns (0 * 1000) + (90 * 100) + 3, so no point in calling it. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-04-05 15:06:24 -04:00
Jes Sorensen	f5c924f441	util/must_be_container: Use sysfs_read(GET_VERSION) to determine valid array Use sysfs_read() instead of ioctl(RAID_VERSION) to determine this is in fact a valid raid array fd. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-04-05 14:01:30 -04:00
Jes Sorensen	018a488238	util: Introduce md_set_array_info() Switch from using ioctl(SET_ARRAY_INFO) to using md_set_array_info() Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 15:43:53 -04:00
Jes Sorensen	d97572f5a5	util: Introduce md_get_disk_info() This removes all the inline ioctl calls for GET_DISK_INFO, allowing us to switch to sysfs in one place, and improves type checking. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 15:23:50 -04:00
Jes Sorensen	9cd39f0155	util: Introduce md_get_array_info() Remove most direct ioctl calls for GET_ARRAY_INFO, except for one, which will be addressed in the next patch. This is the start of the effort to clean up the use of ioctl calls and introduce a more structured API, which will use sysfs and fall back to ioctl for backup. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 14:35:41 -04:00
Jes Sorensen	efa295309f	util: Cosmetic changes Fixup a number of indentation and whitespace issues Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 12:05:12 -04:00
NeilBrown	1ab9ed2afb	Add 'force' flag to hot_remove_disk(). In rare circumstances, the short period that hot_remove_disk() waits isn't long enough for IO to complete. This particularly happens when a device is failing and many retries are still happening. We don't want to increase the normal wait time for "mdadm --remove" as that might be used just to test if a device is active or not, and a delay would be problematic. So allow "--force" to mean that mdadm should try extra hard for a --remove to complete, waiting up to 5 seconds. Note that this patch fixes a comment which claimed the previous wait time was half a second, where it was really 50msec. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-28 14:32:35 -04:00
NeilBrown	fdd015696c	Introduce sys_hot_remove_disk() The new hot_remove_disk() will retry HOT_REMOVE_DISK several times in the face of EBUSY. However we sometimes remove a device by writing "remove" to the "state" attribute. This should be retried as well. So introduce sys_hot_remove_disk() to repeat this action a few times. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-28 14:30:49 -04:00
NeilBrown	2dd271fe70	Retry HOT_REMOVE_DISK a few times. HOT_REMOVE_DISK can fail with EBUSY if there are outstanding IO requests that have not completed yet. It can sometimes be helpful to wait a little while for these to complete. We already do this in impose_level() when reshaping a device, but not in Manage.c in response to an explicit --remove request. So create hot_remove_disk() to centralize this code, and call it wherever it makes sense to wait for a HOT_REMOVE_DISK to succeed. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-28 14:25:23 -04:00
Xiao Ni	ff9239ee31	mdadm: Specify enough length when write to buffer In Detail.c the buffer path in function Detail is defined as path[200], but the maximum length of the content that needs to be written to the buffer is 287, because the length of d_name in struct dirent is 255. During building it reports error: error: ‘%s’ directive writing up to 255 bytes into a region of size 189 [-Werror=format-overflow=] In function examine_super0 there is a buffer nb with length 5, but it needs to show an int type argument. The maximum length of an int is 10 digits, so the buffer length should be 11. In human_size function the length of buf is 30. During building there is an error: output between 20 and 47 bytes into a destination of size 30. Change the length to 47. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-17 15:58:16 -04:00
Mariusz Dabrowski	31208db97e	Always return last partition end address in 512B blocks For 4K disks 'endofpart' is an index of the last 4K sector used by partition. mdadm is using number of 512-byte sectors, so value returned by get_last_partition_end must be multiplied by 8 for devices with 4K sectors. Also, unused 'ret' variable has been removed. Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-12-13 09:09:25 -05:00
Mariusz Dabrowski	41b06495ba	Use disk sector size value to set offset for reading GPT mdadm is using invalid byte-offset while reading GPT header to get partition info (size, first sector, last sector etc.). Now this offset is hardcoded to 512 bytes and it is not valid for disks with sector size different than 512 bytes because MBR and GPT headers are aligned to LBA, so valid offset for 4k drives is 4096 bytes. Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-12-12 14:26:22 -05:00
Pawel Baldysiak	329715091c	Add function for getting member drive sector size This patch introduces the function for getting sector size of given device (fd). Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-11-17 09:24:18 -05:00
James Clarke	8e2bca513e	Fix bus error when accessing MBR partition records Since the MBR layout only has partition records as 2-byte aligned, the 32-bit fields in them are not aligned. Thus, they cannot be accessed on some architectures (such as SPARC) by using a "struct MBR_part_record *" pointer, as the compiler can assume that the pointer is properly aligned. Instead, the records must be accessed by going through the MBR struct itself every time. Signed-off-by: James Clarke <jrtc27@jrtc27.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-10-19 12:38:02 -04:00
Mariusz Dabrowski	fa219dd26a	Fix RAID metadata check mdadm recognizes devices with partition table as part of an RAID array and invalid warning message is displayed. After this fix proper warning messages are being displayed for MBR/GPT disks and devices with RAID metadata. Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-09-22 11:35:02 -04:00
Jes Sorensen	c5f71c2417	Introduce random_uuid() helper function This gets rid of 5 nearly identical copies of the same code, and reduces the binary size of mdadm by over 700 bytes on x86_64. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-08-15 15:41:34 -04:00
Jes Sorensen	9f0ad56be0	util: Never have if and return on the same line Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-08-11 15:48:47 -04:00
Mike Lovell	13db17bd1f	Use dev_t for devnm2devid and devid2devnm Commit `4dd2df0966` added a trip through makedev(), major(), and minor() for device major and minor numbers. This would cause mdadm to fail in operating on a device with a minor number bigger than (2^19)-1 due to it changing from dev_t to a signed int and back. Where this was found as a problem was when an array was created with a device specified as a name like /dev/md/raidname and there were already 128 arrays on the system. In this case, mdadm would choose 1048575 ((2^20)-1) for the array and minor number. This would cause the major and minor number to become negative when generated from devnm2devid() and passed to major() and minor() in open_dev_excl(). open_dev_excl() would then call dev_open() which would detect the negative minor number and call open() on the char * containing the major:minor pair which isn't a valid file. Signed-off-by: Mike Lovell <mlovell@bluehost.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-06-03 15:35:26 -04:00
Jes Sorensen	15d230f730	util: Remove unnecesary NULL pointer checks when calling sysfs_free() Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-03-08 12:19:03 -05:00
Guoqing Jiang	21f541cc31	Remove dead code about LKF_CONVERT flag Since flags is only set as LKF_NOQUEUE, the code with LKF_CONVERT flag should be deleted. Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-03-07 15:21:04 -05:00
Maxin B. John	986b868817	util.c: include poll.h instead of sys/poll.h This fixes a compile warning when building with musl: In file included from util.c:27:0: | qemux86-64/usr/include/sys/poll.h:1:2: error: #warning redirecting incorrect #include <sys/poll.h> to <poll.h> [-Werror=cpp] | #warning redirecting incorrect #include <sys/poll.h> to <poll.h> | ^ Signed-off-by: Maxin B. John <maxin.john@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-02-08 10:59:00 -05:00
Xiao Ni	1d13b59960	Fix some type comparison problems As `26714713cd` said, 32 bit signed timestamps will overflow in the year 2038. It already changed the utime and ctime in struct mdu_array_info_s from int to unsigned int. So we need to change the values that are compared with them to unsigned int too. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-02-08 10:49:22 -05:00
NeilBrown	7071320a18	Assorted fixed for a "make everything" build Signed-off-by: NeilBrown <neilb@suse.com>	2016-01-28 13:28:58 +11:00
Guoqing Jiang	32539f74d2	util: fix wrong return value of cluster_get_dlmlock Actually lksb.sb_status means that a node got the lock or not instead of the return value of dlm_lock. Signed-off-by: Guoqing Jiang <gqjiang@suse.com>	2016-01-27 11:43:02 +11:00
Guoqing Jiang	81a8a69415	mdadm: improve the safeguard for change cluster raid's sb This commit does the following jobs: 1. rename is_clustered to dlm_funs_ready since it matches the function better. 2. st->cluster_name can't be used to identify whether the raid is clustered or not, we should check the bitmap's version to perform the identification. 3. for cluster_get_dlmlock/cluster_release_dlmlock funcs, both of them just need the lockid as parameter since the cluster name can be obtained via get_cluster_name(). Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-12-17 09:53:37 +11:00
Guoqing Jiang	e80357f825	Make cmap_* also has same policy as dlm_* Let libcmap lib and related funs also only need one-time setup during mdadm running period. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-10-21 11:19:35 +11:00
Guoqing Jiang	d15a1f72bd	Safeguard against writing to an active device of another node Modifying an existing device's superblock or creating a new superblock on an existing device needs to be checked because the device could be in use by another node in another array. So, we check this by taking all superblock locks in userspace so that we don't step onto an active device used by another node and safeguard against accidental edits. After the edit is complete, we release all locks and the lockspace so that it can be used by the kernel space. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>	2015-10-21 11:19:05 +11:00
NeilBrown	7d55dca2cc	mdassemble: don't try to perform cluster check. mdassemble is meant to be small and simple, so avoid trying to check for a cluster. Currently it doesn't, but it still includes the code, which doesn't build because the library isn't provided. So just exclude the get_cluster_name code from mdassemble. Signed-off-by: NeilBrown <neilb@suse.com>	2015-08-03 11:53:01 +10:00
Guoqing Jiang	4de9091302	Add a new clustered disk A clustered disk is added by the traditional --add sequence. However, other nodes need to acknowledge that they can "see" the device. This is done by --cluster-confirm: --cluster-confirm SLOTNUM:/dev/whatever (if disk is found) or --cluster-confirm SLOTNUM:missing (if disk is not found) The node initiating the --add has the disk state tagged with MD_DISK_CLUSTER_ADD and the one confirming tags the disk with MD_DISK_CANDIDATE. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:21:29 +10:00
Guoqing Jiang	7716570e6d	Set home-cluster while creating an array The home-cluster is stored in the bitmap super block of the array. The device can be assembled on a cluster with the cluster name same as the one recorded in the bitmap. If home-cluster is not specified, this is auto-detected using dlopen corosync cmap library. neilb: allow code to compile when corosync-devel is not installed. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-06-17 09:06:30 +10:00
NeilBrown	330d6900bb	Assemble: allow a RAID4 to assemble easily when parity devices is missing. If the parity device of a RAID4 is missing, then there is no immediate risk to data. So it doesn't matter if the array is dirty or not. This can be important when reshaping a RAID0, and is a much better solution than that in the recently reverted `b720636a58`. Reported-by: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net> Signed-off-by: NeilBrown <neilb@suse.de>	2015-04-08 09:39:02 +10:00
NeilBrown	7a862a020f	Don't break long strings onto multiple lines. It is best to keep strings all together so that they are easier to search for in the source code. If a string is so long that it looks ugly on one line, then maybe it should be broken into multiple lines for display too. Only strings which contain a newline can be broken into multiple lines: "It is OK to\n" "break this string\n" Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-12 13:46:53 +11:00
NeilBrown	1ade5cc15a	Consistently print program Name and __func__ in debug messages. Make dprintf() print program name and __func__, so that this messaging is consistent. Also remove all __func__ messages from pr_err(). We shouldn't leak that internal data in error messages. If we really want the function name there, a new pr_XXX might be wanted. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-12 13:21:17 +11:00
NeilBrown	93d3bd3b28	util: remove rounding error where reporting "human sizes". The division 1<<20 / 200 is not exact, so dividing by this to convert bytes into half-megs is wrong and results in incorrect output. As we are doing "long long" arithmetic, there is no risk of an overflow until we reach 64 petabytes. So change to * 200 / (1<<20). Reported-by: Jan Echternach <jan@goneko.de> Resolved-debian-bug: 763917 URL: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763917 Signed-off-by: NeilBrown <neilb@suse.de>	2014-12-18 16:58:44 +11:00
NeilBrown	cc742d3807	util: split get_maj_min() out from dev_open() This allows other code to parse "8:3" style device names. Signed-off-by: NeilBrown <neilb@suse.de>	2014-08-11 10:34:36 +10:00
NeilBrown	85945e1986	install: use BINDIR consistently to locate mdadm and mdmon Every place where the paths for mdadm or mdmon is explicit, it should use the BINDIR setting, not "/sbin/". Reported-by: member graysky <graysky@archlinux.us> (https://bugs.archlinux.org/task/37330) Signed-off-by: NeilBrown <neilb@suse.de>	2014-05-22 17:13:02 +10:00
Jes Sorensen	76d0f1886f	Work around architectures having statfs.f_type defined as long Having RAMFS_MAGIC defined as 0x858458f6 causes problems when trying to compare it directly against statfs.f_type being cast from long to unsigned long. This hack is extremely ugly, but it should at least do the right thing for every situation. Thanks to Arnd Bergmann for suggesting the fix. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2014-03-20 09:24:27 +11:00
NeilBrown	8832342d3a	Assemble/Incremental: don't hold O_EXCL on mddev after assembly. As soon as the array is assembled, udev or systemd might run fsck and mount it. So we need to drop O_EXCL promptly. Signed-off-by: NeilBrown <neilb@suse.de>	2013-12-05 10:35:16 +11:00
NeilBrown	5dd29dafa2	Two small fixes related to enough() 1/ enough_fd doesn't use avail_disks any more, so discard it. 2/ Manage_Add increments 'found' at the wrong place, so it can waste time before calling enough(). Signed-off-by: NeilBrown <neilb@suse.de>	2013-12-05 08:58:21 +11:00
NeilBrown	357ac10678	IMSM metadata really should be ignored when found on partitions. commit `b31df43682` changed load_super_imsm to not insist on finding a partition if ignore_hw_compat was set. Unfortunately this is set for '--assemble' so arrays could get assembled badly. The comment says this was to allow e.g. --examine of image files. A better fix for this is to change test_partitions to not report a regular file as being a partition. The errors from the BLKPG ioctl are: ENOTTY : not a block device. EINVAL : not a whole device (probably a partition) ENXIO : partition doesn't exist (so not a partition) Reported-by: "David F." <df7729@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2013-11-20 10:49:14 +11:00
NeilBrown	6f02172d2e	Release mdadm-3.3 (and various cosmetic fixes) Signed-off-by: NeilBrown <neilb@suse.de>	2013-09-03 14:47:47 +10:00
NeilBrown	2f1bcf43d9	Make sure "mdmon" doesn't get called "@dmon". The Anaconda installer (via its "loader" program) will try to kill many processes at shutdown, but not "mdmon". However when mdadm runs mdmon in the Anaconda environment, mdmon sets argv[0][0] to '@' resulting in "@dmon" which confuses "loader". So change mdadm to set argv[0] to a path so that mdmon becomes e.g. "@usr/sbin/mdmon" which "loader" will recognise as being "mdmon". Reported-by: Martin Wilck <mwilck@arcor.de> Signed-off-by: NeilBrown <neilb@suse.de>	2013-09-02 11:02:09 +10:00
mwilck@arcor.de	7ac5d47e8a	in_initrd: fix gcc compiler error On some systems, this code caused a "comparison between signed and unsigned" error. Signed-off-by: Martin Wilck <mwilck@arcor.de> Signed-off-by: NeilBrown <neilb@suse.de>	2013-08-28 14:58:56 +10:00
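
The helper described in 9e04ac1c43 can be pictured roughly as follows. This is only a sketch of the behaviour stated in the commit message, not the code from mdadm's util.c: the name stat_is_blkdev_sketch and the error messages are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical sketch (not mdadm's actual implementation).
 * Returns 1 if devname is a block device, 0 otherwise.
 * If rdev is non-NULL, the device number is stored in *rdev.
 */
int stat_is_blkdev_sketch(const char *devname, dev_t *rdev)
{
	struct stat stb;

	assert(devname != NULL);	/* devname is the required parameter */

	if (stat(devname, &stb) != 0) {
		/* Assumed behaviour: report the failure and treat it as "not a blkdev". */
		fprintf(stderr, "stat failed for %s\n", devname);
		return 0;
	}
	if (!S_ISBLK(stb.st_mode)) {
		fprintf(stderr, "%s is not a block device\n", devname);
		return 0;
	}
	if (rdev)
		*rdev = stb.st_rdev;	/* optional out-parameter */
	return 1;
}
```

A caller that only needs the yes/no answer would pass NULL for rdev; a caller that also wants the major/minor pair would pass the address of a dev_t.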


290 Commits
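
The rounding issue described in 93d3bd3b28 can be reproduced with a few lines of integer arithmetic. The function names units_old and units_new are hypothetical, and the code is a sketch of the two expressions quoted in the commit message rather than the human_size() code itself.

```c
#include <assert.h>

/*
 * Old form: divide by (1<<20)/200. Integer division truncates that
 * constant to 5242 (the exact value is 5242.88), so the result is
 * slightly too large.
 */
long long units_old(long long bytes)
{
	assert(bytes >= 0);
	return bytes / ((1LL << 20) / 200);
}

/*
 * New form from the commit message: multiply first, then divide,
 * so no precision is lost in the constant.
 */
long long units_new(long long bytes)
{
	assert(bytes >= 0);
	return bytes * 200 / (1LL << 20);
}
```

For an input of exactly 1 GiB (1LL << 30), units_old yields 204834 while units_new yields 204800. Both results are in units of 1/200 MiB, so the old expression reports about 1024.17 MiB where 1024.00 MiB is expected.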