mdadm

Commit Graph

Author	SHA1	Message	Date
Lukasz Florczak	83a379cfbd	Replace error prone signal() with sigaction() Up to this date signal() was used which implementation could vary [1]. Sigaction() call is preferred. This commit introduces replacement from signal() to sigaction() by the use of signal_s() wrapper. Also remove redundant signal.h header includes. [1] https://man7.org/linux/man-pages/man2/signal.2.html Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2022-04-04 11:13:02 -04:00
Coly Li	8c80d3055b	Monitor: print message before quit for no array to monitor If there is no array device to monitor, Monitor() will stop monitoring at line 261 from the following code block, 257 if (!new_found) { 258 if (oneshot) 259 break; 260 else if (!anyredundant) { 261 break; 262 } This change was introduced by commit `007087d089` ("Monitor: stop notifing about containers"). Before this commit, Monitor() will continue and won't quit even there is no array to monitor. It is fine to quit without any array device to monitor, but users may wonder whether there is something wrong with mdadm program or their configuration to make mdadm quit monitoring. This patch adds a simple error message to indicate Monitor() quits for array device to monitor, which makes users have hint to understand why mdadm stops monitoring. Reported-by: George Gkioulis <ggkioulis@suse.com> Suggested-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Coly Li <colyli@suse.de> Cc: Jes Sorensen <jsorensen@fb.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2021-12-08 09:44:14 -05:00
Nigel Croxon	1c66260df6	Fix 2 dc stream buffer To meet requirements of Common Criteria certification vulnerability assessment. Static code analysis has been run and found the following Error: DC.STREAM_BUFFER (CWE-120): [#def46] mdadm-4.2: dont_call: "fscanf" assumes an arbitrarily long string, so callers must use correct precision specifiers or never use "fscanf". The change is to define a value for string %s. V2: Tighten the value in policy.c to match the limit of the metadata. Add a change to policy_save_path() to use correct precision on the fscanf call. Signed-off-by: Nigel Croxon <ncroxon@redhat.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2021-10-19 02:57:43 -04:00
Mariusz Tkaczyk	af3396dadf	Monitor: make libudev dependency optional Make -ludev configurable, enabled by default. To disable it, -DNO_LIBUDEV has to be set explicitly in CXFLAGS. This patch restores commit `cab9c67d46` ("mdmonitor: set small delay once") for configuration without libudev to bring minimal support in such case. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2021-03-19 08:31:11 -04:00
Oleksandr Shchirskyi	49b69533e8	mdmonitor: check if udev has finished events processing If mdmonitor is awaken by event, wait for udev to finish events processing, to eliminate the race between udev and mdadm when spare has been added and need to be moved by mdmonitor Signed-off-by: Oleksandr Shchirskyi <oleksandr.shchirskyi@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2021-03-08 10:20:48 -05:00
Mariusz Tkaczyk	ff6bb131a4	mdadm: Unify forks behaviour If mdadm is run by udev or systemd, it gets a pipe as each stream. Forks in the background may run after an event or service has been processed when udev is detached from pipe. As a result process fails quietly if any message is written. To prevent from it, each fork has to close all parent streams. Leave stderr and stdout opened only for debug purposes. Unify it across all forks. Introduce other descriptors detection by scanning /proc/self/fd directory. Add generic method for managing systemd services. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>	2020-11-25 18:15:55 -05:00
Mariusz Tkaczyk	ca4b156b20	Monitor: don't use default modes when creating a file Replace fopen() calls by open() with creation mode directly specified. This fixes the potential security issue. Use octal values instead masks. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>	2020-11-25 18:09:03 -05:00
Blazej Kucman	7f3b2d1d16	Check if other Monitor instance running before fork. Make error message visible to the user. Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2020-10-14 11:37:49 -04:00
Blazej Kucman	cab9c67d46	mdmonitor: set small delay once If mdmonitor is awakened by event, set small delay once to deal with udev and mdadm. Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2020-10-14 11:33:47 -04:00
Mariusz Tkaczyk	007087d089	Monitor: stop notifing about containers. Stop reporting any events from container but still track them, it is important for spare migration. Stop mdmonitor if no redundant array is presented in mdstat. There is nothing to follow. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2020-10-14 11:32:09 -04:00
Mariusz Tkaczyk	e230873391	Monitor: refresh mdstat fd after select After `52209d6ee1` ("Monitor: release /proc/mdstat fd when no arrays present") mdstat fd is closed if mdstat is empty or cannot be opened. It causes that monitor is not able to select on mdstat. Select doesn't fail because it gets valid descriptor to a different resource. As a result any new event will be unnoticed until timeout (delay). Refresh mdstat after wake up, don't poll on wrong resource. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2020-10-14 11:31:39 -04:00
Coly Li	185ec4397e	Monitor: improve check_one_sharer() for checking duplicated process When running mdadm monitor with scan mode, only one autorebuild process is allowed. check_one_sharer() checks duplicated process by following steps, 1) Read autorebuild.pid file, - if file does not exist, no duplicated process, go to 3). - if file exists, continue to next step. 2) Read pid number from autorebuild.pid file, then check procfs pid directory /proc/<PID>, - if the directory does not exist, no duplicated process, go to 3) - if the directory exists, print error message for duplicated process and exit this mdadm. 3) Write current pid into autorebuild.pid file, continue to monitor in scan mode. The problem for the above step 2) is, if after system reboots and another different process happens to have exact same pid number which autorebuild.pid file records, check_one_sharer() will treat it as a duplicated mdadm process and returns error with message "Only one autorebuild process allowed in scan mode, aborting". This patch tries to fix the above same-pid-but-different-process issue by one more step to check the process command name, 1) Read autorebuild.pid file - if file does not exist, no duplicated process, go to 4). - if file exists, continue to next step. 2) Read pid number from autorebuild.pid file, then check procfs file comm with the specific pid directory /proc/<PID>/comm - if the file does not exist, it means the directory /proc/<PID> does not exist, go to 4) - if the file exists, continue next step 3) Read process command name from /proc/<PID>/comm, compare the command name with "mdadm" process name, - if not equal, no duplicated process, goto 4) - if strings are equal, print error message for duplicated process and exit this mdadm. 4) Write current pid into autorebuild.pid file, continue to monitor in scan mode. Now check_one_sharer() returns error for duplicated process only when the recorded pid from autorebuild.pid exists, and the process has exact same command name as "mdadm". Reported-by: Shinkichi Yamazaki <shinkichi.yamazaki@suse.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2020-04-27 10:26:47 -04:00
Guilherme G. Piccoli	43ebc9105e	mdadm: Introduce new array state 'broken' for raid0/linear Currently if a md raid0/linear array gets one or more members removed while being mounted, kernel keeps showing state 'clean' in the 'array_state' sysfs attribute. Despite udev signaling the member device is gone, 'mdadm' cannot issue the STOP_ARRAY ioctl successfully, given the array is mounted. Nothing else hints that something is wrong (except that the removed devices don't show properly in the output of mdadm 'detail' command). There is no other property to be checked, and if user is not performing reads/writes to the array, even kernel log is quiet and doesn't give a clue about the missing member. This patch is the mdadm counterpart of kernel new array state 'broken'. The 'broken' state mimics the state 'clean' in every aspect, being useful only to distinguish if an array has some member missing. All necessary paths in mdadm were changed to deal with 'broken' state, and in case the tool runs in a kernel that is not updated, it'll work normally, i.e., it doesn't require the 'broken' state in order to work. Also, this patch changes the way the array state is showed in the 'detail' command (for raid0/linear only) - now it takes the 'array_state' sysfs attribute into account instead of only rely in the MD_SB_CLEAN flag. Cc: Jes Sorensen <jes.sorensen@gmail.com> Cc: NeilBrown <neilb@suse.de> Cc: Song Liu <songliubraving@fb.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2019-09-30 15:08:09 -04:00
Jes Sorensen	9758983959	Monitor: Increase size of percentalert to avoid gcc warning gcc-8.1 complains about truncated string operations. While we know percent will never grow larger than 100, it doesn't cost us anything to increase the size of 'percentalert' on the stack like this. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2018-05-31 11:45:21 -04:00
Mariusz Tkaczyk	1ea0462990	Monitor/msg: Don't print error message if mdmon doesn't run Commit `4515fb28a5` ("Add detail information when can not connect monitor") was added to warn about failed connection to monitor in WaitClean function (see link below). Mdmon runs for IMSM containers when they have array with redundancy so if mdmon doesn't run, mdadm prints this error. This is misleading and unnecessary. Just print it in WaitClean function. The sock in WaitClean is deprecated so it is removed. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1375002 Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-11-21 13:26:09 -05:00
Mariusz Tkaczyk	b3ab4e4dd7	Monitor: Check redundancy for arrays GET_MISMATCH option doesn't exist for RAID arrays without redundancy so sysfs_read fails if this information is requested. Set options according to the device using information from /proc/mdstat. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-10-02 15:44:14 -04:00
Mariusz Tkaczyk	2dab69c9e3	Monitor: Include containers in spare migration Spare migration doesn't work for external metadata. mdadm skips a container with spare device because it is inactive. It used to work because GET_ARRAY_INFO ioctl returned valid structure for a container and mdadm treated such response as active container. Current implementation checks it in sysfs where container is shown as inactive. Adapt sysfs implementation to work the same way as ioctl. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-08-16 09:05:59 -04:00
Mariusz Tkaczyk	802961a239	Monitor: containers don't have the same sysfs properties as arrays GET_MISMATCH option doesn't exist for containers so sysfs_read fails if this information is requested. Set options according to the device using information from /proc/mdstat. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-08-16 08:25:07 -04:00
Tomasz Majchrzak	cb91230c87	Monitor: don't assume mdadm parameter is a block device If symlink (e.g. /dev/md/raid) is passed as a parameter to mdadm --wait, it fails as it's not able to find a corresponding entry in /proc/mdstat output. Get parameter file major:minor and look for block device name in sysfs. This commit is partial revert of commit `9e04ac1c43` ("mdadm/util: unify stat checking blkdev into function"). Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-07-10 13:40:05 -04:00
Tomasz Majchrzak	b13b52c80f	Get failed disk count from array state Recent commit has changed the way failed disks are counted. It breaks recovery for external metadata arrays as failed disks are not part of the array and have no corresponding entries in sysfs (they are only reported for containers) so degraded arrays show no failed disks. Recent commit overwrites GET_DEGRADED result prior to GET_STATE and it is not set again if GET_STATE has not been requested. As GET_STATE provides the same information as GET_DEGRADED, the latter is not needed anymore. Remove GET_DEGRADED option and replace it with GET_STATE option. Don't count number of failed disks looking at sysfs entries but calculate it at the end. Do it only for arrays as containers report no disks, just spares. Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-06-05 11:11:36 -04:00
Jes Sorensen	d7be7d8736	mdadm: Fixup more broken logical operator formatting Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-16 13:59:43 -04:00
Jes Sorensen	f566ef45d3	Monitor: Fixup a pile of whitespace issues No code was hurt in this event Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-11 16:56:55 -04:00
Jes Sorensen	72362f18ae	Monitor: mailfrom is initialized correctly Remove gratuitous variable initialization. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-11 16:44:19 -04:00
Jes Sorensen	9f3dd4549b	Monitor: Not much point declaring mdlist in both forks of the if() statement Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-11 16:40:16 -04:00
Jes Sorensen	e5eb6857cd	Monitor/check_array: Use working_disks from sysfs sysfs now provides working_disks information, so lets use it too. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 17:15:14 -04:00
Jes Sorensen	b98943a4f8	Monitor/check_array: Get nr_disks, active_disks and spare_disks from sysfs This leaves working_disks and utime missing before we can eliminate check_array()'s call to md_get_array_info() Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 17:06:09 -04:00
Jes Sorensen	12a9d21f4e	Monitor/check_array: Get array_disks from sysfs Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 16:58:55 -04:00
Jes Sorensen	b8e5713c74	Monitor/check_array: Get 'failed_disks' from sysfs Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 16:54:19 -04:00
Jes Sorensen	48bc2ade86	Monitor/check_array: Obtain RAID level from syfs Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 16:52:44 -04:00
Jes Sorensen	aed5f5c34c	Monitor/check_array: Read sysfs entry earlier This will allow us to pull additional info from sysfs, such as level and device info. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 16:51:41 -04:00
Jes Sorensen	826522f0dc	Monitor/check_array: Declate mdinfo instance globally We can pull in more information from sysfs earlier, so move sra to the top. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 16:41:06 -04:00
Jes Sorensen	13e5d8455c	Monitor/check_array: Reduce duplicated error handling Avoid closing fd in multiple places, and duplicating the error message for when a device disappeared. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 16:38:06 -04:00
Jes Sorensen	1830e74b4c	Monitor/check_array: Centralize exit path Improve exit handling to make it easier to share error handling and free sysfs entries later. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 16:25:23 -04:00
Alexey Obitotskiy	4b57ecf6ce	Add sector size as spare selection criterion Add sector size as new spare selection criterion. Assume that 0 means there is no requirement for the sector size in the array. Skip disks with unsuitable sector size when looking for a spare to move across containers. Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 14:18:38 -04:00
Alexey Obitotskiy	fbfdcb06dc	Allow more spare selection criteria Disks can be moved across containers in order to be used as a spare drive for rebuild. At the moment the only requirement checked for such disk is its size (if it matches donor expectations). In order to introduce more criteria rename corresponding superswitch method to more generic name and move function parameter to a structure. This change is a big edit but it doesn't introduce any changes in code logic, it just updates function naming and parameters. Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-09 14:18:36 -04:00
Jes Sorensen	f27904a53b	Monitor: Code is 80 characters per line Fix up some lines that are too long for no reason, and some that have silly line breaks. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-08 17:52:10 -04:00
Jes Sorensen	b9a0309c7f	Monitor: Use md_array_active() instead of manually fiddling in sysfs This removes a pile of clutter that can easily be handled with a simple check of array_state. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-08 17:41:00 -04:00
Zhilong Liu	9e04ac1c43	mdadm/util: unify stat checking blkdev into function declare function stat_is_blkdev() to integrate repeated stat checking blkdev operations, it returns 'true/1' when it is a block device, and returns 'false/0' when it isn't. The devname is necessary parameter, rdev is optional, parse the pointer of dev_t rdev, if valid, assigned device number to dev_t *rdev, if NULL, ignores. Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-05-05 11:05:32 -04:00
Jes Sorensen	32141c1765	Retire mdassemble mdassemble doesn't handle container based arrays, no support for sysfs, etc. It has not been actively maintained for years, so time to send it off to retirement. Signed-off-by: Jes Sorensen <jsorensen@fb.com>	2017-04-11 12:54:26 -04:00
Jes Sorensen	dae131379f	sysfs: Make sysfs_init() return an error code Rather than have the caller inspect the returned content, return an error code from sysfs_init(). In addition make all callers actually check it. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-30 16:52:37 -04:00
Jes Sorensen	d97572f5a5	util: Introduce md_get_disk_info() This removes all the inline ioctl calls for GET_DISK_INFO, allowing us to switch to sysfs in one place, and improves type checking. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 15:23:50 -04:00
Jes Sorensen	9cd39f0155	util: Introduce md_get_array_info() Remove most direct ioctl calls for GET_ARRAY_INFO, except for one, which will be addressed in the next patch. This is the start of the effort to clean up the use of ioctl calls and introduce a more structured API, which will use sysfs and fall back to ioctl for backup. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-29 14:35:41 -04:00
Zhilong Liu	75dd32a185	mdadm/Monitor: Fix NULL pointer dereference when stat2devnm return NULL Wait(): stat2devnm() returns NULL for non block devices. Check the pointer is valid before dereferencing it. This can happen when using --wait, such as the 'f' and 'd' file type, causing a core dump. such as: ./mdadm --wait /dev/md/ Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>	2017-03-27 18:24:19 -04:00
Tomasz Majchrzak	52209d6ee1	Monitor: release /proc/mdstat fd when no arrays present If md kernel module is reloaded, /proc/mdstat cannot be accessed ("cat: /proc/mdstat: No such file or directory"). The reason is mdadm monitor still holds a file descriptor to previous /proc/mdstat instance. It leads to really confusing outcome of the following operations - mdadm seems to run without errors, however some udev rules don't get executed and new array doesn't work. Add a check if lseek was successful as it fails if md kernel module has been unloaded - close a file descriptor then. The problem is mdadm monitor doesn't always do it before next operation takes place. To prevent it monitor always releases /proc/mdstat descriptor when there are no arrays to be monitored, just in case driver unload happens in a moment. Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-07-21 11:37:17 -04:00
Jes Sorensen	26c62b8e76	Monitor: Use sysfs_free() to free object returned by sysfs_read() We should always use sysfs_free() to release sysfs_* allocated objects. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-06-10 14:56:23 -04:00
Xiao Ni	1d13b59960	Fix some type comparison problems As `26714713cd` said, 32 bit signed timestamps will overflow in the year 2038. It already changed the utime and ctime in struct mdu_array_info_s from int to unsigned int. So we need to change the values that compared with them to unsigned int too. Signed-off-by : Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>	2016-02-08 10:49:22 -05:00
NeilBrown	d3f6cf4f9b	Monitor: don't Wait forever on a 'frozen' array. If Wait() finds the array resync is 'frozen', then wait a little while to avoid races, but don't wait forever. Signed-off-by: NeilBrown <neilb@suse.com>	2015-07-06 13:26:41 +10:00
Sergey Vidishev	1e08717f0b	mdadm: monitor: fix nullptr dereference when get_md_name() returns NULL Function add_new_arrays() expects that function get_md_name() should return pointer to devname, but also get_md_name() may return NULL. So check the pointer before use it in add_new_arrays(). Signed-off-by: Sergey Vidishev <sergeyv@yandex-team.ru> Signed-off-by: NeilBrown <neilb@suse.de>	2015-05-20 13:16:09 +10:00
NeilBrown	04e27c2084	Monitor: use the "space protocol" for "Wrong-Level". "Wrong-Level" is a reason, not a component device, so it should start with a space to indicate this to alert(). Signed-off-by: NeilBrown <neilb@suse.de>	2015-04-08 09:18:55 +10:00
NeilBrown	b033913a3c	Monitor: Obey "space protocol" when writing to syslog. "alert" treats the "disc" arg differently if it starts with a space. At least it does for sending email. It doesn't for writing to syslog. Make this consistent and obey the 'space protocol' when writing to syslog. Signed-off-by: NeilBrown <neilb@suse.de>	2015-04-08 09:17:17 +10:00
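
The revised duplicate-process check described in commit 185ec4397e (steps 2 and 3: read /proc/<PID>/comm and compare it with "mdadm") might look roughly like the sketch below. This is an illustration only, not the code from the commit: the function names comm_matches() and pid_runs_command() are hypothetical, the buffer sizes are arbitrary, and reading or writing autorebuild.pid (steps 1 and 4) is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compare the text read from a comm file (normally
 * terminated by a newline) with an expected command name.
 * Returns 1 on an exact match, 0 otherwise. */
int comm_matches(const char *comm, const char *expected)
{
	size_t len;

	assert(comm != NULL && expected != NULL);
	len = strcspn(comm, "\n");	/* length up to the trailing newline */
	return len == strlen(expected) && strncmp(comm, expected, len) == 0;
}

/* Hypothetical helper: returns 1 only if /proc/<pid>/comm can be read and
 * names the expected command. A missing file is treated as "no such
 * process", so a stale pid left over from before a reboot is not reported
 * as a duplicate. */
int pid_runs_command(long pid, const char *expected)
{
	char path[64];
	char comm[32] = "";
	FILE *fp;

	snprintf(path, sizeof(path), "/proc/%ld/comm", pid);
	fp = fopen(path, "r");
	if (!fp)
		return 0;		/* step 2: no comm file, no duplicate */
	if (!fgets(comm, sizeof(comm), fp))
		comm[0] = '\0';		/* unreadable: treat as no match */
	fclose(fp);
	return comm_matches(comm, expected);	/* step 3: compare names */
}
```

A caller would pass the pid read from autorebuild.pid and the string "mdadm", and abort only when pid_runs_command() returns 1; in every other case it would go on to record its own pid.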


188 Commits