Up to this date signal() was used which implementation could vary [1].
Sigaction() call is preferred. This commit introduces replacement
from signal() to sigaction() by the use of signal_s() wrapper.
Also remove redundant signal.h header includes.
[1] https://man7.org/linux/man-pages/man2/signal.2.html
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
If there is no array device to monitor, Monitor() will stop monitoring
at line 261 from the following code block,
257 if (!new_found) {
258 if (oneshot)
259 break;
260 else if (!anyredundant) {
261 break;
262 }
This change was introduced by commit 007087d089 ("Monitor: stop
notifing about containers"). Before this commit, Monitor() will continue
and won't quit even there is no array to monitor.
It is fine to quit without any array device to monitor, but users may
wonder whether there is something wrong with mdadm program or their
configuration to make mdadm quit monitoring.
This patch adds a simple error message to indicate Monitor() quits for
array device to monitor, which makes users have hint to understand why
mdadm stops monitoring.
Reported-by: George Gkioulis <ggkioulis@suse.com>
Suggested-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
To meet requirements of Common Criteria certification vulnerablility
assessment. Static code analysis has been run and found the following
Error: DC.STREAM_BUFFER (CWE-120): [#def46]
mdadm-4.2: dont_call: "fscanf" assumes an arbitrarily
long string, so callers must use correct precision specifiers or
never use "fscanf".
The change is to define a value for string %s.
V2: Tighten the value in policy.c to match the limit of the metadata.
Add a change to policy_save_path() to use correct precision on the
fscanf call.
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Make -ludev configurable, enabled by default.
To disable it, -DNO_LIBUDEV has to be set explicitly in CXFALGS.
This patch restores commit cab9c67d46 ("mdmonitor: set small delay
once") for configuration without libudev to bring minimal support in
such case.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
If mdmonitor is awaken by event, wait for udev to finish
events processing, to eliminate the race between udev and mdadm
when spare has been added and need to be moved by mdmonitor
Signed-off-by: Oleksandr Shchirskyi <oleksandr.shchirskyi@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
If mdadm is run by udev or systemd, it gets a pipe as each stream.
Forks in the background may run after an event or service has been
processed when udev is detached from pipe. As a result process
fails quietly if any message is written.
To prevent from it, each fork has to close all parent streams. Leave
stderr and stdout opened only for debug purposes.
Unify it across all forks. Introduce other descriptors detection by
scanning /proc/self/fd directory. Add generic method for
managing systemd services.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Replace fopen() calls by open() with creation mode directly specified.
This fixes the potential security issue. Use octal values instead masks.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
If mdmonitor is awakened by event, set small delay once
to deal with udev and mdadm.
Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Stop reporting any events from container but still track them,
it is important for spare migration.
Stop mdmonitor if no redundant array is presented in mdstat.
There is nothing to follow.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
After 52209d6ee1 ("Monitor: release /proc/mdstat fd when no arrays
present") mdstat fd is closed if mdstat is empty or cannot be opened.
It causes that monitor is not able to select on mdstat. Select
doesn't fail because it gets valid descriptor to a different resource.
As a result any new event will be unnoticed until timeout (delay).
Refresh mdstat after wake up, don't poll on wrong resource.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
When running mdadm monitor with scan mode, only one autorebuild process
is allowed. check_one_sharer() checks duplicated process by following
steps,
1) Read autorebuild.pid file,
- if file does not exist, no duplicated process, go to 3).
- if file exists, continue to next step.
2) Read pid number from autorebuild.pid file, then check procfs pid
directory /proc/<PID>,
- if the directory does not exist, no duplicated process, go to 3)
- if the directory exists, print error message for duplicated process
and exit this mdadm.
3) Write current pid into autorebuild.pid file, continue to monitor in
scan mode.
The problem for the above step 2) is, if after system reboots and
another different process happens to have exact same pid number which
autorebuild.pid file records, check_one_sharer() will treat it as a
duplicated mdadm process and returns error with message "Only one
autorebuild process allowed in scan mode, aborting".
This patch tries to fix the above same-pid-but-different-process issue
by one more step to check the process command name,
1) Read autorebuild.pid file
- if file does not exist, no duplicated process, go to 4).
- if file exists, continue to next step.
2) Read pid number from autorebuild.pid file, then check procfs file
comm with the specific pid directory /proc/<PID>/comm
- if the file does not exit, it means the directory /proc/<PID> does
not exist, go to 4)
- if the file exits, continue next step
3) Read process command name from /proc/<PIC>/comm, compare the command
name with "mdadm" process name,
- if not equal, no duplicated process, goto 4)
- if strings are equal, print error message for duplicated process
and exit this mdadm.
4) Write current pid into autorebuild.pid file, continue to monitor in
scan mode.
Now check_one_sharer() returns error for duplicated process only when
the recorded pid from autorebuild.pid exists, and the process has exact
same command name as "mdadm".
Reported-by: Shinkichi Yamazaki <shinkichi.yamazaki@suse.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Currently if a md raid0/linear array gets one or more members removed while
being mounted, kernel keeps showing state 'clean' in the 'array_state'
sysfs attribute. Despite udev signaling the member device is gone, 'mdadm'
cannot issue the STOP_ARRAY ioctl successfully, given the array is mounted.
Nothing else hints that something is wrong (except that the removed devices
don't show properly in the output of mdadm 'detail' command). There is no
other property to be checked, and if user is not performing reads/writes
to the array, even kernel log is quiet and doesn't give a clue about the
missing member.
This patch is the mdadm counterpart of kernel new array state 'broken'.
The 'broken' state mimics the state 'clean' in every aspect, being useful
only to distinguish if an array has some member missing. All necessary
paths in mdadm were changed to deal with 'broken' state, and in case the
tool runs in a kernel that is not updated, it'll work normally, i.e., it
doesn't require the 'broken' state in order to work.
Also, this patch changes the way the array state is showed in the 'detail'
command (for raid0/linear only) - now it takes the 'array_state' sysfs
attribute into account instead of only rely in the MD_SB_CLEAN flag.
Cc: Jes Sorensen <jes.sorensen@gmail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
gcc-8.1 complains about truncated string operations. While we know
percent will never grow larger than 100, it doesn't cost us anything
to increase the size of 'percentalert' on the stack like this.
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Commit 4515fb28a5 ("Add detail information when can not connect
monitor") was added to warn about failed connection to monitor in
WaitClean function (see link below).
Mdmon runs for IMSM containers when they have array with redundancy so
if mdmon doesn't run, mdadm prints this error. This is misleading and
unnecessary. Just print it in WaitClean function.
The sock in WaitClean is deprecated so it is removed.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1375002
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
GET_MISMATCH option doesn't exist for RAID arrays without redundancy
so sysfs_read fails if this information is requested. Set options
according to the device using information from /proc/mdstat.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Spare migration doesn't work for external metadata. mdadm skips
a container with spare device because it is inactive. It used to work
because GET_ARRAY_INFO ioctl returned valid structure for a container
and mdadm treated such response as active container. Current
implementation checks it in sysfs where container is shown as inactive.
Adapt sysfs implementation to work the same way as ioctl.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
GET_MISMATCH option doesn't exist for containers so sysfs_read fails if
this information is requested. Set options according to the device using
information from /proc/mdstat.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
If symlink (e.g. /dev/md/raid) is passed as a parameter to mdadm --wait,
it fails as it's not able to find a corresponding entry in /proc/mdstat
output. Get parameter file major:minor and look for block device name in
sysfs. This commit is partial revert of commit 9e04ac1c43
("mdadm/util: unify stat checking blkdev into function").
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Recent commit has changed the way failed disks are counted. It breaks
recovery for external metadata arrays as failed disks are not part of
the array and have no corresponding entries is sysfs (they are only
reported for containers) so degraded arrays show no failed disks.
Recent commit overwrites GET_DEGRADED result prior to GET_STATE and it
is not set again if GET_STATE has not been requested. As GET_STATE
provides the same information as GET_DEGRADED, the latter is not needed
anymore. Remove GET_DEGRADED option and replace it with GET_STATE
option.
Don't count number of failed disks looking at sysfs entries but
calculate it at the end. Do it only for arrays as containers report
no disks, just spares.
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
This leaves working_disks and utime missing before we can eliminate
check_array()'s call to md_get_array_info()
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Add sector size as new spare selection criterion. Assume that 0 means
there is no requirement for the sector size in the array. Skip disks
with unsuitable sector size when looking for a spare to move across
containers.
Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com>
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Disks can be moved across containers in order to be used as a spare
drive for reubild. At the moment the only requirement checked for such
disk is its size (if it matches donor expectations). In order to
introduce more criteria rename corresponding superswitch method to more
generic name and move function parameter to a structure. This change is
a big edit but it doesn't introduce any changes in code logic, it just
updates function naming and parameters.
Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com>
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
declare function stat_is_blkdev() to integrate repeated stat
checking blkdev operations, it returns 'true/1' when it is a
block device, and returns 'false/0' when it isn't.
The devname is necessary parameter, *rdev is optional, parse
the pointer of dev_t *rdev, if valid, assigned device number
to dev_t *rdev, if NULL, ignores.
Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
mdassemble doesn't handle container based arrays, no support for sysfs,
etc. It has not been actively maintained for years, so time to send it
off to retirement.
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Rather than have the caller inspect the returned content, return an
error code from sysfs_init(). In addition make all callers actually
check it.
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
This removes all the inline ioctl calls for GET_DISK_INFO, allowing us
to switch to sysfs in one place, and improves type checking.
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
Remove most direct ioctl calls for GET_ARRAY_INFO, except for one,
which will be addressed in the next patch.
This is the start of the effort to clean up the use of ioctl calls and
introduce a more structured API, which will use sysfs and fall back to
ioctl for backup.
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
Wait(): stat2devnm() returns NULL for non block devices. Check the
pointer is valid derefencing it. This can happen when using --wait,
such as the 'f' and 'd' file type, causing a core dump.
such as: ./mdadm --wait /dev/md/
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
If md kernel module is reloaded, /proc/mdstat cannot be accessed ("cat:
/proc/mdstat: No such file or directory"). The reason is mdadm monitor
still holds a file descriptor to previous /proc/mdstat instance. It
leads to really confusing outcome of the following operations - mdadm
seems to run without errors, however some udev rules don't get executed
and new array doesn't work.
Add a check if lseek was successful as it fails if md kernel module has
been unloaded - close a file descriptor then. The problem is mdadm
monitor doesn't always do it before next operation takes place. To
prevent it monitor always releases /proc/mdstat descriptor when there
are no arrays to be monitored, just in case driver unload happens in a
moment.
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
As 26714713cd said, 32 bit signed
timestamps will overflow in the year 2038. It already changed the
utime and ctime in struct mdu_array_info_s from int to unsigned
int. So we need to change the values that compared with them to
unsigned int too.
Signed-off-by : Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
If Wait() finds the array resync is 'frozen', then wait
a little while to avoid races, but don't wait forever.
Signed-off-by: NeilBrown <neilb@suse.com>
Function add_new_arrays() expects that function get_md_name() should
return pointer to devname, but also get_md_name() may return NULL. So
check the pointer before use it in add_new_arrays().
Signed-off-by: Sergey Vidishev <sergeyv@yandex-team.ru>
Signed-off-by: NeilBrown <neilb@suse.de>
"Wrong-Level" is a reason, not a component device, so it should
start with a space to indiciate this to alert().
Signed-off-by: NeilBrown <neilb@suse.de>
"alert" treats the "disc" arg differently if it starts with a space.
At least it does for sending email. It doesn't for writing to syslog.
Make this consistent and obey the 'space protocol' when writing to
syslog.
Signed-off-by: NeilBrown <neilb@suse.de>