mdadm

Commit Graph

Author	SHA1	Message	Date
NeilBrown	7a862a020f	Don't break long strings onto multiple lines. It is best to keep strings all together so that they are easier to search for in the source code. If a string is so long that it looks ugly on one line, then maybe it should be broken into multiple lines for display too. Only strings which contain a newline can be broken into multiple lines: "It is OK to\n" "break this string\n" Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-12 13:46:53 +11:00
Pawel Baldysiak	d56dd607ba	Change way of printing name of a process Sometimes mdadm prints messages with wrong name "mdmon", and vice versa. This patch solves this problem by changing method of determining process name. Now "Name" will be set in const at start of a program, previously was hardcoded as #define. Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-12 12:11:01 +11:00
Artur Paszkiewicz	19d3ea0f0b	Monitor: fix for regression with container devices This patch fixes 2 problems introduced by commit 9a518d8: not closing a file descriptor and ignoring container devices. Array state is always "inactive" for containers, so we make sure that the device is not a container by reading also the "level" sysfs entry. Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Reviewed-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-11 15:27:57 +11:00
NeilBrown	9a518d81fe	Monitor: don't open md array that doesn't exist. Opening a block-special-device for an array that doesn't exist causes that array to be instantiated (as an empty array). Races at array shutdown can cause the array to spontaneously re-appear if some daemon notices a 'change' event and goes to investigate. Teach "mdadm --monitor" to avoid this race by checking the "array_state" before opening the device. Reported-by: Francis Moreau <francis.moro@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2014-11-25 11:44:29 +11:00
NeilBrown	73ff073271	Monitor: Stop monitoring devices that have disappeared. If we are only monitoring a device because we found it in /proc/mdstat, and it has been gone for 5 checks, forget about it completely. Signed-off-by: NeilBrown <neilb@suse.de>	2014-08-14 15:36:09 +10:00
NeilBrown	efc67e8e9f	New function: sysfs_wait We have several places that wait for activity on a sysfs file. Combine most of these into a single 'sysfs_wait' function. Signed-off-by: NeilBrown <neilb@suse.de>	2013-07-01 13:28:13 +10:00
NeilBrown	1011e8344a	Remove lots of unnecessary white space. Now that I am using white-space mode in Emacs I can see all of this, and I don't like it :-) Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 12:31:45 +10:00
NeilBrown	276be5147e	Wait: also wait if an action is about to start. If a sync/recover action is about to start but hasn't actually begun yet, /proc/mdstat won't show it, but md/sync_action will (it checks MD_RECOVERY_NEEDED). So when /proc/mdstat seems to say nothing is happening, double check with md/sync_action. Signed-off-by: NeilBrown <neilb@suse.de>	2013-05-01 10:23:40 +10:00
NeilBrown	4dd2df0966	Discard devnum in favour of devnm We widely use a "devnum" which is 0 or +ve for md%d devices and -ve for md_d%d devices. But I want to be able to use md_%s device names. So get rid of devnum (a number) and use devnm (a 32char string). eg. md0 md_d2 md_home Signed-off-by: NeilBrown <neilb@suse.de>	2013-02-21 17:05:23 +11:00
NeilBrown	639c3c103a	Allow --wait to wait for delayed resync. If a resync is delayed, then e->percent will be negative but not RESYNC_NONE. In that case we still want to wait. Reported-by: Ross Boylan <ross@biostat.ucsf.edu> Signed-off-by: NeilBrown <neilb@suse.de>	2012-11-22 08:58:54 +11:00
NeilBrown	f1661bd71b	Monitor: don't complain about non-monitorable arrays in mdadm.conf If we are asked to monitor a RAID0 or Linear - which cannot be monitored - we complain with "Device Disappeared .... Wrong-Level". However if the RAID0 or Linear is being requested because it is in mdadm.conf then the message is inappropriate and confusing. So track which arrays are added from the config file, and suppress that message in that case. Reported-by: "Johnson Yan" <johnson_yan@usish.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-24 13:09:09 +11:00
NeilBrown	95c5020544	Change Monitor to take a struct context Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:20:19 +10:00
NeilBrown	503975b9d5	Remove scattered checks for malloc success. malloc should never fail, and if it does it is unlikely that anything else useful can be done. Best approach is to abort and let some super-daemon restart. So define xmalloc, xcalloc, xrealloc, xstrdup which don't fail but just print a message and exit. Then use those removing all the tests for failure. Also replace all "malloc;memset" sequences with 'xcalloc'. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:16 +10:00
NeilBrown	e7b84f9d50	Introduce pr_err for printing error messages. 'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": ' cont_err() is also available. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:16 +10:00
NeilBrown	721b662b5b	Monitor: fix reporting for Fail vs FailSpare etc. The tests here were specific to 0.90 metadata and didn't work properly for 1.x metadata, where a device's "number" doesn't change. By checking if this is a new array we can avoid some corner cases. Then we test mostly based on state and not based on 'number' at all. Signed-off-by: NeilBrown <neilb@suse.de>	2012-06-04 12:57:52 +10:00
NeilBrown	0f760384eb	Monitor: Report NewArray when an array that disappeared reappears. Signed-off-by: NeilBrown <neilb@suse.de>	2012-06-04 12:52:36 +10:00
NeilBrown	9dad51d418	Monitor: fix inconsistencies in values for ->percent ->percent sometimes stores negative values recording states like 'pending' or 'delayed'. The value '-2' means both 'delayed' and in Monitor, 'unknown'. Also, '-1' has a meaning but not #define. So change the #defines to be prefixed with "RESYNC_", instead of "PROCESS_", add new "_NONE" and "_UNKNOWN", and use correct value in each location. Signed-off-by: NeilBrown <neilb@suse.de>	2012-06-04 12:31:40 +10:00
NeilBrown	b0599bda13	Monitor: Allow correct monitoring of more member devices. Having "MaxDisks == 384" is not good. Discard it in favour of MAX_DISKS which is 4096 Signed-off-by: NeilBrown <neilb@suse.de>	2012-06-04 09:30:56 +10:00
NeilBrown	c2ecf5f61a	Add --prefer option for --detail and --monitor Both --detail and --monitor can report the names of member devices on an array, and do so by searching /dev and finding the shortest name that matches. If --prefer=foo is given, they will instead prefer a name that contain /foo/. So mdadm --detail /dev/md0 --prefer=by-path will list the component devices via their /dev/disk/by-path/xxx names. Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-18 11:00:07 +10:00
Jes Sorensen	0011874f7e	Use MDMON_DIR for pid files created in Monitor.c Other parts of mdadm/mdmon place .pid/.sock files in MDMON_DIR. This makes Monitor.c consistent with the rest. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-23 09:05:16 +11:00
Lukasz Dorau	8453f8d0df	fix: Monitor sometimes crashes The "char cnt [40]" buffer is sometimes too small to hold all message - in such case monitor crashes. The buffer must be larger to be able to hold all message. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-12 10:40:00 +11:00
Sergey B Kirpichev	d97a5e6050	Report raid level type to syslog on RebuildFinished event. Thus, for RAID1/RAID10 this can be filtered out in logcheck. Relates-to: Debian bug 599821 Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-07 08:41:57 +11:00
Jes Sorensen	b657208c50	Monitor(): free allocated memory on exit Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-02 10:48:53 +11:00
NeilBrown	9e6d929127	Check all member devices in enough_fd The loop over all member devices in enough_fd could easily stop before it had found all devices. This would cause --re-add to fail incorrectly. So change the loop to be based on the reported number of devices in the device - with a safe-guard limit of 1024. Change some other loops to be more careful too. Reported-by: "Schmidt, Annemarie" <Annemarie.Schmidt@stratus.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-05-23 17:21:35 +10:00
Jonathan Liu	4019ad0701	Monitor: avoid NULL dereference with 0.90 metadata 0.90 array do not report the metadata type in /proc/mdstat, so we cannot assume that mse->metadata_version is non-NULL. So add an appropriate check. This adds an additional check missed by commit `eb28e119b0`. Signed-off-by: NeilBrown <neilb@suse.de>	2011-04-13 08:25:45 +10:00
Mike Frysinger	b1bac75b26	mdadm: respect --syslog in monitor mode A few places don't accept syslog as a monitor mode, so fix that. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: NeilBrown <neilb@suse.de>	2011-04-11 14:54:27 +10:00
NeilBrown	eb28e119b0	Monitor: avoid NULL dereference with 0.90 metadata 0.90 array do not report the metadata type in /proc/mdstat, so we cannot assume that mse->metadata_version is non-NULL. So add an appropriate check. Reported-by: Eugene <hdejin@yahoo.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-04-05 09:16:57 +10:00
NeilBrown	679eb882fc	Move WaitClean from sysfs to Monitor.c It might not really belong in Monitor, but it really doesn't belong in sysfs.c, and fits well with Wait() Signed-off-by: NeilBrown <neilb@suse.de>	2011-04-05 08:21:03 +10:00
NeilBrown	38376c2e51	Monitor: handle v.quick removal of devices better. If a device fails and then is removed before Monitor sees the failure, GET_DISK_INFO returns nothing so Monitor relies on mdstat info where '_' is incorrectly interpreted as 'a spare'. We should treat '_' as 'removed' - that is safer. Without this, a v.quick fail+remove gets reported as 'Failed' then 'SpareActive'. Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-22 14:47:55 +11:00
Adam Kwolek	983fff45a1	FIX: ping_monitor() usage causes memory leaks When for ping_monitor() input devnum2devname() is used, received string pointer should be passed to free() for memory release. It is not made in several places. This use case should have function to avoid memory leak. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-18 12:32:16 +11:00
NeilBrown	71204a5029	Various compile fixes. Make "make everything" succeed. This fixed some real bugs. Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-01 15:48:03 +11:00
NeilBrown	e5508b361d	Allow domain_test to report that no domains were found. Sometime we will need to know the difference between no domains found and domains didn't match. So allow domain_test to return different values and fix up all callers to maintain current behaviour. Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-01 14:44:02 +11:00
Czarnowska, Anna	bfd76b9309	Monitor: do not move partitions to external container Arrays on partitions are not supported for external metadata so do not take such spare from native array. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-01 10:40:56 +11:00
Czarnowska, Anna	a1e49d6956	Monitor: avoid adding too many spares to container Tests revealed that sometimes there are still more spares taken than needed. The reason for this is that after adding one spare to container with degraded subarray if between ioctl in main loop and load_container in try_spare_migration mdmon activates the spare we see active<raid but find no spares in parent container and so add an extra spare. To prevent such behaviour we count active disks in the list returned by getinfo_super_disks and compare it with subarray->active. If the number has increased it means new spare was added and activated so there is no need for more. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-28 11:18:57 +10:00
Czarnowska, Anna	300f503323	fix: Monitor: min_size must be set to 0 Otherwise a random value will be used for comparison later for native and ddf metadata (until min_acceptable_spare_size is defined). Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-17 12:46:14 +11:00
Czarnowska, Anna	c0dc0ad5f8	fix: segfault if subarray is monitored but container is not In this situation to->parent is null so "to" doesn't change to parent container and to->metadata is still null. This results in segmentation fault when checking to->metadata->ss->external. We should just skip this array as container is needed to move spares to. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-17 12:01:11 +11:00
Anna Czarnowska	de697acc4c	Monitor: skip array if error getting size load_super tries to load container first anyway but if it fails eg. after physically removing a disk then it tries to read metadata from container device. This will always fail and print confusing errors. So use load_container instead of load_super on container. On failure to read metadata we should skip this array. It will be dealt with the next time round. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-12 16:16:14 +11:00
Anna Czarnowska	d52bb542d4	move_spare function modified and moved to Manage.c It will also be needed for Incremental. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-05 14:34:32 +11:00
Anna Czarnowska	326727d9c9	Use one function choosing spares from container container_chose_spares in Monitor.c and get_spares_for_grow in super-intel.c do the same thing: search for spares in a container. Another version will also be needed for Incremental so a more general solution is presented here and applied in two previous contexts. Normally domlist==NULL would lead to an empty list but this is typically checked earlier so here it is interpreted as "do not test domains". Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-05 14:34:14 +11:00
Marcin Labun	5ec0f3738a	Monitor: Check destination array domain early. Destination arrays that do not have any domains are excluded from spare sharing. We can check it early, without searching for donor arrays. Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-21 09:07:08 +11:00
Anna Czarnowska	44d337f04d	fix: Monitor doesn't return after starting daemon Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-15 15:51:53 +11:00
NeilBrown	833bb0f8f6	Allow --update=devicesize with --re-add This is useful with 1.1 and 1.2 metadata to update the metadata if the device size has changed. The same functionality can be achieved by writing to the device size in sysfs after re-adding normally, but in some cases this might be easier. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-09 13:06:29 +11:00
Anna Czarnowska	e9a2ac028e	Monitor: don't add more spares than needed When we add a spare to a container it takes a while before it is noticed by mdmon and recovery starts. During this time the array remains degraded but we don't want to add any more spares to this container. Therefore we must check container with degraded array if it doesn't already have a suitable spare. container_choose_spare is reused with from=to Domain check is not needed in this situation. Ping_manager after moving disk is needed to be able to see newly added disk in container after coming back through the loop. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-03 14:11:29 +11:00
Anna Czarnowska	f0b8530630	Monitor: only get min_size once We may call chose_spare several times before we find a suitable one so it is better to get the size beforehand. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-03 14:05:11 +11:00
Anna Czarnowska	83f3bc5f04	Monitor: pass statelist reference when adding new arrays Otherwise it will not get updated. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-03 14:03:35 +11:00
Anna Czarnowska	ef15641fb5	Monitor: array that has disappeared doesn't need spares If a degraded array disappears we still have it in statelist with active<raid but it is pointless to look for spares for it. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-29 09:58:22 +11:00
Anna Czarnowska	a1bb206520	Monitor: fix writing autorebuild.pid If /var/run/mdadm doesn't exist we can never succeed writing so we should try to create it first. When we make sure it is there we write pid file as before. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-29 09:57:41 +11:00
Anna Czarnowska	24baa548c4	Monitor: reset dev when size too small Cc: linux-raid@vger.kernel.org, Williams, Dan J <dan.j.williams@intel.com>, Ciechanowski, Ed <ed.ciechanowski@intel.com> Otherwise spare will be considered good anyway. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-29 09:56:48 +11:00
Anna Czarnowska	0f0749ad93	Monitor: devid should be dev_t For consistency with makedev(). int is not sufficient. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-29 09:56:28 +11:00
Anna Czarnowska	ff044d6ba7	Monitor: few bug fixes for spare migration 1. If array not changed we should still report any degraded - another array may have a new spare that we can move. 2. Array with err=1 can't give a spare. 3. We look for spares in "from" not "st" which is supertype and has devname=NULL. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-29 09:51:27 +11:00
NeilBrown	5739e0d007	Monitor: choose spare correctly for external metadata. When metadata is managed externally - probably as a container - we need to examine that metadata to see which devices are spares. So use the getinfo_super_disk message and use the info returned. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-25 18:58:27 +11:00
NeilBrown	0fa21e8522	Monitor: separate 'choose_spare' out from 'move_spare' choosing a spare from a container is more complicated than from a native array. So separate out choose_spare to make it easier to use an alternate implementation Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-25 18:37:23 +11:00
NeilBrown	062dc4817d	Monitor: check spare group is non-NULL before adding to domain list ... otherwise we crash. Reported-by: "Labun, Marcin" <Marcin.Labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-23 11:11:45 +11:00
Anna Czarnowska	80e7f8c31a	Monitor: Allow metadata to set minimum size for spare to migrate in. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
NeilBrown	66f5c4b665	Monitor: teach spare migration about containers When trying to move a spare, move to the container of a degraded array, not to the array itself. And don't try to move from a subarray, only from a native or container array. And don't move from a container which contains degraded subarrays. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
NeilBrown	e78dda3bf5	Monitor: policy based spare migration. Rather than only migrating between arrays with the same spare_group, we now migrate based on domains set in the policy. In order for spare_group to continue to work, we treat it as a domain of the destination array, and a domain of any device we might remove from a source array. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
NeilBrown	2feb22efbc	Monitor: split out check_donor Checking compatibility between arrays for spare migration is going to become a little more complicated, so split it out into a separate function. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
NeilBrown	6d3d44d98c	Monitor: split out move_spare in spare migration. This is a simple refactoring with no functionality change. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
NeilBrown	e0bd6a9637	Monitor: create struct for holding alert info. Rather than passing mailaddr, mailfrom, cmd, dosyslog around in argument lists, create a structure to hold them all. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
NeilBrown	9bfc6a7d1a	Monitor: use calloc rather than malloc calloc zeros the memory allocated, which is safer, particularly as we add more things to struct state. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
NeilBrown	a90e1050b5	Monitor: minor optimisation to spare migration. Only try spare migration if we know that at least one array is degraded. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
Marcin Labun	c3621c0a5f	Monitor: link containers with subarrays in statelist Each container has a list of its subarrays. Each subarray has a back link to its parent container. Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
NeilBrown	2e0172b110	Break Monitor into smaller functions. Monitor() has become way too big. Break it up into multiple smaller functions that are all called from the main loop. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
NeilBrown	ca750d9830	Monitor: track metadata type or parent/container of arrays. For subarrays, record the devid of the parent. For others arrays, record the metadata type. This will be used in a subsequent patch to link related arrays together and allow spare migration between containers. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
Anna Czarnowska	5d19bb23dd	Monitor: include containers in scan mode Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
NeilBrown	b1717f0afc	Monitor: avoid skipping checks on external arrays utime is not correct for external metadata so we must not risk the observed time ever matching the old time. Reported-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:06 +11:00
Anna Czarnowska	edde9560fa	mdadm: added --no-sharing option for Monitor mode --no-sharing option disables moving spares between arrays/containers. Without the option spares are moved if needed according to config rules. We only allow one process moving spares started with --scan option. If there is such process running and another instance of Monitor is starting without --scan, then we issue a warning but allow it to continue. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:06 +11:00
Anna Czarnowska	0eac199a2c	Monitor: set err on arrays not in mdstat mse can be NULL when the array was not in mdstat when we read it but existed in statelist and was recreated after reading mdstat. In this case we set err as we can't get full update on this array this time. If the same array is given twice in command line it appears twice in statelist. The first one will mark mse->devnum=INT_MAX so the second one can't find mse. We set err on the second one as it's not needed. Also if it becomes degraded we would look for spares twice for the same array. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:06 +11:00
NeilBrown	a655e55064	Improve type names for mddev_dev Remove the _t pointer typedef and remove the _s suffix for the structure, These things do not help readability. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:05 +11:00
NeilBrown	fa56eddbd1	Improve mddev_ident type definitions. Remove the _t typedef and remove the _s suffix from the struct name. These things do not help readability. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:05 +11:00
NeilBrown	f21e18ca89	Compile with -Wextra by default This produced lots of warnings, some of which pointed to actual bugs. Signed-off-by: NeilBrown <neilb@suse.de>	2010-08-05 13:13:02 +10:00
NeilBrown	7d2e6486e3	Add --test option to --re-add and similar --test can be given in Manage mode. This can be used when there is an attempt to fail or remove 'faulty', 'failed' or 'detached' devices, or to re-add 'missing' devices. If no devices were failed, removed, or re-added, then mdadm will exit with status '2'. Signed-off-by: NeilBrown <neilb@suse.de>	2010-07-06 12:07:07 +10:00
NeilBrown	4460f8f7c3	Monitor: don't report the disappearance of a faulty device as SpareActive. Normally Monitor doesn't see faulty devices in active slots - they get moved away too quickly. But if it does, it reports the "faulty device disappeared" event (when it finally does get moved away) as SpareActive due to insufficient checking. So add a better check. Reported-by: Pierre Vignéras <pierre@vigneras.name>	2010-05-18 12:31:29 +10:00
Zdenek Behan	9a36a9b713	Monitor: add option to specify rebuild increments ie. the percent increments after which RebuildNN event is generated This is particularly useful when using --program option, rather than (only) syslog for alerts. Signed-off-by: Zdenek Behan <rain@matfyz.cz> Signed-off-by: NeilBrown <neilb@suse.de>	2009-10-19 13:13:58 +11:00
NeilBrown	6278fb3af7	Monitor: use pclose rather than fclose Using pclose is probably the right thing to do seeing that we used popen, but as there is no clear need to wait for sendmail to finish, it isn't really important. Signed-off-by: NeilBrown <neilb@suse.de>	2009-07-10 14:39:20 +10:00
NeilBrown	3b435195fc	Merge branch 'master' into devel-3.0 Conflicts: super0.c super1.c	2009-06-02 15:28:36 +10:00
NeilBrown	38a07ed61e	Move WaitClean from Monitor.c to sysfs.c That way mdmon doesn't need to include Monitor.o Signed-off-by: NeilBrown <neilb@suse.de>	2009-06-02 15:27:16 +10:00
NeilBrown	e736b62389	Update copyright dates and remove references to @cse.unsw.edu.au Also removed 'paper' addresses. Signed-off-by: NeilBrown <neilb@suse.de>	2009-06-02 14:35:45 +10:00
NeilBrown	4736b5dc77	Monitor: don't get confused if utime is never set. externally managed arrays do not (currently) cause utime in GET_ARRAY_INFO to be updated. So if it is zero, just assume the current time. This will cause GET_DISK_INFO to be called more often, but as we do the scan only every 60 seconds normally, a few extra syscalls isn't going to make a big difference. Signed-off-by: NeilBrown <neilb@suse.de>	2009-06-02 14:24:58 +10:00
NeilBrown	c7b474474c	Monitor: support spare-group manipulation for 1.x metadata. The code for moving spares around a spare-group currently only works for 0.90 metadata. Generalise it for 1.x metadata as well. Reported-by: "Garth Snyder" <garth@grsweb.us> Signed-off-by: NeilBrown <neilb@suse.de>	2009-05-12 09:49:45 +10:00
NeilBrown	db2d001cee	config: allow Array line to contain array name without /dev/md/ prefix. For consistency with --create and --assemble, allow the array name given in mdadm.conf to exclude the "/dev/md/" prefix. So e.g. ARRAY home uuid=whatever is treated like ARRAY /dev/md/home uuid=whatever Also exclude names which create_mddev will reject. Signed-off-by: NeilBrown <neilb@suse.de>	2009-05-11 15:18:35 +10:00
NeilBrown	112cace627	config: support "ARRAY <ignore> ..." lines in mdadm.conf Sometimes we want to ensure particular arrays are never assembled automatically. This might include an array made of devices that are shared between hosts. To support this, allow ARRAY lines in mdadm.conf to use the word "ignore" rather than a device name. Arrays which match such lines are never automatically assembled (though they can still be assembled by explicitly giving identification information on the mdadm command line). Signed-off-by: NeilBrown <neilb@suse.de>	2009-05-11 15:17:05 +10:00
NeilBrown	2800528713	Wait for POLLPRI on /proc or /sys files. From 2.6.30, /proc/mounts and various /sys files will probably always return 'readable' to select, so we will need to wait on POLLPRI to get the 'new data is available' signal. When using select, this corresponds to an 'exception', so adjust calls to select accordingly. In one case we sometimes wait on a socket and sometimes on /proc/mounts, so we need to test which. Signed-off-by: NeilBrown <neilb@suse.de>	2009-04-14 14:59:24 +10:00
NeilBrown	8844e29149	Merge branch 'master' in devel-3.0	2009-03-10 16:47:02 +11:00
NeilBrown	bc8544484f	Monitor: send --test message for arrays in /proc/mdstat that aren't in mdadm.conf "mdadm --monitor --test --scan" currently only sends test messages for arrays listed on the command line or in /etc/mdadm.conf. With this patch it also reports on any active arrays, which is more in line with the description in the manpage. Thanks to Andrew Walrond <andrew@walrond.org> for reporting this error. Signed-off-by: NeilBrown <neilb@suse.de>	2009-02-05 17:06:03 +11:00
NeilBrown	78fbcc1031	Merge branch 'master' into scratch-3.0 Conflicts: Assemble.c config.c	2009-01-08 09:31:28 +11:00
NeilBrown	89a10d84cb	Free mdstat data structures properly. In one case we called 'free' instead of 'mdstat_free'. In others we didn't free at all. Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-08 09:25:31 +11:00
Dan Williams	e7783ee677	fix ->metadata_version checking in Wait() Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-12-02 22:45:43 -07:00
NeilBrown	fe056d1fb0	config: Don't require an array to have a device name. i.e. in mdadm.conf you can have a line like ARRAY uuid=whatever and it will use auto-name-generation to give a name to the array at assemble-time. This is different from blind auto-assembly in that the array will be treated as 'local'.	2008-11-04 20:50:38 +11:00
Dan Williams	27dec8fae3	quiet WaitClean() Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-10-15 14:43:57 -07:00
Dan Williams	7146ec6a1e	Extend --wait-clean to checkpoint resync Root file systems backed by external metadata arrays need to be explicitly checkpointed near the time the rootfs is marked readonly as userspace will not have an opportunity to react to the final shutdown of the array. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-10-15 14:15:51 -07:00
Dan Williams	0dd3ba30aa	--wait-clean: shorten timeout Set the safemode timeout to a small value to get the array marked clean as soon as possible. We don't write 'clean' directly as it may cause mdmon to miss a 'write-pending' event. Include a couple fixes to sysfs_set_safemode(): 1/ 0 pad the milliseconds field 2/ workaround input truncation in the kernel Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-10-15 14:15:51 -07:00
Dan Williams	1770662bca	'mdadm --wait-clean' wait for array to be marked clean For use in distro shutdown scripts with a RAID root file system. Returns immediately if the array is 'readonly', or not an externally managed array. It is up to the distro's scripts to make sure no new writes hit the device after this returns 'true'. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-09-15 20:58:42 -07:00
Dan Williams	c94709e83f	Add ping_monitor() to mdadm --wait The action we are waiting for may not be complete until the monitor has had a chance to take action on the result. The following script can now remove the device on the first attempt, versus a few attempts with the original Wait(): #!/bin/bash #export MDADM_NO_MDMON=1 export IMSM_DEVNAME_AS_SERIAL=1 ./mdadm -Ss ./mdadm --zero-superblock /dev/loop[0-3] echo 2 > /proc/sys/dev/raid/speed_limit_max ./mdadm --create /dev/imsm /dev/loop[0-3] -n 4 -e imsm -a md ./mdadm --create /dev/md/r1 /dev/loop[0-3] -n 4 -l 5 --force -a mdp ./mdadm --fail /dev/md/r1 /dev/loop3 ./mdadm --wait /dev/md/r1 x=0 while ! ./mdadm --remove /dev/imsm /dev/loop3 > /dev/null 2>&1 do x=$((x+1)) done echo "removed after $x attempts" ./mdadm --add /dev/imsm /dev/loop3 Include 2 small cleanups: * remove the almost open coded fd2devnum() in Wait() by introducing a new utility routine stat2devnum() * teach connect_monitor() to parse the container device from a subarray string Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2008-09-15 20:58:42 -07:00
Doug Ledford	fb97b4d691	Clean up usage of open() Fix one call that passed an invalid mode to open Don't pass a third arg unless we also pass O_CREAT Use symbolic args for 2nd and 3rd args Signed-off-by: Doug Ledford <dledford@redhat.com>	2008-07-24 18:35:11 -04:00
Neil Brown	2cdb64897d	Fix for segfault when reading /proc/mdstat Some kernel versions don't put a space between 'active' and '(auto-read-only)' in /proc/mdstat. This causes a parsing problem leaving 'level' set to NULL which causes a crash. So synthesise a space there if it is missing, and check for 'level' to be NULL and don't de-ref if it is.	2008-04-28 16:29:12 +10:00
Neil Brown	7e0f69790c	Replace sysarray with mdinfo Sure, mdinfo is bigger, but having a uniform structure for lots of things will make life easier.	2007-12-14 20:14:59 +11:00
Neil Brown	aba69144fd	Remove spaces/tabs from ends of lines.	2007-12-14 20:13:43 +11:00
maximilian attems	4450e59ffa	Monitor.c s/MAXINT/INT_MAX/g include limits.h directly instead of values.h Signed-off-by: maximilian attems <max@stro.at> Signed-off-by: Neil Brown <neilb@suse.de>	2007-09-27 10:08:30 +10:00
maximilian attems	98127a6d4d	Monitor.c include signal.h directly another small step for better klibc support, glibc compile tested. Signed-off-by: maximilian attems <max@stro.at> Signed-off-by: Neil Brown <neilb@suse.de>	2007-09-27 10:08:30 +10:00


188 Commits