mdadm

Author	SHA1	Message	Date
NeilBrown	1011e8344a	Remove lots of unnecessary white space. Now that I am using white-space mode in Emacs I can see all of this, and I don't like it :-) Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 12:31:45 +10:00
NeilBrown	fe7e0e64b0	Manage: split Manage_runstop into Manage_run and Manage_stop The two branches have virtually nothing in common, so it is simpler if they are separate. Signed-off-by: NeilBrown <neilb@suse.de>	2013-06-19 11:23:44 +10:00
NeilBrown	d33f151842	Change some fprintf(stderrs to cont_err() Signed-off-by: NeilBrown <neilb@suse.de>	2013-05-21 12:51:33 +10:00
NeilBrown	8adabef587	Remove open-coded use_udev(). Manage_runstop has an open-coded version of use_udev() which is no longer correct. So make it use use_udev() explicitly. Signed-off-by: NeilBrown <neilb@suse.de>	2013-05-13 13:03:25 +10:00
NeilBrown	64a78416e3	Manage: support --fail set-X and --remove set-X A RAID10 array can have 'sets' of devices which are reported by --detail. They can now be collectively failed or removed. Signed-off-by: NeilBrown <neilb@suse.de>	2013-05-13 11:17:50 +10:00
NeilBrown	2fdf559d74	Manage_runstop: call flush_mdmon if O_EXCL fails on stopping mdmon array. When stopping an mdmon array, at reshape might be being aborted which inhibets O_EXCL. So if that is possible, call flush_mdmon to make sure mdmon isn't still busy. Reported-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2013-04-22 17:05:33 +10:00
NeilBrown	5e73b02409	Ensure mddev_dev struct always zeroed on allocation. There are a number of fields which should not be left uninitialised. e.g. attempt_re_add can get confused if ->writemostly is not set correctly. Signed-off-by: NeilBrown <neilb@suse.de>	2013-03-05 11:53:51 +11:00
NeilBrown	8af530b07f	Enhance incremental removal. When asked to incrementally-remove a device, try marking the array read-auto first. That will delay recording the failure in the metadata until it is really relevant. This way, if the device are just unplugged when the array is not really in use, the metadata will remain clean. If marking the default as faulty fails because it is EBUSY, that implies that the array would be failed without the device. As the device has (presumably gone) - that means the array is dead. So try to stop it. If that fails because it is in use, send a uevent to report that it is gone. Hopefully whoever mounted it will now let go. This means that if you plug in some devices and they are auto-assembled, then unplugging them will auto-deassemble relatively cleanly. To be complete, we really need the kernel to disassemble the array after the last close somehow. Maybe if a REMOVE has failed and a STOP has failed and nothing else much has happened, it could safely stop the array on last close. Signed-off-by: NeilBrown <neilb@suse.de>	2013-03-05 09:46:34 +11:00
NeilBrown	4dd2df0966	Discard devnum in favour of devnm We widely use a "devnum" which is 0 or +ve for md%d devices and -ve for md_d%d devices. But I want to be able to use md_%s device names. So get rid of devnum (a number) and use devnm (a 32char string). eg. md0 md_d2 md_home Signed-off-by: NeilBrown <neilb@suse.de>	2013-02-21 17:05:23 +11:00
NeilBrown	262e3b7fe8	Manage: Add support for --re-add faulty mdadm /dev/mdXX --re-add faulty will identify any faulty devices in the array, remove them, and --re-add them. Signed-off-by: NeilBrown <neilb@suse.de>	2012-11-28 10:19:52 +11:00
NeilBrown	5fe7f5f7c8	Fix "--remove faulty" and similar commands. A recent change to improve error messages for subdev management broken all use cases were device names like %d:%d were used. Re-arrange the code again so we use dev_open first - which understands those names - and then only try 'stat' if that failed. The important thing is to base the 'Cannot find' message on the result of 'stat', not on the result of 'open'. Signed-off-by: NeilBrown <neilb@suse.de>	2012-11-28 10:12:09 +11:00
NeilBrown	839f27a380	Manage: improve error message when given a non-block device. As dev_open uses O_DIRECT it will fail on directories and such. So we never get to report that it isn't a block device. So do a 'stat' earlier and if it is a block device, report the error there. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-24 12:51:14 +11:00
NeilBrown	70c55e36b7	Add support for --replace and --with --replace can be used to replace a device without completely failing it. Once the replacement completes the device will be failed. --with can indicate which of several spares to use. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-23 16:27:15 +11:00
NeilBrown	72ca9bcff3	Allow data-offset to be specified per-device for create mdadm --create /dev/md0 .... /dev/sda1:1024 /dev/sdb1:2048 ... The size is in K unless a suffix: K M G is given. The suffix 's' means sectors. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:21 +10:00
NeilBrown	af4348ddd1	Add data_offset arg to ->validate_geometry. This is needed to return correct available size. It isn't really used yet. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:20 +10:00
NeilBrown	387fcd593c	Add data_offset arg to ->avail_size This is currently only useful for 1.x metadata and will allow an explicit --data-offset request on command line. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:20 +10:00
NeilBrown	aab15415ed	Manage: fix checks for removal from a container. We must only remove from a container if the device isn't a member of any member array. To check we look at the 'holders' directory in sysfs. We currently skip that check if ->devname is "detached", however that can never be true since the change that introduced add_detached(). Also sysfs_unique_holder returns status in 'errno' which isn't entirely safe as e.g. closedir() is probably allowed to clear it. So make sysfs_unique_holder return an unambigious value, and us it to decide what to report. Signed-off-by: NeilBrown <neilb@suse.de>	2012-09-24 12:26:03 +10:00
NeilBrown	9cf9a1de36	Manage: zero metadata before adding to 'external' array. 'external' arrays don't support --re-add yet so old metadata is no value, and 'ddf' gets confusing in mdmon if old metadata is found. So for now, zero out any old metadata found before adding a spare to an externally-managed array. Reported-by: Albert Pauw <albert.pauw@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-08-15 09:51:20 +10:00
Lukasz Dorau	6d43efb59b	Manage.c: fix make everything compilation error This patch fixes the following make everything compilation error: Manage.c: In function ‘Manage_add’: Manage.c:538: error: ‘dev_st’ may be used uninitialized in this function make: *** [mdadm.Os] Error 1 Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-08-14 09:55:29 +10:00
NeilBrown	d070235d3f	Manage_subdevs: factor out Manage_delete Now Manage_subdevs is now small enough to be manageable. Signed-off-by: NeilBrown <neilb@suse.de>	2012-08-13 08:00:21 +10:00
NeilBrown	38aeaf3af6	Manage_subdevs: split most of 'add' handling into Manage_add. This makes Manage_subdevs smaller, and makes the error-path handling for Manage_add much cleaner and probably less buggy. Signed-off-by: NeilBrown <neilb@suse.de>	2012-08-13 08:00:21 +10:00
NeilBrown	abe94694da	Manage: split out attempt_re_add. The indent level is way too deep here, and this is a well defined task, so split it out to a separate function. Signed-off-by: NeilBrown <neilb@suse.de>	2012-08-13 08:00:21 +10:00
NeilBrown	46d475beb4	Manage_subdev: give 'st' a better name and narrower focus. 'st' is use to examine the metadata on the device being added to see if a 're-add' is possible. However it is loaded long before the 're-add' attempt is made. So move the 'load_super' closer to were it is used - allowing us to discard a number of 'free_super' call - and rename it to 'dev_st' to emphasize that it related to the current device. Signed-off-by: NeilBrown <neilb@suse.de>	2012-08-13 08:00:20 +10:00
NeilBrown	7bd04da926	Manage: minor cosmetic fixes. Signed-off-by: NeilBrown <neilb@suse.de>	2012-08-13 08:00:20 +10:00
NeilBrown	1d9976430c	Manage: simplify device searches in Manage_subdevs We currently have rather hard-to-follow loop to iterate through all the matches for 'missing' or 'faulty' or 'detached'. Simplify it by creating a list of possible devices for each of those and splicing the new list into the device list. This removes the need for 'jnext' and 'next' and various other hacks. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:22:16 +10:00
NeilBrown	ba728be72f	Convert 'quiet' to 'not verbose' in various places. If we change some functions to accept 'verbose', where <0 means to be quiet, in place of 'quiet', then we will be able to merge 'quiet' and 'verbose' together for simplicity. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:18:09 +10:00
NeilBrown	503975b9d5	Remove scattered checks for malloc success. malloc should never fail, and if it does it is unlikely that anything else useful can be done. Best approach is to abort and let some super-daemon restart. So define xmalloc, xcalloc, xrealloc, xstrdup which don't fail but just print a message and exit. Then use those removing all the tests for failure. Also replace all "malloc;memset" sequences with 'xcalloc'. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:16 +10:00
NeilBrown	c8e1a230b7	Remove re_add flag in favour of new disposition. Instead of disposition == 'a' re_add == 1 use disposition == 'A' to record that a re-add was requested. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:16 +10:00
NeilBrown	e7b84f9d50	Introduce pr_err for printing error messages. 'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": ' cont_err() is also available. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:16 +10:00
NeilBrown	0a999759b5	Relax restrictions on when --add is permitted. The restriction that --add was not allowed on a device which looked like a recent member of an array was overly harsh. The real requirement was to avoid using --add when the array had failed, and the device being added might contain necessary information which can only be incorporated by stopping and re-assembling with --force. So change the test to reflect the need. Reported-by: Doug Ledford <dledford@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-18 14:19:49 +10:00
NeilBrown	480f356641	Raid limit of 1024 when scanning for devices. When we can for devices using GET_DISK_INFO we currently limit to 1024. But some arrays can have more than this. So raise it to 4096 and make the constant a #define. Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-18 09:06:02 +10:00
NeilBrown	3556c2fafb	Fix typo: wan -> want Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-04 14:02:00 +10:00
NeilBrown	9f58469128	Manage: freeze recovery while adding multiple devices. If the kernel supports it, freeze recovery over multiple adds, so that they can all be added to the array at the same time and be recovered in parallel. Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-22 16:15:03 +11:00
NeilBrown	bcbb3112d2	Manage: replace 'return 1' with 'goto abort'. This will allow exit processing in next patch Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-22 16:07:02 +11:00
NeilBrown	c69ffac0d6	Manage: allow --re-add to failed array. If both "legs" of a RAID1 (or equivalent in RAID10) fail, then one of the becomes available again it maybe appropriate to re-add the failed device(s). So remove the restriction that an array must has 'enough' devices before being re-added, and if there is no-where to read a superblock from for matching, then assume the kernel will do necessary checks. Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-07 15:25:57 +11:00
Jes Sorensen	1471b8b14b	Manage_ro(): Check pointer rather than dereferencing it Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-03 08:09:41 +11:00
Jes Sorensen	bccd8153fa	Manage_runstop(): Avoid memory leak Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-02 10:48:53 +11:00
Jes Sorensen	b73e45ae6a	Managa_ro(): free() mdi before exiting Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-02 10:48:53 +11:00
Jes Sorensen	093d918759	Manage_subdevs(): avoid leaking super Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-02 10:48:53 +11:00
Jes Sorensen	d9ca03e9c3	remove_devices(): readlink returns -1 on error Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-02 10:48:53 +11:00
Doug Ledford	16715c01f7	Fix readding of a readwrite drive into a writemostly array If you create a two drive raid1 array with one device writemostly, then fail the readwrite drive, when you add a new device, it will get the writemostly bit copied out of the remaining device's superblock into it's own. You can then remove the new drive and readd it as readwrite, which will work for the readd, but it leaves the stale WriteMostly1 bit in devflags resulting in the device going back to writemostly on the next assembly. The fix is to make sure that A) when we readd a device and we might have filled the st->sb info from a running device instead of the device being readded, then clear/set the WriteMostly1 bit in the super1 struct in addition to setting the disk state (ditto for super0, but slightly different mechanism) and B) when adding a clean device to an array (when we most certainly did copy the superblock info from an existing device), then clear any writemostly bits. Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-09-19 13:06:38 +10:00
NeilBrown	11b391ece9	Discourage large devices from being added to 0.90 arrays. 0.90 arrays can only use up to 4TB per device. So when a larger device is added, complain a bit. Still allow it if --force is given as there could be a valid use. Signed-off-by: NeilBrown <neilb@suse.de>	2011-09-08 13:05:31 +10:00
NeilBrown	9e6d929127	Check all member devices in enough_fd The loop over all member devices in enough_fd could easily stop before it had found all devices. This would cause --re-add to fail incorrectly. So change the loop to be based on the reported number of devices in the device - with a safe-guard limit of 1024. Change some other loops to be more careful too. Reported-by: "Schmidt, Annemarie" <Annemarie.Schmidt@stratus.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-05-23 17:21:35 +10:00
NeilBrown	873eec468c	Manage: minor fix to add/re-add handling. If using an old kernel we should still check if a re-add might be intended, so we can refuse and require a '--zero' first if it is not possible. Signed-off-by: NeilBrown <neilb@suse.de>	2011-05-10 16:20:25 +10:00
NeilBrown	51d9a2ce33	Merge branch 'master' into devel-3.2 Conflicts: Incremental.c Manage.c ReadMe.c inventory mdadm.8.in mdadm.spec mdassemble.8 mdmon.8	2011-03-24 12:00:55 +11:00
NeilBrown	fb0d4b9ca2	--stop: separate 'is busy' test for 'did it stop properly'. Stopping an md array requires that there is no other user of it. However with udev and udisks and such there can be transient other users of md devices which can interfere with stopping the array. If there is a transient users, we really want "mdadm --stop" to wait a little while and retry. However if the array is genuinely in-use (e.g. mounted), then we don't want to wait at all - we want to fail immediately. So before trying to stop, re-open device with O_EXCL. If this fails then the device is probably in use, so give up. If it succeeds, but a subsequent STOP_ARRAY fails, then it is possibly a transient failure, so try again for a few seconds. Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-23 15:42:24 +11:00
Adam Kwolek	c0f8269d57	FIX: Add spare throws exception (v2) sync_metadata() requires st->sb to be loaded, otherwise exception is generated. This fails expansion, because spares cannot be added. metadata update uses tst instead st pointer, it is better than loading anchor for st as I proposed previously. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-20 15:47:17 +11:00
Krzysztof Wojcik	1ae42d9d99	Retry writing 'inactive' state during stopping array Issue observed: Sporadicaly stopping arrays using "mdadm -Ss" command does not succeded. Cause: Writting "inactive" to the array state not succeded- array is busy (accessed by udev, blkid etc.) Resolution: If writing 'inactive' fails, wait and retry again (because it is possibly a transient failure) Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-18 12:42:17 +11:00
Adam Kwolek	983fff45a1	FIX: ping_monitor() usage causes memory leaks When for ping_monitor() input devnum2devname() is used, received string pointer should be passed to free() for memory release. It is not made in several places. This use case should have function to avoid memory leak. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-18 12:32:16 +11:00
NeilBrown	d6221e667f	Manage: fix the mess I made in earlier patch. When I separated the 'native metadata' case more cleanly from the "external metadata" case for adding a drive, I left some 'external' code in the 'native' case, and didn't copy it to the 'external' case. When - in the external case - we add to super, we much check for mdmon first, so we know whether to do the metadata update ourselves or not, then afterwards call either flush_metadata_updates (to send to mdmon) or sync_metadata (to do it directly). Reported-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-18 12:31:45 +11:00

174 Commits