mdadm

Author	SHA1	Message	Date
NeilBrown	89ecd3cfe4	Grow: introduce min_offset_change to struct reshape. raid10 currently uses the 'backup_blocks' field to store something else: a minimum offset change. This is bad practice, we will shortly need to have both for RAID5/6, so make a separate field. Signed-off-by: NeilBrown <neilb@suse.de>	2013-05-22 12:20:39 +10:00
NeilBrown	74db60b00a	Add --dump / --restore functionality. This allows the metadata on a device to be saved and later restored. This can be useful before experimenting on an array that is misbehaving. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2013-05-16 15:07:16 +10:00
NeilBrown	eca944fa9c	create_mddev: add support for /dev/md_XXX non-numeric names. With the 'devnm' infrastructure fixed, it is quite easy to support names like "md_home" for md arrays. This currently defaults to "off" and can be enabled in mdadm.conf with CREATE names=yes. This is in case other tools get confused by the new names. Signed-off-by: NeilBrown <neilb@suse.de>	2013-05-15 11:03:25 +10:00
NeilBrown	4dd2df0966	Discard devnum in favour of devnm We widely use a "devnum" which is 0 or +ve for md%d devices and -ve for md_d%d devices. But I want to be able to use md_%s device names. So get rid of devnum (a number) and use devnm (a 32char string). eg. md0 md_d2 md_home Signed-off-by: NeilBrown <neilb@suse.de>	2013-02-21 17:05:23 +11:00
John Spencer	0d35d5c480	mdadm.h: fix ugly glibc specific ifdeffery the code that was exposed on anything else than dietlibc and klibc is entirely glibc specific and broke the build on musl libc. Signed-off-by: John Spencer <maillist-mdadm@barfooze.de> Signed-off-by: NeilBrown <neilb@suse.de>	2013-02-10 15:40:53 +11:00
Jes Sorensen	3e23ba9d7b	Remove --offroot argument and default to always setting argv[0] to @ We still allow --offroot to be given - for compatibility with scripts - but ignore it. The whole point of --offroot is to get systemd to not auto-kill mdmon, and we always want that. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2013-02-05 15:06:47 +11:00
NeilBrown	06d2ffc3e2	conditionally remove map_dev from find_free_devnum map_dev can be slow so it is best to not call it when not necessary. The final test in "find_free_devnum" is not relevant when udev is being used, so remove the test in that case. Signed-off-by: NeilBrown <neilb@suse.de>	2013-01-07 10:17:04 +11:00
NeilBrown	6d388a8816	MISC: Add --examine-badblocks option This will list the contents of the bad-blocks log, if one is present. Signed-off-by: NeilBrown <neilb@suse.de>	2012-12-05 12:56:31 +11:00
NeilBrown	70c55e36b7	Add support for --replace and --with --replace can be used to replace a device without completely failing it. Once the replacement completes the device will be failed. --with can indicate which of several spares to use. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-23 16:27:15 +11:00
NeilBrown	4ec2cbe96d	Remove get_one_disk It has never been used, and there isn't really any place that could usefully use it. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-22 17:23:15 +11:00
NeilBrown	72ca9bcff3	Allow data-offset to be specified per-device for create mdadm --create /dev/md0 .... /dev/sda1:1024 /dev/sdb1:2048 ... The size is in K unless a suffix: K M G is given. The suffix 's' means sectors. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:21 +10:00
NeilBrown	5e88ab2e2f	New RESHAPE_NO_BACKUP flag to track when backup action is needed. Some arrays (raid10) never need a backup file, so during assembly we can avoid the whole Grow_continue check in that case. Achieve this using a flag set by the metadata handler. Also get "mdadm -I" to fail if a backup process would be needed. It currently does fail as the kernel rejects things, but it is nicer to have this explicit. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:21 +10:00
NeilBrown	fe384ca0b9	Grow: set new_data_offset if appropriate	2012-10-04 16:34:21 +10:00
NeilBrown	80bf913592	Add space_before/space_after fields to mdinfo These will be needed to guide changes to data_offset during reshape. Only set them for super1 for now. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:21 +10:00
NeilBrown	40c9a66a5c	Add --data-offset flag for Create and Grow This can be used to over-ride the automatic assignment of data offset. For --create, it is useful to re-create old arrays where different defaults applied. For --grow it may be able to force a reshape in the reverse direction. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:21 +10:00
NeilBrown	83cd1e97cb	Add data_offset arg to ->init_super and use it in super1.c So if ->data_offset is already set, use that rather than computing one. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:20 +10:00
NeilBrown	af4348ddd1	Add data_offset arg to ->validate_geometry. This is needed to return correct available size. It isn't really used yet. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:20 +10:00
NeilBrown	387fcd593c	Add data_offset arg to ->avail_size This is currently only useful for 1.x metadata and will allow an explicit --data-offset request on command line. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:20 +10:00
NeilBrown	822e393a05	Allow parse_size to return 0. We will shortly introduce --data-offset= which is allowed to be zero. We will want to use parse_size() so it needs to be able to return '0' without it being an error. So define INVALID_SECTORS to be an impossible value (currently '1') and return and test for it consistently. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:20 +10:00
NeilBrown	7103b9b88d	Handles spaces in array names better. 1/ When printing the "name=" entry for --brief output, enclose name in quotes if it contains spaces etc. Quotes are already supported for reading mdadm.conf 2/ When a name is used as a device name, translate spaces and tabs to '_', as well as the current translation of '/' to '-'. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:20 +10:00
Maciej Naruszewicz	9eafa1de73	imsm: Allow to specify controller for --detail-platform. Usually, 'mdadm --detail-platform -e imsm' scans all the controllers looking for IMSM capabilities. This patch provides the possibility to specify a controller to scan, enabling custom usage by other processes - especially with the --export switch. $ mdadm --detail-platform Platform : Intel(R) Matrix Storage Manager Version : 9.5.0.1037 RAID Levels : raid0 raid1 raid10 raid5 Chunk Sizes : 4k 8k 16k 32k 64k 128k 2TB volumes : supported 2TB disks : not supported Max Disks : 7 Max Volumes : 2 per array, 4 per controller I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA) $ mdadm --detail-platform /sys/devices/pci0000:00/0000:00:1f.2 Platform : Intel(R) Matrix Storage Manager Version : 9.5.0.1037 RAID Levels : raid0 raid1 raid10 raid5 Chunk Sizes : 4k 8k 16k 32k 64k 128k 2TB volumes : supported 2TB disks : not supported Max Disks : 7 Max Volumes : 2 per array, 4 per controller I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA) $ mdadm --detail-platform /sys/devices/pci0000:00/0000:00:1f.2 --export MD_FIRMWARE_TYPE=imsm IMSM_VERSION=9.5.0.1037 IMSM_SUPPORTED_RAID_LEVELS=raid0 raid1 raid10 raid5 IMSM_SUPPORTED_CHUNK_SIZES=4k 8k 16k 32k 64k 128k IMSM_2TB_VOLUMES=yes IMSM_2TB_DISKS=no IMSM_MAX_DISKS=7 IMSM_MAX_VOLUMES_PER_ARRAY=2 IMSM_MAX_VOLUMES_PER_CONTROLLER=4 $ mdadm --detail-platform /sys/devices/pci0000:00/0000:00:1f.0 # This isn't an IMSM-capable controller mdadm: no active Intel(R) RAID controller found under /sys/devices/pci0000:00/0000:00:1f.0 Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-04 16:34:11 +10:00
Maciej Naruszewicz	f0ec67106c	Display size with human_size_brief with a chosen prefix. When using human_size_brief, only IEC prefixes were supported. Now it's possible to specify which format we want to see - either IEC (kibi, mebi, gibi) or JEDEC (kilo, mega, giga). Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-02 16:41:13 +10:00
Maciej Naruszewicz	e50cf22073	imsm: Add --export option for --detail-platform This option will provide most of information we can get via mdadm --detail-platform [-e format] in the key=value format. Example output: $ mdadm --detail-platform Platform : Intel(R) Matrix Storage Manager Version : 9.5.0.1037 RAID Levels : raid0 raid1 raid10 raid5 Chunk Sizes : 4k 8k 16k 32k 64k 128k 2TB volumes : supported 2TB disks : not supported Max Disks : 7 Max Volumes : 2 per array, 4 per controller I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA) $ mdadm --detail-platform --export MD_FIRMWARE_TYPE=imsm IMSM_VERSION=9.5.0.1037 IMSM_SUPPORTED_RAID_LEVELS=raid0 raid1 raid10 raid5 IMSM_SUPPORTED_CHUNK_SIZES=4k 8k 16k 32k 64k 128k IMSM_2TB_VOLUMES=yes IMSM_2TB_DISKS=no IMSM_MAX_DISKS=7 IMSM_MAX_VOLUMES_PER_ARRAY=2 IMSM_MAX_VOLUMES_PER_CONTROLLER=4 Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-02 16:28:30 +10:00
Lukasz Dorau	51d4261ca9	fix: adjust parse_size() to the unsigned size variable An error in parse_size() should be reported by 0, not -1, because -1 is changed to the max value of unsigned long long during calculations of size (e.g. at mdadm.c:412). A negative value of size should be reported as error (e.g. size equal -1 has been changed to the max value of unsigned long long so far). Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-09-20 12:27:17 +10:00
NeilBrown	7bd04da926	Manage: minor cosmetic fixes. Signed-off-by: NeilBrown <neilb@suse.de>	2012-08-13 08:00:20 +10:00
NeilBrown	50f01ba5a1	Use new struct context and struct shape for Grow_addbitmap Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:22:12 +10:00
NeilBrown	32754b7d84	Use new struct context and struct shape in Grow_reshape Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:22:09 +10:00
NeilBrown	99cc42f4a9	Use new 'struct shape' to pass args to Create Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:22:05 +10:00
NeilBrown	a4e8316a75	Use new 'struct shape' to pass args to Build Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:21:57 +10:00
NeilBrown	e705e81ba6	Create new 'struct shape' to pass around array details. This collects together many of the args given to create/build/grow. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:21:38 +10:00
NeilBrown	d04f65f48c	Change the values for "max size" from -1 to 1. Both are impossible, and '1' allows size to be unsigned, which is neater. Also #define MAX_SIZE to be '1' to make it all more explicit. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:20:32 +10:00
NeilBrown	11b6d91dd0	Change Incremental and related functions to take struct context Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:20:22 +10:00
NeilBrown	95c5020544	Change Monitor to take a struct context Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:20:19 +10:00
NeilBrown	ef898ce65b	Change Detail and misc_scan to take a struct context Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:20:16 +10:00
NeilBrown	eec3f88785	change Examine to take a struct context Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:20:00 +10:00
NeilBrown	265460abab	Examine: split 'verbose' out from 'brief'. The value of 'verbose' is sometimes mixed into 'brief', particularly for Examine. This is messy and confusing. So keep them separate. 'brief' still gets assumed when 'scan' is set, unless we are very verbose. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:19:48 +10:00
NeilBrown	171dccc813	Change Create to take a struct context Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:19:24 +10:00
NeilBrown	0c9e4afb1f	Change Build to take a struct context Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:19:21 +10:00
NeilBrown	4977146a84	Convert Assemble() to take a context rather than a list of options. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:19:07 +10:00
NeilBrown	0937132db1	Discard 'quiet' context variable. Just use negative verbose, now that we are ready for that. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:18:48 +10:00
NeilBrown	ba728be72f	Convert 'quiet' to 'not verbose' in various places. If we change some functions to accept 'verbose', where <0 means to be quiet, in place of 'quiet', then we will be able to merge 'quiet' and 'verbose' together for simplicity. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:18:09 +10:00
NeilBrown	9e33d55609	Create 'struct context' for ad hoc context option. Rather than passing a long list of little flags etc to various functions we will soon pass a small collection of structures. This first step combines a collection of variables local to 'main' into a single structure. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:17:33 +10:00
NeilBrown	7986889004	Create parse_num() function. Instead of open-coding this several times, just do it once. This frees up the name 'c' which I'm about to use. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:17 +10:00
NeilBrown	72d566f68d	Create: support --readonly flag. Allow array to be created read-only Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:17 +10:00
NeilBrown	0ea8f5b167	Assemble: allow arrays to be assembled read-only. The option was there, but never used. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:16 +10:00
NeilBrown	503975b9d5	Remove scattered checks for malloc success. malloc should never fail, and if it does it is unlikely that anything else useful can be done. Best approach is to abort and let some super-daemon restart. So define xmalloc, xcalloc, xrealloc, xstrdup which don't fail but just print a message and exit. Then use those removing all the tests for failure. Also replace all "malloc;memset" sequences with 'xcalloc'. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:16 +10:00
NeilBrown	c8e1a230b7	Remove re_add flag in favour of new disposition. Instead of disposition == 'a' re_add == 1 use disposition == 'A' to record that a re-add was requested. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:16 +10:00
NeilBrown	e7b84f9d50	Introduce pr_err for printing error messages. 'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": ' cont_err() is also available. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:16 +10:00
NeilBrown	5187a38587	Help: use an array to choose which help matches which mode. Looks cleaner than a big switch statement. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:14:16 +10:00
NeilBrown	f7d3febcd6	Use explicit non-char opt for --zero-super As we don't allow '-K' for '--zero-super' there is no point using it internally. Just define a 'KillOpt' like with other options. Signed-off-by: NeilBrown <neilb@suse.de>	2012-07-09 17:00:45 +10:00
NeilBrown	9dad51d418	Monitor: fix inconsistencies in values for ->percent ->percent sometimes stores negative values recording states like 'pending' or 'delayed'. The value '-2' means both 'delayed' and in Monitor, 'unknown'. Also, '-1' has a meaning but not #define. So change the #defines to be prefixed with "RESYNC_", instead of "PROCESS_", add new "_NONE" and "_UNKNOWN", and use correct value in each location. Signed-off-by: NeilBrown <neilb@suse.de>	2012-06-04 12:31:40 +10:00
NeilBrown	96fd06edce	Adjust to new standard of /run Now that /run seems to be a good standard, make that the default for storing various run-time files, rather than /var/run or /dev/.mdadm. Signed-off-by: NeilBrown <neilb@suse.de>	2012-05-03 14:16:56 +10:00
Jes Sorensen	012a864129	Introduce sysfs_set_num_signed() and use it to set bitmap/offset mdinfo->bitmap_offset is a signed long and needs to be treated as such when passed to the kernel. This resolves the problem with adding internal bitmaps to a 1.0 array. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-30 09:56:22 +10:00
NeilBrown	c2ecf5f61a	Add --prefer option for --detail and --monitor Both --detail and --monitor can report the names of member devices on an array, and do so by searching /dev and finding the shortest name that matches. If --prefer=foo is given, they will instead prefer a name that contain /foo/. So mdadm --detail /dev/md0 --prefer=by-path will list the component devices via their /dev/disk/by-path/xxx names. Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-18 11:00:07 +10:00
NeilBrown	480f356641	Raise limit of 1024 when scanning for devices. When we scan for devices using GET_DISK_INFO we currently limit to 1024. But some arrays can have more than this. So raise it to 4096 and make the constant a #define. Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-18 09:06:02 +10:00
Adam Kwolek	016e00f546	FIX: Support metadata changes rollback Function reshape_super() guards metadata changes. It is used to apply changes rollback in error case also. As change (apply and rollback) can be not bi-directional reshape_super() has to know if current action is metadata change that should be guarded using metadata restrictions, or this is metadata rollback change executed due to error occurrence. In second case change has to be unconditional. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-17 12:33:37 +10:00
NeilBrown	fbdef49811	Bitmap_offset is a signed number As the bitmap can be before the superblock, bitmap_offset is signed. But some of the code didn't honour that :-( Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-04 14:03:45 +10:00
NeilBrown	2d762ade6a	Fix the new ROUND_UP macro. It was missing a "- 1". Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-22 19:40:38 +11:00
Jes Sorensen	de89706515	Generalize ROUND_UP() macro and introduce matching ROUND_UP_PTR() Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-21 08:04:24 +11:00
NeilBrown	de5a472ea3	Remove avail_disks arg from 'enough'. It can easily be calculated from 'avail' and 'raid_disks', and we will soon have a case where we don't have it easily available to pass in. Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-07 14:04:47 +11:00
Jes Sorensen	a0963a86e1	Spawn mdmon with --offroot if mdadm was launched with --offroot Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 12:11:29 +11:00
Jes Sorensen	08ca2adfff	Add --offroot argument to mdadm When --offroot is specified, mdadm will change the first character of argv[0] to '@'. This is used to signal to systemd that mdadm was launched from initramfs and should not be shut down before returning to the initramfs. Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-30 12:11:16 +11:00
NeilBrown	68226a8081	monitor: ensure we retry soon when 'remove' fails. If a 'remove' fails there is no certainty that another event will happen soon, so make sure we retry soon anyway. Reported-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-01-03 11:39:59 +11:00
NeilBrown	c0c1acd691	Grow/bitmap: support adding bitmap via sysfs. Adding a bitmap via ioctl can only add it at a fixed location. That location is not suitable for 4K-block devices. So allow setting the bitmap location via sysfs if kernel supports it and aim to always use 4K alignments. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 14:10:41 +11:00
NeilBrown	81a5b4f52f	Remove update_private. This field doesn't work any more as ->getinfo_super clears the info structure at an awkward time. So get rid of it and do it differently. The issue is that the metadata handler cannot tell if the uuid it has was randomly generated or explicitly requested, except on the first call. And we don't want to accept explicit requests for IMSM. So when it was auto-generated, make it look distinctive by having the same int copied in all 4 positions. If someone requests a uuid like that, I guess they get away with it. Reported-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-20 10:30:34 +11:00
Lukasz Orlowski	7c3367585e	fix: Allowed to assemble 2 volumes with the same names from config file. mdadm allows assembling 2 volumes with the same name based on the config file. The issue is fixed by iterating over the list of md device identifiers and comparing the names of md devices against each other, detecting identical names and blocking the assembly should the same names be found. Now, having detected duplicate names, mdadm terminates without assembling the container, displaying an appropriate message. Signed-off-by: Lukasz Orlowski <lukasz.orlowski@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-07 12:20:34 +11:00
NeilBrown	2244d1a987	Remove duplicated code: search_mdstat and conf_match search_mdstat and conf_match are almost identical. Put all the functionality in conf_match, and remove search_mdstat. Reported-by: Jes.Sorensen@redhat.com Signed-off-by: NeilBrown <neilb@suse.de>	2011-11-01 13:30:41 +11:00
Labun, Marcin	81219e70f2	kill-subarray: fix, IMSM cannot kill-subarray with unsupported metadata. container_content retrieves volume information from disks in the container. For unsupported volumes the function was not returning mdinfo. When all volumes were unsupported the function was returning a NULL pointer to block actions on the volumes. Therefore, such volumes were not activated in Incremental and Assembly. As a side effect they also could not be deleted using kill-subarray, since the "kill" function needs to obtain a valid mdinfo from container_content. This patch fixes the kill-subarray problem by allowing mdinfo to be obtained for all volume types, including unsupported ones, and by introducing new array.state flags. The changes are as follows: 1. Added MD_SB_BLOCK_VOLUME for blocking an array; other arrays in the container can be activated. 2. Added MD_SB_BLOCK_CONTAINER_RESHAPE to block container-wide reshapes (like changing disk numbers in arrays). 3. The IMSM container_content handler loads mdinfo for all volumes and sets both blocking flags in the array.state field in the mdinfo of unsupported volumes. In case of some errors, all volumes can be affected. Only the blocked array is not activated (and, as a result, not reshaped). Container-wide reshapes are also blocked since by metadata definition they require modifications of both arrays. 4. Incremental_container and Assemble functions check array.state and do not activate volumes with blocking bits set. 5. assemble_container_content is changed to check container-wide reshapes before activating reshapes of assembled containers. 6. Grow_reshape and Grow_continue_command check blocking bits before starting reshapes or continuing (-G --continue) reshapes. 7. kill-subarray ignores array.state info and can remove the requested array. Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-10-31 11:29:46 +11:00
Adam Kwolek	6e75048bc5	Add recovery blocked field to mdinfo. When a container is assembled while a reshape is active on one of its members, the whole container may need to be blocked from monitoring. For this purpose the recovery_blocked field is added to the mdinfo structure. When the metadata handler finds an active reshape in a container it should set the recovery_blocked field to disable monitoring of the whole container during reshape. For arrays that don't use containers, e.g. super0/1, the recovery_blocked field has the same value as the reshape_active field. In fact, recovery is blocked during reshape for such arrays. For ddf, the metadata handler doesn't set the reshape_active field, so recovery_blocked is not set either. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-10-05 13:30:50 +11:00
Adam Kwolek	577e8448e9	Move code to get_data_disks() function Move code to function for code reuse. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-10-03 09:57:12 +11:00
Adam Kwolek	2dddadb0f7	Add continue option to grow command. To allow for reshape continuation, a '--continue' option is added to the grow command. The function executed in the grow-continue case doesn't require information about reshape geometry. All required information is read from metadata. For external metadata, reshape can be run for a monitored array/container only. When the array/container is not monitored, run mdmon for it. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-10-03 09:26:48 +11:00
Adam Kwolek	b76b30e0f9	Do not continue reshape during initrd phase. Continuing a reshape during the initrd phase will cause the file system context to be lost. This blocks the ability to control the reshape using checkpoints. To avoid this, during the initrd phase assemble has to be executed with the '--freeze-reshape' option. This makes mdadm restore the reshape critical section only. The reshape can be continued later, after the system has fully booted. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-10-03 09:15:22 +11:00
Lukasz Dorau	cc700db34f	fix: correct unlocking of map file 1. Three missing map_unlock() calls were added. 2. Map file must be unlocked on fork, else child will hold lock. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-10-03 08:55:02 +11:00
NeilBrown	ecbd9e8160	Create: improve messages from validate_geometry. When validate_geometry finds that we haven't committed to a metadata yet and that the subdev is a member of 'our' container, it needs to report any errors it finds as Create() cannot report them effectively. So make a slight change to the semantics of the 'verbose' flag and allow validate_geometry to report if it printed any error messages. Signed-off-by: NeilBrown <neilb@suse.de>	2011-09-21 14:39:01 +10:00
Adam Kwolek	3f54bd62dc	Move restore backup code to function Reshape backup should be able to be restored during reshape continuation also. To reuse already existing code it is moved to function. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-09-21 12:17:30 +10:00
Doug Ledford	16715c01f7	Fix readding of a readwrite drive into a writemostly array. If you create a two drive raid1 array with one device writemostly, then fail the readwrite drive, when you add a new device, it will get the writemostly bit copied out of the remaining device's superblock into its own. You can then remove the new drive and readd it as readwrite, which will work for the readd, but it leaves the stale WriteMostly1 bit in devflags resulting in the device going back to writemostly on the next assembly. The fix is to make sure that A) when we readd a device and we might have filled the st->sb info from a running device instead of the device being readded, then clear/set the WriteMostly1 bit in the super1 struct in addition to setting the disk state (ditto for super0, but slightly different mechanism) and B) when adding a clean device to an array (when we most certainly did copy the superblock info from an existing device), then clear any writemostly bits. Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-09-19 13:06:38 +10:00
NeilBrown	11b391ece9	Discourage large devices from being added to 0.90 arrays. 0.90 arrays can only use up to 4TB per device. So when a larger device is added, complain a bit. Still allow it if --force is given as there could be a valid use. Signed-off-by: NeilBrown <neilb@suse.de>	2011-09-08 13:05:31 +10:00
Krzysztof Wojcik	2d3603ba0c	Show DELAYED, PENDING status of resync process in "--detail". Previously there was no proper translation of mdstat's DELAYED/PENDING processes to "--detail" output. For example, if we have recover=DELAYED in mdstat, "--detail" shows "State: recovering" and "Rebuild Status = 0%". This was incorrect when the process was waiting at a checkpoint other than 0%: the rebuild status is in fact different from 0% and the user is misled. This patch fixes the problem. For the example above, "--detail" now shows "State: recovering (DELAYED)" and no information about percentage. Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-06-23 12:06:47 +10:00
Adam Kwolek	ba53ea59ad	Add reshape restart support for external metadata Patch introduces support for reshape process restart for external metadata using metadata specific data handling methods. It introduces recover_backup() function that restores array to stable state It is equivalent to Grow_restart() functionality for native metadata. Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-06-08 17:11:11 +10:00
Adam Kwolek	10f228541c	imsm: Implement imsm_manage_reshape(), reshape workhorse. Before reshape is started, mdadm should check again if there is only one array (in container) under reshape. Then the function "divides" the array into "migration units" that can fit in the migration copy area and enters the main loop. It checks if the current "migration unit" needs to be backed up. If necessary mdadm saves it to the copy area and updates the migration record. Then the MD driver is directed to perform a reshape step (of "migration unit" size) and the checkpoint is moved forward. In this way reshape is executed until the end of the array. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-06-08 17:09:08 +10:00
Adam Kwolek	2fcb75aea1	Support restore_stripes() from the given buffer For external metadata backup location and saving methods depends on metadata specific implementation details. Currently restore_stripes() function is able to restore data only from the given backup file handles and it is used only for assembling partially reshaped arrays. As this function will be very helpful for external metadata backup mechanism, add the support for restoring data from the given source buffer. Add possibility for save_stripes() to work without designation targets. Save_stripes() can now prepare data for restore_stripes() only. Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-06-08 16:24:48 +10:00
NeilBrown	95eeceeb32	getinfo_super now clears the 'info' structure before filling it in. Some code currently clears 'info' before calling getinfo_super, some code doesn't. To be consistent, change it so no caller ever clears 'info', but every getinfo_super function must clear it. Note that ->raid_disk may be meaningful if that 'map' is passed non-NULL. In that case it is copied out before the structure is zeroed. Signed-off-by: NeilBrown <neilb@suse.de>	2011-06-08 15:54:13 +10:00
NeilBrown	ce52f92f04	Grow: accept --assume-clean with --grow --size When an array is resized to have larger members, --assume-clean will disable any resync if the kernel supports it (2.6.40 and later). Signed-off-by: NeilBrown <neilb@suse.de>	2011-05-16 17:28:27 +10:00
Labun, Marcin	df3346e675	examine: allows to examine a disk metadata on non-metadata compliant systems Allow for loading metadata from disk attached to non-metadata compliant system. Affects mdadm --examine and guess_super. Added ignore_hw_compat in supertype to pass information to load_super handler. If ignore_hw_compat is set the handler should load metadata also from disks that do not comply with metadata requirements (i.e. disk is not attached to native controller, etc). Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-23 12:04:46 +11:00
NeilBrown	d998b738f5	mdmon: don't wait for O_EXCL when shutting down. If mdmon is shutting down because there are no devices left to look at, then don't wait 5 seconds for an O_EXCL open, as that can block progress of --grow. Only wait for O_EXCL if we received a signal. Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-22 16:10:22 +11:00
Krzysztof Wojcik	53ed6ac36e	Warn the user about too small array size If a single-disk RAID0 or RAID1 array is created, the user may preserve the data on the disk. If the given array size covers all partitions on the disk, all data will be available on the created array. If the array size is too small (does not cover all partitions), the data will not be accessible. This patch introduces a warning message during array creation if the given size is too small. The user may interrupt the creation process to avoid data loss. Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-14 18:21:21 +11:00
NeilBrown	e2e53a2da5	Grow: support reshape of RAID0 arrays. This is done via conversion to RAID4 and back. To grow the array, extra devices will be needed which cannot already be present as spares - so allow a list of new devices to be included in grow request which changed the number of devices. Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-10 15:05:23 +11:00
NeilBrown	4968025884	Run Grow_restart/Grow_continue when assembling the content of a container. As containers can now grow, we need to use both Grow_restart (to replay any backup-file) and Grow_continue when assembling the content of a container. Note that we don't pass a backup-file when doing incremental assembly. If such is needed in that case, the assembly will fail. To restart such arrays, explicit assembly is required. Signed-off-by: NeilBrown <neilb@suse.de>	2011-03-08 17:14:00 +11:00
Czarnowska, Anna	c21e737ba1	set default chunk in validate_geometry When chunk size is not set from command line we need to guess it depending on metadata given on command line or found on listed devices. Validate_geometry sets the default for its metadata if chunk is not set. For external metadata chunk is set only when creating in a container. For imsm validate_geometry_imsm_orom is responsible for finding default chunk depending on container metadata loaded. Container will already know which controller it is attached to, and have this controller's orom available. do_default_chunk indicates that we need to find default chunk and if validate_geometry fails for some metadata it tells us to reset chunk that may have been set. Current solution would set default chunk correctly for imsm only if container device was given on command line. With the list of devices chunk was always set to 512. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-22 11:25:07 +11:00
Adam Kwolek	41784c88f3	FIX: delta_disk can have UnSet value Delta_disk can be set to UnSet value. This can cause a wrong parameter to be passed to reshape_super(). To avoid such situations, the raid_disks and delta_disks parameters have to be passed to reshape_super() separately. It will be up to reshape_super() to validate these parameters and avoid using invalid values. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-14 11:04:09 +11:00
NeilBrown	e5e5d7cea3	Incr: don't exclude 'active' devices from auto inclusion in a container. For containers, it is always appropriate to include a device in the container. Whether it should then be included in an array is a separate question. Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-01 13:07:36 +11:00
Czarnowska, Anna	bfd76b9309	Monitor: do not move partitions to external container Arrays on partitions are not supported for external metadata so do not take such spare from native array. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-02-01 10:40:56 +11:00
Labun, Marcin	20b60dcd6c	Dynamic hot-plug udev rules for policies Neil, Please consider this patch that once was discussed and I think agreed with in general direction. It was sent a while ago but somehow was not merged into your devel3-2. This patch enables hot-plug of so called bare devices (as understood by domain policies rules in mdadm.conf). Without this patch we do NOT serve hot-plug of bare devices at all. Thanks, Marcin Labun Subject was: FW: Autorebuild, new dynamic udev rules for hot-plugs >>From c0aecd4dd96691e8bfa6f2dc187261ec8bb2c5a2 Mon Sep 17 00:00:00 2001 From: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com> Date: Thu, 23 Dec 2010 16:35:01 +0100 Subject: [PATCH] Dynamic hot-plug udev rules for policies Cc: linux-raid@vger.kernel.org, Williams, Dan J <dan.j.williams@intel.com>, Ciechanowski, Ed <ed.ciechanowski@intel.com> When introducing policies, new hot-plug rules were added to support bare disks. Mdadm was started for each hot plugged block device to determine if it could be used as spare or as a replacement member for degraded array. This patch introduces limitation of range of devices that are handled by mdadm. It limits them to the ones specified in domains associated with the actions: spare-same-port, spare and spare-force. In order to enable hot-plug for bare disks one must update udev rules with command mdadm --activate-domains[=filename] Above command writes udev rule configuration to stdout. If 'filename' is given output is written to the file provided as parameter. It is up to system administrator what should be done later. To make such rule permanent (i.e. remain after reboot) rule should be written to /lib/udev/rules.d directory. Other cases will just need to write it to /dev/.udev/rules.d directory where temporary rules lie. One should be aware of the meaning of names/priorities of the udev rules.
After mdadm.conf is changed one is obliged to re-run "mdadm --activate-domains" command in order to bring the system configuration up to date. All hot-plugged disks containing metadata are still handled by existing rules. Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-27 12:48:04 +10:00
NeilBrown	a93f87eee6	Add 'restart' arg to various functions used for reshaping. When we restart an array in the middle of a reshape, we reuse a lot of the code for starting the reshape, but it needs to know that circumstances are slightly different. So add a 'restart' arg which is used: - skip checking and adding spares - activate the array (rather than start reshape) - allow the backup file to already exist Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-17 09:53:56 +11:00
NeilBrown	999b497251	Make child_monitor a candidate for ->manage_reshape Child_monitor was designed to perform 'manage_reshape' for native arrays. So change the signature for ->manage_reshape to match child_monitor and move the call to the same place that child_monitor is called from. Also give super-intel a manage_reshape handler which simply calls child_monitor. Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-12 14:46:17 +11:00
Anna Czarnowska	d52bb542d4	move_spare function modified and moved to Manage.c It will also be needed for Incremental. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-05 14:34:32 +11:00
Anna Czarnowska	326727d9c9	Use one function choosing spares from container container_chose_spares in Monitor.c and get_spares_for_grow in super-intel.c do the same thing: search for spares in a container. Another version will also be needed for Incremental so a more general solution is presented here and applied in two previous contexts. Normally domlist==NULL would lead to an empty list but this is typically checked earlier so here it is interpreted as "do not test domains". Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-01-05 14:34:14 +11:00
Anna Czarnowska	22e263f64a	imsm: set imsm spare uuid to 0 uuid_match_any is replaced by uuid_zero for imsm spares. Function fixup_container_spare_uuid not needed as it gives unwanted uuid to spares. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-26 21:59:31 +11:00
NeilBrown	cb23f1f4c3	Allow a metadata update to have a linked list of allocated spaces. Sometimes one metadata update will require allocating several larger data structures. As 'monitor' cannot allocate, 'manager' must, so it must be able to attach a list of allocations to the update, and importantly it must be able to easily free them. So add a 'space_list' element to metadata updates where each element on the list starts with a pointer to the next. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-16 12:10:01 +11:00
NeilBrown	78b10e663c	imsm: Prepare reshape_update in mdadm During Online Capacity Expansion metadata has to be updated to show array changes and allow for future assembly of array. To do this mdadm prepares and sends reshape_update metadata update to mdmon. The update contains the old and new number of raid disks, and the indices of the spare disks that will be used to fill the spaces. This works as follows: 1. reshape_super() prepares metadata update. 2. mdadm discovers the spares and adds them to the array 3. mdadm sends the update to mdmon 4. managemon in prepare_update() allocates required memory for bigger device object 5. monitor in process_update() updates the metadata to record the new sizes and the newly assigned devices. 6. mdadm initiates the reshape Based on code From: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-16 11:45:21 +11:00
NeilBrown	11877f4dc2	Split fmt_devnum out from devnum2devname Sometimes we want to convert a devnum to a devname without allocating memory. So provide function to do the formatting without allocation. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-16 09:07:51 +11:00
Labun, Marcin	1a64be565b	IMSM: Fix problem in mdmon monitor of using removed disk in imsm container. Manager thread shall pass the information to monitor thread (mdmon) that some devices are removed from container. Otherwise, monitor (mdmon) might use such devices (spares) to rebuild the array that has gone degraded. This problem happens for imsm containers, since a list of the container disks is maintained in intel_super structure. When array goes degraded, the list is searched to find a spare disk to start rebuild. Without this fix the rebuild could be started on the spare device that was a member of the container, but has been removed from it. New super type function handler has been introduced to prepare metadata format specific information about removed devices. int (*remove_from_super)(struct supertype *st, mdu_disk_info_t *dinfo) The message prepared in remove_from_super is later processed by process_update handler in monitor thread. Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-15 15:51:51 +11:00
NeilBrown	833bb0f8f6	Allow --update=devicesize with --re-add This is useful with 1.1 and 1.2 metadata to update the metadata if the device size has changed. The same functionality can be achieved by writing to the device size in sysfs after re-adding normally, but in some cases this might be easier. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-09 13:06:29 +11:00
NeilBrown	691a36b76f	Grow: warn if growing an array will make it degraded. Growing an array when there aren't enough spares can make the array degraded. This works but might not be what is wanted. So warn the user in this case and require a --force to go ahead with the reshape. Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-09 11:51:13 +11:00
Adam Kwolek	e6e9d47b76	Grow: open backup file for reshape as function Move opening backup file to the function for future reuse during container reshape. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-03 15:00:16 +11:00
NeilBrown	87f26d14f7	Assemble: allow an array undergoing reshape to be started without backup file Though not having the proper backup file can cause data corruption, it is not enough to justify not being able to start the array at all. So allow "--invalid-backup" to be specified which says "just continue even if a backup cannot be restored". Signed-off-by: NeilBrown <neilb@suse.de>	2010-12-01 11:47:32 +11:00
NeilBrown	ab2bb0b621	mdmon: don't copy an invalid chunk_size As chunk_size in mdstat_ent is never set, we shouldn't copy it into a->info.array. In fact, it is safest to get rid of the field altogether. Reported-by: "Kwolek, Adam" <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-30 18:35:36 +11:00
Adam Kwolek	1c009fc218	Compute backup blocks in function. number of backup blocks evaluation is put in to function for code reuse. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-30 13:30:22 +11:00
Adam Kwolek	130994cb83	Prepare and free fdlist in functions fd handles table creation is put in to function for code reuse. In manage_reshape(), child_grow() function from Grow.c will be reused. To prepare parameters for this function, code from Grow.c can be reused also. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-30 13:27:08 +11:00
Adam Kwolek	6d11ec6fc2	Treat feature as experimental Due to fact that IMSM Windows compatibility was not tested yet, feature has to be treated as experimental until compatibility verification will be performed. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-29 12:11:09 +11:00
NeilBrown	746a6567d3	Improve comments for block_monitor. Also note that the leading '-' on the metadata names now simply means that mdmon must not reconfigure the array. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-29 10:32:15 +11:00
Anna Czarnowska	0f0749ad93	Monitor: devid should be dev_t For consistency with makedev(). int is not sufficient. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-29 09:56:28 +11:00
NeilBrown	de6ae75015	Incremental - avoid including wayward devices. If a device - typically in a mirrored set - is assembled independently of the other devices, and then attempted to be brought back into the set, it could contain inconsistent data. It should not be included. So detect this situation by ensuring that the 'most recent' device is believed to be active by every other device. If a device is wayward, it will only consider fellow wayward devices to be active and will think all others are failed or missing. This patch fixes --incremental, --assemble was done in an earlier patch. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-29 09:40:15 +11:00
NeilBrown	1c7a808c4d	Improve opt parsing, and distinguish long from short. In several cases, two different long options map to the same short option. So e.g. you could give '--brief' and it would be interpreted as '--bitmap'. That isn't really good. So for every shared short option, define an option number and return that for the long option instead. Then always check for both the short and long options. Also fix some bugs like " mode == 'G'" which should be '== GROW'. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-25 18:58:45 +11:00
Dan Williams	7bc7119671	External reshape (step 1): container reshape and ->reshape_super() In the native metadata case Grow_reshape() and the kernel validate what reshapes are possible / supported and the kernel handles all the metadata updates. In the external case the metadata format may have specific constraints above this baseline. External formats also introduce the constraint of only permitting some reshapes at container scope versus subarray scope. For example imsm changes to 'raiddisks' must be applied to all arrays in the container. This operation assumes that its 'st' parameter has been obtained from super_by_fd() (such that st->subarray is up to date), and that a snapshot of the metadata has been loaded from the container. Why a new method, versus extending an existing one? ->validate_geometry: this routine assumes it is being called from Create(), adding reshape complicates the cases that this routine needs to handle. Where we find that checks can be shared between the two cases those routines are refactored into common code internal to the metadata handler, i.e. no need to provide a unified external interface. ->validate_geometry() also does not expect to update the metadata. ->update_super: this is meant to update single fields at Assembly() and only at the container scope. Reshape potentially wants to update multiple fields at either container or subarray scope. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-23 16:09:27 +11:00
Dan Williams	30f58b2208	Create: cleanup/unify default geometry handling Support metadata specific level, layout and chunksize defaults. Kill an unneeded superswitch method ahead of adding more for the reshape case. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-23 15:20:50 +11:00
Dan Williams	bc77ed535d	block monitor: freeze spare assignment for external arrays In order to support reshape and atomic removal of spares from containers we need to prevent mdmon from activating spares. In the reshape case we additionally need to freeze sync_action while the reshape transaction is initiated with the kernel and recorded in the metadata. When reshaping a raid0 array we need to freeze the array before it is transitioned to a redundant raid level. Since sync_action does not exist at this point we extend the '-' prefix of a subarray string to flag mdmon not to activate spares. Mdadm needs to be reasonably certain that the version of mdmon in the system honors this 'freeze' indication. If mdmon is not already active then we assume the version that gets started is the same as the mdadm version. Otherwise, we check the version of mdmon as returned by the extended ping_monitor() operation. This is to catch cases where mdadm is upgraded in the filesystem, but mdmon started in the initramfs is from a previous release. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-23 15:00:54 +11:00
Dan Williams	e5408a3202	Provide a mdstat_ent to subarray helper ...before introducing another open coded instance of this conversion. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-23 14:44:23 +11:00
Marcin Labun	2cda7640f9	Policy is aware of metadata disk's controller domains. Platform (metadata) domain lets the metadata handlers differentiate disk domains based on controllers that the disk belongs to. Platform domain is sub-domain inside user specified domain in mdadm.conf configuration files inheriting all parameters from it. The metadata domain name is used in disk domain matching functions. Disks with the same metadata domain name belong to the same metadata domain. New metadata handler is added that retrieves platform domain string based on disk path: const char *(*get_disk_controller_domain)(const char *path); Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
Anna Czarnowska	80e7f8c31a	Monitor: Allow metadata to set minimum size for spare to migrate in. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
NeilBrown	e78dda3bf5	Monitor: policy based spare migration. Rather than only migrating between arrays with the same spare_group, we now migrate based on domains set in the policy. In order for spare_group to continue to work, we treat it as a domain of the destination array, and a domain of any device we might remove from a source array. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
Anna Czarnowska	5c4cd5da70	imsm: create mdinfo list of disks in a container from supertype If getinfo_super is called on a container supertype we only get information on first disk. As a parameter it uses reference to preallocated mdinfo structure. Amending getinfo_super to return full list of disks would require amending all previous calls and subsequently freeing memory allocated for mdinfo list. Function container_content that returns a mdinfo list is written specifically for assembly, performing actions not needed to just fill mdinfo. It also does not include spares so is unsuitable. As an alternative a new function getinfo_super_disks is created to obtain information about all disks states in array. Existing function sysfs_free is used to free memory allocated by getinfo_super_disks. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:07 +11:00
Anna Czarnowska	edde9560fa	mdadm: added --no-sharing option for Monitor mode --no-sharing option disables moving spares between arrays/containers. Without the option spares are moved if needed according to config rules. We only allow one process moving spares started with --scan option. If there is such process running and another instance of Monitor is starting without --scan, then we issue a warning but allow it to continue. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:06 +11:00
Anna Czarnowska	52d5d101a9	Util: get device size from id Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:06 +11:00
NeilBrown	d2db304558	Add action=spare-same-slot policy. When "mdadm -I" is given a device with no metadata, mdadm tries to add it as a 'spare' somewhere based on policy. This patch changes the behaviour in two ways: 1/ If the device is at a 'path' where a previous device was removed from an array or container, then we preferentially add the spare to that array or container. 2/ Previously only 'bare' devices were considered for adding as spares. Now if action=spare-same-slot is active, we will add non-bare devices, but only if the path was previously in use for some array, and the device will only be added to that array. Based on code From: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com> Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:06 +11:00
Przemyslaw Czarnowski	403410eb97	extension of IncrementalRemove to store location (path-id) of removed device If the disk is taken out from its port this port information is lost. Only udev rule can provide us with this information, and then we have to store it somehow. This patch adds writing 'cookie' file in /dev/.mdadm/failed-slots directory in form of file named with value of f<path-id> containing the metadata type and uuid of the array (or container) that the device was a member of. The uuid is in exactly the same format as in the mapfile. FAILED_SLOTS_DIR constant has been added to hold the location of cookie files. Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:06 +11:00
Przemyslaw Czarnowski	950bc34477	added --path <path_id> to give the information on the 'path-id' of removed device <path-id> allows to identify the port to which given device is plugged in. In case of hot-removal, udev can pass this information for future use (eg. write this name as 'cookie' allowing to detect the fact of reinserting device to the same port). --path <path-id> parameter has been added to device removal handle (and char *path has been added to IncrementalRemove() to pass this value) in order to pass path-id to this handler. Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:06 +11:00
NeilBrown	157e6e24b9	Remove loaded_container This field is now only set, never used. So remove it. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:06 +11:00
NeilBrown	3a3716107b	Add must_be_container helper. This checks a block device to see if it could be a container, and in particular cannot be a member device. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:06 +11:00
NeilBrown	2b594614a1	Remove content from mddev_dev Now that the next_member loop is much smaller it is easy to just use 'content' rather than stashing it in 'tmpdev->content'. So we can remove the 'content' field from 'struct mddev_dev'. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:05 +11:00
NeilBrown	a655e55064	Improve type names for mddev_dev Remove the _t pointer typedef and remove the _s suffix for the structure, These things do not help readability. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:05 +11:00
NeilBrown	fa56eddbd1	Improve mddev_ident type definitions. Remove the _t typedef and remove the _s suffix from the struct name. These things do not help readability. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:58:05 +11:00
NeilBrown	7d91c3f547	Make Incremental_container static as it is only used in Incremental.c Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:24:50 +11:00
NeilBrown	2b959fbf66	New method: load_container This handles the 'container' part of 'load_super', so we can soon make them completely separate - it is just confusing to overload these two. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:24:50 +11:00
NeilBrown	69b2fcc5bb	Remove subarray field in supertype. This is now only ever set, never used. So remove it. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:24:50 +11:00
NeilBrown	a951a4f78f	Pass subarray arg explicitly to ->update_subarray. This is better than hiding it in the supertype structure where we are never quite sure who needs it. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 20:24:50 +11:00
NeilBrown	00bbdbdac6	Add subarray arg to container_content. This allows the info for a single array to be extracted, so we don't have to write it into st->subarray. For consistency, implement container_content for super0 and super1, to just return the mdinfo for the single array. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 19:35:26 +11:00
NeilBrown	4725bc31fb	super_by_fd: return subarray info explicitly. Rather than hiding this in the 'st', return it explicitly. In the one case we still need it, copy it into st where needed. This will disappear in a future patch. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 19:35:25 +11:00
NeilBrown	feab51f8f7	open_subarray: pass subarray name as explicit arg. Rather than hiding this arg in the 'st' structure, pass it explicitly. This is a first step to getting rid of 'subarray' from 'supertype'. The strcpy in open_subarray should have better error checking, but it will disappear soon so there is little point. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 19:35:25 +11:00
NeilBrown	a5d85af748	get_info_super: report which other devices are thought to be working/failed. To accurately detect when an array has been split and is now being recombined, we need to track which other devices each thinks is working. We should never include a device in an array if it thinks that the primary device has failed. This patch just allows get_info_super to return a list of devices and whether they are thought to be working or not. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 19:35:25 +11:00
NeilBrown	f94c116f56	detail/wait: better handling of monitoring sync action. Detail: report reshape and check as well as resync and recovery Wait: if the resync is pending or delayed, wait for that too. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 19:35:25 +11:00
NeilBrown	8453e70430	Manage: be more careful about --add attempts. If an --add is requested and a re-add looks promising but fails or cannot possibly succeed, then don't try the add. This avoids inadvertently turning devices into spares when an array is failed but the devices seem to actually work. Signed-off-by: NeilBrown <neilb@suse.de>	2010-11-22 19:35:25 +11:00
NeilBrown	4e8d9f0a16	Convert 'auto' config line to policy statements	2010-09-06 11:26:28 +10:00
NeilBrown	54887ad8cb	Add guess_super_type This can select to only guess array types, or only guess partition types. Signed-off-by: NeilBrown <neilb@suse.de>	2010-09-06 11:26:28 +10:00
NeilBrown	73c9c47c70	Factor out path_policy function. Allow disk-policy to be computed given the path and disk type explicitly. This can be used when hunting through /dev/disk/by-path for something interesting. Signed-off-by: NeilBrown <neilb@suse.de>	2010-09-06 11:26:28 +10:00
NeilBrown	0592faeb5e	Add gpt pseudo-metadata This allows mdadm to work with gpt metadata to a limited extent. Signed-off-by: NeilBrown <neilb@suse.de>	2010-09-06 11:26:28 +10:00
NeilBrown	0f22b998fb	Add mbr pseudo metadata handler. To support incorporating a new bare device into a collection of arrays - one partition each - mdadm needs a modest understanding of partition tables. The main need is to be able to recognise a partition table on one device and copy it onto another. This will be done using pseudo metadata types 'mbr' and 'gpt'. Signed-off-by: NeilBrown <neilb@suse.de>	2010-09-06 11:26:28 +10:00
NeilBrown	f5f12c84ac	Add domain policy support. A device can be in a number of domains. The domains of an array is the union of the domains of all devices. A device is allowed to join an array when its set of domains is a subset of the array's domains. Signed-off-by: NeilBrown <neilb@suse.de>	2010-09-06 11:26:27 +10:00
NeilBrown	e3bb5f144b	Add policy_action support Add code for easy working 'action' policies. Signed-off-by: NeilBrown <neilb@suse.de>	2010-09-06 11:26:27 +10:00
NeilBrown	5527fc7462	Add policy framework. Policy can be stated as lines in mdadm.conf like: POLICY type=disk path=pci-0000:00:1f.2-* action=ignore domain=onboard This defines two distinct policies which apply to any disk (but not partition) device reached through the pci device 0000:00:1f.2. The policies are "action=ignore" which means certain actions will ignore the device, and "domain=onboard" which means all such devices are treated as being united under the name 'onboard'. This patch just adds data structures and code to read and manipulate them. Future patches will actually use them. Signed-off-by: NeilBrown <neilb@suse.de>	2010-09-06 11:03:43 +10:00
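
The policy commits above (5527fc7462 and f5f12c84ac) describe a POLICY line in mdadm.conf and a rule that a device may join an array when its set of domains is a subset of the array's domains. The following is a minimal Python sketch of that idea only; it is not mdadm's C implementation. The function names (parse_policy_line, device_domains, array_domains, may_join), the use of shell-style globbing for path=, and the one-value-per-key simplification are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def parse_policy_line(line):
    """Split a 'POLICY key=value ...' line into a dict (hypothetical helper).

    Simplification: one value per key; a repeated key overwrites the earlier one.
    """
    words = line.split()
    if not words or words[0].upper() != "POLICY":
        raise ValueError("not a POLICY line")
    rule = {}
    for word in words[1:]:
        key, sep, value = word.partition("=")
        if not sep:
            raise ValueError("expected key=value, got %r" % word)
        rule[key] = value
    return rule


def device_domains(rules, dev_type, dev_path):
    """Collect the domains of every rule whose type and path match the device.

    Assumption: path= is matched as a shell-style glob.
    """
    domains = set()
    for rule in rules:
        if rule.get("type", dev_type) != dev_type:
            continue
        if not fnmatchcase(dev_path, rule.get("path", "*")):
            continue
        if "domain" in rule:
            domains.add(rule["domain"])
    return domains


def array_domains(member_domain_sets):
    """The domains of an array are the union of the domains of its devices."""
    return set().union(*member_domain_sets)


def may_join(dev_domains, arr_domains):
    """A device may join when its domains are a subset of the array's domains."""
    return dev_domains <= arr_domains


rule = parse_policy_line(
    "POLICY type=disk path=pci-0000:00:1f.2-* action=ignore domain=onboard")
new_disk = device_domains([rule], "disk", "pci-0000:00:1f.2-scsi-0:0:0:0")
print(may_join(new_disk, array_domains([{"onboard"}, {"onboard", "extra"}])))
```

Taken literally, the subset rule also admits a device with no domains at all, since the empty set is a subset of every set; whether mdadm treats that case specially is not stated in the commit messages.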

529 Commits