Commit Graph

272 Commits

Author SHA1 Message Date
Adam Kwolek 8ba77d3281 imsm: Allow multiple spares to be collected.
Assumption for spares searching was that after picking new device, it
has to be added to array before next search.  This causes returning
different disk on each call.

When creating a spare list during Online Capacity Expansion, we will
first collect the devices list and then all devices are added to md.
Picked device from spares pool has to be checked against picked
devices so far. If not, the same disk will be returned all the time.
Already picked devices are stored in the list and this list is used
for new devices verification also.

So add an extra arg to imsm_add_spare to hold a list of known
spares to ignore.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-30 13:14:24 +11:00
Adam Kwolek 36988a3dda imsm: FIX: core dump during imsm metadata writing
Wrong number of disks during metadata update causes core dump.  New
disks number based on internal mdmon information has to used for
calculation (not previously read from metadata).

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-29 12:53:16 +11:00
Adam Kwolek 28bce06f17 imsm: Add support for general migration
Internal IMSM procedures need to support the General Migration.
It is used during operations like:
- Online Capacity Expansion,
- migration initialization,
- finishing migration,
- apply changes to raid disks etc.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-29 12:28:01 +11:00
Dan Williams 30f58b2208 Create: cleanup/unify default geometry handling
Support metadata specific level, layout and chunksize defaults.  Kill an
uneeded superswitch methods ahead of adding more for the reshape case.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-23 15:20:50 +11:00
Marcin Labun 2cda7640f9 Policy is aware of metadata disk's controller domains.
Platform (metadata) domain let the metadata handlers differentiate
disk domains based on controllers that the disk belongs to.
Platform domain is sub-domain inside user specified domain
in mdadm.conf configuration files inheriting all parameters from it.
The metadata domain name is used disk domain matching functions.
The disk with the same metadata domain name belong to the same metadata
domain.

New metadata handler is added that retrieves platform domain string based
on disk path:
const char *(*get_disk_controller_domain)(const char *path);

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:07 +11:00
Anna Czarnowska 80e7f8c31a Monitor: Allow metadata to set minimum size for spare to migrate in.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:07 +11:00
Anna Czarnowska 5c4cd5da70 imsm: create mdinfo list of disks in a container from supertype
If getinfo_super is called on a container supertype we only get information
on first disk. As a parameter it uses reference to preallocated
mdinfo structure. Amending getinfo_super to return full list of disks
would require ammending all previous calls and subsequently freeing memory
allocated for mdinfo list.
Function container_content that returns a mdinfo list is written
specifically for assembly, performing actions not needed to just fill
mdinfo. It also does not include spares so is unsuitable.
As an alternative a new function getinfo_super_disks is created
to obtain information about all disks states in array.
Existing function sysfs_free is used to free memory
allocated by getinfo_super_disks.

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:07 +11:00
NeilBrown 157e6e24b9 Remove loaded_container
This field is now only set, never used.

So remove it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown ab3cb6b3b7 imsm: always calculate container_enough in getinfo_super_imsm
We are about to lose the loaded_container field, and we don't really
need to use it to protect the calculation of container_enough.
So drop the test.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown fa56eddbd1 Improve mddev_ident type definitions.
Remove the _t typedef and remove the _s suffix from the struct name.

These things do not help readability.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown 2b959fbf66 New method: load_container
This handles the 'container' part of 'load_super', so we can
soon make them completely separate - it is just confusing to
overload these two.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown e1902a7b6c Remove keep_fd arg from load_super_XXX_all
It is always set to 1, so we don't need it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown 69b2fcc5bb Remove subarray field in supertype.
This is now only ever set, never used.
So remove it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown d1d599ea0d Create: user container_dev rather than subarray for some tests.
It makes more sense to test for container_dev than for subarray
for several places in Create where it then uses container_dev.

This allows us to subsequently remove subarray.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown e32bd33f44 Remove subarray detection from load_super.
Nothing relies on this any more, so remove it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown a951a4f78f Pass subarray arg explicitly to ->update_subarray.
This is better than hiding it in the supertype structure
where we are never quite sure who needs it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown 00bbdbdac6 Add subarray arg to container_content.
This allows the info for a single array to be extracted,
so we don't have to write it into st->subarray.

For consistency, implement container_content for super0 and super1,
to just return the mdinfo for the single array.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:26 +11:00
NeilBrown a5d85af748 get_info_super: report which other devices are thought to be working/failed.
To accurately detect when an array has been split and is now being
recombined, we need to track which other devices each thinks is
working.

We should never include a device in an array if it thinks that the
primary device has failed.

This patch just allows get_info_super to return a list of devices
and whether they are thought to be working or not.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:25 +11:00
NeilBrown 1e2b276535 Report error in --update string is not recognised.
If an --update is requested by the relevant metadata doesn't
understand it, print a useful message rather than silently ignoring
the issue.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:24 +11:00
NeilBrown 64436f0628 intel: Don't try to read from tiny devices.
If a device is less than 1K, avoid even trying to seek to 1K before
the end.
The seek will fail anyway so this is a fairly cosmetic fix.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-09-06 11:26:28 +10:00
NeilBrown cdbe98cd54 Fix compiler warning concering bad use of snprintf.
Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: Mikael Abrahamsson <swmike@swm.pp.se>
2010-08-06 20:10:48 +10:00
NeilBrown f21e18ca89 Compile with -Wextra by default
This produced lots of warning, some of which pointed to actual bugs.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-05 13:13:02 +10:00
Dan Williams 569cc43ffb imsm: fix a -O2 build warning
super-intel.c: In function ‘imsm_add_spare’:
super-intel.c:4833: error: ‘array_start’ may be used uninitialized in this function
super-intel.c:4834: error: ‘array_end’ may be used uninitialized in this function

This is valid, if we don't find a spare candidate then array_{start,end}
will be uninitialized.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-07-06 12:48:59 -07:00
Dan Williams d19e3cfb66 Merge branch 'fixes' into for-neil 2010-07-01 17:36:11 -07:00
Dan Williams 8cfc801c72 Merge branch 'subarray' into for-neil
Conflicts:
	mdadm.h
	super-intel.c
2010-07-01 17:36:05 -07:00
Dan Williams aa534678ba Rename subarray v2
Allow the name of the array stored in the metadata to be updated.  In
some cases the metadata format may not be able to support this rename
without modifying the UUID.  In these cases the request will be blocked.
Otherwise we allow the rename to take place, even for active arrays.
This assumes that the user understands the difference between the kernel
node name, the device node symlink name, and the metadata specific name.

Anticipating further need to modify subarrays in-place, introduce the
->update_subarray() superswitch method.  A future potential use
case is setting storage pool (spare-group) identifiers.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-22 16:30:59 -07:00
Dan Williams b526e52dc7 Always assume SKIP_GONE_DEVS behaviour and kill the flag
...i.e. GET_DEVS == (GET_DEVS|SKIP_GONE_DEVS)

A null pointer dereference in Incremental.c can be triggered by
replugging a disk while the old name is in use.  When mdadm -I is called
on the new disk we fail the call to sysfs_read().  I audited all the
locations that use GET_DEVS and it appears they can tolerate missing a
drive.  So just make SKIP_GONE_DEVS the default behaviour.

Also fix up remaining unchecked usages of the sysfs_read() return value.

Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-16 17:26:04 -07:00
Dan Williams 4f0a7acc9a mdmon: record sync_completed directly to the metadata
When sync_action is idle mdmon takes the latest value of md/resync_start
or md/<dev>/recovery_start to record the resync/rebuild checkpoint in
the metadata.  However, now that mdmon is reading sync_completed there
is no longer a need to wait for, or force an idle event to take a
checkpoint.

Simply update the forward progress of ->last_checkpoint at every wakeup
event and force it to be recorded at least every 1/16th array-size
interval.  It may be recorded more frequently if a ->set_array_state()
event occurs.

This also cleans up some confusion in handling the dual-rebuild case.
If more than one spare has been activated the kernel starts the rebuild
at the lowest recovery offset, so we do not need to worry about
min_recovery_start().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-15 18:41:57 -07:00
Dan Williams 0d80bb2f97 imsm: dump each disk's view of the slot state
Allow --examine to determine which disk might have a stale view of the
per-disk out-of-sync state.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-15 18:41:57 -07:00
Dave Jiang 0bd16cf217 create: Check with OROM limit before setting default chunk size
Make create check with the appropriate meta data handler and see what the
largest chunk size is supported. The current 512K default is not supported
by existing imsm OROM.

[dan.j.williams@intel.com: trim the upper limit to 512k for future oroms]
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-15 18:41:53 -07:00
Dan Williams 33414a0182 Kill subarray v2
Support for deleting a subarray out of a container.  When all subarrays
are deleted the component devices are converted back into spares, a
--zero-superblock is still needed to kill the remaining metadata at this
point.  This operation is blocked when the subarray is active and may
also be blocked by the metadata handler when deleting the subarray might
change the uuid of other active subarrays.  For example, with imsm,
deleting subarray 'n' may change the uuid of subarrays with indexes > n.

Deleting a subarray needs to be a container wide event to ensure
disks that record the modified subarray list perceive other disks that
did not receive this change as out of date.

Notes:
The st->subarray parsing in super-intel.c and super-ddf.c is updated to
be more strict now that we are reading user supplied subarray values.

Offline container modification shares actions that mdmon typically
handles so promote is_container_member() and version_to_superswitch()
(formerly find_metadata_methods()) to generic utility functions for the
cases where mdadm performs the operation.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-15 17:55:41 -07:00
NeilBrown d492df0307 Merge commit '3288b419b988b20a53a2b12eb8e5f9f536228db4'; commit '4363fd80bcc9f85ed824228dee5e6350a8d73e18'; commit '63b4aae33ebf00d443378daf313622630f2336c0'
* commit '3288b419b988b20a53a2b12eb8e5f9f536228db4':
  Revert "Incremental: honor --no-degraded to delay assembly"
  Incremental: honor an 'enough' flag from external handlers

* commit '4363fd80bcc9f85ed824228dee5e6350a8d73e18':
  imsm: robustify recovery-start detection
  fix: memory leak in mdmon_pid()

* commit '63b4aae33ebf00d443378daf313622630f2336c0':
  mdmon: fix missing open of md/<dev>/recovery_start
2010-05-31 11:34:14 +10:00
Dan Williams 4363fd80bc imsm: robustify recovery-start detection
update_recovery_start() assumed that the out-of-sync disk would always be
marked as IMSM_ORD_REBUILD in the disk_ord_tbl, but the segmentation
fault reported by Andy proves otherwise.  This might also be explained by
an interrupted rebuild and the disk has not yet been marked missing.

https://bugzilla.redhat.com/show_bug.cgi?id=592030

Reported-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-26 13:33:43 -07:00
Dan Williams 97b4d0e971 Incremental: honor an 'enough' flag from external handlers
This is needed for imsm where:
1/ we want to report raid_disks as zero to allow mdadm -As to
   incorporate all spares
2/ we can't determine stale disks by looking at the event counts.
3/ we can't see per-subarray expectations with the info returned from
   the container level ->getinfo_super()

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-26 13:22:36 -07:00
Dan Williams 484240d8a3 mdmon: periodically checkpoint recovery
The kernel updates and notifies md/sync_completed when it is time to
take a checkpoint.  When this occurs (at 1/16 array size intervals)
write 'idle' to md/sync_action to have the current recovery position
updated in recovery_start and resync_start.

Requires the metadata handler to reset ->last_checkpoint when it has
determined that recovery has ended.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-14 17:42:49 -07:00
NeilBrown 691c6ee1b6 IMSM/DDF: don't recognised these metadata on partitions.
These metadata are not expected on partitions, and they have
no way of differentiation whether which is correct if they
are found both on the device and on the last partition.

So if the device is a partition, refuse to read the metadata.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-04-29 16:09:59 +10:00
Dan Williams 4eb269706f Create: cleanup after failed create in duplicated array member case
mdadm prevents creation when device names are duplicated on the command
line, but leaves the partially created array intact.  Detect this case
in the error code from add_to_super() and cleanup the partially created
array.  The imsm handler is updated to report this conflict in
add_to_super_imsm_volume().

Note that since neither mdmon, nor userspace for that matter, ever saw an
active array we only need to perform a subset of the cleanup actions.
So call ioctl(STOP_ARRAY) directly and arrange for Create() to cleanup
the map file rather than calling Manage_runstop().

Reported-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-04-19 15:28:07 +10:00
Doug Ledford 94fcb80a8e powerpc compile fix
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-04-07 09:19:42 +10:00
NeilBrown d682f3445c ddf/intel: zero out old metadata before creating a container.
Matching the functionality already in super0 and super1, when
we first create a container, remove any other recognisable metadata to
ensure it doesn't cause confusion.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-03-10 15:55:47 +11:00
NeilBrown c147484252 Merge branch 'master' of git://github.com/djbw/mdadm 2010-03-10 07:54:03 +11:00
NeilBrown 624c5ad4cb Make sure reshape_active is cleared by getinfo_super
There were cases where --detail would report phantom reshapes.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-03-09 16:15:29 +11:00
Dan Williams 49133e5782 imsm: kill ->creating_imsm flag
It is an unused holdover from long since removed functionality.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-03 00:03:04 -07:00
Dan Williams 32ba9157f5 Revert "Make the IMSM_DEVNAME_AS_SERIAL option work when creating containers."
This reverts commit 9ef5dbff4a as it is
duplicating the check that is done internal to imsm_read_serial().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-03 00:03:04 -07:00
NeilBrown 921d9e164f Assemble: fix --force assembly of v1.x arrays which are recovering.
1.x metadata allows a device to be a member of the array while it
is still recoverying.  So it is a working member, but is not
completely in-sync.

mdadm/assemble does not understand this distinction and assumes that a
work member is fully in-sync for the purpose of determining if there
are enough in-sync devices for the array to be functional.

So collect the 'recovery_start' value from the metadata and use it in
assemble when determining how useful a given device is.

Reported-by: Mikael Abrahamsson <swmike@swm.pp.se>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-04 12:02:09 +11:00
Luca Berra cf1be220e2 super-intel.c: use %zu specifier for printf of size_t
Fix compile warning when size_t is not a long.

Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Luca Berra <bluca@vodka.it>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-01 09:15:35 +11:00
Doug Ledford 9ef5dbff4a Make the IMSM_DEVNAME_AS_SERIAL option work when creating containers.
This allows a person to testing using loopback devices that don't
support serial number queries.

Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-01-19 10:39:39 +13:00
NeilBrown 8409bc51e8 Merge branch 'klockwork' of git://github.com/djbw/mdadm
Conflicts:
	super-intel.c
2009-12-30 13:46:52 +11:00
Dan Williams 1e5c69836d imsm: add support for checkpointing via 'curr_migr_unit'
Unlike native md checkpointing some data about the geometry and type of
the migration process is coded into curr_migr_unit.  Provide logic to
convert between md/{resync_start|recovery_start} and imsm/curr_migr_unit.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-21 17:54:32 -07:00
Dan Williams d23534e464 Teach sysfs_add_disk() callers to use ->recovery_start versus 'insync' parameter
Also fixup 'in_sync' versus 'insync' typo.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-21 11:26:21 -07:00
Dan Williams b7528a20cc Introduce MaxSector
Replace occurrences of ~0ULL to make it clear we are talking about maximal
resync/recovery position.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-21 10:23:26 -07:00