Commit Graph

338 Commits

Author SHA1 Message Date
Anna Czarnowska
326727d9c9 Use one function chosing spares from container
container_chose_spares in Monitor.c and
get_spares_for_grow in super-intel.c
do the same thing: search for spares in a container.

Another version will also be needed for Incremental
so a more general solution is presented here and
applied in two previous contexts.

Normally domlist==NULL would lead an empty list but
this is typically checked earlier so here it is interpreted
as "do not test domains".

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-01-05 14:34:14 +11:00
Anna Czarnowska
22e263f64a imsm: set imsm spare uuid to 0
uuid_match_any is replaced by uuid_zero for imsm spares.

Function fixup_container_spare_uuid not needed as it gives
unwanted uuid to spares.

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-26 21:59:31 +11:00
Krzysztof Wojcik
a06d022db4 FIX: Bad block verification during assembling array
We need to refuse to assemble an arrays with bad blocks.
Initially there was condition in container_content function
that returns error value in the case when metadata store information
about bad blocks.
When the container_content function is called from functions NOT connected
with assemble (Kill_subarray, Detail) we get faulty error return value.
Patch introduces new flag in array.status - MD_SB_BBM_ERRORS. It is set
in container_content when bad blocks are detected and can be checked by
container_content caller.

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-26 21:41:57 +11:00
Adam Kwolek
81ac8b4d56 imsm: Fill delta_disks field in getinfo_super()
delta_disks field is not always filled during getinfo_super() call.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-16 15:55:40 +11:00
Adam Kwolek
4c9bc37b97 imsm: Do not indicate resync during reshape
If reshape is started resync is not allowed in parallel. This would
break reshape.  If array is in General Migration state do not indicate
resync and allow for reshape continuation.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-16 15:48:27 +11:00
NeilBrown
aad6f216a1 Handle checkpointing during reshape
We need to allow metadata to handle progress of reshape,
completion, and abort-before-start.

Include all those in ->set_array_state()

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-16 15:48:05 +11:00
Adam Kwolek
1af97990a6 imsm: Block array state change during reshape
Array state change is blocked due to reshape action in progress
metadata changes are during applying.

'1' is returned to indicate that array is clean

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-16 13:17:47 +11:00
Adam Kwolek
d195167d9c imsm: Process reshape_update in mdmon
For this update prepare_update() allocates memory to relink imsm
(bigger) device imsm structures. It calculates new /bigger/ anchor
size.

Process update applies update in to imsm structures.  This includes
 - converting selected spares into configured devices
 - marking the arrays as migrating
 - making a new 'map' for each array with the changed details.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-16 13:17:45 +11:00
NeilBrown
cb23f1f4c3 Allow a metadata update to have a linked list of allocated spaces.
Sometimes one metadata update will require allocating several
larger data structures.  As 'monitor' cannot allocate, 'manager'
must, so it must be able to attach a list of allocates to the
update, and importantly it must be able to easily free them.

So add a 'space_list' element to metadata updates where each
element on the list starts with a pointer to the next.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-16 12:10:01 +11:00
NeilBrown
78b10e663c imsm: Prepare reshape_update in mdadm
During Online Capacity Expansion metadata has to be updated to show
array changes and allow for future assembly of array.  To do this
mdadm prepares and sends reshape_update metadata update to mdmon.
The update contains the old and new number of raid disks, and the
indices of the spare disks that will be used to fill the spaces.

This works as follows:
1. reshape_super() prepares metadata update.
2. mdadm discovers the spares and adds them to the array
3. mdadm sends the update to mdmon
4. managemon in prepare_update() allocates required memory for bigger
   device object
5. monitor in process_update() updates the metadata to record the
   new sizes and the newly assigned devices.
6. mdadm initiates the reshape

Based on code From: Adam Kwolek <adam.kwolek@intel.com>

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-16 11:45:21 +11:00
NeilBrown
94827db3b3 imsm: add spares to --examine output.
When we examine a container, list the spare devices as well as
the active devices.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-16 11:33:23 +11:00
Adam Kwolek
6c93202898 imsm: FIX: imsm_add_spare() wrongly tests spares list
For more than one disk tested additional_test_list was searched from last
point, not from begin.

This bug causes that more than 2 disks cannot be added to imsm array,
when imsm_add_spare() is used for this.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-16 09:03:03 +11:00
Labun, Marcin
95d07a2cdd IMSM: do not rebuild the array if a non-redundant sub-array with failed disks is present
Before looking for a spare to rebuild a degraded array, check if there
are any failed disks in container.  Block rebuild if another sub-array
is failed until failed disks are removed from container.

Currently, Intel metadata handler rebuilds all sub-arrays even if one
of them is non-redundant. In case of failed sub-array, failed disks
are just replaced with new ones in the metadata mapping. The data for
failed disk is not restored even the disk is present in the system.

With this fix, we require the removal of the failed disk from
container to start the process of rebuilding the array with failed
member. If the disk is physically pulled out of the system, the disk
is removed from container automatically by exiting udev rules.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-15 15:51:53 +11:00
Labun, Marcin
1a64be565b IMSM: Fix problem in mdmon monitor of using removed disk in imsm container.
Manager thread shall pass the information to monitor thread (mdmon)
that some devices are removed from container.  Otherwise, monitor
(mdmon) might use such devices (spares) to rebuild the array that has
gone degraded.

This problem happens for imsm containers, since a list of the
container disks is maintained in intel_super structure. When array
goes degraded, the list is searched to find a spare disks to start
rebuild.  Without this fix the rebuild could be stared on the spare
device that was a member of the container, but has been removed from
it.

New super type function handler has been introduced to prepare
metadata format specific information about removed devices.

int (*remove_from_super)(struct supertype *st, mdu_disk_info_t *dinfo)

The message prepared in remove_from_super is later processed by
process_update handler in monitor thread.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-15 15:51:51 +11:00
NeilBrown
1d54f2867b Merge branch 'master' into devel-3.2
Conflicts:
	super-intel.c
2010-12-13 14:00:05 +11:00
Luca Berra
a2973b6af2 segfault in imsm create with wrong arguments
When calling mdadm -C --metadata=imsm -l 1 /dev/sd..
mdadm segfaults in default_chunk_imsm()
above syntax is incorrect, but mdadm should error instead of segfaulting

Signed-off-by: Luca Berra <bluca@comedia.it>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-12-13 13:51:07 +11:00
Adam Kwolek
8ba77d3281 imsm: Allow multiple spares to be collected.
Assumption for spares searching was that after picking new device, it
has to be added to array before next search.  This causes returning
different disk on each call.

When creating a spare list during Online Capacity Expansion, we will
first collect the devices list and then all devices are added to md.
Picked device from spares pool has to be checked against picked
devices so far. If not, the same disk will be returned all the time.
Already picked devices are stored in the list and this list is used
for new devices verification also.

So add an extra arg to imsm_add_spare to hold a list of known
spares to ignore.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-30 13:14:24 +11:00
Adam Kwolek
36988a3dda imsm: FIX: core dump during imsm metadata writing
Wrong number of disks during metadata update causes core dump.  New
disks number based on internal mdmon information has to used for
calculation (not previously read from metadata).

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-29 12:53:16 +11:00
Adam Kwolek
28bce06f17 imsm: Add support for general migration
Internal IMSM procedures need to support the General Migration.
It is used during operations like:
- Online Capacity Expansion,
- migration initialization,
- finishing migration,
- apply changes to raid disks etc.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-29 12:28:01 +11:00
Dan Williams
30f58b2208 Create: cleanup/unify default geometry handling
Support metadata specific level, layout and chunksize defaults.  Kill an
uneeded superswitch methods ahead of adding more for the reshape case.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-23 15:20:50 +11:00
Marcin Labun
2cda7640f9 Policy is aware of metadata disk's controller domains.
Platform (metadata) domain let the metadata handlers differentiate
disk domains based on controllers that the disk belongs to.
Platform domain is sub-domain inside user specified domain
in mdadm.conf configuration files inheriting all parameters from it.
The metadata domain name is used disk domain matching functions.
The disk with the same metadata domain name belong to the same metadata
domain.

New metadata handler is added that retrieves platform domain string based
on disk path:
const char *(*get_disk_controller_domain)(const char *path);

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:07 +11:00
Anna Czarnowska
80e7f8c31a Monitor: Allow metadata to set minimum size for spare to migrate in.
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:07 +11:00
Anna Czarnowska
5c4cd5da70 imsm: create mdinfo list of disks in a container from supertype
If getinfo_super is called on a container supertype we only get information
on first disk. As a parameter it uses reference to preallocated
mdinfo structure. Amending getinfo_super to return full list of disks
would require ammending all previous calls and subsequently freeing memory
allocated for mdinfo list.
Function container_content that returns a mdinfo list is written
specifically for assembly, performing actions not needed to just fill
mdinfo. It also does not include spares so is unsuitable.
As an alternative a new function getinfo_super_disks is created
to obtain information about all disks states in array.
Existing function sysfs_free is used to free memory
allocated by getinfo_super_disks.

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:07 +11:00
NeilBrown
157e6e24b9 Remove loaded_container
This field is now only set, never used.

So remove it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown
ab3cb6b3b7 imsm: always calculate container_enough in getinfo_super_imsm
We are about to lose the loaded_container field, and we don't really
need to use it to protect the calculation of container_enough.
So drop the test.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:06 +11:00
NeilBrown
fa56eddbd1 Improve mddev_ident type definitions.
Remove the _t typedef and remove the _s suffix from the struct name.

These things do not help readability.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:58:05 +11:00
NeilBrown
2b959fbf66 New method: load_container
This handles the 'container' part of 'load_super', so we can
soon make them completely separate - it is just confusing to
overload these two.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown
e1902a7b6c Remove keep_fd arg from load_super_XXX_all
It is always set to 1, so we don't need it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown
69b2fcc5bb Remove subarray field in supertype.
This is now only ever set, never used.
So remove it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown
d1d599ea0d Create: user container_dev rather than subarray for some tests.
It makes more sense to test for container_dev than for subarray
for several places in Create where it then uses container_dev.

This allows us to subsequently remove subarray.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown
e32bd33f44 Remove subarray detection from load_super.
Nothing relies on this any more, so remove it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown
a951a4f78f Pass subarray arg explicitly to ->update_subarray.
This is better than hiding it in the supertype structure
where we are never quite sure who needs it.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 20:24:50 +11:00
NeilBrown
00bbdbdac6 Add subarray arg to container_content.
This allows the info for a single array to be extracted,
so we don't have to write it into st->subarray.

For consistency, implement container_content for super0 and super1,
to just return the mdinfo for the single array.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:26 +11:00
NeilBrown
a5d85af748 get_info_super: report which other devices are thought to be working/failed.
To accurately detect when an array has been split and is now being
recombined, we need to track which other devices each thinks is
working.

We should never include a device in an array if it thinks that the
primary device has failed.

This patch just allows get_info_super to return a list of devices
and whether they are thought to be working or not.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:25 +11:00
NeilBrown
1e2b276535 Report error in --update string is not recognised.
If an --update is requested by the relevant metadata doesn't
understand it, print a useful message rather than silently ignoring
the issue.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-11-22 19:35:24 +11:00
NeilBrown
64436f0628 intel: Don't try to read from tiny devices.
If a device is less than 1K, avoid even trying to seek to 1K before
the end.
The seek will fail anyway so this is a fairly cosmetic fix.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-09-06 11:26:28 +10:00
NeilBrown
cdbe98cd54 Fix compiler warning concering bad use of snprintf.
Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: Mikael Abrahamsson <swmike@swm.pp.se>
2010-08-06 20:10:48 +10:00
NeilBrown
f21e18ca89 Compile with -Wextra by default
This produced lots of warning, some of which pointed to actual bugs.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-05 13:13:02 +10:00
Dan Williams
569cc43ffb imsm: fix a -O2 build warning
super-intel.c: In function ‘imsm_add_spare’:
super-intel.c:4833: error: ‘array_start’ may be used uninitialized in this function
super-intel.c:4834: error: ‘array_end’ may be used uninitialized in this function

This is valid, if we don't find a spare candidate then array_{start,end}
will be uninitialized.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-07-06 12:48:59 -07:00
Dan Williams
d19e3cfb66 Merge branch 'fixes' into for-neil 2010-07-01 17:36:11 -07:00
Dan Williams
8cfc801c72 Merge branch 'subarray' into for-neil
Conflicts:
	mdadm.h
	super-intel.c
2010-07-01 17:36:05 -07:00
Dan Williams
aa534678ba Rename subarray v2
Allow the name of the array stored in the metadata to be updated.  In
some cases the metadata format may not be able to support this rename
without modifying the UUID.  In these cases the request will be blocked.
Otherwise we allow the rename to take place, even for active arrays.
This assumes that the user understands the difference between the kernel
node name, the device node symlink name, and the metadata specific name.

Anticipating further need to modify subarrays in-place, introduce the
->update_subarray() superswitch method.  A future potential use
case is setting storage pool (spare-group) identifiers.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-22 16:30:59 -07:00
Dan Williams
b526e52dc7 Always assume SKIP_GONE_DEVS behaviour and kill the flag
...i.e. GET_DEVS == (GET_DEVS|SKIP_GONE_DEVS)

A null pointer dereference in Incremental.c can be triggered by
replugging a disk while the old name is in use.  When mdadm -I is called
on the new disk we fail the call to sysfs_read().  I audited all the
locations that use GET_DEVS and it appears they can tolerate missing a
drive.  So just make SKIP_GONE_DEVS the default behaviour.

Also fix up remaining unchecked usages of the sysfs_read() return value.

Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-16 17:26:04 -07:00
Dan Williams
4f0a7acc9a mdmon: record sync_completed directly to the metadata
When sync_action is idle mdmon takes the latest value of md/resync_start
or md/<dev>/recovery_start to record the resync/rebuild checkpoint in
the metadata.  However, now that mdmon is reading sync_completed there
is no longer a need to wait for, or force an idle event to take a
checkpoint.

Simply update the forward progress of ->last_checkpoint at every wakeup
event and force it to be recorded at least every 1/16th array-size
interval.  It may be recorded more frequently if a ->set_array_state()
event occurs.

This also cleans up some confusion in handling the dual-rebuild case.
If more than one spare has been activated the kernel starts the rebuild
at the lowest recovery offset, so we do not need to worry about
min_recovery_start().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-15 18:41:57 -07:00
Dan Williams
0d80bb2f97 imsm: dump each disk's view of the slot state
Allow --examine to determine which disk might have a stale view of the
per-disk out-of-sync state.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-15 18:41:57 -07:00
Dave Jiang
0bd16cf217 create: Check with OROM limit before setting default chunk size
Make create check with the appropriate meta data handler and see what the
largest chunk size is supported. The current 512K default is not supported
by existing imsm OROM.

[dan.j.williams@intel.com: trim the upper limit to 512k for future oroms]
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-15 18:41:53 -07:00
Dan Williams
33414a0182 Kill subarray v2
Support for deleting a subarray out of a container.  When all subarrays
are deleted the component devices are converted back into spares, a
--zero-superblock is still needed to kill the remaining metadata at this
point.  This operation is blocked when the subarray is active and may
also be blocked by the metadata handler when deleting the subarray might
change the uuid of other active subarrays.  For example, with imsm,
deleting subarray 'n' may change the uuid of subarrays with indexes > n.

Deleting a subarray needs to be a container wide event to ensure
disks that record the modified subarray list perceive other disks that
did not receive this change as out of date.

Notes:
The st->subarray parsing in super-intel.c and super-ddf.c is updated to
be more strict now that we are reading user supplied subarray values.

Offline container modification shares actions that mdmon typically
handles so promote is_container_member() and version_to_superswitch()
(formerly find_metadata_methods()) to generic utility functions for the
cases where mdadm performs the operation.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-06-15 17:55:41 -07:00
NeilBrown
d492df0307 Merge commit '3288b419b988b20a53a2b12eb8e5f9f536228db4'; commit '4363fd80bcc9f85ed824228dee5e6350a8d73e18'; commit '63b4aae33ebf00d443378daf313622630f2336c0'
* commit '3288b419b988b20a53a2b12eb8e5f9f536228db4':
  Revert "Incremental: honor --no-degraded to delay assembly"
  Incremental: honor an 'enough' flag from external handlers

* commit '4363fd80bcc9f85ed824228dee5e6350a8d73e18':
  imsm: robustify recovery-start detection
  fix: memory leak in mdmon_pid()

* commit '63b4aae33ebf00d443378daf313622630f2336c0':
  mdmon: fix missing open of md/<dev>/recovery_start
2010-05-31 11:34:14 +10:00
Dan Williams
4363fd80bc imsm: robustify recovery-start detection
update_recovery_start() assumed that the out-of-sync disk would always be
marked as IMSM_ORD_REBUILD in the disk_ord_tbl, but the segmentation
fault reported by Andy proves otherwise.  This might also be explained by
an interrupted rebuild and the disk has not yet been marked missing.

https://bugzilla.redhat.com/show_bug.cgi?id=592030

Reported-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-26 13:33:43 -07:00
Dan Williams
97b4d0e971 Incremental: honor an 'enough' flag from external handlers
This is needed for imsm where:
1/ we want to report raid_disks as zero to allow mdadm -As to
   incorporate all spares
2/ we can't determine stale disks by looking at the event counts.
3/ we can't see per-subarray expectations with the info returned from
   the container level ->getinfo_super()

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-26 13:22:36 -07:00
Dan Williams
484240d8a3 mdmon: periodically checkpoint recovery
The kernel updates and notifies md/sync_completed when it is time to
take a checkpoint.  When this occurs (at 1/16 array size intervals)
write 'idle' to md/sync_action to have the current recovery position
updated in recovery_start and resync_start.

Requires the metadata handler to reset ->last_checkpoint when it has
determined that recovery has ended.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-14 17:42:49 -07:00
NeilBrown
691c6ee1b6 IMSM/DDF: don't recognised these metadata on partitions.
These metadata are not expected on partitions, and they have
no way of differentiation whether which is correct if they
are found both on the device and on the last partition.

So if the device is a partition, refuse to read the metadata.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-04-29 16:09:59 +10:00
Dan Williams
4eb269706f Create: cleanup after failed create in duplicated array member case
mdadm prevents creation when device names are duplicated on the command
line, but leaves the partially created array intact.  Detect this case
in the error code from add_to_super() and cleanup the partially created
array.  The imsm handler is updated to report this conflict in
add_to_super_imsm_volume().

Note that since neither mdmon, nor userspace for that matter, ever saw an
active array we only need to perform a subset of the cleanup actions.
So call ioctl(STOP_ARRAY) directly and arrange for Create() to cleanup
the map file rather than calling Manage_runstop().

Reported-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-04-19 15:28:07 +10:00
Doug Ledford
94fcb80a8e powerpc compile fix
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-04-07 09:19:42 +10:00
NeilBrown
d682f3445c ddf/intel: zero out old metadata before creating a container.
Matching the functionality already in super0 and super1, when
we first create a container, remove any other recognisable metadata to
ensure it doesn't cause confusion.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-03-10 15:55:47 +11:00
NeilBrown
c147484252 Merge branch 'master' of git://github.com/djbw/mdadm 2010-03-10 07:54:03 +11:00
NeilBrown
624c5ad4cb Make sure reshape_active is cleared by getinfo_super
There were cases where --detail would report phantom reshapes.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-03-09 16:15:29 +11:00
Dan Williams
49133e5782 imsm: kill ->creating_imsm flag
It is an unused holdover from long since removed functionality.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-03 00:03:04 -07:00
Dan Williams
32ba9157f5 Revert "Make the IMSM_DEVNAME_AS_SERIAL option work when creating containers."
This reverts commit 9ef5dbff4a as it is
duplicating the check that is done internal to imsm_read_serial().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-03 00:03:04 -07:00
NeilBrown
921d9e164f Assemble: fix --force assembly of v1.x arrays which are recovering.
1.x metadata allows a device to be a member of the array while it
is still recoverying.  So it is a working member, but is not
completely in-sync.

mdadm/assemble does not understand this distinction and assumes that a
work member is fully in-sync for the purpose of determining if there
are enough in-sync devices for the array to be functional.

So collect the 'recovery_start' value from the metadata and use it in
assemble when determining how useful a given device is.

Reported-by: Mikael Abrahamsson <swmike@swm.pp.se>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-04 12:02:09 +11:00
Luca Berra
cf1be220e2 super-intel.c: use %zu specifier for printf of size_t
Fix compile warning when size_t is not a long.

Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Luca Berra <bluca@vodka.it>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-02-01 09:15:35 +11:00
Doug Ledford
9ef5dbff4a Make the IMSM_DEVNAME_AS_SERIAL option work when creating containers.
This allows a person to testing using loopback devices that don't
support serial number queries.

Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-01-19 10:39:39 +13:00
NeilBrown
8409bc51e8 Merge branch 'klockwork' of git://github.com/djbw/mdadm
Conflicts:
	super-intel.c
2009-12-30 13:46:52 +11:00
Dan Williams
1e5c69836d imsm: add support for checkpointing via 'curr_migr_unit'
Unlike native md checkpointing some data about the geometry and type of
the migration process is coded into curr_migr_unit.  Provide logic to
convert between md/{resync_start|recovery_start} and imsm/curr_migr_unit.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-21 17:54:32 -07:00
Dan Williams
d23534e464 Teach sysfs_add_disk() callers to use ->recovery_start versus 'insync' parameter
Also fixup 'in_sync' versus 'insync' typo.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-21 11:26:21 -07:00
Dan Williams
b7528a20cc Introduce MaxSector
Replace occurrences of ~0ULL to make it clear we are talking about maximal
resync/recovery position.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-21 10:23:26 -07:00
Dan Williams
b7941fd68d mdmon: cleanup resync_start
We don't need to sprinkle reads of this attribute all over the place,
just once at the entry of read_and_act().  Also, the mdinfo structure
for the array already has a 'resync_start' member, so just reuse that.
Finally, rename get_resync_start() to read_resync_start to make it
consistent with the other sysfs accessors in monitor.c.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-14 12:57:55 -07:00
Dan Williams
8655a7b194 imsm: cleanup print_imsm_dev()
When printing the migration state there is no need to print "migrating".
The fact that the state is non-idle should be enough indication.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-12 13:57:28 -07:00
Dan Williams
ecf408e914 imsm: fix thunderdome segfault
disk_list_get() can return NULL if:
1/ A formerly missing disk is re-added
2/ The original array has not been rebuilt, so the family number of the
   missing disk still matches
3/ The metadata record of the in-sync disks are read before the missing
   disk

This will result in the missing disk not adding its own serial number to
the disk_list, only its truncated value will be present.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-12 13:57:25 -07:00
Dan Williams
ac6449bee9 imsm: fix spare promotion
When associating a spare take on the target's metadata version number to
satisfy future compare_super checks.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 15:03:34 -07:00
Dan Williams
6592ce37ee imsm: honor orom constraints for auto-layout
Factor out the orom checking bits to validate_geometry_imsm_orom() and
share it between validate_geometry_imsm_volume() and the entry path to
reserve_space().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 15:03:31 -07:00
Dan Williams
dd9bb2fbed imsm: prune dead code in validate_geometry_imsm
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:40 -07:00
Artur Wojcik
33a6535d00 Fix required to enable RAID arrays on SAS disks.
The patch increases the capacity of buffers used to store
sysfs path names. Originally the buffers were too small to
hold the canonical representation of sysfs path (in case
of a SAS device, especially a device installed behind an
expander).

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Reviewed-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:40 -07:00
Artur Wojcik
389508223e Fix for memory leak defect.
Possible memory leak. Dynamic memory stored in 'dev' and 'dev' allocated
through function 'malloc' can be lost on exit path.

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:40 -07:00
Artur Wojcik
1602d52c99 Fix for memory leak defect.
Possible memory leak. Dynamic memory stored in 'sra' allocated through
function 'sysfs_read' at line 2484 can be lost at lines 2491, 2560 and
2571.

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:40 -07:00
Artur Wojcik
e207da2f1b Fix for memory leak defect.
Dynamic memory stored in 'devnum2devname(st->container_dev)' allocated
through function 'devnum2devname' at line 1274 is lost at line 1278.

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:40 -07:00
Artur Wojcik
4e5e717d72 Fix for NULL pointer dereference defect.
Pointer 'c' returned from call to function 'strchr' at line 954 may
be NULL and will be dereferenced at line 955.

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:40 -07:00
Artur Wojcik
d362da3dfe Fix for NULL pointer dereference defect.
Pointer 'disk' returned from call to function '_get_imsm_disk' at line
700 may be NULL and will be dereferenced at line 710.

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:40 -07:00
Artur Wojcik
4e9d21862d Fix for NULL pointer dereference defect.
Pointer 'st' returned from call to function 'malloc' at line 320 may
be NULL and it will be dereferenced at line 321.

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:39 -07:00
Dan Williams
c3ca5f6028 imsm: no need to report the component device name from container_content
sysfs_add_disk() regenerates the name from major:minor, so we can drop a
strcpy that the static analysis checker does not like.

Reported-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:39 -07:00
Artur Wojcik
7a6ecd5544 Fix for buffer overflow defect.
Buffer overflow, array index of 'nm' may be out of bounds.

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:39 -07:00
Artur Wojcik
791b666ae8 Fix for NULL pointer dereference.
Pointers '_dev' and '_disk' returned from call to function '_get_imsm_dev'
and '_get_imsm_disk' may be NULL and will be dereferenced at lines
2933 and 2934, respectively.

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:39 -07:00
Artur Wojcik
d10d56feb8 Fix for NULL pointer dereference.
Suspicious dereference of pointer 'super' before NULL check at
line 3429.

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:39 -07:00
Artur Wojcik
20cbe8d2ba Fix for memory and resource leak.
Make sure opened file descriptor is cleaned up on exit
path. Also make sure allocated memory for 'sra' is released
on exit path, too.

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:39 -07:00
Artur Wojcik
0fbd635caa Fix for possible NULL pointer dereference.
Pointer 'this' returned from call to function 'malloc' at line 3795
may be NULL and will be dereferenced at line 3796.

Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-10 12:03:37 -07:00
Dan Williams
a7dd165b4e imsm: catch attempt to auto-layout zero-length arrays
When -z is omitted reserve_space() looks to satisfy a zero length
allocation which lo and behold is equal to the amount of free space on a
full disk.  So, catch maxsize == 0 and simplify the return value from
merge_extents() to always equal amount of free space (no benefit to
having a special case ~0ULL == error).

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-01 16:04:06 -07:00
NeilBrown
b42f577a0d Improve error messages when metadata handler does not support request.
->validate_geometry is called to validate overall parameters,
and to validate each individual device.
If it ever fails, it needs to report the reason, as common code
cannot possible know.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-11-17 13:15:34 +11:00
NeilBrown
1799c9e8f8 super-intel: Fix compilation of mdassemble.
Signed-off-by: NeilBrown <neilb@suse.de>
2009-10-20 13:50:23 +11:00
Dan Williams
6e46bf344b imsm: add --update=uuid support
When disks have conflicting container memberships (same container ids
but incompatible member arrays) --update=uuid can be used to move
offenders to a new container id by changing 'orig_family_num'.

Note that this only supports random updates of the uuid as the actual
uuid is synthesized.  We also need to communicate the new
'orig_family_num' value to all disks involved in the update.  A new
field 'update_private' is added to struct mdinfo to allow this
information to be transmitted.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:53 -07:00
Dan Williams
e683ca88ac imsm: fix/support --update
Fix init_super_imsm() to return an empty mpb when info == NULL, and
teach store_super_imsm() to simply write out the passed in mpb.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=523320

Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:53 -07:00
Dan Williams
f796af5d5e imsm: fix spare record writeout race
imsm_activate_spare() in the manager thread may race against
write_super_imsm_spares() in the monitor thread.  Give
write_super_imsm_spares() its own private mpb buffer to prevent
confusing the manager.

This change uncovered cases where spares were not being assembled due to
a failed metadata version number check.  Spares can freely associate
across metadata version number, so reduce the scope of the version check
in the spare assembly case.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-13 17:41:53 -07:00
Dan Williams
a2b9798159 imsm: disambiguate family_num
This is a result of trawling through the Windows implementation to learn
the mechanism of how it disambiguates family_num.  It is a continuation
of commit 148acb7b "imsm: fix family number handling" which introduced a
regression when reassembling a container with stale disks and rebuilt
members.

When rebuilding, a new family number is assigned to protect against the
"prodigal array member" problem.  It prevents a former family member
from returning to the system and causing a rebuild to go the wrong
direction.  However, this invalidates looking at the generation number to
determine the most up-to-date disk when comparing across family numbers.
Instead the assembly logic looks for agreement between a disk's local
family membership compared against a global list of all families in the
system.  Whenever a disk's local metadata does not match a family number
on the global list that family number is marked offline.

It is possible that this logic results in multiple incompatible but
valid family numbers existing in a container.  In this case mdadm.conf
cannot be consulted because it only records the uuid which is generated
from static fields in the metadata.  The metadata lacks the data needed
to disambiguate "local" versus "foreign".  The "foreign" array in this
case requires updating to change its container-id information
(orig_family_num), and possibly the member array names.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-30 11:45:41 -07:00
Dan Williams
51725a7c25 imsm: kill close() of component device
None of the other formats close the passed in fd at load, and this
becomes a problem when trying to support --update where we need O_EXCL
protection across the entire operation.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-30 11:44:38 -07:00
Dan Williams
25ed7e5924 imsm: cleanup disk status tests
Add is_failed(), is_configured(), and is_spare() helpers to clean up
disk status flag testing.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-28 14:40:59 -07:00
Dan Williams
cf53434e5c imsm: clear CONFIGURED_DISK for failed drives
Synchronizing with what the Windows driver does.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-15 11:35:28 -07:00
Dan Williams
ee5aad5ae2 imsm: kill USABLE_DISK flag
'USABLE_DISK' is not a 'persistent' status flag it is an internal status
flag used for the in memory representation of the disk in the Windows
driver.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-15 11:35:28 -07:00
Dan Williams
709743c554 imsm: fix spare promotion
1/ Fix an off by one error when detecting whether the device allocation
   loop succeeded or not
2/ Update ->num_raid_devs before copying to avoid a segmentation fault

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-15 11:34:20 -07:00
NeilBrown
4737ae25de Exmaine/brief: put member arrays after container arrays.
A previous patch moved move the '--examine --brief' reporting of
member arrays to before their containers.  This breaks "mdadm -As"
assembly.  So put them back, but still fix the problem addressed by
previous patch.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-08-07 14:17:40 +10:00
Dan Williams
7e8545e954 imsm: fix spare-uuid assignment
imsm spares do not have container membership by default so we associate
them with the first container found in the configuration file.  Some
ARRAY lines do not specify the metadata type so we cannot assume that
_cst will always be valid.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-07-31 17:11:42 -07:00
Dan Williams
148acb7baa imsm: fix family number handling
The family_number field can change.  The option-rom will change the
family number when it starts a rebuild process (flags a container for
rebuild).  This was not seen previously as mdadm would usually start the
rebuild process, preserving the family number.

This is the mechanism that helps to prevent a prodigal array member from
being returned to its original system and cause a rebuild to go in the
wrong direction.  With the change we will end up with a container that
will fail to assemble unless the device with the incompatible family
number is left out of the assembly.

So, take several actions:
1/ Convert uuid generation to use orig_family_num, being careful to
   preserve the existing uuid in the case where orig_family_num is not
   set (i.e. previous mdadm created imsm arrays)
2/ Set orig_family_num at Create.  For arrays created by mdadm prior to
   this release orig_family_num will be zero, so set it to family_num at
   the first metadata write.
3/ Add checks for orig_family_num to compare_super_imsm
4/ Update the family number when initiating rebuild
5/ The option-rom mixes some random data into the family number, add
   this functionality to the mdadm implementation.

Reported-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-07-31 17:11:41 -07:00