422 Commits

Author SHA1 Message Date
Dan Williams
ae6aad8239 imsm: delete kicked disks
When we have determined that a disk is no longer of any value, remove
it from the data structure.   This is now safe because the manager
will back off while any metadata update is pending in the monitor.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 14:55:10 +10:00
NeilBrown
01f157d74a Extra option for set_array_state: you choose dirty or clean.
When we first start an array, it might be good to start recovery
straight away.  That requires setting the array to 'dirty', but
only the metadata handler can know if that is required or not.
So have a third possible 'consistent' option to set_array_state.
Either 'no' or 'yes' or 'you choose'.

Return value indicates what was chosen.

'1' (no) should be chosen unless there is a good reason.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-08-19 14:54:55 +10:00
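The three-way 'consistent' argument described in this commit could look roughly like the sketch below. It is an illustration only, not the mdadm code: the enum names, the numeric values, the `needs_recovery` field, and the reading of a return value of 1 as "clean" are all assumptions made for the example.

```c
/* Hypothetical encoding of the 'consistent' argument (assumed, not from mdadm). */
enum consistent_req {
    REQ_DIRTY = 0,           /* caller insists the array is marked dirty */
    REQ_CLEAN = 1,           /* caller insists the array is marked clean */
    REQ_HANDLER_CHOOSES = 2  /* "you choose": the metadata handler decides */
};

/* Hypothetical per-array state that only the metadata handler knows. */
struct array_state {
    int needs_recovery;  /* non-zero if recovery should start straight away */
    int marked_clean;    /* what was last recorded in the metadata */
};

/*
 * Record the requested state and return what was actually chosen
 * (assumed here: 0 = dirty, 1 = clean).  When the handler chooses,
 * it picks 1 unless it has a reason to go dirty.
 */
static int set_array_state_sketch(struct array_state *a, enum consistent_req req)
{
    int clean;

    if (req == REQ_HANDLER_CHOOSES)
        clean = a->needs_recovery ? 0 : 1;
    else
        clean = (req == REQ_CLEAN);

    a->marked_clean = clean;
    return clean;
}
```

The caller only needs the return value when it passed the "you choose" request; for the other two requests the result simply echoes what was asked for.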
Dan Williams
fcb844757f imsm: include not synced disks in imsm_count_failed
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-15 10:58:42 -07:00
Dan Williams
7eef045331 imsm: use disk_ord_tbl to identify rebuilding disks
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-15 10:57:19 -07:00
Dan Williams
9a1608e5d0 imsm: fix up assembly of disks that are not in-sync
1/ Do not assemble !in_sync or failed devices in container_content.
2/ Prevent activation of failed or configured devices in activate_spare.
3/ Be sure to avoid dirty degraded if the array was shut down cleanly.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:25:49 -07:00
Dan Williams
6a3e913ee9 imsm: fix create by mdmon-update
imsm_dev dynamically grows, so dev_idx needs to be moved up in the
definition to avoid getting clobbered.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:25:49 -07:00
Dan Williams
e74255d907 imsm: write_super return 0 on success
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:25:49 -07:00
Dan Williams
a48ac0a8d6 imsm: update mpb_size in write_super_imsm
With dev->vol.map and mpb->disk entries entering and leaving the parameter
block, write_super_imsm needs to update the size before writeback.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:25:49 -07:00
Dan Williams
272906ef49 mdmon: use activate spare for re-add
Disks that are not in-sync or failed are not assembled into member
arrays by mdadm.  Teach mdmon to resolve this situation by checking for
spares at start.  imsm_activate_spare() is updated to prefer devices
that can be re-added versus new spares.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:25:46 -07:00
Dan Williams
3393c6af8b imsm: fix handling of the 'migr_state' and 'migr_type' bits
The option-rom and the Matrix driver mark resyncs/rebuilds with the
migrate state bits.  Update sizeof_imsm_dev to allow allocation of
imsm_dev entries large enough to grow if migr_state is later set.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-12 02:05:20 -07:00
Dan Williams
a965f303c7 imsm: add get_imsm_map and sizeof_imsm_map
retrieve map entries from an imsm_dev, and clean up imsm_copy_dev

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-11 01:16:24 -07:00
Dan Williams
828408ebef imsm: drop 'external' from imsm_examine_brief
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-11 01:16:24 -07:00
Dan Williams
19859edc2d imsm: ensure 'usable' remains clear until the disk is in_sync
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-11 01:16:24 -07:00
Dan Williams
d23fe9472d imsm: spare devices are represented as single disk containers
This poses a small problem for the case of handling multiple raid1 arrays
across separate disk pairs i.e. 2 mirrors on 4 disks.  The option-ROM will
configure this as two containers.  We may need the capability for one
container to ask for an unused spare in another container.  For now spares
will just maintain the affinity established at assemble time.

To support this configuration spare devices must be allowed to be assembled
into the container even though the metadata indicates the disk belongs to a
different family.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-09 13:37:54 -07:00
Dan Williams
b9f594fea0 imsm: store imsm_disk entries in the super->disks list
removes the need to lookup the disk by index in a few cases and is a
preparation step for tracking spares outside the current anchor.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-09 13:37:54 -07:00
Jacek Danecki
604b746f7c imsm: bad block management (phase1)
This is the initial defensive implementation of bad block management
support.  It simply precludes assembly if there are entries in the bad
block logs.  This is sufficient for now as the conditions that lead to
an entry in the bad block log would cause the array to be failed by MD
(as of 2.6.27).

[dan.j.williams@intel.com: general cleanups]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-06 23:55:53 -07:00
Dan Williams
115c380305 imsm: do not mark arrays 'clean' if resync still in progress
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-06 23:53:44 -07:00
Dan Williams
a380c027e3 imsm: raid5 layout is left-asymmetric
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-08-06 23:53:44 -07:00
Dan Williams
9ca2c81c0f imsm: trim some memory wastage
drop the previously loaded data as load_imsm_super_all loops through the
disks.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-25 17:45:13 -07:00
Dan Williams
949c47a0fa imsm: refactor mpb handling into parse and coalesce
Maintaining a single global buffer is unwieldy when extending/rewriting
sections of the metadata.  Parse the metadata into component data
structures upon reading and coalesce to a coherent buffer before
writing.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-25 16:59:47 -07:00
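The parse/coalesce idea in this commit can be sketched as follows. This is a simplified illustration, not the imsm implementation: the flat layout (a 32-bit count followed by fixed-size records), `struct record`, and all function names are hypothetical, whereas the real metadata uses variable-length structures.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size record standing in for a metadata component. */
struct record {
    uint32_t id;
    uint32_t state;
};

/* Parsed form: every record has its own allocation, so one record can be
 * replaced or resized without shifting the others around in a big buffer. */
struct parsed {
    uint32_t count;
    struct record **recs;
};

static void free_parsed(struct parsed *p)
{
    if (!p)
        return;
    for (uint32_t i = 0; i < p->count; i++)
        free(p->recs[i]);
    free(p->recs);
    free(p);
}

/* Parse a flat buffer laid out as [count][record 0][record 1]... */
static struct parsed *parse_buf(const unsigned char *buf, size_t len)
{
    uint32_t count;

    if (len < sizeof(count))
        return NULL;
    memcpy(&count, buf, sizeof(count));
    if ((len - sizeof(count)) / sizeof(struct record) < count)
        return NULL; /* buffer too short for the advertised count */

    struct parsed *p = calloc(1, sizeof(*p));
    if (!p)
        return NULL;
    p->recs = calloc(count ? count : 1, sizeof(*p->recs));
    if (!p->recs) {
        free(p);
        return NULL;
    }
    for (uint32_t i = 0; i < count; i++) {
        p->recs[i] = malloc(sizeof(struct record));
        if (!p->recs[i]) {
            free_parsed(p);
            return NULL;
        }
        memcpy(p->recs[i], buf + sizeof(count) + (size_t)i * sizeof(struct record),
               sizeof(struct record));
        p->count = i + 1; /* only count records that were really copied */
    }
    return p;
}

/* Coalesce the component structures back into one coherent buffer for writing. */
static unsigned char *coalesce_buf(const struct parsed *p, size_t *len_out)
{
    size_t len = sizeof(p->count) + (size_t)p->count * sizeof(struct record);
    unsigned char *buf = malloc(len);

    if (!buf)
        return NULL;
    memcpy(buf, &p->count, sizeof(p->count));
    for (uint32_t i = 0; i < p->count; i++)
        memcpy(buf + sizeof(p->count) + (size_t)i * sizeof(struct record),
               p->recs[i], sizeof(struct record));
    *len_out = len;
    return buf;
}
```

Updates then operate on the parsed structures, and the flat buffer is rebuilt only at write time.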
Dan Williams
f54e6321a2 imsm: use component_size not total array size in container_content_imsm
Also kill setting ->array.size as nothing appears to be using it.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-24 17:26:24 -07:00
Dan Williams
fcfd95998b imsm: take chunksize into account when calculating blocks_per_member
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-24 17:26:24 -07:00
Dan Williams
755c99faf2 sysfs: deprecate sysfs_disk_to_sg
The cmd_filter patch merged for 2.6.27 broke retrieving the serial
number via an ioctl to /dev/sgN.  In debugging this I found that other
utilities like sdparm simply run the ioctl on /dev/sdX.  So just convert
to that for protection in numbers, but scream on the mailing list for
the inconvenience grr...

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-24 17:26:24 -07:00
Dan Williams
eb7ea4630f imsm: fix store_zero_imsm breakage from posix_memalign conversion
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-24 17:26:23 -07:00
Dan Williams
0030e8d690 imsm: debug environment variable to use 'device name' as 'serial number'
Facilitate using loopback devices for the imsm regression suite

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-24 17:26:23 -07:00
Dan Williams
e9d82038f8 imsm: allow --examine to succeed with damaged metadata
limp along after seeing mismatched serial numbers

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-24 17:26:23 -07:00
Dan Williams
5802a8118e imsm: handle degraded->normal transitions in set_disk
Removes the need for the call to ->set_array_state when sync_action
transitions from 'recover' to 'idle'.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-24 17:26:23 -07:00
Dan Williams
8273f55e4f imsm: create array via update to mdmon
2008-07-24 17:26:19 -07:00
Dan Williams
ef649044b7 imsm: remove some casts
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-21 17:42:10 -07:00
Dan Williams
5257ce365b imsm: activate_spare: fix handling of data_offset
Neil rightly points out that imsm_activate_spare may skip valid free space
on a spare, fix this up.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-21 11:08:20 -07:00
Dan Williams
e8319a19f2 imsm: process update
Handle 'activate_spare' events
2008-07-14 14:10:02 -07:00
Dan Williams
88758e9daa imsm: activate_spare
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-14 14:09:48 -07:00
Dan Williams
2da8544a39 imsm: metadata only supports a global 'data_offset'
When creating a volume a drive must have free space starting at the same
location as all the other drives in the array.
2008-07-14 13:59:56 -07:00
Dan Williams
ef6ffaded4 imsm: fix raid1 creation
Need to supply dummy values for the chunk size and number of data stripes

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-14 13:58:19 -07:00
Dan Williams
0372d5a2f4 imsm: fail ->open_new if container_member is out of range
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-14 13:57:33 -07:00
Dan Williams
a3d6757fbf imsm: kill obsolete call to fsync() in store_imsm_mpb
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-14 13:56:42 -07:00
Dan Williams
2c092cada6 imsm: fix insufficient mpb buffer allocation
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-14 13:56:28 -07:00
Dan Williams
ad97895ec8 imsm: catch posix_memalign allocation failures
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-14 13:55:12 -07:00
Dan Williams
bf5a934aff imsm: remove extra superswitches
Following the lead of 75ede16d.  This incidentally fixes creation of a second
array by gating call to getinfo_super_imsm_volume with a valid ->current_vol.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-14 13:55:12 -07:00
Dan Williams
99e2926423 imsm: fix store_imsm_mpb() alignment for O_DIRECT
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-14 13:55:02 -07:00
Neil Brown
2c514b7120 Pass 'verbose' flag to validate_geometry
That way it can be silent when we are just trying to figure out
which metadata to use, and noisy when detecting a real problem.
2008-07-12 20:28:38 +10:00
Neil Brown
6416d5275d Use O_DIRECT for all IO to devices.
Using buffered IO risks non-atomic updates to parts of the
device that we don't actually want to write to.  This isn't in
general safe.
So switch to O_DIRECT for all that IO and make sure we have
properly aligned buffers.
2008-07-12 20:28:33 +10:00
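O_DIRECT I/O generally needs buffers whose address and length are aligned to the device's block size. The sketch below shows one way to obtain such a buffer with `posix_memalign`. The 512-byte alignment and the helper names are assumptions for illustration; the commit itself does not state which alignment mdadm uses.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define IO_ALIGN 512u /* assumed block size; real devices may need more */

/* Round a length up to the next multiple of IO_ALIGN. */
static size_t round_up_io(size_t len)
{
    return (len + IO_ALIGN - 1) / IO_ALIGN * IO_ALIGN;
}

/*
 * Allocate a zeroed buffer suitable for O_DIRECT transfers of at least
 * 'len' bytes.  The padded length is returned through 'padded_len' so the
 * caller can issue a whole-block read or write.  Release it with free().
 */
static void *alloc_io_buf(size_t len, size_t *padded_len)
{
    void *buf = NULL;
    size_t padded = round_up_io(len ? len : 1);

    if (posix_memalign(&buf, IO_ALIGN, padded) != 0)
        return NULL;
    memset(buf, 0, padded);
    if (padded_len)
        *padded_len = padded;
    return buf;
}
```

A caller would copy the metadata into the returned buffer and pass the buffer and the padded length to the read or write call on a descriptor opened with O_DIRECT.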
Neil Brown
d2ca644994 Remove getinfo_super_n and do some other cleaning up.
Getting close to a sensible description of what some of the
superswitch methods are supposed to do!
2008-07-12 20:27:39 +10:00
Neil Brown
0d481d3723 External metadata shouldn't set array.*_version
It doesn't mean anything and is never used.
2008-07-12 20:27:39 +10:00
Neil Brown
f7e7067b47 Add subarray field to supertype.
When loading the metadata for a subarray (super_by_fd), we set
->subarray to be the name read from md/metadata_version so that
getinfo_super can return info about the correct array.

With this we can differentiate between a container and
an array within the container by looking at ->subarray[0].
2008-07-12 20:27:38 +10:00
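The check described in this commit amounts to testing the first byte of the subarray name. A minimal sketch, assuming a container leaves the field empty; the struct name, the field size, and the helper are hypothetical and stand in for the real supertype:

```c
#include <string.h>

/* Hypothetical stand-in for the supertype; only the field discussed here. */
struct supertype_sketch {
    char subarray[32]; /* size is an assumption */
};

/* A container has no subarray name, so the first byte is NUL. */
static int is_container(const struct supertype_sketch *st)
{
    return st->subarray[0] == '\0';
}
```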
Neil Brown
0063ecba3d Hide subordinate superswitch structures.
Only one superswitch should be externally visible for each
general type.  Others which handle different flavours
(e.g. container/data-array) should be internal only.
2008-07-12 20:27:38 +10:00
Neil Brown
b8ac196795 Remove 'major' from superswitch.
It isn't generally meaningful.
2008-07-12 20:27:37 +10:00
Neil Brown
ef60947720 Always initialise a struct super_type to zero
2008-07-12 20:27:36 +10:00
Dan Williams
4e6e574a3e mdmon: add debug print statements for profiling mdmon
for development only as console output can block leading to monitor deadlocks
in low mem situations

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-16 15:50:07 -07:00
Dan Williams
a862209dc3 imsm: set map state depending on sync state
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-16 15:36:41 -07:00
Dan Williams
272f648fb0 imsm: write anchor last
Ensure that newly written anchors reflect a consistent metadata block.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-16 15:36:41 -07:00
Dan Williams
551c80c1c1 imsm: implement store_zero_imsm and init_zero_imsm
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-16 15:36:40 -07:00
Dan Williams
4a04ec6ca1 imsm: ->getinfo_super set raid_disk to disk.number
A spare device by definition will have raid_disk set to -1, but when
assembling the container we want this disk to be included.

Fixes a SIGSEGV when doing:
mdadm -A /dev/imsm -e imsm /dev/sd[b-e]

...where /dev/sde is marked as a global spare device

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-16 15:36:40 -07:00
Dan Williams
c2c087e631 imsm: add 'Create' support
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-16 15:36:38 -07:00
Dan Williams
8b35327854 imsm: 'volume' is the proper name for imsm container members
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-13 17:42:09 -07:00
Dan Williams
f4d11639d1 imsm: ->container_dev needs to be set in load_super_imsm_all
It is used later by container_content_imsm to set the
text_version of the member arrays.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-13 17:42:09 -07:00
Dan Williams
87eb16dfa9 imsm: add 'mpb_size' to --examine output
Break out sector_count() for use elsewhere.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-13 17:42:09 -07:00
Dan Williams
90c8b70714 sysfs: provide a helper function for locating scsi_generic interfaces
imsm records and validates this data in its metadata

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-13 17:27:30 -07:00
Neil Brown
2e735d1982 Allow passing metadata update to the monitor.
Code in manager can now just call queue_metadata_update with a
(freeable) buf holding the update, and it will get passed to the
monitor and written out.
2008-06-12 10:13:23 +10:00
Neil Brown
cba0191bad Parse the 'instance' part of external:/mdXX/INST in metadata handler.
This gives more flexibility.
2008-05-27 09:18:57 +10:00
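Splitting a version string of the form external:/mdXX/INST could be sketched like this. The function name and buffer handling are hypothetical. The instance part is returned as an opaque string, on the assumption that interpreting it is left to the metadata handler as the commit describes.

```c
#include <string.h>

/*
 * Split "external:/mdXX/INST" into the container name ("mdXX") and the
 * instance ("INST").  Returns 0 on success, -1 if the string does not have
 * that shape or an output buffer is too small.
 */
static int split_external_version(const char *ver,
                                  char *container, size_t clen,
                                  char *inst, size_t ilen)
{
    static const char prefix[] = "external:/";
    size_t plen = sizeof(prefix) - 1;

    if (strncmp(ver, prefix, plen) != 0)
        return -1;

    const char *name = ver + plen;
    const char *slash = strchr(name, '/');
    if (!slash || slash == name || slash[1] == '\0')
        return -1; /* missing container or missing instance */

    size_t nlen = (size_t)(slash - name);
    if (nlen >= clen || strlen(slash + 1) >= ilen)
        return -1;

    memcpy(container, name, nlen);
    container[nlen] = '\0';
    strcpy(inst, slash + 1); /* length was checked above */
    return 0;
}
```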
Neil Brown
159c3a1a77 Remove st->text_version in favour of info->text_version
I want the metadata handler to have more control over the 'version',
particularly for arrays which are members of containers.
So discard st->text_version and instead use info->text_version
which getinfo_super can initialise.
2008-05-27 09:18:55 +10:00
Neil Brown
ed9d66aade Change mark_clean to set_array_state.
DDF needs more fine grained understanding of the array state.
2008-05-27 09:18:54 +10:00
Neil Brown
4e5528c6f7 Implement mark_clean for ddf and remove mark_dirty and mark_sync
mark_dirty is just a special case of mark_clean - with sync_pos == 0.
mark_sync is not required.  We don't modify the metadata when sync
finishes.  Only when the array becomes non-writeable at which point we
use mark_clean to record how far the resync progressed.
2008-05-27 09:18:38 +10:00
Dan Williams
8d45d1969b handle disk failures
From: Dan Williams <dan.j.williams@intel.com>

Added curr_state as a parameter to set_disk.  Handlers look at this to
record component failures, and set global 'degraded' or 'failed'
status.

When reading the state as faulty:
1/ mark the disk failed in the metadata

2/ write '-blocked' to the rdev state to allow the kernel's failure
   mechanism to advance

3/ the kernel will take away the drive's role in remove_and_add_spares()

4/ once the disk no longer has a role, writing 'remove' to the rdev state
   will get the disk out of the array.

There is a window after writing '-blocked' where the kernel will return
-EBUSY to remove requests.  We rely on the fact that the disk will
continue to show faulty so we lazily wait until the kernel is ready to
remove the disk.  If the manager thread needs to get the disk out of the
way it can ping the monitor and wait, just like the replace_array()
case.

[buglet fix: swap the parameters of attr_match in read_dev_state]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-05-15 16:48:49 +10:00
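The failure sequence in this commit, including the lazy retry while the kernel answers -EBUSY, can be sketched as a function that is called once per monitor pass. Everything here is hypothetical scaffolding: the struct, the callback type, and the fake kernel exist only to make the sketch self-contained, and the real code writes to the rdev state attribute in sysfs.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical callback: write a keyword to the rdev state, return 0 or -errno. */
typedef int (*state_write_fn)(void *ctx, const char *keyword);

struct disk_sketch {
    int failed_in_metadata; /* step 1 done */
    int unblocked;          /* step 2 done */
    int removed;            /* step 4 done */
};

/*
 * One pass over a disk whose state reads 'faulty'.
 * Returns 1 once the disk is removed, 0 if another pass is needed,
 * or a negative errno on an unexpected error.
 */
static int handle_faulty(struct disk_sketch *d, state_write_fn write_state, void *ctx)
{
    int rc;

    d->failed_in_metadata = 1;             /* 1/ mark the disk failed */

    if (!d->unblocked) {
        rc = write_state(ctx, "-blocked"); /* 2/ let the kernel proceed */
        if (rc < 0)
            return rc;
        d->unblocked = 1;
    }

    /* 3/ happens in the kernel: it takes away the drive's role. */

    rc = write_state(ctx, "remove");       /* 4/ try to drop the disk */
    if (rc == -EBUSY)
        return 0;                          /* kernel not ready; retry next pass */
    if (rc < 0)
        return rc;
    d->removed = 1;
    return 1;
}

/* Fake kernel for demonstration: refuses 'remove' for a few passes. */
struct fake_kernel {
    int busy_passes;
};

static int fake_write(void *ctx, const char *keyword)
{
    struct fake_kernel *k = ctx;

    if (strcmp(keyword, "remove") == 0 && k->busy_passes > 0) {
        k->busy_passes--;
        return -EBUSY;
    }
    return 0;
}
```

Because the disk keeps reading as faulty, calling `handle_faulty` again on the next pass is enough; no separate retry timer is needed in this sketch.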
Dan Williams
c2a1e7dad7 imsm: update metadata for dirty -> clean or resync -> idle events
From: Dan Williams <dan.j.williams@intel.com>

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-05-15 16:48:47 +10:00
Dan Williams
3dbccbcf5f keep member disk fd(s) for updates
From: Dan Williams <dan.j.williams@intel.com>

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-05-15 16:48:46 +10:00
Dan Williams
fd7cde1bf0 handle resync completion
From: Dan Williams <dan.j.williams@intel.com>

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-05-15 16:48:42 +10:00
Neil Brown
845dea950c Add mdmon functions to super-intel.
2008-05-15 16:48:41 +10:00
Dan Williams
0fd5c350e5 set resync_start in Incremental_container
From: Dan Williams <dan.j.williams@intel.com>

Metadata handlers set mdinfo.resync_start depending on the state of the
array.  By default mdadm assumes the array is dirty and needs a full
resync.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-05-15 16:48:33 +10:00
Dan Williams
4f5bc4542c imsm: assemble raid devices
From: Dan Williams <dan.j.williams@intel.com>

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-05-15 16:48:27 +10:00
Dan Williams
f352c54527 imsm: assemble container support
From: Dan Williams <dan.j.williams@intel.com>

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-05-15 16:48:24 +10:00
Dan Williams
cdddbdbca0 imsm: initial Intel(R) Matrix Storage Manager support
From: Dan Williams <dan.j.williams@intel.com>

The following now work:
--examine
--examine --brief

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-05-15 16:48:22 +10:00