Commit Graph

2842 Commits

Author SHA1 Message Date
Martin Wilck 30b83120ed mdmon: manage_member: debug messages for array state
Add debug messages to watch the manager's steps.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-31 13:00:32 +10:00
Martin Wilck c371936051 mdmon: wait_and_act: fix debug message for SIGUSR1
Correctly print out wake reason if it was a signal. Previous code
would print misleading select events (pselect(2) man page says the
fdsets become undefined in case of error).

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-31 12:59:40 +10:00
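The rule behind this fix (do not read the fd sets after a failed pselect) can be sketched as below. This is only an illustration: `wake_reason` and its parameters are hypothetical names, not mdmon's actual code.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper, not taken from mdmon: classify why pselect()
 * returned. 'rv' is pselect()'s return value and 'err' is errno
 * captured immediately after the call. */
const char *wake_reason(int rv, int err, fd_set *rfds, int fd)
{
	if (rv < 0) {
		/* After an error the fd sets must not be trusted,
		 * so decide from errno alone. */
		return err == EINTR ? "signal" : "error";
	}
	if (rv == 0)
		return "timeout";
	/* Only on success is it meaningful to look at the sets. */
	return FD_ISSET(fd, rfds) ? "fd ready" : "other fd";
}
```

A caller would save errno right after pselect() returns and pass both values in, so that later library calls cannot overwrite the error code.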
Martin Wilck 39da26ecf5 monitor: read_and_act: log status when called
read_and_act() currently prints a debug message only very late.
Print the status seen by mdmon right away, to track mdmon's
actions more closely. Add a time stamp to observe long delays
between read_and_act calls, e.g. caused by meta data writes.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-31 12:57:20 +10:00
Martin Wilck ce6844b99c DDF: ddf_set_disk: add some debug messages
Adds more verbose debugging in ddf_set_disk, to understand failures
better.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-31 12:47:44 +10:00
Martin Wilck 0e5fa86239 DDF: load_ddf_header: more error logging
Try to determine problem if load_ddf_header fails. May be useful
for determining compatibility problems with Fake RAID BIOSes.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-31 12:47:44 +10:00
Martin Wilck 0847945b8e DDF: ddf_process_update: log offsets for conf changes
I needed this for tracking a bug with wrong offsets after array
creation.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-31 12:47:44 +10:00
Martin Wilck 2a645ee220 DDF: log disk status changes more nicely
In particular, include refnum for better tracking. This makes
it a little easier for humans to track what happened to which disk.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-31 12:47:44 +10:00
Martin Wilck 6f56dbb970 DDF: ddf_activate_spare: bugfix for 62ff3c40
Move the check for good drives in the dl loop - otherwise dl
may be NULL and mdmon may crash.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-31 12:47:44 +10:00
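A minimal sketch of the pattern this bugfix describes, with the state check kept inside the list walk so it never runs on a NULL pointer. `struct disk` and `first_good_disk` are simplified, hypothetical stand-ins and do not reproduce the real DDF data structures.

```c
#include <stddef.h>

/* Hypothetical, simplified list entry; not the real mdadm 'struct dl'. */
struct disk {
	struct disk *next;
	int failed;	/* non-zero if marked failed or missing */
	int id;
};

/* Return the first usable disk, or NULL if there is none.
 * The 'failed' check sits inside the loop body, so it is only
 * evaluated while 'd' is known to be non-NULL. */
struct disk *first_good_disk(struct disk *head)
{
	struct disk *d;

	for (d = head; d; d = d->next) {
		if (d->failed)
			continue;
		return d;
	}
	return NULL;
}
```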
NeilBrown 71d68ff62f Fix is_resync_complete for RAID10
For RAID10, 'sync' numbers go up to the array size rather than the
component size.  is_resync_complete() needs to allow for this.

Reported-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-31 09:22:18 +10:00
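The idea in this message can be sketched as follows. The function names and parameters are assumptions made for illustration; the real is_resync_complete() works on mdadm's own structures and may compute the limit differently.

```c
#include <stdint.h>

/* Hypothetical sketch: choose the value that the sync position must
 * reach. Per the commit message, RAID10 counts up to the array size,
 * other levels up to the component size. */
uint64_t resync_limit(int level, uint64_t component_size,
		      uint64_t array_size)
{
	return level == 10 ? array_size : component_size;
}

/* Return 1 if the resync position has reached the limit, else 0. */
int resync_complete(int level, uint64_t component_size,
		    uint64_t array_size, uint64_t resync_pos)
{
	return resync_pos >= resync_limit(level, component_size, array_size);
}
```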
Jes Sorensen 364a48c992 Avoid double close()
Coverity discovered a possible double close(fd2) in Grow.c. Avoided by
invalidating fd2 after the first close.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-31 08:51:16 +10:00
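The "invalidate after close" idiom mentioned here can be sketched like this. `close_once` is a hypothetical helper, not a function from Grow.c.

```c
#include <unistd.h>

/* Hypothetical helper: close *fd if it is open, then mark it invalid
 * so that a second call on the same variable is a harmless no-op. */
int close_once(int *fd)
{
	int rv = 0;

	if (*fd >= 0) {
		rv = close(*fd);
		*fd = -1;	/* invalidate: later cleanup paths skip it */
	}
	return rv;
}
```

With this shape, every exit path can call the helper unconditionally without risking a close() on a descriptor number that has since been reused.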
NeilBrown 23bf42cc79 super1: simplify setting of array size.
Currently the extra space to leave before the data in the array
is calculated in two separate places, and they can be inconsistent.

Instead, do it all in validate_geometry.  This records the
'data_offset' chosen which all other devices then use.

'write_init_super' now just uses the value rather than doing all the
calculations again.

This results in more consistent numbers.

Also, load_super sets st->data_offset so that it is used by "--add",
so the new device has a data offset matching a pre-existing device.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-30 17:05:47 +10:00
NeilBrown 641da74591 super1: separate two versions of _avail_space1().
_avail_space1() is called from both avail_space1() and validate_geometry1()
and does slightly different things.

The partial code sharing doesn't really help.  In particular the
responsibility for setting the size of the array is currently
confused.

So duplicate the code into the two locations - one where 'super' is
always NULL (validate_geometry1) and one where it is never NULL
(avail_space1), and simplify.

No behaviour change - just code re-organisation.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-30 15:59:03 +10:00
NeilBrown 7ccc4cc4fc Manage: remove call to validate_geometry.
This call to validate_geometry is really rather gratuitous.
It is purely about the fact that super0 cannot use more than 4TB.
So just make it an explicit test - less confusing that way.

With this, validate_geometry is only called from Create, which
makes it easier to reason about.

Also validate_geometry is now never passed NULL for the 'chunk'
parameter, so we can remove those annoying tests for NULL.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-30 13:45:22 +10:00
mwilck@arcor.de 0c78849f2b DDF: ddf_activate_spare: fix metadata update for SVDs
Metadata updates for secondary RAID (RAID10) need to cover
all BVDs. Compare with code in write_init_super_ddf().

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-30 10:57:14 +10:00
mwilck@arcor.de 62ff3c40c1 DDF: ddf_activate_spare: only activate good drives
Do not try to activate drives marked missing or failed.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-30 10:57:13 +10:00
mwilck@arcor.de 7733b91d37 DDF: ddf_activate_spare: Add RAID10 code
The check for degraded array is a bit more complex for RAID10.
Fixing it.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-30 10:57:13 +10:00
mwilck@arcor.de 84e32e1977 DDF: find_vdcr: fix minor bug in debug message
This code could find disk -1. Fixed.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-30 10:57:13 +10:00
NeilBrown 4b1679dd39 Change version to 3.3-rc2
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-25 17:55:18 +10:00
NeilBrown 482383022d Add test for --replace handling.
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-24 15:32:31 +10:00
NeilBrown 51425978e5 Manage: fix typo in error for "--with" handling
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-24 15:32:26 +10:00
NeilBrown 7d7092ec6c Improve revert tests
1/ perform revert-grow on more metadata versions
2/ add revert-inplace.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-24 12:23:04 +10:00
NeilBrown 2bf62891c1 super0/1: fix typo in error messages.
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-24 12:22:58 +10:00
NeilBrown 3377ee4248 Grow: don't hold array open while waiting for reshape.
If we will need to change array level when a reshape completes, a copy
of mdadm waits in the background.
Currently this copy holds the device (/dev/mdX) open.  This prevents
the array from being stopped.

So close the file descriptor and re-open after the reshape completes.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-24 12:21:10 +10:00
NeilBrown 419e018284 super1: update data_size when performing "revert-reshape".
The "data_size" is with respect to "data_offset".  When the kernel
changes "data_offset" it modifies "data_size" to match - see
md_finish_reshape() in the kernel.

So when mdadm switches the data_offset for the new data_offset, it
must update data_size correspondingly.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-24 10:21:27 +10:00
NeilBrown 4441541f1f super-ddf: allow mdassemble to compile.
Just add/move some #ifdefs and move some code.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-23 14:00:56 +10:00
mwilck@arcor.de a8173e4349 DDF: convert big-endian __u16 to be16 type
Last step of endian-safe recoding. This requires also bit
operations.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-23 13:53:34 +10:00
mwilck@arcor.de 9d0c6b7071 DDF: convert big-endian __u64 to be64 type
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-23 13:53:32 +10:00
mwilck@arcor.de 60931cf94a DDF: convert big endian to be32 type
Part 2 of endianness-safe conversion

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-23 13:49:41 +10:00
mwilck@arcor.de 4d1bdc1840 DDF: add endian-safe typedefs
This adds typedefs for big-endian numbers. This will hopefully
reduce the number of endianness bugs I make.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-23 13:49:11 +10:00
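One common way to get endian-safe types in C is to wrap the raw value in a struct, so that mixing it with host-order integers fails to compile. The sketch below assumes that approach; the type and function names are hypothetical and need not match the typedefs this series actually added.

```c
#include <stdint.h>

/* Hypothetical wrapper types: the member holds the value in big-endian
 * byte order. Because they are structs, "x + 1" or "x = 5" on them is
 * a compile error, which is the point of the exercise. */
typedef struct { uint16_t raw; } be16;
typedef struct { uint32_t raw; } be32;

/* Convert a host-order value to the wrapped big-endian form. */
static inline be32 be32_from_host(uint32_t x)
{
	be32 r;
	unsigned char *p = (unsigned char *)&r.raw;

	p[0] = (unsigned char)(x >> 24);
	p[1] = (unsigned char)(x >> 16);
	p[2] = (unsigned char)(x >> 8);
	p[3] = (unsigned char)x;
	return r;
}

/* Convert back to host order, independent of the host's endianness. */
static inline uint32_t be32_to_host(be32 b)
{
	const unsigned char *p = (const unsigned char *)&b.raw;

	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Equality needs no byte swapping: compare the raw representations. */
static inline int be32_eq(be32 a, be32 b)
{
	return a.raw == b.raw;
}

/* Bitwise AND also works directly on the raw representation. */
static inline be16 be16_and(be16 a, be16 b)
{
	be16 r;

	r.raw = a.raw & b.raw;
	return r;
}
```

Comparisons and bit operations need their own small helpers because the usual operators no longer apply to the wrapped types.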
mwilck@arcor.de 5c41684539 tests/10ddf-geometry: new unit test
Test various RAID geometries, creation and deletion of subarrays

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-22 16:56:32 +10:00
mwilck@arcor.de 840ad583e0 test: increase number of devices to 13
extended DDF/RAID10 tests need 6 disks for DDF.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-22 16:56:32 +10:00
mwilck@arcor.de abbc450fc2 tests/10ddf-create: create RAID5 first
Let the first created array be RAID5 rather than RAID0. This makes
the test harder than before, because everything after the first
Create has to be done indirectly through mdmon.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-22 16:56:32 +10:00
mwilck@arcor.de fbf0c2a7ac DDF: getinfo_super_ddf_bvd: fix offset calculation for SVDs
Fix a bug that caused the wrong conf record to be used to derive
data offset and size on secondary RAID (RAID10).

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-22 16:56:32 +10:00
mwilck@arcor.de 6a350d82b9 DDF: kill_subarray_ddf: fix case without mdmon running
When mdmon wasn't running, metadata wasn't committed to disk.
Fixed.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-22 16:56:32 +10:00
mwilck@arcor.de 2aba583f28 DDF: err_bad_md_layout: fix return value
This function must use -1 to indicate failure. Fix it.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-22 16:56:32 +10:00
mwilck@arcor.de 9bf3870442 DDF: factor out writing super block to single disk
Factor out single disk from __write_init_super_ddf to a new function
_write_super_to_disk. Use this function in store_super_ddf.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-22 16:56:32 +10:00
mwilck@arcor.de 8e9387ac9f DDF: make "null_aligned" a static buffer
Use a static buffer for this "zero page". This makes it easier
to factor out the header writing code.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-22 16:56:32 +10:00
mwilck@arcor.de 35c3606df7 DDF: increase seq number in ddf_set_updates_pending
Increase seq number only when there's actually a metadata change.
This is better than increasing it at every write.

This also fixes another endianness bug.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-22 16:56:32 +10:00
NeilBrown 57666a41b2 Merge commit '956a13fb850321bed8568dfa8692c0c323538d7c'
2013-07-15 11:39:50 +10:00
NeilBrown 6fd2a36f9b test: allow resync/reshape etc to go faster.
Whenever we "check wait" - make the resync process go at full speed.

Also allow script to adjust it manually.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-11 13:24:40 +10:00
NeilBrown ca36d70735 Grow: pass INVALID_SECTORS to reshape_array, not 0.
'0' means 'make it 0', which isn't what we want here.
We want 'leave it unchanged'.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-11 12:42:12 +10:00
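The distinction between "set to 0" and "leave unchanged" is usually carried by a sentinel value. The sketch below illustrates that idea only; `KEEP_UNCHANGED` and `apply_setting` are hypothetical, and the value chosen here is not claimed to be mdadm's definition of INVALID_SECTORS.

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "caller does not want a change".
 * Name and value are assumptions for this sketch only. */
#define KEEP_UNCHANGED UINT64_MAX

/* Return the value that should be in effect after the request. */
uint64_t apply_setting(uint64_t current, uint64_t requested)
{
	if (requested == KEEP_UNCHANGED)
		return current;
	/* Any other value, including 0, is taken literally. */
	return requested;
}
```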
NeilBrown 85ca499c6b IMSM: fix wait_for_reshape_imsm
This was waiting on "reshape_position" which doesn't
get update events.
Before sysfs_wait was introduced, the code to wait didn't
wait at all, so it spun.
With sysfs_wait, it would wait forever.

Change to wait in sync_completed which does get events.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-11 12:26:15 +10:00
Christoph Anton Mitterer 956a13fb85 align spelling of “RAID” and RAID levels
* Aligned the spelling of “RAID” to use capital letters in all places.
* Aligned the spelling of the RAID level names (LINEAR, RAID1, …) to use capital
  letters in all places, except for the string “faulty” in places where not the
  RAID level was meant.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
2013-07-10 23:32:22 +02:00
NeilBrown 3afaff930c Stop: fix up synchronising end of reshape to good boundary.
If we stop too soon after reshape starts (probably only during
testing), we can get confused by the status of the reshape.
If that might be happening - sleep a bit longer.

Also allow for reshape going unusually slowly (again, probably only
during testing).

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-10 16:28:25 +10:00
NeilBrown a7a0d8a116 Grow: use mdstat_wait to wait for delayed reshape.
Having a fixed time for a wait is clumsy and can make us
wait much too long.
So use mdstat_wait and keep the mdstat_fd open.
This requires an 'mdstat_close' so it doesn't stay open
forever.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-10 11:10:54 +10:00
NeilBrown b4924f46c0 Don't set 'hold' option for mdstat_read if not needed.
We only need 'hold' if we want to mdstat_wait for a change.
These two callers don't care about a change, so they shouldn't
use the 'hold' flag.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-10 11:02:10 +10:00
NeilBrown 3eff7c1d2f DDF load headers: if primary is invalid, don't check fields.
Currently we compare fields between primary and secondary
superblocks, before we check if the primary is even valid.
This is a bit backwards, so reverse it.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-10 10:47:22 +10:00
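The reordering described here (validate first, compare afterwards) might look roughly like the sketch below. `struct hdr`, `HDR_MAGIC` and `pick_header` are made-up simplifications, not the DDF header layout or the actual loading code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified header; the magic value is made up. */
#define HDR_MAGIC 0x48445231u

struct hdr {
	uint32_t magic;
	uint32_t seq;
};

static int hdr_valid(const struct hdr *h)
{
	return h != NULL && h->magic == HDR_MAGIC;
}

/* Pick the header to use. Fields of the two copies are compared only
 * after the primary has been found valid; '*mismatch' reports whether
 * that comparison found a difference. */
const struct hdr *pick_header(const struct hdr *primary,
			      const struct hdr *secondary,
			      int *mismatch)
{
	*mismatch = 0;

	if (!hdr_valid(primary)) {
		/* Nothing trustworthy to compare against. */
		return hdr_valid(secondary) ? secondary : NULL;
	}
	if (hdr_valid(secondary) && secondary->seq != primary->seq)
		*mismatch = 1;
	return primary;
}
```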
mwilck@arcor.de c5943560f8 DDF: ddf_process_update: Fix updates for SVDs
The "indirect" code path for adding VDs was not working correctly
for secondary RAID level. The "other BVDs" were not transmitted
to mdmon. Thus mdmon wouldn't build up correct information, and
RAID creation would fail when mdmon was already running on the container.

This patch fixes this.

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-10 10:43:05 +10:00
mwilck@arcor.de ed5ff7a22b DDF: ddf_process_update: some more debug messages
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-10 10:43:01 +10:00
mwilck@arcor.de 4a03cbd10b DDF: guid_str: more readable output
Print ASCII characters as ASCII

Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-10 10:42:17 +10:00