Commit d180d2aa2a ("Manage: fix test for 'is array failed'.")
introduced a regression which would not allow to re-add new
drivers to a failed array.
Fixes: d180d2aa2a ("Manage: fix test for 'is array failed'.")
Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: Coly Li <colyli@suse.de>
Cc: Neil Brown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
The 'size' field of mdu_disk_info_t is 32bit and should not be used
except for legacy ioctls. super-ddf got this wrong :-(
This change makes it possible to create ddf arrays which used more than
2TB of each device.
Reported-by: Dan Russell <dpr@aol.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
The return value from print_found_intel_controllers() is never used,
so lets make it return void.
Reported-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
cmap_get_string() used to retrieve cluster_name does not restrict it's
size. To prevent buffer overflows use the size of the destination
buffer, not strlen() of the source, and null terminate the copied
string.
Fixes: 0aa2f15b ("mdadm: add the ability to change cluster name)"
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
The code did not free 'dir' allocated by opendir(). An additional
benefit is that this simplifies the for() loops.
Fixes: 60f0f54d ("IMSM: Add support for VMD")
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Grow_addbitmap() is only ever called with s->bitmap_file != NULL, but
not all static code checkers catch this. This adds a check to quiet
down the false positive warnings.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
All cases where fd2 is used are completed with a close(fd2), so there
is no need to set fd2 = -1 or check for it before exiting.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
This adds a buffer size argument to load_sys(), rather than relying on
a hard coded buffer size. The old behavior was safe because we knew
the kernel would never return strings overrunning the buffers, however
it was ugly, and would cause code checking tools to spit out warnings.
This caused a Coverity warning over the read into
sra->sysfs_array_state which is only 20 bytes.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
sysfs_read() may return NULL, so we should check the validity of the
pointer before dereferencing it.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Since flags is only set as LKF_NOQUEUE, the code
with LKF_CONVERT flag should be delete.
Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
For cluster raid, we need to displays bitmap related
contents from different bitmaps which are based on node
num. So bitmap_file_open and locate_bitmap are changed a
little bit for the purpose.
Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Fixes: b98043a2f8 ("Show all bitmaps while examining bitmap")
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
A recent commit removed a call to abort_reshape() when IMSM reshape
completed. An unanticipated result of this is that the suspended
region is not cleared as it should be.
So after a reshape, a region of the array will cause all IO to block.
Re-instate the required updates to suspend_{lo,hi} coped from
abort_reshape().
This is caught (sometimes) by the test suite.
Also fix a couple of typos found while exploring the code.
Reported-by: Ken Moffat <zarniwhoop@ntlworld.com>
Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Fixes: 2139b03c20 ("imsm: don't call abort_reshape() in imsm_manage_reshape()")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
2.6.28+ kernels handle this themselves and issuing the event here can
cause a race.
Reported-by: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Suggested-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
uClibc exposes it's own version of bswap_<X> macros. Rather than
pulling in random macros by change, rename the mdadm ones to make sure
we know what we are getting.
Reported-by: "Maxin B. John" <maxin.john@gmail.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
This fixes a compile warning when building with musl:
In file included from util.c:27:0:
|
qemux86-64/usr/include/sys/poll.h:1:2:
error: #warning redirecting incorrect #include <sys/poll.h> to <poll.h>
[-Werror=cpp]
| #warning redirecting incorrect #include <sys/poll.h> to <poll.h>
| ^
Signed-off-by: Maxin B. John <maxin.john@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
By hardcoding CC's definition in the Makefile, all the external gcc
parameters set by tune settings are lost. This causes compile failure
with x32 toolchain
Signed-off-by: Maxin B. John <maxin.john@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
As 26714713cd said, 32 bit signed
timestamps will overflow in the year 2038. It already changed the
utime and ctime in struct mdu_array_info_s from int to unsigned
int. So we need to change the values that compared with them to
unsigned int too.
Signed-off-by : Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
bblog_size is 16bit so using le32_to_cpu on it is not wise
and leads to errors on big-endian machines.
Change all such calls to use le16.
Bug was introduced in mdadm-3.3
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
A simple revert doesn't work here because the reshape_position is
in the critical section.
The best approach is to let the reshape progress a bit and then
go backwards.
If that isn't possible, assembling with --update=revert-reshape and
--invalid-backup should work.
Reported-by-tested-by: George Rapp <george.rapp@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.com>
v0.90 metadata can handle devices between 2GB and 4GB, but we need
to treat the 'size' and unsigned. In a couple of places we don't.
URL: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=809447
Signed-off-by: NeilBrown <neilb@suse.com>
It seems that having the Conflicts in the .timer file is not sufficient.
Sometimes it works, but if the timer gets requested after the conflicting
block device appears (or was it "before" ...) the timer is not aborted.
Having the Conflicts in both files seems to work reliably.
URL: https://bugzilla.suse.com/show_bug.cgi?id=853944
Signed-off-by: NeilBrown <neilb@suse.com>
This code was meant to update 'earliest' but clearly never doesn't.
This bug would only affect an array with a very large bitmap so it is unlikely
to be significant.
Signed-off-by: NeilBrown <neilb@suse.com>
glibc allows the addr arg to connect and socket to be any of a number
of 'sockaddr_*' types, but musl requires 'const struct sockaddr *'
which is in line with open group specs. So add casts to allow
compilation with musl.
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.com>
config.c uses _POSIX_C_SOURCE which is defined in features.h when
glibc/uclibc is used, but isn't defined when musl is used.
So provide a reasonable default.
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.com>
The check of "is there a filesystem here" is still appropriate for a
journal device.
Also set active_disks correctly - even though it is ignored.
Signed-off-by: NeilBrown <neilb@suse.com>
The careless change was introduce by 'commit 7e6e839a26
(mdadm: change the num of cluster node)'. Which should be
revert to avoid misunderstanding.
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Abort imsm_manage_reshape() without updating the migration record if any
error occurs when checking progress. If reshape is interrupted and the
migration record is then updated, the checkpoint will be wrong and will
cause reshape to fail when the array is restarted.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Waiting for reshape progress is done by using select() on sync_completed
to block until an exception condition is signalled on the
filedescriptor. This happens when the attribute's value is updated by
the kernel, but if the array is stopped when mdadm is blocked on
select() this will never happen, because this attribute is then removed
and apparently the kernel doesn't do sysfs_notify() when removing a
sysfs attribute. So set a 3 second timeout for the sysfs_wait() call.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
The Intel Volume Management Device (VMD) is an integrated
endpoint on the platform's PCIe root complex that acts
as a host bridge to a secondary PCIe domain.
This patch adds proper handling of NVMe devices attached to VMD domain.
Each VMD domain is treated as a separate controller (HBA).
Spanning between domains is forbidden.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
When reshape was interrupted, an incorrect checkpoint would be saved in
the migration record. Change wait_for_reshape_imsm() to return -1 when
sync_action is not "reshape" to abort early in imsm_manage_reshape()
without writing the migration record.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Close fd2 as soon as it is no longer needed, before calling
Grow_continue(). Otherwise, we won't be able to stop an array with
external metadata during reshape, because mdadm running in background
will be keeping it open.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Now that we can print device details with a specific raid_disk but not
disk.number, the condition for "print either disk.number or disk.raid_disk"
must be make more specific.
Reported-by: Coly Li <colyli@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
If reshape one raid device with bitmap to raid0, the reshape progress will
start. But it'll fail and lose some components. So it should remove bitmap
first.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
As we give journal device raid_disk of 0, the output of --detail is:
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
5 8 24 0 journal /dev/sdb8
1 8 18 1 active sync /dev/sdb2
2 8 19 2 active sync /dev/sdb3
3 8 21 3 active sync /dev/sdb5
4 8 23 - spare /dev/sdb7
This patch makes it back to:
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 18 1 active sync /dev/sdb2
2 8 19 2 active sync /dev/sdb3
3 8 21 3 active sync /dev/sdb5
4 8 23 - spare /dev/sdb7
5 8 24 - journal /dev/sdb8
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
This forcibly removed the bad-block log. There can be situations where it is hard to
remove bad blocks by writing to them - partiularly on RAID5.
Signed-off-by: NeilBrown <neilb@suse.com>