command ConnectStoragePoolVDS failed: Cannot find master domain:

So we receive the following error from oVirt:

VDSM mdskvm-p01.mds.xyz command ConnectStoragePoolVDS failed: Cannot find master domain: u'spUUID=87ec67c6-8da8-4161-afdf-180778a4b595, msdUUID=73fa156c-f085-466f-b409-130a9795a667'
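The two UUIDs in that message identify the storage pool (spUUID) and the master storage domain (msdUUID). Here is a minimal sketch for pulling them out of the message so they can be compared against later log lines. The helper name extract_uuid is hypothetical and not part of VDSM or oVirt:

```shell
# Hypothetical helper: print the UUID that follows KEY= in a VDSM error line.
# Usage: extract_uuid KEY "message text"
extract_uuid() {
    key=$1
    msg=$2
    # A UUID is 36 characters of hex digits and dashes.
    printf '%s\n' "$msg" | sed -n "s/.*${key}=\([0-9a-fA-F-]\{36\}\).*/\1/p"
}

msg="Cannot find master domain: u'spUUID=87ec67c6-8da8-4161-afdf-180778a4b595, msdUUID=73fa156c-f085-466f-b409-130a9795a667'"
extract_uuid spUUID "$msg"
extract_uuid msdUUID "$msg"
```

The msdUUID value is the one worth searching for in the VDSM logs on the host.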

We dig in a bit deeper to see what's going on:

[root@mdskvm-p01 log]# systemctl status vdsmd.service
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2018-03-30 23:18:02 EDT; 23h ago
  Process: 2787 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 2875 (vdsmd)
   CGroup: /system.slice/vdsmd.service
           ├─ 2875 /usr/bin/python2 /usr/share/vdsm/vdsmd
           └─16845 /usr/libexec/ioprocess --read-pipe-fd 51 --write-pipe-fd 50 --max-threads 10 --max-queued-requests 10

Mar 31 00:39:03 mdskvm-p01.mds.xyz vdsm[2875]: ERROR Unhandled exception in <Task discardable <UpdateVolumes vm=d8dfd596-1e87-4e98-87ff-269edd…001d610>
                                               Traceback (most recent call last):
                                                 File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task…
Mar 31 00:40:03 mdskvm-p01.mds.xyz vdsm[2875]: ERROR Unhandled exception in <Task discardable <UpdateVolumes vm=d8dfd596-1e87-4e98-87ff-269edd…c0b96d0>
                                               Traceback (most recent call last):
                                                 File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task…
Mar 31 00:40:23 mdskvm-p01.mds.xyz vdsm[2875]: WARN unhandled close event
Mar 31 00:40:35 mdskvm-p01.mds.xyz fence_ilo[20843]: Unable to connect/login to fencing device
Mar 31 00:40:37 mdskvm-p01.mds.xyz fence_ilo[20889]: Unable to connect/login to fencing device
Mar 31 00:41:03 mdskvm-p01.mds.xyz vdsm[2875]: ERROR Unhandled exception in <Task discardable <UpdateVolumes vm=d8dfd596-1e87-4e98-87ff-269edd…009b650>
                                               Traceback (most recent call last):
                                                 File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task…
Mar 31 00:41:57 mdskvm-p01.mds.xyz vdsm[2875]: WARN File: /var/lib/libvirt/qemu/channels/d8dfd596-1e87-4e98-87ff-269edd92bdf1.ovirt-guest-agen… removed
Mar 31 00:41:57 mdskvm-p01.mds.xyz vdsm[2875]: WARN File: /var/lib/libvirt/qemu/channels/d8dfd596-1e87-4e98-87ff-269edd92bdf1.org.qemu.guest_a… removed
Mar 31 00:43:29 mdskvm-p01.mds.xyz vdsm[2875]: WARN File: /var/lib/libvirt/qemu/channels/d8dfd596-1e87-4e98-87ff-269edd92bdf1.ovirt-guest-agen… removed
Mar 31 00:43:29 mdskvm-p01.mds.xyz vdsm[2875]: WARN File: /var/lib/libvirt/qemu/channels/d8dfd596-1e87-4e98-87ff-269edd92bdf1.org.qemu.guest_a… removed
Hint: Some lines were ellipsized, use -l to show in full.
[root@mdskvm-p01 log]#
[root@mdskvm-p01 log]#
[root@mdskvm-p01 log]# systemctl status vdsmd.service -l
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2018-03-30 23:18:02 EDT; 23h ago
  Process: 2787 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 2875 (vdsmd)
   CGroup: /system.slice/vdsmd.service
           ├─ 2875 /usr/bin/python2 /usr/share/vdsm/vdsmd
           └─16845 /usr/libexec/ioprocess --read-pipe-fd 51 --write-pipe-fd 50 --max-threads 10 --max-queued-requests 10

Mar 31 00:39:03 mdskvm-p01.mds.xyz vdsm[2875]: ERROR Unhandled exception in <Task discardable <UpdateVolumes vm=d8dfd596-1e87-4e98-87ff-269edd92bdf1 at 0x7fcd3c0b9950> timeout=30.0, duration=0 at 0x7fcd4001d610>
                                               Traceback (most recent call last):
                                                 File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
                                                   task()
                                                 File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
                                                   self._callable()
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 349, in __call__
                                                   self._execute()
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 391, in _execute
                                                   self._vm.updateDriveVolume(drive)
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4209, in updateDriveVolume
                                                   vmDrive.volumeID)
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 6119, in _getVolumeSize
                                                   (domainID, volumeID))
                                               StorageUnavailableError: Unable to get volume size for domain 73fa156c-f085-466f-b409-130a9795a667 volume 81186557-9080-42d1-ba6a-633fb8b805e5
Mar 31 00:40:03 mdskvm-p01.mds.xyz vdsm[2875]: ERROR Unhandled exception in <Task discardable <UpdateVolumes vm=d8dfd596-1e87-4e98-87ff-269edd92bdf1 at 0x7fcd5805cd90> timeout=30.0, duration=0 at 0x7fcd3c0b96d0>
                                               Traceback (most recent call last):
                                                 File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
                                                   task()
                                                 File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
                                                   self._callable()
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 349, in __call__
                                                   self._execute()
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 391, in _execute
                                                   self._vm.updateDriveVolume(drive)
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4209, in updateDriveVolume
                                                   vmDrive.volumeID)
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 6119, in _getVolumeSize
                                                   (domainID, volumeID))
                                               StorageUnavailableError: Unable to get volume size for domain 73fa156c-f085-466f-b409-130a9795a667 volume 81186557-9080-42d1-ba6a-633fb8b805e5
Mar 31 00:40:23 mdskvm-p01.mds.xyz vdsm[2875]: WARN unhandled close event
Mar 31 00:40:35 mdskvm-p01.mds.xyz fence_ilo[20843]: Unable to connect/login to fencing device
Mar 31 00:40:37 mdskvm-p01.mds.xyz fence_ilo[20889]: Unable to connect/login to fencing device
Mar 31 00:41:03 mdskvm-p01.mds.xyz vdsm[2875]: ERROR Unhandled exception in <Task discardable <UpdateVolumes vm=d8dfd596-1e87-4e98-87ff-269edd92bdf1 at 0x3adeb90> timeout=30.0, duration=0 at 0x7fcd2009b650>
                                               Traceback (most recent call last):
                                                 File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
                                                   task()
                                                 File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
                                                   self._callable()
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 349, in __call__
                                                   self._execute()
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 391, in _execute
                                                   self._vm.updateDriveVolume(drive)
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4209, in updateDriveVolume
                                                   vmDrive.volumeID)
                                                 File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 6119, in _getVolumeSize
                                                   (domainID, volumeID))
                                               StorageUnavailableError: Unable to get volume size for domain 73fa156c-f085-466f-b409-130a9795a667 volume 81186557-9080-42d1-ba6a-633fb8b805e5
Mar 31 00:41:57 mdskvm-p01.mds.xyz vdsm[2875]: WARN File: /var/lib/libvirt/qemu/channels/d8dfd596-1e87-4e98-87ff-269edd92bdf1.ovirt-guest-agent.0 already removed
Mar 31 00:41:57 mdskvm-p01.mds.xyz vdsm[2875]: WARN File: /var/lib/libvirt/qemu/channels/d8dfd596-1e87-4e98-87ff-269edd92bdf1.org.qemu.guest_agent.0 already removed
Mar 31 00:43:29 mdskvm-p01.mds.xyz vdsm[2875]: WARN File: /var/lib/libvirt/qemu/channels/d8dfd596-1e87-4e98-87ff-269edd92bdf1.ovirt-guest-agent.0 already removed
Mar 31 00:43:29 mdskvm-p01.mds.xyz vdsm[2875]: WARN File: /var/lib/libvirt/qemu/channels/d8dfd596-1e87-4e98-87ff-269edd92bdf1.org.qemu.guest_agent.0 already removed
[root@mdskvm-p01 log]#
[root@mdskvm-p01 log]#
[root@mdskvm-p01 log]#
[root@mdskvm-p01 log]# systemctl restart vdsmd.service -l
[root@mdskvm-p01 log]# systemctl status vdsmd.service -l
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2018-04-01 00:04:52 EDT; 2s ago
  Process: 22701 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh --post-stop (code=exited, status=0/SUCCESS)
  Process: 22705 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 22783 (vdsmd)
   CGroup: /system.slice/vdsmd.service
           └─22783 /usr/bin/python2 /usr/share/vdsm/vdsmd

Apr 01 00:04:50 mdskvm-p01.mds.xyz vdsmd_init_common.sh[22705]: vdsm: Running prepare_transient_repository
Apr 01 00:04:51 mdskvm-p01.mds.xyz vdsmd_init_common.sh[22705]: vdsm: Running syslog_available
Apr 01 00:04:51 mdskvm-p01.mds.xyz vdsmd_init_common.sh[22705]: vdsm: Running nwfilter
Apr 01 00:04:51 mdskvm-p01.mds.xyz vdsmd_init_common.sh[22705]: vdsm: Running dummybr
Apr 01 00:04:52 mdskvm-p01.mds.xyz vdsmd_init_common.sh[22705]: vdsm: Running tune_system
Apr 01 00:04:52 mdskvm-p01.mds.xyz vdsmd_init_common.sh[22705]: vdsm: Running test_space
Apr 01 00:04:52 mdskvm-p01.mds.xyz vdsmd_init_common.sh[22705]: vdsm: Running test_lo
Apr 01 00:04:52 mdskvm-p01.mds.xyz systemd[1]: Started Virtual Desktop Server Manager.
Apr 01 00:04:53 mdskvm-p01.mds.xyz vdsm[22783]: WARN MOM not available.
Apr 01 00:04:53 mdskvm-p01.mds.xyz vdsm[22783]: WARN MOM not available, KSM stats will be missing.
[root@mdskvm-p01 log]#
[root@mdskvm-p01 log]#
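
The full status output repeats the same StorageUnavailableError once a minute, always for the domain named as msdUUID in the oVirt error. Here is a rough sketch for tallying those errors per storage domain from log text on stdin. The function name count_unavailable_domains is made up for illustration:

```shell
# Hypothetical helper: read log text on stdin and print "count domain-uuid"
# for every storage domain named in a StorageUnavailableError line.
count_unavailable_domains() {
    grep 'StorageUnavailableError' \
        | sed -n 's/.*for domain \([0-9a-fA-F-]\{36\}\) .*/\1/p' \
        | sort | uniq -c | awk '{print $1, $2}'
}

# Example input: two of the error lines from the status output above.
printf '%s\n' \
    'StorageUnavailableError: Unable to get volume size for domain 73fa156c-f085-466f-b409-130a9795a667 volume 81186557-9080-42d1-ba6a-633fb8b805e5' \
    'StorageUnavailableError: Unable to get volume size for domain 73fa156c-f085-466f-b409-130a9795a667 volume 81186557-9080-42d1-ba6a-633fb8b805e5' \
    | count_unavailable_domains
```

On a live host the input could come from journalctl -u vdsmd.service, assuming the journal still holds the relevant entries.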

XFS metadata corruption shows up in /var/log/messages:

Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): Metadata corruption detected at xfs_agi_read_verify+0x5e/0x110 [xfs], xfs_agi block 0xebffc502
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): Unmount and run xfs_repair
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): First 64 bytes of corrupted metadata buffer:
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811e7aa1200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811e7aa1210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811e7aa1220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811e7aa1230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): metadata I/O error: block 0xebffc502 ("xfs_trans_read_buf_map") error 117 numblks 1
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): Metadata corruption detected at xfs_agi_read_verify+0x5e/0x110 [xfs], xfs_agi block 0xefffc402
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): Unmount and run xfs_repair
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): First 64 bytes of corrupted metadata buffer:
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811e7aa1200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811e7aa1210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811e7aa1220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ovirtmgmt: received packet on bond0 with own address as source address (addr:78:e7:d1:8f:4d:26, vlan:0)
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811e7aa1230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ovirtmgmt: received packet on bond0 with own address as source address (addr:78:e7:d1:8f:4d:26, vlan:0)
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): metadata I/O error: block 0xefffc402 ("xfs_trans_read_buf_map") error 117 numblks 1
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): Metadata corruption detected at xfs_agi_read_verify+0x5e/0x110 [xfs], xfs_agi block 0xf3ffc302
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): Unmount and run xfs_repair
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): First 64 bytes of corrupted metadata buffer:
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811f8ba2200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811f8ba2210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811f8ba2220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8811f8ba2230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): metadata I/O error: block 0xf3ffc302 ("xfs_trans_read_buf_map") error 117 numblks 1
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): Metadata corruption detected at xfs_agi_read_verify+0x5e/0x110 [xfs], xfs_agi block 0xf7ffc202
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): Unmount and run xfs_repair
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): First 64 bytes of corrupted metadata buffer:
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8808e5335c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8808e5335c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8808e5335c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: ffff8808e5335c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): metadata I/O error: block 0xf7ffc202 ("xfs_trans_read_buf_map") error 117 numblks 1
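
Since /var/log/messages interleaves these reports with unrelated kernel lines, it can help to boil them down to a list of affected devices. This is a small sketch; the function name xfs_corrupt_devices is hypothetical:

```shell
# Hypothetical helper: read kernel log text on stdin and print
# "device report-count" for each XFS device that logged metadata corruption.
xfs_corrupt_devices() {
    sed -n 's/.*XFS (\([^)]*\)): Metadata corruption detected.*/\1/p' \
        | sort | uniq -c | awk '{print $2, $1}'
}

# Example input: two of the kernel lines shown above.
printf '%s\n' \
    'Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): Metadata corruption detected at xfs_agi_read_verify+0x5e/0x110 [xfs], xfs_agi block 0xebffc502' \
    'Mar 29 09:37:55 mdskvm-p01 kernel: XFS (dm-3): Metadata corruption detected at xfs_agi_read_verify+0x5e/0x110 [xfs], xfs_agi block 0xefffc402' \
    | xfs_corrupt_devices
```

Against the real file this would be xfs_corrupt_devices < /var/log/messages. The dm-N names are device-mapper nodes; ls -l /dev/mapper shows which logical volume each one belongs to.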

To fix it, drop to runlevel 1 and unmount the volume. Failing that, boot the host from a rescue ISO and work from there. The -n flag puts xfs_repair in no-modify mode, so this first pass only reports what it would change; rerun without -n to apply the repair:

xfs_repair -n /dev/mdskvmsanvg/mdskvmsanlv 2>&1 | more
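
The whole sequence can be sketched as a small helper that prints the commands for review instead of running them. The function name xfs_repair_plan is hypothetical, and the final xfs_repair call without -n is an assumption about the follow-up step, which this post does not show:

```shell
# Sketch only: prints a check-then-repair command sequence for one device.
# Nothing is executed; review the output before running any of it by hand.
xfs_repair_plan() {
    dev=$1
    printf 'umount %s\n' "$dev"          # xfs_repair needs the filesystem unmounted
    printf 'xfs_repair -n %s\n' "$dev"   # dry run: report problems, change nothing
    printf 'xfs_repair %s\n' "$dev"      # assumed follow-up: the actual repair
}

xfs_repair_plan /dev/mdskvmsanvg/mdskvmsanlv
```

Printing the plan rather than executing it keeps the destructive step a deliberate, manual one.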

Cheers,
TK

  Copyright © 2003 - 2025 Tom Kacperski (microdevsys.com). All rights reserved.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License

 
