Microcomputers, Development and Systems

Extending the size of your mdadm array.

Now that you've replaced all the failed disks, we can double the size of our array from 4TB to 8TB.

We start off with this array:

[root@mbpc-pc log]# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Mon Mar 26 00:06:24 2012
Raid Level : raid6
Array Size : 3907045632 (3726.05 GiB 4000.81 GB)
Used Dev Size : 976761408 (931.51 GiB 1000.20 GB)
Raid Devices : 6
Total Devices : 6
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Thu Mar 29 23:02:24 2018
State : active
Active Devices : 6
Working Devices : 6
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 64K

Name : mbpc:0
UUID : 2f36ac48:5e3e4c54:72177c53:bea3e41e
Events : 1333503

Number Major Minor RaidDevice State
8 8 64 0 active sync /dev/sde
9 8 32 1 active sync /dev/sdc
7 8 16 2 active sync /dev/sdb
11 8 48 3 active sync /dev/sdd
6 8 80 4 active sync /dev/sdf
10 8 0 5 active sync /dev/sda
[root@mbpc-pc log]#
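
The Array Size above follows from the RAID 6 layout: two devices' worth of space hold parity, so the usable size is (Raid Devices - 2) x Used Dev Size. Here is a minimal sketch of that arithmetic; the function name raid6_array_kib is made up for illustration and is not part of mdadm.

```shell
# Usable RAID 6 capacity in KiB: two devices' worth of space hold parity.
# raid6_array_kib is a hypothetical helper, not an mdadm command.
raid6_array_kib() {
    devices=$1        # "Raid Devices" from mdadm --detail
    dev_size_kib=$2   # "Used Dev Size" from mdadm --detail, in KiB
    echo $(( (devices - 2) * dev_size_kib ))
}

raid6_array_kib 6 976761408     # prints 3907045632, the Array Size shown above
raid6_array_kib 6 1953513536    # prints 7814054144, if each member offers twice as much
```

So doubling the per-device size doubles the array, as long as every member disk really has the extra room.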

So let's do this:

[root@mbpc-pc log]#
[root@mbpc-pc log]# mdadm --grow /dev/md0 --size=max
mdadm: component size of /dev/md0 has been set to 1953513536K
unfreeze
[root@mbpc-pc log]# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Mon Mar 26 00:06:24 2012
Raid Level : raid6
Array Size : 7814054144 (7452.06 GiB 8001.59 GB)
Used Dev Size : 1953513536 (1863.02 GiB 2000.40 GB)
Raid Devices : 6
Total Devices : 6
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Thu Mar 29 23:42:32 2018
State : active, resyncing
Active Devices : 6
Working Devices : 6
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 64K

Resync Status : 51% complete

Name : mbpc:0
UUID : 2f36ac48:5e3e4c54:72177c53:bea3e41e
Events : 1333507

Number Major Minor RaidDevice State
8 8 64 0 active sync /dev/sde
9 8 32 1 active sync /dev/sdc
7 8 16 2 active sync /dev/sdb
11 8 48 3 active sync /dev/sdd
6 8 80 4 active sync /dev/sdf
10 8 0 5 active sync /dev/sda
[root@mbpc-pc log]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdd[11] sde[8] sdc[9] sdb[7] sdf[6] sda[10]
7814054144 blocks super 1.2 level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
[==========>..........] resync = 51.5% (1007560660/1953513536) finish=373.8min speed=42168K/sec
bitmap: 7/8 pages [28KB], 131072KB chunk

unused devices: <none>
[root@mbpc-pc log]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdd[11] sde[8] sdc[9] sdb[7] sdf[6] sda[10]
7814054144 blocks super 1.2 level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
[==========>..........] resync = 51.5% (1007603712/1953513536) finish=405.9min speed=38830K/sec
bitmap: 8/8 pages [32KB], 131072KB chunk

unused devices: <none>
[root@mbpc-pc log]#
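
The finish= estimate in /proc/mdstat is roughly the remaining blocks divided by the current speed. Here is a small sketch of that calculation, assuming the counters are in 1K blocks and the speed is in K/sec as printed; resync_eta_min is a hypothetical helper name, and the kernel's own rounding differs slightly.

```shell
# Rough minutes remaining for an md resync, from the numbers in /proc/mdstat.
# resync_eta_min is a hypothetical helper; mdstat already prints "finish=".
resync_eta_min() {
    done_kib=$1    # blocks already synced (first number in the parentheses)
    total_kib=$2   # blocks to sync per device (second number)
    speed_kib=$3   # current speed in K/sec
    echo $(( (total_kib - done_kib) / speed_kib / 60 ))
}

resync_eta_min 1007560660 1953513536 42168   # prints 373
```

That lines up with the finish=373.8min in the first sample. The estimate moves with the speed, which is why the second sample jumps to about 406 minutes. Something like watch -n 60 cat /proc/mdstat saves retyping the command while you wait.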

And now you wait. Once the resync is done, use the usual LVM commands such as pvs, vgs and lvs to check the volumes, and pvresize to grow the physical volume.

Some reading available here.

Now that you've done that, it's time to resize the LVM physical volume:

[root@mbpc-pc ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/md0 MBPCStorage lvm2 a-- 3.64t 931.70g
/dev/sdg2 mbpcvg lvm2 a-- 1.18t 0
/dev/sdg4 mbpcvg lvm2 a-- 465.75g 415.75g
[root@mbpc-pc ~]# pvresize /dev/md0
Physical volume "/dev/md0" changed
1 physical volume(s) resized / 0 physical volume(s) not resized
[root@mbpc-pc ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/md0 MBPCStorage lvm2 a-- 7.28t 4.55t
/dev/sdg2 mbpcvg lvm2 a-- 1.18t 0
/dev/sdg4 mbpcvg lvm2 a-- 465.75g 415.75g
[root@mbpc-pc ~]# vgs
VG #PV #LV #SN Attr VSize VFree
MBPCStorage 1 1 0 wz--n- 7.28t 4.55t
mbpcvg 2 3 0 wz--n- 1.64t 415.75g
[root@mbpc-pc ~]# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
MBPCBackup MBPCStorage -wi-ao---- 2.73t
fmlv mbpcvg -wi-ao---- 1.15t
rootlv mbpcvg -wi-ao---- 81.25g
swaplv mbpcvg -wi-ao---- 4.00g
[root@mbpc-pc ~]#

And you're set.
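
Note that in the lvs output the MBPCBackup logical volume is still 2.73t; pvresize only hands the new space to the volume group. If you also want the LV to take it, the next step could look like the sketch below. It only prints the command rather than running it, plan_lv_grow is a hypothetical helper name, and it assumes the filesystem on the LV is one that lvextend -r can resize.

```shell
# Print (do not run) the command that would grow a logical volume into the
# free space that pvresize exposed. plan_lv_grow is a hypothetical helper.
plan_lv_grow() {
    vg=$1   # volume group name
    lv=$2   # logical volume name
    # -l +100%FREE takes all free extents in the VG; -r also resizes the filesystem.
    echo "lvextend -r -l +100%FREE /dev/$vg/$lv"
}

plan_lv_grow MBPCStorage MBPCBackup
```

Review the printed command before running it by hand, and swap +100%FREE for a fixed size such as -L +1T if you'd rather keep some extents in reserve.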

Cheers,
TK

March 29th, 2018 | Posted in NIX Posts | No Comments

pam_reply called with result [4]: System error.

So you're trying to log in and get these messages on ovirt01 (192.168.0.145) and ipaclient01 (192.168.0.236). What could be wrong?

(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [ldb] (0x4000): cancel ldb transaction (nesting: 2)
(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [sysdb_mod_group_member] (0x0080): ldb_modify failed: [No such object](32)[ldb_wait from ldb_modify with LDB_WAIT_ALL: No such object (32)]
(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [sysdb_mod_group_member] (0x0400): Error: 2 (No such file or directory)
(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [sysdb_update_members_ex] (0x0020): Could not add member [tom@mds.xyz] to group [name=tom@mds.xyz,cn=groups,cn=mds.xyz,cn=sysdb]. Skipping.

(Thu Mar 22 23:59:26 2018) [[sssd[krb5_child[3246]]]] [k5c_setup_fast] (0x0020): check_fast_ccache failed.
(Thu Mar 22 23:59:26 2018) [[sssd[krb5_child[3246]]]] [k5c_setup_fast] (0x0020): 2618: [-1765328203][Key table entry not found]
(Thu Mar 22 23:59:26 2018) [[sssd[krb5_child[3246]]]] [privileged_krb5_setup] (0x0040): Cannot set up FAST
(Thu Mar 22 23:59:26 2018) [[sssd[krb5_child[3246]]]] [main] (0x0020): privileged_krb5_setup failed.
(Thu Mar 22 23:59:26 2018) [[sssd[krb5_child[3246]]]] [main] (0x0020): krb5_child failed!

(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [read_pipe_handler] (0x0400): EOF received, client finished

(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [parse_krb5_child_response] (0x0020): message too short.
(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [krb5_auth_done] (0x0040): The krb5_child process returned an error. Please inspect the krb5_child.log file or the journal for more information
(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [krb5_auth_done] (0x0040): Could not parse child response [22]: Invalid argument
(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [check_wait_queue] (0x1000): Wait queue for user [tom@mds.xyz] is empty.
(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [krb5_auth_queue_done] (0x0040): krb5_auth_recv failed with: 22
(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [ipa_pam_auth_handler_krb5_done] (0x0040): KRB5 auth failed [22]: Invalid argument
(Thu Mar 22 23:59:26 2018) [sssd[be[nix.mds.xyz]]] [dp_req_done] (0x0400): DP Request [PAM Preauth #2]: Request handler finished [0]: Success

(Thu Mar 22 23:59:26 2018) [sssd[pam]] [pam_dp_process_reply] (0x0200): received: [4 (System error)][mds.xyz]
(Thu Mar 22 23:59:26 2018) [sssd[pam]] [pam_reply] (0x0200): pam_reply called with result [4]: System error.

More intriguing is that the reverse dig output had two PTR records for one IP and none for the other IP:

[root@ovirt01 network-scripts]# dig -x 192.168.0.145

; <<>> DiG 9.9.4-RedHat-9.9.4-51.el7_4.2 <<>> -x 192.168.0.145
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47551
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 3

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;145.0.168.192.in-addr.arpa. IN PTR

;; ANSWER SECTION:
145.0.168.192.in-addr.arpa. 1200 IN PTR ovirt01.nix.mds.xyz.
145.0.168.192.in-addr.arpa. 1200 IN PTR ipaclient01.nix.mds.xyz.

;; AUTHORITY SECTION:
0.168.192.in-addr.arpa. 86400 IN NS idmipa01.nix.mds.xyz.
0.168.192.in-addr.arpa. 86400 IN NS idmipa02.nix.mds.xyz.

;; ADDITIONAL SECTION:
idmipa01.nix.mds.xyz. 1200 IN A 192.168.0.44
idmipa02.nix.mds.xyz. 1200 IN A 192.168.0.45

;; Query time: 1 msec
;; SERVER: 192.168.0.44#53(192.168.0.44)
;; WHEN: Fri Mar 23 00:04:25 EDT 2018
;; MSG SIZE rcvd: 192

[root@ovirt01 network-scripts]#

Whilst the other IP had no PTR records returned:

[root@ovirt01 network-scripts]# dig -x 192.168.0.236

; <<>> DiG 9.9.4-RedHat-9.9.4-51.el7_4.2 <<>> -x 192.168.0.236
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 64699
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;236.0.168.192.in-addr.arpa. IN PTR

;; AUTHORITY SECTION:
0.168.192.in-addr.arpa. 3600 IN SOA idmipa01.nix.mds.xyz. hostmaster.nix.mds.xyz. 1521778151 3600 900 1209600 3600

;; Query time: 1 msec
;; SERVER: 192.168.0.44#53(192.168.0.44)
;; WHEN: Fri Mar 23 00:27:22 EDT 2018
;; MSG SIZE rcvd: 122

[root@ovirt01 network-scripts]#

The cause was that I had copied the /etc/sssd/sssd.conf config from one client to the other. More specifically, I had copied the config from ipaclient01 to ovirt01, so both hosts ended up with the same ipa_hostname:

[root@ipaclient01 ~]# grep -Ei ipa_hostname /etc/sssd/sssd.conf
ipa_hostname = ipaclient01.nix.mds.xyz
[root@ipaclient01 ~]#

[root@ovirt01 network-scripts]# grep -Ei ipa_hostname /etc/sssd/sssd.conf
ipa_hostname = ipaclient01.nix.mds.xyz
[root@ovirt01 network-scripts]#

Setting ipa_hostname on ovirt01 to its own name quickly resolved my login issue.
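
A quick way to catch this after copying configs around is to compare the ipa_hostname line with the name the client should have. Here is a minimal sketch; check_ipa_hostname is a made-up helper, not an SSSD or IPA tool, and it assumes a single ipa_hostname line in the file.

```shell
# Compare ipa_hostname in an sssd.conf with the name the host should have.
# check_ipa_hostname is a hypothetical helper, not an SSSD or IPA tool.
check_ipa_hostname() {
    conf=$1       # path to sssd.conf
    expected=$2   # FQDN this client is enrolled as
    configured=$(sed -n 's/^[[:space:]]*ipa_hostname[[:space:]]*=[[:space:]]*//p' "$conf" \
        | head -n 1 | tr -d '[:space:]')
    if [ "$configured" = "$expected" ]; then
        echo "OK: ipa_hostname = $configured"
    else
        echo "MISMATCH: ipa_hostname = $configured, expected $expected"
        return 1
    fi
}

# Typical use on a client (needs read access to the file):
#   check_ipa_hostname /etc/sssd/sssd.conf "$(hostname -f)"
```

On ovirt01 with the copied config this would have reported a mismatch against ipaclient01.nix.mds.xyz straight away.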

Cheers,
TK

March 23rd, 2018 | Posted in NIX Posts | No Comments

Windows 7 Cannot Resolve hostnames via PING but nslookup works.

It may happen that you can't ping a hostname, whether it lives on a local DNS server you run internally or on an external one, even though nslookup resolves it. Flushing the DNS cache may not help either:

C:\Users\tom>ipconfig /flushdns

Windows IP Configuration

Successfully flushed the DNS Resolver Cache.

C:\Users\tom>ping vcsa01
Ping request could not find host vcsa01. Please check the name and try again.

C:\Users\tom>ping vcsa01

So what you can also do is disable IPv6 in Windows for the interface you're using, under Control Panel -> Network and Sharing Center.

If that doesn't work, consider whether you are using OpenVPN. If the OpenVPN client is up and you're using the VPN to move in and out of your infrastructure, consider turning it off or restarting the DHCP Client service in Windows Services.

Alternatively, you may have a third DNS server listed in the IPv4 Advanced Properties panel of your network card properties in Windows. Review your DNS settings and remove any extra DNS entries that can't resolve the hostnames you are trying to get to.

Cheers,
TK

March 22nd, 2018 | Posted in NIX Posts | No Comments

failed command: READ FPDMA QUEUED FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE

So my last Seagate SATA drive in my RAID 6 Array died spectacularly taking out my 4.8.4 Kernel and locking up my storage to the point where the only way I can get to it is via the kernel boot parameter init=/bin/bash . The disk lasted about 5.762 years:

Read the rest of this entry »

March 21st, 2018 | Posted in NIX Posts | No Comments

GlusterFS: Configuration and Setup w/ NFS-Ganesha for an HA NFS Cluster (Quick Start Guide)

This is a much shorter version of the troubleshooting article on NFS Ganesha we created earlier. It is meant as a quick start guide for those who just want to get this server up and running very quickly. The point of High Availability is that the best implemented HA solutions never allow any outage to be noticed by the client. It's not the client's job to put up with the fallout of a failure; it's the sysadmin's job to ensure they never have to. In this configuration we will use a 3 node Gluster cluster. In short, we'll be using the following techs to set up an HA configuration:

GlusterFS
NFS Ganesha
CentOS 7
HAPROXY
keepalived
firewalld
selinux

Here's a summary configuration for this whole work. If you run into this particularly nasty error, visit the solution page here:

HOST	SETTING	DESCRIPTION
nfs01 / nfs02 / nfs03	Create and reserve some IPs for your hosts. We are using the FreeIPA project to provide DNS and Kerberos functionality here:
192.168.0.80 nfs-c01 (nfs01, nfs02, nfs03) VIP DNS Entry
192.168.0.131 nfs01
192.168.0.119 nfs02
192.168.0.125 nfs03	Add the hosts to your DNS server for a clean setup. Alternately add them to /etc/hosts (ugly)
nfs01 / nfs02 / nfs03	PACKAGES
You can use the packages directly. Since version 2.6.X, Ganesha supports binding only on specific interfaces, and this has been introduced in the latest RPM packages.
yum install nfs-ganesha.x86_64 nfs-ganesha-gluster.x86_64 nfs-ganesha-proxy.x86_64 nfs-ganesha-utils.x86_64 nfs-ganesha-vfs.x86_64 nfs-ganesha-xfs.x86_64 nfs-ganesha-mount-9P.x86_64
COMPILING
We used this method because we needed a feature that allows binding the service only on specific ports, at the time only available from the latest source releases.
wget https://github.com/nfs-ganesha/nfs-ganesha/archive/V2.6-.0.tar.gz
[root@nfs01 ~]# ganesha.nfsd -v
NFS-Ganesha Release = V2.6.0
nfs-ganesha compiled on Feb 20 2018 at 08:55:23
Release comment = GANESHA file server is 64 bits compliant and supports NFS v3,4.0,4.1 (pNFS) and 9P
Git HEAD = 97867975b2ee69d475876e222c439b1bc9764a78
Git Describe = V2.6-.0-0-g9786797
[root@nfs01 ~]#
DETAILED INSTRUCTIONS:
https://github.com/nfs-ganesha/nfs-ganesha/wiki/Compiling
https://github.com/nfs-ganesha/nfs-ganesha/wiki/GLUSTER
https://github.com/nfs-ganesha/nfs-ganesha/wiki/XFSLUSTRE
PACKAGES:
yum install glusterfs-api-devel.x86_64
yum install xfsprogs-devel.x86_64
yum install xfsprogs.x86_64 xfsdump-3.1.4-1.el7.x86_64 libguestfs-xfs-1.36.3-6.el7_4.3.x86_64 libntirpc-devel-1.5.4-1.el7.x86_64 libntirpc-1.5.4-1.el7.x86_64 libnfsidmap-devel-0.25-17.el7.x86_64 jemalloc-devel-3.6.0-1.el7.x86_64
COMMANDS
git clone https://github.com/nfs-ganesha/nfs-ganesha.git
cd nfs-ganesha; git checkout V2.6-stable
git submodule update --init --recursive
yum install gcc-c++
yum install cmake
ccmake /root/ganesha/nfs-ganesha/src/
# Press the c, e, c, g keys to create and generate the config and make files.
make
make install	Compile and build NFS-Ganesha 2.6.0+ from source. (At this time RPM packages did not work.) Install the listed packages before compiling as well.
nfs01 / nfs02 / nfs03	Add a disk to the VM such as /dev/sdb .	Add secondary disk for the shared GlusterFS
nfs01 / nfs02 / nfs03	Create the FS on the new disk, mount it and set up Gluster:
mkfs.xfs /dev/sdb
mkdir -p /bricks/0
mount /dev/sdb /bricks/0
# grep brick /etc/fstab
/dev/sdb /bricks/0 xfs defaults 0 0
Gluster currently ships in version 4.1. This won't work with Ganesha. Use either the repo or continue installing the latest version of Gluster:
# cat CentOS-Gluster-3.13.repo
# CentOS-Gluster-3.13.repo
#
# Please see http://wiki.centos.org/SpecialInterestGroup/Storage for more
# information
[centos-gluster313]
name=CentOS-$releasever - Gluster 3.13 (Short Term Maintanance)
baseurl=http://mirror.centos.org/centos/$releasever/storage/$basearch/gluster-3.13/
gpgcheck=1
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-SIG-Storage
[centos-gluster313-test]
name=CentOS-$releasever - Gluster 3.13 Testing (Short Term Maintenance)
baseurl=http://buildlogs.centos.org/centos/$releasever/storage/$basearch/gluster-3.13/
gpgcheck=0
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-SIG-Storage
Alternately to the above, use the following to install the latest repo:
yum install centos-release-gluster
Install and enable the rest:
yum -y install glusterfs glusterfs-fuse glusterfs-server glusterfs-api glusterfs-cli
systemctl enable glusterd.service
systemctl start glusterd
On node01 ONLY if creating brand new:
gluster volume create gv01 replica 2 nfs01:/bricks/0/gv01 nfs02:/bricks/0/gv01
gluster volume info gv01
gluster volume status
Replace bricks:
Unreachable brick:
gluster volume remove-brick gv01 replica X nfs01:/bricks/0/gv01 start
gluster volume remove-brick gv01 replica X nfs01:/bricks/0/gv01 force
gluster peer detach nfs01
Reachable brick:
gluster volume remove-brick gv01 replica X nfs01:/bricks/0/gv01 start
gluster volume remove-brick gv01 replica X nfs01:/bricks/0/gv01 status
gluster volume remove-brick gv01 replica X nfs01:/bricks/0/gv01 commit
gluster peer detach nfs01
Add subsequent bricks (from an existing cluster member):
[root@nfs01 ~]# gluster peer probe nfs03
[root@nfs01 ~]# gluster volume add-brick gv01 replica 3 nfs03:/bricks/0/gv01
Mount the storage locally:
systemctl disable autofs
mkdir /n
Example below. Add to /etc/fstab as well:
[root@nfs01 ~]# mount -t glusterfs nfs01:/gv01 /n
[root@nfs02 ~]# mount -t glusterfs nfs02:/gv01 /n
[root@nfs03 ~]# mount -t glusterfs nfs03:/gv01 /n
Ex:
nfs01:/gv01 /n glusterfs defaults 0 0
Ensure the following options are set on the gluster volume:
[root@nfs01 glusterfs]# gluster volume set gv01 cluster.quorum-type auto
volume set: success
[root@nfs01 glusterfs]# gluster volume set gv01 cluster.server-quorum-type server
volume set: success
Here is an example Gluster volume configuration we used (this config is replicated when adding new bricks):
cluster.server-quorum-type: server
cluster.quorum-type: auto
server.event-threads: 8
client.event-threads: 8
performance.readdir-ahead: on
performance.write-behind-window-size: 8MB
performance.io-thread-count: 16
performance.cache-size: 1GB
nfs.trusted-sync: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet	Configure the GlusterFS filesystem using
nfs01 / nfs02 / nfs03	PACKAGES:
yum install haproxy # ( 1.5.18-6.el7.x86_64 used in this case )
/etc/haproxy/haproxy.cfg:
global
    log 127.0.0.1 local0 debug
    stats socket /var/run/haproxy.sock mode 0600 level admin
    # stats socket /var/lib/haproxy/stats
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    debug
defaults
    mode tcp
    log global
    option dontlognull
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000
frontend nfs-in
    log 127.0.0.1 local0 debug
    # bind nfs02:2049
    bind nfs-c01:2049
    mode tcp
    option tcplog
    default_backend nfs-back
backend nfs-back
    log /dev/log local0 debug
    mode tcp
    balance source
    server nfs01.nix.mds.xyz nfs01.nix.mds.xyz:2049 check
    server nfs02.nix.mds.xyz nfs02.nix.mds.xyz:2049 check
    server nfs03.nix.mds.xyz nfs03.nix.mds.xyz:2049 check
listen stats
    bind :9000
    mode http
    stats enable
    stats hide-version
    stats realm Haproxy\ Statistics
    stats uri /haproxy-stats
    stats auth admin:s3cretp@s$w0rd
Set logging settings for HAProxy:
# cat /etc/rsyslog.d/haproxy.conf
$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514
local6.* /var/log/haproxy.log
local0.* /var/log/haproxy.log
Configure rsyslogd (/etc/rsyslog.conf):
local0.* /var/log/haproxy.log
local3.* /var/log/keepalived.log	Install and Configure HAPROXY. A great source that helped with this part.
nfs01 / nfs02 / nfs03	# echo "net.ipv4.ip_nonlocal_bind = 1" >> /etc/sysctl.conf
# echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
#	Turn on kernel parameters. These allow keepalived below to function properly.
nfs01 / nfs02 / nfs03	PACKAGES:
yum install keepalived # ( Used 1.3.5-1.el7.x86_64 in this case )
NFS01:
vrrp_script chk_haproxy {
    script "killall -0 haproxy" # check the haproxy process
    interval 2 # every 2 seconds
    weight 2 # add 2 points if OK
}
vrrp_instance nfs-c01 {
    interface eth0 # interface to monitor
    state MASTER # MASTER on haproxy1, BACKUP on haproxy2
    virtual_router_id 80 # Set to last digit of cluster IP.
    priority 101 # 101 on haproxy1, 100 on haproxy2
    authentication {
        auth_type PASS
        auth_pass s3cretp@s$w0rd
    }
    virtual_ipaddress {
        delay_loop 12
        lb_algo wrr
        lb_kind DR
        protocol TCP
        192.168.0.80 # virtual ip address
    }
    track_script {
        chk_haproxy
    }
}
NFS02:
vrrp_script chk_haproxy {
    script "killall -0 haproxy" # check the haproxy process
    interval 2 # every 2 seconds
    weight 2 # add 2 points if OK
}
vrrp_instance nfs-c01 {
    interface eth0 # interface to monitor
    state BACKUP # MASTER on haproxy1, BACKUP on haproxy2
    virtual_router_id 80 # Set to last digit of cluster IP.
    priority 102 # 101 on haproxy1, 100 on haproxy2
    authentication {
        auth_type PASS
        auth_pass s3cretp@s$w0rd
    }
    virtual_ipaddress {
        delay_loop 12
        lb_algo wrr
        lb_kind DR
        protocol TCP
        192.168.0.80 # virtual ip address
    }
    track_script {
        chk_haproxy
    }
}
NFS03:
vrrp_script chk_haproxy {
    script "killall -0 haproxy" # check the haproxy process
    interval 2 # every 2 seconds
    weight 2 # add 2 points if OK
}
vrrp_instance nfs-c01 {
    interface eth0 # interface to monitor
    state BACKUP # MASTER on haproxy1, BACKUP on haproxy2
    virtual_router_id 80 # Set to last digit of cluster IP.
    priority 103 # 101 on haproxy1, 100 on haproxy2
    authentication {
        auth_type PASS
        auth_pass s3cretp@s$w0rd
    }
    virtual_ipaddress {
        delay_loop 12
        lb_algo wrr
        lb_kind DR
        protocol TCP
        192.168.0.80 # virtual ip address
    }
    track_script {
        chk_haproxy
    }
}
Configure extended logging and corresponding log rotation for keepalived:
[root@nfs03 log]# cat /etc/sysconfig/keepalived
KEEPALIVED_OPTIONS="-dD -S 3"
[root@nfs03 log]#
[root@nfs03 log]# cat /etc/logrotate.d/keepalived
/var/log/keepalived.log {
    daily
    rotate 8
    copytruncate
    maxsize 100M
    dateext
    compress
    missingok
}
[root@nfs03 log]#
Logrotate the log file:
[root@nfs03 log]# logrotate /etc/logrotate.d/keepalived -v	Configure keepalived. A great source that helped with this as well.
nfs01 / nfs02 / nfs03	This step can be made quicker by copying the xml definitions from one host to the other if you already have one defined:
/etc/firewalld/zones/dmz.xml
/etc/firewalld/zones/public.xml
Contents of above:
# cat dmz.xml
<?xml version="1.0" encoding="utf-8"?>
<zone>
  <short>DMZ</short>
  <description>For computers in your demilitarized zone that are publicly-accessible with limited access to your internal network. Only selected incoming connections are accepted.</description>
  <service name="ssh"/>
  <port protocol="tcp" port="2049"/>
  <port protocol="tcp" port="111"/>
  <port protocol="tcp" port="24007-24008"/>
  <port protocol="tcp" port="38465-38469"/>
  <port protocol="udp" port="111"/>
  <port protocol="tcp" port="22"/>
  <port protocol="udp" port="22"/>
  <port protocol="udp" port="49000-59999"/>
  <port protocol="tcp" port="49000-59999"/>
  <port protocol="tcp" port="20048"/>
  <port protocol="udp" port="20048"/>
  <port protocol="tcp" port="49152"/>
  <port protocol="tcp" port="4501"/>
  <port protocol="udp" port="4501"/>
  <port protocol="tcp" port="10000"/>
  <port protocol="udp" port="9000"/>
  <port protocol="tcp" port="9000"/>
  <port protocol="tcp" port="445"/>
  <port protocol="tcp" port="139"/>
  <port protocol="udp" port="138"/>
  <port protocol="udp" port="137"/>
</zone>
# cat public.xml
<?xml version="1.0" encoding="utf-8"?>
<zone>
  <short>Public</short>
  <description>For use in public areas. You do not trust the other computers on networks to not harm your computer. Only selected incoming connections are accepted.</description>
  <service name="ssh"/>
  <service name="dhcpv6-client"/>
  <service name="haproxy"/>
  <port protocol="tcp" port="24007-24008"/>
  <port protocol="tcp" port="49152"/>
  <port protocol="tcp" port="38465-38469"/>
  <port protocol="tcp" port="111"/>
  <port protocol="udp" port="111"/>
  <port protocol="tcp" port="2049"/>
  <port protocol="tcp" port="4501"/>
  <port protocol="udp" port="4501"/>
  <port protocol="udp" port="20048"/>
  <port protocol="tcp" port="20048"/>
  <port protocol="tcp" port="22"/>
  <port protocol="udp" port="22"/>
  <port protocol="tcp" port="10000"/>
  <port protocol="udp" port="49000-59999"/>
  <port protocol="tcp" port="49000-59999"/>
  <port protocol="udp" port="9000"/>
  <port protocol="tcp" port="9000"/>
  <port protocol="udp" port="137"/>
  <port protocol="udp" port="138"/>
  <port protocol="udp" port="2049"/>
  <port protocol="tcp" port="445"/>
  <port protocol="tcp" port="139"/>
  <port protocol="udp" port="68"/>
  <port protocol="udp" port="67"/>
</zone>
Individual setup:
# cat public.bash
firewall-cmd --zone=public --permanent --add-port=2049/tcp
firewall-cmd --zone=public --permanent --add-port=111/tcp
firewall-cmd --zone=public --permanent --add-port=111/udp
firewall-cmd --zone=public --permanent --add-port=24007-24008/tcp
firewall-cmd --zone=public --permanent --add-port=49152/tcp
firewall-cmd --zone=public --permanent --add-port=38465-38469/tcp
firewall-cmd --zone=public --permanent --add-port=4501/tcp
firewall-cmd --zone=public --permanent --add-port=4501/udp
firewall-cmd --zone=public --permanent --add-port=24007-24008/tcp
firewall-cmd --zone=public --permanent --add-port=24007-24008/udp
firewall-cmd --zone=public --permanent --add-port=49152-49156/tcp
firewall-cmd --zone=public --permanent --add-port=49152-49156/udp
firewall-cmd --reload
# cat dmz.bash
firewall-cmd --zone=dmz --permanent --add-port=2049/tcp
firewall-cmd --zone=dmz --permanent --add-port=111/tcp
firewall-cmd --zone=dmz --permanent --add-port=111/udp
firewall-cmd --zone=dmz --permanent --add-port=24007-24008/tcp
firewall-cmd --zone=dmz --permanent --add-port=24007-24008/udp
firewall-cmd --zone=dmz --permanent --add-port=49152-49156/tcp
firewall-cmd --zone=dmz --permanent --add-port=49152-49156/udp
firewall-cmd --zone=dmz --permanent --add-port=49152/tcp
firewall-cmd --zone=dmz --permanent --add-port=38465-38469/tcp
firewall-cmd --zone=dmz --permanent --add-port=4501/tcp
firewall-cmd --zone=dmz --permanent --add-port=4501/udp
firewall-cmd --zone=dmz --permanent --add-port=20048/tcp
firewall-cmd --zone=dmz --permanent --add-port=20048/udp
firewall-cmd --reload
#
# On Both
firewall-cmd --permanent --direct --add-rule ipv4 filter INPUT 0 -m pkttype --pkt-type multicast -j ACCEPT
firewall-cmd --reload
FINAL FILE:
# vi /etc/firewalld/zones/public.xml
<?xml version="1.0" encoding="utf-8"?>
<zone>
  <service name="cockpit"/>
  <service name="libvirt-tls"/>
  <service name="snmp"/>
  <service name="vdsm"/>
  <service name="ovirt-imageio"/>
  <service name="ovirt-vmconsole"/>
  <service name="ctdb"/>
  <service name="glusterfs"/>
  <service name="nfs"/>
  <service name="nrpe"/>
  <service name="ovirt-storageconsole"/>
  <service name="rpc-bind"/>
  <service name="samba"/>
  <port protocol="tcp" port="22"/>
  <port protocol="udp" port="6081"/>
  <port protocol="tcp" port="8080"/>
  <port protocol="udp" port="963"/>
  <port protocol="tcp" port="965"/>
  <port protocol="tcp" port="24007-24008"/>
  <port protocol="udp" port="24007-24008"/>
  <port protocol="tcp" port="9090"/>
  <port protocol="udp" port="9090"/>
  <port protocol="tcp" port="49152-49156"/>
  <port protocol="udp" port="49152-49156"/>
  <rule family="ipv4">
    <source address="192.168.0.0/24"/>
    <accept/>
  </rule>
  <rule family="ipv4">
    <source address="0.0.0.0/24"/>
    <accept/>
  </rule>
  <rule family="ipv4">
    <source address="255.255.255.255/24"/>
    <accept/>
  </rule>
</zone>
HANDY STUFF:
firewall-cmd --zone=dmz --list-all
firewall-cmd --zone=public --list-all
firewall-cmd --set-log-denied=all
firewall-cmd --permanent --add-service=haproxy
firewall-cmd --list-all
firewall-cmd --runtime-to-permanent	Configure firewalld. DO NOT disable firewalld.
nfs01 / nfs02 / nfs03	Run any of the following commands, or a combination of them, on deny entries in /var/log/audit/audit.log that may appear as you stop, start or install the above services:
METHOD 1:
grep AVC /var/log/audit/audit.log | audit2allow -M systemd-allow
semodule -i systemd-allow.pp
METHOD 2:
audit2allow -a
audit2allow -a -M ganesha_<NUM>_port
semodule -i ganesha_<NUM>_port.pp
USEFUL THINGS:
ausearch --interpret
aureport	Configure selinux. Don't disable it. This actually makes your host safer and is actually easy to work with using just these commands.
nfs01 / nfs02 / nfs03	NODE 1:
[root@nfs01 ~]# cat /etc/ganesha/ganesha.conf
###################################################
#
# EXPORT
#
# To function, all that is required is an EXPORT
#
# Define the absolute minimal export
#
###################################################
# logging directives--be careful
LOG {
    # Default_Log_Level is unknown token??
    # Default_Log_Level = NIV_FULL_DEBUG;
    Components {
        # ALL = FULL_DEBUG;
        MEMLEAKS = FATAL;
        FSAL = DEBUG;
        NFSPROTO = FATAL;
        NFS_V4 = FULL_DEBUG;
        EXPORT = DEBUG;
        FILEHANDLE = FATAL;
        DISPATCH = DEBUG;
        CACHE_INODE = FULL_DEBUG;
        CACHE_INODE_LRU = FATAL;
        HASHTABLE = FATAL;
        HASHTABLE_CACHE = FATAL;
        DUPREQ = FATAL;
        INIT = DEBUG;
        MAIN = FATAL;
        IDMAPPER = FULL_DEBUG;
        NFS_READDIR = FULL_DEBUG;
        NFS_V4_LOCK = FULL_DEBUG;
        CONFIG = FULL_DEBUG;
        CLIENTID = FULL_DEBUG;
        SESSIONS = FATAL;
        PNFS = FATAL;
        RW_LOCK = FATAL;
        NLM = FATAL;
        RPC = FULL_DEBUG;
        NFS_CB = FATAL;
        THREAD = FATAL;
        NFS_V4_ACL = FULL_DEBUG;
        STATE = FULL_DEBUG;
        # 9P = FATAL;
        # 9P_DISPATCH = FATAL;
        FSAL_UP = FATAL;
        DBUS = FATAL;
    }
    Facility {
        name = FILE;
        destination = "/var/log/ganesha/ganesha-rgw.log";
        enable = active;
    }
}
NFSv4 {
    Lease_Lifetime = 20 ;
    IdmapConf = "/etc/idmapd.conf" ;
    DomainName = "nix.mds.xyz" ;
}
NFS_KRB5 {
    PrincipalName = "nfs/nfs01.nix.mds.xyz@NIX.MDS.XYZ" ;
    KeytabPath = /etc/krb5.keytab ;
    Active_krb5 = YES ;
}
NFS_Core_Param {
    Bind_addr=192.168.0.131;
    NFS_Port=2049;
    MNT_Port=20048;
    NLM_Port=38468;
    Rquota_Port=4501;
}
%include "/etc/ganesha/export.conf"
# %include "/etc/ganesha/export-home.conf"
[root@nfs01 ~]# cat /etc/ganesha/export.conf
EXPORT {
    Export_Id = 1 ; # Export ID unique to each export
    Path = "/n"; # Path of the volume to be exported. Eg: "/test_volume"
    FSAL {
        name = GLUSTER;
        hostname = "nfs01.nix.mds.xyz"; # IP of one of the nodes in the trusted pool
        volume = "gv01"; # Volume name. Eg: "test_volume"
    }
    Access_type = RW; # Access permissions
    Squash = No_root_squash; # To enable/disable root squashing
    Disable_ACL = FALSE; # To enable/disable ACL
    Pseudo = "/n"; # NFSv4 pseudo path for this export. Eg: "/test_volume_pseudo"
    Protocols = "3", "4"; # NFS protocols supported
    Transports = "UDP", "TCP" ; # Transport protocols supported
    SecType = "sys", "krb5","krb5i","krb5p"; # "sys", "krb5","krb5i","krb5p"; # Security flavors supported
}
[root@nfs01 ~]#
NODE 2:
# cat /etc/ganesha/ganesha.conf
###################################################
#
# EXPORT
#
# To function, all that is required is an EXPORT
#
# Define the absolute minimal export
#
###################################################
# logging directives--be careful
LOG {
    # Default_Log_Level is unknown token??
    # Default_Log_Level = NIV_FULL_DEBUG;
    Components {
        # ALL = FULL_DEBUG;
        MEMLEAKS = FATAL;
        FSAL = DEBUG;
        NFSPROTO = FATAL;
        NFS_V4 = FULL_DEBUG;
        EXPORT = DEBUG;
        FILEHANDLE = FATAL;
        DISPATCH = DEBUG;
        CACHE_INODE = FULL_DEBUG;
        CACHE_INODE_LRU = FATAL;
        HASHTABLE = FATAL;
        HASHTABLE_CACHE = FATAL;
        DUPREQ = FATAL;
        INIT = DEBUG;
        MAIN = FATAL;
        IDMAPPER = FULL_DEBUG;
        NFS_READDIR = FULL_DEBUG;
        NFS_V4_LOCK = FULL_DEBUG;
        CONFIG = FULL_DEBUG;
        CLIENTID = FULL_DEBUG;
        SESSIONS = FATAL;
        PNFS = FATAL;
        RW_LOCK = FATAL;
        NLM = FATAL;
        RPC = FULL_DEBUG;
        NFS_CB = FATAL;
        THREAD = FATAL;
        NFS_V4_ACL = FULL_DEBUG;
        STATE = FULL_DEBUG;
        # 9P = FATAL;
        # 9P_DISPATCH = FATAL;
        FSAL_UP = FATAL;
        DBUS = FATAL;
    }
    Facility {
        name = FILE;
        destination = "/var/log/ganesha/ganesha-rgw.log";
        enable = active;
    }
}
NFSv4 {
    Lease_Lifetime = 20 ;
    IdmapConf = "/etc/idmapd.conf" ;
    DomainName = "nix.mds.xyz" ;
}
NFS_KRB5 {
    PrincipalName = "nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ" ;
    KeytabPath = /etc/krb5.keytab ;
    Active_krb5 = YES ;
}
NFS_Core_Param {
    Bind_addr=192.168.0.119;
    NFS_Port=2049;
    MNT_Port=20048;
    NLM_Port=38468;
    Rquota_Port=4501;
}
%include "/etc/ganesha/export.conf"
# %include "/etc/ganesha/export-home.conf"
[root@nfs02 glusterfs]#
[root@nfs02 glusterfs]#
[root@nfs02 glusterfs]# cat /etc/ganesha/export.conf
EXPORT {
    Export_Id = 1 ; # Export ID unique to each export
    Path = "/n"; # Path of the volume to be exported. Eg: "/test_volume"
    FSAL {
        name = GLUSTER;
        hostname = "nfs02.nix.mds.xyz"; # IP of one of the nodes in the trusted pool
        volume = "gv01"; # Volume name. Eg: "test_volume"
    }
    Access_type = RW; # Access permissions
    Squash = No_root_squash; # To enable/disable root squashing
    Disable_ACL = FALSE; # To enable/disable ACL
    Pseudo = "/n"; # NFSv4 pseudo path for this export. Eg: "/test_volume_pseudo"
    Protocols = "3", "4"; # NFS protocols supported
    Transports = "UDP", "TCP" ; # Transport protocols supported
    SecType = "sys", "krb5","krb5i","krb5p"; # "sys", "krb5","krb5i","krb5p"; # Security flavors supported
}
[root@nfs02 glusterfs]#
NODE 3:
[root@nfs03 ~]# cat /etc/ganesha/ganesha.conf
###################################################
#
# EXPORT
#
# To function, all that is required is an EXPORT
#
# Define the absolute minimal export
#
###################################################
# logging directives--be careful
LOG {
    # Default_Log_Level is unknown token??
    # Default_Log_Level = NIV_FULL_DEBUG;
    Components {
        # ALL = FULL_DEBUG;
        MEMLEAKS = FATAL;
        FSAL = DEBUG;
        NFSPROTO = FATAL;
        NFS_V4 = FULL_DEBUG;
        EXPORT = DEBUG;
        FILEHANDLE = FATAL;
        DISPATCH = DEBUG;
        CACHE_INODE = FULL_DEBUG;
        CACHE_INODE_LRU = FATAL;
        HASHTABLE = FATAL;
        HASHTABLE_CACHE = FATAL;
        DUPREQ = FATAL;
        INIT = DEBUG;
        MAIN = FATAL;
        IDMAPPER = FULL_DEBUG;
        NFS_READDIR = FULL_DEBUG;
        NFS_V4_LOCK = FULL_DEBUG;
        CONFIG = FULL_DEBUG;
        CLIENTID = FULL_DEBUG;
        SESSIONS = FATAL;
        PNFS = FATAL;
        RW_LOCK = FATAL;
        NLM = FATAL;
        RPC = FULL_DEBUG;
        NFS_CB = FATAL;
        THREAD = FATAL;
        NFS_V4_ACL = FULL_DEBUG;
        STATE = FULL_DEBUG;
        # 9P = FATAL;
        # 9P_DISPATCH = FATAL;
        FSAL_UP = FATAL;
        DBUS = FATAL;
    }
    Facility {
        name = FILE;
        destination = "/var/log/ganesha/ganesha-rgw.log";
        enable = active;
    }
}
NFSv4 {
    Lease_Lifetime = 20 ;
    IdmapConf = "/etc/idmapd.conf" ;
    DomainName = "nix.mds.xyz" ;
}
NFS_KRB5 {
    PrincipalName = "nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ" ;
    KeytabPath = /etc/krb5.keytab ;
    Active_krb5 = YES ;
}
NFS_Core_Param {
    Bind_addr = 192.168.0.125;
    NFS_Port = 2049;
    MNT_Port = 20048;
    NLM_Port = 38468;
    Rquota_Port = 4501;
}
%include "/etc/ganesha/export.conf"
# %include "/etc/ganesha/export-home.conf"
[root@nfs03 ~]#
[root@nfs03 ~]#
[root@nfs03 ~]# cat /etc/ganesha/export.conf
EXPORT {
    Export_Id = 1 ; # Export ID unique to each export
    Path = "/n"; # Path of the volume to be exported. Eg: "/test_volume"
    FSAL {
        name = GLUSTER;
        hostname = "nfs03.nix.mds.xyz"; # IP of one of the nodes in the trusted pool
        volume = "gv01"; # Volume name. Eg: "test_volume"
    }
    Access_type = RW; # Access permissions
    Squash = No_root_squash; # To enable/disable root squashing
    Disable_ACL = FALSE; # To enable/disable ACL
    Pseudo = "/n"; # NFSv4 pseudo path for this export. Eg: "/test_volume_pseudo"
    Protocols = "3", "4"; # "3", "4" NFS protocols supported
    Transports = "UDP", "TCP" ; # "UDP", "TCP" Transport protocols supported
    SecType = "sys","krb5","krb5i","krb5p"; # "sys","krb5","krb5i","krb5p"; # Security flavors supported
}
[root@nfs03 ~]#
STARTUP:
systemctl start nfs-ganesha
(Only if you did not extract the startup script)
/usr/bin/ganesha.nfsd -L /var/log/ganesha/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT	Configure NFS Ganesha
nfs01 / nfs02 / nfs03	[root@nfs01 ~]# cat /etc/fstab|grep -Ei "brick|gv01" /dev/sdb /bricks/0 xfs defaults 0 0 nfs01:/gv01 /n glusterfs defaults 0 0 [root@nfs01 ~]# [root@nfs01 ~]# mount|grep -Ei "brick|gv01" /dev/sdb on /bricks/0 type xfs (rw,relatime,seclabel,attr2,inode64,noquota) nfs01:/gv01 on /n type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072) [root@nfs01 ~]# [root@nfs01 ~]# ps -ef|grep -Ei "haproxy|keepalived|ganesha"; netstat -pnlt|grep -Ei "haproxy|ganesha|keepalived" root 1402 1 0 00:59 ? 00:00:00 /usr/sbin/keepalived -D root 1403 1402 0 00:59 ? 00:00:00 /usr/sbin/keepalived -D root 1404 1402 0 00:59 ? 00:00:02 /usr/sbin/keepalived -D root 13087 1 0 01:02 ? 00:00:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid haproxy 13088 13087 0 01:02 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds haproxy 13089 13088 0 01:02 ? 00:00:01 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds root 13129 1 15 01:02 ? 00:13:11 /usr/bin/ganesha.nfsd -L /var/log/ganesha/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT root 19742 15633 0 02:30 pts/2 00:00:00 grep --color=auto -Ei haproxy|keepalived|ganesha tcp 0 0 192.168.0.80:2049 0.0.0.0:* LISTEN 13089/haproxy tcp6 0 0 192.168.0.131:20048 :::* LISTEN 13129/ganesha.nfsd tcp6 0 0 :::564 :::* LISTEN 13129/ganesha.nfsd tcp6 0 0 192.168.0.131:4501 :::* LISTEN 13129/ganesha.nfsd tcp6 0 0 192.168.0.131:2049 :::* LISTEN 13129/ganesha.nfsd tcp6 0 0 192.168.0.131:38468 :::* LISTEN 13129/ganesha.nfsd [root@nfs01 ~]#	Ensure mounts are done and everything is started up.
nfs01 / nfs02 / nfs03	yumdownloader nfs-ganesha.x86_64 rpm2cpio nfs-ganesha-2.5.5-1.el7.x86_64.rpm | cpio -idmv ./usr/lib/systemd/system/nfs-ganesha-lock.service rpm2cpio nfs-ganesha-2.5.5-1.el7.x86_64.rpm | cpio -idmv ./usr/lib/systemd/system/nfs-ganesha.service rpm2cpio nfs-ganesha-2.5.5-1.el7.x86_64.rpm | cpio -idmv ./usr/lib/systemd/system/nfs-ganesha-config.service rpm2cpio nfs-ganesha-2.5.5-1.el7.x86_64.rpm | cpio -idmv ./usr/libexec/ganesha/nfs-ganesha-config.sh Copy above to the same folders under / instead of ./ : systemctl enable nfs-ganesha.service systemctl status nfs-ganesha.service	Since you compiled from source you don't have nice startup scripts. To get your nice startup scripts from an existing ganesha RPM do the following. Then use systemctl to stop and start nfs-ganesha as you would any other service.
ANY	Enable dumps: gluster volume set gv01 server.statedump-path /var/log/glusterfs/ gluster volume statedump gv01	Enable state dumps for issue isolation.
Enable Samba / SMB for Windows File Sharing ( Optional )	Packages: samba-common-4.7.1-6.el7.noarch samba-client-libs-4.7.1-6.el7.x86_64 libsmbclient-4.7.1-6.el7.x86_64 samba-libs-4.7.1-6.el7.x86_64 samba-4.7.1-6.el7.x86_64 libsmbclient-devel-4.7.1-6.el7.x86_64 samba-common-libs-4.7.1-6.el7.x86_64 samba-common-tools-4.7.1-6.el7.x86_64 samba-client-4.7.1-6.el7.x86_64 # cat /etc/samba/smb.conf|grep NFS -A 12 [NFS] comment = NFS Shared Storage path = /n valid users = root public = no writable = yes read only = no browseable = yes guest ok = no printable = no write list = root tom@mds.xyz tomk@nix.mds.xyz directory mask = 0775 create mask = 664 Start the service after enabling it: systemctl enable smb systemctl start smb Samba permissions to access NFS directories, fusefs and allow export. Likewise for fusefs filesystems: # setsebool -P samba_share_fusefs on # getsebool samba_share_fusefs samba_share_fusefs --> on Likewise, for NFS shares, you'll need the following to allow sharing out of NFS shares: # setsebool -P samba_share_nfs on # getsebool samba_share_nfs samba_share_nfs --> on # And some firewall ports to go along with it: firewall-cmd --zone=public --permanent --add-port=445/tcp firewall-cmd --zone=public --permanent --add-port=139/tcp firewall-cmd --zone=public --permanent --add-port=138/udp firewall-cmd --zone=public --permanent --add-port=137/udp firewall-cmd --reload	We can also enable SMB / Samba file sharing on the individual cluster hosts and allow visibility to the Gluster FS / NFS-Ganesha from Windows.
nfs01 / nfs02 / nfs03	Referencing this post, we will import a few principals from the master IPA server. (For the KDC steps, see the reference post.): On the IPA server, issue the following to permit retrieval of principals on clients: [root@idmipa01 ~]# ipa service-add nfs/nfs03.nix.mds.xyz [root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ --groups=admins [root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ --hosts={nfs01.nix.mds.xyz,nfs02.nix.mds.xyz,nfs03.nix.mds.xyz} [root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs01.nix.mds.xyz@NIX.MDS.XYZ --groups=admins [root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ --groups=admins [root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ --groups=admins [root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs03.nix.mds.xyz --hosts=nfs01.nix.mds.xyz [root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs02.nix.mds.xyz --hosts=nfs02.nix.mds.xyz [root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs01.nix.mds.xyz --hosts=nfs03.nix.mds.xyz On the target client issue the following: [root@nfs01 ~]# kinit admin # Or the user you permissioned above. [root@nfs01 ~]# ipa-getkeytab -s idmipa01.nix.mds.xyz -p nfs/nfs-c01.nix.mds.xyz -k /etc/krb5.keytab -r [root@nfs01 ~]# ipa-getkeytab -s idmipa01.nix.mds.xyz -p nfs/nfs01.nix.mds.xyz -k /etc/krb5.keytab -r	Pull in principals from your IPA / KDC Server.
nfs01 / nfs02 / nfs03	Check the HAProxy GUI to see the full status report: http://nfs-c01:9000/haproxy-stats	Verify the cluster.
nfs01 / nfs02 / nfs03	Configure log rotation for the ganesha log files. They can get big. [root@nfs03 ganesha]# cat /etc/logrotate.d/ganesha /var/log/ganesha/ganesha.log { weekly rotate 8 copytruncate dateext compress missingok } /var/log/ganesha/ganesha-rgw.log { daily rotate 8 copytruncate maxsize 100M dateext compress missingok } [root@nfs03 ganesha]# Manually rotate the log file: [root@nfs03 ganesha]# logrotate /etc/logrotate.d/ganesha -v Distribute the log rotate file to the other nodes.	Log Rotation Setup

TESTING

Now let's do some checks on our NFS HA. Mount the share using the VIP from a client then create a test file:

[root@ipaclient01 /]# mount -t nfs4 nfs-c01:/n /n
[root@ipaclient01 n]# echo -ne "Hacked It. Gluster, NFS Ganesha, HAPROXY, keepalived scalable NFS server." > some-people-find-this-awesome.txt
[root@ipaclient01 n]# mount|grep nfs4
nfs-c01:/n on /n type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.236,local_lock=none,addr=192.168.0.80)
[root@ipaclient01 n]#

Then check each brick to see if the file was replicated:

[root@nfs01 n]# cat /bricks/0/gv01/some-people-find-this-awesome.txt
Hacked It. Gluster, NFS Ganesha, HAPROXY, keepalived scalable NFS server.
[root@nfs01 n]# mount|grep -Ei gv01
nfs01:/gv01 on /n type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
[root@nfs01 n]#

[root@nfs02 n]# cat /bricks/0/gv01/some-people-find-this-awesome.txt
Hacked It. Gluster, NFS Ganesha, HAPROXY, keepalived scalable NFS server.
[root@nfs02 n]# mount|grep -Ei gv01
nfs02:/gv01 on /n type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
[root@nfs02 n]#
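
The manual cat on each brick can also be scripted. Below is a minimal sketch, assuming passwordless SSH as root to every node and the /bricks/0/gv01 brick path used in this post. The helper names all_same and brick_sums are made up for illustration and are not part of any Gluster tooling.

```shell
#!/bin/sh
# Sketch: confirm a file is identical on every brick by comparing checksums.
# Assumptions (not from the original post): passwordless SSH to each node;
# all_same and brick_sums are hypothetical helper names.

# Read one checksum per line on stdin; succeed only if there is at least
# one line and every line is identical.
all_same() {
    [ "$(sort -u | wc -l)" -eq 1 ]
}

# Print the sha256 checksum of FILE on each NODE's brick, one per line.
# A node where the file cannot be read yields a distinct MISSING marker,
# so it shows up as a mismatch instead of being silently skipped.
brick_sums() {
    file=$1; shift
    for node in "$@"; do
        sum=$(ssh "root@$node" sha256sum "/bricks/0/gv01/$file" 2>/dev/null | awk '{print $1}')
        echo "${sum:-MISSING-$node}"
    done
}

# Example (run from any host that can reach the nodes):
#   brick_sums some-people-find-this-awesome.txt nfs01 nfs02 nfs03 | all_same \
#       && echo "replicated" || echo "MISMATCH"
```

Comparing checksums rather than eyeballing the file contents scales better once the test files get larger than one line.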

Good! Now let's hard shut down one node, nfs01, the primary node. The expected behaviour is a failover to nfs02, and once we bring nfs01 back up, the file should be replicated to it. Throughout, the client ipaclient01 is not supposed to lose its connection to the NFS mount via the VIP. Here are the results:

[root@nfs02 n]# ps -ef|grep -Ei "haproxy|ganesha|keepalived"
root 12245 1 0 Feb19 ? 00:00:03 /usr/sbin/keepalived -D
root 12246 12245 0 Feb19 ? 00:00:03 /usr/sbin/keepalived -D
root 12247 12245 0 Feb19 ? 00:00:41 /usr/sbin/keepalived -D
root 12409 1 16 Feb20 ? 00:13:05 /usr/bin/ganesha.nfsd -L /var/log/ganesha/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT
root 17892 1 0 00:37 ? 00:00:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
haproxy 17893 17892 0 00:37 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
haproxy 17894 17893 0 00:37 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
root 17918 21084 0 00:38 pts/0 00:00:00 grep --color=auto -Ei haproxy|ganesha|keepalived
[root@nfs02 n]# ps -ef|grep -Ei "haproxy|ganesha|keepalived"; netstat -pnlt|grep -Ei ganesha; netstat -pnlt|grep -Ei haproxy; netstat -pnlt|grep -Ei keepalived
root 12245 1 0 Feb19 ? 00:00:03 /usr/sbin/keepalived -D
root 12246 12245 0 Feb19 ? 00:00:03 /usr/sbin/keepalived -D
root 12247 12245 0 Feb19 ? 00:00:41 /usr/sbin/keepalived -D
root 12409 1 16 Feb20 ? 00:13:09 /usr/bin/ganesha.nfsd -L /var/log/ganesha/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT
root 17892 1 0 00:37 ? 00:00:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
haproxy 17893 17892 0 00:37 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
haproxy 17894 17893 0 00:37 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
root 17947 21084 0 00:38 pts/0 00:00:00 grep --color=auto -Ei haproxy|ganesha|keepalived
tcp6 0 0 192.168.0.119:20048 :::* LISTEN 12409/ganesha.nfsd
tcp6 0 0 :::564 :::* LISTEN 12409/ganesha.nfsd
tcp6 0 0 192.168.0.119:4501 :::* LISTEN 12409/ganesha.nfsd
tcp6 0 0 192.168.0.119:2049 :::* LISTEN 12409/ganesha.nfsd
tcp6 0 0 192.168.0.119:38468 :::* LISTEN 12409/ganesha.nfsd
tcp 0 0 192.168.0.80:2049 0.0.0.0:* LISTEN 17894/haproxy
[root@nfs02 n]#
[root@nfs02 n]#
[root@nfs02 n]#
[root@nfs02 n]# ssh nfs-c01
Password:
Last login: Wed Feb 21 00:37:28 2018 from nfs-c01.nix.mine.dom
[root@nfs02 ~]# logout
Connection to nfs-c01 closed.
[root@nfs02 n]#

From the client we can still see all the files (seamless, with no interruption to the NFS service). As a bonus, when we started this first test we noticed that HAPROXY was offline on nfs02. The file listing on the client appeared to hang, then completed right after we started HAPROXY on nfs02:

[root@ipaclient01 n]# ls -altri some-people-find-this-awesome.txt
11782527620043058273 -rw-r--r--. 1 nobody nobody 74 Feb 21 00:26 some-people-find-this-awesome.txt
[root@ipaclient01 n]# df -h .
Filesystem Size Used Avail Use% Mounted on
nfs-c01:/n 128G 43M 128G 1% /n
[root@ipaclient01 n]# ssh nfs-c01
Password:
Last login: Wed Feb 21 00:41:06 2018 from nfs-c01.nix.mine.dom
[root@nfs02 ~]#

Checking the gluster volume on nfs02:

[root@nfs02 n]# gluster volume status
Status of volume: gv01
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick nfs02:/bricks/0/gv01 49152 0 Y 16103
Self-heal Daemon on localhost N/A N/A Y 16094

Task Status of Volume gv01
------------------------------------------------------------------------------
There are no active volume tasks

[root@nfs02 n]#
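
The Online column in this output can be checked mechanically. The following is a rough sketch, assuming the column layout shown above (Brick, name, TCP port, RDMA port, Online, Pid) with each brick on a single line. The function name offline_bricks is hypothetical.

```shell
#!/bin/sh
# Sketch: list bricks that `gluster volume status` does not report as online.
# Assumes each brick sits on one line with the Online flag in column 5;
# very long brick names may wrap in real output and would need extra handling.
# offline_bricks is a hypothetical helper name.
offline_bricks() {
    awk '$1 == "Brick" && $5 != "Y" { print $2 }'
}

# Example:
#   gluster volume status gv01 | offline_bricks
```

Note that a node which is powered off, like nfs01 in the output above, does not appear in the listing at all, so a complete check would also compare the number of Brick lines against the number of bricks you expect.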

Now let's bring back the first node, and fail the second once nfs01 is up again. As soon as we bring nfs01 back up, the VIP fails over to nfs01 without any hiccup or manual intervention on the client end:

[root@ipaclient01 n]# ls -altri
total 11
128 dr-xr-xr-x. 21 root root 4096 Feb 18 22:24 ..
11782527620043058273 -rw-r--r--. 1 nobody nobody 74 Feb 21 00:26 some-people-find-this-awesome.txt
1 drwxr-xr-x. 3 nobody nobody 4096 Feb 21 00:26 .
[root@ipaclient01 n]#
[root@ipaclient01 n]#
[root@ipaclient01 n]#
[root@ipaclient01 n]# ssh nfs-c01
Password:
Last login: Wed Feb 21 00:59:56 2018
[root@nfs01 ~]#

So now let's fail the second node. NFS still works:

[root@ipaclient01 ~]# ssh nfs-c01
Password:
Last login: Wed Feb 21 01:31:50 2018
[root@nfs01 ~]# logout
Connection to nfs-c01 closed.
[root@ipaclient01 ~]# cd /n
[root@ipaclient01 n]# ls -altri some-people-find-this-awesome.txt
11782527620043058273 -rw-r--r--. 1 nobody nobody 74 Feb 21 00:26 some-people-find-this-awesome.txt
[root@ipaclient01 n]# df -h .
Filesystem Size Used Avail Use% Mounted on
nfs-c01:/n 128G 43M 128G 1% /n
[root@ipaclient01 n]#
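
Rather than running ls by hand during each failover, the client can poll the mount continuously. This is only a sketch: it assumes GNU coreutils timeout is installed, and the helper names probe_path and watch_mount are invented for this example. On a hard NFS mount a blocked stat may take longer than the limit to return, and cached attributes can make a probe succeed without reaching the server, so treat the output as an indication only.

```shell
#!/bin/sh
# Sketch: poll a path once per second and log whether it answers in time.
# Assumptions: GNU coreutils `timeout` is available; probe_path and
# watch_mount are hypothetical helper names.

# Succeed if TARGET can be stat'ed within LIMIT seconds (default 5).
probe_path() {
    timeout "${2:-5}" stat "$1" >/dev/null 2>&1
}

# Probe TARGET COUNT times (default 60), one second apart, printing a
# timestamped OK or FAIL line for each attempt.
watch_mount() {
    target=$1; count=${2:-60}; i=0
    while [ "$i" -lt "$count" ]; do
        if probe_path "$target"; then status=OK; else status=FAIL; fi
        echo "$(date '+%H:%M:%S') $status $target"
        i=$((i + 1))
        sleep 1
    done
}

# Example, on the client while a node is being shut down:
#   watch_mount /n/some-people-find-this-awesome.txt 120
```

Any FAIL lines, or gaps in the timestamps, show how long the client actually waited while the VIP moved.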

So we bring the second node back up. And that concludes the configuration! All works like a charm!

You can also check out our guest post for the same on loadbalancer.org!

Good Luck!

Cheers,
Tom K.

March 11th, 2018 | Posted in NIX Posts | 1 Comment

Cannot find key for kvno in keytab

If you are getting this:

krb5_child.log:(Tue Mar 6 23:18:46 2018) [[sssd[krb5_child[3193]]]] [map_krb5_error] (0x0020): 1655: [-1765328340][Cannot find key for nfs/nfs01.nix.my.dom@NIX.my.dom kvno 6 in keytab]

Then you can resolve it by copying the old keytab file back (or by removing the incorrect entries using ktutil). In our case we had saved a copy, so we restored it and re-added the NFS principals to the keytab file. You can list the current principals in the keytab file using:

klist -kte /etc/krb5.keytab

This was followed up by re-adding the missing keytab keys from the IPA server:

ipa-getkeytab -s idmipa01.nix.my.dom -p nfs/nfs-c01.nix.my.dom -k /etc/krb5.keytab
ipa-getkeytab -s idmipa01.nix.my.dom -p nfs/nfs01.nix.my.dom -k /etc/krb5.keytab

Alternatively, create the keytab entries manually using ktutil, as mentioned above.
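
To see quickly whether the keytab holds the key version the error complains about, the klist listing can be filtered per principal. A minimal sketch follows, assuming klist -k style output where the KVNO is the first column and the principal is its own whitespace-separated field. The function name keytab_kvnos is hypothetical.

```shell
#!/bin/sh
# Sketch: list the key version numbers (kvno) a keytab holds for one principal.
# Assumes `klist -k`-style output: KVNO in the first column, the principal
# as a separate whitespace-delimited field. keytab_kvnos is a hypothetical
# helper name.
keytab_kvnos() {
    awk -v p="$1" '{
        for (i = 2; i <= NF; i++)
            if ($i == p) { print $1; break }
    }' | sort -un
}

# Example:
#   klist -kte /etc/krb5.keytab | keytab_kvnos nfs/nfs01.nix.my.dom@NIX.my.dom
```

If the number from the error message (kvno 6 in the log above) is missing from the list, the keytab is out of step with the KDC. One likely cause to check: according to its man page, ipa-getkeytab generates new keys for the principal unless -r is given, which invalidates copies of that principal's keys in other keytabs.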

Cheers,
Tom

March 7th, 2018 | Posted in NIX Posts | No Comments

Name resolution for the name timed out after none of the configured DNS servers responded.

You're getting this:

Name resolution for the name <URL> timed out after none of the configured DNS servers responded.

One of the resolutions is to adjust a few network parameters:

netsh interface tcp set global rss=disabled
netsh interface tcp set global autotuninglevel=disabled
netsh int ip set global taskoffload=disabled

Then set these registry options:

regedit: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

EnableTCPChimney=dword:00000000
EnableTCPA=dword:00000000
EnableRSS=dword:00000000

Cheers,
Tom K.

March 4th, 2018 | Posted in NIX Posts | No Comments

Ping request could not find host HOST. Please check the name and try again.

Ping cannot find the host, but nslookup on the same host works just fine:

Ping request could not find host HOST. Please check the name and try again.

Restart the DNS Client Service in Windows Services to resolve this one. A few other commands to try:

ipconfig /flushdns
ipconfig /registerdns

Following this, check Event Viewer to see why it stopped working to begin with. The service is started using:

C:\Windows\system32\svchost.exe -k NetworkService

Alternately, stopping the caching daemon (DNS Client) in Services also works. Also check Event Viewer -> Custom Events -> Administrative Events for any failures. After doing the above, disable and reenable the network interface.

Additional commands to try:

route /f
netsh winsock reset catalog
netsh int ip reset reset.log

After all this, the issue was resolved on our side. Alternatively, reconnect to the network if originally connected via WiFi.

Cheers,
Tom K

March 4th, 2018 | Posted in NIX Posts | No Comments

GlusterFS: Configuration and Setup w/ NFS-Ganesha for an HA NFS Cluster

In this post we will go over how to set up a highly available NFS Cluster using:

GlusterFS
NFS Ganesha
CentOS 7
HAPROXY
keepalived
firewalld
selinux

This post is very lengthy and goes over quite a few details on the way to configuring this setup. We document virtually every step, including how to build out a GlusterFS filesystem in both physical and virtual environments. For those interested in a quick setup, please skip to the SUMMARY or TESTING sections at the bottom for a summary of the commands and configuration files used. If you run into problems, just search the page for the issue you have, as it's likely listed, and read the solution attempted.


February 18th, 2018 | Posted in NIX Posts | 1 Comment

Replication bind with GSSAPI auth failed: LDAP error 49 (Invalid credentials) ()

FreeIPA replication fails for about 13 minutes, with no activity on the first IDM server. It was not clear why at first.

Feb 12 10:06:56 idmipa01 named-pkcs11[2529]: zone nix.mds.xyz/IN: sending notifies (serial 1518448016)
Feb 12 10:07:06 idmipa01 named-pkcs11[2529]: error (chase DS servers) resolving 'mds.xyz/DS/IN': 192.168.0.224#53
Feb 12 10:07:14 idmipa01 ns-slapd: [12/Feb/2018:10:07:14.130840773 -0500] - ERR - NSMMReplicationPlugin - bind_and_check_pwp - agmt="cn=meToidmipa02.nix.mds.xyz" (idmipa02:389) - Replication bind with GSSAPI auth failed: LDAP error 49 (Invalid credentials) ()
Feb 12 10:20:01 idmipa01 systemd: Created slice user-0.slice.
Feb 12 10:20:01 idmipa01 systemd: Starting user-0.slice.

The problem was again with NTP and time/date settings.

[root@idmipa02 log]# date
Wed Feb 14 00:05:58 EST 2018
[root@idmipa02 log]#

[root@idmipa01 log]# date
Wed Feb 14 00:00:14 EST 2018
You have new mail in /var/spool/mail/root
[root@idmipa01 log]#
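
The two date outputs can be compared numerically instead of by eye. Here is a small sketch that works on epoch timestamps. The helper names skew_seconds and within_tolerance are hypothetical, and the 300-second default is an assumption based on the usual Kerberos clock skew limit.

```shell
#!/bin/sh
# Sketch: measure clock skew between two hosts from their epoch timestamps.
# skew_seconds and within_tolerance are hypothetical helper names; the
# 300-second default is an assumed limit (the usual Kerberos clock skew).

# Print the absolute difference between two epoch timestamps, in seconds.
skew_seconds() {
    d=$(( $1 - $2 ))
    echo "${d#-}"
}

# Succeed when SKEW is no larger than LIMIT (default 300 seconds).
within_tolerance() {
    [ "$1" -le "${2:-300}" ]
}

# Example, run from a host with SSH access to both masters:
#   a=$(ssh idmipa01 date +%s); b=$(ssh idmipa02 date +%s)
#   s=$(skew_seconds "$a" "$b")
#   within_tolerance "$s" || echo "clock skew of ${s}s is too large"
```

For the two outputs above (00:05:58 and 00:00:14, taken moments apart) the difference works out to roughly 344 seconds.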

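That is a difference of over 5 minutes, which matters because Kerberos (used here for the GSSAPI bind) by default rejects authentication when the clocks differ by more than 5 minutes. Checking further we see the following in the logs: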

Feb 12 10:13:00 idmipa02 rc.local: Error resolving ca.pool.ntp.org: Name or service not known (-2)
Feb 12 10:13:00 idmipa02 rc.local: 12 Feb 10:13:00 ntpdate[963]: Can't find host ca.pool.ntp.org: Name or service not known (-2)
Feb 12 10:13:00 idmipa02 rc.local: 12 Feb 10:13:00 ntpdate[963]: no servers can be used, exiting

So we need to keep the time on the two masters in sync, otherwise this replication issue will recur. To do that, the NTP servers we sync against must be resolvable, so we may need some extra fallbacks in our time sync setup. We have:

[root@idmipa01 log]# cat /etc/rc.local |grep -Evi "#"

touch /var/lock/subsys/local
ntpdate -u ca.pool.ntp.org;
[root@idmipa01 log]#

But we should also fall back to fixed IPs in case name resolution fails (we are using NLB on our AD DC servers and noted a failure on that host earlier, which we just fixed):

[root@idmipa01 log]# cat /etc/rc.local |grep -Evi "#"

touch /var/lock/subsys/local
ntpdate -u ca.pool.ntp.org || ntpdate -u 206.108.0.132 || ntpdate -u 159.203.8.72;
[root@idmipa01 log]#
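
The || chain can also be written as a loop, which makes it easier to add sources and to log which one was used. This is a sketch only. SYNC_CMD is a hypothetical override introduced here so the loop can be exercised without touching the clock; by default it runs ntpdate -u as in rc.local above.

```shell
#!/bin/sh
# Sketch: try each time source in order and stop at the first that works.
# SYNC_CMD is a hypothetical override for dry runs; when unset, the
# function calls `ntpdate -u` exactly as the rc.local line does.
sync_time() {
    for src in "$@"; do
        # Intentionally unquoted so "ntpdate -u" splits into command + flag.
        if ${SYNC_CMD:-ntpdate -u} "$src" >/dev/null 2>&1; then
            echo "synced from $src"
            return 0
        fi
    done
    echo "no time source reachable" >&2
    return 1
}

# Example, equivalent to the chain in rc.local:
#   sync_time ca.pool.ntp.org 206.108.0.132 159.203.8.72
```

Listing the hostname first and the raw IPs after it keeps the normal pool rotation in use and only falls back to fixed addresses when DNS is unavailable.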

This gives us some safety in case the name can't be resolved due to DNS issues. We will also reconfigure our NTP servers as follows:

[root@idmipa02 log]# grep -Evi "#" /etc/ntp.conf | sed -e "/^$/d"
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
fudge 127.127.1.0 stratum 10
restrict 192.168.0.0 mask 255.255.255.0 nomodify notrap
restrict 127.0.0.1
restrict ::1
driftfile /var/lib/ntp/ntp.drift
logfile /var/log/ntp.log
server 0.ca.pool.ntp.org prefer
server 1.ca.pool.ntp.org
server 2.ca.pool.ntp.org
server 3.ca.pool.ntp.org
server 198.50.139.209

server 207.210.46.249
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
disable monitor
[root@idmipa02 log]#

and

[root@idmipa01 log]# grep -Evi "#" /etc/ntp.conf|sed -e "/^$/d"
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
fudge 127.127.1.0 stratum 10
restrict 192.168.0.0 mask 255.255.255.0 nomodify notrap
restrict 127.0.0.1
restrict ::1
driftfile /var/lib/ntp/ntp.drift
logfile /var/log/ntp.log
server 207.210.46.249
server 198.50.139.209
server 0.ca.pool.ntp.org
server 1.ca.pool.ntp.org
server 2.ca.pool.ntp.org
server 3.ca.pool.ntp.org prefer
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
disable monitor
[root@idmipa01 log]#

Notice that the preferred NTP server is different on each of our NTP servers. We're attempting to prevent a scenario where the same external NTP server is polled by two of our servers simultaneously. There is no clear evidence that this causes an issue, but setting a different preferred server on each of our NTP servers rules it out just in case. We also add two IPs from one of the domains above so that DNS errors cannot leave us without a time source. The difference is significant:

[root@idmipa02 log]# ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
*LOCAL(0) .LOCL. 10 l 4 64 1 0.000 0.000 0.000
k8s-w04.tblflp. 152.2.133.55 2 u 3 64 1 21.943 906.098 0.000
echo.baxterit.n 213.251.128.249 2 u 2 64 1 39.255 908.220 0.000
k8s-w01.tblflp. 152.2.133.55 2 u 1 64 1 18.415 903.549 0.000
portal.switch.c 213.251.128.249 2 u - 64 1 16.560 901.799 0.000
mirror3.rafal.c .INIT. 16 u - 64 0 0.000 0.000 0.000
198.50.139.209 .INIT. 16 u - 64 0 0.000 0.000 0.000
[root@idmipa02 log]#

[root@idmipa01 log]# ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
*LOCAL(0) .LOCL. 10 l 34 64 1 0.000 0.000 0.000
198.50.139.209 35.73.197.144 2 u 33 64 1 19.071 -84.149 0.000
mirror3.rafal.c 53.27.192.223 2 u 32 64 1 18.490 -56.439 0.000
ns522433.ip-158 18.26.4.105 2 u 31 64 1 17.833 -80.900 0.000
echo.baxterit.n 213.251.128.249 2 u 30 64 1 16.688 -82.694 0.000
209.115.181.102 206.108.0.133 2 u 29 64 1 72.834 -82.194 0.000
mongrel.ahem.ca .INIT. 16 u - 64 0 0.000 0.000 0.000
[root@idmipa01 log]#
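
Peers that never answered show up above with a reach of 0 and a refid of .INIT. and they can be picked out automatically. Below is a minimal sketch, assuming the standard ten-column ntpq -p layout with a two-line header. The function name unreachable_peers is hypothetical.

```shell
#!/bin/sh
# Sketch: print peers from `ntpq -p` output whose reach register is 0,
# i.e. peers that have not answered any recent poll.
# Assumes the ten-column layout shown above (reach is column 7) and a
# two-line header. unreachable_peers is a hypothetical helper name.
unreachable_peers() {
    awk 'NR > 2 && NF >= 10 && $7 == 0 { print $1 }'
}

# Example:
#   ntpq -p | unreachable_peers
```

Right after ntpd starts, every peer has a low reach value, so this check is more meaningful once the daemon has been running for a few poll intervals.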

Good Luck!

Cheers,
TK

February 13th, 2018 | Posted in NIX Posts | No Comments



	Copyright © 2003 - 2025 Tom Kacperski (microdevsys.com). All rights reserved. This work is licensed under a Creative Commons Attribution 3.0 Unported License Privacy / Use / Terms / Disclaimer Policy.