

NFS reply xid 3844308326 reply ERR 20: Auth Rejected Credentials (client should begin new session)

Getting this? Mounts freezing? The final verified solution is at the bottom, but this error can occur for any number of reasons. Keep reading:

tcpdump -i eth0 -s 0 -w dump.dat
tcpdump -r dump.dat |grep -Ei "psql02|nfs-c01"

02:55:48.731360 IP psql02.nix.mds.xyz.33991 > nfs-c01.nix.mds.xyz.nfs: Flags [P.], seq 1:693, ack 1, win 229, options [nop,nop,TS val 166990 ecr 5681495], length 692: NFS request xid 3844308326 688 null
02:55:48.731483 IP nfs-c01.nix.mds.xyz.nfs > psql02.nix.mds.xyz.33991: Flags [.], ack 693, win 238, options [nop,nop,TS val 5681498 ecr 166990], length 0
02:55:48.732644 IP nfs-c01.nix.mds.xyz.nfs > psql02.nix.mds.xyz.33991: Flags [P.], seq 1:25, ack 693, win 238, options [nop,nop,TS val 5681499 ecr 166990], length 24: NFS reply xid 3844308326 reply ERR 20: Auth Rejected Credentials (client should begin new session)
02:55:48.732670 IP psql02.nix.mds.xyz.33991 > nfs-c01.nix.mds.xyz.nfs: Flags [.], ack 25, win 229, options [nop,nop,TS val 166991 ecr 5681499], length 0
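To narrow the capture to NFS traffic only, a simple port filter should do the trick (a sketch, assuming the standard NFS and portmapper ports 2049 and 111; adjust the interface and capture file name to suit):

tcpdump -i eth0 -s 0 -nn "port 2049 or port 111" -w nfs-dump.dat
tcpdump -r nfs-dump.dat -nn | grep -i "ERR 20"     # show only the rejected replies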

Try this patch to bring nfs-utils-1.3.0-0.48.el7_4.1.x86_64 up to nfs-utils-1.3.0-0.48.el7_4.2.x86_64:

http://download.rhn.redhat.com/errata/RHBA-2018-0422.html
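Assuming the erratum is available in your configured repositories, a plain package update should pull it in:

yum update nfs-utils
rpm -q nfs-utils          # confirm you're now on the 0.48.el7_4.2 build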

Update and enjoy?  Nope!  So let's keep digging.  After a more exhaustive search, the fix was to add the following firewall rules and restart autofs.  It appears autofs didn't start properly because of the missing firewall ports, causing everything else to freeze, including any additional mounts:


[root@ovirt01 sssd]# firewall-cmd --zone=public --permanent --add-port=111/udp
success
[root@ovirt01 sssd]# firewall-cmd --zone=public --permanent --add-port=2049/udp
success
[root@ovirt01 sssd]# firewall-cmd --reload
success
[root@ovirt01 sssd]# systemctl restart autofs
[root@ovirt01 sssd]# mount nfs-c01:/n /m
[root@ovirt01 sssd]# umount /m
[root@ovirt01 sssd]#
[root@ovirt01 sssd]#
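As an aside, firewalld on EL7 also ships predefined nfs, rpc-bind and mountd services that should cover the same ports as above, if you prefer services over raw port numbers (a sketch, assuming the stock service definitions):

firewall-cmd --zone=public --permanent --add-service=nfs
firewall-cmd --zone=public --permanent --add-service=rpc-bind
firewall-cmd --zone=public --permanent --add-service=mountd
firewall-cmd --reload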

The following fix was also used in combination with the above:

https://review.gerrithub.io/#/c/ffilz/nfs-ganesha/+/408756/

[root@nfs02 ~]# /bin/ganesha.nfsd -v
NFS-Ganesha Release = V2.7-dev.10
ganesha.nfsd compiled on Apr 30 2018 at 02:21:35
Release comment = GANESHA file server is 64 bits compliant and supports NFS v3,4.0,4.1 (pNFS) and 9P
Git HEAD = 9cf00dccc9ab92ea4a6ec6f7f1f2c043bdc20a4b
Git Describe = V2.7-dev.10-0-g9cf00dc
[root@nfs02 ~]#

On top of the above, also ensure the following gluster errors are handled:

[2018-05-01 22:43:06.412067] E [MSGID: 114058] [client-handshake.c:1571:client_query_portmap_cbk] 0-gv01-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2018-05-01 22:43:55.554833] E [socket.c:2374:socket_connect_finish] 0-gv01-client-0: connection to 192.168.0.131:49152 failed (Connection refused); disconnecting socket

 

[root@nfs02 glusterfs]# netstat -pnlt|grep gluster
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1108/glusterd
tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      1432/glusterfsd
[root@nfs02 glusterfs]#
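As the first error message suggests, check the brick status directly; if a brick shows offline, a forced start should bring it back (gv01 is the volume name taken from the logs above; adjust as needed):

gluster volume status gv01
gluster volume start gv01 force     # restarts any brick processes that are down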


[ CORRECT ]

[root@nfs02 glusterfs]# firewall-cmd --zone=dmz --list-all
dmz
  target: default
  icmp-block-inversion: no
  interfaces:
  sources:
  services: ssh
  ports: 2049/tcp 111/tcp 24007-24008/tcp 38465-38469/tcp 111/udp 22/tcp 22/udp 49000-59999/udp 49000-59999/tcp 20048/tcp 20048/udp 49152/tcp 4501/tcp 4501/udp 10000/tcp 9000/udp 9000/tcp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

[root@nfs02 glusterfs]#


[ INCORRECT ]

[root@nfs01 /]# firewall-cmd --zone=public --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: eth0
  sources:
  services: ssh dhcpv6-client haproxy
  ports: 24007-24008/tcp 49152/tcp 38465-38469/tcp 111/tcp 111/udp 2049/tcp 4501/tcp 4501/udp 20048/udp 20048/tcp 22/tcp 22/udp 10000/tcp 49000-59999/udp 49000-59999/tcp 9000/udp 9000/tcp 137/udp 138/udp 2049/udp
  protocols:
  masquerade: no
  forward-ports:
  source-ports: 49000-59999/tcp
  icmp-blocks:
  rich rules:

[root@nfs01 /]#


The fix was to remove the source-ports, either by editing /etc/firewalld/zones/public.xml and deleting the source-port entry, or by running:

firewall-cmd --zone=public --permanent --remove-source-port=49000-59999/udp
firewall-cmd --zone=public --permanent --remove-source-port=49000-59999/tcp
firewall-cmd --reload
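Afterwards, confirm the source-ports are actually gone (assuming a firewalld build that supports this query):

firewall-cmd --zone=public --list-source-ports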


Also ensure haproxy is running on both hosts:


[root@nfs02 systemd]# systemctl status haproxy -l
* haproxy.service - HAProxy Load Balancer
   Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-05-01 23:21:44 EDT; 20s ago
 Main PID: 2405 (haproxy-systemd)
   CGroup: /system.slice/haproxy.service
           |-2405 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
           |-2406 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
           `-2407 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds

May 01 23:21:44 nfs02.nix.mds.xyz systemd[1]: Started HAProxy Load Balancer.
May 01 23:21:44 nfs02.nix.mds.xyz systemd[1]: Starting HAProxy Load Balancer...
May 01 23:21:44 nfs02.nix.mds.xyz haproxy-systemd-wrapper[2405]: haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
[root@nfs02 systemd]# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
vm.min_free_kbytes = 1048560
[root@nfs02 systemd]#

 

[root@nfs01 ~]# systemctl status haproxy -l
* haproxy.service - HAProxy Load Balancer
   Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-05-01 23:21:53 EDT; 7s ago
 Main PID: 21707 (haproxy-systemd)
   CGroup: /system.slice/haproxy.service
           |-21707 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
           |-21708 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
           `-21709 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds

May 01 23:21:53 nfs01.nix.mds.xyz systemd[1]: Started HAProxy Load Balancer.
May 01 23:21:53 nfs01.nix.mds.xyz systemd[1]: Starting HAProxy Load Balancer...
May 01 23:21:53 nfs01.nix.mds.xyz haproxy-systemd-wrapper[21707]: haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
[root@nfs01 ~]# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
vm.min_free_kbytes = 1048560
[root@nfs01 ~]#
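The sysctl -p output above implies these settings already live in /etc/sysctl.conf on both hosts; for reference, the relevant lines would look like this:

# /etc/sysctl.conf (excerpt)
net.ipv4.ip_nonlocal_bind = 1     # lets haproxy bind the floating VIP that isn't local yet
net.ipv4.ip_forward = 1
vm.min_free_kbytes = 1048560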

The other issue was that the server lacked proper DNS and PTR records.  Add them on the IPA server.  The NXDOMAIN below indicates either an IPA server replication issue or that the PTR records were never created:

[root@psql01 ~]# dig -x psql01

; <<>> DiG 9.9.4-RedHat-9.9.4-51.el7_4.2 <<>> -x psql01
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 29853
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;psql01.in-addr.arpa.           IN      PTR

;; AUTHORITY SECTION:
in-addr.arpa.           900     IN      SOA     b.in-addr-servers.arpa. nstld.iana.org. 2018013362 1800 900 604800 3600

;; Query time: 95 msec
;; SERVER: 192.168.0.44#53(192.168.0.44)
;; WHEN: Tue May 01 23:39:52 EDT 2018
;; MSG SIZE  rcvd: 116

[root@psql01 ~]#
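Adding the missing PTR record on the IPA server would look something like the following; the reverse zone and final octet here are illustrative, so substitute the host's actual IP:

ipa dnsrecord-add 0.168.192.in-addr.arpa. 145 --ptr-rec=psql01.nix.mds.xyz.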

But that didn't fix it either.   Next thing:

firewall-cmd --direct --permanent --add-rule ipv4 filter INPUT 0 --in-interface enp0s8 --destination 224.0.0.18 --protocol vrrp -j ACCEPT

And that didn't work either.  Some of the packets in the tcpdump capture showed up with incorrect checksums.  This can be fixed by disabling checksum offloading:

[root@nfs02 ~]# ethtool --show-offload  eth0 > eth0-checksum.txt
[root@nfs02 ~]# ethtool --offload  eth0  rx off  tx off
Actual changes:
rx-checksumming: off
tx-checksumming: off
        tx-checksum-ip-generic: off
tcp-segmentation-offload: off
        tx-tcp-segmentation: off [requested on]
        tx-tcp6-segmentation: off [requested on]
[root@nfs02 ~]#
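To have the offload settings survive a reboot on EL7, the usual route is the ETHTOOL_OPTS variable in the interface's ifcfg file (a sketch, assuming eth0 and the network-scripts style configuration):

# /etc/sysconfig/network-scripts/ifcfg-eth0 (append)
ETHTOOL_OPTS="-K eth0 rx off tx off"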

 

That did not do the trick either.  The only log that changed on the mount attempt is the following on NFS02:

==> /var/log/ganesha/ganesha-rgw.log <==
11/11/2018 10:05:12 : epoch 5be8411c : nfs02.nix.mds.xyz : ganesha.nfsd-28961[svc_2] nfs_rpc_decode_request :DISP :DEBUG :0x7faa0c0012b0 fd 32 context 0x7faa08000c10
11/11/2018 10:05:12 : epoch 5be8411c : nfs02.nix.mds.xyz : ganesha.nfsd-28961[svc_2] nfs_rpc_process_request :DISP :DEBUG :Request from ::ffff:192.168.0.125 for Program 100003, Version 4, Function 0 has xid=1069716099
11/11/2018 10:05:12 : epoch 5be8411c : nfs02.nix.mds.xyz : ganesha.nfsd-28961[svc_2] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE on 0x7faa0c0012b0 fd 32 (::ffff:192.168.0.125:46740) xid=1069716099 returned XPRT_IDLE
11/11/2018 10:05:12 : epoch 5be8411c : nfs02.nix.mds.xyz : ganesha.nfsd-28961[svc_2] free_nfs_request :DISP :DEBUG :free_nfs_request: 0x7faa0c0012b0 fd 32 xp_refs 3 rq_refs 0
11/11/2018 10:05:12 : epoch 5be8411c : nfs02.nix.mds.xyz : ganesha.nfsd-28961[svc_13] nfs_rpc_decode_request :DISP :DEBUG :0x7faa00001440 fd 34 context 0x7fa9e8002570
11/11/2018 10:05:12 : epoch 5be8411c : nfs02.nix.mds.xyz : ganesha.nfsd-28961[svc_13] nfs_rpc_process_request :DISP :INFO :Could not authenticate request… rejecting with AUTH_STAT=AUTH_REJECTEDCRED
11/11/2018 10:05:12 : epoch 5be8411c : nfs02.nix.mds.xyz : ganesha.nfsd-28961[svc_13] nfs_rpc_decode_request :DISP :DEBUG :SVC_DECODE on 0x7faa00001440 fd 34 (::ffff:192.168.0.125:46742) xid=2610062768 returned XPRT_IDLE
11/11/2018 10:05:12 : epoch 5be8411c : nfs02.nix.mds.xyz : ganesha.nfsd-28961[svc_13] free_nfs_request :DISP :DEBUG :free_nfs_request: 0x7faa00001440 fd 34 xp_refs 3 rq_refs 0

 

SOLUTION (Verified)

This is one of the solutions that did move us forward, though for the wrong reason.  Still, check auditd for the following:

type=AVC msg=audit(1526965320.850:4094): avc:  denied  { write } for  pid=8714 comm="ganesha.nfsd" name="nfs_0" dev="dm-0" ino=201547689 scontext=system_u:system_r:ganesha_t:s0 tcontext=system_u:object_r:krb5_host_rcache_t:s0 tclass=file
type=SYSCALL msg=audit(1526965320.850:4094): arch=c000003e syscall=2 success=no exit=-13 a0=7f23b0003150 a1=2 a2=180 a3=2 items=0 ppid=1 pid=8714 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="ganesha.nfsd" exe="/usr/bin/ganesha.nfsd" subj=system_u:system_r:ganesha_t:s0 key=(null)
type=PROCTITLE msg=audit(1526965320.850:4094): proctitle=2F7573722F62696E2F67616E657368612E6E667364002D4C002F7661722F6C6F672F67616E657368612F67616E657368612E6C6F67002D66002F6574632F67616E657368612F67616E657368612E636F6E66002D4E004E49565F4556454E54
type=AVC msg=audit(1526965320.850:4095): avc:  denied  { unlink } for  pid=8714 comm="ganesha.nfsd" name="nfs_0" dev="dm-0" ino=201547689 scontext=system_u:system_r:ganesha_t:s0 tcontext=system_u:object_r:krb5_host_rcache_t:s0 tclass=file
type=SYSCALL msg=audit(1526965320.850:4095): arch=c000003e syscall=87 success=no exit=-13 a0=7f23b0004100 a1=7f23b0000050 a2=7f23b0004100 a3=5 items=0 ppid=1 pid=8714 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="ganesha.nfsd" exe="/usr/bin/ganesha.nfsd" subj=system_u:system_r:ganesha_t:s0 key=(null)
type=PROCTITLE msg=audit(1526965320.850:4095): proctitle=2F7573722F62696E2F67616E657368612E6E667364002D4C002F7661722F6C6F672F67616E657368612F67616E657368612E6C6F67002D66002F6574632F67616E657368612F67616E657368612E636F6E66002D4E004E49565F4556454E54
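The proctitle fields above are hex-encoded; ausearch can interpret them (and the rest of the records) into something readable before you feed anything to audit2allow:

ausearch -m AVC -i | less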

Generating and loading a local SELinux policy module from those denials:

grep AVC /var/log/audit/audit.log | audit2allow -M systemd-allow

semodule -i systemd-allow.pp

solved the issue for us.  The error thrown also included this:

May 21 23:53:13 psql01 kernel: CPU: 3 PID: 2273 Comm: mount.nfs Tainted: G             L ------------   3.10.0-693.21.1.el7.x86_64 #1
May 21 23:53:13 psql01 kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/14/2014
May 21 23:53:13 psql01 kernel: task: ffff880136335ee0 ti: ffff8801376b0000 task.ti: ffff8801376b0000
May 21 23:53:13 psql01 kernel: RIP: 0010:[]  [] _raw_spin_unlock_irqrestore+0x15/0x20
May 21 23:53:13 psql01 kernel: RSP: 0018:ffff8801376b3a60  EFLAGS: 00000206
May 21 23:53:13 psql01 kernel: RAX: ffffffffc05ab078 RBX: ffff880036973928 RCX: dead000000000200
May 21 23:53:13 psql01 kernel: RDX: ffffffffc05ab078 RSI: 0000000000000206 RDI: 0000000000000206
May 21 23:53:13 psql01 kernel: RBP: ffff8801376b3a60 R08: ffff8801376b3ab8 R09: ffff880137de1200
May 21 23:53:13 psql01 kernel: R10: ffff880036973928 R11: 0000000000000000 R12: ffff880036973928
May 21 23:53:13 psql01 kernel: R13: ffff8801376b3a58 R14: ffff88013fd98a40 R15: ffff8801376b3a58
May 21 23:53:13 psql01 kernel: FS:  00007fab48f07880(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
May 21 23:53:13 psql01 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 21 23:53:13 psql01 kernel: CR2: 00007f99793d93cc CR3: 000000013761e000 CR4: 00000000000007e0
May 21 23:53:13 psql01 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 21 23:53:13 psql01 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 21 23:53:13 psql01 kernel: Call Trace:
May 21 23:53:13 psql01 kernel: [] finish_wait+0x56/0x70
May 21 23:53:13 psql01 kernel: [] nfs_wait_client_init_complete+0xa1/0xe0 [nfs]
May 21 23:53:13 psql01 kernel: [] ? wake_up_atomic_t+0x30/0x30
May 21 23:53:13 psql01 kernel: [] nfs_get_client+0x22b/0x470 [nfs]
May 21 23:53:13 psql01 kernel: [] nfs4_set_client+0x98/0x130 [nfsv4]
May 21 23:53:13 psql01 kernel: [] nfs4_create_server+0x13e/0x3b0 [nfsv4]
May 21 23:53:13 psql01 kernel: [] nfs4_remote_mount+0x2e/0x60 [nfsv4]
May 21 23:53:13 psql01 kernel: [] mount_fs+0x3e/0x1b0
May 21 23:53:13 psql01 kernel: [] ? __alloc_percpu+0x15/0x20
May 21 23:53:13 psql01 kernel: [] vfs_kern_mount+0x67/0x110
May 21 23:53:13 psql01 kernel: [] nfs_do_root_mount+0x86/0xc0 [nfsv4]
May 21 23:53:13 psql01 kernel: [] nfs4_try_mount+0x44/0xc0 [nfsv4]
May 21 23:53:13 psql01 kernel: [] ? get_nfs_version+0x27/0x90 [nfs]
May 21 23:53:13 psql01 kernel: [] nfs_fs_mount+0x4cb/0xda0 [nfs]
May 21 23:53:13 psql01 kernel: [] ? nfs_clone_super+0x140/0x140 [nfs]
May 21 23:53:13 psql01 kernel: [] ? param_set_portnr+0x70/0x70 [nfs]
May 21 23:53:13 psql01 kernel: [] mount_fs+0x3e/0x1b0
May 21 23:53:13 psql01 kernel: [] ? __alloc_percpu+0x15/0x20
May 21 23:53:13 psql01 kernel: [] vfs_kern_mount+0x67/0x110
May 21 23:53:13 psql01 kernel: [] do_mount+0x233/0xaf0
May 21 23:53:13 psql01 kernel: [] SyS_mount+0x96/0xf0
May 21 23:53:13 psql01 kernel: [] system_call_fastpath+0x1c/0x21
May 21 23:53:13 psql01 kernel: [] ? system_call_after_swapgs+0xae/0x146

However, the above did not work for us on a second attempt, because we were missing the right principal on that server.  The working server has the following:

[root@nfs02 ~]# klist -kte
Keytab name: FILE:/etc/krb5.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   1 02/17/2018 20:13:39 host/nfs02.nix.mds.xyz@NIX.MDS.XYZ (aes256-cts-hmac-sha1-96)
   1 02/17/2018 20:13:39 host/nfs02.nix.mds.xyz@NIX.MDS.XYZ (aes128-cts-hmac-sha1-96)
   1 02/17/2018 20:13:39 host/nfs02.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)
   1 02/17/2018 20:13:39 host/nfs02.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)
   4 03/04/2018 13:57:16 nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (aes256-cts-hmac-sha1-96)
   4 03/04/2018 13:57:16 nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (aes128-cts-hmac-sha1-96)
   4 03/04/2018 13:57:16 nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)
   4 03/04/2018 13:57:16 nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)
   2 03/04/2018 13:57:32 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (aes256-cts-hmac-sha1-96)
   2 03/04/2018 13:57:32 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (aes128-cts-hmac-sha1-96)
   2 03/04/2018 13:57:32 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)
   2 03/04/2018 13:57:32 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)
   4 03/05/2018 22:56:37 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (aes256-cts-hmac-sha1-96)
   4 03/05/2018 22:56:37 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (aes128-cts-hmac-sha1-96)
   4 03/05/2018 22:56:37 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)
   4 03/05/2018 22:56:37 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)
[root@nfs02 ~]#

And the bad one in the cluster had the following:

[root@nfs03 ~]# klist -kte
Keytab name: FILE:/etc/krb5.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   1 05/20/18 23:18:01 host/nfs03.nix.mds.xyz@NIX.MDS.XYZ (aes256-cts-hmac-sha1-96)
   1 05/20/18 23:18:01 host/nfs03.nix.mds.xyz@NIX.MDS.XYZ (aes128-cts-hmac-sha1-96)
   1 05/20/18 23:18:01 host/nfs03.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)
   1 05/20/18 23:18:01 host/nfs03.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)
[root@nfs03 ~]#
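To spot a missing principal from the IPA side rather than host by host, list the nfs/ service principals that actually exist:

ipa service-find nfs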

We can see the back-and-forth communication between nfs02 (192.168.0.119) and nfs03 (192.168.0.125) in the capture by entries like the following:

04:13:32.932068 00:50:56:86:2d:21 > 00:50:56:86:3a:74, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 64, id 41490, offset 0, flags [DF], proto TCP (6), length 76)
    192.168.0.119.2049 > 192.168.0.125.46652: Flags [P.], cksum 0x236c (correct), seq 3129116439:3129116463, ack 1682320548, win 238, options [nop,nop,TS val 2303240 ecr 2060939], length 24: NFS reply xid 2610793391 reply ERR 20: Auth Rejected Credentials (client should begin new session)
        0x0000:  0050 5686 3a74 0050 5686 2d21 0800 4500  .PV.:t.PV.-!..E.
        0x0010:  004c a212 4000 4006 1655 c0a8 0077 c0a8  .L..@.@..U...w..
        0x0020:  007d 0801 b63c ba82 8717 6446 2ca4 8018  .}...<....dF,...
        0x0030:  00ee 236c 0000 0101 080a 0023 2508 001f  ..#l.......#%...
        0x0040:  728b 8000 0014 9b9d 8baf 0000 0001 0000  r...............
        0x0050:  0001 0000 0001 0000 0002                 ..........

VERIFIED SOLUTION

NOTE: Don't forget to restart autofs on the clients and test from more than one client.

The requirement is to add a principal for the server.  In the case of FreeIPA, we use the following (service-allow-retrieve-keytab permits the subsequent keytab retrieval with the -r option):

[root@idmipa01 ~]# ipa service-add nfs/nfs03.nix.mds.xyz

[root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ --groups=admins
[root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ --hosts={nfs01.nix.mds.xyz,nfs02.nix.mds.xyz,nfs03.nix.mds.xyz}

[root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ --groups=admins
[root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ --groups=admins

[root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs01.nix.mds.xyz@NIX.MDS.XYZ --groups=admins

[root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs03.nix.mds.xyz --hosts=nfs03.nix.mds.xyz
[root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs02.nix.mds.xyz --hosts=nfs02.nix.mds.xyz
[root@idmipa01 ~]# ipa service-allow-retrieve-keytab nfs/nfs01.nix.mds.xyz --hosts=nfs01.nix.mds.xyz
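You can verify the retrieval permissions took effect with service-show, which lists the users, groups and hosts allowed to retrieve the keytab:

ipa service-show nfs/nfs03.nix.mds.xyz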

[root@nfs03 ~]# kinit admin    # Or the user you permissioned above.
[root@nfs03 ~]# ipa-getkeytab -s idmipa01.nix.mds.xyz -p nfs/nfs-c01.nix.mds.xyz -k /etc/krb5.keytab -r 

[root@nfs03 ~]# ipa-getkeytab -s idmipa01.nix.mds.xyz -p nfs/nfs03.nix.mds.xyz -k /etc/krb5.keytab -r 

Test using kinit when done:

[root@nfs02 sssd]# kinit -kt /etc/krb5.keytab nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ
[root@nfs02 sssd]# klist
Ticket cache: KEYRING:persistent:0:krb_ccache_t3UCYMN
Default principal: nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ

Valid starting       Expires              Service principal
11/11/2018 15:45:54  11/12/2018 15:45:54  krbtgt/NIX.MDS.XYZ@NIX.MDS.XYZ
[root@nfs02 sssd]#

Note the use of -r above; it preserves the KVNO. In the event you're using a local, non-IPA KDC, issue the following set of commands:

# kadmin.local
kadmin.local:  addprinc host/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ
Enter password for principal "host/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ": <password>
Re-enter password for principal "host/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ": <password>
kadmin.local:  addprinc nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ
Enter password for principal "nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ": <password>
Re-enter password for principal "nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ": <password>
kadmin.local:

Adjust the parameters to the addprinc command as needed, according to the options below:

kadmin.local:  addprinc
usage: add_principal [options] principal
        options are:
                [-randkey|-nokey] [-x db_princ_args]* [-expire expdate] [-pwexpire pwexpdate] [-maxlife maxtixlife]
                [-kvno kvno] [-policy policy] [-clearpolicy]
                [-pw password] [-maxrenewlife maxrenewlife]
                [-e keysaltlist]
                [{+|-}attribute]
        attributes are:
                allow_postdated allow_forwardable allow_tgs_req allow_renewable
                allow_proxiable allow_dup_skey allow_tix requires_preauth
                requires_hwauth needchange allow_svr password_changing_service
                ok_as_delegate ok_to_auth_as_delegate no_auth_data_required
                lockdown_keys

where,
        [-x db_princ_args]* - any number of database specific arguments.
                        Look at each database documentation for supported arguments
kadmin.local:

to achieve the below results:

kadmin.local:  getprinc nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ
Principal: nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ
Expiration date: [never]
Last password change: Tue Mar 06 23:24:10 EST 2018
Password expiration date: [never]
Maximum ticket life: 1 day 00:00:00
Maximum renewable life: 7 days 00:00:00
Last modified: Tue Mar 06 23:24:10 EST 2018 (nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ)
Last successful authentication: [never]
Last failed authentication: Sun Nov 11 11:00:08 EST 2018
Failed password attempts: 8
Number of keys: 4
Key: vno 6, aes256-cts-hmac-sha1-96:special
Key: vno 6, aes128-cts-hmac-sha1-96:special
Key: vno 6, des3-cbc-sha1:special
Key: vno 6, arcfour-hmac:special
MKey: vno 1
Attributes: REQUIRES_PRE_AUTH
Policy: [none]
kadmin.local:
kadmin.local:
kadmin.local:  get princ host/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ
kadmin.local: Unknown request "get".  Type "?" for a request list.
kadmin.local:  getprinc host/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ
Principal: host/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ
Expiration date: [never]
Last password change: [never]
Password expiration date: [never]
Maximum ticket life: 1 day 00:00:00
Maximum renewable life: 7 days 00:00:00
Last modified: Wed Dec 31 19:00:00 EST 1969 (principal@UNINITIALIZED)
Last successful authentication: [never]
Last failed authentication: [never]
Failed password attempts: 0
Number of keys: 0
MKey: vno 1
Attributes: REQUIRES_PRE_AUTH
Policy: [none]
kadmin.local:

Write the keytab entries to a separate file using the ktutil command, preserving the KVNO number you got from above (the examples below use KVNO 4).  Example commands follow for the various encryption algorithms.  Use all or some as applicable:

[root@nfs03 ~]# ktutil 

ktutil: add_entry -password -p nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ -k 4 -e des3-cbc-sha1-kd 
Password for nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ: 

ktutil: add_entry -password -p nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ -k 4 -e arcfour-hmac-md5 
Password for nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ: 

ktutil: add_entry -password -p nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ -k 4 -e des-hmac-sha1 
Password for nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ: 

ktutil: add_entry -password -p nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ -k 4 -e des-cbc-md5 
Password for nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ: 

ktutil: add_entry -password -p nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ -k 4 -e des-cbc-md4 
Password for nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ: 

 

ktutil: add_entry -password -p nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ -k 4 -e des3-cbc-sha1-kd 
Password for nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ: 

ktutil: add_entry -password -p nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ -k 4 -e arcfour-hmac-md5 
Password for nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ: 

ktutil: add_entry -password -p nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ -k 4 -e des-hmac-sha1 
Password for nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ: 

ktutil: add_entry -password -p nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ -k 4 -e des-cbc-md5 
Password for nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ: 

ktutil: add_entry -password -p nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ -k 4 -e des-cbc-md4 
Password for nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ: 

Verify the written entries:

ktutil:  l -k -t -e
slot KVNO Timestamp         Principal
---- ---- ----------------- ---------------------------------------------------
   1    4 11/11/18 13:27:45      nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)  (0x977c7f3b20b5b694dafb8c6b0749a420e32cf29bd96d803d)
   2    4 11/11/18 13:28:02      nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)  (0x8846f7eaee8fb117ad06bdd830b7586c)
   3    4 11/11/18 13:28:59      nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (des-hmac-sha1)  (0x2ab6760e97d0672a)
   4    4 11/11/18 13:29:10      nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (des-cbc-md5)  (0x0efbf225dc201cf8)
   5    4 11/11/18 13:29:23      nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (des-cbc-md4)  (0x0efbf225dc201cf8)
   6    4 11/11/18 13:29:39        nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)  (0xfe73237c160e0d7357d96b61fbcbce437fdf7a9e08ab239d)
   7    4 11/11/18 13:29:49        nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)  (0x8846f7eaee8fb117ad06bdd830b7586c)
   8    4 11/11/18 13:30:03        nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ (des-hmac-sha1)  (0x4557862576d9e973)
   9    4 11/11/18 13:30:12        nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ (des-cbc-md5)  (0x6d54a1abe0194a25)
  10    4 11/11/18 13:30:21        nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ (des-cbc-md4)  (0x6d54a1abe0194a25)
ktutil:

Now that you've created the entries above, write a keytab file for them: 

ktutil: wkt /some/path/you/choose/nfs.keytab 

Merge the two keytabs on the system: 

cp -ip /etc/krb5.keytab  /etc/krb5.keytab-backup

[root@nfs03 ~]# ktutil 
  ktutil: rkt /some/path/you/choose/nfs.keytab
  ktutil: rkt /etc/krb5.keytab
  ktutil: wkt /etc/krb5.keytab
  ktutil: quit

Verify the newly created keytab file:

[root@nfs03 ~]# klist -kte /etc/krb5.keytab
Keytab name: FILE:/etc/krb5.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   1 11/11/18 13:35:28 host/nfs03.nix.mds.xyz@NIX.MDS.XYZ (aes256-cts-hmac-sha1-96)
   1 11/11/18 13:35:28 host/nfs03.nix.mds.xyz@NIX.MDS.XYZ (aes128-cts-hmac-sha1-96)
   1 11/11/18 13:35:28 host/nfs03.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)
   1 11/11/18 13:35:28 host/nfs03.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)
   4 11/11/18 13:35:28 nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)
   4 11/11/18 13:35:28 nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)
   4 11/11/18 13:35:28 nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (des-hmac-sha1)
   4 11/11/18 13:35:28 nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (des-cbc-md5)
   4 11/11/18 13:35:28 nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ (des-cbc-md4)
   4 11/11/18 13:35:28 nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)
   4 11/11/18 13:35:28 nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)
   4 11/11/18 13:35:28 nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ (des-hmac-sha1)
   4 11/11/18 13:35:28 nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ (des-cbc-md5)
   4 11/11/18 13:35:28 nfs/nfs03.nix.mds.xyz@NIX.MDS.XYZ (des-cbc-md4)
[root@nfs03 ~]#

PROBLEMS SECTION

In case you get this error:

[root@nfs03 ~]# ipa-getkeytab -s idmipa01.nix.mds.xyz -p nfs/nfs03.nix.mds.xyz -k /etc/krb5.keytab
Failed to load translations
Failed to parse result: PrincipalName not found.

Retrying with pre-4.0 keytab retrieval method…
Failed to parse result: PrincipalName not found.

Failed to get keytab!
Failed to get keytab
[root@nfs03 ~]#

Remember to first create the service principal on the IPA KDC, as mentioned above, using ipa service-add nfs/nfs03.nix.mds.xyz.

If you get the following error when using the -r option to retrieve the keys, don't forget to kinit and to permission the activity on the IPA server:

[root@nfs03 ~]# ipa-getkeytab -s idmipa01.nix.mds.xyz -p nfs/nfs03.nix.mds.xyz -k /etc/krb5.keytab -r
Failed to load translations
Failed to parse result: Insufficient access rights

Failed to get keytab
[root@nfs03 ~]#

Once you permission retrieval of keytabs / principals using ipa service-allow-retrieve-keytab, your attempt will succeed:

[root@nfs03 ~]# ipa-getkeytab -s idmipa01.nix.mds.xyz -p nfs/nfs03.nix.mds.xyz -k /etc/krb5.keytab -r
Failed to load translations
Keytab successfully retrieved and stored in: /etc/krb5.keytab
[root@nfs03 ~]#

If you accidentally omitted the -r option when getting keytabs from the KDC / IPA server, you'll need to reimport them using -r, otherwise the KVNO won't match and you'll get this:

[root@nfs02 ~]# kinit -kt /etc/krb5.keytab nfs/nfs-c01.nix.mds.xyz@NIX.MDS.XYZ
kinit: Preauthentication failed while getting initial credentials
[root@nfs02 ~]#

In that scenario you'll need to manually edit out the offending keytab entries and reimport:

[root@nfs02 sssd]# ktutil
ktutil: rkt /etc/krb5.keytab
ktutil: l
ktutil: delent <NUM>
ktutil: wkt /some/temp/path/krb5.keytab-new
ktutil: quit

Check the entries using:

[root@nfs02 sssd]# klist -kte /etc/krb5.keytab-new
Keytab name: FILE:/etc/krb5.keytab-new
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   1 11/11/2018 15:22:23 host/nfs02.nix.mds.xyz@NIX.MDS.XYZ (aes256-cts-hmac-sha1-96)
   1 11/11/2018 15:22:23 host/nfs02.nix.mds.xyz@NIX.MDS.XYZ (aes128-cts-hmac-sha1-96)
   1 11/11/2018 15:22:23 host/nfs02.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)
   1 11/11/2018 15:22:23 host/nfs02.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)
   4 11/11/2018 15:22:23 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (aes256-cts-hmac-sha1-96)
   4 11/11/2018 15:22:23 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (aes128-cts-hmac-sha1-96)
   4 11/11/2018 15:22:23 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (des3-cbc-sha1)
   4 11/11/2018 15:22:23 nfs/nfs02.nix.mds.xyz@NIX.MDS.XYZ (arcfour-hmac)
[root@nfs02 sssd]#
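Once the entries look right, copy the cleaned keytab back into place and reimport the service keys with -r (the temp path mirrors the placeholder used above):

cp /some/temp/path/krb5.keytab-new /etc/krb5.keytab
ipa-getkeytab -s idmipa01.nix.mds.xyz -p nfs/nfs02.nix.mds.xyz -k /etc/krb5.keytab -r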

Alongside the above, we needed to enable Kerberos on nfs03:

[root@nfs02 ganesha]# grep SecType /etc/ganesha/export.conf
    SecType = "sys","krb5","krb5i","krb5p";     # Security flavors supported
[root@nfs02 ganesha]#

[root@nfs03 ganesha]# grep SecType /etc/ganesha/export.conf
    SecType = "sys";                            # Security flavors supported
[root@nfs03 ganesha]#
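After aligning the SecType line on nfs03 with the one on nfs02, restart Ganesha so the export change takes effect:

systemctl restart nfs-ganesha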

This is another reason the tcpdump showed nfs03 couldn't authenticate: with only sys enabled, it didn't know what the KRB5 messages from nfs02 were about.  But this did not work either.

Next, check the autofs service and try the mount from other machines:

[root@psql01 sssd]# systemctl list-unit-files|grep auto
proc-sys-fs-binfmt_misc.automount             static
auto-net.service                              enabled
autofs.service                                disabled
autovt@.service                               enabled
rhel-autorelabel-mark.service                 static
rhel-autorelabel.service                      static
sssd-autofs.service                           indirect
sssd-autofs.socket                            disabled
[root@psql01 sssd]#
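Note that autofs.service shows disabled on psql01; enable and start it so the automounts come up on boot:

systemctl enable autofs
systemctl start autofs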

Then try to log in again.  This time it was a success!  Now let's fix mysql01 and restart the autofs daemon:

[root@mysql01 ~]# systemctl list-unit-files|grep -Ei auto
proc-sys-fs-binfmt_misc.automount             static
auto-net.service                              enabled
autofs.service                                enabled
autovt@.service                               enabled
rhel-autorelabel-mark.service                 static
rhel-autorelabel.service                      static
sssd-autofs.service                           indirect
sssd-autofs.socket                            disabled
[root@mysql01 ~]#
[root@mysql01 ~]#
[root@mysql01 ~]# ps -ef|grep -Ei auto
root     17817 17812  0 22:33 ?        00:00:00 /usr/libexec/sssd/sssd_autofs --uid 0 --gid 0 --logger=files
root     18835 17688  0 22:53 pts/0    00:00:00 grep --color=auto -Ei auto
[root@mysql01 ~]# mount nfs-c01:/n /m
[root@mysql01 ~]# umount /m

[root@mysql01 ~]#

Now restarting autofs on this client yielded our successful mounts!

[root@mysql01 ~]# ps -ef|grep -Ei auto
root     17817 17812  0 22:33 ?        00:00:00 /usr/libexec/sssd/sssd_autofs --uid 0 --gid 0 --logger=files
root     19014     1  0 22:56 ?        00:00:00 /usr/sbin/automount --debug --pid-file /run/autofs.pid
root     19489 19298  0 23:02 pts/0    00:00:00 grep --color=auto -Ei auto
[root@mysql01 ~]#

Retry the mount command using the VIP!  Enjoy!

ALTERNATE SOLUTION

We also noticed the following on the NFS servers:

21/02/2019 07:35:38 : epoch 5c6350e1 : nfs03.nix.mds.xyz : ganesha.nfsd-1306[svc_36] dec_client_id_ref :CLIENT ID :F_DBG :Free Clientid refcount now=0 {0x7fda34022a00 ClientID={Epoch=0x5c6350e1 Counter=0x000000ea} EXPIRED Client={0x7fda34022950 name=(36:Linux NFSv4.1 cm-r01en01.mws.mds.xyz) refcount=1} t_delta=0 reservations=0 refcount=1}

Run the following to refresh the NFS cache:

systemctl restart nfs-ganesha

and retry login:

Using username "mds.xyz\tom".
Using keyboard-interactive authentication.
Password:
Last login: Thu Feb 21 07:35:59 2019 from 192.168.0.76
Could not chdir to home directory /n/mds.xyz/tom: Permission denied
-sh: /n/mds.xyz/tom/.profile: Permission denied
-sh-4.2$

Gets us further.

Good Luck!

Cheers,
TK

REF: https://www.ibm.com/support/knowledgecenter/en/SSZUMP_7.1.2/management_sym/sym_kerberos_creating_principal_keytab.html
REF: https://sourceforge.net/p/nfs-ganesha/mailman/message/30653393/ 

 
