oVirt: bond0: option slaves: invalid value (-eth1)

Looking around the oVirt configs to troubleshoot an issue earlier yielded no results.  This is because oVirt manages the host networks through its UI instead: everything is automated and GUI controlled.  Some of the error messages we needed to troubleshoot:

ovirtmgmt: received packet on bond0 with own address as source address (addr:78:e7:d1:8f:4d:26, vlan:0)
bond0: option slaves: invalid value (-eth1)
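
While troubleshooting, a quick way to see the bond's current slaves and state is the kernel's standard bonding interface; this is not oVirt-specific:

cat /proc/net/bonding/bond0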

 

Config files controlled by the oVirt UI:

/var/lib/vdsm/persistence/netconf/bonds/bond0
/etc/resolv.conf
/etc/sysconfig/network-scripts/bond0
/etc/sysconfig/network-scripts/ifcfg-eth0
/etc/sysconfig/network-scripts/ifcfg-eth1
/etc/sysconfig/network-scripts/ifcfg-eth2
/etc/sysconfig/network-scripts/ifcfg-eth3

Log Files:

/var/log/vdsm/
/var/log/messages
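
For live troubleshooting, tailing the VDSM log alongside the system log is usually enough:

tail -f /var/log/vdsm/vdsm.log /var/log/messages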

 

Messages such as these pop up:

restore-net::DEBUG::2019-09-21 22:32:54,292::ifcfg::571::root::(writeConfFile) Writing to file /etc/sysconfig/network-scripts/ifcfg-eth2 configuration:
# Generated by VDSM version 4.20.46-1.el7
DEVICE=eth2
MASTER=bond0
SLAVE=yes
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no

This tells us the oVirt Engine is controlling these files.

Just the same, our host network configs were stored under the following folder:

/var/lib/vdsm/persistence/netconf/
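
For illustration only, a persisted bond entry there is a small JSON file along these lines (the field names are an assumption based on VDSM's persistence format, and since the UI owns these files they should not be hand-edited):

{"nics": ["eth0", "eth1"], "options": "mode=4 miimon=100"}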

But what you need to do is visit the oVirt Engine UI.  Navigate to Compute -> Hosts -> (click the name of the host) -> Network Interfaces -> Setup Host Networks.

oVirt Host Network Configuration

Change User Agent under Microsoft Edge

Even if you don't rely on Microsoft Edge as much as on Google Chrome, it can still be useful for older sites that cannot be changed.  This is especially true when accessing the web interfaces of older hardware that are no longer compatible with Chrome.

To change the user agent do the following:

  • Start Microsoft Edge
  • Press F12 to open the Developer Tools
  • Find the Emulation tab.  ( If your browser window is small, a downward arrow with a bar over it indicates more options. )
  • Find the Mode drop-down in the resulting panel.
  • Select the compatibility mode you're looking for.

Cheers,
TK

 

Executing command failed with the following exception: AuthorizationException: User:tom@MDS.XYZ not allowed to do ‘GET_KEYS’

Getting the following errors from spark-shell or from listing out valid KMS keys?

tom@mds.xyz@cm-r01en01:~] 🙂 $ hadoop key list
19/09/17 23:56:43 INFO util.KerberosName: No auth_to_local rules applied to tom@MDS.XYZ
Cannot list keys for KeyProvider: org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider@e350b40
list [-provider ] [-strict] [-metadata] [-help]:

The list subcommand displays the keynames contained within
a particular provider as configured in core-site.xml or
specified with the -provider argument. -metadata displays
the metadata. If -strict is supplied, fail immediately if
the provider requires a password and none is given.
Executing command failed with the following exception: AuthorizationException: User:tom@MDS.XYZ not allowed to do 'GET_KEYS'
tom@mds.xyz@cm-r01en01:~] 🙁 $

Or the following message entry?

19/09/17 22:17:25 DEBUG ipc.Client: Negotiated QOP is :auth
19/09/17 22:17:25 DEBUG ipc.Client: IPC Client (1322600748) connection to cm-r01nn02.mws.mds.xyz/192.168.0.133:8020 from tom@MDS.XYZ: starting, having connections 1
19/09/17 22:17:25 DEBUG ipc.Client: IPC Client (1322600748) connection to cm-r01nn02.mws.mds.xyz/192.168.0.133:8020 from tom@MDS.XYZ sending #0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getDelegationToken
19/09/17 22:17:25 DEBUG ipc.Client: IPC Client (1322600748) connection to cm-r01nn02.mws.mds.xyz/192.168.0.133:8020 from tom@MDS.XYZ got value #0
19/09/17 22:17:25 DEBUG ipc.ProtobufRpcEngine: Call: getDelegationToken took 650ms
19/09/17 22:17:25 INFO util.KerberosName: No auth_to_local rules applied to tom@MDS.XYZ
19/09/17 22:17:25 INFO hdfs.DFSClient: Created token for tom@MDS.XYZ: HDFS_DELEGATION_TOKEN owner=tom@MDS.XYZ, renewer=yarn, realUser=, issueDate=1568773045589, maxDate=1569377845589, sequenceNumber=56, masterKeyId=62 on 192.168.0.133:8020
19/09/17 22:17:25 DEBUG ipc.Client: IPC Client (1322600748) connection to cm-r01nn02.mws.mds.xyz/192.168.0.133:8020 from tom@MDS.XYZ sending #1 org.apache.hadoop.hdfs.protocol.ClientProtocol.getServerDefaults
19/09/17 22:17:25 DEBUG ipc.Client: IPC Client (1322600748) connection to cm-r01nn02.mws.mds.xyz/192.168.0.133:8020 from tom@MDS.XYZ got value #1
19/09/17 22:17:25 DEBUG ipc.ProtobufRpcEngine: Call: getServerDefaults took 2ms
19/09/17 22:17:25 DEBUG kms.KMSClientProvider: KMSClientProvider created for KMS url: http://cm-r01nn01.mws.mds.xyz:16000/kms/v1/ delegation token service: kms://http@cm-r01nn01.mws.mds.xyz:16000/kms canonical service: 192.168.0.134:16000.
19/09/17 22:17:25 DEBUG kms.LoadBalancingKMSClientProvider: Created LoadBalancingKMSClientProvider for KMS url: kms://http@cm-r01nn01.mws.mds.xyz:16000/kms with 1 providers. delegation token service: kms://http@cm-r01nn01.mws.mds.xyz:16000/kms, canonical service: 192.168.0.134:16000
19/09/17 22:17:25 DEBUG kms.KMSClientProvider: Current UGI: tom@MDS.XYZ (auth:KERBEROS)
19/09/17 22:17:25 DEBUG kms.KMSClientProvider: Login UGI: tom@MDS.XYZ (auth:KERBEROS)
19/09/17 22:17:25 DEBUG security.UserGroupInformation: PrivilegedAction as:tom@MDS.XYZ (auth:KERBEROS) from:org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationToken(KMSClientProvider.java:1029)
19/09/17 22:17:25 DEBUG kms.KMSClientProvider: Getting new token from http://cm-r01nn01.mws.mds.xyz:16000/kms/v1/, renewer:yarn/cm-r01nn02.mws.mds.xyz@MWS.MDS.XYZ
19/09/17 22:17:25 DEBUG web.DelegationTokenAuthenticator: No delegation token found for url=http://cm-r01nn01.mws.mds.xyz:16000/kms/v1/?op=GETDELEGATIONTOKEN&renewer=yarn%2Fcm-r01nn02.mws.mds.xyz%40MWS.MDS.XYZ, token=, authenticating with class org.apache.hadoop.security.token.delegation.web.KerberosDelegationTokenAuthenticator$1
19/09/17 22:17:25 DEBUG client.KerberosAuthenticator: JDK performed authentication on our behalf.
19/09/17 22:17:25 DEBUG client.AuthenticatedURL: Cannot parse cookie header:
java.lang.IllegalArgumentException: Empty cookie header string

 

Solve it by adjusting your KMS settings to include the groups and users that will run your commands as follows:

Name: hadoop.kms.acl.GET_KEYS
Value: kmsadmin,kmsadmingroup,hdfs,cdhadmins@mds.xyz,nixadmins@mds.xyz,cdhadmins,nixadmins,tom@MDS.XYZ
Description: ACL for get-keys operations.
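
In Cloudera Manager this is a KMS service setting; under the hood it lands in kms-acls.xml.  A minimal sketch of the equivalent property, assuming a stock Hadoop KMS layout:

<property>
  <name>hadoop.kms.acl.GET_KEYS</name>
  <value>kmsadmin,kmsadmingroup,hdfs,cdhadmins@mds.xyz,nixadmins@mds.xyz,cdhadmins,nixadmins,tom@MDS.XYZ</value>
  <description>ACL for get-keys operations.</description>
</property>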

And test using:

tom@mds.xyz@cm-r01en01:~] 🙂 $ hadoop key list
19/09/18 07:20:23 INFO util.KerberosName: No auth_to_local rules applied to tom@MDS.XYZ
Listing keys for KeyProvider: org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider@121314f7
tom@mds.xyz@cm-r01en01:~] 🙂 $

Cheers,
TK

WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn’t exist or is not writable. Lineage for this application will be disabled.

Getting this?

19/09/17 22:17:41 WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.

Resolve it by creating the folder.  The listing below shows the end state; the commands follow after it:

[root@cm-r01en01 spark]# ls -altri
total 2244
335565630 drwxr-xr-x.  2 spark spark       6 Aug 17 21:25 stacks
 67109037 drwxr-xr-x. 27 root  root     4096 Sep 16 03:39 ..
268448735 -rw-r--r--.  1 spark spark 2284695 Sep 17 22:18 spark-history-server-cm-r01en01.mws.mds.xyz.log
134702284 drwxr-xr-x.  2 spark spark       6 Sep 18 07:12 lineage
268447866 drwxr-xr-x.  4 spark spark      87 Sep 18 07:12 .
[root@cm-r01en01 spark]# ssh cm-r01en02
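
A minimal sketch of the commands behind that listing, assuming spark:spark should own the directory (repeat on each host running a Spark role, e.g. cm-r01en02):

mkdir -p /var/log/spark/lineage
chown spark:spark /var/log/spark/lineage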

 

Cheers,
TK

FATAL: remaining connection slots are reserved for non-replication superuser connections

Getting this?

FATAL:  remaining connection slots are reserved for non-replication superuser connections

Fix that by updating the Patroni configuration like so:

[root@psql01 log]# patronictl -c /etc/patroni.yml edit-config postgres

+++
@@ -1,9 +1,10 @@
 loop_wait: 10
 maximum_lag_on_failover: 1048576
 postgresql:
+  parameters:
-  max_connections: 256
+    max_connections: 256
-  max_replication_slots: 64
+    max_replication_slots: 64
-  max_wal_senders: 32
+    max_wal_senders: 32
   use_pg_rewind: true
 retry_timeout: 10
 ttl: 30

Apply these changes? [y/N]: y
Configuration changed
[root@psql01 log]#
[root@psql01 log]#
[root@psql01 log]# patronictl -c /etc/patroni.yml restart postgres
+----------+-------------+---------------+--------+---------+-----------+
| Cluster  |    Member   |      Host     |  Role  |  State  | Lag in MB |
+----------+-------------+---------------+--------+---------+-----------+
| postgres | postgresql0 | 192.168.0.108 | Leader | running |       0.0 |
| postgres | postgresql1 | 192.168.0.124 |        | running |       0.0 |
| postgres | postgresql2 | 192.168.0.118 |        | running |       0.0 |
+----------+-------------+---------------+--------+---------+-----------+
Are you sure you want to restart members postgresql0, postgresql1, postgresql2? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
When should the restart take place (e.g. 2015-10-01T14:30)  [now]:
Success: restart on member postgresql0
Success: restart on member postgresql1
Success: restart on member postgresql2
[root@psql01 log]# sudo su - postgres
Last login: Sat Sep 14 09:15:34 EDT 2019 on pts/0
-bash-4.2$ psql -h psql-c01 -p 5432 -W
Password:
psql (10.5)
Type "help" for help.

postgres=#
postgres=#
postgres=#
postgres=# show max_connections; show  max_replication_slots;
 max_connections
-----------------
 256
(1 row)

 max_replication_slots
-----------------------
 64
(1 row)

postgres=#
 

Keep in mind that the cluster name above is the scope from the config file:

[root@psql01 patroni]# cat /etc/patroni.yml
scope: postgres

Alternately, if you're not running Patroni, update the PostgreSQL settings directly with the above values; a sketch follows the listing below.  Verify status:

[root@psql01 ~]# patronictl -c /etc/patroni.yml list
+----------+-------------+---------------+--------+---------+-----------+
| Cluster  |    Member   |      Host     |  Role  |  State  | Lag in MB |
+----------+-------------+---------------+--------+---------+-----------+
| postgres | postgresql0 | 192.168.0.108 |        | running |       0.0 |
| postgres | postgresql1 | 192.168.0.124 | Leader | running |       0.0 |
| postgres | postgresql2 | 192.168.0.118 |        | running |       0.0 |
+----------+-------------+---------------+--------+---------+-----------+
[root@psql01 ~]#
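
For the non-Patroni case, a minimal sketch of the direct change (file location and service name assume a stock PostgreSQL 10 install; adjust for your distro):

# In postgresql.conf, set:
#   max_connections = 256
#   max_replication_slots = 64
#   max_wal_senders = 32
# max_connections requires a full restart, not just a reload:
systemctl restart postgresql-10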

 

Cheers,
TK

REF: My post on the project page: https://github.com/zalando/patroni/issues/1177

touch: cannot touch /atlas/atlassian/confluence/logs/catalina.out: Permission denied

Getting this?

[confluence@atlas02 logs]$ logout
[root@atlas02 atlassian]# systemctl status confluence.service -l
● confluence.service - LSB: Atlassian Confluence
   Loaded: loaded (/etc/rc.d/init.d/confluence; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2019-09-10 22:07:18 EDT; 2min 5s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 11361 ExecStop=/etc/rc.d/init.d/confluence stop (code=exited, status=0/SUCCESS)
  Process: 11925 ExecStart=/etc/rc.d/init.d/confluence start (code=exited, status=1/FAILURE)

Sep 10 22:07:18 atlas02.nix.mds.xyz confluence[11925]: at com.atlassian.confluence.bootstrap.SynchronyProxyWatchdog.main(SynchronyProxyWatchdog.java:47)
Sep 10 22:07:18 atlas02.nix.mds.xyz confluence[11925]: 2019-09-10 22:07:18,348 INFO [main] [atlassian.confluence.bootstrap.SynchronyProxyWatchdog] A Context element for ${confluence.context.path}/synchrony-proxy is found in /atlas/atlassian/confluence/conf/server.xml. No further action is required
Sep 10 22:07:18 atlas02.nix.mds.xyz confluence[11925]: ---------------------------------------------------------------------------
Sep 10 22:07:18 atlas02.nix.mds.xyz confluence[11925]: touch: cannot touch '/atlas/atlassian/confluence/logs/catalina.out': Permission denied
Sep 10 22:07:18 atlas02.nix.mds.xyz confluence[11925]: /atlas/atlassian/confluence/bin/catalina.sh: line 464: /atlas/atlassian/confluence/logs/catalina.out: Permission denied
Sep 10 22:07:18 atlas02.nix.mds.xyz runuser[11930]: pam_unix(runuser:session): session closed for user confluence1
Sep 10 22:07:18 atlas02.nix.mds.xyz systemd[1]: confluence.service: control process exited, code=exited status=1
Sep 10 22:07:18 atlas02.nix.mds.xyz systemd[1]: Failed to start LSB: Atlassian Confluence.
Sep 10 22:07:18 atlas02.nix.mds.xyz systemd[1]: Unit confluence.service entered failed state.
Sep 10 22:07:18 atlas02.nix.mds.xyz systemd[1]: confluence.service failed.
[root@atlas02 atlassian]# ls -altri /atlas/atlassian/confluence/conf/server.xml.
ls: cannot access /atlas/atlassian/confluence/conf/server.xml.: No such file or directory
[root@atlas02 atlassian]#

 

And seeing this from journalctl -xe:

-- Unit confluence.service has begun starting up.
Sep 10 22:11:18 atlas02.nix.mds.xyz confluence[12241]: To run Confluence in the foreground, start the server with start-confluence.sh -fg
Sep 10 22:11:18 atlas02.nix.mds.xyz confluence[12241]: executing using dedicated user: confluence1
Sep 10 22:11:18 atlas02.nix.mds.xyz runuser[12246]: pam_unix(runuser:session): session opened for user confluence1 by (uid=0)
Sep 10 22:11:18 atlas02.nix.mds.xyz confluence[12241]: If you encounter issues starting up Confluence, please see the Installation guide at http:/
Sep 10 22:11:18 atlas02.nix.mds.xyz confluence[12241]: Server startup logs are located in /atlas/atlassian/confluence/logs/catalina.out
Sep 10 22:11:18 atlas02.nix.mds.xyz confluence[12241]: ---------------------------------------------------------------------------
Sep 10 22:11:18 atlas02.nix.mds.xyz confluence[12241]: Using Java: /atlas/atlassian/confluence/jre//bin/java
Sep 10 22:11:18 atlas02.nix.mds.xyz automount[5344]: st_expire: state 1 path /n
Sep 10 22:11:18 atlas02.nix.mds.xyz automount[5344]: expire_proc: exp_proc = 140606617675520 path /n
Sep 10 22:11:18 atlas02.nix.mds.xyz automount[5344]: expire_proc_indirect: expire /n/mds.xyz
Sep 10 22:11:18 atlas02.nix.mds.xyz automount[5344]: 1 remaining in /n
Sep 10 22:11:18 atlas02.nix.mds.xyz automount[5344]: expire_cleanup: got thid 140606617675520 path /n stat 3
Sep 10 22:11:18 atlas02.nix.mds.xyz automount[5344]: expire_cleanup: sigchld: exp 140606617675520 finished, switching from 2 to 1
Sep 10 22:11:18 atlas02.nix.mds.xyz automount[5344]: st_ready: st_ready(): state = 2 path /n
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: log4j:ERROR setFile(null,true) call failed.
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: java.io.FileNotFoundException: /atlas/atlassian/confluence/logs/synchrony-proxy-watchdog.lo
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: at java.io.FileOutputStream.open0(Native Method)
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: at java.io.FileOutputStream.open(FileOutputStream.java:270)
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:207)
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: at com.atlassian.confluence.bootstrap.SynchronyProxyWatchdog.addLogFileAppender(SynchronyPr
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: at com.atlassian.confluence.bootstrap.SynchronyProxyWatchdog.main(SynchronyProxyWatchdog.ja
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: 2019-09-10 22:11:19,321 INFO [main] [atlassian.confluence.bootstrap.SynchronyProxyWatchdog]
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: ---------------------------------------------------------------------------
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: touch: cannot touch '/atlas/atlassian/confluence/logs/catalina.out': Permission denied
Sep 10 22:11:19 atlas02.nix.mds.xyz confluence[12241]: /atlas/atlassian/confluence/bin/catalina.sh: line 464: /atlas/atlassian/confluence/logs/cat
Sep 10 22:11:19 atlas02.nix.mds.xyz runuser[12246]: pam_unix(runuser:session): session closed for user confluence1
Sep 10 22:11:19 atlas02.nix.mds.xyz systemd[1]: confluence.service: control process exited, code=exited status=1
Sep 10 22:11:19 atlas02.nix.mds.xyz systemd[1]: Failed to start LSB: Atlassian Confluence.
-- Subject: Unit confluence.service has failed
-- Defined-By: systemd

 

It turns out that Confluence creates a new user every time you install it.  Why?  Who knows.  It's the first time I have ever seen anything like that in an application.  It's very unusual and annoying, especially if you reinstall Confluence only to find it made itself a new user.  Searching for the real user with standard process commands can also be misleading when two or more of these users exist:

[root@atlas02 logs]# ps -ef|grep -Ei confluence|grep logs
conflue+ 10256     1 43 01:23 ?        00:01:30 /atlas/atlassian/confluence/jre//bin/java

To fix this, do the following.  

Change the user to the earlier confluence user.  In our case, change confluence1 to confluence:

[root@atlas02 bin]# grep -Ei confluence1 *
grep: synchrony: Is a directory
user.sh:CONF_USER="confluence1" # user created by installer
[root@atlas02 bin]#
[root@atlas02 bin]#
[root@atlas02 bin]#
[root@atlas02 bin]# vi user.sh
[root@atlas02 bin]# pwd
/atlas/atlassian/confluence/bin
[root@atlas02 bin]#
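
A minimal sketch of that same edit without the interactive vi session, assuming the value shown above:

cd /atlas/atlassian/confluence/bin
sed -i 's/^CONF_USER="confluence1"/CONF_USER="confluence"/' user.sh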

Next change the directory permissions on the confluence folder:

[root@atlas02 atlas]# pwd
/atlas
[root@atlas02 atlas]# ls -altri
total 17
11318803973829525516 -rw-r--r--.  1 root       root          8 Nov 15  2018 you.there
                 128 dr-xr-xr-x. 24 root       root       4096 Mar 12 21:23 ..
12124534773086893833 drwxr-xr-x.  4 root       root       4096 Mar 23 12:34 atlassian.bak
                   1 drwxr-xr-x.  5 root       root       4096 Mar 23 13:23 .
13456417161533701348 drwxr-xr-x.  4 confluence confluence 4096 Mar 23 13:28 atlassian
[root@atlas02 atlas]# chown -R confluence.confluence atlassian

And restart confluence using:

systemctl restart confluence

Cheers,
TK

 

Application application_1567571625367_0006 failed 2 times due to AM Container for appattempt_1567571625367_0006_000002 exited with  exitCode: -1000

Getting this?

19/09/07 23:41:56 ERROR repl.Main: Failed to initialize Spark session.
org.apache.spark.SparkException: Application application_1567571625367_0006 failed 2 times due to AM Container for appattempt_1567571625367_0006_000002 exited with  exitCode: -1000
Failing this attempt.Diagnostics: [2019-09-07 23:41:54.934]Application application_1567571625367_0006 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is tom
main : requested yarn user is tom
User tom not found

For more detailed output, check the application tracking page: http://cm-r01nn02.mws.mds.xyz:8088/cluster/app/application_1567571625367_0006 Then click on links to logs of each attempt.
. Failing the application.

 

This is likely due to incorrect auth_to_local rules in HDFS -> Configuration:


RULE:[2:$1@$0](HTTP@\QMWS.MDS.XYZ\E$)s/@\QMWS.MDS.XYZ\E$//
RULE:[1:$1@$0](.*@\QMWS.MDS.XYZ\E$)s/@\QMWS.MDS.XYZ\E$///L
RULE:[2:$1@$0](.*@\QMWS.MDS.XYZ\E$)s/@\QMWS.MDS.XYZ\E$///L
RULE:[2:$1@$0](HTTP@\Qmws.mds.xyz\E$)s/@\Qmws.mds.xyz\E$//
RULE:[1:$1@$0](.*@\Qmws.mds.xyz\E$)s/@\Qmws.mds.xyz\E$///L
RULE:[2:$1@$0](.*@\Qmws.mds.xyz\E$)s/@\Qmws.mds.xyz\E$///L
RULE:[2:$1@$0](HTTP@\QMDS.XYZ\E$)s/@\QMDS.XYZ\E$//
RULE:[1:$1@$0](.*@\QMDS.XYZ\E$)s/@\QMDS.XYZ\E$///L
RULE:[2:$1@$0](.*@\QMDS.XYZ\E$)s/@\QMDS.XYZ\E$///L
RULE:[2:$1@$0](HTTP@\Qmds.xyz\E$)s/@\Qmds.xyz\E$//
RULE:[1:$1@$0](.*@\Qmds.xyz\E$)s/@\Qmds.xyz\E$///L
RULE:[2:$1@$0](.*@\Qmds.xyz\E$)s/@\Qmds.xyz\E$///L

 

In our case, we removed the above rules.  More fine-tuning would be needed to make them both HDFS and Spark friendly.
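
To see how a given principal maps under the current rules, Hadoop ships a small resolver utility that can be run from any configured client:

hadoop org.apache.hadoop.security.HadoopKerberosName tom@MDS.XYZ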

Cheers,
TK

Configure Cloudera HUE with FreeIPA

Configuring HUE with LDAP / FreeIPA:

[root@idmipa03 ~]# LDAPTLS_CACERT=/etc/ipa/ca.crt ldapsearch -H ldaps://idmipa03.mws.mds.xyz:636 -D "uid=admin,cn=users,cn=compat,dc=mws,dc=mds,dc=xyz" -w "<SECRET>" -b "dc=mws,dc=mds,dc=xyz" -v "(&(objectClass=posixAccount)(uid=*))"  |grep dn:
ldap_initialize( ldaps://idmipa03.mws.mds.xyz:636/??base )
filter: (&(objectClass=posixAccount)(uid=*))
requesting: All userApplication attributes
dn: uid=cmadmin-530029b6,cn=users,cn=compat,dc=mws,dc=mds,dc=xyz
dn: uid=admin,cn=users,cn=compat,dc=mws,dc=mds,dc=xyz
dn: uid=admin,cn=users,cn=accounts,dc=mws,dc=mds,dc=xyz
dn: uid=cmadmin-530029b6,cn=users,cn=accounts,dc=mws,dc=mds,dc=xyz
[root@idmipa03 ~]#

Ensure the following settings:

Authentication Backend ( backend ) : desktop.authentication.backend.LdapBackend
PAM Backend Service Name ( pam_service) : login
LDAP URL  ( ldap_url ) : ldaps://idmipa03.mws.mds.xyz:636
LDAP Server CA Certificate ( ldap_cert ) : /etc/ipa/ca.crt
Enable LDAP TLS ( use_start_tls ) : <CHECKED>
Use Search Bind Authentication (search_bind_authentication) : <CHECKED>
Create LDAP users on login ( create_users_on_login ) : <CHECKED>
LDAP Search Base ( base_dn ) : cn=compat,dc=mws,dc=mds,dc=xyz
LDAP Bind User Distinguished Name ( bind_dn ) : uid=admin,cn=users,cn=accounts,dc=mws,dc=mds,dc=xyz
LDAP Bind Password ( bind_password ) : <SECRET>
LDAP User Filter ( user_filter ) : (objectClass=posixAccount)
LDAP Username Attribute ( user_name_attr ) : uid
LDAP Group Filter ( group_filter ) : (objectClass=posixGroup)
LDAP Group Name Attribute ( group_name_attr ) : cn
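
For reference, a sketch of roughly how these map into hue.ini (key names per the Hue LDAP documentation; whether you edit hue.ini directly or go through a Cloudera Manager safety valve depends on your deployment):

[desktop]
  [[ldap]]
    ldap_url=ldaps://idmipa03.mws.mds.xyz:636
    ldap_cert=/etc/ipa/ca.crt
    search_bind_authentication=true
    create_users_on_login=true
    base_dn="cn=compat,dc=mws,dc=mds,dc=xyz"
    bind_dn="uid=admin,cn=users,cn=accounts,dc=mws,dc=mds,dc=xyz"
    bind_password=<SECRET>
    [[[users]]]
      user_filter="(objectClass=posixAccount)"
      user_name_attr=uid
    [[[groups]]]
      group_filter="(objectClass=posixGroup)"
      group_name_attr=cn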

Test the configuration ( Hue -> Actions -> Test LDAP Configuration ):

Test LDAP Configuration
Status  Finished  Context 
Hue
  Sep 2, 7:09:09 PM  35.8s 
Hue's LDAP configuration is valid.
 
Completed 1 of 1 step(s).
Testing the Hue LDAP configuration.        
Hue Server (cm-r01en01)
Sep 2, 7:09:09 PM    35.8s

 

You may receive this error:

[root@cm-r01en01 hue-httpd]# ldapsearch -Y GSSAPI -w "<SECRET>" -H 'ldaps://idmipa-c01.mws.mds.xyz:636' -b 'dc=mws,dc=mds,dc=xyz' -D 'uid=admin,cn=users,cn=accounts,dc=mws,dc=mds,dc=xyz' '(&(objectClass=posixAccount)(uid=tom))' -d1 |grep dn:
TLS: hostname (idmipa-c01.mws.mds.xyz) does not match common name in certificate (idmipa04.mws.mds.xyz).

This means you'll need a SAN certificate with 1) the VIP, 2) idmipa03 and 3) idmipa04 listed as valid hostnames.  Otherwise, point at a single IPA server node directly.

To find users in AD DC ( Active Directory / Domain Controllers ) use the explicit format:

[root@cm-r01en01 cloudera-scm-agent]# LDAPTLS_CACERT=/etc/ipa/ca.crt   ldapsearch -Y GSSAPI -H ldaps://idmipa03.mws.mds.xyz:636 -D "uid=admin,cn=users,cn=accounts,dc=mws,dc=mds,dc=xyz" -w "<SECRET>" -b "dc=mws,dc=mds,dc=xyz" "(uid=tom@mds.xyz)" -v|grep dn:

As per RFC 2307.

While configuring, we ran into the following:

/var/run/cloudera-scm-agent/process/2231-hue-HUE_SERVER/logs/stderr.log
[02/Sep/2019 20:05:57 +0000] backend      WARNING  Cannot configure LDAP with SSL and enable STARTTLS.
[02/Sep/2019 20:05:58 +0000] config       ERROR    search_s('dc=mws,dc=mds,dc=xyz', 2, '(&(uid=tom@mds.xyz)(*))') raised FILTER_ERROR({'desc': 'Bad search filter'},)
[02/Sep/2019 20:05:58 +0000] config       DEBUG    search_s('dc=mws,dc=mds,dc=xyz', 2, '(&(uid=%(user)s)(*))') returned 0 objects:
[02/Sep/2019 20:05:58 +0000] backend      DEBUG    Authentication failed for tom@mds.xyz: failed to map the username to a DN.
[02/Sep/2019 20:05:59 +0000] access       WARNING  192.168.0.76 -anon- – "POST /hue/accounts/login HTTP/1.1" (mem: 132mb)– Failed login for user: tom@mds.xyz

Debugging a little further reveals:

[root@cm-r01en01 logs]# LDAPTLS_CACERT=/etc/ipa/ca.crt   ldapsearch -Y GSSAPI -H ldaps://idmipa03.mws.mds.xyz:636 -D "uid=admin,cn=users,cn=accounts,dc=mws,dc=mds,dc=xyz" -w "<SECRET>" -b "cn=compat,dc=mws,dc=mds,dc=xyz" "(uid=tom@mds.xyz))" -v|grep dn:
ldap_initialize( ldaps://idmipa03.mws.mds.xyz:636/??base )
SASL/GSSAPI authentication started
SASL username: hdfs/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ
SASL SSF: 256
SASL data security layer installed.
filter: (uid=tom@mds.xyz))
requesting: All userApplication attributes
ldap_search_ext: Bad search filter (-7)
[root@cm-r01en01 logs]#

With a few commands, we quickly figure out the correct mappings:

USER:
LDAPTLS_CACERT=/etc/ipa/ca.crt   ldapsearch -Y GSSAPI -H ldaps://idmipa03.mws.mds.xyz:636 -D "uid=admin,cn=users,cn=accounts,dc=mws,dc=mds,dc=xyz" -w "<SECRET>" -b "cn=compat,dc=mws,dc=mds,dc=xyz" "(&(uid=tom@mds.xyz)(objectClass=posixAccount))" -v

GROUP:
LDAPTLS_CACERT=/etc/ipa/ca.crt   ldapsearch -Y GSSAPI -H ldaps://idmipa03.mws.mds.xyz:636 -D "uid=admin,cn=users,cn=accounts,dc=mws,dc=mds,dc=xyz" -w "<SECRET>" -b "cn=compat,dc=mws,dc=mds,dc=xyz" "(&(cn=cdhadmins)(objectClass=posixGroup))" -v

And we are greeted with a successful login message:

[02/Sep/2019 20:34:19 +0000] middleware   DEBUG    {"username": "tom@mds.xyz", "impersonator": "hue", "eventTime": 1567481659975, "operationText": "Successful login for user: tom@mds.xyz", "service": "hue", "url": "/hue/accounts/login", "allowed": true, "operation": "USER_LOGIN", "ipAddress": "192.168.0.76"}

using our AD DC user!  

Successful Hue IPA Integration

Cheers,
TK

There is a problem processing audits for HIVESERVER2.

Getting this?

There is a problem processing audits for HIVESERVER2.

[02/Sep/2019 12:36:30 +0000] 32165 Audit-Plugin throttling_logger ERROR    (341 skipped) Error occurred when sending entry to server:
[02/Sep/2019 12:36:30 +0000] 32165 Audit-Plugin throttling_logger INFO     (341 skipped) Unable to send data to nav server. Will try again.

Digging further, we see this error as well:

[02/Sep/2019 11:31:55 +0000] 4044 Profile-Plugin navigator_plugin INFO     Pipelines updated for Profile Plugin: set([])
[02/Sep/2019 11:31:55 +0000] 4044 Audit-Plugin navigator_plugin_pipeline INFO     Starting with navigator log None for role HIVESERVER2 and pipeline HiveSentryOnFailureHookTP
[02/Sep/2019 11:31:55 +0000] 4044 Metadata-Plugin navigator_plugin ERROR    Exception caught when trying to refresh Metadata Plugin for conf.cloudera.spark_on_yarn with count 0 pipelines names [].
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/audit/navigator_plugin.py", line 198, in immediate_refresh
    self._recreate_pipelines_for_csd()
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/audit/navigator_plugin.py", line 157, in _recreate_pipelines_for_csd
    existing_logs = [name for name in os.listdir(self.nav_conf.log_dir)
AttributeError: 'NoneType' object has no attribute 'log_dir'
[02/Sep/2019 11:31:55 +0000] 4044 Metadata-Plugin navigator_plugin INFO     Pipelines updated for Metadata Plugin: []

Redeploying the Spark config should solve this:

Execute DeployClusterClientConfig for {yarn,solr,hbase,kafka,hdfs,hive,spark_on_yarn} in parallel.

We can only surmise what occurred in this case.  Apparently config updates were being made while an earlier config deployment was still in progress, corrupting the setup.  This may not solve it, however.  YMMV.

In all likelihood, your free license has expired.  In that case, navigate to the Cloudera Management Service and turn off / uncheck the following:

Navigator Audit Server Role Health Test

But that wasn't it either.  Finally, we removed the Navigator Audit Server from the Cloudera Management Service instances, since no valid license exists.

Cheers,
TK

How do I connect to HiveServer2 (HS2) through beeline

How do I connect to HiveServer2 (HS2) through beeline:

beeline> !connect jdbc:hive2://cm-r01en01.mws.mds.xyz:10000/default;principal=hive/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ
Connecting to jdbc:hive2://cm-r01en01.mws.mds.xyz:10000/default;principal=hive/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ
Connected to: Apache Hive (version 2.1.1-cdh6.3.0)
Driver: Hive JDBC (version 2.1.1-cdh6.3.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://cm-r01en01.mws.mds.xyz:10000/>
0: jdbc:hive2://cm-r01en01.mws.mds.xyz:10000/>
0: jdbc:hive2://cm-r01en01.mws.mds.xyz:10000/> show tables;
INFO  : Compiling command(queryId=hive_20190902102937_4ef97c5b-19ff-47b8-be81-dacb2edeece0): show tables
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling command(queryId=hive_20190902102937_4ef97c5b-19ff-47b8-be81-dacb2edeece0); Time taken: 2.67 seconds
INFO  : Executing command(queryId=hive_20190902102937_4ef97c5b-19ff-47b8-be81-dacb2edeece0): show tables
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing command(queryId=hive_20190902102937_4ef97c5b-19ff-47b8-be81-dacb2edeece0); Time taken: 1.333 seconds
INFO  : OK
+-----------+
| tab_name  |
+-----------+
+-----------+
No rows selected (5.048 seconds)
0: jdbc:hive2://cm-r01en01.mws.mds.xyz:10000/> show databases;
INFO  : Compiling command(queryId=hive_20190902103034_18cb927f-0ab7-4a2d-b311-206b6ebb2cc2): show databases
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling command(queryId=hive_20190902103034_18cb927f-0ab7-4a2d-b311-206b6ebb2cc2); Time taken: 0.059 seconds
INFO  : Executing command(queryId=hive_20190902103034_18cb927f-0ab7-4a2d-b311-206b6ebb2cc2): show databases
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing command(queryId=hive_20190902103034_18cb927f-0ab7-4a2d-b311-206b6ebb2cc2); Time taken: 0.039 seconds
INFO  : OK
+----------------+
| database_name  |
+----------------+
| default        |
+----------------+
1 row selected (0.362 seconds)
0: jdbc:hive2://cm-r01en01.mws.mds.xyz:10000/>
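
Equivalently, the URL can be passed on the beeline command line; quote it so the shell doesn't treat the semicolon as a command separator:

beeline -u "jdbc:hive2://cm-r01en01.mws.mds.xyz:10000/default;principal=hive/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ"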

 

Cheers,
TK


     