preauth (encrypted_timestamp) verify failure: Preauthentication failed

Getting this?

Aug 25 08:18:51 idmipa03.mws.mds.xyz krb5kdc[6704](info): preauth (encrypted_timestamp) verify failure: Preauthentication failed
Aug 25 08:18:51 idmipa03.mws.mds.xyz krb5kdc[6704](info): AS_REQ (4 etypes {18 17 16 23}) 192.168.0.140: PREAUTH_FAILED: hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ for krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ, Preauthentication failed

Check the principal in the KDC against the keytab you're using for the same service:

kadmin.local:  getprinc hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ
Principal: hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ
Expiration date: [never]
Last password change: Sun Aug 25 01:14:37 EDT 2019
Password expiration date: [never]
Maximum ticket life: 1 day 00:00:00
Maximum renewable life: 90 days 00:00:00
Last modified: Sun Aug 25 01:14:37 EDT 2019 (hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ)
Last successful authentication: [never]
Last failed authentication: Sun Aug 25 08:37:15 EDT 2019
Failed password attempts: 31362
Number of keys: 2
Key: vno 22, aes256-cts-hmac-sha1-96:special
Key: vno 22, aes128-cts-hmac-sha1-96:special
MKey: vno 1
Attributes: REQUIRES_PRE_AUTH
Policy: [none]
kadmin.local:

Now check the keytab for the same service:

[root@cm-r01en01 ~]# klist -kte /var/run/cloudera-scm-agent/process/1151-hue-KT_RENEWER/hue.keytab
Keytab name: FILE:/var/run/cloudera-scm-agent/process/1151-hue-KT_RENEWER/hue.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
  18 08/25/2019 00:46:12 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (aes256-cts-hmac-sha1-96)
  18 08/25/2019 00:46:12 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (aes128-cts-hmac-sha1-96)
  18 08/25/2019 00:46:12 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (aes256-cts-hmac-sha384-192)
  18 08/25/2019 00:46:12 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (aes128-cts-hmac-sha256-128)
  18 08/25/2019 00:46:12 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (des3-cbc-sha1)
  18 08/25/2019 00:46:12 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (arcfour-hmac)
[root@cm-r01en01 ~]#
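
A quick way to spot the problem is to put the two key version numbers (KVNO) side by side. This is a minimal sketch, assuming kadmin.local access on the KDC and the keytab path shown above; if the numbers differ, the keytab is stale:

# On the KDC: KVNO recorded for the principal.
kadmin.local -q "getprinc hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ" | grep -i "vno"

# On the Hue host: KVNO stored in the keytab.
klist -kte /var/run/cloudera-scm-agent/process/1151-hue-KT_RENEWER/hue.keytab | \
    awk '/hue\// {print $1}' | sort -u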

The key version numbers don't match: the KDC principal is at vno 22, while the keytab still holds vno 18 keys, so preauthentication fails.  To resolve this, go to Cloudera Manager and regenerate the credentials.  The new principal looked like this:

kadmin.local:  getprinc hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ
Principal: hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ
Expiration date: [never]
Last password change: Sun Aug 25 08:38:34 EDT 2019
Password expiration date: [never]
Maximum ticket life: 1 day 00:00:00
Maximum renewable life: 90 days 00:00:00
Last modified: Sun Aug 25 08:38:34 EDT 2019 (hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ)
Last successful authentication: [never]
Last failed authentication: Sun Aug 25 08:37:15 EDT 2019
Failed password attempts: 31362
Number of keys: 6
Key: vno 23, aes256-cts-hmac-sha1-96
Key: vno 23, aes128-cts-hmac-sha1-96
Key: vno 23, des3-cbc-sha1
Key: vno 23, arcfour-hmac
Key: vno 23, camellia128-cts-cmac
Key: vno 23, camellia256-cts-cmac
MKey: vno 1
Attributes: REQUIRES_PRE_AUTH
Policy: [none]
kadmin.local:

kadmin.local:  getprinc hue/cm-r01en02.mws.mds.xyz@MWS.MDS.XYZ
Principal: hue/cm-r01en02.mws.mds.xyz@MWS.MDS.XYZ
Expiration date: [never]
Last password change: Sun Aug 25 08:38:17 EDT 2019
Password expiration date: [never]
Maximum ticket life: 1 day 00:00:00
Maximum renewable life: 90 days 00:00:00
Last modified: Sun Aug 25 08:38:17 EDT 2019 (hue/cm-r01en02.mws.mds.xyz@MWS.MDS.XYZ)
Last successful authentication: [never]
Last failed authentication: [never]
Failed password attempts: 0
Number of keys: 6
Key: vno 19, aes256-cts-hmac-sha1-96
Key: vno 19, aes128-cts-hmac-sha1-96
Key: vno 19, des3-cbc-sha1
Key: vno 19, arcfour-hmac
Key: vno 19, camellia128-cts-cmac
Key: vno 19, camellia256-cts-cmac
MKey: vno 1
Attributes: REQUIRES_PRE_AUTH
Policy: [none]
kadmin.local:

Comparing the above to the generated hue.keytab files shows a match:

[root@cm-r01en01 ~]# klist -kte /var/run/cloudera-scm-agent/process/1229-hue-KT_RENEWER/hue.keytab
Keytab name: FILE:/var/run/cloudera-scm-agent/process/1229-hue-KT_RENEWER/hue.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
  23 08/25/2019 08:44:22 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (aes256-cts-hmac-sha1-96)
  23 08/25/2019 08:44:22 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (aes128-cts-hmac-sha1-96)
  23 08/25/2019 08:44:22 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (aes256-cts-hmac-sha384-192)
  23 08/25/2019 08:44:22 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (aes128-cts-hmac-sha256-128)
  23 08/25/2019 08:44:22 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (des3-cbc-sha1)
  23 08/25/2019 08:44:22 hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ (arcfour-hmac)
[root@cm-r01en01 ~]#

[root@cm-r01en02 ~]# klist -kte /var/run/cloudera-scm-agent/process/1228-hue-KT_RENEWER/hue.keytab
Keytab name: FILE:/var/run/cloudera-scm-agent/process/1228-hue-KT_RENEWER/hue.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
  19 08/25/2019 08:44:22 hue/cm-r01en02.mws.mds.xyz@MWS.MDS.XYZ (aes256-cts-hmac-sha1-96)
  19 08/25/2019 08:44:22 hue/cm-r01en02.mws.mds.xyz@MWS.MDS.XYZ (aes128-cts-hmac-sha1-96)
  19 08/25/2019 08:44:22 hue/cm-r01en02.mws.mds.xyz@MWS.MDS.XYZ (aes256-cts-hmac-sha384-192)
  19 08/25/2019 08:44:22 hue/cm-r01en02.mws.mds.xyz@MWS.MDS.XYZ (aes128-cts-hmac-sha256-128)
  19 08/25/2019 08:44:22 hue/cm-r01en02.mws.mds.xyz@MWS.MDS.XYZ (des3-cbc-sha1)
  19 08/25/2019 08:44:22 hue/cm-r01en02.mws.mds.xyz@MWS.MDS.XYZ (arcfour-hmac)
[root@cm-r01en02 ~]#

And Hue is green.

Cheers,
TK

ERROR    Timed out waiting for worker process collecting filesystem usage to complete.

Getting this error?

==> /var/log/cloudera-scm-agent/cloudera-scm-agent.log <==
[24/Aug/2019 22:00:08 +0000] 3697 Monitor-HostMonitor throttling_logger ERROR    Timed out waiting for worker process collecting filesystem usage to complete. This may occur if the host has an NFS or other remote filesystem that is not responding to requests in a timely fashion. Current nodev filesystems: /dev/shm,/run,/sys/fs/cgroup,/run/cloudera-scm-agent/process,/run/cloudera-scm-agent/process,/run/user/155601104,/n/mds.xyz,/run/user/0
[24/Aug/2019 22:00:08 +0000] 3697 MainThread agent        ERROR    Heartbeating to srv-c01.mws.mds.xyz:7182 failed.

Verify your NFS storage.  In our case, one of the Gluster bricks was out of space on the root filesystem:

[root@nfs03 ~]# systemctl restart glusterd haproxy keepalived nfs-ganesha
Job for glusterd.service failed because the control process exited with error code. See "systemctl status glusterd.service" and "journalctl -xe" for details.
[root@nfs03 ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   41G   41G   20K 100% /

Free the space and restart services.
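
To catch this earlier next time, a quick loop over the Gluster nodes shows which brick filesystem is filling up. A minimal sketch; the host names nfs01 through nfs03 match our environment and are an assumption for yours:

# Check root filesystem usage on each Gluster/NFS node and flag anything above 90%.
for h in nfs01 nfs02 nfs03; do
    usage=$(ssh "$h" "df -P / | awk 'NR==2 {print \$5}'" | tr -d '%')
    echo "$h: ${usage}% used"
    [ "$usage" -ge 90 ] && echo "WARNING: $h root filesystem is nearly full"
done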

Cheers,
TK

ERROR:desktop.kt_renewer:Couldn't renew kerberos ticket in order to work around Kerberos 1.8.1 issue. Please check that the ticket for '' is still renewable:

Getting this?

INFO:desktop.kt_renewer:Renewing kerberos ticket to work around kerberos 1.8.1: /usr/bin/kinit -R -c /var/run/hue/hue_krb5_ccache
kinit: KDC can't fulfill requested option while renewing credentials
ERROR:desktop.kt_renewer:Couldn't renew kerberos ticket in order to work around Kerberos 1.8.1 issue. Please check that the ticket for 'hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ' is still renewable:
  $ klist -f -c /var/run/hue/hue_krb5_ccache
If the 'renew until' date is the same as the 'valid starting' date, the ticket cannot be renewed. Please check your KDC configuration, and the ticket renewal policy (maxrenewlife) for the 'hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ' and `krbtgt' principals.

Resolve it by adding the following lines to the /etc/krb5.conf file on the KDC servers:

[libdefaults]
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true

Then regenerate the Kerberos credentials by browsing to Administration – Security – Kerberos Credentials, selecting all hosts, and regenerating the Kerberos credentials for all of them.

Once that is done, restart the Kerberos Ticket Renewer.

Didn't work?  Increase the Maximum Renewable Life for Principals from 5 days to 7 days and set the Hue Keytab Renewal Interval to 7 days:

Hue Keytab Renewal Interval
reinit_frequency

However, this did not work for us either.  The real issue was revealed by this message on the KDC server:

Aug 24 21:26:50 idmipa03.mws.mds.xyz krb5kdc[12023](info): TGS_REQ (8 etypes {18 17 20 19 16 23 25 26}) 192.168.0.140: TICKET NOT RENEWABLE: authtime 0,  hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ for krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ, KDC can't fulfill requested option

We can see this in the credential cache: there is no "renew until" line, and the flags lack R (renewable):

[root@cm-r01en02 ~]# klist -fe /var/run/hue/hue_krb5_ccache
Ticket cache: FILE:/var/run/hue/hue_krb5_ccache
Default principal: hue/cm-r01en02.mws.mds.xyz@MWS.MDS.XYZ

Valid starting       Expires              Service principal
08/24/2019 21:14:07  08/25/2019 21:14:07  krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ
        Flags: FIA, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
You have new mail in /var/spool/mail/root
[root@cm-r01en02 ~]#

Hence, when we try to renew the ticket with kinit, we get the renewal error above:

[root@cm-r01en01 ~]# kinit -R -c /var/run/hue/hue_krb5_ccache
kinit: KDC can't fulfill requested option while renewing credentials
[root@cm-r01en01 ~]#

The ticket is missing the R (renewable) flag.  To fix this, we'll need to modify one of the Cloudera scripts (gen_credentials_ipa.sh) so that the principals it creates allow renewable tickets, adding +allow_renewable to the modprinc call it runs against each principal.  (The /tmp/kadmin logging lines below are just our debugging; the functional change is the +allow_renewable flag.)

# Set the maxrenewlife for the principal, if given. There is no interface
# offered by the IPA to set it, so we use KADMIN as suggested in a few IPA
# related forums.
#set +e
KADMIN="kadmin -k -t $CMF_KEYTAB_FILE -p $CMF_PRINCIPAL -r $CMF_REALM"

if [ $MAX_RENEW_LIFE -gt 0 ]; then
  mkdir /tmp/kadmin/;
  klist >> /tmp/kadmin/klist.log;
  pwd >> /tmp/kadmin/pwd.log;
  whoami >> /tmp/kadmin/whoami.log;
  who am i >> /tmp/kadmin/who-am-i.log;
  echo "Running: $KADMIN -q \"modprinc -maxrenewlife \"$MAX_RENEW_LIFE sec\" $PRINCIPAL\" " >> /tmp/kadmin/kadmin-command.log;
  /bin/cp $CMF_KEYTAB_FILE $CMF_PRINCIPAL /tmp/kadmin/;
  $KADMIN -q "modprinc -maxrenewlife \"$MAX_RENEW_LIFE sec\" +allow_renewable $PRINCIPAL"
fi
#set -e
Save the code and redistribute it:

[root@awx01 ansible]# vi adhoc/gen_credentials_ipa.sh
[root@awx01 ansible]# ansible cm* -i infra -m copy -a 'src=adhoc/gen_credentials_ipa.sh dest=/opt/cloudera/cm/bin/gen_credentials_ipa.sh'

Once this is done, regenerate all the Kerberos principals under Administration – Security – Kerberos Credentials.
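
After the regeneration completes, you can confirm on the KDC that the new principals actually came out renewable: the renewable life should be non-zero and DISALLOW_RENEWABLE should not appear in the attributes. A quick check, reusing the principal from above:

kadmin.local -q "getprinc hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ" | grep -E "Maximum renewable life|Attributes"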

Another issue here is the encryption types, and possibly the renewable life, on the krbtgt principal:

kadmin.local:  getprinc krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ
Principal: krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ
Expiration date: [never]
Last password change: [never]
Password expiration date: [never]
Maximum ticket life: 7 days 00:00:00
Maximum renewable life: 14 days 00:00:00
Last modified: Mon Feb 04 22:19:28 EST 2019 (db_creation@MWS.MDS.XYZ)
Last successful authentication: [never]
Last failed authentication: [never]
Failed password attempts: 0
Number of keys: 2
Key: vno 1, aes256-cts-hmac-sha1-96
Key: vno 1, aes128-cts-hmac-sha1-96
MKey: vno 1
Attributes: REQUIRES_PRE_AUTH LOCKDOWN_KEYS
Policy: [none]
kadmin.local:  modprinc -maxrenewlife 90day krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ
Principal "krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ" modified.
kadmin.local:
kadmin.local:
kadmin.local:  getprinc krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ
Principal: krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ
Expiration date: [never]
Last password change: [never]
Password expiration date: [never]
Maximum ticket life: 7 days 00:00:00
Maximum renewable life: 90 days 00:00:00
Last modified: Sat Aug 24 22:45:03 EDT 2019 (admin/admin@MWS.MDS.XYZ)
Last successful authentication: [never]
Last failed authentication: [never]
Failed password attempts: 0
Number of keys: 2
Key: vno 1, aes256-cts-hmac-sha1-96
Key: vno 1, aes128-cts-hmac-sha1-96
MKey: vno 1
Attributes: REQUIRES_PRE_AUTH LOCKDOWN_KEYS
Policy: [none]
kadmin.local:

However, the encryption types were the bigger concern.  Our Cloudera Kerberos configuration didn't list them, so we were getting:

Aug 24 22:43:38 idmipa03.mws.mds.xyz krb5kdc[12022](info): AS_REQ (8 etypes {18 17 20 19 16 23 25 26}) 192.168.0.140: NEEDED_PREAUTH: hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ for krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ, Additional pre-authentication required
Aug 24 22:43:38 idmipa03.mws.mds.xyz krb5kdc[12022](info): closing down fd 11
Aug 24 22:43:38 idmipa03.mws.mds.xyz krb5kdc[12023](info): preauth (encrypted_timestamp) verify failure: Preauthentication failed
Aug 24 22:43:38 idmipa03.mws.mds.xyz krb5kdc[12023](info): AS_REQ (8 etypes {18 17 20 19 16 23 25 26}) 192.168.0.140: PREAUTH_FAILED: hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ for krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ, Preauthentication failed
Aug 24 22:43:38 idmipa03.mws.mds.xyz krb5kdc[12023](info): closing down fd 11

when we tried to run:

[root@cm-r01en01 ~]# /bin/kinit -k -t /run/cloudera-scm-agent/process/749-hue-KT_RENEWER/hue.keytab -c /var/run/hue/hue_krb5_ccache hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ
kinit: Preauthentication failed while getting initial credentials
[root@cm-r01en01 ~]# 

Add the encryption types to the configuration:

Administration – Settings
     Kerberos Encryption Types
     aes256-cts-hmac-sha1-96
     aes128-cts-hmac-sha1-96

Again, regenerate the Kerberos credentials and retry the kinit, which now succeeds:

[root@cm-r01en01 ~]# /bin/kinit -k -t /run/cloudera-scm-agent/process/904-hue-KT_RENEWER/hue.keytab -c /var/run/hue/hue_krb5_ccache hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ
[root@cm-r01en01 ~]#

We also tried adjusting kdc.conf, with no luck:

[root@idmipa04 ~]# cat /var/kerberos/krb5kdc/kdc.conf | grep default_principal_flags
  default_principal_flags = +preauth, +renewable
[root@idmipa04 ~]#

Ultimately, we ended up setting the flags in /etc/krb5.conf on each CM machine instead:

[root@cm-r01en01 ~]# vi /etc/krb5.conf
[libdefaults]
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true

And the same on cm-r01en02.  That worked!

[root@cm-r01en01 ~]# klist -fe
Ticket cache: KEYRING:persistent:0:krb_ccache_Wg0x02u
Default principal: hue/cm-r01en01.mws.mds.xyz@MWS.MDS.XYZ

Valid starting       Expires              Service principal
08/25/2019 01:02:14  08/26/2019 01:02:14  krbtgt/MWS.MDS.XYZ@MWS.MDS.XYZ
        renew until 09/01/2019 01:02:14, Flags: FRIA
        Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
[root@cm-r01en01 ~]#
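
With the R flag present and the Kerberos Ticket Renewer restarted, the manual renewal that failed earlier should now complete silently against the Hue cache. This is the expected result rather than captured output:

[root@cm-r01en01 ~]# kinit -R -c /var/run/hue/hue_krb5_ccache
[root@cm-r01en01 ~]#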

Cheers,
TK

kadmin: Communication failure with server while initializing kadmin interface

Getting this?

-bash-4.2$ kadmin -k -t /tmp/kadmin/cmf8392129202434993503.keytab -p cmadmin-530029b6@MWS.MDS.XYZ -r MWS.MDS.XYZ -q 'listprincs'
Authenticating as principal cmadmin-530029b6@MWS.MDS.XYZ with keytab /tmp/kadmin/cmf8392129202434993503.keytab.
kadmin: Communication failure with server while initializing kadmin interface

Ensure port 749 (kadmin) is open in the firewall configuration on your IPA/KDC servers:

[root@idmipa03 ~]# cat /etc/firewalld/zones/public.xml|grep -Ei 749
  <port protocol="tcp" port="749"/>
[root@idmipa03 ~]#
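
If the port is missing, add it with firewall-cmd and reload; a minimal sketch, assuming firewalld with the public zone as above:

[root@idmipa03 ~]# firewall-cmd --permanent --zone=public --add-port=749/tcp
[root@idmipa03 ~]# firewall-cmd --reload
[root@idmipa03 ~]# firewall-cmd --zone=public --list-ports | grep 749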

Cheers,
TK

ERROR! Extraneous options or arguments

Getting this error from Ansible?

ERROR! Extraneous options or arguments

Simply move up one directory, to where the infra inventory file resides, and re-execute:

[root@awx01 adhoc]# cd ..
[root@awx01 ansible]# ansible cm* -i infra -m copy -a 'src=adhoc/public.xml dest=/etc/firewalld/zones/public.xml'

This was just a fancy way of ansible telling you it can't find the infra file containing the hosts for your deployment.
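
Alternatively, reference the inventory by its full path so the command works from any directory. A sketch, assuming the repository lives under /root/ansible:

[root@awx01 adhoc]# ansible cm* -i /root/ansible/infra -m copy -a 'src=/root/ansible/adhoc/public.xml dest=/etc/firewalld/zones/public.xml'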

Cheers,
TK

ls: Permission denied: user=root, access=EXECUTE, inode="/tmp/.cloudera_health_monitoring_canary_files":hdfs:supergroup:d---------

Before enabling Kerberos, the hdfs user is the only one that can list files owned by itself:

[root@cm-r01en01 run]# hdfs dfs -ls /tmp/.cloudera_health_monitoring_canary_files/.canary_file_2019_08_20-19_28_02.abab681cca56293a
ls: Permission denied: user=root, access=EXECUTE, inode="/tmp/.cloudera_health_monitoring_canary_files":hdfs:supergroup:d---------

You'll need to run the command as the hdfs user via sudo, like this:

[root@cm-r01en01 run]# sudo su -c "hdfs dfs -ls /tmp/.cloudera_health_monitoring_canary_files/.canary_file_2019_08_20-19_28_02.abab681cca56293a" -s /bin/bash hdfs
-rw-rw-rw-   3 hdfs supergroup          0 2019-08-20 20:28 /tmp/.cloudera_health_monitoring_canary_files/.canary_file_2019_08_20-19_28_02.abab681cca56293a
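
An equivalent, slightly shorter form (assuming your sudoers policy allows running commands directly as the hdfs user) is:

[root@cm-r01en01 run]# sudo -u hdfs hdfs dfs -ls /tmp/.cloudera_health_monitoring_canary_files/.canary_file_2019_08_20-19_28_02.abab681cca56293a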

Cheers,
TK

Slow SSD performance through SATA 3.0

Seeing slow SSD performance through SATA 3.0?  Tired of transfer rates of 30MB/s when you should be seeing at least 150MB/s? 

Fix this by adding the following kernel parameter:

libata.noacpi=1
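
A minimal sketch of applying the parameter on a CentOS 7 style host, assuming GRUB2 with its config at /boot/grub2/grub.cfg:

# Append the parameter to the kernel command line, rebuild the GRUB config and reboot.
sed -i 's/^GRUB_CMDLINE_LINUX="\(.*\)"$/GRUB_CMDLINE_LINUX="\1 libata.noacpi=1"/' /etc/default/grub
grub2-mkconfig -o /boot/grub2/grub.cfg
reboot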

Cheers,
TK

FATAL: Module scsi_wait_scan not found.

Getting this?

FATAL: Module scsi_wait_scan not found.

This is likely followed by a kernel panic.  Look for a non-existent mount point being defined, such as an LVM_swap partition that no longer exists, and remove it.  Also remove any other non-existent partitions from /etc/fstab through the recovery console, as sketched below.
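
From the recovery console, the stale entry can be neutralized in place. A sketch, assuming the bad line references an LVM_swap volume as in our case:

# Remount the root filesystem read-write, then comment out the stale swap entry.
mount -o remount,rw /
sed -i '/LVM_swap/s/^/#/' /etc/fstab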

Cheers,
TK

Cloudera 6.2 Installation: ERROR StatusLogger No log4j2 configuration file found.

Getting this?

SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'org.apache.logging.log4j.simplelog.StatusLogger.level' to TRACE to show Log4j2 internal initialization logging.

Investigate by first running the command manually:

[oozie@cm-r01en01 ~]$ /usr/java/latest/bin/java -Xms52428800 -Xmx52428800 -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80 -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/oozie_oozie-OOZIE_SERVER-901d5713a53510380392378fa81b483d_pid1397.hprof -XX:OnOutOfMemoryError=/opt/cloudera/cm-agent/service/common/killparent.sh -Doozie.home.dir=/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/oozie -Doozie.config.dir=/run/cloudera-scm-agent/process/216-oozie-OOZIE-SERVER-upload-sharelib -Doozie.log.dir=/var/log/oozie -Doozie.log.file=oozie-cmf-oozie-OOZIE_SERVER-cm-r01en01.mws.mds.xyz.log.out -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=log4j.properties -Doozie.log4j.reload=10 -Doozie.http.hostname=cm-r01en01.mws.mds.xyz -Doozie.http.port=11000 -Djava.net.preferIPv4Stack=true -Doozie.admin.port= -Dderby.stream.error.file=/var/log/oozie/derby.log -Doozie.instance.id=cm-r01en01.mws.mds.xyz -Djava.library.path=/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/hadoop/lib/native -cp ':/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/oozie/libtools/accessors-smart-1.2.jar: < VERY VERY LONG COMMAND OF JAR FILES> :/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/oozie/libext/*.jar' org.apache.oozie.tools.OozieSharelibCLI create -fs hdfs://cm-r01nn02.mws.mds.xyz:8020 -locallib /opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/oozie/oozie-sharelib-yarn -concurrency 8
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/jars/log4j-slf4j-impl-2.8.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/jars/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/jars/slf4j-simple-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'org.apache.logging.log4j.simplelog.StatusLogger.level' to TRACE to show Log4j2 internal initialization logging.
Found Hadoop that supports Erasure Coding. Trying to disable Erasure Coding for path: /user/oozie/share/lib
Done
the destination path for sharelib is: /user/oozie/share/lib/lib_20190817204923
Running 1738 copy tasks on 8 threads
Copy tasks are done
[oozie@cm-r01en01 ~]$

Notice that the command completes successfully when run manually, but does not when run through the CM UI.

Next, execute the same task from the CM UI and observe the space consumed on HDFS:

[root@cm-r01en02 CDH]# hdfs dfs -du -s -h /user/oozie/share/lib/*
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190813004211
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190813011235
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190813074412
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190814001323
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190814003111
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190814222128
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190814223531
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190814224153
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190814230015
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190814231607
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190814232614
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190814233305
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190815234730
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190816173425
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190816230157
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190816232108
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190816232800
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190817092413
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190817140917
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190817192300
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190817193419
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190817202812
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190817204018
1.4 G    4.1 G  /user/oozie/share/lib/lib_20190817204923
998.0 M  3.0 G  /user/oozie/share/lib/lib_20190817205610
[root@cm-r01en02 CDH]#

Notice that the task in the CM UI ends, but the copy into the folder continues in the background.  This indicates the command timeout was exhausted before the copy finished, which in turn suggests the storage backing the installation is rather slow.  Cloudera doesn't appear to state this explicitly in the log files.
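
Each failed attempt leaves a full sharelib copy behind, so the stale directories can be removed to reclaim HDFS space. A minimal sketch that keeps only the newest directory, assuming the hdfs superuser is reachable via sudo:

# List the sharelib directories oldest first, drop the newest one, and delete the rest.
sudo -u hdfs hdfs dfs -ls /user/oozie/share/lib | awk '/lib_/ {print $NF}' | sort | head -n -1 | \
    xargs -r -n1 sudo -u hdfs hdfs dfs -rm -r -skipTrash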

We need to raise the timeout.  Browse to Oozie -> Configuration and set a timeout greater than the default of 270 seconds:

Oozie Upload ShareLib Command Timeout

Oozie (Service-Wide) 

The timeout in seconds used for the Oozie Upload ShareLib command. When the value is zero, there is no timeout for the command.

Change 270 to a higher value and retry the operation.

Cheers,
TK

java.lang.ClassNotFoundException: org.cloudera.log4j.redactor.RedactorAppender

Running into this?

java.lang.ClassNotFoundException: org.cloudera.log4j.redactor.RedactorAppender

The RedactorAppender class ships in the logredactor jar, which isn't on Oozie's libtools classpath.  Solve it by symlinking the jars in on every CM node:

[root@awx01 ansible]# ansible 'cm*' -m shell -a 'cd /opt/cloudera/parcels/CDH/lib/oozie/libtools; ln -s ../../../jars/log4j-core-2.8.2.jar log4j-core-2.8.2.jar; ln -s ../../../jars/logredactor-2.0.7.jar  logredactor-2.0.7.jar'
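
You can confirm the links resolve on every node before retrying the failed step; a quick check against the same parcel layout:

[root@awx01 ansible]# ansible 'cm*' -m shell -a 'ls -l /opt/cloudera/parcels/CDH/lib/oozie/libtools/log4j-core-2.8.2.jar /opt/cloudera/parcels/CDH/lib/oozie/libtools/logredactor-2.0.7.jar'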

Cheers,
TK


     