vmware: no healthy upstream

After a hard reset, a greeting of:

no healthy upstream

pops up from the vSphere Client.  Log in as root and issue:

service-control --start vmware-vpxd

to see if there's any additional information about this error.  You may or may not receive more info.  Check the time and NTP settings.  There's a good chance the time is not synced.

date

Log in to the management console.  For example:

https://vcsa01.nix.mds.xyz:5480/#/login

If login fails with:

Unable to login

check space with:

# df -h |grep 100
/dev/mapper/log_vg-log                    9.8G  9.5G     0 100% /storage/log
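
Grepping for 100 also matches sizes such as 100G, so here is a minimal sketch that filters on the Use% column instead.  The function name full_mounts and the 95% default are my own picks for illustration, not something that ships with the appliance.  It reads df -P style output on stdin and assumes mount points without spaces:

```shell
# Print "mountpoint use%" for every filesystem at or above a threshold.
# Reads `df -P` style output on stdin, so it also works on saved output.
full_mounts() {
    threshold="${1:-95}"   # hypothetical default, adjust to taste
    awk -v t="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "100%" -> "100"
            if (use + 0 >= t) print $6, $5
        }'
}

# On a live system:
#   df -P | full_mounts 95
```

On the appliance above this should report /storage/log and nothing else.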

Clear space by removing old log files, for example:

root@vcsa01 [ ~ ]# df -h |grep 100
/dev/mapper/log_vg-log                    9.8G  9.5G     0 100% /storage/log
root@vcsa01 [ ~ ]# cd /storage/log
root@vcsa01 [ /storage/log ]# du -sh *|grep G
9.4G    vmware
root@vcsa01 [ /storage/log ]# cd vmware/
root@vcsa01 [ /storage/log/vmware ]# du -sh *|grep G
1.6G    eam
2.7G    lookupsvc
root@vcsa01 [ /storage/log/vmware ]# cd lookupsvc/
root@vcsa01 [ /storage/log/vmware/lookupsvc ]# du -sh *|grep G
2.6G    tomcat
root@vcsa01 [ /storage/log/vmware/lookupsvc ]# cd tomcat/
root@vcsa01 [ /storage/log/vmware/lookupsvc/tomcat ]#
root@vcsa01 [ /storage/log/vmware/lookupsvc/tomcat ]# rm -rf localhost_access.2021*
root@vcsa01 [ /storage/log/vmware/lookupsvc/tomcat ]# df -h .
Filesystem              Size  Used Avail Use% Mounted on
/dev/mapper/log_vg-log  9.8G  8.0G  1.4G  86% /storage/log
root@vcsa01 [ /storage/log/vmware/lookupsvc/tomcat ]#
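
The cd / du / grep G walk above can be shortened a bit.  A rough sketch, assuming du, sort and head are available (they are on most Linux systems); largest_entries is a made-up helper name:

```shell
# List the biggest entries directly under a directory, largest first.
# Sizes are in KiB so that sort -n orders them correctly.
largest_entries() {
    dir="${1:-.}"
    count="${2:-5}"
    # -s: one total per entry, -k: KiB units, -x: stay on this filesystem
    du -skx "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# On the appliance:
#   largest_entries /storage/log/vmware 5
```

Run it again on whichever directory comes out on top until the large log files show up.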

After cleaning up the space in a few more folders, reboot the appliance:

reboot

You should now be able to log in once the space is freed.  Try logging in to the management console (port 5480) again; if the vSphere Client UI still doesn't show up, disable and re-enable Time Synchronization there.  If the management console itself is not available, issue the following from an SSH session on the appliance:

root@vcsa01 [ ~ ]# service-control --start applmgmt

If you get a certificate expiration failure:

Exception in invoking authentication handler [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1076)

Renew the certificate, whether self-signed or CA-issued, following this KB article:

https://kb.vmware.com/s/article/76719

Example output:

root@vcsa01 [ /tmp ]# ./fixsts.sh
NOTE: This works on external and embedded PSCs
This script will do the following
1: Regenerate STS certificate
What is needed?
1: Offline snapshots of VCs/PSCs
2: SSO Admin Password
IMPORTANT: This script should only be run on a single PSC per SSO domain
==================================
Resetting STS certificate for vcsa01.nix.mds.xyz started on Mon Aug  1 04:23:46 UTC 2022


Detected DN: cn=vcsa01.nix.mds.xyz,ou=Domain Controllers,dc=vsphere,dc=local
Detected PNID: vcsa01.nix.mds.xyz
Detected PSC: vcsa01.nix.mds.xyz
Detected SSO domain name: vsphere.local
Detected Machine ID: 310ae9cb-82a9-4fa4-bcd4-d34b054d0090
Detected IP Address: 192.168.0.33
Domain CN: dc=vsphere,dc=local
==================================
==================================

Detected Root's certificate expiration date: 2030 Jun 3
Detected today's date: 2022 Aug 1
==================================

Exporting and generating STS certificate

Status : Success
Using config file : /tmp/vmware-fixsts/certool.cfg
Status : Success


Enter password for administrator@vsphere.local:
Highest tenant credentials index : 1
Exporting tenant 1 to /tmp/vmware-fixsts

Deleting tenant 1

Highest trusted cert chains index: 1
Exporting trustedcertchain 1 to /tmp/vmware-fixsts

Deleting trustedcertchain 1

Applying newly generated STS certificate to SSO domain
adding new entry "cn=TenantCredential-1,cn=vsphere.local,cn=Tenants,cn=IdentityManager,cn=Services,dc=vsphere,dc=local"

adding new entry "cn=TrustedCertChain-1,cn=TrustedCertificateChains,cn=vsphere.local,cn=Tenants,cn=IdentityManager,cn=Services,dc=vsphere,dc=local"


Replacement finished – Please restart services on all vCenters and PSCs in your SSO domain
==================================
IMPORTANT: In case you're using HLM (Hybrid Linked Mode) without a gateway, you would need to re-sync the certs from Cloud to On-Prem after following this procedure
==================================
==================================
root@vcsa01 [ /tmp ]#

Restart the services as the script output instructs, then try to log in again.

Cheers,
Tom

NTPD: Setting up an NTP server on DD-WRT

Recent power and ISP outages left my network without a proper internal NTP server, which was, coincidentally, installed on an ESXi host.  I had reverted to an external NTP server for the time being, but a recent outage with my ISP highlighted the fact that even that isn't enough.  It made clear I need a solution that:

  1. Sits on a low-power device external to my lab, isolated from any large server-hosted device.
  2. Doesn't depend on DNS to sync up time in case the ISP is offline.
  3. Maintains accurate time on its own, so it is a reliable source of time when everything else is offline.

So I went with OpenWRT and a Raspberry Pi device for just this very thing.  This is super simple:

opkg update
opkg install ntpd
/etc/init.d/sysntpd disable
/etc/init.d/ntpd enable
/etc/init.d/ntpd start
netstat -l | grep ntp

Configure the external NTP servers to use:

root@OWRT01:~# cat /etc/config/system

config system
        option ttylogin '0'
        option log_size '64'
        option urandom_seed '0'
        option hostname 'OWRT01'
        option log_proto 'udp'
        option conloglevel '8'
        option cronloglevel '5'
        option timezone 'EST5EDT,M3.2.0,M11.1.0'
        option zonename 'America/Toronto'
        option log_ip 192.168.0.14
        option log_port 514
        option log_proto udp


config timeserver 'ntp'
        list server '0.ca.pool.ntp.org'
        list server '1.ca.pool.ntp.org'
        list server '2.ca.pool.ntp.org'
root@OWRT01:~#
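
To double check which upstream servers actually ended up in the config, something like the following can pull them out.  This is only a sketch: ntp_servers_from_uci is a name I made up, and it simply parses the file format shown above with awk rather than going through the router's own config tooling:

```shell
# Print the "list server" entries from the timeserver section of an
# OpenWRT system config file (defaults to /etc/config/system).
ntp_servers_from_uci() {
    file="${1:-/etc/config/system}"
    awk -v q="'" '
        $1 == "config" { in_ts = ($2 == "timeserver") }   # track current section
        in_ts && $1 == "list" && $2 == "server" {
            gsub(q, "", $3)                               # strip the single quotes
            print $3
        }' "$file"
}

# On the router:
#   ntp_servers_from_uci /etc/config/system
```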

Set the time manually, in the event the system can't sync its time with an external server:

# date
# date -k

So our brand new NTP server is sitting on:

192.168.0.12

Let's now set the Date / Time to sync from this NTP server.  For Cisco switches:

mdscisco01#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
mdscisco01(config)#ntp server 192.168.0.12
mdscisco01(config)#end
mdscisco01#show run
mdscisco01#show running-config

Ensure local time is also set correctly:

mdscisco02#
mdscisco02#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
mdscisco02(config)#clock timezone EST -5
mdscisco02(config)#end
mdscisco02#clock set 11:52:00 July 10 2022
mdscisco02#copy run
mdscisco02#copy running-config startup-config
Destination filename [startup-config]?
Building configuration…
Compressed configuration from 7043 bytes to 2639 bytes[OK]
mdscisco02#

For Linux Servers:

[root@mbpc-pc ~]# cat /etc/ntp.conf|grep -Eiv "^#"
driftfile /var/lib/ntp/drift
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict -6 ::1
server 192.168.0.12
server idmipa01.nix.mds.xyz prefer
server idmipa02.nix.mds.xyz prefer
server 0.rhel.pool.ntp.org iburst
server 1.rhel.pool.ntp.org iburst
server 2.rhel.pool.ntp.org iburst
server 3.rhel.pool.ntp.org iburst
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
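
Once ntpd has been restarted, it's worth checking which server it actually picked.  In ntpq -p output the peer currently used for sync is the line prefixed with an asterisk.  A small sketch that pulls it out (selected_ntp_peer is a hypothetical helper name, and it reads the ntpq output on stdin):

```shell
# Print the peer that ntpd has selected for synchronization,
# i.e. the line that `ntpq -pn` marks with a leading "*".
selected_ntp_peer() {
    awk '/^\*/ { print substr($1, 2) }'   # drop the "*" marker
}

# On the Linux server:
#   ntpq -pn | selected_ntp_peer
```

If nothing is printed, ntpd has not selected a peer yet; give it a few polling intervals before digging further.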

For DD-WRT, configure via the basic config to use the OpenWRT NTP server:

https://i2.wp.com/www.microdevsys.com/WordPressImages/DD-WRT-NTP-Setup.PNG?ssl=1

Configure ESXi hosts you may have:

https://i0.wp.com/www.microdevsys.com/wordpressimages/DD-WRT-NTP-Setup-ESXi-Config.PNG?ssl=1

REF: https://openwrt.org/docs/guide-user/services/ntp/client-server 
REF: https://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst3750x_3560x/software/release/12-2_55_se/configuration/guide/3750xscg/swadmin.html 
REF: https://wiki.dd-wrt.com/wiki/index.php/Network_Time_Protocol#:~:text=You%20cannot%20set%20your%20time,to%20match%20your%20local%20time.

HTH


 

DD-WRT: DHCP not working not assigning IP – Local DNS fix

Not getting a DHCP IP?  Instead getting a 169.254.x.x link-local address such as:

169.254.149.164
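
Addresses in 169.254.0.0/16 are link-local: the client assigns one to itself when it never completes a DHCP lease.  A tiny sketch for spotting them in a script (is_link_local is just an illustrative name, and the pattern match is deliberately loose, it does not validate the octets):

```shell
# Succeed if the given IPv4 address is in the 169.254.0.0/16 link-local range.
is_link_local() {
    case "$1" in
        169.254.*.*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example:
#   is_link_local 169.254.149.164 && echo "no DHCP lease obtained"
```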

A tcpdump of the traffic on the affected DD-WRT router shows the DHCP server offering a valid IP, yet that address is never assigned to the client on the 5GHz WiFi band:


23:54:16.735841 bb:bb:bb:bb:bb:bb > aa:aa:aa:aa:aa:aa, ethertype 802.1Q (0x8100), length 366: vlan 1, p 0, ethertype IPv4 (0x0800), (tos 0xc0, ttl 64, id 27979, offset 0, flags [none], proto UDP (17), length 348)
    192.168.0.6.67 > 192.168.0.144.68: [udp sum ok] BOOTP/DHCP, Reply, length 320, xid 0xaf0ee6f3, secs 25, Flags [none] (0x0000)
          Your-IP 192.168.0.144
          Server-IP 192.168.0.6
          Client-Ethernet-Address aa:aa:aa:aa:aa:aa
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message (53), length 1: Offer
            Server-ID (54), length 4: 192.168.0.6
            Lease-Time (51), length 4: 86400
            RN (58), length 4: 43200
            RB (59), length 4: 75600
            Subnet-Mask (1), length 4: 255.255.255.0
            BR (28), length 4: 192.168.0.255
            Domain-Name (15), length 7: "dom.abc"
            Unknown (252), length 1: 10
            Domain-Name-Server (6), length 20: 192.168.0.30,192.168.0.10,192.168.0.20,123.87.80.1,123.87.81.1
            Default-Gateway (3), length 4: 192.168.0.6
23:54:17.580935 cc:cc:cc:cc:cc:cc > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 594: vlan 1, p 0, ethertype IPv4 (0x0800), (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 576)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from cc:cc:cc:cc:cc:cc, length 548, xid 0x9035706b, Flags [none] (0x0000)
          Client-IP 192.168.0.210
          Client-Ethernet-Address cc:cc:cc:cc:cc:cc
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message (53), length 1: Request
            Parameter-Request (55), length 5:
              Subnet-Mask (1), Default-Gateway (3), Domain-Name-Server (6), Domain-Name (15)
              Hostname (12)
            Hostname (12), length 10: "RokuPlayer"
23:54:25.476020 bb:bb:bb:bb:bb:bb > aa:aa:aa:aa:aa:aa, ethertype 802.1Q (0x8100), length 366: vlan 1, p 0, ethertype IPv4 (0x0800), (tos 0xc0, ttl 64, id 28795, offset 0, flags [none], proto UDP (17), length 348)
    192.168.0.6.67 > 192.168.0.144.68: [udp sum ok] BOOTP/DHCP, Reply, length 320, xid 0xaf0ee6f3, secs 34, Flags [none] (0x0000)
          Your-IP 192.168.0.144
          Server-IP 192.168.0.6
          Client-Ethernet-Address aa:aa:aa:aa:aa:aa
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message (53), length 1: Offer
            Server-ID (54), length 4: 192.168.0.6
            Lease-Time (51), length 4: 86400
            RN (58), length 4: 43200
            RB (59), length 4: 75600
            Subnet-Mask (1), length 4: 255.255.255.0
            BR (28), length 4: 192.168.0.255
            Domain-Name (15), length 7: "dom.abc"
            Unknown (252), length 1: 10
            Domain-Name-Server (6), length 20: 192.168.0.30,192.168.0.10,192.168.0.20,123.87.80.1,123.87.81.1
            Default-Gateway (3), length 4: 192.168.0.6

 

This happened particularly on the 5GHz signal.  In this case, leaving TurboQAM and NitroQAM enabled but disabling Implicit Beamforming and Explicit Beamforming fixed the problem.  Apparently a self-inflicted wound: I had beamforming disabled earlier and only recently re-enabled it.

Some handy commands:

tcpdump '(port 67 or port 68) and ether host aa:bb:cc:dd:ee:ff' -e -n -vv
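
With a long capture it helps to tally the DHCP message types: lots of Offers with no matching Request or ACK for a client points at the client never hearing the reply, which is what happened here.  A rough sketch, assuming the "DHCP-Message (53), length 1: Offer" line format shown in the capture above (other tcpdump versions may print it differently); dhcp_message_counts is a made-up name:

```shell
# Count DHCP message types in `tcpdump -vv` output read from stdin.
dhcp_message_counts() {
    awk -F': ' '
        /DHCP-Message/ { n[$2]++ }                 # $2 is the type, e.g. Offer
        END { for (t in n) print t, n[t] }' | sort
}

# Against a saved capture (hypothetical file name):
#   dhcp_message_counts < dhcp-capture.txt
```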

HTH

DD-WRT: Missing 5Ghz settings and kern.warn kernel: wl driver 7.14.164.18 (r692288) failed with code 1

If the Wireless setup page is missing the 5GHz (or any GHz) settings, or other pages are missing this info, it could be a sign of exhausted nvram.  A clue is that after resetting the nvram to defaults, all options become available again.  But what about this message?

kern.warn kernel: wl driver 7.14.164.18 (r692288) failed with code 1

The above was really a red herring.  The module can be checked using:

# lsmod|grep -Ei wl
wl                   4420666  0

And if not loaded, use rmmod wl then modprobe wl to re-add the module.  These messages pop up and look like they could be related, but in reality they are not connected to the Wireless page not showing settings.  What is also interesting is that the settings don't show up even with plenty of NVRAM apparently left.  Hmm:

# nvram show >/dev/null
size: 68727 bytes (62345 left)
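
To keep an eye on this from a script, the "left" figure can be pulled out of that summary line.  A minimal sketch: nvram_bytes_left is a name I made up, and it assumes the summary is printed on stderr in exactly the format above (which is why it still shows up with stdout sent to /dev/null):

```shell
# Extract the number of free nvram bytes from the
# "size: N bytes (M left)" summary line read on stdin.
nvram_bytes_left() {
    sed -n 's/^size: [0-9]* bytes (\([0-9]*\) left)$/\1/p'
}

# On the router (stderr into the pipe, stdout discarded):
#   nvram show 2>&1 >/dev/null | nvram_bytes_left
```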

The culprit for the missing 5GHz settings was too many F/W rules in DD-WRT causing (apparent) nvram exhaustion.  There are two ways to fix this: compress the rules stored in nvram, or relocate the iptables rules to /jffs where there's plenty of storage.

To save nvram space, encode the F/W commands as follows (First command below compresses directly from the current F/W settings):

# nvram set pH_fw="$(nvram get rc_firewall | gzip | uuencode -m /dev/stdout)"

# nvram set rc_firewall="nvram get pH_fw | uudecode -o /tmp/pH_fw.gz;gunzip /tmp/pH_fw.gz;chmod +x /tmp/pH_fw;/tmp/pH_fw"

# nvram get pH_fw
begin-base64 644 /dev/stdout
H4sIAAAAAAAAA9VabXeiRhT+vr/inqbnbHJORBgxoh/aYwzZuFVC1WTbTz1E
JkqjQAE3zuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuumb7AwLId
k0Jf/+jjtQxjyw2MpwX1odaFvqY/TKDmgyIKSlNQZEGsEwI1F4KpC7U/YeHM
.
.
.
tUsYOsv6sssssssssssssssssssssssssssssssssssssssssssTUbMPl9Nh
xTHyY5gcggggggggggggggggggggggggggggggggggggggg+2NcYRZ4RnwDl
F/L7XKtQKbbbbbbbbbbbbbbbbbbbbbbbbbbbbbQrPhhhhhhhhhhJc5MVx+S3
xdpjWoFIzzzzzzzzzzzzzzzzzzzzzzkkkkkkkkzzhBDu2+ddVF7b51S2lciU
Gb50hBMJ/YPkKgb/h38AsEVI4rQ2AAA=
====
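
To get a feel for how much this saves before touching nvram, the rules can be compressed off to the side and the sizes compared.  This is only an approximation: it uses base64 in place of uuencode -m (same encoding for the body, but without the begin/end lines), and fw_size_report is a hypothetical helper name:

```shell
# Compare the raw size of a firewall script with its gzip + base64 size.
fw_size_report() {
    file="$1"
    raw=$(wc -c < "$file")
    packed=$(gzip -c "$file" | base64 | wc -c)
    # $(( )) strips any padding some wc implementations add
    echo "raw=$((raw)) packed=$((packed))"
}

# Example, with the rules dumped to a file first:
#   nvram get rc_firewall > /tmp/fw.txt && fw_size_report /tmp/fw.txt
```

Repetitive iptables rules compress very well, so the packed figure is usually a fraction of the raw one.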

The command that will show in the UI is:

# nvram get pH_fw | uudecode -o /tmp/pH_fw.gz;gunzip /tmp/pH_fw.gz;chmod +x /tmp/pH_fw;/tmp/pH_fw

To edit the rules use:

# nvram get pH_fw|uudecode | gunzip | tee -a $(nvram get router_name)-firewall.conf

Edit the F/W rules conf file:

# vi $(nvram get router_name)-firewall.conf

Then compress once more:

# nvram set pH_fw="$(cat $(nvram get router_name)-firewall.conf | gzip | uuencode -m /dev/stdout)"

And verify:

# nvram get pH_fw

Verify in the UI that the .rc_firewall command contains the uudecode / gunzip line shown above.  An alternative is to always run the firewall from a file stored on /jffs, which allows far more storage for firewall rules.  Create a file in the /jffs/firewall folder, or another folder, perhaps on your USB:

# vi /jffs/firewall/$(nvram get router_name)-firewall.run

Copy in or enter the firewall rules.  Once done, set the rc_firewall nvram setting to:

# nvram set rc_firewall="time /bin/sh /jffs/firewall/$(nvram get router_name)-firewall.run"

Verify:

# nvram get rc_firewall
# time /bin/sh /jffs/firewall/$(nvram get router_name)-firewall.run

UI verification:

https://i1.wp.com/www.microdevsys.com/WordPressImages/DD-WRT-Firewall-Rules-In-Jffs.png?ssl=1

ISSUE

Some symptoms of low nvram are messages such as these:

dd-wrt-heimdall.the.abyss.log-20220607.gz:--
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: 2231 total pagecache pages
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: 0 pages in swap cache
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: Swap cache stats: add 0, delete 0, find 0/0
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: Free swap  = 0kB
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: Total swap = 0kB
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: 131072 pages RAM
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: 98304 pages HighMem/MovableOnly
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: 2645 pages reserved
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [  668]     0   668      195       39       4       0        0             0 hotplug2
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [  672]     0   672      206       64       3       0        0             0 mstpd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [  676]     0   676      245      139       3       0        0             0 irqbalance
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [  736]     0   736      394      202       3       0        0             0 watchdog
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 1060]     0  1060      363      194       3       0        0             0 syslogd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 1062]     0  1062      363      170       4       0        0             0 klogd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 1071]     0  1071      187       13       3       0        0             0 p910nd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 1122]     0  1122      307       59       3       0        0             0 dropbear
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 1501]     0  1501      424      114       5       0        0             0 ttraff
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 1650]     0  1650      321       41       3       0        0             0 dhcp6c
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 1676]     0  1676      367      191       4       0        0             0 watchquagga
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 1678]     0  1678      379      110       4       0        0             0 wland
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 1690]     0  1690      425      171       3       0        0             0 dnsmasq
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 1700]     0  1700      317      173       3       0        0             0 radvd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 1701]     0  1701      317       40       3       0        0             0 radvd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2143]     0  2143      394       70       3       0        0             0 process_monitor
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2146]     0  2146      286      115       4       0        0             0 scc
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2147]     0  2147      363       88       3       0        0             0 udhcpc
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2155]     0  2155      392      223       5       0        0             0 nas
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2156]     0  2156      392      118       3       0        0             0 nas
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2157]     0  2157      392      212       4       0        0             0 nas
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2158]     0  2158      392      117       5       0        0             0 nas
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2164]     0  2164     1140      154       5       0        0             0 httpd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2165]     0  2165      329      141       3       0        0             0 resetbutton
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2265]     0  2265      248       57       4       0        0             0 usbipd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2338]     0  2338      368       72       3       0        0             0 zabbix_agentd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2340]     0  2340      368      219       3       0        0             0 zabbix_agentd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2341]     0  2341      370      207       3       0        0             0 zabbix_agentd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2342]     0  2342      370      108       3       0        0             0 zabbix_agentd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2343]     0  2343      370      207       3       0        0             0 zabbix_agentd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2344]     0  2344      370      109       3       0        0             0 zabbix_agentd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2345]     0  2345      370      108       3       0        0             0 zabbix_agentd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2410]     0  2410      188      104       4       0        0             0 cron
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [ 2461]     0  2461      454      316       3       0        0             0 ripd
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: Out of memory: Kill process 2461 (ripd) score 2 or sacrifice child
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: Killed process 2461 (ripd) total-vm:1816kB, anon-rss:392kB, file-rss:872kB
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: zabbix_agentd invoked oom-killer: gfp_mask=0x26040d0, order=0, oom_score_adj=0
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: zabbix_agentd cpuset=/ mems_allowed=0
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: CPU: 1 PID: 2340 Comm: zabbix_agentd Tainted: P                4.4.302 #5717
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: Hardware name: Northstar Prototype
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: Backtrace:
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [<80012988>] (dump_backtrace) from [<80012c0c>] (show_stack+0x18/0x1c)
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel:  r7:87265388 r6:20000013 r5:00000000 r4:8055e9e4
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [<80012bf4>] (show_stack) from [<80014b84>] (dump_stack+0x94/0xa8)
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel: [<80014af0>] (dump_stack) from [<800140e4>] (dump_header+0x54/0x17c)
dd-wrt-heimdall.the.abyss.log-20220607.gz:Jun  5 14:52:34 dd-wrt-heimdall.the.abyss kernel:  r7:87265388 r6:00000000 r5:8374bbec r4:87265380

 

Cheers,

 

REF: https://wiki.dd-wrt.com/wiki/index.php/Useful_Scripts#Compress_the_Firewall_Script_.28to_reduce_nvram_usage.29 
REF: https://forum.dd-wrt.com/phpBB2/viewtopic.php?t=327261 

r8169 0000:09:00.0: no dedicated PHY driver found for PHY ID 0x001cc912, maybe realtek.ko needs to be added to initramfs?

After regenerating the initramfs using dracut -f, we end up seeing the following, rather disappointing message:

r8169 0000:09:00.0: no dedicated PHY driver found for PHY ID 0x001cc912, maybe realtek.ko needs to be added to initramfs?

More specifically, this came up while fixing a QLogic card firmware issue on the following Linux kernel level, which forced the regeneration of the initramfs to take in the ql2400_fw.bin file once more with the updated 08.07 firmware:

Linux mbpc-pc 5.11.13 #1 SMP Sun Apr 11 21:31:14 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

On reboot however, though the FW driver worked like a charm, the network card didn't come up.  Digging deeper:

[root@mbpc-pc firmware]# 

Apr 10 17:17:45 mbpc-pc kernel: r8169 0000:09:00.0: no dedicated PHY driver found for PHY ID 0x001cc912, maybe realtek.ko needs to be added to initramfs?
Apr 10 17:17:45 mbpc-pc kernel: r8169: probe of 0000:09:00.0 failed with error -49

[root@mbpc-pc ~]# grep -Ei r8169 /var/log/messages
Apr 10 16:46:15 mbpc-pc kernel: libphy: r8169: probed
Apr 10 16:46:15 mbpc-pc kernel: r8169 0000:09:00.0 eth0: RTL8168d/8111d, aa:bb:cc:dd:ee:ff, XID 283, IRQ 18
Apr 10 16:46:15 mbpc-pc kernel: r8169 0000:09:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
Apr 10 16:46:15 mbpc-pc kernel: RTL8211B Gigabit Ethernet r8169-900:00: attached PHY driver (mii_bus:phy_addr=r8169-900:00, irq=IGNORE)
Apr 10 16:46:15 mbpc-pc kernel: r8169 0000:09:00.0 eth0: Link is Down
Apr 10 16:46:15 mbpc-pc kernel: r8169 0000:09:00.0 eth0: Link is Up – 1Gbps/Full – flow control off
Apr 10 17:17:45 mbpc-pc kernel: libphy: r8169: probed
Apr 10 17:17:45 mbpc-pc kernel: r8169 0000:09:00.0: no dedicated PHY driver found for PHY ID 0x001cc912, maybe realtek.ko needs to be added to initramfs?
Apr 10 17:17:45 mbpc-pc kernel: r8169: probe of 0000:09:00.0 failed with error -49
Apr 10 18:06:04 mbpc-pc kernel: libphy: r8169: probed
Apr 10 18:06:04 mbpc-pc kernel: r8169 0000:09:00.0 eth0: RTL8168d/8111d, aa:bb:cc:dd:ee:ff, XID 283, IRQ 18
Apr 10 18:06:04 mbpc-pc kernel: r8169 0000:09:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
Apr 10 18:06:04 mbpc-pc kernel: RTL8211B Gigabit Ethernet r8169-900:00: attached PHY driver (mii_bus:phy_addr=r8169-900:00, irq=IGNORE)
Apr 10 18:06:04 mbpc-pc kernel: r8169 0000:09:00.0 eth0: Link is Down
Apr 10 18:06:06 mbpc-pc kernel: r8169 0000:09:00.0 eth0: Link is Up – 1Gbps/Full – flow control off
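
The message itself hints that realtek.ko may be missing from the initramfs, so that is worth checking first.  A quick sketch, assuming lsinitrd from the dracut package is available and lists one file path per line; initramfs_has_module is a name I made up:

```shell
# Succeed if a kernel module file (e.g. realtek.ko or realtek.ko.xz)
# appears in an initramfs file listing read from stdin.
initramfs_has_module() {
    mod="$1"
    grep -Eq "/${mod}\.ko(\.[a-z0-9]+)?$"
}

# Against the current initramfs:
#   lsinitrd | initramfs_has_module realtek && echo present || echo missing
```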

This is a horrible way to fix things but given the fact that this is a Scientific Linux 6.X OS w/ a 5.X kernel in it, slated for a future upgrade, so be it:

[root@mbpc-pc ~]# vi /etc/rc.local

# Sometimes the realtek driver doesn't load.  It's between realtek.ko and r8169.  In this case, if eth0 doesn't show up, reinsert the module and log the attempt.
#
# Attempts to fix:
#
#       mbpc-pc kernel: r8169 0000:09:00.0: no dedicated PHY driver found for PHY ID 0x001cc912, maybe realtek.ko needs to be added to initramfs?
#       kernel: r8169: probe of 0000:09:00.0 failed with error -49
#
# This fix is only temporary for buggy kernels until a better kernel shows up that fixes this.  There are a lot of assumptions here.
#
# grep succeeds when eth0 IS present, so negate it to detect the missing interface.
if ! ip a | grep -Eqi eth0; then
        echo "ERROR: eth0 was not present.  Removing r8169 and reinserting it again to 'fix' the problem." | tee -a /var/log/messages;
        rmmod r8169;
        modprobe r8169;
else
        echo "GOOD: eth0 was present.  Not necessary to reinsert the r8169 kernel module." | tee -a /var/log/messages;
fi
[root@mbpc-pc 5.11.13]#

Hope it works for you too!

 

gnome-session[2995]: WARNING: Unable to find required component ‘gnome-shell’

Gnome session issues:

Apr  9 17:02:55 rfc1178-01 systemd[1]: Cannot add dependency job for unit sys-kernel-security.mount, ignoring: Unit sys-kernel-security.mount failed to load: Invalid argument. See system logs and 'systemctl status sys-kernel-security.mount' for details.
Apr  9 17:02:55 rfc1178-01 systemd[1]: Stopping Remote desktop service (VNC)…
Apr  9 17:02:55 rfc1178-01 gnome-session[2885]: gnome-session[2885]: Gdk-WARNING: gnome-session: Fatal IO error 11 (Resource temporarily unavailable) on X server :1.
Apr  9 17:02:55 rfc1178-01 gnome-session[2885]: Gdk-WARNING: gnome-session: Fatal IO error 11 (Resource temporarily unavailable) on X server :1.
Apr  9 17:02:55 rfc1178-01 gnome-session[2885]: (gnome-session-failed:2915): Gdk-WARNING **: gnome-session-failed: Fatal IO error 11 (Resource temporarily unavailable) on X server :1.
Apr  9 17:02:55 rfc1178-01 systemd[1]: Starting Remote desktop service (VNC)…
Apr  9 17:02:55 rfc1178-01 runuser[2941]: Set Bell volume to 'mute' (0).
Apr  9 17:02:55 rfc1178-01 runuser[2941]: xset:  unable to open display ""
Apr  9 17:02:58 rfc1178-01 runuser[2941]: New 'rfc1178-01:1 (root)' desktop is rfc1178-01:1
Apr  9 17:02:58 rfc1178-01 runuser[2941]: Starting applications specified in /root/.vnc/xstartup
Apr  9 17:02:58 rfc1178-01 runuser[2941]: Log file is /root/.vnc/rfc1178-01:1.log
Apr  9 17:02:58 rfc1178-01 systemd[1]: Started Remote desktop service (VNC).
Apr  9 17:02:58 rfc1178-01 gnome-session[2995]: gnome-session[2995]: WARNING: Unable to find required component 'gnome-shell'
Apr  9 17:02:59 rfc1178-01 gnome-session[2995]: WARNING: Unable to find required component 'gnome-shell'
Apr  9 17:02:59 rfc1178-01 gnome-session[2995]: Entering running state
Apr  9 17:02:59 rfc1178-01 gnome-session[2995]: Fontconfig warning: "/etc/fonts/conf.d/50-user.conf", line 14: reading configurations from ~/.fonts.conf is deprecated.

Fix by installing missing components:

# yum install gnome-shell.i686

Restart the VNC server:

# systemctl restart vncserver@:1

HTH,

FATA[0000] listing images: rpc error: code = Unknown desc = layer not known

Getting this?

[root@rhcpm03 ~]# crictl images
FATA[0000] listing images: rpc error: code = Unknown desc = layer not known
[root@rhcpm03 ~]#

Solve it with this:

mv /var/lib/containers/storage/overlay-images /root/

and reboot.  Result:

[root@rhcpm03 ~]# crictl images
IMAGE                                                   TAG                 IMAGE ID            SIZE
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256   <none>              efffc0c0c9d49       272MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256   <none>              cb6a3de089555       706MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256   <none>              6b15709214373       320MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256   <none>              91965ed632cd9       346MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256   <none>              cf9a3590794c6       324MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256   <none>              f93d86f4137d2       342MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256   <none>              1ca3399715bd7       325MB
[root@rhcpm03 ~]#

And all of a sudden, containers are listed as well:

[root@rhcpm03 member]# crictl ps -a
CONTAINER           IMAGE                                                              CREATED              STATE               NAME                                          ATTEMPT             POD ID
3b15c78e27081       91965ed632cd91f053532cefc7d8b853a2bb5b8b6d1768c8ad473fe404b9338d   4 seconds ago        Running             etcd                                          3                   173e01bafa268
da9cd9852082d       91965ed632cd91f053532cefc7d8b853a2bb5b8b6d1768c8ad473fe404b9338d   45 seconds ago       Exited              etcd                                          2                   173e01bafa268
3690e2c6d4559       1ca3399715bd78d327040cdb7ab42b9a327f4d5fb051a82803530b0fad881688   55 seconds ago       Exited              kube-apiserver-check-endpoints                1                   3acefe2582471
9622b9edb4c41       cb6a3de0895557add3a8c06853be26361d7986aa58c04c451d5f28cd606ea13a   56 seconds ago       Exited              kube-apiserver                                1                   3acefe2582471
45ec2a0aaa2b9       f93d86f4137d2ad194edecb73424a4ce373c6a1bd11e6b4588b5afe601c1553d   About a minute ago   Running             kube-scheduler-recovery-controller            0                   59483f323a340
9a4fa1095ef6c       f93d86f4137d2ad194edecb73424a4ce373c6a1bd11e6b4588b5afe601c1553d   About a minute ago   Running             kube-scheduler-cert-syncer                    0                   59483f323a340
cfa4fc1624ee4       cb6a3de0895557add3a8c06853be26361d7986aa58c04c451d5f28cd606ea13a   About a minute ago   Running             kube-scheduler                                0                   59483f323a340
04dd78dbf1de3       91965ed632cd91f053532cefc7d8b853a2bb5b8b6d1768c8ad473fe404b9338d   2 minutes ago        Running             etcd-metrics                                  0                   173e01bafa268
8c49f34fa26da       1ca3399715bd78d327040cdb7ab42b9a327f4d5fb051a82803530b0fad881688   2 minutes ago        Running             kube-apiserver-insecure-readyz                0                   3acefe2582471
2d01ad7011aad       91965ed632cd91f053532cefc7d8b853a2bb5b8b6d1768c8ad473fe404b9338d   2 minutes ago        Running             etcdctl                                       0                   173e01bafa268
2f6264de26076       1ca3399715bd78d327040cdb7ab42b9a327f4d5fb051a82803530b0fad881688   2 minutes ago        Running             kube-apiserver-cert-regeneration-controller   0                   3acefe2582471
a632ca35c1e47       1ca3399715bd78d327040cdb7ab42b9a327f4d5fb051a82803530b0fad881688   2 minutes ago        Running             kube-apiserver-cert-syncer                    0                   3acefe2582471
6038ee56e440d       cb6a3de0895557add3a8c06853be26361d7986aa58c04c451d5f28cd606ea13a   2 minutes ago        Exited              wait-for-host-port                            0                   59483f323a340
2f5afb3599e93       91965ed632cd91f053532cefc7d8b853a2bb5b8b6d1768c8ad473fe404b9338d   2 minutes ago        Exited              etcd-resources-copy                           0                   173e01bafa268
48b921c551021       cf9a3590794c69413a2d77f1c5fc3392f069412b310d522c27ad64a2fc53f2f7   2 minutes ago        Running             kube-controller-manager-recovery-controller   0                   77ee601e357e5
ba1d1ccd4eac5       cf9a3590794c69413a2d77f1c5fc3392f069412b310d522c27ad64a2fc53f2f7   2 minutes ago        Running             kube-controller-manager-cert-syncer           0                   77ee601e357e5
796bc713f02df       6b1570921437331098e754bec919e2682c1a2aff7aa54a25b56d1f9d3a282957   2 minutes ago        Running             cluster-policy-controller                     0                   77ee601e357e5
f1ea6c341b86c       cb6a3de0895557add3a8c06853be26361d7986aa58c04c451d5f28cd606ea13a   2 minutes ago        Running             kube-controller-manager                       0                   77ee601e357e5
a4af04ce65cea       cb6a3de0895557add3a8c06853be26361d7986aa58c04c451d5f28cd606ea13a   2 minutes ago        Exited              setup                                         0                   3acefe2582471
3d3a45e580fc1       91965ed632cd91f053532cefc7d8b853a2bb5b8b6d1768c8ad473fe404b9338d   2 minutes ago        Exited              etcd-ensure-env-vars                          0                   173e01bafa268
[root@rhcpm03 member]#
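
With this many containers it is easier to count states than to read the table. A rough sketch, assuming the table layout shown above; summarize_states is a hypothetical helper name, not a crictl feature:

```shell
#!/usr/bin/env bash
# summarize_states (hypothetical helper): reads `crictl ps -a` table output
# on stdin and prints one "STATE count" line per container state.
summarize_states() {
  awk 'NR > 1 {
         # CREATED spans several words ("About a minute ago"), so scan for
         # the first field that is a known state instead of a fixed column.
         for (i = 1; i <= NF; i++)
           if ($i ~ /^(Running|Exited|Created|Unknown)$/) { count[$i]++; break }
       }
       END { for (s in count) print s, count[s] }' | sort
}

# Usage on the node, as root:
#   crictl ps -a | summarize_states
```

Containers such as setup, wait-for-host-port and etcd-resources-copy are expected to show as Exited, so a non-zero Exited count alone is not a problem.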

 

Cheers,

github: fatal: Authentication failed for

Seeing this?

github: fatal: Authentication failed for

Renew the GitHub token: GitHub Login -> Settings -> Developer Settings -> Personal Access Tokens

Mine expired today.

Use the new token in place of the password the next time git prompts for credentials!
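
To confirm a token is still accepted before blaming git, something like the following could be used. This is only a sketch: check_github_token and interpret_status are hypothetical helper names, and it assumes curl is installed and the token is passed as the first argument.

```shell
#!/usr/bin/env bash
# interpret_status (hypothetical helper): map an HTTP status code from the
# GitHub API to a short message.
interpret_status() {
  case "$1" in
    200) echo "token accepted" ;;
    401) echo "token rejected (expired or revoked)" ;;
    *)   echo "unexpected HTTP status: $1" ;;
  esac
}

# check_github_token (hypothetical helper): call the GitHub API with the
# token given as $1 and report whether it was accepted.
check_github_token() {
  local code
  code=$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $1" https://api.github.com/user)
  interpret_status "$code"
}

# Usage:
#   check_github_token "<your token here>"
```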

Cheers,

 

 

Customizing vi and vim rc files: ~/.virc and ~/.vimrc

If you're looking for a basic configuration:

cat ~/.vimrc
" Enable plugins
filetype plugin on

" Enable indentation
filetype indent on

" Turn syntax highlighting off.  (In case there's insufficient contrast between available colors.)
syntax off

" Set line numbering
set number

What about vi?  For vi, the above will also work; however, it is managed by the ~/.virc file:

cat ~/.virc
" Enable plugins
filetype plugin on

" Enable indentation
filetype indent on

" Turn syntax highlighting off.  (In case there's insufficient contrast between available colors.)
syntax off

" Set line numbering
set number
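
Since both files carry the same settings, they can be written from one place. A small sketch, assuming (as above) that vi picks up ~/.virc on your distribution; write_editor_rc is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# write_editor_rc (hypothetical helper): write the settings from this post
# to the file given as $1, overwriting whatever is there.
write_editor_rc() {
  cat > "$1" <<'EOF'
" Enable plugins
filetype plugin on

" Enable indentation
filetype indent on

" Turn syntax highlighting off.
syntax off

" Set line numbering
set number
EOF
}

# Usage (back up any existing files first):
#   write_editor_rc ~/.vimrc
#   write_editor_rc ~/.virc
```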

Cheers,
 

Patroni: FATAL:  could not connect to the primary server: server closed the connection unexpectedly

Getting this?

[root@psql04 ~]# tail -f /data/patroni/log/postgresql-Wed.log
2022-03-09 20:07:40.890 EST [27627] FATAL:  could not connect to the primary server: server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.
^C
[root@psql04 ~]#

On the target primary cluster, check whether SELinux is blocking HAProxy from connecting to PostgreSQL:

[root@psql07 patroni]# tail -f /var/log/audit/audit.log|grep -Ei denied
type=AVC msg=audit(1646874430.882:1393): avc:  denied  { name_connect } for  pid=1045 comm="haproxy" dest=5432 scontext=system_u:system_r:haproxy_t:s0 tcontext=system_u:object_r:postgresql_port_t:s0 tclass=tcp_socket permissive=0

Both logs should scroll as the Standby Cluster tries to make a connection to the Primary Cluster.  Resolve using:

grep AVC /var/log/audit/audit.log* |grep -Ei denied >/var/log/audit/audit.previous; cat /var/log/audit/audit.previous |  audit2allow -M systemd-allow; semodule -i systemd-allow.pp
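
Note that the one-liner above feeds every denial in the audit logs to audit2allow, not just the HAProxy ones. A narrower variant could look like the sketch below. The helper names and the module name haproxy_local are made up for illustration; audit2allow -M and semodule -i are the same calls used above.

```shell
#!/usr/bin/env bash
# haproxy_denials (hypothetical helper): reads audit log lines on stdin and
# keeps only AVC denials raised by the haproxy process.
haproxy_denials() {
  grep 'type=AVC' | grep -i 'denied' | grep 'comm="haproxy"'
}

# build_haproxy_module (hypothetical helper): build a policy module from
# those denials and load it. Run as root on the HAProxy host.
# haproxy_local is an assumed module name.
build_haproxy_module() {
  cat /var/log/audit/audit.log* | haproxy_denials \
    | audit2allow -M haproxy_local \
    && semodule -i haproxy_local.pp
}

# Usage (as root): review what would be allowed first, then build.
#   cat /var/log/audit/audit.log* | haproxy_denials
#   build_haproxy_module
```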

Better yet, allow all HAProxy traffic by making the haproxy_t domain permissive:

semanage permissive -a haproxy_t
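
To confirm the domain really is permissive, the list output can be checked. A sketch only; is_haproxy_permissive is a hypothetical helper that reads the output of semanage permissive -l:

```shell
#!/usr/bin/env bash
# is_haproxy_permissive (hypothetical helper): reads `semanage permissive -l`
# output on stdin; succeeds if haproxy_t is listed as a whole word.
is_haproxy_permissive() {
  grep -qw 'haproxy_t'
}

# Usage (as root):
#   semanage permissive -l | is_haproxy_permissive && echo "haproxy_t is permissive"
# To revert later:
#   semanage permissive -d haproxy_t
```

Keep in mind that a permissive domain is no longer confined by SELinux at all. Where the policy ships it, the haproxy_connect_any boolean (setsebool -P haproxy_connect_any 1) may be a narrower option.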

As soon as this is set, HAProxy connections work perfectly and, all of a sudden, the Patroni cluster is able to replicate just fine:

[root@psql04 ~]# tail -f /data/patroni/log/postgresql-Wed.log
                before or while processing the request.
2022-03-09 20:28:01.807 EST [30690] FATAL:  could not connect to the primary server: server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.
2022-03-09 20:28:06.816 EST [30703] LOG:  fetching timeline history file for timeline 1226 from primary server
2022-03-09 20:28:06.838 EST [30703] LOG:  started streaming WAL from primary at 32/55000000 on timeline 1225
2022-03-09 20:28:06.854 EST [30703] LOG:  replication terminated by primary server
2022-03-09 20:28:06.854 EST [30703] DETAIL:  End of WAL reached on timeline 1225 at 32/5508DE08.
2022-03-09 20:28:06.857 EST [27428] LOG:  new target timeline is 1226
2022-03-09 20:28:06.859 EST [30703] LOG:  restarted WAL streaming at 32/55000000 on timeline 1226
^C
[root@psql04 ~]#  patronictl --config-file=/etc/patroni.yml list
+-------------+--------------------+----------------+---------+------+-----------+-----------------+
| Member      | Host               | Role           | State   |   TL | Lag in MB | Pending restart |
+ Cluster: postgres (6617627977882355208) ----------+---------+------+-----------+-----------------+
| postgresql0 | psql04.nix.mds.xyz | Standby Leader | running | 1226 |           | *               |
| postgresql1 | psql05.nix.mds.xyz | Replica        | running | 1226 |         0 | *               |
| postgresql2 | psql06.nix.mds.xyz | Replica        | running | 1226 |         0 | *               |
+-------------+--------------------+----------------+---------+------+-----------+-----------------+
[root@psql04 ~]#
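
For a quick health check, the table can be parsed instead of read by eye. A rough sketch, assuming the column layout shown above (State is the fourth column); check_patroni_members is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# check_patroni_members (hypothetical helper): reads `patronictl list` table
# output on stdin, reports members that are not running, and returns
# non-zero if any are found.
check_patroni_members() {
  awk -F'|' '
    # Data rows have 9 pipe-separated fields; border rows have none.
    NF >= 8 && $2 !~ /^ *Member *$/ {
      name = $2;  gsub(/^ +| +$/, "", name)
      state = $5; gsub(/^ +| +$/, "", state)
      total++
      if (state != "running") { bad++; print "not running: " name }
    }
    END {
      printf "%d/%d members running\n", total - bad, total
      exit (bad > 0)
    }'
}

# Usage:
#   patronictl --config-file=/etc/patroni.yml list | check_patroni_members
```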

Cheers,
 


     
  Copyright © 2003 - 2013 Tom Kacperski (microdevsys.com). All rights reserved.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License