

ERROR    (10 skipped) Error sending messages to firehose (retry): mgmt-HOSTMONITOR

Getting this?

[24/May/2020 23:08:13 +0000] 5385 MonitorDaemon-Reporter throttling_logger ERROR    (10 skipped) Error sending messages to firehose (retry): mgmt-HOSTMONITOR-a6c8a202b717eae93da5e0a53f184c3a
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/monitor/firehose.py", line 125, in _send
    self._requestor.request('sendAgentMessages', dict(messages=UNICODE_SANITIZER(messages)))
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 141, in request
    return self.issue_request(call_request, message_name, request_datum)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 254, in issue_request
    call_response = self.transceiver.transceive(call_request)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 483, in transceive
    result = self.read_framed_message()
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 487, in read_framed_message
    response = self.conn.getresponse()
  File "/usr/lib64/python2.7/httplib.py", line 1113, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 444, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 408, in _read_status
    raise BadStatusLine(line)
BadStatusLine: ''

Modify the logging line in firehose.py (the file at the top of the traceback) slightly to see exactly which host and port it's trying:

    try:
      if self._requestor is None:
        self._transceiver = avro.ipc.HTTPTransceiver(self._address,
                                                     self._port)
        self._requestor = avro.ipc.Requestor(FIREHOSE_MESSAGE_PROTOCOL,
                                             self._transceiver)
      initial_requestor_bytes = self._requestor.get_requestor_bytes_sent()
      self._requestor.request('sendAgentMessages', dict(messages=UNICODE_SANITIZER(messages)))
      self._last_message_transmit_duration_gauge.set_value(
        (time.time() - start) * 1000)
      self._message_transmit_succeeded_counter.increment()
      self._requestor_bytes_sent.increment(
        self._requestor.get_requestor_bytes_sent() - initial_requestor_bytes)
      return True
    except BadStatusLine, ex:
      # We've lost our connection. In practice this usually means the server has
      # closed a connection that we expect to be open because of HTTP keep-alive.
      # We will do a single silent retry. If the problem persists there, we'll
      # log.
      self._reset()
      if retryOnBadStatusLine:
        return self._send(messages, retryOnBadStatusLine=False)
      self._message_transmit_failed_counter.increment()
      # THROTTLED_LOG.exception("Error sending messages to firehose (retry): " +
      #                        self.name)

      THROTTLED_LOG.exception("Error sending messages to firehose (retry): %s .  Address: %s .  Port: %s" % ( self.name, self._address, self._port ))
      return False
    except Exception:
      THROTTLED_LOG.exception("Error sending messages to firehose: " + self.name)
      self._reset()
      self._message_transmit_failed_counter.increment()
      return False
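
The except BadStatusLine branch above is a "retry once, silently" pattern for stale keep-alive connections. As a standalone illustration only (written for Python 3, where httplib became http.client; send_with_one_retry, send and reset are hypothetical names and not part of the Cloudera agent), the control flow looks roughly like this:

```python
import http.client


def send_with_one_retry(send, reset, retry=True):
    """Call send(); on BadStatusLine, reset the connection and retry once.

    send and reset are caller-supplied callables (hypothetical stand-ins for
    the agent's request call and its _reset method).
    Returns True on success, False if the retry fails as well.
    """
    try:
        send()
        return True
    except http.client.BadStatusLine:
        # The server most likely closed a connection we assumed was alive.
        reset()
        if retry:
            return send_with_one_retry(send, reset, retry=False)
        # Second failure in a row: this is where the agent logs the error.
        return False
```

Because the first failure is swallowed, the "(retry)" error only shows up when two consecutive attempts fail, which points at a persistent problem rather than an occasional stale connection.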

Now when you start things up, you'll get some more meaningful messages:

[24/May/2020 23:26:07 +0000] 6934 MonitorDaemon-Reporter firehoses    INFO     Creating a connection to the HOSTMONITOR.
[24/May/2020 23:26:08 +0000] 6934 MonitorDaemon-Reporter throttling_logger ERROR    Error sending messages to firehose (retry): mgmt-HOSTMONITOR-a6c8a202b717eae93da5e0a53f184c3a .  Address: cm-r01en02.mws.mds.xyz .  Port: 9995
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/monitor/firehose.py", line 125, in _send
    self._requestor.request('sendAgentMessages', dict(messages=UNICODE_SANITIZER(messages)))
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 141, in request
    return self.issue_request(call_request, message_name, request_datum)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 254, in issue_request
    call_response = self.transceiver.transceive(call_request)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 483, in transceive
    result = self.read_framed_message()
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 487, in read_framed_message
    response = self.conn.getresponse()
  File "/usr/lib64/python2.7/httplib.py", line 1113, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 444, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 408, in _read_status
    raise BadStatusLine(line)
BadStatusLine: ''
^C
[root@cm-awn01 pki]# nc -vz cm-r01en02.mws.mds.xyz 9995
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 108.168.115.113:9995.
Ncat: 0 bytes sent, 0 bytes received in 0.05 seconds.
[root@cm-awn01 pki]#
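
The nc check only proves that the TCP port accepts connections. BadStatusLine: '' means the peer closed the connection without sending any HTTP status line. A minimal sketch to tell these cases apart (Python 3, standard library only; the name probe_http and the plain GET / request are assumptions made for illustration, since the agent itself sends Avro messages):

```python
import http.client


def probe_http(host, port, timeout=5.0):
    """Classify how host:port answers a plain HTTP request.

    Returns one of:
      "http-ok"          - some parseable HTTP status line came back
      "empty-response"   - TCP connected, but the reply was empty or not HTTP
                           (the BadStatusLine case from the traceback)
      "connection-error" - connecting or sending failed (refused, timeout, ...)
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        conn.getresponse()
        return "http-ok"
    except http.client.BadStatusLine:
        # Includes RemoteDisconnected: the peer closed without replying.
        return "empty-response"
    except OSError:
        return "connection-error"
    finally:
        conn.close()
```

Calling probe_http("cm-r01en02.mws.mds.xyz", 9995) from the agent host would then show whether the port speaks HTTP at all, which nc alone cannot tell you.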

Notice the address and port now reported in the error message above (cm-r01en02.mws.mds.xyz, port 9995).  Keeping them in mind, consider this HAProxy configuration:

listen cm9995
        log                             127.0.0.1:514   local0          debug
        bind                            srv-c01:9995
        mode tcp
        option tcplog
        server cm-r01en01.mws.mds.xyz cm-r01en01.mws.mds.xyz check
        server cm-r01en02.mws.mds.xyz cm-r01en02.mws.mds.xyz check

Notice that the HAProxy listener runs in TCP mode, but the agent talks to this port through avro.ipc.HTTPTransceiver, so perhaps CMA expects HTTP?  Try setting the listener's mode to HTTP.
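
A minimal sketch of that change, assuming the rest of the listener stays exactly as shown above: mode tcp becomes mode http, and option tcplog becomes option httplog so the log format matches the mode. This has not been tested against Cloudera Manager here; it is only the experiment suggested above, not a confirmed fix.

```haproxy
listen cm9995
        log                             127.0.0.1:514   local0          debug
        bind                            srv-c01:9995
        # Assumption: the agent's Avro traffic on port 9995 is plain HTTP
        mode http
        option httplog
        server cm-r01en01.mws.mds.xyz cm-r01en01.mws.mds.xyz check
        server cm-r01en02.mws.mds.xyz cm-r01en02.mws.mds.xyz check
```

Reload HAProxy after the change and watch the agent log to see whether the BadStatusLine errors stop.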

 



     
  Copyright © 2003 - 2013 Tom Kacperski (microdevsys.com). All rights reserved.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License