ALM-26051 Storm Service Unavailable

Updated at: Dec 31, 2019 GMT+08:00

Description

The system checks Storm service availability every 30 seconds. This alarm is generated if the Storm service becomes unavailable after all Nimbus nodes in the cluster become abnormal.

This alarm is cleared after the Storm service recovers.
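
The generate-and-clear behavior described above can be sketched as a small polling loop. This is only an illustration under stated assumptions, not the MRS implementation: `check_storm_available` is a hypothetical callable standing in for whatever health probe the system actually uses, and `watch_storm` is a name chosen for this example. Only the 30-second interval comes from the description.

```python
import time
from typing import Callable, Iterator

CHECK_INTERVAL_SECONDS = 30  # interval stated in the alarm description


def watch_storm(
    check_storm_available: Callable[[], bool],  # hypothetical health probe
    max_checks: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield "generated" or "cleared" each time the alarm state changes."""
    alarm_active = False
    for i in range(max_checks):
        available = check_storm_available()
        if not available and not alarm_active:
            # Service went down: the alarm is generated once.
            alarm_active = True
            yield "generated"
        elif available and alarm_active:
            # Service recovered: the alarm is cleared automatically.
            alarm_active = False
            yield "cleared"
        if i < max_checks - 1:
            sleep(CHECK_INTERVAL_SECONDS)
```

Passing the probe and the sleep function as arguments keeps the sketch testable without waiting 30 seconds between checks.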

Attribute

Alarm ID    Alarm Severity    Automatically Cleared
26051       Critical          Yes

Parameters

Parameter      Description
ServiceName    Specifies the service for which the alarm is generated.
RoleName       Specifies the role for which the alarm is generated.
HostName       Specifies the host for which the alarm is generated.

Impact on the System

  • The cluster cannot provide the Storm service.
  • Users cannot run new Storm tasks.

Possible Causes

  • The Kerberos component is faulty.
  • ZooKeeper is faulty or suspended.
  • The active and standby Nimbus nodes in the Storm cluster are abnormal.

Procedure

  1. Check the Kerberos component status. For clusters without Kerberos authentication, skip this step and go to 2.

    a. On the MRS cluster details page, click Components.

      For MRS 2.0.1 or earlier, log in to MRS Manager and click Services.

    b. Check whether the health status of the Kerberos service is Good.
      • If yes, go to 2.a.
      • If no, go to 1.c.
    c. Rectify the fault by following the instructions in ALM-25500 KrbServer Service Unavailable.
    d. Perform 1.b again.

  2. Check the ZooKeeper component status.

    a. Check whether the health status of the ZooKeeper service is Good.
      • If yes, go to 3.a.
      • If no, go to 2.b.
    b. If the ZooKeeper service is stopped, start it. For other problems, follow the instructions in ALM-13000 ZooKeeper Service Unavailable.
    c. Perform 2.a again.

  3. Check the status of the active and standby Nimbus nodes.

    a. Choose Components > Storm > Nimbus.
    b. In Role, check whether only one active Nimbus node exists.
      • If yes, go to 4.
      • If no, go to 3.c.
    c. Select the two Nimbus instances and choose More > Restart Instance. Check whether the restart is successful.
      • If yes, go to 3.d.
      • If no, go to 4.
    d. Log in to the MRS cluster details page again and choose Components > Storm > Nimbus. Check whether the health status of Nimbus is Good.
      • If yes, go to 3.e.
      • If no, go to 4.
    e. Wait 30 seconds and check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 4.

  4. Collect fault information.

    a. On MRS Manager, choose System > Export Log.
    b. Contact the O&M personnel and send the collected log information.
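
The branching in the procedure above can be summarized as a small decision function. This is a sketch for orientation only: `ClusterStatus` and `next_action` are hypothetical names invented for this example, the status values are assumed to be read manually from the console, and nothing here calls an MRS API.

```python
from dataclasses import dataclass


@dataclass
class ClusterStatus:
    """Values an operator reads from the console (hypothetical container)."""
    kerberos_enabled: bool    # False for clusters without Kerberos authentication
    kerberos_good: bool       # health status of the Kerberos service is Good
    zookeeper_good: bool      # health status of the ZooKeeper service is Good
    active_nimbus_count: int  # number of Nimbus instances shown as active


def next_action(status: ClusterStatus) -> str:
    """Return the next troubleshooting action for the given status."""
    # Step 1 is skipped for clusters without Kerberos authentication.
    if status.kerberos_enabled and not status.kerberos_good:
        return "Rectify Kerberos per ALM-25500, then recheck"
    # Step 2: ZooKeeper must be healthy before Nimbus is examined.
    if not status.zookeeper_good:
        return "Start or rectify ZooKeeper per ALM-13000, then recheck"
    # Step 3: anything other than exactly one active Nimbus triggers a restart.
    if status.active_nimbus_count != 1:
        return "Restart both Nimbus instances, then recheck"
    # Step 4: all checks pass but the alarm persists.
    return "Export logs and contact O&M personnel"
```

For example, `next_action(ClusterStatus(False, False, True, 0))` skips the Kerberos check and returns the Nimbus restart action.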

Related Information

N/A
