Process of Using AOM
Application Operations Management (AOM) is a one-stop, multi-dimensional O&M management platform for cloud applications. It monitors applications and related cloud resources in real time, collects and correlates resource metrics, logs, and events to analyze application health, and provides flexible alarm reporting and data visualization. This helps you detect faults promptly and keep track of the running status of applications, resources, and services.
To get started quickly, subscribe to AOM and install the ICAgent. This guide describes both steps, using a purchased Elastic Cloud Server (ECS) as an example.
Figure 1 shows the process of using AOM.
- (Mandatory) Subscribing to AOM
You need to subscribe to AOM before using it. For details, see Subscribing to AOM.
- (Mandatory) Installing the ICAgent
ICAgent is the collector of AOM. It runs on each host and collects metrics, logs, and application performance data in real time. AOM cannot be used until the ICAgent is installed. For details, see Installing the ICAgent.
- (Optional) Configuring Service Discovery Rules
Services that meet Built-in Service Discovery Rules are automatically discovered after the ICAgent is installed. For services that the built-in rules cannot discover, configure custom service discovery rules. For details, see Custom Service Discovery Rules.
You can customize a service discovery rule by setting the command parameters, processes, and environment variables of services. When a process on the host meets the rule, the ICAgent automatically collects the metric data of that process and displays the data on the AOM console.
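The matching idea behind a custom rule can be illustrated with a minimal sketch. It is not the ICAgent implementation or the AOM rule format: the `Process` and `DiscoveryRule` classes, their field names, and the sample processes are all hypothetical, and the sketch assumes that every condition set in a rule must hold for a process to be discovered.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Simplified view of a host process (hypothetical model, not an AOM type)."""
    name: str
    cmdline: str
    env: dict = field(default_factory=dict)


@dataclass
class DiscoveryRule:
    """Hypothetical rule: every condition that is set must be satisfied."""
    cmdline_contains: str = ""
    process_name: str = ""
    required_env: dict = field(default_factory=dict)

    def matches(self, proc: Process) -> bool:
        # Command parameter check: the given text must appear in the command line.
        if self.cmdline_contains and self.cmdline_contains not in proc.cmdline:
            return False
        # Process check: the process name must be identical.
        if self.process_name and self.process_name != proc.name:
            return False
        # Environment variable check: each variable must exist with the given value.
        return all(proc.env.get(k) == v for k, v in self.required_env.items())


def discover(rule: DiscoveryRule, processes: list) -> list:
    """Return the processes that meet the rule."""
    return [p for p in processes if rule.matches(p)]


# Sample data, invented for illustration only.
processes = [
    Process("java", "java -jar /opt/app/order-service.jar", {"APP_ENV": "prod"}),
    Process("java", "java -jar /opt/app/cart-service.jar", {"APP_ENV": "test"}),
    Process("nginx", "nginx -g daemon off;"),
]
rule = DiscoveryRule(
    cmdline_contains="-jar",
    process_name="java",
    required_env={"APP_ENV": "prod"},
)
matched = discover(rule, processes)
print([p.cmdline for p in matched])
```

In this sketch only the process that satisfies all three conditions is selected; the actual rule syntax and matching semantics are described in Custom Service Discovery Rules.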
- (Optional) Configuring Log Collection Paths
To use the logs of monitored hosts, configure log collection paths. The ICAgent then collects host logs from the configured paths and displays them on the AOM console, where you can search them. For details, see Configuring a Log Collection Path for VMs.
- (Optional) Subscribing to APM
In addition to its basic functions, AOM integrates functions of Application Performance Management (APM), such as topology, tracing, device-side analysis, and abnormal SQL analysis, to implement advanced monitoring. AOM helps O&M personnel quickly locate problems and performance bottlenecks in a distributed architecture, which helps ensure a good user experience.
To use AOM to monitor application performance, subscribe to APM and connect your applications to APM. For details, see APM Getting Started.
- (Optional) O&M
You can use AOM dashboards, threshold rules, and notification rules to implement routine O&M.
- Dashboards: A dashboard displays different graphs on the same screen. Graphs such as line graphs, digital graphs, and top N resource graphs present resource data so that you can monitor it comprehensively. For details about how to create a dashboard, see Dashboard.
- Threshold rules: After you set a threshold rule, AOM generates a threshold alarm when a metric value meets the threshold criteria, and reports an insufficient data event when no metric data is reported. This lets you discover and handle exceptions as early as possible. For details about how to create a threshold rule, see Creating a Static Threshold Rule.
- Notification rules: After you set a notification rule, alarm information is sent to the specified personnel by email or Short Message Service (SMS) message whenever an alarm is reported because of an exception in AOM or an external service. These personnel can then rectify faults in time to avoid service loss. For details about how to create a notification rule, see Alarm Notification.
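The way threshold rules and notification rules work together can be sketched as follows. This is an illustrative model only, not AOM's API or evaluation logic: the function names, the state strings, the "greater than" comparison, and the requirement that the threshold be exceeded for several consecutive periods are all assumptions made for the example.

```python
def evaluate(samples, threshold, periods=3):
    """Classify the most recent samples of a metric.

    samples: metric values in time order; None marks a period with no data.
    Returns "insufficient_data", "alarm", or "normal" (hypothetical states).
    """
    window = samples[-periods:]
    reported = [v for v in window if v is not None]
    # No metric data reported in the window: insufficient data event.
    if not reported:
        return "insufficient_data"
    # Assumed criterion: every period in a full window exceeds the threshold.
    if len(reported) == periods and all(v > threshold for v in reported):
        return "alarm"
    return "normal"


def notify(state, recipients):
    """Return the messages a notification rule would send for an alarm.

    recipients: list of (channel, address) pairs, such as email or SMS.
    """
    if state != "alarm":
        return []
    return [f"{channel}:{address}" for channel, address in recipients]


# Placeholder recipients, invented for illustration only.
recipients = [("email", "ops@example.com"), ("sms", "+10000000000")]

cpu_usage = [50, 85, 90, 95]  # sample CPU usage values in percent
state = evaluate(cpu_usage, threshold=80)
print(state, notify(state, recipients))
```

Separating the evaluation step from the notification step mirrors the split between threshold rules and notification rules: the first decides whether an alarm exists, and the second decides who is told about it.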