Linux | Cloud | DevOps | Scripting

Tuesday, 17 September 2019

Monitoring Tools


Monitoring:

There are two types of monitoring:

1. Local monitoring (on an individual system). For local monitoring we use built-in tools such as top, free, vmstat and sar.
2. Centralized monitoring (when we need to monitor an entire organization). For centralized monitoring we use the Simple Network Management Protocol (SNMP).
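To give a feel for what the local tools report, here is a minimal sketch that reads similar figures with Python's standard library. It assumes a Linux host, because it parses text in the /proc/meminfo format. The function names parse_meminfo and local_snapshot are made up for this illustration.

```python
import os
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: number}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        # Most fields look like "MemTotal:  1000 kB"; keep the leading number.
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def local_snapshot(meminfo_path="/proc/meminfo", disk_path="/"):
    """Collect a one-off snapshot similar to what top, free and df display."""
    with open(meminfo_path) as handle:
        mem = parse_meminfo(handle.read())
    disk = shutil.disk_usage(disk_path)
    load_1m, _, _ = os.getloadavg()  # Unix only
    return {
        "load_1m": load_1m,
        "mem_total_kb": mem.get("MemTotal"),
        "mem_available_kb": mem.get("MemAvailable"),
        "disk_used_pct": round(100 * disk.used / disk.total, 1),
    }
```

Calling local_snapshot() on the instance returns a small dictionary that could be printed or logged on a schedule.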

There are four services we can use for monitoring in AWS:
1. CloudWatch
2. CloudTrail
3. AWS Config (called 'CloudConfig' in this post)
4. Trusted Advisor

CloudWatch:

CloudWatch plays the role in AWS that an SNMP-based monitoring system plays on-premises. It is an active monitoring mechanism, meaning it can take action on what it observes. Basic monitoring is free and reports EC2 metrics at 5-minute intervals. Detailed monitoring (1-minute intervals) and high-resolution alarms are paid features. CloudWatch keeps metric data for a limited time: 1-minute data points for 15 days, 5-minute data points for 63 days and 1-hour data points for 455 days. It is not a long-term archive, and CloudTrail does not extend metric retention; CloudTrail records account activity, as described in its own section.
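The 5-minute versus 1-minute interval shows up as the Period value when metrics are read through the API. The sketch below builds a request for boto3's get_metric_statistics call. The helper names, the default region and the instance ID used in the usage note are assumptions for illustration, and the actual call needs boto3 and AWS credentials.

```python
from datetime import datetime, timedelta, timezone


def cpu_stats_request(instance_id, hours=1, period_seconds=300, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        # 300 seconds matches basic monitoring; 60 needs detailed monitoring.
        "Period": period_seconds,
        "Statistics": ["Average"],
    }


def fetch_cpu_stats(instance_id, region="us-east-1"):
    """Fetch the data points, oldest first (requires boto3 and credentials)."""
    import boto3  # imported lazily so the request builder works without it

    client = boto3.client("cloudwatch", region_name=region)
    response = client.get_metric_statistics(**cpu_stats_request(instance_id))
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

For example, fetch_cpu_stats("i-0123456789abcdef0") would return the last hour of 5-minute averages for that (placeholder) instance ID.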

Fig: CloudWatch
The thing we want to monitor is known as a 'metric', for example CPU utilization, disk activity or network traffic. Based on metrics we can set alarms and notifications, and CloudWatch can act on them, for example by stopping, terminating or rebooting an instance, or by triggering Auto Scaling. The CloudWatch free tier includes 10 custom metrics and 10 alarms.
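The idea of an alarm acting on a metric can be sketched in a few lines. This is a simplified model, not CloudWatch's actual evaluation logic: it assumes a "lower than or equal to" condition, at least one evaluation period, and that every one of the most recent periods must breach the threshold. The function name alarm_state is made up for illustration.

```python
def alarm_state(datapoints, threshold, evaluation_periods=1):
    """Return 'ALARM', 'OK' or 'INSUFFICIENT_DATA' for a <= threshold alarm.

    datapoints: per-period averages, oldest first.
    evaluation_periods: number of most recent periods to check (>= 1).
    """
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    if all(value <= threshold for value in recent):
        return "ALARM"
    return "OK"
```

With a threshold of 10 and one evaluation period, a latest 5-minute average of 8.2 puts the alarm into the ALARM state, which is the point at which an action such as stopping the instance would fire.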

CloudTrail:

CloudWatch is active monitoring and can take action, whereas CloudTrail is a passive log: it records the API activity in the account (who came in, from where, and what they did). The Event history view keeps the last 90 days of management events at no charge. To keep logs for longer, we create a trail that delivers them to an S3 bucket; the first copy of management events is free, but we pay the bucket charges. CloudTrail is mainly used for governance and auditing.
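To make "who came in, from where and did what" concrete, the sketch below pulls those fields out of a single CloudTrail record. The field names (userIdentity, sourceIPAddress, eventSource, eventName, eventTime) are part of the CloudTrail record format; the sample record itself is hypothetical and heavily trimmed, and summarize_event is a made-up helper name.

```python
import json

# Hypothetical, heavily trimmed CloudTrail record for illustration only.
SAMPLE_RECORD = """
{
  "eventTime": "2019-09-17T10:15:00Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "StopInstances",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {"type": "IAMUser", "userName": "example-user"}
}
"""


def summarize_event(record):
    """Reduce one CloudTrail record to who / from where / did what / when."""
    identity = record.get("userIdentity", {})
    who = (
        identity.get("userName")
        or identity.get("arn")
        or identity.get("type", "unknown")
    )
    source = record.get("eventSource", "?")
    action = record.get("eventName", "?")
    return {
        "who": who,
        "from_where": record.get("sourceIPAddress", "unknown"),
        "did_what": f"{source}:{action}",
        "when": record.get("eventTime", "unknown"),
    }
```

Running summarize_event(json.loads(SAMPLE_RECORD)) gives a compact audit line for the sample event; records delivered to an S3 bucket could be summarized the same way.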

AWS Config (CloudConfig):

AWS Config, called CloudConfig in this post, monitors only configuration changes, such as a port change or a policy change. It maintains a timeline of the changes made to the configuration of the cloud environment and shows them in chronological order. It can relate each change to the corresponding CloudTrail events, but presents the data as a timeline, so we can pick a point in time and see what was changed at that moment.
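The timeline idea can be illustrated without touching AWS at all. The sketch below is not the AWS Config API; it takes a list of hypothetical configuration snapshots and reports what changed between consecutive ones in chronological order. The function name config_timeline and the setting names in the usage note are invented for the example.

```python
def config_timeline(snapshots):
    """List configuration changes in chronological order.

    snapshots: list of (iso_timestamp, settings_dict) in any order.
    Returns a list of (timestamp, setting, old_value, new_value) tuples.
    """
    ordered = sorted(snapshots, key=lambda item: item[0])
    changes = []
    # Compare each snapshot with the one that follows it.
    for (_, before), (when, after) in zip(ordered, ordered[1:]):
        for key in sorted(set(before) | set(after)):
            if before.get(key) != after.get(key):
                changes.append((when, key, before.get(key), after.get(key)))
    return changes
```

Given snapshots where ssh_port goes from 22 to 2222 at 11:00 and policy goes from "read-only" to "admin" at 12:00, the function returns those two changes in that order, which is the kind of view the service's timeline provides for real resources.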

Trusted Advisor:

It's an account-level advisory tool. Trusted Advisor reviews the whole AWS environment and makes recommendations in five categories: cost optimization, performance, security, fault tolerance (HA) and service limits. A core set of security and service-limit checks is free; the full set of checks requires a Business or Enterprise support plan.

PRACTICAL:

For a basic understanding, we are going to create an alarm in CloudWatch that stops the instance when its average CPU utilization is 10% or lower. An idle test instance meets this condition almost immediately, which makes the alarm easy to verify.

Steps we need to follow:

1. Create an instance.
2. Create an alarm.
3. Verify the condition.

Step 1: Create an instance:

AWS ➔ Services ➔ EC2 ➔ Launch Instance ➔ tick 'Free tier only' ➔ select the Amazon Linux AMI ➔ Review and Launch ➔ Launch ➔ Provide Key-pair ➔ Launch instance.

Now, give the instance a name such as 'linux-box'.
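The same launch can be scripted. The sketch below builds the arguments for boto3's run_instances call and tags the instance with the name 'linux-box'. The AMI ID and key-pair name must be supplied by you (none are given in this post), the t2.micro instance type and the default region are assumptions, and launch_request and launch_instance are made-up helper names.

```python
def launch_request(ami_id, key_name, name="linux-box", instance_type="t2.micro"):
    """Build keyword arguments for EC2 run_instances (one tagged instance)."""
    return {
        "ImageId": ami_id,              # e.g. an Amazon Linux AMI in your region
        "InstanceType": instance_type,  # assumed free-tier eligible type
        "KeyName": key_name,
        "MinCount": 1,
        "MaxCount": 1,
        "TagSpecifications": [
            {
                "ResourceType": "instance",
                "Tags": [{"Key": "Name", "Value": name}],
            }
        ],
    }


def launch_instance(ami_id, key_name, region="us-east-1"):
    """Launch the instance and return its ID (requires boto3 and credentials)."""
    import boto3  # imported lazily so the request builder works without it

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**launch_request(ami_id, key_name))
    return response["Instances"][0]["InstanceId"]
```

The returned instance ID is the value needed later when filtering the per-instance metrics for the alarm.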

We can create the alarm in two ways:
A. From the EC2 console, by navigating to the instance's Monitoring tab followed by Create Alarm.
B. From the CloudWatch console.

If we go with Way A, the steps are:

AWS ➔ Services ➔ EC2 ➔ Select the instance ➔ Monitoring ➔ Create Alarm ➔ Select an SNS topic (if you do not have a topic registered, create one by following Step-2 in https://redhatpanacia.blogspot.com/2019/07/billing-alert.html) ➔ Take the action: [*] Stop this instance ➔ Whenever: Average of CPU Utilization ➔ Is: <= 10 Percent ➔ For at least: 1 consecutive period(s) of 5 Minutes ➔ Name of alarm: give any name or leave the default ➔ Create Alarm ➔ Close.

Fig: Create a CloudWatch Alarm
Now watch the clock: once the alarm has seen one 5-minute period at or below the threshold, the instance will be stopped.

Way B (our main concern: creating the alarm from CloudWatch):

Step 2: Create Alarm:

AWS ➔ Services ➔ CloudWatch ➔ Alarms ➔ Create alarm ➔ Select metric ➔ Under All metrics select EC2 ➔ Per-Instance Metrics ➔ the list is long, so filter it by pasting the Instance ID, then select CPUUtilization ➔ Select metric ➔ the metric is now selected; we only need to provide the condition on which the alarm should act ➔ Conditions: Threshold type: Static ➔ Whenever CPUUtilization is: [*] Lower/Equal ➔ than: 10…

Fig: Conditions
Next ➔ Select or create an SNS topic (see https://redhatpanacia.blogspot.com/2019/07/billing-alert.html) ➔ go to EC2 action ➔ Add EC2 action ➔ Alarm state: In Alarm ➔ [*] Stop this instance ➔ Next ➔ Alarm name: give any name ➔ Next ➔ Create alarm.
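For reference, the console steps for this alarm map onto a single put_metric_alarm call in boto3. The sketch below is one way to express it, not the only one: the alarm name pattern, the default region and the helper names are assumptions, and the optional SNS topic ARN is whatever topic you created earlier. The stop action uses the arn:aws:automate:REGION:ec2:stop form that CloudWatch accepts for EC2 actions.

```python
def stop_alarm_request(instance_id, region, topic_arn=None, threshold=10.0):
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    actions = [f"arn:aws:automate:{region}:ec2:stop"]  # stop the instance
    if topic_arn:
        actions.append(topic_arn)  # also notify the SNS topic
    return {
        "AlarmName": f"stop-{instance_id}-low-cpu",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # one 5-minute period
        "EvaluationPeriods": 1,
        "Threshold": threshold,   # percent
        "ComparisonOperator": "LessThanOrEqualToThreshold",
        "AlarmActions": actions,
    }


def create_stop_alarm(instance_id, region="us-east-1", topic_arn=None):
    """Create the alarm (requires boto3 and credentials)."""
    import boto3  # imported lazily so the request builder works without it

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(
        **stop_alarm_request(instance_id, region, topic_arn)
    )
```

Keeping the request in a plain dictionary makes it easy to review the threshold, period and actions before anything is created in the account.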

Step 3: Verify the condition

AWS ➔ Services ➔ EC2 ➔ As per the condition, after about 5 minutes we can see the instance moving to the stopped state.
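Instead of refreshing the console, the check can be polled from a script. This is a minimal sketch: wait_for_state is a generic, made-up helper that takes any function returning the current state, and instance_state shows one way to obtain that state through boto3's describe_instances. The timeout and interval values are arbitrary assumptions.

```python
import time


def wait_for_state(get_state, target="stopped", timeout=900, interval=30,
                   sleep=time.sleep):
    """Poll get_state() until it returns target; False if timeout is reached."""
    waited = 0
    while True:
        if get_state() == target:
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


def instance_state(instance_id, region="us-east-1"):
    """Return the instance state name, e.g. 'running' or 'stopped'."""
    import boto3  # imported lazily; requires boto3 and credentials

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_instances(InstanceIds=[instance_id])
    return response["Reservations"][0]["Instances"][0]["State"]["Name"]
```

A typical call would be wait_for_state(lambda: instance_state("i-0123456789abcdef0")), with the placeholder ID replaced by the real one.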

Enjoy!




