CloudWatch
💡 Definition
Amazon CloudWatch is a monitoring and management service that provides data and actionable insights for AWS, hybrid, and on-premises applications and infrastructure resources. It collects monitoring and operational data in the form of logs, metrics, and events.
🔑 Key Concepts
- Metrics: Time-ordered sets of data points representing a variable (e.g., CPU utilization of an EC2 instance, number of requests to a Lambda function).
- Logs: Collects and stores log files from AWS services (e.g., EC2 instances, Lambda functions) and custom applications.
- Alarms: Watches a single metric over a time period you specify, and performs one or more actions based on the value of the metric relative to a threshold.
- Dashboards: Customizable home pages in the CloudWatch console that you can use to monitor your resources in a single view.
- EventBridge (formerly CloudWatch Events): Delivers a near real-time stream of system events that describe changes in AWS resources.
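To make the alarm concept concrete, here is a minimal sketch of threshold evaluation in plain Python. It is a simplification, not CloudWatch's actual evaluation logic: `AlarmSpec` and `evaluate_alarm` are hypothetical names, missing-data handling is ignored, and only a "greater than threshold" comparison is modeled. The three return values mirror the alarm states CloudWatch reports (OK, ALARM, INSUFFICIENT_DATA).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AlarmSpec:
    """Hypothetical, simplified stand-in for a CloudWatch alarm definition."""
    threshold: float
    evaluation_periods: int    # how many of the most recent periods to inspect
    datapoints_to_alarm: int   # how many of those periods must breach


def evaluate_alarm(spec: AlarmSpec, datapoints: Sequence[float]) -> str:
    """Return "ALARM", "OK", or "INSUFFICIENT_DATA" for a metric series.

    Each datapoint is one period's aggregated value (e.g. average CPU %).
    A datapoint breaches when it is strictly greater than the threshold.
    """
    if len(datapoints) < spec.evaluation_periods:
        return "INSUFFICIENT_DATA"
    window = datapoints[-spec.evaluation_periods:]
    breaching = sum(1 for value in window if value > spec.threshold)
    return "ALARM" if breaching >= spec.datapoints_to_alarm else "OK"


# Alarm when CPU exceeds 80% in at least 2 of the last 3 periods.
cpu_alarm = AlarmSpec(threshold=80.0, evaluation_periods=3, datapoints_to_alarm=2)

print(evaluate_alarm(cpu_alarm, [40.0, 85.0, 70.0, 91.0]))  # ALARM (85 and 91 breach)
print(evaluate_alarm(cpu_alarm, [40.0, 50.0, 85.0, 60.0]))  # OK (only 85 breaches)
print(evaluate_alarm(cpu_alarm, [95.0]))                    # INSUFFICIENT_DATA
```

Requiring several breaching periods rather than one is what keeps a brief spike from triggering an action.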
⚙️ How it Works
- Collect Data: CloudWatch automatically collects metrics from most AWS services and allows you to send custom metrics and logs.
- Visualize: Use dashboards to see performance trends.
- Alarm & Act: Set alarms to notify you or automatically trigger actions (e.g., scale an EC2 fleet using Auto Scaling).
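The collect and alarm steps above can be sketched with the boto3 SDK. This is an illustrative sketch under assumptions: the namespace `MyApp`, the metric name, the instance ID, and the SNS topic ARN are all hypothetical placeholders, and the helper function names are invented for this example. Only `put_metric_data` and `put_metric_alarm` are actual boto3 CloudWatch client calls. The request builders are plain Python, so they can be inspected without AWS credentials.

```python
from datetime import datetime, timezone


def build_metric_datum(name, value, unit, dimensions):
    """Build one entry for the MetricData list of put_metric_data."""
    return {
        "MetricName": name,
        "Dimensions": list(dimensions),
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
    }


def build_alarm_request(alarm_name, namespace, metric_name, dimensions,
                        threshold, action_arns):
    """Build keyword arguments for put_metric_alarm.

    The alarm uses the same namespace, metric name, and dimensions as the
    published metric, because together they identify the metric.
    """
    return {
        "AlarmName": alarm_name,
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": list(dimensions),
        "Statistic": "Average",
        "Period": 300,               # seconds per evaluation period
        "EvaluationPeriods": 2,      # consecutive periods to evaluate
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": list(action_arns),  # e.g. an SNS topic or scaling policy
    }


def publish(namespace, datum, alarm_request):
    """Send the metric and create the alarm (needs boto3 and AWS credentials)."""
    import boto3  # third-party SDK, imported lazily so the builders work without it

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(Namespace=namespace, MetricData=[datum])
    cloudwatch.put_metric_alarm(**alarm_request)


# Hypothetical values for illustration only.
dims = [{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}]
datum = build_metric_datum("MemoryUtilization", 72.5, "Percent", dims)
alarm = build_alarm_request(
    "high-memory", "MyApp", "MemoryUtilization", dims, 80,
    ["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
)
# publish("MyApp", datum, alarm)  # uncomment in an environment with AWS access
```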
🎯 Use Cases
- Resource Monitoring: Tracking CPU and network I/O of EC2 instances. Memory and disk-space usage are not published by default and require the CloudWatch agent.
- Application Health: Monitoring request rates, error rates, and latency for web applications.
- Automated Scaling: Triggering Auto Scaling policies based on CPU utilization.
- Troubleshooting: Analyzing logs to diagnose issues in applications or infrastructure.
💰 Pricing Model
- Basic Monitoring (Free Tier): Metrics at 5-minute frequency are free; the free tier has also included a monthly allowance of 10 custom metrics, 10 alarms, and 3 dashboards.
- Custom Metrics, High-Resolution Metrics, More Alarms/Dashboards: Charged based on quantity.
- Log Storage & Ingestion: Charged per GB.
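Because the paid items are billed by quantity, a monthly estimate is simple arithmetic. The sketch below assumes a flat rate per item with no free-tier allowance or volume tiers; `Rates` and `estimate_monthly_cost` are hypothetical names, and the rates in the usage example are placeholders chosen for easy arithmetic, not real AWS prices.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Per-unit monthly rates. Fill in from the current AWS pricing page."""
    per_custom_metric: float
    per_alarm: float
    per_gb_ingested: float
    per_gb_stored: float


def estimate_monthly_cost(rates, custom_metrics, alarms, gb_ingested, gb_stored):
    """Sum quantity times rate for each billed item (flat rates, no free tier)."""
    return (
        custom_metrics * rates.per_custom_metric
        + alarms * rates.per_alarm
        + gb_ingested * rates.per_gb_ingested
        + gb_stored * rates.per_gb_stored
    )


# Placeholder rates, NOT real AWS prices.
placeholder = Rates(per_custom_metric=1.0, per_alarm=0.5,
                    per_gb_ingested=0.25, per_gb_stored=0.125)

# 20*1.0 + 10*0.5 + 8*0.25 + 16*0.125 = 20 + 5 + 2 + 2
print(estimate_monthly_cost(placeholder, custom_metrics=20, alarms=10,
                            gb_ingested=8, gb_stored=16))  # 29.0
```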
📝 Exam Tips (CLF-C02)
- CloudWatch focuses on monitoring performance and operational health.
- Collects logs, metrics, and events.
- Can trigger actions based on alarms.
- Remember its integration with Auto Scaling.
See Also: CloudTrail, Systems Manager, Auto Scaling