Tag Archives: Monitoring

Tools for Monitoring Application Logs

Monitoring application logs is an important part of any deployment and support cycle. You want to keep an eye on the logs to understand what is happening with your application. But these logs often amount to gigabytes of raw data, and making sense of them is not easy. Thankfully, there are many off-the-shelf tools available to help with this tedious task. I have already talked about ELK, which is a popular tool for log analytics. In this post, we will talk about some of the other popular tools and get an idea of how they can help us.

Splunk is a tool to collect and analyze logs. It has three core components: a forwarder, which forwards data to the Splunk server; an indexer, which takes the data and indexes it for faster search; and a search head, which queries the indexed data for relevant information. An important aspect of Splunk is that it scales horizontally as a Splunk cluster, so you can manage gigabytes of incoming log data.

Graylog is another option for log monitoring. You can stream your logs to Graylog, which uses MongoDB and Elasticsearch behind the scenes to provide fast and useful analysis.

Then there are specialized tools like Sumo Logic, which work on your log data and can provide additional analytics based on it. Such a tool can help you make sense of your logs as well as provide suggestions.

The list of log management, monitoring, and analysis tools is growing by the day as people recognize the need for and importance of log monitoring. Here are some additional resources for interested readers.
https://www.dnsstuff.com/free-log-management-tools
https://dzone.com/articles/top-10-log-management-tools-1
https://www.comparitech.com/net-admin/log-management-tools/

ELK Stack: Getting Started

In the last three posts, I talked about three popular off-the-shelf monitoring tools from cloud service providers: AWS CloudWatch, Azure Application Insights, and Azure Monitor. A discussion about monitoring cloud-native applications and microservices is incomplete without the ELK stack. The ELK stack provides end-to-end functionality, from capturing logs, to indexing them in a useful manner, to visualizing them in a form that makes sense. The three core components that make up the ELK stack are Elasticsearch, Logstash, and Kibana.

Image source – https://medium.com/devxchange/streaming-spring-boot-application-logs-to-elk-stack-part-1-a68bd7cccaeb

As the image above shows, the three tools forming the ELK stack work together: Logstash is responsible for the collection and transformation of logs, Elasticsearch indexes the logs and makes them searchable, and Kibana visualizes them as reports that are easy to make sense of.

Let’s take a look at these three components.

Elasticsearch: a popular search engine. It indexes data and supports quick searches. It is based on Apache Lucene and provides REST APIs for accessing data. It is highly scalable and reliable, and stores data as JSON documents in the manner of a NoSQL database.
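To give a feel for what "indexing for quick searches" means, here is a minimal sketch of an inverted index, the core data structure behind Lucene-style search. It is a toy illustration, not how Elasticsearch is actually implemented; the sample log lines and the function names (tokenize, build_index, search) are made up for this example.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase a log line and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # index.get avoids creating empty entries for unknown tokens.
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)


# Hypothetical log lines, keyed by an id.
logs = {
    1: "ERROR payment service timeout",
    2: "INFO payment accepted",
    3: "ERROR database connection refused",
}
index = build_index(logs)
print(search(index, "error payment"))  # only line 1 contains both words
```

The lookup touches only the entries for the query words instead of scanning every log line, which is why searches stay fast as the data grows.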

Logstash: provides connectors for various input sources and platforms, helping in the collection of log data from different sources. It can collect, parse, and manage a variety of structured and unstructured data.
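The parsing step can be pictured as turning a raw text line into named fields. The sketch below does this with a plain Python regular expression; it is not Logstash configuration, and the line format, field names, and sample line are assumptions chosen for illustration.

```python
import re

# Assumed line format: "<date> <time> <LEVEL> [<service>] <message>"
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"\[(?P<service>[^\]]+)\] "
    r"(?P<message>.*)"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# Hypothetical log line.
event = parse_line("2020-05-01 10:15:30 ERROR [orders] Payment gateway timed out")
print(event)
```

Once a line is a structured record like this, it can be indexed and filtered by field (for example, all ERROR events from one service) rather than searched as free text.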

Kibana: a visualization tool that provides various user-friendly options for reporting, such as graphs, bar charts, and tables. One can create and share dashboards for an easy understanding of data in the form of visual reports.
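A dashboard chart is essentially an aggregation over indexed events. As a rough sketch of the idea (plain Python, not Kibana itself), the snippet below counts events per log level and renders the counts as text bars; the event records and function names are hypothetical.

```python
from collections import Counter


def count_by_level(events):
    """Count events per log level, the kind of aggregation behind a bar chart."""
    return Counter(event["level"] for event in events)


def render_bars(counts):
    """Render counts as text bars, largest first, ties broken by level name."""
    lines = []
    for level, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        lines.append(f"{level:<5} {'#' * count} {count}")
    return "\n".join(lines)


# Hypothetical parsed events.
events = [
    {"level": "INFO"}, {"level": "ERROR"}, {"level": "INFO"},
    {"level": "WARN"}, {"level": "INFO"},
]
print(render_bars(count_by_level(events)))
```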

Additional resources:
https://www.youtube.com/watch?v=MRMgd6E9AXE
https://medium.com/devxchange/streaming-spring-boot-application-logs-to-elk-stack-part-1-a68bd7cccaeb
https://www.guru99.com/elk-stack-tutorial.html

Amazon CloudWatch

Once your application is deployed to production, monitoring is the only friend that can help you avoid embarrassing situations like a service not responding or an application running very slowly. You want monitoring and alerting systems in place so that you learn about a problem and fix it before you start hearing complaints from your end users. You also want automated systems in place to handle such issues.

Amazon CloudWatch is a service provided by AWS which can help us add monitoring for AWS resources.

image source https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_architecture.html

Let’s try to understand the above design. AWS services publish data to CloudWatch in the form of metrics. A metric here is a time-ordered series of data points for some aspect of a resource, such as CPU usage. CloudWatch processes the data and can display it as graphs and bar charts. One can also set alarms on certain conditions, such as CPU usage going beyond 75%. Based on an alarm, an action can be taken, such as sending an email notification to admins or autoscaling the application by adding a server to reduce CPU usage. One can also publish additional application data to CloudWatch for monitoring.
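The alarm idea can be sketched in a few lines of plain Python. This is a simplified model and not CloudWatch's actual evaluation logic, which has more options (for example, how missing data is treated); the function name and sample readings are assumptions, while the three state names match those CloudWatch uses.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Return "ALARM" if the most recent `evaluation_periods` datapoints all
    exceed `threshold`, "OK" if they do not, and "INSUFFICIENT_DATA" if there
    are fewer datapoints than periods to evaluate."""
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"


# Hypothetical CPU utilization readings (percent), oldest first.
cpu = [40.0, 62.5, 81.0, 78.2, 90.1]
print(alarm_state(cpu, threshold=75, evaluation_periods=3))
```

Requiring several consecutive breaches instead of one is a common way to avoid alarms on short spikes.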

Let’s take a look at how we can create metrics and alarms for an EC2 instance. Basic monitoring is enabled by default for EC2 and publishes metrics at five-minute intervals. You can enable detailed monitoring, which publishes metrics every minute, but it is a paid option.

For this example, I will move ahead with basic monitoring. Since it is enabled by default, once you go to CloudWatch and select EC2 resources, you will see some default metrics already in place.

As a next step, we will add alarms for the instances. You can set up alarms for an individual instance, for an Auto Scaling group, for an instance type, and so on. For the sake of this example, I am choosing the metric of average CPU utilization across all my EC2 instances.

So the alarm I am setting says that whenever average CPU utilization across all my instances goes beyond 50%, an alarm should be raised. As a result of the alarm, I can make CloudWatch send a message to an SNS (Simple Notification Service) topic, which an application or serverless function can subscribe to in order to send email or SMS notifications. One can also set autoscaling actions like adding or removing servers, or simply restart an EC2 instance based on the alarm.
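The same alarm could also be created from code instead of the console. The sketch below only builds the parameters for the CloudWatch PutMetricAlarm API and makes no AWS call, so it runs without credentials. The alarm name, topic ARN, and function name are hypothetical, and whether a dimension-less EC2 metric aggregates across all instances in your account may depend on your monitoring settings, so treat that part as an assumption.

```python
def build_cpu_alarm_params(topic_arn, threshold=50.0, instance_id=None):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    params = {
        "AlarmName": "high-average-cpu",  # hypothetical name
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Statistic": "Average",
        "Period": 300,  # seconds; matches basic five-minute monitoring
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # SNS topic notified when the alarm fires
    }
    if instance_id is not None:
        # Restrict the alarm to a single instance instead of an aggregate.
        params["Dimensions"] = [{"Name": "InstanceId", "Value": instance_id}]
    return params


# Hypothetical SNS topic ARN.
params = build_cpu_alarm_params("arn:aws:sns:us-east-1:123456789012:ops-alerts")
print(params["MetricName"], params["ComparisonOperator"], params["Threshold"])

# With boto3 installed and AWS credentials configured, the dict could be used as:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```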