In this blog we are going to learn how to analyse and debug AWS RDS performance. It started with an incident in which an AWS RDS instance was running at 100% CPU usage even with autoscaling enabled and a read-write replica in place. Somehow we were not able to find out exactly what was causing it. We then contacted AWS Support and got to know one of the best features of AWS RDS monitoring, which is Performance Insights. I am not an expert in this, but I am putting whatever I know here so that others can benefit from this feature. Let's understand what AWS RDS is and how Performance Insights can help to debug database issues.
Let's now understand what Performance Insights is
As the name suggests, Performance Insights is directly related to CPU utilisation and database load. Although the two are related, they can move independently of each other. A database can be under high load while CPU utilisation is low, for example when many sessions are waiting on locks or I/O instead of using the CPU. Similarly, a database can be under low load while CPU utilisation is high. Performance Insights is designed for developers and users who do not have much DBA expertise to analyse DB queries. AWS CloudWatch is mainly used for metrics to watch, but that was not enough to analyse the problem in detail.
So, this is where AWS introduced the Performance Insights dashboard
With the help of this dashboard you can see more insights and drill down in your analysis. In the image below you can see different colour codes indicating the different areas to look at. The Performance Insights dashboard collects metric data from the database engine to monitor the actual load on a database. The analysis is mainly based on AAS (average active sessions) and CPU usage.
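To make the idea of AAS a bit more concrete, here is a minimal sketch of how an average-active-sessions number can be worked out from sampled session states. This is not the actual Performance Insights implementation. The sample data, the state names (`CPU`, `lock:wait`, `io:read`) and the `VCPUS` value are all made up for illustration. It assumes the database is sampled once per second and that each sample lists what every active session was doing at that moment.

```python
from collections import Counter

# Hypothetical one-second samples: each inner list holds the state of every
# session that was active at that instant ("CPU" or a made-up wait name).
samples = [
    ["CPU", "lock:wait", "lock:wait", "io:read"],
    ["lock:wait", "lock:wait", "io:read"],
    ["CPU", "lock:wait", "io:read", "io:read"],
    ["lock:wait", "lock:wait", "lock:wait", "io:read", "io:read"],
]

VCPUS = 2  # assumed instance size, for comparison only


def average_active_sessions(samples):
    """Return (total AAS, AAS per session state) for the given samples."""
    n = len(samples)
    counts = Counter(state for sample in samples for state in sample)
    breakdown = {state: count / n for state, count in counts.items()}
    return sum(breakdown.values()), breakdown


total, breakdown = average_active_sessions(samples)
print(f"AAS = {total:.2f} (vCPUs = {VCPUS})")
# List the states from the biggest contributor to the smallest.
for state, aas in sorted(breakdown.items(), key=lambda kv: -kv[1]):
    print(f"  {state:10s} {aas:.2f}")
```

With this made-up data the total load is well above the assumed vCPU count, while the part of the load that is actually on the CPU stays small. That is the "high load, low CPU" situation: most sessions are waiting, not computing.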
How can these be differentiated? I learned one good example from another source. Imagine there are multiple people in one room at a party. There are different types of people:
1. People who don't talk, just come, enjoy the party and go away - Best performing query
2. People who just say hi to everyone, enjoy the party and go away - Average performing query
3. People who talk a lot, discuss and make more noise - Heavy, long-running queries that impact CPU usage or database load. This third type of people making more noise means taking up more space in the room, similar to a long-running query. This is one insight to look for when finding the root cause.
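The party example above can be sketched in code as well: count how often each query shows up in the samples and rank the queries by their share of the load. This is only an illustration under assumptions. The query labels and sample data are hypothetical, and the one-second sampling is assumed as before.

```python
from collections import Counter

# Hypothetical one-second samples: the query each active session was running.
samples = [
    ["SELECT report", "SELECT by id"],
    ["SELECT report", "UPDATE orders"],
    ["SELECT report"],
    ["SELECT report", "UPDATE orders"],
]


def load_by_query(samples):
    """Return (query, AAS contribution) pairs, heaviest query first."""
    n = len(samples)
    counts = Counter(query for sample in samples for query in sample)
    return [(query, count / n) for query, count in counts.most_common()]


ranking = load_by_query(samples)
for query, aas in ranking:
    print(f"{aas:.2f}  {query}")
```

Here `SELECT report` is the noisy guest: it is present in every sample, so it sits at the top of the ranking. `SELECT by id` appears once and leaves, like the person who comes, enjoys the party and goes away.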