Context
Since 1.18, the Monitoring part of Conduktor Console has been externalized in an image called conduktor-platform-cortex.
This image contains 3 components:
- Prometheus, to scrape the metrics from Conduktor Console
- Cortex, to store these metrics in your S3, or volumes
- Alertmanager, to set up alerts
These 3 components are third-party projects: they are not maintained by Conduktor, but our Monitoring relies on them.
Issue
If you have many clusters with a lot of resources (topics, consumer groups, partitions), the Cortex component might need more resources to run properly; otherwise the container gets OOM killed (Out Of Memory). This article explains how to size the container properly.
Solution
In order to size it properly, you first have to check how many time-series you are pulling from Conduktor Console.
Step 1: Check how many time-series you have
To check how many time-series you are pulling, open a bash shell in one of the two containers and run the following command:
curl http://conduktor-platform:8080/monitoring/metrics | wc -l
In my case, the name of the container is conduktor-platform and the port is 8080. Make sure you change both so they match your deployment.
The output should look like this:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 36898  100 36898    0     0  5659k      0 --:--:-- --:--:-- --:--:-- 6005k
279
The number of time-series is the last line of the output, at the bottom left: 279 in this case. This is very low, but it grows quickly if you have many clusters, topics, partitions, or consumer groups.
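Note that wc -l counts every line of the response. If the endpoint returns the standard Prometheus text exposition format, the response also contains "# HELP" and "# TYPE" comment lines, so the raw line count is an upper bound on the number of time-series. The sketch below, which assumes that format, counts only the sample lines. The name count_series is a hypothetical helper, not something shipped with Conduktor or curl.

```shell
# Hypothetical helper (not part of Conduktor): reads Prometheus text-format
# metrics on stdin and prints the number of sample lines.
count_series() {
  # Drop comment lines (# HELP / # TYPE) and blank lines, then count the rest.
  # tr removes the padding some wc implementations add around the number.
  grep -v -e '^#' -e '^[[:space:]]*$' | wc -l | tr -d ' '
}

# Small self-contained demo: 2 comment lines, 2 samples, 1 blank line.
printf '# HELP foo An example gauge\n# TYPE foo gauge\nfoo{a="1"} 1\nfoo{a="2"} 2\n\n' | count_series   # prints 2
```

Against a live deployment, you could pipe the curl output into the helper, for example: curl -s http://conduktor-platform:8080/monitoring/metrics | count_series (with the container name and port adjusted to your deployment). The -s flag hides curl's progress meter, so only the count is printed.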
Step 2: Give enough resources
The Cortex documentation gives rules of thumb for how much memory the container needs. It states that:
Each million series in an ingester takes 15GB of RAM
Based on that rule, the table below gives an idea of how much RAM you should give this container so it runs properly:
| Number of time-series | 35,000 | 200,000 | 500,000 | 700,000 | 1,000,000 |
| RAM needed | 500MB (default) | 3GB | 8GB | 11GB | 15GB |
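For a series count that falls between the columns, the same rule of thumb can be applied directly. The following is a minimal sketch of that arithmetic, assuming 1GB = 1000MB and rounding up. The name estimate_ram_mb is a hypothetical helper, and the result is only an estimate derived from the 15GB per million series rule, not a guaranteed limit.

```shell
# Hypothetical helper: estimates the RAM (in MB) for a given number of
# time-series, using the rule of thumb of 15GB per million series.
estimate_ram_mb() {
  # 15GB per 1,000,000 series = 15000MB per 1,000,000 series.
  # Adding 999999 before the integer division rounds the result up.
  echo $(( ($1 * 15000 + 999999) / 1000000 ))
}

estimate_ram_mb 200000    # prints 3000  (3GB, matching the table)
estimate_ram_mb 1000000   # prints 15000 (15GB, matching the table)
```

For small counts the estimate falls below the 500MB default (for example, 35,000 series gives 525MB, and 279 series gives only 5MB), so in practice the default is a sensible floor.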