Enhancing Your Prometheus Queries: How to Filter Metrics Based on Conditions

Learn how to filter Prometheus metrics in Grafana by correlating two metrics. This step-by-step guide shows how to display `requests_failed_total` only for services selected by a condition on `requests_processed_total`.

This video is based on the question https://stackoverflow.com/q/63134892/ asked by the user 'aspyct' ( https://stackoverflow.com/u/1003190/ ) and on the answer https://stackoverflow.com/a/63140034/ provided by the user 'Marcelo Ávila de Oliveira' ( https://stackoverflow.com/u/4653675/ ) on Stack Overflow. The original title of the question was: Prometheus: filter query based on another metric. Content (except music) is licensed under CC BY-SA https://meta.stackexchange.com/help/l... The original question post and the original answer post are each licensed under 'CC BY-SA 4.0' ( https://creativecommons.org/licenses/... ).

When managing a monitoring system with Prometheus, you may have several metrics that are linked by common labels. For example, you may have two counter metrics: requests_processed_total and requests_failed_total. Often you want to show the failed requests only where the other metric meets a specific condition.

The Problem

Consider a scenario with the following metrics:

- requests_processed_total: the total number of requests processed by different services.
- requests_failed_total: the total number of requests that failed.
Both metrics share a common label called service, which helps in isolating data belonging to that particular service. Here’s what your sample metric data looks like:

[[See Video to Reveal this Text or Code Snippet]]

Suppose you want to query the requests_failed_total, but only for those services where requests_processed_total is greater than 1000. From the above data, you would expect the output to show requests_failed_total only for the news service since it’s the only one exceeding the threshold. The expected result would be:

[[See Video to Reveal this Text or Code Snippet]]

The Solution

If you're using Grafana to visualize your Prometheus data, you can set this up by following these steps:

Step 1: Create a New Dashboard

Begin by creating a new dashboard in Grafana where you plan to visualize your metrics.

Step 2: Configure Variables

1. Open Dashboard Settings: Click on the settings gear icon in your dashboard.
2. Create a New Variable: Navigate to the Variables section and click New.
3. Define Variable Parameters: Use the following settings to define your variable:

- Name: service
- Type: Query
- Data Source: Select Prometheus
- Query: Input the following PromQL query:

[[See Video to Reveal this Text or Code Snippet]]

- Regex: Use the regex to capture the service name:

[[See Video to Reveal this Text or Code Snippet]]

Step 3: Implement the Variable in Your Panel

Use the newly created service variable to display the requests_failed_total metrics. Here's how to do that:

1. Go to the panel where you want to show the failed requests.
2. In your query editor, reference the service variable in your PromQL query for requests_failed_total:

[[See Video to Reveal this Text or Code Snippet]]

Step 4: Enable Panel Repeats (Optional)

To enhance your dashboard, you can also use the "repeat for" feature in Grafana, allowing you to dynamically create panels for each service based on your filtered results. This way, you can see all relevant metrics at a glance.
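
The snippets for Steps 2 and 3 are not reproduced in the text, so the following is only a rough Python sketch of how the variable flow could work, not the exact configuration from the video. It assumes the variable query is something like `query_result(requests_processed_total > 1000)` (Grafana's Prometheus data source does offer a `query_result()` function for variables), that the Regex field captures the value of the service label, and that the panel query uses a `=~` label matcher with `$service`. The sample values, the weather service, the timestamp, and all function names are hypothetical.

```python
import re

# Hypothetical sample values; the real data is only shown in the video.
processed = {"news": 1230.0, "weather": 850.0}
THRESHOLD = 1000

def query_result_lines(samples, threshold):
    """Mimic, approximately, the strings returned for a variable query
    such as query_result(requests_processed_total > 1000) (assumed form)."""
    return [
        f'requests_processed_total{{service="{svc}"}} {value} 1596000000000'
        for svc, value in samples.items()
        if value > threshold  # PromQL's ">" drops series that fail the test
    ]

# Assumed content of the variable's Regex field: capture the service label.
SERVICE_RE = re.compile(r'.*service="([^"]*).*')

def extract_services(lines):
    """Apply the capture regex to each line, keeping the captured group."""
    services = []
    for line in lines:
        match = SERVICE_RE.match(line)
        if match:
            services.append(match.group(1))
    return services

def panel_query(services):
    """Build the panel query the way a multi-value $service would expand
    inside a regex matcher (assumes plain service names, no escaping)."""
    pattern = "|".join(services)
    return f'requests_failed_total{{service=~"{pattern}"}}'

if __name__ == "__main__":
    selected = extract_services(query_result_lines(processed, THRESHOLD))
    print(selected)
    print(panel_query(selected))
```

With these made-up numbers, only the news service passes the threshold, so the panel query is restricted to that service. The filtering itself happens in the variable query; the panel query only consumes the resulting list of service names.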
Conclusion

By combining a Grafana dashboard variable with targeted PromQL queries, you can narrow your metrics down to the data that matters to you. In this example, the panel shows requests_failed_total only for services whose requests_processed_total exceeds the threshold, which cuts through the clutter and keeps attention on the services you care about. Filters like this make dashboards easier to read and help you make informed decisions quickly. Happy monitoring with Prometheus and Grafana!