When it comes to running production software, you’ll hardly ever find yourself in a situation where you know too much about how your application is performing. More often than not, the opposite is true: we have nowhere near enough information available about our application’s performance in the wild.
Fortunately, there are a variety of free, open source tools available that provide time series databases and allow us to accrue and store all sorts of metrics. While individual metrics at an instant in time may not mean much on their own, having access to metrics over an extended period of time allows for in-depth trend analysis. One of the most commonly used tools for aggregating metrics is Prometheus.
Prometheus uses a pull model to obtain its metrics. That is, you point it at a URL exposed by your application; Prometheus grabs the metrics from that URL at a regular interval and stores them. The power of this model is in its simplicity: the only thing your application needs to do is expose a metrics endpoint in a specific format, and let Prometheus deal with the rest.
Unfortunately, ASP.NET Core doesn’t come with a metrics endpoint by default — we’ll have to create one for ourselves. In this article, we’ll create a simple ASP.NET Core Web API that collects and exposes metrics for Prometheus to interpret. We’ll then obtain those metrics in Prometheus, and plot a graph from them.
Setting up our Application
We’ll be starting from scratch, so the first thing we’ll do is create an ASP.NET Core Web API project. At the time of writing, .NET Core 3.1 is the most recent version of .NET, so we’ll go with that.
Next up, we want to expose a basic set of metrics. There is a library available named prometheus-net that comes with a companion ASP.NET Core Package. We’ll need them both in our example. Run the following commands in the Package Manager Console:
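A sketch of the install commands, assuming the package IDs as published on NuGet, prometheus-net and prometheus-net.AspNetCore:

```powershell
Install-Package prometheus-net
Install-Package prometheus-net.AspNetCore
```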
We can now start exposing a default set of metrics using one simple line in Startup.cs:
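A minimal sketch of the relevant part of Configure, using the UseMetricServer middleware from the prometheus-net.AspNetCore package:

```csharp
public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
    app.UseRouting();

    // Exposes the default /metrics endpoint for Prometheus to scrape.
    app.UseMetricServer();

    app.UseEndpoints(endpoints => endpoints.MapControllers());
}
```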
Make sure to place it before the call to app.UseEndpoints(...), otherwise we may be missing some important HTTP metrics later on.
That’s all there is to it. If you start your service and navigate to the /metrics endpoint, you should see a default set of metrics exposed by your service:
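The exact output depends on your runtime and platform, and the values below are illustrative, but it should look something along these lines (abbreviated):

```
# HELP process_private_memory_bytes Process private memory size
# TYPE process_private_memory_bytes gauge
process_private_memory_bytes 1558708224
# HELP dotnet_total_memory_bytes Total known allocated memory
# TYPE dotnet_total_memory_bytes gauge
dotnet_total_memory_bytes 4772448
```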
While the amount of memory used by our service is a great starting metric, it is very general. Of course, the type of metrics you want to collect depends heavily on your specific application.
For most ASP.NET Core applications, however, response time is a meaningful metric: spikes in response times are at best a nuisance for your users, and in the worst case may indicate an impending outage. In the next section, we’ll implement a response timer.
Choosing Our Metrics
Before we can start reporting metrics, we’ll need to settle on the type of metric we’re going to report. Prometheus defines the following metric types in its documentation:
- Counter — A counter is a cumulative metric that represents a single monotonically increasing counter whose value can only increase or be reset to zero on restart.
- Gauge — A gauge is a metric that represents a single numerical value that can arbitrarily go up and down.
- Histogram — A histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values.
- Summary — Similar to a histogram, a summary samples observations (usually things like request durations and response sizes). While it also provides a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.
The number of requests our service serves will only ever go up during its lifetime. Therefore, a monotonically increasing Counter is a perfect fit.
Request duration can be represented using a Histogram. Each time we measure the time elapsed between a request and its response, that’s an observation. Prometheus will subsequently sort these observations into the configured buckets for us.
Before we can send our metrics off to Prometheus, we’ll need to collect them. It’s perfectly fine to define the individual Histogram objects close to where they are used, but for this article we’ll keep them in a reporter type that we inject into the middleware component, to keep things slightly more manageable.
For our use case, we can start off with a fairly simple implementation:
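A minimal sketch of such a reporter, using prometheus-net’s Metrics factory; the metric and label names here are illustrative choices, not fixed by the library:

```csharp
using System;
using Prometheus;

public class MetricReporter
{
    private readonly Counter _requestCounter;
    private readonly Histogram _responseTimeHistogram;

    public MetricReporter()
    {
        _requestCounter = Metrics.CreateCounter(
            "total_requests", "The total number of requests serviced by this API.");

        _responseTimeHistogram = Metrics.CreateHistogram(
            "request_duration_seconds",
            "The duration in seconds between the request and the response.",
            new HistogramConfiguration
            {
                // Metrics will be grouped by every combination of these labels.
                LabelNames = new[] { "status_code", "method" }
            });
    }

    // Called once per handled request, regardless of its response type.
    public void RegisterRequest()
    {
        _requestCounter.Inc();
    }

    // Records one response-time observation under the given label values.
    public void RegisterResponseType(int statusCode, string method, TimeSpan elapsed)
    {
        _responseTimeHistogram
            .WithLabels(statusCode.ToString(), method)
            .Observe(elapsed.TotalSeconds);
    }
}
```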
As you can see, the reporter contains two methods: RegisterRequest(), which is called whenever a request is considered handled, regardless of its response type; and RegisterResponseType(), which takes a handful of parameters. We’ll obtain the values for these parameters in our middleware component later on.
The interesting thing to note here is that our histogram defines a handful of labels by which the metrics will be grouped. By applying these labels, we’ll be able to differentiate between any combination of them in our metrics. For example, we’ll be able to tell the duration of POST requests with a response code of 200 apart from GET requests with the same response code.
We could extend these labels much further, for example with the Controller and Action parameters that handled the request. By doing so we would be able to collect very in-depth information regarding the performance of our actions and controllers, and be able to optimize them very specifically.
Next up, we’ll need a middleware component that actually gathers the data for us:
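One possible implementation, assuming the MetricReporter type sketched above and a hypothetical middleware name of ResponseMetricMiddleware:

```csharp
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

public class ResponseMetricMiddleware
{
    private readonly RequestDelegate _next;

    public ResponseMetricMiddleware(RequestDelegate next)
    {
        _next = next;
    }

    // The reporter is resolved from the container and injected per invocation.
    public async Task Invoke(HttpContext context, MetricReporter reporter)
    {
        // Skip Prometheus' own scrapes of the /metrics endpoint.
        if (context.Request.Path.Value == "/metrics")
        {
            await _next(context);
            return;
        }

        var stopwatch = Stopwatch.StartNew();
        try
        {
            await _next(context);
        }
        finally
        {
            stopwatch.Stop();
            reporter.RegisterRequest();
            reporter.RegisterResponseType(
                context.Response.StatusCode,
                context.Request.Method,
                stopwatch.Elapsed);
        }
    }
}
```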
Nothing special about this middleware component: we inject the MetricReporter, gather some data, and do some reporting. We exclude requests to /metrics because we don’t want to include Prometheus’ own scraping in the statistics.
Metrics in Action
We now have all the components we need to expose a solid set of metrics for our service. Before we can take it for a spin however, we’ll need to wire the various components up with the dependency injection framework.
Open Startup.cs and update Configure to look something along the lines of:
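A sketch of the wiring, assuming the MetricReporter and ResponseMetricMiddleware types from earlier in this article:

```csharp
public void ConfigureServices(IServiceCollection services)
{
    services.AddControllers();

    // A single reporter instance is shared across all requests,
    // so every observation lands in the same counter and histogram.
    services.AddSingleton<MetricReporter>();
}

public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
    app.UseRouting();

    app.UseMetricServer();
    app.UseMiddleware<ResponseMetricMiddleware>();

    app.UseEndpoints(endpoints => endpoints.MapControllers());
}
```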
Fire up your application and make a couple of requests. You shouldn’t notice much difference in terms of performance from before.
Now, when you navigate to the /metrics endpoint of your application, you should be able to see some key metrics about the requests your application has served:
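Assuming the metric and label names from our reporter sketch, the relevant portion might look something like this (abbreviated):

```
# HELP total_requests The total number of requests serviced by this API.
# TYPE total_requests counter
total_requests 22
# HELP request_duration_seconds The duration in seconds between the request and the response.
# TYPE request_duration_seconds histogram
request_duration_seconds_bucket{status_code="200",method="GET",le="0.01"} 12
request_duration_seconds_bucket{status_code="200",method="GET",le="+Inf"} 13
request_duration_seconds_count{status_code="200",method="GET"} 13
request_duration_seconds_bucket{status_code="404",method="GET",le="+Inf"} 9
request_duration_seconds_count{status_code="404",method="GET"} 9
```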
There should be more metrics available on your endpoint, but for brevity I’ve omitted the ones that we didn’t set up in this article.
You can see from both the counter and the histogram that the application serviced 22 requests, 13 of which were successful, and 9 returned 404 Not Found.
Out of the 13 successful requests, 12 took less than 0.01 seconds, or 10 milliseconds. Because histogram buckets are cumulative, observations that fall into a lower bucket are counted in all higher buckets as well.
As you can see, even though these metrics are relatively simple to set up, they immediately give us very powerful insights into the performance of our application, and into where we might be able to optimize it further.
We’ve added a handful of metrics to our application that allow us to obtain information about the environment our application runs in, as well as the requests our application services.
To improve this, you could add the HttpMetrics middleware provided by the prometheus-net ASP.NET Core package.
This is largely functionally equivalent to the response time middleware we wrote and configured in this article, so you may want to remove that one from your pipeline.
The HttpMetrics middleware gives you far more information than our simple middleware did: it includes the actions and controllers in the metrics, allowing you to track down the performance (or lack thereof) of specific actions in your application and fix them with precision.
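Swapping it in is a one-line change in Configure; a sketch:

```csharp
app.UseRouting();

// Replaces our custom middleware. It needs to sit between UseRouting and
// UseEndpoints so it can see which controller and action handled each request.
app.UseHttpMetrics();

app.UseEndpoints(endpoints => endpoints.MapControllers());
```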
In the next article, we’ll take a look at Prometheus and configure it to use the metrics endpoint we’ve exposed in this article as a metric ingress to store our metrics over time.
If you are looking for a cheap Prometheus instance to tinker with, consider DigitalOcean’s One-Click App for Prometheus. If you sign up using my personal referral link, you’ll get $50 in free credit to get started.