Histograms

Histograms represent how a collection of values is distributed, not the values themselves.

For example, consider these numbers:

[1, 2, 4, 5, 8, 9, 10]

A histogram can describe them like this:

  • For the range between 1 and 5, there are 4 values.
  • For the range between 6 and 10, there are 3 values.

Each range is called a bucket, and the histogram specifies how many values fall into each bucket.

The same set of numbers can be represented like this:

  • For the range between 1 and 3, there are 2 values.
  • For the range between 4 and 6, there are 2 values.
  • For the range between 7 and 10, there are 3 values.

Or like this:

  • For the range between 1 and 25, there are 7 values.
  • For the range between 26 and 50, there are 0 values.
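
To make the counting concrete, here is a minimal sketch in plain Swift that counts the same numbers against different sets of ranges. The `bucketCounts(of:buckets:)` helper is a hypothetical name used only for illustration, and it treats each bucket as a closed, non-overlapping integer range.

```swift
// Hypothetical helper: count how many values fall into each closed range (bucket).
func bucketCounts(of values: [Int], buckets: [ClosedRange<Int>]) -> [Int] {
    buckets.map { bucket in
        values.filter { bucket.contains($0) }.count
    }
}

let numbers = [1, 2, 4, 5, 8, 9, 10]

// The same values described with three different sets of buckets.
let twoBuckets = bucketCounts(of: numbers, buckets: [1...5, 6...10])
let threeBuckets = bucketCounts(of: numbers, buckets: [1...3, 4...6, 7...10])
let wideBuckets = bucketCounts(of: numbers, buckets: [1...25, 26...50])

print(twoBuckets)   // [4, 3]
print(threeBuckets) // [2, 2, 3]
print(wideBuckets)  // [7, 0]
```

The values never change; only the choice of buckets changes what the histogram tells you.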

The usefulness of a histogram depends on the buckets themselves. If the ranges are too narrow, you’ll have too many buckets, and the differences between them won’t mean much. If the ranges are too wide, almost all the numbers may fall into one bucket, which also makes the histogram meaningless.

To make a histogram meaningful, you need to specify meaningful ranges so that the graph quickly conveys what the values look like.

Take a moving car, for example. You want to get an idea of the car’s speed over the day, so you define these ranges:

  • 0 - 1: Stopped
  • 2 - 10: Very Slow
  • 11 - 25: Slow
  • 26 - 69: Average
  • 70+ : Fast

If you sample the car’s speed every minute over the whole day, you can quickly see how much time the car spends in each speed range.
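
As a rough sketch of that idea in plain Swift, the snippet below maps each speed sample to one of the ranges above and counts the samples per range. The `speedCategory(for:)` helper and the sample readings are made up for illustration; readings are assumed to be non-negative whole numbers taken once per minute.

```swift
// Hypothetical helper: map a speed reading to the range names listed above.
// Assumes non-negative readings; anything outside 0...69 is treated as "Fast".
func speedCategory(for speed: Int) -> String {
    switch speed {
    case 0...1: return "Stopped"
    case 2...10: return "Very Slow"
    case 11...25: return "Slow"
    case 26...69: return "Average"
    default: return "Fast"
    }
}

// Made-up readings, one per minute.
let samples = [0, 0, 5, 12, 30, 45, 72, 28, 8, 0]

// Count how many minutes fall into each range.
var minutesPerCategory: [String: Int] = [:]
for speed in samples {
    minutesPerCategory[speedCategory(for: speed), default: 0] += 1
}

// Print in a fixed order, since dictionary order is not guaranteed.
for name in ["Stopped", "Very Slow", "Slow", "Average", "Fast"] {
    print("\(name): \(minutesPerCategory[name, default: 0]) min")
}
```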

You may have thought, “Why not take the average speed over the day?” Take these sets of numbers as an example:

[1, 1, 1, 1, 96]

[18, 22, 16, 19, 25]

[2, 30, 15, 20, 33]

All of these sets have the same average of 20. The first one is very stable at 1, except for the last number, which is very high. If you exclude that number, the average becomes 1. The second set has all its numbers very close to each other. In the last set, most of the numbers are a little higher, but the first number pulls the average down.

Averages on their own don’t provide clarity on how the numbers vary. If you go deeper into statistics, you can report the variance alongside the average to get a better idea of how far the numbers spread from each other. But that’s still not enough: you still don’t know how many of the numbers were actually close to the average.
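
To check those numbers, here is a small sketch in plain Swift that computes the average and the variance of each set. The `mean` and `variance` helpers are illustrative names, and the variance here is the population variance (dividing by the number of values), which is one of several conventions.

```swift
// Arithmetic mean of a non-empty list of values.
func mean(_ values: [Double]) -> Double {
    values.reduce(0, +) / Double(values.count)
}

// Population variance: the average squared distance from the mean.
func variance(_ values: [Double]) -> Double {
    let m = mean(values)
    return values.map { ($0 - m) * ($0 - m) }.reduce(0, +) / Double(values.count)
}

let sets: [[Double]] = [
    [1, 1, 1, 1, 96],
    [18, 22, 16, 19, 25],
    [2, 30, 15, 20, 33],
]

for numbers in sets {
    print("mean: \(mean(numbers)), variance: \(variance(numbers))")
}
// mean: 20.0, variance: 1444.0
// mean: 20.0, variance: 10.0
// mean: 20.0, variance: 123.6
```

The averages are identical while the variances differ widely, yet neither number shows where the individual values actually sit.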

Histograms are meant to be a visual representation that gives you this information with a quick look, not by putting numbers together and running an equation.

Creating a histogram in OpenTelemetry is similar to creating a gauge:

let boundaries: [Double] = [0, 2, 11, 26, 70]  // 1

let openTelemetry = OpenTelemetry.instance  // 2
    
let meter = openTelemetry.meterProvider.meterBuilder(name: "NAME_OF_METRICS_GROUP").build()  // 3

var histogram = meter.histogramBuilder(name: "NAME_OF_THE_HISTOGRAM")  // 4
  .setExplicitBucketBoundariesAdvice(boundaries)
  .build()

histogram.record(value: 5.0)  // 5
histogram.record(value: 15.0)
histogram.record(value: 25.0)
histogram.record(value: 35.0)
...
  1. First, specify the boundaries. Pass an array of Double and OpenTelemetry will automatically create the buckets:
    • [0 - 1]
    • [2 - 10]
    • [11 - 25]
    • [26 - 69]
    • [70 - …]
  2. Get the current instance of OpenTelemetry.
  3. Just like with gauges, create a meter instance with the name of the group.
  4. Create a histogram instance through its builder, passing the boundaries to that builder before building the histogram.
  5. Record as many values as needed in the histogram.

Submitting the histogram to Grafana once won’t display anything. Only when you submit it multiple times will the graphical representation start to show up. Why?

Histograms are similar to counters: the graph represents the rate of change of the distribution of the data. If the values in the histogram don’t change, there’s nothing to visualize and everything will be zero.

When you send the histogram again, you’re actually appending the same values to the existing data at a different time, so there will be a rate of change to see.
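
The following plain-Swift sketch illustrates the idea under a simplifying assumption: the backend keeps a running count per bucket, and the chart shows the increase between two points in time. The `increase(from:to:)` helper and the snapshot arrays are hypothetical; they model the concept, not how Grafana or any particular backend is implemented.

```swift
// Hypothetical per-bucket totals after each submission of the values 5, 15, 25 and 35,
// using the buckets [0-1], [2-10], [11-25], [26-69] and [70+].
let afterFirstSubmission = [0, 1, 2, 1, 0]
let afterSecondSubmission = [0, 2, 4, 2, 0]

// Hypothetical helper: per-bucket increase between two snapshots.
func increase(from earlier: [Int], to later: [Int]) -> [Int] {
    zip(earlier, later).map { $1 - $0 }
}

// If the totals don't change between two points in time, there is no change to plot.
let noChange = increase(from: afterFirstSubmission, to: afterFirstSubmission)
// After a second submission, each bucket has grown, so there is something to draw.
let change = increase(from: afterFirstSubmission, to: afterSecondSubmission)

print(noChange) // [0, 0, 0, 0, 0]
print(change)   // [0, 1, 2, 1, 0]
```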

When you send a histogram to Grafana repeatedly, it builds a heatmap representation of your values:

A heatmap representation of a Histogram on OpenTelemetry Dashboard

This heatmap aggregates the rates of change of the histograms coming from all devices within the same timeframe.

This will allow you to visually understand the distribution of values across all of your users.
