Writing charms that collect metrics
Knowing an application's configuration isn’t enough to effectively operate and manage it. Consider that a well-designed application will have as few configurable parameters as possible. Operators will want to know more about the resources that your charm consumes and provides in their models -- resources such as:
- Storage GiB used
- Number of user accounts
- Number of recently active users
- Active database connections
Juju Metrics complete the operational picture with application observability; by modeling, sampling and collecting measurements of resources such as these. Juju collects application metrics at a cadence appropriate for taking a model-level assessment of application utilization and capacity planning.
There are many instrumentation and time-series data collection solutions supporting devops. Juju’s metrics complement these fine-grained, lower-level data sources with a model-level overview -- a starting point for deeper analysis.
Collecting metrics
Adding metrics to a charm is simple and straightforward with the reactive framework.
Add layer:metrics
Add layer:metrics
to the charm’s layer.yaml
. This layer provides the
collect-metrics
hook, and allows metric collection to be defined completely by
metrics.yaml
.
Add metrics.yaml
Declare the metrics to be collected in your charm's metrics.yaml
. Example:
metrics:
users:
type: gauge
description: Number of users
tokens:
type: gauge
description: Number of active tokens
Metric types
In Juju 2.0, only type: gauge
fully supports operational use-cases. Other
types are experimental in this release.
type: gauge
Gauge metrics are a snapshot value reading at a point in time, as a positive decimal number.
type: absolute
Absolute metrics track the quantity since the last measurement, as a positive decimal number. Future releases of Juju will track the cumulative aggregate of absolutes, providing a more useful indicator to operators.
Built-in metric juju-units
There is also a built-in metric, which has no type or description, named
juju-units
. When declared, this metric sends a "1" for each unit.
Metric commands
When charming with layer:metrics
, add a command:
attribute to each metric
in metrics.yaml
, containing a command line that measures the value when
executed. layer:metrics
will then execute this command in the
collect-metrics
hook for you automatically. Continuing with the example above:
metrics:
users:
type: gauge
description: Number of users
command: scripts/count_users.py
tokens:
type: gauge
description: Number of active tokens
command: scripts/count_tokens.py
Commands can use any script or executable in your charm or installed elsewhere
on the workload. The current working directory for this command will be the
charm directory (charmhelpers.core.hookenv.charm_dir
). The command must write
only the metric value to standard output, and terminate with exit code 0 in
order for the measurement to be to be counted valid.
Continuing with the metrics example above, a charm that relates to a PostgreSQL
database probably stores its "users" and "tokens" in database tables. These can
be counted with a simple SQL query. scripts/count_users.py
in such a charm
might read as:
#!/usr/bin/env python3
# Python packages will have been installed by the charm.
import configparser
import psycopg2
if __name__ == '__main__':
# Read the application's configuration file, which will have been written
# by the charm's relation hooks.
with open('/opt/sso-auth/config.ini') as f:
config_str = f.read()
config = configparser.ConfigParser(strict=False)
config.read_string(config_str)
# Build a database connection string from configuration.
dbname = config['database']['NAME']
user = config['database']['USER']
password = config['database']['PASSWD']
hostport = config['database']['HOST']
host, port = hostport.split(':')
conn_str = 'dbname=%s user=%s password=%s host=%s port=%s' % (
dbname, user, password, host, port)
conn = psycopg2.connect(conn_str)
try:
cur = conn.cursor()
try:
# For sake of example, let's say we don't want to include the
# default admin user account in the count.
cur.execute("SELECT COUNT(1) FROM users WHERE name != 'admin';")
row, = cur.fetchone()
print(row) # Print the measurement to standard output, for Juju
finally:
cur.close()
finally:
conn.close()
Note that this command will not have access to the normal lifecycle hook
environment. Refer to the
collect-metrics
documentation
for more information.