Telemetry¶
The SCION CA setup is based on HashiCorp Vault
and a vaultca service provided by Anapaya. Both processes expose telemetry in
the form of Prometheus metrics. The following subsections describe the exposed
metrics as well as how to set up the monitoring stack for the SCION CA.
Metrics¶
Vault instances configured by Anapaya expose two categories of Prometheus metrics:
- Built-in metrics as defined by HashiCorp Vault
  For a list of all the metrics exposed by HashiCorp Vault, please refer to the official documentation. Metrics of high importance are the seal status of the Vault instance and the number of existing mount points. At the time of writing, the metric names for these features are:
  - vault.core.unsealed
  - vault.core.mount_table.num_entries
- Anapaya-defined metrics that relate to the SCION CA operations of the host
  To ensure that the Vault instances are operating correctly as SCION CAs, Anapaya has defined some additional metrics that are exposed:
  - vaultca_build_info
    - Description: vaultca build information
    - Type: gauge
    - Labels: version
  - vaultca_certificate_matching_distinguished_name
    - Description: Indicate that the distinguished name in the certificate contains the expected value for this engine.
    - Type: gauge
    - Labels: isd_as, type, dn
  - vaultca_certificate_not_after_time
    - Description: The NotAfter time of the last certificate that was successfully updated in vault
    - Type: gauge
    - Labels: isd_as, type
  - vaultca_certificate_not_before_time
    - Description: The NotBefore time of the last certificate that was successfully updated in vault
    - Type: gauge
    - Labels: isd_as, type
  - vaultca_instance_id
    - Description: The instance identifier of the last CA certificate that was successfully updated in vault. This is the number of the instance in the CommonName of the CA certificate. E.g., for CA certificate with CommonName ‘Anapaya CA - GEN I 2021.42’, the instance identifier is 42. If the CA certificate does not have an instance identifier in the CommonName, the value is set to -1.
    - Type: gauge
    - Labels: isd_as, type
  - vaultca_latest_trc_base_version
    - Description: The base version of the latest TRC stored in the secret engine for the given ISD.
    - Type: gauge
    - Labels: isd_as
  - vaultca_latest_trc_contains_root_certificate
    - Description: Indicates if the current root certificate is included in the latest TRC.
    - Type: gauge
    - Labels: isd_as
  - vaultca_latest_trc_not_after_time
    - Description: The NotAfter time of the latest TRC stored in the secret engine for the given ISD.
    - Type: gauge
    - Labels: isd_as
  - vaultca_latest_trc_not_before_time
    - Description: The NotBefore time of the latest TRC stored in the secret engine for the given ISD.
    - Type: gauge
    - Labels: isd_as
  - vaultca_latest_trc_serial_version
    - Description: The serial version of the latest TRC stored in the secret engine for the given ISD.
    - Type: gauge
    - Labels: isd_as
  - vaultca_update_task_enabled
    - Description: Indicates if the periodic vaultca update task is enabled.
    - Type: gauge
    - Labels: None
  - vaultca_update_task_errors_total
    - Description: The amount of errors in runs of the vaultca update periodic task.
    - Type: counter
    - Labels: None
  - vaultca_update_task_successes_total
    - Description: The amount of successful runs of the vaultca update periodic task.
    - Type: counter
    - Labels: None
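As a rough illustration of how the vaultca_certificate_not_after_time gauge might be consumed outside of Prometheus, the following Python sketch parses a scrape in the Prometheus text exposition format and computes the time remaining until each certificate's NotAfter time. It assumes that the gauge value is a Unix timestamp in seconds, which the metric description does not state explicitly. The function names are hypothetical and the parser is deliberately minimal; it is not a full implementation of the exposition format.

```python
import re
import time

# Matches sample lines such as: metric_name{label="value"} 1.7e+09
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
_LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text, metric_name):
    """Return a list of (labels_dict, float_value) for one metric name."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_RE.match(line)
        if match is None or match.group("name") != metric_name:
            continue
        labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
        samples.append((labels, float(match.group("value"))))
    return samples


def seconds_until_expiry(text, now=None):
    """Map (isd_as, type) to the seconds remaining until NotAfter.

    Assumption: the gauge value is a Unix timestamp in seconds.
    """
    now = time.time() if now is None else now
    remaining = {}
    for labels, not_after in parse_samples(
        text, "vaultca_certificate_not_after_time"
    ):
        key = (labels.get("isd_as"), labels.get("type"))
        remaining[key] = not_after - now
    return remaining
```

A negative result for a given (isd_as, type) pair would indicate that the corresponding certificate has already expired under the stated assumption. In a production setup, the alert rules provided by Anapaya are the intended mechanism for this kind of check.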
Setting up Monitoring for Vault instances¶
Our monitoring stack is based on Prometheus, Grafana, Loki and Alertmanager. They are all open-source tools with plenty of documentation and support online.
For the stack to function correctly, the following requirements must be met:
- The monitoring host must be able to reach the management interface of the target vault instances. 
- Firewall rules must allow opening an HTTP(S) connection on the monitoring port(s) of the vault instances. 
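The two requirements above can be checked from the monitoring host before Prometheus is configured. The following Python sketch only tests whether a TCP connection can be opened to a given host and port; it does not validate TLS or the metrics endpoint itself. The function name is hypothetical, and the host and port mentioned in the comment are placeholders rather than values from this documentation.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Example call (placeholder values, replace with the management address
# and monitoring port of your Vault instance):
#   can_connect("vault.example.internal", 8200)
```

If the function returns False, the routing and firewall rules between the monitoring host and the Vault instance are the first things to verify.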
The following sections describe each step required to set up the monitoring in more detail.
Enabling Telemetry on the Vault instance¶
If you are enabling telemetry in an instance of Vault, ensure that the Vault
configuration (usually stored in /etc/vault.d/vault.hcl) contains the section:
telemetry {
  # Retention time for metrics: only metrics updated within the last minute are exposed.
  prometheus_retention_time = "1m"
  disable_hostname = true
}
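Once Vault has loaded this configuration, one way to confirm that telemetry is exposed is to query Vault's /v1/sys/metrics endpoint with format=prometheus. The following Python sketch builds and sends such a request using only the standard library. It assumes that the endpoint requires a Vault token, which is the case unless the listener has been configured to allow unauthenticated metrics access. The function names and the address used in the usage note are hypothetical.

```python
import urllib.request


def build_metrics_request(vault_addr, token=None):
    """Build a GET request for Vault's Prometheus-formatted metrics."""
    url = vault_addr.rstrip("/") + "/v1/sys/metrics?format=prometheus"
    request = urllib.request.Request(url, method="GET")
    if token:
        # Vault reads the client token from the X-Vault-Token header.
        request.add_header("X-Vault-Token", token)
    return request


def fetch_metrics(vault_addr, token=None, timeout=5.0):
    """Return the raw metrics text; raises urllib.error.URLError on failure."""
    request = build_metrics_request(vault_addr, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, fetch_metrics("https://vault.example.internal:8200", token) should return text containing the built-in Vault metrics. Note that in the Prometheus output the dots in Vault metric names are expected to appear as underscores (for example vault_core_unsealed).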
Setting up monitoring stack¶
To set up the monitoring stack, follow the instructions under Setting Up a Monitoring Host.
Configure Alert Rules¶
We recommend installing the alert rules that we provide for the SCION CA
product on your Prometheus instance. The alert rules can be found under the
name anapaya-alerts-scion-ca in our Cloudsmith repository. To download them,
use the following command:
curl -1sLf -O 'https://dl.cloudsmith.io/<access_token>/anapaya/stable/raw/names/anapaya-alerts-scion-ca/versions/<version>/anapaya-alerts-scion-ca-<version>.yml'
Note
Replace the <access_token> placeholder with your Cloudsmith access token
and the <version> placeholder with the version of the alert rules you
want to download.
If you don’t have an access token, contact our customer success team.
To install the alert rules on the Prometheus instance, follow the official instructions on adding an alert rule file to your Prometheus configuration.
Configure Dashboards¶
We recommend installing the dashboards that we provide for the SCION CA product
on your Grafana instance. The dashboards can be found under the name
anapaya-dashboards-scion-ca in our Cloudsmith repository. To download them, use
the following command:
curl -1sLf -O 'https://dl.cloudsmith.io/<access_token>/anapaya/stable/raw/names/anapaya-dashboards-scion-ca/versions/<version>/anapaya-dashboards-scion-ca-<version>.zip'
Note
Replace the <access_token> placeholder with your Cloudsmith access token
and the <version> placeholder with the version of the dashboards you
want to download.
If you don’t have an access token, contact our customer success team.
To import a JSON dashboard into Grafana, follow the official instructions.
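Because the dashboards are downloaded as a .zip archive, they need to be unpacked before they can be imported. The following Python sketch extracts the JSON files from the archive into a target directory. It assumes that the archive contains the dashboards as .json files, which is inferred from the import step above rather than documented here; the function name and paths are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_dashboards(archive_path, target_dir):
    """Extract the .json files from the archive and return their paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".json"):
                # ZipFile.extract strips absolute paths and ".." components.
                extracted.append(Path(archive.extract(name, path=target)))
    return sorted(extracted)
```

Each returned path can then be uploaded through the Grafana import dialog.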