Telemetry
In setups where a SCION CA needs to be configured, HashiCorp Vault is a common tool that provides a secure implementation of a certificate authority and secret storage. To help verify that the Vault instances are operating correctly, Vault exposes a variety of Prometheus metrics.
For further information on Prometheus metrics, please refer to the official Prometheus documentation and the explanation offered in Overview.
Metrics
Vault instances configured by Anapaya expose two categories of Prometheus metrics:
Built-in metrics as defined by HashiCorp Vault
For a list of all the metrics exposed by HashiCorp Vault, please refer to the official documentation. Metrics of high importance are the seal status of the Vault instance and the number of existing mount points. At the time of writing, the metric names for these features are vault.core.unsealed and vault.core.mount_table.num_entries. In the Prometheus exposition format, the dots are replaced by underscores (for example, vault_core_unsealed).
Anapaya-defined metrics that relate to the SCION CA operations of the host
To ensure that the Vault instances are operating correctly as SCION CAs, Anapaya has defined some additional metrics that are exposed:
vaultca_build_info
Description
vaultca build information
Type
gauge
Labels
version
vaultca_certificate_matching_distinguished_name
Description
Indicates whether the distinguished name in the certificate contains the expected value for this engine.
Type
gauge
Labels
isd_as
type
dn
vaultca_certificate_not_after_time
Description
The NotAfter time of the last certificate that was successfully updated in Vault.
Type
gauge
Labels
isd_as
type
vaultca_certificate_not_before_time
Description
The NotBefore time of the last certificate that was successfully updated in Vault.
Type
gauge
Labels
isd_as
type
vaultca_instance_id
Description
The instance identifier of the last CA certificate that was successfully updated in Vault. This is the number of the instance in the CommonName of the CA certificate. For example, for a CA certificate with CommonName ‘Anapaya CA - GEN I 2021.42’, the instance identifier is 42. If the CA certificate does not have an instance identifier in the CommonName, the value is set to -1.
Type
gauge
Labels
isd_as
type
vaultca_latest_trc_base_version
Description
The base version of the latest TRC stored in the secret engine for the given ISD.
Type
gauge
Labels
isd_as
vaultca_latest_trc_contains_root_certificate
Description
Indicates if the current root certificate is included in the latest TRC.
Type
gauge
Labels
isd_as
vaultca_latest_trc_not_after_time
Description
The NotAfter time of the latest TRC stored in the secret engine for the given ISD.
Type
gauge
Labels
isd_as
vaultca_latest_trc_not_before_time
Description
The NotBefore time of the latest TRC stored in the secret engine for the given ISD.
Type
gauge
Labels
isd_as
vaultca_latest_trc_serial_version
Description
The serial version of the latest TRC stored in the secret engine for the given ISD.
Type
gauge
Labels
isd_as
vaultca_update_task_enabled
Description
Indicates if the periodic vaultca update task is enabled.
Type
gauge
Labels
None
vaultca_update_task_errors_total
Description
The number of errors in runs of the periodic vaultca update task.
Type
counter
Labels
None
vaultca_update_task_successes_total
Description
The number of successful runs of the periodic vaultca update task.
Type
counter
Labels
None
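As an illustration of how the Anapaya-defined metrics might be used, the following Prometheus rule group is a minimal sketch, not an Anapaya-provided configuration. It assumes that the `*_time` gauges carry Unix timestamps in seconds and that the boolean-style gauges report 1 for true and 0 for false. Neither is stated in the metric descriptions, so verify both against your own instance. The group name, alert names, thresholds, and severity labels are hypothetical.

```yaml
groups:
  - name: vaultca-example   # hypothetical group name
    rules:
      # Assumes vaultca_certificate_not_after_time is a Unix timestamp in seconds.
      - alert: VaultCACertificateExpiresSoon
        expr: vaultca_certificate_not_after_time - time() < 7 * 24 * 3600
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Certificate of type {{ $labels.type }} for {{ $labels.isd_as }} expires in less than 7 days"

      # Fires when the error counter increased during the last hour.
      - alert: VaultCAUpdateTaskFailing
        expr: increase(vaultca_update_task_errors_total[1h]) > 0
        labels:
          severity: warning
        annotations:
          summary: "The periodic vaultca update task reported errors in the last hour"

      # Assumes the gauge is 0 when the root certificate is missing from the latest TRC.
      - alert: VaultCARootCertificateNotInTRC
        expr: vaultca_latest_trc_contains_root_certificate == 0
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Latest TRC for {{ $labels.isd_as }} does not contain the current root certificate"
```

The 7-day threshold is arbitrary; choose a value that matches the validity period and renewal cadence of your certificates.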
Setting up Monitoring for Vault instances
Our monitoring stack is based on Prometheus, Grafana, Loki and AlertManager. They are all open-source tools with plenty of documentation and support online.
In order to set up monitoring of Anapaya appliances or Vault instances, there are a few technical requirements.
The monitoring host must be able to reach the management interface of the target appliances.
Firewall rules must allow opening an HTTP(S) connection on the monitoring port of the appliance.
Enabling Telemetry on the Host
If you are enabling telemetry in an instance of Vault, ensure that the Vault configuration (usually stored in /etc/vault.d/vault.hcl) contains the section:
telemetry {
  # Retention time of metrics: only metrics updated within the last minute are exposed.
  prometheus_retention_time = "1m"
  disable_hostname = true
}
Setting up Prometheus
In order to set up Prometheus, follow the official Prometheus instructions. Specifically,
Ensure you have the latest version of Prometheus installed. Consult the installation guide for reference.
Follow the instructions in Section Configure Prometheus to monitor the sample targets.
To monitor each Vault instance, add the following target configuration in the prometheus.yml file. For each Vault instance that will be monitored, add an entry in the targets section. The Vault listening address is the api_addr as set in the Vault configuration file (usually stored at /etc/vault.d/vault.hcl).
- job_name: 'vault'
  honor_labels: true
  static_configs:
    - targets:
        - <vault listening address>
Start Prometheus. The exact command depends on the method of installation.
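Putting the steps together, a complete prometheus.yml might look like the sketch below. The scrape interval, the rule file name `vault_rules.yml`, and the two target addresses are placeholders chosen for illustration and do not come from an actual deployment. The interval is kept well below the `prometheus_retention_time` of 1m configured in Vault, on the assumption that a metric should still be retained when the next scrape happens.

```yaml
global:
  scrape_interval: 15s        # assumed value, shorter than Vault's 1m retention time

rule_files:
  - vault_rules.yml           # hypothetical file holding recording and alerting rules

scrape_configs:
  - job_name: 'vault'
    honor_labels: true
    static_configs:
      - targets:
          # Placeholder addresses; replace with the listening address of each Vault instance.
          - vault-1.example.com:8200
          - vault-2.example.com:8200
```

If `promtool` was installed along with Prometheus, `promtool check config prometheus.yml` can validate the file before Prometheus is started.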
Recording and Alerting rules
Prometheus allows the configuration of rules for recording data or creating alerts when an event happens. These alerts can later be picked up by AlertManager and be integrated with your alerting system. You can specify the events that trigger an alert, the scope and severity of the alert, and also provide a description and summary of the firing alert. Below, we provide an example of how to monitor the seal state of the Vault instance.
- alert: InfraVaultSealed
  expr: vault_core_unsealed == 0
  for: 1m
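A fuller version of this rule, including the scope, severity, summary, and description mentioned above, could look like the following sketch. The group name, label values, and annotation texts are illustrative assumptions; adapt them to the conventions of your alerting system.

```yaml
groups:
  - name: vault-example          # hypothetical group name
    rules:
      - alert: InfraVaultSealed
        expr: vault_core_unsealed == 0
        for: 1m
        labels:
          severity: critical     # assumed severity level
          scope: infrastructure  # hypothetical label used for routing
        annotations:
          summary: "Vault instance {{ $labels.instance }} is sealed"
          description: "vault_core_unsealed has been 0 on {{ $labels.instance }} for more than 1 minute."
```

A rule file of this shape can be checked with `promtool check rules <file>` before it is referenced from the `rule_files` section of prometheus.yml.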
Monitoring Stack
We suggest you set up Grafana and Alertmanager to visualize metrics and keep track of alerts. For instructions on how to achieve this, please refer to Setting up Grafana and Setting up AlertManager.