Appliance Release v0.37

This page contains the release notes for the v0.37 Anapaya appliance software release. The appliance software release is applicable for the following Anapaya products:

  • Anapaya CORE

  • Anapaya EDGE

  • Anapaya GATE

We recommend always upgrading to the latest available patch release. Refer to the Upgrade Notes (if any) of each release for special steps to take when upgrading. For general information on how to upgrade your appliance, refer to the Appliance Update Guide.

Upgrade Notes

Warning

In release v0.37.0, new configuration validation checks regarding VPP buffer allocations were added. If your configuration is not compliant with the new validation checks, the appliance controller will start but wait for the configuration to be fixed. The logic has been improved in v0.37.2 to also take the available VPP buffers into account. We recommend upgrading directly to v0.37.2 or later.

The affected releases are v0.37.0 and v0.37.1. If you are upgrading to one of these releases, please check the appliance controller logs after the upgrade:

journalctl -u appliance-controller.service -e

Verify that the logs do not contain any log events with the following message:

"validating latest configuration","cause":{"msg":"invalid configuration"...

Such a log message also contains a detailed description of the validation error, pointing out what needs to be fixed in the configuration.
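
The log check above can also be scripted. The following is a minimal sketch, not an Anapaya tool: it assumes the script runs on the appliance host with permission to read the journal, and the function names are hypothetical. It only looks for the two message fragments quoted above and makes no assumption about the rest of the log format.

```python
import subprocess

# Message fragments quoted in the upgrade note above.
MARKERS = ("validating latest configuration", "invalid configuration")


def find_validation_errors(log_lines):
    """Return the log lines that contain both marker fragments."""
    return [line for line in log_lines if all(m in line for m in MARKERS)]


def read_controller_logs():
    """Read the appliance controller journal as a list of lines.

    Uses --no-pager instead of -e so the output can be captured.
    """
    result = subprocess.run(
        ["journalctl", "-u", "appliance-controller.service", "--no-pager"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Calling `find_validation_errors(read_controller_logs())` after the upgrade should return an empty list if the configuration passed validation; any returned line carries the detailed validation error.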

Warning

In release v0.37.0 and newer, the IP-in-SCION tunneling always needs access to AS certificates of configured local ISD-ASes. For EDGEs this is naturally given, but for a GATE this might previously not have been set up. If you are upgrading a GATE to v0.37.0 or newer, please make sure to provision the GATE instance with the necessary control plane crypto material (AS certificate and TRC).

See Certificate/TRC Provisioning for more information on how to configure the TRC and AS certificate. The easiest way to get an AS certificate is to request it via a sibling appliance.

A new alert GatewayASCertificateExpiresSoon has been added to indicate if the AS certificate expires soon.

Warning

Release v0.34.0 and newer require Ubuntu 22.04 as the system and at least the anapaya-system package with version 2.8.0. Please refer to Upgrade Notes for more information.

v0.37.4 (2024-10-29)

Fixes

  • The appliance now correctly selects the local SCION AS when running the POST /api/v1/cppki/certificates/request HTTP request, even if multiple SCION ASes are configured on the appliance. Previously, the appliance would always use the default SCION AS. If the certificate request targeted an issuer AS that was not reachable from the default SCION AS, e.g., disconnected ISDs, the request would fail due to no paths being available.

Improvements

  • The default cost of the bcrypt algorithm in appliance-cli crypto kdf has been increased from 10 to 12. We recommend that you re-create the hashes according to your threat model.
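
To decide which existing hashes to re-create, it can help to read the cost out of each stored hash. The sketch below assumes the hashes use the standard bcrypt modular crypt format (`$2b$<cost>$<salt and digest>`), which is an assumption about the output of appliance-cli crypto kdf; the function names are hypothetical.

```python
import re

# Standard bcrypt modular crypt format: $2b$<2-digit cost>$<22-char salt><31-char digest>
BCRYPT_RE = re.compile(r"^\$2[abxy]?\$(\d{2})\$[./A-Za-z0-9]{53}$")


def bcrypt_cost(hash_string):
    """Extract the cost factor from a bcrypt hash string."""
    match = BCRYPT_RE.match(hash_string)
    if match is None:
        raise ValueError("not a bcrypt hash in modular crypt format")
    return int(match.group(1))


def needs_rehash(hash_string, minimum_cost=12):
    """Return True if the hash was created with a cost below minimum_cost."""
    return bcrypt_cost(hash_string) < minimum_cost
```

A hash created with the previous default reports a cost of 10 and is flagged by `needs_rehash`. Whether to re-create it depends on your threat model.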

v0.37.3 (2024-10-10)

Fixes

  • The appliance now always returns a validation error if the VPP tun section is missing in the configuration and is required, regardless of whether the VPP section is set.

  • The default buffer and worker allocation validation now triggers when a new configuration is pushed. Previously, the validation introduced in v0.37.2 only ran when configuring the dataplane, which could lead to a non-running dataplane.

    Also, the available memory on the system is now correctly taken into account when doing the validation.

  • Fix a bug in the dataplane-control service that caused an error when adding a new route and bringing the interface up at the same time. When adding a new route to an interface, the interface’s admin state must be up. This change ensures that the interface is brought up before adding the route.

  • The appliance now validates that the dispatcher port 30041 cannot be allocated to any other service.
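
The dispatcher port check can be pictured as a simple conflict search. This is an illustrative sketch only: the mapping of service names to ports is a hypothetical stand-in, not the appliance configuration schema, and the appliance's actual validation is not shown in these notes.

```python
# Port reserved for the dispatcher, as stated in the release note above.
DISPATCHER_PORT = 30041


def find_dispatcher_port_conflicts(service_ports):
    """Return the names of services that try to use the dispatcher port.

    service_ports is a hypothetical mapping of service name -> port number.
    """
    return sorted(
        name for name, port in service_ports.items() if port == DISPATCHER_PORT
    )
```

An empty result means that no other service claims port 30041.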

v0.37.2 (2024-09-19)

Warning

Release v0.37.2 contains known bugs that affect the EDGE and GATE products. We recommend upgrading your EDGE and GATE appliances to this release only if you are running v0.37.0 or v0.37.1. If you are running an older release, we recommend waiting for the next patch release.

Fixes

  • The default number of VPP workers now also takes the available VPP buffers into account, such that validation errors are avoided on certain platforms.

    Previously, the calculation of the default number of VPP workers did not take the available VPP buffers into account. Combined with the validation added in v0.37.0, which checks that there is a sufficient number of VPP buffers, this could lead to validation errors if the values were not explicitly set in the appliance config.

  • The dataplane control now successfully updates IP neighbors and routes. Previously, applying updates would fail and in most cases require a dataplane service restart.

  • The IP-in-SCION tunneling process now correctly handles invalid certificate chains (e.g., due to expiry) and continues looking for a valid certificate.

    Previously, an expired certificate could prevent the IP-in-SCION tunneling service from finding a valid certificate and result in server unavailability. This could lead to a loss of IP-in-SCION connectivity for 15 minutes until the expired certificate was cleaned up.

  • Fix a race condition that could occasionally lead to a crash of the IP-in-SCION tunneling process due to incorrectly computed path fingerprints. This only affected setups with multiple local ASes configured.

Improvements

  • The scion-pki trc payload is now more ergonomic to use:

    • The validity.not_before field can now either be a UNIX or an RFC3339 compatible timestamp.

    • The validity.not_after field is an alternative to the validity.validity field which allows setting a UNIX or an RFC3339 compatible timestamp instead of the duration.

    • The cert_files list allows referencing certificates from the predecessor TRC with predecessor:<index>. This way, unchanged certificates do not need to be re-distributed during a TRC ceremony.

  • The new scion-pki trc payload dummy command creates a dummy payload in either PEM or DER format that can be used to test access to the signing keys in preparation for a TRC ceremony.
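
Accepting both timestamp forms for validity.not_before and validity.not_after amounts to normalizing either input to a single point in time. The following sketch shows one way a script that prepares such values could do this. It is not the scion-pki implementation, and the function name is hypothetical.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a UNIX timestamp or an RFC 3339 string into an aware UTC datetime."""
    if isinstance(value, int):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    text = str(value).strip()
    if text.isdigit():
        # UNIX timestamp given as a string of digits.
        return datetime.fromtimestamp(int(text), tz=timezone.utc)
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 on,
    # so rewrite it as an explicit UTC offset.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        raise ValueError("RFC 3339 timestamps require a UTC offset")
    return parsed.astimezone(timezone.utc)
```

With this helper, `parse_timestamp(1726704000)` and `parse_timestamp("2024-09-19T00:00:00Z")` describe the same instant, so both forms can be compared or validated uniformly.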

v0.37.1 (2024-09-11)

Fixes

  • Make sure the frr-exporter exports metrics again if BFD is not configured.

  • The ca-frontend no longer uses 100% CPU.

  • Fix an issue where the IP-in-SCION tunneling component would not read all the locally available IP prefixes.

Improvements

  • The ISD-AS number in the Distinguished Name of the CPPKI certificates is now shown in the output of scion-pki certificate inspect.

v0.37.0 (2024-09-06)

Improvements

Improved IP-in-SCION tunneling path selection algorithm

The internal path and remote endpoint selection algorithm for IP-in-SCION tunneling has been reworked significantly. The new algorithm reduces the traffic caused by the monitoring system and with this also reduces the CPU usage.

The new monitoring will also detect internal network disconnects in the destination ISD-AS and route around them if possible.

With the new monitoring there are also new metrics available and shown in the IP-in-SCION related dashboards. In the dashboards there are now specific panels that are only relevant for releases with version v0.37.0 and newer.

The new metrics are the following:

  • gateway_domain_traffic_redirections_total: indicates the number of traffic redirections per domain and traffic matcher. This metric is shown in the “Traffic redirections” graph on the IP-in-SCION tunneling dashboard. A traffic redirection means that traffic will be routed via a different path or to a different remote endpoint.

  • gateway_domain_paths_total: indicates the number of paths per domain and traffic matcher. This metric is shown in the different “Paths” graphs on the IP-in-SCION tunneling dashboard under the “Domain monitoring >= v0.37” section.

  • gateway_domain_traffic_matcher_sessions_total: indicates the number of sessions per domain and traffic matcher. This is used for alerting. If no session is available for a traffic matcher an alert is triggered as this means potential traffic loss. There is a new alert TunnelingDomainNoAlivePaths that uses this metric.
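
The alerting idea behind gateway_domain_traffic_matcher_sessions_total can be sketched as follows. This is not the expression of the TunnelingDomainNoAlivePaths alert, which these notes do not show. The tuple layout `(domain, traffic_matcher, session_count)` is a hypothetical representation of scraped samples.

```python
def matchers_without_sessions(samples):
    """Return the (domain, traffic_matcher) pairs that have no session.

    samples is an iterable of hypothetical (domain, traffic_matcher,
    session_count) tuples, one per sample of the sessions metric.
    """
    return sorted(
        (domain, matcher) for domain, matcher, count in samples if count == 0
    )
```

Every pair returned corresponds to a traffic matcher with potential traffic loss, which is the condition the alert is meant to surface.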

Various other improvements

  • The appliance API OAuth integration now supports role aliases. Role aliases can be used to map the role name in the OAuth token to a role name in the appliance API. This is useful for mapping different role names from different identity providers to the same role in the appliance. If no aliases are configured for a role the default aliases are appliance.<role>, appliance/<role>, and appliance:<role>. The appliance API currently only supports the reader and writer roles.

  • Add the SCION link type label to the router_interface_up metric.

  • Updating the MAC of an interface with VLAN sub-interfaces no longer causes the sub-interfaces to be recreated.

  • The appliance-controller and debugscraper systemd services now have a CPU quota set to 100% to ensure that they do not consume more than one CPU.

  • The appliance now validates that the number of buffers allocated by VPP is sufficiently high in relation to the number of VPP worker threads and the number of VPP interfaces. This validation is performed whether or not the number of buffers is explicitly configured.

  • Add SSH information to the appliance-cli info auth command.

  • Add validations to the appliance which ensure that VFs cannot be created on interfaces with ena, virtio-pci or vmxnet3 drivers.

  • ISD-AS numbers in non-canonical form (e.g. with capital letters) are now rejected by the appliance. Because configurations with such numbers did not work correctly in previous releases either, no migration is performed.

  • The appliance takes the installed system version into account when calculating the etag during configuration fetches and updates.

  • The appliance API now does not report an error if the same TRC is pushed multiple times as long as the contents are identical. This is particularly useful if a TRC bundle is pushed multiple times.

  • The appliance API returns the installed system package version as part of the metadata when querying the GET /config endpoint.

  • The appliance now requires that running the management API without authentication is explicitly enabled with the /management/api/unprotected field. Configurations that have no API authentication enabled and do not set this field to true are considered invalid. For appliances that migrate from previous releases, the unprotected flag is automatically set during migration if no authentication is enabled.

  • The number of buffers allocated by VPP can now be explicitly configured in the appliance configuration.

    {
        "system": {
            "vpp": {
                "num_buffer": 32400
            }
        }
    }
    

    The appliance validates that the memory allocated for VPP buffers fits into the configured hugepages memory.
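
The role alias behavior of the OAuth integration described in the list above can be illustrated with a short sketch. It assumes that configured aliases replace the defaults for a role and that a bare role name in the token does not match unless it is listed as an alias. Both points are assumptions, and the function names are hypothetical.

```python
# Roles currently supported by the appliance API, per the release note.
SUPPORTED_ROLES = ("reader", "writer")


def default_aliases(role):
    """Default aliases used when none are configured for a role."""
    return {f"appliance.{role}", f"appliance/{role}", f"appliance:{role}"}


def resolve_roles(token_roles, configured_aliases=None):
    """Map role names from an OAuth token to appliance API roles.

    configured_aliases is a hypothetical mapping of appliance role -> list of
    alias names; roles without configured aliases fall back to the defaults.
    """
    configured_aliases = configured_aliases or {}
    presented = set(token_roles)
    granted = set()
    for role in SUPPORTED_ROLES:
        aliases = set(configured_aliases.get(role) or ()) or default_aliases(role)
        if aliases & presented:
            granted.add(role)
    return granted
```

For example, a token carrying `appliance:writer` resolves to the writer role without any alias configuration, while an identity provider that issues `ops-admin` needs that name configured as an alias for writer.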

Change categories

The following change categories are used in the release notes.

  • Features: Describes new features that have been added. Example: The appliance API can now be protected with OIDC/OAuth2.

  • Improvements: Describes improvements to existing features. Example: The routing table implementation is now 30% faster.

  • Fixes: Describes bug fixes, i.e. previously broken behavior that is now fixed. Example: The appliance no longer crashes when adding a new route.