Appliance Release v0.36
This page contains the release notes for the v0.36 Anapaya appliance software release. The appliance software release is applicable for the following Anapaya products:
Anapaya CORE
Anapaya EDGE
Anapaya GATE
We recommend always upgrading to the latest available patch release. Check the Upgrade Notes of each release (if any) for special steps required when upgrading. For general information on how to upgrade your appliance, please refer to the Appliance Update Guide.
Known Issues
Appliance configurations that contain IP prefixes in non-canonical form can lead to IP-in-SCION tunneling service crashes in releases prior to v0.36.3.
On releases v0.36.2 and older, the following accept filter could lead to a crash of a software component and loss of IP-in-SCION tunneling connectivity:
"accept_filter": [
  {
    "action": "ACCEPT",
    "prefixes": ["198.51.100.210/28"],
    "sequence_id": 0
  }
]
Instead, the accept filter should be expressed in its canonical form:
"accept_filter": [
  {
    "action": "ACCEPT",
    "prefixes": ["198.51.100.208/28"],
    "sequence_id": 0
  }
]
Upgrade Notes
Warning
In release v0.36.3, the appliance controller has been extended with additional validation rules. If you are upgrading from a previous version and your appliance configuration contains IP prefixes in non-canonical form, the appliance controller will start and wait for the configuration to be fixed.
To ensure that your appliance is healthy, check the appliance controller logs after the update:
journalctl -u appliance-controller.service -e
Verify that the logs do not contain any log events with the following message:
"validating latest configuration","cause":{"msg":"invalid configuration"...
Warning
SSH keys for the root and anapaya users are now managed via the appliance configuration. If you have previously configured SSH keys for these users, the appliance will automatically migrate them to the appliance configuration. Please refer to SSH Configuration for more information.
It is recommended to verify that the SSH keys were migrated correctly after the upgrade by checking the appliance configuration. Note that if you upload a configuration with no SSH keys configured, the appliance will remove all SSH keys from the root and anapaya users and you might lose access to the appliance.
Warning
Release v0.36.0 and newer automatically configure multiple packet forwarding cores (if available). As a result, multiple streams are used to send IP-in-SCION tunneled traffic to a remote EDGE or GATE appliance. EDGE and GATE appliances on releases older than v0.34.0 do not support this feature. If you have such remote peers, you should configure only a single forwarding core in the system.vpp.cpu.workers configuration:
{
  "system": {
    "vpp": {
      "cpu": {
        "workers": 1
      }
    }
  }
}
Warning
Release v0.34.0 and newer require Ubuntu 22.04 as the operating system and the anapaya-system package at version 2.8.0 or newer.
v0.36.5 (2024-08-16)
Improvements
The SCION control service now includes the beacon ID when logging beacon propagation errors. This allows operators to track down offending beacons more easily.
The SCION control service now uses a more optimized verification function during signature verification based on AS certificate chains.
The SCION control service now validates that it has the required AS certificate chains when propagating beacons. Previously, in rare edge cases (e.g., after manually deleting an AS certificate from the trust database) the service would propagate beacons that could not be verified by the receivers due to missing certificate chains.
v0.36.4 (2024-07-29)
Fixes
Fixed the automatically generated list of allowed interfaces in the local topology.
Fixed a bug where the VPP_VMXNET3 driver would not work in certain circumstances.
The gateway_prefix_fetch_invalid_total metric is now correctly reporting values again.
Path changes that differ only in expiry are no longer counted in the gateway_session_path_changes metric.
The gateway_prefixes_advertised metric is correctly reporting values again.
v0.36.3 (2024-07-18)
Fixes
The appliance controller now requires all IP prefixes to be provided in canonical form. In prior releases, a non-canonical IP prefix in the accept filter could crash the IP-in-SCION tunneling service.
The gateway_frames_discarded_total metric is now correctly reported again.
Adding or deleting traffic matchers of a domain now properly forwards traffic for the affected prefixes of the domain instead of silently dropping it.
Changes to discovered metadata from a remote IP-in-SCION tunneling endpoint are now correctly taken into account. Previously, changes like updated allowed interfaces were ignored.
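The canonical-form requirement introduced in this release can be checked ahead of an upgrade. The following is a minimal sketch using Python's standard ipaddress module. The helper names canonical_prefix and is_canonical are illustrative and are not part of the appliance tooling.

```python
import ipaddress


def canonical_prefix(prefix: str) -> str:
    """Return the canonical form of an IP prefix (host bits cleared)."""
    # strict=False masks out the host bits instead of raising ValueError.
    return str(ipaddress.ip_network(prefix, strict=False))


def is_canonical(prefix: str) -> bool:
    """Report whether the prefix is already written in canonical form."""
    try:
        # strict=True rejects prefixes that have host bits set.
        ipaddress.ip_network(prefix, strict=True)
    except ValueError:
        return False
    return True


# The prefix from the Known Issues section and its canonical form.
print(is_canonical("198.51.100.210/28"), canonical_prefix("198.51.100.210/28"))
```

Running such a check over every prefix in a configuration before uploading it is one way to avoid the controller waiting on an invalid configuration after the upgrade.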
Improvements
The rate at which the local metrics are scraped and injected into the log journal for the debug information archive has been reduced.
v0.36.2 (2024-06-14)
Fixes
The journald log entries are now exposed via /api/v1/debug/logs/entries, as expected by the specification. Before, the endpoint was exposed via /api/v1/logs/entries.
The appliance-controller now correctly generates the gateway configuration when SCION RSS is enabled in the appliance configuration.
The GATE flow exporter can no longer crash when a large number of flows are added and deleted.
Monitoring a large number of paths will no longer lead to dropped path monitoring probes in the IP-in-SCION tunneling component.
The IP-in-SCION tunneling endpoint now properly discovers remote endpoints in a disconnected AS. Previously, failover towards EDGE gateways would not have worked in such a scenario.
Improvements
The IP-in-SCION tunneling component now monitors at most 100 paths per remote AS. This prevents high load in topologies with very high path diversity.
v0.36.1 (2024-05-28)
Fixes
The VPP dataplane no longer experiences a rare crash that could happen when gateway flow metrics were enabled (GATE only).
The syslog can no longer fill up the log partition.
v0.36.0 (2024-05-23)
Features
Source NATing for outgoing traffic
The appliance now supports source NATing for outgoing traffic. While outgoing traffic on any interface can be NATed, this feature is particularly useful for NATing traffic that will be sent over an IP-in-SCION tunnel. Outgoing source NAT is useful for deployments that only have a single (or few) public IP address(es) that can be tunneled through an IP-in-SCION tunnel. The NAT allows multiple internal hosts to share the same public IP address.
An operator can configure the NAT address pool, i.e., the list of IPv4 prefixes that can be used as public IP addresses for NATing. These addresses should also be announced to remote IP-in-SCION tunneling endpoints. Furthermore, the operator defines the outgoing interfaces to which the NAT is applied; in most cases, this will be the special scion-gateway interface. Finally, it is also possible to exclude certain addresses from being NATed, e.g., in case a host should be reachable directly via its public IP address.
For more information on how to configure source NATing, please refer to Network Address Translation (NAT). For a specific example of configuring outgoing source NATing for IP-in-SCION traffic, please refer to Configuring egress NAT.
Configuring source NAT for IP-in-SCION traffic
"interfaces": {
  "ethernets": [
    {
      "name": "lan0",
      "addresses": ["192.168.1.1/24"],
      "driver": "VPP_DPDK"
    },
    {
      "name": "wan0",
      "addresses": ["169.254.0.2/31"],
      "driver": "VPP_DPDK"
    }
  ],
  "loopbacks": [
    {
      "name": "loop0",
      "addresses": ["203.0.113.50/32"]
    }
  ]
},
"nat": {
  "snat": {
    "address_pool": ["203.0.113.50/32"],
    "exclude": [],
    "interfaces": ["scion-gateway"]
  }
}
Automatic forwarding core assignment
The appliance uses the Vector Packet Processor (VPP) framework as its forwarding dataplane. VPP could already be configured to use multiple CPU cores (workers) for packet processing through the system.vpp.cpu.workers setting.
Starting with the current release, the appliance automatically calculates a suitable worker configuration based on the number of available CPU cores. The automatic assignment of worker threads takes into account sibling cores when hyper-threading is enabled and aims to not assign workers to sibling cores of the control plane.
These values can be overridden by explicitly configuring the system.vpp.cpu.main_core, system.vpp.cpu.workers, and/or system.vpp.cpu.corelist_workers entries. However, for most deployments, the automatic assignment should produce the best results.
Note
Each configured worker is pinned to a separate CPU core. These workers will consume 100% of the core they are pinned to, because the worker is constantly polling for packets.
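To illustrate the idea of keeping workers off the hyper-threading siblings of control plane cores, here is a small sketch. It is not the appliance's actual assignment algorithm, which is not documented here; the function name, the sibling map, and the notion of a set of control plane cores are all assumptions made for the example.

```python
from typing import Dict, List, Set


def pick_worker_cores(
    siblings: Dict[int, Set[int]],
    control_plane_cores: Set[int],
    workers: int,
) -> List[int]:
    """Pick up to `workers` cores, skipping control plane cores and their siblings.

    `siblings` maps each logical core ID to the set of logical cores that
    share its physical core (hypothetical input format).
    """
    reserved = set(control_plane_cores)
    for core in control_plane_cores:
        reserved |= siblings.get(core, set())
    # Remaining cores are candidates; take the lowest-numbered ones first.
    candidates = [core for core in sorted(siblings) if core not in reserved]
    return candidates[:workers]


# Hypothetical 4-core, 8-thread CPU: logical cores n and n+4 are siblings.
topology = {c: {c % 4, c % 4 + 4} for c in range(8)}
print(pick_worker_cores(topology, control_plane_cores={0}, workers=2))
```

In this hypothetical layout, cores 0 and 4 are left to the control plane and the workers are taken from the remaining cores.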
Management of SSH keys via the appliance configuration
The appliance can now manage SSH keys for the root and anapaya users on the appliance. Other users are not supported on the appliance. An arbitrary number of authorized SSH public keys can be configured for those users. Keys must be base64-encoded as used in the authorized_keys file, e.g., as produced by the ssh-keygen tool.
Refer to SSH Configuration for more information on how to configure SSH keys for the root and anapaya users.
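Before adding a key to the configuration, it can help to confirm that the base64 part is well formed. The sketch below checks an entry written as "key-type base64-blob [comment]", as ssh-keygen writes it to a .pub file. It assumes there is no options prefix on the line, and the function name is illustrative. How the appliance configuration expects the key to be supplied is described in SSH Configuration, not here.

```python
import base64
import binascii
import struct


def check_public_key(line: str) -> bool:
    """Check that a 'type base64 [comment]' entry has a well-formed key blob."""
    fields = line.split()
    if len(fields) < 2:
        return False
    key_type, blob_b64 = fields[0], fields[1]
    try:
        blob = base64.b64decode(blob_b64, validate=True)
    except (binascii.Error, ValueError):
        return False
    if len(blob) < 4:
        return False
    # The decoded blob starts with a 4-byte big-endian length followed by
    # the key type name, which should match the first field of the line.
    (name_len,) = struct.unpack(">I", blob[:4])
    return blob[4:4 + name_len] == key_type.encode()
```

A mismatch between the declared key type and the type embedded in the blob usually indicates a truncated or mangled copy of the key.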
Simultaneous IPv4 and IPv6 BGP peering and BFD support
The appliance now supports the simultaneous configuration of IPv4 and IPv6 BGP neighbors. Separate neighbors must be configured for each address family. It is not supported to exchange IPv4 and IPv6 routes using the same BGP session.
Furthermore, the appliance now supports BFD for BGP sessions. This allows for faster detection of BGP session failures. Use the new bgp.bfd configuration options to enable BFD for BGP sessions.
Refer to Configuring BGP for more information on how to configure BGP neighbors and BFD for BGP sessions.
Appliance debug information archive
The appliance now supports generating a debug information archive containing important state, configuration, and log information. This archive can be used by Anapaya support and engineering to diagnose issues with the appliance. To create an archive, run the appliance-cli debug dump --duration <duration> command. The duration specifies the time range for which logs should be included in the archive and is set to 1 hour by default.
Improvements
Improved behavior for path changes in the IP-in-SCION tunneling service
The IP-in-SCION tunneling service has a complex control plane that is responsible for configuring the data plane based on path and routing policies, announced prefixes, availability of remote endpoints, and health and performance characteristics of SCION paths. In our previous release, we rewrote the IP-in-SCION tunneling service from scratch to improve its overall scalability. In this release, we have further improved the behavior of the reconfiguration pipeline specifically for path changes. A path change now triggers only a minimal amount of reconfiguration, which reduces the pressure on the dataplane API under high churn conditions.
Improved control over announced and accepted IP prefixes
The announce and accept filters for IP prefixes in routing domain configurations now allow more control over which prefixes are announced and accepted. For example, if a remote gateway announces 1.2.3.0/24 but the local prefix filter accepts 1.2.3.0/25 and 1.2.3.128/25, the two /25s are announced to the local network. Additionally, the original /24 is also announced to the local network, since the combined set of accept filters covers the /24.
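The coverage condition in the example above, where two /25 filters together cover the announced /24, can be reproduced with Python's standard ipaddress module. This sketch only illustrates the set arithmetic; the function name is hypothetical and the code does not model the appliance's filter evaluation (actions, sequence IDs, and so on).

```python
import ipaddress
from typing import List


def covered_by_filters(announced: str, accept_prefixes: List[str]) -> bool:
    """Report whether the accept prefixes together cover the announced prefix."""
    target = ipaddress.ip_network(announced)
    same_family = [
        net
        for net in (ipaddress.ip_network(p) for p in accept_prefixes)
        if net.version == target.version
    ]
    # collapse_addresses merges adjacent and overlapping networks, so two
    # sibling /25s become a single /24.
    merged = ipaddress.collapse_addresses(same_family)
    return any(target.subnet_of(net) for net in merged)


print(covered_by_filters("1.2.3.0/24", ["1.2.3.0/25", "1.2.3.128/25"]))
```

With only one of the two /25 filters configured, the same check reports that the /24 is not covered.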
New functionality for the appliance CLI
appliance-cli service metrics level sets the metric level of SCION services. The default level is prod. With the level set to debug, the service exposes additional metrics.
appliance-cli crypto forwarding-key --size X generates a SCION AS forwarding key on the appliance.
appliance-cli info auth shows the configured management API users and whether they are using the insecure default password.
appliance-cli info network shows network interface state information.
appliance-cli info scion shows SCION state information.
appliance-cli info tunneling prints the status of the IP-in-SCION tunneling. For each domain, it reports the number of prefixes that are received/advertised and the remote endpoints per ISD-AS.
Improved appliance health API endpoint
A new health API endpoint has been added to the appliance: GET /api/v1/health. It reports the health status of the appliance based on a set of health checks that are executed. appliance-cli get health will show the health status of the appliance. This endpoint should be preferred over the existing GET /api/v1/debug/services/{service_name}/health endpoints.
For more details about the health checks refer to the Appliance API Specification.
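As a rough sketch, the new endpoint could be queried from a monitoring script as follows. The base URL is a placeholder, authentication and TLS settings are omitted because they depend on the deployment, and the response body is returned unparsed since its schema is defined in the Appliance API Specification rather than here.

```python
import urllib.request
from urllib.parse import urljoin

HEALTH_PATH = "/api/v1/health"  # endpoint path as given in these release notes


def health_url(base_url: str) -> str:
    """Build the health endpoint URL for a given management API base URL."""
    return urljoin(base_url, HEALTH_PATH)


def fetch_health(base_url: str, timeout: float = 5.0) -> bytes:
    """GET the health endpoint and return the raw response body."""
    # Assumes the API is reachable without extra credentials; adapt as needed.
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
        return resp.read()
```

For example, fetch_health("https://appliance.example") would request https://appliance.example/api/v1/health, where appliance.example stands in for the management address of the appliance.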
Configurable CPU and memory resource limits per service
The CPU and memory resource limits can now be configured per service using the system.resources.service_limits section. This allows fine-grained control over how much CPU and memory each service can use. Furthermore, the defaults for all performance-critical services have been adjusted to better match the actual resource requirements of these services.
Note
This is a feature for expert users. The default values should be sufficient for most deployments.
Various other improvements
The size of RX/TX queues for VPP interfaces can now be configured on the appliance. The default value is 1024. This is an expert setting and the default value should be sufficient for most deployments.
The VRRP preempt mode can be disabled by setting the no_preempt option to true in the VRRP configuration. This can be used to prevent a backup from taking over the master role even if it has a higher priority.
The appliance API now validates that the configured VPP main thread and the number of VPP worker threads are valid with respect to the CPUs available on the appliance.
The validation of the appliance cluster configuration is now stricter and detects more misconfigurations. Specifically, it detects duplicate cluster synchronization endpoints, duplicate SCION shards, duplicate SCION control addresses, and duplicate SCION neighbor interfaces.
Fixes
The appliance-cli info command no longer reports misleading health status information for the appliance.
The BGP ASNs 23456, 65535, and 4294967295 can no longer be configured on the appliance. These ASNs are reserved and should not be configured.
The VRRP preempt mode is now enabled by default, as defined in RFC 5798. This means that the master router will preempt the backup router if it comes back online after a failure. This is a change from the previous behavior where the preempt mode was disabled by default.
The appliance now correctly configures the firewall to allow SCION traffic on the internal SCION interface if the interface is configured using the LINUX driver. Before, the firewall would block SCION traffic on such interfaces.
The VPP main thread will now no longer use 100% of a CPU core when the number of VPP worker threads is larger than 0.
IP prefixes configured in the static announcements are now properly announced again after deconfiguring the next-hop IP address.
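The reserved-ASN restriction listed among the fixes above can be mirrored in a pre-upload check. This is a minimal sketch covering only the three ASNs named in these notes; the function name is illustrative and the appliance may apply further validation rules.

```python
# ASNs that these release notes list as no longer configurable.
RESERVED_ASNS = frozenset({23456, 65535, 4294967295})


def is_configurable_asn(asn: int) -> bool:
    """Return False for the reserved ASNs named in the release notes."""
    return asn not in RESERVED_ASNS


for asn in (64512, 23456):
    print(asn, is_configurable_asn(asn))
```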
Breaking changes
For all static announcements, next-hop tracking is now enabled by default. The enabled flag is replaced by a disabled flag, which is set to false by default. If disabled is set to true, next-hop tracking is disabled. The automatic configuration migration adapts the configuration accordingly. Make sure that you have a next-hop set if you have static announcements.