Keep the main thing, the main thing. — Stephen Covey
Consideration of resilience in Air Traffic Services (ATS) began with reactive management of disruptions – if something major impacts the service, how soon can contingency plans be enacted to restore the service, even at reduced capacity?
It used to be that technical resilience meant a redundant contingency centre – a new building or control tower duplicating the functionality of the main operational centre. Often, this was cost-prohibitive, or did not actually prevent the most common continuity failures, which can last 30-120 minutes but bring operational impacts over days or weeks. The reason is that the redundant centre is typically not a “hot back-up” (able to operate immediately) but only “warm” (able to operate after a warm-up period of 1-3 hours, including getting staff to the location), so it is often not ready until the original failure is already over.
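The hot versus warm argument above comes down to simple arithmetic, and a short Python sketch may make it concrete. It uses only the ranges quoted in the text. The function name, the sample values picked from those ranges, and the assumption that the back-up provides no service at all until its warm-up completes are illustrative simplifications, not a model of any particular ANSP.

```python
def standby_coverage_min(outage_min: float, warmup_min: float) -> float:
    """Minutes of an outage during which the back-up centre is actually operating.

    Simplifying assumption: the back-up starts warming up the moment the
    outage begins and provides no service until the warm-up is complete.
    """
    return max(0.0, outage_min - warmup_min)


# Sample values taken from the ranges quoted in the text, in minutes.
OUTAGE_DURATIONS = (30, 60, 120)   # failures lasting 30-120 minutes
WARM_UP_TIMES = (60, 120, 180)     # warm-up period of 1-3 hours
HOT_WARM_UP = 0                    # assumption: "immediately" modelled as zero delay

if __name__ == "__main__":
    for outage in OUTAGE_DURATIONS:
        hot = standby_coverage_min(outage, HOT_WARM_UP)
        warm = [standby_coverage_min(outage, w) for w in WARM_UP_TIMES]
        print(f"outage {outage:>3} min: hot covers {hot:.0f} min, warm covers {warm}")
```

Under these assumptions the warm back-up contributes only in the single best case (a 120-minute outage with a 60-minute warm-up). In every other combination the failure has ended before the contingency centre is ready, while the knock-on operational impacts still run their course.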
This means that ANSPs implicitly accept a rare failure (for example, once in 5 or 10 years) impacting service continuity. The word “implicitly” here is important. Many ANSPs have not developed explicit targets for how often they would be willing to accept service continuity issues, perhaps because they feel any such targets are out of their control or politically sensitive.
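To illustrate what making such a target explicit could look like, the following Python sketch converts the example figures from the text (one failure every 5 or 10 years, lasting 30-120 minutes) into an expected annual downtime and an implied availability. The function names are hypothetical, and the calculation is a plain long-run average that ignores leap years. It also counts only the outage itself, not the days or weeks of operational impact that may follow.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years


def expected_annual_downtime_min(years_between_failures: float, outage_min: float) -> float:
    """Long-run average outage minutes per year for one failure every N years."""
    return outage_min / years_between_failures


def implied_availability(years_between_failures: float, outage_min: float) -> float:
    """Fraction of the year the service is up, counting only the outage itself."""
    downtime = expected_annual_downtime_min(years_between_failures, outage_min)
    return 1.0 - downtime / MINUTES_PER_YEAR


if __name__ == "__main__":
    # Example figures from the text: once in 5 or 10 years, 30-120 minute outages.
    for years in (5, 10):
        for outage in (30, 120):
            downtime = expected_annual_downtime_min(years, outage)
            availability = implied_availability(years, outage)
            print(f"1 in {years} yr, {outage} min: "
                  f"{downtime:.1f} min/yr, availability {availability:.6%}")
```

Writing the numbers down in this form turns an implicit acceptance into a figure that can be debated, tracked against observed events, and revised.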
But resilience risk controls can be more than reactive. Done well, they are proactive and predictive: gathering data on minor events that affect outcomes, and carefully analysing that performance data to derive changes in the design and operation of services and systems which prevent or control the risks.
Only relatively recently has the resilience of critical services become a proactive and explicit factor in the enterprise and technical design of ANSPs. Of course, this can be driven by reputational and political damage from service outages. But it is also partly due to the emergence of resilience engineering in aviation, bringing awareness of the designed continuity of technical systems and services. More options are also becoming available to decision makers with the advent of virtual services, digitalisation and subscription-based approaches (Software as a Service, rather than an owned asset). The pandemic has also spurred novel rostering practices to improve service continuity, splitting teams whilst ensuring recency and performance, and widening the appreciation of what is feasible.
It’s clear today that resilience is about more than just disruption and response. It is about flexibility and scalability in the face of change, helping to ensure the service disruption doesn’t happen in the first place.
“The secret of change is to focus all of your energy, not on fighting the old, but on building the new.” — Socrates
I had the pleasure of participating in CANSO and EUROCONTROL’s Global Resilience Summit in December 2021. The panel discussions delved into the various flavours of resilience – at organisational, individual and technical levels.
A key takeaway was the importance of strategic alignment when pursuing resilience goals for key services. An ANSP may have a state-of-the-art resilient communications network, but if there are staff shortages or ATM system issues, the ultimate performance goal of Air Traffic Service continuity will still suffer. Similarly, how can an ANSP build in flexibility and scalability in its operations to cope with unanticipated traffic growth?
The figure below illustrates the alignment of various layers of an organisation, focused on a service-oriented approach. For example: