
Digital Sovereignty: Federal Government Case Study
The Problem:
The customer faced several urgent issues in their on-premises IT environment that would soon have a significant impact on their ability to run mission-critical business applications. Of particular concern were the on-premises NetApp storage volumes, which were fast approaching 700TB and due to hit a physical limit within the next 6 months.
Hitting this limit would severely restrict the customer’s ability to perform new fraud investigations, one of the core tenets of the organisation’s mission. In addition, compute and networking hardware in the environment was reaching end of life and in need of refresh. Because of delays in ordering and receiving new hardware during the pandemic, an on-premises solution was ruled out, as it could not address these issues in time.
The CyberCX Digital Sovereignty Practice first applied our Risk Management Process to understand the customer’s Digital Sovereignty requirements. Moving this critical data to the cloud required a PROTECTED grade environment aligned to several Digital Sovereignty principles, including:
- Compliance with regulatory requirements (PROTECTED grade environment)
- Ensure locality of data
- Manage Access to Data
- Encryption in transit and at rest
- Business continuity
- Australian partner with cleared resources
The Solution:
CyberCX first worked with the customer on the design and implementation of a PROTECTED grade AWS environment. The secure Landing Zone/Control Tower environment would run multiple production workloads and was fully integrated with the customer’s security ecosystem, which included federated IDAM, SIEM and dynamic threat detection/monitoring. A project was then initiated to leverage this existing environment for the migration of the customer’s core digital forensics platform. This included a range of compute (EC2, EKS and Lambda) and database (RDS) instances, as well as dynamic high-performance Amazon WorkSpaces to be used for investigations.
To satisfy the capacity, availability and performance requirements of the current on-premises data storage solution, we chose Amazon FSx for NetApp ONTAP (FSxN), the cloud-native NetApp storage service. This AWS managed service is a fully featured NetApp solution that allows secure, efficient and reliable replication of data via the standard SnapMirror feature. By establishing a native Storage Virtual Machine (SVM) peering relationship between the on-prem and cloud-hosted volumes, we were able to manage all cluster volumes centrally from both the NetApp console and CLI.
We deployed the cloud-hosted FSxN volumes in a Multi-AZ (scale-up) configuration in active/standby mode. Due to throughput limits on VPN tunnel connections to the AWS Transit Gateway, we configured an individual throughput capacity of 1024MBps, with in-memory caching of 128GB, NVMe caching of 1900GB and a baseline SSD IOPS of 40000.
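As a rough illustration, the file system settings described above map onto a request of roughly the following shape for the AWS FSx `CreateFileSystem` API (exposed in boto3 as `create_file_system`). This is a sketch only: the function name, subnet IDs, KMS alias, storage capacity and the choice of `MULTI_AZ_1` as the deployment type are assumptions for illustration, not values from the engagement. The dictionary is built locally and never sent to AWS.

```python
from typing import Any, Dict, List


def build_fsxn_request(
    storage_capacity_gib: int,
    subnet_ids: List[str],
    preferred_subnet_id: str,
    kms_key_id: str,
) -> Dict[str, Any]:
    """Return keyword arguments shaped for boto3's fsx.create_file_system.

    All argument values are hypothetical placeholders supplied by the caller.
    """
    if preferred_subnet_id not in subnet_ids:
        raise ValueError("preferred subnet must be one of the Multi-AZ subnets")
    return {
        "FileSystemType": "ONTAP",
        "StorageCapacity": storage_capacity_gib,  # SSD capacity in GiB (assumed)
        "SubnetIds": subnet_ids,                  # one subnet per Availability Zone
        "KmsKeyId": kms_key_id,                   # customer-managed key for encryption at rest
        "OntapConfiguration": {
            "DeploymentType": "MULTI_AZ_1",       # assumed: active/standby across two AZs
            "ThroughputCapacity": 1024,           # MBps, the figure quoted in the text
            "PreferredSubnetId": preferred_subnet_id,
        },
    }


if __name__ == "__main__":
    request = build_fsxn_request(
        storage_capacity_gib=10240,               # hypothetical
        subnet_ids=["subnet-aaa", "subnet-bbb"],  # hypothetical
        preferred_subnet_id="subnet-aaa",
        kms_key_id="alias/example-fsx-key",       # hypothetical
    )
    print(request["OntapConfiguration"])
```

In a real deployment these values would more likely live in a CloudFormation template than in ad hoc Python.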
Once we enabled multipath secure VPN tunnels over a dedicated 10Gbps Direct Connect, we were able to stream data in parallel across an end-to-end encrypted link whilst maintaining application availability at source. One of the key benefits of SnapMirror is that it can effectively compress and deduplicate data to maximise replication efficiency as well as reduce storage redundancy. With this configuration, we were able to achieve close to the theoretical throughput of 1024MBps per VPN tunnel. A total of 576TB was securely and successfully transferred in just over 16 days.
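For a sense of scale, the transfer figures above can be turned into an average sustained rate. The sketch below assumes decimal units (1TB = 1,000,000MB) and exactly 16 days of continuous transfer. Both are simplifications, and the helper names are ours. It ignores SnapMirror compression and deduplication, scheduling and any throttling, so it describes whole-window averages rather than peak link throughput.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MB_PER_TB = 1_000_000  # decimal units, an assumption


def average_rate_mbps(total_tb: float, days: float) -> float:
    """Average rate in MB/s needed to move total_tb within the given days."""
    return (total_tb * MB_PER_TB) / (days * SECONDS_PER_DAY)


def days_at_rate(total_tb: float, rate_mbps: float) -> float:
    """Days needed to move total_tb at a constant rate in MB/s."""
    return (total_tb * MB_PER_TB) / rate_mbps / SECONDS_PER_DAY


if __name__ == "__main__":
    # 576TB and 16 days are the figures quoted in the text.
    print(f"average rate: {average_rate_mbps(576, 16):.1f} MB/s")
    # 1024 MB/s is the per-tunnel figure quoted in the text.
    print(f"days at 1024 MB/s: {days_at_rate(576, 1024):.2f}")
```

Under these assumptions the whole-window average works out to roughly 417MB/s.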
Once all data had been replicated, we were able to establish effective storage tiering policies to optimise both performance (for regularly accessed and cached data) and cost (for archive and infrequently accessed data), leveraging the benefits of the native FSx, S3 and Glacier AWS services.
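To make the tiering idea concrete, here is a minimal sketch of how volumes might be mapped to FSx for NetApp ONTAP tiering policies. The policy names (`AUTO`, `ALL`, `NONE`) and the `CoolingPeriod` field follow the FSx volume API. The workload labels and the 31-day cooling period are assumptions for illustration, not the settings used in this engagement.

```python
from typing import Any, Dict

# Hypothetical workload labels mapped to FSx for ONTAP volume tiering policies.
_POLICIES: Dict[str, Dict[str, Any]] = {
    # Keep hot blocks on SSD; move blocks not read for 31 days to the capacity tier.
    "active": {"Name": "AUTO", "CoolingPeriod": 31},
    # Send all data to the lower-cost capacity tier.
    "archive": {"Name": "ALL"},
    # Never tier; everything stays on SSD.
    "pinned": {"Name": "NONE"},
}


def tiering_policy(workload: str) -> Dict[str, Any]:
    """Return a TieringPolicy-shaped dict for a (hypothetical) workload label."""
    try:
        return dict(_POLICIES[workload])  # copy so callers cannot mutate the table
    except KeyError:
        raise ValueError(f"unknown workload label: {workload!r}") from None


if __name__ == "__main__":
    for label in ("active", "archive", "pinned"):
        print(label, tiering_policy(label))
```

The returned dictionary has the shape that would be passed as `TieringPolicy` when creating or updating an ONTAP volume.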
Whilst the storage replication was taking place, we built out the compute and database infrastructure under a dedicated account across multiple AZs to improve the availability and recoverability of the new solution. This was done using Infrastructure as Code (IaC) and CI/CD, leveraging CloudFormation and the customer’s existing DevOps tooling.
After integration, performance and usability testing was complete, the production cutover to the new solution was implemented through a simple DNS change outside business hours. Aside from an initial issue with populating the cache for ongoing investigations, and some performance tuning related to tiering efficiency, users reported that the responsiveness and performance of the new system were excellent, with no noticeable degradation of service between the on-prem and cloud platforms.
This solution was fully integrated into the customer’s established security ecosystem and was delivered to the PROTECTED standard, which includes provisions for data residency and for encryption in transit and at rest. It also met the other stated requirements.
The Result:
The customer sought to satisfy multiple Digital Sovereignty requirements in the delivery of this engagement, including:
- Compliance with regulatory requirements (PROTECTED grade environment) – the solution was deployed into the PROTECTED grade landing zone designed and established by CyberCX
- Ensure locality of data – the solution is designed to only allow deployment of resources and data into the Sydney (SYD) region of AWS
- Manage Access to Data – through compliance with CyberCX IAM best practices aligned with the PROTECTED grade environment requirements.
- Encryption in transit and at rest – all data in the solution is encrypted in transit and at rest. At rest, this is achieved using AWS KMS for block storage (EBS / FSxN), with KMS policies configured for least privilege. As described earlier, VPN tunnels were configured to encrypt traffic over the Direct Connect. Where possible, Layer 7 Suricata rules were implemented on the AWS Network Firewall to ensure traffic was encrypted (TLS 1.3). TLS certificates are managed with AWS Certificate Manager.
- Business continuity – resiliency is achieved through a multi-AZ design using an Active/Standby configuration of FSxN that is inherently resilient, allowing the customer to continue operating in the case of failure within one AZ
- Australian partner with cleared resources – CyberCX is one of the most trusted local partners with cleared resources, and is the only partner to hold the AWS Authority to Operate specifically for the Public Sector.
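As an illustration of the data locality item in the list above, one common way to restrict deployments to a single region is an AWS Organizations service control policy keyed on the `aws:RequestedRegion` condition. The sketch below generates such a policy for Sydney (`ap-southeast-2`). It is not necessarily the mechanism used in this engagement, and the function name, statement ID and list of exempted global services are assumptions.

```python
import json
from typing import Any, Dict, List

SYDNEY = "ap-southeast-2"

# Hypothetical exemptions: global services whose calls are not region-scoped.
GLOBAL_SERVICE_ACTIONS = ["iam:*", "organizations:*", "route53:*", "sts:*", "support:*"]


def region_lock_policy(allowed_regions: List[str]) -> Dict[str, Any]:
    """Build an SCP document denying requests outside the allowed regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                # Deny everything except the exempted global services...
                "NotAction": GLOBAL_SERVICE_ACTIONS,
                "Resource": "*",
                # ...whenever the request targets a region not in the list.
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": allowed_regions}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(region_lock_policy([SYDNEY]), indent=2))
```

A policy like this only complements, and does not replace, the landing zone controls described above.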
This project was successfully delivered on time and on budget within a 16-week period with no outages to production or loss of data. We were able to avoid hitting capacity for the on-premises storage arrays and move towards a compliant, scalable, resilient and performant cloud-based solution. Additional benefits included a significantly reduced recovery time in the event of systems/datacentre failure as well as the ability to spin up and down environments on demand.