Customer Success
ASIC, a large Australian federal government agency, was seeking to gain greater business value from, and deeper insights into, its wide range of disparate data assets. In addition to uplifting its advanced analytics capabilities, the agency wanted to use a data lake to deliver real-world benefits including improved fraud detection, enhanced market awareness and predictive modelling in areas such as superannuation, mortgages and business lending.
The agency launched an open tender in 2019 to partner with a trusted data services provider to design, build and manage an enterprise-grade Data Lake that incorporated best-of-breed tools and was capable of passing an independent IRAP security assessment at the PROTECTED level.
CyberCX won this competitive tender and has been working with the client on an ongoing basis to design and deliver the platform. We currently have an established eight-person team of Data Architects, Engineers, Analysts and Data Scientists working closely with the customer's business and IT teams.
Solution
In this hybrid solution, the existing business intelligence (Qlik Sense) components are hosted on-premises, while all Data Lake infrastructure runs inside the AWS public cloud. All the underlying AWS services that make up the Data Lake have previously been IRAP certified. From an access and identity perspective, all components of the solution are federated to the agency's Active Directory domain, and all logging is integrated with its existing commercial SIEM solution.
The successful pilot for this project incorporated industry-leading commercial products: Snowflake EDW, Databricks UAP (for ETL, processing and ML) and Alex Solutions (for governance, catalogue, lineage and data quality). It also used a number of native AWS services, including Glue, Athena and Kinesis Data Streams for live feeds.
The project is now in full build phase and includes a number of additional cloud-native machine learning tools, such as Amazon Textract, which provides advanced Optical Character Recognition (OCR) for unstructured data sets such as PDF documents and scanned images. It leverages a wide range of storage technologies for both data source ingestion and backend processing, including the highly performant Amazon FSx for NetApp ONTAP.
Challenges and outcomes
One of the key challenges of this project was stakeholder management and helping to define the customer’s data strategy and governance policy. Prior to this point, the customer had fragmented business teams and a general lack of structure and maturity in its data practices. CyberCX therefore provided business and data analysts, data architects and project management resources to assist the client with the cultural shift and organisational changes required to make the project a success.
We have also needed to work collaboratively with business and IT teams during the COVID-19 outbreak, when remote working and social distancing became necessary. CyberCX has risen to the challenge and continues to work effectively with the customer to co-ordinate and deliver the project. A key success factor was our ability to quickly comply with, and adapt to, the customer’s policies and standards relating to security and remote access.