Scalable Computing for Drone Data


Summary

Our team describes its high-powered, scalable, open-source compute design strategy for processing drone data, and how it is increasing the accessibility of drone-based data research in Australia.


In this post, the ASDC team describes the project's scalable computing strategy and how it aims to solve many of the common issues that researchers and institutions face when processing high volumes of drone data.

In listening to the partners involved in the project, institutions external to the project, and early users of the platform, we have heard many common stories about the challenges people face in scaling up their drone programs. While there were plenty of experiences relating to drone aircraft and operations at scale, the barriers of most concern to institutions that use drones frequently consistently centred on data: its storage and the management of computational resources.

Most of the partner institutions of the ASDC have developed internal compute processing capability. However, envisaging that the use of drone data in fundamental research will continue to grow, the group has come together to develop national infrastructure that aims to solve many of the common issues identified.

Our compute solution

The key focus of the development operations and architecture team has been to create scalable cloud compute systems that are designed to be responsive to demand. By leveraging international standards and established methods for managing research infrastructures, the ASDC team aims to ensure Australia contributes to establishing best practice. 

Providing scalable computing infrastructure is seen as one of the essential pillars of the project, underpinning our objective of enabling meaningful impact from fundamental research. To accomplish this, we use Kubernetes as the orchestration software to deploy, scale, and manage workloads across the compute hardware. It coordinates the compute, networking, data storage, and servers that make up the environment the ASDC platform operates on.

The advantage of a container orchestration system such as Kubernetes is that it is built with scalability in mind. It allows compute nodes to be added quickly, simply, and reliably in response to demand. Furthermore, the Kubernetes environment is agnostic to the platform on which it is deployed, allowing nodes to be added from other compute platforms, such as Amazon Web Services, if demand is high.
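To illustrate what "responsive to demand" can mean in practice, here is a minimal sketch of the kind of sizing rule a scaling policy might apply. It is not the ASDC implementation: the function name, the idea of counting pending processing jobs, and the `jobs_per_node`, `min_nodes`, and `max_nodes` parameters are all assumptions made for illustration.

```python
import math


def desired_node_count(pending_jobs: int, jobs_per_node: int,
                       min_nodes: int = 1, max_nodes: int = 10) -> int:
    """Return how many compute nodes a queue of pending jobs would call for.

    All parameters are hypothetical: pending_jobs is the current queue
    length, jobs_per_node is how many jobs one node handles at a time,
    and min_nodes/max_nodes bound the size of the cluster.
    """
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    # Round up so that a partially filled node is still provisioned.
    needed = math.ceil(max(pending_jobs, 0) / jobs_per_node)
    # Never drop below the baseline or exceed the allowed maximum.
    return max(min_nodes, min(needed, max_nodes))


if __name__ == "__main__":
    for queue in (0, 7, 45):
        nodes = desired_node_count(queue, jobs_per_node=4)
        print(f"{queue} pending jobs -> {nodes} nodes")
```

With these assumed values, an empty queue keeps the single baseline node, a queue of 7 jobs calls for 2 nodes, and a queue of 45 jobs is capped at the 10-node maximum. In a real deployment, a rule like this would typically be delegated to the cluster's own autoscaling tooling rather than hand-written.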

The ASDC storage and compute infrastructure is provided by the Australian Research Data Commons (ARDC), which operates the Nectar Research Cloud along with node partners including Monash and QCIF. 

Based on the requirements for ASDC, the available budget, and procurement discussions with the relevant Nectar Research Cloud nodes, the ARDC funded the following Nectar Research Cloud infrastructure in 2020/21:

  • 2 GPU servers at QCIF, each with 3 NVIDIA A100-40s, two 64-core AMD EPYC 7702 CPUs, 2 TB RAM, and 16 TB NVMe storage.

  • 2 GPU servers at Monash with 4 NVIDIA A40s and 1 GPU server with 4 NVIDIA A100 GPUs.

  • 10 TB of volume storage at Monash and QCIF.

  • 200 virtual CPUs at Monash and 50 virtual CPUs at QCIF.

The GPU servers at QCIF were deployed in October 2021 and the GPU servers at Monash will be deployed next month (March 2022).

These assets have been deployed as the baseline hardware allocated to the ASDC, servicing the current processing demand within the platform. When the platform is formally released, this demand is expected to increase. The scalable design of the platform will allow for quick deployment of new processing resources in response. 

The scalable compute and open-source platform design removes the need for every new researcher interested in using drone data to procure high-performance desktops, only to find them obsolete or insufficient for their processing demand within 12 months. Furthermore, it avoids different departments within the same organisation purchasing redundant processing capacity, and improves ease of access by centralising processing on web-accessible, scalable infrastructure.

There are additional efficiency benefits in performing computation and analysis operations on the same platform where the data is stored, managed, and eventually published.  By providing data pipelines that remove the need for repeated uploads, downloads, imports, and exports from different working environments, the system architecture simplifies and standardises research workflows.

A GPU is a specialised processing unit with enhanced mathematical computation capability, making it well suited to machine learning applications. Supplying researchers with significant GPU hardware within the ASDC enables them to run machine learning pipelines on the same infrastructure.

Our team is hopeful that, by providing this scalable open-source platform, we will increase the accessibility of drone-based data research in Australia and build a community of researchers using drones.

If you would like to get involved and help guide the development of the ASDC platform in a way that will benefit you and your team, please get in contact with us.



