Adarga Open Sources its Kubernetes Operator for Flyte Workflow Registration

17 Jan 2025

Adarga deploys gold-standard MLOps processes, enabling our customers to leverage cutting-edge machine learning solutions at speed and with confidence.

We use Flyte to run our Machine Learning (ML) pipelines and Kubernetes to host all our services. One of the challenges we faced when adopting Flyte was managing workflows across our different Kubernetes environments. Each workflow can exist in several versions, so we had to devise a process for promoting workflows through the environments.

To streamline the process of promoting these workflows across environments, we built a custom Kubernetes Operator for Flyte workflow registration. Today, we’re excited to announce that we are open-sourcing this operator to benefit the broader Kubernetes and Flyte community. In this blog post, we’ll walk you through the key concepts, implementation details, and the journey of building the operator.

Background: Why a Kubernetes Operator?

Flyte is an open-source platform designed for orchestrating machine learning and data workflows at scale, leveraging Kubernetes for deployment. However, the conventional approach to registering workflows in Flyte, which involves running flytectl or pyflyte in CI pipelines, has a significant drawback: it requires external access to the Flyte cluster, which poses a security risk.

In our case, we wanted to:

Keep workflows registered securely and entirely within the Kubernetes environment.
Reduce the overhead of manual registration.
Improve scalability and manage the growing number of Flyte workflows efficiently.

The solution? A custom Kubernetes operator to automate Flyte workflow registration within the cluster.

What is a Kubernetes Operator?

Before diving into the implementation, let’s quickly review what a Kubernetes Operator is.

A Kubernetes Operator is a method for managing complex Kubernetes applications by extending the Kubernetes API. Operators automate application lifecycle management tasks—such as deployment, scaling, and failure recovery—by leveraging Kubernetes’ native tooling (kubectl) and APIs. Operators typically consist of two components:

Custom Resource Definitions (CRDs): These are used to define new types of resources within Kubernetes (e.g. our custom FlyteRegistration CRD).
Controller: The controller watches these resources and reconciles the system's actual state with the desired state. In our case, it registers Flyte workflows when the associated custom resources are created or modified.
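To make the reconcile idea concrete, here is a minimal sketch in Python. It is illustrative only and is not the operator's code (the operator itself is scaffolded with Kubebuilder). The names DesiredWorkflow, reconcile and control_loop_step are hypothetical, and the actual state is modelled as a plain set of (name, version) pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesiredWorkflow:
    """Desired state, as it might be declared in a custom resource (hypothetical type)."""
    name: str
    version: str


def reconcile(desired, registered):
    """Return the desired workflows that are missing from the actual state.

    `registered` stands in for the actual state: a set of (name, version)
    pairs that have already been registered.
    """
    return [w for w in desired if (w.name, w.version) not in registered]


def control_loop_step(desired, registered, register):
    """One pass of the loop: act on each difference, then record the new state."""
    for workflow in reconcile(desired, registered):
        register(workflow)  # side effect supplied by the caller
        registered.add((workflow.name, workflow.version))
    return registered


if __name__ == "__main__":
    desired = [DesiredWorkflow("train", "v1"), DesiredWorkflow("train", "v2")]
    actual = {("train", "v1")}
    control_loop_step(desired, actual, register=lambda w: print("registering", w))
    print(sorted(actual))
```

Running the step a second time with the same inputs does nothing, which is the property a controller relies on: reconciliation is driven by the difference between desired and actual state, not by the event that triggered it.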

The Problem: Traditional Flyte Workflow Registration

In Flyte, workflows need to be registered before they can be executed. Traditionally, this is done through a CI pipeline using the flytectl command-line tool. However, this method requires external access to the Flyte instance, making it susceptible to security risks. Additionally, this process is not ideal for managing the growing number of workflows we need to promote across different environments.

Our goal was to create a Kubernetes-native solution that automates this process while maintaining security.

Building the Flyte Workflow Registration Kubernetes Operator

Step 1: Using Kubebuilder to Scaffold the Operator

Building Kubernetes operators from scratch can be quite complex, so we turned to Kubebuilder, a framework that streamlines the process of building operators. Kubebuilder provides a solid foundation by automatically generating the necessary boilerplate code to interact with Kubernetes APIs. This allowed us to focus on the specific business logic around Flyte workflow registration.

Step 2: Defining the Custom Resource Definition (CRD)

Our custom resource is the FlyteRegistration CRD, which defines the desired state for a Flyte workflow that needs to be registered. It includes the following fields:

workflowName: The name of the Flyte workflow.
workflowVersion: The version of the workflow.
workflowProject: The Flyte project the workflow belongs to.
workflowDomain: The Flyte domain under which the workflow will be registered.
workflowPackageUri: The URL of the workflow package, stored in JFrog Artifactory or any repository compatible with OCI artifacts.
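As a rough illustration of how these fields fit together, the sketch below models the spec as a Python dataclass and wraps it in a Kubernetes-style manifest. Only the five field names and the kind FlyteRegistration come from this post. The apiVersion, the resource name and every value are placeholders we made up for the example, so check the repository for the real schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class FlyteRegistrationSpec:
    # Field names follow the list above; the types are an assumption.
    workflowName: str
    workflowVersion: str
    workflowProject: str
    workflowDomain: str
    workflowPackageUri: str


def to_manifest(resource_name, spec):
    """Wrap a spec in the usual Kubernetes object envelope."""
    return {
        "apiVersion": "example.com/v1alpha1",  # hypothetical group/version
        "kind": "FlyteRegistration",
        "metadata": {"name": resource_name},
        "spec": asdict(spec),
    }


if __name__ == "__main__":
    spec = FlyteRegistrationSpec(
        workflowName="train-model",            # illustrative values only
        workflowVersion="v1",
        workflowProject="demo",
        workflowDomain="development",
        workflowPackageUri="https://example.invalid/packages/train-model-v1.tgz",
    )
    print(json.dumps(to_manifest("train-model-v1", spec), indent=2))
```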

Step 3: Automating the Workflow Registration

The operator’s primary function is to watch for changes to FlyteRegistration resources. When a new or updated resource is detected, the operator performs the following steps:

Downloads the Workflow Package: The operator fetches the corresponding Flyte workflow package from the repository where our workflows are stored.
Registers the Workflow: Using flytectl, the operator registers the workflow with the Flyte instance running in the Kubernetes cluster.
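The two steps above can be sketched as follows. This is a simplified Python illustration, not the operator's implementation. It assumes the package is reachable over plain HTTP(S) and is a tarball, and it leaves out how flytectl is pointed at the in-cluster Flyte admin endpoint. The function names and the local file name are hypothetical. The flytectl register files subcommand and the --project, --domain, --version and --archive flags are real flytectl options.

```python
import subprocess
import urllib.request
from pathlib import Path


def build_register_command(package_path, project, domain, version):
    """Build the flytectl invocation for an archived workflow package."""
    return [
        "flytectl", "register", "files", str(package_path),
        "--project", project,
        "--domain", domain,
        "--version", version,
        "--archive",  # the package is a tarball rather than loose files
    ]


def download_and_register(package_uri, project, domain, version, workdir,
                          fetch=urllib.request.urlretrieve, run=subprocess.run):
    """Fetch the package, then hand it to flytectl.

    `fetch` and `run` are injectable so the logic can be exercised
    without network access or a flytectl binary.
    """
    target = Path(workdir) / "workflow-package.tgz"  # hypothetical file name
    fetch(package_uri, str(target))
    command = build_register_command(target, project, domain, version)
    run(command, check=True)  # raises CalledProcessError if flytectl fails
    return command
```

Passing fetch and run as parameters is a design choice for the sketch: it keeps the download and the CLI call replaceable with fakes in tests.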

Step 4: Deployment Using Helm Chart

Kubernetes operators can be complex to deploy due to the various configurations involved. To simplify this, we used Helm to package the operator and its associated Kubernetes resources. Since Kubebuilder generates a dynamic configuration, we used helmify to automatically generate the Helm chart. The Helm chart is then stored in our public Docker Hub repository, from which it can be retrieved.

This deployment workflow ensures that the operator is deployed consistently across environments.

Step 5: Integrating the Operator with Flyte Workflow

Once deployed, the operator continuously monitors FlyteRegistration resources and ensures that workflows are correctly registered with the Flyte instance. This integration lets us promote Flyte workflows across environments without manual intervention.

Challenges Faced

While building this operator, we encountered several challenges:

1. Integration with Flyte Golang SDK

Our initial attempt was to call the Flyte API directly from Go, but that was cumbersome. The approach required unpacking the entire workflow into raw protobuf files, which made the process more complex than necessary. Flyte does not yet provide a Go SDK; during one of their community syncs, the maintainers mentioned that they would love to have one but need help.

We therefore opted to use flytectl in the operator to streamline the workflow registration process.

2. Testing the Operator

Testing the operator against a live Flyte instance proved challenging due to the complexities of setting up a fully functional Flyte environment. Flyte's CLI tool (flytectl) builds a mini Kubernetes cluster dynamically, which interfered with our ability to run integration tests. As a result, we used mocks and tested the operator’s reconcile logic directly.
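The general pattern of testing reconcile logic against a mock, rather than a live Flyte instance, looks roughly like this. The sketch uses Python's unittest.mock purely for illustration. The reconcile function, the registrar object and the status value are hypothetical stand-ins and do not mirror the operator's actual code or test suite.

```python
from unittest import mock


def reconcile(resource, registrar):
    """Hypothetical reconcile step: register once, then do nothing.

    Returns True when a registration was performed.
    """
    if resource.get("status") == "Registered":
        return False
    registrar.register(
        name=resource["workflowName"], version=resource["workflowVersion"]
    )
    resource["status"] = "Registered"
    return True


def test_reconcile_registers_once():
    registrar = mock.Mock()  # stands in for the Flyte-facing client
    resource = {"workflowName": "train", "workflowVersion": "v1"}

    assert reconcile(resource, registrar) is True
    assert reconcile(resource, registrar) is False  # second pass is a no-op

    registrar.register.assert_called_once_with(name="train", version="v1")


if __name__ == "__main__":
    test_reconcile_registers_once()
    print("ok")
```

Because the Flyte-facing dependency is injected, the test checks only the decision logic: what gets registered, with which arguments, and how many times.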

Conclusion

The development of a custom Kubernetes operator for Flyte workflow registration has been an exciting and rewarding project. By automating the process of registering workflows within a Kubernetes cluster, we’ve significantly reduced the complexity of managing Flyte workflows across different environments.

Our operator has reduced our Machine Learning model workflow deployment times, increased model trustworthiness, and enabled us to quickly adapt to changing customer needs. This is vital, especially as our work in defence and national security requires us to deploy to a range of different environments.

By open-sourcing this operator, we hope to contribute to the Flyte and Kubernetes ecosystems, enabling other teams to automate their workflow registration process while maintaining security and efficiency.

You can access the code for the Flyte Workflow Registration Kubernetes Operator on GitHub.

We encourage you to contribute, raise issues, and help us improve it for the community.

Thank you to our engineering, MLOps, and platform teams for making this available. To find out more about our AI and Machine Learning services, get in touch here.
