2 Powerful AI and Database Operators to Extend your K8s Cluster

Ian Kiprotich
11 min read · Sep 26, 2024

Managing complex applications in Kubernetes can be challenging. While Kubernetes is the sturdy vessel holding your applications afloat, it still requires skillful handling to ensure smooth sailing. This is where Kubernetes Operators come in — the crew members who automate critical tasks, allowing the ship to run efficiently without constant manual intervention.

But what are operators, exactly? Think of them as automated captains — they understand the workings of specific applications and ensure your Kubernetes cluster stays on course, automating everything from scaling to backups, much like an experienced navigator ensuring a ship avoids dangerous waters.

“The operator pattern captures how you can write code to automate a task beyond what Kubernetes itself provides. Operators follow Kubernetes principles, notably the control loop.” — Kubernetes Documentation

By using Operators, you can automate routine tasks, such as backups, scaling, and updates, which would otherwise require manual intervention. This simplifies the management of complex applications and ensures consistency and reliability in their operation.

In this blog, we will learn about the operator pattern, explore the underlying Kubernetes primitives, and look closely at two powerful operators that can significantly enhance your Kubernetes cluster. Whether you are a DevOps or Platform Engineer, a Kubernetes administrator, or simply someone eager to learn more about Kubernetes, this content is tailored for you.

We’ll cover:

  1. Understanding the Operator Pattern: We will look at what Kubernetes Operators are and how they simplify the management of complex applications.
  2. Kubernetes Primitives: Explore the core Kubernetes components that Operators leverage to automate application lifecycle management.
  3. 2 Must-Know Kubernetes Operators: Discover two essential Operators that can extend your Kubernetes cluster’s capabilities in AI-assisted troubleshooting and database management.

Join us as we uncover how these Operators can streamline your operations and bring advanced functionalities to your Kubernetes environment.

What Are Kubernetes Operators?

Kubernetes Operators extend Kubernetes’ capabilities by acting as application-specific controllers. They automate the management of complex applications, ensuring the cluster’s behavior matches the desired state as closely as a well-steered ship sticks to its course. Operators continuously reconcile the desired state (what you want) with the actual state (what is happening) and take action when these states drift apart — much like how a ship’s autopilot adjusts its direction to stay on course.

By embedding operational knowledge into code, Operators handle tasks beyond the native automation Kubernetes offers, making it easier to manage sophisticated systems without constant human intervention.

Understanding the Operator Pattern

At the heart of the operator pattern is the concept of declarative state reconciliation, where a controller continuously monitors and adjusts a system’s actual state to match the desired one. This mirrors how a skilled operator navigates a ship: observing changes and making real-time adjustments to ensure smooth sailing.

Key components of Kubernetes Operators include:

  • Custom Resource Definitions (CRDs): Define the schema and behavior of new custom resources the operator will manage.
  • Controller: Watches for changes in the custom resources and adjusts the system accordingly to maintain the desired state.
  • Custom Resources (CRs): Represent the desired state of a resource, such as an AI model deployment or a database setup.
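
To make the reconciliation idea concrete, here is a minimal sketch of a control loop in plain Python. It does not talk to a real cluster: DesiredState stands in for a custom resource, ActualState for what is running, and the replicas field and action names are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class DesiredState:
    """What the custom resource asks for (hypothetical single field)."""
    replicas: int


@dataclass
class ActualState:
    """What is currently running in the simulated cluster."""
    replicas: int


def reconcile(desired: DesiredState, actual: ActualState) -> list[str]:
    """Run one pass of the control loop and return the actions taken.

    Each pass moves the actual state a single step toward the desired state.
    """
    actions = []
    if actual.replicas < desired.replicas:
        actual.replicas += 1
        actions.append("create replica")
    elif actual.replicas > desired.replicas:
        actual.replicas -= 1
        actions.append("delete replica")
    return actions


def run_until_converged(desired, actual, max_passes=10):
    """Repeat reconcile passes until a pass has nothing left to do."""
    log = []
    for _ in range(max_passes):
        actions = reconcile(desired, actual)
        if not actions:
            break
        log.extend(actions)
    return log


if __name__ == "__main__":
    desired = DesiredState(replicas=3)
    actual = ActualState(replicas=1)
    print(run_until_converged(desired, actual))
    print(actual.replicas)
```

A real controller is triggered by watch events from the Kubernetes API rather than a fixed loop, but the shape is the same: compare, act, repeat until nothing is left to do.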

Operator Capability Levels

Not all operators are created equal — some are more advanced than others. OperatorHub.io, a central hub for discovering operators, categorizes them into five capability levels, from basic installs to full lifecycle management and auto-piloting. The more advanced an operator, the more tasks it automates and optimizes.

Here are the five capability levels:

  1. Basic Install: Installs the application.
  2. Seamless Upgrades: Manages application upgrades.
  3. Full Lifecycle Management: Handles installation, scaling, upgrades, and backups.
  4. Deep Insights: Provides monitoring and detailed insights into system health.
  5. Auto Pilot: Automatically tunes and optimizes application performance with minimal intervention.

At the highest levels, operators function like seasoned captains, not only maintaining the course but also optimizing it for efficiency, performance, and safety.
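
Because the levels are ordered, they are easy to use as a filter when shortlisting operators. The sketch below models the five levels as an ordered enum; the operator names and the levels assigned to them are hypothetical and do not describe any real operator.

```python
from enum import IntEnum


class CapabilityLevel(IntEnum):
    """The five capability levels, ordered from least to most capable."""
    BASIC_INSTALL = 1
    SEAMLESS_UPGRADES = 2
    FULL_LIFECYCLE = 3
    DEEP_INSIGHTS = 4
    AUTO_PILOT = 5


def meets_requirement(operator_level, required_level):
    """Return True when an operator is at or above the required level."""
    return operator_level >= required_level


def shortlist(candidates, required_level):
    """Return the sorted names of candidates that meet the required level."""
    return sorted(
        name for name, level in candidates.items()
        if meets_requirement(level, required_level)
    )


# Hypothetical operators and levels, for illustration only.
candidates = {
    "example-db-operator": CapabilityLevel.AUTO_PILOT,
    "example-cache-operator": CapabilityLevel.SEAMLESS_UPGRADES,
    "example-queue-operator": CapabilityLevel.FULL_LIFECYCLE,
}

if __name__ == "__main__":
    print(shortlist(candidates, CapabilityLevel.FULL_LIFECYCLE))
```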

Why Advanced Operators Matter

Operators that handle Full Lifecycle Management and beyond can be game changers. They are like tireless crew members who ensure everything runs smoothly, even under complex conditions, automating tasks like:

  • Sophisticated, application-specific workflows: Automate complex processes, such as machine learning pipelines or database recovery.
  • High availability (HA): Ensure applications stay online by automating failover and recovery.
  • Scaling and performance tuning: Adjust resources dynamically as workloads change.
  • Reduced operational overhead: Minimize human intervention, reducing the risk of error and freeing engineers to focus on innovation.

Running infrastructure in high availability mode can be labor-intensive and error-prone, but advanced operators greatly reduce this complexity. It’s like having an experienced crew at the helm, one that never tires, ensuring your Kubernetes ship stays on course.

Finding and Deploying Operators

To discover operators suited to your needs, OperatorHub.io is the go-to resource. It’s like the dockyard where you can find capable crew members ready to help you navigate Kubernetes. With operators across all capability levels, you can pick the right tools to handle your unique workloads.

How to install Operators

To install Kubernetes operators into your cluster, you have several options depending on the operator and your environment. Below are two popular methods for getting started with operator installation:

1. Using Helm Charts

Helm charts are a powerful way to deploy and manage applications in Kubernetes, and many operators provide Helm charts for installation. Helm simplifies the installation process by packaging the required Kubernetes resources together, so you don’t have to manually configure each one.

To install an operator using Helm:

  • First, make sure Helm is installed on your machine.
  • Add the operator’s Helm chart repository, if required.
  • Use the following command to install the operator:
helm install <operator-name> <chart-repo>/<chart-name>

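For example, to install the Prometheus Operator using Helm (via the community kube-prometheus-stack chart, which bundles the operator):

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack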

This method provides a quick and easy way to get operators running in your cluster, especially if you are familiar with Helm.

2. Using Operator Lifecycle Manager (OLM)

The Operator Lifecycle Manager (OLM) is a Kubernetes tool designed to help manage the lifecycle of operators in your cluster, including their installation, updates, and removal. OLM is particularly useful for operators listed on OperatorHub, a centralized repository of community and commercial Kubernetes operators.

To install an operator using OLM:

  1. Install OLM:
    First, install OLM in your cluster. Each OLM release ships an install script that applies the required manifests. For example, for v0.28.0:
curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.28.0/install.sh | bash -s v0.28.0
    Replace v0.28.0 (in both places) with the latest release of OLM.
  2. Deploy Operators via OLM:
    Once OLM is installed, you can deploy operators directly from OperatorHub. You can browse OperatorHub from your cluster dashboard (if supported) or apply the install manifest that OperatorHub.io publishes for each operator.

For example, to install the CloudNativePG operator:

kubectl create -f https://operatorhub.io/install/cloudnative-pg.yaml

OLM simplifies managing multiple operators by handling their versioning, dependencies, and updates, making it ideal for production environments where multiple operators are in use.

1. K8sGPT Operator

K8sGPT is a tool that gives your Kubernetes clusters SRE superpowers. Designed to provide deep, automated insights into your cluster’s health and operations, it lets you monitor, analyze, and troubleshoot your Kubernetes environment with precision.

In the ever-evolving Kubernetes landscape, keeping your clusters running reliably is critical. Manual cluster management is not only time-consuming but also leaves room for human error. K8sGPT changes the game by offering a hands-off approach that constantly scans your cluster, flagging potential issues before they escalate into outages. Think of it as an always-on SRE assistant, one that never sleeps and works tirelessly to keep your environment optimized and problem-free.

Unmatched Visibility and In-Depth Monitoring

At the core of K8sGPT is a powerful monitoring engine that continuously watches over your cluster. By integrating directly with your Kubernetes environment, the operator collects real-time data on workloads, resource usage, and system performance, providing detailed reports on the state of your infrastructure.

With K8sGPT, you gain:

  • Proactive Error Detection: The operator continuously scans for misconfigurations, performance bottlenecks, and other potential issues.
  • Comprehensive Insights: By analyzing cluster health and behavior, K8sGPT offers in-depth reports, helping you understand not only what is wrong, but also why it happened and how to fix it.
  • Automated Responses: Issues detected by the operator trigger automated remediation processes, ensuring minimal downtime and fast recovery without manual intervention.

Continuous SRE-Grade Analysis and Alerts

K8sGPT acts like an always-on, virtual SRE, ensuring your cluster operates at peak efficiency. It integrates seamlessly with tools like Prometheus for real-time metrics collection, and Grafana for visualizations and dashboards, and it can even send alerts directly to your communication platforms like Slack.

With built-in AI-driven intelligence, K8sGPT doesn’t just monitor — it provides actionable insights. From identifying performance degradation to predicting potential failures, it gives you everything you need to keep your systems running smoothly. This level of detail allows you to dive deep into cluster operations and understand every moving part, eliminating the guesswork.

Key Features:

  • AI-Powered Monitoring: Using models like GPT-4, the operator doesn’t just find problems; it analyzes them, offering context and suggestions for resolution.
  • Advanced Error Reporting: Get comprehensive YAML-based reports detailing every cluster issue, helping you address root causes faster.
  • Scalable Monitoring: Whether you’re managing a small cluster or a vast infrastructure, K8sGPT scales with you, providing consistent insights across all workloads.
  • Seamless Integration: K8sGPT works with Prometheus for metrics and integrates with AWS S3 for report storage, providing flexibility in how you monitor and store data.

Installing K8sGPT in the cluster

First, let's add the K8sGPT Helm repository and refresh the local chart index.

helm repo add k8sgpt https://charts.k8sgpt.ai/
helm repo update

Next, let's install the operator.

helm install release k8sgpt/k8sgpt-operator -n k8sgpt-operator-system --create-namespace

We can then check what is deployed in the k8sgpt-operator-system namespace, for example with kubectl get pods -n k8sgpt-operator-system.

Create a Secret for the OpenAI API key. Either export OPENAI_TOKEN in your shell first or replace $OPENAI_TOKEN with your actual OpenAI API key:

kubectl create secret generic k8sgpt-sample-secret --from-literal=openai-api-key=$OPENAI_TOKEN -n k8sgpt-operator-system

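Once installed, configuring the operator is simple. You create a custom resource that defines how the operator should function. Here’s an example configuration, saved as k8sgpt-config.yaml. The secret name and key must match the Secret created in the previous step:

apiVersion: core.k8sgpt.ai/v1alpha1
kind: K8sGPT
metadata:
  name: k8sgpt
  namespace: k8sgpt-operator-system
spec:
  ai:
    enabled: true
    model: gpt-3.5-turbo
    backend: openai
    secret:
      # Must match the Secret created with kubectl create secret
      name: k8sgpt-sample-secret
      key: openai-api-key
  noCache: false
  # Version of the k8sgpt image the operator deploys
  version: v0.3.41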

Apply the configuration:

kubectl apply -f k8sgpt-config.yaml
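
An easy mistake in these steps is to create the Secret under one name and reference a different one in the custom resource. As a rough sketch, the Python below builds the same K8sGPT resource as a dictionary and checks the Secret reference before anything is applied. The helper names build_k8sgpt_resource and secret_reference_matches are my own, not part of K8sGPT; the field values come from the example in this section.

```python
import json


def build_k8sgpt_resource(secret_name, secret_key="openai-api-key",
                          model="gpt-3.5-turbo", version="v0.3.41"):
    """Return the K8sGPT custom resource from this section as a dict."""
    return {
        "apiVersion": "core.k8sgpt.ai/v1alpha1",
        "kind": "K8sGPT",
        "metadata": {"name": "k8sgpt", "namespace": "k8sgpt-operator-system"},
        "spec": {
            "ai": {
                "enabled": True,
                "model": model,
                "backend": "openai",
                "secret": {"name": secret_name, "key": secret_key},
            },
            "noCache": False,
            "version": version,
        },
    }


def secret_reference_matches(resource, created_secret_name, created_key):
    """Check that the resource points at the Secret and key that were created."""
    ref = resource["spec"]["ai"]["secret"]
    return ref["name"] == created_secret_name and ref["key"] == created_key


if __name__ == "__main__":
    resource = build_k8sgpt_resource("k8sgpt-sample-secret")
    # Name and key used in the kubectl create secret command above.
    assert secret_reference_matches(resource, "k8sgpt-sample-secret", "openai-api-key")
    # kubectl apply -f accepts JSON as well as YAML.
    print(json.dumps(resource, indent=2))
```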

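Check the status of the K8sGPT deployment and ensure it’s running correctly, for example with kubectl get pods -n k8sgpt-operator-system.

Analyze your cluster to check that the model is working well. The command below uses the k8sgpt CLI, which must be installed and configured with its own backend credentials, separately from the operator:

k8sgpt analyze --explain

The operator itself stores its findings as Result custom resources, which you can list with kubectl get results -n k8sgpt-operator-system.

You can also switch to a different AI provider by changing the backend and model fields in the custom resource.

We have now deployed the K8sGPT operator to the cluster, authenticated it with the AI backend, and analyzed the cluster.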

In conclusion, K8sGPT is a powerful tool that simplifies troubleshooting in Kubernetes environments by leveraging AI to identify and resolve issues quickly. Its ability to analyze logs, diagnose problems, and provide actionable insights helps teams maintain smoother operations with less manual effort. Whether you’re a seasoned Kubernetes admin or just getting started, K8sGPT reduces the complexity of managing clusters, enhances your system’s reliability, and accelerates problem-solving. Integrating K8sGPT into your workflow saves valuable time, minimizes downtime, and ensures your Kubernetes infrastructure runs efficiently.

2. CloudNativePG

Before the introduction of CloudNativePG, deploying PostgreSQL databases in Kubernetes was complex and labor-intensive. Managing databases as stateful applications required manual setup for high availability, backups, replication, and security. Ensuring data persistence in Kubernetes, with its ephemeral nature, was particularly challenging, and disaster recovery was often time-consuming.

CloudNativePG simplifies database management in Kubernetes by automating the entire lifecycle of PostgreSQL clusters. It handles deployment, failover, and backups, ensuring high availability using a primary/standby architecture with native streaming replication.

Key benefits include:

  • Automated Management: Seamlessly automates tasks like failover, backups, and recovery.
  • High Availability: Ensures minimal downtime with automated failover and replication.
  • Security: Supports encrypted TLS connections and custom certificates for enhanced security.
  • Disaster Recovery: Offers continuous backup and Point-In-Time Recovery (PITR).
  • Monitoring: Exposes Prometheus-compatible metrics for easy monitoring and writes JSON logs for integration with log management tools.
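
To give a feel for what automated failover involves, here is a deliberately simplified model in Python. It is not CloudNativePG's actual algorithm: the Instance class, the replayed_lsn field, and the rule of promoting the healthy standby that has replayed the most WAL are assumptions chosen to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    role: str          # "primary" or "standby"
    healthy: bool
    replayed_lsn: int  # how much of the WAL this instance has applied


def choose_new_primary(instances):
    """Pick the healthy standby that has replayed the most WAL, or None."""
    standbys = [i for i in instances if i.role == "standby" and i.healthy]
    if not standbys:
        return None
    return max(standbys, key=lambda i: i.replayed_lsn)


def fail_over(instances):
    """Promote the best standby if the primary is unhealthy.

    Returns the name of the instance that is primary afterwards, or
    None when no promotion is possible.
    """
    primary = next((i for i in instances if i.role == "primary"), None)
    if primary is not None and primary.healthy:
        return primary.name  # healthy primary, nothing to do
    candidate = choose_new_primary(instances)
    if candidate is None:
        return None
    if primary is not None:
        primary.role = "standby"  # demoted; would rejoin as a replica once repaired
    candidate.role = "primary"
    return candidate.name


if __name__ == "__main__":
    cluster = [
        Instance("pg-1", "primary", healthy=False, replayed_lsn=100),
        Instance("pg-2", "standby", healthy=True, replayed_lsn=90),
        Instance("pg-3", "standby", healthy=True, replayed_lsn=100),
    ]
    print(fail_over(cluster))
```

The operator's job is to run this kind of decision continuously and safely, so that nobody has to be paged to promote a replica by hand.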

Installing CloudNativePG

The operator can be installed like any other resource in Kubernetes, through a YAML manifest applied via kubectl.

You can install version 1.24.0 of the operator manifest as follows:

kubectl apply --server-side -f \
https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/main/releases/cnpg-1.24.0.yaml

You can verify that the operator is running with:

kubectl get deployment -n cnpg-system cnpg-controller-manager

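Create the database cluster by saving the following manifest to a file (for example, cluster-example.yaml) and applying it with kubectl apply -f:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  # One primary plus two standby replicas
  instances: 3

  storage:
    size: 1Gi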
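
If you create several clusters, generating the manifest from a few parameters can be less error-prone than copying YAML around. The sketch below builds the same Cluster resource as a dictionary. Both helper functions are hypothetical, and the pod naming (cluster name plus a serial number) is an assumption that holds for a freshly created cluster.

```python
import json


def build_cluster_resource(name="cluster-example", instances=3, storage_size="1Gi"):
    """Return a CloudNativePG Cluster resource matching the manifest above."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return {
        "apiVersion": "postgresql.cnpg.io/v1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            "instances": instances,
            "storage": {"size": storage_size},
        },
    }


def expected_pod_names(resource):
    """Pod names to expect right after creation (assumed: <name>-1 .. <name>-N)."""
    name = resource["metadata"]["name"]
    count = resource["spec"]["instances"]
    return [f"{name}-{serial}" for serial in range(1, count + 1)]


if __name__ == "__main__":
    cluster = build_cluster_resource()
    # kubectl apply -f accepts JSON as well as YAML.
    print(json.dumps(cluster, indent=2))
    print(expected_pod_names(cluster))
```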

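Check that the database cluster has been deployed, for example with kubectl get clusters.postgresql.cnpg.io.

We can also check the running pods, for example with kubectl get pods -l cnpg.io/cluster=cluster-example.

Let’s exec into a pod and add some data to our database. With the manifest above, the first instance is named cluster-example-1; add -n <namespace> if you created the cluster outside the default namespace:

kubectl exec -it cluster-example-1 -- /bin/bash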

CloudNativePG turns what used to be a complex and manual process into an efficient, reliable, and secure way to manage PostgreSQL databases in Kubernetes environments.

CloudNativePG is a game-changer for managing PostgreSQL databases in Kubernetes environments. By automating the deployment, scaling, and maintenance of PostgreSQL clusters, it simplifies database operations while ensuring high availability, security, and disaster recovery. With features like declarative configuration, built-in monitoring, and seamless integration with Kubernetes, CloudNativePG empowers teams to focus on innovation rather than the complexity of database management. If you’re looking for a robust, scalable, and secure way to run PostgreSQL in Kubernetes, CloudNativePG is an excellent solution.

Conclusion

Kubernetes operators like CloudNativePG and K8sGPT are essential tools for simplifying complex operations in Kubernetes environments. CloudNativePG streamlines the management of PostgreSQL databases, automating tasks like deployment, scaling, and disaster recovery, ensuring high availability and security for critical data. K8sGPT, on the other hand, enhances troubleshooting by using AI to diagnose and resolve cluster issues quickly, reducing downtime and manual effort.

Both operators bring automation, efficiency, and reliability to Kubernetes, allowing teams to focus on innovation rather than the intricacies of infrastructure management. By leveraging these operators, organizations can ensure smoother, more resilient Kubernetes operations while reducing operational overhead.

If this blog has helped you, you can donate via PayPal. Thank you, and let's learn and grow together.

If you have any inquiries or would like to connect, feel free to reach out via email at onai.rotich@gmail.com or through Twitter or LinkedIn. Together, let’s simplify Kubernetes management and make it more accessible.
