Run Kubernetes

Last updated: June 21, 2026

Kubernetes has two jobs that usually live in two different places: standing up the cluster (infrastructure) and running things on it (workloads). With the American Cloud MCP server, an AI assistant covers both — and in a coding agent like Claude Code, both happen in one conversation.

This recipe shows the full loop: create a cluster from a prompt with the cost shown up front, connect kubectl, deploy an app, triage what's running, and add capacity when traffic grows.

The two layers

It helps to keep these straight, because the assistant works at both and switches between them in a single session:

  • Infrastructure layer — MCP tools. Creating the cluster, picking a node size and region, previewing cost, fetching the kubeconfig, and scaling worker capacity all happen through the American Cloud MCP server. These are American Cloud API calls, not kubectl.
  • Workload layer — your terminal. Once the kubeconfig is in place, the assistant uses kubectl in your shell to apply manifests, watch rollouts, read logs, and debug pods. This is ordinary Kubernetes — nothing American Cloud-specific.

In a coding agent the assistant can author your manifests, create the cluster that runs them, and then apply them — all without you switching windows.

The infrastructure layer needs the MCP server connected. The workload layer needs a coding agent that can run shell commands (like Claude Code) plus kubectl installed locally. A chat-only client can do everything in the infrastructure layer — create, scale, fetch the kubeconfig — but can't run kubectl for you.

Before you start

  • The MCP server connected to your client — see overview.
  • For the create and scale prompts below, a read-write API key and the --allow-writes flag. Read tools — listing clusters, cost estimates, fetching the kubeconfig — work with a read-only key.
  • For the deploy and operate prompts, a coding agent that can run shell commands, and kubectl installed. See Kubernetes getting started for install commands.

Provision a cluster

Lead with the cost. The MCP server has a Kubernetes cost-estimate tool that takes the same inputs as the create tool, so the assistant can price a configuration before it provisions anything.

Create a small Kubernetes cluster on American Cloud — show me the cost estimate first.

What your assistant will do:

  1. Call list_kubernetes_packages for node sizing tiers, list_regions for regions, and list_kubernetes_versions for available Kubernetes versions, then propose a configuration (a small worker package, a sensible region, the latest version).
  2. Call get_cost_estimate_kubernetes with that configuration and show you the monthly cost — before creating anything.
  3. Wait for your go-ahead, then call create_kubernetes_cluster with the name, package, region, version, control-node count, and worker-node count. (Three or more control nodes give you a highly available control plane; one is fine for dev. Pass an SSH key from list_ssh_keys if you want node access.)
  4. Poll get_kubernetes_cluster until status goes from CREATING to RUNNING — provisioning takes a few minutes.
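
The wait in step 4 can be sketched as a simple polling loop. This illustrates the logic only; it is not how the MCP server or the assistant implements it. `fetch_status` is a hypothetical callable standing in for one get_kubernetes_cluster call, the interval and timeout defaults are assumptions, and treating any status other than CREATING or RUNNING as a failure is also an assumption.

```python
import time
from typing import Callable


def wait_until_running(
    fetch_status: Callable[[], str],
    interval_s: float = 15.0,
    timeout_s: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns RUNNING, or give up.

    fetch_status is a hypothetical stand-in for one get_kubernetes_cluster
    call that returns the cluster's status string.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status == "RUNNING":
            return status
        if status != "CREATING":
            # Assumption: any other status means provisioning went wrong.
            raise RuntimeError(f"unexpected cluster status: {status}")
        if waited >= timeout_s:
            raise TimeoutError(f"cluster still {status} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` in as a parameter keeps the loop easy to exercise without real delays.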

If you have an opinion on size, region, or version, say so in the prompt ("3 control nodes, mid-size workers, in us-west"). If you don't, let the assistant list the options and recommend one, then confirm.

Connect and deploy

This is where the coding agent earns its keep: it fetches the kubeconfig through the MCP server and then deploys through your terminal.

Fetch the kubeconfig for that cluster and deploy this app to it.

What your assistant will do:

  1. Call get_kubernetes_cluster_config to retrieve the cluster's kubeconfig (YAML) and write it where kubectl can find it, typically a file that the KUBECONFIG environment variable points to.
  2. Run kubectl get nodes to confirm the control and worker nodes are Ready.
  3. Read your project to understand what it is (a web service, a worker, what port it listens on, whether it has a container image or needs one built).
  4. Write the Kubernetes manifests (a Deployment and a Service, exposed through either an Ingress or a Service of type LoadBalancer), then kubectl apply them.
  5. Watch the rollout with kubectl rollout status and report when the pods are healthy and the app is reachable.
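
As a rough picture of what the manifests in step 4 contain, the sketch below builds a Deployment and a LoadBalancer Service as plain Python dictionaries and prints them as JSON. The app name, image, port, and replica count are hypothetical placeholders, and the manifests your assistant writes for a real project will likely carry more detail (probes, resource limits, environment variables).

```python
import json


def build_manifests(name: str, image: str, port: int, replicas: int = 2) -> dict:
    """Return a Deployment and a LoadBalancer Service wrapped in a v1 List."""
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template's labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            "selector": labels,
            # Expose port 80 and forward it to the container's port.
            "ports": [{"port": 80, "targetPort": port}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    # Hypothetical app name, image, and port; replace with your own.
    manifests = build_manifests("api", "registry.example.com/api:1.0.0", 8080)
    print(json.dumps(manifests, indent=2))
```

kubectl accepts JSON as well as YAML, so output like this can be saved to a file and applied with kubectl apply -f.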

The kubeconfig grants full access to your cluster. The tool that returns it is read-only in the MCP sense — it doesn't change your infrastructure — but the credentials it hands back are sensitive. Treat the file like any other secret: don't commit it, and don't paste it into a shared channel. Ask your assistant to write it to a path outside your repo (or one your .gitignore already covers).
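
One way to follow that advice is to write the file with owner-only permissions from the start. This is a minimal sketch, assuming a POSIX system; the function name is illustrative and says nothing about how the assistant itself writes the file.

```python
import os
from pathlib import Path


def write_kubeconfig(kubeconfig_yaml: str, path: Path) -> Path:
    """Write kubeconfig text to path, readable and writable by the owner only."""
    path = path.expanduser()
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create (or truncate) the file with 0600 permissions from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(kubeconfig_yaml)
    # The mode given to os.open is ignored if the file already existed,
    # so set it explicitly as well.
    os.chmod(path, 0o600)
    return path
```

Setting the KUBECONFIG environment variable to the returned path (for example a file under ~/.kube, outside your repo) lets kubectl pick it up.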

If your app needs a database, run PostgreSQL on its own American Cloud VM (ask the assistant to provision one and point a connection string at it), or deploy it inside the cluster from a manifest with a persistent volume. Either way, the assistant can wire it up in the same session.

Once the ingress controller or load balancer assigns a public address to your exposed service, the assistant reads it from kubectl. See accessing your cluster via public IP for which IP is which.

Operate and triage

Day-two work is mostly kubectl, and this is exactly the kind of repetitive inspection an assistant is good at. No new infrastructure tools are involved here: the assistant is reading cluster state through your terminal.

Are all my deployments healthy?

What your assistant will do:

  1. Run kubectl get deployments -A and kubectl get pods -A to survey every namespace.
  2. Flag anything not at full ready replicas, any pods in CrashLoopBackOff, Pending, or ImagePullBackOff, and any recent restarts.
  3. Summarize the state in plain English and offer to dig into anything that looks off.
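
The check in step 2 amounts to a filter over pod status. The sketch below applies it to the parsed output of kubectl get pods -A -o json, using the standard Pod status fields (phase, containerStatuses, restartCount). The function name and the exact set of flagged reasons are illustrative choices, not the assistant's actual logic.

```python
BAD_WAITING_REASONS = {"CrashLoopBackOff", "ImagePullBackOff", "ErrImagePull"}


def find_unhealthy_pods(pod_list: dict) -> list[str]:
    """Return one line per pod that is Pending, stuck, or has restarted."""
    findings = []
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        status = pod.get("status", {})
        name = f'{meta.get("namespace", "default")}/{meta["name"]}'
        problems = []
        if status.get("phase") == "Pending":
            problems.append("Pending")
        for cs in status.get("containerStatuses", []):
            # A container stuck in a waiting state carries the reason here.
            waiting = cs.get("state", {}).get("waiting", {})
            if waiting.get("reason") in BAD_WAITING_REASONS:
                problems.append(waiting["reason"])
            if cs.get("restartCount", 0) > 0:
                problems.append(f'{cs["restartCount"]} restarts')
        if problems:
            findings.append(f"{name}: {', '.join(problems)}")
    return findings
```

Feed it the result of json.loads on the kubectl output. An empty list means nothing was flagged by these particular checks, which is not the same as a guarantee that every workload is healthy.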

When something is wrong, hand it the symptom:

Why is the api pod crashlooping?

What your assistant will do:

  1. Run kubectl describe pod on the failing pod and read the events and container state (failed image pull, OOMKilled, failing liveness or readiness probe, unschedulable).
  2. Pull the logs with kubectl logs --previous to see why the last container exited.
  3. Explain the root cause and propose a fix (a corrected environment variable, a higher memory limit, a fixed probe path, a missing secret) and, with your okay, edit the manifest and re-apply it.
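
Much of the evidence for steps 1 and 2 sits in each container's last termination record. As a sketch, the function below reads it from the parsed output of kubectl get pod <name> -o json, using the standard lastState.terminated fields. The function name and the wording of the hint are illustrative.

```python
def summarize_last_exit(pod: dict) -> list[str]:
    """Summarize why each container in a pod last terminated, if it did."""
    lines = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        terminated = cs.get("lastState", {}).get("terminated")
        if not terminated:
            # This container has no recorded previous termination.
            continue
        reason = terminated.get("reason", "Unknown")
        code = terminated.get("exitCode")
        line = f'{cs["name"]}: last exit {reason} (code {code})'
        if reason == "OOMKilled":
            line += "; the container hit its memory limit"
        lines.append(line)
    return lines
```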

This loop — observe, hypothesize, fix, re-apply — is the same one you'd run by hand, just narrated and faster.

Scale when traffic grows

Adding capacity is back in the infrastructure layer: the MCP server changes the worker count on the managed cluster directly.

Traffic is growing — add worker capacity to the cluster.

What your assistant will do:

  1. Call get_kubernetes_cluster (with details=true) to read the current worker count and node utilization.
  2. Propose a new total worker count, or suggest turning on autoscaling so the cluster adds and removes workers with load.
  3. Call scale_kubernetes_cluster — either with a fixed workerNodes total, or with autoscaling enabled and a minWorkers/maxWorkers range.
  4. Confirm the new nodes register by running kubectl get nodes once they come up.

The scale tool sets the total worker count, not the number to add. If you have four workers and want one more, the target is five. The assistant reads the current count first so it gets this right — but it's worth knowing if you check the result yourself.
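
The arithmetic is small but easy to get backwards, so here it is as a sketch. The workerNodes key comes from the scale step above; the function name and the validation are illustrative assumptions.

```python
def scale_arguments(current_workers: int, workers_to_add: int) -> dict:
    """Turn "add N workers" into the total that the scale call expects."""
    total = current_workers + workers_to_add
    if current_workers < 0 or total < 0:
        raise ValueError("worker counts cannot be negative")
    # The scale tool takes the new total, not the number to add.
    return {"workerNodes": total}


# Four workers today, one more wanted: the target is five.
print(scale_arguments(4, 1))
```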

The same cluster can also be paused, upgraded to a newer Kubernetes version, or removed entirely through the MCP server when you ask. Upgrades and deletes are flagged so your client can prompt before they run.

When to choose Kubernetes vs a single VM

Kubernetes is the right tool when you have several services that scale independently, want rolling deploys and self-healing pods, or already think in containers and manifests. The assistant makes the cluster quick to stand up and easy to operate, but the cluster itself still has more moving parts than some workloads need.

If you're shipping a single app — a Next.js site, an API, a small service — a single VM is often the better fit: less to run, less to reason about, lower cost. The assistant can provision and deploy to one just as fluently. See Deploy a Next.js app with your AI assistant for that path. Pick the one that matches the shape of your workload, not the one that sounds more impressive.

Next steps