Claude Code for Infrastructure

157 points | 132 comments | 8 hours ago
maxdo

Profiles and hooks + skills for CC will solve these concerns. CI/CD with manual approval + CC will work even better. Infra is code, same as anything else.

falloutx

All these tools to build something, but nothing to build. I feel like I am part of a Pyramid Scheme where every product is about building something else, but nothing reaches the end user.

Note: nothing against fluid.sh, I am struggling to figure out something to build.

aspectrr

Hey HN, My name is Collin and I'm working on fluid.sh (https://fluid.sh) the Claude Code for Infrastructure.

What does that mean?

Fluid is a terminal agent that does work on production infrastructure (VMs, K8s clusters, etc.) by making sandbox clones of the infrastructure for AI agents to work on. The agents can run commands, test connections, and edit files, and then generate infrastructure-as-code, like an Ansible playbook, to be applied to production.

Why not just use an LLM to generate IaC?

LLMs are great at generating Terraform, OpenTofu, Ansible, etc. but bad at guessing how production systems work. By giving access to a clone of the infrastructure, agents can explore, run commands, test things before writing the IaC, giving them better context and a place to test ideas and changes before deploying.

I got the idea after seeing how much Claude Code has helped me work on code, I thought "I wish there was something like that for infrastructure", and here we are.

Why not just provide tools, skills, MCP server to Claude Code?

Mainly safety. I didn't want CC to SSH into a prod machine from where it is running locally (real problem!). I wanted to lock down the tools it can run to be only on sandboxes while also giving it autonomy to create sandboxes and not have access to anything else.

Fluid shows live output of the commands being run (it's pretty cool) and does this via ephemeral SSH certificates. It provides tools for creating IaC and requires human approval for creating sandboxes on hosts with low memory/CPU, for accessing the internet, and for installing packages.
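
To make the approval rule concrete, here is a minimal sketch of how such a gate could look. It is an illustration only, not Fluid's actual code: the `Host` type, the action names, and both thresholds are hypothetical, since the only thing stated above is "low memory/CPU".

```python
from dataclasses import dataclass

# Hypothetical thresholds: the description above only says "low memory/CPU".
MIN_FREE_MEMORY_MB = 2048
MIN_FREE_CPUS = 2

# Hypothetical action names for the two always-gated operations.
ALWAYS_GATED = {"access_internet", "install_package"}


@dataclass(frozen=True)
class Host:
    """Illustrative view of a sandbox host's spare capacity."""
    name: str
    free_memory_mb: int
    free_cpus: int


def needs_human_approval(action: str, host: Host) -> bool:
    """Return True when a person should confirm the action before it runs."""
    if action in ALWAYS_GATED:
        return True
    if action == "create_sandbox":
        # Only gate sandbox creation when the host is short on resources.
        return (host.free_memory_mb < MIN_FREE_MEMORY_MB
                or host.free_cpus < MIN_FREE_CPUS)
    return False
```

Everything not listed runs without a prompt in this sketch, which matches the goal of giving the agent autonomy inside sandboxes.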

I greatly appreciate any feedback or thoughts you have, and I hope you get the chance to try out Fluid!

levkk

So... I already tell Claude Code to do this. Just run kubectl for me please and figure out why my helm chart is broken.

Scary? A little but it's doing great. Not entirely sure why a specialized tool is needed when the general purpose CLI is working.

turtlebits

Making clones of production isn't trivial. Is your app server clone going to connect to your production database? Is it going to spin up your whole stack? Seems a bit naive.

A better approach is to have AI understand how prod is built and make the changes there instead of having AI inspect it and figure out how to apply one off changes.

Models are already very good at writing IaC.

JohnMakin

> LLMs are great at generating Terraform, OpenTofu, Ansible, etc. but bad at guessing how production systems work.

Sorry, that last part is absolutely not the case from my experience. IaC also uses the API to inquire about the infrastructure, and there are existing import/export tools around it, so I’m not exactly sure what you are gaining by insisting on abandoning it. IaC also has the benefit of being reusable and committable.

hebejebelus

Clever solution. I think ops (like this) and observability will soon be pretty hot markets, and will stay that way for a while. The code is quite cheap now, but actually running it and keeping it running still requires some amount of background. I've had a number of acquaintances ask me how they can get their vibe coded app available for others to use.

I really like this idea. I do a lot of kubernetes ops with workloads I'm unfamiliar with (and not directly responsible for) and often give claude read access in order to help me debug things, including with things like a grafana skill in order to access the same monitoring tools humans have. It's saved me dozens of hours in the last months - and my job is significantly less frustrating now.

Your method of creating ansible playbooks makes _tons_ of sense for this kind of work. I typically create documentation (with claude) for things after I've worked through them (with claude) but playbooks is a very, very clever move.

I would say something similar, but as an auditable, controllable Kubernetes operator, would be pretty welcome.

wayeq

> curl -fsSL https://fluid.sh/install.sh | bash

what could go wrong..

jamesmstone

This general idea is exactly why I love nix. The immutability of it is powerful. It can be useful for both running your agents in a certain environment AND your agents are useful at writing your nix config. I expand on this in a blog post here https://jamesst.one/posts/agents-nix

keyle

It always makes me smile when you get some random domain with a good looking CSS telling you:

    Don't do the same as everyone!

    For safety...
here... Just curl this script and execute it :)
raw_anon_1111

Is this a real product? This is a solved problem.

First I’m personally never going to create infrastructure in the console. I’m going to use IAC from the get go. That means I can reproduce my infra on another account easily.

Second if I did come across an environment where this was already the case, there are tools for both Terraform and CloudFormation where you can reverse your infra to reproducible IAC.

After that, let Claude go wild in my sandbox account with a reasonably scoped IAM role with temporary credentials

dengsauve

I use Pulumi for work, and their AI solution (Pulumi Neo) works amazingly well in troubleshooting cloud issues. It's informed of the cloud state and recent changes right from their platform, which is pretty amazing. Compared to using Azure CoPilot for the same purposes, Pulumi Neo was faster in generating responses, and these responses were actionable and solved my issues. CoPilot was laughably useless by comparison.

chickensong

So this is a client/server thing to control KVM via libvirt and provision SSH keys to allow LLM agent access to the VMs?

How does the Ansible export work? Do the agents hack around inside the VM and then write a playbook from memory, or are all changes made via Ansible?

If Ansible playbooks are the artifact, what features does Fluid offer over just having agents iterate on an Ansible codebase and having Ansible drive provisioning?

stackskipton

Ops person here.

I'm already using an LLM to generate things and I'm not sure what this adds. The demo isn't really doing it for me, but maybe I'm the wrong target for it. (What is running on that server? You don't know. Build your cattle properly!)

Maybe this is better for one-man-band devs trying to get something running without caring about anything beyond "it's running".

bluelightning2k

This sounds like a uniquely good way to accidentally spend infinity money on AWS

lfx

Hey Collin!

Interesting idea, a few things:

- The website tells less than your comment here. I want to try but have no idea how destructive it can be.

- You need to add / mention how to do things in the RO mode only.

- Always explain destructive actions.

A few weeks ago I had to debug K8S on the GCP GDC metal. Claude Code helped me tons, but I had to recreate the whole cluster the next day because the agent ran too fast and deleted things it should not have deleted, or at least should have told me the full impact of first. So some harness would be nice.
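
As a rough idea of what a read-only harness could look like, here is a small sketch of a guard that only lets through `kubectl` commands whose verb is on an allowlist. The function name and the allowlist are my own assumptions, not anything Fluid or Claude Code ships, and the check is deliberately conservative: anything it cannot classify is treated as not read-only.

```python
import shlex

# Assumed allowlist of kubectl verbs that only read cluster state.
READ_ONLY_VERBS = {"get", "describe", "logs", "top", "explain"}

# Characters that could chain or redirect commands if a shell ran the string.
SHELL_METACHARACTERS = set(";|&$`<>\n")


def is_read_only(command: str) -> bool:
    """Return True only for plain `kubectl <read-only verb> ...` commands."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: refuse rather than guess.
        return False
    if len(tokens) < 2 or tokens[0] != "kubectl":
        return False
    # The verb must come directly after `kubectl`; commands with global
    # flags in front of the verb are rejected to stay on the safe side.
    return tokens[1] in READ_ONLY_VERBS
```

A harness built on this would run anything that passes and stop to explain the impact and ask before running anything that does not.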

jaimex2

This will make some amazing memes. 'Sorry I caused a $100,000 bill. I've made the right changes this time to scale appropriately.'

Next month - 'Sorry I caused a $200,000 bill...'

baalimago

It's pretty cool. What would be cooler is to have it as an MCP server... and then use Claude Code

alexandercheema

Isn't Claude Code for Infrastructure just...Claude Code?

qainsights

Can't we just use Claude Code straight up?

zahrevsky

I love how the landing page is straight to the point and has zero marketing BS. It achieves the opposite of AI-written text, while still being polished.

esafak

An infrastructure tool's primary installation method should NOT be curl | sh
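
One common alternative is to download the script, compare it against a published checksum, and only then run it. A minimal sketch of the comparison step follows. It assumes the project publishes a SHA-256 digest through some separate trusted channel, which I have not checked for fluid.sh, and the function name is made up for illustration.

```python
import hashlib
import hmac


def script_matches_checksum(script: bytes, expected_sha256: str) -> bool:
    """Return True if the downloaded bytes hash to the published digest."""
    actual = hashlib.sha256(script).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

The installer would only be executed when this returns True; a mismatch means the file differs from what was published.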

ekaesmem

Please at least write the README.md by yourself. It's excessively lengthy.

tobi_bsf

What's wrong with just using Claude Code for infrastructure? Works great tbh.

latchkey

I'm working towards this for actual infrastructure, for serving up AI compute.

"install kimi 2.5 on a 4x mi300x vm and connect the endpoint to opencode, shut it down in 4 hours"

We're getting close.

Uptrenda

About 90% of HN is now AI shit at any given time. I can't fucking take this shit. Can you losers talk about anything else.

lijok

FUCK NO. Who in their right mind would let an LLM connect to prod?

bigcat12345678

This is the most plausible tool for vibe infra I can think of