Show HN: Tilde.run – Agent sandbox with a transactional, versioned filesystem

167 points | 118 comments | 19 hours ago

docheinestages

Just my two cents: less is more and the first impression matters a lot. I'm saying this because we see a new agent sandbox tool on the front-page almost every day. Most of them have an AI-made landing page design, lots of animations, lots of words. This has become a bad sign for me. I can tell that you put time into it, made a video, and everything, but I guess I'm suffering from some kind of fatigue of having to go through all these tools. So, the less I have to process to get to the meat of exactly what I'm looking at, what sets this apart from others, why and when I would need to use it, then the more likely I am to actually engage with the product.

jFriedensreich

I had to dig hard to find out that this is a SaaS sandbox offering, not an actual sandbox (software I can use locally). It's just wasting people's time; no one needs a non-open-source sandbox. There are now at least 3 Apache 2.0 projects (smolmachines, microsandbox, boxlite) working on sandboxes, and at least one of them should be ready for prime time soon.

skeledrew

I made something pretty similar to this a couple of months ago, when I was just getting into using coding agents. It has 2 parts that work individually but are better together: a change-tracking FS and an agent sandbox. I haven't really used it, though, as it's a pain to get Claude Code working in that (Docker-based) sandbox without baking it in, and I really want something that's fully configurable. And then I didn't really need it, because I'm a very interactive user; I'm almost constantly watching the agent and never use YOLO... except for 1 codebase where it's frustratingly failing to fix a single particular bug and I really don't want to deal with it myself.

jmull

This is an excellent idea whose time has come.

But this is too vague for me. I'm not seeing my questions answered in the landing page or FAQ either.

E.g., what's the pricing?

How does atomic commit really work? E.g., if one write to S3 succeeds but the update to a git repo fails?

Does this use optimistic locking or something else? What happens if I commit changes to a resource that was updated since it was imported?

Where/how is it hosted?
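
For readers unfamiliar with the optimistic-locking question above, here is a minimal sketch of the general idea. It says nothing about how Tilde actually works: `VersionedStore`, `ConflictError`, and the per-key version counters are all hypothetical names made up for illustration, and the store is a plain in-memory dict.

```python
class ConflictError(Exception):
    """Raised when a commit is based on a stale version of some key."""


class VersionedStore:
    """Hypothetical in-memory store; every key carries a version counter."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Returns (version, value); a missing key is reported as version 0.
        return self._data.get(key, (0, None))

    def commit(self, changes):
        """Apply all changes or none.

        `changes` maps key -> (base_version, new_value), where base_version
        is the version the writer saw when it imported the key.
        """
        # Phase 1: validate every key before touching anything.
        for key, (base_version, _) in changes.items():
            current_version, _ = self.read(key)
            if current_version != base_version:
                raise ConflictError(
                    f"{key!r} changed since version {base_version}"
                )
        # Phase 2: apply. Only reached if every check passed.
        for key, (base_version, value) in changes.items():
            self._data[key] = (base_version + 1, value)


store = VersionedStore()
store.commit({"report.csv": (0, "v1 contents")})

try:
    # Stale base version for report.csv, so notes.txt must not be written either.
    store.commit({"report.csv": (0, "other"), "notes.txt": (0, "hello")})
except ConflictError as exc:
    print("commit rejected:", exc)
```

Validating everything before writing anything is what makes the in-memory case all-or-nothing. It does not answer the harder part of the question, where one backing system (say S3) accepts a write and another (say a git remote) rejects it; that needs some form of staging or compensation, which this sketch does not attempt.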

egorfine

I glanced through the whole documentation, the homepage and the GitHub READMEs and still couldn't figure out which OSes they support and how. This is especially important to know because sandboxing on macOS and sandboxing on Linux have nothing in common.

_pdp_

Git is already versioned, S3 supports versioning, and any file copied into the sandbox is, well, a copy, so I am not sure what the angle is here.

Other than that it looks cool!

kushalpatil07

I was trying to build an agent. None of the sandboxes out there had solved the filesystem problem. I want my agent to have persistent storage that stays forever. Like a human with a computer. When the agent spins up again, it has access to the computer with the same files.

I had to create my own setup using an AWS S3 filesystem and Docker for this.

Does Tilde solve for this?

kindev

Wow, I see a lot of potential with this project! Using the filesystem simplifies the integration with 3rd parties significantly.

seamossfet

Does this provide gitflow to handle conflicts from multiple agents touching the same file system or is it purely for single-branch sequential iterations on the filesystem?

I have a use case that could use this if it supports handling branching and merging file systems.

anonymousiam

Back in the 1970s when versioned filesystems were invented, they provided a recovery path for when a file was improperly changed or deleted. Now, in the age of LLMs that go rogue, I can see why they would become popular again.

sahil-shubham

Nice work on the website!

Building something for the same problem but more so from the perspective of self-hostable stateful sandboxes, and not just the filesystem (see https://bhatti.sh). What sandbox solution are you using here?

cpard

It was a nice surprise seeing your post on the front page of HN, Oz. Congrats!

If I understand correctly, what Tilde is doing is extending the concept of the sandbox in an operating system (the filesystem) to data too.

So this is a sandbox environment someone would use for data-heavy agentic workloads. Is this correct?

digitaltrees

Interesting project. I am building an IDE for my phone and browser (www.propelcode.app) and have evaluated a few container architectures and providers. It was quite painful to get a prototype working. I will try your platform and would be happy to give feedback.

mehmetkeremmtl

The versioned filesystem is exactly what's missing when agents hallucinate and go off the rails. How fast are the rollbacks if an agent completely messes up the directory state?

stronglikedan

> Free to start

Before I invest my time into something like this I'll need to know what it'll end up costing in the end. Perhaps it's just that "private previews" aren't for me. Good luck!

mc-serious

Nice, I think that's pretty neat. Do you have an idea where to take this further? I.e. for the filesystem it's great but what if you need to touch external systems that keep their own state?

grim_io

Another one.

If it's not a local sandbox, I'm not interested.

We've got enough subscription lock-in from LLMs already.

pwr1

This looks pretty useful. The versioned filesystem part is nice because that’s exactly where a lot of agent stuff gets messy fast.

zuzululu

More tools I will never use or need. There's just an endless supply of new open source projects now; I stopped paying attention.

I increasingly feel the impact of landing on the frontpage of HN is not as pronounced as it used to be. The demographic shift of HN is also noted, it has a lot more "reddit" vibe than I remember.

danielbenzvi

Interesting. Their versioned storage sandbox seems to be what really sets them apart

viewhub

What compute resources does the sandbox have? Memory/CPU/GPU?

aussieguy1234

Nice project, but saying "Run AI agents in production without the risk" isn't quite accurate.

Even if some tool makes it impossible for an AI agent to delete things in a way that isn't recoverable, there are other risks such as data exfiltration that need to be managed separately.

clearstack

If an agent deletes something important (e.g. a database), can you undo it? Does it automatically back up before making changes?

kay_o

Does this interact with SQL or only the FS?

dtran24

Do git and branching fit into this at all?

mdavid626

Just enable versioning in S3?

esafak

I do not get it. If the agent is not mutating state the change can be checked in. If it is mutating external state, version control won't save you.

dorianzheng

Any chance I can run a local micro-VM such as boxlite with this?

irivkin

Looks promising! I wanna try it!

whwhyb

not to be confused with tilde.club

gverrilla

I'm far from an expert in the field or in computer science, but from my limited perspective I don't see the need for sandboxing: after thousands of Claude Code interactions it never did anything seriously wrong, at all. If I understand this all correctly, lakeFS would be useful for versioning huge data loads, but that's not my case: for my use case I use dura and that's plenty, and for more serious projects where I want not only to version changes but also to 'journal' them, I use GitHub. Also, I don't understand one thing: is this like a different client? The website shows a screenshot of "Claude Code" that is not Claude Code at all, or is modified; that's not a terminal. Am I tripping on anything I said?

verdverm

I implemented something like this in ADK with Dagger, but it misses some important features b/c of BuildKit underneath. The OCI foundations make saving each step as a layer, diff, clone/fork, and time travel easy. The hard parts are security and resource limits.

Glad to see more takes in this space.

redwood

How does this scale? For example, if I were to have hundreds or thousands of concurrent agents running, with some parts of their data pulled out of shared state and other parts custom to that particular agent run, and I wanted all of this to be preserved for future collective or individual agent use later, is this a reasonable primitive for that problem space? Or is this more for a situation where you have one or a small number of productivity assistance agents that need a sandbox but have low data mutation throughput and a low amount of concurrent access across different agents?

varispeed

All these agent offerings are missing a use case.

What would I use it for, and why?

It reminds me of blockchain, which was a solution desperately looking for a problem. What problem does it solve?

wyre

Interesting. Literally saw a tweet talking about exactly this last night.

Not sure how I feel about using it on your hosted service, while your home page is asking me for analytics data and only the CLI and SDK are open source.

cyanydeez

I know everyone's trying to figure out how to make money in this grift economy, but if you're a rational person, you know that it's all a bunch of gambling, and by tailoring your scope to B2B and ignoring local and open source models and tools, you're more likely to become part of that permanent underclass they keep talking about, in a self-fulfilling prophecy.