We were heavy users of Claude Code ($70K+ spend per year) and have almost completely switched to codex CLI. I'm doing massive lifts with it on software that would never before have been feasible for me personally, or any team I've ever run. I'll use Claude Code maybe once every two weeks as a second set of eyes to inspect code and document a bug, with mixed success. But my experience has been that initially Claude Code was amazing and a "just take my frikkin money" product. Then Codex overtook CC and is much better at longer runs on hard problems. I've seen Claude Code literally just give up on a hard problem and tell me to buy something off the shelf. Whereas Codex's ability to profoundly increase the capabilities of a software org is a secret that's slowly getting out.
I don't have any relationship with any AI company, and honestly I was rooting for Anthropic, but Codex CLI is just way way better.
Also Codex CLI is cheaper than Claude Code.
I think Anthropic are going to have to somehow leapfrog OpenAI to regain the position they were in around June of this year. But right now they're being handed their hat.
It's really solid. It's effectively a web (and native mobile) UI over Claude Code CLI, more specifically "claude --dangerously-skip-permissions".
Anthropic have recognized that Claude Code where you don't have to approve every step is massively more productive and interesting than the default, so it's worth investing a lot of resources in sandboxing.
brynary
The most interesting parts of this to me are somewhat buried:
- Claude Code has been added to iOS
- Claude Code on the Web allows for seamless switching to Claude Code CLI
- They have open sourced an OS-native sandboxing system which limits file system and network access _without_ needing containers
However, I find the emphasis on limiting the outbound network access somewhat puzzling because the allowlists invariably include domains like gist.github.com and dozens of others which act effectively as public CMS’es and would still permit exfiltration with just a bit of extra effort.
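To make the exfiltration point concrete, here is a minimal sketch of a host-based egress check. The allowlist contents and the `is_allowed` helper are assumptions made up for illustration; this is not Anthropic's sandbox code. It only shows that a filter of this kind looks at the destination host and has no view of what is being sent there.

```python
from urllib.parse import urlparse

# Hypothetical allowlist for illustration; not the real sandbox configuration.
ALLOWED_HOSTS = {"github.com", "gist.github.com", "pypi.org"}


def is_allowed(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True if the URL's host is on the allowlist (exact match only)."""
    host = urlparse(url).hostname or ""
    return host.lower() in allowed_hosts


# The check sees only the destination host, never the request body, so any
# data posted to an allowed host that serves user content would pass.
print(is_allowed("https://gist.github.com/"))         # True
print(is_allowed("https://attacker.example/upload"))  # False
```

Under these assumptions, blocking `attacker.example` does little if the same data can be written to a public gist, which is the concern raised above.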
mdeeks
I feel like these background agents still aren't doing what I want from a developer experience perspective. Running in an inaccessible environment that pushes random things to branches that I then have to check out locally doesn't feel great.
AI coding should be tightly in the inner dev loop! PRs are a bad way to review and iterate on code. They are a last line of defense, not the primary way to develop.
Give me an isolated environment that is one click hooked up to Cursor/VSCode Remote SSH. It should be the default. I can't think of a single time that Claude or any other AI tool nailed the request on the first try (other than trivial things). I always need to touch it up or at least navigate around and validate it in my IDE.
martypitt
It's interesting how most of these tools are (exclusively) Github.
We're on Gitlab for historic reasons. Where Github now has numerous opportunities to use AI as part of your workflow, there's nothing in Gitlab (from what I can tell), unless you're paying big bucks.
I like using AI to boost my productivity. I'm surprised that that'll be the thing that makes me migrate to Github.
jackconsidine
> We were heavy users of Claude Code ($70K+ spend per year) and have almost completely switched to codex CLI
Seeing comments like this all over the place. I switched to CC from Cursor in June / July because I saw the same types of comments. I switched from VSCode + Copilot about 8 months before that for the same reason. I remember being skeptical that this sort of thing was guerilla marketing, but CC was in fact better than Cursor. Guess I'll try Codex, and I guess that it's good that there are multiple competing products making big strides.
Never would have imagined myself ditching IDEs and workflows 3x in a few months. A little exhausting
pimterry
Personally, my one annoyance here is that it requires you to install a GitHub App that gives it direct write permissions to all code in your repos (in addition to issues, PRs, etc).
I'd much rather give it read permissions, have it work in its own clone, and then manually pull changes back through (either with a web review UI somehow, or just pulling the changes locally). Partly for security, partly just to provide a good review gate.
Would also allow using this with other people's repos, where I _can't_ give write permissions, which would be super helpful for exploring dependency repos, or doing more general research. I've found this super helpful with Claude Code locally but seems impossible on the web right now.
neilv
Nit about doing your AI interfaces on the Web: I really want claude.ai and chatgpt.com to offer a standard username+password login without 2FA. The kind my privacy-friendly browser of short-lived sessions can complete in a couple clicks, like for most other SaaSes, and then I'm in and using the tool.
I don't want to leak data either way by using some "let's throw SSO from a sketchy adtech company into the trust loop".
I don't want to wait a minute for Anthropic's login-by-email link, and have the process slam the brakes on my workflow and train of thought.
I don't want to wait a minute for OpenAI's MFA-by-email code (even though I disabled that in the account settings, it still did it).
I don't want to deal with desktop clients I don't trust, or that might not keep up with feature improvements. Nor have to kludge up a clumsy virtualization sandbox for an untrusted client, just to ask an LLM questions that could just be in a Web browser.
ea016
No relations to them, but I've started using Happy[0]'s iOS app to start and continue Claude Code sessions on my iPhone. It allows me to run sessions on a custom environment, like a machine with a GPU to train models
This is going to be extremely useful. A lot of people have hacked together similar things to get around waiting for CC to finish without mangling worktrees and branches manually.
I was curious how the 'Open in CLI' works - it copies a command to clipboard like 'claude --teleport session_XXXXX', which opens the same chat in the CLI, and checks out a new branch off origin/main which it's created for the thread, called 'claude/feature-name-XXXXX'.
I prefer not to use CC at the 'PR level' because it still needs too much hand-holding, so very happy to see that they've added this.
Update: Session titles are either being leaked between users or have a very bad LLM writing them. I'm seeing "Update Ton Blockchain Configuration" and "Retrieve Current PIN Code" for a project that has nothing to do with blockchain or PIN codes...
yoavm
I was just working on something similar for OpenCode - pushing it now in case it's useful for someone[0].
It can run in a front-end only mode (I'll put up a hosted version soon), and then you need to specify your OpenCode API server and it'll connect to it. Alternatively, it can spin up the API server itself and proxy it, and then you just need to expose (securely) the server to the internet.
The UI is responsive and my main idea was that I can easily continue directing the AI from my phone, but it's also of course possible to just spin up new sessions. So often I have an idea while I'm away from my keyboard, and being able to just say "create an X" and let it do its thing while I'm on the go is quite exciting.
It doesn't spin up a special sandbox environment or anything like that, but you're really free to run it inside whatever sandboxing solution you want. And unlike Claude Code, you're of course free to choose whatever model you want.
I've been using Happy Coder[0] for some time now on web and mobile. I run it `--yolo` mode on an isolated VM across multiple projects.
With Happy, I managed to turn one of these Claude Code instances into a replacement for Claude that has all the MCP goodness I could ever want and more.
Unfortunately Anthropic have completely lost my trust. It’s very unlikely that I will ever return to purchasing from a company that behaves in the manner in which they do.
artdigital
So is this their version of Jules / Codex / Copilot agent? Aka autonomous agent in the cloud you give a task and it spits out a PR a bit later?
It’s interesting how all the LLMs slowly end up with the same feature set and picking one really comes down to personal preference.
As a dev, I'm happy that I now have 4 autonomous engineers that I can delegate stuff to depending on task difficulty and rate limits. Even just Copilot + Codex has made me a lot more productive.
Also RIP to all the startups that tried to provide “Claude in the cloud”, though this was very predictable.
It's pretty frustrating that every release is iOS first without any timeline or expectation for Android.
lukaslalinsky
I wish this was integrated with GitHub actions, as there I can configure the environment, give it access to tools. The GitHub Actions integration is already fairly good, but having this interactive web UI would be perfect.
jryio
Pair programming is still one of the best ways to transfer knowledge between two programmers in a high-throughput manner. Humans learn by doing, building synaptic connections.
I wonder if a shared Claude Code instance has the same effect?
ubj
Very curious to see what usage limits are like for paid plans. Anthropic was already experiencing issues with high-volume model usage for Pro and Max users. I hope their infrastructure is able to adequately support running these additional coding environments on top of model inference.
Just to be clear, I'm excited for the capability to use Claude Code entirely within the browser. However, I've heard reports of Max users experiencing throttled usage limits in recent months, and am concerned as to whether this will exacerbate that issue or not.
r0x0r007
Wow, so nice! Now I can read hacker news, watch youtube shorts and solve tickets and add new features at the same time! What could go wrong!? Thanks AI!
qwertox
I wish that a "Claude Code"-session (and a per-project-id session) could present itself in Claude Web in menu entries to create new chats which participate in the selected Claude Code session.
jngiam1
I got so used to having Claude Code read some of my MCP tools, and was bummed to see that it couldn't connect to them yet on the web.
Pretty cool though! Will need to use it for some more isolated work/code edits. Claude Code is now my workhorse for a ton of stuff including non-coding work (esp. with the right MCPs)
teunlao
Been using both daily for three months. Different tools for different jobs.
Claude Code has better UX. Period. The permission system, rollbacks, plan mode - it's more polished. Iterative work feels natural. Quick fixes, exploratory coding, when I'm not sure exactly what I want yet - Claude wins.
Codex is more reliable when stakes are high. Hard problems. Multi-file refactors. Complex business logic. The model just grinds through it. Less hand-holding needed.
Here's the split I've landed on - Claude for fast iteration tasks where I'm actively involved. Codex for delegate-and-walk-away work that needs to be right first time.
Not about which is "better" - wrong question. It's about tooling vs model capability. Claude optimized the wrapper. OpenAI optimized the engine.
cube2222
This is quite nice!
I'm using Claude Code locally a lot, occasionally with a couple of parallel sessions.
I was very happy when they made the GitHub Action - I used it quite a bit, but in practice I got frustrated that I effectively only get a single back-and-forth out of it, I can't really "continue the conversation without losing context" - Sure, I can respond to it in the PR it makes, but that will be a fresh session with a fresh empty context.
So, as much as I don't like moving out of my standard development workflow with my tools, I think this could be quite useful. The ability to interrupt and/or continue a conversation should be very nice.
My main worry is - usually my unit tests and integration tests rely on a postgres database running on the machine, and it's not obvious to me if I can spin that up here?
jimmydoe
               total        used        free      shared  buff/cache   available
Mem:            13Gi       306Mi        12Gi          0B       126Mi        12Gi
Swap:             0B          0B          0B
the sandbox has ~12G RAM, but no docker or podman allowed.
unfortunately it doesn't work for me as I need docker compose or equivalent to fire up some env for local test
lysecret
Just played around with it; the fact it’s on the phone is a big bonus.
I have set up a little workflow where, given Linear tags, it sets up a worktree on my dev box, installs deps, and starts the implementation so I can take it over. I prefer this workflow to the fully managed cloud-based solutions.
This kind of fits in for issues where I’m basically sure I won’t have to take it over (and it can do it fully on its own). Which aren’t that many.
Very simple example: there was a warning pop-up on something where I thought there shouldn’t be one; now it’s fixed fully automatically from my phone in 5 mins. I quite like that these small changes become so easy.
ed_mercer
Is CC on the web able to spawn local containers? I would need to spawn a half dozen services locally in order to have a proper simulation of my actual working environment. Tool calling and integration with various microservices (e.g. postgres, playwright) is one of the most important uses of CC for us. For example, after telling CC to implement a feature, it needs to test that feature and confirm that any database changes are the way they're supposed to.
shireboy
I really want this but for Azure Devops. If you're not familiar, Microsoft owns both Github and Azure Devops, and both do similar things: git repos and project management. I can use Github Copilot, Claude Code CLI, etc. against code on my disk, including Azure Devops MCP. But what I can't easily do is what Github Copilot Agent and apparently this Claude Code on Web offer: assign a ticket to @SomeAi and have a PR show up in a few minutes. Can't change to github for _reasons_.
Would love any suggestions if anyone is in a similar situation.
idk1
This is off topic, but can anyone tell me what the genre of music is to the video on this?
mholubowski
I’d really appreciate an explanation of this:
How does Codex / Claude Code compare to working within Cursor with the chat and agents? Are they effectively the same thing?
Is one significantly better than the other? Please share your experiences around this; I’m trying to be as effective an engineer as I can be at our company. - Mike
arjie
A thing I really like with Claude Code is how well it uses the bash scripts you give it. I also have a browser control MCP installed and it's pretty good for it to full-cycle around the approach. I have a staging database that it has the passwords to that it logs in and runs queries on. This whole thing means it loops and delivers good results for me.
I'll try this, but the grounding seems crucial for these LLMs to deliver results that are fewer shot than otherwise.
dysoco
So from what I can understand this is only meant to be used with Claude-hosted sandbox environments?
Wouldn't work for my case since I need a lot of HDD space, GPUs etc. to run the thing I'm working on, but it would be great if I could run a Claude Code server in my server, expose the port and then connect via web or iOS interface.
Sure I can use tmux/ssh but it's very impractical, especially on mobile.
scamaltman
How many comments in this topic are written by codex bot already ?
Although I wish that the performance of Jules is worse than Gemini CLI. I hope that this is as good as the Claude Code CLI.
jakebasile
Both Anthropic and OpenAI have something like this now and neither bothered to implement a delete feature.
Google’s version Jules has one.
witnessme
Claude team has been killing it with the new impressive releases since last week. And this one looks most promising.
minimaxir
I like how in the demo video there's a squiggle emphasis on Claude's "Good Idea!" in response to a user clarification, when it's more common among vibe coders that less glazing is better and they just want the LLM to write code.
bgirard
Looks promising.
I got my environment working well with Codex's Cloud Task. Trying the same repo with Claude Code Web (which started off with Claude Code CLI mind you), and the yarn install just hangs with no debuggable output.
cesarvarela
Does this work inside docker containers like Codex? Stuff like `testcontainers` is unusable with that architecture because you need access to docker itself.
BohdanPetryshyn
They didn't even review their own PR in the demo video :\
jannniii
I’m wondering if it would be possible to use the new skills feature or agents with this. Without the agents or the skills, I don’t know how useful this would be.
jzig
Does the feature need to be toggled on somewhere? I don’t see it on web nor iOS.
CSMastermind
The inability to set up an environment with just full internet access is annoying.
low_tech_punk
IMHO, parallel tasks across multiple repos is not as useful as parallel tasks in one repo.
hnidiots3
I wonder why people don’t just use Amp Code and use the Oracle.
It’s Sonnet 4.5 + GPT-5 working together.
Codex just isn’t as good as people make it out to be. OpenAI seems to train on a lot of JavaScript/Tailwind to make visuals look more impressive but when it comes to actual backend work it just fails more than it succeeds. Sonnet is much better at chewing through tasks and GPT 5 is great at consulting planning and analysis.
Using Amp and asking it to check everything with the oracle leads to superior results.
But no one on HN has heard of it. I’m guessing HN hates twitter?
rounakdatta
Now that this hosted CC is achieved, next up, I think scheduled workflows would be coming. For example, certain open-source repositories host data files scraped regularly from sources via scheduled GitHub Actions, that could be simplified.
wolfgangbabad
Codex is the way.
mkummer
Is the web interface open sourced anywhere? Looks great, excited to try it out
kelvinjps10
I was hoping that it would work with the API.
Stevvo
Guess they couldn't name it "Claude Codex"
aantix
Does this web interface have support for AWS Bedrock?
retrocog
My productivity is exploding!
bitpatch
This is kind of nice, as much as I love a good TUI, sometimes text editing in claude code can trip me up compared to a web GUI
insane_dreamer
I can already run multiple parallel tasks with Claude in multiple terminal windows, with git worktrees if working on the same repo. So I don't really understand the use case for CC on the web.
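For readers who have not used the worktree approach, here is a minimal sketch of one way to set it up. The branch naming (`task/<name>`) and the sibling-directory layout are assumptions for illustration, and the helper only builds the `git worktree add` command rather than running it.

```python
from pathlib import PurePosixPath


def worktree_add_command(repo: PurePosixPath, task: str) -> list[str]:
    """Build a `git worktree add` command for one task (not executed here)."""
    branch = f"task/{task}"                     # hypothetical branch naming
    path = repo.parent / f"{repo.name}-{task}"  # one sibling directory per task
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]


# One checkout per task, so each terminal session can work in its own directory.
for task in ("fix-login", "add-export"):
    print(" ".join(worktree_add_command(PurePosixPath("/src/myapp"), task)))
```

Each resulting directory has its own branch and working tree, so parallel sessions do not overwrite each other's files.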
arianvanp
The way network access works really feels weird to me. I wish that instead I could just do it like (or with!) nix. If I know the hash of the thing I'm fetching from the network, allow me access to it. Instead of arbitrarily allowlisting domains.
Imagine if this would just be able to use your nix file in your repo to fetch all the dependencies needed to run your project. That'd be extremely sick
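A rough sketch of the hash-pinning idea, under the assumption that the expected SHA-256 digest is known ahead of time (for example from a lockfile). This is not how nix or the Claude Code sandbox is implemented; `verify_pinned` is a hypothetical helper that only shows the accept-or-reject step once the bytes have been fetched.

```python
import hashlib
import hmac


def verify_pinned(content: bytes, expected_sha256_hex: str) -> bool:
    """Return True only if content hashes to the pinned SHA-256 digest."""
    actual = hashlib.sha256(content).hexdigest()
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


payload = b"example dependency tarball bytes"
# Stands in for a digest recorded in advance; computed here only for the demo.
pinned = hashlib.sha256(payload).hexdigest()

print(verify_pinned(payload, pinned))         # True
print(verify_pinned(payload + b"x", pinned))  # False
```

With a check like this, where the bytes came from matters less than whether they match the pinned digest, which is the trade the comment is asking for.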
nextworddev
Developers may want to deny this, but it's getting dangerously close to maybe replacing 30% of developers
The dev just types in a prompt, scrolls down the bottom and makes a PR asking others to review without even looking at what they just did.
Lmao. Know their target market for sure.
mrcwinn
We’re moving almost entirely to Codex, first because often it’s just better, and second because it’s much cheaper. It’s a bet that they’re better now, but given capacity and funding, they’ll be better later too.
The only edge Claude has is context window, which we do sometimes hit, but I’m sure that gap will close.
lvl155
I am not a big fan of these. They’re trying to bundle compute and jack up the prices down the road.
bgwalter
I have never seen such a bunch of uncreative people who have never written a real application, never done anything artistic, never said anything intelligent try to ruin software development to the extent that the "AI" companies do.
They want to turn everything into a bootstrap framework, which is probably the limit of their mental horizon. And many people maintain that the emperor is fully clothed and that the scam works.
I had a preview of this over the weekend, notes here plus some example PRs: https://simonwillison.net/2025/Oct/20/claude-code-for-web/
[0] Happy (the iOS app mentioned in ea016's comment): https://github.com/slopus/happy/
[0] opencode-web (the OpenCode web UI from yoavm's comment): https://github.com/bjesus/opencode-web
[0] Happy Coder: https://happy.engineering/
Here's the link talking about the sandbox environment and features they're using for this Claude Code. https://www.anthropic.com/engineering/claude-code-sandboxing
It's pretty frustrating that every release is IOS first without any timeline or expectation for Android
I wish this was integrated with GitHub actions, as there I can configure the environment, give it access to tools. The GitHub Actions integration is already fairly good, but having this interactive web UI would be perfect.
Pair programming is still one of the best ways to knowledge transfer between two programmers in a high throughput manner. Humans learn by doing, building synaptic connections.
I wonder if a shared Claude Code instance has the same effect?
Very curious to see what usage limits are like for paid plans. Anthropic was already experiencing issues with high-volume model usage for Pro and Max users. I hope their infrastructure is able to adequately support running these additional coding environments on top of model inference.
Just to be clear, I'm excited for the capability to use Claude Code entirely within the browser. However, I've heard reports of Max users experiencing throttled usage limits in recent months, and am concerned as to whether this will exacerbate that issue or not.
Wow, so nice! Now I can read hacker news, watch youtube shorts and solve tickets and add new features at the same time! What could go wrong!? Thanks AI!
I wish that a "Claude Code"-session (and a per-project-id session) could present itself in Claude Web in menu entries to create new chats which participate in the selected Claude Code session.
I got so used to having Claude Code read some of my MCP tools, and was bummed to see that it couldn't connect to them yet on the web.
Pretty cool though! Will need to use it for some more isolated work/code edits. Claude Code is now my workhorse for a ton of stuff including non-coding work (esp. with the right MCPs)
Been using both daily for three months. Different tools for different jobs.
Claude Code has better UX. Period. The permission system, rollbacks, plan mode - it's more polished. Iterative work feels natural. Quick fixes, exploratory coding, when I'm not sure exactly what I want yet - Claude wins.
Codex is more reliable when stakes are high. Hard problems. Multi-file refactors. Complex business logic. The model just grinds through it. Less hand-holding needed.
Here's the split I've landed on - Claude for fast iteration tasks where I'm actively involved. Codex for delegate-and-walk-away work that needs to be right first time.
Not about which is "better" - wrong question. It's about tooling vs model capability. Claude optimized the wrapper. OpenAI optimized the engine.
This is quite nice!
I'm using Claude Code locally a lot, occasionally with a couple parallel session.
I was very happy when they made the GitHub Action - I used it quite a bit, but in practice I got frustrated that I effectively only get a single back-and-forth out of it, I can't really "continue the conversation without losing context" - Sure, I can respond to it in the PR it makes, but that will be a fresh session with a fresh empty context.
So, as much as I don't like moving out of my standard development workflow with my tools, I think this could be quite useful. The ability to interrupt and/or continue a conversation should be very nice.
My main worry is - usually my unit tests and integration tests rely on a postgres database running on the machine, and it's not obvious to me if I can spin that up here?
Unfortunately it doesn't work for me, as I need docker compose or equivalent to fire up an environment for local tests.
Just played around with it. The fact that it's on the phone is a big bonus.
I have set up a little workflow where, given Linear tags, it sets up a worktree on my dev box, installs deps, and starts the implementation so I can take it over. I prefer this workflow to the fully managed cloud-based solutions.
This kind of fits in for issues where I’m basically sure I won’t have to take it over (and it can do it fully on its own). Which aren’t that many.
Very simple example: a warning popped up somewhere I thought there shouldn't be one, and now it's fixed fully automatically from my phone in 5 mins. I quite like that these small changes become so easy.
Is CC on the web able to spawn local containers? I would need to spawn a half dozen services locally in order to have a proper simulation of my actual working environment. Tool calling and integration with various services (e.g. postgres, playwright) is one of the most important uses of CC for us. For example, after telling CC to implement a feature, it needs to test that feature and confirm that any database changes are the way they're supposed to be.
I really want this but for Azure DevOps. If you're not familiar, Microsoft owns both GitHub and Azure DevOps, and both do similar things: git repos and project management. I can use GitHub Copilot, Claude Code CLI, etc. against code on my disk, including the Azure DevOps MCP. But what I can't easily do is what GitHub Copilot Agent and apparently this Claude Code on the web offer: assign a ticket to @SomeAi and have a PR show up in a few minutes. Can't change to GitHub for _reasons_.
Would love any suggestions if anyone is in a similar situation.
This is off topic, but can anyone tell me what genre the music in the video is?
I’d really appreciate an explanation of this:
How does Codex / Claude Code compare to working within Cursor with the chat and agents? Are they effectively the same thing?
Is one significantly better than the other? Please share your experiences around this. I'm trying to be as effective of an engineer as I can be at our company. - Mike
A thing I really like with Claude Code is how well it uses the bash scripts you give it. I also have a browser control MCP installed, which lets it go full-cycle on the approach. It has the passwords to a staging database, so it logs in and runs queries there. This whole thing means it loops and delivers good results for me.
I'll try this, but the grounding seems crucial for these LLMs to deliver results in fewer shots than otherwise.
So from what I can understand this is only meant to be used with Claude-hosted sandbox environments?
Wouldn't work for my case since I need a lot of HDD space, GPUs etc. to run the thing I'm working on, but it would be great if I could run a Claude Code server on my server, expose the port and then connect via the web or iOS interface.
Sure, I can use tmux/ssh, but it's very impractical, especially on mobile.
How many comments in this topic are written by a codex bot already?
This is very similar to Jules by Google! https://jules.google/
Although in my experience Jules performs worse than Gemini CLI. I hope that this is as good as the Claude Code CLI.
Both Anthropic and OpenAI have something like this now and neither bothered to implement a delete feature.
Google’s version Jules has one.
Claude team has been killing it with the new impressive releases since last week. And this one looks most promising.
I like how in the demo video there's a squiggle emphasis on Claude's "Good Idea!" in response to a user clarification, when the more common view among vibe coders is that less glazing is better and they just want the LLM to write code.
Looks promising.
I got my environment working well with Codex's Cloud Task. Trying the same repo with Claude Code Web (which started off with Claude Code CLI, mind you), and the yarn install just hangs with no debuggable output.
Does this work inside docker containers like Codex? Stuff like `testcontainers` is unusable with that architecture because you need access to docker itself.
They didn't even review their own PR in the demo video :\
I’m wondering if it would be possible to use the new skills feature or agents with this. Without the agents or the skills, I don’t know how useful this would be.
Does the feature need to be toggled on somewhere? I don’t see it on web nor iOS.
The inability to set up an environment with just full internet access is annoying.
IMHO, parallel tasks across multiple repos are not as useful as parallel tasks in one repo.
I wonder why people don’t just use Amp Code and use the Oracle.
It’s Sonnet 4.5 + GPT-5 working together.
Codex just isn’t as good as people make it out to be. OpenAI seems to train on a lot of JavaScript/Tailwind to make visuals look more impressive but when it comes to actual backend work it just fails more than it succeeds. Sonnet is much better at chewing through tasks and GPT 5 is great at consulting planning and analysis.
Using Amp and asking it to check everything with the oracle leads to superior results.
But no one on HN has heard of it. I’m guessing HN hates twitter?
Now that hosted CC is here, I think scheduled workflows will be coming next. For example, certain open-source repositories host data files scraped regularly from sources via scheduled GitHub Actions; that could be simplified.
Codex is the way.
Is the web interface open sourced anywhere? Looks great, excited to try it out
I was hoping that it would work with the API.
Guess they couldn't name it "Claude Codex"
Does this web interface have support for AWS Bedrock?
My productivity is exploding!
This is kind of nice, as much as I love a good TUI, sometimes text editing in claude code can trip me up compared to a web GUI
I can already run multiple parallel tasks with Claude in multiple terminal windows, with git worktrees if working on the same repo. So I don't really understand the use case for CC on the web.
The way network access works really feels weird to me. I wish that instead I could just do it like (or with!) nix. If I know the hash of the thing I'm fetching from the network, allow me access to it, instead of arbitrarily allowlisting domains.
Imagine if this would just be able to use the nix file in your repo to fetch all the dependencies needed to run your project. That'd be extremely sick.
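To make the hash idea concrete, here is a rough Python sketch of a hash-pinned check, loosely in the spirit of how nix verifies fetched sources against a declared hash. The `PINNED` table and the `is_allowed` helper are hypothetical names made up for illustration; this is not how Claude Code's sandbox or nix is actually implemented.

```python
import hashlib
import hmac

# Hypothetical pin list: artifact name -> expected SHA-256 hex digest.
# In a real setup these digests would come from a lock file in the repo.
PINNED = {
    "hello.txt": hashlib.sha256(b"hello\n").hexdigest(),
}


def is_allowed(name: str, payload: bytes, pins: dict = PINNED) -> bool:
    """Accept a fetched payload only if its SHA-256 matches the pinned digest."""
    expected = pins.get(name)
    if expected is None:
        # Unpinned artifacts are rejected no matter which domain served them.
        return False
    actual = hashlib.sha256(payload).hexdigest()
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(actual, expected)
```

The point of the design is that the policy is keyed on content rather than on origin: any mirror can serve the bytes, as long as they hash to the pinned value. Note that a check like this only constrains what comes in; it does nothing about what an outbound request might leak.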
Developers may want to deny this, but it's getting dangerously close to maybe replacing 30% of developers
The YouTube demo is hilarious.
https://youtu.be/s-avRazvmLg?si=eQqY6w8kbxv3TFhQ
The dev just types in a prompt, scrolls down to the bottom and makes a PR asking others to review without even looking at what they just did.
Lmao. Know their target market for sure.
We’re moving almost entirely to Codex, first because often it’s just better, and second because it’s much cheaper. It’s a bet that they’re better now, but given capacity and funding, they’ll be better later too.
The only edge Claude has is context window, which we do sometimes hit, but I’m sure that gap will close.
I am not a big fan of these. They’re trying to bundle compute and jack up the prices down the road.
I have never seen such a bunch of uncreative people who have never written a real application, never done anything artistic, never said anything intelligent try to ruin software development to the extent that the "AI" companies do.
They want to turn everything into a bootstrap framework, which is probably the limit of their mental horizon. And many people maintain that the emperor is fully clothed and that the scam works.