I feel like this is a feature which improves the perceived confidence of the LLM but doesn't do much for correctness of other outputs, i.e. an exacerbation of the "confidently incorrect" criticism.
darepublic
When I ask ChatGPT to create a Mermaid diagram for me, it regularly adds newlines to certain labels, which breaks the parse. If you then feed the parse error back to it, the second version is always correct, and it seems to know exactly what the problem was. There are other examples where it will almost always get it wrong the first time but get it right if nudged to correct itself. I wonder what the underlying cause is.
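The feed-the-error-back loop described here can be sketched roughly as below. This is only an illustration under stated assumptions: `generate` stands in for whatever model call you use, `find_label_errors` is a crude hypothetical check (a label opened on a line must close on that same line) and not a real Mermaid parser, and `fake_model` exists only so the sketch runs.

```python
from typing import Callable, List, Optional

# Hypothetical check: a label opened with [, ( or { must close on the same line.
PAIRS = {"[": "]", "(": ")", "{": "}"}


def find_label_errors(diagram: str) -> List[str]:
    """Return one message per line that leaves a label bracket open."""
    errors = []
    for lineno, line in enumerate(diagram.splitlines(), start=1):
        stack = []
        for ch in line:
            if ch in PAIRS:
                stack.append(PAIRS[ch])
            elif stack and ch == stack[-1]:
                stack.pop()
        if stack:
            errors.append(f"line {lineno}: label not closed before newline: {line.strip()!r}")
    return errors


def generate_with_repair(generate: Callable[[Optional[str]], str], max_attempts: int = 3) -> str:
    """Call generate(feedback) until the output passes the check or attempts run out."""
    feedback = None
    for _ in range(max_attempts):
        diagram = generate(feedback)
        errors = find_label_errors(diagram)
        if not errors:
            return diagram
        feedback = "\n".join(errors)  # the "parse error" handed back to the model
    raise ValueError(f"still invalid after {max_attempts} attempts:\n{feedback}")


def fake_model(feedback: Optional[str]) -> str:
    """Stand-in for a model: breaks a label first, fixes it once given feedback."""
    if feedback is None:
        return "flowchart TD\n    A[Hello\nWorld] --> B[Done]"
    return "flowchart TD\n    A[Hello World] --> B[Done]"


fixed = generate_with_repair(fake_model)
```

In practice you would swap the check for the real parser's error output, which is what the comment describes pasting back by hand.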
czk
I tried the periodic table from their examples using Sonnet 4.6 on the $20/mo plan. After a few minutes Claude told me it had reached the max message length and bailed. I pressed continue and eventually it generated the table, but it wasn't inline, it was a JSX artifact, and I've now hit my daily usage limit.
fixxation92
I find it absolutely mind-blowing to witness the rate at which Anthropic can ship new features. Only a year ago I couldn't wait to see some sort of GitHub integration, and then it appeared only a week later. Seriously impressive stuff.
Gareth321
I asked it to do some portfolio analysis for me and it created BEAUTIFUL, tabbed, interactive charts UNPROMPTED. This is kind of magical. The charts were not just beautiful, but actually super useful in understanding the data faster. I honestly could not have produced those in a week if you asked me to.
atonse
Wow, I asked it to build me a simple diagram explaining agile development and it did an amazing job. Wow it felt magical to watch that diagram slowly animating to life.
Like a much prettier version of Mermaid.
Kudos, Anthropic. Geez, this is so nice.
Now I'm going to ask it to draw a diagram of a pelican riding a bicycle, why not?
wuweiaxin
The artifact output model is more useful than it looks at first. We use Claude in a multi-agent pipeline and discovered that structured artifact outputs reduce parse errors significantly compared to freeform text responses -- the model seems to reason differently when it knows the output will be rendered. Curious whether Anthropic sees similar quality improvements in tool-use tasks when the output has a concrete format constraint.
JoshGG
This is pretty neat and I am experimenting with it now, but hasn't ChatGPT had capability to create graphs and interact with data for a while? "ChatGPT advanced data analysis" for example. I'm asking in good faith as maybe some of you have been using that and can compare the two and give an informed opinion.
I usually use a lot of other tools for data analysis, or write code with Claude Code or another LLM to do data analysis and visualization.
When using Claude Code, we often prompt it to draft diagrams in MermaidJS syntax.
Great for summarizing a multi-step process and quick to render with simple tools.
asim
It was inevitable. We're heading to the point where all apps disappear and AI is the entry point for all work. You can see how anything required can appear based on a single request. After that, world models and other, more dynamic forms of interaction will make sense, and we'll need something that's not a screen.
gkfasdfasdf
I would love to know how they built this. Did they use json-render [0], openui [1], or roll their own?
Interesting. So if I'm reading this correctly, this is distinctly different than the artifacts that Claude creates? If that's the case, why create it inline as opposed to an artifact? Any time I get a visual, I tend to find them so useful that I _want_ them to be an artifact that I can export and share.
johsole
love to see it, my auto researcher is getting more capable with less effort every release
shiftyck
Claude is broken for me since this was released, prompts are just timing out and stopping after 10 attempts
I_am_tiberius
Does anyone know which library they use? Or something developed internally?
jzig
Unable to reproduce the recipe image in the iOS app. It first gave a normal text answer. Then when referencing this blog post it produced a wonky HTML artifact.
wuweiaxin
Reliability has been the real bottleneck for multi-agent setups in production. The hard part isn't getting one agent to do something clever once; it's making repeated runs observable and bounded when tools fail halfway through. Idempotency checks, explicit handoff state, and human review gates have mattered more for us than adding another model or another agent role.
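For illustration only, here is a minimal sketch of what idempotency checks, explicit handoff state, and bounded retries could look like. Every name in it (`HandoffState`, `run_step`, `flaky_tool`) is hypothetical and not taken from the commenter's pipeline; the "human review gate" is reduced to a log entry plus a raised error.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class HandoffState:
    """Explicit state handed from one agent to the next (hypothetical shape)."""
    results: Dict[str, Any] = field(default_factory=dict)  # step key -> output
    log: List[str] = field(default_factory=list)           # observable trail of what happened


def run_step(state: HandoffState, key: str, action: Callable[[], Any], max_attempts: int = 3) -> Any:
    """Run action at most max_attempts times; skip it if key already has a result."""
    if key in state.results:  # idempotency check: never redo finished work
        state.log.append(f"{key}: skipped (already done)")
        return state.results[key]
    for attempt in range(1, max_attempts + 1):
        try:
            result = action()
        except Exception as exc:  # a tool failing halfway through
            state.log.append(f"{key}: attempt {attempt} failed: {exc}")
            continue
        state.results[key] = result
        state.log.append(f"{key}: ok on attempt {attempt}")
        return result
    # Bounded: stop retrying and escalate instead of looping forever.
    state.log.append(f"{key}: needs human review")
    raise RuntimeError(f"{key} failed after {max_attempts} attempts")


calls = {"n": 0}


def flaky_tool() -> str:
    """Stand-in tool that times out on its first call only."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("tool timed out")
    return "report-v1"


state = HandoffState()
first = run_step(state, "fetch_report", flaky_tool)
second = run_step(state, "fetch_report", flaky_tool)  # served from state, tool not called again
```

The log doubles as the observable record of each run, and the stored results let a rerun pick up where the last one stopped.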
atonse
Anyone else able to use Claude with Excel? I've tried adding it to our (very small) Office365 org and it just fails. Been failing for months.
mehdibl
Isn't this mainly a skill injected into the context, rather than a model/platform-specific feature?
groby_b
Aaand all the way at the bottom, there it is. The first glimpse of what will be an ad carousel.
(Literally nobody needs an image of a cake when asking for a cake recipe)
article about the ChatGPT charts and graphs https://www.zdnet.com/article/how-to-use-chatgpt-to-make-cha...
[0]: https://github.com/vercel-labs/json-render
[1]: https://github.com/thesysdev/openui
Interactive slop is still slop.