I haven't seen comments regarding a big factor here:
It seems like OpenAI is trying to turn Sora into a social network - TikTok but AI.
The webapp is heavily geared towards consumption, with a feed as the entry point, liking and commenting for posts, and user profiles having a prominent role.
The creation aspect seems about as important as on Instagram, TikTok etc - easily available, but not the primary focus.
Generated videos are very short, with minimal controls. The only selectable option is picking between landscape and portrait mode.
There is no mention of, or move toward, long-form videos, storylines, advanced editing/controls, etc., unlike others in this space (e.g. Google Flow).
Seems like they want to turn this into AITok.
Edit: regarding accurate physics ... check out these two videos:
https://sora.chatgpt.com/p/s_68dc32c7ddb081919e0f38d8e006163...
https://sora.chatgpt.com/p/s_68dc3339c26881918e45f61d9312e95...
To be fair, Veo fails miserably with those prompts also:
https://veo-balldrop.wasmer.app/ballroll.mp4
https://veo-balldrop.wasmer.app/balldrop.mp4
Couldn't help but mock them a little, here is a bit of fun... the prompt adherence is pretty good, at least.
NOTE: there are plenty of quite impressive videos being posted, and a lot of horrible ones also.
samuelfekete
This is a step towards a constant stream of hyper-personalised, AI-generated content optimised for max dopamine.
SeanAnderson
Sheeeeeeeeeeesh. That was so impressive. I had to go back to the start and confirm it said "Everything you're about to see is Sora 2" when I saw Sam do that intro. I thought there was a prologue that was native film before getting to the generated content.
TheAceOfHearts
Really impressive engineering work. The videos have gotten good enough that they can grab your attention and trigger a strong uncanny valley feeling.
I think OpenAI is actually doing a great job of easing people into these new technologies. It's not such a huge leap in capabilities that it's shocking, and it helps people acclimate to what's coming. This version is still limited, but you can tell that in another generation or two it's going to break through some major capabilities threshold.
To give a comparison: in the LLM model space, the big capabilities threshold event for me came with the release of Gemini 2.5 Pro. The models before that were good in various ways, but that was the first model that felt truly magical.
From a creative perspective, it would be ideal if you could first generate a fixed set of assets, locations, and objects, which are then combined and used to bring multiple scenes to life while providing stronger continuity guarantees.
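Even without model support, you can approximate this today at the prompt level. A minimal sketch of the idea (the "asset bible" structure and expansion step are my own assumption, not a feature of any current API):

    # Hypothetical "asset bible" workflow: pin canonical descriptions once,
    # then inline them into every scene prompt for cross-scene continuity.
    assets = {
        "HERO":  "woman in a red coat, short black hair, silver earrings",
        "DINER": "1950s roadside diner, neon sign, checkered floor",
    }

    scenes = [
        "HERO enters the DINER at night, shaking rain off her coat",
        "Close-up: HERO orders coffee at the DINER counter",
    ]

    def expand(scene: str) -> str:
        # Substitute each asset name with its full canonical description,
        # so every clip is generated from identical appearance details.
        for name, desc in assets.items():
            scene = scene.replace(name, f"{name} ({desc})")
        return scene

    prompts = [expand(s) for s in scenes]

Prompt-level pinning only nudges the model, though; real continuity guarantees would need the generator to condition on shared assets directly.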
Josh5
Everyone has the widest eyes in these Sora videos.
seydor
Since AGI is cancelled, at least we have shopping and endless video
nycdatasci
What makes TikTok fun is seeing actual people do crazy stuff. Sora 2 could synthesize someone hitting five full-court shots in a row, but it wouldn't be inspiring or engaging. How will this be different from music-generating AI like Suno, which doesn't have widespread adoption despite incredible capabilities?
polishdude20
There's something about the faces that looks completely off to me. I think it's the way the mouth and whole face moves when they talk.
echelon
I'm a software engineer and hobbyist actor/director. My friends are in the film industry and are in IATSE and SAG-AFTRA. I've made photons-on-glass films for decades, and I frequently film stuff with my friends for festivals.
I love this AI video technology.
Here are some of the films my friends and I have been making with AI. These are not "prompted", but instead use a lot of hand animation, rotoscoping, and human voice acting in addition to AI assistance:
https://www.youtube.com/watch?v=H4NFXGMuwpY
https://www.youtube.com/watch?v=tAAiiKteM-U
https://www.youtube.com/watch?v=7x7IZkHiGD8
https://www.youtube.com/watch?v=Tii9uF0nAx4
Here are films from other industry folks. One of them writes for a TV show you probably watch:
https://www.youtube.com/watch?v=FAQWRBCt_5E
https://www.youtube.com/watch?v=t_SgA6ymPuc
https://www.youtube.com/watch?v=OCZC6XmEmK0
I see several incredibly good things happening with this tech:
- More people being able to visually articulate themselves, including "lay" people who typically do not use editing software.
- Creative talent at the bottom rungs being able to reach high with their ambition and pitch grand ideas. With enough effort, they don't even need studio capital anymore. (Think about the tens of thousands of students that go to film school that never get to direct their dream film. That was a lot of us!)
- Smaller studios can start to compete with big studios. A ten person studio in France can now make a well-crafted animation that has more heart and soul than recent by-the-formula Pixar films. It's going to start looking like indie games. Silksong and Undertale and Stardew Valley, but for movies, shows, and shorts. Makoto Shinkai did this once by himself with "Voices of a Distant Star", but it hasn't been oft repeated. Now that is becoming possible.
You can't just "prompt" this stuff. It takes work. (Each of the shorts above took days of effort - something you probably wouldn't know unless you're in the trenches trying to use the tech!)
For people who know how to do a little VFX and editing, and who know the basic rules of storytelling, these tools are remarkable assets that complement an existing skill set. But every shot, every location, every scene is still work. And you have to weave it all into a compelling story with good hooks and visuals. It's multi-layered and complex. Not unlike code.
And another code analogy: think of these models like Claude Code for the creative. An exoskeleton, but not the core driving engineer or vision that draws it all together. You can't prompt a code base, and similarly, you can't prompt a movie. At least not anytime soon.
clgeoio
> Concerns about doomscrolling, addiction, isolation, and RL-sloptimized feeds are top of mind—here is what we are doing about it.
> We are giving users the tools and optionality to be in control of what they see on the feed. Using OpenAI's existing large language models, we have developed a new class of recommender algorithms that can be instructed through natural language. We also have built-in mechanisms to periodically poll users on their wellbeing and proactively give them the option to adjust their feed.
So, nothing?
I can see this being generated and then reposted to TikTok, Meta, etc. for likes and engagement.
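For what it's worth, the mechanism they describe is easy to picture: an LLM re-ranking feed candidates against a steering prompt the user writes in English. A minimal sketch (the prompt format, model choice, and scoring scheme are my assumptions, not OpenAI's actual system):

    # Sketch of a natural-language-instructable re-ranker. Illustrative only.
    from openai import OpenAI

    client = OpenAI()

    def score(steering: str, caption: str) -> float:
        # Ask an LLM how well one candidate matches the user's instructions.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # arbitrary model choice for this sketch
            messages=[
                {"role": "system",
                 "content": "Rate 0-10 how well the video matches the user's "
                            "feed preferences. Reply with a single number."},
                {"role": "user",
                 "content": f"Preferences: {steering}\nVideo: {caption}"},
            ],
        )
        return float(resp.choices[0].message.content.strip())

    def rank_feed(steering: str, candidates: list[dict]) -> list[dict]:
        # Highest-scoring videos first; the user changes the feed by
        # editing `steering` in plain English.
        return sorted(candidates,
                      key=lambda c: score(steering, c["caption"]),
                      reverse=True)

Which is exactly why it amounts to nothing: the same ranker can be steered toward engagement just as easily as away from it.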
neilv
> And we're introducing Cameo, giving you the power to step into any world or scene, and letting your friends cast you in theirs.
To what extent will they (and providers of similar tools) be able to keep anyone from putting anyone else in a video, shown doing and saying whatever the tool user wants?
Will some only protect politicians and celebrities? Will the less-famous/less-powerful of us be harassed, defamed, exploited, scammed, etc.?
alberth
Why do you have to download an app to use Sora 2 (vs it being available on the web like ChatGPT)?
ashu1461
This is a good comparison thread of the capabilities of Sora vs Sora 2:
https://x.com/mattshumer_/status/1973085321928515783
I welcome a world where gullible people begin to doubt everything they see.
jp57
Prediction: we'll see at least one Sora-generated commercial at the Super Bowl this year.
vahid4m
The quality of what I'm seeing is very nice for AI-generated content (I still can't believe it), but the fact that they are mostly showing short clips and not a long, connected, consistent video makes it less impressive.
robotsquidward
It's insanely impressive. At the same time, all these videos look terrible to me. I still get extreme uncanny valley, and it literally makes me sick to my stomach.
squidsoup
A little tangential to this announcement, but is anyone aware of any clean/ethical models for AI video or image generation (i.e. not trained on copyrighted work) that are available publicly?
ezomode
full-on productisation effort -> no AGI in sight
amelius
Nicely cherry-picked.
doikor
Does this survive panning the camera away for 5 to 10 seconds and then back? Or a basic conversation scene where the camera cuts between positions behind each speaker every few seconds?
Basically, proper working persistence of the scene.
sumeruchat
Shameless plug, but I am creating a startup in this space called cleanvideo.cc to tackle some of the issues that will come with fake news videos. https://cleanvideo.cc
LarsDu88
I really hope they have more granular APIs around this.
One use case I'm really excited about is making animated sprites and rotational transformations of artwork using these videogen models. But unlike local open models, these hosted ones never seem to expose things like depth-estimation output heads, aspect-ratio alteration, or other features that would make them useful tools beyond short-form content generation.
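To make "granular" concrete, here is the kind of request shape I mean (entirely hypothetical; no hosted videogen API I know of exposes these fields today):

    # Hypothetical videogen request. None of these fields exist in any
    # current hosted API; this is what an asset-pipeline-friendly API
    # could look like.
    import requests

    payload = {
        "prompt": "8-frame walk cycle of the attached character sprite",
        "init_image_url": "https://example.com/sprite.png",  # start from artwork
        "aspect_ratio": "1:1",                 # arbitrary ratios, not just 16:9
        "camera": {"orbit_degrees": 360},      # controlled subject rotation
        "outputs": ["rgb", "depth", "alpha"],  # depth/matte heads, not just RGB
    }
    resp = requests.post("https://api.example.com/v1/videos", json=payload)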
tptacek
If I were on the OpenAI marketing team, I maybe wouldn't have included the phrase "and letting your friends cast you in their [videos]". It's a little chilling.
bgwalter
What is the target market for this? The videos are not good enough for YouTube. They are unrealistic, nauseating, and dorky. Even now, any YouTube video that contains a hint of "AI" attracts hundreds of scathing comments. People do not want this.
Let me guess: the ultimate market will be teenagers "creating" a Skibidi Toilet and cheap TikTok propaganda videos promoting Gazan oceanfront properties.
NoahZuniga
The TTS is horrible compared to Google's Veo 3.
fersarr
Only iPhone...
dolebirchwood
This makes me less excited about the future of video, not more.
It's technically impressive, but all so very soulless.
When everything fake feels real, will everything real feel fake?
carrozo
Sora 2: Sloppy Seconds
boh
This is the kind of thing people get excited about for the first couple of months and then barely use going forward. It's amazing how quickly the novelty wears off. You realize how necessary meaning/identity/narrative is to media, and how empty it gets (regardless of the output) when those elements are missing.
egeres
I wonder how this will affect the large cinema production companies (Disney, WB, Universal, Sony, Paramount, 20th Century...). The global film market was estimated at $100B in 2023. If the production cost of high-VFX movies like Avengers: Infinity War drops from $300M to just $10K in a couple of years, will companies like Disney restrain themselves to releasing just a few epic movies per year? Or will we be flooded with tons of slop? If this kind of AI content keeps getting better, how will movies sustain our attention and feel 'special'? Will people not care whether an actor is AI or real?
mrcino
So, this is the AI Slop generator for the AI SlipSlop that Altman has announced lately.
Brave new internet, where humans are not needed for any "social" media anymore: AI will generate slop for bots, without any human interaction, in an endless cycle.
rvz
12,000+ "AI startups" have been obliterated.
apetresc
If anyone is feeling generous with one of their four invite codes, I'd really appreciate it. I'm at adrian@apetre.sc.
carabiner
The CEO of Loopt makes a cameo at 1:28 in the YouTube video.
VagabundoP
I hate this vacant technology, tbh. Every video feels like distilled, mindless advert slop.
There's still something off about the movements, faces, and eyes. Gollum features.
drcongo
The AI generated Sam Altman doesn't look even vaguely human.
CSMastermind
Anyone have an invite they want to share with me lol.
ionwake
I think HN is too political. This tech is clearly amazing, and it's great they shipped it; there should be more props, even if it's a billion-dollar company.
unethical_ban
I just had a thought: (spoilers for The Expanse, Hyperion, and A Fire Upon the Deep)
Multiple sci-fi-fantasy tales have been written about technology getting so out of control, either through its own doing or by abuse by a malevolent controller, that society must sever itself from that technology very intentionally and permanently.
I think the idea of AGI and transhumanism is that moment for society. It's hard to put the genie back in the bottle, because multiple adversarial powers are racing to be more powerful than the rest, but maybe the best thing for society would be if every tensor chip disintegrated the moment it came into existence.
I don't see how society is better when everyone can run their own gooner simulation and share videos made of their high school classmates. Or how we'll benefit from being unable to trust any photo or video we see without trusting whoever sends it to us, and even then doubting its veracity. Not being able to hear your spouse's voice on the phone without checking the post-quantum digital signature of their transmission for authenticity.
Society is heading to a less stable, less certain moment than any point in its history, and it is happening within our lifetime.
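(The signature check I'm describing is mechanically mundane, which is what makes the scenario so bleak. A sketch using Ed25519 via PyNaCl as a stand-in; a post-quantum scheme like ML-DSA would slot into the same flow:)

    # Caller signs each audio chunk; receiver verifies before playback.
    import nacl.exceptions
    import nacl.signing

    # Done once, in person: exchange verify keys out of band.
    signing_key = nacl.signing.SigningKey.generate()
    verify_key = signing_key.verify_key

    def sign_chunk(audio: bytes) -> bytes:
        return bytes(signing_key.sign(audio))  # signature || audio

    def verify_chunk(signed: bytes) -> bytes | None:
        try:
            return verify_key.verify(signed)  # the audio, if authentic
        except nacl.exceptions.BadSignatureError:
            return None  # could be anyone's voice model; drop the call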
marcofloriano
Every AI video demonstration is always about funny stuff and fancy situations. We never see videos on art, history, literature, poetry, religion (imagine building a video about the moment Jesus was born)... ducks in a race!? Come on...
So much visual power, yet so little soul power. We are dying.
More discussion: https://news.ycombinator.com/item?id=45428122
Show me a coherent video that lasts more than 5 seconds and was generated with the model and maybe I'll start to care.
It is very underwhelming. It seems like a step backward. Scam Altman should be replaced before he runs the company into bankruptcy.