"We'll show you what Sora can do. Reply to this tweet with a prompt, and we'll create a video from it," wrote Sam Altman, CEO of OpenAI, on his personal X account.
The announcement of Sora, OpenAI's latest text-to-video model, was met with an immediate flood of responses. One account replied with the prompt, "A bicycle race on the sea with sea creatures participating, with the camera angle taken from a drone."
It didn't take long for Altman to post the video Sora generated from that user's prompt.
Along with the CEO's tweet, OpenAI's account also demonstrated Sora's capabilities with the prompt, "A beautiful and snowy Tokyo city. The camera moves towards the busy streets, where some people are enjoying the snow and shopping in nearby stores. Cherry blossoms and snowflakes float in the wind."
The result? A stunningly hyper-realistic video!
What Makes Sora Special?
Sora's capabilities are indeed impressive, but they are not entirely surprising. Generative AI models had already made significant advances in text-to-image, speech-to-text, and text-to-audio generation, among other areas.
So, the emergence of text-to-video generators was just a matter of time. Even before Sora, there were similar text-to-video generators, like Google's Lumiere.
But what makes Sora different?
On its website, OpenAI explains that Sora is a diffusion model: it starts from a video that looks like random noise and gradually removes that noise over many steps until the result matches the user's prompt. Sora builds on OpenAI's earlier work on DALL-E and the GPT models behind ChatGPT. Like DALL-E, it relies on diffusion; like the GPT models, it uses a transformer architecture. ChatGPT itself is not a diffusion model.
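To make the "start from noise, then refine step by step" idea concrete, here is a toy sketch in Python. It is not Sora's code and not a real diffusion model: the function names, the four-number "image", and the `target` list are all hypothetical. In a real system, a trained neural network conditioned on the text prompt predicts what noise to remove at each step; here the known `target` simply stands in for that prediction.

```python
import random


def toy_denoise_step(sample, target, step, total_steps):
    """Move the noisy sample one step closer to the target.

    Assumption: `target` stands in for what a trained, prompt-conditioned
    network would predict. A real model never sees the answer directly.
    """
    remaining = total_steps - step
    return [s + (t - s) / remaining for s, t in zip(sample, target)]


def generate(target, total_steps=50, seed=0):
    """Start from pure random noise and refine it over many small steps."""
    rng = random.Random(seed)
    sample = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise
    for step in range(total_steps):
        sample = toy_denoise_step(sample, target, step, total_steps)
    return sample


# A hypothetical 4-value "image" that the prompt is supposed to describe.
target = [0.2, 0.8, 0.5, 0.1]
result = generate(target)

# Largest remaining difference between the output and the target.
error = max(abs(r - t) for r, t in zip(result, target))
print(error)
```

Each pass removes a fraction of the remaining gap between the noisy sample and the target, so the output sharpens gradually rather than appearing all at once. That gradual refinement is the part of the process the toy is meant to illustrate.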
Although Sora's videos can still be distinguished from real ones upon close inspection, the interactions between elements, the emotions displayed, and the details shown are very promising for an AI model.
So promising, in fact, that OpenAI has already arranged meetings with several studios and directors in Hollywood to introduce Sora further to the film industry. It's not far-fetched to imagine a future collaboration between Hollywood and OpenAI to create epic films.
But is Sora like other generative AI models that are suspected of using others' works as training data without permission?
OpenAI Amidst Criticism
Mira Murati, Chief Technology Officer of OpenAI, stated that Sora was trained on publicly available and licensed data. However, when asked if Sora was also trained using data from social media platforms like Facebook, Instagram, or YouTube, Murati appeared hesitant.
"Yes, if those data are in the public domain. But I'm not sure. I'm not too confident in answering that," she said.
OpenAI is no stranger to such questions. In July 2023, three authors—Sarah Silverman, Richard Kadrey, and Christopher Golden—filed a lawsuit against OpenAI and Meta, alleging copyright infringement.
In December 2023, a similar lawsuit was filed by eleven non-fiction authors, claiming that OpenAI and Microsoft used their works without permission for training data.
OpenAI has denied all allegations, stating that the plaintiffs do not have sufficient evidence to prove that the AI model's output is identical to their works.
In line with this argument, a federal judge in California dismissed part of the authors' copyright infringement lawsuit in February 2024.
Despite ongoing debates about copyright and ethics in the use of generative AI, there's no denying that the advent of generative AI like Sora marks a significant advancement in artificial intelligence.