Sora is one of several AI tools that generate video from text prompts. Credit: OpenAI
How OpenAI’s text-to-video tool Sora could change science – and society
OpenAI’s debut of its impressive Sora text-to-video tool has raised important questions.
The release of OpenAI’s Sora text-to-video AI tool last month was met with a mix of trepidation and excitement from researchers who are concerned about misuse of the technology. The California-based company showcased Sora’s ability to create photorealistic videos from a few short text prompts, with examples including clips of a woman walking down a neon-lit street in Tokyo and a dog jumping between two windowsills.
Tracy Harwood, a digital-culture specialist at De Montfort University in Leicester, UK, says she is “shocked” by the speed at which text-to-video artificial intelligence (AI) has developed. A year ago, people were laughing at an AI-produced video of the US actor Will Smith eating spaghetti. Now some researchers are worried that the technology could upend global politics in 2024.
OpenAI, which also developed ChatGPT and the text-to-image technology DALL·E, debuted Sora on 15 February, announcing that it was making the technology “available to red teamers to assess critical areas for harms or risks”. ‘Red teaming’ refers to the process of conducting simulated attacks or exploitation of a technology to see how it would cope with nefarious activity, such as the creation of misinformation and hateful content, in the real world.
Sora isn’t the first example of text-to-video technology; others include Gen-2, produced by Runway in New York City and released last year, and the Google-led Lumiere, announced in January. Harwood says she has been “underwhelmed” by some of these other offerings. “They are becoming more and more vanilla in what they present to you,” she says, adding that the programs require very specific prompts to get them to produce compelling content.
Misinformation is a major challenge for these text-to-video technologies, Harwood adds. “We’re going to very quickly reach a point in which we are swamped with a barrage of really compelling-looking information. That’s really worrying.”
Credit: Nature News; Date: 12 Mar 2024