Text-to-video comes to life!
3 innovation and digital news items in 1 minute. Every Monday. Episode 367
Meta's latest research, called Emu
Emu is a foundation model (diffusion-based) for both video generation and editing (Emu Video and Emu Edit). Video generation is "factorised", i.e. split into two steps: first generate an image from the prompt, then generate a video conditioned on that image. This split allows for better video resolution. Emu Edit can then rework the result, also through prompts (see the sketch below). Opinion: Meta AI has been strong throughout this generative AI sequence, on three fronts: the open-source LLM Llama 2, new research like Emu, and a leader, Yann LeCun, who communicates a lot. That communication is clearly committed to open-source AI rather than the commercial-only approach of OpenAI and Microsoft.
A very smart Meta on the AI scene!
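For readers who like to see the idea in code, here is a minimal, purely illustrative Python sketch of the factorised two-step flow described above. Emu is not released as a library, so both model calls are stand-ins (random arrays) that only show the data flow: text to image, then image to video.

```python
# Illustrative sketch of Emu Video's factorised generation, as described in
# the item above. The two "models" below are placeholders (random NumPy
# arrays), not Meta's actual code; only the two-step structure is the point.
import numpy as np

def text_to_image(prompt: str, size: int = 512) -> np.ndarray:
    # Stand-in for step 1: a text-to-image model renders one detailed frame.
    return np.random.rand(size, size, 3)

def image_to_video(first_frame: np.ndarray, prompt: str,
                   num_frames: int = 16) -> np.ndarray:
    # Stand-in for step 2: a video model animates that frame, conditioned on
    # both the image and the original text prompt.
    h, w, c = first_frame.shape
    return np.random.rand(num_frames, h, w, c)

def generate_video(prompt: str) -> np.ndarray:
    # The "factorisation": image quality and motion are solved separately,
    # which lets the image step run at higher resolution than a one-shot
    # text-to-video model would allow.
    frame = text_to_image(prompt)
    return image_to_video(frame, prompt)

clip = generate_video("a dog surfing a wave at sunset")
print(clip.shape)  # (16, 512, 512, 3): frames x height x width x channels
```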
Runway, with Google behind it
The leader in video generation is Runway, a New York-based start-up financed by Google (about 50% of its $238m funding). Runway has launched Gen-2, the newest version of its model, with four modes: text-to-video, text + image-to-video, image-to-video, and "stylisation" (transferring the style of an input to an output); see the sketch below. Opinion: Runway describes itself as a research company, and indeed they are pushing the boundaries of video generation. The results are very good, but you can still spot a telltale style, and the motion remains slow. We are not yet at the stage where the videos look genuine.
Let’s follow Runway and their next steps in 2024.
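Again purely illustrative: a short Python sketch of how Gen-2's four modes map to combinations of inputs. This is not Runway's actual API; the request fields and function are hypothetical, made up only to show which inputs each mode accepts.

```python
# Hypothetical request builder for Gen-2's four modes (not Runway's API).
from dataclasses import dataclass
from typing import Optional

@dataclass
class Gen2Request:
    prompt: Optional[str] = None      # text description
    image_path: Optional[str] = None  # reference/driving image
    style_path: Optional[str] = None  # style reference for "stylisation"

def mode_of(req: Gen2Request) -> str:
    # Infer which of the four modes the supplied inputs correspond to.
    if req.style_path:
        return "stylisation"          # apply an input's style to the output
    if req.prompt and req.image_path:
        return "text+image-to-video"
    if req.image_path:
        return "image-to-video"
    if req.prompt:
        return "text-to-video"
    raise ValueError("need at least a prompt or an image")

print(mode_of(Gen2Request(prompt="a foggy harbor at dawn")))  # text-to-video
```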
New kid on the block: Pika
Yes, a new kid on the block, with large funding of $110 million. What do they do to deserve such an investment? Generation of "films" (not just videos) from still images and captions. Pika 1.0 is billed as "idea-to-video" and is in beta. Opinion: Retry and reprompt are separate buttons for reworking the result. We are on the waiting list to try this new AI. The first image on the site is "Elon Musk in a space suit. 3D generation"; see the pic.
We'll follow up on Pika once we've tested it; the beta link just arrived!