DALLE2 Video

Direct application of DALLE-2 to video synthesis, using factored space-time UNet and Transformers

DALLE2 Video (wip)

**Only to be built after the DALLE2 image model is done and replicated, and the importance of the prior network is validated.**

CLIP could be handled in one of two ways: (1) design a 3D CLIP that uses TimeSformer for the video encoding, or (2) use a 2D CLIP with attention pooling at the end over all per-frame image embeddings, with time positional encoding.
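
As a rough illustration of approach (2), the sketch below pools per-frame embeddings into a single video embedding with one attention query. It is not code from this repository. The function names `time_positional_encoding` and `attention_pool` are hypothetical, the sinusoidal form of the time encoding is an assumption (the note above only says "time positional encoding"), and the key/value projections are left as identity with a fixed query, whereas a real model would learn all of them.

```python
import math

def time_positional_encoding(num_frames, dim):
    """Sinusoidal encoding of the frame index, one vector per frame (assumed form)."""
    enc = []
    for t in range(num_frames):
        row = []
        for i in range(dim):
            angle = t / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        enc.append(row)
    return enc

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frame_embeddings, query):
    """Pool T per-frame embeddings (each of length D) into one length-D vector.

    frame_embeddings: list of T vectors, e.g. outputs of a 2D CLIP image encoder.
    query: one vector of length D (learned in a real model, fixed here).
    Returns (pooled_vector, attention_weights).
    """
    num_frames = len(frame_embeddings)
    dim = len(query)
    pos = time_positional_encoding(num_frames, dim)
    # Add the time encoding so the pooling can depend on frame order.
    keys = [[x + p for x, p in zip(frame, pe)]
            for frame, pe in zip(frame_embeddings, pos)]
    # Scaled dot-product scores between the single query and each frame.
    scale = 1.0 / math.sqrt(dim)
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    # Weighted sum of the (position-augmented) frame embeddings.
    pooled = [sum(w * key[d] for w, key in zip(weights, keys))
              for d in range(dim)]
    return pooled, weights

# Three toy "frame embeddings" of dimension 4.
frames = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
query = [0.5, 0.5, 0.5, 0.5]
video_embedding, weights = attention_pool(frames, query)
```

The pooled vector has the same dimension as a single image embedding, so it could be dropped into the usual CLIP contrastive setup in place of the image embedding. Approach (1) would instead change the encoder itself.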

For all you academics out there emailing me, you are free to publish this idea without citing me at all. I'm just after the world of infinite machine dreams, and I don't really care how we get there.

README Source: lucidrains/DALLE2-video
Stars: 103
Open Issues: 0
Last Commit: 1 year ago
License: MIT