Follow the instructions below to test our project!
- two notebook files (code_final.ipynb, preprocessing.ipynb)
- two original images (original1.jpg, original2.jpg)
- two preprocessed images (pic1.jpg, pic2.jpg)
- one original video (original_video.mp4)
- one preprocessed video (2.mp4)
- two .txt files of facial landmarks from pic1.jpg and pic2.jpg (pzy1.txt, pzy2.txt)
- two .zip files (ldm.zip, checkpoints.zip): ldm.zip contains landmarks for every frame of 2.mp4; checkpoints.zip contains the pretrained models that we will use later.
*If you already have pic1.jpg, pic2.jpg, 2.mp4, ldm.zip, pzy1.txt, and pzy2.txt, you can skip preprocessing.ipynb.
-
You can align your original images or videos and obtain the facial landmark files with preprocessing.ipynb.
-
From original1.jpg, original2.jpg, and original_video.mp4, you will get pic1.jpg, pic2.jpg, 2.mp4, ldm.zip, pzy1.txt, and pzy2.txt.
-
In preprocessing.ipynb, first upload original1.jpg, original2.jpg, and original_video.mp4, then follow every step in the sections "Image Alignment", "Video Alignment", "Extract landmarks for every frame in the video", and "Record the difference between the landmarks of input image and every frame".
-
Every component:
- Image Alignment: crops and rotates the original photo and resizes it to 256×256 (see the alignment sketch after this list)
- Video Alignment: crops and rotates every frame of the video and resizes it to 256×256
- Extract landmarks for every frame in the video: given a video, saves the landmarks of every frame as .txt files in a folder (see the landmark sketch after this list)
- Record the difference between the landmarks of input image and every frame: saves the landmark difference between each input image and every frame of the video; the output files are pzy1.txt and pzy2.txt
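For orientation, here is a minimal sketch of what the alignment step could look like: rotate the image so the eyes are level, crop around the detected landmarks, and resize to 256×256. The helper name, the margin factor, and the use of OpenCV are illustrative assumptions; the notebook's actual crop parameters may differ.

```python
import cv2
import numpy as np

def align_face(img, landmarks, size=256):
    # landmarks: (68, 2) array in the 68-point convention,
    # where points 36-41 are the left eye and 42-47 the right eye
    left_eye = landmarks[36:42].mean(axis=0)
    right_eye = landmarks[42:48].mean(axis=0)
    # rotate about the eye midpoint so the inter-eye line is horizontal
    angle = np.degrees(np.arctan2(right_eye[1] - left_eye[1],
                                  right_eye[0] - left_eye[0]))
    cx, cy = (left_eye + right_eye) / 2
    M = cv2.getRotationMatrix2D((float(cx), float(cy)), angle, 1.0)
    rotated = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
    # crop around the rotated landmarks with some margin, then resize
    pts = cv2.transform(landmarks[None].astype(np.float32), M)[0]
    x, y, w, h = cv2.boundingRect(pts.astype(np.int32))
    m = int(0.4 * max(w, h))          # margin factor is a guess
    crop = rotated[max(0, y - m):y + h + m, max(0, x - m):x + w + m]
    return cv2.resize(crop, (size, size))
```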
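Likewise, a rough sketch of the landmark-extraction and difference-recording steps, assuming the face_alignment package (https://github.com/1adrianb/face-alignment) as the detector and a mean L2 distance as the stored difference; the exact detector and metric used by the notebook may differ.

```python
import glob
import os

import face_alignment
import imageio
import numpy as np

# LandmarksType.TWO_D in recent face_alignment releases (older: _2D)
fa = face_alignment.FaceAlignment(face_alignment.LandmarksType.TWO_D,
                                  device='cpu')

# 1) save one 68x2 landmark .txt per frame (what ldm.zip contains)
os.makedirs('ldm', exist_ok=True)
for i, frame in enumerate(imageio.get_reader('2.mp4')):
    preds = fa.get_landmarks(frame)      # list of (68, 2) arrays, or None
    if preds:
        np.savetxt(f'ldm/{i:04d}.txt', preds[0])

# 2) record, per frame, the mean L2 distance to one source image's landmarks
src = fa.get_landmarks(imageio.imread('pic1.jpg'))[0]
diffs = [np.linalg.norm(np.loadtxt(f) - src, axis=1).mean()
         for f in sorted(glob.glob('ldm/*.txt'))]
np.savetxt('pzy1.txt', np.array(diffs))
```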
*code_final.ipynb: the main code for generating the video.
-
First upload pic1.jpg, pic2.jpg, 2.mp4, ldm.zip, checkpoints.zip, pzy1.txt, and pzy2.txt, then follow every step in this notebook.
-
Every component:
-
Please run this notebook on Colab: after you have uploaded all the files, run all cells. There are two outputs: /content/original_output.mp4 is the video generated by the original model, and /content/our_output.mp4 is the video generated by our model.
-
Upload files: the paths should look like:
/content/pic1.jpg
/content/pic2.jpg
/content/2.mp4
/content/pzy1.txt
/content/pzy2.txt
/content/ldm.zip
/content/checkpoints.zip
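An optional sanity check (not part of the notebook) to confirm the uploads landed at these paths before running the remaining cells:

```python
import os

expected = ['/content/pic1.jpg', '/content/pic2.jpg', '/content/2.mp4',
            '/content/pzy1.txt', '/content/pzy2.txt',
            '/content/ldm.zip', '/content/checkpoints.zip']
missing = [p for p in expected if not os.path.exists(p)]
print('missing:', missing or 'none')
```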
-
Python package installation and pre-trained model download: this part prepares the environment.
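The notebook's own install cells are authoritative; for orientation, a Colab setup for the sketches in this README would look roughly like the following (the package list is an assumption, not the notebook's actual list):

```python
# illustrative only; run the notebook's install cells instead
!pip install face-alignment imageio imageio-ffmpeg opencv-python-headless
!unzip -o /content/checkpoints.zip -d /content/
!unzip -o /content/ldm.zip -d /content/
```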
-
Download facial landmark detection model: this part detects the facial landmark differences between the source images and the driving frames, which are used to calculate the weights for the two source images.
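As a sketch of the weighting idea (the notebook's exact formula may differ): a source image whose landmarks are closer to the current driving frame gets a larger weight, for example via inverse-distance normalization of the values recorded in pzy1.txt and pzy2.txt:

```python
import numpy as np

d1 = np.loadtxt('/content/pzy1.txt')   # per-frame distance to pic1.jpg
d2 = np.loadtxt('/content/pzy2.txt')   # per-frame distance to pic2.jpg
w1 = d2 / (d1 + d2 + 1e-8)             # closer source -> larger weight
w2 = 1.0 - w1
```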
-
Download StyleGAN and StyleGAN encoder models: this part builds the StyleGAN and StyleGAN encoder models and loads their pretrained parameters. They are then used to merge the two output frames generated from the two source images.
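Conceptually, the merge could look like the following sketch, where `encoder` and `generator` are hypothetical stand-ins for the models loaded from checkpoints.zip and w1, w2 are the per-frame weights from the previous step; the real interfaces are defined in the notebook:

```python
import torch

def merge_frames(frame1, frame2, w1, w2, encoder, generator):
    # frame1/frame2: output frames driven from pic1.jpg and pic2.jpg
    with torch.no_grad():
        z1 = encoder(frame1)        # latent code of the first output frame
        z2 = encoder(frame2)        # latent code of the second output frame
        z = w1 * z1 + w2 * z2       # weighted blend in latent space
        return generator(z)         # decode the blended latent
```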
-
Video generated by the original First Order Motion model: this part shows the output video produced by the original model. The output video's path is /content/original_output.mp4.
-
Video generated by our model: this part shows the output video produced by our improved model. The output video's path is /content/our_output.mp4.
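If the inline players do not render, one common (not notebook-specific) way to preview an output video in Colab:

```python
# run in its own Colab cell so the HTML object is displayed
from base64 import b64encode
from IPython.display import HTML

data = b64encode(open('/content/our_output.mp4', 'rb').read()).decode()
HTML(f'<video width="256" controls src="data:video/mp4;base64,{data}">'
     '</video>')
```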
-