Have you ever wondered about making a video from all your photographs the hard way?

Have you ever wondered what happens if you take all the photographs you have made of a particular model and play them back at 30 frames per second?

More specifically, have you thought of:

  • writing software that uses artificial intelligence to find the facial landmarks in your photos

  • then writing some more code to rotate the images so that all the faces are vertical

  • then ordering the images according to the distance in pixels between the eyes

  • then further ordering them according to whether the head is turned to profile or facing the camera straight on

  • then writing some more code to align all the images so that the left eye is in the same place in every frame

  • then turning that collection into a video, showing each image for 1/30th of a second (see the sketch just below)
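Here's a minimal sketch of that last step using FFmpeg, which is one obvious tool for the job rather than necessarily the one used here, and assuming the aligned frames have been written out as frame0001.jpg, frame0002.jpg and so on:

    import subprocess

    # -framerate 30 shows each image for 1/30th of a second.
    subprocess.run([
        "ffmpeg", "-framerate", "30",
        "-i", "frames/frame%04d.jpg",
        "-c:v", "libx264", "-pix_fmt", "yuv420p",  # widely playable H.264
        "out.mp4",
    ], check=True)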

Read on if you want to see more experiments and learn more about the process...

Well, I had too much time on my hands over Christmas!! I did it so you don't have to.

And to be honest I'm not sure it was worth it!

This is an idea that has been bugging me for some years. I have previously made videos using all the images from a photoshoot as a sort of time-lapse. They work reasonably well, not least because each section of images is often taken from very similar positions with similar pose variations, which gives a reasonable flow to the sequence. However, the result is fairly jerky, so I ran it through YouTube's image stabilisation to get this:

But what had been bugging me was that I wanted to take the images from many shoots and meld them in this way. I knew the only way this would really make sense would be if I could align all the faces to the same place, and doing that by hand for thousands of images was not going to happen!

Well, then some experimenting with artificial intelligence engines led me to the point where I could get the co-ordinates of all the facial landmarks in an image. I could also determine whether the face was facing the camera or turned to one side, or indeed tilted backwards or forwards.
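To give a flavour of what that looks like, here is a minimal sketch using dlib and its 68-point landmark model; that's just one readily available engine, not necessarily the one behind these experiments:

    import dlib

    # Requires dlib's 68-point model file, downloadable from dlib.net.
    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    def face_landmarks(path):
        img = dlib.load_rgb_image(path)
        faces = detector(img, 1)          # upsample once to find smaller faces
        if not faces:
            return None
        shape = predictor(img, faces[0])  # first detected face
        return [(shape.part(i).x, shape.part(i).y) for i in range(68)]

    # In the 68-point scheme, indices 36-41 outline one eye and 42-47 the
    # other; a crude yaw estimate compares the nose tip (index 30) with
    # the midpoint between the two eye centres.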

OK, so at this point I've got data and I've got images, but they still don't line up in any useful way.

The first thing to do was to rotate all the images so that the face was consistently vertical in every frame. I wrote some code to take the face database and use ImageMagick to rotate each image by an arbitrary number of degrees. I also had to go back to school and revise my trigonometry, because of course all the facial landmark co-ordinates needed to be rotated too.
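The trigonometry is just a standard 2-D rotation about the image centre. A sketch, assuming the rotation angle has already been worked out from the eye positions:

    import math
    import subprocess

    def rotate_point(x, y, cx, cy, degrees):
        # Standard 2-D rotation about the image centre (cx, cy). With the
        # image y-axis pointing down, a positive angle here matches the
        # clockwise direction of ImageMagick's -rotate.
        t = math.radians(degrees)
        dx, dy = x - cx, y - cy
        return (cx + dx * math.cos(t) - dy * math.sin(t),
                cy + dx * math.sin(t) + dy * math.cos(t))

    # Rotate the image itself with ImageMagick. Note that -rotate grows
    # the canvas to fit the rotated image, so the landmark co-ordinates
    # also need offsetting by half the change in width and height.
    subprocess.run(["convert", "in.jpg", "-rotate", "3.7", "out.jpg"],
                   check=True)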

Then, from the new data, I had to calculate the distance between the eyes; measured in pixels from the image of course, not the actual physical distance!
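That calculation is just Pythagoras on the landmark co-ordinates, for example:

    import math

    def centre(points):
        # Average of a set of (x, y) landmarks, e.g. the six points of one eye.
        xs, ys = zip(*points)
        return (sum(xs) / len(xs), sum(ys) / len(ys))

    def eye_distance(left_eye_pts, right_eye_pts):
        (lx, ly), (rx, ry) = centre(left_eye_pts), centre(right_eye_pts)
        return math.hypot(rx - lx, ry - ly)   # distance in pixels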

I was getting close. The final step was to process all the images so that the left eye was positioned in the same place in every frame, regardless of the original picture size and alignment. The final output also needed to be a consistent frame size for conversion to video. So, some more code was written. At last I was ready.
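In sketch form, that alignment step might look like this with Pillow; the frame size and eye position here are made-up numbers for illustration:

    from PIL import Image

    FRAME = (1920, 1080)   # output frame size (illustrative)
    EYE_AT = (900, 420)    # where the left eye should land (illustrative)

    def align(path, eye_xy, out_path):
        canvas = Image.new("RGB", FRAME, "black")
        # Paste so the detected eye lands exactly at EYE_AT; anything
        # falling outside the canvas is simply cropped away.
        offset = (EYE_AT[0] - int(eye_xy[0]), EYE_AT[1] - int(eye_xy[1]))
        canvas.paste(Image.open(path), offset)
        canvas.save(out_path)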

My first thought was that I could just take my collection of images, process them and view them in order of face size. Basically I wanted to create the effect of the camera "zooming in" on the face. This was the result.
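The ordering itself is the easy part once each image record carries the eye distance computed earlier: a bigger pixel distance means a bigger face, so sorting ascending gives the zoom.

    # Each record is assumed to carry the eye distance computed earlier.
    records = [{"file": "a.jpg", "eye_dist": 41.0},
               {"file": "b.jpg", "eye_dist": 230.5}]
    records.sort(key=lambda r: r["eye_dist"])   # smallest faces first = zoom in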

What I had hoped to see was a sort of subliminal merging of images and faces. What I got was an epileptic strobing of randomness. This seems to be partly because the faces are very disparate, and partly because the source images varied widely in resolution and size.

OK, let's have a plan B. What if all the images were of one model? There would be a lot more consistency in the face then. So I collected all the images I had of Rosa Brighid and repeated the process.

I also realised the direction the face was looking was far too random. I had to sort the faces so that the head turned from left to right. Back to the code. Fortunately I had the data, and with some fiddling around I was able to output the images in the right order.

Except... there was a jerk every time the head turned all the way to the right and then flipped back to the left as the next set of closer images began. So, with a bit more coding and fiddling, I came up with a set of images that (a) zoomed in and (b) swept the head from left to right, then back from right to left.
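One way to get that sweep is to bucket the frames by face size and reverse the yaw sort on alternate buckets. A sketch, assuming each record also carries a yaw estimate from the landmark data:

    def zigzag_order(records, buckets=20):
        # Smallest faces first, as before, then split into size buckets.
        records = sorted(records, key=lambda r: r["eye_dist"])
        size = max(1, len(records) // buckets)
        ordered = []
        for n, i in enumerate(range(0, len(records), size)):
            group = sorted(records[i:i + size], key=lambda r: r["yaw"])
            if n % 2:
                group.reverse()    # alternate the sweep direction
            ordered.extend(group)
        return ordered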

This was sort of getting closer.

Now I felt I was getting somewhere. However, the whole set started too close in; I wanted more of a "zoom" effect, and starting so close in also seemed to increase the flicker. So it was back to the data again, with some resetting of image sizes so that the earlier frames showed more of the whole image.

Getting closer, but I ran one more set with the starting images pulled much further back.

This was the final result.

In this video the images are shown more slowly than in the final version, which I repeat below so you can get the full frenetic finish.

LESSONS LEARNED

Having got to this final version, I have a suspicion that my first idea, with multiple models, would work if I tuned it up.

I also want to try a different version where the faces are all exactly the same size.

Maybe when I've got another couple of days spare I might go back and do this some more.

 
