I defined corresponding pairs of points between the two images I wanted to merge: portraits of George Clooney and Brad Pitt from Martin's collection. These correspondences were selected using a tool from last year that was linked on the project spec. To create a triangulation that could be used to warp the images into the midway image, I needed correspondences that defined the structure of the faces and photos. I selected 76 correspondences on the two faces, outlining key facial features like the eyes, hair, jawlines, etc.
I created the triangulation using the Delaunay triangulation provided in Python, connecting the points into triangles. Delaunay triangulation is useful because it maximizes the minimum angle of the triangles, which avoids overly thin sliver triangles. The triangulation was computed at a midway shape, (a_pts + b_pts)/2, to lessen potential image deformations. Computing the triangulation on this set produced simplices (indices into the original point arrays that define each triangle). I computed only one triangulation to stay consistent with triangle labeling across both faces.
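For reference, a minimal sketch of this step, assuming the keypoints were loaded into (N, 2) NumPy arrays named a_pts and b_pts (the array names and the use of scipy.spatial.Delaunay are my assumptions about the setup):

```python
import numpy as np
from scipy.spatial import Delaunay

# a_pts, b_pts: (N, 2) arrays of corresponding (x, y) keypoints for the two faces.
midway_pts = (a_pts + b_pts) / 2.0

# A single triangulation on the midway shape keeps the triangle labeling
# consistent across both faces.
tri = Delaunay(midway_pts)

# tri.simplices is a (T, 3) array of point indices; indexing the keypoint
# arrays with it yields the triangle vertices in each shape space.
triangles_a   = a_pts[tri.simplices]        # (T, 3, 2)
triangles_b   = b_pts[tri.simplices]
triangles_mid = midway_pts[tri.simplices]
```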
The triangulations on George Clooney's and Brad Pitt's faces are displayed below, along with the correspondences selected using the online tool.
Next, I computed the mid-way face of George Clooney and Brad Pitt using the Delaunay triangulation from Part 1, which was computed on the midway (average) points of the correspondences selected on both images. The triangles are identified by simplices, indices into the original point arrays; e.g., the simplex [1 3 2] corresponds to the triangle [a_points[1], a_points[3], a_points[2]]. Using the simplices, I created triangles in the A_points space, the Midway points space, and the B_points space.
I computed affine transformation matrices mapping the A_points and B_points spaces to the Midway_points space. Concretely, the vertices of each triangle in the A_points space (taken from the Delaunay triangulation's simplices) are mapped to the Midway_points space by an affine transformation matrix [[a b c] [d e f] [0 0 1]], where the last row preserves the affine nature of the transformation and the six parameters define the mapping. I computed one such matrix per triangle, for both A and B, mapping every triangle into the midway space.
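A sketch of how one such matrix can be solved for, assuming triangle vertices are stored as (3, 2) arrays of (x, y) coordinates as in the snippet above (compute_affine is a hypothetical helper name, not necessarily what I called it):

```python
import numpy as np

def compute_affine(src_tri, dst_tri):
    """Affine matrix mapping the vertices of src_tri onto dst_tri.
    Both arguments are (3, 2) arrays of (x, y) triangle vertices."""
    # Homogeneous vertices: each row is [x, y, 1].
    src_h = np.hstack([src_tri, np.ones((3, 1))])   # (3, 3)
    dst_h = np.hstack([dst_tri, np.ones((3, 1))])   # (3, 3)
    # Solve src_h @ M.T = dst_h; the resulting M has last row [0, 0, 1],
    # i.e. the form [[a, b, c], [d, e, f], [0, 0, 1]].
    return np.linalg.solve(src_h, dst_h).T

# One matrix per triangle, for both A and B, into the midway space.
affines_a_to_mid = [compute_affine(ta, tm) for ta, tm in zip(triangles_a, triangles_mid)]
affines_b_to_mid = [compute_affine(tb, tm) for tb, tm in zip(triangles_b, triangles_mid)]
```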
The final step in the process was to actually warp both the original A and B images to the Midway points space, creating a midway face. To accomplish this, I created inverse affine transformation matrices by inverting the affine matrices from the previous step; these mapped the Midway points back to the A_points space. Then, I used the polygon function in scikit-image to create a "mask" containing the row and column indices of every pixel inside the triangle in the midway space. Finally, I applied the inverse affine transformation matrix to the masked Midway points in a single vectorized operation to map them back to the A_points space.
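Continuing the sketch above for a single triangle (mid_tri would be one row of triangles_mid, and M_a_to_mid one of the affines_a_to_mid matrices; these are illustrative names):

```python
import numpy as np
from skimage.draw import polygon

# Row/column indices of every pixel inside this triangle in the midway image.
rr, cc = polygon(mid_tri[:, 1], mid_tri[:, 0])

# Map those pixels back into image A's coordinate frame in one vectorized step.
M_mid_to_a = np.linalg.inv(M_a_to_mid)            # inverse of the A -> midway matrix
coords = np.vstack([cc, rr, np.ones_like(rr)])    # homogeneous (x, y, 1) columns
mapped = M_mid_to_a @ coords
xs, ys = mapped[0], mapped[1]                     # source coordinates in image A
```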
After this reverse mapping was complete, I sampled the original colors from image A at the mapped A_points coordinates using nearest-neighbor interpolation (bilinear interpolation did not give a noticeable improvement, so to improve performance I implemented nearest-neighbor interpolation by rounding the coordinates to integers and then sampling). After sampling the colors, I wrote them into the Midway image at the indices returned by the polygon function. I performed this process for each triangle in the Midway space, for both A and B, and then averaged the two results to get the midway face between Brad Pitt and George Clooney. "Brad Clooney" is displayed below.
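The sampling and averaging step, continuing directly from the previous snippet (warped_a and warped_b are assumed to be zero-initialized float images the same size as the originals):

```python
# Nearest-neighbor sampling: round the mapped coordinates to integer pixel
# indices, clip them to the image bounds, and read the colors straight from A.
h, w = im_a.shape[:2]
ys_nn = np.clip(np.round(ys).astype(int), 0, h - 1)
xs_nn = np.clip(np.round(xs).astype(int), 0, w - 1)
warped_a[rr, cc] = im_a[ys_nn, xs_nn]

# After every triangle has been filled for both images, average the two warps.
midway_face = (warped_a + warped_b) / 2.0
```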
Next, I created a morph sequence .gif between George Clooney and Brad Pitt to illustrate the transformation process. I wrote a function morphed_im = morph(im1, im2, im1_pts, im2_pts, tri, warp_frac, dissolve_frac) which controlled the shape warping and the cross-dissolve. The two images were warped into an intermediate shape configuration using the warp fraction and then cross-dissolved using the dissolve fraction. Both fractions started at 0, producing the original image A, and ended at 1, producing the original image B. The warp was defined by the correspondences and triangulation from the previous part.
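A sketch of how such a function can be structured; warp_im_to_shape is a hypothetical helper that runs the per-triangle inverse warp from Part 2 over a whole image:

```python
import numpy as np

def morph(im1, im2, im1_pts, im2_pts, tri, warp_frac, dissolve_frac):
    # Intermediate shape: linear interpolation between the two keypoint sets.
    inter_pts = (1 - warp_frac) * im1_pts + warp_frac * im2_pts
    # Warp both images into the intermediate shape, then cross-dissolve them.
    warped1 = warp_im_to_shape(im1, im1_pts, inter_pts, tri)
    warped2 = warp_im_to_shape(im2, im2_pts, inter_pts, tri)
    return (1 - dissolve_frac) * warped1 + dissolve_frac * warped2
```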
The sequence was created using 45 frames; at each iteration, both fractions were incremented by 1/45th. The gif plays at 30 fps.
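One way to generate and save the sequence, reusing the names from the earlier sketches and assuming float images in [0, 1]; imageio is my assumption for the gif writer, and the exact gif-writing keyword varies by imageio version:

```python
import numpy as np
import imageio

frames = []
for t in np.linspace(0, 1, 45):               # sweep both fractions together
    frame = morph(im_a, im_b, a_pts, b_pts, tri, warp_frac=t, dissolve_frac=t)
    frames.append((np.clip(frame, 0, 1) * 255).astype(np.uint8))

imageio.mimsave('morph.gif', frames, fps=30)  # may be `duration` on newer imageio
```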
Next, I extracted the "mean face" of a population of 37 people from the Danes dataset, which provides predefined correspondences for all of its images. I averaged these correspondences to create midway points as in Part 2, except now working with more than two images, and then created a Delaunay triangulation based on these midway points. From there, the process was similar to Part 2: I warped each image in the dataset into this average shape. After warping each image, I averaged them to find the mean face, summing the warped images and dividing by the total number of images. The facial features of the average face are clearly defined.
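A sketch of the mean-face computation, assuming the dataset's keypoints were loaded into a (37, N, 2) array called shapes and the photos into a list called images, and reusing the hypothetical warp_im_to_shape helper from the morph sketch:

```python
import numpy as np
from scipy.spatial import Delaunay

mean_shape = shapes.mean(axis=0)              # average keypoint configuration
tri = Delaunay(mean_shape)

# Warp every face into the mean shape, then average the warped images.
warped = [warp_im_to_shape(im, pts, mean_shape, tri)
          for im, pts in zip(images, shapes)]
mean_face = np.mean(warped, axis=0)
```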
To warp my face into the average, and the average into my face, I defined correspondences on the resulting average face and on my face. Then, I defined a Delaunay triangulation based on the average face's keypoints and warped my face into the shape defined by that triangulation. I did the reverse as well, warping the average face into a triangulation defined by the correspondences selected on my face. I had to resize my photo and reposition my face to match the average face closely; otherwise the warping did not give good results. Both images had a resolution of 640 x 480.
Original images
Morphed into population average
Next, I extrapolated caricatures of my face from the population mean defined in the previous step. To do this, I first calculated the difference between the correspondences for the average face and those selected on my face in the previous parts. Then, I defined a parameter to control the strength of the caricature, which I set to 3. I multiplied the difference by this parameter and added the result to the average correspondence keypoints, creating a triangulation from these new keypoints.
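The keypoint extrapolation in code, assuming the difference is taken as my keypoints minus the mean keypoints (my_pts and mean_pts are illustrative names):

```python
from scipy.spatial import Delaunay

# alpha > 1 exaggerates how my face differs from the population mean; I used 3.
alpha = 3
caricature_pts = mean_pts + alpha * (my_pts - mean_pts)
caricature_tri = Delaunay(caricature_pts)     # triangulation on the new keypoints
```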
The warping process was then similar to previous parts: I computed affine mappings from my face's correspondences to the new keypoints defined above and warped my face into the triangulation defined by those keypoints, resulting in a caricature.
I chose to do the Bells and Whistles that changes the ethnicity of a face by warping the image into the average shape of a population. I picked the average image of a man of Chinese descent and an image of former President Obama. I then chose correspondences on both images, outlining key features like the eyes, nose, and hairline, to create an accurate, informative triangulation. After this was complete, I used the process described in Part 2 to warp President Obama's face into the average Chinese face shape, which led to subtle changes in President Obama's facial features that reflect the morphing.
I had fun learning how to use affine transformations and Delaunay triangulations to manipulate images into unique shapes.