In this project, we will experiment with frequencies and their applications to various image processing and image manipulation techniques. We will start with edge detection using the finite difference operator as well as the Derivative of Gaussian (DoG) filter. From there, we will explore image sharpening, hybrid images, and image blending.
We can use the finite difference operators (Dx = [1, -1] and Dy = [[1], [-1]]) to get the partial derivatives of the image. If we convolve our cameraman image with the Dx filter, we can identify the vertical edges; if we convolve with the Dy filter, we can identify the horizontal edges instead. From here, we can compute the gradient magnitude image using the formula np.sqrt(partial_deriv_x**2 + partial_deriv_y**2). To better visualize this, we can binarize the gradient magnitude image, setting each value to 1 if it is greater than a threshold (which I chose to be 0.25) and 0 otherwise.
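The steps above can be sketched as follows. This is a minimal sketch, assuming grayscale float images in [0, 1] and SciPy's `convolve2d` with symmetric padding (the writeup doesn't specify which convolution routine or boundary handling it used):

```python
import numpy as np
from scipy.signal import convolve2d

def binarized_gradient_magnitude(im, threshold=0.25):
    """Finite-difference edge detection: convolve with Dx and Dy,
    take the gradient magnitude, then binarize with a threshold
    (0.25 is the value chosen in the writeup)."""
    dx = np.array([[1, -1]])    # horizontal finite difference
    dy = np.array([[1], [-1]])  # vertical finite difference
    partial_deriv_x = convolve2d(im, dx, mode='same', boundary='symm')
    partial_deriv_y = convolve2d(im, dy, mode='same', boundary='symm')
    grad_mag = np.sqrt(partial_deriv_x**2 + partial_deriv_y**2)
    return (grad_mag > threshold).astype(float)
```

A sharp vertical step in intensity yields a large |Dx| response along that column, which survives the threshold.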
We can see that there's a lot of noise in the binarized edge image, so we will now blur the cameraman image before convolving it with the finite difference operators. A Gaussian filter can be used to blur the image; I chose kernel size = 7 and sigma = 1. Convolving the cameraman image with a 2D Gaussian kernel 'smooths' or blurs it, reducing the drastic intensity changes that cause 'noise' in edge detection. Now we can repeat the steps from above to get the partial derivatives and gradient magnitude image. For the binarized gradient magnitude after blurring, I chose a threshold of 0.08. Compared to the edge map produced by the previous part, there's much less noise, which produces cleaner, thicker edges.
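The blur-then-differentiate pipeline can be sketched like this. Again a minimal sketch under the same assumptions (grayscale float images, SciPy convolution with symmetric padding); the 2D Gaussian is built as the outer product of a normalized 1D Gaussian:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel_2d(ksize=7, sigma=1.0):
    """Normalized 2D Gaussian kernel via the outer product of a 1D Gaussian."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g1d = np.exp(-ax**2 / (2 * sigma**2))
    g1d /= g1d.sum()
    return np.outer(g1d, g1d)

def blurred_edges(im, ksize=7, sigma=1.0, threshold=0.08):
    """Blur first (low-pass), then take finite differences and binarize.
    Parameters match the writeup: kernel size 7, sigma 1, threshold 0.08."""
    g = gaussian_kernel_2d(ksize, sigma)
    blurred = convolve2d(im, g, mode='same', boundary='symm')
    dx = np.array([[1, -1]])
    dy = np.array([[1], [-1]])
    px = convolve2d(blurred, dx, mode='same', boundary='symm')
    py = convolve2d(blurred, dy, mode='same', boundary='symm')
    return (np.sqrt(px**2 + py**2) > threshold).astype(float)
```

Note the lower threshold (0.08 vs. 0.25): blurring shrinks gradient magnitudes, so the cutoff must drop with it.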
This process can be sped up slightly by noting that convolutions are associative. This allows us to first convolve the finite difference operators with the 2D Gaussian Kernel. These can then be convolved with the original (unblurred) cameraman image to create the Derivative of Gaussian (DoG) X and Y convolutions and, as a result, acquire the DoG Gradient Magnitude Image. For the binarized DoG Gradient Magnitude Image threshold, I chose 0.08 again. As we can see, the edge map produced is identical to the unoptimized approach.
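The single-pass version can be sketched as below. The key point is that folding the tiny difference operators into the Gaussian once produces two small DoG filters, which are then applied directly to the unblurred image (same assumptions as before; the kernel shapes are illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

def dog_edges(im, g, threshold=0.08):
    """DoG edge detection: by associativity of convolution,
    (im * g) * dx == im * (g * dx), so we precompute g * dx and g * dy
    and convolve the raw image with those combined filters."""
    dx = np.array([[1, -1]])
    dy = np.array([[1], [-1]])
    dog_x = convolve2d(g, dx)  # DoG filter in x (e.g. 7x8 for a 7x7 Gaussian)
    dog_y = convolve2d(g, dy)  # DoG filter in y (e.g. 8x7)
    px = convolve2d(im, dog_x, mode='same', boundary='symm')
    py = convolve2d(im, dog_y, mode='same', boundary='symm')
    return (np.sqrt(px**2 + py**2) > threshold).astype(float)
```

Away from boundary effects the two orderings agree exactly, which is why the resulting edge map matches the two-step approach.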
Here, we will sharpen a blurry image using the unsharp masking technique. First, we will convolve the original image of the Taj Mahal with a 2D Gaussian kernel of kernel size = 7 and sigma = 1. This blurs the image and acts as a low-pass filter, leaving only the low frequencies behind. If we then subtract the blurred image from the original image, we obtain the high frequencies of the image. We can then add the high frequencies back to the original image, scaling them by some factor alpha if necessary, to make it 'sharper'.
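Unsharp masking can be sketched as follows, under the same assumptions (grayscale float images in [0, 1], SciPy convolution); the `alpha` parameter name mirrors the writeup's scaling factor:

```python
import numpy as np
from scipy.signal import convolve2d

def unsharp_mask(im, g, alpha=1.0):
    """Sharpen by adding back scaled high frequencies:
    sharpened = im + alpha * (im - blur(im)).
    `g` is a normalized 2D Gaussian kernel (size 7, sigma 1 in the writeup)."""
    blurred = convolve2d(im, g, mode='same', boundary='symm')
    high_freq = im - blurred  # high-pass: original minus low-pass
    return np.clip(im + alpha * high_freq, 0.0, 1.0)
```

Equivalently, this is a single convolution with the filter (1 + alpha) * impulse - alpha * g, sometimes called the unsharp mask filter.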
Here are some other images that I sharpened using the unsharp masking technique.
In this part, we will create hybrid images: static images that we perceive differently based on viewing distance. Again, we can exploit image frequencies, specifically the fact that humans see high frequencies when close to an image while low frequencies dominate from farther away. We will take two images and get the low frequencies of one by blurring it with a Gaussian filter. For the other image, we will also blur it to get its low frequencies, then subtract those from the original to get the high frequencies. With the low frequencies of image A and the high frequencies of image B, we can add them together to create the 'hybrid' image.
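The construction can be sketched as below, assuming the two images are already aligned and the same size; the two Gaussian kernels (one per image) stand in for the low- and high-frequency cutoffs, whose exact values the writeup doesn't specify:

```python
import numpy as np
from scipy.signal import convolve2d

def hybrid_image(im_a, im_b, g_low, g_high):
    """Hybrid image: low frequencies of im_a plus high frequencies of im_b.
    g_low / g_high are normalized Gaussian kernels controlling each cutoff
    (hypothetical parameters -- tune per image pair)."""
    low = convolve2d(im_a, g_low, mode='same', boundary='symm')
    high = im_b - convolve2d(im_b, g_high, mode='same', boundary='symm')
    return np.clip(low + high, 0.0, 1.0)
```

Viewed up close, the high-frequency image B dominates perception; from a distance, only the low-frequency image A remains visible.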
Here's a hybrid image of Max Verstappen and Charles Leclerc, two of the best Formula 1 drivers on the grid as of late.
Frog and Rat Hybrid Image
Here's one that didn't work as well as I wanted it to. I tried to mix Gekko and Wingman, but the two input images had different features and backgrounds, which also made them difficult to align properly.
Gekko and Wingman Hybrid Image
We will create Gaussian and Laplacian stacks, which are useful for blending two images together seamlessly. For the Gaussian stack, we repeatedly blur the image with a Gaussian filter, each time appending the result to the list and using it as the input to the next blur; we can choose the number of levels G. This gives a list of progressively lower-frequency versions of the image. The Gaussian stack can then be used to create the Laplacian stack: subtracting each pair of adjacent Gaussian levels produces one Laplacian level (L[i] = G[i] - G[i + 1]), and the last Laplacian entry is set to the last Gaussian entry (L[-1] = G[-1]). This means summing all the Laplacian levels recovers the original image. We will also need a Gaussian stack of the mask, as it is necessary for blending. Once we have the Laplacian stacks of the two images we want to blend and the Gaussian stack of the mask, we can combine them level by level with blend_stack[i] = (im1_l_stack_white[i] * g_stack_mask[i]) + (im2_l_stack_black[i] * (1 - g_stack_mask[i])). Summing all the levels of the blended Laplacian stack yields our final blended image!
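The whole pipeline can be sketched as below. This is a minimal single-channel sketch (the writeup's images are presumably RGB, where each channel is processed the same way), using SciPy convolution with symmetric padding; stacks, unlike pyramids, keep every level at full resolution:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_stack(im, g, levels):
    """Repeatedly blur without downsampling; stack[0] is the original image."""
    stack = [im]
    for _ in range(levels - 1):
        stack.append(convolve2d(stack[-1], g, mode='same', boundary='symm'))
    return stack

def laplacian_stack(g_stack):
    """L[i] = G[i] - G[i+1], with L[-1] = G[-1], so the levels
    telescope back to the original image when summed."""
    diffs = [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)]
    return diffs + [g_stack[-1]]

def blend(im1, im2, mask, g, levels=5):
    """Multiresolution blending: combine the two Laplacian stacks level by
    level, weighted by the Gaussian stack of the mask, then sum."""
    l1 = laplacian_stack(gaussian_stack(im1, g, levels))
    l2 = laplacian_stack(gaussian_stack(im2, g, levels))
    gm = gaussian_stack(mask, g, levels)
    blended = [l1[i] * gm[i] + l2[i] * (1 - gm[i]) for i in range(levels)]
    return np.clip(sum(blended), 0.0, 1.0)
```

Blurring the mask at each level is what hides the seam: coarse levels get a very soft transition while fine levels get a sharp one.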
I used a 5-level stack with kernel size = 21 and sigma = 7 to achieve the following result.
Gaussian + Laplacian Stack of Apple
Gaussian + Laplacian Stack of Orange
Gaussian Stack of Mask
Blended Laplacian Stack
Blended Image - Oraple!
I wanted to try to improve the blending, as a seam can be seen near the top. This time, I used a 100-level stack with kernel size = 100 and sigma = 35.
Blended Image - Improved Oraple!
Here's the blending algorithm applied to some other images.
Here's a blended image of the Las Vegas Sphere and another metallic sphere, which utilized an irregular mask.
The algorithms developed in this project are the backbone of various camera and Photoshop tools. Through working on this project, I learned a lot about using image frequencies to enhance and manipulate images.