Project 2: Fun with Filters and Frequencies!

Ryan Mathew
September 21st, 2024

Overview

In this project, we experiment with image frequencies and their applications to image processing and manipulation. We start with edge detection using the finite difference operators and the Derivative of Gaussian (DoG) filter. From there, we explore image sharpening, hybrid images, and image blending.

Part 1: Fun with Filters

Part 1.1: Finite Difference Operator

We can use the finite difference operators (Dx = [1, -1] and Dy = [[1], [-1]]) to approximate the partial derivatives of the image. Convolving the cameraman image with Dx responds to intensity changes in the x direction and so highlights vertical edges; convolving with Dy highlights horizontal edges instead. From these two partial derivatives, we can compute the Gradient Magnitude Image with np.sqrt(partial_deriv_x**2 + partial_deriv_y**2). To better visualize the edges, we can binarize the gradient magnitude image: each pixel is set to 1 if its value is greater than a threshold and to 0 otherwise. I chose a threshold of 0.25.
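The steps above can be sketched in NumPy as follows. This is a minimal illustration rather than the project's actual code: the function names and the zero-padded border handling are assumptions, and the image is assumed to be a 2D grayscale float array scaled to [0, 1]. Because Dx is a single row and Dy a single column, each 2D convolution reduces to a 1D convolution along one axis.

```python
import numpy as np

# Finite difference operators from the text, stored as 1D taps.
DX = np.array([1.0, -1.0])  # applied along each row (x direction)
DY = np.array([1.0, -1.0])  # applied along each column (y direction)

def partial_derivatives(image):
    """Convolve a 2D grayscale image with Dx and Dy (zero-padded, same size)."""
    deriv_x = np.apply_along_axis(lambda row: np.convolve(row, DX, mode="same"), 1, image)
    deriv_y = np.apply_along_axis(lambda col: np.convolve(col, DY, mode="same"), 0, image)
    return deriv_x, deriv_y

def gradient_magnitude(image):
    """Per-pixel magnitude of the (deriv_x, deriv_y) gradient vector."""
    deriv_x, deriv_y = partial_derivatives(image)
    return np.sqrt(deriv_x**2 + deriv_y**2)

def binarize(magnitude, threshold=0.25):
    """1 where the gradient magnitude exceeds the threshold, 0 elsewhere."""
    return (magnitude > threshold).astype(np.uint8)

# Toy example: dark left half, bright right half, so one vertical edge.
toy = np.zeros((6, 8))
toy[:, 4:] = 1.0
edges = binarize(gradient_magnitude(toy))
```

Note that zero padding can create a false response along an image border wherever the border pixels are bright; cropping the border or padding by replication avoids that.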

Figure: cameraman.png
Figure: Dx (Vertical Edges)
Figure: Dy (Horizontal Edges)
Figure: Gradient Magnitude
Figure: Binarized Gradient Magnitude

Part 1.2: Derivative of Gaussian (DoG) Filter

The binarized edge image contains a lot of noise, so we now blur the cameraman image before convolving it with the finite difference operators. A Gaussian filter can be used for the blur; I chose kernel size = 7 and sigma = 1. Convolving the cameraman image with a 2D Gaussian kernel smooths the image slightly, which suppresses the abrupt pixel-to-pixel changes that show up as noise in edge detection. We can then repeat the steps from above to get the partial derivatives and the gradient magnitude image. For the binarized gradient magnitude after blurring, I chose a threshold of 0.08. Compared to the edge map from the previous part, there is much less noise, and the edges are cleaner and thicker.
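As a rough sketch of the blurring step, a 2D Gaussian kernel can be built as the outer product of a normalized 1D Gaussian with itself. The helper names below are hypothetical, and the edge-replicating padding is an assumption, since the write-up does not say how borders were handled. Because the Gaussian is separable, the blur applies the 1D kernel along rows and then columns, which gives the same result as convolving the padded image with the 2D kernel.

```python
import numpy as np

def gaussian_kernel_1d(ksize=7, sigma=1.0):
    """Samples of a Gaussian centered on the kernel, normalized to sum to 1."""
    x = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(x**2) / (2.0 * sigma**2))
    return g / g.sum()

def gaussian_kernel_2d(ksize=7, sigma=1.0):
    """2D Gaussian kernel as the outer product of the 1D kernel with itself."""
    g = gaussian_kernel_1d(ksize, sigma)
    return np.outer(g, g)

def gaussian_blur(image, ksize=7, sigma=1.0):
    """Blur a 2D grayscale image; the output has the same shape as the input."""
    g = gaussian_kernel_1d(ksize, sigma)
    lo, hi = (ksize - 1) // 2, ksize // 2
    # Assumption: replicate border pixels so the blur does not darken the edges.
    padded = np.pad(image, ((lo, hi), (lo, hi)), mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, g, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode="valid"), 0, rows)

# Noise on a flat image shrinks after blurring, which is why the edge map gets cleaner.
rng = np.random.default_rng(0)
noisy = 0.5 + 0.1 * rng.standard_normal((32, 32))
smoothed = gaussian_blur(noisy, ksize=7, sigma=1.0)
```

The blurred image can then go through the same finite difference and thresholding steps as in Part 1.1.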

Figure: Blurred Cameraman
Figure: Dx Blurred Convolution (Vertical Edges)
Figure: Dy Blurred Convolution (Horizontal Edges)
Figure: Gradient Magnitude
Figure: Binarized Gradient Magnitude

This process can be sped up slightly because convolution is associative. We can first convolve the 2D Gaussian kernel with each finite difference operator to get the Derivative of Gaussian (DoG) X and Y kernels. These kernels can then be convolved with the original (unblurred) cameraman image to produce the DoG X and Y convolutions and, from them, the DoG Gradient Magnitude Image. For the binarized DoG Gradient Magnitude Image, I again chose a threshold of 0.08. As we can see, the edge map produced is identical to the one from the unoptimized approach.
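The associativity argument can be checked numerically. The sketch below is illustrative only: `conv2d_full` is a hypothetical NumPy-only helper that computes the full (zero-padded) 2D convolution, for which associativity holds exactly, and a random array stands in for the cameraman image. With same-size outputs or other border handling, the two orders can differ slightly near the image boundary.

```python
import numpy as np

def conv2d_full(a, k):
    """Full 2D convolution: out[y, x] = sum over i, j of k[i, j] * a[y - i, x - j]."""
    out = np.zeros((a.shape[0] + k.shape[0] - 1, a.shape[1] + k.shape[1] - 1))
    for i in range(k.shape[0]):
        for j in range(k.shape[1]):
            out[i:i + a.shape[0], j:j + a.shape[1]] += k[i, j] * a
    return out

def gaussian_kernel_2d(ksize=7, sigma=1.0):
    """Normalized 2D Gaussian kernel built from a 1D Gaussian."""
    x = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(x**2) / (2.0 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

DX = np.array([[1.0, -1.0]])    # finite difference in x
DY = np.array([[1.0], [-1.0]])  # finite difference in y

gauss = gaussian_kernel_2d(7, 1.0)
dog_x = conv2d_full(gauss, DX)  # Derivative of Gaussian X kernel, shape (7, 8)
dog_y = conv2d_full(gauss, DY)  # Derivative of Gaussian Y kernel, shape (8, 7)

rng = np.random.default_rng(0)
image = rng.random((20, 24))    # stand-in for the cameraman image

two_step = conv2d_full(conv2d_full(image, gauss), DX)  # blur, then differentiate
one_step = conv2d_full(image, dog_x)                   # single DoG convolution
```

Here `two_step` and `one_step` agree up to floating-point error, while the one-step version needs only a single pass over the image per derivative.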

Figure: Derivative of Gaussian X Kernel
Figure: Derivative of Gaussian Y Kernel
Figure: Dx Blurred Convolution (Vertical Edges)
Figure: Dy Blurred Convolution (Horizontal Edges)
Figure: Gradient Magnitude
Figure: Binarized Gradient Magnitude

Part 2: Fun with Frequencies!

Part 2.1: Image "Sharpening"

Here, we sharpen a blurry image using the unsharp masking technique. First, we convolve the original image of the Taj Mahal with a 2D Gaussian kernel with kernel size = 7 and sigma = 1. This blurs the image: the Gaussian acts as a low-pass filter that leaves only the low frequencies behind. Subtracting the blurred image from the original then gives the high frequencies of the image. Finally, we add the high frequencies, scaled by some value alpha, back to the original image to make it look 'sharper'.
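Written out, the unsharp masking steps collapse into a single convolution. The notation is introduced here for illustration only: f is the original image, g the Gaussian kernel, * denotes convolution, and δ is the unit impulse.

```latex
\begin{aligned}
f_{\text{low}}   &= f * g \\
f_{\text{high}}  &= f - f * g \\
f_{\text{sharp}} &= f + \alpha\, f_{\text{high}}
                  = f * \bigl( (1 + \alpha)\,\delta - \alpha\, g \bigr)
\end{aligned}
```

With alpha = 1, as used for the Taj Mahal image, the combined filter is 2δ - g.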

Figure: Taj Mahal
Figure: Taj Blurred
Figure: Taj High Frequencies
Figure: Taj Sharpened (alpha = 1)

Here are some other images that I sharpened using the unsharp masking technique.

Figure: Bears
Figure: Bears Blurred
Figure: Bears High Frequencies
Figure: Bears Sharpened (alpha = 4)


Figure: Sheep
Figure: Sheep Blurred
Figure: Sheep High Frequencies
Figure: Sheep Sharpened (alpha = 2)

Part 2.2: Hybrid Images

In this part, we create hybrid images: static images that are perceived differently depending on viewing distance. This again relies on image frequencies, specifically the fact that high frequencies dominate what we see up close, while only the low frequencies remain perceptible from farther away. We take two images and get the low frequencies of the first by blurring it with a Gaussian filter. For the second image, we also blur it to get its low frequencies and then subtract them from the original to get the high frequencies. Adding the low frequencies of image A to the high frequencies of image B gives the 'hybrid' image.
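The recipe above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the results: both inputs are assumed to be aligned 2D grayscale float arrays of the same shape in [0, 1], the function and parameter names are hypothetical, and the edge padding and final clipping are choices made here.

```python
import numpy as np

def gaussian_blur(image, ksize, sigma):
    """Separable Gaussian blur of a 2D grayscale image with edge padding."""
    x = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(x**2) / (2.0 * sigma**2))
    g /= g.sum()
    lo, hi = (ksize - 1) // 2, ksize // 2
    padded = np.pad(image, ((lo, hi), (lo, hi)), mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, g, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode="valid"), 0, rows)

def hybrid_image(im_low, im_high, ksize, sigma_low, sigma_high):
    """Low frequencies of im_low plus high frequencies of im_high."""
    low = gaussian_blur(im_low, ksize, sigma_low)
    high = im_high - gaussian_blur(im_high, ksize, sigma_high)
    # Clipping to [0, 1] for display is an assumption, not a step from the write-up.
    return np.clip(low + high, 0.0, 1.0), low, high

# Random arrays stand in for the two aligned photos.
rng = np.random.default_rng(0)
far_image = rng.random((40, 40))   # seen from far away (low frequencies kept)
near_image = rng.random((40, 40))  # seen up close (high frequencies kept)
hybrid, low, high = hybrid_image(far_image, near_image, ksize=25, sigma_low=4, sigma_high=4)
```

Using separate sigma values for the two images makes it possible to tune each cutoff independently, as in the examples below where the two sigmas differ.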

Figure: Derek
Figure: Nutmeg
Figure: Derek Low Frequencies: Kernel Size = 50 / Sigma = 10
Figure: Nutmeg High Frequencies: Kernel Size = 50 / Sigma = 10
Figure: Hybrid Image

Here's a hybrid image of Max Verstappen and Charles Leclerc, two of the best Formula 1 drivers on the grid lately.

Figure: Max Verstappen
Figure: Charles Leclerc
Figure: Charles Leclerc Low Frequencies: Kernel Size = 50 / Sigma = 5
Figure: Max Verstappen High Frequencies: Kernel Size = 50 / Sigma = 2
Figure: Hybrid Image

FFT Frequency Analysis
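The write-up does not say how these spectra were computed. A common way to visualize frequency content, shown here as an assumption rather than the method actually used, is the log magnitude of the centered 2D Fourier transform of the grayscale image. The function name and the small `eps` offset are illustrative.

```python
import numpy as np

def log_magnitude_spectrum(gray, eps=1e-8):
    """Log magnitude of the 2D FFT, shifted so zero frequency sits at the center."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    return np.log(np.abs(spectrum) + eps)  # eps avoids log(0)

# A constant image has all of its energy at zero frequency (the center after the shift).
flat = np.ones((8, 8))
spec = log_magnitude_spectrum(flat)
```

In such a plot, a low-pass filtered image keeps energy mostly near the center, while a high-pass filtered image loses it there.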
Figure: Max Verstappen FFT
Figure: Charles Leclerc FFT
Figure: Charles Leclerc Low Frequencies FFT
Figure: Max Verstappen High Frequencies FFT
Figure: Hybrid Image FFT

Frog and Rat Hybrid Image


Figure: Frog
Figure: Rat
Figure: Frog Low Frequencies: Kernel Size = 50 / Sigma = 10
Figure: Rat High Frequencies: Kernel Size = 50 / Sigma = 8
Figure: Hybrid Image

Here's one that didn't work as well as I wanted it to. I tried to mix Gekko and Wingman, but the two input images had different features and backgrounds, which also made them difficult to align properly.

Gekko and Wingman Hybrid Image

Figure: Gekko
Figure: Wingman
Figure: Gekko Low Frequencies: Kernel Size = 50 / Sigma = 10
Figure: Wingman High Frequencies: Kernel Size = 50 / Sigma = 3
Figure: Hybrid Image

Part 2.3 + 2.4: Gaussian and Laplacian Stacks + Multiresolution Blending

We will create Gaussian and Laplacian stacks, which are useful for blending two images together seamlessly. For the Gaussian stack, we repeatedly blur the image with a Gaussian filter and add each result to a list, each time using the image we just obtained as the next image to blur. We can choose how many levels the stack has. The result is a list G of progressively blurrier images, each keeping a narrower band of low frequencies. This list can now be used to create the Laplacian stack: subtracting each consecutive pair of Gaussian stack images gives one Laplacian stack image (L[i] = G[i] - G[i + 1]). The last Laplacian stack entry is the same as the last Gaussian entry (L[-1] = G[-1]). Because the differences telescope, adding all the Laplacian stack entries gives the original image back.

We also need a Gaussian stack of the mask, since it controls how the two images are mixed at each level. Once we have the Laplacian stacks of the two images we want to blend and the Gaussian stack of the mask, we can create a new blended Laplacian stack with blend_stack[i] = (im1_l_stack_white[i] * g_stack_mask[i]) + (im2_l_stack_black[i] * (1 - g_stack_mask[i])). Adding all the levels of the blended Laplacian stack gives our final blended image!
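The stack construction and blending formula can be sketched as below. This is a hedged illustration, not the project's code: the function names are hypothetical, the inputs are assumed to be 2D grayscale float arrays of equal shape with a mask in [0, 1], the first Gaussian level is assumed to be the unblurred input (so that the Laplacian levels sum back to the original), and the edge padding in the blur is a choice made here.

```python
import numpy as np

def gaussian_blur(image, ksize, sigma):
    """Separable Gaussian blur of a 2D grayscale image with edge padding."""
    x = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(x**2) / (2.0 * sigma**2))
    g /= g.sum()
    lo, hi = (ksize - 1) // 2, ksize // 2
    padded = np.pad(image, ((lo, hi), (lo, hi)), mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, g, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode="valid"), 0, rows)

def gaussian_stack(image, levels, ksize, sigma):
    """G[0] is the input; each later level blurs the previous one (no downsampling)."""
    stack = [image.astype(float)]
    for _ in range(levels - 1):
        stack.append(gaussian_blur(stack[-1], ksize, sigma))
    return stack

def laplacian_stack(g_stack):
    """L[i] = G[i] - G[i + 1], with L[-1] = G[-1] so the levels sum to G[0]."""
    l_stack = [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)]
    l_stack.append(g_stack[-1])
    return l_stack

def blend(im1, im2, mask, levels=5, ksize=21, sigma=7):
    """Blend so that im1 shows where the mask is 1 and im2 where it is 0."""
    l1 = laplacian_stack(gaussian_stack(im1, levels, ksize, sigma))
    l2 = laplacian_stack(gaussian_stack(im2, levels, ksize, sigma))
    gm = gaussian_stack(mask, levels, ksize, sigma)
    blended = [a * m + b * (1 - m) for a, b, m in zip(l1, l2, gm)]
    return np.sum(blended, axis=0)  # collapse the blended Laplacian stack

# Toy inputs: random arrays stand in for the two photos, with a vertical seam mask.
rng = np.random.default_rng(0)
left_image = rng.random((16, 16))
right_image = rng.random((16, 16))
seam_mask = np.zeros((16, 16))
seam_mask[:, :8] = 1.0
result = blend(left_image, right_image, seam_mask)
```

Because each level of the mask stack is blurrier than the last, coarse image structure is mixed over a wide band around the seam while fine detail is mixed over a narrow one.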

I used 5 stack levels with kernel size = 21 and sigma = 7 to achieve the following result.

Gaussian + Laplacian Stack of Apple


Gaussian + Laplacian Stack of Orange


Gaussian Stack of Mask


Blended Laplacian Stack


Blended Image - Oraple!

Figure: Oraple

I wanted to improve the blending, since a seam is visible near the top. This time, I used 100 stack levels with kernel size = 100 and sigma = 35.

Blended Image - Improved Oraple!

Figure: Oraple

Here's the blending algorithm applied to some other images.

Figure: Sunrise
Figure: Nature
Figure: Horizontal Mask
Figure: Image Blended (stacks = 100, ksize = 5, sigma = 100)

Here's a blended image of the Las Vegas Sphere and another metallic sphere, made with an irregular mask.

Figure: Las Vegas Sphere
Figure: Metallic Sphere
Figure: Irregular Mask
Figure: Image Blended (stacks = 100, ksize = 5, sigma = 100)

Takeaways

The algorithms developed in this project are the backbone of various camera and Photoshop tools. Through working on this project, I learned a lot about using image frequencies to enhance and manipulate images.