Project 1: Images of the Russian Empire: Colorizing the Prokudin-Gorskii Photo Collection

Ryan Mathew
September 3rd, 2024

Overview

In 1907, Sergei Mikhailovich Prokudin-Gorskii set out to travel across the Russian Empire and photograph a wide range of subjects. He captured each scene in three separate exposures through red, green, and blue filters. Although there was no way to print color photographs at that time, the Library of Congress later obtained his collection of glass negatives. In this project, we overlay the three color channels for each scene to produce an aligned, colorized image.

Naive Euclidean Alignment

Simply overlaying the three channels produces colorized but misaligned images, so we need to align the red and green channels to the blue channel. This amounts to finding the best x and y pixel translation for each channel. We exhaustively search shifts in the range [-15, 15] along both x and y, score each shifted channel against the blue channel using the L2 norm (Euclidean distance) of the pixel differences, and keep the shift with the lowest score. It is important to crop the images before trying different shifts; otherwise the search can fit to the borders, which skews the results. After running this search for both the red and green channels, we apply the best shift to each channel and overlay it on the blue channel to produce the aligned, colorized images shown below.
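The search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the 10% crop fraction, the use of np.roll, and the (dy, dx) ordering of the returned shift are all assumptions made for the example. It also scores only the interior of each shifted image, which is one way of keeping the borders out of the score.

```python
import numpy as np

def crop_interior(img, frac=0.1):
    """Drop `frac` of each side so the plate borders do not affect the score.
    The 10% default is an assumed value, not taken from the write-up."""
    h, w = img.shape
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]

def align_exhaustive(channel, reference, window=15):
    """Return the (dy, dx) shift of `channel` with the lowest L2 score
    against `reference`, searching [-window, window] on both axes."""
    ref = crop_interior(reference)
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            # Sum of squared differences; same minimizer as the L2 norm.
            score = np.sum((crop_interior(shifted) - ref) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The square root of the L2 norm is skipped because it does not change which shift scores lowest. Inputs are assumed to be 2D float arrays of the same shape.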

Cathedral

Cathedral R: (12, 3) G: (5, 2)

Monastery

Monastery R: (3, 2) G: (-3, 2)

Tobolsk

Tobolsk R: (6, 3) G: (3, 3)

Image Pyramid Alignment

While this exhaustive search over a small window works well for the JPG files, the high-resolution scans would need a much larger window, which makes the search slow and inefficient. For this reason, we use an image pyramid instead. This approach takes advantage of the idea we saw earlier: smaller images are cheap to search exhaustively for the best shift. We repeatedly downscale each color channel by a factor of 0.5 using sk.transform.rescale, storing each downscaled image, until the image's height or width is less than 200px. Next, we start from the coarsest image and align it to the coarsest image of the blue channel using the Euclidean search to find the best shift. We then move from the coarse images to the finer ones, each time multiplying the shift from the previous level by 2 and adding the new shift found at the current level, until we have the final shift at full resolution. We do this for both the red and green channels. The results of image pyramid alignment, and the shifts used for each image, are shown below.
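A rough sketch of the coarse-to-fine procedure is given below, under some stated assumptions. The helper names, the 10% interior crop, the ±15 search at the coarsest level, and the ±2 refinement at finer levels are illustrative choices, not values from the project. Only skimage.transform.rescale and the 200px stopping rule come from the description above.

```python
import numpy as np
from skimage.transform import rescale

def l2_shift(channel, reference, center, radius):
    """Best (dy, dx) within `radius` of `center`, scored by squared L2
    distance on the interior (10% crop per side, an assumed value)."""
    h, w = reference.shape
    dh, dw = h // 10, w // 10
    ref = reference[dh:h - dh, dw:w - dw]
    best, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            moved = np.roll(channel, (dy, dx), axis=(0, 1))[dh:h - dh, dw:w - dw]
            score = np.sum((moved - ref) ** 2)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

def build_pyramid(img, min_size=200):
    """Halve the image until its height or width is below `min_size`."""
    levels = [np.asarray(img, dtype=float)]
    while min(levels[-1].shape) >= min_size:
        levels.append(rescale(levels[-1], 0.5, anti_aliasing=True))
    return levels  # finest level first, coarsest last

def align_pyramid(channel, reference, coarse_radius=15, refine_radius=2):
    """Coarse-to-fine alignment: double the previous shift, then refine it."""
    pairs = list(zip(build_pyramid(channel), build_pyramid(reference)))
    shift = (0, 0)
    for level, (ch, ref) in enumerate(reversed(pairs)):
        radius = coarse_radius if level == 0 else refine_radius
        # One pixel at the coarser level covers two pixels at this level.
        shift = l2_shift(ch, ref, (2 * shift[0], 2 * shift[1]), radius)
    return shift
```

Because the shift is doubled at every level, a small search window at the coarsest level already covers a large displacement at full resolution, and each finer level only needs to correct by a pixel or two.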

church

church.tif

R: (58, -4) G: (25, 4)

emir

emir.tif

R: (99, -205) G: (49, 24)

harvesters

harvesters.tif

R: (124, 14) G: (60, 17)

icon

icon.tif

R: (90, 23) G: (41, 17)

lady

lady.tif

R: (112, 12) G: (52, 9)

melons

melons.tif

R: (178, 13) G: (82, 10)

onion_church

onion_church.tif

R: (108, 36) G: (52, 26)

sculpture

sculpture.tif

R: (140, -27) G: (33, -11)

self_portrait

self_portrait.tif

R: (176, 37) G: (79, 29)

three_generations

three_generations.tif

R: (112, 11) G: (53, 14)

train

train.tif

R: (87, 32) G: (42, 6)

Sample Images

Here are the results of the image pyramid alignment algorithm on other images I chose from the Prokudin-Gorskii collection.

Isfandiyar

Isfandiyar R: (94, -8) G: (40, 8)

V Italīi

V Italīi R: (76, 35) G: (38, 21)

Woman

Woman R: (107, 56) G: (48, 38)

Bells & Whistles: Better Alignment

As you may have noticed, emir.tif is still clearly misaligned, unlike the other .tif files, which image pyramid alignment handled well. This is because the channels being matched have different brightness values, so the raw pixel difference is a poor measure of how well they line up. In this case, it is better to use a different metric. I decided to use Canny edge detection (sk.feature.canny) to create an edge map of each of the three color channels. Once we have the edge maps, we align them to each other using the same image pyramid alignment with L2 norm scoring. This avoids the suboptimal shifts we would otherwise get from applying the same scoring to raw color channels with different brightness values. It is worth noting that when using edge maps from Canny edge detection, we no longer need to crop the images before exhaustively trying shifts, as the borders will not appear in the edge maps. As you can see below, using the edge maps results in a better-aligned image than using the raw color channels.
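The idea of scoring on edge maps rather than raw intensities can be sketched as below. This is only an illustration under assumptions: the function names, the sigma value, and the default Canny thresholds are not taken from the project, and for brevity the example uses a single-scale search instead of the full image pyramid.

```python
import numpy as np
from skimage.feature import canny

def edge_map(channel, sigma=2.0):
    """Binary Canny edge map, cast to float so it can be scored with L2.
    The sigma value is an assumed setting."""
    return canny(np.asarray(channel, dtype=float), sigma=sigma).astype(float)

def align_edges(channel, reference, window=15, sigma=2.0):
    """Best (dy, dx) shift of `channel` onto `reference`, scored on edge maps."""
    ch_edges = edge_map(channel, sigma)
    ref_edges = edge_map(reference, sigma)
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            moved = np.roll(ch_edges, (dy, dx), axis=(0, 1))
            # Squared L2 distance between edge maps; no border crop here.
            score = np.sum((moved - ref_edges) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Since an edge map records where intensity changes rather than how bright a region is, a channel that is uniformly brighter or darker than the reference can still produce edges in the same places. The same edge maps could be passed to a coarse-to-fine search in place of the raw channels.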

Emir

Raw Color Channels + Image Pyramid Alignment

R: (99, -205) G: (49, 24)

Emir

Canny Edge Detection + Image Pyramid Alignment

R: (107, 40) G: (49, 23)