Image Warping and Mosaicing
I. Introduction
The objective of this project is to stitch a mosaic from a series of images using manually selected corresponding points. The process can be described in four steps:
- Shoot images so that the transforms between them are projective. These images must be shot from the same point of view, so that aligning them is a matter of unprojecting, rotating, and reprojecting (the resulting transformation is called a homography).
- Recover the homography between the images. Given a series of (at least four) pairs of corresponding points on two images, we can estimate the homography matrix (a sketch of this estimation follows this list). I referenced slides 25-30 from Robert Collins' CSE486 Lecture Slides.
- Warp images using the homography. Now that we have the parameters of the homography matrix, H, it is a simple process of inverse warping the image by H.
- Blend the warped images together. Finally, blend the two images together to create a (hopefully) seamless mosaic.
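As a rough illustration of the estimation step, here is a minimal sketch in NumPy. It fixes the bottom-right entry of H to 1 and solves the remaining eight unknowns by least squares; the function name compute_homography is mine, not from the write-up.

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate the 3x3 homography H mapping src points to dst points.

    src, dst: (n, 2) arrays of corresponding (x, y) points, n >= 4.
    Each correspondence contributes two rows to a linear system
    A h = b in the eight unknowns of H (H[2, 2] is fixed to 1);
    extra correspondences are averaged in by least squares.
    """
    n = len(src)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        A[2 * i]     = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i:2 * i + 2] = xp, yp
    h = np.linalg.lstsq(A, b, rcond=None)[0]
    return np.append(h, 1).reshape(3, 3)
```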
To improve upon this, automatic mosaicing is also implemented. This entails automatically selecting likely points, refining them, and finding the correspondences. A more detailed, step-by-step explanation is available below.
II. Image Rectification
After recovering a homography matrix, we can check whether the implemented warping works correctly by using it to rectify an image. This simply means taking a photo of an object from an angle, where we know the object's true shape, and morphing the image into that shape.
I did this on a photo of the portrait I was working on. The picture was taken at an angle, so the drawing board is trapezoidal in the photo even though we know it to be rectangular. Image rectification recovers the rectangular drawing board as if it were viewed head on. The original points were chosen manually, and the new points were chosen manually as well, through guess and check to ensure the correct width-to-height ratio.
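Under the hood, the rectification (and all the warping below) is an inverse warp: for every pixel of the output we ask where it came from in the source. A minimal sketch, assuming a NumPy image and nearest-neighbour sampling (bilinear interpolation would look smoother); the names are illustrative:

```python
import numpy as np

def inverse_warp(img, H, out_shape):
    """Warp img by homography H into an (h_out, w_out) canvas.

    Inverse warping: apply H^-1 to every output pixel to find its
    source location, then sample the source image there.
    """
    H_inv = np.linalg.inv(H)
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = H_inv @ coords
    src = src[:2] / src[2]                      # homogeneous divide -> (x, y)
    sx = np.round(src[0]).astype(int)           # nearest-neighbour sampling
    sy = np.round(src[1]).astype(int)
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out = np.zeros(tuple(out_shape) + img.shape[2:], dtype=img.dtype)
    out[ys.ravel()[valid], xs.ravel()[valid]] = img[sy[valid], sx[valid]]
    return out
```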




I discovered that there are images that don't look quite right when rectified. At first I tried to rectify this image of some bookshelves in Main Stacks. As you can see below, the rectified image looks oddly misshapen, especially around the spinning handles. This is because the handles (as well as the signs) stick out from the flat surface of the bookshelves, and a homography can only correctly rectify points that lie on a single plane. When you look solely at the flat surface of the bookshelves, though, they are correctly rectified.


III. Mosaicing
Now that we know that our homography matrix is working, we can finally stitch some photos together and make mosaics! When I was first working on this project, I was in MLK, so I took some quick pictures with my iPhone for testing. I thought that not having a tripod would produce unappealing ghosting in my mosaics, but as long as I did my best to pivot the phone about the camera itself (and look very weird while doing it, I'm sure), the results were quite good. Unfortunately, since my phone automatically adjusts brightness, contrast, exposure, etc., some of the images don't match each other in colour. I had to go in and manually adjust an image until it matched the others better.




To combine two images, we need at least four pairs of corresponding points to make a homography matrix. I used around 10 points for each pair of images. When mosaicing two images together, I averaged the points and warped both images toward that average. When mosaicing three images together, I warped the left and right images towards the centre image.
To determine the size of the final image, I simply applied the corresponding homography matrices to the left image's left corners and the right image's right corners, and took the min (for the left image) or max (for the right image) of the x and y values. This gives some set of negative and positive numbers such as [-545, -267, 1065, 481]: the min x value, min y value, max x value, and max y value of the final image. The size of the image can be easily found with these values, and all that remains is to offset the pixels being put onto the final image by the min values.
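A sketch of that corner trick, assuming H maps an image of width w and height h into the mosaic frame (mosaic_bounds is my name for it):

```python
import numpy as np

def mosaic_bounds(H, w, h):
    """Warp the four corners of a w x h image by H and return the
    bounding box [min_x, min_y, max_x, max_y] of the result."""
    corners = np.array([[0, 0, 1], [w, 0, 1],
                        [0, h, 1], [w, h, 1]], dtype=float).T
    warped = H @ corners
    warped = warped[:2] / warped[2]             # homogeneous divide
    min_x, min_y = np.floor(warped.min(axis=1)).astype(int)
    max_x, max_y = np.ceil(warped.max(axis=1)).astype(int)
    return min_x, min_y, max_x, max_y           # e.g. [-545, -267, 1065, 481]
```

Pixels drawn onto the final canvas are then shifted by (-min_x, -min_y) so everything lands at non-negative coordinates.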

Although the mosaic looks very nice and aligned in terms of features within the image, there are still obvious harsh edges where one image ends and another begins. Some blending technique must be used to eliminate these edges. I chose to implement linear blending. I simply took the x-axis range where the two images overlap (found with the same process as the final image's width and height) and created a linear gradient from one image into the next. The weight of the first image at column x is simply 1 - (x - left_edge_x) / (right_edge_x - left_edge_x), and the weight of the other image is one minus that, so the two always sum to 1.
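A sketch of the ramp, assuming both images have already been warped onto a common RGB canvas of the same size and that the overlap spans columns x_start to x_end (linear_blend and its parameters are illustrative names):

```python
import numpy as np

def linear_blend(left, right, x_start, x_end):
    """Blend two same-sized canvases with a horizontal linear ramp.

    The left image's weight falls from 1 to 0 across the overlap
    [x_start, x_end]; the right image's weight is its complement,
    so the two always sum to 1.
    """
    x = np.arange(left.shape[1])
    alpha = np.clip(1 - (x - x_start) / (x_end - x_start), 0, 1)
    alpha = alpha[None, :, None]                # broadcast over rows, channels
    return alpha * left + (1 - alpha) * right
```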

This process does still leave some edges along the top and bottom of the images, but cropping can eliminate those edges without taking too much away from the mosaic. The reason linear blending worked so well is that the images and points were taken and chosen with care. If the homography did not correctly align the images, there would be ghosting, and the blending would not have worked as well as it did.

Below is a series of other images that I mosaicked together with the same process.












IV. Automatic Mosaicing
Although the finished mosaics look quite nice, it is very time-consuming to go through and find matching points for every image that you want to stitch. So, the key to automatic mosaic stitching is automatic corresponding point selection.
This process is actually quite simple, although it does require several rounds of pruning to ensure there are no false positives. The steps are outlined below:
- Use the Harris interest point detector to find possible points of interest: it takes in an image and finds all the Harris corners. I used the provided harris.py.
- Use the Adaptive Non-Maximal Suppression strategy on the list of points. We want to restrict the maximum number of points in our list as feature matching in the next step is expensive. This strategy makes sure that we take a certain number of points that are spread evenly across the image, but also weighted according to how strong their respective corners are.
- Extract the feature descriptor for each point. Points are compared by their features, where a feature is essentially the patch around a point. These descriptors are the 40 by 40 patch around the point, downsized to 8 by 8.
- Match features between the two images. Each feature descriptor of the left image is compared against each feature descriptor of the right image using SSD (sum of squared differences) to find the best-fitting right feature for each left feature.
- Use 4-point RANSAC to compute a robust homography estimate. Randomly select four points from the list of possible corresponding points to create a homography and see how many of all the points match. Repeat for a set number of trials. Select the best trial and use the non-discarded points to create the final homography.
I took the previous images of Tilden Park, for which I had created a mosaic with manually selected points, and applied automatic mosaicing to them.


The Harris corners of each image are found with the provided harris.py mentioned earlier.


Adaptive Non-Maximal Suppression, or ANMS, is a strategy that selects only the n most important and spread-out points, so that only a small number of feature descriptors must be calculated and compared. I chose n = 300, as I tend to work with smaller images. To explain this method in the most basic sense: using each point's corner strength (its h value from the Harris detector), we compute a suppression radius for each point, which is the distance to the nearest point that is significantly stronger; this radius ranks the importance of each point. I selected the top n points with the maximum radii to continue on to the next step.
Since I used images that are on the small side, no points were pruned, as there were fewer initial Harris corners than my maximum number of selected points.
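A minimal sketch of ANMS under these assumptions. The write-up does not spell out the suppression criterion, so this uses the usual one from the MOPS paper, with a robustness factor of 0.9:

```python
import numpy as np

def anms(coords, strengths, n_keep=300, c_robust=0.9):
    """Adaptive Non-Maximal Suppression.

    coords: (n, 2) corner locations; strengths: (n,) Harris strengths.
    Each corner's suppression radius is its distance to the nearest
    corner that is significantly stronger; keeping the n_keep corners
    with the largest radii yields strong, evenly spread points.
    """
    radii = np.full(len(coords), np.inf)
    for i in range(len(coords)):
        stronger = strengths[i] < c_robust * strengths
        if stronger.any():
            radii[i] = np.linalg.norm(coords[stronger] - coords[i], axis=1).min()
    keep = np.argsort(radii)[::-1][:n_keep]     # largest radii first
    return coords[keep]
```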


Now that we have a reasonable number of points, the next step is to find corresponding points in the other image. To do this, we must consider the feature descriptor of each point. We want these feature descriptors to contain as much information as possible while remaining small, so that the following step will be less computationally expensive. To prepare them, we need to massage each patch into a format that allows for easy and fast comparison. I took the 40 by 40 patch around each point and applied a Gaussian blur to it before downsizing it to an 8 by 8 patch by selecting every 5th pixel.
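A sketch of the descriptor extraction, assuming a greyscale float image and a point at least 20 pixels from the border; the blur sigma is my guess, not a value from the write-up:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def extract_descriptor(img, x, y):
    """Build a flattened 8x8 descriptor from the 40x40 patch at (x, y).

    The patch is blurred first so that keeping every 5th pixel does
    not alias; sampling a 40x40 patch every 5 pixels gives 8x8.
    """
    patch = img[y - 20:y + 20, x - 20:x + 20]
    patch = gaussian_filter(patch, sigma=2)     # sigma is an assumption
    return patch[::5, ::5].ravel()              # 40x40 -> 8x8 -> 64-vector
```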


Now that we have our feature descriptors for each image, we can use a basic SSD to see which feature in one image best matches which feature of the other image.
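A sketch of the matching, assuming the descriptors of the two images are stacked into (n, 64) and (m, 64) arrays:

```python
import numpy as np

def match_features(desc_left, desc_right):
    """Pair each left descriptor with its best right descriptor by SSD.

    Returns a list of (left_index, right_index) pairs, one per left
    feature, picking the right feature with the smallest sum of
    squared differences.
    """
    diff = desc_left[:, None, :] - desc_right[None, :, :]   # (n, m, 64)
    ssd = (diff ** 2).sum(axis=2)                            # (n, m)
    return [(i, int(ssd[i].argmin())) for i in range(len(desc_left))]
```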


Although our feature matching algorithm is quite good, there is still a chance of having outlier points. So, we use RANSAC to reject these remaining outliers. The idea is to select four random points to create a homography matrix (transforming the left points to the right points), apply it to all the left image points we have, and see how many of them match the corresponding right image points. The points that land within some epsilon of their expected location are accepted as inliers, and all other points are discarded as outliers. The number of votes, which is the number of inliers, is saved along with the list of inlier points.
This process is repeated for some number of trials, t. I chose t = 500, since I usually end up with only 20-50 pairs of points and it produces consistent results. I chose epsilon = 2 pixels, as suggested by Professor Efros in class. I tried other values, but 2 seems to give the best results.
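A sketch of the RANSAC loop, reusing the compute_homography sketch from the introduction; the trial count and epsilon follow the values above:

```python
import numpy as np

def ransac_homography(left_pts, right_pts, trials=500, eps=2.0):
    """4-point RANSAC over candidate correspondences.

    left_pts, right_pts: (n, 2) arrays of matched points. Each trial
    fits a homography to 4 random pairs and counts how many points
    land within eps pixels of where it predicts; the final homography
    is fit to the largest inlier set.
    """
    ones = np.ones((len(left_pts), 1))
    homog = np.hstack([left_pts, ones]).T        # 3 x n homogeneous points
    best = np.zeros(len(left_pts), dtype=bool)
    for _ in range(trials):
        idx = np.random.choice(len(left_pts), 4, replace=False)
        H = compute_homography(left_pts[idx], right_pts[idx])
        proj = H @ homog
        proj = (proj[:2] / proj[2]).T            # predicted right points
        inliers = np.linalg.norm(proj - right_pts, axis=1) < eps
        if inliers.sum() > best.sum():           # keep the best vote count
            best = inliers
    return compute_homography(left_pts[best], right_pts[best]), best
```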


Finally, the point selection process is complete, and we have a set of corresponding points for the two images. Then we take these points and stitch the images together with the homography matrix just as detailed in the previous section.


The final result looks better than the manually selected version, which is great! Below are some more examples.













Some mosaics have better results when points were manually selected, which is especially noticeable in the Union Square mosaic. This is because some areas of the images that would have given helpful points were never selected, since those points of interest were not detected as Harris corners.
I came across some issues with several sets of images where no matches, or even no corners, could be found in the images. This is because I did not implement any allowances for rotation between the features, so if one image was slightly rotated, no matches would be found. I also had to increase the contrast in some of the images in order for any Harris corners to be detected.
V. Rotation Invariance
Another reason we were not able to find a set of corresponding points in some of the image pairs is that some of the images were rotated not only about the y-axis, but about the z-axis as well. Two identical features will not be matched by our basic SSD if one is rotated, since SSD compares pixels individually. To solve this problem, each feature needs some "correct" orientation. I simply rotated each feature to align with the gradient direction of that feature.
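A sketch of that orientation step, assuming a greyscale image; the smoothing sigma and the 60x60 working patch (cut larger so rotation doesn't pull in empty corners) are my assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, rotate

def oriented_descriptor(img, x, y):
    """Like the 8x8 descriptor above, but rotated so every feature's
    dominant gradient direction points the same way before sampling."""
    smooth = gaussian_filter(img, sigma=4.5)     # smooth before gradients
    gy, gx = np.gradient(smooth)
    theta = np.degrees(np.arctan2(gy[y, x], gx[y, x]))
    big = img[y - 30:y + 30, x - 30:x + 30]      # 60x60 working patch
    big = rotate(big, -theta, reshape=False)     # cancel the orientation
                                                 # (flip the sign if your axis
                                                 # convention differs)
    patch = gaussian_filter(big[10:50, 10:50], sigma=2)   # central 40x40
    return patch[::5, ::5].ravel()               # -> 8x8 -> 64-vector
```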
Below is an example of one of the pairs of images for which the automatic point selection process did not succeed. It is quite obvious that the z-axis orientation was not preserved from one image to the next. Take the beige building with yellow lights, for example. It is much more rotated in the left image than the right.


The process is exactly the same, except that instead of comparing the original features, we compare the rotated features. Matches that were previously discarded as false negatives are now preserved.




And hooray! A previously unstitchable pair of images can now be stitched.

