Scale Invariant Feature Transform Algorithm

Scale Invariant Feature Transform (SIFT) is an image descriptor for image-based matching and recognition developed by David Lowe. Like other descriptors, it is widely used in computer vision tasks that rely on point matching, such as object recognition. The SIFT descriptor is invariant to geometric transformations such as translation, rotation, and scaling in the image domain, and it is robust to moderate perspective transformations and variations in illumination. It has been shown experimentally to be effective in practice for object recognition and image matching under real-world conditions.

SIFT includes a method for detecting interest points in a grayscale image, in which statistics of the local gradient directions of image intensities are accumulated to give a summarized description of the local image structure in a neighborhood around each interest point; this descriptor is then used to match corresponding interest points across different images. The SIFT descriptor was later extended from grayscale to color images.

The SIFT algorithm uses the Difference of Gaussians (DoG) as an approximation of the Laplacian of Gaussian (LoG), which is comparatively expensive to compute. The DoG is obtained as the difference between two Gaussian-blurred versions of an image with different values of σ, where σ acts as a scale parameter. Once the DoG images are computed, they are searched for local extrema over scale and space: each pixel is compared with its 8 spatial neighbors as well as the 9 pixels at the next scale and the 9 pixels at the previous scale. If it is a local extremum, it is a potential keypoint. This process is carried out over the different octaves of the Gaussian image pyramid, as shown in 2.12.
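The DoG construction and the 26-neighbor extremum test can be sketched as follows. This is a minimal illustration, assuming a grayscale image stored as a NumPy array; the function names and the parameter defaults (σ₀ = 1.6, k = √2, which are common choices) are assumptions for the sketch, not Lowe's exact implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_pyramid(image, num_scales=4, sigma0=1.6, k=2 ** 0.5):
    """Build one octave of Difference-of-Gaussians (DoG) images.

    Each DoG image is the difference of two Gaussian blurs whose
    sigmas differ by a factor k.  The next octave would be built the
    same way from a subsampled image, e.g. image[::2, ::2].
    """
    blurred = [gaussian_filter(image, sigma0 * k ** i)
               for i in range(num_scales + 1)]
    return [blurred[i + 1] - blurred[i] for i in range(num_scales)]

def is_local_extremum(dog, s, y, x):
    """True if pixel (y, x) at scale s is strictly greater or strictly
    smaller than all 26 neighbours: 8 in its own DoG image plus 9 each
    in the scales directly above and below."""
    patch = np.stack([dog[s + ds][y - 1:y + 2, x - 1:x + 2]
                      for ds in (-1, 0, 1)])
    centre = patch[1, 1, 1]
    others = np.delete(patch.ravel(), 13)  # drop the centre itself
    return centre > others.max() or centre < others.min()
```

In a full implementation this test would be run over every interior pixel of every DoG image in every octave, and the surviving pixels become the candidate keypoints refined in the next step.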
An image pyramid is a series of images, each obtained by subsampling (reducing by a fixed factor) the previous one.

The next step is keypoint localization. Once potential keypoint locations are found, they are refined to obtain a more precise estimate of each extremum's position; a contrast threshold is applied, and if the intensity at an extremum is below this threshold, the keypoint is rejected.

Orientation must then be taken into account: an orientation is assigned to each keypoint to achieve invariance to image rotation. A neighborhood is defined around the keypoint location depending on its scale, and the gradient magnitude and direction are computed in that region. To find the dominant orientation, peaks are detected in the resulting orientation histogram. If there is more than one dominant orientation around the interest point, multiple peaks are accepted whenever a secondary peak is higher than 80% of the highest peak, and in that case each accepted peak is used to compute a separate image descriptor for the corresponding orientation estimate.

Finally, the keypoint descriptor is created: the neighborhood around the keypoint is taken and divided into sub-blocks, and for each sub-block an orientation histogram is computed; concatenating these histograms yields the final descriptor vector.
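The orientation histogram, the 80% secondary-peak rule, and the sub-block descriptor described above can be sketched as follows. The function names, the 36-bin coarse histogram, the 16×16 neighborhood, and the omission of Gaussian weighting and interpolation are simplifying assumptions for illustration, not the full SIFT pipeline.

```python
import numpy as np

def orientation_histogram(patch, num_bins=36):
    """Accumulate gradient orientations in a patch into a histogram,
    with each sample weighted by its gradient magnitude."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    hist = np.zeros(num_bins)
    bins = (ang / (360.0 / num_bins)).astype(int) % num_bins
    np.add.at(hist, bins.ravel(), mag.ravel())
    return hist

def dominant_orientations(hist, rel_height=0.8):
    """Return the bin centres (in degrees) of every local peak whose
    height is at least 80% of the highest peak, SIFT's rule for
    assigning multiple orientations to one keypoint."""
    num_bins = len(hist)
    peaks = []
    for i, h in enumerate(hist):
        if (h >= rel_height * hist.max()
                and h > hist[i - 1] and h > hist[(i + 1) % num_bins]):
            peaks.append((i + 0.5) * 360.0 / num_bins)
    return peaks

def sift_descriptor(patch):
    """Divide a 16x16 neighbourhood into 4x4 sub-blocks with an 8-bin
    orientation histogram each, giving a 16 * 8 = 128-D vector, which
    is then normalised to unit length."""
    assert patch.shape == (16, 16)
    vec = np.concatenate([
        orientation_histogram(patch[r:r + 4, c:c + 4], num_bins=8)
        for r in range(0, 16, 4) for c in range(0, 16, 4)])
    return vec / (np.linalg.norm(vec) + 1e-12)
```

Keeping every peak above 80% of the maximum, rather than only the single highest one, means a keypoint with two strong local gradient directions yields two descriptors, which makes matching more stable in such regions.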