OpenCV + Python Crop and resize images


Hello. This is codingwalks.

Resizing and cropping are among the most basic and important tasks in image processing. In fields such as web development, design, data analysis, and artificial intelligence, you often need to resize an image or use only a specific part of it. In computer vision in particular, it is common to adjust the size or remove unnecessary regions before analyzing an image. OpenCV makes both tasks easy. In this article, we will introduce how to resize an image to a desired size and how to crop only the necessary area using OpenCV. Let's take a closer look.

1. OpenCV's Coordinate System

When dealing with images in OpenCV, it is very important to understand the coordinate system. OpenCV represents an image as a two-dimensional matrix, and each pixel is an element of that matrix. Coordinates give the location of a pixel in the image, and they differ slightly from the coordinate system used in mathematics. When drawing graphs in mathematics, the positive x-axis generally points east (right) and the positive y-axis points north (up). In OpenCV, the positive x-axis also points east, but the positive y-axis points south (down). For a 512x512 image, the origin (0, 0) is the top-left pixel and the bottom-right pixel is (511, 511). Because the image is a matrix, a pixel is indexed row first, as img[y, x].

[Figure: mathematical coordinate system vs. image coordinate system]
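As a small illustration of this coordinate system, the sketch below builds a synthetic 512x512 image with NumPy (an assumption made here so that no image file is needed) and shows that a pixel at coordinates (x, y) is accessed as img[y, x], row first.

```python
import numpy as np

# Synthetic 512x512 BGR image (all black); it stands in for a loaded image.
img = np.zeros((512, 512, 3), dtype=np.uint8)

# shape is (rows, cols, channels) = (height, width, 3)
height, width = img.shape[:2]

# Mark the pixel at x=100 (towards the east) and y=20 (towards the south) in white.
x, y = 100, 20
img[y, x] = (255, 255, 255)     # NumPy indexing is [row, column] = [y, x]

top_left = (0, 0)                        # origin, written as (x, y)
bottom_right = (width - 1, height - 1)   # last valid pixel, written as (x, y)
print(top_left, bottom_right)
```

Swapping the two indices (img[x, y]) would address a different pixel, which is a common source of bugs when moving between (x, y) coordinates and array indices.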

2. Image cropping

To crop an image, you can treat the image itself as a matrix. OpenCV has no separate cropping function; instead, you crop through array indexing by specifying the start and end points of the height (rows) first and then of the width (columns), as in img[y1:y2, x1:x2]. The end index is exclusive.

import cv2

img = cv2.imread('resources/baboon.bmp')
cv2.imshow('Original Image', img)

# Check image size
print(img.shape) # (480, 500, 3)

# Crop rows (y) 50..149 and columns (x) 30..129: img[y1:y2, x1:x2]
cropped_img = img[50:150, 30:130]
cv2.imshow('Cropped Image', cropped_img)

# Check the cropped image size
print(cropped_img.shape) # (100, 100, 3)

cv2.waitKey(0)
cv2.destroyAllWindows()
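Crop regions are often given as a top-left corner plus a width and height rather than as index ranges. The following is a minimal sketch of a helper for that case; the name crop_xywh and the clamping behavior are assumptions for illustration, and a synthetic NumPy array with the same shape as the image above is used instead of a file.

```python
import numpy as np

def crop_xywh(img, x, y, w, h):
    """Return the region of width w and height h whose top-left corner is (x, y).

    The rectangle is clamped to the image bounds. Slicing returns a NumPy view,
    so .copy() is used to keep the crop independent of the original image.
    """
    img_h, img_w = img.shape[:2]
    x1, y1 = max(0, x), max(0, y)
    x2, y2 = min(img_w, x + w), min(img_h, y + h)
    return img[y1:y2, x1:x2].copy()

# Synthetic image with shape (480, 500, 3), i.e. height 480 and width 500.
img = np.zeros((480, 500, 3), dtype=np.uint8)

same_region = crop_xywh(img, 30, 50, 100, 100)   # equivalent to img[50:150, 30:130]
clamped = crop_xywh(img, 450, 400, 100, 100)     # partly outside the image
print(same_region.shape, clamped.shape)
```

Without .copy(), changes made to the cropped array would also change the original image, because NumPy slices share memory with the array they come from.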

3. Image Resizing

Resizing is one of the most frequently used operations in OpenCV. To resize an image, you first need to know its current size; after loading 'baboon.bmp', for example, you can check it with img.shape, which returns (height, width, channels). The examples in section 3.6 show how to check the size of the image and resize it. Images are enlarged or reduced with the cv2.resize() function, whose target size is given as (width, height). Because the pixel values of the new image have to be recalculated, interpolation is used. The interpolation method determines how the new pixel values are computed and has a great impact on image quality, and OpenCV provides several methods.
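One detail that is easy to get wrong is the order of the dimensions: img.shape starts with the height, while the size passed to cv2.resize() starts with the width. The helper below is a hypothetical convenience function (not part of OpenCV) that sketches the conversion for a given scale factor.

```python
def scaled_dsize(shape, scale):
    """Convert img.shape and a scale factor into a (width, height) tuple for cv2.resize()."""
    height, width = shape[:2]          # img.shape is (height, width, channels)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    return (new_w, new_h)              # cv2.resize() expects (width, height)

print(scaled_dsize((480, 500, 3), 5))     # (2500, 2400)
print(scaled_dsize((480, 500, 3), 0.2))   # (100, 96)
```

The resulting tuple can be passed as the second argument of cv2.resize(), for example cv2.resize(img, scaled_dsize(img.shape, 5)).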

3.1. cv2.INTER_NEAREST (Nearest Neighbor Interpolation)

Nearest neighbor interpolation is the simplest method: it takes the value of the pixel closest to the transformed coordinates, so it is fast and easy to calculate. However, when an image is enlarged, jagged, blocky edges (aliasing) appear and the result can look rough. It is generally used when performance matters more than quality or for simple purposes.

\[I{\prime}(x{\prime}, y{\prime}) = I\left( \text{round}(x), \text{round}(y) \right)\]

Here, \(I{\prime}(x{\prime}, y{\prime})\) is the pixel value of the new image, and \(I(x, y)\) is the pixel value of the original image. \((x, y)\) are the real-valued coordinates in the original image that correspond to \((x{\prime}, y{\prime})\); they are rounded to integers so that the value of the nearest pixel is used as it is.
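To make the formula concrete, here is a minimal NumPy sketch of nearest neighbor resizing. It is an illustration of the idea, not OpenCV's implementation: the pixel-center coordinate mapping and the function name resize_nearest are assumptions, and cv2.INTER_NEAREST may pick a different neighbor in some cases.

```python
import numpy as np

def resize_nearest(img, new_w, new_h):
    """Nearest neighbor resize of a 2D or 3D NumPy image (illustrative only)."""
    h, w = img.shape[:2]
    # Map each destination pixel center back to real-valued source coordinates.
    xs = (np.arange(new_w) + 0.5) * (w / new_w) - 0.5
    ys = (np.arange(new_h) + 0.5) * (h / new_h) - 0.5
    # round(x), round(y): floor(v + 0.5) rounds halves up; clip keeps indices valid.
    xi = np.clip(np.floor(xs + 0.5).astype(int), 0, w - 1)
    yi = np.clip(np.floor(ys + 0.5).astype(int), 0, h - 1)
    # Pick one source pixel per destination pixel.
    return img[yi[:, None], xi[None, :]]

small = np.array([[1, 2],
                  [3, 4]], dtype=np.uint8)
print(resize_nearest(small, 4, 4))
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```

Each source pixel is simply repeated, which is why enlarged images look blocky with this method.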

3.2. cv2.INTER_LINEAR (Bilinear Interpolation)

cv2.INTER_LINEAR performs bilinear interpolation and is the default method of cv2.resize(). Bilinear interpolation calculates a new pixel value as a linearly weighted average of the four nearest pixels. Compared to INTER_NEAREST, it produces a smoother and more natural image, but edges may be somewhat blurred. It is commonly used for both enlarging and reducing, and it offers medium quality with a good balance between speed and quality.

\[I{\prime}(x{\prime}, y{\prime}) = (1 - a)(1 - b)\,I(x_1, y_1) + a(1 - b)\,I(x_2, y_1) + (1 - a)\,b\,I(x_1, y_2) + a\,b\,I(x_2, y_2)\]

Here, \(x_1 = \lfloor x \rfloor\) and \(x_2 = x_1 + 1\) are the integer coordinates on either side of \(x\), and \(y_1 = \lfloor y \rfloor\) and \(y_2 = y_1 + 1\) are those on either side of \(y\). The weights are the fractional parts \(a = x - x_1\) and \(b = y - y_1\).
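The formula can be checked by hand with a tiny example. The sketch below samples a 2x2 grayscale image, stored as a plain list of rows, at real-valued coordinates; the function name bilinear_sample and the border clamping are assumptions for illustration, and the coordinates are assumed to lie inside the image.

```python
import math

def bilinear_sample(img, x, y):
    """Sample a grayscale image (list of rows) at real coordinates (x, y)."""
    h, w = len(img), len(img[0])
    x1, y1 = math.floor(x), math.floor(y)
    # Clamp the second neighbor so that sampling on the last row/column stays valid.
    x2, y2 = min(x1 + 1, w - 1), min(y1 + 1, h - 1)
    a, b = x - x1, y - y1
    # Rows are indexed by y and columns by x, so I(x, y) is img[y][x].
    return ((1 - a) * (1 - b) * img[y1][x1] + a * (1 - b) * img[y1][x2]
            + (1 - a) * b * img[y2][x1] + a * b * img[y2][x2])

img = [[0, 10],
       [20, 30]]
print(bilinear_sample(img, 0.5, 0.5))   # 15.0 (all four pixels weighted 0.25)
print(bilinear_sample(img, 1.0, 0.0))   # 10.0 (exactly on a pixel)
```

When the sample point falls exactly on a pixel, a and b are zero and the formula returns that pixel unchanged.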

3.3. cv2.INTER_CUBIC (Cubic Interpolation)

Cubic interpolation calculates each new pixel from the surrounding 4x4 block of pixels, weighting them with a 3rd degree polynomial along each axis (bicubic interpolation in 2D). The result is smooth and natural, and enlarged images are of high quality. Although it is computationally more expensive than bilinear interpolation, it is one of the most frequently used methods for image enlargement.

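\[I{\prime}(x{\prime}, y{\prime}) = \sum_{i=-1}^{2} \sum_{j=-1}^{2} w(i, j) I(\lfloor x \rfloor + i, \lfloor y \rfloor + j)\]

Here, \(w(i, j)\) is the weight for each of the 16 surrounding pixels. It is the product of two one-dimensional weights, \(w(i, j) = w(d_x)\, w(d_y)\) with \(d_x = x - \lfloor x \rfloor - i\) and \(d_y = y - \lfloor y \rfloor - j\), that is, the horizontal and vertical distances from the sample point to the pixel. The one-dimensional weight is a piecewise 3rd degree polynomial of that distance, defined as follows: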

\[ w(x) = \begin{cases}
(a + 2) |x|^3 - (a + 3) |x|^2 + 1 & \text{if } |x| \leq 1 \\
a |x|^3 - 5a |x|^2 + 8a |x| - 4a & \text{if } 1 < |x| < 2 \\
0 & \text{if } |x| \geq 2
\end{cases} \]

Here, \(a = -0.5\) is often used. OpenCV's own implementation of INTER_CUBIC uses \(a = -0.75\).
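The piecewise weight above translates directly into code. The following sketch (the function name cubic_weight is an assumption) evaluates the weights of the four neighbors along one axis for a sample point halfway between two pixels, using a = -0.5.

```python
def cubic_weight(x, a=-0.5):
    """One-dimensional cubic convolution weight for a distance x."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0

# Weights of the four neighbors (offsets -1, 0, 1, 2) for a sample point that
# lies t = 0.5 past the pixel at offset 0.
t = 0.5
weights = [cubic_weight(t - i) for i in (-1, 0, 1, 2)]
print(weights)        # [-0.0625, 0.5625, 0.5625, -0.0625]
print(sum(weights))   # 1.0
```

The two outer weights are negative, which is what lets cubic interpolation keep edges sharper than bilinear interpolation; it also means results can slightly overshoot the original value range.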

3.4. cv2.INTER_LANCZOS4 (Lanczos Interpolation)

Lanczos interpolation is based on a windowed sinc function and is the most complex and computationally intensive of the methods introduced here. It provides high-quality enlargement and is especially suitable for images with sharp edges. Because of its high computational cost, it is rarely used in tasks that require real-time processing and is mainly used where image quality has priority.

\[ I{\prime}(x{\prime}, y{\prime}) = \sum_{i=-n+1}^{n} \sum_{j=-n+1}^{n} I(\lfloor x \rfloor + i, \lfloor y \rfloor + j) \cdot \text{Lanczos}(x - \lfloor x \rfloor - i) \cdot \text{Lanczos}(y - \lfloor y \rfloor - j) \]

Here, the Lanczos function is defined as follows:

\[ \text{Lanczos}(t) = \begin{cases}
1 & \text{if } t = 0 \\
\frac{n \sin(\pi t) \sin(\pi t / n)}{(\pi t)^2} & \text{if } 0 < |t| < n \\
0 & \text{otherwise}
\end{cases} \]

For cv2.INTER_LANCZOS4, \(n = 4\), so an 8x8 neighborhood of pixels is used.
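As a rough sketch of how the kernel behaves, the code below evaluates the Lanczos function in its standard normalized form, sinc(x) * sinc(x / n), which equals n * sin(pi x) * sin(pi x / n) / (pi x)^2 inside the window and 1 at x = 0. The function name, the choice of eight taps along one axis, and the final normalization step are assumptions for illustration rather than a description of OpenCV's internals.

```python
import math

def lanczos(x, n=4):
    """Lanczos kernel sinc(x) * sinc(x / n), zero outside (-n, n)."""
    if x == 0:
        return 1.0
    if abs(x) >= n:
        return 0.0
    px = math.pi * x
    return n * math.sin(px) * math.sin(px / n) / (px * px)

# Weights of the 2n = 8 neighbors along one axis (offsets -3 .. 4) for a
# sample point that lies t = 0.5 past the pixel at offset 0.
t = 0.5
weights = [lanczos(t - i) for i in range(-3, 5)]

# The raw weights do not sum to exactly 1, so they are normalized here.
total = sum(weights)
normalized = [w / total for w in weights]
print([round(w, 4) for w in normalized])
```

The kernel is zero at every non-zero integer distance, so a sample point that falls exactly on a pixel reproduces that pixel's value.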

3.5. cv2.INTER_AREA (Area-Based Interpolation)

Area-based interpolation is a method specialized for downscaling images in OpenCV. Unlike the other methods, INTER_AREA does not interpolate between neighboring samples; it computes an area-weighted average. That is, each pixel of the downscaled image is the average of the region of the original image that it covers. Because every source pixel in that region contributes, the result is smooth and natural with little noise or aliasing, and detail is retained comparatively well even when reducing to a small size. It is not meant for upscaling: according to the OpenCV documentation, when an image is enlarged INTER_AREA behaves similarly to INTER_NEAREST.

When downscaling, the new pixel value \(I{\prime}(x{\prime}, y{\prime})\) is defined as the sum of several pixel values in the area of the original image divided by the size of that area. This can be expressed as a formula as follows:

\[I{\prime}(x{\prime}, y{\prime}) = \frac{1}{A} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} I(x + i, y + j)\]

Here, \( I{\prime}(x{\prime}, y{\prime}) \) is the pixel value of the reduced image, \( (x, y) \) is the top-left pixel of the corresponding region in the original image, and \( I(x + i, y + j) \) are the pixel values inside that region. \(m\) and \(n\) are the width and height (in pixels) of the region, and \( A = m \times n \) is its area.
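For the simple case where the image is reduced by exact integer factors, the averaging described above can be sketched in a few lines of NumPy. The function name downscale_area is an assumption, and the sketch only handles grayscale images whose size is a multiple of the block size; cv2.INTER_AREA also handles non-integer factors, which this sketch does not attempt.

```python
import numpy as np

def downscale_area(img, m, n):
    """Average non-overlapping blocks m pixels wide and n pixels high (2D grayscale)."""
    h, w = img.shape
    assert h % n == 0 and w % m == 0, "sketch handles exact integer factors only"
    # Split rows into groups of n and columns into groups of m.
    blocks = img.reshape(h // n, n, w // m, m).astype(np.float64)
    # (1 / A) * sum over each m x n region, with A = m * n.
    return blocks.mean(axis=(1, 3))

img = np.arange(16).reshape(4, 4)
print(downscale_area(img, 2, 2))
# [[ 2.5  4.5]
#  [10.5 12.5]]
```

Every source pixel contributes to exactly one output pixel, which is why this approach does not skip over detail the way nearest neighbor sampling does when shrinking.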

3.6. Image Differences by Interpolation Method

Each interpolation method suits a particular situation, and the results differ depending on whether you scale up or down. Here are some guidelines for choosing an interpolation method:

• When scaling up: cv2.INTER_CUBIC or cv2.INTER_LANCZOS4 provide good quality.
• When scaling down: cv2.INTER_AREA produces smooth images without distortion.
• When speed is important: cv2.INTER_NEAREST or cv2.INTER_LINEAR provide fast calculation speed.
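The guidelines above can be written down as a small decision helper. This is a hypothetical sketch, not an OpenCV function: it returns the name of the flag as a string so that it runs without OpenCV installed, and the rule for deciding between shrinking and enlarging (comparing pixel counts) is an assumption.

```python
def pick_interpolation(src_size, dst_size, prefer_speed=False):
    """Return the name of a cv2 interpolation flag for resizing src_size to dst_size.

    Sizes are (width, height) tuples. The helper only encodes the rules of
    thumb listed above.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    shrinking = dst_w * dst_h < src_w * src_h
    if prefer_speed:
        return "INTER_LINEAR"       # INTER_NEAREST is faster still, but rougher
    if shrinking:
        return "INTER_AREA"
    return "INTER_CUBIC"            # or INTER_LANCZOS4 when sharpness matters most

print(pick_interpolation((500, 480), (2500, 2400)))   # INTER_CUBIC
print(pick_interpolation((500, 480), (100, 96)))      # INTER_AREA
```

The returned name can be turned into the actual flag with getattr(cv2, name) and passed as the interpolation argument of cv2.resize().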

Example of scaling up (x5) with cv2.resize:

import cv2

img = cv2.imread('resources/baboon.bmp')
cv2.imshow('Original Image', img)

# Check image size
print(img.shape) # (480, 500, 3)

# Resize image
img_nearest = cv2.resize(img, (2500, 2400), interpolation=cv2.INTER_NEAREST)
img_linear = cv2.resize(img, (2500, 2400), interpolation=cv2.INTER_LINEAR)
img_cubic = cv2.resize(img, (2500, 2400), interpolation=cv2.INTER_CUBIC)
img_lanczos = cv2.resize(img, (2500, 2400), interpolation=cv2.INTER_LANCZOS4)

cv2.imshow('Nearest Neighbor Interpolation', img_nearest)
cv2.imshow('Linear Interpolation', img_linear)
cv2.imshow('Cubic Interpolation', img_cubic)
cv2.imshow('Lanczos Interpolation', img_lanczos)

# Check the adjusted image size
print(img_nearest.shape)  # (2400, 2500, 3)

cv2.waitKey(0)
cv2.destroyAllWindows()

Example of scaling down (x1/5) with cv2.resize:

import cv2

img = cv2.imread('resources/baboon.bmp')
cv2.imshow('Original Image', img)

# Check image size
print(img.shape) # (480, 500, 3)

# Resize image
img_nearest = cv2.resize(img, (100, 96), interpolation=cv2.INTER_NEAREST)
img_linear = cv2.resize(img, (100, 96), interpolation=cv2.INTER_LINEAR)
img_area = cv2.resize(img, (100, 96), interpolation=cv2.INTER_AREA)

cv2.imshow('Nearest Neighbor Interpolation', img_nearest)
cv2.imshow('Linear Interpolation', img_linear)
cv2.imshow('Area-Based Interpolation', img_area)

# Check the adjusted image size
print(img_nearest.shape)  # (96, 100, 3)

cv2.waitKey(0)
cv2.destroyAllWindows()

As explained above, there are trade-offs depending on each interpolation method.

If speed is important during enlargement, nearest or linear can be used, but aliasing may occur or high-frequency components (e.g. image detail or clarity) may be lost, as can be seen in the enlarged images above. Cubic or lanczos, on the other hand, improve quality at the cost of speed, so the execution time may need to be optimized separately.

Similarly, during reduction, nearest or linear can be used if speed is critical or the loss in image quality does not matter; otherwise, area-based interpolation is a reasonable choice that balances speed and quality.

P.S. In the author's personal experience, cubic was used the most in the field, with architecture-specific code optimization applied to improve speed. There were cases where lanczos had to be used, but only in areas where image detail or clarity was important; otherwise cubic was the usual choice.

4. Conclusion

When processing images in OpenCV, it is important to understand the image coordinate system and array indexing. In this article, we learned how to resize and crop an image. In OpenCV, you can resize an image using the cv2.resize function, and it is important to choose an interpolation method that suits your purpose. The more important the image quality is, the more likely you are to use INTER_CUBIC or INTER_LANCZOS4, while INTER_NEAREST or INTER_LINEAR are suitable for speed. INTER_LINEAR is often used in real-time processing, while INTER_CUBIC and INTER_LANCZOS4 are often used in print and graphics work. Additionally, cropping can be easily handled using array indexes.



★ All contents are referenced from the link below. ★

OpenCV: OpenCV-Python Tutorials (docs.opencv.org)