OpenCV + Python Binarization (Thresholding)

Hello. This is codingwalks.

Image thresholding (binarization) is a core technique in image processing. It converts each pixel to one of two values, black or white, which discards unnecessary information and keeps only the key parts of the image. Binarization is useful as a step in tasks such as noise removal, object detection, and boundary extraction. In this article, we will learn how to use several binarization techniques in Python with OpenCV, from the basic cv2.threshold function to adaptive binarization and Otsu binarization, working through real examples.

1. Basic Binarization (cv2.threshold)

cv2.threshold is the basic function for image binarization, converting pixel values to black and white using a fixed threshold value. This function provides various parameters and binarization methods, so that it can be applied to various scenarios.

1.1. Basic Usage

retval, dst = cv2.threshold(src, thresh, maxval, type)

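src: Input image (usually grayscale)
thresh: Threshold value
maxval: Value assigned by the THRESH_BINARY and THRESH_BINARY_INV types (white, usually 255)
type: Binarization method (see 1.2)
retval: The threshold that was actually used (equal to thresh, or the computed value when Otsu is used)
dst: Output image, the same size as src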

1.2. Various Threshold Types

cv2.THRESH_BINARY: Set to maxval if the pixel is greater than the threshold, otherwise 0.
cv2.THRESH_BINARY_INV: The opposite of BINARY: set to 0 if the pixel is greater than the threshold, otherwise maxval.
cv2.THRESH_TRUNC: Set to the threshold if the pixel is greater than the threshold, otherwise keep the pixel value.
cv2.THRESH_TOZERO: Keep the pixel value if it is greater than the threshold, otherwise set it to 0.
cv2.THRESH_TOZERO_INV: The opposite of TOZERO: set to 0 if the pixel is greater than the threshold, otherwise keep the pixel value.
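
To make the five rules concrete, here is a minimal pure-Python sketch that applies each one to a handful of sample pixel values. The helper apply_threshold and the string keys are hypothetical names used only for illustration; they are not part of OpenCV, which performs the same comparisons on whole images.

```python
def apply_threshold(pixels, thresh, maxval, kind):
    """Apply one fixed-threshold rule to a list of pixel values (illustrative helper)."""
    rules = {
        "BINARY":     lambda p: maxval if p > thresh else 0,
        "BINARY_INV": lambda p: 0 if p > thresh else maxval,
        "TRUNC":      lambda p: thresh if p > thresh else p,
        "TOZERO":     lambda p: p if p > thresh else 0,
        "TOZERO_INV": lambda p: 0 if p > thresh else p,
    }
    return [rules[kind](p) for p in pixels]


# Sample values chosen to sit below, exactly at, and above the threshold 127
sample = [0, 100, 127, 128, 255]
kinds = ["BINARY", "BINARY_INV", "TRUNC", "TOZERO", "TOZERO_INV"]
results = {kind: apply_threshold(sample, 127, 255, kind) for kind in kinds}

for kind in kinds:
    print(f"{kind:<11}", results[kind])
```

Note that a pixel exactly equal to the threshold (127 here) counts as "not greater", so it becomes 0 under BINARY and maxval under BINARY_INV.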

1.3. Example source using cv2.threshold()

import cv2
import matplotlib.pyplot as plt

img = cv2.imread('resources/lena.bmp', cv2.IMREAD_GRAYSCALE)

ret, th_binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
ret, th_binary_inv = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)
ret, th_trunc = cv2.threshold(img, 127, 255, cv2.THRESH_TRUNC)
ret, th_tozero = cv2.threshold(img, 127, 255, cv2.THRESH_TOZERO)
ret, th_tozero_inv = cv2.threshold(img, 127, 255, cv2.THRESH_TOZERO_INV)

titles = ['ORIGINAL', 'BINARY', 'BINARY_INV', 'TRUNC', 'TOZERO', 'TOZERO_INV']
images = [img, th_binary, th_binary_inv, th_trunc, th_tozero, th_tozero_inv]

plt.figure(figsize=(10, 10))
for i in range(len(images)):
    plt.subplot(2, 3, i+1), plt.imshow(images[i], cmap='gray')
    plt.title(titles[i])
    plt.xticks([]), plt.yticks([])

plt.tight_layout()
plt.show()

2. Adaptive Threshold Processing (cv2.adaptiveThreshold)

2.1. Limitations of Fixed Threshold

If the lighting is not uniform or the brightness of the image is not constant, binarization using a fixed threshold can lead to inaccurate results.

2.2. Need for Adaptive Binarization

Adaptive binarization calculates and applies different thresholds to each area of the image, so it is useful when processing images with uneven lighting.
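
The difference is easy to see on a toy example. The sketch below uses a single row of pixels whose background brightens from left to right, with two darker "ink" pixels at indices 2 and 6. The row values, the helper names, and the edge handling (repeating the border pixel) are assumptions made for illustration, not the OpenCV implementation.

```python
def global_threshold(row, thresh, maxval=255):
    """One fixed threshold for every pixel."""
    return [maxval if p > thresh else 0 for p in row]


def local_mean_threshold(row, block_size, c, maxval=255):
    """Per-pixel threshold: mean of the surrounding window minus c."""
    half = block_size // 2
    last = len(row) - 1
    out = []
    for i, p in enumerate(row):
        # Clamp indices so the border pixel is repeated at the edges
        window = [row[min(max(j, 0), last)] for j in range(i - half, i + half + 1)]
        t = sum(window) / len(window) - c
        out.append(maxval if p > t else 0)
    return out


# Background ramps from 60 to 220; ink pixels sit at indices 2 and 6
row = [60, 80, 40, 120, 140, 160, 120, 200, 220]

fixed = global_threshold(row, 127)
local = local_mean_threshold(row, block_size=3, c=10)

print("fixed:", fixed)  # dark-side background is wrongly turned black
print("local:", local)  # only the two ink pixels are black
```

With the fixed threshold, the dim background on the left falls below 127 and is marked black along with the ink. The local version compares each pixel with its own neighborhood, so only the two ink pixels come out black.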

2.3. cv2.ADAPTIVE_THRESH_MEAN_C

This method calculates the threshold based on the average of the surrounding pixel values. The principle of this method is to calculate the average of the pixel values in a given area in each small block of the image, and use the value obtained by subtracting the \(C\) value from the average as the threshold. This allows the threshold to be set differently depending on the pixel values of each small block or area, allowing it to adapt to local lighting changes in the image. Applicable scenarios include images with relatively constant lighting or without complex textures, and it works well on simple documents or text images.

\[T(x, y) = \frac{1}{|N(x, y)|} \sum_{(i,j) \in N(x, y)} I(i, j) - C\]

Here, \(N(x, y)\) is the block of pixels surrounding \((x, y)\), \(|N(x, y)|\) is the number of pixels in that block, \(I(i, j)\) is the value of the pixel at \((i, j)\), and \(C\) is a constant used to fine-tune the threshold.
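
As a small worked instance of this formula, the sketch below computes \(T(x, y)\) for the center of a 3x3 block and shows how \(C\) changes the outcome. The helper block_mean_threshold and the pixel values are hypothetical; the block is assumed to lie fully inside the image, so no border handling is needed.

```python
def block_mean_threshold(image, row, col, block_size, c):
    """Mean of the block centered on (row, col), minus the constant c."""
    half = block_size // 2
    values = [
        image[r][k]
        for r in range(row - half, row + half + 1)
        for k in range(col - half, col + half + 1)
    ]
    return sum(values) / len(values) - c


# A nearly flat patch: the center is only slightly darker than its neighbors
image = [
    [100, 100, 100],
    [100,  98, 100],
    [100, 100, 100],
]
center = image[1][1]

t_c0 = block_mean_threshold(image, 1, 1, block_size=3, c=0)
t_c5 = block_mean_threshold(image, 1, 1, block_size=3, c=5)

print("C=0:", round(t_c0, 2), "-> white" if center > t_c0 else "-> black")
print("C=5:", round(t_c5, 2), "-> white" if center > t_c5 else "-> black")
```

With C = 0 the tiny dip at the center falls below the local mean and is marked black; with C = 5 the threshold drops far enough that the pixel stays white. This is why a positive C helps suppress speckle in flat regions.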

2.4. cv2.ADAPTIVE_THRESH_GAUSSIAN_C

This method calculates the threshold by applying Gaussian weights to the surrounding pixel values. The principle is to apply a Gaussian kernel to the pixels in a given block, calculate their weighted average, and use the value obtained by subtracting \(C\) from that average as the threshold. That is, pixels closer to the center of the block receive higher weights, and the weights decrease with distance from the center. This is advantageous for images with complex textures or boundaries, because the threshold reflects the immediate surroundings of each pixel more precisely. Applicable scenarios include images with complex lighting changes, images with complex textures, and images containing objects with clear boundaries. For example, in document scans where the background is not uniform, this method can provide better results.

\[T(x, y) = \sum_{i,j \in N(x, y)} w(i,j)\cdot I(i, j) - C\]

Where \(w(i,j)\) is the Gaussian weight, \(I(i, j)\) is the value of the corresponding pixel.
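
The weighting can be sketched in plain Python as follows. The helpers gaussian_weights and gaussian_threshold are hypothetical names. The default sigma is an assumption: it mirrors the formula OpenCV documents for deriving sigma from a kernel size, and the weights are normalized so that they sum to 1.

```python
import math


def gaussian_weights(block_size, sigma=None):
    """Normalized 2D Gaussian weights for a block_size x block_size block."""
    if sigma is None:
        # Assumed default, derived from the block size
        sigma = 0.3 * ((block_size - 1) * 0.5 - 1) + 0.8
    half = block_size // 2
    raw = [
        [math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
         for dx in range(-half, half + 1)]
        for dy in range(-half, half + 1)
    ]
    total = sum(sum(r) for r in raw)
    return [[v / total for v in r] for r in raw]


def gaussian_threshold(image, row, col, block_size, c):
    """Gaussian-weighted average of the block centered on (row, col), minus c."""
    weights = gaussian_weights(block_size)
    half = block_size // 2
    weighted = sum(
        weights[dy + half][dx + half] * image[row + dy][col + dx]
        for dy in range(-half, half + 1)
        for dx in range(-half, half + 1)
    )
    return weighted - c


# A bright center pixel surrounded by a darker background
image = [
    [100, 100, 100],
    [100, 200, 100],
    [100, 100, 100],
]
c = 5
mean_t = sum(sum(r) for r in image) / 9 - c
gaussian_t = gaussian_threshold(image, 1, 1, block_size=3, c=c)

print("mean threshold:    ", round(mean_t, 2))
print("gaussian threshold:", round(gaussian_t, 2))
```

Because the center pixel carries more weight than its neighbors, the Gaussian threshold is pulled further toward the center value than the plain mean threshold is.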

2.5. How to use cv2.adaptiveThreshold function

dst = cv2.adaptiveThreshold(src, maxValue, adaptiveMethod, thresholdType, blockSize, C)

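src: Input image (8-bit, single-channel grayscale).
maxValue: Value assigned to pixels that satisfy the condition (usually 255).
adaptiveMethod: cv2.ADAPTIVE_THRESH_MEAN_C or cv2.ADAPTIVE_THRESH_GAUSSIAN_C.
thresholdType: cv2.THRESH_BINARY or cv2.THRESH_BINARY_INV only. The other types of basic binarization are not supported here.
blockSize: Size of the neighborhood used to calculate the threshold (an odd number greater than 1, such as 3, 5, or 7).
C: Constant subtracted from the mean or weighted mean. Adjust it to fine-tune the binarization result.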

2.6. Example source using cv2.adaptiveThreshold()

import cv2
from matplotlib import pyplot as plt 

img = cv2.imread('resources/sudoku.jpg', cv2.IMREAD_GRAYSCALE)

th_adap_mean = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 45, 5)
th_adap_gaussian = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 45, 5)

titles = ['Original', 'Mean', 'Gaussian']
images = [img, th_adap_mean, th_adap_gaussian]

plt.figure(figsize=(10, 4))
for i in range(len(images)):
    # Three images side by side in a single row
    plt.subplot(1, 3, i+1), plt.imshow(images[i], cmap='gray')
    plt.title(titles[i])
    plt.xticks([]), plt.yticks([])

plt.tight_layout()
plt.show()

3. Otsu's Threshold

Otsu's binarization is an image binarization technique that determines the threshold value automatically. Unlike basic binarization, which requires setting the threshold manually, it analyzes the histogram of the image to find the optimal threshold. This method is especially effective when the foreground and background of the image are clearly distinguished, that is, when the histogram has two distinct peaks. It is a global binarization method: it divides the histogram into two classes (foreground and background) and selects the threshold that minimizes the variance within the two classes, which is equivalent to maximizing the variance between them.

Foreground (object): The part of the image that we are interested in.
Background: The remaining part excluding the foreground.

Otsu's method works by finding the threshold value that maximizes the variance between the two groups (between-class variance) when the pixel values are divided into two groups. This allows the threshold value that best distinguishes the foreground and background to be automatically determined.

3.1. Limitations of Otsu Binarization

Limitations in images with uneven illumination: Since Otsu's binarization applies a single threshold to the entire image, it may perform poorly in images with uneven illumination or where the background and foreground are not clear. In this case, adaptive thresholding or local thresholding techniques should be used.
Prone to noise: If the image contains a lot of noise, the histogram may become distorted, making it difficult for the Otsu algorithm to find an appropriate threshold. In this case, it is recommended to apply Gaussian blurring and then use Otsu binarization.

3.2. Otsu Binarization Process

Otsu binarization analyzes the histogram to select a threshold that maximizes the variance between classes. Here, it is assumed that the two classes are best separated when the variance between classes is maximized.

3.2.1. Generating an Image Histogram

First, the image is converted to grayscale, and a histogram representing the distribution of each pixel value is generated. A histogram is a graph that shows the frequency (number of pixels) of each grayscale value.
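
A minimal sketch of this step in plain Python is shown below. The helper gray_histogram and the sample pixel list are hypothetical and assume 8-bit values in the range 0 to 255.

```python
def gray_histogram(pixels, levels=256):
    """Count how many pixels take each gray level from 0 to levels - 1."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return hist


# A flattened toy image with seven pixels
pixels = [0, 0, 50, 50, 50, 200, 255]
hist = gray_histogram(pixels)

print("pixels at level 50:", hist[50])
print("total pixels:", sum(hist))
```

For real images, OpenCV's cv2.calcHist produces the same kind of per-level counts far more efficiently.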

3.2.2. Definition of intraclass variance and interclass variance

After dividing all pixel values into two groups, the mean and variance of each group are calculated. The important concepts in Otsu's method are intraclass variance and interclass variance.

Intraclass variance: It indicates how spread out the pixel values of each group are.
Interclass variance: It indicates how far apart the mean values of the two groups are. When this value is at its maximum, the two groups are best separated.
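
The two quantities are linked by a standard identity. Writing \(w_0(t), w_1(t)\) for the fraction of pixels in each group, \(\sigma_0^2(t), \sigma_1^2(t)\) for the variance of each group, and \(\sigma^2\) for the variance of the whole image (notation assumed here for illustration):

\[\sigma_w^2(t) = w_0(t)\,\sigma_0^2(t) + w_1(t)\,\sigma_1^2(t), \qquad \sigma^2 = \sigma_w^2(t) + \sigma_b^2(t)\]

Since the total variance \(\sigma^2\) does not depend on \(t\), the threshold that minimizes the intraclass variance \(\sigma_w^2(t)\) is the same one that maximizes the interclass variance \(\sigma_b^2(t)\).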

3.2.3. Threshold optimization

For all possible threshold values t of the image, the interclass variance is calculated using the formula below. The optimal threshold is the threshold that maximizes the interclass variance.

\[\sigma_b^2(t) = w_0(t) \cdot w_1(t) \cdot [\mu_0(t) - \mu_1(t)]^2\]

Here,

\(w_0(t), w_1(t)\): Probability (i.e. pixel ratio) of each class (background and foreground).
\(\mu_0(t), \mu_1(t)\): Mean of each class.

In this equation, the t value that maximizes the between-class variance is chosen as the optimal threshold.
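
The search over \(t\) can be sketched in plain Python directly from the formula. This is an illustrative version, not OpenCV's implementation; the function name otsu_threshold and the toy histogram are assumptions. Class 0 is taken to be the pixels with value at most \(t\), which matches the "greater than the threshold" rule of cv2.THRESH_BINARY.

```python
def otsu_threshold(histogram):
    """Return the t that maximizes w0 * w1 * (mu0 - mu1)^2 for a gray histogram."""
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_t, best_var = 0, 0.0
    count0 = 0  # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values

    for t, count in enumerate(histogram):
        count0 += count
        sum0 += t * count
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, so the variance is undefined

        w0 = count0 / total
        w1 = count1 / total
        mu0 = sum0 / count0
        mu1 = (sum_all - sum0) / count1
        var_between = w0 * w1 * (mu0 - mu1) ** 2

        if var_between > best_var:
            best_var, best_t = var_between, t

    return best_t


# Toy bimodal histogram: a dark cluster near 50-60 and a bright one near 200-210
hist = [0] * 256
hist[50], hist[60] = 100, 50
hist[200], hist[210] = 100, 50

t = otsu_threshold(hist)
print("Otsu threshold:", t)
```

In this toy histogram every threshold from 60 to 199 separates the two clusters equally well, so the sketch reports the first such value.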

3.3. Applying Otsu in cv2.threshold()

import cv2

img = cv2.imread('resources/noisy_leaf.jpg', cv2.IMREAD_GRAYSCALE)

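# Smooth the image first so that noise does not distort the histogram
blur = cv2.GaussianBlur(img, (9, 9), 0)

# With THRESH_OTSU the thresh argument (0 here) is ignored;
# ret holds the threshold that Otsu's method computed
ret, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print('Otsu threshold:', ret)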

cv2.imshow('Gaussian + Otsu Binary', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()

4. Conclusion

The various binarization techniques covered in this post are important basic techniques for image processing. You can easily use various binarization techniques in Python through OpenCV, and if you understand the characteristics and advantages of each technique, you can maximize the efficiency of image processing. Binarization is a powerful tool that simply separates an image into foreground and background, and is used in various fields such as object detection, noise removal, and document scanning.

We checked the basic binarization method using a fixed threshold through cv2.threshold(). In addition, we were able to obtain effective results even in images with uneven lighting or complex images through adaptive binarization. In particular, MEAN_C and GAUSSIAN_C in cv2.adaptiveThreshold() provide options suitable for each situation. Finally, Otsu binarization is a powerful technique that automatically calculates the threshold through image histogram analysis, and it performs excellently in images with clear foreground and background. However, in cases where lighting is uneven or there is a lot of noise, it provides better results when combined with methods such as adaptive binarization or Gaussian blur.

Binarization is a fundamental and important step in computer vision and image processing, and can be used in a variety of applications. In the future, as you learn advanced image processing techniques such as morphological operations, object detection, and segmentation, you will be able to utilize binarization techniques more deeply.