Hello. This is codingwalks.
Histogram analysis plays a central role in image processing. A histogram represents the distribution of pixel values in an image and is useful for analyzing its brightness, contrast, and dynamic range. In this post, I will explain several histogram processing techniques using OpenCV and Python and present the code to implement them. The main topics are how to obtain a basic histogram, histogram equalization, CLAHE (Contrast Limited Adaptive Histogram Equalization), histogram stretching, and histogram matching. Two further concepts are essential to histogram analysis: the PDF (Probability Density Function) and the CDF (Cumulative Distribution Function). These two functions are the basis for understanding and adjusting the distribution of pixel values in an image.
1. Histogram Analysis
A histogram is a distribution graph that shows how many times each pixel value (0-255) appears in an image. For grayscale images, the histogram is calculated from a single channel, and for color images, it can be calculated separately for each channel (R, G, B). Note that OpenCV loads color images in B, G, R channel order.
1.1. How to Obtain a Histogram
Assume that \( I \) is a grayscale image of size \( M \times N \), so that each pixel value \( I(i, j) \) lies in the range \( [0, L-1] \). Here, \( L \) is the number of possible intensity levels, which is usually \( L=256 \) for 8-bit images.
The histogram can be defined mathematically as follows:
\[ h(k) = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \delta(I(i, j) - k) \]
Where:
• \( h(k) \) is the frequency (histogram value) for pixel value \( k \).
• \( \delta(x) \) is the Kronecker delta (the discrete counterpart of the Dirac delta), which is 1 when \( x = 0 \) and 0 otherwise. That is, it adds 1 only when the pixel value is equal to \( k \).
• \( I(i, j) \) is the pixel value at coordinate \( (i, j) \).
• \( k \) is the histogram bin number, typically \( k \in [0, 255] \).
So this formula counts the frequency of pixel values \( k \) in the image.
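To make the formula concrete, here is a minimal NumPy sketch that counts pixel values directly. The function name histogram_by_counting and the tiny 2x3 image are made up for illustration, and np.bincount stands in for the double sum over \( \delta \).

```python
import numpy as np

def histogram_by_counting(img, levels=256):
    """Return h(k) for k in [0, levels-1]: how often each value occurs."""
    # np.bincount adds 1 to bin k for every pixel equal to k,
    # which is exactly what the delta term in the formula does.
    return np.bincount(img.ravel(), minlength=levels)

# Hypothetical 2x3 image with values in [0, 3], so levels=4.
tiny = np.array([[0, 1, 1],
                 [3, 1, 0]], dtype=np.uint8)
h = histogram_by_counting(tiny, levels=4)
print(h)  # [2 3 0 1]
```

The counts add up to the number of pixels (6 here), which is a quick way to check any histogram.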
When analyzing an image in OpenCV, the cv2.calcHist function is used to calculate a histogram. This function calculates the distribution of pixel values in the image and displays the number of times each pixel value appears as a histogram. This allows you to analyze the brightness, contrast, etc. of the image.
cv2.calcHist(images, channels, mask, histSize, ranges, accumulate=False)
• images: A list of images for which to calculate histograms (in the form of [image])
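• channels: Index of the channel to use, given as a list. Indices are zero-based: [0] for a grayscale image, and [0], [1], [2] for the B, G, R channels of a color image loaded with cv2.imread
• mask: A mask that specifies a specific area for which to calculate the histogram. None when calculating for the entire image
• histSize: The number of histogram bins, given as a list. Usually set to [256]
• ranges: The range of pixel values to calculate. Usually set to [0, 256]; the upper bound is exclusive, so this covers values 0 to 255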
• accumulate: Whether to accumulate the histogram on the previous values (default False)
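The effect of histSize and mask is easier to see on a synthetic image. The following is a NumPy-only sketch of the idea, not how OpenCV implements it. The helper coarse_histogram is a hypothetical name, and it assumes the number of levels is divisible by the number of bins.

```python
import numpy as np

def coarse_histogram(img, bins=16, levels=256, mask=None):
    """Histogram with `bins` equal-width bins, optionally restricted by a mask."""
    # Keep only the pixels where the mask is non-zero (all pixels if no mask).
    values = img.ravel() if mask is None else img[mask > 0]
    width = levels // bins  # assumes levels is divisible by bins
    return np.bincount(values // width, minlength=bins)

# Synthetic 16x16 image containing every value 0..255 exactly once.
img = np.arange(256, dtype=np.uint8).reshape(16, 16)
mask = np.zeros(img.shape, dtype=np.uint8)
mask[:8, :] = 255  # select the top half, which holds values 0..127

full = coarse_histogram(img)                 # 16 pixels in each of the 16 bins
top_half = coarse_histogram(img, mask=mask)  # bins 8..15 stay empty
```

With 16 bins, each bin covers 16 consecutive pixel values, and the mask simply removes pixels from the count.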
Calculating histograms on grayscale images:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Load the image as grayscale
image = cv2.imread('resources/lena.bmp', cv2.IMREAD_GRAYSCALE)
# Calculating histograms
hist = cv2.calcHist([image], [0], None, [256], [0, 256])
# Plotting the image and its histogram (Matplotlib)
plt.figure(figsize=(10, 5), linewidth=2)
# Show Image
plt.subplot(1, 2, 1)
plt.imshow(image, cmap='gray')
plt.title('Grayscale Image')
plt.axis('off')
# Histogram plot
plt.subplot(1, 2, 2)
plt.plot(hist)
plt.title('Grayscale Histogram')
plt.xlabel('Pixel Value')
plt.ylabel('Frequency')
# Show plot
plt.tight_layout()
plt.savefig('results/hist1.png', dpi=200, facecolor='#eeeeee', edgecolor='black')
plt.show()
Calculating histograms in color images (calculating R, G, B separately):
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Load image (color image)
image = cv2.imread('resources/lena.bmp')
# Plotting the image and its per-channel histograms (Matplotlib)
plt.figure(figsize=(10, 5), linewidth=2)
# Show Image
plt.subplot(1, 2, 1)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.title('RGB Image')
plt.axis('off')
# Calculate histogram (for each channel)
plt.subplot(1, 2, 2)
colors = ('b', 'g', 'r')
for i, color in enumerate(colors):
    hist = cv2.calcHist([image], [i], None, [256], [0, 256])
    plt.plot(hist, color=color)
plt.title('Color Histogram')
plt.xlabel('Pixel Value')
plt.ylabel('Frequency')
# Show plot
plt.tight_layout()
plt.savefig('results/hist2.png', dpi=200, facecolor='#eeeeee', edgecolor='black')
plt.show()
Calculating histograms in specific areas (using masks):
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Load the image as grayscale
image = cv2.imread('resources/lena.bmp', cv2.IMREAD_GRAYSCALE)
# Create a mask (calculate histogram for only a part of the image)
mask = np.zeros(image.shape[:2], np.uint8)
mask[128:384, 128:384] = 255
# Calculating histogram using mask
masked_hist = cv2.calcHist([image], [0], mask, [256], [0, 256])
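# Plotting the masked image and its histogram (Matplotlib)
plt.figure(figsize=(10, 5), linewidth=2)
# Show Image
# Keep the pixels inside the mask and set everything outside it to black.
# Saturating subtraction gives 255 - (255 - image) = image where the mask is 255, and 0 where it is 0.
show_image = cv2.subtract(mask, 255-image)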
plt.subplot(1, 2, 1)
plt.imshow(show_image, cmap='gray')
plt.title('Grayscale Image')
plt.axis('off')
# Histogram plot
plt.subplot(1, 2, 2)
plt.plot(masked_hist)
plt.title('Masked Histogram')
plt.xlabel('Pixel Value')
plt.ylabel('Frequency')
# Show plot
plt.tight_layout()
plt.savefig('results/hist3.png', dpi=200, facecolor='#eeeeee', edgecolor='black')
plt.show()
The examples above calculate the histogram of an image using OpenCV's calcHist function and then plot it with Matplotlib.
1.2. Probability Density Function (PDF)
If we interpret the histogram probabilistically, we can convert it into a probability density function (PDF). The PDF, which represents the probability of each pixel value occurring, is the histogram value divided by the total number of pixels \( M \times N \).
The PDF is defined as follows:
\[ P(k)=\frac{h(k)}{M\times N} \]
Where:
• \( P(k) \) is the probability for pixel value \( k \).
• \( M \times N \) is the total number of pixels.
• \( h(k) \) is the frequency of pixel value \( k \).
This formula calculates the probability that a pixel value \( k \) appears in an image.
Example PDF calculation code:
def calculate_pdf(img):
    # Calculating histograms
    hist = cv2.calcHist([img], [0], None, [256], [0, 256])
    # Convert to probability by dividing by total number of pixels (PDF)
    pdf = hist / np.sum(hist)
    return pdf
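As a quick sanity check, the PDF values should always add up to 1. Below is a small NumPy-only sketch on a made-up 2x3 image with four gray levels, using np.bincount in place of cv2.calcHist so it runs without an image file.

```python
import numpy as np

# Hypothetical 2x3 image with values in [0, 3].
tiny = np.array([[0, 1, 1],
                 [3, 1, 0]], dtype=np.uint8)
h = np.bincount(tiny.ravel(), minlength=4)  # counts: 2, 3, 0, 1
pdf = h / tiny.size                         # probabilities: 2/6, 3/6, 0, 1/6
total = pdf.sum()                           # should be 1 up to rounding
```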
1.3. Cumulative Distribution Function (CDF)
The cumulative distribution function (CDF) accumulates the probabilities of all pixel values less than or equal to a given pixel value. The CDF is used in techniques such as histogram equalization and is useful for redistributing the pixel values of an image.
The CDF is defined as follows:
\[ C(k) = \sum_{i=0}^{k} P(i) \]
Where:
• \( C(k) \) is the cumulative probability of all pixel values less than or equal to \( k \).
• \( P(i) \) is the probability (PDF) for pixel value \( i \).
This formula calculates the cumulative proportion of pixels whose value is at most \( k \). \( C(k) \) is non-decreasing and lies in the range \( 0 \leq C(k) \leq 1 \), with \( C(L-1) = 1 \).
Example code for calculating CDF:
def calculate_cdf(pdf):
    cdf = np.cumsum(pdf)
    # Normalize to 0~1 range
    cdf_normalized = cdf / cdf.max()
    return cdf_normalized
1.4. Relationship between PDF and CDF
PDF represents the probability distribution of each pixel value, and CDF is the accumulated value of that probability. Therefore, CDF can be calculated using PDF. In histogram equalization, CDF is used to transform pixel values, and plays an important role in adjusting the contrast of the image.
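The relationship also works in reverse: taking differences of the CDF gives the PDF back. Here is a minimal sketch with a made-up four-level PDF (the numbers are hypothetical and chosen so the arithmetic is exact).

```python
import numpy as np

pdf = np.array([0.25, 0.5, 0.0, 0.25])  # hypothetical PDF over 4 gray levels
cdf = np.cumsum(pdf)                    # running sum: 0.25, 0.75, 0.75, 1.0
# Differencing the CDF recovers the PDF; prepend=0 restores the first entry.
recovered = np.diff(cdf, prepend=0.0)
```

Notice that the CDF stays flat (0.75 twice) wherever the PDF is zero, which is why empty histogram bins show up as plateaus in a CDF plot.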
Calculating histogram PDF and CDF in grayscale images:
import cv2
import numpy as np
import matplotlib.pyplot as plt
def calculate_pdf(img):
    # Calculating histograms (calculating the frequency of pixel values)
    hist = cv2.calcHist([img], [0], None, [256], [0, 256])
    # Convert each bin to a probability by dividing it by the total number of pixels.
    pdf = hist / np.sum(hist)
    return pdf
def calculate_cdf(pdf):
    # Compute CDF by accumulating PDFs
    cdf = np.cumsum(pdf)
    # Normalize to 0~1 range
    cdf_normalized = cdf / cdf.max()
    return cdf_normalized
# Load the image as grayscale
image = cv2.imread('resources/lena.bmp', cv2.IMREAD_GRAYSCALE)
# PDF Calculation
pdf = calculate_pdf(image)
# CDF calculation
cdf = calculate_cdf(pdf)
# Plotting PDF and CDF (Matplotlib)
plt.figure(figsize=(15, 5), linewidth=2)
# Show Image
plt.subplot(1, 3, 1)
plt.imshow(image, cmap='gray')
plt.title('Original Image')
plt.axis('off')
# PDF plot
plt.subplot(1, 3, 2)
plt.plot(pdf, color='blue')
plt.title('PDF (Probability Density Function)')
plt.xlabel('Pixel Value')
plt.ylabel('Probability')
# CDF plot
plt.subplot(1, 3, 3)
plt.plot(cdf, color='green')
plt.title('CDF (Cumulative Distribution Function)')
plt.xlabel('Pixel Value')
plt.ylabel('Cumulative Probability')
# Show graph
plt.tight_layout()
plt.savefig('results/pdf_cdf.png', dpi=200, facecolor='#eeeeee', edgecolor='black')
plt.show()
2. Histogram Equalization
Histogram equalization is a method of redistributing the pixel values of an image using CDF. It transforms the pixel values based on CDF to make the pixel values of the image evenly distributed within the range of 0 to 255. This can increase the contrast of the image.
The transformed pixel value \( T(x) \) is calculated as follows:
\[ T(x) = round((L - 1) \cdot C(x)) \]
Where:
• \( T(x) \) is the transformed pixel value.
• \( L \) is the number of grayscale levels (usually 256).
• \( C(x) \) is the cumulative distribution function (CDF) for the pixel value \( x \).
• \( round(\cdot) \) means the rounding function.
This transformation adjusts the pixel values of the image to be evenly distributed.
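To see the transform at work, here is a small NumPy sketch that applies \( T(x) \) as a lookup table to a made-up dark image with 8 gray levels. The function name equalize_with_lut is hypothetical. It follows the formula above literally, and cv2.equalizeHist may normalize the CDF slightly differently, so its output is not guaranteed to match this sketch value for value.

```python
import numpy as np

def equalize_with_lut(img, levels=256):
    """Apply T(x) = round((levels - 1) * C(x)) through a lookup table."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / img.size                 # C(x) in [0, 1]
    lut = np.round((levels - 1) * cdf).astype(np.uint8)
    return lut[img], lut                             # map every pixel via the table

# Hypothetical dark image: only levels 0..3 of the 8 available are used.
dark = np.array([[0, 1, 2, 3],
                 [0, 1, 2, 3]], dtype=np.uint8)
equalized, lut = equalize_with_lut(dark, levels=8)
print(equalized[0])  # [2 4 5 7]
```

The input only occupied levels 0 to 3, while the output spreads over 2 to 7, which is the contrast gain that equalization is after.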
Example using cv2.equalizeHist:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Load the image as grayscale
image = cv2.imread('resources/lena.bmp', cv2.IMREAD_GRAYSCALE)
# Apply histogram equalization
equalized_image = cv2.equalizeHist(image)
# Plotting the original and equalized images (Matplotlib)
plt.figure(figsize=(10, 5), linewidth=2)
plt.subplot(1, 2, 1)
plt.imshow(image, cmap='gray')
plt.title('Original Image')
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(equalized_image, cmap='gray')
plt.title('Equalized Image')
plt.axis('off')
# Show image
plt.tight_layout()
plt.savefig('results/hist_equalize1.png', dpi=200, facecolor='#eeeeee', edgecolor='black')
plt.show()
Example of histogram equalization using CDF calculation:
import cv2
import numpy as np
import matplotlib.pyplot as plt
def calculate_pdf(img):
    # Calculating histograms
    hist = cv2.calcHist([img], [0], None, [256], [0, 256])
    # Convert to probability by dividing by total number of pixels (PDF)
    pdf = hist / np.sum(hist)
    return hist, pdf
def calculate_cdf(pdf):
    # Compute CDF by accumulating PDFs
    cdf = np.cumsum(pdf)
    # Normalize to 0~1 range
    cdf_normalized = cdf / cdf.max()
    return cdf_normalized
# Calculating and applying CDF during histogram equalization process
def histogram_equalization(img):
    # PDF Calculation
    hist, pdf = calculate_pdf(img)
    # CDF Calculation
    cdf = calculate_cdf(pdf)
    # Build a 256-entry lookup table T(x) = round((L - 1) * C(x)), then map every pixel through it
    lut = np.round(cdf * 255).astype(np.uint8)
    return hist, lut[img]
# Load image (convert to grayscale image)
image = cv2.imread('resources/lena.bmp', cv2.IMREAD_GRAYSCALE)
# Apply histogram equalization
hist, equalized_image = histogram_equalization(image)
# Plotting the images and their histograms (Matplotlib)
plt.figure(figsize=(20, 5), linewidth=2)
plt.subplot(1, 4, 1)
plt.imshow(image, cmap='gray')
plt.title('Original Image')
plt.axis('off')
plt.subplot(1, 4, 2)
plt.plot(hist, color='black')
plt.title('Original Image Histogram')
plt.subplot(1, 4, 3)
plt.imshow(equalized_image, cmap='gray')
plt.title('Equalized Image')
plt.axis('off')
# Calculating histograms of equalized image
equalized_hist = cv2.calcHist([equalized_image], [0], None, [256], [0, 256])
plt.subplot(1, 4, 4)
plt.plot(equalized_hist, color='black')
plt.title('Equalized Image Histogram')
# Show Image
plt.tight_layout()
plt.savefig('results/hist_equalize2.png', dpi=200, facecolor='#eeeeee', edgecolor='black')
plt.show()
PDF and CDF are important concepts in image histogram analysis. The PDF represents the distribution of pixel values as probabilities, and the CDF is the running sum of those probabilities. In particular, the CDF is what histogram equalization uses to remap pixel values and improve the contrast of an image. Based on this theoretical background, various image processing techniques can be applied.
3. CLAHE (Contrast Limited Adaptive Histogram Equalization)
Basic histogram equalization applies a single mapping derived from the histogram of the entire image. Because that mapping ignores local structure, it can over-amplify contrast in some regions and wash out detail in others. CLAHE (Contrast Limited Adaptive Histogram Equalization) instead divides the image into small tiles and equalizes the histogram of each tile individually, which improves local contrast. This is particularly helpful when dark or bright areas of the image would otherwise be over-emphasized. In addition, contrast limiting is applied so that a histogram bin with a very high count does not dominate the mapping, which keeps noise and artifacts from being amplified. CLAHE is often used in situations where detailed contrast must be emphasized, such as medical images (e.g., X-rays) and satellite images.
3.1. Contrast Limiting
An important part of CLAHE is contrast limiting. When a pixel value occurs too frequently within a tile (i.e., when a specific bin in the histogram is too high), the bin is limited to prevent the contrast from changing too drastically. To do this, the histogram is clipped at a limit value (the clip limit), and the clipped-off excess counts are redistributed evenly across the bins.
\[ h_{\text{clip}}(x) = \min(h(x), \text{clip\_limit}) \]
Here, \( h(x) \) is the original histogram bin value for pixel value \( x \), and \( \text{clip\_limit} \) means the maximum allowed frequency in the histogram. Based on this limited value, CDF is recomputed to transform the pixel value.
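The clip-and-redistribute step can be sketched in a few lines of NumPy. This is a simplified illustration under my own assumptions, not OpenCV's implementation: clip_histogram is a hypothetical helper, and clip_limit here is an absolute count, which is not necessarily the same unit as the clipLimit argument of cv2.createCLAHE.

```python
import numpy as np

def clip_histogram(hist, clip_limit):
    """Clip every bin at clip_limit and spread the excess evenly over all bins."""
    hist = np.asarray(hist, dtype=np.float64)
    clipped = np.minimum(hist, clip_limit)   # h_clip(x) = min(h(x), clip_limit)
    excess = hist.sum() - clipped.sum()      # counts that were cut off
    # Adding the same share to every bin keeps the total pixel count unchanged.
    return clipped + excess / hist.size

hist = np.array([1, 10, 2, 3])               # hypothetical 4-bin tile histogram
out = clip_histogram(hist, clip_limit=4)     # 2.5, 5.5, 3.5, 4.5
```

After a single redistribution pass a bin can end up above the limit again (5.5 here). Practical implementations deal with that in different ways, and this sketch simply leaves it as is.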
3.2. Bilinear Interpolation
After applying local histogram equalization, bilinear interpolation is used to minimize the unnatural transition at the boundary of the region to achieve a smooth transition between adjacent regions. This is the process of appropriately blending the pixel values of adjacent tiles.
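The blending step can be illustrated for a single pixel. The sketch below assumes each of the four neighboring tiles already has its own lookup table and that the pixel's position between the tile centers is given as fractions fx and fy. The function name, the argument layout, and the example tables are all hypothetical.

```python
import numpy as np

def blend_tile_mappings(value, luts, fx, fy):
    """Bilinearly blend the outputs of four neighboring tile lookup tables.

    luts is (top_left, top_right, bottom_left, bottom_right); fx and fy are
    the pixel's fractional position (0 to 1) between the tile centers.
    """
    tl, tr, bl, br = (float(lut[value]) for lut in luts)
    top = (1 - fx) * tl + fx * tr        # blend along x for the upper pair
    bottom = (1 - fx) * bl + fx * br     # blend along x for the lower pair
    return (1 - fy) * top + fy * bottom  # blend the two results along y

# Hypothetical lookup tables for four neighboring tiles.
luts = (np.arange(256), np.full(256, 100), np.full(256, 200), np.full(256, 50))
center = blend_tile_mappings(40, luts, fx=0.5, fy=0.5)  # 97.5, equal mix of all four
corner = blend_tile_mappings(40, luts, fx=0.0, fy=0.0)  # 40.0, top-left tile only
```

A pixel sitting exactly on a tile center uses that tile's mapping alone, and the weights shift gradually as the pixel moves toward a neighbor, which is what removes visible seams between tiles.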
Comparing basic histogram equalization and CLAHE:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Load image (convert to grayscale image)
image = cv2.imread('resources/x-ray.jpg', cv2.IMREAD_GRAYSCALE)
# Apply basic histogram equalization
equalized_image = cv2.equalizeHist(image)
# Apply CLAHE
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
clahe_image = clahe.apply(image)
plt.figure(figsize=(12, 8))
plt.subplot(1, 3, 1)
plt.imshow(image, cmap='gray')
plt.title('Original Image')
plt.axis('off')
plt.subplot(1, 3, 2)
plt.imshow(equalized_image, cmap='gray')
plt.title('Histogram Equalization')
plt.axis('off')
plt.subplot(1, 3, 3)
plt.imshow(clahe_image, cmap='gray')
plt.title('CLAHE')
plt.axis('off')
plt.tight_layout()
plt.show()
4. Histogram Stretching
Histogram stretching is a technique to improve the contrast of an image. It expands the image to the full range (0~255) based on the minimum and maximum pixel values of the image.
When the pixel value is \( p(x) \), the stretched pixel value \( p'(x) \) is calculated as follows:
\[ p'(x) = \frac{p(x) - p_{\text{min}}}{p_{\text{max}} - p_{\text{min}}} \times 255 \]
Where \( p_{\text{min}} \) is the minimum pixel value of the image, and \( p_{\text{max}} \) is the maximum pixel value.
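A quick worked example makes the formula tangible. The four pixel values below are made up and cover only the range 50 to 150. The result is rounded to the nearest integer, which is my own choice for the conversion back to 8-bit.

```python
import numpy as np

# Hypothetical low-contrast pixel values occupying only the range 50..150.
pixels = np.array([50, 70, 110, 150], dtype=np.float64)
p_min, p_max = pixels.min(), pixels.max()
# (p - p_min) / (p_max - p_min) rescales to 0..1, and * 255 expands to 0..255.
stretched = np.round((pixels - p_min) / (p_max - p_min) * 255).astype(np.uint8)
print(stretched)  # [  0  51 153 255]
```

The minimum maps to 0, the maximum to 255, and everything in between keeps its relative position.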
Example of histogram stretching:
import cv2
import numpy as np
import matplotlib.pyplot as plt
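def histogram_stretching(img):
    # Calculate minimum and maximum pixel values of an image
    min_val = float(np.min(img))
    max_val = float(np.max(img))
    # A constant image has no range to stretch; return it unchanged to avoid dividing by zero
    if max_val == min_val:
        return img.copy()
    # Apply histogram stretching in floating point, then round back to 8-bit
    stretched = (img.astype(np.float64) - min_val) / (max_val - min_val) * 255
    return np.round(stretched).astype(np.uint8)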
# Load the image as grayscale
image = cv2.imread('resources/lena.bmp', cv2.IMREAD_GRAYSCALE)
# Histogram stretching
stretched_image = histogram_stretching(image)
# Plotting the images and their histograms (Matplotlib)
plt.figure(figsize=(20, 5), linewidth=2)
plt.subplot(1, 4, 1)
plt.imshow(image, cmap='gray')
plt.title('Original Image')
plt.axis('off')
# Calculating histograms of original image
hist = cv2.calcHist([image], [0], None, [256], [0, 256])
plt.subplot(1, 4, 2)
plt.plot(hist, color='black')
plt.title('Original Image Histogram')
plt.subplot(1, 4, 3)
plt.imshow(stretched_image, cmap='gray')
plt.title('Stretched Image')
plt.axis('off')
# Calculating histograms of stretched image
stretched_hist = cv2.calcHist([stretched_image], [0], None, [256], [0, 256])
plt.subplot(1, 4, 4)
plt.plot(stretched_hist, color='black')
plt.title('Stretched Image Histogram')
# Show image
plt.tight_layout()
plt.savefig('results/hist_stretched.png', dpi=200, facecolor='#eeeeee', edgecolor='black')
plt.show()
5. Histogram Matching
Histogram matching is a technique that transforms one image so that its histogram resembles the histogram of another image. It is used to increase the visual consistency between images. Histogram matching works by computing the CDF of the source image and the CDF of the target image, and then mapping each source pixel value to the target pixel value that has the same (or nearest) CDF value. This redistributes the pixels of the source image according to the target CDF.
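The CDF lookup is easiest to follow on tiny histograms. Below is a NumPy-only sketch with four gray levels, where the source is dark and the target is bright. The helper matching_lut and both histograms are made up for illustration.

```python
import numpy as np

def matching_lut(source_hist, target_hist):
    """For each source level, find the target level with the same CDF value."""
    levels = len(source_hist)
    src_cdf = np.cumsum(source_hist) / np.sum(source_hist)
    tgt_cdf = np.cumsum(target_hist) / np.sum(target_hist)
    # Read the target CDF "backwards": CDF value in, gray level out.
    lut = np.interp(src_cdf, tgt_cdf, np.arange(levels))
    return np.round(lut).astype(np.uint8)

source_hist = np.array([4, 4, 0, 0])  # hypothetical dark image: levels 0 and 1 only
target_hist = np.array([0, 0, 4, 4])  # hypothetical bright image: levels 2 and 3 only
lut = matching_lut(source_hist, target_hist)
print(lut)  # [2 3 3 3]
```

Source level 0 has a CDF of 0.5, and the target reaches 0.5 at level 2, so 0 maps to 2. In the same way, level 1 maps to 3, and the dark source takes on the bright distribution of the target.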
Example of histogram matching:
import cv2
import numpy as np
import matplotlib.pyplot as plt
def calculate_pdf(img):
    # Calculating histograms (calculating the frequency of pixel values)
    hist = cv2.calcHist([img], [0], None, [256], [0, 256])
    # Convert each bin to a probability by dividing it by the total number of pixels.
    pdf = hist / np.sum(hist)
    return pdf
def calculate_cdf(pdf):
    # Compute CDF by accumulating PDFs
    cdf = np.cumsum(pdf)
    # Normalize to 0~1 range
    cdf_normalized = cdf / cdf.max()
    return cdf_normalized
def histogram_matching(source, target):
    # Calculating PDF
    source_pdf = calculate_pdf(source)
    target_pdf = calculate_pdf(target)
    # Calculating CDF
    source_cdf_normalized = calculate_cdf(source_pdf)
    target_cdf_normalized = calculate_cdf(target_pdf)
    # Create a matching table: for each source level, find the target level with the same CDF value
    mapping = np.interp(source_cdf_normalized, target_cdf_normalized, np.arange(256))
    # Apply matching results (round to the nearest level before converting to 8-bit)
    matched_image = np.round(mapping[source.flatten()]).reshape(source.shape)
    return matched_image.astype(np.uint8)
source_image = cv2.imread('resources/lena.bmp', cv2.IMREAD_GRAYSCALE)
target_image = cv2.imread('resources/baboon.bmp', cv2.IMREAD_GRAYSCALE)
matched_image = histogram_matching(source_image, target_image)
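# Calculating histograms of source and target images
source_hist = cv2.calcHist([source_image], [0], None, [256], [0, 256])
target_hist = cv2.calcHist([target_image], [0], None, [256], [0, 256])
# Plotting the images and their histograms (Matplotlib)
fig, axes = plt.subplots(2,3, figsize=(10,5))
axes[0][0].imshow(source_image, cmap='gray')
axes[0][0].set_title('Source Image')
axes[0][0].axis('off')
axes[1][0].plot(source_hist, color='black')
axes[0][1].imshow(target_image, cmap='gray')
axes[0][1].set_title('Target Image')
axes[0][1].axis('off')
axes[1][1].plot(target_hist, color='black')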
axes[0][2].imshow(matched_image, cmap='gray')
axes[0][2].set_title('Matched Image')
axes[0][2].axis('off')
matched_hist = cv2.calcHist([matched_image], [0], None, [256], [0, 256])
axes[1][2].plot(matched_hist, color='black')
# Show image
plt.tight_layout()
plt.savefig('results/hist_matched.png', dpi=200, facecolor='#eeeeee', edgecolor='black')
plt.show()
6. Conclusion
We have looked at various techniques that can analyze and improve images through histogram analysis. From basic histogram calculation to histogram equalization, stretching, CLAHE, and histogram matching, various techniques can be implemented with OpenCV and Python. In addition, PDF and CDF are important concepts in image histogram analysis. PDF represents the distribution of pixel values as a probability, and CDF represents the accumulated probability. In particular, CDF plays an important role in improving the contrast of images in techniques such as histogram equalization. We have also explained the theoretical background and mathematical principles of each technique, so you will be able to perform various image processing tasks using them.
If you found this post useful, please like and subscribe below. ^^
★ All contents are referenced from the link below. ★