Color Quantization

Introduction

Color quantization is used to reduce the number of colors used to represent an image. The objective is to reduce the colors without noticeably affecting the visual appearance of the original image. Some information is inevitably lost in the process, which is why it is called a lossy operation.

One reason for performing color quantization is to allow an image to be rendered on hardware that supports only a limited number of colors. It also reduces the space required by the image, which in turn allows faster loading on slower devices.

One approach to color quantization is to use a clustering technique: the pixel colors are grouped into a pre-defined number of clusters, and every pixel is mapped to the color of the cluster to which it belongs. K-means is a very popular clustering technique and is the one explored in this post.

Color Quantization using K-means

The K-means algorithm can be described as follows:

1. Randomly select k points as cluster centers
2. Assign the remaining data points to the closest cluster center
3. Re-calculate the cluster centers; each new cluster center is the mean of
   all the points that belong to that cluster
4. Repeat steps 2 and 3 until the cluster centers stop changing

Color quantization using K-means starts by randomly selecting k colors from the image as the initial cluster centers. The remaining pixels are then assigned to the closest cluster. Once all pixels are assigned, each cluster center is updated to the mean color of the pixels that belong to that cluster. This process is repeated until the cluster centers no longer change, at which point the algorithm has converged to a local solution. Finally, every pixel is replaced by the color of its cluster center, as sketched below.
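
To make this concrete, here is a minimal sketch of the whole procedure. It uses scikit-learn's KMeans and OpenCV for image I/O purely for illustration; these are my assumptions, not the post's own implementation, which is listed in full in the Code section below.

import cv2
import numpy as np
from sklearn.cluster import KMeans

def quantize(imagePath, K):
    """Return a K-color version of the image at imagePath (illustrative names)."""
    img = cv2.imread(imagePath)                       # H x W x 3, uint8
    pixels = img.reshape(-1, 3).astype(np.float32)    # one row per pixel

    # Cluster the pixel colors into K groups (steps 1-4 above)
    kmeans = KMeans(n_clusters=K, n_init=10).fit(pixels)

    # Replace every pixel with the color of its cluster center
    centers = kmeans.cluster_centers_.astype(np.uint8)           # K x 3
    return centers[kmeans.labels_].reshape(img.shape)

# Example usage:
# cv2.imwrite('baboon_k20.png', quantize('baboon.png', K=20))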

Analysis

Enough with the theory, let us apply this to a few images and see the results!

Chart

| Name | Original size | Size (K=3) | Size (K=5) | Size (K=10) | Size (K=20) |
|------|---------------|------------|------------|-------------|-------------|
| baboon.png | 622 KB | 99.4 KB (84% less) | 181 KB (70.9% less) | 299 KB (51.93% less) | 402 KB (35.4% less) |
| monarch.png | 599 KB | 73.9 KB (87.66% less) | 128 KB (78.63% less) | 187 KB (68.78% less) | 263 KB (56.1% less) |
| tulips.png | 663 KB | 51 KB (92.3% less) | 104 KB (84.3% less) | 186 KB (71.94% less) | 311 KB (53.1% less) |
| Average compression | | 88% | 77.93% | 64.2% | 48.2% |

We can observe that with K=20 we achieve almost 50% compression of the original image while preserving its visual appearance. There is a clear trade-off between the reduction in image size and the preservation of the original image, so we can choose the value of K that best suits the purpose.

Note: Images used for this analysis are taken from Public-Domain Test Images for Homeworks and Projects.

Result

[Figure: Original image of baboon]

[Figure: Image after color quantization using 3 clusters (K=3)]

[Figure: Image after color quantization using 20 clusters (K=20)]

For more results, check the GitHub repo here.

Code

The code for the main functions is given below; for the full code, see here.
Please note that this was an attempt to implement the process from scratch, so the runtime is high. A better approach would be to use NumPy vectorization; a rough vectorized sketch is included after the listing below.

import numpy as np


def performColorQuantization(img, clusterCenter, K):
    """
    Main method which performs color quantization using K-means

    Args:
        img (numpy.ndarray): Image (H x W x 3 array) on which color quantization is to be performed
        clusterCenter (numpy.ndarray): Initial cluster centers (randomly selected), K x 3
        K (int): Number of clusters

    Returns:
        clusterCenter (numpy.ndarray): Final cluster centers
        classificationVector (dict): Maps each (row, col) pixel to the cluster it belongs to
    """
    error = 1
    while error > 0:
        # Assign every pixel to its closest cluster center, then recompute the centers
        classificationVector = classifyImagePoints(img, clusterCenter)
        newClusterCenter = updateColor(img, classificationVector, K)
        # Stop once the centers no longer move
        error = np.linalg.norm(newClusterCenter - clusterCenter, axis=None)
        clusterCenter = newClusterCenter
    return clusterCenter, classificationVector


def classifyImagePoints(img, clusterCenter):
    """
    Computes the distance of each pixel color from every cluster center and assigns
    the pixel to the closest cluster

    Args:
        img (numpy.ndarray): Image (H x W x 3 array) on which color quantization is to be performed
        clusterCenter (numpy.ndarray): Current cluster centers, K x 3

    Returns:
        classificationVector (dict): Maps each (row, col) pixel to the cluster it belongs to
    """
    classificationVector = {}
    rows, cols = img.shape[:2]
    for row in range(rows):
        for col in range(cols):
            pixel = img[row, col]
            # Euclidean distance from this pixel's color to every cluster center
            distance = np.linalg.norm(np.uint8([pixel]) - clusterCenter, axis=1)
            classificationVector[(row, col)] = np.argmin(distance)
    return classificationVector


def updateColor(img, classificationVector, K):
    """
    Updates each cluster center to the mean color of the pixels that belong to it

    Args:
        img (numpy.ndarray): Image (H x W x 3 array) on which color quantization is to be performed
        classificationVector (dict): Maps each (row, col) pixel to the cluster it belongs to
        K (int): Number of clusters

    Returns:
        clusterCenter (numpy.ndarray): Updated cluster centers, K x 3
    """
    clusterCenter = []
    rows, cols = img.shape[:2]
    for clusterNum in range(K):
        # Collect the colors of all pixels assigned to this cluster
        points = []
        for row in range(rows):
            for col in range(cols):
                if classificationVector[row, col] == clusterNum:
                    points.append(img[row, col])
        clusterCenter.append(np.round(np.mean(points, axis=0), 2))
    return np.float32(clusterCenter)
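
As noted above, the nested Python loops dominate the runtime. The following is a rough vectorized sketch of the same iteration using NumPy broadcasting; it is my own illustration (hypothetical function name), not the repository's code.

def performColorQuantizationVectorized(img, clusterCenter, maxIters=100):
    """Vectorized variant: img is an H x W x 3 array, clusterCenter a K x 3 float32 array."""
    pixels = img.reshape(-1, 3).astype(np.float32)       # N x 3, one row per pixel
    centers = np.float32(clusterCenter)
    for _ in range(maxIters):
        # N x K distance matrix computed in one broadcasted operation
        distance = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(distance, axis=1)              # closest center per pixel
        # New center = mean color of its pixels (left unchanged if a cluster is empty)
        newCenters = np.float32([pixels[labels == k].mean(axis=0) if np.any(labels == k)
                                 else centers[k] for k in range(len(centers))])
        if np.allclose(newCenters, centers):
            break
        centers = newCenters
    return centers, labels.reshape(img.shape[:2])

Broadcasting moves the per-pixel distance computation into NumPy's compiled code, which removes the two Python loops that make the from-scratch version slow.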

Conclusion

We can use color quantization to reduce the space required by an image without losing the visual appearance of the original image, which can prove useful on devices with limited memory capacity.