TLDR: This research explores automatic image colorization using two deep learning approaches: classification with Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs). Evaluating on the CIFAR-10 dataset, the study found that while both methods can colorize images, the GAN-based approach generally achieves higher pixel accuracy and PSNR, though it is more computationally intensive. A user study also indicated that GAN-generated images were more realistic to human observers.
Image colorization, the process of adding color to grayscale images, has seen significant advances in computer vision. The task is inherently ill-posed: only the luminance channel is available, and the two missing chrominance channels must be inferred, so many plausible colorizations exist for a single input. However, the context of a scene, such as a blue sky or green grass, provides crucial cues for predicting color. Thanks to the abundance of color images available, deep learning models can be trained at scale to learn these complex relationships.
Traditionally, image colorization has been approached as a regression problem, which often overlooks the fact that a single grayscale image can have multiple plausible color interpretations. This research explores two modern deep learning strategies: classification and adversarial learning, building upon previous works and adapting them for specific scenarios.
Exploring Different Approaches
The study delves into two primary methods for automatic image colorization:
Classification-based Colorization: Instead of predicting continuous color values, this approach treats colorization as a classification problem. The ‘ab’ channels of the CIE Lab color space (which encode color independently of lightness) are quantized into 313 discrete pairs, and a U-Net-like network is trained to predict a probability distribution over these color bins for each pixel. The final colorized image is obtained by mapping the predicted distributions back to ab values and recombining them with the input lightness channel. Unlike some prior works, this study did not apply class rebalancing, finding that it disrupted training in their specific setup.
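To make the formulation concrete, here is a minimal, hypothetical PyTorch sketch of per-pixel classification over quantized ab bins. The tiny network, the `ab_to_bin` helper, and the placeholder bin grid are illustrative stand-ins, not the paper's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

Q = 313  # number of quantized ab pairs

def ab_to_bin(ab, grid):
    """Map each pixel's (a, b) value to the index of its nearest quantized bin.

    ab:   (B, 2, H, W) ground-truth chrominance
    grid: (Q, 2) the Q quantized ab centers
    """
    b, _, h, w = ab.shape
    flat = ab.permute(0, 2, 3, 1).reshape(-1, 2)            # (B*H*W, 2)
    dists = torch.cdist(flat, grid)                         # (B*H*W, Q)
    return dists.argmin(dim=1).reshape(b, h, w)             # (B, H, W) bin indices

class ColorClassifier(nn.Module):
    """Toy stand-in for the U-Net-like encoder-decoder used in the paper."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, Q, 1),                             # per-pixel logits over Q bins
        )

    def forward(self, l_channel):                            # (B, 1, H, W) lightness input
        return self.body(l_channel)                          # (B, Q, H, W)

# One hypothetical training step (no class rebalancing, as in the study):
model = ColorClassifier()
grid = torch.randn(Q, 2)                                     # placeholder for the real ab grid
l, ab = torch.rand(8, 1, 32, 32), torch.rand(8, 2, 32, 32)   # fake CIFAR-10-sized batch
logits = model(l)
target = ab_to_bin(ab, grid)
loss = F.cross_entropy(logits, target)                       # per-pixel classification loss
loss.backward()
```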
GAN-based Colorization: Generative Adversarial Networks (GANs) consist of two competing neural networks: a generator and a discriminator. Here, the generator takes a grayscale image and tries to produce a realistic colorized version, while the discriminator learns to distinguish real color images from generated ones. Trained in opposition, the generator produces increasingly convincing colorizations. This research uses a conditional GAN, in which the generator is conditioned on the input grayscale image, with a modified U-Net architecture as the generator. The CIE Lab color space is again used to separate lightness from color, so the generator predicts only the ‘ab’ channels.
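The adversarial training logic can be summarized in a short, hedged sketch. The tiny convolutional stacks below stand in for the paper's modified U-Net generator and its discriminator; only the conditional training step itself is the point.

```python
import torch
import torch.nn as nn

# Stand-in networks: G maps the L channel to ab; D scores (L, ab) pairs as real/fake.
G = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(64, 2, 3, padding=1), nn.Tanh())
D = nn.Sequential(nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
                  nn.Conv2d(64, 1, 3, stride=2, padding=1),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten())

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCEWithLogitsLoss()

l, ab_real = torch.rand(8, 1, 32, 32), torch.rand(8, 2, 32, 32) * 2 - 1

# Discriminator step: real (L, ab) pairs vs. generated pairs.
ab_fake = G(l).detach()
d_real = D(torch.cat([l, ab_real], dim=1))
d_fake = D(torch.cat([l, ab_fake], dim=1))
loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: try to fool the discriminator on conditioned fakes.
ab_fake = G(l)
d_fake = D(torch.cat([l, ab_fake], dim=1))
loss_g = bce(d_fake, torch.ones_like(d_fake))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```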
Experiments and Findings
The models were evaluated using the CIFAR-10 dataset, which comprises 60,000 small images (32×32 pixels) across 10 classes. Both models were trained using the Adam optimizer, with specific learning rates and epochs tailored to each approach. Training times varied, with the classification model taking about 4.5 hours and the GAN model taking 4 hours on different GPU setups.
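A plausible data-preparation step for this setup, shown as an assumption rather than the authors' exact pipeline, is to load CIFAR-10 and split each image into its Lab lightness input and ab target:

```python
import numpy as np
import torchvision
from skimage import color

cifar = torchvision.datasets.CIFAR10(root="./data", train=True, download=True)

def split_lab(pil_img):
    """Convert an RGB image to CIE Lab and return the L input and ab target."""
    rgb = np.asarray(pil_img) / 255.0          # (32, 32, 3) in [0, 1]
    lab = color.rgb2lab(rgb)                   # L in [0, 100], ab roughly in [-110, 110]
    l = lab[..., :1] / 50.0 - 1.0              # normalize L to [-1, 1]
    ab = lab[..., 1:] / 110.0                  # normalize ab to about [-1, 1]
    return l.astype(np.float32), ab.astype(np.float32)

l, ab = split_lab(cifar[0][0])                 # grayscale input and color target for one image
```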
To assess performance, several metrics were used: pixel-wise accuracy (measuring how many pixels’ colors are within a small error threshold of the original), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index (SSIM). PSNR and SSIM are common measures of image quality and similarity.
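The first two metrics are straightforward to compute. The sketch below shows one reasonable implementation of pixel-wise accuracy and PSNR (the exact error threshold used in the paper is an assumption here); SSIM is typically taken from an existing library rather than written by hand.

```python
import torch

def pixel_accuracy(pred, target, thresh=0.05):
    """Fraction of pixels whose predicted color is within `thresh` of the original
    (images assumed scaled to [0, 1]); the threshold value is illustrative."""
    close = (pred - target).abs().max(dim=1).values <= thresh   # max error over channels
    return close.float().mean().item()

def psnr(pred, target, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB."""
    mse = torch.mean((pred - target) ** 2)
    return (10 * torch.log10(max_val ** 2 / mse)).item()

pred, gt = torch.rand(1, 3, 32, 32), torch.rand(1, 3, 32, 32)
print(pixel_accuracy(pred, gt), psnr(pred, gt))
```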
The results showed that while both classification and GAN methods could colorize grayscale images to an acceptable visual degree, the GAN-based approach generally outperformed the classification method. GANs achieved significantly higher pixel-wise accuracy and PSNR values, indicating better overall colorization quality. The SSIM values were comparable between the two methods. Interestingly, both models struggled more with generating accurate colors in the red (R) channel compared to green (G) and blue (B).
A user study was also conducted, in which 16 students were asked to identify the ground-truth image from a set that included classification-generated and GAN-generated images. The results indicated that GAN outputs were more successful at fooling users: 40.69% of GAN-generated images were mistaken for ground truth, compared to 4.80% of classification-generated images. Users also rated GAN-generated images higher in realism and quality.
Implementation and Future Directions
The models were implemented in PyTorch, with the architectures adapted to the small 32×32 images of the CIFAR-10 dataset. The researchers also used TensorBoard to monitor colorization results during training.
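Logging intermediate colorizations to TensorBoard can be as simple as the following sketch; the tag name and the batch of images are placeholders.

```python
import torch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/colorization")
colorized = torch.rand(8, 3, 32, 32)                        # placeholder batch of RGB results in [0, 1]
writer.add_images("val/colorized", colorized, global_step=0)
writer.close()
```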
In conclusion, this project successfully compared and evaluated the performance of convolutional neural networks and generative adversarial networks for automatic image colorization. While both are effective, the Conditional Deep Convolutional Generative Adversarial Network (C-DCGAN) demonstrated superior performance, albeit with higher computational demands. Future work includes experimenting with higher-resolution datasets like ImageNet or MS COCO, exploring different classifier backbones like ResNet, and investigating other generative models such as VAEs.
For more technical details, you can refer to the full research paper: Automatic Image Colorization with Convolutional Neural Networks and Generative Adversarial Networks.


