CCMNet: Leveraging Calibrated Color Correction Matrices for Cross-Camera Color Constancy
Dongyoung Kim1, Mahmoud Afifi2, Dongyun Kim1, Michael S. Brown2, Seon Joo Kim1
1Yonsei University, 2AI Center Toronto, Samsung Electronics
ICCV 2025
Computational color constancy, or white balancing, is a key module in a camera’s image signal processor (ISP) that corrects color casts caused by scene lighting. Because this operation occurs in the camera-specific raw color space, white-balance algorithms must adapt to different cameras. This paper introduces a learning-based method for cross-camera color constancy that generalizes to new cameras without retraining. Our method leverages the pre-calibrated color correction matrices (CCMs) available on camera ISPs, which map the camera’s raw color space to a standard space (e.g., CIE XYZ). We use these CCMs to transform predefined illumination colors (sampled along the Planckian locus) into the test camera’s raw space. The mapped illuminants are encoded into a compact camera fingerprint embedding (CFE) that enables the network to adapt to unseen cameras. To prevent overfitting to the limited set of cameras and CCMs available during training, we introduce a data augmentation technique that interpolates between cameras and their CCMs. Experimental results across multiple datasets and backbones show that our method achieves state-of-the-art cross-camera color constancy while remaining lightweight and relying only on data readily available in camera ISPs.
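As a concrete illustration of the CCM-based mapping described above, the sketch below samples illuminants along the Planckian locus and projects them into a camera’s raw space. The CCT-to-xy cubic-spline fit is a standard published approximation; the sample count, the placeholder CCM values, and the chromaticity normalization are our own illustrative assumptions, not details from the paper.

```python
import numpy as np

def planckian_xy(cct):
    """Approximate CIE 1931 xy chromaticity of a blackbody at the given
    CCT in Kelvin, using the standard cubic-spline approximation
    (valid roughly for 1667-25000 K)."""
    t = float(cct)
    if t <= 4000.0:
        x = -0.2661239e9/t**3 - 0.2343589e6/t**2 + 0.8776956e3/t + 0.179910
    else:
        x = -3.0258469e9/t**3 + 2.1070379e6/t**2 + 0.2226347e3/t + 0.240390
    if t <= 2222.0:
        y = -1.1063814*x**3 - 1.34811020*x**2 + 2.18555832*x - 0.20219683
    elif t <= 4000.0:
        y = -0.9549476*x**3 - 1.37418593*x**2 + 2.09137015*x - 0.16748867
    else:
        y = 3.0817580*x**3 - 5.87338670*x**2 + 3.75112997*x - 0.37001483
    return x, y

def illuminants_in_raw(ccm, ccts):
    """Map Planckian illuminants from CIE XYZ into a camera's raw RGB space.
    `ccm` is the calibrated 3x3 matrix mapping raw RGB -> XYZ, so its
    inverse maps XYZ -> raw RGB."""
    inv_ccm = np.linalg.inv(ccm)
    raws = []
    for cct in ccts:
        x, y = planckian_xy(cct)
        xyz = np.array([x / y, 1.0, (1.0 - x - y) / y])  # Y normalized to 1
        raw = inv_ccm @ xyz
        raws.append(raw / raw.sum())  # normalize to a raw chromaticity
    return np.array(raws)

# Sample the locus from 2500 K to 7500 K as in the paper; the sample
# count and the CCM below are placeholders for illustration only.
ccts = np.linspace(2500, 7500, 32)
ccm = np.array([[0.65, 0.28, 0.07],
                [0.27, 0.66, 0.07],
                [0.00, 0.10, 0.90]])
print(illuminants_in_raw(ccm, ccts))
```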
Overview of the CCMNet architecture. (A) Building on CCC and C5, CCMNet includes a network f that generates convolutional filters and a bias from the uv histograms of the input image. To process query images from diverse camera domains with varying spectral sensitivities, CCMNet uses a camera fingerprint embedding (CFE) as guidance. (B) The CFE for three example cameras (A, B, V), two real (A, B) and one imaginary (V), is constructed by mapping predefined illuminants (2500K–7500K along the Planckian locus) from CIE XYZ space to each camera’s native raw RGB space using the calibrated CCMs. The mapped values are binned into a 64 × 64 histogram and encoded into a 1D vector by a lightweight encoder.
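To make the histogram-and-encode step of panel (B) concrete, here is a minimal PyTorch sketch. The log-chroma (u, v) convention, the binning range, the encoder layers, and the embedding size are all illustrative assumptions; only the 64 × 64 histogram and the idea of a lightweight encoder producing a 1D vector come from the paper.

```python
import torch
import torch.nn as nn

def uv_histogram(raw_rgb, bins=64, uv_range=2.0):
    """Build a normalized log-chroma (u, v) histogram from (N, 3) raw RGB
    illuminant samples, using u = log(g/r), v = log(g/b) as one common
    convention; the binning range is an assumption."""
    r, g, b = raw_rgb[:, 0], raw_rgb[:, 1], raw_rgb[:, 2]
    u = torch.log(g / r.clamp(min=1e-6))
    v = torch.log(g / b.clamp(min=1e-6))
    ui = ((u + uv_range) / (2 * uv_range) * (bins - 1)).long().clamp(0, bins - 1)
    vi = ((v + uv_range) / (2 * uv_range) * (bins - 1)).long().clamp(0, bins - 1)
    hist = torch.bincount(ui * bins + vi, minlength=bins * bins).float()
    hist = hist.reshape(bins, bins)
    return hist / hist.sum()

class CFEEncoder(nn.Module):
    """Hypothetical lightweight conv encoder mapping a 64x64 histogram
    to a 1D camera fingerprint embedding."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),                 # -> 32*4*4
            nn.Linear(32 * 4 * 4, embed_dim),
        )

    def forward(self, hist):
        # Add batch and channel dimensions: (64, 64) -> (1, 1, 64, 64)
        return self.net(hist.unsqueeze(0).unsqueeze(0))

# Usage with stand-in samples for the CCM-mapped illuminants:
samples = torch.rand(32, 3) + 0.1
cfe = CFEEncoder()(uv_histogram(samples))
print(cfe.shape)  # torch.Size([1, 128])
```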
Qualitative results of CCMNet. During training, our model was not exposed to any images or CCMs from the cameras shown in the figure.
More visual results will be released soon. Stay tuned!