A coworker recently asked me this question during a discussion about color management and encoding. I wrote the original version of this post in response, and thought it was an interesting enough "whirlwind tour" of my current understanding of CG color theory and management that it was worth posting publicly.
But I'm also not sure what's the correct way to specify an HDR colour, something that's overbright like [3.0, 3.0, 3.0]? my sort of clunky understanding is that that's outside of sRGB gamut?!? I mean definitely can't be represented with u8. My guess is that you'd want to use something like AcesCg that you mention in the colistodian docs, but I'm really not sure? Or is it fine to use Color::linear_srgba(), because it will get clamped when transferring to encoded srgb? (Forgive imprecise use of language).
Great question! The answer is both very simple and extremely complex ahaha (sorry this got really long...).
The short of it is "it depends": on the specifications of the color encoding being used, and on how pedantic we're being over what qualifies as a "color". One thing to say off the bat is that "wide gamut" color spaces serve a different purpose than "HDR" color spaces, though a wide-gamut space is usually also "HDR" simply because the two evolved at the same time... "HDR" refers to the ability to show the same "hue" and "chroma" at higher power, while a "wider gamut" color space is able to show more chromatic (saturated) colors at the same power. As we'll see later, if we bend the definitions a bit, we can make thin-gamut color encodings record "HDR" (as in higher power) values, but we can't escape that thin gamut and display more saturated colors without moving to a wider base encoding.
So... let's dive in. What does a color code value (0.2, 0.3, 0.7) in the rigorously-defined sRGB color encoding mean? It is the color that a (rigorously) sRGB-calibrated monitor outputs when given that code value as input... What does the code value (1.0, 1.0, 1.0) mean? It's the brightest "white" (full power) that a rigorously sRGB-calibrated monitor can display.
And under this rigorous definition of the sRGB color encoding, therefore, a code value above 1.0 has no meaning. It's "out of gamut" by the definitions of this color encoding. But wait a minute... how can we make the "color" of an object in the game world an sRGB color, then?! If we are following the definition of the encoding rigorously, that doesn't... really make sense. It would mean we are saying that the light emitted from the object should be the same light (I'm being intentionally nebulous here) that an sRGB monitor emits when we give it that code value. But obviously that only makes sense if the object is itself emitting light.
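To make the "it will get clamped" guess from the question concrete, here is a minimal sketch of what a strict sRGB encode to `u8` does with an overbright value. The function names are hypothetical (they are not Bevy's or colstodian's API); the transfer function constants are the standard sRGB ones.

```rust
/// sRGB transfer function: linear [0, 1] -> encoded [0, 1].
fn srgb_encode(linear: f32) -> f32 {
    if linear <= 0.0031308 {
        12.92 * linear
    } else {
        1.055 * linear.powf(1.0 / 2.4) - 0.055
    }
}

/// Hypothetical helper: clamp a linear value to the displayable range,
/// encode it, and quantize to 8 bits.
fn linear_to_srgb_u8(linear: f32) -> u8 {
    let clamped = linear.clamp(0.0, 1.0);
    (srgb_encode(clamped) * 255.0).round() as u8
}
```

With this sketch, `linear_to_srgb_u8(3.0)` and `linear_to_srgb_u8(1.0)` both come out as 255: the extra power is simply thrown away, which is why a strict sRGB code value can't carry the overbright information.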
If it's reflecting other light, now we're a bit shit outta luck because, well, what light is reflected back to the viewer depends on what light is hitting the object, its orientation, etc. So we have to define some canonical conditions under which the light reflected from the object should match the color output by the sRGB monitor. Let's say we define that as the object being lit from straight in front by a light such that the power measured at the object's surface is sRGB white, (1.0, 1.0, 1.0). Now if we define the reflectance (i.e. percentage of incoming light that gets reflected) in each of the three color components to be (0.2, 0.3, 0.7) again, we get... (0.2, 0.3, 0.7) light being reflected again! (I'm again being a bit handwavey but stay with me..)
Okay so we've kinda solved the issue but now wait a minute again.... what happens when we shine a brighter light on that same object? I mean, it will still be the same "hue" and "chroma" (saturation), but.. more power. But how do we encode that brighter light as a color now? Well, if you truly want to encode it as a "color" within a fully specified color space's gamut by "color science" definitions, you'd need to use a color encoding with a "larger gamut" in the sense that the maximum intensity of light output by a color in the gamut is higher.
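As a rough illustration of the brighter-light problem, here is a small sketch that treats both the light and the reflectance as three-component values and multiplies them component-wise. Both function names are made up for this example.

```rust
/// Component-wise product of incoming light and surface reflectance.
fn reflect(light: [f32; 3], reflectance: [f32; 3]) -> [f32; 3] {
    [
        light[0] * reflectance[0],
        light[1] * reflectance[1],
        light[2] * reflectance[2],
    ]
}

/// True if every component fits in the strict [0, 1] sRGB code range.
fn fits_in_srgb_range(c: [f32; 3]) -> bool {
    c.iter().all(|&v| (0.0..=1.0).contains(&v))
}
```

Under the canonical (1.0, 1.0, 1.0) light, a (0.2, 0.3, 0.7) reflectance gives back (0.2, 0.3, 0.7), which fits. Under a light three times as strong, the result is roughly (0.6, 0.9, 2.1): the same ratios between components, but the blue component no longer fits in the strict range.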
But even the most HDR monitors and therefore color spaces right now only go up to an intensity that is pretty low in the grand scheme of things, a few thousand nits.
What we're really after here is not a way to encode a color. From a rendering perspective we're trying to encode the light power along a direction (radiant flux per unit solid angle per unit projected area, to be pedantic..), which is called "radiance". By strict definition, radiance is a radiometric quantity (i.e. dealing with electromagnetic radiation), which means that to describe it fully we need a spectral power distribution: for every wavelength along the spectrum of electromagnetic radiation, we define some power. Taken together this gives a curve of intensity per wavelength, defining the power of radiation traveling along the direction we're interested in.
If we have two lights shining in the same direction then we can just add the power distribution curves. If light hits an object, we can define its reflectance as the percentage of incoming power reflected per wavelength (another curve), and then multiply those two curves together wavelength by wavelength (I'll loosely call this "convolving" them, though strictly it's a pointwise product) to get the output power.
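A toy sketch of those two spectral operations, assuming a spectrum sampled at a handful of fixed wavelengths. The sample count and the type and function names are illustrative assumptions; a real spectral renderer would use far more samples.

```rust
/// Number of wavelength samples in this toy spectrum (hypothetical resolution).
const N: usize = 4;

/// A spectral power distribution sampled at N fixed wavelengths.
type Spd = [f32; N];

/// Two lights along the same direction: powers add per wavelength.
fn add_lights(a: Spd, b: Spd) -> Spd {
    let mut out = [0.0; N];
    for i in 0..N {
        out[i] = a[i] + b[i];
    }
    out
}

/// Light hitting a surface: each wavelength is scaled by the fraction
/// of power the surface reflects at that wavelength.
fn reflect_spectral(light: Spd, reflectance: Spd) -> Spd {
    let mut out = [0.0; N];
    for i in 0..N {
        out[i] = light[i] * reflectance[i];
    }
    out
}
```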
Aside: White
Remember how earlier we said (1.0, 1.0, 1.0) in sRGB is the brightest "white" that an sRGB monitor can output? Well what does "white" actually mean? What light is actually being output? Turns out we can define it as whatever we want, and that's what's called a "white point". The reference for what white is, is simply a defined spectral power distribution! You may hear of "D65", "D60", "A", "E", etc. These are all just well known power distributions that approximate what we perceive as white in different circumstances. sRGB itself uses the D65 spectral power distribution, and defines its total power (i.e. a scaling factor applied to the whole curve) through a reference display luminance of 80 nits.
But encoding this data as a curve is hard, and so is doing operations on those curves... and rendering is already computationally intensive. So we want a way to store and operate on some approximation of radiance that takes less space and less computation. Turns out.... (because of human color vision and a bit of luck...) we can use just three buckets of power, each defined as having some mountain-like distribution of power across the spectrum, and get a pretty darn good approximation. It does break down completely in some situations and gives error in others (notably, the "convolution" we talked about before, really a per-wavelength product, is lossy under this approximation, which means every bounce of light we compute adds more error depending on various conditions!), but... it works pretty well in the majority of cases.
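To see why the per-wavelength product becomes lossy once we only keep three buckets, here is a deliberately extreme sketch. It assumes box-shaped, non-overlapping buckets that each average two adjacent samples, which is a simplification chosen for clarity rather than how real primaries behave. All names are hypothetical.

```rust
const SAMPLES: usize = 6;
type Spd = [f32; SAMPLES];

/// Collapse a spectrum into three buckets by averaging adjacent sample pairs.
/// Box-shaped buckets are a simplifying assumption for this example.
fn to_buckets(spd: Spd) -> [f32; 3] {
    [
        (spd[0] + spd[1]) / 2.0,
        (spd[2] + spd[3]) / 2.0,
        (spd[4] + spd[5]) / 2.0,
    ]
}

/// Exact interaction: multiply per wavelength.
fn multiply_spectral(a: Spd, b: Spd) -> Spd {
    let mut out = [0.0; SAMPLES];
    for i in 0..SAMPLES {
        out[i] = a[i] * b[i];
    }
    out
}

/// Approximate interaction: multiply per bucket.
fn multiply_buckets(a: [f32; 3], b: [f32; 3]) -> [f32; 3] {
    [a[0] * b[0], a[1] * b[1], a[2] * b[2]]
}
```

Take a spiky light `[1, 0, 1, 0, 1, 0]` and a reflectance `[0, 1, 0, 1, 0, 1]` that only reflects the wavelengths the light does not contain. Multiplying spectrally and then bucketing gives (0, 0, 0), since nothing is reflected. Bucketing first and then multiplying gives (0.25, 0.25, 0.25). Real spectra are rarely this adversarial, but the mismatch is the same kind of error that accumulates per bounce.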
And if we loosen our definition of "color" a little bit, we could see this "quasi-radiance" as also being a "color encoding". If we align our three buckets to an existing color space's three primaries (which we can loosely think of as approximating peak-y spectral power distributions themselves...), we can re-use that existing reference point to anchor what our buckets mean. At that point, we just need to define how much "power" each unit in each bucket has, and we've created a "color encoding" which in theory has infinite dynamic range. The ratio between the three buckets defines some approximation of which wavelengths of light are included in our approximated spectral distribution curve, and their magnitude defines how much power they have.
If we use such an encoding, let's call it SrgbBasedQuasiRadiance, well now a code value (3.0, 3.0, 3.0) makes (mostly) complete sense... it should be (an approximation of) the "white point" spectral power distribution but 3 times more intense! Although calling it a "color" is stretching the definition a bit. And if we're using this to define an emissive thing (light or emissive material) then that's all we need, because we can just emit light from the thing using that value as quasi-radiance directly. And when we simulate this light entering a virtual camera, we can record it as having hit a virtual camera sensor. Bing bang boom!
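Here is one way such an encoding could look as a type. This is only a sketch: `SrgbBasedQuasiRadiance` is the name made up in this post, not a type from Bevy or colstodian, and the methods are hypothetical. The luminance weights are the usual Rec. 709 / sRGB ones.

```rust
/// Unbounded linear values anchored to the sRGB primaries and white point.
#[derive(Clone, Copy, Debug, PartialEq)]
struct SrgbBasedQuasiRadiance {
    r: f32,
    g: f32,
    b: f32,
}

impl SrgbBasedQuasiRadiance {
    /// Same ratio between buckets, more (or less) power.
    fn scaled(self, factor: f32) -> Self {
        Self {
            r: self.r * factor,
            g: self.g * factor,
            b: self.b * factor,
        }
    }

    /// Relative luminance using the Rec. 709 / sRGB luminance weights.
    fn relative_luminance(self) -> f32 {
        0.2126 * self.r + 0.7152 * self.g + 0.0722 * self.b
    }
}
```

Scaling reference white (1.0, 1.0, 1.0) by 3 gives (3.0, 3.0, 3.0) with three times the relative luminance, and nothing in the type forces the values back into the 0 to 1 range.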
Aside: Wide Gamut vs HDR
Here's where you may note that even in this new SrgbBasedQuasiRadiance, since our buckets are just recording power of three different base "primary" SPDs, we can get infinite dynamic range, but we still can't escape the "thinness" of the base sRGB gamut, in that we can't represent a more saturated/chromatic color than those base primary SPDs! If we wanted to do so, we'd either need to allow negative values in the other primary buckets, or use base primaries which are themselves spread further apart and therefore more chromatic to start with.
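The "negative values" point can be seen by converting a saturated color from a wider-gamut encoding into sRGB-based values. The sketch below uses Display P3 as the wider gamut purely as an example of my choosing, with an approximate, rounded linear Display P3 to linear sRGB matrix. Treat the numbers as illustrative rather than authoritative.

```rust
/// Approximate matrix from linear Display P3 to linear sRGB (both D65).
/// Values are rounded for illustration.
const P3_TO_SRGB: [[f32; 3]; 3] = [
    [1.2249, -0.2249, 0.0],
    [-0.0421, 1.0421, 0.0],
    [-0.0196, -0.0786, 1.0982],
];

/// Hypothetical helper: re-express a linear Display P3 value
/// in terms of the sRGB primaries.
fn p3_to_linear_srgb(p3: [f32; 3]) -> [f32; 3] {
    let mut out = [0.0; 3];
    for (row, o) in P3_TO_SRGB.iter().zip(out.iter_mut()) {
        *o = row[0] * p3[0] + row[1] * p3[1] + row[2] * p3[2];
    }
    out
}
```

The pure P3 green primary (0.0, 1.0, 0.0) comes out with negative red and blue components in sRGB terms. That is exactly the "allow negative values in the other buckets" escape hatch, while white (1.0, 1.0, 1.0) stays approximately (1.0, 1.0, 1.0).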
Buuuut we have one more issue. Which is, ultimately we have rendered our scene and have the amount of our new SrgbBasedQuasiRadiance arriving at the virtual camera for each pixel. But how do we convert that to an actual rigorously defined Srgb color so that we can display it for real on a real sRGB display (or any other color space)?! The answer is.... it depends! And there's no "right" answer. Basically you have an open question: create some mapping from (0, 0, 0)...(inf, inf, inf) to (0, 0, 0)...(1, 1, 1). In computer graphics, that mapping in common parlance is called "tonemapping", although the latest terminology is "display rendering transform" or "image formation transform", which describes what it's doing much better. It's "rendering" a set of simulated camera-sensor-recording data into a directly displayable, known-quantity image, which carries with it some amount of subjectivity as to how that should be done. In real life cameras, this mapping is the camera manufacturer's "color science"! It's a closely guarded secret of each manufacturer and is the code that runs on the camera chip to convert the measured light values from the sensor to an image. It's what makes Canon images look different from Sony's, which look different from Apple's. It's a really important step!
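As one concrete (and very simple) instance of such a mapping, here is a sketch using the classic Reinhard curve x / (1 + x) applied per channel, followed by the sRGB transfer function. This is just one well-known operator picked for illustration, not a recommendation and certainly not what any camera manufacturer ships. `display_transform` is a hypothetical name.

```rust
/// Reinhard operator: maps [0, inf) into [0, 1). One simple choice among many.
fn reinhard(x: f32) -> f32 {
    let x = x.max(0.0);
    x / (1.0 + x)
}

/// sRGB transfer function: linear [0, 1] -> encoded [0, 1].
fn srgb_encode(linear: f32) -> f32 {
    if linear <= 0.0031308 {
        12.92 * linear
    } else {
        1.055 * linear.powf(1.0 / 2.4) - 0.055
    }
}

/// Hypothetical display transform: compress unbounded quasi-radiance
/// per channel, then encode for an sRGB display.
fn display_transform(radiance: [f32; 3]) -> [f32; 3] {
    radiance.map(|c| srgb_encode(reinhard(c)))
}
```

With this curve, (3.0, 3.0, 3.0) is compressed to linear 0.75 per channel before encoding, so it lands brighter than (1.0, 1.0, 1.0) but still below display white. Applying the curve per channel also shifts hue and saturation for bright, colorful values, which is part of why real image formation transforms are considerably more involved.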
Okay, "brief overview of game color management" complete... x)