Delta E Testing and Why Our Numbers are Different - Dell U2412M
If you’ve looked at reviews of the Dell U2412M at other sites, you’ll find that our Delta E (dE) numbers look different, just as they do in our other display reviews. This raises several questions: why are our numbers different, what do they measure differently, and which results should you believe? In reality you should believe all of them: they are all accurate, but they are likely reporting on different things. To explain, let’s look at how profiling a display works.
We use ColorEyes Display Pro for our device profiling and measurements, and I use an i1Pro for all of my profiling and profile evaluations. In creating a profile, ColorEyes Display Pro moves through a fixed set of patterns, adjusting the response curves for the display as well as creating Look Up Tables (LUTs) that contain information about how the display responds to colors. Using the curves we get a linear grayscale and accurate gamma out of the display. Using the LUTs we get the correct colors out of the display. If a program asks for red, the color management system looks at the LUTs to see how the display creates red, and then adjusts the signal going to the display to accurately reflect what the program is asking for.
This is exactly where results can differ and still be accurate. Sites use different software to evaluate displays; I haven’t used all of the packages available, so I don’t know specifically how each works. However, if a package uses the same swatches in profile evaluation that it used in profile creation, then the results should always be near perfect. If the LUT contains the exact color you are trying to measure against, then it knows exactly how the display handles that color and the result should come out close to perfect. If you try to look up a color that isn’t in the LUT, then that color has to be interpolated from nearby entries and will likely be off by a certain amount.
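To make the difference concrete, here is a minimal Python sketch of a one-dimensional LUT. Everything in it is an assumption made for illustration: the `display_response` function is a made-up gamma 2.2 curve, the LUT has only five nodes, and lookups between nodes use simple linear interpolation. Real profiling software uses far more points and three-dimensional tables, and I don’t know the exact interpolation any given package uses.

```python
from bisect import bisect_right

def display_response(signal):
    """Hypothetical display: output level for an input signal in [0, 1]."""
    return signal ** 2.2

def build_lut(nodes):
    """Record the display response at each profiled signal level."""
    return [(s, display_response(s)) for s in nodes]

def lookup(lut, signal):
    """Predict the response for a signal in [0, 1], interpolating linearly between LUT nodes."""
    xs = [s for s, _ in lut]
    i = bisect_right(xs, signal) - 1
    if i >= len(lut) - 1:
        return lut[-1][1]
    x0, y0 = lut[i]
    x1, y1 = lut[i + 1]
    t = (signal - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)

def prediction_error(lut, signal):
    """Absolute gap between the LUT prediction and the actual response."""
    return abs(lookup(lut, signal) - display_response(signal))

lut = build_lut([0.0, 0.25, 0.5, 0.75, 1.0])

# A level that was profiled is reproduced exactly...
print(prediction_error(lut, 0.5))
# ...while a level between two nodes carries interpolation error.
print(prediction_error(lut, 0.375))
```

Evaluating only at the profiled levels would report zero error for this display, while evaluating between them exposes the error that was there all along.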
When calibrating a TV, people almost always take that first approach and evaluate the same points they calibrated. We calibrate to the RGB primaries (and CMY secondaries), measure how close they are, and assume the intermediate colors will be created correctly. One benefit is that it is very easy to compare across different reviews, as we all have the same targets. Sometimes we find after viewing test material that something is wrong, and that making those six points correct caused the millions of other possible points to be incorrect. This could be due to a lack of bit depth in the calculations causing posterization, an incorrect formula, or something else. Some programs might do the same thing: they create a profile for the display, but then only check against colors that are in the LUT, so the results will look accurate.
We check color fidelity using the well-known GretagMacbeth ColorChecker chart. This is a collection of 24 color swatches that are common in daily life, like skin tones, sky blue, natural greens, and more. None of these are typically contained in the LUT of the profile, so we are finding out how well the display handles these other shades rather than, in a way, cheating by using known values. Because of this we expect to encounter a higher amount of error than other tests might, but we also believe it is closer to real-world results.
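For readers who want to see what a dE figure actually is, the sketch below computes it for a few swatches and averages the results. It assumes the simplest formula, CIE76, which is just the straight-line distance between the target and measured colors in L\*a\*b\* space; other formulas such as CIE94 and CIEDE2000 weight the terms differently, and this is not a statement of which formula any particular package reports. The swatch names and values are placeholders invented for the example, not real ColorChecker data or measurements.

```python
import math

def delta_e_cie76(lab_ref, lab_meas):
    """CIE76 dE: Euclidean distance between two (L*, a*, b*) triples."""
    return math.sqrt(sum((r - m) ** 2 for r, m in zip(lab_ref, lab_meas)))

# Placeholder (target, measured) pairs for illustration only.
# These are NOT real ColorChecker values or real measurements.
swatches = {
    "swatch_a": ((50.0, 0.0, 0.0), (50.0, 3.0, 4.0)),
    "swatch_b": ((70.0, 10.0, 20.0), (71.0, 10.0, 20.0)),
    "swatch_c": ((30.0, -5.0, 15.0), (30.0, -5.0, 15.0)),
}

# Per-swatch error, then the summary figures a review would quote.
errors = {name: delta_e_cie76(ref, meas) for name, (ref, meas) in swatches.items()}
average_de = sum(errors.values()) / len(errors)
worst_swatch = max(errors, key=errors.get)

print(errors)
print(average_de, worst_swatch)
```

The average hides how the error is distributed, which is why a single badly reproduced swatch can raise the headline number even when most colors are close.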
The other main source of error using this method is colors in the chart that are outside of the sRGB colorspace or at its very edge. Since the GretagMacbeth chart was designed around real-world photography and not computers, some of these swatches are much harder to reproduce. This helps to separate displays with larger color gamuts from those with smaller gamuts in testing, rewarding the larger-gamut displays with lower dE values in the end. It can also reward displays that have their own built-in LUTs for doing calculations over those that rely only on the LUTs in the graphics card.
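One rough way to picture "outside of sRGB" is to test whether a color’s CIE xy chromaticity falls inside the triangle formed by the sRGB primaries. The sketch below does that with a standard point-in-triangle test. The primaries are the published sRGB values; the helper names and the out-of-gamut test point are made up for the example, and the check is a simplification, since it ignores luminance and a color can sit inside the triangle yet still be too bright or too dark to reproduce.

```python
# CIE xy chromaticities of the sRGB primaries (red, green, blue).
SRGB_PRIMARIES = ((0.64, 0.33), (0.30, 0.60), (0.15, 0.06))

def _cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_srgb_triangle(xy):
    """True if the chromaticity lies inside or on the sRGB triangle."""
    r, g, b = SRGB_PRIMARIES
    d1 = _cross(r, g, xy)
    d2 = _cross(g, b, xy)
    d3 = _cross(b, r, xy)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # The point is inside when it is on the same side of all three edges.
    return not (has_neg and has_pos)

# The D65 white point is comfortably inside the triangle.
print(in_srgb_triangle((0.3127, 0.3290)))
# A hypothetical saturated cyan-green beyond the green-blue edge is not.
print(in_srgb_triangle((0.15, 0.35)))
```

A display limited to sRGB has to substitute the nearest reproducible color for a point like the second one, and that substitution shows up directly as dE.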
So when you look at an LCD review, remember that one dE isn’t the same as another dE. Both are valid, but they are potentially measuring very different things. I could easily put up the dE values that ColorEyes Display Pro generates when it verifies a profile, and every display would have a value well below 1, but that wouldn’t be as useful or informative as the current method.