When rejection is a good thing, part 2

Digital noise comes in several varieties. We call the natural fluctuation of incoming light “photon shot noise.” The electronics of your camera also introduce uncertainties in the measurement of incident light. In fact, simply transferring the accumulated charge off the chip, and subsequently digitizing the information into brightness values (counts), introduces additional fluctuations in the measured results.

You might be able to minimize the chip’s “read noise” (but not the photon shot noise) by purchasing better electronics. Nonetheless, if you take more exposures and make more measurements, you can minimize the net noise level, be it natural or instrumental.
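To see why more exposures help, here is a minimal simulation sketch. It assumes the noise in each reading is a simple Gaussian fluctuation, which is only an approximation of real shot and read noise, and the signal level, noise level, and frame count are made-up numbers chosen for illustration.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)   # fixed seed so the run is repeatable

TRUE_SIGNAL = 1800.0     # hypothetical true pixel value, in counts
NOISE_SIGMA = 40.0       # hypothetical noise per exposure, in counts
N_FRAMES = 16            # exposures averaged per stacked measurement
N_TRIALS = 2000          # how many times we repeat the experiment

def one_reading():
    """One noisy measurement of the same pixel."""
    return rng.gauss(TRUE_SIGNAL, NOISE_SIGMA)

# Scatter of single exposures
single = [one_reading() for _ in range(N_TRIALS)]

# Scatter of averages of N_FRAMES exposures
stacked = [mean([one_reading() for _ in range(N_FRAMES)])
           for _ in range(N_TRIALS)]

single_noise = pstdev(single)
stacked_noise = pstdev(stacked)
print(f"single exposure noise: {single_noise:.1f} counts")
print(f"average of {N_FRAMES} noise:   {stacked_noise:.1f} counts")
```

Under these assumptions, the scatter of the averaged value shrinks roughly as the square root of the number of exposures, so 16 frames should give about one quarter of the single-frame noise.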

Although contributions from cosmic rays, satellites, asteroids, and other transient effects are not sources of noise, they are certainly unwanted signals. They are identifiable because their values may be significantly different from the fluctuations of values you expect to see each time you take an exposure. Image #1 plots the frequency of 27 values from a single pixel in one of my recent data sets.

Image #1. This histogram shows the frequency of values for the data listed in the plot area. Note that one value is far to the right and much greater than all other values, which cluster on the left edge.

All images: Adam Block

You will note that the values are pretty similar except for the outlying value of 3,729, which is due to a cosmic ray. But how do we know it is an outlier in a mathematically rigorous way? What we want to know is how large the outlier’s deviation from the mean value is. We can use this as a tool to choose whether or not we reject a value before calculating a new average from the remaining numbers, which should give us a much improved result.

Most programs have a “sigma rejection” algorithm, which simply calculates the mean value and the standard deviation from your measurements. “Sigma” and “standard deviation” are inscrutable terms to people who aren’t mathematicians. If you remember nothing else, remember this: The standard deviation of a set of measurements, often denoted by the Greek letter sigma (σ), is essentially the typical distance to the mean. That’s it! Just ask each value, “How far are you from the mean?” and take a kind of average of the answers. (Strictly speaking, it is a root-mean-square average: square each distance, average the squares, and take the square root.) If a value is many times the typical distance to the mean, then we can safely reject it.

For my data in Image #1, the mean is 1,861, and the typical distance to the mean, i.e., one standard deviation, is 376. The difference between 3,729 and 1,861 is 1,868, or around five times what you would expect for a typical fluctuation. Programs will ask you to enter a threshold in standard deviations, and any value that lies farther from the mean than that threshold is rejected. Typically, if a value is more than two or three standard deviations from the mean, it is likely an outlier caused by an unwanted source. The calculation of a standard deviation is meaningful when you have taken more than seven images.
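The procedure above can be sketched in a few lines of Python. This is a minimal single-pass illustration, not the algorithm of any particular program. The function name `sigma_reject` and the pixel readings are made up for the example (they are not the 27 values from Image #1), and the standard deviation is the conventional root-mean-square version from Python's `statistics` module.

```python
from statistics import mean, pstdev

def sigma_reject(values, threshold=3.0):
    """One pass of sigma rejection on the readings for a single pixel.

    Returns (kept, rejected). A value is rejected when it lies more than
    `threshold` standard deviations from the mean of all the values.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:               # all readings identical: nothing to reject
        return list(values), []
    kept, rejected = [], []
    for v in values:
        if abs(v - mu) > threshold * sigma:
            rejected.append(v)
        else:
            kept.append(v)
    return kept, rejected

# Hypothetical readings of one pixel across 10 exposures; the last one
# stands in for a cosmic-ray hit.
pixel_values = [1790, 1812, 1805, 1778, 1821, 1799, 1808, 1795, 1815, 3729]

kept, rejected = sigma_reject(pixel_values, threshold=2.5)
clean_mean = mean(kept)          # new average from the remaining values
print("rejected:", rejected)
print(f"mean before: {mean(pixel_values):.1f}  after: {clean_mean:.1f}")
```

Note that the outlier itself inflates both the mean and the standard deviation used to judge it. With only a handful of frames this can hide the outlier entirely, which is one reason the test needs a reasonable number of exposures.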

Image #2. The statistical method of rejecting pixels identifies outlying values and then averages the remaining numbers at a given pixel for a set of images. Rejection filters work differently by substituting neighboring pixel values in a single image but do not analyze the values of other images.

A caveat is that this statistical rejection and subsequent averaging is not the only method used in image processing. One other type of rejection algorithm is a filter that scans an image, and when it finds a value that is very different from its neighbors, it substitutes a new value based on the value of its neighbors (Image #2). Hot pixel filters are of this type.
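For contrast, a neighbor-substitution filter of this kind might look like the sketch below. It is an assumption-laden illustration rather than any program's actual hot pixel filter: the function name `hot_pixel_filter`, the use of the median of the surrounding pixels, and the fixed `max_diff` cutoff in counts are all choices made for the example.

```python
from statistics import median

def hot_pixel_filter(image, max_diff):
    """Replace pixels that differ strongly from their neighbors.

    `image` is a list of rows of pixel values from a SINGLE frame.
    A pixel is replaced by the median of its neighbors when it differs
    from that median by more than `max_diff` counts.
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]      # work on a copy
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                image[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            local = median(neighbors)
            if abs(image[r][c] - local) > max_diff:
                out[r][c] = local        # substitute the neighborhood value
    return out

# Hypothetical 3x3 patch with one hot pixel in the center
frame = [
    [100, 102, 101],
    [ 99, 950, 103],
    [101, 100, 102],
]
cleaned = hot_pixel_filter(frame, max_diff=50)
print(cleaned[1][1])
```

The key difference from sigma rejection is visible in the function signature: it sees only one frame, so the replacement value is a guess based on nearby pixels rather than a measurement of that pixel from other exposures.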

This is not the same kind of rejection and is not as robust as the method I described above. Some programs identify outlying values statistically, and instead of simply taking the average of the remaining values, they use a substitution method based on neighboring pixels. This processing error yields poorer results for your hard-won images, and it is one that I see often during my workshops. In other words, please do not do this!

In my next column, I will demonstrate the minimum filter, and I’ll offer a powerful variation for this common processing trick.
