The magic kernel

What is the magic kernel?

The “magic kernel” is a method of resampling images that gives amazingly clear results (free of “aliasing” artifacts, free of “ringing”, and free of “width beat” for thin features) yet is lightning fast (the simplest version only requires a few integer additions and bit-shifts).

I came across the magic kernel in 2006 in the source code for a popularly-used JPEG library. Since then I have investigated its remarkable properties in depth, and extended it to the case of arbitrary resampling factors.

This web page summarizes those properties, and provides a full C# code implementation of the magic kernel, including its application to images.

Where did the magic kernel come from?

In 2006 I was seeking to minimize the “blocking artifacts” caused by heavy use of JPEG compression (see here). In doing so, I made use of a popularly-used JPEG library, written in C, created by the Independent JPEG Group (IJG). Standard JPEG compression stores the two channels of chrominance data at only half-resolution, i.e., only one Cr and one Cb value for every 2 x 2 block of pixels. In reconstructing the image, it is therefore necessary to “upsample” the chrominance channels back to full resolution.

The simplest upsampling method is simply to replicate each chrominance value across its 2 x 2 block of pixels. But the IJG library also contained an option for “fancy upsampling” of the chrominance. The following example (twice real size) shows the difference in the resulting reconstructions (note, particularly, the edges between red and black):

[Image: “Replicated” upsampling]

[Image: “Fancy” upsampling]

Intrigued, I examined the C source code for this “fancy” upsampling. To my amazement, I discovered that it was nothing more than a kernel of {1/4, 3/4, 3/4, 1/4} in each direction (where, in this notation, the standard “replication” kernel is just {0, 1, 1, 0}).
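
To make this concrete, here is a minimal one-dimensional sketch of such a 2x upsampling in C# (my own illustration, not the IJG code; the names and the clamped edge handling are my choices). Each input sample produces two output samples, each a {3/4, 1/4} weighted average of itself and one neighbor; with 8-bit samples the arithmetic reduces to the few integer additions and bit-shifts mentioned above:

    using System;

    static class MagicUpsampling
    {
        // 2x upsampling of one scanline with the kernel {1/4, 3/4, 3/4, 1/4}.
        // Output sample 2i sits a quarter-sample to the left of input sample i,
        // and output sample 2i + 1 a quarter-sample to the right.
        public static byte[] Upsample2x(byte[] input)
        {
            int n = input.Length;
            var output = new byte[2 * n];
            for (int i = 0; i < n; i++)
            {
                int left  = input[Math.Max(i - 1, 0)];      // clamp at the edges
                int here  = input[i];
                int right = input[Math.Min(i + 1, n - 1)];
                // (3 * here + neighbor + 2) / 4, using only additions and a shift:
                output[2 * i]     = (byte)((here + here + here + left  + 2) >> 2);
                output[2 * i + 1] = (byte)((here + here + here + right + 2) >> 2);
            }
            return output;
        }
    }

Applying the same pass to rows and then columns gives the full two-dimensional doubling; replication, by contrast, would simply copy here into both output samples.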

I decided to try this kernel out on general images, where it has the effect of doubling the size of the image in each direction. To my astonishment, I found that it gave remarkably good results for repeated upsamplings, no matter how many times it was applied. I therefore dubbed it the “magic kernel”. Even more amazingly, the results were, visually, far superior to the bicubic resampling commonly used in Photoshop or GIMP. (They were also superior to the Lanczos-3 upsampling in GIMP, but I have since been informed that the GIMP implementation of Lanczos-3 is broken, so I have temporarily removed that comparison image.)

[Image: Original image]

[Image: Magic kernel once]

[Image: Magic kernel twice]

[Image: Magic kernel thrice]

[Image: Bicubic: note the artifacts]

Likewise, applying it as a downsampling kernel yields beautifully antialiased images:

[Image: New original image]

[Image: Magic kernel once]

[Image: Magic kernel twice]

[Image: Magic kernel thrice]
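
For completeness, here is the corresponding sketch of 2x downsampling (again my own illustration, not code from this page): transposing the upsampling above and renormalizing weights each output sample as a {1/8, 3/8, 3/8, 1/8} average of the four nearest input samples:

    using System;

    static class MagicDownsampling
    {
        // 2x downsampling of one scanline with the kernel {1/4, 3/4, 3/4, 1/4},
        // normalized to {1/8, 3/8, 3/8, 1/8}. Edges are clamped; if the input
        // length is odd, the final sample is simply dropped.
        public static byte[] Downsample2x(byte[] input)
        {
            int n = input.Length;
            var output = new byte[n / 2];
            for (int j = 0; j < n / 2; j++)
            {
                int a = input[Math.Max(2 * j - 1, 0)];
                int b = input[2 * j];
                int c = input[2 * j + 1];
                int d = input[Math.Min(2 * j + 2, n - 1)];
                output[j] = (byte)((a + 3 * b + 3 * c + d + 4) >> 3);
            }
            return output;
        }
    }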

Flabbergasted, and unable to find this kernel either in textbooks or on the web, I asked on the newsgroups whether it was well-known (see here and here). It appeared that it was not.

Why is the magic kernel so good?

Having been trained and drilled extensively in advanced Fourier theory during my undergraduate Electrical Engineering degree, and having applied that theory to some of the most intricate formulations of theoretical particle physics (see here), I knew that the “correct” antialiasing kernel is a sinc function. In practice, the infinite support and negative values in the sinc function necessitate approximations (such as the Lanczos kernels).

The question, then, was obvious: how on Earth could the simple kernel {1/4, 3/4, 3/4, 1/4} out-perform such “correct” kernels?

The answer to that question seems to have two parts.

Firstly, as can be seen from the newsgroup discussions following my above postings, this kernel is effectively a very simple approximation to the sinc function, albeit with the twist that the new sampling positions are offset from the original positions. (This is obvious when one considers the “tiling” of the original JPEG chrominance implementation, but is not an obvious choice from the point of view of sampling theory.) Charles Bloom has recently noted that the kernel can be factorized into a nearest-neighbor upsampling followed by convolution with a regular {1/4, 1/2, 1/4} kernel.
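
That factorization is easy to verify numerically; in the following sketch (my own, with hypothetical names), doubling by nearest neighbor and then convolving with {1/4, 1/2, 1/4} reproduces the magic kernel output sample-for-sample:

    using System;

    static class FactorizedUpsampling
    {
        // Nearest-neighbor doubling followed by a {1/4, 1/2, 1/4} smoothing:
        // algebraically identical to upsampling with {1/4, 3/4, 3/4, 1/4}.
        public static double[] Upsample2x(double[] input)
        {
            int n = input.Length;
            var nn = new double[2 * n];
            for (int i = 0; i < n; i++)
            {
                nn[2 * i]     = input[i];   // nearest-neighbor doubling
                nn[2 * i + 1] = input[i];
            }
            var output = new double[2 * n];
            for (int i = 0; i < 2 * n; i++)
            {
                double left  = nn[Math.Max(i - 1, 0)];          // clamped edges
                double right = nn[Math.Min(i + 1, 2 * n - 1)];
                output[i] = 0.25 * left + 0.5 * nn[i] + 0.25 * right;
            }
            return output;
        }
    }

Expanding the two steps gives output sample 2i as a[i - 1]/4 + 3 a[i]/4, and sample 2i + 1 as 3 a[i]/4 + a[i + 1]/4: precisely the magic kernel weights.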

Secondly, it turns out that an arbitrary number of repeated applications of this kernel results in a sequence of kernels with unique and extremely desirable properties. This is the topic of the next section.

The continuum magic kernel

In investigating why the magic kernel performed miraculously well when applied repeatedly to images, I discovered empirically that, when the kernel was convolved with itself an arbitrary number of times, the result was always of a “tri-parabolic” form, which allows the kernel to “fit into itself” when convolved with a constant source signal.

Taking this to its logical conclusion, I looked at what would happen if a signal were upsampled with the magic kernel an infinite number of times. If we denote by x the coordinate in the original sampled space (where the original signal is sampled at integral values of x), then I found that the infinitely-upsampled continuum magic kernel is given analytically by

m(x) =
\begin{cases}
\frac{3}{4} - x^2, & |x| \le \frac{1}{2} \\
\frac{1}{2} \left( |x| - \frac{3}{2} \right)^2, & \frac{1}{2} \le |x| \le \frac{3}{2} \\
0, & \text{otherwise.}
\end{cases}
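
In code, the continuum kernel is a simple three-branch function; here is a direct C# transcription of the formula above (the class and method names are my own):

    using System;

    static class ContinuumMagicKernel
    {
        // The continuum magic kernel m(x): tri-parabolic, supported on [-3/2, 3/2].
        public static double M(double x)
        {
            x = Math.Abs(x);
            if (x <= 0.5) return 0.75 - x * x;
            if (x <= 1.5) return 0.5 * (x - 1.5) * (x - 1.5);
            return 0.0;
        }
    }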

Here is a graph of the function:

[Graph of m(x)]

This doesn’t look much like a sinc function at all. So how and why does it work so well?

Firstly, m(x) has finite support: a given sample of the original signal only contributes to the region from 1.5 spacings to the left to 1.5 spacings to the right of its position. This is extremely good in practical terms. In contrast, the sinc function has infinite support, which raises important questions about its theoretical validity for real signals, which are always bounded. (I look at this “boundary condition” problem in detail in the physics papers noted above.)

Secondly, m(x) is non-negative. Again, this avoids the practical problems involved in using the sinc function, which has substantial negative lobes, leading to practical approximations that negate the theoretical “correctness” of the sinc function.

Thirdly, m(x) is continuous and has continuous first derivative everywhere. This property ensures that convolving with the magic kernel yields results that “look good”: discontinuities in the kernel or its first derivative generally lead to artifacts that can be detected by the human eye. These mathematical properties are shared by the sinc function, but not necessarily by its practical approximations.

Fourthly, the shape of m(x) visually resembles that of a Gaussian, which (if one didn’t know anything about Fourier theory) would be the mathematician’s natural choice for an interpolation kernel. The standard deviation of m(x) can be calculated from the above functional form: it is 1/2. Comparing m(x) with a Gaussian with the same standard deviation shows that the resemblance is not just an illusion:

[Graph: m(x) compared with the Gaussian of the same standard deviation]
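
As a quick check of that standard deviation, direct integration of the functional form gives

\int_{-\infty}^{\infty} x^2 \, m(x) \, dx
= \int_{-1/2}^{1/2} x^2 \left( \frac{3}{4} - x^2 \right) dx
+ 2 \int_{1/2}^{3/2} \frac{x^2}{2} \left( x - \frac{3}{2} \right)^2 dx
= \frac{1}{20} + \frac{1}{5}
= \frac{1}{4};

since m(x) integrates to unity and has zero mean, its standard deviation is indeed the square root of 1/4, namely 1/2.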

Remarkably, the seminal work in pyramid encoding in the early 1980s found exactly the same Gaussian-like properties for a five-tap filter with coefficients {0.05, 0.25, 0.4, 0.25, 0.05} (see here and here).

In contrast, a sinc function has oscillatory lobes that are not an intuitive choice for a smooth interpolation function, and require forced truncation in practice.

Unlike a Gaussian, m(x) has finite support, so that only three adjacent copies of it are necessary for it to “fit into itself”. We can therefore see why truncating a Gaussian to finite support (needed for practical purposes) leads to artifacts.

However, arguably the most important property of m(x) arises from what, at first, actually appears to be a detrimental feature: its value at x = 0 is only 3/4. In contrast, the sinc function is unity at x = 0 and vanishes at all other integral x, which means that it is, mathematically, a projection operator: it can be applied an arbitrary number of times to the original signal without degrading it at all. So how can m(x) still do a good job?

Indeed, the following suggestion may have popped into your mind: why not simply define a different tri-parabolic resampling kernel whose support is between –1 and +1? Then it could be unity at x = 0, and very easily engineered to fit into adjacent copies of itself. Specifically:

n(x) =
\begin{cases}
1 - 2x^2, & |x| \le \frac{1}{2} \\
2 \left( 1 - |x| \right)^2, & \frac{1}{2} \le |x| \le 1 \\
0, & \text{otherwise.}
\end{cases}

The problem with such a kernel can be seen if we consider the following thought experiment: what if we want to resample a signal onto lattice points with the same spacing, but offset by some non-integral amount, f?

Imagine the case of the original signal having the value 1 at x = 0, and 0 at all other integral x values. If f = 0, then the signal is unchanged when we convolve it with n(x), but is spread out to have the values {1/8, 3/4, 1/8} when we convolve it with m(x). If we convolve this with the pixel response function for whatever display device we wish to use, itself having spatial variance of v, then the overall spatial variance for the resampled signal will be v + 1/4 for m(x), but only v for n(x).

Doesn’t this show that n(x) is better than m(x)?

No. Consider the case when f = 1/2. Convolving with either m(x) or n(x) leads to the same result {1/2, 1/2}. Convolving with the pixel response function leads to a spatial variance of v + 1/4 for both m(x) and n(x).

Now consider the case when the resampling spacing is slightly different from the original spacing. The kernel n(x) will “beat” against the original spacing, creating a Moiré pattern in which the variance of the resampled signal oscillates between v and v + 1/4. (For example, in an image, thin lines will get alternately fatter and thinner as one moves across the image.) In contrast, it can be shown analytically that, remarkably, convolving with m(x) leads to a constant variance of v + 1/4 for any value of f.
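
This constancy is easy to verify numerically. The sketch below (my own construction; the names are hypothetical) resamples a unit impulse onto a lattice offset by f and computes the second moment of the result about the impulse’s position. Using the M(x) transcription above for the magic kernel, the moment comes out as exactly 1/4 for every f, whereas for n(x) it climbs from 0 at f = 0 to 1/4 at f = 1/2:

    using System;

    static class BeatCheck
    {
        // Second moment, about x = 0, of a unit impulse resampled onto the
        // lattice { f + k : k integer } with the given kernel.
        public static double ImpulseSecondMoment(Func<double, double> kernel, double f)
        {
            double moment = 0.0;
            for (int k = -2; k <= 2; k++)   // both kernels vanish beyond |x| = 3/2
            {
                double x = f + k;
                moment += x * x * kernel(x);
            }
            return moment;
        }

        // The competing kernel n(x): unity at x = 0, supported on [-1, 1].
        public static double N(double x)
        {
            x = Math.Abs(x);
            if (x <= 0.5) return 1.0 - 2.0 * x * x;
            if (x <= 1.0) return 2.0 * (1.0 - x) * (1.0 - x);
            return 0.0;
        }
    }

Sweeping f from 0 to 1/2, ImpulseSecondMoment(ContinuumMagicKernel.M, f) returns 0.25 throughout, while ImpulseSecondMoment(BeatCheck.N, f) runs from 0 up to 0.25: precisely the oscillation that produces the width beat.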

We therefore see why the magic kernel’s value of only 3/4 at x = 0 is not only not detrimental, but in fact crucial to its fidelity as a resampling kernel: it ensures that there are no Moiré beat patterns in the widths of fine features, regardless of the resampling frequency or offset phase.

The magic kernel is, indeed, remarkable!

Source code