It's all a blur

339 points | 63 comments | 6 days ago
siofra

Beautiful walkthrough. The key insight people miss is that "looks unreadable to humans" and "is information-theoretically destroyed" are very different bars. The blur looks opaque because our visual system is bad at detecting small per-pixel differences, but the math does not care about our perception.

Same principle applies to other "looks safe" redactions — pixelation with small block sizes, partial masking of credentials, etc. If you can describe the transform as a linear operation, there is probably a pseudoinverse waiting to undo it.
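
To make that concrete, here's a minimal numpy sketch (the 1-D box kernel, signal length and values are all made up): write the blur as an explicit matrix, then undo it with the pseudoinverse.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 32, 5                          # signal length and box-blur width (made up)
    x = rng.random(n)                     # stand-in for a row of original pixels

    # Write the blur as an explicit linear operator: column j of A spreads
    # pixel j over k outputs ("full" 1-D convolution with a box kernel).
    A = np.zeros((n + k - 1, n))
    for j in range(n):
        A[j:j + k, j] = 1.0 / k

    y = A @ x                             # the blurred signal we get to see

    # The transform is linear and A has full column rank, so the
    # Moore-Penrose pseudoinverse undoes it (up to floating-point error).
    x_hat = np.linalg.pinv(A) @ y
    print(np.max(np.abs(x_hat - x)))      # ~1e-13: nothing was actually destroyed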

jeremyscanvic

Blur is, perhaps surprisingly, one of the degradations we know best how to undo. It's been studied extensively because there are so many applications: microscopes, telescopes, digital cameras. The usual tricks revolve around inverting blur kernels and making educated guesses about what the blur kernel and the underlying image might look like. My advisors and I were even able to train deep neural networks using only blurry images, relying on a really mild assumption of approximate scale-invariance at the training-dataset level [1].

[1] https://ieeexplore.ieee.org/document/11370202

swiftcoder

One salient point not touched on here is that an awful lot of the time, the thing folks are blurring out is specifically text. And since we know an awful lot about what text ought to look like, we have a lot more information to guide the reconstruction...

coldtea

>But then, it’s not wrong to scratch your head. Blurring amounts to averaging the underlying pixel values. If you average two numbers, there’s no way of knowing if you’ve started with 1 + 5 or 3 + 3. In both cases, the arithmetic mean is the same and the original information appears to be lost. So, is the advice wrong?

Well, if you have a large enough averaging window (as is the case when blurring letters), the letters have constraints (a fixed set of shapes), and information about those constraints is partly retained.

Not very different from the information retained in minesweeper games.
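
For what it's worth, here's a toy version of the article's sliding-window recovery, assuming (purely for illustration) that the first couple of pixels are known, say a white margin:

    import numpy as np

    rng = np.random.default_rng(1)
    k = 3                                            # averaging window (made up)
    img = rng.integers(0, 256, 20).astype(float)     # hypothetical row of pixels

    # Box blur: each output is the mean of k consecutive pixels.
    blurred = np.convolve(img, np.ones(k) / k, mode="valid")

    # A single average is ambiguous (mean(1, 5) == mean(3, 3)), but adjacent
    # windows overlap in k-1 pixels, so once the first window is constrained
    # (known margin, guessed letter shape, ...) each new average pins down
    # exactly one new original pixel.
    recovered = list(img[:k - 1])                    # assumed-known starting pixels
    for i in range(len(blurred)):
        recovered.append(blurred[i] * k - sum(recovered[i:i + k - 1]))

    print(np.max(np.abs(np.array(recovered) - img)))  # 0.0: the whole row comes back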

derektank

Captain Disillusion recently covered this subject in a more popular science format as well

https://youtu.be/xDLxFGXuPEc

bmandale

> This nets us another original pixel value, img(8).

This makes it all seem a little too pat. In fact, this probably doesn't get us the original pixel value, because quantization deletes information when the blur is applied, and that information can never be recovered afterwards. We can at best get an approximation of the original value, which is rather obvious given that we can already vaguely make out figures in a blurred image.

> Nevertheless, even with a large averaging window, fine detail — including individual strands of hair — could be recovered and is easy to discern.

The reason for this is that he's demonstrating a box blur. A box blur is roughly equivalent to taking the frequency transform of the image and multiplying it by a sinc function, i.e. a sort of decaying sine wave. This achieves a "blur" in that the lowest frequency is multiplied by 1 and hence retained, while higher frequencies are attenuated. However, visually we can see that a box blur doesn't look very good, and importantly it doesn't necessarily attenuate the very highest frequencies much more than far lower frequencies. Hence it isn't surprising that the highest frequencies can be recovered with good fidelity. Compare a Gaussian blur, which is usually considered to look better, and whose frequency transform concentrates all the attenuation at the highest frequencies. You would be far less able to recover individual strands of hair in an image that was Gaussian-blurred.
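
A quick numerical check of that (1-D, kernel widths picked arbitrarily):

    import numpy as np

    n = 256                                  # number of frequency samples
    box = np.ones(9) / 9                     # 9-tap box blur

    t = np.arange(-7, 8)
    gauss = np.exp(-t**2 / (2 * 2.5**2))     # 15-tap Gaussian of comparable width
    gauss /= gauss.sum()

    H_box = np.abs(np.fft.rfft(box, n))      # magnitude response of each kernel
    H_gauss = np.abs(np.fft.rfft(gauss, n))

    freqs = np.fft.rfftfreq(n)               # 0 ... 0.5 (Nyquist)
    top = freqs > 0.4                        # top of the band: the finest detail

    # The box response is a sinc: it nulls a few mid frequencies but its
    # sidelobes keep bouncing back near Nyquist, so the finest detail is
    # only mildly attenuated. The Gaussian's response just keeps falling.
    print("box gain near Nyquist:  ", H_box[top].max())    # roughly 0.1
    print("gauss gain near Nyquist:", H_gauss[top].max())   # far smaller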

> Remarkably, the information “hidden” in the blurred images survives being saved in a lossy image format.

Remarkable, maybe, but unsurprising if you understand that jpeg operates on basically the same frequency logic as described above. Specifically, it will be further attenuating and quantizing the highest frequencies of the image. Since the box blur has barely attenuated them already, this doesn't affect our ability to recover the image.

cornhole

Reminds me of the guy who used the Photoshop swirl effect to mask his face in CSAM he produced, and who was found out when someone just undid the swirl.

dsego

Can this be applied to camera shutter/motion blur? At low shutter speeds the slight shake of the camera produces this type of blur. It's usually addressed with IBIS, which stabilizes the sensor.

srean

Encode the image as a boundary condition of a laminar flow and you can recover the original image from an observation.

If, however, you observe after turbulence has set in, then some of the information has been lost; it's in the entropy now. How much depends on the turbulent flow.

Don't miss out on this video by smarter every day

https://youtu.be/j2_dJY_mIys?si=ArMd0C5UzbA8pmzI

Treat the dynamics and the time of evolution as your private key; laminar flow is a form of encryption.

tflinton

I did my thesis on using Medioni's tensor voting framework to reconstruct noisy, blurry, low-res, and similarly degraded images. It was sponsored by USGS on a data set that I thought was a bit of a bizarre use case. The approach worked pretty well, with some reasonable success at doing "COMPUTER ENHANCE" type computer vision magic. Later on, talking with my advisor about the bizarrely mundane and uninteresting data sets we were working on from the grant, he quipped, "You built a reasonable way of unblurring and enhancing unreadable images; the military doesn't care about this mundane use case." It then occurred to me that I'd been wildly ignorant of what I had just spent 2 years of my life on.

esafak

This is classical deconvolution. Modern de-blurring implementations are DNN-based.
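
For the curious, here's a toy 1-D Wiener deconvolution, one classical way to do it (the kernel, signal and noise level are all invented; real pipelines also have to estimate the kernel and the noise):

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 256, 11
    x = np.cumsum(rng.standard_normal(n))        # made-up smooth-ish signal
    h = np.ones(k) / k                           # known blur kernel (a box here)

    H = np.fft.rfft(h, n)
    y = np.fft.irfft(np.fft.rfft(x) * H, n)      # circular blur of the signal
    y += 0.01 * rng.standard_normal(n)           # a little measurement noise

    nsr = 1e-3                                   # assumed noise-to-signal ratio
    G = np.conj(H) / (np.abs(H)**2 + nsr)        # the Wiener deconvolution filter
    x_hat = np.fft.irfft(np.fft.rfft(y) * G, n)

    print("error before deconvolution:", np.mean((y - x)**2))
    print("error after deconvolution: ", np.mean((x_hat - x)**2))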

praptak

My (admittedly superficial) understanding of blur reversibility is that an attacker may already know what kind of stuff is behind the blur.

I mean knowledge like "a human face, but the potential set of humans is known to the attacker" or even worse "a text, but the font is obvious from the unblurred part of the doc".

jfaganel99

How do we apply this to geospatial face and licence plate blurs?

IshKebab

In practice unblurring (deconvolution) doesn't really work as well as you'd hope because it is usually blind (you don't know the blur function), and it is ill-conditioned, so any small mistakes or noise get enormously amplified.
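
You can see the ill-conditioning directly even when the kernel is known. A small sketch (blur width, sizes and noise level are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    n, sigma = 128, 2.0
    i = np.arange(n)

    # A 1-D Gaussian blur written as an explicit matrix, rows normalised to 1.
    A = np.exp(-(i[:, None] - i[None, :])**2 / (2 * sigma**2))
    A /= A.sum(axis=1, keepdims=True)

    x = np.sin(2 * np.pi * i / 32)                  # made-up "image" row
    y = A @ x
    y_noisy = y + 1e-6 * rng.standard_normal(n)     # noise far too small to see

    print("condition number:", np.linalg.cond(A))   # huge
    print("error, clean y:  ", np.max(np.abs(np.linalg.solve(A, y) - x)))
    print("error, noisy y:  ", np.max(np.abs(np.linalg.solve(A, y_noisy) - x)))
    # The clean inversion is fine; the invisible noise gets amplified by the
    # near-zero singular values into errors larger than the image itself.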

jkuli

A simple solution is to use a system of linear equations, Ax = b. Each row of the matrix is one equation: the row of A contains the kernel weightings across the image x, and the corresponding entry of b is the blurred pixel value. The full matrix would be a terabyte, so take advantage of the zeros and use an efficient sparse solve for x instead of inversion.
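
Roughly what that looks like in 1-D with scipy's sparse machinery (kernel and sizes invented; a real image gives a 2-D version of the same banded structure):

    import numpy as np
    from scipy import sparse
    from scipy.sparse.linalg import spsolve

    rng = np.random.default_rng(0)
    n = 100_000                                        # number of pixels (1-D here)
    kernel = np.array([1.0, 2.0, 4.0, 2.0, 1.0]) / 10  # hypothetical blur weights

    # Each row of A holds the kernel weights for one blurred pixel, so A is a
    # banded sparse matrix and is never stored densely.
    A = sparse.diags(list(kernel), offsets=[-2, -1, 0, 1, 2],
                     shape=(n, n), format="csc")

    x = rng.random(n)                                  # "original" pixels
    b = A @ x                                          # the observed blurred pixels

    x_hat = spsolve(A, b)                              # sparse solve, no explicit inverse
    print("max error:", np.max(np.abs(x_hat - x)))     # tiny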

"Enhance" really refers to combining multiple images (stacking). Each pixel in a low-res image was a kernel over the same high-res image, so undoing a 100-pixel blur is equivalent to combining 10,000 images for 100x super-resolution.

zb3

Ok, what about gaussian blur?

unconed

Sorry but this post is the blind leading the blind, pun intended. Allow me to explain, I have a DSP degree.

The reason the filters used in the post are easily reversible is because none of them are binomial (i.e. the discrete equivalent of a gaussian blur). A binomial blur uses the coefficients of a row of Pascal's triangle, and thus is what you get when you repeatedly average each pixel with its neighbor (in 1D).

When you do, the information at the Nyquist frequency is removed entirely, because a signal of the form "-1, +1, -1, +1, ..." ends up blurred _exactly_ into "0, 0, 0, 0...".

All the other blur filters, in particular the moving average, are just poorly conceived. They filter out the middle frequencies the most, not the highest ones. It's equivalent to doing a bandpass filter and then subtracting that from the original image.
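
A quick check of both claims in code (signal length, widths and pass counts are arbitrary):

    import numpy as np

    # Repeatedly averaging neighbours (a binomial blur) sends the alternating
    # Nyquist-frequency signal exactly to zero, while a plain moving average
    # of comparable width leaves much of it behind.
    x = np.array([-1.0, 1.0] * 16)           # pure Nyquist-frequency signal

    def neighbour_average(s):
        """One pass of averaging each sample with its right neighbour (circular)."""
        return 0.5 * (s + np.roll(s, -1))

    binomial = x.copy()
    for _ in range(4):                        # 4 passes ~ binomial kernel [1,4,6,4,1]/16
        binomial = neighbour_average(binomial)

    box = np.convolve(x, np.ones(5) / 5, mode="same")   # moving average, width 5

    print("after binomial blur:", np.max(np.abs(binomial)))  # exactly 0
    print("after box blur:     ", np.max(np.abs(box)))       # ~0.2, still there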

Here's an interactive notebook that explains this in the context of time series. One important point is that the "look" people associate with "scientific data series" is actually an artifact of moving averages. If a proper filter is used, the blurriness of the signal is evident. https://observablehq.com/d/a51954c61a72e1ef

oulipo2

Those unblurring methods look "amazing" like that, but they are very fragile: add even a modicum of noise to the blurred image and the deblurring will almost certainly fail completely. This is well known in signal processing.

chenmx

What I find fascinating about blur is how computational photography has completely changed the game. Smartphone cameras now capture multiple exposures and computationally combine them, essentially solving the deblurring problem before it even happens. The irony is that we now have to add blur back artificially for portrait mode bokeh, which means we went from fighting blur to synthesizing it as a feature.