Spatial Domain Restoration & Edge-Preserving Filters

The previous post introduced frequency domain image processing methods — transforming images to the frequency domain via Fourier transform, then performing filtering, restoration, and other operations. However, frequency domain methods are sometimes less intuitive, especially when we’re more accustomed to directly manipulating pixels.

Spatial domain methods process images directly in pixel space, without requiring transformations. This chapter introduces spatial domain image restoration, edge-preserving filtering, and sharpening techniques.

Lucy-Richardson Iterative Deconvolution

The goal of deconvolution is to recover the original image from a degraded image. The degradation process is typically modeled as convolution: degraded image = original image convolved with blur kernel + noise.

Direct deconvolution (inverse filtering) is extremely unstable when noise exists — noise is amplified by the division in high-frequency regions. The Lucy-Richardson algorithm provides a more robust iterative method.

Maximum Likelihood Principle

The Lucy-Richardson algorithm is based on maximum likelihood estimation, assuming image noise follows a Poisson distribution. Poisson noise is characterized by variance equal to mean — common in low-light imaging, X-ray imaging, and other scenarios.

Iterative Formula

The Lucy-Richardson iterative update formula is:

$$ f_{k+1}(x, y) = f_k(x, y) \cdot \left( \frac{g(x, y)}{h * f_k(x, y)} * h(-x, -y) \right) $$

Where:

  • $f_k$ is the $k$-th iteration’s estimated image
  • $g$ is the observed degraded image
  • $h$ is the blur kernel (point spread function)
  • $*$ denotes convolution
  • $h(-x, -y)$ is the flipped blur kernel

Iterative Behavior

Intuitively, this formula does two things:

  1. Calculate the ratio between current estimate and observation $g / (h * f_k)$ — if current estimate is too dark, ratio > 1, needs enhancement
  2. Use the flipped blur kernel to back-propagate the ratio, correcting pixel values

Choice of iteration count is crucial:

  • Too few iterations: image remains blurry, insufficient restoration
  • Too many iterations: details fully restored but noise amplified, overfitting

Early stopping is a common strategy — stop iteration before noise is over-amplified.

Figure 1 - Lucy-Richardson iteration process:

mermaid
flowchart TD
    A["Initial estimate f_0<br/>uniform or degraded image"] --> B["Compute convolution h * f_k"]
    B --> C["Compute ratio g / (h * f_k)"]
    C --> D["Back-propagate: ratio * h(-x, -y)"]
    D --> E["Update: f_{k+1} = f_k × back-propagation"]
    E --> F{Iterations<br/>enough?}
    F -->|No| B
    F -->|Yes| G["Output restored image f_k"]

    classDef start fill:#FF9800,color:#fff
    classDef proc fill:#2196F3,color:#fff
    classDef check fill:#9C27B0,color:#fff
    classDef end fill:#4CAF50,color:#fff

    class A start
    class B,C,D,E proc
    class F check
    class G end

In practice, the blur kernel $h$ can be measured experimentally (e.g., photographing a point source) or estimated via methods like blur edge analysis.

Total Variation Regularization

Deconvolution is essentially an ill-posed problem — even knowing the blur kernel, recovering the original image from the degraded image has no unique solution. Regularization introduces prior constraints to make the solution more stable.

Why Choose L1 Norm

The most intuitive regularization is energy minimization: minimize the total variation (TV) of the restored image. TV measures the sum of gradients in an image:

$$ TV(f) = \sum_{x,y} \sqrt{(\nabla_x f)^2 + (\nabla_y f)^2} $$

Using L1 norm (absolute gradient) instead of L2 norm (squared gradient) is because:

  • L2 norm over-penalizes strong gradients — edge gradients get smoothed out
  • L1 norm is more tolerant of large gradients — allows edges to remain sharp

Intuitively: L2 norm wants gradients to be small everywhere (smooth), while L1 norm allows gradients to be large in some places (edges), as long as other areas have sufficiently small gradients (flat regions).

Optimization Objective

The image restoration objective with TV regularization:

$$ \min_f | g - h * f |_2^2 + \lambda \cdot TV(f) $$

The first term is the data fidelity term — the restored image must be consistent with the observed image. The second term is the regularization term — the restored image’s total variation should be small. $\lambda$ is the balance parameter: large $\lambda$ favors smoother images, small $\lambda$ favors closer adherence to observation.

This optimization problem is typically solved via iterative algorithms like gradient descent or split Bregman. In practice, you can call scikit-image functions like restoration.denoise_tv_chambolle or restoration.unsupervised_wiener.

Limitations of TV Regularization

While TV regularization preserves edges, it has two common issues:

  1. Staircase effect — recovered flat regions show staircase artifacts because L1 norm tends to produce piecewise-constant regions
  2. Texture loss — fine textures get over-smoothed

More advanced regularization methods (e.g., non-local means, sparse representation) can mitigate these issues but have higher computational complexity.

Bilateral Filter

Edge-preserving filtering’s core goal is: smooth noise while not blurring edges. Traditional Gaussian filtering cannot achieve this — it treats all pixels equally, blurring both edges and flat regions.

Bilateral filtering achieves edge preservation by introducing color similarity as an additional weight.

Two-Dimensional Weights

Each pixel in bilateral filtering has two weights:

  1. Spatial weight $G_s$: determined by geometric distance between pixels, closer pixels get larger weight
  2. Color weight $G_r$: determined by difference in pixel values, more similar values get larger weight

Complete formula:

$$ f(x, y) = \frac{1}{W} \sum_{(i,j) \in \Omega} I(i, j) \cdot G_s(|(x,y)-(i,j)|) \cdot G_r(|I(x,y) - I(i,j)|) $$

Where $W$ is the normalization factor:

$$ W = \sum_{(i,j) \in \Omega} G_s(|(x,y)-(i,j)|) \cdot G_r(|I(x,y) - I(i,j)|) $$

$G_s$ and $G_r$ are typically Gaussian functions:

$$ G_s(d) = \exp\left(-\frac{d^2}{2\sigma_s^2}\right), \quad G_r(v) = \exp\left(-\frac{v^2}{2\sigma_r^2}\right) $$

$\sigma_s$ controls spatial neighborhood size, $\sigma_r$ controls color similarity threshold.

Why It Preserves Edges

The magic of bilateral filtering lies in the color weight $G_r$:

In flat regions, neighboring pixel values are similar, $G_r \approx 1$, bilateral filtering reduces to Gaussian filtering — noise is smoothed.

At edges, neighboring pixel values differ significantly, $G_r \approx 0$, cross-edge weights are suppressed — edges are not blurred.

Figure 2 - Bilateral filtering weight mechanism:

mermaid
flowchart TD
    CENTER["Center pixel P(x,y)<br/>I = 128"]
    P1["P1: I=125, dist1.0<br/>diff3 → weight0.93"]
    P2["P2: I=120, dist1.4<br/>diff8 → weight0.77"]
    P3["P3: I=30, dist1.0<br/>diff98 → weight≈0"]
    CENTER --> P1
    CENTER --> P2
    CENTER --> P3

    classDef center fill:#FF9800,color:#fff
    classDef neighbor fill:#2196F3,color:#fff
    classDef edge fill:#f44336,color:#fff
    class CENTER center
    class P1,P2 neighbor
    class P3 edge

The figure above shows that P3, despite being spatially close ($G_s \approx 0.95$), has a near-zero final weight because the color difference is too large (crossing an edge) — the edge is preserved.

Practical Applications

Bilateral filtering is commonly used for:

  • High Dynamic Range (HDR) imaging: fuse multi-exposure images then adjust local contrast with bilateral filtering
  • Cartoonization: strong bilateral filtering followed by color quantization produces cartoon-style effects
  • Image matting: use bilateral filtering to optimize alpha masks

OpenCV provides cv2.bilateralFilter():

python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
import cv2
import numpy as np

# Read image
img = cv2.imread('input.jpg')

# Bilateral filtering
# d: neighborhood diameter (must be odd)
# sigmaColor: standard deviation in color space
# sigmaSpace: standard deviation in coordinate space
filtered = cv2.bilateralFilter(img, d=9, sigmaColor=75, sigmaSpace=75)

Bilateral filtering’s drawback is high computational cost — each pixel needs to iterate over all pixels in the neighborhood, time complexity $O(N \cdot r^2)$, where $N$ is pixel count, $r$ is neighborhood radius. In practice, $d$ should not exceed 15, otherwise processing becomes slow.

Bilateral Filtering vs Gaussian Filtering

Compare the behavior of bilateral and Gaussian filtering at edges:

  • Gaussian filtering: all neighborhood pixels participate regardless of edge presence → edges get blurred
  • Bilateral filtering: cross-edge pixel weights are suppressed → edges preserved

You can verify this experimentally: process the same image with cv2.GaussianBlur() and cv2.bilateralFilter(), compare edge regions — Gaussian blur widens edges, bilateral filtering keeps edges sharp.

Guided Filter

Guided filtering is a linear-time $O(N)$ edge-preserving filtering method proposed by Kaiming He et al. in 2010. Compared to bilateral filtering, it has two advantages:

  1. Speed: linear time complexity, can quickly process large images
  2. No gradient reversal: avoids gradient reversal artifacts found in bilateral filtering (gradient reversal artifact means some regions’ gradient direction after filtering is opposite to the original image)

Local Linear Model

Guided filtering’s core assumption is: output image is a local linear transform of the guide image. For each local window $\omega_k$, we have:

$$ q_i = a_k I_i + b_k, \quad \forall i \in \omega_k $$

Where:

  • $q_i$ is the $i$-th pixel of the output image
  • $I_i$ is the $i$-th pixel of the guide image (guide image can be the input image itself or another image)
  • $a_k$ and $b_k$ are linear coefficients in window $\omega_k$ (all pixels in the same window share $a_k, b_k$)

If $a_k$ is small, the output image varies smoothly within the window (flat region); if $a_k$ is large, the output follows the guide image (edge region).

Coefficient Solution

$a_k$ and $b_k$ are obtained via least squares fitting. The objective is to minimize:

$$ \min_{a_k, b_k} \sum_{i \in \omega_k} \left[ (a_k I_i + b_k - p_i)^2 + \epsilon a_k^2 \right] $$

Where $p_i$ is the $i$-th pixel of the input image. $\epsilon a_k^2$ is the regularization term, preventing $a_k$ from being too large (overfitting noise).

The solution is:

$$ a_k = \frac{\frac{1}{|\omega|} \sum_{i \in \omega_k} I_i p_i - \mu_k \bar{p}_k}{\sigma_k^2 + \epsilon} $$

$$ b_k = \bar{p}_k - a_k \mu_k $$

Where:

  • $\mu_k$ is the mean of the guide image in window $\omega_k$
  • $\sigma_k^2$ is the variance of the guide image in window $\omega_k$
  • $\bar{p}_k$ is the mean of the input image in window $\omega_k$
  • $|\omega|$ is the number of pixels in the window

Intuitively:

  • Large variance $\sigma_k^2$ (edge regions), $a_k$ tends to be large → output follows guide image
  • Small variance $\sigma_k^2$ (flat regions), $a_k$ tends toward 0 → output gets smoothed

Practical Applications

OpenCV provides cv2.ximgproc.guidedFilter() (requires opencv-contrib-python):

python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
import cv2
import numpy as np

# Read image
img = cv2.imread('input.jpg')
guide = img  # guide image can be the original or another image

# Guided filtering
filtered = cv2.ximgproc.guidedFilter(
    guide=guide,
    src=img,
    radius=8,      # window radius
    eps=0.01       # regularization parameter
)

Guided filtering is commonly used for:

  • Matting refinement: use original image as guide, refine coarse alpha masks
  • Dehazing: use dark channel as guide, restore clear image
  • HDR compression: use original image as guide, adjust local contrast

Bilateral Filtering vs Guided Filtering

FeatureBilateral FilteringGuided Filtering
Time complexity$O(N \cdot r^2)$$O(N)$
Edge preservationYesYes
Gradient reversalPossibleNone
GuidabilityOnly itselfCan use other images

For large images or real-time applications, guided filtering is more suitable. If fine-grained control over edge preservation is needed, bilateral filtering’s parameter tuning is more intuitive.

Laplacian Sharpening

Sharpening is the opposite of filtering — filtering is for smoothing, sharpening is for enhancing edge contrast. Laplacian sharpening is the most basic sharpening method.

Second-Order Derivative Operator

The Laplacian operator is a second-order differential operator, defined as:

$$ \nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} $$

Second derivatives respond strongly to edges (where gradient changes abruptly) and approach 0 in flat regions. Thus Laplacian response can be seen as a measure of edge strength.

Discretization and Convolution Kernels

In discrete images, second derivatives are approximated with finite differences:

$$ \frac{\partial^2 f}{\partial x^2} \approx f(x+1, y) - 2f(x, y) + f(x-1, y) $$

$$ \frac{\partial^2 f}{\partial y^2} \approx f(x, y+1) - 2f(x, y) + f(x, y-1) $$

Adding them yields the 4-neighborhood Laplacian kernel:

1
2
3
[ 0  -1   0]
[-1   4  -1]
[ 0  -1   0]

If diagonal directions are considered (8-neighborhood), the kernel is:

1
2
3
[-1  -1  -1]
[-1   8  -1]
[-1  -1  -1]

Sharpening Formula

Adding the Laplacian response back to the original image enhances edge contrast:

$$ g(x, y) = f(x, y) + c \cdot \nabla^2 f(x, y) $$

$c$ is the sharpening strength:

  • $c > 0$: enhance edges (when kernel center coefficient is positive, like 4 or 8)
  • $c < 0$: reverse operation (when kernel center coefficient is negative, like -4 or -8)

In practice, to maintain image brightness, typically use a kernel with positive center coefficient (like changing 4-neighborhood kernel center to 4):

1
2
3
[ 0  -1   0]
[-1   5  -1]
[ 0  -1   0]

This kernel is equivalent to $g = f + \nabla^2 f$ (since $5 = 4 + 1$).

Practical Applications

OpenCV applies custom convolution kernels with cv2.filter2D():

python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
import cv2
import numpy as np

img = cv2.imread('input.jpg')

# 4-neighborhood Laplacian sharpening kernel
laplacian_kernel = np.array([
    [ 0, -1,  0],
    [-1,  5, -1],
    [ 0, -1,  0]
], dtype=np.float32)

# 8-neighborhood Laplacian sharpening kernel
laplacian_kernel_8 = np.array([
    [-1, -1, -1],
    [-1,  9, -1],
    [-1, -1, -1]
], dtype=np.float32)

# Apply sharpening
sharpened = cv2.filter2D(img, -1, laplacian_kernel)

Limitations of Sharpening

Laplacian sharpening has several issues:

  1. Noise amplification: sharpening also amplifies noise because noise is in high-frequency regions
  2. Overshoot and undershoot: excessive sharpening creates overshoot and undershoot on both sides of edges — bright rings and dark rings appear around edges
  3. Unnatural appearance: over-sharpened images look unnatural, like over-saturation

In practice, it’s best to denoise before sharpening (e.g., Gaussian filtering), and don’t use too much sharpening strength.

Summary

Spatial domain methods directly manipulate pixels, are more intuitive, but also have limitations like high computational cost and difficulty analyzing global properties.

MethodUse CaseCore IdeaLimitations
Lucy-RichardsonImage deblurringIterative maximum likelihoodIteration count hard to choose, noise amplification
TV regularizationImage denoising/deblurringMinimize total variation (L1 gradient)Staircase effect, texture loss
Bilateral filteringEdge-preserving denoisingSpatial weight + color weightHigh computational cost
Guided filteringEdge-preserving filtering/mattingLocal linear modelParameter sensitive
Laplacian sharpeningEdge enhancementSecond derivative detects edgesNoise amplification, overshoot/undershoot

Frequency domain and spatial domain methods each have strengths and weaknesses. In actual projects, they’re often combined: first perform global filtering in the frequency domain, then use spatial domain methods for local adjustments.

The next post will introduce more advanced topics — deep learning-based image processing.