Image Filtering

Goal

Signal

clipboard.png

Image

clipboard.png

δ\delta Function (1-D)

clipboard.png

δ[x]={1,(x=0)0,(x0)\delta[x] = \begin{cases} 1, \quad (x =0) \\ 0, \quad (x \ne 0) \end{cases}

Shifting to other places

clipboard.png

f[x]={f[1] f[2]f[12]}={0 1 2 4 8 7 6 5 4 3 2 1}=k=f[k]δ[xk]\begin{align*} f[x] &= \{f[1]\ f[2] \cdots f[12]\} \\ &= \{0\ 1\ 2\ 4\ 8\ 7\ 6\ 5\ 4\ 3\ 2\ 1\} \\ &= \sum_{k = -\infty}^{\infty} f[k]\delta[x - k] \end{align*}

Discrete-time System

f[x]T()g[x]=T{f[x]}f[x] \rightarrow T(\cdot) \rightarrow g[x] = T\{f[x]\}

clipboard.png

Running Average of the signal

g[x]=12N+1k=NNf[x+k]g[x] = \frac{1}{2N + 1}\sum_{k = - N}^{N} f[x + k]
  • e.g. N=6N = 6

clipboard.png

δ\delta Function (2-D)

δ[x,y]={1,(x=m2,y=n2)0,(otherwise)\delta[x, y] = \begin{cases} 1, \quad (x = \frac{m}{2}, y = \frac{n}{2}) \\ 0, \quad (otherwise) \end{cases}

Image Gradient

fx2+fy2\sqrt{f^2_x + f^2_y}
from scipy import ndimage
im = 1.0 * plt.imread("al.jpg") # convert to double
filt = np.array([
 [-0.1250, -0.2500, -0.1250],
 [0, 0, 0],
 [0.1250, 0.2500, 0.1250],
])

im_y = ndimage.convolve(im, filt, mode="reflect")
im_x = ndimage.convolve(im, filt, mode="reflect")
im_g = np.sqrt(im_x**2 + im_y**2)

Space and Frequency (Canonical Basis)

f[x]=(1 2 3 4  7)=1(1 0 0 0  0)+2(0 1 0 0  0)+3(0 0 1 0  0)++7(0 0 0 0  1)\begin{align*} f[x] &= (1\ 2\ 3\ 4\ \cdots\ 7) \\ &= 1(1\ 0\ 0\ 0\ \cdots\ 0) \\ &+ 2(0\ 1\ 0\ 0\ \cdots\ 0) \\ &+ 3(0\ 0\ 1\ 0\ \cdots\ 0) \\ &+ \cdots \\ &+ 7(0\ 0\ 0\ 0\ \cdots\ 1) \\ \end{align*} f[x]=k=0m1akbk(x)\begin{align*} f[x] &= \sum_{k = 0}^{m-1} a_kb_k(x) \end{align*}

Matrix Form

(f[0]f[m1])=(b0bm1)orthonormal (B1=BT)(a0am1)\begin{pmatrix} f[0] \\ \vdots\\ f[m - 1] \end{pmatrix} = \underbrace{ \begin{pmatrix} \begin{align*} &\vert \\ &\vec{b}_0 \\ &\vert \\ \end{align*} \quad\cdots \quad \begin{align*} &\vert \\ &\vec{b}_{m- 1} \\ &\vert \\ \end{align*} \end{pmatrix} }_{\text{orthonormal }(B^{-1} = B^T)} \begin{pmatrix} a_0 \\ \vdots \\ a_{m-1} \end{pmatrix}

we can then derive aka_k

B is orthonormal(B1=BT)B1f=AA=BTfak=l=0m1f(l)bk(l)\begin{align*} \because & \\ &B \text{ is orthonormal} (B^{-1} = B^T) \\ \therefore & \\ & B^{-1} \cdot f = A \\ & A = B^T \cdot f \\ & a_k = \sum^{m - 1}_{l = 0} f(l)b_k(l) \end{align*}

Space and Frequency (Fourier, 1-D)

Important

Assumption: Singals are periodic

clipboard.png

f[x]f[x] represented with coscos

f[x]=1mk=0m1ckamplitudecos(2πkmfrequencyx+ϕkphase (nonfixed))\begin{align*} f[x] &= \frac{1}{m} \sum^{m - 1}_{k = 0} \underbrace{c_k}_{amplitude} cos(\underbrace{\frac{2\pi k}{m}}_{frequency}x + \underbrace{\phi_k}_{phase\ (non-fixed)}) \\ \end{align*}

f[x]f[x] represented with coscos and sinsin

f[x]=1mk=0m1ckcos(ωkx+ϕk)=1mk=0m1ckcos(ϕk)scale factor akcos(ωkx)cksin(ϕk)scale factor bksin(ωkx)=1mk=0m1akcos(ωkx)fixed basis+bksin(ωkx)fixed basis\begin{align*} f[x] &= \frac{1}{m} \sum^{m - 1}_{k = 0} c_k cos(\omega_kx + \phi_k) \\ &= \frac{1}{m} \sum^{m - 1}_{k = 0} \underbrace{c_k cos(\phi_k)}_{\text{scale factor }\rightarrow a_k} cos(\omega_kx) - \underbrace{c_ksin(\phi_k)}_{\text{scale factor }\rightarrow b_k}sin(\omega_kx)\\ &= \frac{1}{m} \sum^{m - 1}_{k = 0} a_k \underbrace{cos(\omega_kx)}_{\text{fixed basis}} + b_k \underbrace{sin(\omega_kx)}_{\text{fixed basis}} \\ \end{align*}

aka_k and bkb_k

  • Given the canonical and matrix forms of a space and frequency
  • and the fact that coscos and sinsin matrices are orthonormal
ak=l=0m1f[l]cos(ωkl)bk=l=0m1f[l]sin(ωkl)\begin{align} a_k = \sum^{m - 1}_{l = 0} f[l]cos(\omega_kl) \\ b_k = \sum^{m - 1}_{l = 0} f[l]sin(\omega_kl) \end{align}

Note

About Frequency 2πkm\frac{2\pi k}{m}

  • 2π2\pi: a complete cycle of a sinusoid
  • kk: frequency bin (harmonic number)
  • mm: the length of a discrete signal
  • km\frac{k}{m}: Normalize the frequency

Space and Frequency (Complex Exponential)

\bullet Fourier Series

1mk=0m1akcos(ωkx)+bksin(ωkx)\frac{1}{m} \sum^{m - 1}_{k = 0} a_k \cos(\omega_kx) + b_k sin(\omega_kx)

\bullet Fourier Transform

ak=l=0m1f[l]cos(ωkl)bk=l=0m1f[l]sin(ωkl)\begin{align*} a_k = \sum^{m - 1}_{l = 0} f[l]cos(\omega_kl) \\ b_k = \sum^{m - 1}_{l = 0} f[l]sin(\omega_kl) \end{align*}

Important

Having two bases aka_k and bkb_k is too cumbersome to compute. We need to combine them as one.

\bullet Complex Exponential

Euler’s formula

eiωx=cos(ωx)+isin(ωx)e^{i\omega x} = cos(\omega x) + isin(\omega x)

Note

Full proof can be found in Euler’s Formula

\bullet Rewrite of the Fourier Series

f[x]=1mk=0m1ckeiωkxf[x] = \frac{1}{m} \sum^{m - 1}_{k = 0} c_k e^{i\omega_k x}

\bullet Rewrite of the Fourier Transform

ck=k=0m1f(l)eiωklck=akibk\begin{align*} &c_k = \sum^{m - 1}_{k = 0} f(l) e^{-i\omega _kl} \\ &c_k = a_k - ib_k \end{align*}

magnitude

ck=ak2+bk2|c_k| = \sqrt{a_k^2 + b_k^2}

Phase

ck=tan1(bkak)\angle c_k = tan^{-1}(\frac{b_k}{a_k})

Space and Frequency (Fourier, 2-D)

Important

Assumption: Signals are periodic in both xx and yy directions.

f[x,y]f[x,y] represented with cos\cos and sin\sin

clipboard.png

f[x,y]=1m2k=0m1l=0m1(aklcos(ωkx+ωly)+bklsin(ωkx+ωly))f[x,y] = \frac{1}{m^2} \sum_{k=0}^{m-1} \sum_{l=0}^{m-1} \Big( a_{kl}\cos(\omega_k x + \omega_l y) + b_{kl}\sin(\omega_k x + \omega_l y) \Big)

where

  • ωk=2πkm\omega_k = \tfrac{2\pi k}{m}: horizontal frequency bin
  • ωl=2πlm\omega_l = \tfrac{2\pi l}{m}: vertical frequency bin

Coefficients akl,bkla_{kl}, b_{kl}

akl=x=0m1y=0m1f[x,y]cos(ωkx+ωly)a_{kl} = \sum_{x=0}^{m-1}\sum_{y=0}^{m-1} f[x,y]\cos(\omega_k x + \omega_l y) bkl=x=0m1y=0m1f[x,y]sin(ωkx+ωly)b_{kl} = \sum_{x=0}^{m-1}\sum_{y=0}^{m-1} f[x,y]\sin(\omega_k x + \omega_l y)

Complex Exponential Form (2-D)

Euler’s Formula

ei(ωkx+ωly)=cos(ωkx+ωly)+isin(ωkx+ωly)e^{i(\omega_k x + \omega_l y)} = \cos(\omega_k x + \omega_l y) + i\sin(\omega_k x + \omega_l y)

2-D Fourier Series

f[x,y]=1m2k=0m1l=0m1cklei(ωkx+ωly)f[x,y] = \frac{1}{m^2} \sum_{k=0}^{m-1} \sum_{l=0}^{m-1} c_{kl} \, e^{\,i(\omega_k x + \omega_l y)}

with

ckl=aklibklc_{kl} = a_{kl} - i b_{kl}

2-D Discrete Fourier Transform (DFT)

Forward Transform

F[u,v]=x=0m1y=0m1f[x,y]ei2π(uxm+vym)F[u,v] = \sum_{x=0}^{m-1} \sum_{y=0}^{m-1} f[x,y] \, e^{-i2\pi\left(\frac{ux}{m} + \frac{vy}{m}\right)}

Inverse Transform

f[x,y]=1m2u=0m1v=0m1F[u,v]e+i2π(uxm+vym)f[x,y] = \frac{1}{m^2} \sum_{u=0}^{m-1} \sum_{v=0}^{m-1} F[u,v] \, e^{+i2\pi\left(\frac{ux}{m} + \frac{vy}{m}\right)}

Magnitude and Phase (2-D)

Magnitude Spectrum

F[u,v]=(F[u,v])2+(F[u,v])2|F[u,v]| = \sqrt{\Re(F[u,v])^2 + \Im(F[u,v])^2}

Phase Spectrum

F[u,v]=tan1 ⁣((F[u,v])(F[u,v]))\angle F[u,v] = \tan^{-1}\!\left(\frac{\Im(F[u,v])}{\Re(F[u,v])}\right)

Magnitude and Its Log-form of the Fourier Transform

import numpy as np
import matplotlib.pyplot as plt
from numpy import fft

f = plt.imread("al.jpg")

F = fft.fftshift(
 fft.fft2(f)
)
Fmag = np.abs(F)

# Plotting
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 10))

ax1.set_title("Fourier Magnitude")
ax1.imshow(Fmag, cmap="gray")

ax2.set_title("Log Fourier Magnitude")
ax2.imshow(np.log(Fmag), cmap="gray")
  • fft.fftshift: shift the origin of the coordinate system from the top-left corner to the center
  • fft.fft2(f): calculate the fourier transform of the image

clipboard.png

  • In the center of the image, there is a DC TermDC\ Term representing the average pixel intensity of the entire image.
  • The DCTermDC Term is so strong that the other values are comparatively small and nearly invisible
  • The np.log(Fmag) resolves this problem

Creating a window smoothly tapers to zero at all four edges

f = plt.imread(al.jpg)
[ydim, xdim] = f.shape
win = np.outer(
 np.hanning(ydim),
 np.hanning(xdim),
)
win = win / np.mean(win) # make unit-mean
F = fft.fftshift(fft.fft2(f * win))

clipboard.png

Continuous to Discrete Sampling

  1. f(x)f(x)

clipboard.png

  1. s(x)=k=δ(xkT)s(x) = \sum^{\infty}_{k = -\infty} \delta(x - kT)

clipboard.png

  • Sampling
  • TT: How fine we want to sample

Tip

δ(x)={1(x=0)0(x0)\delta(x) = \begin{cases} 1 \quad (x = 0) \\ 0 \quad (x \ne 0) \end{cases}
  1. fs(x)=f(x)s(x)f_s(x) = f(x)s(x)

clipboard.png

  • grabbing the values that we want but having 0 everywhere else
  1. f[x]=fs(xT)f[x] = f_s(xT)

Screenshot 2025-09-15 at 14.42.10

  • Take each integer multiples of TT

Convolution and Fourier Transform

Formulas

fs(x)=f(x)s(x)Fs(x)=F(ω)S(ω)\begin{align*} &f_s(x) = f(x)s(x) \\ &\Rightarrow F_s(x) = F(\omega) * S(\omega) \end{align*}

Important

The convolution in the frequency domain is the same as the multiplication in the spatial domain, and vice versa.

What do we lose when we samples

\bullet Aliasing

  • Aliasing causes artifacts and the loss of information in the image

clipboard.png

\bullet Nyquist Limit

ωs>2ωn\omega_s > 2\omega_n
  • A solution to determining aliasing
  • Perfect representation of a signal without information loss

Prevent Replicates

clipboard.png

  • Sampling induces replicate signals
  • Apply ideal sinc to resolve this problem
h(x)=sin(πxT)πxTh(x) = \frac{sin(\frac{\pi x}{T})}{\frac{\pi x}{T}}

clipboard.png


Discrete to Discrete Sampling

  • Discard high frequency part of the image and down-sample it
  • Given Gaussian Function h(x)=ex2/σ2h(x) = e^{-x^2/\sigma^2}

Spatial Domain

g[x]=(h(x)f[x])s[x]g[x] = (h(x) * f[x])s[x]

Frequency Domain

G[ω]=(H(ω)F[ω])S[ω]G[\omega] = (H(\omega)F[\omega]) * S[\omega]

clipboard.png

  • The signal becomes narrower, which means the higher frequencies are filtered successfully
  • Aliasing would not happen

Gaussian Pyramid

  • Repeated process of blurring + down-sampling
  • blurring is typically done with Gaussian Filter

Screenshot 2025-09-15 at 16.24.50

In Fourier Domain

Screenshot 2025-09-15 at 16.25.43 Screenshot 2025-09-15 at 16.26.44

Code

im = plt.imread("mandrill.png") # load image
h = [1/16, 4/16, 6/16, 4/16, 1/16] # unit-sum blur filter (a Gaussian filter)
N = 3 # pyramid level

P = [im]
for k in range(1, N):
 im2 = np.zeros(im.shape)
 for z in range(3):
 im2[:,:, z] = sepfir2d(im[:,:, z], h, h) # blur each color channel
 im2 = im2[0:-1:2, 0:-1:2,:] # down-sample by 2 x 2
 im = im2
 P.append(im2)

Laplacian Pyramid

  • Starting from a Gaussian Pyramid
  • Repeated of up-sampling and subtraction

clipboard.png

In Fourier Domain

clipboard.png

Code

L = []
for k in range(0, N - 1):
 l1 = G[k]
 l2 = G[k + 1]
 l2 = cv2.resize(l2, (0, 0), fx=2, fy=2) # up-sample
 D = l1 - l2
 D = D - np.min(D) # scale in [0, 1]
 D = D / np.max(D) # for display purpose
 L.append(D)
L.append(G[N - 1])