7.1 Single-view Metrology and Cameras

Review: Projection Matrix

Formula Recap

$$x = K \left[R \mid t\right] X$$

where

  • $X = \begin{bmatrix}X \\ Y \\ Z \\ 1\end{bmatrix}$ → 3-D world coordinates (homogeneous)
  • $x = \begin{bmatrix}u \\ v \\ 1\end{bmatrix}$ → image pixel coordinates (homogeneous)
  • $K$ → intrinsic parameters
  • $R, t$ → extrinsic parameters

Matrix Components

  • Intrinsic matrix $K$:

$$K = \begin{bmatrix} f & s & u_0 \\ 0 & \alpha f & v_0 \\ 0 & 0 & 1 \end{bmatrix}$$

| Parameter | Meaning | Visual Effect |
| --- | --- | --- |
| $f$ | focal length | controls field of view (zoom in/out) |
| $s$ | skew | shears the image horizontally |
| $\alpha$ | aspect ratio (y-scaling) | stretches/compresses vertically |
| $(u_0, v_0)$ | principal point | shifts the image center |

  • Extrinsic matrix $[R \mid t]$:

$$[R \mid t] = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix}$$

| Parameter | Meaning | Visual Effect |
| --- | --- | --- |
| $R$ | rotation matrix | rotates the camera view around x, y, z |
| $t$ | translation vector | moves the camera in world space |
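
As a concrete check of the recap formula, the following minimal sketch projects one world point with x = K [R | t] X. It uses only the Python standard library; the numeric values of f, s, α, (u_0, v_0), R, and t are illustrative assumptions, not values from the lecture.

```python
def mat_vec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def project(K, R, t, X_world):
    """Project a 3-D world point to pixel coordinates via x = K [R | t] X."""
    # Camera coordinates: X_c = R X_w + t
    X_cam = [rx + ti for rx, ti in zip(mat_vec(R, X_world), t)]
    # Homogeneous image point: (w*u, w*v, w) = K X_c
    wu, wv, w = mat_vec(K, X_cam)
    # Divide by w to get pixel coordinates
    return wu / w, wv / w


# Illustrative intrinsics (assumed values): zero skew, unit aspect ratio
f, s, alpha, u0, v0 = 800.0, 0.0, 1.0, 320.0, 240.0
K = [[f, s, u0],
     [0.0, alpha * f, v0],
     [0.0, 0.0, 1.0]]

# Illustrative extrinsics (assumed values): camera frame equals world frame
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]

u, v = project(K, R, t, [0.5, -0.25, 4.0])
print(u, v)  # 420.0 190.0
```

Changing f scales the offset from the principal point, while changing (u_0, v_0) shifts every projected point by the same amount, matching the "Visual Effect" column above.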

Take-home Questions

1. Suppose the camera axis points in the direction $(x = 0, y = 0, z = 1)$ in the camera's own coordinate system. What is the camera axis in world coordinates, given the extrinsic parameters $R$, $t$?

  • What is the z-direction (the direction the camera is facing) given the rotation matrix and translation vector?

$$X^h_c = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} R_{3\times3} & t_{3\times1} \\ 0 & 1 \end{bmatrix} X^h_w, \quad \text{where } X^h_w = \begin{bmatrix}x_w \\ y_w \\ z_w \\ 0\end{bmatrix}$$

  • $X_w^h$ is what we want

$$\begin{align*} &X_c = [R \mid t]\begin{bmatrix}X_w \\ 1\end{bmatrix} \\ &X_c = RX_w + t \\ &X_w = R^{-1}X_c - R^{-1}t \\ &X_w = R^TX_c - R^Tt \end{align*}$$
  • Setting $X_c = 0$ (the camera center in its own frame) gives the position of the camera in world coordinates: $-R^Tt$
  • A direction is a point at infinity (its last homogeneous coordinate is 0), so translation has no effect on it

$$\begin{align*} &X_w = R^TX_c \cancel{- R^Tt} \\ &X_w = R^TX_c \\ &X_w = \left[R^T \quad 0\right] \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \end{align*}$$

  • $X_w$ is just the $3^{rd}$ column of the transposed rotation matrix $R^T$ (equivalently, the $3^{rd}$ row of $R$)
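
The two results of this question (camera position $-R^Tt$, viewing direction $R^T[0, 0, 1]^T$) can be checked numerically. This is a minimal sketch in plain Python; the rotation (90 degrees about the y-axis) and the translation are hypothetical values chosen only for illustration.

```python
import math


def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def camera_center(R, t):
    """World position of the camera: C = -R^T t."""
    return [-c for c in mat_vec(transpose(R), t)]


def viewing_direction(R):
    """World direction of the camera z-axis: R^T [0, 0, 1]^T."""
    return mat_vec(transpose(R), [0.0, 0.0, 1.0])


# Hypothetical extrinsics: rotation by 90 degrees about the y-axis
theta = math.pi / 2
R = [[math.cos(theta), 0.0, math.sin(theta)],
     [0.0, 1.0, 0.0],
     [-math.sin(theta), 0.0, math.cos(theta)]]
t = [1.0, 2.0, 3.0]

C = camera_center(R, t)
d = viewing_direction(R)

# The camera center must map to the origin of the camera frame: R C + t = 0
residual = [rc + ti for rc, ti in zip(mat_vec(R, C), t)]
print(C, d, residual)
```

The direction `d` equals the third row of `R`, and `residual` is zero up to floating-point error, which confirms that $-R^Tt$ is the point that lands on the camera origin.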

2. Suppose a camera at height $y = h$ $(x = 0, z = 0)$ observes a point at $(u, v)$ known to be on the ground $(y = 0)$. Assume $R$ is an identity matrix. What is the 3D position of the point in terms of $f$, $u_0$, $v_0$?

  • Recall that the position of the camera is $-R^Tt$

$$\begin{align*} &-R^Tt = \begin{bmatrix}0 \\ h \\ 0\end{bmatrix} \quad \text{(Given in the question)} \\ &-t = \begin{bmatrix}0 \\ h \\ 0\end{bmatrix} \quad \text{($R$ is an identity matrix)} \\ &t = \begin{bmatrix}0 \\ -h \\ 0\end{bmatrix} \end{align*}$$
  • Now calculate the point in the camera's coordinate frame, assuming that we know the intrinsic parameters

$$\begin{align*} X_c &= [R \mid t] \begin{bmatrix}X_w \\ 1\end{bmatrix} \quad \text{(point in the camera's coordinate frame)} \\ &= X_w + \begin{bmatrix}0 \\ -h \\ 0\end{bmatrix} \\ w \begin{bmatrix}u \\ v \\ 1\end{bmatrix} &= K \left[X_w + \begin{bmatrix}0 \\ -h \\ 0\end{bmatrix}\right] \quad \text{(Image coordinate)} \end{align*}$$
  • Given that $K = \begin{bmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{bmatrix}$ and $X_w = \begin{bmatrix}x_w \\ 0 \\ z_w\end{bmatrix}$ (on the ground)

$$\begin{align*} w \begin{bmatrix}u \\ v \\ 1\end{bmatrix} &= K \left[X_w + \begin{bmatrix}0 \\ -h \\ 0\end{bmatrix}\right] \quad \text{(Image coordinate)} \\ K^{-1}w \begin{bmatrix}u \\ v \\ 1\end{bmatrix} &= X_w + \begin{bmatrix}0 \\ -h \\ 0\end{bmatrix} \\ K^{-1}w \begin{bmatrix}u \\ v \\ 1\end{bmatrix} &= \begin{bmatrix}x_w \\ -h \\ z_w\end{bmatrix} \end{align*}$$
  • Given that $K^{-1} = \begin{bmatrix} \frac{1}{f} & 0 & -\frac{u_0}{f}\\ 0 & \frac{1}{f} & -\frac{v_0}{f}\\ 0 & 0 & 1 \end{bmatrix}$

$$\begin{align*} &w \left(\frac{u}{f} - \frac{u_0}{f}\right) = x_w \\ &w \left(\frac{v}{f} - \frac{v_0}{f}\right) = -h \\ &w = z_w \end{align*}$$

  • Solving the second equation for $w$ and substituting into the other two gives the 3D position:

$$z_w = w = -\frac{hf}{v - v_0}, \qquad x_w = -\frac{h(u - u_0)}{v - v_0}, \qquad y_w = 0$$
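
The three scalar equations above can be solved directly for the ground point. The sketch below does so under the same assumptions as the question (R = I, zero skew, unit aspect ratio, camera at (0, h, 0)); the function name `ground_point` and the numeric values are hypothetical.

```python
def ground_point(u, v, f, u0, v0, h):
    """Back-project pixel (u, v) onto the ground plane y = 0.

    Assumes R = I, zero skew, unit aspect ratio, and a camera at (0, h, 0).
    """
    if v == v0:
        # A pixel on the horizon corresponds to a ray parallel to the ground.
        raise ValueError("pixel lies on the horizon; the ray never meets the ground")
    w = -h * f / (v - v0)    # from w (v - v0) / f = -h
    x_w = w * (u - u0) / f   # from w (u - u0) / f = x_w
    z_w = w                  # from w = z_w
    return x_w, 0.0, z_w


# Illustrative values (assumed): f = 800 px, principal point (320, 240), h = 1.5.
# Under these equations a ground point in front of the camera has v < v0.
print(ground_point(420.0, 90.0, 800.0, 320.0, 240.0, 1.5))  # (1.0, 0.0, 8.0)
```

Projecting (1, 0, 8) forward with the same K and t = (0, -h, 0) gives back the pixel (420, 90), which is a quick way to sanity-check the signs.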

Takeaway

  • You have the following equations to solve various kinds of problems

$$\begin{align*} &X_c = [R \mid t]\begin{bmatrix}X_w \\ 1\end{bmatrix} \\ &\text{The position of the camera is } -R^Tt \\ &w \begin{bmatrix}u \\ v \\ 1\end{bmatrix} = K \left[X_w + \begin{bmatrix}0 \\ -h \\ 0\end{bmatrix}\right] \quad \text{(Image coordinate, for $R = I$ and a camera at height $h$)} \end{align*}$$

How to calibrate the camera?

Height Measurement

  • Objects whose tops and bottoms lie along the same pair of parallel lines (lines that meet at the same vanishing point) are of the same height

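  • The camera height is the height of the horizon: scene points that project onto the horizon line are at the same height as the camera
  • The parachute is higher than the camera, while the person is lower than the camera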

Cross Ratio

$$\frac{\|P_3 - P_1\| \times \|P_4 - P_2\|}{\|P_3 - P_2\| \times \|P_4 - P_1\|}$$

  • The cross ratio of four collinear points is a projective invariant: its value is the same in the scene and in the image

$$\underbrace{\frac{\|B - T\| \times \|\infty - R\|}{\|B - R\| \times \|\infty - T\|}}_{\text{Scene cross ratio}} = \underbrace{\frac{\|b - t\| \times \|v_z - r\|}{\|b - r\| \times \|v_z - t\|}}_{\text{Image cross ratio}} = \frac{H}{R}$$

Example: The height of the man

$$\underbrace{\frac{\|t - b\| \times \|v_z - r\|}{\|r - b\| \times \|v_z - t\|}}_{\text{Image cross ratio}} = \frac{H}{R}$$
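
The image cross ratio can be turned into a small height calculator. The sketch below assumes the four image points b (bottom), r (reference height R), t (top), and v_z (vertical vanishing point) are already known and collinear. The synthetic data comes from a hypothetical 1-D projective map along one image column, not from the lecture figures.

```python
import math


def height_from_cross_ratio(b, t, r, v_z, ref_height):
    """Return H from H / R = (|t - b| |v_z - r|) / (|r - b| |v_z - t|)."""
    image_cross_ratio = (math.dist(t, b) * math.dist(v_z, r)) / (
        math.dist(r, b) * math.dist(v_z, t))
    return ref_height * image_cross_ratio


def image_y(height):
    """Hypothetical 1-D projective map from scene height to image position."""
    return height / (1.0 + 0.1 * height)


# Synthetic scene: reference object of height 1.0, unknown object of height 1.8
R_ref, H_true = 1.0, 1.8
b = (100.0, image_y(0.0))      # bottom (on the ground)
r = (100.0, image_y(R_ref))    # point at the reference height
t = (100.0, image_y(H_true))   # top of the object being measured
v_z = (100.0, 10.0)            # limit of image_y(height) as height -> infinity

H = height_from_cross_ratio(b, t, r, v_z, R_ref)
print(H)  # approximately 1.8
```

Although the projective map compresses distances non-uniformly, the cross ratio recovers the true height ratio, which is exactly why projective invariance is useful here.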

Lens, Aperture, and DOF

| Goal (What We Want) | How We Get It | Effect / Explanation | Cost / Trade-off |
| --- | --- | --- | --- |
| More spatial resolution | Increase focal length (zoom in) | Magnifies the subject; effectively increases detail on the sensor | Reduces light reaching the sensor; narrows field of view (FOV) |
| | Decrease focal length (wider lens) | Captures more area; lower magnification | Deeper depth of field (DOF); less background separation |
| Broader field of view | Decrease focal length (wide-angle lens) | Captures a wider scene | Lower magnification; may introduce distortion |
| More depth of field (DOF) | Decrease aperture (smaller opening → larger f-number) | Increases range of sharp focus from near to far | Reduces light; requires longer exposure or higher ISO |
| | Increase aperture (larger opening → smaller f-number) | Decreases DOF; isolates subject with background blur | Gains light but loses focus range (shallow DOF) |
| More temporal resolution | Shorten exposure time | Freezes fast motion; allows more frames per second | Reduces light; image may be underexposed |
| | Lengthen exposure time | Increases light collection; motion blur can appear | Reduces temporal resolution (motion blur) |

Quick Reference — Aperture Effects

| Aperture Type | Description | Depth of Field | Light Intake | Typical Use |
| --- | --- | --- | --- | --- |
| Large aperture (small f-number, e.g., f/1.8) | Wide lens opening | Shallow DOF (background blur) | More light | Portraits, low-light photography |
| Small aperture (large f-number, e.g., f/16) | Narrow lens opening | Deep DOF (sharp foreground + background) | Less light | Landscapes, daylight scenes |
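
To make the aperture/DOF trade-off concrete, the following sketch uses the standard thin-lens depth-of-field approximation. The notes give no formula, so the model itself is an assumption here, as are the circle-of-confusion diameter (0.03 mm, a common full-frame convention) and the example lens and focus distance.

```python
import math


def depth_of_field(focal_mm, f_number, focus_mm, coc_mm=0.03):
    """Near and far limits of acceptable sharpness (thin-lens approximation).

    near = s f^2 / (f^2 + N c (s - f)),  far = s f^2 / (f^2 - N c (s - f))
    where s is the focus distance, N the f-number, c the circle of confusion.
    """
    k = f_number * coc_mm * (focus_mm - focal_mm)
    near = focus_mm * focal_mm ** 2 / (focal_mm ** 2 + k)
    # Beyond the hyperfocal distance the far limit extends to infinity.
    far = focus_mm * focal_mm ** 2 / (focal_mm ** 2 - k) if focal_mm ** 2 > k else math.inf
    return near, far


# Assumed example: 50 mm lens focused at 3 m, at the two apertures from the table
for n in (1.8, 16.0):
    near, far = depth_of_field(50.0, n, 3000.0)
    print(f"f/{n}: sharp from {near:.0f} mm to {far:.0f} mm")
```

Under these assumptions, f/16 gives a much wider in-focus range than f/1.8, in line with the table. The model ignores diffraction, so it overstates the benefit of very small apertures.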

Warning

Stopping down does not keep improving sharpness indefinitely: if the aperture is too small, diffraction blurs the image.

Comprehensive List for Adjusting All the Camera Parameters

| Parameter | What It Controls | If You Increase It | If You Decrease It | Affects Field of View (FOV)? | Affects Depth of Field (DOF)? | Other Key Effects |
| --- | --- | --- | --- | --- | --- | --- |
| Aperture (larger opening = smaller f-number) | Lens opening size | Larger opening → more light (brighter image); shallower DOF (blurred background) | Smaller opening → less light (darker image); deeper DOF (more in focus) | ❌ No | ✅ Strongly | Controls exposure and background blur |
| Shutter Time | Duration sensor is exposed to light | Longer → more light; motion blur increases | Shorter → less light; motion frozen | ❌ No | ❌ No | Controls motion blur and brightness |
| Focal Length | Lens zoom / magnification | Narrows FOV (zoom in); shallower DOF | Widens FOV (zoom out); deeper DOF | ✅ Yes | ✅ Yes | Affects perspective compression |
| ISO Sensitivity | Sensor's light sensitivity | Brighter image; more noise/grain | Darker image; cleaner image | ❌ No | ❌ No | Trade-off between brightness and noise |
| Sensor Size | Physical size of the imaging sensor | Wider FOV (for same focal length); shallower DOF (for the same framing) | Narrower FOV; deeper DOF | ✅ Yes | ✅ Yes | Larger sensors perform better in low light |
| Focus Distance | Distance to subject | Focusing farther → deeper DOF | Focusing closer → shallower DOF | ❌ No | ✅ Yes | Macro shots have very thin DOF |
| Exposure (Overall) | Total light hitting the sensor | Increased by a larger aperture, longer shutter time, or higher ISO | Decreased by a smaller aperture, shorter shutter time, or lower ISO | ❌ No | Indirectly | Determines image brightness |
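
The FOV column can be quantified with the pinhole relation FOV = 2 atan(sensor width / (2 × focal length)). The sketch below assumes an ideal pinhole camera focused at infinity; the sensor widths and focal lengths are illustrative values only.

```python
import math


def field_of_view_deg(sensor_width_mm, focal_length_mm):
    """Horizontal field of view in degrees for an ideal pinhole camera."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))


# Assumed examples: a 36 mm wide sensor and a smaller 24 mm wide sensor
for sensor, focal in [(36.0, 18.0), (36.0, 50.0), (24.0, 50.0)]:
    fov = field_of_view_deg(sensor, focal)
    print(f"sensor {sensor} mm, f = {focal} mm -> FOV = {fov:.1f} deg")
```

A longer focal length narrows the FOV and a larger sensor widens it, while aperture, shutter time, and ISO do not appear in the formula at all, consistent with the ❌ entries in the table.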

Key Takeaways

| Concept | Definition | Main Controls | Typical Use |
| --- | --- | --- | --- |
| Field of View (FOV) | How much of the scene is captured in the frame | Focal length, sensor size | Wide landscape vs. zoomed portrait |
| Depth of Field (DOF) | How much of the depth range appears in focus | Aperture, focal length, focus distance, sensor size | Portraits (shallow) vs. landscapes (deep) |
| Exposure | Brightness of the image | Aperture, shutter time, ISO | Properly balanced image |
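
The three exposure controls trade off against each other in "stops" (factors of two). The sketch below assumes the usual proportionality, brightness ∝ shutter time × ISO / N², where N is the f-number. The reference point (f/1, 1 s, ISO 100) and the function name are arbitrary choices for illustration.

```python
import math


def relative_exposure_stops(f_number, shutter_s, iso):
    """Image brightness in stops relative to f/1, 1 s, ISO 100.

    Assumes brightness is proportional to shutter time * ISO / N^2.
    """
    return math.log2(shutter_s * (iso / 100.0) / f_number ** 2)


base = relative_exposure_stops(2.0, 1 / 100, 100)
# Closing the aperture two stops (f/2 -> f/4) and quadrupling the shutter time
# leaves the brightness unchanged, at the cost of more motion blur.
same = relative_exposure_stops(4.0, 1 / 25, 100)
# Doubling ISO adds one stop of brightness, at the cost of more noise.
brighter = relative_exposure_stops(2.0, 1 / 100, 200)
print(base, same, brighter)
```

Equal stop values mean equal brightness, so the same exposure can be reached with very different DOF, motion blur, and noise.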