In this article, I explain, step by step, how to perform camera calibration with MATLAB code. I wrote this code while taking a Computer Vision course, to better understand the lessons on the intrinsic and extrinsic calibration parameters of a camera. With this code, one can even calibrate a mobile camera.
Book
I followed the book ‘Computer Vision: A Modern Approach (1st Edition)’ by Forsyth & Ponce to learn about the various intrinsic and extrinsic camera parameters. You can visit my article ‘Computer Vision Resources for a Beginner‘ to find the best resources and books for learning Computer Vision.
Perspective Projection Matrix
It is a matrix that contains all the intrinsic and extrinsic parameters, and (up to scale) it has the following form:

$$\mathcal{M} = \begin{pmatrix} \alpha\,\mathbf{r}_1^T - \alpha\cot\theta\,\mathbf{r}_2^T + u_0\,\mathbf{r}_3^T & \alpha\,t_x - \alpha\cot\theta\,t_y + u_0\,t_z \\ \dfrac{\beta}{\sin\theta}\,\mathbf{r}_2^T + v_0\,\mathbf{r}_3^T & \dfrac{\beta}{\sin\theta}\,t_y + v_0\,t_z \\ \mathbf{r}_3^T & t_z \end{pmatrix}$$

where $(\alpha,\beta)$ are the magnifications in pixel units, $\theta$ is the skew angle between the image axes, $(u_0, v_0)$ is the position (in pixel units) of the principal point, $(\mathbf{r}_1^T, \mathbf{r}_2^T, \mathbf{r}_3^T)$ are the three rows of the rotation matrix, and $(t_x, t_y, t_z)$ are the coordinates of the translation vector. If $\mathbf{P}$ is the homogeneous coordinate vector of a point in the world coordinate system, then the corresponding homogeneous image point is
$\mathbf{p} = \mathcal{M} \mathbf{P}.$
We have to convert the homogeneous coordinates to non-homogeneous form to obtain the image point in pixels. If $\mathbf{m}_1^T$, $\mathbf{m}_2^T$ and $\mathbf{m}_3^T$ are the rows of the matrix $\mathcal{M}$, then the image point is
\begin{align*}
u &= \frac{\mathbf{m}_1^T \cdot \mathbf{P}}{\mathbf{m}_3^T \cdot \mathbf{P}}, \\
v &= \frac{\mathbf{m}_2^T \cdot \mathbf{P}}{\mathbf{m}_3^T \cdot \mathbf{P}}.
\end{align*}
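As a quick sketch of this projection step, the pixel coordinates can be computed in MATLAB as follows (the matrix M and the world point P here are placeholders for illustration, not values from the article):

```matlab
% hypothetical 3x4 projection matrix and homogeneous world point
M = [1 0 0 0; 0 1 0 0; 0 0 1 0]; % placeholder projection matrix
P = [10; 20; 5; 1];              % homogeneous world point (X, Y, Z, 1)
p = M*P;                         % homogeneous image point
u = p(1)/p(3);                   % pixel coordinates after perspective divide
v = p(2)/p(3);
```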
Step – 1: Points for Correspondence
Since there are 5 intrinsic parameters and 6 extrinsic parameters (11 unknowns in total), and each correspondence yields 2 equations, we require a minimum of 6 point correspondences to evaluate the perspective projection matrix. We can’t choose all of these points to lie on a single line or plane, as such configurations do not determine the projection matrix uniquely. In the absence of a calibration rig, I used a corner of a room and took a photo of it using my mobile camera. Note that the encircled points in the photo lie on three different planes, with no more than 3 points on any single plane.

I then established point correspondences between world coordinates and image coordinates. The unit of the world coordinates does not matter as long as it is used consistently.
N = 6; % number of points for correspondence
% each column is one coordinate
WldPts = zeros(3,N);
Img1Pts = zeros(2,N);
% points as per my camera calibration
% matrix for world coordinates. Each point will be a column vector
WldPts(1,:) = [58.6, 58.6, 0, 0, 0, 58.6]; % all x coordinates are mentioned
WldPts(2,:) = [0, 0, 0, 58.6, 58.6, 58.6]; % all y coordinates are mentioned
WldPts(3,:) = [0, 10, 10, 10, 0, 0];       % all z coordinates are mentioned
% corresponding points in image 1
Img1Pts(1,:) = [212, 126, 1618, 2979, 2899, 1265];   % in pixels: all x coordinates are mentioned
Img1Pts(2,:) = [2177, 1913, 1372, 2071, 2344, 3518]; % in pixels: all y coordinates are mentioned
Step – 2: Evaluating Perspective Projection Matrix
The second step is to convert the projection equations into a system of homogeneous linear equations in the entries of the projection matrix. If $n$ points are chosen for correspondence, we get the following system (refer to Eq. 5.3.1 in the book):

$$\mathcal{P}\,\mathbf{m} = \mathbf{0}, \qquad \mathcal{P} = \begin{pmatrix} \mathbf{P}_1^T & \mathbf{0}^T & -u_1\,\mathbf{P}_1^T \\ \mathbf{0}^T & \mathbf{P}_1^T & -v_1\,\mathbf{P}_1^T \\ \vdots & \vdots & \vdots \\ \mathbf{P}_n^T & \mathbf{0}^T & -u_n\,\mathbf{P}_n^T \\ \mathbf{0}^T & \mathbf{P}_n^T & -v_n\,\mathbf{P}_n^T \end{pmatrix},$$

where each $\mathbf{P}_i$ is the homogeneous 4-vector of world coordinates of point $i$, and $\mathbf{m} = (\mathbf{m}_1^T, \mathbf{m}_2^T, \mathbf{m}_3^T)^T$ stacks the 12 entries of $\mathcal{M}$.
Next we perform a Singular Value Decomposition (SVD) of the matrix $\mathcal{P}^T\mathcal{P}$ and take the eigenvector corresponding to the smallest eigenvalue as the solution for $\mathbf{m}$ (equivalently, the right singular vector of $\mathcal{P}$ with the smallest singular value).
% build the matrix P of the homogeneous linear system
P = zeros(2*N,12);
for i = 1:N
    P(2*i-1,:) = [WldPts(:,i)', 1, 0, 0, 0, 0, -Img1Pts(1,i)*WldPts(:,i)', -Img1Pts(1,i)];
    P(2*i,:)   = [0, 0, 0, 0, WldPts(:,i)', 1, -Img1Pts(2,i)*WldPts(:,i)', -Img1Pts(2,i)];
end
[U,S,V] = svd(P'*P);
indx = 12;
eigval_min = S(indx,indx); % smallest eigenvalue of P'*P
m_vect = V(:,indx);        % corresponding eigenvector
% projection matrix M
M = zeros(3,4);
M(1,1:4) = m_vect(1:4);
M(2,1:4) = m_vect(5:8);
M(3,1:4) = m_vect(9:12);
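A quick sanity check on the estimated matrix is to reproject the world points and compare against the measured pixel coordinates. A minimal sketch, reusing N, WldPts, Img1Pts and M from above (the element-wise division relies on MATLAB implicit expansion, R2016b or later):

```matlab
% reproject the world points with the estimated M and measure pixel error
PtsH = [WldPts; ones(1,N)];             % homogeneous world points (4xN)
proj = M*PtsH;                          % homogeneous image points (3xN)
uvs  = proj(1:2,:) ./ proj(3,:);        % perspective divide by third row
err  = sqrt(sum((uvs - Img1Pts).^2,1)); % reprojection error per point (pixels)
disp(err);
```

With only 6 points the system is barely overdetermined, so these errors should be very small; larger errors usually indicate a mislabeled correspondence.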
Step – 3: Recovering Parameters
Next step is to recover intrinsic and extrinsic parameters from the perspective projection matrix. For that I used Eqs. (5.3.5) – (5.3.8) from the book.
a1 = M(1,1:3)';
a2 = M(2,1:3)';
a3 = M(3,1:3)';
% epsilon = +-1. With -1, a different set of values of rotation matrix
% and translation vector will be obtained.
epsilon = 1;
% value of epsilon = +-1 comes from the fact that if a^2 = 1 then
% a = +-1. A row of rotation matrix has unit magnitude.
prow = epsilon/ norm(a3);
r3 = prow*a3;
u0 = prow^2*(a1'*a3);
v0 = prow^2*(a2'*a3);
tmp1 = cross(a1,a3);
tmp2 = cross(a2,a3);
tmp3 = (tmp1'*tmp2)/(norm(tmp1)*norm(tmp2));
% as theta is between 0 to 90, cos(theta) is always positive
tmp3 = abs(tmp3);
theta = acosd(tmp3);
% sign of alpha & beta depends on sign of focal length of lens. Assuming
% positive focal length
alpha = prow^2*norm(tmp1)*sind(theta);
beta = prow^2*norm(tmp2)*sind(theta);
r1 = tmp2/norm(tmp2);
r2 = cross(r3,r1);
% intrinsic parameters matrix
K = [alpha, -alpha*cotd(theta),u0; ...
0, beta/sind(theta), v0; ...
0, 0, 1];
% rotation matrix
R = [r1';r2';r3'];
% translation vector
tz = M(3,4)*prow;
ty = (M(2,4)*prow - v0*tz)*sind(theta)/beta;
tx = (M(1,4)*prow-u0*tz+alpha*cotd(theta)*ty)/alpha;
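As a final consistency check on the recovery, R should be close to a proper rotation, and the decomposition should reassemble the projection matrix. A sketch using the quantities just computed (the book's relation is $\rho\,\mathcal{M} = \mathcal{K}\,(\mathcal{R}\ \mathbf{t})$, with $\rho$ = prow here):

```matlab
t = [tx; ty; tz];
% R should be approximately orthonormal with determinant +1
orth_err = norm(R*R' - eye(3)); % expect close to 0
det_R    = det(R);              % expect close to +1
% rebuilding the (scaled) projection matrix from K, R, t
M_rebuilt   = K*[R, t];
rebuild_err = norm(M_rebuilt - prow*M); % expect close to 0
```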
Verifying the Results
I obtained the following result for the image shown above:-
In order to cross-check the results, one has to observe the following:-
- The skew angle $\theta$ should be close to $90^{\circ}$ (but slightly less than it).
- The aspect ratio $\alpha/\beta$ should be close to one, though this also depends on the camera.
- The position of the principal point $(u_0, v_0)$ should be close to half the image dimensions.
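These checks can be read off directly from the recovered parameters. A short sketch (the image size imgW × imgH below is a placeholder, not a value from the article — substitute your photo's actual pixel dimensions):

```matlab
imgW = 3024; imgH = 4032;    % hypothetical image size in pixels
skew_angle   = theta;        % expect close to 90 degrees
aspect_ratio = alpha/beta;   % expect close to 1
pp_offset = [u0 - imgW/2, v0 - imgH/2]; % expect small relative to image size
```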
That is all I had to write about camera calibration. I hope this article has been of use to you. If so, you can donate to support this website!