In this article, I will explain the step-by-step process of performing camera calibration with MATLAB code. I wrote this code while taking a Computer Vision course, to better understand the academic lessons on the intrinsic and extrinsic calibration parameters of a camera. With the help of this code, one can even calibrate a mobile camera.
Book
I followed the book ‘Computer Vision: A Modern Approach (1st Edition)’ by Forsyth & Ponce to learn about the various intrinsic/extrinsic camera parameters. One can visit my article ‘Computer Vision Resources for a Beginner‘ to learn about the best resources/books for Computer Vision.
Perspective Projection Matrix
It is a matrix that contains all the intrinsic and extrinsic parameters, and it has the following form:
\begin{align*}
\mathcal{M} = \begin{pmatrix}
\alpha \mathbf{r}_1^T - \alpha \cot\theta \, \mathbf{r}_2^T + u_0 \mathbf{r}_3^T & \alpha t_x - \alpha \cot\theta \, t_y + u_0 t_z \\
\frac{\beta}{\sin\theta} \mathbf{r}_2^T + v_0 \mathbf{r}_3^T & \frac{\beta}{\sin\theta} t_y + v_0 t_z \\
\mathbf{r}_3^T & t_z
\end{pmatrix},
\end{align*}
where $(\alpha,\beta)$ are the magnifications in pixel units, $\theta$ is the skew angle between the image axes, $(u_0, v_0)$ is the position (in pixel units) of the principal point, $(\mathbf{r}_1^T, \mathbf{r}_2^T, \mathbf{r}_3^T)$ are the three rows of the rotation matrix, and $(t_x, t_y, t_z)$ are the coordinates of the translation vector. If $\mathbf{P}$ is the homogeneous coordinate vector of a point in the world coordinate system, then the corresponding homogeneous image point is
$\mathbf{p} = \mathcal{M} \mathbf{P}.$
We have to convert the homogeneous coordinates to non-homogeneous ones to obtain the image point in pixels. If $\mathbf{m}_1^T$, $\mathbf{m}_2^T$ and $\mathbf{m}_3^T$ are the rows of the matrix $\mathcal{M}$, then the image point is
\begin{align*}
u &= \frac{\mathbf{m}_1^T \cdot \mathbf{P}}{\mathbf{m}_3^T \cdot \mathbf{P}}, \\
v &= \frac{\mathbf{m}_2^T \cdot \mathbf{P}}{\mathbf{m}_3^T \cdot \mathbf{P}}.
\end{align*}
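To make the dehomogenization concrete, here is a minimal MATLAB sketch. The matrix `M` below is a made-up projection matrix used purely for illustration, not one estimated from data:

```matlab
% Hypothetical 3x4 projection matrix: identity intrinsics, no rotation,
% camera translated by t_z = 5 -- purely for illustration.
M = [1 0 0 0; 0 1 0 0; 0 0 1 5];
P = [2; 3; 5; 1];        % homogeneous world point (X, Y, Z, 1)
p = M*P;                 % homogeneous image point (3x1)
u = p(1)/p(3);           % m1'*P / m3'*P
v = p(2)/p(3);           % m2'*P / m3'*P
fprintf('image point: (%.3f, %.3f)\n', u, v);   % prints (0.200, 0.300)
```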
Step – 1: Points for Correspondence
The projection matrix has 11 independent parameters (5 intrinsic and 6 extrinsic), and each point correspondence gives 2 equations, so we require a minimum of 6 points for correspondence to evaluate the perspective projection matrix. We cannot choose all of these points to lie on a single line or plane, as such a degenerate configuration gives no unique solution for the projection matrix. In the absence of a calibration pad, I used a corner of a room and took a photo of it using my mobile camera. One can note that the encircled points in the photo lie on three different planes, with not more than 3 points on a single plane.
I then established the point correspondence between world coordinates and image coordinates. The unit for the world coordinates does not matter as long as it is used uniformly.
```matlab
N = 6; % number of points for correspondence
% each column is one coordinate
WldPts = zeros(3,N);
Img1Pts = zeros(2,N);
% points as per my camera calibration
% matrix for world coordinates: each point will be a column vector
WldPts(1,:) = [58.6, 58.6, 0, 0, 0, 58.6]; % all x coordinates are mentioned
WldPts(2,:) = [0, 0, 0, 58.6, 58.6, 58.6]; % all y coordinates are mentioned
WldPts(3,:) = [0, 10, 10, 10, 0, 0];       % all z coordinates are mentioned
% corresponding points in image 1
Img1Pts(1,:) = [212, 126, 1618, 2979, 2899, 1265];   % in pixels: all x coordinates are mentioned
Img1Pts(2,:) = [2177, 1913, 1372, 2071, 2344, 3518]; % in pixels: all y coordinates are mentioned
```
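Picking pixel coordinates by hand is error-prone, so it helps to overlay the chosen points on the photo. A minimal sketch, assuming the photo is saved as 'corner.jpg' (a hypothetical filename) and the variables above are defined:

```matlab
% Overlay the picked correspondence points on the photo to verify them.
% 'corner.jpg' is a hypothetical filename for the photo used above.
img = imread('corner.jpg');
imshow(img); hold on;
plot(Img1Pts(1,:), Img1Pts(2,:), 'ro', 'MarkerSize', 10, 'LineWidth', 2);
text(Img1Pts(1,:) + 30, Img1Pts(2,:), string(1:N), 'Color', 'yellow', 'FontSize', 14);
hold off;
```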
Step – 2: Evaluating Perspective Projection Matrix
The second step is to convert the projection equation into a system of homogeneous linear equations and solve it for the projection matrix. If $n$ points are chosen for correspondence, we get the following system of equations (refer to eq. 5.3.1 in the book):
\begin{align*}
\mathcal{P}\mathbf{m} = \mathbf{0}, \quad \text{where} \quad
\mathcal{P} = \begin{pmatrix}
\mathbf{P}_1^T & 1 & \mathbf{0}^T & 0 & -u_1 \mathbf{P}_1^T & -u_1 \\
\mathbf{0}^T & 0 & \mathbf{P}_1^T & 1 & -v_1 \mathbf{P}_1^T & -v_1 \\
\vdots & & & & & \vdots \\
\mathbf{P}_n^T & 1 & \mathbf{0}^T & 0 & -u_n \mathbf{P}_n^T & -u_n \\
\mathbf{0}^T & 0 & \mathbf{P}_n^T & 1 & -v_n \mathbf{P}_n^T & -v_n
\end{pmatrix}
\quad \text{and} \quad
\mathbf{m} = \begin{pmatrix}\mathbf{m}_1 \\ \mathbf{m}_2 \\ \mathbf{m}_3\end{pmatrix}.
\end{align*}
Next, we perform Singular Value Decomposition (SVD) of the matrix $\mathcal{P}^T\mathcal{P}$ and take the eigenvector corresponding to the smallest eigenvalue as the solution for $\mathbf{m}$: under the constraint $\|\mathbf{m}\| = 1$ (which excludes the trivial solution), this is the vector that minimizes $\|\mathcal{P}\mathbf{m}\|^2$.
```matlab
% build the matrix P of the homogeneous linear system P*m = 0
P = zeros(2*N,12);
for i = 1:N
    P(2*i-1,:) = [WldPts(:,i)', 1, 0, 0, 0, 0, -Img1Pts(1,i)*WldPts(:,i)', -Img1Pts(1,i)];
    P(2*i,:)   = [0, 0, 0, 0, WldPts(:,i)', 1, -Img1Pts(2,i)*WldPts(:,i)', -Img1Pts(2,i)];
end
% the eigenvector of P'*P with the smallest eigenvalue solves for m
[U,S,V] = svd(P'*P);
indx = 12;
eigval_min = S(indx,indx); % smallest eigenvalue
m_vect = V(:,indx);        % corresponding eigenvector
% reshape m into the 3x4 projection matrix M
M = zeros(3,4);
M(1,1:4) = m_vect(1:4);
M(2,1:4) = m_vect(5:8);
M(3,1:4) = m_vect(9:12);
```
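As a quick sanity check (my addition, not part of the original listing), the solution should nearly satisfy the homogeneous system; the residual would be exactly zero for perfect, noise-free correspondences:

```matlab
% Sanity check: m_vect should (nearly) satisfy P*m = 0.
residual = norm(P*m_vect);
fprintf('||P*m|| = %g, smallest eigenvalue of P''*P = %g\n', residual, eigval_min);
```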
Step – 3: Recovering Parameters
The next step is to recover the intrinsic and extrinsic parameters from the perspective projection matrix. For that I used Eqs. (5.3.5)–(5.3.8) from the book.
```matlab
a1 = M(1,1:3)';
a2 = M(2,1:3)';
a3 = M(3,1:3)';
% epsilon = +-1. With -1, a different set of values of the rotation matrix
% and translation vector will be obtained.
epsilon = 1;
% The value epsilon = +-1 comes from the fact that if a^2 = 1 then a = +-1;
% a row of the rotation matrix has unit magnitude.
prow = epsilon/norm(a3);
r3 = prow*a3;
u0 = prow^2*(a1'*a3);
v0 = prow^2*(a2'*a3);
tmp1 = cross(a1,a3);
tmp2 = cross(a2,a3);
tmp3 = (tmp1'*tmp2)/(norm(tmp1)*norm(tmp2));
% as theta is between 0 and 90 degrees, cos(theta) is always positive
tmp3 = abs(tmp3);
theta = acosd(tmp3);
% the sign of alpha & beta depends on the sign of the focal length of the
% lens; assuming a positive focal length
alpha = prow^2*norm(tmp1)*sind(theta);
beta  = prow^2*norm(tmp2)*sind(theta);
r1 = tmp2/norm(tmp2);
r2 = cross(r3,r1);
% intrinsic parameter matrix
K = [alpha, -alpha*cotd(theta), u0; ...
     0,     beta/sind(theta),   v0; ...
     0,     0,                  1];
% rotation matrix
R = [r1'; r2'; r3'];
% translation vector
tz = M(3,4)*prow;
ty = (M(2,4)*prow - v0*tz)*sind(theta)/beta;
tx = (M(1,4)*prow - u0*tz + alpha*cotd(theta)*ty)/alpha;
```
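Before trusting these numbers, it is worth checking that the recovered $\mathcal{R}$ is approximately a proper rotation matrix; a short check I added:

```matlab
% Check that R is a proper rotation: R*R' ~ identity and det(R) ~ +1.
orth_err = norm(R*R' - eye(3), 'fro');
fprintf('||R*R'' - I||_F = %g, det(R) = %g\n', orth_err, det(R));
```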
Verifying the Results
I obtained the following result for the image shown above:
In order to cross-check the results, one has to observe the following (a reprojection-error check is also sketched after this list):
- The skew angle $\theta$ should be close to $90^{\circ}$ but less than it.
- The aspect ratio should be close to one, though this also depends on the camera.
- The position of the principal point $(u_0, v_0)$ should be close to half of the image size, i.e., near the image centre.
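Beyond these eyeball checks, one can reproject the world points through the estimated matrix $\mathcal{M}$ and compare them with the picked pixel coordinates. A minimal sketch (my addition), reusing the variables from the listings above; the error should be at most a handful of pixels:

```matlab
% Reproject the world points with the estimated M and measure pixel error.
reproj = M*[WldPts; ones(1,N)];         % homogeneous image points (3xN)
uv = reproj(1:2,:)./reproj(3,:);        % dehomogenize (implicit expansion)
err = sqrt(sum((uv - Img1Pts).^2, 1));  % per-point reprojection error (pixels)
fprintf('mean reprojection error: %.2f px\n', mean(err));
```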
That was all I had to write about camera calibration. I hope this article has been of use to you. If that is the case, then you can donate to support this website!