# Overview

In the off-site competition, input images are prepared by generating synthesized images using POV-Ray, a tool for rendering high-quality computer graphics. The resolution of each image is 640 $\times$ 480 pixels. The target scene is composed of rigid objects and is illuminated by multiple static or dynamic light sources.

# Task

## Level 1

- Find the 2D positions of the reference points, which are given as 2D image patches extracted from the image I_{0}.
- Calculate the projective transformation matrix from the correspondences between the 2D positions of the reference points in the image I_{0} and their 3D coordinates.

In general, a camera pose can be determined from more than three correspondences between the 2D positions of scene points and their 3D coordinates by solving the Perspective-n-Point (PnP) problem. The Level 1 algorithm should return the projective transformation matrix of the image I_{0}.

main1.cpp includes a basic algorithm for this problem.
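The PnP step can be illustrated with a minimal Direct Linear Transform (DLT) sketch. The code below is an independent illustration, not the contents of main1.cpp; all names are hypothetical. It fixes P[2][3] = 1 (valid whenever that entry of the true matrix is nonzero) and solves the resulting least-squares problem via the normal equations. A production solver would instead use an SVD-based homogeneous solve, coordinate normalization, and RANSAC for robustness to mismatched points.

```cpp
#include <array>
#include <cmath>
#include <vector>

// Solve a dense linear system A x = b by Gaussian elimination with
// partial pivoting. A is n x n, row-major; A and b are consumed.
std::vector<double> solve_linear(std::vector<std::vector<double>> A,
                                 std::vector<double> b) {
    int n = (int)b.size();
    for (int c = 0; c < n; ++c) {
        int p = c;
        for (int r = c + 1; r < n; ++r)
            if (std::fabs(A[r][c]) > std::fabs(A[p][c])) p = r;
        std::swap(A[c], A[p]); std::swap(b[c], b[p]);
        for (int r = c + 1; r < n; ++r) {
            double f = A[r][c] / A[c][c];
            for (int k = c; k < n; ++k) A[r][k] -= f * A[c][k];
            b[r] -= f * b[c];
        }
    }
    std::vector<double> x(n);
    for (int r = n - 1; r >= 0; --r) {
        double s = b[r];
        for (int k = r + 1; k < n; ++k) s -= A[r][k] * x[k];
        x[r] = s / A[r][r];
    }
    return x;
}

// Estimate the 3x4 projective matrix P from >= 6 non-coplanar
// correspondences (X,Y,Z) <-> (u,v) by the DLT, fixing P[2][3] = 1.
// Each correspondence contributes two linear equations in the
// remaining 11 unknowns.
std::array<std::array<double, 4>, 3>
dlt_pose(const std::vector<std::array<double, 3>>& pts3d,
         const std::vector<std::array<double, 2>>& pts2d) {
    int m = (int)pts3d.size();
    std::vector<std::vector<double>> A(2 * m, std::vector<double>(11, 0.0));
    std::vector<double> b(2 * m);
    for (int i = 0; i < m; ++i) {
        double X = pts3d[i][0], Y = pts3d[i][1], Z = pts3d[i][2];
        double u = pts2d[i][0], v = pts2d[i][1];
        // u-equation: p00 X + p01 Y + p02 Z + p03 - u (p20 X + p21 Y + p22 Z) = u
        double* r0 = A[2 * i].data();
        r0[0] = X; r0[1] = Y; r0[2] = Z; r0[3] = 1;
        r0[8] = -u * X; r0[9] = -u * Y; r0[10] = -u * Z;
        b[2 * i] = u;
        // v-equation, analogously for the second matrix row.
        double* r1 = A[2 * i + 1].data();
        r1[4] = X; r1[5] = Y; r1[6] = Z; r1[7] = 1;
        r1[8] = -v * X; r1[9] = -v * Y; r1[10] = -v * Z;
        b[2 * i + 1] = v;
    }
    // Normal equations: (A^T A) x = A^T b.
    std::vector<std::vector<double>> N(11, std::vector<double>(11, 0.0));
    std::vector<double> c(11, 0.0);
    for (int r = 0; r < 2 * m; ++r)
        for (int i = 0; i < 11; ++i) {
            c[i] += A[r][i] * b[r];
            for (int j = 0; j < 11; ++j) N[i][j] += A[r][i] * A[r][j];
        }
    std::vector<double> x = solve_linear(N, c);
    return {{{x[0], x[1], x[2], x[3]},
             {x[4], x[5], x[6], x[7]},
             {x[8], x[9], x[10], 1.0}}};
}
```

The normal-equation route is chosen here only for brevity; it squares the conditioning of the problem, which is acceptable for a noise-free synthetic sketch but not for real estimation.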

### Given information

- a VGA image: I_{0}
- m reference points, each of which has:
  - a square 65×65-pixel image patch
  - a 3D world coordinate corresponding to the center of the image patch

### Environment

- The environment is composed of a block object (e.g., a house).
- The image I_{0} is captured at the initial position with VGA resolution.
- The 2D image patches are extracted from the image I_{0}, and their 3D coordinates are measured.

### Procedure

- Read the image I_{0}, the image patches, and their 3D world coordinates.
- Find the 2D positions of the centers of the given patches in the image I_{0}. In typical algorithms, the cues for finding these positions are the texture around the scene points and the positional relationships among the scene points.
- Calculate the projective transformation matrix from the correspondences between the 2D positions of the reference points in the image I_{0} and their 3D coordinates.
- Output the 2D positions (2Ddata.csv) and the estimated projective transformation matrix (matrix.csv).
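The patch-localization step above can be sketched as an exhaustive normalized cross-correlation (NCC) search. The image type and function names below are illustrative assumptions, not part of the contest code, which reads real image files and would typically accelerate the search with an image pyramid or FFT-based correlation:

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Minimal grayscale image, stored row-major (hypothetical type).
struct Gray {
    int w, h;
    std::vector<double> px;  // size w * h
    double at(int x, int y) const { return px[y * w + x]; }
};

// Locate the top-left corner of `patch` in `img` by exhaustive NCC.
// Returns the (x, y) with the highest correlation score.
std::pair<int, int> locate_patch(const Gray& img, const Gray& patch) {
    int bx = 0, by = 0;
    double best = -2.0;  // NCC scores lie in [-1, 1]
    for (int y = 0; y + patch.h <= img.h; ++y)
        for (int x = 0; x + patch.w <= img.w; ++x) {
            // Means of the image window and of the patch.
            double mi = 0, mp = 0;
            int n = patch.w * patch.h;
            for (int v = 0; v < patch.h; ++v)
                for (int u = 0; u < patch.w; ++u) {
                    mi += img.at(x + u, y + v);
                    mp += patch.at(u, v);
                }
            mi /= n; mp /= n;
            // Zero-mean cross-correlation and variances.
            double num = 0, di = 0, dp = 0;
            for (int v = 0; v < patch.h; ++v)
                for (int u = 0; u < patch.w; ++u) {
                    double a = img.at(x + u, y + v) - mi;
                    double b = patch.at(u, v) - mp;
                    num += a * b; di += a * a; dp += b * b;
                }
            double ncc = num / (std::sqrt(di * dp) + 1e-12);
            if (ncc > best) { best = ncc; bx = x; by = y; }
        }
    return {bx, by};
}
```

Mean-normalization makes the score insensitive to uniform brightness shifts, which matters here because the scene lighting may vary between renderings.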

### Output

- The 2D positions of the centers of the given patches in UV coordinates, ranging from (0, 0) to (639, 479). The UV origin is the upper-left corner of the image, with the v-axis pointing down.
- The estimated 3 $\times$ 4 projective transformation matrix.
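To make the output convention concrete, the sketch below (with hypothetical type names) shows how a 3 $\times$ 4 projective transformation matrix maps a 3D world point to UV pixel coordinates via perspective division:

```cpp
#include <array>
#include <cmath>

// A 3x4 projective matrix maps homogeneous world coordinates
// (X, Y, Z, 1) to homogeneous image coordinates (s*u, s*v, s).
using Mat34 = std::array<std::array<double, 4>, 3>;

struct Point2D { double u, v; };

// Project a 3D world point to UV pixel coordinates.
Point2D project(const Mat34& P, double X, double Y, double Z) {
    double h[3];
    for (int r = 0; r < 3; ++r)
        h[r] = P[r][0] * X + P[r][1] * Y + P[r][2] * Z + P[r][3];
    return {h[0] / h[2], h[1] / h[2]};  // perspective division
}
```

For a 640 $\times$ 480 image, a valid estimate should send visible reference points into the [0, 639] $\times$ [0, 479] range.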

## Level 2

- Find the 2D positions of the reference points, which are given as 2D image patches, in the initial image I_{0} of an image sequence.
- Track them through the images I_{0}, ⋯, I_{n-1} in the sequence.
- Calculate the projective transformation matrix from the correspondences between the 2D positions of the reference points in the image I_{n-1} and their 3D coordinates.

All images in the sequence contain scene parts that can also be found in the initial image I_{0}. The Level 2 algorithm can exploit inter-frame constraints of the given video sequence. In contrast to the Level 1 algorithm, the Level 2 algorithm should return the projective transformation matrix of the image I_{n-1}. Participants can assume that the scene is rigid, the camera poses are consecutive, and the intrinsic parameters do not change. However, the lighting environment may change gradually during camera movement.

main2.cpp includes a basic algorithm for this problem.
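The inter-frame constraint (consecutive camera poses, hence small per-frame motion) can be exploited by searching for each patch only in a small window around its previous position. The sketch below, which is an illustration rather than the contents of main2.cpp, tracks a patch between two frames by minimizing the sum of squared differences (SSD); practical trackers such as KLT replace the brute-force search with gradient-based refinement:

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Minimal grayscale frame, stored row-major (hypothetical type).
struct Frame {
    int w, h;
    std::vector<double> px;  // size w * h
    double at(int x, int y) const { return px[y * w + x]; }
};

// Track the psz x psz patch whose top-left corner was (px0, py0) in
// `prev` into `next`, searching within +/- radius pixels and returning
// the position with minimum SSD.
std::pair<int, int> track(const Frame& prev, const Frame& next,
                          int px0, int py0, int psz, int radius) {
    double best = 1e300;
    int bx = px0, by = py0;
    for (int dy = -radius; dy <= radius; ++dy)
        for (int dx = -radius; dx <= radius; ++dx) {
            int x = px0 + dx, y = py0 + dy;
            if (x < 0 || y < 0 || x + psz > next.w || y + psz > next.h)
                continue;  // candidate window falls outside the frame
            double ssd = 0;
            for (int v = 0; v < psz; ++v)
                for (int u = 0; u < psz; ++u) {
                    double d = next.at(x + u, y + v) - prev.at(px0 + u, py0 + v);
                    ssd += d * d;
                }
            if (ssd < best) { best = ssd; bx = x; by = y; }
        }
    return {bx, by};
}
```

Because the lighting may change gradually, a robust implementation would use a brightness-invariant score (such as the NCC of Level 1) rather than raw SSD.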

### Given information

- n VGA images: I_{0}, ⋯, I_{n-1}
- m reference points in the initial image I_{0}, each of which has:
  - a square 65×65-pixel image patch
  - a 3D world coordinate corresponding to the center of the image patch

### Environment

- The environment is composed of a block object (e.g., a house).
- The image I_{0} is captured at the initial position with VGA resolution.
- The 2D image patches are extracted from the initial image I_{0}, and their 3D coordinates are measured.

### Procedure

- Read the image I_{0}, the image patches, and their 3D world coordinates.
- Find the 2D positions of the centers of the given patches in the image I_{0}.
- Repeat the same process for each image through the last image I_{n-1} (i.e., track the patches from I_{0} to I_{n-1}). In typical algorithms, the cues for finding these positions are the texture around the scene points and the positional relationships among the scene points and across images.
- Calculate the projective transformation matrix from the correspondences between the 2D positions of the reference points in the last image I_{n-1} and their 3D coordinates.
- Output the 2D positions (2Ddata.csv) and the estimated projective transformation matrix (matrix.csv).

### Output

- The 2D positions of the centers of the given patches in UV coordinates, ranging from (0, 0) to (639, 479). The UV origin is the upper-left corner of the image, with the v-axis pointing down.
- The estimated 3 $\times$ 4 projective transformation matrix of the last image I_{n-1}.

## Level 3

- Calculate the projective transformation matrix from the correspondences between the 2D positions in the last image I_{n-1} of an image sequence and their 3D coordinates.

This task is almost the same as Level 2. However, none of the scene parts (patches) found in the initial image I_{0} remain visible in the final part of the sequence, including I_{n-1}. For this level, tracking only the given points may not be enough for accurate estimation of the camera pose.
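The 3D-mapping step this implies, reconstructing the 3D coordinate of a newly selected point from its 2D positions in two frames with known poses, can be sketched as linear two-view triangulation. The code below is an illustrative sketch, not part of the contest code: each view contributes two linear equations in (X, Y, Z), and the resulting 4-equation least-squares problem is solved through its 3 $\times$ 3 normal equations by Cramer's rule. A robust pipeline would also check the reprojection error before accepting the new reference point.

```cpp
#include <array>
#include <cmath>

using Mat34 = std::array<std::array<double, 4>, 3>;

// Recover a 3D point from its projections (u0, v0) and (u1, v1) in two
// views with known 3x4 projective matrices P0 and P1.
std::array<double, 3> triangulate(const Mat34& P0, double u0, double v0,
                                  const Mat34& P1, double u1, double v1) {
    // Each view gives two equations (row_k - s * row_2) . (X, Y, Z, 1) = 0,
    // with s = u for k = 0 and s = v for k = 1.
    double A[4][3], b[4];
    const Mat34* Ps[2] = {&P0, &P1};
    double uv[2][2] = {{u0, v0}, {u1, v1}};
    for (int c = 0; c < 2; ++c)
        for (int k = 0; k < 2; ++k) {
            const Mat34& P = *Ps[c];
            int r = 2 * c + k;
            double s = uv[c][k];
            for (int j = 0; j < 3; ++j)
                A[r][j] = P[k][j] - s * P[2][j];
            b[r] = s * P[2][3] - P[k][3];
        }
    // Normal equations N x = y with N = A^T A (3x3), y = A^T b.
    double N[3][3] = {{0}}, y[3] = {0};
    for (int r = 0; r < 4; ++r)
        for (int i = 0; i < 3; ++i) {
            y[i] += A[r][i] * b[r];
            for (int j = 0; j < 3; ++j) N[i][j] += A[r][i] * A[r][j];
        }
    auto det3 = [](double m[3][3]) {
        return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
             - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
             + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
    };
    double d = det3(N);
    std::array<double, 3> X;
    for (int i = 0; i < 3; ++i) {  // Cramer's rule, column i replaced by y
        double M[3][3];
        for (int r = 0; r < 3; ++r)
            for (int j = 0; j < 3; ++j)
                M[r][j] = (j == i) ? y[r] : N[r][j];
        X[i] = det3(M) / d;
    }
    return X;
}
```

With poses estimated for intermediate frames, points triangulated this way can serve as the additional reference points needed once the original patches leave the field of view.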

### Given information

- n VGA images: I_{0}, ⋯, I_{n-1}
- m reference points in the initial image I_{0}, each of which has:
  - a square 65×65-pixel image patch
  - a 3D world coordinate corresponding to the center of the image patch

### Environment

- The environment is composed of a block object (e.g., a house).
- The image I_{0} is captured at the initial position with VGA resolution.
- The 2D image patches are extracted from the initial image I_{0}, and their 3D coordinates are measured.

### Procedure

- Read the image I_{0}, the image patches, and their 3D world coordinates.
- Find the 2D positions of the given patches in I_{0}.
- Repeat the same process for each image through the last image I_{n-1} (i.e., track the patches from I_{0} to I_{n-1}). In typical algorithms, the cues for finding these positions are the texture around the scene points and the positional relationships among the scene points and across images.
- When the given patches are no longer observed in an image, it is necessary to add reference points by reconstructing their 3D coordinates (i.e., 3D mapping).
- Calculate the projective transformation matrix from the correspondences between the 2D positions of the reference points, which might have been added during tracking, in the last image I_{n-1} and their 3D coordinates.
- Output the 2D positions (2Ddata.csv) and the estimated projective transformation matrix (matrix.csv).

### Output

- The estimated 3 $\times$ 4 projective transformation matrix of the last image I_{n-1}.