ORB-SLAM2 Notes: Day 11 – Descriptor - Pinecone Log

In the last article, I implemented the Orientation Computer, which computes the orientation of each keypoint. Next, we will use that orientation to compute descriptors for matching keypoints across frames.

The code is available on my GitHub repository, and readers are welcome to check it out!

Descriptor

Overview

In ORB-SLAM2, descriptors are used to match keypoints across frames. Once the same point has been matched in two or more frames, its position in space can be recovered by triangulation, as shown in Fig. 1.

Structure

A descriptor is a 256-bit string (32 bytes), where each bit is either 0 or 1, as shown in Fig. 2.
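To make the layout concrete, here is a minimal sketch of packing 256 bits into 32 bytes. The function name `pack_bits` is hypothetical, and the bit order (first bit goes into the least significant bit of the first byte) is an assumption made for illustration.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, least significant bit first.

    The bit order is an assumption for this sketch.
    """
    if len(bits) % 8 != 0:
        raise ValueError("number of bits must be a multiple of 8")
    packed = bytearray(len(bits) // 8)
    for i, bit in enumerate(bits):
        if bit:
            # Bit i lives in byte i // 8, at position i % 8 inside that byte.
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


# A made-up 256-bit pattern: 1, 0, 1, 0, ...
descriptor = pack_bits([1, 0] * 128)
print(len(descriptor))      # 32
print(hex(descriptor[0]))   # 0x55
```

Storing the descriptor as bytes rather than as a list of 256 integers keeps it compact and makes bitwise comparison cheap.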

Match

We use the Hamming distance to measure the similarity between two descriptors. It is the number of bit positions at which the two descriptors differ. The smaller the distance, the more similar the two descriptors are.

Hamming Distance

A simple example: consider two 8-bit binary descriptors, $d_1$ and $d_2$:

$$
\begin{aligned}
d_1 &= 10\underline{1}10\underline{0}10 \\
d_2 &= 10\underline{0}10\underline{1}10
\end{aligned}
$$

There are two differing bits, so the Hamming distance is 2.
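The same example can be checked in a few lines of code. This is a minimal sketch that assumes descriptors are stored as `bytes`; the function name `hamming_distance` is my own choice for illustration.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length descriptors differ."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    # XOR leaves a 1 exactly where the two bytes differ; count those 1s.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


d1 = bytes([0b10110010])
d2 = bytes([0b10010110])
print(hamming_distance(d1, d2))  # 2
```

The same function works unchanged on 32-byte descriptors, since it simply walks over the bytes in pairs.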

How to Compute Descriptor?

Coordinate Rotation

Once we have the orientation of a keypoint, we use it to compute the descriptor. Conceptually, we rotate the patch around the keypoint so that its orientation aligns with the x-axis, as shown in Fig. 3. In practice, the same effect is achieved by rotating the sample coordinates by the orientation angle $\theta$ instead, as shown in Fig. 4:

$$
\begin{bmatrix} x' \\ y' \end{bmatrix} =
\begin{bmatrix}
\cos\theta & -\sin\theta \\
\sin\theta & \cos\theta
\end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
\tag{1}
$$

Using the orientation information compensates for camera rotation and makes computed descriptors more robust.

Fig. 3. Rotate the keypoint so that its orientation aligns with the x-axis.

Fig. 4. In practice, it is more convenient to rotate the sample coordinates for accessing intensities of target pixels.
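As a rough illustration of Eq. (1), the sketch below rotates one sample offset and turns it into an absolute pixel position. The helper names `rotate_point` and `rotated_pixel` are hypothetical, and rounding to the nearest pixel is an assumption of this sketch.

```python
import math


def rotate_point(x, y, theta):
    """Apply the 2D rotation of Eq. (1) to the offset (x, y)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def rotated_pixel(kp_x, kp_y, dx, dy, theta):
    """Rotate the offset (dx, dy) by theta and add it to the keypoint position.

    Rounding to the nearest pixel is an assumption made for this sketch.
    """
    rx, ry = rotate_point(dx, dy, theta)
    return (kp_x + round(rx), kp_y + round(ry))


# An offset of 5 pixels along x, rotated by 90 degrees, ends up along y.
print(rotated_pixel(100, 50, 5, 0, math.pi / 2))  # (100, 55)
```

Rotating the small set of sample offsets is much cheaper than rotating the image patch itself, which is why this indirect approach is convenient.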

Gaussian Blur

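Each descriptor bit compares the intensities of individual pixels, so it is sensitive to pixel-level noise. Blurring the image before computing descriptors reduces that noise and makes the comparisons more stable. ORB-SLAM2 applies a 7×7 Gaussian kernel with σ = 2 to each pyramid level.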
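To show what the blur does, here is a small pure-Python sketch of a separable Gaussian blur on an image stored as a list of rows. It is not the OpenCV implementation: the function names are hypothetical, the default kernel size of 7 and sigma of 2 are assumed values, and the border handling (clamping to the nearest edge pixel) is a simplification chosen for brevity.

```python
import math


def gaussian_kernel(size=7, sigma=2.0):
    """Build a normalized 1D Gaussian kernel (size and sigma are assumed defaults)."""
    half = size // 2
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Convolve one row or column with the kernel, clamping indices at the borders."""
    half = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)  # clamp to the nearest valid index
            acc += w * values[j]
        out.append(acc)
    return out


def gaussian_blur(image, size=7, sigma=2.0):
    """Blur rows first, then columns (a Gaussian kernel is separable)."""
    kernel = gaussian_kernel(size, sigma)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


# A single bright "noise" pixel in a dark 9x9 image gets spread out and damped.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 100.0
blurred = gaussian_blur(image)
print(blurred[4][4] < image[4][4])  # True
```

An isolated spike like this could flip a descriptor bit on its own; after blurring, each pixel reflects its neighborhood instead.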

Compute Descriptor

Within the circular region of radius 15 around the keypoint, we select 256 pairs of pixels and compare the two intensities in each pair. The pairs look random, but they are not drawn at run time: they come from a fixed, precomputed template of coordinate pairs (the bit_pattern_31_ array in ORB-SLAM2). We just need to compute descriptors based on it.

For one bit of a descriptor, the value is:

$$
\text{value} =
\begin{cases}
1, & \text{if } I(y_1,x_1) < I(y_2,x_2) \\
0, & \text{if } I(y_1,x_1) \ge I(y_2,x_2)
\end{cases}
$$

where $I(y,x)$ is the intensity at pixel $(y,x)$, and $(y_1,x_1)$ and $(y_2,x_2)$ are the two rotated sample points of the pair.

Repeating this comparison for all 256 pairs yields the 256-bit descriptor, which encodes the intensity pattern around the keypoint.
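Putting the pieces together, the sketch below computes descriptor bits for one keypoint. It is a simplified illustration, not the ORB-SLAM2 code: `compute_descriptor` is a hypothetical name, the image is a plain list of rows, and the 4-pair `pattern` is a made-up stand-in for the real 256-pair template.

```python
import math


def compute_descriptor(image, kp_x, kp_y, theta, pattern):
    """Return one bit per pair in `pattern` for the keypoint at (kp_x, kp_y).

    The keypoint is assumed to be far enough from the image border that
    every rotated sample falls inside the image.
    """
    c, s = math.cos(theta), math.sin(theta)

    def intensity(dx, dy):
        # Rotate the sample offset by theta (Eq. 1), then read that pixel.
        x = kp_x + round(c * dx - s * dy)
        y = kp_y + round(s * dx + c * dy)
        return image[y][x]

    bits = []
    for (x1, y1), (x2, y2) in pattern:
        # The bit is 1 when the first sample is darker than the second.
        bits.append(1 if intensity(x1, y1) < intensity(x2, y2) else 0)
    return bits


# Toy 9x9 image whose intensity grows to the right and downwards.
image = [[x + 10 * y for x in range(9)] for y in range(9)]
# Made-up template of 4 pairs of (dx, dy) offsets, for illustration only.
pattern = [((-1, 0), (1, 0)), ((1, 0), (-1, 0)), ((0, -2), (0, 2)), ((2, 2), (2, 2))]

print(compute_descriptor(image, 4, 4, 0.0, pattern))  # [1, 0, 1, 0]
```

With the real template, the loop runs over 256 pairs and the resulting bits are packed into 32 bytes.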
