Marker-less Augmented Reality by OpenCV and OpenGL

You can find more information about the source code at GitHub.

Quick Demo

Marker-less Augmented Reality Version 1
Marker-less Augmented Reality Version 2
[1]

1. INTRODUCTION

A traditional fiducial marker (for example, a 5 x 5 binary marker) has the following properties:
  • Cheap detection algorithm
  • Robust against lighting changes
  • Doesn't work if it is partially occluded
  • The marker image has to be black and white
  • Has a square form in most cases (because it's easy to detect)
  • Non-esthetic visual look of the marker
  • Has nothing in common with real-world objects

A marker-less (feature-based) target, in contrast:
  • Can be used to detect real-world objects
  • Works even if the target object is partially occluded
  • Can have an arbitrary form and texture (except solid or smooth-gradient textures)
The project assumes the following prerequisites:
  • Basic knowledge of CMake. CMake is a cross-platform, open-source build system designed to build, test, and package software. Like the OpenCV library, the demonstration project for this chapter also uses the CMake build system. CMake can be downloaded from cmake.org.
  • Basic knowledge of the C++ programming language.

2. THEORY

  1.) MarkerlessAR_V1: This is the first version of “Open Source Markerless Augmented Reality”, and its capabilities are described below. Feature detection is built on OpenCV's cv::FeatureDetector interface, and a concrete detector can be created in one of two ways:
  • Via an explicit call of the concrete feature detector class constructor:
cv::Ptr<cv::FeatureDetector> detector = cv::Ptr<cv::FeatureDetector>(new cv::SurfFeatureDetector());
  • Or by creating a feature detector by algorithm name:
cv::Ptr<cv::FeatureDetector> detector = cv::FeatureDetector::create("SURF");
std::vector<cv::KeyPoint> keypoints;
detector->detect(image, keypoints);
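Note that cv::SurfFeatureDetector and the string-based cv::FeatureDetector::create() factory belong to the OpenCV 2.4 API used throughout this project. On OpenCV 3.x/4.x the same detection-plus-description step is usually written against the cv::Feature2D interface. A minimal sketch, assuming OpenCV 3+ and using ORB only as a freely available example algorithm (not the project's SURF setup):

#include <opencv2/features2d.hpp>
#include <vector>

// Detect keypoints and compute their descriptors in one pass (OpenCV 3+ API).
// ORB is used here only as an illustrative, patent-free detector/descriptor.
void detectAndDescribe(const cv::Mat& grayImage,
                       std::vector<cv::KeyPoint>& keypoints,
                       cv::Mat& descriptors)
{
    cv::Ptr<cv::Feature2D> detector = cv::ORB::create(1000); // up to 1000 keypoints
    detector->detectAndCompute(grayImage, cv::noArray(), keypoints, descriptors);
}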
/**
* Store the image data and computed descriptors of target pattern
*/
struct Pattern {

cv::Size size;
cv::Mat frame;
cv::Mat grayImg;
std::vector<cv::KeyPoint> keypoints;
cv::Mat descriptors;
std::vector<cv::Point2f> points2d;
std::vector<cv::Point3f> points3d;

};
void PatternDetector::train(const Pattern& pattern) {

// Store the pattern object
m_pattern = pattern;

// API of cv::DescriptorMatcher is somewhat tricky
// First we clear old train data:
m_matcher->clear();

// Then we add the vector of descriptors
// (each descriptor matrix describes one image).
// This allows us to perform search across multiple images:
std::vector<cv::Mat> descriptors(1);
descriptors[0] = pattern.descriptors.clone();
m_matcher->add(descriptors);

// After adding train data perform actual train:
m_matcher->train();
}
  • To find the simple list of best matches:
void match(const Mat& queryDescriptors, vector<DMatch>& matches, const vector<Mat>& masks=vector<Mat>());
  • To find K nearest matches for each descriptor:
void knnMatch(const Mat& queryDescriptors, vector<vector<DMatch> >& matches, int k, const vector<Mat>& masks=vector<Mat>(), bool compactResult=false);
  • To find correspondences whose distances are not farther than the specified distance:
void radiusMatch(const Mat& queryDescriptors, vector<vector<DMatch> >& matches, float maxDistance, const vector<Mat>& masks=vector<Mat>(), bool compactResult=false);
  • False-positive matches: When the feature-point correspondence is wrong
  • False-negative matches: The absence of a match when the feature points are visible on both images
void PatternDetector::getMatches(const cv::Mat& queryDescriptors, std::vector<cv::DMatch>& matches) {

matches.clear();
if (enableRatioTest) {

// To avoid NaNs when best match has
// zero distance we will use inverse ratio.
const float minRatio = 1.f / 1.5f;

// KNN match will return 2 nearest
// matches for each query descriptor
m_matcher->knnMatch(queryDescriptors, m_knnMatches, 2);

for (size_t i=0; i<m_knnMatches.size(); i++) {

const cv::DMatch& bestMatch = m_knnMatches[i][0];
const cv::DMatch& betterMatch = m_knnMatches[i][1];
float distanceRatio = bestMatch.distance /
betterMatch.distance;
// Pass only matches whose best distance is clearly
// smaller than the second-best one (the second-best
// must be at least 1.5x farther): distinctiveness test
if (distanceRatio < minRatio) {

matches.push_back(bestMatch);

}

}

}
else { // Perform regular match
m_matcher->match(queryDescriptors, matches);
}
}
bool PatternDetector::refineMatchesWithHomography(
const std::vector<cv::KeyPoint>& queryKeypoints,
const std::vector<cv::KeyPoint>& trainKeypoints,
float reprojectionThreshold,
std::vector<cv::DMatch>& matches,
cv::Mat& homography) {

const int minNumberMatchesAllowed = 8;

if (matches.size() < minNumberMatchesAllowed)
return false;

// Prepare data for cv::findHomography
std::vector<cv::Point2f> srcPoints(matches.size());
std::vector<cv::Point2f> dstPoints(matches.size());
for (size_t i = 0; i < matches.size(); i++) {

srcPoints[i] = trainKeypoints[matches[i].trainIdx].pt;
dstPoints[i] = queryKeypoints[matches[i].queryIdx].pt;

}

// Find homography matrix and get inliers mask
std::vector<unsigned char> inliersMask(srcPoints.size());
homography = cv::findHomography(srcPoints,
dstPoints,
CV_RANSAC,
reprojectionThreshold,
inliersMask);
std::vector<cv::DMatch> inliers;
for (size_t i=0; i<inliersMask.size(); i++) {

if (inliersMask[i])
inliers.push_back(matches[i]);

}

matches.swap(inliers);
return matches.size() > minNumberMatchesAllowed;
}
bool PatternDetector::findPattern(const cv::Mat& image, PatternTrackingInfo& info) {

// Convert input image to gray
getGray(image, m_grayImg);

// Extract feature points from input gray image
extractFeatures(m_grayImg, m_queryKeypoints,
m_queryDescriptors);

// Get matches with current pattern
getMatches(m_queryDescriptors, m_matches);

// Find homography transformation and detect good matches
bool homographyFound = refineMatchesWithHomography(
m_queryKeypoints,
m_pattern.keypoints,
homographyReprojectionThreshold,
m_matches,
m_roughHomography);

if (homographyFound) {

// If homography refinement enabled
// improve found transformation
if (enableHomographyRefinement) {

// Warp image using found homography
cv::warpPerspective(m_grayImg, m_warpedImg,
m_roughHomography, m_pattern.size,
cv::WARP_INVERSE_MAP | cv::INTER_CUBIC);

// Get refined matches:
std::vector<cv::KeyPoint> warpedKeypoints;
std::vector<cv::DMatch> refinedMatches;

// Detect features on warped image
extractFeatures(m_warpedImg, warpedKeypoints,
m_queryDescriptors);

// Match with pattern
getMatches(m_queryDescriptors, refinedMatches);

// Estimate new refinement homography
homographyFound = refineMatchesWithHomography(
warpedKeypoints,
m_pattern.keypoints,
homographyReprojectionThreshold,
refinedMatches,
m_refinedHomography);

// Get the final homography as the matrix product
// of the rough and refined homographies:
info.homography = m_roughHomography *
m_refinedHomography;

// Transform contour with precise homography
cv::perspectiveTransform(m_pattern.points2d,
info.points2d, info.homography);

}

else {

info.homography = m_roughHomography;

// Transform contour with rough homography
cv::perspectiveTransform(m_pattern.points2d,
info.points2d, m_roughHomography);

}
}

return homographyFound;

}
To summarize, the pattern detection routine performed the following steps:
  1. Converted the input image to grayscale.
  2. Detected features on the query image using our feature-detection algorithm.
  3. Extracted descriptors from the input image for the detected feature points.
  4. Matched descriptors against pattern descriptors.
  5. Used cross-checks or ratio tests to remove outliers.
  6. Found the homography transformation using inlier matches.
  7. Refined the homography by warping the query image with homography from the previous step.
  8. Found the precise homography as a result of the multiplication of rough and refined homography.
  9. Transformed the pattern corners to an image coordinate system to get pattern locations on the input image.
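Putting these pieces together, a minimal driver for the detection pipeline might look like the sketch below. The class and member names (PatternDetector, Pattern, PatternTrackingInfo, buildPatternFromImage, train, findPattern) are those shown above; the image file names are only placeholders:

// Hypothetical driver showing the order of calls in the detection pipeline.
cv::Mat patternImage = cv::imread("pattern.png");    // planar target to track
cv::Mat queryImage = cv::imread("test_image.png");   // scene that may contain it

PatternDetector detector;
Pattern pattern;
PatternTrackingInfo info;

// Build the pattern (keypoints + descriptors) once, then train the matcher on it.
detector.buildPatternFromImage(patternImage, pattern);
detector.train(pattern);

// For every query image or camera frame: detect, match and estimate the homography.
if (detector.findPattern(queryImage, info)) {
// info.points2d now holds the pattern corners in image coordinates,
// info.homography holds the pattern-to-image transformation.
}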
void PatternDetector::buildPatternFromImage(const cv::Mat& image, Pattern& pattern) const {

int numImages = 4;
float step = sqrtf(2.0f);

// Store original image in pattern structure
pattern.size = cv::Size(image.cols, image.rows);
pattern.frame = image.clone();
getGray(image, pattern.grayImg);
// Build 2D and 3D contours (the 3D contour lies in the
// XY plane since the pattern is planar)
pattern.points2d.resize(4);
pattern.points3d.resize(4);
// Image dimensions
const float w = image.cols;
const float h = image.rows;
// Normalized dimensions:
const float maxSize = std::max(w,h);
const float unitW = w / maxSize;
const float unitH = h / maxSize;
pattern.points2d[0] = cv::Point2f(0,0);
pattern.points2d[1] = cv::Point2f(w,0);
pattern.points2d[2] = cv::Point2f(w,h);
pattern.points2d[3] = cv::Point2f(0,h);
pattern.points3d[0] = cv::Point3f(-unitW, -unitH, 0);
pattern.points3d[1] = cv::Point3f( unitW, -unitH, 0);
pattern.points3d[2] = cv::Point3f( unitW, unitH, 0);
pattern.points3d[3] = cv::Point3f(-unitW, unitH, 0);
extractFeatures(pattern.grayImg, pattern.keypoints,pattern.descriptors);}
  • The most recent image taken from the camera
  • The camera-calibration matrix
  • The pattern pose in 3D (if present)
  • The internal data related to OpenGL (texture ID and so on)
  • The camera-calibration object
  • An Instance of the pattern-detector object
  • A trained pattern object
  • Intermediate data of pattern tracking
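The pose-estimation code itself is not reproduced here, but conceptually the 3D pattern pose is computed from the 2D-3D corner correspondences (pattern.points3d against the detected info.points2d) together with the camera-calibration matrix. A minimal sketch of that step, assuming a pinhole calibration matrix and negligible lens distortion (an illustration using cv::solvePnP, not the project's exact implementation):

#include <opencv2/calib3d/calib3d.hpp>

// Hedged sketch: estimate the pattern pose from the 3D model corners and their
// detected 2D image locations.
void estimatePose(const Pattern& pattern,
                  const PatternTrackingInfo& info,
                  const cv::Matx33f& cameraMatrix, // intrinsic calibration matrix
                  cv::Mat& rvec, cv::Mat& tvec)    // rotation (Rodrigues vector) and translation
{
    cv::Mat distCoeffs = cv::Mat::zeros(4, 1, CV_32F); // assume negligible distortion
    cv::solvePnP(pattern.points3d, info.points2d, cameraMatrix, distCoeffs, rvec, tvec);
}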
The 3D model that is rendered on top of the detected pattern is stored as a Wavefront OBJ file; the cube used in the demo looks like this:

mtllib cube.mtl
v 1.000000 -1.000000 -1.000000
v 1.000000 -1.000000 1.000000
v -1.000000 -1.000000 1.000000
v -1.000000 -1.000000 -1.000000
v 1.000000 1.000000 -1.000000
v 0.999999 1.000000 1.000001
v -1.000000 1.000000 1.000000
v -1.000000 1.000000 -1.000000
vt 0.748573 0.750412
vt 0.749279 0.501284
vt 0.999110 0.501077
vt 0.999455 0.750380
vt 0.250471 0.500702
vt 0.249682 0.749677
vt 0.001085 0.750380
vt 0.001517 0.499994
vt 0.499422 0.500239
vt 0.500149 0.750166
vt 0.748355 0.998230
vt 0.500193 0.998728
vt 0.498993 0.250415
vt 0.748953 0.250920
vn 0.000000 0.000000 -1.000000
vn -1.000000 -0.000000 -0.000000
vn -0.000000 -0.000000 1.000000
vn -0.000001 0.000000 1.000000
vn 1.000000 -0.000000 0.000000
vn 1.000000 0.000000 0.000001
vn 0.000000 1.000000 -0.000000
vn -0.000000 -1.000000 0.000000
usemtl Material_ray.png
s off
f 5/1/1 1/2/1 4/3/1
f 5/1/1 4/3/1 8/4/1
f 3/5/2 7/6/2 8/7/2
f 3/5/2 8/7/2 4/8/2
f 2/9/3 6/10/3 3/5/3
f 6/10/4 7/6/4 3/5/4
f 1/2/5 5/1/5 2/9/5
f 5/1/6 6/10/6 2/9/6
f 5/1/7 8/11/7 6/10/7
f 8/11/7 7/12/7 6/10/7
f 1/2/8 2/9/8 3/13/8
f 1/2/8 3/13/8 4/14/8
  • usemtl and mtllib describe the look of the model. We won’t use them in this tutorial.
  • v is a vertex
  • vt is the texture coordinate of one vertex
  • vn is the normal of one vertex
  • f is a face
  • 8/11/7 describes the first vertex of the triangle
  • 7/12/7 describes the second vertex of the triangle
  • 6/10/7 describes the third vertex of the triangle (duh)
  • For the first vertex, 8 says which vertex to use. So in this case, -1.000000 1.000000 -1.000000 (indices start at 1, not at 0 as in C++)
  • 11 says which texture coordinate to use. So in this case, 0.748355 0.998230
  • 7 says which normal to use. So in this case, 0.000000 1.000000 -0.000000
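As a quick illustration of how such a face line can be read in code, here is a small parsing sketch. It is not the project's actual loader (the repository's loader may handle more cases); it only covers the v/vt/vn triangle form shown above:

#include <cstdio>

// Parse one OBJ triangle face line such as "f 8/11/7 7/12/7 6/10/7".
// OBJ indices are 1-based, so 1 is subtracted before indexing C++ vectors.
struct FaceIndices { unsigned v[3], vt[3], vn[3]; };

bool parseFaceLine(const char* line, FaceIndices& f)
{
    int n = std::sscanf(line, "f %u/%u/%u %u/%u/%u %u/%u/%u",
                        &f.v[0], &f.vt[0], &f.vn[0],
                        &f.v[1], &f.vt[1], &f.vn[1],
                        &f.v[2], &f.vt[2], &f.vn[2]);
    if (n != 9) return false;      // only the v/vt/vn form is handled here
    for (int i = 0; i < 3; ++i) {  // convert to 0-based indices
        --f.v[i]; --f.vt[i]; --f.vn[i];
    }
    return true;
}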
void ARDrawingContext::draw3DModel() { // renders the 3D model using OpenGL

glEnable(GL_TEXTURE_2D); // enable 2D texturing
glBindTexture(GL_TEXTURE_2D, Texture); // bind the model's texture
glBegin(GL_TRIANGLES); // delimit the vertices of the triangle primitives

for (int i = 0; i < vertices.size(); i += 1) { // loop over all vertices, normals and texture coordinates
vec3 a = vertices[i];
vec2 b = uvs[i];
vec3 n = normals[i];
glNormal3f(n.x, n.y, n.z); // per-vertex normal
glTexCoord2d(b.x, b.y); // per-vertex texture coordinate
glVertex3f(a.x, a.y, a.z); // vertex position
} glEnd(); // end drawing of the triangles
glDisable(GL_TEXTURE_2D); // disable 2D texturing
}
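draw3DModel only issues the geometry. For the augmentation to line up with the camera image, the projection and model-view matrices must already be loaded from the camera calibration and the estimated pattern pose before it is called. A hedged sketch of that setup, using the same fixed-function OpenGL style as the code above (the projectionMatrix and poseMatrix arrays are assumed to be column-major 4x4 matrices built elsewhere from the calibration data and the pose):

// Hedged sketch: load camera-derived matrices before calling draw3DModel().
// projectionMatrix comes from the camera-calibration (intrinsic) matrix,
// poseMatrix from the detected pattern pose (rotation + translation).
void setupCameraAndPose(const float projectionMatrix[16], const float poseMatrix[16])
{
    glMatrixMode(GL_PROJECTION);
    glLoadMatrixf(projectionMatrix); // perspective projection from the intrinsics

    glMatrixMode(GL_MODELVIEW);
    glLoadMatrixf(poseMatrix);       // places the model on the detected pattern
}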
void ARDrawingContext::scale3DModel(float scaleFactor) {

for (int i = 0; i < vertices.size(); i += 1) {

// multiplying vertices by scale factor to scale 3d model
vertices[i] = vertices[i] * vec3(scaleFactor * 1.0f, scaleFactor * 1.0f, scaleFactor * 1.0f);

}
for (int i = 0; i < normals.size(); i += 1) {

// multiplying normals by scale factor to scale 3d model
normals[i] = normals[i] * vec3(scaleFactor * 1.0f, scaleFactor * 1.0f, scaleFactor * 1.0f);

}
for (int i = 0; i < uvs.size(); i += 1) {

// multiplying texture coordinates (uvs) by the scale factor
uvs[i] = uvs[i] * vec2(scaleFactor * 1.0f, scaleFactor * 1.0f);

}

}
[2]

3. INSTALLATION

How to enable OpenGL Support in OpenCV:

  • Linux: Execute;
  • MacOSX: Install QT4 and then configure OpenCV with QT and OpenGL.
  • Windows: Enable WITH_OPENGL=YES flag when building OpenCV to enable OpenGL support.

Building the project using CMake from the command-line:

  • Linux:
export OpenCV_DIR="~/OpenCV/build"
mkdir build
cd build
cmake -D OpenCV_DIR=$OpenCV_DIR ..
make
  • MacOSX (Xcode):
export OpenCV_DIR="~/OpenCV/build"
mkdir build
cd build
cmake -G Xcode -D OpenCV_DIR=$OpenCV_DIR ..
open ARProject.xcodeproj
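
For reference, the kind of CMakeLists.txt these commands expect might look roughly like the minimal sketch below. The target name ARProject matches the commands above, but the source-file list and the exact linkage are assumptions rather than the project's actual build script:

# Minimal, hypothetical CMakeLists.txt for an OpenCV + OpenGL project.
cmake_minimum_required(VERSION 2.8)
project(ARProject)

find_package(OpenCV REQUIRED)   # uses the OpenCV_DIR passed on the command line
find_package(OpenGL REQUIRED)

include_directories(${OpenCV_INCLUDE_DIRS} ${OPENGL_INCLUDE_DIR})

# Source files are assumed; replace with the project's actual sources.
add_executable(ARProject main.cpp)

target_link_libraries(ARProject ${OpenCV_LIBS} ${OPENGL_LIBRARIES})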

Running the project:

4. USAGE

  • To run on a single image, call:
ARProject pattern.png test_image.png
  • To run on a recorded video, call:
ARProject pattern.png test_video.avi
  • To run using a live feed from a web camera, call:
ARProject pattern.png

5. FUTURE WORKS

  • IN PROGRESS: Working on performance issues to achieve real-time tracking by using dimension reduction, i.e., reducing the number of random variables under consideration by obtaining a set of principal variables (see the sketch after this list).
  • TO DO: Multiple Object Detection And Tracking
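
To illustrate the dimension-reduction idea mentioned above, descriptor matching can be made cheaper by projecting descriptors onto fewer dimensions before matching, for example with cv::PCA. The sketch below is only an illustration under assumed matrix shapes (one CV_32F descriptor per row); it is not the project's implementation:

#include <opencv2/core/core.hpp>

// Hedged sketch: reduce descriptor dimensionality with PCA before matching.
cv::Mat reduceDescriptors(const cv::Mat& descriptors, int targetDims)
{
    // Learn the principal components from the descriptor set itself (row-major samples).
    cv::PCA pca(descriptors, cv::Mat(), CV_PCA_DATA_AS_ROW, targetDims);

    // Project every descriptor onto the first targetDims principal components.
    cv::Mat reduced;
    pca.project(descriptors, reduced);
    return reduced; // rows x targetDims matrix, ready for a cheaper matcher
}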

6. CITATION

@ONLINE{vdtc,
author = "Ahmet Özlü",
title = "Open Source Markerless Augmented Reality",
year = "2017",
url = "https://github.com/ahmetozlu/open_source_markerless_augmented_reality"
}

7. AUTHOR

8. LICENSE

9. SUMMARY

You can find more information about the source code at GitHub.
