Creating a Multi-View Face Recognition/Detection Database for Deep Learning in a Programmatic Way

Ahmet Özlü
Sep 30, 2017

Let’s assume that you want to investigate some aspect of facial recognition or facial detection. One thing you are going to want is a variety of faces to use with your system. You can create your own face detection/recognition database, but how? Cropping and saving images by hand may come to mind, but that would be really exhausting! You probably don’t want to sit in front of your computer all day long cropping and saving images to build a database.

Alternatively, you could look at some of the existing facial recognition and facial detection databases that fellow researchers and organizations have created in the past, for example VGG Face Descriptor or Labeled Faces in the Wild. Why reinvent the wheel if you do not have to? This works if you just want a Single-View Face Recognition/Detection database. You can apply some processing to these databases, such as warping each picture or using an algorithm called face landmark estimation, to detect/recognize multi-view faces, but that does not work for surveillance footage. You really need a Multi-View Face Database to achieve real multi-view face detection/recognition in surveillance or harder cases.

However, to my knowledge, neither academia nor industry offers a suitable Multi-View Face Recognition/Detection database for research on surveillance camera recordings or harder cases. Moreover, Multi-View Face Recognition/Detection has been a hot topic in Computer Vision in recent years. Thus, creating your own Multi-View Face Recognition/Detection database is a very valuable step! Who knows, maybe you can share your database with both academia and industry to help develop facial recognition and detection technology. Researchers would appreciate you for it!

And we already know that the world’s most valuable resource is no longer oil, but data. So let’s create our own data!

Let’s get more technical

Let’s think about how we can automatically detect, recognize, crop, and save face images from videos. The steps we will work through are:

1-) Read the video frame by frame.

2-) Process each frame to recognize faces, so we will need a face recognizer.

3-) Crop each recognized face and save it as an image under the appropriate file path.

4-) Once we have acquired the face data, we’ll need to read it in our programs. Therefore, we will programmatically create a CSV file that lists the images in the database, so that other programs can use them.
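As a minimal sketch of step 1, the loop below reads a video frame by frame with OpenCV's `cv2.VideoCapture`. Using OpenCV here is my assumption, and the `iter_frames` and `should_keep` names and the `every_n` sampling parameter are hypothetical helpers for illustration, not necessarily what the original project uses.

```python
from typing import Any, Iterator, Tuple


def should_keep(frame_index: int, every_n: int) -> bool:
    """Return True for every `every_n`-th frame (0, every_n, 2 * every_n, ...)."""
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    return frame_index % every_n == 0


def iter_frames(video_path: str, every_n: int = 1) -> Iterator[Tuple[int, Any]]:
    """Yield (frame_index, frame) pairs from a video file.

    OpenCV returns each frame as a NumPy array in BGR channel order.
    """
    import cv2  # third-party; imported lazily so should_keep works without it

    capture = cv2.VideoCapture(video_path)
    if not capture.isOpened():
        raise IOError(f"could not open video: {video_path}")
    try:
        index = 0
        while True:
            ok, frame = capture.read()
            if not ok:  # end of the video or a read error
                break
            if should_keep(index, every_n):
                yield index, frame
            index += 1
    finally:
        capture.release()  # always free the video handle
```

Skipping frames with `every_n` is optional, but consecutive frames are usually almost identical, so sampling can keep the database from filling up with near-duplicates.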

A scene from Person of Interest TV series.

Okay, we have specified the tasks we need to handle to create our own facial database. Now we should decide on the working environment, I mean the programming language! Let’s think, what do we have? Java, Python, or C++…

I chose Python! Why? Because I love Python! Allow me to list the reasons why I love Python:

  • Interactive
  • Interpreted
  • Modular
  • Dynamic
  • Object-oriented
  • Portable
  • High level
  • Extensible in C++ & C

Now I need a face recognizer. My main aim is creating a Multi-View Face Recognition/Detection database, so I don’t need to develop a face recognizer myself, because there are many face recognizers developed by passionate developers on GitHub. I searched GitHub and found an amazing face recognizer developed by Adam Geitgey. Many thanks to him!

Let’s meet our face recognizer

Adam Geitgey’s face recognizer was developed in Python using OpenFace and dlib. Let’s summarize it quickly:

  1. Encode a picture using the HOG algorithm to create a simplified version of the image. Using this simplified image, find the part of the image that most looks like a generic HOG encoding of a face.
  2. Figure out the pose of the face by finding the main landmarks in the face. Once we find those landmarks, use them to warp the image so that the eyes and mouth are centered.
  3. Pass the centered face image through a neural network that knows how to measure features of the face. Save those 128 measurements.
  4. Looking at all the faces we’ve measured in the past, see which person has the closest measurements to our face’s measurements. That’s our match!
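To make step 4 concrete, here is a small sketch of the "closest measurements" idea: compare a new face's measurement vector with every stored vector using Euclidean distance and accept the nearest one only if it is close enough. The `closest_match` function is a hypothetical illustration in plain Python, not the recognizer's actual code. The 0.6 threshold mirrors the default tolerance of the face_recognition library, and you should treat it as a value to tune.

```python
import math
from typing import Dict, List, Optional, Sequence


def closest_match(known: Dict[str, List[Sequence[float]]],
                  query: Sequence[float],
                  tolerance: float = 0.6) -> Optional[str]:
    """Return the person whose stored encoding is nearest to `query`.

    `known` maps a person's name to one or more stored encodings.
    Returns None when even the nearest encoding is farther than `tolerance`.
    """
    best_name, best_distance = None, float("inf")
    for name, encodings in known.items():
        for encoding in encodings:
            distance = math.dist(encoding, query)  # Euclidean distance
            if distance < best_distance:
                best_name, best_distance = name, distance
    return best_name if best_distance <= tolerance else None


# Toy 2-dimensional vectors stand in for the real 128 measurements.
known_faces = {"Penny": [[0.0, 0.0]], "Sheldon": [[1.0, 1.0]]}
print(closest_match(known_faces, [0.1, 0.0]))
```

Storing several encodings per person, taken from different angles, is what lets a lookup like this cope with multi-view faces.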

If you want to get more information, please check it out!

Now we have a face recognizer and we can start to create our face database! First, we should choose the input video. I chose a scene from The Big Bang Theory TV series. Then we should choose the faces that our face recognizer will look for. I picked Penny and Sheldon as the targets. As you can see below, our face recognizer works even when the faces are turned in different directions, so we can catch, crop, and save multi-view face data. We are ready!!!

Faces are being tracked, cropped, and saved as images from video.
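Setting the targets and recognizing them in a frame could look roughly like the sketch below. It assumes Adam Geitgey's `face_recognition` package and OpenCV are installed. The reference image paths and the `load_targets`, `recognize_faces`, and `first_match` helpers are hypothetical names of my own, so treat this as an illustration rather than the project's exact code.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def first_match(names: Sequence[str], matches: Sequence[bool]) -> Optional[str]:
    """Return the name of the first target flagged as a match, or None."""
    for name, matched in zip(names, matches):
        if matched:
            return name
    return None


def load_targets(reference_images: Dict[str, str]) -> Tuple[List[str], list]:
    """Encode one reference photo per target, e.g. {"Penny": "penny.jpg"}."""
    import face_recognition  # third-party; imported lazily

    names, encodings = [], []
    for name, path in reference_images.items():
        image = face_recognition.load_image_file(path)
        found = face_recognition.face_encodings(image)
        if not found:
            raise ValueError(f"no face found in reference image: {path}")
        names.append(name)
        encodings.append(found[0])  # use the first face in the photo
    return names, encodings


def recognize_faces(frame_bgr, names, encodings, tolerance: float = 0.6):
    """Return (name, (top, right, bottom, left)) for each target seen in a frame."""
    import cv2
    import face_recognition

    # OpenCV frames are BGR, while face_recognition expects RGB.
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    locations = face_recognition.face_locations(rgb)
    results = []
    for location, encoding in zip(locations,
                                  face_recognition.face_encodings(rgb, locations)):
        matches = face_recognition.compare_faces(encodings, encoding,
                                                 tolerance=tolerance)
        name = first_match(names, matches)
        if name is not None:  # ignore faces that are not one of our targets
            results.append((name, location))
    return results
```

Faces that match none of the targets are dropped here, which keeps other people in the scene out of the database.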

Our face recognizer works pretty well, so let’s start to create our own face database! We can detect, recognize, and track the faces, so now it is time to crop and save them. Cropping and saving images is easy in Python: we just write code that crops the regions reported by the face recognizer, which are drawn as red boxes as you can see above. We will have a huge amount of data once we start cropping and saving images, so we should think about staying well organized. We should store the face data under an appropriate folder path. This is important because we don’t want to spend our time organizing the data by hand; it should be done programmatically, as you can see below.

Images are being saved from video with an appropriate path hierarchy.
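Here is one way the crop-and-save step could be sketched. The one-folder-per-person layout, the file naming scheme, and the `padded_box`, `face_path`, and `save_face` helpers are assumptions for illustration. The box order (top, right, bottom, left) follows what the face_recognition library returns, and the frame is assumed to be a NumPy array as delivered by OpenCV.

```python
import os
from typing import Tuple

Box = Tuple[int, int, int, int]  # (top, right, bottom, left)


def padded_box(location: Box, frame_height: int, frame_width: int,
               padding: float = 0.2) -> Box:
    """Grow a face box by `padding` on each side, clamped to the frame."""
    top, right, bottom, left = location
    pad_y = int((bottom - top) * padding)
    pad_x = int((right - left) * padding)
    return (max(0, top - pad_y), min(frame_width, right + pad_x),
            min(frame_height, bottom + pad_y), max(0, left - pad_x))


def face_path(root: str, person: str, video_name: str, frame_index: int) -> str:
    """Build <root>/<person>/<video_name>_<frame_index>.jpg (hypothetical layout)."""
    return os.path.join(root, person, f"{video_name}_{frame_index:06d}.jpg")


def save_face(frame, location: Box, root: str, person: str,
              video_name: str, frame_index: int) -> str:
    """Crop one face out of a frame and write it under the person's folder."""
    import cv2  # third-party; imported lazily

    top, right, bottom, left = padded_box(location, frame.shape[0], frame.shape[1])
    path = face_path(root, person, video_name, frame_index)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    cv2.imwrite(path, frame[top:bottom, left:right])  # rows first, then columns
    return path
```

Keeping one folder per person means the folder name can later serve directly as the label source when the images are indexed.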

That’s it!!! Now we can create our own face database! More input videos mean more face images, more face images mean more data, more data means big data, and big data is better data!!!

Now that we have acquired the face data, we’ll need to read it in our programs. In the demo applications I decided to read the images from a very simple CSV file. Why? Because it’s the simplest platform-independent approach I can think of. However, if you know a simpler solution, please ping me about it. Basically, all the CSV file needs to contain is one line per image, composed of the filename followed by a ; followed by the label (as an integer).


You don’t really want to create the CSV file by hand. I have prepared a little Python script that automatically creates the CSV file for you.
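The idea behind such a script can be sketched as follows. This is not the script from the repository, just a minimal illustration that assumes one subfolder per person under the database root and assigns each subfolder an integer label in alphabetical order. The `build_rows` and `write_csv` names and the list of image extensions are hypothetical.

```python
import os
from typing import List, Tuple

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def build_rows(database_root: str) -> List[Tuple[str, int]]:
    """Return (absolute_image_path, label) pairs, one label per person folder."""
    people = sorted(
        entry for entry in os.listdir(database_root)
        if os.path.isdir(os.path.join(database_root, entry))
    )
    rows = []
    for label, person in enumerate(people):
        for dirpath, dirnames, filenames in os.walk(os.path.join(database_root, person)):
            dirnames.sort()  # make the traversal order deterministic
            for filename in sorted(filenames):
                if filename.lower().endswith(IMAGE_EXTENSIONS):
                    full_path = os.path.abspath(os.path.join(dirpath, filename))
                    rows.append((full_path, label))
    return rows


def write_csv(rows: List[Tuple[str, int]], csv_path: str) -> None:
    """Write one 'filename;label' line per image."""
    with open(csv_path, "w", encoding="utf-8", newline="") as handle:
        for path, label in rows:
            handle.write(f"{path};{label}\n")
```

With a hypothetical layout such as `database/penny/` and `database/sheldon/`, every image under `penny` would get label 0 and every image under `sheldon` label 1, producing lines of the form `<absolute path to image>;0`.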

To summarize, the program detects, recognizes, and crops multi-view faces from video and saves them as images. Then it labels and indexes the database; in other words, it creates a data set from the database so you can read the database images from another program. Here is the flow diagram that summarizes what we have done so far to create our own multi-view face database.

The flow diagram of creating your own face data-set project.
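For the last stage of the flow, reading the data set from another program, a reader for the `filename;label` format might look like this sketch. The `read_dataset_csv` and `group_by_label` names are hypothetical, and the code only parses the index file; loading the actual image pixels is left to whichever image library the consuming program uses.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def read_dataset_csv(csv_path: str) -> List[Tuple[str, int]]:
    """Parse 'filename;label' lines into (filename, label) pairs."""
    samples = []
    with open(csv_path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:  # tolerate blank lines
                continue
            # Split on the last ';' so a path containing ';' still parses.
            path, separator, label = line.rpartition(";")
            if not separator:
                raise ValueError(f"line {line_number}: expected 'filename;label'")
            samples.append((path, int(label)))
    return samples


def group_by_label(samples: List[Tuple[str, int]]) -> Dict[int, List[str]]:
    """Collect the image paths that belong to each integer label."""
    grouped = defaultdict(list)
    for path, label in samples:
        grouped[label].append(path)
    return dict(grouped)
```

Grouping by label is handy for quick sanity checks, such as counting how many images were collected per person before training anything.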

More information and the source code are available in this GitHub repository.


