Preparing Your Dataset¶

To train your object detection model, you need to prepare your dataset in the YOLO format, in which each image is paired with a text file that lists one bounding box per line. To do this, we shall use LabelImg, an open source tool that makes the process simpler.

Step 1: Install LabelImg¶

For Windows

Install Python3 via https://www.python.org
Install PyQt5 and lxml via pip
pip3 install pyqt5 lxml
Download and unzip the file via https://github.com/johnolafenwa/DeepStack/releases/download/1.0/labelimgwindows.zip

For MacOS

Clone the LabelImg repository
git clone https://github.com/tzutalin/labelImg.git
Change into the cloned repository directory and run the commands below
pip3 install pyqt5 lxml # Install qt and lxml by pip
make qt5py3

For Linux
Install LabelImg via pip
pip3 install labelimg
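
Before moving on, it can help to confirm that the Python packages named in the install steps are visible to the interpreter you plan to use. The snippet below is a minimal sketch of such a check; the helper name missing_modules is hypothetical and is not part of LabelImg.

```python
import importlib.util

# Import names of the packages the install steps above rely on.
REQUIRED_MODULES = ["PyQt5", "lxml"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("PyQt5 and lxml are importable.")
```

If anything is reported as missing, rerunning the pip3 command for your platform with the same Python installation is usually the first thing to try.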

Step 2: Organize Your Dataset¶

Collect as many images as you can containing the object you want to detect. We suggest about 300 images for training and 50 images for testing as a starting point for good results. If you can collect thousands, that’s even better. You can source images from the web or from your camera feeds, as applicable.

Create a new folder for your dataset
Create train and test folders inside it
Put all your training images inside the train folder
Put all your test images inside the test folder

Your directory structure should look like this:

My-Dataset
- - - - - train (about 90% of your images go here)
- - - - - test (about 10% of your images go here)
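
If all your images start out in a single folder, the split described above can be scripted. The following is a rough sketch only: the function name split_dataset, the list of image extensions and the default 10% test fraction are assumptions chosen for illustration, and the images are copied rather than moved so the originals stay untouched.

```python
import random
import shutil
from pathlib import Path

# Assumed set of image file extensions; adjust to match your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def split_dataset(source_dir, dataset_dir, test_fraction=0.1, seed=0):
    """Copy images from source_dir into dataset_dir/train and dataset_dir/test."""
    source = Path(source_dir)
    images = sorted(
        p for p in source.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(images)

    test_count = round(len(images) * test_fraction)
    splits = {"test": images[:test_count], "train": images[test_count:]}

    for name, files in splits.items():
        target = Path(dataset_dir) / name
        target.mkdir(parents=True, exist_ok=True)
        for image in files:
            shutil.copy2(image, target / image.name)

    # Report how many images ended up in each folder.
    return {name: len(files) for name, files in splits.items()}
```

For example, calling split_dataset("all-images", "My-Dataset") would fill My-Dataset/train and My-Dataset/test, where "all-images" is a placeholder for wherever your collected images live.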

Step 3: Run LabelImg¶

On Windows

Run the file LabelImg.exe that you unzipped earlier

On MacOS

CD to the repo you cloned earlier and run

python3 labelImg.py

On Linux

Run

labelImg

Go to your LabelImg menu, select “View” and make sure “Auto Save Mode” is checked.

Click “Open Dir” at the top left and select the “train” directory where your training images are kept. The first image in the folder will be displayed.

Click “Change Save Dir” at the top left and select the same “train” folder, so that the annotation files are saved alongside your images.

Change Annotation to YOLO Format¶

Click the “PascalVOC” button in the toolbar to switch the save format to YOLO. The button label should now read “YOLO”.
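
To make the difference between the two formats concrete: Pascal VOC stores each box as pixel corner coordinates, while a YOLO annotation line holds a class index followed by the box centre, width and height, each divided by the image size. The sketch below illustrates that conversion; the function name to_yolo_line is hypothetical, and the six decimal places are an arbitrary choice for this example.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, image_width, image_height):
    """Convert a pixel-coordinate box to one line of a YOLO annotation file."""
    # Box centre, normalised by the image dimensions.
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    # Box size, normalised by the image dimensions.
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x100 pixel box with its top-left corner at (100, 50) in a 400x200 image.
print(to_yolo_line(0, 100, 50, 300, 150, 400, 200))
# prints: 0 0.500000 0.500000 0.500000 0.500000
```

You do not need to run this yourself, since LabelImg writes these lines for you once the YOLO format is selected.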


Step 4: Annotate Your Dataset¶

Now that you have loaded your images, set the save folder for the annotations and switched to the YOLO format, you can annotate your dataset. This example uses an image dataset of Google Glass.

Start annotating your images as follows:

Click the “Create RectBox” button at the bottom left and draw a box around the object you want to annotate.

Click the “Create RectBox” button again for each remaining object until all the objects in the image are annotated.
Once you are done, click the “Next Image” button on the middle left to annotate the next image.

As you annotate your images, a .txt file containing the box annotations for each image is saved in the “train” folder.

N.B: The annotation .txt file for each image is saved under the same name as the image file. For example, if you have images image_1.jpg, image_2.jpg, …, image_z.jpg, the annotation files will be saved as image_1.txt, image_2.txt, …, image_z.txt.
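
Because every image is expected to have a matching .txt file, a quick check after annotating can catch images you skipped. The snippet below is a sketch under a few assumptions: the helper name find_annotation_problems and the list of image extensions are hypothetical, and each annotation line is assumed to follow the usual YOLO layout of a class index plus four values between 0 and 1.

```python
from pathlib import Path

# Assumed set of image file extensions; adjust to match your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_annotation_problems(folder):
    """Return a list of human-readable problems found in an annotated folder."""
    problems = []
    for image in sorted(Path(folder).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip annotation files and anything else that is not an image
        label_file = image.with_suffix(".txt")
        if not label_file.exists():
            problems.append(f"{image.name}: no {label_file.name} found")
            continue
        lines = label_file.read_text().splitlines()
        for number, line in enumerate(lines, start=1):
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            if len(fields) != 5:
                problems.append(f"{label_file.name}:{number}: expected 5 fields")
                continue
            try:
                int(fields[0])
                values = [float(value) for value in fields[1:]]
            except ValueError:
                problems.append(f"{label_file.name}:{number}: non-numeric value")
                continue
            if not all(0.0 <= value <= 1.0 for value in values):
                problems.append(f"{label_file.name}:{number}: value outside 0..1")
    return problems
```

Calling find_annotation_problems("My-Dataset/train") returns an empty list when every image has a well-formed annotation file. An image that genuinely contains no objects may legitimately have an empty file, which this sketch accepts.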

Annotate Your Test Dataset¶

Repeat the process above for your test folder as well.