
Traffic Sign Detection and Classification through CNN

Autonomous cars must make real-time decisions based on their perception of the surroundings, so the accuracy of a CNN classifier must be close to 100%. A single wrong classification can cause loss of life and property. I recently built a CNN from scratch to detect and classify traffic signs using open source data.

Problem Statement: Detect and classify traffic signs

Methodology: Build, visualize and refine CNNs


Major Steps:



Consistently achieving over 99% validation accuracy and around 97–98% test accuracy.

Identified specific cases for further improvement: signs that are dark, reflective, or partially hidden, and the need for more labeled samples or augmentation for under-represented classes.


The iterator returns one batch of images and labels on demand, without filling RAM or disk.


Relatively new techniques, internal nuances in implementation, insufficient documentation or examples

ImageDataGenerator fails to generate a finite set of tests for model.evaluate in TensorFlow 2.2

Setup Instructions

I have only tested this on a laptop with an Nvidia GPU, running TensorFlow 2.1.0 on Ubuntu Linux.

Theoretically, this should work on Windows or cloud offerings such as AWS where Jupyter notebooks can run, or in Google Colab, but I have not tested in those environments at the time of this writing.

Laptop Setup Instructions:

Note that on Linux, you may need to set up the environment and notebook separately with the following instructions:

conda create -n DL2020 python=3.7

conda activate DL2020

conda env list

pip install tensorflow

pip install jupyter

jupyter notebook


>>> import tensorflow as tf


After testing that each of the above works, download the dataset from Kaggle. Extract the files into the folder where you will create your ipynb file using Jupyter. You need to create this folder structure:

Change all file paths in the dataframes to lower case for cleaner coding, and convert every ClassId to a string, because CNNs using categorical labels require strings or tuples. So that the labels still sort in numeric order as strings, pad every 1-digit ClassId with a leading 0.
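A minimal sketch of this cleanup with pandas, assuming the CSV files have been loaded into a dataframe with a ClassId column and a file path column. The column name Path and the sample rows are assumptions made for illustration, not values taken from the dataset.

```python
import pandas as pd

# Hypothetical rows standing in for the training CSV; "Path" is an assumed column name.
df = pd.DataFrame({
    "Path": ["Train/1/00001_A.PNG", "Train/10/00010_B.png", "Train/2/00002_C.png"],
    "ClassId": [1, 10, 2],
})

# Lower-case every file path.
df["Path"] = df["Path"].str.lower()

# Convert ClassId to a string and pad 1-digit ids with a leading 0,
# so that string sorting matches numeric order ("02" sorts before "10").
df["ClassId"] = df["ClassId"].astype(str).str.zfill(2)

sorted_labels = sorted(df["ClassId"].unique())
print(sorted_labels)
```

Without the padding, the string labels would sort as "1", "10", "2", which makes plots and confusion matrices harder to read.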

From the meta, train, and test CSV files, I located 43 classes of signs with various shapes, colors, and sizes. Although the given image dataset can be analyzed in grayscale, I chose RGB so that the approach extends to other countries where color may be used to distinguish signs.

Visualize classes of symbols in this dataset from the metadata

Visualize images


The images range from bright to dark, grainy to clear, and so on. However, I’ll not further preprocess the images for now because I don’t want to learn for example only how the traffic signs would appear on a sunny day or in the evening or close up or far away. Let’s revisit preprocessing if I need to later. Also leaving all images in color rather than grayscale to keep the model robust enough to later on add classes that may depend on the color of the traffic sign.

Image Classes Vary on Width & Height:


Visual inspection suggests that widths and heights generally cluster around 50, while some classes sit at the extremes on average, e.g. 14 (Stop) at the high end and 17 (No Entry) at the low end. I will note these, but for now I set all images to a width and height of 50.

Image Distribution Across Classes


The data set isn’t balanced. This could create difficulties in learning and in interpreting accuracy, skewing against classes with small training sample sizes.

This does not necessarily imply that there is a problem. I therefore proceed with the data as is, look at the results, and then consider data augmentation or other approaches to address the imbalance.
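One quick way to put a number on the imbalance is to count images per class. The sketch below uses a short list of made-up labels in place of the real ClassId column, so the counts are purely illustrative.

```python
from collections import Counter

# Hypothetical padded class labels; the real ones would come from the training dataframe.
labels = ["01"] * 6 + ["02"] * 3 + ["14"] * 1

counts = Counter(labels)
total = sum(counts.values())

# Share of the training set held by each class, largest first.
for class_id, n in counts.most_common():
    print(f"class {class_id}: {n} images ({n / total:.0%})")

# Ratio between the largest and smallest class as a rough imbalance measure.
imbalance_ratio = max(counts.values()) / min(counts.values())
print(imbalance_ratio)
```

A large ratio points to the classes where augmentation or more labeled samples would matter most.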


The baseline model architecture is inspired by CNN models from our course and ones I tried while preparing homework assignments. The CNN model consists of 2 logical parts: feature extraction and classification.

The inputs to the model are color images of width and height 50, with the color represented by 3 channels, RGB. The dimension of the input is therefore 50 x 50 x 3.

For feature extraction from these images, I use convolutional layers with small 3×3 filters, which help summarize the presence of features in an input image. Each uses ReLU (rectified linear unit) activation. The layers use "same" padding to ensure that the output of each layer matches its input in width and height.

I also use BatchNormalization to help each Conv2D layer learn more independently from other layers. It works by normalizing the inputs to a layer, which reduces the degree to which the hidden unit values shift during training. This layer also helps the model train and converge faster, and makes more activation functions viable.

These are followed by max pooling layers that extract the most activated presence of a feature based on filters applied in the preceding convolutional layers.

Together the convolutional layer and the max pooling layer form a logical block that detects features. These blocks are stacked with the number of filters expanding from 32 to 64 to 128 in my CNN.
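To see how the stacked blocks shrink the image, the sketch below traces the output size through three blocks in plain Python. It assumes one convolution per block with "same" padding and 2×2 max pooling with stride 2; the pooling size is an assumption, since the text above does not state it.

```python
import math

def same_conv_size(size, stride=1):
    """Output width/height of a convolution with "same" padding."""
    return math.ceil(size / stride)

def max_pool_size(size, pool=2, stride=2):
    """Output width/height of max pooling without padding."""
    return (size - pool) // stride + 1

size = 50
shapes = []
for filters in (32, 64, 128):
    size = same_conv_size(size)  # 3x3 conv with "same" padding keeps the size
    size = max_pool_size(size)   # assumed 2x2 pooling with stride 2
    shapes.append((size, size, filters))

print(shapes)  # [(25, 25, 32), (12, 12, 64), (6, 6, 128)]

# Number of values handed to the classification part after flattening.
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(flattened)  # 4608
```

Under these assumptions, each block halves the width and height while the number of filters doubles.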

The output of the feature extraction part of the model becomes the input into the classification part of the model. For this to work, I must reduce the dimensions to a flat structure, which the Flatten layer does. Next, to do the actual classification, I use a fully connected (Dense) layer to interpret the flattened input and another Dense layer to predict one of the 43 classes of traffic signs.

Dropout is a simple technique that randomly drops nodes from the network during training. This has a regularizing effect that reduces overfitting, because the remaining nodes are forced to fill in the missing information. The Dropout layer makes this easy to apply. I use it to reduce overfitting and also to support classification as the network narrows down to the 43 output classes.
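The architecture described above can be sketched in Keras roughly as follows. This is an illustration rather than the exact model from my notebook: the 2×2 pooling size, the Dense width of 128, the Dropout rate of 0.5, and the softmax output activation are assumptions chosen to make the example complete.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(filters):
    """One feature-extraction block: Conv2D -> BatchNormalization -> MaxPooling2D."""
    return [
        layers.Conv2D(filters, (3, 3), padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),  # assumed pool size
    ]

model = tf.keras.Sequential(
    [tf.keras.Input(shape=(50, 50, 3))]  # 50 x 50 RGB images
    + conv_block(32) + conv_block(64) + conv_block(128)
    + [
        layers.Flatten(),
        layers.Dense(128, activation="relu"),    # assumed width
        layers.Dropout(0.5),                     # assumed rate
        layers.Dense(43, activation="softmax"),  # one output per traffic sign class
    ]
)
model.summary()
```

The summary printed at the end is a convenient check that the final layer has 43 outputs and that the feature maps shrink as expected from block to block.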

Settings and Hyperparameters

I chose Adam optimizer based on the results from experiments.

A batch size of 64 gives a good balance of performance versus training time.

The default learning rate of 0.001 gives the best outcomes.

Weight decay rates of 0.0001 and 0.001 appear roughly equivalent from an accuracy perspective, and the difference from the default of 0 is minor.


The accuracy on the test set does not reach the 99% seen on validation, but at about 2% lower it is still decent.

However, what does accuracy mean here?

Understanding Model Accuracy

True Positives (TP): The correctly predicted positive values, which means that the actual class is yes and the predicted class is also yes.

True Negatives (TN): The correctly predicted negative values, which means that the actual class is no and the predicted class is also no.

False positives and false negatives: These occur when the predicted class contradicts the actual class.

False Positives (FP): The actual class is no but the predicted class is yes.

False Negatives (FN): The actual class is yes but the predicted class is no.

I can calculate Accuracy, Precision, Recall and F1 score using the above:

Accuracy: The ratio of correctly predicted observations to the total observations. Accuracy is a great measure, but only for symmetric datasets where the numbers of false positives and false negatives are almost the same. Otherwise, you have to look at other metrics to evaluate the performance of your model.

Accuracy = (TP+TN)/(TP+FP+FN+TN)

Precision: The ratio of correctly predicted positive observations to the total predicted positive observations. High precision means few false positives.

Precision = TP/(TP+FP)

Recall (Sensitivity): The ratio of correctly predicted positive observations to all observations in the actual class.

Recall = TP/(TP+FN)

F1 score: The harmonic mean of Precision and Recall. This score therefore takes both false positives and false negatives into account. Intuitively it is not as easy to understand as accuracy, but F1 is usually more useful than accuracy, especially if you have an uneven class distribution. Accuracy works best if false positives and false negatives have similar cost. If the costs of FP and FN are very different, it's better to look at both Precision and Recall.

F1 Score = 2*(Recall * Precision) / (Recall + Precision)
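The four formulas translate directly into code. The sketch below applies them to made-up counts for a single class treated as "yes" against all other classes; the numbers are hypothetical and only illustrate how accuracy and F1 can diverge.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from the four counts for one class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * recall * precision / (recall + precision)
          if (recall + precision) else 0.0)
    return accuracy, precision, recall, f1

# Hypothetical counts: 13 images truly belong to the class, 87 do not.
accuracy, precision, recall, f1 = classification_metrics(tp=8, tn=85, fp=2, fn=5)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

With these counts accuracy is 0.93 while F1 is only about 0.70, because the many true negatives inflate accuracy for a small class.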

Wrong Predictions

Most wrong predictions fall into 2 categories

Potential solutions

The images that are predicted wrong belong to a few classes. The concentration of wrong predictions in certain classes indicates both that other classes contain similar images, and that the labeled data for these classes is too limited to help the model distinguish between them.

Note: Due to space considerations in this report, use the class labels at the bottom to read class names of all 4 distributions below.

Confusion Matrix

Several Pedestrian signs in the test set were interpreted as other classes. It is valuable to ask data collection teams to source not only more images of the class in question, but also more variations of the classes it is mistaken for.
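A simple way to list which classes get mistaken for which is to count the (actual, predicted) pairs off the diagonal of the confusion matrix. The labels below are hypothetical and stand in for the real test labels and model predictions.

```python
from collections import Counter

# Hypothetical true and predicted class labels for a handful of test images.
y_true = ["27", "27", "27", "27", "11", "11", "18", "18"]
y_pred = ["27", "11", "11", "18", "11", "11", "18", "27"]

# Count every (actual, predicted) pair; pairs that differ are the errors.
confusion = Counter(zip(y_true, y_pred))
errors = Counter({pair: n for pair, n in confusion.items() if pair[0] != pair[1]})

# Most frequent confusions first.
for (actual, predicted), n in errors.most_common():
    print(f"actual {actual} predicted as {predicted}: {n}")

top_pair, top_count = errors.most_common(1)[0]
```

The top pairs indicate both the class that needs more images and the class it is being mistaken for.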

Visualization to Understand & Improve

I visualize the convnet filters, activations, and heatmaps, including heatmaps superimposed on images.

Next Steps

There are several next steps for future development and further improvement I would love to take.
