Why I wasn’t at the Vancouver Climate Strike

It was fun.
It was slacktivism.

A strike should have concrete goals and actionable items. There should be a way to quantify its success. It should take a narrative of this form: “We are marching because we want these things to happen. The purpose of this large march is to show how strong public sentiment is on this cause. We all agree that these things should happen, so government/company/whoever, please do them.”

“Showing you care” is not an actionable item. It is virtue signalling.

How to make your strike useful: Pick something concrete. I don’t care what. Americans want back into the Paris Agreement? Sure. Increased carbon tax? Why not. Do them both for all I care. But have something. Maybe even draft your own bill and send it to Ottawa. THEN you can strike. And then I will be there.

There are a lot of real and important strikes and protests going on around the world.
The Climate Strike was not one of them.


Someone who cares about the environment (but also doesn’t want class to be cancelled for no good reason).

Computer Vision Project

(UBC ENPH 353 Course Report)

Keywords: Robot Operating System (ROS), Keras, OpenCV

This is an overview of the work done in the ENPH 353 course.

Github repositories for some context: [1] [2]
Plate reader python notebook (uploaded to Colab for readability) [3]


This is a brand new course at UBC! So first I would like to give special thanks to Miti and Griffin at UBC for setting everything up. I have learned a lot in this course.

The goal of ENPH353 is to design a robot to navigate virtual environments and read license plates using machine learning. Fancy. Stay on the road, don’t hit the pedestrians, you know the drill.

For those who are familiar with MIT’s “Duckietown”, our task is quite similar to that. We even use the same framework to control our robots (ROS). The difference is that our track is not physical; it is instead modelled in a simulator called Gazebo, which integrates nicely with ROS.

Figure: our gazebo simulation environment. Thanks, Miti and Griffin!

This is the course we must navigate. We are restricted to only two methods of interfacing with the robot:

  1. The camera feed
  2. Twist commands (move forward, move backwards, turn left, turn right)

With these two I/Os, we must design a robot which will accurately report license plates and their location through a ROS message.
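Concretely, this boils down to one subscriber and one publisher in rospy. Here is a minimal sketch; the topic names are placeholders rather than our exact configuration:

import rospy
from geometry_msgs.msg import Twist
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()
rospy.init_node("plate_reader")
# Placeholder topic names; the real ones depend on the robot description.
drive = rospy.Publisher("/cmd_vel", Twist, queue_size=1)

def on_frame(msg):
    # Camera feed in: convert the ROS Image message to an OpenCV BGR array.
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    # ... line following / plate detection on `frame` goes here ...
    cmd = Twist()
    cmd.linear.x = 0.2   # drive forward at 0.2 m/s
    cmd.angular.z = 0.0  # set a nonzero turn rate (rad/s) to steer
    drive.publish(cmd)   # Twist command out

rospy.Subscriber("/camera/image_raw", Image, on_frame)
rospy.spin()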

Figure: Colormask


Yes, one could set up a full CNN plate reader which scans the whole image for license plate characters, such as this model. However, the training time for such a model is far from ideal, and given our short time frame it is a high-risk strategy to rely on one single complex method to do everything for us. Furthermore, since the plates are all the same size and color, it is extremely attractive to break the problem into two separate systems: one that finds the license plate and another that tells us what is on it.

Plate Detector

Figure: False Positives for the Plate Detector

We had originally planned for the plate detector to be another object-detection neural net, but due to time constraints we moved to an OpenCV color mask instead. It’s really nothing impressive, so feel free to skip this section.

Once the color mask was chosen, we passed it through an opening in OpenCV (that is, erosion and then dilation) to remove the random white pixels. We then passed that through findContours and filtered the results so that only rectangular shapes with a minimum bounding height and width were kept.
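For the curious, the pipeline looks roughly like this in code. The HSV thresholds and size cutoffs are illustrative stand-ins, not our tuned values:

import cv2
import numpy as np

def find_plate_candidates(frame_bgr):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Color mask: keep near-white pixels (low saturation, high value).
    mask = cv2.inRange(hsv, (0, 0, 180), (255, 40, 255))
    # Opening (erosion then dilation) removes the stray white pixels.
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    # [-2] grabs the contour list in both OpenCV 3 and OpenCV 4.
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_SIMPLE)[-2]
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w > 30 and h > 15:  # minimum bounding width and height
            boxes.append((x, y, w, h))
    return boxes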

Figure: Plate Detector

The problem here was that, for some reason, OpenCV also counted the road features as white things with 4 edges (see the false-positives figure above). This was a simple fix: filter for the purple in the image, once again find contours, and then scale the filled bounding box of the purple so that the white license plate is always covered. Using that bounding box as another bitwise-AND mask, we have the final product, which is effective and fast:

Observe that it is not perfect. Because the license plate is not included in our color mask (only the “true” white), the bounding box has simply been stretched a little. This works for head-on angles, but it leads to imperfection when the plate is viewed at an angle.
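Sketched out, the fix looks something like this (the purple HSV range and the scale factor are placeholder values):

import cv2
import numpy as np

def plate_region_mask(frame_bgr, scale=1.2):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Filter for the purple regions that sit next to the plates.
    purple = cv2.inRange(hsv, (130, 80, 80), (160, 255, 255))
    cnts = cv2.findContours(purple, cv2.RETR_EXTERNAL,
                            cv2.CHAIN_APPROX_SIMPLE)[-2]
    region = np.zeros(purple.shape, np.uint8)
    for c in cnts:
        x, y, w, h = cv2.boundingRect(c)
        # Stretch the filled purple box so the white plate is always covered.
        cx, cy = x + w // 2, y + h // 2
        w2, h2 = int(w * scale), int(h * scale)
        cv2.rectangle(region, (cx - w2 // 2, cy - h2 // 2),
                      (cx + w2 // 2, cy + h2 // 2), 255, -1)
    return region

# Keep only white-mask pixels that fall inside a stretched purple box:
# plate_mask = cv2.bitwise_and(white_mask, plate_region_mask(frame))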

Data Generation

Lab 3 of this course was building a CNN to read the characters of a virtual license plate. However, since the images were perfect (no skew or color deformation) and we could generate a lot of them, it achieved 100% accuracy within the first epoch…

In the real competition, running through Gazebo, where there is skew and color deformation due to lighting and camera effects, the task is not as simple.

To generate data, we considered three methods:

  1. Create a generator python script, which would generate plates and artificially skew them in front of collected backgrounds. This had the advantage of giving us the corner locations and text for an arbitrarily large number of images, but does it map to the real virtual world? (A rough sketch follows this list.)
  2. Collect real-camera data through the use of a bash script. Said script spawned the robot in a specific location, turned it ever so slightly, killed the Gazebo model, and then did the whole thing over again. This has the advantage that it is real-world data, but is it comprehensive? Unfortunately it also neglected to kill the xterm keyboard controller, so after running it overnight my computer looked like this:
  3. Of course, there was also the option of manually driving around, gathering and labelling data. Sounds like a bad idea, but once you realize that you could have over 1000 images of juicy REAL data in under 3 hours, it becomes pretty appealing.
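Method 1, the generator, looked roughly like the sketch below. The font, plate format, and skew range are placeholder details rather than the exact script:

import random, string
import numpy as np
import cv2
from PIL import Image, ImageDraw, ImageFont

def make_plate(font_path="UbuntuMono-Bold.ttf"):  # placeholder font
    # Assumed plate format: two letters followed by two digits.
    text = "".join(random.choices(string.ascii_uppercase, k=2)) + \
           "".join(random.choices(string.digits, k=2))
    img = Image.new("RGB", (240, 120), "white")
    draw = ImageDraw.Draw(img)
    draw.text((20, 20), text, font=ImageFont.truetype(font_path, 80),
              fill="black")
    plate = np.array(img)
    # Random perspective skew. Because we pick the corners ourselves, the
    # corner locations (and the text) come labelled for free.
    src = np.float32([[0, 0], [240, 0], [240, 120], [0, 120]])
    dst = src + np.float32(np.random.uniform(-25, 25, (4, 2)))
    M = cv2.getPerspectiveTransform(src, dst)
    skewed = cv2.warpPerspective(plate, M, (240, 120))
    return skewed, text, dst  # image, label, corner locations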

We elected to use methods 1 and 3. At the end of this process, we had over 700 simulated license plates and over 1000 real license plates. Example data is shown below:

Both types of data were unskewed and separated into individual characters (see the python notebook below) so they could be fed into the neural network. Each 40×80 character image looked like this:

Figure: Real data after characters cropped in python notebook. Notice that some characters are sometimes quite close to the chosen crop boundaries.
Figure: Synthetic data after characters cropped in python notebook. Notice how well behaved the synthetic data is compared to the real data.
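The unskew-and-crop step itself is a standard perspective transform. A minimal sketch, assuming a four-character plate and corners ordered top-left, top-right, bottom-right, bottom-left:

import cv2
import numpy as np

PLATE_W, PLATE_H = 160, 80  # four 40x80 characters side by side

def crop_characters(image, corners):
    # Map the four plate corners onto a flat PLATE_W x PLATE_H rectangle.
    dst = np.float32([[0, 0], [PLATE_W, 0],
                      [PLATE_W, PLATE_H], [0, PLATE_H]])
    M = cv2.getPerspectiveTransform(np.float32(corners), dst)
    flat = cv2.warpPerspective(image, M, (PLATE_W, PLATE_H))
    # Slice the flattened plate into 40x80 character images.
    return [flat[:, i * 40:(i + 1) * 40] for i in range(4)]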

Keras Neural Network

This is the python notebook housing the neural network. Most of it is just piping the data into the correct form (unskewing the simulated data and cropping, the result being what you saw above). But in the end we were left with this model and an accuracy of over 95% on real data:

Layer (type)                 Output Shape              Param #   
conv2d_4 (Conv2D)            (None, 76, 36, 32)        832       
max_pooling2d_3 (MaxPooling2 (None, 38, 18, 32)        0         
conv2d_5 (Conv2D)            (None, 36, 16, 32)        9248      
max_pooling2d_4 (MaxPooling2 (None, 18, 8, 32)         0         
conv2d_6 (Conv2D)            (None, 16, 6, 32)         9248      
max_pooling2d_5 (MaxPooling2 (None, 8, 3, 32)          0         
conv2d_7 (Conv2D)            (None, 6, 1, 32)          9248      
flatten_1 (Flatten)          (None, 192)               0         
dense_2 (Dense)              (None, 60)                11580     
dense_3 (Dense)              (None, 36)                2196      
Total params: 42,352
Trainable params: 42,352
Non-trainable params: 0
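
Written out, that architecture is roughly the following. The kernel sizes and layer widths follow from the printed shapes and parameter counts; the activations and the grayscale 80×40 input encoding are not recorded in the summary, so those are assumptions:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (5, 5), activation="relu", input_shape=(80, 40, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(32, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Conv2D(32, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Conv2D(32, (3, 3), activation="relu"),
    Flatten(),
    Dense(60, activation="relu"),
    Dense(36, activation="softmax"),  # 26 letters + 10 digits
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])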

That’s all, Folks

There you are! It works. It is accurate enough.

These are some of the cool things I have found while doing research, hopefully they will aid in your next computer vision project:

Really cool license plate recognition (though out of the time scope of our project)

YOLOv3 (I really love Redmon’s approach to academic writing here. What a fantastic costly signal for his work)

Creating your own object detector – Towards Data Science