Why Self-Driving Cars Are Still on the Horizon and Not on the Street

- Marcel Gutsche

You probably remember sentences from five years ago claiming that self-driving cars would be a reality by 2020 at the latest. Well, there has been some progress in this regard, but even Tesla, pushing the frontier of autonomous vehicles, has not been able to fulfill this prophecy. So what went wrong? Is the whole deep learning thing just another hype, as some have predicted? We believe that only one little ingredient is missing: high-quality data.

Example from our benchmark. Top: input image; middle: Lidar depth; bottom: dense depth from rabbitAI.

We at rabbitAI want to deliver this kind of data and have taken our first steps toward this goal. You can read our recently published benchmark paper, in which we explain the data and the metrics for evaluating algorithms. In the following, let us briefly look at how this helps get self-driving cars on the street.

Getting a car to maneuver independently of human interaction is quite complicated. One important ingredient is the vision capability of the car. This includes its sensors, but also the ability to interpret the data stream correctly; for example, to distinguish the silhouette of a person unexpectedly rushing onto the street from that of a plastic bag.

To achieve decent performance in these vision tasks, modern algorithms require training data: a set of input images paired with a human-validated, desired output. The output can be either predictions of object classes in the images (e.g., a person, a car, a region of free street, and so on) or geometrical information such as depth and the inclination of surfaces.
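To make this concrete, here is a minimal sketch of how such a supervised training signal could look for per-pixel depth labels. The function name, shapes, and values are illustrative assumptions, not taken from our dataset; the mask handles the common case that ground truth exists only for some pixels.

```python
import numpy as np

def l1_depth_loss(predicted, ground_truth, valid_mask):
    """Mean absolute depth error over pixels that have a valid label.

    Ground-truth depth maps are often incomplete (e.g., sparse sensor
    returns), so a validity mask restricts the loss to labeled pixels.
    """
    errors = np.abs(predicted - ground_truth)
    return errors[valid_mask].mean()

# Toy 2x2 "depth maps": the network predicts a depth for every pixel,
# while the label provides depth only where a measurement exists.
pred = np.array([[2.0, 5.0], [8.0, 3.0]])
gt   = np.array([[2.5, 0.0], [7.0, 3.0]])  # 0.0 marks "no measurement"
mask = gt > 0.0

print(l1_depth_loss(pred, gt, mask))  # → 0.5 (mean of 0.5, 1.0, 0.0)
```

During training, a loss like this is minimized over many image/label pairs, which is why the quality and density of the labels directly bound the quality of the learned model.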

Creating such datasets is a tedious task, but it is indispensable for increasing the performance of autonomous vehicles. A majority of the industry focuses on Lidar for measuring distances, and our predecessor benchmark KITTI also uses Lidar measurements for its depth ground truth. Unfortunately, Lidar has some shortcomings that affect this data, and hence also the performance of algorithms trained on it.

We created a small dataset with depth information relying solely on camera information. It is included in our benchmark (click on an algorithm in the leaderboard), where we test various algorithms on the task of mono-depth prediction: estimating the depth of a scene from a single image.
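Mono-depth predictions are typically scored against ground-truth depth with a handful of standard metrics. The two below (absolute relative error and RMSE) are common in the depth-estimation literature; the exact metrics our benchmark uses are defined in the paper, so treat this as an illustrative sketch.

```python
import numpy as np

def abs_rel(pred, gt):
    """Absolute relative error: mean of |pred - gt| / gt.

    Relative error weighs mistakes at close range more heavily,
    where accurate depth matters most for driving.
    """
    return np.mean(np.abs(pred - gt) / gt)

def rmse(pred, gt):
    """Root mean squared error in the same unit as the depth values."""
    return np.sqrt(np.mean((pred - gt) ** 2))

# Toy per-pixel depths in meters for three pixels.
pred = np.array([2.0, 4.0, 10.0])
gt   = np.array([2.0, 5.0,  8.0])

print(abs_rel(pred, gt))  # → 0.15
print(rmse(pred, gt))     # ≈ 1.29
```

A leaderboard then simply ranks submitted algorithms by such scores, aggregated over all images in the test set.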

We plan to add more scenes to the benchmark. So go ahead and get our data, submit your algorithm, and tell us your opinion! Do you think we will have self-driving cars by 2025?