While attending Tesla’s AI Day event, I learned, along with the rest of the world watching on the livestream, that Tesla’s Autopilot team works with an unfathomable amount of data. Along with Elon Musk, the Autopilot team presented Tesla’s latest achievements in artificial intelligence, data, and supercomputing.
Tesla is working to bring Full Self-Driving cars to life, with the goal of helping prevent the millions of accidents that happen every year. This requires massive amounts of data in the form of images and videos obtained from real-world driving. Those videos and images need labels to show which objects are present and how the AI should respond. Beyond that, the system needs to learn to generalize: to figure out for itself how to respond to new images and videos.
Massive Data Sets
Andrej Karpathy, Tesla’s director of AI, explained that neural network architectures alone aren’t enough; Tesla needs massive data sets to train the hundreds of millions of parameters inside those networks to the correct values. More specifically, Tesla needs these datasets labeled in vector space. That raises the question of how one accumulates enough vector-space examples to train hundreds of millions of parameters. Karpathy pointed out that Tesla has examples like these in the millions.
Stopping here for just a moment: that’s a lot of data. Karpathy noted that, over time, Tesla’s data labeling team has grown to over 1,000 professional labelers who work alongside the team’s engineers. There’s even a team dedicated to developing and maintaining all of the infrastructure for data labeling. He explained that today, the teams label directly in vector space, and in video, rather than in the individual images they started with years ago. However, even this wasn’t enough. Karpathy explained that both computers and people have their flaws and strengths. For example, computers are good at geometry reconstruction and triangulation tracking, while people are great at semantics.
“Really, for us, it’s becoming a story of how do humans and computers collaborate to create these vector space data sets?”
Karpathy mentioned that before moving data labeling in-house, Tesla had been working with a third party. That wasn’t quite working out, though: high latency and low quality were major issues. So, Tesla vertically integrated data labeling by bringing it in-house, as Tesla is known to do. Something many people overlook is that Tesla’s data labeling team is based in the US. Many companies would outsource these jobs and pay low-cost offshore prices for human labelers; Tesla hires people in California. This is just one more reason to admire Tesla, especially as an American company and job provider.
More Data Than Human Labelers
Ashok Elluswamy, Tesla’s Director of Autopilot Software, also participated in the presentation. He explained that even though Tesla has a lot of human labelers, the amount of training data the networks need far outstrips what humans can label by hand. To solve this, he explained, the team is building a massive auto-labeling pipeline. He showed an example of how the team labels a single clip. Each clip can run 45 seconds to a minute and carries a lot of data: video, IMU data, GPS, and odometry.
The clips are collected from either customer cars or engineering cars and are then sent to Tesla’s servers, where many of the team’s neural networks run offline to produce intermediate results such as point mapping. After this, the clip goes through robotics techniques and algorithms that produce a final set of labels that can be used to train the networks.
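To make the flow concrete, here is a minimal sketch of an auto-labeling pipeline of the kind Elluswamy describes: a clip with several sensor streams goes through offline networks, and their intermediate results are fused into final labels. Every name here (`Clip`, `run_offline_networks`, `reconcile_labels`) is hypothetical; Tesla’s actual pipeline is far more sophisticated.

```python
from dataclasses import dataclass

@dataclass
class Clip:
    """One 45-60 second clip with its sensor streams (illustrative only)."""
    video_frames: list
    imu: list        # inertial measurement unit readings
    gps: list
    odometry: list

def run_offline_networks(clip: Clip) -> dict:
    """Stand-in for the heavy neural networks run offline on the servers,
    producing intermediate results (here, a fake per-frame detection)."""
    return {"detections": [f"objects@frame{i}" for i in range(len(clip.video_frames))]}

def reconcile_labels(intermediate: dict) -> list:
    """Stand-in for the algorithms that fuse intermediate results
    into a final set of training labels."""
    return [{"frame": i, "label": d} for i, d in enumerate(intermediate["detections"])]

def auto_label(clip: Clip) -> list:
    """The whole pipeline: offline networks, then label reconciliation."""
    intermediate = run_offline_networks(clip)
    return reconcile_labels(intermediate)

clip = Clip(video_frames=[0, 1, 2], imu=[], gps=[], odometry=[])
labels = auto_label(clip)
print(len(labels))  # → 3, one label per frame
```

The point of the structure is that no human touches the loop: sensor data goes in, and trainable labels come out.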
At times it seems as if Tesla’s AI team speaks an entirely different language, but in a nutshell, Tesla has developed a process to sort and label its data automatically. This is done through the collaboration of people and computers that Karpathy described.
Simulation In A Video Game
Elluswamy also explained another way data is labeled: simulation. Tesla’s simulation system is a video game with Autopilot as the player. He showed the simulation, with a logo above one car to mark it as the Autopilot vehicle, as it made a left turn. He explained that because this is a simulation, it starts from the vector space, so it has perfect labels. Some of the labels produced were vehicle cuboids with kinematics and depth. What’s neat is that if the team wants to add a new task, it can produce the new labels quickly, since the vector space already exists and the engineers only need to write the code that emits the labels.
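The reason the simulation yields “perfect labels” is that the simulator already knows every object’s true state (the vector space), so labels like vehicle cuboids with kinematics are read straight out of that state rather than inferred from pixels. A tiny sketch, with all names and numbers purely illustrative:

```python
import math
from dataclasses import dataclass

@dataclass
class SimVehicle:
    """Ground-truth state of one simulated vehicle (illustrative only)."""
    x: float
    y: float
    heading: float          # radians
    speed: float            # m/s
    length: float = 4.7     # cuboid dimensions in meters
    width: float = 1.9
    height: float = 1.5

def step(v: SimVehicle, dt: float) -> SimVehicle:
    """Advance the simulated vehicle by dt seconds along its heading."""
    return SimVehicle(
        x=v.x + v.speed * math.cos(v.heading) * dt,
        y=v.y + v.speed * math.sin(v.heading) * dt,
        heading=v.heading, speed=v.speed,
        length=v.length, width=v.width, height=v.height,
    )

def ground_truth_label(v: SimVehicle) -> dict:
    """Emit a cuboid-with-kinematics label directly from simulator state --
    no human labeler or perception network in the loop."""
    return {
        "cuboid": (v.length, v.width, v.height),
        "position": (v.x, v.y),
        "velocity": (v.speed * math.cos(v.heading),
                     v.speed * math.sin(v.heading)),
    }

car = SimVehicle(x=0.0, y=0.0, heading=0.0, speed=10.0)
labels = [ground_truth_label(car)]
car = step(car, dt=0.1)          # 0.1 s later the car has moved 1 m forward
labels.append(ground_truth_label(car))
```

This is also why adding a new task is cheap in simulation: the true state is already there, so supporting a new label type is just writing another function like `ground_truth_label`.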