Tesla Full Self Driving Is Using GPT For Vision — Dr. Know It All Explains What This Means

June 9, 2022, by Johnna Crider


Tesla’s Full Self-Driving is using generative pre-trained transformers (GPT) for vision, Elon Musk tweeted recently. He added that the GPTs are running natively on Tesla TRIP chips versus needing to round trip to iGPU. I think it’s important to take a closer look at this, because it is the heart and soul of FSD. Thankfully, we have “Dr. Know It All Knows It All” to translate what all of this means. We should all be open to learning new things, and that’s why I’m writing this today. I’m also learning.

Elon Musk’s initial tweet was a response to @JeffTutorials, who asked Elon Musk to add software release notes into the Tesla app, adding that it would be nice to see what was new right from the phone. In that thread, Elon Musk noted that the transformers are replacing C heuristics for post-processing of the vision neural networks’ giant bag of points.

Some thoughts on this tweet thread. I hope they're useful! https://t.co/CCEfh924zP

— DrKnowItAll (@DrKnowItAll16) June 2, 2022

I asked Dr. Know It All to share a bit more about TRIP chips, and he pointed me to a project from the Department of Computer Science at The University of Texas at Austin. I think, but am not 100% sure, that Elon was referring to TRIPS, which is a type of microprocessor architecture. You can read up on the project here.

In the tweet below, KL Manish shared a definition of a TRIP chip and Elon Musk confirmed this.

Yeah

— Elon Musk (@elonmusk) June 2, 2022

Dr. Know It All noted that Elon Musk revealed a lot of useful information, and his video is a short dive into what exactly Elon Musk is talking about and why it matters. I’m sure Jeff didn’t plan on initiating a conversation about artificial intelligence and GPT, and Elon’s reply to Jeff is a bit off the topic. What Jeff was referring to was making the release notes available in the Tesla app as well as on the screen of the car. It’s a brilliant suggestion and would make taking screenshots of the release notes easier for those who share them on Twitter for us writers to write about.

Dr. Know It All explained that GPT is something that OpenAI is working on — specifically GPT3. GPT3 has 175 billion parameters.

“It’s an absolutely massive network that they’re doing. Now, I’m not saying GPT3 is what Tesla is using here but I just wanted to put that as a contextual element there.”
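To give a rough sense of that scale, here is a back-of-the-envelope sketch of how much memory 175 billion parameters would need just to store the weights. The bytes-per-parameter values are assumptions about numeric precision (32-bit, 16-bit, and 8-bit storage), not anything Tesla or OpenAI has stated, and the helper name `weight_memory_gb` is made up for illustration.

```python
# GPT-3 parameter count as cited in the article.
PARAMS = 175_000_000_000


def weight_memory_gb(num_params, bytes_per_param):
    """Memory needed to hold the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumed storage precisions; real deployments may differ.
fp32_gb = weight_memory_gb(PARAMS, 4)  # 700.0
fp16_gb = weight_memory_gb(PARAMS, 2)  # 350.0
int8_gb = weight_memory_gb(PARAMS, 1)  # 175.0

print(fp32_gb, fp16_gb, int8_gb)
```

Even at one byte per parameter, the weights alone would take about 175 GB, which helps explain why a network of that size is only context here and not a candidate for running in a car.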

He explained that Tesla’s Hardware 3 has a lot of chips in it, and the sub chip is a TRIP chip which basically does neural network operations as opposed to a GPU. When Elon Musk referred to “needing to round trip to the iGPU,” he was referring to the GPU on board the Tesla Hardware 3.

“The deal is that it is just faster, essentially, to run. Instead of having to run some of the calculations on the TRIP chip which has neural network architecture on it and then also some of the code on the GPU and having to round trip all that stuff, it’s actually much faster to run it all natively on the TRIP chip. Also, the code just runs faster on things that are designed for — the GPUs were designed specifically to be used for gaming and rendering polygons and stuff. And people in artificial intelligence just realized that that was a very similar type of calculation to what you do when you are working on the dot product and add for doing the whole giant matrix multiplication thing that is the central part of doing deep neural network training. They were like, ‘oh, we can use this, but it’s not ideally suited for it.’”
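The “dot product and add” he mentions can be illustrated with a minimal sketch in plain Python. This is not Tesla’s code; the function names and the tiny example numbers are hypothetical, and real hardware performs these multiply-and-add steps in parallel across very large matrices.

```python
def dot(u, v):
    """Dot product built from repeated multiply-and-add steps."""
    acc = 0.0
    for a, b in zip(u, v):
        acc += a * b  # one multiply-accumulate operation
    return acc


def matmul(A, B):
    """Multiply A (n x k) by B (k x m); both are lists of rows."""
    cols = list(zip(*B))  # columns of B
    return [[dot(row, col) for col in cols] for row in A]


def dense_layer(x, W, b):
    """One fully connected layer: ReLU(xW + b)."""
    (y,) = matmul([x], W)  # treat x as a 1 x k matrix
    return [max(0.0, yi + bi) for yi, bi in zip(y, b)]


# Made-up input, weights, and biases for illustration only.
x = [1.0, 2.0]
W = [[0.5, -1.0, 0.0],
     [0.25, 0.5, 1.0]]
b = [0.0, 0.5, -3.0]

out = dense_layer(x, W, b)
print(out)  # [1.0, 0.5, 0.0]
```

Every output value is a chain of multiplications and additions, which is the same kind of arithmetic a GPU does when transforming polygons. That overlap is why GPUs were adopted for neural networks before dedicated chips existed.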

He explained that chips like the Apple M1, Tesla’s Hardware 3, and even Google’s Tensor Processing Unit are among several new types of hardware designed specifically for AI training, which is what sets the TRIP chip apart from the GPU. The GPU is great for rendering things on a screen, such as video, but it’s not ideally optimized for AI training.

“You want to use something that’s ideally optimized for AI training if possible and that’s what the TRIP chip actually is, and plus the fact that you don’t have to round trip this. You don’t have to pull things out of memory, go into the TRIP chip, go into the iGPU back to the TRIP chip back to memory. It’s just saving a whole bunch of latency, which is exactly what KL Manish actually said.”

The end result is efficiency since the time isn’t being wasted going back and forth from memory to the TRIP chip to the iGPU and back again.

Using The GP Part Of GPT For a Lane Prediction Transformer

When James Douma, an expert on AI, asked if Elon was using GPTs for vision, Dr. Know It All agreed that it was a pretty wild thing. He pointed out that when you think about GPT3, it’s a massive neural network. James noted that he thought the earlier comment meant that Tesla was doing the GP part of GPT and that it would make sense to use GP for a lane prediction transformer, pointing out that the transformers were replacing C heuristics for post-processing of the vision neural networks’ giant bag of points. Dr. Know It All added,

“Essentially, transformers can be used without generative pre-trained. They’re just an architecture that neural networks have. It’s a very modern architecture.”

He added that when Andrej Karpathy talked about it, he’d said that they would use spatially based transformers — initially, transformers were language-based models. This made me think of predictive texting. He added that it’s great information to have for a language model, but they realized that it could be done visually since they can give spatial information about the location of pixels in an image or a video. They can transform (predict) what’s in those spaces, such as a red cat walking to the store.
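The idea of giving a transformer spatial information can be sketched in a few lines. This is a heavily simplified illustration and not Tesla’s architecture: it appends each image patch’s grid position to its features and runs one attention step with no learned weights, whereas real vision transformers use learned projections and position embeddings. The patch values and function names are made up.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def add_position(patch_features, row, col):
    """Append the patch's grid location so attention can use spatial info."""
    return patch_features + [float(row), float(col)]


def self_attention(tokens):
    """Single-head attention with identity projections (no learned weights)."""
    dim = len(tokens[0])
    outputs = []
    for q in tokens:
        # Score every token against the query with a scaled dot product.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        # Each output is a weighted blend of all tokens.
        mixed = [sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(dim)]
        outputs.append(mixed)
    return outputs


# A hypothetical 2x2 grid of image patches, each with two made-up features.
grid = [[[0.9, 0.1], [0.8, 0.2]],
        [[0.1, 0.9], [0.2, 0.7]]]
tokens = [add_position(grid[r][c], r, c) for r in range(2) for c in range(2)]
attended = self_attention(tokens)

for token, result in zip(tokens, attended):
    print(token, "->", [round(v, 3) for v in result])
```

Because each token carries its row and column, the attention scores depend on where a patch sits in the image as well as what it contains, which is the basic trick that lets a language-style model work on pictures.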

Dr. Know It All shared some additional thoughts with me.

“While it does not appear that Tesla is using something as massive as GPT-3 (which at 175 billion parameters would be insane for a car), the fact that they’re able to use multiple GPTs by running them on the localized TRIP chip is extremely impressive. If and when this is successfully implemented, I predict cars will become much more ‘creative’ in their problem solving capabilities.

“A massive general purpose neural network will be able to solve much more varied driving challenges than smaller networks — and especially than traditional heuristics based code — and should finally get us to a place where cars can drive as well, or better than, humans!”

I think the video is an informative watch, and I encourage you to watch it in full if you’re interested in hearing Dr. Know It All’s full thoughts. You can watch it below or by clicking here.