Do you think Elon Musk's view that LiDAR is unreliable in autonomous driving and that visual recognition (like human eyes) is entirely sufficient is credible? Why?
Is Elon Musk's View That Pure Vision Is Superior to LiDAR Reliable?
This is an interesting question and one of the most debated topics in the autonomous driving community. Whether Musk's view is reliable cannot be answered with a simple "yes" or "no." It's more like two different technological paths, each backed by a distinct philosophical approach.
Let's discuss this with an easy-to-understand analogy.
Two Different Ways of "Seeing the World"
Imagine you need to walk through a completely unfamiliar, pitch-black room, with the goal of safely crossing it without bumping into anything. You have two choices:
- Vision Solution (Tesla/Musk's Path): You are given a pair of super-powerful night vision goggles. These goggles are incredibly advanced, amplifying faint light to let you see things as clearly as day. You can distinguish tables, chairs, and even see the color of a book on the table. But the problem is, if someone suddenly shines a flashlight at you, you might be momentarily "blinded"; if there's fog in the room, your vision will be blurry; and your judgment of object distance relies entirely on your "brain's" estimation based on experience.
This is visual recognition. Like human eyes, it provides rich information (color, texture, shape) and can understand that "there's a pedestrian" or "that's a traffic light." Musk's logic is: humans drive with just two eyes, and roads are designed for human eyes, so as long as the cameras and the "brain" (the AI algorithms) that processes their images are powerful enough, autonomous driving can be achieved. This is a "first principles" way of thinking that goes straight to the essence of the problem.
- LiDAR Solution (Most Other Companies' Path): You are not given night vision goggles, but rather a "guide cane," albeit a high-tech one. With every step, you quickly poke this cane in all directions. It immediately tells you: "There's a cylindrical object 2.3 meters ahead, a flat surface 1.5 meters to the left, at a height of 0.8 meters." It doesn't care about the color or material of the object; it just gives you an extremely precise 3D spatial coordinate.
This is LiDAR (Light Detection and Ranging). It works like a bat's echolocation, but with light instead of sound: it measures distance directly by emitting laser pulses and timing their reflections, generating a precise 3D point cloud of the surrounding environment. Its biggest advantages are precision and reliability. Day or night, it measures the distance and shape of objects accurately. It is far less likely to be dazzled by strong light and is less susceptible to being fooled by shadows.
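To make "directly measuring distance" concrete, here is a minimal sketch of the time-of-flight idea behind LiDAR. The function names are made up for illustration, and a real sensor also has to deal with noise, multiple returns, and beam angles, none of which is modeled here.

```python
# Time-of-flight ranging: a pulse travels to the object and back,
# so the one-way distance is (speed of light * round-trip time) / 2.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_distance_m(round_trip_s: float) -> float:
    """Distance to the target given the pulse's round-trip time (hypothetical helper)."""
    if round_trip_s < 0:
        raise ValueError("round-trip time cannot be negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def time_for_distance_s(distance_m: float) -> float:
    """Round-trip time a pulse needs for a target at distance_m (hypothetical helper)."""
    if distance_m < 0:
        raise ValueError("distance cannot be negative")
    return 2.0 * distance_m / SPEED_OF_LIGHT_M_PER_S


if __name__ == "__main__":
    # The "cylindrical object 2.3 meters ahead" from the guide-cane analogy.
    t = time_for_distance_s(2.3)
    print(f"round trip: {t * 1e9:.2f} ns, distance: {tof_distance_m(t):.3f} m")
```

The point of the sketch is that the distance comes out of a physical measurement (a time interval of roughly 15 nanoseconds for 2.3 meters), not out of an interpretation of an image.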
Pros and Cons of the Two Approaches
Musk's Pure Vision Path:
- Pros:
- Low Cost: Cameras are very inexpensive, affordable for mass-produced vehicles.
- Rich Information: Can read traffic signs, traffic light colors, and lane markings, which LiDAR alone cannot do reliably because it captures no color.
- More Human-like: Theoretically, if AI can match or surpass the human brain's image processing capabilities, the potential is limitless.
- Cons:
- Highly Affected by Environment: Performance can be compromised in adverse weather (rain, snow, fog), sudden light changes (entering/exiting tunnels), and at night.
- Distance is "Calculated": Distance is estimated by algorithms, not directly measured, so there is inherently room for error. A classic example is the system failing to distinguish the white side of a large truck crossing the road from the bright sky behind it, leading to a direct collision (similar accidents have occurred in the past).
- Extremely High Algorithm Requirements: This puts all the eggs in the basket of AI algorithm evolution, which is extremely challenging.
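To illustrate why a "calculated" distance leaves room for error, here is a rough sketch using classic two-camera (stereo) triangulation. This is an assumption chosen for illustration: it is one textbook way cameras estimate depth and is not a description of Tesla's actual system. The camera parameters below are invented example values.

```python
# Stereo triangulation: depth Z = f * B / d, where f is the focal length in
# pixels, B the distance between the two cameras, and d the disparity in pixels.
# A small pixel error in d produces a depth error that grows with Z squared.


def stereo_depth_m(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth from disparity for an idealized, rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error_m(focal_px: float, baseline_m: float,
                  depth_m: float, disparity_error_px: float) -> float:
    """First-order depth error: |dZ| is about Z^2 / (f * B) * |dd|."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


if __name__ == "__main__":
    focal_px, baseline_m = 1000.0, 0.3   # assumed example camera setup
    for depth in (10.0, 50.0, 100.0):
        err = depth_error_m(focal_px, baseline_m, depth, disparity_error_px=0.5)
        print(f"object at {depth:5.1f} m -> depth uncertainty about {err:.2f} m")
```

With these example numbers, half a pixel of matching error matters little for a nearby object but becomes many meters for a distant one, which is the kind of uncertainty a direct range measurement avoids.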
LiDAR + Vision Fusion Path:
- Pros:
- Safety Redundancy: This is the most important advantage. If the vision system is dazzled, LiDAR can still function normally; if LiDAR is degraded by dense fog, vision might still pick out some outlines. The two back each other up, like having double insurance.
- Precise Distance Measurement: Directly obtains precise 3D models, giving the vehicle a more "solid" understanding of its surroundings, especially when emergency braking or evasive maneuvers are needed.
- Cons:
- High Cost: Previously, a single LiDAR unit cost tens of thousands of dollars; although prices have dropped significantly, it's still much more expensive than cameras.
- Limited Information Dimension: It doesn't know that what's in front is a "stop" sign; it only knows it's an octagonal board. It needs to be fused with camera data to understand the world.
- Potentially Affected by Extreme Weather: Very heavy snow or fog can also absorb or scatter laser beams, leading to performance degradation.
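The "double insurance" idea can be sketched as a tiny fusion routine. This is a simplified illustration, not how any particular company does it: it assumes each sensor reports a distance plus an uncertainty, combines them by inverse-variance weighting, and falls back to whichever sensor is still working. The function name and the example numbers are hypothetical.

```python
from typing import Optional, Tuple

# (distance in meters, standard deviation in meters)
Estimate = Tuple[float, float]


def fuse_ranges(camera: Optional[Estimate],
                lidar: Optional[Estimate]) -> Optional[Estimate]:
    """Combine whatever range estimates are available.

    A sensor that is blinded or degraded is passed as None and simply skipped.
    Available estimates are averaged with weights of 1 / variance, so the more
    certain sensor dominates the result.
    """
    available = [e for e in (camera, lidar) if e is not None]
    if not available:
        return None  # both sensors failed: the caller must handle this case
    weights = [1.0 / (std ** 2) for _, std in available]
    total = sum(weights)
    fused = sum(w * dist for w, (dist, _) in zip(weights, available)) / total
    fused_std = (1.0 / total) ** 0.5
    return fused, fused_std


if __name__ == "__main__":
    print(fuse_ranges((52.0, 5.0), (50.0, 0.1)))  # both working
    print(fuse_ranges(None, (50.0, 0.1)))         # camera dazzled
    print(fuse_ranges((52.0, 5.0), None))         # LiDAR degraded by fog
```

When both sensors work, the precise LiDAR reading dominates; when one drops out, the vehicle still has an estimate, only a less certain one.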
Conclusion: More Like a High-Stakes Gamble
So, Musk's view cannot simply be dismissed as unreliable, but it is definitely a high-risk, high-reward gamble.
- He is betting that, with increased computing power and data input, pure vision AI will one day surpass the "LiDAR + vision" combination, with a significant cost advantage. This is a very aggressive and confident engineering approach.
- Most other companies, such as Waymo and Cruise, are taking a more conservative route. They believe that when human lives are at stake, any hardware redundancy that can increase safety is necessary. Their strategy is to maximize safety with "double insurance" first, and then gradually consider cost reduction.
From an ordinary user's perspective, at this stage, a solution with LiDAR feels more reassuring. It's like driving a car where, in addition to your eyes, there's a friend sitting next to you constantly measuring distances. Although this friend might be a bit "clumsy" and can't understand traffic lights, they can give you a highly accurate distance warning at critical moments.
Ultimately, which path will succeed may require several more years of market validation and practical application. But Musk and Tesla have indeed pushed pure vision technology to an unprecedented level.