Seeing Issues with Self-Driving Cars

Video Statistics and Information

Captions
It's been a long time since I've filmed outside… or in my car. Though, according to some people, I won't have to be sitting here for long. Which reminds me, I recently picked up a large British audience because of a few videos I made, so let me uh… This should make them feel a little more at home. And it's oddly appropriate, since this is what many people think the driver's seat of a car might look like soon. No steering wheel, maybe just a HUD… it's pronounced HUD, Heads-Up Display, not "hood." And I won't have to do anything more than just sit here and sleep or… sing. Really, we don't have an embarrassing shot of me singing? But there are many problems and challenges that self-driving cars must overcome first.

When artificial intelligence first started, the designers thought to themselves: "Hmm, what's the most difficult thing we can teach a computer to do?" What answer did they come up with? Chess… and checkers and Go – board games. But not just any board games; these are games for super nerds, so if a computer can beat super nerds, it must be really smart. Turns out, while these games may be hard for you, they're really easy for a computer. Like laughably easy. Deep Blue beat the best human chess master in 1997. That was well before most of you had internet, and if you were lucky enough to have internet, it was still dial-up and you hoarded those AOL CDs with their thousands of free hours. The game of checkers was declared officially dead in 2007, because a computer had memorized every possible game situation – all 5 × 10^20 positions. In 2015, chess was likewise solved. And in 2016, a computer beat the second-best international Go player… Board games are easy. You input the rules, maybe let it watch a few thousand games, and that's it. It doesn't require any motor functions or sensory input.

So what people didn't see coming was how difficult it is to get a computer to see. Seeing requires much more than just photon receptors. That's why you have an entire lobe – roughly 20% of your brain – dedicated to it. Colors are relatively simple and not all that difficult for a computer to figure out… so they won't be the focus of this video – but I'm… building to it. Shapes are also relatively easy, but edge detection is not. So when a computer sees an image like this, it sees a bunch of colors and shapes. But it has no idea what any of it means and can't tell the objects from the background. Is this an object? Is this an object? Is this all one object with multiple colors? But you can see it, easily, and without having to think about it. Because one of the primary purposes of Area V1, also known as BA17, is edge detection. It's one of the first things the brain does with visual information. This problem has been mostly figured out, but I'll get to how in a bit.
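Since edge detection comes up again later, here is roughly what the "figured out" part looks like in software: the image is convolved with small gradient filters, and wherever brightness changes sharply, that's an edge. A minimal sketch using OpenCV and NumPy – the file name and threshold values are placeholders, not anything from the video:

    import cv2
    import numpy as np

    # Load an image and reduce it to grayscale; edges are about intensity
    # changes, not color.
    img = cv2.imread("street_scene.jpg", cv2.IMREAD_GRAYSCALE)

    # Sobel filters approximate the horizontal and vertical brightness gradients.
    gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

    # The gradient magnitude is large wherever brightness changes sharply,
    # i.e. at likely object boundaries.
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    edges = (magnitude > 100).astype(np.uint8) * 255  # threshold chosen arbitrarily
    cv2.imwrite("edges_sobel.png", edges)

    # Canny bundles the same idea with smoothing and hysteresis thresholding.
    cv2.imwrite("edges_canny.png", cv2.Canny(img, 100, 200))

This finds edges; deciding which edges belong to which object – the part the video is about – is the hard part that comes next.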
A problem that will never be solved, though, is how to teach a computer to see depth. It has no idea what is closer – this… or this. Again, you can, with little to no effort. When you see optical illusions like this, you know something isn't right, but a computer has no idea. First, let's go through the monocular cues – that is, one-eye cues – for depth perception. When you look at this picture, you see the horizon. Your brain uses this as a major clue: the closer something is to the horizon, the further away it usually is, and the higher or lower it sits in your view, the closer it is to you. We also look for parallel lines, which converge the further away they are. Also, the further away you look, the more atmospheric haze there is, which makes things have less contrast and appear bluer than they should. But a more obvious monocular cue is relative size, which is how big or small something is compared to something else that should be similar… like in this illusion. They're all the same size, by the way, but because of their positions they appear hilariously out of proportion. Or my hands: one of them is bigger than the other – not because I'm a freak, but because one is closer to the camera. Another obvious cue is occlusion: since my right hand is obscuring the view of my left hand, it must be closer. Which was also the case with the cars.

But that's just how you do it with one eye. You don't actually see this, because you have two eyes, giving you binocular vision… so you see this. The focus of your two eyes gives you depth perception, because of a fun little bit of math that your eyes and brain calculate without your conscious awareness. The angle of your eyes is calculated when you're looking here… and when you're looking here… and your brain essentially does triangulation to figure out that this point is further away than this point. Granted, you don't actually perceive this blurry mess unless you take off your glasses during a 3D movie; in real life your brain cleans up the image a lot. While you may think that you see in 3D, you actually only see in 2D, and your brain creates a 3D space using that 2D information. And we're not even going to get into the motion cues because we've already complicated things enough… and… I mean, c'mon. Maybe that'll be another video. All of these are things your brain does without even thinking about it.

Many of the monocular cues can be programmed into a computer, but not all of them. And just like your brain can be fooled by various illusions, so can a computer, which is why a computer cannot rely entirely on vision to determine depth or distance. It needs some sort of outside help. Many of you might have jumped to the idea of GPS or satellite imagery. Unfortunately, GPS doesn't provide you with any images, and it doesn't tell you where you are. It tells you where IT is, and your computer makes its best guess at your location, which is usually only accurate to within a few feet. Which doesn't work when you're talking about piloting a one-ton hunk of metal at 60 miles an hour. Satellite imagery also doesn't work because it isn't live, which is pretty much what you need when navigating a busy street. And even if you somehow manage to fix that problem, you'll never overcome weather, buildings, and trees and stuff.

So a driverless car needs to use localized vision, along with something else to perceive distance local to the car – something we've actually had much longer than we've had satellites: echolocation, such as sonar or radar. Basically, it sends out a signal, and based on how long it takes for the signal to reflect off an object and return, it can figure out how far away that object is. The problem is that this information is limited and not quite how you see it in the movies. For an image like this, radar will return an image like this – just like with visual information, it's only two-dimensional, and it will only tell you how far something is from the transceiver, at the level of the transceiver. So if it's on the roof of your car, that's not very useful for figuring out how far away another car's bumper is. So have one on the bumper, obviously.
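The ranging arithmetic itself is the easy part: distance is the signal's speed multiplied by the round-trip time, divided by two. A toy illustration – the speeds are physical constants, but the timings are made-up example numbers:

    # Round-trip ranging: the transceiver sends a pulse and times the echo.
    # distance = (signal speed * round-trip time) / 2
    SPEED_OF_SOUND_M_S = 343.0           # ultrasonic parking sensors
    SPEED_OF_LIGHT_M_S = 299_792_458.0   # radar and lidar

    def echo_distance_m(round_trip_s: float, signal_speed_m_s: float) -> float:
        """Half the round trip, because the pulse travels out and back."""
        return signal_speed_m_s * round_trip_s / 2.0

    # An ultrasonic echo returning after 12 ms, and a radar echo after 200 ns:
    print(echo_distance_m(0.012, SPEED_OF_SOUND_M_S))   # ~2.1 m
    print(echo_distance_m(200e-9, SPEED_OF_LIGHT_M_S))  # ~30 m

Note that this gives one number – range – and nothing else, which is exactly the limitation being described.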
That's on top of the fact that it only tells you how far away something is in that moment – not what direction it's moving or how fast it's travelling. Which is made even more complicated when you add in the fact that you are likewise moving. The technology to do that does exist and has been in use for sea and air travel for decades, but those are long-range applications with far fewer vehicles and little environmental obstruction. So okay, all of these are challenges that have been or can be solved. So let's look at some issues with vision that have not yet been solved.

What is this? Right, it's a bicycle. What is this? It's still a bicycle, c'mon. Okay, what is this? It's still a bicycle – this isn't rocket science, it's pretty easy. Yeah, for you. For a computer, it's infinitely difficult. Object recognition is by far the hardest thing to get a computer to do, especially when you consider that 3D objects look completely different in a 2D image from different angles, even in perfect conditions.

You may remember this video from a few years ago of a robot navigating around the world and interacting with objects. You probably mostly remember it because of the jerk with the hockey stick. It's all pretty impressive… as long as you aren't aware of the difficulties of getting a computer to see. You know what a cognitive psychologist sees when they watch this video? A robot interacting with a bunch of QR codes. Those QR codes tell the robot what the object is, its distance, and its orientation. Whether it's a box or a door, if they want the robot to interact with it, there are QR codes stamped all over it. Sorry for ruining the magic.

"What are you talking about? Computers are way good at recognizing things, like how Facebook recognizes me in all those pictures, or all those Snapchat filters." Yeah, I mean, they're good at faces, I'll give you that. But faces are pretty easy; there's a pretty standard pattern for those… your brain is wired to automatically think this is a face – it's not, it's a chair. And it's not really like recognizing faces is all that useful for driverless cars.

Computers are getting better at recognizing objects, though, and you're helping. Whether you're playing Google Quickdraw or doing one of those annoying new CAPTCHA images where you select all of the squares with a stop sign, you're teaching computers how to see. You are helping our future machine overlords recognize objects. However, even with all of this learning, a computer still has to match what it sees against a previously learned template. It's not, like you, able to figure out what new objects are and what they mean on the fly.

For example, say you come across this sign. It might only take you a second or two to skim it over, realize that it doesn't apply to you, and continue driving. But a computer? The first thing it will recognize is the shape – it sure looks like a stop sign. It's red and white, just like a stop sign… it has a little more to it than a stop sign, but just to be sure, I'd better stop… in the middle of this street. That could cause an issue. Or perhaps this situation. Clearly, that's a stop sign, but it's covered with a trash bag. Whatever internet traffic uplink it's using says there should be a stop sign here, and there it is. But it… it's partially covered with a trash bag. Again, you can look at this situation and quickly figure it out: you're supposed to follow the temporary green light and ignore the stop sign. But a computer, especially one that's never encountered this before, won't be able to react as easily.
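Underneath both of those sign scenarios is the template matching described a moment ago. A crude way to picture it is a nearest-prototype classifier: the image is boiled down to a feature vector and assigned to whichever stored class is closest, so an unfamiliar red-and-white sign still gets pulled toward "stop sign" unless an explicit rejection step is added. Every feature, number, and class name below is invented for illustration:

    import numpy as np

    # Hypothetical learned "templates": one averaged feature vector per known sign.
    PROTOTYPES = {
        "stop_sign":   np.array([0.9, 0.8, 0.1]),  # e.g. redness, octagon-ness, text density
        "yield_sign":  np.array([0.7, 0.2, 0.1]),
        "speed_limit": np.array([0.1, 0.1, 0.6]),
    }

    def classify(features: np.ndarray, reject_distance: float = 0.5) -> str:
        """Return the nearest known template, or 'unknown' if nothing is close."""
        label, dist = min(
            ((name, np.linalg.norm(features - proto)) for name, proto in PROTOTYPES.items()),
            key=lambda pair: pair[1],
        )
        # Without this rejection step, a red-and-white sign the car has never
        # seen still gets forced into the closest known category.
        return label if dist <= reject_distance else "unknown"

    # A novelty sign that merely resembles a stop sign:
    print(classify(np.array([0.85, 0.7, 0.4])))  # -> "stop_sign"

Whether a rejection threshold exists, and what the car does with "unknown," is exactly the kind of design choice the next scenarios hinge on.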
Say the temporary light wasn't there. Is a trash bag and some duct tape all it takes to fool a self-driving car into ignoring the sign and just flying through? If so, you're going to see a lot more teenage pranks and YouTube videos like this show up. And what about that detour sign? Your GPS says you should go straight. You might know that you can safely go straight, but a computer sees the sign saying it must go right. Maybe there's an obstruction ahead? These are all things that you can quickly figure out. But a computer has to obey a set of rules, and when presented with something outside of its ruleset, it may not know how to react.

Which is why I'm not very impressed when people bring up Google's self-driving car that drove up and down the highway on its own a few years ago. Driving on a highway is easy: all you have to do is stay between the lines. There are very few dynamic situations, very few unique situations, and relatively few challenges. It's so easy a monkey could do it… actually, it's so easy a dog could do it. This isn't a prank, this isn't a joke – this is an actual dog driving a car. There's an entire channel dedicated to showing various dogs learning how to drive cars; it's not that hard. Obviously they're on a track by themselves, heavily supervised, but they are staying in the lines. It's not that hard, and it's not that impressive. Highway driving is so easy you regularly zone out completely and stop paying attention, and most of the time everything turns out okay. Not like in the city, where you're on constant guard and see dynamic, unique situations all the time.

Which brings us to the last major hurdle that driverless cars must face. Once you're able to get a car to see properly and understand what it sees, you then have to tell it what to do with that information. Let's say you're driving along and you come across this situation. Again, you, as a human, can quickly figure out the context of this situation and probably wouldn't stop. A computer, on the other hand, wouldn't know that this person was just about to turn right and get into the driver's-side door. According to this person's current trajectory, if the car doesn't stop now, it's going to hit them. Does it assume that this person is fully aware and acting safely, or does it stop, possibly causing an accident with the car behind it?

That's a simple situation – a very simple situation. Let's say the car is driving along on a two-lane road and realizes that its brakes are out. I don't know, maybe a line got severed or a wire shorted out; it doesn't matter – it's rare, but it's not unheard of. Coming towards the car are two motorcyclists who are dangerously riding side by side in both lanes. Your car must now choose who to hit. You, as a human, can freeze up, yell "Jesus take the wheel!" and let physics decide who lives and who dies. A computer, on the other hand, can't. Not making a decision is a decision to do nothing, which means that the car will hit one or both of them… which means that the car decided to hit one or both of them. There is no scenario where the computer can claim to have been so flustered that it couldn't make a decision. It could decide to follow the law and strike the person who was travelling in the incorrect lane – that's one way to do it.
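That "the car decided" framing is literal: whatever planning software is running has to return some maneuver on every cycle, whether it comes from an explicit rule or from a fallback for situations it has never seen. A toy sketch of that structure – the situation labels and maneuvers are invented for illustration, not taken from any real system:

    # A toy planner. Real systems are vastly richer, but the structure is the
    # same: every perceived situation must map to SOME action, and anything
    # not covered by a rule falls through to a fallback.
    RULES = {
        "stop_sign_ahead":          "brake_to_stop",
        "detour_sign_ahead":        "follow_detour",
        "pedestrian_in_path":       "brake_to_stop",
        "oncoming_vehicle_in_lane": "stay_in_legal_lane",  # the 'follow the law' option
    }

    def decide(situation: str) -> str:
        # Even "do nothing" is a returned action; the planner can't shrug.
        return RULES.get(situation, "stop_and_request_human_takeover")

    print(decide("stop_sign_ahead"))                  # brake_to_stop
    print(decide("stop_sign_covered_by_trash_bag"))   # falls through to the fallback
    print(decide("brakes_out_riders_in_both_lanes"))  # also the fallback - which is itself a choice

Someone has to write both the rules and the fallback, and that is where the ethics come in.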
Or, we can make the situation even more interesting by pointing out that one rider is wearing a helmet and all their protective clothing, while the other is simply wearing a t-shirt and shorts. Your car may decide that this person is more likely to survive a collision – although only slightly more likely – and therefore steer the car into that person. Which would, paradoxically, make it less safe to be wearing a helmet. Or we could point out this tree to the side. The car could avoid hitting both riders and instead elect to crash itself into the tree… just injuring you. You'll likely survive, while the riders probably wouldn't. But who is the car supposed to protect? You, the owner and operator? Or some bonehead Harley rider who wasn't obeying the law? Some might say that, as the owner, the car's main directive should be to protect you and the passengers, while others might say it should protect as many lives as possible. But given the choice, if there were cars on the market that safeguarded all life and cars that just protected you and your passengers… you might be more inclined to buy the one that places you and your family above others.

Let's pose another situation. Say you're at an intersection and your car wants to make a right turn… but there's a line of school children currently crossing the street, all holding hands, single file. So you're patiently waiting. But another car coming down the road has hit a patch of ice, or had its brakes and steering go out – whatever, it doesn't matter; the point is that the car isn't stopping and no longer has control. It's also a self-driving car and, using magic, is alerting all other cars in the area about its situation. If your car is designed to protect only you, it'll probably sit tight… and force you to watch something so horrifying you'll never see the end of the therapy bills. If your car is designed to protect as many lives as possible, it might pull forward into the intersection… stopping the car from plowing through all those kids… but you'll be t-boned, and your chance of walking away from this accident is pretty low.

These are the situations that driverless cars will be forced to make decisions about, and they are incredibly tough decisions. Not to mention that I've only given you a small handful of the literally infinite number of possible situations. I certainly don't want to be the one writing the ethical and moral codes for self-driving cars… but someone has to, especially if we ever want intersections to look like this: no traffic lights, all the cars driverless and simply communicating with each other with hyper-efficiency. And it's absolutely impossible.

First of all, it requires that every single car on the road be self-driving. If there's even one manually driven car, game over. Which then also means that your car must be self-driving all the time; if you have the ability to switch it on and off, an intersection like that will never work. Which means that old man river out on his dirt road would have to be using a self-driving car. We can get around this by saying that autopilot must be enabled only on certain roads – fine. But let's say you're on one of these autopilot-only roads and you're late for work… when this happens. Your HUD tells you that an emergency is occurring on the road, so all travel is currently halted. Never mind how furious you'll be over the fact that the government can just seize control of your car… you're late, so you flip the manual override and decide to proceed anyway… and congratulations, you just caused a collision.
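Setting the override aside for a moment, the "protect the occupants" versus "protect as many lives as possible" split from earlier is, mechanically, just a different weighting in whatever cost function ranks the candidate maneuvers. A deliberately crude sketch – the maneuvers, harm estimates, and weights are all invented, and no manufacturer publishes anything like this:

    # Candidate maneuvers with rough estimated harms (0 = unharmed, 1 = fatal).
    # The outcomes mirror the no-brakes scenario above and are purely illustrative.
    MANEUVERS = {
        "hit_helmeted_rider":    {"occupant": 0.1, "others": 0.70},
        "hit_unprotected_rider": {"occupant": 0.1, "others": 0.95},
        "swerve_into_tree":      {"occupant": 0.6, "others": 0.00},
    }

    def pick_maneuver(occupant_weight: float) -> str:
        """Lower weighted harm wins; occupant_weight encodes whose safety counts more."""
        def cost(harm: dict) -> float:
            return occupant_weight * harm["occupant"] + (1 - occupant_weight) * harm["others"]
        return min(MANEUVERS, key=lambda name: cost(MANEUVERS[name]))

    print(pick_maneuver(occupant_weight=0.9))  # occupant-first: hits the helmeted rider
    print(pick_maneuver(occupant_weight=0.3))  # harm-minimizing: takes the tree

The uncomfortable part isn't the code, it's choosing the weight – which is exactly the point being made here.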
Don't act like that override situation is impossible – how many people do you know who have driven through a closed highway because "the weather was bad"? If people can break the law in order to save themselves time, they will.

But let's go back to this intersection and assume that all cars will always be self-driving, with no possible manual override. This intersection is still a disaster waiting to happen. Let's completely set aside the idea that anyone would ever go to this intersection with malicious intent, even though those people have always existed and always will. And we'll assume that all of these cars are completely unhackable – again, we're assuming perfect conditions. Imagine a tree branch falls in this intersection. Or a tire blows out. Or a truck's unsecured cargo falls out. You're looking at a several-car pile-up… even with AI that can respond instantly. Also, having traffic flow like this renders the intersection completely useless to pedestrians and bicyclists. There's an easy solution, of course: a four-way footbridge. Which likewise dramatically increases the likelihood of something or someone falling into the intersection, accidentally or not.

But again, in order to achieve that perfect flow of traffic, everyone needs a driverless car. Cars aren't like phones, where people get a new one every year or two. Cars last a long time – like 15 to 20 years – and most people don't get a new one until their current one is broken beyond repair. So even if, by some miracle, all of the technological and ethical hurdles are overcome in ten years – which is extremely generous; they totally won't be – and they stop selling manually driven cars that same day… without government intervention, it would still take another 15 to 20 years to phase out all of the manually driven cars. Excluding the antiques, of course, because you're never going to take those away from people.

On the topic of government intervention, we also have many legal issues to work out with driverless cars. Just to throw a few out there: Who is at fault when a car decides to hit someone? When you're the only person riding in a self-driving car, are you allowed to be on your phone? Sleeping? Drunk? If you're required to be awake and attentive the entire time, doesn't that kind of ruin the point of it being self-driving? Self-driving cars will happen, don't get me wrong. They are coming. But if you think that they'll take over the roads in the next ten, twenty, or even thirty years… hopefully now, you know better.
Info
Channel: Knowing Better
Views: 348,262
Rating: 4.7386494 out of 5
Keywords: psychology, technology, morals, ethics, legal, law, morality, legality, self driving, driverless, automated, car, cars, automobiles, autos, artificial intelligence, ai, vision, sight, see, depth perception, distance, monocular, binocular, depth, cues, object recognition, intent, context, trolley problem, occipital lobe, edge detection, relative size, occlusion, area v1, ba17, captcha, quickdraw, machine learning, illusion, optical illusion, 3d, 2d, motion, echolocation, gps, facial recognition, eyes, focus, traffic
Id: rlbHeg7U6Tc
Length: 17min 29sec (1049 seconds)
Published: Sun Jul 30 2017
Reddit Comments

I feel like arguments against driverless cars are only arguments that it won't fix every single problem on the road. Just always ask yourself: is it better than the alternative?

And the "moral issue" is a non-issue. Every example I have heard can be answered by the car simply following the rules of the road. If someone comes straight at me on a motorcycle, the best thing the car can do is slow down, stop, or pull over. Why does the car have to fling itself into something else? If the car follows the rules of the road, then it will never be responsible for someone else causing an accident.

We are talking about cars, not the androids from iRobot...

Edit: oh by the way, yes a computer with one camera does not have great depth perception compared to our two eyes, but do you know what does? A computer with two cameras!

👍 8 · u/Rusty_14 · Jul 31 2017

Well, that bummed me out a bit :(

👍 2 · u/Pizmovc · Jul 31 2017

I had no idea chess had been solved. That is news to me.

👍 2 · u/merelyadoptedthedark · Jul 31 2017

The problem with this video is that pretty much everything he brings up will be fixed with human operated autopilot.

These are just edge cases that will be solved when a human tells the car the solution. That's how Machine Learning works. Take everything he says with a grain of salt.

It just means that we will have to have a steering wheel on the car for a bit longer.

source: computer science student.

edit: also, about the ethical decisions, by and large it doesn't matter. As long as driverless cars kill fewer people than humans, the car can kill whoever it wants. The number of people that die will still be dramatically less than what it is now, and the car company could just absorb the legal fees.

👍 2 · u/derangedkilr · Aug 01 2017