How to use Cameras in ROS (Sim Camera and Pi Camera)

Captions
These days you'll find cameras just about everywhere: they're on our cars, in our houses, all over our phones. But what about robots? Getting robots to see and understand the world the way we do with our eyes is one of the biggest areas of robotics research, and today we're going to look at the first half of that problem, the seeing: how do we get data from the world into a robot using a camera? By the end of this video you'll be able to use ROS to connect to either a virtual camera or a real camera and process the image data coming from it.

Just like the last tutorial on lidars, this is split into four sections. First, a really quick overview of how cameras and images work. Second, how cameras and images are integrated into ROS. Third, we'll use Gazebo to simulate a virtual camera and drive a virtual robot around. And fourth, we'll connect a Raspberry Pi to a real camera and get image data from it. You can use the chapter markers at the bottom of the video to jump to any section you want. In this video we'll be focusing on RGB cameras, that is regular colour cameras, but in the next video we'll be looking at depth cameras, which are even cooler, so make sure you subscribe so you don't miss out on that.

While it's really exciting to get a camera working in ROS for the first time and have those images beamed back to your computer, and it's tempting to rush ahead and start wiring it all together, things will make a lot more sense if we step back first and quickly cover some theory. You could spend an entire series, or even an entire uni degree, on the material we're about to cover, so we're only going to take the bare-minimum bird's-eye view right now.

When we think about cameras and images, there's probably something that jumps into our heads straight away: a 2D array of coloured pixels that vaguely represents what we see with our eyes. That's what we'll be talking about for most of this video, but it's worth acknowledging that there's actually a huge variety of cameras out there. They vary in their sensor type, so you can get colour, grayscale, thermal or infrared cameras. They vary in their optics, so you get wide-angle lenses that can see heaps of things, or telephoto lenses that are really zoomed in on one thing. Sometimes you get high-frame-rate cameras that can measure things that are moving really fast, or depth cameras that can measure the distance to an object as well as its colour. There are all sorts of different cameras out there, and the things we talk about in this video will apply to all of them in some way or another.

When a camera takes an image, it takes the light that's bouncing around in the world, focuses it through a lens and an aperture onto a sensor, and that data is stored as a 2D array of measurements called pixels. For a grayscale image this is pretty straightforward: each pixel just measures the intensity of the light that hits it. For colour images it gets a bit more complex. There are a few different methods, but the most common one is to split the information up into three colour channels, red, green and blue, and by combining those three channels we can produce all sorts of colours: if all of them are at their maximum you get white, if they're all at their minimum you get black, and in between you've got every other colour you can think of.
Usually we store this with eight bits, so one byte, per colour channel per pixel. Those eight bits give us a range of 0 to 255 for each colour, and combining the three colours gives us millions and millions of different colour combinations. We usually store that as red, then green, then blue, but be aware that some software and hardware does things a bit differently: OpenCV, for example, stores it as blue, then green, then red, or you might get a camera whose data is 16 bits per pixel, that sort of thing. In general, though, the same principle applies.

This format of three 8-bit values per pixel is really easy for computers to work with, but unfortunately it's not very space efficient, and that starts to become a problem in robotics because we've often got large images that we want to send over a network many times per second. To deal with this we use compression. The simplest form of compression is just to resize the image: shrink it down, throw away, say, three in every four pixels. This works and you'll save space, but you end up with a picture that's really pixelated. The good news is that there are smarter ways to do it. Compression formats like PNG and JPEG do a smarter job; for example, if you've got a big black patch in your image, instead of storing 0, 0, 0 for every single pixel, they can recognise that the whole section is the same colour. PNG does this in a way that's lossless, so when you compress an image on one end and decompress it on the other you get exactly the same pixels back out. JPEG does it in a way that's lossy, so when you decompress it you don't get exactly the same thing out; you'll have lost a little bit of information, but hopefully it's done in such a way that the information that was lost wasn't that important. The process of compressing an image on one end and decompressing it on the other does take a bit of CPU power, but it's often worth it for the amount of bandwidth we save.

When we take an image there's a whole lot of parameters that contribute to what the final image looks like, but there's just one I want to touch on now, and that's the focal length. The focal length is technically how far the sensor inside the camera is from its lens, but that's kind of meaningless unless you know the size of your sensor. Instead it's often easier to talk about the horizontal field of view: the angle spanned from the left side of the image to the right side. When we increase the focal length we decrease the field of view, we make that angle tighter, and that's the same thing as zooming in; you can see in the image that instead of all the trees it can now only see just above the person. In robotics we often care about seeing as much of the world around us as we can, so we use cameras that are zoomed out: a fairly wide field of view, or equivalently a short focal length. If you ever need to convert between focal length and field of view, there's an equation for it (shown below); it's just a bit of simple trigonometry.
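For reference, the standard pinhole-camera relationship between the two looks like this, where $w$ is the width of the sensor (or the image width in pixels, if the focal length is expressed in pixels), $f$ is the focal length and $\mathrm{FOV}_h$ is the horizontal field of view:

$$\mathrm{FOV}_h = 2\arctan\!\left(\frac{w}{2f}\right) \qquad\Longleftrightarrow\qquad f = \frac{w}{2\tan(\mathrm{FOV}_h/2)}$$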
The last thing I want to touch on briefly in this section is coordinate systems. Typically when we're working with images, the x direction runs from left to right and the y direction runs from top to bottom, and according to the right-hand rule that makes the z direction point into the page, away from the camera. Keep that in mind; it's going to come in handy in just a minute.

Now that we've got a general understanding of how cameras and images work, let's see how they're integrated into ROS. Just like we saw in the last video with the lidar, we always start with a driver node for our camera. This is a node that talks to the camera hardware: it knows all about that camera, it can set the frame rate and the shutter speed and whatever else, it knows how to speak that camera's language, and it takes the data stream from the camera and publishes it to a topic. The type that ROS provides for that topic is sensor_msgs/Image, and that way, as long as your driver node is designed to publish an Image and your algorithms are designed to subscribe to an Image, you can chop and change: use whatever cameras you want, whatever algorithms you want, and everything should work together. If for some reason your driver or your algorithm only wants to deal with compressed images, ROS provides another type, sensor_msgs/CompressedImage, and as you might guess it's the same sort of thing as Image but it stores compressed data like JPEG or PNG. ROS also provides a bunch of libraries and tools called image_transport, which handle conversion between compressed and uncompressed. So if, say, your camera hardware only deals with compressed images, or your algorithms only want uncompressed images, those libraries and nodes let us convert between them; we'll see a little bit of that later.

The unprocessed image topic published directly by the driver node is usually going to be called something like image_raw. Here "raw" doesn't mean the same thing it does in photography, if you're familiar with the raw format there; in this case it just means the unprocessed image, which might still be distorted and that kind of thing. If you've got a compressed image, we usually add /compressed to the end of the topic, so in this case it would be image_raw/compressed. There's also another type that ROS provides, the CameraInfo type, which, as you might guess, stores information about the camera: how it's calibrated, distortion coefficients, that sort of thing. This is the kind of information some algorithms need in order to interpret the raw image data correctly. The topic is typically called camera_info and lives in the same namespace as image_raw, so if your image was on /my_camera/image_raw then your camera info would be on /my_camera/camera_info. We're not going to talk about camera info any more in this tutorial; it's just good to know it exists.
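To make those types a little more concrete, here is roughly the field layout of the two messages (slightly simplified); the raw Image carries the pixel data and its encoding, while CompressedImage just carries a format string and the compressed bytes:

    sensor_msgs/msg/Image:
        std_msgs/Header header    # frame_id should be the camera's optical frame
        uint32 height             # image height in pixels
        uint32 width              # image width in pixels
        string encoding           # e.g. rgb8, bgr8, mono8
        uint8  is_bigendian
        uint32 step               # length of one image row in bytes
        uint8[] data              # the actual pixel data

    sensor_msgs/msg/CompressedImage:
        std_msgs/Header header
        string format             # e.g. jpeg or png
        uint8[] data              # the compressed bytes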
Lastly, one of the most confusing aspects of working with cameras in ROS is the coordinate systems. We've seen in earlier tutorials that the standard ROS coordinate system has x pointing forward, y pointing to the left and z pointing up, but just a minute ago we saw that the standard when working with images is x to the right, y down and z forward. How do we deal with this? The answer is we do both. Whenever we're creating a URDF with links, or building a transform tree, we always have two different frames. We usually call the one that's in the ROS coordinate system the camera link, or something like that, and the one that's in the vision convention, which is exactly the same frame just rotated differently, gets the same name with _optical on the end: so you might have camera_link and camera_link_optical. All of our topics, the image topics and the camera info topics, should then specify the optical frame in their headers, not the regular frame. I know this is really confusing, but it's just how it is, and we're about to see a practical example of it as we start simulating a virtual camera in Gazebo.

Just like the last tutorial with the lidar, we're going to simulate the camera in Gazebo before we try to plug in a real one, so if you feel like I'm moving through things a little too quickly, maybe check out that lidar tutorial first for some clarification. Because the camera is going to be so similar to the lidar, we'll actually start by taking our lidar file from the last tutorial, copying and pasting it, and renaming it camera.xacro. Then we'll go into our robot.urdf.xacro, copy the line where we include the lidar, and rename that to camera as well.

Now we'll go through this file and change a few things. Our first joint is now going to be called camera_joint, and our first link camera_link, so obviously this joint is now between the chassis and the camera link. For the location, I'm going to put it at the front centre of my robot. My robot's origin is at the back and the robot is 300 mm long, so I'm going to make the x offset 0.305 m, because the camera will have a bit of thickness to it; that puts the origin of the camera just in front of the chassis. And let's make it 0.08, so 80 mm, up from the bottom of the chassis. You'll also note that I haven't put any rotations in here, and that's because, for now at least, this camera joint is using the ROS standard of x forward, y to the left and z up rather than the optical standard; we'll get to that in a minute.

For the link, you've got the visual, collision and inertial components left over from the lidar. You can keep all of them if you want, but because my camera is really small and it's just going to be fixed to the chassis, I'm going to get rid of the collision and inertial components and keep just the visual, and I'll change that to a box that's 10 mm thick, which accounts for twice that extra 5 mm, and 30 mm in each of the other dimensions. That should give us a little square prism sitting on the front of the robot. I'll keep the material as red from the lidar. In the gazebo tag, we'd better change the gazebo reference to camera_link. I'm going to comment out the sensor section for now, we're not going to worry about that just yet, and we'll leave the material there. So we've now got our camera xacro file with a new joint and the camera link.

The other thing we're going to do is add that special extra link and joint that convert from the ROS coordinate system to the general vision-standard coordinate system. I'm going to copy and paste the joint, call it camera_optical_joint, make the parent link camera_link and the child link camera_link_optical; normally this is whatever your camera link was called, with _optical on the end.
For the origin, you want it located at the same point as your camera link but rotated differently: a roll of -pi/2, no pitch, and a yaw that is also -pi/2. We'll see soon that that's just what it has to be to make it work. For the link itself, we'll get rid of all of the components; it doesn't need any visual, collision or inertial associated with it. It'll just be camera_link_optical, a kind of invisible link that gives us an alternative way of looking at our camera link. You'll notice I've actually put this in between the camera link and the gazebo tag. Normally I like to keep my gazebo information close to the links, but because this one is really just another way of looking at the same link, sort of an alias for it, I'm happy enough to keep it close by rather than putting it at the bottom of the file. So we've got the camera joint, the camera link, the optical joint and the optical link, and then our gazebo tag, which at the moment just has the material in it.

Let's test that it works so far. I'll open up a new terminal, go into our development workspace, rebuild (because we added some new files), source the installation, and then, just like in the previous tutorial, run our simulation launch file (launch_sim.launch.py). It launches up, we've still got our lidar there, but now if we spin around we should see our nice little camera sitting on the front of the robot. So that's the physical aspect of the link; now we need to set up the sensor tag just like we did with the lidar.

We'll close this and see what we've got. I'll uncomment that sensor tag and collapse some of these things for now. We'll start by calling it camera, and the sensor type this time is going to be camera. Again, the pose can be all zeros, it doesn't really matter, visualize is true, and the update rate is 10. The ray section we're going to get rid of, because those were the settings for the ray sensor; this is a camera sensor now, so we'll add a camera section instead. The settings that go in here I've already copied from somewhere else, so I'll paste them in and you can type them out as I go through them. First up we've got the horizontal field of view, which essentially determines how zoomed in or zoomed out the picture is going to be; I've chosen this number because it's pretty similar to the value for the actual camera I'll be using. For the image format we've got R8G8B8, so eight bits of red, eight bits of green and eight bits of blue, and for the width and height I've got 640 by 480 pixels; you can change that to whatever you want. Finally I've got some clipping settings, which set a minimum and a maximum range the camera will be able to see; I've set mine up so it sees as close as 50 millimetres and as far away as 8 metres.

So that's all the camera settings; now it's time for the plugin settings. The name here was laser_controller, so let's change that to camera_controller, and the filename, instead of the ray sensor one, is now libgazebo_ros_camera.so. We'll get rid of these arguments; we don't need them for the camera sensor.
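Putting all of that together, here's a rough sketch of what the camera part of the xacro can end up looking like. The exact horizontal_fov value, the material name and the chassis link name are assumptions for illustration, so match them to your own files:

<joint name="camera_joint" type="fixed">
    <parent link="chassis"/>
    <child link="camera_link"/>
    <origin xyz="0.305 0 0.08" rpy="0 0 0"/>
</joint>

<link name="camera_link">
    <visual>
        <geometry>
            <box size="0.010 0.03 0.03"/>
        </geometry>
        <material name="red"/>
    </visual>
</link>

<!-- Same point as camera_link, but rotated into the vision/optical convention -->
<joint name="camera_optical_joint" type="fixed">
    <parent link="camera_link"/>
    <child link="camera_link_optical"/>
    <origin xyz="0 0 0" rpy="${-pi/2} 0 ${-pi/2}"/>
</joint>

<link name="camera_link_optical"></link>

<gazebo reference="camera_link">
    <material>Gazebo/Red</material>

    <sensor name="camera" type="camera">
        <pose>0 0 0 0 0 0</pose>
        <visualize>true</visualize>
        <update_rate>10</update_rate>
        <camera>
            <horizontal_fov>1.089</horizontal_fov> <!-- assumed value, in radians -->
            <image>
                <format>R8G8B8</format>
                <width>640</width>
                <height>480</height>
            </image>
            <clip>
                <near>0.05</near>
                <far>8.0</far>
            </clip>
        </camera>
        <plugin name="camera_controller" filename="libgazebo_ros_camera.so">
            <!-- Publish image data in the optical frame, even though the sensor
                 itself is attached to camera_link -->
            <frame_name>camera_link_optical</frame_name>
        </plugin>
    </sensor>
</gazebo>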
Then there's the frame name, and this is where it gets a bit confusing: even though this sensor is attached to camera_link, we actually need the frame associated with it to be camera_link_optical. When it publishes the image data, that data will be associated with the optical link, but the link Gazebo uses for its simulation is the camera link. Very confusing, I know; it trips a lot of people up and you end up with cameras pointing the wrong way, but that's the way you've got to do it, because Gazebo uses one standard, the ROS standard of x forward, whereas other nodes will be expecting the image standard of z forward, x to the right and y down.

With those settings all set up, let's minimise that and try re-running Gazebo. Oh, and I've just realised we've still got our laser visualised, so let's open up our lidar file and set visualize to false. It was good in the last tutorial while we were testing and developing the laser, but we're happy that it works now, so we can stop visualising it; it'll still be running and publishing the lidar data to the scan topic. Now we can see our robot, and we can actually see the little visualisation of the camera: even though it's kind of looking through it, we can see that it's seeing this little bollard. I'll open up the teleoperation node, and as I drive the robot around you should be able to see it looking at different things. What should happen is that once we're more than eight metres away from a particular object, it should start to disappear, so you can see these rearmost cones starting to fade out of view, and once we get eight metres away from the bollards they'll start to disappear as well. That shows the maximum clipping distance we set.

So what about RViz? Let's open a new terminal, start up RViz, set the fixed frame to odom, and this time add an Image display. In the drop-down we should see camera/camera_info and camera/image_raw, and we can select the image topic. We're not seeing anything right now because we drove the robot too far away, so I'll drive it back towards things and we should see them come into view. We can also add a Camera display. The Camera display is very similar to the Image display, so again we select camera/image_raw, and it looks the same, but you might just be able to see there's a bit of a grid here. What this does is use the camera info to figure out where the image sits within the world, so if we had other robots driving around, or obstacles we'd detected, or maybe a grid map we'd laid out, we'd be able to see the camera image overlaid on top of the other contents of the scene. As it is, all we've got right now is the background grid, but we can see that the horizon of the grid lines up with the horizon in our image, which is good; it means we haven't got our image pointed in some weird direction.

Now, you'll notice in the drop-down we have the options of camera/image_raw and camera/camera_info, but there's no option for a compressed image, and that's because at the moment we haven't actually got the plugins installed for image compression.
So we'll close these down, open up a new tab, type sudo apt install ros-foxy-image-transport-plugins and enter the password. That installs the compressed image transport as well as a couple of other things. Now that it's installed, we can go back to our Gazebo window; I'm just going to source the installation again in case it hasn't been found (it should be all right), rerun the Gazebo simulator, and while that's starting up also rerun RViz, set the fixed frame to odom and add our Image display again. What we should find now is that we've got the option for image_raw like before, but we've also got compressed, compressedDepth and theora. If we go to compressed we might expect to see the image, but it says "no image". What's up with that? RViz can't actually handle compressed images, which is a bit annoying.

So instead we're going to install another piece of software. I'll use this window again: sudo apt install ros-foxy-rqt-image-view. What rqt_image_view does is let us view both compressed and uncompressed topics. We can now type ros2 run rqt_image_view rqt_image_view, it pops up a little window, and we can see our topics: sure enough we've got image_raw, but we can also view the compressed topic, and if we drive the robot around we can see that topic being updated. That's great: we've now got the Gazebo simulator publishing compressed and uncompressed images from a simulated camera.

Sometimes, though, when you're using a real camera driver that hasn't been written properly, you might only get a compressed image or only an uncompressed image. So before we move on to working with a real camera, I'm just going to show you how to create a compressed or an uncompressed topic if you're missing the other one. I'll open up a new tab and type ros2 run image_transport list_transports, and this tells me all the different image transports our system knows about right now: compressed, compressedDepth, raw and theora, the same ones we just saw. To take a topic with an image of one type and republish it as another type, we use the republish node from image_transport. We need to say what the format of the input is and then the format of the output, so in this case our input is compressed and we want the output to be raw. We also need to specify the topics, and that's done with --ros-args as remappings: we remap the compressed input topic, in/compressed, which from up here is /camera/image_raw/compressed, and we remap the raw output topic, out; /camera/image_raw is already taken, so let's call it /camera/image_raw/uncompressed. Now that's running, and if we refresh the list of topics we should see /camera/image_raw/uncompressed, and sure enough it's working: if we go back and drive the robot around, it updates. That's taking the compressed image published by Gazebo and uncompressing it onto this new topic, which can be helpful if you want to compress data, send it over the network, and then uncompress it on the other end, that kind of thing.
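Written out in full, the republish command used here looks roughly like this (the topic names are just the ones from this example, so adjust them to your own setup):

    ros2 run image_transport republish compressed raw --ros-args \
        -r in/compressed:=/camera/image_raw/compressed \
        -r out:=/camera/image_raw/uncompressed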
Now that it's all working in Gazebo, we can have a go at connecting to a real camera. We'll start by plugging the camera in. The camera I'm going to be using is the Raspberry Pi Camera v2, and it connects with this little ribbon cable: you lift up the little black clip, put the cable in, and push the clip back down, and make sure you do that on both ends. Here are some pictures of what it looks like at each end, so if you're having any trouble you can pause and check it out. I've also got a little 3D-printed mount that I made for mine on the prototype; I'll include a link to that in the description, but I'll be redoing it for the new build, so you don't need to worry too much about that.

Before we start there are a few pieces of software to install. I've connected to the Raspberry Pi with a monitor, just because I find it a bit easier for getting the camera set up, but you can do all of this over SSH if you want. We'll install libraspberrypi-bin, v4l-utils and ros-foxy-v4l2-camera; v4l is Video4Linux, a common video driver framework for Linux. Once that's installed, we also want to make sure we're in the video group, so type groups; I can see my user is in the video group, but if you're not, the command to run is sudo usermod -a -G video followed by your username, and once you've run that you'll want to log out or reboot. I don't need to do that because I'm already in the video group.

The Raspberry Pi comes with a command we can run to check whether the camera is connected: vcgencmd get_camera, and it tells us the camera is supported and detected, so that's good, the camera is connected. Now we can type raspistill -k to stream data from the camera, and sure enough there it is: the camera is seeing the little coordinate marker I've got next to my wall. To quit, you just type x and then Enter. Now that the camera is working, we want to check that the Video4Linux subsystem can see it, so we type v4l2-ctl --list-devices, and you can see /dev/video0 at the bottom; that's the one we want, so it's connected there. There are a couple of other things to quickly install that I forgot to do before: the image transport plugins and rqt_image_view. We already saw how to install those for the simulator, we're just doing it on the Pi now, so we'll speed through that.

Now we can run the Video4Linux driver node: ros2 run v4l2_camera (that's the package) and then v4l2_camera_node (the node name). It takes a couple of parameters. The first is the image size, which I'm going to set to 640 by 480; you specify that as an array in quotes. We also want to set the transform frame our camera is in, and that's the camera_frame_id parameter; you'll remember from before that we're setting ours to camera_link_optical.
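As a quick reference, the commands from this section are roughly the following (package names are for ROS 2 Foxy; substitute your own username, and note this is a sketch of what was run rather than a verbatim listing):

    sudo apt install libraspberrypi-bin v4l-utils ros-foxy-v4l2-camera
    # plus ros-foxy-image-transport-plugins and ros-foxy-rqt-image-view, as installed earlier

    sudo usermod -a -G video <your_username>   # only if you weren't already in the video group; then log out or reboot
    vcgencmd get_camera                        # should report the camera as supported and detected
    raspistill -k                              # preview the camera stream; type x then Enter to quit
    v4l2-ctl --list-devices                    # the Pi camera should show up as /dev/video0

    ros2 run v4l2_camera v4l2_camera_node --ros-args \
        -p image_size:="[640,480]" \
        -p camera_frame_id:=camera_link_optical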
Hit Enter and that starts up the driver node, and it'll start spewing a bunch of errors. You don't need to worry too much about these: most of them are telling us about things on the camera that it can't control, and that's fine, I'm not too worried about controlling those things. It's also telling us that it hasn't got a calibration file, but I'm not too worried about that either. If those are a problem for you, you might need to find a different driver.

Now we can start up rqt_image_view, and we'll see we've got our image topics there, including the compressed ones. I'm going to select image_raw/compressed just for the sake of it, and now I'll reach over and pick the camera up so you can see that it is in fact streaming image data on that topic. That should all be working, and you should also be able to connect from another computer and see the images streamed over the network. It's worth noting that, at least in the current version, the v4l2_camera node can't control the frame rate of the video; it looks like that's an update they're planning to push out, so hopefully it'll be in soon. I haven't tested it, but it'll be good once it's in, because it lets you restrict how fast the frames come out; if they come too fast they can bog down your network a bit.

Now I'm just going to show that we can run this from a launch file, just like we did with the lidar. I'll show you the contents of my camera.launch.py; I've just got it in my home directory at the moment, but I'll put it in the repo with everything else and include a link in the description (there's also a rough sketch of what a launch file like this can look like at the end of these captions). Just like with the lidar one, I can type ros2 launch camera.launch.py, although normally it would live in your package. That's running now, it's all the same text, just not coloured any more, and once again we can run rqt_image_view, and sure enough there it is, working just like before.

So there you have it: we've got a bit of a handle on all things cameras and images in ROS. We can connect to a virtual camera or even a real camera, and this opens up a huge array of options. We can start doing things like driving a robot around and getting visual feedback on a screen, or having a robot track an object with its camera and follow it around; those are both things we'll be covering in future tutorials, so make sure you subscribe so you catch them when we get there. Before that, in the next tutorial I'm going to show you how to set up a depth camera. A depth camera not only sees the world around it but also sees how far away things are; it's kind of like a cross between a camera and a lidar, and I'm really excited to set that one up. If you have any other questions, let us know in the comments; otherwise, as always, I'll see you next time.
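As mentioned above, here is a rough sketch of what a camera.launch.py along those lines could look like. It's an assumption based on the parameters used in this video, not necessarily the exact file from the repo:

    from launch import LaunchDescription
    from launch_ros.actions import Node

    def generate_launch_description():
        # Launch the v4l2_camera driver with the same parameters used on the command line
        return LaunchDescription([
            Node(
                package='v4l2_camera',
                executable='v4l2_camera_node',
                output='screen',
                parameters=[{
                    'image_size': [640, 480],
                    'camera_frame_id': 'camera_link_optical',
                }],
            ),
        ])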
Info
Channel: Articulated Robotics
Views: 81,703
Id: A3nw2M47K50
Length: 32min 1sec (1921 seconds)
Published: Tue Jul 05 2022