Learning Rust: Opening and handling files in Rust

Captions
Okay, welcome back everyone. In this video we're going to take what we built in the last video and build on top of that. We're going to start using some standard libraries in Rust to open files, read them and manipulate them, with a view to opening and reading FITS data, remembering that FITS is a common, extensible image and data format used in astronomy. So we're going to start actually parsing that, or at least opening those files, today.

Remembering back to the last video, we had a binary application that Cargo built for us. It was just a hello world application that we updated; we've put a much more meaningful welcome message there. We'll just make sure that it still compiles and works, so just do cargo run. Happy, everything works.

So what we're going to do is improve this application and start using some of these standard libraries. But before we do that, we want to find some FITS data and actually download it, because we want to work with real FITS data as a test. The more FITS data we get the better; in fact it'd be great to have lots and lots of FITS data. So let's go across to NASA's website. This is fits.gsfc.nasa.gov, and there are some sample FITS files here. There are two in particular that I'd like to look at, and they're quite different. This one is a 1600 by 1600 array mosaic constructed from four different chips on the telescope. These are both from the Hubble Space Telescope, so that's what HST_ means as a prefix. That's really cool, right? We're taking images, or data rather, that was captured off the mirror on the Hubble Space Telescope, imaged using these CCD cameras, and then formed into FITS data to make available for research.

So we're going to use these two files, and what I'm going to do is download these to my local
project, or my local directory that I'm doing my work in, and I'll make that available in the GitHub repo as well. To do that I'll just open up my fits rust folder here and create a data directory, and then I'll take this one that looks interesting and save that, if it ever works, to the data directory. And I'll do the same thing with this other one, save it here.

So they're two quite different files. If we go back up into Visual Studio Code we can open these up and have a look at them. This is the first one. I don't know what it looks like; we have some text at the top and then a whole bunch of binary data that Visual Studio Code is trying to render as Unicode text or something. The next one looks similar, but it has a lot more text data here, then we have more binary image data (I guess it's image data; it could be anything, right?), and then we have another text block down here, called the table extension. Right at the bottom we have some numeric data that looks like scientific notation for something that we don't really understand yet. Anyway, you can start seeing some structure here, can't you? Text, binary, text; similar to the first one, which is just text and binary by the looks of it. We're not really going to explore the detail of what makes up a FITS file yet, but we are going to start using these two FITS files to start interrogating, opening and playing with these files.

All right, so to do that we're going to have a look at some common libraries that are used in Rust to do this kind of work. If we just go in here and update our program, we're going to use something from the standard library, std, called file system, or fs, and we're going to import a struct, effectively,
called File. This File struct will allow us to open, read and write as well, and it does a bunch of other interesting things, and we've got some examples of how to use that. So that's the first thing that we're going to use, the first struct from std::fs. Then we're going to use std::path, and this will import the Path struct. Let's go and have a look at the detail here. This is actually a slice of a path, so it's not a string, it's a slice (we'll go into that later), and it can be used to represent paths to files on a file system, so we'll absolutely need to use that.

We're also going to start using error handling, so we'll use std::error::Error and import that as well, if I can type correctly; I'm missing a colon there. Error is a trait, I believe. What we're going to start noticing is that we come across something called the Result type. That is a generic, sort of a template type, and it holds either the type that we're after or an error, and Error is a trait that probably provides a whole bunch of methods that we're interested in. Anyway, we're going to use those three things and we'll cover this more. Rust has a very specific way of managing and handling errors, and it almost always relies on using this parameterised enum type, like we've seen here, called Result. We'll go through that in a little bit more detail.

All right, so at the moment we're starting to see some warnings. Visual Studio Code is pretty cool because you can install Rust plugins, and it will effectively try to compile the code as you write it, or at least check it for errors, syntax errors and what have you. It's just calling out that we haven't used these imports yet, so these aren't catastrophic errors, because we're only just writing this code.

All right, so the first thing we want to do is get a
path to one of our data files. We're going to use this module, and it's going to give us a way of getting a handle to a path in the file system. To do that we're going to declare a new variable, and the way you do that in Rust is using the let keyword. So we're going to say let, and we'll call it data_path, equals Path::new, and then a string literal with the path of the file that we're interested in. Now, I really don't want to type that out by hand, so I'm just going to cheat here and take the filename. It should be relative to the root of our project directory, so data/, then the filename, and .fits. Okay, let's just see what the return type is. The return type is a Path slice. Okay, good.

Then what we're going to do is use that data_path variable to open a file, or rather get a handle on a file, and to do that we're going to use this File struct that we imported here. So we'll do let, and what are we going to call it, let's just call it file, equals File::new, no, File::open, and then the path, which is data_path, and that should hopefully work. We still have a few warnings about unused symbols and variables and what have you, so that's okay. If we run this we should see not much, but we should see it compile and run without error. That's good.

Now, look at what open does. open takes a path as an argument and returns something called an io::Result of type File. This is a parameterised type, and I'm betting that Result is an enumeration, so let's just double check that. I think I've got that open already. Was that the correct type? Yeah, io::Result, cool. So this is the standard io Result type, this is the definition, and you can see that, yeah, it's called a Result and it holds either the type that we're interested in or an error.
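As a rough sketch of the two steps just described (building a Path and calling File::open), the snippet below shows that File::open hands back an io::Result<File> rather than a File. The helper name try_open and the path data/example.fits are made up for illustration; the actual filename used in the video isn't given here.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical helper: forwards to File::open so the return type is visible.
/// The result is Ok(File) on success or Err(io::Error) on failure.
fn try_open(data_path: &Path) -> io::Result<File> {
    File::open(data_path)
}

fn main() {
    // Assumed placeholder path, relative to the project root.
    let data_path = Path::new("data/example.fits");

    // `file` is not a file handle yet; it is a Result wrapping one.
    let file: io::Result<File> = try_open(data_path);

    // display() gives a printable view of the path.
    println!("opening {} succeeded: {}", data_path.display(), file.is_ok());
}
```

Nothing here stops the program when the file is missing; the Result simply carries the error along until something inspects it.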
Now let's just have a look at what the error is, or what the error does. Okay, cool, so we can actually call a whole bunch of functions on the error as well. Anyway, we're not going to look at that. So this is basically an enum; it's a Result type and it has those two components in it, the type and the error.

In Rust we have this keyword called match, and this match keyword is going to allow us to destructure things like structs and enums, and there's a whole bunch of other things we can do with it; it's a really powerful idea. So instead of just calling File::open, which returns this enum (that's what file would be), we want file to actually be a handle to the real file. So we're going to do match File::open, which is an expression, and there are going to be two options: it's potentially going to return an error, or it's going to work.

Now, in the case where it is an error, if we can't open the file, then we're going to stop the execution of the program, and the way to do that in Rust is to use the panic macro. So we'll do panic, and we'll say something like couldn't open, then the file name, and then a reason. These curly braces are basically placeholders for values that will be interpolated into that string when it's printed, and in this case it will be printed to standard error, I suspect. So we'll say couldn't open, and then we'll pass, let's see, data_path.display(), so that we get something that is printable, and then we'll just say whatever the error was, so we'll just call description on the error. If we're successful in opening the file, then we'll just return the file, or rather we'll just assign this variable the value file, which is the open file handle. Okay, so that should give us a handle on
the actual file. Let's run that and see if it works. Yep, okay, looks good. Let's try breaking the file name and seeing what happens, so I've just removed the last character. Good, it's complaining: it's throwing an error saying thread main panicked at couldn't open data, then printing the path, then entity not found. I'm sure we could probably have had a better error description than that, but that's okay. All right, so undo that and make sure that we have working code again. Okay, good, happy days. That's all looking pretty straightforward.

Now we want to read the contents of this file. So far we have a path and we're opening it; now we want to actually slurp up the contents of the file. There are a number of ways we could do that. We could read line by line; there are a number of different methods that we can call on File. But firstly, let's create a data structure to store the contents of the file in, and what we're going to do is create a vector, so we'll do let fits_data = Vec::new(). Okay, so a vector... that's interesting. All right, it's okay, I think I know what it's complaining about.

A vector is sort of like a growable array; it's similar to a vector in other languages, something that we can just push entries on to and iterate over, and it's a really common workhorse data structure in Rust, at least as far as I understand it. What we'll do is read every byte in the file and put every byte that we read from the file into the vector. It's not efficient, the way we're going to do it: we're going to read the entire file in one go and store it all in the vector. But that just lets us process the vector and pass that around and do things
with that later, so we can close the file; we don't need the open file handle. It is complaining at the moment, and the reason it's complaining is that it can't infer the type. Vec is a parameterised struct, or parameterised type, and what it's complaining about is that we haven't actually told it what type we're going to be putting in the vector. Now, we could be explicit about it, but what Rust is great at is inferring that. So let's write some code to actually use the vector, and once we do that Rust will be able to work out what type of data we're putting in the vector.

To do this we're going to call file.read_to_end, and read_to_end will take a vector of unsigned 8-bit integers. So this is u8, unsigned 8-bit ints, which is all we need for storing the bytes in this file. It also needs to be a mutable reference, but we'll cover that. So let's just get started: let's pass in a reference to fits_data and see what blows up. Okay, the first problem we have is... does it show the error here? It doesn't look like it. All right, let's actually run cargo run. What's it going to say?

Rust has really nice errors. I love a lot of the errors that the compiler gives us. So it says: no method named read_to_end found for type File in the current scope. Items from traits can only be used if the trait is in scope; the following trait is implemented but not in scope, perhaps add a use for it. Which is just wonderful, right? So std::io is a module that contains a whole bunch of I/O traits that we will want to use, and in this case the Read trait will allow us to use read_to_end. I mean, we could do something like this: we could say use std::io and then a glob, and that would immediately
make this function available, but we can also be a lot more explicit about it and just use the Read trait. It's really up to you; I don't know how efficient one would be over the other, but regardless, those are our two options.

Now that we've moved past that error, we have another error, which is saying that our reference to the fits_data variable (that's what the ampersand is doing, it's saying give me a reference) differs in mutability. Okay, so this is where we get into a little bit about how Rust thinks about ownership, mutability and borrowing. I'm not going to go through it in detail in this video, so I'm just going to hope that you trust me and follow along, and then we'll cover it as we go and learn more about Rust. I'll refer you to some excellent documentation in the Rust book that goes over this in a lot of detail, or enough detail.

Suffice to say that, firstly, whenever we declare a variable in Rust, the default is that it's immutable: it can't be modified. For the functional programmers in the audience, you'll probably smile; it'll make you happy. In pure functional languages like Haskell everything is immutable, and that causes challenges: you have to think about your problem in a different way because of that. Rust by default makes everything immutable, but there are still really good reasons when you're writing Rust code to make data structures and variables mutable, and in this case we certainly want that. We've got a vector, and it knows what type it's going to store now, because our read_to_end call implies that it's going to store unsigned 8-bit ints, but it needs to be able to be updated, it needs to be mutable. To do that in Rust we add the mut keyword. This makes fits_data mutable, so that means that we can
now add and remove things from our vector. That's great. It still doesn't fix this problem though, because this is saying borrow fits_data, get a reference to fits_data, but it's an immutable reference. What we need to do is explicitly make it mutable, and to do that we just write &mut fits_data. That will mean that when we reference, or borrow, to use Rust's terms, this variable for read_to_end, it will be a mutable reference; it'll be able to be modified.

And now we're finding the next error: cannot borrow file as mutable, as it is not declared as mutable. So again we need to come back to our declaration of file and add mut to say that file is mutable. Okay, so now we actually have no errors. We've got another warning though, which I think is just saying that the result of this is unused. So what does read_to_end give us? It gives us a Result of type usize. What it's going to do is read all the bytes until it hits end of file and put them in our vector, great, but we are going to get this Result again, which is another enum. We could go through the documentation again; let's have a look. read_to_end, Result, okay, it's another enum: it gives us either the type that we wanted or an error.

So guess what we do: we match again to destructure that, and then we have two options, we have Err and Ok. Let's just do the Err pattern, so we'll panic again and we'll just say couldn't read file, and we'll do it like this again: we'll pass data_path.display() and then that description. Okay, so that's the case where we get an error. In the case where we're able to read the entire file, we want to know what we're getting, and if we go back we can see read_to_end reads all bytes, placing them in buf, yeah, but I'm
just trying to see whether or not it tells us here in the documentation. Yeah: if successful, this function will return the total number of bytes read. So that's actually what we'll get back here, the number of bytes, and what we can do is use the println macro to say successfully read however many bytes from whatever the file is. So we could just pass bytes here.

Okay, so this thing should actually do something useful now. It should get a path, open it with some error handling, create a vector to store bytes from the file, read right to the end of the file and store that in the vector. If it can't, it'll stop execution and print that out, and if it can, it will print a success message. So let's have a quick look and see what actually happens. If we run cargo run again, we can see welcome to FITS processing tool built in Rust, and then successfully read 69,120 bytes from data blah blah blah. Let's just validate that, I guess; let's have a look at what Windows Explorer thinks the size of the file is. It thinks that the size of the file is 69,120 bytes. Success!

All right, so that's where we're going to stop, because the video is getting long enough, I think. We didn't get to the FITS part: we downloaded some FITS data, we started opening it and we briefly looked at it. In the next video what we'll do is go through and start writing a very simple parser. In fact I wouldn't even call it a parser; we'll just start opening the data files and being pragmatic and starting to understand the data. To do that, though, we're going to finally go through and look at the FITS standard, which is quite long and quite detailed. We'll probably skip all the history stuff and go down to the FITS file organization, and then you can see here it
goes through what a header and the data look like, how extensions work, and a whole bunch of other things that should be quite interesting. We'll probably just start with the overall file structure, and we'll see that there's a primary header and data unit, there are extensions, and then other special records, and we'll start actually understanding that. In fact we already saw a structure like this, didn't we? We saw some textual header data, we saw some binary (that's probably what the primary data array is), and then I think we saw an extension, a table extension, in one of them. Okay, so we'll do that in the next video.

At least what we've got now is a more useful program: we're doing error handling, and we're able to open and read an entire data file and count the number of bytes. Happy days, so that's great. If this is interesting, please subscribe, and also leave your comments below. Just remember, be constructive. I am a complete Rust newbie; I'm learning as I go. Yes, I've programmed in other languages, but not in Rust to any significant degree, so I am still learning. If you're a Rust expert then of course please leave your comments about how I can improve this and what I'm doing wrong. I'm sure I'm doing lots of things wrong, but this is you watching me learn, and hopefully you're learning along with me. I hope to see you in the next video. Thanks.
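Pulling the steps from the video together, a minimal sketch of the finished program might look like the following. It is an approximation, not the author's exact code: the function name read_fits_bytes and the scratch file written in main are assumptions so that the example runs anywhere, and the error is printed with {} directly rather than via description(), which has been deprecated in newer Rust releases.

```rust
use std::fs::File;
use std::io::Read; // the Read trait must be in scope to call read_to_end
use std::path::Path;

/// Hypothetical helper: reads every byte of the file at `data_path` into a
/// vector, panicking with a message if the file can't be opened or read.
fn read_fits_bytes(data_path: &Path) -> Vec<u8> {
    // `mut` because read_to_end borrows the file mutably.
    let mut file = match File::open(data_path) {
        Err(why) => panic!("couldn't open {}: {}", data_path.display(), why),
        Ok(file) => file,
    };

    // The element type (u8) is inferred from how the vector is used below.
    let mut fits_data = Vec::new();

    // read_to_end needs a mutable reference so it can append to the vector.
    match file.read_to_end(&mut fits_data) {
        Err(why) => panic!("couldn't read {}: {}", data_path.display(), why),
        Ok(bytes) => println!(
            "Successfully read {} bytes from {}",
            bytes,
            data_path.display()
        ),
    }

    fits_data
}

fn main() {
    // Assumption: a small scratch file stands in for the real FITS files.
    let data_path = std::env::temp_dir().join("example_scratch.fits");
    std::fs::write(&data_path, b"SIMPLE  =                    T")
        .expect("couldn't write scratch file");

    let fits_data = read_fits_bytes(&data_path);
    println!("vector holds {} bytes", fits_data.len());
}
```

Returning the vector from a function means the file handle is dropped, and so closed, as soon as the read finishes, which matches the point made earlier about not needing the open handle once the bytes are in memory.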
Info
Channel: snarkyboojum
Views: 1,519
Id: vp4o8fbjM8Y
Length: 26min 44sec (1604 seconds)
Published: Sun Oct 13 2019