DEMO: AWS CLI setup and basic use for S3 data movement

Captions
Good morning or good afternoon, depending on where you're at. My name is Doug Psaltis, and today we're going to be going over power-user commands for Amazon's S3 CLI. We're going to be focusing on the AWS CLI, and this is a live demo in a real environment. So without further ado, let's get through a couple of quick housekeeping items.

The first one is what we'll be doing today: we're going to be looking at the AWS CLI, we're going to go over some use cases, and then we'll jump straight into the demo. In the demo, I'm going to teach you how to install it. I'll be installing on Linux, but we can install it for Mac and Windows as well. We're going to configure it against Amazon's AWS S3, then I'm going to show you how to configure it for an on-prem S3 solution like Cloudian HyperStore, and then we're going to demo a bunch of commands.

Before we get into this, the first housekeeping item: if you look at your GoToWebinar panel, there is a little tab called Handouts. There are five items in there, and the most important one is the AWS CLI guide. In it you're going to find a complete guide on how to install the AWS CLI for Windows, macOS, and Linux. I outline Oracle Linux, RHEL, CentOS, and Ubuntu, but of course it'll work with Debian; you can probably use whatever your favorite flavor is, Mint, it doesn't matter. It's got configuration guides, it's got shortcuts for aliasing commands, and of course it's got what you're seeing right here, the CLI cheat sheet, so for basic commands you can just reference it and be on your way.

So why the AWS CLI? There are multiple CLIs out there that work with S3; probably the most famous one is s3cmd. But the reason we're highlighting and working with the AWS CLI is that it's definitely the most actively maintained. If you look at my screenshot of GitHub on the right, you can see there have been 254 releases. It's updated constantly, and it works with all of the AWS services. I've only named three of them here, but if you go to the project page, it works with at least 50 Amazon services. Something like s3cmd works only with S3 and hasn't been maintained as much over the past two years, so you're going to get more features, more functionality, and more cross-platform support out of this tool.

Let's talk about some of the basic features we'll be going over today. You're going to be able to create and delete buckets, and to copy or move objects into those buckets from your file system, or vice versa. This one's really important: parallel multipart uploads. This is a key differentiator between the AWS CLI client and the other ones out there, and it does two different things. First, if you have a folder full of small files, it moves them in parallel, which is much faster than moving one file at a time. When we think about cloud storage, we know it's high latency (you're talking over the Internet to the Amazon cloud), but we also know it's capable of a lot of throughput, and to get that throughput, unlike when you're in Windows and try to copy a directory, we need to move these files in parallel. Second, if I have a really big file, like a movie file, it chunks that file up and moves all of the pieces in parallel. By default that's ten at a time; in a more advanced session later on we'll talk about how to raise that limit.
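[The "ten at a time" behavior Doug mentions is governed by the AWS CLI's documented s3 configuration settings. A minimal sketch of how you would raise it; the values shown are illustrative, not his recommendations:]

    # default is 10 parallel requests; raise it to widen the pipe
    aws configure set default.s3.max_concurrent_requests 20
    # files above the threshold are split into chunks uploaded in parallel
    aws configure set default.s3.multipart_threshold 64MB
    aws configure set default.s3.multipart_chunksize 16MB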
Then of course we can sync data; think rsync. The difference between sync and copy is that copy copies your data up to the cloud, or vice versa, while sync copies only files that do not exist or have been updated: it compares, and checks whether each file is missing or has changed. A very handy feature that, again, is built into the AWS CLI. We're not going to get into advanced features today; we can save those for another webinar, because we only have 30 minutes. Those would include things like encryption, include/exclude clauses, tiering data, information lifecycle management; there's just so much more we can collaborate on.

So let's go over a couple of quick use cases. I've used this before to script backups: I had a watch folder, or I guess you could call it a watch bucket in this case, and I had SQL Server doing its own backup to a folder. Previously I was copying that to a SAN or a NAS; now you can use this command to sync it up to the cloud. Likewise with genomics, which we'll be demoing: something like an Illumina sequencer that's sequencing genomes produces hundreds of thousands or millions of small files, and again, we highlighted that one of the key features here is the ability to move those in parallel; we'll be able to show that. I talked a little bit about really large files and why this command is better than the other ones on the market, even compared to a standard CIFS or NFS share. When you copy a big file over CIFS or NFS, it starts at the beginning and goes straight through to the end; with the ability to chunk your data in parallel and do resumes, this is obviously a much more effective tool. And likewise, and this might be a little advanced, if you have video surveillance or CCTV equipment that is dropping files into a folder, you can sync those up to the cloud. A little more advanced would be to take that feed and pipe it straight up to the cloud, but that's something we can get into another day.

So without further ado, let's jump straight into the demo; we'll be demoing installation, configuration, and a number of commands. Let me clean up my screen. I'm starting off with Ubuntu; I'm on the Trusty release, 14.04. Of course this works all the way back to 12.04, I've tested it, and if you're running Xenial, 16.04, it works just the same. The first thing you need is Python installed, and the good news is Python is already installed on 99% of the Linux distributions out there (maybe not Alpine), and it's very easy to set up. The next thing you need to do is install pip. You can install it with apt, and you can see it's already installed on this machine; Ubuntu already packages pip. But if you're using a minimal edition of Oracle Linux, CentOS, or RHEL, you're not going to get pip. If you look in the configuration guide, we have the RPM you need: what you'll actually be doing is adding EPEL, the Extra Packages for Enterprise Linux, and then proceeding as normal with the python-pip package install. Since I already have pip installed, the easiest way to install the AWS CLI is to use pip as my package manager: pip install awscli. The great thing about this is that it's common across all your platforms. Whether you're using Windows, Mac, or Linux, all we need to do is install pip, and pip becomes your package manager: it keeps the AWS CLI up to date, and it works the same everywhere.
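[A minimal sketch of the install path he just walked through; the package names are the Python 2-era ones current at the time of this 2016 webinar:]

    # Debian/Ubuntu: pip is packaged
    sudo apt-get install python-pip

    # minimal RHEL/CentOS/Oracle Linux: add EPEL first, then pip
    # (on RHEL/Oracle, install the EPEL release RPM referenced in the guide)
    sudo yum install epel-release
    sudo yum install python-pip

    # then, identically on Linux, Mac, or Windows:
    pip install awscli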
And so it's as simple as that: I've installed the AWS CLI. Let me just clear the screen. Now that I've done the installation (we'll check on Windows and some other installers after this), I need to configure it, and what I'm going to run is this command here: aws configure. The first thing it asks me for is my AWS access key. If I go into the AWS web GUI and go over to my security credentials, I'm going to see an access key; in my case, this one right here. It looks like a long alphanumeric string, and you'll understand why I cut and paste it in: there's no way I'd be able to type it. The next thing it asks for is my secret key, which is my signing key, and I've got that right here. The third question is my default region. I'm broadcasting to you from outside the Bay Area, so I'm using us-west-2. This is something you can easily get from your Amazon session; in fact, I've got it open right over here, and you can see in my AWS console that my region is us-west-2, so I'll go ahead and type that in. And because I'm only using this for S3, I don't have to worry about the default output format (that's more something you'd use with EC2 and whatnot), so I'm going to leave it as None. It won't hurt at all if you type in JSON or one of the other formats. So configuring it to work with the AWS cloud is that simple, and now we can actually start using the command.

Now, I do have to do one other step, and the reason, I do want to say this for anybody that's cutting and pasting from my screen, is that these are not my real Amazon credentials, so I need to copy my real ones in. But first, let's look real quick at what this has done: if I do an ls -a, you can see I now have this hidden directory called .aws, and in that directory are two files. There's a config file, which has my default region, and a credentials file, which has my credentials. Because these are fake credentials, I have a real credentials file, and I'm simply copying it over the top of the fake one; then we can start using these commands.

All right, now that I've got my real credentials in there, I can do a command like this: aws s3 ls, and now I can list my buckets. As you can see, I've got a bucket right here called cloudian-demo-data, and if I log in to my AWS console, I can see the exact same bucket there. In fact, I should be able to create a bucket: I can do aws s3 mb to make a bucket, and again, all bucket names need to be globally unique, so we have to find one that's not in use. I'm hoping that if I use today's date, 10-19-2016-cloudian-demo, it's not taken. One other quick note: when using this, you do have to prefix everything with s3://. And with that, I've created a new bucket; if I go back and refresh my console, I can see the new bucket there as well.

So now let's see how we connect this to an on-prem S3 source. It's going to be the same commands, but for some of the demos I want to give later, I don't want to be maxing out my internet connection, because otherwise this webinar is going to run really slowly. What I want to do now is run aws configure again. I could swap out my current credentials with my credentials for running on-prem, but I don't want to do that.
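[For reference, a sketch of the two hidden files aws configure writes under ~/.aws as he describes them; the key values here are placeholders, not real credentials:]

    # ~/.aws/config
    [default]
    region = us-west-2

    # ~/.aws/credentials
    [default]
    aws_access_key_id = AKIAEXAMPLEKEYID
    aws_secret_access_key = exampleSecretAccessKey0123456789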
I want to have two sets of credentials in the system: one for when I'm using the true Amazon cloud, and one for when I'm using my on-prem storage. The way to do that is --profile, and I'm going to call this one cloudian. It asks me the same questions: what is my access key, what is my secret key. I've got my Cloudian interface open right here with the credentials showing, so I can grab my key and my secret key; it's probably easier if I cut and paste from my file. When it asks for the region, I'm going to tell it us-east-1. The reason I do this is that it makes compatibility really nice. And again, I don't need an output format. So now I'm set up with two sets of credentials in the system: I have my Amazon credentials, so anytime I type plain aws s3 I'll use those, and anytime I specify the profile cloudian, I'll use my Cloudian on-premises profile.

To do that, the command is going to look a little different. Normally I would type aws s3, but this time I need to add --profile=cloudian, and I also need to give it my endpoint. Normally it goes straight to the Amazon cloud, but I've got my own internal servers here, so I need to give it the FQDN or IP address of my on-prem system: --endpoint-url=http://s3-region1.cloudian.local. Now I can run my command. I tell it s3, and I want to list, and I can see that I have one bucket.

Now, it's a little cumbersome to type that whole command every time. Obviously, if you're working in life sciences and have a LIMS system, it's not a big deal to script it, but I want to make this easier to use, whether I'm on Linux, Windows, or Mac, and the way I'll do that is to alias the command. It's actually really simple. In Linux I have this file called .bashrc, so I'm just going to edit it real quick. You can use any editor you want; I happen to be a vi guy. When I scroll down in my window, I find some aliases I already have in the system, and I'm going to insert a new one. Just to save time I'll cut and paste, but I'll explain what it does: I'm aliasing the command hyperstore, so whenever I type hyperstore, it effectively types the following for me: aws, the profile cloudian, the endpoint, and even the subcommand s3. It saves me an awful lot of typing. Now I can save the file, and I can either log out and back in, or just reload it by doing source .bashrc.

Now watch what I can do: instead of typing that long command up above, I can type hyperstore (in fact, I can even hit Tab and use autocomplete), then ls, and I can list that same bucket. I can also do hyperstore mb for make-bucket. Again, whenever I'm working with S3, I've got to type s3://, and then illumina, because the next thing we'll do is work with some genomics files. I can see that I've now created a bucket, and I can list it. And to prove to you that this works on other systems, I can switch over to an Oracle Linux 7 system that I've set up the exact same way and do a hyperstore ls, and I can see illumina. I can also do this on a CentOS 6 system with hyperstore ls. I've set up my Mac the same way.
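[A sketch of the .bashrc alias as described; the endpoint hostname is the one from his demo, so substitute your own profile name and FQDN or IP:]

    # ~/.bashrc -- collapse the long command into "hyperstore"
    alias hyperstore='aws --profile=cloudian --endpoint-url=http://s3-region1.cloudian.local s3'

    # reload the file and use the alias
    source ~/.bashrc
    hyperstore ls
    hyperstore mb s3://illumina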
Now, with Mac it's a little bit different, and again, this is all in the guide we've made available for download. There's a slightly different file, called .bash_profile, and I've put the same command in there, aliasing hyperstore so it fills in the same thing. If I type hyperstore ls, again I've got my two buckets. And last but not least, for anybody who is a Windows user: this is perfectly compatible with Windows. The mechanism is slightly different; you have to use what's called doskey to create an alias in Windows, but again, it's all in the guide. So I can type hyperstore ls and see my bucket. In fact, I can do mb for make-bucket, s3://, and let's call this one movies, and I've created a new bucket called movies.

Now, to save everybody's eyes, let's go back to one of my systems that's already set up. Again, if I want to work with the actual Amazon cloud, I can do aws s3 ls and I've got my buckets. If I want to copy a file up: I believe I've got a pretty small file here, under bigfiles, called 2megs. What I want to do now is copy a file from my file system up to the Amazon cloud, so I do aws s3, and then it's really simple: cp for copy, then the file I want to copy (bigfiles and my 2 MB file, because again, my internet pipe is limited on my laptop for this demo), and then the destination. We created those buckets up in the cloud, so it's s3://cloudian-demo-data, and you can see I just copied that 2 MB file up to the cloud. If I check my Amazon console again, go into cloudian-demo-data and hit refresh, I should now have one object. There we go, the two-megabyte file.

Now we can do the same thing locally, with a much bigger file. I don't want to copy this big movie file to Amazon's cloud right now because of my limited internet pipe, but I do want to demonstrate how easy it is to move a large file. If we do ll -h, we can see it's a 350 MB file. I want to copy that to my on-prem object store, and I want it to be just as easy as if you had a mounted CIFS or NFS file system you were copying to. So I'll use my shortcut, the alias hyperstore. I'm going to copy Big Buck Bunny, a free MP4 that you can download, to the bucket we created called movies. I believe we created it as movies; there we go. What's happening now is that even though you see one stream on screen, I have 10 simultaneous streams moving that file up to the cloud. If I want to list what's in the bucket, I can do, again with my alias, hyperstore ls s3://movies, and it shows me I have one object in that folder.

I can do other neat things as well. Let's rerun that command: I can copy the file up and rename it along the way, calling it newmovie.mp4. After it copies, we'll list the container (the bucket) again, and we see that I now have two files. They're both the same size, but I was able to change the object's name. I'm also able to create virtual subdirectories: if I copy that file up once more, but this time I want it in a virtual subdirectory called doug, I simply need to put that in front of the object's name, and I'm able to copy it up again.
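[The copy variants from this part of the demo, collected in one place; file and bucket names are the ones used on screen, and the small file's exact name is approximated:]

    # copy a small local file up to the Amazon bucket
    aws s3 cp bigfiles/2megs s3://cloudian-demo-data/

    # copy a 350 MB file to the on-prem store
    # (10 parallel multipart streams by default)
    hyperstore cp BigBuckBunny.mp4 s3://movies/

    # copy and rename in one step
    hyperstore cp BigBuckBunny.mp4 s3://movies/newmovie.mp4

    # copy into a "virtual subdirectory" -- really just a key prefix
    hyperstore cp BigBuckBunny.mp4 s3://movies/doug/newmovie.mp4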
Now, what's interesting is that when I list it, I see this thing called PRE. PRE means prefix: I don't actually have a subdirectory; what Amazon is doing for me is creating a virtual one. So I've got this one called doug, and if I want to see what's in it, I can do /doug/ and see that newmovie is in there. It's very familiar, very much like using any normal Linux file system, and again, the same commands work whether I'm on Mac, Windows, or any normal Linux distribution.

So now let's do a couple of slightly more complicated things. First: we've shown you how to move a big file, a movie file, but what if I wanted to copy an entire directory full of files? If I do my hyperstore ls again, I've got a bucket here called bucket1, so I can do hyperstore cp, and I want to copy bigfiles, everything in there, to s3://bucket1. The way I copy everything is with --recursive (as long as I spell it right); recursive means it iterates through bigfiles and copies everything over. And that was extremely quick, because I'm using a gigabit line on-prem. I've got a bunch of twenty-meg and two-meg files that have been moved up, and if I look at what's in bucket1, I can see all of the files I copied. I could have given them each a prefix by putting a slash there; I just want to illustrate how easy this tool is to use and that I don't have to mount a file system. This works whether I'm on my Linux, Windows, or Mac laptop, in the office or at home, because it's all moving over HTTP or HTTPS, so there's no worrying about trying to mount a file system over the Internet, which isn't something you can do. I have access to all my files. I can even do neat things like this: if I'm running Windows, I can use a product like CloudBerry Explorer, do a refresh, look in bucket1, and see all the same files. I could also have used the Cloudian web GUI, or if I was moving these to Amazon, CloudBerry against Amazon or the Amazon GUI we saw a minute ago.

Now let's do something a little more complicated: instead of copying files, let's actually sync them. Syncing is really important when you have a data set that's constantly changing or evolving, and my favorite data set for that is Illumina sequencer files. If I look real quick, I've got this directory here called Illumina, and if I do a find on Illumina and pipe it to word count, you can see I have a hundred and sixty-two thousand files in this directory. I want to sync them up to the cloud; you can imagine my sequencer doing an experiment run and constantly spitting out new files. So once again I use my alias hyperstore, and this time the command sync. Sync is going to copy these files, because they don't currently exist at the destination; then we'll interrupt the copy and resume it, and it will pick up where it left off. It would also pick up any changed files along the way. So I'm going to sync Illumina to that bucket we created earlier, illumina, as shown below.
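[A sketch of the recursive copy and the sync he's running; directory and bucket names are as in the demo:]

    # count the files about to move (roughly 162,000 here)
    find Illumina | wc -l

    # recursively copy a whole local directory into a bucket
    hyperstore cp bigfiles s3://bucket1 --recursive

    # sync: upload only objects that are missing or have changed;
    # interrupt it and re-run the same command to resume
    hyperstore sync Illumina s3://illumina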
One quick note: when we're working with a normal file system, capital letters are okay, but when we're working with buckets in the S3 cloud, capital letters are not okay. Bucket names need to be DNS-compliant, so we make sure these are all lowercase. All right, we've got our sync running, and you can see it's synced up a thousand files; we have a hundred thousand-plus to go. The great thing is that it's moving them concurrently, ten at a time, so it's much faster than a normal copy or rsync or file-system command, and in a more advanced session we can talk about how to expand that pipe. But what I'm going to do is cancel this real quick, and we can see that in uploading I made it into experiment 5529, I'm on lane 1, and I made it up to cell 158. If I rerun that command, we don't want it starting from the beginning; we want it to pick up where it left off. What this command actually does is look at the date modified and the file size of all the files on my system and compare them to what's in Amazon: as long as a file has not changed, it picks up where it left off, and if any file has changed, it resyncs that file. So if I start this again right now, it's checking those 1,000 files, it's resumed, and if I pause it, we can see it picked up at cell 164. If we scroll up a little (it's probably too much to scroll through), essentially, in a second or two, it compared everything, confirmed it was all the same, and started right where we left off, moving those objects or files into the system.

If I want to see what they look like, I can once again do hyperstore ls s3://illumina. I can see a virtual directory called experiment 5529; the actual paths were the experiment's data, intensities, and base-calls folders, which is just how Illumina sequencers name their files. So here's the original file, and here's what it was saved as, as an object. If I want to see a little more, let's look at all of the cells: I can jump to that directory, look in lane 1, and see that I've got a number of subdirectories. Or again, since we're working with object stores, there's no real thing as a subdirectory; these are what we call prefixes.

So with that, I'm hoping you've learned how to set up and install the AWS CLI. Check out our guides for Windows, Mac, and Linux, and you'll be able to move files from your local file system into an on-prem cloud or the actual Amazon cloud, move them back, and script all of it, whether you're an admin in media and entertainment, you have a LIMS system in genomics, or you're just a normal system admin with backups landing on disk. That could be your Oracle RMAN backups or your Microsoft SQL Server backups, and now you have the ability to copy them to a more durable, more cost-effective tier of storage. A lot of what people fear about using S3 is the complexity, and with this tool the complexity basically goes down to zero.

So I'm going to look and see if there are any questions, and it looks like we're actually right at time. I have one question: how did you set that up for Windows? Again, that's in the actual guide, but I can give you a quick sneak peek. I had to do two things. The first was, under my user, in my Documents (not Downloads, sorry), I created this file here; a sketch of it follows below.
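[A hedged reconstruction of the Windows setup he goes on to describe; the macro file name and path are assumptions, while doskey, $*, and the Command Processor AutoRun value are standard cmd.exe mechanics:]

    :: define the alias for the current cmd.exe session
    doskey hyperstore=aws --profile=cloudian --endpoint-url=http://s3-region1.cloudian.local s3 $*

    :: or persist it: put the line above (minus "doskey ") in a macro file,
    :: e.g. %USERPROFILE%\Documents\macros.doskey (hypothetical name), and
    :: have every new cmd.exe session load it via the AutoRun registry value:
    reg add "HKCU\Software\Microsoft\Command Processor" /v AutoRun /d "doskey /macrofile=%USERPROFILE%\Documents\macros.doskey"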
What this file does, if I edit it, is set up the alias. Aliasing in Windows is called doskey, and I did the exact same thing: I created a command called hyperstore, and I referenced aws, my profile cloudian (if you want to name yours something else, that's perfectly fine), the URL or IP address of the endpoint I want to use, and s3. And what this bit here says is: accept any extra arguments I place after it. To get that to work every time I open up cmd in Windows, you need to edit your registry, so I created this quick little .reg file, but you could do it with regedit as well. In the key HKEY_CURRENT_USER\Software\Microsoft\Command Processor, you set AutoRun so that any time you open the command prompt, it goes to my user directory, Documents, and runs that command. That's simply the Windows equivalent of .bash_profile or .bashrc. Setting this up for Windows is super simple: download Python (it takes about 30 seconds to install, and we've got it in the guide; you just need to check off that you want it added to your PATH), then run pip install awscli, then aws configure, and you're off and running. It's just as simple on Windows as it is on Linux or Mac. So really, I don't want anybody to have any fear that this is a Linux-only thing and they have to learn Linux: whatever tool, whatever OS you like, it's going to work.

And I think we're out of time. I hope you found this informative. If you did, please send in requests and we will do other topics; we can talk about rclone, we can do advanced commands for the AWS CLI, the sky is the limit. So write us, let us know what you liked and what you'd like to see next. I'm Doug Psaltis, thank you very much.
Info
Channel: Cloudian Videos
Views: 7,869
Keywords: AWS CLI, S3, Cloudian, Cloudian HyperStore, Migrate Data, Hybrid Cloud, S3 API
Id: X0wU-HTjv0g
Length: 32min 15sec (1935 seconds)
Published: Wed Oct 26 2016