Native SFTP in Microsoft Azure with Azure Storage

Captions
Hey everyone, in this video I want to talk about native SFTP, FTP over SSH, in Azure using Azure Storage accounts. As always, if this is useful please go ahead and like, subscribe, comment and share, and hit that bell icon to get notified of new updates.

SFTP is still very widely used by a lot of different applications when they need a secure way to transfer files, and up until this point there has not been a native solution in Azure. If I wanted to support SFTP, I would have to stand up an Azure IaaS virtual machine running some piece of software, maybe a virtual appliance from the marketplace. But now, with this new capability, SFTP becomes a native part of the blob service of Azure Storage accounts.

Today it's in preview, and you have to go and register for it. As for the types of storage account it supports: remember there are many types of storage account, with many services supported. If I want block blob, I can either use general-purpose v2 or, if I want premium, the premium block blob storage account type. Both of those are supported for SFTP today in the preview. For resiliency and replication it's LRS, locally redundant, so three copies within a single data center, or ZRS, where those three copies are distributed over availability zones but within the same region. And I need to make sure I've enabled the hierarchical namespace. The hierarchical namespace is what gives blob true folders; without it, folders in blob are just virtual directories, really just part of the blob name. When I add the hierarchical namespace, which is an option when I create the storage account, it has true folders; that's what Azure Data Lake Storage Gen2 leverages. If I go and look at a storage account for a second, here is my SFTP-enabled storage account, and the key thing to see is that hierarchical namespace is enabled. Again, that's an option when you create the storage account, and it's really the only prerequisite; no additional limits are imposed when I use SFTP.

Ordinarily, when I think of blob, there are different types of service in a storage account. I'm going to focus on block blob, but obviously there are also things like tables, queues and files. For block blob there are different types of API: there's the regular blob REST API, and when I turn on the hierarchical namespace I can use the Data Lake API as well. Also, with hierarchical namespace on, there's an option to use NFS. NFS and SFTP are mutually exclusive: I can use one or the other, but I can't have them both turned on. So now we have this new option of SFTP, a new way to go and interact with my block blob storage account, or again my general-purpose v2 storage account. It's just a different way to interact; it's not changing anything about the underlying storage account. Once I write the data there, it's just blobs in a container; I can now simply interact with them in another way.
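As a sketch, creating a storage account like this from the Azure CLI might look like the following. The account and resource group names are placeholders, and --enable-sftp requires a fairly recent az version, so verify the flags against az storage account create --help:

    # GPv2 account with hierarchical namespace (ADLS Gen2) and SFTP enabled
    az storage account create \
        --name mysftpdemo \
        --resource-group my-rg \
        --location eastus \
        --sku Standard_LRS \
        --kind StorageV2 \
        --hns true \
        --enable-sftp true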
Now, while there are no additional limits imposed by SFTP, realize that SFTP is by default very chatty, and it might use payload chunks of around 32 kilobytes, which is very small. What you might want to do to optimize performance is increase that; I think on Windows I can go up to 100 kilobytes, and on Linux it's 256 kilobytes. There's a -B (uppercase) parameter that lets me get that higher performance.

This uses port 22, so I have to have port 22 open, but I can still use all of the regular firewall and networking capabilities of my storage account: I can restrict it to certain IP addresses, I can use service endpoints, and I can use private endpoints. And this is blob, so if I'm using private endpoints it's the blob endpoint, not the dfs endpoint for Data Lake; I would use the blob private endpoint to go and connect. As long as I have port 22, I'm good to go.

Ordinarily, blob now has role-based access control at the data plane. SFTP is not using that; SFTP will ignore any data-plane RBAC I have put in place. Instead, at the storage account level I create local accounts, I think up to a thousand of them, which use either a password or an SSH key. And remember, blob is divided into containers, so for each account I can give it permissions to certain containers.

So the first thing we have to do, once we've enabled an account for SFTP, is go and create accounts. If we jump over, I have enabled this one for SFTP; once again, it's just an option when you create a new storage account. Once you've done that, under Settings you'll see a new SFTP option. This is at the storage account level, not a specific container. In there we have accounts, so I'm going to add local accounts. It is not using Azure AD, it is not using shared access signatures, and it is not using POSIX-style ACLs today; authentication and authorization are only via these local accounts. So I can add a local account and put in a name, say "bruce", then choose whether I want a password and/or an SSH public key, and pick a key source: I can generate a new key, use an existing key stored in Azure, or use an existing public key. If I pick generate, it will generate a new key, I'll be able to download the private key part, and it will use the native capability in Azure for SSH keys, which I'll show in a second. If I select a password, it will show me that password; I do not get to pick it. Realize that at this point my authentication is using a password and this could be exposed to the internet, so we don't want an 8-character password. If I pick password, it generates a huge string that I cannot change; I can regenerate it, but I can't choose it, so it's a really, really long string that I copy and then use. Additionally, I can say which containers I want to give this account permission to; remember, within a storage account I have multiple containers, though I only have one here. Then for each container I select which permissions to give it: read, so I can read file contents; write, so I can upload files, create directories and upload directories; list, so I can see the contents of containers and directories; delete, so I can delete files and directories; and create, so I can upload files that don't exist and create directories that don't exist. Those are the various options, and then I can optionally set a home directory.
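That same local account setup can be scripted. Here is a minimal sketch with the Azure CLI, assuming the az storage account local-user command group and a container named folder1; the names are placeholders, and the exact parameter syntax (including the regenerate-password operation) should be checked against az storage account local-user --help:

    # local user with a password, a home directory, and full rights on one container
    # permissions: r=read, c=create, w=write, d=delete, l=list
    az storage account local-user create \
        --account-name mysftpdemo \
        --resource-group my-rg \
        --name bruce \
        --has-ssh-password true \
        --home-directory folder1 \
        --permission-scope permissions=rcwdl service=blob resource-name=folder1

    # the generated password is only returned when it is (re)generated
    az storage account local-user regenerate-password \
        --account-name mysftpdemo --resource-group my-rg --name bruce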
So if I just quickly leave that as it is and call it "test key", hit Next and then Add, it goes ahead. Notice this is the password, this huge thing that I can hit copy on, and that's what I would use. Additionally, it has generated a key pair, and I can now download that private key. If I quickly jump over for a second and look at SSH keys, this is where they are actually stored. Here I can see that test key I just generated (I typed the name wrong), one I generated earlier for my SFTP local "john" account, and also a key I previously used for a Linux virtual machine. They're all stored in the same place, under SSH keys. I cannot store these in Azure Key Vault; Azure Key Vault does not support SSH keys, so Azure has this separate storage area for them.

Now, going back to what's actually happening here: we've now added local accounts. For john I configured a home directory, so I can just connect. But remember, these accounts are storage-account wide; if I do not set a home directory, and let's look at clark, where I did not, I can't just connect, because it doesn't know where to connect to, so the connection will fail. We can see that in action. If we open up a really advanced-looking shell and I try to connect as clark, notice what the connection is made up of: the storage account name, then the name of the user, and then just the regular endpoint for blob (which again confirms we're using the blob endpoint), so mystorageaccount.blob.core.windows.net. At this point it asks for the huge, humongous password, and if I copy and paste it, it fails: "home directory is not accessible". I cannot connect. If I don't set that home directory, then as part of the connection, after the storage account name, I have to add in the container I want to start in. My container was called folder1, so here you can see I've added in folder1, so it's the storage account, then the container, and then the name of the account. With that, I can actually connect. Now I can see the content and move around; there's nothing in this one, but I could go and look at john's and see the content. So that's the process in action, actually seeing that data. And once I've copied up the data, it's just a blob.

Now, those local accounts I'm creating here are per storage account; I cannot share them between different storage accounts, so if there are lots of storage accounts I'd have to recreate them. It is, again, a thousand accounts per storage account that I can create. Once again, today there's no Azure AD integration; it's an early preview, and things may change down the line.

In terms of the actual algorithm being used, there's a set of different algorithms available, and when the client connects it negotiates with the server: what's the best one we both support? If we look at the article and scroll down for a second, it tells us all the supported algorithms. Now, if you find you can't connect and you're using a really old SFTP client, it may not support any of these, and the solution would be to go and get a newer client; you need to make sure you support one of these to make it actually work.
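If you want to check what your own OpenSSH client offers in that negotiation, you can query it locally; these are standard OpenSSH flags, so they reflect your client rather than anything Azure-specific:

    ssh -Q kex     # key exchange algorithms the client supports
    ssh -Q cipher  # ciphers
    ssh -Q mac     # message authentication codes
    ssh -Q key     # host key algorithms

Compare that output against the list in the article to see whether a client upgrade is needed.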
Now, if we go back over, let's test something out. If I look at my containers, I've just got my folder1 container, and within it I've got two folders. Remember, these are true folders now, because I've got that hierarchical namespace, and I've got two files in there right now. What I want to do is upload something. If we have a quick look (I'm getting mixed up between PowerShell and everything else), I've got a johnwick.gif file in here. So let's see this in action: I'm going to connect now as john. If I paste the connection string over here, once again you can see the structure: the storage account, then my name, and then just the blob endpoint. I'm going to connect, and remember I have a home folder, which is why I don't have to specify the container as part of that initial username string. If I look around, I'm already in that john folder; I didn't have to do anything else, it put me directly in there because I set that home directory. If I look at my local files, there's that johnwick.gif, so if I do "put johnwick.gif", it uploads, and there it is. If I now jump back to the storage account and refresh, there's the file. There's nothing else I have to do; it's just a blob at this point. And if I added role-based access control at the data plane, it would be enforced if I connected through other means; it's just not used as part of SFTP.
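Putting the two connection formats together, here's a sketch of the kind of session shown in this demo; the storage account, container, user and file names are all placeholders, and -B is the standard OpenSSH sftp buffer-size option mentioned earlier:

    # "john" has a home directory set, so the username is <account>.<user>
    sftp -B 262144 mysftpdemo.john@mysftpdemo.blob.core.windows.net
    sftp> ls
    sftp> put johnwick.gif
    sftp> exit

    # "clark" has no home directory, so the container goes into the username:
    # <account>.<container>.<user>
    sftp mysftpdemo.folder1.clark@mysftpdemo.blob.core.windows.net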
Now, when I connected, you'll notice it just connected, because I've connected to this machine before. If this was the first time you ever connected, it would prompt you to trust the host key. If you want to be super secure, the host keys are actually listed: there's a page (I'll have the link in the description) that tells you, for each region, what that host key is. So if you wanted to, you could pre-accept those on whatever machines you have configured, to trust only these keys and make sure no one is trying to hijack the connection.

But that's really it. It's a very simple service today: I have to create these separate accounts, and at a container level set what permissions each has. It's not using the native RBAC, it's not using POSIX ACLs, it's not using any of those things; any other activity is just like a regular storage account. Today I don't get things like change feed or notifications, because hierarchical namespace does not support those today, but once hierarchical namespace adds a feature, I would see it when I'm using SFTP as well. Remember, all I'm doing here with SFTP is interacting in a new way; that's the change. It's not really changing anything about the underlying storage account; it's a new interface, a new protocol I can use with it.

So there you go. It is in preview at the time of recording, so it's only in certain regions, and I would recommend reading through the article; it talks about some of the known issues today. If you actually go and look, it says the only supported authorization for SFTP is through these local accounts; it doesn't use POSIX ACLs, it doesn't use Azure AD, but those are enforced if I access the data through other means once it's ingested, which is what it's saying in that note. It talks about needing port 22, and that the firewall and everything else is enforced: initially it will connect, but then the actual data-plane operations would fail. It talks about security and other types of integrations, and it notes that NFS and SFTP are mutually exclusive; I can't enable them both on the same storage account. It even talks about that buffer issue I mentioned: because SFTP is very chatty, chances are I'd want to increase that buffer size with -B to a bigger value to really increase the performance, so that would be a key point in using this.

But that's it. I think it's a great new capability, and once it's out of preview I'm sure a lot of companies will look to leverage it. Again, I expect it to get built on with other types of authentication support beyond these local accounts in time. But if you need SFTP, remember it does support things like private endpoints (it's just going to use the blob endpoint), so I can have that security, and I can use service endpoints. It's just blob, just another means to interact. Hope that was useful. Until next time, take care.
Info
Channel: John Savill's Technical Training
Views: 6,277
Keywords: azure, azure cloud, microsoft azure, microsoft, cloud, sftp, azure storage, blob
Id: -0PPA0tJLKA
Length: 18min 58sec (1138 seconds)
Published: Thu Dec 09 2021