Learn Together: Use Data Factory pipelines in Microsoft Fabric

Captions
[Music] Hello everyone, and welcome to this next episode of Learn Live. My name is Erwin, I work for InSpark, a Microsoft partner in the Netherlands, and I'm a principal consultant in the Data & AI team. Today I'm joined by Javier. Javier, hello!

Hello, hi Erwin, nice to be here. My name is Javier Villegas, I'm from Buenos Aires, Argentina, and I work as an IT Director for the DBA and BI services at Mediterranean Shipping Company. I've been involved with data for almost three decades, so we're going to have some fun today, Erwin, right?

Yeah, definitely we will have some fun. Before we start the show: once again, welcome from wherever you are in the world — good morning, good afternoon, good evening — and say hello to our moderators today, Sulamita and Jakob. If you have questions, leave them in the chat and they will do their best to answer them all.

Today's Learn Live session is about Data Factory pipelines in Microsoft Fabric. You can follow along with the complete session: scan the QR code or go to aka.ms/learnlive-20240130A, say hi to the moderators during the show, and ask your questions in the chat; we or the moderators will try to answer them.

Before we start, I want to remind you that if you follow the Cloud Skills Challenge over the next 30 days, which is part of these Learn Live sessions, you will get a 50% discount on your next exam, and you have all the modules to complete over the next couple of days. After that, don't forget to register for the DP-600 Exam Cram as well; it will be held on February 13th at 8 o'clock Pacific time, and it covers the beta version of the exam, which you can take afterwards. I'm happy to be there, together with Javier, to answer all your questions. So, yeah, start your career with the Microsoft Fabric Career Hub: go to aka.ms/fabriccareerhub and you will find all the information you need to follow the role you want.

Also, you can join us in Las Vegas — I will be speaking there as well. You can register today with a $100 discount using the code MSCUST for the Microsoft Fabric Community Conference; it's going to be an amazing event with more than 100 sessions, different workshops, and a lot of Microsoft speakers and MVPs.

So today, January 30th, it's "Use Data Factory pipelines in Microsoft Fabric" time. For today we have defined some learning objectives, or not, Javier?

Yeah, absolutely. First of all, we hope you have followed all the previous sessions, and at this point we're going to start talking about Fabric pipelines. The idea is to go through these objectives, and hopefully by the end of this session you will have mastered them.

Yeah, so for today we have defined a couple of learning objectives: you will learn how to use pipelines in Microsoft Fabric, how to use the Copy activity in pipelines, how to create pipelines based on predefined templates, and how to run and monitor your pipelines.

Let's first start with Microsoft Fabric. We have all heard about Microsoft Fabric, the unified data platform for the era of AI. To stay focused today, we will only look at Data Factory, not at the other experiences. Getting started with Data Factory is very easy.
Just go to app.powerbi.com or to fabric.microsoft.com and you will see the start page. From this start page you can click the Data Factory experience, and once you have opened it you will be taken directly to the right place, where we can see dataflows and data pipelines. Today we will focus on the data pipeline part.

Yeah, and the idea, Erwin, is basically to go as deep as possible into learning how to get data into Fabric, right? It's going to be quite an interesting topic, because to start any project within Fabric we first have to get data into it, so this is quite important as the first step.

Yeah, so as Javier is saying, we will go through all the different possibilities to get data from somewhere into Fabric. We will first go through the presentations, and I know Javier has prepared an awesome demo to walk you through these modules as well. During the show you can follow along with the Learn Live module, because we will follow the same structure there.

As an introduction: data pipelines define a sequence of activities — we will see activities — that orchestrate an overall process, usually by extracting data, as Javier was saying, from one or more sources and then loading that data into a destination. We are not going to focus on transforming data, because that's the topic of tomorrow's show; today we focus mostly on extracting and loading the data. And if you are familiar with Azure Data Factory, then I think the Data Factory pipelines in Microsoft Fabric will be immediately familiar, because they use the same architecture, or not, Javier?

Yeah, absolutely. If you are familiar with Azure Data Factory you will understand much better what we have to show today, because it's fairly similar — it's the same concept, and I would say that eventually we would call it the next generation.

Yeah, so as I always say, pipelines in Microsoft Fabric encapsulate a sequence of activities: we can do a delete, then call a dataflow, then call a notebook, or call several copy activities that process different tasks. Before building pipelines in Microsoft Fabric you should understand a few core concepts, and we will walk through them: we will explain what activities are, how you can define tasks with parameters to make it easier to run or to reuse the copy activity, and how you can run all these different pipelines.

When you open a pipeline you will directly see a screen where you can start with a Copy data activity or choose a task to start, and we will handle all the different starting points today. But before that, I want to explain that we have different kinds of activities in a pipeline. We have the data transformation activities: dataflows, Copy data activities, notebook activities, stored procedure activities — you can already hear it in the names. On the other hand we also have the control flow activities: how you can set a variable, how you can build a ForEach loop container, how you can set an If condition, or even how you can add a Lookup or a Switch — and there are a couple more of those.
So be aware that on the one hand we have the Move and Transform activities, and on the other hand the Control Flow activities.

Besides that, once we have built everything we will see that we have a Copy activity, and in this Copy activity we can define parameters. Parameters enable you to provide specific values to be used each time the pipeline runs, so I can create one pipeline with one Copy activity and change the value of my parameters on every run. As you can see in the lower-left corner, I have Address as the source table name, the source schema is my Sales schema, my destination table is CustomerAddress, and my destination schema is Sales; I can change all of that there every time.

When I have configured the whole Copy activity, I can go to Run and click Run. If I do that, I'm running the flow in debug mode, which means I can directly see in the output pane all the activities and what is happening in the background — but don't worry, I will explain that in a bit more detail later on.

So the next step is: how does that Copy activity you just saw actually run? The Copy activity is, I think, one of the most common uses of a data pipeline, or not, Javier?

Yeah, absolutely, because it easily allows us to get data from the source as-is and get it into Fabric, into a destination — we will see all the possible destinations later as well, lakehouse, data warehouse — but yes, it's one of the most common things we can do in Fabric.

So let's start by building a pipeline from scratch, just by adding a pipeline activity. It's pretty simple: we go to the activities in the upper corner and select the Copy activity. As Javier already mentioned, with a Copy activity we define a connection for our source — we're always talking about a source and a destination in a Copy activity — and we can define different parameters there. As you can see, I have a connection of the type SQL database and I'm going to use a query to get data from my source system; in this case I'm going to use something from Wide World Importers, one of the demo databases from Microsoft. Then you can see that I can add parameters on top of that table query — they also call it the Expression Builder: I can just click in there, define the expression, and then define all the parameters.

Quite interesting to mention here: you can not only copy a full table from A to B, you can also create a complex query with multiple joins and get the result set into your destination. At the same time you can also run a stored procedure, if it is Azure SQL, SQL Managed Instance, or anything like that, and get the result into your destination.

Yeah, and what I didn't explain: you will also see the Advanced button, where I can define timeout settings and retry options to retry the copy activity. And as Javier mentioned, we are currently showing you how to use the parameters in combination with a table, but you can do the same thing with a query and with a stored procedure.
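To make the "one pipeline, many parameter values" idea concrete, here is a rough Python sketch — not shown in the session — of triggering the same parameterized pipeline several times through the Fabric REST API, passing different source and destination names on each run. The job-scheduler route, the payload shape, and the placeholder IDs are assumptions to verify against the Fabric REST API documentation.

```python
import requests

WORKSPACE_ID = "<workspace-guid>"      # hypothetical placeholders
PIPELINE_ID = "<pipeline-item-guid>"

def run_pipeline(token: str, params: dict) -> None:
    """Trigger one on-demand run of the pipeline with the given parameter values.
    Assumes the 'run on demand item job' endpoint; check the Fabric REST API docs."""
    url = (f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
           f"/items/{PIPELINE_ID}/jobs/instances?jobType=Pipeline")
    body = {"executionData": {"parameters": params}}
    resp = requests.post(url, json=body,
                         headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()

# One parameterized copy activity, reused for several tables:
tables = [
    {"SourceSchema": "Sales", "SourceTable": "Address",  "DestinationTable": "CustomerAddress"},
    {"SourceSchema": "Sales", "SourceTable": "Customer", "DestinationTable": "Customer"},
]
token = "<aad-access-token>"           # obtain via azure-identity / MSAL in practice
for t in tables:
    run_pipeline(token, t)
```

Inside the pipeline itself, the copy activity would pick those values up through the Expression Builder rather than hard-coded names.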
The next step: when we have defined our source in combination with our parameters, we can also define the destination, and this time we're going to use a lakehouse as the destination, so we're going to write the output of our copy activities to a root folder in our lakehouse. We have a choice here: we can write to tables, or we can write to a file, and the file will be stored in the format you have defined — that can be an Avro file, a JSON file, or even a Parquet file — and then you can pick up the files however you want. Here too we are defining the table name with a parameter, so with this pipeline I can constantly change the input parameters and everything will automatically be transferred from the correct table to the correct destination.

And as you can see with the table action, I have two options there: I can append the data, but I can also overwrite the data. If you load data from your source system into your lakehouse, my advice would be to overwrite your data and then load it into a bronze or even a silver layer in your medallion architecture, which you may already have learned about, or will learn about in the upcoming two days. Do you have something to add, Javier?

No, absolutely. As we said at the beginning, this is going to be one of the most common activities you will find when you start a new project in Fabric. And as you mentioned, Erwin, moving the data from A to B into the lakehouse, overwriting, and so on depends on your project and on the medallion approach that we will also see in the modules of the next couple of days — definitely something that is going to be super important for us.

So yes, now that you have learned how to use the Copy data activity, let's go to the next one, and the next one is the Copy data tool — that's the middle option, of course. The Copy data tool can help you get data from A to B. This is the first step, and you can directly see all the different sources that are available — from Azure, but also other sources; you can see OData, Dynamics, Amazon, there are a lot of options there. In this case I'm going to use a SQL database, the same SQL database I used before, and I'm going to use the Wide World Importers database to extract tables from.

I actually have a question for you: do you have to install any software, any driver, to get data from the source, or is it something that just works right away?

That's a good question, Javier. Normally you don't have to install any software in Fabric to use this copy activity. Only if you connect to on-premises sources do you need a gateway to get the data from your on-premises system into Azure, but all the connectors like FTP and SFTP are already available within the copy tool and within the data pipelines of Microsoft Fabric. Does that answer your question?

Without installing any VM or any ODBC driver or anything like that, right? Without anything to install — that's cool, that's the power of SaaS. We don't have to spin up virtual machines, install applications, deal with drivers or versions; everything is out of the box here as soon as we start fetching data from our sources.

Yeah, definitely. So in this case I'm going directly to an Azure environment with a SQL database. Even though my Azure SQL database is not living within my SaaS instance of Microsoft Fabric — it lives in another subscription somewhere else — I can still connect to it. So if you allow me to connect to your database in your region, I can still use your database in Argentina to get data into my West Europe Fabric environment. Absolutely.
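Coming back to the table action for a moment: below is a minimal PySpark sketch of what "overwrite" versus "append" looks like when you load the same kind of data from a notebook instead of a copy activity. The file path and table name are hypothetical, and in a Fabric notebook the `spark` session is already provided.

```python
from pyspark.sql import SparkSession

# In a Fabric notebook `spark` already exists; this line is only for completeness.
spark = SparkSession.builder.getOrCreate()

# Hypothetical source file landed in the lakehouse Files area.
df = spark.read.parquet("Files/wideworldimporters/Sales_Address.parquet")

# "Overwrite" table action: replace the bronze table with the latest full extract.
df.write.mode("overwrite").format("delta").saveAsTable("bronze_sales_address")

# "Append" table action (alternative): add the new rows to what is already there.
# df.write.mode("append").format("delta").saveAsTable("bronze_sales_address")
```

Overwrite keeps the bronze table an exact copy of the latest extract, which matches the advice above.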
So in this situation I'm connecting to the data source, and I can use a query, but I can also say: give me the whole table. I can select a couple of tables, and the next step is to say where I want to store that data — in a lakehouse, in a KQL database, or in a warehouse. For today we will use the lakehouse approach, because that's the first stop when you get data from your source system into Fabric. Once you have done that, you can select which lakehouse you want to use. The data is coming from a source system, so your lakehouse will probably be your data landing zone or your bronze layer — it depends on what kind of architecture you are using — and if you haven't created a lakehouse yet, you can create a new one directly from here. It's called the Copy data tool, and it's really a wizard that helps you load data from A to B.

Then I can choose: we were talking about tables in the copy activity, now I'm talking about files. If I want files, I have to define a folder path, so my data is going to be written in my lakehouse in the Wide World Importers folder, and then I can say what the file name and the suffix are. In this case, when you use a SQL database as the source, you are writing a Parquet file, so the metadata of your SQL table is also stored in the Parquet file. The next thing is to define the compression type, and now you will see something new that you haven't seen in Data Factory on the PaaS platform: "Use V-Order". V-Order is something new that Microsoft created for the Delta Parquet files — we call it the secret sauce. It makes column-based querying faster, so your Power BI reports will be faster. I'm not going to go into detail, because I think we could talk for an hour about Delta Lake, but yes, you should enable this option to get faster processing for your Power BI reports.

Then the next step is: you click OK, you can start the data transfer immediately, and your data is loaded directly to the file location of your lakehouse. That's actually very simple — I think it's almost next, next, next, finish, or not, Javier?

Yeah, absolutely, it's super easy. This is the new no-code/low-code experience that Microsoft is providing us, now within Fabric, and it definitely makes it super easy to accomplish any of these activities.
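For context on the V-Order option mentioned above: the same write optimization can be toggled from a Fabric Spark notebook. This is a hedged sketch — the configuration keys below are the ones documented for Fabric Spark as far as I can tell, and the path and table name are made up; verify the keys for your runtime before relying on them.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()      # already provided in a Fabric notebook
df = spark.read.parquet("Files/nyc_taxi/")      # hypothetical sample-data path

# Session-level switch; Fabric notebooks usually have V-Order enabled by default.
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")

# Writer-level option on a single Delta write; the table name is made up for the example.
df.write.mode("overwrite") \
    .option("parquet.vorder.enabled", "true") \
    .format("delta") \
    .saveAsTable("nyc_taxi_bronze")
```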
So, as we already said at the beginning: when do you use the Copy data activity? You use it where the source and the destination are directly supported and where you are doing almost no transformations. If you want to know how to apply transformations during a copy, you have to tune in tomorrow evening at the same time, when my colleagues will talk about how to ingest data with Dataflows Gen2 in Microsoft Fabric — we are not going to handle that today.

So we just explained how to start from scratch with a Copy activity, and I showed you how to use the Copy data tool. There is one more option, and that's called pipeline templates. A pipeline is actually a combination of activities you can choose, enabling you to create custom data ingestion and transformation processes to meet your specific needs, but there are many common pipeline scenarios for which Microsoft Fabric includes predefined templates that you can use and customize as required.

It's very simple to start: you choose "Choose a task to start", and once you have clicked on that, a new screen opens. In this case you can see I can do a bulk copy from a database, or copy from ADLS Gen2 storage to a lakehouse file; there are a lot of different Fabric templates available, and more new templates will keep being added — if you look at Data Factory there are already a lot of templates available, and the community will also be involved in adding more templates in the near future.

Let's click on "Copy data from a sample data file to a Lakehouse file". There is a description of this template, and the only thing I need to do is select my lakehouse, as you can see here, and then the copy activity is already created. My source is a sample dataset, in this case the New York Taxi dataset — have you seen that in every environment and every data pipeline we can use sample data? We can use the New York Taxi data to learn how to build pipelines and activities, or to create reports on it with sample data. How easy is that? So once we have done that, we can run and monitor our pipelines. Do you have something to add before we go to running and monitoring, Javier?

No — going back to the templates, I believe these are common scenarios that Microsoft and also the community are adding to this page, so it makes the experience even simpler than the original copy activity that you showed before, Erwin, where you have to choose the source, the destination, the query, the table, and so on. It definitely was not complex at all, but with the templates it's even simpler, because the most common activities are going to be there; we just need to choose ours, put in the tables at the source and destination, and boom, that's it.

Yeah, and what I want to add as well: using these templates will also help you start using parameters, because for example if you use "Copy data from Azure SQL DB to Lakehouse table", it will predefine the whole ForEach loop container, how you have to set it up and how you have to build the array. So it's a really easy start for learning how to build these pipelines with parameters, which we explained a little bit before.

The next thing is: okay, once we have run that pipeline, we want to monitor what happened. I can click on the pipeline, click on the Run button, and what happens next is that I will see in the output window that I have a "Copy data from sample data to Lakehouse" run. The activity normally starts as Queued, then it is In progress, and then it will be finished — or, if something went wrong, you will probably have a status of Failed. You can view the run history for a pipeline to see the details of each run, either from the pipeline canvas or from the pipeline item listed on the workspace page.

Let's click on the "Copy data from sample data to Lakehouse" run: we will directly see all the details of what is happening in the background. We will see that there is a source — in this case the New York Taxi file from a Microsoft blob storage — and that we wanted to transfer the data to the lakehouse. You can directly see how much data was read, how many files were read, and how many rows were read.
Now you will also see how many parallel copies were used, what your queuing time was, what the transfer time was, and how long it took to read from the blob storage. So at a later stage you can also optimize your performance: change something, add more copy activities, or even split your source files into several smaller files so they read a bit faster. This is what I already explained — this is more the debug mode, where I can see it directly. But if I want to check how my pipeline ran yesterday, or on a specified schedule, we have the Monitoring hub. The Monitoring hub can be found in Fabric in the left navigation — I think Javier will also show you in the demo where to find it — and there I can see, by date, who ran the activity, who loaded the data, and if there was a failure, where the pipeline failed, and even which item type. You will also see reports, semantic models, dataflows, those kinds of things; everything is in the Monitoring hub.

So I think the next step is to start with a small demo, and we will show you in the demo how to build these first steps.

Yeah, absolutely. Let me see if we have my screen here. And you can also follow the steps during the "Ingest data" exercise within your Learn Live module if you want to.

So, go ahead!

Thank you, Erwin. What we'll be doing here is starting over: we're going to create a brand-new workspace, we're going to create a lakehouse and a data warehouse, and then we're going to go through most of the steps that Erwin was presenting before. We'll get data from a sample source into a data warehouse, then we'll get data using a template from an Azure SQL database into our lakehouse, then we'll get data from Parquet files that are stored in a storage account, and finally we will also see the OneLake file explorer application, which allows us to easily upload files we have locally on our computer into Fabric.

So let's start. This is my Fabric home — I'm entering the portal. What is interesting here is that it's super familiar, especially if you are coming from Power BI; this is not new for you. But what is important to see is this lower-left corner, where you have what they call experiences. By default we have Fabric, but there are many others, and one of them is definitely Data Factory — probably the one we will focus on today.

Before starting, as I said, I will create a brand-new workspace. This is a concept that again comes from Power BI: we create the workspace, we put all our components in there, and then eventually, if the project is over or if it is a demo like today, we can just delete the whole workspace and everything goes away. So I go to Workspaces and create a new workspace. I have my names here in Notepad, and we will call it — super simple, not too creative — "My Demo Workspace 03". We can even pick an icon to identify our workspace. I hit Apply, and as simple as that we have our workspace. In order to move on, we have to start getting data, as we said, from source to destination.
So that is what we're going to do. I will go here to New, and I will start by creating a lakehouse — super simple. We're actually going to create a lakehouse and then a warehouse. So I create my lakehouse, coming back here to my notes, and I will call it "Demo LH 003". As simple as that — it takes a few seconds to get created, and what is interesting here, again, is that we don't have to do anything outside the Fabric portal; we see everything we need directly here. Before moving on, as I said earlier, I will also create a data warehouse: down here we have the Warehouse option, so I create my warehouse, grab my name, "Demo DWH 003", put it in, hit Create, and again in a matter of seconds we have our environment ready to work with the data warehouse.

A super-short summary on this: if you are working with structured data — a traditional data warehouse with tables, stored procedures, relationships, and so on — and you normally work with T-SQL, the data warehouse is your spot. If you are working with structured and unstructured data, and even files, you may use the lakehouse.

So again — have you seen this? Javier just created a warehouse, a SQL endpoint, within seconds, so there is no waiting for provisioning. That's SaaS, that's the world of SaaS. Cool, isn't it?

Yeah, this is the part that I like the most: we don't have to install anything, as we said before about the ODBC drivers and the sources. So, we have our workspace, we have our destinations — the data warehouse and the lakehouse — so let's start by getting data. I go again to my experiences, go to Data Factory, and go to Data pipeline. There are multiple ways to create your pipeline; this is one of them, and the other, as I did before, is to go back to my workspace, go to New, and at the very top I have Data pipeline. So I create my pipeline and name it "Pipeline 1", and as we said, we're going to start by getting data from a sample repository into our warehouse — this is what Erwin was showing us before — so we go to the Copy data option here.

We have, as we saw before, all the possible sources — there are a lot of them — and here at the top we have the sample data. In this case we're going to choose the New York Taxi data, which is publicly available to use and consume. As soon as we select it, we have a preview, and we hit Next. Then we have to choose our destination; our destination will be the data warehouse, so I choose Warehouse, and now we either create a new one or use the one I previously created. We hit Next, and now we choose whether we want to load this data into a brand-new table, or, if the table is already created, reuse it. It also proposes a schema and a table name for our destination, and what is quite interesting here is that we get the source data type for each of the columns as well as the destination type; we can change column names if we want, and even data types — in this case we will just use the defaults. When we are ready, we have the summary: getting data from New York Taxi into the data warehouse, what the table name is going to be, and so on. Down here we have the option "Start data transfer immediately", so we hit Save and Run, and right away we move to the next screen, which shows the artifact we have here.
At the bottom of the screen we see that our activity was queued and is now in progress; it is also showing us the duration, and this refreshes every so often, so it's going to take a few seconds to ingest.

Can you click on the copy activity, so we can show the activity name in the output window, to show the monitoring?

Okay, cool, let me just go back here, click in the middle of the screen — here we go — and click the Copy activity, yeah, on the left side, that one. Yeah, so it just succeeded, and we get the summary, as Erwin was showing in the presentation before: the timing, the duration, and so on. It's super powerful to have very detailed information on how our pipeline was executed.

So what can we do right now? We said that we moved — we ingested — data into our warehouse, so I go back to my warehouse, and now we see the table here. We have the table and we have a preview, and we can also start a new query here: we have a T-SQL environment and we can do a SELECT TOP 10 * FROM, and so on, so we have everything we need in order to move on.

I will go a step further — you know me, Erwin, I'm a SQL guy, I've been working with SQL Server for years — so I will show you something extra. If we go here to our warehouse, at the top, to the Settings option, we have something quite interesting: the SQL connection string, which is a quite long URL. I just copy this link to my clipboard and go to Management Studio. Here I'm already connected to an Azure SQL database, but I will go to Connect, Database Engine, and paste the URL I copied before. I have to change my credentials, because they are not the same, and super simple, we're connected to our environment from Management Studio. So if you would like to consume the data from Management Studio, you can also do that. I'm entering my credentials here — for some reason they were not saved, so it pops up asking for them — and now I am authenticated into my environment, and you can see that I'm connected from Management Studio.

What can I see from here? If I expand Databases, I see both the warehouse and the lakehouse. I expand the warehouse, and same as in SQL Server we see stored procedures, functions, everything — but we also see tables. Same as we did from the Fabric portal, I see the table that we just imported, and I can do a SELECT TOP. So if you want to stick to the Fabric portal and use the T-SQL experience within it, you are free to do so, but if you would like to jump into Management Studio, Azure Data Studio, or any other tool, you get the connectivity to use it from there. Super simple, easy to use, and this is something extra.
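For anyone who prefers code over Management Studio, here is a small hedged Python sketch of the same idea: connecting to the warehouse through the SQL connection string copied from its settings, using pyodbc with Entra ID authentication. The server placeholder, database name, and table name are stand-ins for the values from the demo, and the exact driver and authentication options may differ in your environment.

```python
import pyodbc

# Paste the value from the warehouse's "SQL connection string" setting here.
conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<sql-connection-string-from-warehouse-settings>;"
    "Database={Demo DWH 003};"
    "Authentication=ActiveDirectoryInteractive;"  # signs in with your Entra ID account
    "Encrypt=yes;"
)

with pyodbc.connect(conn_str) as conn:
    cursor = conn.cursor()
    # Table name approximates the one proposed by the copy wizard in the demo.
    cursor.execute("SELECT TOP 10 * FROM dbo.nyc_taxi")
    for row in cursor.fetchall():
        print(row)
```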
So let's go back to our Fabric environment. We finished the first exercise: we created the first pipeline, which moved the sample data into the data warehouse. Let's create our second pipeline — we go here and call it "Pipeline 2" — and in this case, instead of using the Copy data activity, I will choose the templates, going to "Choose a task to start". As Erwin was showing us before, we have all the common templates we can use, and I will be using this one: "Copy data from Azure SQL DB to Lakehouse table".

I select it and hit Next, and what is interesting here is that we will also see the parameters in action — Erwin showed us before how to specify and use parameters, and we're going to see it from here. The input is an Azure SQL database, so we have to specify our connection; I have many connections created already, so I choose my Azure SQL database, the one you saw in my Management Studio. Then I have to choose the destination, which is a lakehouse — of course I have it here because it was created at the beginning when we created the workspace.

So, using this template, I have mostly everything we need, but down here we have the parameters. The parameters are the schema name and the table name, which are from the source, and then the table name for how we would like to name the table in the lakehouse — that's the third parameter. When we are here, the only thing we need to do is run it; we save it, of course, to reuse it in the future, and now it pops up asking for the parameters defined in the template. So we have the SalesLT schema, the Product table, and we're going to call the table in our lakehouse "LH_Product". Once that's there, we hit OK, and same as before our pipeline is running; we can monitor it — it's queued and now it's in progress — so it's going to take a few seconds again. We see the duration, same as before, we see all the details here, and now it's done: succeeded.

So we go to our lakehouse, and under Tables we see that we have LH_Product here. And if I go back to my Management Studio and expand the lakehouse, under Tables we should hopefully see our LH_Product table that we imported from the Azure SQL database. So same as before, we have this from Management Studio, or if we want to stay here we can just use it from —

You have to click on Refresh.

Yeah, yeah, I have to, here — before, that was the intermediate step, and now we have the table here. I can open a notebook and work with it from there, but we have the table and the preview in here. I'm not seeing the T-SQL experience, because this is the lakehouse, but we can create a notebook and work with it directly with Spark. So that is the third — sorry, the second — example we did: moving from the Azure SQL DB into the lakehouse using the template.

Before you go on, can you also show me the SQL endpoint on the lakehouse?

Sure — going to the lakehouse, and then at the top right, next to the Share button, you can select the SQL endpoint. So we have the SQL endpoint now, and all the SQL DBAs can write SQL on top of Delta.

Yeah, super simple, right? And if you are familiar with notebooks, as we said before, we can work from here as well, and as we showed from Management Studio you can connect to both directly. What is interesting is that you can do what in SQL we would call cross-database queries: you can join the tables that are stored in the data warehouse with the tables that are stored in your lakehouse. Super cool.
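As a quick illustration of the notebook route mentioned here, this is roughly what reading the freshly loaded table with Spark looks like from a notebook attached to the demo lakehouse. The table name is the one used in the demo; everything else is a minimal sketch.

```python
from pyspark.sql import SparkSession

# In a Fabric notebook the `spark` session already exists; this is just for completeness.
spark = SparkSession.builder.getOrCreate()

# Read the table the template pipeline created in the lakehouse.
df_product = spark.read.table("LH_Product")
df_product.show(10)

# Or express the same thing with Spark SQL.
spark.sql("SELECT COUNT(*) AS row_count FROM LH_Product").show()
```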
All right, so let's go back to the workspace and create a brand-new pipeline; we will call it "Pipeline 3". What we will be doing here is copying data that is stored in a Parquet file sitting in my storage account. So we choose Azure Blob Storage, hit Next, and I have all my connections here; I go to the last one — let me test it, just in case... here we go, it tests fine. This is my personal storage account, where I have a container, and within the container I have my file. I could copy it as a binary, but here I say no, this is a Parquet file, which actually has the whole data model defined — eventually we could have it in another format, but it just detects that it's a Parquet file. So I hit Next, and we choose our destination again: Warehouse, and in the destination we have the warehouse already created. It comes up with a proposed schema and table name; the table name is not too friendly, so we will call it "learn live 003". Again, data types, column names, everything is here, and we are free to change it if we need to. Hit Next, we have the summary, "start data transfer immediately", and we hit Save and Run.

One more time, in the output it takes a second — it's queuing, then in progress — and when it's done we will see this Parquet file already imported into our data warehouse as a table. And as I said before, those tables are for now completely unrelated, they don't have any relationship, but eventually, if they do, we can do any sort of join here and combine everything. So I go to my warehouse, which is here, and I now see my new table. One more time, if I go to Management Studio and refresh the tables, I will see the new table that we just created, so everything we need is in here, and we still have the other table from our previous exercise. Super simple, easy to use, and that completes the third part of the demo, the third exercise. So now I will go back to my lakehouse.

Before you continue, Javier, I do see some questions from the audience. A lot of people are asking: are data pipelines already available for CI/CD, or in Azure DevOps? It's on the roadmap — you can watch the roadmap at aka.ms/fabricroadmap — and I think it's slated for Q1 or Q2. There were also some questions about how we can do version control on tables in the warehouse: you can still use Visual Studio or Visual Studio Code to build your dacpacs, but there's also an API to create tables within your lakehouse: api.fabric.microsoft.com.
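As an illustration of what such an API call might look like, here is a hedged Python sketch of loading a file from the lakehouse Files area into a managed table over the REST API. The route and the field names follow the lakehouse "load table" operation as I understand it, but treat them as assumptions and check the Fabric REST API reference before using them; the IDs, file name, and table name are placeholders.

```python
import requests

WORKSPACE_ID = "<workspace-guid>"     # hypothetical placeholders
LAKEHOUSE_ID = "<lakehouse-item-guid>"
TOKEN = "<aad-access-token>"          # obtain via azure-identity / MSAL in practice

# Assumed shape of the lakehouse "load table" call; verify the exact route and
# field names against the Fabric REST API documentation before use.
url = (f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
       f"/lakehouses/{LAKEHOUSE_ID}/tables/annual_enterprise_survey_2021/load")
body = {
    "relativePath": "Files/annual-enterprise-survey-2021.csv",
    "pathType": "File",
    "mode": "Overwrite",
    "formatOptions": {"format": "Csv", "header": True, "delimiter": ","},
}
resp = requests.post(url, json=body, headers={"Authorization": f"Bearer {TOKEN}"})
resp.raise_for_status()
print("Load accepted:", resp.status_code)
```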
Just search for it and you will find everything you need to know to create tables from your own repository. And besides SQL Server Management Studio you can also use Azure Data Studio. So those were some questions I just saw in the chat, hopefully answered for a lot of people. Yes, CI/CD is coming; it is currently supported for, I think, notebooks, lakehouses, semantic models, and reports, together with deployment pipelines, and the rest of the CI/CD integration is coming in the next couple of months — pretty soon, hopefully.

All right, so the last part of the demo is basically uploading a file that we have in our file system. I have a CSV file locally on my computer, so I go here to the lakehouse, go under Files, and hit "Upload files". I select a simple file, let's call it the Annual Enterprise Survey 2021; I select my file, and if the file is already there I tell it to overwrite it, I hit Upload, and super simple, the file has been uploaded. When it's done I come back to the explorer, refresh, and we should see the file here. If I click the three dots we see "Load to table", so I will load it into a new table. It proposes a table name based on the file name, which is not so friendly, so let's make it shorter — and actually the problem is the dash, so let's put in an underscore. It detects that it is a CSV file, that it has a header row for the column names, and that it is comma-separated, so I just hit Load. So, without opening Data Factory or the pipelines we saw up to now, this is another way to get data that is stored locally into Fabric and then convert it into a table. It's going to take a few seconds.

In the meantime, something I would like you to see as well is OneLake. You know that OneLake is to Fabric what OneDrive is to Microsoft Office, Microsoft 365: it is the common layer of storage for everything that is related to Fabric. I have the OneLake file explorer installed here, and I see all my workspaces. I right-click here, hit "Sync from OneLake", and we should see my workspace, the one I created for this demo. I can dive into it as if these were files in my storage: I see my warehouse and my lakehouse. Let's go to the lakehouse — we can see that the file I just uploaded is now in here. I refresh one more time, and now we see that the Annual Enterprise Survey, which was originally a CSV file, is there, and we can query it directly from the Fabric portal. And one more time, if I go here and hit Refresh in Management Studio, I will now see the annual survey as a table, and I can combine it with any other tables that are stored within my lakehouse or data warehouse.
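For completeness, the notebook equivalent of that "Load to table" step might look like the sketch below: reading the uploaded CSV from the lakehouse Files area and saving it as a Delta table, with the dash replaced by an underscore in the table name. The file path and table name only approximate the ones from the demo.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()   # pre-created in a Fabric notebook

# The uploaded file sits under the lakehouse Files area; the path is approximate.
csv_df = (spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("Files/annual-enterprise-survey-2021.csv"))

# Same outcome as the "Load to table" option: a Delta table in the lakehouse.
csv_df.write.mode("overwrite").format("delta").saveAsTable("annual_enterprise_survey_2021")
```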
So with that we have completed the demo. As a summary, we went through everything related to pipelines: using the copy activity, using the templates, using multiple different sources — an Azure SQL database, a storage account with Parquet files, and even the sample data — and to finish we also saw how to upload a CSV file. How was that, Erwin? Awesome or not?

I think we have covered a lot of stuff. We have seen all the copy activities, we have seen how to start from a template, we have seen the copy data tool where you transfer data — and have you seen how fast it was? We transferred a 2-gigabyte Parquet file from your blob storage to Fabric within a minute, so that was really fast, and the data was directly visible: I transferred a Parquet file and it was immediately available as a table for the SQL DBAs, or even in notebooks. How cool is that? I love it. Hopefully you can do it yourself as well and follow all these exercises Javier just showed you through the Learn Live module.

But before we continue, shall we check if everybody can answer some questions? Yeah, so let's go to the questions. The first question of today is: what is a data pipeline? Is it (a) a special folder in the OneLake storage where data can be exported from a lakehouse, (b) a sequence of activities to orchestrate a data ingestion or transformation process, or (c) a saved Power Query? If you want to join in, please scan the QR code or go to the polls and fill in your answer, and then we will check the result and see if you have learned a lot from Javier and myself today.

This is a nice way to evaluate what we have learned today, right, Erwin? And if you want to get ready for the exams that are coming, it is definitely a good way to practice.

Yeah, so this is the learning approach we showed you at the beginning — it's one of the modules — and I think this will also be covered during the Exam Cram on February 13th, where you can subscribe. For me it's definitely a goal to get my DP-600, and hopefully for you as well. Currently we still have a lot of people voting, and it's going the correct way, so I think we can go to the next question. But before we do that, we will of course give you the correct answer — oh, why is it not clicking — and the answer was, of course, a sequence of activities to orchestrate a data ingestion or transformation process. I think almost everybody answered it correctly; that's cool.

Let's continue to the next question: you want to use a pipeline to copy data to a folder with a specified name for each run — so something similar to what Javier just showed you. What should you do: create multiple pipelines, one for each folder name; use a Dataflow Gen2; or add a parameter to the pipeline and use it to specify the folder name for each run? Once again, scan the QR code and fill in your poll. Everybody is voting faster and faster, so that's cool to see. Let's see how it goes — it's going very well, pretty good, right? So, curious about the answer? Here it is: add a parameter to the pipeline and use it to specify the folder name for each run. As you have seen, Javier showed you one of the copy activities where you have to define the source table schema, the source table name, and the destination table name, and this is one of the ways to make a pipeline useful for more destinations or even more table copies. I think almost everybody voted the correct answer, so let's continue to the next one: you have previously run a pipeline containing multiple activities. What is the best way to check how long each individual activity took to complete — (a) rerun the pipeline and observe the output, timing each activity; (b) view the run details in the run history; or (c) view the refreshed value of your lakehouse default dataset? The voting is all yours.
Yeah, I guess pretty much everybody voted, and the answer is of course B: the run history details show the time taken for each activity, and it is also optionally available as a chart. With that we are done with our questions, and we have a small summary left of what we did today: we showed you the pipeline capabilities in Microsoft Fabric, we showed you how to use the Copy data activity in a pipeline, we showed you how to create pipelines based on predefined templates, and we showed you how to run and monitor pipelines. And we even showed you a little bit more: how to upload files to one of your lakehouses in OneLake, with the upload mechanism in the portal but also through the OneLake file explorer — your OneDrive for data — installed on your laptop.

So with that we are at the end of this session. Maybe we could answer some more questions, but I think most of them have already been answered by our awesome moderators, so one big applause for them — thank you. Before we leave: if you want to test it yourself, you can go to the next slide and go through all the session details again, and otherwise, tomorrow evening at the same time you will have the dataflows session. With dataflows, as we already told you a little bit at the beginning, you can do many more transformations in between — if you liked the old SSIS from a really long time ago, this is more what that looks like. So tomorrow, the 31st, we have "Ingest data with Dataflows Gen2 in Microsoft Fabric". If you want to get started with Microsoft Fabric, you can go and upgrade to Fabric and start a 60-day trial period to do all of this and to get your DP-600. And you can still use the MSCUST code for a $100 discount for the Microsoft conference in Las Vegas in the third week of March — hopefully I will see you there. Thank you, Javier, for today, thank you to the viewers and to our moderators.

Just — thank you everybody for joining us today. We hope you learned something new, and, well, start diving into this new, awesome world of Fabric. So thank you everybody and we'll see you soon. Thank you, bye-bye. Bye.
Info
Channel: Microsoft Power BI
Views: 4,617
Keywords: microsoft learn live, fabric, data-analyst, data-engineer, beginner
Id: kzqZGr-S0z8
Length: 67min 56sec (4076 seconds)
Published: Tue Jan 30 2024