SSIS Tutorial | SSIS Tutorial for Beginners | Intellipaat

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[Music] hey guys welcome to this session by into the pipe so the most important business intelligence component in Microsoft's arsenal is essa' sized the co-responsibility is the movement and cleansing of data without the cleansing and movement the other components would not exist or at a minimum report bad data so in this session we'll comprehensively learn about SSIS so let us take a quick glance at the agenda we'll start off with a quick introduction to SQL Server and then we'll learn how to create our first exercise project then we'll understand the data flow on the concepts of data flow going ahead we'll look at the concepts of oily DB and learn how to create an oily DB connection and finally we'll learn the different types of integration service data types what you guys now let us move on with the session if you are going to start off pulsipher for the entire demo and before I get started any further I wanted to take an opportunity to talk a little bit about the software's that are involved and and where you can get the links and where you can download these software's so rest assured these are all free so one of the one of the best things about Microsoft and especially for you folks you are going to obviously use the sequel server 2014 evaluation version and if you just see my screen right now you should be able to see that just go to evaluate now when you click on this you should be able to see all the different options here and just just go to sequence over 2014 sp2 and you can just try to stress out from here okay and it'll ask you for a few details on your email address and other details and you can just plug in those details download the software and that's basically a 6-month trial that you get to all the features and functionality available so that's how you can get sequel server 2014 up and running and and I do want to highlight here that you need to have a Microsoft account so make sure that in case you don't have a Microsoft account you will need to create a Microsoft account so it's very similar to what you have done for Outlook but make sure that you have a separate Microsoft account it will ask you for those details in case you don't have it it's free to register and create that account so you can just download the software and the additional steps install sequels over a pretty straightforward simple menu options the graphical it's a GUI that you will get takes you through a series of Windows and it's pretty easy to get the thing up and running just ensure that there are a series of checks that will happen so step by step you will have a set of checks that will happen and if any of the checks are failed automatically the installation will stop and it'll give you a detailed description as to why that check failed and and how we can ensure that the requisite the prerequisites are installed before installing sequel server 2014 one important thing I do want to highlight is the feature selection selection window unfortunately I can't show you because I already have sequel server installed in my system so probably this is something you might encounter when you do it from your end but do ensure that when you are setting up sequel server when when you are setting up sequel server ensure that you get you'll get a feature selection wizard where it'll ask you to select what all features of sequel server you want to install and be very sure to select select also you can actually go for select all as the safest option and ensure that you have integration services analysis services and reporting services all set up all checked in that particular box okay so integration services that is SSIS SSAS analysis services and SSRS reporting services they are all a part of sequel server so it's just a misconception that people have that they are not part of the sequence of installation so they are part of the sequel server core service alright so it's pretty straightforward install sequel server and finally the the final piece of the component here is to get data tools up and running data tools is the front-end we call this Microsoft sequel server data tools Co policy of the link for data tools here so this is where you can install Microsoft sequel server HR tools and s that's the final piece of the puzzle again a free free installation and a free download so make sure that that's basically the Visual Studio 2013 shell that you have there okay and it is mainly for this reason that you need to have dotnet up and running so one of the reasons why I mentioned initially the dotnet is a prerequisite is for data tools to be working so dotnet is a prerequisite for that okay so broadly there are three things you have to do a you have to make sure that dotnet the proper framework is installed B you have to make sure do you have sequel server 2014 evaluation version setup obviously if you have a if you have an Enterprise Edition sequel server as part of an organizational account whatever you're working for the organization and if you already have sequel server setup that's the best case very good but in case you don't in case you want to try out in your personal machine this is this is one of the ways and one of the free way of doing it okay and finally you gotta have data tool setup on your system so data tools is the is the development environment we call it IDE you're all programmers you're all techies so you all understand what an IDE is ID e stands for an integrated development environment IDE is where we build solutions ok we talked about SSIS we talked about education services we talked about ETL we build packages we talk about analysis solution analysis services we build cubes and we talk about reporting services we talk we build reports so all of that stuff happens in that sequel server data tools IDE ok I'm gonna take an opportunity now to show you some of that stuff right now that's already installed in my system obviously you wanna get this up and running in your respective systems but it's gonna take a moment to show you what all I have installed right now and just go to my go to my sequel server go to my sequel server installation you start up you can see that a sequel server 2014 a set up and here you'll be able to see and all of you will be able to have the same kind of view one you have your software setup as well you'll be able to see something like this and you see that Microsoft sequel server 2014 management studio and these are all the sequel server related components that have been set up and this is integration services analysis services reporting services these are all all being set up here ok and the other piece of the puzzle so this is this this will come as part of the sequel server installation and this is the part that comes as part of the data tools installation this is what you can see is data tools ok as sequel server data tools for Visual Studio 2013 so when you click on this and let me just just wanted to show you the interface here we'll discuss this in more detail when we talk about getting our hands on aspect of things just in a while today itself I just wanted to take you through this and just this one and open up on education services project here so this is nothing but the IDE and the how you might have worked in Visual Studio you should immediately you should immediately recollect this ok I think it's a password protected package I'm going to go ahead and let me let me go ahead and open up something else and show you then we'll open up this latest integration services project and as you can see it's nothing but a button IDE to develop an environment you have a set of tools drag and drop pretty simple UI easy to develop and that's what that's what data tools is so in the in the prior versions of sequel server even before 2012 so in 2012 starting 2012 my sequel server 2012 of course Microsoft rename this to call it sequel server data tools prior to 2012 which is a previous version up to 2012 or 2008 r2 so up to Dylan 8 r2 2008 it used to call it business intelligence development studio so bids is what what we used to call it until that portion and obviously even before that even before that when SSIS and it came into existence it was it was called DTS so built bi bi D as bids is what we used to call it business intelligence development studio is just a front an IDE that you have and finally sequel server of course this is the management studio interface I've already shown this to you and if you see this is this is basically how you interact with your sequencer but it's the graphical user interface that you get you type in your queries you know in drag and drop you build your queries and all the stuff that you will do to interact with your underlying SQL Server database engine ok so you could you have to think of it from a diagrammatic point of view from a very conceptual point of view is sequel server is like the central component right sequel server consists of primarily four components so it's your database engine this is the database engine is Victor really the core component of sequel server you have your integration services have a reporting services and your analysis services which are all a part of sequence server ok so most people of a misconception the data tools is what is driving integration services their tools is what drives analysis services but but no data tools is nothing but just an IDE it's nothing but we should studio it's all Visual Studio if you come to visits you can see that is indeed getting integrated shell you can see this 2013 channel integrators they go to file go to new go to projects the interface here is very similar to what you will get and if you're working Visual Studio before you'll understand what I'm trying to say right it's only difference is you can see that the BI templates have been having pre-installed here ok I think we're going to talk about this in more detail in a while we just want to show you this so that's on the software installations front and now we're gonna gradually get deep into sequel server integration services that's out there's a first module that we're going to talk about SSIS and the first the first objective is really to make you comfortable with the IDE so one of the objectives of today's session is to ensure that we fire the tool we fire the IDE and we walk you through some of the integral components of what goes into building an SSIS package right so again start up your start up your SSIS data tools and go to the new project section just as we did a while back go to file go to new code you can either fight it up that way or it can go to new projects like this so obviously the initial window that you will find are a set of announcements and you know if you opening this for the first time you'll see something else in terms of tutorials and other stuff that might come up initially but this is basically the very simple interface that you have mutual littles recent of some of the recent projects you've worked on go to new project and it gives you an option to select what all projects you want to viewed here so it can be an integration services project analysis services project or a reporting services project so for our intensive purposes we can go ahead and select the integration services project as the SSIS gives you a brief description of the type of the project here so this is the location in which the project is saved I'm going to talk a little bit about that in a while specifically what you mean by location I can see it's creating the project so everything that we do as part of an SSIS solution as part of ETL all if you remember from the last session we understand where SSIS is required is for is for performing extract transform and load everything we do inside this particular IDE and when we create a project the first time you will see this option getting started so if this takes you to again the Microsoft repository gives you a few examples and stuff like that and they already understand we already talked about MSDN and you already know the place where you need to go to for additional reference material so go ahead and close this for the time being so you can see an SSIS toolbox opens up by default it is still loading and this is the designer interface that you will see you can see it is the package dot TTS X I'll talk a little bit about what is the package and other stuff in a while so a few other windows will also come up by default unfortunately for me it is right now hidden I will talk I mean again some other reasons why it is written is because the way Visual Studio works is it really takes it remembers the settings of your previous session so if you if you did something in your previous session it will remember it and it'll load up all the windows based on the settings of your previous session so in the last session what I did was I I closed a few windows okay so it's a very easy way to bring up your windows go to view and you can bring up all the other windows so for instance if i if i for instance if i close this say SSIS toolbox all of a sudden you will open up visual studio and you will wonder okay where is this window I don't see it and the easy way to bring that up is I can go to view and bring up the respective windows so one of the one of the default windows that all of you will see a solution Explorer and the other default windows that all of you will see again is properties window so these are the two default windows that all of you are gonna see I said I intentionally did this to give you an example as to how to bring these windows back up so that you don't panic when you don't see it so might so happen that the way you configured your vision studio the way the organization has configured your vision studio is these are these are hidden but it's very easy to bring that up so don't panic if you don't see them it's very easy to configure it as per your specifications okay just to talk a little bit about this solution explorer obviously like just like a folder hierarchy it's kind of gives you a folder it's very much like a hierarchy of the project like where your project fits in what are the components of your project and it just like it's just like a file explorer okay it's just like a file explorer similar to that properties needless to say what properties mean properties is very context sensitive so right now you can see that properties here refers to the package it depends on where you're clicking I haven't talked about toolbox tools yet but go to drag and drop a few items and show you specifically what I mean by that it's very drag it's a drag and drop interface and talk about these things later but just wanted to show you properties again when you click on the background you can see that now I'm configuring properties for the whole package you can see the selected package when you click on this you will see I'm configuring properties for the execute sequel task you can configure your property then they click on this you can see I'm configuring properties for a data flow task it is context-sensitive that's the reason why I say it is context-sensitive because depending on where you're clicking you can configure the properties of that particular object okay and further you can speak on this you'll see that that you can configure the properties of this and you can package this parameters so depending on where you click you can configure the properties of that particular item right so it's very easy to delete items so you can just go you can just click on it click on delete on the keyboard or you can right click on it click on delete drag and drop interface from the toolbox you can even drag and drop items here ok again I will talk about the specifics as to what each item does but just wanted to show you the interface broadly so this is the designer interface we cannot design the stuff so you can see in square brackets design is written this way you design the packages ok so this will talk a little bit about the solution Explorer solution Explorer as I mentioned is just like the File Explorer and one of the things I mentioned initially when I Square the new project is where this project is actually saved so let me show you the location where it's safe so this is saved in a place called My Documents of course which all if you understand so let me let me go you let me take you to my documents and you see the visual studio 2013 folder there so in my documents you see the visual studio 2013 folder and inside that he will have the project's folder and that's where you should be able to see all your projects so hope all if you understand this particular part right you can see go to documents go to go to visual studio 2013 and go to projects inside that go to projects and and this is where you should be able to see all the high-level solutions that you created now be very mindful of the fact that what you're seeing right now is this so you can see that right now the solution name is also integration services project 10 okay so please be mindful of the fact that your project lives inside a solution okay so the first thing is the solution you can see there's a small differentiation in the icons that I can start little different so this is the icon that is associated with the solution it's called a dot SLN file okay and within a solution you have what we call a project so this is what we call a project and this is what we call a solution you can see there's a solution properties make sense yep and another very important thing I wanted to highlight also is in terms of the you know extension you can see that's the indication services project 10 and you can come here and you can see that it's a folder for project 10 if you go inside it you should be able to see that this is indeed the original studio solution just just over over it just go to the fluid tool tip that comes up and you can see it as a type is a visual studio solution you want to look at the extension of that file you can see that that's that's what we call a dot SLN extension as a solution file okay so solution is what encompasses all the projects you can off you can have multiple projects inside the solution which is pretty pretty much do the experiment early from the overall diagram the picture that you've seen you can see that the solution lives above the projects you can actually create new projects inside that solution will add a project inside a solution okay you can have multiple projects inside a solution make sense so this is what we call an integration services project and obviously if you if you dig deep into if you dig deep into the solution if you just go instead enter into the folder here now you can see that what we call the project that's a dot TT proj okay so this remember the extensions so this is the dot DT proj which is the project and dot SLN which was a solution okay and next step is what we call the package package is obviously the heart of your integration services projects obviously every anything and everything that you perform in NSS is project is a package so you I mean all the conversation in a NASA science project will revolve another package so you'll almost always hear people talking about exercise packages and because that's s what you have so what the stuff that we are designing here is ultimately what is the NASA science package that's that's all that you're designing here so inside a project you can create multiple packages you can see by default the first package is being created you can obviously rename it let me go back and rename this to demo one that's my first package I can go ahead and create one more package I think that's very simple to do that right click on it click on new essence is packaged and I can create one more package which I can rename as demo - very simple to perform these activities so these are two different packages but obviously you can see two - two tabs open up all sources so that you can easily navigate with them to design interfaces for demo1 and demo2 you might be thinking that they're both one of the same window interfaces the idea is I'm getting for both actually it's not so probably I can come back to demo one and I can see automatically when I select them or one here also the selection changes to the demo one and when I change the demo to automatically the selection also change the demo to here and vice versa this is a pretty straightforward things to navigate but then let's see if I drag and drop execute package so let's let's say that put some components here they may not make any sense at this moment just wanted to show this to you so this is these are the changes I've made in demo one but when I go back to demo - you will see that those changes aren't present so which means that the design window that you're seeing this is the design window that you are seeing is different and easy is it is applicable and is exclusive only for that particular package you're working on it's not it's it's not it's local to that particular package you're working all right it's a very very important concept it's easy to get confused in the beginning but just just just keep that concept in mind that it the settings are really local to the package you're working on very easy to delete a package right click on it and click on delete or you can click on delete on the keyboard and it will delete that particular package project parameter stands for at is the param stands for parameters so parameters are very similar to variables we'll have a discussion on the specifics as to what parameters are useful but conceptually they're very similar to variables and you all understand what variables are right so it is especially useful for scenarios where you want to transfer some information from one package to another package let's say you're calling one package from another package and you want to transfer some information parameters are a great way to do it so that's what parameters are primarily useful for and connection managers we're going to talk about in a lot of detail in a while just in a while and finally a miscellaneous where you can add in other kinds of files and stuff let's to talk a little bit about the other five tabs that you can see here you can see that all that we'll be doing in a control phone is something called data flow then there's parameters here there's event handlers here in the package Explorer here and all that you are seeing here at this moment these are all exclusive to the package that you're working on okay very important so right now we are working on package demo one to TTS X and for our demo let me just go back and create that other package that I just not delete it let me go back and rename this to demo two toward a CSX now one very very important thing I wanted to highlight this is the extension of the package now be very be very very careful not to change this if it's I mean it's very easy to you know go back and try to rename the whole thing but don't do that okay it becomes a problem and I mean he just corrupts your code okay so so this is an extension of a package store DTS X dot DT prodigy extension of the project and dot SLN is the extension of the solution so be very very careful not to tamper with the extensions go back to demo one two TTS X and you can see that these are all settings that are applicable for that particular package control flow data flow parameters handlers and package explorer okay now whatever we have been discussing here okay I'll come to this in a while now what is control flow control flow is alright so now we understand about so far what we have discussed so far we have discussed what is a solution we have discussed what is a project so obviously our solution can contain multiple projects you can probably you can think of it in those in the under point of view of an msbi solution you're building one MHP a solution inside a time a space solution you're building an SSIS project and analysis project and a reporting services project you can build all your projects as part of one solution okay you can think of a package and existing within a project and you can have multiple packages within the project so this is one project that I have and it's very very extensive ex-military their hierarchy you can see that this is the project and inside the project I can have multiple packages now inside the package is this demo point but this is inside the package I'm gonna go ahead and close this other window right now inside this particular package you can these are the items that you have these are what you have inside a package so whatever designing that you do you do - based on these five components here control flow is the high level Ori of your package and it is in control flow that you will define all the control related elements of your package now what am I trying to say here think of it like a unit of work think of it like a function it's energy that I really like to take here because it's easily understood all of us understand functions all of us understand all of us have done programming at some capacity all of us understand functions right think of it like a sum function think of it like you know some kind of function think of it like a simple function call so your function contains your function name your function definition the body of the function and typically what happens is from the analogy that I like to take function is like a black box right so you put some code in that function and then you call that function from another part of the program that function executes and that function returns you some value right so conceptually what am I trying to get at let me open up paint and just explain this to you once again so this is your function and again assuming that all of us here have some level of programming capacity in terms of what is a function all of us understand what functions the functions are like a black box so you have the function name you have a function definition what is the function supposed to do let let's it's a very simple function and I'm not going to get deep into you know specifics on what functions are and this is this is by no means a discussion on programming concepts and what are functions but let's say you have a very simple function called sum so let's say you have a sum function and the sum function is taking in two parameters int a int B and please don't please don't look at the syntax here because it has we understand in today's world you know there there are so many so many programming languages and and syntax is really irrelevant in today's world so it's just more on the concept part so you see if this is your so let's say it's returning a value also so what you're essentially doing in this particular function is you're returning a plus B so let's say you have another variable C that is defined that C is equal to a plus B and what you're doing after that is you're just simply returning the value of C right so this is what you're doing as part of your function and now what you're doing is so this is your function is where you defined your function okay right so obviously of you've given your function you've defined your function and and what you're doing here is you're calling your function so for another part of the program another part of the program you're calling your function so let me give some dot dot dot and stuff sorry I'm sorry I I'm pressing and pressing the should press ENTER here all right I hope is understandable to you and I'm calling this function here so I'm saying some I'm saying some 1 comma 5 I'm saying some 1 comma 5 alright so let me let me put this in a printf so I can actually put this in a plaintiff or what I can do is I can say int result is equal to 1 comma 5 and then what I'm doing is I'll let me let me directly do it from here I can I can actually directly do it from here I can say print I can say print the result is sum 1 comma 5 names the plus or whatever operators that you can you want to use here right so again depending on the programming language you're coding on but conceptually I hope you are able to understand what I am what I'm driving here so what I'm driving here is the component here where you have described the function this is what the function is supposed to do ok so you have described the function here the function is taking into arguments and it is adding those two arguments and it's returning the sum here that's all that the function is doing okay so and just so that I don't confuse you here let me use the ampersand operator I'm assuming a different program and let me just assume let me just assume ampersand is for concatenation so that you don't get confused why the plus is used here anyway so coming back to our discussion and percentage is used for concatenation I'm assuming here okay now what I'm doing here is I am just calling the function sum 1 comma 5 and obviously I have no understanding what that function does it just so happens that I'm using some but it could so happen that this is something else this could be something like FG I mean this could be something like anything I mean just just just think about it could be something like result set name it name it anything you want results it 2017 something like that okay and now what you're doing is you're calling results at 2017 from here okay so now now it completely you know you have absolutely no clue so if you just go back and think about it results at 2017 1 comma 5 you're absolutely no clue what happens inside that function all you know is you should you should you should enter some input and it should get some output back obviously each it's not the best naming convention obviously you should name it a sum but just wanted to give you an understanding what it is right so you have a piece of code written here it's performing some function is performing is have some description it is doing something but it's like a black box it's like a black box this is your function you know that if you get this input you know that if you pass in this input you will get an output as return value and as a return value that you get so but what happens inside the concrete is a black box okay so when you are calling this function from here you know that if I give this input I should expect this output but how you're getting that output that's a black box to you you're not really cared about that you're not really concerned so just to give you a very simple example you can either do it like this in C equal to a plus B there in return or C or it can directly return an A plus B so it really doesn't matter how we have implemented it so you can you can directly return the a plus B here you can probably come here let me just let me just straight down the a plus B directly here that's that's also something that you can do it just depends on how you implement it but as I mentioned the implementation does not matter ok this is a very brief description I want to have all functions here and what I'm trying to drive home again through this disk disk you know through this discussion here is control-flow is where you call those functions control flow is where you call those functions you have those blocks just pieces of code that you're calling so control flow is like is the place where these are all functions and control flow think of it like that so whatever you have here all these tasks and again as I mentioned it's the very high-level discussion we are having today all these tasks are we can only going to discuss in more detail what each one is meant for but each one of it is like a function call ok what is happening behind each one of them is handled separately but at a very high level you are you are calling this function so these are like function calls and you're defining through this arrow you can see this green arrow jutting out through this arrow which we call a precedence constraint you can see this properties window highlights it's just select on this and since the precedence constraint so you are defining the order of execution you're defining okay first do this then do this and then if successful do this a lot of other details around it but broadly this is what you're doing so control flow is a high-level description of the order of your program so these are and each of the tasks that you can see here in control flow are are the individual components or the individual function calls okay what is the code in the function just justice if we looked in the function called where there is one area where you've described the function where you have the function code which which is where you describe what the function is supposed to do and there is another part of the program where you're actually calling the function where is that code written where do you where is that code written where you're describing what the function is supposed to do all right so all this final F say that control flow is where you are calling the function now we are asking what is the function supposed to do how do you said that you said that in the properties window which is what you do by double-clicking here okay so you can simply double click on a particular task to view the properties window of that particular function right so this is the this is the analogy that I like to take and see the double click on any task and you can see the properties window and this is where you describe what that function is supposed to do okay I either double-click on it or you can go ahead and right click and edit on it okay this is where you set the properties of it not the function okay so think of it this way that and now the other question that that you might have is just now just a while back I say that okay if I want to set properties of this particular executor sequel task I can either double click on it set the properties here which comes up as a nice graphical user interface where you can make and categorize I get these options in a very neat interface but the same thing I can set here also okay so just multiple ways of doing the same thing but there are certain properties that you cannot set here but you have to set here so just just keep that in mind also alright so just just a better UI that you get when you double click and or whether you right-click an editor on it unlike compared to when you when you do to set the properties from here all right so just just multiple ways of doing the same thing now so hope this is clear to all of you so again just a device so what we are doing in control flow is setting the high level order of execution of the program so think of think of the packages of program think of the packages of program that you're writing ok exercise is the project is the project inside the project which is a kind of multiple packages so obviously you're I mean it can consist of multiple so each package is like a program and control flow is where you design the program and ultimately each program is nothing but a set of function calls if you think of an already programmatic standpoint each program obviously you'll have in independent lines of code but then eventually if from the point of view of modularity and all the other advantages that a function gives you you will be making function calls in a program so think of it like the program and in control flow you are building that programs first this then this then call this function then call this function then call this function and so and so forth so each task is like a function call in control flow all right now the other kind of item that I wanted to mention and again they just to just to take a quick digression here and again this is related to our discussion on control flow you can see there are two kinds of items in control flow one is a task and the other is a container container is also like a function think of it like a function call also conceptually ok now what is the difference and these are the only two items you will find if you scroll down the task list the toolbar that you have here just look at the top these are all tasks these are all tasks safe for these three items which are containers all right oh this is clear to all of you now what is the obvious difference between a task and a container a container can contain multiple tasks inside it okay think about it this way that whenever you are building the packages here so you just think about it so what is this what is this package this package is is also acting as a container okay so is this the highest level container that you have so you can have multiple tasks inside that package because that package is acting as a container okay so you can have another container and you can obviously have tasks inside of it so a container can contain multiple tasks inside of it simple enough and obviously a container can also contain multiple containers inside of it so this is of and you can obviously have one more container inside that container and again inside that container you can have multiple tasks so tasks which are present inside the package containers containers can contain multiple packages containers can contain I'm sorry sorry my mistake containers can contain multiple tasks and of course other containers that's why they are called a container and obviously when you don't have a container and let's say you have an independent task like this this task is obviously present within the package so packages also acts as a container alright so everything in control flow is either a task or a container keep that in mind but conceptually just as we discussed everything is like a function call so these are all individual components beat at our speed a container so you can contain you can place a container here but even this is like a function call and if you have to set the properties of this container you will again double click on it and you will again set the properties of this continue which is just like you know setting the properties of the function what is the function supposed to do so these are all function calls so call this function return a value then go to the next step call this function return a value go to the next step call this function return a value go to the next just that's all the tutoring in the control flow okay so you can define the order of execute of your program dataflow similar to all the tasks that we have here that's a very special kind of task called dataflow so these are all the tasks if you see okay now again we don't understand anything about these tasks yet so we haven't yet described in detail about these nuts we can do that gradually but there's a very special kind of task called dataflow that is also a task you can see that that's it's in the favorites section favorites is some of the some of the most important tasks that you have in control flow obviously execute sequel task is an important point dataflow tasks also very important tasks reckon drop it is just a task like any any other so it's and as we said in control phrases like a function call it's analogous to a function call so how do you set the properties of dataflow just the way you set the properties of other tasks just go ahead and double click on and open it up and when you double click and open it up something very interesting happens unlike for the other tasks where you are taken to a properties window separate separate window pops up in data flow task you are taken to a separate tab remember the time I showed you so that is the tab that you're taking to automatically when you clicked on data flow so this is also like a function call where you are going to edit the properties of that particular function the description of a particular function however the way to set the properties is through a completely different interface and it is that complex okay so rate of flow is and as one of the reasons why it makes data flow such a such an important important component okay so keep that in mind so when you when you want to set the properties for a data flow it takes you to a separate window that's the first tab that you would take that it takes you to and obviously you can have multiple data flow tasks sake let's say is a data flow task one and you can have one more data flow task here and and the other thing that I wanted to mention related to our discussion on this is very easy to rename your tasks and obviously as part of your standard naming conventions applicable for your project you should all know always rename your tasks and it's very easy to do that you can just click on this click on this once and you can actually rename your task call this data flow task in short is called DFT and I can call DFT demo one I can have one wrote ask again drag and drop it again the drag-and-drop a task or it's very easy to copy paste a task also you can control C control V or you can got directly copy right click paste I can control C and control V and I can see that I have one more data flow task I created here you can drag and drop your components very easily on the window so I'm just showing you how to rename just click on these like click on the names here it's very easy to type in the names this I can rename the individual tasks here and it's very easy to move around the components you can just move them around okay so you can just arrange them as as and when you want it and we were discussing about control flow we're discussing how each how each component in the control flow is like a function call and you are really organizing each of the each of the function calls and designing a broader function a broader a broader workflow at a very high level in that control flow okay and if you can configure the properties you can double click on them but it data flow data flow task is a very special kind of task I named it the D key F my mistake I'm go ahead and put it a DFT very simple to edit as you can see just go here DFT and this is DFT demo2 and these are the two data products that I've created at this moment okay so how do you configure properties of the data flow task just double click on them and it takes you to the data flow tab automatically and this is as you can see that you can same toolbox window is visible and you can drag and drop components here and form your data flow task ok you will observe that the components the toolbox components for your control flow and data flow are different if you go to your control flow you can see these are all tasks or containers as you were saying and analog this in a very analogous analogous manner these are what we call as functions function calls as we were explaining but in data flow these are these are what we call components and there are broadly three kinds of components that we have in our data flow one is a source one of the transformation and the other is a destination the source our transformation and explanation just so you can see these are the components that I have in a data flow and data flow is really the heart of a DTL data flow is the heart of ETL where you're actually performing you're actually extracting data from your source you're transforming that data and you're finally loading that way down to your destination so that's the e for the extraction the T for the transformation and the L for the the load and the e for the extraction stands for the source you can see these are my sources the e the T for the transformation stands for your transformation now the transforms and the common transforms and finally the L for the load comes with the destinations which is the destination spot here there are the destinations obviously the source compost and destination assistant are pretty much the same thing as the source assistant cell for the e part and the destination itself or the L part the load part so this is basically the heart of the enter data flow as you can see this is I come up here opens up DFT demo one if I come up here it opens up DFT demo 2 and obviously whatever I perform in in demo 1 is going to be different so you can to come here it can probably can drag and drop off again very simple to drag and drop the components here just as we did for control flow these are all toolbox items so this is what you're doing in data flow this is you consider is TFT demo 1 and I can come back here and I can go to the FT demo - and you will see that those components are no longer visible because that is something that you did in demo 1 not in demo - all right so you can come back to demo 1 and you can see those same components come back up here you can navigate either going back and double click on them and you can navigate or you can navigate directly from here you can navigate to the individual data fro tasks present in this particular package directly from within here guru demo 1 and then go back to demo 2 pretty easy pretty self-explanatory what and how to navigate these options right so hope this is clear to all of you at a very high level of what we are discussing on control flow control flow again is a high level high level overview of your program where you're defining what all you are doing what to perform step a then step B then step C and data flows really is the heart of the ETL that's where you're defining what is your source one of the transformations you're doing and what is the destination that you're loading that - and it's basically another kind of a function call if you think about it there's just a Mahdi you'll think about it from a buddy again from a very function standpoint a very modular functional standpoint so whatever you are performing within a DFT is still kind of a function call series you're performing a task so you it's like a black box so it's just like saying you know that I'm going to load this data from an excel file of retailer a onto my company ABCs data warehouse that's all that you know how is it doing it that's not my concern but that's just DFT that's just like a function module that you have here okay hope the analogy is clear and I really like to use this analogy because it really drives on the point what is the fundamental difference in the control flow and rate of flow now just like we did for the other windows at any point in time if you think the racer science tool box is gone it's it's again a scenario of panic many of you panic and one of the first things that we try to do is go back to the view and see toolbox bring this up and you don't see the tools and you start panicking okay what's happening I don't see the usual tools what to do well you have to understand the toolbox is specific only for SSRS projects okay SSIS there is another this is a different way to get back the toolbox so the toolbox that you see from view is is not applicable for SSIS projects so for SSIS you need to open it up from here SSIS toolbox alright so this is going to be relevant for SS artists and SSIS this right hope this clarifies parameters we're going to talk about later I already give you a high level understanding of what a parameter is it's very similar to a variable event handlers again we will talk about it in a little more detail later on but at a very high level at a very broad level event handlers are applicable for certain exceptional situations that your package may encounter it's just like your try catch block from a programming standpoint event handlers are used to handle errors and I understand when I say errors I'm making it a little more specific but from a very generic standpoint event handlers are used to handle events now those events could be errors those events could be anything else whose events could be warnings those events could be successful completion of a particular task those events could be anything and you define event handlers through this particular tab and again all that you're seeing all these items that you're seeing are applicable for the whole package and please understand we are still in demo one right now so all that you have done here is nothing is moved to demo two so all that you're seeing is again specific only for demo - okay so whatever I am performing whatever event handlers and defining they are ready in respective for that particular package okay and the other other important item that I wanted to mention is more from a terminology point of view I did mention that in control flow everything is either a task on a container the other terminology that we use and again an important naming convention that we use is that everything in a control flow is is an executable very important naming convention we like to use here everything in a everything in a control flow is and executable okay and it's rather I think it just connects the dots initially we're saying that is a function calls and you can just think about it ultimately a function is nothing but an executable so you give an input you get you return a value and it's executing something and it returns you something so it's ultimately an executable so think about it that everything in a control flow is an executable why is that important now that is important because next up I'm going to go to a package Explorer and now you will see sorry next up I will go back to yeah package Explorer and even see that in package Explorer you see the term called executables so let me let me just collapse everything so what you're seeing right now is demo one demo one is obviously our package and what is package Explorer at a high level package Explorer is nothing but just like your file explorer it tells you what all contents you have in your package same thing in the file explorer eager to see what all is in your file or folder package Explorer is how you get to know what all is in your package as simple as that so you can expand it and you can see that these are all the items in your you know package you have the connection managers I don't have any connection managers defined no lock properties defined nothing yeah none of these most of these things will not make any sense to you right now but at a high level what will make sense right now is executables executables is all the items that we have in the control flow how many executables do we have one two three and four we have four executables and if you go back to the package Explorer expand executables you can see this at a four executables I have all right so this is what executables mean and it sees a precedence constraints also explain precedence constraints very briefly to you you see this green arrows or the precedence constraints now again let's not get too much into what is the precedence constraint now we're going to discuss that in detail going forward but this is how you define the precedence constraint these are the total three precedence constraints that we have any expand this here you can see that there had three procedures constraints constraint constraint one and constraint three you can obviously name them you can name them appropriately and if you just click on this and if you go up you can see that the name is constraint if you just click on this and if you if you go down it's the name is constraint one click on this and see the name is constraint 2 it's very easy to change the names from here right so it really depends on the naming conventions you are following so those are the closer to precedence constraints and at a high level those are the executables and and what do you do within the executables so let's let's let's take for example I took the example of DFT demo one and you can see in demo one I am doing and the same way that we did for control flows you can in data flows also you can actually rename the items if conditional spirit I'm going to call it CS demo one right so I'm going to call the CS demo one and here I'm going to call it DZ demo one so what here is as part of my DFT demo one I have created these two components as part of the data flow CS demo 1dc demo you can see dat demo - is empty nothing I cannot configure anything here so if I go back to my package Explorer and if I expand DFT demo one if I expand DFT demo - you will see I have nothing obviously as I already explained to you that same as in control flow you have executables and data flow you have components so whatever you have in data flows are called components because they have no components I've defined as no event handlers I've defined and variables is a default feature so obviously there are variables yeah I'll show you how to create variables but then there two kinds of variables that you have in science basic you have the system variables which are obviously created or maintained by default and you have user-defined variable sing a/c system proportionate sign the variable name so this is basically what it defines the namespace the first part of the variable name which a system that is basically the namespace okay and obviously there are user defined variables which you can create and which you can create in the user standpoint I don't have any user defined variables so and hence it will not reflect here alright next up I wanted to show you just as I was showing it for DFT demo - I don't have any components just to show you DFT demo one and you can see now all of a sudden I do have the components click on the expand this and you can see that there's my DCT mo 1 and CS demo one is it a two component that I have as part of DFT demo 1 that's the task this data flow task and demo one that I have alright hope this is clear to all of you yeah it's pretty straightforward pretty straightforward what we are explaining here and you can see the same way that enough for another question that you might be asking is okay how come how come variables are present in both control flow and data flow okay the only difference here is scope so what we are talking about here is oh now the same way that in programming terminology you will you have local scope you have global scope we might define a variable within a block and that variable might be executable access if and only within that block the same way if you have variables that are accessible within that particular component or within that particular task or within the whole package so what you're seeing here are variables which are global - the whole package what you're seeing here are variables which are local only to that particular task or component right so come back to control flow and what I'm going to show here is how to create a variable very simple to do that right-click on it and you see the variables option you okay so close variables and you'll be able to open up the variables wizard and you see the first option is add a variable the tooltips will come up very simple click on add variable and automatically you can create a variable demo variable I'm going to name it as demo underscore variable the scope is demo one so what does demo one the first question you might have is okay what is demo one demo one is nothing but the package so by default it will be created at the package scope it is very easy to change the scope I'll show that to you later how to change the scope but by default of the variable will be created at the package scope at least the way we did it here the wave is in which you can create it in a different scope but then at this moment the way we created it it should be created by default in the package scope you can change the data types and you can see it's integer so the value is by default taken as a zero I can create a string variable and type in a stripe in a string and it's pretty simple to create a variable here okay and them on disk available now when you go back to the package Explorer you should be able to see that collapse all of this expand your variables and even be able to see the system variables come up it's all your system variable so now you see the user underscore use a demo on Disko variable so what's my variable name a variable name is demo underscore variable and you can see the namespaces all of a sudden user because that is a user-defined variable by default namespace is user now you'll be wondering that why can't I see the other system variables in my variables window it's very easy to show that also right click on it go to variables and you have to change a few options for that you can select the grid options here and you have to you have to select show system variables so select show system variables by default hidden this unchecked so show system variables and should be able to see all your system variables also by default these are all my system variables all right so it's by defined you can see that you cannot edit anything in a system variables a click on this click on this you can't you can't edit anything unlike now unlike you know he's a variable where you can actually edit the values so go back and hide it back and you can see that the way you can hide the values your system variables it is greyed out you cannot you cannot change the values you can edit the you can obviously change a few other options the namespace is one of the options that I was showing you then I can actually select the column so this gives you more control as to what are columns you want to display you can actually display the namespace column click on OK and you can see by default the names faces user you can obviously change it as per your needs here but when you display system variables of course you see the namespaces also something that you cannot change unlike a user variable which you cannot which you can edit from your end system you cannot edit user you can edit absolutely everything here all right there is something on expressions it's a more detailed description detail this little discussion which I'll have later in expressions are basically one way through which you can control you can control values at runtime if they can make things dynamic so the same way that I'm entering the fact entering the value as hello if you want to make it much more dynamic even you will configure that through an expression so I'm going to talk more about that later but just wanted to give you give you all a very very basic at a very high level flavor of what an expression is and how do you setup an expression you see the three ellipsis sign here just click on it and it brings up the expression builder ok very simple very easy interface very easy to create it you see there set of functions that you have string functions mathematical functions a lot of functions of a lot of the functions are very similar to what you have in sequence server if you're familiar if you have you're comfortable with tranzact's equal to d SQL is a micro sequencer or dialect of sequel i should see that most of the functions are very similar and straightforward and face him in a sequel alright so we talked more about this later you can form your expressions and you can even dynamically define expressions through this instead of hard-coding values in a static man on like this and that's how you define variables and and the final thing on moving scopes you can actually go to variables and you can actually move the scope of a variable you can actually come here and you can move the scope you can click on this and say select new scope and what you can do is by default at this moment I define the scope as demo it can actually come here and it can set the scope as data for filing tasks okay okay you can see if you just remember that okay the next question that you might have is why am I seeing only four options because you can only set it to own any of the four executables that you have as part of your package so a package consists of only four executables in control flow right and when we talk about executables we're talking about control flow let's say you want to transfer the scope of this to data finding tasks click on OK and now all of a sudden you will see this magically vanishes you might might might refer another question is that ok how come how come will vanish is no I change the scope right but that should still be visible now I had it in global scope I made it in local scope but it should be visible and to do that go back to the grid options and say show variables of all scopes right now it is only showing it further for the highest go so show variables of all scope click on OK and you should be able to see this now if you go back uncheck it back you're not seeing this yet but when you go back and click on data profiling tasks and now when you click on variables and click on variables now right so make sure that you and you select the option of city of profiling tasks ok so it is inside this that you can actually go ahead and see the variable right so it's very very important to understand how to set that property okay go back to variables and say show variables of all scopes and now you'll see that the scope is set potata profiling tasks right so it's very simple to change scopes and this is now importance option you should select so don't get panicked if you if you don't see the variables which you have defined in a different scope and this is very similar to what what you mean by local scope and global scope so what it means is you have the demo variable it's it's visible only within this particular task okay so within this particular task you can access that variable so outside this task demo underscore variable has no meaning has no significance all right the concept is very very similar to local variable and global variable all right so that's pretty much it as far as a basic description that I wanted to give you and finally the final section that I wanted to cover as part of an overview is connection manager's connection managers in short is a way through which sequel server integration services that is 3d tools of course if you use to build SSIS packages you use it to connect to the outside world so as part of ETL obviously you need to connect two different sources you need to connect two different destination and connection managers are your way to do what so canals you can connect to different kinds of sources you can connect to oil EDB sources databases Excel sources flat file sources all the different examples that we took in the very first two sessions if you remember are achieved through connection manager so in short connection managers are your window to the outside world that is how you connect to the external sources be the source from where you're taking your data orbiter destination where you're writing or very loading your data so connection managers are your window so be the window from which things come in the window through which things go out so this is the this is like a kitchen this is the place where you're performing your activities right see when you go to data flow and you'll be wondering okay huh I already have something called an only DB source okay and again we'll talk more about this in detail if we have we won't be a touch upon this at this moment but when you say I have an only DB source of what's happening behind the scenes is you've seen in the connection managers it's like a kitchen this is where you build the high-level components but the actual extraction and the actual loading the e and the L part happen in the connection manager so even if you double-click on this you see this asks you to create what we call a connection manager okay so the actual extraction that is the source and the actual loading that is a destination happened to your connection managers so it's a very very important part of your package and obviously you can define your connection managers at the package level so when you do it here you're defining your connection managers at the package then you can right click on entering you can see these are all the different options that you get so first of you get some of the more popular options then you can prove the new connection and you can see a whole host of other options that come up all the different kinds of connection managers that you can find here all right so these are connection managers at a package level similarly you can create connection managers at a project level so obviously the difference is what is the difference in the package level and a project level so obviously the package levels the same width same thing that they've been discussing just to give an example here so only DB connection this is one of the most common kinds of connection managers in use today are used to connect to relational databases so if for our most of our discussion here we will be talking primarily about sequel server and obviously we be using only DB connection manager to connect to a sequel server database so I'll go ahead and set up an only TB can I right now so click on new and it asks you for the server name so myself a name is dot or localhost so it's just a reason why because we have configured sequel server in my local system in case you're trying to connect to an external sequel server make sure you give the proper server name of the IP address so I'm using the proper authentication method use Windows just I to your windows login or sequence of authentication which is essentially it depends on how you set your authentication with the server level all right it's very easy to set these options also and just open up my sequel server I'll show you this real quick it's not an administration course but just wanted to just wanted to show this to you how we can set this obviously is something that you said at a relation level while you're performing installation but in case you have made some mistakes and you want to correct those mistakes rectify your mistakes you can actually go ahead and set some of these things in the property section you can go to the server right click on server go to the properties and this is the place where you can set your security ok go to the security and you should see what is the authentication mode that you selected so by default it is sequel server and windows indication mode we we also call it the mixed mode host indication where you can authenticate through both means ok so and in case you you forgot your sequel server sequence of authentication you can obviously reset your passwords and other things here alright so that's how you do it now windows indication and you test your connection connection succeeded and you can select a database name so I can select it which it works 2014 DW 2014 this is the database that I want to connect to ok so through connection manager you are connecting to a particular database in your sequel server click on OK click on OK and you can see that the connection manager is created same way that you can rename other objects here you can go ahead and rename your connection manager so let me call it corn manager underscore oledb very simple maybe can configure a connection manager all right now very interestingly what what happens here is when when I go back to the other package demo too you can see that that connection manager obviously is non-physical none of this is visible because as I mentioned to you just a while back connection managers are excuses further what you're doing here is exclusive for a particular package so when you create a connection manager here and that connection manager exists only in your package it is not applicable across different packages right but but think of a scenario where let's say you want to connect to adventureworks DW 2014 that particular data warehouse that I just know connected to and you don't want to create the same connection across different packages let's say even in this package you have the same requirements you have to connect to that particular database so you don't want to go through the same process you don't want to go to new or you to be connection and set this whole thing up again from scratch so what you want to do in such scenarios is create what we call a project level connection manager okay so that's why you have two options this is the package level connection manager and this is what we have the project level connection manager and obviously we understand that ena in a particular package you can have multiple in a particular project you can have multiple packages hence if you create a project level connection manager you can essentially share your connection manager across multiple packages okay so a very easy way of doing it and given in this case we have already created the connection manager a very easy way to create a project connection is just right click on it and say convert the project connection you can either do this or you can you can create a project level connection manager and the way to do that is again just right click on a new connection oledb I'm sorry I'm just creating ODBC here oledb click on new the same kind of option that we just saw before and you can see that all of a sudden I have a project level connection manager set up here okay this is in addition to the package level connection manager and how do you differentiate you can see that for project level connection manager you see I right you see a parent thesis project prefixed with the name you can change the name just to make it a little more readable I'll call it project conman right that's your project nation manager and you can see there's a dart con manager extension don't mess with the extension because it really messes things up otherwise so this is your package level Commission managers in local only with a package and this is your project level connection manager which is which is applicable across all the packages in this project so if I go back to demo 2 now you can see that demo 2 will now see that particular connection manager which I created make sense it will not see the other connection manager it will not see the the other connection manager but it will see the project level connection manager which I created you can see project con manage your project Korn manager which is the project level condition manager ok you can convert it back to a package connection or in the other case first case you can convert this back to a project connection so let's say this is a this is our package level connection manager look on only to the package you can actually connect convert this to a project connection it's very simple and I can see these are both project connection managers and so automatically added up here go to demo to even see both these options pretty simple pretty intuitive and that's that's what I want to discuss about connection managers at this stage all right so just sharing my screen this is the core excited that we have here click on attach a box 2014 databases and you see the official state of 2014 either the full database backup or the adventureworks 2014 date warehouse tripped ok we already explained in detail what these two things mean script is the taught sequel file ethics that you take security management studio and the backup is what you have to restore I'm going to go to the data flow tab and as you can see this is the demo to package that I've taken so it's it's an empty package reading in demo 1 so that is something that that we leave untouched as it is and you can oh we could have proceeded on with the next one that is demo - this is the empty package that we had so I can go ahead and add a data flow task also using this option as you can see they if there is no data flow task that is created and if you go to data flow tab it will give you this kind of an option where it says no other products have been added you can just click here to directly add the task ok now first up data flow there are components there are three kinds of components a component can either be a source component can be a transformation or a component can be a destination right so resources are all listed out here destinations here and transformations all over here right first up let me talk a little bit about the different kinds of sources that you will have and obviously there are there are a lot of different sources that you can see here just to just to quickly brief you to the sources oledb stands for object linking embedding database oil ADB is the most common kind of source that we will encounter in our discussions because OD DB is used to connect to relational databases think of all relational databases sequence or Oracle and if you have to connect to a relational database oledb source is what you will be using okay ODB c is open database connectivity so how do you how do you set up your ODBC so you can go to actually go to data sources just go to your Start menu type in data sources and see data sources ODBC and you click on this you should be able to see that these are the different drivers that you can set up ok this is where you can get your drivers set up and and and eventually connect to your ODBC source using this in certain configure in certain scenario specialized scenarios to be using this flat file sources are typically used to connect to text files you know which are in notepad you can give an example of that Excel obviously is used to connect to excel files ad or dot that is also another way in which you can connect to relational databases so that is also a secondary way in which you can connect relational databases so between AD or dotnet and oledb the most preferred way of connecting is only TP because it is much much faster compared to 88 rocket row file source is a proprietary SSIS format so we cannot look at raw file and their use cases also so it's appropriate or essas format and it's not a file that you can create in your windows file system it is a file that that and only SSIS only only the package can create it's internal to the engine and it's nothing but it's nothing but you know a dump of your memory buffers ok now I'll talk a little bit about buffers when I talk about Rafael source but just remember that it is proprietary crisis is and finally XML source if you have an XML file that is what is used to connect to ok so real quick let's let's go back to a discussion let's go back to demonstrate how to create an oily debe source so simply go ahead and dragon and and the other thing that I wanted to touch upon and before I start off with how to connect to a source who can right-click on this and you can create a connection manager let's say I can create a new connection manager and as you can see I'm creating a new only DB connection manager here there are different kind of providers by the edges to also flag this to you depending on what you want to connect to if you want to connect an Oracle you might want to select a different option here okay so here it's a sequence of a Native Client by default I'm gonna go ahead and select that itself adventureworks click on OK click on ok and that's my that's my connection manager has created ok now go ahead and rename it you already mr. stands for connection manager okay now what I want to go ahead and do now is I'm going to take my source connection managers are your window to the outside world right so sources the destinations are they just the components so the source will use the connection manager so ultimately when you take an oil EDP source behind the scenes at the underlying layer your only DB source is still using the connection manager right so how do you how do you map this to this that is what you do inside this particular window again as just as we did in control flow is very simple drag and drop just drag and rough from the tool box and place it under this item on the design interface of data flow so how do you do that you can right click and edit it either you can right click edit or you can double click on it to configure its properties and you can see is the only DB source editor and the first and most important option that you have to choose select here and and configure here is the only DB connection manager as you can see there's a connection manager right so by default this is already set to this option the to the connection manager I already created you can see OD DB con man there's a condition I've already created so by default it will be selected ok so because only DB source at the underlying layer requires and only DB connection manager to connect to the source to configure right so it first thing it will do it will try to search within that particular package what all available connection managers I have and not only connection managers I you need to search for all what all compatible connection managers you have so for if it's an only DB source you're always looking at an OD DB connection manager so it does find a match so this was just a connection manager he created a while back and you can see it's automatically selected so there will be cases and then they'll often be cases where you may not want to do it in this way you may want to directly so let's say in this kind of a scenario if you go to remove this record and delete it I can see there's no more connection manager now if we go back and double click on it you can see this is blank so in that case obviously it cannot find any connection manager so you have to create a new connection manager when you click on new here if you open up the same dialog box that you got when you when you when you clicked on new only DB connection manager from there ok it's the same interface that comes up now it's very intuitive it's very selfish self fix Melvin isn't it what I'm trying to get it so click on new and it's the same process that you're following again so click on adventureworks to select adventureworks detail to 2014 click on ok and that's it that's your connection manager that's again created and let's automatically select it here ok now data access mode connection manager is just a way for you to connect to the underlying database right so you not only connect the connector to the server you not only selected the server but you also selected the underlying database so in this case this connection manager is nothing but a connection that points to the adventureworks t2 2014 now what do I do now just think about it from the context of sequel server now when you open up sequel server management studio and you click on new query ultimately what you do is and one of the one of the most important things that people will tell you as a chick is select this particular drop-down choose what database you want to run your queries against right if I go back and say select star from let's say team product here if I select this it will throw an error you can see that as intelligence is going to complain invalid object and the reason is because I have the incorrect database selected so you have to make sure that you have the right database selected for you to run the right query and that's exactly what what we are doing here also so ultimately it's the only DB source you have to make sure that so so you're pointed to the database you're not select at the right database but then what what is it that you want to do you want to write a query you want to write a sequel command against that all you want to access the tables there or do you want to access the tables or views through a variable or you want to access the sequel come on through a variable that's what the data access mode property stand for and just to go step by step just to come to the sequel command as I'm demonstrating here so you can see why we want to write a very simple query like this next our phantom product ok how do you do that I mean when you execute this query what happens you get a result set of all the rows and columns of dim product right select start so it will rate down rate or everything indium product this is what you will be expecting right and you can do the similar thing here also you can go to sequel come on text so what you're doing is in the source what are you doing right and again I don't want to confuse you here I hope you all of Europe or all of you are with me on this connection manager is just a way for you to connect to the database so using connection manager all you did was you connected to the underlying database right it is using the source that you're you're specifying exactly what are you right because connection manager is just this connection is just the connection that you read connection managed it is like saying I'm just selecting from the drop-down that's all that connection manager is it gives you the gateway to the database now what do you do from that you want to fetch the tables one by one from here then you select a different data access mode you want to write a sequel command on the database then you select a different access mode okay so let me go ahead and copy this statement from here the next random product and let me go ahead and paste it so you can either do that or you can it's very easy to go ahead and build a query so when you go to build query it opens up a very very simp simple query builder interface so in in management studio also you get a very similar kind of interface here when you when you say design query any editor writes if this is the this is where you write type in a sequel command and if you want to know more graphical UI you can actually say design query in editor it opens up a very very similar interface in SSIS also this is the query designer interface that you have that features now showed you right so come back to SSIS it's the same query builder interface the names are different query designer is query builder right so you can right click on this add a table then you want to add DIMM product ad and what you want to do is you want to select all the columns right you can see dim product star or you can just go back and say select star means the same thing right click on OK and as your query builder simple enough it by default star would mean it will select all the columns by default okay so it doesn't look pretty good so what I'll go ahead and do is I'll simply go ahead and just say stop so go to preview and preview will give you what that particular query will return as a result set so preview is just like executing that query in management studio and getting the result set for you so that's what a sequel command it's very simple and very very very easy to understand this isn't it now the next way you can access is to a table of you and as I mentioned instead of writing a sequel command you can basically select the tables of views from your database so remember connection magnitude connection manager you have chosen the database of your choice - connection manager you have chosen that you you want to be working you want to work with adventureworks DW 2014 you've chosen that now using table of view you can choose the underlying tables of views of that particular database so if you click on this drop down even get all the tables of users who have particular database so I can actually go back and select name product and it's been that it is basically doing a select start so when you when you select table of you access more than when you say the name of table of us name product it's essentially what it means is it's retrieving everything from that particular tables it's just like saying selects random product so click on preview and is the same result that we got last time right so this is what table of view means the other option is table of view names row variable and this is a very very interesting option to demonstrate this I'm going to go ahead and create a variable then right click on this anything create a variable I'm going to go ahead and call it demo table string variable and I'm going to call this let's say team customer so the only thing the only different thing that we are doing here as you can just just to just to highlight this again so what we are doing here is the amount of code table that's available they arrive a string and you can see the value is customer right I'm just going to go in and check my scope once you can see sales package one I'm just going to go ahead and check my scope once all right now what we need to do now is so all this while what we were doing is we only table of you you could use the tables from here it's just like doing a select start and ultimately written results it but here what what we gonna do is we done is we may choose that table or view name from a variable I say sort of instead of choosing my table of view from the from the drop-down I'm going to make pepper I'm gonna make the process dynamic and I'm going to choose the table of you name through available choose the variable that's a variable that we created demo underscore table click on preview and you should be able to see that the results from tim-tim customer are coming up right so that same customer so if you want to change that so just click on OK I'm not gonna not gonna modify anything let's go to variables and let's say you want to change it something else that you want to make a dim currency just edit the value here edit the value and you don't want to take any changes here so demo table internally deficit in currency and now what you will do is if you could go to preview you can see that now it's automatically picking it up on dream currency ok so that's what that's what that's what a difference between variable name means any other option the final of sequel command as we have discussed already and and one of the advantages of sequel come on office Lee it's the most preferred way of exit data accesses right so it's it's something that you will do all the time in production in production in determines you'll typically never select tables from the drop-down so you'll almost always you see will come on there's one other one of the major most important reasons is performance to the performance standpoint because if you don't use sequin command then by default if you've selected from the drop-down by default you're doing a select star a sequin command is a way for you to a limit your columns and B you can use multiple tables so you can see that here you cannot use multiple tables so in table of you can only select a single table at a time but in case there could be complex scenarios where let's say you want to build a query where you want to take at a table you want to take my factor in trade sales anyone also take a dim product you want to build a slightly more complex query and you want to pull in the color from here and you want to pull in the sales amount from here and you can see I'm clearly doing a join and so you want to you want also pull in the English product name from here click on OK and you can see that in these kind of cases you have to use a sequel command and sequin command is gives you much more flexibility over the kind of query that you want to write and the kind of results that you want to that you want to include in your particular source because because what is the source think about it from this point of view only DB source as I mentioned initially is is a way for you to retrieve data from a normal relational database right so when I said retrieve data ultimately what you're doing is you're creating your database and you're getting a result so if that's what you're doing at the end of the day so it's about it's about writing a sequel statement sequel command either way so whether you do a drop down from here whether you are choosing a table from here or you're writing a sequel to come on either ways you're right you're fetching of results like that's that's what getting entering your necessary is package okay so if you click on the preview here you can see that those are the three columns that I included here right those are the three columns I included it's pretty straightforward pretty intuitive what we are doing now what all this while we've been talking about data access mode which is fine so first APIs connection manager that's your window what you want to connect to next how you want to connect to it it's were a sequel command I just want to choose the tables and after that what you do is you go to columns and in columns you are defining what is it that you want to bring inside SSIS okay so this is that this is what the columns option helps you continue see those check boxes yeah so these are the three columns that you will see why do you see only three columns that because that is what you are bringing into SSIS at this moment think about it from this point of view think about it from this point of view have your database sitting here that's your database as a sequel query that you write that's a sequel query and the sequel query will obviously return results it so results at I will I will say is RS or s stands for the result set yeah now it is that result set that contains three columns I'll just denote three columns using three lines your result set contains three lines now it could so happen that you don't want to bring though all the three columns inside SSI it's no obviously it's a very rare case we never do that but I just want to tell you that you know you can still have three columns in the result set but you can still filter further here so if you can uncheck this let's say if the last moment you decide okay no one English product name we can actually uncheck here and now what you will have is you would only have these two columns entering exercises okay see this component here let me go ahead and click on OK and show you so click on OK here and you can see this this component now will bring in only two columns into SSIS the other thing I wanted to briefly touch upon and I was mentioning this very briefly when I was talking about RAW files or something called memory buffers everything in SSIS happens in memory it so it's an important concept but it's something that all of us should appreciate the fact that everything in exercise and then the fact that SSIS is is so fast you know transformations happen on the fly in general operations happen pretty fast in exercise ok and on one of the reasons is because SSIS in memory so what you're doing and when I say in memory I mean to say everything happens your your ram is utilized for this purpose random access memory ok so when you're bringing data into SSIS so your data is ultimately all these operations are performed in your RAM ok and obviously if you have a huge amount of data that you're importing and you don't have the resources your packages can actually hang up right so in this point in this scenario what what we are doing is we have we are reading in so in so many rows so and so many columns from online database and where is that going where is the data going that data is actually entering into the RAM so when I say entering into SSIS ioan directly mean to say that data is entering into your M and it is from there that you can do other activities let's say haven't talked about transformation so far but then you can actually connect these arrows not please and please note that these arrows are different from the arrows I showed you in control flow again they haven't spoken deeply about precedence constraints now what am I trying to say so the arrows that you see in control flow this is what we call a precedence constraint right so this is the precedence constraint the arrows that you see here are data flow paths these are data flows it's just just click on this and you will see that this is what we call a data flow path right this make sense the dataflow path this is what we call a data flow path in in other words in other words and I think this is just to accentuate a discussion just to accentuate the concept of memory buffers even further so what it means is that you threw only DB source you're bringing your data into memory you're bringing into your data into SSIS and all that you perform in dataflow everything you perform and the name is very synonymous as a dataflow so so all that you define here is how your data is flowing to the memory buffers and what are you doing in the process so at the source level you're breaking your data into the memory buffers and finally after all the transformations that you will do you're going to load your data to a particular filing through the destination so all the operations that you perform in the memory buffers and all this while you're transferring data from one stage to the other so first this then this then this then this and so and so forth that's what the arrow signifies arrow signifies how your data is flowing it's just like it just like a pipeline and and water is going to the pipe only in this case data is going to the pipe that's what the data flow that is for this particular arrow means a data flow path and don't confuse this with control flow at the green arrow which stands for precedence constraints what this means is this is just a data flow path this is how data is flowing between your source and your different transformations and the destinations of course okay just to also revise this again so this is the sequel command that we selected and we selected the results at which countries of three columns and what we did here was we uncheck this third column and the only including two columns in our underlying data flow and hereby henceforth Aldous I'll refer to this as data flow so in the underlying data flow I am including only two columns and that is what is going to enter my memory buffers now a great way to analyze what is getting into your memory buffers and and what you are actually reading is to double click on the data flow path so you can actually actually click on this double click on this or you can right click edit whichever way you want to do it actually write double click on it and you will be able to you'll be taken to the data flow path editor okay you can go to general section will give you different things about the name you can of course edit your names you can give it as it is but the most important section here is a the metadata metadata will give you an idea as to what is actually flowing through the data flow path and here you will see that there are only two columns that are flowing to the data flow path one is color and the other is sales mode the only two columns that are flowing to the data flow path and it's very and that's understandable isn't it because if you remember I had three results eight columns initially and I and I only included two in my data flow only two came in so it's like saying I was reading three columns from my source but then I have filter to column I filtered one column in my source I'm only exiting so only two columns are out put it out so those that are two columns that you can see here it's very intuitive isn't it so you can just click on it and you'll see that okay I'm reading in only two columns color and sales amount here right those are the only two columns that are allowed and those are the only two columns that I can see are flowing through the data flow path now one very very important thing and before I go any further I just wanted to have a quick discussion on the data types here yeah so one thing you will observe is DT and risk ocy DT underscore WS t are very interesting data type names so typically these are not sequence our data types and these are what we call integration services data types these are SSIS specific data types where it will give you a very good mapping and understanding of the different kind of data types it's a great resource which will give you an understanding of different cut stereotypes and and these are these are all integration services data types you can see DT bool DT by DT see why it is the currency value that we saw you can see in pieces are represented differently so integers are we all understand integers a sequel server int int long int but it's different here in education services you can see this DTM is called eye to eye for eye 8 so it is different for by signed integer is different 8 byte is different you can see float sorry president R 4 R 8 STR S stands for see if it's if it's a Unicode it is represented by wstr okay so what is a Unicode Unicode is a very special kind of character set so your understand Unicode is a very special kind of characteristic non unicode is a standard character set unicode is a special kind of character set which has just a more more number of characters that you can be present there so things like Chinese characters Smiley's do a lot of different characters you can represent other than the standard character set that we have so that's what Unicode means it's just much more options that you get and and if you want to represent your data type as a Unicode this is the wstr is what we do so take you valent of this in sequence server if you understand where cats if i take bad cat val care is going to be very similar to what we call in STR your DTN is cor STR is nothing but a Valkyr but your TT in is called wstr as a unicode is nothing but an inverter so in sequence over when we represent a unicode vp prefix and in n as in in n with that alright so you have a very nice table here through which is demonstrated what the what the mapping should be in case you want to refer to this is very very useful way in wishing under kerosene a few of the database products or DBMS products that are also referred to in this particular table CDT STR is corresponding sequel server mappings are care and bad care if we go down ttw STR or n care n bad care and ECT TN is called I 2 is small and I for is in die it is begin and so on so forth it's a great the very very helpful table that you have you now why is this relevant why is this table relevant and why why why am I even this to you the reason is because the first thing the first question that might come to your mind is how did this conversion happen how did this mapping happen because all that I showed you here is I went with my I women I went here I went to metadata and I see the three types are coming as this but how did this conversion happen because if you if you remember when I go to oil any resource the way I am reading these columns and another way to see the column data types is just to hover over it just hover over it to see what the column red drapes are either you can hover over it here or you can hover over it here okay right you can see it is reading it as a DT underscore wstr you can see it is reading this as a DPN let's go see why but is that how it is stored in your underlying database is that how it is stored answer is no if you look at if you look at these two columns color and sales amount and if I come back to my underlying sequel server if you come back here and if you go to DT and a scope currency and if you look at color and say this amount right so color is going to be from product and if I go to product color you can see color is actually stored as or anywhere can 15 right color a color is an in vehicle 15 and if you look at sales amount sales amount is from fact internet sales columns expand columns and if you look at sales among sales amount is actually a money later type in sequel server is some money here it right so so again it is using the same table its internal ESS is also using these are the same table and it is able to convert this correctly you can see 30 tienes Co cui is the corresponding mapping for money so and internal ESS is using uses same table for converting between the corresponding sequel server ready types into their society types okay because because that's how you can store your data inside of exercise great way to look at what data is flowing to digital cause we're going to use this much much more extensively going forward but I just wanted to stress on this data flow path you can give an intercept your data as and when it is flowing and you can look at exactly what kind of data and what are three types of the data instead of flowing and there's now the very very interesting thing you can do you can actually use what we call a data viewer okay that's the third option you can enable a data viewer to look at and it is release it click on enable data viewer you can actually see so here you can see what columns of flowing is more like the metadata you can analyze what all columns are flowing what are their types and other things but in data viewer you can actually see your data you can actually see those two columns you know the either opt to display both those columns or you can just bring something back and just you can even play around with what you want to display so color than sales amount is all that we have you can click on OK and you can you see a small microscope sign that comes up when you see when you say enable data viewer ok now I have used data conversion here data conversion is not yeah I'm not really configured anything potato conversion so don't worry too much about it so it's not performing any function rather than just being a placeholder there right so all I have done is I have red X number of rolls and a 3 X number of columns from my source I've really I'm reading some data from my source and I'm just transferring it on to the next component and in them in in between in between I am intercepting that data and I'm trying to find out what data is flowing through so I'm intercepting the data through the data flow path I have enabled a data viewer and I'm trying to see what data is flowing through that particular path ok so how do I execute this package so we haven't talked about execution yet but the a simple way to execute your package is just right click on it and say execute package so you can either do it like this or you can click on start here so it's a little different slight difference in both these options but it's simply a way to do it at this stage just go ahead and right click and say execute package and when you do that even see that it takes some time but you'll eventually execute and you will see that what happens is it's reading some data ok that's the data flow data flow task that stretch trigger ticks the only TP source it's starting to read data I can see nine five five seven rows and you can immediately see the date of your escape WC what what is this happening what is this you may want to disable your data viewer and just show you just wanted to disable it if you wanna show you real quick let me go ahead and disable my data viewer and show you if I deserve eliminate a viewer what will happen is you see PC 60,000 398 was a transferred right instantaneous is 60,000 398 rows are transformed so what are the green green tick mark stand for a success so this means that this this component was a success I was successfully able to read data from my source and and this component stands for a success so essentially as I said data conversion and not performing anything just go back and double click on it you'll see it as it's blank I have not conferred I'm not configure any of the properties here right haven't done it yet but all I'm doing here is I'm transferring these roles here no data view at the moment you introduce that and the other thing I wanted to show you real quick and before I go on to date of your demonstration so go to progress and progress tab actually gives you an idea of the progress of your package you can actually look at when it started a little more detail on the log so especially when you get warnings or any kind of errors you'll typically look at the progress bar for any kind of hints on what kind of errors occurred or what kind of warning so here you can see a few warning signs come up and a good way to copy a good way to copy or paste these columns look at the errors in much more detail it's just to copy the message here ok the copy message takes or hit ctrl C on the keyboard bring up on notepad or any kind of text editor paste it and you can analyze this in more detail what exactly saying because the output column sales amount is not subsequent to using data flow task removing this unused column can increase data flow graphs performance so you can see that I am NOT using sales amount anywhere so what it's complaining is that so what it's complaining is I have used two columns in my in my data flow I am reading two columns I'm reading two corners but I'm not using them in any way all I am doing is and I'm just just just imported that have not done any operations with that so that's why is giving a warning saying that I'm not using these columns it's not subsequently used in the dairy products you know the the color column is not subsequently used in the difference just a warning message that is giving me okay so that's what the progress bar is for oh and how do you stop the execution and the other important thing that you will observe here is when you enter into the execution window you will see that a few things change first thing it brings up the breakpoints window the watch window comes up the locals window comes up okay there are different utilities of these windows anyone who has done programming before you will understand we understand in the visual studio environment what these things mean we're gonna talk about a few of these things more in control flow but you could see the inverter environment in general changes so in previously we were in the design environment but now we have entered into the debugging environment this is what we call the debugging environment or the execution environment okay you cannot perform any kind of package designing here you can see the toolbox is now hidden okay woman we are still executing we are still executing how do you know that the reason is because these windows come up the tool box is hidden and you can see the stop button comes up here you can see the tooltips a stop debugging okay so this is how you exit from the debugging mode and you go back to the design mode so when you click on stop debugging you'll see again the toolbox comes back and everything comes back to normal right so this is how you can execute a simple package just to go back to our demonstration of data viewers again double click on it enable data viewer click on OK and what happens here is when again when you go back to right-click and execute the package what happens here is you can see that it is still running cuuu achieve a natural break point you'll again see the break points window and in breakpoints window something very interesting happens you will see a breakpoint has been triggered at the data viewer why does that happen because the data viewer acts as a natural break point so what you are essentially doing is you have intercepted your data you want to intercept your data in between and you want to look at your data that's exactly what you are doing you have intercepted written here and you're looking at your data that's what you are performing here and until that time you want to pause the execution of the program that's what break points mean in short it's the natural paucity execution of the program all right how do you go back how do you grab the resuming execution you can see rolls displayed on 9 5 5 7 is that the current number of rows displayed if you go to go back to the next the next buffer and just click on it and he'll go to the next set of rows you can see I gained nine five five seven I displayed and total so many rows uploaded again you can go to the next you can see again 9 5 5 7 display total so many upload click on next click on next and you continue clicking on next until unless you reach the desired limit and if you want to close it all together you can you just go ahead it's how you can copy your data and copy your data patient next set it and analyze it more or the simple way to exit from the rid of you just go and close it ok and you execute everything so it just - just a way for you to close a breakpoint and you know a resume with a normal execution so breakpoint and did of your data via so just a way for you to get an understanding of whether what you perform did get a basic kind of debugging of let's say some of the operations you performed are working as per you expected you look at a few demos going forward and I think that will keep lend you more clarity on the exact use case of data viewers but as I mentioned it's very very useful from a debugging standpoint let's say you're reading some data into SSIS and you just want to before you perform anything else so what I'm doing here is all I have done is I've read so in source so many so many rows into SSIS and I've done nothing I've literally done nothing I'm just used as OD TP source so far but even before I proceed to with other activities I just want to be sure that the data is coming exactly the way I expected it so I just want to look at the first 10 15 rows or let's say the first 100 200 rows just glance through the latest result set and just get just be sure that that's exactly the way I expected it to be and that's all that it of yours are for from a very very debugging standpoint
Info
Channel: Intellipaat
Views: 136,470
Rating: undefined out of 5
Keywords: ssis tutorial, ssis tutorial for beginners, learn ssis, SQL Server Integration Services, msbi tutorials for beginners, msbi training videos, ssis package, msbi tutorial videos, sql server ssis, business intelligence tutorial, ssis best practices, ssis package tutorial, msbi demo, ssis interview questions, ssis transformations, ssis, msbi, bi, Intellipaat, sql server integration services, MSBI Intellipaat, ssis package deployment, intellipaat msbi, ssis demo
Id: Ugbadoa8nq0
Channel Id: undefined
Length: 97min 28sec (5848 seconds)
Published: Mon Jul 08 2019
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.