Master Azure Databricks CI/CD in 2 Hours with Azure DevOps | Full End-to-End CI/CD Project in Azure

Captions
In this video we are going to look at CI/CD in Databricks. This will be a step-by-step walkthrough of everything needed to build a complete end-to-end CI/CD pipeline for Azure Databricks. We'll start by understanding what exactly CI/CD is with an example, then go through the entire CI/CD workflow, and after that move on to the actual implementation of the pipeline itself. By the end of this video you'll have a complete idea of what CI/CD is and how to configure a pipeline in Azure DevOps that deploys changes instantly from one environment to another. So without wasting any more time, let's get started.

I have received many messages and comments asking for a CI/CD video on Databricks, and that's the reason I'm making this one, so please do watch it until the end and give it a like if you find it useful.

For this CI/CD demo I'm going to use the Databricks workspace from the complete end-to-end data engineering project. As you can see, we have the notebooks bronze to silver, silver to gold, and storage mount that we created for that project, and we can reuse the same workspace for our CI/CD pipeline. The reason for reusing it is that we have already seen how the data engineering is done in that end-to-end project, and this CI/CD setup is an extension of it. If you haven't watched that end-to-end project video, I would highly recommend it, since it covers concepts and resources that are really important to know to become an Azure data engineer.

Before defining CI/CD, let's first see how we are going to implement it for Azure Databricks with an example. Consider a dev resource group containing an Azure Databricks workspace, and in that workspace a set of notebooks created for data engineering tasks. All of this lives in the dev environment, meaning the dev resource group. The main idea of a CI/CD pipeline is to deploy code from one environment to another; the other environment here can be a production environment. Different companies use different numbers and types of environments: one company might have three, such as dev, UAT, and prod, or dev, QA, and prod, while another might have only two, dev and prod, which is the standard most companies follow. So let's go with two environments for this demo: dev and prod.

Similar to the dev environment, prod will have all of the same resources created. The only difference is that we will not use the prod resources for any development work. In other words, we create notebooks and write code only in the dev environment, and we deploy all code changes from dev to prod via a CI/CD pipeline. That's the main idea.

For the CI/CD process, the most important piece is the Git repository. There are different repository providers available, such as GitHub, Azure DevOps, and several others; in this tutorial we'll use an Azure DevOps repository. In simple terms, the repository is where we save all our code. We'll integrate the repository with only one environment, dev: only the Databricks workspace in the dev resource group will be connected to the repository, and there will be no connection between the prod Databricks workspace and the repository.

Now let's walk through the CI/CD process. We do all our code changes only in the dev Databricks workspace, and once the work is complete we commit the changes to the Git repository, which means we are saving our code to it. As soon as the changes are committed, the CI/CD pipeline gets triggered: it picks up the latest changes from the dev environment and deploys the latest code to the production environment. That is the complete CI/CD process. The part where we integrate the dev changes into the repository is called continuous integration, and the part where we deploy the latest code from dev to prod is called continuous deployment; together this is CI/CD, continuous integration and continuous deployment.

So what is CI/CD? It is a set of practices used to automate and streamline the process of building, testing, and deploying code changes to different environments. I hope you now have a clear understanding of what CI/CD is.
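To make this concrete, here is a minimal sketch of the shape the finished Azure Pipelines definition will take: a trigger on the main branch, a CI stage that runs against dev, and a CD stage that deploys to prod. All names here (the stage names, the echo placeholders) are mine, not the file we build later in the video; treat this purely as a preview of the structure.

```yaml
# Conceptual shape of the CI/CD pipeline we will build (placeholder names).
trigger:
- main                               # any merge to main kicks off the pipeline

stages:
- stage: ContinuousIntegration       # pick up the latest code from the main branch
  jobs:
  - job: CI
    steps:
    - script: echo "copy the latest notebooks from main into the dev workspace"

- stage: ContinuousDeployment        # push that code to the prod workspace
  dependsOn: ContinuousIntegration   # CD only runs after CI succeeds
  jobs:
  - job: CD
    steps:
    - script: echo "deploy the same notebooks to the prod workspace"
```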
Next, let's discuss the merging technique we are going to use in the Databricks CI/CD process; merging techniques are also called branching techniques. Different companies follow different approaches, so let's take one method and walk through it with an example.

Consider two data engineers, data engineer 1 and data engineer 2, working collaboratively in one Databricks workspace in the dev environment (remember, all development work happens only in dev). As discussed, the first and most important step for CI/CD is integrating the Databricks workspace with a Git repository, and we'll use an Azure DevOps repository for this. While doing this integration we create a main branch in the repository. This main branch is really important, because the CI/CD pipeline we are going to create will trigger only when a change happens on the main branch: if a data engineer merges a change into main, that acts as the triggering point for the pipeline, which then picks up the latest changes from dev and deploys them to prod.

In terms of the branching technique, we are going to protect the main branch. Protection means that no data engineer can change the main branch directly; they cannot update any code on it. If data engineer 1 needs to make a change, he must create a new branch based on main, typically a feature branch, which starts as an exact copy of main, and make all his changes only on that feature branch. Similarly, if data engineer 2 needs to make changes, he creates his own feature branch and works there. If someone tries to change the main branch directly, they should get an error. This is called branch protection, and we protect the main branch as part of the merging technique we follow in the CI/CD process.

Now, how do these engineers get their changes into main? Say data engineer 1 has finished his work; he commits his changes to his feature branch, and data engineer 2 does the same on his. Now we're at the most important part of the CI/CD process: both engineers have completed their respective work, so whose changes should be merged into main first? The answer depends on the priority between the two changes: whoever's changes should be deployed to prod first should merge into main first. The data engineers and the operations team should discuss and agree on these priorities, because as soon as changes are merged into main, the CI/CD pipeline will trigger and deploy them to the prod environment; that's why discussing priorities is very important in the CI/CD process.

Suppose they decide data engineer 1 merges first. He creates a pull request to merge his changes from the feature-1 branch into main. Once the pull request is approved, all his changes are merged and the main branch gets updated with the latest changes. The CI/CD pipeline then triggers, picks up the latest changes from the dev environment, and deploys the code to the prod environment. Once the pipeline finishes, data engineer 2 follows the same steps, creating another pull request to merge his changes from the feature-2 branch into main; after merging, the pipeline triggers again and deploys the latest changes to prod. This is the merging technique we'll use in this CI/CD process.

I think you now have a clear understanding of what CI/CD is and of the merging technique we'll use in the pipeline. In the next section we'll set up the repository and the rest of the environment, including branch protection and permissions; after that we'll create the continuous integration pipeline, then the continuous deployment pipeline, and finally we'll test the complete CI/CD pipeline. That's the agenda for this demo.
Okay, I'm in the Azure portal now, inside the resource group RG data engineering project. This is the resource group we used for the complete end-to-end data engineering project. From this resource group I have opened the Azure Databricks workspace in another tab, and as you can see, these are the notebooks we created for the end-to-end project.

Alongside this setup I have another tab open, where I have created a second resource group named RG data engineering project prod, with all the same resources as the first one. I've opened its Databricks workspace, DBW MrK demo 01 prod, in another tab as well. This prod workspace has no notebooks created; it's brand new. In the dev workspace we have the three notebooks under the workspace directory, inside the Shared folder, but if you go to the prod workspace and click on the Shared folder, there are no notebooks there. We'll build a CI/CD pipeline that deploys the notebooks automatically from the dev workspace to this prod workspace; that's the idea. So treat the first resource group as the dev environment and the prod resource group as the production environment.

As discussed in the last section, we'll integrate the repository only with the dev environment; no prod resource will be connected to it. So I'll close the tabs related to the prod environment, since I only opened them to show you the prod setup.

Now we can begin integrating the Azure DevOps repository with the dev Databricks workspace. I have another tab open with Azure DevOps: if you type dev.azure.com followed by your organization name in the browser, it takes you to the Azure DevOps portal. Mr K Talks Tech is the organization name here, which I chose when I first signed up for Azure DevOps, and inside that organization I previously created a project called Azure tutorials. Instead of reusing that project for our CI/CD pipeline, I'll create a brand-new one, so let's click the New project button.

We need to fill in a few details. First is the project name; since we are doing CI/CD for Databricks, I'll call it Databricks CICD. Next is the description, which is optional, so let's skip it. Then we have the visibility option, public or private; let's go with private, since I'm the only user of this project. Under the advanced options, Git is already selected as the version control, which is what we need, so keep the defaults and click Create.

Now we have a project named Databricks CICD, and we need a repository inside it. On the left-hand side there's an option called Repos; let's click it. By default, creating the project also created a default repository with the same name as the project, Databricks CICD. We could use that repository for the integration with the Databricks workspace, or create a new one; I'll create a new one to show you how it's done. Click the New repository button. First we choose the repository type, Git, and then we give the repository a name; since Databricks CICD is taken, let's call it Databricks CICD tutorial. There's also a checkbox called Add a README, which simply creates a README file we can use to describe the main purpose of the repository for others, so let's keep it checked. There's a note saying the repository will be initialized with a main branch, meaning a branch named main is created along with the repository; as discussed in the previous section, main is the most important branch for the CI/CD pipeline. With all the details filled in, click Create.

So now we have a repository named Databricks CICD tutorial, with a main branch containing a readme.md file that holds some information about the repository; we can update the readme whenever we need to.

Next, back in the dev Databricks workspace: inside the workspace pane there's an option called Repos, which is what we'll use to integrate with the Azure DevOps repository. Click it, then on the right choose Add Repo. First we need to enter the URL of the Git repository, which we can get from Azure DevOps: click the Clone button on the right, copy the HTTPS URL, then go back to the Databricks workspace and paste it. As soon as you paste the URL, the other fields fill in automatically: the Git provider, Azure DevOps Services, and the repository name, Databricks CICD tutorial.

There are two more options at the bottom. The first is Advanced, which contains Sparse checkout mode; use this if you want to clone only a certain folder from the repository, in which case you enable it and specify the folder path. By default it is unchecked and the whole repository is cloned, and as part of this integration we'll go with the default and clone the whole repository.

Then there is Git credentials, an important setting that specifies how the Databricks workspace authenticates to Azure DevOps. Under Git provider, Azure DevOps Services (personal access token) is selected by default. With that option we would create a token in Azure DevOps and use it inside the Databricks workspace for authentication. In general, token-based authentication is not ideal: there are complexities like managing the token, refreshing it regularly, creating a new one when it expires, and other overheads. Instead, we can use the Azure DevOps Services (Azure Active Directory) option, which is identity-based authentication. For example, I'm currently logged in to this Databricks workspace with my Mr K Talks Tech Gmail account, and I'm logged in to Azure DevOps with the same account; since the same user has access to both the Databricks workspace and the Azure DevOps repository, we can use that account's credentials for authentication. So let's choose Azure DevOps Services (Azure Active Directory) and click Save.

We've now saved the credentials and filled in all the required details, but I haven't explained one option yet: Create repo by cloning the Git repository, which is enabled by default. When this is enabled, the integration clones a complete copy of the files in the repository into the Databricks workspace; any code in the repository is copied over. Since we just created the repository and it contains only the readme file, only a copy of that readme file will appear in the workspace after the integration. Let's click the Create repo button.

We have now successfully integrated the Azure DevOps repository with the dev Databricks workspace; you can see the repository name, Databricks CICD tutorial. Under the Repos option, if you expand the account folder, you can see the repository there too. The repository name inside the Databricks workspace looks a bit odd: it shows %20 instead of spaces, so let's rename it. Right-click the repository name, choose Rename, and remove the %20s. One thing to note: this rename happens only inside the Databricks workspace; it does not change the actual repository name in Azure DevOps. If I go back to Azure DevOps and refresh, you can see the repository name there is unchanged.

So in this Databricks repo we have one file, readme.md, and one branch, main. Remember from the last section that we need to protect the main branch, and we haven't done anything yet to protect it, which means anyone can currently commit changes directly to main. Before protecting it, let's test that by making a change directly on the main branch: I'm going to move a notebook from the Shared folder into this repo location and commit the change straight to main. Click the Shared folder; we have the three notebooks there, and let's move just the bronze to silver notebook. Click the three dots on its right, choose Move, and browse to the destination: go to Repos, then into the account folder, then the repository, Databricks CICD tutorial, where you can see the readme.md file. This is where we'll move the notebook, so click the Move button.
Now we get an alert box, which just relates to access permissions for the notebook in its new location; we can click Confirm. The bronze to silver notebook has now been moved into the repo location. One thing to note: so far we've only made this change inside the Databricks workspace. Even though we moved the notebook into the main branch's folder in the workspace, that doesn't mean the actual Azure DevOps repository has changed; only when we commit the change in the Databricks workspace will the notebook be added to the main branch in Azure DevOps.

So now let's commit the change directly to the main branch; this won't cause any issues yet, since we haven't protected main. To commit, click the branch name. Databricks automatically identifies what has changed on the branch; since we just added the bronze to silver notebook, it shows that the notebook was added. Make sure the right branch, main, is selected, then enter a commit message to save the changes; I'll use 'added bronze to silver'. Then click Commit & Push. We get a message saying the changes were successfully committed and pushed to the main branch. To verify, go back to Azure DevOps: the main branch previously showed only the readme.md file, and after refreshing the page you can now see the bronze to silver notebook in the repository, updated just now on main.

As discussed in the previous section, with the merging technique we are following, no one should be able to push changes to the main branch directly, so we need to protect it. Let's see how. Click the Branches option on the left; we have just one branch in the repository, main. To protect it, click the three dots (More options) next to it, and at the bottom you'll see Branch policies; click that.

The first section, Branch Policies, is the important one for branch protection. It has four options, and the key part is the note at the top: if any required policy is enabled, this branch cannot be deleted and changes must be made via pull request. That's exactly what we need: if we enable any of the four options below, the main branch cannot be deleted, and any change to it must go through a pull request; direct commits are blocked.

Among the four options we'll enable the first one, Require a minimum number of reviewers. As the note says, enabling even this one option protects the main branch from deletion and from direct changes. What exactly does the minimum number of reviewers mean? When enabled, a reviewer must approve any change before it reaches the main branch: for example, when a data engineer creates a pull request to merge changes from his feature branch into main, that pull request must be approved by a reviewer before the merge. There's an option to set the number of reviewers required to approve a pull request; the default is 2, but I'll change it to 1, since I'm the only one with access to this repository, meaning a single reviewer's approval is enough to merge into main.

I'm also going to enable the option below it, Allow requestors to approve their own changes. I'm doing this only for demo purposes; in a real environment this option is not recommended, because when it's enabled the person who creates the pull request also has the privilege to approve it themselves, which is not good practice at all. It's always recommended that someone else on the team, most likely a senior technical person or a team lead, reviews the changes and approves the pull request to merge them into main. I'm enabling it only because I'm the only one working with this repository.

We've now configured everything needed to protect the main branch. To test it, I'll go back to the Databricks workspace, move the other two notebooks from the Shared folder into the repo location, and try committing directly to main to check whether the protection works. Click the Shared folder and move the two remaining notebooks, following the same process we used for the bronze to silver notebook. With both notebooks moved, let's try committing directly to main: click the main branch, and Databricks shows the changes we've made, the two added notebooks. I'll enter the commit message 'added remaining notebooks', make sure the main branch is selected, and click Commit & Push; if we've protected the main branch correctly, we should get an error. And there it is: we now get an 'error pushing changes' message when committing directly to the main branch. The main branch is protected, and from now on no one can push changes to it directly, which is really cool.

As discussed in the last section, to merge changes into main we now need to create a feature branch and then open a pull request to merge from that feature branch into main. Let's see how. First, I'm going to create a new feature branch: click the Create button and give the branch a name; I'll call it feature-1.
The most important setting here is the Based on option: make sure main is selected, because we should always create a feature branch based on the main branch, which makes the feature branch an exact copy of main. One thing to note: we could also create the branch from Azure DevOps and then use it in the Databricks workspace, but here I'm showing how to create it from the Databricks workspace itself; both options work the same. Click Create to create the feature-1 branch, then close the dialog.

In the Databricks workspace, the feature-1 branch now shows all the notebooks, including the two we moved recently. We know the main branch only has two files, readme.md and the bronze to silver notebook, and since feature-1 was created from main, the feature-1 branch in the Azure DevOps repository also has just those two files; it's only in the Databricks workspace that all the files appear. So we need to commit the changes to the feature branch to save the two recently added notebooks. Click the feature-1 branch; Databricks identifies the two added notebooks, and the most important thing to check is that feature-1 is the branch selected. Enter a commit message like 'added remaining notebooks' and click Commit & Push.

We've now committed the changes to feature-1, so in Azure DevOps the feature-1 branch has all the files, while main still has just the two. Next we create a pull request to merge the changes from feature-1 into main. Go to Azure DevOps and click Repos; you'll see a message saying you updated the feature-1 branch just now, along with a button to create a pull request. You can use that button, or the Pull requests option under Repos, where the same message appears; let's use that and click Create a pull request.

The most important thing to check here is the source and target branches: the source is feature-1, whose changes we're merging into the target, main. Below that there are a few tabs. The first is Overview, used to describe the pull request. Then there's the Files tab, which is an interesting one: it shows exactly which changes will be merged into main; as you can see, it reports two added files, silver to gold and storage mount, the two files this pull request will merge. Then there's a Commits tab, showing the commits made on the source branch; here we can see the commit message we used, 'added remaining notebooks'.

Back on the Overview tab, the title has been filled in automatically from our commit message, and the description is optional. We also have the reviewers section. In a real project you can configure the branch so that whenever a pull request is created, reviewers are added automatically to review it for you, or you can add someone manually via Add required reviewers to verify your pull request. We've already configured the main branch so that at least one reviewer must approve any pull request before its changes are merged, so let's just create this pull request by clicking Create.

The pull request is created, and we see the message that at least one reviewer must approve it, which is the branch protection we configured: someone has to approve the pull request before the changes reach main. But we also allowed requestors to approve their own changes, which gives me the privilege of approving my own pull request; again, not a recommended practice, as discussed earlier, but fine for this demo. So I'll click the Approve drop-down and choose Approve.

Once the pull request is approved, the Complete option becomes enabled; this is the button that merges the changes into main. Click the Complete drop-down and select Complete. We get some information about the merge here, and the only thing I want to call out is the 'Delete feature-1 after merging' option, which is enabled by default: after the pull request completes and the changes from the feature branch are merged into main, the feature-1 branch is deleted automatically. This is a best practice; we should always get rid of feature branches once their changes are merged, which reduces confusion and makes the number of branches in the repository easier to manage in real projects. The other settings can stay at their defaults, which should be fine, so let's click Complete merge.

As you can see, the pull request completed successfully, which means our changes are merged into main. To confirm, go back to the repository: the main branch now has all the files, the three notebooks plus the readme.md file. So we've successfully merged the changes into the main branch, and we've protected main so that nobody can push changes to it directly.

That completes the environment setup required for this CI/CD pipeline. You should now have a clear understanding of how to integrate an Azure DevOps repository with an Azure Databricks workspace, how to protect the main branch, and the merging technique we use to get changes into main. In the next section we'll start with the actual creation of the CI/CD pipeline.
Okay, so in this Databricks CI/CD tutorial we've completed the first two parts: first we saw what exactly CI/CD is with an example, and in the previous section we did the complete environment setup needed to build the CI/CD pipeline for Azure Databricks. Now it's time to build the actual pipeline, starting in this section with the continuous integration pipeline.

First, let's discuss the main functionality of the continuous integration pipeline. We've already seen the overall CI/CD process for this tutorial: there are two environments, dev and prod; all development happens in dev, and once the work is done and the changes are merged into the main branch, the CI/CD pipeline triggers, copies the latest changes from the main branch, and deploys them to the production environment. That's the process we discussed earlier. In this section we'll concentrate on the functionality of the CI pipeline, and one thing to note is that the continuous integration pipeline we're going to create needs one additional step that wasn't shown in the earlier diagram. Let's understand it with an example.

As part of the environment setup in the previous section, we've already integrated the dev Databricks workspace with the Azure DevOps repository, so there is now a Repos folder inside the dev workspace, and that's where all dev work happens going forward. Say you create a new feature branch inside the dev workspace, make a few changes there, such as adding a new Databricks notebook, then create a pull request and merge the changes into main. As soon as the merge happens, the CI/CD pipeline triggers as usual, but the continuous integration pipeline does one extra step first: before actually deploying anything to prod, it copies all the latest changes from the main branch back into the dev Databricks workspace itself. That is the first step of the continuous integration pipeline: it creates a new folder called live inside the dev workspace and copies all the latest notebooks from the main branch into that live folder.

To understand this better, let's go to the Azure Databricks workspace. In the previous section we moved all the notebooks out of the Shared folder under the workspace directory and into the repository location, so the Shared folder is now empty and all the notebooks live in the repo location, which is where all dev work will be done from now on. I integrated this Azure DevOps repository into the workspace using the Mr K Talks Tech account; another data engineer would do the same kind of integration under their own account and work in their own repo location and branches. So what does the CI pipeline do? Whenever anyone merges their feature branch into main, the CI pipeline copies the latest notebooks from the main branch into a new folder called live under the workspace directory. Let me show you what that live folder will look like when the pipeline creates it: I'll right-click here, create a new folder, and name it live. This is how it will look once the CI pipeline creates it and copies the latest notebooks from main into it. I'm calling the folder live because the notebooks inside it are always the latest ones; the live folder will be an exact copy of the main branch at all times.

Now you might ask: why copy the latest notebooks from the main branch into this live folder at all? To understand that, let's go to the Azure Data Factory studio. In ADF we created a pipeline called copy all tables for the end-to-end data engineering project, with two notebook activities, bronze to silver and silver to gold. If you look at the notebook path in those activities, you can see we selected the Shared workspace location. But we've moved all the notebooks out of the Shared folder into the repo location, so there are no notebooks in the Shared folder any more, and if we ran the pipeline again it would fail because the notebooks are missing from that path.

We know the main branch of the repository always has the latest code, but the problem is that an ADF pipeline cannot reference the actual main branch of the DevOps repository. For example, if I click the Browse button, I can see the Repos folder; clicking into it shows the Mr K account, since I did this setup with that account, and inside it the repository name and all the notebooks we moved. If I choose the bronze to silver notebook there, the pipeline would run the notebook from the Mr K account's Databricks repo location, and there is no option to select the actual main branch of the Azure DevOps repository, which is what holds the latest code. Say a different data engineer merges changes into main: those changes would not be reflected in the Mr K account's repo location until the latest changes were pulled from main, so we would be running stale code. Since we cannot reference the main branch of the DevOps repository from ADF pipelines, our CI pipeline copies all the latest notebooks from main into the live folder inside the workspace, and the ADF pipeline always refers to the notebooks in the live folder, which is an exact copy of main. I think it's now clear why we copy the latest notebooks into the live folder.

The live folder will be created automatically by the CI pipeline itself, so we don't have to create it manually; I'll delete the one I just made, and after a refresh you can see there's no live folder in the workspace any more.

So that's the functionality of the CI pipeline we're going to build: it copies all the latest code from the main branch into the live folder in the dev Databricks workspace, and that folder always holds the exact code of the main branch for the ADF pipelines to use. In some real projects, the CI pipeline also runs unit tests, with test cases that check the code before the changes are merged into main, but we're not doing that in this tutorial. Once the CI pipeline has copied the latest code into the live folder, the continuous deployment pipeline triggers, copies the latest code from the dev live folder, and deploys it into an identical live folder in the production workspace; effectively, the exact live folder gets replicated to prod. That is the complete CI/CD process for Azure Databricks, but in this section we'll concentrate on the continuous integration pipeline, so let's see how to build it.
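For intuition, the core of that "copy main into /live" idea can be expressed as a single pipeline step. This is only a simplified sketch, not the script this tutorial actually builds: it assumes the legacy databricks-cli, and the DatabricksHost and DatabricksToken variables are placeholders for whatever authentication the pipeline sets up (the real pipeline obtains a token via a PowerShell script, as we'll see shortly).

```yaml
# Hypothetical CI step: publish the repo's notebook folder to a /live folder
# in the dev workspace. Assumes the (legacy) databricks-cli and two pipeline
# variables, DatabricksHost and DatabricksToken, standing in for the real
# authentication flow used later in this tutorial.
steps:
- script: |
    pip install databricks-cli
    databricks workspace import_dir --overwrite notebook /live
  displayName: 'Copy latest notebooks from main into /live'
  env:
    DATABRICKS_HOST: $(DatabricksHost)    # workspace URL, e.g. https://adb-...azuredatabricks.net
    DATABRICKS_TOKEN: $(DatabricksToken)  # access token generated at runtime
```

The step runs from the pipeline's checkout of the main branch, which is why copying the repo's notebook folder into /live always yields an exact copy of main.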
Before we actually start creating the continuous integration pipeline, there's one final thing to update in the branch. Right now the repository has the three notebook files and one readme.md file at the top level. To make the CI/CD pipeline work effectively when deploying changes from one environment to another, it's always good practice to organize the files in the repository. What do I mean by organizing? We currently have a mix of files: notebooks plus a readme.md. The readme.md isn't actual code and doesn't need to be deployed to the production environment, so we'll arrange the files into a folder structure that makes deployment easier.

Since we can't change main directly, I'll create a new branch first: click the main branch, click Create branch, name it organize, and click Create. Inside the organize branch, I'll first create a new folder: right-click, choose Create folder, and name it notebook. I'm going to move all the notebook files into this notebook folder, which we can do by dragging and dropping them. With the notebooks moved, I'll create another folder named extra and move the readme.md file into it. The extra folder can hold anything unimportant, and its contents will not be deployed to production. We're going to configure the CI pipeline so that only the files inside the notebook folder are deployed to the production environment; that's the main idea of organizing the files in the repository. All the data engineers should work inside this notebook folder, whether creating new notebooks or changing existing ones.

Now let's commit these changes to the organize branch. Click the branch, and you can see the changes we made: the files in their old location are marked with status D, meaning deleted, and because we moved them to a different directory, the same files appear with status A, meaning added in the new location. I'll enter the commit message 'changed the file locations' and click Commit & Push.

With the changes committed, go to the Azure DevOps repository and refresh; on the home page there's a message saying the organize branch was updated just now. Click Create pull request to merge the changes into main, make sure we're merging from organize into main, and click Create. Per our earlier configuration, at least one reviewer must approve, so approve it with the Approve button, then click Complete, make sure 'Delete organize after merging' is checked, and click Complete merge. The changes are now merged, and the main branch shows the new folder structure.

Back in the Azure Databricks workspace, the branch selector still shows the organize branch, even though we deleted it in Azure DevOps; the deletion isn't reflected in the workspace yet. If I open the drop-down, only main is listed, so let's select main and close the menu. Now we have a single branch, main, with the updated folder structure, and we're all set to start building the continuous integration pipeline.

For that, I'll go to the Azure DevOps repository first. Currently the repository has only one branch, main. Building the CI pipeline requires writing a set of code and merging it into the main branch of this repository; the code we'll use is YAML, which is the language most commonly used for building CI/CD pipelines. Just as we used the Azure Databricks workspace to create a branch and do all the data engineering work in notebooks, we need to create a new branch for the pipeline logic as well; the only difference this time is that instead of Azure Databricks, I'll use Visual Studio Code to write the YAML, because VS Code makes that much easier.

First we need to clone the repository to my local machine so we can use it inside Visual Studio Code. In the repository there's a Clone option; click it, and you'll find Clone in VS Code, a built-in Azure DevOps feature that clones the repository directly into VS Code (there are several ways to clone a repository locally; this is one of them). Click it, accept the prompt to open Visual Studio Code, and then specify a local folder for the clone. I've created a folder under the C drive called databricks cicd, so let's choose that as the local repository destination. What happens is that all the files in the repository are cloned locally to that location; once cloned, we can open the files in VS Code, make whatever changes are required for the CI/CD pipeline, and also use VS Code to create branches or commit changes back to the Azure DevOps repository, just as we did from Azure Databricks. Click Select as repository destination; the clone starts, and we get an Open button to open the cloned repository.

In Visual Studio Code you can now see the repository name, Databricks CICD tutorial, with the same folder setup we created earlier: the extra folder containing readme.md and the notebook folder containing the three notebooks. For the continuous integration pipeline I've already created a set of code, so we can just import it here, and then I'll explain each and every step in it; the reason for preparing the code beforehand is that typing it from scratch would make the video much longer, but don't worry, I'll explain everything the code requires. I'll drag and drop into VS Code a folder I created called cicd, which contains all the code required for this continuous integration pipeline for Azure Databricks, and click Copy folder to import it. Expanding the cicd folder, we have a folder called scripts containing a PowerShell script file, databricks token; another folder called templates containing a YAML file named deploy notebooks; and a YAML file called cicd pipeline. That is the code setup required for creating the CI pipeline.
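Summarizing that layout (file names here are approximated from the narration; the exact spelling in the repository may differ slightly):

```
cicd/
├── scripts/
│   └── databricks-token.ps1    # PowerShell: obtains a Databricks access token
├── templates/
│   └── deploy-notebooks.yaml   # reusable stage template, called per environment
└── cicd-pipeline.yaml          # master pipeline file, registered in Azure DevOps
```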
that we have an option called create new branch let's click on it now it is asking for the name of the branch I'm going to give a name called future slci pipeline so after giving the name let's hit enter to create this new Branch okay so now as you can see here in the bottom left the new future Branch CI pipeline has been selected so now we can use this future Branch to make any changes required for creating this continuous integration pipeline for assure data breaks okay so now we will understand the code that we created for the cacd pipeline firstly I will explain the flow of this code as you can see here we have a file called cicd pipelines DOL so this file is the master file for this CD pipeline what do I mean by Master fil is when we actually create the CD pipeline in AO devops we'll be using this master file to create it so this master file will call this temp template file which is deploy notebook. yaml this template file which is deploy notebook. yaml will call this script file called datab brick token. PS1 which is a partial script file so totally there are only three files the first one is the master file and the next one is the template file and finally we have a script file okay so firstly let's understand the master file this is how the syntax of EML file looks like mostly in a form of Json structure so most of the companies will be using the yaml file to create the CD pipelines since this can be easily reused in different reposter such as ASU devops or GitHub okay so now let's understand the code in the master yaml file the most important thing is the first to line we have a property called trigger and the value for this is given as main what this means is this EML file will be triggered only when there is a change happens in the main branch as discussed before we'll be using this master file to create the csad pipeline in aure devops so the CSD pipeline will be triggered as soon as you update a change in the main branch so that's the main idea of this trigger function the next thing we have is the variable groups this is an interesting one so what this means is in ASU devops we'll be creating something called variable groups the main use of this variable groups is say for example in Dev environment the datab bricks name would be different Resource Group name would be different compared to the data bricks name and the resource Group name in prod right so for this reason we need to create a variable group specific to each environment and the variables related to that environment should be saved in that particular group so before seeing the code below first let's create a variable group for the dev environment with the name DBW cacd Dev so for that I'm going to the ASU devops as you can see here I'm inside the ASU devops now here in the left side of the page you'll be seeing an option called pipelines let's click on it in the pipeline section we have an option called Library so let's click on it here we will have an option to create the new variable group let's click on this variable group button to create a new one for the dev environment here we have a few option to fill in firstly it is asking for the name of the variable group for this we can use the same name used in the code so that we don't have to change anything in the code and we can create the csdd pipeline with the same code in here so let's copy this name and we'll go back to the ASU devops and paste it over here in the text box okay so we have given the name of the variable group now we need to add the variables 
We need to add at least one variable to this group in order to create it; for example, if I just hit the save button now, we get an error message saying that a variable group must have at least one variable defined. Before adding the required variables, we need to understand the rest of the code, so let's go back to VS Code. Okay, we have seen what the trigger and variable groups are; now let's look at lines 7 to 10. This is just setting parameters in the code. We have a parameter called vmImageName whose value is windows-latest, and a parameter called notebooksPath whose value is notebooks. These parameters can be used anywhere in this file; for example, vmImageName is used under the pool section, where its value is passed to the vmImage property, so windows-latest is what that property receives. Now let's understand the pool. In simple terms, the pool is just compute power. To execute the CI/CD pipeline in Azure DevOps we need some sort of compute, right? That compute is drawn from this pool. Microsoft provides different types of agent pools that can be configured to run CI/CD pipelines in Azure DevOps. Let's see which pools are available; for that, I have the official Microsoft documentation open in another tab. As you can see, this page is all about Microsoft-hosted agents. As mentioned earlier, Microsoft hosts various virtual machine images, and these can be configured for our pipelines. If I scroll to the bottom, there is information about all the hosted agents: we have used windows-latest in our code, and there are also options for Ubuntu and macOS virtual machines to provide the compute needed to run the CI/CD pipeline. Here we are configuring the windows-latest agent, as specified in the pool section of the YAML code. After that, the notebooksPath parameter is assigned the value notebooks. The reason for this is that, if I go to the Azure Databricks workspace, you can see we have already set up the folder structure with a folder called notebooks, and all the actual notebooks have been moved into it. As discussed earlier, only the files inside this notebooks folder will be pushed to the prod environment, and that's why notebooksPath is set to notebooks. Putting all of this together, a sketch of the top of the master file is shown below.
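To make the pieces above concrete, here is a minimal sketch of how the top of a master file like cicd-pipelines.yml can look. The exact property and group names (DBW-CICD-Dev, vmImageName, notebooksPath) are reconstructions from this walkthrough, not a verbatim copy of the file:

```yaml
# cicd-pipelines.yml (sketch) -- top-level settings of the master file
trigger:
- main                        # run only when the main branch is updated

variables:
- group: DBW-CICD-Dev         # variable group created under Pipelines > Library

parameters:
- name: vmImageName
  type: string
  default: 'windows-latest'   # Microsoft-hosted Windows agent
- name: notebooksPath
  type: string
  default: 'notebooks'        # only this folder gets deployed

pool:
  vmImage: ${{ parameters.vmImageName }}   # compute for the pipeline run
```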
I think you should now have a clear understanding of this part of the master YAML file. The next part is the stages section, which is the most important part of the master file: this is the segment that does the actual deployment, and a stage should be configured for each environment. Right now I have only one stage, for the dev environment, which does the continuous integration part. Inside the stages section we first have a template property; this is where we actually call the template file, deploy-notebooks.yml. I already mentioned that the master file calls the template file, right? That step is configured here. While calling the template file, the master file sends it a set of parameters. Firstly there is a stageId, which just identifies the purpose of the stage; we have given it the value deploy to dev environment. After that there is a parameter called env, which indicates which environment this stage is for; since this stage is for the dev environment, its value is dev. Then we have a parameter called environmentName, and the remaining parameters are resourceGroupName, serviceConnection, and finally notebooksPath. Among these, notebooksPath is already specified at the top of the file. All of these parameters are passed to the template file for the deployment logic. For three of these parameters, environmentName, resourceGroupName, and serviceConnection, we are not specifying values directly in the code; instead we add them as variables in the dev variable group that we created earlier in Azure DevOps. So we need to create three variables, one for the environment name, one for the resource group name, and one for the service connection, and each variable name must match exactly what is used in the code. What I'm going to do now is copy the environment name variable, jump back to the Azure DevOps variable group, and paste it in to create a new variable with the same name; don't worry about its value for now. Similarly, let's create the other variables: I'll copy the dev resource group name and paste it here, then copy the final one, for the service connection, and paste that too. Among these three variables, we already know the value of the dev resource group name, right? If I go to the Azure portal, you can see the resource group that we are currently using as the dev environment, so let's copy its name and add it as the value of the dev resource group name variable. For the environment name and service connection variables we don't yet have values, so I'll just enter a temporary value of "to be filled" for both. Now let's save this variable group. A sketch of the resulting stages section is shown below.
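Here is a sketch of the stages section just described, with the dev stage calling the template and reading the three unknowns from the variable group via $(...) macros, as the walkthrough describes. The stage ID and variable names are my assumptions:

```yaml
# cicd-pipelines.yml (sketch, continued) -- the dev stage
stages:
- template: cicd/templates/deploy-notebooks.yml    # template file doing the work
  parameters:
    stageId: 'Deploy_to_Dev_Environment'           # identifies this stage
    env: 'dev'                                     # which environment this is for
    environmentName: $(devEnvironmentName)         # from the DBW-CICD-Dev group
    resourceGroupName: $(devResourceGroupName)     # from the DBW-CICD-Dev group
    serviceConnection: $(devServiceConnection)     # from the DBW-CICD-Dev group
    notebooksPath: ${{ parameters.notebooksPath }} # declared at the top of the file
```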
Once the group is saved, let's jump back to Visual Studio Code. Before actually looking at the environment name and the service connection, let's see how all these parameters are used in the template YAML file, so let's open that file now. This deploy-notebooks.yml file is just a template, and the same file can be used to deploy the notebook changes to multiple environments: similar to creating a stage for the dev environment, we can create another stage for the prod environment, call the same template file from the prod stage, and the only change needed is passing the prod-specific parameters to the template for deploying the changes to production. So all the parameters from the dev stage are passed into this template file. We have the stageId parameter, and then we have the dependsOn parameter. This dependsOn parameter is not passed from the dev stage; it is defined in the template file itself with a default value of null, so we can ignore it for now. We'll need it only for the production deployment, so let's come back to it later. Then we have the env parameter, the environmentName parameter, the resourceGroupName, the serviceConnection name, and finally the notebooksPath. Once the template receives the values for all these parameters from the master file, it uses them to carry out the deployment; a sketch of the template's parameter declarations follows below. Before looking at the deployment logic itself, let's look at the environment name and the service connection. For that, let's go to Azure DevOps again. Firstly, the environment name: under the Pipelines section there is an option called Environments, so let's click on it. What is meant by an environment is that all the deployments performed by the CI/CD pipeline can be managed in one place. As you can see here, the page describes environments as "manage deployments, view resource status, and get end-to-end traceability". For example, if we create an environment for dev, then whenever the CI/CD pipeline runs and deploys changes to dev, we can come to this place and see the status of the deployment, whether it succeeded or failed, plus other related information. That's the main idea of creating an environment. Now let's create a new environment for dev by clicking the create environment button. It asks for the environment name; I'm going to call it Dev environment databricks CICD. After the name, it asks for an optional description, which I'll skip for now. Then there is a resource option with choices like none, Kubernetes, and virtual machines; we don't need Kubernetes or virtual machines for this environment, so let's go with none and click the create button. We have now created an environment for dev, and going forward we can manage all the dev-related deployments from this place; we'll see that in a later part of this CI/CD tutorial. As you can see, there is a hint saying the environment is ready to use in the pipeline YAML as "environment: Dev environment databricks CICD". That's what we have used in our YAML code, but instead of hardcoding the environment name we used a variable. So let's copy this environment name and paste it into the variable we created earlier inside the variable group; once that is done, the code will pick the environment name up from the variable. Let's go to the Library, open the variable group, remove the placeholder in the environment name text box, and paste the value we copied. This value will now be passed as a parameter into the code and on to the template file for the deployment. Okay, so we have covered the environment name, and the final one to discuss is the service connection, which is the most important piece of this CI/CD pipeline. Let's discuss it; for that, let's go to Azure DevOps again.
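Before digging into the service connection, here is the parameter block promised above, a sketch of the top of deploy-notebooks.yml. The names mirror what the master file passes; the walkthrough describes the dependsOn default as null, and an empty list behaves the same way (no dependency), so that is what this sketch assumes:

```yaml
# deploy-notebooks.yml (sketch) -- parameters received from the master file
parameters:
- name: stageId
  type: string
- name: dependsOn
  type: object
  default: []            # not passed by the dev stage; empty means "no dependency"
- name: env
  type: string
- name: environmentName
  type: string
- name: resourceGroupName
  type: string
- name: serviceConnection
  type: string
- name: notebooksPath
  type: string
```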
So now, what exactly is a service connection? We know we'll be creating the CI/CD pipeline in Azure DevOps, and this pipeline needs access to the Azure resources in order to make the deployment, right? For the pipeline to get that access, we need to establish some sort of connection between Azure DevOps and the Azure resources, and that is exactly what a service connection does. Before creating one, if I go to the access control blade of this dev resource group and click on role assignments, we can clearly see who has access to this resource group. At the moment we have one security group with Contributor access, and this Mr. K account has Owner and User Access Administrator access; apart from these, no one has access. Only someone with either Contributor or Owner access can use the resources inside this resource group, such as the Databricks workspace. We cannot directly grant Azure DevOps access to this resource group, and that's why we use a service connection, which automatically configures the required access for Azure DevOps through a service principal. Let's see how to create the service connection. In the bottom left there is an option called project settings, so let's click on it; if you scroll down, there is an option called service connections, so let's click on that. As you can see, we have an option to create a new service connection, so let's click the create service connection button. There are different connection types supported in Azure DevOps; since we need to connect to Azure resources, I'm going to choose Azure Resource Manager and click next. Now we need to configure the authentication method: we have options such as service principal (automatic), service principal (manual), managed identity, and publish profile. The first option, service principal (automatic), is the recommended one, so let's go with that. Basically, a service principal is a kind of third-party identity. What I mean is that we cannot directly grant Azure DevOps access to this resource group, so when we create the service connection, it automatically creates a service principal, that service principal is automatically given Contributor access to the resource group, and Azure DevOps then authenticates as that service principal to access any resource inside the resource group for the deployment. That's what we are doing with the service connection. So let's go back to Azure DevOps and click the next button. Here we need to scope the service connection to our dev resource group. Firstly it asks me to sign in with my account, so I'll log in now. After signing in, we select our dev resource group, RG data engineering project, and then give the service connection a name; I'm going to call it Dev service connection, and I'll copy this name to paste into the variable group later. Then there is an optional description, and finally a security option with a checkbox that says "grant access permission to all pipelines". If this is checked, every pipeline created in this Azure DevOps organization would have access to this service connection, which we don't want, so let's leave it unchecked and click the save button.
Okay, as you can see, it says it is setting up the connection. As discussed, what this is going to do is create a new service principal and give it Contributor access to the resource group, so let's validate that once it finishes. Now the service connection is created; if I jump back to the resource group, do a full refresh on the page, and go to role assignments again, you can see a new service principal was created automatically and given Contributor access to this resource group. Azure DevOps will use this service principal as its authentication mode to access any resource inside the resource group. Now I'll jump back to Azure DevOps, go to the Library, open the variable group, and paste in the service connection name we copied earlier. We have now assigned values to all the variables in this variable group, so let's save the changes. All of these variables will be passed as parameters into the code, so let's discuss how those parameters are used for the actual deployment. I'll close the master YAML file tab and concentrate on the template file, which has received all the parameter values from the master file. The template file has a stages section as well. Here we use the stageId passed from the master file, and then a displayName property, which is the display name of the stage deploying to the dev environment. The env parameter is used here, so the display name for the dev stage will be "deploying to dev environment"; when the same template file is used again for the prod stage, the display name will be "deploying to prod environment". Then we have the dependsOn property; as mentioned earlier, the dependsOn parameter is not needed for the continuous integration pipeline, so only the default null value gets passed here. Inside the stage we have jobs, which hold the functionality to deploy the notebooks. Firstly there is a display name, "deploying Databricks notebooks", and then a property called environment: this is the environment we created in Azure DevOps. The dev environment name has already been added to the variable group, and that value is assigned to this environment property. The main functionality of this continuous integration pipeline is copying the files from the notebooks folder to the live folder in the dev workspace, and that is done by the code section below it, an inline PowerShell script. Firstly, we need to know which subscription the dev Databricks workspace lives in, so we pass in the service connection that we created in Azure DevOps, which carries the subscription and resource group information; using the service connection's credentials, the PowerShell code can access the dev Databricks workspace. In the PowerShell script we first need to look up the dev Azure Databricks workspace, so we use the Az PowerShell resource-list cmdlet, which lists the resources in the resource group we pass as a parameter; we specify the type as Databricks workspace, thereby retrieving the details of the workspace, and those details are assigned to a variable. A sketch of this stage and job layout is shown below.
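Here is a sketch of how deploy-notebooks.yml can be laid out, combining what we have covered so far with the token and copy steps explained next. AzurePowerShell@5 is a real Azure Pipelines task, but the job name, folder paths, and the exact cmdlets (Get-AzResource, Get-AzDatabricksWorkspace) are my assumptions based on what the walkthrough describes, and the legacy databricks-cli at the end is one common way to implement the final copy step:

```yaml
# deploy-notebooks.yml (sketch, continued) -- the stage and deployment job
stages:
- stage: ${{ parameters.stageId }}
  displayName: 'Deploying to ${{ parameters.env }} environment'
  dependsOn: ${{ parameters.dependsOn }}            # empty for dev; set for prod later
  jobs:
  - deployment: DeployDatabricksNotebooks
    displayName: 'Deploying Databricks Notebooks'
    environment: ${{ parameters.environmentName }}  # Azure DevOps environment
    strategy:
      runOnce:
        deploy:
          steps:
          - checkout: self                          # deployment jobs skip checkout by default
          - task: AzurePowerShell@5
            displayName: 'Deploy notebooks to the live folder'
            inputs:
              azureSubscription: ${{ parameters.serviceConnection }}
              azurePowerShellVersion: 'LatestVersion'
              ScriptType: 'InlineScript'
              Inline: |
                # 1. Find the Databricks workspace inside the resource group
                $resource = Get-AzResource -ResourceGroupName '${{ parameters.resourceGroupName }}' `
                  -ResourceType 'Microsoft.Databricks/workspaces'

                # 2. Get the full workspace details (including its URL)
                $workspace = Get-AzDatabricksWorkspace `
                  -ResourceGroupName '${{ parameters.resourceGroupName }}' `
                  -Name $resource.Name

                # 3. Ask the helper script for a bearer token
                $token = & '$(Build.SourcesDirectory)/cicd/scripts/databricks-token.ps1' `
                  -databricksWorkspaceResourceId $resource.ResourceId `
                  -databricksWorkspaceUrl $workspace.Url

                # 4. Copy everything under the notebooks folder into /live
                $env:DATABRICKS_HOST  = "https://$($workspace.Url)"
                $env:DATABRICKS_TOKEN = $token
                pip install databricks-cli --quiet
                databricks workspace import_dir '${{ parameters.notebooksPath }}' '/live' --overwrite
```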
After that, we get the full information about the dev Databricks workspace by using the Az PowerShell Databricks cmdlet, passing in the details of the workspace that we got in the previous step; once this is done, we have everything we need to know about the dev Databricks workspace. But in order to actually access the workspace, we need to generate a bearer token. For generating the token, the template file calls the databricks-token.ps1 file: as you can see, we specify the path of databricks-token.ps1 along with the parameters needed to create the token, the Databricks workspace ID and the workspace URL, both retrieved in the previous two steps. Once these parameters are sent to databricks-token.ps1, the bearer token for accessing the Databricks workspace is created by the PowerShell script below. The parameters it uses are the Databricks resource ID, the workspace URL, and a dedicated application ID for Azure Databricks, which is common for everyone across every Azure tenant. What I mean by this is: if I copy this ID, go to the Azure portal, and paste it into the search box, you can see Azure Databricks show up under Azure Active Directory. This is the tenant-level enterprise application that represents the Databricks resource in Azure. Using all of these parameters, the PowerShell script generates the bearer token, which is returned to the template file and assigned to a variable. Using that token, we perform the actual deployment, which is copying all the notebooks inside the notebooks folder into a live folder in the dev workspace. That is exactly what the final two lines of code do: we use the notebooksPath parameter, which holds the value notebooks, and all the notebooks inside that folder are copied to the live folder specified in the code. So that is the functionality of the entire code segment.
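For reference, here is a sketch of what a token helper like databricks-token.ps1 can look like. The parameter names are assumptions; the GUID is the well-known application ID of the AzureDatabricks enterprise application mentioned above, and Get-AzAccessToken is one way to mint an AAD bearer token for it when the surrounding task is already signed in via the service connection:

```powershell
# cicd/scripts/databricks-token.ps1 (sketch) -- returns a bearer token for the workspace
param (
    [string] $databricksWorkspaceResourceId,   # passed in by the template file
    [string] $databricksWorkspaceUrl           # passed in by the template file
)

# Well-known application ID of the AzureDatabricks enterprise application;
# the same value exists in every Azure tenant.
$azureDatabricksApplicationId = '2ff814a6-3304-4ab8-85cb-cd0e6f879c1d'

# Issue an Azure AD access token scoped to the Databricks resource.
# The surrounding AzurePowerShell task is already authenticated
# via the service connection's service principal.
$token = (Get-AzAccessToken -ResourceUrl $azureDatabricksApplicationId).Token

# Hand the bearer token back to the caller
return $token
```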
To quickly recap: we have a master YAML file, cicd-pipelines.yml, which reads the variable values from the variable group and passes them as parameters to the template file, deploy-notebooks.yml. The template file uses the service connection to get the details of the Databricks workspace in the dev resource group, and then, using details like the workspace ID and URL, generates a bearer token by calling databricks-token.ps1. That script uses the Databricks ID and URL along with the application ID to generate the bearer token and return it to the template file. The bearer token is then used to access the Databricks workspace, copy all the notebooks inside the notebooks folder, and deploy them to the live folder in the dev Databricks workspace. That is the complete flow of this continuous integration pipeline. Now we can commit all the newly added changes to the feature branch we created earlier. For that, I'm going to click on the Source Control option, give a commit message of "added CI code", and click the commit button to commit the changes to the feature branch. We then get an option to publish the branch to Azure DevOps, since we created this branch locally in VS Code, so let's click the publish branch button. We have now committed all our newly added changes to the feature branch. Next I'll go to Azure DevOps and open the repository; you can clearly see that the feature ci-pipeline branch was updated just now. To view the changes, I'll switch from the main branch to the feature branch. As you can see, we now have the cicd folder that we created, with the script file, the template file, and finally the master file, cicd-pipelines.yml. The next step, as mentioned before, is to use this master YAML file to create the CI/CD pipeline, so in the next section let's see how to do that in Azure DevOps. Before creating the pipeline, we first need to merge the changes to the main branch. The reason is that if you create the pipeline first and then merge the changes, the pipeline would start immediately, since an update would be happening on the main branch. So let's merge the changes to the main branch first, and then we'll create the CI/CD pipeline in Azure DevOps. To merge the changes into main, I'll click on "create a pull request". If I go to the files tab, you can see the three files we added as part of this continuous integration pipeline. Once you have verified the changes and given a meaningful title, you can go ahead and click the create button. I will now approve these changes and click the complete button. I'll uncheck the "delete feature branch" option, because if we have any issues with the pipeline we can use the same branch to fix them; it's not recommended practice, but I'm doing it for this demo to save some time. Now I'll click the complete merge button, and all the continuous integration changes from the feature branch get merged into the main branch. Cool, the merge is complete. Now I'll go to the repos section and switch from the feature branch back to main, and you can see that all the CI/CD code we created is now in the main branch. We are now at the final step, which is creating the pipeline. As discussed before, we'll create the pipeline from the master YAML file, cicd-pipelines.yml.
For that, let's click on the Pipelines option. As you can see, we have an option to create our first pipeline using the create pipeline button, so let's click it. We are going to select the first option, Azure Repos Git YAML. Now we need to choose the repository for the pipeline, so let's choose databricks-cicd-tutorial. Next we have a few options for configuring the pipeline; I'm going to choose the last one, "existing Azure Pipelines YAML file". The reason is that we have already created the YAML file, the master file, so we can use that existing file to create the pipeline; you could also use the starter pipeline option if you were creating the pipeline from scratch inside Azure DevOps itself. So let's click the existing Azure Pipelines YAML file option. There is an option to select the branch, and we need the main branch, which is selected by default anyway, and then a path option to choose our master YAML file, cicd-pipelines.yml. Note that we should not select the template file here, since the template's values are passed in from the master file; so let's choose the master file and click continue. Now you can review the master YAML file code before creating the pipeline. Once you're happy with it, you can either click run to start the pipeline straight away or just save it; I'm going to save the pipeline first, so let's click the save button. As you can see, we have created our CI/CD pipeline, databricks-cicd-tutorial. Before testing the pipeline's functionality, we need to grant it certain permissions. Let's see what those are. Firstly, the pipeline needs access to the dev environment we created earlier. For that, let's click on the Environments option; I'll choose the dev environment, and in the top right corner there are three dots with an option called Security, so let's choose that. At the bottom there is a section called pipeline permissions; currently no pipeline has access to this dev environment. To assign permission, click the plus icon on the right and select our CI/CD pipeline, databricks-cicd-tutorial, to give it access to this environment. Now our pipeline can use this environment for deployments. The next access to grant is to the Library: we created a variable group for dev, and the pipeline needs access to it in order to read the variables inside. So let's click on the variable group; at the top you'll find an option called pipeline permissions, click on it, and add our CI/CD pipeline to grant it access. Now our pipeline can read all the variables from this variable group. We have given the pipeline permission to the environment and the variable group; the final permission needed is for the service connection we created earlier. For that, click on project settings, scroll down on the left to the service connections option, and here I'll choose the dev service connection.
Inside that, in the top right corner, we can click the three dots and choose Security. At the bottom we see the pipeline permissions option, and I'm going to add our CI/CD pipeline, databricks-cicd-tutorial. Now the pipeline has access to the service connection as well. So our CI/CD pipeline has access to all of the things we created: the environment, the variable group, and the service connection. Now it's the perfect time to test our continuous integration functionality. I'll go to the Databricks workspace and into the notebooks folder. What I'm going to do is create a new notebook and merge the change to the main branch; as soon as I merge to main, our CI/CD pipeline should get triggered, pick up the latest notebooks from the main branch, and deploy them to the live folder in this dev Databricks workspace. That's what we're testing as part of the continuous integration functionality. Firstly, I'll click on the main branch to create a new branch for this test; I'll name it testing-ci-pipeline and click the create button. Now let's create a new notebook inside the notebooks folder: I'll right-click, choose create notebook, rename it to "test CI notebook", and type a simple print statement, print(1+1). Once that's done, I'll click on the new branch, and here you can see that we have just added one notebook on this branch. Let's give a commit message of "added a notebook" and click the commit and push button. We have committed the changes to the new branch. Now let's go to Azure DevOps and the repos section, and create a new pull request to merge the changes from this branch to main. I'll click the create a pull request button; if I go to the files tab, you can see we are just merging the new notebook change into main, so let's click the create button. We have created the pull request, and as soon as we complete the merge, our CI/CD pipeline should get triggered, so let's see whether that works. I'll click the complete button, check the delete branch option since we don't need this branch anymore, and finally click the complete merge button to test whether the pipeline is triggered automatically. We have merged the changes to the main branch successfully. If I now go to the Pipelines section, you can see our CI/CD pipeline was triggered automatically and is currently queued. Let's click on the pipeline to monitor its run. You can see the final commit message, "added a notebook", that we used when merging to main. If you click on the run, you can monitor the pipeline's jobs; the job running right now has the display name "deploying Databricks notebooks", as specified in our YAML code. You can also click on the job to see more detail about what exactly it is doing, and you can see "deploying to dev environment", which is the display name of our dev stage from the YAML code. When we do the prod deployment, the prod stage will have a different display name, "deploying to prod environment"; we'll see that in the continuous deployment part.
Okay, as you can see, our continuous integration pipeline has completed successfully, and it's good to see all the statuses green, which is nice. Before going to the Databricks workspace to validate the continuous integration functionality, I'll first go to the Environments section. Here we can see that new activity has occurred in the dev environment we created earlier; if I go inside the dev environment, you can see the last deployment status of our continuous integration pipeline, which ran two minutes ago. In the same way you can keep track of all the deployments our CI/CD pipeline makes to the dev environment from this single place, which is really helpful. Now I'll go to the Azure Databricks workspace to check the deployment. I'm inside the workspace; firstly I'll refresh the whole page, then go to the workspace location and expand the workspace directory. Cool, as you can see, a new folder called live has been created, and if I click on this live folder you can see all the notebooks, including the latest notebook we added to test this continuous integration pipeline. All the notebooks were deployed just moments ago, in the early hours of the morning; as you can imagine, I'm working pretty late to make this video, so I'd highly appreciate it if you could support me by giving this video a like. Now what we can do is point the ADF pipeline notebook activities at the notebooks inside the live folder. This is the bronze to silver notebook activity, so I'll use the browse option to select the bronze to silver notebook from the live folder and click OK. Similarly, we can change the silver to gold notebook activity to point to the silver to gold notebook inside the live folder. Both activities now use the latest code from the live folder, which means that whenever this pipeline runs it will always use the latest code. Let's publish the changes to this ADF pipeline. I think you should now have a clear understanding of how to create the continuous integration pipeline functionality for Azure Databricks. In addition to this, in some real-world projects the continuous integration pipeline also includes unit test functionality that tests your code before the changes are merged to the main branch. I haven't done that as part of this tutorial, since for Databricks deployments most companies use the approach shown here for deploying notebooks. In the next section we'll look at the continuous deployment part, where we'll see how to deploy the latest notebook changes from the dev environment to the production environment, and after that we'll finish with complete CI/CD pipeline testing. Okay, so we have now completed three sections of this tutorial: we started with the introduction to CI/CD, then did all the environment setup required, like setting up the repos, and in the previous section we walked in detail through creating the continuous integration pipeline. In this section we'll create the continuous deployment pipeline. In the current setup, our continuous integration pipeline is triggered as soon as changes are merged to the main branch, and it copies all the latest notebooks from main to the live folder in the dev Databricks workspace.
We have tested that functionality and it is all working well. Now, as soon as the continuous integration pipeline run finishes, the continuous deployment pipeline should be triggered; it will again take the latest notebooks from the main branch and copy them to the live folder in the prod Databricks workspace. After that, the live folders in the dev and prod Databricks workspaces will contain exactly the same code, which is also exactly the code on the main branch. That's what we're going to build in this continuous deployment pipeline, so let's see how to do it. I'm inside Visual Studio Code, where we need to make a few changes for the continuous deployment pipeline. Firstly, let's create a new branch for the new changes: in the bottom left corner I'll click on the feature branch we created for the continuous integration pipeline, and I'll first switch back to the main branch, since we all know that whenever we create a feature branch we should create it off main. If you notice, as soon as we switch to the main branch a few issues are highlighted here. The reason is that in the previous section we updated the main branch through a pull request, and that change is not yet reflected in the local Visual Studio Code copy. To bring those changes in, we need to pull directly from the main branch: I'll go to the Source Control option, click the three dots, and find an option called pull, so let's click it. It pulls all the latest code from the Azure DevOps main branch into the local VS Code copy; cool, as you can see, the errors have gone away and we now have the latest version locally. Now we can create a branch for the continuous deployment changes: let's click on the main branch in the bottom left corner, choose the create new branch option, give it the name feature/cd-pipeline, and hit enter. We have created our feature branch and can use it to make our changes. In terms of the changes required for the continuous deployment pipeline, they are minimal, because we already have the template file, deploy-notebooks.yml.
As discussed before, we can use the same template file for deploying changes to the prod environment; the only change needed is passing the parameters for the prod resources to the template. Let's see how to pass those parameters. The first thing to do, in the master YAML file, is to add a new variable group for storing the prod parameter values: I'll copy the existing line and paste it below, replacing the value dev with prod. We still haven't created this prod variable group in Azure DevOps, but before that we'll discuss the changes needed in the YAML file. Now that we've added the prod variable group to the master YAML file, the next change is adding a new stage for the prod environment. The code at the bottom is the dev stage, which calls the template file with all the dev parameters, right? So let's copy all of that code and paste it below, and in the new copy update all the values from dev to prod. The prod stage calls the template file in exactly the same way as the dev stage above; the only change is the parameters. Firstly, let's update the stageId from dev to prod, so the stage ID becomes deploy to prod environment; then the env parameter changes from dev to prod. After that, all the remaining values will be stored as variables in the new prod variable group: just as we added these variables to the dev variable group, we'll add them to the prod variable group with the prod values. So let's update the environment name parameter from dev to prod, do the same for the resource group name, and finally the service connection parameter. The notebooksPath parameter stays the same as above, since we only need to deploy the notebooks inside the notebooks folder from the main branch. So these are the only code changes required for the continuous deployment pipeline: a new variable group reference and a new prod stage with the parameters updated from dev to prod, as sketched below. Before committing these changes, we need to create a variable group, a new environment, and a service connection in Azure DevOps for the prod environment, and add those names as variables to the prod variable group. Firstly, let's go to Azure DevOps and create the prod variable group.
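Here is a sketch of the master-file additions for prod just described; the group, stage, and variable names mirror the dev ones with prod substituted, as in the walkthrough:

```yaml
# cicd-pipelines.yml (sketch) -- additions for the prod deployment
variables:
- group: DBW-CICD-Dev
- group: DBW-CICD-Prod                             # new group holding prod values

stages:
# ...dev stage exactly as before...
- template: cicd/templates/deploy-notebooks.yml    # same template, prod parameters
  parameters:
    stageId: 'Deploy_to_Prod_Environment'
    env: 'prod'
    environmentName: $(prodEnvironmentName)        # from the DBW-CICD-Prod group
    resourceGroupName: $(prodResourceGroupName)    # from the DBW-CICD-Prod group
    serviceConnection: $(prodServiceConnection)    # from the DBW-CICD-Prod group
    notebooksPath: ${{ parameters.notebooksPath }}
```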
I'm inside Azure DevOps now. I'll go to the Pipelines section and click the Library option; first I'll open the dev variable group and duplicate the page, so I can refer to the variables I need to recreate for prod. Now let's create the new variable group: I'll copy the name of the dev variable group, click the plus icon, paste the name in, and replace dev with prod. After that, let's create the required variables. I'll go to the dev variable group, copy the environment name, paste it into the prod variable group, copy its value across as well, and replace dev with prod in both places. Similarly, let's copy the resource group name into the prod variable group and update it from dev to prod; in the value field we need to put the prod resource group name. This is the resource group we are using as prod, RG data engineering project prod, so let's copy it and paste it into the value of the prod resource group name variable. The final variable to add is the service connection: copy its name from dev, paste it into the prod group, update the name from dev to prod, and do the same for its value. Cool, let's save all these changes; we have created the prod variable group with all the required variables. The next step is creating the environment and the service connection for prod, using the same values as in the variable group. Firstly, the environment: let's copy the value, since we need to create the environment with the same name, go to the Environments option, click the new environment button, and paste the environment name we copied. The description is optional, so let's skip it for now; in the resources section we can go with none, and click the create button. We now have an environment dedicated to managing the prod deployments. Next, the service connection: click on project settings, scroll down to the service connections option, and click on it. First, copy the name of the dev service connection, then create a new one with the new service connection button. I'll choose the same Azure Resource Manager option and click next, go with the same service principal (automatic) option and click next again, sign in for authentication, and once that's done choose the prod resource group, RG data engineering project prod. After that, paste the service connection name we copied and replace dev with prod. Before creating the service connection, let's go to the prod resource group and, under access control, open the role assignments. As you can see, currently only this Mr. K account has access to this resource group. As we saw before, as soon as we create the prod service connection, a service principal will be created automatically and assigned Contributor access to this resource group, and Azure DevOps can then authenticate as that service principal to access any resources inside it; that's why we are creating the service connection. Now let's click the save button; it is setting up the connection, so let's wait for it to complete. Cool, the service connection is created. If I go to the prod resource group and hit refresh, you can see under role assignments that a new service principal has been created and automatically given Contributor access to this prod resource group, which is nice. Azure DevOps can now use this service principal as its authentication mode to access the prod Databricks workspace for the deployments.
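If you prefer to check this from the command line instead of the portal, a quick Az PowerShell sketch (the resource group name here is an assumption following the naming shown in the video) lists the role assignments so you can confirm the new service principal holds the Contributor role:

```powershell
# List who has access to the prod resource group; the newly created
# service principal should appear with RoleDefinitionName 'Contributor'.
Get-AzRoleAssignment -ResourceGroupName 'rg-data-engineering-project-prod' |
    Select-Object DisplayName, RoleDefinitionName, ObjectType
```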
Okay, so now we have pretty much done everything for the continuous deployment pipeline. The one final thing to do before testing it is assigning permissions for our CI/CD pipeline to the prod variable group, environment, and service connection. Firstly, the variable group: if I go inside the prod variable group, there is an option called pipeline permissions, so let's click it and use the plus icon to add our CI/CD pipeline. Just as we did this for all the dev resources, we need to do the same for the ones created for prod. Next, the environment: inside the prod environment, click the three dots in the top right corner, choose the security option, and at the bottom, under pipeline permissions, click the plus icon to add our CI/CD pipeline. The final access to grant is for the prod service connection: open it, click the three dots in the top right corner, choose security, and in the pipeline permissions section add our CI/CD pipeline. Now our CI/CD pipeline has access to the prod versions of the variable group, the environment, and the service connection. Now we can begin testing the continuous deployment functionality. Firstly, let's commit all the new changes to our feature branch: I'll go to the Source Control option, give a commit message of "added CD pipeline changes", click the commit button, and after committing, publish the feature branch to Azure DevOps. The branch is published, so let's go back to Azure DevOps and the repos section. As you can see, the feature branch has been updated, and we can now create a pull request to merge the new changes into main. Before that, notice that on this feature branch we did not make any changes to the notebooks; we only changed the YAML code in the master file for the continuous deployment functionality. But once we merge to main, our CI/CD pipeline will be triggered and will again copy all the notebooks on the main branch and deploy them to the live folder in the dev Databricks workspace. So although we did not change any notebooks, it will copy and replace the same notebooks again, because the main branch is being updated. The only difference is that there will now be another stage in the pipeline for the prod deployment, which copies the same notebooks from main and deploys them to the live folder in the prod Databricks workspace. That's what we're testing with this pull request, so let's see whether the pipeline runs correctly. Let's click the create button to create the pull request. I'll approve the change and click the complete button; I'll uncheck the delete branch option, so that if the pipeline fails for some reason we can use the same branch to fix it, and click the complete merge button. Our latest changes are merged to the main branch. If I go to the Pipelines section, you can see our CI/CD pipeline has started automatically; clicking on it, you can see that the current run has two stages, unlike the previous runs: previously we only deployed the changes to the dev environment, but now we are deploying to both dev and prod.
Let's click on this pipeline run. As you can see, the dev stage is currently running and the prod stage is waiting. Normally the prod stage should run only if the dev stage run succeeds, but that's not what will happen here, because the prod stage is not yet dependent on the dev stage. To create that dependency we need to make one small change in the YAML code, but first let's wait for this pipeline run to complete. Okay, the dev stage run has completed and the prod stage has started. As mentioned, this is not the right approach: suppose the dev stage failed for some reason; in that scenario the prod stage should not run its deployment, but with the current setup the prod stage would still start and try to deploy the changes to the prod environment. We need to prevent that, and this is exactly what the dependency functionality is for. For now, let's wait for this run to finish and confirm the pipeline succeeds, and then we'll make the required change. If the run succeeds, the pipeline will create a new folder called live in the prod Databricks workspace, with exactly the same notebooks as in dev; let's wait and see whether that works. Okay, our CI/CD pipeline run has finished. If I now go to the prod Databricks workspace, cool, we can find a live folder created in the prod workspace, and inside it you can see all the latest notebooks, including the one we added for testing the continuous integration functionality. So our CI/CD pipeline is working without any issues. As discussed, we need one final change to make the prod stage dependent on the dev stage. Let's go back to VS Code; in the prod stage we need a single change. In the parameters section I'm going to paste in a dependsOn parameter whose value is the dev stage ID, deploy to dev environment. This dependsOn parameter is passed to the template file; if I open the template file, you can see we already have a dependsOn parameter there with a default value of null. If the parameter is not passed from the master YAML file, the value is null; since the dev stage does not pass a dependsOn parameter, its value stays null, whereas in the prod stage we pass the dev stage ID as the value. The parameter is then used further down, in the built-in dependsOn property of the stage in the YAML. So when the prod stage is generated from this template, it will only run if the dev stage completed successfully; that's what we achieve with the dependsOn property (the exact change is sketched below). Now we can merge the new changes to the main branch and test the new pipeline setup. I'll go to the Source Control option, give a commit message of "added dependsOn functionality", click the commit button, and click the sync changes button. We have committed the changes to the feature branch; now let's create a new pull request to merge them into main and test the CI/CD pipeline.
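The change amounts to one extra parameter on the prod stage, with the template already wired to consume it; a sketch, using the stage ID assumed earlier:

```yaml
# cicd-pipelines.yml (sketch) -- prod stage now depends on the dev stage
- template: cicd/templates/deploy-notebooks.yml
  parameters:
    stageId: 'Deploy_to_Prod_Environment'
    dependsOn: 'Deploy_to_Dev_Environment'   # run prod only after dev succeeds
    # ...remaining prod parameters unchanged...

# deploy-notebooks.yml already consumes it on the stage:
#   dependsOn: ${{ parameters.dependsOn }}   # empty for dev, dev's stageId for prod
```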
If I go to the files tab, you can see this is the only change we made. Once we have verified the change, let's click the create button; I'll approve the change, click the complete button, and choose the complete merge option. The CI/CD pipeline should now have been triggered. If I go to the pipeline run, you can see the pipeline layout has changed: there is now a visible link between the dev and prod stages, which means the prod stage will only run if the dev stage run is successful. That's what we achieved with the dependsOn property; I hope that makes sense now. Let's wait for the pipeline to complete. Cool, the dev stage is done, and the prod stage now starts the prod deployment. Nice, both the dev and prod stage runs finished successfully, which means our CI/CD pipeline is working without any issues. In the current CI/CD pipeline setup, if you've noticed, as soon as the dev stage completes, the prod stage begins deploying the changes to the prod environment. This setup is not an optimal solution for real-world projects. The reason is that while it is completely fine to deploy changes to the dev environment immediately after merging to the main branch, deploying to the prod environment should not happen immediately without proper planning or approval from a senior technical or operations team in the company. In most organizations, any deployment to the prod environment requires a change request to be raised and approved, and in some companies prod deployments may only happen outside office hours, say after 5:00 p.m.
In that case, we need some way to make the pipeline wait before deploying the changes to the prod environment, and this can be achieved by configuring environment protections. What this means is that, for example, once the dev stage completes, an email is sent to a team lead or someone from the operations team saying the pipeline is waiting to deploy changes to the prod environment; only when they approve it will the prod stage run and do the deployment, otherwise the pipeline waits. They also have the ability to reject the run if they think the changes shouldn't be deployed to prod for some reason. This is one of the key pieces of functionality we need in our CI/CD pipeline, so let's see how to configure it. Let's go to the Environments section. As discussed, we only need to configure the protection check for the prod environment; we don't change the dev environment, since we want the dev stage to deploy to the dev workspace instantly. Let's click on the prod environment; you will see an option called approvals and checks, so click on it. On the right side there is a plus symbol that says add check, so let's click it. There are different types of checks that can be configured for the pipeline; among these we'll use the approvals option, which is the standard one most companies use. I'll choose it and click next. As you can see, we can assign users or security groups as approvers for deployments to this prod environment; in real-world projects the approver might be a senior technical person such as a team lead or someone from the operations team. Once added, after the dev stage completes, the approver receives an email notification asking them to approve the CI/CD pipeline's deployment to the prod environment. For this demo I'm going to add this Mr. K account as an approver, and then click the create button. We have now protected the prod environment: when our CI/CD pipeline runs, the dev stage works as usual, and once it finishes, the Mr. K account is notified by email; only when Mr. K approves the pipeline will the prod stage run and deploy the changes to the prod environment. Now let's test this functionality. This will be the final end-to-end CI/CD pipeline test for Azure Databricks, and it matches exactly the scenario you would see in real-world projects. Let's go to the dev Databricks workspace. Firstly, I'm going to create a new branch and add a new notebook inside the notebooks folder. I'll click on the branch option, switch to the main branch so the new branch is based off main, give it the name e2e-cicd-testing, and click the create button. We have created the new branch; now let's create a new notebook inside the notebooks folder, rename it to "final CICD test notebook", and type a simple print statement: print("please work"). Now we can commit this notebook's changes to the feature branch: let's give a commit message of "added a notebook" and click the commit and push button. We have successfully committed the changes to this branch.
Now let's go back to Azure DevOps. I'll go to the repos section and click the create a pull request button. We are going to merge the latest changes from this end-to-end testing branch to the main branch, so let's click the create button. After creating it, I'll approve the pull request and click the complete button; this time we can check the delete branch option, since we don't need this branch anymore, and click the complete merge button. The new notebook we created will now be added to the main branch. The merge is done, so I'll go to the Pipelines section, and as you can see our CI/CD pipeline has started, so let's click on it. Inside the pipeline run, the dev stage is currently running and deploying the changes to the dev workspace: it takes the latest notebooks from the main branch and deploys them to the live folder, so we should see the additional notebook we added as part of this end-to-end test. Let's wait for the dev stage to finish. Our dev stage run has finished, and if you notice, the prod stage is now in a waiting state; it did not start immediately after the dev stage completed. At the top there is a message saying one approval needs review before the run can continue to deploying to the prod environment; this is exactly what we configured on the prod environment. The Mr. K account should now have received an email asking for approval for the prod stage to deploy the changes to the prod environment. I'll show you how that email looks: as you can see, this is the email the Mr. K account received from Azure DevOps, with information such as "your approval is required" for the prod stage deployment. Now I'll jump back to Azure DevOps. Before approving the prod stage, let's first verify that the dev stage deployed the latest notebook. I'll go to the dev Databricks workspace, and you can see that inside the live folder we now have the latest notebook, final CICD test notebook, so our dev stage is working without any issues. Now let's approve the prod stage: we need to click the review button, which is visible only to someone with the privileges to approve this pipeline; since we added the Mr. K account, I'm able to see it. Here we have an option to approve the pipeline, and also an option to add comments, which is optional. Only when we approve the pipeline will the prod stage run; otherwise the pipeline waits until it is approved. Based on a company's policies and priorities, they can approve the pipeline whenever they actually need to deploy the changes to the prod environment. So let's approve this pipeline now. Cool, the pipeline starts deploying the changes, and once it is done we should see the latest notebook inside the live folder in the prod Databricks workspace; as you can see, we don't have the latest notebook there at the moment, so let's wait for the pipeline to complete. The pipeline run is complete, and if I jump back to the prod Databricks workspace, you can now see the final notebook we added inside the live folder. We have now tested the complete CI/CD pipeline for Azure Databricks, and all the functionality is working without any issues.
In this tutorial we have completed section four, creating the continuous deployment pipeline, as well as the final section, testing the complete CI/CD pipeline for Azure Databricks. We have covered, in detail, almost everything required to create a CI/CD pipeline for Azure Databricks. I hope you learned something new and found my videos easy to understand. It was a huge effort for me to make this complete video, and my main intention is to teach everything in the simplest way possible, so that even an absolute beginner can understand the complete functionality. If you liked this video, please like, share, and subscribe; it really motivates me to make more similar videos and helps me grow my YouTube channel. Thanks for watching, and see you in another great video. Until then, cheers, bye!
Info
Channel: Mr. K Talks Tech
Views: 31,342
Keywords: Azure Databricks tutorials, Azure Databricks tutorials for Beginners, Azure Databricks real time project, databricks tutorial, azure databricks interview questions, azure databricks tutorial for beginners, azure databricks interview question for beginners, databricks cicd, cicd and devops, cicd pipeline, what is CICD, cicd project, cicd and devops azure, azure cicd, azure devops cicd, cicd pipelines, cicd tools, devops cicd, what is cicd pipeline, cicd and azure devops
Id: 8SgHFXXdDBQ
Length: 125min 54sec (7554 seconds)
Published: Tue Oct 17 2023