Running Airflow 2.0 with Docker in 5 mins

Video Statistics and Information

Captions
Airflow 2.0 is out. How does it work? What can you do with your DAGs? What are the new features? In order to answer all of those questions, you need to install and run Airflow on your machine, and what is the easiest and fastest way to do that? Indeed, with Airflow 1 we had to run all of those commands: airflow db init, install Python 3.6, create the user, install pip... too many things to do just to get Airflow up and running. Well, in this video we're going to discover the easiest and fastest way to do that: by using Docker. My name is Marc Lamberti, I'm the head of customer training at Astronomer and a best-selling instructor on Udemy, and during the next five minutes you are going to discover how to install and run Airflow 2.0 with the Celery executor by using Docker. Oh, and by the way, in two seconds only you can subscribe to my channel; this will help me a lot and this will make you a master of Airflow. Now that you have subscribed, without further ado, let's get started.

The first step is to install Docker and Docker Compose. Since we're going to run Airflow with Docker, you have to make sure that Docker is installed on your computer, so just go to docker.com and choose either Mac, Windows or Linux according to your operating system. Once you have Docker installed, you are ready to install Docker Compose. Docker Compose allows you to run multi-container applications, which is exactly the case with Airflow: Airflow has at least three components, the web server, the scheduler and the metadata database, and since we're going to use the Celery executor we will have even more components. So keep in mind that you have to install Docker Compose: go to docker.com, as shown right there, and install Compose by following the instructions. Once you have Docker Compose installed, you are ready to move on to the Airflow part.

First, let's create a new folder called airflow-docker. Inside that folder we are going to download the docker-compose file describing all the services needed by Airflow; that docker-compose file has already been made for us by the Airflow community. So let's type curl -LfO followed by the very long URL that you can find in the description below, and hit enter. Then, if you type ls, as you can see, we have a new file: docker-compose.yaml. Let's open a code editor: type code . if you use Visual Studio Code, open the file docker-compose.yaml, and let me give you a quick explanation about it.

First you have the common settings that will be applied to all services. You have the Airflow image we're going to use, in this case the latest one. Then you have some environment variables in order to customize the Airflow instance: first we specify that the Celery executor will be used; we specify the connection to the metadata database, which is Postgres; the same for the result backend, since we use the Celery executor; and the broker URL is Redis, the broker being in charge of exchanging messages between the scheduler and the workers. Then you have the Fernet key: feel free to put a Fernet key there if you want, in order to encrypt passwords and so on. Basically, if you want to customize your Airflow instance, you can create environment variables right there. Next we have the volumes, and the volumes are super important: the folders dags, logs and plugins will be synchronized between the containers and your host. So whatever you put inside dags will be automatically synchronized into your containers; if you create a new file dag.py and put that file into dags, it will be automatically transferred into the running containers.
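To recap those first steps as commands, here is a minimal sketch. The exact download link is only referenced in the video description; the URL below is the quick-start docker-compose.yaml published in the official Airflow 2.0 documentation, which is an assumption and may differ from the link Marc uses.

```bash
# Create a working folder and download the community-provided compose file.
# URL assumed from the official Airflow 2.0 quick-start docs; the video's
# description link may point to a different (but equivalent) location.
mkdir airflow-docker && cd airflow-docker
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.0.1/docker-compose.yaml'

ls        # should now list docker-compose.yaml
code .    # optional: open the folder in Visual Studio Code
```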
Then we have the user setting; this is important to make sure that the permissions are the same between your containers and your machine. And then you can find the services; the services are all the components needed by Airflow. First the Postgres database, the metadata database of Airflow; then Redis, the broker; airflow-webserver, running on port 8080; airflow-scheduler, the scheduler; airflow-worker, in order to execute the tasks; flower, which you can find here; and finally airflow-init. airflow-init is a little service in charge of initializing your Airflow instance. As you might know, the first thing you need to do when you install Airflow manually is to run airflow db init or airflow db upgrade and then create a user; well, that's exactly what airflow-init does for you: airflow db upgrade is triggered first, and then a new user airflow with the password airflow is created for you.

Okay, now that we understand what we have in that file, let's create the folders dags, logs and plugins. Open a new terminal, make sure that you are inside airflow-docker, and create those three folders with mkdir ./dags ./plugins ./logs. You should then obtain the folders dags, logs and plugins, as shown right there. Next, if you're on macOS or Linux, you might need to export some environment variables to make sure that the user and group permissions are the same between those folders on your host and the folders in your containers. To do this, type echo -e, then between double quotes AIRFLOW_UID equals dollar, parentheses, id -u, backslash n, AIRFLOW_GID equals zero, and redirect the output into a file called .env; this file will be automatically loaded by the docker-compose file. If you open the file .env you should obtain a pretty similar output. The folders are created and the permissions are set.

Now we can initialize our Airflow instance with the airflow-init service. To do this, you just need to type docker-compose up airflow-init and hit enter; this service is in charge of running airflow db init or airflow db upgrade and then creating the user airflow with the password airflow, and that's exactly what you can see here in the output. So let's wait a little bit until it's done. Perfect, the upgrade is done and the admin user airflow has been created, as shown right there.

Now the only thing left to do is to execute the command docker-compose up and hit enter. As you can see, this command runs all the services that are specified in the docker-compose file: the scheduler, the web server, the worker, Redis and so on. Let's wait a little bit until the Docker containers are up and running; if you want to check that, you can open a new terminal and type docker ps. With that command you are able to see all the containers and check whether they are up and running by looking at their statuses. It looks like the web server is healthy, as you can see right there, and we have other containers such as the Flower container, the worker container, Postgres, Redis and so on.

Okay, so let's verify our Airflow instance: open your web browser, open a new tab, type localhost:8080 and hit enter, and you should land on this beautiful page. If you log in with airflow / airflow, we've got Airflow 2.0 up and running using Docker with the Celery executor in only five minutes. So that's how you can, right now, run Airflow with the Celery executor, execute as many tasks as you want, and run your own experiments to discover the new awesome features of Airflow 2.0.
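In command form, the folder setup, initialization and startup described above look roughly like this (a sketch assuming the default quick-start compose file; on Windows you would create the .env file by hand instead of with echo):

```bash
# Create the folders that are mounted into every Airflow container
mkdir -p ./dags ./logs ./plugins

# Make host and container user/group permissions match (macOS/Linux only)
echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env

# One-off service: runs airflow db upgrade and creates the airflow/airflow user
docker-compose up airflow-init

# Start the whole stack: webserver, scheduler, worker, Flower, Postgres, Redis
docker-compose up

# From another terminal: check that the containers are up and healthy
docker ps
```

Once the web server reports healthy, the UI is reachable at http://localhost:8080 with the credentials airflow / airflow.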
Alright, it's time for the bonus section. First, how can you interact with the Airflow command line interface? It's pretty simple: the only thing you need is to go back to your terminal and execute the command docker exec with one of the container IDs of your Airflow instance. Let's use the web server, for example: copy the container ID, paste it there, and then you can access the Airflow command line interface, for example with airflow version if you want to get the version of your instance. If you hit enter, as you can see, we get the version. So the only thing you need in order to interact with the Airflow command line interface is docker exec, then a container ID corresponding to the scheduler, the web server or the worker, and then the command: airflow version, airflow users, or whatever else you want to run from the command line interface.

Now, what about the API? How can you interact with it? Well, if you try to execute the command curl -X GET http://localhost:8080/api/v1/dags, so you want to list the DAGs, and you hit enter, as you can see, we get "unauthorized". So how can we fix this? Open your file docker-compose.yaml, and right there you have to create a new environment variable called AIRFLOW__API__AUTH_BACKEND with the value, between single quotes, airflow.api.auth.backend.basic_auth. By specifying that backend, you will need to provide the username and the password of a user in order to interact with the API; by default, all requests made to the API are denied, and that's why we got "unauthorized" right there. Once you have specified this backend, save the file and restart your Airflow instance with docker-compose down and docker-compose up. Perfect, now Airflow is restarted. Let's open a new terminal, and if we type docker ps to check the containers, you can see that the web server is still starting, so we have to wait until it is healthy. Let's type docker ps again. Alright, now it is healthy, so we can try the command curl -X GET again, but this time we specify --user, with airflow as the username and airflow as the password. Hit enter, and as you can see, we get the list of all the DAGs. That's how you can interact with the API and run some experiments with it.

I hope you enjoyed this video. I've put a lot of work into it, so if you want to help me, the only favor I ask is that you subscribe to my channel; this will help me a lot. Have a great day and enjoy Airflow 2.0!
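As a quick reference, the bonus-section commands sketched above look roughly like this (the container ID is a placeholder you would copy from docker ps, and the auth backend line is the one added to the environment section of docker-compose.yaml in the walkthrough):

```bash
# Run Airflow CLI commands inside a running container
# (<container-id> is whatever `docker ps` shows for the webserver,
#  scheduler or worker)
docker exec <container-id> airflow version
docker exec <container-id> airflow users list

# Call the REST API; without an auth backend configured, the request
# is rejected as unauthorized
curl -X GET "http://localhost:8080/api/v1/dags"

# After adding
#   AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
# to the environment section of docker-compose.yaml and restarting the stack
# (docker-compose down && docker-compose up), authenticate as airflow/airflow:
curl -X GET --user "airflow:airflow" "http://localhost:8080/api/v1/dags"
```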
Info
Channel: Data with Marc
Views: 151,288
Keywords: airflow, apache airflow, docker, docker compose, airflow 2, airflow 2.0, airflow local, airflow docker, airflow docker compose
Id: aTaytcxy2Ck
Length: 11min 55sec (715 seconds)
Published: Tue Feb 02 2021