I’ve started reading the latest Manning book about Airflow. So far, only 3 chapters have been released, which is too few to get an overall idea of what the book will cover.
Yet the first chapters are quite easy to read. The authors lean on Docker (which is not so common in the books I have read), which makes testing much easier. The basic concepts of Airflow are explained without too much detail, but with enough to understand them.
I’m looking forward to reading the rest of the book. In the meantime, for those who would like to test Airflow with Docker without buying the book, here is a small Dockerfile I have written that should get you started.
FROM python:3.6-slim
RUN apt-get update -y
RUN apt-get install -y gcc
RUN pip install apache-airflow
# initdb is the Airflow 1.x command; Airflow 2.x renamed it to "airflow db init"
RUN airflow initdb
RUN mkdir -p ~/airflow/dags
# This part writes the launch script, so that everything lives in one file.
# Feel free to extract it into a separate file and COPY it in if you find that cleaner.
RUN echo '#!/bin/bash' > launch.sh
RUN echo 'nohup airflow scheduler &' >> launch.sh
RUN echo 'airflow webserver -p 8080' >> launch.sh
RUN chmod +x launch.sh
EXPOSE 8080
CMD ["bash", "launch.sh"]
Note that the web interface is exposed on port 8080, and DAG files are located in /root/airflow/dags.
Build the image this way:
docker build -t airflow .
Run the Airflow container this way:
docker run -d --name airflow_container -p 8080:8080 airflow
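Since the DAG files live in /root/airflow/dags inside the container, one possible variant is to bind-mount a folder from the host onto that path, so DAGs can be edited without rebuilding the image. This is only a sketch: the host folder name `dags` and the `DAGS_DIR` variable are my own assumptions, not something the Dockerfile requires.

```shell
# Hypothetical host folder for DAG files (an assumption; pick any folder you like)
DAGS_DIR="$PWD/dags"
mkdir -p "$DAGS_DIR"

# Remove a previous container with the same name, if there is one
docker rm -f airflow_container 2>/dev/null || true

# Same run command as before, plus a bind mount onto the container's DAG folder
docker run -d --name airflow_container -p 8080:8080 \
  -v "$DAGS_DIR:/root/airflow/dags" airflow \
  || echo "Could not start the container; has the image been built?" >&2
```

Any file dropped into the host folder then shows up in the container's DAG folder, where the scheduler should pick it up on its next scan.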