PiterPy #5 / Michal Karzynski: "Developing elegant workflows with Apache Airflow" / Saint Petersburg, Russia, Online / 2 November 2018 - 4 November 2018

2 November 2018 (Fri), 09:00 - 4 November 2018 (Sun), 18:00

Developing elegant workflows with Apache Airflow

Every time a new batch of data comes in, you start a set of tasks. Some tasks can run in parallel, some must run in a sequence, perhaps on a number of different machines. That's a workflow.
Did you ever draw a block diagram of your workflow? Imagine you could bring that diagram to life and actually run it as it looks on the whiteboard. With Airflow you can just about do that.
Apache Airflow is an open-source Python tool for orchestrating data processing pipelines. In each workflow, tasks are arranged into a directed acyclic graph (DAG). The shape of this graph determines the overall logic of the workflow. A DAG can have many branches, and you can decide at execution time which of them to follow and which to skip.
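To make the DAG idea concrete, here is a minimal sketch in plain Python. It does not use Airflow at all: it relies on the standard-library graphlib module (Python 3.9+), and the task names and the execution_waves helper are made up for illustration. It only shows how the shape of the graph decides which tasks must wait and which could run side by side.

```python
import graphlib

# Hypothetical workflow: each key is a task, each value is the set of tasks it depends on.
workflow = {
    "extract": set(),
    "clean": {"extract"},
    "validate": {"extract"},
    "load": {"clean", "validate"},
}


def execution_waves(dag):
    """Group tasks into waves; tasks in the same wave have no unmet dependencies."""
    sorter = graphlib.TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


waves = execution_waves(workflow)
print(waves)
```

Here "clean" and "validate" land in the same wave because both depend only on "extract", while "load" has to wait for both of them.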
Splitting a workflow into separate tasks makes the design resilient, because each task can be retried multiple times if an error occurs. Airflow can even be stopped entirely, and running workflows will resume by restarting the last unfinished task. Logs for each task are stored separately and are easily accessible through a friendly web UI.
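The retry behaviour can be sketched in a few lines of plain Python. This is an illustration of the idea, not Airflow's implementation; in Airflow itself retries are configured per task through settings such as retries and retry_delay. The run_with_retries helper and flaky_task below are hypothetical names.

```python
import time


def run_with_retries(task, retries=3, delay_seconds=0.0):
    """Call task() up to retries + 1 times; re-raise the last error if every attempt fails."""
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise  # no attempts left, surface the failure
            time.sleep(delay_seconds)  # wait before the next attempt


attempts = []


def flaky_task():
    """Hypothetical task that fails twice with a transient error, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient failure")
    return "done"


result = run_with_retries(flaky_task, retries=3)
print(result, len(attempts))
```

Because only the failing task is repeated, work already completed by earlier tasks does not have to be redone.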
In my talk I will go over basic Airflow concepts and through examples demonstrate how easy it is to define your own workflows in Python code. We'll also go over ways to extend Airflow by adding functionality through custom task operators and plugins.
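As a rough picture of what a custom task operator looks like, the sketch below uses a toy base class written for this example. It is not Airflow's BaseOperator, although it borrows two conventions Airflow uses: an operator does its work in an execute(context) method, and the >> operator declares that one task runs after another. The class names, the run_chain helper, and the context keys are all assumptions made for illustration.

```python
class Operator:
    """Toy base class for illustration only; it is not Airflow's BaseOperator."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # "a >> b" records that b runs after a and returns b, so chains can be written inline.
        self.downstream.append(other)
        return other

    def execute(self, context):
        raise NotImplementedError("subclasses implement the task's work here")


class CountRowsOperator(Operator):
    """Hypothetical custom operator: counts the rows in the current batch."""

    def execute(self, context):
        context["row_count"] = len(context["rows"])
        return context["row_count"]


class ReportOperator(Operator):
    """Hypothetical custom operator: formats a one-line report from the row count."""

    def __init__(self, task_id, label):
        super().__init__(task_id)
        self.label = label

    def execute(self, context):
        return f"{self.label}: {context['row_count']} rows"


def run_chain(first, context):
    """Run a simple linear chain of tasks starting at `first`; results are keyed by task_id."""
    results = {}
    task = first
    while task is not None:
        results[task.task_id] = task.execute(context)
        task = task.downstream[0] if task.downstream else None
    return results


count = CountRowsOperator("count_rows")
report = ReportOperator("report", label="batch")
count >> report  # report runs after count

results = run_chain(count, {"rows": ["a", "b", "c"]})
print(results)
```

The runner here only follows a single linear chain; a real scheduler would walk the whole graph and handle branches, retries, and parallel execution.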

Michal Karzynski
Gdansk, Poland
Tech Lead Software Engineer

Michal Karzynski has a scientific research background in the areas of molecular biology and bioinformatics. He is currently working as a Machine Learning software engineer, tech lead and consultant. He also has web development experience and spent many years writing code in Python and JavaScript.
Michal loves Linux and everything open source. He's currently working on nGraph, Intel's runtime and graph compiler for Deep Learning. He wrote "Webmin Administrator's Cookbook", a book on Linux server administration. As a consultant, he was responsible for designing and deploying cloud infrastructure for a number of companies.
Michal is currently employed as a tech lead at Intel. He also runs the consulting company Atarnia.com and writes a blog.
