===== Apache Airflow =====

Available as a SaaS on AWS.

Example of the UI:

{{youtube>iTg-a4icf_I?medium&start=32}}

More info: https://medium.com/apache-airflow/apache-airflow-2-0-tutorial-41329bbf7211

=== Difference: Step Functions vs. Apache Airflow ===

https://stackoverflow.com/questions/64016869/airflow-versus-aws-step-functions-for-workflow

I have worked with both Apache Airflow and AWS Step Functions, and here are some insights:

  * Step Functions are maintained for you out of the box, with the high availability and scalability your use case requires. With Airflow, we have to provide that ourselves through auto-scaling and load balancing on servers or containers (Kubernetes). See the MWAA update below.
  * Both Airflow and Step Functions have user-friendly UIs. Airflow supports multiple representations of the state machine, while Step Functions only display the state machine as a DAG.
  * As of version 2.0, Airflow's REST API is stable. AWS Step Functions are also supported by a range of production-grade CLI tools and SDKs.
  * Airflow has server costs, while Step Functions include 4,000 free steps (state transitions) per month in the free tier and cost $0.000025 per step after that. For example, if you use 10K steps for AWS Batch jobs that run once daily, you will be charged $0.25 per day (about $7.50 per month). The price of an Airflow server (t2.large EC2, 1-year reserved instance) is $41.98 per month. We have to use AWS Batch in either case.
  * AWS Batch integrates with both Airflow and Step Functions.
  * You can clear and rerun a failed task in Apache Airflow, but in Step Functions you have to create a custom implementation to handle that. Automated retries with back-off can be configured in the Step Functions definition itself.
  * For a failed task in Step Functions, you get a visual representation of the failed state and a detailed message when you click it. You can also use the AWS CLI or an SDK to get the details.
  * Step Functions use easy-to-write JSON for the state machine definition, while Airflow uses a Python script.
  * Step Functions support async callbacks, i.e. the state machine pauses until an external source notifies it to resume. Airflow has yet to add this feature.

Overall, I see more advantages in using AWS Step Functions. You will have to weigh the maintenance cost and development cost of both services for your use case.

== UPDATES (AWS Managed Workflows for Apache Airflow Service): ==

  * Maintenance: with the AWS Managed Workflows for Apache Airflow (MWAA) service, you can offload deployment, maintenance, auto-scaling/load balancing, and security of your Airflow service to AWS. Do consider which version you are willing to settle for, as AWS managed services are mostly behind the latest release (e.g. as of March 08, 2021, the latest version of open source Airflow is 2.0.1, while MWAA offers version 1.10.12).
  * Cost: MWAA charges for the environment, instances, and storage. More details here: https://aws.amazon.com/de/managed-workflows-for-apache-airflow/pricing/
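
The Step Functions cost figures quoted in the comparison above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes the prices from the answer ($0.000025 per step, 4,000 free steps per month, $41.98 per month for the Airflow server) are still valid and assumes a 30-day month. The function names are made up for this example. Note that the answer's own $7.50 figure does not subtract the free tier; with it applied, the monthly charge comes out slightly lower.

```python
from decimal import Decimal

# Prices as quoted in the comparison above (assumption: check current AWS pricing).
PRICE_PER_STEP = Decimal("0.000025")         # USD per step beyond the free tier
FREE_STEPS_PER_MONTH = 4_000                 # free-tier steps per month
AIRFLOW_SERVER_PER_MONTH = Decimal("41.98")  # USD, t2.large 1-year reserved


def step_functions_cost(steps_per_day: int, days: int = 30,
                        apply_free_tier: bool = False) -> Decimal:
    """Estimate the Step Functions charge in USD for `days` days of usage."""
    total_steps = steps_per_day * days
    if apply_free_tier:
        # The free tier is a monthly allowance, so only use it for ~1 month.
        total_steps = max(0, total_steps - FREE_STEPS_PER_MONTH)
    return total_steps * PRICE_PER_STEP


def break_even_steps_per_month() -> int:
    """Billable steps per month at which Step Functions cost equals the server."""
    return int(AIRFLOW_SERVER_PER_MONTH / PRICE_PER_STEP)


if __name__ == "__main__":
    print(f"per day:        ${step_functions_cost(10_000, days=1):.2f}")
    print(f"per month:      ${step_functions_cost(10_000):.2f}")
    print(f"with free tier: ${step_functions_cost(10_000, apply_free_tier=True):.2f}")
    print(f"break-even:     {break_even_steps_per_month():,} billable steps/month")
```

Under these assumptions, Step Functions stay cheaper than the single Airflow server until the workload reaches roughly 1.68 million billable steps per month. This ignores AWS Batch costs, which apply in either case, and the development and maintenance effort mentioned above.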