Getting Started with Prefect | Task Orchestration & Data Workflows
A big challenge for many data teams is orchestrating all of the tools within their data stack.
Fortunately, there are tools designed to address this exact issue (task orchestration) and make our lives as data engineers easier.
So in today's video, I want to show you Prefect, a great open-source task orchestration tool that lets you orchestrate and monitor your entire stack from a single location.
It has a great UI, is designed with data tools in mind and only requires basic Python knowledge to get started.
By the end of this video you'll understand what Prefect is all about and how you can start using it in your own data stack.
Thank you for watching!
►► The Starter Guide for The Modern Data Stack (Free PDF)
Simplify the “modern” data stack + better understand common tools & components → bit.ly/starter-mds
Timestamps:
00:00 - Intro
00:44 - What is Prefect?
02:32 - Install Prefect
04:48 - Create a Python Script
06:44 - Add a flow
07:42 - Add a task
08:45 - Add a subflow
09:49 - Intro to Deployments
12:03 - Create a Deployment
15:17 - Start an Agent
16:48 - Use Prefect Cloud
20:25 - Using Blocks
23:09 - Version Control & Storage w/ GitHub
26:07 - Automations & Task Concurrency
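As a rough companion to the flow, task and subflow chapters above, here is a minimal sketch in the spirit of a "hello world" script. It is not the exact code from the video: the names `build_greeting`, `shout` and `hello_flow` are hypothetical, and it assumes Prefect 2.x or later, where `flow` and `task` are importable from `prefect`. The fallback decorators only exist so the sketch still runs where Prefect is not installed.

```python
# Minimal sketch: a parent flow that calls a task and a subflow.
# Assumption: Prefect 2.x or later is installed (`pip install prefect`).
try:
    from prefect import flow, task
except ImportError:
    # Fallback so the sketch still runs without Prefect:
    # pass-through decorators that return the function unchanged.
    def flow(fn):
        return fn

    def task(fn):
        return fn


@task
def build_greeting(name: str) -> str:
    """A task: one unit of work that Prefect tracks inside a flow."""
    return f"Hello, {name}!"


@flow
def shout(message: str) -> str:
    """A subflow: a flow that is called from inside another flow."""
    return message.upper()


@flow
def hello_flow(name: str = "world") -> str:
    """The parent flow: runs the task, then passes its result to the subflow."""
    greeting = build_greeting(name)
    return shout(greeting)


if __name__ == "__main__":
    print(hello_flow("Prefect"))
```

Saved as a regular Python script, this can be run directly with `python`. With Prefect installed, the flow run and its task and subflow runs should also show up in the Prefect UI.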
#kahandatasolutions #dataengineering #prefect
Prefect explanation! 🤣 Amazing content, easy and simple to follow. Thanks for your effort, it helped me a lot! 💛
Great tutorial + narration by Louis CK!
Wow, I love this tutorial. Thank you for the videos. I have been receiving your updates by email. Thanks, man
I'm really enjoying your video, bro! Keep up the great work!
Much appreciated!
Hello @Kahan, first, I want to thank you for the great video! It helped me start with Prefect from scratch. I just have one headache regarding the agent part. As you showed, I deployed my workflow to Prefect Cloud, but it kept saying that my work queue is unhealthy. I figured out that this could be because I had not started an agent. I then started an agent in Visual Studio Code using the command you showed. But once I exit Visual Studio Code, the agent is terminated, so my workflow can't run on a schedule. Is there any way to fix this problem (e.g. keep the agent always running, independent of VS Code)? Thank you very much!
Masterful
Great video on Prefect! It's clear how it helps with managing data tools. How would I add Prefect to an existing project that runs sequential tasks using OpenAI's API assistants?
I like your videos; they are simple and easy to follow. I am moving into a data architect / data engineering role. Is there a way to share the first screen, grouped by BI, Transform, Extract, etc.? Thanks
Would you mind doing a tutorial on the difference between a Prefect future and a Prefect state, and when you would use each in a pipeline?
As always, this is another amazing video. Thank you very much, Michael! Oh, I've got a question: the image (I assume it's a website) from 0:03, can you share that link please? Thanks in advance
Thanks!
Thank you so much for the Super Thanks Sudhansku!
Thank you so much. I am currently looking for a simple solution to create a CI/CD ETL pipeline :D I think I would use Prefect with GitHub and Docker
That sounds like a great idea!
Mind-blowingly succinct tutorial, better than the official Prefect docs. Love how you used a simple "hello world" script as an example. Your description of deployments and agents was particularly helpful. Would love to see an example using a Docker container as storage in the future. Will show this to other programmers on my team. Thank you!
Very good video, but it leaves me with some questions. How do I work with .sql scripts defined in a specific path? Could you give me an example? And how can I make them execute at a certain time, like Airflow, which works within the same day?
Is there anything like this, but in Java? Thanks
I wish Prefect would add more to their documentation, or maybe create something similar to Dagster University for better understanding of their product :/
Good intro. 😊
Thank you Kahan, great starting point for Prefect. One question, in case you have some input: I am building workflow capability into my application. I want to allow my app's users to customize and set up their own business workflows. As part of each workflow, they can set up different tasks to run from a task pool I am giving them. I came across Prefect and was wondering if it applies in my case. I understand the usage of Prefect for building more static flows, or for development / infrastructure teams automating tasks. But I am wondering whether using it to implement a service where users customize their own workflows is good practice. Prefect's task state management, ordering, retry policies, etc. would be useful, but I want to make sure I am not adding unreasonable overhead to my app. If you have any input on that, I would really appreciate it. Thank you!
Is "prefect server start" the same as "prefect orion start"?
Excellent tutorial, thank you for your efforts. I've been trying to get Prefect standardized in my company, but I was shaky in the deployments/agents/blocks areas. One complaint I have (which is really minor, but painful for me) is that there is too much mouse movement on the screen. PLEASE don't move your mouse so much on the screen. It hurts my eyes, and I have a headache after watching for more than 5-10 minutes. Other than this minor nitpick, I appreciate your videos very much. Keep up the good work.
Very interesting
Thanks for watching
perfect
Appreciate it!
Finally :D
I tried running my flow code from GitHub, but it doesn't work. Does my Prefect server have to be running on Prefect Cloud for this to work?
Hi, I hope you can help me. When I run the Python file, I get an error: ImportError: cannot import name 'SecretField' from 'pydantic'. How can I fix this?
Flyte or Kedro are good alternatives
How do I create a SQLAlchemy block?
I tried prefect block register -m prefect_sqlalchemy but I'm getting a value error
You have some hope till you guys can keep Embiid, trust the process bro 😂
Prefect vs Airflow?
Both great options. Although I personally have more experience with Prefect.
I think, compared to Airflow, Prefect looks a lot simpler to implement
Is there an alternative to Prefect Cloud? Could that portion be implemented on an EC2 instance?
@as978 Prefect Server is the OSS equivalent of Prefect Cloud. You could deploy it wherever you want (directly on a VM, Kubernetes, etc.)
Our CTO was once asked what Prefect was.. "Well you know Airflow? Well it's like Airflow, but actually Good" I can't substantiate that claim, as I don't have enough experience with them, but it was our go-to tool for our Data Platform
Prefect Cloud is useless, as 90% of companies will not allow you to use any cloud solutions (except Google, Amazon, and Microsoft, of course)