Power BI Dataflows and How Are They Different From Datasets?
Updated: Feb 14, 2021
(Iguazu Falls in Argentina - largest waterfall in the world)
What are Dataflows?
Dataflows are a new feature released in September of 2018 for Power BI. Just going by the name, it's not exactly clear on what it is though, and I think the name doesn’t justify what it actually does for us. A more appropriate name would have been, “Power Query Online”. That’s just what it is; Power Query on the web at powerbi.com. It does all the same things as Power Query for desktop.
Who are they for?
Well, if you’re on a small team or in a small business, this may not mean much or provide much value at all. However, if you’re in that sweet spot of just the right size, this may be excellent news for you:
Your company is big enough to have multiple dashboards and workflows covering several areas of the business.
But you don't have an abundance of expensive other tools and databases/data warehouses, or the budget of an expensive IT team.
In other words, if your company is using Power BI as an all inclusive reporting solution, this is for you. In fact, Microsoft is currently marketing this as a "Self-service ETL" program without the need for professional tools - "Low Code, No Code". Check out Microsoft's own blog on that: https://powerbi.microsoft.com/en-us/blog/introducing-power-bi-data-prep-wtih-dataflows/
What's the point?
Okay, so what's the point? The bottom line is this: Dataflows allow Power BI developers to reuse & recycle queries for use in other dashboards. This can save time in the development process by enabling you to reference a previously created data transformation process:
Effectively eliminates the need to recreate a query.
Also creates a single, standardized query to use for all other dashboards - a so called "Certified Entity".
I summarized it with the following diagrams below...
Issue of the same, multiple-step query being made for several dashboards:
With Dataflows, you can create one query to reference across multiple dashboards:
Okay, okay... How is this different than Power BI Datasets?
With Power BI Datasets, you are only allowed to connect to one data source; one dataset. This includes the entire data model, which may include more than one table.
Once you connect to a Power BI Dataset, you are unable to create another source / query. That means you can't connect to an Excel file, CSV file, Web source... anything. Once you import a Dataset - that's it. This works the other way around too. If you already have a connection to anything, like an Excel file for example, you cannot create a connection to a Power BI Dataset.
Alien Tip: That's one of the key differences between Datasets and Dataflows. Where Datasets provide an unchangeable data model, Dataflows provide a pre-prepared, modifiable table (called "entities") for your existing desktop data model.
To get started with creating your own Dataflows...
I highly recommend you check out this blog by Matthew Roche: https://ssbipolar.com/2018/10/20/dataflows-in-power-bi-overview-part-2
He does an excellent job of explaining everything you need to know about getting started all the way to execution.
I do want to leave you with this... In order to use Dataflows, you'll need either a Pro or Premium license. They also need to be created in the App Workspace, not the "My Workspace".
If this helped you or you learned something, please share with your friends and colleagues!
Want to improve your Power BI mastery? Check out Mastering Microsoft Power BI!