Do You Need Relationships For Power BI? A Look Into The Disconnected Model Paradigm.
This may be blasphemous to even ask, but... do you really need to join your tables in the first place?
You may have heard of disconnected tables. These are tables that exist in your model but do not have any relationship to any of your tables (active or inactive). A disconnected model, is just that - a model where all tables exist with no relationships.
As we'll discover below, while this kind of model is an inefficient way to use Power BI, the disconnected model paradigm will help us to better understand how PBI relationships work and allows us to appreciate the efficiencies relationships bring to the table - you can think of it as a nice thought experiment (one where you can actually put your thoughts to data!).
So, back to the original question: Are relationships necessary? To answer this, we first need to ask ourselves: Why have a relationship in the first place? Put simply, a relationship allows users to automatically slice and dice data in reports based on dimensions and makes filtering happen automatically. In Microsoft's words on relationships, "Relationships between those tables are necessary to accurately calculate results and display the correct information in your reports.".
Indeed, relationships are necessary for accurate results. Let's take a look at a simple example. Sales and Customers.
This active relationships allows us to build a visual with dimensions slicing the numbers very easily, as show below.
Okay, let's see how this would look if we remove the relationship.
Sales Amount is now showing the same number in each cell. This is expected, because no relationship exists between the two tables. How do we get the correct result? If you know a bit of DAX, you know we can filter the data using CALCULATE. So let's create the following measure:
CALCULATE( [Sales Amount], 'Sales'[CustomerKey] IN VALUES( 'Customer'[CustomerKey] ) )
Let's put this new measure on the report and see how it looks.
Okay okay, it works.. Fair enough... But this fails when we need to slice by a different dimension... let's say Products:
As you may have realized, with each disconnected table, you will need a specific measure for each dimension in your model. Furthermore, if you want to slice by more than one dimension on a visual (highly likely since you are using Power BI in the first place), you'll need to create a measure with a filter for all of the desired dimensions. For example, if you want to slice by Customers AND Products, you'll need a measure like the following:
CALCULATE( [Sales Amount], Sales[CustomerKey] IN VALUES ( Customers[CustomerKey] ), Sales[ProductKey] IN VALUES ( Products[ProductKey] ) )
And you will indeed get the correct results in your visuals.
The Disconnected Model paradigm isn't looking good so far. And performance wise, it gets worse. Comparing the query plans for using a relationship versus not, it's quite clear who the winner is. Take the following results for Sales by Customer:
Query plan using the relationship
Query plan WITHOUT using the relationship (using the CALCULATE() and VALUES() method above)
As we can see, the disconnected model approach ends up generating 5 storage engine queries for just a simple calculation grouped by customer. By contrast, using a relationship generates just one query.
By the way, we could also use TREATAS instead of VALUES IN to achieve the same result, like the following:
Sales by Customer = CALCULATE( [Sales Amount], TREATAS ( VALUES ( Customers[CustomerKey] ), Sales[CustomerKey] ) )
Performance wise it seems TREATAS is sub-optimal, producing more storage engine queries and about twice as slow.
Summary Results of Performance differences:
NOTE: This dataset is using a subset of the Contoso data model. Your results will likely be different on a different model.
In this article, we've discovered that you do not need relationships to build a Power BI dashboards. However, it comes at a severe cost; 1) Your DAX becomes more complex and 2) Your DAX becomes less efficient.
Though it may be obvious to use relationships (auto-detect relationship is enabled by default, after all), knowing why relationships are helpful gives us more insight into how Power BI works and, hopefully, gives us more joy with that insight.