- Yooo! Adam Saxton with Guy in a Cube, and in this video, we are gonna look at how do you get data in Power BI, 'cause there's different ways to do it from importing, to DirectQuery, to live connections, to more. What's it all mean? Let's find out. (upbeat music) If you're finding us for the first time, be sure to hit that subscribe button to stay up to date with all the videos from both Patrick and myself. Getting data inside of
Power BI, like I said, it can be a little confusing with all the different
terms that are going around, but we're gonna break that down. Let's start off with import. When we say we're importing data, this is also referred to as cached data. What it means is that
we're actually pulling data from your data source,
bringing it into Power BI as a copy of that data, and then it sits inside of
your Power BI Desktop file. If you then publish that to
the service, that dataset where the data is, resides
inside of Power BI. And when we query items
from a visual perspective, it's hitting that cached data. One of the biggest benefits to
this is that it's super fast. And there's data compression involved, so you can get a lot of data
in a tiny little bit of space. Another benefit that we have
here is when we go to get data and we look at all of the data sources that are part of Power BI, you'll see that there's a
lot of data sources here. All of these are available
from an import perspective. So there's no restrictions on that. Even if you're doing ODBC or anything, connecting to any data and bringing it in, you have full capabilities of Power BI, including all of the DAX functions, all the relationship types,
all of the stuff you can do in Power Query, everything
is available for you 100% in import mode. This is the default option, and usually it's the option
that I recommend to folks. And you may be thinking or
you may have heard like, "Wow, yeah, I can only do a million rows. "That's all I can do." No, you can do a lot in import. I'm working with one customer where they've got inside of Power BI 1.5 billion rows imported. They're actually trying
to push it even bigger. So you can go a long way. Let's take a look at an
actual data source here. So if I go to get data, we're just gonna go to SQL Server. And when we put in our server
and database information, you'll have another option here
for data connectivity mode. So you've got an option
of import or DirectQuery. Import's the default option, and that's gonna bring in that
cached version of the data. DirectQuery is also an option. We'll talk about that in a second. So, import is available for
everything, like I mentioned. If you are not using premium
capacity, your dataset size, so this is the on-disk size
of the Power BI Desktop file, can only go up to one gig in size when you go to publish
that to the service. So that's gonna be one restriction from an import perspective and something you need to be mindful of. If you are using premium, you can go up to 10 gigs,
which is the default. You can even go beyond 10
gigs with large model support up to what your premium
capacity can actually manage. One pro tip I'm just
gonna throw out there, from a modeling perspective when you're doing import of the data is keep it simple, keep it clean. I've got a link in the
description below to a document in the guidance section
of the Power BI docs for star schemas. Read up on this. Make sure you're familiar
with what this is, fact tables, dimension tables, how to relate those things together. It is very important for
the best performance. Don't just do one giant
single table, please, I beg. All right, let's look at direct queries. In import, I mentioned that we
copy the data into Power BI. With DirectQuery, it's a little different. What we actually do is
we're going to connect to the data source and we
grab the schema of the data, so the table structure, the column names, that type of information. And that is stored inside
of Power BI Desktop in the data model, but the data itself
stays at the data source. So this is beneficial for
a lot of different reasons. One of the biggest reasons
and one of the limitations I called out in import is your size. If you're working w%ith massive data, DirectQuery can help
you get past that limit. Going back to my connection options, all it is is a flip of a switch, and we're gonna do
DirectQuery at that point. Then it will actually pull in that schema and not the data itself. One thing to realize
is not all data sources support DirectQuery. I've got another link
in the description below with the Power BI data source list, and this actually calls out
which ones support DirectQuery and which ones do not. Typically, it's going to be your relational type data sources. So just be aware of that. One huge word of caution
I'm gonna throw out here with DirectQuery, be careful. So typically when folks call us up and they say they've
got performance issues, the first question we ask is,
are you using DirectQuery? 'Cause that's usually what the problem is. People tend to do DirectQuery against some sort of operational database, or something that's not in
that star schema format, or not modeled really clean. Other things that you'll
hit with DirectQuery is you're gonna have some limitations from a Power Query perspective,
as well as DAX itself. Please do not use the time
intelligence functions. As of the recording of this video, they're probably not
gonna be as performant, and the queries they're gonna generate are not gonna be the best,
'cause they're very complex in what they have to do. Going back to the performance
piece on DirectQuery, typically the conversations
I have is customers will say, "No, we've modeled this correctly. "It's amazing. "The database is fast." And we go look at the
queries and it's not. It's all gotta be sub-second
for a Power BI report and interactive query load. Remember, every visual on the canvas can generate one or more queries to that data source over DirectQuery, so you gotta be mindful of that. All right, the next connection
type is this live connection. What is that, and why is it
different from DirectQuery? There is an actual difference. So live connect is specific
to when we're connecting to analysis services
or a Power BI dataset, so this could be Azure Analysis Services, Analysis Services on prem,
or the Power BI dataset. And the reason for this is
it is the gold standard, so Power BI, the heart of Power
BI is that VertiPag engine, which is used across
all of those properties. And so all this is, I
mentioned a DirectQuery, that we copy over the schema. From a live connection perspective, what we actually have is
just the connection string. Everything is kept on the model itself. We don't anything over
to Power BI Desktop. So, for example, if I
go to Power BI datasets, and let's say that I
just want my sales model, and we'll say Create. And what this is gonna do is we've got that connection
string to that dataset. Another thing you'll mention
over on the side here, I've only got my Report tab, and I've only got my model view. I don't have the data view, because that's just not available for my live connect perspective. And by default here, you can't
make changes to that model. That model's an external source, and from Power BI Desktop's perspective, you're not gonna be able
to make those changes. If you do, this is where
composite models come into play. So, composite models is a
feature where it allows you to mix and match the connection type. So, we've got DirectQuery,
we've got import, and now, as of the December
2020 build of Power BI Desktop, you can also mix and match
over live connections to Power BI data sets and
Azure Analysis Services, which is amazing. Just as a quick example here, maybe I wanna import an
item like this CSV file. Here. it's gonna prompt you, hey, because we have a live connect, you have to change this over
to a DirectQuery connection. So now I'm doing DirectQuery
to the actual AS model or the Power BI dataset. And so, we're gonna create a local model, which has that schema now in references to the model that is the Power BI dataset. So let me go ahead, and
we'll bring in my CSV file. I'm just gonna import that. So now I've got an imported table combined with my live connect table, and now they can live together. And so if I go to my model view,
now these are all together. We can see the blue is my live connect, and then my sub category
table is over here. And I can do what I need
to do to mash up that data. Be sure to check out the
video I did with Tessa Hurr where we go a little more in depth about what the DirectQuery
over Power BI datasets can do. You can take advantage of other items with composite models as well. So, when we look at
aggregation or summary tables, that's imported, and then we
use the DirectQuery table. So this helps you work
with that massive data. So layers are gonna be your
friend with massive data. We've got that imported summary table, which will handle maybe
90% of the query operations when you wanna get to that detail row, it works on the DirectQuery table, and you can satisfy
those business requests in your report itself. What do you guys think? Did this help clear up what the different connection types are and what you can do inside of Power BI? Check the description for links to other resources on these topics. And let me know in the
comments below what you think. If you like this video, be sure to hit that big thumbs up button. Smash it, if you so desire. If it's your first time here,
hit that subscribe button. And, as always, from
both Patrick and myself, thank you so much for watching. Keep being awesome. And we'll see you in the next video.