Excel has various tools that can help you easily clean up your raw
data for further analysis. So apart from using
formulas and formatting, there are two useful Excel features that let you easily clean up. One method is text to columns, and the other is importing external data. What I've seen people do when
they export data from a system and they want to bring it into Excel is that they copy and paste it in there. Don't do that. Because you can run into problems. One of the problems can be with the decimals and
thousands separators. Because especially if you're dealing with international companies because in the US the decimal is a dot and the comma is a thousand separator, and generally in European countries it's the other way around where the comma is the decimal and a dot is a thousand separator. It's always safest to import it properly using Excel's wizard because
you can control this. What you can also do if
your data is on the web is that you can import it
directly using a web query, which you can actually set
to refresh automatically based on your preferences. Now let's have a look at these in action. Let's start with text to columns first. So in this case I have a very
small sample, a full name, and I want to split them
to first name and last name into two separate columns. Many people are familiar
with this feature, but they actually forget to
use it when the time comes and instead they try to
use formulas to do this, which can be more time consuming. But instead, what you can do
is to highlight the column that you want to split, and note
that it has to be one column. So it can't be many columns like this. Everything has to be in one place. And then you go to Data, Text to Columns. And here you kind of see
a preview of your data. As a first option in this wizard, you have to select if it's
delimited or fixed width. Generally things will be delimited. Fixed width is when everything would have the exact same width. So everything would be four
characters, let's say, here, and then here, five. If it's not, if there's some
other logic that you can use to make the split, you will use delimited. So then we go Next. And here we have the option
to select what this logic is. So in our case the logic is a space. You have these options here, but if you have some other
split that is not here, so let's say you have this one, you can actually type it in here. It looks good in the preview. In the last step, we get to
choose the column data format. So if these should be
formatted specifically as text or general or date, we
can select each column and select the formatting here. In this case, I'm fine
with Excel's general. So I'll leave that as is. Here you can select the destination. So if you just leave it as is, it's going to overwrite what you had here. If you don't want it to overwrite, select another destination here. Okay, and then you say Finish. And you have your names split. As a next step I'm going to show you how you can import external data in the format of a CSV file or a data file or a Notepad file. Usually you can get data
extracts from any system. So whether you have a finance
system, an ERP system, you're able to get some
type of data extract in a predefined structure
into a Notepad file. That can have many names. So some people call it data files or CSV files for comma-separated files. They're all the same thing. So as long as you can
get some type of file with some structure you
can import it in Excel. And you should do that using
the wizard from the Data tab. In this example, I'm just
going to show you the file that we're going to import. It's a very small sample of data, but it has some structure and some logic, as you can see here. So going back to our Excel, I'm
going to go to Data From Text. This is something you
have to be careful with because Excel's filter always
defaults to text files. And those should have
an extension like .csv. If your file doesn't have that,
and in my case it doesn't, it's a .dat file, DAT file, I can't see that until I
change it to All Files. So just remember this when
you are importing text files. That's the one I need. I'm going to import. So here I have to choose if
it's delimited or fixed width. In this case I can see
it's not a fixed width. There's some logic that's
giving me the split, and that's a comma. So I'm going to go Next. And here it's not a space, my delimiter, but instead it's a comma. So I'm going to select that. Next. Here again I can select each column, and I can say is this a text, a date, or I leave it as general. In most cases general is
fine, but there are cases when you might have product
labels starting with zeroes. So it's like zero, zero, zero, something. And if you leave that as general, Excel is going to strip out
the zeroes when it imports because it's going to think it's a number. So if you don't want that to happen, do highlight that column
and change this to Text. That way you can keep your zeroes. The other good setting
that you can control here is this one, and here is
what I mentioned before, where you can control your
decimal and thousand separator. So if your Excel is different
to your source system, you can actually change that here. In my case it's fine. I'm going to leave that as is, and Finish. And now I have the choice
of importing right here if I want to or I can go
to a brand new worksheet. It's especially recommended
to use a brand new worksheet, especially if you're planning
on changing the numbers, resaving it as CSV or data file, and then reloading back to your system. And now I have each category
sitting in its own column. I can do sums on this, filtering on this, or use it to feed my dashboard. This is not a one time import. Notice that you have Connections and Refresh here highlighted. What it's done now is that
it's built a connection to this text file that you've
saved in your folder sheets. So if you have some
type of specific routine that saves the file always with
that name into that folder, you can refresh it here, and
you will get your new numbers. What you can also do in Properties is to set the refresh time. So here you see you have the control. You can say refresh this every 60 minutes or refresh data when you open the file. And here you have some
additional layout options that you can control. One other tip is that you
can press Control + T, and what this does is it
turns your data into a table. It just created some headers for me. With just Control + T, I get this whole table, tools, options, that I can use to quickly filter
on accounts if I wanted to. And I can also add a Total row here. So if I would then, let's say,
filter on these two accounts, see my total also automatically updates. So it's a really neat tool
to analyze your numbers. You are also able to
convert back to range. So let's say you did
some analysis on this, and then you want to save
this again as a DAT file or a CSV file and load
back to your system. I'm just going to delete these
because I don't need these rows. Delete this, and then I would change these to, let's say 400, 600. All I have to do is do File, Save As, save it here, and instead
of an Excel Workbook, I want a CSV file. It's this one. You just give it a name. You get this that it doesn't
support multiple worksheets, but that's fine. We just want this one worksheet. It also tells you some features
are not compatible with this because the colors and so on, that's not compatible, but that's fine. Then, if I go back to my file, I can see the updated CSV file, which I can use to upload into my system. As a last step, I also want to show you that you can do a web query as well. So if there's some data
that you need from the web, you can use this. Let's try to get some data from, let's do Google Finance. And you see this here, it says, "Click this next to the
tables you want to select, "and then import." Not every single webpage and
data is going to work with this, but a lot of them do. So here's the world's markets. Let's see what we have here, currency. So let's say we want the currencies. I'm going to click on this,
and I'm going to say Import. Okay, so in this case
let's just import it here. I have my web query. I can set the properties that it refreshes every
whatever minutes I want or that it refreshes when I open my file. And then I can use this data, obviously, since they're sitting in separate cells to manipulate or to use as I want. This feature can come in quite handy when your data is on the web.