Data Analytics with MATLAB | Master Class with Loren Shure

Captions
Hi everyone, welcome to today's master class on data analytics. Thank you for joining me today. My name is Loren, and I've been at MathWorks for over 30 years. I try to share my knowledge in classes like today's, and you might also find some useful MATLAB tips on the blog that I maintain, The Art of MATLAB. Let's get started on this week's topic.

As I said, I'm going to speak about data analytics today. I have some MathWorks friends on the YouTube chat, so please feel free to put your questions there; they'll either answer them or shoot them my way when it makes sense. In a perfect world I would be able to ask you some questions, but since I can't really do that right now, I'm going to tell you what I normally see.

So, what is data analytics? Before I answer that, let me talk about where data analytics is found. It's found everywhere any sort of technical computing is required, whether in transportation systems, energy systems, or medical applications, and also in less traditionally technical areas such as retail, finance, and logistics. What do these all have in common? At heart they are problems with a lot of data associated with them, and people would like to mine that data for information. The idea behind data analytics is that we're going to try to turn large volumes of data into something we can take action on. To get from the data we have to some decision, the first thing we need to do is figure out what happened: we look at the data and describe what we think we see. Then we need to figure out why we think it happened. We then may want to make a model so that we can predict what will happen next. If what's going to happen next is what we want, we don't need to do anything; but suppose it's something we don't want to have happen; then we need to do something about changing the situation so that a different outcome can occur.

That's the idea behind data analytics, and there's a very typical workflow. The very first thing is that we have to get access to our data, and thankfully, in MATLAB, the data can come from anywhere: local files, databases, sensors with live data, Internet of Things feeds, or databases on another system. As long as you can reach them, MATLAB will be able to capture the information you ask for. After we get access to the data, and this is a place where I'd love to ask a question of those of you who've worked with data: how perfect is your data? Perfect, right? That's what people would like to think. Well, not so much. We often have messy data, and it's often not our fault; it's just the way the world is. So I might need to pre-process the data, and depending on what kind of data it is, you're going to do different things: if it's an image, you might need to remove a shadow; if it's a signal, you might need to remove a 60 or 50 Hz component that is not really the thing you're interested in. In any case, you may have large amounts of data coming from multiple sources, and it may be so large that you decide to transform the data into a smaller data set that
still encapsulates the important details of the process you're trying to understand and affect; in other words, you might do some data reduction. One way of doing that (it's not usually considered data reduction, but it serves the purpose) is feature extraction. You can look for features: if I were looking at images of drinking vessels, for example, like you can see here in the plot, I could look for things with handles and without handles, and that might help me decide, perhaps in the context of other information, whether it's a coffee cup, a beer mug, a teapot, or whatever.

After we get the data into a form that we are happy with and that represents the system well, we need to think about creating a model. One thing you may not know about me is that my background is in physics, so I'm used to knowing the equations of things, thanks to people like Maxwell, Gauss, and Einstein. Because I know my equations, typically what I'm doing is finding and fitting parameters, like the mass of the Earth, and I'm going to use the data to help estimate those. But there's often a case where we have a lot of data with really great information in it, but we don't know the physics behind it, and for those problems we typically use statistical models or machine learning models. In either case, once we get a model, we have to figure out whether that model makes sense. I taught at a university for a little while, and I would get homework back where you'd ask for the value of something, expect it to be around 3 times 10 to the 10, and get an answer of pi. Clearly there's a big disconnect: they didn't even check that their model was in the right ballpark, and the numbers are hugely different from one another. So this is really key: making sure our model really is going to be a predictor for what we need, so that we can rely on it.

Now, I'm going to argue that if you stop right there, you will have had a really fun day at the office, or at home, or wherever you're working these days, but you haven't told anyone about what you've done. I know many of you may be joining from universities. If you're not the professor, you might need to hand in some sort of homework, paper, or thesis; if you are the professor, you might be writing papers for journals; and even students write papers and give talks. There are a lot of different ways you might present your results to other people. In fact, you might present the results in another system: you might make an app for someone to explore that keeps your information, your model, inside it, and you may deploy that app to the desktop or to an enterprise system at scale. If you need to, you can use external code linked in with MATLAB to glue things together, and you might also choose to take what you have and deploy it to embedded devices and hardware. That action of telling other people actually completes the flow, and the nice thing is that all of these pieces are available in MATLAB. We can use the data to tell a story, and we can share tools, programs, and artifacts along the way in different formats, depending on what's useful.

Now, the next thing I want to do is motivate the example I'd like to
talk about, and for that I'm going to bring up MATLAB, which I need to find here (don't do that to me). Okay, so this is my MATLAB, and I happen to have a live script open. In case you don't know what live scripts are, I'm going to explain them in a moment, but first I want to tell you about a problem I want to solve. Here you can see a dashboard, a picture of a website (unfortunately, the websites aren't working today, so I can't go to them live). On this website you'll see the state of New York, and what we see here is the last two days of energy load, what's come out of the power plants, and then a prediction for what's going to happen in the next 24 hours. I have another, very similar app, and what you see in that app are the same 11 distinct regions into which the New York energy grid is broken up; you see the prior two days of energy load and then a forecast for tomorrow, including uncertainty. On this map, once I select a region, you'll also notice red stars, and those red stars mark the locations of weather stations.

So why do I care about weather stations for this problem? Well, we're in a very sweet time of year from my perspective where I live, which is the Massachusetts area in New England, and it's a beautiful day today. But in the summer there were some days that were super hot, over 30 degrees centigrade, and very humid, and guess what I wanted to do those days: I wanted to turn my air conditioner on. I might even need to turn it on if the temperature is 28, or maybe even 26 if it's really humid; but if it's 30 and really dry out, I might not need it, especially if there's a breeze. So my energy consumption, I'm going to argue, is controlled at least in part by what the weather is doing outside: the temperature and the humidity. But there are other things too: there seems to be a daily signal in this, and that's because at night most of us are not using nearly as much power. Normally, pre-COVID, I would go to the office or be at a customer site, so at home I wasn't using a lot of power in the middle of the day, and I also wasn't using power in the middle of the night; but I was using it in the morning when I was getting up and making breakfast, and then when I came home and cooked dinner, did laundry, and watched TV. So we have a natural daily cadence, I think, for some of the power we use.

What I'd like to do is show you how I would look at the data for this. I told you that we have the load, and the load we simply get from this location (you'll see the link here): the New York Independent System Operator's site. If you come here, you see they have pricing information (because of course power costs money), power grid outages, and things like that, and then you see the actual load data, including the real-time actual load. If I click on that, what you see is a website that has the last 10 days as CSV (comma-separated value) files, and then, archived month by month, whole months' worth of files going back quite a few years.
Well, I really would like to get a lot of this data. I'm assuming that if I understood the patterns of load usage from before, they would help me predict how much load we would need tomorrow. Who cares about that? First of all, the people who monitor and run the grids care, because they have access to several different generators for making the power, so they may say: oh, we need more power tomorrow, I'd better turn on another generator. Or it could be a situation in the summertime where they go: oh my goodness, we need more power tomorrow and we're using every generator we've got. Then what do they do? They call Hydro-Québec, our neighbor, and say: we need to buy some energy for tomorrow. So these energy operators are using the forecast to predict the load they'll need and make sure they have enough so that they don't run out. Now think about the people in Canada at that point. They also want to know whether someone here is going to want to buy more energy from them. If they see it coming that it's going to be super hot and humid in New York and Massachusetts and they're going to get all these calls to sell energy, well, they're not in business for no good reason; they're in business, among other things, to make some money, so they may want to raise the price when they see that happen. So different people might want to see the same information for different reasons.

Okay, so that's one piece of information. The other piece of information that I have here is a location at NOAA, the National Oceanic and Atmospheric Administration, which is basically where the weather service data is hosted in the US. You'll see that for each month, for years starting in 1996, I've got basically a compressed file with all the information. So somehow I have to get all these data sets down to me so that I can begin to model. If I come back to the New York Independent System Operator site, you know how I could do this: I could click here and download one, and download two. Well, I've got to tell you, I would be bored pretty quickly, and I would probably make a mistake: I'd skip one month and download another month twice. You can see it's a fairly long list, so I might really mess that up. What I'd like to do instead is see if I can automate getting that data, so that I'm sure I have all the data we need.

Okay, so now I'm going to come back to MATLAB to talk about this, and I'm going to show you the live script; this is actually MATLAB executable code. For the benefit of those who haven't seen it before, this is called a live script. Your regular scripts or functions were files that ended in .m; you'll see this one ends in .mlx, for live. It's a way of creating text that is a formatted document: it looks like a PDF you might create, or a Word document, or whatever your choice is. You'll see that we can put in pictures and bulleted lists. I told you that some of the factors the load is going to depend on are the time of day and the weather, and what we did originally is build this web application so that people could
just go to it, without needing to know MATLAB, click on one of the regions, and find out the forecast for the next day by looking there. You'll notice that because of that we have links in here, and I've got embedded pictures. I don't actually have a formula in this case, but if I had a formula I could put one in. Suppose I actually had a formula, and I knew that it was, say, weather squared plus time of day or something like that. From the Live Editor I can insert many different things, and one of the things I can insert is an equation. I can insert a LaTeX equation; I'm not super great at LaTeX because, between you and me, I graduated with my PhD before LaTeX was invented, so I'm okay with it but not great. You can do it either way, though, and I'm going to use the point-and-click choice here. What you'll see is a palette with a lot of different mathematical symbols beyond the usual ones. I can even start typing: x equals omega times r, or whatever I want, then times, and I can come over here and put in (I like the nth root) the 17th root of, let me do, omega minus three, squared. So that's one way I could put an equation in, and that's really useful, because if I were to say the equation in words, it wouldn't be nearly as easy to understand as when you just see it and can double-check that it makes sense. I can also put in arrays: I can insert an equation again, put in a matrix, say a 2-by-2, and fill in entries the same way I filled things in before. So I can communicate the math to people, because that's one piece of the communication: there's the mathematics, there's the prose, there are the pictures, there's the code. These are all important, and different pieces matter to different people, depending on their role.
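For reference, the equation built by pointing and clicking here could also have been typed directly as LaTeX. Reconstructing it from the spoken description, it might look something like this (a sketch; the exact expression on screen is partly guesswork):

    x = \omega r \cdot \sqrt[17]{(\omega - 3)^2}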
So the live script lets me put all this together, and because I sometimes forget, I'm going to jump ahead for a moment: if I get all done with this and have the document exactly the way I want it, I can come here and do a Save As, or an Export, and I can export this to any kind of document: PDF, Word, HTML, or LaTeX. So you can get these artifacts out of MATLAB and share them with other people; that's one way you can do the sharing and communication, and people can be very clear on what code you actually ran.

Okay, now I'm up to the point where I'm explaining what we're going to do. I'm going to get the data that I showed you from those websites, the historical data for the load and for the weather, and then we're going to look at it, and we're going to find that it's a little bit messy, not surprisingly. What surprised me, actually, is that the weather data is fairly clean. Then I'm going to want to merge the data: the load data, I am told, is recorded every five minutes, while the weather data is recorded once an hour. This, again, is not atypical when you're collecting data from different sources; we've got to rationalize that somehow. Then we've got to pull out predictors for what might be important when I go to do my predictions. We know the time of day might be important (I told you why), but does it matter whether it's Monday or Tuesday or Saturday or Sunday? Well, it might, and the reason it might, if I'm thinking about it, is that my behavior is different on different days of the week. Monday through Friday, at this point, I'm normally at home working most of the day; on Saturday and Sunday, if it's nice weather, I get outside and I'm not home most of the day. So my energy use might be very different because of that, and day of the week might matter. There are other things that might matter too, so we basically want to pull apart the signal in terms of its time constituents and see which ones are important and which ones may not be. Once we figure out which ones we think might be important, we want to take those, train a model, make predictions for the future load, and see how well it works.

I am actually not going to run the next pieces of code here; I'm just going to come to the next section (this is like regular MATLAB code, it has sections as well), because I always make sure I've downloaded the data beforehand, just in case. I'm going to minimize my toolstrip for the moment so you can see a little more of my command window. What you'll see here is that I'm going to make a directory if it doesn't exist; you can see it's going to make a directory called DataSample, and I already have that made, so we'll go in there. Then what I'm doing is getting a year's worth of data. In this case it doesn't really matter how much you get, but I wanted a year, because each season may have different characteristics of how much time we spend indoors and outdoors and things of that sort. So I'm going to go from February 2005 through January 2006, and notice that when I do that, I go in steps of a calendar month. This takes away the burden, when dealing with dates and times, of having to remember which month has 30 days and which has 31, whether or not it's a leap year, leap seconds, and so on; all of that is just taken care of. So those are the dates of the files that I want to get, one per month.

I have four cores on my laptop here, so you'll see I have four workers; I put them all to work behind the scenes. If I have a big enough pipeline to the internet, instead of downloading one file at a time I could choose to download as many as it will let me, in this case perhaps four, because I have four workers. That's why I'm using a parfor, a parallel for loop, instead of a for loop: if it can go faster by doing things in parallel, it will. What I'm doing is just building up the full names of the files, and you can see the output left here from the last time my colleague ran it: we got all 12 files down. We constructed the file name we were going to get and put together the URL, so the file has the same name on the website as it has here, but I have to give the URL for finding the data and a location where I want to put the file when I'm done.
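Pieced together from the narration, the download section might look roughly like this (a sketch: the DataSample folder name is from the video, while the exact NYISO URL pattern and zip-file naming are my assumptions):

    % One entry per month, Feb 2005 through Jan 2006; calmonths handles
    % 30- vs 31-day months and leap years for us
    dates = datetime(2005,2,1):calmonths(1):datetime(2006,1,1);

    dataDir = "DataSample";
    if ~isfolder(dataDir)
        mkdir(dataDir)
    end

    baseURL = "http://mis.nyiso.com/public/csv/pal/";   % assumed URL pattern

    % Each iteration is independent, so a parfor can fetch several files
    % at once; with parfor you cannot assume any particular order
    parfor k = 1:numel(dates)
        fname = string(dates(k), "yyyyMMdd") + "pal_csv.zip";  % assumed name
        websave(fullfile(dataDir, fname), baseURL + fname);    % save to disk
        unzip(fullfile(dataDir, fname), dataDir)               % one CSV per day
    end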
When I download with parfor, you'll see we don't necessarily download in order, but that's okay, because I still have all the files; with a parfor you can't assume things are being done in order, so each time through the parfor loop must be an independent action. Having put together the file name and the file destination, you'll see I'm using the function websave. If you don't know what websave does, you can select it and get help on it, and let me show you: here I have the MATLAB documentation for websave, and one of my favorite things to look at is See Also. See Also tells you many things: first of all, that websave was introduced in 2014, and it points to other things that might be interesting, like webwrite and webread. websave is saving the file rather than reading it into MATLAB right away, and you give it the file name where you want to place it and the URL it's coming from. And there's my favorite part (well, I said it's a tie between See Also and Examples): if there's an example that does what you want, you don't necessarily have to read the full description of every input argument to get the details; it might just save you a lot of time, and you can copy and paste a little bit of code and modify it for your use.

Okay, so we've got our files here, and in fact, if I bring back that folder that I had open a moment ago, you'll see that the last thing I did in this area is, for the length of the dates that I have (12 of them), unzip them. If I look in my historical-load directory, I actually have more unzipped than that, because I got more data at one point, but I have all the days for a whole bunch of different months: January 29th, January 30th, January 31st, all of 2006, and so on. So I've got a bunch of data there that's downloaded, and I can do that without worrying that I'm going to miss a file by making a mistake clicking; we can automate a tedious task that way.

Okay, but now I have this mess of files that I need to think about. I'm going to come to this next section, and what I'm going to do now is clear all the output from the last time this was run, so I don't have anything in my workspace right now and I haven't run anything yet. What I want to do is take that directory with the historical loads and all those CSV files, and I kind of want to think of them as one long, skinny time sequence of how much load was used every five minutes, for days and days and days on end. Instead of having to manage all the data myself, I can first find out what data is there. One way to find out what's there (let me make this big again) is to run a section; when I run a section, we see the output, and I can see I have my directory listing there with 365 days in it. Now let me open one of these files.
Here's one of these files; I'm just going to make the columns a little bit bigger so that you can see. I've got five columns of data. There's the time stamp, and you'll see we have a bunch of rows that are all at midnight on January 31st, 2006; that's lines 2 through 12, so 11 of them. Funny, I have 11 regions in New York, as I told you. Then, starting at line 13 and going for another 11 rows, we're at 5 minutes after the hour, and then 10 minutes after the hour. You'll see a time zone column, and maybe I don't really need that information, because the New York regions are all in the same time zone together. The next column, the name of the region, is CAPITL (that's where Albany is), then CENTRL, and so on; at five minutes past, they start repeating, with Capital, Central, and Dunwoodie again. If I don't want to think about the name, I can use an ID to represent each one of the 11 regions, but personally I like the name, because I know the geography; I kind of know where New York City is and where the capital is, so it's a little more meaningful to me. So maybe I don't need the ID either. And then I need the data, surely: the load.

When I read in the data, I want to think about how to read all these files as an ensemble, rather than reading them one by one, and for that I'm going to refer to the data in the historical-load directory as a datastore. This basically lets me aggregate all of the data and read it all at once. You'll notice that I'm going to use a datetime format for the first column and two capital Cs for the next two columns; here's a preview of them: the time zone and the zone name. Those two happen to be strings, but they take only finitely many values, so I'm going to make them categorical with the capital C, so that I don't have to store the same string many, many times; and the loads are floating-point numbers. Then I can come here and say: you know what, I really don't want all of that data; I want just columns 1 (time stamp), 3 (name), and 5 (load). You'll notice "TimeStamp" doesn't match "Time Stamp": when I run this (another way to run a section is to stick my mouse in the blue area and click), I get a warning that the variable name was not valid because it has a space in it, so all MATLAB did was squeeze the space out. Now, because I can, I'm going to read in the whole data set at once. It's going to take a little bit of time; even though it's only one year's worth of data and not 20 years' worth, it's a fair amount of data.

While we're waiting, you can see in my workspace that I've got the data: my raw data is about a million rows long and three columns. I can look at the top of it, and you can see I've got the time stamps (if I hover in my table, I can see all the times; these first eight are different region names at the same time) and I've got the loads. So I read in the data, and I always like looking at it to make sure I was successful; but if I just look at five numbers, I'm not sure I've got it right, so what I often want to do is take a better look at the data.
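Reconstructed from what's on screen, the datastore setup might look like this (a sketch; the folder name and the exact date format in the CSVs are assumptions):

    % Treat the whole folder of daily CSV files as one logical data set
    ds = tabularTextDatastore("DataSample");

    % Column 1 is a datetime; the time zone and region name repeat from a
    % small finite set, so categorical (%C) avoids storing the same string
    % a million times; the id and load are floating-point numbers
    ds.TextscanFormats = {'%{MM/dd/yyyy HH:mm:ss}D', '%C', '%C', '%f', '%f'};

    % Keep only time stamp (1), region name (3), and load (5)
    ds.SelectedVariableNames = ds.VariableNames([1 3 5]);

    rawData = readall(ds);   % about a million rows by three columns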
Well, let me remind you what the data looked like: 11 data values at time zero, followed by 11 at time 5, followed by 11 at time 10 minutes, and so on. That's a little bit awkward. I'd kind of like to think about my data as times 0, 5, 10, 15, and so on, with one column for Capital, one for Central, and so on. So I'd like to take each group of 11, turn them on their side, and stack them that way; actually, unstack them, which is the terminology used in databases, so we borrowed it. I'm going to unstack the load data according to the name. It also had some names that didn't make sense as variable names, because New York City was N.Y.C., and you can't have dots in MATLAB names; and I'm going to change the first name, which was TimeStamp, to Date. Now you see a preview of the first 12 of my times, so we have zero minutes, five minutes, ten minutes, 15 minutes, and you'll see Capital, Central, Dunwoodie, and so on as columns. If what I'm trying to show in the live script is bigger than the amount of space we want to take up, you can scroll through it; and if I were going to do my export, it would actually show in an expanded way: in PDF, for example, it would show all 12 rows.

But look at row number two. Does this bother you? For the state of New York, at five minutes after midnight on January 31st, no one needed any power anywhere? It seems like there's a problem with the data collection somehow, and the problem is, of course, that we don't know how often that happened or what other problems we may encounter. So what I'd like to do is take a look at a piece of the signal. I'm going to take the whole signal first, and I'm going to use something that's relatively new in MATLAB: I'm going to call stackedplot on it. It's thinking, because, remember, it's got a lot of data points in there, and here we have a plot. Notice in my live script that the output has been coming inline in my script; I could choose to have the output come after each section that I'm running, or I can have it side by side so that I can see what's going on; it's really a preference, up to you, and you can also suppress all of the code if you want to (I don't want to at the moment; I definitely want to show it to you). Here is my signal, visualized. Notice also with this plot that I can start moving my mouse here and read, for all 11 regions, what the load was at any given time. You'll also see a lot of places where there are these zeros, and at other times we happen to see some spikes. I could zoom in, and when I zoom in, it zooms in on the entire plot that I've got; and if I want, it will update the code for me, saving me the time, if that's the way I'd like to see it. If I want to, I can Ctrl-Z and undo that, and we can hide the code for the moment. (I'm being a little bit silly somehow; I thought I could click and go back, but at any rate, I can run that section again and get us back where we were.)
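In code, the reshaping and the first look come down to a few lines (a sketch, assuming the raw table variables ended up named TimeStamp, Name, and Load):

    % Pivot from one long table (11 rows per time stamp) to one wide
    % timetable with a column per region
    loadData = unstack(rawData, "Load", "Name");
    loadData.Properties.VariableNames{1} = "Date";   % was TimeStamp
    loadData = table2timetable(loadData);

    % One stacked axis per region, all sharing the same time axis
    stackedplot(loadData)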
All right, the next thing I want to do is take advantage of the fact that MATLAB actually knows about dates and times much better than it used to. If you've got time-stamped data and you're not using the datetime functionality in MATLAB, or the timetable functionality, you might really want to look into it. I want to ask the question: is my data regular? Remember, I said they told me the data was sampled every five minutes. I like to trust, but not completely; I want to verify too, and that's why I'm going to ask this timetable whether the data is regular. And it's not. This is kind of a bummer, right, because I've now got three problems with my data: I've got zeros in my data that really don't seem to make sense, I might have some spikes in my data, and now it's not even sampled every five minutes. So I've got to figure out how I want to work on this so that I have the most cleaned-up, reasonable data I can have.

I'm going to select a region that I can clean, and you'll see this drop-down, which is a user control that I can insert here; I could program it so that it offers any one of my 11 regions. I'm going to leave it on Dunwoodie, and I'm going to capture the data into what I call the clean signal, which of course it isn't yet. I want to select the data and take a look at it, so while MATLAB is computing, it's going to make my plot, and here's the raw data for the Dunwoodie region. You can see I have a whole host of zeros there, way more than I maybe anticipated from seeing that stacked plot before, and we can see a bunch of spikes. Now, when I think about spikes here and I think about energy use: a lot of the time, what we were doing five minutes ago is kind of what we're doing now (if we're sitting at the dinner table, say), so you don't expect rapid changes that then rapidly change back again in no time at all. So these spikes are also a bit suspicious, and I need to think about how I'm going to clean things up. There are a ton of functions in MATLAB and Signal Processing Toolbox, which you may or may not know about, that can help you do this.

Oh, I want to show you something else while I'm here: I can do a constrained zoom and look at the dates. My dates go from March through January, because I have a year's worth of data (February is not listed there, but it's there). If I go like this, you'll see that MATLAB is smart about the dates and will zoom in appropriately, and it will let you zoom to your heart's content. Now you can see we have some interesting spikes; we have a double spike a couple of hours apart there. So I can look at the data, and now I need to think about using the data and cleaning it up. I showed you there were controls; when I showed you that drop-down, I didn't show you what other controls there were: I can put in numeric sliders, check boxes, buttons, and edit fields. We also recently introduced something called tasks. I'm running the latest version of MATLAB, R2020b, and you can see I have a bunch of tasks in MATLAB that allow me to do data pre-processing: Clean Missing Data (wow, that's going to help), Clean Outlier Data (that's going to help too), smoothing (maybe; I'll explain why that might make sense), and I could potentially use Synchronize Timetables later on. There are also tasks from other products that may help you as well.
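Those checks take only a line or two (a sketch, continuing with the unstacked timetable from the sketch above; DUNWOD as the Dunwoodie column name is an assumption):

    isregular(loadData)          % false: not actually on a 5-minute grid
    nnz(loadData.DUNWOD == 0)    % how many suspicious zero readings?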
So I'm going to come here, and I want to deal with the zeros. Instead of removing them, I choose to put in the Clean Missing Data task. I fill in the data that I want: I tell it that the input comes from my clean signal, which part of the signal (the load), and that I want the x-axis to be the date. I want to fill the missing values, and I can choose how to fill them; I'll do linear interpolation in this case, with whatever max gap to fill I want, and then I want it to display both the cleaned data and the filled missing entries. And here we have the plot that comes out of it. That's all nice and magic, and then I can take the result and put it back into my clean signal when we're done.

Now, what if I wanted to show this to someone, but they didn't need to see all this mess? What I can do is collapse it, and they just see this one line of code that says the result is fillmissing of cleanSignal.Load using linear interpolation. You don't have to show more than that if you don't want to, but if you're like me, you might have a reason to want to see what code is behind it; you don't want it to be something magical. So I can click here again and show the actual MATLAB code behind it if I'd like, and you can rest assured we have the right thing. This is also an opportunity to find out: well, here's a function I didn't know about, fillmissing. I can go to fillmissing in the MATLAB documentation and look at the See Also, because once there are missing values, there may be other things I want to do: filloutliers (which the next task refers to), ismissing, rmmissing, standardizeMissing; so there's a bunch of information I can get out of that.

I decided to fill the missing values first, because if I changed the time stamps first and interpolated somehow, then, because the times are moving around, my zeros might not be zeros anymore, and I might not fill something I meant to fill. So I'm doing the filling of the missing values first, and then I'm also going to clean the outliers with the same kind of idea, using the Clean Outlier Data task. Here I come to Clean Outlier Data, and that's what's embedded in here: we're going to take the clean signal and put it into the cleaned-outliers result, same thing, date versus load; we're going to fill the outliers, not remove them, and I can use a moving median, in this case with a threshold factor of 3 standard deviations, and I'm going to have the window be two hours wide. Then I'm going to make a plot that shows you pretty much everything that happened to this data. What you see in the dark, heavy blue is the cleaned data; what you see in light blue, which you barely see because the heavy blue mostly covers it, is the input data, though a few little bits of it show through. I'll also mark which points were detected as outliers with x's, the filled outliers with red dots, and the outlier thresholds in gray. Again, I could zoom in and see what's going on if we wanted to, but we basically now have a clean signal for one year of one of the 11 regions, and this is just for the load so far.
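Expanded, the code behind those two tasks amounts to something like this (a sketch; the interpolation method, window, and threshold factor are the ones set in the tasks, while the variable name is mine):

    % The zeros are really missing readings: mark them as missing, then
    % fill by linear interpolation in time
    cleanSignal = standardizeMissing(cleanSignal, 0);
    cleanSignal = fillmissing(cleanSignal, "linear");

    % Fill (rather than remove) spikes flagged by a moving median over a
    % 2-hour window with a threshold factor of 3
    cleanSignal = filloutliers(cleanSignal, "linear", "movmedian", ...
        hours(2), "ThresholdFactor", 3);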
Now I'm going to argue that there's a lot of other stuff going on here that, on any given day, maybe within an hour, we don't really care about. Maybe you go off and have your hair dryer on for a few minutes and then you turn it off; it's not that important. We can smooth the data out, and in fact we're going to smooth it with the Smooth Data task that's up there, using a moving median with the same time window. The reason I'm doing the smoothing at this point is, honestly, that I already know (and I told you) that the other data, the weather data, is hourly, so we're not going to need every five minutes of data; but I might as well get a good estimate from the five-minute data we have, and smoothing seems a fair thing to do at this point. So here I get my smoothed data, and you can see there's sort of some hourly jittery stuff still going on in there.

Okay, now, I told you I had three problems: uneven sampling of the data, missing data recorded as zeros, and outliers. We fixed the missing data with zeros, and we fixed the outliers, so now we need to regularize the time stamps. First, I'm going to come here, and another thing I can do is Run and Advance on a section. I use Run and Advance so often that if I right-click here, you'll see it's grayed out; I added it already to my quick-access toolstrip, so I can minimize the toolstrip, run this, and go to the next section. But I'm going to come back and look at the answers for this section first. I'm going to get a summary of the differences between the dates, the differences between my five-minute steps. If they were all exactly five minutes apart, all my differences would be five minutes, and indeed my median difference is five minutes; but my maximum difference is an hour and five minutes, and my minimum is one second. One second, you know, could be jitter of some clock that's done something weird for a little while. So what could I do? Well, I might want to figure out why it's an hour and five minutes off. Supposing it's legitimately an hour and five minutes off, what I can do is say: take the clean signal and, in steps of five minutes, interpolate it with linear interpolation. I could do that, and that's what I did here.

But let me tell you something else: I smell a rat when something is too round a number, and an hour and five minutes is exactly an hour more than my median difference, so I'm wondering what's going on. The actual fact is, I lied to you before (sorry). When I said we didn't need to worry about the time zone, we actually kind of do, because in the US we change our clocks twice a year in most locations, and New York is one of them. If I'd looked through that file a lot more carefully, or at a different month of the file, instead of seeing EST the whole time I would also have seen EDT, for daylight saving time, some of the time. If I had properly accounted for that and read it in as part of the date and time, we would not have a maximum difference of an hour and five minutes; we'd have a maximum of five minutes, or five minutes and a couple of seconds, or something like that.
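In code, the smoothing and the regularizing might look like this (a sketch; reading the time stamps with the 'America/New_York' time zone is what resolves the EST/EDT wrinkle just described):

    % Smooth with a moving median over the same 2-hour window
    cleanSignal = smoothdata(cleanSignal, "movmedian", hours(2));

    % Put the samples back onto an exact 5-minute grid
    cleanSignal = retime(cleanSignal, "regular", "linear", ...
        "TimeStep", minutes(5));

    % Parsing with an explicit time zone up front would have avoided the
    % phantom 1-hour-5-minute gap entirely, e.g.:
    % t = datetime(txt, "InputFormat", "MM/dd/yyyy HH:mm:ss", ...
    %              "TimeZone", "America/New_York");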
For the rest of what I'm going to do, I actually have the data properly accounting for daylight saving time, so there are no issues with the data going forward. Now, I just did one region and one year, out of, you know, 15 years or whatever, so I'm pretty sure you want to watch me do all the other 10 regions and all of the years. Well, maybe not. I have actually done that in advance, so I've got it ready to go when we get to that point.

But now I need to think about the weather. I'm going to do the same Run and Advance idea I had before; we'll let it run, and I'll come back and show you the code. We have this directory of hourly files, and I had already downloaded the data before; it looks like I have two gigabytes of data, which is a rather large amount. If I type out the first file in the folder and just look at the first five lines, you can see it scrolls and scrolls and scrolls: I've got a lot of columns there, but I don't actually care about all of them, and not only that, you'll see comma, space, comma: many of them are actually missing values anyhow. What I do care about, when I actually get a chance to go through here, is the WBAN, which happens to be the weather station number (that's how we identify the stations we care about); the date and the time, which they actually store separately, whereas the energy load site stores date and time together; and the dry bulb Fahrenheit and the dew point Fahrenheit. If I were doing this in Europe, maybe I would switch to Celsius, but this is the US, so most things are in Fahrenheit (you can get it either way; it's in the data set both ways). Rather than read all two gigabytes in at once, I'm going to say: read in a million records at a time. Then I tell it the formats, just in case it didn't know, and I preview the data; here's what a preview of my five columns looks like: my station, my date, my time, and so on.

Now I need to tell you something. Remember all those files I downloaded for the weather? They weren't just New York; each one of them had all the stations across the whole US in it, but I only need, I don't know, a handful of stations, 15 or 20, to cover the stations that are in New York State. So if I come back here, what I want to do is find these: I looked up the numbers, and these are the stations we care about; if I take a look at them, there happen to be 17 of them. All the rest of that two gigabytes of data I don't need right now.
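Set up as described, the weather datastore might look like this (a sketch: the misspelled DryBulbFarenheit and DewPointFarenheit column names are as described for the NOAA files, while the folder name and WBAN ids shown are illustrative; 04725 is the one station number mentioned on screen):

    % 2 GB of hourly files, read a million records at a time
    dsW = tabularTextDatastore("hourly", "ReadSize", 1e6);

    % Keep only the station id, the separately stored date and time, and
    % the two temperatures
    dsW.SelectedVariableNames = {'WBAN', 'Date', 'Time', ...
        'DryBulbFarenheit', 'DewPointFarenheit'};

    % The 17 New York stations, identified by WBAN number (illustrative)
    stationList = [4725 14732];   % ... and the other 15

    preview(dsW)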
Now, I have a pretty nice, beefy machine that MathWorks gave me, but even so, I don't want to load all that data at once, so I'm going to use a concept called tall arrays. This allows me to operate on my data in a way where I don't have to load all the data at once. If I happen to have parallel computing resources available (which I do, because I started my parallel pool before, and I mentioned it), I can process some of the data in parallel; and if I were hooked up to a cluster or the cloud, I could use that resource instead to do some of my computing on tall arrays. You basically write your regular MATLAB code, but you tell it that the data is tall before you actually work on it. So I'm going to take my datastore (remember, the datastore is just a reference to the data in the files on my hard drive) and make it tall. It knows something about this data; when I click over here, here's what dsW thinks it is, and it tells me a bunch of stuff.

I'm going to start running this now. I make the data tall, and then I'm just going to filter the raw data, keeping the stations that match the stations I care about, the ones on my station list. Notice that when it shows me a preview of the raw data, it says it's M-by-5, because it hasn't read all the data yet; it doesn't know how much there is, so it just tells me there's a bunch, but it's five columns. Now I filter out the unwanted stations: this was the raw data, and this is with the stations filtered. Notice my filtered data does not begin with station 3011, because 3011 wasn't on that list, but 04725 was, so that's the first station we care about. Then I take the date and the time and merge them into one datetime construct in MATLAB, and I make the time zone, notice, America/New_York; when I do that, it takes care of that whole time-zone issue with daylight saving versus standard time. I also put the temperatures, the dry bulb and the dew point, together, so I have my station, date and time, and my two temperatures, and you can see it's basically once an hour at this station.

None of that would have been done until we typed this last command, which was gather. What happens is that all the commands in between my tall and my gather get saved up, and as soon as I say gather, it says: okay, go use whatever resources you have available through the parallel pool to compute them. You can see it figured out that it could do this in one pass; it didn't need an intermediate result. If it did, it would have made maybe two or three passes through the data; we try to be very economical about passing through the data more than we need to, because that's expensive.

The next thing I want to do, after I get my data back, is reformat my table. Now you'll see that I have the date and time, and I have the station numbers as columns; I'm getting a warning because you can't have a column name beginning with a number (without asking to preserve it), so it puts an x in front; that's what all that is about. And you'll see the data we have now. When I first saw this, I thought the weather people were kind of strange: look at this, this station has a bunch of NaNs at 45 minutes and 51 minutes, then a reading at 53 minutes (which we saw before), then all these NaNs at all these other times, and then another reading at an hour and 53 minutes; and this other station has NaNs at different times and collects its data at 56 minutes past, and so on. And then I realized how smart the weather system's administrators were.
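The tall-array part of the narration might look like this (a sketch; whether Date and Time parse as a datetime and a duration depends on the formats set on the datastore, so treat that as an assumption):

    tW = tall(dsW);               % deferred: nothing is read from disk yet

    % Keep only the rows whose station is on the 17-station list
    tW = tW(ismember(tW.WBAN, stationList), :);

    % Merge the separate date and time columns; assuming Date parsed as a
    % datetime zoned to America/New_York and Time as a duration, merging
    % is just addition, and the zone resolves EST versus EDT
    tW.DateTime = tW.Date + tW.Time;

    % Everything above is queued up; gather triggers as few passes over
    % the 2 GB as possible (here, one)
    weather = gather(tW(:, ["WBAN" "DateTime" ...
        "DryBulbFarenheit" "DewPointFarenheit"]));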
What they did is decide not to slam their servers: they basically have a time band in which the different stations report. They all report at regular times, once an hour, but staggered a little bit within a small time window. So what we need to do is just coalesce them: take the non-NaN value within each hour as the hourly value for each station, because from the weather stations' point of view, they haven't given us more data than hourly, so that's probably the best we can do right now.

Now we need to take our data, the load that's every five minutes and the weather that's once an hour, and merge them. What I'm going to do now is load in the full data sets that are already cleaned up (and when I say cleaned up, what we've done is put these into an hourly form too; if you want, there's code you can do that with), and now I want to synchronize them. I can synchronize these two timetables in many different ways (I can look at synchronize in the documentation to find out): one option is the union, all the time stamps that appear in any of them, and then fill in values in between if I want, or put NaNs in there; or I could just do the intersection. In this case I'm choosing the intersection, because when we tried doing this ourselves without downsampling, we didn't get a much better model and it took a lot more time; you can see these data are big. So now you can see I have my weather table with its 17 stations, and my NYISO table, which used to be smaller (it had 11 regions in it, 12 columns with the time), but now has 17 more columns, because it's got the 17 weather stations in it too.

Now, if I think about this and I want to model the energy use in New York, I could put all 11 regions into one big model, but that doesn't really make sense to me, because they're operated independently. What I'd rather do is make 11 independent models, and then ask at any given time, like that web app did, which region are you in, use that region's model, and give you the information. So I'm going to pick a location to model, and this one is going to be New York City; I'm going to choose, in this case, the only weather station in that region, which is the LaGuardia one, KLGA, and I'm just going to change the column names to make it easier to see what's going on.

Now I need to think about creating predictors, and I told you some of what the predictors might be, but I didn't tell you everything, so let's start thinking about this. I need to know what hour of the day it is (I told you time of day might matter). What month it is is kind of a proxy for the season (I guess if you're living somewhere like Hawaii, maybe it doesn't matter, but that's not where we are). Year might matter; some years are more severe or more mild than others. I made the argument earlier that day of week might matter, because our behavior is different on weekends versus weekdays, and I've represented that two ways: the weekday, which is just one through seven saying which day of the week it is, and isWeekend, which is true or false depending on whether it's a weekend day or not.
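A sketch of the merge and the calendar predictors (the hourly table names are mine; hour, month, year, weekday, and isweekend are stock datetime functions):

    % Keep only the hours present in both timetables
    modelTT = synchronize(loadHourly, weatherHourly, "intersection");

    % Calendar-based predictors derived from the shared time stamps
    t = modelTT.Properties.RowTimes;
    modelTT.Hour      = hour(t);
    modelTT.Month     = month(t);        % a proxy for season
    modelTT.Year      = year(t);
    modelTT.DayOfWeek = weekday(t);      % 1 = Sunday ... 7 = Saturday
    modelTT.IsWeekend = isweekend(t);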
Then I also want to get the temperature and the dew point out, so I'm going to pull those out separately instead of having them be in one column. Now I need to think about a few more things. Well, as I've been sitting here, the weather hasn't changed much, and my use of energy hasn't changed much, so my energy use even an hour ago is not much different than it is now. There's a correlation between my energy use from hour to hour, and maybe day to day, and maybe week to week (week to week because of the weekends versus weekdays). So what I want to think about is pulling out lagged hours of the data. I have hourly data now, so I want to compute what's called a cross-correlation (and I'm going to hope that Josh will tell me if my hand's not on screen, because I have no idea). If I were going to do a correlation with the data, and my data are represented by my fingers: to do an autocorrelation, I simply put my hands right on top of each other, and I have a hundred percent correlation; it's perfect. Then I can start offsetting them, and you can see they're not going to be quite as well correlated as I go.

So I'm going to compute that for the load, for 200 lags. Why 200? Let me get my lag predictors built up, and let me explain what's going on. First of all, here's my autocorrelation, and it's a very high correlation overall, because I never took the mean out. I can use these tools, so, for example, I could use data tips, and if I'm having a good day I'll get it: at x equal to zero it's 100 percent correlated; look where it goes next, down and then up, and at the next peak, 24 hours back, it's 0.99. You can see that the correlation goes down, and its next local high after that is over here at 168. 168 is 24 times 7, so that's 24 hours a day for a week, and that's why I went out to 200 lags: to make sure I could see whether there was a weekly trend. Since there does seem to be a weekly trend, I want to put those lagged data, the prior day and the prior week, into my model data as well.

Okay, so we're at an hour, folks, and I haven't even made a model yet, because what's typical is that you have to spend a ton of time pre-processing the data to get it into the form you need; fortunately, you can then save it and spend your time modeling. But you can also see how useful MATLAB is with the pre-processing part: loading the data in and cleaning it up so that you're ready to go.

Okay, so now we're ready to model, and we're going to do a machine learning model. For those of you who don't know much about the modeling: the reason I'm going to use machine learning is that (I told you I'm a physicist, but) there's no model I know of that says weather squared minus the temperature of the prior day, and so on, is going to be my prediction for the future load. So I need to make some sort of statistical model, and because the load is basically continuous, I need to make a regression model rather than a classification model. A classification model would be: tell me which one of these five categories something belongs to.
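The lag exploration and the lag predictors might be coded like this (a sketch; xcorr is from Signal Processing Toolbox, and the Load column name is mine):

    % Autocorrelation of the hourly load out to 200 lags; the local peaks
    % at 24 and 168 (= 24*7) are the daily and weekly rhythms
    [c, lags] = xcorr(modelTT.Load, 200, "coeff");
    plot(lags, c)

    % Lagged copies of the load as predictors, NaN-padded at the start:
    % the prior hour, the prior day, and the prior week
    L = modelTT.Load;
    modelTT.PriorHour = [NaN;        L(1:end-1)];
    modelTT.PriorDay  = [NaN(24,1);  L(1:end-24)];
    modelTT.PriorWeek = [NaN(168,1); L(1:end-168)];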
The way I'm going to do this is two different ways. First: if you don't know much about classification and regression at all, you're probably going to go look something up online, maybe in the documentation, but you also have access to apps in MATLAB. We've had apps for a long time (tasks are newer and sort of smaller than apps). You'll see I have my favorite apps up here, organized (those are just mine), and we have machine learning apps, signal processing apps, and so on. I've put several machine learning apps on my favorites list, including the Regression Learner app, and I'm going to bring up the Regression Learner app. Here it is. The idea behind the app is that it's going to help lead me through the process of what I need to do to create a model.

So I'm going to start a new session and get my data from the workspace. The data we've just been looking at is the model data, and you see it tells me that my data is 77,000-by-11; that's fine. Now, it guesses that it's going to try to predict the prior week. I don't want to predict the prior week, I want to predict the load, so I simply come here and switch the response to the load, and now it's going to predict based on all the other things that I put in my model data; if I wanted to, I could take some of them out. Early on I also said to you that even when I'm not making a statistical model, but one where I do parameter fitting, I need to somehow understand whether that model is actually useful as a predictor, so I need to do something called model validation. I'm going to use holdout validation right now and hold out 10 percent of the data when I start this session. What's going to happen is that I'm going to make my model (which I have not made yet; this is just a plot of the data) from 90 percent of the data, and I'm saving 10 percent that the model has never seen to help me understand how well I've done with my modeling. We already made our features, basically, by pulling out those things like the prior hour and the prior day.

Now, here are all the possible different ways I could create a machine learning model, and if you're like me when I first saw this, you may not be well acquainted with all of them. Starting from the bottom, there are ensembles of trees, and next to many of these you'll see "all": all ensembles, all Gaussian process models, all support vector machines, and so on. Up top, I can do everything, or I can choose All Quick-To-Train, and that is what I'm going to do right now. You'll see that I have selected Use Parallel, since I should be able to, and I'm going to hit Train. It's chosen four models, and it's going to work on those four models right now. It's already got the linear regression. Linear regression is basically least squares: the idea of, let's compute the one we know really easily how to compute, assume there's a linear relationship between all our inputs and our output, and see what model we get. You'll see we get an RMSE, a root mean squared error, of 429, you'll see other measures of error down here, and it'll tell me about the model. You'll also see that we calculated three different kinds of trees.
The fine tree has the best, that is the lowest, RMSE, but you'll see that the medium tree has only a little more RMSE with fewer coefficients; it's a little less complicated, so I might choose that one and take a look at how the models and predictions look. Instead of just the response, I can look at predicted versus actual (if everything were perfect we'd see a straight line), and I can look at the residuals to see if there are any trends. When I'm done looking at these I can say: which one do I like? I like this one. I can keep the figure if I want, I can export the model so I can use it in MATLAB, and I can also make this reproducible by generating a function, which I just did; you can see it says auto-generated just now, and I happen to be in a time zone where it's noon right now.

Okay, so that's one way you can go about it. I'm going to close this, because I want to show you the other way as well. Notice that with the app I didn't need to be an expert; it's a really nice way to get the information I need, and I can generate the code I want, so the system doesn't require pointing and clicking from someone else later on.

Instead, what I'm going to do now is manually split the data into training and testing sets, and I'm going to choose the year 2012 as my test data set. The reason I'm choosing a full year is that I want all the months represented. I don't actually have to choose one full year; I could still meet that criterion by choosing a month from one year and a different month from another year, or by mixing days across different months and years, as long as every part of the year was represented. But I'm just taking the year 2012 as my test data, so that's the data we use for validation, and the rest is the data we use for training. So I've got my training data and my test data.

We found, by running all of the models at home, that we would do best with a bagged decision tree. Let me explain what a decision tree is while this is calculating. A decision tree is actually a bunch of cascaded if-elseif statements: if the load yesterday was under 300, and if the load an hour ago was under 425, and if the temperature is greater than 37, and so on; when you get to the end, you say the predicted load is whatever it is. You simply go down each branch of the tree to find it. But remember, to do this we left out some data. Was there anything special about the data we left out? Hopefully not; if we left a different 10 percent of the data out, we'd get a different tree, but hopefully a qualitatively similar one. What we found is that a single tree didn't do as well as calculating 30 of them and aggregating them, and that's what TreeBagger does: it does that aggregation. Then, for reasons I don't know, because I'm not in the energy industry, instead of using RMSE or one of the other standard measures of error, they use mean absolute percent error, or MAPE; the whole sequence is sketched below.
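In code, the year-based split, the bagged trees, and a MAPE helper look roughly like this. Again a minimal sketch under assumptions: modelData is a timetable with a Load response and, here, a single Temperature predictor; all names and stand-in values are illustrative, the 30 trees follow the description above, and the real MAPE function in the session may differ.

    % Illustrative stand-in: a timetable of hourly data.
    t = (datetime(2010,1,1):hours(1):datetime(2012,12,31))';
    modelData = timetable(t, 40*rand(numel(t),1), 200 + 500*rand(numel(t),1), ...
                          'VariableNames', {'Temperature','Load'});

    % Use all of 2012 as the test set, everything else for training.
    isTest   = year(modelData.Properties.RowTimes) == 2012;
    trainTbl = modelData(~isTest, :);
    testTbl  = modelData(isTest,  :);

    % Bag 30 regression trees and predict the held-out year.
    bag  = TreeBagger(30, trainTbl.Temperature, trainTbl.Load, ...
                      'Method', 'regression');
    pred = predict(bag, testTbl.Temperature);

    % Compare training and test error with mean absolute percent error.
    trainMAPE = mape(trainTbl.Load, predict(bag, trainTbl.Temperature));
    testMAPE  = mape(testTbl.Load,  pred);

    function m = mape(actual, predicted)
        % Mean absolute percent error between actual and predicted values.
        pctErr = 100 * abs((actual - predicted) ./ actual);
        pctErr(~isfinite(pctErr)) = NaN;   % guard against zero actuals
        m = mean(pctErr, 'omitnan');
    end

Each of the 30 trees sees a bootstrap resample of the training data, and the bag's prediction is the average of the trees' individual predictions, which is why it tends to be steadier than any single tree.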
If I come here and look at the MAPE and say find it, you'll see it's a function sitting at the bottom of this live script; I can put functions in my scripts, which you haven't been able to do before. When I run this, you'll see that on my training set I get a MAPE of 1.33 percent. I don't honestly know if that's great or not, though I think it's pretty good. But when I run it on my prediction set I get 2.65 percent, which is almost twice the size. That's a little scary; maybe my model is no good.

Before I throw my model out, though, I'd like to look a little further. I'm going to plot the data for the year of our test set, 2012, and plot the model against the data it comes from. Here you see the outcome: I've got the load, the model here in blue, what really happened in red, and the error in the model at the bottom. I'm going to pop this figure out of the MATLAB live script so you can see it bigger; that's the best I can do right now. Hopefully you can now see a much bigger picture of what's going on, and you'll see the model does pretty well a lot of the time.

Let's zoom in, though; there are some places where it doesn't do as well. When I zoom in here (I've linked these plots), you'll see around April 17th something that doesn't quite match. What you probably don't know is that there's a holiday some people celebrate around that time, depending on where you are; in Massachusetts it's called Patriots' Day, and I'm not sure what other states call it. That's one place I could look. Another mismatch here is around the date of Memorial Day, which is usually a three or four day weekend for us in the US, so our habits change; those holidays behave a little more weekend-like. The same goes for Christmas time, because a lot of people take time off, and for our Thanksgiving.

But you'll notice there's a glaring mismatch here for quite a chunk, so let me zoom in on that. The dates are late October, October 29th to November 2nd, 2012, New York City. Let me tell you what it is not: we hadn't had the election yet that year, and it was not any sports event (not football, not basketball, not hockey, not baseball), it wasn't the World Cup, and it was not Halloween; even though kids may go out and get sick for one day, they don't disrupt things for five days. If you go to your favorite search engine, you'll find that it was Superstorm Sandy hitting the US then. Am I sad that our model did not predict this? Emotionally, yes, but scientifically, no, because we didn't teach it anything about storms.
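The popped-out figure with linked panels can be built roughly like this; a minimal sketch that reuses the pred and testTbl names assumed in the previous sketch, so the details differ from the session's actual plotting code.

    % Model vs. actual load for the test year, with the error below.
    times = testTbl.Properties.RowTimes;
    ax1 = subplot(2,1,1);
    plot(times, pred, 'b', times, testTbl.Load, 'r')
    legend('Model', 'Actual'), ylabel('Load')
    ax2 = subplot(2,1,2);
    plot(times, testTbl.Load - pred)
    ylabel('Error'), xlabel('Date')

    % Link the time axes so zooming one panel zooms the other too.
    linkaxes([ax1, ax2], 'x')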
Which brings me to the last point I want to make on this: we could choose to teach the model about weekends and holidays, and if we do that, we actually get the model cohering very closely. The MAPE goes down from 1.33 percent to something a lot smaller, and the misfit on the test set is then pretty much all due to Superstorm Sandy.

So let me come back to my slides and clear away these windows so you're not distracted. I talked to you a little bit about accessing the data so that we could pre-process it, and the problem is that it lives in so many different sources that it can be really hard to sort out. You noticed I had to go to two different sources and download from both; fortunately that's pretty easy to set up in MATLAB, and I was able to scrape the websites that way. After getting the data, you saw me use a datastore to do the aggregation, and we have tools throughout MATLAB for different kinds of data, including data that doesn't fit in memory, like the big data I showed you with tall arrays. I showed you some point-and-click tools as well; I didn't show you the Import Tool, which is one more way to load in data, but I did show you some of the others.

The next thing we need to do is take that data and get it into a model. Many of us, like me, are scientists or engineers with a particular area of expertise that may not include data science. Does that mean we have to go find a data scientist? Maybe, but maybe not. Because we're domain experts, we know which features might be important, often better than a data scientist who doesn't know our application area. It takes a lot of time, but it needs people with domain expertise to do it. Then there's model development, which also takes time, but we have the apps and tools to help you look at the data, create the different models, and conduct the analysis, and you saw me take the model from the app and generate code from it. So if we had built our model on a small amount of data, we could use that same code on a much larger data set; it lets us, as non-data scientists, bring our domain expertise and apply data science to the problem we're working on.

The next thing to figure out is how to share, and this one gets harder. I showed you how you could hand someone the live script, and you can also convert it to a PDF or something like that. But you might want to share with people who work in a realm where MATLAB isn't the arena they operate in, so we can also share via an app or via code in another language. There are a bunch of ways to get it out of MATLAB. One is to use one of our coder products, which will generate standalone code for C, C++, and so on; that works at the level of the algorithm. If I want to export the entire system, including the algorithm and the graphics, I can deploy it to enterprise systems, even when it needs to connect and link with people using other languages, whether that's Python or Java or C++. So you're going to use either a coder product or a deployment product there.
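And going back for a second to the datastore and tall-array point from this recap, the out-of-memory pattern looks roughly like this; the file pattern and the Load variable name are illustrative assumptions, not the session's actual files.

    % A datastore treats a whole collection of files as one data source.
    ds = datastore('energy*.csv');

    % A tall table defers computation until you ask for a result.
    tt = tall(ds);
    avgLoad = mean(tt.Load, 'omitnan');   % nothing is computed yet
    avgLoad = gather(avgLoad);            % triggers the pass over the files

The point of the deferred evaluation is that MATLAB can combine several such operations into as few passes over the files as possible, which matters when the data doesn't fit in memory.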
Okay, I just want to make sure you know about MathWorks services you might think about. We have a great group of consultants and trainers around the world, and for those of you at educational institutions we have a bunch of experts in education as well, who often come and visit you, even if only virtually at the moment. Hopefully you're getting the help that you need, and it was my pleasure to help you today.

Let me look for a moment at the questions here. The question is: do you need any special add-ons or toolboxes to work with machine learning or big data? Thank you for asking, and the answer is yes: you need the Statistics and Machine Learning Toolbox, and if you're working with large data and want to take advantage of all your cores or a parallel cluster, you need the Parallel Computing Toolbox and possibly MATLAB Parallel Server, so potentially three things. For many of you at schools with campus-wide licenses, you have access to all of those.

It looks like we may be done, so I want to thank you very much for watching. We have put some links in the description below on the YouTube page, and two in particular you might find interesting: the first is the MATLAB role, and the second, if you're interested in learning more about Simulink, is the Simulink Student Challenge happening soon, where you might win a thousand dollars. Don't forget to subscribe to the MATLAB channel to get reminders for the next event, and I look forward to seeing you next week. Thank you.
Info
Channel: MATLAB
Views: 26,507
Keywords: work from home live, matlab, simulink, mathworks, matlab tutorial, using matlab remotely, working from home matlab, time series data in matlab, how to plot time series data in matlab, time-series, time-series data
Id: b4sq1dIdBS8
Length: 83min 23sec (5003 seconds)
Published: Thu Oct 01 2020