JMP Academic Series: JMP Basics for Professors and Students 08Mar2017

Captions
So again, my name is Mia Stephens, and I'm a member of the JMP academic team. Today's webinar is JMP Basics for Professors and Students. It is designed for any academic who is brand new to JMP and is just getting started, and we'll cover basic navigation and basic functionality in JMP. We'll start with the welcome tour to give you a high-level overview of what JMP is all about and the basic structure and setup of JMP. We'll see tools for summarizing and graphing data in JMP; in fact, this is where we'll spend the bulk of the time. I'll provide an overview of tools for data analysis, but I won't dig deeply into these topics. There are other webinars that go much further; I'll just point out where these tools are available. We'll spend a little bit of time talking about how to get data into JMP and how to save your work, and we'll also see a lot of resources for learning JMP and getting help directly within JMP.

I'm using a journal for this webinar. A journal is just a file within JMP that allows you to organize notes and links to data sets, websites, or other information. I'm using a Mac for this webinar, but note that JMP runs natively on a Windows machine as well, so this is what JMP looks like on a Windows machine. Where there are a couple of little differences, I'll toggle over to Windows and show you what those differences are, but for the most part the basic functionality in JMP is the same whether you're on Windows or Mac.

So let's get started with our little tour, which should answer the question: what is JMP all about? Think of this as your quick five-minute tour. I'm going to click on this link, What is JMP. JMP does have a programming language, but you don't need to know how to write code in order to use JMP. So what I'm going to do is just generate a little bit of output so you can get a high-level view of what JMP is all about and then the native structure of JMP. What I've done is I've opened up a data set, and
this is data from the Sample Data directory under the Help menu. With JMP, the sample data library is organized in an outline sort of mode, and in the sample data directory itself is a number of data sets, somewhere around 500, along with different scripts for teaching and learning JMP. This data set is called SAT by Year, and this is test score data for the United States over an 8-year period. I've just launched a little bit of output from this data set.

The first piece of output, and we'll come back and revisit all of this in a few minutes, is from the first option under Analyze. We group our analyses under the Analyze menu, starting with univariate graphs and statistics, then bivariate graphs and statistics, and as we scroll down the list the methods get much more comprehensive and complex. So Fit Model is for linear models where we have multiple X's or multiple Y's, and JMP also provides a number of more advanced features. For this webinar we'll mostly stick to some of the basic platforms.

So, a little bit about all of our output. If you've never seen JMP before, everything in JMP is dynamic. If I click and drag, I can highlight values in a graph, and notice that the corresponding values are highlighted in every other graph that's open, and they're also highlighted back in the data table. So all graphs are linked to all other graphs and back to the data table.

If you don't like the look or feel of JMP, there are options to customize it within any analysis. For example, if I click on this little red triangle (we hide options under the red triangles to keep the output clean) and select the Stack option, I can convert our histograms and our output to a horizontal look rather than the vertical look. For each individual variable that I'm analyzing there are also red triangles. The red triangles are context-sensitive menus that provide additional options, and the options that are provided make sense for the type of
variables or the type of analysis we've requested. So if we look at the data, SAT Verbal is test scores, so this is continuous or numeric data, and if I click on the red triangle we see options that make sense for one continuous, or numeric, variable. For example, Test Mean is a one-sample t-test, and we can ask for confidence intervals. So we provide a pretty flat initial output with some default options turned on, but if you click on the red triangle you get additional options, and if you right-click anywhere in the output you get additional options that make sense for wherever you happen to right-click. For Region, Region is categorical data, so we see a bar chart and frequency distribution and, again, different options under the red triangle.

Now a little bit more about all of our output. All of the axes are dynamic, so if I click and drag I can shift the axis in one direction or another, or if I move my mouse to the end of an axis I can dynamically rescale or re-bin the axis. There are a number of tools across the top that we'll talk about, for example a grabber that allows you to directly interact with any graph or any analysis. I'll put this back on the cursor tool. All graphs are dynamically resizable by moving your mouse to the bottom corner of the graph, and these little gray icons allow us to tuck away or display parts of our output that we may or may not be interested in looking at. Every table of output is also dynamic, so if I right-click in any table I can ask for additional statistics if they exist, or sort, or make it into a data table, or change the look and feel of the table. So all of the output in JMP is customizable and dynamic.

So this is output from the Distribution platform. Well, if I'm fundamentally interested in graphics, I'll select an option under the Graph menu. With all of our analyses in JMP, if we select an option from Analyze you'll see a graph with the appropriate statistics, so the premise is a
graph for every statistic. If you're fundamentally interested in graphics, dynamic interactive graphics, we'll select an option under the Graph menu, and here, just to get us started, I'll show Graph Builder. Graph Builder is a dynamic platform for dragging and dropping variables. If I click Start Over and simply select variables and drag and drop, we see that JMP produces a graph, and the graph is smart enough to know that SAT Verbal and SAT Math contain continuous, numeric data, so it makes sense to plot those variables against one another. It's put Region up here in this Overlay zone, so the observations, the states, are grouped by region. There are a number of different zones here for dragging and dropping, and we'll take a closer look at this as we go along.

Finally, as part of a quick tour: if you've got data with multiple dimensions, as we do here (we've got a number of states and a number of years, they're grouped into regions, and we've got a number of continuous variables), we can plot the data using a bubble plot. Bubble Plot is an option under the Graph menu. This allows us to see changes over time, and because we have a hierarchy it also allows us to split the data out. So if I click on one of the bubbles and click the Split button, we can see the data that are grouped into that particular region.

All of this was driven from a data table, so a little bit more about our data tables. Data tables consist of columns and rows, and the assumption is that within a column we have the same type of information. The icons on the side tell us a little bit about our data. We've got three panels on the side. We've got a table panel, and the table panel allows us to save notes, so here if I double-click on this note it'll tell us why the points are colored. The reference: this is where the data came from. And these options with the little green triangles are actually saved scripts, so if you're interested in writing code or would like to
save your work out, you can easily save the script to a data table or save it to a file to share with others. The middle panel, the columns panel, tells us about the variables in our data set. We've got 19 columns, and one column is selected. Where you see the little red icon, this tells us the modeling type, and the modeling type is how JMP is going to treat the variable in an analysis. If I right-click on this icon, I can see that there are four different modeling types: Continuous, which is numeric data where calculating an average makes sense or having a decimal place makes sense; Ordinal, which is ordered categories; Nominal, which is simply unordered categories or labels; and there is also None, for variables that we don't plan on using in any analysis.

So, a little bit more about basic navigation in JMP. Again, data are organized in columns and rows, and you see that for the columns and rows there are red icons. Everywhere in JMP you'll see these little red hotspots, and the red hotspots give you additional options based on wherever the hotspot falls. Here I'm clicking on the red triangle for rows, and I see additional options related to rows. If I click on the top red triangle, I get additional options related to columns. I see the same on the side: if I click on the red triangle next to Columns I see additional options, and the same for Rows.

This data set has 19 columns and 408 rows, and because I selected some observations in a histogram, 72 of those rows are selected. To deselect columns and rows, there's a little divider in the top corner of the data table. If I click right below this diagonal I deselect my rows, and if I click right above the diagonal I deselect the columns. You can also use the Control key: if I select a column like State and I want to deselect State, clicking again while holding down the Control key will deselect this column. The same feature works in any graph or any
analysis. The little gray icons, I mentioned these earlier, allow you to tuck away parts of your output that you're not interested in looking at. So here, if I'm not interested in looking at the columns, table, or rows panels and simply want to see the data grid, I can tuck those panels away, and clicking on the gray triangle again returns them.

All right, what else do we want to talk about? Well, when we first open JMP there are a few windows that open up by default, so let's quickly walk through these windows. One of the windows is called the Tip of the Day window. It now has 58 tips for navigating and working in JMP, and these are randomly displayed. For example: how do I automatically save open windows? As I scroll through these, you'll see that these are common tips for navigating and working within JMP. How do I do several things at a time? How do I broadcast commands from one graph to another so that I don't have to repeat the same commands several times? How do I paste column names? Many of the things that we're going to walk through today are covered in the tips of the day, and I'd recommend that once you get started with JMP you take a few moments to walk through these. You'll see a lot of nice little shortcuts for working within JMP.

Another window that opens is called the JMP Starter, and you can also access the JMP Starter from the Window menu if you're on a Mac. The JMP Starter window gives you the ability to quickly navigate through the menus in JMP and see what's available under each menu, and some users prefer to use this as a navigation panel. For example, if I click on Basic we'll see the basic types of analyses that are available, with some text describing each of those analyses, and we can see what's available from Fit Model and Predictive Modeling. Notice that there's a little icon that appears. There are two different versions of JMP: there is JMP and there's JMP Pro, and we'll talk a little bit more about the family of
products in a moment. JMP Pro is essentially JMP with additional tools and advanced functionality for predictive modeling. So Generalized Regression is only available in JMP Pro, there are some advanced mixed models that are only available in JMP Pro, and some partial least squares functionality, and as we scroll down the list you'll see that a lot of the advanced features are only available in JMP Pro. If you are curious about the differences between JMP and JMP Pro (academic licenses include JMP Pro automatically), there's a JMP Pro button at the bottom. I'm using JMP Pro 13 for this demo. Most of what I'm going to talk about is built directly into JMP, but there are some additional features in JMP Pro that we will touch on. This JMP Pro menu tells you what's available only in JMP Pro, and then some features that are JMP Pro that are built into other menus. So the JMP Starter is a nice way of navigating to see what is available from the menu items within JMP.

Now, there's also a Home window, and the Home window gives you a panel that allows you to navigate through your open windows. For example, I've opened the SAT by Year data and JMP Basics for Professors and Students, so these are my two recent files, and then under Windows this allows me to quickly navigate back and forth between my open files and my open analyses. If you're using JMP on a Windows machine, the Home window launches by default, and if you close the Home window it automatically closes JMP, but it gives you the same navigation ability.

Now let me take a step back for a moment. I'm going to close the Home window, and let's talk a little bit about JMP. JMP is actually a family of products. I'm using JMP Pro 13 for this webinar. There's also JMP Clinical for clinical trials and JMP Genomics for doing genetic research. For high schools and community colleges there's a student edition, which is a scaled-back version, and these are direct links to
find additional information. Most academic sites and campuses have access to what we call the academic suite, and the academic suite is JMP 13 and JMP Pro 13 for both Mac and Windows, and it also includes an e-learning course.

So how do you get JMP if you don't have a copy of JMP or you're working off of a trial? If you're a student, researcher, or faculty member at a school or university, you may already have access to a campus-wide license. You can get a short-term license of JMP through onthehub.com; this is a six-month or one-year license. The student edition is bundled with many textbooks for free, but the standard version is also bundled with many textbooks. I'll make this journal available. There are two email addresses here: if you're interested in using JMP in the classroom, Chuck London can provide additional information, and Pawnee Allen can provide information if you're a researcher, or you're interested in JMP Genomics or JMP Clinical, or you're interested in using JMP for institutional research. All of the information on how to get JMP for academic use is available at jmp.com/getjmp.

OK, so with that behind us, I actually want to skip ahead and go right down to resources for learning JMP. There are some really nice resources available under the Help menu. JMP Help is searchable help, and in JMP 13 we converted it to a browser-based version. Books is all of the documentation for JMP, something like 4,000 pages of documentation, so if you have questions on algorithms or you're looking for examples, this is a really nice place to look. The Sample Data Library I mentioned earlier has something like 500 data sets, so if I click on Sample Data Library you'll see that these are grouped by subject, and then there's a searchable list. If you are using a previous version of JMP, or have used a previous version, and are curious about the new features in JMP 13, these are listed
under the New Features tab. You can also go directly to the JMP User Community. You'll find that there are a lot of resources for learning and using JMP posted on the user community, along with examples and a really nice discussion forum that's very active, so our community of JMP users is very active in answering questions, providing examples, and providing additional information. If you run into any problems using JMP, you'll see a link to contact technical support. If you're interested in writing code, you'll see a Scripting Index; this, for example, gives you information on how to integrate with R or with MATLAB. If you're curious about statistical features that are available in JMP, these are provided in the Statistics Index.

Again, I'm going to be using data from the sample data library throughout this webinar. I won't talk about all of these, but there are nice examples grouped by type of analysis and also by type of data, and for classroom use we've grouped a lot of the really nice data sets and scripts under Teaching Resources. For example, under Examples for Teaching you'll see the names of data sets, and here's SAT by Year, and the text describes some nice ways that you can use these data sets in the classroom. If you're teaching an introductory statistics course, you'll see there are a lot of scripts for teaching statistical concepts. We don't have time in this webinar to walk through these, but I want to point out that they're here, and they provide a really nice interface with JMP for exploring core statistical concepts.

Now, the last thing we'll talk about before we go further is that you can easily customize the look and feel of JMP. JMP has preferences. We mentioned that within a graph you can make changes and customize it, but if you want to change something in JMP for every time you do a particular analysis, under the JMP menu you can select Preferences, and Preferences
allows you, for example (here I've turned on a purple laser pointer), to turn on a laser pointer, to change the way your graphs look, to change your styles, and to change the platforms. When I first got started I showed the Distribution platform, and if we always want to see a stacked or horizontal look, you can turn these options on and click Apply and then OK. There are many other options that are available under the red triangles, and you may choose to turn these on.

If you're on a Windows machine, you'll note that menus hide by default. Under File, Preferences, I want to point out that under Windows Specific you can ask JMP to always hide menus and toolbars, and this is the default, or you can change it to Never or Based on Window Size. In fact, I think Based on Window Size is the default. So if you're just getting started with JMP, I'd suggest changing this to Never, and this will give you a menu on every window.

A little bit more on our resources. At jmp.com/teach we've made a lot of our different resources for using JMP available. So if I go back to my browser, all of our resources for using and learning JMP are available from jmp.com/teach. This includes some getting-started videos. All of our live academic webinars are listed here, so there'll be additional webinars for analyzing data and building models, and all of our recorded webinars are posted under the academic webinar library. If you have a specific question, for example, how do I do X in JMP, the learning library has short one-page guides and how-to videos. There's a nice case study library and a lot of other information for getting started with JMP.

So let's talk about summarizing and graphing data. I'm going to keep using this SAT by Year data and highlight some of the nice features for summarizing data and also for producing interactive graphics. I'll start with a tool under the Cols menu called the Columns Viewer, and the Columns Viewer is a nice place to get
started any time you're first looking at data, if you're interested in looking at numeric summaries of the data. For example, I can click and drag to select all of the variables, or if I click and hold down the Control key I can select variables that are not contiguous, or if I click and hold down the Shift key I can select all of the variables between the selected variables. So there are different ways of selecting variables. The options here are pretty simple. If I want to show quartiles in addition to the basic summary statistics, I'll select Show Quartiles and then select Show Summary. I'll make this window a little smaller so I can see everything. All of my variables are currently selected, so I'm going to deselect by clicking this Clear Select button.

This gives me a high-level overview of all of the information in this data set. I can see that I've got 51 states, and I can see information for expenditure, student-faculty ratio, salaries, and the percent of students in the states that are taking the exam. The statistics that are produced depend on whether the variables are coded as nominal (categorical) or continuous. For categorical data I see the number of categories, so I've got 51 states, 8 years, and 8 regions. If I were missing data I'd see a new column in here called N Missing; in this case we're not missing any data for any of the variables. For continuous data we see the minimum, the maximum, the mean, and the standard deviation, and because I selected Show Quartiles we also see the median through the interquartile range.

So this is a good place to go to explore your data: to look at the range of values in your data set, or to compare the mean to the median if you're interested in the shape of the distribution. It's also a good way to look at your data to see if your data need to be coded any differently. For example, this data set includes latitude and longitude, and we can see that the data come in with degrees north and degrees west, so the data are coded geographically. So how do
we code our data, and if we see that the data are miscoded, how do we change this? Well, with these two variables selected, I can either go back to the data table or go to Cols and then Column Info, and this is where I see the metadata for our variables. I've got both of these variables selected; they're coded as numeric, and the modeling type, which is how JMP will treat them in an analysis, is continuous. We can also change the format. In this case these are coded as latitude, and if I go down to Geographic we can see the option that's selected. You can code your data in a variety of different formats, and you can change the width that displays and the number of decimal places.

Under Column Properties you can specify notes or information regarding how the values display. For example, if the data are entered as 0/1 or pass/fail and you want to show labels instead of 0 and 1, or order them pass or fail depending on which one you want to appear first in an analysis, you can change properties under Column Properties. And a note on Value Ordering: JMP will plot alphanumeric data in order, so for example if I've got data entered as small, medium, large, the large will plot first. Value Ordering can be used to plot the data in the order you'd like it to appear in a graph. There are many other options available under Column Properties, so I'm going to click Cancel here.

Let's say that there are certain variables we're interested in looking at further, for example Verbal, Math, and Region. There's a link to the Distribution platform directly under Summary Statistics; in fact, if I click Distribution it launches the Distribution platform. But let's see how to get there from scratch. I'm going to close this window, close the Columns Viewer, and go to Analyze, Distribution, and let me tuck this away in the background to clean up my screen a little bit. So this is what we call a launch dialog, and
our launch dialogs generally only ask for the variables that we're interested in looking at. They ask for very few options, and the options that appear will depend on the type of variable we're looking at. Recall that this little blue triangle tells us we've got continuous data, and the red bars indicate that we've got categorical, or nominal, data. If I select those variables and click Y, Columns, we can plot as many variables at a time as we'd like. A couple of options here are nice if you're dealing with big data sets with lots and lots of variables. If you're interested in only looking at the graphs, click Histograms Only. And to help you with a long list of variables, the red triangle next to the list of columns allows you to search for certain types of variables or exclude certain formats from the search.

So I'll click OK here. I set a preference to turn my output sideways, so I've asked for a horizontal layout that's also stacked. Again, under the red triangle you can change the look and feel by asking for Uniform Scaling, or Stack, which converts it to horizontal. If you ask for a lot of different variables, you can arrange them in rows to make it a little bit easier to look at. And again you have additional options under the red triangle. For example, we can change the display of the histogram and the color, we can add axes, and we can ask for a normal quantile plot. These are toggle switches, so if I ask for a normal quantile plot we see the normal quantile plot (again, the points are color-coded from the data set), and if we want to turn that off we simply select it again. Again, everything is linked to everything else.

If we're interested in additional statistics: by default we get the quantiles and we also get summary statistics. There's a red triangle next to Summary Statistics that allows us to ask for additional statistics, so for example I'll often
turn on the number missing, or sometimes I look at robust measures if I know that I've got outliers. You can also change the alpha level for the confidence interval or the trimmed mean percent. I'll simply hit Cancel here.

Now a little bit of information on all of our output. If you're looking for help on your output, there are two forms of help directly available. Recall that there's searchable help from the Help menu. There's also what we call hover help. For example, these two values here are the 95% confidence interval for the mean. If you have a statistic and you're interested in information regarding that particular statistic, hold your mouse on top of that statistic, or if you're on a Windows machine hover around in a clockwise direction, and you'll see a little box appear. That little box is called hover help, and how fast it appears depends on how fast you have your mouse set. Here we give a definition of what we're looking at, and we generally give a little bit of help with interpretation, so this is the upper end of a 95% confidence interval. So anywhere you see a statistic in JMP, hover your mouse in a clockwise direction and you'll see a little box appear.

A second form of help that is really nice, from any window, is the question mark (I'll go ahead and make this plot above the histogram a little bit bigger so we can see it better). Again, if you're on a Windows machine your menu hides automatically unless you change it in your preferences, but you can also ask for it by hitting the Alt key on your keyboard, or the Option key. So I've got this plot, and this is a box plot, but let's say we're interested in additional information regarding this box plot. The question mark gives us direct access to our help. If you click on the question mark, move your question mark over wherever you'd like additional information, and click, this launches the interactive help. We can see that this is an outlier box plot, and it gives us a definition of
the components of the box plot, and then we can refine our search using the options on the side. So again, there are two forms of help directly within any platform: for hover help, simply hover on top of a statistic, and for the question mark, click on the question mark and drop it wherever you're looking for additional information. I'll make this a little smaller again.

I mentioned that you can plot many variables at a time, and keep in mind that JMP will update the type of analysis and graphs produced based on the type of data selected. Here I've got two continuous variables and a categorical variable. For categorical variables, by default we see bar charts and frequency distributions. Again, all the tables in JMP are interactive, so if we want to add additional information or sort by certain values, you can right-click directly in a table, and additional options are available under the red triangle. For categorical data, different options are available than we see for continuous data. For continuous data we might want summary information, we might want to perform a hypothesis test, we might want to change the alpha level on a confidence interval, or we might want to explore the distributional fit. For categorical data there are fewer options available. So this is the Distribution platform, and I'll omit the discussion of basic inference and hypothesis testing; these topics will be covered in a future webinar.

So far we've seen the Columns Viewer for looking at summary statistics one variable at a time, and Distribution for looking at univariate graphs and statistics. Well, what if I want to look at cross tabs, so I want to summarize on more than one variable at a time? For this I'll use Tabulate. Think of Tabulate as a drag-and-drop version of a pivot table in JMP. We've got a drop zone for columns, which is where we put the columns we'd like to analyze, and a drop zone for rows, and the middle part, the resulting cells, is where we show the statistics
that we've computed. Let's say, for example, we're interested in looking at Math and Verbal. I'll select these, holding the Control key to extend the selection, and as I drag these towards the palette, notice that there's a blue outline around the zones. The blue outline indicates that if I release the variables there, JMP will be able to do something with the information in these variables. So I'll let go under the drop zone for columns, and by default JMP gives us the sum for SAT Verbal and SAT Math. Additional statistics are listed in the middle, so for example I might be interested in the mean and the standard deviation. If I drag and drop these two statistics right in the results panel, it'll add these statistics in addition to the sum. Now, to get rid of the sum, on a Mac I could just drag it away, and you see a little garbage symbol appear. I'll hit Undo. You can also right-click and hit Delete, but I'm going to hit Undo one more time. Another way that I can override the sum is simply by dropping these two statistics on top of the word Sum, which tells JMP to replace the sum with these two statistics.

Now, the other drop zone is a tiny little box, and I can drag and drop other variables to this tiny little box. I've got a lot of decimal places here, so to remove some of those decimal places I'll click Change Format. There are a lot of different formats here; what I generally use is Use the Same Decimal Format, and then Fixed Dec to indicate the number of decimal places to display.

Now, we can continue to add additional information to this. For example, we can add another variable by dragging to the end of the table, or we can break the data down further by dropping something in the drop zone for rows. Notice if I let go of Region... I already have Region, so let me add Year. If I add Year before Region, this creates a hierarchy, so it's summarizing first by year and then by region, and if I drag Year after Region, this summarizes by region and then by year.
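
For readers who want to see the grouping idea outside JMP, the hierarchy that Tabulate builds (mean and standard deviation, broken down by year then region, or by region then year) can be sketched in a few lines of Python. This is only an illustration of the concept, not how JMP computes it: the rows are made-up values rather than the SAT by Year data, and the names `ROWS`, `FIELDS`, and `tabulate` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

# Made-up illustration rows (NOT the SAT by Year data): (year, region, sat_math)
ROWS = [
    (1992, "South", 500.0),
    (1992, "South", 520.0),
    (1992, "West", 510.0),
    (1992, "West", 530.0),
    (1993, "South", 505.0),
    (1993, "South", 525.0),
    (1993, "West", 515.0),
    (1993, "West", 535.0),
]

# Position of each grouping variable within a row.
FIELDS = {"year": 0, "region": 1}


def tabulate(rows, group_by):
    """Return {group key: (mean, std dev)} of the score column.

    group_by is an ordered list of grouping names; its order sets the
    hierarchy, much like the order of variables in the row drop zone.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[FIELDS[name]] for name in group_by)
        groups[key].append(row[2])
    # Sorting the keys nests the second variable inside the first.
    return {key: (mean(vals), stdev(vals)) for key, vals in sorted(groups.items())}


by_year_then_region = tabulate(ROWS, ["year", "region"])
by_region_then_year = tabulate(ROWS, ["region", "year"])

for key, (avg, sd) in by_year_then_region.items():
    print(key, f"mean={avg:.1f}", f"sd={sd:.2f}")
```

Swapping the order of the names in `group_by` changes only how the same cells are nested, which mirrors dragging Year before or after Region.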
Now, a little bit on how to save our work. Let's say, for example, we want to be able to save the keystrokes to be able to regenerate this later. Recall that you can write code directly within JMP; our scripting language is called JSL. Again, you don't need to know how to write code to use JMP, but under the red triangle in every analysis there's an option, Save Script. We changed this a little bit in JMP 13, and there's an option to save it to the data table. When I click Save Script to Data Table, it gives us the option to name the script, and we can also replace existing scripts if we've been saving these regularly. I'll click OK, and notice that back in the data table there's a new option. If I close this output and simply click on the green triangle, or right-click and click Run Script, it'll regenerate the analysis exactly as I had it. Now, there are different ways of saving our work, and I'll talk about these as we're going along. For example, we might choose to pull this into a Word file or some other file. The third icon over on our toolbar, the fat plus sign, is our selection tool. So let's say I'm interested in everything grouped under the gray outline bar. If I click on any section of JMP output, notice that it highlights it, and now I can use the typical shortcuts in JMP, like Edit, Copy, and go over to Word, if I have Word open.
I'll go ahead and reopen Word and get a blank document. Now I can simply paste, or Control-V, and this pastes in an editable format. If I want to paste in an object, I'll hit Undo, and then under the little drop-down is an option, Paste Special. If you're on a Mac, the option to paste an object is PDF; if you're on Windows, it's Enhanced Metafile, and this will paste it as an object exactly as you saw it in JMP. So Control-C, Control-V to paste, or Control-C, Paste Special to paste as an object. So, a couple of ways of saving your work. While we're here, we can talk about a few other ways. I've got this selected, and I'll hit Control-C, and then under File there's an option to export, and we can export in a number of different formats: Image, HTML, or Interactive HTML with Data. This will embed the data in the background and allow users to interact with your data, so you have to be a little bit careful with this one. You can also export your analysis as a Microsoft PowerPoint. If you're on a Windows machine, these options are under Edit, and it's called Save Selection As; you'll see the same options, but you won't see this little window pop up. So let's move on from Tabulate. So far we've seen the Columns Viewer and we've seen Distribution. If I want to look at one variable against another and I know which variables I'm interested in looking at, I can also use Fit Y by X. Fit Y by X lets me look at one variable against a second variable, and you have a key in the bottom corner that defines the type of analysis you'll get based on the variables that you select. We'll do this quickly; we won't spend a lot of time here. Again, there's another webinar coming up, Data Summary and Analysis with JMP, and we'll go into this in much more detail in that webinar. But let's say, for example, I'm interested in looking at the relationship between SAT Math and region. By default this will take us to what we call the Oneway
platform. Or maybe we're interested in looking at the relationship between math and verbal. We'll assume that doing this analysis makes sense. If I click OK, for the relationship between math and region we see a scatterplot with several options under the red triangle. Note that many of these are related to graphics and many of them are related to performing inference, so hypothesis testing and equivalence testing, and you have a lot of different display options under the red triangle. For math versus verbal we have a bivariate fit, and from here we can fit a line, use Fit Special, fit a polynomial, or ask for a density ellipse, where we can ask for the correlation. So that's Fit Y by X. But if we don't really know the type of analysis we're interested in, or we're really in an exploratory mode, under Graph we'll use Graph Builder. Graph Builder is a really versatile graphing platform for interacting with your data and exploring your data. When we first got started, I simply dragged and dropped these variables into the middle of this palette, and this plotted math versus verbal with region as the overlay variable. Again, you can click and drag and move variables to these different zones, but let's go ahead and start over. Let's say I'm interested in math. As I drag this near the palette, notice that the zones all highlight in blue, so everywhere you see a blue highlight is a zone where we can release this variable. This is what it looks like if we let go in Y, and I'm simply dragging without releasing: there's X; Freq, if you've got summarized data; Page, if you want a different page or a different graph for every category of that variable; and here, if it's continuous, it'll automatically bin. Group Y bins, and Group X also provides binning. There's Overlay, Color, and Size, so lots of different places to let go of variables. If I let go of SAT Math in the Y zone and let go of verbal in the X zone, by default we get a scatterplot, and for
any graph type that appears, you'll see additional options on the side. By default we see the points, and there are additional options related to points here; for example, jitter is an option that can be turned on in certain graphs. We also see a smoother, and a smoother has a little slider that allows us to change the bendiness of the smooth line that's been fit. All the graph elements can be requested from this ribbon at the top of the graph. For example, if I'm plotting a continuous versus a continuous variable, having a smoother might make sense, or having a line of fit might make sense. And again, as you hold your mouse over any of these icons, you see additional options. So this is a line of fit, and from the line of fit I can change the degree of the fit, so quadratic or cubic. I can also change what displays by default, so for example I might use this before fitting a model to explore the type of model that might make sense. We can add additional information to this graph. For example, I've got a line of fit, but maybe I want the line of fit for each of the particular regions. If I drag region to Overlay, it fits each one of these lines on top of one another, or I could color by region, or size the points by region, or wrap or group in one direction or the other. So it's a really nice interactive graphing platform where you can look at your data in all sorts of different ways. I'll turn off some of these statistics here and set this back to linear. What else is available for continuous data? Well, we can ask for a density ellipse and ask to display the correlation. We can ask for contours, and some of these may or may not make sense. For example, a bar chart is available, but it doesn't necessarily make sense. If I replace verbal with region, notice that there are different options available. In this case the bar chart does make sense, but I can also ask for a box plot or histograms, and as we scroll through we see some things that make sense and some
things that might not make sense given the context. And again, for any one of the graph elements that we request, the axes are dynamic. For example, if I want to change the axis for SAT Math, if I double-click I have a specification window where I can change the type of scale, change the format, the minimum and the maximum, and the number of increments. I can change the way that the tick marks display, and I can add reference lines. For a categorical axis you also have options; for example, you may choose to change the labels that display or change the orientation of the labels. So there are lots of things that you can do directly from any axis. You can also click and change any label, so if you don't like the label, you can simply click on it and change it. When you have a graph that you decide is the finished product, there's a Done button here, and the Done button simply closes this control panel on the side. So let's say this is the graph that I'd like to make my final graph. Again, you can change any text, or you can right-click and change the fonts and font colors, or change axes around. You can also resize the graph dynamically, and there are additional options under the red triangle. For example, we may not want to show the legend or a footer, or we may want to include missing categories. So there are lots of options under the red triangle, and I'll ask for that control panel one more time. I'm going to hit Start Over, and just briefly, this is one of the platforms in JMP where you can do geographic mapping. If I drag State to this Map Shape zone: JMP ships with shape files, but you can also add your own, and what this is doing is telling JMP to go look through the shape files that ship with the product until it finds Alabama and Alaska. There are two files: one that has the names of the shapes, and a second one that has the XY boundaries. So JMP has found the shape file for US states, and there are several that ship with the product. I think there's counties, there's also world countries, and
for certain countries there are territories within those countries that are also available, and again, you can easily add your own. If I want to be able to color this by, for example, math scores, I can drag math to the Color zone. I've changed a preference, so the color is different, but you can change colors as you go along; for example, if you want to change the gradient or change the scale, you can easily do this. Now, a couple of little features for interacting with graphs, and again this is sort of an overview. We may want to take this graph, or any analysis, and see how the picture changes as we explore other variables, and there are a couple of nice features for doing this. On a Mac you see there are two little icons, Local Data Filter and Column Switcher. These are also available under the red triangle: there's Local Data Filter, and Column Switcher is under Redo. For example, I might want to plot several different variables on this, so I can swap out SAT Math with whatever variables I'm interested in. If I click and drag, or hold down the Control key to extend the selection, we can select as many variables as we're interested in, and this allows us to update this picture based on other variables. So here I'm plotting math, verbal, and percent taking. If you've got a lot of different variables you want to explore, this is a really nice way to do it, and you can also do this from within any analysis in JMP. I'll put it back on SAT Math and remove the Column Switcher. What if we want to try to dig into this and understand why this might be happening? The Local Data Filter allows us to explore this picture based on values of other variables. For example, I've got this variable, percent taking. I'll click Add, and this gives me a slider with the values that this variable takes on. Percent taking is a variable that represents the percentage of students in the state that take this exam, so in some states only 5 percent of the students take it, and in other states 87 percent of the students take
it. Now I can just click and drag and change this, and notice how, when we get to the upper end, certain states are selected and are displaying, and as I get down towards the lower end, we see the picture switches. So the Local Data Filter, and you can build a really nice hierarchy here, can be used to explore a particular graph or display. There's also a global version of this, so I'm going to remove the Local Data Filter. If you want to interact with all of your output and every analysis that's open, under Rows there's an option called Data Filter, and this is what we call a global data filter. This will impact your data table and all open graphs or analyses. For example, if I'm interested in looking at that same variable and I slide, notice it selects the observations corresponding to these values in the graph and also back in the data table. If I check the Show box, this is equivalent to asking JMP, under Rows, to hide values. There's a hide option, there's an exclude option, and there's one that does both. Hiding tells JMP, don't let me see these points in any picture, and you can do this manually. There may be points that are distorting the graph or the analysis. If you don't want to see them on the graph, you hide; if you don't want them to be included in an analysis, use the exclude option. The exclude option shows this "don't" sign, and this tells JMP, don't put these points in any future calculation. So as I click and drag, I'm basically updating the picture by hiding and excluding observations in the data table, and my graph then will only display values that meet the criteria that I've set. So Data Filter is a global filter based on variables in your data set, the Local Data Filter will only interact with the current graph, and hide and exclude give you a nice way to interact with data and to remove observations from analyses or graphs that you may not be interested in looking at. So I'm
going to go to Clear Row States to wipe that out. We've covered most of the options I wanted to discuss under summarizing and graphing data, and we've talked a little bit about Distribution. Distribution is for univariate graphs, but also for statistics and inference, so confidence intervals and hypothesis tests. We saw a little bit of Fit Y by X, and remember that Fit Y by X gives you this little grid that tells you the type of analysis you'll get based on the variables you select. From here we can ask for ANOVA or a t-test, simple linear regression, logistic regression, or a contingency table. This is any time we've got one X and one Y. If you want to fit a model where you've got multiple X's or multiple Y's, we use Fit Model. In Fit Model, if your response is continuous, by default you get standard least squares regression; if your response is categorical, you get logistic regression. From here you can also fit mixed models and perform stepwise regression, you can easily visualize your model using what we call the Prediction Profiler, and you can also run simulations. So Fit Model, and I'll simply go to this quickly since we haven't seen it yet, has lots of options available under Personality, and some of these are only available in JMP Pro. For example, mixed models and generalized regression, which is a modern generalized linear modeling platform that also includes penalized methods, are only available in JMP Pro. There is a standard version of a mixed model available in JMP, but this assumes that you've got a structured covariance matrix. So let me close this. A little bit more about basic analysis features, and again, this is just a high-level overview; we cover these in a future webinar. There are also a lot of multivariate procedures available, so under Analyze you see Multivariate Methods, and in JMP 13 we restructured our menus slightly. Under Multivariate Methods, if you're interested in pairwise correlations, use the Multivariate option, but you can also ask for principal
components or discriminant analysis, there are several clustering features, and in JMP 13 we added latent class analysis. Cluster Variables is also available, and you see that there are a number of other options available. Another new feature that we added in JMP 13 is the Text Explorer. If you're dealing with unstructured text data, for example survey data, and we ask for the Text Explorer, there's a new data type or modeling type for unstructured text, and here we can explore phrases and words. There are lots of different options available for working with unstructured data, for example latent class analysis for clustering documents, and latent semantic analysis, which is similar to principal components. You can also ask for a word cloud. It's a really nice new platform in JMP 13. So we wanted to talk a little bit about getting data into JMP. We've already talked a little bit about saving our work: I mentioned how to copy and paste, and I did mention how to save your scripts to a data table. You can save HTML and high-resolution graphics, and you can also save to PowerPoint. Let's talk about getting data into JMP. A common question is how much data JMP can handle, and the general rule of thumb is that you need twice as much memory as your largest file size to be able to effectively deal with the data in JMP. JMP is an in-memory program, so generally, if you have one or two gigabytes as your maximum file size and you've got twice that in terms of memory, JMP can generally handle it OK. If you want to create a new data table in JMP, use File, New, Data Table, and by default you get an empty data table with one column that's coded as continuous. If you want to change the name, simply click and type; I'll call this one gender. If you want to add new columns, you can simply double-click in the space provided after the original column, or you can right-click and ask for new columns if you know how many columns you want to add. So for example, let's say I know I
need three columns. If I just double-click in the third column, it'll add three columns. Let's call this one height and call this one weight. Now, a little bit about our data. In general, JMP is case-sensitive. For example, let's say I have a female; I'll just enter F for female, and if I use the down arrow, it creates that one row and gives me a new row. If I enter a lowercase f, now JMP thinks I have two separate categories of female. Notice also that by accepting F, JMP has automatically changed the modeling type to categorical, or nominal. So JMP is case-sensitive, and remember that JMP analyses are based on the data type and the modeling type that we have. A little bit more about JMP: let's say this person is 5 feet 8 inches tall. I wouldn't want to enter this person like this, because any time JMP sees special symbols, JMP thinks of the data as text. For something like height, we instead want to enter this person in decimal format, so we'll say 68 inches. You can add special symbols to graphs, but your data needs to go in as pure numbers. There are features in JMP for cleaning up data like this. For example, if I had a big data set and I knew that I had some issues with leading spaces or with capitalization, under Cols there's a nice Recode option that allows us to easily clean up data quality issues. Now, I can also easily pull data in from any other format. For example, if I've got data in Excel, and I thought I had Excel open here, so let me go ahead and open up Excel again. Let's say I've got data in Excel, I've got this data set called banking data, and notice the first row is my column header. If I click and drag, and I'll just select some of these, and hit Control-C, JMP needs to recognize that first row as my column headers. If I go back to JMP and I do File, New, Data Table, this gives me a new blank data table, and I don't want to just paste the data in, because JMP won't recognize that the first row is column headers. So you have an option under Edit
called Paste With Column Names, and this tells JMP to recognize that first row as the column names. Notice that JMP has correctly assigned the modeling types, so overdraft protection, yes, and the last three variables are all continuous. The first thing you want to do any time you pull data into JMP is make sure that the correct modeling types are assigned. If JMP sees numbers, it's going to think the column is continuous; if JMP sees text, it's going to think it's categorical. Now, we can also pull data in directly. For example, this banking example is just a link to some Excel data. As soon as I click on Banking Example, it gives me a sneak peek at what this spreadsheet looks like. This spreadsheet actually has two sheets: I've got descriptions, and then I've also got the data. By default the display would look like this. In this particular example, my column headers are in row 3 and my data start in row 4, so you can use the options on the side to tell JMP where the data actually begins and to make sure that the column header row is actually reflected. So my column headers start on row 4 and my data starts in row 5. You can also use this to concatenate spreadsheets and also to restructure your data if it's in Excel. If I click Import, JMP brings the data in, and it has known to use this first row as the column header. Now, you can also open data in a variety of different formats. I'll go to Windows for a second, and I think I'm just about out of time. I want to open up a data set in JMP, and I'll just do this on the Windows side. If I hit File, Open, you'll see on the side, under All JMP Files, that JMP can open a variety of different types of files. For example, we can easily read in text files and Excel files in different formats. We integrate with SAS and R and MATLAB, and we can also read in SPSS files, so you can read a variety of different types of files directly into JMP. If you're on the Windows side, there's also an Excel add-in, so if you're in Excel on the
Windows side, you can use this add-in to push data directly from Excel over to JMP. So I'm just about out of time. We will post this journal and the recording in the webinar library, and this will appear in a day or so. With this, I will go ahead and just summarize what we've done. We talked about basic navigation in JMP, with a focus on tools for summarizing and graphing data, and a short overview of tools for analyzing data, and remember that there will be another webinar specifically on data summary and analysis. We saw how to get data into JMP and save your work, and I talked about a few little tips and tricks as we were going along. And recall that all of our resources are posted at jmp.com/teach. So I will stop there and open this up for questions.
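The data-entry tips from the session (values are case-sensitive, heights should go in as pure numbers, and numbers versus text drive the modeling type) can be sketched outside JMP as follows. This is only an illustration under stated assumptions: `recode`, `height_to_inches`, and `guess_modeling_type` are hypothetical helper names invented for this example, not JMP functions, and the type rule is a simplification of what the talk describes.

```python
import re

def recode(values):
    """Trim spaces and unify capitalization so 'F' and ' f' become one category."""
    return [value.strip().upper() for value in values]

def height_to_inches(text):
    """Convert a feet-and-inches entry such as 5'8" to a plain number of inches."""
    match = re.fullmatch(r'''\s*(\d+)'\s*(\d+)"?\s*''', text)
    if match is None:
        raise ValueError(f"unrecognized height: {text!r}")
    feet, inches = int(match.group(1)), int(match.group(2))
    return feet * 12 + inches

def guess_modeling_type(values):
    """Simplified rule from the talk: all numbers -> continuous, any text -> nominal."""
    try:
        for value in values:
            float(value)
    except ValueError:
        return "nominal"
    return "continuous"

genders = ["F", " f", "M"]
print(len(set(genders)), "categories before recoding")
print(len(set(recode(genders))), "categories after recoding")
print(height_to_inches("5'8\""), "inches")
print(guess_modeling_type(["5'8\""]), "versus", guess_modeling_type(["68", "70.5"]))
```

The point of the last line is the one made in the talk: a height typed with symbols is read as text, while the same height entered as 68 is treated as a number.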
Info
Channel: Mia Stephens
Views: 561
Rating: 5 out of 5
Keywords: jmp, academic, data summary, data analysis, graph builder, geographic mapping
Id: 69CHfImmbSw
Length: 58min 41sec (3521 seconds)
Published: Wed Mar 08 2017