Get it Right in Black & White Episode 9 - Reusable Charts

Captions
all right here we are we're live [Music] all right welcome everybody how's everybody doing hi karen i'm all right hi sriram hi anita hi good morning oh yeah all right well i think i'll uh dive right in let me just check if everything is working the stream i think is live all right so today i'd like to talk about reusable charts this has been a fascination of mine for years because every time you figure out how to make a certain visualization um it becomes sort of a telescope that you can point in different directions right at different data sets and encapsulating the reusable elements has always been somewhat of a challenge hey felipe how are you hey i'm fine thank you what about you good good good we got a full house today we got sriram nita adil and felipe this is amazing very cool so yeah today i'm going to talk about reusable charts and what i'm going to do today the idea is to take what we did last time refactor it a little bit start to use es6 modules and then adopt this reusable charts pattern which is sort of the the module pattern that is is commonly found with pure d3 implementations but first let me review the submissions from last week yeah we got some really good stuff this week uh here's philippe's work title to the dot um nice so there's little tool tips that you get when you hover um you want to talk about this a little bit well that's that's not much but this is just uh the title i didn't um knew how to use the invert function so i add p x and py in the marks uh but what px and py is doing it's pretty much the invert function if you can scroll a little bit uh yeah there you go right uh nice uh but maybe there is some kind of invert function that i could use in the text i think andrea i guess let's talk about this in one of the posts right this is just just to have the title and i i got this from your cheat oh yeah cheap tricks for interaction yeah nice let me just provide a little bit of feedback on this so this step here creating marks as a 
derivative array of data: this is where all of the transformation that you need in the visualization can happen, including logic like putting together strings. Doing it this way actually puts a bunch of intelligence into the rendering logic, meaning the string of the tooltip is derived in the rendering logic rather than in the pre-processing step. So here's one thing that you could do: move this logic out of here and use something like d.title instead. Then you could just say the title of each mark is d.title, and you can get rid of these intermediate values px and py. It's a very small change, but this is more in the spirit of... Then here, do I have to put d dot x...? Yeah. Okay, no, yeah, a good catch, I just overlooked that. Yep, because xValue and d are visible here, so this should still be working. Yep, it's still there, nice. But yeah, great work adding that. That's something that we had talked about during last week's session and I never added it, so I'm happy to see that it was added here. Yes, this was just after the last class, and my exercises are at the end of the posts. It's a little tricky, though, just so you realize: whenever this runs, it's going to append another title every time. That's not a problem here because it only runs once, but it's something to be aware of, and in the future we'll dig into the patterns that you need to make it work properly. But yeah, nice work, very nice. Let's see... oh, this one's beautiful, this is awesome. Well, let's read what it says: daily new cases of COVID in Italy, from Andrea. This is awesome. Yes, the x-axis uses scaleTime, excellent, and there's some interactivity here. Let's check it out: toggle path. Whoa, it turns the path on and off, that's super nice. Oh my gosh, beautiful. Let's take a look at the code real quick. I see that there are different sizes in use as well. Ah, so when the path is
toggled, it just changes the display attribute on the path. Brilliant, very nice. So it parses the date, a pretty generic way of parsing the numbers, and let's check out the rendering. Nice, so the radius here changes along with x and y. Beautiful way of doing it. But I have to say, with radius here, let me just fork this and suggest a tweak. With radius, it's good practice to have it so that the area of the circle corresponds with the data values, so one filled-in pixel would correspond with one unit of the data. This method here doesn't accomplish that because it uses a linear scale, but the area of a circle grows with the square of the radius, so the radius should be a function of the square root of the value. So we can change this to scaleSqrt, which we can also get from d3, like that. Then the range really should start at zero, and the domain really should go from zero to the max of the data. And yeah, max we need to import from d3. So now that the domain starts at zero, the range starts at zero, and we're using a square root scale, the area of the circles is actually proportional to the value. You see, as it goes down it gets down to nothing, which is what you'd expect, and the higher values are bigger circles. Well, actually, what is the radius value? Terapia intensiva. Nice. I don't know what it means, but this is beautiful work. Let's see what else we've got. It means intensive treatment. Ah, intensive treatment, interesting. So it's an interesting pattern here. I'm not quite sure what to make of it, but as the case numbers were going up, that number was not that high, and later on that number was higher. Oh, someone's joining. Hello! All right, we got someone new. How do you say, Maximiliano? You want to introduce yourself a little bit? Hello Curran, how are you? Good, good, how are you? Yes, I'm Max, you can call me Max, it's okay, and I'm from Chile. I
haven't been able to to see the course uh like it's online last the same day so yeah i managed to do it today sorry great to meet you nice welcome welcome i'm glad you could join us so yeah we got a full house today bunch of people sri ram adil anita and felipe is here so um yeah as i dig in feel free to interrupt me and ask questions to clarify as we go awesome thanks awesome so yeah this week there was just a lot of activity in this forum like a lot of back and forth and i'm really happy to see that uh oh here's here's something that adil made dinosaur d3 scatter plot wow ideally you want to just talk briefly about this one uh sure yeah this was um based on a data set from the natural history museum in london um they have about 300 plus dinosaurs in that data set and um i just wanted to try uh playing with it a bit using this and uh so i thought try plotting the length of the dinosaur uh on the ax on the x-axis with the weight of the dinosaur on the y-axis and um i then also stole the uh the tip from felipe and yourself about implementing these little uh tooltips um and uh yeah so it was just really uh just just uh getting familiar with the data set and um it's a it's a bit patchy the the data so i had to do a bit of extra processing because not all of the dinosaurs had a weight um and so uh from 300 plus dinosaurs i could only get uh about 49 i think which had both length and weight and then i uh yeah just just uh just try to create a label early on that combined some interesting bits about each dinosaur nice that's beautiful beautiful stuff the uh the one in the corner i thought might might have been a typo uh the far corner but it turns out it's actually a um the biggest land animal that's ever been discovered wow that's incredible 70 000 kilograms argentinosaurus 35 meters 70 000 kilograms you got to be kidding me yeah i looked on wikipedia apparently 70 000 was kind of average for that dinosaur yeah not that it's just sort of average wow um one really small 
thing that i would suggest um so when you have numbers like 70 000 i don't think we've touched upon number formatters but it's a really quick change that i would like to just do right now just to show how it could be done so the idea is to oh it looks like you used it here um i think that was already built into your fork i think i see oh maybe the axis does it automatically format but if you so one like the first thing i noticed was in the tool tip 70 000 kg um ideally would have a comma after the 70 and we can add it pretty easily by importing format from d3 and then making something called uh i'll call it comma format and we can just call format with a comma and format accepts this you know very specific string that you can do all sorts of things with but a very simple version of it is just to specify to to add a comma separator and then in your code that generates the label we can just pass that through that function like so so whatever the the y value is we format it with a comma and voila there it is 70 comma thousands it's very nice oh that's that's really nice yeah thank you yeah nice great work very fascinating data set and i think that's it if if you can just go a little bit up uh previous to this one yeah this one um oh yeah sorry yeah just just it's a little bit different it's it's the same data set but i used colors the name of the the class but my goal here was to have all the information that we had in the data set of the flowers uh here so i put also the a few mouths over the color means that the specimen yep right the species very nice yep and uh the elliptical radius is the sepal but that's my question because if i change the order of the data the petal and the sepal uh the radius becomes very small so it almost vanished so how can i like normalize the radius right yeah that's a good question i mean let me dig in a little bit it it depends on how you use the scales like which scales you use for the radius so oh you had a question how can i usually 
change shapes according to the data um yeah this this questions like because i didn't want to use ellipses in the beginning right i thought okay i can use different marks but then i said okay i don't know how to do this so i will just use ellipses and then i can use the the radio different radius different right right i mean it's um it's it's creative to use an ellipse that has you know a different width and height for the different values here but it may be a bit hard to read and so you know maybe maybe the best we could do is just show three variables i mean color is color color is a great choice for the species because it represents a you know a column that has three distinct values uh quant uh yeah qualitative uh categorical attribute but when it comes to the radius um i would say again it might make sense to use circles and then set the radius using a square root scale so i'm going to set r to be r of r value of d is the way i would set it up and then r again could be a square root scale and then the domain can go from zero to the max of r value and then the range would go from zero to max radius and i think there's a lot of things that we don't have defined i can pull in this and this from d3 and then our value okay you've got our x value in our y value it may be sort of too tricky to show both of those at the same time and so we can just pick one of these and oh let's use this as max radius i'll set it to be i don't know 15. 
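A minimal sketch of the square root radius scale described above, written in plain JavaScript so the arithmetic is visible. The helper name `sqrtScale` and the sample values are assumptions for illustration only; in the session the scale comes from d3's `scaleSqrt`, with the domain set to zero through the max of the data and the range set to zero through the max radius.

```javascript
// Hypothetical stand-in for d3.scaleSqrt().domain([0, dMax]).range([0, rMax]).
// Both the domain and the range start at zero, as recommended in the session.
const sqrtScale = (dMax, rMax) => (value) => Math.sqrt(value / dMax) * rMax;

// Made-up sample values; the real ones would come from rValue(d).
const values = [1, 4, 9, 16];
const maxRadius = 15;
const r = sqrtScale(Math.max(...values), maxRadius);

// Area of the circle drawn for each value.
const areas = values.map((v) => Math.PI * r(v) ** 2);

// area / value comes out the same for every data point,
// which is what "area proportional to value" means.
const ratios = areas.map((area, i) => area / values[i]);
console.log(ratios);
```

Because the ratio of area to value is constant, a value of zero produces a circle of radius zero and doubling a value doubles the filled-in area.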
that should work maybe i missed something oh yeah data the max over the data of the r value yeah and so because the the data values are like they don't even get close to zero they're in centimeters and so it works pretty well with with the square root scale to set the radius like this and we can set the max radius to maybe 10 so we don't get so much overlap but now radius is showing the the variability in simple length and it doesn't it doesn't vary that much actually but yeah this is how i would suggest to do it is to use the square root scale where the domain and range both start at zero yeah i i see but um if i if i have a kind of data that i won't like uh make a normalization of of the range of the data how how can i do it how do you mean well i'm not sure what you mean um i mean imagine that i have a range of data but i want to restrict it from zero to one and have the average um i'm not quite understanding so let's take for example this data set yeah if you if you see like a petal with it's it's has 0.2 if i if i use this it's too small so i want to make it from a range from let's say one two two uh the the size of the the radius then it would be a linear scale right yeah i mean you could do that that's totally you know something doable so i'm just going to comment out this one and put this other one back where like you said we we could make it so that the lowest value corresponds with some like min radius and the max value corresponds with some max radius and we could define min radius to be like i don't know five and this would work it would make it so that you have like it would make it so that it's guaranteed that the smallest circle has a radius of five and the biggest circle has a radius of max radius which is 10. 
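A small sketch of the two options just described: the linear scale with a minimum radius, and the square root scale that starts at zero. The helper names and the sample domain of 2 to 4 are assumptions chosen for illustration, not code from the session.

```javascript
// Linear scale from [dMin, dMax] to [rMin, rMax], in the spirit of d3.scaleLinear.
const linear = (dMin, dMax, rMin, rMax) => (v) =>
  rMin + ((v - dMin) / (dMax - dMin)) * (rMax - rMin);

// Square root scale with domain and range both starting at zero.
const sqrtFromZero = (dMax, rMax) => (v) => Math.sqrt(v / dMax) * rMax;

const area = (radius) => Math.PI * radius ** 2;

const rLinear = linear(2, 4, 5, 10); // min radius 5, max radius 10
const rSqrt = sqrtFromZero(4, 10);

// The data value doubles (2 -> 4). How much does the drawn area grow?
const linearAreaRatio = area(rLinear(4)) / area(rLinear(2)); // about 4
const sqrtAreaRatio = area(rSqrt(4)) / area(rSqrt(2)); // about 2, matching the data

console.log({ linearAreaRatio, sqrtAreaRatio });
```

With the linear scale the larger circle covers roughly four times the area for a value that is only twice as large; the square root scale keeps the area ratio equal to the data ratio.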
So this is how you would do that. I think this is what you were asking about? Yes, yes, that's right. For this case it doesn't make sense, but it's just something that I was thinking about. Yeah, I mean, technically it's possible, and you could use a linear scale and do it like this. However, the main problem with doing it this way is that the area of the circle does not correspond to the data values, which could be problematic. It's not easy to read this, in the sense that it's... I don't know what the best word is, it's almost like, not really... Misleading? Yes, misleading, exactly. It can be misleading because you could pick a min radius of one and it would make some of the dots really tiny, but it doesn't really express the true variability within that particular value of the data. So this is an area of debate within the community of people who make visualizations. Some people are very lax on how they use radius: it could be a linear scale, it could be a square root scale, it could be a log scale, whatever it takes to make a pretty picture. But my philosophy is on the opposite end of the spectrum. I firmly believe that if you use size to encode some value from the data, then you should make it so that the area of the circle corresponds exactly with the values from the data. So if the value in the data is zero, it should be a zero-size circle that you can't even see. But that's sort of an edge case; I've seen people use one as the min value so that you could at least see something on the screen. Besides that, I would pretty firmly stick to that philosophy: if you do use radius, then the best thing to do is to use a square root scale, because the area of a circle varies with the square of the radius. What's that formula? Yeah, area is pi r squared, right? So
this is why it makes sense to use a square root scale and so the radius of a circle is calculated with respect to that same formula so that's why if you if you want to make it so that the area of the circle corresponds with the values from the data then you have to use a square root scale and make sure that the domain and the range both start at zero and so the domain could go from zero to the max value oh whoops i'm editing why my mistake yeah so it would be this version here of setting up the scale oh whoops that has to be defined above yeah so it would be this version here and you know sometimes it doesn't turn out the way you expect yeah in this case it doesn't doesn't mean anything right yes exactly exactly and so this is you know part of the process of making data visualizations is just trying different things and seeing what works um you know spreading the value across the screen from left to right between the min and the max like you know the x-axis or the y-axis of a scatter plot that's hugely effective at looking at you know things that fall in in like a small range but there's a lot of variability within that small range uh but radius isn't is not good for that sort of thing you know the because these values i mean look they value they vary between like i don't know two and four or something and so it's not really that great of a variation so it doesn't really uh it doesn't really pop out when you use radius but it's worth a try for sure all right cool cool so um yeah great work everybody very nice um this might be actually a good juncture to take like a five minute break uh because it's a big gap between sections uh so before i dive into this reusable chart stuff let's take a five minute break and then uh then we'll dig in okay see you in five minutes okay all right i'm back all right let's dig into this um idea of a reusable chart we will build a reusable chart based on the scatter plot we made last time uh on the way i'm going to refactor the code 
into modules and then the whole point of this reusable chart thing is to strictly decouple the specific and the generic meaning the stuff that's specific to the data set and the visualization that we're making and then the generic stuff that um you know implements reusable logic for a specific type of visualization and we're going to use this tried and true pattern from mike bostock in 2012 can you imagine towards reusable charts is what it's called and i'm going to make it dynamically update it's going to dynamically change x and y and i want to talk a little bit about why i'm doing this today from the structure of the course perspective i i want to go in this direction so that we have a very solid basis for starting on different visualization types so rather than go wide in the beginning like we only made one visualization type so far a scatter plot so rather than you know making a bar chart and a line chart from there what i want to do is make a reusable chart version of the scatter plot and then use that as a basis for future episodes where we branch out into all sorts of different visualization types that way by the end of the course we'll have like a library of reusable chart components that are sort of usable off the shelf and we'll be working within this this pattern which is very useful to know and it's one of the trickiest aspects of d3 so let's dig in i'm going to start by forking this scatter plot that we made last time and i'll call it reusable d3 scatter plot right now it's just um this index.html single file sort of thing before we do anything i want to change some stuff around so that it it reflects like the structure of a javascript project that you might see so i'll split out this the styles into a separate file oh someone is joining hello hello someone has joined us uh larry how are you you want to all right nice you want to introduce yourself a little bit uh yeah uh my name is uh larry rancho i'm from the philippines nice i work with uh with the 
university as faculty, and I also work as a developer for data visualization for a company in Australia. I've been using your resources since I started working with D3. Thank you so much, I really appreciate that. Oh, fantastic! Well, I'm happy you could join. I think this is the most people that have ever been on the live call, and this is great, so we'll see how it goes. All right, thank you. Welcome, welcome. Okay, so the first thing I'm going to do is split out the CSS into a different file, just because I personally prefer to have a bunch of small, well-defined files; that way the complexity can scale over time. I'll call it styles.css, and I'll take this CSS and move it into that file, and then get rid of that style tag. Then we can use a link tag, and I can never remember the syntax, so I'm just going to Google it: the HTML link tag. There we go, link rel equals stylesheet, which is I guess the type of thing it is, and it points to styles.css. I'm also going to create a package.json, and this is going to have inside of it dependencies: d3 at a specific version, which I'm going to find from unpkg. I'm just trying to figure out what's the latest version of d3... okay, it's 6.7.0, so that's what I'll put there. That way we can get rid of this script tag here, and then I'm going to put all of this JavaScript into a separate file called index.js. The way that VizHub is set up, it automatically loads in that file, and it also automatically loads in our dependencies, which includes d3. This is how most modern-day JavaScript build systems work, so it mirrors the experience of using Webpack or Rollup locally. So now we're at a comfortable starting point where we can start to refactor this stuff. At this point I would like to introduce the concept of Towards Reusable Charts. This is a great piece from Mike Bostock, the author of D3, from 2012, and it's one of those unique things that has actually stood the test of time. So rather
than trying to invent my own pattern, which I did for the 2018 version of this course, and rather than use a library that has components built in, like React does, which I used for last year's version of the course, this time I want to use the pure D3 way of making so-called components. Once you make a component with this pattern, you can easily wrap it with whatever other framework you're using, like Vue or React or Angular or whatever, but this is a way of making components that is purely dependent on D3 and nothing else. So here's how it works. It's going to be a function. As I go through, I'm going to implement these ideas, so I'll make a new file called scatterPlot.js. This will be our scatter plot component. I'll start by saying export const scatterPlot equals a function, and it's going to be used in much the same way as a D3 axis. This Towards Reusable Charts pattern, or a variant of it, is actually used within D3 itself, where you have a constructor for something and then you have these chainable methods that you can add onto it. So let's keep reading in Towards Reusable Charts. One way of configuring the thing would be to pass arguments, but Mike Bostock here goes through the various ways of configuring something and the pros and cons of each. It's cumbersome for the caller to remember, for example, the order of arguments, so maybe make it a config object. But that's also cumbersome for the caller, because the calling code must then manage both the chart function and the configuration object over time. So rather than do it like that, we can use this method chaining pattern that looks something like this: you create the instance of the chart, and then you call, for example, .width, passing in the width. That then returns the instance of the chart, and then you can call .height to set the height. That state of width and height is stored inside of that instance of the chart. So let's work toward this in our code. In
this scatterPlot component we can have a thing that I'm going to call my, which is going to be the instance of the chart. This name pays homage to the original article, which also uses my. So this chart constructor returns a function called my. Let's create some of these accessors, or getter-setters. This here is a kind of verbose way of doing it, but let me just walk through what this means. my is a function, and in JavaScript functions can have properties, so we can set my.width to be this function that accepts as input a value, which will be the new width, or not. If it's invoked with no arguments (that's what this is checking here, arguments.length), then the function acts as a getter: it returns the width, which is stored in this variable here. Otherwise it sets width to be the passed-in value and, crucially, it returns my. This returning of my is what enables method chaining to happen. And height is just the same pattern. But this is kind of verbose, and when I do this sort of thing I like to look at the source code of d3-axis as a reference, because it has much the same pattern to it but the implementation is pretty concise. And this is part of D3 itself, so I think it's a good reference to use. It's a much smaller way of implementing these getter-setter functions, and since we'll have a bunch of these, I would prefer this one. The way it works is it uses underscore for the name of the thing that gets passed in, and it says: if there is an argument, set the value of the thing internally, comma, axis. And axis in our case is going to be my. This is a weird little JavaScript expression that you can do: you can actually have an expression that is just two things separated by a comma, and it evaluates to the second thing. So in this case axis gets returned to enable the method chaining, but
if there if there's no argument then it just returns uh offset this internal variable here so i'm just going to copy this template and use it over here and adapt it to to our code here so instead of axis it's going to be my instead of offset i'm going to start with width and height because width and height does need to be configured so my.width equals a function where if a value is specified it just sets that value to be width right here and it returns my otherwise it will just return width but width is not defined in this scope yet so i need to say let width like this and we can't use const because we reassign to width here so that's why this needs to be let and this is the general pattern of these chainable getter setter functions now i i realize this might have been a lot um are there any questions so far yeah the the plus that came before the underscore um was that to ensure that it was a number because i think that worked that this came up in the previous lesson exactly yep that's exactly what it is so in in d3 axis i think it was the offset which is expected to be a number and so yeah if you pass in a string like 50 you know it gets to be that value but if you if you preface it with this unary plus operator it parses the string into a number and so that we actually could add it there this is this is called defensive programming where you sort of expect the worst case you know you expect people might abuse the api and pass in a string where they're really supposed to pass in a number so that's why the plus is there it's just to guarantee that before it gets to this variable inside that it's the right type of thing um so hey why not let's let's leave it there it'll make our api more robust so we could pass in a string if you you know if we accidentally had a string that's our width and everything would work out just fine yeah and the uh the my after the comma that that is i don't think i've seen that before yeah yeah this one threw me for a loop the first time 
I saw it too. Let me unpack it a little bit. Yeah, I'm glad you're asking, because it's a lot to process. In JavaScript there is this construct that's not very widely used, which is just parentheses and stuff separated by commas, and when you execute that, it returns the last thing. Those things in the middle could potentially have side effects. So let's say let x equals zero. We can have something like x equals five, comma, and then another value, as a piece of JavaScript. It's cryptic, it's kind of weird, but to understand what it does: it just executes this and it returns the last thing. So this is a miniature version of what's happening in the code here, where it says width equals this thing. And now if we inspect the value of x, it's actually 5, because it was assigned here. In all likelihood, if you were to use ESLint, this might not pass, because it's cryptic. But this is what Mike Bostock uses in d3-axis, so it's good enough for me; that's sort of how I look at it. And I think these parentheses inside may have been added by Prettier. If I run Prettier, it adds those parentheses, I guess just for clarity. But yeah, that's what's going on here: if you just have parentheses with commas, it executes all of those things and it returns the last entry in the list. Yep, that makes sense, thank you. Another thing that may not be obvious is that we can't use an arrow function here, because an arrow function doesn't have its own arguments. Check this out: if we make an arrow function that just logs out arguments (arguments is a special object in JavaScript) and we execute f, we get "arguments is not defined". That's one of those little changes that was introduced with the fat arrow function. But if instead we say f equals function, the long form where we write out function, then we can access arguments, and we get undefined back from the call. But if we pass in like one, two, three as the
arguments that's what arguments resolves to is the is the arguments that you pass into the function so um we could you know instead check like is the value defined but then that would like break in the case of zero you know which so as a general reusable pattern for these getter setter things this is the safest way to check if there were any arguments passed in or not and that's why we use this keyword function so so that we get arguments defined inside of here all right so we can move on and and we can do the same thing for height we can set height and return height and define height up here and now we can begin to move our visualization logic into this component here the way it's going to look is in our main we're going to load in the data and then all this logic is going to be go inside of this scatter plot function which we're calling my right here so i'm just going to paste it it's not going to work and this is how we can invoke this component we can say well first of all we need to import it import scatter plot from dot slash scatter plot and now we can use it in much the same way as we use a d3 axis we can say scatter plot dot width is width which we have here as our you know window.inner width and same thing for dot height we can call dot height which invokes our setter and we can pass in the local variable here height and then we can call this with our svg element and that's another aspect of this towards reusable charts pattern is that it accepts as input a d3 selection and it's going to put stuff inside of that selection so um similarly to axes like d3 axis it expects a d3 selection of a group element and then it puts the axis stuff inside of there our scatter plot can do the same you know it could accept maybe a whole svg element or or an svg group element either way could work and this is how you would invoke it you know pass in the selection to the chart which is equivalent to saying selection dot call my chart and we got into this last time with the 
axes and so the way that i would love to see it is svg.call scatterplot width and height like this so this is the the overall pattern i'm just going to keep going until it works another point of contention with this pattern is how to deal with the data i think we should deal with the data as just another thing that we can set you know so let's set dot data we pass in the data and then in our scatter plot implementation we can have another one of these local variables called data and then we can have another one of these accessors for data and in this case we we must not have this little plus because it's expected to be an array not not a string that's going to be coerced into a number so i'm going to get rid of that it's just going to set data internally to whatever was passed in here and it's going to remain broken for a while so i'm just going to keep going like this adding the things that we need for example x value y value margin and radius these are all you know configurable things so let's go ahead and add add these as as configurable things on our plot the way it would look to invoke this stuff is that we can say dot x value and pass in this function dot y value and pass in this function dot margin and pass in this margin and then dot radius passing in five like that and then when we use prettier it all formats nicely and this is what it looks like to configure our scatter plot now we need to go and implement all of those as you know local variables with getter setters so we've got what was it x value y value margin and radius and then i'm just going to copy the same uh template four times change data to x value which is going to be a function change data to y value which is also going to be a function change data to margin which will be an object and then change data to radius which will be a number so we can actually bring back that little plus to be defensive about how we implement this okay now we have getter setters for x value y value margin and radius 
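The getter-setters described so far could be sketched roughly as follows. This is an illustration under stated assumptions rather than the session's exact file: the rendering body of `my` is left empty, and the sample data and field names (`a`, `b`) in the usage at the bottom are made up.

```javascript
// Sketch of the chart constructor in the "Towards Reusable Charts" style.
const scatterPlot = () => {
  // Internal state. These are reassigned by the setters, hence `let`.
  let width;
  let height;
  let data;
  let xValue;
  let yValue;
  let margin;
  let radius;

  // `my` is the chart instance. In the session it receives a D3 selection
  // and renders into it; the rendering body is omitted in this sketch.
  const my = (selection) => {};

  // Each accessor uses `function` (not an arrow function) so that `arguments`
  // is available. With an argument it assigns, and the comma operator then
  // yields `my` to enable chaining; with no argument it acts as a getter.
  my.width = function (_) {
    return arguments.length ? ((width = +_), my) : width; // unary plus coerces to number
  };
  my.height = function (_) {
    return arguments.length ? ((height = +_), my) : height;
  };
  my.data = function (_) {
    return arguments.length ? ((data = _), my) : data; // an array, so no plus
  };
  my.xValue = function (_) {
    return arguments.length ? ((xValue = _), my) : xValue; // a function
  };
  my.yValue = function (_) {
    return arguments.length ? ((yValue = _), my) : yValue; // a function
  };
  my.margin = function (_) {
    return arguments.length ? ((margin = _), my) : margin; // an object
  };
  my.radius = function (_) {
    return arguments.length ? ((radius = +_), my) : radius;
  };

  return my;
};

// Usage with made-up data: every setter returns the chart, so calls chain.
const chart = scatterPlot()
  .width('960') // a string is tolerated thanks to the unary plus
  .height(500)
  .data([{ a: 1, b: 2 }])
  .xValue((d) => d.a)
  .yValue((d) => d.b)
  .margin({ top: 20, right: 20, bottom: 40, left: 50 })
  .radius(5);

console.log(chart.width(), chart.height(), chart.radius()); // 960 500 5
```

Keeping the state in closure variables means the caller only ever holds the chart function itself, which is the point made earlier about not having to manage a separate configuration object.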
now curran can i can i ask yeah yeah sure sure when i see the the more verbose version of that function um the one that you showed before the function evaluates the absence of argument when with an exclamation mark right in this more concise version how is the absence of length is being evaluated in this in this version oh i see what you're asking yeah yeah let me explain it's it's a um oh there's all our errors it's a behavior of javascript where so let's say let um bool equals true bool is going to be some boolean value and so if we have something like bool question mark yes or no it evaluates to yes if bool is false it evaluates to no and here's the tricky bit in javascript there's this notion of truthy things are truthy or falsey um you know when they're coerced to booleans strings for example numbers when you pass them into when you treat them like booleans they're they're coerced to be truthy or falsy and so the way that that plays out in the case of arguments.length arguments.length is going to be a number because arguments is an array-like object and so if arguments.length is 0 then the value of 0 will be put into this ternary operator and it turns out that 0 is falsy in javascript so that's why it would evaluate to no but if the length of arguments is let's say one that evaluates to something that is truthy in javascript it's kind of true it's like true if it's treated as a boolean and so that would evaluate to yes so that's why arguments.length works in this case like it does great okay it makes sense thanks nice nice happy to hear it and this is so great i'm so glad that that you're asking these questions because um sometimes i don't know what to stop and explain and what what not to you know so thanks for your question all right let's keep going here um looking at this file it seems like everything is right we're pulling in the data we're defining our svg one thing that we're not doing yet is in our scatter plot this function here needs to take as input a selection 
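The truthy/falsy explanation above can be checked with a few lines of plain JavaScript. This is only an illustrative sketch; the function names `mode` and `isRealArray` are made up for the example.

```javascript
// Returns which branch the ternary takes for a given call.
function mode() {
  // arguments.length is 0 (falsy) when called with no arguments,
  // and 1 or more (truthy) otherwise.
  return arguments.length ? 'setter' : 'getter';
}

const noArgs = mode(); // 'getter'
const oneArg = mode(42); // 'setter'
// Passing undefined explicitly still counts as one argument.
const explicitUndefined = mode(undefined); // 'setter'

// `arguments` is array-like (it has a length) but is not a real Array.
function isRealArray() {
  return Array.isArray(arguments);
}
const argumentsIsArray = isRealArray(1, 2, 3); // false

// Coercion to boolean for a few numbers and strings.
const coerced = [0, 1, '', 'abc'].map(Boolean); // [false, true, false, true]

console.log({ noArgs, oneArg, explicitUndefined, argumentsIsArray, coerced });
```

One consequence worth noting is that checking `arguments.length` lets a caller deliberately set a value to `undefined`, which a check like `_ === undefined` could not distinguish from a getter call.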
i can call it svg but honestly i would rather not because it could be an svg it could be a group element and so i'm just going to replace svg with selection so i just replaced all instances of svg with selection and now it should work we might be missing some imports yeah that that's one thing that's outstanding um oh i never changed this around to use the es6 import syntax this is something i should have done as soon as i moved it to index.js so we can import all this stuff that we need from d3 and in scatterplot dot js we're going to need scale linear extent axis left and axis bottom but not csv and not select and in index.js i don't think we're going to need any of that other stuff so we could just use csv and select over there so let's see is it working oh we get a nice error unexpected token okay there's an unexpected token somewhere in index.js line 46. it's missing the column in line 45. um in line 40 no no you're chaining no you cannot have that it almost feels like it has like a like an older version of my file ah scale yeah it was some some sort of glitch now it says scale linear is not defined okay fair enough scale linear we should be getting that from d3 that's odd let me try console.log scale linear to see if it is even like loading to that point it is it's there so now what what what it's working the total uh total twice magic yeah total twilight zone moment with errors that resolve themselves yeah like magic all right sweet we've done it it works this is amazing so yeah just just to quickly review everything index.js has all the logic that we had from earlier where it loads in the data it sets up the svg but now our main function is a lot more concise it just invokes our scatter plot and configures it with all this stuff and here's something that's kind of mind-blowing you can take this expression and and we can skip that local variable we could just pass that straight into data like that that works too yeah to me that's a mind-blowing thing because 
await it's like this asynchronous control flow and it it has to like wait until that's done before it invokes this whole thing but that's the magic of async await right there and i kind of prefer it like this it's it's more uniform all the configuration happens right here awesome awesome um yeah and in our scatter plot we have a bunch of local variables in this closure a closure is this the scope of the variables inside of this function so when you invoke the scatter plot constructor it creates this closure and that's where these variables live and we use let here not const because they can change over time and then we've got this my function that gets invoked with a selection and this is where it sets up all the scales and does all the you know transformations right here and it references these variables that are defined above um and it resolves to the things that we passed in to these these getter setter functions and we just use them as setters not getters and yeah all the logic is the same as same as it was before for the scatter plot and it works so with this template um one would be able to change the data set and the and the name of the variable the new variables if you change the data set and it will update like automatically now exactly yes that's right and and the key thing is that nothing at all inside scatterplot.js will need to change nothing at all if you change the data set that's what i meant earlier by separating the specific from the generic like this is a totally generic scatter plot implementation if you were to change the data set the only file you would need to change is index.js you can change the csv url change how the rows are parsed and change the configuration of the plot and all that yeah this is like the whole point of this reusable pattern is that all of this configuration happens outside of that reusable component so yeah you can just tweak it right here if you want it to be um you know petal width you could just change it right 
here and boom it updates uh if if i want to to have like the species as a different marker as a classification marker how could i add this to this code that's a great question and let me just clarify what you mean by marker i think you might mean like a different shape like these yep yes that's right that's right yeah you know this would be um this would be a great thing to try to do as an exercise this week and and i can outline how to do it i don't think we have time today to actually do it but i can outline yeah but but should i use like a case switch inside and where in the logic this uh switch case algorithm would be or should i use something completely different well let me walk through how how you would how you might do it um the switch case meaning the logic that determines which shape it is is that what you mean yes yes so d3 dot symbol does that with symbol dot type yeah yeah this would be actually a great thing to work through but what you would do is create an instance of d3 symbol and then set up an ordinal scale that maps the the three values for the species to these symbols you know maybe three particular symbols and yeah it's exposed as like symbol circle symbol cross symbol diamond so it would be something like you know create an ordinal scale and then set the range to be d3 symbol circle d3 symbol cross d3 symbol diamond and that scale you can pass in a value from the species column and it will give you back out one of these one of these symbol types and then in the rendering logic of the scatter plot instead of rendering circles these would be paths and then when you set the the d attribute of the path you would want to change the type of the symbol and then invoke the symbol and that's how you could make a scatter plot with different shapes i don't know i kind of want to just go ahead and do it right now what do you think should i yes do it do it do it right yeah let's do it let's do it why not i'm gonna keep this as it is because it's nice and 
clean but i'm gonna fork this and say reusable d3 scatter plot with symbols oh i love this it's going to be so much fun okay now we need to import these things oh actually symbols check it out symbols is an array containing the set of built-in symbol types so we could just use that okay so i'm going to import symbols from d3 and then i'm going to build out something very similar to x and y i'm going to call it symbol i guess symbol value and then i'll set up another one of these accessors for symbol value is going to be a function and then in index.js this is where we would want to say okay our symbol value is the species i think it's d dot species um i'll just take a look yes yes it is it is nice yeah species nice okay so this is how we would invoke it and configure it now let's implement the rendering so instead of circles these would be paths select all path and instead of cx and cy oh yeah we would still need to position these oh how would we do that you know what we would want to do is um probably use group l yeah okay okay it's getting a little tricky here but let's just get a bunch of different shapes to show up and then we can um we can worry about positioning them looking for the okay there's a constructor so let's import symbol from d3 and then we're gonna need a symbol scale i'm going to call it symbol scale because symbol the word the the name symbol is already taken because it's imported and this will be an ordinal scale we need to import that as well from d3 and the range this is the key thing the range is going to be symbols and the domain you know we don't actually need to set the domain uh because it sort of fills in automatically but just for completeness sake we could say data.map symbol value and that will just get all of the different species values and that will contain duplicates but that's okay because when you pass an array with duplicates to scale ordinal dot domain it automatically you know deduplicates it so we can just say just to 
inspect how that ended up symbol scale dot domain let's just console.log that it should just give us those three values oh there's some problem oh i forgot a comma there we go see so that's correct setosa versicolor virginica and then if we take a look at the range it's a bunch of these d3 symbol implementations which are actually objects that have a draw function but we don't really strictly need to know what that is okay now we need to create a symbol generator symbol generator is new symbol using the constructor from d3 and let me see um size defaults to 64. that should be fine i mean we could we could use it to change the size but let's just use the default for now see how it turns out and this is the thing we can say symbol dot type so this is what we need to do we need to make these paths and we don't need cy cx or r but we need d which is this domain specific language for svg paths that's what the symbol generator will output so we can call symbol generator so this will be a function that takes as input d and then we can call symbol generator of symbol value of d those these are different right the the string one and the the other one they don't mean the the same thing right there they are totally distinct and different yeah it's confusing that they have the same name um d is the attribute of an svg path which has a very specific meaning so if you look at the documentation for svg paths it expects a d attribute that is going to be a string that is a it's an expression in a domain specific language that defines svg paths okay so yeah so this d is the attribute d and this d is it's called it's just one of the rows okay not a column but a row in it's it's an element in the data array or rather the marks array and you know oh my gosh i i sort of forgot that we have this transformation step so this is actually where we can compute d which makes it even more confusing in a way because we d dot d but this is where it makes sense to pass in yeah to do the 
transformation to the marks to for clarity i'm going to call it path path d because it's there's too many d's okay and this should work i think but looks like broken let's see what's going on cannot access symbol generator before initialization okay yeah that makes sense i have to move it to be before we compute the marks okay i think it's sort of working see that we have some stuff in the corner but it's in the corner that's the problem um and they're pretty tiny they're pretty tiny so i kind of want to make it um bigger but the the main problem is they're all in one place they're all in the corner and i wonder can we have a transform if you just set attribute x and y it does not work i don't believe x and y works on paths i mean we can try it that would be ideal if it did uh but it doesn't okay we may be able to specify a transform on the path so we can translate by d dot x and d dot y but i don't know if that's going to work oh sorry i forgot to make it a function of d oh well look at that it's sort of worked but but they're all circles they're all circles that's indicative of a problem yeah yeah i forgot to um i forgot to set the type so here's what we need to do this is not right we need to we need to call symbol generator dot type and then invoke it this is what we need to do uh looks like there's some problem that's not quite right why this empty parenthesis in the end yeah i think we need to invoke it or you know maybe i should consult the documentation symbol.type yeah but it seems that you're invoking two times there you you're invoking with the parameter and then you're invoking again see this is what i wanna do i want to i want to create an instance of d3 symbol we'll call it symbol generator and then for each mark i want to say symbol generator dot type is oh oh i forgot to pass it through the scale that's so silly so symbol scale of symbol value of d okay so symbol value of d will give us the species value we need to pass that through the scale symbol 
scale in order to get at the particular symbol and remember symbol scale it just maps the the symbol the species to the various symbols that we imported from d3 so we need to call symbol generator dot type to set the type and then these empty parentheses on the end are to invoke the symbol generator as a function to generate this path string so now we can see that it's actually giving us different types of shapes yeah it's kind of tricky business it is but i think what we need to do to move these around um oh wait a minute it should be missing the close like this that's what it needs to be there we go there we go so it's translating by x and y correctly now yeah this is what you were trying to do isn't it yes very cool amazing amazing great great yeah that only took like 10 minutes amazing okay now i can make a scatter plot with curran's face yeah you totally could yeah yeah you could even put different people's faces yeah yeah nice wow pretty satisfying um one thing that's just not quite linked up yet i just want to wrap this up nicely radius is here but it's not being used so what i want to do is use it so that we can configure the size of the symbols but radius doesn't make sense we can use size instead and the default was 64. 
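The two fixes made above (passing the species through the symbol scale, and closing the parenthesis in the translate string) can be sketched without d3. In this sketch `makeOrdinalScale` is a hypothetical, simplified stand-in for the implicit-domain behavior of d3's ordinal scale, the strings in `symbolTypes` are placeholders for d3's symbol type objects, and the data rows and pixel positions are made up.

```javascript
// Hypothetical stand-in for an ordinal scale with an implicit domain:
// each new input value is assigned the next entry of the range, cycling
// if the range runs out. This is NOT d3's implementation.
const makeOrdinalScale = (range) => {
  const domain = [];
  return (value) => {
    let i = domain.indexOf(value);
    if (i === -1) {
      domain.push(value);
      i = domain.length - 1;
    }
    return range[i % range.length];
  };
};

// Placeholder names standing in for d3's symbol type objects.
const symbolTypes = ['circle', 'cross', 'diamond'];

// Made-up rows; x and y are assumed to be already-scaled pixel positions.
const data = [
  { species: 'setosa', x: 10, y: 20 },
  { species: 'versicolor', x: 30, y: 40 },
  { species: 'setosa', x: 50, y: 60 },
  { species: 'virginica', x: 70, y: 80 },
];

const symbolValue = (d) => d.species;
const symbolScale = makeOrdinalScale(symbolTypes);

// One mark per row: the species goes through the scale to pick a type,
// and the position becomes a transform string (note the closing paren).
const marks = data.map((d) => ({
  type: symbolScale(symbolValue(d)),
  transform: `translate(${d.x},${d.y})`,
}));

console.log(marks);
```

In the real chart the `type` would be handed to the symbol generator's type setter and the generator then invoked to produce the path string for the `d` attribute, as described in the session.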
so let me say size is 50 and then in in our scatter plot i'm just going to do a search and replace across the whole thing and replacing radius how do you do a search and replace in view oh i'm using i'm using vim mode and then once you enter vim mode you can use the vim command for the search and replace which is this okay so the the percent means global search and replace um s means substitute i guess it's it's one of those sort of obscure vim things um it works with some other command line tools too in linux but yeah this is how you trigger it you you hit colon to enter this little space which is you know derived from vim the editor and then radius is the thing you find size is the thing it gets replaced with g means global means um it means to replace every instance on a line and so this is just sort of the the incantation to make it happen so once i run that we have size and once we have size defined here then what we can do is we can pass it into our symbol generator dot size size like that and it it could we could make it vary for each mark but for now i'm just going to make it all the same size and just to confirm that it's working i'm going to change the size here to like 500 okay and it gets bigger and yeah size it's the behavior of size with d3 symbols can be a little counter-intuitive because it actually was really well thought out so that the area of each of the symbols is the same i'm pretty sure um and and when you change the size you're actually changing the area so it's not a linear scale at play here it's it's more like a square root scale internally so if i change it to like a thousand that means there's going to be a thousand black pixels for each shape but if i change it to 100 that means there's going to be a hundred filled in black pixels for each shape and while i'm talking about symbols i would be remiss to not show this really nice piece oh that's not the one hang on hang on mike bostock has this really nice thing that illustrates that 
they're all the same area it'll be worth the wait there it is yeah this is a really nice little piece by mike bostock that he developed um looks like in 2017 i think when he was working on building out d3 shape and so notice here how the radius of each of these circles is different but it says here each of these shapes has a configurable area here 2 500 square pixels and so this is the this is the like deep thought that that you get for free when you use d3 symbol it turns out that each of these shapes has exactly the same number of filled in pixels and that's what gets configured when you call dot size all right well that's how we do it that's how we make a scatter plot with symbols uh within the confines of this reusable chart pattern and this is just playing out so well i mean i would much prefer to do things this way to have a reusable chart pattern in place and then branch out to all sorts of different visualization types rather than you know get it basically working in one huge file and fork that a a bunch of times so yeah thanks thanks everyone for for you know sticking with me through this refactoring effort and i hope it was useful thanks curran it was really insightful all this class nice happy to hear it happy to hear it i'd like to leave you all with some exercise options fork and modify what we made and maybe add axis labels one of the things that it's missing is the labels the text labels on the axes and i think we have now enough knowledge of d3 to be able to do that another option would be you know fork this scatter plot and change the data set to get that experience of just changing the index file and leaving leaving the scatter plot implementation as it is as a generic thing or fork this and change the chart type i mean you would probably need to change the data set too in order to do that but that would be a great exercise if you could fork this and maybe change it to be a line chart component or an area chart or a bar chart we're going to be 
doing that in subsequent episodes but if you you know feel inspired by all means take a stab at it now and if if you can't get it done you know at least take a stab at it share your work and maybe maybe you could use that as a jumping off point for a future episode so all of this is in the forum the vizhub forum i made one for episode nine uh where did it go i may have lost it or something but i'm going to make one now for episode 9 so you can post there all right any last questions that was absolutely fantastic thank you thank you oh everything's very good thank you excellent all right well i'm happy you all could join me thanks for taking the time and uh have a good week i'll see you next week see you take care thanks yeah bye
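As a footnote to the symbol-size discussion earlier in the session, here is a small numeric sketch of why size behaves like an area rather than a radius. It assumes the circle symbol is drawn with radius sqrt(size / pi), which matches the "size is an area in square pixels" behavior described above; the helper name `circleRadiusForSize` is made up for this example.

```javascript
// For a circle whose area equals `size` square pixels,
// area = pi * r^2, so r = sqrt(size / pi).
const circleRadiusForSize = (size) => Math.sqrt(size / Math.PI);

// Sizes mentioned during the session (64 is the default).
const sizes = [64, 100, 500, 1000, 2500];

const table = sizes.map((size) => ({
  size,
  radius: circleRadiusForSize(size),
}));

// Quadrupling the size only doubles the radius: a square-root relationship.
const ratio = circleRadiusForSize(256) / circleRadiusForSize(64);

console.log(table, ratio);
```

This is why changing size from 50 to 500 makes the marks visibly bigger but nowhere near ten times wider.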
Info
Channel: Curran Kelleher
Views: 770
Id: uad2LrClF1E
Length: 100min 5sec (6005 seconds)
Published: Sat May 08 2021